Publication details [#212]

Mikhailov, Mikhail. 2001. Two approaches to automated text aligning of parallel fiction texts. Across Languages and Cultures 2(1): 87–96.
Publication type
Article in jnl/bk
Publication language


Parallel text corpora supply researchers with data for multilingual lexicographic research, translation studies, and language typology. The ParRus research project at the University of Tampere aims to compile a Russian-Finnish parallel corpus and to develop software for maintaining it. Text alignment is the crucial problem in compiling parallel corpora. The study of parallel texts shows that, in most cases, the translator retains the paragraphs of the original in the translation. The Source Language-Target Language (SL-TL) quotient is also a stable value. The alignment programme that was developed compares the original with the translation paragraph by paragraph, adding new paragraphs to the extracts being aligned until the extracts match the SL-TL quotient. The system produces good results only if the translation is structurally close to the original. However, the study of parallel texts shows that the frequencies of words and of their translation equivalents do not usually match.
Source: Based on abstract in journal
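
The paragraph-by-paragraph procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the ParRus implementation: the abstract does not say how the SL-TL quotient is measured, so the sketch assumes it is the ratio of target to source length in characters. The function name `align_paragraphs` and the parameters `tolerance` and `max_group` are hypothetical and introduced only for illustration.

```python
def align_paragraphs(src, tgt, quotient, tolerance=0.2, max_group=3):
    """Greedily pair source and target paragraphs by length ratio.

    Assumption: the SL-TL quotient is target length / source length,
    measured in characters. Each extract starts as one paragraph per
    side and grows until the ratio is within `tolerance` of `quotient`.
    """
    pairs = []
    i = j = 0
    while i < len(src) and j < len(tgt):
        s_end, t_end = i + 1, j + 1
        while True:
            s_len = max(sum(len(p) for p in src[i:s_end]), 1)  # avoid division by zero
            t_len = sum(len(p) for p in tgt[j:t_end])
            ratio = t_len / s_len
            if abs(ratio - quotient) <= tolerance * quotient:
                break  # extracts match the expected quotient
            if ratio < quotient and t_end < len(tgt) and t_end - j < max_group:
                t_end += 1  # target extract too short: add a target paragraph
            elif ratio > quotient and s_end < len(src) and s_end - i < max_group:
                s_end += 1  # source extract too short: add a source paragraph
            else:
                break  # cannot grow further; accept the extracts as they are
        pairs.append((src[i:s_end], tgt[j:t_end]))
        i, j = s_end, t_end
    if i < len(src) or j < len(tgt):
        # Paragraphs left over on one side are kept as a final unmatched pair.
        pairs.append((src[i:], tgt[j:]))
    return pairs


# Toy data: the second source paragraph was split in two by the "translator".
source = ["aaaa", "bbbbbbbb", "cccc"]
target = ["AAAA", "BBBB", "BBBB", "CCCC"]
aligned = align_paragraphs(source, target, quotient=1.0)
```

With this toy input, the second source paragraph is paired with the two target paragraphs that together restore the expected ratio. A greedy pass of this kind depends on the translation following the paragraph order of the original, which is consistent with the abstract's remark that results are good only for structurally close translations.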