Publication details [#4610]
Moore, Robert C. 2002. Fast and accurate sentence alignment of bilingual corpora. In Richardson, Stephen D., ed. Machine Translation: from research to real users (Lecture Notes in Computer Science 2499). Cham: Springer. pp. 135–144.
Publication type
Article in journal/book
Publication language
English
Abstract
The author presents a new method for aligning sentences with their translations in a parallel bilingual corpus. Previous approaches have generally been based on either sentence length or word correspondences. Sentence-length-based methods are relatively fast and fairly accurate. Word-correspondence-based methods are generally more accurate but much slower, and usually depend on cognates or a bilingual lexicon. The new method adapts and combines these approaches, achieving high accuracy at a modest computational cost, and requires no knowledge of the languages or the corpus beyond division into words and sentences.
Source: Bitra