Publications
Publication details [#3519]
Dorr, Bonnie Jean, Lisa Pearl, Rebecca Hwa and Nizar Habash. 2002. DUSTer: a method for unraveling cross-language divergences for statistical word-level alignment. In Richardson, Stephen D., ed. Machine Translation: from research to real users (Lecture Notes in Computer Science 2499). Berlin: Springer. pp. 31–43.
Publication type
Article in jnl/bk
Publication language
English
Abstract
The frequent occurrence of divergences (structural differences between languages) presents a great challenge for statistical word-level alignment. In this paper, the authors introduce DUSTer, a method for systematically identifying common divergence types and transforming an English sentence structure to bear a closer resemblance to that of another language. The ultimate goal is to enable more accurate alignment and projection of dependency trees in another language without requiring any training on dependency-tree data in that language. The authors present an empirical analysis comparing the complexities of performing word-level alignments with and without divergence handling. The results suggest that the approach facilitates word-level alignment, particularly for sentence pairs containing divergences.
Source: Bitra