Edited by Patrick Drouin, Natalia Grabar, Thierry Hamon and Kyo Kageura
[Terminology 21:2] 2015
pp. 263–291
Compositional translation of single-word complex terms using multilingual splitting
Multilingual terminology acquisition from comparable corpora has attracted the interest of researchers for twenty years, but challenges remain. Bilingual term alignment, a subtask of multilingual terminology acquisition, requires a pre-processing step because term structure may differ from one language to another. Morphologically constructed terms must be segmented in order to be aligned with their equivalents in other languages. This article addresses the translation of complex terms using a compositional approach. We focus on the pre-processing of such terms and introduce a domain-oriented splitting method that we apply to compound terms belonging to two domains and four languages. The segmentations are used as input to a translation step. We evaluate what percentage of segmentations can be correctly translated by a compositional approach, and which splitting strategy (precision-oriented or recall-oriented) performs better. The results are compared to those obtained with the reference segmentations and with a corpus-based splitting method. Our method yields results close to those of the reference segmentations and outperforms the corpus-based method.
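As a rough illustration of the split-then-translate idea described above, the following minimal sketch segments a single-word term against a component lexicon and recombines the component translations. It is not the article's domain-oriented splitting method: the lexicon-lookup splitter, the toy English-French entries, and the names `split`, `translate` and `compositional_translate` are all assumptions made for this example.

```python
from itertools import product

# Hypothetical toy resources, invented for illustration only.
COMPONENTS = {"cardio", "vascular", "hydro", "electric"}
BILINGUAL = {  # English component -> French candidate translations
    "cardio": ["cardio"],
    "vascular": ["vasculaire"],
    "hydro": ["hydro"],
    "electric": ["électrique"],
}


def split(term, lexicon, min_len=3):
    """Return every segmentation of `term` into components found in `lexicon`."""
    if not term:
        return [[]]
    results = []
    for i in range(min_len, len(term) + 1):
        head = term[:i]
        if head in lexicon:
            # Recurse on the remainder and prepend the matched component.
            for rest in split(term[i:], lexicon, min_len):
                results.append([head] + rest)
    return results


def translate(segmentation, bilingual):
    """Concatenate one translation per component, in source order.

    Yields no candidate if any component lacks a translation.
    """
    options = [bilingual.get(part, []) for part in segmentation]
    return ["".join(combo) for combo in product(*options)]


def compositional_translate(term):
    """Collect translation candidates over all segmentations of `term`."""
    candidates = []
    for segmentation in split(term, COMPONENTS):
        candidates.extend(translate(segmentation, BILINGUAL))
    return candidates


print(compositional_translate("cardiovascular"))
```

A term that cannot be fully segmented, or whose components are missing from the bilingual lexicon, simply yields no candidates here. A real system would also need to handle linking elements, component reordering and candidate ranking, none of which this sketch attempts.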