Extracting morpheme pairs from bilingual terminological corpora

Tsuji, Keita; Kageura, Kyo

doi:10.1075/term.7.1.08tsu

Article published in:

Terminology
Vol. 7:1 (2001), pp. 101–114

Keita Tsuji | University of Tokyo

Kyo Kageura | University of Tokyo

This paper reports an HMM-based method for extracting bilingual morpheme pairs from domain-specific bilingual term lists. In recent years, many bilingual term lists have become available in electronic form. If the bilingual morpheme pairs in these lists are identified automatically, they can serve as bootstrapping information for the automatic identification of bilingual term pairs in bilingual textual corpora, or be used to extract translation rules for complex terms automatically. In our method, Japanese terms are segmented into morphemes and, at the same time, the corresponding Japanese-English morpheme pairs are identified. The advantage of our method is that it requires no pre-processing tool such as a morphological analyser. The experimental results were quite satisfactory: our method achieved well over 80% precision and recall.

Keywords: Bilingual Morpheme Pairs, Automatic Extraction, Term List, Translation Rules, Hidden Markov Model.
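
The joint segmentation and pairing described in the abstract can be illustrated with a small Viterbi-style dynamic programme. This is only a sketch under strong assumptions, not the authors' model: the abstract does not give the HMM's states, parameters, or training procedure. The function name `segment_and_align`, the probability table `PAIR_PROB`, the smoothing value `FLOOR`, and the restriction to a monotone one-to-one alignment (one Japanese morpheme per English word, in the same order) are all hypothetical choices made for illustration.

```python
import math

# Hypothetical toy probabilities P(japanese_morpheme, english_word).
# These numbers are invented for illustration; they are not from the paper.
PAIR_PROB = {
    ("情報", "information"): 0.4,
    ("検索", "retrieval"): 0.3,
    ("情", "information"): 0.01,
    ("報検索", "retrieval"): 0.001,
}
FLOOR = 1e-9  # assumed smoothing probability for unseen pairs


def segment_and_align(ja_term, en_words, pair_prob=PAIR_PROB, floor=FLOOR):
    """Split ja_term into len(en_words) morphemes and pair them in order.

    Assumes a monotone one-to-one alignment, which is a simplification.
    Returns a list of (japanese_morpheme, english_word) pairs.
    """
    n, m = len(ja_term), len(en_words)
    if m == 0 or n < m:
        return []  # nothing to align, or too few characters to split
    neg_inf = float("-inf")
    # best[i][j]: best log-probability of covering the first i characters
    # with j morphemes aligned to the first j English words.
    best = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for j in range(1, m + 1):
        for i in range(j, n + 1):
            for k in range(j - 1, i):  # k is the start of the j-th morpheme
                if best[k][j - 1] == neg_inf:
                    continue
                p = pair_prob.get((ja_term[k:i], en_words[j - 1]), floor)
                score = best[k][j - 1] + math.log(p)
                if score > best[i][j]:
                    best[i][j] = score
                    back[i][j] = k
    # Follow the back-pointers from the end of the term to recover the pairs.
    pairs = []
    i = n
    for j in range(m, 0, -1):
        k = back[i][j]
        pairs.append((ja_term[k:i], en_words[j - 1]))
        i = k
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    print(segment_and_align("情報検索", ["information", "retrieval"]))
```

Because segmentation points and pairings are scored together, no separate morphological analyser is needed in this sketch, which mirrors the advantage the abstract claims. A real system would estimate the probabilities from the term list rather than hard-code them, and would need to handle reordering and unequal morpheme counts.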

Published online: 7 December 2001

https://doi.org/10.1075/term.7.1.08tsu

Cited by

Cited by 1 other publication

Abekawa, Takeshi & Kyo Kageura

2009. Proceedings of the 3rd International Universal Communication Symposium, pp. 115 ff.

This list is based on CrossRef data as of 9 June 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.