Tsuji, Keita and Kyo Kageura. 2001. Extracting morpheme pairs from bilingual terminological corpora. Terminology 7(1): 101–114.
This paper reports an HMM-based method for extracting bilingual morpheme pairs from domain-specific bilingual term lists. In recent years, many bilingual term lists have become available in electronic form. If the bilingual morpheme pairs in these lists are identified automatically, they can serve as bootstrapping information for the automatic identification of bilingual term pairs in bilingual textual corpora, or they can be used to extract translation rules for complex terms automatically. In the method described in this article, Japanese terms are segmented into morphemes and, at the same time, the corresponding Japanese-English morpheme pairs are identified. The advantage of this method is that it requires no pre-processing tool such as a morphological analyser. The experimental results were quite satisfactory: the method achieved well over 80% precision and recall.
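
The idea of segmenting a Japanese term and pairing its morphemes with English words in a single pass can be illustrated with a small sketch. This is not the authors' HMM. It is a simplified Viterbi-style dynamic program under stated assumptions: the alignment is monotone and one-to-one (one Japanese morpheme per English word), and the pair probabilities in `PAIR_PROB` are invented toy values rather than estimates from any term list. The names `PAIR_PROB`, `segment_and_align`, and `floor` are hypothetical.

```python
import math

# Hypothetical toy probabilities for (Japanese morpheme, English word) pairs.
# In practice such values would be estimated from a bilingual term list.
PAIR_PROB = {
    ("情報", "information"): 0.9,
    ("検索", "retrieval"): 0.8,
    ("情", "information"): 0.05,
    ("報検索", "retrieval"): 0.01,
}


def segment_and_align(ja, en_words, pair_prob, floor=1e-9):
    """Jointly segment `ja` and align each segment to one English word.

    Assumes a monotone, one-to-one alignment. Returns a list of
    (japanese_morpheme, english_word) pairs, or [] if no segmentation
    with exactly len(en_words) non-empty segments exists.
    """
    n, m = len(ja), len(en_words)
    # best[i][j] = (log-probability, backpointer k) for covering the first
    # i characters of `ja` with j segments aligned to the first j words.
    best = [[(-math.inf, None)] * (m + 1) for _ in range(n + 1)]
    best[0][0] = (0.0, None)

    for j in range(1, m + 1):
        for i in range(1, n + 1):
            for k in range(i):
                prev = best[k][j - 1][0]
                if prev == -math.inf:
                    continue
                # Unseen pairs get a small floor probability (an assumption).
                p = pair_prob.get((ja[k:i], en_words[j - 1]), floor)
                score = prev + math.log(p)
                if score > best[i][j][0]:
                    best[i][j] = (score, k)

    if best[n][m][0] == -math.inf:
        return []

    # Follow backpointers from the end of the term to recover the pairs.
    pairs, i = [], n
    for j in range(m, 0, -1):
        k = best[i][j][1]
        pairs.append((ja[k:i], en_words[j - 1]))
        i = k
    return pairs[::-1]
```

Calling `segment_and_align("情報検索", ["information", "retrieval"], PAIR_PROB)` with the toy table above segments the term as 情報 + 検索 and pairs the two morphemes with "information" and "retrieval" respectively. No morphological analyser is involved: the segmentation falls out of the search over pair probabilities.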
