Bilingual term recognition revisited
The bag-of-equivalents term alignment approach and its evaluation
The paper describes LUIZ, a bilingual term recognition system developed for the Slovene-English language pair. The system is a hybrid term extractor that uses morphosyntactic patterns and statistical ranking to propose domain-specific expressions in each of the two languages, after which translation equivalents between the languages are identified using the innovative bag-of-equivalents approach. This simple but effective method uses the Twente word aligner to obtain a lexicon of single-word translation pairs and their probability scores, which is then used to identify correspondences between multi-word terms. The bilingual term recognition system has been tested and evaluated on three parallel subcorpora from the tourism, accounting and military domains. The term alignment component achieves an average precision of 0.83, counting only fully equivalent and domain-relevant terms as positives. A further advantage of the approach is that it successfully detects term variants and multiple translations of a candidate multi-word term. Since the term alignment method does not require sentence-aligned corpora, it can also be used with comparable corpora, provided that a domain-specific lexicon or dictionary of single-word correspondences is already available. The paper concludes with some thoughts on the users of term recognition systems and their needs, based on our observations from the online version of the system.
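The idea of matching multi-word terms through a single-word probabilistic lexicon can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything below is an assumption made for illustration: the names `bag_score` and `align_terms`, the normalisation by the longer term, the threshold of 0.3, and the toy lexicon values are hypothetical and are not taken from LUIZ or from the Twente word aligner.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: source word -> {target word: probability}.
# In the paper such a lexicon comes from a word aligner; these values are invented.
Lexicon = Dict[str, Dict[str, float]]

TOY_LEXICON: Lexicon = {
    "turistična": {"tourist": 0.8, "tourism": 0.1},
    "kmetija": {"farm": 0.9},
}


def bag_score(source_term: str, target_term: str, lexicon: Lexicon) -> float:
    """Score a candidate pair by treating the target term as a bag of words.

    For each source word, take the best probability among its lexicon
    equivalents that occur in the target term, then normalise by the
    length of the longer term (an assumed normalisation).
    """
    source_words = source_term.lower().split()
    target_words = set(target_term.lower().split())
    if not source_words or not target_words:
        return 0.0
    total = 0.0
    for word in source_words:
        equivalents = lexicon.get(word, {})
        total += max(
            (p for t, p in equivalents.items() if t in target_words),
            default=0.0,
        )
    return total / max(len(source_words), len(target_words))


def align_terms(
    source_terms: List[str],
    target_terms: List[str],
    lexicon: Lexicon,
    threshold: float = 0.3,  # assumed cut-off, not from the paper
) -> Dict[str, List[Tuple[str, float]]]:
    """Return, for each source term, all target terms scoring at or above the threshold."""
    alignments: Dict[str, List[Tuple[str, float]]] = {}
    for s in source_terms:
        scored = [(t, bag_score(s, t, lexicon)) for t in target_terms]
        kept = [(t, sc) for t, sc in scored if sc >= threshold]
        alignments[s] = sorted(kept, key=lambda pair: -pair[1])
    return alignments


result = align_terms(
    ["turistična kmetija"],
    ["tourist farm", "farm tourism", "balance sheet"],
    TOY_LEXICON,
)
```

Because word order is ignored and every candidate above the threshold is kept, a sketch of this kind can return several target terms for one source term, which is in the spirit of the variant and multiple-translation detection the abstract mentions. It also needs no sentence alignment, only the two term lists and the lexicon.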
Keywords: term alignment, ATR evaluation, bilingual term recognition, parallel corpora, word alignment, comparable corpora
Published online: 03 December 2010
https://doi.org/10.1075/term.16.2.01vin