Publications
Publication details [#8337]
Wanner, Leo, Bernd Bohnet, Mark Giereth and Vanesa Vidal. 2005. The first steps towards the automatic compilation of specialized collocation dictionaries. In Ibekwe-SanJuan, Fidelia, Anne Condamines and María Teresa Cabré Castellví, eds. Application-driven terminology engineering. Special issue of Terminology. International Journal of Theoretical and Applied Issues in Specialized Communication 11 (1): 143–180.
Publication type
Article in Special issue
Publication language
English
Keywords
Journal DOI
10.1075/term
Abstract
Collocation dictionaries are essential in specialized discourse for understanding, production, and translation. Translation in particular, which is often undertaken by professionals who are not specialists in the field, needs dictionaries with detailed syntactic and semantic information on lexical and semantic links between terms. However, collocation dictionaries are hardly available for general, let alone specialized, discourse. The manual compilation of collocation dictionaries from large corpora is a time-consuming and cost-intensive procedure. A (partial) automation of this procedure has become a high-priority topic in computational lexicography. In this article, the authors discuss how collocations can be acquired from specialized corpora and labeled with semantic tags using machine-learning techniques. As semantic tags, lexical functions from Explanatory Combinatorial Lexicology are used. The authors explore the performance of two different machine-learning techniques, Nearest Neighbor Classification and Tree-Augmented Bayesian Classification, testing them on a Spanish law corpus.
Source: Based on abstract in journal