Publication details [#14631]

Barrière, Caroline. 2006. Semi-automatic corpus construction from informative texts. In Bowker, Lynne, ed. Lexicography, terminology, and translation: text-based studies in honour of Ingrid Meyer (Perspectives on Translation). Ottawa: University of Ottawa Press. pp. 81–92.

Publication type

Chapter in book

Publication language

English

Keywords

corpus=corpora | term extraction | tool=translation tool

Abstract

As a precursor to performing terminological research, a terminologist must compile a corpus of texts that is then mined for terminological data. One of the most challenging tasks facing a terminologist is to determine which texts should be included in the corpus. The author explores a method for helping terminologists to automatically construct a useful electronic corpus by using a tool that can compute the knowledge-rich value of a text. This knowledge-rich value is based on the density of knowledge patterns. The author introduces a prototype software tool that can automatically identify texts that have a high knowledge-rich value. The article concludes with an evaluation of some of the limitations of the prototype tool and makes some suggestions for future research that will help to overcome these limitations.
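The idea of scoring a text by the density of its knowledge patterns can be illustrated with a minimal sketch. This is not the prototype tool described in the chapter: the pattern list, the function names `knowledge_rich_value` and `rank_texts`, and the choice to normalise by word count are all assumptions made for illustration only.

```python
import re

# Hypothetical knowledge patterns (illustrative only; not the chapter's list).
# Longer alternatives come first so that "is a kind of" is not counted as "is a".
_PATTERNS = [
    r"\bis a kind of\b",
    r"\bis defined as\b",
    r"\bis used (?:for|to)\b",
    r"\bconsists of\b",
    r"\bsuch as\b",
    r"\bis an?\b",
]
_PATTERN_RE = re.compile("|".join(_PATTERNS), re.IGNORECASE)


def knowledge_rich_value(text):
    """Return pattern matches per word (an assumed definition of density)."""
    words = re.findall(r"\w+", text)
    if not words:
        return 0.0
    # finditer yields non-overlapping matches, so each span is counted once.
    hits = sum(1 for _ in _PATTERN_RE.finditer(text))
    return hits / len(words)


def rank_texts(texts):
    """Sort a {name: text} mapping by descending knowledge-rich value."""
    scored = [(name, knowledge_rich_value(body)) for name, body in texts.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    candidates = {
        "informative": "A capacitor is a device that stores charge. "
                       "It consists of two plates and is used to filter signals.",
        "narrative": "We went to the market yesterday and bought some fruit.",
    }
    for name, score in rank_texts(candidates):
        print(f"{name}: {score:.3f}")
```

Under these assumptions, a terminologist could keep only the candidate texts whose score exceeds a chosen threshold; the threshold and the pattern inventory would need to be tuned for the domain and language in question.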

Source: A. Matthyssen