Publication details [#54079]

Hoste, Véronique, Klaar Vanopstal, Els Lefever and Isabelle Delaere. 2010. Classification-based scientific term detection in patient information. Terminology 16 (1): 1–29.
Publication type
Article in journal
Publication language
Language as a subject
Place, Publisher
John Benjamins
Journal DOI


Although intended for the “average layman”, both in terms of readability and content, current patient information still contains many scientific terms. Several studies have concluded that the use of scientific terminology is one of the factors that greatly influence the readability of patient information. The present study deals with the automatic recognition of overly scientific terminology as a first step towards replacing the recognized scientific terms with their popular counterparts. To this end, the paper experiments with two approaches: a dictionary-based approach and a learning-based approach trained on a rich feature vector. The research was conducted on a bilingual corpus of English and Dutch EPARs (European Public Assessment Reports). The results show that we can extract scientific terms with high accuracy (> 80%, 10% below human performance) for both languages. Furthermore, they show that a lexicon-independent approach, which relies solely on orthographical and morphological information, is the most powerful predictor of the scientific character of a given term.