Classification-based scientific term detection in patient information
Although intended for the “average layman” in both readability and content, current patient information still contains many scientific terms. Several studies have concluded that the use of scientific terminology is one of the factors that most strongly affect the readability of patient information. The present study addresses the automatic recognition of overly scientific terminology as a first step towards replacing the recognized scientific terms with their popular counterparts. To this end, we experimented with two approaches: a dictionary-based approach and a learning-based approach trained on a rich feature vector. The research was conducted on a bilingual corpus of English and Dutch European Public Assessment Reports (EPARs). Our results show that we can extract scientific terms with high accuracy (> 80%, 10% below human performance) for both languages. Furthermore, we show that a lexicon-independent approach, which relies solely on orthographical and morphological information, is the most powerful predictor of the scientific character of a given term.
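To make the idea of a lexicon-independent representation more concrete, the following is a minimal sketch of how a candidate term might be mapped to orthographical and morphological features before being passed to a classifier. The abstract does not list the actual feature set, so every feature here, the function name `term_features`, and the two affix lists are assumptions chosen for illustration only.

```python
# Hypothetical affix lists for illustration; they are NOT taken from the study.
NEOCLASSICAL_PREFIXES = ("hyper", "hypo", "anti", "cardio", "neuro", "haem", "hem")
NEOCLASSICAL_SUFFIXES = ("itis", "osis", "emia", "aemia", "pathy", "ectomy", "oma")


def term_features(term: str) -> dict:
    """Map a candidate term to surface features that need no dictionary lookup."""
    lower = term.lower()
    letters = [c for c in term if c.isalpha()]
    return {
        # Orthographical cues: overall shape of the string.
        "length": len(term),
        "n_tokens": len(term.split()),
        "has_digit": any(c.isdigit() for c in term),
        "has_hyphen": "-" in term,
        "upper_ratio": sum(c.isupper() for c in letters) / len(letters) if letters else 0.0,
        # Morphological cues: character affixes and (assumed) neoclassical affixes.
        "prefix3": lower[:3],
        "suffix3": lower[-3:],
        "neoclassical_prefix": lower.startswith(NEOCLASSICAL_PREFIXES),
        "neoclassical_suffix": lower.endswith(NEOCLASSICAL_SUFFIXES),
    }


if __name__ == "__main__":
    for candidate in ("hypertension", "gastritis", "headache"):
        print(candidate, term_features(candidate))
```

A feature dictionary of this kind could be vectorized and fed to any standard classifier. Which learner and which features the study actually used cannot be determined from the abstract alone.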
Keywords: automatic term extraction, patient information, machine learning
Published online: 11 May 2010