Publication details [#8336]

Publication type: Article in Special issue
Publication language: English
Journal DOI: 10.1075/term

Abstract

This paper investigates the application of distributional similarity techniques to the problem of structurally organising biomedical terminology. The application domain is the relatively small GENIA corpus. Using terms that have been accurately marked up by hand within the corpus, the authors consider the problem of automatically determining semantic proximity. For this purpose, terminological units are defined as normalised classes of individual terms. Syntactic analysis of the corpus data is carried out using the Pro3Gres parser and provides the data required to calculate distributional similarity using a variety of measures. Evaluation is performed against a hand-crafted gold standard for this domain, the GENIA ontology. The authors show that distributional similarity can be used to predict semantic type with a good degree of accuracy, reaching a best value of 63.1%.
Source: Based on abstract in journal