Publications

Publication details [#59583]

Geeraerts, Dirk, Dirk Speelman and Yves Peirsman. 2015. The corpus-based identification of cross-lectal synonyms in pluricentric languages. International Journal of Corpus Linguistics 20(1): 54–80.
Publication type
Article in journal
Publication language
English
Place, Publisher
John Benjamins
Journal DOI
10.1075/ijcl

Annotation

This article discusses a corpus-based method for the automatic identification of synonyms across different varieties of the same language. The method, based on the paradigm of distributional semantics, quantifies semantic similarity on the basis of contextual similarity in two comparable corpora. In two case studies, one for Dutch and one for German, the study shows that the method automatically identifies the correct synonym for 31% and 25% of the target words, respectively. A manual error analysis moreover indicates that many additional synonyms are very close in the distributional model, while most other distributional neighbours are semantically related to the target word along dimensions other than synonymy. On the basis of these results, it is argued that distributional-semantic methods can play a crucial role in the further evolution of corpus-based lexical semantics into a more quantitative discipline.