Hybrid models for sense guessing of Chinese unknown words

Lu, Xiaofei

doi:10.1075/ijcl.13.1.06lu

Article published in:

International Journal of Corpus Linguistics
Vol. 13:1 (2008), pp. 99–128

Xiaofei Lu | The Pennsylvania State University

This paper addresses the problem of classifying Chinese unknown words into fine-grained semantic categories defined in a Chinese thesaurus, Cilin (Mei et al. 1984). We present three novel knowledge-based models that capture the relationship between the semantic categories of an unknown word and those of its component characters in three different ways, and combine two of them with a corpus-based model that uses contextual information to classify unknown words. Experiments show that the combined knowledge-based model outperforms previous methods on the same task, but the use of contextual information does not further improve performance.

Keywords: Chinese unknown words, corpus annotation, knowledge-based models, corpus-based models, lexical acquisition, sense tagging

Published online: 19 January 2009

Cited by two other publications

Lu, Xiaofei & Renfen Hu

2021. Sense-aware lexical sophistication indices and their relationship to second language writing quality. Behavior Research Methods 54:3, pp. 1444 ff.

Lu, Xiaofei

2014. Summary and Outlook. In Computational Methods for Corpus Annotation and Analysis, pp. 175 ff.

This list is based on CrossRef data as of 5 August 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.