Hybrid models for sense guessing of Chinese unknown words
This paper addresses the problem of classifying Chinese unknown words into fine-grained semantic categories defined in a Chinese thesaurus, Cilin (Mei et al. 1984). We present three novel knowledge-based models that capture the relationship between the semantic categories of an unknown word and those of its component characters in three different ways, and combine two of them with a corpus-based model that uses contextual information to classify unknown words. Experiments show that the combined knowledge-based model outperforms previous methods on the same task, but the use of contextual information does not further improve performance.
Keywords: Chinese unknown words, corpus annotation, knowledge-based models, corpus-based models, lexical acquisition, sense tagging
Published online: 19 January 2009