Hybrid Approaches for Automatic Segmentation and Annotation of a Chinese Text Corpus

Feng, Zhiwei

doi:10.1075/ijcl.6.si.04fen

Article published in:

Text Corpora and Multilingual Lexicography
Wolfgang Teubert
[International Journal of Corpus Linguistics 6:SI] 2001
pp. 35–42

This paper describes hybrid approaches to the automatic segmentation and annotation of a Chinese text corpus and reports some experimental results. The hybrid approaches combine a rule-based method, a statistical method, and an automatic learning method. Combining the three noticeably improves the precision of segmentation and annotation of a Chinese text corpus.
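To make the idea of combining a rule-based step with a statistical step concrete, here is a minimal Python sketch. It is not the method of the paper: it assumes dictionary-driven forward and backward maximum matching as the rule-based component and a smoothed unigram score as the statistical component. The lexicon, the frequency counts, and all function names (`forward_max_match`, `backward_max_match`, `log_score`, `hybrid_segment`) are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy lexicon with made-up frequency counts (not from the paper).
LEXICON = {"研究": 50, "研究生": 10, "生命": 40, "命": 2, "的": 500, "起源": 20}
MAX_LEN = max(len(word) for word in LEXICON)
TOTAL = sum(LEXICON.values())


def forward_max_match(text):
    """Rule-based step: scan left to right, taking the longest lexicon word."""
    words, i = [], 0
    while i < len(text):
        for n in range(min(MAX_LEN, len(text) - i), 0, -1):
            cand = text[i:i + n]
            # Fall back to a single character when no lexicon word matches.
            if n == 1 or cand in LEXICON:
                words.append(cand)
                i += n
                break
    return words


def backward_max_match(text):
    """Rule-based step: scan right to left, taking the longest lexicon word."""
    words, j = [], len(text)
    while j > 0:
        for n in range(min(MAX_LEN, j), 0, -1):
            cand = text[j - n:j]
            if n == 1 or cand in LEXICON:
                words.append(cand)
                j -= n
                break
    return words[::-1]


def log_score(words):
    """Statistical step: add-one smoothed unigram log-probability."""
    vocab = len(LEXICON)
    return sum(math.log((LEXICON.get(w, 0) + 1) / (TOTAL + vocab)) for w in words)


def hybrid_segment(text):
    """Use the statistical score only when the two rule-based results differ."""
    fwd, bwd = forward_max_match(text), backward_max_match(text)
    if fwd == bwd:
        return fwd
    return max((fwd, bwd), key=log_score)


if __name__ == "__main__":
    print(hybrid_segment("研究生命的起源"))
```

On the ambiguous string 研究生命的起源, the two matching directions disagree (研究生 / 命 versus 研究 / 生命), and under the toy counts above the unigram score prefers the backward result. A learning component such as transformation-based error-driven correction would sit after this step and is not sketched here.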

Keywords: tagging, hybrid approach, rule-based approach, HMM (Hidden Markov Model), CLAWS (Constituent-Likelihood Automatic Word-tagging System) algorithm, TBED (Transformation-Based Error-Driven learning), segmentation, Brill method

Published online: 17 December 2001
