Evolution and present situation of corpus research in China

Feng, Zhiwei

doi:10.1075/ijcl.11.2.03fen

Article published in:

International Journal of Corpus Linguistics
Vol. 11:2 (2006) ► pp.173–207

Zhiwei Feng | Institute of Applied Linguistics, China

In this paper, the author describes in detail the development and present state of corpus linguistics in China: early corpora, large-scale and authentic text corpora, national corpora, speech corpora, bilingual corpora and corpora of minority languages in China. The main corpus processing techniques are also introduced: automatic word segmentation of Chinese text, automatic PoS tagging, automatic tagging of phrase structure and automatic alignment of bilingual corpora. The paper thus offers a bird’s-eye view of corpus linguistics in China. Finally, the author discusses several problems in current corpus research: standardization of corpus specifications, sharing of language resources, knowledge properties, etc.

Keywords: automatic tagging of phrase structure, automatic alignment of bilingual corpora, corpus, large-scale and authentic text, speech corpora, bilingual corpora, corpora of minority languages in China, automatic word segmentation, automatic PoS tagging

Published online: 11 July 2006

https://doi.org/10.1075/ijcl.11.2.03fen

Cited by (9)

Guan, Wei

2016. Corpus linguistics in Chinese contexts. System 61 ► pp. 118 ff.

Zhao, Qiurong

2016. Review of Zou, Smith & Hoey (2015): Corpus Linguistics in Chinese Contexts. Chinese Language and Discourse. An International and Interdisciplinary Journal 7:1 ► pp. 166 ff.

Xu, Jiajin

2015. Corpus-based Chinese studies. Chinese Language and Discourse. An International and Interdisciplinary Journal 6:2 ► pp. 218 ff.

Xu, Jiajin

2019. The Corpus Approach to the Teaching and Learning of Chinese as an L1 and an L2 in Retrospect. In Computational and Corpus Approaches to Chinese Language Learning [Chinese Language Learning Sciences], ► pp. 33 ff.

Cheng, Winnie

2012. Corpora: Chinese-Language. In The Encyclopedia of Applied Linguistics.

Wong, Hai Ming, Susan M. Bridges, Cynthia K.Y. Yiu, Colman P.J. McGrath, Terry K. Au & Divya S. Parthasarathy

2012. Development and validation of Hong Kong Rapid Estimate of Adult Literacy in Dentistry. Journal of Investigative and Clinical Dentistry 3:2 ► pp. 118 ff.

Han, Chong

2011. Reading Chinese online entertainment news: Metaphor and language play. Journal of Pragmatics 43:14 ► pp. 3473 ff.

Cai, Qing, Marc Brysbaert & Antoni Rodriguez-Fornells

2010. SUBTLEX-CH: Chinese Word and Character Frequencies Based on Film Subtitles. PLoS ONE 5:6 ► pp. e10729 ff.

권혁승

2009. The SNU Korean Learner Corpus of English: Compilation and Application. English Language and Linguistics 28 ► pp. 203 ff.

This list is based on CrossRef data as of 17 October 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.