The Representativeness of Czech corpora
The attempt to balance corpora with respect to their future usage led to the introduction of the term expectations (Králík 2001b). On the basis of several statistical inquiries into such expectations, the textual structure of SYN2000, which is the synchronic part of the Czech National Corpus (CNC), was proposed and realised. The present article explains the original composition briefly and discusses two new inquiries concerning expectations (A-2001 and C-2001). Important corrections for future work on the CNC are suggested. The expectations concerning newspapers changed radically during 1996–2001. Within the same period, an obvious rise of interest in fiction can be detected. The reasons for these developments can be traced to trends in Czech society. Thus, we have proposed a considerable reduction in the proportion of newspaper texts and a large increase in the proportion of fiction texts. According to new searches, more detailed percentages for specific subject areas are suggested.
Keywords: textual structure, expectation, corpus, corpora
Published online: 01 September 2005