Spoken Corpora Design: Their Constitutive Parameters

Čermák, František

doi:10.1075/ijcl.14.1.07cer

Article published in:

International Journal of Corpus Linguistics
Vol. 14:1 (2009), pp. 113–123

František Čermák | Charles University, Prague

From a linguistic point of view, spoken corpora should be the primary basis for research, but so far this has not been the case. As a result, the question of what such corpora should include has hardly ever been considered, and it often appears that anything spoken is included on an ad hoc basis. The scarcity of, and need for, truly prototypical spoken corpora makes it necessary to map the field in its entirety and identify its relevant parameters. To this end, the present paper translates the major differences between spoken and written texts into usable parameters. Ultimately, this could enable the construction of a representative spoken corpus with a clear core of real and typical spoken language.

Keywords: spoken corpus design, representativeness, spoken and written corpus, core spoken corpus, demographic categories, categories of situation

Published online: 24 March 2009

https://doi.org/10.1075/ijcl.14.1.07cer
