Author and register as sources of variation
A corpus-based study using elicited texts
Václav Cvrček | Charles University
Zuzana Laubeová | Charles University
David Lukeš | Charles University
Petra Poukarová | Charles University
Anna Řehořková | Charles University
Adrian Jan Zasina | Charles University
This paper investigates the contribution of author/idiolect vs. register/type-of-text – as the most salient factors
influencing the final shape of a text – towards explaining the variation observed in Czech texts. Since it is almost impossible to explore
the effect of these factors on authentic data, we used elicited letters collected in a fully crossed experimental design (a representative
sample of 200 authors × four elicitation scenarios serving as a proxy for register variation). The variation encompassed by the elicited
texts is analyzed through the lens of a general-purpose multi-dimensional model of Czech. Using triangulation via three established
statistical methods and one devised for the purpose of this study, we find that register matters a great deal, explaining 1.5 times as much
variation overall as idiolect. This should be taken into account when designing research in sociolinguistics or variation studies in
general.
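To make the idea of apportioning variation in a fully crossed design concrete, the following is a minimal sketch on synthetic data. It is not one of the methods used in the paper: it applies a plain two-way sum-of-squares decomposition with one text per author × scenario cell, and the function name `variance_shares`, the effect sizes, and the simulated scores are all assumptions made for illustration only.

```python
import numpy as np


def variance_shares(scores):
    """Split total variation of a fully crossed author x register table.

    scores: 2-D array, rows = authors, columns = registers,
    one observation (e.g. a dimension score of one text) per cell.
    Returns the share of the total sum of squares attributable to
    authors, to registers, and to the residual (interaction + noise).
    """
    scores = np.asarray(scores, dtype=float)
    n_authors, n_registers = scores.shape
    grand_mean = scores.mean()

    # Main-effect sums of squares, computed from row and column means.
    author_means = scores.mean(axis=1)
    register_means = scores.mean(axis=0)
    ss_author = n_registers * ((author_means - grand_mean) ** 2).sum()
    ss_register = n_authors * ((register_means - grand_mean) ** 2).sum()

    ss_total = ((scores - grand_mean) ** 2).sum()
    ss_residual = ss_total - ss_author - ss_register

    return {
        "author": ss_author / ss_total,
        "register": ss_register / ss_total,
        "residual": ss_residual / ss_total,
    }


# Synthetic illustration only: 200 authors x 4 scenarios, with made-up
# effect sizes (these numbers are NOT taken from the study).
rng = np.random.default_rng(0)
author_effect = rng.normal(0.0, 1.0, size=(200, 1))
register_effect = np.array([[-1.5, -0.5, 0.5, 1.5]])
noise = rng.normal(0.0, 1.0, size=(200, 4))
simulated = author_effect + register_effect + noise

shares = variance_shares(simulated)
print(shares)
print("register/author ratio:", shares["register"] / shares["author"])
```

Because every author contributes a text in every scenario, the author and register sums of squares are orthogonal, which is what allows the two shares to be compared directly as a ratio.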
Keywords: variation, idiolect, register, multi-dimensional analysis, Czech
Published online: 27 October 2020
https://doi.org/10.1075/ijcl.19020.cvr