The coronavirus pandemic may be the largest crisis the world has had to face since World War II. It comes as no
surprise that it is also affecting language, our primary communication tool. In this short paper, we present
three interconnected resources designed to capture and illustrate these effects on a subset of the German language: an
RSS corpus of German-language newsfeeds (with freely available untruncated frequency lists), a continuously updated HTML page
tracking the diversity of the vocabulary in the RSS corpus, and a Shiny web application that enables other researchers and the
broader public to explore the corpus in terms of basic frequencies.