Short Paper
Tracking and analyzing recent developments in German-language online press in the face of the coronavirus crisis
cOWIDplus Analysis and cOWIDplus Viewer
The coronavirus pandemic may be the largest crisis the world has had to face since World War II. It is no
surprise that it is also affecting language, our primary communication tool. In this short paper, we present
three interconnected resources designed to capture and illustrate these effects on a subset of the German language: an
RSS corpus of German-language newsfeeds (with freely available untruncated frequency lists), a continuously updated HTML page
tracking the diversity of the vocabulary in the RSS corpus, and a Shiny web application that enables other researchers and the
broader public to explore the corpus in terms of basic frequencies.
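As a rough illustration of what tracking vocabulary diversity from frequency lists can involve, the sketch below computes two standard measures, the type-token ratio and Shannon entropy, for two toy word lists. This is only an assumed example: the abstract does not say which diversity measure the HTML page uses, and the function names and the sample data are hypothetical.

```python
import math
from collections import Counter


def type_token_ratio(freqs):
    """Number of distinct word forms (types) divided by number of running words (tokens)."""
    tokens = sum(freqs.values())
    types = sum(1 for count in freqs.values() if count > 0)
    return types / tokens if tokens else 0.0


def shannon_entropy(freqs):
    """Entropy in bits of the relative-frequency distribution of word forms."""
    tokens = sum(freqs.values())
    if tokens == 0:
        return 0.0
    return -sum(
        (count / tokens) * math.log2(count / tokens)
        for count in freqs.values()
        if count > 0
    )


# Hypothetical toy frequency lists standing in for two days of headlines.
day_a = Counter("das virus das wetter die wahl der sport".split())
day_b = Counter("das virus das virus das virus die krise".split())

for label, freqs in (("day_a", day_a), ("day_b", day_b)):
    print(label, type_token_ratio(freqs), round(shannon_entropy(freqs), 3))
```

In this toy data, the second list repeats a few words more often, so both measures come out lower than for the first list. A drop of this kind is what one would expect if a single topic came to dominate the news vocabulary.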
Article outline
- 1. The coronavirus pandemic and its influence on language
- 2. The RSS corpus
- 2.1 Corpus preparation
- 2.2 Corpus size
- 3. cOWIDplus Analysis
- 4. cOWIDplus Viewer
- 4.1 Architecture and interface
- 4.2 Examples
- 5. Conclusions
- References