Chapter published in:
Corpora and the Changing Society: Studies in the evolution of English
Edited by Paula Rautionaho, Arja Nurmi and Juhani Klemola
[Studies in Corpus Linguistics 96] 2020
pp. 29–56
Changes in society and language: Charting poverty
Gerold Schneider | University of Zurich
This study addresses how societal and linguistic changes can be
detected using large historical corpora, in particular EEBO and CLMET3.0,
taking poverty and the industrial revolution as a case study.
The results, based on a rich array of state-of-the-art statistical approaches (such as
kernel density estimation), show how poverty, the industrial revolution, and urbanization
are linked, for instance through the associations between war, religion, family,
poverty, and suffering. The study also discusses the importance of data size and
cleanness, the temptations of distant reading, and the necessity of validating the
discovered patterns through the interaction of close reading and distant reading.
Article outline
- 1. Introduction
- 2. Data and pre-processing
- 2.1 The EEBO Collection as sampler corpus
- 2.2 The CLMET3.0 corpus
- 2.3 The pre-processing step of spelling normalization
- 3. Methods
- 3.1 Data-based and data-driven approaches
- 3.2 Document classification
- 3.3 Topic modelling
- 3.4 Conceptual maps
- 4. Results and discussion
- 4.1 Dictionary-based approach
- 4.2 Topic modelling
- 4.2.1 EEBO early vs. EEBO late
- 4.2.2 Adding CLMET3.0 and increasing the number of topics
- 4.3 Conceptual maps
- 5. Conclusions
- Notes
- References
Published online: 08 April 2020
https://doi.org/10.1075/scl.96.02sch
References
Corpora and software
CLMET3.0 = Corpus of Late Modern English Texts, version 3.0. De Smet, Hendrik, Diller, Hans-Jürgen & Tyrkkö, Jukka (comps). https://perswww.kuleuven.be/~u0044428/
EEBO = Early English Books Online. Davies, Mark
Gephi
Mallet = Machine Learning for LanguagE Toolkit
Other references
Ananiadou, Sophia, Kell, Douglas B. & Tsujii, Jun-ichi
Baroni, Marco & Lenci, Alessandro
Bartsch, Sabine & Evert, Stefan
Church, Kenneth
C. W.
2013 Did living standards improve during the Industrial Revolution? The Economist, September 13 2013 <https://www.economist.com/free-exchange/2013/09/13/did-living-standards-improve-during-the-industrial-revolution> (30 December 2018).
Daudin, Guillaume, O’Rourke, Kevin H., & Prados de la Escosura, Leandro
2008 Trade and empire, 1700–1870. Technical Report # 2008–24, OFCE: Centre de recherche en économie et sciences po. <https://www.ofce.sciences-po.fr/pdf/dtravail/WP2008-24.pdf> (30 December 2018).
Evert, Stefan
Food and Agriculture Organisation of the United Nations
November 2003 Anti-hunger Programme. A twin-track approach to hunger reduction: priorities for national and international action. http://www.fao.org/3/J0563E/j0563e02.htm.
Firth, Rupert
Glynn, Dylan
Gries, Stefan T.
Grimmer, Justin & Stewart, Brandon
Hatton, Timothy J. & Bray, Bernice E.
Hilpert, Martin & Gries, Stefan T.
Jurafsky, Daniel & Martin, James H.
Komlos, John
Michel, Jean-Baptiste, Shen, Yuan Kui, Presser Aiden, Aviva, Veres, Adrian, Gray, Matthew K., The Google Books Team, Pickett, Joseph P., Hoiberg, Dale, Clancy, Dan, Norvig, Peter, Orwant, Jon, Pinker, Steven, Nowak, Martin A. & Lieberman Aiden, Erez
Oakes, Michael P.
Rayson, Paul
Sahlgren, Magnus
Schneider, Gerold
2018 Differences between Swiss High German and German High German via data-driven methods. In Proceedings of SwissText 2018, Mark Cieliebak, Don Tuggener & Fernando Benites (eds), 6–16. <http://ceur-ws.org/Vol-2226/> (30 December 2018).
Schneider, Gerold, Pettersson, Eva & Percillier, Michael
2017 Comparing rule-based and SMT-based spelling normalisation for English historical texts. Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language. <http://www.ep.liu.se/ecp/133/008/ecp17133008.pdf> (30 December 2018).
Schwartz, H. Andrew & Ungar, Lyle H.
Szreter, Simon & Mooney, Graham
Taavitsainen, Irma & Schneider, Gerold
Tognini-Bonelli, Elena
Wüest, Bruno, Schneider, Gerold & Amsler, Michael