Chapter 4
Digital Dickens
An automated content analysis of Charles Dickens’ novels
This investigation employs computational linguistic methods such as document classification, topic
modelling, and distributional semantics to scrutinize eight novels by Charles Dickens, uncovering dimensions of social
criticism, literary realism, and narrative structures. While the automated analysis of social criticism yields
positive results, the study emphasizes that differing associations could be discovered only through the semantic
abstraction that distributional semantics, word embeddings, and topic modelling offer. Literary realism is successfully
traced through detailed descriptions and everyday activities. Plotting plots with computational linguistic methods,
specifically conceptual maps with textplot, shows promise but requires refinement. Overall, the study shows that current
methods in content analysis offer new possibilities for literary analysis and digital humanities.
Article outline
- 1. Introduction
- 2. Motivation and background
- 2.1 Distributional semantics
- 2.2 Content analysis
- 2.3 Dickens’ visions and style
- 3. Materials
- 3.1 A corpus of Dickens novels
- 3.2 CLMET 3.0 Corpus
- 3.3 Combined corpus
- 4. Methods
- 4.1 Document classification
- 4.2 Topic modelling
- 4.3 Conceptual maps
- 4.4 Distributional semantics
- 4.5 Comparison and triangulation of methods
- 5. Results
- 5.1 Poverty in Dickens
- 5.1.1 Frequency
- 5.1.2 Document classification
- 5.1.3 Distributional semantics
- 5.1.4 Topic modelling
- 5.2 Literary realism
- 5.2.1 Distributional semantics
- 5.2.2 Topic modelling
- 5.2.2 Conceptual maps
- 5.2.3 Plotting plots
- 6. Conclusion
- Acknowledgements
- Notes
- References