Chapter 3
Medical topics and style from 1500 to 2018
A corpus-driven exploration
This chapter investigates changes in medical topics, style and language across more than 500 years, from 1500 to 2018. To do so, we employ data-driven methods from Computational Linguistics and Digital Humanities: document classification, topic modelling, and automatically constructed conceptual maps. We trace changes from traditional thinking in the scholastic period to empirical methods, to professionalised medicine, and finally to the increasing importance of data, statistics and clinical studies, which marks a move away from symptom-centred medicine. We conclude that medical discourse has undergone radical changes and that data-driven methods reflect these changes and offer an unprecedented overview. We also critically discuss shortcomings of our data and methods.
Keywords: data-driven approaches, machine learning, collocations, Topic Modelling, history of medicine, Digital Humanities, conceptual maps, Kernel Density Estimation, automated content analysis, English medical discourse, language and health, culturomics
Article outline
- 1. Introduction
- 2. Motivation
- 2.1 Systematic comparison of all lexical features
- 2.2 Advanced computational methods
- 2.3 Sampling and representativeness
- 3. Materials
- 3.1 CEEM
- 3.2 ARCHER Medical
- 3.3 HIMERA
- 3.4 PubMed Excerpt
- 3.5 Overview of the complete data of our investigation
- 3.6 Limitations of the data
- 4. Methods
- 4.1 Data preparation
- 4.2 Supervised document classification
- 4.3 Unsupervised topic modelling
- 4.4 Unsupervised Conceptual Maps with Kernel Density Estimation
- 5. Results
- 5.1 Results of supervised document classification
- 5.2 Results of unsupervised topic modelling
- 5.3 Results of Unsupervised Conceptual Maps with Kernel Density Estimation
- 6. Conclusion and future prospects
- Acknowledgements
- Notes
- References