Gerold Schneider
List of John Benjamins publications for which Gerold Schneider plays a role.
Chapter 4. Digital Dickens: An automated content analysis of Charles Dickens’ novels Crossing Boundaries through Corpora: Innovative corpus approaches within and beyond linguistics, Buschfeld, Sarah, Patricia Ronan, Theresa Neumaier, Andreas Weilinghoff and Lisa Westermayer (eds.), pp. 62–98 | Chapter
2024 This investigation employs computational linguistic methods such as document classification, topic modelling, and distributional semantics to scrutinize eight novels by Charles Dickens, uncovering dimensions of social criticism, literary realism, and narrative structures. While affirming…
Chapter 12. Colloquialisation, compression and democratisation in British parliamentary debates Exploring Language and Society with Big Data: Parliamentary discourse across time and space, Korhonen, Minna, Haidee Kotze and Jukka Tyrkkö (eds.), pp. 336–372 | Chapter
2023 We conduct an analysis of the link between colloquialisation and democratisation in debates in the British parliament. Our corpus is a sampler of the Hansard archive, covering the period 1803–2005 and containing 170 million words. We first investigate how the linguistic patterns of parliamentary…
Differences in syntactic annotation affect retrieval: Verb-attached PPs in the history of English International Journal of Corpus Linguistics 28:3, pp. 378–406 | Article
2023 Prepositional phrases (PPs) play an important part in English argument structure constructions, but pose considerable challenges for linguistic investigations of any kind. In addition to the fact that PP-attachment is generally notoriously difficult to model computationally, a particularly…
Chapter 3. Medical topics and style from 1500 to 2018: A corpus-driven exploration Corpus Pragmatic Studies on the History of Medical Discourse, Hiltunen, Turo and Irma Taavitsainen (eds.), pp. 49–78 | Chapter
2022 This chapter investigates changes in medical topics, style and language across 500 years, from 1500 to 2018. To do so, we employ data-driven methods of Computational Linguistics and Digital Humanities: document classification, topic modelling, and automatically constructed conceptual maps. We…
Chapter 7. Syntactic changes in verbal clauses and noun phrases from 1500 onwards English Historical Linguistics: Change in structure and meaning, Los, Bettelou, Claire Cowie, Patrick Honeybone and Graeme Trousdale (eds.), pp. 163–200 | Chapter
2022 Can the promise of data-driven methods hold in historical linguistics? Can they detect salient syntactic changes and open new research avenues? I first use data-driven measures to detect patterns in the ARCHER corpus. Secondly, I qualitatively interpret the differences and build hypotheses.…
Recent changes in spoken British English in verbal and nominal constructions Broadening the Spectrum of Corpus Linguistics: New approaches to variability and change, Flach, Susanne and Martin Hilpert (eds.), pp. 173–195 | Chapter
2022 Starting from a data-driven approach, the current paper compares the BNC1994 spoken to the BNC2014. We first narrow down possible research questions due to differences in the compilation and transcription of the two BNC generations. Then we investigate three robustly detectable changes at the…
Spelling normalisation of Late Modern English: Comparison and combination of VARD and character-based statistical machine translation Late Modern English: Novel encounters, Kytö, Merja and Erik Smitterberg (eds.), pp. 243–268 | Chapter
2020 To be able to profit from natural language processing (NLP) tools for analysing historical text, an important step is spelling normalisation. We first compare and second combine two different approaches: on the one hand VARD, a rule-based system which is based on dictionary lookup and rules with…
Changes in society and language: Charting poverty Corpora and the Changing Society: Studies in the evolution of English, Rautionaho, Paula, Arja Nurmi and Juhani Klemola (eds.), pp. 29–56 | Chapter
2020 This study addresses how societal and linguistic changes can be detected using historical corpora, with the topics of poverty and industrial revolution as a case study, based on large historical corpora, in particular EEBO, and CLMET3.0. The results, based on a rich array of state-of-the art…
Chapter 3. Topics of eighteenth-century medical writing with triangulation of methods: LMEMT and the underlying reality Late Modern English Medical Texts: Writing medicine in the eighteenth century, Taavitsainen, Irma and Turo Hiltunen (eds.), pp. 31–74 | Chapter
2019 This chapter deals with the most important developments within society and the medical discourse community in eighteenth-century Britain. It applies several methods by way of triangulation to probe into relevant aspects of the history of medicine and medical writing between 1700 and 1800. The…
Detecting innovations in a parsed corpus of learner English Rethinking Linguistic Creativity in Non-native Englishes, Deshors, Sandra C., Sandra Götz and Samantha Laporte (eds.), pp. 47–74 | Article
2018 In research on L2 English, recent corpus-based studies indicate that some nonstandard forms are shared by indigenized (ESL) and foreign (EFL) varieties of English, which challenges the idea of a clear dichotomy between innovation and error. We present a data-driven large-scale method to detect…
Chapter 2. From lexical bundles to surprisal and language models: Measuring the idiom principle in native and learner language Applications of Pattern-driven Methods in Corpus Linguistics, Kopaczyk, Joanna and Jukka Tyrkkö (eds.), pp. 15–56 | Chapter
2018 We exploit the information theoretic measure of surprisal to analyze the formulaicity of lexical sequences. We first show the prevalence of individual lexical bundles, then we argue that abstracting to surprisal as an information-theoretic measure of lexical bundleness, formulaicity and…
Introduction New Approaches to English Linguistics: Building bridges, Timofeeva, Olga, Anne-Christine Gardner, Alpo Honkapohja and Sarah Chevalier (eds.), pp. 1–12 | Article
2016
Detecting innovations in a parsed corpus of learner English Linguistic Innovations: Rethinking linguistic creativity in non-native Englishes, Deshors, Sandra C., Sandra Götz and Samantha Laporte (eds.), pp. 177–204 | Article
2016 In research on L2 English, recent corpus-based studies indicate that some non-standard forms are shared by indigenized (ESL) and foreign (EFL) varieties of English, which challenges the idea of a clear dichotomy between innovation and error. We present a data-driven large-scale method to detect…
Statistical sequence and parsing models for descriptive linguistics and psycholinguistics New Approaches to English Linguistics: Building bridges, Timofeeva, Olga, Anne-Christine Gardner, Alpo Honkapohja and Sarah Chevalier (eds.), pp. 281–320 | Article
2016 This study shows that using computational linguistic models is beneficial for descriptive linguistics and psycholinguistics. It applies two models to various English genres and learner language: 1) surprisal and 2) a syntactic parser, allowing us to investigate the role of ambiguity and the…
Determining light verb constructions in contemporary British and Irish English International Journal of Corpus Linguistics 20:3, pp. 326–354 | Article
2015 This study implements an automated parser-based approach to the investigation of light verb constructions. The database consisting of ICE-GB and ICE-IRE is used to obtain qualitative and quantitative results on the use of light verb structures. The study explains and evaluates the steps employed to…
2015
“Off with their heads”: Profiling TAM in ICE corpora Mapping Unity and Diversity World-Wide: Corpus-Based Studies of New Englishes, Hundt, Marianne and Ulrike Gut (eds.), pp. 1–34 | Article
2012 The main aim of our chapter is a methodological one, that of comparing a largely data-driven approach to regional variation in world Englishes and a corpus-based approach. As a case study, we examine tense, aspect and modality (TAM) differences between five varieties. Our investigation uses…
Fishing for compliments: Precision and recall in corpus-linguistic compliment research Speech Acts in the History of English, Jucker, Andreas H. and Irma Taavitsainen (eds.), pp. 273–294 | Article
2008
Using distributional similarity to organise biomedical terminology Application-Driven Terminology Engineering, Ibekwe-SanJuan, Fidelia, Anne Condamines and Teresa Cabré (eds.), pp. 97–126 | Article
2007
Using distributional similarity to organise biomedical terminology Application-Driven Terminology Engineering, Ibekwe-SanJuan, Fidelia, Anne Condamines and Teresa Cabré (eds.), pp. 107–141 | Article
2005