Gerold Schneider

orcid.org/0000-0002-1905-6237

List of John Benjamins publications in which Gerold Schneider is involved.

Articles (22)

Reveilhac, Maud and Gerold Schneider 2025 Evaluating a transparent and interpretable approach to stance detection using linguistic markers in social media data Reproducibility, Replicability, and Robustness in Corpus Linguistics, Schweinberger, Martin and Michael Haugh (eds.), pp. 195–233 | Article

Our study focuses on replicability, which entails researchers’ ability to achieve similar results to a prior study using identical methods but a different yet comparable dataset. We address the challenge of stance detection (determining whether a document is “favorable,” “against,” or “neutral”…

Reveilhac, Maud and Gerold Schneider 2025 Measuring language complexity about European politics in Swiss parliamentary debates Mathematical Modelling in Linguistics and Text Analysis: Theory and applications, Pawłowski, Adam, Sheila Embleton, Jan Mačutek and Aris Xanthos (eds.), pp. 191–206 | Chapter

This study investigates changes in the political discourse surrounding Europe and European integration in Swiss parliamentary debates from 1995 to 2022. It relies on text analysis methods for measuring linguistic complexity and benchmarks the observed trends against topicality, party affiliation…

Schneider, Gerold 2024 Chapter 4. Digital Dickens: An automated content analysis of Charles Dickens’ novels Crossing Boundaries through Corpora: Innovative corpus approaches within and beyond linguistics, Buschfeld, Sarah, Patricia Ronan, Theresa Neumaier, Andreas Weilinghoff and Lisa Westermayer (eds.), pp. 62–98 | Chapter

This investigation employs computational linguistic methods such as document classification, topic modelling, and distributional semantics to scrutinize eight novels by Charles Dickens, uncovering dimensions of social criticism, literary realism, and narrative structures. While affirming…

Schneider, Gerold and Maud Reveilhac 2023 Chapter 12. Colloquialisation, compression and democratisation in British parliamentary debates Exploring Language and Society with Big Data: Parliamentary discourse across time and space, Korhonen, Minna, Haidee Kotze and Jukka Tyrkkö (eds.), pp. 336–372 | Chapter

We conduct an analysis of the link between colloquialisation and democratisation in debates in the British parliament. Our corpus is a sampler of the Hansard archive, covering the period 1803–2005 and containing 170 million words. We first investigate how the linguistic patterns of parliamentary…

Zehentner, Eva, Marianne Hundt, Gerold Schneider and Melanie Röthlisberger 2023 Differences in syntactic annotation affect retrieval: Verb-attached PPs in the history of English International Journal of Corpus Linguistics 28:3, pp. 378–406 | Article

Prepositional phrases (PPs) play an important part in English argument structure constructions, but pose considerable challenges for linguistic investigations of any kind. In addition to the fact that PP-attachment is generally notoriously difficult to model computationally, a particularly…

Schneider, Gerold 2022 Chapter 3. Medical topics and style from 1500 to 2018: A corpus-driven exploration Corpus Pragmatic Studies on the History of Medical Discourse, Hiltunen, Turo and Irma Taavitsainen (eds.), pp. 49–78 | Chapter

This chapter investigates changes in medical topics, style and language across 500 years, from 1500 to 2018. To do so, we employ data-driven methods of Computational Linguistics and Digital Humanities: document classification, topic modelling, and automatically constructed conceptual maps. We…

Schneider, Gerold 2022 Chapter 7. Syntactic changes in verbal clauses and noun phrases from 1500 onwards English Historical Linguistics: Change in structure and meaning, Los, Bettelou, Claire Cowie, Patrick Honeybone and Graeme Trousdale (eds.), pp. 163–200 | Chapter

Can the promise of data-driven methods hold in historical linguistics? Can they detect salient syntactic changes and open new research avenues? I first use data-driven measures to detect patterns in the ARCHER corpus. Secondly, I qualitatively interpret the differences and build hypotheses.…

Schneider, Gerold 2022 Recent changes in spoken British English in verbal and nominal constructions Broadening the Spectrum of Corpus Linguistics: New approaches to variability and change, Flach, Susanne and Martin Hilpert (eds.), pp. 173–195 | Chapter

Starting from a data-driven approach, the current paper compares the BNC1994 spoken to the BNC2014. We first narrow down possible research questions due to differences in the compilation and transcription of the two BNC generations. Then we investigate three robustly detectable changes at the…

Schneider, Gerold 2020 Spelling normalisation of Late Modern English: Comparison and combination of VARD and character-based statistical machine translation Late Modern English: Novel encounters, Kytö, Merja and Erik Smitterberg (eds.), pp. 243–268 | Chapter

To be able to profit from natural language processing (NLP) tools for analysing historical text, an important step is spelling normalisation. We first compare and second combine two different approaches: on the one hand VARD, a rule-based system which is based on dictionary lookup and rules with…

Schneider, Gerold 2020 Changes in society and language: Charting poverty Corpora and the Changing Society: Studies in the evolution of English, Rautionaho, Paula, Arja Nurmi and Juhani Klemola (eds.), pp. 29–56 | Chapter

This study addresses how societal and linguistic changes can be detected using historical corpora, with the topics of poverty and industrial revolution as a case study, based on large historical corpora, in particular EEBO and CLMET3.0. The results, based on a rich array of state-of-the-art…

Taavitsainen, Irma, Gerold Schneider and Peter Murray Jones 2019 Chapter 3. Topics of eighteenth-century medical writing with triangulation of methods: LMEMT and the underlying reality Late Modern English Medical Texts: Writing medicine in the eighteenth century, Taavitsainen, Irma and Turo Hiltunen (eds.), pp. 31–74 | Chapter

This chapter deals with the most important developments within society and the medical discourse community in eighteenth-century Britain. It applies several methods by way of triangulation to probe into relevant aspects of the history of medicine and medical writing between 1700 and 1800. The…

Schneider, Gerold and Gaëtanelle Gilquin 2018 Detecting innovations in a parsed corpus of learner English Rethinking Linguistic Creativity in Non-native Englishes, Deshors, Sandra C., Sandra Götz and Samantha Laporte (eds.), pp. 47–74 | Article

In research on L2 English, recent corpus-based studies indicate that some nonstandard forms are shared by indigenized (ESL) and foreign (EFL) varieties of English, which challenges the idea of a clear dichotomy between innovation and error. We present a data-driven large-scale method to detect…

Schneider, Gerold and Gintare Grigonyte 2018 Chapter 2. From lexical bundles to surprisal and language models: Measuring the idiom principle in native and learner language Applications of Pattern-driven Methods in Corpus Linguistics, Kopaczyk, Joanna and Jukka Tyrkkö (eds.), pp. 15–56 | Chapter

We exploit the information theoretic measure of surprisal to analyze the formulaicity of lexical sequences. We first show the prevalence of individual lexical bundles, then we argue that abstracting to surprisal as an information-theoretic measure of lexical bundleness, formulaicity and…

Chevalier, Sarah, Anne-Christine Gardner, Alpo Honkapohja, Marianne Hundt, Gerold Schneider and Olga Timofeeva 2016 Introduction New Approaches to English Linguistics: Building bridges, Timofeeva, Olga, Anne-Christine Gardner, Alpo Honkapohja and Sarah Chevalier (eds.), pp. 1–12 | Chapter

Schneider, Gerold and Gaëtanelle Gilquin 2016 Detecting innovations in a parsed corpus of learner English Linguistic Innovations: Rethinking linguistic creativity in non-native Englishes, Deshors, Sandra C., Sandra Götz and Samantha Laporte (eds.), pp. 177–204 | Article

In research on L2 English, recent corpus-based studies indicate that some non-standard forms are shared by indigenized (ESL) and foreign (EFL) varieties of English, which challenges the idea of a clear dichotomy between innovation and error. We present a data-driven large-scale method to detect…

Schneider, Gerold and Gintare Grigonyte 2016 Statistical sequence and parsing models for descriptive linguistics and psycholinguistics New Approaches to English Linguistics: Building bridges, Timofeeva, Olga, Anne-Christine Gardner, Alpo Honkapohja and Sarah Chevalier (eds.), pp. 281–320 | Chapter

This study shows that using computational linguistic models is beneficial for descriptive linguistics and psycholinguistics. It applies two models to various English genres and learner language: 1) surprisal and 2) a syntactic parser, allowing us to investigate the role of ambiguity and the…

Ronan, Patricia and Gerold Schneider 2015 Determining light verb constructions in contemporary British and Irish English International Journal of Corpus Linguistics 20:3, pp. 326–354 | Article

This study implements an automated parser-based approach to the investigation of light verb constructions. The database consisting of ICE-GB and ICE-IRE is used to obtain qualitative and quantitative results on the use of light verb structures. The study explains and evaluates the steps employed to…

Schneider, Gerold 2015 Review of Díaz-Negrillo, Ballier & Thompson (2013): Automatic Treatment and Analysis of Learner Corpus Data International Journal of Learner Corpus Research 1:1, pp. 172–177 | Review

Schneider, Gerold and Marianne Hundt 2012 “Off with their heads”: Profiling TAM in ICE corpora Mapping Unity and Diversity World-Wide: Corpus-Based Studies of New Englishes, Hundt, Marianne and Ulrike Gut (eds.), pp. 1–34 | Article

The main aim of our chapter is a methodological one, that of comparing a largely data-driven approach to regional variation in world Englishes and a corpus-based approach. As a case study, we examine tense, aspect and modality (TAM) differences between five varieties. Our investigation uses…

Jucker, Andreas H., Gerold Schneider, Irma Taavitsainen and Barb Breustedt 2008 Fishing for compliments: Precision and recall in corpus-linguistic compliment research Speech Acts in the History of English, Jucker, Andreas H. and Irma Taavitsainen (eds.), pp. 273–294 | Article

Weeds, Julie, James Dowdall, Gerold Schneider, Bill Keller and David J. Weir 2007 Using distributional similarity to organise biomedical terminology Application-Driven Terminology Engineering, Ibekwe-SanJuan, Fidelia, Anne Condamines and Teresa Cabré (eds.), pp. 97–126 | Article

Weeds, Julie, James Dowdall, Gerold Schneider, Bill Keller and David J. Weir 2005 Using distributional similarity to organise biomedical terminology Application-Driven Terminology Engineering, Ibekwe-SanJuan, Fidelia, Anne Condamines and Teresa Cabré (eds.), pp. 107–141 | Article