Explorations into the social contexts of neologism use in early
English correspondence
This paper describes ongoing work towards a rich analysis of the
social contexts of neologism use in historical corpora, in particular the
Corpora of Early English Correspondence, with research
questions concerning the innovators, meanings and diffusion of neologisms. To
enable this kind of study, we are developing new processes, tools and ways of
combining data from different sources, including the Oxford English
Dictionary, the Historical Thesaurus, and
contemporary published texts. Comparing neologism candidates across these
sources is complicated by the large amount of spelling variation in the data. To keep
the task tractable, we start from case studies of individual suffixes
(-ity, -er) and people (Thomas Twining). By developing
tools to support these studies, we build toward more general analyses. Our aim is to
develop an open-source environment where information on neologism candidates is
gathered from a variety of algorithms and sources, pooled, and presented to a
human evaluator for verification and exploration.
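
As a rough illustration of why spelling variation matters when comparing neologism candidates against a dictionary source, the following minimal sketch separates probable spelling variants of known words from forms that would be left for a human evaluator. It is not the authors' pipeline: the headword list, the function name `find_candidates`, the example tokens (including the made-up form "scoundrelity") and the similarity cutoff are all assumptions, and the generic string similarity from Python's standard `difflib` module merely stands in for a dedicated spelling normalisation tool.

```python
from difflib import get_close_matches

# Hypothetical toy headword list standing in for a dictionary source.
KNOWN_HEADWORDS = ["ability", "curiosity", "writer", "necessity"]


def find_candidates(tokens, headwords=KNOWN_HEADWORDS, cutoff=0.75):
    """Split tokens into probable spelling variants and neologism candidates.

    The cutoff is an assumed similarity threshold, not a tuned value.
    """
    variants = {}    # variant spelling -> closest known headword
    candidates = []  # forms with no close headword, left for manual checking
    for token in tokens:
        word = token.lower()
        if word in headwords:
            continue  # attested spelling, not a candidate
        close = get_close_matches(word, headwords, n=1, cutoff=cutoff)
        if close:
            variants[word] = close[0]
        else:
            candidates.append(word)
    return variants, candidates


# Made-up example tokens for illustration only.
variants, candidates = find_candidates(
    ["abilitie", "curiositie", "scoundrelity", "writer"]
)
print(variants)
print(candidates)
```

In this sketch, a form that resembles a known headword is treated as a spelling variant rather than a new word, so only the remaining forms reach the evaluator. A real system would need a far larger lexicon and a normalisation method suited to historical spelling.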
Article outline
- 1. Introduction
- 2. Corpora of Early English Correspondence (CEEC)
- 3. Two case studies of specific types of neologisms: -ity and -er
  - 3.1 Case study 1: -ity
  - 3.2 Case study 2: -er
- 4. Towards computational discovery of neologisms in general
  - 4.1 Case study 3: Thomas Twining
- 5. Discussion
- Acknowledgements
- References