Publication details [#65667]

Schmid, Hans-Jörg, Daphné Kerremans, Jelena Prokić and Quirin Würschinger. 2018. Using data-mining to identify and study patterns in lexical innovation on the web. The NeoCrawler. Pragmatics & Cognition 25 (1).
Publication type
Article in journal
Publication language
Place, Publisher
John Benjamins
Journal DOI


This article proposes the NeoCrawler – a tailor-made webcrawler, which identifies and retrieves neologisms from the Internet and systematically monitors the use of detected neologisms on the web by means of weekly searches. It enables researchers to use the web as a corpus in order to investigate the dynamics of lexical innovation on a large-scale and systematic basis. The NeoCrawler represents an innovative web-mining tool which opens up new opportunities for linguists to tackle a number of unresolved and under-researched issues in the field of lexical innovation. This article presents the design as well as the most important characteristics of two modules, the Discoverer and the Observer, with regard to the usage-based study of lexical innovation and diffusion.

Keywords: webcrawler, neologisms, innovation identification, string matching, data-mining, lexical innovation