Making Google Books n-grams useful for a wide range of research on language change
The “standard” Google Books n-grams were released by Google in 2010, and they include more than 155 billion words for American English alone. Unfortunately, the standard interface is far too simplistic to allow many types of useful research on this massive dataset. In this paper, I discuss an alternative “advanced” architecture and interface for these datasets, which is freely available at googlebooks.byu.edu. This resource supports a wide range of research on lexical, phraseological, syntactic, and semantic change in English that the standard interface cannot support. With this new resource, researchers have access to hundreds of billions of words of data and can map out changes in English in ways that were not previously possible.
Keywords: Google Books, historical, syntactic, semantic, lexical
Published online: 01 September 2014