Frequency of use and basic vocabulary
We use corpora from 18 languages to study the frequency of basic words such as mother, sun, and red. We compare three lists, Swadesh-200, Swadesh-100, and the Leipzig-Jakarta list (Tadmor 2009), and find that they have a high average inter-correlation. Using the WOLD semantic categories and fields (Haspelmath and Tadmor 2009), we find regularities in the types of word meaning that are most likely to deviate from the overall correlations, i.e. words whose frequency of use varies significantly, such as those encoded by function words and basic actions (do/make), spatial relations (left, right), cognition words (to know, when), or possession (to take). Our results indicate a core collection of basic meanings that are universally used with similar regularity, despite other linguistic pressures impinging on these frequencies.
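As a rough illustration of what an average inter-correlation of word frequencies could look like in practice, the sketch below computes the mean pairwise Spearman rank correlation across languages for a handful of basic meanings. The choice of Spearman's coefficient, the language labels, the frequency values, and all function names are assumptions made for illustration; the abstract does not state which statistic or data layout the study used.

```python
from itertools import combinations
from math import sqrt

# Hypothetical per-million frequencies for five basic meanings.
# Toy numbers for illustration only, not data from the study.
FREQ = {
    "lang_a": {"mother": 310.0, "sun": 95.0, "red": 120.0, "know": 1500.0, "take": 900.0},
    "lang_b": {"mother": 280.0, "sun": 110.0, "red": 90.0, "know": 1700.0, "take": 650.0},
    "lang_c": {"mother": 400.0, "sun": 70.0, "red": 150.0, "know": 1200.0, "take": 1300.0},
}


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_pairwise_correlation(freq):
    """Average Spearman correlation over all pairs of languages."""
    # Only meanings attested in every language are compared.
    meanings = sorted(set.intersection(*(set(f) for f in freq.values())))
    rhos = []
    for a, b in combinations(sorted(freq), 2):
        xa = [freq[a][m] for m in meanings]
        xb = [freq[b][m] for m in meanings]
        rhos.append(spearman(xa, xb))
    return sum(rhos) / len(rhos)


if __name__ == "__main__":
    print(round(mean_pairwise_correlation(FREQ), 3))
```

A rank-based coefficient is used here because raw corpus frequencies are heavily skewed; a meaning whose rank differs sharply from one language to another would lower the pairwise values and so stand out as a deviation from the overall pattern.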
References (44)
Altenberg, Bengt. 1998. On the phraseology of spoken English: The evidence of recurrent word combinations. In A. P. Cowie, ed., Phraseology: Theory, Analysis and Applications, 101–122. Oxford: Clarendon Press.
Bell, Allan. 1984. Language style as audience design. Language in Society 13: 145–204.
Boroditsky, Lera. 2003. Linguistic Relativity. In L. Nadel, ed., Encyclopedia of Cognitive Science, 917–921. London: MacMillan Press.
Bowerman, Melissa. 1996. The origins of children’s spatial semantic categories: Cognitive versus linguistic determinants. In J. Gumperz & S. Levinson, eds., Rethinking Linguistic Relativity, 145–176. Cambridge: Cambridge University Press.
Brown, Penelope & Stephen Levinson. 1987. Politeness: Some Universals in Language Usage. Cambridge: Cambridge University Press.
Brown, Penelope & Stephen Levinson. 2009. Language as mind tools: Learning how to think through speaking. In J. Guo, E. V. Lieven, N. Budwig, S. Ervin-Tripp, K. Nakamura, & S. Ozcaliskan, eds., Crosslinguistic Approaches to the Psychology of Language: Research in the Traditions of Dan Slobin, 451–464. New York: Psychology Press.
Bybee, Joan. 2007. Frequency of Use and the Organisation of Language. Oxford: Oxford University Press.
Bybee, Joan & Sandra Thompson. 2000. Three frequency effects in syntax. Berkeley Linguistics Society 23: 65–85.
Calude, Andreea & Mark Pagel. 2011. How do we use language? Shared patterns in frequency of word-use across seventeen World languages. Philosophical Transactions of the Royal Society B 366: 1101–1107.
Campbell, Lyle. 1999. Historical Linguistics: An Introduction. Cambridge, MA: MIT Press.
Clark, Herbert. 1996. Using Language. Cambridge: Cambridge University Press.
Croft, William. 2000. Explaining Language Change: An Evolutionary Approach. London: Longman.
Croft, William & Alan Cruse. 2004. Cognitive Linguistics. Cambridge: Cambridge University Press.
Ellis, Nick. 2002. Frequency effects in language processing: A review with implications for theories of implicit and explicit language acquisition. Studies in Second Language Acquisition 24: 143–188.
Embleton, Sheila. 1986. Statistics in Historical Linguistics. Bochum: Brockmeyer.
Giles, Howard & Nick Coupland. 1991. Language: Contexts and Consequences. Pacific Grove: Brooks/Cole Publishing Company.
Goddard, C. & A. Wierzbicka, eds. 2002. Meaning and Universal Grammar: Theory and Empirical Findings (2 volumes). Amsterdam & Philadelphia: Benjamins.
Gumperz, John & Stephen Levinson, eds. 1996. Rethinking Linguistic Relativity [Studies in the Social and Cultural Foundations of Language 17]. Cambridge: Cambridge University Press.
Haspelmath, Martin & Uri Tadmor, eds. 2009a. World Loanword Database. Munich: Max Planck Digital Library.
Haspelmath, Martin & Uri Tadmor. 2009b. The Loanword Typology project and the World Loanword Database. In M. Haspelmath & U. Tadmor, eds., Loanwords in the World’s Languages: A Comparative Handbook, 1–34. Berlin: Mouton de Gruyter.
Hopper, Paul & Elizabeth Traugott. 1993. Grammaticalization. Cambridge: Cambridge University Press.
Kemmer, Suzanne & Michael Israel. 1994. Variation and the usage-based model. In K. Beals, R. Denton, R. Knippen, L. Melnar, H. Suzuki, & E. Zeinfeld, eds., Papers from the Thirtieth Regional Meeting of the Chicago Linguistic Society: Vol. 2. Para-session on Variation and Linguistic Theory, 165–179. Chicago: Chicago Linguistic Society.
Labov, William. 1972. Sociolinguistic Patterns. Pennsylvania: University of Pennsylvania Press.
Labov, William. 2001. Principles of Linguistic Change. Volume II: Social Factors. Oxford: Blackwell.
Langacker, Ronald. 1987. Foundations of Cognitive Grammar. Volume 1: Theoretical Prerequisites. Stanford: Stanford University Press.
Lehmann, Christian. 1995[1982]. Thoughts on Grammaticalization. Munich: Lincom Europa.
Lewis, Paul, ed. 2009. Ethnologue: Languages of the World. Sixteenth edition. Dallas, Tex.
Norde, M., K. Beijering, & A. Lenz. 2012. Current trends in grammaticalization research. Language Sciences.
Nurse, Derek & Thomas Spear. 1985. The Swahili: Reconstructing the History and Language of an African Society, 800–1500. Philadelphia: University of Pennsylvania Press.
Pagel, Mark. 2008. The rise of the machine. Nature 452: 699.
Pagel, Mark. 2009. Human language as a culturally transmitted replicator. Nature Reviews Genetics 10: 405–415.
Pagel, Mark, Quentin Atkinson, & Andrew Meade. 2007. Frequency of word-use predicts rates of lexical evolution throughout Indo-European history. Nature 449: 717–720.
Pawley, Andrew & Frances Syder. 1983. Two puzzles for linguistic theory: Native like selection and native like fluency. In J. S. Richards & R. W. Schmidt, eds., Language and Communication, 191–225. London: Longman.
R Development Core Team. 2004. R: A language and environment for statistical computing. On-line at [URL].
Thomason, Sarah & Terrence Kaufman. 1988. Language Contact, Creolization, and Genetic Linguistics. Berkeley: University of California Press.
Traugott, Elizabeth. 1988. Pragmatic strengthening and grammaticalization. In S. Axmaker, A. Jaisser, & H. Singmaster, eds., Proceedings of the Fourteenth Annual Meeting of the Berkeley Linguistics Society, 406–416. Berkeley: Berkeley Linguistics Society.
Shapiro, Bernard. 1969. The subjective estimate of relative word frequency. Journal of Verbal Learning and Verbal Behavior 8: 248–251.
Swadesh, Morris. 1955. Towards greater accuracy in lexicostatistic dating. International Journal of American Linguistics 21: 121–137.
Wray, Alison & George Grace. 2007. The consequences of talking to strangers: Evolutionary corollaries of socio-cultural influences on linguistic form. Lingua 117: 543–578.
Zipf, George. 1935. The Psycho-biology of Language. Cambridge, MA: MIT Press.
Tadmor, Uri. 2009. Loanwords in the world’s languages: Findings and results. In M. Haspelmath & U. Tadmor, eds., Loanwords in the World’s Languages: A Comparative Handbook, 55–75. Berlin: Mouton de Gruyter.
Cited by (3)
Cited by three other publications
De Deyne, Simon, Marc Brysbaert & Irina Elgort
Bhattacharya, Tanmoy, Nancy Retzlaff, Damián E Blasi, William Croft, Michael Cysouw, Daniel Hruschka, Ian Maddieson, Lydia Müller, Eric Smith, Peter F Stadler, George Starostin & Hyejin Youn. 2018. Studying language evolution in the age of big data. Journal of Language Evolution 3:2, pp. 94 ff.
Tamburelli, Marco & Lissander Brasca. 2018. Revisiting the classification of Gallo-Italic: a dialectometric approach. Digital Scholarship in the Humanities 33:2, pp. 442 ff.
This list is based on CrossRef data as of 24 July 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.