Frequency of use and basic vocabulary
We use corpora from 18 languages to study the frequency of use of basic words such as mother, sun, and red. We compare three lists, Swadesh-200, Swadesh-100, and the Leipzig-Jakarta list (Tadmor 2009), and find that they have a high average inter-correlation. Using the WOLD semantic categories and fields (Haspelmath and Tadmor 2009), we identify regularities in the meaning types that are most likely to deviate from the overall correlations, that is, meanings whose frequency of use varies significantly. These include meanings encoded by function words and basic actions (do/make), spatial relations (left, right), cognition words (to know, when), and possession (to take). Our results indicate a core collection of basic meanings that are universally used with similar regularity, despite other linguistic pressures impinging on these frequencies.
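To make the notion of an average inter-correlation concrete, here is a minimal sketch. It rests on assumptions not stated in the abstract: the correlation measure is taken to be Spearman's rank correlation, the language labels (`lang_a`, etc.) are hypothetical, and the frequency values are invented purely for illustration; they are not data from the study.

```python
from itertools import combinations
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_pairwise_correlation(freqs):
    """Average Spearman correlation over all unordered language pairs."""
    pairs = combinations(sorted(freqs), 2)
    return mean(spearman(freqs[a], freqs[b]) for a, b in pairs)


# Hypothetical per-million frequencies for the same four meanings
# (mother, sun, red, know) in three invented languages.
toy_frequencies = {
    "lang_a": [120.0, 80.0, 40.0, 300.0],
    "lang_b": [100.0, 90.0, 30.0, 250.0],
    "lang_c": [60.0, 95.0, 20.0, 400.0],
}

print(round(mean_pairwise_correlation(toy_frequencies), 3))
```

A rank-based measure is used here because word frequencies are heavily skewed; the measure actually used in the study may differ.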
References (44)
Altenberg, Bengt. 1998. On the phraseology of spoken English: The evidence of recurrent word combinations. In A. P. Cowie, ed., Phraseology: Theory, Analysis and Applications, 101–122. Oxford: Clarendon Press.
Bell, Allan. 1984. Language style as audience design. Language in Society 13: 145–204.
Boroditsky, Lera. 2003. Linguistic Relativity. In L. Nadel, ed., Encyclopedia of Cognitive Science, 917–921. London: MacMillan Press.
Bowerman, Melissa. 1996. The origins of children’s spatial semantic categories: Cognitive versus linguistic determinants. In J. Gumperz & S. Levinson, eds., Rethinking Linguistic Relativity, 145–176. Cambridge: Cambridge University Press.
Brown, Penelope & Stephen Levinson. 1987. Politeness: Some Universals in Language Usage. Cambridge: Cambridge University Press.
Brown, Penelope & Stephen Levinson. 2009. Language as mind tools: Learning how to think through speaking. In J. Guo, E. V. Lieven, N. Budwig, S. Ervin-Tripp, K. Nakamura, & S. Ozcaliskan, eds., Crosslinguistic Approaches to the Psychology of Language: Research in the Traditions of Dan Slobin, 451–464. New York: Psychology Press.
Bybee, Joan. 2007. Frequency of Use and the Organisation of Language. Oxford: Oxford University Press.
Bybee, Joan & Sandra Thompson. 2000. Three frequency effects in syntax. Berkeley Linguistics Society 23: 65–85.
Calude, Andreea & Mark Pagel. 2011. How do we use language? Shared patterns in frequency of word-use across seventeen World languages. Philosophical Transactions of the Royal Society B 366: 1101–1107.
Campbell, Lyle. 1999. Historical Linguistics: An Introduction. Cambridge, MA: MIT Press.
Clark, Herbert. 1996. Using Language. Cambridge: Cambridge University Press.
Croft, William. 2000. Explaining Language Change: An Evolutionary Approach. London: Longman.
Croft, William & Alan Cruse. 2004. Cognitive Linguistics. Cambridge: Cambridge University Press.
Ellis, Nick. 2002. Frequency effects in language processing: A review with implications for theories of implicit and explicit language acquisition. Studies in Second Language Acquisition 24: 143–188.
Embleton, Sheila. 1986. Statistics in Historical Linguistics. Bochum: Brockmeyer.
Giles, Howard & Nick Coupland. 1991. Language: Contexts and Consequences. Pacific Grove: Brooks/Cole Publishing Company.
Goddard, C. & A. Wierzbicka, eds. 2002. Meaning and Universal Grammar: Theory and Empirical Findings (2 volumes). Amsterdam & Philadelphia: Benjamins.
Gumperz, John & Stephen Levinson, eds. 1996. Rethinking Linguistic Relativity [Studies in the Social and Cultural Foundations of Language 17]. Cambridge: Cambridge University Press.
Haspelmath, Martin & Uri Tadmor, eds. 2009a. World Loanword Database. Munich: Max Planck Digital Library.
Haspelmath, Martin & Uri Tadmor. 2009b. The Loanword Typology project and the World Loanword Database. In M. Haspelmath & U. Tadmor, eds., Loanwords in the World’s Languages: A Comparative Handbook, 1–34. Berlin: Mouton de Gruyter.
Hopper, Paul & Elizabeth Traugott. 1993. Grammaticalization. Cambridge: Cambridge University Press.
Kemmer, Suzanne & Michael Israel. 1994. Variation and the usage-based model. In K. Beals, R. Denton, R. Knippen, L. Melnar, H. Suzuki, & E. Zeinfeld, eds., Papers from the Thirtieth Regional Meeting of the Chicago Linguistics Society: Vol. 2. Para-session on Variation and Linguistic Theory, 165–179. Chicago: Chicago Linguistic Society.
Labov, William. 1972. Sociolinguistic Patterns. Pennsylvania: University of Pennsylvania Press.
Labov, William. 2001. Principles of Linguistic Change. Volume II: Social Factors. Oxford: Blackwell.
Langacker, Ronald. 1987. Foundations of Cognitive Grammar. Volume 1: Theoretical Prerequisites. Stanford: Stanford University Press.
Lehmann, Christian. 1995 [1982]. Thoughts on Grammaticalization. Munich: Lincom Europa.
Lewis, Paul, ed. 2009. Ethnologue: Languages of the World. Sixteenth edition. Dallas, Tex.
Norde, M., K. Beijering, & A. Lenz. 2012. Current trends in grammaticalization research. Language Sciences.
Nurse, Derek & Thomas Spear. 1985. The Swahili: Reconstructing the History and Language of an African Society, 800–1500. Philadelphia: University of Pennsylvania Press.
Pagel, Mark. 2008. The rise of the machine. Nature 452: 699.
Pagel, Mark. 2009. Human language as a culturally transmitted replicator. Nature Reviews Genetics 10: 405–415.
Pagel, Mark, Quentin Atkinson, & Andrew Meade. 2007. Frequency of word-use predicts rates of lexical evolution throughout Indo-European history. Nature 449: 717–720.
Pawley, Andrew & Frances Syder. 1983. Two puzzles for linguistic theory: Native like selection and native like fluency. In J. S. Richards & R. W. Schmidt, eds., Language and Communication, 191–225. London: Longman.
R Development Core Team. 2004. R: A language and environment for statistical computing. On-line at [URL].
Thomason, Sarah & Terrence Kaufman. 1988. Language Contact, Creolization, and Genetic Linguistics. Berkeley: University of California Press.
Traugott, Elizabeth. 1988. Pragmatic strengthening and grammaticalization. In S. Axmaker, A. Jaisser, & H. Singmaster, eds., Proceedings of the Fourteenth Annual Meeting of the Berkeley Linguistic Society, 406–416. Berkeley: Berkeley Linguistic Society.
Shapiro, Bernard. 1969. The subjective estimate of relative word frequency. Journal of Verbal Learning and Verbal Behavior 8: 248–251.
Swadesh, Morris. 1955. Towards greater accuracy in lexicostatistic dating. International Journal of American Linguistics 21: 121–137.
Wray, Alison & George Grace. 2007. The consequences of talking to strangers: Evolutionary corollaries of socio-cultural influences on linguistic form. Lingua 117: 543–578.
Zipf, George. 1935. The Psycho-biology of Language. Cambridge, MA: MIT Press.
Tadmor, Uri. 2009. Loanwords in the world’s languages: Findings and results. In M. Haspelmath & U. Tadmor, eds., Loanwords in the World’s Languages: A Comparative Handbook, 55–75. Berlin: Mouton de Gruyter.
Cited by (3)
De Deyne, Simon, Marc Brysbaert & Irina Elgort
Bhattacharya, Tanmoy, Nancy Retzlaff, Damián E Blasi, William Croft, Michael Cysouw, Daniel Hruschka, Ian Maddieson, Lydia Müller, Eric Smith, Peter F Stadler, George Starostin & Hyejin Youn. 2018. Studying language evolution in the age of big data. Journal of Language Evolution 3:2, pp. 94 ff.
Tamburelli, Marco & Lissander Brasca. 2018. Revisiting the classification of Gallo-Italic: a dialectometric approach. Digital Scholarship in the Humanities 33:2, pp. 442 ff.
This list is based on CrossRef data as of 17 November 2024 and may not be complete. Sources presented here have been supplied by the respective publishers; any errors therein should be reported to them.