Word frequency counts
Linking corpus data to user’s perception in linguistic research
Lexical frequency is one of the major variables involved in language processing. It constitutes a cornerstone of
psycholinguistic, corpus linguistic and applied research. Linguists draw frequency counts from corpora and have come to
take them for granted. However, some researchers have argued that corpora may not always provide a comprehensive picture of how
frequently lexical items appear in a language. In the present contribution I compare corpus frequency counts for English and Polish
words with native speakers’ perception of frequency. The analysis shows that, while objective and subjective values are generally
related, there is a disparity between the two measures for frequent Polish words. The relationship, though positive, is also not
as strong as in previous studies. I suggest combining objective and subjective frequency measures in research.
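As a rough illustration of the kind of comparison described above, the sketch below computes a Spearman rank correlation between corpus frequencies and mean subjective frequency ratings. The words, counts and ratings are invented placeholders, not data from the study, and the choice of Spearman's rho (as well as the helper names `average_ranks`, `pearson` and `spearman`) is an assumption made for illustration rather than the analysis reported in the article.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: occurrences per million words in a corpus (objective)
# and mean ratings on a 1-7 scale from native speakers (subjective).
corpus_per_million = {"time": 1800.0, "house": 600.0, "window": 120.0,
                      "ladder": 15.0, "anvil": 1.2}
mean_rating = {"time": 6.8, "house": 6.5, "window": 5.9,
               "ladder": 3.1, "anvil": 3.4}

words = sorted(corpus_per_million)
objective = [corpus_per_million[w] for w in words]
subjective = [mean_rating[w] for w in words]

rho = spearman(objective, subjective)
print(f"Spearman rho = {rho:.2f}")
```

A rank-based coefficient is used here because corpus frequencies are heavily skewed and ratings are ordinal; a log transform of the counts followed by a Pearson correlation would be another common option.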
Article outline
- Introduction
- 1. Experiment – methods
- 1.1 Aims and hypotheses
- 1.2 Participants
- 1.3 Materials
- 1.4 Procedure
- 1.5 Data analysis
- 1.6 Results
- 2. Discussion and conclusions
- Acknowledgements
- References