How much vocabulary is needed to use a concordance?
Vocabulary load is a predictor of comprehension and a common concern in relation to learner use of concordances; however, vocabulary load figures for whole texts have limited relevance to learner use of concordances. This paper explores the average vocabulary load of the citations (or lines) in a concordance, reflecting how learners use concordances as reading or reference resources. Non-parametric tests are used to compare the vocabulary loads of citations from three authentic written corpora and a corpus of graded readers. The results indicate that citations from authentic corpora have an average vocabulary load of 4,000–5,000 word families, there are reliable differences in vocabulary load between citations from different corpora, and the magnitude of difference between citations from authentic corpora can be equivalent to the magnitude of difference between authentic corpora and graded reader corpora. The paper concludes with a discussion of the results in relation to language learner use of concordances.
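As a rough illustration of what a per-citation vocabulary load can look like, the sketch below scores each concordance line by the lowest 1,000-word-family band needed to reach a target coverage of its tokens, then averages across lines. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the band list `TOY_BANDS`, the 95% coverage target, the whitespace tokenisation and the function names are all hypothetical.

```python
from statistics import mean

# Hypothetical toy band list: word -> 1,000-word-family frequency band
# (1 = most frequent thousand). Entries are invented for illustration only.
TOY_BANDS = {
    "the": 1, "of": 1, "was": 1, "in": 1, "a": 1, "report": 1,
    "committee": 2, "evidence": 2,
    "ambiguous": 4,
}


def vocabulary_load(citation, bands, coverage_pct=95):
    """Return the lowest band at which known tokens cover `coverage_pct`
    percent of the citation, or None if off-list words make that impossible."""
    tokens = citation.lower().split()  # naive whitespace tokenisation (assumption)
    if not tokens:
        return None
    # Ceiling division in integers avoids floating-point rounding surprises.
    needed = -(-coverage_pct * len(tokens) // 100)
    levels = sorted(bands[t] for t in tokens if t in bands)
    if len(levels) < needed:
        return None
    return levels[needed - 1]


def average_load(citations, bands, coverage_pct=95):
    """Mean vocabulary load over the citations whose load is defined."""
    loads = [vocabulary_load(c, bands, coverage_pct) for c in citations]
    loads = [x for x in loads if x is not None]
    return mean(loads) if loads else None


lines = [
    "the report of the committee was ambiguous in the evidence",
    "the evidence was in a report",
]
print([vocabulary_load(c, TOY_BANDS) for c in lines])
print(average_load(lines, TOY_BANDS))
```

In this sketch a load of 4 corresponds to the 4,000 word-family level, so per-line loads can be compared across corpora on the same scale as the figures reported in the abstract.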
Article outline
- 1. Introduction
- 2. The vocabulary demands of learner use of concordances
- 2.1 Vocabulary and reading
- 2.2 Concordances, reading and vocabulary load
- 3. Data and method
- 3.1 Corpora
- 3.2 Expanded word lists
- 3.3 Procedure
- 3.3.1 Power analysis
- 3.3.2 Developing the sampling frame
- 3.3.3 Scoring
- Vocabulary coverage level
- Mean word frequency
- 3.3.4 Extraction software
- 4. Results
- 4.1 Study one: Three authentic corpora
- 4.2 Study two: Replication
- 4.3 Study three: Authentic corpora compared with a graded corpus
- 5. Pedagogical implications and discussion
- 6. Conclusions
- Acknowledgements
- Note
- References