How much vocabulary is needed to use a concordance?
Vocabulary load is a predictor of comprehension and a common concern when learners use concordances; however, vocabulary load figures calculated for whole texts have limited relevance to such use. This paper explores the average vocabulary load of the citations (or lines) in a concordance, reflecting how learners use concordances as reading or reference resources. Non-parametric tests are used to compare the vocabulary loads of citations from three authentic written corpora and a corpus of graded readers. The results indicate that citations from authentic corpora have an average vocabulary load of 4,000–5,000 word families, that there are reliable differences in vocabulary load between citations from different corpora, and that the difference between citations from two authentic corpora can be as large as the difference between authentic corpora and graded reader corpora. The paper concludes with a discussion of the results in relation to language learner use of concordances.
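As a rough illustration of what the vocabulary load of a single citation could mean, the sketch below treats it as the lowest 1,000-word-family frequency band a reader must know for a chosen share of the citation's tokens to be familiar. It is not the paper's procedure: the tiny `TOY_BANDS` lookup, the 95% default coverage threshold, and the function names are all assumptions made for the example.

```python
import re

# Hypothetical toy lookup: word -> 1,000-word-family frequency band
# (1 = most frequent). A real analysis would use full ranked word lists.
TOY_BANDS = {
    "the": 1, "of": 1, "was": 1, "a": 1, "in": 1,
    "result": 1, "study": 1,
    "significant": 2, "analysis": 2,
    "corpus": 6, "concordance": 9,
}


def tokenize(citation):
    """Lowercase the citation and split it into simple alphabetic tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", citation.lower())


def vocabulary_load(citation, bands, coverage_pct=95):
    """Return the smallest band k such that words in bands 1..k cover at
    least coverage_pct percent of the tokens, or None if off-list words
    make that coverage unreachable (or the citation has no tokens)."""
    tokens = tokenize(citation)
    if not tokens:
        return None
    # Number of tokens that must be known (integer ceiling, no floats).
    needed = (coverage_pct * len(tokens) + 99) // 100
    known = sorted(bands[t] for t in tokens if t in bands)
    if len(known) < needed:
        return None
    return known[needed - 1]


def mean_vocabulary_load(citations, bands, coverage_pct=95):
    """Average the load over citations for which a load can be computed."""
    loads = [vocabulary_load(c, bands, coverage_pct) for c in citations]
    loads = [load for load in loads if load is not None]
    return sum(loads) / len(loads) if loads else None


citations = [
    "The result of the corpus analysis was significant",
    "The study was significant",
]
print(mean_vocabulary_load(citations, TOY_BANDS))
```

Under these assumptions a band of 6 corresponds to roughly 6,000 word families. Because a citation is short, a single low-frequency word can set its load, which is one reason figures for whole texts may not carry over to individual lines.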
- 2. The vocabulary demands of learner use of concordances
- 2.1 Vocabulary and reading
- 2.2 Concordances, reading and vocabulary load
- 3. Data and method
- 3.2 Expanded word lists
- 3.3.1 Power analysis
- 3.3.2 Developing the sampling frame
- Vocabulary coverage level
- Mean word frequency
- 3.3.4 Extraction software
- 4.1 Study one: Three authentic corpora
- 4.2 Study two: Replication
- 4.3 Study three: Authentic corpora compared with a graded corpus
- 5. Pedagogical implications and discussion
Published online: 16 April 2020