Evaluating reliability in quantitative vocabulary studies
The influence of corpus design and composition
Recent methodological advances have been used to create word lists based on large corpora. The present paper explores whether these corpora, and the word lists derived from them, are unequivocally more representative. Corpus design has usually focused on external representativeness (how well the corpus represents the target discourse domain) while disregarding internal representativeness (whether the corpus permits reliable descriptions of linguistic variation). This disregard may be especially problematic for studies of lexical variation, where stable, reliable results are difficult to achieve from corpus analysis. The paper illustrates these challenges through experiments on a corpus representing a highly restricted discourse domain: university-level introductory psychology textbooks. The results indicate that corpus design and composition have a much greater influence on lexical variation than previously recognized, highlighting the need to evaluate internal representativeness in quantitative corpus-based research.
Keywords: lexical diversity and variability, word lists, reliability and validity, corpus representativeness
Published online: 30 March 2015