Chapter 4. Validating lexical measures using human scores of lexical proficiency

Crossley, Scott A.; Salsbury, Tom; McNamara, Danielle S.

doi:10.1075/sibil.47.06ch4

Part of

Vocabulary Knowledge: Human ratings and automated measures
Edited by Scott Jarvis and Michael Daller
[Studies in Bilingualism 47] 2013
pp. 105–134

Scott A. Crossley | Georgia State University

Tom Salsbury | Washington State University

Danielle S. McNamara | Arizona State University

This study examines the convergent validity of a wide range of computational indices reported by Coh-Metrix that past studies have associated with lexical features such as basic category words, semantic co-referentiality, word frequency, and lexical diversity. Human judgments of these lexical features in free-writing samples serve as operationalizations of the lexical constructs the indices are meant to measure. Statistical analyses were conducted to examine the convergent validity of each index and to assess how well the indices that correlate most strongly with the human judgments explain holistic scores of lexical proficiency in L1 and L2 speakers. Correlations between the automated lexical indices and the operationalized constructs showed small to large effect sizes, providing a degree of convergent validity for most of the automated indices examined. A multiple regression predicting holistic judgments of lexical proficiency from these automated lexical indices explained 40% of the variance in a training set and 37% of the variance in a test set. The findings provide a degree of confidence that the indices measure the constructs they were predicted to measure.
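The two-step analysis the abstract describes (correlating each index with human judgments, then fitting a multiple regression and checking it on held-out data) can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the data are synthetic, the two predictors stand in for unnamed automated indices, the 100/50 train/test split is arbitrary, and the effect-size labels assume Cohen's conventional thresholds (.10, .30, .50), which the abstract does not name.

```python
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def effect_size_label(r):
    """Label |r| using Cohen's conventional cut-offs (an assumption here)."""
    a = abs(r)
    if a >= 0.5:
        return "large"
    if a >= 0.3:
        return "medium"
    if a >= 0.1:
        return "small"
    return "negligible"

def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(beta, rows):
    """Predicted scores for each row of predictor values."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]

def r_squared(y, y_hat):
    """Proportion of variance in y explained by the predictions y_hat."""
    my = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - my) ** 2 for a in y)
    return 1 - ss_res / ss_tot

# SYNTHETIC data for illustration only: two hypothetical index values per
# text and a simulated human score; none of this comes from the study.
random.seed(0)
rows = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(150)]
scores = [0.6 * a + 0.4 * b + random.gauss(0, 0.8) for a, b in rows]

# Step 1: correlate each index with the human scores.
for j in range(2):
    r = pearson([row[j] for row in rows], scores)
    print(f"index {j}: r = {r:.2f} ({effect_size_label(r)})")

# Step 2: fit on a training set, then evaluate on a held-out test set.
split = 100  # arbitrary split, not taken from the chapter
beta = fit_ols(rows[:split], scores[:split])
r2_train = r_squared(scores[:split], predict(beta, rows[:split]))
r2_test = r_squared(scores[split:], predict(beta, rows[split:]))
print(f"R^2 train = {r2_train:.2f}, R^2 test = {r2_test:.2f}")
```

Reporting R² on both sets, as the abstract does, shows how much of the fit carries over to unseen texts: a test value close to the training value suggests the model is not merely overfitting the training samples.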

Published online: 14 August 2013

Cited by 5 other publications

Crossley, Scott, Yu Tian, Perpetual Baffour, Alex Franklin, Youngmeen Kim, Wesley Morris, Meg Benner, Aigner Picou & Ulrich Boser. 2023. The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus. International Journal of Learner Corpus Research 9:2, pp. 248 ff.

Hržica, Gordana & Maja Roch. 2021. Lexical diversity in bilingual speakers of Croatian and Italian. In Language Impairment in Multilingual Settings [Trends in Language Acquisition Research, 29], pp. 100 ff.

Gharibi, Khadijeh & Frank Boers. 2019. Influential factors in lexical richness of young heritage speakers’ family language: Iranians in New Zealand. International Journal of Bilingualism 23:2, pp. 381 ff.

Sosa, Ricardo. 2019. Metrics to select design tasks in experimental creativity research. Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science 233:2, pp. 440 ff.

Treffers-Daller, Jeanine, Patrick Parslow & Shirley Williams. 2016. Back to Basics: How Measures of Lexical Diversity Can Help Discriminate between CEFR Levels. Applied Linguistics, pp. amw009 ff.

This list is based on CrossRef data as of 25 June 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.