Recently developed tools that quickly and reliably quantify vocabulary use on a range of measures open up new
possibilities for understanding the construct of vocabulary sophistication. To take this work forward, we need to understand how
these different measures relate to each other and to human readers’ perceptions of texts. This study applied 356 quantitative
measures of vocabulary use generated by an automated vocabulary analysis tool (Kyle & Crossley, 2015) to a large corpus of
assignments written for First-Year Composition courses at a university in the United States. Results suggest that the majority of
measures can be reduced to a much smaller set without substantial loss of information. However, distinctions need to be retained
between measures based on content words and those based on function words, and among different measures of collocational strength. Overall, correlations
with grades are reliable but weak.
(2014) Quantifying the development of phraseological competence in L2 English writing: An automated approach. Journal of Second Language Writing,
(1988) Variation Across Speech and Writing. Cambridge: Cambridge University Press.
(2007) British National Corpus, version 3 (BNC XML ed.). Retrieved from [URL] (Last accessed February 2019).
Brown, G. D. A.
(1984) A frequency count of 190,000 words in the London-Lund corpus of English conversation. Behavior Research Methods, Instruments, & Computers,
Brysbaert, M., & New, B.
(2009) Moving beyond Kucera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behavior Research Methods,
Bulté, B., & Housen, A.
(2014) Conceptualizing and measuring short-term changes in L2 writing complexity. Journal of Second Language Writing,
(1990) CELEX: A Guide for Users. Nijmegen: CELEX – Centre for Lexical Information.
(2000) A new academic wordlist. TESOL Quarterly,
Crossley, S. A., Cai, Z., & McNamara, D.
(2012) Syntagmatic, paradigmatic, and automatic n-gram approaches to assessing essay quality. In G. M. Youngblood & P. M. McCarthy (Eds.), Proceedings of the Twenty-Fifth International Florida Artificial Intelligence Research Society Conference (pp. 214–219). Palo Alto, CA: The AAAI Press.
Crossley, S. A., DeFore, C., Kyle, K., Dai, J., & McNamara, D.
(2013) Paragraph specific n-gram approaches to automatically assessing essay quality. In S. K. D’Mello, R. A. Calvo & A. Olney (Eds.), Proceedings of the 6th International Conference on Educational Data Mining (pp. 216–219). Heidelberg: Springer. Retrieved from [URL] (Last accessed February 2019)
Crossley, S. A., Salsbury, T., McNamara, D., & Jarvis, S.
(2010) Predicting lexical proficiency in language learner texts using computational indices. Language Testing,
Crossley, S. A., Weston, J. L., Sullivan, S. T. M., & McNamara, D.
(2011) The development of writing proficiency as a function of grade level: A linguistic analysis. Written Communication,
Cumming, A., Kantor, R., Baba, K., Erdosy, U., Eouanzoui, K., & James, M.
(2005) Differences in written discourse in independent and integrated prototype tasks for next generation TOEFL. Assessing Writing,
(in press). Corpus research on the development of children’s writing in L1 English. In A. Glaznieks, A. Abel, V. Lyding, & V. Nicolas (Eds.), Corpora and Language in Use: Proceedings of the Learner Corpus Research Conference 2017. Louvain: Presses Universitaires de Louvain.
Durrant, P., & Schmitt, N.
(2009) To what extent do native and non-native writers make use of collocations? International Review of Applied Linguistics,
Garner, J., Crossley, S. A., & Kyle, K.
(2018) Beginning and intermediate L2 writers’ use of N-grams: An association measures study. International Review of Applied Linguistics. Advance online publication.
Golub, L. S., & Frederick, W. C.
(1979) Linguistic Structures in the discourse of fourth and sixth graders. Madison, WI: Center for Cognitive Learning, The University of Wisconsin.
Graesser, A. C., McNamara, D., Louwerse, M. M., & Cai, Z.
(2014) Coh-Metrix: Analysis of text on cohesion and language. Behavior Research Methods, Instruments, & Computers,
Granger, S., & Bestgen, Y.
(2014) The use of collocations by intermediate vs. advanced non-native writers: A bigram-based study. International Review of Applied Linguistics,
(2016) The relationship between lexical sophistication and independent and source-based writing. Journal of Second Language Writing,
Malvern, D., & Richards, B.
(2002) Investigating accommodation in language proficiency interviews using a new measure of lexical diversity. Language Testing,
Malvern, D., Richards, B. J., Chipere, N., & Durán, P.
(2004) Lexical Diversity and Language Development. Basingstoke: Palgrave Macmillan.
Massey, A. J., & Elliott, G. L.
(1996) Aspects of Writing in 16+ English Examinations Between 1980 & 1994. Cambridge: University of Cambridge Local Examinations Syndicate.
Massey, A. J., Elliott, G. L., & Johnson, N. K.
(2005) Variations in Aspects of Writing in 16+ English Examinations Between 1980 and 2004: Vocabulary, Spelling, Punctuation, Sentence Structure, Non-standard English. Cambridge: Cambridge Assessment.
Mazgutova, D., & Kormos, J.
(2015) Syntactic and lexical development in an intensive English for Academic Purposes programme. Journal of Second Language Writing,
McCarthy, P. M., & Jarvis, S.
(2011) MTLD, voc-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment. Behavior Research Methods,
Meurers, D., & Dickinson, M.
(2017) Evidence and interpretation in language learning research: Opportunities for collaboration with computational linguistics. Language Learning,
(2013) Big data, learning analytics, and social assessment. Journal of Writing Assessment,
(1999) Writing matters: Linguistic characteristics of writing in GCSE English examinations. English in Education,
(2009) From talking to writing: Linguistic development in writing. BJEP Monograph Series II,
Olinghouse, N. G., & Leaird, J. T.
(2009) The relationship between measures of vocabulary and narrative writing quality in second- and fourth-grade students. Reading and Writing,
Olinghouse, N. G., & Wilson, J.
(2013) The relationship between vocabulary and writing quality in three genres. Reading and Writing: An Interdisciplinary Journal,
(2018) Phraseological competence: A missing component in university entrance language tests? Insights from a study of EFL learners’ use of statistical collocations. Language Assessment Quarterly,
(2019) The phraseological dimension in interlanguage complexity research. Second Language Research,
R Development Core Team
(2013) R: A Language and Environment for Statistical Computing (Version 1.0.136) [Computer software]. Vienna: R Foundation for Statistical Computing. Retrieved from [URL] (last accessed February 2019).
(2000) Assessing Vocabulary. Cambridge: Cambridge University Press.
Roessingh, H., Elgie, S., & Kover, P.
(2015) Using lexical profiling tools to investigate children’s written vocabulary in grade 3: An exploratory study. Language Assessment Quarterly,
Simpson-Vlach, R., & Ellis, N. C.
(2010) An Academic Formulas List: New methods in phraseology research. Applied Linguistics,
(2009) The impact of studying in a second language (L2) medium university on the development of L2 writing. Journal of Second Language Writing,
Thorndike, E. L., & Lorge, I.
(1944) The Teacher’s Word Book of 30,000 Words. New York, NY: Teachers College, Columbia University.
Treffers-Daller, J., Parslow, P., & Williams, S.
(2018) Back to basics: How measures of lexical diversity can help discriminate between CEFR levels. Applied Linguistics,
Uccelli, P., Dobbs, C. L., & Scott, J.
(2013) Mastering academic language: Organization and stance in the persuasive writing of high school students. Written Communication,
Verspoor, M., Schmid, M. S., & Xu, X.
(2012) A dynamic usage based perspective on L2 writing. Journal of Second Language Writing,
Vidakovic, I., & Barker, F.
(2010) Use of words and multi-word units in Skills for Life Writing examinations. University of Cambridge ESOL Examinations Research Notes,
Vieregge, Q., Stedman, K., Mitchell, T., & Moxley, J.
(2012) Agency in the Age of Peer Production. Urbana, IL: National Council of Teachers of English.