What do (most of) our dispersion measures measure (most)? Dispersion?
This paper discusses the degree to which most of the most widely used measures of dispersion in corpus linguistics lack validity, in the sense that they measure not dispersion itself but an amalgam of a lot of frequency and a little dispersion. I demonstrate these issues using data from a variety of corpora. I then outline how to design a dispersion measure that measures only dispersion and show that (i) it indeed captures information that differs from frequency in an intuitive way and (ii) it predicts lexical decision times from the MALD database better than nearly all other measures in nearly all corpora tested.
Keywords: dispersion, frequency, association, range, Juilland’s D, Gries’s DP, generalized additive modeling
Published online: 30 November 2021