Article in: The Mental Lexicon: Online-First Articles
Orthographic uncertainty
An entropy-based measure of word form typicality
Measures of orthographic typicality have long been studied as predictors of lexical access. The best-known measure is orthographic neighbourhood size (Coltheart’s N, or ON): the number of words that differ from the target word by the substitution of a single letter. A more recent, related measure is orthographic Levenshtein distance 20 (OLD20): the average Levenshtein edit distance between a target word and its 20 closest orthographic neighbours (Yarkoni, Balota, and Yap, 2008). Both measures have been implicated in lexical access. In this paper, we propose and assess a family of measures of word form similarity that we call orthographic uncertainty. These measures are based on Shannon entropy (Shannon, 1948), which has long been considered psychologically relevant. Orthographic uncertainty measures are superior to ON and OLD20 at predicting lexical decision and naming reaction times and accuracies. They are also superior to the older measures insofar as they are naturally tied to the widely accepted use of Shannon entropy to quantify the psychological functions of familiarity, uncertainty, learnability, and representational and computational efficiency.
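The two established measures defined above, and the Shannon entropy on which the proposed measures are based, can be illustrated with a short Python sketch. The function names, the toy lexicon, and the decision to exclude the target word from its own neighbour set are assumptions made for illustration only. The sketch does not implement the orthographic uncertainty measures proposed in the article, whose definition is not given in this abstract.

```python
from collections import Counter
from math import log2

# Hypothetical toy lexicon; a real analysis would use a large word list.
TOY_LEXICON = ["cat", "cot", "cut", "hat", "bat", "cart", "at", "dog"]


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def coltheart_n(target: str, lexicon: list[str]) -> int:
    """Number of same-length words differing from target at exactly one position."""
    return sum(
        1
        for word in lexicon
        if len(word) == len(target)
        and sum(x != y for x, y in zip(word, target)) == 1
    )


def old_k(target: str, lexicon: list[str], k: int = 20) -> float:
    """Mean Levenshtein distance from target to its k closest words (OLD20 when k=20)."""
    distances = sorted(levenshtein(target, w) for w in lexicon if w != target)
    nearest = distances[:k]
    if not nearest:
        raise ValueError("lexicon must contain at least one word other than the target")
    return sum(nearest) / len(nearest)


def shannon_entropy(counts: Counter) -> float:
    """Shannon entropy in bits, H = -sum(p * log2(p)), of a count distribution."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


if __name__ == "__main__":
    print(coltheart_n("cat", TOY_LEXICON))    # 4 (cot, cut, hat, bat)
    print(old_k("cat", TOY_LEXICON, k=5))     # 1.0
    print(shannon_entropy(Counter("aabb")))   # 1.0 (two equiprobable letters)
```

With only eight words the toy lexicon has fewer than 20 candidates, so `old_k` averages over however many neighbours exist when `k` exceeds the lexicon size; with a realistic lexicon the default `k=20` matches the OLD20 definition.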
Keywords: lexical access, reading, orthography, orthographic neighbourhood, Coltheart’s N, ON, Levenshtein distance, OLD20, word typicality
Article outline
- Measures of orthographic typicality
- Other measures of orthographic typicality
- Non-positional N-gram measures
- Positional N-gram measures
- N-gram entropy measures
- Non-positional N-gram entropy
- Positional N-gram entropy
- A new measure of orthographic uncertainty
- Analysis of the properties of OU
- Study 1: Predicting behavioral measures of lexical access
- LD RT
- LD accuracy
- Naming RT
- Naming accuracy
- Study 1: Discussion
- Study 2: Predicting human judgments of orthographic typicality
- Method
- Participants
- Stimuli
- Results
- Discussion
- Study 3: Cross-linguistic analysis
- Method
- Results
- Discussion
- General discussion
- Open practices statement
- Notes
- Author queries
- References
This content is being prepared for publication; it may be subject to changes.
References (73)
Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov & F. Csáki (Eds.), Proceedings of the Second International Symposium on Information Theory (pp. 267–281). Budapest: Akadémiai Kiadó.
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723.
Andrews, S. (1989). Frequency and neighborhood effects on lexical access: Activation or search? Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 802–814.
Andrews, S. (1992). Frequency and neighborhood effects on lexical access: Lexical similarity or orthographic redundancy? Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 234–254.
Andrews, S. (1997). The effect of orthographic similarity on lexical retrieval: Resolving neighborhood conflicts. Psychonomic Bulletin & Review, 4(4), 439–461.
Assink, E. M., Kattenberg, G., & Wortmann, C. (1998). Exploring the boundaries of sublexical word identification units: The use of onsets and rimes and reading ability. Journal of Psycholinguistic Research, 27, 639–659.
Balota, D. A., Yap, M. J., Hutchison, K. A., Cortese, M. J., Kessler, B., Loftis, B., Neely, J., Nelson, D. L., Simpson, G. B., & Treiman, R. (2007). The English Lexicon Project. Behavior Research Methods, 39(3), 445–459.
Bentz, C., Alikaniotis, D., Cysouw, M., & Ferrer-i-Cancho, R. (2017). The entropy of words — Learnability and expressivity across more than 1000 languages. Entropy, 19(6), 275.
Blais, C., Fiset, D., Arguin, M., Jolicoeur, P., Bub, D., & Gosselin, F. (2009). Reading between eye saccades. PLoS One, 4(7), e6448.
Booth, J. R., & Perfetti, C. A. (2002). Onset and rime structure influences naming but not early word identification in children and adults. Scientific Studies of Reading, 6(1), 1–23.
Bowey, J. A. (1990). Orthographic onsets and rimes as functional units of reading. Memory & Cognition, 18(4), 419–427.
Carhart-Harris, R. L., Leech, R., Hellyer, P. J., Shanahan, M., Feilding, A., Tagliazucchi, E., Chialvo, D. R., & Nutt, D. (2014). The entropic brain: A theory of conscious states informed by neuroimaging research with psychedelic drugs. Frontiers in Human Neuroscience, 8, 20.
Carreiras, M., Perea, M., & Grainger, J. (1997). Effects of the orthographic neighborhood in visual word recognition: Cross-task comparisons. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23(4), 857.
Chaitin, G. J. (1975). A theory of program size formally identical to information theory. Journal of the ACM, 22(3), 329–340.
Chen, Q., & Mirman, D. (2012). Competition and cooperation among similar representations: Toward a unified account of facilitative and inhibitory effects of lexical neighbors. Psychological Review, 119(2), 417.
Coltheart, M., Davelaar, E., Jonasson, J., & Besner, D. (1977). Access to the internal lexicon. In S. Dornic (Ed.), Attention and performance VI: Proceedings of the Sixth International Symposium on Attention and Performance, Stockholm, Sweden, July 28-August 1, 1975. Hillsdale, NJ: Lawrence Erlbaum.
Coupé, C., Oh, Y. M., Dediu, D., & Pellegrino, F. (2019). Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche. Science Advances, 5(9), eaaw2594.
Davis, C. J., Perea, M., & Acha, J. (2009). Re(de)fining the orthographic neighborhood: The role of addition and deletion neighbors in lexical decision and reading. Journal of Experimental Psychology: Human Perception and Performance, 35(5), 1550.
Duñabeitia, J. A., & Vidal-Abarca, E. (2008). Children like dense neighborhoods: Orthographic neighborhood density effects in novel readers. Spanish Journal of Psychology, 11(1), 26.
Dye, M., Johns, B. T., Jones, M. N., & Ramscar, M. (2016). The structure of names in memory: Deviations from uniform entropy impair memory for linguistic sequences. In A. Papafragou, D. Grodner, D. Mirman, & J. C. Trueswell (Eds.), Proceedings of the 38th Annual Conference of the Cognitive Science Society (pp. 1763–1768). Austin, TX: Cognitive Science Society.
Dye, M., Milin, P., Futrell, R., & Ramscar, M. (2017). A functional theory of gender paradigms. In F. Kiefer, J. P. Blevins, & H. Bartos (Eds.), Perspectives on Morphological Structure: Data and Analyses (pp. 212–239). Leiden: Brill.
Grainger, J., & Jacobs, A. M. (1993). Masked partial-word priming in visual word recognition: Effects of positional letter frequency. Journal of Experimental Psychology: Human Perception and Performance, 19(5), 951–964.
Grainger, J., & Jacobs, A. M. (1996). Orthographic processing in visual word recognition: A multiple read-out model. Psychological Review, 103(3), 518.
Harati, P., Westbury, C., & Kiaee, M. (2021). Evaluating the predication model of metaphor comprehension: Using word2vec to model best/worst quality judgments of 622 novel metaphors. Behavior Research Methods, 53(5), 2214–2225.
Heilbron, M., Armeni, K., Schoffelen, J. M., Hagoort, P., & De Lange, F. P. (2022). A hierarchy of linguistic predictions during natural language comprehension. Proceedings of the National Academy of Sciences, 119(32), e2201968119.
Hollis, G. (2018). Scoring best-worst data in unbalanced many-item designs, with applications to crowdsourcing semantic judgments. Behavior Research Methods, 50(2), 711–729.
Hollis, G. (2020). The role of number of items per trial in best–worst scaling experiments. Behavior Research Methods, 52(2), 694–722.
Hollis, G., & Westbury, C. (2006). NUANCE: Naturalistic University of Alberta nonlinear correlation explorer. Behavior Research Methods, 38(1), 8–23.
Hollis, G., & Westbury, C. (2018). When is best-worst best? A comparison of best-worst scaling, numeric estimation, and rating scales for collection of semantic norms. Behavior Research Methods, 50, 115–133.
Hollis, G., Westbury, C. F., & Peterson, J. B. (2006). NUANCE 3.0: Using genetic programming to model variable relationships. Behavior Research Methods, 38(2), 218–228.
Huntsman, L. A., & Lima, S. D. (1996). Orthographic neighborhood structure and lexical access. Journal of Psycholinguistic Research, 25(3), 417–429.
Keuleers, E. (2013). vwr R package (v. 3). Downloaded from: [URL]
Keuleers, E., Lacey, P., Rastle, K., & Brysbaert, M. (2012). The British Lexicon Project: Lexical decision data for 28,730 monosyllabic and disyllabic English words. Behavior Research Methods, 44(1), 287–304.
Kiritchenko, S., & Mohammad, S. M. (2016). Capturing reliable fine-grained sentiment associations by crowdsourcing and best-worst scaling. Paper presented at the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), San Diego.
Levenshtein, V. I. (1966). Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady, 10, 707.
Louviere, J. J., Flynn, T. N., & Marley, A. A. J. (2015). Best-worst scaling: Theory, methods and applications. Cambridge: Cambridge University Press.
Luce, R. D. (2003). Whatever happened to information theory in psychology? Review of General Psychology, 7(2), 183–188.
Luthra, S., You, H., Rueckl, J. G., & Magnuson, J. S. (2020). Friends in Low-Entropy Places: Orthographic Neighbor Effects on Visual Word Identification Differ Across Letter Positions. Cognitive Science, 44(12), e12917.
McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception: I. An account of basic findings. Psychological Review, 88(5), 375.
Miller, R. R., Barnet, R. C., & Grahame, N. J. (1995). Assessment of the Rescorla-Wagner model. Psychological Bulletin, 117(3), 363.
O’Regan, J. K., Lévy-Schoen, A., Pynte, J., & Brugaillière, B. (1984). Convenient fixation location within isolated words of different length and structure. Journal of Experimental Psychology: Human Perception and Performance, 10(3), 393.
Peereman, R., & Content, A. (1997). Orthographic and phonological neighborhoods in naming: Not all neighbors are equally influential in orthographic space. Journal of Memory and Language, 37(3), 382–410.
Perea, M., & Rosa, E. (2000). The effects of orthographic neighborhood in reading and laboratory word identification tasks: A review. Psicológica, 21(2), 327–340.
Piantadosi, S. T., Tily, H., & Gibson, E. (2011). Word lengths are optimized for efficient communication. Proceedings of the National Academy of Sciences, 108(9), 3526–3529.
Pothos, E. (2010). An entropy model for artificial grammar learning. Frontiers in Psychology, 1, 16.
Ramscar, M., Dye, M., & Klein, J. (2013). Children value informativity over logic in word learning. Psychological Science, 24(6), 1017–1023.
Ramscar, M., & Port, R. F. (2016). How spoken languages work in the absence of an inventory of discrete units. Language Sciences, 53, 58–74.
Rayner, K., & Kaiser, J. S. (1975). Reading mutilated text. Journal of Educational Psychology, 67, 301–306.
Rayner, K., White, S. J., Johnson, R. L., & Liversedge, S. P. (2006). Raeding wrods with jubmled lettres: There is a cost. Psychological Science, 17, 192–193.
Rescorla, R. (1988). Pavlovian conditioning: It’s not what you think it is. American Psychologist, 43, 151–160.
Rumelhart, D. E., & McClelland, J. L. (1982). An interactive activation model of context effects in letter perception: II. The contextual enhancement effect and some tests and extensions of the model. Psychological Review, 89(1), 60.
Shannon, C. E. (1948). A note on the concept of entropy. Bell System Technical Journal, 27, 379–423.
Sears, C. R., Hino, Y., & Lupker, S. J. (1995). Neighborhood size and neighborhood frequency effects in word recognition. Journal of Experimental Psychology: Human Perception and Performance, 21(4), 876.
Shannon, C. E. (1948). A Mathematical Theory of Communication. Bell System Technical Journal, 27, 379–423 & 623–656.
Shaoul, C., & Westbury, C. (2006). USENET Orthographic Frequencies for 111,627 English Words. (2005–2006) Edmonton, AB: University of Alberta (downloaded from [URL])
Siakaluk, P. D., Sears, C. R., & Lupker, S. J. (2002). Orthographic neighborhood effects in lexical decision: The effects of nonword orthographic neighborhood size. Journal of Experimental Psychology: Human Perception and Performance, 28(3), 661.
Takahashi, T. (2013). A psychophysical theory of Shannon entropy. Neuroendocrinology Letters, 34(7), 615–617.
Treiman, R. (1985). Onsets and rimes as units of spoken syllables: Evidence from children. Journal of Experimental Child Psychology, 39(1), 161–181.
Treiman, R. (1986). The division between onsets and rimes in English syllables. Journal of Memory and Language, 25(4), 476–491.
Treiman, R., Fowler, C. A., Gross, J., Berch, D., & Weatherston, S. (1995). Syllable structure or word structure? Evidence for onset and rime units with disyllabic and trisyllabic stimuli. Journal of Memory and Language, 34(1), 132–155.
Westbury, C. F., & Hollis, G. (2007). Putting Humpty together again: Synthetic approaches to nonlinear variable effects underlying lexical access. In G. Jarema & G. Libben (Eds.), The Mental Lexicon: Core perspectives (pp. 7–30). Bingley: Emerald.
Westbury, C., Shaoul, C., Moroschan, G., & Ramscar, M. (2016). Telling the world’s least funny jokes: On the quantification of humor as entropy. Journal of Memory and Language, 86, 141–156.
Westbury, C., Yang, M., & Anderson, K. (2024). The Principal Components of Meaning, Revisited. [Accepted for publication in Psychonomic Bulletin & Review]
Wood, S. N. (2011). Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society (B), 73(1), 3–36.
Yarkoni, T., Balota, D., & Yap, M. (2008). Moving beyond Coltheart’s N: A new measure of orthographic similarity. Psychonomic Bulletin & Review, 15(5), 971–979.
Zhang, J. W., & Wang, Q. H. (2010). The orthographic neighborhood effect in word recognition. Advances in Psychological Science, 18(6), 892–899.