What knowledge influences our choice of words when we write or speak? Predicting which word a person will produce next is not easy, even when the linguistic context is known. One task that has been used to assess context-dependent word choice is the fill-in-the-blank task, also called the cloze task. The cloze probability of a response in a specific context is an empirical measure obtained by asking many people to fill in the blank. In this paper, we harness the power of large corpora to examine how corpus-derived probabilistic information from a word’s micro-context influences word choice. We asked young adults to complete short phrases, called n-grams, with up to 20 responses per phrase. The probability of the response word and the conditional probability of the response given the context were predictive of the frequency with which each response was produced. Furthermore, the order in which participants generated multiple completions of the same context was also predicted by the conditional probability. These results suggest that word choice in cloze tasks taps into implicit knowledge derived from a person’s past experience with that word in various contexts. The importance of n-gram conditional probabilities in our analysis is further evidence of implicit knowledge about multi-word sequences and supports theories of language processing that involve anticipating or predicting based on context.
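The two quantities contrasted in the abstract can be illustrated with a minimal sketch. It assumes cloze probability is the proportion of responses matching a given word, and that the conditional probability is estimated as the corpus count of the full n-gram divided by the total count of all n-grams sharing the same context. The function names, the toy responses, and the toy counts are hypothetical and are not taken from the paper's data or code.

```python
from collections import Counter


def cloze_probabilities(responses):
    """Proportion of responses equal to each word, for a single context."""
    counts = Counter(responses)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def conditional_probability(ngram_counts, context, word):
    """Estimate P(word | context) from corpus n-gram counts.

    ngram_counts maps (context, word) pairs to corpus frequencies,
    where context is a tuple of the preceding words.
    """
    context_total = sum(
        n for (ctx, _), n in ngram_counts.items() if ctx == context
    )
    if context_total == 0:
        return 0.0  # context never observed in the (toy) corpus
    return ngram_counts.get((context, word), 0) / context_total


# Hypothetical participant responses to the context "once upon a ___"
toy_responses = ["time", "time", "day", "way"]

# Hypothetical corpus counts for the same context
toy_context = ("once", "upon", "a")
toy_counts = {
    (toy_context, "time"): 90,
    (toy_context, "day"): 10,
}

cloze = cloze_probabilities(toy_responses)
p_time = conditional_probability(toy_counts, toy_context, "time")
```

Comparing `cloze["time"]` with `p_time` for each response is the kind of human-versus-corpus comparison the abstract describes. A real analysis would use counts from a large corpus and would also need to handle tokenization and smoothing.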