Article published in: Bridging the Methodological Divide: Linguistic and psycholinguistic approaches to formulaic language
Edited by Stefanie Wulff and Debra Titone
[The Mental Lexicon 9:3] 2014
pp. 437–472
N-gram probability effects in a cloze task
What knowledge influences our choice of words when we write or speak? Predicting which word a person will produce next is not easy, even when the linguistic context is known. One task that has been used to assess context-dependent word choice is the fill-in-the-blank task, also called the cloze task. The cloze probability of a specific context is an empirical measure found by asking many people to fill in the blank. In this paper we harness the power of large corpora to look at the influence of corpus-derived probabilistic information from a word’s micro-context on word choice. We asked young adults to complete short phrases called n-grams with up to 20 responses per phrase. The probability of the response word and the conditional probability of the response given the context were predictive of the frequency with which each response was produced. Furthermore, the order in which the participants generated multiple completions of the same context was also predicted by the conditional probability. These results suggest that word choice in cloze tasks taps into implicit knowledge of a person’s past experience with that word in various contexts. Moreover, the importance of n-gram conditional probabilities in our analysis is further evidence of implicit knowledge about multi-word sequences and supports theories of language processing that involve anticipating or predicting based on context.
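The two quantities contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the function names, the example context, and all counts are hypothetical, and it assumes the simplest estimators, namely cloze probability as the proportion of responses matching a word, and corpus conditional probability as the count of the context followed by the word divided by the count of the context followed by any word.

```python
from collections import Counter


def cloze_probabilities(responses):
    """Proportion of collected responses that match each completion."""
    counts = Counter(responses)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def conditional_probability(ngram_counts, context, word):
    """Estimate P(word | context) from n-gram counts keyed by (context, word)."""
    context_total = sum(
        n for (ctx, _), n in ngram_counts.items() if ctx == context
    )
    if context_total == 0:
        return 0.0  # context never observed in the (toy) corpus
    return ngram_counts.get((context, word), 0) / context_total


# Hypothetical toy data, invented purely for illustration.
responses = ["day", "day", "time", "day", "night"]
ngram_counts = {
    ("at the end of the", "day"): 60,
    ("at the end of the", "year"): 30,
    ("at the end of the", "road"): 10,
}

cloze = cloze_probabilities(responses)
p_day = conditional_probability(ngram_counts, "at the end of the", "day")
```

Here `cloze` is a behavioral estimate derived from what people wrote, while `p_day` is a corpus estimate; the study asks how well the second kind of quantity predicts the first.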
Keywords: formulaic language, multi-word expressions, n-grams, cloze probability, production
Published online: 23 January 2015