Insights from studying statistical learning
Acquiring language is notoriously complex, yet most children accomplish this feat with remarkable ease. Usage-based accounts of language acquisition suggest that this success can be largely attributed to the wealth of experience with language that children accumulate during development. One field of research heavily underpinned by this principle of experience is statistical learning, which posits that learners can perform powerful computations over the distribution of information in a given input, and that these computations help them to discern precisely how that input is structured and how it operates. A growing body of work brings this notion to bear on language acquisition, motivated by a developing understanding of the richness of the statistical information contained in speech. In this chapter we discuss the role that statistical learning plays in language acquisition, emphasising the importance of both the distribution of information within language and the situation in which language is being learnt. First, we address the types of statistical learning that apply to a range of language learning tasks, asking whether the statistical processes purported to support language learning are the same or distinct across tasks. Second, we expand the perspective on what counts as environmental input by examining how statistical learning operates over the situated learning environment, and not just over sequences of sounds in utterances. Finally, we address the role of variability in children’s input, and examine how statistical learning can accommodate (and perhaps even exploit) this variability during language acquisition.
Article outline
- Preface
- Introduction
- Statistical processes for different language learning tasks
- General statistical principles of language acquisition: Grouping and dividing
- The role of the broader environment on learning
- A note on cue variability
- Conclusions
- References