Why we need to investigate casual speech to truly understand language production, processing and the mental lexicon

Tucker, Benjamin V.; Ernestus, Mirjam

doi:10.1075/ml.11.3.03tuc

Article published in:

New Questions for the Next Decade
Edited by Gonia Jarema, Gary Libben and Victor Kuperman
[The Mental Lexicon 11:3] 2016
pp. 375–400

Benjamin V. Tucker | University of Alberta

Mirjam Ernestus | Radboud University / Max Planck Institute for Psycholinguistics

The majority of studies addressing psycholinguistic questions focus on speech produced and processed in a careful, laboratory speech style. This ‘careful’ speech is very different from the speech that listeners encounter in casual conversations. This article argues that research on casual speech is necessary to show the validity of conclusions based on careful speech. Moreover, research on casual speech produces new insights into, and new questions about, the processes underlying communication and the mental lexicon that cannot be revealed by research using careful speech. This article first places research on casual speech in its historical perspective. It then provides many examples of how casual speech differs from careful speech and shows that these differences may have important implications for psycholinguistic theories. Subsequently, the article discusses the challenges that research on casual speech faces, which stem from the high variability of this speech style, its need for a casual context, and the fact that casual speech is connected speech. We also present opportunities for research on casual speech, mostly in the form of new experimental methods that facilitate research on connected speech. However, real progress can only be made if these new methods are combined with advanced (still to be developed) statistical techniques.

Keywords: casual speech, conversational speech, experimental paradigms, pronunciation variability, statistical analyses

Published online: 31 December 2016

