Article in: Journal of Second Language Pronunciation (Online-First Articles)

Exploring the potential of textually-enhanced captioned video to direct learners’ attention to challenging sound contrasts
This study investigates the potential of textually-enhanced
captioned video to direct EFL learners’ attention to a difficult L2 vowel
contrast (English /æ/-/ʌ/) while watching a 30-minute episode of
Ted Lasso. Spanish EFL learners (n = 89) were randomly assigned
to one of five viewing conditions: (1) unenhanced captions; enhanced captions
with /æ/ and /ʌ/ in two different colours and the target words in either
(2) phonetic symbols or (3) orthography; or with /æ/ and /ʌ/ in the same colour,
in either (4) phonetic symbols or (5) orthography. The participants’ eye
movements were recorded with a Tobii TX-1200 eye-tracker. The textual
enhancement was effective in directing learners’ attention to the
target words, and attention was generally maintained throughout the episode. The
enhanced conditions promoted higher fixation rates and longer fixation durations than the
unenhanced one. Additionally, the participants’ answers to a post-viewing
questionnaire revealed that they considered these types of enhancement useful to
help them spot instances of the target sounds and that the captions were not
overwhelming.
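The fixation rates and durations reported above are typically obtained by aggregating individual fixations on the target-word areas of interest per viewing condition. A minimal sketch of that aggregation step, using a hypothetical fixation log (the condition labels, words, and durations below are illustrative, not the study’s data):

```python
# Hypothetical fixation log: (condition, word, fixation_duration_ms).
# The entries are invented for illustration only.
fixations = [
    ("unenhanced", "cat", 180),
    ("unenhanced", "cut", 150),
    ("enhanced_colour_ipa", "cat", 240),
    ("enhanced_colour_ipa", "cat", 200),
    ("enhanced_colour_ipa", "cut", 260),
]

def fixation_stats(log):
    """Return {condition: (fixation_count, total_duration_ms)}."""
    stats = {}
    for condition, _word, duration in log:
        count, total = stats.get(condition, (0, 0))
        stats[condition] = (count + 1, total + duration)
    return stats

stats = fixation_stats(fixations)
```

With this toy log, the enhanced condition accumulates both more fixations and longer total fixation time on the target words than the unenhanced one, mirroring the pattern the study reports.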
Keywords: textual enhancement, captioned video, computer assisted pronunciation training (CAPT), pronunciation training
Article outline
- 1. Introduction
- 2. Facilitating pronunciation training with technology
- 3. Captioned video and pronunciation learning
- 4. The present study
- 4.1 Study design and procedure
- 4.2 Participants
- 4.3 Target stimuli
- 4.4 Eye-gaze measures
- 4.5 Data analyses
- 5. Results
- 5.1 Effects of textual enhancement conditions on viewing behavior
- 5.2 Learners’ attention to target words over time
- 5.3 Video viewing habits and perception of textual enhancement
- 6. Discussion and conclusions
- Acknowledgements
- Note
- References
This content is being prepared for publication; it may be subject to changes.