Article in: Journal of Second Language Pronunciation (Online-First Articles)

Exploring the potential of textually-enhanced captioned video to direct learners’ attention to challenging sound contrasts
This study investigates the potential of textually-enhanced
captioned video to direct EFL learners’ attention to a difficult L2 vowel
contrast (English /æ/-/ʌ/) while watching a 30-minute episode of
Ted Lasso. Spanish EFL learners (n = 89) were randomly assigned
to one of five viewing conditions: (1) unenhanced captions; enhanced captions
with /æ/ and /ʌ/ in two different colours and the target words in either
(2) phonetic symbols or (3) orthography; or with /æ/ and /ʌ/ in the same colour,
in either (4) phonetic symbols or (5) orthography. The participants’ eye
movements were recorded with a Tobii TX-1200 eye-tracker. The textual
enhancement was effective in directing learners’ attention to the
target words, and attention was generally maintained throughout the episode. The
enhanced conditions promoted higher fixation rates and longer fixation durations than the
unenhanced one. Additionally, the participants’ answers to a post-viewing
questionnaire revealed that they considered these types of enhancement useful to
help them spot instances of the target sounds and that the captions were not
overwhelming.
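The fixation rates and durations reported above are typically obtained by aggregating individual fixations on the target-word areas of interest per viewing condition. A minimal sketch of that aggregation step, using a hypothetical fixation log (the condition labels, words, and durations below are illustrative, not the study’s data):

```python
# Hypothetical fixation log: (condition, word, fixation_duration_ms).
# The entries are invented for illustration only.
fixations = [
    ("unenhanced", "cat", 180),
    ("unenhanced", "cut", 150),
    ("enhanced_colour_ipa", "cat", 240),
    ("enhanced_colour_ipa", "cat", 200),
    ("enhanced_colour_ipa", "cut", 260),
]

def fixation_stats(log):
    """Return {condition: (fixation_count, total_duration_ms)}."""
    stats = {}
    for condition, _word, duration in log:
        count, total = stats.get(condition, (0, 0))
        stats[condition] = (count + 1, total + duration)
    return stats

stats = fixation_stats(fixations)
```

With this toy log, the enhanced condition accumulates both more fixations and longer total fixation time on the target words than the unenhanced one, mirroring the pattern the study reports.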
Keywords: textual enhancement, captioned video, computer assisted pronunciation training (CAPT), pronunciation training
Article outline
- 1. Introduction
- 2. Facilitating pronunciation training with technology
- 3. Captioned video and pronunciation learning
- 4. The present study
- 4.1 Study design and procedure
- 4.2 Participants
- 4.3 Target stimuli
- 4.4 Eye-gaze measures
- 4.5 Data analyses
- 5. Results
- 5.1 Effects of textual enhancement conditions on viewing behavior
- 5.2 Learners’ attention to target words over time
- 5.3 Video viewing habits and perception of textual enhancement
- 6. Discussion and conclusions
- Acknowledgements
- Note
- References
This content is being prepared for publication; it may be subject to changes.