Article published in:
Journal of Second Language Pronunciation: Online-First Articles
Training of English prosody with acoustically modified voices
Prosodic aspects of speech are crucial for the comprehensibility of L2 speakers, but prosody is rarely targeted in
English language lessons. This paper describes an innovative training of English phrasal prosody using participants’ own speech as
models in a modified listen-and-repeat paradigm, with their melodic and rhythmic patterns manipulated by means of PSOLA. The
two-hour training was delivered individually to twelve intermediate native speakers of Czech. The comparison of a baseline
recording of a read text before the training and two texts read six weeks after the training shows that using one’s own
PSOLA-modified voice for prosody training is beneficial: the participants were perceived as sounding significantly more competent
in the after-training recordings, their phrasing corresponded more to text-based predictions, and their melodic variability was
significantly greater. The contribution of targeted prosody modifications to the teaching of L2 pronunciation is discussed.
Article outline
- 1. Introduction
- 1.1 Prosody in L2 speech
- 1.2 Using manipulations of speech prosody
- 1.3 Current study
- 2. Training with PSOLA-modified prosody
- 2.1 Speakers and material
- 2.2 Prosodic manipulations
- 2.3 Procedure
- 3. Perceptual assessment of competence
- 3.1 Method
- 3.2 Results and discussion
- 4. Analysis of phrasing
- 4.1 Method
- 4.2 Phrasing score
- 5. Production analysis
- 5.1 Method
- 5.2 Results and discussion
- 6. General discussion
- Notes
- References
Available under the Creative Commons Attribution (CC BY) 4.0 license.
For any use beyond this license, please contact the publisher at [email protected].
Published online: 3 February 2025
https://doi.org/10.1075/jslp.24041.ska