Aiming for advanced intelligibility and proficiency using mobile ASR

Mroz, Aurore

doi:10.1075/jslp.18030.mro

Article published in:

Journal of Second Language Pronunciation
Vol. 6:1 (2020), pp. 12–38

Aurore Mroz | University of Illinois at Urbana-Champaign

This experimental study aimed to determine the impact of mobile-based Automatic Speech Recognition (ASR) in Gmail on intelligibility and proficiency, as well as whether any individual factors influenced learning outcomes. It focused on 26 Intermediate learners of French as a foreign language enrolled in two university courses geared towards the development of advanced oral skills but with different approaches to integrated instruction. It innovatively combined human-based and machine-based ratings within an ecological paradigm, following Levis’s (2005) intelligibility principle and Thomson and Derwing’s (2015) call for research that is readily useful for language instructors. Results show that ASR users significantly outperformed non-ASR users on intelligibility, particularly when exposed to instruction on spelling-to-sound patterns, and demonstrated the biggest growth in proficiency. Gender was also found to impact results. Pedagogical implications and avenues for future research are offered.

Keywords: intelligibility, proficiency, advanced oral skills, ASR, French as a foreign language, ecological approach

Article outline

1. Introduction
2. Literature review
- 2.1 Intelligibility and proficiency
- 2.2 Technology
  - 2.2.1 Ecological paradigm
  - 2.2.2 ASR for L2 pronunciation
- 2.3 Design of experiment for research on L2 pronunciation
  - 2.3.1 Pronunciation instruction
  - 2.3.2 Speaking tasks
  - 2.3.3 Human-based vs. machine-based ratings
  - 2.3.4 Learners’ factors
3. Research questions (RQ)
4. Method
- 4.1 Participants
- 4.2 Experiment
- 4.3 Previous findings
- 4.4 Analyses
  - 4.4.1 Intelligibility
  - 4.4.2 Proficiency
  - 4.4.3 Relationship between intelligibility and proficiency
  - 4.4.4 Learners’ factors
5. Results
- 5.1 Intelligibility
- 5.2 Proficiency
- 5.3 Relationship between intelligibility and proficiency
- 5.4 Learners’ factors
- 5.5 Summary of results
  - 5.5.1 RQ1
  - 5.5.2 RQ2
  - 5.5.3 RQ3
  - 5.5.4 RQ4
6. Discussion
7. Conclusion
References

Published online: 11 February 2020

https://doi.org/10.1075/jslp.18030.mro

References

Abrahamsson, N., & Hyltenstam, K.

(2009) Age of onset and nativelikeness in a second language: Listener perception versus linguistic scrutiny. Language Learning, 59(2), 249–306.

ACTFL

(2012) Oral proficiency interview. Familiarization manual [Electronic version]. Retrieved May 6, 2018 from [URL]

Blin, F.

(2016) Toward an ‘ecological’ CALL theory. Theoretical perspectives and their instantiation in CALL research and practice. In F. Farr, & L. Murray (Eds.), The Routledge Handbook of Language Learning and Technology (pp. 39–54). New York, NY: Routledge.

Bongaerts, T.

(1999) Ultimate attainment in L2 pronunciation: The case of very advanced late L2 learners. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 133–159). Mahwah, NJ: Erlbaum.

Burston, J.

(2015) Twenty years of MALL project implementation: A meta-analysis of learning outcomes. ReCALL, 27(1), 4–20.

Burston, J., & Arispe, K.

(2018) Looking for a needle in a haystack: CALL and advanced language proficiency. CALICO Journal, 35(1), 77–102.

Creswell, J. W., & Plano Clark, V. L.

(2011) Designing and conducting mixed methods research (2nd ed.). Thousand Oaks, CA: Sage Publications.

Derwing, T. M.

(2010) Utopian goals for pronunciation teaching. In J. M. Levis, & K. LeVelle (Eds.), Proceedings of the 1st pronunciation in second language learning and teaching conference (pp. 24–37). Ames, IA: Iowa State University.

Derwing, T. M., Munro, M. J., & Carbonaro, M.

(2000) Does popular speech recognition software work with ESL speech? TESOL Quarterly, 34(3), 592–603.

Ehsani, F., & Knodt, E.

(1998) Speech technology in computer-aided language learning: Strengths and limitations of a new CALL paradigm. Language Learning & Technology, 2(1), 45–60.

Eskenazi, M.

(1999) Using automatic speech processing for foreign language pronunciation tutoring: Some issues and a prototype. Language Learning & Technology, 2(2), 62–76.

Gilner, L., & Morales, F.

(2010) Pronunciation: Intelligibility, frequency considerations, and instruction. 文京学院大学外国語学部文京学院短期大学紀要, 91, 35–56.

Grissom, R. J., & Kim, J. J.

(2012) Effect sizes for research: Univariate and multivariate applications (2nd ed.). New York, NY: Taylor & Francis.

Hoyt-Oukada, K.

(2003) Considering students’ needs and interests in curriculum construction. The French Review, 76(4), 721–737.

Hsu, L.

(2016) An empirical examination of EFL learners’ perceptual learning styles and acceptance of ASR-based computer-assisted pronunciation training. Computer Assisted Language Learning, 29(5), 881–900.

Kang, O.

(2013) Relative impact of pronunciation features on ratings of non-native speakers’ oral proficiency. In J. Levis, & K. LeVelle (Eds.), Proceedings of the 4th Pronunciation in Second Language Learning and Teaching Conference (pp. 10–15). Ames, IA: Iowa State University.

Koo, T. K., & Li, M. Y.

(2016) A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of Chiropractic Medicine, 15(2), 155–163.

Lee, J., Jang, J., & Plonsky, L.

(2015) The effectiveness of second language pronunciation instruction: A meta-analysis. Applied Linguistics, 36(3), 345–366.

Levis, J. M.

(2005) Changing contexts and shifting paradigms in pronunciation teaching. TESOL Quarterly, 39(3), 369–377.

Levis, J., & Suvorov, R.

(2013) Automatic speech recognition. In C. Chapelle (Ed.), The Encyclopedia of Applied Linguistics (pp. 1–8). Hoboken, NJ: Blackwell Publishing.

Liakin, D., Cardoso, W., & Liakina, N.

(2017) Mobilizing instruction in a second-language context: Learners’ perceptions of two speech technologies. Languages, 2(11), 1–21.

Ma, R., Henrichsen, L. E., Cox, T. L., & Tanner, M. W.

(2018) Pronunciation’s role in English speaking-proficiency ratings. Journal of Second Language Pronunciation, 4(1), 73–102.

McCrocklin, S.

(2016) Pronunciation learner autonomy: The potential of Automatic Speech Recognition. System, 57, 25–42.

Moyer, A.

(2016) The puzzle of gender effects in L2 phonology. Journal of Second Language Pronunciation, 2(1), 8–28.

Mroz, A.

(2018) Seeing how people hear you: French learners experiencing intelligibility through automatic speech recognition. Foreign Language Annals, 51(3), 617–637.

Munro, M. J., & Derwing, T. M.

(2015) A prospectus for pronunciation research in the 21st century. A point of view. Journal of Second Language Pronunciation, 1(1), 11–42.

Pew Research Center

(2018) Mobile fact sheet. Retrieved April 12, 2018 from [URL]

Ranta, L., & Lyster, R.

(2018) Form-focused instruction. In P. Garrett, & J. M. Cots (Eds.), The Routledge Handbook of Language Awareness (pp. 40–56). New York, NY: Routledge.

Saito, K.

(2012) Effects of instruction on L2 pronunciation development: A synthesis of 15 quasi-experimental intervention studies. TESOL Quarterly, 46(4), 842–854.

Sicola, L., & Darcy, I.

(2015) Integrating pronunciation in the language classroom. In M. Reed, & J. M. Levis (Eds.), The Handbook of English Pronunciation (pp. 471–487). Chichester, UK: Wiley Blackwell.

Thomson, R. I., & Derwing, T. M.

(2015) The effectiveness of L2 pronunciation instruction: A narrative review. Applied Linguistics, 36(3), 326–344.

Ukkonen, E.

(1985) Algorithms for approximate string matching. Information and Control, 64(1–3), 100–118.

van Doremalen, J., Boves, L., Colpaert, J., Cucchiarini, C., & Strik, H.

(2016) Evaluating automatic speech recognition-based language learning systems: A case study. Computer Assisted Language Learning, 29(4), 833–851.

van Doremalen, J., Cucchiarini, C., & Strik, H.

(2010) Optimizing automatic speech recognition for low-proficient non-native speakers. EURASIP Journal on Audio, Speech, and Music Processing, 2010, 1–13.

van Lier, L.

(2004) The semiotics and ecology of language learning. Perception, voice, identity and democracy. Utbildning & Demokrati, 13(3), 79–103.

Vu, N. T., Wang, Y., Klose, M., Mihaylova, Z., & Schultz, T.

(2014) Improving ASR performance on non-native speech using multilingual and crosslingual information. In Fifteenth Annual Conference of the International Speech Communication Association, Singapore, 2014.

Cited by

Cited by 3 other publications

Inceoglu, Solène, Wen-Hsin Chen & Hyojung Lim

2023. Assessment of L2 intelligibility: Comparing L1 listeners and automatic speech recognition. ReCALL 35:1, pp. 89 ff.

Martin, Ines A. & Solène Inceoglu

2022. The Laboratory, the Classroom, and Online. In Second Language Pronunciation, pp. 254 ff.

McCrocklin, Shannon & Idée Edalatishams

2020. Revisiting Popular Speech Recognition Software for ESL Speech. TESOL Quarterly 54:4, pp. 1086 ff.