Aiming for advanced intelligibility and proficiency using mobile ASR
Aurore Mroz | University of Illinois at Urbana-Champaign
This experimental study aimed to determine the impact of mobile-based Automatic Speech Recognition (ASR) in Gmail
on intelligibility and proficiency, as well as whether any individual factors influenced learning outcomes. It focused on 26
Intermediate learners of French as a foreign language enrolled in two university courses geared towards the development of
advanced oral skills but with different approaches to integrated instruction. It innovatively combined human-based and
machine-based ratings within an ecological paradigm, following Levis’s (2005)
intelligibility principle and Thomson and Derwing’s (2015) call for research that is
readily useful for language instructors. Results show that ASR users significantly outperformed non-ASR users on intelligibility,
particularly when exposed to instruction on spelling-to-sound patterns, and demonstrated the biggest growth in proficiency. Gender
was also found to impact results. Pedagogical implications and avenues for future research are offered.
(2009) Age of onset and nativelikeness in a second language: Listener perception versus linguistic scrutiny. Language Learning, 59(2), 249–306.
ACTFL
(2012) Oral proficiency interview. Familiarization manual [Electronic version]. Retrieved May 6, 2018 from [URL]
Blin, F.
(2016) Toward an ‘ecological’ CALL theory. Theoretical perspectives and their instantiation in CALL research and practice. In F. Farr, & L. Murray (Eds.), The Routledge Handbook of Language Learning and Technology (pp. 39–54). New York, NY: Routledge.
Bongaerts, T.
(1999) Ultimate attainment in L2 pronunciation: The case of very advanced late L2 learners. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 133–159). Mahwah, NJ: Erlbaum.
Burston, J.
(2015) Twenty years of MALL project implementation: A meta-analysis of learning outcomes. ReCALL, 27(1), 4–20.
Burston, J., & Arispe, K.
(2018) Looking for a needle in a haystack: CALL and advanced language proficiency. CALICO Journal, 35(1), 77–102.
Creswell, J. W., & Plano Clark, V. L.
(2011) Designing and conducting mixed methods research (2nd ed.). Thousand Oaks, CA: Sage Publications.
Derwing, T. M.
(2010) Utopian goals for pronunciation teaching. In J. M. Levis, & K. LeVelle (Eds.), Proceedings of the 1st pronunciation in second language learning and teaching conference (pp. 24–37). Ames, IA: Iowa State University.
Derwing, T. M., Munro, M. J., & Carbonaro, M.
(2012) Does popular speech recognition software work with ESL speech? TESOL Quarterly, 34(3), 592–603.
Ehsani, F., & Knodt, E.
(1998) Speech technology in computer-aided language learning: Strengths and limitations of a new CALL paradigm. Language Learning & Technology, 2(1), 45–60.
Eskenazi, M.
(1999) Using automatic speech processing for foreign language pronunciation tutoring: Some issues and a prototype. Language Learning & Technology, 2(2), 62–76.
Gilner, L., & Morales, F.
(2010) Pronunciation: Intelligibility, frequency considerations, and instruction. 文京学院大学外国語学部文京学院短期大学紀要, 91, 35–56.
Grissom, R. J., & Kim, J. J.
(2012) Effect sizes for research: Univariate and multivariate applications. (2nd ed.). New York, NY: Taylor & Francis.
Hoyt-Oukada, K.
(2003) Considering students’ needs and interests in curriculum construction. The French Review, 76(4), 721–737.
Hsu, L.
(2016) An empirical examination of EFL learners’ perceptual learning styles and acceptance of ASR-based computer-assisted pronunciation training. Computer Assisted Language Learning, 29(5), 881–900.
Kang, O.
(2013) Relative impact of pronunciation features on ratings of non-native speakers’ oral proficiency. In J. Levis, & K. LeVelle (Eds.), Proceedings of the 4th Pronunciation in Second Language Learning and Teaching Conference (pp. 10–15). Ames, IA: Iowa State University.
Koo, T. K., & Li, M. Y.
(2016) A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of Chiropractic Medicine, 15(2), 155–163.
Lee, J., Jang, J., & Plonsky, L.
(2015) The effectiveness of second language pronunciation instruction: A meta-analysis. Applied Linguistics, 36(3), 345–366.
Levis, J. M.
(2005) Changing contexts and shifting paradigms in pronunciation teaching. TESOL Quarterly, 39(3), 369–377.
Levis, J., & Suvorov, R.
(2013) Automatic speech recognition. In C. Chapelle (Ed.), The Encyclopedia of Applied Linguistics (pp. 1–8). Hoboken, NJ: Blackwell Publishing.
Liakin, D., Cardoso, W., & Liakina, N.
(2017) Mobilizing instruction in a second-language context: Learners’ perceptions of two speech technologies. Languages, 2(11), 1–21.
Ma, R., Henrichsen, L. E., Cox, T. L., & Tanner, M. W.
(2018) Seeing how people hear you: French learners experiencing intelligibility through automatic speech recognition. Foreign Language Annals, 51(3), 617–637.
(2018) Mobile fact sheet. Retrieved April 12, 2018 from [URL]
Ranta, L., & Lyster, R.
(2018) Form-focused instruction. In P. Garrett, & J. M. Cots (Eds.), The Routledge Handbook of Language Awareness (pp. 40–56). New York, NY: Routledge.
Saito, K.
(2012) Effects of instruction on L2 pronunciation development: A synthesis of 15 quasi-experimental intervention studies. TESOL Quarterly, 46(4), 842–854.
Sicola, L., & Darcy, I.
(2015) Integrating pronunciation in the language classroom. In M. Reed, & J. M. Levis (Eds.), The Handbook of English Pronunciation (pp. 471–487). Chichester, UK: Wiley Blackwell.
Thomson, R. I., & Derwing, T. M.
(2015) The effectiveness of L2 pronunciation instruction: A narrative review. Applied Linguistics, 36(3), 326–344.
Ukkonen, E.
(1985) Algorithms for approximate string matching. Information and Control, 64(1–3), 100–118.
van Doremalen, J., Boves, L., Colpaert, J., Cucchiarini, C., & Strik, H.
(2016) Evaluating automatic speech recognition-based language learning systems: a case study. Computer Assisted Language Learning, 29(4), 833–851.
van Doremalen, J., Cucchiarini, C., & Strik, H.
(2010) Optimizing automatic speech recognition for low-proficient non-native speakers. EURASIP Journal on Audio, Speech, and Music Processing, 2010, 1–13.
van Lier, L.
(2004) The semiotics and ecology of language learning. Perception, voice, identity and democracy. Utbildning & Demokrati, 13(3), 79–103.
Vu, N. T., Wang, Y., Klose, M., Mihaylova, Z., & Schultz, T.
(2014) Improving ASR performance on non-native speech using multilingual and crosslingual information. In Fifteenth Annual Conference of the International Speech Communication Association, Singapore, 2014.
Cited by 3 other publications
Inceoglu, Solène, Wen-Hsin Chen & Hyojung Lim
2023. Assessment of L2 intelligibility: Comparing L1 listeners and automatic speech recognition. ReCALL 35:1, pp. 89 ff.
Martin, Ines A. & Solène Inceoglu
2022. The Laboratory, the Classroom, and Online. In Second Language Pronunciation, pp. 254 ff.
McCrocklin, Shannon & Idée Edalatishams
2020. Revisiting Popular Speech Recognition Software for ESL Speech. TESOL Quarterly 54:4, pp. 1086 ff.
This list is based on CrossRef data as of 2 April 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers.
Any errors therein should be reported to them.