Chapter 1
Spanish learner corpus research
Achievements and challenges
This chapter presents a state-of-the-art overview of Spanish learner corpus research (SLCR). It starts by emphasizing the uniqueness of a monograph devoted to research on learners of a language other than English. The next section is concerned with the status of Spanish as a foreign language in the world, and as a pluricentric language made up of a set of American and Spanish varieties. After that, the main features of learner corpus design and analysis in SLCR are reviewed. Besides providing an overview of the main Spanish learner corpora, this chapter draws attention to some of the challenges that this field of research will have to face. The last section briefly reviews the contributions to this volume.
Article outline
- 1. Introduction
- 2. The status of Spanish as a Foreign Language
- 3. Learner corpus design and analysis in SLCR: Features and problems
- 4. An overview of Spanish learner corpora
- 5. Challenges to SLCR
- 6. Contributions to this volume
- Acknowledgments
- Notes
- References
Cited by (1)
Gudmestad, Aarnes, Amanda Edmonds & Thomas Metzger. 2019. Using Variationism and Learner Corpus Research to Investigate Grammatical Gender Marking in Additional Language Spanish. Language Learning 69 (4): 911 ff.