Publications
Publication details [#59607]
Alexopoulou, Theodora, Jeroen Geertzen, Anna Korhonen and Detmar Meurers. 2015. Exploring big educational learner corpora for SLA research. Perspectives on relative clauses. International Journal of Learner Corpus Research 1(1): 96–129.
Publication type
Article in journal
Publication language
English
Keywords
Language as a subject
Place, Publisher
John Benjamins
Journal DOI
10.1075/ijlcr
Annotation
This paper considers the opportunities presented by big educational learner corpora for Second Language Acquisition (SLA). In particular, it focuses on the EF Cambridge Open Language Database (EFCAMDAT), an open access database of student writings submitted to Englishtown, the online school of EF Education First. EFCAMDAT stands out for its size (33 million words, 85 thousand learners) and its range of 128 writing tasks covering all CEFR levels, with data from learners of varying nationalities. The paper discusses methodological issues arising from analyzing big data resources generated in educational contexts and argues that Natural Language Processing (NLP) is essential for the automated processing of such datasets. As a case study, the paper follows the developmental trajectory of relative clauses, a construction that necessitates deeper syntactic analysis. It considers specific issues that can affect the developmental trajectory, including task effects, formulaic language and national language effects.