This paper discusses machine learning techniques for predicting Common European Framework of Reference (CEFR)
levels in a learner corpus. We summarise the CAp 2018 Machine Learning (ML) competition, a
classification task over the CEFR levels, which map linguistic competence in a foreign language onto six reference levels. The goal of
the competition was to produce a machine learning system that predicts learners’ competence levels from written productions of between
20 and 300 words, together with a set of characteristics computed for each text. The texts were extracted from the French component of the EFCAMDAT data (Geertzen et al., 2013). Together with the description of the competition, we provide an analysis of
the results and methods proposed by the participants and discuss the benefits of this kind of competition for the learner corpus research
(LCR) community. The main findings address the methods used and the lexical bias introduced by the task.
Abney, S. 2007. Semisupervised learning for computational linguistics. London: Chapman and Hall/CRC.
Alexopoulou, T., Michel, M., Murakami, A., & Meurers, D. 2017. Task effects on linguistic complexity and accuracy: A large-scale learner corpus analysis employing natural language processing techniques. Language Learning, 67(S1), 180–208.
Alexopoulou, T., Yannakoudakis, H., & Salamoura, A. 2013. Classifying intermediate learner English: a data-driven approach to learner corpora. In Twenty years of learner corpus research: Looking back, moving ahead (pp. 11–23). Belgium: Presses Universitaires de Louvain.
Attali, Y. & Burstein, J. 2006. Automated essay scoring with e-rater® v.2. The Journal of Technology, Learning and Assessment, 4(3).
Balikas, G. 2018. Lexical bias in essay level prediction. ArXiv e-prints.
Barker, F., Salamoura, A., & Saville, N. 2015. Learner corpora and language testing. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The Cambridge Handbook of Learner Corpus Research (pp. 511–534). Cambridge: Cambridge University Press.
Baur, C., Caines, A., Chua, C., Gerlach, J., Qian, M., Rayner, M., Russell, M., Strik, H., & Wei, X. 2018. Overview of the 2018 spoken CALL shared task. In Interspeech 2018, 2354–2358. Geneva: ISCA.
Baur, C., Chua, C., Gerlach, J., Rayner, E., Russell, M., Strik, H., & Wei, X. 2017. Overview of the 2017 spoken CALL shared task. In Workshop on Speech and Language Technology in Education (SLaTE). Stockholm, Sweden.
Boyd, A., Hana, J., Nicolas, L., Meurers, D., Wisniewski, K., Abel, A., Schöne, K., Stindlová, B., & Vettori, C. 2014. The MERLIN corpus: Learner language and the CEFR. In LREC, 1281–1288. Reykjavik, Iceland.
Chen, X. & Meurers, D. 2016. CTAP: A web-based tool supporting automatic complexity analysis. In Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC), 113–119.
Council of Europe. 2001a. Common European Framework of Reference for Languages: Learning, teaching, assessment. Strasbourg, Language Policy Division: Cambridge University Press.
Council of Europe. 2001b. Common European Framework of Reference for Languages: Learning, teaching, assessment. Structured overview of all CEFR scales. Strasbourg, Language Policy Division: Cambridge University Press.
Council of Europe. 2018. Common European Framework of Reference for Languages: Learning, teaching, assessment; Companion volume with new descriptors. Strasbourg, Language Policy Division: Cambridge University Press.
Crossley, S. A., Salsbury, T., McNamara, D. S., & Jarvis, S. 2011. Predicting lexical proficiency in language learner texts using computational indices. Language Testing, 28(4), 561–580.
Cushing Weigle, S. 2010. Validation of automated scores of TOEFL iBT tasks against non-test indicators of writing ability. Language Testing, 27(3), 335–353.
Dahlmeier, D., Ng, H. T., & Wu, S. M. 2013. Building a large annotated corpus of learner English: The NUS corpus of learner English. In Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications, 22–31. Association for Computational Linguistics. Atlanta, Georgia.
Dale, R. & Kilgarriff, A. 2011. Helping our own: The HOO 2011 pilot shared task. In Proceedings of the 13th European Workshop on Natural Language Generation, ENLG ’11, 242–249. Association for Computational Linguistics. Nancy, France.
Dale, R., Anisimoff, I., & Narroway, G. 2012. HOO 2012: A report on the preposition and determiner error correction shared task. In Proceedings of the Seventh Workshop on Building Educational Applications Using NLP, NAACL HLT ’12, 54–62. Association for Computational Linguistics. Montreal, Canada.
Flach, P. 2012. Machine learning: The art and science of algorithms that make sense of data. Cambridge: Cambridge University Press.
Friedman, J., Hastie, T., & Tibshirani, R. 2001. The elements of statistical learning, volume 1. New York: Springer Series in Statistics.
Geertzen, J., Alexopoulou, T., & Korhonen, A. 2013. Automatic linguistic annotation of large scale L2 databases: The EF-Cambridge open language database (EFCAMDAT). In Proceedings of the 31st Second Language Research Forum. Somerville, MA: Cascadilla Proceedings Project.
Goldberg, Y. 2017. Neural network methods for natural language processing. Synthesis Lectures on Human Language Technologies. San Rafael, CA: Morgan & Claypool Publishers.
Granger, S., Kraif, O., Ponton, C., Antoniadis, G., & Zampa, V. 2007. Integrating learner corpora and natural language processing: A crucial step towards reconciling technological sophistication and pedagogical effectiveness. ReCALL, 19(3), 252–268.
Hawkins, J. A. & Buttery, P. 2010. Criterial features in learner corpora: Theory and illustrations. English Profile Journal, 1(01).
Hawkins, J. A. & Filipović, L. 2012. Criterial features in L2 English: Specifying the reference levels of the Common European Framework, volume 1 of English Profile Studies. United Kingdom: Cambridge University Press.
Higgins, D., Ramineni, C., & Zechner, K. 2015. Learner corpora and automated scoring. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The Cambridge Handbook of Learner Corpus Research (pp. 587–604). Cambridge: Cambridge University Press.
Hopman, E., Thompson, B., Austerweil, J., & Lupyan, G. 2018. Predictors of L2 word learning accuracy: A big data investigation. In the 40th Annual Conference of the Cognitive Science Society (CogSci 2018), 513–518.
Jarvis, S. & Paquot, M. 2015. Learner corpora and native language identification. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The Cambridge Handbook of Learner Corpus Research (pp. 605–628). Cambridge: Cambridge University Press.
Jarvis, S. 2011. Data mining with learner corpora. In F. Meunier, S. De Cock, G. Gilquin, & M. Paquot (Eds.), A taste for corpora: In honour of Sylviane Granger (pp. 127–154). Amsterdam and Philadelphia: John Benjamins.
Le, Q. V. & Mikolov, T. 2014. Distributed representations of sentences and documents. ArXiv: 1405.4053.
Leacock, C., Chodorow, M., Gamon, M., & Tetreault, J. 2010. Automated grammatical error detection for language learners. Synthesis Lectures on Human Language Technologies, 3(1), 1–134.
Lin, T.-Y., Goyal, P., Girshick, R., He, K., & Dollár, P. 2017. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, 2980–2988.
Lissón, P. & Ballier, N. 2018. Investigating learners’ progression in French as a foreign language: vocabulary growth and lexical diversity. CUNY Student Research Day. Poster.
Lissón, P. 2017. Investigating the use of readability metrics to detect differences in written productions of learners: a corpus-based study. Bellaterra Journal of Teaching & Learning Language & Literature, 10(4), 68–86.
Liu, B. 2012. Sentiment analysis and opinion mining. San Rafael, CA: Morgan & Claypool Publishers.
Lu, X. 2014. Computational methods for corpus annotation and analysis. New York: Springer.
Magerman, D. M. 1995. Statistical decision-tree models for parsing. In Proceedings of the 33rd Annual Meeting on Association for Computational Linguistics, 276–283. Association for Computational Linguistics.
Malmasi, S., Evanini, K., Cahill, A., Tetreault, J., Pugh, R., Hamill, C., Napolitano, D., & Qian, Y. 2017. A report on the 2017 native language identification shared task. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications, 62–75. Association for Computational Linguistics. Copenhagen, Denmark.
Meurers, D. 2015. Learner corpora and natural language processing. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The Cambridge Handbook of Learner Corpus Research (pp. 537–566). Cambridge: Cambridge University Press.
Michalke, M. 2017. koRpus: An R package for text analysis. (Version 0.10–2). Available at: [URL] (accessed October 2018).
Mons, B. 2018. Data stewardship for open science: Implementing FAIR principles. London: Chapman and Hall/CRC.
Murakami, A. 2014. Individual variation and the role of L1 in the L2 development of English grammatical morphemes: Insights from learner corpora. PhD thesis, University of Cambridge.
Murakami, A. 2016. Modeling systematicity and individuality in nonlinear second language development: The case of English grammatical morphemes. Language Learning, 66(4), 834–871.
Murphy, K. P. 2012. Machine learning. A probabilistic perspective. Adaptive Computation and Machine Learning. Cambridge (MA): MIT Press.
Ng, H. T., Wu, S. M., Briscoe, T., Hadiwinoto, C., Susanto, R. H., & Bryant, C. 2014. The CoNLL-2014 shared task on grammatical error correction. In Proceedings of the Eighteenth Conference on Computational Natural Language Learning: Shared Task, 1–14. Association for Computational Linguistics. Baltimore, Maryland.
Nissim, M., Abzianidze, L., Evang, K., van der Goot, R., Haagsma, H., Plank, B., & Wieling, M. 2017. Sharing is caring: The future of shared tasks. Computational Linguistics, 43(4), 897–904.
Page, E. B. 1968. The use of the computer in analyzing student essays. International Review of Education / Internationale Zeitschrift für Erziehungswissenschaft / Revue Internationale de l’Education, 14(2), 210–225.
Paroubek, P., Chaudiron, S., & Hirschman, L. 2007. Principles of evaluation in natural language processing. Traitement Automatique des Langues, 48(1), 7–31.
Rich, A., Popp, P. O., Halpern, D., Rothe, A., & Gureckis, T. 2018. Modeling second-language learning from a psychological perspective. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, 223–230.
Sang, E. F. & De Meulder, F. 2003. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050, 142–147.
Settles, B. 2018. Data for the 2018 Duolingo shared task on second language acquisition modeling (SLAM). Available at: . (accessed October 2018).
Settles, B., Brust, C., Gustafson, E., Hagiwara, M., & Madnani, N. 2018. Second language acquisition modeling. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, 56–65.
Shermis, M. D., Burstein, J., Higgins, D., & Zechner, K. 2010. Automated essay scoring: Writing assessment and instruction. In P. Peterson, E. Baker, & B. McGaw (Eds.), International Encyclopedia of Education (Third Edition) (pp. 20–26). Oxford: Elsevier.
Tetreault, J., Burstein, J., Kochmar, E., Leacock, C., & Yannakoudakis, H. 2018. Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics. New Orleans, Louisiana.
Thewissen, J. 2015. Accuracy across proficiency levels: A learner corpus approach. Louvain: Presses universitaires de Louvain.
Thrun, S. & Pratt, L. 1998. Learning to learn. Norwell, MA, USA: Kluwer Academic Publishers.
Vajjala, S. & Loo, K. 2014. Automatic CEFR level prediction for Estonian learner text. In NEALT Proceedings Series, volume 221, 113–128.
Volodina, E., Pilán, I., & Alfter, D. 2016. Classification of Swedish learner essays by CEFR levels. CALL Communities and Culture–Short Papers from EUROCALL, 2016, 456–461.
Wisniewski, K. 2017. Empirical learner language and the levels of the Common European Framework of Reference. Language Learning, 67(S1), 232–253.
Yannakoudakis, H., Briscoe, T., & Medlock, B. 2011. A new dataset and method for automatically grading ESOL texts. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies – Volume 1, HLT ’11, 180–189. Association for Computational Linguistics.
Yannakoudakis, H., Kochmar, E., Leacock, C., Madnani, N., Pilán, I., & Zesch, T. 2019. Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics. Florence, Italy.