Error annotation is a key feature of modern learner corpora. Error identification is always based on some kind of reconstructed learner utterance (target hypothesis). Since a single target hypothesis can only cover a certain amount of linguistic information while ignoring other aspects, the need for multiple target hypotheses becomes apparent. Using the German learner corpus Falko as an example, we therefore argue for a flexible multi-layer stand-off corpus architecture where competing target hypotheses can be coded in parallel. Surface differences between the learner text and the target hypotheses can then be exploited for automatic error annotation.
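The last step, deriving error tags from surface differences, can be sketched as a token-level diff between the learner text and each target hypothesis. This is a minimal illustration, not the Falko implementation: the layer names `TH1` and `TH2`, the tag labels `CHA`, `INS` and `DEL`, the helper `diff_tags`, and the example sentences are all assumptions made for this sketch. Alignment is delegated to Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher

# Hypothetical tag labels for this sketch: they describe the edit that
# turns the learner text into the target hypothesis.
TAG_NAMES = {"replace": "CHA", "insert": "INS", "delete": "DEL"}


def diff_tags(learner, target):
    """Return (tag, learner_tokens, target_tokens) for each surface difference."""
    matcher = SequenceMatcher(a=learner, b=target, autojunk=False)
    edits = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            continue  # identical stretches carry no error annotation
        edits.append((TAG_NAMES[op], learner[i1:i2], target[j1:j2]))
    return edits


# Invented learner utterance, tokenized on whitespace.
learner_text = "Ich habe gestern Kino gegangen".split()

# Competing target hypotheses stored in parallel, one layer each.
# The learner text itself is never modified (stand-off principle).
hypotheses = {
    "TH1": "Ich bin gestern ins Kino gegangen".split(),
    "TH2": "Ich war gestern im Kino".split(),
}

# One automatically derived annotation layer per target hypothesis.
annotations = {name: diff_tags(learner_text, th) for name, th in hypotheses.items()}

for name, edits in annotations.items():
    print(name, edits)
```

Because each hypothesis lives on its own layer, the same learner token can receive different tags depending on which reconstruction it is compared against, which is the point of coding the hypotheses in parallel rather than committing to one.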