Article In:
International Journal of Learner Corpus Research: Online-First Articles
From early to future learner corpus research
The aim of this article is to survey the field of learner corpus research from its origins to the present day and
to provide some future perspectives. Key aspects of the field — learner corpus design and collection, learner corpus methodology,
statistical analysis, research focus and links with related fields, in particular SLA, FLT and NLP — are compared in
first-generation LCR, which extends from the early 1980s to 2000, and second-generation LCR, which covers the period from the
early 2000s until today. The survey shows that the field has undergone major theoretical and methodological changes and
considerably extended its range of applications. Future developments that are likely to gain ground are grouped into three
categories: increased diversity, increased interdisciplinarity and increased automation.
Keywords: learner corpus research, second language acquisition, foreign language teaching, natural language processing
Article outline
- 1. Introduction
- 2. First-generation LCR
- 2.1 Learner corpus design and collection
- 2.2 Learner corpus methodology
- 2.2.1 Two main methodological approaches
- 2.2.2 Learner corpus annotation
- 2.2.3 Statistical analysis
- 2.3 Research focus
- 2.4 Links with SLA and FLT
- 3. Second-generation LCR
- 3.1 Learner corpus collection
- 3.2 Learner corpus design
- 3.3 Learner corpus methodology
- 3.3.1 Two main methodological approaches
- 3.3.2 Learner corpus annotation
- 3.3.3 Statistical analysis
- 3.4 Research focus
- 3.5 Links with SLA, FLT and NLP
- 3.5.1 SLA
- 3.5.2 FLT
- 3.5.3 Natural language processing
- 4. Future LCR
- 4.1 Increased diversity
- 4.2 Increased interdisciplinarity
- 4.3 Increased automation
- 5. Conclusion
- Notes
- Author queries
- References
This content is being prepared for publication; it may be subject to changes.
References (197)
Aarts, J., & Granger, S. (1998). Tag
sequences in learner corpora: A key to interlanguage grammar and
discourse. In: S. Granger (Ed.), Learner
English on
computer (pp. 132–141). Addison Wesley Longman.
Ädel, A. (2008). Involvement
features in writing: Do time and interaction trump register
awareness? In G. Gilquin, S. Papp, & M. B. Díez-Bedmar (Eds.), Linking
up contrastive and Learner Corpus
Research (pp. 35–53). Rodopi.
Akbaş, E., & Dinçer, Z. O. (2021). Accuracy
order in L2 grammatical morphemes: Corpus evidence from different proficiency levels of Turkish learners of
English. Studies in Second Language Learning and
Teaching,
11
(4), 607–627.
Alfaifi, A., Atwell, E., & Abuhakema, G. (2013). Error
annotation of the Arabic Learner Corpus: A new error
tagset. In I. Gurevych, C. Biemann, & T. Zesch (Eds.), Language
Processing and Knowledge in the Web. Lecture Notes in Computer
Science, vol 81051. Springer.
Altenberg, B., & Tapper, M. (1998). The
use of adverbial connectors in advanced Swedish learners’ written
English. In S. Granger (Ed.), Learner
English on
computer (pp. 80–93). Addison Wesley Longman.
André, V., Boulton, A., Ciekanski, M., & Cousinard, C. (2024). Learning
to interact from conversational narratives: New perspectives for a data-driven approach integrating learner
data. In S. Götz & S. Granger (Eds.), Learner
Corpus Research for Pedagogical Purposes. Special issue of the International
Journal of Learner Corpus
Research,
10
(1), 67–106.
Axelsson, M. W., & Berglund, Y. (2002). The
Uppsala Student English Corpus (USE): A multi-faceted resource for research and course
development In L. Borin (Ed.), Parallel
corpora, parallel
worlds (pp. 79–90). Rodopi.
Ballier, N., & Martin, P. (2015). Speech
annotation of learner corpora. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The
Cambridge handbook of Learner Corpus
Research (pp. 107–134). Cambridge University Press.
Bestgen, Y. (2014). Inadequacy
of the chi-squared test to examine vocabulary differences between corpora. Literary and
Linguistic
Computing,
29
(2), 164–170.
Bestgen, Y., & Granger, S. (2011). Categorizing
spelling errors to assess L2 writing. International Journal of Continuing Engineering Education
and Life-Long
Learning,
21
(2/3), 235–252.
Biber, D., & Reppen, R. (1998). Comparing
native and learner perspectives on English grammar: a study of complement
clauses. In S. Granger (Ed.), Learner
English on
computer (pp. 145–158). Addison Wesley Longman.
Blázquez-Carratero, M. (2023). Building
a pedagogic spellchecker for L2 learners of
Spanish. ReCALL,
35
(3), 321–338.
Bley-Vroman, R. (1983). The
comparative fallacy in interlanguage studies: The case of systematicity. Language
Learning,
33
1, 1–17.
Borin, L., & Prütz, K. (2004). New
wine in old skins? A corpus investigation of L1 syntactic transfer in learner
language. In G. Aston, S. Bernardini, & D. Stewart (Eds.), Corpora
and language
learners (pp. 67–87). Benjamins.
Boulton, A., & Vyatkina, N. (2021). Thirty
years of data-driven learning: Taking stock and charting new directions over time. Language
Learning &
Technology,
25
(3), 66–89.
Boyd, A., Hana, J., Nicolas, L., Meurers, D., Wisniewski, K., Abel, A., Schone, K., Stindlov, B., & Vettori, C. (2014). The
MERLIN corpus: Learner language and the CEFR. Proceedings of the Ninth International Conference
on Language Resources and Evaluation (LREC’14). Reykjavik, Iceland. [URL]
Brunni, S., Lehto, M.-M., Jantunen, J. H., & Airaksinen, V. (2015). How
to annotate morphologically rich learner language. Principles, problems and solutions. Bergen
Language and Linguistic
Studies (BeLLS),
6
1, 133–152.
Caines, A., Nicholls, D., & Buttery, P. (2017). Annotating
errors and disfluencies in transcriptions of speech. Technical Report
915. University of Cambridge Computer Laboratory. [URL]
Caines, A., & Buttery, P. (2019). The
effect of task and topic on opportunity of use in learner
corpora. In V. Brezina & L. Flowerdew (Eds.), Learner
Corpus Research: New perspectives and
applications (pp. 5–27). Bloomsbury.
Carter, R., & McCarthy, M. (2006). Cambridge
grammar of English: A comprehensive guide. Cambridge University Press.
Castello, E., Ackerley, K., & Coccetta, F. (Eds.). (2016). Studies
in Learner Corpus Linguistics: Research and applications for foreign language teaching and
assessment. Peter Lang.
Chi, M. A., Wong, P. K., & Wong, C. M. (1994). Collocational
problems among ESL learners: A corpus-based study. In L. Flowerdew L. & A. K. Tong (Eds.), Proceedings
of the seminar on corpus linguistics and
lexicology (pp. 157–165). Hong Kong: University of Science and Technology.
Chuang, F.-Y., & Nesi, H. (2006). An
analysis of formal errors in a corpus of L2 English produced by Chinese
students. Corpora,
1
(2), 251–271.
Cogo, A., & Dewey, M. (2012). Analysing
English as a lingua franca: A corpus-driven
investigation. Continuum.
Council of Europe. (2001). Common European
Framework of Reference for Languages: learning, teaching and assessment. Cambridge University Press.
Cowan, R., Choi, H. E., & Kim, D. H. (2003). Four
questions for error diagnosis and correction in CALL. CALICO
Journal,
20
(3), 451–463.
Cowan, R., Choo, J., & Lee, G. S. (2014). ICALL
for improving Korean L2 writers’ ability to edit grammatical errors. Language Learning and
Technology,
18
(3), 193–207.
Crosthwaite, P. (2013). An
error analysis of L2 English discourse reference through learner corpora analysis. Linguistic
Research,
30
(2), 163–193.
(2019). Definite
article bridging relations in L2: A learner corpus study. Corpus Linguistics and Linguistic
Theory,
15
(2), 297–319.
Dagneaux, E., Denness, S., Granger, S., & Meunier, F. (1996). Error
tagging manual. Version 1.1. Louvain-la-Neuve: Centre for English Corpus Linguistics. University of Louvain.
Dagneaux, E., Denness, S., & Granger, S. (1998). Computer-aided
error
analysis. System,
26
(2), 163–174.
De Cock, S., Granger, S., Leech, G., & McEnery, T. (1998). An
automated approach to the phrasicon of EFL learners. In S. Granger (Ed.), Learner
English on
computer (pp. 67–79). Addison Wesley Longman.
de Haan, P. (1997). An
experiment in English learner data analysis. In J. Aarts, I. de Mönnink, & H. Wekker (Eds.), Studies
in English language and
teaching (pp. 215–229). Rodopi.
Díaz-Negrillo, A., & Fernández-Domínguez, J. (2006). Error
tagging systems for learner corpora. Revista Española de Lingüística
Aplicada,
19
1, 83–102.
Díez-Bedmar, M. B. (2018). Fine-tuning
descriptors for CEFR B1 level: Insights from learner corpora. ELT
Journal,
72
(2), 199–209.
Díez-Bedmar, M. B., & Pérez-Paredes, P. (2012). The
types and effects of peer native speakers’ feedback on CMC. Language Learning &
Technology,
16
(1), 62–90.
Domínguez, L., Tracy-Ventura, N., Arche, M. J., Mitchell, R., & Myles, R. (2013). The
role of dynamic contrasts in the L2 acquisition of Spanish past tense morphology. Bilingualism:
Language and
Cognition,
16
(3), 20131, 558–577.
Doughty, C. J., & Long, M. H. (2003). The
scope of inquiry and goals of SLA. In C. J. Doughty & M. H. Long (Eds.), The
handbook of Second Language
Acquisition (pp. 3–16). Blackwell.
Dressen-Hammouda, D. (2013). Politeness
strategies in the job application letter: Implications of Intercultural Rhetoric for designing writing
feedback. Asp,
64
1, 139–159.
Durrant, P., & Schmitt, N. (2009). To
what extent do native and non-native writers make use of collocations? IRAL — International
Review of Applied Linguistics in Language
Teaching, 47(2), 157–177.
Ebeling, S. O., & Hasselgård, H. (2021). The
functions of n-grams in bilingual and learner corpora: An integrated contrastive
approach. In S. Granger (Ed.), Perspectives
on the L2 Phrasicon: The view from learner
corpora (pp. 25–49). Multilingual Matters.
Ferraresi, A. (2024). Learner
corpora in the era of ChatGPT. Building a corpus of Italian EFL learners’ interactions with
chatbots. Paper presented
at TALC 2024, July
7–10. Manchester.
Field, Y., & Yip, L. (1992). A
comparison of internal conjunctive cohesion in the English essay writing of Cantonese speakers and native speakers of
English. RELC
Journal,
23
1, 15–28.
Flowerdew, L. (1997). Interpersonal
strategies: Investigating interlanguage corpora. RELC
Journal,
28
(1), 72–88.
Fuchs, R., & Werner, V. (2018). Tense
and aspect in Second Language Acquisition and learner corpus research. Introduction to the special
issue. International Journal of Learner Corpus
Research,
4
(2), 143–163.
Fuyuno, M., Komiya, R., & Saitoh, T. (2018). Multimodal
analysis of public speaking performance by EFL learners: Applying deep learning to understanding how successful speakers use
facial movement. The Asian Journal of Applied
Linguistics,
5
(1), 117–129.
Gablasova, D., Brezina, V., McEnery, T., & Boyd, E. (2017). Epistemic
stance in spoken L2 English: The effect of task and speaker style. Applied
Linguistics,
38
(5), 613–637.
Gablasova, D., Brezina, V., & McEnery, T. (2019). The
Trinity Lancaster Corpus: Development, description and application. International Journal of
Learner Corpus
Research,
5
(2), 126–158.
Gaillat, T., Simpkin, A., Ballier, N., Stearns, B., Sousa, A., et al. (2021). Predicting
CEFR levels in learners of English: The use of microsystem criterial features in a machine
learning. ReCALL,
34
(2), 130–146.
Geertzen, J., Alexopoulou, T., & Korhonen, A. (2014). Automatic
linguistic annotation of large scale L2 databases: The EF-Cambridge open language database
(EFCamDat). Selected Proceedings of the 2012 Second Language Research
Forum (pp. 240–254). Somerville, MA. [URL]
Gillard, P., & Gadsby, A. (1998). Using
a learners’ corpus in compiling ELT. In S. Granger (Ed.), Learner
English on
computer (pp. 159–171). Addison Wesley Longman.
Gilquin, G. (2000). The
integrated contrastive model: Spicing up your data. Languages in
Contrast,
3
(1), 95–123.
(2007). To
err is not all. What corpus and elicitation can reveal about the use of collocations by
learners. Zeitschrift für Anglistik und
Amerikanistik,
55
(3), 273–291.
(2021). Combining
learner corpora and experimental methods. In N. Tracy-Ventura & M. Paquot (Eds.), The
Routledge handbook of Second Language Acquisition and
corpora (pp. 133–144). Routledge.
(2022). The
Process Corpus of English in education: Going beyond the written
text. Research in Corpus
Linguistics,
10
(1), 31–44.
(2024). Lexical
use in spoken New Englishes and learner Englishes: The effects of shared and distinct communicative
constraints. In B. van Rooy & H. Kotze (Eds.), Constraints
on language variation and change in complex multilingual contact
settings (pp. 120–152). Benjamins.
(forthcoming). Second
and foreign language learners: The effect of language exposure on the use of English phrasal
verbs. International Journal of Bilingualism.
Gilquin, G., De Cock, S., & Granger, S. (2010). Louvain
International Database of Spoken English Interlanguage. Handbook and CD-ROM. Presses universitaires de Louvain.
Gilquin, G., & Granger, S. (2021). The
passive and the lexis-grammar interface: An inter-varietal
perspective. In S. Granger (Ed.), Perspectives
on the L2 phrasicon: The view from learner
corpora (pp. 72–98). Multilingual Matters.
Gilquin, G., & Laporte, S. (2021). The
use of online writing tools by learners of English: Evidence from a process
corpus. International Journal of
Lexicography,
34
(4), 472–492.
Gilquin, G., & Meriläinen, L. (2024). Constrained
communication in EFL and ESL: The case of embedded inversion. English
World-Wide,
45
(2), 196–223.
Glaznieks, A., Frey, J., Stopfner, M., Zanasi, L., & Nicolas, L. (2022). Leonide:
A longitudinal trilingual corpus of young learners of Italian, German and
English. International Journal of Learner Corpus
Research,
8
(1), 97–120.
Götz, S. (2019). Filled
pauses across proficiency levels, L1s and learning context variables. A multivariate exploration of the Trinity
Lancaster Corpus Sample
. International Journal of Learner Corpus
Research,
5
(2), 159–180.
Götz, S., & Granger, S. (2024). Introduction:
Learner corpus research for pedagogical purposes: An overview and some research
perspectives. In S. Götz & S. Granger (Eds.), Learner
corpus research for pedagogical purposes. Special issue of the International
Journal of Learner Corpus
Research,
10
(1), 1–38.
Götz, S., & Mukherjee, J. (2019). Investigating
the effect of the study abroad variable on learner output: A pseudo-longitudinal study on spoken German learner
English. In V. Brezina & L. Flowerdew (Eds.), Learner
Corpus Research: New perspectives and
applications (pp. 47–65). Bloomsbury.
Granger, S. (1993). The
International Corpus of Learner
English
. In J. Aarts, P. de Haan, & N. Oostdijk (Eds.), English
language corpora: Design, analysis and
exploitation (pp. 57–69). Rodopi.
(1996). From
CA to CIA and back: an integrated contrastive approach to computerized bilingual and learner
corpora. In K. Aijmer, B. Altenberg, & M. Johansson (Eds.), Languages
in contrast. Text-based cross-linguistic
studies (pp. 37–51). Lund University Press.
(1997). Automated
retrieval of passives from native and learner corpora: precision and recall. Journal of English
Linguistics,
25
(4), 365–374.
(1999). Use
of tenses by advanced EFL learners?: Evidence from an error-tagged computer
corpus. In H. Hasselgård & S. Oksefjell (Eds.), Out
of corpora. Studies in honour of Stig
Johansson (pp. 191–202). Rodopi.
(2015). Contrastive
interlanguage analysis: A reappraisal. International Journal of Learner Corpus
Research,
1
(1), 7–24.
(2017). Learner
corpora in foreign language education. In S. Thorne & S. May (Eds.), Language
and technology. Encyclopedia of language and education. 3rd
edition. (pp. 427–440). Springer.
(2021). Have
Learner Corpus Research and Second Language Acquisition finally
met? In B. Le Bruyn & M. Paquot (Eds.), Learner
Corpus Research meets Second Language
Acquisition (pp. 243–257). Cambridge University Press.
Granger, S., & Bestgen, Y. (2014). The
use of collocations by intermediate vs. advanced non-native writers: A bigram-based
study. IRAL,
52
(3), 229–252.
Granger, S., Cassart, A., Dagneaux, E., Husquet, C., Verhulst, N., & Watrin, P. (2002). Error
tagging manual for L2 French. CECL Papers. Centre for English Corpus Linguistics: Université catholique de Louvain [URL]
Granger, S., & Lefer, M.-A. (2023). Learner
translation corpora: Bridging the gap between learner corpus research and corpus-based translation
studies. In S. Granger & M.-A. Lefer (Eds.) Learner
translation corpora. Special issue of the International Journal of Learner
Corpus
Research,
9
(1), 1–28.
Granger, S., & Paquot, M. (2015). Electronic
lexicography goes local: Design and structures of a needs-driven online academic writing
aid. Lexicographica,
31
(1), 118–141.
(2022). The
Louvain English for Academic Purposes Dictionary: User Manual. CECL Papers
5. Louvain-la-Neuve: Centre for English Corpus Linguistics/Université catholique de Louvain. [URL]
(forthcoming). Learner
corpora of Language for Specific Purposes. In C. A. Chapelle (Ed.) Encyclopedia
of Applied Linguistics. 2nd Edition. Wiley Blackwell.
Granger, S. & Rayson, P. (1998). Automatic
lexical profiling of learner texts. In S. Granger (Ed.) Learner
English on
computer (pp. 119–131). Addison Wesley Longman.
Granger, S., Swallow, H., & Thewissen, J. (2022). The
Louvain error tagging manual. Version 2.0. CECL Papers
4. Louvain-la-Neuve: Centre for English Corpus Linguistics/Université catholique de Louvain. [URL]
(2023). The
UCLouvain error editor user guide — Version 2.0. CECL Papers
6. Louvain-la-Neuve: Centre for English Corpus Linguistics/Université catholique de Louvain. [URL]
Granger, S., & Tribble, C. (1998). Learner
corpus data in the foreign language classroom: form-focused instruction and data-driven
learning. In S. Granger (Ed.), Learner
English on
computer (pp. 199–209). Addison Wesley Longman.
Granger, S., & Tyson, S. (1996). Connector
usage in the English essay writing of native and non-native EFL speakers of English. World
Englishes,
15
(1), 17–27.
Gries, S. Th. (2006). Some proposals towards a more
rigorous corpus
linguistics. ZAA,
54
(2), 191–202.
(2008). Corpus-based methods in
analyses of SLA data. In P. Robinson & N. C. Ellis (Eds.), Handbook
of Cognitive Linguistics and Second Language
Acquisition (pp. 406–431). Routledge.
(2015). Statistics for learner corpus
research. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The
Cambridge handbook of Learner Corpus
Research (pp. 159–181). Cambridge University Press.
(2022). MuPDAR for corpus-based
learner and variety studies: Two (more) suggestions for
improvement. In S. Flach & M. Hilpert (Eds.), Broadening
the spectrum of corpus linguistics: New approaches to variability and
change (pp. 257–283). Benjamins.
Gyllstad, H., & Snoder, P. (2021). Exploring
learner corpus data for language testing and assessment
purposes. In S. Granger (Ed.) Perspectives
on the L2 phrasicon: The view from learner
corpora (pp. 49–71). Multilingual Matters.
Han, J., Yoo, H., Myung, J., Kim, M., Lee, T. Y., Ahn, S.-Y., & Oh, A. (2024). RECIPE4U:
Student-ChatGPT interaction dataset in EFL writing education. Proceedings of the 2024 Joint
International Conference on Computational Linguistics, Language Resources and
Evaluation (pp. 13666–13676). Torino, Italia. [URL]
Higgins, D., Ramineni, C., & Zechner, K. (2015). Learner
corpora and automated scoring. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The
Cambridge handbook of Learner Corpus
Research (pp. 587–604). Cambridge University Press.
Hasselgård, H. (1999). Review
of Learner English on computer
(S. Granger Ed.). ICAME
Journal,
23
1, 148–152.
Horváth, J. (2001). Advanced
writing in English as a foreign language: A corpus-based study of processes and
products. Lingua Franca Csoport.
Housen, A. (2002). A
corpus-based study of the L2 acquisition of the English verb
system. In S. Granger, S., J. Hung J., & S. Petch-Tyson (Eds.) Computer
Learner Corpora, Second Language Acquisition and Foreign Language
Learning (pp. 77–116). Benjamins.
Howarth, P. A. (1996). Phraseology
in English academic writing: Some implications for language learning and dictionary
making. Lexicographica Series Maior 75. Max Niemeyer.
Hsieh, W.-M., & Liou, H.-C. (2008). A
case study of corpus-informed online academic writing for EFL graduate students. CALICO
Journal,
26
(1), 28–47.
Huang, Y., Murakami, A., Theodora Alexopoulou, T., & Korhoneni, A. (2018). Dependency
parsing of learner English. International Journal of Corpus
Linguistics,
23
(1), 28–54.
Hyland, K., & Milton, J. (1997). Qualification
and certainty in Ll and L2. Journal of Second Language
Writing,
6
(2), 183–205.
Ionin, T., & Díez-Bedmar, M. B. (2021). Article
use in Russian and Spanish learner writing at CEFR B1 and B2 Levels: Effects of proficiency, native language, and
specificity. In B. Le Bruyn & M. Paquot (Eds.), Learner
Corpus Research meets Second Language
Acquisition (pp. 243–257). Cambridge University Press.
Ishikawa, S. (2023). The
ICNALE Guide. An Introduction to a learner corpus study on Asian learners’ L2
English. Routledge.
Ivaska, I., Ferraresi, A., & Bernardini, S. (2022). Syntactic
properties of constrained English: A corpus-driven approach. In S. Granger & M.-A. Lefer (Eds.), Extending
the scope of corpus-based translation
studies (pp. 133–157). Bloomsbury.
Ivaska, I., Bernardini, S., & Ferraresi, A. (2024). The
complex case of constrained communication: A corpus-driven, multilingual and multi-register search for the common ground
between non-native and translated language. In B. van Rooy & H. Kotze (Eds.), Constraints
on language variation and change in complex multilingual contact
settings (pp.191–222). Benjamins.
Jadoulle, P. (2024). Investigating
noviceness and non-nativeness in academic writing: A cross-linguistic approach to
stance. Unpublished doctoral dissertation. University of Louvain: Louvain-la-Neuve.
Jung, Y., Gablasova, D., Brezina, V., & Schmück, H. (2024). Developing
a coding scheme for annotating opinion statements in L2 interactive spoken English with application for language teaching and
assessment. Research in Corpus
Linguistics,
12
(2), 146–173.
Källkvist, M. (1995). Lexical
errors among verbs? A pilot study of the vocabulary of advanced Swedish learners of
English. Working Papers in English and Applied
Linguistics (pp. 103–115). Lund University [URL]
(1999). Form-class
and task-type effects in learner English: A Study of advanced Swedish learners. Lund University Press.
Kaszubski, P. (1997). Polish
student writers — Can corpora help them? In B. Lewandowska-Tomaszczyk & P. J. Melia (Eds.) PALC’97
— Practical applications in language
corpora (pp. 133–158). Lódź University Press.
(1998). Enhancing
a writing textbook: A national perspective. In S. Granger (Ed.) Learner
English on
computer (pp. 172–185). Addison Wesley Longman.
Kawecki, R. (2013). A
beginner French learner corpus. In S. Granger, G. Gilquin, & F. Meunier (Eds.) Twenty
years of Learner Corpus Research: Looking back, moving
ahead (pp. 247–261). Presses universitaires de Louvain.
Kyle, K. (Ed.) (2021). Natural
language processing for learner corpus research. Special issue of
the International Journal of Learner Corpus
Research,
7
(1).
Kyle, K., & Eguchi, M. (2023). Assessing
spoken lexical and lexicogrammatical proficiency using features of word, bigram, and dependency bigram
use. The Modern Language
Journal,
107
(2), 531–564.
Lado, R. (1957). Linguistics
across cultures: Applied linguistics for language teachers. University of Michigan Press.
Larsen-Freeman, D. (2014). Another
step to be taken — Rethinking the end point of the interlanguage
continuum. In Z. Han & E. Tarone (Eds.) Interlanguage.
Forty years
later (pp. 203–220). Benjamins.
Larsson, T., Egbert, J., & Biber, D. (2022a). On
the status of statistical reporting versus linguistic description in corpus linguistics: A ten-year
perspective. Corpora,
17
(1), 137–157.
Larsson, T., Reppen, R., & Dixon, T. (2022b). A
phraseological study of highlighting strategies in novice and expert writing. Journal of
English for Academic
Purposes,
60
1, 101179.
Laufer, B., & Nation, P. (1995). Vocabulary
size and use: Lexical richness in L2 written production. Applied
Linguistics,
16
1, 307–322.
Leacock, C., Chodorow, M., & Tetreault, J. (2015). Automatic
grammar — and spell-checking for language learners. In S. Granger, G. Gilquin, & F. Meunier (Eds.) The
Cambridge handbook of Learner Corpus
Research (pp. 567–586). Cambridge University Press.
Lee, L.-H., Chang, L.-P., & Tseng, Y.-H. (2016). Developing
learner corpus annotation for Chinese grammatical errors.
2016 International
Conference on Asian Language Processing (IALP), Tainan,
Taiwan (pp. 254–257). [URL].
Leech, G. (1998). Preface:
Learner corpora: what they are and what can be done with
them. In: S. Granger (Ed.) Learner
English on
computer (pp. xiv–xx). Pearson.
Leńko-Szymańska, A., & Biel, L. (2023). Terminological
collocations in trainee and professional legal translations. A learner-corpus study of L2 company law
translations. International Journal of Learner Corpus
Research,
9
(1), 29–59.
Leńko-Szymańska, A., & Götz, S. (Eds.) (2022). Complexity,
accuracy and fluency in Learner Corpus Research. Benjamins.
Lessard, G. (1999). Review
of learner English on computer (Granger Ed., 1998). Computational
Linguistics,
25
(2), 302–303.
Li, Q., Tarp, S., Nomdedeu-Rull, A. (2024). The
necessary symbiosis: How ChatGPT co-authored a new type of learner’s grammar to be displayed in a digital writing
assistant. [Manuscript submitted for publication].
Lim, J., Mark, G., Pérez-Paredes, P., & O’Keeffe, A. (2024). Exploring
part of speech (POS)-tag sequences in a large-scale learner corpus of L2 English: A developmental
perspective. Corpora,
19
(1), 31–59.
Lorenz, G. (1998). Overstatement
in advanced learners’ writing: Stylistic aspects of adjective
intensification. In S. Granger (Ed.) Learner
English on
computer (pp. 53–66). Addison Wesley Longman.
Lozano, C., & Díaz-Negrillo, A. (2019). Using
learner corpus methods in L2 acquisition research. The morpheme order studies revisited with Interlanguage
Annotation. Revísta Española de Lingüistica
Aplicada,
32
1, 82–124.
Lüdeling, A., & Hirschmann, H. (2015). Error
annotation systems. In S. Granger, G. Gilquin, & F. Meunier (Eds.) The
Cambridge handbook of Learner Corpus
Research (pp. 135–157). Cambridge University Press.
Lüdeling, A., M. Walter, E. Kroymann, & P. Adolphs. (2005). Multi-level
error annotation in learner corpora. Proceedings from the Corpus Linguistics Conference
Series, Vol. 1, no. 1. [URL]
Ma, Q., Crosthwaite, P., Sun, D., & Zou, D. (2024). Exploring
ChatGPT literacy in language education: A global perspective and comprehensive
approach. Computers and education:Artificial intelligence.
Marchand, T., & Akutsu, S. (2015). First
steps in assigning proficiency to texts in a learner corpus of computer-mediated
communication. In M. Callies & S. Götz (Eds.) Learner
corpora in language testing and
assessment (pp. 85–112). Benjamins.
Marti, L., Yilmaz, S., & Bayyurt, Y. (2019). Reporting
research in applied linguistics: The role of nativeness and expertise. Journal of English for
Academic
Purposes,
40
1, 98–114.
Meunier, F. (2016). Introduction
to the LONGDALE Project. In E. Castello, K. Ackerley, & F. Coccetta (Eds.) Studies
in learner corpus linguistics: Research and applications for foreign language teaching and
assessment (pp. 123–126). Peter Lang.
Milton, J. (1998). Exploiting
L1 and interlanguage corpora in the design of an electronic language learning and production
environment. In S. Granger (Ed.) Learner
English on
computer (pp. 186–198). Addison Wesley Longman.
Milton, J., & Chowdhury, N. (1994). Tagging
the interlanguage of Chinese learners of English. In L. Flowerdew & A. K. K. Tong (Eds.) Entering
text (pp. 127–143). The Hong Kong University of Science and Technology.
Milton, J., & Tsang, E. S. C. (1993). A
corpus-based study of logical connectors in EFL students’ writing: Directions for future
research. In R. Pemberton & E. S. C. Tsang (Eds.) Studies
in
Lexis (pp. 215–246). The Hong Kong University of Science and Technology.
Möller, V. (2017). A
statistical analysis of learner corpus data, experimental data and individual differences: Monofactorial vs. multifactorial
approaches. In P. de Haan, R. de Vries, & S. van Vuuren (Eds.) Language,
learners and levels: Progression and
variation (pp. 409–439). Benjamins.
Murakami, A. (2013). Cross-linguistic
influence on the accuracy order of L2 English grammatical
morphemes. In S. Granger, S. Gaëtanelle, & F. Meunier (Eds.), Twenty
years of learner corpus research. Looking back, moving
ahead (pp. 325–334). Presses universitaires de Louvain.
Murakami, A., & Alexopoulou, T. (2016). L1
influence on the acquisition order of English grammatical morphemes: A learner corpus
study. Studies in Second Language
Acquisition,
38
(3), 365–401.
Mizumoto, A., & Eguchi, M. (2023). Exploring
the potential of using an AI language model for automated essay scoring. Research Methods in
Applied Linguistics,
2
1, 100050.
Myles, F. (2005). Interlanguage
corpora and second language acquisition research. Second Language
Research,
21
(4), 373–391.
Nagata, R., Whittaker, E., & Sheinman, V. (2011). Creating
a manually error-tagged and shallow-parsed learner corpus. Proceedings of the 49th annual
meeting of the association for computational
linguistics (pp. 1210–1219). Portland.
Neumanová, Z. (2023). Investigating
L2 English preposition use by Czech university students: A learner corpus study. Ostrava
Journal of English
Philology,
15
(1), 93–119.
Nicholas, A., Blake, J., Mozgovoy, M., & Perkins, J. (2023). Investigating
pragmatic failure in L2 English email writing among Japanese university EFL learners. A learner corpus
approach. Register
Studies,
5
(1), 23–51.
O’Donnell, M. (2008). The
UAM Corpus Tool: Software for corpus annotation and
exploration. In Proceedings of the XXVI Congreso de
AESLA (pp. 3–5), Almeria, Spain.
Pan, Z. (2024). The
use of semi-automatic annotation in speech acts performed by learners of English. World Journal
of English
Language,
14
(6), 1–12.
Paquot, M. (2024). Learner
corpus research: a critical appraisal and roadmap for contributing (more) to SLA research
agendas. Corpus Linguistics and Linguistic
Theory, aop.
Paquot, M., & Plonsky, L. (2017). Quantitative
research methods and study quality in learner corpus research. International Journal of Learner
Corpus
Research,
3
(1): 61–94.
Petch-Tyson, S. (1998). Writer/reader
visibility in EFL written discourse. In S. Granger (Ed.) Learner
English on
computer (pp. 107–118). Addison Wesley Longman.
Picoral, A., Staples, S., & Reppen, R. (2021). Automated
annotation of learner English: An evaluation of software tools. International Journal of
Learner Corpus
Research,
7
(1), 17–52.
Pilar Valverde Ibañez, M., & Ohtani, A. (2014). Annotating
article errors in Spanish learner texts: Design and evaluation of an annotation
scheme. Proceedings of the 28th Pacific Asia Conference on Language, Information and
Computation (pp. 234–243). Phuket, Thailand. [URL]
Rakhilina, E., Vyrenkova, A., Mustakimova, E., Alina Ladygina, A., & Smirnov, I. (2016). Building
a learner corpus for Russian. In Proceedings of the joint workshop on
NLP for Computer Assisted Language Learning and NLP for Language Acquisition. Umeå, Sweden. [URL]
Rautionaho, P., & Deshors, S. C. (2018). Progressive
or not progressive? Modeling constructional choices in EFL and ESL. International Journal of
Learner Corpus
Research,
4
(2), 225–252.
Rebuschat, P., Meurers, D., & McEnery, T. (2017). Language
learning research at the intersection of experimental, computational, and corpus-based
approaches. Language
Learning,
67
: S1, 6–13.
Ringbom, H. (1998). Vocabulary
frequencies in advanced learner English: A cross-linguistic
approach. In S. Granger (Ed.) Learner
English on
computer (pp. 41–52). Addison Wesley Longman.
Römer, U. (2009). English
in academia: Does nativeness matter? Anglistik: International Journal of English
Studies,
20
(2), 89–100.
Rosen, A. (2016). Building and using corpora of non-native Czech. ITAT 2016 Proceedings, CEUR Workshop Proceedings, 1649, 80–87.
Rosen, A., Hana, J., Štindlová, B., & Feldman, A. (2014). Evaluating and automating the annotation of a learner corpus. Language Resources & Evaluation, 48, 65–92.
Rundell, M. (2009). The future has arrived: A new era in electronic dictionaries. MED Magazine, 54. [URL]
Rundell, M., & Granger, S. (2007). From corpora to confidence. English Teaching Professional, 50, 15–18.
Sarte, K. M., & Gnevsheva, K. (2022). Noun phrasal complexity in ESL written essays under a constructed-response task: Examining proficiency and topic effects. Assessing Writing, 51, 100595.
Shaw, S. (1997). The use of language corpora in the compilation of the Longman Dictionary of Contemporary English (third edition). In B. Lewandowska-Tomaszczyk & P. J. Melia (Eds.), PALC’97: Practical applications in language corpora (pp. 269–275). Łódź University Press.
Štindlová, B., Škodová, S., Rosen, A., & Hana, J. (2013). A learner corpus of Czech: Current state and future directions. In S. Granger, G. Gilquin, & F. Meunier (Eds.), Twenty years of Learner Corpus Research: Looking back, moving ahead (pp. 435–446). Presses universitaires de Louvain.
Tan, M. (2005). Authentic language or language errors? Lessons from a learner corpus. ELT Journal, 59(2), 126–134.
Tao, Y., Agrawal, A., Dombi, J., Sydorenko, T., & Lee, J. I. (2024). ChatGPT role-play dataset: Analysis of user motives and model naturalness. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (pp. 3133–3145). Torino, Italia. [URL]
Tarp, S., Fisker, K., & Sepstrup, P. (2017). L2 writing assistants and context-aware dictionaries: New challenges to lexicography. Lexikos, 27, 494–521.
Tenfjord, K. (2004). ASK: A computer learner corpus. In P. J. Henriksen (Ed.), CALL for the Nordic languages. Tools and methods for computer assisted language learning. Copenhagen Studies in Language, 30, 147–158.
Tenfjord, K., Meurer, P., & Hofland, K. (2006). The ASK Corpus: A language learner corpus of Norwegian as a second language. In N. Calzolari, K. Choukri, A. Gangemi, B. Maegaard, J. Mariani, J. Odijk, & D. Tapias (Eds.), Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC06) (pp. 1821–1824).
Tarp, S., & Nomdedeu-Rull, A. (2024). Who has the last word? Lessons from using ChatGPT to develop an AI-based Spanish writing assistant. Círculo de Lingüística Aplicada a la Comunicación, 97, 309–321.
Thewissen, J. (2015). Accuracy across proficiency levels: A learner corpus approach. Presses universitaires de Louvain.
Tono, Y. (1996). Using learner corpora for L2 lexicography: Information on collocational errors for EFL learners. Lexikos, 6, 116–132.
Tono, Y. (2000). A computer learner corpus-based analysis of the acquisition order of English grammatical morphemes. In L. Burnard & T. McEnery (Eds.), Rethinking language pedagogy from a corpus perspective (pp. 123–132). Peter Lang.
Tono, Y., Kaneko, T., Isahara, H., Saiga, T., Izumi, E., Narita, M., & Kaneko, E. (2001). The Standard Speaking Test (SST) Corpus: A 1 million-word spoken corpus of Japanese learners of English and its implication for L2 lexicography. Language Facts and Perspectives, 11(2), 7–17.
Tono, Y., Fukuda, K., Takebayashi, K., & Kawamoto, N. (2024). Using ChatGPT and CEFR profile information to create learner corpora with error codings and comparable texts with different CEFR levels. Paper presented at TALC 2024, July 7–10, Manchester.
Tracy-Ventura, N., & Huensch, A. (2018). The potential of publicly shared longitudinal learner corpora in SLA research. In A. Gudmestad & A. Edmonds (Eds.), Critical reflections on data in Second Language Acquisition (pp. 149–170). Benjamins.
Tracy-Ventura, N., Mitchell, R., & McManus, K. (2016). The LANGSNAP longitudinal learner corpus: Design and use. In M. Alonso Ramos (Ed.), Spanish Learner Corpus Research: Current trends and future perspectives (pp. 117–142). Benjamins.
Tracy-Ventura, N., & Paquot, M. (Eds.). (2021). The Routledge handbook of Second Language Acquisition and Corpora. Routledge.
Turton, N. D., & Heaton, J. B. (1996). Longman dictionary of common errors (New ed.). Addison Wesley Longman.
Vajjala, S. (2018). Automated assessment of non-native learner essays: Investigating the role of linguistic features. International Journal of Artificial Intelligence in Education, 28, 79–105.
Vanderbauwhede, G. (2012). The Integrated Contrastive Model evaluated: The French and Dutch demonstrative determiner in L1 and L2. International Journal of Applied Linguistics, 22(3), 392–413.
Vandeweerd, N., Housen, A., & Paquot, M. (2023). Comparing the longitudinal development of phraseological complexity across oral and written tasks. Studies in Second Language Acquisition, 45(4), 787–811.
Vinogradova, O. (2016). The role and applications of expert error annotation in a corpus of English learner texts. Proceedings of “Dialog 2016”, 15, 740–751. [URL]
Vinogradova, O. (2019). To automated generation of test questions on the basis of error annotations in EFL essays. A time-saving tool? In S. Götz & J. Mukherjee (Eds.), Learner corpora and language teaching (pp. 29–48). Benjamins.
Virtanen, T. (1997). The progressive in NNS and NS student compositions: Evidence from the International Corpus of Learner English. In M. Ljung (Ed.), Corpus-based studies in English (pp. 299–309). Rodopi.
Virtanen, T. (1998). Direct questions in argumentative student writing. In S. Granger (Ed.), Learner English on computer (pp. 94–106). Addison Wesley Longman.
Vyatkina, N. (2013). Analyzing part-of-speech variability in a longitudinal learner corpus and a pedagogic corpus. In S. Granger, G. Gilquin, & F. Meunier (Eds.), Twenty years of Learner Corpus Research: Looking back, moving ahead (pp. 479–491). Presses universitaires de Louvain.
Wang, Q., & Yuan, Z. (2024). Assessing the efficacy of grammar error correction: A human evaluation approach in the Japanese context. arXiv:2402.18101.
Wang, W., & Zhang, J. (2023). Factors predicting human performance in error annotation for non-native speech corpus. Speech Communication, 149, 38–46.
Wang, X., Bruno, J., Molloy, H., Evanini, K., & Zechner, K. (2017). Discourse annotation of non-native spontaneous spoken responses using the rhetorical structure theory framework. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (pp. 263–268). Vancouver, Canada. Association for Computational Linguistics.
Weisser, M. (2021). Profiling learners through pragmatically and error annotated corpora. In P. Pérez-Paredes & G. Mark (Eds.), Beyond concordance lines: Corpora in language education (pp. 121–148). Benjamins.
Xia, D., Sulzer, M. A., & Pae, H. K. (2023). Phrase-frames in business emails: A contrast between learners of business English and working professionals. Text & Talk, 44(5), 693–714.
Zybert, J. (1999). Errors in foreign language learning: The case of Polish learners of English. Instytut Anglistyki Uniwersytetu Warszawskiego. [URL]