From early to future learner corpus research
The aim of this article is to survey the field of learner corpus research (LCR) from its origins to the present day and
to provide some future perspectives. It compares first-generation LCR, which extends from the late 1980s to 2000, with
second-generation LCR, which covers the period from the early 2000s until today, on key aspects of the field: learner corpus
design and collection, learner corpus methodology, statistical analysis, research focus and links with related fields, in
particular second language acquisition (SLA), foreign language teaching (FLT) and natural language processing (NLP).
The survey shows that the field has undergone major theoretical and methodological changes and considerably
extended its range of applications. Future developments that are likely to gain ground are grouped into three categories:
increased diversity, increased interdisciplinarity and increased automation.
Article outline
- 1. Introduction
- 2. First-generation LCR
- 2.1 Learner corpus design and collection
- 2.2 Learner corpus methodology
- 2.2.1 Two main methodological approaches
- 2.2.2 Learner corpus annotation
- 2.2.3 Statistical analysis
- 2.3 Research focus
- 2.4 Links with SLA and FLT
- 3. Second-generation LCR
- 3.1 Learner corpus collection
- 3.2 Learner corpus design
- 3.3 Learner corpus methodology
- 3.3.1 Two main methodological approaches
- 3.3.2 Learner corpus annotation
- 3.3.3 Statistical analysis
- 3.4 Research focus
- 3.5 Links with SLA, FLT and NLP
- 3.5.1 SLA
- 3.5.2 FLT
- 3.5.3 Natural language processing
- 4. Future LCR
- 4.1 Increased diversity
- 4.2 Increased interdisciplinarity
- 4.3 Increased automation
- 5. Conclusion
- Notes
- References