This paper introduces a new corpus resource for language learning research, the Trinity Lancaster Corpus (TLC),
which contains 4.2 million words of interaction between L1 and L2 speakers of English. The corpus includes spoken production from
over 2,000 L2 speakers from different linguistic and cultural backgrounds at different levels of proficiency engaged in two to
four tasks. The paper provides a description of the TLC and places it in the context of current learner corpus development and
research. The discussion of practical decisions taken in the construction of the TLC also enables a critical reflection on current
methodological issues in corpus construction.