This paper introduces a new corpus resource for language learning research, the Trinity Lancaster Corpus (TLC),
which contains 4.2 million words of interaction between L1 and L2 speakers of English. The corpus includes spoken production from
over 2,000 L2 speakers, from different linguistic and cultural backgrounds and at different levels of proficiency, engaged in two to
four tasks. The paper provides a description of the TLC and places it in the context of current learner corpus development and
research. The discussion of practical decisions taken in the construction of the TLC also enables a critical reflection on current
methodological issues in corpus construction.