Alex Boulton | ATILF, CNRS & University of Lorraine
Data-driven learning typically involves the use of dedicated concordancers to explore linguistic corpora, which may require significant training if the technology is not to be an obstacle for teacher and learner alike. One possibility is to begin not with the corpus or concordancer, but with parallels to what ‘ordinary’ users already do. This paper compares the web to a corpus, regular search engines to concordancers, and the techniques used in web searches to data-driven learning. It also examines previous studies which exploit web searches in ways not incompatible with a DDL approach.