Applying data-driven learning to the web
Data-driven learning (DDL) typically involves the use of dedicated concordancers to explore linguistic corpora, which may require significant training if the technology is not to be an obstacle for teacher and learner alike. One possibility is to begin not with the corpus or concordancer, but with parallels to what ‘ordinary’ users already do. This paper compares the web to a corpus, regular search engines to concordancers, and the techniques used in web searches to data-driven learning. It also examines previous studies which exploit web searches in ways not incompatible with a DDL approach.