Article published in:
Linguistic Innovations: Rethinking linguistic creativity in non-native Englishes
Edited by Sandra C. Deshors, Sandra Götz and Samantha Laporte
[International Journal of Learner Corpus Research 2:2] 2016
pp. 177–204
Detecting innovations in a parsed corpus of learner English
Gerold Schneider | University of Konstanz & University of Zurich
Gaëtanelle Gilquin | University of Louvain & FNRS
In research on L2 English, recent corpus-based studies indicate that some non-standard forms are shared by indigenized (ESL) and foreign (EFL) varieties of English, which challenges the idea of a clear dichotomy between innovation and error. We present a data-driven large-scale method to detect innovations, test it on verb + preposition structures (including phrasal verbs) and adjective + preposition structures, and describe similarities and differences between EFL and ESL. We use a dependency-parsed version of the International Corpus of Learner English to automatically extract potential innovations, defined as patterns of overuse compared to the British National Corpus as reference corpus. We measure overuse by means of collocation measures like O/E or T-score, and compare our results with similar results for ESL. In both quantitative and qualitative analyses, we detect similarities between the two varieties (e.g. discuss about) and dissimilarities (e.g. accuse for, only distinctive for EFL). We report more verb/adjective + preposition combinations than previous studies and discuss the roles of analogy and transfer.
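The two collocation measures named in the abstract can be illustrated with a minimal sketch. It assumes the common textbook definitions (expected frequency under independence, O/E as the ratio of observed to expected, and T-score as the difference scaled by the square root of the observed count); the abstract does not spell out the exact formulation used in the study. The function names and all counts below are hypothetical and chosen only for illustration.

```python
import math


def expected_count(f_head: int, f_prep: int, n: int) -> float:
    """Expected co-occurrence count if head word and preposition were independent."""
    return f_head * f_prep / n


def o_over_e(observed: int, f_head: int, f_prep: int, n: int) -> float:
    """Observed/expected ratio; values well above 1 suggest attraction."""
    return observed / expected_count(f_head, f_prep, n)


def t_score(observed: int, f_head: int, f_prep: int, n: int) -> float:
    """T-score: (O - E) / sqrt(O), a textbook approximation."""
    if observed <= 0:
        raise ValueError("T-score is undefined for an observed count of zero")
    return (observed - expected_count(f_head, f_prep, n)) / math.sqrt(observed)


# Hypothetical counts for one verb + preposition pair (not taken from the study).
observed = 30        # co-occurrences of the pair
f_head = 200         # frequency of the verb
f_prep = 5000        # frequency of the preposition
n = 1_000_000        # total number of extracted relations

print("O/E:", o_over_e(observed, f_head, f_prep, n))
print("T-score:", t_score(observed, f_head, f_prep, n))
```

O/E tends to favour rare combinations, whereas the T-score rewards combinations with more supporting evidence, which is one reason the two measures are often reported side by side.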
Keywords: Error Analysis, Learner English, Cognitive Linguistics, verb-preposition constructions, corpus linguistics, data-driven approach, English as a Second Language (ESL), English as a Foreign Language (EFL), linguistic innovations, collocations
Published online: 20 October 2016
https://doi.org/10.1075/ijlcr.2.2.03sch