In research on L2 English, recent corpus-based studies indicate that some non-standard forms are shared by indigenized (ESL) and foreign (EFL) varieties of English, which challenges the idea of a clear dichotomy between innovation and error. We present a large-scale, data-driven method for detecting innovations, test it on verb + preposition structures (including phrasal verbs) and adjective + preposition structures, and describe similarities and differences between EFL and ESL. We use a dependency-parsed version of the International Corpus of Learner English to automatically extract potential innovations, defined as patterns of overuse relative to the British National Corpus as the reference corpus. We measure overuse with collocation measures such as the observed/expected ratio (O/E) and the t-score, and compare our results with corresponding results for ESL. In both quantitative and qualitative analyses, we detect similarities between the two varieties (e.g. discuss about) and dissimilarities (e.g. accuse for, distinctive only for EFL). We report more verb/adjective + preposition combinations than previous studies and discuss the roles of analogy and transfer.
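As a rough illustration of the two collocation measures named above, the following minimal sketch computes O/E and the t-score for a single verb + preposition pair using their standard textbook definitions. The function names and all counts are hypothetical and chosen only for illustration; the sketch does not reproduce the study's actual extraction pipeline, thresholds, or data.

```python
import math


def expected(f_head, f_prep, n_pairs):
    """Expected co-occurrence frequency under independence: E = f1 * f2 / N."""
    return f_head * f_prep / n_pairs


def o_over_e(observed, f_head, f_prep, n_pairs):
    """Observed/expected ratio; values above 1 mean the pair occurs more often than chance predicts."""
    return observed / expected(f_head, f_prep, n_pairs)


def t_score(observed, f_head, f_prep, n_pairs):
    """t-score: (O - E) / sqrt(O). Undefined when the pair is unattested."""
    if observed <= 0:
        raise ValueError("t-score requires an observed frequency above zero")
    return (observed - expected(f_head, f_prep, n_pairs)) / math.sqrt(observed)


# Hypothetical counts (NOT taken from the study) for one verb + preposition pair:
# observed pair frequency, verb frequency, preposition frequency, total pairs.
learner = dict(observed=30, f_head=200, f_prep=5000, n_pairs=100_000)
reference = dict(observed=2, f_head=400, f_prep=10_000, n_pairs=200_000)

learner_oe = o_over_e(**learner)
reference_oe = o_over_e(**reference)
learner_t = t_score(**learner)

# A pair is a candidate for overuse in this sketch if it is clearly more
# strongly associated in the learner data than in the reference data.
is_candidate = learner_oe > reference_oe

print(f"learner O/E = {learner_oe:.2f}, reference O/E = {reference_oe:.2f}")
print(f"learner t-score = {learner_t:.2f}, candidate = {is_candidate}")
```

With these invented counts the pair is about three times more frequent than expected in the learner data but far rarer than expected in the reference data, which is the kind of contrast the method would flag for closer qualitative inspection.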