Article published in: Rethinking Linguistic Creativity in Non-native Englishes
Edited by Sandra C. Deshors, Sandra Götz and Samantha Laporte
[Benjamins Current Topics 98] 2018
pp. 47–74
Detecting innovations in a parsed corpus of learner English
In research on L2 English, recent corpus-based studies indicate that some nonstandard forms are shared by indigenized (ESL) and foreign (EFL) varieties of English, which challenges the idea of a clear dichotomy between innovation and error. We present a data-driven, large-scale method to detect innovations, test it on verb + preposition structures (including phrasal verbs) and adjective + preposition structures, and describe similarities and differences between EFL and ESL. We use a dependency-parsed version of the International Corpus of Learner English to automatically extract potential innovations, defined as patterns of overuse relative to the British National Corpus as the reference corpus. We measure overuse by means of collocation measures such as O/E or T-score, and compare our findings with corresponding results for ESL. In both quantitative and qualitative analyses, we detect similarities between the two varieties (e.g. discuss about) and dissimilarities (e.g. accuse for, which is distinctive only for EFL). We report more verb/adjective + preposition combinations than previous studies and discuss the roles of analogy and transfer.
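As a rough illustration of how overuse might be scored with O/E and T-score, the following Python sketch compares the frequency of a verb + preposition pair in a learner corpus with its frequency in a reference corpus. It is a minimal sketch under stated assumptions, not the chapter's actual implementation: it assumes that the expected frequency E is the reference-corpus frequency scaled to the size of the learner corpus, and that the T-score takes the common textbook form (O - E) / sqrt(O). The function names and all counts are hypothetical.

```python
import math


def expected_count(ref_count: int, ref_size: int, learner_size: int) -> float:
    """Expected learner-corpus frequency (E) of a pattern.

    Assumption: E is the reference-corpus frequency scaled to the size
    of the learner corpus.
    """
    if ref_size <= 0:
        raise ValueError("reference corpus size must be positive")
    return ref_count * learner_size / ref_size


def o_over_e(observed: float, expected: float) -> float:
    """Ratio of observed to expected frequency; values above 1 suggest overuse."""
    if expected == 0:
        # The pattern is unattested in the reference corpus.
        return math.inf
    return observed / expected


def t_score(observed: float, expected: float) -> float:
    """T-score in its common collocation form: (O - E) / sqrt(O)."""
    if observed <= 0:
        raise ValueError("t-score is only defined here for observed > 0")
    return (observed - expected) / math.sqrt(observed)


# Hypothetical counts, invented purely for illustration.
learner_size = 100_000    # candidate slots in the learner corpus
ref_size = 1_000_000      # candidate slots in the reference corpus
observed = 16             # e.g. occurrences of "discuss about" in learner data
ref_count = 40            # occurrences of the same pair in the reference corpus

expected = expected_count(ref_count, ref_size, learner_size)
ratio = o_over_e(observed, expected)
t = t_score(observed, expected)
print(f"E = {expected:.2f}, O/E = {ratio:.2f}, T = {t:.2f}")
```

The two measures behave differently: O/E rewards rare patterns with large relative overuse, whereas the T-score favours patterns that are frequent enough for the difference to be reliable, so ranking candidates by both can be a reasonable design choice.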
Keywords: Cognitive Linguistics, collocations, corpus linguistics, data-driven approach, English as a Foreign Language (EFL), English as a Second Language (ESL), Error Analysis, Learner English, linguistic innovations, verb-preposition constructions
Published online: 19 July 2018