Publication details [#59609]

Kim, YouJin, Scott Crossley and Kristopher Kyle. 2015. Native language identification and writing proficiency. International Journal of Learner Corpus Research 1(2): 187–209.
Publication type
Article in journal
Publication language
Place, Publisher
John Benjamins
Journal DOI


This study evaluates the impact of writing proficiency on native language identification (NLI), a topic that has important implications for the generalizability of NLI models and for detection-based arguments for cross-linguistic influence (CLI; Jarvis 2010, 2012). The study uses multinomial logistic regression to classify the first language (L1) group membership of essays at two proficiency levels based on systematic lexical and phrasal choices made by members of five L1 groups. The results indicate that lower proficiency essays are significantly easier to classify than higher proficiency essays, suggesting that lower proficiency writers who share an L1 make lexical and phrasal choices that are more similar to one another than those made by higher proficiency writers who share an L1. A close analysis of the findings also indicates that the relationship between NLI accuracy and proficiency differed across L1 groups.