Automated tools for syntactic complexity measurement are increasingly used for analyzing various kinds of second
language corpora, even though these tools were originally developed and tested for texts produced by advanced learners. This study
investigates the reliability of automated complexity measurement for beginner and lower-intermediate L2 English data by comparing
manual and automated analyses of a corpus of 80 texts written by Dutch-speaking learners. Our quantitative and qualitative
analyses reveal that the reliability of automated complexity measurement is substantially affected by learner errors, parser
errors, and Tregex pattern undergeneration. We also demonstrate the importance of aligning the definitions of
analytical units between the computational tool and human annotators. To enhance the reliability of automated analyses,
we recommend making certain modifications to the system and preprocessing non-advanced L2 English data prior to
automated analysis.