Accuracy, syntactic complexity and task type at play in examination writing: A corpus-based study

Lyashevskaya, Olga; Vinogradova, Olga; Scherbakova, Anna

doi:10.1075/scl.104.10lya

Part of

Complexity, Accuracy and Fluency in Learner Corpus Research
Edited by Agnieszka Leńko-Szymańska and Sandra Götz
[Studies in Corpus Linguistics 104] 2022
pp. 241–272

Olga Lyashevskaya | National Research University Higher School of Economics | V.V. Vinogradov Russian Language Institute of the Russian Academy of Sciences

Olga Vinogradova | National Research University Higher School of Economics

Anna Scherbakova | National Research University Higher School of Economics

This chapter explores the association between syntactic complexity and syntactic accuracy in essays written by Russian learners of English in response to two examination task types: a description of graphical material (Task 1) and an opinion essay (Task 2). A Poisson regression model was used to predict the number of syntactic errors. Two syntactic complexity parameters were statistically significant predictors of syntactic accuracy in both tasks: the numbers of sentences and of adverbial clauses. Three more parameters predicted accuracy in Task 1 only: the maximum depth of syntactic trees, and the numbers of adjective + noun and of noun + infinitive constructions. Six parameters were related to syntactic accuracy in Task 2: the numbers of all clauses, of tokens and of T-units; the average sentence length; and the numbers of coordinated and of participle + noun constructions.

Keywords: syntactic complexity, accuracy, EFL/ESL writing, assessing writing
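
The Poisson regression mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the data are invented toy values rather than corpus data, the single predictor (adverbial clauses per essay) stands in for the chapter's larger set of complexity parameters, and the function name `fit_poisson` is hypothetical. The sketch fits log(expected errors) = b0 + b1 * x by Newton-Raphson using only the standard library.

```python
import math

def fit_poisson(x, y, iterations=50, tol=1e-10):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson (hypothetical helper)."""
    # Start from the intercept-only fit: b0 = log(mean count), b1 = 0.
    b0 = math.log(sum(y) / len(y))
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: gradient of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2 x 2, symmetric).
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system in closed form.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Invented toy data (assumption): one value per essay.
adverbial_clauses = [0, 1, 2, 3, 4, 5, 6, 7]
errors = [1, 1, 2, 2, 4, 3, 6, 7]

b0, b1 = fit_poisson(adverbial_clauses, errors)
# exp(b1) is the multiplicative change in expected error count
# for each additional adverbial clause.
print(f"intercept={b0:.3f}, slope={b1:.3f}, rate ratio={math.exp(b1):.3f}")
```

A count outcome such as the number of errors per essay is non-negative and typically right-skewed, which is why a log-link count model is a common choice over ordinary linear regression; in practice one would also check for overdispersion before relying on the fit.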

Article outline

1. Introduction
2. Literature review
3. Research questions, material and method
- 3.1 Research questions
- 3.2 Examination task types
- 3.3 Corpus data
- 3.4 Data analysis
  - 3.4.1 Evaluating syntactic accuracy
  - 3.4.2 Parameters of syntactic complexity
  - 3.4.3 Statistical analysis
4. Results
- 4.1 Accuracy (error rates) across the two task types
- 4.2 Syntactic accuracy
- 4.3 Association between accuracy and complexity for syntactic features
- 4.4 An aggregated metric of syntactic complexity
5. Discussion
6. Limitations and further research
7. Conclusions
Acknowledgements
Notes
References
Appendix

Published online: 1 December 2022
