Researching learner language through POS keyword and syntactic complexity analyses
In this paper, we explore the affordances of two different research methods that may be instrumental in analysing learner language complexity: standard corpus linguistics methodology and automatic syntactic complexity analysers. Our results suggest that POS keyword analysis and automatic syntactic analysis are both effective for the identification of linguistic features at different levels of development in instructed SLA. In particular, countable nouns, prepositional phrases, verbs and general adverbs are criterial features that define the transition from lower to higher secondary school language learning in the Spanish component of the ICCI corpus. We suggest that the analysis of complexity in noun phrases is of great interest for researchers and teachers in terms of identifying development milestones in language acquisition.
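To illustrate the POS keyword analysis mentioned above, the following minimal sketch compares POS-tag frequencies in two sub-corpora using the log-likelihood keyness statistic widely used in corpus linguistics. The abstract does not say which statistic, threshold or tagset the study relied on, so the choice of log-likelihood, the 6.63 cut-off (the chi-square critical value for p < 0.01 with one degree of freedom), the tag labels and all counts below are assumptions invented for illustration, not the authors' data or procedure.

```python
import math
from collections import Counter


def log_likelihood(freq_a: int, total_a: int, freq_b: int, total_b: int) -> float:
    """Log-likelihood (G2) keyness of one item observed in two corpora."""
    combined = freq_a + freq_b
    # Expected frequencies if the item were spread evenly across both corpora.
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2.0 * ll


def pos_keywords(tags_a, tags_b, threshold=6.63):
    """Return (tag, keyness, corpus where it is relatively more frequent)."""
    counts_a, counts_b = Counter(tags_a), Counter(tags_b)
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    results = []
    for tag in set(counts_a) | set(counts_b):
        fa, fb = counts_a[tag], counts_b[tag]
        ll = log_likelihood(fa, total_a, fb, total_b)
        if ll >= threshold:
            overused_in = "A" if fa / total_a > fb / total_b else "B"
            results.append((tag, round(ll, 2), overused_in))
    # Strongest keyness first.
    return sorted(results, key=lambda r: r[1], reverse=True)


# Hypothetical tag sequences (invented counts, simplified tag labels).
lower_secondary_tags = ["NOUN"] * 300 + ["VERB"] * 250 + ["PRON"] * 200 + ["PREP"] * 50
upper_secondary_tags = ["NOUN"] * 350 + ["VERB"] * 200 + ["PRON"] * 100 + ["PREP"] * 150

keywords = pos_keywords(lower_secondary_tags, upper_secondary_tags)
for tag, keyness, corpus in keywords:
    print(tag, keyness, corpus)
```

In this toy comparison, corpus "A" stands for the lower grades and "B" for the higher grades; only tags whose keyness reaches the assumed threshold are reported, which mirrors how keyword lists are usually filtered before interpretation.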
Article outline
- 1. Introduction
- 2. Research methodology
  - 2.1 Data
  - 2.2 Research methods
- 3. Contrasting learner corpora (1): POS keyword analysis
  - 3.1 Grades 7 and 8
  - 3.2 Grades 11 and 12
  - 3.3 Grades 7 and 8 vs Grades 11 and 12
- 4. Contrasting learner corpora (2): Automatic syntactic complexity analysis
  - 4.1 Grades 7, 8, 11 and 12: Complexity in the noun phrase
  - 4.2 Grades 7, 8, 11 and 12: Syntactic sophistication
    - 4.2.1 Traditional measures of syntactic complexity
    - 4.2.2 Measures of syntactic sophistication
  - 4.3 Grades 7 and 8 vs Grades 11 and 12: Complexity in the noun phrase and syntactic sophistication measures
- 5. Discussion and pedagogical implications
  - 5.1 RQ (1) Do different groups of learners present distinct linguistic features? Can these features be identified by means of automatic analysis of language?
  - 5.2 RQ (2) Do different methods to carry out automatic analysis of language present a similar picture of complexity and language development? How do the research methods in this paper complement each other? How does this complementarity inform language teaching?
- 6. Conclusion and some limitations
- Notes
- References
- Appendix
Cited by
Cited by 3 other publications
Blanco-Suárez, Zeltia, Francisco Gallardo-del-Puerto & Evelyn Gandón-Chapela
2020. The Primary Education Learners’ English Corpus (PELEC): Design and compilation.
Research in Corpus Linguistics 8, pp. 147 ff.
Lim, Joyce Dong Ok, Geraldine Mark, Pascual Pérez-Paredes & Anne O’Keeffe
2024. Exploring part of speech (POS) tag sequences in a large-scale learner corpus of L2 English: A developmental perspective.
Corpora 19:1, pp. 31 ff.
Picoral, Adriana, Shelley Staples & Randi Reppen
This list is based on CrossRef data as of 29 May 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.