Automatic analysis of thematic structure in written English
This paper proposes and describes a computational system for the automatic analysis of thematic structure, as defined in Systemic Functional Linguistics, in written English. The system takes an English text as input and produces as output an analysis of the thematic structure of each sentence in the text. The system is evaluated using data from The Wall Street Journal section of the Penn Treebank (Marcus et al. 1993) and the British Academic Written English corpus (Gardner & Nesi 2013). An experiment using these data shows that the system achieves a high degree of reliability in regard to both identifying theme-rheme boundaries and determining several of the linguistic properties of the identified themes, including syntactic nodes, theme function, markedness, mood types, and theme roles. To illustrate how the system is used, we describe an example application designed to compare collections of novice and expert academic writing in terms of thematic structure.
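The core idea of locating a theme-rheme boundary can be illustrated with a minimal sketch. It assumes the input is already a Penn Treebank-style bracketed parse with plain phrase labels (no function tags), and it treats the Theme as everything up to and including the first NP or PP under the top-level S. The function names (`parse_tree`, `words`, `analyse_theme`) and the label set `EXPERIENTIAL` are hypothetical, chosen for illustration; this is a toy heuristic, not the system described in the paper.

```python
# Toy sketch: split a declarative clause into Theme and Rheme from a
# Penn Treebank-style bracketed parse. Illustrative heuristic only; this is
# not the system described in the paper.

def parse_tree(s):
    """Parse a bracketed string such as "(S (NP (DT The) (NN cat)) (VP ...))"
    into nested (label, children) tuples; leaves are plain word strings."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return read()


def words(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    out = []
    for child in node[1]:
        out.extend(words(child))
    return out


# Assumption: only NP and PP count as the first experiential element here.
# Fronted adverbs and clauses are not handled by this sketch.
EXPERIENTIAL = {"NP", "PP"}


def analyse_theme(bracketed):
    """Theme = everything up to and including the first NP or PP child of the
    top-level S; Rheme = the remaining words."""
    _, children = parse_tree(bracketed)
    theme, rheme = [], []
    boundary_label = None
    for child in children:
        if boundary_label is None:
            theme.extend(words(child))
            if isinstance(child, tuple) and child[0] in EXPERIENTIAL:
                boundary_label = child[0]
        else:
            rheme.extend(words(child))
    if boundary_label is None:
        markedness = None             # no NP/PP found; the sketch gives up
    elif boundary_label == "NP":
        markedness = "unmarked"       # assumes the first NP is the subject
    else:
        markedness = "marked"
    return {
        "theme": " ".join(theme),
        "rheme": " ".join(rheme),
        "node": boundary_label,
        "markedness": markedness,
    }
```

For example, `analyse_theme("(S (PP (IN In) (NP (DT this) (NN paper))) (, ,) (NP (PRP we)) (VP (VBP describe) (NP (DT a) (NN system))))")` returns the Theme `In this paper` with node `PP` and markedness `marked`. A realistic analyser would also need to handle mood types, theme roles, and multi-clause sentences, which this sketch deliberately leaves out.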
References (26)
Eggins, S. (2004). An Introduction to Systemic Functional Linguistics (2nd ed.). New York, NY: Continuum.
Gardner, S., & Nesi, H. (2013). A classification of genre families in university student writing. Applied Linguistics, 34(1), 25–52.
Ghadessy, M. (1999). Thematic organization in academic article abstracts. Estudios Ingleses de la Universidad Complutense, 71, 141–161.
Gomez, M. (1994). Theme and textual organization. Journal of Words, 45(3), 293–305.
Gosden, H. (1995). Success in research article writing and revision: A social-constructionist perspective. English for Specific Purposes, 14(1), 37–57.
Halliday, M.A.K. (1994). An Introduction to Functional Grammar (2nd ed.). London, UK: Edward Arnold.
Halliday, M.A.K., & Matthiessen, C. (2004). An Introduction to Functional Grammar (3rd ed.). London, UK: Edward Arnold.
Hunt, K.W. (1965). Grammatical Structures Written at Three Grade Levels (NCTE research report no. 3). Urbana, IL: National Council of Teachers of English.
Hunt, K.W. (1970). Do sentences in the second language grow like those in the first? TESOL Quarterly, 4(3), 195–202.
Jalilifar, A. (2009). Thematic development in English and translated academic texts. Journal of Language and Translation, 10(1), 81–111.
Jalilifar, A. (2010). The status of theme in applied linguistics articles. Asian ESP Journal, 6(2), 7–39.
Kappagoda, A. (2009). The Use of Systemic-functional Linguistics in Automated Text Mining. Edinburgh, Australia: Defense Science and Technology Organization.
Klein, D., & Manning, C.D. (2003). Fast exact inference with a factored model for natural language parsing. In S. Becker, S. Thrun, & K. Obermayer (Eds.), Advances in Neural Information Processing Systems 15 (pp. 3–10). Cambridge, MA: MIT Press.
Lu, X. (2002). Discourse and ideology: The Taiwan issue in the Chinese and American media. In C.N. Candlin (Ed.), Research and Practice in Professional Discourse (pp. 589–608). Hong Kong: City University of Hong Kong Press.
Marcus, M.P., Marcinkiewicz, M.A., & Santorini, B. (1993). Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2), 313–330.
Martínez, I.A. (2003). Aspects of theme in the method and discussion sections of biology journal articles in English. Journal of English for Academic Purposes, 2(2), 103–123.
McCabe, A.M. (1999). Theme and thematic patterns in Spanish and English history texts. (Unpublished doctoral dissertation). Aston University, Birmingham, UK.
North, S.P. (2005). Disciplinary variation in the use of theme in undergraduate essays. Applied Linguistics, 26(3), 431–452.
O’Halloran, K.L. (2003). Systemics 1.0: Software for research and teaching systemic functional linguistics. RELC Journal, 34(2), 155–177.
Schleppegrell, M. (2001). Linguistic features of the language of schooling. Linguistics and Education, 12(4), 431–459.
Schwarz, L., Bartsch, S., Eckart, R., & Teich, E. (2008). Exploring automatic theme identification: A rule-based approach. In A. Storrer, A. Geyken, A. Siebert, & K.-M. Würzner (Eds.), Text Resources and Lexical Knowledge: Selected Papers from the 9th Conference on Natural Language Processing (pp. 15–26). Berlin, Germany: Mouton de Gruyter.
Souter, D.C. (1996). A corpus-trained parser for systemic-functional syntax. (Unpublished doctoral dissertation). University of Leeds, Leeds, UK.
Steinberger, R., & Bennett, P. (1994). Automatic recognition of theme, focus and contrastive stress. In P. Bosch & R. van der Sandt (Eds.), Proceedings of the Interdisciplinary Conference in Celebration of the 10th Anniversary of the Journal of Semantics, 12–15 August 1994 (Vol. 11, pp. 205–214). Meinhard-Schwebda, Germany: The IBM Institute for Logic and Linguistics.
Thompson, G. (2004). Introducing Functional Grammar (2nd ed.). London, UK: Edward Arnold.
Wang, I. (2007). Theme and rheme in the thematic organization of texts: Implications for teaching academic writing. Asian EFL Journal, 9(1), 164–176.
Cited by (3)
Eguchi, Masaki & Kristopher Kyle. 2024. Building custom NLP tools to annotate discourse-functional features for second language writing research: A tutorial. Research Methods in Applied Linguistics 3:3, pp. 100153 ff.
Dontcheva-Navratilova, Olga, Renata Jančaříková, Irena Hůlková & Josef Schmied. 2020. Theme choices in Czech University students’ English-medium Master’s theses. Lingua 243, pp. 102892 ff.
Kim, Dongwook, Sanjay Mishra, Ze Wang & Surendra N. Singh. 2016. Insidious Effects of Syntactic Complexity: Are Ads Targeting Older Adults Too Complex to Remember? Journal of Advertising 45:4, pp. 509 ff.
This list is based on CrossRef data as of 17 October 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.