Article published in: The Functional Perspective on Language and Discourse: Applications and implications
Edited by María de los Ángeles Gómez González, Francisco José Ruiz de Mendoza Ibáñez, Francisco Gonzálvez-García and Angela Downing
[Pragmatics & Beyond New Series 247] 2014
pp. 57–86
Contrastive corpus annotation in the CONTRANOT project
Issues and problems
In this paper we outline a number of issues and problems that arise during contrastive human-coded corpus annotation of certain semantic and discourse categories within the CONTRANOT project, which aims to create and validate contrastive functional descriptions through corpus analysis and annotation. Human-coded corpus annotation is a preliminary step in training computer algorithms that automate the annotation of large corpora, but it can also serve as a mechanism for testing aspects of linguistic theories empirically, such as theory formation and theory redefinition, as well as for enriching theories with quantitative information. The work reported here focuses on the annotation of two categories, Thematisation and Modality, to illustrate the challenges researchers face when developing well-designed and reliable contrastive annotation procedures for complex linguistic phenomena. We describe the annotation tasks and procedures developed so far, which include the design of annotation schemas on the basis of available linguistic theories and the testing of their reliability through agreement studies. We also evaluate and discuss the annotation results in terms of their relevance for the theoretical characterisation of the investigated phenomena. We expect that our work will have an impact in the area of contrastive textual analysis, and that it will pave the way for the development of automated annotation systems for computational applications.
Published online: 16 May 2014