Chapter 7
English and Spanish discourse markers in translation
Corpus analysis and annotation
The study and annotation of discourse markers (DMs) in the context of translation is a much-needed and challenging task, not only for descriptive translation studies but also for Natural Language Processing (NLP) applications. Their various meanings are difficult to identify and annotate, even for trained human experts. In this chapter, a methodology for the analysis and annotation of DMs is proposed, using three highly frequent English DMs (in fact, actually and really) and their translations into Spanish as a case study. The methodology consists of an initial corpus analysis phase followed by a corpus annotation phase. The corpus analysis phase provides qualitative and quantitative information on the meanings of these DMs by examining their translations in large parallel corpora. The corpus annotation phase specifies the annotation procedure, which can be generalized to other DMs and other language pairs and can form the basis for large-scale cross-linguistic annotation of DMs.
Article outline
- 1. Introduction
- 2. Data and methodology
- 3. Corpus analysis phase
- 3.1 Some previous work on English DMs
- 3.2 Translation analysis of in fact
- 3.3 Translation analysis of actually
- 3.4 Translation analysis of really
- 3.5 Analysis of the back translations
- 3.5.1 English translations of de hecho
- 3.5.2 English translations of en realidad
- 3.5.3 English translations of realmente
- 3.6 Lexico-semantic field construction
- 4. Corpus annotation phase
- 5. Summary and concluding remarks
- Notes
- References
- Appendix