Indexation and analysis of a parallel corpus using CQPweb
The COVALT PAR_ES Corpus (EN/FR/DE > ES)
Ulrike Oster | Universitat Politècnica de València, Universitat Jaume I
This contribution presents a section of the Corpus Valencià de Literatura Traduïda (COVALT), created by the research group of the same name (Department of Translation and Communication, Universitat Jaume I, Spain). The COVALT corpus is a four-million-word corpus made up of narrative works originally written in English, French, and German and their Catalan translations published in the autonomous community of Valencia between 1990 and 2000. Since the members of the COVALT group are interested in translation research, and more specifically in the investigation of translated Catalan and Spanish, the corpus has recently been extended to include translations into Spanish published in Spain (the COVALT PAR_ES corpus). This chapter presents the COVALT PAR_ES corpus and describes how it was compiled and analysed with CQPweb.
Article outline
- 1. Introduction
- 2. The corpora
- 3. Corpus compilation and indexation
- 3.1 Preparation of texts
- 3.2 Uploading the files to CQPweb
- Step 1: Creating directories
- Step 2: Encoding and indexing corpora in CWB
- Step 3: Aligning the subcorpora
- Step 4: Copying the files to CQPweb
- Step 5: Activating the corpora on the web interface
- 4. Corpus analysis
- 5. Conclusion
- Acknowledgements
- Notes
- References