InterCorp
A parallel corpus of 40 languages
This chapter presents the current version of InterCorp, a parallel corpus created at the Faculty of Arts, Charles University in Prague. The corpus consists of Czech texts aligned with one or more foreign-language versions and covers 40 languages in total: Czech and 39 others. The chapter analyses the structure and technical parameters of the corpus and describes tools used with it: Kontext, a corpus query interface, and InterText, a parallel text alignment editor created specifically for the project. It also discusses Treq (Translation Equivalents Database), a collection of bilingual dictionaries, each pairing Czech with a foreign language, built automatically from InterCorp. The last section discusses how the corpus can be exploited methodologically and linguistically.
Article outline
- 1. Introduction
- 2. Description of the corpus
- 2.1 The Spanish part of the corpus
- 3. Using the corpus
- 4. Specific tools: Translation equivalents database
- 5. Exploiting InterCorp
- 6. Conclusion
- Acknowledgment
- References