Source language classification of indirect translations
Ilmari Ivaska and Laura Ivaska
University of Turku | Finnish Literature Society (SKS)
Abstract
One of the major barriers to the systematic study of indirect translations – that is, translations of
translations – is the lack of efficient methods to identify these translations. In this article, we use supervised machine
learning to examine whether computers can be harnessed to identify indirect translations. Our data consist of a monolingual
comparable corpus that includes (1) nontranslated Finnish texts, (2) direct translations from English, French, German, Greek, and
Swedish into Finnish, and (3) indirect translations from Greek (the ultimate source language) via English, French, German, and
Swedish (mediating languages) into Finnish. We use n-grams of various types and lengths as feature sets and random forests as the
statistical classification technique. To maximize the transferability of the method, the feature sets were implemented in
accordance with the Universal Dependencies framework. This study confirms that computers can distinguish between translated and
nontranslated Finnish, as well as between Finnish translations made from different source languages. Regarding indirect
translations, the ultimate source language has a greater impact on the linguistic composition of indirect Finnish translations
than their respective mediating languages. As a result, the indirect translations could not be reliably identified. Our
results suggest that the reliable computational identification of indirect translations and their mediating languages requires a
way to control for the effect of the ultimate source language.
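The feature sets described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each text has already been parsed into sentences of Universal Dependencies part-of-speech (UPOS) tags, and it computes the relative frequencies of tag n-grams of several lengths. The function names (`ngrams`, `ngram_features`) and the toy input are hypothetical and introduced only for illustration.

```python
from collections import Counter
from typing import Dict, Iterator, Sequence, Tuple

NGram = Tuple[str, ...]


def ngrams(tokens: Sequence[str], n: int) -> Iterator[NGram]:
    """Yield every contiguous n-gram of length n from one sentence."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


def ngram_features(sentences: Sequence[Sequence[str]],
                   lengths: Sequence[int] = (1, 2, 3)) -> Dict[NGram, float]:
    """Map each n-gram to its relative frequency among n-grams of the same length."""
    features: Dict[NGram, float] = {}
    for n in lengths:
        # Count n-grams within sentences only, never across sentence boundaries.
        counts = Counter(g for sent in sentences for g in ngrams(sent, n))
        total = sum(counts.values())
        for gram, count in counts.items():
            features[gram] = count / total
    return features


# Toy input (invented for illustration): UPOS tags for two short sentences.
toy_text = [
    ["PRON", "VERB", "NOUN", "PUNCT"],
    ["NOUN", "VERB", "PUNCT"],
]
features = ngram_features(toy_text, lengths=(1, 2))
```

One such feature dictionary per text, aligned into a common matrix, could then serve as input to a random forest classifier whose class labels are the source languages. Normalising within each n-gram length keeps texts of different sizes comparable, which is a design choice of this sketch rather than something stated in the abstract.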
In this article, we study indirect translation (ITr), which, put simply, is translating from translation(s). For example,
the Finnish translation Kerro minulle, Zorbas ‘Tell me, Zorbas’ (1954b),
by Vappu Roos, of Nikos Kazantzakis’s novel Βίος και πολιτεία του Αλέξη Ζορμπά Vios kai politeía tou Aléxi Zormpá
(1946, published in Carl Wildman’s English translation under the title Zorba
the Greek [1952]) was not done from the original Greek but from the French
translation by Yvonne Gauthier, Gisèle Prassinos, and Pierre Fridas, titled Alexis Zorba (1954a). In this case, ITr forms the chain Greek–French–Finnish, where Greek is the ultimate source language
(ultimate SL), French is the mediating language, and Finnish is the ultimate target language (ultimate TL). An ITr may also be
compilative, that is, based on several source texts (STs) in one or several SLs. For example, the Finnish translation
Veljesviha ‘Hatred of brothers’ (1967), by Kyllikki Villa, of
Kazantzakis’s novel Οι Αδερφοφάδες Oi aderfofádes (1963, published in
Athena Gianakas Dallas’s English translation as The Fratricides [1964])
has three STs: the French translation (Les frères ennemis ‘The enemy brothers’, 1965, translated by Pierre Aellig), the English translation, and the Greek version (for more details, see L. Ivaska [2021]; for discussion on further types of ITrs, see, e.g., Washbourne 2013; Assis Rosa, Pięta, and Bueno Maia 2017).
References
Assis
Rosa, Alexandra, Hanna Pięta, and Rita
Bueno Maia
2017 “Theoretical,
Methodological and Terminological Issues Regarding Indirect Translation: An
Overview.” Translation
Studies 10 (2): 113–132.
Baker, Mona
1993 “Corpus
Linguistics and Translation Studies – Implications and
Applications.” In Text and Technology: In Honour of John
Sinclair, edited by Mona Baker, Gill Francis, and Elena Tognini-Bonelli, 233–250. Amsterdam: John
Benjamins.
Baroni, Marco, and Silvia Bernardini
2006 “A
New Approach to the Study of Translationese: Machine-Learning the Difference between Original and Translated
Text.” Literary and Linguistic
Computing 21 (3): 259–274.
Cartoni, Bruno, Sandrine Zufferey, and Thomas Meyer
2013 “Using
the Europarl Corpus for Cross-Linguistic Research.” Belgian Journal of
Linguistics 27 (1): 23–42.
Čermák, František, and Alexandr Rosen
2012 “The
Case of InterCorp: A Multilingual Parallel Corpus.” International Journal of Corpus
Linguistics 17 (3): 411–427.
Fernández
Muñiz, Iris
2016 “Tracking
Sources in Indirect Translation Archaeology: A Case Study on a 1917 Spanish Translation of Ibsen’s Et
Dukkehjem (1879).” In New Horizons in Translation
Research and Education 4, edited by Turo Rautaoja, Tamara
Mikolič Južnič, and Kaisa Koskinen, 115–132. Joensuu: University
of Eastern Finland.
Genette, Gérard
1991 “Introduction
to the Paratext.” New Literary
History 22 (2): 261–272.
Hanes, Vanessa Lopes
Lourenço
2017 “Between Continents:
Agatha Christie’s Translations as Intercultural Mediators.” Cadernos de
Tradução 37 (1): 208–229.
Islam, Zahurul, and Armin Hoenen
2013 “Source
and Translation Classification Using Most Frequent
Words.” In Proceedings of the Sixth International Joint Conference on
Natural Language Processing, Nagoya, Japan, 14–18 October 2013, edited by Ruslan Mitkov and Jong
C. Park, 1299–1305. Nagoya: Asian
Federation of Natural Language Processing.
Ivaska, Ilmari, and Silvia Bernardini
2020 “Constrained
Language Use in Finnish: A Corpus-Driven Approach.” Nordic Journal of
Linguistics 43 (1): 33–57.
Ivaska, Laura
2019 “Distinguishing
Translations from Non-translations and Identifying (In)direct Translations’ Source
Languages.” In Proceedings of the Research Data and Humanities
(RDHum) 2019 Conference: Data, Methods and Tools, edited by Jarmo
Harri Jantunen, Sisko Brunni, Niina Kunnas, Santeri Palviainen, and Katja Västi. Studia
humaniora ouluensia
17, 125–138. Oulu: University
of Oulu.
Ivaska, Laura
2020 “Identifying
(Indirect) Translations and Their Source Languages in the Finnish National Bibliography Fennica: Problems and
Solutions.” MikaEL 13: 75–88.
Ivaska, Laura
2021 “The
Genesis of a Compilative Translation and its de facto Source
Text.” In Genetic Translation Studies: Conflict and Collaboration in
Liminal Spaces, edited by Ariadne Nunes, Joana Moura, and Marta
Pacheco Pinto, 72–88. London: Bloomsbury.
Kanerva, Jenna, Filip Ginter, Niko Miekka, Akseli Leino, and Tapio Salakoski
2018 “Turku
Neural Parser Pipeline: An End-to-End System for the CoNLL 2018 Shared
Task.” In Proceedings of the CoNLL 2018 Shared Task: Multilingual
Parsing from Raw Text to Universal Dependencies, edited by Daniel Zeman and Jan Hajič, 133–142. Brussels: Association
for Computational Linguistics.
Kazantzakis, Nikos
1946 Βίος και πολιτεία του Αλέξη Ζορμπά [Life and times of Alexis
Zorbas]. Athens: Dimitrakou.
Kazantzakis, Nikos
1952 Zorba
the Greek. Translated by Carl Wildman. New
York: Simon and Schuster.
Kazantzakis, Nikos
1954a Alexis
Zorba. Translated by Yvonne Gauthier, Gisèle Prassinos, and Pierre Fridas. Paris: Plon.
Kazantzakis, Nikos
1964 The
Fratricides. Translated by Athena
Gianakas Dallas. New
York: Simon and Schuster.
Kazantzakis, Nikos
1965 Les frères ennemis [The enemy
brothers]. Translated by Pierre Aellig. Paris: Plon.
Kazantzakis, Nikos
1967 Veljesviha [Hatred of
brothers]. Translated by Kyllikki Villa. Helsinki: Tammi.
Koehn, Philipp
2005 “Europarl:
A Parallel Corpus for Statistical Machine
Translation.” In Proceedings of Machine Translation Summit X:
Papers, 79–86. Phuket: Association
for Computational Linguistics.
Koppel, Moshe, and Noam Ordan
2011 “Translationese
and its Dialects.” In Proceedings of the 49th Annual Meeting of the
Association for Computational Linguistics: Human Language Technologies, Volume 1, edited
by Dekang Lin, 1318–1326. Portland: Association
for Computational Linguistics.
Lynch, Gerard, and Carl Vogel
2012 “Towards
the Automatic Detection of the Source Language of a Literary
Translation.” In Proceedings of COLING 2012:
Posters, edited by Martin Kay and Christian Boitet, 775–784. Mumbai: The
COLING 2012 Organizing Committee.
Mauranen, Anna
2004 “Corpora,
Universals and Interference.” In Translation Universals: Do They
Exist? edited by Anna Mauranen and Pekka Kujamäki, 65–82. Amsterdam: John
Benjamins.
Meyer, David, Evgenia Dimitriadou, Kurt Hornik, Andreas Weingessel, and Friedrich Leisch
2021 E1071:
Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071). TU
Wien.
Nisioi, Sergiu
2015 “Unsupervised
Classification of Translated Texts.” In Natural Language Processing
and Information Systems, edited by Chris Biemann, Siegfried Handschuh, André Freitas, Farid Meziane, and Elisabeth Métais, 323–334. Cham: Springer.
Nivre, Joakim, Marie-Catherine
de Marneffe, Filip Ginter, Jan Hajič, Christopher
D. Manning, Sampo Pyysalo, Sebastian Schuster, Francis Tyers, and Daniel Zeman
2020 “Universal
Dependencies v2: An Evergrowing Multilingual Treebank
Collection.” In Proceedings of 12th Conference on Language Resources
and Evaluation LREC’2020, edited by Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck et al., 4034–4043. Marseille: European
Language Resources Association.
Popescu, Marius
2011 “Studying
Translationese at the Character Level.” In Proceedings of the
International Conference Recent Advances in Natural Language Processing 2011, edited
by Ruslan Mitkov and Galia Angelova, 634–639. Hissar: Association
for Computational Linguistics.
R Core Team
2021 R: A Language and
Environment for Statistical Computing. Vienna: R
Foundation for Statistical Computing.
Rabinovich, Ella, Sergiu Nisioi, Noam Ordan, and Shuly Wintner
2016 “On
the Similarities between Native, Non-Native and Translated
Texts.” In Proceedings of the 54th Annual Meeting of the Association
for Computational Linguistics, edited by Katrin Erk and Noah
A. Smith, 1870–1881. Berlin: Association
for Computational Linguistics.
Rabinovich, Ella, Noam Ordan, and Shuly Wintner
2017 “Found
in Translation: Reconstructing Phylogenetic Language Trees from
Translations.” In Proceedings of the 55th Annual Meeting of the
Association for Computational Linguistics, edited by Regina Barzilay and Min-Yen Kan, 530–540. Vancouver: Association
for Computational Linguistics.
Rabinovich, Ella, and Shuly Wintner
2015 “Unsupervised
Identification of Translationese.” Transactions of the Association for Computational
Linguistics 3: 419–432.
Toury, Gideon
2012 Descriptive
Translation Studies – and Beyond. Amsterdam: John
Benjamins.
Ustaszewski, Michael
2021 “Towards
a Machine Learning Approach to the Analysis of Indirect Translation.” Translation
Studies 14 (3): 313–331.
Volansky, Vered, Noam Ordan, and Shuly Wintner
2015 “On
the Features of Translationese.” Digital Scholarship in the
Humanities 30 (1): 98–118.
Washbourne, Kelly
2013 “Nonlinear
Narratives: Paths of Indirect and Relay
Translation.” Meta 58 (3): 607–625.
Wright, Marvin
N., and Andreas Ziegler
2017 “ranger:
A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of
Statistical
Software 77 (1): 1–17.
Zei, Alki
1971 Ο μεγάλος περίπατος του Πέτρου [Petros’ long
journey]. Athens: Kedros.
Zei, Alki
1972 Petros’
War. Translated by Edward Fenton. New
York: E. P. Dutton.
Zei, Alki
1973 Tämä on sotaa, Petros [This is war,
Petros]. Translated by Marikki Makkonen. Porvoo: WSOY.