Publications

Publication details [#46666]

Publication type
Article in Special issue
Publication language
English
Source language
Target language
Pivot language
Journal DOI
10.1075/target

Abstract

One of the major barriers to the systematic study of indirect translation – that is, translations of translations – is the lack of efficient methods to identify these translations. In this article, the authors use supervised machine learning to examine whether computers can be harnessed to identify indirect translations. Their data consist of a monolingual comparable corpus that includes (1) nontranslated Finnish texts, (2) direct translations from English, French, German, Greek, and Swedish into Finnish, and (3) indirect translations from Greek (the ultimate source language) via English, French, German, and Swedish (mediating languages) into Finnish. They use n-grams of various types and lengths as feature sets and random forests as the statistical classification technique. To maximize the transferability of the method, the feature sets were implemented in accordance with the Universal Dependencies framework. This study confirms that computers can distinguish between translated and nontranslated Finnish, as well as between Finnish translations made from different source languages. Regarding indirect translations, the ultimate source language has a greater impact on the linguistic composition of indirect Finnish translations than their respective mediating languages. Hence, the indirect translations could not be reliably identified. Therefore, the results suggest that the reliable computational identification of indirect translations and their mediating languages requires a way to control for the effect of the ultimate source language.
Source : Based on abstract in journal