Some Translation Studies informed suggestions for further balancing methodologies for machine translation quality evaluation
Ralph Krüger | TH Köln – University of Applied Sciences
This article contributes to the current debate on the quality of neural machine translation (NMT) vs. (professional) human translation, in which claims of (super)human performance by NMT systems have recently emerged. It critically analyses some of the machine translation (MT) quality evaluation methodologies employed in studies that claim such performance for their MT systems. The analysis aims to identify areas where these methodologies are potentially biased in favour of MT and may therefore overvalue MT performance while undervaluing human translation performance. The article then offers some Translation Studies-informed suggestions for improving or debiasing these methodologies in order to arrive at a more balanced picture of MT vs. (professional) human translation quality.