Making sense of neural machine translation
The last few years have witnessed a surge of interest in a new machine translation paradigm: neural machine translation (NMT). Neural machine translation is starting to displace its corpus-based predecessor, statistical machine translation (SMT). In this paper, I introduce NMT and explain in detail, without the mathematical complexity, how neural machine translation systems work, how they are trained, and how they differ from SMT systems. The paper tries to decipher NMT jargon such as “distributed representations”, “deep learning”, “word embeddings”, “vectors”, “layers”, “weights”, “encoder”, “decoder”, and “attention”, and builds upon these concepts, so that individual translators and professionals working for the translation industry, as well as students and academics in translation studies, can make sense of this new technology and know what to expect from it. Aspects such as how NMT output differs from that of SMT, and the hardware and software requirements of NMT at both training time and run time, together with their implications for the translation industry, will also be discussed.
Keywords: neural machine translation, neural networks, machine translation, word embeddings, encoder, decoder, deep learning