Article published in:
Intelligences pour la traduction. IA et interculturel : actions et interactions.
Edited by Ludovica Maggi and Sarah Bordes
[FORUM 20:2] 2022
pp. 315–332
References

Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio
2015 “Neural Machine Translation by Jointly Learning to Align and Translate.” In Proceedings of the Third International Conference on Learning Representations. San Diego, CA.
Banerjee, Satanjeev, and Alon Lavie
2005 “METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments.” In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation, 65–72. Ann Arbor, Michigan.
Bawden, Rachel, Rico Sennrich, Alexandra Birch, and Barry Haddow
2018 “Evaluating Discourse Phenomena in Neural Machine Translation.” In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 1304–13. New Orleans, Louisiana.
Belinkov, Yonatan, and Yonatan Bisk
2018 “Synthetic and Natural Noise Both Break Neural Machine Translation.” In International Conference on Learning Representations.
Belinkov, Yonatan, and James Glass
2019 “Analysis Methods in Neural Language Processing: A Survey.” Transactions of the Association for Computational Linguistics 7 (April): 49–72.
Blanchon, Hervé, and Christian Boitet
2007 “Pour l’évaluation externe des systèmes de TA par des méthodes fondées sur la tâche.” Traitement Automatique des Langues 48 (1): 33–65.
Burchardt, Aljoscha, Vivien Macketanz, Jon Dehdari, Georg Heigold, Jan-Thorsten Peter, and Philip Williams
2017 “A Linguistic Evaluation of Rule-Based, Phrase-Based, and Neural MT Engines.” The Prague Bulletin of Mathematical Linguistics 108 (1): 159–70.
Burlot, Franck, and François Yvon
2017 “Evaluating the Morphological Competence of Machine Translation Systems.” In Proceedings of the Second Conference on Machine Translation, Volume 1: Research Papers, 43–55. Copenhagen, Denmark.
2018 “Evaluation morphologique pour la traduction automatique: adaptation au français.” In Conférence sur le Traitement Automatique des Langues Naturelles, 14 pages. TALN. Rennes, France.
Castilho, Sheila, Stephen Doherty, Federico Gaspari, and Joss Moorkens
2018 “Approaches to Human and Machine Translation Quality Assessment.” In Translation Quality Assessment, 9–38. Springer.
Chatzikoumi, Eirini
2020 “How to Evaluate Machine Translation: A Review of Automated and Human Metrics.” Natural Language Engineering 26 (2): 137–61.
Cho, Kyunghyun, Bart van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio
2014 “On the Properties of Neural Machine Translation: Encoder-Decoder Approaches.” In Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation, 103–11. Doha, Qatar.
Conneau, Alexis, German Kruszewski, Guillaume Lample, Loïc Barrault, and Marco Baroni
2018 “What You Can Cram into a Single $&!#* Vector: Probing Sentence Embeddings for Linguistic Properties.” In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2126–36. Melbourne, Australia.
Forcada, Mikel L., Carolina Scarton, Lucia Specia, Barry Haddow, and Alexandra Birch
2018 “Exploring Gap Filling as a Cheaper Alternative to Reading Comprehension Questionnaires When Evaluating Machine Translation for Gisting.” In Proceedings of the Third Conference on Machine Translation: Research Papers, 192–203. Brussels, Belgium.
Freitag, Markus, George Foster, David Grangier, Viresh Ratnakar, Qijun Tan, and Wolfgang Macherey
2021 “Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation.” Transactions of the Association for Computational Linguistics 9: 1460–74.
Gehring, Jonas, Michael Auli, David Grangier, Denis Yarats, and Yann N. Dauphin
2017 “Convolutional Sequence to Sequence Learning.” In Proceedings of the 34th International Conference on Machine Learning, edited by D. Precup and Y. W. Teh, 70: 1243–52. Sydney, Australia.
Giulianelli, Mario, Jack Harding, Florian Mohnert, Dieuwke Hupkes, and Willem Zuidema
2018 “Under the Hood: Using Diagnostic Classifiers to Investigate and Improve How Language Models Track Agreement Information.” In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, 240–48. Brussels, Belgium.
Guillou, Liane, and Christian Hardmeier
2016 “PROTEST: A Test Suite for Evaluating Pronouns in Machine Translation.” In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), 636–43. Portorož, Slovenia.
Guillou, Liane, Christian Hardmeier, Preslav Nakov, Sara Stymne, Jörg Tiedemann, Yannick Versley, Mauro Cettolo, Bonnie Webber, and Andrei Popescu-Belis
2016 “Findings of the 2016 WMT Shared Task on Cross-Lingual Pronoun Prediction.” In Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers, 525–42. Berlin, Germany.
Hardmeier, Christian, Preslav Nakov, Sara Stymne, Jörg Tiedemann, Yannick Versley, and Mauro Cettolo
2015 “Pronoun-Focused MT and Cross-Lingual Pronoun Prediction: Findings of the 2015 DiscoMT Shared Task on Pronoun Translation.” In Proceedings of the Second Workshop on Discourse in Machine Translation, 1–16. Lisbon, Portugal.
Hewitt, John, and Percy Liang
2019 “Designing and Interpreting Probes with Control Tasks.” In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2733–43. Hong Kong, China.
Hovy, Eduard, Margaret King, and Andrei Popescu-Belis
2002 “Principles of Context-Based Machine Translation Evaluation.” Machine Translation 17 (1): 43–75.
Isabelle, Pierre, Colin Cherry, and George Foster
2017 “A Challenge Set Approach to Evaluating Machine Translation.” In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2486–96. Copenhagen, Denmark.
King, Margaret, and Kirsten Falkedal
1990 “Using Test Suites in Evaluation of Machine Translation Systems.” In Papers Presented to the 13th International Conference on Computational Linguistics. COLING 1990.
Koehn, Philipp
2010 Statistical Machine Translation. Cambridge University Press.
Krubiński, Mateusz, Erfan Ghadery, Marie-Francine Moens, and Pavel Pecina
2021 “Just Ask! Evaluating Machine Translation by Asking and Answering Questions.” In Proceedings of the Sixth Conference on Machine Translation, 495–506. Online.
Kübler, Natalie
2008 “A Comparable Learner Translator Corpus: Creation and Use.” In Proceedings of the LREC 2008 Workshop on Building and Using Comparable Corpora, 73–78. BUCC. Marrakech, Morocco.
Läubli, Samuel, Sheila Castilho, Graham Neubig, Rico Sennrich, Qinlan Shen, and Antonio Toral
2020 “A Set of Recommendations for Assessing Human-Machine Parity in Language Translation.” Journal of Artificial Intelligence Research 67: 653–72.
Lommel, Arle, Hans Uszkoreit, and Aljoscha Burchardt
2014 “Multidimensional Quality Metrics (MQM): A Framework for Declaring and Describing Translation Quality Metrics.” Revista Tradumàtica: Tecnologies de La Traducció, no. 12: 455–63.
Maruf, Sameen, Fahimeh Saleh, and Gholamreza Haffari
2021 “A Survey on Document-Level Neural Machine Translation: Methods and Evaluation.” ACM Comput. Surv. 54 (2).
Papineni, Kishore, Salim Roukos, Todd Ward, and Wei-Jing Zhu
2002 “BLEU: A Method for Automatic Evaluation of Machine Translation.” In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, 311–18. ACL ’02. Stroudsburg, PA, USA.
Pierce, John R., John B. Carroll, Eric P. Hamp, David G. Hays, Charles F. Hockett, Anthony G. Oettinger, and Alan Perlis
1966 “Language and Machines – Computers in Translation and Linguistics.” Washington, DC: ALPAC Report, National Academy of Sciences.
Raganato, Alessandro, Yves Scherrer, and Jörg Tiedemann
2019 “The MuCoW Test Suite at WMT 2019: Automatically Harvested Multilingual Contrastive Word Sense Disambiguation Test Sets for Machine Translation.” In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), 470–80. Florence, Italy.
Rei, Ricardo, Craig Stewart, Ana C. Farinha, and Alon Lavie
2020 “COMET: A Neural Framework for MT Evaluation.” In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2685–2702. Online.
Rios, Annette, Mathias Müller, and Rico Sennrich
2018 “The Word Sense Disambiguation Test Suite at WMT18.” In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, 588–96. Brussels, Belgium.
Rudin, Cynthia
2019 “Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead.” Nature Machine Intelligence 1 (5): 206–15.
Saunders, Danielle, and Bill Byrne
2020 “Reducing Gender Bias in Neural Machine Translation as a Domain Adaptation Problem.” In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 7724–36. Online.
Scarton, Carolina, and Lucia Specia
2016 “A Reading Comprehension Corpus for Machine Translation Evaluation.” In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), 3652–58. Portorož, Slovenia.
Sennrich, Rico
2017 “How Grammatical Is Character-Level Neural Machine Translation? Assessing MT Quality with Contrastive Translation Pairs.” In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, 376–82. Valencia, Spain.
Shi, Xing, Inkit Padhi, and Kevin Knight
2016 “Does String-Based Neural MT Learn Source Syntax?” In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 1526–34. Austin, Texas.
Snover, Matthew, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul
2006 “A Study of Translation Edit Rate with Targeted Human Annotation.” In Proceedings of the Seventh Conference of the Association for Machine Translation in the Americas (AMTA), 223–31. Boston, Massachusetts, USA.
Specia, Lucia, Carolina Scarton, and Gustavo Henrique Paetzold
2018 Quality Estimation for Machine Translation. Synthesis Lectures on Human Language Technologies. Morgan & Claypool Publishers.
Thompson, Brian, and Matt Post
2020 “Automatic Machine Translation Evaluation in Many Languages via Zero-Shot Paraphrasing.” In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 90–121. Online.
Vanmassenhove, Eva, Jinhua Du, and Andy Way
2017 “Investigating ‘Aspect’ in NMT and SMT: Translating the English Simple Past and Present Perfect.” Computational Linguistics in the Netherlands Journal 7: 109–28.
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin
2017 “Attention Is All You Need.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 5998–6008.
Vig, Jesse, Sebastian Gehrmann, Yonatan Belinkov, Sharon Qian, Daniel Nevo, Yaron Singer, and Stuart Shieber
2020 “Investigating Gender Bias in Language Models Using Causal Mediation Analysis.” In Advances in Neural Information Processing Systems 33, 12388–12401. Curran Associates, Inc.
Voita, Elena and Ivan Titov
2020 “Information-Theoretic Probing with Minimum Description Length.” In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 183–96. Online. Association for Computational Linguistics.
Voita, Elena, Rico Sennrich, and Ivan Titov
2019 “When a Good Translation Is Wrong in Context: Context-Aware Machine Translation Improves on Deixis, Ellipsis, and Lexical Cohesion.” In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 1198–1212. Florence, Italy.
Wisniewski, Guillaume, Lichao Zhu, Nicolas Ballier, and François Yvon
2021 “Biais de genre dans un système de traduction automatique neuronale : une étude préliminaire.” In Traitement Automatique des Langues Naturelles, edited by P. Denis, N. Grabar, A. Fraisse, R. Cardon, B. Jacquemin, E. Kergosien, and A. Balvet, 11–25. Lille, France.
Wisniewski, Guillaume, Lichao Zhu, Nicolas Ballier, and François Yvon
2021 “Screening Gender Transfer in Neural Machine Translation.” In Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP. Punta Cana, Dominican Republic.
Zhang, Tianyi, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, and Yoav Artzi
2020 “BERTScore: Evaluating Text Generation with BERT.” In International Conference on Learning Representations.