Article published in:
Intelligences pour la traduction. IA et interculturel : actions et interactions.
Edited by Ludovica Maggi and Sarah Bordes
[FORUM 20:2] 2022
pp. 315–332
References (51)
Bibliography
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. 2015. “Neural Machine Translation by Jointly Learning to Align and Translate.” In Proceedings of the Third International Conference on Learning Representations. San Diego, CA.
Banerjee, Satanjeev, and Alon Lavie. 2005. “METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments.” In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation, 65–72. Ann Arbor, Michigan.
Bawden, Rachel, Rico Sennrich, Alexandra Birch, and Barry Haddow. 2018. “Evaluating Discourse Phenomena in Neural Machine Translation.” In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 1304–13. New Orleans, Louisiana.
Belinkov, Yonatan, and Yonatan Bisk. 2018. “Synthetic and Natural Noise Both Break Neural Machine Translation.” In International Conference on Learning Representations.
Belinkov, Yonatan, and James Glass. 2019. “Analysis Methods in Neural Language Processing: A Survey.” Transactions of the Association for Computational Linguistics 7 (April): 49–72.
Blanchon, Hervé, and Christian Boitet. 2007. “Pour l’évaluation externe des systèmes de TA par des méthodes fondées sur la tâche.” Traitement Automatique des Langues 48 (1): 33–65.
Burchardt, Aljoscha, Vivien Macketanz, Jon Dehdari, Georg Heigold, Jan-Thorsten Peter, and Philip Williams. 2017. “A Linguistic Evaluation of Rule-Based, Phrase-Based, and Neural MT Engines.” The Prague Bulletin of Mathematical Linguistics 108: 159–70.
Burlot, Franck, and François Yvon. 2017. “Evaluating the Morphological Competence of Machine Translation Systems.” In Proceedings of the Second Conference on Machine Translation, Volume 1: Research Papers, 43–55. Copenhagen, Denmark.
Burlot, Franck, and François Yvon. 2018. “Evaluation morphologique pour la traduction automatique: adaptation au français.” In Conférence sur le Traitement Automatique des Langues Naturelles, 14 pages. TALN. Rennes, France.
Castilho, Sheila, Stephen Doherty, Federico Gaspari, and Joss Moorkens. 2018. “Approaches to Human and Machine Translation Quality Assessment.” In Translation Quality Assessment, 9–38. Springer.
Chatzikoumi, Eirini. 2020. “How to Evaluate Machine Translation: A Review of Automated and Human Metrics.” Natural Language Engineering 26 (2): 137–61.
Cho, Kyunghyun, Bart van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014. “On the Properties of Neural Machine Translation: Encoder-Decoder Approaches.” In Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation, 103–11. Doha, Qatar.
Conneau, Alexis, German Kruszewski, Guillaume Lample, Loïc Barrault, and Marco Baroni. 2018. “What You Can Cram into a Single $&!#* Vector: Probing Sentence Embeddings for Linguistic Properties.” In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2126–36. Melbourne, Australia.
Forcada, Mikel L., Carolina Scarton, Lucia Specia, Barry Haddow, and Alexandra Birch. 2018. “Exploring Gap Filling as a Cheaper Alternative to Reading Comprehension Questionnaires When Evaluating Machine Translation for Gisting.” In Proceedings of the Third Conference on Machine Translation: Research Papers, 192–203. Brussels, Belgium.
Freitag, Markus, George Foster, David Grangier, Viresh Ratnakar, Qijun Tan, and Wolfgang Macherey. 2021. “Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation.” Transactions of the Association for Computational Linguistics 9: 1460–74.
Gehring, Jonas, Michael Auli, David Grangier, Denis Yarats, and Yann N. Dauphin. 2017. “Convolutional Sequence to Sequence Learning.” In Proceedings of the 34th International Conference on Machine Learning, edited by D. Precup and Y. W. Teh, 70:1243–52. Sydney, Australia.
Giulianelli, Mario, Jack Harding, Florian Mohnert, Dieuwke Hupkes, and Willem Zuidema. 2018. “Under the Hood: Using Diagnostic Classifiers to Investigate and Improve How Language Models Track Agreement Information.” In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, 240–48. Brussels, Belgium.
Guillou, Liane, and Christian Hardmeier. 2016. “PROTEST: A Test Suite for Evaluating Pronouns in Machine Translation.” In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), 636–43. Portorož, Slovenia.
Guillou, Liane, Christian Hardmeier, Preslav Nakov, Sara Stymne, Jörg Tiedemann, Yannick Versley, Mauro Cettolo, Bonnie Webber, and Andrei Popescu-Belis. 2016. “Findings of the 2016 WMT Shared Task on Cross-Lingual Pronoun Prediction.” In Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers, 525–42. Berlin, Germany.
Hardmeier, Christian, Preslav Nakov, Sara Stymne, Jörg Tiedemann, Yannick Versley, and Mauro Cettolo. 2015. “Pronoun-Focused MT and Cross-Lingual Pronoun Prediction: Findings of the 2015 DiscoMT Shared Task on Pronoun Translation.” In Proceedings of the Second Workshop on Discourse in Machine Translation, 1–16. Lisbon, Portugal.
Hewitt, John, and Percy Liang. 2019. “Designing and Interpreting Probes with Control Tasks.” In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2733–43. Hong Kong, China.
Hovy, Eduard, Margaret King, and Andrei Popescu-Belis. 2002. “Principles of Context-Based Machine Translation Evaluation.” Machine Translation 17 (1): 43–75.
Isabelle, Pierre, Colin Cherry, and George Foster. 2017. “A Challenge Set Approach to Evaluating Machine Translation.” In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2486–96. Copenhagen, Denmark.
King, Margaret, and Kirsten Falkedal. 1990. “Using Test Suites in Evaluation of Machine Translation Systems.” In Papers Presented to the 13th International Conference on Computational Linguistics. COLING 1990.
Koehn, Philipp. 2010. Statistical Machine Translation. Cambridge University Press.
Krubiński, Mateusz, Erfan Ghadery, Marie-Francine Moens, and Pavel Pecina. 2021. “Just Ask! Evaluating Machine Translation by Asking and Answering Questions.” In Proceedings of the Sixth Conference on Machine Translation, 495–506. Online.
Kübler, Natalie. 2008. “A Comparable Learner Translator Corpus: Creation and Use.” In Proc. of LREC 2008 Workshop on Building and Using Comparable Corpora, 73–78. BUCC. Marrakech, Morocco.
Läubli, Samuel, Sheila Castilho, Graham Neubig, Rico Sennrich, Qinlan Shen, and Antonio Toral. 2020. “A Set of Recommendations for Assessing Human-Machine Parity in Language Translation.” Journal of Artificial Intelligence Research 67: 653–72.
Lommel, Arle, Hans Uszkoreit, and Aljoscha Burchardt. 2014. “Multidimensional Quality Metrics (MQM): A Framework for Declaring and Describing Translation Quality Metrics.” Revista Tradumàtica: Tecnologies de La Traducció, no. 12: 455–63.
Maruf, Sameen, Fahimeh Saleh, and Gholamreza Haffari. 2021. “A Survey on Document-Level Neural Machine Translation: Methods and Evaluation.” ACM Comput. Surv. 54 (2).
Papineni, Kishore, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. “BLEU: A Method for Automatic Evaluation of Machine Translation.” In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, 311–18. ACL ’02. Stroudsburg, PA, USA.
Pierce, John R., John B. Carroll, Eric P. Hamp, David G. Hays, Charles F. Hockett, Anthony G. Oettinger, and Alan Perlis. 1966. “Language and Machines – Computers in Translation and Linguistics.” Washington, DC: ALPAC Report, National Academy of Sciences.
Raganato, Alessandro, Yves Scherrer, and Jörg Tiedemann. 2019. “The MuCoW Test Suite at WMT 2019: Automatically Harvested Multilingual Contrastive Word Sense Disambiguation Test Sets for Machine Translation.” In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), 470–80. Florence, Italy.
Rei, Ricardo, Craig Stewart, Ana C. Farinha, and Alon Lavie. 2020. “COMET: A Neural Framework for MT Evaluation.” In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2685–2702. Online.
Rios, Annette, Mathias Müller, and Rico Sennrich. 2018. “The Word Sense Disambiguation Test Suite at WMT18.” In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, 588–96. Brussels, Belgium.
Rudin, Cynthia. 2019. “Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead.” Nature Machine Intelligence 1 (5): 206–15.
Saunders, Danielle, and Bill Byrne. 2020. “Reducing Gender Bias in Neural Machine Translation as a Domain Adaptation Problem.” In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 7724–36. Online.
Scarton, Carolina, and Lucia Specia. 2016. “A Reading Comprehension Corpus for Machine Translation Evaluation.” In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), 3652–58. Portorož, Slovenia.
Sennrich, Rico. 2017. “How Grammatical Is Character-Level Neural Machine Translation? Assessing MT Quality with Contrastive Translation Pairs.” In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, 376–82. Valencia, Spain.
Shi, Xing, Inkit Padhi, and Kevin Knight. 2016. “Does String-Based Neural MT Learn Source Syntax?” In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 1526–34. Austin, Texas.
Snover, Matthew, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul. 2006. “A Study of Translation Edit Rate with Targeted Human Annotation.” In Proceedings of the Seventh Conference of the Association for Machine Translation in the Americas (AMTA), 223–31. Boston, Massachusetts, USA.
Specia, Lucia, Carolina Scarton, and Gustavo Henrique Paetzold. 2018. Quality Estimation for Machine Translation. Synthesis Lectures on Human Language Technologies. Morgan & Claypool Publishers.
Thompson, Brian, and Matt Post. 2020. “Automatic Machine Translation Evaluation in Many Languages via Zero-Shot Paraphrasing.” In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 90–121. Online.
Vanmassenhove, Eva, Jinhua Du, and Andy Way. 2017. “Investigating ‘Aspect’ in NMT and SMT: Translating the English Simple Past and Present Perfect.” Computational Linguistics in the Netherlands Journal 7: 109–28.
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. “Attention Is All You Need.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 5998–6008.
Vig, Jesse, Sebastian Gehrmann, Yonatan Belinkov, Sharon Qian, Daniel Nevo, Yaron Singer, and Stuart Shieber. 2020. “Investigating Gender Bias in Language Models Using Causal Mediation Analysis.” In NeurIPS, vol. 33, 12388–12401. Curran Associates, Inc.
Voita, Elena, and Ivan Titov. 2020. “Information-Theoretic Probing with Minimum Description Length.” In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 183–96. Online. Association for Computational Linguistics.
Voita, Elena, Rico Sennrich, and Ivan Titov. 2019. “When a Good Translation Is Wrong in Context: Context-Aware Machine Translation Improves on Deixis, Ellipsis, and Lexical Cohesion.” In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 1198–1212. Florence, Italy.
Wisniewski, Guillaume, Lichao Zhu, Nicolas Ballier, and François Yvon. 2021. “Biais de genre dans un système de traduction automatique neuronale : une étude préliminaire.” In Traitement Automatique des Langues Naturelles, edited by P. Denis, N. Grabar, A. Fraisse, R. Cardon, B. Jacquemin, E. Kergosien, and A. Balvet, 11–25. Lille, France.
Wisniewski, Guillaume, Lichao Zhu, Nicolas Ballier, and François Yvon. 2021. “Screening Gender Transfer in Neural Machine Translation.” In Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP. Punta Cana, Dominican Republic.
Zhang, Tianyi, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, and Yoav Artzi. 2020. “BERTScore: Evaluating Text Generation with BERT.” In International Conference on Learning Representations.