More than tweets
A critical reflection on developing and testing crisis machine translation technology
The application of machine translation (MT) in crisis settings is of increasing interest to humanitarian practitioners. We collaborated with industry and non-profit partners to (1) develop and test the utility of an MT system trained specifically on crisis-related content in an under-resourced language combination (French-to-Swahili) and (2) evaluate the extent to which speakers of both French and Swahili without post-editing experience could be mobilized to post-edit the output of this system effectively. Our small study, carried out in Kenya, found that our system performed well, provided useful output, and was positively evaluated by inexperienced post-editors. We use the study to discuss the feasibility of MT use in crisis settings for low-resource language combinations and make recommendations on data selection and domain consideration for future crisis-related MT development.
Keywords: crisis translation, crisis, machine translation (MT), post-editing, evaluation, training, citizen translators, data sets
Published online: 05 November 2019