Article published in:
Translation, Cognition & Behavior: Online-First Articles
Morphological complexity as a predictor of cognitive effort in neural machine translation post-editing
This study examines how morphological complexity affects cognitive effort in neural machine translation (NMT)
post-editing across six languages. Analysis of the DivEMT dataset shows that morphologically richer target languages such as
Ukrainian and Turkish require more editing time, more keystrokes, and more frequent pauses, indicating higher cognitive demands. Vietnamese,
despite its simpler morphology, also showed high cognitive effort, suggesting that other factors, such as syntax, influence processing load.
Mean Size of Paradigm (MSP) analysis confirmed the high morphological complexity of Ukrainian and Turkish relative to isolating
languages such as Vietnamese. Higher error rates in morphologically rich languages point to greater editing needs. While user
perceptions varied, the data show that greater linguistic distance correlates with higher cognitive effort in NMT post-editing,
indicating that the impact of typological divergence extends beyond morphology alone.
Keywords: neural machine translation (NMT), post-editing, morphological complexity, cognitive load, typological divergence
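The abstract uses Mean Size of Paradigm (MSP) without defining it, and the outline also lists the type-token ratio (TTR). The sketch below illustrates both under a common corpus-based definition: MSP as the number of distinct word forms divided by the number of distinct lemmas, and TTR as distinct tokens divided by total tokens. The article's exact procedure is not shown on this page, so the function names, the lowercasing step, and the toy samples are all assumptions made for illustration.

```python
def mean_size_of_paradigm(tagged_tokens):
    """MSP = distinct word forms / distinct lemmas (assumed definition).

    tagged_tokens: iterable of (form, lemma) pairs, e.g. from a lemmatized corpus.
    """
    pairs = list(tagged_tokens)
    forms = {form.lower() for form, _ in pairs}
    lemmas = {lemma.lower() for _, lemma in pairs}
    if not lemmas:
        raise ValueError("need at least one (form, lemma) pair")
    return len(forms) / len(lemmas)


def type_token_ratio(tokens):
    """TTR = distinct tokens / total tokens."""
    tokens = [t.lower() for t in tokens]
    if not tokens:
        raise ValueError("need at least one token")
    return len(set(tokens)) / len(tokens)


# Hypothetical toy samples, not data from the study.
inflecting_sample = [
    ("walks", "walk"), ("walked", "walk"), ("walk", "walk"),
    ("cats", "cat"), ("cat", "cat"),
]
isolating_sample = [("walk", "walk"), ("walk", "walk"), ("cat", "cat")]

msp_inflecting = mean_size_of_paradigm(inflecting_sample)  # 5 forms / 2 lemmas
msp_isolating = mean_size_of_paradigm(isolating_sample)    # 2 forms / 2 lemmas
ttr_isolating = type_token_ratio([form for form, _ in isolating_sample])
```

Under this definition, a language whose lemmas surface in many inflected forms yields an MSP well above 1, while an isolating language stays close to 1. Corpus size and lemmatizer quality both affect the value, so comparisons across languages are usually made on samples of comparable size.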
Article outline
- 1. Introduction
- 2. Cognitive load theory and morphological complexity
- 2.1 Escalating intrinsic cognitive load
- 2.2 Amplified germane cognitive load
- 3. Related work
- 4. Morphological complexity measures
- 4.1 Type-token ratio (TTR)
- 4.2 Mean size of paradigm (MSP)
- 4.3 Entropy of paradigms
- 5. Methods
- 5.1 Research aims and design
- 5.2 Data source and NMT systems
- 5.3 Linguistic profiles of sample languages
- 5.4 Validating selected complexity metrics
- 5.5 Cognitive effort metrics
- 5.6 Analytical approach
- 6. MSP as a metric for morphological complexity
- 7. Quantifying post-editing cognitive effort metrics
- 7.1 Edit times: a proxy for post-editing complexity
- 7.2 Keystrokes: quantifying technical effort
- 7.3 Pauses: proxies of cognitive load
- 8. Quantifying post-editing cognitive effort rates
- 9. Subjective perceptions of morphological divergence in post-editing
- 9.1 Perceived language-related challenges
- 10. Conclusion
- Acknowledgements
- References
Published online: 3 December 2024
https://doi.org/10.1075/tcb.24002.abu