Translation
Measuring the difficulty of text translation: The combination of text-focused and translator-oriented approaches

Yanmei Liu (刘艳梅), Binghan Zheng (郑冰寒), and Hao Zhou (周好)

Shandong University of Finance and Economics | Durham University

Translated by 何雅祺、胡博凡、刘纯玮、刘芳菲、尹佳佳、宋丽、刘琪、秦雨、刘曌龙、张紫钰

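Shandong University of Finance and Economics | Durham University

Abstract

This paper explores the impact of text complexity on translators' subjective perception of translation difficulty and on their cognitive load. Twenty-six MA translation students from a UK university were asked to translate three English texts of differing complexity into Chinese. Their eye movements were recorded with an eye tracker, and they self-assessed their cognitive load with a Likert scale before translation and with NASA Task Load Index (NASA-TLX) scales after translation. The results show that (1) the intrinsic text complexity measured by readability, word frequency, and non-literalness was in line with participants' subjective assessment of translation difficulty; (2) most items in the self-assessments correlated moderately and positively with the eye-tracking indicator (fixation and saccade durations); and (3) as source text complexity increased, translators' fixation and saccade durations (but not pupil size) increased significantly in two of the three texts, reflecting an increase in cognitive load.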

1. Introduction

Over the past two decades, the importance of measuring the translation difficulty of source texts for translation teaching and research has attracted a certain amount of attention (e.g., Hale and Campbell 2002; Jensen 2009 [Note 1]; Mishra, Bhattacharyya, and Carl 2013; Sun and Shreve 2014). When investigating how differences in text complexity change translation difficulty, previous researchers based their measurements either on readability alone (Pavlović and Jensen 2009) or on readability combined with other indicators such as word frequency, sentence structure, and non-literalness (Sharmin, Spakov, Räihä, and Jakobsen 2008; Jensen 2009). Such measures usually concentrate on surface features of the text itself, such as character, syllable, and sentence length, and neglect other important factors such as conceptual complexity, text organization, or the reader's background knowledge (Liu and Chiu 2011, 149). Textual factors, however, can only partly explain the translation difficulty of a text (Sun and Shreve 2014, 98), because translation difficulty arises from the interaction between the translation task and the translator. Research on translation difficulty therefore has to consider both the text itself and the translator who translates it.

It is assumed that more complex texts place a heavier burden on translators than simpler ones, but it remains unclear to what extent quantitative measures of a text's intrinsic complexity correlate with participants' subjective measures of cognitive load and with physiological measures of cognitive effort [Note 2]. Hvelplund (2011) and Sun and Shreve (2014) are among the few researchers who have applied multiple measures to both texts and translators. Hvelplund (2011) used readability, word frequency, and non-literalness to examine the effect of text complexity on translators' cognitive load, as indicated by pupil size recorded with an eye tracker. He found no significant difference in mean pupil size when participants translated texts of differing complexity. This result calls into question both the suitability of the three factors as quantitative textual measures of intrinsic complexity and the reliability of pupil size alone as a physiological measure. Without translators' subjective evaluations of translation difficulty, it is hard to give a sound account of the effect of text complexity on translators' cognitive load. Sun and Shreve (2014) argued that the NASA Task Load Index (NASA-TLX) [Note 3] is a reliable instrument for assessing subjective translation difficulty, whereas text readability taken alone correlates only weakly with translation difficulty.

Our study combines the measures used in these two studies: the text complexity measures used by Hvelplund (2011) and the subjective assessment method applied by Sun and Shreve (2014). In addition, we collected eye-movement data to study participants' cognitive effort while they carried out the translation tasks. According to Sun (2015) and Akbari and Segers (2017), translation difficulty should be measured with a combination of methods, for example by measuring text complexity, evaluating the translations, and measuring translators' cognitive load. The present study adopts such a combined approach to address the following questions: (1) Does the difficulty of the text itself, as indicated by quantitative textual measures (see Hvelplund 2011), agree with the translation difficulty reported in translators' subjective self-assessments? (2) What are the main differences in translators' cognitive load across levels of translation difficulty? (3) For cognitive load, do the subjective assessments obtained with the NASA-TLX questionnaire agree with the physiological indicators recorded by the eye tracker?

2. Assessing cognitive load

Empirical research on text difficulty can be traced back to Campbell (1999); before Campbell's work, text difficulty was a debated topic only in reading research (Hale and Campbell 2002, 14). Translation difficulty can be measured from two angles: the load imposed on the translator's cognitive system and the effort the translator expends in carrying out the task. Existing research identifies two effective ways of measuring cognitive load: subjective indices (rating scales) and psychophysiological indices (e.g., pupil diameter, heart rate variability, and event-related brain potentials; Paas and van Merriënboer 1994a, 357). The subjective indices used in this study comprise pre-translation and post-translation ratings, which are described in detail in Section 4.2.

As in reading research, this study uses eye-movement data as a physiological indicator of cognitive load. Studies that used eye tracking to assess the difficulty of reading tasks have shown that readers fixate longer on the following kinds of words: long words (Just and Carpenter 1980; Rayner, Sereno, and Raney 1996), low-frequency words (Just and Carpenter 1980; Inhoff 1984; Rayner and Fischer 1996; Rayner and Raney 1996), novel or unfamiliar words (Chaffin, Morris, and Seely 2001; Williams and Morris 2004), ambiguous words (Rayner and Duffy 1986; Sereno, O’Donnell, and Rayner 2006), and words that are unconstrained by, or hard to predict from, their context (Ehrlich and Rayner 1981; Zola 1984; Rayner and Well 1996; Ashby, Rayner, and Clifton 2005). Beyond lexical factors, syntactic and discourse factors also affect fixation durations (Staub and Rayner 2007). Readers take longer to integrate information from important clauses and to make inferences at the ends of sentences (Just and Carpenter 1980, 329). They also need more reading time when metonymic referential descriptions and metaphorical expressions, as opposed to literal ones, appear at the beginning of a sentence (Gibbs 1990). In addition, garden-path sentences take longer to read than conventional constructions (Schotter and Rayner 2012, 91). One further study found that loosely structured text segments attract more of readers' visual attention than coherent ones (Vauras, Hyönä, and Niemi 1992).

The studies reviewed above all point the same way: fixations lengthen when more complex information is processed. Saccade duration should also be taken into account (Irwin 2004, 128), because cognitive processing sometimes takes place during saccades as well. Mishra, Bhattacharyya, and Carl (2013) treated the sum of fixation and saccade durations (FSD) as information-processing time and used it to build a Translation Difficulty Index (TDI) for sentences. They argued that the TDI is related to three properties of the source sentence: length, degree of polysemy, and structural complexity. FSD is one of the two eye-tracking measures of cognitive load used in the present study.

Besides FSD, pupil size or pupil dilation is another common indicator of the workload on a reader's cognitive system (Hvelplund 2014, 214). Hess and Polt (1964) were the first to report a positive correlation between pupil size and task difficulty. They found that, when participants solved simple multiplication problems, increases in task complexity triggered a stronger pupillary response. In a reading experiment, Just and Carpenter (1993) reported that more complex sentences required longer processing times and produced greater pupil dilation. Hyönä, Tommola, and Alaja (1995) compared simultaneous interpreting with other language-processing tasks and showed that participants' mean pupil size differed significantly between tasks of differing difficulty. For written translation, Pavlović and Jensen (2009) found greater pupil dilation during target text production than during source text comprehension. These studies all reach the same conclusion: the harder the task, the greater the pupil dilation.

The present study uses the same experimental materials as Hvelplund (2011), with the aim of re-examining the relationship between text complexity and translators' cognitive load.

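3. Research design

3.1 Participants

Participants were recruited on a voluntary basis: 26 MA students of translation (24 women and 2 men) [Note 4] at a UK university, who were regarded as representative of advanced English-Chinese translation learners. After a pilot study and screening, 22 participants were retained, with a mean age of 23.78 years (range 22–24, SD = 1.12). Their first language was Chinese and their second language English, and they had grown up in a non-bilingual environment. On average they began learning English at the age of 9.35 (range 9–10, SD = 0.43). These late bilinguals were rated as proficient in English, with a mean IELTS [Note 5] score of 7.42 (range 7–8, SD = 0.35). All participants were touch typists with normal or corrected-to-normal vision. To minimize negative effects on data quality, participants were asked not to drink coffee or alcohol before the experiment, and women were asked not to wear heavy mascara. Participants were told that anonymity and confidentiality would be ensured, and they signed an informed consent form before the experiment. Each participant received a £12 Tesco voucher as compensation. The experiment was approved by the university's research ethics committee.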

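3.2 Materials

The materials comprised one warm-up text and three experimental texts, A, B, and C (see Appendix 1). The experimental texts were used unmodified, with permission from Hvelplund (2011). They were taken from online newspapers aimed at a broad readership, and translating them required no specialist knowledge. The texts and their headlines were of similar length in words. Text complexity was measured with three indicators: readability, word frequency, and non-literalness. The linear trend across Texts A, B, and C shows that, in terms of reading difficulty, number of low-frequency words, and number of non-literal expressions, Text C is the most complex, Text A the simplest, and Text B in between (see Figure 1).

Figure 1. Overview of the three indicators of source text complexity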

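3.3 Experimental setup

To minimize the effect of lighting on the eyes, a stable light source was installed on the laboratory ceiling. All participants' eye movements were recorded with a Tobii TX300 eye tracker (300 Hz). The eye tracker was connected to the presentation screen, a 23-inch LCD monitor. The screen resolution was 1280 × 1024 pixels, and the fixation radius was the Tobii default of 35 pixels per inch. The English source text was displayed in the upper window of the user interface of the keylogging software Translog II [Note 6], in Times New Roman, 20 pt, double-spaced; the Chinese target text was displayed in the lower window, in SimSun (宋体), 20 pt, double-spaced. The fixation filter was the I-VT filter, with a minimum fixation duration of 60 ms and a velocity threshold of 30 degrees per second.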
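The two I-VT settings reported above (a 30°/s velocity threshold and a 60 ms minimum fixation duration) can be illustrated with a minimal sketch of velocity-threshold fixation identification. The function name `ivt_fixations` and the sample format (timestamp in milliseconds, gaze position in degrees of visual angle) are assumptions made for illustration only. This is not Tobii's implementation, which may apply additional processing steps that are omitted here.

```python
import math


def ivt_fixations(samples, velocity_threshold=30.0, min_duration_ms=60.0):
    """Return (start_ms, end_ms) fixations from time-ordered gaze samples.

    samples: list of (time_ms, x_deg, y_deg) tuples (hypothetical format).
    A sample-to-sample velocity below the threshold (deg/s) counts as part
    of a fixation; runs shorter than min_duration_ms are discarded.
    """
    fixations = []
    start = None  # index of the first sample of the current slow run
    for i in range(1, len(samples)):
        t0, x0, y0 = samples[i - 1]
        t1, x1, y1 = samples[i]
        velocity = math.hypot(x1 - x0, y1 - y0) / ((t1 - t0) / 1000.0)
        if velocity < velocity_threshold:
            if start is None:
                start = i - 1
        elif start is not None:
            # A fast (saccadic) step ends the current run at the previous sample.
            fixations.append((samples[start][0], t0))
            start = None
    if start is not None:
        fixations.append((samples[start][0], samples[-1][0]))
    return [(s, e) for s, e in fixations if e - s >= min_duration_ms]


# Ten stationary samples (0-90 ms), a 5-degree jump, then four more samples.
demo = [(t, 0.0, 0.0) for t in range(0, 100, 10)]
demo += [(t, 5.0, 0.0) for t in range(100, 140, 10)]
print(ivt_fixations(demo))
```

In the demo data the second run of slow samples lasts only 30 ms, so it falls below the 60 ms minimum and is dropped.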

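3.4 Procedure

Participants were tested individually in the university's eye-tracking laboratory. They first read the three texts on paper, in an order determined by a Latin square design, and rated the translation difficulty of each on a 0–10 Likert scale, where 0 means very easy and 10 means very difficult. To keep participants from processing the texts too deeply before the eye-tracking session, the rating task had to be completed within three minutes. They were then seated about 60 cm from the monitor and went through a five-point calibration and validation procedure. Once an acceptable calibration had been saved, participants translated the warm-up text and the three experimental texts (in the same order as in the rating task) without a time limit. No online or offline resources were available during the experiment. Participants could take a break between texts if they wished. Finally, they assessed the cognitive load of the translation tasks on an adapted NASA-TLX scale (see Appendix 2), the same scale used by Sun and Shreve (2014). The whole session lasted about one hour.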
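The Latin square ordering of the three texts can be sketched as follows. The cyclic construction reproduces the three orders named later in the paper (A-B-C, B-C-A, C-A-B); the round-robin assignment of orders to participants and the participant labels are assumptions for illustration, since the paper does not describe how orders were allocated.

```python
def cyclic_latin_square(items):
    """Build a cyclic Latin square: each row is the previous row rotated by one."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_orders(participant_ids, items):
    """Assign presentation orders round-robin (a hypothetical allocation rule)."""
    orders = cyclic_latin_square(items)
    return {pid: orders[i % len(orders)] for i, pid in enumerate(participant_ids)}


orders = cyclic_latin_square(["A", "B", "C"])
for order in orders:
    print("-".join(order))

# Hypothetical participant labels, used only to show the allocation.
print(assign_orders(["P1", "P2", "P3", "P4"], ["A", "B", "C"]))
```

Each text appears exactly once in each serial position across the three orders, which is what allows order effects to be separated from text effects.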

4. Results

4.1 Quality assessment of eye-tracking data

Before analyzing the eye-tracking data, we assessed their quality. Following Hvelplund (2011), we applied three quality criteria: gaze time on screen (GTS), gaze sample to fixation percentage (GFP), and mean fixation duration (MFD).

GTS is the total gaze time on the computer screen as a percentage of total task time (Hvelplund 2011, 104), calculated as (fixation duration + saccade duration) / total task time × 100%. Because cognitive processing also takes place during saccades, fixation duration alone does not fully represent processing time, so saccade duration has to be included (Irwin 2004, 126). A GTS below 73.1% (one standard deviation below the mean) was treated as invalid.

GFP describes how gaze activity is divided between fixations and saccades, calculated as fixation duration / (fixation duration + saccade duration) × 100%. According to Hvelplund (2011), a saccade share above 15% indicates that a series of gaze samples in the eye-tracking data is invalid. Following Hvelplund's suggestion, this study set a GFP of 85.2% (one standard deviation below the mean) as the threshold for valid data.

Mean fixation duration (MFD), that is, total fixation duration / number of fixations, was proposed by Rayner (1998) and is also often used as an indicator of eye-tracking data quality. In this experiment, an MFD below 241.60 ms (one standard deviation below the mean) was treated as invalid.
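The three quality measures defined above can be computed directly from per-recording fixation and saccade durations. The following is a minimal sketch; the function name `quality_metrics` and the input format (lists of durations in milliseconds) are assumptions for illustration, and the numbers in the example are made up.

```python
def quality_metrics(fixation_durations_ms, saccade_durations_ms, total_task_ms):
    """Return (GTS %, GFP %, MFD ms) for one recording.

    GTS = (fixation + saccade duration) / total task time * 100
    GFP = fixation duration / (fixation + saccade duration) * 100
    MFD = total fixation duration / number of fixations
    """
    fixation_total = sum(fixation_durations_ms)
    saccade_total = sum(saccade_durations_ms)
    gaze_total = fixation_total + saccade_total
    gts = gaze_total / total_task_ms * 100
    gfp = fixation_total / gaze_total * 100
    mfd = fixation_total / len(fixation_durations_ms)
    return gts, gfp, mfd


# Made-up example: three fixations, two saccades, 1250 ms of task time.
gts, gfp, mfd = quality_metrics([300, 250, 350], [50, 50], 1250)
print(round(gts, 1), round(gfp, 1), round(mfd, 1))
```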

Only data that met at least two of the three criteria were used in further analysis (see Hvelplund 2011). Table 1 shows that the data of two participants (I7 and I10) were judged invalid, and all of their recordings were excluded from further analysis. The proportion of invalid data was therefore 8.33%.
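The retention rule can be written as a small filter using the three thresholds reported in this section. One point is an interpretation rather than something the paper states: Table 1 shows I10 failing all three criteria only for Text A, yet all of I10's data were dropped, so the sketch assumes that a participant is excluded as soon as any one recording fails the two-of-three rule. The function names are hypothetical.

```python
GTS_MIN = 73.1    # percent, threshold reported in the paper
GFP_MIN = 85.2    # percent, threshold reported in the paper
MFD_MIN = 241.60  # milliseconds, threshold reported in the paper


def recording_is_valid(gts, gfp, mfd):
    """A recording is kept if it meets at least two of the three criteria."""
    passed = [gts >= GTS_MIN, gfp >= GFP_MIN, mfd >= MFD_MIN]
    return sum(passed) >= 2


def participant_is_valid(recordings):
    """Assumed rule: every recording of a participant must be valid."""
    return all(recording_is_valid(*rec) for rec in recordings)


# Made-up values: only MFD fails, so the recording is still kept.
print(recording_is_valid(80.0, 90.0, 200.0))
# Made-up values: the second recording fails all three criteria.
print(participant_is_valid([(80.0, 90.0, 300.0), (70.0, 80.0, 200.0)]))
```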

Table 1. Summary of the eye-tracking quality assessment, with invalid data marked ×

Participant (I)	Text A			Text B			Text C
	GTS	GFP	MFD	GTS	GFP	MFD	GTS	GFP	MFD
I3				×
I6			×			×
I7	×	×	×	×	×	×	×	×	×
I8			×			×			×
I9									×
I10	×	×	×					×
I16			×			×
I18	×			×
I24			×

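4.2 Subjective measures

4.2.1 Pre-translation ratings

As noted in Section 3.2, readability, word frequency, and non-literalness indicate that text complexity increases step by step from Text A to Text B and from Text B to Text C. Table 2 presents all participants' pre-translation difficulty ratings.

Table 2. Statistics for the pre-translation ratings of translation difficulty

Text	N	Mean	SD	Min	Max	Kendall's W	Chi-square	df	Sig.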
A	22	4.00	1.10	1.50	6.00	.699	30.775	2	.000
B	22	4.45	1.28	2.00	7.00
C	22	6.11	.72	5.00	7.50

Table 2 shows that Text A was rated slightly easier to translate than Text B and considerably easier than Text C. The mean pre-translation ratings for Texts A, B, and C were 4.00, 4.45, and 6.11, respectively, which indicates that participants perceived the texts as increasingly difficult to translate, in line with the text complexity measures.

To assess inter-rater reliability, that is, the extent to which participants agreed in their ratings of translation difficulty, we calculated Kendall's coefficient of concordance. Kendall's W = 0.699 (p < 0.01) lies at the cut-off between moderate and strong agreement [Note 7], indicating substantial agreement among participants' ratings. This result supports Sun and Shreve's (2014) finding that translators' pre-translation ratings can, to some extent, predict the level of translation difficulty.
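For readers who want to see how Kendall's W and its chi-square statistic are obtained, here is a minimal sketch. It ranks each rater's scores (tied scores share their mean rank) and applies the standard formulas W = 12S / (m²(n³ − n)) and χ² = m(n − 1)W, where m is the number of raters and n the number of rated texts. The sketch omits the correction for ties, so it can differ slightly from statistical packages when raters give tied scores; the function names and example ratings are illustrative assumptions.

```python
def average_ranks(scores):
    """Rank scores from 1 (lowest) upward; tied scores receive their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """ratings: one row per rater, one score per item. Returns (W, chi_square)."""
    m = len(ratings)       # number of raters
    n = len(ratings[0])    # number of rated items
    rank_sums = [0.0] * n
    for row in ratings:
        for item, rank in enumerate(average_ranks(row)):
            rank_sums[item] += rank
    mean_rank_sum = m * (n + 1) / 2
    s = sum((r - mean_rank_sum) ** 2 for r in rank_sums)
    w = 12 * s / (m ** 2 * (n ** 3 - n))   # no tie correction
    chi_square = m * (n - 1) * w           # df = n - 1
    return w, chi_square


# Made-up ratings from three raters who all order the texts A < B < C.
print(kendalls_w([[4, 5, 6], [3, 4.5, 7], [2, 6, 6.5]]))
```

As a consistency check on the reported figures, with m = 22 raters and n = 3 texts, W = 0.699 gives χ² = 22 × 2 × 0.699 ≈ 30.76, close to the 30.775 in Table 2; the small gap is presumably due to rounding of W.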

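4.2.2 Post-translation ratings

The post-translation ratings of translation difficulty covered four of the six NASA-TLX subscales: mental demand, effort, frustration, and performance. As in the pre-translation rating, participants rated each subscale on a 0–10 Likert scale; Table 3 presents the results.

Table 3. Statistics for the post-translation ratings of translation difficulty

Subscale	Text	Mean	Min	Max	Kendall's W	Chi-square	Sig.
Mental demand	A	4.18	1.5	6.5	.736	32.386	.000
	B	4.91	3	7
	C	6.36	4	9
Effort	A	4.39	1.5	6.5	.541	23.792	.000
	B	4.86	3	7
	C	6.34	4	9
Frustration	A	3.75	1.5	6	.681	29.949	.000
	B	4.39	1.5	6
	C	6.09	3.5	9
Performance	A	3.82	1	6.5	.342	15.027	.001
	B	4.39	3	6
	C	5.3	4	7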

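In Table 3, the means, minimums, and maximums generally rise from Text A to Text C, which suggests that participants accurately perceived the differences in translation difficulty among the three texts. For mental demand, Kendall's coefficient of concordance (W = 0.736, p < 0.01) shows that participants strongly agreed that Text C was the most difficult and Text A the easiest. For effort (W = 0.541, p < 0.01) and frustration (W = 0.681, p < 0.01), participants moderately agreed that they expended the most effort and experienced the most frustration when translating Text C, followed by Text B and then Text A. Of the four NASA-TLX subscales, self-rated performance showed the lowest level of agreement among participants (W = 0.342, p < 0.01). Hart and Staveland (1988, 165), the creators of NASA-TLX, likewise noted that self-rated performance is relatively independent of the other ratings. Our first question can therefore be answered in the affirmative: the pre-translation and post-translation ratings agree with the intrinsic complexity indicated by the quantitative textual measures.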

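4.3 Physiological measures

To examine how participants' cognitive load changed when they translated texts of differing complexity, this study also analyzed physiological data, namely fixation and saccade duration and pupil dilation.

4.3.1 Fixation and saccade duration (FSD)

Figure 2 shows that the mean total FSD rises from Text A to Text C, and that the increase from Text A to Text B is smaller than the increase from Text B to Text C. The mean FSD values of individual participants across the three texts, however, are intertwined (see Figure 3).

Figure 2. Mean FSD for Texts A, B, and C

Figure 3. Mean FSD of each participant for Texts A, B, and C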

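Table 4. Tests of normality for mean FSD

	Text	Kolmogorov-Smirnov (a)			Shapiro-Wilk
		Statistic	df	Sig.	Statistic	df	Sig.
FSD	A	.099	22	.200*	.968	22	.670
	B	.112	22	.200*	.981	22	.936
	C	.157	22	.171	.914	22	.058

a. Lilliefors significance correction.

*. This is a lower bound of the true significance.

Figure 4. Effect of text complexity on FSD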

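We therefore ran further statistical analyses to determine whether the FSD differences observed across the three texts were statistically significant. Table 4 shows that all three data sets are normally distributed (p > 0.05 in both the Kolmogorov-Smirnov and Shapiro-Wilk tests). On this basis, we built a linear mixed-effects regression (LMER) model with text complexity as a fixed effect and participant as a random effect. The results (see Figure 4) show significant FSD differences between Text A and Text C (t = 3.659, p = 0.001) and between Text B and Text C (t = 3.211, p = 0.003), but not between Text A and Text B (t = 0.447, p = 0.66).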
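One way to write down a model of this kind is as a random-intercept model. The paper does not give its exact specification, so the parameterization below is an assumption: treatment coding with Text A as the reference level, a by-participant random intercept, and the symbols chosen here for illustration.

```latex
\mathrm{FSD}_{ij} = \beta_0
  + \beta_B \,\mathbb{1}[\mathrm{text}_j = B]
  + \beta_C \,\mathbb{1}[\mathrm{text}_j = C]
  + u_i + \varepsilon_{ij},
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2),
\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here i indexes participants and j indexes texts. Under this coding, the coefficient for Text B estimates the A versus B difference and the coefficient for Text C estimates the A versus C difference; the B versus C comparison would be obtained by refitting with a different reference level or by testing the contrast between the two coefficients.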

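4.3.2 Pupil dilation

Participants' mean pupil sizes were very close across the three texts: Text A (2.99), Text B (2.99), and Text C (2.98). To look for latent differences in pupil size across the three texts, we ran a one-way ANOVA after checking normality and homogeneity of variance. The Kolmogorov-Smirnov test (see Table 5) shows that all three data sets are normally distributed (Text A: Z = 0.852; Text B: Z = 0.794; Text C: Z = 0.682; p > 0.05). The test of homogeneity of variance (see Table 6) shows equal variances across groups (p > 0.05). The one-way ANOVA (see Table 7) shows no statistically significant difference in mean pupil size across the three texts (F = 0.009, p > 0.05).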

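Table 5. Test of normality for participants' pupil size when translating the three texts

		Text A	Text B	Text C
N		22	22	22
Normal parameters (a)	Mean	2.992	2.986	2.981
	Std. deviation	.277	.270	.267
Most extreme differences	Absolute	.182	.169	.145
	Positive	.182	.165	.145
	Negative	−.156	−.169	−.131
Kolmogorov-Smirnov Z		.852	.794	.682
Sig. (2-tailed)		.462	.554	.741

a. Test distribution is normal.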

Table 6. Test of homogeneity of variance (pupil size)

Levene statistic	df1	df2	Sig.
.007	2	63	.993

Table 7. ANOVA of participants' pupil size when translating Texts A, B, and C

	Sum of squares	df	Mean square	F	Sig.
Between groups	.001	2	.001	.009	.991
Within groups	4.650	63	.074
Total	4.651	65
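The quantities in an ANOVA table of this kind follow from a short calculation: sums of squares between and within groups, their degrees of freedom, and the F ratio of the two mean squares. The sketch below computes them for arbitrary groups; the function name and the small example data are assumptions for illustration and are not the study's pupil measurements. Computing the p-value would additionally require the F distribution, which is left out here.

```python
def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_statistic)."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    ss_within = sum(
        (x - mean) ** 2
        for group, mean in zip(groups, group_means)
        for x in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_statistic


# Made-up data: three groups of three observations each.
print(one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```

With three groups of 22 observations, the same formulas give df = 2 between groups and df = 63 within groups, matching Table 7, and the within-groups mean square is 4.650 / 63 ≈ 0.074.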

Although this result is broadly consistent with Hvelplund (2011), it is surprising that mean pupil size was smallest when participants translated Text C, the most complex text. To determine whether an adaptation effect (Hyönä, Tommola, and Alaja 1995; O’Brien 2006) influenced the experiment, we regrouped the mean pupil sizes into three sets according to the order in which the texts were translated (A-B-C, B-C-A, and C-A-B). Figure 5 shows that, in every order, mean pupil size was largest for the first text translated. A Friedman test shows that task order had a significant effect on participants' pupil size (p = 0.028), whereas text complexity did not (p = 0.483).

Figure 5. Mean pupil size by translation order
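A Friedman test on repeated measurements can be sketched as follows. Each row holds one participant's values for the three serial positions (first, second, and third text translated); values are ranked within each row and the statistic Q = 12 / (nk(k + 1)) · ΣR² − 3n(k + 1) is computed from the column rank sums R. The sketch assumes no tied values within a row and uses the chi-square approximation, whose p-value for k = 3 conditions (df = 2) is exactly exp(−Q/2). The function name and the example values are illustrative assumptions, not the study's data.

```python
import math


def friedman_test(rows):
    """rows: one list per participant, one value per condition.

    Returns (Q, p). The p-value uses the chi-square approximation and is
    only computed for k = 3 conditions; otherwise p is None.
    Assumes no ties within a row.
    """
    n = len(rows)
    k = len(rows[0])
    rank_sums = [0] * k
    for row in rows:
        ordered = sorted(row)
        for condition, value in enumerate(row):
            rank_sums[condition] += ordered.index(value) + 1
    q = 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)
    p = math.exp(-q / 2) if k == 3 else None
    return q, p


# Made-up pupil sizes for four participants: largest for the first text.
demo_rows = [[3.1, 3.0, 2.9], [3.3, 3.2, 3.1], [2.9, 2.8, 2.7], [3.0, 2.95, 2.9]]
q, p = friedman_test(demo_rows)
print(round(q, 3), round(p, 4))
```

With only a handful of participants the chi-square approximation is rough; statistical packages may use exact or corrected procedures instead.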

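4.4 Relationship between subjective and physiological measures

This section reports on the relationship between participants' perception of translation difficulty and their cognitive load as reflected in the eye-tracking data. We assumed that the post-translation ratings reflect participants' perceptions more accurately than the pre-translation ratings, because participants are likely to judge the difficulty of the texts more accurately after translating them. The post-translation ratings for mental demand, effort, frustration, and self-rated performance were therefore used in the further statistical analysis.

The normality test (see Table 8) shows that none of the four post-translation subscales violates the assumption of normality (p > 0.05). Scatter plots show a positive linear relationship between NASA-TLX (the mean of the four subscales) and FSD in all three texts (see Figure 6).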

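Table 8. Test of normality for the post-translation ratings

		Mental demand	Effort	Frustration	Performance
N		66	66	66	66
Normal parameters (a)	Mean	5.152	5.197	4.742	4.500
	Std. deviation	1.468	1.422	1.574	1.237
Most extreme differences	Absolute	.156	.138	.132	.157
	Positive	.132	.133	.132	.157
	Negative	−.156	−.138	−.126	−.146
Kolmogorov-Smirnov Z		1.266	1.122	1.072	1.275
Sig. (2-tailed)		.081	.161	.200	.077

a. Test distribution is normal.

Figure 6. Relationship between NASA-TLX ratings (mean) and FSD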

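Because the NASA-TLX and FSD data met three requirements (ratio-level data, a linear relationship, and normal distribution), we ran Pearson correlation tests to measure the relationship between the subjective measures and the eye-tracking data. Most of the Pearson coefficients (see Table 9) are above 0.40, with one exception of 0.389 (performance and FSD in Text C). Apart from three p-values slightly above 0.05 (.054, .055, and .073 in Table 9), the four NASA-TLX subscales correlate positively and significantly with the eye-tracking data (p < 0.05). In short, nearly all subscales of the subjective measure show a moderate or strong positive correlation with the eye-tracking data as represented by FSD.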

Table 9. Pearson correlation tests between the subjective measures and FSD*

Subscale	Statistic	Text A	Text B	Text C
Mental demand	Pearson's r	.582	.499	.584
Mental demand	Sig. (2-tailed)	.004	.018	.004
Effort	Pearson's r	.529	.605	.642
Effort	Sig. (2-tailed)	.011	.003	.001
Frustration	Pearson's r	.564	.417	.443
Frustration	Sig. (2-tailed)	.006	.054	.039
Performance	Pearson's r	.474	.415	.389
Performance	Sig. (2-tailed)	.026	.055	.073

* Correlation coefficients range from +1 through 0 to −1, where +1 indicates a perfect positive relationship, −1 a perfect negative relationship, and 0 no relationship. Evans (1996) classifies the strength of a correlation by the absolute value of r: 0.00–0.19, very weak; 0.20–0.39, weak; 0.40–0.59, moderate; 0.60–0.79, strong; 0.80–1.0, very strong.
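The correlation coefficient and the Evans (1996) strength bands from the table note can be sketched in a few lines. The function names are assumptions for illustration; `pearson_r` implements the textbook formula and does not compute the significance level.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def evans_label(r):
    """Classify |r| using the bands attributed to Evans (1996) in the note."""
    strength = abs(r)
    if strength < 0.20:
        return "very weak"
    if strength < 0.40:
        return "weak"
    if strength < 0.60:
        return "moderate"
    if strength < 0.80:
        return "strong"
    return "very strong"


# Coefficients taken from Table 9 (Text C column).
for name, r in [("mental demand", 0.584), ("effort", 0.642), ("performance", 0.389)]:
    print(name, evans_label(r))
```

Applied to Table 9, the bands place most coefficients in the moderate range, two effort coefficients (.605 and .642) in the strong range, and the performance coefficient for Text C (.389) just inside the weak range.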

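5. Discussion

This study set out to examine whether different levels of text complexity, as indicated by readability, word frequency, and non-literalness, are related to translators' perception of translation difficulty and in turn lead to different levels of cognitive effort. The first and third research questions have clear answers. The quantitative measures of intrinsic text complexity agree with participants' self-assessed translation difficulty, and the cognitive load measured by NASA-TLX correlates moderately and positively with the cognitive load indicated by FSD. The diverging results of the physiological measures make the second question harder to answer. The FSD data confirm that changes in cognitive load are related to text complexity, but pupil size did not change accordingly. The three kinds of measures are discussed in turn below.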

5.1 Quantitative textual measures

The quantitative measures of intrinsic text complexity focus on linguistic features, namely word frequency, readability, and non-literalness. Although readability as a single indicator correlates only weakly with translation difficulty (Sun and Shreve 2014), the validity of these three properties as indicators of text complexity is supported to some extent by our results.

For example, as shown in Section 4.2, they correlate clearly with participants' judgments, and the cognitive load indicated by FSD rises as text complexity increases. Each of the three elements captures only one aspect of a particular point in the text, but their combination reflects, as a whole, their interactive effect on text difficulty. As Carpenter and Just (1989, 61) point out, readability reflects not only the difficulty of a particular part of a text but also how that difficulty affects the maintenance of other information. The complexity of a task thus depends largely on two conditions: the number of elements that have to be processed simultaneously and the degree of interactivity among them. In cognitive load theory, element interactivity is used as the basic and characteristic mechanism for gauging intrinsic cognitive load. The more elements interact, the heavier the load on working memory (Sweller 2010, 123–124). The combination of the three elements, and their linear increase across levels of complexity, shows that together they indicate text complexity more effectively, and this is what participants perceived.

The pupil size data, however, provide no strong evidence that more complex texts produce greater cognitive load. This result agrees with Hvelplund (2011); it may be due to shortcomings of the indicator itself or to the way translators allocate their cognitive resources, as explained in Section 5.2.

5.2 Physiological measures

As FSD reveals, the demand for cognitive effort increases with text difficulty. The reason may lie in the interactive effect of linguistic features on text complexity. Judged on a single factor such as non-literalness, there is a large gap in complexity between Text A and the other two texts (see Figure 1). Combined with readability and word frequency, however, non-literalness affects cognitive load differently, as the pre-translation ratings show (see Table 2). The cognitive load perceived for Text C (mean 6.11) is much higher than for Text A (mean 4.00) and Text B (mean 4.45), whereas the difference between Text A and Text B is small. This agrees with the cognitive effort indicated by FSD.

Moreover, to minimize the adverse effects of external factors, participants translated without any aids. Under these conditions they concentrated on comprehending and transferring the text rather than on searching external resources and choosing the best candidate solution to a translation problem. Cognitive effort can therefore be taken to reflect, without distortion, the cognitive load the texts imposed on participants. This explains why FSD was significantly longer for Text C than for Texts A and B, while FSD did not differ significantly between Texts A and B.

Changes in cognitive load, on the other hand, could not be detected in the pupil size data. According to Iqbal, Zheng, and Bailey (2004) and Schultheis and Jameson (2004), pupil dilation is not necessarily sensitive to changes in task load. Indeed, both the present study and Hvelplund (2011) suggest that pupil size may not be an appropriate indicator of cognitive load for translation tasks of relatively long duration. There are several reasons for this.

First, pupil dilation is affected by many factors, such as ambient illumination, task complexity, gaze angle, and the intake of coffee or alcohol. Some of these factors were controlled: lighting and screen brightness were kept constant, and participants were not allowed to consume coffee or alcohol. Others are hard to control, such as changes in gaze angle. Even when touch typing is required, participants habitually or occasionally glance at the keyboard while translating, which shifts the gaze position slightly. In addition, when participants move their eyes during the experiment, their pupils are at varying angles and distances relative to the eye tracker's camera, which in turn makes the measurement of pupil size inconsistent. Pomplun and Sunkara (2003, 542) note that this effect is particularly pronounced when the camera is positioned below the eyes.

Second, this study did not examine pupil dilation at specific points of difficulty, such as non-literal expressions. Locally relevant pupil dilation may therefore be hidden in the average over the whole text. Some difficult words or expressions may impose a heavier cognitive load on participants without producing a larger mean pupil diameter for the text as a whole. This is in line with Schultheis and Jameson's (2004) finding that changes in pupil size correspond to the cognitive load of subtasks rather than of the task as a whole. They concluded that, between easier and harder tasks, pupil size differs only at certain stages of the task. For pupil size to serve as an indicator of cognitive load, Schultheis and Jameson (2004, 227) proposed that at least three of the following five conditions be met: (1) constant illumination; (2) avoidance of eye movements; (3) use of non-visual stimuli (e.g., auditory stimuli); (4) use of many similar, short tasks; and (5) evaluation only of averages over tasks and participants. Our study met only two of these conditions, the first and the last. This may explain why we found little effect of text difficulty on pupil size.

5.3 Subjective judgment

The agreement between subjective judgments and quantitative textual measures in assessing difficulty, together with the positive correlation between self-reports and physiological measures, suggests that self-reports on rating scales may be a more reliable way of assessing translation difficulty. This result is consistent with Paas (1992) and Paas and Van Merriënboer (1994b), who regard self-reports as reliable, unintrusive, and sensitive to relatively small differences in cognitive load (Sweller, Van Merrienboer, and Paas 1998, 268).

These findings and those of Sun and Shreve (2014) indicate that subjective judgments are easy to obtain and relatively trustworthy, and can be regarded as an important tool for assessing translation difficulty. The method may not be flawless, however. Individual differences such as previous experience, depth of background knowledge, and domain skills (Liu and Chiu 2011, 152) can produce large differences in perceived text difficulty. Besides the unavoidable subjectivity of individuals' judgments of their own task-handling ability, individual predictions are easily influenced by external factors, such as working conditions and translation requirements arising from routine practice or client needs. Some researchers hold that subjective feelings of difficulty depend essentially on time pressure during task performance (Cain 2007, 8). Individual assessments also seem not to take into account the effects of unconscious or automatic processes. In view of this, subjective judgment can serve as a reliable global method (Johannsen 1979), but only as a rough indicator of stress levels with little diagnostic value (Cain 2007, 8).

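6. Conclusion

To find a convenient and valid way of measuring translation difficulty, this paper investigated text complexity and cognitive load by comparing subjective indicators (pre-translation and post-translation ratings), quantitative textual indicators (readability, word frequency, and non-literalness), and physiological indicators (FSD and pupil dilation). The findings are as follows.

First, the subjective ratings confirm the validity of readability, word frequency, and non-literalness as quantitative textual indicators of intrinsic difficulty. This result suggests, at the least, that the interaction of several quantitative textual indicators can assess text complexity effectively.

Second, the comparable levels of translation difficulty and cognitive load indexed by readability, word frequency, and non-literalness are reflected in the subjective NASA-TLX ratings. On the one hand, participants' self-assessments of translation difficulty agree with the quantitative assessment of text complexity. On the other hand, the post-translation ratings of mental demand, effort, frustration, and performance show a moderate to strong positive correlation with the cognitive load indicated by FSD. So although subjective judgment has its flaws and cannot account for subconscious or automatic translation processes, it may still be the more cost-effective way of assessing the translation difficulty of a text.

Third, the effect of text complexity on translators' cognitive load shows up in FSD but not in pupil size. Pupil size is more susceptible to the order in which texts are presented than to text complexity. More experimental evidence is needed to establish the relationship between eye-movement measures and text complexity.

The results of this study can help establish methods for testing the difficulty of translation materials, so that translation teachers or assessors can set levels of translation difficulty in translation teaching. We are aware, however, of the study's limitations: the small number of source texts, the variation in text type, and the single group of student participants. Future research could use more varied task types and participants with different levels of expertise. A comparative analysis of eye-movement data for the source and target texts might clarify to what extent translation difficulty is a comprehension or a production phenomenon. Think-aloud protocols, Translog data, and quality assessment data could also be used to strengthen the triangulation of data.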

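Acknowledgments

This research was funded by the Social Science Planning Program of Shandong Province (14CWXJ29) and Durham University Seedcorn Research Funding (04.14.290201). We sincerely thank Professor Ricardo Muñoz Martín, the anonymous reviewers, and the editors for their constructive feedback, which improved the quality of this article.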

Notes

1. Jensen and Hvelplund, as cited in this article, are the same person.

2. In this article, cognitive load refers to the demands that the translation task places on the translator's cognitive resources, whereas cognitive effort refers to the cognitive resources the translator actually invests during translation.

3. The NASA Task Load Index (NASA-TLX) is a multidimensional scale developed by Hart and Staveland (1988) to measure subjective workload. It comprises six workload-related subscales: mental demand, physical demand, temporal demand, effort, performance, and frustration level. Each subscale is divided into 20 equal intervals anchored by bipolar descriptors (e.g., low/high, good/poor).

4. The gender imbalance was not expected to affect the results (Hvelplund 2011).

5. The International English Language Testing System (IELTS) is one of the most widely used English proficiency tests for higher education and global migration. Scores are reported on a band scale from 1 (lowest) to 9 (highest).

6. Translog II was used only to display the source text and to enter the target text. Translog data were not analyzed in this project.

7. Kendall's W ranges from 0 (no agreement) to 1 (complete agreement). Participants' responses are considered to show very strong agreement if W is between 0.91 and 1, strong agreement between 0.71 and 0.90, moderate agreement between 0.51 and 0.70, weak agreement between 0.31 and 0.50, and a lack of agreement between 0.0 and 0.30 (LeBreton and Senter 2008, 836).

参考文献

Akbari, Alireza, and Winibert Segers

2017 “Translation Difficulty: How to Measure and What to Measure.” Lebende Sprachen 62 (1): 3–29.

① ②

Ashby, Jane, Keith Rayner, and Charles Clifton

2005 “Eye Movements of Highly Skilled and Average Readers: Differential Effects of Frequency and Predictability.” The Quarterly Journal of Experimental Psychology Section A 58 (6): 1065–1086.

①

Cain, Brad

2007 A Review of the Mental Workload Literature. Technical Report, Defence Research and Development Canada Toronto.

① ②

Campbell, Stuart

1999 “A Cognitive Approach to Source Text Difficulty in Translation.” Target 11 (1): 33–63.

①

Carpenter, Patricia A., and Marcel A. Just

1989 “The Role of Working Memory in Language Comprehension.” In Complex Information Processing: The Impact of Herbert A. Simon, edited by David Klahr and Kenneth Kotovsk, 31–68. Hillsdale, NJ: Lawrence Erlbaum Associates Publishers.

①

Chaffin, Roger, Robin K. Morris, and Rachel E. Seely

2001 “Learning New Word Meanings from Context: A Study of Eye Movements.” Journal of Experimental Psychology: Learning, Memory, and Cognition 27 (1): 225–235.

①

Ehrlich, Susan F., and Keith Rayner

1981 “Contextual Effects on Word Perception and Eye Movements during Reading.” Journal of Verbal Learning and Verbal Behavior 20 (6): 641–655.

①

Evans, James D.

1996 Straightforward Statistics for the Behavioral Sciences. Pacific Grove, CA: Brooks/Cole Publishing Co.

①

Gibbs Jr, Raymond W.

1990 “Comprehending Figurative Referential Descriptions.” Journal of Experimental Psychology: Learning, Memory and Cognition 16 (1): 56–66.

Hale, Sandra, and Stuart Campbell

2002 “The Interaction between Text Difficulty and Translation Accuracy.” Babel 8 (1): 14–33.

Hart, Sandra G., and Lowell E. Staveland

1988 “Development of NASA-TLX (Task Load Index): Results of Empirical and Theoretical Research.” In Human Mental Workload, edited by Peter A. Hancock and Najmedin Meshkati, 139–183. Amsterdam: North-Holland.

Hess, Eckhard H., and James M. Polt

1964 “Pupil Size in Relation to Mental Activity during Simple Problem-Solving.” Science 143 (3611): 1190–1192.

Hvelplund, Kristian Tangsgaard

2011 Allocation of Cognitive Resources in Translation: An Eye-tracking and Key-logging Study. PhD thesis. Copenhagen Business School.

2014 “Eye Tracking and the Translation Process: Reflections on the Analysis and Interpretation of Eye-tracking Data.” In Minding Translation / Con la traducción en mente, edited by Ricardo Muñoz Martín, 201–224. San Vicente del Raspeig: Publicaciones de la Universidad de Alicante.

Hyönä, Jukka, Jorma Tommola, and Anna-Mari Alaja

1995 “Pupil Dilation as a Measure of Processing Load in Simultaneous Interpretation and Other Language Tasks.” The Quarterly Journal of Experimental Psychology 48 (3): 598–612.

Inhoff, Albrecht Werner

1984 “Two Stages of Word Processing during Eye Fixations in the Reading of Prose.” Journal of Verbal Learning and Verbal Behavior 23 (5): 612–624.

Iqbal, Shamsi T., Xianjun Sam Zheng, and Brian P. Bailey

2004 “Task-evoked Pupillary Response to Mental Workload in Human-Computer Interaction.” In CHI’04 Extended Abstracts on Human Factors in Computing Systems, edited by Elizabeth Dykstra-Erickson and Manfred Tscheligi, 1477–1480. Vienna.

Irwin, David E.

2004 “Fixation Location and Fixation Duration as Indices of Cognitive Processing.” In The Interface of Language, Vision, and Action: Eye Movements and the Visual World, edited by John Henderson and Fernanda Ferreira, 105–134. New York: Psychology Press.

Jensen, Kristian T. H.

2009 “Indicators of Text Complexity.” In Behind the Mind: Methods, Models and Results in Translation Process Research, edited by Susanne Göpferich, Arnt L. Jakobsen, and Inger M. Mees, 61–80. Copenhagen: Samfundslitteratur.

Johannsen, Gunnar

1979 “Workload and Workload Measurement.” In Mental Workload: Its Theory and Measurement, edited by Neville Moray, 3–11. New York: Springer Science & Business Media.

Just, Marcel A., and Patricia A. Carpenter

1980 “A Theory of Reading: From Eye Fixations to Comprehension.” Psychological Review 87 (4): 329–354.

1993 “The Intensity Dimension of Thought: Pupillometric Indices of Sentence Processing.” Canadian Journal of Experimental Psychology 47 (2): 310–339.

LeBreton, James M., and Jenell L. Senter

2008 “Answers to 20 Questions about Interrater Reliability and Interrater Agreement.” Organizational Research Methods 11 (4): 815–852.

Liu, Minhua, and Yu-Hsien Chiu

2011 “Assessing Source Material Difficulty for Consecutive Interpreting.” In Interpreting Chinese, Interpreting China, edited by Robin Setton, 135–156. Amsterdam: John Benjamins.

Mishra, Abhijit, Pushpak Bhattacharyya, and Michael Carl

2013 “Automatically Predicting Sentence Translation Difficulty.” In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 346–351. Sofia.

O’Brien, Sharon

2006 “Eye-tracking and Translation Memory Matches.” Perspectives 14 (3): 185–205.

Paas, Fred G. W. C.

1992 “Training Strategies for Attaining Transfer of Problem-solving Skill in Statistics: A Cognitive-load Approach.” Journal of Educational Psychology 84 (4): 429–434.

Paas, Fred G. W. C., and Jeroen J. G. van Merriënboer

1994a “Instructional Control of Cognitive Load in the Training of Complex Cognitive Tasks.” Educational Psychology Review 6 (4): 351–371.

1994b “Variability of Worked Examples and Transfer of Geometrical Problem-solving Skills: A Cognitive-load Approach.” Journal of Educational Psychology 86 (1): 122–133.

Pavlović, Nataša, and Kristian Jensen

2009 “Eye Tracking Translation Directionality.” In Translation Research Projects 2, edited by Anthony Pym and Alexander Perekrestenko, 93–109. Tarragona: Intercultural Studies Group.

Pomplun, Marc, and Sindhura Sunkara

2003 “Pupil Dilation as an Indicator of Cognitive Workload in Human-computer Interaction.” In Proceedings of the 10th International Conference on HCI (Vol.3), 542–546. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.

Rayner, Keith

1998 “Eye Movements in Reading and Information Processing: 20 Years of Research.” Psychological Bulletin 124 (3): 372–422.

Rayner, Keith, and Arnold D. Well

1996 “Effects of Contextual Constraint on Eye Movements in Reading: A Further Examination.” Psychonomic Bulletin & Review 3 (4): 504–509.

Rayner, Keith, and Susan A. Duffy

1986 “Lexical Complexity and Fixation Times in Reading: Effects of Word Frequency, Verb Complexity, and Lexical Ambiguity.” Memory & Cognition 14 (3): 191–201.

Rayner, Keith, and Martin H. Fischer

1996 “Mindless Reading Revisited: Eye Movements during Reading and Scanning Are Different.” Perception & Psychophysics 58 (5): 734–747.

Rayner, Keith, and Gary E. Raney

1996 “Eye Movement Control in Reading and Visual Search: Effects of Word Frequency.” Psychonomic Bulletin & Review 3 (2): 245–248.

Rayner, Keith, Sara C. Sereno, and Gary E. Raney

1996 “Eye Movement Control in Reading: A Comparison of Two Types of Models.” Journal of Experimental Psychology: Human Perception and Performance 22 (5): 1188–1200.

Schotter, Elizabeth R., and Keith Rayner

2012 “Eye Movements in Reading: Implications for Reading Subtitles.” In Eye Tracking in Audiovisual Translation, edited by Elisa Perego, 83–104. Roma: Aracne Editrice.

Schultheis, Holger, and Anthony Jameson

2004 “Assessing Cognitive Load in Adaptive Hypermedia Systems: Physiological and Behavioural Methods.” In Adaptive Hypermedia and Adaptive Web-based Systems, edited by Wolfgang Nejdl and Paul De Bra, 225–234. Berlin: Springer.

Sereno, Sara C., Patrick J. O’Donnell, and Keith Rayner

2006 “Eye Movements and Lexical Ambiguity Resolution: Investigating the Subordinate-bias Effect.” Journal of Experimental Psychology: Human Perception and Performance 32 (2): 335–350.

Sharmin, Selina, Oleg Spakov, Kari-Jouko Räihä, and Arnt L. Jakobsen

2008 “Where on the Screen Do Translation Students Look While Translating, and for How Long?” In Looking at Eyes: Eye-Tracking Studies of Reading and Translation Processing, edited by Arnt L. Jakobsen, Susanne Göpferich, and Inger M. Mees, 31–51. Copenhagen: Samfundslitteratur.

Staub, Adrian, and Keith Rayner

2007 “Eye Movements and On-Line Comprehension Processes.” In The Oxford Handbook of Psycholinguistics, edited by M. Gareth Gaskell and Gerry Altmann, 327–342. Oxford: Oxford University Press.

Sun, Sanjun

2015 “Measuring Translation Difficulty: Theoretical and Methodological Considerations.” Across Languages and Cultures 16 (1): 29–54.

Sun, Sanjun, and Gregory M. Shreve

2014 “Measuring Translation Difficulty: An Empirical Study.” Target 26 (1): 98–127.

Sweller, John

2010 “Element Interactivity and Intrinsic, Extraneous, and Germane Cognitive Load.” Educational Psychology Review 22 (2): 123–138.

Sweller, John, Jeroen J. G. van Merriënboer, and Fred G. W. C. Paas

1998 “Cognitive Architecture and Instructional Design.” Educational Psychology Review 10 (3): 251–296.

Vauras, Marja, Jukka Hyönä, and Pekka Niemi

1992 “Comprehending Coherent and Incoherent Texts: Evidence from Eye Movement Patterns and Recall Performance.” Journal of Research in Reading 15 (1): 39–54.

Williams, Rihana, and Robin Morris

2004 “Eye Movements, Word Familiarity, and Vocabulary Acquisition.” European Journal of Cognitive Psychology 16 (1–2): 312–339.

Zola, David

1984 “Redundancy and Word Perception during Reading.” Perception & Psychophysics 36 (3): 277–284.

Appendix I

Warm-up text

A wedding now costs $35,000

Source: Daily Mail Online (3 February 2017)

Study reveals tying the knot costs more than ever as couples look to make their ceremony more lavish. The average wedding last year cost $35,329, it’s been revealed.

Record-breaking figure comes from The Knot 2016 Real Weddings Study. It’s a jump from $32,641 – the average cost of a wedding in the 2015 study. The most expensive place to tie the knot is in Manhattan ($78,464) and the cheapest is in Arkansas ($19,522).

Characters (with spaces): 415

Headline characters (with spaces): 27

Experimental texts (see Jensen 2009)

(Text A) Killer nurse receives four life sentences

Source: The Independent (4 March 2008)

Hospital nurse Colin Norris was imprisoned for life today for the killing of four of his patients. 32 year old Norris from Glasgow killed the four women in 2002 by giving them large amounts of sleeping medicine. Yesterday, he was found guilty of four counts of murder following a long trial. He was given four life sentences, one for each of the killings. He will have to serve at least 30 years. Police officer Chris Gregg said that Norris had been acting strangely around the hospital. Only the awareness of other hospital staff put a stop to him and to the killings. The police have learned that the motive for the killings was that Norris disliked working with old people. All of his victims were old weak women with heart problems. All of them could be considered a burden to hospital staff.

Characters (with spaces): 837

Headline characters (with spaces): 41

(Text B) Families hit with increase in cost of living

Source: The Times (12 February 2008)

British families have to cough up an extra £1,300 a year as food and fuel prices soar at their fastest rate in 17 years. Prices in supermarkets have climbed at an alarming rate over the past year. Analysts have warned that prices will increase further still, making it hard for the Bank of England to cut interest rates as it struggles to keep inflation and the economy under control. To make matters worse, escalating prices are racing ahead of salary increases, especially those of nurses and other healthcare professionals, who have suffered from the government’s insistence that those in the public sector have to receive below-inflation salary increases. In addition to fuel and food, electricity bills are also soaring. Five out of the six largest suppliers have increased their customers’ bills.

Characters (with spaces): 846

Headline characters (with spaces): 44

(Text C) Spielberg shows Beijing red card over Darfur

Source: The Daily Telegraph (13 February 2008)

In a gesture sure to rattle the Chinese Government, Steven Spielberg pulled out of the Beijing Olympics to protest against China’s backing for Sudan’s policy in Darfur. His withdrawal comes in the wake of fighting flaring up again in Darfur and is set to embarrass China, which has sought to halt the negative fallout from having close ties to the Sudanese government. China, which has extensive investments in the Sudanese oil industry, maintains close links with the Government, which includes one minister charged with crimes against humanity by the International Criminal Court in The Hague. Although emphasizing that Khartoum bears the bulk of the responsibility for these ongoing atrocities, Spielberg maintains that the international community, and particularly China, should do more to end the suffering.

Characters (with spaces): 856

Headline characters (with spaces): 44

Appendix II

Adapted NASA Task Load Index for measuring translation difficulty (see Sun and Shreve 2014)

Address for correspondence

Yanmei Liu

Shandong University of Finance and Economics

Co-author information

Binghan Zheng

Elvet Riverside

School of Modern Languages and Cultures, Durham University

https://orcid.org/0000-0001-5302-4709

Hao Zhou

Durham University
