Over the past decade, interpretation assessment has played an increasingly important role in interpreter education, professional certification, and interpreting research. The time-honored assessment method is based on analysis of (para)linguistic features of interpretation (such as omissions, substitutions, filled and unfilled pauses, and self-corrections). Recently, the use of descriptor-based rating scales to assess interpretation has emerged as a viable alternative (e.g., Angelelli 2009; Han 2015, 2016; J. Lee 2008; Tiselius 2009), arguably providing a basis for reliable, valid and practical assessments. However, little work has been done in interpreting studies to ascertain the assumed benefits of this emerging assessment practice. Based on a review of 17 international peer-reviewed journals over a twelve-year period (2004–2015), and of other related publications (e.g., scholarly books, reports, documents), this article provides an overview of practices in scale-based interpretation assessment, focusing on four major aspects: (a) rating scales; (b) raters; (c) rating procedures; (d) reporting of assessment outcomes. Problem areas and possible emerging trends in interpretation assessment are examined, and a number of future research needs are identified.
(2011) Striving for an ‘A’ grade: A case study in performance management of interpreters. International Journal of Interpreter Education 31, 56–71.
Brennan, R. L.
(2001) Generalizability theory. New York: Springer.
(1986) Linguistic (semantic) and extra-linguistic (pragmatic) criteria for the evaluation of conference interpretation and interpreters. Multilingua 5 (4), 231–235.
Campbell, S. & Hale, S.
(2003) Translation and interpreting assessment in the context of educational measurement. In G. Anderman & M. Rogers (Eds.), Translation today: Trends and perspectives. Clevedon: Multilingual Matters, 205–224.
Carroll, J. B.
(1966) An experiment in evaluating the quality of translations. Mechanical Translation and Computational Linguistics 9 (3–4), 55–66.
(2011) Technical report on the development and pilot testing of the CCHI examinations. Washington, DC: Certification Commission for Healthcare Interpreters. [URL] (accessed 22 May 2015).
(2012) Technical report on the development and pilot testing of the Certified Healthcare Interpreter™ (CHI™) examination for Arabic and Mandarin. Washington, DC: Certification Commission for Healthcare Interpreters. [URL] (accessed 22 May 2015).
(2009) Authenticity in accreditation tests for interpreters in China. The Interpreter and Translator Trainer 3 (2), 257–273.
(2014) Monolingual short courses for language-specific accreditation: Can they work? A Sydney experience. The Interpreter and Translator Trainer 8 (2), 1–23.
Hale, S., Garcia, I., Hlavac, J., Kim, M., Lai, M., Turner, B. & Slatyer, H.
(2012) Development of a conceptual overview for a new model for NAATI standards, testing and assessment. Sydney, Australia. [URL] (accessed 22 May 2015).
Hamidi, M. & Pöchhacker, F.
(2007) Simultaneous consecutive interpreting: A new technique put to the test. Meta 52 (2), 276–289.
(2013) A cross-national overview of translator and interpreter certification procedures. Translation & Interpreting 51, 32–65.
(2014) Measuring rater variability in interpreter performance testing: Using classical test theory, G theory and Rasch measurement. Paper presented at the Biennial Conference of the Association for Language Testing and Assessment of Australia and New Zealand at the University of Queensland, 27–29 November 2014.
(2005) Self-assessment as an autonomous learning tool in an interpretation classroom. Meta 50 (4).
Lim, H. -O.
(2006) A comparison of curricula of graduate schools of interpretation and translation in Korea. Meta 51 (2), 215–228.
Lin, I. I., Chang, F. A. & Kuo, F.
(2013) The impact of non-native accented English on rendition accuracy in simultaneous interpreting. Translation & Interpreting 5 (2), 30–44.
(2013) Design and analysis of Taiwan’s interpretation certification examination. In D. Tsagari & R. van Deemter (Eds.), Assessment issues in language translation and interpreting. Frankfurt: Peter Lang, 163–178.
Liu, M., Chang, C. -C. & Wu, S. -C.
(2008) Interpretation evaluation practices: Comparison of eleven schools in Taiwan, China, Britain, and the USA. Compilation and Translation Review 1 (1), 1–42.
Llewellyn, J. P.
(1981) Simultaneous interpreting. In J. K. Woll & M. Deuchar (Eds.), Perspectives on British Sign Language & Deafness. London: Croom Helm, 89–104.
Lu, M., Liu, C. & Gong, X. F.
(2007) 全国翻译专业资格(水平)考试英语口译试题命制一致性研究报告. [How to maintain consistency in CATTI’s interpretation tests: A research report]. 中国翻译, 51, 57–61.
Lunz, M. E. & Stahl, J. A.
(1990) Judge consistency and severity across grading periods. Evaluation and the Health Professions 13 (4), 425–444.
Lynch, B. K. & McNamara, T. F.
(1998) Using G-theory and many-facet Rasch measurement in the development of performance assessments of the ESL speaking skills of immigrants. Language Testing 15 (2), 158–180.
(2010) Development and validation of oral and written examinations for medical interpreter certification: Technical report. Burbank, CA. [URL] (accessed 22 May 2015).
PSI Services LLC
(2013) Development and validation of oral examinations for medical interpreter certification: Mandarin, Russian, Cantonese, Korean, and Vietnamese forms. [URL] (accessed 22 May 2015).
(2010) The impact of fluency on the subjective assessment of interpreting quality. The Interpreters’ Newsletter 151, 101–115.
Ribas, M. A.
(2010) Formative assessment in the interpreting classroom: Using the portfolio with students beginning simultaneous interpreting. Current Trends in Translation Teaching and Learning 31, 97–131.
Roat, C. E.
(2006) Certification of health care interpreters in the United States: A primer, a status report and considerations for national certification. Los Angeles, CA. [URL] (accessed 22 May 2015).
(2013) Certification of social interpreters in Flanders, Belgium: Assessment and politics. In D. Tsagari & R. van Deemter (Eds.), Assessment issues in language translation and interpreting. Frankfurt: Peter Lang, 179–197.
(2013) Rethinking bifurcated testing models in the court interpreter certification process. In D. Tsagari & R. van Deemter (Eds.), Assessment issues in language translation and interpreting. Frankfurt: Peter Lang, 67–84.
Wang, B. H.
(2007) 口译能力评估和译员能力评估 – 口译的客观评估模式初探. [From interpreting competence to interpreter competence – a tentative model for objective assessment of interpreting]. 外语界 31, 44–50.
Wang, B. H.
(2011) 口译能力的评估模式及测试设计再探 – 以全国英语口译大赛为例. [Exploration of the assessment model and test design of interpreting competence]. 外语界 11, 66–71.
Wang, J. -H., Napier, J., Goswell, D. & Carmichael, A.
(2015) The design and application of rubrics to assess signed language interpreting performance. The Interpreter and Translator Trainer 9 (1), 83–103.
Wang, M. W. & Stanley, J. C.
(1970) Differential weighting: A review of methods and empirical studies. Review of Educational Research 41, 663–705.
(1993) Exploring bias analysis as a tool for improving rater consistency in assessing oral interaction. Language Testing 10 (3), 305–319.
Wu, J., Liu, M. & Liao, C.
(2013) Analytic scoring in interpretation test: Construct validity and the halo effect. In H. -H. Liao, T. -E. Kao & Y. Lin (Eds.), The making of a translator: Multiple perspectives. Taipei: Bookman, 277–292.
Wu, S. C.
(2010) Assessing simultaneous interpreting: A study on test reliability and examiners’ assessment behavior. PhD thesis, Newcastle University.
Wu, S. C.
(2013) How do we assess students in the interpreting examinations? In D. Tsagari & R. van Deemter (Eds.), Assessment issues in language translation and interpreting. Frankfurt: Peter Lang, 15–33.
Xi, X. -M. & Mollaun, P.
(2006) Investigating the utility of analytic scoring for the TOEFL Academic Speaking Test (TAST). [URL] (accessed 15 June 2015).
Yan, J. X., Pan, J. & Wang, H. -H.
(2010) Learner factors, self-perceived language ability and interpreting learning: An investigation of Hong Kong tertiary interpreting classes. The Interpreter and Translator Trainer 4 (2), 173–196.
Yan, J. X., Pan, J., Wu, H. & Wang, Y.
(2013) Mapping interpreting studies: The state of the field based on a database of nine major translation and interpreting journals (2000–2010). Perspectives 21 (3), 446–473.
Yeh, S. -P. & Liu, M.
(2006) 口譯評分客觀化初探：採用量表的可能性 [A more objective approach to interpretation evaluation: Exploring the use of scoring rubrics]. 國立編譯館館刊 34 (4), 57–78.
(2013) The development of certification for healthcare interpreters in the United States. Translation & Interpreting 5 (1), 114–126.
(2011) Authentic and valid assessment: Assessing the performance of public service interpreters. Investigation in University Teaching and Learning 71, 99–105.
Zhao, N. & Dong, Y. P.
(2013) 基于多面Rasch模型的交替传译测试效度验证. [Validation of a consecutive interpreting test based on multi-faceted Rasch model]. 解放军外国语学院学报 36 (1), 86–90.