Bilingual raters play an important role in assessing spoken-language interpreting between two languages, X and Y. Raters whose dominant language (DL) is X and whose less dominant language is Y may differ in their rating processes from raters whose DL is Y and less dominant language is X, whether they are assessing X-to-Y or Y-to-X interpreting. Raters' language background and its interaction with interpreting directionality may therefore influence assessment outcomes, yet this interaction and its effects on assessment have not been investigated. We conducted the current experiment to explore how raters' language background and interpreting directionality affect the assessment of English-Chinese, two-way interpreting. Our analyses of the quantitative data indicate that, when assessing interpreting into their mother tongue or DL, raters displayed greater self-confidence and self-consistency but rated performance more harshly. These statistically significant group-level disparities led to different assessment outcomes: pass and fail rates varied depending on the rater group. These quantitative findings, coupled with the raters' qualitative comments, may have implications for the selection and training of bilingual raters for interpreting assessment.
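To make the reported pattern concrete, the following minimal Python sketch (illustrative only, with entirely hypothetical numbers; the abstract does not disclose the study's analytic method or data) shows how a rater group that is harsher but more self-consistent can yield a lower pass rate than a more lenient, noisier group when the same cut score is applied to both.

# Illustrative sketch only: all figures below are hypothetical, not the study's data.
# It demonstrates how a group-level severity difference between two rater groups
# can shift pass/fail rates at a fixed cut score, and how intra-rater
# self-consistency can be quantified as a test-retest correlation.

import numpy as np

rng = np.random.default_rng(0)

n_candidates = 200
true_ability = rng.normal(70, 10, n_candidates)  # latent performance quality

# Hypothetical rater groups: Group A rates into its dominant language and is
# harsher (negative severity offset) but more consistent (smaller rating noise).
groups = {
    "A (into DL: harsher, more consistent)": {"severity": -4.0, "noise": 3.0},
    "B (out of DL: lenient, less consistent)": {"severity": +2.0, "noise": 6.0},
}

cut_score = 65.0  # the same fixed pass mark applied to both groups

for name, g in groups.items():
    # Two rating occasions per group, used to estimate self-consistency.
    rate1 = true_ability + g["severity"] + rng.normal(0, g["noise"], n_candidates)
    rate2 = true_ability + g["severity"] + rng.normal(0, g["noise"], n_candidates)
    pass_rate = np.mean(rate1 >= cut_score)
    consistency = np.corrcoef(rate1, rate2)[0, 1]  # intra-rater (test-retest) r
    print(f"{name}: mean score {rate1.mean():.1f}, "
          f"pass rate {pass_rate:.0%}, self-consistency r = {consistency:.2f}")

Running this toy simulation, the harsher group fails more candidates at the identical cut score while showing the higher test-retest correlation, mirroring the abstract's point that group-level severity differences alone can change assessment outcomes.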