Article published in:
Interpreting
Vol. 25:1 (2023), pp. 109–143
References
Banerjee, S. & Lavie, A.
(2005) METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, 65–72. [URL]
Callison-Burch, C., Osborne, M. & Koehn, P.
(2006) Re-evaluating the role of BLEU in machine translation research. Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics, 249–256. [URL]
Chen, J., Yang, H-B. & Han, C.
(2021) Holistic versus analytic scoring of spoken-language interpreting: A multi-perspectival comparative analysis. Manuscript submitted for publication.
Christodoulides, G. & Lenglet, C.
(2014) Prosodic correlates of perceived quality and fluency in simultaneous interpreting. In N. Campbell, D. Gibbon & D. Hirst (Eds.), Proceedings of the 7th Speech Prosody Conference, 1002–1006. [URL]
Chung, H-Y.
(2020) Automatic evaluation of human translation: BLEU vs. METEOR. Lebende Sprachen 65 (1), 181–205.
Coughlin, D.
(2003) Correlating automated and human assessments of machine translation quality. [URL]
Devlin, J., Chang, M-W., Lee, K. & Toutanova, K.
(2018) BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 4171–4186. [URL]
Doddington, G.
(2002) Automatic evaluation of machine translation quality using N-gram co-occurrence statistics. Proceedings of the Second International Conference on Human Language Technology Research, 138–145.
Ginther, A., Dimova, S. & Yang, R.
(2010) Conceptual and empirical relationships between temporal measures of fluency and oral English proficiency with implications for automated scoring. Language Testing 27 (3), 379–399.
Han, C.
(2022a) Interpreting testing and assessment: A state-of-the-art review. Language Testing 39 (1), 30–55.
Han, C. & Lu, X-L.
(2021a) Interpreting quality assessment re-imagined: The synergy between human and machine scoring. Interpreting and Society 1 (1), 70–90.
(2021b) Can automated machine translation evaluation metrics be used to assess students’ interpretation in the language learning classroom? Computer Assisted Language Learning, 1–24.
Han, C. & Xiao, X-Y.
(2021) A comparative judgment approach to assessing Chinese Sign Language interpreting. Language Testing, 1–24.
Han, C., Hu, J. & Deng, Y.
(forthcoming) Effects of language background and directionality on raters’ assessments of spoken-language interpreting: An exploratory experimental study. Revista Española de Lingüística Aplicada.
Han, C., Chen, S-J., Fu, R-B. & Fan, Q.
International School of Linguists
(2020) Diploma in Public Service Interpreting learner handbook. London, UK. [URL]
Koehn, P.
(2010) Statistical machine translation. Cambridge: Cambridge University Press.
Le, N-T., Lecouteux, B. & Besacier, L.
(2018) Automatic quality estimation for speech translation using joint ASR and MT features. Machine Translation 32 (4), 325–351.
Lee, J.
(2008) Rating scales for interpreting performance assessment. The Interpreter and Translator Trainer 2 (2), 165–184.
Liu, M-H.
(2013) Design and analysis of Taiwan’s interpretation certification examination. In D. Tsagari & R. van Deemter (Eds.), Assessment issues in language translation and interpreting. Frankfurt: Peter Lang, 163–178.
Liu, Y-M.
(2021) Exploring a corpus-based approach to assessing interpreting quality. In J. Chen & C. Han (Eds.), Testing and assessment of interpreting: Recent developments in China. Singapore: Springer, 159–178.
Loper, E. & Bird, S.
(2002) NLTK: The Natural Language Toolkit. Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics, 63–70.
Mathur, N., Wei, J., Freitag, M., Ma, Q-S. & Bojar, O.
(2020) Results of the WMT20 metrics shared task. Proceedings of the Fifth Conference on Machine Translation, 688–725. [URL]
NAATI
(2019) Certified interpreter test assessment rubrics. [URL]
Ouyang, L-W., Lv, Q-X. & Liang, J-Y.
(2021) Coh-Metrix model-based automatic assessment of interpreting quality. In J. Chen & C. Han (Eds.), Testing and assessment of interpreting: Recent developments in China. Singapore: Springer, 179–200.
Papineni, K., Roukos, S., Ward, T. & Zhu, W-J.
(2002) BLEU: A method for automatic evaluation of machine translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 311–318. [URL]
Reiter, E.
(2018) A structured review of the validity of BLEU. Computational Linguistics 44 (3), 393–401.
Sellam, T., Das, D. & Parikh, A. P.
(2020) BLEURT: Learning robust metrics for text generation. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 7881–7892. [URL]
Setton, R. & Dawrant, A.
(2016) Conference interpreting: A trainer’s guide. Amsterdam & Philadelphia: John Benjamins.
Snover, M., Dorr, B., Schwartz, R., Micciulla, L. & Makhoul, J.
(2006) A study of translation edit rate with targeted human annotation. Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, 223–231. [URL]
Stewart, C., Vogler, N., Hu, J-J., Boyd-Graber, J. & Neubig, G.
(2018) Automatic estimation of simultaneous interpreter performance. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. [URL]
Su, W.
(2019) Exploring native English teachers’ and native Chinese teachers’ assessment of interpreting. Language and Education 33 (6), 577–594.
Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., Davison, J., Shleifer, S., von Platen, P., Ma, C., Jernite, Y., Plu, J., Xu, C-W., Le Scao, T., Gugger, S., Drame, M., Lhoest, Q. & Rush, A.
(2020) Transformers: State-of-the-art natural language processing. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 38–45. [URL]
Wu, S. C.
(2010) Assessing simultaneous interpreting: A study on test reliability and examiners’ assessment behavior. [URL]
Wu, Z-W.
(2021) Chasing the unicorn? The feasibility of automatic assessment of interpreting fluency. In J. Chen & C. Han (Eds.), Testing and assessment of interpreting: Recent developments in China. Singapore: Springer, 143–158.
Yang, L-Y.
(2015) An exploratory study of fluency in English output of Chinese consecutive interpreting learners. Journal of Zhejiang International Studies University (1), 60–68.
Zhang, M.
(2013) Contrasting automated and human scoring of essays. R&D Connections 21. [URL]