Developing rating scales for the assessment of second language performance

Turner, Carolyn E.; Upshur, John A.

doi:10.1075/aralss.13.04tur

Article published in:

The Language Testing Cycle: From inception to washback
Edited by Gillian Wigglesworth and Catherine Elder
[Australian Review of Applied Linguistics. Series S 13] 1996
pp. 55–79

Carolyn E. Turner | McGill University, Montreal

John A. Upshur | Concordia University, Montreal

The two most common approaches to rating second language performance pose problems of reliability and validity. An alternative method uses rating scales that are empirically derived from samples of learner performance. These scales define boundaries between adjacent score levels rather than provide normative descriptions of ideal performances, and the rating process requires making two or three binary choices about the language performance being rated. A rating scale is constructed through a procedure consisting of a series of five explicit tasks. Each scale is designed for use with a specific population and a specific test task.
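The binary-choice rating process described above can be pictured as a small decision tree. The Python sketch below is illustrative only and is not the authors' scale. It assumes a six-level scale, and every question is an invented placeholder; in the method described, the actual questions would be derived from samples of learner performance. The names Choice, SCALE and rate are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union


@dataclass(frozen=True)
class Choice:
    """One yes/no question marking a boundary between score levels."""
    question: str
    if_yes: Union["Choice", int]  # next question, or a final score
    if_no: Union["Choice", int]


# Hypothetical six-level scale. All question wording is invented for
# illustration and does not come from the article.
SCALE = Choice(
    "Q1 (placeholder): is the performance in the upper half of the samples?",
    if_yes=Choice(
        "Q2 (placeholder): does it cross the boundary into the top level?",
        if_yes=6,
        if_no=Choice("Q3 (placeholder): is it above the 4/5 boundary?", 5, 4),
    ),
    if_no=Choice(
        "Q4 (placeholder): does it fall below the boundary of the lowest level?",
        if_yes=1,
        if_no=Choice("Q5 (placeholder): is it above the 2/3 boundary?", 3, 2),
    ),
)


def rate(judge: Callable[[str], bool], scale: Choice = SCALE) -> Tuple[int, int]:
    """Ask `judge` each question in turn; return (score, number of choices)."""
    node: Union[Choice, int] = scale
    asked = 0
    while isinstance(node, Choice):
        node = node.if_yes if judge(node.question) else node.if_no
        asked += 1
    return node, asked


# A rater's answers, keyed by the question label at the start of each prompt.
answers = {"Q1": True, "Q2": False, "Q3": True}
score, n_choices = rate(lambda q: answers[q.split()[0]])
print(score, n_choices)  # prints "5 3"
```

With three levels on each side of the first boundary, every rating in this sketch is reached after two or three yes/no decisions, which matches the number of binary choices the abstract mentions.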

A group of primary school ESL teachers used this procedure to develop two speaking tests, including elicitation tasks and rating scales, for use in their school district. The tests were administered to 255 sixth-grade learners. The scales were found to be highly accurate for scoring short speech samples and quite efficient in the time required for scale development and rater training. The scales exhibit content relevance in the instructional setting. Development of this type of scale is recommended for use in high-stakes assessment.

Published online: 1 January 1996
