Article published in: The Language Testing Cycle: From inception to washback
Edited by Gillian Wigglesworth and Catherine Elder
[Australian Review of Applied Linguistics. Series S 13] 1996
pp. 55–79
Developing rating scales for the assessment of second language performance
The two most common approaches to rating second language performance pose problems of reliability and validity. An alternative method uses rating scales that are empirically derived from samples of learner performance. These scales define the boundaries between adjacent score levels rather than provide normative descriptions of ideal performances, and the rating process requires two or three binary choices about the performance being rated. A rating scale is constructed through a procedure consisting of a series of five explicit tasks, and each scale is designed for use with a specific population and a specific test task. A group of primary school ESL teachers used this procedure to develop two speaking tests, including elicitation tasks and rating scales, for use in their school district. The tests were administered to 255 sixth-grade learners. The scales were found to be highly accurate for scoring short speech samples and efficient in the time required for scale development and rater training. The scales also exhibit content relevance in the instructional setting. Development of this type of scale is recommended for use in high-stakes assessment.
Published online: 01 January 1996