Latent trait estimates of rater reliability in the international English Language Testing System (IELTS)

Griffin, Patrick

doi:10.1075/aral.13.2.01gri

Article published in:

Australian Review of Applied Linguistics
Vol. 13:2 (1990), pp. 1–22

Patrick Griffin | Phillip Institute of Technology

The study examined the effects of fixed criteria, training and moderation on the reliability of ratings assigned to written scripts. Using an item response analysis, the consistency of inter- and intra-rater reliability of scoring patterns was examined under changing conditions. Ratings were assigned twice under workshop conditions and once under unsupervised, isolated conditions. The workshops were used to identify the criteria used by raters and then to obtain an agreed set of criteria through a consensus moderation approach. Results indicate that raters are influenced by their backgrounds, by the moderation procedure and by the criteria, depending on the circumstances under which the ratings were assigned. However, a lack of fit of the ratings to a single-dimension model over time suggests that raters may change their criteria under different conditions. Although similar ratings may be assigned, different criteria are employed by the same rater over time. The results seriously question the use of classical measurement approaches in the assessment of rater reliability.

Published online: 1 January 1990

