Chapter 2
Considerations in developing vertical scales for language tests
This chapter provides a framework for building vertical
scales, for language assessments in general and for the
TOEFL® Family of Assessments in particular. Topics
covered include aspects of vertical scale design (growth definitions,
vertical articulation, data collection), statistical methods for
vertical linking, and evaluation and maintenance of the resulting
vertical scale. The chapter also discusses challenges associated with
vertical scaling noted in the research literature, both in general and
as they pertain to language proficiency assessments.
Article outline
• Introduction
• Vertical scale design
    • Growth definitions
    • Vertical articulation
    • Data collection design
• Statistical methods for vertical linking
    • Hieronymus scaling
    • Thurstone scaling
    • IRT scaling
        • IRT scaling Decision 1: Choice of model
        • IRT scaling Decision 2: Separate vs. concurrent calibration
        • IRT scaling Decision 3: Scores
• Evaluation of a vertical scale
• Maintenance of the vertical scale
• Challenges with vertical scaling
• Conclusion

References