Abstract
This paper introduces one possible method of developing valid and reliable rating scales on the basis of actual performance assessment data. The original impetus for the study was the recognized need for an objective rating scale for the quality of English pronunciation of students in a university-level teacher training course. To develop the scale, descriptors (i.e., short descriptions) that characterize levels of pronunciation quality were first extracted from existing scales of language proficiency. Then, four high school teachers and two of the authors of the present paper used these descriptors in actual performance assessment, and the results were analyzed using Item Response Theory (IRT). Two conclusions were reached: (1) carefully developed descriptors can successfully convey certain meanings to their users (or raters); and (2) the difficulty of what each descriptor means can be made explicit on the basis of empirical indices estimated by means of IRT.