ARELE: Annual Review of English Language Education in Japan
Online ISSN : 2432-0412
Print ISSN : 1344-8560
ISSN-L : 1344-8560
Research Articles
Rater Reliability in Classroom Speaking Assessment in a Japanese Senior High School

2021 Volume 32 Pages 129-144


When teachers score classroom speaking tests, intensive rater training before the test is not always possible. This study examines the extent to which rater reliability can be maintained with a simple rubric and without detailed rater training. We analyzed four speaking tests administered to senior high school students (N = 116) over seven months: an individual presentation, a paired role play, and two group discussions. Each test was scored with a simple rubric by two or more raters who had not received intensive rater training. The data were analyzed using many-facet Rasch measurement and generalizability theory. The results suggest that, in general, raters scored similarly and consistently. At the overall test level, the number of raters required to maintain sufficient reliability (Φ = .70) ranged from one to four, with the group discussion tests requiring more raters or intensive rater training. Pedagogical implications for allocating limited time and raters are discussed.
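
To illustrate how a "number of raters required" figure of this kind can be derived, the following is a minimal sketch of a decision-study calculation in generalizability theory. It assumes a simple fully crossed persons × raters design, which may differ from the design used in the study. The function names and all variance components are hypothetical and are not taken from the article. Under that assumption, the index of dependability is Φ = σ²(p) / (σ²(p) + (σ²(r) + σ²(pr,e)) / n), where n is the number of raters.

```python
def phi_coefficient(var_person, var_rater, var_residual, n_raters):
    """Index of dependability (Phi) for a crossed persons x raters design.

    Absolute error variance pools the rater main effect and the
    person-by-rater/residual term, each divided by the number of raters.
    """
    absolute_error = (var_rater + var_residual) / n_raters
    return var_person / (var_person + absolute_error)


def raters_needed(var_person, var_rater, var_residual,
                  target=0.70, max_raters=20):
    """Smallest number of raters whose Phi reaches the target, or None."""
    for n in range(1, max_raters + 1):
        if phi_coefficient(var_person, var_rater, var_residual, n) >= target:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical variance components, chosen only for illustration.
    components = {"var_person": 0.50, "var_rater": 0.05, "var_residual": 0.45}
    for n in range(1, 5):
        print(n, round(phi_coefficient(n_raters=n, **components), 3))
    print("raters needed:", raters_needed(**components))
```

With larger rater or residual variance relative to person variance, the same calculation calls for more raters, which is one way to read the finding that the group discussion tests required more raters or intensive rater training.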

© 2021 The Japan Society of English Language Education