Performance assessment has attracted much attention in various assessment fields, such as entrance examinations, employee evaluation, and educational assessment. It makes it possible to assess examinees’ practical and higher-order skills, which are difficult to assess with traditional paper tests. In a typical performance assessment, examinees’ performances on multiple tasks are evaluated by multiple raters. However, it has been pointed out that the reliability of such assessments depends strongly on the characteristics of the raters and tasks. As a method of improving reliability, item response models that incorporate rater and task characteristic parameters have been proposed. Earlier studies reported that these models can improve the reliability of performance assessment because they estimate examinees’ ability while taking rater and task characteristics into account. When applying them to actual performance assessments, it is important to select the model best suited to the assessment situation. Therefore, this paper reviews previous item response models that incorporate rater and task characteristic parameters and explains their characteristics. Furthermore, it proposes an approach for selecting an optimal model for a given assessment situation. Finally, it demonstrates the effectiveness of the models through an application to real data.