Abstract
This study examined essay ratings from the ICNALE GRA database, focusing on the relationship between holistic and analytic evaluations and on the role of rater characteristics. In the first analysis, holistic scores correlated strongly with analytic scores (r = 0.74–0.86), and the analytic ratings explained 90% of the variance in the holistic scores. “Intelligibility” emerged as the most influential factor, suggesting that holistic evaluation may approximate analytic results when rubrics are based on “Language,” “Content,” and “Attitude.” The second analysis compared rater groups. Raters whose first language is English awarded higher scores than those whose first language is Japanese, and female raters awarded higher scores than male raters, while rating experience made no difference. Moreover, the English L1 group emphasized content-related criteria such as “Purposefulness” and “Sophistication,” whereas the Japanese L1 group prioritized linguistic “Complexity” and “Accuracy.” These findings indicate that rater background shapes both evaluation focus and scoring patterns, highlighting the need for rater training and calibration to ensure fairness.