JLTA Journal (Journal of the Japan Language Testing Association)
Online ISSN : 2189-9746
Print ISSN : 2189-5341
ISSN-L : 2189-5341
Research Article
Examining the Scoring Validity of an AI-Powered Automated Evaluation and Scoring System for English Writing
杉田 由仁

2024, Vol. 27, pp. 3-24

Abstract

The main purpose of this research is to examine the extent to which we can depend on the scores (scoring validity) produced by an AI-powered essay-scoring system for a task-based writing test (TBWT). The TBWT contains two elicitation tasks: Task 1 focuses on Accuracy and Task 2 on Communicability. Japanese high school students participated in the present study. They took the TBWT online and completed a survey on their opinions of the grades assigned by the system. To examine scoring validity, a two-way mixed-design ANOVA was conducted on their TBWT scores. The results indicated that 1) the five groups created on the basis of the grades differed significantly in both Accuracy and Communicability grades, 2) the cut-off scores for the levels of different words in a text (type levels) should be adjusted for Accuracy grades, and 3) the cut-off scores for type levels and quality of ideas need to be adjusted for Communicability grades. However, the survey results show that most of the students agreed with their grades of A, B+, B, B-, and C, as determined by the cut-off score set for each grade. These findings are discussed from the point of view of further improving the system.
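The abstract reports a two-way mixed-design ANOVA on the TBWT scores (grade group as the between-subjects factor, task as the within-subjects factor). As a point of reference only, the following is a minimal sketch of how such an analysis could be run in Python with the pingouin library; the data, column names (student, grade_group, task, score), and group sizes are hypothetical illustrations and are not taken from the study.

# Minimal sketch of a two-way mixed-design ANOVA with pingouin.
# All data below are simulated; column names are hypothetical.
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(0)

# Long-format data: each student (between-subjects factor: grade_group)
# has one score per task (within-subjects factor: task).
groups = ["A", "B+", "B", "B-", "C"]
rows = []
for g_idx, g in enumerate(groups):
    for s in range(10):  # 10 simulated students per grade group
        sid = f"{g}_{s}"
        base = 80 - 10 * g_idx  # higher grade groups get higher mean scores
        for task in ["Task1_Accuracy", "Task2_Communicability"]:
            rows.append({"student": sid,
                         "grade_group": g,
                         "task": task,
                         "score": base + rng.normal(0, 5)})
df = pd.DataFrame(rows)

# Two-way mixed-design ANOVA:
#   between-subjects factor = grade_group, within-subjects factor = task
aov = pg.mixed_anova(data=df, dv="score", within="task",
                     subject="student", between="grade_group")
print(aov.round(3))

The output table lists the main effects of grade_group and task and their interaction; in the study's terms, a significant between-group effect would correspond to the reported differences among the five grade-based groups.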


This article is provided under the Creative Commons [Attribution - NonCommercial - NoDerivatives 4.0 International] license.
https://creativecommons.org/licenses/by-nc-nd/4.0/deed.ja