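Construction and Optimization of a Difficulty Assessment Model for Japanese Texts for Learners: Exploring Difficulty-Related Factors and Clarifying Their Predictive Effects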

劉 婧怡

doi:10.24701/mathling.34.6_405

Abstract

This study aims to identify the linguistic elements related to the readability of Japanese texts for second language learners, and to construct a readability assessment model that is both accurate and explainable enough to meet the needs of language education. Specifically, 86 linguistic features across 5 categories were extracted from Japanese textbooks with assigned difficulty levels, and readability models were constructed and evaluated. Of the 4 classification models compared for automatic difficulty assessment, SVM (Support Vector Machine) performed best, with an accuracy (ACC) of 0.898 in judging the readability of Japanese texts. Furthermore, stepwise feature selection identified 35 highly relevant factors, yielding a model that maintained an accuracy of 0.880 while being simpler and more explainable. Additionally, readability scores were quantified from three perspectives to visualize the prediction results. Thus, the readability model developed in this study not only demonstrated high predictive accuracy, but also provided the explainability desired in the field of language education.

This article is provided under a Creative Commons [Attribution-NonCommercial-NoDerivatives 4.0 International] license.
https://creativecommons.org/licenses/by-nc-nd/4.0/deed.ja
