2026 Volume 35 Issue 4 Pages 177-192
This study examined quantitative and qualitative changes in vocabulary production among learners of Japanese through a paragraph writing activity conducted at a university in China. It also compared evaluations by human teachers with those produced by generative AI (GenAI) tools, namely ChatGPT and DeepSeek. Morphological analyses showed a gradual increase in learners’ vocabulary use and sentence length, with a notable increase in the use of native Japanese words (Wago). Qualitative analyses indicated that human teachers provided the most flexible and nuanced assessments, effectively capturing subtle differences in learners’ vocabulary choices and emotional expressions. Although the GenAI assessments correlated with human evaluations, ChatGPT tended to give cautious, conservative ratings, which limited its ability to capture subtle expressive nuances. DeepSeek’s assessments were somewhat closer to those of human teachers but still lacked sufficient flexibility. These findings highlight the importance of human empathy and nuanced understanding in assessing vocabulary skills, and they suggest that effective evaluation should attend to learners’ emotional expressions and personal growth in addition to quantitative measures.