This study evaluates ChatGPT’s ability to generate Japanese in text-to-text generation tasks. ChatGPT is a large language model that can be adapted to a variety of natural language processing tasks in an interactive manner. While its language-generation ability has been quantitatively evaluated on a variety of English tasks, it has not yet been fully evaluated for Japanese. This paper reports the results of evaluating ChatGPT’s Japanese generation ability on typical text-to-text generation tasks such as machine translation, summarization, and text simplification, comparing it with conventional supervised methods. Experimental results showed that ChatGPT underperformed existing supervised models in automatic evaluation on all tasks, but tended to outperform those models in human evaluation. Our detailed analysis revealed that while ChatGPT generally outputs high-quality Japanese sentences, it fails to meet some of the specific requirements of each task.
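To make the evaluation setup concrete, the sketch below illustrates one way such an experiment could be run for the machine translation task: prompt ChatGPT to translate into Japanese and score the output with an automatic metric. This is not the authors' code; the model name, prompt wording, and choice of BLEU (via sacrebleu with Japanese tokenization) are assumptions for illustration only, since the abstract does not specify the metrics used.

```python
"""Minimal sketch of a ChatGPT-based English-to-Japanese translation
evaluation pipeline. Assumes an OpenAI API key in the environment and the
sacrebleu / mecab-python3 packages; all concrete settings are illustrative."""

import sacrebleu
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def translate_to_japanese(source: str, model: str = "gpt-3.5-turbo") -> str:
    """Ask the chat model to translate one English sentence into Japanese."""
    response = client.chat.completions.create(
        model=model,  # hypothetical choice; the paper may have used a different model
        messages=[
            {"role": "system",
             "content": "Translate the user's English sentence into Japanese."},
            {"role": "user", "content": source},
        ],
        temperature=0,  # keep outputs as deterministic as possible for evaluation
    )
    return response.choices[0].message.content.strip()


# Toy parallel data; a real evaluation would use an established test set.
sources = ["The weather is nice today."]
references = ["今日は天気が良いです。"]

hypotheses = [translate_to_japanese(s) for s in sources]

# ja-mecab tokenization segments Japanese text into words before BLEU scoring.
bleu = sacrebleu.corpus_bleu(hypotheses, [references], tokenize="ja-mecab")
print(f"BLEU: {bleu.score:.2f}")
```

A supervised baseline (e.g., a fine-tuned encoder-decoder translation model) would be scored on the same test set with the same metric, which is the kind of comparison the automatic-evaluation results above refer to.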