Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
38th (2024)
Session ID : 2G5-GS-6-02
Data Augmentation with ChatGPT for Efficient Evaluation of Large Language Models in Data-Scarce Environments
*HANHUA ZHU
Abstract

In recent years, Large Language Models (LLMs) have developed rapidly and now play a significant role in Natural Language Processing (NLP). However, there is no established standard for efficiently evaluating these LLMs, which often generate complex sentences. Evaluation methods that use trained Language Models (LMs) are popular because of their low cost, but their accuracy suffers when training data is scarce. I propose a data augmentation method that uses ChatGPT to improve the accuracy of LM-based evaluators under data scarcity. Results on a Japanese Question Answering (QA) task show that an LM trained on questions and answers generated by the proposed method surpassed ChatGPT-3.5 and reached 92% of the evaluation performance of ChatGPT-4, even when only source documents were available.
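The abstract does not detail the augmentation pipeline. A minimal sketch of the general idea it describes — prompting an LLM such as ChatGPT to generate question-answer pairs from raw documents, then using those pairs as training data for an evaluator LM — might look like the following. The prompt wording, function names, and Q:/A: parsing format are illustrative assumptions, not the author's implementation.

```python
# Hypothetical sketch of document-to-QA augmentation (names and prompt
# format are assumptions; the paper's actual pipeline may differ).

PROMPT_TEMPLATE = (
    "Read the following document and write {n} question-answer pairs.\n"
    "Format each pair as:\nQ: <question>\nA: <answer>\n\n"
    "Document:\n{doc}"
)

def build_prompt(document: str, n: int = 3) -> str:
    """Build the augmentation prompt for one source document."""
    return PROMPT_TEMPLATE.format(n=n, doc=document)

def parse_qa_pairs(llm_output: str) -> list[tuple[str, str]]:
    """Parse 'Q:'/'A:' lines from an LLM response into (question, answer) pairs."""
    pairs, question = [], None
    for line in llm_output.splitlines():
        line = line.strip()
        if line.startswith("Q:"):
            question = line[2:].strip()
        elif line.startswith("A:") and question is not None:
            pairs.append((question, line[2:].strip()))
            question = None
    return pairs

# Example with a mocked LLM response (no API call is made here):
mock_response = (
    "Q: What is the capital of Japan?\nA: Tokyo.\n"
    "Q: Who wrote Kokoro?\nA: Natsume Soseki."
)
print(parse_qa_pairs(mock_response))
```

In a real pipeline, the parsed pairs would be collected across all available documents and used to fine-tune the evaluator LM.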

© 2024 The Japanese Society for Artificial Intelligence