Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
38th (2024)
Session ID : 2I5-GS-10-04

Proposal of a Question-Answering System Using RAG with Cost Reduction Techniques
*Rei MASUDA, Kazuma IWAMOTO, Yusei MICHINOBU, Kenji KITA, Ichitoshi TAKEHARA, Kazuaki ANDO, Hitoshi KAMEI, Keizo SAISHO, Koji KIDA
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Retrieval-Augmented Generation (RAG) is a technique that enables question answering over internal organizational documents by supplying external information to a large language model (LLM). In recent years, question-answering services that combine ChatGPT with RAG have become increasingly common. However, when a high-performance model such as GPT-4 is used at scale, API costs rise with the number of input tokens. This study proposes an additional step in which a lower-cost model, such as GPT-3.5, extracts only the necessary information from the documents before the response is generated. This step reduces the number of tokens used during response generation and is thereby intended to lower the operating cost of GPT-4. The proposed method is compared with conventional methods to assess its effectiveness, and the results indicate that it reduces cost while maintaining accuracy.
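
The two-stage idea described in the abstract can be illustrated with a small cost model. The sketch below is not the authors' implementation: the function names, the whitespace-based token count, the keyword-overlap extractor standing in for the low-cost model, and the per-token prices are all hypothetical assumptions chosen for illustration. It only compares input-token cost and ignores output tokens.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ModelPricing:
    name: str
    usd_per_1k_input_tokens: float  # hypothetical placeholder, not a real price


def count_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (an assumption)."""
    return len(text.split())


def input_cost(pricing: ModelPricing, prompt: str) -> float:
    """Input-side cost of sending one prompt to a model."""
    return count_tokens(prompt) / 1000 * pricing.usd_per_1k_input_tokens


def build_prompt(question: str, passages: List[str]) -> str:
    return "Question: " + question + "\nContext:\n" + "\n".join(passages)


def _words(text: str) -> set:
    return {w.strip(".,?!").lower() for w in text.split()}


def keyword_extractor(question: str, document: str) -> str:
    """Stand-in for the low-cost model: keep sentences that share a
    word longer than three characters with the question."""
    keys = {w for w in _words(question) if len(w) > 3}
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    kept = [s for s in sentences if keys & _words(s)]
    return (". ".join(kept) + ".") if kept else ""


def compare_input_costs(
    question: str,
    documents: List[str],
    cheap: ModelPricing,
    expensive: ModelPricing,
    extract: Callable[[str, str], str] = keyword_extractor,
) -> Dict[str, float]:
    # Conventional RAG: every full document goes to the expensive model.
    conventional = input_cost(expensive, build_prompt(question, documents))
    # Proposed: the cheap model reads each full document once ...
    extraction = sum(
        input_cost(cheap, build_prompt(question, [d])) for d in documents
    )
    # ... and only the non-empty extracts go to the expensive model.
    extracts = [e for e in (extract(question, d) for d in documents) if e]
    generation = input_cost(expensive, build_prompt(question, extracts))
    return {"conventional": conventional, "proposed": extraction + generation}
```

Under this model the extra step pays off only when the cheap model's price is sufficiently lower than the expensive model's and the extraction discards enough text; with similar prices or little discarded text, the added extraction pass can cost more than it saves.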

© 2024 The Japanese Society for Artificial Intelligence