コスト削減手法を取り入れたRAGによる質問応答システムの提案

増田 嶺; 岩本 和真; 道信 祐成; 北 健志; 竹原 一駿; 安藤 一秋; 亀井 仁志; 最所 圭三; 喜田 弘司

doi:10.11517/pjsai.JSAI2024.0_2I5GS1004

38th (2024)

Session ID : 2I5-GS-10-04


Conference information

Host: The Japanese Society for Artificial Intelligence

Name : The 38th Annual Conference of the Japanese Society for Artificial Intelligence

Number : 38


Date : May 28, 2024 - May 31, 2024

Proposal of a Question-Answering System Using RAG with Cost Reduction Techniques

*Rei MASUDA, Kazuma IWAMOTO, Yusei MICHINOBU, Kenji KITA, Ichitoshi TAKEHARA, Kazuaki ANDO, Hitoshi KAMEI, Keizo SAISHO, Koji KIDA


Keywords: Retrieval-Augmented Generation, ChatGPT


Abstract

Retrieval-Augmented Generation (RAG) is a technique that enables question answering over an organization's internal documents by supplying external information to a large language model (LLM). In recent years, question-answering services that combine ChatGPT with RAG have become increasingly common. However, using a high-performance model such as GPT-4 at scale drives up API costs, because the number of input tokens grows. This study proposes an additional step in which a lower-cost model, such as GPT-3.5, extracts only the necessary information from the documents before the response is generated. This step aims to reduce the number of tokens passed to GPT-4 during response generation, thereby potentially lowering its operational cost. The proposed method is compared with conventional methods to assess its effectiveness. The results indicate that the proposed method reduces costs while maintaining accuracy.
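The extraction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give prompts, retrieval details, or prices, so the function names, the prompt wording, the whitespace-based token count, and the per-token prices below are all assumptions made for illustration. The two models are passed in as plain callables (`cheap_llm` standing in for a model such as GPT-3.5, `expensive_llm` for a model such as GPT-4), so the sketch makes no real API calls.

```python
from typing import Callable, List, Tuple

LLM = Callable[[str], str]  # hypothetical interface: prompt in, completion out

# Hypothetical prices per 1,000 input tokens, for illustration only.
PRICE_PER_1K_INPUT = {"cheap": 0.0005, "expensive": 0.03}


def count_tokens(text: str) -> int:
    """Crude whitespace count; a real system would use the model's tokenizer."""
    return len(text.split())


def input_cost(text: str, tier: str) -> float:
    """Input-token cost of sending `text` to a model in the given price tier."""
    return count_tokens(text) / 1000 * PRICE_PER_1K_INPUT[tier]


def baseline_rag(question: str, documents: List[str],
                 expensive_llm: LLM) -> Tuple[str, float]:
    """Conventional flow: every retrieved document goes to the expensive model."""
    prompt = "\n".join(documents) + "\nQuestion: " + question
    return expensive_llm(prompt), input_cost(prompt, "expensive")


def extract_then_answer(question: str, documents: List[str],
                        cheap_llm: LLM, expensive_llm: LLM) -> Tuple[str, float]:
    """Two-stage flow: the cheap model trims each document first."""
    cost = 0.0
    extracts = []
    for doc in documents:
        prompt = f"Extract only the text needed to answer: {question}\n{doc}"
        cost += input_cost(prompt, "cheap")
        extract = cheap_llm(prompt).strip()
        if extract:  # drop documents the cheap model found irrelevant
            extracts.append(extract)
    # Only the extracted passages reach the expensive model.
    final_prompt = "\n".join(extracts) + "\nQuestion: " + question
    cost += input_cost(final_prompt, "expensive")
    return expensive_llm(final_prompt), cost
```

Under these assumptions the cheap model reads every document in full, so the approach pays off only when its price per token is well below that of the expensive model and the extracts are much shorter than the original documents. Output-token costs are ignored here for brevity.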
