Host: The Japanese Society for Artificial Intelligence
Name : The 102nd SIG-SLUD
Number : 102
Location : [in Japanese]
Date : November 28, 2024 - November 29, 2024
Pages : 104-108
Conversational recommender systems aim to provide personalized recommendations through interactive conversations with users. A key challenge is to effectively extract and integrate relevant information from both the dialogue history and item descriptions to produce accurate recommendations. Our previous work used a large language model (LLM) to generate dialogue summaries and item recommendation descriptions independently, which were then fed into a score predictor for recommendation. However, this separate processing limited the model's ability to accurately associate user preferences expressed in the dialogue with the relevant item attributes. To address this limitation, we propose a novel approach that uses Direct Preference Optimization (DPO) to fine-tune the LLM. By jointly considering the dialogue history and item descriptions during fine-tuning, our method enables the model to generate summaries and recommendation descriptions that are more closely linked, leading to more effective extraction of user preferences and, ultimately, improved recommendation accuracy in dialogue-based tourist attraction recommendation systems.
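To make the DPO fine-tuning step concrete, the sketch below shows the standard DPO objective for a single preference pair. The abstract does not specify how preference pairs are built; here we assume (hypothetically) that the prompt concatenates the dialogue history and an item description, and that the chosen/rejected responses are higher- and lower-quality generated summary-description pairs. The function names and the `beta` value are illustrative, not from the paper.

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair.

    Each argument is the summed token log-probability of the chosen
    (preferred) or rejected response under the policy being fine-tuned
    or under the frozen reference model. `beta` scales how strongly the
    policy is pushed away from the reference.
    """
    # Log-ratios of policy to reference for each response.
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(logits)), computed stably as softplus(-logits).
    return math.log1p(math.exp(-logits)) if logits > -30 else -logits

# When the policy favors the chosen response more than the reference
# does, the loss falls below log(2), the value at indifference.
loss = dpo_loss(-10.0, -14.0, -12.0, -13.0, beta=0.1)
```

In practice these log-probabilities come from a forward pass of the LLM over the prompt-plus-response tokens, and the loss is averaged over a batch of preference pairs; this scalar version only isolates the objective itself.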