Proceedings of the 39th Annual Conference of JSAI (2025)
Online ISSN: 2758-7347
Session ID: 3Win5-02

Improving Data-to-Text Generation with Large Language Models through Numerical Data Back-Translation
*Masahiro EBE, Atsushi AOYAMA
Abstract

We introduce a reinforcement learning approach for Data-to-Text generation with large language models (LLMs) that uses back-translation into numerical data. Numerical data admit multiple possible interpretations, making it difficult to predefine their meaning and the key points to be explained before an analysis is conducted. In this study, we focus on information recoverability when explaining numerical data and propose a reinforcement learning approach based on Proximal Policy Optimization (PPO). The approach requires no predefined references and uses the error in back-translation to numerical data as the reward signal. Our experiments demonstrate that the proposed method significantly improves explanatory performance after training. Furthermore, the explanatory performance achieved with our method is significantly higher than that obtained with Direct Preference Optimization (DPO), a training method that does not require designing a reward function. These results highlight the effectiveness of using back-translation error as a reward for enhancing explanatory performance.
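
As a rough sketch of how a back-translation error could serve as a reward, the Python snippet below computes a scalar reward from the gap between the original numerical data and the values a back-translation model recovers from the generated explanation. The function name back_translation_reward and the choice of RMSE as the error metric are illustrative assumptions; the abstract does not specify the authors' exact formulation.

import math
from typing import Sequence

def back_translation_reward(original: Sequence[float],
                            recovered: Sequence[float]) -> float:
    """Negative back-translation error, usable as a PPO reward.

    `recovered` holds the numbers a back-translation model extracts
    from the generated explanation; the closer they match `original`,
    the higher the reward. RMSE is an assumed metric, not necessarily
    the one used in the paper.
    """
    rmse = math.sqrt(
        sum((o - r) ** 2 for o, r in zip(original, recovered))
        / len(original)
    )
    return -rmse  # lower error -> higher reward

# Example: recovered values deviate slightly from the input data.
data = [12.0, 15.5, 9.8]
recovered = [11.7, 15.9, 10.1]
print(back_translation_reward(data, recovered))  # ~ -0.34

In a PPO loop, this reward would be attached to each generated explanation after a back-translation pass, replacing the reference-based rewards that would otherwise have to be defined in advance.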

© 2025 The Japanese Society for Artificial Intelligence