Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
37th (2023)
Session ID : 3E1-GS-2-01

Improved Regret Approximation for Min-Max Regret Optimization in Reinforcement Learning
*Keita SAITO, Takumi TANABE, Youhei AKIMOTO
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

In reinforcement learning, the environment parameter encountered at evaluation time is sometimes inaccessible during training. Several approaches address this by minimizing the worst-case regret over the environment parameter. Because the true regret can rarely be obtained during training, regret is instead calculated using an approximate optimal policy for each environment parameter. However, when approximated regret is used for training, inaccuracy in the approximation can cause the min-max regret optimization to fail. In this paper, we propose an approach that improves the accuracy of the approximate optimal policies, which in turn improves the regret approximation. Our experiments show that our approach approximates regret accurately, which leads to higher performance in minimizing worst-case regret.
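
The failure mode described in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' method: it assumes the common definition of regret as the return of the optimal policy under an environment parameter minus the return of the evaluated policy under the same parameter. The function name `worst_case_regret` and all numbers are hypothetical and chosen only for illustration.

```python
def worst_case_regret(returns, optimal_returns):
    """Worst-case regret of each candidate policy.

    returns[i][j]: return of candidate policy i under environment parameter j.
    optimal_returns[j]: (estimated) return of the optimal policy under parameter j.
    Regret is assumed to be optimal_returns[j] - returns[i][j].
    """
    return [
        max(opt - r for r, opt in zip(row, optimal_returns))
        for row in returns
    ]


# Hypothetical returns: two candidate policies, two environment parameters.
returns = [
    [10.0, 4.0],  # policy 0
    [7.0, 6.0],   # policy 1
]

# Hypothetical true optimal returns per environment parameter.
true_optimal = [10.0, 9.0]
# Hypothetical inaccurate approximation: the optimal policy for
# parameter 1 is poorly approximated, so its return is underestimated.
approx_optimal = [10.0, 6.0]

true_wc = worst_case_regret(returns, true_optimal)
approx_wc = worst_case_regret(returns, approx_optimal)

# Index of the policy that minimizes worst-case regret in each case.
best_true = min(range(len(true_wc)), key=true_wc.__getitem__)
best_approx = min(range(len(approx_wc)), key=approx_wc.__getitem__)
```

In this toy setting, underestimating the optimal return for one environment parameter changes which policy appears to minimize the worst-case regret, which is the kind of failure a more accurate approximation of optimal policies is meant to avoid.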

© 2023 The Japanese Society for Artificial Intelligence