Host: The Japanese Society for Artificial Intelligence
Name: 34th Annual Conference, 2020
Number: 34
Location: Online
Date: June 09, 2020 - June 12, 2020
Recently, reinforcement learning (RL) has achieved high performance on a variety of complex decision-making and control tasks, but it requires careful engineering of reward functions to solve real-world tasks. Inverse reinforcement learning (IRL) is a framework for constructing reward functions by learning from demonstrations; however, the estimated reward function cannot be transferred to other dynamics because of its dynamics-dependent indefiniteness. To obtain transferable reward functions, we propose a novel mathematical formulation that resolves this dynamics-dependent indefiniteness by utilizing demonstrations generated under multiple dynamics. We also show that the existing discussion of the indefiniteness of reward functions generalizes from standard RL to maximum entropy RL, which serves as the forward-solver subroutine in typical IRL algorithms based on maximum entropy IRL.
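The indefiniteness above can be made concrete with the classical potential-based shaping result: within a single fixed dynamics, a reward r and its shaped version r'(s,a,s') = r(s,a,s') + γΦ(s') − Φ(s) induce the same optimal policy for any potential Φ, so demonstrations from one dynamics cannot distinguish them. The sketch below is only an illustration of that ambiguity on a hypothetical toy chain MDP (the chain, the potential values, and the value-iteration solver are our own assumptions, not the paper's formulation).

```python
import numpy as np

# Toy illustration (assumed example, not the paper's method): on a fixed
# dynamics, a reward and any potential-shaped variant of it yield the same
# optimal policy, so the reward is only identified up to shaping.

n_states, gamma = 5, 0.9
actions = [-1, +1]  # move left / right on a deterministic chain


def step(s, a):
    # Deterministic chain dynamics, clipped at the boundaries.
    return min(max(s + a, 0), n_states - 1)


def optimal_policy(reward):
    """Greedy policy obtained by value iteration on the chain MDP."""
    V = np.zeros(n_states)
    for _ in range(500):
        Q = np.array([[reward(s, a, step(s, a)) + gamma * V[step(s, a)]
                       for a in actions] for s in range(n_states)])
        V = Q.max(axis=1)
    return Q.argmax(axis=1)


# Base reward: +1 for being at the rightmost state.
base = lambda s, a, s2: 1.0 if s2 == n_states - 1 else 0.0

# Shaped reward r'(s,a,s') = r(s,a,s') + gamma*Phi(s') - Phi(s)
# with an arbitrary potential Phi.
Phi = np.array([0.0, 3.0, -2.0, 1.0, 0.5])
shaped = lambda s, a, s2: base(s, a, s2) + gamma * Phi[s2] - Phi[s]

print(optimal_policy(base))
print(optimal_policy(shaped))  # identical greedy policy under this dynamics
```

Because Q'(s,a) = Q(s,a) − Φ(s), the argmax over actions is unchanged; distinguishing r from r' requires demonstrations under a second dynamics, which is the role multiple dynamics play in the proposed formulation.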