Proceedings of the Annual Conference of JSAI
Online ISSN: 2758-7347
33rd Annual Conference (2019)
Session ID: 4I3-J-2-02

Task-Conditional Generative Adversarial Imitation Learning That Infers Multiple Reward Functions
*Kyoichiro KOBAYASHI, Takato HORII, Ryo IWAKI, Yukie NAGAI, Minoru ASADA
Abstract

In this work, we propose a new imitation-learning framework designed to infer multiple reward functions. We introduce latent variables into the discriminator and generator of Generative Adversarial Imitation Learning (GAIL) so that different reward functions and policies are learned for different tasks. To control the balance between imitating the expert directly (early convergence) and increasing the variance of the policy (sampling diverse data and learning a robust reward), we add an entropy-regularized correction term to the generator's objective function. By the same argument as in GAIL, we guarantee that the objective function has a unique optimal solution. In experiments on a grid-world problem, we show that our framework can efficiently infer multiple reward functions and policies representing different tasks.
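The abstract's key ingredients can be sketched numerically: a discriminator conditioned on a task latent, a GAIL-style binary cross-entropy loss, and an entropy term added to the generator's objective. The sketch below is a minimal, hypothetical illustration, not the paper's actual method; the linear discriminator, one-hot latent `z`, entropy weight `lam`, and toy dimensions are all assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical linear discriminator D(s, a, z): scores a state-action pair
# as expert-like (near 1) or generated (near 0), conditioned on a one-hot
# task latent z so that each task can induce a different reward function.
def discriminator(w, s, a, z):
    return sigmoid(w @ np.concatenate([s, a, z]))

# GAIL-style discriminator loss: binary cross-entropy between one expert
# and one generated (s, a) sample, both under the same task latent z.
def disc_loss(w, expert_sa, gen_sa, z):
    d_e = discriminator(w, *expert_sa, z)
    d_g = discriminator(w, *gen_sa, z)
    return -(np.log(d_e) + np.log(1.0 - d_g))

# Entropy of a softmax policy pi(. | s, z); an entropy-regularized
# correction term of this kind, weighted by lam, can be added to the
# generator's objective to trade off imitation against exploration.
def policy_entropy(logits):
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return -np.sum(p * np.log(p))

# Toy dimensions: 2-dim state, 2-dim action, 2 tasks.
w = rng.normal(size=6)
z = np.array([1.0, 0.0])  # latent selecting task 0
expert_sa = (rng.normal(size=2), rng.normal(size=2))
gen_sa = (rng.normal(size=2), rng.normal(size=2))

lam = 0.1  # entropy weight (assumed value)
gen_objective = (np.log(discriminator(w, *gen_sa, z))
                 + lam * policy_entropy(np.zeros(2)))
print(disc_loss(w, expert_sa, gen_sa, z) >= 0.0)
```

Switching the one-hot `z` to `[0.0, 1.0]` would score the same samples under a different task's reward, which is the role the latent plays in the task-conditional setup.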

© 2019 The Japanese Society for Artificial Intelligence