Host: The Japanese Society for Artificial Intelligence
Name : The 35th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 35
Location : [in Japanese]
Date : June 08, 2021 - June 11, 2021
Recently, inverse reinforcement learning, which estimates the reward from an expert's trajectories, has been attracting attention for imitating complex behaviors and estimating intentions. This study proposes a novel deep inverse reinforcement learning method that combines LogReg-IRL, an IRL method based on linearly solvable Markov decision process, and ALOCC, an adversarial one-class classification method. The proposed method can quickly learn rewards and state values without reinforcement learning executions or trajectories to be compared. We show that the proposed method obtains a more expert-like gait than LogReg-IRL in the BipedalWalker task through computer experiments.