Host: The Japan Society of Mechanical Engineers
Name: [in Japanese]
Date: June 06, 2021 - June 08, 2021
Goal-conditioned reinforcement learning is useful when an agent must adaptively achieve arbitrary goals in varied environments. It is also desirable for the agent to learn its behavior from on-board sensor observations, such as RGB camera images. Because learning directly from images is difficult, prior work has shown that learning can be accelerated by training latent representations with a VAE and performing goal-conditioned reinforcement learning in the latent space. However, latent representations learned with a VAE's image-reconstruction loss contain information irrelevant to the task to be accomplished. In goal-conditioned reinforcement learning the reward function is defined as the distance to the goal state, so latent representations that capture the distance between states are more appropriate. In this study, we propose a method that uses contrastive learning to preserve the similarity between states in the latent space and performs goal-conditioned reinforcement learning in that space from image observations. We compared the proposed method against one using latent representations obtained from a VAE, and show that our method outperforms it.
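The two ingredients the abstract combines can be sketched in a few lines: a contrastive objective that pulls latent codes of similar states together, and a goal-conditioned reward defined as the negative distance to the goal in latent space. The sketch below is illustrative only, not the authors' implementation; the InfoNCE form of the loss, the function names, and the Euclidean distance are all assumptions.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss on latent state vectors.

    anchors, positives: (N, d) arrays; row i of `positives` is the positive
    pair for row i of `anchors`, while the other rows serve as negatives.
    (This is one common contrastive objective, assumed here for illustration.)
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                   # (N, N) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # cross-entropy on matched pairs

def latent_goal_reward(z_state, z_goal):
    """Goal-conditioned reward: negative Euclidean distance in latent space."""
    return -np.linalg.norm(z_state - z_goal)
```

With an encoder trained under such a loss, the reward at each step would be `latent_goal_reward(encoder(obs), encoder(goal_image))`; states closer to the goal in the learned latent space receive higher (less negative) reward.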