Host: The Japan Society of Mechanical Engineers
Name: [in Japanese]
Date: June 06, 2021 - June 08, 2021
Goal-conditioned reinforcement learning is useful when an agent must adaptively achieve arbitrary goals in varied environments. It is also desirable for the agent to learn its behavior from on-board sensor observations, such as RGB camera images. Because learning directly from images is difficult, prior work has shown that learning can be accelerated by training latent representations with a VAE and performing goal-conditioned reinforcement learning in the latent space. However, latent representations learned with a VAE's image-reconstruction loss contain information irrelevant to the task to be accomplished. In goal-conditioned reinforcement learning the reward function is defined as the distance to the goal state, so latent representations that capture the distance between states are more appropriate. In this study, we propose a method that uses contrastive learning to preserve the similarity between states in the latent space and performs goal-conditioned reinforcement learning in that space from image observations. We compared the proposed method against one using latent representations obtained from a VAE, and show that our method outperforms it.
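The two ingredients the abstract combines can be sketched in a few lines: a contrastive objective that pulls latent codes of similar states together, and a goal-conditioned reward defined as the negative distance to the goal in latent space. The sketch below is illustrative only, not the authors' implementation; the InfoNCE form of the loss, the function names, and the Euclidean distance are all assumptions.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss on latent state vectors.

    anchors, positives: (N, d) arrays; row i of `positives` is the positive
    pair for row i of `anchors`, while the other rows serve as negatives.
    (This is one common contrastive objective, assumed here for illustration.)
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                   # (N, N) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # cross-entropy on matched pairs

def latent_goal_reward(z_state, z_goal):
    """Goal-conditioned reward: negative Euclidean distance in latent space."""
    return -np.linalg.norm(z_state - z_goal)
```

With an encoder trained under such a loss, the reward at each step would be `latent_goal_reward(encoder(obs), encoder(goal_image))`; states closer to the goal in the learned latent space receive higher (less negative) reward.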