2024 Volume 36 Issue 1 Pages 571-581
Reinforcement learning methods include learning a simple and accurate dynamics model of the environment (world model) and performing trial-and-error in a compact latent space. However, because the world model is learned using reconstruction errors, performance deteriorates when the visual environment becomes more complex. In contrast, learning the world model by contrast learning reduces the performance degradation even when the visual environment is complex. However, there is still a problem of performance degradation when the batch size is reduced. In this study, we propose a method for learning world models using non-contrast learning. We believe that this method can solve the problem of performance degradation in tasks with complex visual environments. In order to improve robustness with respect to visual information, we introduced a loss function that suppresses the effect of background information that is irrelevant to the task. As a result, the proposed method performed better in 4 out of 6 tasks with a normal background, and in 5 out of 6 tasks with a complex background.