A World Model for Policy Learning That Leverages Prior Knowledge in Similar Graph Environments

河村 和紀; 池之内 颯都; 石川 峻弥; 村上 綾菜; 河野 慎; 松尾 豊

doi:10.11517/pjsai.JSAI2023.0_2G4OS21d04

Abstract

In this paper, we introduce a world-model-based reinforcement learning method for finding the optimal policy in an environment represented as a graph. Many environments in the virtual and real worlds can be represented as graphs, including games, transportation networks, knowledge graphs, social networks, and communication networks. Although several methods exist for finding the optimal policy in such environments, existing research has not been able to use prior knowledge from similar environments when learning a new policy. In this study, we therefore propose a method that learns better policies in graph-represented environments when prior knowledge from similar environments is available. In simulations of a maze game represented as a graph, we also show that the proposed method outperforms a simple search method that uses no prior knowledge.
