Hybrid Learning Using Profit Sharing and a Genetic Algorithm: Task Division Performance in MDP Environments

鈴木 晃平; 加藤 昇平

doi:10.11517/pjsai.JSAI2018.0_1N304

Abstract

Reinforcement learning is generally formulated on Markov decision processes (MDPs). However, an agent may be unable to observe the environment correctly because of the limited perception ability of its sensors. Such a setting is called a partially observable Markov decision process (POMDP). In a POMDP environment, an agent may receive the same observation in more than one state. We have proposed a hybrid learning method using Profit Sharing and a genetic algorithm (HPG) to address this problem. However, most real problems can be represented as MDP environments. In this paper, we improve HPG to adapt it to MDP environments and report the effectiveness of our method through experiments with mazes.
