IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C)
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
<Intelligence, Robotics>
Hybrid Learning Using Profit Sharing and a Genetic Algorithm for the Imperfect Perception Problem
Kohei Suzuki, Shohei Kato

2017, Volume 137, Issue 12, pp. 1591-1599

Abstract

Reinforcement learning is generally formulated on Markov decision processes (MDPs). In practice, however, an agent may be unable to observe the environment correctly because of the limited perception ability of its sensors. Such a setting is modeled as a partially observable Markov decision process (POMDP). In a POMDP environment, an agent may receive the same observation in more than one state. HQ-learning and Episode-based Profit Sharing (EPS) are well-known methods for this problem. HQ-learning divides a POMDP environment into subtasks. EPS distributes the same reward to the state-action pairs in an episode when the agent reaches a goal. However, these methods have disadvantages in terms of learning efficiency and localized solutions. In this paper, we propose a hybrid learning method that combines Profit Sharing (PS) and a genetic algorithm. We also report the effectiveness of our method through experiments on partially observable mazes.
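The EPS credit assignment described in the abstract can be illustrated with a minimal sketch. It follows only the one-sentence description given here (the same reward goes to the pairs in an episode once a goal is reached) and is not the authors' implementation. The function names `eps_update` and `select_action`, the initial weight of 1.0, the choice to reinforce each distinct pair once per episode, and the roulette-wheel action selection are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def eps_update(weights, episode, reward):
    """Add the same reward to every distinct (observation, action) pair.

    Assumption: a pair that occurs several times in one episode is
    reinforced only once.
    """
    for pair in set(episode):
        weights[pair] += reward


def select_action(weights, observation, actions, rng):
    """Pick an action with probability proportional to its weight
    (roulette-wheel selection, assumed here for illustration)."""
    scores = [weights[(observation, action)] for action in actions]
    return rng.choices(actions, weights=scores, k=1)[0]


# Hypothetical usage: "o1" and "o2" stand for sensor readings. In a POMDP
# the same reading may be produced by more than one underlying state.
weights = defaultdict(lambda: 1.0)  # assumed initial weight
episode = [("o1", "right"), ("o2", "up"), ("o1", "right")]
eps_update(weights, episode, reward=10.0)  # called when the goal is reached
next_action = select_action(weights, "o1", ["left", "right"], random.Random(0))
```

Because every pair in the episode receives an identical reward, the update does not depend on how far a step was from the goal.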

© 2017 The Institute of Electrical Engineers of Japan