Transactions of the Institute of Systems, Control and Information Engineers (システム制御情報学会論文誌)
Online ISSN : 2185-811X
Print ISSN : 1342-5668
ISSN-L : 1342-5668
A Proposal of a Reinforcement Learning Method Using a Position Vector in Partially Observable Markov Decision Processes
清本 盛明, 亀井 且有
Journal, free access

2001, Vol. 14, No. 2, pp. 86-91

Abstract

This paper describes a reinforcement learning method that uses a position vector so that learning does not break down under a Partially Observable Markov Decision Process (POMDP). First, we describe a rule structure that uses the position vector as the agent's internal sensory information, together with a restraint on reward assignment for detours, and we propose a new reinforcement learning method that combines the two. Next, the proposed method is compared with a conventional method on a relatively simple Partial Observation Markov Environment (POME). The comparison shows that reward assignment to unnecessary rules is restrained: rewards are given only to effective rules, so learning proceeds efficiently. In addition, we apply the proposed method to a shortest-path acquisition problem in a POME that can hardly be solved by the conventional method, and observe that the proposed method obtains an optimal solution. Finally, the proposed method is successfully applied to a large maze used in the Japan micro-mouse competition, which shows that it is effective for such realistic problems.
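The abstract names two ingredients, rules keyed on a position vector and a restraint on reward assignment for detours, without giving their details. The Python below is a minimal sketch of how such ingredients could fit together, not the authors' algorithm. It assumes that the position vector is the agent's displacement from its start cell, that every listed action is a successful one-cell move, that a detour is any loop returning to an already visited position vector, and that credit decays geometrically backward from the goal. The names `track_positions`, `remove_detours`, `assign_rewards`, `reward`, and `decay` are hypothetical.

```python
from typing import Dict, List, Tuple

Pos = Tuple[int, int]   # assumed position vector: displacement from the start cell
Rule = Tuple[Pos, str]  # hypothetical rule: (position vector, action)

MOVES: Dict[str, Pos] = {"N": (0, -1), "S": (0, 1), "E": (1, 0), "W": (-1, 0)}


def track_positions(actions: List[str]) -> List[Rule]:
    """Pair each action with the position vector at which it was taken.

    Assumes every action is a successful one-cell move (no wall bumps).
    """
    x, y = 0, 0
    episode: List[Rule] = []
    for action in actions:
        episode.append(((x, y), action))
        dx, dy = MOVES[action]
        x, y = x + dx, y + dy
    return episode


def remove_detours(episode: List[Rule]) -> List[Rule]:
    """Drop every loop that returns to an already visited position vector."""
    path: List[Rule] = []
    index_of: Dict[Pos, int] = {}  # position vector -> index in path
    for pos, action in episode:
        if pos in index_of:
            # The agent is back where it has been: everything fired since
            # the first visit was a detour, so cut it out of the path.
            cut = index_of[pos]
            for old_pos, _ in path[cut:]:
                del index_of[old_pos]
            path = path[:cut]
        index_of[pos] = len(path)
        path.append((pos, action))
    return path


def assign_rewards(episode: List[Rule], reward: float = 1.0,
                   decay: float = 0.5) -> Dict[Rule, float]:
    """Credit only the rules on the detour-free path, decaying from the goal."""
    weights: Dict[Rule, float] = {}
    credit = reward
    for rule in reversed(remove_detours(episode)):
        weights[rule] = weights.get(rule, 0.0) + credit
        credit *= decay
    return weights


# An episode that loops around a block before heading east to the goal.
episode = track_positions(["E", "N", "W", "S", "E", "E"])
weights = assign_rewards(episode)
```

In this sketch the four rules fired during the loop receive no credit, because the agent returns to position vector (0, 0) and the loop is cut before rewards are distributed. Keying rules on the position vector rather than on the raw sensor reading is what lets two cells that look identical to the sensors be treated as different states.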

© The Institute of Systems, Control and Information Engineers (システム制御情報学会)