IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C)
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
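Hierarchical Reinforcement Learning in Partially Observable Markov Environments
A Proposal of Switching Q-Learning
Hiroyuki Kamaya (釜谷 博行), Kenichi Abe (阿部 健一)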

2002, Vol. 122, No. 7, pp. 1186-1193

Abstract
The most widely used reinforcement learning (RL) algorithms are limited to Markovian environments. To handle larger-scale partially observable Markov decision processes (POMDPs), we propose a new on-line hierarchical RL algorithm called Switching Q-learning (SQ-learning). The basic idea of SQ-learning is that non-Markovian tasks can be automatically decomposed into subtasks solvable by multiple policies, without any additional information pointing to good subgoals. To perform this decomposition, SQ-learning employs ordered sequences of Q-modules, in which each module discovers a local control policy based on Sarsa(λ). Furthermore, a hierarchical-structure learning automaton finds appropriate subgoal sequences using the linear reward-inaction (L_R-I) algorithm. The results of extensive simulations demonstrate the effectiveness of SQ-learning.
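The two building blocks named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the textbook forms of a tabular Sarsa(λ) update with accumulating traces and a linear reward-inaction update of an action-probability vector. The function names, the table layout (rows indexed by observation, columns by action), and all step sizes are assumptions made for illustration; how SQ-learning wires the modules and the automaton together is not specified here.

```python
import numpy as np


def lri_update(probs, chosen, rewarded, step=0.1):
    """Linear reward-inaction update (textbook form, hypothetical helper).

    On reward, the chosen action's probability moves toward 1 and the
    others shrink proportionally. On penalty, nothing changes (inaction).
    """
    probs = np.asarray(probs, dtype=float).copy()
    if rewarded:
        probs *= (1.0 - step)   # shrink every probability by the same factor
        probs[chosen] += step   # give the freed mass to the chosen action
    return probs


def sarsa_lambda_step(Q, E, o, a, r, o_next, a_next,
                      alpha=0.1, gamma=0.9, lam=0.8):
    """One tabular Sarsa(lambda) update with accumulating traces.

    Q and E are float arrays of shape (n_observations, n_actions) and are
    modified in place. Indexing by observation is an assumption here.
    """
    delta = r + gamma * Q[o_next, a_next] - Q[o, a]  # TD error
    E[o, a] += 1.0                                   # accumulate trace
    Q += alpha * delta * E                           # credit all traced pairs
    E *= gamma * lam                                 # decay traces
    return Q, E


if __name__ == "__main__":
    p = lri_update([0.5, 0.5], chosen=0, rewarded=True, step=0.2)
    print(p, p.sum())

    Q = np.zeros((2, 2))
    E = np.zeros((2, 2))
    sarsa_lambda_step(Q, E, o=0, a=0, r=1.0, o_next=1, a_next=1)
    print(Q)
```

The reward-inaction rule keeps the probability vector normalized by construction, which is why no explicit renormalization step appears in the sketch.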
© The Institute of Electrical Engineers of Japan (電気学会)