電気学会論文誌C(電子・情報・システム部門誌)
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
強化学習法のための状態グルーピングとオポチュニチ評価に関する研究
兪 文偉横井 浩史嘉数 侑昇
著者情報
ジャーナル フリー

1997 年 117 巻 9 号 p. 1300-1307

詳細
抄録

In this paper, we propose the State Grouping scheme for coping with the problem of scaling up the Reinforcement Learning Algorithm to real, large size application. The grouping scheme is based on geographical and trial-error information, and is made up with state generating, state combining, state splitting, state forgetting procedures, with corresponding action selecting module and learning module. Also, we discuss the Labeling Based Evaluation scheme which can evaluate the opportunity of the state-action pair, therefore, use better experience to guide the exploration of the state-space effectively. Incorporating the Labeling Based Evaluation and State Grouping scheme into the Reinforcement Learning Algorithm, we get the approach that can generate organized state space for Reinforcement Learning, and do problem solving as well. We argue that the approach with this kind of ability is necessary for autonomous agent, namely, autonomous agent can not act depending on any pre-defined map, instead, it should search the environment as well as find the optimal problem solution autonomously and simultaneously. By solving the large state-size 3-DOF and 4-link manipulator problem, we show the efficiency of the proposed approach, i.e., the agent can achieve the optimal or sub-optimal path with less memory and less time.

著者関連情報
© 電気学会
前の記事 次の記事
feedback
Top