A Reinforcement Learning Model with Macro-Action Generation for Grid-World Maze Problems and a Study of Its Learning Characteristics

恩田 宏; 小澤 誠一

doi:10.1541/ieejeiss.129.735

Abstract

A macro-action is a typical series of useful actions that brings high expected rewards to an agent. Murata et al. have proposed an Actor-Critic model that automatically generates macro-actions based on state values and the visiting frequency of states. However, their model does not assume that the generated macro-actions are reused for learning different tasks. In this paper, we extend Murata's model so that the generated macro-actions help an agent learn an optimal policy quickly in multi-task Grid-World (MTGW) maze problems. The proposed model is applied to two MTGW problems, each of which consists of six different maze tasks. The experimental results show that the proposed model can speed up learning if macro-actions are generated in the so-called correlated regions.
