Rationally Oriented Forgettable Profit Sharing Method for Reinforcement Learning

幸若 完壮; 渡辺 浩太; 五十嵐 一

doi:10.1541/ieejeiss.132.448

Abstract

This paper proposes the Rationally oriented Forgettable Profit Sharing method (RFPS) for reinforcement learning. Although Profit Sharing (PS) performs well in real environments, its learning is often slow in long-term tasks because it is difficult to determine an adequate discount rate that satisfies Miyazaki's rationality theorem. Several rationality-relaxed PS methods work well for such tasks; however, these methods may produce many irrational loops. The proposed method achieves rationality by forgetting the irrational loops that have been reinforced. It can be easily combined with ordinary PS methods and performs well in long-term tasks. Simulation results show that the proposed method learns more efficiently than conventional PS methods.
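
To make the terms in the abstract concrete, the sketch below shows ordinary PS credit assignment with a geometric reinforcement function, together with a check of the rationality condition in the form it is commonly stated, L · Σ_{j=i}^{W} f_j < f_{i-1}, where f_j is the credit given to the rule fired j steps before the reward and L is the maximum number of competing rules in a state. This is only an illustration under those assumptions: the function names, the `(state, action)` weight table, and the example values are hypothetical, and the forgetting step of RFPS is not shown because the abstract does not describe it in detail.

```python
from collections import defaultdict


def geometric_credits(reward, episode_len, ratio):
    """Return credits f_0, f_1, ... where f_k goes to the rule fired
    k steps before the reward (f_0 = reward, f_k = reward * ratio**k)."""
    return [reward * ratio ** k for k in range(episode_len)]


def satisfies_rationality(credits, num_competing):
    """Check L * sum_{j>=i} f_j < f_{i-1} for every i >= 1.

    `num_competing` plays the role of L (an assumption of this sketch:
    the maximum number of competing rules in any state).
    """
    for i in range(1, len(credits)):
        if num_competing * sum(credits[i:]) >= credits[i - 1]:
            return False
    return True


def profit_sharing_update(weights, episode, reward, ratio):
    """Add credit to every (state, action) rule in the episode,
    starting from the rule fired just before the reward."""
    credits = geometric_credits(reward, len(episode), ratio)
    for k, rule in enumerate(reversed(episode)):
        weights[rule] += credits[k]
    return weights


# Hypothetical three-step episode that ends with reward 1.0.
weights = defaultdict(float)
episode = [("s0", "a"), ("s1", "b"), ("s2", "a")]
profit_sharing_update(weights, episode, reward=1.0, ratio=0.25)
```

With L = 3, a ratio of 0.25 (that is, 1/(L+1)) passes the check, while a ratio of 0.5 does not. A ratio this small shrinks the credit exponentially with distance from the reward, so rules fired early in a long episode receive almost nothing, which is consistent with the slow learning in long-term tasks noted in the abstract.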
