Abstract
We propose an adaptive probability density function (PDF) for selecting an effective action in reinforcement learning (RL). The uniform distribution and the normal distribution are often used to select an action. When these functions are used, however, the information about the search direction is not considered. The proposed method utilizes this information, enabling RL to reduce the number of trials, which is essential for learning in real environments. Furthermore, the proposed method can easily be applied to various RL methods, such as the actor-critic and stochastic gradient ascent methods. The performance of the proposed method is demonstrated by computer simulations.
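As a rough illustration only (not the paper's actual formulation, which is not given in the abstract), the Python sketch below contrasts conventional Gaussian action selection with a direction-aware variant in which the sampling mean is shifted along an estimated search direction; the names select_action_gaussian, select_action_adaptive, search_direction, and step_scale, as well as the TD-error-based direction estimate in the usage example, are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

def select_action_gaussian(policy_mean, sigma):
    # Conventional selection: sample around the policy mean with a fixed
    # normal distribution; the search direction is ignored.
    return rng.normal(policy_mean, sigma)

def select_action_adaptive(policy_mean, sigma, search_direction, step_scale=0.5):
    # Illustrative adaptive PDF: bias the sampling mean along an estimated
    # search (improvement) direction so exploration favors actions that
    # recently increased the return.
    biased_mean = policy_mean + step_scale * search_direction
    return rng.normal(biased_mean, sigma)

# Hypothetical usage: estimate the search direction from the deviation of the
# last action from the mean, weighted by the TD error, then sample.
mean, sigma = 0.0, 0.3
last_action, td_error = 0.2, 1.5
direction = td_error * (last_action - mean)
action = select_action_adaptive(mean, sigma, direction)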