Journal of Advanced Computational Intelligence and Intelligent Informatics
Online ISSN : 1883-8014
Print ISSN : 1343-0130
ISSN-L : 1883-8014
Regular Papers
Direct Policy Search Reinforcement Learning Based on Variational Bayesian Inference
Nobuhiko Yamaguchi
Journal, Open Access

2020 Volume 24 Issue 6 Pages 711-718

Abstract

Direct policy search is a promising reinforcement learning framework, particularly for controlling continuous, high-dimensional systems. Peters et al. proposed reward-weighted regression (RWR) as a direct policy search method. The RWR algorithm estimates the policy parameter with the expectation-maximization (EM) algorithm, which yields a point estimate, and is therefore prone to overfitting. In this study, we turn to variational Bayesian inference to avoid overfitting and propose a direct policy search reinforcement learning method based on variational Bayesian inference (VBRL). The performance of the proposed VBRL is assessed in several experiments involving a mountain car task and a ball batting task. These experiments demonstrate that VBRL achieves a higher average return than RWR.
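To make the contrast between a point estimate and a prior-regularized estimate concrete, the following is a minimal sketch of one reward-weighted regression step. It assumes a scalar linear policy a ≈ theta * phi(s) with rewards used directly as non-negative sample weights. The function name `rwr_update` and the `prior_precision` parameter are hypothetical, introduced only for illustration. The prior term is a simple MAP (ridge) stand-in for Bayesian treatment and is not the variational Bayesian update proposed in the paper.

```python
def rwr_update(features, actions, rewards, prior_precision=0.0):
    """One reward-weighted regression step for a scalar policy a ~ theta * phi(s).

    Rewards act as non-negative weights in a weighted least-squares fit.
    prior_precision > 0 adds a zero-mean Gaussian prior on theta; this is an
    illustrative MAP stand-in, NOT the paper's variational Bayesian update.
    """
    if any(r < 0 for r in rewards):
        raise ValueError("rewards must be non-negative to serve as weights")
    # Weighted normal equations reduced to the one-parameter case.
    numerator = sum(r * phi * a for phi, a, r in zip(features, actions, rewards))
    denominator = prior_precision + sum(
        r * phi * phi for phi, r in zip(features, rewards)
    )
    if denominator == 0:
        raise ValueError("no weighted data and no prior: theta is undetermined")
    return numerator / denominator


# Toy data (made up): the third sample has zero reward and is ignored.
features = [1.0, 2.0, 3.0]
actions = [2.0, 4.0, 9.0]
rewards = [1.0, 1.0, 0.0]

theta_point = rwr_update(features, actions, rewards)
theta_prior = rwr_update(features, actions, rewards, prior_precision=5.0)
print(theta_point, theta_prior)
```

With few high-reward samples, the unregularized estimate fits those samples exactly, whereas the prior pulls the estimate toward zero. This shrinkage is the intuition behind preferring a Bayesian treatment when data are scarce.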

© 2020 Fuji Technology Press Ltd.