Organizer: The Japan Society of Mechanical Engineers
Conference: Robotics and Mechatronics Conference 2018
Dates: 2018/06/02 - 2018/06/05
In this paper, we introduce a policy search reinforcement learning method that uses a sparse non-parametric policy model. We formulate policy search as a variational learning problem. A sparse pseudo-input Gaussian process (SPGP) is placed as a prior over the control policy, and a variational lower bound on the expected reward is derived and then optimized with respect to the hyperparameters and the pseudo-inputs. We evaluated the proposed method in numerical simulations and real-robot experiments, which confirmed its effectiveness.
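As a rough illustration of the sparse policy model named in the abstract, the Python sketch below computes the predictive mean of a sparse pseudo-input GP in its standard textbook form, mapping states to actions. The abstract gives no implementation details, so the squared-exponential kernel, the noise level, the names rbf_kernel and spgp_policy_mean, and the toy data are all assumptions made for illustration. The variational lower bound on the expected reward and its optimization are not shown.

```python
import numpy as np


def rbf_kernel(A, B, lengthscale=1.0):
    """Unit-variance squared-exponential kernel between rows of A and B (assumed kernel)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)


def spgp_policy_mean(X, U, Z, X_new, noise=0.1, jitter=1e-8):
    """Mean action at states X_new from an SPGP fitted to (state, action) pairs.

    X: (N, d) states, U: (N,) actions, Z: (M, d) pseudo-inputs (hypothetical names).
    """
    K_mm = rbf_kernel(Z, Z) + jitter * np.eye(len(Z))  # jitter for numerical stability
    K_mn = rbf_kernel(Z, X)
    K_sm = rbf_kernel(X_new, Z)
    # Diagonal of the low-rank approximation K_nm K_mm^{-1} K_mn.
    q_diag = np.sum(K_mn * np.linalg.solve(K_mm, K_mn), axis=0)
    # Per-point variance: diagonal correction (prior variance is 1) plus noise.
    lam = np.maximum(1.0 - q_diag, 0.0) + noise**2
    K_mn_scaled = K_mn / lam  # scales column n by 1 / lam[n]
    Q_mm = K_mm + K_mn_scaled @ K_mn.T
    return K_sm @ np.linalg.solve(Q_mm, K_mn_scaled @ U)


# Toy usage: 50 noisy (state, action) samples summarized by 8 pseudo-inputs.
rng = np.random.default_rng(0)
states = rng.uniform(-3.0, 3.0, size=(50, 1))
actions = np.sin(states[:, 0]) + 0.1 * rng.standard_normal(50)
pseudo_inputs = np.linspace(-3.0, 3.0, 8)[:, None]
query = np.array([[0.0], [1.5]])
print(spgp_policy_mean(states, actions, pseudo_inputs, query))
```

In this sketch the pseudo-inputs are fixed on a grid. In the method described above they would instead be treated as free variables and optimized together with the hyperparameters. The cost of a prediction depends on the number of pseudo-inputs M rather than on the number of samples N, which is the usual motivation for a sparse model.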