Policy Search with Self-Triggered Policies Based on Gaussian Processes

佐々木 光; 松原 崇充

doi:10.1299/jsmermd.2021.1P1-I17

Abstract

In this paper, we propose a policy search reinforcement learning method that combines a non-parametric policy model with self-triggered control. We formulate a self-triggered policy search that uses a control policy and an execution length policy to reduce the number of action decisions per trial. The method uses a sparse Gaussian process as the policy model within the self-triggered control framework, and we derive its update law for maximizing the return via variational Bayesian learning. We ran simulations of a reaching task in a two-dimensional environment and confirmed the effectiveness of the proposed method.
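
To illustrate how a control policy and an execution length policy together can reduce the number of action decisions in a trial, the following is a minimal sketch of a self-triggered rollout on a two-dimensional reaching task. It is not the authors' method: the sparse Gaussian process policy and the variational Bayesian update are not shown, and both policies are replaced by hand-written heuristics. All names (`control_policy`, `execution_length_policy`, `self_triggered_rollout`) and parameters (`step_size`, `max_hold`, `horizon`, `tol`) are assumptions made for illustration only.

```python
import math


def control_policy(state, goal):
    """Hypothetical stand-in for a learned control policy:
    returns a unit vector pointing from the state to the goal."""
    dx, dy = goal[0] - state[0], goal[1] - state[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0)
    return (dx / dist, dy / dist)


def execution_length_policy(state, goal, step_size, max_hold=10):
    """Hypothetical stand-in for a learned execution length policy:
    holds the action for more steps when the goal is far away."""
    dist = math.hypot(goal[0] - state[0], goal[1] - state[1])
    return max(1, min(max_hold, int(dist / step_size)))


def self_triggered_rollout(start, goal, step_size=0.1, horizon=200, tol=0.05):
    """Run one trial. A new action is decided only when the previous
    action's execution length has elapsed, not at every time step."""
    state = start
    steps = 0      # environment time steps taken
    decisions = 0  # number of times the policies were queried
    while steps < horizon and math.hypot(goal[0] - state[0], goal[1] - state[1]) > tol:
        action = control_policy(state, goal)
        hold = execution_length_policy(state, goal, step_size)
        decisions += 1
        # Apply the same action for `hold` steps without re-deciding.
        for _ in range(hold):
            if steps >= horizon:
                break
            state = (state[0] + step_size * action[0],
                     state[1] + step_size * action[1])
            steps += 1
    return state, steps, decisions


if __name__ == "__main__":
    final_state, steps, decisions = self_triggered_rollout((0.0, 0.0), (1.0, 1.0))
    print(f"final state: {final_state}, steps: {steps}, decisions: {decisions}")
```

In this sketch a policy that decided at every time step would make `steps` decisions, whereas the self-triggered loop makes only `decisions` of them. In the method described in the abstract, both policies would instead be learned to maximize the return.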
