2019 Volume 55 Issue 2 Pages 100-109
This study extends the Bayesian learning approach based on Gaussian process regression to continuous-time optimal control problems in which the stage cost function is unknown. By applying the control parametrization method, the optimal control problem is approximately formulated as a nonlinear programming problem, and the statistics of the cost function estimated by Gaussian process regression are analyzed. To solve the resulting Bayesian optimization problem, an effective gradient calculation based on the variational method is developed. Furthermore, an optimality analysis in the style of bandit problems provides the order of the regret bound achieved by the proposed algorithm.
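As a rough illustration of the control parametrization step only, the sketch below holds the control constant on each of n subintervals, so the continuous-time cost becomes an ordinary function of a finite parameter vector. It is not the authors' implementation. The scalar dynamics dx/dt = -x + u, the quadratic stage cost, the forward Euler discretization, and all names (`rollout_cost`, `dynamics`, `stage_cost`, `theta`) are assumptions made for this example. In the setting of the abstract the stage cost is unknown and would be estimated by Gaussian process regression; here a known stand-in function is used so that the example runs on its own.

```python
from typing import Callable, Sequence


def rollout_cost(
    theta: Sequence[float],
    stage_cost: Callable[[float, float], float],
    dynamics: Callable[[float, float], float],
    x0: float = 1.0,
    horizon: float = 1.0,
    substeps: int = 50,
) -> float:
    """Approximate J(theta), the integral of stage_cost(x(t), u(t)) over [0, horizon],
    where u(t) is piecewise constant with one value of theta per subinterval."""
    dt = horizon / (len(theta) * substeps)
    x, cost = x0, 0.0
    for u in theta:  # the control is held constant on each subinterval
        for _ in range(substeps):
            cost += stage_cost(x, u) * dt  # left Riemann sum of the stage cost
            x += dynamics(x, u) * dt       # forward Euler step of the state
    return cost


# Hypothetical stand-ins (assumptions, not taken from the paper).
def dynamics(x: float, u: float) -> float:
    return -x + u


def stage_cost(x: float, u: float) -> float:
    return x * x + u * u


J_zero = rollout_cost([0.0] * 4, stage_cost, dynamics)  # no control effort
J_one = rollout_cost([1.0] * 4, stage_cost, dynamics)   # constant control u = 1
print(J_zero, J_one)
```

Under these assumptions, `rollout_cost` is a function of the finite-dimensional vector `theta`, so minimizing it is a nonlinear programming problem to which a generic optimizer (or, with an unknown stage cost, a Bayesian optimization loop) could be applied.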