Abstract
Reinforcement learning is effective for acquiring optimal control policies. However, its computational cost increases in high-dimensional state spaces. In this paper, we propose a global and local optimal control method that combines dynamic programming (DP) and differential dynamic programming (DDP). In the global part, DP approximates the optimal trajectory in the state space. In the local part, DDP optimizes that approximate trajectory within its neighborhood. The proposed method can reduce the computational cost of optimal control.
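The two-stage idea can be illustrated with a minimal sketch. It is not the implementation used in this paper: the scalar single-integrator dynamics, the quadratic cost, the grid sizes, and all function names (`global_dp`, `local_ddp`, and so on) are assumptions chosen for brevity. Because the toy dynamics are linear, the second-order dynamics terms of DDP vanish and the local stage coincides with an iLQR/LQR backward pass.

```python
import numpy as np

DT, T = 0.1, 30  # step size and horizon (illustrative values, not from the paper)

def step(x, u):
    # Assumed toy dynamics: scalar single integrator.
    return x + DT * u

def stage_cost(x, u):
    return DT * (x**2 + u**2)

def terminal_cost(x):
    return x**2

def total_cost(x0, us):
    x, c = x0, 0.0
    for u in us:
        c += stage_cost(x, u)
        x = step(x, u)
    return c + terminal_cost(x)

def global_dp(x0, xs, ucands):
    """Global part: finite-horizon DP on a coarse state grid."""
    V = terminal_cost(xs)
    policy = np.zeros((T, len(xs)), dtype=int)
    for t in reversed(range(T)):
        xn = step(xs[:, None], ucands[None, :])           # next state per (state, control)
        Vn = np.interp(xn.ravel(), xs, V).reshape(xn.shape)  # interpolated next value
        Q = stage_cost(xs[:, None], ucands[None, :]) + Vn
        policy[t] = np.argmin(Q, axis=1)
        V = Q.min(axis=1)
    # Roll the grid policy forward from x0 using the nearest grid state.
    us, x = [], x0
    for t in range(T):
        i = np.argmin(np.abs(xs - x))
        us.append(ucands[policy[t, i]])
        x = step(x, us[-1])
    return np.array(us)

def local_ddp(x0, us, iters=3):
    """Local part: DDP refinement around the nominal trajectory.
    Derivatives are hard-coded for the toy dynamics and cost above."""
    us = us.copy()
    for _ in range(iters):
        xs = [x0]
        for u in us:
            xs.append(step(xs[-1], u))
        Vx, Vxx = 2.0 * xs[-1], 2.0                       # terminal cost derivatives
        k, K = np.zeros(T), np.zeros(T)
        for t in reversed(range(T)):                      # backward pass
            Qx = 2 * DT * xs[t] + Vx                      # f_x = 1
            Qu = 2 * DT * us[t] + DT * Vx                 # f_u = DT
            Qxx = 2 * DT + Vxx
            Quu = 2 * DT + DT * DT * Vxx
            Qux = DT * Vxx
            k[t], K[t] = -Qu / Quu, -Qux / Quu
            Vx = Qx - Qux * Qu / Quu
            Vxx = Qxx - Qux * Qux / Quu
        x, new_us = x0, np.zeros(T)
        for t in range(T):                                # forward pass
            new_us[t] = us[t] + k[t] + K[t] * (x - xs[t])
            x = step(x, new_us[t])
        us = new_us
    return us

x0 = 1.0
grid = np.linspace(-2.0, 2.0, 9)      # deliberately coarse state grid
ucands = np.linspace(-2.0, 2.0, 5)    # coarse control candidates
u_dp = global_dp(x0, grid, ucands)    # approximate trajectory from the global part
u_ddp = local_ddp(x0, u_dp)           # refined in its neighborhood by the local part
print(total_cost(x0, u_dp), total_cost(x0, u_ddp))
```

In this sketch the DP grid is kept coarse on purpose, so the global stage stays cheap and the local stage recovers the accuracy that the coarse grid loses.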