Accelerating Reinforcement Learning for Port-Hamiltonian Systems Using the Natural Gradient

福永 修一; 岩本 有生

doi:10.9746/sicetr.59.70

Abstract

In this paper, we accelerate a reinforcement learning algorithm for port-Hamiltonian systems using a natural gradient method. The proposed algorithm has an actor-critic structure: the actor generates a control input according to a policy and updates the policy using the temporal difference (TD) error, while the critic computes the TD error and learns a state-value function. The reinforcement learning algorithm for port-Hamiltonian systems has two types of policy parameters, and the proposed algorithm learns both with the natural gradient method. The proposed method was applied to swing-up control of an inverted pendulum in numerical simulation. The simulation results showed that the proposed method learns faster than the existing method.
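
As a rough illustration of the ingredients named in the abstract, the sketch below shows a generic TD error and a natural-gradient policy update. It is not the authors' algorithm: the function names, the running Fisher-matrix estimate, the regularization term `eps`, and all step sizes are assumptions made for this example, and the port-Hamiltonian structure and the two policy parameter types are not modeled.

```python
import numpy as np


def td_error(reward, v_next, v_curr, gamma=0.97):
    """One-step TD error: r + gamma * V(x') - V(x)."""
    return reward + gamma * v_next - v_curr


def natural_gradient_step(theta, fisher, score, delta,
                          lr=0.01, decay=0.99, eps=1e-6):
    """One hypothetical natural-gradient actor update.

    theta  : policy parameter vector
    fisher : running estimate of the Fisher information matrix
    score  : gradient of log pi(u | x) with respect to theta
    delta  : TD error supplied by the critic
    """
    # Update the running Fisher estimate with the outer product of the score.
    fisher = decay * fisher + (1.0 - decay) * np.outer(score, score)
    # Ordinary ("vanilla") policy-gradient estimate.
    grad = delta * score
    # Natural gradient: precondition with the (regularized) inverse Fisher matrix.
    nat_grad = np.linalg.solve(fisher + eps * np.eye(theta.size), grad)
    return theta + lr * nat_grad, fisher
```

In an actor-critic loop, the critic would call `td_error` at every time step and pass the result to the actor update. Preconditioning with the Fisher matrix makes the step invariant to how the policy is parameterized, which is the usual motivation for expecting faster learning than with the ordinary gradient.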
