Learning Quadcopter Maneuvers with Concurrent Methods of Policy Optimization

Pei-Hua Huang; Osamu Hasegawa

doi:10.20965/jaciii.2017.p0639

Abstract

This study presents an aerial robotic application of deep reinforcement learning that applies an asynchronous learning framework and trust region policy optimization to a simulated quad-rotor helicopter (quadcopter) environment. In particular, we optimized a control policy asynchronously through interaction with concurrent instances of the environment. The control system was benchmarked and extended with examples to tackle continuous state-action tasks for the quadcopter: hovering control and balancing an inverted pole. Performing these maneuvers required continuous actions for sensitive control of small acceleration changes of the quadcopter, thereby maximizing the scalar reward of the defined tasks. The simulation results demonstrated improved learning speed and reliability for the tasks.
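The idea of optimizing one policy from several concurrent environment instances under a trust-region constraint can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a hypothetical 1-D altitude-hold task, a linear Gaussian policy, and a plain policy-gradient step that is shrunk by backtracking until the mean KL divergence stays under a bound, in place of the natural-gradient machinery of full trust region policy optimization. Rollouts are gathered concurrently and the update is applied once per batch, which is a synchronous simplification of an asynchronous scheme. All names and constants (`SIGMA`, `DT`, `TARGET_Z`, `rollout`, `collect`, `trust_region_step`) are illustrative assumptions.

```python
import random
from concurrent.futures import ThreadPoolExecutor

SIGMA = 0.5      # fixed exploration noise of the Gaussian policy (assumed)
DT = 0.05        # integration step in seconds (assumed)
TARGET_Z = 1.0   # hover altitude in metres (assumed)


def features(z, vz):
    """Altitude error and vertical speed: the inputs of the linear policy."""
    return (TARGET_Z - z, vz)


def mean_action(theta, feat):
    """Mean of the Gaussian policy for the given features."""
    return theta[0] * feat[0] + theta[1] * feat[1]


def rollout(theta, seed, steps=100):
    """Run one episode in a private toy environment instance."""
    rng = random.Random(seed)  # per-instance RNG keeps workers independent
    z, vz = rng.uniform(0.0, 2.0), 0.0
    traj = []
    for _ in range(steps):
        feat = features(z, vz)
        action = rng.gauss(mean_action(theta, feat), SIGMA)  # net vertical acceleration
        vz += action * DT
        z += vz * DT
        reward = -((TARGET_Z - z) ** 2) - 0.1 * vz ** 2  # scalar reward
        traj.append((feat, action, reward))
    return traj


def collect(theta, n_envs, base_seed):
    """Gather one trajectory from each of n_envs concurrent instances."""
    seeds = range(base_seed, base_seed + n_envs)
    with ThreadPoolExecutor(max_workers=n_envs) as pool:
        return list(pool.map(lambda s: rollout(theta, s), seeds))


def policy_gradient(theta, trajs):
    """Score-function gradient estimate using reward-to-go and a mean baseline."""
    samples = []
    for traj in trajs:
        togo = 0.0
        for feat, action, reward in reversed(traj):
            togo += reward
            samples.append((feat, action, togo))
    baseline = sum(s[2] for s in samples) / len(samples)
    grad = [0.0, 0.0]
    for feat, action, togo in samples:
        score = (action - mean_action(theta, feat)) / SIGMA ** 2
        for i in range(2):
            grad[i] += score * feat[i] * (togo - baseline) / len(samples)
    return grad


def mean_kl(theta_old, theta_new, trajs):
    """Mean KL divergence between old and new policies over visited states."""
    total, count = 0.0, 0
    for traj in trajs:
        for feat, _, _ in traj:
            diff = mean_action(theta_new, feat) - mean_action(theta_old, feat)
            total += diff * diff / (2.0 * SIGMA ** 2)
            count += 1
    return total / count


def trust_region_step(theta, trajs, max_kl=0.01, step=1.0):
    """Halve the gradient step until the mean KL bound is satisfied."""
    grad = policy_gradient(theta, trajs)
    for _ in range(50):
        candidate = [t + step * g for t, g in zip(theta, grad)]
        if mean_kl(theta, candidate, trajs) <= max_kl:
            return candidate
        step *= 0.5
    return list(theta)  # no acceptable step found: keep the old policy


theta = [0.0, 0.0]
for iteration in range(5):
    trajs = collect(theta, n_envs=4, base_seed=100 * iteration)
    theta = trust_region_step(theta, trajs)
```

Threads are used here only to keep the sketch self-contained. A physics simulator would more plausibly run each instance in its own process, and an asynchronous variant would let each worker push updates without waiting for the others.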

This article is licensed under a Creative Commons [Attribution-NoDerivatives 4.0 International] license (https://creativecommons.org/licenses/by-nd/4.0/).
The journal is fully Open Access under Creative Commons licenses and all articles are free to access at JACIII Official Site.
https://www.fujipress.jp/jaciii/jc-about/
