A Study of Effective Exploration Methods for Reinforcement Learning Control of a Flapping-Wing UAV

平井 健太郎; 齋藤 未来; 李 直; 謝 砺鋒; 笹崎 舜翔; 渡邉 孝信

doi:10.1299/jsmermd.2023.2P1-D11

Abstract

We investigated an effective method for scheduling exploration in a reinforcement learning algorithm, with the aim of controlling a flapping-wing unmanned aerial vehicle (UAV) that we have developed. A Deep Q-Network (DQN) algorithm was employed to determine the optimal gain parameters for PID control of the airframe's yaw angle. Although this PID-DQN hybrid method can stabilize the yaw angle, we noticed that the gain parameters tend to become biased toward values that are rated highly in the early stages of learning. In this study, we solved this problem by modifying the scheduling of the epsilon-greedy method in DQN.
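
As a rough illustration of what scheduling epsilon-greedy exploration can mean, the sketch below holds the exploration rate at its initial value for a while and then decays it linearly. The abstract does not state which schedule the authors adopted, so this is not their method: the function names, the hold-then-decay shape, and every numeric default are assumptions chosen only for illustration. The action index stands for one entry in a hypothetical discrete set of candidate PID gain settings.

```python
import random


def epsilon_at(step, eps_start=1.0, eps_end=0.05,
               hold_steps=1000, decay_steps=9000):
    """Exploration rate at a given training step.

    Assumed schedule (not taken from the paper): keep eps_start for
    hold_steps, then decay linearly to eps_end over decay_steps.
    """
    if step < hold_steps:
        return eps_start
    frac = min(1.0, (step - hold_steps) / decay_steps)
    return eps_start + frac * (eps_end - eps_start)


def select_action(q_values, step, rng=random):
    """Epsilon-greedy choice over discrete candidate gain settings.

    q_values[i] is the estimated value of the i-th (hypothetical)
    PID gain setting. With probability epsilon a random index is
    returned; otherwise the index with the highest value is returned.
    """
    if rng.random() < epsilon_at(step):
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])
```

Holding epsilon high for longer keeps early, noisy value estimates from dominating which gain settings get tried, which is one plausible way to counter the early bias described above. Whether it matches the authors' modification cannot be determined from the abstract.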
