2P1-G12 Efficiency Improvement of Reinforcement Learning through Value Function Composition Using Parallel Processing

仲間 祐貴; 當眞 嗣久; 山田 孝治; 遠藤 聡志

doi:10.1299/jsmermd.2010._2P1-G12_1

The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec)

Online ISSN : 2424-3124

2010

Session ID : 2P1-G12

DOI https://doi.org/10.1299/jsmermd.2010._2P1-G12_1


2P1-G12 Efficiency Improvement of Reinforcement Learning Using Parallel Processing for Combination Value Function

Yuuki Nakama, Tsuguhisa Thoma, Koji Yamada, Satoshi Endo


Keywords: Reinforcement Learning, Multi-Agent System, Parallel Processing

CONFERENCE PROCEEDINGS FREE ACCESS


Abstract

This paper addresses improving the efficiency of reinforcement learning by using parallel processing to combine value functions. We propose a method that periodically merges the Q tables of local learning clusters into a global Q table. We apply this method to two problems: a maze problem and a behavior rule detection problem for a modular robot. Q-learning and the Monte Carlo method are compared with the profit sharing method for learning robot behaviors. We present computer experiments run on 40 PC clusters. The convergence time and the number of learning iterations are evaluated and discussed.
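
The periodic merge of local Q tables into a global Q table can be illustrated with a minimal sketch. The abstract does not state the merge rule, the environment details, or the hyperparameters, so everything below is an assumption made for illustration: a hypothetical one-dimensional corridor stands in for the maze, the merge is an element-wise mean, the workers run one after another as a stand-in for parallel PCs, and all names (`run_episode`, `merge_q_tables`, `train`) are invented here.

```python
import random

# Hypothetical setup (not from the paper): a 1-D corridor with states 0..5,
# start at state 0, goal at state 5, reward 1.0 on reaching the goal.
N_STATES = 6
ACTIONS = (-1, +1)                       # index 0: move left, index 1: move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2    # assumed hyperparameters


def new_q_table():
    """Return a Q table of zeros, indexed as q[state][action]."""
    return [[0.0, 0.0] for _ in range(N_STATES)]


def run_episode(q, rng, max_steps=100):
    """Run one tabular Q-learning episode, updating q in place."""
    s = 0
    for _ in range(max_steps):
        # Epsilon-greedy action selection; ties are broken at random.
        if rng.random() < EPSILON or q[s][0] == q[s][1]:
            a = rng.randrange(2)
        else:
            a = 0 if q[s][0] > q[s][1] else 1
        s_next = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        done = s_next == N_STATES - 1
        reward = 1.0 if done else 0.0
        target = reward if done else reward + GAMMA * max(q[s_next])
        q[s][a] += ALPHA * (target - q[s][a])
        s = s_next
        if done:
            break


def merge_q_tables(tables):
    """Combine local Q tables by element-wise mean (an assumed merge rule)."""
    n = len(tables)
    return [[sum(t[s][a] for t in tables) / n for a in range(2)]
            for s in range(N_STATES)]


def train(n_workers=4, rounds=20, episodes_per_round=5, seed=0):
    """Alternate local learning and a periodic merge into the global table."""
    rngs = [random.Random(seed + i) for i in range(n_workers)]
    global_q = new_q_table()
    for _ in range(rounds):
        # Each worker starts the round from a copy of the global table.
        local_tables = [[row[:] for row in global_q] for _ in range(n_workers)]
        # Sequential stand-in for workers that would run on separate PCs.
        for q, rng in zip(local_tables, rngs):
            for _ in range(episodes_per_round):
                run_episode(q, rng)
        global_q = merge_q_tables(local_tables)
    return global_q
```

Calling `train()` returns the merged table after the final round. Other merge rules, such as an element-wise maximum or a visit-count-weighted mean, fit the same loop by swapping `merge_q_tables`; which rule the authors used is not stated in the abstract.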
