Host : The Japanese Society for Artificial Intelligence
Name : The 33rd Annual Conference of the Japanese Society for Artificial Intelligence, 2019
Number : 33
Location : [in Japanese]
Date : June 04, 2019 - June 07, 2019
In deep reinforcement learning, it is difficult for training to converge when exploration is insufficient or rewards are sparse. Moreover, in certain tasks the number of exploration steps may be limited. It is therefore considered effective to learn on source tasks in advance in order to accelerate learning on the target tasks. In this research, we propose a method that trains a model able to perform well on a variety of target tasks using an evolutionary algorithm and a policy gradient method. In this method, agents explore multiple environments with a diverse set of neural networks to train a general model with an evolutionary algorithm and a policy gradient method. In the experiments, we assume multiple 3D control source tasks. After training the model with our method on the source tasks, we show how effective the model is on the 3D control target tasks.
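The abstract does not specify the exact update rule, so the following is only a minimal sketch of the general idea: an evolution-strategy update whose fitness is averaged over several source tasks, yielding one set of parameters that works across all of them. The toy "tasks" (target vectors with negative squared-distance fitness), the hyperparameters, and the function names are all illustrative assumptions, not the paper's actual environments or method. Note that this mean-subtracted ES estimator has the same score-function form as a policy gradient over Gaussian-perturbed parameters.

```python
import numpy as np

# Illustrative source tasks (assumption): each task is a target vector,
# and a policy's return is the negative squared distance to that target.
TASKS = [np.array([1.0, -2.0]), np.array([0.5, 0.0]), np.array([-1.0, 1.5])]

def fitness(theta, task):
    # Higher is better; maximized when theta equals the task's target.
    return -np.sum((theta - task) ** 2)

def train_general_policy(n_iters=300, pop_size=100, sigma=0.1, lr=0.05, seed=0):
    """Evolution-strategy training with fitness averaged over all source
    tasks, so the returned parameters perform reasonably on every task."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    for _ in range(n_iters):
        # Sample a population of Gaussian parameter perturbations.
        noise = rng.standard_normal((pop_size, theta.size))
        # Evaluate each perturbed policy on every source task and average.
        returns = np.array([
            np.mean([fitness(theta + sigma * eps, task) for task in TASKS])
            for eps in noise
        ])
        # Mean-subtracted returns act as advantages (baseline reduces variance);
        # the resulting update is the standard ES gradient estimate.
        adv = returns - returns.mean()
        theta += lr / (pop_size * sigma) * noise.T @ adv
    return theta

theta = train_general_policy()
```

Because the fitness is averaged over the source tasks, the parameters drift toward a compromise that no single task would produce on its own, which is the sense in which the trained model is "general" before being adapted to a target task.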