Abstract
We propose a novel hybrid machine learning method that uses both a simulator and real hardware. First, a simulator of the hardware is built from data acquired on the real hardware, using neural networks trained by back-propagation. Next, the target controller for the hardware is trained by reinforcement learning using only this learned simulator. Finally, the trained controller is applied to the real hardware. Because neither the simulator training nor the controller training uses the real hardware once the data have been sampled, the load on the hardware is lower than when learning directly on it, and the controller can be optimized faster than real-time learning allows. As an example, we applied the method to the pendulum swing-up task, a typical nonlinear control problem, and it worked successfully.
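The three stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: `real_pendulum_step` is a hypothetical stand-in for the real hardware, all physical constants, network sizes, and learning rates are assumed values, and a simple random-search hill climber is substituted for the reinforcement learning step to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)
DT = 0.05  # assumed control period [s]

def real_pendulum_step(theta, omega, u):
    """Hypothetical stand-in for the real hardware (theta = 0 hangs down)."""
    alpha = -9.8 * np.sin(theta) - 0.1 * omega + u  # assumed g/l and damping
    omega = omega + DT * alpha
    return theta + DT * omega, omega

def features(theta, omega, u):
    """Network inputs, roughly scaled to [-1, 1]."""
    return np.stack([np.sin(theta), np.cos(theta), omega / 8.0, u / 2.0], axis=-1)

def forward(p, x):
    """One-hidden-layer network; returns hidden activations and output."""
    h = np.tanh(x @ p["W1"] + p["b1"])
    return h, h @ p["W2"] + p["b2"]

def train_simulator(p, x, y, lr=0.05, steps=2000):
    """Full-batch gradient descent with hand-written back-propagation."""
    losses = []
    for _ in range(steps):
        h, pred = forward(p, x)
        err = pred - y
        losses.append(0.5 * float(np.mean(err ** 2)))
        d_out = err / len(x)                          # dLoss/dpred
        d_hid = (d_out @ p["W2"].T) * (1.0 - h ** 2)  # back through tanh
        p["W2"] -= lr * (h.T @ d_out)
        p["b2"] -= lr * d_out.sum(axis=0)
        p["W1"] -= lr * (x.T @ d_hid)
        p["b1"] -= lr * d_hid.sum(axis=0)
    return losses

def sim_step(p, theta, omega, u):
    """Learned simulator: predict the acceleration, then integrate one step."""
    x = features(np.array([theta]), np.array([omega]), np.array([u]))
    alpha = 10.0 * float(forward(p, x)[1][0, 0])  # undo the target scaling
    omega = omega + DT * alpha
    return theta + DT * omega, omega

def rollout_return(w, step_fn, steps=100):
    """Sum of pendulum heights (-cos theta) under a small tanh policy."""
    theta, omega, total = 0.0, 0.0, 0.0
    for _ in range(steps):
        obs = np.array([np.sin(theta), np.cos(theta), omega / 8.0])
        u = 2.0 * float(np.tanh(w @ obs))  # torque limited to [-2, 2]
        theta, omega = step_fn(theta, omega, u)
        total += -float(np.cos(theta))
    return total

def train_controller(p, iters=200):
    """Random-search hill climbing on the learned simulator only (RL stand-in)."""
    sim = lambda th, om, u: sim_step(p, th, om, u)
    w = np.zeros(3)
    best = rollout_return(w, sim)
    history = [best]
    for _ in range(iters):
        cand = w + 0.5 * rng.standard_normal(3)
        r = rollout_return(cand, sim)
        if r > best:  # keep a candidate only if it improves the return
            w, best = cand, r
        history.append(best)
    return w, history

# Stage 1: sample data once from the "hardware" and fit the simulator.
n = 2000
theta = rng.uniform(-np.pi, np.pi, n)
omega = rng.uniform(-8.0, 8.0, n)
u = rng.uniform(-2.0, 2.0, n)
_, omega_next = real_pendulum_step(theta, omega, u)
x = features(theta, omega, u)
y = ((omega_next - omega) / DT / 10.0)[:, None]  # scaled angular acceleration
params = {"W1": 0.3 * rng.standard_normal((4, 16)), "b1": np.zeros(16),
          "W2": 0.3 * rng.standard_normal((16, 1)), "b2": np.zeros(1)}
losses = train_simulator(params, x, y)

# Stage 2: train the controller without touching the "hardware".
w, history = train_controller(params)

# Stage 3: apply the trained controller to the "hardware".
real_return = rollout_return(w, real_pendulum_step)
print(f"simulator loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
print(f"return in simulator: {history[-1]:.2f}, on hardware stand-in: {real_return:.2f}")
```

The point of the sketch is the data flow: the hardware stand-in is called only during sampling and in the final evaluation, while every controller rollout during training goes through `sim_step`. Whether a controller this small actually swings the pendulum up depends on the assumed constants and is not claimed here.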