Abstract
This paper considers an application of reinforcement learning to the control of a radio-controlled helicopter. The Actor-Critic algorithm is employed, with both the Actor and the Critic realized as radial basis function (RBF) neural networks. In the Critic, the connection weights between neurons are updated by backpropagation so as to minimize the squared temporal-difference (TD) error. During learning, control signals are generated by adding noise to the output of the Actor, and the Actor's connection weights are updated based on the evaluation of the resulting control. Computer simulations using a two-dimensional model show that the network learns to control hovering flight.
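
As an illustration of the kind of update summarized above, the following is a minimal sketch of one Actor-Critic learning step with RBF features. It is not the implementation used in this paper: the one-dimensional state, the single-layer linear readout, the function names, the learning rates, and in particular the Actor rule (TD error multiplied by the exploration noise) are assumptions made only for this example.

```python
import numpy as np


def rbf_features(state, centers, width):
    """Gaussian RBF activations for a scalar state (hypothetical 1-D example)."""
    return np.exp(-((state - centers) ** 2) / (2.0 * width ** 2))


def actor_critic_step(w, theta, s, s_next, reward, noise, centers,
                      width=0.5, gamma=0.95, lr_critic=0.1, lr_actor=0.05):
    """One learning step; all hyperparameter values are illustrative assumptions.

    w     : Critic output weights, V(s) = w . phi(s)
    theta : Actor output weights, action = theta . phi(s) + noise
    noise : exploration noise that was added to the Actor output at state s
    """
    phi = rbf_features(s, centers, width)
    phi_next = rbf_features(s_next, centers, width)

    # TD error: r + gamma * V(s') - V(s)
    td_error = reward + gamma * (w @ phi_next) - (w @ phi)

    # Critic: gradient step on 0.5 * td_error**2, treating the target as fixed
    w_new = w + lr_critic * td_error * phi

    # Actor (assumed rule): reinforce the noise direction when the TD error
    # is positive, and move away from it when the TD error is negative
    theta_new = theta + lr_actor * td_error * noise * phi

    return w_new, theta_new, td_error


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    centers = np.linspace(-1.0, 1.0, 5)
    w = np.zeros(5)
    theta = np.zeros(5)
    noise = rng.normal(scale=0.1)
    w, theta, delta = actor_critic_step(w, theta, 0.0, 0.1, 1.0, noise, centers)
    print("TD error:", delta)
```

In this sketch the Critic step is a semi-gradient update, which coincides with backpropagation through a single linear output layer; a network with trainable RBF centers or widths would need the corresponding additional gradient terms.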