In general, meta-parameters in a reinforcement learning system such as learning rate are empirically determined and fixed during the learning. Therefore, when an external environment has changed, the sytem cannot adjust to the change. Meanwhile, it is suggested that the biological brain could conduct reinforcement learning and adjust to the external environment by controlling neuromodulators corresponding to meta-parameters. In the present paper, based on the above suggestion, a method to adjust meta-parameters using the TD-error is proposed. Through computer simulations using maze problem and inverted pendulum control problem, it is verified that meta-parameters are appropriately adjusted according to the amplitude of the TD-error.
J-STAGEがリニューアルされました! https://www.jstage.jst.go.jp/browse/-char/ja/