2017 Volume 137 Issue 12 Pages 1676-1683
Reinforcement learning is a method of training an agent to accomplish a task by selecting a suitable action for the current state. The Deep Q Network combines a convolutional neural network with Q-learning. By using the convolutional neural network, the Deep Q Network can be applied to tasks with high-dimensional input states without special pre-processing. However, the Deep Q Network needs a large number of iterations to produce good outputs. The reason is that the Deep Q Network uses ε-greedy action selection, and ε is set to a high value (close to one) in the initial stage of learning. A high ε value means that the agent selects actions mostly at random during learning. Hence, the agent needs a large number of learning iterations to accomplish a task. This paper adopts Boltzmann selection for action selection in the Deep Q Network. Finally, our algorithm was applied to two kinds of Arcade Learning Environment tasks, and the results showed that it performs better than the ordinary Deep Q Network.
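The contrast between the two action-selection rules mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the exact formulation, so the code below assumes the standard textbook forms, where ε-greedy picks a uniformly random action with probability ε and Boltzmann selection samples an action with probability proportional to exp(Q(a) / T). The function names and the `temperature` parameter are hypothetical names chosen for illustration.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def boltzmann_probabilities(q_values, temperature):
    """Softmax over Q-values: p(a) is proportional to exp(Q(a) / temperature).

    `temperature` is an assumed parameter name; the abstract does not
    state how the paper sets or schedules it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    q_max = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann_select(q_values, temperature, rng=random):
    """Sample one action index according to the Boltzmann probabilities."""
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

Under these assumptions, ε-greedy with ε close to one ignores the Q-values almost entirely, whereas Boltzmann selection still favors actions with higher estimated value even while exploring, which is consistent with the motivation stated in the abstract.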
The transactions of the Institute of Electrical Engineers of Japan.C
The Journal of the Institute of Electrical Engineers of Japan