2021 Volume 12 Issue 3 Pages 323-335
Recent advances in deep reinforcement learning have led to its application in a number of real-world problems. One of the most widely used deep reinforcement learning algorithms is deep Q-learning, which uses neural networks to approximate the action-value function. Training of deep Q-networks (DQNs) is usually restricted to first-order gradient-based methods. Although second-order methods have been shown to converge faster on several supervised learning problems, their application in deep reinforcement learning is limited. This paper attempts to accelerate the training of deep Q-networks by introducing a second-order Nesterov's accelerated quasi-Newton method, and to verify the feasibility of second-order methods in deep reinforcement learning. We evaluate the performance of deep reinforcement learning using double DQNs for VLSI global routing. The results show that the proposed method obtains better routing solutions than DQNs trained with conventional first-order algorithms.
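As context for readers, the core idea of a Nesterov-accelerated quasi-Newton update, evaluating the gradient at the momentum look-ahead point and scaling it by a BFGS-style inverse-Hessian approximation, can be sketched on a simple quadratic objective. This is a minimal NumPy illustration under assumed hyperparameters (`lr`, `mu`) and is not the paper's implementation or its DQN training loop:

```python
import numpy as np

def naq_minimize(grad, w0, lr=0.5, mu=0.1, iters=100):
    """Minimal Nesterov-accelerated quasi-Newton (NAQ) sketch.

    Each step evaluates the gradient at the Nesterov look-ahead
    point w + mu*v and scales the search direction by a BFGS-style
    inverse-Hessian approximation H.
    """
    n = w0.size
    w = w0.astype(float).copy()
    v = np.zeros(n)
    H = np.eye(n)                     # inverse-Hessian approximation
    g_ahead = grad(w + mu * v)        # gradient at look-ahead point
    for _ in range(iters):
        v_new = mu * v - lr * H @ g_ahead        # momentum + QN direction
        w_new = w + v_new
        g_new_ahead = grad(w_new + mu * v_new)
        # BFGS update built from successive look-ahead points
        s = (w_new + mu * v_new) - (w + mu * v)
        y = g_new_ahead - g_ahead
        sy = s @ y
        if sy > 1e-12:                           # curvature condition
            rho = 1.0 / sy
            I = np.eye(n)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        w, v, g_ahead = w_new, v_new, g_new_ahead
    return w

# Toy problem: minimize 0.5 * w^T A w - b^T w, whose minimizer solves A w = b
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, 2.0])
w_star = np.linalg.solve(A, b)
w = naq_minimize(lambda w: A @ w - b, np.zeros(2))
```

In a DQN setting the quadratic would be replaced by the temporal-difference loss of the Q-network, with gradients obtained by backpropagation; the sketch only conveys the update rule's structure.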