Applying Double DQN to Reinforcement learning of Automated Designing ICT System

Natsuki Okamura; Yutaka Yakuwa; Takayuki Kuroda; Ikuko E. Yairi

doi:10.1587/comex.2022XBL0100

This article has now been updated. Please use the final version.

Keywords: Network System Design, Design Automation, Machine Learning, Reinforcement Learning

JOURNAL FREE ACCESS Advance online publication

Article ID: 2022XBL0100

The final version of this article is now available: Vol. 11 (2022), No. 10 pp. 667-672

Abstract

Designing an ICT system that provides network application services (DICTS) involves selecting appropriate equipment for various requirements, arranging it optimally, and connecting it correctly, and it requires specialized knowledge and enormous human effort. Autonomous DICTS technology based on deep reinforcement learning (DRL) suffers from a fundamental problem of very long learning time: because rewards are sparse while the space of possible selections, arrangements, and connections is vast, the learner can accidentally overestimate the value of a specific configuration. This paper applies our improved Double DQN, a typical DRL algorithm for suppressing overestimation, to an autonomous DICTS technology named Weaver and demonstrates the possibility of reducing the number of episodes until convergence by 25%.
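
As a point of reference for the overestimation issue mentioned in the abstract, the following minimal Python sketch contrasts the standard DQN bootstrap target with the standard Double DQN target. It shows only the textbook form of the algorithm, not the authors' improved variant or Weaver's implementation, whose details are not given here. The function names, the discount factor, and the list-based Q-value representation are illustrative assumptions.

```python
from typing import Sequence


def dqn_target(reward: float,
               next_q_target: Sequence[float],
               gamma: float = 0.99,
               done: bool = False) -> float:
    """Standard DQN target: the target network both selects and
    evaluates the next action, which tends to overestimate values."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward: float,
                      next_q_online: Sequence[float],
                      next_q_target: Sequence[float],
                      gamma: float = 0.99,
                      done: bool = False) -> float:
    """Standard Double DQN target: the online network selects the
    next action and the target network evaluates it."""
    if done:
        return reward
    # Action selection uses the online network's estimates.
    best_action = max(range(len(next_q_online)),
                      key=lambda a: next_q_online[a])
    # Action evaluation uses the target network's estimate.
    return reward + gamma * next_q_target[best_action]
```

When the two networks disagree about which next action is best, the Double DQN target is never larger than the DQN target for the same inputs, since it evaluates one particular action rather than taking the maximum over all of them. This is the mechanism by which Double DQN is generally understood to damp overestimation.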
