2022 Volume 13 Issue 2 Pages 427-433
Reinforcement learning is a promising machine learning paradigm for edge computing. However, its high computational cost poses a challenge when implementing it on devices with limited circuit resources and tight power budgets. In this study, we investigated the relationship between the bit length of floating-point operations and the learning performance of a reinforcement learning algorithm. For the FrozenLake maze problem, we found that the learning performance of 8-bit floating-point arithmetic degraded, while that of 16-bit floating-point arithmetic was comparable to that of 64-bit CPU arithmetic. Our results provide a practical guideline for designing dedicated reinforcement-learning hardware with minimal circuit resources and power consumption.