An investigation of the relationship between numerical precision and performance of Q-learning for hardware implementation

Daisuke Oguchi; Satoshi Moriya; Hideaki Yamamoto; Shigeo Sato

doi:10.1587/nolta.13.427

Abstract

Reinforcement learning is promising as a machine learning paradigm in edge computing. However, its high computational cost poses a challenge when implementing in devices with limited circuit resources and power consumption. In this study, we investigated the relationship between the bit-length of floating-point operations and the learning performance of the reinforcement learning algorithm. In the case of the FrozenLake maze problem, we found that the learning performance of 8-bit floating-point arithmetic decreased, while that of 16-bit floating-point arithmetic was comparable to that of 64-bit CPU arithmetic. Our results provide a practical guideline for designing a dedicated reinforcement learning hardware with minimum circuit resources and power consumption.

Content from these authors

Favorites & Alerts

Add to favorites
Additional info alert
Citation alert
Authentication alert

Corresponding author

Register with J-STAGE for free!