An Optimal Control of Auto-Sleep Systems Based on the Q-Learning

Hiroyuki OKAMURA; Takeshi ISHIKURA; Tadashi DOHI

doi:10.9746/sicetr1965.39.590

Keywords: Dynamic Power Management, auto-sleep system, Queueing Model, semi-Markov decision process, Q-Learning

2003 Volume 39 Issue 6 Pages 590-599

Abstract

Dynamic power management (DPM) is one of the most effective techniques for reducing energy consumption. In particular, the auto-sleep function is the simplest yet still effective way to reduce electric power consumption. In this paper, we consider a stochastic model for determining the auto-sleep timing sequentially. More precisely, we develop an optimal control scheme for the auto-sleep timing based on Q-learning, a reinforcement learning algorithm that is closely related to the Markov decision process (MDP) and the semi-Markov decision process (SMDP). First, we reformulate the stochastic auto-sleep model as an SMDP. Second, we establish an optimal control scheme that determines the optimal auto-sleep timing by applying the Q-learning algorithm. Finally, numerical examples with real data are presented to investigate the effectiveness of DPM.
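
As a rough illustration of the idea summarized in the abstract, the following is a minimal sketch of tabular Q-learning with an SMDP-style discount applied to the choice of an auto-sleep timeout. It is not the authors' model: the single "idle" state, the exponentially distributed idle periods, the power figures (`p_idle`, `p_sleep`, `e_wake`), the candidate timeouts, and the function names `smdp_q_update` and `learn_timeouts` are all assumptions made for this example. The energy cost of each idle period is also treated as a lump sum rather than being discounted within the period, which is a simplification.

```python
import math
import random


def smdp_q_update(q, state, action, reward, tau, next_state, alpha=0.1, beta=0.01):
    """One SMDP-style Q-learning step.

    The value of the next state is discounted by exp(-beta * tau),
    where tau is the (random) time spent before the next decision.
    """
    best_next = max(q[next_state].values())
    target = reward + math.exp(-beta * tau) * best_next
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def learn_timeouts(timeouts, episodes=2000, mean_idle=5.0,
                   p_idle=1.0, p_sleep=0.1, e_wake=3.0,
                   epsilon=0.1, seed=0):
    """Learn Q-values for candidate auto-sleep timeouts (hypothetical setup)."""
    rng = random.Random(seed)
    q = {"idle": {t: 0.0 for t in timeouts}}
    for _ in range(episodes):
        # Epsilon-greedy choice of the timeout for this idle period.
        if rng.random() < epsilon:
            t = rng.choice(timeouts)
        else:
            t = max(q["idle"], key=q["idle"].get)
        # Assumed idle-period length: exponential with the given mean.
        idle = rng.expovariate(1.0 / mean_idle)
        # Energy used while waiting for the timeout to expire.
        energy = p_idle * min(idle, t)
        if idle > t:
            # The device slept for the rest of the period and must wake up.
            energy += p_sleep * (idle - t) + e_wake
        # Reward is the negative energy cost; the sojourn time is the idle period.
        smdp_q_update(q, "idle", t, -energy, idle, "idle")
    return q["idle"]


if __name__ == "__main__":
    learned = learn_timeouts([0.0, 1.0, 5.0, 10.0])
    for timeout, value in learned.items():
        print(f"timeout={timeout:5.1f}  Q={value:.2f}")
```

Under these assumptions, the timeout with the largest (least negative) Q-value is the one the learner currently prefers. The ranking depends entirely on the placeholder parameters and says nothing about the real data used in the paper.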
