IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C)
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
<Soft Computing and Learning>
Reinforcement Learning for Environments with Periodically Changing Rewards
澁谷 長史, 安信 誠二

2014, Vol. 134, No. 9, pp. 1325-1332

Abstract
This paper proposes a new reinforcement learning method for constructing agents in environments whose reward changes cyclically with time. The proposed method consists of two parts: (a) a cyclic action-value function obtained by superposing sinusoidal action-value functions in phasor representation, and (b) an algorithm that uses it. Reinforcement learning is a widely used framework for developing agents that can decide on suitable actions. However, conventional reinforcement learning enables an agent to learn suitable actions only in stationary environments. In contrast to conventional methods, the proposed method can be applied to learning in environments with cyclic, time-dependent reward. Experimental results show that the proposed method performs much better than conventional methods.
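The idea of a cyclic action-value function built from superposed sinusoids in phasor form can be illustrated with a small sketch. The abstract does not give the authors' update rule, so this is not their algorithm. It assumes a tabular setting with a known reward period, stores one complex coefficient (phasor) per harmonic for each state-action pair, and fits the coefficients with a plain gradient step on the squared error. The class name `CyclicQ`, its parameters, and the two-armed bandit used to exercise it are all hypothetical.

```python
import numpy as np

class CyclicQ:
    """Hypothetical cyclic action-value table:
    Q(s, a, t) = Re( sum_k C[k, s, a] * exp(j * w_k * t) ), w_k = 2*pi*k / period.
    The k = 0 term is the constant (time-independent) component."""

    def __init__(self, n_states, n_actions, period, n_harmonics=1):
        # Angular frequencies of the constant term and each harmonic.
        self.omegas = 2.0 * np.pi * np.arange(n_harmonics + 1) / period
        # One complex phasor per (harmonic, state, action).
        self.coef = np.zeros((n_harmonics + 1, n_states, n_actions), dtype=complex)

    def _rot(self, t):
        # Rotating unit phasors exp(j * w_k * t), shape (n_harmonics + 1,).
        return np.exp(1j * self.omegas * t)

    def values(self, s, t):
        # Action values of state s at time t, shape (n_actions,).
        return np.real(self._rot(t) @ self.coef[:, s, :])

    def update(self, s, a, t, target, alpha=0.1):
        # Gradient step on 0.5 * (target - Q)^2 with respect to the phasors.
        rot = self._rot(t)
        err = target - np.real(rot @ self.coef[:, s, a])
        self.coef[:, s, a] += alpha * err * np.conj(rot)
        return err

# Assumed toy problem: one state, two actions, rewards in antiphase with period 8.
period = 8
q = CyclicQ(n_states=1, n_actions=2, period=period)
for epoch in range(200):
    for t in range(period):
        r = np.cos(2.0 * np.pi * t / period)
        q.update(0, 0, t, r)    # action 0 pays off most around t = 0
        q.update(0, 1, t, -r)   # action 1 pays off most around t = 4

print(np.argmax(q.values(0, 0)), np.argmax(q.values(0, 4)))
```

In this toy problem the target is the immediate reward. In a sequential task the target would instead be a bootstrapped estimate such as the reward plus the discounted value of the next state at the next time step, which is another assumption rather than something stated in the abstract. A time-independent value table cannot represent the switch in the best action between t = 0 and t = 4, whereas the phasor table can.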
© 2014 The Institute of Electrical Engineers of Japan (電気学会)