Performance of LQ-learning in POMDP Environments

Haeyeon Lee; Hiroyuki Kamaya; Kenich Abe

doi:10.11499/sicep.2002.0.174.0

SICE Annual Conference 2002

Conference information

Organizer: The Society of Instrument and Control Engineers

Co-organizers: IEEE/Industrial Electronic Society, IEEE/Robotics and Automation Society, IEEE/Control System Society

Keywords: Reinforcement Learning (RL), Labeling Q-learning (LQ-learning), Self-Organizing Map (SOM), Partially Observed Markov Decision Processes (POMDPs)

Conference proceedings and abstracts (free access)

p. 174

Abstract

In this paper, we propose a new type of LQ-learning for solving POMDPs. In a POMDP environment, the agent cannot observe the state of the environment directly. In LQ-learning, in order to discriminate between partially observed states, the agent attaches a label to each of the observations that are perceived as the same. Unlike our previous LQ-learning, we prepare knowledge about the environment in advance. This knowledge is acquired automatically by Kohonen's Self-Organizing Map (SOM), which provides the agent with knowledge about state transitions. The LQ-learning agent then attaches labels to observations with reference to the map obtained by the SOM.
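
The labeling idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the update rule or the labeling procedure, so the sketch assumes ordinary tabular Q-learning over (observation, label) pairs. The names `LabeledQAgent`, `attach_label`, and `label_map` are hypothetical, and the plain dictionary `label_map` merely stands in for the map that the paper obtains from a SOM.

```python
from collections import defaultdict


class LabeledQAgent:
    """Tabular Q-learning over (observation, label) pairs (illustrative sketch only)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        # Q-table keyed by ((observation, label), action); unseen entries are 0.0.
        self.q = defaultdict(float)

    def best_value(self, labeled_obs):
        """Largest Q-value available from a labeled observation."""
        return max(self.q[(labeled_obs, a)] for a in self.actions)

    def update(self, labeled_obs, action, reward, next_labeled_obs):
        """Standard one-step Q-learning update on labeled observations."""
        target = reward + self.gamma * self.best_value(next_labeled_obs)
        key = (labeled_obs, action)
        self.q[key] += self.alpha * (target - self.q[key])


def attach_label(label_map, prev_labeled_obs, action, observation, default=0):
    """Return (observation, label) for the new observation.

    Assumption: label_map is a plain dict keyed by
    (previous labeled observation, action, observation). It is a hypothetical
    placeholder for the SOM-derived map described in the abstract.
    """
    label = label_map.get((prev_labeled_obs, action, observation), default)
    return (observation, label)
```

With such a table, two states that produce the same observation (for example, two identical-looking corridor cells) receive different labels and therefore separate Q-values, which is the discrimination the abstract describes.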
