Abstract
We propose a new method for defining the constants of Complex-Valued Q-learning, a Reinforcement Learning algorithm that can deal with incomplete perception problems. By applying complex numbers to value functions, it enables agents without sufficient perception to recognize the context of their actions to some degree. We improve this method in two ways. First, we predict contexts not from adjacent situations, as before, but from the number of actions an agent has taken since a starting situation. Second, we use memory efficiently, in proportion to the number of steps required to obtain rewards. With these methods, our agents successfully solved more complex incomplete perception problems. We also consider a context-based design method for Complex-Valued Reinforcement Learning.
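To illustrate the general idea of attaching phase (context) information to value functions, the following is a minimal, hypothetical sketch, not the authors' exact formulation. It assumes a tabular agent whose Q-values are complex numbers and whose internal reference phasor rotates by an assumed hyperparameter `beta` on each step, so the same observation can yield different action preferences depending on how many steps have elapsed since the starting situation.

```python
import cmath
import random

class ComplexQAgent:
    """Illustrative sketch of a complex-valued Q-learning agent.

    Q[obs][a] is a complex number; a reference phasor that rotates
    with the step count lets phase encode action context.
    """

    def __init__(self, n_obs, n_actions, alpha=0.1, gamma=0.9,
                 beta=cmath.pi / 6):
        # beta is the per-step phase rotation (a hypothetical
        # hyperparameter chosen for this sketch).
        self.Q = [[0j] * n_actions for _ in range(n_obs)]
        self.alpha, self.gamma, self.beta = alpha, gamma, beta
        self.n_actions = n_actions

    def act(self, obs, step, eps=0.1):
        # Reference phasor encodes the number of steps taken so far.
        ref = cmath.exp(-1j * self.beta * step)
        if random.random() < eps:
            return random.randrange(self.n_actions)
        # Prefer the action whose Q-value best aligns with the reference.
        return max(range(self.n_actions),
                   key=lambda a: (self.Q[obs][a] * ref).real)

    def update(self, obs, a, r, next_obs, step):
        ref_next = cmath.exp(-1j * self.beta * (step + 1))
        best = max((self.Q[next_obs][b] * ref_next).real
                   for b in range(self.n_actions))
        # Rotate the TD target so its phase stays consistent with the
        # step count at which (obs, a) was visited.
        target = (r + self.gamma * best) * cmath.exp(1j * self.beta * step)
        self.Q[obs][a] += self.alpha * (target - self.Q[obs][a])
```

In this sketch, two visits to the same observation at different step counts see different rotated references, which is one simple way a perceptually aliased state can be disambiguated by context.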