A Learning Method for Multi-Agent Systems That Adapts to Changes in the Environment

今福 啓

doi:10.1527/tjsai.21.153

Abstract

In this paper, several agents form a multi-agent system, and each agent chooses its actions to maximize the reward it can obtain from the environment, without communicating with the other agents. However, when the environment around the multi-agent system changes, the actions that yield a high reward also change, so each agent must adapt to the change in order to keep obtaining a high reward. Each agent is therefore required to recognize the change in the environment through the reward it acquires and to learn which actions will yield a high reward.
To adapt to changes in the environment, we propose a new learning method for multi-agent systems. In the proposed method, each agent holds a matrix, called the "transition probability matrix," that expresses which actions will yield a high reward in the future. Each agent updates the elements of this matrix using not only the acquired reward but also the entropy of the matrix. The update procedure is classified into three cases according to whether the acquired reward and the entropy of the matrix increased or decreased in the past.
Simulations were carried out using the proposed method. The results show that each agent can adapt to changes in the reward and obtain a high reward.
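
The abstract does not state how the entropy of the transition probability matrix is computed, nor the three-case update rule itself. As a minimal sketch only, assume the matrix is row-stochastic (each row sums to 1) and that "entropy" means the Shannon entropy summed over rows. The function names, the row interpretation, and the example matrices below are hypothetical illustrations, not taken from the paper.

```python
import math

def row_entropy(row):
    """Shannon entropy (in nats) of one probability row; 0 * log(0) is treated as 0."""
    return -sum(p * math.log(p) for p in row if p > 0.0)

def matrix_entropy(matrix):
    """Sum of the row entropies of a row-stochastic matrix (an assumed definition)."""
    return sum(row_entropy(row) for row in matrix)

# Hypothetical agent with two actions; each row is assumed to be a
# probability distribution over the action to take next.
uniform = [[0.5, 0.5], [0.5, 0.5]]   # no preference between actions
peaked = [[0.9, 0.1], [0.1, 0.9]]    # strong preference in each row

print(matrix_entropy(uniform))
print(matrix_entropy(peaked))
```

Under these assumptions, a uniform matrix has the highest entropy and a sharply peaked one has lower entropy, so the quantity can serve as a rough measure of how strongly an agent has committed to particular actions.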
