Host: The Japanese Society for Artificial Intelligence
Name: The 38th Annual Conference of the Japanese Society for Artificial Intelligence
Number: 38
Location: [in Japanese]
Date: May 28, 2024 - May 31, 2024
Humans and animals learn from both successes and failures. When an action is followed by a reward, the value of that action increases and the action is chosen more often afterward. In contrast, when no reward follows, the value decreases and the action is chosen less often. This process is known as reinforcement learning. The coefficient that determines how much an action's value increases is called the positive learning rate, and the coefficient that determines how much it decreases is called the negative learning rate. In almost all reinforcement learning models used in the field of AI, the positive and negative learning rates are set to be identical and constant. However, recent studies have found that some animals learn asymmetrically, i.e., with different positive and negative learning rates, and that these learning rates change adaptively according to the reward distribution. Do humans also learn asymmetrically and adaptively? We conducted an online bandit experiment to examine this question. We also conducted a further decision-making experiment to analyze the results in terms of the relationship between experienced and described decision-making environments.
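As an illustration of what asymmetric learning means here, the following is a minimal sketch of a value update with separate positive and negative learning rates in a simple bandit task. It is not the model used in the study. The function names, the softmax choice rule, the inverse temperature `beta`, the initial value of 0.5, and the binary 0/1 rewards are all assumptions made for this example.

```python
import math
import random


def update_value(value, reward, alpha_pos, alpha_neg):
    """Move `value` toward `reward`, using alpha_pos when the outcome is
    at least as good as expected and alpha_neg when it is worse."""
    delta = reward - value  # prediction error
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return value + alpha * delta


def softmax_choice(values, beta, rng):
    """Pick an arm with probability proportional to exp(beta * value).
    The softmax rule is an assumption of this sketch."""
    weights = [math.exp(beta * v) for v in values]
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for arm, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return arm
    return len(values) - 1  # guard against floating-point rounding


def run_bandit(reward_probs, alpha_pos, alpha_neg, beta=3.0, n_trials=200, seed=0):
    """Simulate a hypothetical learner on a bandit with binary rewards."""
    rng = random.Random(seed)
    values = [0.5] * len(reward_probs)  # assumed neutral starting values
    choices = []
    for _ in range(n_trials):
        arm = softmax_choice(values, beta, rng)
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        values[arm] = update_value(values[arm], reward, alpha_pos, alpha_neg)
        choices.append(arm)
    return values, choices


if __name__ == "__main__":
    # Symmetric learner versus a learner that weights successes more heavily.
    print(run_bandit([0.3, 0.7], alpha_pos=0.2, alpha_neg=0.2)[0])
    print(run_bandit([0.3, 0.7], alpha_pos=0.4, alpha_neg=0.1)[0])
```

Setting `alpha_pos` equal to `alpha_neg` gives the symmetric update that the abstract describes as standard in AI models. Making the learning rates depend on the reward distribution, as reported for some animals, would require an additional rule that this sketch does not attempt.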