Journal of Advanced Computational Intelligence and Intelligent Informatics
Online ISSN : 1883-8014
Print ISSN : 1343-0130
ISSN-L : 1883-8014
Special Issue on Cutting Edge of Reinforcement Learning and its Hybrid Methods
Estimation of Different Reward Functions Latent in Trajectory Data
Masaharu Saito, Sachiyo Arai

2024 Volume 28 Issue 2 Pages 403-412

Abstract

In recent years, inverse reinforcement learning has attracted attention as a method for estimating the intention behind actions from the trajectories of various acting agents, including human-flow data. In the context of reinforcement learning, “intention” corresponds to a reward function. Conventional inverse reinforcement learning assumes that all trajectories are generated by policies learned under a single reward function. However, it is more natural to assume that the people in a human flow act according to multiple different policies. In this study, we incorporate an expectation-maximization algorithm into inverse reinforcement learning and propose a method for estimating the different reward functions latent in human-flow trajectories. The effectiveness of the proposed method was evaluated through a computer experiment based on human-flow data collected from subjects around airport gates.
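The general idea described in the abstract can be illustrated with a hypothetical sketch (this is not the authors' implementation, and all names, feature choices, and the likelihood model here are assumptions): each trajectory is reduced to a feature vector, a linear reward parameterized by per-cluster weights scores it, and an EM loop alternates between assigning trajectories to latent reward functions (E-step) and updating each reward's parameters (M-step). The trajectory likelihood below is a crude softmax over the observed trajectories, standing in for a proper maximum-entropy IRL partition function.

```python
import numpy as np

# Hypothetical sketch of EM over trajectories with multiple latent reward
# functions. Assumed setup: linear reward theta_k . phi(tau) on trajectory
# features phi(tau); likelihood is a softmax over observed trajectories
# (a simplification, not the paper's actual model).
rng = np.random.default_rng(0)

# Toy data: trajectory features drawn from two latent behaviors.
phi = np.vstack([
    rng.normal([2.0, 0.0], 0.3, size=(30, 2)),  # e.g., "head to the gate"
    rng.normal([0.0, 2.0], 0.3, size=(30, 2)),  # e.g., "browse the shops"
])
N, D = phi.shape
K = 2                                   # assumed number of latent rewards
theta = 0.1 * rng.normal(size=(K, D))   # reward weights per cluster
pi = np.full(K, 1.0 / K)                # mixture weights

def log_softmax_over_trajectories(scores):
    """Column-wise log-softmax: log p(tau_i | theta_k) over observed tau."""
    m = scores.max(axis=0)
    return scores - (m + np.log(np.exp(scores - m).sum(axis=0)))

for _ in range(50):
    # E-step: responsibility r[i, k] = p(cluster k | trajectory i).
    scores = phi @ theta.T                          # (N, K) reward scores
    log_r = np.log(pi) + log_softmax_over_trajectories(scores)
    log_r -= log_r.max(axis=1, keepdims=True)       # numerical stability
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)

    # M-step: one gradient-ascent step per cluster on the
    # responsibility-weighted log-likelihood.
    for k in range(K):
        s = phi @ theta[k]
        w = np.exp(s - s.max())
        w /= w.sum()                        # cluster-k softmax over trajectories
        grad = r[:, k] @ (phi - w @ phi)    # weighted feature-expectation gap
        theta[k] += 0.5 * grad / r[:, k].sum()
    pi = r.mean(axis=0)

labels = r.argmax(axis=1)  # hard assignment: which reward explains each trajectory
```

Under these assumptions the E-step plays the role of clustering trajectories by intention, while the M-step is a stand-in for the inner IRL reward update; the paper's actual method would replace the toy likelihood with a proper IRL objective.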


© 2024 Fuji Technology Press Ltd.

This article is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International license (https://creativecommons.org/licenses/by-nd/4.0/).
The journal is fully Open Access under Creative Commons licenses and all articles are free to access at JACIII official website.