Proceedings of the Annual Conference of JSAI
Online ISSN: 2758-7347
36th (2022)
Session ID: 3G4-OS-15b-01

Multi-objective Deep Reinforcement Learning for Crowd Guidance Policy Optimization
*Ryo NISHIDA, Yuki TANIGAKI, Masaki ONISHI, Koichi HASHIMOTO
Abstract

The objective of this study is to improve Multi-Objective Deep Reinforcement Learning (MODRL) for optimizing crowd guidance strategies. In general, MODRL methods are classified into outer-loop and inner-loop approaches. In the former, multiple objective functions are combined into a single objective using a scalarization function, and the Pareto front, i.e., the set of optimal trade-off solutions, is obtained by repeatedly updating the weights of the scalarization function and performing single-objective optimization. However, when a single-objective optimization run is computationally expensive, the overall cost of this approach grows in proportion to the number of weight updates. In contrast, the inner-loop approach is designed to learn an approximation of the Pareto front within a single learning process. In this study, we examine how the Pareto front is approximated under different action selection methods of Pareto-DQN, a representative inner-loop method. In the experiments, we evaluate the proposed method on a benchmark problem and finally discuss its application to the optimization of crowd guidance strategies.
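For illustration only, the following minimal Python sketch shows the two ideas contrasted in the abstract: an outer-loop sweep that repeatedly scalarizes a toy two-objective problem with different weight vectors, and a dominance-based action filter in the general spirit of inner-loop methods such as Pareto-DQN. The candidate return set, the stub solver solve_single_objective, the weight grid, and the Q-vectors are hypothetical stand-ins; this is not the authors' implementation, and the full deep reinforcement learning training loop, the crowd-simulation environment, and the specific action selection strategies compared in the paper are omitted.

import numpy as np

def solve_single_objective(weight, candidates):
    # Stub for one single-objective optimization run: pick the candidate
    # return vector that maximizes the linearly scalarized objective.
    scores = candidates @ weight
    return candidates[np.argmax(scores)]

def outer_loop_pareto_front(candidates, n_weights=11):
    # Outer-loop idea: repeat single-objective optimization for many
    # weight vectors and collect the resulting solutions. Each weight
    # update costs one full optimization run, so the total cost grows
    # linearly with the number of weight vectors.
    solutions = []
    for w1 in np.linspace(0.0, 1.0, n_weights):
        weight = np.array([w1, 1.0 - w1])  # convex weight vector
        solutions.append(solve_single_objective(weight, candidates))
    return np.unique(np.array(solutions), axis=0)

def nondominated_actions(q_vectors):
    # Dominance-based action filter: keep actions whose multi-objective
    # Q-vectors are not Pareto-dominated by any other action's Q-vector
    # (both objectives are to be maximized).
    keep = []
    for i, qi in enumerate(q_vectors):
        dominated = any(
            np.all(qj >= qi) and np.any(qj > qi)
            for j, qj in enumerate(q_vectors) if j != i
        )
        if not dominated:
            keep.append(i)
    return keep

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    candidates = rng.random((200, 2))        # toy attainable return vectors
    print(outer_loop_pareto_front(candidates))
    q = np.array([[1.0, 0.2], [0.8, 0.9], [0.3, 0.3], [1.0, 0.9]])
    print(nondominated_actions(q))           # -> [3]

In the sketch, the cost argument from the abstract is visible directly: outer_loop_pareto_front calls the solver once per weight vector, whereas the dominance filter operates on a single set of multi-objective value estimates learned in one process.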

© 2022 The Japanese Society for Artificial Intelligence