Generating Diverse Optimal Road Management Plans in Post-Disaster by Applying Envelope Multi-Objective Deep Reinforcement Learning

Soo-Hyun Joo; Yoshiki Ogawa; Yoshihide Sekimoto

doi:10.20965/jdr.2023.p0884

Abstract

The authors used a data-driven reinforcement learning model for the rapid post-disaster recovery of human mobility, building the reward framework from three recovery components: human-mobility recovery rate, road connectivity, and travel cost. Each component is assigned an importance relative to the others. However, if the preference differs from the original one, the optimal policy may not be identified. This limitation must be addressed to enhance the robustness and generalizability of the proposed deep Q-network model. Therefore, a set of optimal policies was identified over a predetermined preference space, and the underlying importance was evaluated by applying envelope multi-objective reinforcement learning. The agent used in this study could distinguish the importance of each damaged road based on a given relative preference and derive a road-recovery policy suitable for each criterion. Furthermore, the authors provide guidelines for constructing the optimal road-management plan. Based on the generalized policy network, the government can access diverse restoration strategies and select the most appropriate one depending on the disaster situation.
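The role of the relative preference can be illustrated with a minimal sketch. It assumes the three recovery components are combined by a linear weighting of per-objective value estimates, which is a common choice in multi-objective reinforcement learning but is not confirmed by the abstract. The function names, the objective order, and all numbers below are hypothetical, and the sketch covers only preference-weighted action selection, not the envelope update or the authors' network.

```python
from typing import Sequence

# Assumed objective order (hypothetical):
# (human-mobility recovery rate, road connectivity, negative travel cost)


def scalarize(q_vector: Sequence[float], preference: Sequence[float]) -> float:
    """Collapse a vector of per-objective values into one score using preference weights."""
    if len(q_vector) != len(preference):
        raise ValueError("q_vector and preference must have the same length")
    return sum(w * q for w, q in zip(preference, q_vector))


def greedy_road(q_values: Sequence[Sequence[float]], preference: Sequence[float]) -> int:
    """Return the index of the candidate road with the highest preference-weighted value."""
    return max(range(len(q_values)), key=lambda a: scalarize(q_values[a], preference))


# Made-up per-objective value estimates for three damaged roads (illustration only).
Q_VALUES = [
    (0.9, 0.2, -0.5),  # road 0: strong mobility recovery, weak connectivity
    (0.3, 0.8, -0.1),  # road 1: strong connectivity, low travel cost
    (0.5, 0.5, -0.9),  # road 2: balanced, but high travel cost
]

if __name__ == "__main__":
    for preference in [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.6, 0.2, 0.2)]:
        print(preference, "-> repair road", greedy_road(Q_VALUES, preference))
```

With the same value estimates, changing the preference vector changes which road is repaired first, which is the sense in which one generalized policy network could yield diverse restoration strategies.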


This article is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International license (https://creativecommons.org/licenses/by-nd/4.0/).
The journal is fully Open Access under Creative Commons licenses and all articles are free to access at JDR official website.
https://www.fujipress.jp/jdr/dr-about/#https://creativecommons.org/licenses/by-nd
