An Attempt at a Dam Discharge Operation Model Applying Deep Reinforcement Learning from Human Feedback (Reinforcement Learning from Human Feedback)

箱石 健太; 一言 正之; 菅田 大輔; 石田 富英; 小久保 緑

doi:10.11532/jsceiii.5.3_410

Abstract

Deep reinforcement learning is increasingly used to optimize dam operations from meteorological conditions and various dam quantities. In practice, however, decisions on dam discharge also take into account other circumstances, such as stakeholders in the dam basin and CCTV camera images. The values underlying these discharge decisions are difficult to express as a reward function for deep reinforcement learning. Recently, large language models (LLMs) have used Reinforcement Learning from Human Feedback (RLHF) to carry out deep reinforcement learning based on human values, achieving more accurate responses. In this study, we applied RLHF to a deep reinforcement learning model of dam discharge operation and constructed a model that incorporates human values.
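
The abstract does not say how human feedback is turned into a reward, so the following is only a minimal sketch of one common RLHF ingredient: fitting a reward model to pairwise human preferences with a Bradley-Terry likelihood. The linear reward, the two normalized features, the preference data, and the names `reward` and `fit_reward_model` are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math


def reward(weights, features):
    """Linear reward: weighted sum of the features of one discharge plan."""
    return sum(w * x for w, x in zip(weights, features))


def fit_reward_model(preferences, n_features, lr=0.1, epochs=200):
    """Fit reward weights from (preferred, rejected) feature pairs.

    Uses stochastic gradient ascent on the Bradley-Terry log-likelihood,
    where P(preferred beats rejected) = sigmoid(r(preferred) - r(rejected)).
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in preferences:
            diff = reward(weights, preferred) - reward(weights, rejected)
            p = 1.0 / (1.0 + math.exp(-diff))
            # d/dw log(p) = (1 - p) * (preferred - rejected)
            for i in range(n_features):
                weights[i] += lr * (1.0 - p) * (preferred[i] - rejected[i])
    return weights


# Hypothetical features per plan (both normalized to [0, 1]):
#   [peak discharge, exceedance of a target reservoir level]
# Each pair records that an operator preferred the first plan to the second.
preferences = [
    ([0.2, 0.1], [0.8, 0.1]),
    ([0.3, 0.0], [0.3, 0.6]),
    ([0.4, 0.2], [0.9, 0.7]),
]

weights = fit_reward_model(preferences, n_features=2)
print(weights)
```

In a full RLHF loop, a learned reward of this kind would replace or supplement the hand-written reward function when training the discharge policy. With the made-up preferences above, both weights come out negative, meaning the fitted reward penalizes high peak discharge and level exceedance.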
