2024 Volume 5 Issue 3 Pages 410-417
Deep reinforcement learning is increasingly used to optimize dam operations on the basis of meteorological conditions and various dam-related quantities. In practice, however, decisions on dam discharge also take into account circumstances beyond meteorological conditions and dam quantities, such as stakeholders in the dam basin and CCTV camera images. The values underlying these discharge decisions are difficult to model as reward functions in deep reinforcement learning. Recently, large language models (LLMs) have used Reinforcement Learning from Human Feedback (RLHF) to carry out deep reinforcement learning based on human values, achieving more accurate responses. In this study, we applied RLHF to a deep reinforcement learning model of dam discharge operation and constructed a dam discharge operation model that incorporates human values.