2024 Volume 80 Issue 16 Article ID: 23-16196
In this paper, we propose an inference model for water level prediction using the HMLasso (least absolute shrinkage and selection operator with high missing rate) algorithm. HMLasso allows models to be learned directly from data sets that contain missing values. River data can be missing for several reasons, including the closure of telemeters, their installation, and observation errors. We compared a conventional method with the HMLasso model on the Mogami River during a flooding event in August 2022. To support this comparison, we artificially increased the missing data rate up to a maximum of 50% and performed multiple analyses. When the model was trained on the actual observed values, the Nash-Sutcliffe coefficient was 0.876. Even at a 50% missing data rate, the coefficient decreased only marginally, to 0.842. These results indicate that the performance of the HMLasso model degrades relatively little as the missing data rate increases.
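The evaluation described above can be illustrated with a minimal Python sketch. It shows the standard definition of the Nash-Sutcliffe coefficient and one possible way to raise the missing data rate artificially. The function names `nash_sutcliffe` and `mask_at_random`, and the choice to remove values uniformly at random, are assumptions made for illustration only; the abstract does not state how the missing values were generated, and this is not the HMLasso algorithm itself.

```python
import numpy as np


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe coefficient: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    residual = np.sum((observed - simulated) ** 2)
    variance = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - residual / variance


def mask_at_random(values, missing_rate, seed=0):
    """Return a copy of `values` with a fraction `missing_rate` set to NaN.

    Assumption: entries are removed uniformly at random, which may differ
    from the procedure used in the paper.
    """
    rng = np.random.default_rng(seed)
    masked = np.array(values, dtype=float)  # np.array copies by default
    n_missing = int(round(missing_rate * masked.size))
    idx = rng.choice(masked.size, size=n_missing, replace=False)
    masked.flat[idx] = np.nan
    return masked
```

A coefficient of 1 corresponds to a perfect prediction, and 0 corresponds to a model that is no better than predicting the mean of the observations. Under these assumptions, `mask_at_random(data, 0.5)` would produce an input with the 50% missing rate mentioned above.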