Reconstructing the pristine flow of highly developed rivers: a case study on the Chao Phraya River

Understanding the extent to which human activities affect river flow is fundamental to effective water resources management. In past decades, various methods have been proposed to estimate naturalized flow (i.e. the expected flow if the basin were unaffected by human activities). However, naturalized flow estimation still has drawbacks, particularly in highly regulated basins with incomplete hydrological observations. This study proposes a method for developing daily naturalized flow at the key station of the Chao Phraya River Basin, the most highly regulated basin in Thailand. The naturalized flow is estimated by applying the Naturalization with Coarse and Fine Components (NCFC) method to represent river flow conditions unaffected by human disturbance. The estimate integrates five components: (1) observed river flow at the key hydrological station; (2) changes in major reservoir storage; (3) water withdrawal along the river; (4) travel time from major reservoirs to the station; and (5) Savitzky-Golay filtering with a three-day window.


INTRODUCTION
The effects of climate variability and human activities on the hydrological cycle have been explored by numerous researchers in previous decades, enhancing our understanding of the water cycle and water resources management capability. Due to human modifications, observed river flows at gauging stations no longer reflect the purely natural hydrological response of the basins. Therefore, uncertainty in the observed flow data is a potential source of disinformation (Beven, 2016). To calibrate and validate a hydrological model, an appropriate naturalization process is required for the assessment of natural hydrological variations.
Correspondence to: Adisorn Champathong, Royal Irrigation Department, 811 Samsen Road, Bangkok 10300, Thailand. E-mail: adisorn_eng@hotmail.com
Naturalized flow (hereafter Q NAT ) is defined as the flow estimated at stream gauging stations with the effects of human activities removed. It is very useful for practical applications, particularly in the modeling of river basin management (Wurbs, 2006). To infer pristine hydrological conditions, the effect of anthropogenic activities must be removed (Naik and Jay, 2005; Rajagopalan et al., 2009; Döll et al., 2009). The Q NAT is fundamental to various simulation-based studies, e.g. climate change (Rajagopalan et al., 2009; Döll and Zhang, 2010; Haddeland et al., 2014), water allocation (Wurbs, 2006; Wurbs and Hoffpauir, 2017), flood simulation (Mateo et al., 2017), runoff changes (Xiao et al., 2018), land surface models (Quintana-Seguí et al., 2019), or even the global assessment of river flow regimes (Döll et al., 2009), since many current hydrological models exclude processes involving human activities (e.g. irrigation water withdrawal and dam operation). Although Q NAT is widely estimated, only a few studies describe the naturalization process in enough detail for their applications to be reproduced. Such studies include comprehensive guidelines for practical water management by Wurbs (2006) and Wurbs and Hoffpauir (2017). Nazemi et al. (2017) found that naturalized streamflow data have limitations regarding representation of the regulatory effects caused by major upstream dams and other anthropogenic interventions because of uncertainties in both reservoir storage and water use data. Additionally, Tongal et al. (2017) found that a Q NAT series based on no-regulation, no-irrigation (NRNI) estimation, which reverses the effects of irrigation depletion and flow regulation under a mass balance approach, was unable to completely remove the streamflow-homogenization effect of the dam.
Such NRNI data are often used to calibrate hydrological models for representing the river flow unaffected by human activity in different regional projects. Since the NRNI estimation could contain uncertainties due to seasonality and cyclic events, these researchers suggested that the Q NAT process be further investigated.
The Chao Phraya River Basin (CPB) is highly regulated, as described in Text S1. Amongst numerous studies of the CPB, some estimated the Q NAT for use in hydrological model validation (Champathong et al., 2013; Hanasaki et al., 2014; Mateo et al., 2014). Since Q NAT is widely regarded as an auxiliary data process, no systematic discussion or comparison of it has been reported.
Accordingly, this study aims to establish the best method for Q NAT reproduction amongst those published for the CPB. This paper presents a method for daily Q NAT development, called Naturalization with Coarse-Fine Components (NCFC), to represent the historical flow of the river basin. The NCFC is composed of the recorded river flows associated with human activities. The coarse portion is based on inflow and outflow components at each river reach, including travel time from reservoir release and irrigation water withdrawal to the target gauging station. Fine adjustment is applied using two filtering methods that reduce noise in the hydrological data without removing a significant part of the original signal, as described in Text S2. A Flow Duration Curve (FDC) is then used to compare the various Q NAT results with pre-dam flow.

Naturalized flow method for a highly regulated basin
Estimation of Wurbs' naturalized flow
Wurbs (2006) defined the Q NAT for a particular month using historical records at a gauging station as follows.
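The displayed expression is not reproduced in this text. A form consistent with the term definitions that follow and with the mass-balance approach of Wurbs (2006) is sketched below; the signs are an assumption and should be checked against that reference.

```latex
NF = GF + \sum_{i} D_i - \sum_{i} RF_i + \sum_{i} EP_i + \sum_{i} \Delta S_i \tag{1}
```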
where NF represents naturalized flow; GF is gauged flow; D i is the water supply diversion at location i upstream of the gauge; RF i is return flow into the river system at location i upstream of the gauge; EP i is net evaporation from a reservoir upstream of a gauging station; and ΔS i is storage change in upstream reservoirs.
For application in the study area, Q NAT can be expressed as the following equation.
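A form consistent with the terms defined for this equation is given below as an assumption, with storage change and diversion both added back to the observed flow:

```latex
Q_{NAT} = Q_{OBS} + \sum_{i=1}^{n} \Delta S_{DAM,i} + Q_{DIV} \tag{2}
```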
where Q NAT is the naturalized flow at a gauging station, Q OBS is the observed flow at a gauging station, ΔS DAM i is the storage change in the upstream reservoir i, n is the total number of dams in upstream sub-basins, and Q DIV is the total water diverted from upstream river reaches.
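The storage change of each reservoir follows from its own water balance. A form consistent with the terms defined next is sketched below; it assumes that inflow and pumped-back water add to storage while release, spill, and evaporation deplete it.

```latex
\Delta S_{DAM} = Q_{Inflow} + Q_{Pumped} - Q_{Release} - Q_{Spill} - Q_{Evap} \tag{3}
```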
where Q Inflow is inflow into a reservoir, Q Pumped is discharge pumped back into a reservoir, Q Release is discharge released into downstream channels, Q Spill is discharge spilled from a reservoir through spillways, and Q Evap is evaporation from a reservoir. The arithmetic operation of inflow, outflow, water diversion, and received water is conducted to obtain a river diversion in each river reach. The channel water balance of contiguous reaches described in Payn et al. (2009) is presented in the following equation.
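A reach balance consistent with the terms defined next can be written as below. The signs are an assumption: diversion is taken as the water that enters the reach, from upstream or as gain, but does not arrive at the downstream gauge.

```latex
Q_{DIV} = Q_{U} - Q_{D} + Q_{GAIN} \tag{4}
```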
where Q DIV is the water diversion of each river reach; Q U is the flow at the uppermost gauging station below a dam; Q D is the flow at the lowermost gauging station above the station in question; and Q GAIN is the water gained in each reach.
Modification of Wurbs' naturalized flow
Equations (1)-(4) are based purely on the instantaneous water balance and neglect the travel time of flow. The resulting naturalized flow occasionally becomes negative, thereby requiring modification. The following section elaborates on the development of new methods for estimating daily Q NAT.
(1) Travel time of reservoir release
The travel time of a reservoir release to a point downstream must be included for the daily Q NAT to match the timing of the flow. Determined from historical records at streamflow gauging stations, the travel time depends on travel velocity, which varies with flow magnitude through the stream reaches. The generic form of the Q NAT equation based on the average travel time is shown as follows.
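A form consistent with the terms defined next is sketched below. It assumes that the change in discharge at reservoir i is positive when the reservoir retains water, and that it is shifted by the average travel time d_i before being added to the observed flow.

```latex
Q_{NAT}(t) = Q_{OBS}(t) + \sum_{i=1}^{n} \Delta Q_i(t - d_i) + Q_{DIV}(t) \tag{5}
```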
where Q NAT is the daily naturalized flow at a gauging station on Day t; Q OBS is the observed streamflow at the target gauging station; ΔQ i (t − d i ) is the change in discharge released from reservoir i, lagged by its d-day average travel time; n is the total number of dams in upstream sub-basins; and Q DIV is the total diverted discharge of each river reach.
(2) Integration of water withdrawal
In a river basin with inadequate data on water gained or tributary flows, as required by Equation (4), Q DIV could be replaced by the known irrigation water record in order to determine the Q NAT as follows:
where Q NAT is the daily naturalized flow at a gauging station on Day t; Q i is the change in discharge released from reservoir i, lagged by its d-day average travel time; n is the total number of dams in upstream sub-basins; Q j is the change in irrigation water withdrawn from the main intake structure j along river reaches from reservoirs to a target gauging station; and m is the total number of major irrigation withdrawals in upstream sub-basins. Q j is applied when irrigation water data are available but the water gained is unknown.
Application to the Chao Phraya River
Daily naturalized river flow at the C.2 gauging station (hereafter C.2 station) is estimated using Equations (1)-(6). Daily river discharge Q OBS is taken from the Royal Irrigation Department of Thailand (RID). Both the original and new methods are applied to the Chao Phraya River from 1981 to 2004, since reliable gridded daily meteorological data are available for this period (Kotsuki and Tanaka, 2013; Mateo et al., 2014). Note that the original Q NAT refers to Q NAT without water withdrawal.
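The lagged balance behind Equations (5) and (6) can be illustrated with a minimal sketch. The function name `naturalize`, the sign convention (reservoir changes are positive when water is retained), the lag values, and all numbers are assumptions made for illustration, not data or code from the study.

```python
from typing import List, Optional, Sequence


def naturalize(
    q_obs: Sequence[float],
    reservoir_changes: Sequence[Sequence[float]],
    lags_days: Sequence[int],
    withdrawals: Optional[Sequence[Sequence[float]]] = None,
) -> List[float]:
    """Add lagged reservoir changes, and optionally withdrawals, to observed flow."""
    q_nat = [float(q) for q in q_obs]
    for dq, lag in zip(reservoir_changes, lags_days):
        for t in range(len(q_nat)):
            if t - lag >= 0:  # no correction is possible before the record starts
                q_nat[t] += dq[t - lag]
    for q_j in withdrawals or []:
        for t in range(len(q_nat)):
            q_nat[t] += q_j[t]  # withdrawals are added back without a lag here
    return q_nat


# Made-up daily values in m3/s; lags of 2 and 1 days are illustrative only.
q_obs = [100, 100, 100, 100, 100]
dq_bb = [10, -20, 0, 0, 0]   # hypothetical change at reservoir 1
dq_sk = [0, 5, 0, 0, 0]      # hypothetical change at reservoir 2
irrigation = [3, 3, 3, 3, 3]  # hypothetical withdrawal at one intake

original = naturalize(q_obs, [dq_bb, dq_sk], [2, 1])
with_withdrawal = naturalize(q_obs, [dq_bb, dq_sk], [2, 1], [irrigation])
print(original)
print(with_withdrawal)
```

Calling the function without `withdrawals` corresponds to the "original" Q NAT in the sense used above, while passing the irrigation series adds the withdrawal term. A day on which a lagged negative change exceeds the observed flow would yield a negative Q NAT , which is the behavior the later filtering step addresses.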
River runoff measured at the main gauging stations located on the four major rivers is collected from the RID. Flows at reservoirs (Q Inflow , Q Release , Q Pumped , Q Spill , and Q Evap ) are taken from the Electricity Generating Authority of Thailand (EGAT). Note that only the Bhumibol and Sirikit Dams (hereafter BB and SK, respectively) are taken into account, since both dominate the other large-scale dams in the basin. As for Q U and Q D , data are taken from stations P.12 and N.12 (downstream of BB and SK, respectively) down to the C.2 station.

Major anthropogenic influences on the naturalized flow of the highly regulated basin
(A) Water withdrawals from major large-scale irrigation projects at downstream channels.
A variety of irrigation structures are situated in the CPB. As depicted in Figure 1A, from 1981 to 2004 the annual mean water withdrawals from large-scale irrigation projects located near the Ping and Yom Rivers were 12.50 and 12.00 m³/s, respectively, whereas the largest average withdrawal was 34.30 m³/s from the Nan River (with a standard deviation of 12.70 m³/s).
(B) A change in river flow at the confluence (C.2 station) due to large-scale reservoir operations.
Prior to the construction of the BB and SK, the average river flow at the confluence was low in the winter (December-January), dry (February-April), and pre-monsoon (May-June) seasons (Figure 1B). The mean river flow from December to June was between 10 and 300 m³/s because there was less rainfall. In contrast, the flow was very high from July to October (wet season). Furthermore, comparing the pre- and post-dam monthly hydrographs shows that the post-dam river flow was clearly adjusted by dam operations. During the dry season, river discharge was increased to between 400 and 600 m³/s to meet rising water use in the basin. Moreover, flood peaks during the rainy season were reduced from approximately 2,800 to 1,300 m³/s.

Flow characteristics of the key station throughout the range of discharge
Flow duration curves (FDCs) were compared for the various Q NAT series computed using Equations (1)-(6). All components were then integrated with the moving average (MA) and Savitzky-Golay (SG) filtering methods at window spans of 3, 5, and 7 days: MA3d, MA5d, MA7d, SG3d, SG5d, and SG7d (red line, Figures 2A-2F). The filtering methods produce similar FDCs for high and medium flows (Figures 2G-2H), while the medium-flow range (dashed black line) can clearly be distinguished from the pre-dam flow (Figure 2H). Conversely, in the low-flow range, FDCs with different filtering methods exhibit larger differences. Although all FDCs differ from that of Q predam , the FDC of the Q NAT with SG3d lies closer to it than those of the other methods (Figure 2I). Additionally, the median flows obtained with the various filtering methods are compared with the pre-dam median flow in boxplots. The medians derived by Q NAT with SG3d and SG5d (37%) are closest to that of Q predam (Figure 2J).
The FDCs show a moderate correlation between the pre-dam and Q NAT lines at high and low flows, but not at medium flows. This could be explained by the findings of Yilmaz et al. (2008), in that the medium-flow deviation is caused by medium precipitation and the intermediate-term primary and secondary base flow relaxation response of a watershed.
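A flow duration curve of the kind compared above can be computed in a few lines. The sketch below is illustrative only: it assumes the Weibull plotting position, rank / (n + 1), for the exceedance probability, which the text does not specify, and the helper names and sample flows are hypothetical.

```python
from typing import List, Sequence, Tuple


def flow_duration_curve(flows: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (exceedance percentage, flow) pairs from highest to lowest flow."""
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    # Weibull plotting position (an assumption): P = rank / (n + 1)
    return [(100.0 * rank / (n + 1), q) for rank, q in enumerate(ranked, start=1)]


def flow_at_exceedance(fdc: Sequence[Tuple[float, float]], pct: float) -> float:
    """Linearly interpolate the flow that is exceeded pct percent of the time."""
    if pct <= fdc[0][0]:
        return fdc[0][1]
    for (p_lo, q_lo), (p_hi, q_hi) in zip(fdc, fdc[1:]):
        if p_lo <= pct <= p_hi:
            frac = (pct - p_lo) / (p_hi - p_lo)
            return q_lo + frac * (q_hi - q_lo)
    return fdc[-1][1]  # beyond the last plotted point


# Made-up daily flows in m3/s
sample = [5.0, 1.0, 3.0, 2.0, 4.0]
fdc = flow_duration_curve(sample)
print(fdc)
print(flow_at_exceedance(fdc, 50.0))
```

Computing one curve for the pre-dam record and one for each filtered Q NAT series, then reading both at the same exceedance percentages, gives a like-for-like comparison across the high-, medium-, and low-flow ranges.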

Negative residuals of naturalized flow after filtering
A comparison of the negative values of Q NAT obtained with different travel times is shown in Figure 3A. Travel times from BB and SK to the C.2 station, between the lower and upper limits of 1 and 10 days, were applied to the original Q NAT . Applying these diverse travel times, which reflect seasonal variation, slightly reduced the share of negative values relative to the original Q NAT (21.7%). The share of negative values was similar across all travel-time sets, ranging from 20.1 to 21.8%. For example, the 10-day travel time from BB and SK to the C.2 station gave 20.8% negative values, while the average 6-day and 9-day travel times gave 20.1%. Furthermore, applying the filtering methods with different window spans removed a substantial share of the negative values of Q NAT (Figure 3B). Both the moving average and Savitzky-Golay methods produced roughly a two-fold reduction in negative values relative to the original Q NAT without filtering. The lowest negative percentage was found in the outputs of MA7d, a reduction of 13.7 percentage points (from 21.7% to 8.0%), while SG3d gave a reduction of 11.6 percentage points (from 21.7% to 10.1%).
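The effect of smoothing on the share of negative values can be illustrated with a minimal sketch. Several choices here are assumptions rather than details from the study: the Savitzky-Golay weights are the standard quadratic/cubic smoothing coefficients for five- and seven-point windows, the three-day window is left out because its weights depend on a polynomial order that the text does not state, edges are handled by repeating the end values, and the series is made up.

```python
from typing import List, Sequence

# Standard quadratic/cubic Savitzky-Golay smoothing weights (assumed order).
SG_WEIGHTS = {
    5: [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35],
    7: [-2 / 21, 3 / 21, 6 / 21, 7 / 21, 6 / 21, 3 / 21, -2 / 21],
}


def _apply_weights(series: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Centered weighted sum; indices past either end reuse the end value."""
    half = len(weights) // 2
    last = len(series) - 1
    out = []
    for t in range(len(series)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = min(max(t + k - half, 0), last)
            acc += w * series[idx]
        out.append(acc)
    return out


def moving_average(series: Sequence[float], window: int) -> List[float]:
    if window % 2 == 0:
        raise ValueError("window must be odd for a centered filter")
    return _apply_weights(series, [1.0 / window] * window)


def savitzky_golay(series: Sequence[float], window: int) -> List[float]:
    return _apply_weights(series, SG_WEIGHTS[window])


def negative_share(series: Sequence[float]) -> float:
    """Percentage of values below zero."""
    return 100.0 * sum(1 for q in series if q < 0) / len(series)


# Made-up daily Q NAT values in m3/s with three negative days
raw = [4.0, -2.0, 6.0, -1.0, 5.0, 3.0, -3.0, 7.0, 2.0, 4.0]
print(negative_share(raw))
print(negative_share(moving_average(raw, 3)))
print(negative_share(savitzky_golay(raw, 5)))
```

A moving average flattens peaks as well as troughs, whereas the polynomial fit behind the Savitzky-Golay weights reproduces locally quadratic shapes exactly in the interior of the series. This is one way to read the trade-off described above between removing negative values and preserving the hydrograph shape.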

Monthly and overall patterns of naturalized flow after filtering
The relationship between the original Q NAT and that with the addition of filtering is illustrated in Figures 4A-4F. The results show that the highest and lowest correlations were obtained with SG using a three-day moving window and MA using a seven-day moving window, respectively (Figures 4B, 4E). The negative residuals of monthly river flow after the filtering application are depicted in Figures 4G-4L. Each boxplot shows the 25, 50, and 75% quantiles. Shorter moving windows generally produced narrower margins than longer ones. For example, MA3d had an interquartile range (IQR) of −2.5 to 2.5% (Figure 4G), while MA7d had the maximum IQR of −5.5 to 10% (Figure 4K). Similar to MA3d, SG7d had an IQR of −2.5 to 2.5% (Figure 4L). In contrast, SG3d had the smallest margin (close to 0%) (Figure 4H).
Figure 5 shows a comparison of the daily hydrographs at C.2 station. The mean observed daily discharge during January-May from 1981 to 2004 (black line) was higher than that of the three processed Q NAT series. Additionally, the observed peak discharge was reduced, particularly during September and October, compared to the Q NAT series. Furthermore, the Q NAT hydrograph excluding reservoir evaporation and water withdrawals still disclosed negative values during the dry period (green line). In contrast, negative flow disappeared when integrating the main losses (e.g. reservoir evaporation and irrigation [...] (6), disclosed no further negative values.

Uncertainties
Three uncertainties were identified in the Q NAT estimations. The first relates to the unsatisfactory monitoring of upstream water withdrawal in the Chao Phraya River Basin. The factors affecting the deviation between the Q NAT and pre-dam lines in the medium-flow range were explored (Figure 2). Since the Q NAT output is derived from the inflow, outflow, and water diversion components of the water balance at each river reach, the deviation in medium flow is considered to be significantly affected by incomplete water diversion data along the river and its tributaries. In Equation (4), incomplete water gain or tributary flow data cause a decline in the estimated amount of water diversion. As a result, the Q NAT also decreases, as indicated in Equation (5). The second uncertainty is the effect of land use and land cover change, especially the growth of agricultural land; as with the exclusion of water withdrawal, this effect could be reflected in the Q NAT results. The third uncertainty relates to the shortage of river gauging networks. Therefore, it is recommended that hydrological networks be enhanced by installing key gauging stations following World Meteorological Organization guidelines, particularly in highly regulated basins. Finally, the effects of reservoirs other than BB and SK have been neglected. Other large, medium, and small reservoirs, such as the Kewlom and Mae Song Dams (in the Wang and Yom River Basins), should be included in future research to increase the accuracy of Q NAT development.

Effects of water withdrawal and travel time
This study demonstrates that the insufficient recording of water use (Q DIV ) along natural and artificial channels, particularly unmeasurable water withdrawal from rain-fed areas or ungauged sub-basins, could induce negative values for Q NAT (according to Equation (5)). More specifically, the estimated Q NAT excluding evaporation and irrigation water withdrawal, as described in Champathong et al. (2013), Mateo et al. (2014), and Hanasaki et al. (2014), shows negative residuals in low-flow stages, as depicted in Figure 5. Consequently, the presence of negative values was mainly influenced by the exclusion of water withdrawal. In contrast, travel time had little effect on the negative residuals (Figure 3).
Effects of filtering
Overall, the MA7d recommended by Fleig et al. (2006), WMO (2008), Stahl et al. (2010), and Forzieri et al. (2014) produced the largest difference between pre-dam flow and Q NAT for the basin. By contrast, SG filtering with a seven-day moving window (SG7d) performed better than the MA7d. The SG7d is superior to the MA7d because (1) it discloses low-flow behavior, especially the reduced fluctuation caused by reservoir releases during the dry season, and (2) it maintains the magnitude of high flow. Maintaining the shape of a gauging station hydrograph using an appropriate filtering method is essential for further hydrological analysis. The SG3d has certain advantages over other filtering methods in terms of flow characteristics, negative residuals, as well as overall, monthly, and daily patterns. Firstly, the medians of river flow derived by the SG3d and SG5d methods (37%) are closest to that of Q predam (Figure 2J). Secondly, the filtering process of both the moving average and Savitzky-Golay methods provided a two-fold reduction in negative residuals compared to the original Q NAT without filtering. Lastly, Q NAT with SG3d was the best fit for Q predam .
Consequently, with its better performance, SG3d could be utilized for the further development of Q NAT .

Future research on naturalized flow in hydrological analysis
Since water withdrawal data are crucial to the accurate estimation of Q NAT , observed hydrological data need to be comprehensively collected. The enhancement of hydrological measurement networks should be promoted to monitor water withdrawal, particularly in highly regulated basins of developing countries. Furthermore, Tongal et al. (2017) recommended that the existing Q NAT method be reviewed due to the various drawbacks involved. In this respect, the NCFC method is a potential guideline for generating a simple Q NAT that accounts for human influence in highly regulated basins. Since NCFC is a re-arranged method built on water balance components, flow duration curves, and smoothing filters, users can reproduce Q NAT to match their own application. The Q NAT plays a vital role in validating and calibrating hydrological models, especially in highly regulated basins. With the rising trend in water demand, increasing water withdrawal will have a greater impact on Q NAT . At the same time, higher evaporation rates under global warming could greatly affect river flow. Integrating Q NAT into hydrological model simulation is, therefore, increasingly important in order to obtain more accurate model results and explore appropriate adaptation measures for water resources management.

CONCLUSIONS
In the era of widespread anthropogenic impacts on natural processes, it no longer makes sense to study only the natural hydrological cycle (Oki and Kanae, 2006). Fluctuations in the dynamic water cycle, induced by human intervention and climate change, have recently been subjected to further exploration to comprehensively understand the mechanisms and expected impact on water resources in the future (Döll and Zhang, 2010; Haddeland et al., 2014; Bennett et al., 2018). This study presents the development of a method for estimating daily naturalized flow and its application at the C.2 station in the Chao Phraya River Basin of Thailand.
The key findings applicable to the Chao Phraya River in Thailand are as follows. Firstly, insufficient data on water withdrawal from the river and its tributaries are the primary uncertainty when estimating Q NAT , particularly during the planting period in the rain-fed area. Secondly, consideration of the travel time from major interventions (the two large dams in this study) improves naturalized flow estimation. Diverse travel times, reflecting seasonal variation, slightly reduce the share of negative values. Thirdly, both MA and SG methods are capable of producing a two-fold reduction in negative values. The lowest negative percentage is found in the MA7d outputs, a reduction of 13.7 percentage points, while SG3d gives a reduction of 11.6 percentage points. Nevertheless, overall, SG3d is the most appropriate filtering method in terms of flow characteristics, negative residuals, as well as overall, monthly, and daily patterns. However, even with all the measures proposed in this study, the estimated natural flow still exhibits occasional negative values. Integrating further anthropogenic terms, particularly losses (e.g. reservoir evaporation), water withdrawal from rivers, and appropriate travel times, is crucial to better naturalize observed river flow at the target hydrological gauging station. Due to its fundamental importance in hydrological modeling, greater attention should be paid to Q NAT for more accurate simulation results.
Figure 5. Comparison of daily hydrographs plotted from different hydrological components in the drought year of 1998 at C.2 station. These components comprise: (1) observed discharge; (2) the original Q NAT excluding reservoir evaporation and irrigation water withdrawal; (3) the original Q NAT including reservoir evaporation and irrigation water withdrawal; and (4) the original Q NAT including reservoir evaporation, irrigation water withdrawal, travel times, and SG3d. The final one, integrated with the complete components, shows no negative residual during the drought period and a high discharge peak during the rainy season (red line).
Legend: Q obs; Q NAT (without evaporation and irrigation); Q NAT + evaporation + irrigation; Q NAT + evaporation + irrigation + travel time + SG3d.