Identifying dominant runoff mechanisms and their lumped modeling: a data-based modeling approach

We developed a data-based methodology for identifying the dominant runoff mechanisms of a watershed and modeling them in a lumped form, using precipitation and runoff data, which would contribute to reducing uncertainties in both the model structure and the model parameters. First, we separated a hydrograph into several runoff components by a recession analysis of runoff data and a filter separation method. Second, we estimated storage as a function of runoff for each component. Finally, we constructed a single Tank model for each component, where both the runoff component and the estimated storage were used as constraints in identifying the coefficients of runoff and infiltration. By applying this approach, we found that (1) the constructed Tank model perfectly traced the runoff components separated by the filter separation method, (2) there are almost no uncertainties in the model structure and parameters if the result of the filter separation can be assumed to be reliable, and (3) effective rainfall can also be estimated with our approach. These results imply that our methodology allows identifying and modeling dominant rainfall-storage-runoff mechanisms with minimal uncertainties in model structure and parameters, using hourly precipitation and runoff data alone.


INTRODUCTION
Runoff models always suffer from uncertainties in structure, parameters and simulated runoff, regardless of whether they are lumped or distributed. Is it possible to construct a runoff model based on the dominant runoff processes (Grayson and Blöschl, 2000; Sivakumar, 2004) of a watershed within the framework of a data-based modeling approach (Young and Beven, 1994)? At present, such a methodology has not been developed, hence the problem of uncertainties remains unsolved. This is one of the most critical problems in current runoff modeling studies.
How can we realize modeling based on dominant runoff processes identified from observed data? We think that the runoff components separated by the filter separation method developed by Hino and Hasebe (1984) can be regarded as data-based dominant runoff processes, because the method creates a numerical filter whose recession time constant is identified directly from the recession curve of an hourly hydrograph. By repeating the separation, this method can divide a hydrograph into multiple components. Recently, Kobayashi and Yokoo (2013) suggested a methodology to identify dominant runoff processes and their corresponding watershed-scale storage by integrating the filter separation method of Hino and Hasebe (1984) with the data-based watershed-scale storage estimation method of Kirchner (2009). Later, Yokoo et al. (2014) applied the methodology to interpret the relationship between slope failure and watershed-scale storage in Thailand. Chiba and Yokoo (2015) then improved the methodology by linearizing the relationship between runoff and storage, for theoretical consistency in integrating the method of Kirchner (2009) with the filter separation method of Hino and Hasebe (1984). If we can assume that the methodology of Chiba and Yokoo (2015) is applicable to any watershed, the remaining problem is the construction of the runoff model.
Correspondence to: Yoshiyuki Yokoo, Faculty of Symbiotic Systems Science, Fukushima University, 1 Kanayagawa, Fukushima-city, Fukushima 960-1296, Japan. E-mail: yokoo@sss.fukushima-u.ac.jp *Present address: Delft University of Technology, The Netherlands.
To solve the above problems, the present study aims to develop and demonstrate a methodology that first identifies the dominant runoff mechanisms within a watershed from observed data using the method of Chiba and Yokoo (2015), and then constructs a runoff model reflecting those mechanisms without calibration of model structure or parameters. Here, we employed the Tank model of Sugawara (1995) because the equations used in Chiba and Yokoo (2015) resemble those of the Tank model. More details are explained in the next section.

Study area and data
We conducted this study in the catchment area of the Kamimasaki water-level monitoring station along the Sendai River, located at Ebino city in Miyazaki prefecture, Japan, as shown in Figure 1. The details of the watershed are summarized in Table SI. The precipitation data used in this study were measured at the Hachigamine monitoring station plotted in Figure 1.

Hydrograph separation
As demonstrated by Kobayashi and Yokoo (2013), we separated a hydrograph into several components using the filter separation method of Hino and Hasebe (1984). The method first estimates a characteristic recession time constant by fitting an exponential decay function to the slowest recession part of the runoff data plotted on a semilogarithmic axis, where the recession curve appears as a straight line segment. The recession time constant of a component, T_c,i (with unique component number i), is identified as the inverse of the decay factor of the exponential decay function. We then separated the hydrograph into faster and slower components with Equations (1) and (2), where q_i(t), q(t), α_i, ω_i(τ), τ and τ_max,i (= 5T_c,i) are, respectively, the separated slower runoff component, the runoff before separation, a parameter, the numerical filter defined in Equation (2), the time axis and the effective duration of the numerical filter. Note that i is a unique number of a runoff component, defined in this study as log_5 T_c,i rounded to the nearest whole number. The parameters c_0,i and c_1,i are defined as δ^2/T_c,i^2 and δ^2/T_c,i, respectively, where δ is a constant greater than 2.0; we employed 2.5 in the present study. By repeating these procedures on the residual faster component, we successively separated the hydrograph of the studied watershed into several components. As in Chiba and Yokoo (2015), we chose each T_c,i so that the value of log_5 T_c,i, rounded to the nearest integer, was lower than that of the preceding component, as in Table SII. This helped to minimize subjectivity in the identification of T_c,i.
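The separation procedure described above can be sketched in Python as follows. Equations (1) and (2) are not reproduced in this text, so the sketch assumes the overdamped second-order form commonly given for the Hino and Hasebe (1984) filter, which is consistent with the definitions of c_0,i and c_1,i and with the requirement δ > 2 stated above. The function names, the unit time step and the default α = 1 are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def recession_time_constant(t, q):
    """Fit ln(q) = ln(q0) - t / Tc to a recession segment and return Tc."""
    slope, _ = np.polyfit(t, np.log(q), 1)
    return -1.0 / slope


def component_number(tc):
    """Component number i: log base 5 of Tc, rounded to the nearest integer."""
    return int(round(np.log(tc) / np.log(5.0)))


def filter_weights(tc, delta=2.5, dt=1.0):
    """Discrete weights of the assumed filter over 0 <= tau <= 5 * Tc."""
    c0 = (delta / tc) ** 2                 # delta^2 / Tc^2
    c1 = delta ** 2 / tc                   # delta^2 / Tc
    root = np.sqrt(c1 ** 2 / 4.0 - c0)     # real because delta > 2
    tau = np.arange(0.0, 5.0 * tc + dt, dt)
    return (c0 / root) * np.exp(-0.5 * c1 * tau) * np.sinh(root * tau) * dt


def separate(q, tc, alpha=1.0, delta=2.5, dt=1.0):
    """Split q into a slower (filtered) component and the faster residual."""
    w = filter_weights(tc, delta, dt)
    slow = alpha * np.convolve(q, w)[: len(q)]
    return slow, q - slow
```

Applying `separate` repeatedly to the faster residual, each time with a smaller recession time constant, mirrors the successive separation described in the text. How α_i is chosen is left to the caller here because the text only describes it as a parameter.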

Storage estimation
To estimate watershed-scale storage, we employed the method suggested by Chiba and Yokoo (2015), which is an improved version of the method of Kobayashi and Yokoo (2013), who combined the hydrograph separation method of Hino and Hasebe (1984) with the watershed-scale storage estimation method of Kirchner (2009). The details of the method are described in the original literature; here we explain them briefly.
As suggested by Chiba and Yokoo (2015), who modified the method of Kirchner (2009), we first determined the value of a_i in Equation (3) for the ith runoff component by fitting a linear regression line, as in Figure S1(a-e).
Kirchner (2009) originally selected data from rainless night-time periods to minimize the effects of evapotranspiration and precipitation in the water balance equation. For the slower components, we explored the relationships in Equation (3) without these conditions, because we assumed that those relationships are less affected by evapotranspiration and precipitation. Hence we applied Kirchner's (2009) original conditions only to the fastest component. Second, we calculated the storage (s_i^n - s_0,i) as a function of the runoff q_i^n by Equation (4), where s_0,i is the storage at which q_i^n becomes 0; the actual magnitude of s_0,i remains unknown at this stage. The relationship between q_i^n and (s_i^n - s_0,i) in Equation (4) was explored for all the runoff components separated by numerical filtering, as in Figure S1(f).
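The two steps of the storage estimation can be sketched as follows. Equations (3) and (4) are not reproduced in this text; the sketch assumes the linearized recession relation -dq_i/dt = a_i q_i for Equation (3) and the linear storage-discharge relation q_i = a_i (s_i - s_0,i) for Equation (4), the latter being consistent with the later statement that runoff is the product of a_i and (s_i - s_0,i). Fitting the line through the origin, using step-averaged runoff and keeping only receding time steps are assumptions of this sketch, as are the function names.

```python
import numpy as np


def estimate_runoff_coefficient(q, dt=1.0):
    """Least-squares slope a (through the origin) of -dq/dt against q."""
    q = np.asarray(q, dtype=float)
    dqdt = -(q[1:] - q[:-1]) / dt        # recession rate between steps
    q_mid = 0.5 * (q[1:] + q[:-1])       # runoff averaged over the same step
    mask = dqdt > 0.0                    # assumption: receding steps only
    return np.sum(q_mid[mask] * dqdt[mask]) / np.sum(q_mid[mask] ** 2)


def storage_anomaly(q, a):
    """Storage relative to the unknown offset: s - s0 = q / a."""
    return np.asarray(q, dtype=float) / a
```

Because s_0,i is unknown at this stage, `storage_anomaly` returns only the difference (s_i - s_0,i), matching the quantity discussed in the text.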

Construction of Tank model
In constructing the Tank model suggested by Sugawara (1995), shown in Figure 2(a), we must specify the number of vertically serial tanks and the parameter values. In the present study, we assumed that the total number of vertically serial tanks (i = i_max) is equal to the number of runoff components separated by the numerical filtering, with their unique tank numbers i as explained above. The parameter values are identified from the time series of the runoff components and their corresponding storage, using a method newly developed in the present study.
This method first identifies the runoff coefficient of the bottom tank (i = i_max), which generates the slowest runoff component, by taking a_i from Equation (3). Here i is as defined above and denotes the order of the tank counted from the top tank (i = 1). Because we assume there is no infiltration from the bottom tank, we can estimate the infiltration from the (i_max - 1)th tank to the bottom tank, p_(i_max-1)^n, by calculating the water balance of the bottom tank as Equation (5), where n denotes the time step. For the middle tanks between the top and bottom tanks, we can estimate the infiltration from the upper tank, p_(i-1)^n, by calculating the water balance as Equation (6) successively from the lower tanks to the upper tanks.
Similarly, we can estimate the effective precipitation pe^(n+1) of time step (n + 1) from the water balance of the top tank as Equation (7).
Note that the infiltrations p_(i_max-1)^n in Equation (5) and p_(i-1)^n in Equation (6) can be negative, indicating return flows from slower components to faster components. The effective rainfall pe^(n+1) calculated by Equation (7) can also be negative, which can be interpreted as instantaneous evapotranspiration from the watershed. These negative fluxes are a difference between our method and the original Tank model of Sugawara (1995).
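The bottom-up water balance calculation can be sketched as below. Equations (5) to (7) are not reproduced in this text, so the sketch assumes an explicit balance for each tank, s_i^(n+1) - s_i^n = (inflow^n - q_i^n - p_i^n) Δt, with storage taken from the assumed linear relation s_i - s_0,i = q_i / a_i. The time indexing of pe in the authors' Equation (7) may differ from this assumption, and the function name and array layout are hypothetical.

```python
import numpy as np


def back_calculate_fluxes(q, a, dt=1.0):
    """Infer inter-tank infiltration and effective precipitation.

    q : array of shape (n_tanks, n_steps); row 0 is the top (fastest) tank.
    a : runoff coefficients, one per tank.
    Returns (p, pe): p[i] is the infiltration leaving tank i (zero for the
    bottom tank) and pe is the inflow to the top tank. Both may be negative.
    """
    q = np.asarray(q, dtype=float)
    a = np.asarray(a, dtype=float)
    storage = q / a[:, None]                  # s - s0 for each tank
    ds = np.diff(storage, axis=1) / dt        # storage change per step
    n_tanks, n_steps = q.shape
    p = np.zeros((n_tanks, n_steps - 1))      # bottom tank: no infiltration
    for i in range(n_tanks - 1, -1, -1):      # from the bottom tank upwards
        inflow = ds[i] + q[i, :-1] + p[i]     # water balance of tank i
        if i > 0:
            p[i - 1] = inflow                 # supplied by the tank above
    return p, inflow                          # inflow to tank 0 is pe
```

No clipping is applied, so negative values of `p` and `pe` are preserved, in keeping with the interpretation of return flow and evapotranspiration given in the text.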
Second, we identified the magnitudes of the infiltration coefficients b_i and the initial storage s_0,i simultaneously by fitting linear regression lines, as Equation (8), to the scatter diagrams of (s_i^n - s_0,i) against p_i^n for all the runoff components except the slowest component, which has no infiltration, as in Figure S2.
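A minimal sketch of this regression step follows. Equation (8) is not reproduced in this text; the sketch assumes p_i^n = b_i s_i^n = b_i [(s_i^n - s_0,i) + s_0,i], which agrees with the later statement that infiltration is calculated as s_i·b_i. Under that assumption the slope of the fitted line is b_i and the intercept is b_i s_0,i. The function name is hypothetical.

```python
import numpy as np


def infiltration_parameters(storage_anomaly, p):
    """Regress p on (s - s0) and return (b, s0) from slope and intercept."""
    slope, intercept = np.polyfit(storage_anomaly, p, 1)
    return slope, intercept / slope
```

A negative intercept therefore yields a negative s_0,i, the case discussed for the runoff and infiltration hole positions in Figure 2(b).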
Equations (4) to (8) make it possible to identify all the Tank model coefficients, together with the estimated infiltration (or return flow) between tanks and the watershed-scale effective precipitation (or evapotranspiration) pe that is input to the top tank through Equation (7). The difference between the watershed-scale precipitation p and pe can also be regarded as evapotranspiration and outflow to the outside of the watershed, which requires rigorous discussion in a separate study.

Figure 3 shows the hydrographs separated by the numerical filter of Hino and Hasebe (1984). The recession time constants, T_c, for each component were estimated using the criteria set by Chiba and Yokoo (2015), as in Table SII. The estimated runoff components show different recession rates, indicating that they can be regarded as dominant runoff components from the viewpoint of numerical filtering.

Figure 4 shows the changes in the estimated storages (s_i^n - s_0,i) for all the runoff components, based on the results in Figure S1(a-f). Unlike the result of the hydrograph separation in Figure 3, storage in the slowest component dominated among all the storage components. These results show lower storage capacities in faster components and higher capacities in slower components, which is often found in bucket-type models such as the Tank model of Sugawara (1995) and is similar to previous studies (e.g. Kobayashi and Yokoo, 2013; Yokoo et al., 2014; Chiba and Yokoo, 2015).

Figure S2 shows the relationships between infiltration p_i^n and storage (s_i^n - s_0,i) for each tank, which were used to estimate the magnitudes of the infiltration coefficients b_i. The scatter plots were widely spread around the linear regression lines estimated by the least-squares method, which originates from the differences in calculation methods between the original Tank model of Sugawara (1995) and this methodology. Although we assumed linear relationships between p_i^n and (s_i^n - s_0,i), these results suggest that further exploration of the relationships is required.

Construction of the Tank model
The Tank model parameters identified by our methodology are summarized in Table SII. The negative value of s_0,4 in Table SII indicates that the runoff hole is located lower than the infiltration hole, hence the infiltration hole protrudes into the tank as in Figure 2(b). This is because our storage-discharge relationship is modeled by Equation (4). If s_0,i is positive and less than s_i, as in Figure 2(a), runoff is calculated as the product of the runoff coefficient a_i and the storage (s_i - s_0,i), whereas infiltration is calculated as s_i·b_i. If s_0,i is negative, as in Figure 2(b), runoff is again the product of a_i and (s_i - s_0,i) and stops when the tank becomes empty, whereas infiltration, calculated as s_i·b_i, stops when the storage height above the infiltration hole, s_i, becomes zero.
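The two hole configurations can be illustrated with a small sketch of the fluxes leaving one tank. It assumes that s is measured from the infiltration hole, that runoff is a_i (s_i - s_0,i) while the water level is above the runoff hole, and that infiltration is b_i s_i while the level is above the infiltration hole. Return flows (negative infiltration) are not represented here, and the function name is hypothetical.

```python
def tank_fluxes(s, a, b, s0):
    """Runoff and infiltration for storage height s above the infiltration hole.

    s0 > 0: the runoff hole sits above the infiltration hole (Figure 2(a)).
    s0 < 0: the runoff hole sits below the infiltration hole (Figure 2(b)).
    """
    runoff = a * max(s - s0, 0.0)      # stops once the level reaches the runoff hole
    infiltration = b * max(s, 0.0)     # stops once the level reaches the infiltration hole
    return runoff, infiltration
```

With a negative s0, the tank can keep producing runoff after infiltration has stopped, which is the behavior described for the tank with negative s_0,4.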
How does the Tank model work?
Figure 5 shows the time series data of the Tank model identified by the method explained in the previous section. Panel (a) illustrates the observed precipitation p measured at the Hachigamine monitoring station together with the effective precipitation pe estimated by our method. We can see that the effective precipitation pe is less than the observed precipitation p, and that pe becomes even more negative in the rainy season.
One potential reason for pe being less than p is evapotranspiration, provided that p, measured at the Hachigamine monitoring station in Figure 1, accurately represents the watershed-scale precipitation. The negative pe after rainfall events in the rainy season may indicate intensive watershed-scale evapotranspiration after those events. However, the reasons for pe being less than p and for negative pe must be carefully investigated in a well-gauged watershed as a separate study.

Panels (b) to (f) show the temporal changes of storage (s_i - s_0,i), runoff (q_i) and infiltration (p_i). Notice that the temporal changes of storage, runoff and infiltration become smoother as the tank descends to a lower position in the model, which is similar to the Tank model of Sugawara (1995). However, we can also see that the infiltration peaks clearly occur earlier than the storage peaks, especially in panels (d) and (e), whereas the runoff peaks coincide with the storage peaks in the same panels because storage is directly calculated as a linear function of runoff as in Equation (4).
The earlier infiltration peaks are explained by Equations (5-6), which calculate infiltration from the water balance of each tank. The original method of Sugawara (1995) calculates runoff as a linear function of storage, whereas ours uses Equations (5-6). In other words, the infiltration terms calculated in Equations (5-6) serve to connect the different calculation methods of the original Sugawara (1995) Tank model and our method based on Chiba and Yokoo (2015), which estimated multiple storage-discharge relationships independently of each other. Therefore, our method is comparable with Sugawara's Tank model; however, we should recognize that our Tank model is not the same as the Tank model of Sugawara (1995). Figure 5 is simply a summary of inputs, state variables and outputs and contains nothing new in itself, but we emphasize that all of these were calculated only from the observed time series of precipitation and runoff, without any complex calibration process or a priori assumptions about model structure. The benefits of our methodology and some ways forward are discussed in the following section.

CONCLUDING DISCUSSION
This study developed a methodology for constructing a Tank model (Sugawara, 1995) without complex calibration or a priori assumptions about model structure. Using the watershed-scale storage estimated by the method of Chiba and Yokoo (2015), which integrated the filter separation method of Hino and Hasebe (1984) with the watershed-scale storage estimation method of Kirchner (2009), we succeeded in constructing a Tank model with only the observed precipitation and runoff data.
The outstanding benefit of this methodology is that we can construct a Tank model unique to a watershed without complex calibration processes or a priori assumptions about model structure, unlike Yokoo et al. (2001) and Yokoo and Kazama (2012). With such calibrations or assumptions, we cannot be sure of the model parameters or structure because of their uncertainties. In line with the goal of the Prediction in Ungauged Basins initiative (PUB; Sivapalan, 2003; Sivapalan et al., 2003), we could considerably decrease the dependency on calibration and increase the dependency on knowledge, although our methodology demands continuous hourly data of precipitation and runoff.
With such a Tank model, together with the estimates of dominant runoff processes and their storage-runoff relationships derived from our approach, we can estimate how precipitation is stored in and released from a watershed using only observed hourly precipitation and runoff data. As far as the lumped modeling approach is concerned, the remaining problem is how the watershed-scale precipitation is converted to the effective precipitation estimated with our approach when using a Tank model. Consideration of this conversion is beyond the scope of this study and should be the next target of this approach, in which the spatial distribution of precipitation and evapotranspiration must be investigated in depth using the effective precipitation estimated by our approach as reference data.
Another benefit is that we can compare our outputs or state variables with field studies, as demonstrated by previous studies (e.g. Beven, 2006; Birkel et al., 2011; Krakauer and Temimi, 2011; Sayama et al., 2011). The number of dominant processes, the amount of storage, the residence time of precipitation, infiltration, and runoff for each dominant process are always targets of field-based hydrology. Among them, residence time will be a clue for verifying the estimated nature of the dominant processes. For example, we estimated the recession time constants T_c in the hydrograph separations. The magnitude of 5T_c becomes an approximation of the residence time calculated as annual mean storage divided by annual mean flow, because 99.3% of precipitation becomes discharge within the period of 5T_c after a precipitation event. As discussed by Seibert and McDonnell (2002), a Tank model constructed with our methodology must be carefully tested against soft data or process knowledge from experimentalists. Conversely, variables estimated by our approach might suggest new field experiments to verify those variables. Therefore, the next task of this study will be to test our approach in intensively gauged watersheds, such as experimental watersheds, in collaboration between modelers and experimentalists.
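The 99.3% figure quoted above can be checked with a short calculation, assuming the exponential recession q(t) = q_0 exp(-t/T_c) that underlies the recession time constant. Under that assumption the fraction of stored water released within 5T_c is 1 - exp(-5).

```python
import math

# Fraction of stored water released within 5 * Tc, assuming an
# exponential recession q(t) = q0 * exp(-t / Tc).
released = 1.0 - math.exp(-5.0)
print(f"{released:.1%}")  # prints 99.3%
```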

ACKNOWLEDGMENTS
This study was supported by JSPS KAKENHI Grant Numbers JP16K06501 and JP16KK0142. We also thank the MLIT, Japan, for providing the hydrological data on the Water Information System and the geographic data.