Development of XAJMISO hydrological model for rainfall-runoff analysis

A hydrological model, XAJMISO is adopted from the XinAnJiang (XAJ) model and modified by transforming the parametric routing system into a non-parametric system to reduce parameters and improve performance. The pro‐ posed model replaces routing components of the XAJ model with two linear systems: surface flow and subsurface flow, including interflow and baseflow. The discharge at the basin outlet is then calculated using response functions of surface and subsurface flow, which are derived by means of a multiple input single output (MISO) system. In con‐ trast to other MISO studies, the present study defines the finite length for calculating response function coordinates based on time scales of all runoff components. The model is applied to six river basins of different data aridity indices in the United States and compared with our modified XAJ (mXAJ) model. The results reveal that the proposed model sufficiently represents the relationship between rainfall and runoff of relatively large basins and provides better and more stable performance. The proposed model also makes calibration much easier by reducing four sensitive parame‐ ters out of seven in mXAJ.


INTRODUCTION
Rainfall-runoff modeling is one of the most important topics in hydrology and has become an essential measure in flood forecasting, planning and management of water resources. Different types of models have been developed for different purposes and are generally categorized as empirical (black-box) models (e.g. Nash, 1959;Karunanithi et al., 1994;Nikolaou and Mantha, 2000), conceptual models (e.g. Nielsen and Hansen, 1973;Zhao et al., 1980;Bergstom, 1995;Sugawara, 1995;Crawford and Linsley, 1996), and physically based models (e.g. Beven and Kirkby, 1979;Morris, 1980;Abbott et al., 1986). Compared to black-box models and physically based models, conceptual rainfall-runoff models provide reasonable accuracy without heavy computational requirements and have been widely employed in hydrological modelling because they are practical and convenient. Among numerous conceptual models, Xinanjiang (XAJ) model (Zhao et al., Correspondence to: Minjiao Lu, Department of Civil and Environmental Engineering, Nagaoka University of Technology, 1603-1 Kamitomioka, Nagaoka, Niigata 940-2188, Japan. E-mail: lu@nagaokaut.ac.jp 1980) is a rainfall-runoff model developed and widely used in China and over the world .
Most of the modern conceptual models contain a large number of interacting parameters which make model calibration extremely difficult even for experts (Gan et al., 1997). Furthermore, it has been found that some simple models perform better than complex ones (Hornberger et al., 1985;Loague and Freeze, 1985). Hence, developing a simple model that relates runoff to only a few parameters, which are easy to handle during the calibration process and provide good performance, has received much attention and become a challenging task (Hornberger et al., 1985). Replacement of parametric model parts by non-parametric ones will help to reduce the number of parameters and make the calibration easier and more efficient.
In rainfall-runoff modelling, model parts based on linear representation have been successfully applied in several studies (Dooge, 1973;Natale and Todini, 1976;O'Connor, 1976;Zhao et al., 1980;Beven, 2003). Beven (2003) thought inaccuracies due to the linear assumption for runoff routing are less than those associated with runoff production. The routing part, or sub-system can be well represented in simple linear systems (LSs). Zhao et al. (1980) also implemented linear representation in the routing part of the conceptual XAJ model, which has been widely employed in many studies (Gan and Biftu, 1996;Jayawardena and Zhou, 2000;Cheng et al., 2006;Lu, 2012, 2014).
This study attempts to modify the original XAJ model (Zhao et al., 1980) and our web-based modified XAJ model (Kyi et al., 2016) to reconstruct its routing sub-model to a non-parametric multiple input single output (MISO) system which takes multiple runoff components as inputs and outputs discharge at basin outlet. Hereafter, we will refer to the original XAJ model as oXAJ, the web-based modified one as mXAJ, and the resultant model of this study as XAJMISO. Zhao and Wang (1988) and Lu and Li (2014) showed that only seven or eight parameters are sensitive in more than 15 parameters of the oXAJ model and the mXAJ model. Considering four sensitive parameters are included in the routing sub-model, it is clear that the XAJMISO model will greatly ease the model calibration.
The MISO system was introduced into hydrological modelling by Liang (1988). It has been studied and successfully applied in developing channel flow routing models for river flow forecasting of large catchments under the linear system assumption (Liang, 1988;Liang and Nash, 1988). Some works have been carried out to take into account the nonlinear characteristics of the routing process (Liang et al., 1994;Kothyari and Singh, 1999). However, the aforementioned MISO research works have paid little attention to runoff production and have not discussed the length of response functions. The XAJMISO model combines a runoff production sub-system and a MISO system as the routing sub-system. This allows the utilization of runoff production of the XAJ model which is widely tested.
This study discussed the truncation length of the response functions based on runoff concentration time and implemented the MISO system to form the XAJMISO model. The model is tested in simulations of the daily discharge of six river basins across the United States and compared to the mXAJ model (Kyi et al., 2016). By using the same parameters for model parts common with the mXAJ model and input data, model performance evaluation shows the contribution of introducing the MISO system.

The formulation of the MISO system
The XAJMISO adopted model structures related to data adjustment, runoff generation and runoff component separation from the oXAJ (Zhao, 1992) and mXAJ (Kyi et al., 2016) models, and replaced the routing sub-system with a MISO system. Figure 1(a) shows the common parts of these models which will be described in text S1. And Figure 1(b), 1(c), and 1(d) show the routing parts of these models. Here only the routing parts will be explained. Figure 1(b) and 1(c) show that both models accept three runoff components and use linear parts, such as linear channel (LC), linear reservoir (LR) and Muskingum method (M) to construct their routing system. Theoretically, they can be expressed by three parallel linear systems taking three runoff components as inputs as follows: where n indicates time step; B represents a backward operator, which means BR x,t = R x,t-Δt , and R x and h x express the runoff component and its response function respectively. The subscript x = s, i, g stands for surface runoff, interflow and groundwater.
For both the oXAJ and mXAJ models, KI and KG are daily dimensionless outflow coefficients of the free water storage to interflow and groundwater defined in  (1) can then be rewritten as where R ss,t represents subsurface flow, the sum of interflow and groundwater and h ss,n = [(1 -α)h i,n + αh g,n ] is its response function. From the above discussion, it can be concluded that the routing sub-system in the XAJ models can be replaced by a MISO system consisting of two parallel linear systems as shown in Figure 1(d). The model structures shown in Figure 1(a) and 1(d) form the XAJMISO model. Theoretically, a linear system is more flexible than the combination of linear parts such as LC and LR, and better model performance can be expected. Mathematically, XAJMISO ∍ oXAJ and XAJMISO ∍ mXAJ. Furthermore, the XAJMISO model can reduce at least four parameters in the XAJ models, three recession coefficients of the LRs and α. This will largely ease the model calibration procedure.

The truncation of the response functions of the MISO system
In order to determine the response functions, some assumptions are necessary. One approach is to parameterize these response functions using some function such as gamma function (Liang and Nash, 1988). This will allow us to turn the problem into optimization of parameters in the functions. However, this will limit the flexibility of the MISO system. In this study, we choose to truncate the response functions and turn Equation (2) into a set of simultaneous linear equations which can be solved easily.
Generally, response function is equivalent to residence time distribution of the runoff within the system. For surface runoff, time of concentration (T c ) or base time of unit hydrograph can be used to define the length of the response function, namely the truncation length. There are many equations (e.g. Cronshey, 1986;Haktanir and Sezen, 1990;Fang et al., 2006;Lu, 2013Lu, , 2016 available to estimate T c . Taking into account the results of these equations, a response function of 30 days can cover surface runoff routing even for very large basins. Comparing with the surface runoff, subsurface runoff is several orders of magnitude slower. Zhao and Wang (1988) showed that the mean residence times of interflow and groundwater are about 10 and 50-500 days respectively. Sugawara (1995) also recommended to set the baseflow recession coefficient to 0.002 which is equivalent to mean residence time of 500 days. It seems more than 1000 days is necessary before most of the groundwater runoff reaches the basin outlet. A longer response function will not only increase the number of dependent variables, but also will make the model spin-up period longer and shorten the effective length of data for model calibration.
In order to take account of the periodicity of inflow, the subsurface system can be rewritten as where n y , n t indicate year and time step within a year; and N t is number of steps in a year. When inflow into the system is completely periodical, namely B n y N t + n t R ss, t = B n y + 1 N t + n t R ss, t , can be derived. This makes it possible to shorten the response function to one year by re-defining h̑s s, n t = ∑ n y = 0 ∞ h ss, n y N t + n t . Considering the inflow into the subsurface system is relatively small and strongly periodical, it is reasonable to consider the truncation to one year will not cause large losses. Equation (2) becomes where N s = 30 and N ss = 365 for the daily model, and N s = 720 and N ss = 8760 for the hourly model. Equation (4) can be written in matrix form as

K.H. KYI AND M. LU
R s and R ss can be derived from the runoff generation part of the mXAJ model, and response functions h s and h ss can be calculated as follows by minimizing the squared error (Q obs -Q) T (Q obs -Q) (a detailed explanation can be found in Liang and Nash (1988)): where Q obs is the observed discharge corresponding to Q.
give the two response functions. The proposed XAJMISO model can then calculate outflow at the basin outlet using Equation (5). A list of the XAJMISO model parameters is given in the third column of Table SI.

Data
In this study, daily mean areal rainfall calculated from ground-based rain gauges, P; daily potential evaporation, E p (developed from the NOAA Evaporation Atlas, Farnsworth et al. (1982); and runoff data (developed from USGS hydro-climatic data, Wallis et al. (1991) are used. These data are provided by the U.S. Model Parameter Estimation Project (MOPEX) data set (Schaake et al., 2006), which is freely accessible at ftp://hydrology.nws.noaa.gov/ (last accessed on 23 March 2013). This study computes the daily runoff for every study basin with the proposed XAJMISO model and compares the results with the daily observed runoff.

Selection of study basins
This study analyses 6 river basins across the United States to avoid snow impacts on the XAJ model's simulation and evaluation. Detailed discussion related to the selection of study basins and calibration is mentioned in text S2 and S3. A list of studied MOPEX basins, their locations, and characteristics are presented in Table I. The stream gauge locations are illustrated in Figure S1.

Reference data
The proposed methodology is first tested under ideal conditions on the Nantahala River basin near Rainbow Springs, NC, United States (MOPEX ID 03504000; second row of Table I), which has an area of 135 km 2 . Daily runoff generated by the mXAJ model was used as reference data (calculated from a 54-year record, 1948-2001). The h s and h ss calculated using Equation (6), along with their respective theoretical values calculated using Equation (S5) and (S6) from parameter values of c s , c i , c g and α, are illustrated in Figure S2(a) and S2(b). As expected, the calculated response functions completely agree with their theoretical ones. Using the coordinates of these response functions, the daily discharge at the basin outlet is computed using Equation (5), and the calculated hydrograph is confirmed to perfectly reproduce the reference hydrograph ( Figure S2(c)). Thus, the proposed XAJMISO model performs well in the reference case for the simulation of daily discharge at the basin outlet using two LSs.

Real data
The application of the model to six real-world basins is discussed here using observed data. The data record of each basin is divided into calibration and validation periods, as shown in Table SII, and the performance of the model on the real, observed runoff data set is analyzed for the aforementioned study basins in both periods. To avoid possible effects of initial conditions in this study, the first year of the simulation results is excluded from our proposed method. This step was taken because Rahman et al. (2015) suggested that the spin-up time for the mXAJ model soil moisture can be less than one year. The h s and h ss are then calculated for our selected six study basins using the proposed method using Equation (6). Based on the values of h s and h ss , discharge at the basin outlet will be obtained using Equation (5). Figure 2 and Figure 3 show the response function coordinates of h s and h ss for the calibration periods of all basins in accordance with basin scale. All surface runoff response functions approach zero within 10 days, and all subsurface runoff response functions within 100 days. This implies that the runoff that arrives at the basin outlet after the truncation length is very limited, and the truncation error also can be thought to be very limited. The analysis of all surface runoff response functions reveals that their center moves slightly to the right when the drainage area of the study basin increases, which can be seen in both the calibration and validation periods. In the calibration periods, their median response times are 1.5, 2.3, 2.6, 2.1, 2.6 and  Figure 4 compares observed and simulated daily hydrographs in which discharges on same day in different years are averaged. Six panels show the result of the Nantahala River (MOPEX ID: 035045000), the French Broad River (MOPEX ID: 03443000), the Noxubee River (MOPEX ID: 02448000), Locust Fork Station (MOPEX ID: 02456500), the Chehalis River (MOPEX ID: 12027500) and the Oostanaula River (MOPEX ID: 02387500) across the USA in the calibration period while Figure S3 displays the same hydrographs in the validation periods. As in the reference data, the hydrographs exhibit a clear match between observed and simulated averaged daily hydrographs in both calibration and validation periods, confirming the applicability of the XAJMISO model at study basins of different data aridity indices (DAIs) and different drainage areas. The model efficiencies are then evaluated by using the Nash-Sutcliffe efficiency (NSE) (Nash and Sutcliffe, 1970) where Q obs , Q c and Q obs are observed discharge, calculated runoff and average observed runoff; t and T are time and number of data points, respectively. As indicated in Table  II, the XAJMISO model shows much better performance than the mXAJ model at all study basins, though the mXAJ model itself gives quite good results. All daily NSE values for the calibration period are greater than 0.839, and the worst value for the validation period is 0.797. Compared with the mXAJ model, the advantage of the XAJMISO model is very significant. It raised NSE on average about 0.1 in the calibration period and 0.07 in the validation period. Also, a smaller standard deviation of NSEs in Table  II implies the performance of the XAJMISO is more stable. Another interesting and important finding of this study is the XAJMISO model is capable of representing the relationship between rainfall and runoff even when catchment size becomes quite large. Due to its structural limitation, the mXAJ tends to show slightly lower NSEs for large basins (Table II). However, the XAJMISO does not show such tendency and has almost the same performance.

CONCLUSIONS
In this study, the MISO system performing runoff concentration is proposed to replace the routing part of the mXAJ model and the oXAJ model. The combination of XAJ model runoff generation part and the MISO system forms a new rainfall-runoff model named as XAJMISO. It was applied to six selected basins in the United States. Its performance was investigated using averaged daily hydrographs and NSE values for both calibration and validation periods. It is revealed that the XAJMISO model achieved agreement with the observed discharge better than the mXAJ model, and effectively represented the real-world hydrological processes in basins of different aridity and dif- It is considered as one of the reasons for the superiority of the XAJMISO model that the mXAJ model uses a linear combination of LRs, while the XAJMISO model uses two parallel LSs characterized by non-parametric response functions which are more convenient and flexible. Theoretically, mXAJ is a subset of XAJMISO. This means that the XAJMISO model may include some functionalities which are not included in the mXAJ model. It is reasonable to consider that the runoff concentration which is implicitly and partly included in the XAJMISO model but not implemented in the mXAJ model causes the difference between these two models. Therefore, the XAJMISO model can to some extent cover runoff concentration in the river network within the basin. Considering the routing part of the oXAJ model is also a linear system, it can be expected that the XAJMISO can also be superior to the oXAJ model. By using non-parametric linear systems to replace the routing part in the XAJ model, the XAJMISO model successfully reduces several sensitive model parameters, and makes the model calibration easier. According to Zhao and Wang (1988) and Lu and Li (2014), both the oXAJ model and the mXAJ model have only seven sensitive parameters (C ep , SM, KI, KG, c s , c g and L (lag time of channel routing, not included in the mXAJ model)). The result of daily simulation is mainly controlled by routing parameters. In the XAJMISO model, parameters (c s , c i , c g and α) are completely eliminated. This almost removes half of the sensitive parameters and largely simplifies the model calibration procedure. Undoubtedly, this is a significant progress in XAJ model development.

SUPPLEMENTS
Text S1. The oXAJ model and the mXAJ model Text S2. Selection of study basins Text S3. Calibration of XAJ model parameters Figure S1. Stream gauge location of analyzed basins over USA Figure S2. Results from reference data (Nantahala station near Rainbow Springs, ID: 03504000, data period: 1949~2001); (a) response function of surface runoff; (b) response function of subsurface runoff; (c) reference and calculated daily hydrographs averaged over data period Figure S3. Comparison between observed and simulated daily hydrographs averaged over validation period; (a) Nantahala River, ID: 03504000 (b) French Broad River, ID: 03443000 (c) Noxubee River, ID: 02448000 (d) Locust fork station at Sayre, ID: 02456500 (e) Chehalis River, ID: 12027500 and (f) Oostanaula River, ID: 02387500