Hydrological frequency analysis of large-ensemble climate simulation data using control density as a statistical control

Uncertainty in hydrological statistics estimated with finite observations, such as design rainfall, can be quanti‐ fied as a confidence interval using statistical theory. Ensemble climate data also enables derivation of a confi‐ dence interval. Recently, the database for policy decision making for future climate change (d4PDF) was developed in Japan, which contains dozens of simulated extreme rain‐ fall events for the past and 60 years into the future, allow‐ ing the uncertainty of design rainfall to be quantified as a confidence interval. This study applies an order statistics distribution to evaluate uncertainty in the order statistics of extreme rainfall from the perspective of mathematical theory, while a confidence interval is used for uncertainty evaluation in the probability distribution itself. An advan‐ tage of the introduction of an order statistics distribution is that it can be used to quantify the goodness-of-fit between observation and ensemble climate data under the condition that the extreme value distribution estimated from observa‐ tions is a true distribution. The order statistics distribution is called the control density distribution, which is derived from characteristics that order statistics from standard uni‐ form distribution follows beta distribution. The overlap ratio of the control density distribution and frequency dis‐ tributions derived from ensemble climate data is utilized for evaluation of the degree of goodness-of-fit for both data.


INTRODUCTION
The fifth report of the Intergovernmental Panel on Climate Change (IPCC) stated that there is no doubt that there is warming of the climate system and predicted that extreme rainfall in mid-latitude land areas would increase by the end of this century (IPCC, 2014). In light of this situation, research related to adaptation to climate change is increasing (e.g. Koninklijk Nederlands Meteorologisch Instituut (KNMI), 2020; Headquarters U.S. Army Corps of Engineers (USACE), 2011). In Japan, design rainfall used for flood-control management has been estimated from rainfall data, which are obtained from observations over several decades (Ministry of Land, Infrastructure, Transport and Tourism (MLIT), 2004). To make adaptation measures for torrential rainfall associated with climate change, a flood risk evaluation introducing risk-based approach has been proposed (e.g. Berkhout et al., 2014;Yamada et al., 2018;Yamada, 2019). The risk-based approach suggested by Yamada et al. (2018) and Yamada (2019) introducing mutual uncertainty evaluation based on huge ensemble climate data and mathematical statistics has characteristics that allow evaluatation of extreme phenomenon that are hard to predict with observations. Also, ensemble climate databases (Mizuta et al., 2017;Yamada et al., 2018;Sasaki et al., 2008Sasaki et al., , 2011 developed in Japan are utilized to set change ratios of rainfall with design level in each region, and flood-control management based on basin scale follows (MLIT, 2021).
In this study, we utilized the Database for Policy Decision Making for Future Climate Change (d4PDF) (Mizuta et al., 2017) which consists of simulations from the atmospheric general circulation model (AGCM) called the Meteorological Research Institute AGCM, version 3.2 (MRI-AGCM3.2) (Mizuta et al., 2012) with a horizontal resolution of 60 km (d4PDF-60 km), and dynamical downscaling (DDS) using the Non hydrostatic Regional Climate Model (NHRCM) (Sasaki et al., 2008) of the d4PDF-60 km to a horizontal resolution of 20 km targeted over Japan (d4PDF-20 km). The series of analyses presented in this study were performed using rainfall data developed by Yamada et al. (2018) and Yamada (2019) which were downscaled from d4PDF-20 km to a horizontal resolution of 5 km (d4PDF-5 km) over the Hokkaido region, northmost Japan, and its surrounding area.
In the process of design rainfall determination, goodnessof-fit testing is generally applied to evaluate validity over an assumed distribution of extreme rainfall. The Kolmogorov-Smirnov test is one such test (Kolmogorov, 1933;Smirnov, 1939). When this test is applied to frequency analysis of extreme hydrological events (e.g. World Meteorological Organization (WMO), 1989; USACE, 1994; Japan Meteorological Agency (JMA), 2021), the results can be used to determine the range of uncertainty in extreme rainfall under an assumed distribution. This test can also be applied to evaluate the goodness-of-fit between ensemble climate data and observations (e.g. Tanaka et al., 2019;Shimizu et al., 2020). However, the power of the Kolmogorov-Smirnov test becomes weak at the tail of an assumed distribution. The acceptance region of this test is quite wide at the tail of the assumed probability distribution, which leads to the existence of infinite values of extremes under the Kolmogorov-Smirnov test. In the light of weak power of the Kolmogorov-Smirnov test, a probability limit method theory was proposed by Moriguti (1995a), which has strong power at the tail of an assumed probability distribution. Shimizu et al. (2018Shimizu et al. ( , 2020 introduced the probability limit method into the construction of confidence intervals. The confidence interval based on the probability limit method test quantifies the threshold of the range estimated probability distribution from finite observation data with high accuracy. Also, the validity of confidence interval derived from resampling to d4PDF-5 km was supported by its consistency with the confidence interval based on the probability limit method (Shimizu et al., 2018(Shimizu et al., , 2020. Uncertainty in estimated probability distributions is quantified by the above confidence interval. Thus, to evaluate uncertainty in order statistics of extreme rainfall we adopted the concept of a control band, which was first described by Gumbel (1958). Here, the control band expresses the range of the i-th order statistics following an assumed distribution. Specifically, we applied an i-th order statistics for extreme rainfall, which is constructed using characteristics of beta distribution described later; we call this function the control density distribution. The control density distribution can be applied to evaluate the goodness-of-fit between ensemble climate data and observed data under the condition that the probability distribution used for construction of the control density distribution is true. Here, the confidence interval proposed by Shimizu et al. (2020) is based on a hypothesis test, the probability limit method, and diagnostics of rejection of an assumed distribution. On the other hand, control density distribution is not based on the hypothesis theory, and test statistics do not exist. Thus, the control density distribution is not used for diagnosis on the rejection of an assumed distribution. We describe our method, analyze the properties of the control density distribution and its relation to sample size, verify it using the Monte Carlo method, and apply the control density distribution method to d4PDF-5 km. The novelty of this study is the introduction of a control density distribution for the goodness-of-fit evaluation between ensemble climate data and observations.

CONCEPT OF THE CONTROL DENSITY DIS-TRIBUTION
In Gumbel's Statistics of Extremes (1958), he discussed the theoretical distribution of the i-th order statistics of the Gumbel distribution. The control band can be derived from this theoretical distribution. The lower boundary x lower i and upper boundary x upper i of the control band of the i-th order statistics are derived from the following equations: where x (i) is the i-th order statistics, f x i x is its theoretical probability distribution function (PDF), and α is the significance level. Connecting x lower i and x upper i for different i gives the control curve of the assumed distribution (Gumbel, 1958).
The control band and control curve are useful for statistical control, but f x i x is not always known for distributions other than the Gumbel distribution. However, it provides more information than the control band and control curve. Thus, we suggest the following method to determine the control bands for an arbitrary distribution. Then, using those control bands, the PDF f x i x can be derived. The derived distribution is referred to as the control density distribution in this study.

Theoretical method of finding the control band for an arbitrary distribution
A control band of order statistics was employed using the following procedure.
(1) Probability-representing function Generally, a random variable's distribution is represented by a PDF or a cumulative distribution function (CDF). Moriguti (1995b) proposed the following probabilityrepresenting function (PRF): where F(x) is a cumulative distribution function, and its inverse function, G(y), is defined as the PRF. Because the domain of G(y) is the range of F(x), it becomes [0, 1]. Using the PRF, the uniform distribution on [0, 1] can be transformed to any type of distribution. If a random variable Y follows a uniform distribution of [0, 1], the cumulative distribution function of the random variable X = G(y) will be F(x). In this way, the characteristics of the uniform distribution on [0, 1] can be generalized to any distribution; this is discussed below.
(2) Characteristics of the i-th order statistics of the uniform distribution on [0,1] Because the properties of the uniform distribution on [0, 1] can be extended to any type of distribution using the PRF, the characteristics of the statistics of the uniform distribution on [0, 1] should be examined. The i-th order statistics x (i) of the uniform distribution on [0, 1] follows a beta distribution, for which the cumulative distribution function is as follows: where B(x; α, β) is incomplete beta function, α and β are parameters of the beta distribution, and for the i-th order statistics in a sample of size n, α = i, β = n -i + 1.
(3) Finding the arbitrary distribution of the i-th order statistics For the results shown in Figure 1, we assumed a sample size of 60 and plotted the PDF of the 1st, 10th, 20th, 30th, 40th, 50th, and 60th order statistics derived from a uniform distribution. Here, the function form of order statistics FREQUENCY ANALYSIS OF LARGE-ENSEMBLE DATA derived from the uniform distribution is equivalent to the beta distribution. By projecting the control band, x lower i − uniform and x upper i − uniform , of the i-th order statistics derived from the uniform distribution to the PRF of the Gumbel distribution (or any type of distribution), we can obtain the control band of the i-th order statistics of the Gumbel distribution: Deriving the control density distribution from the control band Using the control band, the PDF of the i-th order statistics can be reverse derived. The process is shown in Figure  2. To ensure that the CDF of the Gumbel distribution follows a straight line, the Gumbel probability paper is applied in this figure. The projection points of the i-th order statistics from the uniform distribution to the Gumbel distribution are plotted in Figure 2(a). The probability between two consecutive projection points was fixed to 0.5%. Thus, connecting the projection points gives the control curve of the Gumbel distribution. The control curves representing the 99%, 80%, 60%, 40%, and 20% control bands are plotted in Figure 2 Figure 2(c) represents our concept of the control density distribution. For each order statistics, the density is calculated by dividing the fixed probability of 0.5% by the distance between two consecutive projection points. For example, the probability density between the lower boundary of the 99% control band and the lower boundary of the 98% control band is Next, we checked the reliability of the control density distribution using the Monte Carlo method. The test procedure can be summarized as follows: The true distribution is the Gumbel distribution with location parameter μ = 0, and scale parameter β = 1, and the sample size is 60. Then, 60 × 10,000 samples are generated to obtain the probability densities in each order, and compare them to control density distribution. The results are shown in Figure 3. It is apparent that the frequency distributions of order statistics, which are generated from the Gumbel distribution accord with the control density distribution quite well. Figure 3(c) shows that the histogram of order statistics is almost identical to the control density distribution, indicating that the reliability of the control density distribution is high.

Characteristics of the control density distribution
In frequency analysis of hydrological events, the size of the sample represents the period that the hydrological data cover (in years). For Japan, observations over about 60 years are considered for flood-control management. It is very difficult to estimate rainfall with a return period that is significantly longer than the period of observation. On the other hand, the control density distribution estimates ranges of order statistics following an assumed probability distribution. Figure 4 shows the 99% control band of the Gumbel distribution for various sample sizes, which were set to 10, 50, 100, 200 and 500. Some conclusions can be drawn. First, for a sample of size n, the return period that the control band can cover ranges up to n. Therefore, extrapolation is necessary for estimating extreme rainfall with a return period longer than the sample size. Second, by connecting the upper and lower limits of the control band of the different sample sizes, we obtain a straight line that is parallel to the true Gumbel distribution (Figure 4a). The same situation of the control band is derived for different significance levels, which estimates the maximum values of the different sample sizes following the Gumbel distribution with the same parameter as in the true Gumbel distribution. Hence, Figure 1. Practical method for finding the probability distribution function (PDF) of the i-th order statistics for an arbitrary distribution by protecting the beta distribution to the PDF of the targeted distribution it is reasonable to extrapolate the control band with a straight line that parallels the true distribution. Third, for a specific return period, a larger sample size results in a narrower control band.

APPLYING CONTROL DENSITY DISTRIBU-TION TO ENSEMBLE CLIMATE DATA
In this section, the control band and control density distribution are applied to a goodness-of-fit evaluation using   ) and (c) that for a specific return period, the Mote Carlo test agrees with control density well FREQUENCY ANALYSIS OF LARGE-ENSEMBLE DATA d4PDF-5 km and observations. Here, the details of the d4PDF data are shown in the following. The d4PDF-20 km comprised of ensemble climate data obtained from a NHRCM with a horizontal resolution of 20 km, using the d4PDF-60 km as boundary conditions. Specifically, this regional downscaling experiment was constructed from an experiment targeting the past 60 years from 1951 to 2010. The 4K experiment assumes an increase in the global average temperature of 4K against the pre-industrial revolution levels and targets the 60 years from 2051 to 2110. The past experiment includes 50 ensemble members that perturb sea-ice conditions, sea-surface temperatures, and initial conditions, for a total of 60 years × 50 members = 3000 years. The 4K experiment includes 6 sea-surface temperature patterns and 15 ensemble members that perturb those patterns, for a total of 60 years × 6 sea-surface patterns × 15 members = 5400 years. The reproductivity of d4PDF-5 km developed by Yamada et al. (2018) and Yamada (2019) was verified through comparison with observations Hoshino et al., 2020).
The target area is the Tokachi River Obihiro reference point, which is located in Hokkaido. The observation of basin average 3-day annual maximum rainfall which covered 100 years, and the basin average 3-day annual maximum rainfall of d4PDF-5 km past experiment for total 3,000 years are used.
In conventional frequency analysis, the following steps are conducted (e.g. Stedinger et al., 1993;Takara, 1998). First, a distribution type is found for the observations. Second, the best-fitting parameter is determined for the observations. Third, hypothesis testing is used as the stochastic control to determine whether the distribution type should be Figure 4. Characteristics of control density showed (a) 99% control bands of the same distribution for different sample sizes and (b) that for a specific return period, larger sample size means narrower control bands changed. Fourth, if the best-fitting distribution passes the hypothesis test, the distribution is considered the assumed distribution and applied to estimate the rainfall for a return period larger than the sample size.
In extreme value theory (Fisher and Tippett, 1928;Gnedenko, 1943), even when the sample follows the assumed distribution and uses the best-fitting curve, uncertainty is present. In this study, the uncertainty in one sample can be quantified using the control band and control density distribution.
To use the ensemble climate dataset to help consideration on the assumed distribution, we propose two ideas: (1) Continue to use the best-fitting distribution of the observations as the assumed distribution because there is no model bias in the observations and the results can easily be compared to those of traditional frequency analysis.
(2) Use the average of the best-fitting distribution of the ensemble climate dataset as the assumed distribution. With this method, the estimated distribution is affected by the climate model; however, it includes large ensemble members and can thus complement idea (1). Figure 5 shows the analysis of idea (1). We considered an annual maximum 3-day rainfall. The Kolmogorov-Smirnov test and the probability limit method were both applied. As shown in the figure, for the same 5% significance level, the probability limit method had a smaller acceptance region at the tail ends of the estimated distribution compared to the Kolmogorov-Smirnov test. This means that the probability limit method had a stronger power for testing in these areas, in line with Shimizu et al. (2020). This result is important for frequency analyses of extreme hydrological events, which often focus on long return periods and would fall into the tail ends of the distribution. Using hypothesis tests for stochastic control, all of the observations were in the acceptance region for both tests; hence, no observations should be considered outliers and the estimated distribution should not be rejected at the 5% significance level.

Consider the best-fitting distribution of the observation data as the assumed distribution
The 99% control band in the figure is for the 100-year sample, the same period as the observation. For comparisons with d4PDF-5 km, we also needed to construct a control density distribution with the same sample size as d4PDF-5 km, which is 60 years. The results are shown in Figure 5(b). We can assess the uncertainty in the rainfall for a certain return period using the control bands. For example, Figure 5(c) shows the risks of exceeding the observations for return periods of 200, 100, 50, and 20 years are 7%, 7%, 21%, and 39%, respectively. For conventional frequency analysis, the largest observation value has a return period that is much longer than 100 years; however, with the present method, it has the same return period as the length of the observation, i.e. 100 years, and we know that the risk of exceeding that value is 7%.
For a certain return period, such as 100 years, we now have not only the expected values of extreme rainfall events but also their whole distribution for the 100-year return period. Hence, we can compare them to the d4PDF-5 km ( Figure 5(c)). We can see that the shape of the control density distribution agrees well with the shape of the histogram; however, the histogram appears to be shifted to the left compared to the control density distribution. The reason for this shift may be that the assumed distribution was determined according to the best-fitting Gumbel distribution of the observation. Also, the overlap ratio of the control density distribution and frequency distributions of each order, which are derived from ensemble climate data, can be applied as a goodness-of-fit evaluation index for ensemble climate data against observations. When the overlap ratio is higher, the goodness-of-fit is considered high under the condition that the assumed distribution is regarded as the true distribution.

Consider the average of the best-fitting distribution of the ensemble climate dataset to be the assumed distribution
To evaluate validity of control density distribution, using the average of the best-fitting distribution of the ensemble climate data as the assumed distribution is introduced. The results are shown in Figure 6. First, we can compare the best-fitting Gumbel distribution of the observations to the average of the best-fitting distributions of the d4PDF-5 km. The plot shifts to the right, which agrees with the previous results. Furthermore, comparing Figures 5(c) and 6(c), shows that, in the current case, not only the shape but also the position of the control density distribution agrees better with the histogram than do the results of the previous case. In the current case, the risks of exceeding the observations for return periods of 200,100,50,and 20 years become 14%,14%,34%,and 57%,respectively. Here,in Figures 5(c) and 6(c), the black line corresponding to the 200-year return period represents the value derived in the following manner. The observed maximum value corresponding to the 100-year return period is extrapolated to obtain a 200year return period with the same slope as the Gumbel distribution estimated from the observed data and d4PDF-5 km. In Figure 5, the Gumbel distribution estimated from the observation (solid red line) and, in Figure 6, that obtained from d4PDF-5 km (solid green line), are used to estimate observations with a 200-year return period. Then, an extrapolated value corresponding to a 200-year return period can be obtained. To verify the validity of the proposed method, frequency distributions with arbitrary return periods derived from each member of d4PDF-5 km are compared to the control density distribution, which is the Gumbel distribution fitted to a sample constructed of the average values of d4PDF-5 km in Figure 6. That figure confirms that the two distributions accord with each other well, supporting the validity of the proposed control density distribution. A comparison with the values of the previous case reveals some implications.

SUMMARY
Observational data which we use for various decision making such as flood control management are essentially limited compared to enormous degrees of freedom in the climate system. Therefore, it is necessary to incorporate confidence intervals into the discussion in order to understand statistics of extremes. To quantify uncertainty in statistics of extremes, we adopted the probability limit method, which has high power of test at the tail of an assumed distribution for derivation of confidence interval. In the probability limit method, samples of order statistics from the uniform distribution on [0, 1] are generated by the Figure 5. Method considering the best-fitting distribution of observations as the true distribution showed: (a) the relationship between observation and control density, (b) the general relationship between ensemble climate simulation data and control band, and (c) the relationship between ensemble climate simulation data and control density for a specific return period Monte Carlo method, and the occurrence probability of the order statistics located near the end most of tail of the beta distribution in each sample is extracted. The distribution of the occurrence probability of these order statistics is defined as the distribution of the test statistic. This enables an analytic derivation of the distribution of possible thresholds for each order at a given significance level and achieves high test power at the tail of the distribution. Therefore, confidence intervals based on the probability limit method, which were proposed in our previous studies (Shimizu et al., 2020), quantifies the uncertainty in the estimated probability distribution with high accuracy, while the control density distribution quantifies the probability distribution of order statistics with arbitrary return periods. In addition, we constructed a method for evaluating the goodness-of-fit between observation and ensemble climate data under the assumption that the population distribution is the probability distribution of extreme rainfall estimated from the observation. Moreover, quantification of the degree of goodness-of-fit between observation and the ensemble climate data is possible through calculation of the overlap ratio between the frequency distribution calculated from the ensemble climate data and the control density distribution for arbitrary return periods. The goodness of fit evaluation based on the control density distribution is not applied to decide rejection of an assumed distribution, because the derivation process of control density distribution does not include the theoretical test statistics distribution. Therefore, for threshold estimation of extreme rainfall and possibility of rejection of an assumed distribution, a confidence interval based on the probability limit method should be utilized. On the other hand, quantification of consistency, in which order statistics in ensemble climate data against observations is required, control density distribution might be useful. Figure 6. Method considering the average of the best-fitting distribution of the ensemble climate dataset as the true distribution showed: (a) the relationship between observations and the control density, (b) the general relationship between ensemble climate simulation data and control band, and (c) the relationship between the ensemble climate simulation data and control density for a specific return period