Current issues in and an emerging method for flood frequency analysis under changing climate

: In this study several issues with the standard flood fre quency analysis are discussed in the context of a changing hydro-climate in the 21st century. Among these issues the loss of statistical equilibrium in the hydro-climate of a stud ied region during the 21st century has serious implications on the standard frequency analysis that is discussed in some detail. An alternative method to flood frequency analysis within the framework of a changing climate based on ensem ble of future climate projections is reported and demon strated by a numerical application to a target watershed.


INTRODUCTION
A standard method for the risk-based analysis and design of hydraulic structures has been the flood frequency analysis. The flood frequency analysis procedure was standardized in the USA in mid-1970s by the development of the Bulletin 17 by the U.S. Water Resources Council (1976). This standardized flood frequency analysis method was revised in 1982 by the U.S. Department of Interior Interagency Advisory Committee on Water Data (1982), and has been used for more than 30 years until present by all U.S. Agencies and water resources engineering practitioners in the U.S. and elsewhere around the world.
The flood frequency analysis attempts to quantify the flood peak discharge as function of the probability of exceeding a discharge threshold for a particular structure, such as a flood levee, defined in terms of a selected return period. A return period is defined as the average time interval inbetween the consecutive occurrences of floods with discharges that exceed the critical discharge level above which the structure would be harmed. Dating back to 1960s (U.S. Water Resources Council, 1967), the flood frequency analysis was based on the development of a statistical sample from the record of annual maximum flow discharges at the river location of interest. Then, an empirical frequency distribution of annual maximum discharges against their return periods could be estimated by means of plotting position formulae, such as the Weibull formula (Benson, 1962).
However, due to the limited length of the historical flow records which were usually less than the desired return periods, such as 100-year return period or 200-year return period, it became necessary to extrapolate the statistical information of the limited observation record of the annual maximum discharges to those of the desired return periods. For this purpose, and for creating a uniform procedure for estimating the flood risks, the U.S. Water Resources Council (1976) developed the report, called "Bulletin 17", where they recommended the Log Pearson Type 3 (LP3) distribution as the probability distribution of the annual maximum flows. Once the LP3 distribution is fitted to the existing record of the annual maximum flow discharges, in terms of the estimation of its parameters, then it is used to extrapolate the annual maximum flow discharge magnitudes to those corresponding to the desired return periods. Accordingly, a comprehensive procedure was developed by the U.S. Department of Interior Interagency Advisory Committee on Water Data (1982) for the calibration of the LP3 distribution's parameters in terms of the existing flow records relevant to the study watershed so that it could be implemented to the particular river location within the watershed. This procedure was reported in the so-called "Bulletin 17B" which became the standard reference for flood frequency analysis in USA and elsewhere until present. Currently, a new flood frequency analysis bulletin, Bulletin 17C is being prepared by the U.S. Subcommittee on Hydrology, Advisory Committee on Water Information Hydrologic Frequency Analysis Work Group (HFAWG, 2013) in order to revise mainly the estimation procedure of the main theoretical frequency distribution that is used in frequency analysis, the Log Pearson Type 3 probability distribution for the annual maximum flow discharges.
The standard flood frequency analysis method of the Bulletin 17B is based on some fundamental assumptions. The foremost among these assumptions is that the hydroclimate system will stay in statistical equilibrium (so-called stationarity) during the time horizon from the present time when the flood exceedance probabilities are estimated based on the LP3 distribution that is calibrated by the data of the historical record, to at least the end time of the desired return period (usually 100 years or 200 years). A second, not-muchdiscussed assumption is that the single historical flow record of the study location is representative of the population statistical behavior of the flow process at that location. In other words, it is assumed that the statistical inferences that one can make from the statistics that are estimated from the single historical record of the study location can describe the future population behavior of the flows at the particular station as they evolve in time. This assumption amounts to "ergodicity in probability distribution" which means that when the length of the record of annual maximum flows at the study location is sufficiently long, the sample probability distribution of the annual maximum flows that can be estimated empirically by means of plotting position formulae from such a record could form the basis for estimating the population probability distribution of the annual maximum flows, which is taken to be the LP3 distribution whose parameters are to be estimated from the historical data of the particular location and of the surrounding region.
In the face of the change taking place in the global climate (Intergovernmental Panel on Climate Change [IPCC], 2007), both of the above assumptions are seriously debatable. As the global climate is changing, the hydro-climate conditions at a study watershed are also changing. Furthermore, the regional hydro-climate studies, based on downscaling the climate projections of the global climate models for the time horizon of the 21st century (~90 year time horizon of 2010-2100) are convincingly showing significant changes taking place in the regional hydro-climates around the world. Within the framework of a changing hydro-climate in a flood study region, it is difficult to argue that the LP3 distribution that is calibrated by the historical hydrologic data, will be representative of the future flood conditions since the fundamental assumption of the statistical equilibrium (stationarity) of the climate upon which it is based, is no longer valid during the 21st century, let alone during the next 200 years or longer time horizons.
The HFAWG (2013) in their web report make the following statement: "There is much concern about changes in flood risk associated with climate variability and long-term climate change. Time invariance was assumed in the development of this guide. In those situations where there is sufficient scientific evidence to facilitate quantification of the impact of climate variability or change in flood risk, this knowledge should be incorporated in flood frequency analysis by employing time-varying parameters or other appropriate techniques. All such methods employed need to be thoroughly documented and justified." The method of "time-varying parameters" mentioned in the above statement to account for the effect of climate change assumes that the historical time trends in the LP3 distribution's parameters, if there are any, shall continue in the same way in the future 100 years or longer. However, if the global climate is changing in the 21st century, as evidenced from the global climate model projections (IPCC, 2007), then there is no guarantee that the historical hydrologic trends, if there are such trends, shall continue in the same way in the future. Furthermore, if a region's hydroclimate system is changing, this may mean that the population characteristics of the hydrologic flows may be changing in that region. If this is the case, then it is questionable that a unique probability distribution, such as LP3, would be representative of the evolving population of flood flows throughout the 21st century and beyond. Concerning the issue of "ergodicity in the probability distribution" of floods, the necessary condition for this ergodicity is the time-stationarity of the underlying flow process which is going to be violated as the regional climates around the world evolve with time in the 21st century and beyond. Therefore, the practice of making statistical inferences based on one observed flow record at the study location, leading to building a probability distribution of future flood flows, which is based on the assumption of "ergodicity in the probability distribution" of floods, is not supported under a changing climate. Then the question becomes "Is there an alternative way to the flood frequency analysis approach of Bulletin 17B or 17C?" The answer to this question may be found in the currently available 40-some projections of the 21st century hydroclimate by various Global Climate Model/Earth Modeling Systems (GCMs/ESMs) of various climate centers around the world. For any geographical location of the world where a flood frequency analysis is sought, it is possible to downscale these 40-plus climate projections of the 21st century by means of a calibrated and validated regional atmospheric model at fine time (hourly) and space (<10 km) resolutions over the study region, and then use these 40-plus projections as input to a calibrated and validated physically-based watershed hydrology model in order to simulate 40-plus projections of the hydrologic conditions over the study region. Based on this numerically-simulated hydrologic flow dataset, one can then obtain 40-plus projections of the annual maximum flow discharges for the 21st century at the selected study location. If these projections cover the period 2010-2100, then one would end up with a sample of at least 40 × 90 = 3600 years, containing 3600 annual maximum flow discharge values. In the case of a strict assumption that annual maximum flood flows for these 3600 years of simulations come from the same population, then one would be able to estimate an annual maximum flow discharge even for an exceedance probability of ~3 × 10 -4 directly from the numerically constructed sample of flood flows without assuming any probability distribution or ergodicity. If one were to recognize the evolution of the regional hydro-climate during 2010-2100 period, one could still divide this period into a certain number of non-overlapping periods, such as 2010-2040, 2040-2070 and 2070-2100 within each of which it would be plausible to assume local stationarity of the hydro-climate. Then for each of these 30-year, locallystationary periods one would have a sample of 30 × 40 = 1200 years, consisting of 1200 annual maximum flow discharge values. Under the assumption that the annual maximum flow discharges for each of these years are mutually independent, and that each of the 40 climate projections are equally-likely, then one could obtain 3 different frequency distributions that would evolve with time during the 21st century, reflecting the impact of the evolving climate throughout the 21st century. In each of these frequency distributions one would be able to estimate the annual maximum flow discharge that would correspond to as low as ~8 × 10 -4 exceedance probability. These probability distributions, since they are constructed from an ensemble of 40-plus realizations, would not need any assumption concerning an underlying probability distribution, nor would they need the assumption of ergodicity in the probability distribution since they are estimated from an ensemble, not from one realization.

AN EMERGING METHOD FOR FLOOD FREQUENCY ANALYSIS UNDER CHANGING CLIMATE
This section explains a recently developed method (Trinh et al., 2016) to estimate the 100-and 200-year return period floods without any frequency curve extrapolation under changing climate as aforementioned. This method utilizes future climate projections produced by multiple GCMs/ ESMs based on multiple future climate scenarios. Time and space resolutions of available future climate projections data are generally 6-hourly and 100 km, respectively, at finest scale. They are too coarse as inputs for modeling watershed-scale hydrological processes, especially in order to obtain flow data for flood frequency analysis. Future climate projections need to be downscaled to fine time (hourly) and space (< 10 km) resolutions for flood frequency analysis. To downscale future climate projections and simulate future hydrologic conditions, a physically-based hydro-climate model, which consists of a regional atmospheric model and a physically-based hydrologic model, is recommended because the model parameters of a physically-based model could evolve by the change in climate if the land conditions change by the change in climate since their parameters are estimated objectively and directly from the land conditions. The regional atmospheric model is used to downscale the future climate projections to fine time and space resolutions. The downscaled future climate projections are then used as inputs to a physically-based hydrologic model in order to simulate the projections of the hydrologic conditions over the study watershed. Then, the annual maximum values of flow discharge at the study watershed are extracted from the simulated future hydrologic conditions. The hydro-climate simulations based on multiple future climate projections generate an ensemble of annual maximum values of flow discharge. The sample size of annual maximum values of flow discharge is equal to (the number of projection years) × (the number of individual future climate projections). If the number of future climate projections is sufficiently large, the future period can be divided into several non-overlapping time windows. In that case, the sample size of annual maximum values of flow discharge during a determined future period (time window) becomes (the number of years during the time window) × (the number of individual future climate projections). Then, two assumptions are employed to estimate flood frequency curves under future climate change conditions: 1) the local stationarity of hydro-climate conditions during each of the determined future non-overlapping periods (time windows); 2) each future projection is equally likely to occur. Since the occurrence probability of each projection is unknown, the occurrence probability is assumed to be equal among all the projections. If the sample size of annual maximum values during the considered time window is more than 200, then it is possible to estimate the 200-year return period flood without any extrapolation. The assumption of the local stationarity in hydro-climate conditions within a shorter time window is more realistic although a sufficient number of future individual projections are required to create a sufficient sample size to simulate a specified return period flood without any extrapolation. If there is a sufficiently large sample size of annual maximum values during the considered time window compared to the target return period, outliers can be excluded from the estimated flood frequency curve.

APPLICATION TO THE CACHE CREEK WATERSHED IN CALIFORNIA
The aforementioned method was demonstrated by Trinh et al. (2016) over the Cache Creek watershed (CCW) in California, the United States. The CCW is located in the Coastal Range of California with an area of approximately 3,000 km 2 . Future climate projections data that can be used by a physically-based hydro-climate model were limited when this application was conducted. Only 13 future climate projections were utilized in this application. The 13 future climate projections were produced by two GCMs based on four future climate scenarios. The two GCMs are the third-generation community climate system model (CCSM3, Collins et al., 2006), and the fifth-generation European center global climate model (ECHAM5, Roeckner et al., 2003). The four future climate scenarios include A1B, A1FI, A2 and B1 which were reported in the Special Report on Emission Scenarios (SRES) in 2000 that outlined possible future greenhouse gas emissions scenarios. The SRES scenarios differ in their assumptions of greenhouse gas emission, population growth, land use, fossil fuel consumption, economic and technological development (Nakicenovic and Swart, 2000). Nine of the 13 projections were produced by ECHAM5 with three different emission scenarios and different initial conditions: A1B1, A1B2, A1B3, A2-1, A2-2, A2-3, B1-1, B1-2, and B1-3. Four projections were created by CCSM3 with a single initial condition under four different emission scenarios: A1B, A2, B1, and A1FI. In this application, occurrence of the 13 projections are assumed to be equally probable.
As a physically-based hydro-climate model, Trinh et al. (2016) utilized the Watershed Environmental Hydrology Hydro-Climate Model (WEHY-HCM, Kavvas et al., 2013;Kure et al., 2013). The WEHY-HCM was developed by integrating a regional atmospheric model with a physicallybased hydrologic model, the Watershed Environmental Hydrology Model (WEHY, Chen et al., 2004aChen et al., , 2004bKavvas et al., 2004Kavvas et al., , 2006 through the atmospheric boundary layer. The WEHY model is a scalable physically-based hydrologic model that is able to upscale point-scale hydrologic processes at grid points to the scale of the model computational unit (MCU) areas throughout the watershed domain. The WEHY model consists of two subprograms: the hillslope flow processes and the channel routing processes. While the hillslope flow subprogram describes unsaturated flow, subsurface stormflow, overland flow, groundwater flow, and hillslope channel flow, the channel routing subprogram describes the transport of water in a stream network of the watershed. Since model parameters are derived from the physical features of a watershed, the WEHY model has capability to account for the effect of spatial heterogeneity in land surface and subsurface conditions on the hydrologic flow processes. In this application, the Fifth-Generation Penn State/NCAR Mesoscale Model (MM5, Grell et al., 1994) was employed as the regional atmospheric model component of the WEHY-HCM.
Before applying the WEHY-HCM to the future climate projections, the physically-based hydro-climate model was calibrated and validated over the CCW. For the calibration and validation, a historical atmospheric reanalysis dataset (NCEP/NCAR Reanalysis I; Kalnay et al., 1996Kalnay et al., ) from 1950Kalnay et al., through 1999, and the control runs (historical model simulations providing initial conditions for future climate projections) from 1901 through 1999 of ECHAM5 and CCSM3 were dynamically downscaled over the CCW by means of the atmospheric model component. The model biases on precipitation in the downscaled control runs were corrected by comparing the model simulation results with the corresponding observations. These bias-correction factors were also applied to future climate projections. Then the downscaled atmospheric data were used as inputs to the hydrologic model component in order to simulate hydrological processes within the CCW. Then, the WEHY-HCM was calibrated and validated by comparing precipitation and flow discharge within the CCW between the model-simulated results based on the reanalysis data and the corresponding observations. Furthermore, observations of the annual maximum flow discharge are available at the outlet of the CCW for more than 100 years. Therefore, annual maximum flow discharges that were simulated by means of the physicallybased hydro-climate model based on the control runs of ECHAM5 and CCSM3, were validated by comparing the flood frequency curve corresponding to the model-simulated annual maximum flows against the frequency curve that was constructed from the observed annual maximum flow discharge record at the outlet of the CCW (see Figure 1).
After the validation step, the 13 future climate projections during the 90-year future period from 2010 through 2099 were dynamically downscaled to hourly time resolution and 9 km spatial resolution over the CCW by means of the atmospheric model component. Through the atmospheric boundary layer, the downscaled future climate projections were incorporated into the hydrologic model component in order to simulate the hydrologic conditions during the future period. The ensemble of the annual maximum values of hourly flow discharge was obtained from the simulated hydrologic conditions based on all the 13 future projections.
Since the future period was divided into two 45-year non-overlapping time windows (2010-2054, and 2055-2099), the sample size of the annual maximum flow values is 585 for each of the time windows in this application. Finally, the flood frequency curves for each of the two time windows were estimated from the ensemble of the annual maximum values, and are illustrated in Figure 2. The 100year and 200-year return period floods are shown in Table I. These return period floods were obtained by interpolation (not extrapolation). A clear difference between the flood frequency curves corresponding to the two time windows was found in the CCW.

SUMMARY AND CONCLUSIONS
In this study several issues with the standard flood frequency analysis are discussed in the context of a changing hydro-climate in the 21st century. Among these issues the loss of statistical equilibrium in the hydro-climate of a studied region during the 21st century has serious implications on the standard frequency analysis. Under a changing climate since the statistical equilibrium in the hydro-climate system is violated it is seriously debatable that the frequency characteristics of annual maximum floods that are inferred from the past are representative of the future flood conditions during the 21st century. Furthermore, under a changing climate it is seriously debatable that a single historical record, however long it may be, is representative of the population characteristics of the future floods during the 21st century.
An alternative method to flood frequency analysis within the framework of a changing climate is reported and demon-  Figure 1. Comparison of the model-simulated (control run) and the empirical (from flow record) (obs) flood frequency curves at the outlet of the CCW using observed annual maximum flow data  and the model-simulated annual maximum flow data from the downscaling of the control runs from two selected GCMs (Trinh et al., 2016) Figure 2. Flood frequency curves for two consecutive 45-year non-overlapping time windows obtained from the ensemble of all flow projections during 21st century at the outlet of the CCW (Trinh et al., 2016) strated by a numerical application to a target watershed. This new approach to flood frequency analysis is based on the dynamical downscaling of the projections of the 21st century climate by global climate models to the region of study. By means of such downscaling of the future climate projections by a regional hydro-climate model, it is possible to construct an ensemble of flow projections that will have a sufficiently large sample size to render the construction of flood frequency distributions, as they evolve during the 21st century, without resorting to any statistical extrapolation or to any particular probability distribution.