Objective Classification of Controlling Factors for the Occurrence of the Wide-spread Extreme Precipitation Events during the Baiu Season over Western Japan

The atmospheric patterns associated with wide-spread extreme precipitation events during the Baiu season over western Japan are diverse in the historical record. Using an objective approach based on empirical orthogonal function (EOF) analysis, this study introduces a classification of the atmospheric parameters related to wide-spread extreme precipitation events that are not directly caused by tropical cyclones. The number of rain gauge observation stations that record extreme precipitation during the Baiu season over western Japan is proportional, to a comparable degree, to the scores of the first two principal components, implying that there are two orthogonal controlling factors for the occurrence of wide-spread extreme precipitation. The first principal component is well correlated with a typical frontal dynamical structure, such as an enhanced westerly jet, a large gradient of equivalent potential temperature, and an upper-level Rossby wave train propagating into a cyclonic anomaly to the north of the precipitation area. In contrast, the second principal component is dominated by moisture fields with a low-level cyclone and no upper-level signal. This finding could provide a physical understanding of the diversity of atmospheric patterns causing wide-spread extreme precipitation over western Japan and physical insight into how it will change in the future climate.


Introduction
Rainbands associated with the quasi-stationary front in East Asia, known as the Baiu front (Ninomiya 1984; Ninomiya and Murakami 1987; Sampe and Xie 2010), can produce torrential rainfall that causes wide-spread devastation in Japan. Recently, the July 2018 heavy rain event produced unprecedented precipitation, particularly over western Japan and the Tokai region (e.g., Shimpo et al. 2019). A rain event like that of July 2018 is defined in this study as a "wide-spread extreme precipitation" event, which is discussed in detail in Section 2.
Atmospheric circulation patterns associated with past wide-spread extreme precipitation events over Japan have been examined by previous studies (Murakami and Huang 1984; Ogura et al. 1985; Hirota et al. 2016; Hamada and Takayabu 2018; Sekizawa et al. 2019; Takemura et al. 2019; Tsuji et al. 2020; Yokoyama et al. 2020; Sugimoto 2020; Harada et al. 2020). These studies reported that the important atmospheric factors for extreme precipitation are vigorous moisture transport, an enhanced westerly jet, anomalous upper-level and low-level vortices, and Rossby wave trains associated with upstream blocking.
However, because these studies were based on a composite analysis or a case study of an individual event, the described physical processes are not statistically separated. Some factors may be correlated with each other and could be attributed to an identical mechanism. In other words, these studies might not encompass the diversity of circulation patterns associated with wide-spread extreme precipitation. Therefore, an objective approach is needed to interpret the dynamics and variability of these events.
This study uses a novel approach proposed by Graf et al. (2017) to classify wide-spread extreme precipitation events. This approach is more objective than composite analysis because it starts from a large set of related atmospheric factors and projects them onto principal components, which reduces the dimensionality and separates the parameters in a meaningful way. Note that, although some previous studies classified anomalous weather patterns related to heavy precipitation events over Japan using self-organizing maps (Ohba et al. 2014) or K-means clustering (Miyasaka et al. 2019), Graf's approach is physically more straightforward because it projects the atmospheric environments and processes themselves into the classification.
This study aims to conduct a comprehensive investigation of the classification of the atmospheric factors associated with the wide-spread extreme precipitation event and provide insight into their dynamical mechanism. The present article is organized as follows. The methodology and data used herein are described in Section 2. The results of the analysis are given in Sections 3 and 4. Section 5 provides a discussion and a summary.

Datasets and event selection
The Automated Meteorological Data Acquisition System (AMeDAS) is a rain gauge observation network across Japan managed by the Japan Meteorological Agency from 1977. Each AMeDAS station records the precipitation amount at a 1 h temporal resolution. We use rain gauge data collected at 241 western Japan AMeDAS stations that operated continuously from 1979 to 2018. Supplemental Figure S1a shows the top 99th percentile thresholds of the 12 h averaged precipitation amount at each AMeDAS station. The AMeDAS stations are widely and uniformly distributed over western Japan, with relatively higher precipitation thresholds over the Kyushu region.
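The per-station thresholds described above can be illustrated with a minimal sketch. It assumes hourly precipitation is held in a NumPy array of shape (stations, hours); the function names and the synthetic gamma-distributed data are hypothetical and are not taken from the study. The restriction to June and July and the removal of neighbouring peaks are not reproduced here.

```python
import numpy as np

def moving_average_12h(hourly, window=12):
    """Moving average over `window` consecutive hourly values for one station."""
    kernel = np.ones(window) / window
    # mode="valid" keeps only windows that are fully covered by data
    return np.convolve(hourly, kernel, mode="valid")

def station_thresholds(hourly_by_station, q=99.0):
    """q-th percentile of the 12 h moving average, computed per station."""
    return np.array(
        [np.percentile(moving_average_12h(row), q) for row in hourly_by_station]
    )

# Synthetic stand-in for hourly gauge data: 5 stations, 61 days (hypothetical)
rng = np.random.default_rng(0)
hourly = rng.gamma(shape=0.3, scale=2.0, size=(5, 24 * 61))

thresholds = station_thresholds(hourly)
averaged = np.array([moving_average_12h(row) for row in hourly])
exceed = averaged > thresholds[:, None]      # True where a station is "extreme"
n_stations_exceeding = exceed.sum(axis=0)    # stations exceeding at each time
```

Counting `exceed` along the station axis gives, for each time, the number of stations above their own threshold, which is the kind of quantity that is ranked later in this section.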
In this study, an extreme precipitation event is defined as a top 99th percentile event of the 12 h moving-averaged precipitation in June and July at each AMeDAS station. The peak time of an extreme precipitation event, at a time resolution of one hour, is defined as the time when the AMeDAS rain gauge records the highest precipitation amount during the 12 h moving-averaged extreme precipitation event. In this procedure, records within 6 h before and after a peak that has already been detected at a higher rank are removed from the gauge record, so that precipitation belonging to an already detected event is not ranked again. Then, the local peak time is classified into either 0000−1200 or 1200−2400 UTC. To distinguish extreme precipitation events from those directly caused by tropical cyclones (TCs), we exclude extreme precipitation events during which any AMeDAS station was located within 1000 km of a TC center derived from the International Best Track Archive for Climate Stewardship (IBTrACS) dataset (Knapp et al. 2010). From 1979 to 2018, we count the number of western Japan AMeDAS stations at which extreme precipitation occurred simultaneously in the 0000−1200 or 1200−2400 UTC window of each day in June and July. The resulting time series, consisting of 4880 samples, is then ranked according to the number of AMeDAS stations that simultaneously record extreme precipitation at each time step. Supplemental Figure S1b, for example, depicts the 12 h averaged precipitation amount during the top-ranked wide-spread extreme precipitation event at 1200−2400 UTC on 6 July 2018, when 183 AMeDAS stations exceeded their top 99th percentiles.

Ryosuke Shibuya, Yukari Takayabu, and Chie Yokoyama
Atmosphere and Ocean Research Institute, The University of Tokyo, Kashiwa, Japan
Corresponding author: Ryosuke Shibuya, Atmosphere and Ocean Research Institute, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, Japan, 277-8568. E-mail: shibuyar@aori.u-tokyo.ac.jp.

Atmospheric parameters
Diagnostic variables are calculated from the Japanese 55-year Reanalysis (JRA-55) 6-hourly dataset (Kobayashi et al. 2015) at a 1.25° × 1.25° spatial resolution. The variables, which are thought to be related to wide-spread extreme precipitation, are selected based on the literature review in Section 1. Table 1 summarizes the 33 chosen variables, and details of the selected variables are described in Text S2 in the Supplemental Material. They include dynamical flow features at upper, middle, and lower levels (300, 500, 850 hPa in this study) and parameters related to moisture, mid-latitude dynamics, frontal structure, and dry/moist stability.

Table 1. List of parameters with abbreviation (Abbrev.) and associated signal (Signal). Red bold letters indicate the parameters that, when fitted to a simple linear regression model against the rank of the wide-spread extreme precipitation events, have a large |t| value. Note that |∇θ|_300 is omitted since the specific humidity is quite small at 300 hPa and thus it almost corresponds to |∇θ_e|_300.
Columns: Short description of diagnostic variable; Signal; Abbrev.
Recoverable entries (2D variables): Column-integrated water vapor; Eady growth rate between 500 hPa and 850 hPa; Eady growth rate between 300 hPa and 500 hPa; Lower-troposphere convective instability.

First, each of the 33 parameters is fitted to a simple linear regression model against the rank of the number of AMeDAS stations to check whether the selected variables have a causal relationship to the wide-spread extreme precipitation:

$$Y = aX + b + \varepsilon, \tag{1}$$

where Y denotes a selected parameter, X denotes the rank of the wide-spread extreme precipitation, a and b are estimated fitting parameters, and ε denotes the residual. Figure 1 shows the scatter plot of the column-integrated water vapor near western Japan (130°E, 32°N) against the rank of the wide-spread extreme precipitation events, together with the number of AMeDAS stations that record the wide-spread extreme precipitation. Hereafter, we use only the time steps at which at least one extreme precipitation event occurred, since the remaining 2172 time steps with no extreme precipitation over western Japan cannot be ranked. The number of AMeDAS stations changes approximately exponentially with the rank in Fig. 1, indicating that

$$\ln \mu = cX + d + \varepsilon', \tag{2}$$

where µ denotes the number of AMeDAS stations that simultaneously record the extreme precipitation, c and d are fitting parameters, and ε′ denotes the residual. Substituting Eq. (2) into Eq. (1) shows that the assumption of the linear regression model in Eq. (1) corresponds to that of a log-linear regression model between the number of AMeDAS stations (µ) and the selected diagnostic variable (Y):

$$Y = \frac{a}{c}\ln \mu + \left(b - \frac{ad}{c}\right) + \left(\varepsilon - \frac{a}{c}\varepsilon'\right). \tag{3}$$

Based on the least-squares method, the slope a in Eq. (1) is determined, and its statistical significance is judged by the t-test,

$$t = \frac{a}{s \Big/ \sqrt{\sum_i \left(X_i - \bar{X}\right)^2}}, \tag{4}$$

where s denotes the standard error of the estimation in Eq. (1), i denotes the rank, and X̄ is the mean of X.

Fig. 1. Scatter plot of the column-integrated water vapor (CWV, kg m^−2) near western Japan (130°E, 32°N), denoted by black dots, and the number of AMeDAS stations that simultaneously record the wide-spread extreme precipitation, denoted by green dots, against the rank of the wide-spread extreme precipitation event. The linear regression model fitted to the column-integrated water vapor is denoted by the red line.

Fig. 2. A horizontal map of the estimated regression coefficient of the column-integrated water vapor (kg m^−2) against the rank of the wide-spread extreme precipitation event. The contour interval is 0.5 × 10^−3 kg m^−2. Shading indicates where the regression coefficient is positive and statistically significant at the 99% confidence level according to Student's t test. The purple rectangle denotes the averaging region of the parameters, with a width of 20° × 10°.

Figure 2 shows a horizontal map of the estimated slope parameter a of the column-integrated water vapor against the rank of the wide-spread extreme precipitation event.
The local maxima of the regression coefficient are found to the west of Japan, and the statistically significant signal has a zonally elongated distribution. To avoid an arbitrary discussion based on a variable at a specific grid point, the selected variables are averaged over a rectangle centered at the local maximum of the regression coefficient, with a width of 20° in the zonal direction and 10° in the meridional direction, as shown in Fig. 2. Note that the location of the center of the averaging region is different for each variable at each height level, as shown in Supplemental Figure S3. Using the spatially averaged variables, the t value is calculated to examine whether they have a clear causal relationship to the wide-spread extreme precipitation. We find that 19 variables have |t| values greater than 10, that is, they vary systematically with the occurrence of the events; these variables are denoted by red bold in Table 1. We performed the empirical orthogonal function (EOF) analysis after normalizing the anomalies of the 19 variables by their standard deviations, as shown in Section 3. Notably, we confirmed that the results of this study are unaffected when the size of the rectangle is changed to 20° × 5°, 10° × 10°, or 10° × 5° (not shown).
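The variable screening described above can be sketched as follows. This is a minimal illustration that assumes the standard least-squares slope and its t value for a simple linear regression; the function name, the variable names, and the synthetic series are hypothetical and do not come from the study. Only the cut-off |t| > 10 is taken from the text.

```python
import numpy as np

def slope_t_value(x, y):
    """Least-squares fit y = a*x + b and the t value of the slope a."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    sxx = np.sum((x - x.mean()) ** 2)
    a = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    b = y.mean() - a * x.mean()
    residual = y - (a * x + b)
    s = np.sqrt(np.sum(residual ** 2) / (n - 2))  # standard error of the estimate
    t = a / (s / np.sqrt(sxx))
    return a, b, t

# Synthetic example: one variable that depends on the rank and one that does not
rng = np.random.default_rng(1)
rank = np.arange(1, 501, dtype=float)
series = {
    "related": 40.0 - 0.02 * rank + rng.normal(0.0, 2.0, rank.size),
    "unrelated": rng.normal(0.0, 2.0, rank.size),
}

results = {name: slope_t_value(rank, values) for name, values in series.items()}
# Keep only variables whose slope has |t| > 10, the cut-off quoted in the text
selected = [name for name, (_, _, t) in results.items() if abs(t) > 10.0]
```

In this toy setting only the series constructed to depend on the rank passes the screening, which mirrors how a subset of the candidate variables is retained for the EOF analysis.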

The phase-space representation of the wide-spread extreme precipitation
The EOF analysis is based on the singular value decomposition. The EOF modes, whose eigenvectors are orthogonal to each other, are constructed from the 19 chosen variables. The EOF 1 mode explains 36% of the total variance, while the EOF 2 and EOF 3 modes explain 19% and 12%, respectively. The principal components (PCs) are then obtained by projecting the normalized time series of each variable onto the EOF modes. Figure 3a depicts the number of AMeDAS stations that record extreme precipitation in the PC space, using the PC1 and PC2 scores as coordinates. The number of AMeDAS stations increases significantly with high PC scores in the first quadrant of the PC space, implying that the PC1 and PC2 components are equally important as controlling factors of wide-spread extreme precipitation. Note that the number of AMeDAS stations does not show a clear dependency on the other PC components, such as PC3 and PC4. Figure 3b shows the density of the events within the time series in the PC space. There is no obvious cluster, and events with large PC scores are rare during 1979 to 2018.
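A minimal sketch of an EOF analysis via the singular value decomposition is given below. It assumes the input is a (samples × variables) array and uses five synthetic variables driven by two independent signals; the function name and the data are hypothetical stand-ins and do not reproduce the 19 variables of the study.

```python
import numpy as np

def eof_analysis(data):
    """EOF analysis of a (n_samples, n_variables) array via the SVD."""
    anomalies = data - data.mean(axis=0)
    z = anomalies / anomalies.std(axis=0, ddof=1)   # normalize each variable
    _, singular_values, vt = np.linalg.svd(z, full_matrices=False)
    eigenvalues = singular_values ** 2 / (z.shape[0] - 1)
    explained = eigenvalues / eigenvalues.sum()     # fraction of total variance
    eofs = vt.T                                     # columns are the EOF modes
    pcs = z @ eofs                                  # PC scores of each sample
    return eofs, pcs, eigenvalues, explained

# Synthetic example: three "frontal" and two "moisture" variables (hypothetical)
rng = np.random.default_rng(2)
n = 400
front = rng.normal(size=n)
moist = rng.normal(size=n)
data = np.column_stack([
    front + 0.3 * rng.normal(size=n),
    front + 0.3 * rng.normal(size=n),
    front + 0.3 * rng.normal(size=n),
    moist + 0.3 * rng.normal(size=n),
    moist + 0.3 * rng.normal(size=n),
])

eofs, pcs, eigenvalues, explained = eof_analysis(data)
```

Because the two driving signals are independent, the first two modes carry most of the variance and the corresponding PC scores are uncorrelated, which is the property that allows two orthogonal factors to be read off the PC1 and PC2 axes.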
In Fig. 3b, the black dots represent the coefficients (loadings) of the parameters, which correspond to the eigenvectors multiplied by the square root of the eigenvalues obtained from the singular value decomposition. If we regard each coefficient as a vector pointing from the origin, a large vector magnitude indicates a significant contribution to the PC1 and PC2 components. A vector parallel to a PC axis also indicates a strong correlation with that PC component, allowing us to interpret the physical meaning of the new PC axes (Graf et al. 2017).
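The interpretation of the loadings as correlations can be checked numerically. The sketch below assumes variables normalized to unit variance and uses synthetic data; all names are hypothetical. Under that assumption, the loading of a variable on a mode (eigenvector times the square root of the eigenvalue) equals the correlation between that variable and the corresponding PC.

```python
import numpy as np

# Synthetic data: three variables sharing a common signal plus one noise variable
rng = np.random.default_rng(3)
n = 300
common = rng.normal(size=n)
columns = [common + 0.5 * rng.normal(size=n) for _ in range(3)]
columns.append(rng.normal(size=n))
data = np.column_stack(columns)

# Normalize, decompose, and form the loadings
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
_, singular_values, vt = np.linalg.svd(z, full_matrices=False)
eigenvalues = singular_values ** 2 / (n - 1)
eofs = vt.T                                  # columns are eigenvectors
loadings = eofs * np.sqrt(eigenvalues)       # scale column k by sqrt(eigenvalue k)

# Correlation of every normalized variable with every PC
pcs = z @ eofs
n_var = z.shape[1]
corr = np.array([
    [np.corrcoef(z[:, j], pcs[:, k])[0, 1] for k in range(n_var)]
    for j in range(n_var)
])
```

Here `loadings` and `corr` agree to numerical precision, so the position of a parameter in the (PC1, PC2) plane can be read directly as its correlation with each axis.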
First, the PC2 component receives its main contributions from column-integrated water vapor (CWV) and equivalent potential temperature (θ_e), implying that it is strongly related to the moisture field. In contrast, the PC1 component is associated with parameters related to the frontal system (the horizontal gradients of equivalent potential temperature (|∇θ_e|) and potential temperature (|∇θ|), and the frontogenesis function (FGF)), the upper-level jet stream (U_300 and U_500), and the dynamically forced ascent (ω_D). The temperature advection (T_adv) also makes a positive contribution to the PC1 component, as suggested by Sampe and Xie (2010). Therefore, the PC1 component is suggested to be strongly related to typical frontal dynamics. The lower-level wind parameters (U_850 and V_850) contribute to both the PC1 and the PC2 components, consistent with previous studies that discussed the importance of the low-level jet during heavy rainfall events (e.g., Matsumoto et al. 1971). Another important finding is that the frontal parameters (|∇θ_e|, |∇θ|, FGF, U_300, U_500, ω_D and T_adv) vary closely together during wide-spread extreme precipitation and that they are statistically independent of the time variation of the moisture parameters such as CWV and θ_e.

Fig. 3. (b) The density of the samples in the PC space. The black dots represent the coefficients (loadings) of the parameters, which correspond to the eigenvectors multiplied by the square root of the eigenvalues obtained in the analysis. The coefficients are multiplied by six for visualization. The parameters are labelled with the abbreviations defined in Table 1.
Next, the density of the wide-spread extreme events in each quadrant of the PC phase space is examined as a function of the calendar day from June to July in Fig. 4. When the PC1 score is positive, an event is categorized in the 4th or 1st quadrant, while when the PC2 score is positive, it is categorized in the 2nd or 1st quadrant. The density of the events shows a clear dependency on the calendar day: the events in the 4th and 1st quadrants occur from the beginning of June to the middle of July, while those in the 2nd and 1st quadrants occur from the middle of June to the end of July. The events in the 3rd quadrant show no clear seasonal dependency. Therefore, the PC1 and PC2 components have different seasonal peaks, in June and July respectively, which makes the beginning of July the period in which wide-spread extreme precipitation occurs most frequently.

Figure 5 shows the horizontal maps of the anomalies of CWV and of geopotential height at 500 hPa and 850 hPa (Φ_500 and Φ_850) regressed onto the PC1 and PC2 components. In Fig. 5a, the anomaly of CWV regressed onto PC1 shows a dipole structure to the north and south of about 33°N, indicating a strong frontal structure in terms of ∇θ_e. To the north of the precipitation area (western Japan), a strong cyclonic anomaly regressed onto PC1 exists in the lower troposphere and strengthens toward the upper levels. The dipole structure of the geopotential anomaly indicates the enhancement of the westerly jet (U_500). The strong westerly jet implies a greater gradient of the dry potential temperature through the thermal wind balance, which also leads to greater quasi-geostrophic vertical motion (ω_D). In contrast, CWV regressed onto PC2 has a strong positive anomaly to the west of the precipitation area with no clear frontal structure. The negative anomaly of Φ to the northwest of Japan causes a strong northward flow into the precipitation area with vigorous moisture fluxes (not shown), but the cyclonic structure is limited to the lower levels.

Figure 6 depicts the regression coefficients of the stream function anomalies at 300 hPa three days before and on the day of wide-spread extreme precipitation, which are low-pass filtered with a cut-off period of 7 days. In addition, the horizontal components of the wave-activity flux defined by Takaya and Nakamura (1997, 2001) are superimposed. A clear wave train of quasi-stationary Rossby waves in the regression coefficient of PC1 propagates into the strong cyclonic anomaly to the north of Japan, likely leading to the equivalent barotropic structure of Φ. In contrast, there are only very weak upper-level disturbances in the regression coefficient of PC2, suggesting that the PC2 component is not associated with the upper-level dynamics.
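The regression maps discussed for Figs. 5 and 6 can be illustrated with a minimal sketch. It assumes that a gridded anomaly field of shape (time, latitude, longitude) is regressed point by point onto a PC score standardized to unit variance; whether the study standardized the PCs in this way is an assumption, and the function name and the synthetic dipole field are hypothetical.

```python
import numpy as np

def regress_field_onto_pc(field, pc):
    """Regression coefficient of each grid point onto a standardized PC score.

    field: array of shape (n_time, n_lat, n_lon); pc: array of shape (n_time,).
    The result has the units of `field` per one standard deviation of the PC.
    """
    pc = np.asarray(pc, dtype=float)
    field = np.asarray(field, dtype=float)
    pc_std = (pc - pc.mean()) / pc.std(ddof=1)
    anomalies = field - field.mean(axis=0)
    # Covariance with a unit-variance predictor equals the regression slope
    return np.tensordot(pc_std, anomalies, axes=(0, 0)) / (pc.size - 1)

# Synthetic example: a meridional dipole whose amplitude follows the PC
rng = np.random.default_rng(4)
n_time, n_lat, n_lon = 200, 4, 6
pc1 = rng.normal(size=n_time)
pattern = np.linspace(-1.0, 1.0, n_lat)[:, None] * np.ones(n_lon)
noise = 0.1 * rng.normal(size=(n_time, n_lat, n_lon))
field = pc1[:, None, None] * pattern + noise

regression_map = regress_field_onto_pc(field, pc1)
```

In this toy case the regression map recovers the imposed dipole, with opposite signs at the southern and northern edges of the grid, which is analogous to reading a frontal dipole from a field regressed onto PC1.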

Discussion and conclusion
According to this analysis, the frontal system with the enhanced westerly jet associated with the upper-level Rossby wave train and the rich moisture field with the low-level moisture flux are statistically uncorrelated factors in the occurrence of wide-spread extreme precipitation. This finding provides a physical understanding of wide-spread extreme precipitation. For example, the reason why only some of the past wide-spread extreme precipitation events were associated with upper-level Rossby wave trains (e.g., the event of July 2018) is thought to be the equally weighted contribution of the PC1 and PC2 components to the events (Fig. 3). In addition, the lower-level winds (e.g., U_850 and V_850) project strongly onto both the PC1 and PC2 components, implying that the lower-level westerly jet and southerly flow are essential for wide-spread extreme precipitation during the Baiu season.
Our main results are summarized as follows:
1) Using an objective approach based on the EOF analysis, atmospheric parameters related to the wide-spread extreme precipitation events are classified by the first two principal components (PCs) of the parameter phase space.
2) The number of AMeDAS stations is proportional, to a comparable degree, to the PC1 and PC2 component scores, implying that there are two orthogonal controlling factors for the occurrence of wide-spread extreme precipitation.
3) The PC1 component is associated with the frontal dynamical structure, such as the enhanced westerly jet and the large gradient of the equivalent potential temperature, whereas the PC2 component is dominated by moisture fields associated with the low-level cyclone.

Such physical insight is also useful for understanding future changes in wide-spread extreme precipitation in different climate models, such as those of phase 6 of the Coupled Model Intercomparison Project (Eyring et al. 2016). This approach is consistent with the storyline technique (e.g., Zappa and Shepherd 2017), which aims to understand what factors characterize the uncertainty of regional atmospheric circulation changes in climate change responses. In future work, the future change of wide-spread extreme precipitation will be examined in the physical framework proposed in this study.

Acknowledgments
data analysis of global water cycle and precipitation in changing climate", the Japan Society for the Promotion of Science (JSPS) through Grants-in-Aid for Scientific Research JP19H05702. The authors thank Dr. Takeshi Horinouchi for giving us the idea of this analytical method. The JRA-55 data used in this study were provided by the Japan Meteorological Agency (JMA). All figures shown in this paper were created using the Dennou Club Library (DCL).
Edited by: C. Kobayashi