Hybrid Assimilation of Satellite Rainfall Product with High Density Gauge Network to Improve Daily Estimation: A Case of Karnataka, India

Accurate rainfall estimation during the Indian summer monsoon (ISM) is one of the most crucial activities in and around the Indian subcontinent. The Japan Aerospace Exploration Agency (JAXA) provides a couple of Global Satellite Mapping of Precipitation (GSMaP) rainfall products, namely, the GSMaP_MVK, which is a satellite-based product calculated with ancillary data including global objective analysis data, and the GSMaP_Gauge, which is adjusted by global rain gauges. In this study, the daily rainfall amount from the GSMaP rainfall product (version 7) is validated against a dense rain gauge network over Karnataka, one of the southwestern states of India, during ISM 2016–2018. Furthermore, as the primary objective of this study, these dense rain gauge observations are assimilated in the GSMaP rainfall product using a hybrid assimilation method to improve the final rainfall estimate. The hybrid assimilation method is a combination of the two-dimensional variational (2D-Var) method and the Kalman filter, in which the 2D-Var method is utilized to merge rain gauge observations and the Kalman filter is applied to update the background error in the 2D-Var method. Preliminary verification results suggest that GSMaP_Gauge rainfall has sufficient skill over the north interior Karnataka and south interior Karnataka regions, with large errors over the orographic heavy rainfall region of the Western Ghats. These errors are larger in the GSMaP_MVK rainfall product over orographic heavy rainfall regions. Hybrid assimilation of randomly selected rain gauge observations improves the skill of the GSMaP_Gauge and GSMaP_MVK rainfall products when compared with independent rain gauge observations. These improvements in daily rainfall are more prominent over orographic heavy rainfall regions. The GSMaP_MVK rainfall product shows larger improvement due to the absence of the gauge adjustment in the JAXA operational processing. The superiority of the hybrid assimilation method over the Cressman and optimal interpolation methods with respect to the number of rain gauges used is also presented in this study.


Introduction
Reliable rainfall estimation is vital for the Indian agriculture industry, mainly during the Indian summer monsoon (ISM) season, which has a large socioeconomic impact (Turner et al. 2019). Accurate rainfall estimates are also essential for weather forecasting applications and for the prediction of water-related natural hazards such as floods, droughts, and landslides (Kumar et al. 2014; Chen et al. 2015). Although rainfall is one of the most crucial parameters for various applications, the availability of accurate and reliable rainfall data on finer spatial and temporal scales remains a challenge (Wang, W. et al. 2017; Wang, Z. et al. 2017; Anjum et al. 2018). Furthermore, rainfall varies strongly in space and time, and its estimation is complex both with ground observations (rain gauges and weather radar) and with satellite data. The sparse distribution of rain gauges and weather radars, mainly in mountainous and deeper oceanic regions, limits various applications on the global and regional scale. Conversely, space-borne sensors provide homogeneous spatial and temporal distribution of rainfall (Gairola et al. 2015). However, the accuracy of satellite-retrieved rainfall should be assessed with ground observations due to the inherent limitations of retrieval algorithms (Chiaravalloti et al. 2018). As space-borne sensors provide instantaneous global scanning of rainfall and rain gauges give accurate but point measurements of rainfall, the verification of satellite-retrieved rainfall against ground observations is itself a major challenge. The problem is even more challenging under complex topographic conditions, in dense vegetation areas, and in coastal regions (e.g., Brocca et al. 2014; Maggioni et al. 2016; Chiaravalloti et al. 2018). Another major problem for accurate rainfall estimation is merging ground observations with satellite estimates of rainfall. Sun et al.
(2018) presented a comprehensive review of the 30 global rainfall datasets [namely, gauge-based Global Precipitation Climatology Center (GPCC), Climate Prediction Center (CPC), satellite-retrieved Global Satellite Mapping of Precipitation (GSMaP), Tropical Rainfall Measuring Mission (TRMM)], and reported large differences over complex mountain regions including tropics. The authors also pointed out the issues of the number and spatial coverage of the gauge observations, rainfall retrieval algorithms, and data assimilation procedures to generate realistic rainfall reanalysis and merged rainfall products. Kubota et al. (2009) also compared six satellite-derived rainfall products including Japan Aerospace Exploration Agency (JAXA) GSMaP rainfall against a ground radar dataset calibrated by rain gauges around Japan. The authors found the best validation results over the ocean and reported relatively poor results over mountain regions. Shige et al. (2013) demonstrated that the GSMaP estimates in a case shown by Kubota et al. (2009) could be improved by utilizing more representative profiles in the orographic rainfall. Furthermore, Taniguchi et al. (2013) modified the GSMaP rainfall product using an orographic/non-orographic rainfall classification scheme based on orographically forced upward motion and moisture flux convergence. Trinh-Tuan et al. (2019) showed a clear dependence of biases in the GSMaP estimates over Central Vietnam on elevation and zonal wind speed, suggesting the need to improve orographic rainfall estimations. Nodzu et al. (2019) also examined the effect of interaction between wind and topography on the GSMaP performance over Northern Vietnam and suggested that consideration of the orographic effects with wind information may further improve the accuracy of rainfall.
Various studies have been conducted to evaluate the quality of satellite-retrieved rainfall against rain gauge networks over India (Sharifi et al. 2018; Singh et al. 2019 and references therein). Singh et al. (2019) compared diverse rainfall products against India Meteorological Department (IMD) rain gauges during summer monsoon 2016 and found large differences between satellite-derived rainfall products and rain gauges over Karnataka, southwestern India. Prakash et al. (2018) found a relatively smaller error in gauge-adjusted GSMaP in comparison with the Integrated Multisatellite Retrievals for Global Precipitation Measurement (GPM) (IMERG) and TRMM multisatellite precipitation analysis (TMPA), mainly over the regions of low rainfall and the western coast of India. Earlier studies (Palazzi et al. 2013; Hu et al. 2016; Shah and Mishra 2016) also suggested drawbacks of gauge-based estimates and satellite retrievals over mountainous regions, particularly in the Western Ghats mountain range, northeast India, and the Himalayan foothills. Generally, these studies over different parts of the globe and over the target location of the Western Ghats in southwestern India suggest that the gauge-adjusted rainfall represents the intrinsic variability of rainfall more reliably.
The synergy of rain gauge observations with satellite-based rainfall estimates for gauge-adjusted rainfall estimation has been attempted in several previous studies (Gairola and Krishnamurti 1992; Adler et al. 2003; Mitra et al. 2003; Huffman et al. 2007; Roy Bhowmik and Das 2007; Krishnamurti et al. 2009; Gairola et al. 2012). To merge rain gauge observations with Indian National SATellite (INSAT) satellite-retrieved rainfall at 1° × 1° spatial resolution, Roy Bhowmik and Das (2007) used an objective analysis method over the Indian landmass for ISM rainfall. Gairola et al. (2015) developed a merged rainfall method by blending rain gauge observations with geostationary Kalpana-1 satellite-derived INSAT-retrieved Multi-Spectral Rainfall Algorithm (IMSRA) rainfall estimates using an objective criterion of the successive correction method. The authors found considerable improvements in terms of correlation, bias, and root-mean-square error after objective analysis, especially over the regions where the density of rain gauges was better. Mitra et al. (2009) used a similar approach for blending rain gauge data with the near-real-time TMPA rainfall product over India for monsoon rainfall monitoring. The major drawback of the objective analysis techniques is that they do not consider the uncertainties (or errors) in the first guess (here satellite rainfall) and observation (here rain gauge) inputs. Hence, an effective merging technique is still required to improve rainfall estimation in terms of both resolution and accuracy, considering the errors in both satellite and ground rainfall together.
In this context, the variational method is popularly known for considering inconsistencies (or errors) in input parameters and provides its optimal estimation. The optimal state is achieved via the iterative method in the variational method and it is less computationally intensive compared with sequential assimilation methods such as optimal interpolation. Earlier, Bianchi et al. (2013) used the variational method to combine rain gauge, weather radar, and microwave observations with associated uncertainties to retrieve rain rate. Li et al. (2015) implemented the variational method to prepare high-resolution hourly rainfall using China Meteorological Administration gauges and Climate Prediction Center Morphing (CMORPH; Joyce et al. 2004) rainfall products. Generally, the variational method does not consider the evolution (or flow) of uncertainties in satellite rainfall (also called background error), which are considered to be a fixed diagonal matrix in earlier studies. These deficiencies in the variational method can be resolved to some extent by implementing the Kalman filter, which can simulate the flow of background error. Hence, a hybrid assimilation method, a combination of the two-dimensional variational (2D-Var) method and a flow-dependent background error from the Kalman filter, is required to prepare a gauge-adjusted rainfall product (Cheng et al. 2010; Daley 1997). This hybrid method combines the advantages of excellent spatial coverage from satellite measurement and accurate rainfall estimates from rain gauge data with their uncertainties and has the potential for an optimal combination of rainfall estimation from both sources simultaneously.
Hence, this study aims to develop a hybrid assimilation method for merged rainfall products over a unique site that is well represented by sufficient ground observations (around 6502 stations). In this study, the GSMaP rainfall products are first compared with dense rain gauge observations over Karnataka, India, during ISM 2016–2018 to evaluate the daily rainfall amount. Around half of the rain gauges, selected at random, are merged with the GSMaP rainfall products using the hybrid assimilation method. These new daily rainfall estimates are verified against the remaining independent gauges and the IMERG final rainfall products. Section 2 discusses the various rainfall data used in the present study, followed by results and discussions in Section 3. These findings are concluded in Section 4.

KSNDMC rain gauge network
The Indian state of Karnataka is located within 11°50′N and 18°50′N latitudes and 74°E and 78°50′E longitudes (Fig. 1a). This state is situated on not only a tableland region but also coastal plains and mountain slopes in the western part of the Deccan Peninsular region of India (Fig. 1b). The dense rain gauge network (6502 stations in 2018 with an average rain gauge density of ~6100 stations during 2016–2018) of the Karnataka State Natural Disaster Monitoring Centre (KSNDMC) is used in this study during ISM 2016–2018 (Fig. 1a). The rain gauge sensor used in this network is a tipping bucket with low tolerance using the material of polycarbonate or industrial standard metal. The KSNDMC gauges comprise a funnel that collects and channels precipitation into a small container. Every day at 0830 Indian Standard Time (IST) [0300 Coordinated Universal Time (UTC)], the container tips and empties the collected water and produces a signal in an inbuilt electrical circuit. The tolerance is limited by the precision of the instrument that is 0.5 mm. The precision of the instrument is 1 % of rainfall intensity up to 50 mm day⁻¹, and 2 % of rainfall intensity of 50–100 mm day⁻¹ (Mohapatra et al. 2017). The original time resolution of the observations is every 15 min using a tipping count method (0.2/0.5 mm per tip) with an operating range up to 600 mm h⁻¹, but in this study, 24 h (last day 0830 IST to current day 0830 IST) accumulated rainfall observations (valid at 0830 IST) are used for verification and assimilation.
In this study, Karnataka state is divided into four meteorological zones by state boundaries defined as (1) Coastal Karnataka, a region of heavy rainfall that receives an average June to September (hereafter JJAS) rainfall of 2517 mm, far above the rest of the state; (2) North Interior Karnataka (NIK), an arid zone that receives 526 mm of average rainfall in JJAS; (3) South Interior Karnataka (SIK), a zone that receives 518 mm of average rainfall in JJAS; and (4) Malnad (Malenadu) Region, which comprises the Western Ghats, a mountain range inland from the Arabian Sea, which is approximately 900 m high, with moderate to very high rainfall with 1390 mm of average normal rainfall in JJAS. These average rainfall amounts for different regions are based on long-term   Fig. 1a. Figure 1b shows the map of topography at 30 s spatial resolution from the United States Geological Survey available with the Weather Research and Forecasting model (Attada et al. 2018) over the study region. Figure 1c shows mean JJAS rainfall at 0.1° spatial resolution from 16 year TRMM/PR data [TRMM Precipitation Radar (PR) Precipitation System Dataset Version 2.2; Hirose et al. 2009, 2017a; Hirose and Okada 2018]. Similar to Fig. 1 in Shige et al. (2017), a climatological relationship between topography and rainfall around Karnataka is examined here using the TRMM/PR data. Figure 1d shows the cross-shore distribution of rainfall and topography averaged across the rectangular box selected over the Western Ghats (Fig. 1c). The maximum value of rainfall is obtained mostly over the coastal and windward side of the mountainous regions. Rainfall values decrease noticeably in the NIK and SIK rain shadow regions, which is also represented by the mean TRMM PR rainfall (Fig. 1c).

JAXA GSMaP rainfall
With the notable success of the TRMM, the National Aeronautics and Space Administration and JAXA launched the GPM Core Observatory in early 2014 to provide the latest generation of satellite-based near-real-time precipitation and snowfall estimates (Hou et al. 2014; Skofronick-Jackson et al. 2017). The GSMaP rainfall product has been developed by the JAXA as the Japanese GPM standard product (Kubota et al. 2020). The core algorithms of the GSMaP products are based on those provided by the GSMaP project: passive microwave (PMW) precipitation retrieval algorithm, PMW-IR (InfraRed) combined algorithm, and gauge-adjustment algorithm. The GSMaP algorithm consists of the following steps: 1) calculating the rainfall rate from PMW sensors (Kubota et al. 2007; Aonashi et al. 2009; Shige et al. 2009) with ancillary data including global objective analysis data provided by the Japan Meteorological Agency; 2) using the morphing technique to propagate the rainfall-affected area; 3) refining the estimated data using the Kalman filter approach; and 4) adjusting rain rates using the National Oceanic and Atmospheric Administration (NOAA) CPC unified gauge-based analysis of global daily rainfall (Mega et al. 2019). The spatial distribution of NOAA/CPC gauges (Chen et al. 2008) over the study region is shown in Fig. 1a (as a black star). The rainfall retrieval algorithms of JAXA GSMaP have been upgraded further in the GPM era as described in Kubota et al. (2020). Heavy rainfall associated with shallow orographic rainfall systems was underestimated by the GSMaP algorithms because of weak ice scattering signatures (Shige et al. 2013). Therefore, the orographic rainfall estimation method using the global objective analysis data was developed exclusively and installed in the GSMaP PMW algorithm (Shige et al. 2014; Yamamoto and Shige 2015; Yamamoto et al. 2017).
The GSMaP rainfall estimates are available at three levels, namely, near-real-time, real-time, and standard products. The real-time and near-real-time GSMaP products are available to the public with 0 h and 4 h latency, respectively (Kubota et al. 2020). The GSMaP_MVK and the GSMaP_Gauge are categorized as the standard product with 3 day latency. The GSMaP_Gauge (defined as GSMaP_G in figures) and GSMaP_MVK version 7 rainfall products used in this study are available from the JAXA webpage (https://www.gportal.jaxa.jp/gp). In the version 7 algorithm, the orographic rainfall estimation method by Yamamoto et al. (2017) was used for all sensors (Kubota et al. 2020). The GSMaP_Gauge is adjusted by the global rain gauges derived from the NOAA/CPC, whereas the GSMaP_MVK is without rain gauge adjustment. Both products have the same spatial and temporal resolution, which is 0.1° and 1 h, with coverage between 60°N and 60°S. The KSNDMC gauges are not part of the NOAA/CPC gauges.
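Because the gauge totals are 24 h accumulations valid at 0830 IST (0300 UTC), the 1 h satellite fields have to be summed over the same window before any comparison. The sketch below shows one way this could be done. The function name, the array layout, and the convention that each hourly field is labelled by the start of its hour are assumptions made for illustration, not details given in the text.

```python
import numpy as np

def daily_accumulation(hourly_rate, hour_start_utc, valid_day, valid_hour_utc=3):
    """Sum hourly rain rates (mm/h) over the 24 h ending at valid_hour_utc.

    hourly_rate    : array (T, ny, nx) of hourly mean rain rate in mm/h
    hour_start_utc : T timestamps marking the START of each hour (assumed labelling)
    valid_day      : date string such as "2016-06-02"
    valid_hour_utc : end of the window; 3 corresponds to 0830 IST
    """
    rates = np.asarray(hourly_rate, dtype=float)
    times = np.asarray(hour_start_utc, dtype="datetime64[h]")
    end = np.datetime64(valid_day, "h") + np.timedelta64(valid_hour_utc, "h")
    start = end - np.timedelta64(24, "h")
    window = (times >= start) & (times < end)
    if window.sum() != 24:
        raise ValueError("expected 24 hourly fields in the accumulation window")
    # each field is a 1 h mean rate, so the plain sum is the 24 h total in mm
    return rates[window].sum(axis=0)

# Two days of synthetic hourly fields on a 2 x 2 grid
times = np.arange(np.datetime64("2016-06-01T00"), np.datetime64("2016-06-03T00"),
                  np.timedelta64(1, "h"))
rates = np.arange(48, dtype=float)[:, None, None] * np.ones((1, 2, 2))
daily = daily_accumulation(rates, times, "2016-06-02")
```

The same function would apply to half-hourly fields only after converting them to hourly means, since the window check assumes exactly 24 inputs.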

IMERG rainfall
The IMERG rainfall product has been developed as the US GPM standard product (Huffman et al. 2020b), and the IMERG has several advantages over other satellite rainfall products, such as wide spatial representation (60°N to 60°S) of precipitation, fine spatiotemporal resolutions, and additional snowfall observations (Anjum et al. 2018). The IMERG rainfall combines features of three multisatellite precipitation products: (1) TMPA, (2) CMORPH, and (3) PERSIANN (Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks; Sorooshian et al. 2000). The IMERG product uses all constellations of microwave sensors, IR-based observations from geosynchronous satellites, and monthly gauge precipitation data from GPCC rain gauges (Schneider et al. 2014) to correct the bias of satellite retrievals over the land (Huffman et al. 2020a, b; Sharifi et al. 2018). IMERG rainfall estimates are available at three levels, known as early, late, and final stage IMERG products. Early and Late IMERG products provide near-real-time rainfall estimates and are available to the public with 6 h and 18 h latency, respectively (Tan and Duan 2017). The final product is calibrated with the GPCC monthly data and provides post-real-time rainfall estimates after around 4 months of data retrieval. All IMERG products are available at the same spatial (0.1°) and temporal (half-hourly, daily, and monthly) resolutions. The IMERG final products with 30 min frequency are used in this study.

Methodology
Data assimilation for most weather applications is usually an under-sampling problem in which the number of grid points on the analysis grid (e.g., satellite retrievals) is higher than the number of observations (here rain gauges) (Daley 1997). In direct assimilation systems, such as the Cressman analysis (Cressman 1959) or successive correction methods (Bratseth 1986) in objective analysis, observation information is simply spread to the analysis grid points through the interpolation of observations within a radius of influence (ROI) without considering inconsistencies (both observation and background errors) in input parameters. Conversely, the objective of the optimal interpolation and variational methods is to minimize the cost function that measures the distance between the background (here satellite-derived rainfall) and the observation (here rain gauge) (Daley 1997). The variational method spreads observation information to the analysis grid points using iterative minimization of the cost function, based on the background and observation errors. The background and observation errors are uncertainties in the satellite and rain gauge data, respectively. An optimal analysis can be prepared using the 2D-Var assimilation method by an accurate specification of covariance matrices because of its strong dependence upon these error covariances (Xie et al. 2002; Tyndall 2008, 2010). The variational technique minimizes a cost function iteratively to compute the analysis (x_a).
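To make the contrast with the variational approach concrete, the following is a minimal single-pass sketch of a Cressman-type correction, using the classical distance weight (R² − d²)/(R² + d²) inside the ROI. The function name, argument layout, and the single-pass form are illustrative assumptions and not the configuration used in this study. Note that no error statistics enter the weights, which is the limitation described above.

```python
import numpy as np

def cressman_analysis(grid_xy, background, obs_xy, obs_values, obs_background, roi):
    """One-pass Cressman correction of a background field.

    grid_xy        : (G, 2) grid-point coordinates
    background     : (G,) first-guess rainfall on the grid
    obs_xy         : (N, 2) gauge coordinates
    obs_values     : (N,) gauge rainfall
    obs_background : (N,) first guess interpolated to the gauges
    roi            : radius of influence, same units as the coordinates
    """
    grid_xy = np.asarray(grid_xy, dtype=float)
    obs_xy = np.asarray(obs_xy, dtype=float)
    increments = np.asarray(obs_values, dtype=float) - np.asarray(obs_background, dtype=float)
    # squared distance between every grid point and every gauge, shape (G, N)
    d2 = ((grid_xy[:, None, :] - obs_xy[None, :, :]) ** 2).sum(axis=-1)
    r2 = float(roi) ** 2
    # Cressman weight inside the ROI, zero outside; depends on distance only
    w = np.where(d2 < r2, (r2 - d2) / (r2 + d2), 0.0)
    wsum = w.sum(axis=1)
    # weighted mean increment; grid points with no gauge in range keep the background
    correction = np.divide(w @ increments, wsum, out=np.zeros_like(wsum), where=wsum > 0)
    return np.asarray(background, dtype=float) + correction

# Two grid points, one gauge collocated with the first grid point
analysis = cressman_analysis(
    grid_xy=[[0.0, 0.0], [10.0, 10.0]], background=[2.0, 3.0],
    obs_xy=[[0.0, 0.0]], obs_values=[5.0], obs_background=[2.0], roi=1.0,
)
```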
In the 2D-Var methodology, the cost (penalty) function J(x_a) is made up of two components:

$$J(x_a) = J_b + J_o \qquad (1)$$

where the term J_b penalizes the analysis for differences between the analysis (x_a) and the GSMaP rainfall considered here as a background field, and the term J_o penalizes the analysis for the difference between the analysis (x_a) and the rain gauge observations, defined as

$$J(x_a) = \frac{1}{2}(x_a - x_b)^T P_b^{-1}(x_a - x_b) + \frac{1}{2}(H x_a - y_o)^T P_o^{-1}(H x_a - y_o) \qquad (2)$$

where x_a is the analysis variable, x_b is the background field taken from the GSMaP_Gauge or GSMaP_MVK rainfall product, P_b and P_o are the background and observation error covariances, respectively, y_o is the observation vector taken from rain gauge observations, and H is the forward transform interpolation operator, which interpolates the analysis grid points to the observation locations. Initially, the background and observation error covariances are considered as diagonal matrices with fixed diagonal elements of 4 mm day⁻¹ and 1 mm day⁻¹, respectively. The computational expense of the analysis can be reduced by reformulating the variational Eq. (2) in observation space using the Sherman-Morrison-Woodbury inversion formula (Lorenc 1986). Equation (2) should be minimized with respect to the analysis (x_a) to find the minimum penalty between the GSMaP rainfall and gauge observations:

$$\nabla_{x_a} J = P_b^{-1}(x_a - x_b) + H^T P_o^{-1}(H x_a - y_o) = 0 \qquad (3)$$

The analysis solution is given as

$$x_a = x_b + P_b H^T \left(H P_b H^T + P_o\right)^{-1}(y_o - H x_b) \qquad (4)$$

or equivalently,

$$x_a = x_b + K_t (y_o - H x_b), \qquad K_t = P_b H^T \left(H P_b H^T + P_o\right)^{-1} \qquad (5)$$

Here, K_t is known as the Kalman gain at time step t. Furthermore, in place of using a fixed diagonal background error covariance, the Kalman filter method is implemented to update the background error at time step t.
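The observation-space analysis with a Kalman gain, together with the standard Kalman analysis-error update P_a = (I − K H) P_b, can be sketched on a toy problem as follows. This is a minimal dense-matrix illustration under stated assumptions and not the authors' implementation: the function name is hypothetical, H is a simple selection matrix, and a real 0.1° grid would need sparse or localized covariances. Reusing P_a directly as the next background error additionally assumes an identity propagation model with no added model error.

```python
import numpy as np

def analysis_step(x_b, P_b, H, y_o, P_o):
    """Observation-space analysis update with a Kalman gain.

    x_b : (n,) background rainfall on the grid
    P_b : (n, n) background error covariance (symmetric)
    H   : (m, n) operator mapping grid values to gauge locations
    y_o : (m,) gauge observations
    P_o : (m, m) observation error covariance (symmetric)
    """
    innovation = y_o - H @ x_b              # gauge minus background at the gauges
    S = H @ P_b @ H.T + P_o                 # innovation covariance, only m x m
    # K = P_b H^T S^{-1}; by symmetry of S and P_b, solve S K^T = H P_b instead
    K = np.linalg.solve(S, H @ P_b).T
    x_a = x_b + K @ innovation              # analysis
    P_a = (np.eye(len(x_b)) - K @ H) @ P_b  # analysis error covariance
    return x_a, K, P_a

# Three grid points, one gauge at the middle point, with the fixed diagonal
# starting values quoted in the text (4 for background, 1 for observation)
x_b = np.array([2.0, 6.0, 1.0])
P_b = 4.0 * np.eye(3)
H = np.array([[0.0, 1.0, 0.0]])
y_o = np.array([11.0])
P_o = 1.0 * np.eye(1)

x_a, K, P_a = analysis_step(x_b, P_b, H, y_o, P_o)
P_b_next = P_a  # assumption in this sketch: identity model, zero model error
```

With a purely diagonal P_b the gauge only corrects the grid point it maps to, which illustrates why the specification of the background error matters for spreading gauge information.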
$$P^a_t = (I - K_t H_t) P^b_t \qquad (6)$$

Here, P^a_t and P^b_t are the analysis and background error covariances at time step t, and H_t is the forward transform operator at time t. At the first time step, P^b_t is considered as a fixed diagonal matrix. The estimated analysis error (P^a_t) obtained from Eq. (6) is used to compute the background error for time step t + 1 using

$$P^b_{t+1} = M P^a_t M^T + Q \qquad (7)$$

where M is the model (state transition) operator and Q is the model error covariance. In this study, M is considered as an identity matrix and Q as a zero matrix, for simplicity and because of the complex behavior of rainfall prediction; relaxing these choices may be a scope for future research. Furthermore, a hybrid background error, built from this updated background error, is used for the 2D-Var assimilation. Finally, the hybrid assimilation method is performed here to generate merged rainfall products using the 2D-Var method with the flow-dependent background error matrix from the Kalman filter.

Figure 2 shows the spatial distribution of mean rainfall (mm day⁻¹) during JJAS from KSNDMC gauges, GSMaP_Gauge V7, and GSMaP_MVK V7 rainfall products for 2016–2018. The all India (southern peninsula) rainfall in 2016, 2017, and 2018 was 97 % (92 %), 95 % (100 %), and 91 % (98 %) of the long period average (LPA; the average rainfall recorded during the months from June to September in the past 50 year period) rainfall from IMD gauges, respectively (IMD Annual Report; http://www.imd.gov.in). The years 2016–2018 represent varying rainfall distribution over the Western Ghats, from deficit through normal to above normal. Large differences are observed in the spatial rainfall distribution during 2017 and 2018 over the Western Ghats and NIK regions, whereas both years are reported as normal rainfall years. The KSNDMC gauge observations show that, in general, high rainfall was observed in the Coastal and Malnad regions during JJAS. However, the mean rainfall is lower over the NIK and SIK regions because they lie in the rain shadow of the Western Ghats. The spatial distribution of the GSMaP_Gauge rainfall for the same JJAS period for 2016 (Fig. 2d), 2017 (Fig. 2e), and 2018 (Fig. 2f) suggests that the GSMaP_Gauge rainfall products have less error when compared with gauge observations. However, the large magnitudes of rainfall over the Western Ghats regions are underestimated in the GSMaP_Gauge rainfall product. This suggests a need for correction in the GSMaP rainfall product over mountainous regions. Takido et al. (2016) also detected that GSMaP_Gauge still underestimated the precipitation intensity in high-elevation regions over Japan. The authors suggested improvements with higher resolution gauge-based network data than the NOAA/CPC gauge data. Similarly, inadequate distributions of the NOAA/CPC gauge data can lead to the underestimation of the rainfall over the Western Ghats regions (Fig. 1a). The spatial distribution of GSMaP_MVK rainfall suggests that this rainfall product has less skill over the orographic heavy rainfall regions. In comparison with the GSMaP_Gauge rainfall product (Figs. 2d–f), which has less error compared with KSNDMC gauges, the GSMaP_MVK rainfall product has slightly higher error compared with KSNDMC gauges over the Malnad and coastal regions. Both GSMaP_Gauge and GSMaP_MVK rainfall products can capture low rainfall over the NIK and SIK regions. These analyses suggest that both rainfall products should be further improved in general and over the mountainous regions in particular. As noted in Section 2.2, the rainfall estimates over the orographic heavy rainfall regions are inherently problematic and the orographic rainfall estimation methods have been developed and installed in the GSMaP PMW algorithm. Hirose et al. (2019) showed that the GSMaP PMW algorithm with the orographic rainfall estimation method was able to estimate the heavy rainfall band well, but the issue persists in the GSMaP because of the unavailability of microwave satellite measurements. However, the current results suggest that the methods must be improved further through some more suitable data-driven analysis such as the hybrid assimilation method.

Comparison of GSMaP_MVK and GSMaP_Gauge rainfall against KSNDMC gauges
Furthermore, the BIAS (mean difference), NBIAS (BIAS normalized by total rainfall), and RMSD (root-mean-square difference) statistics used for error estimations are defined as

$$\mathrm{BIAS} = \frac{1}{N}\sum_{i=1}^{N}\left(rain_i^{sat} - rain_i^{gauge}\right)$$

$$\mathrm{NBIAS} = \frac{\sum_{i=1}^{N}\left(rain_i^{sat} - rain_i^{gauge}\right)}{\sum_{i=1}^{N} rain_i^{gauge}}$$

$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(rain_i^{sat} - rain_i^{gauge}\right)^2}$$

where rain_i^sat and rain_i^gauge represent rainfall from the GSMaP rainfall product and KSNDMC gauge observations, respectively. The total number of data points is represented by N. Figure 3 shows the scatter plot of GSMaP_Gauge (upper panel) and GSMaP_MVK (lower panel) rainfall products against KSNDMC rain gauges for JJAS 2016–2018. The blue and red lines represent the 45° reference line and the best fit line using the least squares method, respectively. The value of RMSD is 9.5, 10.4, and 12.2 mm day⁻¹ for 2016, 2017, and 2018, respectively, when the GSMaP_Gauge rainfall product is compared with KSNDMC gauge observations (Figs. 3a–c). The value of BIAS is 0.5, −0.1, and −1.3 mm day⁻¹ for these years, respectively, Gauge rainfall product that suggests the importance of gauge calibration in the GSMaP_Gauge rainfall product. Moreover, these statistics are almost similar for different monsoon years (varying from deficit to above normal years), which suggests some inherent limitations of both selected GSMaP rainfall products over the Karnataka region. The daily area average rainfall variation from KSNDMC rain gauges and the corresponding GSMaP_Gauge and GSMaP_MVK rainfall products for JJAS 2016–2018 suggests that slightly larger errors are found in the GSMaP_MVK rainfall product compared with those in the GSMaP_Gauge rainfall product. It is important to mention here that both operational GSMaP rainfall products can capture the active and break phases of diverse monsoon years (figure not shown).
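The three statistics can be computed from paired satellite and gauge values as in the sketch below. The sign convention (satellite minus gauge, so that negative values indicate underestimation by the satellite product) follows the discussion in the text; normalizing NBIAS by the gauge rainfall total is an assumption, as is the function name.

```python
import numpy as np

def error_statistics(rain_sat, rain_gauge):
    """BIAS, NBIAS, and RMSD of satellite rainfall against gauge rainfall."""
    sat = np.asarray(rain_sat, dtype=float)
    gauge = np.asarray(rain_gauge, dtype=float)
    valid = np.isfinite(sat) & np.isfinite(gauge)   # drop missing pairs
    diff = sat[valid] - gauge[valid]                 # satellite minus gauge
    total = gauge[valid].sum()
    return {
        "BIAS": diff.mean(),
        # assumption: normalized by the total gauge rainfall
        "NBIAS": diff.sum() / total if total > 0 else float("nan"),
        "RMSD": np.sqrt((diff ** 2).mean()),
    }

stats = error_statistics([1.0, 3.0, 5.0, np.nan], [2.0, 4.0, 9.0, 7.0])
```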
To evaluate errors in both operational GSMaP rainfall products, the comparison of GSMaP rainfall is extended to the different IMD rainfall classifications. These IMD rainfall classifications are based mainly on the intensity of daily rainfall and divide daily rainfall into eight categories varying from No Rain to Extremely Heavy Rain (Table 1; IMD Glossary). Figure 4 shows RMSD and NBIAS in both operational GSMaP rainfall products during JJAS 2016–2018. Results suggest that RMSD varies from 2 mm day⁻¹ to 13 mm day⁻¹ for the No Rain, Very Light Rain, Light Rain, and Moderate Rain classifications (Fig. 4a). A negative NBIAS is found for all rainfall classifications except for the No Rain and Very Light Rain classifications (Fig. 4b). The negative values of NBIAS suggest the underestimation of rainfall in both operational GSMaP rainfall products in comparison with KSNDMC gauge observations. For the Light Rain, Moderate Rain, Rather Heavy, and Heavy Rain classifications, the GSMaP_Gauge product has less NBIAS compared with GSMaP_MVK rainfall for 2016–2018. It is important to mention that for a few pixels, GSMaP rainfall products also incorrectly classify No Rain regions as rainy pixels. The RMSD values are very high for the Rather Heavy, Heavy Rain, Very Heavy Rain, and Extremely Heavy Rain classifications and range from 50 mm day⁻¹ to 250 mm day⁻¹ with negative values of NBIAS (−0.4 to −0.7 for GSMaP_MVK rainfall). This also suggests that both operational GSMaP rainfall products are erroneous mainly over orographic regions, which are prone to heavy rainfall over Karnataka. Moreover, the GSMaP_Gauge rainfall product has less RMSD and NBIAS compared with the GSMaP_MVK rainfall product for the different rainfall classifications, except for the Very Heavy and Extremely Heavy rainfall classifications.
The density plots of both operational GSMaP rainfall products against KSNDMC gauges also suggest that the GSMaP_Gauge rainfall is closer to observations for low rainfall thresholds (< 20 mm day⁻¹), whereas both operational GSMaP rainfall products have almost the same distribution for high rainfall thresholds, far from the gauges (figure not shown). This suggests that a sparse network of rain gauges over mountainous regions reduces the accuracy of GSMaP_Gauge over the Western Ghats region. Table 2 presents the error statistics of both operational GSMaP rainfall products for different regions. Results suggest that GSMaP_MVK rainfall has a large negative BIAS (13 mm day⁻¹ to 25 mm day⁻¹) over the coastal region with the value of RMSD varying from 25 mm day⁻¹ to 38 mm day⁻¹. The correlation coefficient is approximately 0.58, 0.37, and 0.58 for the years 2016, 2017, and 2018, respectively. The values of NBIAS are high for coastal regions in the year 2018 compared with that in the year 2016. The large BIAS is corrected in the GSMaP_Gauge rainfall products over the coastal region to some extent, and values of BIAS (1–8 mm day⁻¹) and RMSD (18 mm day⁻¹ to 25 mm day⁻¹) are improved significantly for the years 2016–2018. Similar to the coastal region, the Malnad region (Fig. 1a) shows large errors in both operational GSMaP rainfall products. The values of BIAS, NBIAS, and RMSD are slightly less in the Malnad region than in the coastal region, but the correlation coefficient is less for different years. Both NIK and SIK regions show less error in the GSMaP rainfall products. The value of RMSD (BIAS) is less than 10 (1) mm day⁻¹ for different years and the correlation coefficient is approximately 0.6. For the years 2016–2018, the GSMaP_Gauge data have a better skill as compared to GSMaP_MVK rainfall in NIK and SIK regions, which confirms its superiority for all regions due to the calibration of the GSMaP_Gauge rainfall with the NOAA gauge analysis (Fig. 1a). These preliminary verification results suggest the need for further rain gauge adjustment of GSMaP rainfall over the Malnad and coastal regions. The hybrid assimilation method is implemented here to generate new GSMaP rainfall products over Karnataka, southwestern India. The verification of new GSMaP rainfall products is presented in Section 3.2.

Table 1. IMD rainfall classification based on the rainfall amount realised in a day.
Category                  Rainfall amount realised in a day
No Rain                   0.0 mm
Very Light Rain           0.1 mm to 2.4 mm
Light Rain                2.5 mm to 7.5 mm
Moderate Rain             7.6 mm to 35.5 mm
Rather Heavy              35.6 mm to 64.4 mm
Heavy Rain                64.5 mm to 124.4 mm
Very Heavy Rain           124.5 mm to 244.4 mm
Extremely Heavy Rain      more than or equal to 244.5 mm
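The IMD daily rainfall categories of Table 1 amount to a threshold lookup, sketched below for grouping gauge values before computing per-category statistics. The pairing of category names with thresholds follows the order in which both are listed in the text; the function and constant names are illustrative, and rounding to 0.1 mm is an assumption made so that values between class limits (for example 2.45 mm) still fall in one class.

```python
from bisect import bisect_left

IMD_CATEGORIES = [
    "No Rain", "Very Light Rain", "Light Rain", "Moderate Rain",
    "Rather Heavy", "Heavy Rain", "Very Heavy Rain", "Extremely Heavy Rain",
]
# upper limit (mm per day, inclusive) of every class except the last one
IMD_UPPER_BOUNDS_MM = [0.0, 2.4, 7.5, 35.5, 64.4, 124.4, 244.4]

def imd_category(rain_mm_per_day):
    """Return the IMD class name for a 24 h rainfall amount in mm."""
    if rain_mm_per_day < 0:
        raise ValueError("rainfall cannot be negative")
    r = round(rain_mm_per_day, 1)  # class limits are given to 0.1 mm
    # index of the first upper bound that is >= r; past the end means the top class
    return IMD_CATEGORIES[bisect_left(IMD_UPPER_BOUNDS_MM, r)]

labels = [imd_category(r) for r in (0.0, 2.4, 7.6, 64.5, 300.0)]
```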

Evaluation of GSMaP_MVK_NEW and GSMaP_Gauge_NEW rainfall
A randomly selected 50 % of the rain gauges (defined as training gauges) from the average network of approximately 6100 rain gauges over Karnataka is used to prepare new merged GSMaP rainfall products (defined as GSMaP_Gauge_NEW and GSMaP_MVK_NEW) using the hybrid assimilation method. In this method, a variational method is used to prepare the gauge-adjusted GSMaP rainfall and the Kalman filter is used to estimate the flow of background error in satellite rainfall (discussed in Section 3). The remaining 50 % of rain gauges (defined as verification gauges) are used for independent verification of the different rainfall products. Figure 5 shows the scatter plots of GSMaP_Gauge, GSMaP_Gauge_NEW, GSMaP_MVK, and GSMaP_MVK_NEW against training gauges, which are used to prepare the GSMaP_Gauge_NEW (Figs. 5b, f, j) and GSMaP_MVK_NEW (Figs. 5d, h, l) rainfall products. The error statistics provide a sanity check: after merging training gauges into both operational GSMaP rainfall products, the new rainfall products are closer to observations, which demonstrates successful assimilation of the training gauges. Results suggest that the GSMaP_Gauge rainfall has RMSD (BIAS) of 9.6 (0.4), 10.5 (−0.2), and 12.5 (−1.4) mm day⁻¹ for JJAS 2016 (Fig. 5a), 2017 (Fig. 5e), and 2018 (Fig. 5i), respectively. These error statistics are reduced to 3.9 (0.1), 4.2 (−0.0), and 4.7 (−0.2) mm day⁻¹, respectively, for JJAS 2016 (Fig. 5b), 2017 (Fig. 5f), and 2018 (Fig. 5j). The values of BIAS are close to 0 after hybrid assimilation because of the bias correction step implemented in the variational assimilation method. The value of the correlation coefficient has increased from approximately 0.7 in GSMaP_Gauge rainfall to 0.96 in GSMaP_Gauge_NEW rainfall. The number of training gauge observations is almost 0.35 million for different years.
These statistics suggest that after the merging of training gauges into the GSMaP rainfall product via the hybrid assimilation method, the new rainfall products are closer to the training gauges, which supports successful ingestion of ground observations. Similar to the GSMaP_Gauge rainfall product, the error statistics for the GSMaP_MVK rainfall product are also improved from 12.3 (−1.8), 13.0 (−1.6), and 16.5 (−2.1) mm day⁻¹ for JJAS 2016 (Fig. 5c), 2017 (Fig. 5g), and 2018 (Fig. 5k), respectively, to 4.1 (−0.2), 4.4 (−0.2), and 4.9 (−0.3) mm day⁻¹ in the GSMaP_MVK_NEW (Figs. 5d, h, l) rainfall product. The value of the correlation coefficient is also improved from approximately 0.52 in the GSMaP_MVK rainfall product to 0.96 in the GSMaP_MVK_NEW rainfall product. As for GSMaP_Gauge, the new product is therefore closer to the assimilated observations (training gauges) than the operational GSMaP_MVK product. After this initial verification, the operational and new GSMaP rainfall products are also compared with verification gauges, which can be considered an independent verification. Results suggest that the GSMaP_Gauge rainfall has RMSD (BIAS) of 9.4 (0.5), 10.3 (−0.1), and 11.9 (−1.2) mm day⁻¹ for JJAS 2016 (Fig. 6a), 2017 (Fig. 6e), and 2018 (Fig. 6i), respectively. These error statistics are changed to 6.8 (0.1), 7.4 (−0.1), and 8.1 (−0.4) mm day⁻¹, respectively, in the GSMaP_Gauge_NEW rainfall product for JJAS 2016 (Fig. 6b), 2017 (Fig. 6f), and 2018 (Fig. 6j). The value of the correlation coefficient has increased from approximately 0.7 in GSMaP_Gauge rainfall to 0.86 in GSMaP_Gauge_NEW rainfall. The number of verification gauge observations is almost the same as the number of training gauge observations for different years. These results suggest that the new rainfall products have less error than the operational GSMaP rainfall products when compared with verification gauges.
Similar to the GSMaP_Gauge rainfall product, the error statistics for the GSMaP_MVK rainfall product are improved from 11.9 (−1.6), 12.7 (−1.5), and 15.6 (−1.9) mm day⁻¹ for JJAS 2016 (Fig. 6c), 2017 (Fig. 6g), and 2018 (Fig. 6k), respectively, to 7.4 (−0.4), 8.2 (−0.5), and 8.9 (−0.5) mm day⁻¹ in the GSMaP_MVK_NEW (Figs. 6d, h, l) rainfall product. The values of the correlation coefficient are also improved from approximately 0.53 in the GSMaP_MVK rainfall product to approximately 0.82 in the GSMaP_MVK_NEW rainfall product. These statistics suggest that the new rainfall product agrees better with verification gauges than the operational GSMaP_MVK rainfall product. The improvements are larger in the GSMaP_MVK rainfall product than in the GSMaP_Gauge rainfall product, which may be due to the calibration of the GSMaP_Gauge rainfall with the NOAA/CPC gauges in operational production. Figure 7 shows the spatial distribution of the improvement parameter (IP) for the GSMaP_Gauge_NEW and GSMaP_MVK_NEW rainfall products in comparison with the operational GSMaP_Gauge and GSMaP_MVK rainfall products when compared with verification gauges. The IP is defined in Eq. (12), where the GSMaP_Gauge or GSMaP_MVK rainfall product is denoted GSMaP_{Gauge or MVK}, the GSMaP_Gauge_NEW or GSMaP_MVK_NEW rainfall product is denoted GSMaP_{Gauge_NEW or MVK_NEW}, the total number of collocations is N, and the verification gauges are denoted KSNDMC_{ver}. A positive (negative) value of IP corresponds to an improvement (degradation) of the GSMaP_Gauge_NEW or GSMaP_MVK_NEW rainfall product. The domain average value of IP is positive, which suggests that the quality of GSMaP rainfall products is improved with the ingestion of training gauges when compared with verification gauges. These positive improvements are more prominent for the GSMaP_MVK rainfall products (Figs. 7d–f), which may be due to the absence of the NOAA gauge calibration in this rainfall product. The spatial distribution of IP for different years suggests that the maximum positive impact is observed over the Western Ghats regions. The values of IP for GSMaP_Gauge_NEW are largest for JJAS 2018 and smallest for JJAS 2016 over the Western Ghats. However, the values of IP are almost the same for the GSMaP_MVK_NEW rainfall for different years. Results also suggest that, besides the coastal and Western Ghats regions, NIK and SIK regions show improvement for different years.
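One plausible form of the improvement parameter can be sketched as the reduction in RMSD against verification gauges. This is an assumption made for illustration, chosen because it is positive when the new product is closer to the gauges and carries units of mm day⁻¹; the exact expression used in the study is not reproduced in this section. The function names and sample values are hypothetical.

```python
import numpy as np

def rmsd(product, gauge):
    """Root-mean-square difference between a rainfall product and gauges (mm/day)."""
    product = np.asarray(product, dtype=float)
    gauge = np.asarray(gauge, dtype=float)
    return float(np.sqrt(np.mean((product - gauge) ** 2)))

def improvement_parameter(operational, new, gauge_ver):
    """Assumed form of IP: RMSD of the operational product minus RMSD of the new product.

    Positive values mean the new product is closer to the verification gauges.
    """
    return rmsd(operational, gauge_ver) - rmsd(new, gauge_ver)

# Hypothetical collocations at N = 3 verification gauges, for illustration only.
gauge_ver = [5.0, 12.0, 30.0]
ip = improvement_parameter([2.0, 8.0, 20.0], [4.0, 11.0, 28.0], gauge_ver)
```

Applied per grid box over all days of a season, such a quantity would give a map comparable in spirit to Fig. 7.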
Besides the comparison of different rainfall products against verification gauges, these new rainfall products are also compared with the IMERG final rainfall products. IMERG final rainfall products use the GPCC gauge analysis to calibrate merged rainfall products. As described in Schneider et al. (2014), the GPCC uses two rain gauge sources besides the NOAA CPC (used in the GSMaP). Dinku et al. (2008) found that the GPCC product has better overall statistics compared with the NOAA CPC over a mountainous region of Africa. Earlier studies suggest that IMERG final products have sufficient skill over tropical regions and that this dataset can be considered an independent source for verification. The JAXA operational and new GSMaP rainfall products are therefore compared with IMERG final rainfall products for the years 2016–2018. Results suggest that the GSMaP_Gauge rainfall ... (Fig. 8i), respectively. These error statistics are changed to 9.9 (−0.9), 9.3 (0.0), and 9.9 (0.4) mm day⁻¹, respectively, for JJAS 2016 (Fig. 8b), 2017 (Fig. 8f), and 2018 (Fig. 8j). The value of the correlation coefficient is slightly higher for the GSMaP_Gauge_NEW rainfall than for the GSMaP_Gauge rainfall. However, slightly larger values of RMSD and BIAS are found in the new rainfall product than in the operational GSMaP rainfall product. These results suggest that the new rainfall products show negligible to very small changes relative to the operational GSMaP rainfall products when compared with IMERG final rainfall products. The error statistics for the GSMaP_MVK rainfall product are improved from ... . These statistics suggest that the new rainfall products have less error with respect to the IMERG final data than the operational GSMaP_MVK rainfall products. Large improvements are thus found in the GSMaP_MVK rainfall when compared with the IMERG final data, whereas negligible to small changes are found in the GSMaP_Gauge rainfall.
Notably, the new GSMaP rainfall products have a higher correlation with verification gauges as well as with IMERG final data, which supports the improved skill of the rainfall products after the hybrid assimilation of training gauges.
To evaluate the skill of operational and new GSMaP rainfall products, these data are also compared with verification gauges for different IMD classifications. Besides the IP defined in Eq. (12), the absolute NBIAS is used to understand the quality of the new rainfall products in comparison with operational GSMaP rainfall products. The absolute NBIAS parameter is defined such that positive (negative) values show the improvement (degradation) of the new rainfall data against operational GSMaP rainfall. Figure 9 shows the improvement parameter and absolute NBIAS in both GSMaP_Gauge_NEW and GSMaP_MVK_NEW rainfall products during JJAS 2016–2018. Results suggest that the value of improvement varies from 2 mm day⁻¹ to 60 mm day⁻¹ for different rain classifications (Fig. 9a). Generally, the GSMaP_Gauge rainfall product has less improvement than the GSMaP_MVK rainfall product. This suggests that, because of operational gauge calibration, the GSMaP_Gauge rainfall product is already closer to ground observations. It is also important to note that, for all heavy rainfall classifications, both operational GSMaP rainfall products show large improvements (Fig. 9a). These large improvements are mainly over the Western Ghats regions and are more noteworthy for the years 2017 and 2018. The value of absolute NBIAS in GSMaP_Gauge is less than that in GSMaP_MVK for different rainfall classifications except for the Very Heavy Rain and Extremely Heavy Rain classifications (Fig. 9b). These results suggest substantial improvement in operational GSMaP rainfall products after implementing hybrid assimilation, with the areas of higher precipitation showing larger improvement. Figure 10 shows the density plot of rainfall deviation (defined as GSMaP minus rain gauge) for GSMaP_Gauge, GSMaP_Gauge_NEW, GSMaP_MVK, and GSMaP_MVK_NEW for the years 2016–2018. This figure suggests that for different rainfall thresholds, GSMaP_Gauge_NEW and GSMaP_MVK_NEW rainfall have less error.
The new products are closer to observations for all years in comparison with operational GSMaP rainfall products. The density plot of deviation is shifted toward low values, which suggests that more points are closer to observations after assimilation. However, for high rainfall thresholds, both operational GSMaP rainfall products have large deviations. This suggests that a dense network of rain gauges over orographic heavy rainfall regions improves the quality of both operational GSMaP rainfall products. Results also show a better performance of the GSMaP_Gauge rainfall product than that of the GSMaP_MVK rainfall product for the selected study period. Moreover, the new rainfall products have better skill for high rainfall thresholds over Karnataka, India. The hybrid assimilation of additional gauge observations, mainly over the Western Ghats regions, can capture the magnitude of the complete dynamical range of rainfall (mainly higher rainfall) more accurately than the operational GSMaP rainfall products.
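The stratification of errors by rainfall class can be sketched as below. The lower bounds are taken from the class limits in Table 1; classes are indexed 0 to 7 rather than named, and the function names, the per-class RMSD summary, and the sample values are hypothetical choices made for illustration only.

```python
import numpy as np

# Lower bounds (mm/day) of classes 1..7 from Table 1; class 0 is 0.0 mm.
LOWER_BOUNDS = np.array([0.1, 2.5, 7.6, 35.6, 64.5, 124.5, 244.5])

def rain_class(gauge):
    """Return the class index (0..7) of each daily gauge rainfall amount."""
    # np.digitize returns i such that LOWER_BOUNDS[i-1] <= x < LOWER_BOUNDS[i].
    return np.digitize(np.asarray(gauge, dtype=float), LOWER_BOUNDS)

def rmsd_by_class(product, gauge):
    """RMSD of a rainfall product against gauges, computed separately per class."""
    product = np.asarray(product, dtype=float)
    gauge = np.asarray(gauge, dtype=float)
    classes = rain_class(gauge)
    result = {}
    for c in np.unique(classes):
        mask = classes == c
        result[int(c)] = float(np.sqrt(np.mean((product[mask] - gauge[mask]) ** 2)))
    return result

# Hypothetical values, one per class, for illustration only.
classes = rain_class([0.0, 1.0, 5.0, 20.0, 50.0, 100.0, 200.0, 300.0])
per_class = rmsd_by_class([1.0, 3.0], [0.0, 1.0])
```

Gauge values reported to 0.1 mm fall exactly on the class limits; amounts between two listed limits (for example 2.45 mm) are assigned to the lower class in this sketch.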

Fig. 9. Error statistics of (a) improvement parameter and (b) absolute NBIAS for GSMaP_Gauge_NEW (GSMaP_MVK_NEW) rainfall compared to GSMaP_Gauge (GSMaP_MVK) rainfall for different IMD classifications as shown in Table 1.

Evaluation of different assimilation methods for variable density of rain gauges
The Cressman (Cressman 1959) and optimal interpolation (Daley 1997) methods are also used in this study, besides the hybrid assimilation method, to understand the importance of the hybrid assimilation method. To recognize the need for a dense rain gauge network, the rain gauge stations in the year 2018 are randomly divided into training and validation gauge stations. Furthermore, the training gauge stations used for data assimilation are divided into three cases, namely, RG1 (all training rain gauge stations), RG2 (50 % of training rain gauge stations), and RG3 (25 % of training rain gauge stations). Merged rainfall products prepared from different assimilation methods (namely, Cressman, optimal interpolation, and hybrid methods) and variable numbers of rain gauge stations (namely, RG1, RG2, and RG3), besides both operational GSMaP rainfall products, are compared with independent validation rain gauge stations for ISM 2018. The radius of influence (ROI) is taken as 5 km for the Cressman method. The fixed observation and background errors for the optimal interpolation method are the same as those used for the variational assimilation discussed in Section 3. Table 3 shows the RMSD values for RG1, RG2, and RG3 with different assimilation methods. Results show that, in general, the merged rainfall products have less error than both operational GSMaP products. Lower RMSD values are noticed in the optimal interpolation method than in the Cressman method. The reduction of RMSD is largest in the hybrid assimilation method when compared with the other selected assimilation methods.
This clearly shows the importance of considering the flow of background error covariance in the hybrid assimilation method, which is treated as fixed in the optimal interpolation method (i.e., B is taken as a diagonal matrix with diagonal elements of 4 mm day⁻¹ in the optimal interpolation method). Additionally, a high-density rain gauge network has a large impact on merged rainfall products. RMSD values of 11.8 (15.3), 11.4 (14.6), and 10.7 (12.8) mm day⁻¹ are noticed in the Cressman method-generated merged GSMaP_Gauge (GSMaP_MVK) product for RG3, RG2, and RG1 gauges, respectively. Both rain gauge density and assimilation methodology are therefore important for preparing merged rainfall products. The Cressman and optimal interpolation methods show a larger effect of the dense gauge network for GSMaP_MVK rainfall products. The values of RMSD are reduced from 15.3 (13.1) mm day⁻¹ to 12.8 (9.4) mm day⁻¹ for the Cressman (optimal interpolation) method in GSMaP_MVK rainfall from RG3 to RG1 gauges. However, the impact of the number of rain gauges used is relatively smaller in the hybrid assimilation method. The values of RMSD change from 10.6 mm day⁻¹ to 8.3 mm day⁻¹ from RG3 to RG1 gauges in the GSMaP_MVK merged rainfall for the hybrid assimilation method. Generally, the RMSD values are lower in the GSMaP_Gauge product, signifying the importance of the operational gauge calibration used in this product.
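For reference, a single-pass Cressman correction with a 5 km radius of influence can be sketched as follows. This is a simplified illustration under stated assumptions, not the configuration used in the study: coordinates are plane coordinates in km, the background value at each gauge location is supplied by the caller, negative analysed rainfall is clipped to zero, and all names and sample values are hypothetical. The weight is the standard Cressman form w = (R² − d²)/(R² + d²) for d < R and zero otherwise.

```python
import numpy as np

def cressman_analysis(grid_xy, background, obs_xy, obs, background_at_obs, roi_km=5.0):
    """Single-pass Cressman correction of a background rainfall field (mm/day)."""
    grid_xy = np.asarray(grid_xy, dtype=float)        # shape (G, 2), km
    obs_xy = np.asarray(obs_xy, dtype=float)          # shape (K, 2), km
    background = np.asarray(background, dtype=float)  # shape (G,)
    # Innovation: gauge observation minus background at the gauge location.
    innovation = np.asarray(obs, dtype=float) - np.asarray(background_at_obs, dtype=float)
    # Squared distance between every grid point and every gauge, shape (G, K).
    d2 = ((grid_xy[:, None, :] - obs_xy[None, :, :]) ** 2).sum(axis=2)
    r2 = roi_km ** 2
    # Cressman weights; zero outside the radius of influence.
    weights = np.where(d2 < r2, (r2 - d2) / (r2 + d2), 0.0)
    wsum = weights.sum(axis=1)
    increment = np.zeros_like(background)
    has_obs = wsum > 0                                # grid points with a gauge in range
    increment[has_obs] = (weights[has_obs] @ innovation) / wsum[has_obs]
    # Assumed post-processing: rainfall cannot be negative.
    return np.maximum(background + increment, 0.0)

# Hypothetical example: three grid points on a line and one gauge at the origin.
analysis = cressman_analysis(
    grid_xy=[[0.0, 0.0], [3.0, 0.0], [20.0, 0.0]],
    background=[10.0, 10.0, 10.0],
    obs_xy=[[0.0, 0.0]],
    obs=[16.0],
    background_at_obs=[10.0],
)
```

Grid points farther than the radius of influence from every gauge keep the background value, which is one reason a purely distance-weighted scheme depends strongly on gauge density.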

Table 3. RMSD in daily GSMaP rainfall products using different assimilation methods and utilized rain gauge numbers (RG1, RG2, and RG3).

Conclusions
A hybrid assimilation method for merging various rainfall products has been developed and demonstrated over the Karnataka region of southwestern India, a unique site with a dense gauge observation network. The verification results for four topographically different regions within the study area suggest a large error in GSMaP rainfall over the Coastal and Malnad (Western Ghats) areas, on the windward side of the mountainous regions, whereas GSMaP rainfall can capture rainfall patterns over NIK and SIK regions. The GSMaP_Gauge rainfall product has more skill than the GSMaP_MVK rainfall product over orographic heavy rainfall regions, with lower RMSD and higher correlation. The present results reconfirm large errors at high rainfall thresholds for different IMD rainfall classifications. These preliminary verifications at a daily scale with an independent dense gauge network suggest that further improvements are possible in operational GSMaP rainfall products using ground observations, mainly over orographic heavy rainfall regions, which are well known for their land inhomogeneity. A hybrid assimilation method is implemented as a combination of the variational method and the Kalman filter method, in which rain gauge observations are used to prepare an analysis that is an optimal combination of ground observations and the GSMaP rainfall product, and the evolution of background error is simulated using the Kalman filter method. Results suggest that the new GSMaP rainfall analyses are closer to the gauge observations used in the assimilation, which shows successful assimilation of gauge observations. Furthermore, these new daily rainfall products are compared with independent gauge observations and IMERG final rainfall products calibrated by the GPCC.
Results suggest that the new analyses are in better agreement with the independent observations. Moreover, the distributions of the new rainfall products match well with gauge observations. The analysis is also extended to understand the importance of dense rain gauge networks and of different data assimilation methods, such as the Cressman method and the optimal interpolation method, besides the hybrid assimilation method. These results suggest that both dense rain gauge networks and assimilation methods are essential for preparing merged rainfall products. The hybrid assimilation method shows less error than the Cressman and optimal interpolation methods for every number of rain gauges used. In all cases, GSMaP_Gauge has less error than the GSMaP_MVK rainfall product. These analyses suggest that an optimal number of ground-based observations combined with the hybrid assimilation method has great potential to improve satellite-based rainfall estimates. This new daily gridded rainfall product can be used for various agricultural, hydrological, and meteorological applications. Moreover, such a merged product is also useful for data assimilation in weather models (Kumar 2020), verification of model skill, monitoring of the monsoon progress and its assessment (in terms of its active and break phases), calculation of freshwater fluxes over the oceans, etc. In the present hybrid assimilation method, the variation of background error with model error is not considered, which may be a scope for future research. Moreover, precise estimation of observation error, which is treated here as a fixed diagonal matrix, remains a challenging issue. The scope of this study can be further extended by refining the temporal resolution from the daily scale to the hourly scale for various hydrometeorological applications.