2020 Volume 98 Issue 2 Pages 353-377
This study constructs a merged total column ozone (TCO) dataset using 20 available satellite Level 2 TCO (L2SAT) datasets over 40 years from 1978 to 2017. The individual 20 datasets and the merged TCO dataset are corrected against ground-based Dobson and Brewer spectrophotometer TCO (GD) measurements. Two bias correction methods are used: simple linear regression (SLR) as a function of time and multiple linear regression (MLR) as a function of time, solar zenith angle, and effective ozone temperature. All of the satellite datasets are consistent with GD within ±2–3 %, except for some degraded data from the Total Ozone Mapping Spectrometer/Earth Probe during a period of degraded calibration and from the Ozone Mapping and Profiling Suite (OMPS) provided from NOAA at an early stage of measurements. OMPS data provided from NASA show fairly stable L2SAT–GD differences. The Global Ozone Monitoring Experiment/MetOp-A and -B datasets show abrupt changes of approximately 8 DU coincident with the change of retrieval algorithm. For the TCO merged datasets created by averaging all coincident data located within a grid cell from the 20 satellite-borne TCO datasets, the differences between corrected and uncorrected TCOs by MLR are generally positive at lower latitudes where the bias correction increases TCO because of low effective ozone temperature. In the trend analysis, the difference between corrected and uncorrected TCO trends by MLR shows clear seasonal and latitudinal dependency, whereas such seasonal and latitudinal dependency is lost by SLR. The root mean square difference of L2SAT–GD for the uncorrected merged datasets, 8.6 DU, is reduced to 8.4 DU after correction using SLR and MLR. Therefore, the empirically corrected merged TCO datasets that are converted into time-series homogenization with high temporal-resolution are suitable as a data source for trend analyses as well as assimilation for long-term reanalysis.
The increased halogen loading in the stratosphere caused by human activity has led to a decrease in total ozone abundance (e.g., World Meteorological Organization 2019). Changes in the ozone layer are of major concern to scientists and the general public, and thus atmospheric ozone is being closely monitored to assess the recovery of the stratospheric ozone layer from chemical ozone depletion. In response to the observed ozone depletion, the Montreal Protocol on Substances that Delete the Ozone Layer was agreed upon in 1987, and the successful implementation of the protocol and its later amendments are expected to result in ozone recovery to pre-1980s levels (World Meteorological Organization 2019). Newchurch et al. (2003) performed a statistical analysis of satellite measurements that stratospheric ozone depletion, which is controlled mainly by gas-phase catalytic cycles, slowed globally after 1997. However, the estimated dates of return of the total column ozone (TCO) to pre-1980 values for the baseline scenario (RCP-6.0) are different for the mid-latitudes in the northern and southern hemispheres (NH and SH) and significantly delayed for the Antarctic ozone. They will probably occur before the mid-century (∼ 2035)/around the mid-century for annually averaged NH/SH mid-latitude ozone, and after the mid-century (∼ 2060) for the Antarctic ozone hole (World Meteorological Organization 2019).
Global-scale satellite observations of TCO have been performed since the early 1970s, and regular and continuous ozone monitoring by the Total Ozone Mapping Spectrometer (TOMS) and Solar Backscatter Ultraviolet (SBUV) instruments onboard the Nimbus-7 satellite started in 1978 (McPeters et al. 1996; Bhartia et al. 2013). Since then, ozone observations by multiple instruments on different polar-orbiting platforms have been available. Some of these instruments are the Global Ozone Monitoring Experiment (GOME) onboard the European Remote Satellite-2 (ERS-2) (Bodeker et al. 2001), the Ozone Monitoring Instrument (OMI) onboard the Earth Observation System Aura satellite (Koukouli et al. 2012), and the Ozone Mapping and Profiling Suite (OMPS) onboard the Suomi National Polar-orbiting Partnership (S-NPP) satellite (Flynn et al. 2014).
Satellite-borne instruments have an important advantage for ozone measurements compared to sparsely distributed ground-based networks because they can provide daily global TCO maps with a resolution that is sufficient to detect meteorological variability across regions. However, they suffer from several disadvantages, including instrument calibration drift and occasional gaps in data continuity due to changes in retrieval algorithms and observation conditions. In contrast, ground-based instruments are regularly cross-calibrated and provide continuous time series data, but they are spatially scattered, and their coverage of the globe is far from complete.
So far, some differences within ±3 % were reported between the TCO measurements of TOMS or GOME instruments and ground-based observations (Fioletev et al. 2008). Antón et al. (2010) evaluated TOMS/Earth Probe (EP) TCO data and concluded that TOMS data for the period of 2000–2004 should not be used for trend analysis due to degraded calibration issues. Van Roozendael et al. (2012) reported an updated GOME Data Processor version 5 (GDP5) spectral fitting algorithm for reprocessing 16-year GOME data records and they indicated that the updated GDP5 showed a clear improvement with reduced solar zenith angle and seasonal dependences for nearly all regions compared with ground-based measurements. Loyola et al. (2011) evaluated GOME-2/MetOp-A TCO measurements for 3 years (2007–2009) using GDP version 4.4 retrieval algorithm and they found that GOME-2 TCO slightly underestimated (overestimated) ground-based ozone by 0.5–1 % (∼ 0.5 %) over the mid-latitudes of the NH (SH). Meanwhile, Kroon et al. (2011) evaluated OMI ozone profiles against co-flying instruments and collocated ozone profiles and they showed an agreement of stratospheric ozone profiles within 20 % with global correlative data, with a systematic positive bias of ∼ 60 % in the tropics and 30 % in the mid-latitude regions. Furthermore, TCO datasets of the European sensors GOME, SCanning Imaging Absorption SpectroMeter for Atmospheric CartograpHY (SCIAMACHY), and GOME-2 were reprocessed with GOME-type Direct FITting (GODFIT) version 3 retrieve algorithm (Lerot et al. 2014; Koukouli et al. 2015). The ozone datasets had good quality and less sensitivity to instrumental degradation owing to a new reflectance soft-calibration scheme. These studies suggest that in order to obtain accurate global ozone records, satellite measurements must be validated against measurements made by ground-based instruments. Therefore, a merged ozone dataset from multiple satellite-acquired ozone after the validation is suitable for trend analysis and climate change studies.
One of the major difficulties in assessing long-term global ozone variations is data inhomogeneity (Fioletov et al. 2002). Satellite measurements are obtained by different instruments and retrieved by different algorithms; furthermore, the instruments are subject to problems such as radiation damage (van der A et al. 2010). Although SBUV instruments are quite similar in design, different orbital characteristics between the satellites resulted in measurements made at different local times and solar zenith angles (Miller et al. 2002; Frith et al. 2014). Thus, the drifting orbits for the series of SBUV/2 instruments on NOAA satellites are possibly the biggest source of uncertainty regarding the determination of long-term ozone change from these instruments. Operational satellite changes and observation interruptions result in systematic errors for datasets that change over time, and these errors must be taken into account when estimating the ozone recovery trend.
To date, several merged ozone datasets have been created. For example, Bodeker et al. (2001) constructed a merged ozone dataset from TOMS and GOME datasets after removing latitudinal and temporal drift and biases by intersatellite comparisons and by examining the differences between satellite overpass and ground-based measurements as a function of time and latitude. Meanwhile, Stolarski and Frith (2006) constructed a merged ozone dataset from TOMS and SBUV version 8 retrieved data by intersatellite comparisons and by averaging the data during instrument overlapped periods. Besides, merged monthly zonal mean version 8.6 ozone data were constructed from nine independent SBUV-type instruments operating in the 8 am–4 pm equator crossing time range (McPeters et al. 2013; Frith et al. 2014). These merged datasets validated by intersatellite comparison or overpass ground-based datasets are used for trend analysis based on monthly data.
On the other hand, ozone datasets are used for long-term reanalysis such as the European Centre for Medium-Range Weather Forecasts Reanalysis-Interim (ERA-Interim) (Dragani 2011), the Japanese 55-year Reanalysis conducted by the Japan Meteorological Agency (JRA-55) (Kobayashi et al. 2015), and Modern-Era Retrospective Analysis for Research and Applications version 2 (MERRA-2) (Wargan et al. 2017). However, these ozone datasets are not homogenized for the usage of the reanalysis. In contrast, van der A et al. (2010, 2015) constructed merged TCO datasets from the assimilation based on high temporal-resolution satellite data in which the differences between Level 2 (L2) retrieval datasets and ground-based observations were corrected for systematic biases as a function of solar zenith angle (SZA), viewing angle, time, and effective ozone temperature (Teff).
In this study, we apply the method of van der A et al. (2010, 2015) for the bias correction of 20 satellite datasets comprising the latest TCO retrievals, to produce continuous and homogeneous TCO datasets covering 40 years from 1978 to 2017. We first evaluate biases in the individual satellite TCO datasets by comparing the satellite data with spatially and temporally collocated ground-based Dobson and Brewer spectrophotometer networks. Then, we create individual corrected datasets by weighted averaging all data within each grid cell. The empirically corrected merged TCO dataset with high temporal resolution (e.g., daily or hourly) constructed in this manner is thus suitable for trend analysis as well as assimilation of long-term reanalysis. Besides, the corrected datasets are useful for the post-processing of Dobson spectrophotometer data, because the retrieval algorithm for the Dobson data does not account for natural interannual variations in stratospheric temperature, which significantly affects ozone absorption coefficients (Koukouli et al. 2016).
This paper is organized as follows. In Section 2, we describe the L2 satellite-borne TCO datasets used, all of which are available from public archives. Section 3 presents our methods, including two bias correction methods, and methods used to calculate the differences between coincident satellite overpass data and ground-based observations and to produce Level 3 (L3) TCO datasets. Section 4 presents the corrected TCO data for biases by simple linear regression (SLR) as a function of time, and Section 5 presents the corrected data by multiple linear regression (MLR) as a function of time, SZA, and Teff. Section 6 investigates the TCO trend in the corrected and uncorrected merged TCO datasets, and Section 7 presents our concluding remarks.
In this study, 20 L2 satellite TCO datasets covering 40 years from 1978 to 2017 are acquired from public archives (Table 1). The short names used to refer to these datasets are mainly based on the name of collecting instruments. For datasets composed of data acquired by the same type of instrument but on a different satellite or by a different retrieval algorithm, the dataset name includes the name of the satellite as well as algorithm or provider. Thus, the 20 datasets used are: TOMS/N7, TOMS/EP, GOME, TOGOMI, SCIAMACHY, TOSOMI, OMI-TOMS, OMI-DOAS, GOME-2A, GOME-2B, OMPS-NASA, OMPS-NOAA, SBUV/N7, SBUV/N9, SBUV/N11, SBUV/N14, SBUV/N16, SBUV/N17, SBUV/N18, and SBUV/N19. The instruments that acquired the ozone data and the datasets are briefly described below.
Ground-based Dobson and Brewer spectrophotometer TCO measurements are collected from “Total Ozone” of the archive of the World Ozone and Ultraviolet Data Center (WOUDC). Daily TCO values, either the best representative values or daily averages, are usually submitted to WOUDC by a contributor. For example, the Japan Meteorological Agency has submitted the best representative values of Dobson observations and the daily averages of Brewer observations. By visually inspecting the differences between satellite overpass and ground-based TCO measurements as a function of time for each station, data with low number of observations, large data gap, sudden jumps, and very large offsets are rejected. Thus, 34 Brewer and 39 Dobson TCO measurement stations are selected in this study (Tables A1, A2, respectively). Single monochromator Brewer (and Dobson) spectrophotometers are susceptible to scattered light error for large SZA observations, whereas double-monochromator Brewer spectrophotometers enable more precise observations (Karppinen et al. 2015). Thus, single or/and double Brewer measurements are listed in Table A1 for reference.
2.2 TOMSThe TOMS onboard Nimbus 7 (TOMS/N7) with TCO version 8 retrieval algorithm provided the daily TCO global distribution (TOMS Science Team 1996; McPeters et al. 1996). TOMS/N7 had an orbit altitude of 935 km and this instrument performed cross-track scanning with a 102° field of view (FOV) using 35 sample points at 3° intervals, corresponding to a 50 km resolution at the Earth's surface for near-nadir angle observations. This basic retrieval algorithm uses two wavelengths to derive TCO. The shorter wavelength, which has a stronger continuum ozone absorption in the Huggins band (310–340 nm), is sensitive to TCO. Meanwhile, the longer wavelength, which has weak ozone absorption, characterizes the underlying surface and clouds.
The TOMS onboard the EP satellite (TOMS/EP) started measurements from an initial orbit of 500 km with an inclination of 98°. In December 1997, the orbit was elevated to an altitude of 739 km in order to increase the daily global coverage to 90 % (TOMS Science Team 1998; McPeters et al. 1998). The TOMS/EP TCO measurements were processed with TCO version 8 retrieval algorithm. After the first few years of observation, TOMS/EP had serious optical degradation and the calibration had to be empirically corrected. Thus, it is not recommended to use degraded data from EP for trend analysis (Antón et al. 2010). TOMS instruments were also mounted on the Meteor-3 satellite of the former USSR and the Japanese ADvanced Earth Observing Satellite (ADEOS). The Meteor-3 satellite had three-diffuser systems (cover, working, and reference sequence) for back-scatter ultraviolet instruments. It had a non-sunsynchronous near-circular and near-polar prograde orbit that slowly drifted with local time. The reference diffuser was usually exposed for one sequence every 106 days (half the period of orbital precession). This precession posed a serious problem for trend determination from the TOMS/Meteor-3 instrument. The ADEOS spacecraft failed after only about a year in orbit; hence, little data are available. TOMS/Meteor-3 and TOMS/ADEOS measurements are not used in this study because their L2 data were not available from public archives.
2.3 SBUVThe SBUV instrument onboard Nimbus-7 (SBUV/N7) measured backscattered ultraviolet radiation at 12 different wavelengths to determine TCO and vertical ozone profiles (Bhartia et al. 2013). Datasets SBUV/N9, /N11, /N14, /N16, /N17, /N18, and /N19 comprise ozone measurements acquired by second-generation instruments (SBUV/2) onboard NOAA series satellites (NOAA-9, −11, −14, −16, −17, −18, and −19, respectively) (Frith et al. 2014). Version 8.6 retrieval algorithm was used to reprocess the SBUV TCO measurements (Bhartia et al. 2013; Frith et al. 2014). Both SBUV and SBUV/2 were nadir-viewing instruments with an FOV of approximately 11.3° × 11.3°, which corresponded to an area of approximately 180 km × 180 km (168 km × 168 km) for SBUV (SBUV/2) on the Earth's surface. The acquisition of one set of SBUV measurements, only during the daylight portion of each orbit, took 32 s. About 14 orbits were covered per day, with a 26° separation at the equator.
2.4 GOMEGOME, onboard ERS-2, was a nadir-viewing scanning spectrometer designed to measure a range of atmospheric trace constituents in the troposphere and stratosphere (Deutsches Zentrum für Luft und Raumfahrt (DLR) 2012). The largest FOV of GOME was 320 × 40 km, and three cross-track measurements provided global coverage at the equator every 3 days. An operational TCO dataset was generated by GDP version 4, using a Differential Optical Absorption Spectroscopy (DOAS) algorithm (van Roozendael et al. 2006). A DOAS-type algorithm compares the spectral fitting of effective slant columns and air-mass factor computations (mainly from look-up tables). In accordance with the Ozone Climate Change Initiative (CCI) project of the European Space Agency (ESA) (Lerot et al. 2014), the TCO dataset was reprocessed using GODFIT algorithm version 4 (Lerot et al. 2010; Van Roozendael et al. 2012), and this updated reprocessed dataset is used in this study.
Another GOME nadir TCO dataset was retrieved using the Total Ozone retrieval scheme for GOME developed for the OMI instrument (TOGOMI version 2.1) (Valks et al. 2004; Tropospheric Emission Monitoring Internet Service 2005).
2.5 SCIAMACHYThe SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY) onboard the ESA Environmental satellite (Envisat) had a nadir mode that detected upwelling solar radiation scattered in the atmosphere and reflected by the Earth's surface. It had a 30 × 60 km along-track FOV and a swath width of 960 km (Ozone CCI 2012; Ebojie et al. 2014; Lerot et al. 2014). An operational TCO dataset was generated using SCIAMACHY Data Processor (SDP) versions 3–5 by DOAS technique (Lerot et al. 2009). The SCIAMACHY TCO dataset was also reprocessed using GODFIT algorithm to achieve a high level of consistency with other instruments (Lerot et al. 2014; Van Roozendael et al. 2012). This updated reprocessed dataset is used in this study.
Another SCIAMACHY TCO dataset was retrieved using the Total Ozone retrieval scheme for SCIAMACHY, which used OMI-DOAS algorithm (TOSOMI version 2.0) (Eskes et al. 2005; van der A and Eskes 2006), which was an application of TOGOMI algorithm to SCIAMACHY.
2.6 OMIOMI is a Dutch-Finnish nadir-viewing near-UV/visible charge-coupled device (CCD) spectrometer onboard NASA's Earth Observation System's Aura satellite (Bhartia 2002; OMI team 2012). OMI measurements cover the 264–504 nm spectra with a spectral resolution of 0.42–0.63 nm. The nominal surface FOV is 13 × 24 km at nadir with a swath width of 2600 km. Typically, 60 cross-track measurements are made per swath, and 1650 swath observations are made per orbit. The TCO measurements are processed using TOMS version 8 algorithm, which uses two wavelengths (317.5 nm and 331.2 nm) to obtain the TCO measurements. The other four wavelengths are used for diagnostics and error correction (Bhartia 2005).
Another TCO dataset is derived using the DOAS-based TCO algorithm of the Royal Netherlands Meteorological Institute (KNMI) (Veefkind 2006), which was originally used to process data from the GOME instrument onboard ERS-2. Because this algorithm is most suitable for weakly absorbing trace gases, it estimates TCO data from longer wavelengths than those used by the TOMS algorithm.
2.7 GOME-2A and -2BThe GOME-2 instruments, onboard the ESA Operational Meteorological Satellites (MetOp-A and -B; datasets GOME-2A and GOME-2B), have a main-channel FOV of 80 × 40 km and obtain 24 cross-track measurements on a swath with a nominal width of 1920 km (DLR 2012). From January 2007 to 15 July 2013, GOME-2A data were processed with GDP 4.6, and from 15 July 2013 to 15 October 2016, GOME-2A and -2B data were processed with GDP 4.7. GDP 4.8 has been used to process all data since 15 October 2016 (http://o3msaf.fmi.fi/products/oto_o3.html).
2.8 OMPSThe OMPS Nadir Mapper (NM) onboard the S-NPP satellite is a TCO sensor operated by the Joint Polar Satellite System (JPSS) program (Godin 2014). OMPS-NM measurements are processed in near real-time at the NOAA Interface Data Processing Segment using the Environmental Data Record algorithm, which is similar to NASA's version 7 TCO retrieval algorithm (Flynn et al. 2014). OMPS-NM has a 110° cross-track FOV with 35 cross-track bins and a 0.27° along-track slit width, corresponding to a 50-km resolution at the Earth's surface. It provides instantaneous coverage of a 2800-km-wide swath at the Earth's surface. In this study, the OMPS-NOAA dataset comprises OMPS data from the JPSS NOAA site.
Another OMPS-NM TCO dataset is provided by NASA's Goddard Earth Science Data and Information Services Center, which uses an algorithm that is similar to NASA TOMS version 8 retrieval algorithm used for OMI (Jaross 2017; Kramarova et al. 2014). Because the CCD readout in the NASA product is split at the center, there are typically 36 cross-track measurements per swath and 400 swath observations per orbit. The OMPS-NASA dataset in this study comprises OMPS data from the NASA site.
All L2 binary and ASCII data are first transformed into a uniform ASCII format. Then, uncorrected L3 TCO datasets are created from the uniform L2 ASCII data. In this study, two bias correction methods are used. Biases of datasets are corrected by SLR as a function of time and by MLR as a function of time, SZA, and Teff. The systematic mean biases and drift of the individual datasets are evaluated from the differences between coincident L2 satellite overpass data and selected ground-based observations. Meanwhile, Teff is calculated if a L2 retrievals are not available. Then, the linear regression coefficients are used to create corrected L3 TCO datasets from the uniform L2 ASCII data. Finally, we describe a method for estimating the uncertainties of systematic mean biases and drift.
3.1 Uniform ASCII formatAll datasets in binary NetCDF, HDF, HDF-EOS5 formats, or original ASCII formats are converted to our uniform ASCII format. Elements of the uniform format are observation time, center location (latitude and longitude) of a satellite pixel, four corner locations of the pixel, TCO and TCO error, SZA, viewing zenith angle (VZA), Teff, surface flags (sea/land/ice), and error flags. The SBUV datasets contain one file for each observation day while the other datasets contain one file for each orbit. Ground-based observation data from WOUDC in a CSV format are converted to the uniform format.
3.2 Differences between satellite overpass TCO observations and ground-based measurement data (SAT–GDdiff)The satellite overpass at a ground-based station is defined when the distance between the center of the satellite pixel and the ground-based station location is less than 50–200 km, depending on the satellite pixel size. Because daily data for Brewer and Dobson measurements are derived from “TotalOzone” of the WOUDC, we assume that ground-based measurements would be at local noon. Thus, we consider a time window within ±4 h (i.e., from 8 am to 4 pm) for satellite overpass. The local noon is estimated from the longitude of the ground-based station. For spatiotemporal collocated satellite overpass and ground-based data, differences between coincident satellite L2 TCO observations (L2SAT) and daily TCO ground-based measurement data (GD) from the Brewer and Dobson spectrophotometers (Table A1 and A2, respectively), SAT–GDdiff = L2SAT – GD, are calculated.
In this paper, several criteria are adopted to SZA and cloud amount based on the purpose of the analysis. To analyze the dependence of SZA, cloud amount, and Teff on SAT-GDdiff, satellite overpass data when SZA < 89° and all the cloud amount are used. For the regression analysis, only satellite overpass data at SZA < 80° and cloud amount < 0.8 are used, since SAT–GDdiff becomes large at large SZA, and the root mean square of SAT–GDdiff (RMSD) are large when cloud amount is close to 1. To create TCO products at each grid mesh, L2SAT at SZA < 84° are used because these data are often flagged as “good”. The cloud amount criterion is set at 0.8.
3.3 Effective ozone temperature (Teff)The ozone absorption coefficient, which is used by retrieval algorithms to estimate TCO, is strongly dependent on the stratospheric temperature that changes with season and latitude. Teff is defined, following Koukouli et al. (2016), as the integral of the temperature weighted with ozone profile over altitude as follows:
![]() |
In this study, the biases of satellite data as a function of Teff are corrected as part of MLR fitting. However, the retrieval Teff is not available in some datasets. In that case, Teff is computed from JRA-55 (Kobayashi et al. 2015) temperature profile data, T (z) and the climatological monthly-mean ozone profile, O3 (z) (Fortuin and Kelder 1998). The temperature data are extracted from the 6-hourly three-dimensional temperature distribution in JRA-55, and T (z) is derived at local noon from the reanalysis temperature for each ground station by interpolation (van der A et al. 2010). Temperature correction (van der A et al. 2010; Koukouli et al. 2016) is applied to post-process the Dobson TCO as follows:
![]() |
Biases of the individual and merged satellite TCO datasets are corrected in comparison with selected GD by two methods: SLR and MLR. In the first bias correction method, daily SAT–GDdiff that is within three times the standard deviation at each ground-based station is averaged over all the stations. Then, to correct the mean bias (offset) and the trend (long-term drift) of the data, SLR is applied to daily station-averaged SAT–GDdiff as a function of time. Some datasets contain degraded data because of instrumental problems or because they were obtained during the early stage of the satellite deployment (Antón et al. 2010; Flynn et al. 2014). In such cases, the satellite measurement period is divided into two parts, and linear regression analysis is applied separately to the data collected during each sub-period. Moreover, SBUV datasets of satellites that drifted from an afternoon (ascending) orbit to a morning (descending) orbit are analyzed as different time series. We only use SBUV measurements in the 8 am–4 pm equator crossing time range, with a single exception for SBUV/N11 in the 1994–1995 period to avoid a gap in the time series (Frith et al. 2014).
In the second bias correction method, MLR is applied to daily SAT–GDdiff as a function of time, SZA, and Teff at all the stations rather than daily station-averaged differences.
3.5 Level 3 datasetsWe produce hourly L3 TCO datasets by spatially re-sampling L2SAT data to a linear Gaussian grid with TL159 resolution (320 longitudinal intervals × 160 latitudinal intervals). The L2 longitude and latitude data are first converted to 0.1° intervals and then remapped to a grid location. Besides, the start and end times of each hourly interval recorded in advance as line numbers of an orbit-based-text file and number of records, are used for the L3 production.
Here, only satellite TCO data collected on the sunlit side of Earth are used. Data flagged as “bad” as well as those that do not satisfy certain criteria (i.e., data collected when SZA > 84° (McPeters et al. 1996), cloud cover > 0.8, TCO values outside the 100–700 DU range) are not used. Basically, for non-SBUV datasets, all data flagged as “good” are used. In the case of SBUV datasets, however, global data coverage during 1993–96 and 2003–04 would be extremely low if all data flagged as “bad” were excluded. To prevent data gaps, we disregarded quality flags and include all SBUV datasets for the L3 production.
Each gridded product in the dataset is generated by first multiplying data within a grid cell by a weighting factor that is inversely proportional to the TCO error, and then averaging the weighted data within the grid cell. In other words, the data with low errors within a grid cell are assumed to be more reliable. In some datasets, satellite FOVs at higher latitudes are larger than the grid cell size. In that case, the average grid cell value is patched to adjacent grid cells, where the number of adjacent grid cells is proportional to the ratio of the FOV area, calculated using spherical trigonometry, to the grid cell area. The daily and monthly L3 datasets are created by summing hourly L3 values.
Moreover, merged datasets are produced in a similar but separate manner by averaging the data located within each grid cell over all datasets multiplied by the weighting factor that is inversely proportional to the TCO error.
3.6 UncertaintiesData averaging and the uncertainty in the measurements are outlined in Damadeo et al. (2014) and Petropavlovskikh et al. (2019). The uncertainty in the mean is representative of both the uncertainty in the measurements and the overall variance (Damadeo et al. 2014). A simple lag-1 autocorrelation correction is considered for the SLR model that is applied to daily station-averaged SAT–GDdiff. Since we have an approach for using the station-averaged SAT–GDdiff, these differences are assumed to be homoscedastic. No seasonal cycle is included in this regression because Fourier components representing the seasonal cycle are relatively small.
The trend analysis is conducted based on two different periods to describe the ozone decrease in the 1980s and 1990s, and the slow ozone increase since the early 2000s (Petropavlovskikh et al. 2019). Fourier components representing 12, 6, 4, and 3 months periodicity are included in the SLR, and other proxies such as the quasi-biennial oscillation, the 11-year solar cycle, etc., are not included.
In this section, the satellite overpass retrievals of each dataset are compared with coincident GD made by Dobson and Brewer spectrophotometers. The systematic mean bias and drift of SAT–GDdiff over time are assessed using SLR. The daily L2SAT is compared with corresponding daily GD recorded at each ground-based station as follows: First, daily SAT–GDdiff is calculated for each site, and then the mean daily differences for all the sites are calculated. Second, SLR is applied to the daily mean differences as a function of time to determine the bias (offset) and trend (slope) of each dataset (Table 2).
Figure 1 shows a 31-day-running mean of SAT–GDdiff and the regression lines (shown in red) in the 20 datasets. Overall, all the datasets agree with GD within ±2–3 % (< 10 DU), except for periods when the measurements were significantly degraded (TOMS/EP during 2000–2003, and OMPS-NOAA during 2012–2014). In many datasets, the slope (trend) of SAT–GDdiff is within ±1 % (< 4 DU) per decade. Exceptions are TOMS/EP during 2000–2003 due to instrument degradation (Antón et al. 2010), and OMPS-NOAA and GOME-2B because of unstable satellite measurements during the early deployment stage. After their early stage, the drift in these two acquired datasets is reduced (i.e., the slope is smaller). TOMS/N7 shows good agreement with GD, with a mean bias of +3 DU (1 %) and a drift of −2 DU per decade, and TOMS/EP shows small deviations of ∼ 0.5 % during 1996–1999. However, after 2000, TOMS/EP observations display significant deviations from GD, reaching low values of −12 DU in 2003 with an abnormally low trend of −24 DU per decade. Thus, we exclude the TOMS/EP data from 2000 to 2003 from our trend evaluation, based on the recommendations of Antón et al. (2010) and McPeters et al. (2008).
Differences between the satellite L2 overpass total column ozone (TCO) (L2SAT) and ground-based TCO (GD) from a network of 71 Brewer and Dobson spectrophotometer stations, SAT–GDdiff = L2SAT – GD, as a function of time. Simple linear regression (SLR) lines in red that correspond to Table 2 are drawn. (a) TOMS/N7 (TN7), (b) TOMS/EP (TEP), (c) GOME (GOM), (d) TOGOMI (TGM), (e) SCIAMACHY (SCI), (f) TOSOMI (TSM), (g) OMI-TOMS (OMI), (h) OMI-DOAS (DOA), (i) GOME-2A (G2A), (j) GOME-2B (G2B), (k) OMPS-NOAA (ONO), (l) OMPS-NASA (ONA), (m) SBUV/N7 (S7), (n) SBUV/N9 (S9), (o) SBUV/N11 (S11), (p) SBUV/N14 (S14), (q) SBUV/N16 (S16), (r) SBUV/N17 (S17), (s) SBUV/N18 (S18), and (t) SBUV/N19 (S19). A 31-day running mean is applied.
GOME and SCIAMACHY data that were processed with the GODFIT algorithm (Lerot et al. 2010) show almost no slope (within ±0.1 DU per decade) and mean biases of 2.9 and 0.3 DU, respectively (Table 2), compared with GD. The 31-day-running mean peak-to-peak amplitude ranges ∼ 6 DU in GOME and 3 DU in SCIAMACHY (Figs. 1c, e, respectively). Meanwhile, the TOGOMI and TOSOMI are similar to those of GOME and SCIAMACHY, respectively, except that the offset in TOGOMI (0.5 %) is reduced in comparison with that in GOME (1 %), and the RMSD is 9.5 DU for TOSOMI compared with 8.7 DU for SCIAMACHY.
OMI-TOMS and OMI-DOAS datasets agree well with GD (Kroon et al. 2008). The time series of SAT–GDdiff (Figs. 1g, h, respectively) are almost flat throughout the year, both datasets have a mean bias of −3.1 DU and −1.8 DU, respectively. OMI-TOMS has a slightly larger drift and a smaller RMSD compared with OMI-DOAS. Similar to GOME and SCIAMACHY that were also processed with the DOAS-type algorithm, OMI-DOAS displays seasonal behavior with relatively smaller seasonal amplitude.
The GOME-2A and -2B datasets exhibit seasonal peak-to-peak SAT–GDdiff of 2–3 %, owing to using the DOAS algorithm. Their time series show abrupt changes on 1 July 2013, which coincided with the algorithm change from GDP 4.6 to GDP 4.7 (Figs. 1i, j) when the difference increases suddenly by ∼ 8 DU. The seasonality associated with the data processed by DOAS-type algorithms can, for the most part, be attributed to the fact that the algorithms do not account for latitudinal and temporal changes in the effective temperature (Koukouli et al. 2016). In addition, data processed with DOAS-type algorithms acquired at high SZA show poor agreement with GD, apparently as a result of seasonal and geometrical problems (i.e., measurements at high SZA often located at high latitudes).
OMPS-NASA shows a remarkable agreement with GD, with a mean bias of around zero and a slightly upward trend of 0.4 % per decade (Fig. 1l). In contrast, during the early stage of satellite deployment in 2012–2014, OMPS-NOAA measurements are extremely degraded (Flynn et al. 2014), and SAT–GDdiff shows an abnormally high slope. After 2015, however, the OMPS-NOAA SAT–GDdiff exhibits stable behavior (Fig. 1k). Thus, we include only OMPS-NOAA from after 2015 in the merged TCO dataset.
The mean offsets of SAT–GDdiff for datasets acquired with SBUV-type instruments are generally within 1–3 %. For daily data without smoothing (not shown), fewer good matches are available for pure nadir observation instruments than those for a two-dimensional CCD of OMI and a mapper of OMPS, and thus the variability in the SBUV anomalies is large. Because of poor sampling and low-quality SBUV instruments in the 1990s and early 2000s, SBUV/N9, /N11, and /N14 measurements exhibit large biases and significant drift due to large deviations of the equator crossing time (ECT) from an initial near-noon ECT (Kramarova et al. 2013; Frith et al. 2014). In early series SBUV datasets (SBUV/N7, /N9, /N11, and /N14; Figs. 1m–p), the peak-to-peak amplitude of 31-day-running mean fluctuations is within ±10 DU, whereas it is within ±5 DU in datasets acquired later (SBUV/N16, /N17, /N18, and /N19; Figs. 1q–t).
The latitudinal distributions of the mean relative SAT–GDdiff, i.e., (L2SAT − GD)/L2SAT, in the 20 datasets are shown in Fig. 2. Mean biases of satellite TCO relative to GD at each station range from −2 % to 5 %. The biases are small and close to zero at higher latitudes in NH and positive near the equator and in SH. However, the SH positive biases may be driven by the negative bias stations in NH mid-latitudes. The spread (variance) ranges from 3 % to 5 % at most of the stations. From the figure, no significant change is observed in biases in the GD records in the early part of the satellite records compared with the later part.
Relative SAT–GDdiff for 20 individual TCO datasets as a function of the latitude of each ground-based station without bias correction. The averaged biases are denoted by red filled circles and standard deviations denoted by error bars. The datasets are the same as those shown in Fig. 1.
Ozone trends would be best diagnosed from the temporal evolution of the global TCO average. Figure 3 shows the time series of uncorrected satellite-measured TCO averaged over 60°S–60°N for each of the 20 datasets and the corresponding TCO data corrected by SLR as a function of time. Both the uncorrected and corrected time series show a general decline in ozone levels over time in the 1980s and early 1990s. The most visible effect of the trend correction is to reduce the trend over the period of 1978–1995. Since the ground network was poorly calibrated in that period than in the post 2000 period, this might be due to uncertainty in the ground data. In this figure, there are a local maximum in 1988–1991 and a minimum in 1993. The low TCO values during 1992–1994, which are reduced by 5–10 DU compared with the 1990 level, are attributable to the Mount Pinatubo eruption in June 1991 (Fioletov et al. 2002). However, the TCO values return to the late 1980s level in1998–1999, and no long-term drift is apparent after 2000.
Time series of satellite-measured TCO averaged over 60°S–60°N for the 20 individual datasets (a) without and (b) with bias correction by SLR. The curves are based on monthly mean data and smoothed by applying 13-month running mean to eliminate high-frequency variations. Monthly zonal mean values averaged over the whole observation period are used to fill gaps in the individual datasets before calculating the area weighted TCO. Solid lines in colors are used for non-SBUV datasets. SBUV datasets are shown by long and short dash lines with black (/N7), purple (/N9), and grey (/N11), short dash lines with black (/N14), purple (/N16) and grey (/N17), and long dash lines with black (/N18) and purple (/N19).
The TOMS/N7 L2SAT overestimates GD by 2–4 DU during 1979–1993 (i.e., differences are positive; Fig. 1a). As a result, the corrected TCO magnitudes are reduced to 280–293 DU. Conversely, the TOMS/EP L2SAT underestimates GD during 1996–1999 (i.e., differences are negative; Fig. 1b). Thus, the corrected TCO values are higher than the uncorrected values. However, there are large differences between the corrected TOMS/EP dataset and other corrected datasets which include data for 1996–1999 (i.e., GOME, TOGOMI, and SBUV/N9, /N11, and /N14) (Fig. 3b). This result suggests that not only the TOMS/EP TCO data from after 2000 when the instrument is known to have degradation issues, but also the TOMS/EP TCO data from before 2000 may be problematic. Hence, the data corrected by SLR may need to be carefully inspected before they are used to determine long-term ozone trends.
The satellite datasets for the 2000s agree well, within 5 DU, with one another, and the corrected datasets agree within 4 DU. In the 2010s, NASA-satellite TCO data (OMI-TOMS, OMI-DOAS, OMPS-NOAA, and OMPS-NASA) and SBUV-series data agree well with one another. However, GOME-2A and -2B TCO data are ∼ 5 DU higher than the NASA-satellite and SBUV-series data (except for the OMPS-NOAA data from before 2015). The OMPS-NASA TCO dataset shows remarkable stability with respect to the ground-based network; the corrected globally-averaged OMPS-NASA TCO data are almost the same as those of the uncorrected data (Fig. 1l). Remarkably, intersatellite differences between the NASA satellites and GOME-2 series after around 2013 significantly decrease in magnitude from ∼ 5 DU for uncorrected TCOs to 2–3 DU for corrected TCOs, except for OMPS-NOAA.
In summary, the systematic mean biases and drift of TCO in the 20 satellite retrieval datasets as a function of time are removed by SLR fitting of the coincident satellite overpass and Dobson and Brewer GD data. In some datasets, measurements are extremely degraded during certain periods. In that case, linear regression analysis is applied separately to different parts of the satellite measurement period. In all the datasets, satellite overpass data agree with GD within ±2–3 % when the significantly degraded TOMS/EP measurements during 2000–2003 and OMPS-NOAA measurements during 2012–2014 are excluded. Abrupt jumps of ∼ 8 DU in GOME-2A and -2B coincident with a change in the retrieval algorithm. The RMSD of the merged TCO dataset, 8.6 DU, is reduced to 8.4 DU by bias correction with respect to time.
Biases of all the available satellite TCO datasets and the multi-sensor merged TCO dataset relative to the selected GDs are corrected as a function of time, SZA, and Teff by MLR. First, number frequency distributions of satellite overpass measurements are presented for each dataset and then relative SAT–GDdiff and RMSD are investigated as a function of SZA, cloud amount, and Teff. To assess the validity of this method for correcting biases in satellite TCO retrievals, we present the corrected TCO anomalies relative to uncorrected TCO values on a global map and examine the time series of globally averaged TCO.
In relation to SZA, frequency distributions of satellite overpass measurements (Fig. 4a) are unimodal or bimodal with peaks at SZA of 30–40° or 60° or both. The distributions of number frequencies and relative differences are heavily influenced by the fact that most of the ground stations are located in NH mid-latitudes. Because the overpass SZA varies according to the season and latitude of the observations, the distribution largely reflects the geophysical location of the ground stations, which are mostly unevenly distributed in NH extratropics. Overall, the relative SAT–GDdiff (Fig. 4b) and RMSD (Fig. 4c) are small when SZA is 20–60°. However, there are significant differences among the datasets when SZA is high, especially for SZA > 80°, because errors in the GD are likely due to scattered light.
(a) Number relative frequencies of satellite overpass measurements, (b) relative SAT–GDdiff, and (c) root mean square of SAT–GDdiff (RMSD) as a function of SZA grouped in 5° bins. Panels (d, e, f) are the same as (a, b, c) but for cloud amount grouped in 5 % reflectivity bins. Panels (g, h, i) are the same as (a, b, c) but for Teff grouped in 3 K bins. Non-SBUV datasets are shown by colored lines without marks. The SBUV datasets are represented by the ensemble means (black line) with standard deviations (error bars). For OMPS-NOAA, retrievals from before 2015 are significantly degraded; thus, only data retrieved after 2015 are used.
The frequency distributions in relation to cloud amount (Fig. 4d) show cloud-free percentages of just 4–5 % for satellite overpass measurements retrieved by DOAS-type algorithms and 35–40 % for those by TOMS-type algorithms, except for OMPS-NOAA. The number frequencies of retrievals generally decrease as the cloud amount increases because overpass differences in this study use only GD in direct sun to ensure accuracy. Overall, the relative SAT–GDdiff in most of the datasets shows little dependency on cloud amount (Fig. 4e). The SAT–GDdiff in TOMS/N7 and GOME-2B (OMI-TOMS and OMPS-NOAA) shows positive (negative) biases, and biases of SCIAMACHY, TOSOMI, GOME-2A, and OMPS-NASA are near zero within ±1 %. The RMSD increases as increasing the cloud amount (Fig. 4f). Because clouds can substantially reduce the accuracy of satellite TCO products (Antón and Loyola 2011), these results suggest that cloud amount may be suitable for the consideration of TCO error rather than for the TCO itself.
The frequency distributions of satellite overpass measurements in relation to Teff (Fig. 4g) show that the L2 retrievals of Teff for GOME, SCIAMACHY, TOGOMI, GOME-2A and GOME-2B peak at ∼ 230 K whereas the modal of JRA-55-derived Teff peaks at ∼ 220 K. Van Roozendael et al. (2012) indicated that the DOAS may overestimate Teff and that a difference of 10 K in Teff results in a 3 % change in TCO. The Teff dependency of SAT–GDdiff and its RMSD (Figs. 4h, i) show large spreads among the datasets at low temperatures because of the small sample sizes and large atmospheric variability in winter months. The relative SAT–GDdiff for GOME, SCIAMACHY, TOGOMI, and TOSOMI shows larger positive biases at lower Teff, whereas that for TOMS/EP, OMI-TOMS, and OMPSNOAA shows large negative biases at lower Teff. The OMI-DOAS, OMPS-NASA, and SBUV ensemble mean biases are small, within ±1 %, when Teff is higher than 220 K.
MLR coefficients of SAT–GDdiff as a function of time, SZA, and Teff are presented in Table 3 for the 20 datasets, along with their RMSDs. The RMSDs of all the datasets are ∼ 10 DU (3 %), and bias correction by MLR reduces the RMSD by 1–8 %, compared with the RMSDs of the uncorrected data (see RMSD1 in Table 2); GOME-2A and -2B show the largest RMSD reduction. The RMSD of uncorrected merged datasets, 8.6 DU, is reduced to 8.4 DU after correction by MLR.
Figure 5 shows latitude time cross-sections of differences between corrected TCO by MLR and uncorrected TCO for the 20 datasets (panels 5a–t) and the merged dataset (panel 5u). Overall, the correction differences show latitudinal dependency and distinct seasonal variation. The positive TCO correction in TOMS/EP, OMI-TOMS, OMI-DOAS, and OMPS-NOAA and negative TCO correction in GOME, TOGOMI, and GOME-2B at almost whole latitudes are basically attributed to negative and positive SAT–GDdiff, as shown in Fig. 1. Positive TCO correction at only higher latitudes is distinct in TOMS/N7. TCO correction is relatively small in SBUV datasets, due to small magnitude of SAT–GDdiff, as shown in Figs. 1 and 2. These general characteristics in individual datasets are well reflected in the merged dataset in Fig. 5u.
Latitude time cross sections of differences between monthly mean corrected and uncorrected satellite TCOs (DU) in individual datasets. (a) TOMS/N7, (b) TOMS/EP, (c) GOME, (d) TOGOMI, (e) SCIAMACHY, (f) TOSOMI, (g) OMI-TOMS, (h) OMI-DOAS, (i) GOME-2A, (j) GOME-2B, (k) OMPS-NOAA, (l) OMPS-NASA, (m) SBUV/N7, (n) SBUV/N9, (o) SBUV/N11, (p) SBUV/N14, (q) SBUV/N16, (r) SBUV/N17, (s) SBUV/N18, and (t) SBUV/N19. (u) Merged TCO dataset during 1978–2017 (MRG).
Large gaps in SH after 2003 in the GOME and TOGOMI datasets are due to an ERS-2 recording failure (Koukouli et al. 2012). In the datasets processed by DOAS-type algorithms, TCO retrievals are not available when the Earth's surface is covered by snow or ice, because the algorithms cannot distinguish between the snow- or ice-covered ground and cloud (Spurr et al. 2012). The effective light path is enhanced by multiple reflections in clouds, and when snow and ice is present, the highly reflecting surface produces an elongated “ghost” column; in both cases, TCO is overestimated (Loyola et al. 2011). However, in GOME-2A and -2B, after the retrieval algorithm was changed in mid-October 2016, the monthly coverage over 60–90°S increases to ∼ 100 %, from ∼ 40 % with the former algorithm.
It is interesting to examine how the spatial distribution and temporal behavior of the merged data change as a result of bias correction by MLR. For example, Figs. 6a–c show longitude-latitude sections of monthly mean corrected minus uncorrected TCO differences, RMSD, and Teff in October 2008. In general, TCO in October is high over the east Asia at higher latitudes and over the Southern Ocean, whereas it is low over 30°N–30°S region and over Antarctica. A large part of the merged TCO data in 2008 is derived from the OMI-TOMS and OMI-DOAS datasets, both of which have negative biases. Thus, the corrected TCO results show increases in TCO in most parts of the world (Fig. 6a). Figure 6b shows that the RMSD is large at the equator and in Antarctica, and small in the Southern Ocean in this example.
Monthly mean global maps of (a) differences between corrected and uncorrected TCO in DU, (b) the RMSD between corrected and uncorrected TCOs in DU and (c) Teff in K in October 2008. (d) Time series of corrected minus uncorrected Dobson TCO at Tsukuba/Tateno (36°N, 140°E). The data corrected by SLR (green line) or by MLR (black line).
Figure 6c shows a low Teff in lower latitudes and a high Teff in the Southern Ocean. In Table 3, coefficients of Teff for OMI-TOMS and OMI-DOAS are positive, indicating that the bias correction in higher/lower Teff regions show decreases/increases in TCO in those regions. The key point is that this bias correction by MLR is not uniform globally because the correction in a global map depends on Teff and SZA, which are critical parameters affecting TCO retrievals (van der A et al. 2010).
The time series of bias correction by SLR and MLR at a mid-latitude station in NH is shown in Fig. 6d. Bias correction by SLR does not reveal any seasonal behavior because this correction uses only the mean offset and drift. In contrast, bias correction by MLR reveals clear seasonal behavior. The seasonal peak-to-peak amplitude changes from ∼ 3 DU before 2013 and to ∼ 6 DU after 2013. This amplitude change coincides with the algorithm change in GOME-2A and -2B. The GOME-2A and -2B satellite overpass time series (Figs. 1i, j, respectively) show negative biases against GD before 2013 and positive biases after 2013.
Figure 7 shows the time series of merged uncorrected and corrected TCOs by SLR and MLR averaged over 60°S–60°N, and the differences of corrected TCO relative to the uncorrected TCO; NH and SH averages are also shown. For comparison, the KNMI ozone multi-sensor reanalysis of van der A et al. (2010) is plotted. The reanalysis was constructed by data assimilation using reprocessed versions of 13 satellite corrected datasets for 1978–2008 against GD for biases as a function of time, SZA, VZA, and Teff.
(a) Time series of merged TCO averaged over 60°S to 60°N without bias correction (black line), merged TCO corrected by SLR (blue line), and merged TCO corrected by MLR (red line). For reference, the assimilated merged TCO time series produced by van der A et al. (2010) is also shown (grey line). Time series are smoothed by applying 25-month running mean. (b) Differences between corrected and uncorrected TCOs. (c, d) Same as panel (a) but for the NH average over the equator to 60°N and SH average over the equator to 60°S.
The behavior of the global, NH and SH averages of TCO corrected by SLR is similar to that corrected by MLR, and good agreement is also seen between both our corrected and merged TCO time series and the merged KNMI time series of van der A et al. (2010). This agreement indicates that the simplified method in which only the mean satellite bias and drift are corrected has some benefits as an assimilation data source in that it removes the discontinuity caused by satellite de-biased data, and it is expected to reduce apparent biases among individual satellite measurements. From Fig. 7b, the bias correction by MLR is higher than that by SLR. This difference could be caused by Teff of Dobson and retrieved satellite ozone as well as SZA. By comparing these time series with those of the individual satellite datasets (Fig. 3), we find that both SLR and MLR fittings correct the positive biases during the TOMS/N7 observation period of 1978–1993 and the negative biases during the OMI period of 2004–2017.
In summary, bias correction by MLR is not uniform globally and shows distinct seasonal variation because the correction depends on time, SZA, and Teff. The bias correction by MLR is higher than that by SLR. In addition, the time series after SLR correction for mean bias and drift shows good agreement with the TCO time series corrected by MLR (R2 = 0.998). Therefore, the simplified method has some benefits as an assimilation data source in that it removes the discontinuity caused by satellite de-biased data, and it is expected to reduce apparent biases among individual satellite measurements.
In this section, the TCO trend in the corrected and uncorrected merged TCO datasets is investigated. Apparent biases in individual satellite measurements are reduced in the MLR corrected merged dataset.
Figure 8 shows the seasonal march of the TCO trend in NH and SH mid-latitudes along with the latitudinal distribution of the TCO trend during two periods, 1979–1996 and 2000–2017. These two different trends correspond to the ozone decrease in the 1980s and 1990s and the slow ozone increase since the early 2000s (Petropavlovskikh et al. 2019).
Seasonal evolution of the TCO trend over (a) 35–60°N and (b) 35–60°S during 1979–1996 (blue) and 2000–2017 (red) in the merged dataset corrected by MLR. Units are DU yr−1. The trend without correction is shown in black. Error bars (long-short dashed lines) denote twice the standard error of the mean trend with (without) bias correction. Latitudinal distribution of the TCO trend during (c) 1979–1996 (blue) and (d) 2000–2017 (red) with bias correction. The trend without correction is shown in black.
The negative TCO trend is the strongest in winter and spring with largest variability during 1979–1996 in NH mid-latitudes (35–60°N; Fig. 8a), as reported previously by Fioletov et al. (2002). During these years, the year-round trend is statistically significant (i.e., values are within twice the standard error) in both NH and SH mid-latitudes. The summer and fall uncertainties are smaller than those in winter and spring. In SH mid-latitudes (35–60°S; Fig. 8b), the ozone decline is less seasonally dependent. Overall, TCO trend shows a statistically significant decline of 10–20 DU per decade during 1979–1996 (Figs. 8a, b). During 2000–2017, the trend is near zero in all seasons in both NH and SH mid-latitudes, except for an insignificant positive trend in January over 35–60°N and a significant positive trend in April and May over 35–60°S. In both hemispheres during 1979–1996, the difference between the corrected and uncorrected TCO trends shows seasonal dependence, being positive in summer months. This influence may be attributed to Teff and/or SZA dependency for MLR. It is notable that bias correction by SLR loses seasonality for differences between corrected and uncorrected TCOs (not shown).
The TCO trend is near zero over the equator and substantially declines outside the 30°S–30°N zone during 1979–1996 (Fig. 8c). The cumulative ozone loss over the mid-latitude during that period is ∼ 5 %, in agreement with results of Fioletov et al. (2002) and World Meteorological Organization (2015). In this period, the corrected TCO data show smaller negative trends compared with the uncorrected data. During 2000–2017 (Fig. 8d), the trend is near zero over 20–30°S and 10–60°N. Meanwhile, it is significantly positive over 10°S–10°N and insignificantly positive over 40–60°S. In this period, the corrected data show a large positive trend over 10°S–30°N, compared with the uncorrected data. Overall, our trend correction by MLR is higher than that by SLR especially at lower latitudes because bias correction increases in TCO at low Teff.
In summary, the difference between MLR corrected and uncorrected TCO trends shows clear seasonal and latitudinal dependency while such seasonal and latitudinal dependence of bias correction by SLR against time is lost. The differences between MLR and SLR corrected TCO trends are generally positive at lower latitudes where bias correction by MLR increases in TCO due to low Teff.
In this study, we have constructed the merged dataset using 20 available satellite TCO datasets covering the 40 years from 1978 to 2017. The individual datasets and the merged dataset are corrected against selected GD by two methods: SLR against time and MLR against time, SZA, and Teff.
In the 20 datasets, SAT–GDdiff shows little dependence of SZA when SZA < 60°. However, it shows large deviations when SZA > 80° with anomalies > 5 % because GD errors are likely due to scattered light at large SZA. The dependency of the cloud amount is small in most of the datasets and the (TCO) RMSD for the cloud amount increases as increasing the cloud amount. Although clouds can significantly reduce the accuracy of satellite TCO retrievals (Antón and Loyola 2011), this means that cloud amount may not be a suitable parameter for TCO bias correction and that it may be taken into consideration in TCO error correction. SAT–GDdiff shows a large spread at low Teff, and this result is attributable to small sample sizes and probably also to large atmospheric variability in winter months.
Because SZA and Teff dependencies are geophysically linked (Koukouli et al. 2016), these two effects are important for validation studies. van der A et al. (2010) also indicated that SZA and Teff have clear seasonal components and that an explicit seasonal or latitudinal dependency almost disappears when their corrections were applied to the satellite observations. In our analysis of SAT–GDdiff, all datasets agree within ±2–3 % (< 10 DU), except for some significantly degraded measurements in TOMS-EP during 2000–2003 and OMPS-NOAA during 2012–2014. The systematic drifts are within ±1 % (< 4 DU) per decade, except in degraded TOMS/EP measurements, and in OMPS-NOAA and GOME-2B measurements during the early stage of satellite deployment. Later, OMPS-NOAA and GOME-2B retrievals exhibit stable behaviors.
The merged datasets are created by averaging all data located within a grid cell after multiplication with weights inversely proportional to TCO errors. The behavior of the merged TCO after bias correction by SLR is similar to that after correction by MLR. Both corrected merged TCO datasets agree well with the KNMI merged TCO reported by van der A et al. (2010). Therefore, correction for mean bias and drift also improves the merged TCO datasets. The RMSDs in the corrected datasets by SLR and MLR are reduced from 8.6 DU to 8.4 DU. This is an additional benefit for data correction and time-series homogenization. A drawback of bias correction by SLR is that bias correction of seasonal and latitudinal dependence is not detected, whereas bias correction by MLR shows clear seasonal and latitudinal dependence (Fig. 6).
For the trend analysis, bias correction by MLR shows clear seasonal and latitudinal dependence while such seasonal and latitudinal dependence of bias correction by SLR against time is lost. The differences between MLR corrected TCO and uncorrected TCO trends are generally positive at lower latitudes where bias correction increases in TCO due to low Teff. A further investigation is needed for this positive TCO correction in the low latitude.
Data availability. The homogenized monthly mean merged TCO datasets are available from the authors upon request via e-mail (biascorrection_l2sat_total-ozone AT mri-jma.go.jp).
This research was supported by the Environment Research and Technology Development Fund (S-12) of the Environmental Restoration and Conservation Agency of Japan and by Grant-in-Aid (JP18K03748) for Science Research (KAKENHI) from the Ministry of Education, Culture, Sports, Science, and Technology of Japan. The authors give many thanks to Atsuya Kinoshita, Itaru Uesato, Mikio Ueno, Nozomu Ohkawara, Makoto Inoue, Toshihiko Hirooka, and especially Kazuyuki Miyazaki. Without his constructive and scientific advices to claim JMA's ozone products, nothing about our ozone homogenization project would be accomplished at all.