2025 Volume 31 Pages 29-37
This study developed a way to automate the topographic correction process, specifically the C and SCS+C methods, using the Google Earth Engine (GEE) as a platform. We developed a script that enables users to specify the target path and row, the acquisition period, and the Normalized Difference Vegetation Index (NDVI) threshold required for correction; the desired topographic correction is then realized based on these inputs. The data to be corrected were obtained from Landsat 8. Users need to prepare only a limited number of parameters to execute the code on the GEE correctly: the Landsat path and row, the image-acquisition period, the country name from the Largescale International Boundary Polygons dataset, the NDVI threshold used for the regression analysis, and the center latitude and longitude of the display screen. The NDVI and the slope angle obtained from a digital elevation model were used to define the area to be sampled and to obtain the parameters required for the correction model automatically. Shadows were visibly removed from the corrected image compared to the uncorrected image. A comparison of the correlation coefficients before and after correction for each band showed that the corrected values were substantially lower than the uncorrected values, indicating that the correction was performed appropriately. The code proposed here requires little extra effort to implement, even when both the C and SCS+C correction methods are applied.
During satellite observations of the Earth's surface, the effects of the atmosphere and topography on the resulting images are unavoidable. Changes in apparent spectral reflectance due to shading caused by rugged terrain are known as topographic effects and are recognized as a key challenge limiting the practical use of satellite data. In areas with complex topography these effects are significant, and variations in lighting conditions caused by the terrain produce differences in the observed data that are unrelated to vegetation conditions. Topographic effects impact forest damage assessment (Ekstrand, 1996), the characterization of forest stand structure (Cohen and Spies, 1992), and land-cover classification (Fahsi et al., 2000; Vanonckelen et al., 2013). Several correction methods have been developed to mitigate the impact of topographic effects on observed data (Dong et al., 2020).
Such corrections can be broadly classified into non-geometric and geometric methods (Murakami, 2007). Non-geometric methods are typically simple models that require no data other than the satellite imagery itself, with corrections based on band-to-band ratios or similar factors (Ekstrand, 1996). Geometric methods use a digital elevation model (DEM) to incorporate the angles of incidence and reflection of sunlight on the ground surface into the correction. Among the geometric methods, the non-Lambertian model uses empirical parameters to correct each band, resulting in highly accurate corrections (Sola et al., 2016).
However, the complexity of differences in solar altitude and topographic relief makes it difficult to devise a definitive method for evaluating the adequacy of a correction. To address this issue, Sola et al. (2016) conducted a multi-criteria evaluation of several correction methods. Their study used seven criteria to evaluate 10 correction methods: C (Teillet et al., 1982), Smoothed C-Correction (Riaño et al., 2003), SCS+C (Soenen et al., 2005), Statistical-Empirical (Teillet et al., 1982), Minnaert (Minnaert, 1941), Enhanced Minnaert (Smith et al., 1980), Pixel-based Minnaert (Ge et al., 2008), Modified Minnaert (Richter, 1998), Two-Stage Normalization (Civco, 1989), and Slope-Matching (Nichol et al., 2006).
Currently, no software or service can perform all of the processing necessary for topographic correction in a single step. As a result, achieving topographic correction often involves combining multiple processes within geographic information system (GIS) software, which increases the required manual effort and time. In particular, calculating the empirical parameters for non-Lambertian models is complicated (Murakami, 2007) and cannot be completed with GIS software alone, so multiple software packages are needed to accomplish the necessary calculations and parameter estimations. In addition, topographic correction with commercial software often requires a valid license, which adds cost. Because topographic correction can be computationally intensive, processing time increases significantly on computers with inadequate capability. Another challenge arises when multiple images need to be corrected at once: there is currently no efficient way to process all of the images in a batch or in parallel, resulting in a multiplicative increase in the required man-hours. Because topographic correction is a pre-processing step that removes topographic effects from satellite imagery before it is used in research, the process should ideally be performed easily and with little effort on the part of the user.
These problems can be solved using the Google Earth Engine (GEE), a cloud computing platform designed to process and store data at the petabyte scale (Mutanga and Kumar, 2019). Its most important feature is its ability to streamline the entire workflow, from data search to analysis and the display of results, all of which is conducted on a cloud server. This eliminates the need for users to search for, download, and manage data individually when conducting an analysis (Kobayashi, 2018). Analyses are written in the Code Editor, a web-based integrated development environment in which users create and edit scripts using JavaScript or Python notation. Scripts can combine any sequence of processes, such as those used in GIS software, so all of the processing necessary for correction can be written at once. Once a script is created, subsequent runs require only pressing the RUN button. In addition, the GEE does not require the purchase of a license: it is free to Google account holders for noncommercial use.
We automated topographic correction using the GEE as a platform, specifically the C and SCS+C methods. We developed a script in which the user specifies the target path and row, period of acquisition, and the Normalized Difference Vegetation Index (NDVI) threshold required for correction. Then the desired topographic correction is realized based on these inputs. Verification of accuracy was performed to evaluate the acceptability of the correction.
Data provided by the GEE can be obtained from the Data Catalog (https://developers.google.com/earth-engine/datasets). Each dataset listed in the Data Catalog contains the source, supplementary information, and a description of the data. The data can be used by writing the script described in the Earth Engine Snippet on the code editor. All data used in this study were obtained from the Data Catalog.
Satellite Data

Landsat 8 was the source of the data to be corrected. The Collection 2, Level 2, Tier 1 products were selected. The United States Geological Survey (USGS) introduced a data-management category called "collection" to organize the Landsat archive starting in 2016. Collection 1 reorganized the file names and metadata structure to ensure consistent quality over time and across sensors. Collection 2 introduced further changes to the preprocessing of the data and improvements in data access and distribution performance (USGS, 2020). "Level" expresses the degree of processing applied to the data: Level 1 processing aims to improve geometric and radiometric accuracy, while Level 2 adds atmospheric correction to the Level 1 processing. "Tier" is another indicator of data quality. The three Tier categories are Real-Time (RT), Tier 1 (T1), and Tier 2 (T2): RT data are provided immediately after observation, T1 contains the highest-quality data available, and T2 contains data that do not meet T1 quality standards, such as nighttime or cloudy scenes (USGS, 2022).
In the GEE scripting environment, when working with collection data, it is common to select multiple images at once based on specific criteria. In this study, the image with the smallest cloud percentage during the user-specified period was selected for correction. Potential uses of the corrected images include forest-type classification and the monitoring of changes in forests. Among the Landsat 8 bands, the visible, near-infrared, and shortwave-infrared regions are particularly suitable for such applications (Vogelmann et al., 2009; Davies et al., 2016). Therefore, bands 2 through 7 were extracted from the acquired data and corrected. The path and row of an image can be found on the USGS website (https://landsat.usgs.gov/landsat_acq#convertPathRow).
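In the actual GEE script this selection is a filter on the image collection followed by a sort on the CLOUD_COVER property. The underlying logic can be sketched in plain Python over hypothetical scene metadata (the scene IDs and cloud percentages below are invented for illustration, not values from the study):

```python
from datetime import date

# Hypothetical scene metadata; in the GEE the equivalent is an
# ee.ImageCollection filtered by WRS path/row and date, then sorted
# by the CLOUD_COVER property. IDs and percentages are invented.
scenes = [
    {"id": "LC08_108034_20200601", "date": date(2020, 6, 1), "cloud_cover": 41.2},
    {"id": "LC08_108034_20200826", "date": date(2020, 8, 26), "cloud_cover": 3.5},
    {"id": "LC08_108034_20201013", "date": date(2020, 10, 13), "cloud_cover": 17.8},
]

def least_cloudy(scenes, start, end):
    """Return the scene with the smallest cloud percentage within [start, end]."""
    in_period = [s for s in scenes if start <= s["date"] <= end]
    return min(in_period, key=lambda s: s["cloud_cover"]) if in_period else None

best = least_cloudy(scenes, date(2020, 1, 1), date(2020, 12, 31))
print(best["id"])  # the August scene, with 3.5% cloud cover
```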
Largescale International Boundary Polygons (LSIB2017) Dataset

The LSIB2017 is a polygon dataset, provided by the Office of the Geographer and Global Issues at the U.S. Department of State, that represents land areas by country. These data were used to clip (cut out) the uncorrected images and DEMs to reduce the processing load for the correction; ocean areas, which were not subject to correction, were thereby excluded.
DEMs

The DEMs used in this study were generated by the Shuttle Radar Topography Mission (SRTM). Several types of SRTM-derived DEM are provided by the GEE, but we used the "NASA SRTM Digital Elevation 30m" dataset provided by the USGS. This DEM is a joint effort of the National Aeronautics and Space Administration (NASA) and the National Geospatial-Intelligence Agency (NGA), with the participation of the German Aerospace Center (DLR) and the Italian Space Agency (ISA) (USGS, 2015). The resolution of the data is 30 m.
Correction Methods

The C and SCS+C methods were selected to correct for topographic effects. The C method, developed by Teillet et al. (1982), incorporates an empirical parameter into the equation to resolve the overcorrection that the COS method (Smith et al., 1980) produces at large incidence angles. The SCS+C method, developed by Soenen et al. (2005), incorporates an empirical parameter in the same way to resolve the tendency of the SCS method (Gu and Gillespie, 1998) to overcorrect. The COS, SCS, C, and SCS+C methods are shown in equations (1)–(4), respectively.
\[ L_{\mathrm{corr},\lambda} = L_{\lambda}\,\frac{\cos\theta_{s}}{\cos\gamma_{i}} \tag{1} \]
\[ L_{\mathrm{corr},\lambda} = L_{\lambda}\,\frac{\cos\theta_{s}\cos s}{\cos\gamma_{i}} \tag{2} \]
\[ L_{\mathrm{corr},\lambda} = L_{\lambda}\,\frac{\cos\theta_{s} + C_{\lambda}}{\cos\gamma_{i} + C_{\lambda}} \tag{3} \]
\[ L_{\mathrm{corr},\lambda} = L_{\lambda}\,\frac{\cos\theta_{s}\cos s + C_{\lambda}}{\cos\gamma_{i} + C_{\lambda}} \tag{4} \]
where Lcorr,λ is the corrected digital number (DN) for band λ, Lλ is the original DN, θs is the solar zenith angle, γi is the solar incidence angle, Cλ is the empirical parameter for band λ (the ratio of the intercept to the slope of a regression of the uncorrected DN values on the cosine of the solar incidence angle), and s is the slope angle.
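The estimation of Cλ and the corrections of equations (3) and (4) can be sketched in Python with NumPy rather than the GEE API; the synthetic DN values, the 35-degree solar zenith, and the uniform 15-degree slope below are illustrative stand-ins for real Landsat pixels:

```python
import numpy as np

def c_parameter(cos_gamma, dn):
    """Empirical parameter C = intercept / slope of the regression
    dn = slope * cos_gamma + intercept (computed once per band)."""
    slope, intercept = np.polyfit(cos_gamma, dn, 1)
    return intercept / slope

def c_correction(dn, cos_gamma, cos_theta_s, c):
    """C method, equation (3)."""
    return dn * (cos_theta_s + c) / (cos_gamma + c)

def scs_c_correction(dn, cos_gamma, cos_theta_s, cos_slope, c):
    """SCS+C method, equation (4)."""
    return dn * (cos_theta_s * cos_slope + c) / (cos_gamma + c)

# Synthetic band: DN rises linearly with illumination, as the model assumes.
rng = np.random.default_rng(0)
cos_gamma = rng.uniform(0.2, 1.0, 500)
dn = 20000.0 * cos_gamma + 5000.0 + rng.normal(0.0, 200.0, 500)

cos_theta_s = np.cos(np.radians(35.0))   # illustrative solar zenith of 35 deg
c = c_parameter(cos_gamma, dn)           # close to 5000 / 20000 = 0.25 here
corr_c = c_correction(dn, cos_gamma, cos_theta_s, c)
corr_scs = scs_c_correction(dn, cos_gamma, cos_theta_s,
                            np.cos(np.radians(15.0)), c)  # uniform 15 deg slope

# The dependence on cos(gamma_i) should largely vanish after correction.
print(np.corrcoef(cos_gamma, dn)[0, 1],
      np.corrcoef(cos_gamma, corr_c)[0, 1],
      np.corrcoef(cos_gamma, corr_scs)[0, 1])
```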
The solar zenith angle was calculated by subtracting the solar altitude, obtained from the Landsat metadata, from 90 degrees. The empirical parameter was calculated for each band as the ratio of the intercept to the slope of a regression of the uncorrected DN values on the cosine of the solar incidence angle. To restrict the target area to forested areas, the regression was conducted only on pixels with an NDVI greater than 0.35 and a slope angle greater than 5 degrees. The solar incidence angle is the angle between the solar rays and the normal to the ground surface and is defined by four factors: the slope angle of the terrain, the slope azimuth angle, the solar zenith angle, and the solar azimuth angle. The solar incidence angle was calculated as follows:
\[ \cos\gamma_{i} = \cos\theta_{s}\cos s + \sin\theta_{s}\sin s\,\cos(a' - a) \tag{5} \]
where a is the slope azimuth and a' is the solar azimuth. The solar azimuth angle was obtained from the Landsat metadata. The slope angle and slope azimuth were derived from the DEM, and all values were converted from degrees into radians. Flowcharts of the correction are shown in Figures 1 and 2.
Flowchart of C method.
Flowchart of SCS+C method.
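Equation (5) can be checked numerically with a small Python function; the solar azimuth of 139 degrees matches the example image, while the 55-degree solar altitude and 20-degree test slope are assumed values for illustration. In the script the same computation is done per pixel on the DEM-derived slope and aspect layers.

```python
import math

def cos_incidence(slope, aspect, solar_zenith, solar_azimuth):
    """Equation (5): cos(gamma_i) = cos(theta_s)cos(s)
    + sin(theta_s)sin(s)cos(a' - a). All angles in radians."""
    return (math.cos(solar_zenith) * math.cos(slope)
            + math.sin(solar_zenith) * math.sin(slope)
            * math.cos(solar_azimuth - aspect))

theta_s = math.radians(90.0 - 55.0)   # zenith = 90 deg minus an assumed 55 deg altitude
sun_az = math.radians(139.0)          # solar azimuth, as in the example image

# Flat terrain: the incidence angle equals the solar zenith angle.
flat = cos_incidence(0.0, 0.0, theta_s, sun_az)
# A 20-degree slope facing the sun is lit more directly than flat ground,
sunlit = cos_incidence(math.radians(20.0), sun_az, theta_s, sun_az)
# while the same slope facing away from the sun is lit less directly.
shaded = cos_incidence(math.radians(20.0), sun_az + math.pi, theta_s, sun_az)
print(flat, sunlit, shaded)
```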
There is no widely established method to verify the efficiency of topographic correction. Here, we used the variation in the correlation coefficient (Gao et al., 2014) and the p-value obtained from the cosine of the solar incidence angle and the DN value of each band, which have generally been used in the past. First, 1000 pixels with an NDVI greater than 0.35 and a slope angle greater than 5 degrees were randomly selected from each of the images before and after correction. Then, to exclude noisy data, pixels with DN values in the top 5% and bottom 5% of each band were excluded (i.e., 900 points remained). Correlation coefficients and p-values were calculated from the DN values and the cosines of the solar incidence angles of these 900 points. The uncorrected image showed a positive correlation; if the correction is performed correctly, the correlation is expected to be smaller in the corrected image than in the uncorrected image.
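The trimming and correlation steps of this verification can be sketched in Python; synthetic data stand in for the sampled pixels, and NumPy's corrcoef is used for Pearson's r (the p-value computation is omitted in this sketch):

```python
import numpy as np

def trimmed_correlation(dn, cos_gamma, trim=0.05):
    """Drop pixels whose DN is in the top or bottom 5%, then return
    Pearson's r between the remaining DN values and cos(gamma_i),
    along with the number of points kept."""
    lo, hi = np.quantile(dn, [trim, 1.0 - trim])
    keep = (dn >= lo) & (dn <= hi)
    return np.corrcoef(dn[keep], cos_gamma[keep])[0, 1], int(keep.sum())

# Synthetic stand-in for 1000 randomly sampled uncorrected pixels.
rng = np.random.default_rng(1)
cos_gamma = rng.uniform(0.2, 1.0, 1000)
dn = 15000.0 * cos_gamma + rng.normal(0.0, 1000.0, 1000)

r, n = trimmed_correlation(dn, cos_gamma)
print(n, r)   # about 900 points remain, with a strong positive r
```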
Three areas of Japan with different latitudes and longitudes were selected as target sites for verification (Figure 3). The acquisition period for the validation images was from January 1, 2020, to December 31, 2020, and each selected image was the one with the lowest cloud cover during that period.
Images and locations used for accuracy verification.
The source code created for this study is shown in the Appendix. The Landsat path and row, the image-acquisition period, the country name of the LSIB, the NDVI threshold used for the regression analysis, and the center latitude and longitude of the display screen are set as variables at the beginning of the code, so even users who do not fully understand the code's internals can perform corrections. To export the corrected image, the commented-out section at the end of the code is restored, and a geometry is created around the target image on the map. Pressing the RUN button then creates a new task in the Tasks tab at the top of the screen, from which the download task can be executed.
In the Appendix, Part A consists of a "test" function that we created independently. When working with the GEE, two separate methods are normally required to display output: Map.addLayer(variable) to show results on the map and print(variable) to show metadata in the console. We therefore created a test function that combines these steps, so that calling test(variable) displays the metadata and the output results simultaneously. Part B presents a function that computes the empirical parameters and applies the correction model to the uncorrected image. To calculate the empirical parameters for each band, an explicit "for" loop was not used. Instead, an array containing the names of each band (referred to as "useBand" in the script) was created, and the higher-order function map() was applied to it, following the Functional Programming Concepts tutorial (https://developers.google.com/earth-engine/tutorials/tutorial_js_03) in the GEE Guide. Part C provides the detailed settings for displaying the output results on the map. Here, the minimum and maximum values for image stretching were set to 7000 and 32000, respectively, but this is only an example. If the corrected image appears too dark or too bright, the user can adjust the tone by setting the range in the Layer Manager to 100%.
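The per-band mapping pattern used in Part B can be illustrated in Python, replacing an index-based loop with map() over a list of band names. The band names and the per-band regression results below are placeholders, not values from the actual script:

```python
# Per-band processing without an explicit for loop, mirroring the GEE
# pattern of mapping a function over a list of band names.
use_bands = ["SR_B2", "SR_B3", "SR_B4", "SR_B5", "SR_B6", "SR_B7"]

# Hypothetical per-band (slope, intercept) regression results.
regressions = {
    "SR_B2": (18000.0, 4500.0), "SR_B3": (17500.0, 4300.0),
    "SR_B4": (19000.0, 5200.0), "SR_B5": (26000.0, 6100.0),
    "SR_B6": (21000.0, 5600.0), "SR_B7": (16000.0, 4100.0),
}

def empirical_c(band):
    """C parameter for one band: regression intercept / slope."""
    slope, intercept = regressions[band]
    return band, intercept / slope

# map() applies empirical_c to every band name; no index bookkeeping needed.
c_params = dict(map(empirical_c, use_bands))
print(c_params["SR_B2"])  # 4500 / 18000 = 0.25
```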
Comparison of Images Before and After Correction

Figure 4 shows the images that were used to verify accuracy, before and after correction. The images are color composites with band 6 (SWIR1) assigned to R, band 5 (NIR) to G, and band 4 (red) to B. Comparing the uncorrected and corrected images, it is apparent that shadows have been removed from the latter. Comparing the C and SCS+C methods for each image, few differences can be observed visually.
A zoomed-in view of part of the image for path 108, row 34 (Sea of Japan side of Honshu). Acquired on August 26, 2020. R:G:B = band 6:band 5:band 4. (a) uncorrected, (b) C method, (c) SCS+C method.
The scatter plots of the DN values and the cosines of the solar incidence angles used to calculate the correlation coefficients are shown in Figure 5. The calculated correlation coefficients and p-values are shown in Figure 6. The corrected correlation coefficients are much lower than the uncorrected ones, indicating that the correction was appropriate. This is particularly apparent for path 108, row 34, where all corrected values are less than 0.1, and for path 106, row 30, where the values for bands 4, 6, and 7 are all less than 0.1.
Scatter plots of DN and cosine values of solar incidence angle used for accuracy verification. Acquired on August 26, 2020. Bands 2 to 7. (a) uncorrected, (b) C method, (c) SCS+C method.
Correlation coefficients between DN per band and cosine values of solar incidence angle for areas with NDVI > 0.35 and slope > 5°.
Comparing the correlation coefficients of the C and SCS+C methods, the C method had smaller values for the path 106 and row 30 images, except in band 7. In the path 108 and row 34 images, the SCS+C method had smaller values for bands 2 and 4, but the C method had smaller values for the other bands. In the images of path 112 and row 37, all bands had smaller values for the SCS+C method.
To analyze variations in the correction effect by slope azimuth, the nine images to be evaluated were first masked using the NDVI (0.35) and slope angle (5 degrees) thresholds. The masked images were then divided into groups at 30-degree intervals of slope azimuth, and the median DN value of each group was calculated. The results are shown in Figure 7. Before the correction there was noticeable variation in the median values across slope azimuths, particularly for bands 5 and 6; after the correction, the differences between medians decreased and the plotted lines flattened. Bands 2, 3, 4, and 7 showed less variation than bands 5 and 6 before the correction, and after the correction their median differences were also smaller, leaving little difference between sunny and shaded slopes.
Relationship between slope azimuth and median DN. The solar azimuth is 139 degrees. Path 108, row 34 (Sea of Japan side of Honshu). Acquired on August 26, 2020. (a) uncorrected, (b) C method, (c) SCS+C method.
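The azimuth-binning analysis above can be sketched in Python; the synthetic DN values below simply encode a brighter sun-facing side, with the 139-degree solar azimuth taken from the example image:

```python
import numpy as np

def azimuth_medians(aspect_deg, dn, bin_width=30):
    """Median DN per slope-azimuth bin (0-30, 30-60, ..., 330-360 degrees)."""
    bins = (aspect_deg % 360) // bin_width
    return {int(b): float(np.median(dn[bins == b])) for b in np.unique(bins)}

# Synthetic uncorrected band: brighter on slopes facing the sun
# (azimuth 139 degrees), darker on slopes facing away from it.
rng = np.random.default_rng(2)
aspect = rng.uniform(0.0, 360.0, 5000)
dn = (15000.0 + 4000.0 * np.cos(np.radians(aspect - 139.0))
      + rng.normal(0.0, 300.0, 5000))

medians = azimuth_medians(aspect, dn)
# The brightest bin should be the one containing the solar azimuth (120-150 deg).
print(max(medians, key=medians.get))
```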
The results confirm that the correction was appropriate for both methods. The visual assessment results show that shading caused by topographic ruggedness was substantially reduced in the corrected data (Figure 4). The correlation between DN values and the cosine values of solar incidence angles, which existed before correction, was no longer significant after correction (Figures 5 and 6). The variation in DN values according to slope orientation was also virtually unobservable after the correction (Figure 7). These results confirm that the correction model developed using the GEE worked effectively.
These corrections can be performed using only a few parameters. The user has no need to download data, nor is any complicated preprocessing required. Moreover, only a short time is required to obtain the results, which is a significant improvement compared to the time taken for conventional topographic corrections. This has all been made possible through the existence of the GEE itself, but the code developed here adds considerable value.
The correction scripts created in this study will need to be modified as technology develops. For example, Landsat 9 data are available from 2022, and when using the script created in this study to make corrections to Landsat 9 images, it will be necessary to modify the data loading and metadata access. However, even in this case, only a portion of the script will need to be changed, and no modification to the fundamental algorithm will be required.
Comparison of the C and SCS+C Methods

As shown in Figures 5 and 6, the values of the correlation coefficients after correction varied from image to image and from band to band. This variation depends on the time of year at which the image was acquired and on the ruggedness of the topography (Park et al., 2017). Therefore, after applying either correction method, the degree of correction should be visually evaluated, and the image with the better correction should be used. The method proposed here requires little extra effort, even if both correction methods are applied. The additional time required to apply both methods is almost negligible, so there is no reason for users not to try both.
The NDVI and Slope Angle Thresholds

In an ideal scenario, the NDVI threshold would be determined automatically based on the characteristics of the image. However, there are currently no objective criteria for automatic NDVI thresholding in this method. The appropriate NDVI threshold can vary with factors such as vegetation type and the time of year at which the image was captured. Because of this, and because of the difficulty of automatically calculating a suitable value, it is recommended that the user perform the correction and search manually for an appropriate NDVI threshold by comparing the results obtained. This is why the NDVI threshold is declared at the beginning of the script as a user-settable variable.
As an application of the method using correlation coefficients described here, one possible approach could involve combining multiple steps into a single script. This script would perform the correction process for each NDVI threshold value, ranging from 0 to 1 in increments of 0.1. After correction, the correlation coefficient for each band would be calculated, and these values would be averaged to identify the image with the correlation coefficient closest to 0. However, because processing within a single file can be complex, it is still desirable for users to determine their own threshold values and make corrections.
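Such a threshold search could be sketched as follows; the correction-and-scoring function here is a placeholder standing in for a full GEE correction run, so the code illustrates only the selection logic:

```python
import numpy as np

def best_ndvi_threshold(correct_and_score, thresholds=None):
    """Run the correction for each candidate NDVI threshold and return the
    threshold whose mean per-band correlation coefficient is closest to 0."""
    if thresholds is None:
        thresholds = [round(0.1 * i, 1) for i in range(11)]  # 0.0, 0.1, ..., 1.0
    scored = {t: abs(float(np.mean(correct_and_score(t)))) for t in thresholds}
    return min(scored, key=scored.get)

def mock_correct_and_score(t):
    """Placeholder for the full correction pipeline: returns the per-band
    correlation coefficients remaining after correction with threshold t.
    Here the residual is smallest near t = 0.4, purely for illustration."""
    residual = 0.05 + (t - 0.4) ** 2
    return [residual] * 6   # six bands

print(best_ndvi_threshold(mock_correct_and_score))  # 0.4
```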
The slope angle threshold was fixed at 5 degrees (0.0872 rad) in the script because the slope angle did not vary significantly across the target images. Owing to the characteristics of the DEM, pixels in urban or cultivated areas that appear to be vegetated are sometimes extracted along with mountainous areas; however, these inherently undesirable pixels have little effect on the correction. Once extracted based on the threshold values, the pixels are converted into vector data (polygons) and used to clip the satellite image. By setting the minimum scale to 250 m during this vectorization, most of the pixels in cultivated areas and urban vegetated areas are removed, so apparent micro-slopes are largely excluded.
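The pixel-extraction step that precedes the vectorization amounts to a simple boolean mask; a minimal NumPy sketch with illustrative values:

```python
import numpy as np

# Keep only pixels with NDVI above the user threshold and slope above
# 5 degrees. In the GEE script this mask is then vectorized with a 250 m
# minimum scale, which drops most small patches of urban or cultivated
# vegetation. The four example pixels below are illustrative.
def sampling_mask(ndvi, slope_deg, ndvi_thr=0.35, slope_thr=5.0):
    return (ndvi > ndvi_thr) & (slope_deg > slope_thr)

ndvi = np.array([0.10, 0.50, 0.60, 0.80])
slope = np.array([10.0, 2.0, 30.0, 12.0])
print(sampling_mask(ndvi, slope))   # only the last two pixels qualify
```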
We developed a method for automating topographic-effect correction, together with scripts for the C and SCS+C methods. Using the GEE cloud computing platform, users can easily apply the corrections. The code, together with a readme file, is available on GitHub. Correction is achieved automatically with minimal parameter preparation, and because the GEE hosts the datasets, users can obtain corrected images without preparing any data of their own, including the DEM.
Compared to general analysis software, the GEE is more difficult to operate intuitively and requires basic programming knowledge. However, it also has many benefits: processing performance is high because execution takes place in a cloud environment; no license is required; once a script is created, the same processing can easily be repeated; and there is no time-consuming data-acquisition step. Examples of GEE applications include detecting disturbances in tropical seasonal forests (Shimizu et al., 2019) and the forest change point extraction program (FAMOST), which is being developed by the Forestry Agency (Japan Forest Technology Association, 2022). FAMOST uses NDVI values derived from optical satellite imagery of two periods to extract changes in forests, such as clearcuts. By superimposing the extracted change points on geospatial information obtained from local governments, the system can be used for a wide range of tasks, such as checking the status of logging based on a logging-notification system, early detection of illegal logging, checking the location of forest land development, and checking the occurrence of disasters (Forestry Agency, 2021). False extraction is inevitable in automatic change extraction, and shading caused by topographic effects is often the cause. If the correction method developed in this paper were incorporated into FAMOST, we expect that it would substantially reduce false extractions.
The 2006 Intergovernmental Panel on Climate Change (IPCC) Guidelines propose the gain-loss method and the stock-change method for analyzing changes in forest carbon stock; the stock-change method, which treats the difference in carbon stock between two points in time as emissions or removals, is the most widely applied in many countries (Hirata et al., 2012). A combination of remote sensing and field surveys has been found to be effective for determining the change in carbon stock at the national level (UN Climate Change, 2007). Studies have estimated carbon stocks using remote sensing with topographic correction (Clerici et al., 2016; Dabi et al., 2021), and the effectiveness of correction has been widely discussed. Therefore, the correction script developed in this study is expected to be useful for estimating changes in carbon stocks from noise-reduced data.