Background Estimation in X-ray Photoelectron Spectroscopy Data Using an Active Shirley Method with Automated Selection of the Analytical Range

In this study, we introduce a novel algorithm that can recognize the concavo-convex shapes of X-ray photoelectron spectroscopy (XPS) data and estimate the optimum background (BG) in XPS spectra with fine structures near the endpoints. In this algorithm, the active Shirley method was improved by incorporating a function for automatically selecting the analytical range. This autoselection function first investigates all the candidates for the initial endpoints. These estimates are then used to decide the BG shape according to the Shirley method. In order to exclude false-positive candidates caused by the recognition of noise peaks as small XPS peaks, the function evaluates the concavo-convex shape of the XPS spectrum after the long-period noise is removed using a smoothing process. The proposed algorithm was demonstrated to successfully estimate the optimal spectral BG from an XPS spectrum with a poor signal-to-noise ratio of about 40%.


I. INTRODUCTION
X-ray photoelectron spectroscopy (XPS) can be used to investigate the chemical bonding nature of a surface or the interface between solids. Thus, it is widely used in industrial fields such as material development and quality control [1]. In the analysis of XPS spectra, background (BG) estimation is the first step for removing the BG from an XPS spectrum. This can be done using several methods, such as the linear method, the Shirley method [2], and the Tougaard method [3]. Of these approaches, the Shirley method offers the advantage of simplicity and is the most widely used technique for analyzing XPS data [4−6]. In this method, the BG shape is estimated on the basis of the intensities of a start point and an endpoint, which are selected from the data points in the XPS spectrum. The intensity of this endpoint must be selected by an analyst.
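As a point of reference for the discussion that follows, the conventional Shirley estimate can be sketched in a few lines. This is a minimal illustration rather than the implementation used in any of the cited software: it assumes a uniform energy grid ordered by increasing binding energy, takes the first and last data points as the start point and endpoint, and uses a hypothetical function name (`shirley_background`) and hypothetical convergence parameters (`max_iter`, `tol`).

```python
def shirley_background(y, max_iter=50, tol=1e-6):
    """Iterative Shirley background (illustrative sketch).

    Assumptions: y is ordered by increasing binding energy on a uniform
    grid, and y[0] / y[-1] are the fixed start point and endpoint.
    """
    n = len(y)
    y_lo, y_hi = float(y[0]), float(y[-1])
    bg = [y_lo] * n  # start from a constant background
    for _ in range(max_iter):
        # Cumulative trapezoidal area of the BG-subtracted signal,
        # accumulated from the low-binding-energy side.
        cum = [0.0]
        for i in range(1, n):
            step = 0.5 * ((y[i - 1] - bg[i - 1]) + (y[i] - bg[i]))
            cum.append(cum[-1] + step)
        total = cum[-1]
        if total <= 0:
            break  # no net peak area; keep the previous estimate
        # The BG at each point is proportional to the peak area at lower
        # binding energy, scaled to match both endpoint intensities.
        new_bg = [y_lo + (y_hi - y_lo) * c / total for c in cum]
        delta = max(abs(a - b) for a, b in zip(new_bg, bg))
        bg = new_bg
        if delta < tol:
            break
    return bg
```

Because the two endpoint intensities enter the formula directly, any change in how the analyst picks them changes the whole BG curve, which is the dependence that the active Shirley method is designed to remove.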
In order to eliminate the need for human input, the active Shirley method [7] uses a self-consistent iterative calculation to stably estimate the XPS spectrum BG and constitutional peaks without fixing the intensities of the endpoints of the BG. Matsumoto et al. proposed an algorithm that incorporates the nonlinear least-squares method into the active Shirley method; this algorithm is implemented in the software COMPRO [8]. This method was applied to the XPS spectrum of a 10 nm thick SiO2 film on a Si substrate and was shown to stably derive the proper BG and area values of the peak components without depending on an analyst and, thereby, to evaluate the SiO2 thickness with greater reproducibility [9,10].
However, when the active Shirley method is applied to XPS spectra that exhibit structures near either of the endpoints, the BG tends to be overestimated, resulting in negative peak intensities after the BG is subtracted. This implies that the automatic tuning for estimating the intensities of the endpoints in the active Shirley method does not work effectively if the BG increases drastically near the endpoints.
Therefore, Nishizawa et al. proposed a new algorithm in which the initial intensities of the endpoints in the XPS spectrum are adjusted to suppress the negative values that occur when the BG signal crosses the XPS spectrum (i.e., when the BG intensity exceeds the spectral intensity) [11]. This algorithm was shown to be effective for XPS spectra whose endpoint intensities are large, but it cannot always avoid the occurrence of a negative region. As an example of the limitations of this algorithm, Figure 1 shows the Cu 2p XPS spectra of Bi2Sr2CaCu2Oy superconducting single crystals cleaved under high vacuum [12−15]. The spectra were obtained with excitation by 4600-eV synchrotron radiation and by Al Kα X-rays [Figure 1(a, b)]. The Bi 4s peak overlaps the estimated BG at 940 eV more significantly in the XPS spectrum measured with 4600-eV synchrotron radiation [Figure 1(a)] than in that measured with Al Kα X-rays [Figure 1(b)]. Thus, in the XPS spectrum measured using the 4600-eV X-ray excitation [Figure 1(a)], the intensity increased nonmonotonically around each minimum point (marked with ▽ in the figure). In this case, Nishizawa's algorithm yielded a negative region when the active Shirley method with autotuning of the initial endpoints was applied to the whole range of the Cu 2p XPS spectrum (923−972 eV). On the other hand, in the XPS spectrum measured with Al Kα X-rays [Figure 1(b)], the Bi 4s peak intensity at 940 eV was smaller and the intensity increased almost monotonically around each minimum point (marked with ▼ in the figure). The difference in the relative intensity of the Bi 4s peak to the Cu 2p peak is due to the dependence of the photoionization cross sections on the X-ray energy [16]. In the Al Kα X-ray case, no negative region appears when Nishizawa's algorithm is applied.
The problem associated with the Shirley method is that the shape of the estimated BG differs depending on how the endpoint is selected by the analyst. Therefore, an active Shirley method that automatically provides stable solutions is highly desirable. However, as shown in previous studies, this approach is not effective for certain spectra with complex or unique structures in the middle region, resulting in an estimated BG that crosses the spectrum. Hence, in order to solve this problem, we propose a novel approach in which all of the minimum points included in the spectrum are detected and used as the endpoints of several candidate BGs to facilitate a round-robin BG estimation.

A. Estimation of promising BG candidates
The new algorithm is based on the empirical, manual BG estimation performed by an analyst. In this procedure, the analyst first focuses on the main peak of the inner shell level and the surrounding incidental peaks and then determines a region that includes these peaks as the analytical range. After that, the BG is estimated using the Shirley method. When the estimated BG intersects the spectrum, the energy range is narrowed until it excludes the surrounding incidental peaks. Then, a new endpoint is selected and the BG estimation is repeated accordingly.
Figure 1: BG estimations obtained by the active Shirley method with a function for autotuning the initial endpoints [11] for the Cu 2p XPS spectra obtained with (a) 4600-eV synchrotron radiation X-ray excitation and (b) Al Kα X-ray excitation. In the Cu 2p XPS spectrum, the Bi 4s peak overlaps the estimated BG around 940 eV. Local minimum points are marked with ▽ to denote that the spectrum increases nonmonotonically or with ▼ to denote that the spectrum increases monotonically as the binding energy increases.
The new algorithm follows this trial-and-error approach, tuning the analytical range in order to estimate the optimum BG. As shown in Figure 2, the extreme values of the spectrum are first identified, and the BG candidates are derived by estimating a different BG passing through each of the local minimum points. Every BG candidate is derived from a single BG coefficient, that is, one constant ratio of the BG intensity to the peak area over the full energy range of the XPS spectrum. The optimum BG is then selected from these candidates and subsequently utilized as the initial BG to optimize the peak separation and perform BG estimation using the active Shirley method. The procedure for selecting the optimum BG is as follows:
(1) The BG candidates that intersect the XPS spectrum and generate a negative area are eliminated. The ratio of the negative area (blue color in Figure 3) to the positive area (red color in Figure 3) should be as small as possible. A ratio of up to 0.5% was tolerated because the measured spectrum is generally accompanied by noise. The threshold value of 0.5% was determined empirically in a case study of about 160 XPS spectra, including Fe 2p and Cu 2p spectra, obtained from 16 different types of elements.
(2) Of the BG candidates satisfying condition (1), the one that has the largest BG area intensity is selected. This second condition is imposed to ensure that the BG estimation is reproducible and that the number of separated peaks is small. However, when the XPS data is noisy, the peak separation may not be uniquely determined [17]. Therefore, in this study, the Bayesian information criterion (BIC) [18,19] is used as an index to narrow down the results of multiple peak separation considering both the complexity and the error of the model; thus, the peak separation result with the lowest BIC value was selected. The use of the BIC will be reported in another article [20].
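Conditions (1) and (2) can be sketched as a simple filter followed by a maximum search. The sketch below is illustrative only: it assumes that every BG candidate has already been evaluated on the same uniform energy grid as the spectrum, approximates areas by plain sums over the data points, and uses hypothetical names (`area_ratio`, `select_optimum_bg`, `max_ratio`). The BIC-based choice of the number of peaks is not included.

```python
def area_ratio(spectrum, bg):
    """Ratio of negative to positive area of (spectrum - bg).

    Areas are approximated by sums over a uniform grid (an assumption
    of this sketch, not a statement about the original software).
    """
    pos = 0.0
    neg = 0.0
    for s, b in zip(spectrum, bg):
        d = s - b
        if d >= 0:
            pos += d
        else:
            neg -= d
    return neg / pos if pos > 0 else float("inf")


def select_optimum_bg(spectrum, candidates, max_ratio=0.005):
    """Return the index of the selected BG candidate, or None.

    Condition (1): discard candidates whose negative/positive area
    ratio exceeds max_ratio (0.5% in the text).
    Condition (2): among the survivors, take the largest BG area.
    """
    best_index = None
    best_area = float("-inf")
    for k, bg in enumerate(candidates):
        if area_ratio(spectrum, bg) > max_ratio:
            continue
        area = sum(bg)
        if area > best_area:
            best_index, best_area = k, area
    return best_index
```

For instance, with a toy spectrum `[10, 10, 50, 12, 11, 30, 14]`, a candidate that rises above the spectrum at one point by a single count is rejected because its area ratio (1/56) exceeds 0.5%, and the larger of the two remaining non-crossing candidates is returned.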

B. Initial BG selection from candidates
In manual BG estimation, even if noise peaks remain after the smoothing process, an expert analyst can recognize the endpoint of the peak by empirically separating real peaks from noise peaks. However, a new algorithm is needed to automatically select the endpoint from the candidates (denoted by • in the inset of Figure 4). From the physical viewpoint of XPS spectra, the long-period structures in Figure 4 may be real peaks rather than noise peaks. However, for the technical purposes of this BG estimation, they are conveniently categorized as noise peaks.
In general, the short-period spectral noise occurring at every energy step can be reduced by smoothing; however, this does not effectively remove the long-period noise shown in Figure 4. Here, long-period noise means the long-period concavo-convex structure that remains after smoothing. Long-period noise is observed particularly in regions that are far away from the peaks and where the spectrum is flat (like Region A in Figure 4). Thus, the long-period noise cannot be fully removed by moderate smoothing, and over-smoothing to fully remove it would obscure the actual spectrum.
First, in this study, the spectrum was smoothed using the Savitzky-Golay method [21,22] to reduce the short-period noise and thereby the number of endpoint candidates. Second, the candidates arising from the remaining long-period noise (as shown in Figure 5) were reduced by a peak search algorithm, which avoids over-smoothing and reduces the number of trial-and-error iterations needed for the round-robin BG estimation. Here, the peak search means finding a visible set of two local minimum points and the local maximum point between them, because the local minimum points of a tiny noise peak are negligible for BG estimation. As shown in Figure 4, all minimum and maximum points derived from the long-period noise are identified in the region far from the peak where the spectral shape is flat (Region A in the figure). When the intensity difference between a local maximum and a local minimum is smaller than a threshold, they are regarded as long-period noise and the minimum point is excluded from the endpoint candidates used for the Shirley method. Note that, in Region A of Figure 4, all candidates will be excluded if the amplitudes of all the long-period noise structures are lower than the threshold, as shown in Figure 5(c). In this case, the endpoint is indefinite and the BG cannot be properly estimated. To avoid this problem, two local minimum points on either side of a local maximum point must be kept; we therefore introduced a specific algorithm, the details of which are given in the appendix.
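The two-step reduction of endpoint candidates might be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of the appendix: the smoothing uses the standard 5-point quadratic Savitzky-Golay weights (−3, 12, 17, 12, −3)/35 and leaves the two points at each edge unchanged, a local minimum is kept only if at least one adjacent local maximum exceeds it by the threshold, and the safeguard that retains two minima across a maximum when everything falls below the threshold is deliberately omitted. The names `savgol5`, `local_extrema`, and `endpoint_candidates` are hypothetical.

```python
SG5 = (-3.0, 12.0, 17.0, 12.0, -3.0)  # 5-point quadratic Savitzky-Golay weights


def savgol5(y):
    """Smooth y with a 5-point quadratic Savitzky-Golay filter.

    The two points at each edge are left unchanged (a simplification).
    """
    out = [float(v) for v in y]
    for i in range(2, len(y) - 2):
        out[i] = sum(c * y[i + k - 2] for k, c in enumerate(SG5)) / 35.0
    return out


def local_extrema(y):
    """Return (minima, maxima) as lists of interior indices."""
    minima, maxima = [], []
    for i in range(1, len(y) - 1):
        if y[i] < y[i - 1] and y[i] <= y[i + 1]:
            minima.append(i)
        elif y[i] > y[i - 1] and y[i] >= y[i + 1]:
            maxima.append(i)
    return minima, maxima


def endpoint_candidates(y, threshold):
    """Keep local minima that are separated from an adjacent local
    maximum by at least `threshold` in intensity; the rest are treated
    as noise and dropped."""
    minima, maxima = local_extrema(y)
    kept = []
    for m in minima:
        left = [p for p in maxima if p < m]
        right = [p for p in maxima if p > m]
        heights = []
        if left:
            heights.append(y[left[-1]] - y[m])
        if right:
            heights.append(y[right[0]] - y[m])
        if heights and max(heights) >= threshold:
            kept.append(m)
    return kept
```

In use, one would call `endpoint_candidates(savgol5(counts), threshold)` and pass the surviving indices to the round-robin BG estimation; the appropriate threshold depends on the noise level of the data and is not specified here.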
Using the above two-step algorithm, the local minima that are attributed to short-period noise or long-period noise are excluded from the endpoint candidates, thereby reducing the number of trial-and-error iterations that are needed to estimate the initial BG. Figure 6 shows the BG estimations and peak separations obtained using (a) Nishizawa's algorithm and (b) the proposed algorithm for the Cu 2p XPS spectrum of the cleaved surface of a Bi2Sr2Can−1CunOy high-temperature superconducting single crystal. XPS measurements were obtained using an X-ray photoelectron spectrometer (Ulvac-Phi Inc. DAPHNIA). Synchrotron radiation with 4600 eV was used as the X-ray source. In these XPS spectra, a small Bi 4s peak overlapped the estimated BG signal near 940 eV and a negative region appeared around 947 eV when the BG was estimated using Nishizawa's algorithm [ Figure 6(a)]. This is because the optimization was carried out using the intensity of the local minimum point closest to the endpoint as the initial intensity.

III. RESULTS AND DISCUSSION
In Figure 6(a), the local minimum point closest to the endpoint on the high-binding-energy side was located around 965 eV. Using the active Shirley method, the endpoint intensity on the high-binding-energy side is automatically tuned to that of this minimum point prior to the BG estimation and peak separation. However, since there is another local minimum point around 947 eV, at which the spectral intensity drastically decreases, the estimated BG intersects with the XPS spectrum, giving rise to a negative region. On the other hand, as shown in Figure 6(b), when the new algorithm was applied to the same spectrum, the proper BG was successfully estimated without generating a negative region. This is because all of the minimum points in the spectrum are detected and a BG passing through each local minimum point is derived. Hence, the BG with the greatest intensity but with the smallest negative region is selected as the optimum BG. The BG automatically estimated using the proposed algorithm overlaps with the BG estimated in the manually limited range of 923−947 eV. Expert analysts empirically understand how to choose the minimum point and limit the energy range to avoid a negative region, whereas the proposed algorithm chooses the best condition automatically. This automation is very useful for estimating the BGs of a large series of XPS spectra that show drastic peak deformation or peak shift, such as three-dimensional spectra and time-resolved spectra.
Incidentally, although the BG estimated by the new algorithm shows a low intensity around the endpoint of the analytical range, this is not an erroneous underestimation but a consequence of using a single BG coefficient for the XPS spectrum. To avoid this underestimation, one can set multiple BG coefficients for segmented ranges of an XPS spectrum, but this segmentation must be done manually. In this work, we simply used one BG coefficient over the full energy range to attain an automatic analysis. Because our target is the automatic analysis of large numbers of XPS spectra, the requirements for our algorithm are as follows: (1) freedom from negative areas, (2) no need to adjust the analytical range, and (3) high reproducibility. The proposed method is especially suitable for XPS analyses that focus on these requirements.
The number of peaks around 941 eV differs between Figures 6(a) and 6(b). The BIC approach clarifies that the best-fitted peaks depend on the spectral noise and the BG shape. Figure 7 shows the BGs and peak separations estimated by applying (a) Nishizawa's algorithm and (b) the proposed algorithm to the Ca 2p XPS spectrum of CaCO3 powder that was heated to 1200°C in air. This spectrum exhibits several small peaks caused by the plasmon-loss phenomenon [23] in the high-binding-energy range of 350−358.5 eV.
In addition, the signal-to-noise ratio of this signal is about 7%, which is larger than that of the Cu 2p XPS spectrum shown in Figure 6 (the signal-to-noise ratio is about 4%). Figure 7 shows that when the BG was estimated by Nishizawa's algorithm, a negative region was generated around 350−352 eV. Here, the minimum point closest to the endpoint of the spectrum was identified at about 356 eV; however, this point was associated with a small peak caused by the plasmon-loss phenomenon. Thus, the endpoint intensity of the BG estimation is effectively lowered to the intensity of the local minimum point at 356 eV before optimization using the active Shirley method. However, since there is a minimum point with a smaller intensity at 352 eV, the estimated BG unintentionally crosses the XPS spectrum.
On the other hand, it can be seen that the BG estimated using the new algorithm does not result in a negative region, demonstrating the effectiveness of the new algorithm for automated detection of the outline of the whole spectrum and optimum BG estimation, even for spectra that exhibit subpeaks at the ends of the peaks. Moreover, this algorithm was effective for a noisy XPS spectrum with a signal-to-noise ratio of about 7%.
In addition, the robustness of the BG estimation in the presence of spectral noise was investigated. Figure 8 shows the result of the BG estimation and peak separation using the new algorithm for a spectrum with significant noise (signal-to-noise ratio: about 40%). For the analysis, a Cu 2p XPS spectrum was artificially generated by a linear combination of Cu+ and Cu2+ XPS spectra. Here, the BG of the artificial spectrum was assumed to be generated in proportion to the area under the XPS peak according to the Shirley method. In addition, Poisson noise was added to the artificial spectrum. The intensity (I) of the endpoint (at 973 eV) was 48 counts, and the standard deviation of the Poisson noise (√I) was 6.9; the noise was approximated by a Gaussian distribution.
The results show that the estimated BG intensity agrees with the true BG within an error of ±0.3√I (corresponding to an error ratio of 4.3% or less). Therefore, it was concluded that the proposed algorithm can automatically and accurately extract the top and endpoints of each peak even in the presence of high levels of noise. Furthermore, the estimated BG had almost the same shape as that of the true BG because of the detection of the outline of the entire spectrum. Thus, it was further concluded that this algorithm is effective for BG estimation in very noisy spectra.
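The quoted noise figures follow directly from Poisson counting statistics, as the short check below illustrates. The function names are hypothetical, and the only inputs are the endpoint intensity of 48 counts and the factor of 0.3 stated in the text.

```python
import math


def poisson_sigma(counts):
    """Standard deviation of Poisson-distributed counts."""
    return math.sqrt(counts)


def relative_error(k, counts):
    """Error of k standard deviations expressed as a fraction of counts."""
    return k * poisson_sigma(counts) / counts


sigma = poisson_sigma(48)        # about 6.9 counts
ratio = relative_error(0.3, 48)  # about 0.043, i.e. 4.3%
```

That is, 0.3 × √48 is roughly 2.1 counts, which is about 4.3% of the 48-count endpoint intensity.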
In addition, the BIC was introduced in this algorithm in order to imitate the tendency to select fewer peaks in manual peak separation for XPS spectra. Generally, in noisy spectra, the spectral shape can easily be altered by the smoothing process. Notably, the number of separated peaks must inevitably increase in order to reproduce the fine structures that appear in the spectrum after the smoothing process. The result of peak separation using the active Shirley method (a mathematical optimization method) strongly depends on the initial number of separated peaks. Therefore, the BIC is effective as an evaluation criterion for selecting the optimal number of peaks from several candidates with different initial peak numbers [17]. Figure 9 shows the relationship between the BIC value and the initial peak number in the analysis of the Cu 2p XPS spectrum shown in Figure 8 by the active Shirley method with the new algorithm. The data show that the BIC value is the lowest with an initial peak number of four (denoted by ■). Incidentally, the estimated BG and separated peaks derived using the proposed algorithm and BIC were similar to

IV. CONCLUSIONS
In this paper, we introduced a novel algorithm that can recognize the concavo-convex shape of an XPS spectrum and generate the optimum BG. This algorithm facilitates the automated analysis of XPS spectra, which is traditionally performed manually. The proposed algorithm also made it possible to select the optimum BG and to separate peaks from several candidates even if the XPS spectrum has a complicated shape and is distributed over a wide analytical range.
One step of the proposed algorithm, namely the detection of the concavo-convex shape in the spectrum, can be regarded as sparse modeling with a polygon as the model function. Meanwhile, the usual peak separation and BG estimation can be considered as sparse modeling with model functions based on the pseudo-Voigt function [24] and the Shirley method. Therefore, the proposed algorithm can be said to perform coarse sparse modeling using a polygon as a model function, followed by sparse modeling using more sophisticated model functions. Consequently, this algorithm is expected to improve the versatility of the automated analysis of XPS spectra with diverse shapes.
Figure 10: An algorithm to reduce the total number of local maximum points and local minimum points using the intensity difference between local maximum points and local minimum points.