2023 Volume 63 Issue 1 Pages 170-178
As railroad rails are an important social infrastructure, monetary and human resources are spent on their maintenance and inspection to ensure public safety and security. In particular, rails are ground uniformly at regular intervals to reduce noise and vibration during railroad operations and to avoid damage to the rail tops. However, because of cost pressures and a shortage of human resources, the damage mechanism of rails is being investigated so that grinding can be made more efficient. Factors such as passing tonnage, transit speed, and weather conditions affect rail damage; however, the details of the damage mechanism are not clear. In this study, we measured residual stresses and the full width at half maximum (FWHM) of the diffraction rings at a large number of points using a high-speed X-ray residual stress measurement system and explored the possibility of detecting abnormal areas of rails and diagnosing the signs of damage using statistical analysis methods and machine learning.
For verification, we used 2D mapping measurements of the x-axis component of the normal stress σx, the shear stresses τxy, τxz, and τyz, and the FWHM of the diffraction ring at the head of a cracked rail. The data were subjected to dimensionality reduction by principal component analysis, kernel principal component analysis, and an autoencoder. The normal model for each method was built using only data from the normal areas without cracks. The anomaly score was defined as the deviation from the normal model, and the detection accuracies of the models were compared using the area under the receiver operating characteristic curve; the autoencoder showed the best performance.
A railroad network, such as the Shinkansen, is a social infrastructure that forms the basis of people’s lives and economic activities and thus requires safe operation. Regular maintenance and inspection are therefore indispensable and require large amounts of human and monetary resources, but increasing costs and a shortage of human resources have made this difficult. As part of maintenance, rails are ground down by several hundred micrometers from the surface every year using a rail grinder to reduce vehicle vibration and running sound and to remove the damage caused by friction cracks. To optimize the amount of grinding, research based on scientific evidence has been conducted on the rolling contact fatigue mechanism of rails. Sasaki et al. reported the mechanism of rolling contact fatigue of rails by conducting triaxial stress analysis of rails using residual stresses and the full width at half maximum (FWHM) of the diffraction rings of X-rays.1) Many systematic studies have been conducted on rail damage, such as residual stress,2,3,4,5) crack propagation,6,7) and white etching layer (WEL).8,9,10,11,12) However, the details of the damage mechanism of rails have not been clarified because of the complex combination of multiple conditions, such as passing tonnage, transit speed, and weather conditions. In addition, actual situations are difficult to reproduce, even if a two-cylinder rolling test is performed, in which test specimens simulating rails and wheels are rotated at a high speed while maintaining contact. Therefore, we are conducting research to quantitatively evaluate the signs of crack initiation and the remaining life of rails using statistical analysis and machine learning, based on data on possible causes of damage obtained through previous studies.
In this study, we measured residual stresses and the FWHM of diffraction rings at a large number of points on previously used rails and investigated the possibility of detecting abnormal points through statistical analysis and machine learning. Detailed measurements were retaken on a cracked rail used by Mitsui et al.13) to compare and verify anomaly detection methods based on dimensionality reduction, namely principal component analysis (PCA), kernel principal component analysis (kernel PCA), and an autoencoder.
X-ray residual stress measurement is increasingly performed with high-speed measurement systems that use a two-dimensional X-ray detector. In our previous study, we demonstrated that residual stress can be measured in less than a second using a device based on the cosα method and silicon-on-insulator (SOI) pixel detectors.14) We used this device to obtain mapping measurements of the residual stresses and FWHM of the diffraction rings on the cracked rail and showed that areas where stress was released by cracks could be detected. In this study, the measured quantities are subjected to statistical analysis and machine learning to quantify the degree of damage to the rail, and a method to utilize this information for rail maintenance and inspection is proposed.
The residual stress measurement method using X-rays was proposed by Christenson et al. in 1953 as the sin²ψ method.15) This method requires a large stationary apparatus, such as a precise goniometric mechanism, to measure the diffracted intensities of incident X-rays from several angles. The sin²ψ method is used as a global standard; however, it is mainly limited to research purposes because a measurement takes more than 10 min and it is not suitable for large objects. In 1978, Taira et al. proposed a residual stress measurement method called the cosα method, which evaluates plane stress by observing the entire diffraction ring of diffracted X-rays with a two-dimensional X-ray detector using a single X-ray incidence.16) Furthermore, Sasaki et al. proposed the generalized cosα method in 1995, which was extended to triaxial stresses.17) In the 2000s, a portable X-ray residual stress measurement system using the cosα method became commercially available in Japan and was used by various companies and laboratories because it enabled on-site measurements. However, a smaller and faster device is required for wider on-site use and industrial applications. Therefore, in a previous study, we created an X-ray residual stress measurement system using a charge-integration-type SOI pixel detector, INTPIX4,18) developed at the High Energy Accelerator Research Organization (KEK), and demonstrated that it can measure a single point in less than a second with the same accuracy as the conventional system.13) This reduced measurement time facilitates research using a large amount of measurement data, such as mapping measurements. Further applied studies have been conducted using the developed equipment: Sasaki et al. reported triaxial stress analysis of rails,1) and Nishimura et al. reported precise measurements using synchrotron radiation.19)
In the cosα method, an X-ray beam is incident on a polycrystalline metal in one direction, and the diffraction ring formed by the diffracted X-rays according to Bragg’s law is measured using a two-dimensional X-ray detector. In the stress-free state, the diffraction ring is a perfect circle; however, when the lattice spacing of the metal atoms changes due to stress, the diffraction angle of the incident X-ray changes, and the diffraction ring is deformed. Stress can be evaluated by measuring the changes in the diffraction ring. The detailed measurement and analysis methods were the same as those used by Mitsui et al.14)
In conventional metal material evaluation, the damage is evaluated by combining measurements of the residual stresses and FWHM of diffraction rings. However, there are many influential factors in actual metal materials, and estimating the damage mechanism is sometimes impossible. In this study, we investigated the possibility of detecting abnormal positions by statistically analyzing the residual stresses and FWHM of the diffraction rings at many points using multivariate analysis. To detect abnormal positions based on many factors, such as triaxial stresses and FWHM of the diffraction rings, dimensionality reduction of the measured data was performed, and the degree of the anomaly was defined according to Hotelling’s T² to quantify the degree of damage.20,21)
Hotelling’s T² considers data 𝒟 = {x(n) | n = 1, 2, 3, …, N}, which consist of N observations in M dimensions. If 𝒟 contains zero or very few abnormal samples and each sample is assumed to follow a Gaussian distribution, the probability density function can be written as
(1) |
(2) |
(3) |
(4) |
(5) |
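The Hotelling’s T² idea described above can be illustrated with a minimal sketch. It assumes the usual textbook form, in which the anomaly score of a sample is its squared Mahalanobis distance from the sample mean of the normal data; the function name `hotelling_t2`, the synthetic data, and the use of a pseudo-inverse are illustrative assumptions and are not taken from this study.

```python
import numpy as np

def hotelling_t2(train, test):
    """Squared Mahalanobis distance of each row of `test` from the `train` data.

    `train` is assumed to contain (almost) only normal samples.
    """
    mu = train.mean(axis=0)
    # Maximum-likelihood covariance estimate (divides by N).
    sigma = np.cov(train, rowvar=False, bias=True)
    # Pseudo-inverse (an assumption) guards against a singular covariance.
    sigma_inv = np.linalg.pinv(sigma)
    diff = test - mu
    # diff_i^T * sigma_inv * diff_i for every row i.
    return np.einsum("ij,jk,ik->i", diff, sigma_inv, diff)

rng = np.random.default_rng(0)
normal = rng.normal(size=(500, 3))             # synthetic "normal" data
queries = np.array([[0.0, 0.0, 0.0],           # close to the mean
                    [6.0, 6.0, 6.0]])          # far from the mean
scores = hotelling_t2(normal, queries)
print(scores)
```

The second query lies far from the training distribution and therefore receives a much larger score than the first.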
PCA is a multivariate analysis method in which the direction of maximum variance of the multivariate data is the first principal component, and subsequent principal components are orthogonal to the previous principal components and maximize the variance to obtain a new orthonormal basis. Because principal components can be used to represent the original data with fewer variables, they are used for dimensionality reduction to reduce the number of variables in the data. When dealing with data where variables are not independent of each other, PCA can be used to reduce unnecessary dimensions and avoid multicollinearity; thus, the covariance matrix can be calculated correctly, and anomalies can be detected.22)
Consider data 𝒟 ={x(n)|n = 1, 2, 3,…N}, which consists of N observations of M-dimensional data. Assuming that 𝒟 contains zero or very few abnormal samples, the sample mean vector and covariance matrix can be written as Eqs. (3) and (4), respectively. Additionally, S is defined as follows:
(6) |
(7) |
(8) |
Assuming that the normal subspace consisting of only the data of the normal part is spanned by orthonormal bases u1,..., um, let u be one of them for simplicity. The component of x(n) along u is represented by uTx(n), and its sample mean is
(9) |
(10) |
(11) |
(12) |
(13) |
Next, suppose an orthonormal basis Um ≡ [u1,..., um] is obtained by PCA of normal data only, which spans an m-dimensional normal subspace with dimensionality reduction by solving the eigenequation. By resolving an arbitrary input x′ into its normal subspace components x′(1) and orthogonal components x′(2), we can express it as follows:
(14) |
(15) |
(16) |
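As a minimal sketch of the normal-subspace idea, the example below fits the leading principal axes to normal data only and scores a new point by the squared norm of its component orthogonal to that subspace. This reconstruction-error form is an assumption based on the usual textbook treatment; the helper names `fit_normal_subspace` and `pca_anomaly_score` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_normal_subspace(train, m):
    """Return the training mean and the top-m principal axes (columns of U)."""
    mu = train.mean(axis=0)
    centered = train - mu
    scatter = centered.T @ centered
    eigvals, eigvecs = np.linalg.eigh(scatter)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:m]        # indices of the m largest
    return mu, eigvecs[:, order]

def pca_anomaly_score(x, mu, U):
    """Squared norm of the part of (x - mu) outside the normal subspace."""
    d = x - mu
    proj = d @ U                                 # coordinates inside the subspace
    return np.sum(d * d, axis=-1) - np.sum(proj * proj, axis=-1)

rng = np.random.default_rng(0)
# Synthetic normal data lying close to the x-y plane in three dimensions.
train = np.column_stack([rng.normal(size=500),
                         rng.normal(size=500),
                         0.01 * rng.normal(size=500)])
mu, U = fit_normal_subspace(train, m=2)

queries = np.array([[1.0, 1.0, 0.0],    # inside the normal plane
                    [0.0, 0.0, 3.0]])   # 3 units off the plane
pca_scores = pca_anomaly_score(queries, mu, U)
print(pca_scores)
```

The point in the plane scores close to zero, whereas the off-plane point scores close to the square of its distance from the plane.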
Kernel PCA23) is a method in which a data matrix is nonlinearly mapped into a high-dimensional space and then subjected to PCA. A data matrix with a nonlinear distribution is mapped to a high-dimensional space to enable the linear separation of anomalous points. Here, we consider a transformation ϕ(x) that maps onto an Mϕ-dimensional feature space. Because the computational complexity of high-dimensional mapping is high, we solve this problem by replacing
Assuming that the mean of the dataset mapped into the Mϕ-dimensional feature space
(17) |
(18) |
(19) |
(20) |
(21) |
(22) |
(23) |
(24) |
(25) |
(26) |
(27) |
(28) |
(29) |
(30) |
(31) |
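The kernel PCA procedure can be sketched in the same way. The example below assumes a Gaussian (RBF) kernel and defines the anomaly score as the reconstruction error in the centered feature space; the kernel choice, the value of `gamma`, the function names, and the ring-shaped synthetic data are all assumptions made for illustration.

```python
import numpy as np

def rbf(a, b, gamma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kpca(train, m, gamma):
    K = rbf(train, train, gamma)
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one    # centering in feature space
    lam, vec = np.linalg.eigh(Kc)                 # ascending eigenvalues
    idx = np.argsort(lam)[::-1][:m]
    alpha = vec[:, idx] / np.sqrt(lam[idx])       # unit-norm feature-space axes
    return {"X": train, "K": K, "alpha": alpha, "gamma": gamma}

def kpca_anomaly_score(model, x):
    K = model["K"]
    kx = rbf(x, model["X"], model["gamma"])       # shape (n_query, N)
    kx_c = kx - kx.mean(axis=1, keepdims=True) - K.mean(axis=0) + K.mean()
    proj = kx_c @ model["alpha"]                  # projections on the m axes
    # Centered k(x, x); k(x, x) = 1 for the RBF kernel.
    kxx_c = 1.0 - 2.0 * kx.mean(axis=1) + K.mean()
    return kxx_c - np.sum(proj**2, axis=1)

rng = np.random.default_rng(0)
angles = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
ring = np.column_stack([np.cos(angles), np.sin(angles)])
ring += 0.05 * rng.normal(size=ring.shape)        # nonlinear "normal" data

model = fit_kpca(ring, m=4, gamma=2.0)
queries = np.array([[1.0, 0.0],     # on the ring
                    [5.0, 5.0]])    # far from the ring
kpca_scores = kpca_anomaly_score(model, queries)
print(kpca_scores)
```

Because the kernel matrix replaces explicit coordinates in the feature space, the score is computed from kernel evaluations alone.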
An autoencoder,25) an algorithm for dimensionality reduction using neural networks, was used to evaluate the degree of rail damage by constructing a model from only the normal region of the cracked rail and calculating the anomaly score at each measurement point.
An autoencoder is a neural network in which the number of artificial neurons in the input and output layers is the same. It is a network that has been trained to reproduce the input data in the output layer. Usually, there is one middle layer with fewer artificial neurons than the input and output layers, which forces the network to reduce the dimensionality of the input data.
The activity level of the ith neuron in layer l,
(32) |
(33) |
Architecture of an autoencoder.
When training the autoencoder, these two parameters Wij and bi are learned to minimize the following binary cross-entropy as a loss function:
(34) |
(35) |
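A minimal sketch of such an autoencoder is given below, written with NumPy only. It assumes one hidden layer, sigmoid activations, inputs rescaled to the range [0, 1] so that the binary cross-entropy is well defined, plain gradient descent, and the squared reconstruction error as the anomaly score; none of these settings, nor the function names, are taken from this study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(x, y, eps=1e-9):
    """Binary cross-entropy, summed over dimensions and averaged over samples."""
    y = np.clip(y, eps, 1.0 - eps)
    return -np.mean(np.sum(x * np.log(y) + (1.0 - x) * np.log(1.0 - y), axis=1))

def train_autoencoder(x, hidden, epochs=2000, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n, d = x.shape
    W1 = rng.normal(scale=0.1, size=(d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.1, size=(hidden, d)); b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        h = sigmoid(x @ W1 + b1)                 # encoder
        y = sigmoid(h @ W2 + b2)                 # decoder
        losses.append(bce(x, y))
        dz2 = (y - x) / n                        # dL/d(output pre-activation)
        dW2 = h.T @ dz2; db2 = dz2.sum(axis=0)
        dz1 = (dz2 @ W2.T) * h * (1.0 - h)       # back-propagate through hidden layer
        dW1 = x.T @ dz1; db1 = dz1.sum(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return (W1, b1, W2, b2), losses

def ae_anomaly_score(params, x):
    W1, b1, W2, b2 = params
    y = sigmoid(sigmoid(x @ W1 + b1) @ W2 + b2)
    return np.sum((x - y) ** 2, axis=1)          # squared reconstruction error

# Synthetic normal data on a one-dimensional curve in four dimensions.
t = np.linspace(0.0, 1.0, 50)
normal = 0.2 + 0.6 * np.column_stack([t, t, 1.0 - t, 1.0 - t])
params, losses = train_autoencoder(normal, hidden=1)

queries = np.array([[0.5, 0.5, 0.5, 0.5],    # follows the normal pattern
                    [0.9, 0.1, 0.9, 0.1]])   # violates it
ae_scores = ae_anomaly_score(params, queries)
print(losses[0], losses[-1], ae_scores)
```

The narrow hidden layer can only reproduce inputs that follow the pattern seen during training, so the second query is reconstructed poorly and receives the larger score.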
Detailed X-ray residual stress measurements were retaken on a cracked rail used by Mitsui et al.13) to validate the damage evaluation of metal by anomaly detection using PCA, kernel PCA, and autoencoder; the methods are described below.
A total of 1719 data points were measured on a 200-mm section cut from the cracked rail: nine points at 5-mm intervals over 40 mm in the shorter direction, on each of 191 lines at 1-mm intervals over 190 mm in the longer direction. The exposure time of the X-ray detector was 100 ms/frame, and the diffraction ring image was acquired by integrating 10 frames. As it took approximately 26 ms to transfer one frame, the measurement time for one point was approximately 1.26 s. The detector was in contact with a heat sink and held at 15°C by cooling water. The measurement was carried out in an X-ray-shielded box, which also served as a light shield, and the temperature inside the box was approximately 30°C. The detector was fully depleted by a bias voltage of 70 V; therefore, more than 99% of the 5.4 keV Cr-Kα characteristic X-rays used in the measurement could be detected. The X-ray tube was operated at 20 kV and 4 mA. The calibration of the measurement system was performed as in Reference 13 using commercial zero-stress and high-stress standard test pieces.
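The figures quoted above can be checked with a short calculation. The sketch assumes that each frame is exposed and then transferred in sequence and ignores the time needed to move between points, so the total is only a lower-bound estimate of the exposure and readout time; the variable names are illustrative.

```python
points_per_line = 9      # 5-mm pitch across 40 mm
n_lines = 191            # 1-mm pitch along 190 mm
frames_per_point = 10
exposure_ms = 100
transfer_ms = 26

n_points = points_per_line * n_lines
time_per_point_s = frames_per_point * (exposure_ms + transfer_ms) / 1000
total_min = n_points * time_per_point_s / 60

print(n_points)              # 1719
print(time_per_point_s)      # 1.26
print(round(total_min, 1))   # 36.1
```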
The measurements were performed in two directions, ψ0 = 0° and 35° to the vertical of the rail head, and the x-axis component of the residual stress σx, the shear stresses τxy, τxz, and τyz, and the FWHM of the diffraction ring were measured. There were nine measurement points per line in the shorter direction, and each measurement point had 10 measured quantities (the x-axis component of the residual stress σx and its error, the shear stresses τxy, τxz, and τyz and their errors, and the FWHM of the diffraction ring from two directions), giving 90 dimensions of data per line. Here, the longer direction was defined as the X-direction, the shorter direction as the Y-direction, and the height of the rail as the Z-direction. Of the 191 measured lines, 30 lines with visible cracks (70–85 mm and 103–116 mm) were labeled as abnormal areas. The measured data were standardized to a mean of 0 and a standard deviation of 1 for each parameter. The 95 lines from 0 to 49 mm and from 146 to 190 mm, which did not contain any damaged areas, were used as training data, and the accuracy of the model was verified using the anomaly scores of the lines from 50 to 145 mm as test data.
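The data layout described above can be sketched as follows. The array holds random placeholder values rather than measured data, and computing the standardization statistics over all 191 lines (rather than over the training lines only) is an assumption, since the text does not specify it.

```python
import numpy as np

x_mm = np.arange(191)            # one line per millimetre, 0..190 mm
n_features = 9 * 10              # 9 points per line x 10 quantities per point

# Training lines: 0-49 mm and 146-190 mm; test lines: 50-145 mm.
train_mask = (x_mm <= 49) | (x_mm >= 146)
test_mask = ~train_mask
# Lines labeled abnormal because of visible cracks.
crack_mask = ((x_mm >= 70) & (x_mm <= 85)) | ((x_mm >= 103) & (x_mm <= 116))

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=(191, n_features))  # placeholder values

# Standardize each of the 90 parameters to mean 0 and standard deviation 1.
standardized = (data - data.mean(axis=0)) / data.std(axis=0)

train, test = standardized[train_mask], standardized[test_mask]
test_labels = crack_mask[test_mask].astype(int)   # 1 = cracked line
print(train.shape, test.shape, int(test_labels.sum()))
```

The counts reproduce those in the text: 95 training lines, 96 test lines, and 30 abnormal lines, all of which fall inside the test range.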
Figure 2 shows the cracked rail sample and the FWHM distribution of the diffraction rings at ψ0 = 35°. The distributions of σx at ψ0 = 35°, τxy at ψ0 = 35°, FWHM of the diffraction rings at ψ0 = 0°, τxz at ψ0 = 0°, and τyz at ψ0 = 0° are shown in Figs. 3, 4, 5, 6, 7, respectively. The FWHM of the diffraction rings and τxz are larger at the upper edge of the rail. This is because of WEL formation due to martensitic transformation caused by frictional heat from rail–wheel contact. WEL is observed as a widening FWHM of the diffraction rings due to high dislocation density, which was confirmed in a previous study.1) Plastic flow is known to occur on the rail surface,26,27,28,29) and it is reported to be consistent with the distribution of τxz.30)
Cracked rail sample (upper) and distribution of FWHM of diffraction rings at ψ0 = 35° (lower). (Online version in color.)
Distribution of σx at ψ0 = 35°. (Online version in color.)
Distribution of τxy at ψ0 = 35°. (Online version in color.)
Distribution of FWHM of diffraction rings at ψ0 = 0°. (Online version in color.)
Distribution of τxz at ψ0 = 0°. (Online version in color.)
Distribution of τyz at ψ0 = 0°. (Online version in color.)
The FWHM of the diffraction rings is smaller around the cracks, which is caused by a type of softening phenomenon due to the repeated passage of trains after crack initiation. The σx is lower at the cracks due to the release of residual stresses. The purpose of this method is to detect these changes around the crack as changes in the magnitude of the anomaly score.
4.2. Comparison of Model Accuracy
The anomaly score distribution of each model by PCA, kernel PCA, and autoencoder when the data is compressed to two dimensions is shown in Fig. 8.
Anomaly score distribution when data is compressed to two dimensions.
A higher anomaly score in this figure is assumed to indicate more damage in that location. In addition, by setting a threshold value, areas where the anomaly score exceeds an arbitrary predefined value can be evaluated as damaged areas. The accuracy of each model can be evaluated by comparing the actual crack locations and the locations evaluated as abnormal, adjusting the threshold of the anomaly score as needed. Then, a confusion matrix, as shown in Fig. 9, is created with each value defined by Eqs. (36), (37), (38), (39).
(36) |
(37) |
(38) |
(39) |
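The threshold-based evaluation can be sketched in plain Python. The example assumes the usual definitions of true and false positives and negatives, a point being flagged as abnormal when its score is at or above the threshold, and the trapezoidal rule for the area under the ROC curve; the scores and labels are made-up illustrative values, not results from this study.

```python
def confusion_counts(scores, labels, threshold):
    """Return (TP, FP, FN, TN) when score >= threshold is flagged abnormal."""
    tp = sum(s >= threshold and y == 1 for s, y in zip(scores, labels))
    fp = sum(s >= threshold and y == 0 for s, y in zip(scores, labels))
    fn = sum(s < threshold and y == 1 for s, y in zip(scores, labels))
    tn = sum(s < threshold and y == 0 for s, y in zip(scores, labels))
    return tp, fp, fn, tn

def roc_points(scores, labels):
    """(false positive rate, true positive rate) for every distinct threshold."""
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp, fp, fn, tn = confusion_counts(scores, labels, t)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points

def auroc(scores, labels):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = roc_points(scores, labels)
    return sum((x1 - x0) * (y1 + y0) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

example_scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
example_labels = [0, 0, 1, 1, 1, 0]      # 1 = cracked, 0 = normal

print(confusion_counts(example_scores, example_labels, 0.5))  # (2, 0, 1, 3)
print(auroc(example_scores, example_labels))
```

For these values, eight of the nine cracked/normal pairs are ranked correctly, so the area under the curve is 8/9.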
Confusion matrix.
ROC curve.
ROC curve in two dimensions for each model.
Dimension dependence of AUROC for each model.
Using the developed X-ray residual stress measurement system, the residual stresses and FWHM of the diffraction rings on the surface of the rail with cracks were measured, and anomaly detection was performed using PCA, kernel PCA, and autoencoder. When the accuracies of the models were compared, anomaly detection by machine learning with autoencoder showed the best performance. This research will enable quantitative damage evaluation of rails, which has been an ongoing challenge, and efficient maintenance. However, because rail usage varies by region, practical application will require on-site evaluation.
In the future, we plan to improve the accuracy of anomaly detection by analyzing a larger amount of measurement data and using more advanced machine learning, such as convolutional autoencoders. Furthermore, this analysis method can also be used for anomaly detection by combining measurement data from ultrasonic inspection and visible light images. In addition, because this method can be applied to anomaly detection not only in rails but also in various mechanical metal parts, such as bearings, we can verify its applicability to a variety of objects.
This work was supported by ISIJ Research Promotion Grant, Adaptable and Seamless Technology Transfer Program through Target-driven R&D from The Japan Science and Technology Agency, and JSPS KAKENHI Grant Numbers 21K03830, 16H02309, and 16K14145. The design of INTPIX4 was supported by Systems Design Lab (d.lab), the University of Tokyo in collaboration with Cadence Design Systems, Inc., Synopsys, Inc., and Mentor Graphics, Inc.