2024 Volume 65 Issue 4 Pages 428-433
The fatigue limit is well predicted by tensile strength or hardness, and the relationship is often analyzed by linear regression using the least-squares approximation. However, prediction of the number of cycles to failure at a given stress amplitude, that is, estimation of the S–N curve, has not yet been achieved. Therefore, we investigated whether the S–N curve can be estimated by the random forest method using the data in the NIMS fatigue data sheets. The random forest method is an ensemble machine learning algorithm that integrates multiple decision tree models as weak learners to improve generalization ability. Machine learning with multiple decision tree models proved to be excellent at predicting the fatigue limit. The S–N curve can be accurately estimated by combining the prediction of the fatigue limit with that of the number of cycles to failure at a given stress amplitude.
This Paper was Originally Published in Japanese in J. Soc. Mater. Sci., Japan 70 (2021) 876–880.
NIMS has accumulated fatigue test data on various structural materials for approximately 40 years and has published them as the NIMS fatigue data sheets (FDS).1) From these FDS, it is empirically known that the fatigue limit (i.e., the fatigue strength at 10^7 cycles) correlates with other mechanical properties (Fig. 1).2) In addition to the fatigue limit, estimation of the fatigue strength (S–N curve) has been attempted by normalizing the stress amplitude by the tensile strength.3) Table 1 lists the index properties of fatigue. In Table 1, fatigue is first classified into high- and low-cycle fatigue according to the life range. The high-cycle fatigue strength property is generally expressed by the σa–Nf curve, which is the relationship between stress amplitude and life. In this case, the index is a strength property, with the tensile strength σB as the static index and the cyclic yield stress σyc as the dynamic index. The reasons for this are described later. Conversely, the low-cycle fatigue strength property is represented by the relationship between strain and life, εa–Nf. Therefore, a deformation property is taken as the index. In this case, the static index is the rupture ductility εf and the dynamic index is the exponent n′ of the cyclic stress–strain curve.3) It is empirically known that an excellent correlation exists between the tensile strength σB and the fatigue limit σw. The correlation between the yield stress σy (or 0.2% proof stress σ0.2) and σw has also been investigated, but it is not as strong as the σB–σw relationship because σy is affected by an instability phenomenon called yielding. However, a linear relationship holds between the cyclic yield stress σyc and σw because σyc reflects the internal microstructure after it has reached a steady state through repeated plastic deformation.
Thus, it is reasonable to adopt the tensile strength σB as the static index of high-cycle fatigue strength and the cyclic yield stress σyc as the dynamic index. The dynamic index should in principle be preferred because fatigue is caused by repeated plastic strain, but there are barriers to adopting σyc. In particular, σyc must be measured by a strain-controlled test using the companion specimen method or the incremental step method,4) and such measurement data are not plentiful. As shown in Fig. 2, the two index properties σB and σyc are proportional, so we believe it is acceptable to use the static index for practical purposes. Figure 3 shows the S–N data with the stress amplitude normalized by σB (σa/σB–Nf). However, the normalized data as a whole fall within a wide band, which does not allow an accurate estimation. Therefore, we attempted to estimate the S–N curve (the relationship between stress amplitude and fatigue life) through machine learning.
Fig. 1 Relationship between mechanical properties and fatigue limit. (a) Versus Vickers hardness. (b) Versus tensile strength.
Fig. 2 Relationship between tensile strength and cyclic yield stress.
Fig. 3 S–N curves normalized by tensile strength.
The random forest method is an algorithm in machine learning. It is an ensemble learning algorithm that improves generalization ability by integrating weak learners of multiple decision tree models and is mainly used for classification (discrimination) and regression (estimation) applications. The key issues are (1) whether more accurate data can be sampled for the target data population and (2) whether decision tree models can be created for each training component. In conventional mathematical model regression, the regression is based on the least-squares approximation to find the correlation between two data sets of interest. However, machine learning can create a regression model that relates multiple decision tree models of the learning elements, which is expected to provide a more accurate estimation.
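The ensemble idea described above can be illustrated with a minimal pure-Python sketch: several small regression trees are grown on bootstrap samples and their predictions are averaged. This is not the implementation used in the study; the function names (`build_tree`, `fit_forest`, `predict_forest`), the parameter `n_try` (number of features tried per split), and the synthetic hardness/test-method data are all assumptions made for illustration.

```python
import random
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(rows, targets, depth, rng, n_try):
    """Grow one regression tree; a leaf stores the mean target value."""
    if depth == 0 or len(rows) < 4 or len(set(targets)) == 1:
        return mean(targets)
    best = None  # (error, feature index, threshold)
    for f in rng.sample(range(len(rows[0])), n_try):
        for t in sorted({r[f] for r in rows})[1:]:
            left = [y for r, y in zip(rows, targets) if r[f] < t]
            right = [y for r, y in zip(rows, targets) if r[f] >= t]
            err = sse(left) + sse(right)
            if best is None or err < best[0]:
                best = (err, f, t)
    if best is None:  # no tried feature could separate the rows
        return mean(targets)
    _, f, t = best
    lo = [(r, y) for r, y in zip(rows, targets) if r[f] < t]
    hi = [(r, y) for r, y in zip(rows, targets) if r[f] >= t]
    return (f, t,
            build_tree([r for r, _ in lo], [y for _, y in lo], depth - 1, rng, n_try),
            build_tree([r for r, _ in hi], [y for _, y in hi], depth - 1, rng, n_try))

def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] < t else right
    return node

def fit_forest(rows, targets, n_trees=25, depth=5, n_try=None, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n_try = n_try or len(rows[0])  # default: try every feature (plain bagging)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx], depth, rng, n_try))
    return forest

def predict_forest(forest, x):
    """Average the predictions of all trees."""
    return mean(predict_tree(tree, x) for tree in forest)

# Synthetic data (NOT from the FDS): features are (hardness, test-method code),
# and the target follows a made-up linear rule that differs by method.
rows = [(float(hv), m) for hv in range(120, 301, 10) for m in (0, 1)]
targets = [1.6 * hv if m == 0 else 1.0 * hv for hv, m in rows]

forest = fit_forest(rows, targets)
pred_bending = predict_forest(forest, (200.0, 0))
pred_torsion = predict_forest(forest, (200.0, 1))
print(round(pred_bending), round(pred_torsion))
```

Because each tree can split on either feature, the averaged prediction separates the two test-method codes, which a single least-squares line through hardness alone cannot do.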
In this study, we examined whether the random forest method improves the estimation accuracy of the fatigue limit, using the experimental data available in the NIMS FDS. Next, the possibility of estimating the S–N curve was examined by predicting the fatigue strength below 10^6 cycles using the same method.
The data population for estimating fatigue limits was based on the experimental data of S25C (FDS No. 1) and S55C (FDS No. 4) obtained by rotating bending fatigue tests. The random forest method was used to examine the effect of each learning element. Next, fatigue limit data from torsional fatigue tests were added to the data population to study the effect of different fatigue test methods. Furthermore, the effect of stress ratio was examined by adding fatigue test data with stress ratios R = 0 and R = −1. On the basis of these results, the estimation accuracy was then examined using fatigue data for different types of steels: S35C (FDS No. 2), SNCM439 (FDS No. 25), SmN438 (FDS No. 16), SmN443 (FDS No. 17), SUS403 (FDS No. 30), and SUS304 (FDS No. 33), in addition to S25C and S55C. Next, fatigue life estimation was attempted using fatigue data of 10^6 cycles or less for various steels. Finally, estimation of the S–N curve was attempted for S45C (FDS No. 3) tempered at 550°C, Heat A, by predicting the fatigue strength under 10^6 cycles for each stress amplitude. Until now, “elongation” and “reduction of area” have received little attention in estimating fatigue limits because they correlate well with tensile strength and hardness. However, in the finite-life range of the S–N curve, especially in the short-life, low-cycle range, rupture ductility is an indicator of low-cycle fatigue, so a decision tree model relating tensile strength, hardness, elongation, and reduction of area was adopted.
A commercially available personal computer was used for machine learning, together with Python 3.6.1,5) which is available for free download, and the Anaconda distribution.6)
The target data were the fatigue test results described in the FDS. To keep the analysis fair, 80% of the data were used as training data and 20% as test data, and the test data were extracted at random for each run. Consequently, it is not possible to identify which data were used as test data. The mean absolute percentage error (MAPE) was obtained from the test data as one of the evaluation results of the analysis.
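The random 80/20 split described above can be sketched as follows. This is a minimal illustration, not the authors' script; the function name `split_train_test` and the integer stand-in records are assumptions. Passing a seed makes one particular split reproducible, whereas leaving it as `None` gives a new random split on each run, as in the study.

```python
import random

def split_train_test(records, test_fraction=0.2, seed=None):
    """Shuffle a copy of the records and cut off the test fraction."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Stand-in for 218 fatigue-limit records (the real records are in the FDS).
records = list(range(218))
train, test = split_train_test(records, seed=1)
print(len(train), len(test))
```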
\begin{equation} \text{MAPE (\%)} = \frac{100}{N} \sum_{i = 1}^{N}\left| \frac{\hat{y}_{i} - y_{i}}{y_{i}} \right| \tag{1} \end{equation}
where $y_{i}$ is the measured value used in the analysis, $\hat{y}_{i}$ is the corresponding estimate obtained from the model, and $N$ is the number of test data.
The root mean square error (RMSE), mean squared error (MSE), and coefficient of determination (R2) are commonly used as indicators of the fit accuracy of a model obtained by regression analysis. However, RMSE and MSE are expressed in the units of the target (or their square) and are dominated by the largest deviations, which makes them hard to compare across data sets with different strength levels; a plain mean of signed errors is also unsuitable because + and − deviations cancel. Conversely, MAPE takes the absolute value of each relative error, so discrepancies in individual predictions remain visible. Known problems with MAPE include cases where the measured value is zero or very small. Additionally, without cross-validation and grid search, biased conclusions may be obtained. In this study, for all predictions, a diagram relating experimental and predicted values, as shown in Fig. 4, was drawn and inspected visually, and this inspection was treated as a substitute for cross-validation and grid search. For these reasons, we considered it appropriate to use MAPE rather than RMSE and MSE as the error function in this study.
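A minimal sketch of how these error functions behave on a small made-up sample is given below. The function names and the four measured/predicted pairs are assumptions for illustration only, and the MAPE here takes the relative error with respect to the measured value. In this sample the signed errors alternate between +20 and −20, so their plain mean is zero, while RMSE and MAPE still report the deviation.

```python
from math import sqrt

def mean_error(measured, predicted):
    """Mean of signed errors; positive and negative deviations cancel."""
    return sum(p - y for y, p in zip(measured, predicted)) / len(measured)

def mse(measured, predicted):
    """Mean squared error; squaring removes the sign before averaging."""
    return sum((p - y) ** 2 for y, p in zip(measured, predicted)) / len(measured)

def rmse(measured, predicted):
    """Root mean square error, in the same units as the target."""
    return sqrt(mse(measured, predicted))

def r2(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot

def mape(measured, predicted):
    """Mean absolute percentage error; undefined if a measured value is 0."""
    if any(y == 0 for y in measured):
        raise ValueError("MAPE is undefined when a measured value is zero")
    return 100.0 / len(measured) * sum(
        abs((p - y) / y) for y, p in zip(measured, predicted))

# Made-up values (MPa), not taken from the FDS.
measured = [200.0, 300.0, 400.0, 500.0]
predicted = [220.0, 280.0, 420.0, 480.0]

for name, fn in [("mean error", mean_error), ("MSE", mse),
                 ("RMSE", rmse), ("R2", r2), ("MAPE (%)", mape)]:
    print(name, fn(measured, predicted))
```

The same absolute error of 20 MPa contributes 10% at 200 MPa but only 4% at 500 MPa, which is why MAPE is sensitive to small measured values.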
Fig. 4 Relationship between the fatigue limit predicted by AI and the experimental fatigue limit, using run-out data at 10^7 cycles from rotating bending and torsional fatigue tests of S25C and S55C. (a) Prediction using a decision tree model for HV only. (b) Prediction using decision tree models for HV and test method.
Using the data from the S25C and S55C rotating bending fatigue tests (total = 218), four decision tree models were created, one for each learning factor: Vickers hardness, tensile strength, elongation, and reduction of area. Table 2 shows the results. The MAPE for Vickers hardness and for tensile strength is <2%, signifying a high estimation accuracy. These results confirm, by machine learning, the excellent correlation of hardness and tensile strength with the fatigue limit shown in FDS No. 5 (Fig. 1), and the estimation accuracy is much improved.
3.1.2 Influence of test method
Torsion test data were added to the rotating bending test data for S25C and S55C used in Section 3.1.1 (total = 279). The test method was added as a learning element. The analysis results are shown in Table 3 and Fig. 4. The fatigue limit estimated using only the Vickers hardness, shown in Fig. 4(a), had a MAPE of approximately 12%. In contrast, the MAPE of the fatigue limit estimated from the regression model that links the decision tree models of Vickers hardness and test method, shown in Fig. 4(b), is 2.23%, a dramatic improvement in estimation accuracy. This result indicates that a regression model built by machine learning, which can relate multiple learning factors, is effective for fatigue limit estimation.
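The effect of adding the test method as a learning element can be illustrated with a deliberately simple sketch. Instead of a random forest, it uses a lookup table of group means, which behaves like a fully grown decision tree on discrete features. All values are made up, the error is evaluated on the same records that built the table (not on held-out test data), and the names `fit_lookup`, `by_hv`, and `by_hv_method` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (Vickers hardness, test method, fatigue limit in MPa): made-up values.
records = [
    (150, "rotating_bending", 240), (150, "torsion", 150),
    (150, "rotating_bending", 250), (150, "torsion", 160),
    (250, "rotating_bending", 400), (250, "torsion", 250),
    (250, "rotating_bending", 410), (250, "torsion", 260),
]

def by_hv(rec):
    """Learning element: hardness only."""
    return (rec[0],)

def by_hv_method(rec):
    """Learning elements: hardness and test method."""
    return (rec[0], rec[1])

def fit_lookup(records, key):
    """Predict the mean fatigue limit of all records sharing the same key."""
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec[2])
    return {k: mean(v) for k, v in groups.items()}

def mape(records, table, key):
    """MAPE (%) of the lookup-table predictions against the recorded limits."""
    return 100.0 / len(records) * sum(
        abs((table[key(r)] - r[2]) / r[2]) for r in records)

mape_hv = mape(records, fit_lookup(records, by_hv), by_hv)
mape_hv_method = mape(records, fit_lookup(records, by_hv_method), by_hv_method)
print(round(mape_hv, 2), round(mape_hv_method, 2))
```

With hardness alone, the model has to average over both test methods at each hardness level, so the error stays large; once the test method is part of the key, the two populations are separated.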
3.1.3 Effect of stress ratio
Test data from axial loading tests (R = 0 and −1) were added to the rotating bending and torsion test results for S25C and S55C used in Section 3.1.2 (total = 306), and a decision tree model for the stress ratio was added. The analytical results are shown in Table 3 and Fig. 5. The MAPE of the fatigue limit estimated only from the tensile strength and test method, shown in Fig. 5(a), was 3.02%. The MAPE of the fatigue limit estimated from the regression model with three learning factors, based on the decision tree models of stress ratio, tensile strength, and test method, shown in Fig. 5(b), is 2.35%, which is an improvement in estimation accuracy.
Fig. 5 Relationship between the fatigue limit predicted by AI and the experimental fatigue limit, using run-out data at 10^7 cycles (total 306) from axial loading tests (R = 0, −1), rotating bending fatigue tests, and torsion fatigue tests of S25C and S55C. (a) Prediction by tensile strength and test method. (b) Prediction by tensile strength, test method, and stress ratio.
Fatigue limit data from rotating bending fatigue tests of S35C, SNCM439, SmN438, SmN443, SUS403, and SUS304 were added to the fatigue test results of S25C and S55C (total = 892). The analysis results are shown in Table 3 and Fig. 6. The MAPE of the fatigue limit estimated from the regression model linking the decision tree models of hardness and test method was 2.94%, which is a high estimation accuracy.
Fig. 6 Prediction of the fatigue limit at 10^7 cycles using data of S25C, S35C, S55C, SNCM439, SmN438, SmN443, SUS403, and SUS304.
Decision tree models of Vickers hardness, tensile strength, reduction of area, and elongation were developed by restricting the analysis to the S25C and S55C fatigue data with fracture lives of 10^6 cycles or less (total = 515), and the fatigue strength at 10^6 cycles or less was estimated by relating all of these learning elements. The results of the analysis are shown in Table 4 and Fig. 7. The regression model with the decision tree models for Vickers hardness, tensile strength, elongation, and reduction of area showed a high estimation accuracy of 92.0% for the training data but only 65.8% for the randomly selected test data, with a MAPE of 38.7%. This is thought to be because the training data distinguish between the S25C and S55C fatigue data, resulting in fatigue strength estimates closer to the original data. Conversely, since the test data are extracted randomly, S25C and S55C, which have different fatigue strengths, are not distinguished, and the estimates scatter. Because the data are extracted at random, it is not known which data correspond to S25C and which to S55C, but they probably correspond to the band indicated by the circle in the figure.
Fig. 7 Prediction result of fracture life using data of S25C and S55C (only data with a fracture life of 10^6 cycles or less are used).
Next, the fatigue strength was estimated by linking the decision tree models of Vickers hardness, tensile strength, elongation, and reduction of area, using a total of 2478 fatigue data (10^6 cycles or less) for different types of steels (S25C, S55C, S35C, SNCM439, SmN438, SmN443, SUS403, and SUS304). The results are shown in Table 4 and Fig. 8. The MAPE was 29.8%, which is more accurate than the 38.7% obtained for the two steel grades S25C and S55C in Fig. 7. This improvement is attributed to the approximately fivefold increase in the total number of data compared with Fig. 7, and a further improvement in estimation accuracy can be expected as more experimental data become available.
Fig. 8 Prediction result of fracture life using data of S25C, S35C, S55C, SNCM439, SmN438, SmN443, SUS403, and SUS304 (only data with a fracture life of 10^6 cycles or less are used).
A model of the fatigue strength below 5 × 10^6 cycles was obtained by machine learning using the decision tree models of Vickers hardness, tensile strength, elongation, and reduction of area, based on the fatigue data (total = 2834) below 5 × 10^6 cycles for different types of steels (see Table 4). The fracture life at each stress amplitude was then estimated from the mechanical properties of S45C. Additionally, the fatigue limit at 2.12 × 10^7 cycles was obtained from the Vickers hardness of S45C, based on the relationship between Vickers hardness and the fatigue limit at 2.12 × 10^7 cycles obtained by machine learning from various steel materials. The analysis results are shown in Fig. 9. Experimental and estimated data are indicated by △ and ●, respectively. First, as shown in Table 2, the fatigue limit estimated from the Vickers hardness agreed very well with the experimental value, with an estimation accuracy of 99% and a MAPE of 1.76%. The estimated fatigue strength below 5 × 10^6 cycles is also in good agreement, even though the MAPE is 29.8%. Unlike in Figs. 7 and 8, the test data did not include steel grades with different strengths; thus, there was no variation in the prediction accuracy. Moreover, the estimation of the S–N curve by machine learning was highly accurate when the fatigue strength and the fatigue limit were estimated separately. This result shows that the S–N curve can be approximated by utilizing the experimental data accumulated in the FDS.
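The two-part construction described above, a finite-life prediction above the fatigue limit and a run-out at or below it, can be sketched as follows. This is an illustration under stated assumptions: `assemble_sn_curve` is a hypothetical helper, `toy_life_model` is a made-up power-law stand-in for the trained finite-life model, and the fatigue limit of 250 MPa and the stress amplitudes are invented values, not results from the paper.

```python
def assemble_sn_curve(stress_amplitudes, predict_life, fatigue_limit,
                      runout_cycles=1e7):
    """Combine a finite-life model with a separately estimated fatigue limit.

    Stress amplitudes above the fatigue limit get a predicted life;
    amplitudes at or below it are reported as run-outs.
    """
    curve = []
    for sa in sorted(stress_amplitudes, reverse=True):
        if sa > fatigue_limit:
            curve.append((sa, predict_life(sa), "failure"))
        else:
            curve.append((sa, runout_cycles, "run-out"))
    return curve

def toy_life_model(stress_amplitude):
    """Made-up stand-in for the trained finite-life model (assumption)."""
    return 1e6 * (300.0 / stress_amplitude) ** 10

# Invented inputs: stress amplitudes in MPa and a fatigue limit of 250 MPa.
curve = assemble_sn_curve([450, 400, 350, 300, 250, 200],
                          toy_life_model, fatigue_limit=250.0)
for stress, cycles, status in curve:
    print(stress, f"{cycles:.3g}", status)
```

Keeping the two estimates separate mirrors the observation in the text that the fatigue limit is predicted far more accurately than the finite-life region.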
Fig. 9 Prediction of the S–N curve using data of S25C, S35C, S55C, SNCM439, SmN438, SmN443, SUS403, and SUS304 (fracture life data of 5 × 10^6 cycles or less; the fatigue limit considers only hardness).
Using the experimental data provided in the NIMS FDS, we attempted to estimate the fatigue limit and the fatigue strength below 10^6 cycles by the random forest method and examined the possibility of estimating the S–N curve. The results obtained are as follows: