2023 Volume 64 Issue 9 Pages 2196-2201
This study proposes a machine learning model to predict the martensite start temperature (Ms) of alloy steels. We collected 219 usable data points from the literature and tuned hyperparameters to propose an accurate machine learning model. The artificial neural network (ANN) exhibited the best performance, surpassing an existing empirical equation. The prediction mechanisms and feature importance of the ANN with respect to the whole system were discussed via Shapley additive explanations (SHAP).
Martensite is essential to improving the mechanical properties of alloy steel because it retains carbon and other elements in solid solution from the prior austenite, which enhances the strength and hardness of alloy steel.1–3) Controlling the martensite phase fraction is closely tied to managing the martensite start temperature (Ms). The Ms is influenced by the addition of alloying elements and by the prior austenite grain size (AGS). For instance, it is known that the addition of most alloying elements, except for Co and Al, lowers the Ms.1) In addition, a smaller prior AGS results in a lower Ms.4–9) To design alloy steels effectively, more precise predictions of Ms and a deeper understanding of the mechanisms of alloying elements and AGS are necessary. Researchers have proposed empirical equations to predict the Ms and to understand the effects of alloying elements and AGS.10–20) For example, Lee et al. suggested an empirical equation to predict Ms considering the alloying elements and AGS of alloy steels, which can be expressed as follows:11)
\begin{align} M_{s}\ ({{}^{\circ}\text{C}}) &= 475.9 - 335.1C - 34.5\textit{Mn} - 1.3\textit{Si} \\ &\quad - 15.5\textit{Ni} - 13.1\textit{Cr} - 10.7\textit{Mo} \\ &\quad - 9.6\textit{Cu} + 11.67\ln (d_{\gamma}) \end{align} (1)
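As a reference point for the data-driven models that follow, eq. (1) can be evaluated directly. A minimal sketch (compositions in wt.% and the grain size unit follow ref. 11; the function name is illustrative):

```python
import math

def ms_lee(C, Mn, Si, Ni, Cr, Mo, Cu, d_gamma):
    """Empirical Ms (deg C) per eq. (1); compositions in wt.%,
    d_gamma is the prior austenite grain size (units per ref. 11)."""
    return (475.9 - 335.1 * C - 34.5 * Mn - 1.3 * Si
            - 15.5 * Ni - 13.1 * Cr - 10.7 * Mo
            - 9.6 * Cu + 11.67 * math.log(d_gamma))
```

The logarithmic grain-size term means coarser prior austenite raises the predicted Ms, consistent with refs. 4–9).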
In this study, an ANN, random forest regression (RFR), support vector regression (SVR), and k-nearest neighbors (kNN) were employed to predict Ms. The trained machine learning models were then compared with one another and with an empirical equation to confirm any improvement in accuracy. The quantitative prediction mechanisms of the selected machine learning model were proposed via SHAP analysis.
Usable data (219 records in total) containing the prior austenite grain size (AGS), the contents of C, Mn, Si, Ni, Cr, Mo, and Cu, and the measured Ms were collected from the literature.11) After analyzing the data and removing duplicates, 201 records remained. The range, average, and standard deviation of the data are presented in Table 1. The data were split randomly into 70% and 30% for the training and testing datasets, respectively, to propose the RFR, SVR, and kNN. For the ANN, the data were instead split randomly into 70%, 10%, and 20% for the training, validation, and testing datasets, respectively.
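The two splitting schemes can be sketched with scikit-learn. Placeholder arrays stand in for the 201 deduplicated records, and the `random_state` values are illustrative, not the authors':

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Placeholder for the 201 records: 8 inputs
# (C, Mn, Si, Ni, Cr, Mo, Cu, AGS) and the measured Ms.
X = rng.random((201, 8))
y = rng.random(201)

# 70/30 split used for the RFR, SVR, and kNN.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# 70/10/20 split used for the ANN: carve the 30% hold-out
# into (roughly) 10% validation and 20% testing.
X_val, X_tst, y_val, y_tst = train_test_split(
    X_test, y_test, test_size=2/3, random_state=42)
```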
The ANN, RFR, SVR, and kNN were trained and tested (the ANN was additionally validated via the validation dataset) to propose an accurate prediction model and analyze its prediction mechanisms. The hyperparameters of the ANN are the learning rate, activation function, number of layers, and number of neurons.31–33) The learning rate was set to 0.01, and the activation functions for the first and second layers were selected from among the linear, sigmoid, and rectified linear unit (ReLU) functions. The number of layers was varied between one and two, and the number of neurons was varied from 1 to 100; the number of neurons in the second layer did not exceed that in the first layer. The hyperparameters of the RFR are the number of decision trees and the maximum depth of each tree.32,34,35) Both were varied from 1 to 100. The hyperparameters of the SVR are the soft margin, kernel type, kernel coefficient (γ), and regularization parameter (C). The soft margin was kept at its default value, and the radial basis function kernel was used.32,36) C was varied from 0.1 to 10000, and γ from 0.0000001 to 0.01. The hyperparameter of the kNN is the number of neighbors (k) considered to predict Ms;32,37) k was varied from 1 to 131. The hyperparameters of the machine learning models were tuned via grid search with 5-fold cross-validation. Subsequently, machine learning models with suitable hyperparameters were proposed using the training and testing datasets. The coefficient of determination (R2) was used to estimate the accuracy of the machine learning models22) and can be expressed as follows:
\begin{equation} R^{2} = 1 - \frac{\displaystyle\sum\nolimits_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\displaystyle\sum\nolimits_{i=1}^{n} (y_{i} - \bar{y})^{2}} \end{equation} (2)
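Eq. (2) translates directly into a few lines of numpy; a minimal sketch:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination per eq. (2)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot
```

A model that always predicts the dataset mean scores R2 = 0, and a perfect model scores R2 = 1, which is why R2 is a convenient common yardstick across the four model types.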
The proposed machine learning models were further validated using additional data that were not used for training and testing (or validation, in the case of the ANN); 78 additional data points were collected from the literature.20) The prediction mechanisms of the machine learning model selected based on R2 were analyzed via SHAP.32,38,39) Model training and prediction-mechanism analysis were performed using TensorFlow version 2.7.0, scikit-learn version 0.23.1, and Python version 3.7.
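SHAP attributes each prediction to the input features so that the attributions sum to the deviation from a base value. For a linear model with independent features the attribution has a closed form, which makes the idea easy to see in a few lines; this numpy sketch illustrates the concept only and does not reproduce the authors' use of the shap library on the ANN:

```python
import numpy as np

def linear_shap(weights, X):
    """Exact SHAP values for a linear model f(x) = w @ x + b with
    independent features: phi_i = w_i * (x_i - E[x_i]).
    The base value is f(E[x]); each row of phi sums to f(x) - f(E[x])."""
    X = np.asarray(X, dtype=float)
    return np.asarray(weights, dtype=float) * (X - X.mean(axis=0))

def mean_abs_shap(phi):
    """Feature importance as in a SHAP summary plot:
    mean absolute SHAP value per feature."""
    return np.abs(phi).mean(axis=0)

# Toy example: two features with weights 2 and -1.
phi = linear_shap([2.0, -1.0], [[0.0, 0.0], [2.0, 2.0]])
```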
The results of the hyperparameter tuning are shown in Fig. 1. Hyperparameters were selected based on the prediction accuracy on the testing data (the prediction accuracy on the training data is generally high). Figure 1(a) shows the prediction accuracy of the ANN depending on the activation function. The linear and ReLU functions demonstrated the best performance (R2 = 0.9272) for the first and second layers, respectively. Figure 1(b) shows the performance of the RFR depending on the number of estimators and the maximum depth. The RFR performed best (R2 = 0.9236) with five estimators and a maximum depth of 67. Figure 1(c) shows the prediction accuracy of the SVR depending on C and γ. The model with C = 10000 and γ = 0.01 yielded the best performance (R2 = 0.8196). The kNN exhibited its best performance when k = 131, as shown in Fig. 1(d).
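The grid search with 5-fold cross-validation can be sketched for the RFR with scikit-learn. Placeholder data stand in for the 70% training split, and a coarse illustrative grid replaces the paper's full 1–100 × 1–100 search:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Placeholder data standing in for the 70% training split.
rng = np.random.default_rng(0)
X = rng.random((140, 8))
y = X @ rng.random(8)

# The paper varied both hyperparameters from 1 to 100;
# a small illustrative grid is used here.
param_grid = {"n_estimators": [5, 25, 50],
              "max_depth": [10, 40, 67]}
search = GridSearchCV(RandomForestRegressor(random_state=0),
                      param_grid, cv=5, scoring="r2")
search.fit(X, y)
best = search.best_params_  # hyperparameters with the best CV R2
```

The same pattern applies to the SVR (grid over C and γ) and the kNN (grid over k), with only the estimator and `param_grid` changed.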
Hyperparameter tuning results of (a) ANN, (b) RFR, (c) SVR, and (d) kNN.
Figure 2 shows the performance of the machine learning models on the training and testing data (for the ANN, the testing data comprise the 10% validation and 20% testing data). The ANN had R2 values of 0.9820 and 0.9703 for the training and testing data, respectively. The RFR had R2 values of 0.9866 and 0.9460, the SVR 0.9801 and 0.8432, and the kNN 0.0006 and 0.0013, respectively. The ANN had the best performance on the testing data but not on the training data. However, accuracy on the testing data is more important because it indicates that the model can predict both the training data and unseen data accurately. Therefore, the ANN model was selected and subsequently validated with additional data, and its prediction mechanisms were then analyzed using SHAP.
Performance of machine learning models for the training and testing datasets.
Figure 3(a) shows the prediction accuracy of the ANN in the training, validation, and testing data. The prediction accuracies of ANN were R2 = 0.9820, 0.9796, and 0.9638 for training, validation, and testing data, respectively. The difference in accuracy between the training, validation, and testing data was approximately 0.02, suggesting that the ANN model was capable of predicting the Ms accurately. Figure 3(b) illustrates the performance of eq. (1) and ANN on the whole data set (training, testing, and validation data). The R2 values of eq. (1) and ANN were 0.9793 and 0.9801, respectively. Hence, it can be deduced that the ANN was slightly more accurate than eq. (1).
Comparison between the measured and predicted Ms. Performances of (a) ANN for each dataset. (b) ANN and eq. (1).
The ANN model and eq. (1) were subjected to extra validation using the 78 additional data points that were not used in model training, validation, or testing. Figure 4 shows the prediction accuracies of the ANN model and eq. (1) for the additional data. The R2 values of the ANN and eq. (1) were 0.9265 and 0.9070, respectively; hence, it can be deduced that the ANN model is capable of predicting the Ms of general alloy steels.
Prediction performance of the additional dataset using (a) ANN and (b) eq. (1).
Figure 5(a) shows the magnitude of the influence of each variable on the Ms. C had the greatest influence on the Ms (average absolute SHAP value of 92.02), followed by Mn, AGS, Ni, Mo, Cr, Cu, and Si, whose average absolute SHAP values ranged from 0.45 to 17.06. Figure 5(b) illustrates the effect of each variable on the Ms. The Ms decreased with increasing alloying element content (C, Mn, Ni, Mo, Cr, Cu, and Si) but increased with increasing AGS.
SHAP analysis results of ANN model. (a) The average impact of each alloying element and AGS on the Ms. (b) Scatter plot of variables related to the Ms.
The detailed prediction mechanisms of each variable were analyzed using the SHAP dependence plots shown in Fig. 6. The applied base value was 252.95°C, the average Ms of the 390,625-point temporary dataset generated for the SHAP analysis. Each SHAP value indicates an increase or decrease from this base value. The variables were only weakly affected by one another; thus, the SHAP values for a specific variable value (e.g., 0.1 wt.% C) appear as a single point. Figures 6(a), (b), (d), (e), (f), (g), and (h) show the mechanisms of the alloying elements. For all alloying elements, the Ms decreased as the alloying element content increased. Ghosh and Olson proposed a critical driving force for the martensite transformation,40) which is expressed as follows:
\begin{align} -\Delta G_{M_{s}} & = 1010 + 4009\sqrt{C} + 1980\sqrt{\textit{Mn}} \\ &\quad+ 172\sqrt{\textit{Ni}} + 1418\sqrt{\textit{Mo}}\\ &\quad + 1868\sqrt{\textit{Cr}} + 752\sqrt{\textit{Cu}} + 1879\sqrt{\textit{Si}} \end{align} (3)
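Eq. (3) can likewise be evaluated directly; a minimal sketch (the composition and energy units follow the original reference40) and are an assumption here):

```python
import math

def critical_driving_force(C, Mn, Ni, Mo, Cr, Cu, Si):
    """Critical driving force -dG at Ms per eq. (3);
    composition units follow Ghosh and Olson (assumed here)."""
    return (1010 + 4009 * math.sqrt(C) + 1980 * math.sqrt(Mn)
            + 172 * math.sqrt(Ni) + 1418 * math.sqrt(Mo)
            + 1868 * math.sqrt(Cr) + 752 * math.sqrt(Cu)
            + 1879 * math.sqrt(Si))
```

Because every coefficient is positive, each alloying addition raises the critical driving force required for the transformation, consistent with the SHAP observation that the Ms falls as alloying content rises.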
SHAP dependence plots of each variable.
In this study, we utilized machine learning models to predict the Ms of alloy steel. The ANN model, with an R2 value of 0.9801 on the whole dataset, was compared with an existing equation; the ANN model was approximately 2% more accurate on the additional data when predicting the Ms. SHAP analysis was then applied to quantify the importance of the variables and the prediction mechanisms over the entire system. C has the greatest influence on the Ms, followed by Mn, AGS, Ni, Mo, Cr, Cu, and Si. The Ms decreased with increasing alloying element content because alloying elements increase the critical driving force required for the martensite transformation. Austenite stabilization by alloying elements appears to have a weaker influence on the Ms than the driving force; thus, strong austenite stabilizers (C, Mn, and Ni) have their effect on the Ms reinforced, while ferrite stabilizers have it suppressed. In addition, the Ms decreased with decreasing AGS. This can be attributed to grain refinement strengthening, which suppresses the shear strain required for the martensite transformation when the AGS is reduced.
This work was supported by the Technology Innovation Program (JIAT-22-4185) funded by the Jeollabukdo (Korea).