2023 Volume 64 Issue 9 Pages 2196-2201
This study proposes a machine learning model to predict the martensite start temperature (Ms) of alloy steels. We collected 219 usable data points from the literature and tuned hyperparameters to propose an accurate machine learning model. The artificial neural network (ANN) exhibited the best performance, surpassing an existing empirical equation. The prediction mechanisms and feature importance of the ANN with respect to the whole system were discussed via Shapley additive explanations (SHAP).
Martensite is essential to improving the mechanical properties of alloy steel because it retains carbon and other elements in solid solution from the prior austenite, which enhances the strength and hardness of alloy steel.1–3) Controlling the martensite phase fraction is closely tied to managing the martensite start temperature (Ms). The Ms is influenced by the addition of alloying elements and by the prior austenite grain size (AGS). For instance, it is known that the addition of most alloying elements, except for Co and Al, lowers the Ms.1) In addition, a smaller prior AGS results in a lower Ms.4–9) To design alloy steels effectively, more precise predictions of Ms and a deeper understanding of the mechanisms of alloying elements and AGS are necessary. Researchers have proposed empirical equations to predict the Ms and to understand the effects of alloying elements and AGS.10–20) For example, Lee et al. suggested an empirical equation to predict Ms considering the alloying elements and AGS of alloy steels, which can be expressed as follows:11)
\begin{align} M_{s}\ ({{}^{\circ}\text{C}}) &= 475.9 - 335.1C - 34.5\textit{Mn} - 1.3\textit{Si} \\ &\quad - 15.5\textit{Ni} - 13.1\textit{Cr} - 10.7\textit{Mo} \\ &\quad - 9.6\textit{Cu} + 11.67\ln (d_{\gamma}) \end{align} (1)
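As a reference point for the data-driven models that follow, eq. (1) can be evaluated directly. A minimal sketch (compositions in wt.% and the grain size unit follow ref. 11; the function name is illustrative):

```python
import math

def ms_lee(C, Mn, Si, Ni, Cr, Mo, Cu, d_gamma):
    """Empirical Ms (deg C) per eq. (1); compositions in wt.%,
    d_gamma is the prior austenite grain size (units per ref. 11)."""
    return (475.9 - 335.1 * C - 34.5 * Mn - 1.3 * Si
            - 15.5 * Ni - 13.1 * Cr - 10.7 * Mo
            - 9.6 * Cu + 11.67 * math.log(d_gamma))
```

The logarithmic grain-size term means coarser prior austenite raises the predicted Ms, consistent with refs. 4–9).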
In this study, an ANN, random forest regression (RFR), support vector regression (SVR), and k-nearest neighbors (kNN) were employed to predict Ms. The trained machine learning models were then compared with one another and with an empirical equation to confirm any improvement in accuracy. The quantitative prediction mechanisms of the selected machine learning model were proposed via SHAP analysis.
Usable data (219 records in total) containing the prior austenite grain size (AGS), the contents of C, Mn, Si, Ni, Cr, Mo, and Cu, and the measured Ms were collected from the literature.11) After analyzing the data and removing duplicates, 201 records remained. The range, average, and standard deviation of the data are presented in Table 1. The data were split randomly into 70% and 30% for the training and testing datasets, respectively, to propose the RFR, SVR, and kNN. For the ANN, the data were instead split randomly into 70%, 10%, and 20% for the training, validation, and testing datasets, respectively.
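The two splitting schemes can be sketched with scikit-learn. Placeholder arrays stand in for the 201 deduplicated records, and the `random_state` values are illustrative, not the authors':

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Placeholder for the 201 records: 8 inputs
# (C, Mn, Si, Ni, Cr, Mo, Cu, AGS) and the measured Ms.
X = rng.random((201, 8))
y = rng.random(201)

# 70/30 split used for the RFR, SVR, and kNN.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# 70/10/20 split used for the ANN: carve the 30% hold-out
# into (roughly) 10% validation and 20% testing.
X_val, X_tst, y_val, y_tst = train_test_split(
    X_test, y_test, test_size=2/3, random_state=42)
```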
The ANN, RFR, SVR, and kNN were trained and tested (the ANN was additionally validated via the validation dataset) to propose an accurate prediction model and analyze its prediction mechanisms. The hyperparameters of the ANN are the learning rate, activation function, number of layers, and number of neurons.31–33) The learning rate was set to 0.01, and the activation functions for the first and second layers were selected from among the linear, sigmoid, and rectified linear unit (ReLU) functions. The number of layers was varied between one and two, and the number of neurons was varied from 1 to 100; the number of neurons in the second layer did not exceed that in the first layer. The hyperparameters of the RFR are the number of decision trees and the maximum depth of each tree.32,34,35) Both were varied from 1 to 100. The hyperparameters of the SVR are the soft margin, kernel type, kernel coefficient (γ), and regularization parameter (C). The soft margin was kept at its default value, and the radial basis function kernel was used.32,36) C was varied from 0.1 to 10000, and γ from 0.0000001 to 0.01. The hyperparameter of the kNN is the number of neighbors (k) considered to predict Ms;32,37) k was varied from 1 to 131. The hyperparameters of the machine learning models were tuned via grid search with 5-fold cross-validation. Subsequently, machine learning models with suitable hyperparameters were proposed using the training and testing datasets. The coefficient of determination (R2) was used to estimate the accuracy of the machine learning models22) and can be expressed as follows:
\begin{equation} R^{2} = 1 - \frac{\displaystyle\sum\nolimits_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\displaystyle\sum\nolimits_{i=1}^{n} (y_{i} - \bar{y})^{2}} \end{equation} (2)
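Eq. (2) translates directly into a few lines of numpy; a minimal sketch:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination per eq. (2)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot
```

A model that always predicts the dataset mean scores R2 = 0, and a perfect model scores R2 = 1, which is why R2 is a convenient common yardstick across the four model types.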
The proposed machine learning models were further validated using additional data that were not used for training and testing (or validation, in the case of the ANN); 78 additional data points were collected from the literature.20) The prediction mechanisms of the machine learning model selected based on R2 were analyzed via SHAP.32,38,39) Model training and prediction-mechanism analysis were performed using TensorFlow version 2.7.0, scikit-learn version 0.23.1, and Python version 3.7.
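SHAP attributes each prediction to the input features so that the attributions sum to the deviation from a base value. For a linear model with independent features the attribution has a closed form, which makes the idea easy to see in a few lines; this numpy sketch illustrates the concept only and does not reproduce the authors' use of the shap library on the ANN:

```python
import numpy as np

def linear_shap(weights, X):
    """Exact SHAP values for a linear model f(x) = w @ x + b with
    independent features: phi_i = w_i * (x_i - E[x_i]).
    The base value is f(E[x]); each row of phi sums to f(x) - f(E[x])."""
    X = np.asarray(X, dtype=float)
    return np.asarray(weights, dtype=float) * (X - X.mean(axis=0))

def mean_abs_shap(phi):
    """Feature importance as in a SHAP summary plot:
    mean absolute SHAP value per feature."""
    return np.abs(phi).mean(axis=0)

# Toy example: two features with weights 2 and -1.
phi = linear_shap([2.0, -1.0], [[0.0, 0.0], [2.0, 2.0]])
```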
The results of the hyperparameter tuning are shown in Fig. 1. Hyperparameters were selected based on the prediction accuracy on the testing data (the prediction accuracy on the training data is generally high). Figure 1(a) shows the prediction accuracy of the ANN depending on the activation function. The linear and ReLU functions demonstrated the best performance (R2 = 0.9272) for the first and second layers, respectively. Figure 1(b) shows the performance of the RFR depending on the number of estimators and the maximum depth. The RFR performed best (R2 = 0.9236) with five estimators and a maximum depth of 67. Figure 1(c) shows the prediction accuracy of the SVR depending on C and γ. The model with C = 10000 and γ = 0.01 yielded the best performance (R2 = 0.8196). The kNN exhibited its best performance when k = 131, as shown in Fig. 1(d).
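The grid search with 5-fold cross-validation can be sketched for the RFR with scikit-learn. Placeholder data stand in for the 70% training split, and a coarse illustrative grid replaces the paper's full 1–100 × 1–100 search:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Placeholder data standing in for the 70% training split.
rng = np.random.default_rng(0)
X = rng.random((140, 8))
y = X @ rng.random(8)

# The paper varied both hyperparameters from 1 to 100;
# a small illustrative grid is used here.
param_grid = {"n_estimators": [5, 25, 50],
              "max_depth": [10, 40, 67]}
search = GridSearchCV(RandomForestRegressor(random_state=0),
                      param_grid, cv=5, scoring="r2")
search.fit(X, y)
best = search.best_params_  # hyperparameters with the best CV R2
```

The same pattern applies to the SVR (grid over C and γ) and the kNN (grid over k), with only the estimator and `param_grid` changed.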
Hyperparameter tuning results of (a) ANN, (b) RFR, (c) SVR, and (d) kNN.
Figure 2 shows the performance of the machine learning models on the training and testing data (for the ANN, the testing data comprise the 10% validation and 20% testing data). The ANN had R2 values of 0.9820 and 0.9703 for the training and testing data, respectively. The RFR had R2 values of 0.9866 and 0.9460, the SVR 0.9801 and 0.8432, and the kNN 0.0006 and 0.0013, respectively. The ANN had the best performance on the testing data but not on the training data. However, accuracy on the testing data is more important because it indicates that the model can predict both the training data and unseen data accurately. Therefore, the ANN model was selected and subsequently validated with additional data, and its prediction mechanisms were then analyzed using SHAP.
Performance of machine learning models for the training and testing datasets.
Figure 3(a) shows the prediction accuracy of the ANN in the training, validation, and testing data. The prediction accuracies of ANN were R2 = 0.9820, 0.9796, and 0.9638 for training, validation, and testing data, respectively. The difference in accuracy between the training, validation, and testing data was approximately 0.02, suggesting that the ANN model was capable of predicting the Ms accurately. Figure 3(b) illustrates the performance of eq. (1) and ANN on the whole data set (training, testing, and validation data). The R2 values of eq. (1) and ANN were 0.9793 and 0.9801, respectively. Hence, it can be deduced that the ANN was slightly more accurate than eq. (1).
Comparison between the measured and predicted Ms. Performances of (a) ANN for each dataset. (b) ANN and eq. (1).
The ANN model and eq. (1) were subjected to extra validation using the 78 additional data points that were not used in model training, validation, or testing. Figure 4 shows the prediction accuracies of the ANN model and eq. (1) for the additional data. The R2 values of the ANN and eq. (1) were 0.9265 and 0.9070, respectively; hence, it can be deduced that the ANN model is capable of predicting the Ms of general alloy steels.
Prediction performance of the additional dataset using (a) ANN and (b) eq. (1).
Figure 5(a) shows the magnitude of the influence of each variable on the Ms. C had the greatest influence on the Ms (average absolute SHAP value of 92.02), followed by Mn, AGS, Ni, Mo, Cr, Cu, and Si, whose average absolute SHAP values ranged from 0.45 to 17.06. Figure 5(b) illustrates the effect of each variable on the Ms. The Ms decreased with increasing alloying element content (C, Mn, Ni, Mo, Cr, Cu, and Si) but increased with increasing AGS.
SHAP analysis results of ANN model. (a) The average impact of each alloying element and AGS on the Ms. (b) Scatter plot of variables related to the Ms.
The detailed prediction mechanisms of each variable were analyzed using the SHAP dependence plots shown in Fig. 6. The applied base value was 252.95°C, the average Ms of the 390,625-point temporary dataset generated for the SHAP analysis. Each SHAP value indicates an increase or decrease from this base value. The variables were only weakly affected by one another; thus, the SHAP values for a specific variable value (e.g., 0.1 wt.% C) appear as a single point. Figures 6(a), (b), (d), (e), (f), (g), and (h) show the mechanisms of the alloying elements. For all alloying elements, the Ms decreased as the alloying element content increased. Ghosh and Olson proposed a critical driving force for the martensite transformation,40) which is expressed as follows:
\begin{align} -\Delta G_{M_{s}} & = 1010 + 4009\sqrt{C} + 1980\sqrt{\textit{Mn}} \\ &\quad+ 172\sqrt{\textit{Ni}} + 1418\sqrt{\textit{Mo}}\\ &\quad + 1868\sqrt{\textit{Cr}} + 752\sqrt{\textit{Cu}} + 1879\sqrt{\textit{Si}} \end{align} (3)
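Eq. (3) can likewise be evaluated directly; a minimal sketch (the composition and energy units follow the original reference40) and are an assumption here):

```python
import math

def critical_driving_force(C, Mn, Ni, Mo, Cr, Cu, Si):
    """Critical driving force -dG at Ms per eq. (3);
    composition units follow Ghosh and Olson (assumed here)."""
    return (1010 + 4009 * math.sqrt(C) + 1980 * math.sqrt(Mn)
            + 172 * math.sqrt(Ni) + 1418 * math.sqrt(Mo)
            + 1868 * math.sqrt(Cr) + 752 * math.sqrt(Cu)
            + 1879 * math.sqrt(Si))
```

Because every coefficient is positive, each alloying addition raises the critical driving force required for the transformation, consistent with the SHAP observation that the Ms falls as alloying content rises.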
SHAP dependence plots of each variable.
In this study, we utilized machine learning models to predict the Ms of alloy steel. The ANN model, with an R2 value of 0.9801 on the whole dataset, was compared with an existing equation; the ANN model was approximately 2% more accurate on the additional data when predicting the Ms. SHAP analysis was then applied to quantify the importance of the variables and the prediction mechanisms over the entire system. C has the greatest influence on the Ms, followed by Mn, AGS, Ni, Mo, Cr, Cu, and Si. The Ms decreased with increasing alloying element content because alloying elements increase the critical driving force required for the martensite transformation. Austenite stabilization by alloying elements appears to have a weaker influence on the Ms than the driving force; thus, strong austenite stabilizers (C, Mn, and Ni) have their effect on the Ms reinforced, while ferrite stabilizers have it suppressed. In addition, the Ms decreased with decreasing AGS. This can be attributed to grain refinement strengthening, which suppresses the shear strain required for the martensite transformation when the AGS is reduced.
This work was supported by the Technology Innovation Program (JIAT-22-4185) funded by the Jeollabukdo (Korea).