2024 Volume 64 Issue 8 Pages 1291-1300
Accurately predicting the end temperature of molten steel is significant for controlling ladle furnace (LF) refining. This paper proposes an error correction method called EC-CBR based on case-based reasoning (CBR) to reduce errors in the prediction models caused by discrepancies between actual production data and training data. The proposed method combines the incremental learning advantage of CBR with the ability of other models to fit nonlinear relations. First, a prediction model is established, and historical heats similar to the new heat are retrieved by CBR. Then, the model error of the new heat is calculated by employing the errors of similar heats. The prediction result is calculated by subtracting the error from the predicted value. Testing and comparison are conducted on the models (support vector regression, backpropagation neural network, extreme learning machine and mechanism model) and general CBR using actual production data. Results show that EC-CBR is effective for both data-driven and mechanism models, with an increase of approximately 5% in hit rate within the range of ±5°C for the data-driven models and an increase of 21.73% for the mechanism model. Moreover, the corrected data-driven models show higher accuracy than general CBR, further proving the effectiveness of the proposed method.
Ladle furnace (LF) is an effective secondary refining method that can control molten steel’s temperature and composition and adjust the production rhythm to reduce the process cost and improve production efficiency and product quality.1,2) In practice, the molten steel temperature cannot be measured continuously due to the limitations of the production environment and temperature-measuring components, making it difficult to achieve precise control.3) Currently, manual experience is often relied upon to determine whether the molten steel has reached the refining endpoint. In practice, due to the complexity of reactions that occur during the refining process, manual judgment is often subjective. Multiple temperature checks and heating cycles are required in the vicinity of the refining endpoint to ensure that the molten steel has truly reached the target temperature. Many scholars have studied and established various prediction models to obtain the molten steel’s end temperature accurately. If these models are accurate enough, they can help operators determine whether the current temperature is approaching the refining target, reducing the number of unnecessary power-off and temperature checks. These models can currently be mainly classified into three categories in the literature: mechanism, data-driven, and hybrid.4)
The mechanism model (MM), also known as the first-principle model, is mainly established according to the energy conservation, heat transfer, and mass conservation equations. Wu et al. used the energy conservation equation and took molten steel and slag as research objects to derive the molten steel heating rate model.5) The one-dimensional unsteady thermal conductivity equations for the ladle wall and bottom in cylindrical and right-angle coordinate systems were established. The predicted end steel temperature was solved using the finite difference method. The data-driven model (DM) typically utilizes artificial intelligence algorithms to process historical data and solve the complex relationship between input parameters and the end temperature of molten steel. Tian et al. adopted backpropagation neural network (BPNN), extreme learning machine (ELM), and other algorithms combined with the principle of ensemble learning. The authors applied the improved adaptive boosting (AdaBoost) algorithm to integrate multiple sub-models and predict steel temperature in LF.6,7) A hybrid model (HM) combines multiple MMs and DMs; a well-structured, high-performance HM can leverage the advantages of both. When predicting steel temperature, He et al. took into account the effect of ladle heat status on temperature.8) They developed a ladle thermal tracking model to calculate the compensation temperature for ladle heat status and combined it with the temperature predicted by BPNN to obtain the molten steel temperature.
In the refining process, the steel temperature is influenced by many factors, and complex nonlinear relationships exist among them. Establishing a complete MM is challenging, and many simplifications and assumptions are introduced, which may compromise the model’s accuracy. Although DMs established by artificial intelligence algorithms eliminate the need for detailed physical and chemical information about the process, the accuracy of these models is highly dependent on the similarity between the distribution of model training and actual production data.9) The refining process of molten steel involves many complex reactions, and the process parameters of each refining process are constantly changing. Therefore, there are inevitably discrepancies between the training data of the model and the actual production data. When the model is applied to actual production, its accuracy will decrease because it cannot adapt to such discrepancies.10,11) Therefore, it is challenging to construct a model with high prediction accuracy to guide the actual production.12)
To address the aforementioned issues, online training methods have been proposed, which involve dynamic and adaptive models. When the model encounters new data during the application process, it is retrained to adapt to the new application scenarios. Two main approaches for online training are the moving window (MW) method and the just-in-time learning (JITL) method.13) The MW method trains the model on the data closest in time to the new data, while the JITL method trains it on the data closest in the data space to the new data. Kneale et al. applied the MW method with a small window size to build simple DMs instead of one complex model for monitoring product generation in a chemical process, shortening the model-reconstruction time under data drift.14) Gu et al. used heats similar to the new heat in the historical case base to dynamically predict the molten steel temperature in the second blowing stage of the converter. They established a Long Short-Term Memory (LSTM) model using the process parameters of the second blowing stage of similar heats and used it to predict the molten steel temperature in the second blowing stage of the new heat.15) Retraining the model is often limited by factors such as training time, the quantity of training data, and the selection of model structure and parameters. These constraints can make retraining impractical in real applications.
To improve the accuracy and adaptability of prediction models, the error correction (EC) strategy has been proposed.16,17) An EC model is established to predict the errors generated in the prediction process of the primary model and correct its results. Xu et al. applied the EC method to the Weather Research and Forecasting Model (WRF) to improve the accuracy of wind speed prediction in wind farms.18) They used wind speed-related features as input and the historical prediction errors of WRF as output to establish an LSTM prediction model, which was used to correct errors in the WRF. Huang et al. proposed an EC method for BPNN based on the three-way decision and integration model.19) The training data were divided into three categories according to the prediction error of BPNN on the training data. Then, three algorithms were used to build models for each data type to predict the BPNN error and correct the predicted value of the BPNN model. While the EC method can enhance the adaptability and accuracy of prediction models, most existing EC models are trained on specific historical data. Therefore, the performance of these models may be degraded due to discrepancies between the training data and actual data during application.
This paper proposes an EC method based on CBR (EC-CBR) to address the issue of low model accuracy caused by discrepancies between training and actual data, without requiring model retraining. Compared with other data-driven methods, CBR does not rely on a specific data set, does not require a complex training process, and can gradually expand its case base through incremental learning with practical applications. The EC-CBR method improves the accuracy of prediction models by correcting their prediction results through error calculation. Firstly, a prediction model for end temperature of molten steel is established, and a case base is established based on the model’s training data. Secondly, the CBR case retrieval process is used to find historical heats similar to the new one. The production data of the new heat and similar heats are then fed into the prediction model to predict the end temperature. Next, the errors of similar heats in the prediction model are calculated, and the weighted average of these errors, based on the similarity with the new heat, is used as the errors of the new heat in the prediction model. Finally, the prediction result is obtained by subtracting the error from the predicted value of the new heat.
The EC-CBR method proposed in this paper combines the incremental learning advantage of CBR with the ability of other models to fit nonlinear relations. In this section, the principle of CBR and the implementation process of EC-CBR are explained.
2.1. Case-based Reasoning

Similar to human problem solving, CBR is a method of solving new problems by applying the solutions of previous similar cases. As a data-driven method, CBR has been widely applied in many areas. When encountering a new problem, CBR searches the case base for similar problems solved in the past and their solutions, compares the background and time differences between the new and old problems, and adjusts and modifies the solutions of the old cases to solve the new problem. It is a reasoning mode that uses accumulated knowledge and experience to solve current problems.20) The standard CBR model often consists of four processes:
1. Case representation: the historical data and the problems to be solved are described in a unified case expression, constituting a case base and a problem set.
2. Case retrieval: in the case base, historical cases similar to the problem to be solved are retrieved by calculating the similarity.
3. Case reuse: reuse the solution from the retrieved historical cases to solve the new problem.
4. Case retain: save the solved problem and its corresponding solution to the case base and expand the case base to obtain a more powerful problem-solving ability.
2.2. Error Correction Method Based on CBR

Based on CBR, this paper proposes an EC method for model accuracy improvement. For the prediction of the end temperature of molten steel in LF, the actual end temperature Ti of molten steel in a certain heat can be expressed as follows:
Ti = f(Xi) − εi    (1)
where f is a model for predicting the end temperature of molten steel, Xi is an array of factors influencing the end temperature (which is also the model's input), and εi is the model error, i.e., the deviation between the predicted and the actual temperature.
The error of the end temperature prediction model partly comes from the fitting effect of the model itself, and partly from the discrepancy between the refining process data in actual production and the model training data, which leads to a decrease in the model’s performance. Most current research focuses on data processing and selecting artificial intelligence algorithms to more accurately predict the end temperature of molten steel in LF, but it overlooks the potential advantages of the EC method.
The application of the EC method has achieved good performance in many fields, such as wind speed prediction, price prediction, precipitation prediction, and PM2.5 content prediction.21,22,23) The EC-CBR method proposed in this paper calculates the error of the new case using the errors of similar cases in the prediction model. Unlike other data-driven methods, the predictive performance of CBR does not decrease with changes in the data distribution but improves as the case base is continuously supplemented.
Generally, the training process of a DM fits a function in the data space by minimizing a loss function on the training data. To prevent overfitting when training complex models, the loss function value should not be expected to be exactly zero, which means the model will have some errors. In the data space, the prediction error is the positional deviation between the function fitted by the DM and the true value. Given reasonable data feature weights, the similar cases retrieved by CBR and the new case are close in the data space. Moreover, their relative distances to the fitting function are similar, so the errors they generate on the fitting function are also similar. Therefore, CBR can be used to predict the error of the end temperature prediction model of molten steel in LF.
In artificial intelligence, some investigations directly use CBR to predict the target value. However, in complex industrial scenarios, the accuracy of CBR is directly affected by the coverage of the case base to the actual data. In addition, to solve some complex and nonlinear problems, obtaining information from the existing experience and data and analyzing and calculating the information requires combining CBR and other intelligent methods. The process parameters of each LF refining heat are different, there are many factors influencing the end temperature of molten steel, and there are complex nonlinear relationships between them. Some data-driven methods, such as artificial intelligence algorithms, can better fit the nonlinear relationship. Therefore, the proposed method combines the advantages of CBR and other models and applies the EC method to improve the prediction accuracy of the end temperature of molten steel.
Each heat of LF refining involves steps such as arc heating, material addition, and argon stirring; heats differ only in the specific process parameters of each step. Therefore, by calculating the similarity of process parameters between heats, this method retrieves the historical heats similar to the new heat. Then, the errors of these similar heats in the prediction model are calculated. The weighted mean is a widely used statistical method in CBR. In this paper, the weighted mean of these errors is computed according to the similarity between the retrieved similar heats and the new heat, which gives the error of the new heat in the prediction model. The final prediction of the molten steel temperature is obtained by subtracting the calculated error from the predicted result of the model.
The end temperature of molten steel in the new heat is shown as follows:
Tnew = f(Xnew) − Σ_{k=1}^{N} qk εk    (2)
where N is the number of similar heats, qk is the weight of heat k, and εk is the error of heat k in the prediction model. Here, N is a hyperparameter related to the data dimension (i.e., the number of factors affecting the end temperature of molten steel) and to the prediction model's accuracy; its value should be determined experimentally.
The value of qk is related to the similarity between the new heat and its similar heats and is expressed by the following equation:
qk = Sk / Σ_{k=1}^{N} Sk    (3)
where Sk is the similarity between the new heat and heat k. The calculation of similarity is the key to the case retrieval step in CBR, and the commonly used Euclidean distance similarity is used in this paper.
The Euclidean distance d (Xi, Xk) and similarity Sk between the new heat and heat k are shown as follows:
d(Xi, Xk) = √( Σ_{j=1}^{m} wj (xij − xkj)^2 )    (4)
Sk = 1 / (1 + d(Xi, Xk))    (5)
where m is the number of influencing factors, xij and xkj represent influencing factor j of the new heat and of heat k in the case base, respectively, and wj represents the weight of influencing factor j.
The implementation steps of the EC method based on CBR for end temperature prediction of molten steel in LF are as follows:
1. Calculate the similarity between the new heat and historical heats in the case base according to Eqs. (4) and (5), obtain the production data and end temperature of the N most similar heats, and calculate the weight qk of each similar heat according to Eq. (3);
2. Introduce the production data of the new heat and its similar heats into the end temperature prediction model of molten steel, and predict their end temperature;
3. Compare the predicted end temperature of each similar heat with its real value, and calculate the prediction error εk of each similar heat;
4. Calculate the end temperature of the new heat according to Eq. (2), and save the data of the new heat in the case base.
The implementation process of this method is shown in Fig. 1.
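The four steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the paper's code: the function names and the toy biased model are ours, and the 1/(1+d) distance-to-similarity transform is one common choice.

```python
import math

def euclidean_similarity(x_new, x_hist, weights):
    """Eqs. (4)-(5): weighted Euclidean distance turned into a similarity
    score; the 1/(1+d) transform is an assumption, one common choice."""
    d = math.sqrt(sum(w * (a - b) ** 2
                      for w, a, b in zip(weights, x_new, x_hist)))
    return 1.0 / (1.0 + d)

def ec_cbr_predict(model, x_new, case_base, weights, n_similar):
    """Retrieve the N most similar heats, compute their model errors
    (predicted minus actual), and subtract the similarity-weighted
    error from the raw prediction (Eq. (2))."""
    # Step 1: retrieve the N most similar heats and their weights q_k (Eq. (3)).
    sims = [(euclidean_similarity(x_new, x, weights), x, t) for x, t in case_base]
    sims.sort(key=lambda s: s[0], reverse=True)
    top = sims[:n_similar]
    total = sum(s for s, _, _ in top)
    # Steps 2-3: predict each similar heat and compute its error eps_k.
    correction = sum((s / total) * (model(x) - t) for s, x, t in top)
    # Step 4: subtract the weighted error from the new heat's prediction.
    return model(x_new) - correction

def biased_model(x):
    # Hypothetical predictor with a constant +3 degC bias.
    return 1600.0 + 10.0 * x[0] + 3.0

# Toy case base of (normalized process data, actual end temperature) pairs.
base = [([0.1], 1601.0), ([0.4], 1604.0), ([0.9], 1609.0)]
pred = ec_cbr_predict(biased_model, [0.5], base, weights=[1.0], n_similar=2)
```

Because the toy model's error is a constant +3°C on every heat, the weighted error of the similar heats is exactly that bias, and the corrected prediction recovers the true temperature.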
In this section, the factors affecting the end temperature of molten steel in LF refining process are analyzed, and the production data of a steel plant are pre-processed. Three typical data-driven models and a metallurgical mechanism model are established to predict the end temperature of molten steel and their accuracy is compared.
3.1. Analysis of Influencing Factors in LF

The energy budget of the LF refining process is analyzed according to the energy balance, as shown in Fig. 2. The heat of the refining process is mainly provided by arc heating and reaction heat generated by alloy and slagging agents. Part of the heat entering the LF system is used to heat the molten steel and slag and melt the alloy and the slagging agent, while the other part of the heat is lost in the refining process. The lost heat includes the heat storage of the ladle lining, heat dissipation between the ladle shell and the air, the heat lost through the slag, and the heat taken away when the high-temperature gas is discharged from the slag surface.
Production data collected from a steel plant in LF production is used as the data source for end temperature prediction and EC in this paper. The selected parameters can be measured accurately, and their measurement errors have a minimal impact on the model's predictive accuracy. Based on this production data, the analysis of the LF energy balance, production experience, and data statistics, nine main factors were found to primarily affect the end temperature of molten steel in the LF process: the molten steel weight, the starting temperature of molten steel in LF, the refining time, the electricity consumption, the argon consumption, the age of the ladle, the addition amount of alloy (carburant, high carbon ferromanganese, medium carbon ferromanganese, aluminum pellet, aluminum slag), the addition amount of slagging agent (quicklime, compound deoxidizing slagging agent, fluorite, slag melting agent) and the length of feeding wire (Al wire, Ca–Al wire).
3.2. Data Pre-processing

The production data of the LF refining process were collected from a steelmaking plant. Owing to data interference, the discrete distribution of production data, and operator error, the original data occasionally contain missing values and outliers. Therefore, before modeling, records with missing values were deleted and outliers were processed with the boxplot method. The boxplot is a widely used method for detecting outliers in sample data; it employs a resistant rule for identifying possible outliers in a dataset.24) According to this rule, QL and QU are the lower and upper quartiles of the dataset, and QM is its median. The resistant rule of the boxplot labels an observation as an outlier if it falls below QL − 1.5IQR or above QU + 1.5IQR, where IQR = QU − QL. In this study, the original data were processed using the boxplot, and the outliers below QL − 1.5IQR and above QU + 1.5IQR were eliminated. After the necessary data pre-processing, the influencing factors of the end temperature of molten steel in LF refining and their data distribution are shown in Table 1. All influencing factors are taken as inputs of the prediction models and as matching attributes for similar-heat retrieval.
Influencing factors | Label | Minimum | Maximum | Mean | Standard deviation |
---|---|---|---|---|---|
Age of ladle | X1 | 59 | 461 | 252.26 | 68.42 |
Weight of molten steel [t] | X2 | 214.0 | 277.2 | 253.33 | 7.23 |
Starting temperature [°C] | X3 | 1526.3 | 1636.9 | 1581.79 | 17.86 |
Refining time [min] | X4 | 25.4 | 77.3 | 49.58 | 9.37 |
Electricity consumption [kWh] | X5 | 1034.20 | 9273.57 | 5043.63 | 1535.32 |
Argon consumption per ton of steel [m3] | X6 | 0.08 | 0.31 | 0.18 | 0.04 |
Weight of carburant [kg] | X7 | 0 | 39 | 1.80 | 6.07 |
Weight of high carbon ferromanganese [kg] | X8 | 0 | 462 | 114.94 | 65.79 |
Weight of medium carbon ferromanganese [kg] | X9 | 0 | 240 | 15.38 | 42.45 |
Weight of aluminum pellet [kg] | X10 | 0 | 327 | 112.42 | 64.39 |
Weight of aluminum slag [kg] | X11 | 0 | 1462 | 118.92 | 90.53 |
Weight of quicklime [kg] | X12 | 0 | 2532 | 1238.33 | 178.58 |
Weight of compound deoxidizing slagging agent [kg] | X13 | 0 | 783 | 157.79 | 116.11 |
Weight of fluorite [kg] | X14 | 0 | 754 | 122.51 | 147.60 |
Weight of slag melting agent [kg] | X15 | 0 | 643 | 121.09 | 157.43 |
Length of Al wire [m] | X16 | 0 | 355 | 54.91 | 60.03 |
Length of Ca–Al wire [m] | X17 | 0 | 182 | 94.52 | 16.38 |
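The boxplot resistant rule used in the pre-processing step can be sketched as follows; this is an illustration, not the paper's code, and the linear-interpolation quartile convention is an assumption (library quantile definitions differ slightly).

```python
def iqr_filter(values):
    """Tukey's boxplot rule: keep observations inside
    [QL - 1.5*IQR, QU + 1.5*IQR], where IQR = QU - QL."""
    xs = sorted(values)

    def quartile(p):
        # Linear-interpolation quantile (one of several conventions).
        idx = p * (len(xs) - 1)
        lo = int(idx)
        hi = min(lo + 1, len(xs) - 1)
        return xs[lo] + (xs[hi] - xs[lo]) * (idx - lo)

    ql, qu = quartile(0.25), quartile(0.75)
    iqr = qu - ql
    lo_fence, hi_fence = ql - 1.5 * iqr, qu + 1.5 * iqr
    return [v for v in values if lo_fence <= v <= hi_fence]

# Hypothetical refining times in minutes; 120 is an obvious outlier.
refining_times = [49, 50, 51, 52, 53, 120]
clean = iqr_filter(refining_times)
```

In practice the rule is applied column by column to each influencing factor in Table 1.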
From Table 1 it can be seen that the parameters have great differences in magnitude. When training models, the parameters with larger values tend to have greater weights. To eliminate the influence of magnitude of each parameter, all the data are normalized. Equation (6) is applied for the data normalization.
Xvalue = (X − min(X)) / (max(X) − min(X))    (6)
where Xvalue is the normalized value corresponding to the parameter X, and max(X) and min(X) are the maximum and minimum values of the parameter X, respectively.
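Equation (6) is ordinary min–max scaling; a minimal sketch (the sample values are illustrative, taken from the X3 range in Table 1):

```python
def min_max_normalize(column):
    """Eq. (6): scale a parameter column to [0, 1] so that magnitude
    differences between factors do not dominate training."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

# Starting-temperature values (X3) spanning roughly the range in Table 1:
temps = [1526.3, 1581.8, 1636.9]
norm = min_max_normalize(temps)
```

Note that in deployment the training-set minimum and maximum should be reused to scale new heats, so that normalized values remain comparable between training and application.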
DMs and a metallurgical mechanism model (MMM) are established based on the production data to predict the end temperature of molten steel in LF. The EC-CBR method proposed in this paper is then applied to these models to verify its effectiveness in improving model accuracy. Of the 1495 retained production data sets, 80% are randomly selected as the training set and initial case base (1196 heats), and the remaining 20% form the test set (299 heats).
3.3. Data-driven Models

Three commonly used artificial intelligence algorithms are used to predict the end temperature of molten steel in LF.
The support vector machine (SVM) is a widely used machine learning algorithm for classification problems. Support vector regression (SVR) applies the idea of SVM to regression: it minimizes the distance between the model's support vectors and the hyperplane, capturing the internal relationship of a data set and achieving regression fitting. SVR uses kernel functions to map the data into a higher-dimensional space, where a hyperplane is fitted; the model's accuracy depends on the mapping effect of the kernel function.
BPNN is a supervised algorithm based on gradient descent, which consists of two processes: the forward propagation of information and the back propagation (BP) of errors. During forward propagation, input signals are transmitted through hidden layers to the output node, where outputs are generated by the activation function. During backpropagation, errors are propagated from the hidden layers to the input layer, and the network’s connection weights and thresholds are dynamically adjusted. After repeated learning and training, the network’s weights and thresholds are adjusted to minimize the error.
ELM is a single-layer feedforward neural network. In ELM, hidden node parameters can be randomly assigned, and only the output weights must be determined analytically. ELM can provide better generalization performance with less learning time and human intervention than traditional neural networks.25)
To determine the hyperparameters of each model, avoid overfitting, and achieve the best and most stable predictive performance, K-fold cross-validation is used in the modeling process of DMs. This method randomly splits the training data into K groups, where one group is used as the validation set and the remaining K-1 groups are combined to form a new training set for model building. Each group is used once for validation, resulting in K models trained on different subsets of the data. The average prediction error of these K models is then calculated to evaluate the model performance under the current hyperparameters. Note that the test set was not included in the learning of DMs.
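The K-fold split described above can be sketched as follows; this is an illustration with our own function names, not the paper's implementation, and the shuffling seed is arbitrary.

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k disjoint folds;
    each fold serves once as the validation set while the remaining
    k-1 folds form the training set."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        splits.append((train, folds[i]))
    return splits

# 10 toy samples, 5 folds: each sample is validated exactly once.
splits = k_fold_splits(10, 5)
```

The average validation error over the k splits is then used to score one hyperparameter combination, as the text describes.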
In this paper, 5-fold cross-validation was chosen given the size of the data set. Activation functions commonly used in artificial neural network models include the sigmoid, hyperbolic tangent, and rectified linear unit (ReLU). Different combinations of the number of hidden layers and the number of hidden-layer neurons were evaluated to obtain the accuracy of the neural network models; ELM itself is a single-layer feedforward neural network. Given that the model input has 17 parameters, the number of hidden-layer neurons in the experiments ranged from 1 to 100, and the maximum number of iterations was 500. Numerous kernel functions can be utilized for SVR, well-known ones being the polynomial, Gaussian, and sigmoid kernels. After cross-validation, the Gaussian kernel was selected for the SVR model. The BPNN model was configured with two hidden layers of 79 and 65 neurons, respectively, with ReLU as the activation function. For the ELM model, the number of hidden-layer neurons was 21, and the activation function was the sigmoid.
The experiments were run under the Windows 11 operating system and implemented in Python 3.7, on a computer with 16 GB of memory, an i5-12600KF CPU, and a GeForce RTX 3060 GPU with 12 GB of video memory. After the hyperparameters were determined, the final SVR, BPNN, and ELM models were trained with all the data of the training set (1196 heats); the training times were 0.023, 0.112, and 0.004 seconds, respectively.
Based on these hyperparameters, the prediction accuracy of the three models on the test set is shown in Table 2, where the Root Mean Square Error (RMSE) describes the overall performance of the model and is defined as follows:
RMSE = √( (1/m) Σ_{i=1}^{m} (Tpredicted,i − Tactual,i)^2 )    (7)
Model | RMSE | ±5°C/% | ±7°C/% | ±10°C/% |
---|---|---|---|---|
SVR | 5.32 | 69.57 | 83.95 | 91.97 |
BPNN | 5.84 | 66.89 | 79.29 | 90.64 |
ELM | 5.56 | 68.90 | 81.61 | 90.97 |
where Tpredicted is the model's predicted value, Tactual is the actual value, and m is the number of predicted samples.
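The RMSE of Eq. (7) and the hit-rate columns of Table 2 can be computed as below; the four sample temperatures are invented for illustration.

```python
import math

def rmse(predicted, actual):
    """Eq. (7): root mean square error over m predicted samples."""
    m = len(predicted)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / m)

def hit_rate(predicted, actual, tol):
    """Percentage of heats whose prediction falls within +/- tol degC,
    as reported in the hit-rate columns of Table 2."""
    hits = sum(abs(p - a) <= tol for p, a in zip(predicted, actual))
    return 100.0 * hits / len(predicted)

# Hypothetical end-temperature predictions versus measurements (degC):
predicted = [1600.0, 1605.0, 1593.0, 1612.0]
actual = [1602.0, 1599.0, 1595.0, 1611.0]
```

Here errors of (−2, 6, −2, 1)°C give a hit rate of 75% within ±5°C and 100% within ±7°C.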
By comparison, it can be seen that the prediction accuracy of the three models is close, and the prediction hit rate is below 70% within the range of ±5°C. This is because the end temperature of molten steel in LF is influenced by many factors, and the process parameters exhibit significant variation between different heats. Consequently, there are discrepancies between the data in the models' training set and the test set, which ultimately prevents the three commonly used models from achieving the desired level of accuracy in predicting the end temperature of molten steel.
3.4. Metallurgical Mechanism Model

The MMM for predicting the end temperature follows the principle of energy balance. Figure 2 illustrates the heat input and output during the LF refining process, where Qarc, Qadd, Qslag, Qshell, Qlining, and Qgas in Fig. 2 can be calculated with Eqs. (8)–(16).26)
Qarc = ηarc ∫_{t0}^{tend} Parc dt,  Parc = 3ηE UΦ I cosΦ    (8)
where Parc is the arc power, W; ηarc is the arc heat transfer coefficient; cosΦ is the power factor; ηE is the electrical efficiency; UΦ is the phase voltage of the LF transformer, V; and I is the electrode current per phase, A.
The Qadd includes the heat effects of both the slagging agent, denoted Qadd_slag, and the alloy, denoted Qadd_alloy, as shown in Eq. (9). Qadd_alloy can be calculated by Eq. (10), and Qadd_slag in a similar manner.
Qadd = Qadd_slag + Qadd_alloy    (9)
Qadd_alloy = Σi { (mi/Mi)[ fi ΔHfi + (1 − fi) ΔHoi ] − mi[ csi(Tfi − T0i) + ΔHmi + cli(Tsteel − Tfi) ] }    (10)
where Tfi and T0i are the liquidus and initial temperature of alloy element i, respectively, °C; Tsteel is the temperature of the molten steel, °C; csi and cli are the specific heat capacities of the solid and liquid phases of alloy element i, respectively, J·(kg·°C)−1; ΔHmi is the heat of fusion of alloy element i, J·kg−1; ΔHoi and ΔHfi are the heat of oxidation and dissolution of alloy element i, respectively, J·mol−1; mi is the amount of element i added, kg; Mi is the molar mass of element i, kg·mol−1; and fi is the recovery rate of alloy element i.
The Qslag is composed of the heat stored in the slag during the heating process, Qslag_in, and the heat dissipated from the surface of the slag, Qslag_out, as shown in Eq. (11), where Qslag_in and Qslag_out are given by Eqs. (12) and (13).
Qslag = Qslag_in + Qslag_out    (11)
Qslag_in = mslag cslag (Tslag_end − Tslag_0)    (12)
Qslag_out = Asl ∫_{t0}^{tend} [ hsl(Ti − Te) + σεsl((Ti + 273)^4 − (Te + 273)^4) ] dt    (13)
where mslag is the weight of the slag, kg; cslag is the specific heat capacity of the slag, J·(kg·°C)−1; Tslag_end and Tslag_0 are the end and initial temperatures of the steel slag, respectively, °C; Asl is the area of the slag surface, m2; t0 and tend are the start and end times of the refining process, respectively, s; hsl is the convective heat transfer coefficient of the slag surface, J·(m2·s·°C)−1; σ is the Stefan–Boltzmann constant, W·(m2·K4)−1; εsl is the emissivity of the slag surface; Ti is the surface temperature of the slag at time i, °C; and Te is the ambient temperature, °C.
The convection heat loss from the ladle shell can be calculated by Eq. (14):
Qshell = (t/60) Σi Ai qi    (14)
where t is the refining time, min; Ai is the surface area of each part of the ladle shell, m2; qi is the heat flow from each part of the ladle shell, J·(m2·h)−1; εs is the emissivity of the ladle shell surface; Ts is the temperature of the ladle shell, °C; and k is a coefficient that depends on the direction of heat dissipation.
The heat stored in the ladle lining, Qlining, can be calculated using Eq. (15):
Qlining = Σi mi ci (Tl − Tl0)    (15)
where mi is the mass of each part of the lining, kg; ci is the specific heat capacity of each part of the lining, J·(kg·°C)−1; and Tl and Tl0 are the temperatures of each part of the lining at the end and start of the refining process, respectively, °C.
The main components of the furnace gas are argon, flue gas, and some particulate matter. The heat loss generated by the furnace gas, Qgas, can be calculated using Eq. (16):
Qgas = ρAr VAr cAr (Tm − T0) + mg cg (Tg − T0) + md cd (Td − T0)    (16)
where cAr, cg, and cd are the specific heat capacities of argon, flue gas, and particulate matter, respectively, J·(kg·°C)−1; ρAr is the density of argon, kg·m−3; VAr is the consumption of argon during the refining process, m3; T0 is the initial temperature of the argon, °C; Tm, Tg, and Td are the exhaust temperatures of the argon, flue gas, and particulate matter, respectively, °C; and mg and md are the weights of the flue gas and particulate matter, respectively, kg.
Based on the analysis of LF energy balance, the heat absorbed by the molten steel is ΔQ, and the MMM can be expressed by Eq. (17):
Tend = T0 + ΔQ / (msteel csteel)    (17)
where Tend is the predicted end temperature of the molten steel, °C; T0 is the initial temperature of the steel, °C; msteel is the weight of the molten steel, kg; and csteel is the specific heat capacity of the molten steel, J·(kg·°C)−1.
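The energy-balance step in Eq. (17) amounts to a one-line temperature update; the numbers below are purely illustrative assumptions (a 250 t heat, a specific heat of about 840 J·(kg·°C)−1, and a 2.1 GJ net heat input), not values from the paper.

```python
def end_temperature(t0, delta_q, m_steel, c_steel):
    """Eq. (17): the net heat delta_q absorbed by the steel changes
    its temperature by delta_q / (m_steel * c_steel)."""
    return t0 + delta_q / (m_steel * c_steel)

# Illustrative values only (c_steel ~ 840 J/(kg*degC) is an assumption):
t_end = end_temperature(t0=1580.0, delta_q=2.1e9,
                        m_steel=250_000.0, c_steel=840.0)
```

With these numbers the 2.1 GJ of net heat raises the 250 t heat by 10°C, from 1580 to 1590°C.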
To validate the accuracy of the MMM, the test set in Section 3.2 is used to predict the end temperature of molten steel. In the MMM, data normalization is not necessary, because the calculation is based on the metallurgical mechanism, which requires the use of raw data.
Table 3 displays the prediction accuracy of the MMM on the test set. It can be seen that the prediction accuracy of the MMM is relatively low: the model's RMSE is as high as 16.36, and its hit rate within ±10°C is only 42.81%. Parameters such as the electrical efficiency, the surface temperature of the slag, and the weight of the slag are obtained through assumptions or experience, which leads to the low prediction accuracy of the mechanism model.
Model | RMSE | ±5°C/% | ±7°C/% | ±10°C/% |
---|---|---|---|---|
MMM | 16.36 | 20.74 | 30.10 | 42.81 |
In this section, the prediction results of the above models are corrected using the EC-CBR proposed in this paper, and the general CBR is used to establish an end temperature prediction model for comparison with EC-CBR.
4.1. Application of EC-CBR

To apply this method, it is necessary to determine the weights of the influencing factors used to calculate the similarity between heats, as well as the number of reused heats N. The accuracy of the influencing factors' weights plays a significant role in the accuracy of retrieving similar heats, which in turn affects how well the EC method corrects the models' predictions.
To obtain more accurate weights for each influencing factor, this paper employs a statistical method called the Maximum Information Coefficient (MIC). MIC can measure both linear and nonlinear relationships between variables, and it is also effective for measuring non-functional dependencies that a single function cannot express.27)
To calculate the weights, the MIC between each influencing factor and the model prediction error is obtained first. The scatterplot formed by one influencing factor and the prediction error in a two-dimensional space is divided into a grid, the mutual information (MI) between the two variables is calculated, and the result is normalized. Each way of dividing the grid yields a corresponding normalized MI value, and the maximum over all divisions is the MIC between the two variables. After the MIC between each factor and the prediction error is obtained, the weight of each factor is calculated by Eq. (18).
$$w_i = \frac{\mathrm{MIC}(X_i)}{\sum_{j=1}^{17}\mathrm{MIC}(X_j)} \tag{18}$$
Where w_i is the weight of the influencing factor X_i, and MIC(X_i) is the MIC between the influencing factor X_i and the prediction error. The weight of each influencing factor is shown in Table 4.
Model | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9
---|---|---|---|---|---|---|---|---|---
SVR | 0.050 | 0.043 | 0.072 | 0.058 | 0.088 | 0.051 | 0.062 | 0.079 | 0.042
BPNN | 0.037 | 0.020 | 0.104 | 0.051 | 0.143 | 0.034 | 0.011 | 0.148 | 0.023
ELM | 0.048 | 0.058 | 0.066 | 0.057 | 0.065 | 0.051 | 0.052 | 0.080 | 0.056
MMM | 0.049 | 0.052 | 0.068 | 0.065 | 0.084 | 0.054 | 0.051 | 0.092 | 0.044

Model | X10 | X11 | X12 | X13 | X14 | X15 | X16 | X17
---|---|---|---|---|---|---|---|---
SVR | 0.050 | 0.117 | 0.056 | 0.040 | 0.044 | 0.050 | 0.037 | 0.061
BPNN | 0.044 | 0.161 | 0.079 | 0.026 | 0.014 | 0.027 | 0.058 | 0.020
ELM | 0.044 | 0.118 | 0.057 | 0.043 | 0.037 | 0.039 | 0.061 | 0.068
MMM | 0.049 | 0.106 | 0.063 | 0.043 | 0.045 | 0.038 | 0.048 | 0.049
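The MIC-based weighting described above can be sketched as follows. This is a simplified illustration: it searches equal-width grids only up to a fixed resolution, whereas the full MIC algorithm also optimizes the partition boundaries and bounds the grid size by a function of the sample count; all names are illustrative.

```python
import numpy as np

def mutual_information(x, y, nx, ny):
    """MI (in nats) of an nx-by-ny equal-width grid over the scatterplot."""
    pxy, _, _ = np.histogram2d(x, y, bins=(nx, ny))
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def mic(x, y, max_bins=8):
    """Maximum normalized MI over the grid resolutions searched."""
    best = 0.0
    for nx in range(2, max_bins + 1):
        for ny in range(2, max_bins + 1):
            mi = mutual_information(x, y, nx, ny)
            best = max(best, mi / np.log(min(nx, ny)))  # normalize to [0, 1]
    return best

def mic_weights(factors, error):
    """Eq. (18): w_i = MIC(X_i) / sum_j MIC(X_j)."""
    scores = np.array([mic(f, error) for f in factors])
    return scores / scores.sum()
```

With the MIC scores in hand, Eq. (18) is just a normalization, so the resulting weights always sum to one.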
Then, using the weights in Table 4, the N heats with the highest similarity are selected according to Eqs. (4) and (5) to calculate the prediction error of the new heat. Experiments show that when N is set to 9, 10, 9, and 6 for SVR, BPNN, ELM, and MMM, respectively, the accuracy of these models is most significantly improved by EC-CBR. The variation of the RMSE of the four models with N is shown in Fig. 3.
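The retrieval and correction steps can be sketched as below. Eqs. (4) and (5) are not reproduced in this section, so the similarity measure here is a generic weighted form standing in for them, and all names are illustrative; features are assumed min-max normalized to [0, 1].

```python
import numpy as np

def similarity(x_new, x_case, weights):
    """Weighted similarity between a new heat and a stored heat
    (stand-in for Eqs. (4)-(5); the paper's exact form may differ)."""
    return float(np.sum(weights * (1.0 - np.abs(x_new - x_case))))

def ec_cbr_correct(x_new, y_pred_new, case_x, case_err, weights, n_reuse):
    """Subtract the similarity-weighted error of the N most similar
    historical heats from the model's raw prediction."""
    sims = np.array([similarity(x_new, c, weights) for c in case_x])
    top = np.argsort(sims)[-n_reuse:]            # N most similar heats
    w = sims[top] / sims[top].sum()              # similarity weights
    est_error = float(np.dot(w, case_err[top]))  # estimated model error
    return y_pred_new - est_error
```

Applying this per new heat gives the corrected prediction: the raw model output minus the similarity-weighted error of the N most similar historical heats.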
The hit rates and RMSE of models before and after EC-CBR correction are shown in Figs. 4 and 5. According to Fig. 4, compared with the four original models, the RMSE of each model is significantly reduced after being corrected by the proposed method. The hit rates in the range of ±5°C, ±7°C, and ±10°C are significantly improved. The accuracy of the three DMs is improved by approximately 5% in the range of ±5°C. The hit rates of the BPNN model are 6.69%, 4.32%, and 3.34% higher than those of the original model in the ranges of ±5°C, ±7°C, and ±10°C, respectively. The prediction accuracy of each DM is higher than 70% in the range of ±5°C, proving that the EC-CBR can effectively improve the model’s accuracy. The accuracy of the MMM is also greatly improved, with a hit rate in the range of ±5°C increasing by 21.73%, and the RMSE decreasing from 16.36 to 8.71. However, since the MMM has lower inherent prediction accuracy, even after being corrected, it still cannot match the accuracy of the DMs.
The EC-CBR proposed in this paper combines CBR with other models and uses CBR to correct the errors of other artificial intelligence models. Unlike traditional CBR, EC-CBR reuses the prediction errors of multiple similar cases, instead of directly using the solutions of similar cases as the solutions of new cases. In this study, the general CBR is used to establish an end temperature prediction model and compared with EC-CBR. The CBR model's initial case base and test data are taken from the training and test sets divided in Section 3.2. MIC is again used to calculate the weight of each influencing factor. Consistent with EC-CBR, the end temperatures of the similar heats are averaged, weighted by similarity.
The weights of influencing factors are calculated and presented in Table 5. Under this weight distribution, the general CBR model achieved the highest prediction accuracy for the end temperature of molten steel in LF when N was set to 4, with an RMSE of 5.26. According to the calculation, the hit rates of the general CBR model in the range of ±5°C, ±7°C, and ±10°C are 68.90%, 81.27%, and 92.64%, respectively.
Model | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9
---|---|---|---|---|---|---|---|---|---
CBR | 0.013 | 0.009 | 0.120 | 0.048 | 0.265 | 0.087 | 0.030 | 0.086 | 0.012

Model | X10 | X11 | X12 | X13 | X14 | X15 | X16 | X17
---|---|---|---|---|---|---|---|---
CBR | 0.025 | 0.141 | 0.053 | 0.033 | 0.018 | 0.022 | 0.016 | 0.022
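For contrast, the reuse step of the general CBR model can be sketched the same way; here the end temperatures of the similar heats are reused directly, rather than the model errors. The similarity form is again a hypothetical stand-in for Eqs. (4) and (5), with features assumed normalized to [0, 1].

```python
import numpy as np

def cbr_predict(x_new, case_x, case_temp, weights, n_reuse):
    """General CBR: reuse the end temperatures of the N most similar
    heats directly, averaged by similarity (contrast with EC-CBR,
    which reuses the prediction *errors* of similar heats)."""
    sims = np.sum(weights * (1.0 - np.abs(case_x - x_new)), axis=1)
    top = np.argsort(sims)[-n_reuse:]     # N most similar heats
    w = sims[top] / sims[top].sum()       # similarity weights
    return float(np.dot(w, case_temp[top]))
```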
Based on the production data of a steel plant, Tables 2 and 6 show that the general CBR achieves accuracy and hit rates similar to those of the three original DMs. However, its performance is inferior to that of any DM corrected by EC-CBR. The hit rate of the BPNN model corrected by EC-CBR within ±5°C is 4.68% higher than that of the general CBR. To further analyze the effectiveness of EC-CBR, the hit rates of the three corrected DMs and the general CBR are collected in 2°C intervals and plotted together in Fig. 6. It can be seen from Fig. 6 that the DMs corrected by EC-CBR have a clear advantage in the hit-rate distribution over these subdivided intervals: the peaks of their hit-rate curves are significantly higher than that of CBR. The higher the hit rate of a model, the more advantageous it is for practical applications. Thus, EC-CBR has better prediction performance than the general CBR.
Model | RMSE | ±5°C/% | ±7°C/% | ±10°C/%
---|---|---|---|---
SVR (EC-CBR) | 5.02 | 73.24 | 84.62 | 92.98
BPNN (EC-CBR) | 5.15 | 73.58 | 83.61 | 93.98
ELM (EC-CBR) | 5.25 | 73.24 | 83.28 | 91.97
CBR | 5.26 | 68.90 | 81.27 | 92.64
This paper proposed an error correction method based on case-based reasoning (EC-CBR) to reduce the error of prediction models caused by the discrepancy between actual production data and training data. The method combines the incremental learning advantage of CBR with the ability of other models to fit nonlinear relations, and it can improve the accuracy of prediction models. Based on the actual production data of the LF refining process in a steel plant, three artificial intelligence algorithms (SVR, BPNN, and ELM) as well as a metallurgical mechanism model are used to establish models and predict the end temperature of molten steel. The prediction results of each model are corrected by the proposed method and compared with the general CBR. The following conclusions can be drawn:
(1) Because the process parameters of LF refining are complex and changeable, it is difficult for artificial intelligence algorithms such as SVR, BPNN, and ELM to establish high-precision prediction models of the LF molten steel end temperature. Additionally, since many parameters during the refining process cannot be obtained, the accuracy of the MMM is even lower. After correction by the EC-CBR method proposed in this paper, the RMSE of these models decreases by 0.66, 1.04, 0.81, and 10.63, respectively. The DMs show an increase of approximately 5% in hit rate within ±5°C, and the MMM shows an increase of 21.73%. These results demonstrate the effectiveness of the proposed method, especially within narrow hit-rate ranges.
(2) The general CBR is used to predict the end temperature of molten steel in LF and is compared with the EC-CBR method. The EC-CBR method combines the strengths of CBR and other models to increase prediction accuracy by using CBR to predict the error of models. The accuracy of SVR, BPNN, and ELM models corrected by EC-CBR in the range of ±5°C is 4.34%, 4.68%, and 4.34% higher than that of the general CBR, respectively.
This work was supported by Key Laboratory of Metallurgical Industry Safety & Risk Prevention and Control, Ministry of Emergency Management.
On behalf of all of the authors, the corresponding author states that there is no conflict of interest.