ISIJ International
Online ISSN : 1347-5460
Print ISSN : 0915-1559
ISSN-L : 0915-1559
Instrumentation, Control and System Engineering
Prediction Method of Core Dead Stock Column Temperature Based on PCA and Ridge Regression
Wenyan Wang, Xiaofan Zhang, Kun Lu, Bing Dai, Jun Zhang, Peng Chen, Bing Wang
JOURNAL OPEN ACCESS FULL-TEXT HTML

2021 Volume 61 Issue 11 Pages 2785-2791

Abstract

The core temperature of a blast furnace reflects the working status of the hearth. However, the temperature of the core dead stock column cannot be measured directly by sensors. This work therefore proposes a prediction model for the core dead stock column temperature based on principal component analysis (PCA) and ridge regression, in which PCA and the Pearson correlation coefficient are used for feature extraction and ridge regression is employed to handle multi-collinearity. On an in-house dataset collected over three consecutive months, the R-squared of the model on the training set reaches 88% and the average relative error on the test set is only 0.33%, which shows the effectiveness of the proposed model.

1. Introduction

The hearth is a crucial region of the blast furnace.1) Its working status has an important effect on the operation of the blast furnace. To maintain stable operation of the hearth, it is necessary to obtain indications of its prevailing state.2,3) As one of the most critical factors reflecting the working status of the hearth, hearth activity affects the smooth progress of production and even the life of the blast furnace.4,5,6) Blast furnace operators have therefore paid increasing attention to hearth activity. Currently, hearth activity is mainly investigated through the temperature of the blast furnace core, the gas and liquid permeability of the hearth, and the fluidity of slag and iron. Generally, when the temperature of the dead stock column fluctuates between 1380°C and 1450°C, the activity of the hearth is normal. If this temperature falls below 1360°C, the hearth activity decreases because the viscosity of the slag increases, which deteriorates the gas and liquid permeability of the dead stock column. Corresponding measures should then be taken as quickly as possible to improve the hearth activity and ensure stable and smooth operation of the hearth.

However, there are few studies on how to evaluate hearth activity quantitatively. Du et al. proposed a furnace heat index model based on the heat balance in the high temperature region to predict the blast furnace temperature, and Feng et al. simulated the deposition of slag particles on the furnace wall through phase transformation and heat transfer.7,8,9) Because these calculations rest on a static description of the blast furnace, they are difficult to adapt to its instantaneous changes. Liu et al. explored the variation law of the furnace core temperature by analyzing the state of the dead stock column,10,11) and Zhou et al. analyzed the activity of the blast furnace hearth by monitoring the flame temperature of the raceway zone.12,13,14) However, these laws are theoretical and no good quantitative model is given, so the hearth activity cannot be characterized intuitively. Chen developed a new index based on the flow resistance coefficient of slag and hot metal to estimate hearth activity, and Upadhyay et al. developed a mathematical model that simulates the variation in hot metal and slag accumulation and temperature during the tapping of the blast furnace, based on the heat transfer between metal and slag, metal and solids, etc.15,16,17) However, the new index relies on a judgment of the flow resistance coefficient of the slag and iron, and the mathematical model reflects only the temperature variation of the hot metal. Neither accurately reflects the hearth activity, and the description of the hearth activity state remains vague.

Many state and control parameters affect the temperature of the dead stock column, and they contribute differently to temperature prediction. Therefore, to reduce the heavy work of variable collection and the computational complexity of the regression model, principal component analysis (PCA), which rests on a simple mathematical principle, is used in this work to select the most important factors affecting the temperature, while ridge regression is used to reduce information redundancy and increase model robustness. This work thus presents a quantitative prediction method based on PCA and ridge regression to predict the temperature of the core dead stock column for hearth activity evaluation.

2. Data and Method

2.1. Data Preprocessing

2.1.1. Data Collection

The data used in this work were collected from the production process of a blast furnace with 38 tuyeres, 4 iron notches, and a volume of 4747 m³. Because the temperature of the core dead stock column cannot be obtained directly, 50 state parameters, such as cold air pressure and oxygen enrichment rate, were collected to describe the state of the blast furnace. Some of the parameters are denoted by the abbreviations listed in Table 1. The parameters were recorded once an hour over 91 consecutive days, from October 19, 2017 to January 18, 2018.

Table 1. Part of operation process parameters.

Abbreviation    Parameter
T3              Top gas riser temperature 3
tH2             Top gas composition H2
tCO             Top gas CO
tCO2            Top gas CO2
ηCO,C           Utilization rate of CO
P20             20.080 m furnace body static pressure
PA20            20.080 m furnace body static pressure at position A
PB20            20.080 m furnace body static pressure at position B
PC20            20.080 m furnace body static pressure at position C
…               …
P26             26.025 m furnace body static pressure
PA26            26.025 m furnace body static pressure at position A
PB26            26.025 m furnace body static pressure at position B
PC26            26.025 m furnace body static pressure at position C
TC              Cross center temperature
Tf              Theoretical combustion temperature
IBF             Blast furnace bosh gas index
TZ              Z position temperature
AO              Central ore addition

2.1.2. Missing Value Imputation

Operation process data of blast furnaces generally suffer from missing values for various reasons, which is very common in industrial settings. These missing values can significantly affect subsequent analysis, so they must be estimated as accurately as possible before the algorithms are applied.18) In our dataset, there are 503 missing values of T3, 183 missing values of FCW, and 70 missing values of TC. This work therefore adopts two imputation methods to replace the missing data with substituted values. When a single data point is missing, mean substitution is used. For example, if the value at 3 pm on October 25 is missing, it is replaced by the average of the values at 2 pm and 4 pm on October 25. When more values are missing, a regression substitution strategy is used, and the estimate of the i-th missing value is expressed as:

$$
Z_i = \beta_0 + \sum_{\omega=1}^{\Omega} \beta_\omega V_{\omega i} \tag{1}
$$
where β is the regression coefficient, Ω represents the number of features linearly related to feature Z and V is a known variable with no missing values.
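The two imputation strategies can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the use of NaN to mark missing values, and the least-squares fit of the coefficients in Eq. (1) on the complete rows are all assumptions.

```python
import numpy as np


def impute_single_gaps(z):
    """Fill isolated missing points with the mean of their two neighbours."""
    z = np.asarray(z, dtype=float).copy()
    for i in range(1, len(z) - 1):
        if np.isnan(z[i]) and not np.isnan(z[i - 1]) and not np.isnan(z[i + 1]):
            z[i] = 0.5 * (z[i - 1] + z[i + 1])
    return z


def impute_by_regression(z, V):
    """Fill remaining gaps with Eq. (1): z_i = b0 + sum_w b_w * V[i, w]."""
    z = np.asarray(z, dtype=float).copy()
    V = np.asarray(V, dtype=float)
    missing = np.isnan(z)
    A = np.column_stack([np.ones(len(z)), V])   # intercept column + predictors
    # Fit the coefficients on rows where z is known (assumed fitting procedure).
    beta, *_ = np.linalg.lstsq(A[~missing], z[~missing], rcond=None)
    z[missing] = A[missing] @ beta
    return z


# Synthetic example: z depends linearly on one fully observed variable v.
v = np.arange(10.0).reshape(-1, 1)
z = 2.0 + 3.0 * v[:, 0]
z[[2, 5, 6]] = np.nan                           # one isolated gap, one longer gap
filled = impute_by_regression(impute_single_gaps(z), v)
print(filled)
```

The isolated gap at index 2 is filled by its neighbours, while the two-point gap is left for the regression step, which mirrors the split between the two cases described above.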

2.1.3. Target Value Calculation

The temperature of the core dead stock column (DMT) in the blast furnace directly reflects the temperature changes of the dead coke reactor and is a timely indicator of the permeability of the reactor. The higher the DMT, the stronger the permeability of the dead coke reactor and the better the hearth activity, and vice versa.4) Although the DMT cannot be measured directly during the operation of the blast furnace, it can be calculated with an empirical formula developed by Shibaike et al.20) as follows:

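$$
\begin{aligned}
\mathrm{DMT} ={}& 0.165 \times \frac{T_f \times V_{\mathrm{bosh}}}{D^{3}} + 2.445 \times (\mathrm{FR} - 483) \\
&+ 2.91 \times (\Delta t - 107) - 11.2 \times (\eta_{\mathrm{CO,C}} - 27.2) \\
&+ 28.09 \times (D_{\mathrm{pcoke}} - 25.8) + 326
\end{aligned}
\tag{2}
$$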
where Tf is the theoretical combustion temperature, Vbosh is the bosh gas volume, D is the hearth diameter, FR is the fuel ratio, Δt is the slag fluidity index, ηCO,C is the utilization rate of CO, and Dpcoke is the coke size of the core dead stock column. The temperature of the core dead stock column calculated with this formula is distributed as shown in Fig. 1. Figure 2 shows the number of target values in different intervals: the abscissa is the target value interval, with an interval width of 20°C, and the ordinate is the number of target values in the interval.
Fig. 1. Distribution of target value.

Fig. 2. Distribution of target value interval.
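As an illustration, Eq. (2) can be evaluated term by term with a small function. The function and argument names are hypothetical, and because this section does not state the units of the inputs, the caller is assumed to supply them in the units of the original empirical formula.

```python
def dead_stock_column_temperature(t_f, v_bosh, d, fr, delta_t, eta_co, dp_coke):
    """Evaluate the empirical DMT formula of Eq. (2).

    Argument units are NOT specified here (assumption: same units as the
    original empirical formula).
    """
    return (
        0.165 * t_f * v_bosh / d**3      # combustion temperature / bosh gas term
        + 2.445 * (fr - 483.0)           # fuel ratio term
        + 2.91 * (delta_t - 107.0)       # slag fluidity index term
        - 11.2 * (eta_co - 27.2)         # CO utilization term (negative sign)
        + 28.09 * (dp_coke - 25.8)       # coke size term
        + 326.0                          # constant offset
    )


# At the reference values inside the brackets, only the first term and the
# constant remain; setting v_bosh to 0 isolates the constant.
print(dead_stock_column_temperature(2000.0, 0.0, 10.0, 483.0, 107.0, 27.2, 25.8))
```

Writing each bracketed term on its own line makes the reference values (483, 107, 27.2, 25.8) easy to check against Eq. (2).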

After the above data processing, the final dataset in this work has 1955 samples, each with 50 input features and one target value. A sample is the information collected at one time point, and the input features are the values of the collected parameters.

Although the temperature of the core dead stock column (DMT) can be accurately calculated by Eq. (2), some parameter values used in this formula are not easy to obtain. For example, the coke size of the dead stock column (Dpcoke) changes constantly and generally has to be given by blast furnace operators according to their experience, and the calculation of the flow coefficient of slag and iron (Δt) is complicated. In contrast, the parameters used in this work can be collected accurately and conveniently, and a few of them that contain most of the original information are selected with the PCA method, which reduces the computational complexity of the prediction model. Moreover, the ridge regression method used to predict the temperature of the dead stock column eliminates the effect of redundant information such as collinearity.

2.2. Prediction Model Construction

Compared with traditional machine learning methods such as PCA and ridge regression, a recurrent neural network can use the time information stored in the memory units of its architecture to predict future time-related data. However, it contains a large number of trainable parameters that are iteratively updated during training, which leads to unstable prediction results, and fully training these parameters usually requires a large amount of raw data. ARIMA is the most common statistical method for time series forecasting and is simple to compute, but it requires the time series to be stationary and cannot capture non-linear relationships. In contrast, PCA and ridge regression are well established, contain only a few model parameters, and are easy to compute. Therefore, PCA and ridge regression were used to analyze the relationship between the input parameters and the target values.

2.2.1. Principal Component Analysis

Because the temperature of the core dead stock column cannot be measured directly, many peripheral parameters of the blast furnace have to be collected to predict the DMT value. However, there is inevitably redundancy among the 50 state parameters from the production process. To overcome this problem, principal component analysis is adopted, which extracts the useful information without losing too much of it. PCA is a well-established technique for dimensionality reduction. It transforms the original variables onto new axes, the principal components (PCs), which are orthogonal, so that the data expressed on those axes are uncorrelated with each other.20,21)

Herein, the 50 features in the data set are represented by the variables X1, X2, ..., X50, with $X = (X_1, X_2, \ldots, X_{50})^{T}$. Let the mean of the vector X be μ and its covariance matrix be Σ.

A new synthetic variable can be obtained by linear transformation of X, which is represented by Y. That is to say, the new synthetic variable can be expressed linearly by the original variable and satisfies the following formula:   

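$$
\begin{cases}
Y_1 = \mu_{1,1} X_1 + \mu_{1,2} X_2 + \cdots + \mu_{1,50} X_{50} \\
Y_2 = \mu_{2,1} X_1 + \mu_{2,2} X_2 + \cdots + \mu_{2,50} X_{50} \\
\quad\vdots \\
Y_{50} = \mu_{50,1} X_1 + \mu_{50,2} X_2 + \cdots + \mu_{50,50} X_{50}
\end{cases}
\tag{3}
$$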

Since the linear transformation of the original variables can be chosen arbitrarily, the aim is to obtain synthetic variables that have the largest variance and are mutually uncorrelated. To meet these requirements, the linear transformation is restricted by the following principles:

(1) $\mu_i^{T}\mu_i = 1$, that is, $\mu_{i,1}^{2} + \mu_{i,2}^{2} + \cdots + \mu_{i,n}^{2} = 1$ ($i = 1, 2, \ldots, n$, where $n = 50$ here).

(2) $Y_i$ is uncorrelated with $Y_j$ ($i \neq j$; $i, j = 1, 2, \ldots, n$).

(3) Y1 has the largest variance among all linear combinations of X1, X2, ..., X50 that satisfy principle (1); Y2 has the largest variance among all such linear combinations uncorrelated with Y1; ...; Y50 has the largest variance among all such linear combinations uncorrelated with Y1, Y2, ..., Y49.22)

The synthetic variables Y1, Y2, ..., Y50 obtained under these three principles are called the principal components of the 50 variables. Their shares of the total variance decrease in turn.
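The three principles can be checked numerically with a short sketch that computes the principal components from the eigendecomposition of the covariance matrix Σ. The data here are synthetic and the function name `pca` is hypothetical. Standardizing the features first, which this sketch does not do, would give a PCA on the correlation matrix instead.

```python
import numpy as np


def pca(X):
    """Return eigenvalues (descending), unit-norm loading vectors and scores."""
    Xc = X - X.mean(axis=0)                    # centre each feature
    cov = np.cov(Xc, rowvar=False)             # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh is meant for symmetric matrices
    order = np.argsort(eigvals)[::-1]          # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                      # each column is one Y_i of Eq. (3)
    return eigvals, eigvecs, scores


# Synthetic data with one deliberately redundant feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=200)
eigvals, eigvecs, scores = pca(X)

print("share of total variance:", np.round(eigvals / eigvals.sum(), 3))
```

Each column of `eigvecs` plays the role of one coefficient vector in Eq. (3): the columns have unit norm (principle 1), the score columns are mutually uncorrelated (principle 2), and their variances equal the sorted eigenvalues (principle 3).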

2.2.2. Ridge Regression

The Pearson correlation coefficient is a statistic that measures the strength and direction of the linear relationship between two variables.23,24,25) The Pearson correlation coefficients calculated between all features, part of which are shown in Table 2, reveal a strong linear correlation between some features. To deal with this problem, ridge regression is applied in this work. Ridge regression is a remedy for the multi-collinearity problem and was first proposed by Hoerl and Kennard.26) It is an improvement on the ordinary least squares (OLS) method. The difference between ridge regression and OLS is the value k, which is added to the diagonal elements of the correlation matrix, so that biased regression coefficients are obtained. Ridge regression is thus a biased regression method.27,28,29)

Table 2. Pearson correlation coefficient between some features.

Feature  P20    PA20   PB20   PC20   P26    PA26   PB26   PC26
P20      1      0.982  0.954  0.613  0.754  0.95   0.97   0.987
PA20            1      0.977  0.654  0.803  0.938  0.971  0.99
PB20                   1      0.54   0.903  0.861  0.918  0.987
PC20                          1      0.197  0.799  0.756  0.59
P26                                  1      0.569  0.668  0.836
PA26                                        1      0.991  0.92
PB26                                               1      0.96
PC26                                                      1
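To make the collinearity in Table 2 concrete, the sketch below lists the feature pairs whose Pearson coefficient exceeds a cut-off. The values are transcribed from Table 2. The cut-off of 0.9 and the function name are illustrative assumptions, not thresholds used in this work.

```python
names = ["P20", "PA20", "PB20", "PC20", "P26", "PA26", "PB26", "PC26"]

# Upper triangle of Table 2, row by row, each row starting at the diagonal.
upper = [
    [1, 0.982, 0.954, 0.613, 0.754, 0.95, 0.97, 0.987],
    [1, 0.977, 0.654, 0.803, 0.938, 0.971, 0.99],
    [1, 0.54, 0.903, 0.861, 0.918, 0.987],
    [1, 0.197, 0.799, 0.756, 0.59],
    [1, 0.569, 0.668, 0.836],
    [1, 0.991, 0.92],
    [1, 0.96],
    [1],
]


def collinear_pairs(names, upper, threshold=0.9):
    """Return (feature_a, feature_b, r) for off-diagonal entries above threshold."""
    pairs = []
    for i, row in enumerate(upper):
        for offset, r in enumerate(row[1:], start=1):   # skip the diagonal entry
            if r > threshold:
                pairs.append((names[i], names[i + offset], r))
    return pairs


pairs = collinear_pairs(names, upper)
for a, b, r in pairs:
    print(f"{a:>5} ~ {b:<5} r = {r}")
```

Many pairs of static pressure measurements are almost linearly dependent, which is the situation in which ordinary least squares becomes unstable.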

Suppose that t features are selected from the data set and each feature contains m samples. The features and the target value are denoted by $(P_{i1}, P_{i2}, \ldots, P_{it}; Q_i)$, $i = 1, 2, \ldots, m$. The parameter estimates obtained by least squares are written in matrix form as follows:

$$
\hat{\alpha} = (P^{T}P)^{-1} P^{T} Q \tag{4}
$$

When there is multi-collinearity between the independent variables, so that $|P^{T}P| \approx 0$, the accuracy of least squares is seriously affected. To this end, a constant matrix $kI$ ($k > 0$) is added to $P^{T}P$, which makes $P^{T}P + kI$ a non-singular matrix. Hence, ridge regression is defined as:

$$
\hat{\alpha}(k) = (P^{T}P + kI)^{-1} P^{T} Q \tag{5}
$$

Formula (5) is the ridge regression estimate of α, where k is a non-negative constant called the ridge parameter.27) When k changes, $\hat{\alpha}(k)$ changes as well. Plotting each component of $\hat{\alpha}(k)$ against k in a plane coordinate system gives the ridge trace. Ridge traces are used to select an appropriate k value and appropriate independent variables: the most suitable k is obtained where the ridge trace of each regression coefficient becomes stable.
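Equations (4) and (5) and the ridge trace can be sketched on synthetic, nearly collinear data. The function name `ridge_estimate`, the data, and the grid of k values are assumptions made for illustration. Standardization of the variables, which the correlation-matrix form of ridge regression presupposes, is omitted for brevity.

```python
import numpy as np


def ridge_estimate(P, Q, k):
    """Closed-form estimate (P^T P + k I)^{-1} P^T Q of Eq. (5); k = 0 gives Eq. (4)."""
    t = P.shape[1]
    return np.linalg.solve(P.T @ P + k * np.eye(t), P.T @ Q)


rng = np.random.default_rng(1)
x1 = rng.normal(size=300)
x2 = x1 + 0.01 * rng.normal(size=300)          # nearly collinear with x1
P = np.column_stack([x1, x2])
Q = 3.0 * x1 + 0.1 * rng.normal(size=300)

# Ridge trace: one row of coefficients per value of k.
ks = [0.0, 0.1, 0.5, 1.0, 5.0]
trace = np.array([ridge_estimate(P, Q, k) for k in ks])
for k, coef in zip(ks, trace):
    print(f"k = {k:<4} coefficients = {np.round(coef, 3)}")
```

Plotting the columns of `trace` against `ks` would give the ridge trace. The coefficient vector shrinks as k grows, and the choice of k trades this bias against the instability of the least-squares solution.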

3. Result and Discussion

Based on in-house data collected from an industrial site, this work predicts the temperature of the core dead stock column from many peripheral parameters of the blast furnace. To remove the redundant information and multi-collinearity in the data, a computational model integrating PCA and ridge regression is developed to improve the prediction. To evaluate the prediction performance more objectively, the dataset is partitioned into two parts: 70% of the samples form the training set and 30% form the test set.

3.1. Performance of Model on Training Set and Test Set

The quality of the model is judged by R-squared and several error indicators. The R-squared of the model is 88% on the training set and 86% on the test set. Figures 3 and 4 compare the predicted and actual values on the training set and the test set, respectively. The model fits the data points well on both sets without over-fitting, which indicates that the ridge regression model is reasonable.

To further describe the accuracy of the model, the relative error, average relative error, and average absolute error are used. Figures 5 and 6 show the relative errors of the model on the training set and test set. On the training set, the average relative error is 0.46%, the relative error is concentrated between 0 and 0.6%, and the average absolute error is 5.12°C. On the test set, the average relative error is 0.32%, the relative error is concentrated between 0 and 0.5%, and the average absolute error is 4.46°C. Figure 7 shows the difference between the actual value and the predicted value. In most cases the actual value is lower than the predicted value, so the model has a low temperature warning function to a certain extent.

To further explore the error distribution, the error of the model on the test set is divided into five intervals: (−15°C, −5°C), (−5°C, 0°C), (0°C, 5°C), (5°C, 10°C), and (10°C, 25°C). Figure 8 shows the distribution over the five intervals. The intervals (0°C, 5°C) and (−5°C, 0°C) account for 51.55% and 19.67% of the errors, respectively, while the intervals (10°C, 25°C) and (−15°C, −5°C) account for only 7.1% and 5.28%. Most of the errors are therefore concentrated between −5°C and 5°C, which indicates that the error distribution is concentrated and the model is stable. Thus, the model achieves good results on both the training set and the test set, which demonstrates its effectiveness.

Fig. 3. Comparisons of actual and predicted values on training set. (Online version in color.)

Fig. 4. Comparisons of actual and predicted values on test set. (Online version in color.)

Fig. 5. Relative error of training set.

Fig. 6. Relative error of test set.

Fig. 7. Difference of actual and predicted values on test set.

Fig. 8. Distribution of error intervals. (Online version in color.)
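The indicators reported in this subsection can be computed as in the following sketch. The temperatures are made-up illustrative values, not data from this work. Taking the error as predicted minus actual and treating the intervals as half-open are assumptions, since the sign convention and boundary handling are not stated.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def average_relative_error(actual, predicted):
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def average_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def interval_shares(errors, edges):
    """Share of errors in each half-open interval [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for e in errors:
        for i in range(len(edges) - 1):
            if edges[i] <= e < edges[i + 1]:
                counts[i] += 1
                break
    return [c / len(errors) for c in counts]


actual = [1400.0, 1420.0, 1390.0, 1450.0]        # illustrative values only
predicted = [1404.0, 1417.0, 1396.0, 1452.0]
errors = [p - a for a, p in zip(actual, predicted)]  # assumed sign convention
edges = [-15, -5, 0, 5, 10, 25]                  # interval bounds from the text

print("R-squared:", round(r_squared(actual, predicted), 4))
print("average relative error (%):", round(100 * average_relative_error(actual, predicted), 3))
print("average absolute error:", average_absolute_error(actual, predicted))
print("interval shares:", interval_shares(errors, edges))
```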

3.2. Preliminary Feature Selection via PCA and Pearson Coefficient

To solve the problem of information redundancy in the original dataset, PCA is used to transform the data into a new coordinate system and find the orthogonal principal components (PCs) with large variances, which removes the information redundancy. The PCA results for the original dataset are presented in Table 3, where the top fifteen PCs are shown. About 98.5% of the cumulative variance is captured by the top 15 principal components, which means that 98.5% of the information is contained in these 15 components. In this way, PCA carries the original information over into the new coordinate system. To reduce the computational cost of prediction, the top five PCs, which cover 84% of the information in the original dataset, are selected, and the data dimension is reduced from 50 to 5.

Table 3. Results from the principal component analysis for the first fifteen principal components.

Component   Initial eigenvalues
            Total     % of Variance   Cumulative %
1           23.686    48.338          48.338
2           6.377     13.013          61.352
3           4.239     8.65            70.002
4           3.797     7.748           77.75
5           3.063     6.25            84
6           2.061     4.205           88.205
7           1.194     2.437           90.643
8           0.92      1.877           92.52
9           0.695     1.419           93.939
10          0.588     1.2             95.139
11          0.473     0.965           96.104
12          0.396     0.809           96.913
13          0.301     0.615           97.528
14          0.257     0.525           98.053
15          0.211     0.431           98.484

Total: characteristic root (eigenvalue).
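The cumulative column of Table 3 is the running sum of the per-component variance percentages, which the short check below reproduces. The helper `components_needed` and the 80% threshold are illustrative assumptions. Because the tabulated percentages are rounded, the recomputed sums can differ from the table in the last digit.

```python
from itertools import accumulate

# "% of Variance" column of Table 3 for the first fifteen components.
variance_pct = [48.338, 13.013, 8.65, 7.748, 6.25, 4.205, 2.437, 1.877,
                1.419, 1.2, 0.965, 0.809, 0.615, 0.525, 0.431]

cumulative = list(accumulate(variance_pct))      # running sum of the shares


def components_needed(cumulative_pct, threshold):
    """Smallest number of leading PCs whose cumulative share reaches threshold."""
    for n, c in enumerate(cumulative_pct, start=1):
        if c >= threshold:
            return n
    return None                                  # threshold not reached


print("cumulative share of the first five PCs:", round(cumulative[4], 3))
print("PCs needed for at least 80%:", components_needed(cumulative, 80.0))
```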

In the PCA, the top five components represent 84% of the information in the original dataset, and each principal component is a linear combination of all state parameters. Therefore, this work does not take these principal components as the input of the prediction model, but uses them to select parameters that can largely replace the original data. To find the state parameters that carry a large amount of information, the correlation between each principal component and each parameter is calculated to characterize the contribution of the state parameters.

The selected features must therefore correlate significantly with the first five PCs and, furthermore, must also be highly correlated with the target value. Table 4 lists the features with a PC correlation coefficient greater than 0.7 and a Pearson correlation coefficient with the target value greater than 0.5.

Table 4. Correlation coefficients between features and PCs, between features and the target value.

PC     Feature   Correlation coefficient with PC   Pearson coefficient with target value
PC1    tH2       0.727                             0.555
PC1    tN2       0.947                             0.547
PC1    FRO       0.966                             0.636
PC1    TCA       0.794                             0.614
PC1    P20       0.894                             0.701
PC1    PA20      0.892                             0.682
PC1    PB20      0.875                             0.674
PC1    PA26      0.861                             0.636
PC1    PB26      0.883                             0.661
PC1    PC26      0.893                             0.696
PC1    DLP       0.885                             0.686
PC1    RO        0.965                             0.661
PC1    Tf        0.84                              0.672
PC2    BM        0.81                              0.593
PC2    IBG       0.893                             0.57
PC4    PC20      0.708                             0.592
PC5    P26       0.734                             0.548
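The two-threshold screening behind Table 4 can be sketched as follows on synthetic data. The function name, the standardization step, and the use of absolute correlation values are assumptions, and the data are random rather than furnace measurements.

```python
import numpy as np


def select_features(X, y, n_pcs=5, pc_thresh=0.7, target_thresh=0.5):
    """Indices of features correlated with a leading PC and with the target."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)            # standardise features
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    top = eigvecs[:, np.argsort(eigvals)[::-1][:n_pcs]]  # leading loading vectors
    scores = Z @ top                                     # leading PC scores
    selected = []
    for j in range(X.shape[1]):
        # Strongest absolute correlation between feature j and any leading PC.
        pc_corr = max(abs(np.corrcoef(Z[:, j], scores[:, c])[0, 1])
                      for c in range(scores.shape[1]))
        # Absolute Pearson correlation between feature j and the target.
        y_corr = abs(np.corrcoef(X[:, j], y)[0, 1])
        if pc_corr > pc_thresh and y_corr > target_thresh:
            selected.append(j)
    return selected


# Three features share one latent driver of the target; three are pure noise.
rng = np.random.default_rng(2)
f = rng.normal(size=500)
X = np.column_stack(
    [f + 0.1 * rng.normal(size=500) for _ in range(3)]
    + [rng.normal(size=500) for _ in range(3)]
)
y = f + 0.1 * rng.normal(size=500)

selected = select_features(X, y, n_pcs=1)
print("selected feature indices:", selected)
```

Only the features that both load strongly on a leading component and track the target pass the screen. The noise features fail the target criterion regardless of their loadings.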

3.3. Model for Calculation of DMT

Owing to the multi-collinearity among the selected features, ridge regression analysis is employed to eliminate its effects. In this work, the selected features are represented by the variables x1, x2, ..., x50, and the purpose is to find an optimized k. Figure 9 presents the ridge traces of the preliminarily selected features as functions of k, before any variables are deleted. The ridge traces of many variables in Fig. 9 tend to 0, which indicates that these variables are not related to the target variable and can therefore be deleted. In addition, the ridge traces of some variables in Fig. 9 cross each other, which indicates that the regression coefficients of these variables affect each other. In other words, these variables are collinear, and some of them can be deleted as appropriate. The remaining variables are shown in Fig. 10. When k = 0.5, the regression coefficients are basically stable, which indicates that the influence of multi-collinearity in the ridge regression is basically eliminated.

Fig. 9. Ridge traces of the features coefficients (before deleting). (Online version in color.)

Fig. 10. Ridge traces of the features coefficients (after deleting). (Online version in color.)

Thus, k is set to 0.5 to perform the ridge regression, and the prediction model of the DMT can be described as follows:

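$$
\begin{aligned}
P_{\mathrm{DMT}} ={}& 912.489 + 0.102\,T_{\mathrm{CA}} - 0.009\,P_{20} + 0.014\,D_{\mathrm{LP}} \\
&- 1.232\,R_{\mathrm{O}} + 0.102\,T_f + 4.261\,I_{\mathrm{BG}} - 0.011\,P_{\mathrm{C20}} \\
&- 0.007\,P_{26} + 0.005\,\mathrm{BM}
\end{aligned}
$$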
where PDMT is the predicted value of the DMT, TCA is the cold air temperature, P20 is the 20.080 m furnace body static pressure, DLP is the lower pressure difference, RO is the oxygen enrichment rate, Tf is the theoretical combustion temperature, PC20 is the 20.080 m furnace body static pressure at position C, P26 is the 26.025 m furnace body static pressure, BM is the blast momentum, and IBG is the blast furnace bosh gas index. It is worth noting that the parameters included in the final model contain no missing data in the original data set. Therefore, the error introduced by filling missing values with Eq. (1) has no effect on the experimental results. However, to ensure the integrity of the data, Eq. (1) still had to be used to fill the missing data at the beginning of this work.
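For illustration, the fitted model can be written as a small function. The dictionary keys and the function name are hypothetical labels for the nine parameters. The units of the inputs are not stated in this section and are assumed to match those of the collected data.

```python
# Coefficients of the fitted ridge model (k = 0.5), copied from the equation above.
COEFFICIENTS = {
    "T_CA": 0.102,    # cold air temperature
    "P_20": -0.009,   # 20.080 m furnace body static pressure
    "D_LP": 0.014,    # lower pressure difference
    "R_O": -1.232,    # oxygen enrichment rate
    "T_f": 0.102,     # theoretical combustion temperature
    "I_BG": 4.261,    # blast furnace bosh gas index
    "P_C20": -0.011,  # 20.080 m furnace body static pressure at position C
    "P_26": -0.007,   # 26.025 m furnace body static pressure
    "BM": 0.005,      # blast momentum
}
INTERCEPT = 912.489


def predict_dmt(sample):
    """Predicted DMT for one sample given as a mapping of parameter name to value."""
    return INTERCEPT + sum(c * sample[name] for name, c in COEFFICIENTS.items())


# With every input at zero the prediction reduces to the intercept.
zero_sample = {name: 0.0 for name in COEFFICIENTS}
print(predict_dmt(zero_sample))
```

Keeping the coefficients in a mapping makes the nine selected features explicit and raises a `KeyError` if a sample lacks one of them.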

4. Conclusion

In this paper, a method for predicting the temperature of the core dead stock column of a blast furnace based on PCA and ridge regression is presented. PCA is used to screen features that represent the original information as fully as possible. On this basis, the features that have a significant effect on the target value are selected using the Pearson correlation coefficient between the input features and the target value. Ridge regression is employed to overcome the multi-collinearity problem and establish the model. Finally, 9 features were selected for ridge regression modeling. The average relative error of the model is 0.46% on the training set and 0.32% on the test set. The experimental results show that the proposed model has a favorable predictive effect and is of great significance to blast furnace production.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Nos. 61272004, 61672035, and 61872004), Educational Commission of Anhui Province (No. KJ2019ZD05).

References
 
© 2021 The Iron and Steel Institute of Japan.

This is an open access article under the terms of the Creative Commons Attribution license.
https://creativecommons.org/licenses/by/4.0/