Fuel Ratio Optimization of Blast Furnace Based on Data Mining

Xiuyun Zhai; Mingtong Chen; Wencong Lu

doi:10.2355/isijinternational.ISIJINT-2020-238

Abstract

Despite its age, the blast furnace (BF) ironmaking process remains crucial to the iron and steel industry. To improve the competitiveness of enterprises, the fuel ratio (FR) of the BF ironmaking process needs to be kept as low as possible. In this work, a model was established to predict the FR of a BF by using feature selection and support vector regression (SVR). The GA-SVR (genetic algorithm - SVR) method was employed to select the five most informative features from the candidate features. The experimental results indicated that the SVR model achieved high fitting precision and good generalization ability. To explore the laws of BF production, the influences of the five features on FR were discussed through simulation analysis of the model. All the calculations were performed on the data mining computational platform developed by us. The work can guide operators in adjusting input parameters in advance. The methods outlined here can provide valuable hints for revealing the mechanisms of the BF ironmaking process and for realizing controlled BF production guided by quantitative analysis.

1. Introduction

Iron and steel making, a typical high-energy-consuming, high-emission and high-pollution industry, is also a fundamental industry of the contemporary world.^1,2,3,4) Blast furnace (BF) ironmaking receives special attention because it consumes more than 70% of the energy used in the whole production process.^5,6,7,8)

The interior of the BF consists of five zones from top to bottom: throat, shaft, belly, bosh and hearth (as shown in Fig. 1). A typical BF system mainly includes the BF body, the ore and coke feeding system, the hot blast system, the pulverized coal injection system, the top gas treatment system and the tapping system.⁹⁾ During BF operation, solid raw materials, mainly iron ore and coke, are charged alternately from the top of the furnace layer by layer. Simultaneously, high-pressure, high-temperature hot blast enriched with oxygen and carrying auxiliary materials such as natural gas or pulverized coal is blown into the bottom of the furnace through the tuyeres. Their combustion with coke in the raceway produces a great deal of heat and a reducing gas consisting of carbon monoxide and hydrogen.¹⁰⁾ The reducing atmosphere and the coke remove the oxygen from the burden to produce pig iron accompanied by slag, which are regularly tapped through the clay-lined tapholes for the subsequent steelmaking process.

Fig. 1.

Schematic of a typical BF ironmaking process. (Online version in color.)

Energy consumption, resource shortages and environmental problems caused by the development of the iron and steel industry are becoming increasingly serious. A low fuel ratio (FR), meaning a low total fuel consumption (including coke, pulverized coal, nut coke, etc.) per ton of molten iron produced, is fundamental to addressing these problems.¹¹⁾ It is therefore necessary to predict the FR of the BF quickly and efficiently in order to achieve low-energy-consumption production. However, constructing a model that relates FR to the numerous process parameters is challenging, because the BF is a complex industrial reactor in which multiple phases interact, with multiphase coupling and coexisting multiphysics fields,¹²⁾ among other effects.

Encouragingly, data mining applied to BF production has proved to be an effective approach to such problems.^13,14,15) Zhou et al. constructed a Hammerstein model to predict the most essential quality indicators of the BF with a least squares support vector machine-based nonlinear subspace identification method.¹⁶⁾ Xu et al. built a prediction model of the silicon content of the BF with support vector regression (SVR).¹⁷⁾ Many data mining methods have been used in such data-driven models, including principal component analysis (PCA),^18,19) SVR,^20,21,22) artificial neural networks (ANN),^23,24) genetic algorithms (GA),²⁵⁾ and so on. In particular, the support vector machine (SVM) has attracted considerable attention because it can mitigate the curse of dimensionality and limit overfitting when dealing with nonlinear problems. For instance, the smooth support vector regression (SSVR) model constructed by Jian et al.²⁶⁾ was used to predict the trend of the silicon content of BF hot metal. Because blast furnaces involve highly complicated chemical processes and transport phenomena, data-driven models²⁷⁾ built with the assistance of data mining are of vital importance for shedding light on the complex interrelations among variables in the ironmaking process.^28,29)

This work concentrates on the development of a support vector regression (SVR) model based on data mining to predict and optimize the FR of the BF with an optimal feature set of only five parameters. Quantitative control of FR can be realized by adjusting these parameters. In modeling, the GA-SVR method was used to find the optimal feature set. The experimental results indicated that the SVR-RBF (radial basis function) model, with high precision and generalization performance, can provide valuable guidance and references for operators and managers in pursuing low energy consumption, low emissions and cost savings, and in strengthening the competitiveness of enterprises.

2. Materials and Methods

2.1. Dataset and Features

The dataset was built by collecting one year of historical data from the BF (internal volume of 2000 m³) at an iron and steel company in China. Outliers were excluded because they occurred under abnormal production conditions (such as blowing-down, recording faults or overhaul). The resulting dataset for modeling comprises 326 samples. The dataset for testing the model consists of 87 samples collected from the BF in the first three months of the following year. The burden distribution pattern of the BF remained unchanged during the period of data collection.

Many process parameters of the BF affect FR to some degree, since the BF is a complex industrial reactor.^30,31) However, not all parameters related to FR can serve as model inputs, because some parameters are strongly correlated with one another and too many input variables lead to a highly complex model. It is therefore necessary to select candidate features from the numerous process parameters before modeling. Finally, thirty-five candidate features (Table 1) were determined from the experience of BF experts and correlation analysis. The output of the model is the FR of the BF in units of kg/t. The median value of FR is 561.5 kg/t.

Table 1. The list of the candidate features for modeling.

No.	Meanings	Features	No.	Meanings	Features
1	Grade of Iron (%)	X₁	19	Iron Losses (%)	X₁₉
2	Pig Iron [Ti] (%)	X₂	20	Unit Consumption of Nut Coke (kg/t)	X₂₀
3	Pig Iron [Si] (%)	X₃	21	Small Sinter (kg/t)	X₂₁
4	Blast Temperature (°C)	X₄	22	Comprehensive Ironmaking Strength (t/m³·d)	X₂₂
5	Top Gas Pressure (MPa)	X₅	23	Feed batch	X₂₃
6	Blast Volume (m³/min)	X₆	24	Gas Utilization Rate (%)	X₂₄
7	Coke-load (t/t)	X₇	25	Unit Consumption of Iron Ore (kg/t)	X₂₅
8	Utilization Coefficient (%)	X₈	26	Index Burden Permeability (Q/ΔP)	X₂₆
9	Slag Iron Ratio (kg/t)	X₉	27	Gray Iron Ratio (%)	X₂₇
10	The Basicity of Slag	X₁₀	28	Top Temperature (°C)	X₂₈
11	Oxygen-enriched Rate (%)	X₁₁	29	Sinter (t/batch)	X₂₉
12	Coke Ash (%)	X₁₂	30	Small Sinter (t/batch)	X₃₀
13	Coke Sulfur (%)	X₁₃	31	Pellet 1 (t/batch)	X₃₁
14	Coke M40 (%)	X₁₄	32	Pellet 2 (t/batch)	X₃₂
15	Coke M10 (%)	X₁₅	33	Huili Mine (t/batch)	X₃₃
16	Coke CSR (%)	X₁₆	34	Batch Weight of Coke (t/batch)	X₃₄
17	Coke < 25 mm (%)	X₁₇	35	Blast Speed (m/s)	X₃₅
18	Clinker Rate (%)	X₁₈

2.2. Support Vector Regression

A version of SVM for regression, called SVR, was proposed by Vapnik et al.³²⁾ in 1996. SVM is generalized to SVR by introducing an ε-insensitive region around the function, called the ε-tube. SVR is formulated as an optimization problem by first defining a convex ε-insensitive loss function to be minimized and then finding the flattest tube that contains most of the training samples.³³⁾ In SVR, points outside the tube are penalized, but those within the tube, either above or below the function, receive no penalty. The SVR problem formulation is derived from a geometrical perspective, using the nonlinear regression example shown in Fig. 2. Adopting a soft-margin approach like that employed in SVM, slack variables ξ and ξ* can be introduced to guard against outliers.

Fig. 2.

The diagram of SVR with ε-insensitive approach. (Online version in color.)

Training the original SVR means solving:³⁴⁾

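$$\text{minimize}\quad \frac{1}{2}\|\omega\|^{2} \tag{1}$$

$$\text{subject to}\quad \begin{cases} y_i - \langle \omega, x_i \rangle - b \le \varepsilon \\ \langle \omega, x_i \rangle + b - y_i \le \varepsilon \end{cases} \tag{2}$$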

where x_i is a training sample with target value y_i. The inner product plus intercept, <ω, x_i> + b, is the prediction for that sample, and ε is a free parameter that serves as a threshold: all predictions must lie within ε of the true target value.
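The constraints in Eq. (2), combined with the slack variables mentioned above, correspond to the ε-insensitive loss: residuals inside the tube cost nothing, and residuals outside it are penalized by their distance to the tube. The following is a minimal sketch of that loss; the function name and the sample values are illustrative assumptions, not taken from the dataset.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Return max(0, |y_true - y_pred| - eps), the epsilon-insensitive loss."""
    return max(0.0, abs(y_true - y_pred) - eps)


# Illustrative values only (not from the paper's dataset).
inside = eps_insensitive_loss(561.5, 561.0, eps=1.0)   # residual 0.5 is inside the tube
outside = eps_insensitive_loss(561.5, 558.0, eps=1.0)  # residual 3.5 exceeds the tube by 2.5
print(inside, outside)
```

A prediction that misses the target by less than ε therefore contributes nothing to the objective, which is what makes the fitted function insensitive to small noise.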

Vapnik introduced kernel functions into the SVM method, enabling SVM to handle nonlinear problems with linear algorithms. An open issue in SVR is the selection of parameter values for the kernel and loss functions.³⁵⁾ In this work, the SVR model employed the RBF as the kernel function,³⁶⁾ which is expressed as:

$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^{2}}{\sigma^{2}}\right) \tag{3}$$
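As a small illustration, the kernel of Eq. (3) can be evaluated as follows. This is a sketch that assumes the squared Euclidean distance is divided by σ², as Eq. (3) reads; the function name and the input vectors are hypothetical.

```python
import math


def rbf_kernel(x_i, x_j, sigma):
    """RBF kernel read from Eq. (3): exp(-||x_i - x_j||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq_dist / sigma ** 2)


# Identical points give the maximum similarity of 1;
# the value decays toward 0 as the points move apart.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0], sigma=1.0)
apart = rbf_kernel([0.0, 0.0], [1.0, 1.0], sigma=1.0)
print(same, apart)
```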

2.3. Implementation

The calculations were performed on the Online Computational Platform of Material Data Mining^37,38) developed by us, which can be used freely on the website (http://matdata.shu.edu.cn/ocpmdm/). Its predecessor is the HyperMiner software package^32,39,40) written by us, a free version of which can be downloaded from the website of the Laboratory of Materials Data Mining at Shanghai University (http://chemdata.shu.edu.cn:8080/MyLab/Lab/download.jsp).

3. Results and Discussion

3.1. Feature Selection

Feature selection is a key factor in determining whether a successful model can be established. It reduces the dimension of the feature space, which lowers the risk of overfitting, and removes features unrelated to the target value as well as noise interference.^41,42,43) It also shortens the training time and further improves the prediction ability and generalization performance of the model. In this work, the GA-SVR method was employed to screen the subset of features for modeling. Compared with other optimization algorithms, GA is able to escape from local optima on the response surface, and it is therefore widely used in optimization for its good global search capability. The root mean square error (RMSE) of 10-fold cross validation of the SVR models was used as the fitness function. The smaller the RMSE, the better the model. RMSE is defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (e_i - p_i)^{2}}{n}} \tag{4}$$

where e_i and p_i are the experimental and predicted values of FR for the ith sample, respectively, and n is the number of samples in the sample set.
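Eq. (4) translates directly into code. The sketch below uses made-up FR values purely to show the arithmetic; the function name is an assumption.

```python
import math


def rmse(experimental, predicted):
    """Root mean square error as in Eq. (4)."""
    if not experimental or len(experimental) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(experimental)
    squared = sum((e - p) ** 2 for e, p in zip(experimental, predicted))
    return math.sqrt(squared / n)


# Made-up FR values in kg/t; the errors are -2, 1 and 0.
value = rmse([560.0, 565.0, 558.0], [562.0, 564.0, 558.0])
print(round(value, 3))
```

In 10-fold cross validation, the predictions p_i would come from models trained without the fold that contains sample i.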

Figure 3 illustrates how GA compresses the candidate variable set to find the optimal feature set. The smallest RMSE emerged after GA had completed 48 generations. The optimal input variable set found consists of five features: Blast Temperature (X₄), Coke-load (X₇), Iron Losses (X₁₉), Pellet 2 (X₃₂) and Batch Weight of Coke (X₃₄). According to the BF production process, these five features affect FR directly or indirectly, so it is reasonable to use them as the predictors of the model.
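The search over feature subsets can be pictured with a small genetic algorithm on binary masks, one bit per candidate feature. The sketch below is not the platform's implementation: the operators (elitism, one-point crossover, bit-flip mutation), the population settings and the name `ga_select` are all assumptions. The fitness here is a toy stand-in that counts mismatches against a preset subset, whereas in the paper the fitness is the 10-fold cross-validation RMSE of an SVR trained on the selected features.

```python
import random


def ga_select(n_features, fitness, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Minimize fitness(mask) over binary feature masks with a simple GA."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    history = []                                 # best fitness per generation
    for _ in range(generations):
        pop.sort(key=fitness)                    # lower fitness is better
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]                   # elitism: keep the best mask
        while len(next_pop) < pop_size:
            # Parents are drawn from the better half of the population.
            p1, p2 = rng.sample(pop[:pop_size // 2], 2)
            cut = rng.randrange(1, n_features)   # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]             # bit-flip mutation
            next_pop.append(child)
        pop = next_pop
    best = min(pop, key=fitness)
    history.append(fitness(best))
    return best, history


# Toy fitness (assumption): distance to a preset subset of zero-based indices.
TARGET = {3, 6, 18, 31, 33}


def toy_fitness(mask):
    chosen = {i for i, bit in enumerate(mask) if bit}
    return len(chosen ^ TARGET)


best_mask, history = ga_select(35, toy_fitness)
print(history[0], history[-1])
```

Because the best mask is carried over unchanged, the recorded best fitness can never get worse from one generation to the next, which matches the shape of a curve like the one in Fig. 3.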

Fig. 3.

RMSE versus generation of evolution in GA. (Online version in color.)

3.2. Model Building

An SVR-RBF model was constructed to make quantitative predictions for unknown samples. The generalization performance of a nonlinear SVR-RBF model depends on the values of three hyper-parameters, namely C, ε and σ. Parameter C is a constant that determines the regularization penalty on estimation errors. Parameter ε controls the width of the ε-insensitive zone used to fit the training data. Parameter σ is the RBF width parameter that determines how far the influence of a single training example reaches, where a low value means ‘far’ and a high value means ‘close’. The three parameters were optimized by grid search: ε, C and σ were varied over [0.01, 0.1], [1, 30] and [0.5, 1.5] with steps of 0.02, 2 and 0.22, respectively. The optimization process is shown in Fig. 4. The best parameter values are those giving the lowest RMSE, that is, the best regression performance.
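The search grid implied by these ranges and steps can be enumerated as follows. This sketch assumes that the three intervals belong to ε, C and σ in that order, which is the assignment consistent with the optimum reported in the next paragraph; the helper `frange` is hypothetical.

```python
from itertools import product


def frange(start, stop, step):
    """Inclusive arithmetic grid, rounded to suppress floating-point drift."""
    values = []
    k = 0
    while round(start + k * step, 10) <= stop:
        values.append(round(start + k * step, 10))
        k += 1
    return values


eps_grid = frange(0.01, 0.1, 0.02)    # 0.01, 0.03, ..., 0.09
c_grid = frange(1, 30, 2)             # 1, 3, ..., 29
sigma_grid = frange(0.5, 1.5, 0.22)   # 0.5, 0.72, ..., 1.38

# Every (C, epsilon, sigma) combination to be scored by cross-validation RMSE.
grid = list(product(c_grid, eps_grid, sigma_grid))
print(len(c_grid), len(eps_grid), len(sigma_grid), len(grid))
```

Given a scoring function that returns the cross-validation RMSE for one combination, the best setting would be the grid point with the smallest score, for example `min(grid, key=cv_rmse)` with a hypothetical `cv_rmse`.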

Fig. 4.

Three hyper-parameters optimization of the SVR-RBF model. (Online version in color.)

The grid search showed that the optimal C, ε and σ were 3, 0.03 and 1.38, respectively. Using the optimized parameters, an SVR-RBF model was constructed with R_Cro (correlation coefficient) of 0.916, RMSE_Cro of 5.252 and Q²_Cro (determination coefficient) of 0.839 for 10-fold cross validation, where the subscript Cro denotes cross-validation. The predictive results for the training set and the test set are shown in Fig. 5. The model is expressed as follows:

$$y = \sum_{i=1}^{n} \beta_i \cdot \exp\left(-1.38 \cdot \|x - x_i\|^{2}\right) + 0.4506755 \tag{5}$$

where x is the unknown sample vector, x_i is the ith support vector of the model, n is the number of support vectors, and β_i is the Lagrange multiplier of the ith support vector.
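Evaluating Eq. (5) for a new sample only requires the support vectors and their multipliers. The sketch below follows Eq. (5) as printed, where the squared distance is multiplied by 1.38. The support vectors and β values are hypothetical placeholders, since the fitted ones are not listed in the paper, and any scaling applied to the inputs or output is not modeled.

```python
import math


def svr_rbf_predict(x, support_vectors, betas, sigma=1.38, bias=0.4506755):
    """Evaluate Eq. (5): sum_i beta_i * exp(-sigma * ||x - x_i||^2) + bias."""
    total = bias
    for sv, beta in zip(support_vectors, betas):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, sv))
        total += beta * math.exp(-sigma * sq_dist)
    return total


# Hypothetical support vectors and multipliers, for illustration only.
svs = [[0.0, 0.0], [1.0, 0.0]]
betas = [0.5, -0.2]
y = svr_rbf_predict([0.0, 0.0], svs, betas)
print(round(y, 4))
```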

Fig. 5.

The predictive results of the training set and the test set by using the SVR model. (Online version in color.)

R_Tra, RMSE_Tra and Q²_Tra of the model are 0.942, 4.386 and 0.888, and R_Tes, RMSE_Tes and Q²_Tes are 0.923, 5.460 and 0.837, respectively, where the subscripts Tra and Tes denote the training set and the test set. To further assess the prediction performance of the model, the relative deviations (RD) of FR for the training set and the test set were analyzed; the results are shown in Table 2. In the training set, the model predicts 80.37% of the samples with an RD of no more than 1%, and only 1.23% of the samples have an RD between 2% and 4%. In the test set, 72.41% of the samples are predicted with an RD of no more than 1%, and no sample has an RD of more than 2%. This analysis shows that the model has high prediction accuracy and practical value, so the SVR-RBF method is feasible not only in theory but also in application.

Table 2. Relative deviation (RD) of predictive values of FR for the training set and test set.

RD (%)	% of predictive training values	Cumulative percentage	% of predictive test values	Cumulative percentage
RD ≤ 1	80.37	80.37	72.41	72.41
1 < RD ≤ 2	18.40	98.77	27.59	100
2 < RD ≤ 4	1.23	100	0	100
Total	100		100
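A distribution like the one in Table 2 can be tabulated with a few lines of code. The sketch assumes that RD is the absolute prediction error divided by the experimental value, expressed in percent; the paper does not spell out the formula, and the sample values are made up.

```python
def rd_distribution(actual, predicted, edges=(1.0, 2.0, 4.0)):
    """Percentage of samples whose relative deviation (%) falls in each bin.

    Bins are RD <= edges[0], edges[0] < RD <= edges[1], and so on.
    Samples above the last edge are not counted in any bin.
    """
    counts = [0] * len(edges)
    for a, p in zip(actual, predicted):
        rd = abs(p - a) / a * 100.0
        for k, edge in enumerate(edges):
            if rd <= edge:
                counts[k] += 1
                break
    n = len(actual)
    return [100.0 * c / n for c in counts]


# Made-up values: the relative deviations are 0.4%, 1.5%, 3.0% and 0%.
shares = rd_distribution([500.0, 500.0, 500.0, 500.0],
                         [502.0, 507.5, 515.0, 500.0])
print(shares)
```

The percentages in Table 2 are consistent with 262, 60 and 4 of the 326 training samples, and with 63 and 24 of the 87 test samples.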

3.3. Simulation Analysis

This section discusses the effects of the individual variables on FR by means of regression analysis with the SVR-RBF model. A simulation study can be used to observe how the target variable varies with one variable while the other variables are fixed at their mean values. Figure 6 illustrates the variation of FR with Blast Temperature (X₄), Coke-load (X₇), Iron Losses (X₁₉), Pellet 2 (X₃₂) and Batch Weight of Coke (X₃₄), respectively, while the other four features are held constant. In Fig. 6(a), increasing Blast Temperature lowers FR in the low-temperature range, while FR rises slightly at higher blast temperatures, mainly because too much low-quality pulverized coal is blown into the BF when the blast temperature is higher. In Fig. 6(b), FR first increases with Coke-load and then decreases more sharply. In Figs. 6(c) and 6(e), FR increases with Iron Losses and Batch Weight of Coke, respectively. The opposite trend is observed in Fig. 6(d): FR decreases as Pellet 2 increases.
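The one-variable-at-a-time procedure described above can be sketched as a small helper that works with any trained model exposed as a function of the feature vector. The helper name and the linear toy model are assumptions used only to show the mechanics; they do not reproduce the curves of Fig. 6.

```python
def sweep_feature(model, means, index, values):
    """Predict while varying one feature and holding the others at their means."""
    curve = []
    for v in values:
        x = list(means)      # copy so that the mean vector is not modified
        x[index] = v
        curve.append((v, model(x)))
    return curve


# Toy stand-in for a trained model (assumption): a simple linear function.
def toy_model(x):
    return 2.0 * x[0] - x[1]


curve = sweep_feature(toy_model, means=[1.0, 3.0], index=0,
                      values=[0.0, 1.0, 2.0])
print(curve)
```

Repeating the sweep for each of the five selected features, with the fitted SVR-RBF model in place of the toy function, gives one response curve per panel.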

Fig. 6.

The variation of FR with (a) Blast Temperature (X₄), (b) Coke-load (X₇), (c) Iron Losses (X₁₉), (d) Pellet 2 (X₃₂), and (e) Batch Weight of Coke (X₃₄). (Online version in color.)

4. Conclusions

This work focuses on FR optimization of the BF, a critical problem directly related to the survival and development of iron and steel enterprises. To address it, data mining methods were employed to construct a model relating FR to five process parameters. For the quantitative control and optimization of the FR of the BF, an SVR-RBF model with R_Tra of 0.942, RMSE_Tra of 4.386 and Q²_Tra of 0.888 was established. With its high precision and generalization performance, the model can help BF foremen adjust the parameters in advance to optimize FR, and assist managers in making the right decisions to improve the competitiveness of the enterprise. All the calculations were performed on the data mining computational platform developed by us, which offers free of charge many data mining tasks that were previously available only in commercial software packages. The study combined the experience of BF foremen and experts with data mining techniques to solve an industrial optimization problem with strong noise and multivariable coupling. The method outlined here can thus provide valuable hints for industrial optimization with the assistance of data mining.

Acknowledgement

The authors acknowledge the financial support from the National Key Research and Development Program of China (No. 2016YFB0700504).

References

1) C. Yilmaz and T. Turek: J. Clean. Prod., 164 (2017), 1519.
2) J. Wu, R. Wang, G. Pu and H. Qi: Appl. Energy, 183 (2016), 430.
3) M. A. Quader, S. Ahmed, R. A. R. Ghazilla, S. Ahmed and M. Dahari: Renew. Sustain. Energy Rev., 50 (2015), 594.
4) M. Jampani, J. Gibson and P. C. Pistorius: Metall. Mater. Trans. B, 50 (2019), 1290.
5) K. Takahashi, T. Nouchi, M. Sato and T. Ariyama: ISIJ Int., 55 (2015), 1866.
6) I. F. Kurunov: Metallurgist, 54 (2010), 335.
7) S. Kuang, Z. Li and A. Yu: Steel Res. Int., 89 (2017), 1700071.
8) T. Okosun, A. K. Silaen and C. Q. Zhou: Steel Res. Int., 90 (2019), 1900046.
9) V. R. Radhakrishnan and A. R. Mohamed: J. Process Control, 10 (2000), 509.
10) J. A. de Castro, C. Takano and J.-i. Yagi: J. Mater. Res. Technol., 6 (2017), 258.
11) W. Chen, X. Yin and D. Ma: Appl. Energy, 136 (2014), 1174.
12) X. Zhang, M. Kano and S. Matsuzaki: Comput. Chem. Eng., 121 (2019), 442.
13) P. Zhou, D. Guo and T. Chai: Neurocomputing, 308 (2018), 101.
14) C. Gao, L. Jian, X. Liu, J. Chen and Y. Sun: IEEE Trans. Neural Networks, 22 (2011), 2272.
15) A. Nurkkala, F. Pettersson and H. Saxén: Ind. Eng. Chem. Res., 50 (2011), 9236.
16) P. Zhou, D. Guo, H. Wang and T. Chai: IEEE Trans. Neural Networks Learn. Syst., 29 (2018), 4007.
17) X. Xu, C. Hua, Y. Tang and X. Guan: Neural Comput. Appl., 27 (2016), 1451.
18) B. Zhou, H. Ye, H. Zhang and M. Li: Control Eng. Pract., 47 (2016), 1.
19) L. Shi, Z.-l. Li, T. Yu and J.-p. Li: J. Iron Steel Res. Int., 18 (2011), 13.
20) X. Zhang, M. Kano and S. Matsuzaki: Comput. Chem. Eng., 130 (2019), 106575.
21) A. Ghosh and S. K. Majumdar: Int. J. Adv. Manuf. Technol., 52 (2011), 989.
22) C. Hua, J. Wu, J. Li and X. Guan: Neural Comput. Appl., 28 (2017), 4111.
23) C. Bilim, C. D. Atiş, H. Tanyildizi and O. Karahan: Adv. Eng. Softw., 40 (2009), 334.
24) F. S. V. Gomes, K. F. Coco and J. L. F. Salles: IEEE Trans. Autom. Sci. Eng., 14 (2017), 1286.
25) T. Mitra, F. Pettersson, H. Saxén and N. Chakraborti: Mater. Manuf. Process., 32 (2017), 1179.
26) L. Jian, C. Gao and Z. Xia: Steel Res. Int., 82 (2011), 169.
27) W. Sun, Z. Wang and Q. Wang: Energy, 199 (2020), 117497.
28) P. Zhou, P. Dai, H. Song and T. Chai: IET Control Theory Appl., 11 (2017), 2343.
29) P. Zhou, H. Song, H. Wang and T. Chai: IEEE Trans. Control Syst. Technol., 25 (2017), 1761.
30) W. H. Chen, M. R. Lin, T. S. Leu and S. W. Du: Int. J. Hydrog. Energy, 36 (2011), 11727.
31) X. Yu and Y. Shen: Metall. Mater. Trans. B, 50 (2019), 2238.
32) P. Xiong, X. Ji, X. Zhao, W. Lv, T. Liu and W. Lu: Chemom. Intell. Lab. Syst., 144 (2015), 11.
33) B. Niu, Q. Su, X. Yuan, W. Lu and J. Ding: Med. Chem., 8 (2012), 1108.
34) A. Chalimourda, B. Schölkopf and A. J. Smola: Neural Networks, 17 (2004), 127.
35) V. Cherkassky and Y. Ma: Neural Networks, 17 (2004), 113.
36) C. J. C. Burges: Data Min. Knowl. Discovery, 2 (1998), 121.
37) X. Zhai, M. Chen and W. Lu: Comput. Mater. Sci., 151 (2018), 41.
38) Q. Zhang, D. Chang, X. Zhai and W. Lu: Chemom. Intell. Lab. Syst., 177 (2018), 26.
39) C. R. Peng, W. C. Lu, B. Niu, Y. J. Li and L. L. Hu: Protein Pep. Lett., 19 (2012), 108.
40) B. Hu, K. Lu, Q. Zhang, X. Ji and W. Lu: Comput. Mater. Sci., 136 (2017), 29.
41) P. de Boves Harrington: TrAC Trends Anal. Chem., 25 (2006), 1112.
42) S. H. Min, J. Lee and I. Han: Expert Syst. Appl., 31 (2006), 652.
43) D. Zhang and D. Shen: Neuroimage, 59 (2012), 895.
