ISIJ International
Online ISSN : 1347-5460
Print ISSN : 0915-1559
ISSN-L : 0915-1559
Regular Article
Adaptive Weighting Just-in-Time-Learning Quality Prediction Model for an Industrial Blast Furnace
Kun Chen, Yi Liu

2017 Volume 57 Issue 1 Pages 107-113

Abstract

Development of accurate soft sensors for online quality prediction (e.g., of the silicon content) in an industrial blast furnace is a difficult task. A novel just-in-time-learning (JITL) prediction approach using adaptive feature weighting for similar samples is developed. First, a dual-objective joint-optimization framework is proposed to introduce both input and output information into the model. Then, a suitable similarity criterion with a feature weighting strategy is formulated, which is not considered in conventional JITL methods. Moreover, the trade-off parameter in the joint-optimization problem can be chosen automatically, without a time-consuming cross-validation procedure. The proposed method is applied to the online prediction of the silicon content in an industrial blast furnace in China. Compared with other JITL-based soft sensors, better prediction performance is obtained.

1. Introduction

The blast furnace ironmaking process is an important unit operation in the manufacturing of iron and steel. It consumes the main energy input of the integrated route of steel production and emits much carbon dioxide, a main contributor to the greenhouse effect. Because it is one of the most energy-intensive and complicated industrial processes, there has been growing interest in modeling and controlling the blast furnace ironmaking process to increase efficiency and reduce cost. The silicon content, which indicates the thermal state in the blast furnace, is the most important index of pig iron quality. It must be kept at an appropriate level to facilitate production and stable running of the ironmaking process. Therefore, accurate online prediction of the silicon content in hot metal is critical.1,2,3,4,5,6) Extensive research on the thermodynamic and kinetic behaviors occurring inside the blast furnace has been carried out. However, an accurate mechanistic model for industrial processes has not yet been constructed.

Nowadays, a large amount of process data containing useful information can be obtained in industrial blast furnace ironmaking processes. To predict the silicon content online, various data-driven soft sensor modeling approaches have been investigated, including various neural networks,7,8,9,10,11,12,13,14) partial least squares,14,15) fuzzy inference systems,16) nonlinear time series analysis,17,18,19,20) subspace identification,21) support vector regression (SVR) and least squares SVR (LSSVR),22,23,24) and others.25,26,27,28,29) A recent overview of black-box models for short-term silicon content prediction in blast furnaces can be found in.30) Without a substantial understanding of the complicated phenomenology, data-driven soft sensor models can be built quickly.30,31,32) Among them, SVR and LSSVR have shown promising prediction performance, especially when the training data are insufficient.22,23,24,33)

One main disadvantage of most existing data-driven modeling approaches for the silicon content is that a single global model is built. However, a single model is not enough to describe all the process characteristics,34,35,36,37,38,39,40) especially in complicated regions with insufficient information. To improve the prediction performance, Nurkkala et al.29) presented multiple autoregressive vector models to describe complex systems. On the other hand, although moving-window-based recursive soft sensors can gradually adapt to new operational conditions, choosing a suitable moving-window size for complex blast furnace ironmaking processes is difficult.10,11,17) Additionally, most recursive models may not function well in a new operational region until a sufficient period of time has passed, because of the time delay as they adapt themselves to the new conditions. Recently, the just-in-time LSSVR (JLSSVR) modeling approach has been applied to industrial ironmaking processes.40) For a query sample, the JLSSVR-based local model is built online using its similar samples. Consequently, a suitable JLSSVR model can better describe process nonlinearity directly. However, JLSSVR still has two disadvantages. First, the data samples used to construct a JLSSVR model are assigned the same feature weight. This is inconsistent with many practical applications, mainly because each input variable has its own impact on the final quality. Second, for JLSSVR and most current just-in-time modeling methods,34,35,36,37,38,40) similarity measurements are computed only on the input variables; the information in the output variables is not utilized.

This work mainly develops an adaptive just-in-time-learning (JITL)-based local model for better prediction of the silicon content. First, a dual-objective optimization framework is proposed to preserve the local structure of both the input and output variables. Then, according to their importance, weights are assigned to the variables in the projection space. Using the new similarity, the relevant samples can be selected and weighted. Moreover, the trade-off parameter in the optimization problem can be chosen in an efficient manner, without time-consuming cross-validation. All these improvements enhance the performance of JITL-based models.

The remainder of this paper is organized as follows. The JLSSVR soft sensor modeling method is described in Section 2. In Section 3, the detailed implementations of the adaptive weighting JLSSVR model are developed. It is evaluated by the silicon content online prediction in an industrial process in Section 4. Comparison studies with other methods are also investigated. Finally, a conclusion is made in Section 5.

2. Local Soft Sensor Model for Online Quality Prediction

2.1. Basic LSSVR Method

Using the kernel learning framework, the SVR/LSSVR soft sensor model learns a mapping $f: X \to Y$ from a modeling set $S = \{X, Y\} = \{x_i, y_i\}_{i=1}^{N}$. A general form of the kernelized nonlinear model for process modeling can be formulated as38)

$$ y_i = f(x_i;\theta) + e_i = f(x_i;w,b) + e_i = w^T\phi(x_i) + b + e_i,\quad i=1,\ldots,N \tag{1} $$
where $y_i$ and $e_i$ denote the output measurement and the process noise at instance $i$, respectively; $x_i$ is usually composed of several measured variables, and $f$ is the wanted model; $\phi$ is a feature map; $\theta = [w^T, b]^T$; that is, $w$ and $b$ are the model parameter vector and the bias term, respectively.38) Applying the LSSVR framework to Eq. (1),33) the following optimization problem is formulated
$$ \min_{w,b}\; J(w,b) = \frac{1}{2}\|w\|^2 + \frac{\gamma}{2}\|e\|^2 \quad \text{s.t.}\quad y_i - w^T\phi(x_i) - b - e_i = 0,\quad i=1,\ldots,N \tag{2} $$
where e = [e1,…,eN]T is the approximation error. The user-defined regularization parameter γ (γ > 0) determines the trade-off between the model’s complexity and approximation accuracy, and a suitable choice of γ can prevent over-fitting.33) After solving the optimization problem in Eq. (2),33) the solution can be expressed as   
$$ \alpha = P\left[y - \mathbf{1}\,\frac{\mathbf{1}^T P y}{\mathbf{1}^T P \mathbf{1}}\right],\qquad b = \frac{\mathbf{1}^T P y}{\mathbf{1}^T P \mathbf{1}} \tag{3} $$
where $\alpha = [\alpha_1,\ldots,\alpha_N]^T$ are Lagrange multipliers; $y = [y_1,\ldots,y_N]^T$; $I \in R^{N\times N}$ is the identity matrix and $\mathbf{1}$ is a vector of ones; $P = (\Omega + I/\gamma)^{-1}$. The kernel matrix $\Omega$ is composed of elements $\Omega(i,j) = \langle\phi(x_i), \phi(x_j)\rangle$, $\forall i,j = 1,\ldots,N$.33) Briefly, the development of an LSSVR-based soft sensor model amounts to solving a set of linear equations using the kernel transform. Compared with solving the convex quadratic programming problem of the SVR method, LSSVR can be implemented more efficiently and easily.33)

Finally, the LSSVR model for prediction of a test sample xq, i.e., y ˆ q , can be obtained33)   

$$ \hat{y}_q = f(\theta, x_q) = \sum_{i=1}^{N}\alpha_i\,\langle\phi(x_i),\phi(x_q)\rangle + b = \alpha^T k_q + b \tag{4} $$
where $\alpha = [\alpha_1,\ldots,\alpha_N]^T$ are the Lagrange multipliers related to the model parameter vector $w$, and $k_q(i) = \langle\phi(x_i), \phi(x_q)\rangle$, $\forall i = 1,\ldots,N$ is a kernel vector.
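The training and prediction steps of Eqs. (2)–(4) can be sketched in NumPy. The Gaussian kernel follows Section 2.2, while the function and variable names below are illustrative, not part of the original work:

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    # K(x_i, x_j) = exp(-||x_i - x_j||^2 / sigma), the kernel used in Section 2.2
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma)

def lssvr_fit(X, y, gamma, sigma):
    # Solve Eq. (3): P = (Omega + I/gamma)^-1, b = 1'Py / 1'P1, alpha = P(y - 1b)
    N = len(y)
    P = np.linalg.inv(gaussian_kernel(X, X, sigma) + np.eye(N) / gamma)
    ones = np.ones(N)
    b = ones @ P @ y / (ones @ P @ ones)
    alpha = P @ (y - ones * b)
    return alpha, b

def lssvr_predict(X, alpha, b, xq, sigma):
    # Eq. (4): y_hat = alpha' k_q + b with k_q(i) = K(x_i, x_q)
    kq = gaussian_kernel(X, xq[None, :], sigma).ravel()
    return alpha @ kq + b
```

As the text notes, fitting reduces to one N × N linear solve, which is what makes the repeated just-in-time reconstruction in Section 2.2 cheap for small nq.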

2.2. JLSSVR-based Local Model

For some industrial processes, the direct application of a global/fixed model with a complicated structure is often difficult. Another limitation of global methods is that they are difficult to update quickly when the process dynamics change.37) Additionally, for some situations with more complex characteristics, the training data samples are not sufficient. Therefore, a single LSSVR/SVR model is still not enough for industrial ironmaking processes.

To alleviate these problems and construct the local models automatically, the JITL method, inspired by the ideas of local modeling, has been developed as an alternative for nonlinear process modeling and control.34,35,36,37,38,39,40) As illustrated in Fig. 1, for a query sample xq, there are three steps to construct a JITL model (take JLSSVR for example). First, select similar samples as a similar set Sq from the database S based on some defined similarity criteria. Second, construct a JLSSVR model fJLSSVR(xq) using the relevant dataset Sq. Third, predict the output ŷq online for the current query sample xq and then discard the JLSSVR model fJLSSVR(xq).40)

Fig. 1.

The common flowchart of JITL-based online soft sensor modeling method.

With the same three steps, a new JLSSVR model can be constructed for the next query sample. The Euclidean distance-based similarity is commonly utilized.34,35,36,37,38,39,40) The similarity index (SI) between the query sample xq and the sample xi in the dataset is defined below35)

$$ S_{x,qi} = \exp(-d_{qi}) = \exp(-\|x_i - x_q\|),\quad i=1,\ldots,N \tag{5} $$
where $d_{qi}$ is the distance between $x_q$ and $x_i$ in the dataset. The value of $S_{x,qi}$ is bounded between 0 and 1, and when $S_{x,qi}$ approaches 1, $x_q$ resembles $x_i$ closely. With a distance-and-angle-based similarity factor, slightly better prediction performance can be obtained; however, another parameter balancing the distance and the angle must be chosen.36) Additionally, some correlation-based similarity criteria have been proposed recently.37) Based on these similarity criteria, a relevant dataset Sq with nq similar samples can be adopted to build a JLSSVR model.
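As a minimal sketch, the Euclidean similarity of Eq. (5) and the selection of the nq most similar samples can be written as follows (function names are illustrative):

```python
import numpy as np

def similarity_index(X, xq):
    # Eq. (5): S_x,qi = exp(-||x_i - x_q||), bounded in (0, 1]
    return np.exp(-np.linalg.norm(X - xq, axis=1))

def select_relevant(X, xq, nq):
    # Indices of the nq samples most similar to the query xq
    S = similarity_index(X, xq)
    return np.argsort(S)[::-1][:nq]
```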

For construction of the JLSSVR model, the user-defined parameters include the regularization parameter γ and the kernel parameter (e.g., the Gaussian kernel $K(x_i, x_j) = \exp(-\|x_i - x_j\|^2/\sigma)$ with width parameter σ > 0). With a pair of parameters [γ, σ], the JLSSVR model can be constructed. Leave-one-out (LOO) cross-validation has been shown to provide a sensible criterion for model selection.41) However, the LOO criterion for LSSVR is computationally expensive and, therefore, not suitable for online use. Fortunately, a fast LOO (FLOO) cross-validation criterion was proposed by Cawley and Talbot.41) Based on the FLOO criterion,40,41) the JLSSVR model can be optimized by minimizing the FLOO-based error over the nq samples.

$$ E_{n_q}^{\mathrm{FLOO}} = \sum_{i=1}^{n_q} e_i^{\mathrm{FLOO}} = \sum_{i=1}^{n_q} \frac{\alpha_i}{P_{ii} + s_i^2/o} \tag{6} $$
where $P_{ii}$ is the item at the $i$th row and $i$th column of $P$ in Eq. (3), $s = P\mathbf{1} = [s_1,\ldots,s_{n_q}]^T$, and $o = -\mathbf{1}^T P \mathbf{1}$. The related terms (i.e., $P$ and $\alpha_i$) are already available, and the computational load of $s$ and $o$ is small. The complexity of FLOO can be reduced to about $O(n_q^3)$ operations, compared with about $O(n_q^4)$ for the basic LOO.41) Consequently, the computation is much more efficient, and online selection of the parameters of JLSSVR becomes feasible.
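A sketch of the per-sample FLOO residuals of Eq. (6), reusing the quantities of Eq. (3). How the residuals are aggregated for model selection (summed directly as in Eq. (6), or as absolute or squared values) is left to the implementation, and the names below are illustrative:

```python
import numpy as np

def floo_residuals(Omega, y, gamma):
    # Fast leave-one-out residuals e_i = alpha_i / (P_ii + s_i^2 / o), Eq. (6),
    # with P = (Omega + I/gamma)^-1, s = P1, o = -1'P1 as in Eqs. (3) and (6)
    n = len(y)
    P = np.linalg.inv(Omega + np.eye(n) / gamma)
    ones = np.ones(n)
    b = ones @ P @ y / (ones @ P @ ones)
    alpha = P @ (y - ones * b)
    s = P @ ones
    o = -ones @ P @ ones
    return alpha / (np.diag(P) + s ** 2 / o)
```

Each residual equals the prediction error that would be obtained by actually removing sample i and refitting, so a grid over [γ, σ] can be scored without nq refits.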

3. Adaptive Weighting Local Soft Sensor Model

Although the criterion in Eq. (5) has been widely adopted in JITL-based methods for process modeling and monitoring, the similarity may still be described inadequately in two respects. First, different input variables are weighted equally, which is inconsistent with many practical situations. Second, similarity measurements are built only on the input variables (i.e., in Eq. (5)); no information in the output quality variables is used.

Recently, a supervised locality preserving projection (LPP) strategy42) was proposed to construct the similarity measurement using both input and output information.39)

$$ SI_{ij} = S_{x,ij}^{\eta}\, S_{y,ij}^{1-\eta} \tag{7} $$
where Sx,ij is the closeness measurement of the input variables between the ith and jth samples; Sy,ij is the closeness measurement of the output variables between the ith and jth samples; and 0 ≤ η ≤ 1 is a parameter to balance the importance of input and output information.

With the similarity defined in Eq. (7), an LPP approach is employed to seek the mapping direction, which serves as the weights for the variables. Then, relevant samples are selected according to the Euclidean distance in the projection space. Unfortunately, there are still two drawbacks in the LPP-based similarity method. First, the parameter η is selected through cross-validation, which is really time-consuming. Second, in the low-dimensional space, the internal variables are assigned equal weights, which may cause the same problem as the traditional criterion in.36)

The proposed relevant sample selection strategy constructs an adaptively weighted distance as the similarity criterion by adequately utilizing the information in both the input and output variables. The LPP algorithm42) is adopted to keep the local structure of both input and output information, and it is solved in a dual-objective optimization form as a general eigen problem. Then, according to the eigenvalues of the projection directions, adaptive weights are assigned to the internal variables to construct the similarity criterion for selecting relevant samples.

3.1. Feature Extraction Using LPP

One challenge in establishing a suitable similarity criterion is to select representative features from the various input variables while preserving the local structure as much as possible. To achieve this goal, the LPP approach is employed. LPP42) is a recent dimensionality reduction method which has been successfully applied in information retrieval and pattern recognition. LPP aims at preserving the neighborhood structure of the data set, whereas principal component analysis (PCA) only retains most of the original variance.42) Additionally, LPP shares many of the properties of nonlinear methods such as locally linear embedding43) and Laplacian eigenmaps.44) Therefore, compared with PCA, LPP can reveal the intrinsic geometrical structure of the observed data and find more meaningful low-dimensional information hidden in the high-dimensional observations. Moreover, the linear property of LPP makes it efficient in computation and suitable for practical applications.42)

Given a set of $m$-dimensional input variables $X = \{x_1,\ldots,x_N\}$ with corresponding output variables $Y = \{y_1,\ldots,y_N\}$, where $x_i \in R^m$, LPP aims to find a transformation matrix $B = [\beta_1,\ldots,\beta_d]$ that projects these samples to a low-dimensional sample set $Z = \{z_1,\ldots,z_N\}$ in $R^d$ ($d \le m$) based on the following objective function:42)

$$ J_{\mathrm{LPP}} = \min \sum_{i,j=1,\ldots,N,\; i\neq j} (z_i - z_j)^2\, SI_{ij} \tag{8} $$
where $SI_{ij}$ indicates the similarity between the $i$th and $j$th samples, and $z_i = \beta^T x_i$ is the one-dimensional representation of $x_i$ under a projection vector $\beta$; stacking the $d$ most important projections gives the $d$-dimensional representation $z_i = B^T x_i$.

To better utilize both the secondary (input) and primary (output) information, a dual-objective optimization scheme is proposed:

$$ J_x = \min \sum_{i,j=1,\ldots,N,\; i\neq j} \|z_i - z_j\|^2\, S_{x,ij},\qquad J_y = \min \sum_{i,j=1,\ldots,N,\; i\neq j} \|z_i - z_j\|^2\, S_{y,ij} \tag{9} $$

It is expected that, after projection, nearby points with both similar input and output variables remain close in the low-dimensional projection space. However, these two objectives are generally difficult to optimize simultaneously, as the projection directions are usually different. To solve this problem, a trade-off parameter η1 is introduced to balance the two objectives. The combined objective function can then be described as follows:

$$ J = \min\; \eta_1 J_x + (1-\eta_1) J_y = \min \sum_{i,j=1,\ldots,N,\; i\neq j} \|z_i - z_j\|^2 \left(\eta_1 S_{x,ij} + (1-\eta_1) S_{y,ij}\right) \tag{10} $$
where 0 ≤ η1 ≤ 1. Details on the selection of parameter η1 will be presented in Section 3.3.

To solve this optimization problem, an orthogonal constraint $\beta^T\beta = 1$ is introduced to avoid the singularity problem. The optimization problem can then be solved through an eigen problem:42)

$$ XLX^T\beta = \lambda\beta \tag{11} $$
where $L = D - S_I$, with $SI_{ij} = \eta_1 S_{x,ij} + (1-\eta_1) S_{y,ij}$ and $D$ a diagonal matrix defined as $D_{ii} = \sum_{j=1}^{N} SI_{ij}$. The items $\beta_1,\ldots,\beta_d$ are the eigenvectors corresponding to the $d$ smallest eigenvalues $\lambda_1,\ldots,\lambda_d$, and the internal variables are obtained as $z_i = B^T x_i$ with $B = [\beta_1,\ldots,\beta_d]$.
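A minimal NumPy sketch of the eigen problem in Eq. (11), assuming the samples are stacked as columns of an m × N matrix and the combined similarity matrix SI has already been formed (the function name is illustrative):

```python
import numpy as np

def lpp_projection(X, SI, d):
    # Eq. (11): solve X L X' beta = lambda beta with L = D - SI,
    # keeping the eigenvectors of the d smallest eigenvalues.
    # X is m x N (columns are samples); SI is the N x N similarity matrix.
    L = np.diag(SI.sum(axis=1)) - SI
    lam, V = np.linalg.eigh(X @ L @ X.T)  # symmetric -> ascending eigenvalues
    return lam[:d], V[:, :d]              # lambda_1..lambda_d and B = [beta_1..beta_d]
```

The projected (internal) variables then follow as z_i = B.T @ x_i.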

3.2. Adaptive Weighting Similarity Criterion

Instead of using the Euclidean distance directly (e.g., Eq. (5)), a weighted distance is adopted here:   

$$ d_{w,ij} = \sum_{k=1}^{d} v_k (z_{i,k} - z_{j,k})^2 \tag{12} $$
where vk is the feature weight parameter to balance the components of zi, and zi,k indicates the kth element in zi.

Inspired by principal component analysis,45) an eigenvalue-based weighting strategy is formulated to calculate the importance of the latent variables:

$$ v_k = \frac{\lambda_k^{-1}}{\sum_{i=1}^{d}\lambda_i^{-1}} \tag{13} $$

With Eq. (13), an element with a larger eigenvalue is assigned a relatively small importance (≈0) and can be ignored, which reduces the projection dimension automatically. Then, according to the weighted distance between the training data and the query sample, the nq samples with the largest weighted similarity measurements are selected as relevant samples.
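Eqs. (12) and (13) amount to a few lines of NumPy. This is a sketch with illustrative names; note that near-zero eigenvalues would dominate the weights and may need guarding in practice:

```python
import numpy as np

def eigen_weights(lam):
    # Eq. (13): v_k = lambda_k^-1 / sum_i lambda_i^-1
    inv = 1.0 / lam
    return inv / inv.sum()

def weighted_distance(zi, zj, v):
    # Eq. (12): d_w = sum_k v_k (z_ik - z_jk)^2
    return np.sum(v * (zi - zj) ** 2)
```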

A simple illustration of sample selection in a two-dimensional projection space is shown in Fig. 2 to distinguish the proposed strategy from the criterion in.39) Suppose β1 and β2 are the projection directions corresponding to λ1 and λ2, respectively. The contour of samples with the same similarity index is then an ellipse with long-axis-to-short-axis ratio $a:b = \lambda_1^{-1}:\lambda_2^{-1}$, rather than a circle as in.39) It is straightforward to observe that the more relevant feature is assigned a larger weight and plays a more important role in determining the similarity measurement, which makes the criterion more accurate.

Fig. 2.

Illustration of sample selection with feature weighting in two dimensional projection space.

3.3. Auto Selection of Parameter η1

For the dual-objective optimization problem in Eq. (9), it is difficult to obtain an optimal solution mainly due to the conflict between the two sub-objectives. To obtain a relatively good solution, the scales and convergence speed of the sub-objectives should be considered carefully to select the parameter η1.46)

The solution of Eq. (10) is obtained by solving the eigen problem rather than in an iterative manner. As a result, the convergence-speed problem can be avoided, and the parameter η1 need only be selected to balance the scale issue. Inspired by,46) the scales of Jx and Jy can be defined as:

$$ S_{\mathrm{input}} = \rho(XL_xX^T),\qquad S_{\mathrm{output}} = \rho(XL_yX^T) \tag{14} $$
where $L_x$ and $L_y$ are defined based on $S_{x,ij}$ and $S_{y,ij}$, respectively, and $\rho(\cdot)$ denotes the largest eigenvalue of a matrix. Thus, η1 can be obtained from:
$$ \eta_1 S_{\mathrm{input}} = (1-\eta_1) S_{\mathrm{output}} \;\Longrightarrow\; \eta_1 = \frac{S_{\mathrm{output}}}{S_{\mathrm{output}} + S_{\mathrm{input}}} \tag{15} $$
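Under the assumption that Lx and Ly are the graph Laplacians of Sx and Sy (as in Eq. (11)), the auto-setting of Eqs. (14) and (15) can be sketched as follows (names illustrative):

```python
import numpy as np

def auto_eta1(X, Sx, Sy):
    # Eqs. (14)-(15): balance the two sub-objective scales via
    # the largest eigenvalue (spectral radius) of X L X'
    def scale(S):
        L = np.diag(S.sum(axis=1)) - S
        return np.linalg.eigvalsh(X @ L @ X.T).max()
    s_in, s_out = scale(Sx), scale(Sy)
    return s_out / (s_out + s_in)
```

When the input and output similarities have the same scale, this yields η1 = 0.5, so the two sub-objectives contribute equally.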

In summary, the steps to select the relevant samples for construction of a JLSSVR model can be described as follows:

Step 1: Establish the matrices Sx, Sy (The elements in Sx and Sy are Sx,ij and Sy,ij, respectively.), Lx and Ly.

Step 2: Set η1 automatically according to Eqs. (14) and (15).

Step 3: Solve the optimization problem in Eq. (10) to obtain the corresponding eigenvalues and eigenvectors.

Step 4: Assign adaptive weights to the latent variables and calculate the weighted distance based on Eqs. (12) and (13).

Step 5: Select nq relevant samples on the basis of the weighted similarity measurement.

Step 6: The JLSSVR soft sensor model can be online constructed using Eqs. (1), (2), (3) and (6).

It should be noted that, by introducing the dual-objective optimization scheme, the local structure of both the input and output information can be better preserved. Additionally, using the LPP method, the new variables are independent of each other, since the correlation among the process variables is removed. Moreover, the adaptive weighted similarity criterion, which keeps the local information, helps estimate the similarity and select the relevant samples. Consequently, with properly selected relevant samples, the local model built in a JITL manner is more accurate for predicting the output variables.

4. Industrial Silicon Content Prediction

In this section, the proposed improved JITL-based soft sensor modeling methods are applied to the online prediction of the silicon content of an industrial blast furnace ironmaking process in China. All the data samples were collected from daily process records and the corresponding laboratory analysis. The process input variables correlated with the product quality (i.e., the silicon content) were selected.21,22,24) These input variables include the blast volume, the blast temperature, the top pressure, the gas permeability, the top temperature, the ore/coke ratio, and the pulverized coal injection. The sampling time of most of these input variables is 1 minute. According to expert experience and correlation analysis,30) the time difference between the silicon content and the input variables can be selected. For example, the time difference between the silicon content and the top pressure is about 2 h, and that between the silicon content and the gas permeability is about 1 h.

After simple preprocessing of the modeling set with the 3-sigma criterion, most obvious outlier samples and missing values were removed. After preprocessing, a set of about 260 samples is investigated. The first 150 samples are treated as the historical samples; the remaining set of about 110 samples is used for testing. The simulation environment is MATLAB V2009b on a machine with a 2.3 GHz CPU and 4 GB of memory.

As discussed in previous research, a single global model is not enough to describe all the process characteristics of industrial ironmaking processes.40) Additionally, the blast furnace dynamics may change, gradually or abruptly, and a fixed soft sensor model validated on earlier data may not perform well on future data.29) Here, JLSSVR is considered as the JITL-based local modeling method. To better illustrate the effect of the proposed method, the adaptive weighted relevant sample selection strategy is applied to acquire the relevant data set for each query sample. Two cases are investigated. The first, in Section 4.1, studies the performance of the eigenvalue-based adaptive weighting approach using only the information in the input variables, i.e., η = 1 in Eq. (7). The second, in Section 4.2, focuses on utilizing the information in both the input and output variables and on how to auto-select the trade-off parameter η1 in Eq. (10).

The root-mean-square error (RMSE) and the relative RMSE (denoted RE) are two common performance indices to quantitatively evaluate the prediction performance of different soft sensor models.40) Additionally, the hit rate (HR) index is often adopted for industrial blast furnace ironmaking processes.21,22,23,24,25,26,27,28) The three indices RMSE, RE, and HR are defined, respectively, as

$$ \mathrm{RMSE} = \sqrt{\sum_{q=1}^{l}\left(y_q - \hat{y}_q\right)^2 / l} \tag{16} $$
  
$$ \mathrm{RE} = \sqrt{\sum_{q=1}^{l}\left(\frac{y_q - \hat{y}_q}{y_q}\right)^2 / l} \tag{17} $$
  
$$ \mathrm{HR} = \frac{\sum_{q=1}^{l} H_q}{l}\times 100\%,\qquad H_q = \begin{cases} 1, & |\hat{y}_q - y_q| < 0.1 \\ 0, & \text{otherwise} \end{cases} \tag{18} $$
where $\hat{y}_q$ and $y_q$ are the predicted and actual values, respectively, and $l$ is the number of test samples.

4.1. Performance of Weighting Strategy

As mentioned above, the comparison of the feature weighting strategies is investigated using only the information in the input variables, i.e., η = 1 in Eq. (7). The following two weighting strategies are considered for the variables in the projected space: (1) equal weighting, where the values of vk are assigned as v1 = ··· = vd in Eq. (12); and (2) eigenvalue-based weighting, with the weights assigned according to Eqs. (12) and (13).

The prediction results of these two weighting strategies are listed in Table 1 for the number of relevant samples nq = 10, 20, 30, and 40. Additionally, the online prediction results of the silicon content with nq = 30 are shown in Fig. 3. To show the results more clearly, only the first 50 testing samples are plotted in Fig. 3. From Fig. 3 and Table 1, it can be seen that, regardless of the value of nq (nq = 10, 20, 30, 40), the eigenvalue-based weighting criterion outperforms the equal weighting strategy, with smaller RMSE and RE indices and a larger HR value. For this case, nq = 30 similar samples is a suitable choice.

Table 1. Comparisons of LPP-based JLSSVR soft sensors with different feature weighting strategies for different numbers of similar samples (best results are bold and underlined).

| Similar samples | Feature weighting strategy | RMSE | RE (%) | HR (%) |
|---|---|---|---|---|
| 10 | Eigenvalue based weighting | 0.099 | 19.1 | 68.2 |
| 10 | Equal weight | 0.108 | 20.3 | 65.4 |
| 20 | Eigenvalue based weighting | 0.097 | 18.2 | 70.1 |
| 20 | Equal weight | 0.107 | 20.1 | 67.3 |
| 30 | Eigenvalue based weighting | 0.095 | 18.1 | 71.0 |
| 30 | Equal weight | 0.105 | 19.9 | 69.2 |
| 40 | Eigenvalue based weighting | 0.101 | 18.9 | 69.2 |
| 40 | Equal weight | 0.109 | 20.6 | 66.4 |
Fig. 3.

Online silicon content prediction error comparison results of JLSSVR soft sensor models with eigenvalue based weighting and equal weighting strategies.

The prediction results indicate that the eigenvalue based weighting criterion can help determine the feature weights and construct the similarity measurement more suitably. This is mainly because the feature weights are assigned according to their relative importance. Consequently, the local structures of initial data samples can be kept and the similarity criterion can be estimated more properly.

4.2. Performance of Utilizing the Output Information

To show the effect of utilizing the output variable information of the training dataset, five LPP-based similarity criteria are used to search for the mapping direction, each followed by the eigenvalue-based weighting strategy to calculate the weighted distance in the projection space. They are: (1) Sx, using only the input variable information of the training samples; (2) Sy, using only the output information of the training samples; (3) SxSy, i.e., the criterion $SI_{ij} = S_{x,ij}^{\eta} S_{y,ij}^{1-\eta}$ defined in Eq. (7), with the trade-off parameter η determined by cross-validation; (4) the proposed criterion J = min η1Jx + (1−η1)Jy in Eq. (10) with η1 selected by cross-validation, denoted PCcv; and (5) the proposed criterion in Eq. (10) with automatic setting of η1, denoted PCauto for simplicity.

The comparison results of the aforementioned criteria with nq = 30 are tabulated in Table 2. From Table 2, one can see that better prediction performance is obtained by utilizing the information in both the input and output variables, compared with using only Sx or Sy. The output variables contain important information and should be adopted; however, this was not explored in most traditional JITL-based soft sensors for industrial processes. The prediction results of traditional JLSSVR38,40) without the LPP-based similarity for searching the similar set are also listed in Table 2; it is inferior to the other methods with LPP-based similarity strategies. The adaptive weighting JLSSVR (i.e., PCauto) and traditional JLSSVR methods are compared using the parity plot shown in Fig. 4, which indicates that the adaptive weighting JLSSVR method is more accurate than JLSSVR (the HR index increases from 66.4% to 73.8%).

Table 2. Comparisons of the online prediction errors using six different JLSSVR soft sensor models (best results are bold and underlined).

| JLSSVR-based model | Brief description | RMSE | RE (%) | HR (%) |
|---|---|---|---|---|
| Sx | LPP-based input variable information | 0.095 | 18.2 | 71.0 |
| Sy | LPP-based output variable information | 0.097 | 18.3 | 70.1 |
| SxSy | LPP-based input and output variable information | 0.093 | 18.0 | 72.0 |
| PCcv | LPP-based feature weighting for input and output variable information (η1 selected by cross-validation) | 0.087 | 17.3 | 73.8 |
| PCauto | LPP-based feature weighting for input and output variable information (auto setting of η1) | 0.086 | 17.4 | 73.8 |
| JLSSVR40) | Traditional similarity criterion without LPP | 0.107 | 20.5 | 66.4 |
Fig. 4.

Parity plot of the assay values against the predicted values of the silicon content in the test set using the adaptive weighting JLSSVR and traditional JLSSVR soft sensor models.

Moreover, the cross-validation-based criterion (PCcv) also makes the model predict well. However, for a query sample with nq = 30, the total computational time for online modeling and prediction is about 11 s, which is much longer than with the PCauto method (less than 1 s). The PCcv method is time-consuming mainly because all the candidate values of η1 must be evaluated by cross-validation to select the proper one. Meanwhile, for the proposed auto parameter setting strategy defined in Eqs. (14) and (15) (PCauto), the computation is more efficient because only a generalized eigen problem is solved. Figure 5 presents the predictive performance in terms of the RE index for the possible trade-off parameters η1. It shows that the auto-set parameter is a good and suitable choice in practice, although it is suboptimal; this is mainly because the optimal value of η1 shown in Fig. 5 is not known beforehand. In short, the proposed similarity PCauto achieves almost the same performance as the traditional cross-validation method while saving a large amount of computational time.

Fig. 5.

The RE index for possible values of the trade-off parameter η1 and the auto-setting result.

Therefore, from all the obtained results and the comparison analysis, the proposed PCauto approach, which utilizes both the input and output information with automatic trade-off parameter determination and the eigenvalue-based weighting strategy, achieves promising prediction performance. Moreover, compared with the traditional cross-validation approach, it can be implemented in an efficient manner.

5. Conclusion

This paper has proposed a novel JITL-based local soft sensor model with adaptive relevant sample selection for better prediction of the silicon content. The main contributions are threefold: (1) the eigenvalue-based adaptive weighting strategy; (2) the dual-objective optimization framework for employing the output information; and (3) the auto-selection of the trade-off parameter η1 in an efficient manner. The superiority of the proposed method is demonstrated by comparison with several JLSSVR soft sensors in terms of online prediction of the silicon content in an industrial blast furnace. Note that other JITL-based modeling methods can also be integrated with the proposed relevant sample selection strategy. Additionally, advanced outlier detection methods can be applied as a preprocessing step to enhance the reliability of the quality prediction. Therefore, several interesting research directions are worth investigating to further enhance the accuracy and transparency of a silicon content prediction model.

Acknowledgment

The authors would like to gratefully acknowledge the National Natural Science Foundation of China (Grant No. 61004136) and Jiangsu Key Laboratory of Process Enhancement & New Energy Equipment Technology (Nanjing University of Technology) for their financial support.

Abbreviations

FLOO: fast leave-one-out

HR: hit rate

JITL: just-in-time learning

JLSSVR: just-in-time least squares support vector regression

LPP: locality preserving projection

LSSVR: least squares support vector regression

RE: relative root-mean-square error

RMSE: root-mean-square error

SI: similarity index

SVR: support vector regression

References
 
© 2017 by The Iron and Steel Institute of Japan