ISIJ International
Online ISSN : 1347-5460
Print ISSN : 0915-1559
ISSN-L : 0915-1559
Regular Article
PCA-LMNN-Based Fault Diagnosis Method for Ironmaking Processes with Insufficient Faulty Data
Tongshuai Zhang, Hao Ye, Haifeng Zhang, Mingliang Li
JOURNAL OPEN ACCESS FULL-TEXT HTML

2016 Volume 56 Issue 10 Pages 1779-1788

Abstract

Fault detection and fault classification are important in the modern ironmaking process. Some studies based on principal component analysis (PCA) techniques have been performed for fault detection in the ironmaking process. However, studies on fault classification in the ironmaking process remain limited. In this paper, problems that are related to the classification of abnormalities in blast furnaces are considered. We fuse historical abnormal data that were collected from three real blast furnaces to address the problem of insufficient historical faulty data. To extract common features for the same type of abnormalities, which are not affected by different operating points or different blast furnaces, we propose the use of a contribution vector as a fault feature, which is calculated by the PCA-based technique. The large margin nearest neighbor (LMNN) technique is employed to train a classifier with contribution vectors as inputs. Twenty-one historical abnormalities in three different real blast furnaces are employed to validate the proposed method. The results indicate that this method achieves the desired performance.

1. Introduction

Ironmaking processes are an important part of the modern iron and steel industry.1) Blast furnaces, which are the main reactors, are the core of the entire ironmaking process.2) In a typical ironmaking process, iron-bearing materials (such as iron ore, sinter, and pellets), coke and flux are charged into the top of a blast furnace, whereas hot dry air, enriching oxygen, fuels (such as tar or pulverized coal) and moisture are blasted into the bottom of the blast furnace. The output, which is composed of liquid molten iron and slag, flows from the bottom of the blast furnace, whereas coal gas and dust are collected at the top of the furnace. With one downward stream and one upward stream, complex reactions occur in the blast furnace.3)

A blast furnace should operate at normal status and in a steady state. Because the physical and chemical reactions in a blast furnace are complex, various abnormalities, such as hanging, slipping, channeling, and cold furnace condition, frequently occur in a blast furnace,4) which may cause severe accidents if they are not detected and handled in a timely manner. Thus, effective fault diagnosis methods for the ironmaking process, including fault detection, i.e., detection of an abnormality, and fault classification, i.e., identification of the cause or type of abnormality after fault detection, should be developed.

Expert systems have been successfully applied to fault detection and fault classification in ironmaking processes.5,6,7,8) However, the requirements of comprehensive rules and sufficient historical information about abnormalities, which are not always available, limit the application of these systems in many factories.

To address the limitations of expert system-based methods, data-driven methods for either fault detection or fault classification have been developed in recent years.

Data-driven methods for fault detection in the ironmaking process are primarily based on principal component analysis (PCA). Although PCA has been extensively and successfully applied in the chemical industry,9,10) its application to the ironmaking process remains limited. A study by Gamero,11) which addressed qualitative trend analysis, and a study by Vanhatalo,12) which considered an experimental blast furnace, are representative of the limited PCA-based research on fault detection in the ironmaking process. Motivated by their studies and to address the problem of unknown switching disturbances in hot blast stoves, a two-stage PCA-based method was proposed for the incipient detection of abnormalities in a real blast furnace of the Liuzhou Iron & Steel Co., Ltd.13) The method was tested using more real historical data in literature,14) which were collected from three blast furnaces in the Liuzhou Iron & Steel Co., Ltd. Among a total of 24 abnormalities, 20 were detected before the operators identified them. However, these PCA-based methods cannot be utilized to perform fault classification after fault detection. Based on our previous work,13,14) this paper focuses on fault classification, i.e., the classification of newly occurring abnormalities after they have been detected by the two-stage PCA-based fault detection method proposed in literature13) for the ironmaking processes at the Liuzhou Iron & Steel Co., Ltd.

Numerous data-driven methods for fault classification exist. These methods consist of two phases: offline training and online classification.15) In offline training, a classifier is trained by using a proper optimization method with historical faulty data, whose fault types are known a priori. Many methods include a step that is referred to as fault feature extraction prior to training, which transforms the original faulty data into a proper form to manifest the fault features;16) e.g., Li et al.17) employed the Fourier transform of the original faulty signal as a fault feature. In online classification, new data (or the corresponding fault feature) with an unknown fault type are fed to the trained classifier, and the output of the classifier is calculated and employed as the classification result.18) Some data-driven fault classification methods for ironmaking processes, which are primarily based on support vector machines (SVMs) and their variants,4,19,20,21) and neural networks,22,23) employ original data instead of extracted fault features to train a classifier. Compared with expert system-based methods, the merit of these methods is that they do not require comprehensive rules. However, they still face a difficulty that is common to all data-driven fault classification methods,24) i.e., the requirement of rich historical faulty data that cover all types of faults to train the classifier, which is frequently impossible to satisfy in real applications. As shown in Table 3 of Sec. 3, we employ historical data from 2012 to 2014 for 21 confirmed abnormalities for the three blast furnaces at the Liuzhou Iron & Steel Co., Ltd. The lack of historical faulty data, which renders existing data-driven fault classification methods inapplicable, is revealed in the following aspects:

➢ Historical data of some abnormality types may be lacking for a blast furnace. For example, historical data of the cold furnace condition were unavailable for both Blast Furnace No. 2 and Blast Furnace No. 5 at the Liuzhou Iron & Steel Co., Ltd.

➢ Blast furnaces may run at different operating points due to changes in production requirements, the quality of raw materials, or the weather. Even for the same types of abnormalities, they may occur at different operating points and cannot be covered by historical faulty data.

➢ Faulty data are very limited. For Blast Furnace No. 5 at the Liuzhou Iron & Steel Co., Ltd., only historical data for three abnormalities were collected.

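Table 3. Summary of the detection results of the abnormalities.

| Abnormality No. | Blast Furnace | Abnormality Type | Alarm Statistics | Lead Time | Sample Size |
|---|---|---|---|---|---|
| 1 | No. 2 | Hanging | T2 | 20.7 hours | 3795 |
| 2 | No. 2 | Hanging | T2 | 3.1 hours | 944 |
| 3 | No. 2 | Channeling | SPE | 3 hours | 952 |
| 4 | No. 2 | Hanging | T2 & SPE | 1.6 hours | 296 |
| 5 | No. 2 | Hanging | T2 & SPE | 0.5 hours | 302 |
| 6 | No. 2 | Hanging | T2 & SPE | 3.6 hours | 564 |
| 7 | No. 2 | Hanging | T2 | 1.1 hours | 566 |
| 8 | No. 2 | Hanging | T2 | 3.0 hours | 841 |
| 9 | No. 2 | Hanging | T2 | 7.5 hours | 951 |
| 10 | No. 2 | Hanging | T2 | 3.4 hours | 700 |
| 11 | No. 3 | Hanging | T2 & SPE | 9.3 hours | 2790 |
| 12 | No. 3 | Slipping | SPE | 1.6 hours | 644 |
| 13 | No. 3 | Hanging | T2 | 19.6 hours | 1413 |
| 14 | No. 3 | Hanging | T2 | 1.4 hours | 763 |
| 15 | No. 3 | Hanging | T2 | 1.3 hours | 342 |
| 16 | No. 3 | Channeling | SPE | 4.9 hours | 1019 |
| 17 | No. 3 | Cold furnace condition | T2 & SPE | 1.3 hours | 491 |
| 18 | No. 3 | Hanging | T2 & SPE | 1.5 hours | 474 |
| 19 | No. 5 | Hanging | T2 & SPE | 6.5 hours | 1611 |
| 20 | No. 5 | Channeling | T2 & SPE | 0.6 hours | 326 |
| 21 | No. 5 | Hanging | T2 & SPE | 3.3 hours | 990 |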

Motivated by a study on predictive maintenance,25) which fused the historical data of different aircraft engines all over the world to address the difficulty of insufficient historical faulty data for each engine, we fuse the historical data of three blast furnaces at the Liuzhou Iron & Steel Co., Ltd. to offer a more complete historical dataset for fault classification, which may cover more abnormality types and operating points than the historical data of each single blast furnace. An important reason why we can do this lies in the consideration that the majority of the monitoring variables in the three blast furnaces are common, and the reaction mechanisms of the ironmaking processes are similar.

However, the manifestation of the same type of abnormalities in different blast furnaces or at different operating points may differ in most cases. To realize the idea of data fusion in feature extraction, which is a step of fault classification, the extraction of common features for the same type of abnormalities that are robust to different operating points and different blast furnaces becomes a key problem to be solved. Note that none of the existing data-driven fault classification methods for ironmaking processes4,19,20,21,22,23) addresses the problem because they directly employ the original process data as fault features.

To address the problem, we propose the use of a contribution vector as a fault feature in this paper. Contribution plots are frequently adopted by PCA-based methods to preliminarily analyze the possible cause of a detected fault, i.e., the process variable with the largest contribution value to the abnormal T2 or SPE statistic will be treated as the cause of the fault.10) As noted in related literature,26) contribution plots do not properly indicate the root cause of a fault. In contrast to existing PCA-based methods, which directly employ contribution values to perform cause analysis of faults, we employ a contribution vector that is composed of the contribution values of all process variables as the fault feature, which will be classified by a classifier. The contribution values reflect the influence of each process variable on the variation of a system from its normal state10) and an abnormality of a blast furnace is a system variation, while different operating points and different blast furnaces can be regarded as various normal states of a system. Thus, contribution values may be helpful for eliminating the influence of the differences due to different operating points and different blast furnaces. To the best of our knowledge, the use of contribution values as fault features has not been observed in existing fault classification methods.

Many state-of-the-art classification algorithms, such as SVMs and the large margin nearest neighbor (LMNN), can be adopted to realize classification after the fault features have been extracted. In this paper, the LMNN, which is a newly developed classification algorithm that is based on metric learning,27) is adopted as the classifier. Other classification algorithms can also be adopted.

The remainder of the paper is organized as follows: Sec. 2 introduces the basic idea of PCA10) and a two-stage PCA-based fault detection method for the ironmaking process proposed in our previous work,13) which will be employed to extract the contribution vectors in this paper, as well as the LMNN-based classification method,27) which will be applied to train the classifier in this paper. Sec. 3 describes the dataset collected from the Liuzhou Iron & Steel Co., Ltd. Sec. 4 presents the complete PCA-LMNN-based method that is proposed in this paper for the classification of abnormalities of blast furnaces at the Liuzhou Iron & Steel Co., Ltd. and the experimental results that are based on real historical data. Sec. 5 concludes with a discussion of the results and future studies.

2. Preliminaries

2.1. A Brief Introduction to the PCA-Based Fault Detection Method

Let X denote a data matrix for L observations of q variables; X can be decomposed as,28)   

$$X = T P^{T} = \sum_{i=1}^{q} t_i p_i^{T} = \sum_{i=1}^{s} t_i p_i^{T} + E \tag{1}$$
where $t_i$, $p_i$ ($i=1,\cdots,q$), and $E$ denote the score vector, the loading vector and the residual matrix, respectively, and $s$ denotes the number of the main score vectors that represent the majority of information about $X$. Please refer to literature28) for details.

Hotelling’s T2 statistic and SPE statistic have been extensively employed by PCA-based monitoring methods, which are defined as10)   

$$T^{2}(k) = x(k)^{T} P_s \Sigma_s^{-2} P_s^{T} x(k) \tag{2}$$
and
$$SPE(k) = r(k)^{T} r(k), \quad \text{with} \quad r(k) = x(k) - \hat{x}(k), \quad \hat{x}(k) = P_s P_s^{T} x(k) \tag{3}$$
where $x(k)$ denotes a newly arriving observation at sampling instant $k$, $P_s$ is a matrix that is composed of the first $s$ columns of $P$ (i.e., $p_i$ for $1 \le i \le s$), and $\Sigma_s$ is a diagonal matrix that is composed of the $s$ largest singular values of $X$. For a given confidence level α, the corresponding thresholds for the T2 statistic and SPE statistic can be calculated according to literature.28) For the newly arriving observation $x(k)$, if its corresponding $T^{2}(k)$ or $SPE(k)$ exceeds the threshold, a fault alarm is given.
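The two statistics in (2) and (3) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions: the data are standardized with the training mean and standard deviation, $\Sigma_s$ is taken from the singular values of the standardized data divided by $\sqrt{L-1}$ (a common scaling convention, not confirmed by the text), the function names `fit_pca` and `t2_spe` are hypothetical, and the thresholds and the two-stage procedure of literature13) are omitted.

```python
import numpy as np

def fit_pca(X, s):
    """Fit a PCA monitoring model on normal data X (L x q), keeping s components."""
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    # Assumption: Sigma_s holds the singular values of Z / sqrt(L - 1)
    _, sv, Vt = np.linalg.svd(Z / np.sqrt(len(Z) - 1), full_matrices=False)
    return {"mean": mean, "std": std, "P_s": Vt[:s].T, "sigma_s": sv[:s]}

def t2_spe(model, x):
    """Return (T^2, SPE) of one observation x, following (2) and (3)."""
    z = (x - model["mean"]) / model["std"]
    scores = model["P_s"].T @ z                           # P_s^T x(k)
    t2 = float(np.sum((scores / model["sigma_s"]) ** 2))  # x^T P_s Sigma_s^-2 P_s^T x
    residual = z - model["P_s"] @ scores                  # r(k) = x(k) - P_s P_s^T x(k)
    return t2, float(residual @ residual)
```

An alarm would then be raised when either returned value exceeds its threshold for the chosen confidence level.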

In addition, after a fault alarm has been given, the contribution of each individual process variable xi(k)(i=1, ···, q) to the abnormal T2 statistic or SPE statistic at the kth sampling instant can be calculated as26)   

$$\begin{aligned} cont_{T^{2},i}(k) &= \left( \xi_i^{T} P_s \Sigma_s^{-1} P_s^{T} x(k) \right)^{2} \\ cont_{SPE,i}(k) &= \left( \xi_i^{T} r(k) \right)^{2} = r_i(k)^{2} \end{aligned} \qquad i=1,\cdots,q \tag{4}$$
where $\xi_i=[0\ 0 \cdots 1 \cdots 0]^{T}$ denotes the ith column of an identity matrix. Because $T^{2}(k)=\sum_{i=1}^{q} cont_{T^{2},i}(k)$ and $SPE(k)=\sum_{i=1}^{q} cont_{SPE,i}(k)$, the percentages of $cont_{T^{2},i}(k)$ and $cont_{SPE,i}(k)$ in the corresponding statistic (i.e., T2 and SPE) reflect the influence of the ith variable on the variations in the principal space and the residual space, respectively. In traditional PCA-based methods, the variable with the largest contribution to abnormal statistics is usually treated as the cause of the abnormality,10) which is a preliminary cause analysis instead of a root cause analysis, as mentioned in literature.26)
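A short numerical check may make (4) more concrete. The sketch below computes both contribution vectors for one observation and compares their sums with the statistics in (2) and (3). The loading matrix, singular values and observation are made-up toy values, and the function name `contributions` is hypothetical.

```python
import numpy as np

def contributions(x, P_s, sigma_s):
    """Per-variable contributions to T^2 and SPE following (4)."""
    scores = P_s.T @ x
    cont_t2 = (P_s @ (scores / sigma_s)) ** 2   # (xi_i^T P_s Sigma_s^-1 P_s^T x)^2
    residual = x - P_s @ scores                  # r(k) = x(k) - P_s P_s^T x(k)
    cont_spe = residual ** 2                     # r_i(k)^2
    return cont_t2, cont_spe

# Toy example: q = 4 variables, s = 2 components, orthonormal loadings from a QR step
rng = np.random.default_rng(1)
P_s = np.linalg.qr(rng.normal(size=(4, 2)))[0]
sigma_s = np.array([2.0, 0.5])
x = rng.normal(size=4)

cont_t2, cont_spe = contributions(x, P_s, sigma_s)
t2 = x @ P_s @ np.diag(sigma_s ** -2.0) @ P_s.T @ x   # statistic (2)
spe = np.sum((x - P_s @ (P_s.T @ x)) ** 2)            # statistic (3)
```

Because the columns of `P_s` are orthonormal, `cont_t2.sum()` agrees with `t2` and `cont_spe.sum()` agrees with `spe` up to rounding, which is the decomposition stated after (4).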

Unknown switching between blast stoves at the Liuzhou Iron & Steel Co., Ltd. introduces severe disturbances to both the modelling procedure and the monitoring procedure if a standard PCA-based method is applied to fault detection in the ironmaking process. Thus, a two-stage PCA-based method was proposed in literature13) to address this issue.

In literature,13) the first-stage PCA is designed to achieve effective identification, location and removal of the switching disturbances via the comparison of the T2 statistic with a threshold in the hypothesis testing framework. All identified disturbances are removed to construct a new training set without disturbances. The second-stage PCA is equivalent to the standard PCA-based monitoring method, with the exception that a procedure to distinguish the switching disturbances and anomalies is added. Interested readers may refer to literature13) for a detailed algorithm.

2.2. A Brief Introduction to LMNN-Based Classification

2.2.1. Offline Training Model of LMNN

LMNN, which is proposed by Weinberger et al.,27) is an extensively applied metric learning method. The basic model is introduced as follows.

Let $\{(x_i, y_i)\}_{i=1}^{n}$ denote a training set of $n$ labeled samples with the variable vectors $x_i \in R^{q}$ and the class labels $y_i \in \{1,2,\ldots,m\}$. LMNN defines the squared distance between $x_i$ and $x_j$ as27)

$$D_M(x_i, x_j) = (x_i - x_j)^{T} M (x_i - x_j), \tag{5}$$
where the Mahalanobis metric $M$ is a positive semidefinite matrix to be determined. The following two terms, which were introduced by Weinberger et al.,27) illustrate how to learn the matrix $M$ in LMNN.

• Target neighbors: For each sample $x_i$ with label $y_i$, its target neighbors are the $c$ nearest neighbors with the same label $y_i$, determined by Euclidean distance. The number of target neighbors $c$ should be predetermined.

• Impostors: For a sample $x_i$ with label $y_i$ and its farthest target neighbor $x_j$, an impostor is any sample $x_k$ with label $y_k \neq y_i$ such that

$$D_M(x_i, x_k) \le D_M(x_i, x_j) + C \tag{6}$$
where $C>0$, which is usually 1.
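The two definitions above can be illustrated with a small sketch. It selects target neighbors by Euclidean distance and then lists the impostors of one sample according to (6). The function names and the toy data in the usage note are hypothetical, and each class is assumed to contain at least c + 1 samples.

```python
import numpy as np

def target_neighbors(X, y, c):
    """For each sample, indices of its c nearest same-label samples (Euclidean)."""
    n = len(X)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    targets = []
    for i in range(n):
        same = np.where((y == y[i]) & (np.arange(n) != i))[0]
        targets.append(same[np.argsort(d[i, same])[:c]])
    return targets

def impostors(X, y, M, targets, i, C=1.0):
    """Differently labeled samples satisfying (6) for sample i under metric M."""
    def sq_dist(a, b):
        diff = a - b
        return float(diff @ M @ diff)          # squared distance (5)
    farthest = max(sq_dist(X[i], X[j]) for j in targets[i])
    return [k for k in range(len(X))
            if y[k] != y[i] and sq_dist(X[i], X[k]) <= farthest + C]
```

For example, with `X = np.array([[0, 0], [1, 0], [0.5, 0], [5, 5]], dtype=float)`, `y = np.array([0, 0, 1, 1])`, `c = 1` and `M = np.eye(2)`, the only impostor of sample 0 is sample 2, which lies closer to it than its target neighbor.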

The objective of LMNN is to learn a matrix M that minimizes the distance between each training sample with its “target neighbors” while maximizing the distance between each training sample with its “impostors”, which can be realized by solving the following semidefinite program (SDP)27)   

$$\begin{aligned} \min_{M} J = \min_{M}\ & (1-\mu) \sum_{(x_i, x_j) \in S} D_M(x_i, x_j) + \mu \sum_{i,j,k} \xi_{ijk} \\ \text{s.t.}\quad & (1)\ D_M(x_i, x_k) - D_M(x_i, x_j) \ge 1 - \xi_{ijk}, \quad (x_i, x_j, x_k) \in R \\ & (2)\ \xi_{ijk} \ge 0 \\ & (3)\ M \succeq 0 \end{aligned} \tag{7}$$
where $J$ is the loss function, $\mu \in [0,1]$ is the weighting parameter that controls the “pull/push” trade-off and $\{\xi_{ijk}\}$ are slack variables. $S$ and $R$ are defined as follows:
$$S = \{(x_i, x_j) : y_i = y_j \text{ and } x_j \text{ belongs to the } c\text{-nearest neighborhood of } x_i\}$$

$$R = \{(x_i, x_j, x_k) : (x_i, x_j) \in S,\ y_i \neq y_k\}.$$
Therefore, a classifier can be trained offline by solving (7) using collected historical data with known labels as the training set, which corresponds to the offline phase of the scheme of fault classification, as mentioned in Sec. 1.
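To make the objective in (7) concrete, the sketch below only evaluates the loss for a given matrix M; it does not solve the SDP. It relies on the standard observation that, at the optimum, each slack variable equals the hinge term $[1 + D_M(x_i,x_j) - D_M(x_i,x_k)]_+$. The function names are hypothetical, and the set S is passed in as a list of index pairs.

```python
import numpy as np

def sq_dist(M, a, b):
    """Squared Mahalanobis distance (5)."""
    d = a - b
    return float(d @ M @ d)

def lmnn_loss(X, y, M, S, mu=0.5):
    """Loss J of (7) with every slack at its optimal (hinge) value."""
    pull = sum(sq_dist(M, X[i], X[j]) for i, j in S)
    push = 0.0
    for i, j in S:                       # (x_i, x_j) in S
        for k in range(len(X)):          # (x_i, x_j, x_k) in R requires y_k != y_i
            if y[k] != y[i]:
                push += max(0.0, 1.0 + sq_dist(M, X[i], X[j]) - sq_dist(M, X[i], X[k]))
    return (1.0 - mu) * pull + mu * push
```

A solver would search over positive semidefinite matrices M for the one that minimizes this value; comparing `lmnn_loss` for two candidate matrices shows which one separates the classes with a larger margin on the training set.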

2.2.2. Energy-Based Online Classification

In this paper, energy-based classification29) is adopted. In contrast with distance-based classification,27) which employs the optimal matrix $M$ that minimizes the loss function $J$ in (7) as a metric, the energy-based classification directly employs the loss function $J$ as a classifier.29)

As mentioned in Sec. 1, the classification procedure is implemented online, i.e., for a newly arriving sample $x_t$ with an unknown label, energy-based classification considers it as an extra training sample and computes the loss function $J$ for each possible label $y_t$ with the optimal matrix $M$ that is obtained by solving the SDP (7). Then, the unlabeled sample is assigned the label $y_{optimal}$ that minimizes the total loss function,29) i.e.,

$$y_{optimal} = \arg\min_{y_t} \Big\{ (1-\mu) \sum_{(x_t, x_j) \in S} D_M(x_t, x_j) + \mu \Big\{ \sum_{(x_i, x_j, x_t) \in R} \big[ 1 + D_M(x_i, x_j) - D_M(x_i, x_t) \big]_{+} + \sum_{(x_t, x_j, x_k) \in R} \big[ 1 + D_M(x_t, x_j) - D_M(x_t, x_k) \big]_{+} \Big\} \Big\} \tag{8}$$
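A minimal sketch of the decision rule (8) follows. For each candidate label it adds the pull term for the new sample, the hinge terms in which the new sample acts as an impostor for existing pairs, and the hinge terms in which training samples act as impostors for the new sample. Several choices here are assumptions rather than details confirmed by the text: the target neighbors of the new sample are picked by Euclidean distance among training samples of the candidate label, `S` is a list of index pairs from training, and the function names are hypothetical.

```python
import numpy as np

def sq_dist(M, a, b):
    d = a - b
    return float(d @ M @ d)

def energy_classify(x_t, X, y, M, S, c=1, mu=0.5):
    """Return the label minimizing the energy in (8) and the energy per label."""
    energies = {}
    for label in np.unique(y):
        same = np.where(y == label)[0]
        # Target neighbors of x_t if it carried this label (Euclidean; an assumption)
        tn = same[np.argsort(np.linalg.norm(X[same] - x_t, axis=1))[:c]]
        pull = sum(sq_dist(M, x_t, X[j]) for j in tn)
        # x_t acting as an impostor for differently labeled training pairs
        as_impostor = sum(
            max(0.0, 1.0 + sq_dist(M, X[i], X[j]) - sq_dist(M, X[i], x_t))
            for i, j in S if y[i] != label)
        # Differently labeled training samples acting as impostors for x_t
        has_impostor = sum(
            max(0.0, 1.0 + sq_dist(M, x_t, X[j]) - sq_dist(M, x_t, X[k]))
            for j in tn for k in np.where(y != label)[0])
        energies[int(label)] = (1 - mu) * pull + mu * (as_impostor + has_impostor)
    return min(energies, key=energies.get), energies
```

In the proposed method the inputs `X` would be the contribution vectors of Sec. 4.2 and `M` the matrix learned offline.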

3. Description of the Monitored Blast Furnaces and the Historical Data

In this paper, only the problem of fault classification of the three blast furnaces at the Liuzhou Iron & Steel Co., Ltd. is considered because a two-stage PCA-based method has been provided in literature13,14) for their fault detection. The volumes of the blast furnaces are 2650 m³, 2000 m³ and 1500 m³.

Table 1 shows the 35 main process variables of the three blast furnaces that are monitored. Although most variables are common to the three blast furnaces, three variables are available only for Blast Furnaces No. 3 and No. 5, and five variables are available only for Blast Furnace No. 2. Please refer to the footnotes of Table 1 for details.

Table 1. Variable list of the dataset.

| No. | Variable | No. | Variable |
|---|---|---|---|
| 1 | Oxygen enrichment rate (%) | 19 | Cold blast pressure (2)² (MPa) |
| 2 | Blast furnace permeability index | 20 | Total pressure drop (kPa) |
| 3 | CO volume¹ (%) | 21 | Hot blast pressure (1) (MPa) |
| 4 | H2 volume¹ (%) | 22 | Hot blast pressure (2)² (MPa) |
| 5 | CO2 volume¹ (%) | 23 | Actual blast velocity (m/s) |
| 6 | Blast velocity at tuyere of blast furnace (m/s) | 24 | Cold blast temperature² (°C) |
| 7 | Enriching oxygen flow (m³/h) | 25 | Hot blast temperature (°C) |
| 8 | Cold blast flow (10⁴ m³/h) | 26 | Top temperature (1) (°C) |
| 9 | Blast momentum (KJ) | 27 | Top temperature (2) (°C) |
| 10 | Blast furnace bosh gas volume (m³) | 28 | Top temperature (3) (°C) |
| 11 | BF bosh gas index | 29 | Top temperature (4) (°C) |
| 12 | Theoretical combustion temperature (°C) | 30 | Downcomer temperature² (°C) |
| 13 | Blast furnace top gas pressure (1) (kPa) | 31 | Drag coefficient |
| 14 | Blast furnace top gas pressure (2) (kPa) | 32 | Blast humidity |
| 15 | Blast furnace top gas pressure (3) (kPa) | 33 | Coal injection set value (t/h) |
| 16 | Blast furnace top gas pressure (4)² (kPa) | 34 | Actual coal injection rate (t/h) |
| 17 | Enriching oxygen pressure (MPa) | 35 | Actual coal injection in last hour (t) |
| 18 | Cold blast pressure (1) (MPa) | | |

¹ Variables only available for Blast Furnaces No. 3 and No. 5.

² Variables only available for Blast Furnace No. 2.

A total of 21 abnormalities are confirmed and recorded for Blast Furnaces No. 2, 3 and 5 from 2012 to 2014, which cover four abnormality types. Table 2 lists the number of times each type of abnormality occurs in each blast furnace. Their corresponding data have been employed to validate the performance of the two-stage PCA-based fault detection method in literature13,14) and will be utilized to validate the performance of the fault classification proposed in this paper.

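Table 2. Summation of the abnormalities.

| Abnormalities | Blast Furnace No. 2 | Blast Furnace No. 3 | Blast Furnace No. 5 |
|---|---|---|---|
| Hanging | 9 | 5 | 2 |
| Channeling | 1 | 1 | 0 |
| Slipping | 0 | 1 | 0 |
| Cold furnace condition | 0 | 1 | 0 |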

In our previous work,14) all 21 abnormalities were successfully detected by the T2 or SPE statistic of the two-stage PCA. The results are listed in Table 3. For each abnormality, the column Alarm Statistics lists which of the two statistics (i.e., T2 or SPE) exceeded its threshold and raised an alarm. The column Lead Time indicates how much earlier the two-stage PCA-based method generated an alarm than the operators.14)

In this paper, the data that correspond to each abnormality of Table 3 will be employed as the abnormal data to train or test the classification method. The starting time is set as the alarm time of the two-stage PCA-based method,14) and the ending time is set earlier than the time during which operators confirmed the abnormalities. In Table 3, the sample size for each abnormality indicates how many abnormal samples are included in the abnormal data, which is determined by its starting time and ending time.

If the corresponding abnormal samples for each abnormality are employed for training, the known abnormality type will be involved in the training. If the corresponding abnormal samples for each abnormality are employed for testing, then the abnormality type is assumed to be unknown to the algorithm and will be employed to evaluate whether the classification results of the proposed method are correct.

4. Fault Classification Based on the Contribution Vector and LMNN

4.1. Basic Idea and Complete Scheme

Figure 1 shows the scheme of the fault classification method that is proposed in this paper, which includes an offline training phase and an online classification phase, as mentioned in Sec. 1.

Fig. 1.

Scheme of fault classification.

As shown in Fig. 1, using the abnormal samples for training that are listed in Table 3 in the offline training procedure of the proposed method, the abnormal samples for each abnormality of each blast furnace are transformed via a contribution vector-based feature extraction algorithm, which will be introduced in Sec. 4.2. All extracted fault features (i.e., contribution vectors) and their known abnormality types are employed to train the classifier, which will be introduced in Sec. 4.3.

In the online classification procedure of the proposed method, as shown in Fig. 1, the contribution vector for a newly arriving abnormal sample of a blast furnace, whose abnormality type is unknown, is calculated by the feature extraction algorithm. Then, the trained classifier will use the contribution vector as input and calculate the output, which indicates which type of abnormality has occurred. Note that we assume that a newly arriving sample must be abnormal because the fault classification procedure is activated by a fault detection procedure, i.e., only after a fault detection algorithm has given an alarm about the occurrence of an abnormality, the newly arriving samples will be employed as the input of the classifier. As mentioned in Sec. 1 and Sec. 3, this paper only focuses on the fault classification, whose detailed algorithms are provided in Sec. 4.3, because the two-stage PCA-based fault detection method for the three blast furnaces has been discussed in literature.14)

4.2. Fault Feature Extraction Based on Variables’ Contributions

As mentioned in Sec. 1, blast furnaces may operate at different operating points and the three blast furnaces are different, which lead to different manifestations of even the same type of abnormalities. Therefore, features that are robust to different blast furnaces and different operating points should be extracted.

PCA is a statistical technique that builds a linear model for a process by mapping the original process variables to a set of uncorrelated components through an orthogonal transformation. The data space of the original process is divided into two parts, the principal space and the residual space, which can be monitored by the T2 and SPE statistics, respectively. Furthermore, the contributions of the variables to the two statistics can be calculated, and the different contributions of the corresponding variables reflect the variation of the process. As mentioned in Sec. 2, the contribution values reflect the influence of each process variable on the variation of a system from its normal state. Because an abnormality of a blast furnace is the variation from the current operating point of the ironmaking process, whereas different operating points and different blast furnaces can be regarded as various normal states of the system, contribution values should be helpful for extracting common features from the data that correspond to the same type of abnormalities regardless of different blast furnaces or different operating points.

In the first step of classification in this paper (refer to Fig. 1), i.e., feature extraction, we propose the use of the contribution vectors, i.e.,   

$$\begin{aligned} Cont_{T^{2}}(k) &= \left( cont_{T^{2},1}(k), \cdots, cont_{T^{2},i}(k), \cdots, cont_{T^{2},35}(k) \right) \\ Cont_{SPE}(k) &= \left( cont_{SPE,1}(k), \cdots, cont_{SPE,i}(k), \cdots, cont_{SPE,35}(k) \right) \end{aligned} \tag{9}$$
as the feature vectors, where $cont_{T^{2},i}(k)$ and $cont_{SPE,i}(k)$, $i=1,2,\cdots,35$, denote the contribution of variable No. i in Table 1 to the abnormal T2 and SPE statistics, respectively, which can be calculated according to (4).

In the following, a numerical simulation example is given to illustrate the benefits of the contribution vector.

Let   

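$$\mu_1 = \begin{pmatrix} 1 \\ 8 \end{pmatrix}, \quad \Sigma_1 = \begin{pmatrix} 1 & 0.1 \\ 0.1 & 1 \end{pmatrix}, \quad \mu_2 = \begin{pmatrix} 8 \\ 1 \end{pmatrix}, \quad \Sigma_2 = \begin{pmatrix} 1 & 0.11 \\ 0.11 & 1.21 \end{pmatrix}, \quad f_1 = \begin{pmatrix} 0 \\ 4 \end{pmatrix} \quad \text{and} \quad f_2 = \begin{pmatrix} 4 \\ 0 \end{pmatrix}$$
Then, let $z_{i,j} = y_i + f_j$, $i=1,2$; $j=1,2$, denote a two-dimensional system that operates at two possible operating points with two possible types of faults, where $y_i \sim N(\mu_i, \Sigma_i)$ for $i=1,2$ denotes normal process operation at the ith operating point, $f_j \in R^{2}$ for $j=1,2$ denotes the jth type of fault and $z_{i,j}$ denotes the process variable that operates at the ith operating point with the jth type of fault.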

Figure 2 displays the simulation results for two normal cases, i.e., y1 and y2, in which each case has 1000 samples, and four faulty cases, i.e., z1,1, z1,2, z2,1, z2,2, in which each case has 50 samples.

Fig. 2.

Two types of faults at two operating points. (Online version in color.)

In Fig. 2, the points in the same color are desired to be classified in the same group, i.e., points for z1,1 and z2,1, and points for z1,2 and z2,2 are desired to be classified in the same group, respectively. However, because red points (for z1,2) are located between the two sets of black points (for z1,1 and z2,1) and black points (for z2,1) are located between the two sets of red points (for z1,2 and z2,2), a classifier (which usually classifies similar data into the same group) has difficulty yielding correct classification results or the classifier may classify z1,2 and z1,1 or z1,2 and z2,1 in the same group and these two possibilities may further lead to different wrong classifications for z2,2.

Let $z_{i,j}^{1}$ and $z_{i,j}^{2}$ denote the two elements of $z_{i,j}$, $i=1,2$; $j=1,2$. Let $cont_{T^{2},z_{i,j}^{1}}(k)$ and $cont_{T^{2},z_{i,j}^{2}}(k)$ denote the contributions of $z_{i,j}^{1}$ and $z_{i,j}^{2}$ to the T2 statistic, respectively, which are calculated according to (4). Define the contribution vector as $Cont_{z_{i,j}}(k) = \left( cont_{T^{2},z_{i,j}^{1}}(k), cont_{T^{2},z_{i,j}^{2}}(k) \right)$, where $k=1,\cdots,50$. Figure 3 shows the points $Cont_{z_{i,j}}$, which have a one-to-one relationship with the points $z_{i,j}$ in Fig. 2. We use black stars and black circles to denote $Cont_{z_{1,1}}$ and $Cont_{z_{2,1}}$, respectively, and use red stars and red circles to denote $Cont_{z_{1,2}}$ and $Cont_{z_{2,2}}$, respectively. Via the transformation, the two types of black points are situated near the vertical axis, and the two types of red points are situated near the horizontal axis. Thus, the black and red points are separated, and the influence of the operating points is eliminated, which enables any type of classifier to make a correct classification with these new points (i.e., contribution vectors) as fault features.

Fig. 3.

Contributions of two variables. (Online version in color.)
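The simulation can be reproduced approximately with the sketch below. It draws the normal and faulty samples with the parameters given above, fits one PCA model per operating point on the normal data, and records which variable dominates the T2 contribution vector of each faulty sample. Several details are assumptions, since the text does not state them: both principal components are retained (s = 2), the data are standardized with the normal-data mean and standard deviation, and the random seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = [np.array([1.0, 8.0]), np.array([8.0, 1.0])]
cov = [np.array([[1.0, 0.1], [0.1, 1.0]]), np.array([[1.0, 0.11], [0.11, 1.21]])]
faults = [np.array([0.0, 4.0]), np.array([4.0, 0.0])]

def t2_contributions(normal, faulty, s=2):
    """T^2 contribution vectors (4) of faulty samples w.r.t. a PCA model of normal data."""
    mean, std = normal.mean(axis=0), normal.std(axis=0, ddof=1)
    Z = (normal - mean) / std
    _, sv, Vt = np.linalg.svd(Z / np.sqrt(len(Z) - 1), full_matrices=False)
    P_s, sigma_s = Vt[:s].T, sv[:s]
    F = (faulty - mean) / std
    return ((F @ P_s / sigma_s) @ P_s.T) ** 2   # rows: (P_s Sigma_s^-1 P_s^T x)^2

# dominant[(i, j)]: share of samples of z_{i,j} whose largest contribution is variable 2
dominant = {}
for i in range(2):
    normal = rng.multivariate_normal(mu[i], cov[i], size=1000)
    for j in range(2):
        faulty = rng.multivariate_normal(mu[i], cov[i], size=50) + faults[j]
        cont = t2_contributions(normal, faulty)
        dominant[(i + 1, j + 1)] = float(np.mean(np.argmax(cont, axis=1)))
```

Under these assumptions the share is close to 1 for the first fault type at both operating points and close to 0 for the second, which corresponds to the points lying near the vertical and horizontal axes in Fig. 3.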

An additional merit of using a contribution vector as a fault feature is that its use is consistent with the practical experience of operators because operators usually classify abnormalities according to the trends of key variables that deviated from their current running state regardless of different operating points, which indicates that operators believe that the “symptoms” of the same type of abnormalities are similar regardless of the operating point.

We may normalize the feature vector at sampling instant k as   

$$Cont\_normalized_{*}(k) = \frac{\left[ cont_{*,1}(k), \cdots, cont_{*,i}(k), \cdots, cont_{*,35}(k) \right]}{\sqrt{\sum_{i=1}^{35} cont_{*,i}(k)^{2}}} \tag{10}$$
to guarantee that $\| Cont\_normalized_{*}(k) \|_{2} = 1$, where the subscript ‘*’ denotes T2 or SPE and has the same meaning in (11).

To reduce the data scale and suppress the effect of noise, the variables’ contributions can be smoothed by averaging over a ten-sample interval, i.e.,   

$$Cont\_aver_{*}(k) = \frac{1}{10} \sum_{i=1}^{10} Cont\_normalized_{*}(10k + i - 10) \tag{11}$$
Therefore, the sample sizes of the contribution vectors in Sec. 4 are approximately one-tenth of the sample sizes of the abnormal samples listed in Column Sample Size of Table 3 in Sec. 3.
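Equations (10) and (11) amount to a row-wise normalization followed by a non-overlapping block mean. A minimal sketch is given below; the function name is hypothetical, contribution vectors are assumed to be nonzero, and dropping a trailing incomplete block is an assumption, since the text does not say how a remainder is handled.

```python
import numpy as np

def normalize_and_average(cont, window=10):
    """cont: (N, q) contribution vectors; returns (N // window, q) fault features."""
    norms = np.linalg.norm(cont, axis=1, keepdims=True)
    normalized = cont / norms                 # (10): unit Euclidean norm per sample
    n_blocks = len(cont) // window            # assumption: incomplete last block is dropped
    blocks = normalized[: n_blocks * window].reshape(n_blocks, window, -1)
    return blocks.mean(axis=1)                # (11): mean over samples 10k-9, ..., 10k
```

For abnormality No. 4 in Table 3, for instance, 296 abnormal samples would yield 29 averaged contribution vectors under this convention.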

The algorithm to extract contribution vectors as fault features, whose scheme is shown in Fig. 4, can be summarized as follows.

Fig. 4.

Algorithm flowchart of feature extraction algorithm.

Algorithm 1: Feature extraction

1. For abnormal samples $x(k)$, $k=1,\cdots,N$, of an abnormality, calculate the T2 and SPE statistics based on a two-stage PCA method according to (2) and (3). Determine which statistic exceeds the corresponding threshold.

2. Calculate the contribution of the ith variable to the abnormal statistic, $cont_{*,i}(k)$, $i=1,\cdots,35$, according to (4), and compose the contribution vectors $Cont_{*}(k)$, $k=1,\cdots,N$, according to (9).

3. Normalize the contribution vectors according to (10).

4. Smooth the contribution vectors by a moving average according to (11), and obtain the final contribution vectors $Cont\_aver_{*}(k)$, $k=1,\cdots,N/10$, as the fault feature.

Remark 1: For abnormal samples of an abnormality, Algorithm 1 may generate one set of contribution vectors—T2-based or SPE-based—or two sets of contribution vectors—both T2-based and SPE-based—depending on which statistic exceeds its threshold, as shown in Fig. 4.

According to (9), to fuse the data of the three blast furnaces, we need to calculate the contribution values of all 35 variables in Table 1 for each blast furnace based on its process data. However, slight differences exist among the variable lists of the three blast furnaces. According to the footnotes of Table 1, variables No. 16, 19, 22, 24 and 30 exist only in Blast Furnace No. 2 and are missing in Blast Furnaces No. 3 and No. 5, whereas variables No. 3, 4 and 5 exist only in Blast Furnaces No. 3 and No. 5 and are missing in Blast Furnace No. 2. Thus, reasonable values for the contributions of the missing variables to the statistics are needed. In this paper, their contributions are set to 0, which is a simple but effective approach.
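The zero-filling step can be sketched as follows: the contributions computed from the variables that a furnace actually measures are placed at their Table 1 positions in a 35-dimensional vector, and the remaining positions stay at 0. The function name and the example variable list are hypothetical.

```python
import numpy as np

N_VARS = 35  # number of variables in Table 1

def embed_contributions(cont, available):
    """cont: (N, len(available)) contributions; available: 1-based Table 1 variable numbers.
    Returns (N, 35) vectors with contribution 0 for every missing variable."""
    full = np.zeros((cont.shape[0], N_VARS))
    full[:, np.asarray(available) - 1] = cont
    return full

# Example: a furnace that lacks variables No. 3, 4 and 5
available = [v for v in range(1, N_VARS + 1) if v not in (3, 4, 5)]
features = embed_contributions(np.ones((2, len(available))), available)
```

Because the added entries are zero, the Euclidean norm used in (10) is unchanged, so the embedding can be applied either before or after normalization.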

4.3. Offline Training and Online Classification Algorithms

Let $m=4$ denote the total number of abnormality types and let $K$ denote the total number of abnormalities that are covered by the abnormal samples employed for training. According to Fig. 5, which displays the flowchart of the offline training algorithm, $K = k_1 + k_2 + k_3$, where $k_1$, $k_2$ and $k_3$ are the numbers of abnormalities occurring in Blast Furnaces No. 2, No. 3 and No. 5, respectively, so that $K$ is the number of all abnormalities collected from the three blast furnaces. As in the majority of classification methods, we use the integer $y_i$ to represent the type of the ith abnormality, where $y_i = 1,\cdots,4$ represents hanging, channeling, slipping and cold furnace condition, respectively. The ith abnormality, $i=1,\cdots,K$, has $N_i$ samples.

Fig. 5.

Algorithm flowchart of offline training algorithm.

The offline training algorithm can be summarized as follows:

Algorithm 2: Offline training

1. For the $i$th abnormality in Fig. 5, $i = 1, \ldots, K$, perform feature extraction on the $N_i$ abnormal samples according to Algorithm 1, which is given in Sec. 4.2, and obtain the T2-based contribution vectors $\mathrm{Cont\_aver}_{T^2,i}(k)$ and/or (depending on which statistic is abnormal) the SPE-based contribution vectors $\mathrm{Cont\_aver}_{\mathrm{SPE},i}(k)$, where $k = 1, \ldots, N_i/10$.

2. Label the contribution vectors $\mathrm{Cont\_aver}_{T^2,i}(k)$ and/or $\mathrm{Cont\_aver}_{\mathrm{SPE},i}(k)$, where $k = 1, \ldots, N_i/10$, for the $i$th abnormality with $y_i$ (i.e., the known type of the $i$th abnormality).

3. Use all labeled contribution vectors $\mathrm{Cont\_aver}_{T^2,i}(k)$ and/or $\mathrm{Cont\_aver}_{\mathrm{SPE},i}(k)$, where $k = 1, \ldots, N_i/10$ and $i = 1, \ldots, K$, to train the two classifiers (one T2-based and one SPE-based) based on the LMNN technique by solving the SDP according to (7).
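Steps 1 and 2 of Algorithm 2 amount to assembling two labeled training sets, one per statistic. The sketch below illustrates only this bookkeeping; the record layout (keys `"label"`, `"t2"`, `"spe"`) and the function name `build_training_sets` are assumptions, and solving the SDP in (7) is left to an LMNN solver that is not shown.

```python
from typing import Dict, List, Tuple

Vector = List[float]
TrainingSet = Tuple[List[Vector], List[int]]  # (feature vectors, labels)


def build_training_sets(abnormalities: List[dict]) -> Dict[str, TrainingSet]:
    """Collect labeled contribution vectors separately for T2 and SPE.

    Each abnormality is a dict with a 'label' in {1, 2, 3, 4} and, for
    every statistic that raised an alarm, a list of contribution vectors
    under the key 't2' or 'spe'. A missing or empty entry is skipped.
    """
    sets: Dict[str, TrainingSet] = {"t2": ([], []), "spe": ([], [])}
    for abnormality in abnormalities:
        for stat in ("t2", "spe"):
            for vec in abnormality.get(stat) or []:
                sets[stat][0].append(vec)
                sets[stat][1].append(abnormality["label"])
    return sets


history = [
    {"label": 1, "t2": [[0.1, 0.2]], "spe": [[0.3, 0.4], [0.5, 0.6]]},
    {"label": 2, "t2": None, "spe": [[0.7, 0.8]]},  # alarmed by SPE only
]
training = build_training_sets(history)
```

Each of the two resulting sets would then be passed to the LMNN training step to obtain its own classifier.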

As mentioned in Sec. 1, a complete fault diagnosis procedure includes fault detection and fault classification. However, the online classification procedure will not be activated until the two-stage PCA method gives an alarm, i.e., until the T2 statistic or the SPE statistic for newly arriving samples exceeds the corresponding threshold. Thus, the online classification procedure assumes that each newly arriving sample that is fed into the online classification algorithm must be an abnormal sample, with an unknown abnormality type.

The online classification algorithm, which is shown in Fig. 6, can be summarized as follows.

Fig. 6.

Algorithm flowchart of online classification algorithm.

Algorithm 3: Online classification

1. For every ten newly arriving samples (because the width of the moving window is set to 10 according to (11)), calculate the corresponding contribution vector $\mathrm{Cont\_aver}_{*}(k)$ by Algorithm 1 in Sec. 4.2, where ‘*’ denotes the abnormal statistic, i.e., T2 or SPE.

2. According to whether the contribution vector is based on T2 or SPE, select the corresponding T2- or SPE-based classifier that is trained by Algorithm 2 and calculate its output according to (8) with the feature vector $\mathrm{Cont\_aver}_{*}(k)$ as an input. Then, the output of the classifier, which is represented by the number $y_{*}(k) \in \{1, 2, 3, 4\}$, gives the classified abnormality type.
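Step 2 can be illustrated with a nearest-neighbor vote in the space of a learned linear transform, which is how an LMNN-trained metric is commonly applied. Equation (8) is not reproduced in this section, so the sketch is an assumption about its form: the matrix `L` stands for a transform already obtained by offline training, and `k=3` neighbors is chosen here only to mirror the setting c=3.

```python
from collections import Counter
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def transform(L: Matrix, x: Sequence[float]) -> List[float]:
    """Apply the linear map L to the vector x."""
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]


def knn_classify(L: Matrix, train_x: Sequence[Sequence[float]],
                 train_y: Sequence[int], x: Sequence[float], k: int = 3) -> int:
    """Majority vote among the k nearest training vectors, with distance
    measured as the squared Euclidean norm of L(x - x_i)."""
    z = transform(L, x)
    scored = []
    for tx, ty in zip(train_x, train_y):
        tz = transform(L, tx)
        scored.append((sum((a - b) ** 2 for a, b in zip(z, tz)), ty))
    scored.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy data: an identity transform and two well-separated abnormality types.
identity = [[1.0, 0.0], [0.0, 1.0]]
train_x = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
train_y = [1, 1, 1, 2, 2, 2]
predicted = knn_classify(identity, train_x, train_y, [0.2, 0.2])
```

In the online procedure, one such call would be made per contribution vector, using the T2-based or the SPE-based classifier as appropriate.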

Because an abnormality persists until it is handled by an operator, a newly occurring abnormality produces a succession of abnormal samples and hence a succession of classification results, which may not always be consistent. In addition, the classification results of T2-based contribution vectors may differ from those of SPE-based contribution vectors. Thus, a final decision is necessary to determine the type of the abnormality from the multiple classification results given by T2- and SPE-based contribution vectors for multiple samples. The decision logic is as follows:

Algorithm 4: Decision logic

1. Set the length threshold L and the majority threshold β.

2. Let $M(j)$ denote how many times Type $j$ appears in the $L$ classification results for $j = 1, \ldots, 4$. Let $j_m$ denote the type that appears most often, i.e., $M(j_m) = \max_{1 \le j \le 4} [M(j)]$. Then, determine $M(j_m)$ and $j_m$. In some cases, both the T2 and SPE statistics may give alarms in the stage of fault detection, which will cause Algorithm 3 to yield classification results for both T2- and SPE-based contribution vectors, i.e., $M(j_{m,T^2})$ and $M(j_{m,\mathrm{SPE}})$. A simple and direct method for fusing these two types of classification results is to let $j_m = j_{m,*}$, where ‘*’ denotes T2 or SPE and $M(j_{m,*}) = \max[M(j_{m,T^2}), M(j_{m,\mathrm{SPE}})]$. Then, let $M(j_m) = \max[M(j_{m,T^2}), M(j_{m,\mathrm{SPE}})]$.

3. If $\frac{M(j_m)}{L} \times 100\% > \beta$, set the final classification result to $j_m$ because Type $j_m$ has appeared a sufficient number of times among the $L$ classification results (exceeding the threshold $\beta$). If $M(j_m)$ does not exceed the threshold $\beta$, which indicates an insufficient number of consistent classification results, the abnormality type remains undetermined, i.e., the algorithm fails to make the classification.
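The decision logic of Algorithm 4 can be sketched as below. The function name `decide`, the use of the most recent `length` results of each classifier, and the tie-breaking in favor of the T2-based result are assumptions made for illustration; the thresholds L and β follow the text.

```python
from collections import Counter
from typing import Optional, Sequence


def decide(t2_results: Optional[Sequence[int]],
           spe_results: Optional[Sequence[int]],
           length: int, beta: float = 90.0) -> Optional[int]:
    """Return the final abnormality type, or None if it stays undetermined.

    For each statistic that produced results, the most frequent type j_m
    among the last `length` results and its count M(j_m) are found; the
    statistic with the larger count wins (T2 wins ties, an assumption).
    """
    best_type, best_count = None, 0
    for results in (t2_results, spe_results):
        if not results:
            continue  # this statistic gave no alarm
        window = list(results)[-length:]
        j_m, count = Counter(window).most_common(1)[0]
        if count > best_count:
            best_type, best_count = j_m, count
    # Same test as M(j_m) / L * 100% > beta, written without division so
    # that floating-point rounding cannot push a value across the threshold.
    if best_type is not None and best_count * 100 > beta * length:
        return best_type
    return None


# Both classifiers agree on Type 1 in at least 9 of 10 results.
final_type = decide([1] * 10, [1] * 9 + [2], length=10)
```

Because the inequality is strict, a share of exactly 90% leaves the abnormality undetermined when β=90%.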

4.4. Experimental Results

In this section, the proposed method will be tested based on real historical abnormal data collected from three blast furnaces at the Liuzhou Iron & Steel Co., Ltd., as shown in Table 3. Two cases are designed in this paper.

In the first case, Abnormalities No. 19, 20 and 21 of Blast Furnace No. 5 are employed as the testing set, whereas the other abnormalities, i.e., Abnormalities No. 1–No. 18 for Blast Furnaces No. 2 or No. 3 are employed as the training set. In this case, no abnormal samples for Blast Furnace No. 5 exist in the training set, which indicates that the three testing abnormalities are novel to Blast Furnace No. 5. Table 4 lists the sample sizes of the training set and the testing set.

Table 4. Sample sizes of the training set and testing set.
| Feature Vectors | T2-based | SPE-based |
| --- | --- | --- |
| Training set | 1519 | 750 |
| Testing set | 292 | 292 |

In the second case, more abnormalities are employed to test the total performance of the proposed method, i.e., Abnormalities No. 4, 8, 10, 11, 13, 15, 16, 18, and 21 and part of No. 12 and 17 are employed as the testing set, and Abnormalities No. 1, 2, 3, 5, 6, 7, 9, 14, 19, and 20 and the remaining part of No. 12 and 17 are employed as the training set. Because only one cold furnace condition and one slipping exist in the collected data, which correspond to Abnormalities No. 12 and No. 17, respectively, the data for Abnormalities No. 12 and 17 are divided into two parts. The first part is employed for training data, and the second part is employed for testing data. Table 6 lists the sample sizes of the training set and the testing set.

Table 6. Sample sizes of the training set and testing set.
| | T2-based Contribution Vectors | SPE-based Contribution Vectors | Original Data |
| --- | --- | --- | --- |
| Training Set | 999 | 426 | 1126 |
| Testing Set | 812 | 616 | 945 |

4.4.1. Parameter Settings for LMNN-Based Classification

As introduced in Sec. 2, two parameters can be tuned for the LMNN algorithm: the number of target neighbors c and the weighting coefficient μ. In our study, c=3 and μ=0.5 are chosen.

As a rule of thumb, c is chosen as 1 or 3.29) Generally, a larger c suppresses the effect of noise on classification, and the value c=3 works well in our practice. Therefore, in this paper, c is chosen as 3.

The weighting parameter, $\mu \in [0, 1]$, controls the “pull/push” trade-off and can be tuned via cross validation. However, as reported in Weinberger’s work,29) the optimization of the LMNN is not sensitive to the value of μ. Moreover, the value μ=0.5 works well in our practice.

For the decision logic, i.e., Algorithm 4 in Sec. 4.3, the length threshold is the sample size of each abnormality, and the majority threshold is β=90%.

4.4.2. Testing Results for Case I

The classification results of the two classifiers, which are based on T2-based contribution vectors and SPE-based contribution vectors, are listed in Table 5. Regardless of which classifier is employed, the two hanging abnormalities (i.e., Abnormalities No. 19 and 21) are correctly classified with a very low error rate; however, neither classifier can correctly recognize the channeling abnormality (i.e., Abnormality No. 20). These results indicate the merit of fusing data that were collected from different blast furnaces for fault classification because the training set contains no abnormal samples from Blast Furnace No. 5. Because 14 hanging abnormalities in Blast Furnaces No. 2 and No. 3 are included in the training set, the method recognizes the same abnormality for Blast Furnace No. 5. The channeling abnormality (i.e., Abnormality No. 20) cannot be correctly classified because only two channeling abnormalities are alarmed by the SPE statistic in the historical data for Blast Furnaces No. 2 and No. 3, which is far from sufficient.

Table 5. Experimental results of Case I.
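| Abnormality No. | Abnormality Type | Sample Size | Classification Error Rate (%), T2-based | Classification Error Rate (%), SPE-based | Correctness of the Classification Result |
| --- | --- | --- | --- | --- | --- |
| 19 | Hanging | 161 | 0.62 | 0.00 | True |
| 20 | Channeling | 32 | 100.00 | 100.00 | False |
| 21 | Hanging | 99 | 0.00 | 0.00 | True |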

Table 5 provides the correctness of the final classification results given by Algorithm 4 because the classification results given by the two statistics (i.e., T2 and SPE) at different instants may not be consistent, as mentioned in Sec. 4.3.

To illustrate the classification procedure, Fig. 7 shows the online classification process for Abnormality No. 21. Subfigure 7(a) shows all the process variables with their original amplitudes, and their corresponding units are shown in Table 1. Some variables appear overlapped at the bottom of Subfigure 7(a) because their amplitudes are relatively small in comparison with the red, yellow, green and blue ones, which correspond to enriching oxygen flow, blast furnace bosh gas volume, theoretical combustion temperature and hot blast temperature, respectively. Subfigures 7(b) and 7(c) show the changes of the T2 statistic and the SPE statistic over time, respectively, according to which the two-stage PCA-based method13) generates an alarm, i.e., an abnormality is detected. The time when the abnormality is detected is marked on both Subfigures 7(b) and 7(c). Subfigure 7(d) shows the outputs of the classifiers for both T2-based and SPE-based contribution vectors.

Fig. 7.

Online classification process for Abnormality No. 21. (Online version in color.)

As shown in Fig. 7, a newly occurring abnormality is detected by the T2 statistic and the SPE statistic at the 93000th instant using the two-stage PCA method proposed in literature.13) Then, the classification procedure that is proposed in this paper is activated. Contribution vectors based on the T2 statistic and the SPE statistic are calculated by Algorithm 1 in Sec. 4.2, with the exception of the samples that correspond to strong disturbances, which are denoted by dashed green rectangles. For details on the identification of disturbances, please refer to literature.13) Using Algorithm 3 in Sec. 4.3, the two classifiers for the T2-based contribution vector and the SPE-based contribution vector each yield an integer that indicates the possible type of abnormality as output, where the integers 1, 2, 3 and 4 represent hanging, channeling, slipping and cold furnace condition, respectively; the outputs of the two classifiers are denoted by red stars and green stars, respectively, in Subfigure 7(d). All classification results are consistent and correct, which indicates that they belong to the same abnormality type. The majority threshold β=90% that is adopted in this paper indicates that the type of the newly occurring abnormality can be determined if the percentage of the majority decisions exceeds 90%. With regard to Abnormality No. 21, the decision would be hanging, which is correct.

4.4.3. Testing Results for Case II

In this case, a larger number of abnormalities are included in the testing set to evaluate the overall classification performance with limited faulty data: 11 abnormalities are employed in the testing set and 12 abnormalities are employed in the training set, where both counts include Abnormalities No. 12 and No. 17, each of which is divided into two parts that are shared between the training set and the testing set, as described in Sec. 4.4. In addition, the proposed method with the contribution vectors as the input of the classifier is compared with the direct use of the LMNN classification method with the original abnormal samples as the input to demonstrate the advantage of contribution vectors.

Table 7 lists the classification error rates of three classifiers that are based on T2-based contribution vectors, SPE-based contribution vectors and original data, and the correctness of the final classification results given by Algorithm 4. Note that ‘/’ in Table 7 indicates that no corresponding classification results exist because the corresponding abnormalities are not alarmed by the T2 or SPE statistic, which causes a difference in the sample sizes between T2-based contribution vectors and SPE-based contribution vectors.

Table 7. Experimental results of Case II.
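| Abnormality No. | Abnormality Type | Sample Size | Classification Error Rate (%), T2-Based | Classification Error Rate (%), SPE-Based | Classification Error Rate (%), Original Data | Correctness of the Classification Result, Contribution Vector-Based | Correctness of the Classification Result, Original Data-Based |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 4 | Hanging | 29 | 0.00 | 0.00 | 20.69 | True | False |
| 8 | Hanging | 84 | 0.00 | / | 0.00 | True | True |
| 10 | Hanging | 70 | 0.00 | / | 0.00 | True | True |
| 11 | Hanging | 279 | 2.51 | 10.04 | 38.35 | True | False |
| 12 | Slipping | 32 | / | 0.00 | 0.00 | True | True |
| 13 | Hanging | 141 | 0.00 | / | 0.00 | True | True |
| 15 | Hanging | 34 | 0.00 | / | 0.00 | True | True |
| 16 | Channeling | 101 | / | 0.00 | 66.34 | True | False |
| 17 | Cold Furnace Condition | 29 | 3.45 | 0.00 | 31.03 | True | False |
| 18 | Hanging | 47 | 0.00 | 10.64 | 57.45 | True | False |
| 21 | Hanging | 99 | 0.00 | 10.10 | 2.02 | True | True |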

From Table 7, the following facts can be concluded:

• The total performance of the fault classification based on contribution vectors is satisfactory. The majority of the error rates are less than 10% and only the error rates of Abnormalities No. 11, 18 and 21 by SPE-based contribution vectors are slightly larger than 10%. Therefore, the proposed classification method yields an acceptable classification accuracy.

• The contribution vector-based classification outperforms the original data-based classification for all abnormalities, with one exception: for Abnormality No. 21, the error rate of the original data-based classification is lower than that of the SPE-based contribution vectors.

• According to the decision logic in Algorithm 4 of Sec. 4.3, with the majority threshold β=90%, all 11 abnormalities are correctly classified by the contribution vector-based classification, whereas only six abnormalities can be correctly determined by the original data-based classification.
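The counts in the last item can be cross-checked against the error rates in Table 7. The sketch below is a simplified reading of Algorithm 4, not the authors' implementation: it assumes that the length threshold equals the sample size (as stated in Sec. 4.4.1), so that the share of the correct type is 100% minus the error rate, and that a final result is correct when the better of the available classifiers exceeds β. The names `TABLE_7` and `passes` are illustrative.

```python
BETA = 90.0  # majority threshold in percent

# (Abnormality No., T2 error %, SPE error %, original-data error %);
# None marks a '/' entry, i.e., no alarm from that statistic.
TABLE_7 = [
    (4, 0.00, 0.00, 20.69),
    (8, 0.00, None, 0.00),
    (10, 0.00, None, 0.00),
    (11, 2.51, 10.04, 38.35),
    (12, None, 0.00, 0.00),
    (13, 0.00, None, 0.00),
    (15, 0.00, None, 0.00),
    (16, None, 0.00, 66.34),
    (17, 3.45, 0.00, 31.03),
    (18, 0.00, 10.64, 57.45),
    (21, 0.00, 10.10, 2.02),
]


def passes(error_rates, beta=BETA):
    """True if the best available classifier labels more than beta percent
    of the results with the correct type."""
    available = [e for e in error_rates if e is not None]
    return bool(available) and max(100.0 - e for e in available) > beta


contribution_ok = [no for no, t2, spe, _ in TABLE_7 if passes([t2, spe])]
original_ok = [no for no, _, _, orig in TABLE_7 if passes([orig])]
print(len(contribution_ok), len(original_ok))
```

Under these assumptions, the lists reproduce the two correctness columns of Table 7, including Abnormalities No. 11, 18 and 21, where the SPE-based error rate alone would fall short of the threshold but the T2-based result does not.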

Figure 7 has already illustrated the detailed detection and classification procedure for a hanging abnormality, i.e., Abnormality No. 21, in Sec. 4.4.2. To show the procedure for the other three types of abnormalities, the detection and classification procedures of Abnormalities No. 12, 16 and 17, which correspond to slipping, channeling and cold furnace condition, respectively, are given in Figs. 8, 9 and 10. The symbols in Subfigures (a)–(d) of Figs. 8, 9 and 10 have similar meanings to those in Fig. 7. Subfigure (e) in Figs. 8, 9 and 10 shows the classification results based on the original data, so that the classification performance based on contribution vectors can be compared with that based on the original data. The classification based on contribution vectors outperforms the classification based on original data for Abnormalities No. 16 and 17 (a channeling and a cold furnace condition, shown in Figs. 9 and 10), whereas the two give the same classification results for Abnormality No. 12 (a slipping, shown in Fig. 8).

Fig. 8.

Classification results of Abnormality No. 12. (Online version in color.)

Fig. 9.

Classification results of Abnormality No. 16. (Online version in color.)

Fig. 10.

Classification results of Abnormality No. 17. (Online version in color.)

5. Conclusions

In this paper, a PCA-LMNN-based method for the fault classification of the ironmaking processes at the Liuzhou Iron & Steel Co., Ltd. is proposed. Historical faulty data of three different blast furnaces are fused to address the lack of abundant historical data. To overcome the influence of the changes of operating points and the differences among blast furnaces, contribution vectors that are based on the T2 or SPE statistic are employed as fault features, which are extracted by a two-stage PCA-based method. The LMNN is utilized to train classifiers, which is a suitable choice for multi-class classification. The experimental results show that the proposed method can obtain the desired performance, which combines the advantages of both the multivariate statistical process control technique and the machine learning technique.

In addition, many interesting problems require future investigation, for example, methods for adding a new abnormality, which is identified by operators, to a historical database.

Another possible study is to improve the classification performance by introducing a nonlinear classifier. Due to the complexity of the ironmaking process, even the root causes of abnormalities that belong to the same type, which are identified by operators, may differ. Therefore, abnormality clustering, which can be achieved by unsupervised learning techniques, would be meaningful.

Acknowledgments

This study was supported by the National Natural Science Foundation of China under Grants 61290324 and 61490701.

References
 
© 2016 by The Iron and Steel Institute of Japan