Double Low-rank Based Matrix Decomposition for Surface Defect Segmentation of Steel Sheet

Shiyang Zhou; Shiqian Wu; Ketao Cui; Huaiguang Liu

doi:10.2355/isijinternational.ISIJINT-2021-024

Abstract

Despite advances in surface defect segmentation of steel sheet, existing methods are still far from meeting the needs of real-world applications, because they usually lack adaptiveness to the different shapes, sizes, locations and textures of defect objects. Based on the assumption that each defect image is composed of defect-free background components that reflect the similarities of different regions and defect foreground components that reflect unique object information, we formulate the segmentation task as an image decomposition problem. To this end, we develop a double low-rank based matrix factorization framework for decomposing the surface defect image into a defect foreground image and a defect-free background image. Furthermore, considering the similarity of the defect-free background sub-regions and the defective sub-regions, Laplacian and sparse regularization terms are introduced into the matrix decomposition framework to improve its representation ability and discriminative ability. Importantly, the proposed method is unsupervised and training-free, so it does not require a large number of training samples with time-consuming manual labels. Experimental results on synthetic and real-world surface defect images show that the proposed method outperforms some state-of-the-art approaches in both subjective and objective evaluations.

1. Introduction

Surface defect detection plays a vital role in the quality management of industrial products,^1,2) such as steel, fabric, paper, leather, printed circuit boards, liquid crystal displays, semiconductor wafers and so on. It aims to identify, locate, segment and classify surface defects during manufacturing. Surface defect segmentation can reduce redundant information and highlight the critical defect regions for high-level image understanding, which is useful for real-time surface defect detection systems based on machine vision. For steel sheet, automated detection of a diverse range of abnormal image regions is generally a challenging task, for the main reasons outlined below and illustrated in Fig. 1: (1) heterogeneous and scattered defects: the number and type of defects in a defect image are generally unknown in advance, and different surface images often have different imaging qualities, e.g., low contrast or unclear boundaries; (2) cluttered and complicated background: the defect-free background may also differ greatly between images; (3) different types of defect may be contained in a single defect image, and they often exhibit substantial stochastic variability in shape, size, gray level, texture and location. These factors make it difficult for a computational approach to detect and segment the diverse defects in surface defect images.

Fig. 1.

Examples of surface defect image of steel sheet.

In the past two decades, extensive efforts have been devoted to more efficient and accurate defect segmentation methods.^3,4,5,6) Traditional surface defect segmentation methods can be mainly divided into three categories: statistical-based methods,^7,8) filter-based methods^9,10,11) and model-based methods. As different types of defect have different shapes, sizes, gray levels, textures and locations, the above-mentioned approaches are basically customized for a predefined or specific type of defect, and they often lack the capacity to learn the common features of all types of defect. Therefore, they may face challenges in segmenting various abnormal regions, and their generalization capability is constrained. Additionally, the low computational speed of some of these methods is a limitation for real-time detection. These factors motivate researchers to develop new methods for surface defect segmentation of steel sheet.

In the last few years, deep learning (DL) methods based on convolutional neural networks (CNN) and generative adversarial networks (GAN)^12,13) have achieved remarkable performance in image segmentation. Therefore, some studies have attempted to adopt DL methods for defect detection and segmentation.¹⁴⁾ As noted in the literature,^15,16,17) these DL models are complex and have many parameters, and training them requires a huge number of expert-labelled samples, complex optimization algorithms, high computational cost, and strong hardware and software support, which is a significant challenge in industrial environments. Moreover, defective samples are difficult to obtain because the probability of defect occurrence is very low in industrial manufacturing.

In surface images of steel sheet, the normal surface generally shows a homogeneous texture, and defects are local anomalies on the surface that are dissimilar to the texture at other locations. From the viewpoint of computer vision, defects are aberrant or anomalous arrangements of pixels in the image, compared with the defect-free region. As illustrated in Fig. 2, if there are no defects, I is a "clean" texture image. A surface image with abnormal regions can be regarded as the superposition of two different parts: the abnormal regions and the normal background. Accordingly, the surface defect image I is divided into two new images: a defect-free background image (denoted by B) and a defect foreground image (denoted by F). Therefore, for a given surface defect image, defect segmentation can be converted into the following problem: how can the surface defect image be decomposed exactly into the defect object and the corresponding defect-free background? Once we have the defect foreground image, the various abnormal regions can be detected and segmented from it. The main task of surface defect segmentation is thus to obtain F from the given input image I. In this way, the surface defect segmentation task can be formulated as a matrix decomposition problem.

Fig. 2.

A surface image with defect can be regarded as the superposition of defect on its defect-free background: (a) original surface defect image I; (b) defect-free background image B; (c) defect foreground image F.

More recently, matrix decomposition methods based on low-rank and sparse models have become popular for defect detection and segmentation. These methods assume that the textured background and the defect foreground are low-rank and sparse, respectively. Mathematically, these models can be formulated as a joint low-rank and sparse matrix minimization problem, also called robust principal component analysis (RPCA).¹⁸⁾ RPCA assumes that an image can be represented as a combination of a highly redundant part (i.e., visually consistent background regions) and a sparse part (i.e., foreground object regions). Therefore, the feature matrix of an input image can be decomposed into a low-rank matrix corresponding to the background and a sparse matrix corresponding to the foreground object. However, directly using RPCA to detect and segment surface defects in industrial products is not practical, because it lacks accuracy and has limited adaptability. Building on the milestone work of RPCA, prior knowledge of surface defect images and powerful regularization terms have been incorporated into the original RPCA model, which can improve segmentation results in terms of both speed and accuracy. In particular, many studies conclude that methods based on low-rank matrix decomposition obtain better performance for surface defect segmentation. Cen et al.¹⁹⁾ investigated a low-rank matrix reconstruction model for defect inspection of TFT-LCD. However, the method can only handle TFT-LCD images with a simple background, and may not extend to other industrial products. Li et al.²⁰⁾ designed a low-rank decomposition model in which a low-rank matrix represents the original defect-free fabric texture image and a sparse matrix represents the defect information. This method requires a sophisticated design for each specific defect.
Yan et al.²¹⁾ proposed a smooth-sparse decomposition (SSD) model with regularized high-dimensional regression to decompose a defect image and separate anomalous regions by solving a large-scale optimization problem. This method relies on the premise that the image background is smooth and that anomalous regions are sparse or can be expressed by sparse bases. Cao et al.²²⁾ presented a prior knowledge guided least squares regression (PG-LSR) model based on low-rank representation to detect diverse defects. However, this method ignores the sparsity of defect pixels and cannot detect continuous defective regions. Huang et al.²³⁾ applied a texture prior to construct a novel weighted low-rank reconstruction (W-LRR) model, which is only suitable for defect images with regular or near-regular texture. Wang et al.²⁴⁾ studied the entity sparsity pursuit (ESP) method to identify surface defects. It assumes that the defect region accounts for a small fraction of the whole image, which limits its use in real applications. These methods do not consider the low-rank characteristics of the defect foreground object and the background regions simultaneously, and they ignore the spatial and pattern relations of image regions, which may degrade segmentation performance.

Motivated by the above analysis,^25,26,27,28) an easy-to-implement defect segmentation approach based on double low-rank matrix decomposition (DLMD) is developed in this paper. The method uses a unified low-rank assumption to characterize both the background and the defect object. The framework of the proposed DLMD-based segmentation approach can be divided into two steps: first, the defect foreground image and the defect-free background image are separated from the surface defect image; second, an optimization strategy is applied to enhance the decomposed results and improve the accuracy of the defect foreground image, which leads to higher segmentation performance.

The remainder of the paper is organized as follows. Section 2 briefly reviews the related work on RPCA and its robust extensions. In Section 3, we present the DLMD-based surface defect segmentation method, including formulation, optimization, and solution. Section 4 reports experimental results and comparisons between our proposed method and some state-of-the-art methods, followed by the conclusion and future work in Section 5.

2. Related Work

In the past few years, robust principal component analysis has been applied to many areas of image processing, such as image denoising and inpainting, image classification, image background modeling, image super-resolution and so on.²⁹⁾ It assumes that the clean data matrix is low-rank and the error matrix is sparse. The low-rank representation performs well in discovering the global structure of data and can reveal the relationships among samples: the within-class affinities are dense while the between-class affinities are all zeros. The RPCA model is

$$\min_{L,S}\ \bigl(\operatorname{rank}(L)+\lambda\|S\|_{0}\bigr)\quad \text{s.t.}\quad D=L+S \tag{1}$$

where $D\in\mathbb{R}^{m\times n}$ is the observed matrix; $L\in\mathbb{R}^{m\times n}$ and $S\in\mathbb{R}^{m\times n}$ are the two decomposed matrices; $\operatorname{rank}(\cdot)$ denotes the rank of a matrix; $\|\cdot\|_{0}$ denotes the $\ell_0$ norm, which equals the number of non-zero elements of a matrix; and $\lambda>0$ is a parameter that trades off the rank of L against the sparsity of S.

As $\operatorname{rank}(\cdot)$ and $\|\cdot\|_{0}$ are not convex, a common heuristic is to replace them by the nuclear norm $\|\cdot\|_{*}$ and the $\ell_1$ norm $\|\cdot\|_{1}$, respectively, so that Eq. (1) is relaxed to the following convex problem:

$$\min_{L,S}\ \bigl(\|L\|_{*}+\lambda\|S\|_{1}\bigr)\quad \text{s.t.}\quad D=L+S \tag{2}$$

where $\|\cdot\|_{*}$ equals the sum of the singular values of a matrix, and $\|\cdot\|_{1}$ equals the sum of the absolute values of all entries.

Given a data matrix, RPCA can decompose it exactly into the sum of a low-rank matrix and a sparse matrix by minimizing a weighted combination of the nuclear norm and the $\ell_1$ norm. In addition, $\|\cdot\|_{0}$ can be replaced by the $\ell_{2,1}$ norm,³⁰⁾ which detects outliers with column-wise sparsity: $\|S\|_{2,1}=\sum_{j=1}^{n}\|s_j\|_{2}$, where $S=(s_1,s_2,\ldots,s_n)$ with $s_j\in\mathbb{R}^{m}$; that is, $\|\cdot\|_{2,1}$ equals the sum of the $\ell_2$ norms of the columns of a matrix.
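
The three convex surrogates used above can be computed directly from their definitions. The following is a minimal NumPy sketch for illustration only; the function names and the small example matrix are assumptions made here and do not come from the paper.

```python
import numpy as np

def nuclear_norm(A):
    """Sum of the singular values of A (convex surrogate of rank)."""
    return np.linalg.svd(A, compute_uv=False).sum()

def l1_norm(A):
    """Sum of the absolute values of all entries (surrogate of the l0 norm)."""
    return np.abs(A).sum()

def l21_norm(A):
    """Sum of the l2 norms of the columns (column-wise sparsity)."""
    return np.linalg.norm(A, axis=0).sum()

# Illustrative rank-1 matrix with a single non-zero column (3, 4)^T.
A = np.array([[3.0, 0.0, 0.0],
              [4.0, 0.0, 0.0]])

nuc = nuclear_norm(A)   # one singular value, equal to 5
l1 = l1_norm(A)         # |3| + |4| = 7
l21 = l21_norm(A)       # ||(3, 4)||_2 = 5
```

For this example the $\ell_{2,1}$ norm and the nuclear norm coincide because the matrix has a single non-zero column, whereas the $\ell_1$ norm is larger since it penalizes every entry separately.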

Several convex optimization algorithms have been proposed to solve RPCA,³¹⁾ such as the iterative thresholding (IT) approach, the accelerated proximal gradient (APG) algorithm, the semi-definite programming (SDP) method, the dual approach, singular value thresholding (SVT), the alternating direction method of multipliers (ADMM), the inexact augmented Lagrangian multipliers (ALM) method, and so forth. Suppose that $L\in\mathbb{R}^{m\times n}$ is a matrix with rank r. Its singular value decomposition (SVD) is denoted as $\operatorname{svd}(L)=U\Sigma V^{T}$, where $\Sigma=\operatorname{diag}(\{\sigma_i\}_{1\le i\le r})$ is the diagonal matrix with $\sigma_1,\sigma_2,\ldots,\sigma_r$ on the diagonal and zeros elsewhere, $\sigma_i$ is the i-th singular value of L, and $U\in\mathbb{R}^{m\times r}$ and $V\in\mathbb{R}^{n\times r}$ are the left and right singular matrices, respectively.

The traditional soft-thresholding shrinkage operator is

$$\Psi_{\lambda}\{\Sigma_{ij}\}=\begin{cases}\Sigma_{ij}-\lambda, & \Sigma_{ij}>\lambda\\ 0, & \Sigma_{ij}\le\lambda\end{cases}$$

where $\Sigma_{ij}$ stands for the (i,j)-th element of $\Sigma$. Every singular value is shrunk by subtracting the same constant $\lambda$, which means that all singular values are treated as contributing equally. For surface defect images, however, the singular values have clear physical meanings: larger singular values correspond to the major projection directions and should be shrunk less in order to preserve the major components. Given the weight vector $w\in\mathbb{R}^{r}$, the non-uniform singular value thresholding operator³²⁾ is

$$\Psi_{\lambda w}\{\Sigma_{ij}\}=\begin{cases}\Sigma_{ij}-\lambda w_i, & \Sigma_{ij}>\lambda\\ 0, & \Sigma_{ij}\le\lambda\end{cases}\qquad w_i=\frac{\sum_{j=1}^{r}\sigma_j}{\sigma_i}.$$

The larger singular values quantify the principal information of the image, so they should be reduced as little as possible; that is, the larger the singular value, the more it contributes to the major information. Different singular values are therefore treated differently by assigning different weights, and they shrink adaptively according to the specific information of the image, which can improve the accuracy of low-rank reconstruction and enhance the adaptivity of defect segmentation.
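
The difference between uniform and weighted shrinkage can be seen on a small example. The sketch below is illustrative rather than the authors' implementation: it uses the weights $w_i=\sum_j\sigma_j/\sigma_i$ described above, and it clamps the shrunk values at zero, which is an assumption added here so that singular values stay non-negative (the piecewise definition in the text thresholds on $\Sigma_{ij}>\lambda$ instead). Function names are hypothetical.

```python
import numpy as np

def soft_threshold(sigma, lam):
    """Uniform shrinkage: every singular value is reduced by the same lam."""
    return np.maximum(sigma - lam, 0.0)

def weighted_threshold(sigma, lam, eps=1e-12):
    """Non-uniform shrinkage with w_i = sum_j(sigma_j) / sigma_i.

    Assumption: the result is clamped at zero to keep singular values
    non-negative; this differs slightly from the piecewise rule in the text.
    """
    w = sigma.sum() / np.maximum(sigma, eps)
    return np.maximum(sigma - lam * w, 0.0)

def svt(A, lam, weighted=False):
    """Singular value thresholding of a matrix A."""
    U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
    shrink = weighted_threshold if weighted else soft_threshold
    return (U * shrink(sigma, lam)) @ Vt

sigma = np.array([10.0, 1.0])
uniform = soft_threshold(sigma, 0.5)       # both values reduced by 0.5
weighted = weighted_threshold(sigma, 0.5)  # weights are 1.1 and 11
low_rank = svt(np.diag(sigma), 0.5, weighted=True)
```

With these numbers, the dominant singular value loses only 0.55 under the weighted rule while the small one is removed entirely, which is the behaviour the paragraph above argues for.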

3. A Computational Approach for Surface Defect Segmentation

In this section, a double low-rank based matrix decomposition method is proposed for segmenting the surface defects of steel sheet. The proposed method comprises five stages: denoising, superpixel over-segmentation, feature extraction, feature matrix decomposition, and defect segmentation. The flowchart of the proposed DLMD method is shown in Fig. 3, and the stages are described in detail in the following sub-sections.

Fig. 3.

Diagram of the proposed DLMD method for surface defect segmentation. (Online version in color.)

3.1. Denoising

A denoising step is first applied to the surface defect image. We directly convolve the defect image with a Gaussian kernel, which filters out noise while preserving the useful structural information of the defects.
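
As an illustration of this step, a separable Gaussian convolution can be written with NumPy alone. The paper does not state the kernel size or standard deviation, so `sigma`, the 3-sigma kernel radius and the reflective border handling below are all assumptions.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def gaussian_denoise(image, sigma=1.0):
    """Smooth a 2-D gray image by convolving rows, then columns."""
    radius = int(3 * sigma + 0.5)          # assumed 3-sigma support
    k = gaussian_kernel_1d(sigma, radius)
    # Reflect-pad so the output has the same size as the input.
    padded = np.pad(image.astype(float), radius, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")

rng = np.random.default_rng(0)
noisy = 0.5 + 0.1 * rng.standard_normal((32, 32))
smoothed = gaussian_denoise(noisy, sigma=1.0)
```

A larger `sigma` removes more noise but also blurs thin defects such as scratches, so in practice the value would need to be tuned to the image resolution.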

3.2. Superpixel Over-segmentation

To improve computational efficiency and better capture the structural and boundary information in the defect image, we adopt the adaptive simple linear iterative clustering (ASLIC)³³⁾ superpixel algorithm to partition the defect image into several non-overlapping sub-regions. Only one parameter needs to be specified, namely the number of superpixel sub-regions K. ASLIC generates regularly shaped superpixels in textured and non-textured regions alike. Because the number of superpixel sub-regions is far smaller than the number of pixels in the image, the computational burden is eased to a certain degree. As shown in Fig. 4, a larger K should be chosen if the potential defect object is small and morphologically complex, because this produces more deformable shapes to enclose the region containing the potential defect.

Fig. 4.

Superpixel over-segmentation results with different K for ASLIC: (a) K=50; (b) K=100; (c) K=150; (d) K=200.

3.3. Feature Extraction

Following previous work,^26,34) gray-scale, Gabor filter and steerable pyramid features are extracted from the surface defect image. A total of 25 image features are computed and stacked vertically to construct a feature vector for each pixel; details are presented in Appendix 1. For each superpixel sub-region, the feature vector is calculated as the mean of the feature vectors of all pixels it contains, and such a feature representation is robust to noise. The feature vectors of all sub-regions are normalized into unit column vectors and stacked together to build a feature matrix $D\in\mathbb{R}^{d\times K}$, where d is the dimension of the feature vector and K is the number of superpixel sub-regions.
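
The construction of the feature matrix from per-pixel features and a superpixel label map can be sketched as follows. This is a minimal illustration under stated assumptions: `pixel_features` is a hypothetical array of shape (H, W, d) holding the already extracted features, and `labels` is an integer map with values 0 to K-1 such as a superpixel algorithm would return.

```python
import numpy as np

def build_feature_matrix(pixel_features, labels):
    """Average pixel features per superpixel and return D of shape (d, K).

    pixel_features: (H, W, d) array of per-pixel feature vectors.
    labels: (H, W) integer array with superpixel indices in [0, K).
    """
    d = pixel_features.shape[-1]
    flat_feats = pixel_features.reshape(-1, d)
    flat_labels = labels.ravel()
    K = int(flat_labels.max()) + 1

    # Sum the features of all pixels that share a label, then divide by count.
    sums = np.zeros((K, d))
    np.add.at(sums, flat_labels, flat_feats)
    counts = np.bincount(flat_labels, minlength=K).astype(float)
    means = sums / np.maximum(counts, 1.0)[:, None]

    # Normalize every superpixel feature vector to unit length.
    norms = np.linalg.norm(means, axis=1, keepdims=True)
    return (means / np.maximum(norms, 1e-12)).T

# Tiny example: a 2 x 2 image, two superpixels, two features per pixel.
feats = np.array([[[1.0, 0.0], [3.0, 0.0]],
                  [[0.0, 2.0], [0.0, 4.0]]])
labels = np.array([[0, 0],
                   [1, 1]])
D = build_feature_matrix(feats, labels)
```

In the example, the first superpixel averages to (2, 0) and the second to (0, 3), so after normalization the columns of D are the two unit vectors.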

3.4. Feature Matrix Decomposition

As shown in Fig. 5, we seek to decompose the surface defect image I into a defect-free background image B and a defect foreground image F. Each of these images is segmented by the superpixel algorithm, and the feature vectors of the superpixel sub-regions are stacked together to form three feature matrices. Our purpose is to decompose the feature matrix D constructed from the original defect image I into a feature matrix L that represents the background image B and a feature matrix S that represents the defect foreground image F in a certain feature space. Therefore, D can be modelled as the sum of the two matrices, D = L + S, where each column of the matrices denotes the feature vector of an individual superpixel sub-region.

Fig. 5.

Illustration of surface defect image decomposition with double low-rank assumption: (a) original surface defect image I by superpixel over-segmentation; (b) defect-free background image B by superpixel over-segmentation; (c) defect foreground image F by superpixel over-segmentation.

Both the background image B and the defect foreground image F contain multiple homogeneous and highly similar sub-regions. The two feature matrices L and S therefore contain redundant information and can be assumed to be low-rank owing to the similarity among different sub-regions, which form a low-dimensional feature subspace. Moreover, to reduce the influence of noise and improve the robustness to uneven illumination at the same time, we assume that the background has the sparse property and lies in a sparse feature subspace.

3.4.1. Problem Formulation

Based on the above analysis, the matrix D can be separated into two feature matrices L and S by solving the problem:

$$\min_{L,S}\ \bigl(\operatorname{rank}(L)+\operatorname{rank}(S)+\alpha\,\Theta(L,S)+\beta\|L\|_{0}\bigr)\quad \text{s.t.}\quad D=L+S \tag{3}$$

where, Θ(L,S) denotes the regularization term to enlarge the margin and reduce the coherence between the feature subspaces induced by L and S; α>0 and β>0 are regularization parameters.

To separate the defect object from the defect-free background as much as possible, the local invariance assumption³⁵⁾ based Laplacian regularization term Θ(L,S) can be defined as follows:

$$\Theta(L,S)=\frac{1}{2}\sum_{i,j=1}^{K}\|s_i-s_j\|_{2}^{2}\,w_{ij}=\operatorname{tr}(SMS^{T}) \tag{4}$$

where $M\in\mathbb{R}^{K\times K}$ is a Laplacian matrix; $\operatorname{tr}(\cdot)$ denotes the trace of a matrix; $s_i$ and $s_j$ denote the i-th and j-th columns of S; and the entry $w_{ij}$ of the affinity matrix $W\in\mathbb{R}^{K\times K}$ is the weight that represents the feature similarity between sub-regions $R_i$ and $R_j$. Details about M are listed in Appendix 2.
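
The equality in Eq. (4) can be checked numerically. The sketch below assumes that M is the unnormalized graph Laplacian $M=\operatorname{diag}(W\mathbf{1})-W$ of a symmetric affinity matrix W; the exact construction of W and M is given in Appendix 2 and is not reproduced here, so the random affinities are purely illustrative.

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalized Laplacian M = diag(W 1) - W (assumed form of M)."""
    return np.diag(W.sum(axis=1)) - W

def pairwise_smoothness(S, W):
    """Left-hand side of Eq. (4): 0.5 * sum_ij w_ij * ||s_i - s_j||^2."""
    K = S.shape[1]
    total = 0.0
    for i in range(K):
        for j in range(K):
            total += W[i, j] * np.sum((S[:, i] - S[:, j]) ** 2)
    return 0.5 * total

rng = np.random.default_rng(0)
S = rng.standard_normal((4, 6))    # d = 4 features, K = 6 sub-regions
A = rng.random((6, 6))
W = 0.5 * (A + A.T)                # symmetric, non-negative affinities
np.fill_diagonal(W, 0.0)

M = graph_laplacian(W)
lhs = pairwise_smoothness(S, W)
rhs = np.trace(S @ M @ S.T)        # right-hand side of Eq. (4)
```

The trace form is what makes the regularizer convenient in the optimization: it is a quadratic function of S, so its gradient is simply $2SM$ for symmetric M.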

3.4.2. Optimization

Eq. (3) is then converted into the following convex surrogate optimization problem:

$$\min_{L,S}\ \bigl(\|L\|_{*}+\|S\|_{*}+\alpha\operatorname{tr}(SMS^{T})+\beta\|L\|_{2,1}\bigr)\quad \text{s.t.}\quad D=L+S \tag{5}$$

where the $\ell_{2,1}$ norm-based penalty term $\|L\|_{2,1}$ aims to characterize the noise or illumination interference in the image.

The inexact ALM method is an effective solver to find optimal solutions of Eq. (5), which minimizes the dual form of the original constrained optimization problem over one variable with others fixed at a time, and repeats this process with increasing positive penalty scalar until it converges. By introducing the auxiliary variables H = L, J = S, Eq. (5) is turned into the augmented Lagrangian function:

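$$\begin{aligned}O(L,S,H,J,Y_1,Y_2,Y_3,\mu)={}&\|L\|_{*}+\|S\|_{*}+\alpha\operatorname{tr}(JMJ^{T})+\beta\|H\|_{2,1}\\&+\langle Y_1,\,D-L-S\rangle+\frac{\mu}{2}\|D-L-S\|_{F}^{2}\\&+\langle Y_2,\,H-L\rangle+\frac{\mu}{2}\|H-L\|_{F}^{2}\\&+\langle Y_3,\,J-S\rangle+\frac{\mu}{2}\|J-S\|_{F}^{2}\end{aligned} \tag{6}$$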

where $\langle\cdot,\cdot\rangle$ denotes the inner product of two matrices; $\|\cdot\|_{F}^{2}$ denotes the squared Frobenius norm, which is defined as the sum of the squares of all elements of a matrix; $Y_1$, $Y_2$ and $Y_3$ are Lagrange multipliers; and $\mu$ is a penalty parameter.

Details of the optimization algorithm for the proposed DLMD method are presented in Appendix 3, and the whole algorithm is summarized in Table 1.

Table 1. Algorithm of solving DLMD via ALM.

Algorithm 1 Solving DLMD via ALM
Input: Feature matrix $D\in\mathbb{R}^{d\times K}$, parameters $\alpha>0$, $\beta>0$, and $\epsilon>0$
Initialize: $H^{(0)}=J^{(0)}=L^{(0)}=S^{(0)}=0$, $Y_1^{(0)}=Y_2^{(0)}=Y_3^{(0)}=0$, $\mu^{(0)}=0.1$, $\mu_{\max}=10^{5}$, $\rho=1.1$, $k=0$
While not converged do
step 1: Update $H^{(k+1)}$ by Eq. (A3-2)
step 2: Update $J^{(k+1)}$ by Eq. (A3-4)
step 3: Update $L^{(k+1)}$ by Eq. (A3-6)
step 4: Update $S^{(k+1)}$ by Eq. (A3-8)
step 5: Update $Y_1^{(k+1)}$, $Y_2^{(k+1)}$, $Y_3^{(k+1)}$ by Eq. (A3-9)
step 6: Update $\mu^{(k+1)}$ by Eq. (A3-10)
step 7: Check the convergence condition $\|D-L^{(k+1)}-S^{(k+1)}\|_{F}/\|D\|_{F}<10^{-5}$
step 8: $k=k+1$
End While
Output: The optimal solution $L\in\mathbb{R}^{d\times K}$ and $S\in\mathbb{R}^{d\times K}$
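
The loop structure of Algorithm 1 can be sketched in NumPy as follows. This is not the authors' implementation: the update formulas of Appendix 3 are not reproduced in this section, so the four sub-problem solutions below are derived here directly from Eq. (6) by minimizing over one variable with the others fixed, and they use plain (unweighted) singular value thresholding. The function names, the random test data and the `max_iter` cap are assumptions; the initial values follow Table 1.

```python
import numpy as np

def svt(A, tau):
    """Plain singular value thresholding (prox of tau * nuclear norm)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def column_shrink(A, tau):
    """Column-wise shrinkage (prox of tau * l2,1 norm)."""
    norms = np.linalg.norm(A, axis=0)
    return A * (np.maximum(norms - tau, 0.0) / np.maximum(norms, 1e-12))

def dlmd_alm(D, M, alpha=1.25, beta=0.25, mu=0.1, mu_max=1e5, rho=1.1,
             tol=1e-5, max_iter=500):
    """Sketch of an inexact ALM loop for Eq. (5); M must be symmetric."""
    d, K = D.shape
    L, S, H, J = (np.zeros((d, K)) for _ in range(4))
    Y1, Y2, Y3 = (np.zeros((d, K)) for _ in range(3))
    eye, norm_D = np.eye(K), np.linalg.norm(D)
    for _ in range(max_iter):
        # H: beta*||H||_{2,1} + mu/2 * ||H - (L - Y2/mu)||_F^2
        H = column_shrink(L - Y2 / mu, beta / mu)
        # J: stationarity gives J (2*alpha*M + mu*I) = mu*S - Y3
        J = np.linalg.solve(2 * alpha * M + mu * eye, (mu * S - Y3).T).T
        # L and S: two quadratic terms average into one SVT with tau = 1/(2*mu)
        L = svt(0.5 * ((D - S + Y1 / mu) + (H + Y2 / mu)), 0.5 / mu)
        S = svt(0.5 * ((D - L + Y1 / mu) + (J + Y3 / mu)), 0.5 / mu)
        # Multiplier and penalty updates.
        R = D - L - S
        Y1, Y2, Y3 = Y1 + mu * R, Y2 + mu * (H - L), Y3 + mu * (J - S)
        mu = min(rho * mu, mu_max)
        if np.linalg.norm(R) / norm_D < tol:
            break
    return L, S

# Illustrative data: random unit-norm features and a random graph Laplacian.
rng = np.random.default_rng(0)
D = rng.random((5, 8))
D /= np.linalg.norm(D, axis=0)
A = rng.random((8, 8))
W = 0.5 * (A + A.T)
np.fill_diagonal(W, 0.0)
M = np.diag(W.sum(axis=1)) - W

L, S = dlmd_alm(D, M)
rel_err = np.linalg.norm(D - L - S) / np.linalg.norm(D)
```

The stopping rule is the relative reconstruction error from step 7. Replacing `svt` with a weighted thresholding operator, as discussed in Section 2, would change only the two lines that update L and S.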

3.5. Defect Segmentation

The columns of $L=(l_1,l_2,\ldots,l_K)$ and $S=(s_1,s_2,\ldots,s_K)$ are the feature vectors of the corresponding superpixel sub-regions of the decomposed background image B and defect foreground image F, respectively. We then transfer L and S from the feature domain to the spatial domain to construct the visual images. The gray value of each superpixel sub-region is set to the maximum value of the corresponding feature vector and is assigned to all pixels of that sub-region to visualize the background image B and the defect foreground image F, as shown in Fig. 3. As described in Ref. 36), to enhance the consistency and completeness of defect objects and suppress the background noise in the defect foreground image F, a regression optimization algorithm is adopted that combines the background image B and the defect foreground image F:

$$\min_{s}\ \Bigl(\sum_{i=1}^{K}w_i^{f}(s_i-1)^{2}+\sum_{i=1}^{K}w_i^{b}s_i^{2}+\sum_{i,j=1}^{K}w_{ij}(s_i-s_j)^{2}\Bigr) \tag{7}$$

where $w_i^{f}$ and $w_i^{b}$ denote the gray values of the i-th sub-region in the defect foreground image F and the background image B, respectively, and $s_i$, the i-th entry of $s=(s_1,s_2,\cdots,s_K)^{T}$, denotes the enhanced gray value of the i-th sub-region of the defect image F.

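Letting $W^{b}=\operatorname{diag}\bigl((w_1^{b},w_2^{b},\cdots,w_K^{b})^{T}\bigr)\in\mathbb{R}^{K\times K}$ and $W^{f}=\operatorname{diag}\bigl((w_1^{f},w_2^{f},\cdots,w_K^{f})^{T}\bigr)\in\mathbb{R}^{K\times K}$, Eq. (7) can be written in matrix form as

$$\min_{s}\ \bigl(s^{T}W^{b}s+s^{T}W^{f}s-2\cdot\mathbf{1}^{T}W^{f}s+\mathbf{1}^{T}W^{f}\mathbf{1}+2s^{T}Ms\bigr) \tag{8}$$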

where $\mathbf{1}\in\mathbb{R}^{K\times 1}$ denotes the all-ones vector and $M\in\mathbb{R}^{K\times K}$ denotes the same Laplacian matrix as in Eq. (4).

Differentiating Eq. (8) with respect to s and setting the derivative to zero gives $2W^{b}s+2W^{f}s-2W^{f}\mathbf{1}+4Ms=0$, whose solution is $s=(W^{f}+W^{b}+2M)^{-1}W^{f}\mathbf{1}$. After this optimization, sub-regions within the same foreground or background tend to have more similar gray values, while sub-regions from different classes tend to have different gray values. The gray values of defect sub-regions in the foreground image become larger, so the defect object is highlighted further. Finally, the surface defect can be easily localized and segmented through a simple thresholding operation.
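
The closed-form solution above is a single linear solve. The following sketch illustrates it on a three-region toy example; it assumes that the region gray values are the column maxima of S and L as described in this section, that they are non-negative, and that M is the unnormalized Laplacian of the affinity matrix W. The function names and the numbers are illustrative only.

```python
import numpy as np

def enhance_foreground(S, L, W):
    """Solve s = (W^f + W^b + 2M)^{-1} W^f 1 for the enhanced gray values."""
    w_f = S.max(axis=0)                 # gray value of each region in F
    w_b = L.max(axis=0)                 # gray value of each region in B
    M = np.diag(W.sum(axis=1)) - W      # assumed Laplacian of the affinities
    A = np.diag(w_f) + np.diag(w_b) + 2.0 * M
    return np.linalg.solve(A, w_f)      # W^f 1 is simply the vector w_f

def paint(values, labels):
    """Assign each region value to all pixels carrying that label."""
    return values[labels]

# Region 0 looks like a defect; regions 1 and 2 look like background.
S = np.array([[0.9, 0.1, 0.1],
              [0.0, 0.0, 0.0]])
L = np.array([[0.1, 0.9, 0.9],
              [0.0, 0.0, 0.0]])

# Without any affinity, each region is solved independently.
s_plain = enhance_foreground(S, L, np.zeros((3, 3)))

# Linking the two background regions pulls their values together.
W = np.zeros((3, 3))
W[1, 2] = W[2, 1] = 1.0
s_linked = enhance_foreground(S, L, W)

labels = np.array([[0, 1],
                   [2, 2]])
image = paint(s_plain, labels)
```

With zero affinities the solution reduces to $s_i=w_i^{f}/(w_i^{f}+w_i^{b})$, which makes the role of the two data terms easy to see; the Laplacian term then smooths these values across similar regions.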

4. Experiment

In this section, we conduct a series of tests including simulation experiments and real-image experiments. We introduce the experimental setup and analyze the parameter settings, convergence, computational complexity and robustness to noise. In addition, we present qualitative and quantitative comparisons between our method and some state-of-the-art surface defect segmentation methods.

4.1. Experimental Setup

The proposed DLMD method is qualitatively and quantitatively evaluated using two typical types of surface defect images from the NEU dataset,³⁷⁾ Patch and Scratch. All images have the same size of 200 × 200 pixels. For each surface defect image, the pixel-level ground truth is manually marked, using white to denote defective pixels and black to denote defect-free pixels; samples are shown in Fig. 8(b). The proposed DLMD method is compared quantitatively and qualitatively with five representative surface defect segmentation methods: RPCA,¹⁸⁾ SSD,²¹⁾ PG-LSR,²²⁾ W-LRR²³⁾ and ESP.²⁴⁾

Fig. 8.

Segmentation results of the proposed DLMD method: (a) input image; (b) manual-labeled ground-truth image; (c) original defect foreground image; (d) enhanced defect foreground image; (e) binarization of (d) by Otsu’s method.

The qualitative evaluation assesses the segmentation performance based on human visual judgment, for example, whether the boundary of the surface defect is clear and whether the contrast between defect and background is obvious. The quantitative evaluation uses the precision-recall (P-R) curve, the receiver operating characteristic (ROC) curve, the average F-measure (F_β) curve, the area under the ROC curve (AUC) and the mean absolute error (MAE). A pixel belonging to a defect is defined as a positive sample, and a pixel belonging to the background is defined as a negative sample. The symbols TP (True Positive), TN (True Negative), FP (False Positive), and FN (False Negative) correspond to the number of defect pixels correctly identified as defect, the number of background pixels correctly identified as background, the number of background pixels mistakenly identified as defect, and the number of defect pixels mistakenly identified as background, respectively. Then, Precision, Recall, TPR (True Positive Rate), FPR (False Positive Rate), F₁, and MAE are computed as follows:

$$\text{Precision}=\frac{TP}{TP+FP},\qquad \text{Recall}=\frac{TP}{TP+FN},\qquad \text{TPR}=\frac{TP}{TP+FN},\qquad \text{FPR}=\frac{FP}{FP+TN},$$

$$F_1=\frac{2}{N}\sum_{i=1}^{N}\frac{\text{Precision}_i\times\text{Recall}_i}{\text{Precision}_i+\text{Recall}_i},\qquad \text{MAE}=\frac{\sum_{i=1}^{H}\sum_{j=1}^{W}\lvert BW(i,j)-G(i,j)\rvert}{H\times W},$$

where N is the number of surface defect images, H and W denote the image height and width, and the subscript i in F₁ indexes the images. Precision is the fraction of pixels labelled as defect that are truly defective, while Recall is the ratio of correctly assigned defect pixels to all true defect pixels. The F₁ indicator jointly considers Precision and Recall, and a larger F₁ indicates higher performance. MAE measures the dissimilarity between the segmented image BW and the corresponding ground truth G.
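
For a single binary segmentation, the quantities above can be computed as in the following sketch. The function names and the tiny 2 × 3 masks are illustrative assumptions; the dataset-level F₁ in the text would additionally average the per-image values over all N images.

```python
import numpy as np

def confusion_counts(pred, gt):
    """Return (TP, TN, FP, FN) for two binary masks of equal shape."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = int(np.sum(pred & gt))
    tn = int(np.sum(~pred & ~gt))
    fp = int(np.sum(pred & ~gt))
    fn = int(np.sum(~pred & gt))
    return tp, tn, fp, fn

def segmentation_metrics(pred, gt):
    """Per-image Precision, Recall (= TPR), FPR, F1 and MAE."""
    tp, tn, fp, fn = confusion_counts(pred, gt)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    mae = float(np.mean(np.abs(pred.astype(float) - gt.astype(float))))
    return {"precision": precision, "recall": recall,
            "fpr": fpr, "f1": f1, "mae": mae}

# 1 marks a defect pixel, 0 marks background.
pred = np.array([[1, 1, 0],
                 [0, 0, 0]])
gt = np.array([[1, 0, 1],
               [0, 0, 0]])
metrics = segmentation_metrics(pred, gt)
```

In this example one defect pixel is found, one is missed and one background pixel is falsely marked, so Precision and Recall are both 0.5 and two of the six pixels disagree with the ground truth.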

4.2. Experimental Results Analysis

4.2.1. Analysis of Parameters Settings

To examine how different values of α and β affect the segmentation performance of the proposed method, we fix one parameter and vary the other. The results, measured by the AUC metric, are shown in Table 2. They show that the proposed method achieves better segmentation performance when α and β are set properly. When α is small, the performance is very sensitive to changes in β, whereas when α is large, the performance is insensitive to β. In particular, it is better to set α much larger than β in order to penalize the feature matrix of the defect-free background image to be sparse. The segmentation performance of the proposed approach reaches its highest value in Table 2 (AUC = 0.845304) when α = 1.25 and β = 0.25.

Table 2. Experimental results with different parameters α and β.

α \ β	0.05	0.15	0.25	0.35	0.45	0.55
0.25	0.809765	0.620011	0.617729	0.617857	0.617848	0.617807
0.5	0.828737	0.787541	0.716342	0.688683	0.683770	0.681919
0.75	0.820643	0.833686	0.804402	0.750935	0.713586	0.702458
1	0.817401	0.834645	0.842852	0.818865	0.778656	0.740081
1.25	0.813881	0.834150	0.845304	0.832574	0.821758	0.799830
1.5	0.810798	0.833805	0.834686	0.835088	0.833225	0.826053
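
The two observations drawn from Table 2 can be checked directly on its values. The sketch below copies the AUC entries from the table, with rows indexed by α and columns by β; the variable names are illustrative.

```python
import numpy as np

alphas = [0.25, 0.5, 0.75, 1.0, 1.25, 1.5]      # table rows
betas = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55]    # table columns
auc = np.array([
    [0.809765, 0.620011, 0.617729, 0.617857, 0.617848, 0.617807],
    [0.828737, 0.787541, 0.716342, 0.688683, 0.683770, 0.681919],
    [0.820643, 0.833686, 0.804402, 0.750935, 0.713586, 0.702458],
    [0.817401, 0.834645, 0.842852, 0.818865, 0.778656, 0.740081],
    [0.813881, 0.834150, 0.845304, 0.832574, 0.821758, 0.799830],
    [0.810798, 0.833805, 0.834686, 0.835088, 0.833225, 0.826053],
])

# Best (alpha, beta) pair by AUC.
i, j = np.unravel_index(np.argmax(auc), auc.shape)
best_alpha, best_beta, best_auc = alphas[i], betas[j], auc[i, j]

# Sensitivity to beta for each alpha: range of AUC along each row.
spread = auc.max(axis=1) - auc.min(axis=1)
```

The row-wise spread shrinks from about 0.19 at α = 0.25 to about 0.02 at α = 1.5, which matches the statement that the method becomes less sensitive to β as α grows.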

4.2.2. Analysis of Convergence

We evaluate the convergence of Algorithm 1 empirically by tracking the relative error $\|D-L-S\|_{F}/\|D\|_{F}$ over the iterations. In Fig. 6, the x-axis denotes the iteration number and the y-axis denotes the error. The error decreases very quickly and usually converges within 20 iterations. The convergence curve indicates that the proposed DLMD method converges to a locally optimal solution when solved with the ALM.

Fig. 6.

Convergence Curve of the proposed DLMD method.

4.2.3. Analysis of Computational Complexity

The main computational costs of the proposed method are the singular value decomposition and the matrix addition operations. The complexity of the SVD is O(dK²) and the complexity of matrix addition is O(dK), where d × K is the size of the data matrix. Thus, the total complexity of the proposed method is O(t(dK² + dK)), where t is the number of iterations. Under the low-rank constraint r ≪ K, the computational complexity is reduced to O(t(drK + dK)), which implies that the proposed method theoretically has linear complexity with respect to d and K.

4.2.4. Analysis of Robustness to Noise

The noise conditions in real factories are complex and random, so robustness to noise is vital when evaluating a defect segmentation method. To evaluate the robustness of the proposed method against noise, we conducted further experiments on synthesized defect images and original defect images with different noise levels.

Four synthesized defect images and the corresponding ground truth are shown in Fig. 7. These images are corrupted with multiple circles or squares of different sizes, and additive Gaussian noise with different SNR is introduced to the entire image. As shown in Fig. 7, the binarized defect images obtained by the proposed method are clear and accurate, whereas the segmentation results obtained by applying Otsu's method to the original images contain much noise and cannot separate the defects.

Fig. 7.

Experimental results for synthesized defect images with different noise levels: (a) 24 dB; (b) 20 dB; (c) 16 dB; (d) 12 dB. In each of the four sub-graphs, the columns from left to right show the corrupted images, the corresponding ground truth, the decomposed defect foreground images, the binarization of the defect foreground images by Otsu's method, and the binarization of the original images by Otsu's method, respectively.

In addition, we artificially corrupt the original surface defect images with additive Gaussian noise at different SNR values, namely 24 dB, 20 dB, 16 dB and 12 dB. As shown in Table 3, the AUC and MAE metrics degrade only gradually as the SNR decreases; even at SNR = 16 dB, the AUC is still 0.7759 and the MAE is 0.1939.

Table 3. Experimental results with different noise level.

Index \ SNR	No noise	24 dB	20 dB	16 dB	12 dB
AUC	0.8453	0.8298	0.8123	0.7759	0.7183
MAE	0.1593	0.1610	0.1731	0.1939	0.2220
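
Corrupting an image with additive Gaussian noise at a prescribed SNR can be sketched as follows. The paper does not state how the SNR is defined, so the sketch assumes the common convention SNR = 10 log₁₀(mean signal power / noise variance); the function names and the random stand-in image are illustrative.

```python
import numpy as np

def add_gaussian_noise(image, snr_db, rng):
    """Add zero-mean Gaussian noise so that the image has the given SNR.

    Assumption: SNR (dB) = 10 * log10(mean(image^2) / noise variance).
    """
    image = image.astype(float)
    signal_power = np.mean(image ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=image.shape)
    return image + noise

def measured_snr_db(clean, noisy):
    """Empirical SNR of a noisy image with respect to its clean version."""
    noise = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

rng = np.random.default_rng(0)
clean = rng.random((200, 200))          # stand-in for a 200 x 200 image
results = {}
for snr in (24, 20, 16, 12):            # noise levels used in Table 3
    noisy = add_gaussian_noise(clean, snr, rng)
    results[snr] = measured_snr_db(clean, noisy)
```

Each 4 dB step multiplies the noise variance by about 2.5, so the 12 dB setting carries roughly 16 times the noise power of the 24 dB setting.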

4.2.5. Analysis of Segmentation Results

As illustrated in Fig. 8, different surface defect images contain various defects, and different defects may have different shapes, sizes, gray levels, textures and locations. The enhanced defect foreground images in Fig. 8(d) show that the proposed DLMD method achieves the goal of highlighting the foreground and suppressing the background, so it is simple to define a proper threshold to segment the defects accurately. Figure 8(e) shows that the segmented defect shapes are similar to the ground truth, and that the defect shapes and edges are well preserved.

4.3. Comparison with State-of-the-art Methods

4.3.1. Qualitative Comparison

The qualitative comparison results between the proposed DLMD method and the other five methods are shown in Fig. 9. Most of the methods can handle simple defect images with a relatively homogeneous background (e.g., columns 5 and 10), for which they generate high-quality segmentation results. However, for complex defect images that contain multiple objects (e.g., columns 6, 11 and 12) or have a complicated background (e.g., columns 3 and 4), the defect object is not uniformly highlighted, and parts of the background are falsely assigned to the defect.

Fig. 9.

Qualitative comparison results: (a) input image; (b) manual-labeled ground-truth image; (c) RPCA; (d) SSD; (e) PG-LSR; (f) W-LRR; (g) ESP; (h) proposed DLMD.

Even though these methods all use similar matrix decompositions to guide the defect segmentation, their final segmentation results differ in many cases. RPCA is able to locate the defective regions for the Scratch defect, but for the Patch defect it has difficulty separating the defect objects accurately. As shown in Fig. 9(c), when the illumination and gray levels vary, RPCA tends to generate much noise and obtains low precision. As shown in Fig. 9(d), SSD cannot precisely distinguish defective regions from the background. The potential reason for this low performance is that it requires strict preconditions. Compared with RPCA, PG-LSR is not very sensitive to varying illumination, although in several cases it cannot locate defective regions as precisely as RPCA. As shown in Fig. 9(e), its segmented defective regions are much larger than the real defective regions. The reason for this phenomenon is that PG-LSR is based on least squares regression, which makes it very sensitive to the size and pattern of defect objects. As shown in Fig. 9(f), W-LRR tends to generate inaccurate defect shapes. The potential reason is that W-LRR divides the images into patches of a standard size. Because of this image partitioning, it distinguishes defects from a local perspective, which is unreliable when the contrast between background and defect is not distinct, and it is sensitive to the patch size. The segmentation results obtained by ESP approximate the real defective regions, as shown in Fig. 9(g). However, many pixels that belong to the background are misjudged as defects. By contrast, the proposed DLMD method successfully separates the defect objects from the image background and locates the various defects precisely. It highlights the whole defect object more effectively, with well-defined boundaries.

These experimental results illustrate that the proposed DLMD method is effective for segmenting a variety of defects from defect images, even when the types and number of defects are unknown and the defects exhibit diverse shapes, scales, directions and locations. It adapts well to the complex and varying surface defects of steel sheet. In addition, the double low-rank constraint contributes to this performance and is therefore an important factor in obtaining more precise segmentation results than the other methods. Moreover, it is reasonable to conclude that treating the singular values of a matrix differently helps preserve the most important characteristics of the defects and the background.

4.3.2. Quantitative Comparison

The six methods are evaluated quantitatively: their P-R curves, ROC curves and F-measure curves are shown in Fig. 10, and their AUC and MAE values are listed in Table 4. Figure 10 shows that the proposed DLMD method significantly outperforms the other five methods. In particular, its precision remains above 90% over a large threshold range, which reflects better segmentation performance. Table 4 summarizes the quantitative results of the six methods, with the best results marked in bold. It shows that the proposed DLMD method also performs better than the other five methods. Most AUC values are higher than 70%, and the proposed DLMD method achieves 84.53%, an improvement of 9.53 percentage points over the 75.00% achieved by ESP. The MAE of the proposed DLMD method (0.1593) is the lowest among all the methods. Based on the above qualitative and quantitative analyses, we conclude that the proposed DLMD method consistently outperforms the compared state-of-the-art methods.

Fig. 10.

Quantitative comparison results with P-R curves, ROC curves and F-measure curves. (Online version in color.)

Table 4. Comparison of AUC and MAE of the proposed DLMD with other methods.

Method Index	RPCA [18]	SSD [21]	PG-LSR [22]	W-LRR [23]	ESP [24]	Ours (DLMD)
AUC	0.7636	0.7144	0.7133	0.6636	0.7500	0.8453
MAE	0.1860	0.2500	0.2010	0.2598	0.1937	0.1593
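
The two kinds of score reported above can be illustrated with a minimal sketch. It assumes the usual definitions from the segmentation literature, which this section does not spell out: MAE is the mean absolute difference between a defect map scaled to [0, 1] and the binary ground truth, and precision and recall are computed after binarising the map at a threshold. The function names and the toy arrays are hypothetical and are not taken from the paper.

```python
import numpy as np


def mae(defect_map, ground_truth):
    """Mean absolute error between a map in [0, 1] and a binary mask."""
    defect_map = np.asarray(defect_map, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)
    return float(np.mean(np.abs(defect_map - ground_truth)))


def precision_recall(defect_map, ground_truth, threshold):
    """Precision and recall after binarising the map at `threshold`."""
    pred = np.asarray(defect_map, dtype=float) >= threshold
    truth = np.asarray(ground_truth).astype(bool)
    tp = np.logical_and(pred, truth).sum()
    # Convention (assumed): an empty prediction or empty mask scores 1.0.
    precision = tp / pred.sum() if pred.sum() > 0 else 1.0
    recall = tp / truth.sum() if truth.sum() > 0 else 1.0
    return float(precision), float(recall)


# Toy 2x3 example; the values are illustrative only.
toy_map = np.array([[0.9, 0.2, 0.1],
                    [0.8, 0.6, 0.0]])
toy_gt = np.array([[1, 0, 0],
                   [1, 0, 0]])

toy_mae = mae(toy_map, toy_gt)
toy_precision, toy_recall = precision_recall(toy_map, toy_gt, threshold=0.5)
```

Sweeping `threshold` over [0, 1] and collecting the (precision, recall) pairs would trace a P-R curve of the kind shown in Fig. 10.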

5. Conclusion

Based on the anomaly characteristics of defects in surface defect images of steel sheet, we formulate defect segmentation as a matrix decomposition problem. We design a double low-rank decomposition model that yields a high-quality defect foreground image directly, which provides a robust way to segment surface defects. Experimental results show that the proposed DLMD method generally performs better than some state-of-the-art defect segmentation methods in terms of AUC and MAE, which indicates that it is effective and competitive for the surface defect segmentation task. It also offers an interesting perspective for surface defect detection and segmentation of other industrial products, such as AMOLED screens, wood, ceramic tile and leather.

Acknowledgements

This research was funded by the National Natural Science Foundation of China under grant number 51805386; the open fund of the Key Laboratory of Image Processing and Intelligent Control (Huazhong University of Science and Technology), Ministry of Education, under grant number IPIC2019-03; and the open fund of the Hubei Key Laboratory of Mechanical Transmission and Manufacturing Engineering at Wuhan University of Science and Technology under grant number 2018A04.

References

1) K. Hanbay, M. F. Talu and Ö. F. Özgüven: Optik, 127 (2016), 11960. https://doi.org/10.1016/j.ijleo.2016.09.110
2) J. P. Yun, D. Kim, K. H. Kim, S. J. Lee, C. H. Park and S. W. Kim: Opt. Eng., 56 (2017), 053108. https://doi.org/10.1117/1.OE.56.5.053108
3) T. Ehret, A. Davy, J. M. Morel and M. Delbracio: J. Math. Imaging Vis., 61 (2019), 710. https://doi.org/10.1007/s10851-019-00885-0
4) Q. W. Luo, X. X. Fang, L. Liu, C. H. Yang and Y. C. Sun: IEEE Trans. Instrum. Meas., 69 (2020), 626. https://doi.org/10.1109/TIM.2019.2963555
5) T. Czimmermann, G. Ciuti, M. Milazzo, M. Chiurazzi, S. Roccella, C. M. Oddo and P. Dario: Sensors, 20 (2020), 1459. https://doi.org/10.3390/s20051459
6) G. R. Song, K. C. Song and Y. H. Yan: Opt. Lasers Eng., 128 (2020), 106000. https://doi.org/10.1016/j.optlaseng.2019.106000
7) H. Y. Wang, J. W. Zhang, Y. Tian, H. Y. Chen, H. X. Sun and K. Liu: IEEE Trans. Ind. Inform., 15 (2019), 2798. https://doi.org/10.1109/TII.2018.2887145
8) T. Qu, L. Zou, Q. L. Zhang, X. Chen and C. E. Fan: J. Text. Inst., 107 (2016), 743. https://doi.org/10.1080/00405000.2015.1061760
9) D. C. Choi, Y. J. Jeon, S. H. Kim, S. Moon, J. P. Yun and S. W. Kim: ISIJ Int., 57 (2017), 1045. https://doi.org/10.2355/isijinternational.ISIJINT-2016-160
10) L. Jia, C. Chen, J. Z. Liang and Z. J. Hou: Neurocomputing, 238 (2017), 84. https://doi.org/10.1016/j.neucom.2017.01.039
11) K. Xu, Y. Xu, P. Zhou and L. Wang: Opt. Lasers Eng., 105 (2018), 110. https://doi.org/10.1016/j.optlaseng.2018.01.010
12) Y. Y. Li, D. Zhang and D. J. Lee: Neurocomputing, 329 (2019), 329. https://doi.org/10.1016/j.neucom.2018.10.070
13) J. H. Liu, C. Y. Wang, H. Su, B. Du and D. C. Tao: IEEE Trans. Image Process., 29 (2019), 3388. https://doi.org/10.1109/TIP.2019.2959741
14) J. B. Zhang, H. Su, W. Zou, X. Y. Gong, Z. T. Zhang and F. Shen: Pattern Recognit., 109 (2021), 107571. https://doi.org/10.1016/j.patcog.2020.107571
15) H. Y. Chen, Q. D. Hu, B. S. Zhai, H. Chen and K. Liu: Neural Comput. Appl., 32 (2020), 11229. https://doi.org/10.1007/s00521-020-04819-5
16) Y. P. Gao, L. Gao, X. Y. Li and X. G. Yan: Robot. Comput. Integr. Manuf., 61 (2020), 101825. https://doi.org/10.1016/j.rcim.2019.101825
17) D. Tabernik, S. Šela, J. Skvarč and D. Skočaj: J. Intell. Manuf., 31 (2020), 759. https://doi.org/10.1007/s10845-019-01476-x
18) E. J. Candès, X. D. Li, Y. Ma and J. Wright: J. ACM, 58 (2011), 11. https://doi.org/10.1145/1970392.1970395
19) Y. G. Cen, R. Z. Zhao, L. H. Cen, L. H. Cui, Z. J. Miao and Z. Wei: Neurocomputing, 149 (2015), 1206. https://doi.org/10.1016/j.neucom.2014.09.007
20) C. L. Li, G. S. Gao, Z. F. Liu, D. Huang and J. T. Xi: IEEE Access, 7 (2019), 83962. https://doi.org/10.1109/ACCESS.2019.2925196
21) H. Yan, K. Paynabar and J. J. Shi: Technometrics, 59 (2017), 102. https://doi.org/10.1080/00401706.2015.1102764
22) J. J. Cao, J. Zhang, Z. J. Wen, N. N. Wang and X. P. Liu: Multimed. Tools Appl., 76 (2017), 4141. https://doi.org/10.1007/s11042-015-3041-3
23) Q. Z. Huangpeng, H. Zhang, X. R. Zeng and W. Huang: IEEE Access, 6 (2018), 37965. https://doi.org/10.1109/ACCESS.2018.2852663
24) J. Z. Wang, Q. Y. Li, J. R. Gan, H. M. Yu and X. Yang: IEEE Trans. Ind. Inform., 16 (2020), 141. https://doi.org/10.1109/TII.2019.2917522
25) W. B. Zou, Z. Liu, K. Kpalma, J. Ronsin, Y. Zhao and N. Komodakis: IEEE Trans. Image Process., 24 (2015), 3858. https://doi.org/10.1109/TIP.2015.2456497
26) H. W. Peng, B. Li, H. B. Ling, W. M. Hu, W. H. Xiong and S. J. Maybank: IEEE Trans. Pattern Anal. Mach. Intell., 39 (2017), 818. https://doi.org/10.1109/TPAMI.2016.2562626
27) T. Bouwmans, A. Sobral, S. Javed, S. K. Jung and E. H. Zahzah: Comput. Sci. Rev., 23 (2017), 1. https://doi.org/10.1016/j.cosrev.2016.11.001
28) F. H. Shang, J. Cheng, Y. Y. Liu, Z. Q. Luo and Z. C. Lin: IEEE Trans. Pattern Anal. Mach. Intell., 40 (2018), 2066. https://doi.org/10.1109/TPAMI.2017.2748590
29) T. Bouwmans, S. Javed, H. Y. Zhang, Z. C. Lin and R. Otazo: Proc. IEEE, 106 (2018), 1427. https://doi.org/10.1109/JPROC.2018.2853589
30) Z. Zhang, M. B. Zhao, F. Z. Li, L. Zhang and S. C. Yan: Neural Netw., 96 (2017), 55. https://doi.org/10.1016/j.neunet.2017.08.001
31) S. Q. Ma and N. S. Aybat: Proc. IEEE, 106 (2018), 1411. https://doi.org/10.1109/JPROC.2018.2846606
32) S. H. Gu, L. Zhang, W. M. Zuo and X. C. Feng: 2014 IEEE Conf. on Computer Vision and Pattern Recognition, IEEE, New York, (2014), 2862. https://doi.org/10.1109/CVPR.2014.366
33) R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua and S. Süsstrunk: IEEE Trans. Pattern Anal. Mach. Intell., 34 (2012), 2274. https://doi.org/10.1109/TPAMI.2012.120
34) X. H. Shen and Y. Wu: 2012 IEEE Conf. on Computer Vision and Pattern Recognition, IEEE, Piscataway, NJ, (2012), 853. https://doi.org/10.1109/CVPR.2012.6247758
35) D. Cai, X. F. He, J. W. Han and T. S. Huang: IEEE Trans. Pattern Anal. Mach. Intell., 33 (2011), 1548. https://doi.org/10.1109/TPAMI.2010.231
36) W. J. Zhu, S. Liang, Y. C. Wei and J. Sun: 2014 IEEE Conf. on Computer Vision and Pattern Recognition, IEEE, New York, (2014), 2814. https://doi.org/10.1109/CVPR.2014.360
37) Y. J. Zhao, Y. H. Yan and K. C. Song: Int. J. Adv. Manuf. Technol., 90 (2017), 1665. https://doi.org/10.1007/s00170-016-9489-0

Appendices

Appendix 1. Feature Extraction

First, unlike the color images used in most computer vision applications, surface defect images are captured in grayscale, so gray-level information can serve as a reliable feature for distinguishing defects from the background. Second, Gabor filters are used for texture analysis in both the spatial and frequency domains; they provide a useful and reasonably accurate description of most spatial aspects of simple receptive fields. We therefore generate 16 Gabor filters by rotating the filter through eight orientations (0, 45, 90, 135, 180, 225, 270 and 315 degrees) in the x-y plane at two different Gaussian scales, and compute the frequency and direction responses obtained by convolving the image with these filters. Third, steerable pyramid filters with four directions on two different scales are applied to the surface defect image.
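
A bank of 8 orientations by 2 scales can be sketched as follows. This is only an illustration under stated assumptions: the kernel size, the two Gaussian widths, the wavelength rule and the aspect ratio `gamma` are hypothetical choices that the appendix does not specify, and `gabor_kernel` is an illustrative name. A complex carrier is used so that orientations 180 degrees apart remain distinct (they are complex conjugates of each other).

```python
import numpy as np


def gabor_kernel(theta, sigma, wavelength, size=15, gamma=0.5):
    """Complex 2-D Gabor kernel at orientation `theta` (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate the coordinate frame by theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    # Gaussian envelope times a complex sinusoidal carrier.
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(1j * 2.0 * np.pi * x_r / wavelength)
    return envelope * carrier


# Eight orientations as listed in the text.
orientations = np.deg2rad([0, 45, 90, 135, 180, 225, 270, 315])
# Two Gaussian scales; these widths are assumed values.
scales = [2.0, 4.0]

# 2 scales x 8 orientations = 16 filters.
bank = [gabor_kernel(t, s, wavelength=2.0 * s)
        for s in scales for t in orientations]
```

Convolving the grayscale image with each kernel in `bank` would give one response map per filter, which can then be pooled over each sub-region to form part of its feature vector.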

Appendix 2. Laplacian Matrix M

Suppose that each sub-region is represented by a node, so that an undirected graph can be built from the surface defect image. The affinity matrix W is defined as follows:

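$$
w_{ij} =
\begin{cases}
\exp\left(-\dfrac{\|p_i - p_j\|_2^2}{2\sigma_p^2}\right)\exp\left(-\dfrac{\|\bar{f}_i - \bar{f}_j\|_2^2}{2\sigma_f^2}\right), & R_i \text{ and } R_j \text{ are directly adjacent} \\
0, & \text{otherwise}
\end{cases}
\tag{A2-1}
$$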

where $p_i \in \mathbb{R}^2$ and $p_j \in \mathbb{R}^2$ denote the central coordinates of $R_i$ and $R_j$; $\bar{f}_i \in \mathbb{R}^d$ and $\bar{f}_j \in \mathbb{R}^d$ denote the feature vectors of $R_i$ and $R_j$; $\exp\left(-\|p_i - p_j\|_2^2 / (2\sigma_p^2)\right)$ represents the spatial contiguity between $R_i$ and $R_j$; $\exp\left(-\|\bar{f}_i - \bar{f}_j\|_2^2 / (2\sigma_f^2)\right)$ gives the feature similarity between $R_i$ and $R_j$; and $\sigma_p$ and $\sigma_f$ are two scalars.

The Laplacian matrix M is defined as follows:

$$
M_{ij} =
\begin{cases}
-w_{ij}, & i \neq j \\
\sum_{j \neq i} w_{ij}, & \text{otherwise}
\end{cases}
\tag{A2-2}
$$
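
Equations (A2-1) and (A2-2) can be sketched in a few lines. The function name `affinity_and_laplacian` and the boolean `adjacency` input (which sub-regions touch each other) are assumptions made for illustration; how adjacency is derived from the sub-regions is not shown here.

```python
import numpy as np


def affinity_and_laplacian(centers, features, adjacency, sigma_p, sigma_f):
    """Build W as in Eq. (A2-1) and M as in Eq. (A2-2).

    centers:   (N, 2) central coordinates p_i of the sub-regions
    features:  (N, d) mean feature vectors of the sub-regions
    adjacency: (N, N) truthy where R_i and R_j are directly adjacent
    """
    centers = np.asarray(centers, dtype=float)
    features = np.asarray(features, dtype=float)
    adjacency = np.asarray(adjacency, dtype=bool)

    # Pairwise squared Euclidean distances in position and feature space.
    d_pos = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    d_feat = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)

    W = np.exp(-d_pos / (2.0 * sigma_p ** 2)) * np.exp(-d_feat / (2.0 * sigma_f ** 2))
    W = np.where(adjacency, W, 0.0)   # zero for non-adjacent pairs
    np.fill_diagonal(W, 0.0)          # no self-loops

    # Off-diagonal entries are -w_ij; diagonal entries are the row sums.
    M = np.diag(W.sum(axis=1)) - W
    return W, M


# Three sub-regions in a chain: 0 - 1 - 2 (toy data).
W, M = affinity_and_laplacian(
    centers=[[0, 0], [1, 0], [2, 0]],
    features=[[0.0], [0.0], [1.0]],
    adjacency=[[0, 1, 0], [1, 0, 1], [0, 1, 0]],
    sigma_p=1.0,
    sigma_f=1.0,
)
```

Each row of `M` sums to zero, as expected for a graph Laplacian, and `M` is symmetric because the graph is undirected.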

Appendix 3. Optimization of Eq. (6)

(1) Update H with the other variables fixed

To solve for H, the optimization problem of Eq. (6) reduces to:

$$
\min_{H} \left( \frac{1}{2}\left\| L - \frac{Y_2}{\mu} - H \right\|_F^2 + \frac{\beta}{\mu}\|H\|_{2,1} \right)
\tag{A3-1}
$$

Its optimal solution is obtained column by column through the following shrinkage rule:

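$$
H^{(k+1)}(:,j) =
\begin{cases}
\dfrac{\left\|Z^{(k)}(:,j)\right\|_2 - \dfrac{\beta}{\mu^{(k)}}}{\left\|Z^{(k)}(:,j)\right\|_2}\, Z^{(k)}(:,j), & \left\|Z^{(k)}(:,j)\right\|_2 > \dfrac{\beta}{\mu^{(k)}} \\
0, & \text{otherwise}
\end{cases}
\tag{A3-2}
$$

where $Z^{(k)} = L^{(k)} - \dfrac{Y_2^{(k)}}{\mu^{(k)}}$, and $Z(:,j)$ denotes the j-th column of matrix $Z$.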
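
The column-wise shrinkage of Eq. (A3-2) can be sketched as below. The function names `column_shrink` and `update_H` are illustrative, and the small floor on the column norms is an added numerical safeguard against division by zero rather than part of the formula.

```python
import numpy as np


def column_shrink(Z, tau):
    """Shrink each column of Z towards zero by tau in Euclidean norm."""
    norms = np.linalg.norm(Z, axis=0)
    # Factor (||z|| - tau) / ||z|| when ||z|| > tau, and 0 otherwise.
    scale = np.maximum(norms - tau, 0.0) / np.maximum(norms, 1e-12)
    return Z * scale


def update_H(L, Y2, beta, mu):
    """H-step: shrink the columns of Z = L - Y2 / mu with threshold beta / mu."""
    return column_shrink(L - Y2 / mu, beta / mu)


# Toy check: the first column has norm 5, the second has norm 0.1.
Z_toy = np.array([[3.0, 0.1],
                  [4.0, 0.0]])
H_toy = column_shrink(Z_toy, tau=1.0)
```

With `tau=1.0`, the first column is scaled by 4/5 and the second, whose norm is below the threshold, is set to zero. This is how the rule removes whole columns at once.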

(2) Update J with the other variables fixed

To solve for J, the optimization problem of Eq. (6) reduces to:

$$
\min_{J} \left( \frac{1}{2}\left\| J - S + \frac{Y_3}{\mu} \right\|_F^2 + \frac{\alpha}{\mu}\,\mathrm{tr}\left(J M J^{T}\right) \right)
\tag{A3-3}
$$

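Differentiating the objective with respect to J and setting the derivative to zero gives $J - S + \dfrac{Y_3}{\mu} + \dfrac{2\alpha}{\mu} J M = 0$, so the closed-form solution is

$$
J^{(k+1)} = \left( S^{(k)} - \frac{Y_3^{(k)}}{\mu^{(k)}} \right)\left( I + \frac{2\alpha}{\mu^{(k)}} M \right)^{-1}
\tag{A3-4}
$$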
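
A minimal sketch of Eq. (A3-4) follows. It solves the linear system instead of forming the inverse explicitly, which is a common numerical choice and not something the appendix prescribes. The name `update_J` and the toy data are hypothetical.

```python
import numpy as np


def update_J(S, Y3, M, alpha, mu):
    """J-step: J = (S - Y3 / mu) (I + (2 alpha / mu) M)^{-1}."""
    n = M.shape[0]
    A = np.eye(n) + (2.0 * alpha / mu) * M
    B = S - Y3 / mu
    # J A = B is equivalent to A^T J^T = B^T, so solve that and transpose.
    return np.linalg.solve(A.T, B.T).T


# Toy data: 4 feature rows, 3 sub-regions, Laplacian of a 3-node chain.
rng = np.random.default_rng(0)
S_toy = rng.standard_normal((4, 3))
Y3_toy = rng.standard_normal((4, 3))
M_toy = np.array([[1.0, -1.0, 0.0],
                  [-1.0, 2.0, -1.0],
                  [0.0, -1.0, 1.0]])
alpha_toy, mu_toy = 0.5, 2.0

J_toy = update_J(S_toy, Y3_toy, M_toy, alpha_toy, mu_toy)
# Residual of the stationarity condition; it should vanish.
residual = J_toy - S_toy + Y3_toy / mu_toy + (2.0 * alpha_toy / mu_toy) * J_toy @ M_toy
```

Because M is a graph Laplacian and therefore positive semidefinite, the matrix I + (2α/μ)M is positive definite for α, μ > 0, so the system always has a unique solution.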

(3) Update L with the other variables fixed

To solve for L, the optimization problem of Eq. (6) can be transformed into the following simplified form:

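$$
\min_{L} \left( \frac{1}{2}\left\| D - S + \frac{Y_1}{\mu} - L \right\|_F^2 + \frac{1}{2}\left\| H + \frac{Y_2}{\mu} - L \right\|_F^2 + \frac{1}{\mu}\|L\|_{*} \right)
\tag{A3-5}
$$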

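This can be rewritten as $\min_{L} \left( \dfrac{1}{2}\left\| \dfrac{1}{2}\left( D - S + H + \dfrac{Y_1 + Y_2}{\mu} \right) - L \right\|_F^2 + \dfrac{1}{4\mu}\|L\|_{*} \right)$, whose optimal solution is

$$
L^{(k+1)} = U\, \Psi_{\frac{w}{4\mu^{(k)}}}(\Sigma)\, V^{T}
\tag{A3-6}
$$

where $(U, \Sigma, V) = \mathrm{svd}\left[ \dfrac{1}{2}\left( D - S^{(k)} + H^{(k+1)} + \dfrac{Y_1^{(k)} + Y_2^{(k)}}{\mu^{(k)}} \right) \right]$; $\Psi_{\frac{w}{4\mu}}(\cdot)$ denotes the non-uniform singular value thresholding operator; $\{\sigma_i\}_{i=1,2,\ldots,r}$ are the singular values of $\dfrac{1}{2}\left( D - S^{(k)} + H^{(k+1)} + \dfrac{Y_1^{(k)} + Y_2^{(k)}}{\mu^{(k)}} \right)$; and $w_i = \dfrac{\sum_{j=1}^{r} \sigma_j}{\sigma_i}$.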
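
The L-step can be sketched as follows. Two points are assumptions of this sketch: the weights are read as $w_i = \left(\sum_j \sigma_j\right)/\sigma_i$, and the operator $\Psi$ is taken to soft-threshold the i-th singular value by $w_i/(4\mu)$, which is the usual form of weighted singular value thresholding. The function names are illustrative, and `eps` is an added guard for zero singular values.

```python
import numpy as np


def weighted_svt(C, mu, eps=1e-12):
    """Non-uniform singular value thresholding of C with thresholds w_i / (4 mu)."""
    U, sigma, Vt = np.linalg.svd(C, full_matrices=False)
    # Smaller singular values get larger weights, so they are shrunk harder.
    w = sigma.sum() / np.maximum(sigma, eps)
    shrunk = np.maximum(sigma - w / (4.0 * mu), 0.0)
    # Scale the columns of U by the shrunk singular values, then recombine.
    return (U * shrunk) @ Vt


def update_L(D, S, H, Y1, Y2, mu):
    """L-step of Eq. (A3-6): threshold the averaged target matrix."""
    C = 0.5 * (D - S + H + (Y1 + Y2) / mu)
    return weighted_svt(C, mu)


# Toy check with singular values 3 and 1 and mu = 1:
# weights are 4/3 and 4, so the thresholds are 1/3 and 1.
C_toy = np.diag([3.0, 1.0])
L_toy = weighted_svt(C_toy, mu=1.0)
```

In the toy case the large singular value is only reduced from 3 to 8/3 while the small one is removed entirely, which shows how the weighting preserves dominant components.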

(4) Update S with the other variables fixed

To solve for S, the optimization problem of Eq. (6) can be transformed into the following problem:

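$$
\min_{S} \left( \frac{1}{2}\left\| D - L + \frac{Y_1}{\mu} - S \right\|_F^2 + \frac{1}{2}\left\| J + \frac{Y_3}{\mu} - S \right\|_F^2 + \frac{1}{\mu}\|S\|_{*} \right)
\tag{A3-7}
$$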

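This can be rewritten as $\min_{S} \left( \dfrac{1}{2}\left\| \dfrac{1}{2}\left( D - L + J + \dfrac{Y_1 + Y_3}{\mu} \right) - S \right\|_F^2 + \dfrac{1}{4\mu}\|S\|_{*} \right)$, whose solution is

$$
S^{(k+1)} = U\, \Psi_{\frac{w}{4\mu^{(k)}}}(\Sigma)\, V^{T}
\tag{A3-8}
$$

where $(U, \Sigma, V) = \mathrm{svd}\left[ \dfrac{1}{2}\left( D - L^{(k+1)} + J^{(k+1)} + \dfrac{Y_1^{(k)} + Y_3^{(k)}}{\mu^{(k)}} \right) \right]$; $\{\sigma_i\}_{i=1,2,\ldots,r}$ are the singular values of $\dfrac{1}{2}\left( D - L^{(k+1)} + J^{(k+1)} + \dfrac{Y_1^{(k)} + Y_3^{(k)}}{\mu^{(k)}} \right)$; and $w_i = \dfrac{\sum_{j=1}^{r} \sigma_j}{\sigma_i}$.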

(5) Update Y₁, Y₂ and Y₃

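$$
\begin{aligned}
Y_1^{(k+1)} &= Y_1^{(k)} + \mu^{(k)}\left( D - L^{(k+1)} - S^{(k+1)} \right) \\
Y_2^{(k+1)} &= Y_2^{(k)} + \mu^{(k)}\left( H^{(k+1)} - L^{(k+1)} \right) \\
Y_3^{(k+1)} &= Y_3^{(k)} + \mu^{(k)}\left( J^{(k+1)} - S^{(k+1)} \right)
\end{aligned}
\tag{A3-9}
$$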

(6) Update μ

$$
\mu^{(k+1)} = \min\left( \rho\,\mu^{(k)},\ \mu_{\max} \right)
\tag{A3-10}
$$

where ρ > 1, so that μ grows at each iteration until it reaches μ_max.
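
Steps (5) and (6) can be sketched together. The default values `rho=1.1` and `mu_max=1e6` are assumptions chosen only for illustration. A growth factor above 1 is assumed because the cap at μ_max in Eq. (A3-10) only takes effect when μ increases. The function name is hypothetical.

```python
import numpy as np


def update_duals_and_penalty(D, L, S, H, J, Y1, Y2, Y3, mu,
                             rho=1.1, mu_max=1e6):
    """Multiplier updates of Eq. (A3-9) followed by the penalty update of Eq. (A3-10)."""
    # Each multiplier moves along the residual of its own constraint.
    Y1 = Y1 + mu * (D - L - S)   # constraint D = L + S
    Y2 = Y2 + mu * (H - L)       # constraint H = L
    Y3 = Y3 + mu * (J - S)       # constraint J = S
    # Grow the penalty, capped at mu_max (rho and mu_max are assumed values).
    mu = min(rho * mu, mu_max)
    return Y1, Y2, Y3, mu


# Toy check: if every constraint already holds, the multipliers do not move.
D_toy = np.ones((2, 2))
L_toy2 = 0.25 * D_toy
S_toy2 = 0.75 * D_toy
zeros = np.zeros((2, 2))
Y1_new, Y2_new, Y3_new, mu_new = update_duals_and_penalty(
    D_toy, L_toy2, S_toy2, L_toy2, S_toy2, zeros, zeros, zeros, mu=1.0)
```

In an iterative solver this function would be called once per iteration after the H, J, L and S updates, and the norms of the three residuals would typically serve as a stopping criterion.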
