2024 Volume 64 Issue 1 Pages 142-153
In material design, establishing the process–structure–property relationship is crucial for analyzing and controlling material microstructures. A central problem in this task is the analysis, characterization, and control of microstructures, since microstructures are highly sensitive to material processing and critically affect material properties. Accurately estimating the morphology of material microstructures therefore plays a significant role in understanding the process–structure–property relationship. In this paper, we propose a deep-learning framework for estimating material microstructures under specific process conditions. The framework utilizes two deep learning networks: a vector quantized variational autoencoder (VQVAE) and a pixel convolutional neural network (PixelCNN). The framework can predict material microstructures from the transformation behavior given by physical models; in this sense, it is consistent with the physical knowledge accumulated in the field of material science. Importantly, our study provides qualitative and quantitative evidence that incorporating physical models enhances the accuracy of microstructure prediction by deep learning models. These results highlight the importance of appropriately integrating field-specific knowledge when applying data-driven frameworks to materials design, and they provide a basis for integrating data-driven methods with the knowledge accumulated in the field. This integration holds great potential for advancing material design through deep learning.
Material design aims to identify material microstructures with desired properties or process conditions, such as chemical composition and/or heat treatment conditions. Since process conditions strongly influence the morphology of material microstructures, and material properties critically depend on those microstructures, the morphological estimation of material microstructures is extremely important for analyzing the process–structure–property relationship in material design. Thus, frameworks for estimating material microstructures have been actively discussed. The phase field method is a typical methodology for estimating microstructure morphology. Starting from its application to the calculation of dendrite growth, the phase field method has been extended to the calculation of microstructure formation in multi-phase and polycrystalline materials.1,2,3) With the recent development of the computational phase diagram approach called calculation of phase diagrams (CALPHAD),4) more practical prediction of microstructure formation in alloy systems has been attempted.5,6,7,8,9) For example, Loginova et al. analyzed the formation process of Widmanstätten ferrite by the phase field method and succeeded in reproducing its characteristic acicular structure.10,11) However, as Bhadeshia pointed out,12) there is no physical justification for the settings of the interface width and the interface free energy in the phase field method. Therefore, microstructure prediction by the phase field method is limited to qualitative reproduction of microstructural topology, making it difficult to quantitatively take into account the effects of alloying elements on microstructure formation. On the other hand, spatially averaged descriptions of microstructure formation based on the kinetics of phase transformation, such as the Johnson-Mehl-Avrami-Kolmogorov (JMAK) equation,13,14,15) are widely discussed for predicting the overall behavior of the phase transformation.
Despite its simplicity, there have been attempts to estimate the spatially averaged behavior of the phase transformation while quantitatively taking into account the effects of alloying elements, and agreement with experiments has been reported.16) In summary, while conventional metallurgical theory can estimate the global behavior of phase transformation on the basis of transformation kinetics, quantitative estimation of the microstructure that accounts for the effects of alloying elements on the spatial distribution of phases remains difficult.
In addition to the model-driven approaches described above, data-driven methods, such as deep learning models, have recently attracted attention in the field of computational material engineering.17,18,19,20,21,22) For example, applications to the characterization of material microstructures with complex morphologies and to the establishment of the process–structure–property relationship have been actively discussed. DeCost et al. applied a convolutional neural network (CNN) to extract geometrical features of ultra-high carbon steel and attempted to establish the relationship between material microstructures and process conditions from the extracted features.23) There are also attempts to estimate material microstructures using deep learning methods.24,25,26) In particular, the generative adversarial network (GAN)27,28,29) and the variational autoencoder (VAE)30) can be listed as deep learning models that are expected to be applied to microstructure prediction. For example, Iyer et al. attempted to estimate material microstructures from process conditions by training a GAN with steel microstructures labeled by those conditions.29) Also, Cang et al. prepared artificial two-phase material data and attempted to estimate microstructures from a target Young’s modulus using a VAE.31) However, most of these approaches are still under development, and their applications are limited to relatively simple microstructures with an almost uniform distribution of two phases. Therefore, the estimation by deep learning of material microstructures with high spatial anisotropy, such as steel alloys, has not been sufficiently discussed.
One of the major difficulties in applying data-driven methods to material design is the need for large amounts of high-quality data, which are usually difficult and expensive to collect. In the field of computer science, where deep learning methods themselves are developed, standard datasets containing large amounts of data serve as the basis for discussion. For example, ImageNet,32) one of the standard image datasets, contains more than 14 million images. On the other hand, as described below, the dataset used in this study is composed of only 120 steel microstructures. Thus, there is a gap of several orders of magnitude in the amount of available data between computer science and material science. Therefore, discussion specific to material science is indispensable for achieving the expected performance using only data that can be practically obtained in the field.
Based on the above background, this study discusses how to improve the accuracy of deep learning by incorporating the accumulated knowledge in the field of material science, as an attempt to compensate for the lack of data. As a first step toward establishing a framework combining deep learning and physical knowledge, we aim to develop a deep learning model that can be coupled with a physical model. When considering the actual coupling with a physical model, the influence of the accuracy or the range of applicability of the physical model on the performance cannot be ignored. Thus, in this paper, we assume that a physical model that adequately explains the experiment is available, and we discuss the coupling with the physical model using experimental data.
This section describes the deep learning model used in this paper. The framework is based on our previous works.33,34) It consists of two deep learning networks: the vector quantized variational autoencoder (VQVAE)35) and the pixel convolutional neural network (PixelCNN).36,37) VQVAE is used to extract features describing the input material microstructures as well as their spatial arrangement, and PixelCNN is used to obtain correlations between the extracted features and target conditions such as process parameters and/or material properties. Figure 1 shows a schematic diagram of the framework composed of VQVAE and PixelCNN. The details of VQVAE and PixelCNN are described below.
This section describes the feature extraction of material microstructures using VQVAE. VQVAE was developed based on VAE.30) Therefore, we start with an explanation of the basic mechanism of VAE. Figure 2 shows the basic structure of VAE, which consists of a probabilistic encoder Q(z|x) and a probabilistic decoder P(x|z), both modeled by CNNs. The encoder converts an input image into a Gaussian distribution in a space of lower dimension than the input data, called the latent space. The decoder reconstructs the input image using a latent variable vector sampled from this Gaussian distribution. In other words, VAE must reconstruct the input image from a latent variable vector that contains noise. As a result, VAE can represent the input data as a distribution rather than a fixed point in the latent space.
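The noisy sampling step described above is commonly implemented with the reparameterization trick, z = μ + σ·ε with ε drawn from a standard normal. The following is a minimal NumPy sketch under that assumption; the function and variable names are ours, not from the paper:

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).

    mu, log_var: encoder outputs for one image (shape (D,)).
    The decoder then reconstructs the image from this noisy z.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu = np.zeros(8)
log_var = np.full(8, -20.0)  # near-zero variance -> z stays close to mu
z = sample_latent(mu, log_var, rng)
```

With a larger `log_var`, the sampled z scatters around μ, which is what forces the decoder to tolerate noise in the latent vector.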
The loss function used in VAE is defined as the sum of the reconstruction loss and the Kullback-Leibler divergence (KL divergence):30)
$$ L_{\mathrm{VAE}} = \| x - \hat{x} \|^{2} + D_{\mathrm{KL}}\left( Q(z|x) \,\|\, P(z) \right) \tag{1} $$

where the first term is the mean squared loss between the input data x and the reconstructed data x̂, and the second term is the KL divergence between the encoder distribution Q(z|x) and the prior P(z):

$$ D_{\mathrm{KL}}\left( Q(z|x) \,\|\, P(z) \right) = \int Q(z|x) \log \frac{Q(z|x)}{P(z)} \, dz \tag{2} $$
In short, the KL divergence is a measure of the dissimilarity of probability distributions: the more similar the two distributions are, the smaller its value, and by definition it is zero when the two distributions are identical. Because P(z) is assumed to be a standard Gaussian distribution in VAE, the KL divergence defined in Eq. (2) acts as a regularization term: the further Q(z|x) moves away from the standard Gaussian distribution, the larger the KL divergence becomes, increasing the loss function. This term thus prevents the encoder Q(z|x) from becoming too complex, in the sense of straying far from the standard Gaussian distribution.
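For a diagonal-Gaussian encoder and a standard-Gaussian prior, the KL term in Eq. (2) has a well-known closed form, 0.5·Σ(μ² + σ² − log σ² − 1). A minimal NumPy sketch (our own naming, not the paper's code):

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL divergence D_KL(N(mu, sigma^2) || N(0, I)) for a
    diagonal-Gaussian encoder, summed over the latent dimensions:
        0.5 * sum(mu^2 + sigma^2 - log(sigma^2) - 1)
    """
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - log_var - 1.0)

kl_zero = kl_to_standard_normal(np.zeros(4), np.zeros(4))  # identical -> 0
kl_off = kl_to_standard_normal(np.ones(4), np.zeros(4))    # drifted -> larger
```

The two evaluations illustrate the regularizing behavior described in the text: the KL term vanishes when Q(z|x) equals the standard Gaussian and grows as the encoder drifts away from it.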
As mentioned above, VQVAE was developed based on VAE. As shown in Fig. 1(a), VQVAE is also composed of a convolutional encoder and a convolutional decoder, just like VAE. However, while VAE assumes a continuous latent vector following a Gaussian distribution, VQVAE assumes discrete latent vectors. In VQVAE, K D-dimensional vectors are randomly sampled from the D-dimensional latent space. This set of K D-dimensional vectors is called the codebook (e∈ℝD×K), and it is also optimized during training. In VQVAE, the latent variable z in VAE is replaced with two variables, ze∈ℝM×N×D and zq∈ℝM×N×D. ze is the output of the convolutional encoder and can be interpreted as a set of M×N D-dimensional vectors. zq is the set of vectors defined by the following equation:
$$ z_{q}^{(m,n)} = e_{k}, \qquad k = \mathop{\mathrm{argmin}}_{j} \left\| z_{e}^{(m,n)} - e_{j} \right\|_{2} \tag{3} $$
where ek is the D-dimensional latent vector included in the codebook. Thus, zq is a set of D-dimensional vectors obtained by replacing each D-dimensional vector in ze with the nearest D-dimensional vector in the codebook. The M×N array of selected codebook indices k is called the index list. The vector replacement defined by Eq. (3) is called vector quantization (VQ) and is the characteristic operation of VQVAE. The decoder of VQVAE reconstructs the input image from the set of replaced vectors zq specified by the index list. From a material point of view, these discrete latent vectors can be understood as corresponding to geometric features such as the several phases and phase boundaries included in the input microstructure. For example, in a steel microstructure, these latent vectors may correspond to the morphology of ferrite or martensite phases or their grain boundaries. An important point for understanding material microstructures is that the several phases included in a microstructure have completely different geometrical features, as a result of being produced by qualitatively different transformation kinetics: a steel microstructure is a patchwork of phases with completely different geometrical features. From this point of view, the array of feature vectors (index list) extracted by VQVAE can be interpreted as a spatial arrangement of the small-scale structures constructing the input microstructure.
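The vector quantization of Eq. (3) amounts to a nearest-neighbor lookup in the codebook. The following NumPy sketch illustrates it with toy sizes (names and dimensions are ours, for illustration only):

```python
import numpy as np

def vector_quantize(z_e, codebook):
    """Replace each D-dimensional vector in z_e (shape (M, N, D)) with its
    nearest neighbor in the codebook (shape (K, D)), as in Eq. (3).
    Returns the quantized tensor z_q and the (M, N) index list."""
    M, N, D = z_e.shape
    flat = z_e.reshape(-1, D)                       # (M*N, D)
    # Squared Euclidean distance to every codebook entry: (M*N, K)
    dists = ((flat[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    indices = dists.argmin(axis=1)                  # nearest entry per vector
    z_q = codebook[indices].reshape(M, N, D)
    return z_q, indices.reshape(M, N)

codebook = np.array([[0.0, 0.0], [1.0, 1.0]])       # K=2 entries, D=2
z_e = np.array([[[0.1, -0.1], [0.9, 1.2]]])         # M=1, N=2, D=2
z_q, idx = vector_quantize(z_e, codebook)
```

Here each encoder vector snaps to its nearest codebook entry, and `idx` is exactly the index list that PixelCNN later models.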
The loss function of VQVAE is composed of three terms (reconstruction loss, codebook loss, and commitment loss), given as the first, second, and third terms in Eq. (4), respectively:35)
$$ L_{\mathrm{VQVAE}} = \| x - \hat{x} \|^{2} + \left\| \phi_{sg}(z_{e}) - z_{q} \right\|_{2}^{2} + \beta \left\| z_{e} - \phi_{sg}(z_{q}) \right\|_{2}^{2} \tag{4} $$
where ϕsg is the stop-gradient operator and β is a weight adjusting the influence of the commitment loss. The reconstruction loss is the same as in the usual VAE. The codebook loss moves the chosen codebook vectors in zq toward the corresponding D-dimensional vectors in the encoder output ze, while the commitment loss moves the vectors in ze toward the selected codebook vectors in zq in the D-dimensional space. ϕsg is introduced so that ze and zq approach each other alternately: in each term the gradient through one side is stopped, so that only the other side is updated.
In summary, VQVAE decomposes an input microstructure into a spatial arrangement of discrete feature vectors, just as metallurgists consider the input microstructure to be a set of several characteristic phases, such as ferrite and martensite phases. In this sense, the architecture of VQVAE is consistent with our interpretation of material microstructures. This is one of the important characteristics of the proposed framework.
2.2. Pixel Convolutional Neural Network (PixelCNN)

In this section, we explain PixelCNN, which is used to obtain a correlation between the features given by VQVAE and process conditions and/or material properties. Figure 1(b) shows a conceptual diagram of PixelCNN. PixelCNN is an autoregressive model that models the following conditional probability distribution over the pixels of an index list x:
$$ p(x \mid h) = \prod_{i=1}^{M \times N} p\left( x_{i} \mid x_{1}, \ldots, x_{i-1}, h \right) \tag{5} $$
where x is an input index list, xi is a pixel in the index list, and h is the desired condition vector. The conditional distribution written as Eq. (5) is modeled by a CNN that is connected to a softmax layer to estimate the probability of nc classes, depending on the problem. The ordering of the pixel dependencies is left to right and top to bottom. In other words, every pixel is dependent on all pixels above and to the left of it. The network of PixelCNN is modeled to realize this dependence of pixels. Note that h does not depend on the location of the pixel in the index list. This is equivalent to adding a condition-dependent bias at every layer of the network.
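The above-left dependency structure is usually enforced by masking the convolution kernels. The sketch below builds such masks, assuming the standard type-A/type-B convention of the PixelCNN literature (type A for the first layer, which also blocks the center pixel); this is an illustration, not code from the paper:

```python
import numpy as np

def pixelcnn_mask(kernel_size, mask_type):
    """Build the (k, k) binary mask applied to a PixelCNN convolution
    kernel. Type 'A' (first layer) blocks the center pixel itself; type
    'B' (later layers) allows it. Both block all pixels below the center
    row and those to the right on it, so each pixel only sees the pixels
    above and to its left."""
    k = kernel_size
    mask = np.ones((k, k))
    c = k // 2
    mask[c, c + (mask_type == 'B'):] = 0.0  # center row: right of center
    mask[c + 1:, :] = 0.0                   # all rows below the center
    return mask

mask_a = pixelcnn_mask(3, 'A')
mask_b = pixelcnn_mask(3, 'B')
```

Multiplying a 3×3 kernel by `mask_a` before the convolution guarantees that the prediction for a pixel never depends on that pixel or on any pixel below or to its right.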
As described above, VQVAE extracts an index list, i.e., a spatial arrangement of the feature vectors composing the input microstructure image. All images in the dataset can be converted to index lists using the trained VQVAE. Each pixel in an index list holds the index of a certain vector in the codebook, so there are nc=K choices, depending on the number of feature vectors prepared in the codebook. In other words, since the training of PixelCNN can be formulated as a K-class classification problem, the loss function is the cross-entropy loss defined in Eq. (6):
$$ L_{\mathrm{CE}} = - \sum_{i} \sum_{k=1}^{K} c_{i,k} \log p_{i,k} \tag{6} $$
where ci is the categorical (one-hot) distribution whose value is 1 for the element corresponding to the true input index and 0 otherwise, and pi is the probability distribution over the indices predicted by PixelCNN for the i-th pixel. In other words, PixelCNN is trained so that its inference matches the true input index list. As a result, PixelCNN can extract the spatial dependencies of the several phases constructing the input microstructure; that is, it captures the spatial order within a target microstructure as the conditional probability distribution over index lists defined in Eq. (5).
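The cross-entropy of Eq. (6) over one-hot targets can be sketched as follows (toy numbers and our own naming, for illustration):

```python
import numpy as np

def cross_entropy(c, p, eps=1e-12):
    """Eq. (6): -sum_i sum_k c_ik * log(p_ik), where c is the one-hot
    encoding of the true index list and p holds the K-class probabilities
    predicted for each pixel i."""
    return -np.sum(c * np.log(p + eps))

# Two pixels, K=3 codebook entries; the true indices are 0 and 2.
c = np.array([[1, 0, 0], [0, 0, 1]], dtype=float)
p = np.array([[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]])
loss = cross_entropy(c, p)
```

The loss only counts the predicted probability of the correct index at each pixel, so it shrinks toward zero as the network puts all its probability mass on the true index list.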
2.3. Steel Microstructure Prediction Based on the Proposed Methodology

In this section, we explain how to estimate steel microstructures from target process conditions and/or material properties using the trained VQVAE and PixelCNN. Figure 3 shows a schematic of the procedure for estimating steel microstructures from the transformation behavior using the framework of this study. In this study, a list of the temperatures at which specific transformation fractions are reached, called a transformation fraction curve, is given as the condition h. For microstructure prediction, the target transformation fraction curve h is given to the probability distribution defined by Eq. (5) in PixelCNN, and an index list is sampled from the distribution. Then, the sampled index list is replaced with the corresponding set of feature vectors using the trained codebook in VQVAE. As mentioned above, the trained codebook is expected to be a list of candidate vectors describing the geometrical features of the target microstructures; in the case of steel, the codebook is considered to correspond to geometrical features such as ferrite and martensite phases or their grain boundaries. By converting these vectors into a microstructure image using the trained decoder, we obtain the steel microstructure image corresponding to the given transformation behavior. An important point is that, since the probability distribution in Eq. (5) is explicitly defined by PixelCNN, the probabilistic correlation between the given transformation behavior and the corresponding steel microstructure can be obtained. Therefore, by repeating the sampling multiple times, it is possible to construct an ensemble of steel microstructures corresponding to a common transformation behavior, allowing a probabilistic estimation of the microstructure.
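The repeated-sampling step, drawing an index list pixel by pixel from the conditional distribution of Eq. (5), can be sketched as ancestral sampling. The trained, h-conditioned PixelCNN is replaced here by a hypothetical stub (`predict_probs`), since the actual network is not reproduced:

```python
import numpy as np

def sample_index_list(predict_probs, shape, rng):
    """Ancestral sampling of an (M, N) index list in raster order.
    `predict_probs(partial)` stands in for the trained, condition-h
    PixelCNN: it returns the K-class probabilities for the next pixel
    given all previously sampled pixels (a hypothetical stub here)."""
    M, N = shape
    idx = np.full((M, N), -1)
    for i in range(M):
        for j in range(N):
            p = predict_probs(idx)       # conditional distribution, Eq. (5)
            idx[i, j] = rng.choice(len(p), p=p)
    return idx

rng = np.random.default_rng(0)
# Toy stand-in model: a fixed distribution over K=3 codebook indices.
sampled = sample_index_list(lambda idx: np.array([0.2, 0.5, 0.3]), (4, 4), rng)
```

Calling this repeatedly with the same stub yields different index lists drawn from the same distribution, which is exactly how the ensemble of microstructures for one transformation behavior is built.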
Finally, an important difference between the proposed framework and conventional methods such as GAN and VAE should be mentioned. First, since a steel material is composed of a finite number of different phases produced by completely different transformation kinetics, the feature vectors that represent the microstructural images are also considered to be described by a finite number of sparse, discrete vectors. Furthermore, since process conditions govern the formation of each phase, the spatial arrangement of the feature vectors is determined by the process conditions. As mentioned above, the proposed framework prepares the codebook as a finite number of sparse, discrete feature vectors in VQVAE, and VQVAE decomposes an input microstructure image into a spatial arrangement of feature vectors (index list). Moreover, PixelCNN acquires the correlation between process conditions and the spatial arrangement of feature vectors defined by Eq. (5). As a result, the proposed framework takes into account the inherent sparsity of steel microstructures and explicitly describes the metallurgical dependences using neural networks. On the other hand, since GAN and VAE use feature vectors that follow a continuous distribution for image generation, the sparsity of steel microstructures is not considered; the dependence of material microstructures on process conditions is not explicitly defined but is expected to be implicitly extracted by the network from the given data. In summary, the proposed framework differs significantly from conventional methods in the sense that it explicitly incorporates metallurgical knowledge, such as the intrinsic sparsity of steel microstructures, and metallurgical causality, such as the spatial arrangement of microstructures determined by process conditions, into the network architecture.
Steel microstructures with different heat treatment conditions and chemical compositions were prepared for training the deep learning framework. To obtain the microstructure images, Formaster tests were performed to simulate the heat-affected zone of welding. The heating rate, the maximum heating temperature, and the holding time were set to 50°C/s, 1400°C, and 5 seconds, respectively. The specimens were then cooled to 1000°C at 50°C/s and subsequently to room temperature at one of four cooling rates: 1.0, 3.0, 10, or 30°C/s. The obtained microstructures were observed using an optical microscope. Transformation fractions were calculated for each microstructure from the dilatational data recorded during cooling. The Formaster tests yielded a total of 120 images of steel microstructures, one for each combination of the four cooling rates (1.0, 3.0, 10, and 30°C/s), the 15 compositions shown in Table 1, and the two magnifications (X100 and X400). Figure 4 shows examples of microstructure images for each chemical composition at a cooling rate of 1.0°C/s and X100 magnification. Figure 5 shows the relationship between temperature and transformation fraction obtained experimentally for each steel and cooling rate; the temperature at which the transformation fraction increases by 2% is plotted for each condition. The number of data points comprising each transformation fraction curve is 49. This study discusses microstructure prediction from these transformation fraction curves. To visualize the relationship between microstructures and cooling rate, Fig. 6 shows the microstructures of steel 14 at X100 magnification for the four cooling rates.
Steel | C | Si | Mn | Cu | Ni | Cr | Mo | V | Ti | B | Alsol | N |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0.07 | – | 1.52 | – | – | – | – | – | – | – | 0.028 | 0.0032 |
2 | 0.15 | – | 0.79 | – | – | – | – | – | – | – | 0.030 | 0.0028 |
3 | 0.07 | – | 1.48 | – | – | – | – | 0.058 | – | – | 0.032 | 0.0024 |
4 | 0.07 | – | 1.47 | – | – | – | 0.60 | – | – | – | 0.032 | 0.0026 |
5 | 0.07 | – | 1.48 | – | – | 1.50 | – | – | – | – | 0.033 | 0.0029 |
6 | 0.07 | 0.40 | 1.47 | – | – | – | – | – | – | – | 0.025 | 0.0014 |
7 | 0.06 | 0.30 | 0.89 | 0.19 | 2.99 | 0.50 | 0.41 | 0.040 | – | 0.0010 | 0.063 | 0.0032 |
8 | 0.13 | 0.30 | 1.14 | 0.20 | 0.80 | 0.50 | 0.39 | 0.039 | – | 0.0013 | 0.061 | 0.0033 |
9 | 0.09 | 0.30 | 0.89 | 0.20 | 1.50 | 0.80 | 0.40 | 0.040 | – | 0.0011 | 0.062 | 0.0041 |
10 | 0.09 | 0.30 | 0.90 | 0.20 | 1.51 | 0.50 | 0.59 | 0.039 | – | 0.0011 | 0.061 | 0.0037 |
11 | 0.05 | – | 1.93 | – | – | – | – | – | – | – | 0.027 | 0.0053 |
12 | 0.03 | – | 1.42 | – | – | – | – | – | – | – | 0.032 | 0.0040 |
13 | 0.06 | – | 1.44 | – | – | – | – | – | 0.036 | – | 0.028 | 0.0044 |
14 | 0.13 | 0.30 | 1.24 | – | – | 0.21 | – | 0.041 | – | 0.0002 | 0.055 | 0.0028 |
15 | 0.07 | 0.29 | 1.31 | 0.20 | 0.59 | 0.21 | – | 0.040 | – | 0.0001 | 0.058 | 0.0029 |
This section describes the procedure for creating the training dataset from the original microstructure images described in section 3.1. Figure 7 shows a schematic of the procedure. To create the training dataset, 256×256 pixel square patches are cropped from the original microstructure images, allowing the patches to overlap one another. In addition, the square patches were rotated clockwise by 90, 180, and 270 degrees to expand the dataset. The input images were also randomly flipped vertically and horizontally before being input to the network to further expand the dataset.
In addition to the above data augmentation, pseudo-low magnification images (50x and 200x) were created by cropping corresponding regions from X100 and X400 images to extract macroscopic structures, which were also added to the training dataset. Figure 8(a) shows a comparison of the cropped areas, and Fig. 8(b) shows examples of training images created for each magnification. As a result of these data augmentations, a training dataset containing 88256 square images was prepared from 120 original microstructures.
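The cropping-and-rotation augmentation can be sketched as follows; the patch size and stride here are illustrative toy values, not the 256-pixel setting used in the paper, and the random flips mentioned in the text would be applied later, at input time:

```python
import numpy as np

def augment(image, patch=64, stride=32):
    """Crop overlapping square patches from a grayscale image and add the
    90/180/270-degree rotations of each, quadrupling the patch count."""
    H, W = image.shape
    patches = [image[i:i + patch, j:j + patch]
               for i in range(0, H - patch + 1, stride)
               for j in range(0, W - patch + 1, stride)]
    # Four rotations (k=0 keeps the original orientation).
    return [np.rot90(p, k) for p in patches for k in range(4)]

img = np.arange(128 * 128, dtype=float).reshape(128, 128)
dataset = augment(img)
```

For this 128×128 toy image, the stride of 32 yields a 3×3 grid of overlapping patches, and the rotations expand those 9 patches to 36 training samples.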
We discuss the problem of estimating steel microstructures from transformation behavior using the proposed framework. In this study, we define the microstructure prediction problem as estimating the microstructure for a given chemical composition and heat treatment conditions. Figure 9 shows a conceptual diagram of the procedure for estimating microstructures by combining a deep learning model and a physical model. In the figure, the JMAK equation is shown as an example of a physical model for microstructure prediction. In this diagram, the transformation fraction curve is first predicted from the chemical composition and the heat treatment conditions using the JMAK equation. Subsequently, microstructures are estimated from the transformation behavior given by the physical model, based on the correlation between transformation behavior and microstructure obtained by the proposed framework. As a result, a methodology for estimating the material microstructure from a given chemical composition and heat treatment conditions is constructed by coupling the deep learning model with the physical model. This study focuses primarily on the deep learning model, which constitutes one part of this methodology, and discusses the effect of coupling with the physical model on the performance of the deep learning model. When performing microstructure prediction coupled with a physical model, the influence of the accuracy or the range of applicability of the physical model cannot be ignored. Therefore, as a first step toward such coupling, we assume that a physical model that adequately explains the experiment is available, and we discuss the coupling based on the experimentally obtained transformation fraction curves.
Figure 10 summarizes the microstructure images predicted from the transformation fraction curves of the 15 steels for a cooling rate of 1.0°C/s and X100 magnification. Each panel contains four square images estimated from the same transformation behavior, and the numbers in each panel correspond to those in Fig. 4. The predicted microstructures have a topology similar to that of the original microstructure images shown in Fig. 4. Importantly, since PixelCNN explicitly models the spatial order of the microstructure as a probability distribution, it enables us to estimate steel microstructures probabilistically: various microstructures following the same probability distribution can be estimated even for identical conditions (transformation behavior). As a result, as shown in the respective panels of Fig. 10, the microstructures estimated for the same transformation behavior share similar topological characteristics but are not completely identical.
Figure 11 shows the trend of microstructures generated corresponding to different cooling rates for steel 14. As the cooling rate increases, the proportion of dark-looking microstructures such as martensite phase increases, and finer microstructures are formed. As a result, it can be seen that the proposed framework can capture the relationship between the given process parameter (the cooling rate) and the corresponding microstructures. The quantitative validation of the establishment of the process-structure linkage using the same framework is presented in detail in our previous paper.33) Figure 12 shows examples of the microstructures produced for several steels when the cooling rate is fixed at 1.0°C/s and the magnification is varied. The results show that a hierarchical structure is obtained as the magnification is varied.
As described above, the estimated microstructures have geometric features qualitatively similar to those of the original microstructures. On the other hand, relatively large structures within the field of view, such as prior austenite grain boundaries or the acicular structures seen in steel 2, are not clear in the estimated microstructures. Because the correlation within a square patch is factorized into a product of local interactions in PixelCNN, as defined in Eq. (5), the strength of the interaction decreases exponentially as the distance increases, and long-range interactions are difficult for PixelCNN to capture. Thus, it is difficult to reproduce global structures such as prior austenite grain boundaries and acicular structures using PixelCNN. An attempt to acquire global dependencies among pixels with PixelCNN has been discussed,38) and such methods may clarify these global structures. Also, considering that only one microstructure was available for each condition, the data is undeniably insufficient; the global structures might therefore also become clearer if the dataset were expanded, although this cannot be confirmed with the current small dataset.
Next, we consider the effect of the number of data points describing the transformation behavior on the microstructure prediction. Figures 13(a) and 13(b) show the transformation fraction curves for steel 5 as an example, represented by 9 and 49 points, respectively. Figures 13(c) and 13(d) show the microstructures estimated from each transformation fraction curve for a cooling rate of 1.0°C/s. The comparison between Figs. 13(a) and 13(b) suggests that information such as the local curvature of the transformation fraction curve may be lost in the 9-point representation. For example, in Fig. 13(b), the slope of the transformation fraction curve shows a retardation of transformation around a temperature of 650°C and a transformation fraction of 20% at a cooling rate of 1.0°C/s, whereas this retardation is not clear in Fig. 13(a). In addition, the comparison between Figs. 13(c) and 13(d) shows that the microstructures estimated using the 49-point data are clearer than those estimated using the 9-point data. These results indicate that the information on the transformation behavior is enriched as the number of points representing the transformation fraction curve increases, and the estimated microstructure accordingly becomes clearer.
As supplemental information, Fig. 14 shows visualizations of the latent space extracted by the proposed framework based on principal component analysis (PCA)39) and corresponding steel indices. In practice, after the output tensors of the trained encoder in VQVAE corresponding to the input images are flattened into vectors, the vectors are compressed into two-dimensional vectors by PCA. Then, we map the microstructure patches in two-dimensional space by the compressed vectors. Figure 14(a) illustrates the result of the visualization of 5000 sample patches randomly selected from the training dataset. This result demonstrates that each microstructure image is continuously distributed according to its morphology. Also, Fig. 14(b) shows a plot of the steel index of each microstructure. Although further discussion is required regarding the physical interpretation of these plots, a continuous distribution based on the morphology of the microstructures is observed in the latent space. This suggests that the distance in the latent space corresponds to the similarity of the microstructures.
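The PCA projection used for this visualization can be sketched as follows, assuming the flattened encoder outputs are given as a plain (n_samples, n_features) array; the SVD-based implementation and the names are ours, for illustration:

```python
import numpy as np

def pca_2d(features):
    """Project flattened encoder outputs (n_samples, n_features) onto
    their first two principal components via SVD of the centered data."""
    X = features - features.mean(axis=0)
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:2].T                  # (n_samples, 2) coordinates

rng = np.random.default_rng(0)
feats = rng.standard_normal((100, 32))   # stand-in for flattened z_e tensors
coords = pca_2d(feats)
```

Each microstructure patch then becomes a 2-D point, and distances between points can be read as similarities between microstructures, as discussed in the text.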
In this section, we discuss the effect of the combination of the deep learning framework and the physical model on the microstructure prediction. To examine the effect, we consider two models. One is a model that directly estimates microstructures from chemical composition and heat treatment conditions using only the proposed deep learning framework, and the other is a model that estimates the microstructure from the calculated transformation behavior from the chemical composition and heat treatment conditions via a physical model. However, in the latter case, the accuracy of the microstructure prediction is affected by the accuracy of the physical model. To eliminate this influence, we train our deep learning framework using the experimentally obtained transformation behavior shown in Fig. 5 and discuss the combination with the physical model indirectly. In other words, assuming that we have the physical model that represents the transformation behavior well, we hypothetically discuss the microstructure prediction based on the combination of the deep learning and the physical model.
Using the above two models, we consider the estimation of a steel microstructure corresponding to an unknown chemical composition and heat treatment conditions that are not included in the training dataset. Here, we remove the microstructure data for a certain composition from the training dataset, train the deep learning framework, and then estimate the removed microstructure using the trained model. For the former model, microstructures are estimated directly from the chemical composition and the cooling rate, while for the latter model, microstructures are estimated from the cooling rate and the transformation behavior assumed to be given by the physical model. Figure 15 summarizes the microstructures estimated for steel 5 by each model, where Fig. 15(a) shows the experimentally obtained microstructure image of steel 5, and Figs. 15(b) and 15(c) show the microstructures estimated from the transformation behavior and from the chemical composition, respectively. The cooling rate was set to 1.0°C/s. The estimation using the transformation behavior captures the geometrical characteristics of the microstructures, whereas the direct estimation from the chemical composition cannot reproduce the characteristic microstructures. This result suggests that the “translation” of the chemical composition and heat treatment conditions into the corresponding transformation behavior via the physical model improves the accuracy of microstructure prediction. In other words, it suggests that combining the deep learning model with the physical model can contribute to improving the accuracy of microstructure prediction by the deep learning model.
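The hold-out protocol described above can be sketched as follows; the record structure (a `"steel"` key tagging each patch) is a hypothetical illustration, not the actual data format used in the paper.

```python
def leave_one_out(dataset, held_out_steel):
    """Split the dataset so that all patches of one steel grade are
    held out for evaluation, as done here for steel 5."""
    train = [d for d in dataset if d["steel"] != held_out_steel]
    held = [d for d in dataset if d["steel"] == held_out_steel]
    return train, held

# Hypothetical records: each microstructure patch is tagged with its steel index.
data = [{"steel": s, "patch": None} for s in (1, 2, 5, 5, 3)]
train, held = leave_one_out(data, held_out_steel=5)
print(len(train), len(held))  # 3 2
```

The framework is then trained on `train` only, and the patches in `held` serve as ground truth for the unknown composition.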
To quantitatively validate the effectiveness of the combination with the physical model for microstructure prediction, we consider the problem of estimating the distribution of unknown microstructures in the latent space shown in Fig. 14. We trained the two deep learning models mentioned above on the training dataset excluding steel 5 and used each model to estimate the distribution of steel 5 in the latent space. We used the data for steel 5 as ground truth data to verify the estimation accuracy. Figure 16 summarizes the distribution of the ground truth data and the distributions estimated by the PixelCNN included in each deep learning model. Additionally, as a quantitative measure of estimation accuracy, we calculated the distance between the barycenters of the ground truth distribution and the estimated distribution. As shown in Fig. 14, since distance in the latent space corresponds to the similarity of the microstructures, the estimation accuracy of the microstructures can be discussed using this distance. The barycenter distance is also given in Fig. 16. From the comparison of the distributions shown in Fig. 16, although it is somewhat obscured by the dimensionality reduction, the distribution estimated with the physical model coupling is closer to the ground truth data. Together with the computed barycenter distance, these results indicate that coupling with the physical model contributes to improving the accuracy of estimation by the deep learning framework.
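The barycenter distance used as the quantitative measure can be sketched as below, a minimal example assuming the ground truth and estimated distributions are given as arrays of two-dimensional latent coordinates; the function name and the sample data are illustrative only.

```python
import numpy as np

def barycenter_distance(truth_2d: np.ndarray, est_2d: np.ndarray) -> float:
    """Euclidean distance between the barycenters (mean points) of the
    ground-truth and estimated distributions in the 2-D latent space."""
    return float(np.linalg.norm(truth_2d.mean(axis=0) - est_2d.mean(axis=0)))

# Hypothetical 2-D latent coordinates for illustration.
rng = np.random.default_rng(1)
truth = rng.normal(loc=(0.0, 0.0), size=(200, 2))
estimated = rng.normal(loc=(1.0, 0.5), size=(200, 2))
print(barycenter_distance(truth, estimated))
```

A smaller distance means the estimated distribution is centered closer to the ground truth, i.e., the estimated microstructures are more similar to the true ones.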
In this section, we compare the estimation of unknown microstructures using the proposed framework with that using a VAE, a typical conventional generative model. As in the estimation by the proposed framework, the transformation fraction curve was input into the VAE. In practice, the training microstructure is converted into features by the encoder of the VAE, and the decoder reconstructs the training microstructure from the extracted features and the transformation fraction curve. As a result, the VAE is trained to capture the correlation between the microstructures and the transformation behavior. Here, we estimated unknown microstructures from the transformation behavior using the decoder trained with the transformation fraction curve as a condition. Figure 17 shows the microstructures predicted by the proposed framework and the VAE. Figure 18 shows the comparison of the distributions of unknown microstructures estimated from the transformation behavior by each framework in the latent space. Note that Figs. 16(a) and 18(a) are identical. In the same way as above, the distance between the barycenters of the ground truth distribution and the estimated distribution was calculated and is also shown in Fig. 18. From these results, we can confirm that our framework is superior to the conventional VAE in terms of microstructure prediction. This provides further evidence that our framework is more suitable for material design than conventional methods.
Finally, we discuss future work to make our framework an attractive tool for microstructure prediction. As mentioned above, while our method can estimate the basic topology of microstructures, structures such as grain boundary ferrite and needle-shaped ferrite remain unclear. Thus, improving the accuracy of reproducing relatively global structures within the field of view is one of our future tasks. The network-like structure of prior austenite grain boundaries is also unclear in the structures estimated in this study. One possible strategy is to add the prior austenite grain size to the label h as another descriptor of the microstructures. In other words, by appropriately expanding the descriptors according to the structures of interest, we can expect to improve accuracy. In the proposed framework, the dimension of the index list determines how large a region of the training microstructure is mapped to a latent vector in the codebook. In this sense, the current setting discussed in this paper focuses on structures at a specific length scale. Therefore, we are considering hierarchically linking our framework to extract structures at multiple length scales and thereby capture global structures appropriately. In addition, although the coupling of the deep learning framework and the physical model was discussed using experimental data in this paper, it is important in the future to construct a comprehensive framework that estimates microstructures from the transformation fraction curve given by the physical model. Accordingly, developing a physical model that can predict the transformation fraction curve is also important.
The findings of this paper are summarized as follows:
• We proposed a deep learning framework composed of VQVAE and PixelCNN that predicts steel microstructure morphology in combination with physical models. Using the framework, we discussed the problem of estimating steel microstructures from the transformation behavior.
• It was shown that the steel microstructures given by the framework are similar to the experimentally obtained microstructures in terms of basic topology, such as the volume fraction and the average grain size.
• The effect of combining the framework with the physical model on estimation accuracy was examined, and the improvement in accuracy due to the combination was confirmed both qualitatively and quantitatively. The advantage of the proposed framework over the conventional VAE was also confirmed qualitatively and quantitatively.
From these results, it can be concluded that the combination of data science methods with the accumulated knowledge in the domain is essential for the effective application of the data-driven methodology to material science.
This work was supported by Council for Science, Technology and Innovation (CSTI), Cross-ministerial Strategic Innovation Promotion Program (SIP), “Materials Integration” for Revolutionary Design System of Structural Materials (Funding agency: JST). We would also like to express our sincere gratitude to Dr. Tadashi Kasuya, a project researcher at the University of Tokyo, and Mr. Masahiro Imoto of Kobe Steel, Ltd. for providing valuable data.