2024 Volume 64 Issue 1 Pages 142-153
In material design, establishing the process–structure–property relationship is crucial for analyzing and controlling material microstructures. A central problem in this task is the analysis, characterization, and control of microstructures, since microstructures are highly sensitive to material processing and critically affect material properties. Accurately estimating the morphology of material microstructures therefore plays a significant role in understanding the process–structure–property relationship. In this paper, we propose a deep-learning framework for estimating material microstructures under specific process conditions. The framework utilizes two deep learning networks: a vector quantized variational autoencoder (VQVAE) and a pixel convolutional neural network (PixelCNN). The framework can predict material microstructures from the transformation behavior given by physical models; in this sense, it is consistent with the physical knowledge accumulated in the field of material science. Importantly, our study provides qualitative and quantitative evidence that incorporating physical models enhances the accuracy of microstructure prediction by deep learning models. These results highlight the importance of appropriately integrating field-specific knowledge when applying data-driven frameworks to materials design, and they provide a basis for integrating data-driven methods with the knowledge accumulated in the field. This integration holds great potential for advancing material design through deep learning.
Material design aims to identify material microstructures with desired properties or process conditions, such as chemical composition and/or heat treatment conditions. Since process conditions strongly influence the morphology of material microstructures, and material properties critically depend on those microstructures, the morphological estimation of material microstructures is extremely important for analyzing the process–structure–property relationship in material design. Thus, frameworks for estimating material microstructures have been actively discussed. The phase field method is a typical methodology for estimating microstructure morphology. Starting from its application to the calculation of dendrite growth, the phase field method has been extended to the calculation of microstructure formation in multi-phase and polycrystalline materials.1,2,3) With the recent development of the computational phase diagram approach called calculation of phase diagrams (CALPHAD),4) more practical prediction of microstructure formation in alloy systems has been attempted.5,6,7,8,9) For example, Loginova et al. analyzed the formation process of Widmanstätten ferrite by the phase field method and succeeded in reproducing its characteristic acicular structure.10,11) However, as Bhadeshia pointed out,12) there is no physical justification for the settings of the interface width and the interface free energy in the phase field method. Therefore, microstructure prediction by the phase field method is limited to qualitative reproduction of microstructural topology, making it difficult to quantitatively take into account the effects of alloying elements on microstructure formation. On the other hand, spatially averaged descriptions of microstructure formation based on the kinetics of phase transformation, such as the Johnson-Mehl-Avrami-Kolmogorov (JMAK) equation,13,14,15) are widely discussed for predicting the overall behavior of the phase transformation.
Despite its simplicity, there have been attempts to estimate the spatially averaged behavior of the phase transformation while quantitatively taking into account the effects of alloying elements, and agreement with experiments has been reported.16) In summary, while conventional metallurgical theory can estimate the global behavior of phase transformation on the basis of transformation kinetics, quantitative estimation of the microstructure that accounts for the effects of alloying elements on the spatial distribution of phases remains difficult.
In addition to the model-driven approaches described above, data-driven methods, such as deep learning models, have recently attracted attention in the field of computational material engineering.17,18,19,20,21,22) For example, applications to the characterization of material microstructures with complex morphologies and to the establishment of the process–structure–property relationship have been actively discussed. DeCost et al. applied a convolutional neural network (CNN) to extract geometrical features of ultra-high carbon steel and attempted to establish the relationship between material microstructures and process conditions from the extracted features.23) There are also attempts to estimate material microstructures using deep learning methods.24,25,26) In particular, the generative adversarial network (GAN)27,28,29) and the variational autoencoder (VAE)30) can be listed as deep learning models that are expected to be applied to microstructure prediction. For example, Iyer et al. attempted to estimate material microstructures from process conditions by training a GAN with steel microstructures labeled by those conditions.29) Also, Cang et al. prepared artificial two-phase material data and attempted to estimate microstructures from a target Young’s modulus using a VAE.31) However, most of these approaches are still under development, and their applications are limited to relatively simple microstructures with an almost uniform distribution of two phases. Therefore, the estimation by deep learning of material microstructures with high spatial anisotropy, such as steel alloys, has not been sufficiently discussed.
One of the major difficulties in applying data-driven methods to material design is the need for large amounts of high-quality data, which are usually difficult and expensive to collect. In the field of computer science, where deep learning methods themselves are developed, standard datasets containing large amounts of data serve as the basis for discussion. For example, ImageNet,32) one of the standard image datasets, contains more than 14 million images. On the other hand, as described below, the dataset used in this study is composed of only 120 steel microstructures. Thus, there is a gap of several orders of magnitude in the amount of available data between computer science and material science. Therefore, discussion specific to material science is indispensable for achieving the expected performance using only data that can be practically obtained in the field.
Based on the above background, this study discusses how to improve the accuracy of deep learning by incorporating the accumulated knowledge in the field of material science, as an attempt to compensate for the lack of data. As a first step toward establishing a framework combining deep learning and physical knowledge, we aim to develop a deep learning model that can be coupled with a physical model. When considering the actual coupling with a physical model, the influence of the accuracy or the range of applicability of the physical model on the performance cannot be ignored. Thus, in this paper, we assume that a physical model that adequately explains the experiment is available, and we discuss the coupling with the physical model using experimental data.
This section describes the deep learning model used in this paper. The framework is based on our previous works.33,34) It consists of two deep learning networks: the vector quantized variational autoencoder (VQVAE)35) and the pixel convolutional neural network (PixelCNN).36,37) VQVAE is used to extract features describing the input material microstructures as well as their spatial arrangement, and PixelCNN is used to obtain correlations between the extracted features and target conditions such as process parameters and/or material properties. Figure 1 shows a schematic diagram of the framework composed of VQVAE and PixelCNN. The details of VQVAE and PixelCNN are described below.
This section describes the feature extraction of material microstructures using VQVAE. VQVAE was developed based on VAE.30) Therefore, we start with an explanation of the basic mechanism of VAE. Figure 2 shows the basic structure of VAE, which consists of a probabilistic encoder Q(z|x) and a probabilistic decoder P(x|z), both modeled by CNNs. The encoder converts an input image into a Gaussian distribution in a space of lower dimension than the input data, called the latent space. The decoder reconstructs the input image using a latent variable vector sampled from this Gaussian distribution. In other words, VAE must reconstruct the input image from a latent variable vector that contains noise. As a result, VAE can represent the input data as a distribution rather than a fixed point in the latent space.
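The noisy sampling step described above is commonly implemented with the reparameterization trick, z = μ + σ·ε with ε drawn from a standard normal. The following is a minimal NumPy sketch under that assumption; the function and variable names are ours, not from the paper:

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).

    mu, log_var: encoder outputs for one image (shape (D,)).
    The decoder then reconstructs the image from this noisy z.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu = np.zeros(8)
log_var = np.full(8, -20.0)  # near-zero variance -> z stays close to mu
z = sample_latent(mu, log_var, rng)
```

With a larger `log_var`, the sampled z scatters around μ, which is what forces the decoder to tolerate noise in the latent vector.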
The loss function used in VAE is defined as the sum of the reconstruction loss and the Kullback-Leibler divergence (KL divergence):30)
$$ L_{\mathrm{VAE}} = \| x - \hat{x} \|^{2} + D_{\mathrm{KL}}\left( Q(z|x) \,\|\, P(z) \right) \tag{1} $$

where the first term is the mean squared loss between the input data x and the reconstructed data x̂, and the second term is the KL divergence between the encoder distribution Q(z|x) and the prior P(z):

$$ D_{\mathrm{KL}}\left( Q(z|x) \,\|\, P(z) \right) = \int Q(z|x) \log \frac{Q(z|x)}{P(z)} \, dz \tag{2} $$
In short, the KL divergence is a measure of the dissimilarity of probability distributions: the more similar the two distributions are, the smaller its value, and by definition it is zero when the two distributions are identical. Because P(z) is assumed to be a standard Gaussian distribution in VAE, the KL divergence defined in Eq. (2) acts as a regularization term: the further Q(z|x) moves away from the standard Gaussian distribution, the larger the KL divergence becomes, increasing the loss function. This term thus prevents the encoder Q(z|x) from becoming too complex, in the sense of straying far from the standard Gaussian distribution.
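For a diagonal-Gaussian encoder and a standard-Gaussian prior, the KL term in Eq. (2) has a well-known closed form, 0.5·Σ(μ² + σ² − log σ² − 1). A minimal NumPy sketch (our own naming, not the paper's code):

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL divergence D_KL(N(mu, sigma^2) || N(0, I)) for a
    diagonal-Gaussian encoder, summed over the latent dimensions:
        0.5 * sum(mu^2 + sigma^2 - log(sigma^2) - 1)
    """
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - log_var - 1.0)

kl_zero = kl_to_standard_normal(np.zeros(4), np.zeros(4))  # identical -> 0
kl_off = kl_to_standard_normal(np.ones(4), np.zeros(4))    # drifted -> larger
```

The two evaluations illustrate the regularizing behavior described in the text: the KL term vanishes when Q(z|x) equals the standard Gaussian and grows as the encoder drifts away from it.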
As mentioned above, VQVAE was developed based on VAE. As shown in Fig. 1(a), VQVAE is also composed of a convolutional encoder and a convolutional decoder, just like VAE. However, while VAE assumes a continuous latent vector following a Gaussian distribution, VQVAE assumes discrete latent vectors. In VQVAE, K D-dimensional vectors are randomly sampled from the D-dimensional latent space. This set of K D-dimensional vectors is called the codebook (e∈ℝD×K), and it is also optimized during training. In VQVAE, the latent variable z in VAE is replaced with two variables, ze∈ℝM×N×D and zq∈ℝM×N×D. ze is the output of the convolutional encoder and can be interpreted as a set of M×N D-dimensional vectors. zq is the set of vectors defined by the following equation:
$$ z_{q}^{(m,n)} = e_{k}, \qquad k = \mathop{\mathrm{argmin}}_{j} \left\| z_{e}^{(m,n)} - e_{j} \right\|_{2} \tag{3} $$
where ek is the D-dimensional latent vector included in the codebook. Thus, zq is a set of D-dimensional vectors obtained by replacing each D-dimensional vector in ze with the nearest D-dimensional vector in the codebook. The M×N array of selected codebook indices k is called the index list. The vector replacement defined by Eq. (3) is called vector quantization (VQ) and is the characteristic operation of VQVAE. The decoder of VQVAE reconstructs the input image from the set of replaced vectors zq specified by the index list. From a material point of view, these discrete latent vectors can be understood as corresponding to geometric features such as the several phases and phase boundaries included in the input microstructure. For example, in a steel microstructure, these latent vectors may correspond to the morphology of ferrite or martensite phases or their grain boundaries. An important point for understanding material microstructures is that the several phases included in a microstructure have completely different geometrical features, as a result of being produced by qualitatively different transformation kinetics: a steel microstructure is a patchwork of phases with completely different geometrical features. From this point of view, the array of feature vectors (index list) extracted by VQVAE can be interpreted as a spatial arrangement of the small-scale structures constructing the input microstructure.
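The vector quantization of Eq. (3) amounts to a nearest-neighbor lookup in the codebook. The following NumPy sketch illustrates it with toy sizes (names and dimensions are ours, for illustration only):

```python
import numpy as np

def vector_quantize(z_e, codebook):
    """Replace each D-dimensional vector in z_e (shape (M, N, D)) with its
    nearest neighbor in the codebook (shape (K, D)), as in Eq. (3).
    Returns the quantized tensor z_q and the (M, N) index list."""
    M, N, D = z_e.shape
    flat = z_e.reshape(-1, D)                       # (M*N, D)
    # Squared Euclidean distance to every codebook entry: (M*N, K)
    dists = ((flat[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    indices = dists.argmin(axis=1)                  # nearest entry per vector
    z_q = codebook[indices].reshape(M, N, D)
    return z_q, indices.reshape(M, N)

codebook = np.array([[0.0, 0.0], [1.0, 1.0]])       # K=2 entries, D=2
z_e = np.array([[[0.1, -0.1], [0.9, 1.2]]])         # M=1, N=2, D=2
z_q, idx = vector_quantize(z_e, codebook)
```

Here each encoder vector snaps to its nearest codebook entry, and `idx` is exactly the index list that PixelCNN later models.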
The loss function of VQVAE is composed of three terms (reconstruction loss, codebook loss, and commitment loss), given as the first, second, and third terms in Eq. (4), respectively:35)
$$ L_{\mathrm{VQVAE}} = \| x - \hat{x} \|^{2} + \left\| \phi_{sg}(z_{e}) - z_{q} \right\|_{2}^{2} + \beta \left\| z_{e} - \phi_{sg}(z_{q}) \right\|_{2}^{2} \tag{4} $$
where ϕsg is the stop-gradient operator and β is a weight adjusting the influence of the commitment loss. The reconstruction loss is the same as in the usual VAE. The codebook loss moves the chosen codebook vectors in zq toward the corresponding D-dimensional vectors in the encoder output ze, while the commitment loss moves the vectors in ze toward the selected codebook vectors in zq in the D-dimensional space. ϕsg is introduced so that ze and zq approach each other alternately: in each term the gradient through one side is stopped, so that only the other side is updated.
In summary, VQVAE decomposes an input microstructure into a spatial arrangement of discrete feature vectors, just as metallurgists consider the input microstructure to be a set of several characteristic phases, such as ferrite and martensite phases. In this sense, the architecture of VQVAE is consistent with our interpretation of material microstructures. This is one of the important characteristics of the proposed framework.
2.2. Pixel Convolutional Neural Network (PixelCNN)

In this section, we explain PixelCNN, which is used to obtain a correlation between the features given by VQVAE and process conditions and/or material properties. Figure 1(b) shows a conceptual diagram of PixelCNN. PixelCNN is an autoregressive model that models the following conditional probability distribution over the pixels of an index list x:
$$ p(x \mid h) = \prod_{i=1}^{M \times N} p\left( x_{i} \mid x_{1}, \ldots, x_{i-1}, h \right) \tag{5} $$
where x is an input index list, xi is a pixel in the index list, and h is the desired condition vector. The conditional distribution written as Eq. (5) is modeled by a CNN that is connected to a softmax layer to estimate the probability of nc classes, depending on the problem. The ordering of the pixel dependencies is left to right and top to bottom. In other words, every pixel is dependent on all pixels above and to the left of it. The network of PixelCNN is modeled to realize this dependence of pixels. Note that h does not depend on the location of the pixel in the index list. This is equivalent to adding a condition-dependent bias at every layer of the network.
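The above-left dependency structure is usually enforced by masking the convolution kernels. The sketch below builds such masks, assuming the standard type-A/type-B convention of the PixelCNN literature (type A for the first layer, which also blocks the center pixel); this is an illustration, not code from the paper:

```python
import numpy as np

def pixelcnn_mask(kernel_size, mask_type):
    """Build the (k, k) binary mask applied to a PixelCNN convolution
    kernel. Type 'A' (first layer) blocks the center pixel itself; type
    'B' (later layers) allows it. Both block all pixels below the center
    row and those to the right on it, so each pixel only sees the pixels
    above and to its left."""
    k = kernel_size
    mask = np.ones((k, k))
    c = k // 2
    mask[c, c + (mask_type == 'B'):] = 0.0  # center row: right of center
    mask[c + 1:, :] = 0.0                   # all rows below the center
    return mask

mask_a = pixelcnn_mask(3, 'A')
mask_b = pixelcnn_mask(3, 'B')
```

Multiplying a 3×3 kernel by `mask_a` before the convolution guarantees that the prediction for a pixel never depends on that pixel or on any pixel below or to its right.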
As described above, VQVAE extracts an index list, i.e., a spatial arrangement of the feature vectors composing the input microstructure image. All images in the dataset can be converted to index lists using the trained VQVAE. Each pixel in an index list holds the index of a certain vector in the codebook, so there are nc=K choices, depending on the number of feature vectors prepared in the codebook. In other words, since the training of PixelCNN can be formulated as a K-class classification problem, the loss function is the cross-entropy loss defined in Eq. (6):
$$ L_{\mathrm{CE}} = - \sum_{i} \sum_{k=1}^{K} c_{i,k} \log p_{i,k} \tag{6} $$
where ci is the categorical (one-hot) distribution whose value is 1 for the element corresponding to the true input index and 0 otherwise, and pi is the probability distribution over the indices predicted by PixelCNN for the i-th pixel. In other words, PixelCNN is trained so that its inference matches the true input index list. As a result, PixelCNN can extract the spatial dependencies of the several phases constructing the input microstructure; that is, it captures the spatial order within a target microstructure as the conditional probability distribution over index lists defined in Eq. (5).
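The cross-entropy of Eq. (6) over one-hot targets can be sketched as follows (toy numbers and our own naming, for illustration):

```python
import numpy as np

def cross_entropy(c, p, eps=1e-12):
    """Eq. (6): -sum_i sum_k c_ik * log(p_ik), where c is the one-hot
    encoding of the true index list and p holds the K-class probabilities
    predicted for each pixel i."""
    return -np.sum(c * np.log(p + eps))

# Two pixels, K=3 codebook entries; the true indices are 0 and 2.
c = np.array([[1, 0, 0], [0, 0, 1]], dtype=float)
p = np.array([[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]])
loss = cross_entropy(c, p)
```

The loss only counts the predicted probability of the correct index at each pixel, so it shrinks toward zero as the network puts all its probability mass on the true index list.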
2.3. Steel Microstructure Prediction Based on the Proposed Methodology

In this section, we explain how to estimate steel microstructures from target process conditions and/or material properties using the trained VQVAE and PixelCNN. Figure 3 shows a schematic of the procedure for estimating steel microstructures from the transformation behavior using the framework of this study. In this study, a list of the temperatures at which specific transformation fractions are reached, called a transformation fraction curve, is given as the condition h. For microstructure prediction, the target transformation fraction curve h is given to the probability distribution defined by Eq. (5) in PixelCNN, and an index list is sampled from the distribution. Then, the sampled index list is replaced with the corresponding set of feature vectors using the trained codebook in VQVAE. As mentioned above, the trained codebook is expected to be a list of candidate vectors describing the geometrical features of the target microstructures; in the case of steel, the codebook is considered to correspond to geometrical features such as ferrite and martensite phases or their grain boundaries. By converting these vectors into a microstructure image using the trained decoder, we obtain the steel microstructure image corresponding to the given transformation behavior. An important point is that, since the probability distribution in Eq. (5) is explicitly defined by PixelCNN, the probabilistic correlation between the given transformation behavior and the corresponding steel microstructure can be obtained. Therefore, by repeating the sampling multiple times, it is possible to construct an ensemble of steel microstructures corresponding to a common transformation behavior, allowing a probabilistic estimation of the microstructure.
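The repeated-sampling step, drawing an index list pixel by pixel from the conditional distribution of Eq. (5), can be sketched as ancestral sampling. The trained, h-conditioned PixelCNN is replaced here by a hypothetical stub (`predict_probs`), since the actual network is not reproduced:

```python
import numpy as np

def sample_index_list(predict_probs, shape, rng):
    """Ancestral sampling of an (M, N) index list in raster order.
    `predict_probs(partial)` stands in for the trained, condition-h
    PixelCNN: it returns the K-class probabilities for the next pixel
    given all previously sampled pixels (a hypothetical stub here)."""
    M, N = shape
    idx = np.full((M, N), -1)
    for i in range(M):
        for j in range(N):
            p = predict_probs(idx)       # conditional distribution, Eq. (5)
            idx[i, j] = rng.choice(len(p), p=p)
    return idx

rng = np.random.default_rng(0)
# Toy stand-in model: a fixed distribution over K=3 codebook indices.
sampled = sample_index_list(lambda idx: np.array([0.2, 0.5, 0.3]), (4, 4), rng)
```

Calling this repeatedly with the same stub yields different index lists drawn from the same distribution, which is exactly how the ensemble of microstructures for one transformation behavior is built.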
Finally, an important difference between the proposed framework and conventional methods such as GAN and VAE should be mentioned. First, since a steel material is composed of a finite number of different phases produced by completely different transformation kinetics, the feature vectors that represent the microstructural images are also considered to be described by a finite number of sparse, discrete vectors. Furthermore, since process conditions govern the formation of each phase, the spatial arrangement of the feature vectors is determined by the process conditions. As mentioned above, the proposed framework prepares the codebook as a finite number of sparse, discrete feature vectors in VQVAE, and VQVAE decomposes an input microstructure image into a spatial arrangement of feature vectors (index list). Moreover, PixelCNN acquires the correlation between process conditions and the spatial arrangement of feature vectors defined by Eq. (5). As a result, the proposed framework takes into account the inherent sparsity of steel microstructures and explicitly describes the metallurgical dependences using neural networks. On the other hand, since GAN and VAE use feature vectors that follow a continuous distribution for image generation, the sparsity of steel microstructures is not considered; the dependence of material microstructures on process conditions is not explicitly defined but is expected to be implicitly extracted by the network from the given data. In summary, the proposed framework differs significantly from conventional methods in the sense that it explicitly incorporates metallurgical knowledge, such as the intrinsic sparsity of steel microstructures, and metallurgical causality, such as the spatial arrangement of microstructures determined by process conditions, into the network architecture.
Steel microstructures with different heat treatment conditions and chemical compositions were prepared for training the deep learning framework. To obtain the microstructure images, Formaster tests were performed to simulate the heat-affected zone of welding. The heating rate, the maximum heating temperature, and the holding time were set to 50°C/s, 1400°C, and 5 seconds, respectively. The specimens were then cooled to 1000°C at 50°C/s and subsequently to room temperature at one of four cooling rates: 1.0, 3.0, 10, or 30°C/s. The obtained microstructures were observed using an optical microscope. Transformation fractions were calculated for each microstructure from the dilatational data recorded during cooling. The Formaster tests yielded a total of 120 images of steel microstructures, one for each combination of the four cooling rates (1.0, 3.0, 10, and 30°C/s), the 15 compositions shown in Table 1, and the two magnifications (X100 and X400). Figure 4 shows examples of microstructure images for each chemical composition at a cooling rate of 1.0°C/s and X100 magnification. Figure 5 shows the relationship between temperature and transformation fraction obtained experimentally for each steel and cooling rate; the temperature at which the transformation fraction increases by 2% is plotted for each condition. The number of data points comprising each transformation fraction curve is 49. This study discusses microstructure prediction from these transformation fraction curves. To visualize the relationship between microstructures and cooling rate, Fig. 6 shows the microstructures of steel 14 at X100 magnification for the four cooling rates.
Steel | C | Si | Mn | Cu | Ni | Cr | Mo | V | Ti | B | Alsol | N |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0.07 | – | 1.52 | – | – | – | – | – | – | – | 0.028 | 0.0032 |
2 | 0.15 | – | 0.79 | – | – | – | – | – | – | – | 0.030 | 0.0028 |
3 | 0.07 | – | 1.48 | – | – | – | – | 0.058 | – | – | 0.032 | 0.0024 |
4 | 0.07 | – | 1.47 | – | – | – | 0.60 | – | – | – | 0.032 | 0.0026 |
5 | 0.07 | – | 1.48 | – | – | 1.50 | – | – | – | – | 0.033 | 0.0029 |
6 | 0.07 | 0.40 | 1.47 | – | – | – | – | – | – | – | 0.025 | 0.0014 |
7 | 0.06 | 0.30 | 0.89 | 0.19 | 2.99 | 0.50 | 0.41 | 0.040 | – | 0.0010 | 0.063 | 0.0032 |
8 | 0.13 | 0.30 | 1.14 | 0.20 | 0.80 | 0.50 | 0.39 | 0.039 | – | 0.0013 | 0.061 | 0.0033 |
9 | 0.09 | 0.30 | 0.89 | 0.20 | 1.50 | 0.80 | 0.40 | 0.040 | – | 0.0011 | 0.062 | 0.0041 |
10 | 0.09 | 0.30 | 0.90 | 0.20 | 1.51 | 0.50 | 0.59 | 0.039 | – | 0.0011 | 0.061 | 0.0037 |
11 | 0.05 | – | 1.93 | – | – | – | – | – | – | – | 0.027 | 0.0053 |
12 | 0.03 | – | 1.42 | – | – | – | – | – | – | – | 0.032 | 0.0040 |
13 | 0.06 | – | 1.44 | – | – | – | – | – | 0.036 | – | 0.028 | 0.0044 |
14 | 0.13 | 0.30 | 1.24 | – | – | 0.21 | – | 0.041 | – | 0.0002 | 0.055 | 0.0028 |
15 | 0.07 | 0.29 | 1.31 | 0.20 | 0.59 | 0.21 | – | 0.040 | – | 0.0001 | 0.058 | 0.0029 |
This section describes the procedure for creating the training dataset from the original microstructure images described in section 3.1. Figure 7 shows a schematic of the procedure. To create the training dataset, 256×256 pixel square patches are cropped from the original microstructure images, allowing the patches to overlap one another. In addition, the square patches were rotated clockwise by 90, 180, and 270 degrees to expand the dataset. The input images were also randomly flipped vertically and horizontally before being input to the network to further expand the dataset.
In addition to the above data augmentation, pseudo-low magnification images (50x and 200x) were created by cropping corresponding regions from X100 and X400 images to extract macroscopic structures, which were also added to the training dataset. Figure 8(a) shows a comparison of the cropped areas, and Fig. 8(b) shows examples of training images created for each magnification. As a result of these data augmentations, a training dataset containing 88256 square images was prepared from 120 original microstructures.
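The cropping-and-rotation augmentation can be sketched as follows; the patch size and stride here are illustrative toy values, not the 256-pixel setting used in the paper, and the random flips mentioned in the text would be applied later, at input time:

```python
import numpy as np

def augment(image, patch=64, stride=32):
    """Crop overlapping square patches from a grayscale image and add the
    90/180/270-degree rotations of each, quadrupling the patch count."""
    H, W = image.shape
    patches = [image[i:i + patch, j:j + patch]
               for i in range(0, H - patch + 1, stride)
               for j in range(0, W - patch + 1, stride)]
    # Four rotations (k=0 keeps the original orientation).
    return [np.rot90(p, k) for p in patches for k in range(4)]

img = np.arange(128 * 128, dtype=float).reshape(128, 128)
dataset = augment(img)
```

For this 128×128 toy image, the stride of 32 yields a 3×3 grid of overlapping patches, and the rotations expand those 9 patches to 36 training samples.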
We discuss the problem of estimating steel microstructures from transformation behavior using the proposed framework. In this study, we define the microstructure prediction problem as estimating the microstructure for a given chemical composition and heat treatment conditions. Figure 9 shows a conceptual diagram of the procedure for estimating microstructures by combining a deep learning model and a physical model. In the figure, the JMAK equation is shown as an example of a physical model for microstructure prediction. In this diagram, the transformation fraction curve is first predicted from the chemical composition and the heat treatment conditions using the JMAK equation. Subsequently, microstructures are estimated from the transformation behavior given by the physical model, based on the correlation between transformation behavior and microstructure obtained by the proposed framework. As a result, a methodology for estimating the material microstructure from a given chemical composition and heat treatment conditions is constructed by coupling the deep learning model with the physical model. This study focuses primarily on the deep learning model, which constitutes one part of this methodology, and discusses the effect of coupling with the physical model on the performance of the deep learning model. When performing microstructure prediction coupled with a physical model, the influence of the accuracy or the range of applicability of the physical model cannot be ignored. Therefore, as a first step toward such coupling, we assume that a physical model that adequately explains the experiment is available, and we discuss the coupling based on the experimentally obtained transformation fraction curves.
Figure 10 summarizes the microstructure images predicted from the transformation fraction curves of the 15 steels for a cooling rate of 1.0°C/s and X100 magnification. Each panel contains four square images estimated from the same transformation behavior, and the numbers in each panel correspond to those in Fig. 4. The predicted microstructures have a topology similar to that of the original microstructure images shown in Fig. 4. Importantly, since PixelCNN explicitly models the spatial order of the microstructure as a probability distribution, it enables us to estimate steel microstructures probabilistically: various microstructures following the same probability distribution can be estimated even for identical conditions (transformation behavior). As a result, as shown in the respective panels of Fig. 10, the microstructures estimated for the same transformation behavior share similar topological characteristics but are not completely identical.
Figure 11 shows the trend of microstructures generated corresponding to different cooling rates for steel 14. As the cooling rate increases, the proportion of dark-looking microstructures such as martensite phase increases, and finer microstructures are formed. As a result, it can be seen that the proposed framework can capture the relationship between the given process parameter (the cooling rate) and the corresponding microstructures. The quantitative validation of the establishment of the process-structure linkage using the same framework is presented in detail in our previous paper.33) Figure 12 shows examples of the microstructures produced for several steels when the cooling rate is fixed at 1.0°C/s and the magnification is varied. The results show that a hierarchical structure is obtained as the magnification is varied.
As described above, the estimated microstructures have geometric features qualitatively similar to those of the original microstructures. On the other hand, relatively large structures within the field of view, such as prior austenite grain boundaries or the acicular structures seen in steel 2, are not clear in the estimated microstructures. Because the correlation within a square patch is factorized into a product of local interactions in PixelCNN, as defined in Eq. (5), the strength of the interaction decreases exponentially as the distance increases, and long-range interactions are difficult for PixelCNN to capture. Thus, it is difficult to reproduce global structures such as prior austenite grain boundaries and acicular structures using PixelCNN. An attempt to acquire global dependencies among pixels with PixelCNN has been discussed,38) and such methods may clarify these global structures. Also, considering that only one microstructure was available for each condition, the data is undeniably insufficient; the global structures might therefore also become clearer if the dataset were expanded, although this cannot be confirmed with the current small dataset.
Next, we consider the effect of the number of data points describing the transformation behavior on the microstructure prediction. Figures 13(a) and 13(b) show the transformation fraction curves for steel 5 as an example, represented by 9 and 49 points, respectively. Figures 13(c) and 13(d) show the microstructures estimated from each transformation fraction curve for a cooling rate of 1.0°C/s. The comparison between Figs. 13(a) and 13(b) suggests that information such as the local curvature of the transformation fraction curve may be lost in the 9-point representation. For example, in Fig. 13(b), the slope of the transformation fraction curve shows a retardation of transformation around a temperature of 650°C and a transformation fraction of 20% at a cooling rate of 1.0°C/s, whereas this retardation is not clear in Fig. 13(a). In addition, the comparison between Figs. 13(c) and 13(d) shows that the microstructures estimated using the 49-point data are clearer than those estimated using the 9-point data. These results indicate that the information on the transformation behavior is enriched as the number of points representing the transformation fraction curve increases, and the estimated microstructure accordingly becomes clearer.
As supplemental information, Fig. 14 shows visualizations of the latent space extracted by the proposed framework based on principal component analysis (PCA)39) and corresponding steel indices. In practice, after the output tensors of the trained encoder in VQVAE corresponding to the input images are flattened into vectors, the vectors are compressed into two-dimensional vectors by PCA. Then, we map the microstructure patches in two-dimensional space by the compressed vectors. Figure 14(a) illustrates the result of the visualization of 5000 sample patches randomly selected from the training dataset. This result demonstrates that each microstructure image is continuously distributed according to its morphology. Also, Fig. 14(b) shows a plot of the steel index of each microstructure. Although further discussion is required regarding the physical interpretation of these plots, a continuous distribution based on the morphology of the microstructures is observed in the latent space. This suggests that the distance in the latent space corresponds to the similarity of the microstructures.
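The PCA projection used for this visualization can be sketched as follows, assuming the flattened encoder outputs are given as a plain (n_samples, n_features) array; the SVD-based implementation and the names are ours, for illustration:

```python
import numpy as np

def pca_2d(features):
    """Project flattened encoder outputs (n_samples, n_features) onto
    their first two principal components via SVD of the centered data."""
    X = features - features.mean(axis=0)
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:2].T                  # (n_samples, 2) coordinates

rng = np.random.default_rng(0)
feats = rng.standard_normal((100, 32))   # stand-in for flattened z_e tensors
coords = pca_2d(feats)
```

Each microstructure patch then becomes a 2-D point, and distances between points can be read as similarities between microstructures, as discussed in the text.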
In this section, we discuss the effect of the combination of the deep learning framework and the physical model on the microstructure prediction. To examine the effect, we consider two models. One is a model that directly estimates microstructures from chemical composition and heat treatment conditions using only the proposed deep learning framework, and the other is a model that estimates the microstructure from the calculated transformation behavior from the chemical composition and heat treatment conditions via a physical model. However, in the latter case, the accuracy of the microstructure prediction is affected by the accuracy of the physical model. To eliminate this influence, we train our deep learning framework using the experimentally obtained transformation behavior shown in Fig. 5 and discuss the combination with the physical model indirectly. In other words, assuming that we have the physical model that represents the transformation behavior well, we hypothetically discuss the microstructure prediction based on the combination of the deep learning and the physical model.
Using the above two models, we consider the estimation of a steel microstructure corresponding to an unknown chemical composition and heat treatment conditions that are not included in the training dataset. Here, we remove the microstructure data for a certain composition from the training dataset, train the deep learning framework, and then estimate the removed microstructure using the trained model. For the former model, microstructures are estimated directly from the chemical composition and the cooling rate, while for the latter model, microstructures are estimated from the cooling rate and the transformation behavior assumed to be given by the physical model. Figure 15 summarizes the microstructures estimated for steel 5 by each model, where Fig. 15(a) shows the experimentally obtained microstructure image of steel 5, and Figs. 15(b) and 15(c) show the microstructures estimated from the transformation behavior and from the chemical composition, respectively. The cooling rate was set to 1.0°C/s. The estimation using the transformation behavior captures the geometrical characteristics of the microstructures, whereas the direct estimation from the chemical composition cannot reproduce the characteristic microstructures. This result suggests that the “translation” of the chemical composition and heat treatment conditions into the corresponding transformation behavior via the physical model improves the accuracy of microstructure prediction. In other words, it suggests that combining the deep learning model with the physical model can contribute to improving the accuracy of microstructure prediction by the deep learning model.
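The hold-out protocol described above can be sketched as follows; the record structure (a `"steel"` key tagging each patch) is a hypothetical illustration, not the actual data format used in the paper.

```python
def leave_one_out(dataset, held_out_steel):
    """Split the dataset so that all patches of one steel grade are
    held out for evaluation, as done here for steel 5."""
    train = [d for d in dataset if d["steel"] != held_out_steel]
    held = [d for d in dataset if d["steel"] == held_out_steel]
    return train, held

# Hypothetical records: each microstructure patch is tagged with its steel index.
data = [{"steel": s, "patch": None} for s in (1, 2, 5, 5, 3)]
train, held = leave_one_out(data, held_out_steel=5)
print(len(train), len(held))  # 3 2
```

The framework is then trained on `train` only, and the patches in `held` serve as ground truth for the unknown composition.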
To quantitatively validate the effectiveness of the combination with the physical model for microstructure prediction, we consider the problem of estimating the distribution of unknown microstructures in the latent space shown in Fig. 14. We trained the two deep learning models mentioned above on the training dataset excluding steel 5 and used each model to estimate the distribution of steel 5 in the latent space. We used the data for steel 5 as ground truth data to verify the estimation accuracy. Figure 16 summarizes the distribution of the ground truth data and the distributions estimated by the PixelCNN included in each deep learning model. Additionally, as a quantitative measure of estimation accuracy, we calculated the distance between the barycenters of the ground truth distribution and the estimated distribution. As shown in Fig. 14, since distance in the latent space corresponds to the similarity of the microstructures, the estimation accuracy of the microstructures can be discussed using this distance. The barycenter distance is also given in Fig. 16. From the comparison of the distributions shown in Fig. 16, although it is somewhat obscured by the dimensionality reduction, the distribution estimated with the physical model coupling is closer to the ground truth data. Together with the computed barycenter distance, these results indicate that coupling with the physical model contributes to improving the accuracy of estimation by the deep learning framework.
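The barycenter distance used as the quantitative measure can be sketched as below, a minimal example assuming the ground truth and estimated distributions are given as arrays of two-dimensional latent coordinates; the function name and the sample data are illustrative only.

```python
import numpy as np

def barycenter_distance(truth_2d: np.ndarray, est_2d: np.ndarray) -> float:
    """Euclidean distance between the barycenters (mean points) of the
    ground-truth and estimated distributions in the 2-D latent space."""
    return float(np.linalg.norm(truth_2d.mean(axis=0) - est_2d.mean(axis=0)))

# Hypothetical 2-D latent coordinates for illustration.
rng = np.random.default_rng(1)
truth = rng.normal(loc=(0.0, 0.0), size=(200, 2))
estimated = rng.normal(loc=(1.0, 0.5), size=(200, 2))
print(barycenter_distance(truth, estimated))
```

A smaller distance means the estimated distribution is centered closer to the ground truth, i.e., the estimated microstructures are more similar to the true ones.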
In this section, we compare the estimation of unknown microstructures using the proposed framework with that using a VAE, a typical conventional generative model. As in the estimation by the proposed framework, the transformation fraction curve was input into the VAE. In practice, the training microstructure is converted into features by the encoder of the VAE, and the decoder reconstructs the training microstructure from the extracted features and the transformation fraction curve. As a result, the VAE is trained to capture the correlation between the microstructures and the transformation behavior. Here, we estimated unknown microstructures from the transformation behavior using the decoder trained with the transformation fraction curve as a condition. Figure 17 shows the microstructures predicted by the proposed framework and the VAE. Figure 18 shows the comparison of the distributions of unknown microstructures estimated from the transformation behavior by each framework in the latent space. Note that Figs. 16(a) and 18(a) are identical. In the same way as above, the distance between the barycenters of the ground truth distribution and the estimated distribution was calculated and is also shown in Fig. 18. From these results, we can confirm that our framework is superior to the conventional VAE in terms of microstructure prediction. This provides further evidence that our framework is more suitable for material design than conventional methods.
Finally, we discuss future work to make our framework an attractive tool for microstructure prediction. As mentioned above, while our method can estimate the basic topology of microstructures, structures such as grain boundary ferrite and needle-shaped ferrite remain unclear. Thus, improving the accuracy of reproducing relatively global structures within the field of view is one of our future tasks. The network-like structure of prior austenite grain boundaries is also unclear in the structures estimated in this study. One possible strategy is to add the prior austenite grain size to the label h as another descriptor of the microstructures. In other words, by appropriately expanding the descriptors according to the structures of interest, we can expect to improve accuracy. In the proposed framework, the dimension of the index list determines how large a region of the training microstructure is mapped to a latent vector in the codebook. In this sense, the current setting discussed in this paper focuses on structures at a specific length scale. Therefore, we are considering hierarchically linking our framework to extract structures at multiple length scales and thereby capture global structures appropriately. In addition, although the coupling of the deep learning framework and the physical model was discussed using experimental data in this paper, it is important in the future to construct a comprehensive framework that estimates microstructures from the transformation fraction curve given by the physical model. Accordingly, developing a physical model that can predict the transformation fraction curve is also important.
The findings of this paper are summarized as follows:
• We proposed a deep learning framework composed of VQVAE and PixelCNN that predicts steel microstructure morphology in combination with physical models. Using the framework, we discussed the problem of estimating steel microstructures from the transformation behavior.
• It was shown that the steel microstructures given by the framework are similar to the experimentally obtained microstructures in terms of basic topology, such as the volume fraction and the average grain size.
• The effect of combining the framework with the physical model on estimation accuracy was examined, and the improvement in accuracy due to the combination was confirmed both qualitatively and quantitatively. The advantage of the proposed framework over the conventional VAE was also confirmed qualitatively and quantitatively.
From these results, it can be concluded that the combination of data science methods with the accumulated knowledge in the domain is essential for the effective application of the data-driven methodology to material science.
This work was supported by Council for Science, Technology and Innovation (CSTI), Cross-ministerial Strategic Innovation Promotion Program (SIP), “Materials Integration” for Revolutionary Design System of Structural Materials (Funding agency: JST). We would also like to express our sincere gratitude to Dr. Tadashi Kasuya, a project researcher at the University of Tokyo, and Mr. Masahiro Imoto of Kobe Steel, Ltd. for providing valuable data.