Chemical and Pharmaceutical Bulletin
Online ISSN : 1347-5223
Print ISSN : 0009-2363
ISSN-L : 0009-2363
Regular Articles
Prediction of Dissolution Data Integrated in Tablet Database Using Four-Layered Artificial Neural Networks
Kozo Takayama Shota KawaiYasuko ObataHiroaki TodoKenji Sugibayashi
Author information
JOURNAL FREE ACCESS FULL-TEXT HTML

2017 Volume 65 Issue 10 Pages 967-972

Details
Abstract

A large number of dissolution data were measured and integrated into a previously constructed tablet database composed of 14 kinds of compounds as model active pharmaceutical ingredients (APIs) with contents ranging from 10 to 80%. The database has contained physicochemical and powder properties of APIs, together with basic physical attributes of tablets such as the tensile strength and the disintegration time. In order to enhance the value of this database, drug dissolution data are essential to improving key information for designing tablet formulations. A four-layered artificial neural network (4LNN), newly implemented in commercially available software, was employed to predict dissolution data from physicochemical and powder properties of APIs. Our results showed that an excellent model for the prediction of dissolution data was achieved with 4LNN method. The function of 4LNN was appreciably better than that of conventional three-layered model, despite both models adopting the same number of nodes and algorithms for activation functions. Furthermore, linear regression models resulted in poor prediction of dissolution data.

Tablets are one of the most common dosage forms among various pharmaceutical products. Since tablets are produced with many formulation factors such as various excipients and process variables, the relationships between formulation factors and attributes of tablet quality are often complicated. The effects of formulation factors can be substantially modulated by changes in the physicochemical nature of the active pharmaceutical ingredients (APIs). A tablet database would provide the basis for meaningful investigations using reliable candidate formulations, which can be undertaken at the very beginning of the development process. In a previous study,1) we created a tablet database that included 14 kinds of APIs: acetaminophen, antipyrine, aspirin, ascorbic acid, chloramphenicol, ethenzamide, flufenamic acid, griseofulvin, isoniazid, ketoprofen, mefenamic acid, sulfisoxazole, sulfamonomethoxine and theophylline. Tablets containing 10, 20, 30, 40, 50, 60, 70 and 80% in content of each API were prepared (112 different tablets in total). The tensile strength and disintegration time were measured as basic physical characteristics. Although the scale of the database is not yet sufficient, in a previous study, we found that the nature of API molecules and their powder properties could accurately predict basic physical characteristics, demonstrating the useful application of this database. In addition to the basic characteristics of the tablets, the dissolution rate of APIs is another important factor to consider, as is in partly related to bioavailability.2)

Recently, artificial intelligence (AI) has been an emerging theme in the science of complex system, particularly, deep learning35) and sparse modeling6,7) are increasingly becoming the focus of AI researches. Robust findings using these methods are often reported in the field of image analysis.3,7,8) Also, commercially available software employing novel AI techniques are continuously being developed, implemented, and updated. In this study, we measured the 112 types of tablets, and aimed to quantitatively predict dissolution data using a combination of physicochemical and powder properties of APIs. For this purpose, we employed four-layered artificial neural network (4LNN),9) a machine leaning technique that has recently been implemented in commercially available statistical software. Prediction ability of 4LNN was compared with the conventional three-layered model (3LNN) and classical linear regression equations.

Experimental

Materials

Acetaminophen, flufenamic acid and isoniazid were obtained from Tokyo Kasei Kogyo (Tokyo, Japan). Antipyrine, aspirin, ascorbic acid, chloramphenicol, ethenzamide, griseofulvin, ketoprofen, mefenamic acid, sulfamonomethoxine, theophylline and magnesium stearate (MgSt) were obtained from Wako Pure Chemical Industries, Ltd. (Osaka, Japan). Sulfisoxazole was obtained from Sigma-Aldrich (Tokyo, Japan). Microcrystalline cellulose (Ceolus® PH-101) was obtained from Asahi Kasei Chemicals (Tokyo, Japan). Lactose (200-mesh grade) was obtained from DMV International (Veghel, the Netherlands). Cornstarch was obtained from Nihon Shokuhin Kako (Tokyo, Japan). Low-substituted hydroxypropyl cellulose (L-HPC® LH21) was obtained from Shin-Etsu Chemical (Tokyo, Japan). All other chemicals were of reagent grade.

Preparation of Tablets

The preparation methods for tablets containing each API were fully described in detail in the previous paper.1) Briefly, the ingredients were accurately weighed according to the designated formulations, which were previously optimized in the placebo tablet formulation with acceptable tensile strength and disintegration time. All ingredients, except MgSt, were blended in a polyethylene bag for 2 min. The appropriate amount (0.5%) of MgSt was then added to the mixture and mixed for 1 min. The mixed powder was compressed at 8 kN into a round faced tablet, 8 mm in diameter and 12 mm in curvature radius, using a HANDTAB 100 hydraulic press (Ichihashi-Seiki Company, Ltd., Kyoto, Japan).

Drug Dissolution

According to the Japanese Pharmacopoeia XVII (JP XVII), dissolution test No. 2 (paddle method) for each tablet was conducted at 37±0.5°C with a paddle rotation speed of 50 rpm using 900 mL phosphate buffered saline (pH 7.4) as the dissolution medium. A dissolution tester (NTR 6400ATR, Toyama Sangyo, Co., Ltd., Osaka, Japan) with an auto-sampler (SAS-6000, Toyama Sangyo, Co., Ltd.) was used. Dissolved rates (%) of the APIs at 5, 10, 15, 30 and 60 min (D5, D10, D15, D30 and D60) in dissolution media were determined using a UV spectrophotometer (V-650, JASCO Corp., Tokyo, Japan). The D5, D10, D15, D30 and D60 values were analyzed as output variables in the 4LNN model. The mean values of three determinations were used as the dissolution data analysis.

Physicochemical Property and Powder Property

The physicochemical property and powder property values measured in the previous paper1) were used as causal factors for predicting the dissolution data of the APIs. That is, the melting point (mp, °C), the solubility in water at 37°C (Cs, mg/L), the specific surface area (Sw, cm2/g), the mean particle diameter (d50, µm) and the size distribution (Rw defined by (d90d10)/d50, where d10 and d90 are the particle diameters at 10 and 90%, respectively) of each API. These values and the API quantity (Q, %) were used as causal input variables.

Neural Network Architecture and Data Analysis

The commercially available statistical software, the JMP Pro 13 (SAS Institute Japan, Co., Ltd., Tokyo, Japan) was employed for the data analysis. 4LNN is composed of 4 layers: 1 input, 2 hidden, and 1 output. Each hidden layer contains nodes, with the nodes in the adjacent layers mutually linked with relevant weights. The functions applied at the nodes of each layer are referred to as activation functions. The hyperbolic tangent (tanH), linear combination (Linear), and Gaussian function (Gauss) were used as activation functions in each node. Although there are an infinite number of artificial neural network (ANN) structures, we investigated models 6/1/1/5 to 6/3/3/5. The 6/1/1/5 model was composed of 6 causal factors in the input, 1 node in the 1st and 2nd hidden, and 5 responses in the output layers. The 6/3/3/5 model was composed of 6 causal factors in the input, 3 nodes in the 1st and 2nd hidden, and 5 responses in the output layers. In total, 361 sets of 4LNN structures were conducted with different combinations of the activation function (tanH, Linear, and Gauss). A “boosting” function was implemented in JMP Pro 13, in order to build a large additive neural network model through fitting sequences of smaller models. Although the boosting function allows quicker identification of the most optimal network model, we could not cover the entire conceivable combinations of networks. Consequently we investigated overall combinations from the 6/1/1/5 to 6/3/3/5 models (361 sets). With regard to validation, the “k-fold” function implemented in the software, was employed. This method divides the original data into k subsets, and each subset, in turn, is used to validate the fit of the model on the rest of the data, fitting a total of k models. The model with the best validation statistic was chosen as the optimal one. This is the best method for fitting small data sets, as it makes efficient use of limited amounts of data. In this study, we employed k=5 (i.e., 80% of training and 20% of validation). The training and validation data were successively repeated 5 times with changing datasets. Using different initial weights of links given at random, 100 iterations of training were conducted. Subsequently, the total number of 4LNN trained was 902500 (5×361×5×100: 5 times repetition of trials, 361 combinations of nodes in hidden layers, 5-fold validation, and 100 iterations of different initial weights). Physicochemical and powder properties of the APIs as input factors were previously normalized. Robust fitting, using least absolute deviations rather than least squares, was used for the training process. To mitigate the tendency of 4LNN to over-fit the data, a squared method was chosen as the penalty function. JMP Pro 13 was also applied to performing the data analysis using conventional 3LNN, multiple regression analysis, ANOVA, and calculating other statistical quantities.

Results and Discussion

Single Correlation Analysis

The correlation matrix is summarized in Table 1. A fairly good correlation between dissolution data was observed in the early dissolution stage (r=0.963 for D5 vs. D10), and gradually weakened with the passage of dissolution time. Similarly, in the final dissolution stage, a strong correlation between D30 and D60 (r=0.971) was observed, but weakened with the passage of dissolution time. Solubility of the APIs (log Cs) had the greatest effect on the dissolution data. Though in the initial dissolution stage, the effect of log Cs was apparent (r=0.617 for D5), it gradually lessened with time (r=0.584 for D10, r=0.547 for D15, r=0.468 for D30, and r=0.402 for D60). Effects of the other physicochemical or powder properties on the dissolution data were weaker compared with log Cs.

Table 1. Correlation Matrix of Physicochemical and Powder Properties vs. Dissolution Data
mplog Cslog Swd50RwQD5D10D15D30
log Cs−0.251
log Sw0.272−0.455
d50−0.4380.428−0.628
Rw−0.358−0.130−0.028−0.338
Q00000
D5−0.0910.617−0.2060.312−0.227−0.312
D10−0.1530.584−0.2360.338−0.163−0.2610.963
D15−0.2360.547−0.2630.367−0.069−0.1950.8940.964
D30−0.3030.468−0.3020.3800.048−0.1480.7690.8740.950
D60−0.3320.402−0.3370.3760.137−0.1370.6600.7780.8720.971

Multiple Regression Analysis

Multiple regression analysis was applied to the dissolution data, D5 to D60. Regression models were optimized using the stepwise variable forward selection method, which gave the smallest values of the Bayesian Information Criterion (BIC).10) The optimal equations are as follows:   

(1)
  
(2)
  
(3)
  
(4)
  
(5)
where n is the number of samples, R2 is the coefficient of determination, Fo is the observed variance ratio, and the number in parentheses is the 95% confidence interval of regression coefficients. In the optimal regression equations for D5, D10 and D15, the log Cs and Q values were highly significant and were essential for predicting the early dissolution stage of the APIs. Greater log Cs and smaller Q values resulted in optimal dissolution rates (Eqs. 1–3). Q values likely functioned as phase separation factors since the single effect on the dissolution data was obscure (Table 1). Alternatively, factors such as d50, mp and Rw were chosen instead of Q values in the equations for predicting the last dissolution stage (D30 and D60). The log Cs value was consistently found in the equations for D5 to D60. ANOVA results suggested that all regression equations were highly significant (p<0.01), but their prediction ability was considerably poor, with the lowest prediction ability observed in the last dissolution stage. The effect of Rw was negative in the initial (Eq. 1), but positive in the last stage (Eq. 5). The d50 values acted as a positive factor in the last dissolution stage (Eqs. 4, 5), indicating that greater dissolution may be obtained with larger API particle sizes. These results are inconsistent with well-known physicochemical findings,11,12) suggesting uncertainty with Eqs. 4 and 5. The mp negatively affected the last dissolution stage (Eqs. 4, 5). With lower mp associated with weaker API crystal lattices.13) Though this factor was not significant in Eq. 5, it was retained in the stepwise regression method.

4LNN Analysis

Figure 1 shows the final optimal structure of 4LNN selected from 361 different 4LNNs. With regard to the activation functions, 2 Linear and 1 Gauss in the 1st hidden layer, and 1 tanH, 1 Linear and 1 Gauss in the 2nd hidden layer were selected in the optimal 4LNN. Cross-validation results are summarized in Table 2. Figure 2 shows relationships between the experimental dissolution data and 5-fold cross-validation results (Trial #5 in Table 2). The R2CV values denote R2 for each of the 5-fold cross-validation results of resampling the dissolution data into 80% training and 20% validation, with the resampling process sequentially repeated 5 times. Prediction ability of the 4LNN was superior to that of linear regression equations, though there were variations in the R2CV values. Relative standard deviations (RSDs, %) of R2CV values were less than 5% except in the case of D60, where the RSD was 16%. D60 values were observed to be 95% or greater in 70 samples (62.5% of the total 112 tested). Such small difference may have contributed to the variations in the predicted data of the 4LNN models.

Fig. 1. Optimal Structure of Four-Layered Artificial Neural Network
Table 2. R2CV Values Obtained from 5-Fold Cross-Validation Using the Optimal 4LNN
TrialDissolution data
D5D10D15D30D60
10.8660.9080.8900.9040.872
20.8300.8730.8650.8310.642
30.7940.8600.8650.8490.658
40.7840.8660.9150.9140.880
50.8320.8650.8580.8870.898
Mean0.8210.8740.8790.8770.790
S.D.0.0330.0190.0240.0360.128
RSD (%)4.022.172.734.1016.20
Fig. 2. Relationship between Experimental Dissolution Data and Cross-Validated Results with 4LNN (6/3/3/5; Trial #5 in Table 2)

Keys: a; D5, b; D10, c; D15, d; D30, e; D60.

4LNNs were compared with the conventional 3LNN method, which were composed of 6 causal factors in the input, 6 nodes in the hidden and 5 responses in the output layers (6/6/5). With regard to activation functions, 1 tanH, 3 Linear and 2 Gauss were used. These functions were selected using the optimal 4LNN structure as a reference (1 tanH, 1 Linear and 1 Gauss in the 1st hidden layer; 2 Liner and 1 Gauss in the 2nd hidden layer, i.e., 1 tanH, 3 Linear and 2 Gauss in total). Results are given in Table 3. The R2CV values observed with 3LNNs were lower than those of the 4LNNs except in the case of D15. Additionally, the RSD values of the R2CV were considerably larger than those in the 4LNNs. The number of unknown weights in 3LNN was 66 (6×6+6×5), compared to 42 (6×3+3×3+3×5) in the 4LNN analysis, suggesting that the internal structure of the 3LNN was more complex than the 4LNN. While 3LNN is more advantageous for training the test data, there is a tendency to over-fit, resulting in poor validation.14) On the contrary, in 4LNN, the 6 causal factors are simplified into 3 nodes in the 1st hidden layer. Properly integrated information will then be transferred to the 3 nodes in the 2nd hidden layer, and finally, the secured outcome is delivered in the output layer. This suggests that deeper hierarchical structures may be better for more robust prediction.7,1517) In this study, the number of sample tablets is still less so that prediction in the 4LNN models is limited in the range of this dataset. A wide range of data should be integrated, in order to accomplish universal and comprehensive predictions.

Table 3. R2CV Values Obtained from 5-Fold Cross-Validation Using the Referential 3LNN
TrialDissolution data
D5D10D15D30D60
10.9090.9710.9800.6820.362
20.8000.8860.8760.5820.463
30.6860.7990.8290.7450.459
40.8610.9700.9430.7830.555
50.8000.8550.9330.9670.890
Mean0.8110.8960.9120.7520.546
S.D.0.0840.0750.0600.1420.204
RSD (%)10.368.376.5818.8837.36

Sensitivity Analysis

The sensitivity of the dissolution data to each physicochemical and powder property of the APIs was analyzed.1820) The basic measure of sensitivity is the change in prediction accuracy of 4LNNs following the application of a leave-one-factor-out (LOFO) procedure. First, the R2CV-LOFO value was estimated by the 5-fold cross-validation using the LOFO data. Then, the sensitivity of the dissolution data to designated factors was defined as:   

(6)

When a factor contributing to the prediction accuracy is removed from the original data, the sensitivity should increase. If the sensitivity index defined by Eq. 6 is positive, the factor is a significant predictor of the dissolution data, thereby allowing for the identification of factors that are significant or insignificant to the dissolution of each API from tablets. Sensitivity was determined by R2CV-original and R2CV-LOFO for the mean of 5-trials. Results are summarized in Table 4.

Table 4. Sensitivity of Dissolution Data to Physicochemical and Powder Properties
Dissolution dataSensitivity (%)
mplog Cslog Swd50RwQ
D57.3323.251.562.53−2.0733.59
D108.0612.422.121.400.2328.92
D1510.466.681.911.45−1.3727.91
D302.602.683.212.02−2.2811.62
D607.303.121.621.27−11.012.37

In the initial dissolution stage (D5D15), Q, log Cs and mp values were significant factors. Powder properties such as log Sw and d50 were less effective, and Rw slightly disturbed 4LNN approximations except in the case of D10. It was rather serious in the prediction of D60 (Sensitivity=−11%), and then the 4LNN models without using Rw were trained. As a result, more robust validation was accomplished (Table 5, Fig. 3). Especially, D60 values were exceptionally well predicted. Thus, if variables in this tablet database needed to be restricted, molecular nature of the APIs would take precedence as it was more important for predicting dissolution data than powder properties.

Table 5. R2CV Values Obtained from 5-Fold Cross-Validation with 4LNN without Using Rw
TrialDissolution data
D5D10D15D30D60
10.8630.8790.9000.9050.889
20.8620.9040.9140.9140.902
30.8010.8290.8280.8530.860
40.8630.8790.9000.9050.889
50.8010.8710.9150.9010.842
Mean0.8380.8720.8910.8960.876
S.D.0.0340.0270.0360.0240.025
RSD (%)4.063.104.042.682.85
Fig. 3. Relationship between Experimental Dissolution Data and Cross-Validated Results with 4LNN without Using Rw (5/3/3/5; Trial #5 in Table 5)

Keys: a; D5, b; D10, c; D15, d; D30, e; D60.

Conclusion

The tablet database, in which dissolution data were implemented, was created in this study. This database was useful in determining reliable formulations with preferable dissolution profiles, as well as more superior basic characteristics including tensile strength and disintegration time. An excellent dissolution prediction model was obtained using 4LNN, which was appreciably better than conventional 3LNN models. Linear regression models resulted in very poor prediction of dissolution data. Further investigations should focus on identifying the effect of appropriate powder properties of APIs.

Acknowledgment

This study was supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI Grant Number JP17K08252. The authors are grateful to Ms. Saori Mitsumoto for her assistance in the experimental works.

Conflict of Interest

The authors declare no conflict of interest.

References
 
© 2017 The Pharmaceutical Society of Japan
feedback
Top