The fragment molecular orbital (FMO) method has recently attracted much attention as an electronic state calculation method for macromolecular systems. As the size of a molecule grows, dimer calculations account for an increasing share of the total computing time in the FMO method (O(N^3)). To overcome this difficulty, the dimer-es approximation was developed, in which the dimer SCF calculation between distant monomers is replaced by the electrostatic interaction between the fragments. This approximation dramatically reduces the computing time for large molecules without loss of accuracy (O(N^2)). However, because the dimer-es approximation requires the evaluation of four-center Coulomb integrals, it becomes one of the bottlenecks for a large molecule. To speed up the Coulomb integral evaluation, the continuous multipole method (CMM) was implemented in the FMO program ABINIT-MPX, and the accuracy and computing time of the calculation are investigated in this article.
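The idea behind replacing pairwise Coulomb terms with a multipole expansion can be illustrated with a minimal sketch. It is a point-charge analogue only, not the CMM as implemented in ABINIT-MPX, which works on continuous charge distributions. The function names, the truncation at the dipole level (quadrupole and higher terms are omitted), and the use of atomic units are all assumptions made for illustration.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def exact_coulomb(cluster_a, cluster_b):
    """Direct sum over all charge pairs; cost grows as len(a) * len(b)."""
    total = 0.0
    for qa, ra in cluster_a:
        for qb, rb in cluster_b:
            d = sub(ra, rb)
            total += qa * qb / math.sqrt(dot(d, d))
    return total

def moments(cluster, center):
    """Total charge and dipole moment of a cluster about `center`."""
    charge = sum(q for q, _ in cluster)
    dipole = [0.0, 0.0, 0.0]
    for q, r in cluster:
        d = sub(r, center)
        for k in range(3):
            dipole[k] += q * d[k]
    return charge, tuple(dipole)

def multipole_coulomb(cluster_a, center_a, cluster_b, center_b):
    """Interaction energy from charge and dipole moments only (hypothetical helper).

    Each cluster is summarized once, so the cost grows as len(a) + len(b).
    """
    qa, mu_a = moments(cluster_a, center_a)
    qb, mu_b = moments(cluster_b, center_b)
    sep = sub(center_b, center_a)
    dist = math.sqrt(dot(sep, sep))
    n = tuple(c / dist for c in sep)  # unit vector from A to B
    charge_charge = qa * qb / dist
    charge_dipole = (qb * dot(mu_a, n) - qa * dot(mu_b, n)) / dist ** 2
    dipole_dipole = (dot(mu_a, mu_b) - 3.0 * dot(mu_a, n) * dot(mu_b, n)) / dist ** 3
    return charge_charge + charge_dipole + dipole_dipole

def make_dipole(center_z):
    """Two opposite unit charges, one length unit apart, aligned with the z axis."""
    return [(+1.0, (0.0, 0.0, center_z + 0.5)), (-1.0, (0.0, 0.0, center_z - 0.5))]

cluster_a = make_dipole(0.0)
cluster_b = make_dipole(20.0)
exact = exact_coulomb(cluster_a, cluster_b)
approx = multipole_coulomb(cluster_a, (0.0, 0.0, 0.0), cluster_b, (0.0, 0.0, 20.0))
relative_error = abs(approx - exact) / abs(exact)
```

For well-separated clusters the truncated expansion stays close to the direct sum, and the error shrinks as the separation grows, which is why such expansions are applied to distant pairs only.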
In chemical plants, soft sensors are widely used to estimate difficult-to-measure process variables online. The predictive accuracy of soft sensors decreases as the state of a chemical plant changes, and soft sensor models based on time difference (TD) have been constructed to reduce the effects of age-related deterioration such as drift. However, the details of TD-based models (TD models) remain to be clarified. In this study, therefore, TD models were discussed in terms of noise and variance in data, autocorrelation in process variables, degree of model accuracy, and so on. We then theoretically clarified and formulated the difference in predictive accuracy between normal models and TD models. The relationships and formulas for TD were verified through the analysis of simulation data. Furthermore, we analyzed dynamic simulation data that included both observed and unobserved disturbances, and confirmed that the predictive accuracy of TD models increased when the TD interval was set appropriately.
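The contrast between a normal model and a TD model can be sketched on synthetic data. The example below is an illustration under stated assumptions, not the formulation or data of the study: the process (y = 2x plus a slow linear drift and Gaussian noise), the one-variable least-squares fit, and the interval of one step are all made up. The TD model is fitted on differences and predicts y(t) as the measured y(t - interval) plus the modeled change.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(pred, true):
    return (sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)) ** 0.5

# Synthetic process (assumed): y depends on x, plus a slow drift and noise.
rng = random.Random(0)
n = 400
x = [rng.uniform(0.0, 10.0) for _ in range(n)]
y = [2.0 * x[t] + 0.05 * t + rng.gauss(0.0, 0.1) for t in range(n)]

split = 200     # first half for training, second half for testing
interval = 1    # TD interval in time steps (assumed)

# Normal model: regress y on x directly.
a, b = fit_line(x[:split], y[:split])
pred_normal = [a * x[t] + b for t in range(split, n)]

# TD model: regress the change in y on the change in x.
dx = [x[t] - x[t - interval] for t in range(interval, split)]
dy = [y[t] - y[t - interval] for t in range(interval, split)]
a_td, b_td = fit_line(dx, dy)
pred_td = [
    y[t - interval] + a_td * (x[t] - x[t - interval]) + b_td
    for t in range(split, n)
]

rmse_normal = rmse(pred_normal, y[split:])
rmse_td = rmse(pred_td, y[split:])
```

In this sketch the normal model cannot follow the drift that continues after training, while the TD model cancels most of it by differencing. The TD prediction does require a measured value of y from one interval earlier, and it carries the noise of two measurements, which is one reason the choice of interval matters.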
We are developing a classification model for predicting the mutagenicity of diverse organic compounds. We have proposed an ensemble model in which many support vector machine models are constructed and integrated to predict mutagenicity. This model predicted mutagenicity with an accuracy of 79.6%. However, the prediction results suggested that the database contained some incorrect data. Therefore, in this study, the Ames test was carried out for some suspicious compounds. First, an ensemble model was constructed using the dataset assembled by Hansen et al. Then the Ames test was carried out for five suspicious compounds that were registered in the database as negative. As a result, three of the five compounds were judged to be positive. This suggests that the database includes some incorrect data and that our model can find such compounds efficiently.
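One way an ensemble can point to suspicious database entries is to flag compounds recorded as negative that most ensemble members predict as positive. The sketch below illustrates only that voting idea. To stay dependency-free it swaps the support vector machines for a nearest-centroid classifier trained on bootstrap samples, and the two-dimensional descriptors, the vote threshold, and all function names are hypothetical; the flagging rule used in the study is not stated in the abstract.

```python
import random

def centroid(points):
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def train_base_model(X, y, rng):
    """Nearest-centroid model on a stratified bootstrap sample (stand-in for an SVM)."""
    centroids = {}
    for label in (0, 1):
        members = [x for x, lab in zip(X, y) if lab == label]
        sample = [rng.choice(members) for _ in members]
        centroids[label] = centroid(sample)
    return centroids

def predict(model, x):
    return 1 if sq_dist(x, model[1]) < sq_dist(x, model[0]) else 0

def flag_suspicious_negatives(X, y, n_models=25, threshold=0.9, seed=0):
    """Indices recorded as negative that at least `threshold` of the models call positive."""
    rng = random.Random(seed)
    models = [train_base_model(X, y, rng) for _ in range(n_models)]
    flagged = []
    for i, (x, label) in enumerate(zip(X, y)):
        positive_fraction = sum(predict(m, x) for m in models) / n_models
        if label == 0 and positive_fraction >= threshold:
            flagged.append(i)
    return flagged

# Synthetic descriptors (assumed): negatives near (0, 0), positives near (3, 3).
data_rng = random.Random(1)
negatives = [(data_rng.gauss(0.0, 0.5), data_rng.gauss(0.0, 0.5)) for _ in range(40)]
positives = [(data_rng.gauss(3.0, 0.5), data_rng.gauss(3.0, 0.5)) for _ in range(40)]
X = negatives + [(3.0, 3.0)] + positives
y = [0] * 40 + [0] + [1] * 40  # index 40 looks positive but is recorded as negative
flagged = flag_suspicious_negatives(X, y)
```

Flagged entries are only candidates for experimental re-examination, as with the Ames tests described above. In practice the votes would more likely come from models that did not see the compound during training, for example through cross-validation.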
In this article, we introduce two elementary techniques in drug design: structure generators and chemical space visualization. A structure generator is used in lead generation and might be helpful for structure hopping. We focus on one type of structure generator based on QSAR (quantitative structure-activity relationship), the so-called inverse QSAR approach. The objective of the inverse QSAR approach is to propose chemical structures whose biological activities are predicted to be high by a QSAR model. Chemical space visualization is another important technique, used in lead optimization. It might serve as a good compass that shows where synthesized compounds lie in chemical space, or indicate the extent of the chemical effort necessary to achieve lead optimization. Visualization is also useful for understanding molecular selectivity against multiple target proteins. Because molecules with biological activities against multiple target proteins may cause many unfavorable side effects and toxicities, chemical space visualization is of great value from the perspective of safety. These two elementary techniques, structure generators and chemical space visualization, are briefly reviewed, including our own studies.
A membrane bioreactor (MBR) is a system that filters polluted water such as factory wastewater and sewage. Activated sludge is used to remove organic substances metabolically, and the sludge is then filtered through a membrane driven by transmembrane pressure (TMP). Since the MBR can treat water in a short time and saves space, the distributed installation and unmanned operation of MBRs in buildings, factories, and so on has attracted much attention as a solution to water shortage. However, the rise in TMP caused by the accumulation of foulants on the membrane is one of the biggest problems. The membrane needs to be washed when TMP reaches a certain level. The focus of this study is to estimate TMP with statistical models and thereby determine when membrane washing will become necessary. Two types of statistical models were constructed between explanatory variables related to fouling and an objective variable, i.e., membrane resistance (R) or the deposition rate of foulants on the membrane (DR). Partial least squares (PLS) and support vector regression (SVR) were employed to construct each model. TMP can then be predicted because R or DR can be converted into TMP. In TMP prediction with real industrial data, using DR as the objective variable together with the SVR method improved the accuracy of TMP prediction.
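The abstract does not state how R is converted into TMP. A common choice, assumed here, is Darcy's law, TMP = viscosity x flux x R. The sketch below applies that relation to a series of predicted resistances and reports the first step at which TMP reaches a washing threshold. The numerical values, the threshold, and the function names are illustrative assumptions, and the conversion from DR is not covered.

```python
import math

def tmp_from_resistance(resistance, flux, viscosity):
    """Darcy's law (assumed): TMP [Pa] = viscosity [Pa s] * flux [m/s] * resistance [1/m]."""
    return viscosity * flux * resistance

def first_wash_index(resistances, flux, viscosity, tmp_limit):
    """Index of the first predicted step whose TMP reaches tmp_limit, or None."""
    for i, r in enumerate(resistances):
        if tmp_from_resistance(r, flux, viscosity) >= tmp_limit:
            return i
    return None

# Illustrative values only (not taken from the study).
viscosity = 1.0e-3                               # Pa s, roughly water at 20 degrees C
flux = 1.0e-5                                    # m/s, i.e. 36 L per m^2 per hour
predicted_r = [1.0e12, 1.5e12, 2.0e12, 3.0e12]   # 1/m, e.g. outputs of a PLS or SVR model

tmp_series = [tmp_from_resistance(r, flux, viscosity) for r in predicted_r]
wash_at = first_wash_index(predicted_r, flux, viscosity, tmp_limit=2.5e4)
```

With these numbers the predicted TMP rises from about 10 kPa to about 30 kPa, and the assumed 25 kPa threshold is first reached at the last step.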