Journal of Computer Aided Chemistry

Application of Rule Mining to Quantative Structure-Activity Relationship Using Rough Set Theory

Kiyoshi Hasegawa, Michio Koyama, Masamoto Arakawa, Kimito Funatsu

2008 Volume 9 Pages 1-7
Published: 2008
Released on J-STAGE: January 12, 2008

DOI: https://doi.org/10.2751/jcac.9.1

JOURNAL FREE ACCESS

In this paper, we applied Rough Set Theory (RST) to Quantitative Structure-Activity Relationship (QSAR) analysis and validated its usefulness as a rule mining method. Inductive Logic Programming (ILP) is a well-known rule mining method, but its practical applications have been severely limited by the difficulty of preparing background knowledge in advance. RST is a new rule mining method originally developed in chemometrics. RST selects the minimal descriptor sets needed to discriminate one sample from the others. These descriptor sets are called reducts. RST constructs the possible rules for deriving high activity using a specific reduct. We used dihydrofolate reductase (DHFR) inhibitors as a validation set for RST. This data set has been thoroughly investigated in several studies, and the structural requirements for high activity are well known. The RST-based rules match these structural requirements well, which demonstrates the utility of RST. Given the success of this study, further application to data sets with more diverse compounds and noisier activity data is expected.
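
The idea of a reduct can be illustrated with a small brute-force sketch. It is not the authors' implementation: it uses a simplified, table-wide definition (a minimal set of descriptors under which no two samples with identical descriptor values carry different activity labels), and the descriptor columns and labels are invented toy data.

```python
from itertools import combinations

def is_consistent(rows, labels, attrs):
    """True if samples that agree on `attrs` never carry different labels."""
    seen = {}
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        if seen.setdefault(key, label) != label:
            return False
    return True

def find_reducts(rows, labels):
    """Return all minimal descriptor subsets that keep the table consistent."""
    n_attrs = len(rows[0])
    reducts = []
    # Visit subsets by increasing size so that smaller reducts are found first.
    for size in range(1, n_attrs + 1):
        for attrs in combinations(range(n_attrs), size):
            # A superset of a known reduct cannot be minimal.
            if any(set(r) <= set(attrs) for r in reducts):
                continue
            if is_consistent(rows, labels, attrs):
                reducts.append(attrs)
    return reducts

# Toy decision table: three hypothetical binary descriptors per compound.
rows = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (0, 0, 0)]
labels = ["high", "high", "low", "low"]

reducts = find_reducts(rows, labels)
print(reducts)
```

In this toy table, descriptor 0 alone separates the two activity classes, and so does the pair of descriptors 1 and 2, so both count as reducts. The exhaustive search is exponential in the number of descriptors and is only meant to show the definition.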

Parallelization of Crystal Calculation for Large-Scale Molecular Crystal Structure Analysis

Shigeaki Obata, Hitoshi Goto

2008 Volume 9 Pages 8-16
Published: 2008
Released on J-STAGE: January 19, 2008

DOI: https://doi.org/10.2751/jcac.9.8

JOURNAL FREE ACCESS

Prediction of crystal structures and properties based on computational chemistry is expected to become an efficient technology for developing new functional materials and designing drugs. Although prediction accuracy can be improved by expanding the size of the computational crystal model, practical performance is often restricted by the specifications of the available computers. In this paper, we propose a parallel distributed-computing technique for the large-scale molecular crystal calculation program CONFLEX/KESSHOU (CONFLEX/K), which combines our versatile computational chemistry tool CONFLEX with the crystal structure optimization method KESSHOU, and we report its performance. To protect the calculation accuracy against numerical error (loss of trailing digits) when adding the short- and long-range interaction energies that appear in large-scale crystal calculations, an improved Kahan summation algorithm was applied to part of the intermolecular calculations in CONFLEX/K. In performance tests of CONFLEX/K consisting of crystal structure optimizations of aspirin crystals with crystal radii of 200-1,000 Å, we confirmed a parallel efficiency of more than 90% in a parallel computing environment with up to 63 workers (cores). Analysis of how the crystal energy changes with the size of the computational crystal model showed that a highly accurate crystal calculation, to within 10^-3 kcal/mol, requires a computational crystal model with a crystal radius of more than 80 Å.
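
The loss of trailing digits mentioned above, and the way compensated summation avoids it, can be sketched as follows. The abstract does not say which improvement of Kahan's algorithm is used in CONFLEX/K, so this example assumes Neumaier's variant purely for illustration; the input values are artificial.

```python
def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for x in values:
        total += x
    return total

def neumaier_sum(values):
    """Compensated summation (Neumaier's variant of Kahan's algorithm)."""
    total = 0.0
    comp = 0.0  # running estimate of the digits lost so far
    for x in values:
        t = total + x
        if abs(total) >= abs(x):
            # Low-order digits of x were lost in the addition.
            comp += (total - t) + x
        else:
            # Low-order digits of total were lost in the addition.
            comp += (x - t) + total
        total = t
    return total + comp

# Two small terms are swamped by a pair of huge terms that cancel.
values = [1.0, 1e100, 1.0, -1e100]
print(naive_sum(values), neumaier_sum(values))
```

The naive loop loses both small terms, while the compensated loop recovers them. The same effect arises, on a much smaller scale, when many small long-range energy terms are added to large short-range ones.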

Molecular Mechanics and Molecular Orbital Simulations on The Specific Interactions between Lactose Repressor Protein and Its Inducer and Anti-Inducer Molecules

Shin Nishikawa, Shinsaku Kozakai, Yasuo Sengoku, Noriyuki Kurita

2008 Volume 9 Pages 17-29
Published: 2008
Released on J-STAGE: February 23, 2008

DOI: https://doi.org/10.2751/jcac.9.17

JOURNAL FREE ACCESS

To elucidate the specific interactions between the lactose repressor (LacR) and its inducer (IPTG) and anti-inducer (ONPF) ligands, we investigated the stable structures and electronic properties of the LacR + IPTG and LacR + ONPF complexes, including DNA and solvating water molecules, by molecular simulations based on classical molecular mechanics and semiempirical molecular orbital methods. The results showed that specific water molecules bridging the amino acids of the ligand-binding site of LacR and the ligand molecules are essential for the specific interactions between LacR and IPTG or ONPF. Moreover, the binding energy between the DNA-binding domain of LacR and DNA was found to be strongly affected by the binding of IPTG and ONPF to LacR.

Development of Fingerprint Verification Type Self-Organized Map Applied to Profiling Seized Methamphetamine

Rika Nishikiori, Yukiko Makino, Yukino Ochi, Noriyuki Yamashita, Kousu ...

2008 Volume 9 Pages 30-36
Published: 2008
Released on J-STAGE: March 20, 2008

DOI: https://doi.org/10.2751/jcac.9.30

JOURNAL FREE ACCESS

In a previous study {Takagi, T. et al., Chem. Pharm. Bull., 52(12), 1427-1432 (2004)}, we applied a slightly revised neural Independent Component Analysis (ICA) for profiling illegally distributed methamphetamine. Using ICA and an hourglass type Hierarchical Neural Network (HNN), we obtained better classification results than by using Principal Component Analysis (PCA), CATegorical PCA (CATPCA) and the MultiDimensional Scaling method (MDS). The HNN is a nonlinear machine learning method, and the ICA applied in that study exhibited nonlinear characteristics. The results indicated that nonlinear analysis is more efficient than linear analysis for profiling confiscated methamphetamine. Consequently, in this study, we applied Self-Organizing Maps (SOMs) to impurity profiling of methamphetamine.
While the SOM is currently a frequently used nonlinear classification method, the ordinary SOM uses only the information contained in the winner neuron for classification and neglects the information at the other grid points. We therefore attempted to utilize the information of the loser neurons simultaneously in order to avoid information loss. First, we visualized the resultant reference vectors using a contour map for each sample. Although considerable information can be compared visually using the SOM contour maps, metric comparisons are difficult. We therefore used MDS to construct a similarity matrix using the data of the resultant reference vectors to visualize metric data. To assess the results, we assumed that there are four synthetic routes (the Nagai, Leuckart, Emde and reductive amination methods) and that each of these can be identified by comparing route-specific impurities.
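
The difference between using only the winner neuron and using the whole map can be sketched as below. This is an illustrative reading of the abstract, not the published procedure: the reference vectors are fixed toy values rather than a trained map, and Euclidean distance between response profiles is an assumed choice of dissimilarity.

```python
import math

def response_profile(sample, codebook):
    """Distance from one sample to every reference vector (winner and losers)."""
    return [math.dist(sample, ref) for ref in codebook]

def winner(profile):
    """Index of the best-matching (closest) neuron."""
    return min(range(len(profile)), key=profile.__getitem__)

def profile_dissimilarity(p, q):
    """Compare two samples by their full response profiles."""
    return math.dist(p, q)

# Toy 2x2 map with fixed reference vectors (a real SOM would learn these).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
samples = {"a": (0.1, 0.1), "b": (0.4, 0.1), "c": (0.9, 0.9)}

profiles = {name: response_profile(x, codebook) for name, x in samples.items()}
for name, profile in profiles.items():
    print(name, "winner:", winner(profile))
print("a-b:", profile_dissimilarity(profiles["a"], profiles["b"]))
print("a-c:", profile_dissimilarity(profiles["a"], profiles["c"]))
```

Samples a and b share the same winner neuron, so a winner-only comparison cannot tell them apart, while their full profiles still differ. A matrix of such pairwise dissimilarities is the kind of input an MDS routine accepts.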

Predicting Rank of Japanese Green Teas by Derivative Profiles of Spectra Obtained from Fourier Transform Near-Infrared Reflectance Spectroscopy

Tatsuhiko Ikeda, Md. Altaf-Ul-Amin, Aziza Kawsar Parvin, Shigehiko K ...

2008 Volume 9 Pages 37-46
Published: 2008
Released on J-STAGE: April 16, 2008

DOI: https://doi.org/10.2751/jcac.9.37

JOURNAL FREE ACCESS

A rapid and easy method for extracting features from spectra obtained by Fourier transform near-infrared (FT-NIR) reflectance spectroscopy was examined, using the 1st and 2nd derivatives and Spearman's rank correlation. Because this method can select features from the entire wavelength range, it can be considered suitable for estimating food quality. As a practical test, a set of ranked green tea samples from a Japanese commercial tea contest was analyzed by FT-NIR in order to create a reliable quality-prediction model. The 2nd derivative was computed to reduce noise and amplify the fundamental features. Feature selection from the amplified data was performed using the relations between the tea ranks and the derivative coefficients. Finally, a reliable quality-prediction model for green tea was formulated using single linear regression and PLS regression. Furthermore, we discuss the possibility of using the derivative coefficients as a feature representation in FT-NIR.

Theoretical Study on Emission Spectra of Bioluminescent Luciferases by Fragment Molecular Orbital Method

Ayumu Tagami, Nobuhiro Ishibashi, Dai-ichiro Kato, Naoki Taguchi, Yuji ...

2008 Volume 9 Pages 47-54
Published: 2008
Released on J-STAGE: April 24, 2008

DOI: https://doi.org/10.2751/jcac.9.47

JOURNAL FREE ACCESS

The fragment molecular orbital (FMO) method, in which a molecule or a molecular cluster is divided into small fragments, enables ab initio calculations of large molecules such as proteins and DNA with chemical accuracy. In this study, we performed multilayer fragment molecular orbital (MLFMO) calculations for a firefly luciferase/oxyluciferin molecular system. We employed the wild-type (green-emitting) luciferase and three mutant (red- or orange-emitting) luciferases. The calculated emission energies for the four structures agreed well with experimental data. We also discuss the details of the calculations and the significance of environmental effects on the emission spectra.
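
How fragment results are recombined can be sketched with the standard two-body FMO energy expression, in which the total energy is the sum of the fragment (monomer) energies plus a pair correction for each fragment pair. The energies below are arbitrary toy numbers, not values from the paper, and the multilayer treatment used in the study is not shown.

```python
def fmo2_energy(monomer, dimer):
    """Two-body FMO total energy.

    E = sum_I E_I + sum_{I<J} (E_IJ - E_I - E_J)
    `monomer` maps fragment name -> energy; `dimer` maps (I, J) -> pair energy.
    """
    total = sum(monomer.values())
    for (i, j), e_ij in dimer.items():
        total += e_ij - monomer[i] - monomer[j]
    return total

# Arbitrary toy energies for three fragments and their three pairs.
monomer = {"A": -10.0, "B": -20.0, "C": -30.0}
dimer = {("A", "B"): -30.5, ("A", "C"): -40.25, ("B", "C"): -50.0}

print(fmo2_energy(monomer, dimer))
```

Each pair term measures how much the energy of two fragments computed together differs from the sum of their separate energies, so distant, non-interacting pairs contribute nothing.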

Automatic Generation of Structure of Phospholipids

Hisayuki Horai, Takaaki Nishioka

2008 Volume 9 Pages 55-61
Published: 2008
Released on J-STAGE: August 28, 2008

DOI: https://doi.org/10.2751/jcac.9.55

JOURNAL FREE ACCESS

An algorithm and a tool for the automatic generation of phospholipid structures are proposed. The input is a compact, systematic representation of the variable part of a phospholipid. The output is the structure of the phospholipid in the MDL Molfile format. The output molfile describes not only the topological connectivity of the atoms but also the 2D coordinates of each atom, so that the structure can be drawn without any overlap. The phospholipids covered by the tool include glycerophospholipids (phosphatidylcholines, phosphatidylethanolamines, phosphatidylglycerols, phosphatidylinositols and phosphatidylserines) and sphingophospholipids, with chains of arbitrary length and an arbitrary number of double bonds at arbitrary positions and with arbitrary cis/trans isomerism.
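
The first step of such a generator, turning a compact chain description into explicit bonds, might look like the sketch below. The abstract does not give the tool's input syntax, so the shorthand used here (for example `18:2(9Z,12Z)` for 18 carbons with cis double bonds at positions 9 and 12) is an assumed notation, and the function names are hypothetical. Writing the Molfile and laying out 2D coordinates are not attempted.

```python
import re

# Assumed shorthand: "<carbons>:<double bonds>(<position><Z|E>,...)".
CHAIN_RE = re.compile(r"^(\d+):(\d+)(?:\(([^)]*)\))?$")

def parse_chain(text):
    """Parse the assumed shorthand into (carbon count, [(position, geometry)])."""
    match = CHAIN_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized chain: {text!r}")
    n_carbons, n_double = int(match.group(1)), int(match.group(2))
    double_bonds = []
    if match.group(3):
        for token in match.group(3).split(","):
            item = re.fullmatch(r"(\d+)([ZE])", token.strip())
            if item is None:
                raise ValueError(f"unrecognized double bond: {token!r}")
            double_bonds.append((int(item.group(1)), item.group(2)))
    if len(double_bonds) != n_double:
        raise ValueError("double-bond count does not match the listed positions")
    return n_carbons, double_bonds

def bond_orders(n_carbons, double_bonds):
    """Bond order of each C-C bond; entry i joins carbon i+1 and carbon i+2."""
    orders = [1] * (n_carbons - 1)
    for position, _geometry in double_bonds:
        orders[position - 1] = 2
    return orders

n_carbons, double_bonds = parse_chain("18:2(9Z,12Z)")
print(n_carbons, double_bonds, bond_orders(n_carbons, double_bonds))
```

A full generator would attach such chains to a head-group template and use the Z/E flags when assigning 2D coordinates.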

Monte Carlo Simulation Using Quantum Mechanical Calculations (QM/MC Simulation). An Application to Alkaline Hydrolysis of Methylacetate

Toru Yamaguchi, Michinori Sumimoto, Kenzi Hori

2008 Volume 9 Pages 62-69
Published: 2008
Released on J-STAGE: September 03, 2008

DOI: https://doi.org/10.2751/jcac.9.62

JOURNAL FREE ACCESS

Although chemical reactions can be analyzed in detail using molecular orbital (MO) and density functional theory (DFT) calculations, the results simulate reactions at 0 K in a vacuum. Organic reactions usually proceed in solvents such as water, acetonitrile and alcohols. To simulate reactions in solution, it is necessary to investigate the mechanisms including solvent effects. SCRF calculations have been used for this purpose, but the method treats the solvent as a simple dielectric constant, so it cannot be used to analyze the role of each solvent molecule in the reaction. Molecular dynamics (MD) calculations and Monte Carlo (MC) simulations have been used to calculate differences in solvation free energy. These methods usually use classical force fields, and it is very difficult to obtain good parameters for the organic solvents used in organic synthesis. We have been developing Monte Carlo simulations using quantum mechanical calculations, called QM/MC simulations. This approach makes it possible to analyze solvent effects from a quantum chemical viewpoint. As an example, we studied the alkaline hydrolysis of methyl acetate. A combination of ab initio calculations at the MP2/6-31++G** level of theory, for analyzing the reaction mechanism in a vacuum, and MC simulations using the PM3 method produced results highly consistent with experiment.
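
The structure of a Monte Carlo simulation whose energies come from an external calculation can be sketched with the Metropolis rule. This is a generic sketch, not the authors' QM/MC code: `toy_energy` is a one-dimensional harmonic placeholder standing in for a semiempirical (for example PM3) energy evaluation of a solute-solvent configuration, and the step size, temperature factor and seed are arbitrary.

```python
import math
import random

def metropolis_accept(delta_e, kT, u):
    """Accept downhill moves always, uphill moves with probability exp(-dE/kT)."""
    return delta_e <= 0.0 or u < math.exp(-delta_e / kT)

def run_mc(energy, x0, n_steps, max_step, kT, rng):
    """Metropolis sampling of a single coordinate; returns the energy trace."""
    x, e = x0, energy(x0)
    energies = []
    for _ in range(n_steps):
        trial = x + rng.uniform(-max_step, max_step)
        e_trial = energy(trial)
        if metropolis_accept(e_trial - e, kT, rng.random()):
            x, e = trial, e_trial
        energies.append(e)  # a rejected move repeats the current state
    return energies

def toy_energy(x):
    # Placeholder for a quantum mechanical energy call on a configuration.
    return 0.5 * x * x

energies = run_mc(toy_energy, x0=2.0, n_steps=5000, max_step=1.0,
                  kT=0.6, rng=random.Random(0))
print(sum(energies) / len(energies))
```

Because the sampler only needs an energy for each trial configuration, the classical force field can be replaced by a quantum mechanical calculation without changing the loop. The cost per step then dominates, which is one reason a fast semiempirical method is attractive for the sampling stage.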

Development of Drug-likeness Model and Its Visualization

Masamoto Arakawa, Tomoyuki Miyao, Kimito Funatsu

2008 Volume 9 Pages 70-80
Published: 2008
Released on J-STAGE: October 11, 2008

DOI: https://doi.org/10.2751/jcac.9.70

JOURNAL FREE ACCESS

For rational drug design, it is important to estimate the drug-likeness of drug candidates at an early stage of the development process. In this study, we developed a statistical model for estimating drug-likeness. Using drug and non-drug structures taken from the Comprehensive Medicinal Chemistry (CMC) and Available Chemicals Directory (ACD) databases, a support vector machine (SVM) model was constructed with structural descriptors from the Dragon software. We carried out a grid search to optimize the SVM model, using 5-fold cross-validated accuracy as the criterion. As a result, a predictive SVM model was obtained, with an accuracy of 87.65% for the training set and 79.40% for the test set. We then visualized the SVM model using generative topographic mapping (GTM) and a self-organizing map (SOM). For appropriate non-linear mapping, it is important to balance the accuracy and the smoothness of the mapping. However, no criterion for the smoothness of a non-linear mapping was available. We therefore proposed a novel criterion to measure the non-linearity of a mapping, the root mean squared error of midpoint (RMSEM), and compared GTM and SOM.
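
The model-selection loop described above (a grid search scored by 5-fold cross-validated accuracy) can be sketched without any external library. To keep the example dependency-free, a tiny one-dimensional k-nearest-neighbour classifier is swapped in for the SVM, so the hyperparameter being tuned is `n_neighbors` rather than SVM parameters. The data are toy values, and the folds are contiguous and unshuffled for simplicity.

```python
from itertools import product

def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

def cv_accuracy(X, y, make_predict, params, k=5):
    """Fraction of samples predicted correctly when each is held out once."""
    correct = 0
    for train, test in k_fold_indices(len(X), k):
        predict = make_predict([X[i] for i in train], [y[i] for i in train], **params)
        correct += sum(predict(X[i]) == y[i] for i in test)
    return correct / len(X)

def grid_search(X, y, make_predict, grid, k=5):
    """Try every parameter combination; keep the first one with the best score."""
    names = sorted(grid)
    best = None
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        acc = cv_accuracy(X, y, make_predict, params, k)
        if best is None or acc > best[0]:
            best = (acc, params)
    return best

def make_knn(train_X, train_y, n_neighbors):
    """Stand-in classifier (not an SVM): majority vote of the nearest points."""
    def predict(x):
        nearest = sorted(range(len(train_X)), key=lambda i: abs(train_X[i] - x))
        votes = sum(train_y[i] for i in nearest[:n_neighbors])
        return int(2 * votes > n_neighbors)
    return predict

# Toy one-descriptor data set: label 1 = "drug", label 0 = "non-drug".
X = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.15, 0.85, 0.25, 0.75]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

best = grid_search(X, y, make_knn, {"n_neighbors": [1, 3]})
print(best)
```

After selection, the chosen settings would be refitted on the whole training set and evaluated once on a separate test set, which is where a gap like the one between the reported training and test accuracies becomes visible.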

Development of Evaluation Model for Strategic Sites in Synthetic Route Design System AIPHOS

Akio Tanaka, Takashi Kawai, Tsutomu Matsumoto, Tetsuhiko Takabatake, H ...

2008 Volume 9 Pages 81-91
Published: 2008
Released on J-STAGE: November 27, 2008

DOI: https://doi.org/10.2751/jcac.9.81

JOURNAL FREE ACCESS

An evaluation technique was developed to prioritize strategic sites (sets of bonds that are cut and connected to make precursors) for effective retrosynthesis in the synthetic route design system AIPHOS. In this paper, we discuss the relationship between the strategic sites proposed by AIPHOS and the reaction centers in reaction databases. The correlation was analyzed by logistic regression analysis (LoRA) with molecular centrality, bond dissociation energy (BDE), and the number of bonds as variables. We used the resulting equation to clarify the importance of the strategic sites. The correlation models were highly similar across three reaction databases. In addition, the models indicated that reaction centers in the reaction databases tend to be located at the center of the whole structure, to have fewer bonds, and to have smaller bond dissociation energies.
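
How a fitted logistic regression model could be used to rank candidate strategic sites is sketched below. The coefficients are invented for illustration and are not the published model; only their signs follow the trends reported in the abstract. The sketch also assumes that a larger centrality value means a more central position, and the two candidate sites are hypothetical.

```python
import math

def site_probability(centrality, bde, n_bonds, weights):
    """Logistic model: p = 1 / (1 + exp(-(b0 + b1*c + b2*BDE + b3*n)))."""
    z = (weights["intercept"]
         + weights["centrality"] * centrality
         + weights["bde"] * bde
         + weights["n_bonds"] * n_bonds)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients: central sites, weak bonds and few bonds score higher.
weights = {"intercept": 0.0, "centrality": 2.0, "bde": -0.01, "n_bonds": -0.5}

# Hypothetical candidate sites: (centrality, BDE in kcal/mol, number of bonds).
sites = {"A": (0.9, 80.0, 1), "B": (0.2, 110.0, 3)}

scores = {name: site_probability(*desc, weights) for name, desc in sites.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

Sorting candidates by this probability gives a priority order in which the route design system could explore the proposed disconnections.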
