Proceedings of the Symposium on Chemoinformatics
41st Symposium on Chemoinformatics, Kumamoto
Program
  • Pages 1-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
Invited Lecture
  • Ryoji Asahi, Ryosuke Jinnouchi, Kazutoshi Miwa
    Pages 1T01-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Industry has developed along with the discovery and dramatic improvement of functional materials. However, research and development of materials can take huge resources and a long time. To accelerate on-demand materials development, we have developed a machine learning algorithm with DFT accuracy that enables high-throughput simulations of materials models of practical size. DFT data sets for simple models are stored in a database, which is used to predict energies and forces in a practical model through similarity kernels, such as Gaussian or polynomial functions of the power spectrum. The regression coefficients are determined to reproduce the DFT training data using a Bayesian linear regression method. Applications to the catalytic activity of nanoparticles and to diffusion properties in solid-state ionic conductors demonstrate that the present data-driven method is promising for predicting chemical reactions and transport properties, which are not easily determined with DFT calculations alone, and thus for designing a variety of functional materials.
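As an illustration of the regression step described in this abstract, the following is a minimal sketch of Bayesian linear regression on Gaussian similarity features. It assumes a zero-mean isotropic Gaussian prior with precision `alpha` and a known noise precision `beta`. The function names, the one-dimensional toy data, and the hyperparameter values are illustrative assumptions, not the authors' implementation, which works on power-spectrum descriptors of atomic environments.

```python
import numpy as np


def gaussian_kernel(X, refs, sigma):
    """Gaussian similarity between every row of X and every reference row."""
    sq_dist = ((X[:, None, :] - refs[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def fit_bayesian_linear(Phi, y, alpha, beta):
    """Posterior over weights w for y = Phi @ w + noise.

    Prior: w ~ N(0, I / alpha); noise precision: beta (both assumed known).
    """
    A = alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi
    cov = np.linalg.inv(A)
    mean = beta * cov @ Phi.T @ y
    return mean, cov


def predict(Phi_new, mean, cov, beta):
    """Predictive mean and variance at new feature rows."""
    mu = Phi_new @ mean
    var = 1.0 / beta + np.einsum("ij,jk,ik->i", Phi_new, cov, Phi_new)
    return mu, var


# Toy stand-in for descriptor -> energy data (synthetic, not DFT results).
X_train = np.linspace(0.0, 1.0, 20)[:, None]
y_train = np.sin(2.0 * np.pi * X_train[:, 0])
refs = np.linspace(0.0, 1.0, 8)[:, None]  # hypothetical reference environments

Phi = gaussian_kernel(X_train, refs, sigma=0.15)
w_mean, w_cov = fit_bayesian_linear(Phi, y_train, alpha=1e-3, beta=100.0)
mu, var = predict(Phi, w_mean, w_cov, beta=100.0)
```

The predictive variance returned by `predict` is one reason a Bayesian treatment can be attractive here: it gives a rough signal of when a new structure lies far from the stored reference data.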
  • [in Japanese]
    Pages 2T01-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
Special Lecture
  • Masatoshi Hagiwara
    Pages 1S01-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Patients with congenital diseases have abnormalities in their chromosomes and/or genes. It has therefore been assumed that drug treatment can do little for these patients beyond temporarily relieving each symptom as it arises. Although we cannot normalize their chromosomes and genes with chemical drugs, we may be able to manipulate the amounts and patterns of mRNAs transcribed from patients' DNA with small chemical compounds. Based on this simple idea, we searched for chemical compounds applicable to human diseases by targeting the CDK, CLK, and DYRK kinase families, which are involved in the regulation of gene expression, and eventually succeeded in finding FIT039, TG003, and ALGERNON as potential therapeutic drugs for diseases such as viral infections, Duchenne muscular dystrophy, and Down syndrome, respectively. In addition, we established a splicing reporter assay for disease genes with dual color (SPREADD) using a segment of pathogenic genes, and by SPREADD screening found a splicing modulator, RECTAS, which can rectify the aberrant IKBKAP splicing in fibroblasts from familial dysautonomia patients. Our chemical therapeutics are applicable to other congenital diseases such as Fabry disease and cystic fibrosis.
  • Yoshihiro Asai
    Pages 1S02-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Materials informatics (MI) research, which predicts and designs the chemical compositions of materials for desired application functions, has received enormous attention owing to rapid progress in machine learning and artificial intelligence methods. National MI projects have been launched worldwide. In particular, for the stress-strain problem concerning the mechanical functions of structural materials such as metals and alloys, a reliable regression model has been proposed by the German MI projects. On the other hand, MI for electronic and optical functions in nanoelectronics and functional materials, as well as MI for catalysis, is still “yet to come”. In this talk, I will discuss the scope of MI for these problems, with special emphasis on more intensive use of computer simulation, which may be indispensable given the inherent problems in experimental data. For this to happen, substantial progress in computer simulations that directly predict materials functions from composition and structure information is highly desired; successful examples from nanoelectronics research will be discussed, with a short introduction to the basic theories behind them.
Oral Session
  • Sayaka Mizutani, Edward Pauwels, Veronique Stoven, Susumu Goto, Yoshih ...
    Pages 1A01-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Identifying the mode-of-action of drug side effects is an important issue in drug discovery. It is necessary to analyze the association between drug-protein interactions (molecular scale) and side effects (phenotypic scale). We propose a new method for large-scale analysis of targeted proteins and side effects, using sparse canonical correlation analysis on the co-occurrence of drugs in target protein profiles and side effect profiles. The proposed method enables us to make a biologically relevant interpretation regarding the relationship between drug–targeted proteins and side effects. We performed pathway enrichment analyses based on biological pathways in the KEGG Pathway database. It was observed that most of the correlated sets tended to be significantly enriched with target proteins that are involved in the same biological pathways, even if the molecular functions of those proteins are different. The proposed method is expected to be useful for predicting potential side effect profiles of drugs or new drug candidate compounds based on their target protein profiles.
  • Kenji Hori
    Pages 1A02-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    We are running a CREST project that consists of three subprojects: the first builds a very large scale library (VLSVL) of drug candidate molecules; the second creates a prediction model that screens candidates in the VLSVL and picks out promising molecules; the third is our project. We are constructing databases, called TSDB and QMRDB, which are used to analyze reaction mechanisms so that many candidate molecules can be synthesized in a short time, because the existence of a transition state is the key to whether a reaction proceeds. For this purpose, we developed a cloud system that manages the databases as well as theoretical calculations. Two programs were also created: one is an interface between the cloud system and Windows terminals, and the other makes input files for Gaussian calculations based on TSDB search results. In the present talk, we will summarize the TSDB system and show some results of reaction mechanism analyses for synthesizing drug candidate molecules that inhibit the PME-1 protein.
  • Shigehiko Kanaya, Aki Morita, Minako Ohashi, Ryohei Eguchi, Altaf-Ul- ...
    Pages 1A03-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    We have developed a database focused on secondary metabolites, designated the KNApSAcK family DB. It comprises 51,086 secondary metabolites and 114,238 species-metabolite relationships. In the present study, we report a chemical structure similarity search in the KNApSAcK Core DB. The similarity search algorithm, COMPLIG, was developed by Shirai and colleagues. In this algorithm, a target compound and a retrieved compound are compared in terms of three-dimensional structural similarity, and the final score is the ratio of the number of common bonds and atoms to the number of bonds and atoms in the smaller of the two molecules. The similarity search makes it possible to obtain information on candidate activities at the level of chemical structure.
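The final ratio described in this abstract can be written down in a few lines. The sketch below covers only that last step: it assumes the numbers of common atoms and bonds have already been obtained from a separate structure-matching step, which is not shown. The function name `overlap_score` and the tuple layout are illustrative assumptions, not part of COMPLIG itself.

```python
def overlap_score(common_atoms, common_bonds, mol_a, mol_b):
    """Ratio of shared atoms+bonds to the atoms+bonds of the smaller molecule.

    mol_a and mol_b are (n_atoms, n_bonds) tuples. The common counts are
    assumed to come from a separate matching step that is not shown here.
    """
    smaller = min(sum(mol_a), sum(mol_b))
    if smaller == 0:
        raise ValueError("molecules must contain at least one atom")
    return (common_atoms + common_bonds) / smaller


# A six-membered ring (6 atoms, 6 bonds) fully contained in a larger molecule.
score_full = overlap_score(6, 6, (6, 6), (12, 13))
# Only a three-atom chain (3 atoms, 2 bonds) in common.
score_partial = overlap_score(3, 2, (6, 6), (10, 10))
```

Normalizing by the smaller molecule means that a small query fully embedded in a large hit scores 1, so the score behaves like a substructure measure.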
  • Rumiko Tanaka, Shin-ichi Nakayama
    Pages 1A04-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Chemical substance names in patent publications are written in various ways, and the form of a name depends on the author. Such variation hinders information sharing, so automatic extraction of chemical substance names is useful. To extract the names of chemical substances in Japanese, we created a corpus tagged with chemical substance names. Next, we studied cutting out words from sentences, concatenating the cut-out words, and taking out only chemical substance names from them. We also compared the selection of chemical substance names with that of functional group names, which resemble chemical substance names.
  • Takahiro Takahashi, Ryosuke Tsuchiya, Masamoto Arakawa
    Pages 1A05-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    To automate research and development for chemical vapor deposition (CVD) processes, we developed an automatic experimental design system. The system uses multi-objective optimization methods to propose the optimal experimental conditions for identifying the reaction mechanism (reaction model), which indicates the reaction paths both in the gas phase and on the surface from the reactant (source gas) to the product (film). In addition, the optimal experimental conditions, obtained as Pareto solutions, were rearranged using a cluster analysis method.
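The notion of Pareto solutions used in this abstract can be made concrete with a small filter. This is a generic sketch that assumes every objective is to be minimized; the objective values are made-up placeholders, and neither the actual objectives nor the optimizer used by the authors are given in the abstract.

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    A point q dominates p if q is no worse in every objective and strictly
    better in at least one (all objectives are minimized here).
    """
    front = []
    for i, p in enumerate(points):
        dominated = any(
            all(qk <= pk for qk, pk in zip(q, p))
            and any(qk < pk for qk, pk in zip(q, p))
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical (objective_1, objective_2) values for four candidate conditions.
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = pareto_front(candidates)  # -> [(1, 5), (2, 2), (5, 1)]
```

The quadratic scan is adequate for a few hundred candidates; the clustering of the resulting front mentioned in the abstract would be a separate step.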
  • Tomoyuki Miyao, Kimito Funatsu
    Pages 1A06-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    For ligand-based virtual screening (VS) using three-dimensional information on query molecules, experimentally determined binding modes of the query molecules are thought to give better performance in selecting active compounds. Furthermore, with two-dimensional molecular representations, using an ensemble of compounds similar to a query active compound has given better screening performance than using the single query compound. In this study, we investigated the importance of query molecule conformations and the degree of VS performance enhancement when using an ensemble of queries with three-dimensional molecular representations. Our results indicate that conformations are not so important for ligand-based VS and that using an ensemble of compounds similar to a query compound is a reasonable strategy for VS with three-dimensional molecular representations.
  • Yousuke Katsuda, Maimi Inoue, Takuto Kamura, Yusuke Kitamura, Masaki H ...
    Pages 1A07-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    It is becoming clear that RNA forms G-quadruplex structures and that these structures regulate protein translation in vitro and in cells. As a new direction in the field, researchers have recently tried to identify the “gene” whose mRNA forms RNA G-quadruplex structures (RGq) and the “position” of those structures. Our group discovered a compound with high selectivity for RGq, named RGB-1, and confirmed that it regulates translation in cells. Using this advantage, we succeeded in discovering a novel G-quadruplex formation site in the NRAS 5’UTR. Herein, we report a method that uses RGB-1 to detect RGq structures that regulate translation in the cell.
  • Ryusuke Sawada, Michio Iwata, Masahito Umezaki, Yoshihiko Usui, Toshik ...
    Pages 2B01-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Kampo medicines (Japanese traditional formulas) are popular and useful for treatment of multifactorial and chronic diseases. However, the pharmacotherapy with Kampo medicines depends heavily on the empirical knowledge of medical doctors in practice, and scientific evidence is not sufficient for explaining the underlying mode-of-action of Kampo medicines. Pharmacological effects of Kampo medicines are based on multiple compound–multiple target interactions. In this study we propose new computational methods for predicting new therapeutic indications of Kampo medicines from various big data of Kampo medicines and crude drugs. Target proteins and target pathways of the constituent compounds of Kampo medicines were estimated by docking simulations and machine learning methods based on large-scale omics data (e.g., genome, proteome, metabolome, chemical-protein interactome), and potential therapeutic indications of Kampo medicines were predicted on a large scale. We also established KampoDB (http://wakanmoview.inm.u-toyama.ac.jp/kampo/), a novel database of Kampo medicines, which provides various useful scientific resources on Kampo medicines, crude drugs, constituent compounds, and target proteins of the constituent compounds.
  • Francois Berenger, Yoshihiro Yamanishi
    Pages 2B2-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    A bisector tree is a data structure from computational geometry for static spatial indexing of points. It allows fast and exact nearest neighbor searches (and other queries) in an N-dimensional space, provided that a metric exists to measure the distance between any two points in that space. We have made an open source implementation of a bisector tree (https://github.com/UnixJunkie/bisec-tree). It is bucketized, so that several nearby molecules can be put into the same bucket. The (maximum) bucket size is a user-chosen parameter. Our implementation offers two heuristics for finding good vantage points during tree construction, in order to accelerate subsequent queries. In this presentation, we report on the indexing and querying of millions of molecules and the associated challenges.
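To make the data structure in this abstract concrete, here is a minimal sketch of a bucketized bisector tree with an exact nearest neighbor query. It is not the linked implementation: the vantage-point heuristic (two far-apart points), the class and function names, and the use of plain Euclidean distance on 2-D tuples are all assumptions chosen for brevity. Any function that satisfies the triangle inequality could be passed as `dist`, and the input list must be non-empty.

```python
import math


class Node:
    """Internal nodes hold two children; leaves hold a bucket of points."""

    def __init__(self, center, radius, bucket=None, left=None, right=None):
        self.center = center  # vantage point of this subtree
        self.radius = radius  # max distance from center to any point below
        self.bucket = bucket  # list of points for a leaf, else None
        self.left = left
        self.right = right


def build(points, dist, bucket_size=4, center=None):
    """Recursively split points between two vantage points."""
    if center is None:
        center = points[0]
    radius = max(dist(center, p) for p in points)
    if len(points) <= bucket_size or radius == 0:
        return Node(center, radius, bucket=list(points))
    # Heuristic (an assumption): use two far-apart points as vantage points.
    a = max(points, key=lambda p: dist(center, p))
    b = max(points, key=lambda p: dist(a, p))
    left = [p for p in points if dist(p, a) <= dist(p, b)]
    right = [p for p in points if dist(p, a) > dist(p, b)]
    if not left or not right:  # degenerate split: keep everything in a leaf
        return Node(center, radius, bucket=list(points))
    return Node(center, radius,
                left=build(left, dist, bucket_size, a),
                right=build(right, dist, bucket_size, b))


def nearest(node, query, dist, best=(math.inf, None)):
    """Exact nearest neighbor search; returns (distance, point)."""
    if node.bucket is not None:
        for p in node.bucket:
            d = dist(query, p)
            if d < best[0]:
                best = (d, p)
        return best
    # Visit the closer child first, then prune with the triangle inequality:
    # no point in a child can be closer than dist(query, center) - radius.
    for child in sorted((node.left, node.right),
                        key=lambda c: dist(query, c.center)):
        if dist(query, child.center) - child.radius < best[0]:
            best = nearest(child, query, dist, best)
    return best


# Usage with Euclidean distance on a few 2-D points.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.4, 0.6), (0.9, 0.1)]
tree = build(points, math.dist, bucket_size=2)
best_dist, best_point = nearest(tree, (0.5, 0.5), math.dist)
```

A larger bucket size gives a shallower tree at the cost of more distance evaluations per leaf, which is the trade-off behind making it a user-chosen parameter.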
  • Ikumi Morikawa, Masamoto Arakawa, Hiroto Ohta, Manabu Sugimoto
    Pages 2B03-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Biogenic amine receptors are known targets for developing new pesticides because they control the feeding behavior of insects. In this study, we focus on 35 antagonists of the silkworm dopamine receptor that were experimentally investigated by Ohta et al., and construct a predictive model for understanding and predicting the antagonist activity. In our predictive model, we used electronic descriptors obtained from electronic-structure calculations as explanatory variables, since electronic interaction is considered a key factor in the formation of ligand-receptor complexes. In addition, by referring to the established predictive model, we searched for highly active compounds in a database that stores electronic-structure information on 5733 secondary metabolites of plants.
  • Akio Kaneko, Hitoshi Goto
    Pages 2B04-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    To advance research on quantitative structure-activity relationships (QSAR) and quantitative structure-property relationships (QSPR) using deep neural networks (DNNs), a large amount of conformational information must be prepared as the learning dataset. However, most of the currently available public molecular databases provide only 2D molecular information and/or a single 3D structure. Therefore, we have developed C3DB (Computational Chemistry Conformation DataBase), a laboratory-local database system that records numerous conformers obtained by conformational search, together with three-dimensional structure information optimized by quantum chemical calculation. The C3DB system consists of an API server that provides the WUI, and a DB server that provides functions for converting molecular structure information into XML format and storing it in the database (DB). The XML format used in the DB server is our own extension of the original CML format. In addition, C3DB provides functions for submitting conformational search and geometry optimization jobs to pre-registered computing servers and for retrieving and storing their results. Currently, information on about 12,000 molecules is stored.
  • Tenfu Suzumura, Hitoshi Goto
    Pages 2B05-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Coarse-grained models improve computational efficiency and allow calculations over wide spatial ranges and long timescales for macromolecular systems. However, the detailed information of all-atom models is often required along with the efficiency of coarse-grained models. Because our docking method using coarse-grained models predicts the binding site as a Cα structure, the coarse-grained model must be transformed into an all-atom model for high-accuracy analyses. In this work, we propose a new CBRM (Conformation-based Reverse Mapping) method that constructs the all-atom geometry from a coarse-grained peptide in a binding site. To address the huge variety of peptide geometries in the binding site of a complex structure, we create a complete tripeptide conformation DB by exhaustive conformation search and use its entries as templates for the all-atom structure. By combining entries of the conformation DB, the CBRM method can construct the all-atom structure for a coarse-grained peptide of arbitrary residue length. In addition, we considered new coarse-grained models that improve the accuracy of all-atom model construction by the CBRM method.
  • Ryuhei Harada, Yasuteru Shigeta
    Pages 2B06-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Proteins are polymers with extremely complicated structures and elaborate functions, and they form the basis of various biological phenomena occurring in vivo, such as molecular recognition, information transfer, and enzymatic reactions. Since a correlation (structure-function relationship) is expected between their three-dimensional (3D) structure and function, the 3D structures of many proteins have been solved by experimental methods such as X-ray and neutron diffraction and nuclear magnetic resonance (NMR). Recently in particular, X-ray free electron lasers and cryo-electron microscopy have revealed the relationship between dynamical function and structural change at the single-molecule level. In this research, we reproduce low-resolution structure data by executing parallel cascaded MD, which we have developed, using similarity with experimental data. We also show that structural transitions can be induced by using dissimilarity with experimental data.
  • Ming Huang, Ryohei Eguchi, Naoaki Ono, Altaf-Ul-Amin, Shigehiko Kanay ...
    Pages 2B07-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Cardiotoxicity is defined here as the property of a chemical compound that prolongs the QT interval in the electrocardiogram. Blockage of the potassium channel Ikr of cardiomyocytes is regarded as a significant cause of cardiotoxicity. Given that many compounds with largely different structures block the Ikr channel, and that the structure of the Ikr channel remains unclear, we propose to predict channel blockage by chemical compounds based on quantitative structure-activity relationships, implemented as in-silico models. To construct the in-silico models, we use both descriptor embedding and convolutional embedding in a deep neural network structure, which classifies the compounds based on their half maximal inhibitory concentration (IC50). The performance of the models will be shown in this talk.
  • Ryohei Eguchi, Mei Kou, Naoaki Ono, Altaf-Ul-Amin, Shigehiko Kanaya
    Pages 2B08-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Various chemical descriptors, such as molecular fingerprints, have long been discussed as representations of biochemical features, in order to embed molecular structures into a numerical space and quantify their activities. However, it is still difficult to predict bioactivities from molecular structures, since the prediction depends on the choice of chemical descriptors. Recently, machine learning methods based on Graph Convolutional Neural Networks (GCNN) have been proposed that can automatically optimize a model for molecular feature extraction from given training sets. In this study, we introduce an application of GCNN to predict the metabolic pathways of alkaloids, one of the largest families of secondary metabolites in plants. We trained and tested a GCNN model on alkaloid compounds, and the mean accuracy over 20 runs with random sampling was about 94% (number of epochs: 200). These results are expected to lead to an understanding of the evolution of metabolic systems unique to organisms.
  • Michio Iwata, Longhao Yuan, Qibin Zhao, Yasuo Tabei, Yoshihiro Yamanis ...
    Pages 2B09-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Genome-wide analysis of transcriptome responses of human cell lines to drug treatments is an important issue in drug discovery. However, drug-induced gene expression profiles are largely unknown for all human cell lines, which is a serious obstacle in practical applications. In this study, we developed a novel computational method to predict unknown parts of drug-induced gene expression profiles on various human cell lines and to predict new drug therapeutic indications for a wide range of diseases. We proposed a tensor-train weighted optimization algorithm to predict the potential values for unknown parts in tensor-structured gene expression data. It was shown that the proposed algorithm can accurately reconstruct drug-induced gene expression data for a range of human cell lines. It was also shown that in comparison with the use of original gene expression profiles, the use of imputed gene expression profiles improved the accuracy of drug repositioning in the framework of multitask learning.
  • Chia-Hsiu CHEN, Kenichi TANAKA, Masaaki KOTERA, Kimito FUNATSU
    Pages 2B10-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    In the chemical industry, designing novel compounds with desired characteristics is a bottleneck in chemical manufacturing development. Quantitative structure–property relationship (QSPR) modeling with machine learning techniques can make chemical design more efficient. A challenge of current QSPR models is the lack of interpretability that comes with operating black-box models. Hence, interpretable machine learning methods will be essential for researchers to understand, trust, and effectively manage a QSPR model. Global interpretability and local interpretability are two typical ways to define the scope of model interpretation. Global interpretation gives information on structure−property relationships for a series of compounds, helping to shed light on the mechanisms behind compound properties. Local interpretation gives information about how different structural motifs of a single compound influence the property. In this presentation, we focus on the design of interpretable frameworks for typical machine learning models. Two different approaches to interpretable models, based on ensemble learning and deep learning, will be presented to achieve global interpretation and local interpretation, respectively, with performance equal to or even better than that of typical trusted models. We believe that trust in QSPR models can be enhanced by interpretable machine learning methods that conform to human knowledge and expectations.
  • Takahiro Inoue, Kenichi Tanaka, Masaaki Kotera, Kimito Funatsu
    Pages 2B11-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    The use of structure generators is one way to efficiently develop functional organic molecules. Here we present a novel algorithm to diversify the structures generated by the DAECS structure generator, which was previously developed to generate structures having target properties. The proposed algorithm selects seed structures by restricting the search area and then clustering the structures on the 2D map generated by the GTM algorithm. To evaluate our algorithm, we conducted a computational experiment using ligand-like structures for the histamine H1 receptor. While there is still room for improvement, our algorithm is superior to previous methods in terms of structural diversity: the structures generated by our algorithm were more widely dispersed on the 2D map, and their average Tanimoto distance was larger than that of structures generated by the previous algorithm. The proposed algorithm was also shown to be efficient in terms of computational cost.
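The diversity measure mentioned in this abstract, the average Tanimoto distance, can be sketched as follows. The sketch assumes each structure is represented by the set of its "on" fingerprint bits and that the average is taken over all unordered pairs; the abstract does not state which fingerprint or averaging scheme was used, and the bit sets below are made-up placeholders.

```python
from itertools import combinations


def tanimoto_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of fingerprint bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def mean_pairwise_distance(fingerprints):
    """Average Tanimoto distance over all unordered pairs."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("need at least two fingerprints")
    return sum(tanimoto_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical fingerprints given as sets of "on" bit indices.
generated = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
diversity = mean_pairwise_distance(generated)
```

A value near 1 means the generated structures share few fingerprint bits, and a value near 0 means they are nearly identical under the chosen fingerprint.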
  • Yasuhiro Shigemitsu, Yasushi Ohga
    Pages 2C01-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Reaction rate theory in the condensed phase is well established on the assumption of equilibrium in the solute-solvent interaction. Non-equilibrium effects, however, arise in high-pressure solution reactions or in ultrafast excited-state reactions, where the solute-solvent equilibrium breaks down. The present study evaluates the non-equilibrium effects by means of Fokker-Planck type stochastic differential equations with a ballistic-TST sink term along the chemical reaction coordinate.
  • Yuya Kuramoto, Dai Akase, Misako Aida
    Pages 2C02-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Trimethylamine N-oxide (TMAO) is well known as a representative osmolyte. However, its mechanism of action has not yet been completely unraveled. In the mechanism of regulating osmotic pressure, the influence of TMAO on the surrounding water molecules is considered important. In this study, we perform a QM/MM-MD simulation in which a TMAO molecule is treated as the QM region (HF/6-31G). The results show that the density of water molecules around the oxygen atom of TMAO is nearly twenty times larger than that of bulk water, which means that water molecules are specifically hydrogen-bonded to the oxygen atom of TMAO. The density of water molecules within 6 Å of TMAO is 0.99 g/cm3, almost the same as that of bulk water in spite of the excluded volume of TMAO. The water molecules excluded by TMAO are thus bound to TMAO.
  • Yukinori Suwa, Yusuke Kawashima, Norihito Kawashita, Yuki Fuji, Yu-Shi ...
    Pages 2C03-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Since molecules that absorb visible or near-infrared light undergo electronic excitation, analysis of their electronic states can lead to the development of new derivatives. Our target compounds are nitrogen-containing polyheterocyclic compounds with various absorption wavelengths (blue and near-infrared absorption). We investigated the relationship between the structures and the absorption wavelengths of these compounds using quantum chemical calculations. As a result, it was confirmed that similar electronic excitation occurs in all the structures. Furthermore, when the experimental and calculated excitation energies are compared, the compounds can be divided into two groups: in one group the excitation energy is overestimated, and in the other the calculated values are close to the experimental measurements. However, even in the overestimated group, we found a linear relationship between calculated and experimental energies for most substituents. Thus, since this method is considered sufficient for designing novel molecular structures by quantum chemical calculations among structures with small differences in geometry, further discovery of structures can be expected.
  • Eisuke Nakazawa, Daiki Masuoka, Kenta Suzuki, Takahiro Takahashi
    Pages 2C04-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Chemical vapor deposition (CVD) is a main process in semiconductor device fabrication. Its research and development tends to proceed by trial and error because of the complex reactions involved. For this reason, we have developed an automatic modeling system for reaction mechanisms in CVD. This system proposes reaction models representing reaction rates and paths by using deposition data from experiments in CVD equipment. In previous studies, to evaluate the proposed reaction models, the deposition profiles were calculated from the reaction models using a simulator based on a global optimization algorithm. In this study, we formulated mass balance equations inside the CVD equipment and derived an exact solution for the deposition profiles; as a result, we succeeded in computing them about 10 to 1000 times faster than with the conventional algorithm. In addition, we implemented this exact solution in the automatic modeling system for reaction mechanisms and compared it with the conventional method in terms of analysis accuracy and calculation cost.
  • Hiroyuki Tanaka, Manabu Sugimoto
    Pages 2C05-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    In recent years, great attention has been paid to the development of the perovskite solar cell as a low-cost photovoltaic device. To reduce the fabrication cost, the performance of organic hole transport materials (HTMs) needs further improvement because of their insufficient hole mobility and high material cost. In this study, electronic-structure calculations are applied to establish a quantitative relation between electronic descriptors and the experimentally measured hole mobility of organic HTMs. Multiple regression analysis shows that the resulting equation representing the correlation is reasonably accurate. The analysis also suggests some critical electronic factors that would be useful in the search for new organic HTMs.
  • Takafumi Inoue, Manabu Sugimoto
    Pages 2C06-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    The three-dimensional shape of molecular orbitals (MOs) is important for understanding the chemical reactivity and properties of molecules. In this study, we developed a computational method to numerically evaluate shape similarity among MOs. As an application, we coded a program that draws orbital correlation diagrams between a target molecule and its fragments by referring to the evaluated shape similarity. The present method is shown to be reasonably successful in automatically recognizing the orbital correlations.
  • Ryoko Hayashi
    Pages 2C07-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    This talk addresses an effort to realize molecular design based on cooperation between data science and computational science. I extracted the Z-matrix from the output of a Gaussian 09 structural optimization job with a Perl program and used it to replace the Z-matrix in the original input data. Gaussian 09 ran normally and automatically with the replaced Z-matrix, and I will explain the technical details of the implementation.
  • Yuki Sugawara, Masaaki Kotera, Kenichi Tanaka, Hiroshi Nakano, Masakaz ...
    Pages 2C08-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Fluorescent substances have become increasingly important for a wide range of applications, e.g., animal experiments in drug discovery, liquid crystal displays, and lighting devices. Compounds with the BODIPY structure, which is composed of dipyrromethene complexed with a disubstituted boron atom, are known to exhibit fluorescence and several practical properties, e.g., a sharp fluorescence peak, little solvent-dependent wavelength shift, and high quantum yield. Here we propose an ensemble learning-based model that quickly and accurately predicts new BODIPY candidates with desirable physical properties. Compared with the quantum computation model (MAE = 24.71, R2 = 0.84), the proposed model was more accurate (MAE = 17.34, R2 = 0.90). Our model identified some descriptors that are important for the prediction of absorption wavelengths; these are consistent with known chemistry and support the reliability of the model. Our model still does not properly utilize solvent information, which leaves room for further improvement.
    Download PDF (595K)
  • Takayoshi Yoshimura, Yohei Ogiwara, Norio Sakai, Miho Hatanaka
    Pages 2C09-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    An intramolecular cyclization reaction of alkyne acid using a palladium(0) catalyst was reported by Ogiwara et al. Since the product of this reaction has a core structural unit found in many natural products, the reaction itself is also useful. In this research, comprehensive reaction paths were searched using the artificial force induced reaction (AFIR) method, which is an automated reaction path search method. Furthermore, we used Prim's algorithm, a graph theory algorithm, to determine the reasonable reaction path. Several paths reaching the product were found without using a priori knowledge such as the structures of TSs. As a result of examining various possibilities, it was found that the reaction starts from the cleavage of the C-H bond at the propargylic position of the substrate.
    Download PDF (838K)
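The abstract above names Prim's algorithm but does not say how the reaction network is weighted. As a minimal sketch only, the following assumes a hypothetical undirected network whose nodes are stationary points and whose edge weights are barrier heights; the node names and numbers are invented for illustration and are not taken from the paper.

```python
import heapq


def prim_mst(graph, start):
    """Return the edges of a minimum spanning tree grown from `start`.

    `graph` maps each node to a dict {neighbor: weight} and must be
    undirected (every edge listed in both directions).
    """
    visited = {start}
    # Heap of candidate edges (weight, from_node, to_node) leaving the tree.
    frontier = [(w, start, v) for v, w in graph[start].items()]
    heapq.heapify(frontier)
    tree = []
    while frontier:
        w, u, v = heapq.heappop(frontier)
        if v in visited:
            continue  # this edge would close a cycle
        visited.add(v)
        tree.append((u, v, w))
        for nxt, w_next in graph[v].items():
            if nxt not in visited:
                heapq.heappush(frontier, (w_next, v, nxt))
    return tree


# Hypothetical network: weights play the role of barriers (arbitrary units).
network = {
    "reactant": {"int1": 12.0, "int2": 25.0},
    "int1": {"reactant": 12.0, "int2": 8.0, "product": 30.0},
    "int2": {"reactant": 25.0, "int1": 8.0, "product": 10.0},
    "product": {"int1": 30.0, "int2": 10.0},
}
tree = prim_mst(network, "reactant")
```

In this toy network the tree connects the reactant to the product through the two intermediates, because at every step the cheapest edge leaving the already connected set is added.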
  • Aya Miyazaki, Miho Hatanaka
    Pages 2C10-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Asymmetric Michael addition was achieved by using a rare earth catalyst including a chiral N,N'-dioxide derivative ligand. The enantioselectivity of this catalytic reaction could be switched by changing the rare earth from scandium to yttrium. To understand the mechanism of the stereoselectivity, we applied an automated reaction path search method, called the artificial force induced reaction (AFIR) method, and explored various reaction pathways giving major and minor products. It was found that the reaction starts from the coordination of one of the reactants, a pyrazolone derivative, to scandium. In this coordination structure, the reactive carbon in the pyrazolone moiety was not covered by the chiral ligand, and the approach direction of the other reactant, an α,β-unsaturated carbonyl compound, was not restricted. Thus, the stereoselectivity could be controlled by the stability of the transition states of the following reaction step, such as C-C bond formation or proton transfer. In this presentation, we will report the details of the transition state structures and discuss the origin of the stereoselectivity.
    Download PDF (844K)
  • Ryo Kageyama, Junji Seino, Mikito Fujinami, Yasuhiro Ikabata, Hiromi N ...
    Pages 2C11-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    In orbital-free density functional theory (OF-DFT), the total energies of atoms and molecules are expressed as functionals of the electron density. Recently, we constructed a kinetic energy (KE) density functional (KEDF) using machine learning to reproduce the Kohn-Sham (KS) KE. The deviations of the machine-learned (ML-) KEDF from the KS KE are smaller than those of any conventional KEDF for atoms and molecules. Practical calculations using the ML-KEDF require two ingredients: an electron density optimization algorithm that determines the ground-state electron density from an initial density, and the kinetic potential (KP), which is the derivative of the ML-KEDF with respect to the electron density. We developed a scheme to construct the machine-learned KP (ML-KP) that corresponds to the ML-KEDF, and implemented the density optimization algorithm using the ML-KEDF and ML-KP. In this presentation, we will discuss the accuracy of the ML-KP and of the optimized densities and total energies for several atoms and molecules.
    Download PDF (539K)
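For readers unfamiliar with the kinetic potential mentioned above, the standard textbook relations can be written as follows. The symbols are assumptions of this note, not notation from the abstract: T[ρ] is the kinetic energy functional, v_eff the effective potential, and μ the Lagrange multiplier that fixes the number of electrons.

```latex
v_T(\mathbf{r}) = \frac{\delta T[\rho]}{\delta \rho(\mathbf{r})},
\qquad
\frac{\delta T[\rho]}{\delta \rho(\mathbf{r})} + v_{\mathrm{eff}}(\mathbf{r}) = \mu
```

The second relation is the Euler equation that a density optimization in OF-DFT aims to satisfy, which is why both the functional and its derivative are needed.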
  • Shintaro Fukushima, Yuichi Motoyama, Kazuyoshi Yoshimi
    Pages 2C12-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Recently, retrosynthetic analysis with machine learning and deep learning has been actively studied. We focus on the method proposed by Coley et al. In this method, similarities between the target product and the products in a reaction database are first calculated to find similar products. Next, candidate reactions are generated by modifying the reactions of the similar products. The method by Coley et al. is more accurate than other methods. However, its search space is limited because it relies on matching with existing reactions. In this presentation, we propose a method using a GAN (Generative Adversarial Network) in order to expand the search space. The idea of the proposed method is to train a generative model with the GAN, generate reactants with the generative model, and then obtain a reaction by reaction prediction. We obtained new reactants with the proposed method. In addition, the ratio of correct reactions was higher than with previous methods. We are continuing detailed verification of the search space.
    Download PDF (427K)
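The similarity search step described above can be illustrated with a small sketch. It assumes Tanimoto similarity over fingerprint bit sets, which the abstract does not specify, and the product names and bit sets are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_similar_products(target_fp, database, top_k=2):
    """Return the `top_k` (similarity, name) pairs, most similar first."""
    scored = [(tanimoto(target_fp, fp), name) for name, fp in database.items()]
    # Sort by descending similarity; break ties by name for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top_k]


# Hypothetical fingerprints: each set holds the indices of the "on" bits.
database = {
    "product_A": {1, 2, 3, 4},
    "product_B": {3, 4, 5, 6, 7},
    "product_C": {1, 2, 3, 9},
}
target = {1, 2, 3, 5}
top = rank_similar_products(target, database)
```

In a similarity-based workflow, the reactions that produced the top-ranked products would then serve as templates for candidate reactions toward the target.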
Younger Cooperated Session
  • Shino Ohira, Kyosuke Tsumura, Jun Nakabayashi
    Pages 2Y01-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    When a seed or lead compound is acquired in drug discovery, the candidate often drops out during in vivo or in vitro testing. Multiple scaffolds are needed to increase the success rate of drug development. For scaffold hopping, a descriptor that can represent the binding affinity of a ligand to its target protein is needed. To find such a descriptor, we assumed that protein-ligand binding can be described as a set of interactions between a ligand and amino acids. By computing the interaction energy between a ligand and amino acids, we created a new descriptor, Amino Acid Mapping (AAM). Based on AAM, we developed a machine learning system (AI-AAM) for both searching for and creating seed or lead compounds from a biologically active template compound. We are now ready to obtain multiple candidate compounds in drug discovery.
    Download PDF (769K)
  • Takayuki Serizawa
    Pages 2Y02-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Recently, artificial intelligence has been attracting attention not only in computer vision but also in drug discovery. We have been using a de novo molecular design system. The system can propose molecules but cannot propose synthetic routes. A known problem is that de novo molecular generators sometimes propose synthetically infeasible compounds. To address this issue, we tried to develop a computer-aided retrosynthetic analysis system.
    Download PDF (417K)
  • Jun-ichi Takeshita, Yoko Kitsunai, Takamitsu Sasaki, Kouichi Yoshinari
    Pages 2Y03-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Safety assessment of chemical substances often relies on animal experiments. However, in terms of time and cost efficiency and the need for animal protection, computational methods for predicting chemical toxicity have recently been attracting great attention. There are two frameworks for toxicity prediction methods: QSAR and read-across approaches. Several systems based on QSAR approaches are already commercially available to predict yes/no-type genotoxicity. However, few computational prediction methods for repeated-dose toxicity have been developed because of the diversity of observation items and the complexity of toxicity mechanisms. Thus, in this study, we attempted to develop a method for predicting repeated-dose toxicity in rat liver, kidney, and blood, based on a read-across approach using statistical methods and HESS, an in vivo toxicity database publicly available from NITE, Japan.
    Download PDF (217K)
  • J.B. Brown
    Pages 2Y04-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    There are hopes that virtual screening can reduce pharmaceutical development costs by predicting pharmacological activity. While deep learning has become a central method for many, chemogenomic active learning (CGAL) has demonstrated the ability to obtain the same prediction performance through efficient use of less data. Even when data is sparse or receptors are not in chemogenomic data, CGAL can be successful. In this light, how should we think of machine learning in drug discovery? The CGAL method will be introduced and the truth about machine learning will be argued.
    Download PDF (386K)
  • Masaaki Kotera
    Pages 2Y05-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Our knowledge of metabolites and their pathways covers only a small part of natural products. The IUBMB Enzyme List, a list of known enzymes, is the basis of metabolic pathway reconstruction based on reference pathways. However, this strategy is inherently unsuitable for natural biosynthesis pathways unique to particular organism species and for biodegradation pathways of environmental pollutants. In recent years, we have developed an approach for de novo reconstruction of metabolic pathways that uses machine learning to link known compounds by enzymatic reactions, as distinct from approaches that generate new compounds. We also developed an EC sub-subclass prediction method and a method for predicting enzyme proteins from chemical structure, both of which predict an enzyme that catalyzes a given reaction. These studies resemble the problems of organic synthesis strategy and share many common parts, but it is very important to understand the differences, such as the available information and the strategies that can be taken.
    Download PDF (287K)
  • A Message to the Next Generation Youth
    Kunio Sannohe
    Pages 2Y06-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    In recent years, the development of computers has dramatically changed our lifestyle. Computer performance has changed greatly since 35 years ago, when we became involved in computational chemistry at the company. At that time, large-scale calculations could only be done on large computers, but now anyone can easily use computational chemistry on a familiar personal computer. From now on, we expect further use of computational chemistry by many chemists and the development of useful new technologies in computational chemistry. In this lecture, we will talk about how computational chemistry was launched and disseminated within the company, and how we carried out industry-academia-government collaboration to create new technologies at the dawn of computational chemistry. Then, I would like to give the next generation a message of expectation for the further progress of computational chemistry.
    Download PDF (319K)
  • J.B. Brown
    Pages 2Y07-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Download PDF (386K)
Poster Session
  • Jung-Hoon Seol, Kenichi Tanaka, Tetsuya Sawatsubashi, Shinnosuke Kaji, ...
    Pages 1P01-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    In a thermal power plant, the quality of the system water needs to be properly monitored and controlled. Impurities dissolved in the system water are considered major causes of plant troubles. At present, however, the measurement of impurity concentrations relies on manual analyses performed at long time intervals. Thus, online methods for monitoring impurities in system water are greatly needed in practice, since they make it possible to prevent troubles in advance. Here we propose an online prediction method for monitoring the concentration of iron oxides, one of the main impurities dissolved in the system water of a thermal power plant. The proposed method is based on a virtual sensing technique called a soft sensor, and higher prediction accuracy was achieved by applying variable selection through genetic algorithm-based process variables and dynamics selection (GAVDS). Three case studies were carried out to examine the performance of the proposed method.
    Download PDF (479K)
  • Ryota Kato, Kenichi Tanaka, Masaaki Kotera, Kimito Funatsu
    Pages 1P02-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Quantitative Structure-Property Relationship (QSPR) modeling is a method for predicting the properties of compounds. In QSPR, a regression model is constructed from training data consisting of the structures and properties of compounds. In many cases, molecular descriptors are calculated from the structure and used as input. However, the descriptors may not contain sufficient information about the target property, and 3D structure is difficult to take into account. In this study, molecular structures were regarded as sequential data of atom information, and a regression model was constructed using a recurrent neural network (RNN) to deal with variable-length data. Prediction accuracy was shown to improve by normalizing the coordinate system and by considering multiple coordinate systems. In a case study, the proposed method outperformed the existing method for predicting the octanol/water partition coefficient. The method is expected to become more useful once the influence of the data format is eliminated and other conformations are considered.
    Download PDF (804K)
  • Fusako Sakata, Masaaki Kotera, Kenichi Tanaka, Hiroshi Nakano, Masakaz ...
    Pages 1P03-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    In order to efficiently discover novel materials with desirable properties, it is necessary to develop a method that predicts physical properties from the compositional formula alone. In this study, we constructed a regression model expressing the relationship between compositional formula and physical properties. The compositional formulas of inorganic materials were converted into descriptors, which were used as explanatory variables. We proposed a total of 387 diverse and general descriptors based on the numbers of atoms of each element and elemental parameters such as atomic weight and electronegativity, enabling prediction of various physical properties. As a case study, we built predictive models by random forest regression using our proposed descriptors, and predicted three physical properties, i.e., crystal formation energy, density, and refractive index. The obtained R2 values were 0.970, 0.977, and 0.766, respectively. In addition to the successful predictive performance, we were also able to statistically select the descriptors that contributed to the prediction models, and they were reasonable from the viewpoint of chemical knowledge.
    Download PDF (389K)
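To make the idea of descriptors derived from a compositional formula concrete, here is a small sketch. It is not the authors' set of 387 descriptors: the four descriptors, the function names, and the tiny element table are assumptions chosen for illustration, and the parser handles only simple formulas without parentheses.

```python
import re

# Small illustrative table: (atomic weight, Pauling electronegativity).
ELEMENT_DATA = {
    "H": (1.008, 2.20),
    "O": (15.999, 3.44),
    "Na": (22.990, 0.93),
    "Si": (28.085, 1.90),
    "Cl": (35.45, 3.16),
}


def parse_formula(formula):
    """Count atoms in a simple formula such as 'SiO2' (no parentheses)."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def composition_descriptors(formula):
    """Return a few composition-weighted descriptors for `formula`."""
    counts = parse_formula(formula)
    total = sum(counts.values())
    fractions = {el: n / total for el, n in counts.items()}
    en_values = [ELEMENT_DATA[el][1] for el in counts]
    return {
        "n_elements": len(counts),
        "mean_atomic_weight": sum(f * ELEMENT_DATA[el][0] for el, f in fractions.items()),
        "mean_electronegativity": sum(f * ELEMENT_DATA[el][1] for el, f in fractions.items()),
        "electronegativity_range": max(en_values) - min(en_values),
    }


descriptors = composition_descriptors("SiO2")
```

A vector of such values for each material could then be passed to a regression model such as a random forest, as the abstract describes.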
  • Amane Suzuki, Kenichi Tanaka, Masaaki Kotera, Kimito Funatsu
    Pages 1P04-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    A major challenge in chemoinformatics is to generate novel structures with desirable activity. Structure generation methods based on deep generative models have the advantage of obtaining novel structures via the SMILES representation without exhaustive structure generation. However, some studies use a large amount of data to model the relationship between chemical structures and their activity. Here we propose a new deep generative model combined with semi-supervised learning, which enables prediction from small datasets. We conducted a case study to confirm the predictive ability on the alpha2A adrenergic receptor (ADRA2A) dataset, and showed that the proposed model performed better than previous methods. We plan to further develop and verify our model so that it generates structures with desirable activity from small datasets.
    Download PDF (538K)
  • Takurou Nishimura, Kimito Funatsu
    Pages 1P05-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    The organic synthetic route design system AIPHOS (Artificial Intelligence for Planning and Handling Organic Synthesis) is a system for creating new synthetic routes; it combines the complementary advantages of experience-oriented and logic-oriented strategies. KOSP (Knowledgebase-Oriented Synthesis Planning) adds a function to recognize the strategic part using a knowledge base (KDB) derived from a reaction database (RDB), so it is a more experience-oriented system than AIPHOS. TOSP (Transform-Oriented Synthesis Planning) is a completely experience-oriented system that uses a transform database built from existing reactions. These systems were developed as client-server systems in order to entrust a large amount of processing to high-performance servers. Today, owing to advances in hardware and operating systems, personal computing environments are far more capable than they used to be. Therefore, to further improve the convenience of AIPHOS, we developed a standalone personal AIPHOS. As a result, benefits such as improved development efficiency, removal of operational restrictions, and elimination of the security burden were obtained.
    Download PDF (1686K)
  • Mikito Fujinami, Junji Seino, Hiromi Nakai
    Pages 1P06-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Reaction prediction is a computational method for predicting chemical products from given reactants. In the field of cheminformatics, many reaction prediction systems have been developed. Recently, machine learning methods using molecular fingerprints have attracted attention because of their high accuracy. The present study utilized quantum chemical descriptors instead of fingerprints for reaction prediction, in order to find descriptors that are independent of the reaction system. An analysis of the descriptors can also be performed to reveal their quantitative contributions to reaction prediction. Such an analysis has the potential to explain chemical reactions using physicochemical quantities. The prediction accuracy for polar and radical reactions in the present study was close to that of fingerprint-based systems. The scheme has been extended to the prediction of pericyclic reactions, and the quantum chemical descriptors contributing to reaction prediction have also been analyzed. In this presentation, we will show the prediction accuracy for pericyclic reactions. In addition, the effective quantum chemical descriptors for polar, radical, and pericyclic reactions will be discussed.
    Download PDF (258K)
  • Hiroki Maekawara, Mikito Fujinami, Junji Seino, Ryota Isshiki, Junichi ...
    Pages 1P07-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    In experimental chemistry, reaction conditions are optimized to maximize the yields of chemical products. Because there are various kinds of conditions, the optimization process requires much labor and time. Recently, methods combined with machine learning have been developed to reduce optimization costs, especially for flow reactor systems. The purpose of this study is to develop a scheme that uses machine learning to reduce optimization costs in laboratory-scale batch reactions. First, we focused on the solvent, which strongly affects yields. The present scheme utilizes several solvent properties as descriptors for machine learning. In this presentation, a real experimental dataset was adopted. We will show the results of analyzing the correlation between solvent properties and experimental yields based on regression and clustering techniques.
    Download PDF (248K)
  • Kairi Nakamura, Junji Seino, Hiromi Nakai
    Pages 1P08-
    Published: 2018
    Released on J-STAGE: October 26, 2018
    CONFERENCE PROCEEDINGS FREE ACCESS
    Information on the strengths of chemical bonds plays an important role in understanding chemical reactions and molecular properties. Our laboratory has developed the bond energy density analysis (bond-EDA) technique, which partitions the total energy of a molecule into atomic and bonding contributions, to evaluate the strengths of chemical bonds. In this study, we improved the bond-EDA technique to simultaneously evaluate any bonding energies, including those of covalent bonds, hydrogen bonds, and so on. To avoid numerical instability when the simultaneous equations are solved, we utilized LASSO regression, which contains a regularization term. Numerical assessments were performed for several molecules. The present scheme reproduces the energies of multiple bonds such as double and triple C-C bonds as well as those of several types of chemical bonds.
    Download PDF (356K)
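As background for the regularization term mentioned above, the generic LASSO objective can be written as follows. The symbols are assumptions of this note rather than the paper's notation: y collects the target energies, X is the design matrix of the simultaneous equations, β holds the unknown contributions, and λ ≥ 0 sets the strength of the penalty.

```latex
\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}}
\left\{ \lVert \mathbf{y} - X\boldsymbol{\beta} \rVert_2^2
+ \lambda \lVert \boldsymbol{\beta} \rVert_1 \right\}
```

The L1 penalty shrinks poorly determined coefficients toward zero, which is the usual reason LASSO helps when a linear system is ill-conditioned.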