The quality of coal as a fuel or as a raw material for the chemical industry is evaluated by a variety of parameters, such as the ultimate analysis, the proximate analysis, the maceral analysis, and fluidity tests. However, it is hard to grasp an overall picture of coals when many variables must be considered, and some of these parameters are already known to be highly correlated. For this reason, the various parameters were first summarized and organized by principal component analysis. Next, classification and regression trees (CART) were used to create a new classification that made the differences among coals clearer. Consequently, the importance of the ultimate analysis was reconfirmed, and a classification method based on the carbon and sulfur contents was proposed. Moreover, the relation between the ultimate analysis and the maceral analysis was studied by CART, and we found simple and convenient semi-quantitative rules for estimating the inertinite (IN) content from the ultimate analysis. Furthermore, a clear classification by place of coal production was obtained by plotting the relation between H/C and the IN content.
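As an illustration of the kind of rule CART produces, the following minimal sketch finds the single threshold on one numeric variable that best separates class labels, using the Gini impurity as the split criterion. It is not the authors' implementation: the function names `gini` and `best_split` are hypothetical, the choice of Gini impurity is an assumption (the abstract does not state the criterion), and the toy values in the usage lines are invented and are not coal data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best binary split on one variable.

    Candidate thresholds are midpoints between consecutive distinct sorted values.
    Returns (None, inf) when no split is possible.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        threshold = (lo + hi) / 2.0
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Invented toy data: one variable, two classes.
threshold, impurity = best_split([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"])
print(threshold, impurity)
```

A full tree would apply `best_split` to every variable, keep the best one, and recurse on the two resulting subsets until a stopping rule is met.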
Colorless endoperoxide (1a) of benzo[1,2,3-kl:4,5,6-k'l']dixanthene (2a) changes into colored 2a in the presence of acid at room temperature. The interaction between 1a and acid was evaluated using AM1 molecular orbital calculations. The calculations show that 1a reacts with a proton more easily at an endo oxygen atom than at a xanthene oxygen atom, and that the protonated species passes through a carbocation (4a) to give colored 2a. In order to propose novel functional dyes that change color more sensitively on addition of acid, we studied the interaction between protons and endoperoxides containing nitrogen (1b) or sulfur (1c) in place of oxygen. The nitrogen analog 1b is a hypothetical molecule. The calculated proton affinity of 1b or 1c at the peripheral heteroatoms was about 34 kcal mol⁻¹ greater than that of 1a. The deoxygenation of endoperoxide (1) to give 2 is therefore expected to be accelerated by replacement of the peripheral heteroatoms. Replacing the peripheral oxygen with nitrogen also gave a larger proton affinity when protonation takes place at the endo oxygen. Such replacement of the peripheral atoms would thus be an effective modification for accelerating the color-forming reaction because of the greater proton affinity.
In order to estimate the IPCE (Incident Photon-to-Current Conversion Efficiency) of DSSCs (Dye-Sensitized Solar Cells), we constructed a theory and established a calculation procedure. The electron transfer rate constant is determined by the free energy change and the electron transfer integral between the states before and after the electron transfer. We estimate both values by the QEq-CS (Charge Equilibration for Charge Separation) method. Using experimental data for DSSCs with 9-phenyl xanthene dye (Sayama et al., Chem. Lett. 753 (1998)) as the target, we studied three electron transfer steps: (i) Step 1, from the ground state of the dye to the semiconductor; (ii) Step 2, from the iodine redox pair in the solution to the oxidized dye; (iii) Step 1', from the excited state of the dye to the semiconductor. We calculated the change in electrostatic energy, estimated the change in free energy, and calculated the activation factor with the reorganization energy for each step. As a result, we found that the activation factor of Step 1' roughly explains the trend in the IPCE of the dye-sensitized solar cells.
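An activation factor that depends on a free energy change and a reorganization energy is commonly written in the Marcus form exp(-(ΔG + λ)² / (4 λ k_B T)). The sketch below evaluates that expression. It is an assumption that this form corresponds to the one used in the work; the function name `marcus_activation_factor`, the default temperature, and the energies in the usage lines are hypothetical and are not taken from the paper.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def marcus_activation_factor(delta_g, reorg, temperature=298.15):
    """Marcus-type activation factor exp(-(dG + lambda)^2 / (4 lambda kB T)).

    delta_g     -- free energy change of the electron transfer step, in eV
    reorg       -- reorganization energy lambda, in eV (must be positive)
    temperature -- absolute temperature in K (default is an assumed room temperature)
    """
    if reorg <= 0.0:
        raise ValueError("reorganization energy must be positive")
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    barrier = (delta_g + reorg) ** 2 / (4.0 * reorg)  # activation free energy in eV
    return math.exp(-barrier / (K_B_EV * temperature))


# Hypothetical energies (eV), chosen only to show the trend.
for dg in (0.0, -0.5, -1.0):
    print(dg, marcus_activation_factor(dg, reorg=1.0))
```

In this form the factor reaches its maximum of 1 when the free energy change equals minus the reorganization energy, and it falls off on either side of that point.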
The XyMJava system for drawing chemical structural formulas has been developed in the Java language in order to enhance World Wide Web communication of chemical information; the previously proposed XyM notation system has been adopted as the language for inputting structural data. Object-oriented techniques, especially the design-pattern approach, are applied to parse a XyM notation in the XyMJava system. A chemical model is introduced to encapsulate information on chemical structural formulas and is used to draw the formula on a CRT display. Thereby, an HTML document containing a XyM notation can be browsed by the XyMApplet of the XyMJava system.
Accurate recognition of differences and similarities in stereochemical structures is achieved by the extended CAST (CAnonical-representation of STereochemistry) coding method. Using CAST notations, a complete search of partial structures as well as whole structures has been achieved by matching against the query at any level from a specific atom, taking planar, configurational, and conformational information into account. Differences and similarities in the stereochemistry of the four aldopentoses D-xylose, D-ribose, D-arabinose, and D-lyxose, which have three chiral centers, are clearly represented by the extended CAST. Applications to some organic compounds containing more complicated stereochemical structures are also demonstrated.
Although the Generalized Additive Model (GAM) is known as a superior nonparametric regression method, there have been only a few applications, especially in the fields of chemistry and pharmaceutical sciences. GAM can be applied to nonlinear problems that can also be solved using the hierarchical Artificial Neural Network (ANN) method. In this study, GAM was compared with ANN in regression, classification, and prediction power using artificial and actual pharmaceutical data sets. The results show that GAM and ANN have similar regression/classification/prediction powers. Considering the fact that additive models simply visualize the relationship between a predictor variable and a response variable, GAM can be applied to data sets in the pharmaceutical sciences.
In 3D-QSAR analysis such as comparative molecular field analysis (CoMFA), proper superimposition of molecules is required. Since appropriate superimposition is an important factor in constructing a predictive model and analyzing it correctly, various methodologies for molecular alignment have been proposed. We have proposed a novel molecular alignment method using a Hopfield Neural Network (HNN) [M. Arakawa, K. Hasegawa, K. Funatsu, Journal of Computer Aided Chemistry, 2, 29-36 (2001)]. In this paper, a 3D-QSAR analysis of cyclooxygenase-2 (COX-2) inhibitors, which consist of three different types of skeleton, is reported. The structures of the COX-2 inhibitors were aligned using our HNN method and analyzed by CoMFA. A robust PLS model (R²=0.922, Q²=0.653) was obtained, and it was validated by the contour map of the regression coefficients and the X-ray crystal structure of COX-2.
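The two statistics quoted for the PLS model measure different things: R² describes the fit to the training data, while Q² is usually computed from cross-validated predictions. The sketch below shows the distinction under stated assumptions: it uses leave-one-out cross-validation (the abstract does not say which scheme was used), it substitutes ordinary least squares on a single descriptor for PLS to stay self-contained, and the function names and numbers are hypothetical and unrelated to the COX-2 data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def total_sum_of_squares(ys):
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r_squared(xs, ys):
    """R^2 = 1 - SS_res / SS_tot, with the model fitted on all samples."""
    slope, intercept = fit_line(xs, ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_sum_of_squares(ys)


def q_squared_loo(xs, ys):
    """Q^2 = 1 - PRESS / SS_tot, where each sample is predicted by a model
    fitted without it (leave-one-out)."""
    press = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Invented numbers for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1]
print(r_squared(xs, ys), q_squared_loo(xs, ys))
```

Because each left-out sample is predicted by a model that never saw it, Q² computed this way cannot exceed R² for a least-squares fit, which is why a gap between the two values is expected.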
The canonical ensemble was used to simulate the vapor/liquid interface, in order to visualize the whole configuration of molecules in the system. Although it had previously been difficult to apply NVT Monte Carlo (NVTMC) to an extremely localized system such as a liquid surface because of slow convergence, the DTMC (Dual Translation Monte Carlo) method proposed in this paper permitted a meaningful simulation. The effectiveness of a fixed bed model, which approximates the entire interaction force exerted by the bulk liquid on the molecules making up the liquid surface, was confirmed. This is a multipurpose model that can be applied to adsorption systems as well as to liquid surfaces. In addition, we determined the two parameters of the Lennard-Jones potential for Ar, Kr, and 1-centered nitrogen by pilot simulations based on experimental values of the liquid density and boiling point. The heat of vaporization was estimated using these new LJ parameters for each molecule during the DTMC simulation, and the results showed good agreement with experimental data in the literature.
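The two Lennard-Jones parameters referred to above are conventionally the well depth ε and the size parameter σ of the 12-6 potential U(r) = 4ε[(σ/r)¹² − (σ/r)⁶]. The sketch below evaluates that standard form. The function name `lennard_jones` is hypothetical, and the usage lines work in reduced units (ε = σ = 1) rather than with the fitted values from the paper, which are not given in the abstract.

```python
def lennard_jones(r, epsilon, sigma):
    """Standard 12-6 Lennard-Jones pair potential.

    r       -- separation between the two sites (same length unit as sigma)
    epsilon -- well depth (sets the energy unit of the result)
    sigma   -- separation at which the potential crosses zero
    """
    if r <= 0.0:
        raise ValueError("separation must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Reduced units: the minimum lies at r = 2**(1/6) * sigma with depth -epsilon.
r_min = 2.0 ** (1.0 / 6.0)
print(lennard_jones(1.0, 1.0, 1.0), lennard_jones(r_min, 1.0, 1.0))
```

Fitting ε and σ to liquid density and boiling point, as described in the abstract, amounts to adjusting these two numbers until the simulated properties match experiment.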
A tool for displaying and communicating chemical structural formulas has been developed on the basis of XyMML (XyM Markup Language): a XyMML document conforming to the XML (Extensible Markup Language) specification is transformed into an HTML (HyperText Markup Language) document by a translator program based on XSLT (Extensible Stylesheet Language Transformations). During this process, the XyMML data written in the XyMML document are converted into XyM notations embedded in the HTML document, which is browsed with a World Wide Web (WWW) browser that includes the XyMJava system. Another tool, for printing chemical structural formulas, has been developed in which the same XyMML document is transformed into a XyMTeX document by means of XSLT. The resulting XyMTeX document is used to print a document containing structural formulas through the TeX/LaTeX typesetting system. Thereby, XyMML and the related techniques have been shown to have the potential to serve as a kernel for integrating WWW communication, electronic publishing, and conventional publishing in chemistry.
Although supervised neural networks such as the BNN (Back Propagation Neural Network) and the CNN (Counter Propagation Neural Network) are useful techniques for modeling nonlinear data, their prediction ability for a test set is insufficient when a large number of descriptors is used. Furthermore, the established model is rather difficult to interpret, which makes it cumbersome to design new compounds. Therefore, it is important to remove irrelevant descriptors, which make no significant contribution to the model. In order to select the significant descriptors from the huge number of possible combinations, the GA (Genetic Algorithm) has been developed and used in QSAR (Quantitative Structure-Activity Relationship) studies. In previous work, we successfully combined CNN and GA and applied the combination to the structure-activity data of phenylalkylamines. In this report, we examined the practical utility of our method using steroid data, which have a larger number of descriptors than the phenylalkylamine data. First, we showed by PLS (Partial Least Squares) that this data set is nonlinear. Next, we built a CNN model with all 51 descriptors, but its prediction for the test set was poor. GA was then used for variable selection, and it reduced the number of descriptors from 51 to 11. The prediction ability of the CNN model with 11 descriptors was much improved. Finally, the loading vector maps of the selected descriptors and the activity were compared. The trend between the activity and each descriptor was easily understood from the colored loading maps.
In 3D-QSAR analysis such as comparative molecular field analysis (CoMFA), proper superimposition of molecules is required. Since appropriate superimposition is an important factor in constructing a predictive model, various methodologies for molecular alignment have been proposed. We have proposed a novel molecular alignment method using a Hopfield Neural Network (HNN). In this paper, a 3D-QSAR analysis of human epidermal growth factor receptor-2 (HER2) inhibitors, which consist of two different types of skeleton, is reported. The structures of the HER2 inhibitors were automatically aligned using HNN, and the correlation between the HER2 activity and the molecular fields was then analyzed by PLS. A robust PLS model (R²=0.805, Q²=0.701) was obtained, and it was validated by the contour map of the regression coefficients.
N-Substituted maleimides are polymerized to form optically active polymers by using an asymmetric catalyst prepared from an optically active bisoxazoline derivative and an alkyllithium. In the present study, the reaction mechanism of this asymmetric anionic polymerization was clarified by ab initio molecular orbital calculations. In the initiation reaction, the configuration of the catalyst causes asymmetric induction. The growth reaction, on the other hand, proceeds in an asymmetric manner, and the configuration of the asymmetric catalyst has no effect.
We have described previously a measure of protein similarity based on a hard ball model of the position of α-carbon atoms in amino acid residues. A genetic algorithm (GA) is used to search the space of possible alignments to identify the maximum possible volume overlap of one protein with another, with the chromosome in this GA using a simple binary encoding scheme. Here, we extend the measure to take account of the secondary structure elements present within a protein, using an elite generational replacement GA, a steady-state GA and a bit-climber; we also consider the use of a Gray coding scheme. Self-recognition and database searching experiments with structures from the Protein Data Bank show that the bit-climber with a Gray code representation gave the best results of the three search methods that were tested.
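The appeal of a Gray code for bit-level search is that consecutive integers always differ in exactly one bit, so a single bit flip can move between neighboring parameter values. The sketch below shows the standard reflected binary Gray code; the function names are hypothetical, and the abstract does not specify which Gray code variant or chromosome layout was used.

```python
def binary_to_gray(n: int) -> int:
    """Convert a non-negative integer to its reflected binary Gray code."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Invert binary_to_gray by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# 7 -> 8 flips four bits in plain binary (0111 -> 1000) but only one in Gray code.
print(hamming_distance(7, 8))
print(hamming_distance(binary_to_gray(7), binary_to_gray(8)))
```

With plain binary encoding, moving from 7 to 8 requires four simultaneous bit flips, which a bit-climber that changes one bit at a time cannot do in a single step.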
To elucidate a common feature of the OmpR family, we investigated the stable structures and electronic properties of the DNA binding sites of OmpR proteins by a semi-empirical MO method. The results show that the HOMO is localized on the 2nd aspartic acid for almost all families. This amino acid lies near the DNA backbone when OmpR binds to DNA. Therefore, the 2nd aspartic acid of the DNA binding site of the OmpR family appears to be essential for the nonspecific interaction between OmpR and the DNA backbone. The effect of amino-acid mutations on the structures and electronic properties was also investigated.
Biomedical studies have recently become more interdisciplinary, and documents in the field are collected in various databases. In this study, we compared four biomedical databases, MEDLINE, EMBASE, BIOSIS, and SCI, to make better use of them not only for information retrieval but also for content analysis. Descriptors frequently indexed to the documents that these databases record in common reflect the characteristics of each database. Therefore, we examined these descriptors using frequency counting, term similarity analysis, and quantification theory III analysis. The results were made more objective by also applying the categories of Medical Subject Headings (MeSH) to descriptors of the databases other than MEDLINE. These results revealed the differences among the four databases and the character of each.