This paper describes a novel reduced representation of three-dimensional protein structures and its application to data mining in proteins. In this representation, a protein structure is represented by a set of pseudo-atoms that correspond to glycine residues and are placed at the 3D coordinates of their alpha-carbons. To evaluate the performance of this reduced representation, the authors modified the AIM program to search for 3D structural features common to three or more proteins. Three dehydrogenase proteins were represented in this format, and their common structural features were searched for with the modified AIM program. The substructures related to the NAD binding domain were successfully identified in all three proteins. In another trial, a reduced representation consisting of seven glycine residues from the NifH/frxC motif site in a nitrogenase (1NIPA) was used as a query to search for common substructures in 1,300 peptide chains. Similar substructures close to iron-sulfur clusters were identified in several hybrid cluster proteins. The results show the potential applicability of this method to 3D structural data mining of proteins.
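As a rough illustration of the reduced representation described above, the following sketch collects the alpha-carbon coordinates of glycine residues from PDB-format text. It is not the authors' code or the AIM program. The function name `glycine_ca_coords` and its `chain` parameter are hypothetical, and the sketch assumes standard fixed-column ATOM records.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def glycine_ca_coords(pdb_text: str, chain: str = "A") -> List[Tuple[int, Point]]:
    """Return (residue number, (x, y, z)) for every glycine C-alpha in one chain.

    Hypothetical helper: it assumes standard fixed-column PDB ATOM records.
    """
    pseudo_atoms = []
    for line in pdb_text.splitlines():
        # Skip anything that is not a full-length ATOM record.
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        atom_name = line[12:16].strip()   # columns 13-16
        res_name = line[17:20].strip()    # columns 18-20
        chain_id = line[21]               # column 22
        if atom_name == "CA" and res_name == "GLY" and chain_id == chain:
            res_seq = int(line[22:26])    # columns 23-26
            xyz = (
                float(line[30:38]),       # x, columns 31-38
                float(line[38:46]),       # y, columns 39-46
                float(line[46:54]),       # z, columns 47-54
            )
            pseudo_atoms.append((res_seq, xyz))
    return pseudo_atoms
```

Each returned entry is one pseudo-atom. A query such as the seven-glycine motif mentioned above would then be a short list of such entries, selected by residue number.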
This paper describes the classification and prediction of pharmacologically active classes of drugs in the presence of noise compounds. Dopamine D1 receptor agonists (63 compounds), antagonists (169 compounds), and other drugs (696 compounds) were used. Each drug molecule was characterized by its Topological Fragment Spectrum (TFS), a descriptor previously reported by the authors. A TFS-based artificial neural network (TFS/ANN) and a TFS-based support vector machine (TFS/SVM) were evaluated for their classification and prediction abilities. It was concluded that TFS/SVM performs better than TFS/ANN in both training and prediction.
A design system for a gas separation membrane module, consisting of three engines, was developed. The engine for creating the modular separation structure generates the arrangement coordinates of the separation units through a GUI and produces the input file for the hydrogen recovery rate calculation. Because it supports cylindrical and square-pillar module outlines as well as triangular-lattice and square-lattice unit arrangements, it can output unit arrangement coordinates for general multi-pipe structures (separation, extraction, heat exchangers, etc.). The engine for the hydrogen recovery rate calculation uses a commercial computational fluid dynamics (CFD) package, receives input data from the structure creation engine, and writes the converged calculation result to a file. The engine for optimizing the modular structure manages the input data and the recovery rate results stored in the database, proposes parameters for structure optimization using a genetic algorithm, and determines the data set for the next recovery rate calculation. In the design of a separation module, it became clear that a genetic algorithm can efficiently optimize the size of the separation membrane elements and their arrangement.
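The unit arrangement step described above can be sketched as follows, under stated assumptions. This is not the authors' engine. The function `unit_positions` and its parameters are hypothetical, each unit is treated as a point (its own radius is ignored), and the lattice is centered on the module axis. The CFD calculation and the genetic algorithm are not shown.

```python
import math
from typing import List, Tuple


def unit_positions(lattice: str, outline: str, size: float,
                   pitch: float) -> List[Tuple[float, float]]:
    """Return unit center coordinates inside a module cross-section.

    lattice: "square" or "triangular" arrangement of the units.
    outline: "cylinder" (size = radius) or "square" (size = half-width).
    pitch:   center-to-center spacing between neighboring units.
    """
    if lattice not in ("square", "triangular"):
        raise ValueError("lattice must be 'square' or 'triangular'")
    if outline not in ("cylinder", "square"):
        raise ValueError("outline must be 'cylinder' or 'square'")
    if size <= 0 or pitch <= 0:
        raise ValueError("size and pitch must be positive")

    # Rows of a triangular lattice are closer together than the pitch.
    row_step = pitch if lattice == "square" else pitch * math.sqrt(3) / 2
    n_rows = int(size // row_step)
    n_cols = int(size // pitch) + 1
    eps = 1e-9  # tolerance for points that sit exactly on the boundary

    points = []
    for j in range(-n_rows, n_rows + 1):
        y = j * row_step
        # Every other row of a triangular lattice is shifted by half a pitch.
        offset = pitch / 2 if (lattice == "triangular" and j % 2) else 0.0
        for i in range(-n_cols, n_cols + 1):
            x = i * pitch + offset
            if outline == "cylinder":
                inside = x * x + y * y <= size * size + eps
            else:
                inside = abs(x) <= size + eps and abs(y) <= size + eps
            if inside:
                points.append((x, y))
    return points
```

In a loop like the one the abstract describes, an optimizer would vary inputs such as `pitch` or `size`, regenerate the coordinates, and pass them to the recovery rate calculation.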
Relationships among the HOMO energy, the ν(C=O) stretching band, basicity (pKa), and σ(p) in phosphorus ylides have been studied. A linear relationship was found between pKa and the HOMO energies obtained by the AM1, HF/3-21G, and HF/6-31G methods. The same linear relationship was found between the HOMO energies obtained by these methods and σ(p), where p denotes the p-position. Thus, the HOMO energy can be used in place of pKa and σ(p) when choosing an ylide as the donor in organic reactions. In addition, a linear relationship was found between the ν(C=O) stretching bands obtained by these methods and the parameters described above. Hence, with respect to the electron-donating ability of the ylide, the HOMO energy can be used instead of pKa as a new measure of ylide basicity. Furthermore, the ν(C=O) stretching bands obtained by these methods can be estimated from the HOMO energy obtained by the same methods.
The XyMTeX2PS system for typesetting chemical documents that contain structural formulas has been developed to cover both traditional printing and Internet communication. The system produces chemical documents as high-quality PostScript files. These PostScript files can be converted into PDF files, which is the key to covering both fields, and more elaborate stereochemical expressions such as wedged bonds are available.
The XyM2Mol system, which consists of the XyM2Mol application and the XyM2Mol applet, has been developed to convert XyM Notation codes into connection tables. Structural data written in XyM Notation thereby become applicable to a wide variety of chemical applications through these connection tables.