The Erdős number, defined as the collaboration distance between a scientist and Paul Erdős, is one of the best-known scientific jokes. The collaboration graph used to define the Erdős number has researchers as vertices, with an edge drawn between two vertices if the two researchers have jointly authored one or more papers. The Erdős number is frequently cited as a typical example of the small-world phenomenon and is a good educational tool for learning about the relationships among academic disciplines. In this study, to clarify the position of informational chemistry within science, the Erdős numbers of Japanese informational chemists are discussed. The Hosoya number is introduced, and the Erdős number can be obtained approximately from the Hosoya number.
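As an illustration of the definition above, the following minimal sketch builds a collaboration graph from author lists and computes shortest-path distances from a chosen researcher by breadth-first search. The function names and the toy paper list are hypothetical and are not taken from the study. Calling the same function with a different source researcher gives the corresponding distance to that researcher.

```python
from collections import deque
from itertools import combinations


def collaboration_graph(papers):
    """Build an undirected graph: one vertex per author, one edge per co-author pair."""
    graph = {}
    for authors in papers:
        for name in authors:
            graph.setdefault(name, set())
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def distances_from(graph, source):
    """Breadth-first search; returns the shortest distance to every reachable author."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Toy data (hypothetical): each inner list is the author list of one paper.
papers = [["Erdős", "A"], ["A", "B"], ["B", "C", "D"], ["E"]]
erdos = distances_from(collaboration_graph(papers), "Erdős")
```

Authors with no chain of co-authorship to the source, such as "E" here, are absent from the result, which corresponds to an undefined (infinite) number. By the triangle inequality on graph distances, a researcher's distance to Erdős is at most their distance to any intermediate researcher plus that researcher's own Erdős number. Whether the study's approximation from the Hosoya number uses exactly this bound is an assumption here.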
In the GADV hypothesis, primordial life is assumed to have originated from proteins composed of glycine, alanine, aspartic acid, and valine ([GADV]-proteins). The three-dimensional structures of [GADV]-proteins have not been observed experimentally; in fact, even computational predictions have not been carried out. In this study, computational methods for modeling three-dimensional protein structures were applied to [GADV]-proteins, and the usefulness of the methods was compared. Such prediction methods are frequently based on the conservation of structure throughout evolution, so the models they construct depend on knowledge of existing protein structures. The predicted structures of [GADV]-proteins were composed mainly of β structures because of the high β-strand propensity of valine residues. Many of the predicted structures broke down immediately in molecular dynamics (MD) simulations, indicating that these structures were unstable. On the other hand, not only β structures but also helices were observed in the structures resulting from the MD simulations. The results suggest that MD simulations, which rely only on Newton’s equations of motion, are useful for predicting the structures of [GADV]-proteins.
In industrial plants, soft sensors are widely used to estimate difficult-to-measure process variables online. The predictive accuracy of soft sensors decreases as the state of a chemical plant changes, so soft sensor models must adapt to process changes by using newly measured data. However, when a model is reconstructed with data that have low variation, the model cannot predict abrupt changes in process characteristics. The predictive performance of adaptive models thus depends on their databases. We therefore propose an index for monitoring a database, the database monitoring index (DMI), and a database monitoring method that uses the DMI. The DMI is based on the similarity between two data points and is defined as the ratio of the absolute difference in the objective variable to the similarity of the explanatory variables. The more similar two data points are, the smaller the DMI value. When a new data point is obtained, DMI values are calculated between it and every data point in the database. If the minimum of these DMI values is large, the new data point is added to the database. By using the DMI to select new data, the amount of information in a database can be increased while curbing growth in the number of data points. Through the analysis of simulation data, we confirmed that the database could be monitored appropriately and that the predictive accuracy of adaptive soft sensor models could be improved by using the proposed DMI.
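The database update rule described above can be sketched as follows. The abstract does not specify the similarity function or the threshold, so this sketch assumes a Gaussian kernel on the explanatory variables with a hypothetical width parameter `gamma` and a user-chosen `threshold`. The function names are illustrative only.

```python
import numpy as np


def dmi(x_new, y_new, X_db, y_db, gamma=1.0):
    """DMI between one new data point and every point in the database.

    Assumed form: |y_i - y_new| / similarity(x_i, x_new), where the
    similarity is a Gaussian kernel (an assumption, not from the source).
    """
    sq_dist = np.sum((X_db - x_new) ** 2, axis=1)
    similarity = np.exp(-gamma * sq_dist)
    # Guard against division by zero when the kernel underflows.
    similarity = np.maximum(similarity, np.finfo(float).tiny)
    return np.abs(y_db - y_new) / similarity


def update_database(x_new, y_new, X_db, y_db, threshold, gamma=1.0):
    """Add the new point only if its minimum DMI exceeds the threshold."""
    if dmi(x_new, y_new, X_db, y_db, gamma).min() > threshold:
        return np.vstack([X_db, x_new]), np.append(y_db, y_new), True
    return X_db, y_db, False


# Toy database (hypothetical values): two explanatory variables, one objective variable.
X_db = np.array([[0.0, 0.0], [1.0, 1.0]])
y_db = np.array([0.0, 1.0])

# Nearly identical to an existing point: minimum DMI is small, so it is not stored.
X1, y1, added_close = update_database(np.array([0.0, 0.0]), 0.01, X_db, y_db, threshold=0.1)

# Same explanatory variables but a very different objective value: stored.
X2, y2, added_far = update_database(np.array([0.0, 0.0]), 2.0, X_db, y_db, threshold=0.1)
```

Under these assumptions, a new point is kept only when no stored point already resembles it in both the explanatory and objective variables, which is how the database can gain information without growing with every measurement.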
In carbon dioxide capture and storage (CCS), chemical absorption with amine compounds has been widely investigated as a method for capturing CO2. For this method, amine compounds with high CO2 absorption and desorption performance are required to reduce the cost of CO2 separation and recovery. One approach to finding such amine compounds is molecular design with quantitative structure-property relationship (QSPR) models and structure generators. In this study, ensemble learning and genetic algorithm-based partial least squares (GAPLS), a variable selection method, were combined to construct predictive regression models. The combined method is named ensemble GAPLS (EGAPLS). In ensemble learning, the prediction results of multiple models are integrated to give a better result than that of any single model. Moreover, by considering the variance of the predicted values, the reliability of the final prediction can be evaluated. We constructed QSPR models for the absorption rate and desorption capacity of tertiary amine compounds and evaluated their predictive accuracy by cross-model validation (CMV). The modeling results showed that the EGAPLS models had the highest predictive accuracy. The constructed EGAPLS models were applied to molecular design, and promising chemical structures for CO2 separation and recovery were obtained.
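The ensemble idea described above, integrating the predictions of several sub-models and using their spread as a reliability indicator, can be sketched as follows. This is not EGAPLS itself: random variable subsets stand in for genetic-algorithm variable selection, and ordinary least squares stands in for partial least squares, purely to keep the example short. All names and the synthetic data are hypothetical.

```python
import numpy as np


def fit_ensemble(X, y, n_models=20, n_vars=3, seed=0):
    """Fit sub-models, each on a random variable subset and a bootstrap sample.

    Stand-ins (assumptions): random subsets replace GA-based selection,
    and least squares replaces PLS.
    """
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        cols = rng.choice(X.shape[1], size=n_vars, replace=False)
        rows = rng.integers(0, X.shape[0], size=X.shape[0])  # bootstrap sample
        A = np.column_stack([np.ones(len(rows)), X[rows][:, cols]])
        coef, *_ = np.linalg.lstsq(A, y[rows], rcond=None)
        models.append((cols, coef))
    return models


def predict_ensemble(models, X):
    """Return the mean prediction and the standard deviation across sub-models."""
    preds = np.array([coef[0] + X[:, cols] @ coef[1:] for cols, coef in models])
    return preds.mean(axis=0), preds.std(axis=0)


# Synthetic data (hypothetical): 50 samples, 5 descriptors, noisy linear response.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ true_w + 0.1 * rng.normal(size=50)

models = fit_ensemble(X, y)
mean, spread = predict_ensemble(models, X)
```

A large `spread` for a given sample indicates that the sub-models disagree, so the averaged prediction for that sample should be treated as less reliable.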