Producing hydrogen from water with semiconductors known as photoelectrochemical materials uses the energy of sunlight directly. Many studies have addressed the photoelectrode itself, but few have addressed the photoelectrode reactor. The purpose of this study is therefore to develop a photoelectrode reactor for hydrogen production. Analysis and optimization are both important in the design of a chemical reactor, and computational fluid dynamics (CFD) is commonly used for the analysis. Optimization, however, requires many CFD simulations, so when each simulation is time-consuming the computational cost makes it difficult to obtain optimum values of the construction parameters. For this reason, statistical surrogate models, or metamodels, are used in place of the expensive CFD simulation. In this study, we build a Kriging metamodel of the CFD analysis to obtain the optimum values of the construction parameters of the reactor. Because the objective variables of the photoelectrode reactor trade off against one another, a single optimum was expected to be difficult to determine. We therefore obtain Pareto-optimal solutions by combining CFD simulation, the Kriging metamodel, and the NSGA-II (Non-dominated Sorting Genetic Algorithm II) multi-objective optimization procedure.
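As an illustration of the Pareto optimality referred to in the abstract above, the following minimal sketch filters a set of candidate designs down to the non-dominated ones. It is not NSGA-II and involves no Kriging or CFD; it only shows the dominance test on which non-dominated sorting is built. The two objective values per candidate are made-up numbers, both objectives are assumed to be minimized, and the names `dominates` and `pareto_front` are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if design `a` dominates design `b` (all objectives minimized).

    `a` dominates `b` when it is no worse in every objective and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates (input order preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective values (objective 1, objective 2) for five candidate designs.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = pareto_front(candidates)
```

Here `(3.0, 4.0)` and `(2.5, 3.5)` are dropped because `(2.0, 3.0)` is better in both objectives, while the remaining three designs each represent a different trade-off between the two objectives.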
We developed a new SOAP-API service for the mass spectral database MassBank. Through this service, arbitrary application software can access MassBank. A user can therefore write a program that combines MassBank functions for the user's own purposes, applies a fixed series of processes repeatedly to a large amount of data, or performs cooperative tasks with other internet services. Furthermore, a MassBank search function can be added to an existing mass spectral tool. Although MassBank is a database distributed over the internet, the SOAP-API service of massbank.jp lets us search spectra on all of the distributed database servers and obtain data regardless of the server on which they actually reside.
This paper describes the construction of classification models for the mutagenicity of organic molecules. The objective of this study is to construct a model that predicts the results of the reverse mutation test with high accuracy. To this end, we propose a novel ensemble modeling method in which many support vector machine (SVM) models are constructed as sub-models and integrated to predict mutagenicity. Each sub-model is built from a randomly selected part of the original data matrix and with randomly determined SVM parameters. After the sub-models are constructed, a certain number of those with high accuracy are selected and integrated to predict mutagenicity. To evaluate the proposed method, we constructed an ensemble model using the reverse mutation test data set collected by Hansen et al. [K. Hansen, et al., J. Chem. Inf. Model., 49, 2077-2081]. As a result, an ensemble model with an accuracy of 79.6% was obtained. Its area under the ROC curve (AUC) is 0.866, which is slightly better than that of Hansen et al. We therefore conclude that ensemble modeling with SVM sub-models is a promising method for predicting the mutagenicity of organic molecules.
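The ensemble procedure summarized in the abstract above (random part of the data matrix, random sub-model parameters, selection of the most accurate sub-models, integration of their predictions) can be sketched roughly as follows. This is a sketch under several assumptions and not the authors' implementation: a small k-nearest-neighbor classifier stands in for the SVM so that the example runs with the standard library alone, its `k` plays the role of the randomly determined parameter, both rows and columns are subsampled, sub-models are ranked by accuracy on a separate validation set, and integration is done by majority vote. The toy data and all function names are hypothetical.

```python
import random
from collections import Counter


def make_knn(X, y, rows, cols, k):
    """Stand-in sub-model (NOT an SVM): k-NN trained on a row/column subset."""
    train = [([X[i][c] for c in cols], y[i]) for i in rows]

    def predict(row):
        query = [row[c] for c in cols]
        nearest = sorted(
            train, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], query))
        )[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]

    return predict


def accuracy(model, X, y):
    """Fraction of samples for which the model predicts the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def build_ensemble(X, y, X_val, y_val, n_models=30, n_keep=5, seed=0):
    """Build many random sub-models, keep the most accurate, vote at prediction."""
    rng = random.Random(seed)
    n_rows, n_cols = len(X), len(X[0])
    scored = []
    for _ in range(n_models):
        rows = rng.sample(range(n_rows), max(1, int(0.7 * n_rows)))  # random samples
        cols = rng.sample(range(n_cols), rng.randint(1, n_cols))     # random descriptors
        k = rng.choice([1, 3, 5])                                    # random parameter
        model = make_knn(X, y, rows, cols, k)
        scored.append((accuracy(model, X_val, y_val), model))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    kept = [model for _, model in scored[:n_keep]]

    def predict(row):
        # Integrate the selected sub-models by majority vote.
        return Counter(model(row) for model in kept).most_common(1)[0][0]

    return predict


def toy_data(n_per_class, rng):
    """Hypothetical data: feature 0 separates the classes, features 1-2 are noise."""
    X, y = [], []
    for label, offset in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append([offset + rng.random(), rng.random(), rng.random()])
            y.append(label)
    return X, y


data_rng = random.Random(42)
X_train, y_train = toy_data(20, data_rng)
X_val, y_val = toy_data(10, data_rng)
X_test, y_test = toy_data(10, data_rng)

ensemble = build_ensemble(X_train, y_train, X_val, y_val)
test_accuracy = accuracy(ensemble, X_test, y_test)
```

In this toy setting, sub-models that happen to draw only noise columns score poorly on the validation set and are discarded at the selection step, which is the role the accuracy-based selection plays in the method described above.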
In recent years, the internal quality of fruit has come to be examined by machines equipped with optical sensors. Predicting internal qualities such as sugar content from a spectrum makes it possible to estimate the commodity value of the fruit. In this study, we built regression and classification models relating near-infrared (NIR) spectra of apples to their sugar content and to the presence of sorbitol and browning. The regression models predict sugar content, and the classification models predict whether sorbitol and browning are present inside the apple. In regression and classification analysis using NIR spectra, it is important to select wavelengths appropriately. We therefore proposed the Genetic Algorithm-based Wavelength Selection (GAWLS) method, a variable selection method that uses a genetic algorithm, and validated the usefulness of the resulting models. In addition, we proposed a new classification method that combines GAWLS with the k-nearest neighbor (k-NN) method. The results show that the models obtained with the GAWLS method are more useful than those obtained with the Partial Least Squares (PLS) method using all variables or with the Genetic Algorithm-based Partial Least Squares (GAPLS) method.
In quantitative structure-activity relationships (QSAR), partial least squares (PLS) is of particular interest as a statistical method. Since its first successful applications to QSAR data sets, PLS has evolved to cope with the additional demands posed by complex data structures. In particular, PLS variants that focus on visualization and chemical interpretation are highly desirable for modeling multi-target structure-activity relationships. In this paper, we employed the self-organized PLS (SOMPLS) approach to predict multiple inhibitory activities against three serine proteases (thrombin, trypsin and factor Xa). VolSurf descriptors were used as chemical descriptors. From the SOMPLS analysis, we could identify rough trends in the chemical features essential to each serine protease. These chemical features were successfully validated against X-ray crystal structures and the corresponding aligned residues.
A system for predicting the quantity of an organic compound adsorbed on zeolite has been developed. Regression models applicable to various combinations of zeolite and solvent were readily developed using genetic algorithm-based partial least squares (GAPLS). In the models, molecular descriptors of the organic compound are used as explanatory variables and the partition coefficient is used as the objective variable. The system provides accurate predictions for almost all combinations. In addition, using the GAPLS method, we applied the system to selecting the optimal combination of zeolite and solvent for simulated moving bed (SMB) processes. The validity of the system was evaluated for the separation of 2-adamantanone and 2-adamantanol as a representative case. The combinations selected by the system were almost the same as those selected by experiment. The system is intended to shorten the time needed to select the optimal zeolite/solvent combination and to ensure good selection accuracy when developing SMB methods.