Abstract
In the experimental design of functional molecules, materials, and products, complicated relationships exist between experimental parameters and the target physical and chemical properties. Regression analysis of experimental data is a useful way to understand those relationships, and the resulting regression model can be used to search for functional products efficiently. However, although such products may lie outside the domain covered by existing data, the predictive ability of a model tends to be low in regions where data density is low, and new candidates whose predicted property values are unreliable are unlikely to achieve the desired values. Therefore, to search for new candidates in appropriate extrapolation domains, we consider both the probability that a new candidate will have the intended value of a property and the reliability of the value predicted for that candidate. The probability is calculated from the predicted value and its estimated prediction error, and the reliability is based on data density. The proposed method is applied to simulation data and to aqueous solubility data, and its efficiency is confirmed. In addition, an appropriate regression method can be selected automatically at each step of the search according to the features of the existing data.
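As an illustration of the two quantities described above, the following is a minimal sketch and not the formulation used in this work. It assumes that the prediction error is Gaussian, so that the probability of reaching a target range follows from the normal cumulative distribution function, and it assumes a simple k-nearest-neighbor distance as a stand-in for data density. The function names `probability_in_range` and `knn_density`, the parameter `k`, and the mapping of distance to a score in (0, 1] are all hypothetical choices made for this example.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_in_range(predicted, error_std, lower, upper):
    """Probability that the true value lies in [lower, upper].

    Assumption: the true value is distributed as
    Normal(predicted, error_std**2).
    """
    if error_std <= 0:
        raise ValueError("error_std must be positive")
    upper_z = (upper - predicted) / error_std
    lower_z = (lower - predicted) / error_std
    return normal_cdf(upper_z) - normal_cdf(lower_z)


def knn_density(candidate, existing, k=3):
    """Density score in (0, 1] for a candidate point.

    Assumption: density is approximated by the mean Euclidean distance
    to the k nearest existing samples, mapped so that a smaller
    distance gives a score closer to 1.
    """
    if not existing:
        raise ValueError("existing must contain at least one sample")
    distances = sorted(math.dist(candidate, x) for x in existing)
    nearest = distances[:k]
    mean_distance = sum(nearest) / len(nearest)
    return 1.0 / (1.0 + mean_distance)


# Hypothetical existing samples in a two-parameter experimental space.
existing_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# A candidate inside the sampled region and one far outside it.
inside_score = knn_density((0.5, 0.5), existing_samples)
outside_score = knn_density((5.0, 5.0), existing_samples)

# Probability that a candidate predicted at 1.2 with an estimated
# error of 0.5 reaches a target value of at least 1.0.
p_target = probability_in_range(1.2, 0.5, 1.0, math.inf)
```

In a search of this kind, a candidate would be preferred when both its probability of reaching the target and its density score are high. How the two are combined, and what threshold separates an acceptable extrapolation from an unreliable one, are not specified here and are left open in this sketch.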