2019 Volume 20 Pages 7-17
Fluorescent substances are used in a wide range of applications, and a method for effectively designing molecules with desirable absorption and emission wavelengths is required. In this study, we used boron-dipyrromethene (BODIPY) compounds as a case study and constructed a high-precision wavelength prediction model using ensemble learning. Prediction accuracy improved with a stacking model that uses RDKit descriptors and Morgan fingerprints. Variables related to the molecular skeleton and the conjugation length were shown to be important. We also proposed an applicability domain (AD) estimation model that directly uses the descriptors and is based on Tanimoto distance. The proposed AD models performed better than the one-class support vector machine (OCSVM)-based model. Using the proposed stacking model and AD model, we screened newly generated compounds and obtained 602 compounds that were estimated to be inside the AD for both absorption wavelength and emission wavelength.
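As an illustration of how a Tanimoto-distance-based AD check can work, the following minimal sketch treats each compound as a set of on-bit indices from a binary fingerprint and flags a query as inside the AD when its mean distance to the k nearest training compounds is below a threshold. The k-nearest-neighbour criterion, the function names, and the values of `k` and `threshold` are assumptions made for illustration only; they are not taken from this study and may differ from the proposed AD model.

```python
def tanimoto_distance(a, b):
    """Tanimoto (Jaccard) distance between two sets of on-bit indices."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty fingerprints are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def inside_ad(query, training_set, k=3, threshold=0.5):
    """Hypothetical AD rule: mean distance to the k nearest training
    compounds must not exceed the threshold (both values are assumed)."""
    if not training_set:
        raise ValueError("training_set must contain at least one fingerprint")
    distances = sorted(tanimoto_distance(query, t) for t in training_set)
    nearest = distances[:k]
    return sum(nearest) / len(nearest) <= threshold


# Toy fingerprints (sets of on-bit indices), not real BODIPY data.
train = [{1, 2, 3, 4}, {1, 2, 3, 5}, {2, 3, 4, 6}]
similar = {1, 2, 3, 6}       # shares most bits with the training compounds
dissimilar = {10, 11, 12}    # shares no bits with the training compounds

similar_in_ad = inside_ad(similar, train)
dissimilar_in_ad = inside_ad(dissimilar, train)
```

In practice the bit sets would come from a fingerprint such as the Morgan fingerprint mentioned above, and the threshold would be tuned on held-out data rather than fixed by hand.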