Quinolone derivatives are antibiotics with a 1,4-dihydro-quinoline-3-carboxylic acid skeleton and have been developed by many pharmaceutical companies. Nucleophilic aromatic substitution (SNAr) is the key reaction in manufacturing quinolone derivatives. The ability to predict experimental yields of SNAr reactions is therefore very useful for developing synthetic pathways to these compounds. In the present study, we attempted to predict reaction yields with the GA-PLS method (partial least squares regression with variable selection by a genetic algorithm), using values obtained from molecular orbital (MO) calculations as explanatory variables. The analysis also included experimental parameters such as the dielectric constants of the solvents and the reaction temperatures. Although the nucleophiles and quinolone derivatives had to be classified according to their geometry, we succeeded in building models that relate the experimental yields to these parameters. Analyses that used the energy levels of more than two orbitals near the HOMO and LUMO gave better results than those that used only the HOMO and LUMO energy levels. The GA-PLS method selected variables closely related to the reaction mechanism when building the models. However, to obtain models with high R² values, the method also selected as important explanatory variables such parameters as the energy levels of the HOMO and LUMO and the electron densities of the products. These variables are unrelated to the SNAr mechanism and depend on the classification of the reactants according to their geometry. The differences in the selected parameters would suggest that the reaction mechanism differs with the combination of nucleophile and quinolone, which contradicts the experimental finding that all of the reactions proceed by the same mechanism, i.e., the SNAr mechanism. We therefore tried classifying the data by reaction conditions instead. Classification according to the isolation method yielded a model that did not depend on chemical structure. This shows that differences in the isolation method strongly influence the yield used as the target variable, which makes modeling difficult. We conclude that reaction yields can be predicted from parameters obtained by MO calculations, and that differences in isolation method were responsible for the difficulty of building a unified model.
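The GA-PLS variable selection described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the PLS step is a plain NIPALS PLS1 written with NumPy, the fitness is a leave-one-out Q² (the abstract only mentions R²), and the function names, GA settings (`pop_size`, `generations`, `mutation_rate`), and the synthetic descriptor matrix are all hypothetical stand-ins for the MO-derived descriptors and measured yields.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit single-response PLS by NIPALS; return (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                      # weights: covariance of X with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                   # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                         # scores
        tt = t @ t
        p = Xk.T @ t / tt                  # X loadings
        c = (yk @ t) / tt                  # y loading
        Xk = Xk - np.outer(t, p)           # deflate X and y
        yk = yk - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # regression vector in X space
    return x_mean, y_mean, coef


def q2_loo(X, y, n_components):
    """Leave-one-out cross-validated Q2 (assumed fitness measure)."""
    press = 0.0
    idx = np.arange(len(y))
    for i in idx:
        keep = idx != i
        x_mean, y_mean, coef = pls1_fit(X[keep], y[keep], n_components)
        pred = y_mean + (X[i] - x_mean) @ coef
        press += (y[i] - pred) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


def ga_pls(X, y, n_components=2, pop_size=30, generations=40,
           mutation_rate=0.05, seed=0):
    """Select columns of X with a simple GA; return (boolean mask, best Q2)."""
    rng = np.random.default_rng(seed)
    n_vars = X.shape[1]
    pop = rng.random((pop_size, n_vars)) < 0.5   # each row is a variable mask

    def fitness(mask):
        n_sel = int(mask.sum())
        if n_sel == 0:
            return -np.inf                       # empty models are invalid
        return q2_loo(X[:, mask], y, min(n_components, n_sel))

    for _ in range(generations):
        scores = np.array([fitness(m) for m in pop])
        elite = pop[np.argsort(scores)[::-1][: pop_size // 2]]  # keep best half
        children = []
        while len(children) < pop_size - len(elite):
            a, b = elite[rng.integers(len(elite), size=2)]
            cut = rng.integers(1, n_vars)                # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n_vars) < mutation_rate  # bit-flip mutation
            children.append(child)
        pop = np.vstack([elite, np.array(children)])
    scores = np.array([fitness(m) for m in pop])
    return pop[np.argmax(scores)], float(scores.max())


# Synthetic stand-in data (NOT from the study): 30 "reactions", 8 descriptors,
# with the "yield" depending only on columns 0 and 3.
rng_demo = np.random.default_rng(42)
X_demo = rng_demo.normal(size=(30, 8))
y_demo = 2.0 * X_demo[:, 0] - 3.0 * X_demo[:, 3] + rng_demo.normal(scale=0.1, size=30)
best_mask, best_q2 = ga_pls(X_demo, y_demo, pop_size=20, generations=15)
print("selected columns:", np.flatnonzero(best_mask), "Q2 = %.3f" % best_q2)
```

In a sketch like this, fitting a separate model for each class of reactions (for example, one per isolation method) would amount to calling `ga_pls` on each subset of rows and comparing which columns are selected.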