Host: The Japanese Society for Artificial Intelligence
Name: 34th Annual Conference, 2020
Number: 34
Location: Online
Date: June 09, 2020 - June 12, 2020
Machine learning is expected to find applications in the life sciences. However, life science data tend to be high-dimensional with relatively small sample sizes, which leads to model overfitting and poor explainability. LASSO, a sparse modeling algorithm, is one way to address these problems; nevertheless, its prediction performance is usually worse than that of general-purpose algorithms such as neural networks. To achieve both performance and explainability, we propose a two-phase prediction method. Instead of predicting labels directly from features, we build two models independently: one predicts an intermediate status from the features, and the other predicts labels from the intermediate status. The two models are then combined. Molecular biology features such as gene expression are recommended as the intermediate status. To evaluate the method, we applied it to ecological data for flowering prediction and to medical data for cancer type prediction. The results of both applications indicate that our approach achieves both performance and explainability.
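The two-phase composition described above can be illustrated with a minimal sketch. It is not the authors' implementation: a one-variable least-squares fit and a midpoint threshold stand in for whatever models are actually used in each phase, and all names (`fit_linear`, `fit_threshold`, `compose`) and the toy numbers are hypothetical. The point it shows is only that the two models are fitted on separate data sets and then chained, so the intermediate value can be inspected for each prediction.

```python
from statistics import mean


def fit_linear(xs, ys):
    """Phase 1 stand-in: ordinary least squares for y = a * x + b (one feature)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return lambda x: a * x + b


def fit_threshold(zs, labels):
    """Phase 2 stand-in: cut at the midpoint between the two class means."""
    m0 = mean(z for z, lab in zip(zs, labels) if lab == 0)
    m1 = mean(z for z, lab in zip(zs, labels) if lab == 1)
    cut = (m0 + m1) / 2
    # Orient the rule so the class with the larger mean lies above the cut.
    if m1 >= m0:
        return lambda z: int(z > cut)
    return lambda z: int(z < cut)


def compose(phase1, phase2):
    """Chain the two independently fitted models: feature -> status -> label."""
    return lambda x: phase2(phase1(x))


# Toy data set A (made up): a feature paired with an intermediate status,
# e.g. an environmental measurement and a gene expression level.
feature_a = [0.0, 1.0, 2.0, 3.0, 4.0]
status_a = [0.1, 0.9, 2.1, 2.9, 4.0]

# Toy data set B (made up, different samples): intermediate status with labels.
status_b = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
label_b = [0, 0, 0, 1, 1, 1]

phase1 = fit_linear(feature_a, status_a)    # features -> intermediate status
phase2 = fit_threshold(status_b, label_b)   # intermediate status -> label
predict = compose(phase1, phase2)

new_features = [0.5, 3.5]
intermediate = [phase1(x) for x in new_features]  # inspectable explanation
predictions = [predict(x) for x in new_features]
print(intermediate, predictions)
```

Because the phases are fitted separately, data set A needs no labels and data set B needs no raw features, and the `intermediate` values give a biologically interpretable reason for each predicted label.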