Over the last half century, a number of new machine learning methods have been developed, including support vector machines (SVMs) and deep neural networks. These methods are highly accurate, but they lack explainability; in particular, deep neural networks provide no information about the importance of individual feature variables. High explainability is expected to help guarantee the reliability of prediction models beyond what the evaluation of prediction accuracy alone can provide. To address this problem, we have developed a factor analysis technique for nonlinear machine learning methods. The technique consists of two statistical steps. The first step, called backward analysis, generates the probability distributions of the positive and negative classes estimated by the prediction model. The second step uses backward elimination based on the Hilbert-Schmidt independence criterion (HSIC) to extract feature variables that are nonlinearly correlated with the outcome. We verified this factor analysis technique by simulation. In an experiment on gene expression data, we extracted new factors relevant to prostate cancer from the feature variables. The experimental results show that the technique has the potential to play a vital role in clinical research.
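To make the second step concrete, the following is a minimal sketch of HSIC-based backward elimination, assuming Gaussian kernels with a median-heuristic bandwidth and the biased empirical HSIC estimator. The function names (rbf_kernel, hsic, backward_elimination_hsic) and the toy data are illustrative assumptions, not the implementation used in the paper, and the backward-analysis step that produces the class probability distributions is not shown.

```python
import numpy as np

def rbf_kernel(x, sigma=None):
    """Gaussian (RBF) kernel matrix; bandwidth set by the median heuristic if unspecified."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    if sigma is None:
        positive = sq_dists[sq_dists > 0]
        sigma = np.sqrt(np.median(positive) / 2.0) if positive.size else 1.0
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def hsic(x, y):
    """Biased empirical HSIC estimate: trace(K H L H) / (n - 1)^2."""
    n = len(x)
    K, L = rbf_kernel(x), rbf_kernel(y)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2

def backward_elimination_hsic(X, y, n_keep):
    """Repeatedly drop the feature whose removal leaves the largest HSIC
    between the remaining feature set and the outcome y."""
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        best_rest, best_score = None, -np.inf
        for j in active:
            rest = [k for k in active if k != j]
            score = hsic(X[:, rest], y)
            if score > best_score:
                best_rest, best_score = rest, score
        active = best_rest
    return active

# Toy usage: the outcome depends nonlinearly on feature 0 only,
# so feature 0 should survive the elimination.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=80)
print(backward_elimination_hsic(X, y, n_keep=2))
```

Because every elimination pass rescores each candidate subset with a full kernel computation, this naive sketch scales poorly with the number of features; a more efficient scoring scheme would presumably be needed for gene-expression-scale data.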