A Method for Constructing Decision Trees Based on High-Dimensional Survival Data Using the Stabilized Score Test

江村 剛志

doi:10.11329/jjssj.52.373

Abstract

A decision tree is a statistical model constructed by recursively partitioning samples into groups. A decision tree based on survival data (a survival tree) classifies patients into risk groups that are useful for predicting patient prognosis. The significance of each partition is tested with survival analysis methods such as the log-rank test. However, the log-rank test can be unstable in small samples, which makes the significance of partitions difficult to interpret. Furthermore, rpart, the R package for decision trees, may overcorrect the significance for multiple testing when the covariates are high-dimensional. In this article, we introduce a method for constructing a survival tree that alleviates these problems by using the “stabilized score test”. The proposed method also yields a simple tuning procedure based on the P-value of the test. We illustrate the proposed method using a lung cancer dataset. The proposed method can be implemented with the R package “uni.survival.tree”. The R code for the data analysis is given in the Appendix.
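To make the partition-testing step concrete, the following is a minimal Python sketch of the standard two-sample log-rank test applied to one candidate split. It is not the stabilized score test introduced in the article, and it does not use or mirror the API of “uni.survival.tree” or rpart. The function name `logrank_test`, the toy data, and the cut-off value are all hypothetical and chosen only for illustration.

```python
import math


def logrank_test(times, events, groups):
    """Standard two-sample log-rank test (illustrative sketch).

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    groups : 0/1 label giving the side of the candidate split
    Returns (chi-square statistic, P-value on 1 degree of freedom).
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    obs_minus_exp = 0.0  # sum over event times of (observed - expected) in group 1
    variance = 0.0       # sum of hypergeometric variances
    for t in event_times:
        n = sum(1 for x in times if x >= t)                       # at risk, both groups
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g == 1)
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        d1 = sum(1 for x, e, g in zip(times, events, groups)
                 if x == t and e == 1 and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0.0:
        # Degenerate split (for example, one side is empty): no evidence.
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical toy data: eight patients and one covariate (e.g. a gene expression value).
times = [2, 3, 5, 7, 8, 11, 12, 15]
events = [1, 1, 1, 0, 1, 1, 0, 1]
expression = [2.9, 3.4, 2.7, 1.1, 3.0, 0.8, 1.5, 0.9]

cutoff = 2.0  # hypothetical cut-off defining the candidate partition
groups = [1 if x > cutoff else 0 for x in expression]

chi2, p_value = logrank_test(times, events, groups)
print(f"chi-square = {chi2:.3f}, P-value = {p_value:.4f}")
```

With only a handful of patients on each side of the split, the chi-square approximation behind this P-value is rough, which is the small-sample instability the abstract refers to.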



