1995, Volume 10, Issue 4, pp. 572-579
This paper proposes an inference algorithm for classification trees based on MDL, the Minimum Description Length criterion. MDL is a well-known model selection criterion which stipulates that one should select the model that minimizes the total number of bits needed to encode the information necessary to reproduce the original input data, including the description of the model and its parameters. First, an exact expression for the description length of classification trees under a particular encoding scheme is given. Then an efficient algorithm for generating classification trees is proposed, in which trees are produced in a top-down manner, with terminal nodes replaced by decision nodes so as to reduce the coding length. Finally, simulation experiments are described which show that the MDL-based algorithm outperforms one based on a related model selection criterion, AIC. In particular, the classification trees obtained using MDL were much more consistent with the true model.
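The top-down procedure summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's encoding scheme, so the code lengths below are assumptions made only for illustration: 1 bit per node to mark it as terminal or decision, log2(number of attributes) bits to name the split attribute, and, at each terminal node, log2(n + 1) bits for the count of positive labels plus log2 C(n, k) bits for their positions. The names `leaf_bits` and `grow` are hypothetical, and attributes and labels are assumed to be binary.

```python
import math


def leaf_bits(labels):
    """Bits to send binary labels at a leaf: the count of 1s, then their positions.

    Assumed illustrative code, not the encoding scheme from the paper.
    """
    n, k = len(labels), sum(labels)
    return math.log2(n + 1) + math.log2(math.comb(n, k))


def grow(rows, labels, features, n_features):
    """Greedy top-down growth; returns (tree, total_bits).

    tree is ("leaf", majority_label) or ("split", feature, left, right).
    """
    # Cost of leaving this node terminal: 1 node-type bit + label cost.
    leaf_cost = 1.0 + leaf_bits(labels)
    # Cost of a decision node header: 1 node-type bit + attribute name.
    split_header = 1.0 + math.log2(n_features)

    best = None  # (cost, feature) of the cheapest one-level split
    for f in features:
        left = [y for r, y in zip(rows, labels) if r[f] == 0]
        right = [y for r, y in zip(rows, labels) if r[f] == 1]
        if not left or not right:
            continue  # this attribute does not separate the data here
        cost = split_header + (1.0 + leaf_bits(left)) + (1.0 + leaf_bits(right))
        if best is None or cost < best[0]:
            best = (cost, f)

    # Keep the terminal node unless replacing it shortens the description.
    if best is None or best[0] >= leaf_cost:
        majority = int(2 * sum(labels) >= len(labels))
        return ("leaf", majority), leaf_cost

    f = best[1]
    rest = [g for g in features if g != f]
    subtrees, total = [], split_header
    for value in (0, 1):
        idx = [i for i, r in enumerate(rows) if r[f] == value]
        tree, bits = grow([rows[i] for i in idx], [labels[i] for i in idx],
                          rest, n_features)
        subtrees.append(tree)
        total += bits
    return ("split", f, subtrees[0], subtrees[1]), total
```

On a toy data set with two attributes in which the label simply copies attribute 0, this sketch splits once on attribute 0 and then stops, because any further split would cost more bits than it saves. When the labels are unrelated to both attributes, it keeps a single terminal node.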