Host: The Japanese Society for Artificial Intelligence
Name : 117th SIG-FPAI
Number : 117
Location : [in Japanese]
Date : September 29, 2021
Pages 02-
In decision tree learning, we split an instance space into two subspaces based on a feature of the instances at each node. For numerical or ordinal categorical features, this can be done by an "optimal" hyperplane that separates the domain space of those features. For nominal categorical features, however, it is not obvious how to define a "hyperplane" of the domain space. Lucena (Lucena, AISTATS 2020) pointed out that the domain space of nominal categorical features may be structured and exploited this structural information to learn decision trees. In this method, at each internal node, we need to find an "optimal" bipartition of a graph whose vertices are labeled by either +1 or −1 such that both sides are connected and the misclassification is minimized. In this paper, we formalize this problem as Connected Bipartition and investigate its computational complexity.
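To make the optimization target concrete, the following is a minimal brute-force sketch of the bipartition problem described above: enumerate every split of the vertex set into two nonempty, connected sides, predict +1 on one side and −1 on the other, and return the minimum number of mislabeled vertices. The function names and the dictionary-based graph representation are our own illustrative choices, not from the paper, and this exponential-time enumeration is only a specification of the problem, not an efficient algorithm.

```python
from itertools import combinations

def is_connected(adj, nodes):
    # Check connectivity of the subgraph induced by `nodes` via DFS.
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes

def connected_bipartition(adj, labels):
    """Brute force over all bipartitions (A, B) of the vertex set with
    both sides nonempty and connected. Predict +1 on A and -1 on B
    (or the other way around) and count mislabeled vertices; return
    the minimum misclassification over all valid bipartitions."""
    vertices = list(adj)
    n = len(vertices)
    best = None
    for r in range(1, n):
        for side_a in combinations(vertices, r):
            side_b = [v for v in vertices if v not in side_a]
            if not (is_connected(adj, side_a) and is_connected(adj, side_b)):
                continue
            # Misclassification when A is the +1 side.
            err = sum(labels[v] != +1 for v in side_a) + \
                  sum(labels[v] != -1 for v in side_b)
            # Swapping the predicted signs gives n - err errors
            # (labels are strictly +1 or -1), so take the better role.
            err = min(err, n - err)
            best = err if best is None else min(best, err)
    return best
```

For example, on a path graph 1-2-3-4 with labels (+1, +1, −1, −1), the split {1, 2} versus {3, 4} is connected on both sides and classifies every vertex correctly, so the minimum misclassification is 0.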