1992 Volume 7 Issue 6 Pages 992-1000
Inductive decision tree learning systems automatically derive classification rules from a set of training examples and have been applied in many domains. This paper presents an approach, named INDECTS, for handling continuous attributes such as numerical data in inductive decision tree learning. INDECTS is a variant of the ID3 algorithm with a labeling procedure for numerical attributes. The labeling procedure divides the numerical data into several clusters and creates a new discrete symbolic attribute whose values are defined by data thresholds used to test the numerical attribute. The procedure consists of a dividing operation and a unifying operation on data clusters. The key idea is to collect numerical data into clusters according to their classification class, and to set thresholds between adjacent clusters by maximizing information gain. The labeling procedure is executed at each step of decision tree expansion, so that each numerical attribute is dynamically translated into a discrete attribute with high information gain as a classification criterion. Experiments show that this algorithm is more powerful than existing methods and applicable to more types of numerical data.
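The threshold-selection idea described above can be illustrated with a minimal Python sketch. It is not the INDECTS implementation: the abstract does not specify the dividing and unifying operations in detail, so this sketch assumes that clusters are runs of same-class examples in sorted order, that candidate thresholds are midpoints between adjacent clusters, and that a single binary threshold is chosen. The function names `entropy`, `class_runs`, and `best_threshold` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def class_runs(values, labels):
    """Sort examples by value and group consecutive examples sharing a class.

    Assumption: this stands in for the paper's clustering by class.
    Each run is [class_label, min_value, max_value].
    """
    runs = []
    for value, label in sorted(zip(values, labels)):
        if runs and runs[-1][0] == label:
            runs[-1][2] = value  # extend the current cluster
        else:
            runs.append([label, value, value])  # start a new cluster
    return runs


def best_threshold(values, labels):
    """Return (threshold, gain) for the cluster boundary with maximum gain."""
    base = entropy(labels)
    runs = class_runs(values, labels)
    best = (None, 0.0)
    for left, right in zip(runs, runs[1:]):
        # Assumed candidate: midpoint between two adjacent clusters.
        threshold = (left[2] + right[1]) / 2
        below = [c for v, c in zip(values, labels) if v <= threshold]
        above = [c for v, c in zip(values, labels) if v > threshold]
        if not below or not above:
            continue  # a split that leaves one side empty carries no information
        remainder = (
            len(below) * entropy(below) + len(above) * entropy(above)
        ) / len(labels)
        gain = base - remainder
        if gain > best[1]:
            best = (threshold, gain)
    return best
```

For example, `best_threshold([1.0, 2.0, 3.0, 8.0, 9.0, 10.0], ["a", "a", "a", "b", "b", "b"])` selects the threshold 5.5 with a gain of 1 bit, since that split separates the two classes completely. In a tree learner of the kind described, such a search would be repeated at every node on the examples that reach it, which is what makes the discretization dynamic.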