Journal of the Japanese Society for Artificial Intelligence
Online ISSN : 2435-8614
Print ISSN : 2188-2266
Print ISSN:0912-8085 until 2013
Simultaneous Reliability Evaluation of Generality and Accuracy for Characteristic Rule Discovery in Databases
Einoshin SUZUKI
MAGAZINE FREE ACCESS

1999 Volume 14 Issue 1 Pages 139-147

Abstract

This paper presents an evaluation method for discovering highly reliable probabilistic if-then rules from data sets. The discovery of probabilistic if-then rules, each of which is a restricted form of a characteristic production rule, is well motivated by useful applications such as semantic query optimization and the automatic development of knowledge bases. In a discovery algorithm, a production rule is evaluated according to its generality and its accuracy, since these are widely accepted criteria in inductive learning. Evaluating the reliability of these criteria is essential for distinguishing reliable rules from unreliable patterns without burdening users. However, previous discovery approaches for characteristic rules have either ignored reliability evaluation or evaluated only the reliability of generality. Consequently, they tend to discover a huge number of rules, some of which are unreliable in their accuracies. To overcome these difficulties, we propose an approach based on simultaneous estimation. Using normal approximations of multinomial distributions, our approach discovers the rules that exceed pre-specified thresholds for both generality and accuracy with high reliability. A novel pruning method is employed to improve time efficiency without changing the discovery outcome. The proposed approach has been validated experimentally using 21 benchmark data sets from the machine learning community.
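
The idea of requiring both criteria to clear their thresholds with high reliability can be illustrated with a minimal sketch. It is not the paper's simultaneous estimation: as a simplifying assumption, it replaces the joint normal approximation of the multinomial distribution with two separate one-sided bounds on binomial proportions, combined by a Bonferroni split of the error rate. It also assumes that, for a rule "if Y then X", generality is estimated by the fraction of examples satisfying Y and accuracy by the fraction of those that also satisfy X. The function names and parameters are hypothetical.

```python
import math
from statistics import NormalDist


def lower_bound(successes, trials, confidence):
    """One-sided lower confidence bound for a proportion,
    using the normal approximation to the binomial distribution."""
    if trials == 0:
        return 0.0
    p_hat = successes / trials
    z = NormalDist().inv_cdf(confidence)
    return p_hat - z * math.sqrt(p_hat * (1.0 - p_hat) / trials)


def is_reliable_rule(n_total, n_premise, n_both,
                     min_generality, min_accuracy, confidence=0.95):
    """Accept a rule only if the lower bounds of BOTH criteria
    clear their thresholds (simplified stand-in for the paper's test).

    n_total   -- number of examples in the data set
    n_premise -- examples satisfying the premise Y
    n_both    -- examples satisfying both Y and the conclusion X
    """
    # Bonferroni split: each bound gets half of the allowed error,
    # so both hold jointly with probability of at least `confidence`.
    per_criterion = 1.0 - (1.0 - confidence) / 2.0
    generality_lb = lower_bound(n_premise, n_total, per_criterion)
    accuracy_lb = lower_bound(n_both, n_premise, per_criterion)
    return generality_lb >= min_generality and accuracy_lb >= min_accuracy


# Same observed accuracy (0.9), but very different sample sizes.
large = is_reliable_rule(1000, 300, 270, min_generality=0.2, min_accuracy=0.8)
small = is_reliable_rule(20, 10, 9, min_generality=0.2, min_accuracy=0.8)
```

In this sketch the rule backed by 300 premise examples is accepted, while the rule backed by only 10 is rejected even though both have the same observed accuracy, which is the kind of unreliable pattern the abstract says a generality-only check would let through.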

© 1999 The Japanese Society for Artificial Intelligence