STRIM (Statistical Test Rule Induction Method) has been proposed as a method for effectively inducing if-then rules from a decision table, and its effectiveness has been confirmed by simulation experiments. However, the previous work only pointed out the problems of conventional methods and proposed a method to overcome them; STRIM has yet to be applied to several kinds of problems and has limitations that must be resolved before it can be applied to real-world datasets. This paper further examines the limitations of STRIM as presently introduced and considers several conditions for its application and use. Specifically, it removes the limitation on the number of decision attribute values, and clarifies the principle by which STRIM induces true rules, the size of the dataset needed, and the relationships and differences between STRIM and conventional methods. Real-world datasets often contain missing values in the condition attributes and contaminated values in the decision attribute, for various reasons. Based on the above considerations, this paper reports simulation experiments that examine the capability of STRIM under such circumstances. The results suggest that the method is sufficiently robust for application to real-world datasets.