IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Finding Frequent Closed Itemsets in Sliding Window in Linear Time
Junbo CHENBo ZHOULu CHENXinyu WANGYiqun DING
Author information
JOURNAL FREE ACCESS

2008 Volume E91.D Issue 10 Pages 2406-2418

Details
Abstract

One of the most well-studied problems in data mining is computing the collection of frequent itemsets in large transactional databases. Since the introduction of the famous Apriori algorithm [14], many others have been proposed to find the frequent itemsets. Among such algorithms, the approach of mining closed itemsets has raised much interest in data mining community. The algorithms taking this approach include TITANIC [8], CLOSET+[6], DCI-Closed [4], FCI-Stream [3], GC-Tree [15], TGC-Tree [16] etc. Among these algorithms, FCI-Stream, GC-Tree and TGC-Tree are online algorithms work under sliding window environments. By the performance evaluation in [16], GC-Tree [15] is the fastest one. In this paper, an improved algorithm based on GC-Tree is proposed, the computational complexity of which is proved to be a linear combination of the average transaction size and the average closed itemset size. The algorithm is based on the essential theorem presented in Sect. 4.2. Empirically, the new algorithm is several orders of magnitude faster than the state of art algorithm, GC-Tree.

Content from these authors
© 2008 The Institute of Electronics, Information and Communication Engineers
Previous article Next article
feedback
Top