2011, Vol. 2011, No. DOCMAS-B101, p. 06-
We propose a novel, efficient on-line algorithm for extracting frequent subsequences from a multiple-data stream. The algorithm addresses an important problem: a large amount of memory is suddenly consumed when bursty arrivals occur in a data stream. Because suppressing memory consumption is essential for an on-line algorithm, such algorithms often take the form of an approximation algorithm in which the error ratio is guaranteed to be lower than a user-specified threshold. Our algorithm is based on an extended version [6] of the LOSSY COUNTING algorithm. The proposed algorithm first limits the available memory to a given fixed size. Whenever it has consumed all of the given memory, it expires the lowest-frequency candidate frequent sequences from memory and stores in their place the new candidates arriving in the data stream. We prove that the proposed algorithm has no false negatives under certain conditions and that it has other properties, such as robustness.
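As a rough illustration of the fixed-memory eviction idea described above, the following Python sketch counts single items rather than subsequences and simply evicts the lowest-count entry whenever the table is full. It is not the proposed algorithm and does not reproduce its guarantees (such as the absence of false negatives); the class name `BoundedCounter`, its methods, and the `capacity` parameter are hypothetical names chosen for this example.

```python
class BoundedCounter:
    """Fixed-capacity frequency table (illustrative sketch only)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity  # assumed memory budget, in entries
        self.counts = {}          # candidate -> count since last insertion

    def add(self, item):
        """Process one arrival from the stream."""
        if item in self.counts:
            self.counts[item] += 1
            return
        if len(self.counts) >= self.capacity:
            # Table is full: expire the candidate with the lowest count.
            victim = min(self.counts, key=self.counts.get)
            del self.counts[victim]
        # Store the newly arrived candidate in the freed slot.
        self.counts[item] = 1

    def frequent(self, min_count):
        """Return the stored candidates whose count is at least min_count."""
        return {k: v for k, v in self.counts.items() if v >= min_count}


counter = BoundedCounter(capacity=2)
for symbol in "aaaabbcab":
    counter.add(symbol)
print(counter.frequent(3))
```

Because an evicted candidate restarts from 1 if it reappears, each stored count in this simplified version is only a lower bound on the true frequency. In the run above, `b` occurs three times in the stream but ends with a stored count of 1, which is the kind of error an approximation guarantee has to control.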