On-line High-Speed Approximate Sequence Mining with Guaranteed Accuracy

村田 順平; 岩沼 宏治; 石原 龍一; 鍋島 英知

doi:10.11517/jsaisigtwo.2009.DMSM-A803_11

Abstract

We propose a fast on-line approximation algorithm for extracting frequent subsequences from stream data. In an on-line algorithm, suppressing memory consumption is essential, so such algorithms often take the form of an approximation algorithm whose error ratio is guaranteed to be lower than a user-specified threshold. Our algorithm is based on the LOSSY COUNTING algorithm [1, 4], a well-known method for extracting frequent items from stream data. We extend LOSSY COUNTING to extract frequent subsequences from stream data by using head frequency, a measure of the frequency of subsequences. We estimate the approximation accuracy and the space complexity of the proposed algorithm. The memory consumption is of order (M/ϵ) log N, where M is the maximum number of subsequences obtained in each window, ϵ is a user-specified error ratio, and N is the length of the stream. Through experiments, we show that the proposed algorithm scales well with the length of the stream and keeps memory consumption below the estimated value.
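For orientation, the following is a minimal sketch of the classic LOSSY COUNTING scheme for single items, which the abstract names as the starting point. It is not the proposed subsequence algorithm and does not use head frequency. The function names `lossy_counting` and `frequent_items` and the parameter `support` are assumptions introduced here for illustration only.

```python
import math


def lossy_counting(stream, epsilon):
    """Classic Lossy Counting over single items (illustrative sketch only)."""
    width = math.ceil(1 / epsilon)  # bucket width w = ceil(1 / epsilon)
    entries = {}                    # item -> [count, delta], delta = max undercount
    n = 0                           # number of items seen so far
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in entries:
            entries[item][0] += 1
        else:
            # A new item may have been missed in up to (bucket - 1) earlier buckets.
            entries[item] = [1, bucket - 1]
        if n % width == 0:
            # Bucket boundary: drop entries whose upper bound is too small.
            stale = [k for k, (c, d) in entries.items() if c + d <= bucket]
            for k in stale:
                del entries[k]
    return entries, n


def frequent_items(entries, n, support, epsilon):
    """Report items whose stored count is at least (support - epsilon) * n."""
    return {k: c for k, (c, d) in entries.items() if c >= (support - epsilon) * n}


# Hypothetical stream: "a" at every even position, "b" at every fourth, the rest unique.
stream = ["a" if i % 2 == 0 else "b" if i % 4 == 1 else f"x{i}" for i in range(100)]
entries, n = lossy_counting(stream, epsilon=0.1)
print(frequent_items(entries, n, support=0.4, epsilon=0.1))
```

Each stored count underestimates the true count by at most ϵN, which is the kind of user-specified error guarantee the abstract refers to; rare items are pruned at bucket boundaries, which keeps the table small.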


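Notice from the publisher

All articles of the Type 2 SIGs (第二種研究会) can be accessed without authentication. The copyright of each article belongs, in principle, to its authors.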
