Transactions of the Japanese Society for Artificial Intelligence
Online ISSN : 1346-8030
Print ISSN : 1346-0714
ISSN-L : 1346-0714
Paper
Detecting Outlier Clusters via Information-Theoretic Clustering
Shin Ando (安藤 晋), Einoshin Suzuki (鈴木 英之進)
Journal, free access

2008, Volume 23, Issue 5, pp. 344-354

Abstract
Identifying atypical objects is one of the classic tasks in machine learning. Recent work, e.g., One-class Clustering and Minority Detection, has extended this task to identifying clusters of atypical objects that contrast strongly with the rest of the dataset. In such problems, avoiding false positive detections is an important yet difficult issue. In this paper, we propose an information-theoretic clustering method that aims to compactly represent the global and local structures of the dataset and to identify atypical clusters in terms of information-geometric distance. The former objective contributes to reducing the number of false positive detections. Its formalization further yields a unifying view of classic outlier detection and these newer tasks. We present a scalable algorithm for detecting multiple clusters of atypical objects without a predefined number of clusters. The algorithm is evaluated as an unsupervised two-class classifier on simulated datasets and a text classification benchmark.
© 2008 JSAI (The Japanese Society for Artificial Intelligence)