Abstract
Clustering with a Self-Organizing Map (SOM) can extract clusters of arbitrary shape based on the distances between code-vectors (representative points of the input data); it is therefore a “distance-based” clustering approach. In contrast, “distribution-based” clustering approaches take the distribution of the input data into account when extracting clusters. For example, the x-means method incorporates the Bayesian Information Criterion (BIC) into the k-means method. Information criteria can also be introduced easily into SOM-based clustering. In this paper, we propose a clustering method that combines SOM with information criteria. In this method, initial cluster candidates are derived by SOM, and these candidates are then merged based on an information criterion such as BIC or the Akaike Information Criterion (AIC). Through clustering experiments on artificial datasets and on datasets from the UCI Machine Learning Repository, we confirm that the proposed method extracts clusters more accurately and stably than the SOM-only method. Furthermore, we show that AIC is better suited to the proposed method than BIC, and that our method can also estimate the number of clusters in a dataset.
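
For reference, the two criteria named above have the standard forms given below, where \hat{L} is the maximized likelihood of the fitted model, k is its number of free parameters, and n is the number of data points. The final line is only an illustrative assumption about how a merge decision between two cluster candidates C_i and C_j could be phrased, in the spirit of x-means; the symbols C_i, C_j, and IC are introduced here for illustration, and the abstract does not state the exact rule used.

```latex
\begin{align}
  \mathrm{AIC} &= -2 \ln \hat{L} + 2k, \\
  \mathrm{BIC} &= -2 \ln \hat{L} + k \ln n, \\
  % Assumed merge rule (not from the source): merge when the single-cluster
  % model scores lower than the model that keeps the two candidates separate.
  \text{merge } C_i, C_j \quad &\text{if} \quad
  \mathrm{IC}(C_i \cup C_j) < \mathrm{IC}(C_i, C_j),
  \qquad \mathrm{IC} \in \{\mathrm{AIC}, \mathrm{BIC}\}.
\end{align}
```

Because ln n exceeds 2 once n is at least 8, BIC penalizes each extra parameter more heavily than AIC on all but very small datasets, so under a rule of this kind BIC would tend to favor merging more often.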