This paper proposes two clustering algorithms that handle data with tolerance. One is based on hard c-means (HCM) and the other on learning vector quantization (LVQC). Tolerance is a new concept for handling, within an optimization framework, data with uncertainty such as errors, ranges, or missing attributes. Both algorithms incorporate this concept. In earlier clustering algorithms for such data, dissimilarity is defined by the nearest-neighbor, furthest-neighbor, or Hausdorff distance. In the proposed algorithms, by contrast, dissimilarity is defined by the squared L2 (Euclidean) norm, so data with uncertainty can be handled as a strict optimization problem. First, the concept of tolerance, which covers errors, ranges, and missing attributes of data, is described. Next, optimization problems that take the tolerance into account are formulated, and a unique, explicit optimal solution is derived from the Karush-Kuhn-Tucker conditions. An alternate minimization algorithm and a learning algorithm are then constructed. Finally, the effectiveness of the proposed algorithms is verified through numerical examples.
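As a rough illustration of the HCM-based alternate minimization, the sketch below assumes a formulation that the abstract does not state explicitly: each datum x_k carries a tolerance vector eps_k bounded by a radius kappa_k, and the objective is the sum of squared Euclidean distances between x_k + eps_k and its assigned cluster center. Under that assumption, the constrained update for eps_k moves the datum toward its center by at most kappa_k. The function name `hcm_with_tolerance`, its parameters, and the update rule are illustrative assumptions, not the paper's implementation.

```python
import math


def hcm_with_tolerance(points, kappas, centers, n_iter=20):
    """Sketch of alternate minimization for hard c-means with tolerance.

    Assumed objective (not taken from the paper):
        sum_k ||x_k + eps_k - v_{c(k)}||^2   subject to   ||eps_k|| <= kappa_k
    where c(k) is the cluster assigned to point k.
    """
    dim = len(points[0])
    centers = [list(v) for v in centers]
    eps = [[0.0] * dim for _ in points]
    labels = [0] * len(points)

    for _ in range(n_iter):
        # Step 1: assign each shifted point x_k + eps_k to its nearest center.
        for k, x in enumerate(points):
            shifted = [x[j] + eps[k][j] for j in range(dim)]
            labels[k] = min(
                range(len(centers)),
                key=lambda i: sum(
                    (shifted[j] - centers[i][j]) ** 2 for j in range(dim)
                ),
            )

        # Step 2: each center becomes the mean of its shifted members.
        for i in range(len(centers)):
            members = [k for k in range(len(points)) if labels[k] == i]
            if members:  # keep the old center if the cluster is empty
                centers[i] = [
                    sum(points[k][j] + eps[k][j] for k in members) / len(members)
                    for j in range(dim)
                ]

        # Step 3: move each point toward its center, clipped to radius kappa_k.
        for k, x in enumerate(points):
            diff = [x[j] - centers[labels[k]][j] for j in range(dim)]
            norm = math.sqrt(sum(d * d for d in diff))
            alpha = 1.0 if norm == 0.0 else min(kappas[k] / norm, 1.0)
            eps[k] = [-alpha * d for d in diff]

    return labels, centers, eps
```

Setting every kappa_k to zero reduces this sketch to ordinary hard c-means, while a large kappa_k lets a datum coincide with its center, which is one way a missing attribute could be modeled as a wide tolerance range.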