2011 Volume 1 Issue 1 Pages 11-22
A clustering procedure is developed to classify the row vectors of a multivariate data matrix into clusters whose sizes, that is, the numbers of vectors allocated to the clusters, are kept fixed at prescribed numbers. In this fixed size clustering, [1] the estimation of the centroid vectors of the clusters and [2] the permutation of the rows of the data matrix are iterated alternately, so that the sum of the squared distances between the centroid vectors and the permuted row vectors is minimized. Step [2] can be called least squares permutation: the permutation matrix that optimally matches a permuted matrix to a target matrix is obtained with a simple iterative algorithm. In simulation studies and an application to a real data set, the fixed size clustering procedure was found to classify the vectors with sufficient accuracy, and the least squares permutation was found to recover the true permutation matrices well, though the procedures often yielded local minima.
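
The alternation of steps [1] and [2] can be illustrated with a minimal sketch. It is not the algorithm of the paper: the abstract does not specify the simple iterative algorithm used for step [2], so the sketch substitutes SciPy's `linear_sum_assignment`, which solves the row-to-slot matching exactly. The function name `fixed_size_clustering` and the parameters `sizes`, `n_iter` and `seed` are hypothetical, and every prescribed size is assumed to be at least 1.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def fixed_size_clustering(X, sizes, n_iter=100, seed=0):
    """Illustrative sketch only; names and defaults are assumptions."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if sum(sizes) != n or min(sizes) < 1:
        raise ValueError("sizes must be positive and sum to the number of rows")

    n_clusters = len(sizes)
    # One "slot" per row; slot_cluster[j] is the cluster that slot j belongs to.
    slot_cluster = np.repeat(np.arange(n_clusters), sizes)
    # Random start that already respects the prescribed sizes.
    rng = np.random.default_rng(seed)
    labels = rng.permutation(slot_cluster)

    prev_loss = np.inf
    for _ in range(n_iter):
        # Step [1]: centroid of each cluster under the current allocation.
        centroids = np.vstack(
            [X[labels == k].mean(axis=0) for k in range(n_clusters)]
        )
        # Step [2]: match rows to slots so that the sum of squared distances
        # to the target matrix (one centroid row per slot) is smallest.
        # Assumption: an exact assignment solver stands in for the paper's
        # unspecified iterative algorithm.
        target = centroids[slot_cluster]
        cost = ((X[:, None, :] - target[None, :, :]) ** 2).sum(axis=2)
        rows, cols = linear_sum_assignment(cost)
        labels = np.empty(n, dtype=int)
        labels[rows] = slot_cluster[cols]
        loss = cost[rows, cols].sum()
        # Stop once the loss no longer decreases.
        if prev_loss - loss < 1e-12:
            break
        prev_loss = loss

    # loss is measured against the centroids used in the last step [2].
    return labels, loss
```

Because each step [2] is a permutation of rows onto a fixed set of slots, the cluster sizes are preserved at every iteration. The result still depends on the random start, so in practice several values of `seed` would be tried and the allocation with the smallest loss retained, in line with the local minima reported above.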