Independent observations $X_1, X_2, \ldots, X_n$ are made on a distribution $F$ on $\mathbb{R}^d$. To divide these observations into $k$ clusters, first choose a vector of optimal cluster centers $b_n = (b_{n1}, b_{n2}, \ldots, b_{nk})$ to minimize $W_n(a) = \frac{1}{n} \sum_{i=1}^{n} \min_{1 \le j \le k} \|X_i - a_j\|^2$ as a function of $a = (a_1, a_2, \ldots, a_k)$, then assign each observation to its nearest cluster center. Each $b_{nj}$ is the mean of the observations in its cluster. Pollard (1982) obtained a central limit theorem for the means of the $k$ clusters. In this paper, it is shown that the bootstrap distribution of the centered $b_n$ has the same limiting distribution; the argument rests on the asymptotic behavior of empirical processes on Vapnik-Chervonenkis classes in a triangular array setting. Advantages of the bootstrap methods are discussed, and the performance of bootstrap confidence sets is compared with Pollard's confidence sets by Monte Carlo simulation.
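As a rough illustration of the procedure described above, the following Python sketch evaluates the criterion W_n(a), computes cluster centers, and resamples the data to obtain bootstrap replicates of the centered b_n. It is not the paper's implementation. All function names (`within_cluster_ss`, `lloyd`, `bootstrap_centers`) and the simulated two-cluster data are hypothetical. Lloyd's iterations are swapped in as a local minimizer of W_n, whereas the abstract assumes the global minimizer, and starting each bootstrap fit from b_n to keep cluster labels aligned is an assumption made here for simplicity.

```python
import numpy as np

def within_cluster_ss(X, a):
    """W_n(a) = (1/n) * sum_i min_j ||X_i - a_j||^2."""
    d2 = ((X[:, None, :] - a[None, :, :]) ** 2).sum(axis=2)  # shape (n, k)
    return d2.min(axis=1).mean()

def lloyd(X, init, n_iter=100):
    """Lloyd iterations from `init`; returns a local (not necessarily
    global) minimizer of W_n. Each center is the mean of its cluster."""
    a = init.copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - a[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)          # nearest-center assignment
        new_a = a.copy()
        for j in range(len(a)):
            members = X[labels == j]
            if len(members) > 0:            # leave an empty cluster's center in place
                new_a[j] = members.mean(axis=0)
        if np.allclose(new_a, a):
            break
        a = new_a
    return a

def bootstrap_centers(X, b_n, n_boot=200, seed=0):
    """Refit the centers on n_boot resamples drawn with replacement.
    Starting from b_n keeps cluster labels aligned (an assumption)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    out = np.empty((n_boot,) + b_n.shape)
    for r in range(n_boot):
        Xs = X[rng.integers(0, n, size=n)]  # bootstrap sample of size n
        out[r] = lloyd(Xs, b_n)
    return out

# Simulated data: two well-separated Gaussian clusters in R^2 (hypothetical).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-3.0, 1.0, size=(150, 2)),
               rng.normal(3.0, 1.0, size=(150, 2))])
init = np.array([[-1.0, 0.0], [1.0, 0.0]])

b_n = lloyd(X, init)
boot = bootstrap_centers(X, b_n)
# Bootstrap replicates of sqrt(n) * (b_n^* - b_n), the centered and scaled centers.
centered = np.sqrt(len(X)) * (boot - b_n)
```

The empirical spread of `centered` could then be used to form a bootstrap confidence set for the cluster centers; how such a set compares with one based on Pollard's central limit theorem is the subject of the paper's Monte Carlo study and is not reproduced here.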