Which multi-objective optimization technique does K-Means clustering use? Can someone share some information about the optimization technique?


The K-Means clustering algorithm optimizes two objective functions at a time: minimizing the difference between elements within a cluster and maximizing the difference between clusters. So I want to understand which multi-objective optimization technique K-Means uses.


Hi Ravi,
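
One note on the premise first: as far as I know, standard k-means minimizes a single objective, the total within-cluster sum of squares. For a fixed data set the total sum of squares splits into a within-cluster part and a between-cluster part, so minimizing the first automatically maximizes the second. A minimal sketch in base R, assuming the built-in iris data as a stand-in (it is not part of the question):

```r
set.seed(42)                                   # reproducible starting centers
x  <- as.matrix(iris[, 1:4])                   # assumed example data
km <- kmeans(x, centers = 3, nstart = 10)

grand      <- colMeans(x)                      # overall mean of the data
total_ss   <- sum(sweep(x, 2, grand)^2)        # total sum of squares
within_ss  <- km$tot.withinss                  # what k-means minimizes
# Between-cluster part, computed independently from the fitted centers
between_ss <- sum(km$size * rowSums(sweep(km$centers, 2, grand)^2))

# TRUE if total = within + between (up to floating-point tolerance)
isTRUE(all.equal(total_ss, within_ss + between_ss))
```

Because total_ss depends only on the data, lowering within_ss and raising between_ss are the same goal here, not two competing ones.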

The mocca function from the R package MOCCA performs a multi-objective optimization for collecting cluster alternatives. The algorithm draws R bootstrap samples from x. It calculates clusterings for all specified cluster numbers K using k-means, neural gas, and single-linkage clustering. It then applies several cluster validation indices to the clusterings.

mocca(x, R = 50, K = 2:10, iter.max = 1000, nstart = 10)


x         A numeric matrix of data, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with numeric columns).
R         The number of bootstrap samples.
K         The range of cluster numbers, i.e. a vector of integers listing the maximum numbers of clusters to be used by each of the algorithms.
iter.max  The maximum number of iterations allowed in k-means.
nstart    For k-means, the number of random starting sets to try.
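
To make the roles of R, K, iter.max and nstart concrete, here is a rough sketch of the resampling loop in base R. It is not MOCCA's implementation: it uses only k-means, the built-in iris data, and a stand-in index (the share of variance explained), all of which are my assumptions for illustration.

```r
set.seed(1)                        # reproducible resampling
x <- as.matrix(iris[, 1:4])        # assumed example data, not from the post
R <- 10                            # number of bootstrap samples
K <- 2:5                           # cluster numbers to try

# Stand-in quality index (an assumption, not one of MOCCA's indices):
# share of the total sum of squares explained by the clustering.
explained <- function(km) km$betweenss / km$totss

# For every k in K, cluster R bootstrap samples and keep the median index value.
objective <- vapply(K, function(k) {
  median(vapply(seq_len(R), function(r) {
    idx <- sample(nrow(x), replace = TRUE)     # rows drawn with replacement
    explained(kmeans(x[idx, ], centers = k, iter.max = 100, nstart = 5))
  }, numeric(1)))
}, numeric(1))
names(objective) <- K

print(objective)
```

The result has one median value per cluster number, which mirrors a single row of the objective matrix described in the return value.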


The function returns a list with two entries:

cluster        A list containing one sublist for each clustering algorithm and the baseline cluster solution. Each of these lists holds an entry for each cluster size K, which again consists of R vectors of cluster assignments. These vectors assign each data point in x to a cluster.
objectiveVals  A matrix of objective function values. Each row corresponds to a certain cluster validation index applied to a certain clustering algorithm. The columns correspond to different cluster numbers. Consequently, an entry of the matrix specifies the median value of a certain cluster validation index for a certain clustering algorithm with a specific number of clusters over the R bootstrap samples.
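
With several index values per cluster number, one multi-objective reading of such a matrix is to keep only the cluster numbers that no other cluster number beats on every index (the Pareto-optimal ones). A rough sketch in base R; the matrix below is made up for illustration, the row names are hypothetical, and I assume higher values are better for every index:

```r
# Hypothetical objective matrix: rows = validation indices, columns = cluster numbers.
# The values are invented for illustration only.
vals <- matrix(c(0.90, 0.80, 0.70, 0.60,
                 0.50, 0.85, 0.60, 0.55),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("indexA", "indexB"), as.character(2:5)))

# Column j is dominated if some other column is at least as good on every
# index and strictly better on at least one.
is_dominated <- function(j, m) {
  any(vapply(seq_len(ncol(m))[-j], function(k) {
    all(m[, k] >= m[, j]) && any(m[, k] > m[, j])
  }, logical(1)))
}

dominated <- vapply(seq_len(ncol(vals)), is_dominated, logical(1), m = vals)
pareto_K  <- colnames(vals)[!dominated]   # cluster numbers on the Pareto front
print(pareto_K)
```

In this toy matrix, K = 4 and K = 5 are both worse than K = 3 on each index, so only K = 2 and K = 3 remain as alternatives.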

library(MOCCA)   # provides mocca() and the toy5 example data
data(toy5)

res <- mocca(toy5, R=10, K=2:5)

# plot kmeans result for MCA index against neuralgas result for MCA index
plot(res$objectiveVals[1,], res$objectiveVals[5,], pch=NA,
     xlab=rownames(res$objectiveVals)[1], ylab=rownames(res$objectiveVals)[5])
# label each point with its number of clusters
text(res$objectiveVals[1,], res$objectiveVals[5,], labels=colnames(res$objectiveVals))