How to select the optimal value of k for k-means clustering




How is the value of k selected for clustering? Is it based on the business case, or is it done by some other means, such as iterating through different values, observing the changes, and then picking one?

How to find the number of cluster in K-means algorithm?
How to determine the optimal number of clusters in R


As far as I know, you can plot an elbow curve of the within-cluster distance (typically the total within-cluster sum of squares) against the number of clusters. The point where the slope changes most sharply is taken as the optimal number of clusters. Why is it called an elbow curve? Because it looks like an elbow!
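A minimal sketch of the elbow curve in base R, assuming the same iris data used further down and assuming the total within-cluster sum of squares (`tot.withinss` from `kmeans`) as the distance measure. The range 1 to 10 and `nstart = 25` are arbitrary choices for illustration, not recommendations.

```r
set.seed(42)                 # k-means starts from random centers, so fix the seed
x  <- iris[, -5]             # drop the species label, keep the four numeric columns
ks <- 1:10                   # candidate numbers of clusters (arbitrary range)

# Total within-cluster sum of squares for each candidate k
wss <- sapply(ks, function(k) kmeans(x, centers = k, nstart = 25)$tot.withinss)

# The elbow is the k after which the curve flattens out
plot(ks, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```

Reading the elbow off the plot is a judgment call; when the bend is not obvious, an index-based vote like the one below may be easier to interpret.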

Now, how to do it in R:

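library(NbClust)
# Drop the species column; NbClust computes many validity indices for each candidate k
fit_nbclust <- NbClust(iris[, -5], method = "kmeans")
barplot(table(fit_nbclust$Best.nc[1, ]))  # number of indices voting for each k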

After running this code, the number of clusters proposed by the largest number of indices is the suggested optimum. You can then run k-means with that number of clusters.
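That last step could look roughly like this. `majority_k` is a hypothetical helper, not part of NbClust, and the `votes` vector is made up so the snippet runs without the package. With a real result you would pass `fit_nbclust$Best.nc[1, ]` instead. The filter on zero assumes that NbClust records 0 for its graphical indices, which do not propose a number.

```r
# Hypothetical helper: the k proposed most often in a vector of per-index votes
majority_k <- function(votes) {
  votes <- votes[votes > 0]            # assumption: 0 means "no numeric proposal"
  tally <- table(votes)
  as.integer(names(tally)[which.max(tally)])  # ties go to the smallest k
}

votes  <- c(0, 0, 2, 2, 3, 3, 3, 3, 4) # made-up votes, for illustration only
best_k <- majority_k(votes)

set.seed(42)                           # reproducible random starts
km <- kmeans(iris[, -5], centers = best_k, nstart = 25)
table(km$cluster, iris$Species)        # compare clusters with the known species
```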

Hope this helps.