Optimal number of clusters: how R executes the elbow method



I am running the elbow criterion to determine the optimal number of clusters in R, but I need to understand how R executes the for loop shown below.

Note: ‘whole_sse’ is the scaled data frame in the code below.

wss <- (nrow(whole_sse)-1) * sum(apply(whole_sse, 2, var))
This code prints ‘wss’ as 2634. I understand that it is printing the within sum of squares, but why are we subtracting 1 from the number of rows?

for(i in 1:6){ wss[i] <- sum(kmeans(whole_sse, centers=6)$withinss)}
After executing the for loop, ‘wss’ contains the values below:
962.9898 942.4279 929.6245 987.5077 987.4833 987.5179

How is the for loop assigning values to ‘wss’?


Hello @sunnysingha,

Is this what you are trying to do:

# Total within-cluster sum of squares for k = 1..15 clusters
wss <- sapply(1:15, function(k) kmeans(churn.train, centers = k, nstart = 30, iter.max = 20)$tot.withinss)
plot(1:15, wss, type = "l", xlab = "# of Clusters", ylab = "Total Within SS")
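
On the two questions themselves, here is a minimal sketch of what seems to be going on. It assumes the built-in `iris` measurements as a stand-in for `whole_sse`, and the names `scaled`, `total_ss` and `k1` are only illustrative. `var()` divides by n - 1, so multiplying the summed column variances by `nrow(...) - 1` undoes that division and gives the total sum of squares, which is the within sum of squares for a single cluster. `wss[i] <- ...` then writes into position i of the vector and grows it as needed. As written in the question, the loop starts at `i = 1` and always uses `centers=6`, so it overwrites the first value and stores six runs of a 6-cluster solution that differ only through random starts; the usual elbow loop starts at 2 and uses `centers = i`.

```r
# Assumption: iris stands in for the scaled data frame `whole_sse`.
set.seed(42)
scaled <- scale(iris[, 1:4])

# var() divides by (n - 1); multiplying by (n - 1) recovers the total
# sum of squared deviations from the column means (the k = 1 case).
total_ss <- (nrow(scaled) - 1) * sum(apply(scaled, 2, var))

# Cross-check against kmeans with a single cluster.
k1 <- kmeans(scaled, centers = 1)
print(all.equal(total_ss, k1$tot.withinss))

# Keep the k = 1 value in wss[1], then fill positions 2..6.
# Each assignment to wss[i] extends the vector by one element.
wss <- total_ss
for (i in 2:6) {
  wss[i] <- sum(kmeans(scaled, centers = i, nstart = 25)$withinss)
}
print(wss)
```

Plotting `wss` against `1:6` then gives the elbow curve; `nstart = 25` is an arbitrary choice here to make the result less sensitive to the random initial centers.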