How to find variable importance in R

k-means, clustering

#1

Hello,

When clustering in SPSS, an output like the one shown below is generated:


From this we can infer which variables are the strongest (most important) in the clustering.
How can I find something like this in R?


#2

Hi,
There is no built-in variable importance for k-means clusters. You have to define the importance yourself through weights, or the weights are calculated to create a margin (t-SNE type).
The clusters are based on the distance of the points from the centroids (the number of iterations and the initial choice of centroids will influence the result). The Wikipedia article on k-means explains this well, have a look.
The distance can be of several types: Manhattan, Euclidean or others. In genomics, for example, we use other types.
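To illustrate the different distance types, here is a minimal sketch. It assumes the four numeric columns of the built-in iris data as toy data, which is not from the original question. Note that stats::kmeans() itself works with squared Euclidean distance only, so other distances are usually passed to a method that accepts a distance matrix, such as hclust().

```r
# Toy data (an assumption for illustration): numeric columns of iris, centered and scaled
x <- scale(iris[, 1:4])

# Pairwise distances between observations under two different metrics
d_euc <- dist(x, method = "euclidean")
d_man <- dist(x, method = "manhattan")

# kmeans() cannot take a dist object; hclust() can
hc_man <- hclust(d_man, method = "average")
```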
The data points have to be centered and scaled (similar to PCA). You can give a variable more importance by multiplying it by a weight.
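A rough sketch of scaling, weighting and a possible importance measure follows. The iris data, the choice of 3 clusters and the weight vector w are assumptions for illustration only. The "importance" here is a heuristic (share of each variable's sum of squares that lies between clusters), not something kmeans() reports itself, and not necessarily what SPSS computes.

```r
set.seed(42)                     # k-means depends on the random initial centroids
x <- scale(iris[, 1:4])          # center and scale every variable

km <- kmeans(x, centers = 3, nstart = 25, iter.max = 50)

# Heuristic importance per variable:
# between-cluster sum of squares / total sum of squares
grand_mean <- colMeans(x)
total_ss   <- colSums(sweep(x, 2, grand_mean)^2)
between_ss <- colSums(km$size * sweep(km$centers, 2, grand_mean)^2)
importance <- sort(between_ss / total_ss, decreasing = TRUE)
print(round(importance, 2))

# Giving some variables more influence: multiply scaled columns by weights
w    <- c(1, 1, 2, 2)            # hypothetical weights, one per column
xw   <- sweep(x, 2, w, `*`)
km_w <- kmeans(xw, centers = 3, nstart = 25)
```

A value close to 1 means the clusters separate well along that variable; a value close to 0 means the variable hardly differs between clusters.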
Hope this helps.
Alain


#3

Hi,
Yesterday I wrote about one type of clustering. Another option is hierarchical clustering with hclust(): you cut the tree (not mandatory) and then plot it. To get splits shown in the graphic with the variable name and the values, fit a decision tree (for example with rpart) to the resulting cluster labels, since the hclust() dendrogram itself only shows how observations merge. This is not exactly variable importance (a weight in an equation), but you can see how the space is tiled.
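Here is a small sketch of that idea. The dendrogram from hclust() shows how observations merge, so to see splits labeled with variable names and values this example fits a decision tree to the cluster labels. Using rpart for that step is my assumption about what is meant, as are the iris toy data, the Ward linkage and the choice of 3 clusters.

```r
library(rpart)  # assumed to be installed; it ships with most R distributions

x  <- as.data.frame(scale(iris[, 1:4]))
hc <- hclust(dist(x), method = "ward.D2")
plot(hc, labels = FALSE)                 # dendrogram of the observations

# Cut the tree into 3 clusters and store the labels
x$cluster <- factor(cutree(hc, k = 3))

# Decision tree that explains the cluster labels from the variables
tree <- rpart(cluster ~ ., data = x, method = "class")
plot(tree, margin = 0.1)
text(tree, use.n = TRUE)                 # splits show variable name and threshold

print(tree$variable.importance)          # rpart's own importance scores
```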

Hope this helps.

Alain