Hi All,

How can I find highly correlated variables non-graphically? Please share code that returns the names of the highly correlated variables.

Thanks,

Tony

Hi Tony,

Non-graphical methods:

You could use:

i) The cor() function, which returns the correlation matrix; then check which pairs have a high absolute correlation.

ii) The rcorr() function [in the Hmisc package], which computes the significance levels for Pearson and Spearman correlations. It returns both the correlation coefficients and the p-values for all possible pairs of columns in the data table.

Simplified format:

rcorr(x, type = c("pearson", "spearman"))

Here x is a numeric matrix (use as.matrix() on a data frame first).

The output of rcorr() is a list containing the following elements:

- r : the correlation matrix
- n : the matrix of the number of observations used in analyzing each pair of variables
- P : the p-values corresponding to the significance levels of the correlations
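To go from a correlation matrix to the actual variable names, something like the following might work. It is only a minimal sketch in base R: the helper name high_cor_pairs and the cutoff values are my own choices for illustration, and mtcars is just a built-in example dataset.

```r
# Hypothetical helper (not from any package): list variable pairs whose
# absolute correlation exceeds a cutoff.
high_cor_pairs <- function(data, cutoff = 0.9) {
  # Keep numeric columns only, since cor() cannot handle factors or strings
  num <- data[vapply(data, is.numeric, logical(1))]
  cm <- cor(num, use = "pairwise.complete.obs")

  # Look at the upper triangle only, so each pair is reported once
  # and the diagonal (always 1) is skipped
  idx <- which(abs(cm) > cutoff & upper.tri(cm), arr.ind = TRUE)

  data.frame(
    var1 = rownames(cm)[idx[, "row"]],
    var2 = colnames(cm)[idx[, "col"]],
    r    = cm[idx],
    stringsAsFactors = FALSE
  )
}

# Example on a built-in dataset
result <- high_cor_pairs(mtcars, cutoff = 0.85)
print(result)
```

The result is a data frame with one row per pair, so unique(c(result$var1, result$var2)) gives the plain list of variable names involved.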

Thanks,

Abhishek

Hi Abhishek,

Thanks for sharing the above inputs.

Is there a function that can be applied to a dataset and returns the names of the highly correlated variables?

Regards,

Tony

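Hi @tillutony,

Well, there is the always welcome caret package: findCorrelation(). It takes a correlation matrix (not the data frame itself), and with names = TRUE it returns the names of the columns it suggests removing to reduce pairwise correlations below the cutoff, so you can do

findCorrelation(cor(mydataframe), cutoff = 0.9, names = TRUE)

and done.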
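For a quick try-out, here is a small sketch on the built-in mtcars data. It assumes caret is installed (install.packages("caret")); the variable names cor_mat and to_drop are just for illustration. Keep in mind that findCorrelation() expects a correlation matrix, so the data goes through cor() first, and all columns must be numeric.

```r
# Assumes the caret package is installed; skipped otherwise
if (requireNamespace("caret", quietly = TRUE)) {
  # Correlation matrix of the numeric columns
  cor_mat <- cor(mtcars)

  # Names of the columns suggested for removal at |r| > 0.9
  to_drop <- caret::findCorrelation(cor_mat, cutoff = 0.9, names = TRUE)
  print(to_drop)
} else {
  to_drop <- character(0)
}
```

Note that this reports one variable from each highly correlated group (the one to drop), not both members of every pair.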

Alain