Non-graphical correlated variables

machine_learning

#1

Hi All,

How can I find highly correlated variables non-graphically? Please share code that returns the names of the highly correlated variables.

Thanks,
Tony


#2

Hi Tony,

Non-graphical methods:

You could use:
i) The cor() function, then check which pairs of variables are highly correlated.

ii) The function rcorr() [in the Hmisc package], which computes the significance levels for Pearson and Spearman correlations. It returns both the correlation coefficient and the p-value for every possible pair of columns in the data table.

Simplified format:
rcorr(x, type = c("pearson", "spearman"))

The output of the function rcorr() is a list containing the following elements:
- r: the correlation matrix
- n: the matrix of the number of observations used in analyzing each pair of variables
- P: the p-values corresponding to the significance levels of the correlations
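
For the cor() route, a minimal base R sketch could list the correlated pairs by name. The helper name high_cor_pairs and the 0.85 cutoff are made up for illustration and are not part of any package; mtcars is just the built-in example dataset:

```r
# Hypothetical helper: list variable pairs whose absolute correlation
# exceeds a cutoff. Not from any package, just a sketch around cor().
high_cor_pairs <- function(df, cutoff = 0.9, method = "pearson") {
  num <- df[sapply(df, is.numeric)]        # keep numeric columns only
  cm  <- cor(num, use = "pairwise.complete.obs", method = method)
  cm[lower.tri(cm, diag = TRUE)] <- NA     # keep each pair once, drop the diagonal
  idx <- which(abs(cm) > cutoff, arr.ind = TRUE)
  out <- data.frame(
    var1 = rownames(cm)[idx[, "row"]],
    var2 = colnames(cm)[idx[, "col"]],
    r    = cm[idx],
    stringsAsFactors = FALSE
  )
  out[order(-abs(out$r)), ]                # strongest correlations first
}

high_cor_pairs(mtcars, cutoff = 0.85)
```

Each row of the result is one pair of variable names plus its correlation coefficient, so negative correlations are reported as well as positive ones.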

Thanks,
Abhishek


#3

Hi Abhishek,

Thanks for sharing the above inputs.
Is there a function that can be applied to a dataset and returns the names of the highly correlated variables?

Regards,
Tony


#4

Hi @tillutony

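Well, the always-welcome caret package has findCorrelation(). It takes a correlation matrix rather than the data frame itself and, with names = TRUE, returns the names of the columns it suggests removing to reduce pairwise correlation, so you can do
findCorrelation(cor(mydataframe), cutoff = 0.9, names = TRUE)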

and done.
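
A small sketch of that call on the built-in mtcars data, assuming the caret package is installed. The variable names cm, drop_candidates and reduced, and the 0.85 cutoff, are only illustrative. Note that findCorrelation() works on a correlation matrix, so cor() is applied first:

```r
library(caret)  # assumption: caret is installed

cm <- cor(mtcars)  # findCorrelation() expects a correlation matrix

# Names of the columns suggested for removal to reduce pairwise correlation
drop_candidates <- findCorrelation(cm, cutoff = 0.85, names = TRUE)
print(drop_candidates)

# Data frame without the suggested columns
reduced <- mtcars[, !(names(mtcars) %in% drop_candidates), drop = FALSE]
```

The result names one variable from each highly correlated group rather than every member of each pair, which is usually what is wanted before fitting a model.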
Alain


#5

Thanks a ton!