I have a dataset in which all the variables are highly correlated:
I ran PCA on this dataset, but the results are not very encouraging. The scree plot suggests that a single component is enough:
However, if I keep two components, their loadings are:
As the loadings matrix shows, the correlations between the variables and the PCs are quite low, so I guess applying PCA is not going to help either. I could use ada to fit an additive logistic regression, but I am not sure whether that would be the right method.
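For context on the scree plot and loadings: a single dominant component is roughly what one would expect when every variable is strongly correlated with every other. A minimal Python sketch of this, using a hypothetical 5-variable correlation matrix with all pairwise correlations set to 0.9 (these numbers are assumptions for illustration, not values from my dataset). It also assumes the reported loadings are unit-length eigenvector entries, in which case they need rescaling by the square root of the eigenvalue before being read as correlations.

```python
import math

# Hypothetical correlation matrix: 5 variables, every pair correlated at 0.9.
p, r = 5, 0.9
corr = [[1.0 if i == j else r for j in range(p)] for i in range(p)]

def power_iteration(matrix, iterations=100):
    """Estimate the largest eigenvalue and its unit eigenvector of a symmetric matrix."""
    n = len(matrix)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * mv[i] for i in range(n))
    return eigenvalue, v

eigenvalue, loadings = power_iteration(corr)

# The trace of a correlation matrix equals the number of variables,
# so the share of variance explained by PC1 is eigenvalue / p.
variance_share = eigenvalue / p

# Correlation between the first variable and PC1 (loading rescaled by sqrt(eigenvalue)).
corr_with_pc1 = loadings[0] * math.sqrt(eigenvalue)

print(variance_share, loadings[0], corr_with_pc1)
```

In this toy case the raw loading of each variable is well below the correlation between that variable and the first component, so small-looking loadings do not by themselves mean the component is weakly related to the variables.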
Ultimately I have to predict churn, but the data is highly skewed:
A decision tree is not helping either, as only one node (the root) is generated.
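One possible explanation for the root-only tree, assuming "skewed" here means class imbalance and assuming the tree is pruned on misclassification cost: if every child node still predicts the majority class, a split gains nothing and gets pruned away. A rough Python sketch with hypothetical node counts (not taken from my data; the helper names `misclassification_cost` and `gain` are made up for illustration) compares the gain of one candidate split with and without class weights.

```python
def misclassification_cost(n_stay, n_churn, w_stay=1.0, w_churn=1.0):
    """Weighted cost of predicting the (weighted) majority class in a node."""
    return min(n_stay * w_stay, n_churn * w_churn)

# Hypothetical counts as (non-churners, churners); 5% churn overall.
parent = (950, 50)
left = (170, 30)    # child with a higher churn rate, still mostly non-churners
right = (780, 20)

def gain(w_stay, w_churn):
    """Reduction in weighted misclassification cost achieved by the split."""
    before = misclassification_cost(*parent, w_stay, w_churn)
    after = (misclassification_cost(*left, w_stay, w_churn)
             + misclassification_cost(*right, w_stay, w_churn))
    return before - after

# Equal weights: both children still predict "no churn", so nothing is gained.
unweighted_gain = gain(1.0, 1.0)

# Balanced weights: churners are up-weighted by the class ratio (950 / 50).
balanced_gain = gain(1.0, parent[0] / parent[1])

print(unweighted_gain, balanced_gain)
```

With equal weights the gain is zero, while with balanced weights the same split reduces the cost, so class weights (or resampling) may be worth trying before giving up on the tree.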
Can someone please guide me on this?