Is multicollinearity always a problem in regression?


Will having multicollinear variables in the model affect its predictive power?



Multicollinearity is a common problem when estimating linear models, logistic regression, and other regression techniques. It arises from high correlations among the predictor variables and leads to unreliable and unstable estimates of the regression coefficients.

Most of us know that multicollinearity is a problem for interpreting model output, but many do not realize that there are several situations in which it can be safely ignored: for example, when the collinear variables are only control variables rather than the variables of interest, or when the model is used purely for prediction, since collinearity inflates the variance of individual coefficients without necessarily harming the fitted values.

It is usually diagnosed using the variance inflation factor (VIF). For predictor j, VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing predictor j on all the other predictors. A common rule of thumb is that a VIF above roughly 5-10 signals problematic collinearity.
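As a minimal sketch of that definition (using plain NumPy rather than a dedicated library, and a made-up dataset in which x2 is nearly a copy of x1), the VIF of each column can be computed by regressing it on the remaining columns:

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X.

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on all the other columns (with an intercept).
    """
    n, p = X.shape
    vifs = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # add intercept
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        tss = (y - y.mean()) @ (y - y.mean())
        r2 = 1.0 - (resid @ resid) / tss
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)   # nearly collinear with x1
x3 = rng.normal(size=200)               # unrelated to x1 and x2
X = np.column_stack([x1, x2, x3])
print(vif(X))   # x1 and x2 get very large VIFs; x3 stays near 1
```

The two collinear columns produce VIFs in the hundreds, while the independent column stays close to 1, which is exactly the pattern the rule of thumb flags.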


Multicollinearity: should it always be avoided?

Can we use ridge regression to overcome the multicollinearity?