What is singularity and how to remove it




While running linear regression in R I got stumped by this output:

The R^2 is coming out to be 1?
What is this singularity? I understand that it arises when some variable is a linear combination of other variables, but isn't that highly improbable in a real dataset unless the variables involved have been feature engineered?
Can someone please explain how to deal with singularity?



Regarding singularity: before fitting a linear model (or GLM), run a correlation analysis. In your case there appears to be perfect collinearity between the variables mentioned (Charge and Calls, I suppose); check the correlation matrix to confirm. You can remove the variables whose coefficients are NA. Still be careful about scale for the variables you keep, as your dataset does not seem to be scaled, but the correlation matrix will give you the necessary information.
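As a rough illustration of the check described above, here is a minimal sketch on synthetic data. The column names (`day_minutes`, `day_charge`, `night_calls`) and the 0.17 rate are made up for the example and are not taken from the original dataset. It uses only base R: `cor()` for the correlation matrix, `lm()` to reproduce the NA coefficient, and `alias()` to show which variable depends on which.

```r
# Synthetic data: every name and number here is an assumption for illustration.
set.seed(42)
n <- 100
day_minutes <- runif(n, 100, 300)
day_charge  <- 0.17 * day_minutes      # exact linear function of day_minutes
night_calls <- rpois(n, 100)
y <- 2 + 0.5 * day_minutes + 0.1 * night_calls + rnorm(n)

dat <- data.frame(y, day_minutes, day_charge, night_calls)

# Correlation matrix of the predictors: an off-diagonal entry of 1 or -1
# flags a pair of perfectly collinear variables.
cor_mat <- cor(dat[, c("day_minutes", "day_charge", "night_calls")])
print(round(cor_mat, 3))

# Fitting with both collinear predictors: lm() cannot estimate one of them
# and reports its coefficient as NA.
fit <- lm(y ~ day_minutes + day_charge + night_calls, data = dat)
print(coef(fit))

# Names of the coefficients that could not be estimated.
aliased <- names(coef(fit))[is.na(coef(fit))]
print(aliased)

# alias() expresses each aliased term as a combination of the other terms.
print(alias(fit)$Complete)
```

Note that a pairwise correlation matrix only catches dependencies between two variables. A variable that is a combination of several others can still cause a singularity without any single correlation being 1, which is where `alias()` is more informative.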

An R^2 of 1 is surprising: it means the model explains all of the variance. I would still check the residuals because of this, although your residual standard error is very low, so the result seems consistent.

Hope this helps.


Dealing with singularities in a linear regression model

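You are getting the singularity warning because three of your variables, total.day.charge, total.night.charge and total.international.charge, have their coefficients reported as NA. This means R could not estimate them because they are linear combinations of other predictors; it does not mean the data are missing. You need to drop these variables.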
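Dropping the affected variables can be sketched as follows. This is a minimal example on synthetic data, not the original dataset: the `*.minutes` columns, the per-minute rates and the response `y` are all assumptions made for illustration. It collects the names of the NA coefficients and refits without them.

```r
# Synthetic stand-in for the data: rates, minutes columns and y are assumed.
set.seed(1)
n <- 200
dat <- data.frame(
  total.day.minutes   = runif(n, 100, 300),
  total.night.minutes = runif(n, 100, 300)
)
dat$total.day.charge   <- 0.17  * dat$total.day.minutes    # exact multiple
dat$total.night.charge <- 0.045 * dat$total.night.minutes  # exact multiple
dat$y <- 1 + 0.3 * dat$total.day.minutes +
  0.1 * dat$total.night.minutes + rnorm(n)

# Full model: the charge columns come out with NA coefficients.
full <- lm(y ~ ., data = dat)
na_terms <- names(coef(full))[is.na(coef(full))]
print(na_terms)

# Refit without the aliased columns.
reduced <- lm(y ~ ., data = dat[, !(names(dat) %in% na_terms)])
print(coef(reduced))

# The fitted values are unchanged, so nothing is lost by dropping them.
print(all.equal(fitted(full), fitted(reduced)))
```

Matching coefficient names to column names like this only works for numeric predictors. For factors, the coefficient names include the level, so the columns would have to be identified another way.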

Your R-squared value is 1, which could be because you have highly correlated variables, so the model might be overfitting. Alternatively, one of your independent variables may have been derived from the dependent variable, or be very highly correlated with it. By the way, what is your dependent variable here?
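To illustrate the second possibility, here is a small sketch on synthetic data in which a hypothetical predictor `leak` is computed directly from the response. All names and numbers are assumptions for the example. Checking the correlation of each predictor with the dependent variable is one quick way to spot this.

```r
# Synthetic data; `leak` is a hypothetical predictor built from the response.
set.seed(7)
n <- 50
x <- rnorm(n)
y <- 3 + 2 * x + rnorm(n)
leak <- 10 * y + 5              # exact linear function of y

honest <- lm(y ~ x)
leaky  <- lm(y ~ x + leak)

r2_honest <- summary(honest)$r.squared
# summary() may warn about an essentially perfect fit; silenced here on purpose.
r2_leaky  <- suppressWarnings(summary(leaky)$r.squared)

print(r2_honest)                # well below 1
print(r2_leaky)                 # 1 up to floating-point error

# A correlation of 1 with the response gives the leaking predictor away.
print(cor(y, leak))
```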

Also, when building any model, check its assumptions!