Problem with linear regression



What would happen if I run a linear regression model on a data set that has some missing values?


It depends on how many values are missing. Running the analysis with missing values can cause some variables to lose part of their apparent importance. Common ways to deal with this:

  1. Impute the missing values with any one of the imputation methods
  2. Delete the records with missing values if they are not important
  3. Get better data (if possible)
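As a rough illustration of options 1 and 2, here is a minimal sketch in plain Python. The toy data and the helper names `drop_missing` and `impute_mean` are made up for this example, and mean imputation is just one of many possible imputation methods.

```python
from statistics import mean

# Hypothetical toy data; None marks a missing x value.
x = [1.0, 2.0, None, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

def drop_missing(xs, ys):
    """Listwise deletion: keep only the pairs where both values are present."""
    pairs = [(a, b) for a, b in zip(xs, ys) if a is not None and b is not None]
    return [a for a, _ in pairs], [b for _, b in pairs]

def impute_mean(xs):
    """Mean imputation: replace each missing value with the mean of the observed ones."""
    observed = [v for v in xs if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in xs]

x_del, y_del = drop_missing(x, y)   # one record fewer
x_imp = impute_mean(x)              # same length, gap filled with the observed mean
```

Deletion shrinks the sample, while imputation keeps every record but adds a value that was never observed, so which one is appropriate depends on how much is missing and why.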


Software tools can handle missing values for you. But in terms of the math behind linear regression, when estimating a simple model such as y = a + bx, a and b are calculated with formulas that need every value to be present, because they come from minimizing the sum of squared errors over all observations.
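To make that concrete, here is a small sketch of the usual closed-form least-squares estimates for y = a + bx, written in plain Python. The function name `fit_line` and the data are hypothetical; the point is only that a single missing value (represented here as NaN) spreads through every sum in the formula.

```python
import math

def fit_line(xs, ys):
    """Closed-form least squares for y = a + b*x.

    b = sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar)**2)
    a = y_bar - b * x_bar
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Complete data lying exactly on y = 1 + 2x.
x_full = [1.0, 2.0, 3.0, 4.0]
y_full = [3.0, 5.0, 7.0, 9.0]
a, b = fit_line(x_full, y_full)

# Same data with one x value missing (NaN): every sum becomes NaN.
x_gap = [1.0, 2.0, math.nan, 4.0]
a_gap, b_gap = fit_line(x_gap, y_full)
print(a, b, a_gap, b_gap)
```

With complete data the estimates recover the intercept and slope; with the gap, both come back as NaN, which is why the missing values have to be dealt with before the formula can be applied.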

If you run it in Excel with missing values, I doubt you will be able to compute every quantity the formula requires unless you treat the missing values as 0, handle them some other way (for example, by imputing them), or simply delete them.
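The choice between those workarounds is not neutral. The sketch below, with made-up data and a hypothetical `slope` helper, compares treating a missing x as 0 with deleting the record, for points that otherwise lie exactly on y = 1 + 2x.

```python
def slope(xs, ys):
    """Least-squares slope b for y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    return sxy / sxx

# Hypothetical data on y = 1 + 2x; the third x value is missing (None).
x = [1.0, 2.0, None, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

# Workaround A: treat the missing value as 0.
x_zero = [0.0 if v is None else v for v in x]
b_zero = slope(x_zero, y)

# Workaround B: delete the record with the missing value.
kept = [(xi, yi) for xi, yi in zip(x, y) if xi is not None]
b_del = slope([xi for xi, _ in kept], [yi for _, yi in kept])

print(b_zero, b_del)
```

In this toy case deleting the record keeps the slope at about 2, while filling with 0 pulls it well below that, so treating missing values as 0 should be a deliberate decision rather than a default.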