How to improve linear regression model performance?



Dear All,

I am working on a linear regression model with 3 predictor variables. All the variables are numerical (non-categorical). However, the MAPE for my model (I use the lm function in R) always comes out in the 0.68 - 0.78 range.

How can I bring the MAPE down to, say, the 0.2 - 0.3 range or lower?

I would be eagerly awaiting your help.

Thanks and regards with best wishes


@Debanjan_Banerjee - It completely depends on the problem.

You can look into this article for any kind of help.

Hope this helps!




Yes, it depends on the problem. Still, there are statistics that can give you a hint, especially with linear regression.

The first thing I would look at, after checking the residuals, is Cook's distance: do you have observations in the training set for your model that have high leverage? Check the fourth graphic produced by plot(your linear model). It is not the easiest one to understand, but explanations are easy to find, and the main points will be marked on it anyway.
Then rebuild your model after removing the observations with a high Cook's distance; this should help.
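
Here is a rough sketch of that Cook's distance check in R. Everything in it is an assumption for illustration: the data frame `df`, the column names `y`, `x1`, `x2`, `x3`, the simulated data with one planted influential point, and the 4/n cutoff (a common rule of thumb, not the only choice). Swap in your own data and formula.

```r
# Simulated stand-in for the real training set (hypothetical names and values)
set.seed(42)
n  <- 100
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
df$y <- 50 + 2 * df$x1 - 1.5 * df$x2 + 0.5 * df$x3 + rnorm(n)

# Plant one influential observation: extreme predictor value, response far off the trend
df$x1[1] <- 8
df$y[1]  <- 10

# MAPE as a fraction (0.2 means 20%); undefined if any actual value is 0
mape <- function(actual, predicted) mean(abs((actual - predicted) / actual))

fit <- lm(y ~ x1 + x2 + x3, data = df)

# which = 5 is the residuals-vs-leverage panel with Cook's distance contours,
# i.e. the fourth panel shown by a plain plot(fit)
if (interactive()) plot(fit, which = 5)

# Flag observations whose Cook's distance exceeds the assumed 4/n cutoff
cd          <- cooks.distance(fit)
threshold   <- 4 / n
influential <- which(cd > threshold)

# Refit without the flagged observations (logical indexing is safe even if none are flagged)
df_clean  <- df[cd <= threshold, ]
fit_clean <- lm(y ~ x1 + x2 + x3, data = df_clean)

mape_all   <- mape(df$y, fitted(fit))
mape_clean <- mape(df_clean$y, fitted(fit_clean))

print(influential)
print(c(all = mape_all, clean = mape_clean))
```

Before dropping anything, it is worth looking at the flagged rows: a point with a high Cook's distance may be a data error, or it may be a real case the model has to handle in production.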

To check the stability of your model before putting it in production: if you work in R, the rms library has a few methods for this. Check ols(), which you can use with cross-validation; that way you will have a better understanding of how your model could behave in production.
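
To make the cross-validation idea concrete, here is a minimal sketch using a hand-written 5-fold loop around lm() in base R. It deliberately does not use rms, so it runs without extra packages; with rms you would more likely fit with ols() and use the package's own validation helpers. The data frame `df`, the column names, the simulated data and the choice of 5 folds are all assumptions for illustration.

```r
# Simulated stand-in for the real data (hypothetical names and values)
set.seed(123)
n  <- 200
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
df$y <- 50 + 2 * df$x1 - 1.5 * df$x2 + 0.5 * df$x3 + rnorm(n)

# MAPE as a fraction (0.2 means 20%); undefined if any actual value is 0
mape <- function(actual, predicted) mean(abs((actual - predicted) / actual))

# Randomly assign each row to one of k folds of (nearly) equal size
k     <- 5
folds <- sample(rep(seq_len(k), length.out = n))

# For each fold: fit on the other folds, then score on the held-out fold
cv_mape <- sapply(seq_len(k), function(i) {
  train <- df[folds != i, ]
  test  <- df[folds == i, ]
  fit   <- lm(y ~ x1 + x2 + x3, data = train)
  mape(test$y, predict(fit, newdata = test))
})

print(cv_mape)        # one out-of-fold MAPE per fold
print(mean(cv_mape))  # average out-of-fold MAPE
```

If the per-fold values vary a lot, or sit well above the MAPE measured on the training data, that is a sign the model may not behave the same way on new data.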

Last point: linear models have high bias, even if they can be powerful! So if your production data have more “outlier”-type observations, your MAPE will shoot up.

Hope this helps.