# Negative Predicted Sales (Linear Regression)

#1

Hi Team,
While working on the Big Mart Sales problem, I am facing an issue when predicting sales with linear regression. The adjusted R-squared is 56.21 and the residual standard error is 1129. The model coefficients are below (significant variables only).
Estimate
Intercept: -1869
Item_MRP: 15.6
Item_Fat_content (Regular): 52.7
Outlet_type (Supermarket 1): 1955
Outlet_type (Supermarket 2): 1626
Outlet_type (Supermarket 3): 3355
Visibility Category (Low): 54.3

Since the intercept is negative, some of the predicted sales values come out negative, and as a result the MAPE comes out as 1.
Can anyone suggest how I should proceed in this situation?
Regards
Arnab
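
The negative predictions follow directly from the reported coefficients. Here is a minimal sketch of the arithmetic. It assumes that the outlet type and fat-content levels missing from the table are the baseline levels (the post does not say which levels were dropped), and the Item_MRP of 50 is a hypothetical value chosen for illustration.

```python
# Coefficients as reported in the post (significant terms only).
COEF = {
    "intercept": -1869.0,
    "item_mrp": 15.6,
    "fat_regular": 52.7,
    "outlet_supermarket_1": 1955.0,
    "outlet_supermarket_2": 1626.0,
    "outlet_supermarket_3": 3355.0,
    "visibility_low": 54.3,
}


def predict_sales(item_mrp, fat_regular=0, outlet=None, visibility_low=0):
    """Linear predictor; outlet=None means the (assumed) baseline outlet type."""
    y = COEF["intercept"] + COEF["item_mrp"] * item_mrp
    y += COEF["fat_regular"] * fat_regular
    y += COEF["visibility_low"] * visibility_low
    if outlet is not None:
        y += COEF[f"outlet_supermarket_{outlet}"]
    return y


# Hypothetical cheap item (Item_MRP = 50) in the baseline outlet type.
baseline_cheap = predict_sales(item_mrp=50)
# The same item in a Supermarket 1 outlet.
supermarket_cheap = predict_sales(item_mrp=50, outlet=1)
# Item_MRP below which a baseline-outlet prediction turns negative.
break_even_mrp = -COEF["intercept"] / COEF["item_mrp"]

print(baseline_cheap, supermarket_cheap, round(break_even_mrp, 1))
```

Under these assumptions, any item in the baseline outlet type priced below roughly 120 gets a negative prediction, because nothing in a plain linear model constrains the output to be positive.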

#2

From my understanding, it seems that the model is not good.
I’m saying this because if an intercept of -1869 leads to a negative predicted sales figure, that means the sales amounts are not high enough to justify a residual standard error of 1129! The error is huge despite the model having an adjusted R-squared of 56.21. It might also be the case that you have a relatively over-fitted model on your hands.

This is my 2 cents.

#3

@Nishant_S
Thanks for your reply. I have also observed heteroscedasticity when plotting fitted values vs. residuals. Any suggestions for improving the model?
Regards
Arnab
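
As a rough numeric counterpart to the fitted-vs-residual plot, the sketch below compares the typical residual size for low and high fitted values. The data are synthetic and deterministic (an assumption for illustration only, not the Big Mart data), built so that the noise grows with the predictor.

```python
from statistics import mean

# Synthetic data: the noise amplitude grows with x, which gives the
# funnel shape typically seen in a heteroscedastic residual plot.
xs = list(range(1, 41))
ys = [2.0 * x + 0.3 * x * (1 if x % 2 else -1) for x in xs]

# Ordinary least squares for a single predictor.
x_bar, y_bar = mean(xs), mean(ys)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
intercept = y_bar - slope * x_bar

fitted = [intercept + slope * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

# Split the observations at the median fitted value and compare spreads.
pairs = sorted(zip(fitted, residuals))
half = len(pairs) // 2
spread_low = mean(abs(r) for _, r in pairs[:half])
spread_high = mean(abs(r) for _, r in pairs[half:])

print(f"mean |residual|, low fitted values:  {spread_low:.2f}")
print(f"mean |residual|, high fitted values: {spread_high:.2f}")
```

A clearly larger spread in the upper half is the same signal as a funnel in the plot. A formal test such as Breusch-Pagan would be the more rigorous check.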

#4

Some amount of heteroscedasticity in sales data is normal, as it is time series data. If you are working with years of historical data, I hope you have already made the time series stationary. If not, you need to do this before applying regression.

#5

@Nishant_S, I have now applied linear regression to the logarithm of sales, which increases the adjusted R-squared to 74%, and the diagnostic plots also look OK. Should I calculate the residuals as (predicted log sales - actual log sales) or as (predicted sales, back-calculated from the log value, - actual sales)?
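
The two residual definitions in this post can be sketched as follows. The (Item_MRP, sales) pairs are made up for illustration, and the single-predictor fit is a simplification of the full model described in the thread.

```python
import math
from statistics import mean

# Hypothetical (Item_MRP, sales) pairs, not taken from the real data.
mrp = [40.0, 80.0, 120.0, 160.0, 200.0, 240.0]
sales = [300.0, 700.0, 1500.0, 2100.0, 3600.0, 5200.0]

log_sales = [math.log(s) for s in sales]

# Least-squares fit of log(sales) on Item_MRP.
x_bar, y_bar = mean(mrp), mean(log_sales)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(mrp, log_sales)) / sum(
    (x - x_bar) ** 2 for x in mrp
)
intercept = y_bar - slope * x_bar

pred_log = [intercept + slope * x for x in mrp]
# Back-transform to the sales scale; exp() is positive for any input.
pred_sales = [math.exp(p) for p in pred_log]

# Option 1: residuals on the log scale.
resid_log = [p - a for p, a in zip(pred_log, log_sales)]
# Option 2: residuals on the original sales scale.
resid_sales = [p - a for p, a in zip(pred_sales, sales)]

print([round(r, 3) for r in resid_log])
print([round(r, 1) for r in resid_sales])
```

Because the predictions are exponentiated, they can never be negative, which addresses the original problem. The two residual sets are in different units, so error metrics computed from them are not interchangeable. Note also that a plain `exp` back-transform tends to underestimate the mean of sales when the log-scale errors are roughly normal, so a bias correction may be worth considering.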

#6

Either of those should be fine. The results would be identical.

#7

@Nishant_S
The results are not identical: the MAPE comes out as 0.07 (residuals on log sales) and 0.52 (residuals on actual sales). Is the difference acceptable? Any suggestions?
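
A gap like this is expected, because a percentage error on the log scale is not the same quantity as a percentage error on the sales scale. A small sketch with hypothetical numbers (the `mape` helper is written here for illustration):

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.3 means 30%)."""
    return sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual sales and back-transformed predictions.
actual = [100.0, 1000.0, 5000.0]
predicted = [150.0, 800.0, 6000.0]

mape_sales = mape(actual, predicted)
mape_log = mape([math.log(a) for a in actual], [math.log(p) for p in predicted])

print(f"MAPE on the sales scale: {mape_sales:.3f}")
print(f"MAPE on the log scale:   {mape_log:.3f}")
```

The same predictions give a much smaller MAPE on the log scale, because taking logs compresses both the errors and the denominators. If the goal is to report error in sales units, the sales-scale figure is the relevant one.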

#8

Just to check: you are aware that you cannot take the log of zero or negative values, right?
I mean to ask, have you accounted for this?
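
A quick way to check for this, sketched with hypothetical values. The `log1p` workaround shown here is one common option and an assumption on my part, not something the original poster says they used.

```python
import math

# Hypothetical sales values, including a zero.
sales = [0.0, 12.5, 480.0, 3100.0]

# math.log(0.0) raises ValueError, so check before transforming.
has_non_positive = any(s <= 0 for s in sales)

# log1p(x) = log(1 + x) is defined at x = 0, and expm1 inverts it exactly.
transformed = [math.log1p(s) for s in sales]
recovered = [math.expm1(t) for t in transformed]

print(has_non_positive)
print([round(t, 3) for t in transformed])
```

`log1p` handles zeros but still fails for values at or below -1, so negative entries would need to be investigated rather than transformed.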

#9

@Nishant_S, I don’t have any negative or zero predicted values. Is a mean absolute percentage error (mean(|residual| / actual sales)) of 0.52 acceptable? If not, how can I reduce it?