Submission for Bigmart


Hello @pulkitpahwa,

I was trying something on the Bigmart data set. When I started submitting, I found that the RMSE from my local CV and the score I received after submission differ hugely. I even ran my old code, but the difference was still huge. Finally, I resubmitted the old file that gave me my best LB score, and again the difference is huge. Have any changes been made there?

Kindly let me know. Thanks for your time.


Hi @sadashivb,
The reason for the difference is that only 25% of the data is public.
The entire test data is divided into Public (25%) [which is the test file] and Private (75%) data.
Your score is evaluated on the entire data.
You have to avoid overfitting your data.


Hi @monica_joseph,

Thanks for your response.

Perhaps I failed to explain properly above. To cut the story short: yesterday I submitted an old file created in Jan, which gave me a score of 1163.94 back then. With the same file, yesterday I got a score of 2460.17 on the LB.

It's the same public data set, so I should be getting the same score.

Sorry if I am being confusing. But has anyone tried submitting the same model and found a difference in RMSE?



Hi @sadashivb,

First of all, we didn’t make any changes to the public data set. It is the same public data set.

To track down the error, I checked all your submissions against both scores. On checking, I found that the files are completely different, and hence so are the scores.

Good Day.


@pulkitpahwa - Thanks for the response, and sorry for all the trouble.

Seems I messed up here. :disappointed:


Hi @sadashivb,
I faced the same problem as you once. The problem was that the preprocessing steps that I was doing on the train data were not replicated for the test data. Try checking your code. Is your data modelling pipeline without bugs?
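
To make the point above concrete, here is a minimal sketch of one way to keep train and test preprocessing in sync: put every step in a single function and call it on both sets. The toy values, the `preprocess` helper, and the specific cleaning steps are assumptions for illustration only, not the code anyone in this thread actually used.

```r
# Hypothetical toy data standing in for the Bigmart train and test files.
train <- data.frame(
  Item_Weight       = c(9.3, NA, 17.5, 19.2),
  Item_Fat_Content  = c("Low Fat", "LF", "Regular", "reg"),
  Item_Outlet_Sales = c(100, 200, 300, 400),
  stringsAsFactors  = FALSE
)
test <- data.frame(
  Item_Weight      = c(NA, 8.3, 14.6),
  Item_Fat_Content = c("low fat", "Regular", "LF"),
  stringsAsFactors = FALSE
)

# Learn preprocessing parameters from the training data only.
weight_median <- median(train$Item_Weight, na.rm = TRUE)

# One function (hypothetical helper) holds every preprocessing step,
# so the train and test versions cannot drift apart.
preprocess <- function(df, weight_median) {
  # Fill missing weights with the median learned from train.
  df$Item_Weight[is.na(df$Item_Weight)] <- weight_median
  # Collapse inconsistent spellings into two levels.
  fat <- tolower(df$Item_Fat_Content)
  df$Item_Fat_Content <- ifelse(fat %in% c("lf", "low fat"), "Low Fat", "Regular")
  df
}

train_clean <- preprocess(train, weight_median)
test_clean  <- preprocess(test, weight_median)

# Sanity checks: same predictor columns, and no missing values left in test.
predictors <- setdiff(names(train_clean), "Item_Outlet_Sales")
stopifnot(identical(predictors, names(test_clean)))
stopifnot(!anyNA(test_clean))
```

If either `stopifnot` fails, the two sets went through different steps, which is worth ruling out before looking anywhere else.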

PS: If you still face the problem, feel free to contact me (faizankshaikh at gmail). I will try my best to fix it.


A problem occurs when I build the data frame for submission.
The error says the number of rows differs: 5561 and 8524.
Please suggest what I should do for the submission.


Hi @aman013,
Check out this blog for a clearer understanding of the solution


My problem still remains. After prediction I get 8523 rows, and when I build the data frame an error occurs saying the number of rows differs. This is the code I wrote:
d <- data.frame(item_id = test$item_id, outlet_id = test$outlet_id, item_outlet_sale = prediction)
Error: different number of rows: 5621, 8523
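
A row mismatch like this often means the predictions were made for the training rows rather than the test rows. The thread never says what the actual cause was, so the following is only a sketch of that one possibility, on made-up toy data with a plain `lm` model standing in for whatever model was really fitted.

```r
# Hypothetical toy data; the column names are placeholders.
train <- data.frame(x = c(1, 2, 3, 4, 5), y = c(2.1, 3.9, 6.2, 8.1, 9.8))
test  <- data.frame(id = c("a", "b", "c"), x = c(6, 7, 8))

fit <- lm(y ~ x, data = train)

# Without newdata, predict() returns fitted values for the training rows,
# so its length matches train, not test.
length(predict(fit))

# Pass the test set explicitly so there is one prediction per test row.
prediction <- predict(fit, newdata = test)

# Check the lengths before building the submission data frame.
stopifnot(length(prediction) == nrow(test))

submission <- data.frame(id = test$id, prediction = prediction)
```

Comparing `length(prediction)` with `nrow(test)` before calling `data.frame` shows right away which side has the unexpected row count.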


Hi @aman013,
Maybe there is some error in the prediction step. Could you share your code?

I’m not an R guy but I’ll try to help.


After fitting the model, we make the prediction:
pf <- predict(fit, test)
Submit <- data.frame(test$itemid, test$outletid, item_outlet_sale = pf)


Hi @aman013, did you solve your problem?


Yes, I solved it.


Great! Could you share what was wrong and how you solved it?


Hey, does anyone know why the private leaderboard does not show the rankings? I was 6th on Big Mart Sales and 34th in Loan Prediction, but I can't find them anywhere. Why, and what happened? I made my final submission as well. Anyone?


Maybe for a practice problem the results keep updating to give new people a chance.