Submission for Bigmart


#1

Hello @pulkitpahwa,

I was trying something on the Bigmart data set. When I started submitting, I found a huge difference between the RMSE from my local CV and the score I received after submission. I even re-ran my old code, but the difference was still huge. Finally, I resubmitted the old file that gave me my best LB score, and again the difference is huge. Have any changes been made on your side?

Kindly let me know. Thanks for your time.


#2

Hi @sadashivb,
The reason for the difference is that only 25% of the data is public.
The entire test data is divided into Public (25%) [which is the test file] and Private (75%) data.
Your score is evaluated on the entire data.
You have to avoid overfitting your model.
Thanks,
Monica
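
To illustrate why a score computed on a 25% subset can differ from the score on the rest of the data, here is a minimal R sketch. Everything in it is simulated: the data, the noise level, and the random 25% split are assumptions for illustration only and do not reflect how the platform actually partitions the test set.

```r
# Simulated data only; none of these numbers come from the competition.
set.seed(42)
n <- 1000
actual <- runif(n, min = 100, max = 5000)
predicted <- actual + rnorm(n, mean = 0, sd = 1100)

# Root mean squared error between true values and predictions.
rmse <- function(y, yhat) sqrt(mean((y - yhat)^2))

# Hypothetical 25% "public" split; the remaining 75% plays the "private" role.
public_idx <- sample(n, size = 250)

rmse_public  <- rmse(actual[public_idx], predicted[public_idx])
rmse_private <- rmse(actual[-public_idx], predicted[-public_idx])
rmse_all     <- rmse(actual, predicted)

print(c(public = rmse_public, private = rmse_private, all = rmse_all))
```

The three values will generally be close but not equal, so a model tuned heavily against the public portion can look better there than it does overall.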


#3

Hi @monica_joseph,

Thanks for your response.

Perhaps I failed to explain properly above. To cut a long story short - yesterday I submitted an old file created in January, which gave me a score of 1163.94 back then. With the same file, yesterday I got a score of 2460.17 on the LB.

It's the same public data set, so I should be getting the same score.

Sorry if I am being confusing. But has anyone tried submitting the same model and found a difference in RMSE?

Thanks!


#4

Hi @sadashivb,

First of all, we didn’t make any changes to the public data set. It is the same public data set.

To track down the error, I checked all your submissions against both scores. I found that the files are completely different, and hence so are the scores.
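
A check like this can also be done locally before resubmitting. The following is a minimal R sketch that compares two submissions row by row. The data frames and the column names (`item_id`, `outlet_id`, `item_outlet_sales`) are hypothetical stand-ins; in practice the two objects would come from `read.csv()` on the old and new files.

```r
# Hypothetical stand-ins for two submission files.
old_sub <- data.frame(
  item_id           = c("A1", "B2", "C3"),
  outlet_id         = c("OUT1", "OUT2", "OUT1"),
  item_outlet_sales = c(3735.14, 443.42, 2097.27)
)
new_sub <- old_sub
new_sub$item_outlet_sales[2] <- 1500.00  # simulate one changed prediction

# Quick yes/no check: are the two submissions exactly the same?
same_file <- identical(old_sub, new_sub)

# Row-level check: join on the ID columns and keep rows whose predictions differ.
both <- merge(old_sub, new_sub,
              by = c("item_id", "outlet_id"),
              suffixes = c("_old", "_new"))
changed <- both[abs(both$item_outlet_sales_old - both$item_outlet_sales_new) > 1e-6, ]

print(same_file)
print(changed)
```

If `same_file` is `FALSE`, the `changed` rows show exactly which predictions moved between the two files.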

Good Day.


#5

@pulkitpahwa - Thanks for the response and sorry for all the trouble.

Seems I messed up here. :disappointed:


#7

Hi @sadashivb,
I faced the same problem as you once. The problem was that the preprocessing steps I applied to the train data were not replicated for the test data. Try checking your code. Is your data modelling pipeline free of bugs?
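
One way to guard against this is to put every preprocessing step into a single function and call it on both data sets. Below is a minimal R sketch of that idea. The columns (`weight`, `fat`, `sales`), the toy values, and the helper names `fit_prep` and `apply_prep` are all hypothetical; they are not taken from anyone's actual pipeline.

```r
# Toy data with made-up columns, for illustration only.
train <- data.frame(
  weight = c(9.3, NA, 17.5, 19.2),
  fat    = c("Low Fat", "LF", "Regular", "reg"),
  sales  = c(3735, 443, 2097, 732),
  stringsAsFactors = FALSE
)
test <- data.frame(
  weight = c(NA, 8.3),
  fat    = c("low fat", "Regular"),
  stringsAsFactors = FALSE
)

# Learn any statistics from the training data only.
fit_prep <- function(df) {
  list(weight_mean = mean(df$weight, na.rm = TRUE))
}

# Apply the same steps, with the same learned values, to any data set.
apply_prep <- function(df, prep) {
  df$weight[is.na(df$weight)] <- prep$weight_mean
  fat <- tolower(df$fat)
  df$fat <- ifelse(fat %in% c("lf", "low fat"), "Low Fat", "Regular")
  df
}

prep        <- fit_prep(train)
train_clean <- apply_prep(train, prep)
test_clean  <- apply_prep(test, prep)
```

Because both data sets pass through `apply_prep`, a step added for the train data cannot be forgotten for the test data.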

PS: If you still face the problem, feel free to contact me (faizankshaikh at gmail). I will try my best to fix it.


#8

Hi,
A problem occurs when I make the data frame for submission.
The error says the numbers of rows differ: 5561 and 8524.
Please suggest what I should do for the submission.


#9

Hi @aman013,
Check out this blog for a clearer understanding of the solution


#10

My problem still remains. After prediction I get 8523 rows, and when I make the data frame an error occurs: different number of rows. This is the code I used:
d <- data.frame(item_id = test$item_id, outlet_id = test$outlet_id, item_outlet_sale = prediction)
Error: different number of rows - 5621, 8523


#11

Hi @aman013,
Maybe there is some error in the prediction step. Could you share your code?

I’m not an R guy but I’ll try to help.
Regards,
Faizan


#12

After fitting the model, we make the prediction:
pf <- predict(fit, test)
submit <- data.frame(test$itemid, test$outletid, item_outlet_sale = pf)
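
The actual cause was not posted in this thread, but one common way to end up with as many predictions as training rows is fitting the model with `train$` inside the formula, so that `predict()` cannot substitute the test columns. The R sketch below reproduces that situation with toy data and `lm()`; the data, the model type, and the variable names are assumptions for illustration.

```r
# Toy data: 10 training rows, 3 test rows.
train <- data.frame(x = 1:10, y = 2 * (1:10) + 1)
test  <- data.frame(x = 11:13)

# Problematic: the formula refers to train$x directly, so predict() keeps
# using the training column and only warns that newdata has fewer rows.
bad_fit  <- lm(train$y ~ train$x)
bad_pred <- suppressWarnings(predict(bad_fit, newdata = test))
length(bad_pred)  # same length as the training data, not the test data

# Safer: use bare column names plus the data argument.
good_fit  <- lm(y ~ x, data = train)
good_pred <- predict(good_fit, newdata = test)

# Check the lengths before building the submission data frame.
stopifnot(length(good_pred) == nrow(test))
submission <- data.frame(x = test$x, y = good_pred)
```

Comparing `length(pf)` with `nrow(test)` before calling `data.frame()` is a quick way to see which side has the unexpected row count.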


#13

Hi @aman013, did you solve your problem?


#14

Yes, I solved it.


#15

Great! Could you share what was wrong and how you solved it?


#16

Hey, does anyone know why the private leaderboard does not have the rankings? I was 6th on Big Mart Sales and 34th in Loan Prediction, but I can't find them anywhere. Why, and what happened? I made my final submission as well. Anyone?


#17

Maybe for a practice problem the results keep updating to give new people a chance.


#18