BigMart Baseline Solution - Score 1598 (Python codes)

baseline
big_mart_sales
python

#1

Hi guys,

I am sharing the first baseline solution for BigMart sales problem. These are very primitive solutions but good to set the ball rolling.

Just to set the context, baseline solutions are the ones for which don’t really need a predictive model. These are the basic solutions against which we should benchmark our first model. I am sharing 2 baseline solution.

Step 1 - Setting Up:

import pandas as pd
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")

Solution 1: Mean Sales: Most intuitive solution is to predict the mean sales of all products

# Determine mean of the output column:
mean_sales = train['Item_Outlet_Sales'].mean()
#Initialize submission dataframe with ID varaibles
base1 = test[['Item_Identifier','Outlet_Identifier']]
#Assign outcome variable to mean value:
base1['Item_Outlet_Sales'] = mean_sales
#Export submission:
base1.to_csv("submission_baseline1.csv",index=False)

Socre: 1773.82514

Model 2: Mean Sales by product: Another intuition might be to predict the mean sales of the particular product as output products

# Determine mean of the output column:
mean_item_sales = train.pivot_table(values='Item_Outlet_Sales',index='Item_Identifier')
#Initialize submission dataframe with ID varaibles
base1 = test[['Item_Identifier','Outlet_Identifier']]
#Assign outcome variable to mean value by product:
base2['Item_Outlet_Sales'] = base2.apply(lambda x: mean_item_sales[x['Item_Identifier']],axis=1)
#Export submission:
base2.to_csv("submission_baseline2.csv",index=False)

Socre: 1598.68255

Actually you don’t need Python to do this. It can be simply done in excel as well. But if your model is below these scores, there is definitely something going wrong!

Does anyone have a better baseline solution? (Remember no modeling technique to be used - just intuition!)

Cheers!
Aarshay


#2

Which predictive model is suitable for big mart sales prediction???


#3

Refer this blog by @Aarshay. It is really interesting to see his approach for solving the problem.