Suitable Forecasting techniques


I am new to Analytics and Data Science. This is the first case assignment I am working on.

Can someone please suggest suitable techniques for solving this use case?

Below is the problem description and a glimpse of the variables in the datasets.

My client is a leading global home appliance manufacturer. They are interested in developing a system that can forecast market demand for certain products.
We have access to the below two data sets:

  1. Demand data.csv: Weekly shipment information for 13 home appliance products in the US.
  2. Economic factors.csv: Monthly data on key economic factors which might be relevant.

Task: Develop an algorithm(s) to forecast the Industry Shipment Demand at Quarterly and Yearly levels.

Demand Data: Year, Quarter, Week, Product, Shipment, Units
Economic factors: Date, GDP ($ Billions), Federal Interest rate, Electricity Charges (US Average),
Employment to Population Ratio (US: Age Group 25-54), Unemployment Rate, Population Growth, Temperature (US Average), Precipitation,
Transportation Services Index, Consumer Price Index, Consumer Price Index (Housing), S&P House price index


Hey @varundhawal

For solving this problem, you’ll find the following flow helpful:

NOTE: Check whether the features you use for modelling will also be available for the test observations you want to predict. If they won't be (for example, economic figures for a future period that have not been published yet), you may have to use lag features to make those predictions.
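To make the lag-feature idea concrete, here is a minimal sketch in pandas. The column names follow the question (Product, Year, Quarter, Units), but the toy values and the `add_lags` helper are my own assumptions, not something from your data. It assumes the demand has already been rolled up to one row per product and quarter.

```python
import pandas as pd

# Toy quarterly demand for two hypothetical products (values are made up).
demand = pd.DataFrame({
    "Product": ["A"] * 6 + ["B"] * 6,
    "Year": [2019, 2019, 2019, 2019, 2020, 2020] * 2,
    "Quarter": [1, 2, 3, 4, 1, 2] * 2,
    "Units": [100, 120, 90, 110, 105, 125, 40, 45, 50, 42, 47, 52],
})


def add_lags(df, column, lags, group_col="Product"):
    """Hypothetical helper: add lagged copies of `column`, shifted within each product."""
    out = df.sort_values([group_col, "Year", "Quarter"]).copy()
    for lag in lags:
        # shift() inside groupby keeps one product's history from leaking into another's
        out[f"{column}_lag{lag}"] = out.groupby(group_col)[column].shift(lag)
    return out


# lag 1 = previous quarter, lag 4 = same quarter one year earlier
features = add_lags(demand, "Units", lags=[1, 4])
print(features)
```

The first rows of each product have no history, so their lag columns are missing; you would typically drop those rows before training. The same shift can be applied to the economic columns if their current values won't be known at prediction time.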

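  1. Merge the tables into a single table. You can most probably merge them on the date, but I can't say for certain as it depends largely on the data you have at your disposal. Since the demand data is weekly and the economic data is monthly, you will need to bring both to a common granularity first.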

  2. Based on my reading of the problem statement, you have to predict the Units for each Product, which might be represented as a Product ID.

  3. Then you can focus on data cleaning, which will involve treating missing values, outliers, etc.

  4. Once the data is prepared, you can start building a model, chosen based on your understanding of the data and its relationship with the outcome.

  5. You can iterate on the modelling step, try different models and choose the one that gives the best cross-validation score. You can also choose to create an ensemble of multiple models here.

  6. Once you are happy with your model, you can choose to deploy it or run it manually using code each time you wish to make a prediction.
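As a rough illustration of the merge step, the sketch below rolls both tables up to quarterly level and joins them on Year and Quarter. The column names are taken from the question, but the values are made up and I am assuming that weekly Units should be summed and that a rate-like economic column should be averaged. Other columns (GDP, for instance) may need a different aggregation depending on how they are reported.

```python
import pandas as pd

# Hypothetical weekly demand rows (values are made up).
weekly = pd.DataFrame({
    "Year": [2020, 2020, 2020, 2020],
    "Quarter": [1, 1, 2, 2],
    "Week": [1, 2, 14, 15],
    "Product": ["A", "A", "A", "A"],
    "Units": [10, 12, 8, 9],
})

# Hypothetical monthly economic rows (values are made up).
monthly = pd.DataFrame({
    "Date": pd.to_datetime([
        "2020-01-01", "2020-02-01", "2020-03-01",
        "2020-04-01", "2020-05-01", "2020-06-01",
    ]),
    "Unemployment Rate": [3.0, 4.0, 5.0, 10.0, 11.0, 12.0],
})

# Shipments are a flow, so sum the weeks within each product and quarter.
quarterly_demand = weekly.groupby(
    ["Product", "Year", "Quarter"], as_index=False
)["Units"].sum()

# Derive Year/Quarter keys from the date, then average the rate per quarter.
monthly = monthly.assign(
    Year=monthly["Date"].dt.year,
    Quarter=monthly["Date"].dt.quarter,
)
quarterly_econ = monthly.groupby(
    ["Year", "Quarter"], as_index=False
)["Unemployment Rate"].mean()

# Left join keeps every demand row; validate guards against duplicate quarters
# on the economic side silently multiplying demand rows.
merged = quarterly_demand.merge(
    quarterly_econ, on=["Year", "Quarter"], how="left", validate="many_to_one"
)
print(merged)
```

For the yearly forecast, the same pattern applies with Year alone as the grouping key.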

This, I believe, is the basic framework that will help you solve this problem statement irrespective of the language you are using.
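One caution on the cross-validation step: with time-ordered data, shuffled folds let the model see the future, so it is usually safer to validate on an expanding window where each forecast only uses earlier observations. Below is a minimal sketch of that idea comparing two simple baselines. The series is made up and deliberately periodic, and the function names (`naive`, `seasonal_naive`, `expanding_window_mae`) are just illustrative; swap in whatever models you end up trying.

```python
import numpy as np

# Made-up quarterly series with an exact 4-quarter pattern (illustration only).
series = np.array([100, 140, 90, 120] * 4, dtype=float)


def naive(history):
    """Forecast the next value as the last observed value."""
    return history[-1]


def seasonal_naive(history, season=4):
    """Forecast the next value as the value one season (4 quarters) ago."""
    return history[-season]


def expanding_window_mae(y, forecaster, min_train=8):
    """One-step-ahead mean absolute error, training only on the past at each step."""
    errors = []
    for t in range(min_train, len(y)):
        prediction = forecaster(y[:t])  # y[:t] never includes y[t]
        errors.append(abs(y[t] - prediction))
    return float(np.mean(errors))


scores = {
    "naive": expanding_window_mae(series, naive),
    "seasonal_naive": expanding_window_mae(series, seasonal_naive),
}
best = min(scores, key=scores.get)
print(scores, best)
```

Baselines like these are also a useful sanity check: a more complex model is only worth keeping if it beats them on the same folds.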