# Perform Multiple Linear Regression for Time Series data?

#1

Hi Everyone,

I have been given 3 years' worth of monthly Sales data, along with data for several potential independent variables: Ad Spend (broken down by TV and Digital), Population Increase, and Consumer Purchase Index. I am required to a) forecast Sales and b) find the effect of these independent variables on Sales, including the optimal mix of Ad Spend (TV vs. Digital) for Sales.

I wish to run the equivalent of a multiple linear regression in Python, but for Time Series data. All the Time Series examples I have read focus on one independent variable. How can I incorporate multiple independent variables into a linear regression model for Time Series? What other models can I use for this analysis?

Note: the Sales data is non-stationary and has seasonality.

Thanks,

#2

Go for ARIMAX. It can take exogenous variables into account while still performing time series analysis. The coefficients from the final model can then be used to estimate the effect of the exogenous variables on the DV, i.e. Sales.

Vector autoregression (VAR) is another possible solution here.

Peace.

#3

Thank you so much, Gaurav, for your response. I really appreciate your help!

Is there a blog on this site, or any link that you are aware of, that I can use?

#4

Here are a few resources to take a look at:

1. ARIMA with Exogenous variables.

2. ARIMAX

3. If you want to read about it in detail, go for ARIMAX MUDDLE

Peace.

#5

Thank you so much for this, Gaurav. This was really helpful.
A couple of follow-up questions:

1. There is an inherent lag between the time an Ad campaign runs and the time its effect shows up in Sales. For instance, for some Automotive brands the effect of advertising is seen in Sales 4 months later. How do I incorporate that in the model?

2. In ARIMAX, one of the methods is fit(method, **kwargs).
The only fit options are ‘MLE’ and ‘M-H’. My question: if I am trying to predict Sales, why doesn’t this package have MSE (Mean Squared Error) as one of the fit options?

3. Also, do you have any handy links for the VAR model? That would be super helpful.

Thanks a lot again for helping a stranger!

#6

Hey, sorry for the late reply. Here is what I could think of for your questions:

1. Give me some time to think on this. Will get back to you on this.

2. There is a “method” argument in the ARIMAX model. Set its value to “CSS”, which is simply a method that minimizes the sum of squared residuals. More details at: ARIMAX fitting methods.

3. Sorry, I don't have any such links for VAR (actually, I find it a little difficult, so I never really ventured into it!)
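In the meantime, for question 1, one possible starting point is to build lagged copies of the ad-spend column with pandas `shift` and check which lag lines up best with Sales. The sketch below uses synthetic data in which Sales is constructed to respond to TV spend 4 months later; the column names and the maximum lag of 6 are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic data: sales respond to TV spend from 4 months earlier (by construction).
rng = np.random.default_rng(1)
n = 36
idx = pd.date_range("2015-01-01", periods=n, freq="MS")
df = pd.DataFrame({"tv_spend": rng.uniform(50, 100, n)}, index=idx)
df["sales"] = 100 + 2.0 * df["tv_spend"].shift(4) + rng.normal(0, 1, n)

# Create lagged versions of the regressor: lag k means "spend k months ago".
MAX_LAG = 6  # assumption: the longest delay worth testing
for k in range(MAX_LAG + 1):
    df[f"tv_spend_lag{k}"] = df["tv_spend"].shift(k)

# Shifting introduces missing values at the start; drop those rows.
lagged = df.dropna()

# Correlation between sales and each lagged spend column.
corr = {k: lagged["sales"].corr(lagged[f"tv_spend_lag{k}"]) for k in range(MAX_LAG + 1)}
best_lag = max(corr, key=lambda k: abs(corr[k]))
print(best_lag)
```

The chosen lagged column (for example `tv_spend_lag4`) can then be passed as an exogenous variable in place of the unlagged spend. Each extra lag costs rows at the start of the sample, which matters with only 36 monthly observations.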

Hope this helps.

Peace.

#7

Hi @ak3674,

I also have a similar dataset. Were you able to solve the problem?