Multivariate Time Series Analysis

python

#1

Hi,

I have a multivariate time series data set with 20 predictor variables. The target variable is continuous. The data set is about “Employee Absenteeism”. The objective is:
“How much loss can we project for each month of 2011 if the same trend of absenteeism continues?”
I’m new to time series. I’ve been told that this can also be solved with models other than time series models. Could you please tell me which other models can be used for the forecasting?
The dataset is attached.

Your help would be much appreciated.

Many thanks!

Data Set.zip (20.4 KB)


#2

Hi @lakshveer,

You can treat it as a regression problem. Since there is no variable related to time, there is no need to treat it as a time series problem. You can follow the below article to learn various techniques to solve a regression problem:


#3

Hi @PulkitS,

Thanks for the reply.

With regression, I know I can predict on test data for which I have the independent variables. But how do I forecast the absent hours for each month of next year if the same trend of absenteeism continues? How can this be done with linear regression? Is there a function that does it?

Could you please explain briefly?

Thanks in advance!


#4

Hi @lakshveer,

I looked at the dataset you shared. It contains independent variables such as Reason for absence, Month of absence and Seasons. You can fit a linear regression model on those variables, and the model will learn the pattern of absenteeism. With the fitted model you can then predict future values using its predict function.


#5

Hi @PulkitS,

Thanks for looking at it!

As far as I know, linear regression lets me predict the target variable on test data that contains the independent variables. Can I predict future values by month when I don’t have any test data?
The objective is to forecast the absent hours for each month of next year if the same trend of absenteeism continues.

How can this be done? I’m very confused.

Thanks again!


#6

Hi @lakshveer,

Yes, the test data set needs exactly the same independent variables as the training data set. Linear regression cannot make predictions unless you have a test data set, i.e. values of the independent variables for the period you want to predict.

For such cases you can use time series forecasting techniques. A SARIMAX model can be used to predict future values, i.e. the next one year in your case.