How to get the trend variable in multiple linear regression?

machine_learning

#1

Hi,

I have multivariate time series data on employee absenteeism that I want to analyze using multiple linear regression.

The objective is to find how much loss we can project for each month of 2011 if the same trend of absenteeism continues.

I need to find the trend variable that can be included in the equation, but I don’t know exactly how to do this.

Data Set.zip (20.4 KB)

Your help will be much appreciated!

Thanks in advance


#2

Hi @lakshveer

This is not a time series problem, since the data is not collected at a fixed time interval (the data is not time dependent). Group the data on the month column so that you have two columns: month and absenteeism in hours.

Sr.No. Month   Absenteeism (hour)
 1       7         119
 2       8         132
 3       9

… and so on

With this data, you will be able to predict the absenteeism (hours) for each month in 2011. You can then add up the monthly values to get the total hours for the year.
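
The grouping step can be sketched in Python roughly as follows. This is only a minimal sketch: the (month, hours) record layout, the `monthly_totals` helper, and the sample numbers are assumptions for illustration and are not taken from the attached data set.

```python
from collections import defaultdict

# Hypothetical records: one (month, absenteeism in hours) pair per absence entry.
# The numbers are made up for illustration only.
records = [(7, 4), (7, 8), (8, 2), (8, 16), (8, 3), (9, 8)]

def monthly_totals(rows):
    """Sum absenteeism hours per month and return a dict ordered by month."""
    totals = defaultdict(int)
    for month, hours in rows:
        totals[month] += hours
    return dict(sorted(totals.items()))

totals = monthly_totals(records)
for month, hours in totals.items():
    print(month, hours)
```

The resulting month/hours pairs are the two-column table described above, and they can then be used as the input to whichever model you fit.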


#3

Hi @AishwaryaSingh

Thanks for the reply!
Apart from adding the month column, should I not include a trend variable to forecast absenteeism hours? It would take trends from the other variables into consideration, which seems more logical to me, since using only the month can neglect the information provided by the other variables.
I briefly went through the book Forecasting: Principles and Practice, where I found some information about the trend variable (specific page: https://www.otexts.org/fpp/5/2, section "Example: Australian quarterly beer production").

However, I don’t know exactly how to calculate this variable. Do you have any idea how this trend variable is calculated?

Many thanks!
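
For reference, in the linked section of the book the trend variable is simply the time index t = 1, 2, ..., T, used as a predictor in the regression. A minimal sketch of that idea, assuming the monthly totals are already in time order; the function names `fit_linear_trend` and `forecast` and the sample numbers are hypothetical, and only a trend-only model is fitted here:

```python
def fit_linear_trend(y):
    """Fit y_t = b0 + b1 * t by ordinary least squares, with t = 1, ..., T."""
    T = len(y)
    t = list(range(1, T + 1))  # the trend variable: just the time index
    t_mean = sum(t) / T
    y_mean = sum(y) / T
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    b1 = sxy / sxx
    b0 = y_mean - b1 * t_mean
    return b0, b1

def forecast(b0, b1, T, horizon):
    """Extend the fitted trend to t = T + 1, ..., T + horizon."""
    return [b0 + b1 * (T + h) for h in range(1, horizon + 1)]

# Made-up monthly totals in time order (assumption, not the attached data).
monthly_hours = [10.0, 12.0, 14.0, 16.0]
b0, b1 = fit_linear_trend(monthly_hours)
next_three = forecast(b0, b1, len(monthly_hours), 3)
print(b0, b1, next_three)
```

In a multiple regression, the same t column would sit alongside the other predictors in the design matrix rather than being the only one.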