Time Series multiple periodicity

timeseries
datetime
machine_learning
time_series
data_science

#1

I need some help with the AV practice problem on time series analysis. The dataset has hourly data with multiple periodicities. The given series is already somewhat stationary, but when I apply a first difference to remove the trend, the Dickey-Fuller test statistic improves (the p-value comes out as exactly 0). So I tried fitting an ARIMA model to the first-differenced series, but the results were awful. I also tried to tune the model parameters with the help of the ACF and PACF, but failed! Moreover, the ACF and PACF do not show the seasonality that I can see in the raw data. When I fit the model directly on the given series, the results were pretty good!

Initially I tried to pick the ARIMA parameters (p, d, q) from the ACF and PACF, but then I thought of an automated pipeline to find the best parameters. In the pipeline I used walk-forward validation to get the MSE of different combinations of ARIMA parameters, but this takes a lot of time and could not complete, I think because of the large dataset and the repeated model refitting that walk-forward validation requires. So I reduced the test set to just the last 3 days. The MSEs of many models come out close to each other, making it difficult to decide which one to choose!

Can anyone please help me by explaining why the differenced series gives bad results and what would be a good technique to validate the model? Moreover, since the series has a weekly pattern that repeats every week, I am thinking that maybe we can take a difference at lag 24*7 to eliminate this seasonality, or we could build our own features such as:

1. Day of the month
2. Hour of the day
3. Day of the week
4. Ordinal date (number of days from January 1 of year 1)

and then use some other model like GBM or random forest. Please guide me, I'm totally confused.
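
For reference, the lag-24*7 difference and the four calendar features mentioned in this post can be sketched in plain Python as below. This is only an illustration: the timestamps and values are made up (the AV dataset is not reproduced here), and the helper names `seasonal_difference` and `calendar_features` are hypothetical.

```python
from datetime import datetime, timedelta

WEEK_HOURS = 24 * 7  # one weekly cycle in an hourly series


def seasonal_difference(values, lag=WEEK_HOURS):
    """Return values[t] - values[t - lag] for every t >= lag."""
    return [values[t] - values[t - lag] for t in range(lag, len(values))]


def calendar_features(ts):
    """Calendar features for one timestamp, matching the list in the post."""
    return {
        "day_of_month": ts.day,
        "hour_of_day": ts.hour,
        "day_of_week": ts.weekday(),     # Monday = 0 ... Sunday = 6
        "ordinal_date": ts.toordinal(),  # January 1 of year 1 is day 1
    }


# Made-up hourly data (an assumption): three weeks with a purely weekly pattern.
start = datetime(2017, 1, 2, 0)  # a Monday
stamps = [start + timedelta(hours=h) for h in range(3 * WEEK_HOURS)]
values = [(h % 24) + 5 * ((h // 24) % 7) for h in range(len(stamps))]

diffed = seasonal_difference(values)
features = [calendar_features(ts) for ts in stamps]

print(len(diffed), set(diffed))
print(features[0])
```

The differenced series is one week shorter than the input, because the first 24*7 points have nothing to be subtracted from. The feature dictionaries could be fed to a tree-based model such as GBM or random forest, as the post suggests.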


#2

Hey Prabhat, welcome to the world of time series!

Just to set the context, time series modelling is actually more difficult than it looks.

OK, so first suggestion,

NEVER TRUST THE DICKEY FULLER TEST FOR STATIONARITY, I repeat,
NEVER TRUST THE DICKEY FULLER TEST FOR STATIONARITY!

Your only friends in this case are going to be the ACF and PACF graphs. They can at least tell you whether you need AR or MA terms.
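
As a rough illustration of reading the ACF for an hourly series, here is a minimal NumPy sketch. It assumes a synthetic series with a daily and a weekly cycle (not the AV data), and the helper name `acf` is hypothetical. The point it shows is that the lags have to reach 24 and 24*7 before those cycles can appear; a plot that stops after a few dozen lags cannot show a weekly pattern.

```python
import numpy as np


def acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )


# Synthetic hourly series (an assumption): daily + weekly sine waves plus noise.
rng = np.random.default_rng(0)
t = np.arange(24 * 7 * 8)  # eight weeks of hourly points
y = (
    10 * np.sin(2 * np.pi * t / 24)
    + 5 * np.sin(2 * np.pi * t / 168)
    + rng.normal(0, 1, t.size)
)

r = acf(y, max_lag=24 * 7)
for lag in (1, 12, 24, 168):
    print(f"lag {lag:>3}: {r[lag]:+.2f}")
```

In this synthetic case the autocorrelation is strongly positive at lags 24 and 168 and negative at lag 12, which is the signature of a daily cycle with a weekly one on top.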

In case you need to find the order of the autoregression, the ar() function in R can help. It fits AR models of different orders and reports the order that gives the lowest AIC value. Once you’re done with this, you’ll still need to handle the seasonal order. Unfortunately I can’t help you with that, because I’m struggling with the same thing. If you figure out seasonal ordering, please let me know.
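
For anyone working in Python rather than R, the idea of picking the AR order by lowest AIC can be sketched with NumPy as below. This is not a port of R's ar(): it uses an ordinary least-squares fit and a Gaussian AIC up to a constant, the series is simulated, and the names `fit_ar_ols` and `select_ar_order` are hypothetical.

```python
import numpy as np


def fit_ar_ols(x, p, max_p):
    """Least-squares AR(p) fit with intercept; returns (coefficients, AIC).

    Every order is fitted on the same len(x) - max_p targets so that the
    AIC values are comparable across orders.
    """
    x = np.asarray(x, dtype=float)
    y = x[max_p:]
    # First column is the intercept; column j holds the series lagged by j.
    cols = [np.ones(len(y))] + [x[max_p - j : len(x) - j] for j in range(1, p + 1)]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / len(y)
    aic = len(y) * np.log(sigma2) + 2 * (p + 1)
    return beta, aic


def select_ar_order(x, max_p):
    """Return the order in 0..max_p with the lowest AIC, plus all AIC values."""
    aics = [fit_ar_ols(x, p, max_p)[1] for p in range(max_p + 1)]
    return int(np.argmin(aics)), aics


# Simulated AR(2) series (an assumption, used only to exercise the functions).
rng = np.random.default_rng(42)
n = 2000
e = rng.normal(size=n)
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + e[t]

best_p, aics = select_ar_order(x, max_p=8)
print("order with lowest AIC:", best_p)
```

AIC tends to pick an order at least as large as the true one, so it is worth checking how close the AIC values of neighbouring orders are before settling on a single number.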

I’d also suggest you take a look at Facebook Prophet for forecasting. I do not recommend it for all cases, but sometimes it works extremely well. Plus, Prophet can take care of weekly and yearly seasonality as well.

If you still can’t improve the accuracy, try looking at BSTS by Google or nnetar() in R’s forecast package.

Just to add here: if you want to view the seasonal component, decompose your time series data in R using the decompose() or stl() function.
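
To show what a classical additive decomposition does under the hood, here is a minimal NumPy sketch: a centered moving average for the trend, then per-position averages of the detrended series for the seasonal component. It is meant as an illustration of the technique, not a replacement for the R functions; the data is synthetic and the name `decompose_additive` is hypothetical.

```python
import numpy as np


def decompose_additive(x, period):
    """Split x into trend, seasonal and remainder (classical additive method)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Centered moving average; an even period needs half weights at both ends.
    if period % 2 == 0:
        w = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        w = np.ones(period) / period
    half = len(w) // 2
    trend = np.full(n, np.nan)  # the edges have no centered average
    trend[half : n - half] = np.convolve(x, w, mode="valid")
    detrended = x - trend
    # Average the detrended values at each position within the period.
    seasonal_means = np.array(
        [np.nanmean(detrended[i::period]) for i in range(period)]
    )
    seasonal_means -= seasonal_means.mean()  # force the seasonal part to sum to 0
    seasonal = np.tile(seasonal_means, n // period + 1)[:n]
    remainder = x - trend - seasonal
    return trend, seasonal, remainder


# Synthetic hourly series (an assumption): linear trend plus a daily cycle.
t = np.arange(24 * 10)
y = 0.01 * t + np.sin(2 * np.pi * t / 24)

trend, seasonal, remainder = decompose_additive(y, period=24)
print(np.round(seasonal[:6], 3))
```

This handles one period at a time. For the hourly data discussed here, running it with period=24 and again with period=24*7 is one way to look at the daily and weekly patterns separately.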

Peace!