Here is my approach to the mini datahack time series problem, which earned me a score of around 226 on the private leaderboard (rank 2).
I started with data exploration and noticed right off the bat that all the counts were even numbers. Moreover, we had hourly observations whose values rose steadily over the entire dataset, so there is a trend. However, the maximum value was not at the tail of the dataset.
To improve accuracy, I divided the count by two and multiplied my predictions back by 2 at the end of the prediction step. I first fed the half-counts to an ARIMA function as my initial solution, but it was nowhere near the top 50. I then came across a built-in R function, “HoltWinters”, which implements Holt-Winters exponential smoothing (Holt’s trend method extended with Winters’ seasonal component). Just the model itself (at its default values) brought me inside the top 50. After that it was all parameter tuning based on these observations:
- There appeared to be a seasonality which was multiplicative in nature. Hence the data did not increase linearly.
- The count remained at 2 for the first fraction of observations. As such the trend did not start immediately along with the data.
- There appeared to be an impact of the day of the week on the data. It seemed to indicate a cyclic behavior in general.
- Having recently become comfortable with the Holt-Winters model, I knew the individual equations it uses, so I also tuned its smoothing parameters alpha (level), beta (trend) and gamma (seasonality).
Together, these observations and the tuning they guided took me to the top 10 on the public leaderboard and rank 2 on the private leaderboard.
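To make the halve, smooth, and double-back idea concrete, here is a minimal sketch in Python rather than the R `HoltWinters` call I actually used. The function names, the simple two-season initialisation, the `period` argument and the rounding step before doubling are all assumptions made for illustration. The smoothing updates are the standard multiplicative Holt-Winters equations, and the sketch assumes strictly positive counts.

```python
def halve_counts(counts):
    """Map even counts to half-counts; reject odd values rather than guess."""
    if any(c % 2 != 0 for c in counts):
        raise ValueError("expected only even counts")
    return [c // 2 for c in counts]


def holt_winters_multiplicative(y, period, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with multiplicative Holt-Winters.

    alpha smooths the level, beta the trend, gamma the seasonal factors.
    Initialisation below is a simple assumed scheme, not R's exact one.
    """
    if len(y) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    first, second = y[:period], y[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / (period ** 2)
    season = [v / level for v in first]  # one factor per position in the cycle

    for t in range(period, len(y)):
        s_old = season[t % period]
        prev_level = level
        # level: deseasonalised observation blended with the previous projection
        level = alpha * (y[t] / s_old) + (1 - alpha) * (prev_level + trend)
        # trend: change in level blended with the previous trend
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # season: observed ratio to level blended with the old factor
        season[t % period] = gamma * (y[t] / level) + (1 - gamma) * s_old

    n = len(y)
    return [(level + h * trend) * season[(n + h - 1) % period]
            for h in range(1, horizon + 1)]


def forecast_counts(counts, period, alpha, beta, gamma, horizon):
    """Halve, forecast, then round and double so predictions stay even."""
    half = halve_counts(counts)
    half_forecast = holt_winters_multiplicative(
        half, period, alpha, beta, gamma, horizon)
    # Rounding before doubling is an assumption; it keeps every output even.
    return [2 * round(v) for v in half_forecast]
```

For hourly data, `period` might be 24 for a daily cycle or 168 for a weekly one. Those values are only suggestions here, not the settings behind my score.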
I believe a few outliers and a bit of approximation account for the remaining RMSE, and further tuning could have reduced the error even more.
It seems that XGBoost captures these intricacies by itself and fits a time series well too. However, classical time series models are no less capable: this one got me the next best position on the leaderboard, though it requires exploring the data in a different light.
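As a quick illustration of why a few outliers matter so much under RMSE, here is a small sketch. The helper name `rmse` and the toy numbers are made up for the example and are not taken from the competition data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Every forecast is off by 2.
even_errors = rmse([10, 10, 10, 10], [12, 12, 12, 12])
# Three perfect forecasts and one miss of 20.
one_outlier = rmse([10, 10, 10, 10], [10, 10, 10, 30])
```

Because the errors are squared before averaging, the single large miss in the second case outweighs the four small misses in the first.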