How can we randomly split time series data in R and Python?




Could you please tell me how we can randomly split time series data? I have time series data from July 2007 to July 2010 (37 months). Currently I'm taking the first 30 months to train the model and the remaining 7 for testing. The problem is that there is a sudden increase during a few months of the test set, which the model is not able to predict because nothing like it appears in the data it was trained on. This leads to a poor MAPE.

How can I randomly split the time series and then test the model? Please include code, as I'm not familiar with how to do this.

Many thanks!


Hi @lakshveer,

For time series, we do not split the data randomly. A random split would pick some values from the early years and some from the most recent years, so the model would end up predicting older values from future ones, which never happens in a real scenario.
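To make the date-based alternative concrete, here is a rough sketch in Python using pandas. The series is made-up data covering the same 37 months as in your question, and `chronological_split` and `mape` are just helper names I'm using for illustration, not functions from any library. The naive "repeat the last value" forecast is only a baseline to compare your model's MAPE against.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series from July 2007 to July 2010 (37 points).
# These values are placeholders, not real data.
dates = pd.date_range(start="2007-07-01", end="2010-07-01", freq="MS")
rng = np.random.default_rng(0)
values = 100 + np.arange(len(dates)) + rng.normal(0, 5, len(dates))
series = pd.Series(values, index=dates)


def chronological_split(ts, n_test):
    """Hold out the last n_test observations, keeping time order intact."""
    ts = ts.sort_index()
    return ts.iloc[:-n_test], ts.iloc[-n_test:]


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100)


# First 30 months for training, last 7 months for testing.
train, test = chronological_split(series, n_test=7)

# Naive baseline: repeat the last training value for every test month.
naive_forecast = np.repeat(train.iloc[-1], len(test))
baseline_mape = mape(test.values, naive_forecast)

print(train.index.max(), test.index.min(), round(baseline_mape, 2))
```

Every training date comes before every test date, so the evaluation mimics forecasting months the model has never seen.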

So we divide time series data based on the dates. Instead of changing your split, try different forecasting techniques to predict the future values. You can refer to the course mentioned below, which covers most of the forecasting techniques in Python: