How to improve a time series model (ARIMA) in Python?

timeseries
python

#1

Hello,
I need some help with a time series. I have built an ARIMA model in Python, but the results are not very good (I am getting an increasing trend over time).

I have tried the following:
1- Made the series stationary by differencing (checked with the Dickey-Fuller test)
2- Tuned the parameters using grid search
3- Tried other transformations such as log, but the results are more or less the same
4- Used the walk-forward method for validation
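
A minimal sketch of steps 1 and 4, for anyone following along. The data is made up, the function names are hypothetical, and a simple drift forecast (last value plus the average past difference) stands in for the ARIMA model, so this only illustrates the differencing and the validation loop, not the actual setup.

```python
from statistics import mean


def difference(series):
    """First difference: d[t] = y[t] - y[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


def drift_forecast(history):
    """One-step forecast: last value plus the mean of past first differences."""
    return history[-1] + mean(difference(history))


def walk_forward(series, forecast_fn, min_train):
    """Forecast one step ahead on an expanding window and collect the errors."""
    errors = []
    for t in range(min_train, len(series)):
        history = series[:t]  # only the data available before time t
        prediction = forecast_fn(history)
        errors.append(series[t] - prediction)
    return errors


# Made-up data: an upward trend with some ups and downs.
y = [10, 12, 11, 14, 16, 15, 18, 21, 20, 23]
errors = walk_forward(y, drift_forecast, min_train=5)
mae = mean(abs(e) for e in errors)
print(f"walk-forward MAE: {mae:.3f}")
```

Any one-step forecaster that takes the history and returns a number can be passed as forecast_fn, so a fitted ARIMA could be swapped in and scored on exactly the same loop.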

My series has an increasing trend, but there are many ups and downs in the past. Residuals are high in the recent past, but the growth is still almost the same. Please let me know if there is something I should try in order to get a better prediction, one that can take residuals or noise into account on a timely basis.

PS: This is my first project and I am totally new to time series.
Thanks,
Priyank


#2

Hi @priyanshu_rai2000

It is very difficult to suggest anything on a time series without visualizing the data. Unless you provide data and charts, I don’t think people will be able to help you out.

Regards,
Kunal


#3

Hi Priyank!

I would suggest that you provide us with some context, such as the coefficients of a simple regression (original data vs lagged data), visualizations, and further info.
While it is true that transforming data is the typical way to go, you need to analyze the original series first. Have you plotted your data? Made a few histograms or boxplots? Have you defined a benchmark model to assess this ARIMA’s results?

Since you’re new to this, and I’m quite a newbie too, I understand it can be frustrating. But if you devote some quality time trying to - almost literally - listen to your data, you won’t need to torture it. You will be truly analyzing it.

Run a few descriptive analyses, as well as some linear regressions (against lagged values), then ask again. The bright people here will be able to help you.
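
In case it helps, here is a rough sketch of what I mean by descriptive statistics and a regression against lagged values. The data is made up and the function names (ols, lag_regression) are just ones I picked for illustration. It uses only the standard library; pandas or statsmodels would do the same in fewer lines.

```python
from statistics import mean, stdev


def ols(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def lag_regression(series, lag=1):
    """Regress y[t] on y[t - lag] and return (intercept, slope)."""
    return ols(series[:-lag], series[lag:])


# Made-up data standing in for the original (untransformed) series.
y = [10, 12, 11, 14, 16, 15, 18, 21, 20, 23]

print("mean:", mean(y), "std:", round(stdev(y), 3), "min:", min(y), "max:", max(y))

intercept, slope = lag_regression(y, lag=1)
print(f"y[t] = {intercept:.3f} + {slope:.3f} * y[t-1]")
```

A slope close to 1 hints that the series is very persistent, which would be consistent with needing to difference it, but a plot is still needed to confirm that.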


#4

As for a benchmark model, I suggest:

the mean for the entire sample;
the coefficient of a regression like Series vs Time (as simple as that);
and, of course, Next Value (prediction 1) equal to Last Value (last observation).
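
A quick sketch of those three benchmarks, assuming a simple train/test split and mean absolute error as the score. The data and the function names are made up for illustration, and the last-value forecast is simply repeated over the test period.

```python
from statistics import mean


def mean_forecast(train, horizon):
    """Benchmark 1: the sample mean, repeated for every future step."""
    return [mean(train)] * horizon


def trend_forecast(train, horizon):
    """Benchmark 2: fit Series = a + b * Time by least squares and extrapolate."""
    n = len(train)
    t = list(range(n))
    mt, my = mean(t), mean(train)
    slope = sum((ti - mt) * (yi - my) for ti, yi in zip(t, train)) / sum(
        (ti - mt) ** 2 for ti in t
    )
    intercept = my - slope * mt
    return [intercept + slope * (n + h) for h in range(horizon)]


def naive_forecast(train, horizon):
    """Benchmark 3: every future value equals the last observation."""
    return [train[-1]] * horizon


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Made-up data: the last three points are held out as the test set.
y = [10, 12, 11, 14, 16, 15, 18, 21, 20, 23]
train, test = y[:7], y[7:]

for name, fn in [("mean", mean_forecast), ("trend", trend_forecast), ("naive", naive_forecast)]:
    print(name, round(mae(test, fn(train, len(test))), 3))
```

If the ARIMA cannot beat the best of these on the same test period, the extra complexity is not buying anything yet.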

As for a plot, it will be useful in determining whether the series behaves the same way throughout the period of study. That can be a problem when you work with aggregated data. For instance, a series such as Mexico’s GDP is aggregated data: Consumption + Investment + Exports - Imports. If there’s a significant change in the trend you see, you might want to take a look at each of its components (which are also aggregated data, but not as aggregated as GDP).

Good luck, and greetings from Mexico!!