Predict future values (time series) using ARIMA



I was following this article to predict future values. The model fit the training data very well, but when I predict future values I do not get the desired results. Am I using the correct parameters, or do I need to work on them?

My code is:

import pandas as pd
import numpy as np
# Legacy API: statsmodels.tsa.arima_model was removed in statsmodels 0.13
from statsmodels.tsa.arima_model import ARIMA
import matplotlib.pylab as plt

data_1 = pd.read_csv('AirPassengers.csv')
avg = data_1['#Passengers']
# .values keeps the counts from being re-aligned (to NaN) against the new date index
ts = pd.Series(avg.values, index=pd.to_datetime(data_1['Month'], format='%Y-%m'))

ts_diff = ts - ts.shift()
r = ARIMA(ts, (2, 1, 2))
# The posted line was cut off after "r ="; a plain fit() call is assumed here
r = r.fit()

# Out-of-sample range: the data ends in 1960-12
pred = r.predict(start='1961-01', end='1970-01')
dates = pd.date_range('1961-01', '1970-01', freq='M')

# Undo the differencing (cumsum) and then the log (exp)
predictions_ARIMA_diff = pd.Series(pred, copy=True)
predictions_ARIMA_diff_cumsum = predictions_ARIMA_diff.cumsum()
predictions_ARIMA_log = pd.Series(ts.iloc[0])
predictions_ARIMA_log = predictions_ARIMA_log.add(predictions_ARIMA_diff_cumsum, fill_value=0)
predictions_ARIMA = np.exp(predictions_ARIMA_log)

print(predictions_ARIMA.head())
print(ts.head())

Thanks in advance



Hi there,

Can you tell us what you want us to tell you? What answer are you expecting? Please be more specific.




I am trying to predict values for future dates, using the start and end dates I passed to the predict function. The AirPassengers data is the same as described in this article. The model fits the training data correctly, but the final results are almost constant and do not show the expected trend.
I don’t know which parameters to use in predict(), or whether I made a mistake while taking the anti-log or calculating the cumsum. I also read the ARIMA documentation but can’t figure out the problem. You can download the AirPassengers.csv file from this link. Please try to run it.
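One thing worth checking in the back-transformation: cumsum and np.exp only undo steps that were actually applied before fitting. The sketch below assumes the model was trained on the first difference of the log series. It uses a tiny made-up series, stands in the true differences for the fitted values, and uses hypothetical `future_diff` numbers instead of real model output, so it only illustrates the arithmetic. The point is that in-sample values are rebuilt from the first log value, while out-of-sample values have to be rebuilt from the last observed log value.

```python
import numpy as np
import pandas as pd

# Tiny illustrative series (made-up stand-in for the passenger counts)
ts = pd.Series(
    [112.0, 118.0, 132.0, 129.0, 121.0],
    index=pd.date_range("1949-01-01", periods=5, freq="MS"),
)

ts_log = np.log(ts)                    # step 1: log transform
ts_log_diff = ts_log.diff().dropna()   # step 2: first difference

# In-sample: pretend the fitted differences equal the true differences
pred_diff = ts_log_diff.copy()
pred_log = ts_log.iloc[0] + pred_diff.cumsum()   # undo differencing from the FIRST value
pred = np.exp(pred_log)                          # undo the log

# Out-of-sample: hypothetical predicted log-differences for the next 3 months
future_diff = pd.Series(
    [0.01, 0.02, -0.01],
    index=pd.date_range("1949-06-01", periods=3, freq="MS"),
)
future_log = ts_log.iloc[-1] + future_diff.cumsum()  # start from the LAST observed value
future = np.exp(future_log)

print(pred)
print(future)
```

If the series was never log-transformed, the np.exp step should be dropped, otherwise the numbers come out on the wrong scale.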

Thank you so much for your kind help!


Can we do out-of-sample prediction using ARIMA (statsmodels) in Python?
I read about this on various forums and observed that an ARIMA model is very likely to converge to the mean. The same thing is happening in my case: I am getting constant results, which are the mean itself, as you can see in the attached image. My training data runs until 26th June, and I am forecasting for later dates (out of sample).
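The flattening is expected behaviour rather than a bug, and it can be seen by hand on the simplest case. The sketch below is not statsmodels code; it assumes a stationary AR(1) process with known mean and coefficient (the numbers are made up), where the h-step forecast is mean + phi**h * (last_value - mean). As h grows, phi**h shrinks to zero and the forecast settles on the mean. For a model with d=1 and no drift term the same thing happens to the predicted differences, so the predicted level goes flat.

```python
import numpy as np

def ar1_forecast(last_value, mean, phi, steps):
    """h-step-ahead point forecasts of a stationary AR(1) process (|phi| < 1)."""
    h = np.arange(1, steps + 1)
    return mean + phi ** h * (last_value - mean)

# Hypothetical values: last observation 150, long-run mean 100, phi = 0.8
path = ar1_forecast(last_value=150.0, mean=100.0, phi=0.8, steps=30)

print(path[:3])   # early forecasts still remember the last observation
print(path[-3:])  # late forecasts are essentially the mean
```

So a long forecast horizon from a non-seasonal ARIMA model will look almost constant; capturing a repeating yearly pattern needs seasonal terms or a drift/trend component in the model.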


I am always uncertain when one tries to predict a FUTURE value. Let me explain a bit more using an analogy.

Some of my newbie analytics friends (I am also a newbie in analytics), for example, like to predict stock prices based on old existing prices; they call it technical analysis. I always disagree with them. Why? Because if technical analysis based on old stock prices could predict future prices, then all businessmen, namely the founders or CEOs of companies like Google or Microsoft, should stop providing a service and simply sit back at their desks, doing analytics to predict stock prices and attempting to make $. Worse … people come up with a lot of wrong info by taking existing old prices and converting them to moving averages, etc., when PRICE itself is not a good independent variable.

For me, I believe in fundamental analysis, which is to use BUSINESS RATIOS to determine whether a stock price will go up or down in the future. Ratios are determined by the humans or employees working in a company that generates profit or loss, etc.

Now back to your airline analysis: rather than predicting the future passenger load (if I am not wrong, is this what you are trying to predict?), why not determine the FEATURES that drive the UP OR DOWN of the passenger outcomes? For example, you can use feature engineering to determine which features are most important or have the most impact on the passenger load trend.

I have NOT studied the ARIMA story, etc. Perhaps I am really wrong in my understanding of analytics.


There’s a similar question on Stack Overflow, where they suggested using forecast instead of predict… I haven’t tried it myself yet…



I am facing the same issue: how do I pass dates as the start and end parameters to the predict function?

My training data has values up to the week of 15-jan-2017, and I want to do weekly forecasting from 22-jan-2017 to 28-jan-2018.

Let me know if you found the solution. Thanks!
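One possible way around passing dates directly is to work out how many weekly steps the horizon covers and build the future index yourself. The sketch below uses only pandas and assumes the weeks are exactly 7 days apart, continuing from the last training date. The variable names are mine, not from any library.

```python
import pandas as pd

last_train = pd.Timestamp("2017-01-15")  # last week in the training data

# One date every 7 days, from the first forecast week to the last one (inclusive)
future_index = pd.date_range("2017-01-22", "2018-01-28", freq="7D")
steps = len(future_index)

# Sanity check: the first forecast week follows the last training week directly
gap_days = (future_index[0] - last_train).days

print(steps, gap_days)
```

The resulting `steps` can then be given to a step-based forecast call (for example `forecast(steps=steps)` on a fitted statsmodels results object, assuming your version provides it), and `future_index` can be attached to the returned values.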