
Better prediction with restricted training set (European Main Stock Markets)

ARIMA stands for AutoRegressive Integrated Moving Average.


To better understand your time series data and to predict future points in the series
(forecasting), you can first clean the data (function tsclean from R package “forecast”) and
then fit an ARIMA model to it as the basis for prediction.
As there are many ARIMA models, each defined by its p, d and q parameters, and as the people
who need forecasts are usually not ARIMA specialists, the R package “forecast” provides an
auto.arima function. auto.arima tries different ARIMA models and selects the “best”-fitting
one (by default, the one with the lowest AICc). “Best” here means best for the vast majority
of situations, not for all of them. In a few specific situations, an ARIMA specialist is still needed.
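As a minimal sketch of the workflow described above, the following R code cleans one series with tsclean, lets auto.arima choose a model, and produces a forecast. The choice of the DAX column, the 260-step horizon and the seasonal = FALSE setting (used here only to keep the model search fast on data with 260 observations per year) are assumptions for illustration, not settings stated in this post.

```r
library(forecast)  # provides tsclean(), auto.arima() and forecast()

# DAX closing prices from the EuStockMarkets data set (assumed example series)
dax <- EuStockMarkets[, "DAX"]

# Replace outliers and fill in missing values, if any
dax_clean <- tsclean(dax)

# Search over candidate ARIMA(p, d, q) models and keep the best-scoring one.
# seasonal = FALSE is an assumption made here to restrict the search.
fit <- auto.arima(dax_clean, seasonal = FALSE)
print(fit)

# Forecast 260 steps ahead (roughly one trading year for this data set)
fc <- forecast(fit, h = 260)
plot(fc)
```

Printing the fitted model shows which (p, d, q) order was selected, which is a useful check before trusting the forecast.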
ARIMA (and therefore auto.arima) is autoprojective: it uses the most recent data to
compute what is essentially a weighted average of past values. So why do we need to select a
restricted training set? (Methods for extracting “restricted data”, including my own method,
are compared in my previous post.)
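To make the “weighted average of past values” idea concrete, here is a small sketch using the simplest case, an ARIMA(1,0,0) model fitted with base R's arima function. The simulated series and the names phi, mu, manual and auto are illustrative assumptions. The one-step forecast computed by hand is a weighted combination of the last observation and the series mean, and it should agree with what predict returns.

```r
set.seed(1)

# Simulated AR(1) series fluctuating around a mean of 10 (illustrative data only)
x <- arima.sim(model = list(ar = 0.7), n = 200) + 10

# ARIMA(1,0,0): an AR(1) model with a mean term
fit <- arima(x, order = c(1, 0, 0))
phi <- coef(fit)[["ar1"]]
mu  <- coef(fit)[["intercept"]]  # stats::arima reports the series mean under this name

# One-step forecast by hand: (1 - phi) * mu + phi * last value
manual <- mu + phi * (x[length(x)] - mu)

# The same forecast from the fitted model
auto <- as.numeric(predict(fit, n.ahead = 1)$pred)

print(all.equal(manual, auto))
```

Higher-order models combine more past values and past errors, but the principle is the same: the forecast is driven by the most recent observations.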
It is said that a picture is worth a thousand words, so a few examples are the best advocates
for a restricted training set.
In the pictures:
- The test set is shown in black
- The training set is shown in blue
- The full data set appears as a sequence of grey, blue and black segments
- The forecast is shown in red
- The forecast prediction interval is bounded in orange
The following are one-year trend forecasts for the European stock indexes DAX, CAC and
FTSE.
As you can see, the one-year forecast (mid 1997 to mid 1998) based on the auto.arima model is
better with the RESTRICTED training set (left figure) than with the full data set.
The data used are EuStockMarkets, which ships with R's built-in “datasets” package.
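The comparison shown in the figures can be sketched roughly as follows. This is not the extraction method from the previous post: the restricted window here is simply the last 520 training observations, a hypothetical choice made for illustration, and the test set is assumed to be the last 260 observations (about one trading year). The seasonal = FALSE setting is likewise an assumption to keep the model search fast.

```r
library(forecast)  # auto.arima(), forecast(), accuracy(), subset() for ts objects

dax <- EuStockMarkets[, "DAX"]
n <- length(dax)
h <- 260             # assumed test length: about one trading year
n_restricted <- 520  # hypothetical restricted window, not the author's method

# Hold out the last h observations as the test set
test       <- subset(dax, start = n - h + 1)
train_full <- subset(dax, end = n - h)

# Restricted training set: only the most recent part of the training data
train_restricted <- subset(train_full,
                           start = length(train_full) - n_restricted + 1)

# Fit one model per training set and forecast over the test period
fc_full       <- forecast(auto.arima(train_full, seasonal = FALSE), h = h)
fc_restricted <- forecast(auto.arima(train_restricted, seasonal = FALSE), h = h)

# Compare out-of-sample error on the held-out year
rmse <- c(
  full       = accuracy(fc_full, test)["Test set", "RMSE"],
  restricted = accuracy(fc_restricted, test)["Test set", "RMSE"]
)
print(rmse)
```

A lower test-set RMSE for the restricted fit would match what the figures show, though the result depends on how the restricted window is chosen.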
