By George C. S. Wang
Describes in simple language how to use Box-Jenkins models for forecasting ... the key requirement of Box-Jenkins modeling is that the time series is either stationary or can be transformed into one ... the most difficult part in this type of modeling is the identification of a model.

GEORGE C. S. WANG
Dr. Wang is currently an independent consultant, specializing in statistical modeling and business forecasting. Formerly, he was Forecast Manager at Consolidated Edison Company of New York, and was responsible for forecast modeling and forecasting. He has also served as Company witness and given testimony in regulatory proceedings. He is the co-author of the book Regression Analysis: Modeling and Forecasting. He received his M.B.A. and Ph.D. degrees from New York University.

George Box and Gwilym Jenkins developed a statistical approach for time series modeling. Time series models developed on the basis of their approach are called Box-Jenkins models, also known as ARIMA models. A time series can be defined as a sequence of data observed over time.

ARIMA models are univariate, that is, they are based on a single time series variable. Box and Jenkins have also developed procedures for multivariate modeling. However, in practice, even their univariate approach is sometimes not as well understood as the classic regression method. The objective of this article is to describe the basics of univariate Box-Jenkins models in simple layman's terms.

UNIVARIATE MODELING

The purpose of univariate modeling is to establish a relationship between the present value of a time series and its past values so that forecasts can be made on the basis of the past values alone.

Stationary Time Series: The first requirement for univariate Box-Jenkins modeling is that the time series data to be modeled are either stationary or can be transformed into a stationary series. We can define a stationary time series as one that has a constant mean and no trend over time. A plot of the data is usually enough to see if the data are stationary. In practice, few time series meet this condition, but as long as the data can be transformed into a stationary series, a Box-Jenkins model can be developed.

THE MODELING PROCESS

Box-Jenkins modeling of a stationary time series involves the following four steps:

1. Model identification
2. Model estimation
3. Diagnostic Checking
4. Forecasting

The four steps are similar to those required for linear regression except that Step 1 is a little more involved. Box-Jenkins uses a statistical procedure to identify a model, which can be confusing. The other three steps are quite straightforward. Let's first discuss the mechanics of Step 1, model identification, in detail. Then we will use an example to illustrate the whole modeling process.

MODEL IDENTIFICATION

ARIMA stands for Autoregressive-Integrated-Moving Average. The letter "I" (Integrated) indicates that the time series being modeled has been transformed into a stationary time series. ARIMA represents three different types of models: it can be an AR (autoregressive) model, an MA (moving average) model, or an ARMA model, which includes both AR and MA terms. Notice that we have dropped the "I" from ARIMA for simplicity. Let's briefly define these three model forms.

AR Model: An AR model looks like a linear regression model except that in a regression model the dependent variable and its independent variables are different, whereas in an AR model the independent variables are simply the time-lagged values of the dependent variable, so it is autoregressive. An AR model can include different numbers of autoregressive terms. If an AR model includes only one autoregressive term, it is an AR (1) model; we can also have AR (2), AR (3), etc. An AR model can be linear or nonlinear.
[Figure 1. Theoretical ACF and PACF correlograms; horizontal axes: Time Lags]

... terms do we need in the identified model? To answer these questions, we need to calculate the autocorrelation function and the partial autocorrelation function of the series.
What are Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF)? Without going into the mathematics, ACF values fall between -1 and +1. They are calculated from the time series at different lags to measure the significance of correlations between the present observation and the past observations, and to determine how far back in time (i.e., over how many time-lags) they are correlated.

PACF values are the coefficients of a linear regression of the time series using its lagged values as independent variables. When the regression includes only one independent variable of one-period lag, the coefficient of the independent variable is called the first order partial autocorrelation function; when a second term of two-period lag is added to the regression, the coefficient of the second term is called the second order partial autocorrelation function, etc. The values of PACF will also fall between -1 and +1 if the time series is stationary.

How do we use the pair of ACF and PACF functions to identify an appropriate model? A plot of the pair will provide us with a good indication of what type of model we want to entertain. The plot of a pair of ACF and PACF is called a correlogram. Figure 1 shows three pairs of theoretical ACF and PACF correlograms.

In modeling, if the actual correlogram looks like one of these three theoretical correlograms, in which the ACF diminishes quickly and the PACF has only one large spike, we will choose an AR (1) model for the data. The "1" in parentheses indicates that the AR model needs only one autoregressive term, and the model is an AR of order 1.

Notice that the ACF patterns in 2a and 3a are the same, but the large PACF spike in 2b occurs at lag 1, whereas in 3b, it occurs ...
$$\text{ACF of lag 1} = \frac{\text{Auto-covariance of lag 1}}{\text{Variance}}$$

$$\text{Auto-covariance of lag 1} = \frac{1}{40}\left[(0.48-0.62)(0.02-0.62) + (0.02-0.62)(1.17-0.62) + \cdots\right] = 8.53$$

$$\text{Variance} = \frac{1}{40}\left[(0.48-0.62)^2 + (0.02-0.62)^2 + \cdots + (0.26-0.62)^2\right] = 118.27$$

$$\text{ACF of lag 1} = \frac{8.53}{118.27} = 0.072$$

[Figure 3. Plot of differenced demand data]

TABLE 2
ACF AND PACF AT DIFFERENT LAGS

Lag   ACF      PACF       Lag   ACF      PACF
1     0.072    0.072      6     0.012    0.041
2     0.01     0.005      7     -0.051   -0.003
3     0.045    0.045      8     0.148    0.015
4     -0.406   -0.396     9     0.122    -0.013
5     -0.177   -0.137     10    0.029    0.025

Calculation of PACF of Lag 1: In Table 2, the PACF of lag 1 also equals 0.072. We can use the EXCEL regression add-ins to regress $y_t$ on $y_{t-1}$ and obtain it as the coefficient of $y_{t-1}$.
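The two calculations above can be sketched in a few lines of Python. This is an illustrative sketch rather than the article's own procedure: the function names and the `demo` series are assumptions (Table 1 is not reproduced here), and the only figures taken from the article are the auto-covariance 8.53 and the variance 118.27. The ACF function divides both sums by the number of observations, as in the formulas above; the lag-1 PACF is taken as the slope from regressing each value on the previous one.

```python
def acf(series, lag):
    """Lag-k autocorrelation: auto-covariance at that lag divided by the variance."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((v - mean) ** 2 for v in series) / n
    autocov = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    ) / n
    return autocov / variance


def pacf_lag1(series):
    """First-order PACF: slope from regressing y_t on y_{t-1}."""
    x, z = series[:-1], series[1:]
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    return sxz / sxx


# Hypothetical differenced values, for illustration only (not the article's Table 1).
demo = [0.5, -0.2, 1.1, 0.3, -0.6, 0.9, 0.1, -0.4, 0.7, 0.2]
print("ACF lag 1 :", round(acf(demo, 1), 3))
print("PACF lag 1:", round(pacf_lag1(demo), 3))

# The article's own figures: auto-covariance 8.53 divided by variance 118.27.
article_acf_lag1 = 8.53 / 118.27
print("Article ACF lag 1:", round(article_acf_lag1, 3))
```

On a short series the regression slope and the lag-1 ACF can differ noticeably, because the regression uses slightly different means and sums; on a longer stationary series the two values are usually close, as they are in Table 2.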
[Figure 4. ACF and PACF charts; horizontal axes: Time Lags]

... the three sets of correlograms in Figure 1? The ACF in Figure 4 is quite similar to those in 2a and 3a in Figure 1, but the PACF here looks different from PACF 2b and 3b in Figure 1. However, of the 10 bars in the PACF chart, there is only one large spike, at lag 4. If we ignore the nine smaller bars in this chart, it becomes similar to chart 3b in Figure 1. We have said that the patterns of charts 3a and 3b in Figure 1 suggest an AR (4^) model, so the two charts in Figure 4 also suggest an AR (4^) model as follows:

$y_t = c + \phi\, y_{t-4} + a_t$ ...(1)

Although we denote Equation (1) as AR (4^), it is an AR (1) model in the sense that it has only one autoregressive term, which models seasonality of period 4.

MODEL ESTIMATION AND DIAGNOSTIC CHECKING

The next two steps are estimating the model coefficients and diagnostically checking the goodness of fit. These two steps are usually done together.

Estimation: In fact, most identified ARMA models are nonlinear and require a nonlinear estimation procedure. Only some simple AR models are linear and can be estimated with the Ordinary Least Squares (OLS) procedure. For either procedure, the criterion for getting the best estimates of coefficients is the same, that is, to minimize the sum of the squared errors.

Equation (1) is clearly a linear regression equation, where $c$ is the constant term, $\phi$ is the coefficient of $y_{t-4}$ and $a_t$ is the residual. We have estimated the model with both procedures as follows:

$y_t = 0.863 - 0.465\, y_{t-4} + a_t$ ...(2)

The residual, $a_t$, in Equation (2) is expected to be zero in forecasting. Interested readers can use the data given in Columns (6) and (8) of Table 1, and use EXCEL to verify the estimated coefficients in Equation (2). If one uses a nonlinear procedure, it takes three iterations to get Equation (2).

Suppose that the identified model is an MA (4^) model as follows:

[Equation (3)]

Equation (3) is nonlinear because the residual term is not observable, and it must be generated. We have to use the nonlinear least squares procedure to produce the residuals (the historic forecast errors) before we can iteratively estimate the coefficient.

Notice that Box and Jenkins used the backward shift operator B very extensively in their analysis. For example, they denoted $y_{t-1} = B y_t$, $y_{t-2} = B^2 y_t$, $y_{t-3} = B^3 y_t$, etc. In this article, we have avoided the use of B.

Diagnostic Checking: Regardless of which estimation procedure is used in modeling, the criteria for testing the goodness of fit are the same. We use $R^2$ to measure the degree of correlation between the dependent variable and the independent variables; we use the t-statistics to test the significance of the coefficients and the standard error to measure how closely the model fits the data.

We also need to check the stability of the estimated model. For an AR (1) model, we require that $-1 < \phi < 1$. An AR (2) model has two coefficients, $\phi_1$ and $\phi_2$; we require that:

$\phi_1 + \phi_2 < 1$, $\phi_2 - \phi_1 < 1$, and $-1 < \phi_2 < 1$.

Equation (2) has a coefficient of -0.465, which falls between -1 and 1. The model is stable. If these conditions are not met, it is either because the time series is not stationary and requires more transformation, or because the model was not properly identified.

FORECASTING

Equation (2) is our model for forecasting, but we want to forecast the demand $Y_t$, not the differenced value $y_t$. Therefore, we must transform the model from the $y_t$ form to the $Y_t$ form. Recall that $y_t = Y_t - Y_{t-4}$ and $y_{t-4} = Y_{t-4} - Y_{t-8}$. Equation (2) becomes,

$Y_t - Y_{t-4} = 0.863 - 0.465\,(Y_{t-4} - Y_{t-8})$ ...(4)

Notice that we have dropped the $a_t$ term in Equation (4) because in forecasting, $a_t$ is assumed to be zero. Re-arranging terms in Equation (4), we obtain,

$Y_t = 0.863 + 0.535\, Y_{t-4} + 0.465\, Y_{t-8}$ ...(5)
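As a rough illustration of how the estimated coefficients connect to Equation (5), the following Python sketch fits a model of the form $y_t = c + \phi\, y_{t-4} + a_t$ by ordinary least squares, checks the stability condition, and converts the result to the level form. It rests on several assumptions: the function names are hypothetical, and because Table 1 is not reproduced here, the differenced series is simulated using the coefficients reported in Equation (2) instead of the article's data.

```python
import random


def fit_lagged_ar(y, lag):
    """OLS fit of y_t = c + phi * y_{t-lag} + a_t; returns (c, phi)."""
    x, z = y[:-lag], y[lag:]
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    phi = sxz / sxx
    return mz - phi * mx, phi


def is_stable(phi):
    """Stability condition for a model with a single autoregressive term."""
    return -1.0 < phi < 1.0


def level_form(c, phi):
    """Coefficients (constant, on Y_{t-4}, on Y_{t-8}) after undoing lag-4 differencing."""
    return c, 1.0 + phi, -phi


# Simulated differenced series (an assumption; not the article's Table 1 data).
rng = random.Random(1)
true_c, true_phi = 0.863, -0.465
y = [true_c / (1.0 - true_phi)] * 4  # four start-up values at the long-run mean
for _ in range(4000):
    y.append(true_c + true_phi * y[-4] + rng.gauss(0.0, 1.0))

c_hat, phi_hat = fit_lagged_ar(y, lag=4)
print(f"c = {c_hat:.3f}, phi = {phi_hat:.3f}, stable = {is_stable(phi_hat)}")
print("Level form for the article's estimates:", level_form(0.863, -0.465))
```

With the article's estimates, the level-form coefficients work out to 1 - 0.465 = 0.535 on $Y_{t-4}$ and 0.465 on $Y_{t-8}$.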
Suppose that we want to forecast the demand for the first quarter of 2006. According to Equation (5), we need the demand data for the first quarters of 2005 and 2004. From Table 1, $Y_{t-4} = 27.08$ (first quarter of 2005) and $Y_{t-8} = 25.91$ (first quarter of 2004); then,

$Y_t = 0.863 + 0.535 \times 27.08 + 0.465 \times 25.91 = 27.40$

Forecast accuracy can be evaluated in the same way as in linear regression.
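The forecast arithmetic can be checked with a short Python sketch. The function names are hypothetical; the constant 0.863, the coefficient -0.465 and the two demand values are the ones quoted in the article. The sketch computes the forecast two ways: directly from the level form, and by first forecasting the differenced value and then adding back the demand of four quarters earlier. The two routes should agree.

```python
def forecast_level(y_lag4, y_lag8, c=0.863, phi=-0.465):
    """Forecast demand directly from the level form: c + (1 + phi) * Y_{t-4} - phi * Y_{t-8}."""
    return c + (1.0 + phi) * y_lag4 - phi * y_lag8


def forecast_via_difference(y_lag4, y_lag8, c=0.863, phi=-0.465):
    """Forecast the differenced value first, then add back Y_{t-4}."""
    diff_forecast = c + phi * (y_lag4 - y_lag8)  # residual assumed to be zero
    return y_lag4 + diff_forecast


q1_2005, q1_2004 = 27.08, 25.91  # demand values quoted from Table 1 in the article
direct = forecast_level(q1_2005, q1_2004)
two_step = forecast_via_difference(q1_2005, q1_2004)
print(f"Forecast for Q1 2006: {direct:.2f}")
print(f"Same forecast via the differenced model: {two_step:.2f}")
```

Keeping the two-step route visible is a deliberate choice: it shows that the level form is only a rearrangement of the differenced model, not a different model.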
CONCLUDING REMARKS

It is obvious that the most difficult step in ARIMA modeling is Step 1, the model identification. Once we get a handle on Step 1, the other three steps are quite similar to those in linear regression. Although the calculations of the ACF and PACF and the nonlinear estimation procedure look complicated and tedious, computer software is available to do these jobs.

In the example, the database originally included 44 points; we lost 4 points in differencing. The identified model has a term of lag 4; therefore, only 36 data points were available for model estimation. This is the reason why in ARIMA modeling, we need a relatively large sample size to accommodate data loss due to differencing and the lagged structure of the model.