
Prognostic Methods

Lukáš Čechura
c.h.: Thursday 14:30-16:00

Content - Lectures
1. Introduction to forecasting
2. Time series properties; trend function - forecast; ARIMA models - forecast
3. Low-cost methods
4. Econometric models - specific-to-general modelling
5. General-to-specific modelling - ADL models
6. Price forecasts - application of ADL models
7. VAR models
8. Co-integration analysis, VECM models
9. Leading indicators

Content - seminars
1. Project, Dataset, Trend function
2. ARIMA model
3. ADL model
4. VAR model; Evaluation of forecasts

Literature

Granger, C.W.J.: Forecasting in Business and Economics, Academic Press, New York, 1989.

Charemza, W.W., Deadman, D.F.: New Directions in Econometric Practice: General to Specific Modelling, Cointegration and Vector Autoregression, Edward Elgar, 1999 (2nd edition).

Greene, W.H.: Econometric Analysis, Prentice Hall, 2003 (5th edition).

Scenario

September 26, 2015, will be a Thursday. By 11 a.m. the sun will be shining brightly on the Scripps Beach at La Jolla, a few miles north of San Diego in California. The surf will consist of four-foot breakers from the southwest at fourteen-second intervals. It will be high tide, with a height of +2.3 feet. After a brief spell of body surfing and enjoying the 20 °C temperature, I shall put on my mask and snorkel and start exploring the kelp beds just beyond the pier. I will find a half dollar lost by another diver many months ago and find that a particularly attractive barnacle is attached to it. I use the half dollar to buy an Evening Tribune and get no change.

Examination of scenario

The date. September 26, 2015, merely specifies when the scenario is to occur.

The day. That this date will be a Thursday is not a forecast but merely a designation of a name to a date according to a widely accepted method of designating such names, known as the Gregorian calendar. However, an assumption is being made that the same calendar will be in use in 2015.

Tide-height. The mechanism that causes tides is very well known and depends on the movement of the moon and various other members of the solar system. As these movements can be predicted with great accuracy, it is possible to calculate a tide height for virtually any beach in the world, although the calculation is not necessarily an easy one. Thus a forecast of tide height can be made, and one would expect very little error in the forecast, although previous wind conditions may affect the actual value somewhat. The only assumption required to make this forecast is that the tide-generating mechanism does not change in any significant fashion before 2015.

Sun shining, surf conditions, ocean temperature. These can all be considered to be forecasts made by observing what typically happens at Scripps Beach in La Jolla on September 26. For example, one might record sea temperature at 11 a.m. on all previous September 26s for which readings exist, average these values, and use this average to forecast what will occur at the same date in the future. It would clearly be wrong to average temperatures over all past days, as ocean temperature varies greatly from one part of the year to the next, a feature known as the seasonal variation. Clearly such forecasts could be very wrong, but it is difficult to see how they might be improved. Local weather effects in 2015 depend so greatly on the future movements of air masses, and the mechanism that moves these masses through the atmosphere is so complicated, that there is virtually no hope of making a really accurate forecast of what will occur several years hence. Unlike the astronomers and oceanographers, who have a very high quality model for the movement of planets and oceans, the meteorologist has only a very imperfect model for the atmosphere due to the extra order of complexity involved.

My personal behavior. Much of this is very doubtful, requiring not only that I be around in 2015 and sufficiently healthy to enjoy the ocean but also that I learn how to snorkel. Given that I am sufficiently healthy and wealthy, the prediction could be self-fulfilling.

Finding a half dollar with a barnacle. This is just pure fantasy. Such a thing could occur but is extremely unlikely. Any detail of this kind can be thought of as being merely the use of artistic license by the writer.

Spending a half dollar to buy an Evening Tribune. The San Diego evening paper at this time is the Evening Tribune and currently costs 25 cents, having gone from 15 cents fairly recently. By looking at the history of price increases in the economy at large, one might well forecast that what now costs 25 cents will be costing 50 cents by 2015. In fact, such a price is possibly optimistic. There has been a noticeable upward drift in prices, known as a trend, and, by simply assuming that this drift will continue, the price of the paper in 2015 may be estimated. The mechanism by which prices will change is by no means well understood, so the forecast is based on an extension, or extrapolation, of what has been observed in the past.

This simple scenario brings out a number of important points about forecasting:

The most important of these is that things to be forecast vary greatly in their degree of predictability.

It should also be clear that the methods that can be used to forecast can vary greatly and will depend upon data availability, the quality of models available, and the kinds of assumptions made, amongst other things.

Forecasting situations

Many millions of dollars are spent annually on prediction in the United States alone, making forecasting big business.

The major consumers of specific forecasts are government officials and servants, federal, state, and local, together with management, particularly that part belonging to the higher echelons, in all types of business.

Some typical forecasting situations:

A company has to forecast future sales of each of its products to ensure that its production and inventory are kept at economical levels while controlling the likelihood of being unable to meet orders.

A firm is considering putting capital into a new investment. To decide whether the investment is a worthwhile one, it has to forecast the returns that will result in each of the next few years.

A cigarette manufacturing company is considering introducing a new brand and has to predict the likely sales for this brand and also the effect of sales on its other brands.

A government wants to forecast the values of some important economic variable, such as the unemployment rate, if it takes no action or if it alters one of its controls, such as the marginal tax rate. Such forecasts are necessary for policy decisions.

A town council forecasts the demand for junior school places in some part of the town in order to decide whether or not an extra school is required, or if an existing school should be expanded.

A government expects to have a substantial deficit in three years' time and would like to know if there will be a crowding-out effect leading to increased interest rates.

Forecasting

Forecasting might be defined as a part of cognition theory relating to the future.

Economic processes have an objective character.

Forecast (prognosis) vs. prediction vs. hypothesis

Types of forecasts
(classification)

According to the length of the forecasting period

According to the subject of forecasting

Three important types of forecasts:

Event outcome forecasts

Event timing forecasts

Time series forecasts



Event outcome forecasts

A baby is to be born. What sex will it be?

An election is to occur. Who will win?

What grade will you get in your forecasting course?

The main problem with forecasting the outcome of a future event is that the event may be unique, so that really relevant information may be difficult or expensive to acquire.

The sex of an unborn baby. It is very easy to note that in the United States in recent years 51.3% of all babies have been boys. Thus, one could make a forecast of the form: with probability 0.513 the baby will be a boy. By observing the proportions of boys and girls born in the families of the mother and father, one might want to alter this probability, since some families produce a predominance of girls, for instance. Much better information could be acquired by performing chemical tests on the mother-to-be, and quite possibly a very definite prediction could be made with considerable confidence, particularly as the birth date draws near.

Event timing forecasts

When will the next Czech election occur?

When will the next turning point of the economy happen? (leading indicator)

When will the CNB change its interest rate?

When will your sister get married?

Time series forecasts

A time series is a sequence of values usually recorded at equidistant time intervals.

Daily closing prices of IBM shares;

Monthly unemployment rate or balance of payments deficit, etc.

Variable x_t, where t = 1, ..., n

What is the value of x at time n+h?

h represents the forecasting period.

x_t is a sequence of stochastic values, as is x_{n+h}.

The variable x_t might be characterized in terms of probability theory:

DF, PDF;

E(x), variance

Interval vs. point forecast



Information sets

Information sets (I_n):

I_n: x_{n-j}, j >= 0

I_n: x_{n-j}, y_{n-j}, z_{n-j}, etc., j >= 0

Misspecification:

Under-parameterization

Over-parameterization

Information sets:

Proper: past and current data

Improper: leads to a suboptimal forecast

Information sets:

Numerical data

Information of a nonnumerical kind



Cost function of forecasts

Criterion for best forecast selection or forecast evaluation, respectively.

C(e) = A e^2

Criterion = min C(e)
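As a worked detail (not stated explicitly on the slide, but standard): with a quadratic cost function, the forecast that minimizes the expected cost is the conditional mean of the series given the information set.

```latex
% Optimal forecast under quadratic cost C(e) = A e^2, where e = x_{n+h} - f_{n,h}
\min_f \, \mathrm{E}\!\left[ A\,(x_{n+h} - f)^2 \mid I_n \right]
\quad\Longrightarrow\quad
f_{n,h}^{*} = \mathrm{E}\!\left[ x_{n+h} \mid I_n \right]
```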



Forecasting method

Subjective method

Objective method

Time series models

Time series

Time series analysis methods are used to describe time series and/or to predict their future development.

Basic characteristics of time series

The first task of time series analysis is to obtain initial information about the process that the time series represents.

Visual analysis

Figure

Basic characteristics/statistics (differences of various orders, mean, variance, distribution, etc.)

Time series model -
approaches

The fundamental principle is a one-dimensional model:
Y_t = f(t; u)
We have basically three approaches to this model:
(i) Classical (formal) model
(ii) Box-Jenkins methodology
(iii) Spectral analysis

Stochastic time series

A stochastic process is an infinite sequence of random values ordered in time.

The realisation of each observation is given by a probability distribution function f(Y_t).

Modelling a stochastic process asks for a proper description of this function.

The better the description, the better the forecast we can get.

Stationarity vs. non-stationarity of time series

Definition: A time series is said to be strictly stationary if the joint distribution of X_{t1}, ..., X_{tn} is the same as the joint distribution of X_{t1+R}, ..., X_{tn+R} for all t_1, ..., t_n and R.

That is, the distribution of the stationary process remains unchanged when shifted in time by an arbitrary value R.

Stationarity

Weak stationarity (second-order stationarity): the mean and variance of X_t are constant and the covariances of X_t depend only on the lag or interval R = t_1 - t_2, not on t_1 or t_2.

Non-stationary time series
Reasons for non-stationarity:

Trend

Seasonality

Structural shocks in the economy

Detection of non-stationarity (a sketch follows below):
- ACF
- Unit root tests
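A minimal sketch of both detection tools, assuming a pandas Series y holds the observed time series (statsmodels is used here purely for illustration; it is not referenced in the slides):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import acf, adfuller

# Illustrative data: a random walk, which should look non-stationary.
rng = np.random.default_rng(0)
y = pd.Series(np.cumsum(rng.normal(size=200)))

# 1) ACF: for a non-stationary series the autocorrelations decay very slowly.
print("ACF (lags 1-10):", np.round(acf(y, nlags=10)[1:], 3))

# 2) Unit root test (Augmented Dickey-Fuller): H0 = unit root (non-stationary).
adf_stat, p_value, *_ = adfuller(y)
print(f"ADF statistic = {adf_stat:.3f}, p-value = {p_value:.3f}")

# Differencing usually removes the unit root.
adf_stat_d, p_value_d, *_ = adfuller(y.diff().dropna())
print(f"ADF on first differences: statistic = {adf_stat_d:.3f}, p-value = {p_value_d:.3f}")
```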

Time series models

ARIMA models

We use these models for short-run


prediction especially in situations:

We do not have adequate dataset

LRM is not verified or has bad


forecasting characteristics

Random walk models

A simple stochastic process.

A non-stationary process, since its mean and variance are not constant.

I(1)

If we use this process, e.g., for the description of the dynamic behaviour of some variable (e.g. share prices or consumer prices), we talk about a non-stationary time series model.

However, the time series of first differences is generated by a stationary random process, which is also called white noise.

This holds for the short-run prediction based on the random walk model:

y_{t+1} = y_t + E(u_{t+1}) = y_t
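A minimal illustration of this naive (random walk) forecast, assuming a pandas Series y of prices; the h-step forecast is simply the last observed value:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
y = pd.Series(100 + np.cumsum(rng.normal(size=120)))  # simulated I(1) "price" series

h = 5
last_value = y.iloc[-1]
forecast = pd.Series([last_value] * h, name="random_walk_forecast")
print(forecast)

# The first differences should behave like white noise:
print("mean of first differences:", round(y.diff().dropna().mean(), 3))
```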

Moving average models (MA)

MA models are used for modelling the dynamics of stationary time series.

E.g. the analysis of changes in share prices. This sequence of price changes with zero mean and constant variance can be written:

y_t = u_t

u_t are identically distributed error terms, serially uncorrelated. They represent the impact of unexpected factors on the share price, e.g. information about the financial situation of the company.

It can be assumed that not all new information is captured by the market within one day. That is why the price change of the next day can be expressed as:

y_{t+1} = u_{t+1} + a u_t

where u_{t+1} represents the impact of the new information at time t+1, and a u_t expresses the impact of the information from the previous day.

With respect to the fact that the observations generated by an MA(1) process are correlated at the first order only (only neighbouring observations, e.g. y_t and y_{t+1}, are correlated), we say that the MA(1) process has a memory of only one period, i.e. it forgets everything that happened with a delay of more than one period.
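A small sketch, assuming statsmodels is available: simulate an MA(1) series and recover the coefficient a by fitting an ARIMA(0,0,1) model (the parameter values are illustrative only):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
a = 0.6                      # true MA(1) coefficient
u = rng.normal(size=500)     # white-noise innovations
y = u[1:] + a * u[:-1]       # y_t = u_t + a * u_{t-1}

# MA(1) is ARIMA(p=0, d=0, q=1)
result = ARIMA(y, order=(0, 0, 1)).fit()
print(result.params)         # constant, MA(1) coefficient, innovation variance

# One-step-ahead forecast: uses only the last innovation (memory of one period)
print(result.forecast(steps=1))
```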

Autoregressive model (AR)

Another approach to modelling the structure of stationary time series.

Expressing y_t as a function of several of its previous observations.

Autoregressive moving average model -
ARMA

In practice we often encounter situations where the stationary stochastic process is not fully consistent with the assumptions of MA or AR models. In such situations we might use a model that is a combination of AR and MA processes.

We call this model the ARMA(p,q) model, where p denotes the order of the AR process and q the order of the MA process.


Autoregressive integrated moving average
model (ARIMA)

Besides stationary stochastic processes, we use non-stationary processes as well.

I(d)

A special case of the ARIMA process, which is used for generating time series containing a trend, is the SARIMA process, which is used to model time series of the multiplicative seasonal type, i.e. with stochastic seasonality (again, several modifications exist: SAR, SMA or SARMA, respectively).

Specification of ARIMA (p,d,q) model

Step 1 - linearization of the time series

Step 2 - determining the order of integration of the time series, I(d)

Step 3 - determining p and q

Step 4 - estimation

Step 5 - verification

Step 6 - application - forecasting

Step 3 - determining p and q

ACF

AIC, SIC

Step 4 - estimation

AR - OLS

MA - special techniques, e.g. iterative NLS

Step 5 - verification

Autocorrelation

Ex-post forecast

Step 6 - application - forecasting (a sketch follows below)

MA(1) model

AR(1) model

ARMA(1,1) model
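A compressed sketch of the whole ARIMA(p,d,q) workflow under the assumption that a pandas Series y is available (the candidate orders and significance level are illustrative choices only):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import adfuller
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(3)
y = pd.Series(np.cumsum(rng.normal(size=300)))   # illustrative I(1) series

# Step 2: order of integration - difference until the ADF test rejects a unit root.
d = 0
work = y.copy()
while adfuller(work)[1] > 0.05 and d < 2:
    work = work.diff().dropna()
    d += 1

# Step 3: choose p and q by an information criterion (AIC here).
best = None
for p in range(3):
    for q in range(3):
        res = ARIMA(y, order=(p, d, q)).fit()
        if best is None or res.aic < best[0]:
            best = (res.aic, p, q, res)

aic, p, q, res = best
print(f"selected ARIMA({p},{d},{q}), AIC = {aic:.1f}")

# Step 5: verification - residuals should be white noise (Ljung-Box test).
print(acorr_ljungbox(res.resid, lags=[12]))

# Step 6: application - forecast h steps ahead.
print(res.forecast(steps=6))
```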

Method with low costs

The whole process of time series analysis takes quite a long time and is thus too costly for some applications.

EWMA - exponentially weighted moving average methods

Forecasts with low costs.

In this method the most important characteristic of the time series is its mean.

The forecast is based on the estimation of the mean of the time series.

EWMA
x̄_t = α x_t + (1 - α) x̄_{t-1}
f_{t,h} = x̄_t

Improved EWMA (with trend T_t and seasonal component S_t, monthly data)
x̄_t = α (x_t - S_{t-12}) + (1 - α)(x̄_{t-1} + T_{t-1})
T_t = β (x̄_t - x̄_{t-1}) + (1 - β) T_{t-1}
S_t = γ (x_t - x̄_t) + (1 - γ) S_{t-12}
f_{t,h} = x̄_t + h T_t + S_{t+h-12}
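A minimal sketch of both variants, assuming monthly data in a pandas Series y; the simple EWMA uses pandas' built-in exponential weighting, and the improved (Holt-Winters style) variant uses statsmodels' ExponentialSmoothing as an off-the-shelf stand-in for the recursions above:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

rng = np.random.default_rng(4)
idx = pd.date_range("1995-01-01", periods=120, freq="MS")
seasonal = 5 * np.sin(2 * np.pi * np.arange(120) / 12)
y = pd.Series(100 + 0.2 * np.arange(120) + seasonal + rng.normal(size=120), index=idx)

# Simple EWMA: the h-step forecast is the last smoothed value.
alpha = 0.3
xbar = y.ewm(alpha=alpha, adjust=False).mean()
print("EWMA forecast (any h):", round(xbar.iloc[-1], 2))

# Improved EWMA: additive trend and additive 12-month seasonality.
model = ExponentialSmoothing(y, trend="add", seasonal="add", seasonal_periods=12).fit()
print(model.forecast(6))
```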

Which method to use?

ARIMA

Stepwise AR model

EWMA model

The decision is determined by:

Time

Money

Data set

General to specific modelling

General-to-specific modelling is based on the formulation of a general dynamic model, which is subsequently tested, transformed and reduced in size using different kinds of statistical tests.

The resulting specific model is a nested version of the general model. It contains only significant relationships among variables.

ADL model

A general dynamic (one-equation) model is usually specified in the form of an ADL (Autoregressive Distributed Lag) model.

An ADL model is a model which contains n lags of the dependent variable (that is why autoregressive) and p lags of the independent variable(s) (that is why distributed lag).

ADL (n,p) model
y_t = α_0 + α_1 y_{t-1} + α_2 y_{t-2} + ... + α_n y_{t-n} + β_{10} x_{1t} + β_{11} x_{1t-1} + ... + β_{1p} x_{1t-p} + u_t

where α_0, ..., α_n and β_{10}, ..., β_{1p} are unknown parameters for n lags of the endogenous variable and p lags of the exogenous variable, and u_t are Gaussian residuals with zero conditional mean, i.e. E(u_t | y_{t-1}, ..., y_{t-n}, x_t, x_{t-1}, ..., x_{t-p}) = 0.

ADL (n,p) model with k regressors
y_t = α_0 + α_1 y_{t-1} + α_2 y_{t-2} + ... + α_n y_{t-n} + β_{10} x_{1t} + β_{11} x_{1t-1} + ... + β_{1p} x_{1t-p} + ... + β_{k0} x_{kt} + β_{k1} x_{kt-1} + ... + β_{kp} x_{kt-p} + u_t

where α_0, ..., α_n and β_{10}, ..., β_{1p}, ..., β_{k1}, ..., β_{kp} are unknown parameters for n lags of the endogenous variable and p lags of the k exogenous variables, and u_t are Gaussian residuals with zero conditional mean, i.e. E(u_t | y_{t-1}, ..., y_{t-n}, x_t, x_{t-1}, ..., x_{t-p}) = 0.
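A small sketch of how such an ADL(n,p) model can be estimated by OLS on lagged regressors, assuming two pandas Series y and x (the lag orders n = 2, p = 2 are arbitrary choices here):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = pd.Series(rng.normal(size=200)).cumsum()
y = 0.5 * x + pd.Series(rng.normal(size=200)).cumsum() * 0.1

n, p = 2, 2
data = pd.DataFrame({"y": y, "x": x})
for j in range(1, n + 1):
    data[f"y_lag{j}"] = data["y"].shift(j)       # lags of the endogenous variable
for i in range(1, p + 1):
    data[f"x_lag{i}"] = data["x"].shift(i)       # lags of the exogenous variable
data = data.dropna()

X = sm.add_constant(data.drop(columns="y"))      # includes current x and all lags
ols = sm.OLS(data["y"], X).fit()
print(ols.params)
```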

DL model and parameter interpretation

The partial multiplier of order i expresses the marginal effect of a unit change in x_{t-i} on y_t. In other words, the partial multipliers show the impact of a unit change in x in period t-i, i.e. i periods before t, on E(y_t) (ceteris paribus).

The short-run, or impact, multiplier is the partial multiplier of order i = 0, β_0. This multiplier shows the impact of a unit change in x_t on E(y_t) in the current period.

The interim, or intermediate, multipliers of order i represent the sum of the first i partial multipliers. That is, they equal β_0 + β_1 + β_2 + ... + β_i. These multipliers show the impact of a unit change in x_t on E(y_t) over i periods with respect to time t.

The long-run multiplier is the sum of all partial multipliers of the DL model. This multiplier expresses the impact of a unit change in x_t on E(y_t) over all periods or lags, respectively.

ADL model - interpretation of
parameters

Partial multipliers of i-th order - the same meaning as in the case of the DL model.

Short-run, or impact, multiplier β_0 - again the same meaning as in the case of the DL model.

Interim multipliers of i-th order - the same meaning as in the DL model; however, it is important to take into consideration the impact of the lagged endogenous variables.

Interim multiplier of i-th order (dynamic responses of the ADL model):

∂y_t / ∂x_t = β_0
∂y_{t+1} / ∂x_t = α_1 β_0 + β_1
∂y_{t+2} / ∂x_t = α_1 (α_1 β_0 + β_1) + α_2 β_0 + β_2
...

The interim multiplier of order i, I_i, cumulates these dynamic responses up to lag i.

Long-run multiplier of ADL model

The long-run multiplier equals the limit of the interim multipliers as i grows, i.e. once all n and p lags have worked through.

ADL model and cointegration analysis

The long-run (equilibrium) solution of the ADL model:

Y* = α_0 / (1 - α_1 - ... - α_n) + [(β_{10} + β_{11} + ... + β_{1p}) / (1 - α_1 - ... - α_n)] X*
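A tiny sketch of the long-run solution computed from estimated ADL coefficients, assuming arrays alpha (coefficients on lagged y, without the constant) and beta (coefficients on current and lagged x); the numbers are made up:

```python
import numpy as np

alpha0 = 1.2                       # constant
alpha = np.array([0.5, 0.2])       # coefficients on y_{t-1}, y_{t-2}
beta = np.array([0.4, 0.1, 0.05])  # coefficients on x_t, x_{t-1}, x_{t-2}

long_run_multiplier = beta.sum() / (1 - alpha.sum())
y_star_intercept = alpha0 / (1 - alpha.sum())
print(f"long-run multiplier = {long_run_multiplier:.3f}")
print(f"long-run solution: Y* = {y_star_intercept:.3f} + {long_run_multiplier:.3f} * X*")
```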


Model ADL(1,1)

The general ADL(1,1) model, y_t = α_0 + α_1 y_{t-1} + β_0 x_t + β_1 x_{t-1} + u_t, might be reduced to at least ten specific models:

α_1 = β_1 = 0
Static regression

β_0 = β_1 = 0
AR(1) model

α_1 = β_0 = 0
Leading indicator equation

α_1 = 1, β_0 = -β_1
Model in first differences

α_1 = 0
DL(1) model

β_1 = 0
Partial adjustment model

β_0 = 0
Dead-start model with lagged information

β_1 = -α_1
A model in which the restriction β_1 = -α_1 holds is called a model of proportional reaction; its explanatory variables are x_t and (y_{t-1} - x_{t-1}).

α_1 - 1 = -(β_0 + β_1)
The model where α_1 - 1 = -(β_0 + β_1) is a little bit more difficult to derive. To get the model we must express the ADL(1,1) model in first differences with an error-correction mechanism. That is, we must subtract y_{t-1} from both sides of the ADL(1,1) model and add and subtract β_0 x_{t-1} on the right side of the model. Then the relation can be simplified to:

Δy_t = α_0 + β_0 Δx_t + (α_1 - 1)(y_{t-1} - x_{t-1}) + u_t

This equation contains an error-correction mechanism of the type (y_{t-1} - x_{t-1}) if the following condition is satisfied: α_1 - 1 = -(β_0 + β_1).

β_1 = -α_1 β_0
Restriction 10 is called the common factor, or COMFAC, restriction.

Assumptions of ADL model

E(u_t | y_{t-1}, ..., y_{t-n}, x_{1t}, x_{1t-1}, ..., x_{1t-p}, ..., x_{kt}, x_{kt-1}, ..., x_{kt-p}) = 0;

(a) the random variables (y_t, x_{1t}, ..., x_{kt}) are stationary;

(b) (y_t, x_{1t}, ..., x_{kt}) and (y_{t-j}, x_{1t-j}, ..., x_{kt-j}) are independent for j large enough;

x_{1t}, ..., x_{kt} and y_t have nonzero, finite fourth moments;

no perfect multicollinearity.

ADL (n,p) model lag length choice

F-test

Maximum of adjusted R²

Minimum of the Akaike Information Criterion

Minimum of the Bayesian (also Schwarz) Information Criterion

Adjusted R²:  R̄² = 1 - [(n - 1) / (n - q - 1)] (1 - R²)

AIC(q) = ln(SSR(q)/n) + q (2/n)

BIC(q) = ln(SSR(q)/n) + q (ln(n)/n)

The information criteria might be in conflict. Then the choice depends on our preferences.


From General to Specific model

Tests of significance of the i-th lag

F-test

Granger causality
Test of Granger causality
Statistic of the Granger causality F-test

Forecasting with ADL (n,p) model

Short-run forecast

f_{n,1} or ŷ_{t+1} = b_0 + b_1 y_t + c_1 x_t

f_{n,1} or ŷ_{t+1} = b_0 + b_1 y_t + b_2 y_{t-1} + ... + b_n y_{t-n+1} + c_{11} x_{1t} + ... + c_{1p} x_{1t-p+1} + ... + c_{k1} x_{kt} + ... + c_{kp} x_{kt-p+1}

The situation might be complicated by including the explanatory variable x at time t.

Forecasting with ADL (n,p) model

Interim and long-run forecasts

f_{n,2} or ŷ_{t+2} = b_0 + b_1 ŷ_{t+1} + b_2 y_t + ... + b_n y_{t-n+2} + c_{10} x_{1t+2} + c_{11} x_{1t+1} + ... + c_{1p} x_{1t-p+2} + ... + c_{k0} x_{kt+2} + c_{k1} x_{kt+1} + ... + c_{kp} x_{kt-p+2}

f_{n,h} or ŷ_{t+h} = b_0 + b_1 ŷ_{t+h-1} + b_2 y_{t+h-2} + ... + b_n y_{t-n+h} + c_{10} x_{1t+h} + c_{11} x_{1t+h-1} + ... + c_{1p} x_{1t-p+h} + ... + c_{k0} x_{kt+h} + c_{k1} x_{kt+h-1} + ... + c_{kp} x_{kt-p+h}

Forecast error

The forecast error consists of two components:

The error from the estimation of the parameters.

The uncertainty about the value of u_t.

The magnitude of a typical forecast error might be expressed by the RMSFE (Root Mean Squared Forecast Error).

RMSFE

For a forecast ŷ_{t+1} = b_0 + b_1 y_t + c_1 x_t of the model y_{t+1} = β_0 + β_1 y_t + γ_1 x_t + u_{t+1}:

y_{t+1} - ŷ_{t+1} = u_{t+1} - [(b_0 - β_0) + (b_1 - β_1) y_t + (c_1 - γ_1) x_t]

MSFE = E[(y_{t+1} - ŷ_{t+1})²] = σ²_u + var[(b_0 - β_0) + (b_1 - β_1) y_t + (c_1 - γ_1) x_t],  RMSFE = √MSFE

Interval forecast

Assuming a normal distribution of the error term, the interval forecast is given by:

ŷ_{t+1} ± t_table · SE(y_{t+1} - ŷ_{t+1})

Ex-post forecast

The ex-post forecast is used to verify the forecasting quality of the model and to estimate the RMSFE.
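A minimal sketch of an ex-post evaluation: hold out the last h observations, forecast them with a previously fitted model, and compute the RMSFE. Here an ARIMA model merely stands in for whichever forecasting model is being verified:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(7)
y = pd.Series(100 + np.cumsum(rng.normal(size=130)))

h = 6
train, holdout = y[:-h], y[-h:]

res = ARIMA(train, order=(1, 1, 0)).fit()
forecast = res.forecast(steps=h)

errors = holdout.to_numpy() - forecast.to_numpy()
rmsfe = np.sqrt(np.mean(errors ** 2))
print(f"ex-post forecast errors: {np.round(errors, 3)}")
print(f"RMSFE = {rmsfe:.3f}")

# A rough 95% interval forecast using the RMSFE (normality assumed):
print("interval for the first step:",
      (forecast.iloc[0] - 1.96 * rmsfe, forecast.iloc[0] + 1.96 * rmsfe))
```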

ADL model (application) - forecast of farm prices

Q_{SZt} = f(CZV_{t-h} | MC_{t-h}, ...); h = 1, ..., n

CZV_t = f(Q_{SZt} - Q_{DZt})

Q_{DZt} = f(MR_{t-h} - CZV_{t-h} = 0 | ...)

By substituting the supply function and the demand function into the relation CZV_t = f(Q_{SZt} - Q_{DZt}), the reduced model of price transmission can be obtained, i.e.

CZV_t = f(CZV_{t-h}, CPV_{t-h} | ...)

General model

CZV_t = α_0 + α_1 CZV_{t-1} + ... + α_n CZV_{t-n} + β_{11} CPV_{t-1} + ... + β_{1p} CPV_{t-p} + u_t

where α_0, ..., α_n and β_{10}, ..., β_{1p} are unknown parameters for n lags of the endogenous variable and p lags of the exogenous variable, and u_t are Gaussian residuals with zero conditional mean, i.e. E(u_t | y_{t-1}, ..., y_{t-n}, x_t, x_{t-1}, ..., x_{t-p}) = 0.

DATA SET

Time series: farm prices (CZV) and wholesale prices (CPV) gathered by the CZSO (database ARAD, CNB).

The time series cover the period January 1995 to July 2005, i.e. 127 observations.

[Figure: CZV and CPV price indexes (average 1994 = 100), monthly, January 1995 to July 2005]

[Figure: average CZV in individual months (I-XII)]
[Figure: average CPV in individual months (I-XII)]

Data transformation

Farm prices (CZV) and wholesale prices (CPV) - annual changes
[Figure: CZV and CPV annual changes, January 1995 to July 2005]

First differences CZV (dCZV) and CPV
(dCPV)
[Figure: first differences dCZV and dCPV, 127 observations]

Farm prices (CZV) and wholesale prices (CPV)
seasonal decomposition
[Figure: seasonally adjusted CZV and CPV, January 1995 to July 2005]

First differences CZV-seasonal dec.,
CPV and CPV-seasonal dec.
[Figure: first differences of seasonally adjusted CZV, of CPV, and of seasonally adjusted CPV]

Autocorrelation coefficients by lag
Columns: lag | price indexes (average 1994=100): CZV, CPV, dCZV, dCPV | annual price changes: CZV, CPV, dCZV, dCPV | seasonally adjusted price indexes: CZVo, CPVo, dCZVo, dCPVo
1 0,929 0,983 0,320 0,629 0,851 0,998 0,119 0,543 0,920 0,997 0,241 0,206
2 0,810 0,944 -0,058 0,479 0,666 0,994 0,077 0,411 0,799 0,992 -0,083 0,146
3 0,696 0,890 -0,078 0,384 0,461 0,989 0,070 0,426 0,692 0,986 -0,079 0,133
4 0,587 0,823 -0,015 0,370 0,232 0,982 -0,225 0,331 0,594 0,979 -0,015 0,092
5 0,477 0,742 -0,044 0,321 0,070 0,974 -0,197 0,287 0,496 0,972 -0,070 0,316
6 0,372 0,651 -0,003 0,257 -0,036 0,965 -0,322 0,277 0,408 0,962 -0,044 0,087
7 0,275 0,554 0,069 0,150 -0,051 0,954 -0,172 0,182 0,325 0,951 -0,023 0,261
8 0,178 0,458 0,122 0,072 -0,023 0,942 -0,194 0,138 0,242 0,938 0,135 -0,010
9 0,066 0,362 0,166 -0,011 0,069 0,929 0,108 0,157 0,140 0,924 0,115 -0,018
10 -0,080 0,267 0,026 -0,098 0,133 0,914 0,111 0,014 0,020 0,910 -0,018 -0,079
11 -0,231 0,175 -0,100 -0,245 0,164 0,899 0,028 -0,086 -0,095 0,896 -0,017 -0,134
12 -0,362 0,094 -0,383 -0,491 0,189 0,884 0,632 0,010 -0,205 0,883 -0,011 0,455
13 -0,431 0,032 -0,307 -0,353 0,033 0,867 -0,054 -0,149 -0,314 0,864 -0,245 -0,197

ADL (Data annual changes)
Total observations: 127
Usable observations: 114 (1996:02 to 2005:07)
Dependent variable: dCZV
R²: 0,275   SEE: 3,2654
DW-test: 2,0108 SSR: 1164,418
Variable Coefficient P-value
dCZV (1) 0,179827 0,0463
dCZV (12) -0,306588 0,0011
dCPV (1) 0,7918373 0,0730
dCPV (12) -0,774379 0,0741
Specification tests T-Stat. Significance level
A: Tests for Autocorrelation LM(1) 0,03689 0,84768
LM(4) 9,32449 0,05348
B: Tests for Heteroscedasticity Breusch-Pagan test 6,7595 0,07997
C: Functional Form Tests RESET test with quadratic 0,00216 0,96299
RESET test with quadratic and cubic 0,0063 0,99372
D: Tests for Normality BJ test 8,31716 0,01563
E: Structural Stability Tests Chow test 0,87207 0,45809

Forecast - CZV
Period | Forecast dCZV | Forecast CZV | Actual CZV | Forecast error
VIII.05 1,6811 90,6811 91,50 0,8189
IX.05 2,5747 93,2558 93,30 0,0442
X.05 0,8152 94,0711 93,70 -0,3711
XI.05 -0,7437 93,3274 94,00 0,6726
XII.05 1,5110 94,8385 94,80 -0,0385

ADL (Data index 1994=100)
Total observations: 127
Usable observations: 113 (1996:03 to 2005:07)
Dependent variable: dCZV
R²: 0,1909   SEE: 2,9941
DW-test: 2,0886 SSR: 959,19
Variable Coefficient P-value
dCZV (1) 0,2169 0,0265
dCZV (2) -0,2057 0,0311
dCZV (13) -0,2132 0,0215
dCPV (1) 0,6562 0,0871
dCPV (2) 0,2492 0,3956
dCPV (13) -0,6835 0,0580
Specification tests T-Stat. Significance level
A: Tests for Autocorrelation LM(1) 2,06817 0,1504
LM(4) 7,94291 0,0937
B: Tests for Heteroscedasticity Breusch-Pagan test 9,36495 0,0248
C: Functional Form Tests RESET test with quadratic 5,72859 0,0184
RESET test with quadratic and cubic 3,3175 0,0401
D: Tests for Normality BJ test 8,20156 0,0166
E: Structural Stability Tests Chow test 1,21043 0,3098

Forecast CZV
Period | Forecast dCZV (seas. adj.) | Forecast CZV (seas. adj.) | Forecast CZV | Actual CZV | Forecast error
VIII.05 -1,8622 -5,8387 110,3251 108,2929 -2,0321
IX.05 0,7092 -5,1295 105,8980 103,1604 -2,7376
X.05 1,9833 -3,1463 110,3234 106,8399 -3,4835
XI.05 0,5819 -2,5644 110,5036 108,0690 -2,4346
XII.05 -0,3093 -2,8737 106,6251 103,8608 -2,7643

Comparison
Period | Forecast dCZV | Forecast CZV | Actual CZV | Forecast error
VIII.05 1,6811 90,6811 91,50 0,8189
IX.05 2,5747 93,2558 93,30 0,0442
X.05 0,8152 94,0711 93,70 -0,3711
XI.05 -0,7437 93,3274 94,00 0,6726
XII.05 1,5110 94,8385 94,80 -0,0385
Period | Forecast dCZV (seas. adj.) | Forecast CZV (seas. adj.) | Forecast CZV | Actual CZV | Forecast error
VIII.05 -1,8622 -5,8387 110,3251 108,2929 -2,0321
IX.05 0,7092 -5,1295 105,8980 103,1604 -2,7376
X.05 1,9833 -3,1463 110,3234 106,8399 -3,4835
XI.05 0,5819 -2,5644 110,5036 108,0690 -2,4346
XII.05 -0,3093 -2,8737 106,6251 103,8608 -2,7643

VAR model

Employing the ADL model for modelling price transmission implicitly assumes that one price is exogenous. However, this assumption does not hold in many applications, i.e. in such cases we face the endogeneity problem.

Economic model - modification

Profit of the i-th processing firm:

π_i = P_P(Q_P) Q_Pi - P_A(Q_A) Q_Ai - C_i,    k = Q_Ai / Q_Pi

max over Q_Pi:  π_i(P_P, P_A) = P_P(Q_P) Q_Pi - P_A(Q_A) Q_Ai - C_i

First-order condition:  ∂π_i(P_P, P_A) / ∂Q_Pi = 0

Expanding the first-order condition and expressing it with the price elasticities e_PP and e_PA yields the price-transmission relation:

P_P (1 + 1/e_PP) = k P_A (1 + 1/e_PA) + ∂C_i/∂Q_Pi

VAR model

Vector Autoregressive (VAR) models - the basic feature of these models is that all variables in the model are random (stochastic) and endogenous.

The lag length is the same for all variables.

The VAR model is not strictly based on economic theory.

VAR model

The VAR(p) model is a generalisation of the AR model to more variables.

An advantage of the model is the relatively simple estimation of its parameters by OLS.

Construction of a VAR model (a sketch follows the list below):

Linearization of time series (log transformation)

Unit root tests

Transformation of the variables into stationary time series

Lag length choice

Estimation

Verification and simplification (reduction of lags)

Orthogonalisation of residuals

Application (forecasting, impulse-response analysis)
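A compact sketch of this workflow with statsmodels, assuming a DataFrame with two stationary series (the simulated first differences dx and dy stand in for dCZV and dCPV):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(8)
n = 200
dx = rng.normal(size=n)
dy = 0.4 * np.roll(dx, 1) + rng.normal(size=n) * 0.5   # dy reacts to lagged dx
data = pd.DataFrame({"dx": dx, "dy": dy})

model = VAR(data)
print(model.select_order(maxlags=8).summary())   # AIC, BIC, HQIC, FPE by lag

results = model.fit(maxlags=8, ic="aic")         # estimate with the AIC-chosen lag
print(results.summary())

# Forecast h steps ahead from the last observed lags
h = 5
print(results.forecast(data.values[-results.k_ar:], steps=h))

# Orthogonalised impulse-response functions (Cholesky ordering as given in data)
irf = results.irf(10)
print(irf.orth_irfs.shape)   # (periods + 1, k, k)
```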



VAR model

VAR(p) might be written as (C_s = 0 for s > p):

X_t = Σ_{s=1}^{p} C_s X_{t-s} + U_t

where X_t represents the k variables of the model, i.e. for the two-variable case:

X_t = [x_{1t}, x_{2t}]'
Lag length choice in VAR models - information criteria (see AIC, BIC, etc.)

A very important feature of the VAR model is the correlation between (or among) the residuals of the equations in the model. This enables the derivation of structural alternatives to classical econometric models.

Orthogonalisation of residuals

The derivation of the structural alternative is based on the transformation of the VAR model into a form in which the model has orthogonal residuals.

We may demonstrate the process of orthogonalisation on the example of a two-variable VAR model:

[x_t]   [a_1 b_1] [x_{t-1}]   [a_2 b_2] [x_{t-2}]   [u_{1t}]
[y_t] = [c_1 d_1] [y_{t-1}] + [c_2 d_2] [y_{t-2}] + [u_{2t}]

where the residuals are correlated, i.e.:

E(u_{1t}) = E(u_{2t}) = 0;  E(u_{1t}²) = σ_11;  E(u_{2t}²) = σ_22;  E(u_{1t} u_{2t}) = σ_12

To get the model with orthogonal residuals, we multiply the first equation of the model by σ_12/σ_11 and subtract the result from the second equation:

[x_t                   ]   [a_1  b_1 ] [x_{t-1}]   [a_2  b_2 ] [x_{t-2}]   [u_{1t} ]
[y_t - (σ_12/σ_11) x_t] = [c_1* d_1*] [y_{t-1}] + [c_2* d_2*] [y_{t-2}] + [u_{2t}*]

where c_s* = c_s - (σ_12/σ_11) a_s,  d_s* = d_s - (σ_12/σ_11) b_s,  u_{2t}* = u_{2t} - (σ_12/σ_11) u_{1t}.

The fact that the residuals are not correlated can be proved:

E(u_{1t} u_{2t}*) = E(u_{1t} (u_{2t} - (σ_12/σ_11) u_{1t})) = σ_12 - (σ_12/σ_11) σ_11 = 0

The values of σ_ij are usually not known and must be estimated. The idea of orthogonalisation is based on employing the equations of the model separately in the economic analysis. In this case the economic analysis investigates the impact of an unknown shock, or orthogonal innovation, on the system.
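A small numeric sketch of the same idea, assuming an estimated residual covariance matrix Sigma: the transformation above removes the correlation between the innovations (this mirrors what a Cholesky-based orthogonalisation does):

```python
import numpy as np

rng = np.random.default_rng(9)
# Correlated VAR residuals with covariance Sigma (illustrative numbers)
Sigma = np.array([[1.0, 0.6],
                  [0.6, 0.8]])
U = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma, size=5000)

# Transform the second residual: u2* = u2 - (sigma_12 / sigma_11) * u1
lam = Sigma[0, 1] / Sigma[0, 0]
u1 = U[:, 0]
u2_star = U[:, 1] - lam * u1

print("corr(u1, u2)  =", round(np.corrcoef(U[:, 0], U[:, 1])[0, 1], 3))
print("corr(u1, u2*) =", round(np.corrcoef(u1, u2_star)[0, 1], 3))  # approximately 0
```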

Impulse-response function

[x_t]       ∞  [a_1 b_1]^i  [u_{1t-i}]
[y_t]   =   Σ  [c_1 d_1]    [u_{2t-i}]
           i=0

where u_{2t}* = u_{2t} - (σ_12/σ_11) u_{1t}, and in terms of the orthogonal innovations:

[x_t]       ∞  [θ_11^(i)  θ_12^(i)]  [u_{1t-i} ]
[y_t]   =   Σ  [θ_21^(i)  θ_22^(i)]  [u_{2t-i}*]
           i=0



Interpretation of elements in I-R function

Interpretation of the i-th element:

θ_21^(0) represents the expected impact of a unit change in u_{1t} on y_t at time t = 0.

θ_21^(1) represents the expected impact of a unit change in u_{1t} on y_t at time t = 1.

If we abstract from the correlation of the residuals of the equations, the forecast can be derived mechanically from the VAR model:

f_{T,h} or X̂_{T+h} = Σ_{s=1}^{p} C_s X*_{T+h-s}

where X*_{T+h-s} = X_{T+h-s} for h <= s, and X*_{T+h-s} = X̂_{T+h-s} for h > s.


VAR model - example
Total observations: 127 Monthly Data From: 1995:01 To 2005:07
Usable observations: 113 (1996:03 to 2005:07) Degrees of Freedom: 105
Dependent variable: dCZV Dependent variable: dCPV
R²: 31,93   SEE: 3,2258          R²: 0,62   SEE: 0,4969
DW-test: 2,0629 SSR: 1092,6035 DW-test: 2,0752 SSR: 25,9261
Variable Coefficient p-value Variable Coefficient p-value
dCZV (1) 0,16025 0,1066 dCPV (1) 0,4388 0,0000
dCZV (2) -0,2094 0,2800 dCPV (2) 0,1854 0,0227
dCZV (12) -0,2741 0,0046 dCPV (12) -0,5397 0,0000
dCZV (13) -0,1641 0,1161 dCPV (13) 0,2800 0,0019
dCPV (1) 1,0272 0,0930 dCZV (1) 0,0312 0,0423
dCPV (2) 0,4898 0,3488 dCZV (2) 0,0146 0,3164
dCPV (12) -0,5021 0,3669 dCZV (12) 0,0315 0,0330
dCPV (13) 0,0990 0,8626 dCZV (13) 0,0011 0,9428
F-Test F-statistic p-value F-Test F-statistic p-value
dCZV 5,5941 0,0004 dCPV 29,6004 0,0000
dCPV 2,9671 0,0229 dCZV 2,7889 0,0301

Impulse-response analysis
Responses to Shock in dCZV
[Figure: responses of dCZV and dCPV, 40 periods]

Responses to Shock in dCPV
[Figure: responses of dCZV and dCPV, 40 periods]

Decomposition of Variance dCZV
Decomposition of Variance for Series dCZV
Step Std Error dCZV dCPV
1 3,10951078 100,000 0,000
2 3,21123003 97,880 2,120
3 3,26487070 95,574 4,426
4 3,27680712 94,916 5,084
5 3,29175028 94,679 5,321
6 3,29900310 94,479 5,521
7 3,30228036 94,334 5,660
8 3,30421762 94,253 5,747
9 3,30549764 94,210 5,790
10 3,30626983 94,183 5,817
11 3,30671348 94,167 5,833
12 3,30697610 94,157 5,843
13 3,43536280 94,239 5,761
14 3,57292139 93,073 6,927
15 3,60878770 91,261 8,739
16 3,62624759 90,402 9,598
17 3,64110516 90,010 9,990
18 3,65597919 89,721 10,279

Decomposition of Variance - dCPV
Decomposition of Variance for Series dCPV
Step Std Error dCZV dCPV
1 0,4789928 9,679 90,324
2 0,54376337 16,417 83,583
3 0,59796144 21,141 78,859
4 0,62289722 22,173 77,827
5 0,63714616 22,498 77,502
6 0,64557739 22,776 77,224
7 0,65069735 22,971 77,029
8 0,65371211 23,074 76,926
9 0,65549074 23,129 76,871
10 0,65655163 23,163 76,837
11 0,65718664 23,184 76,816
12 0,65756557 23,196 76,804
13 0,69771656 20,758 79,242
14 0,70208071 20,754 79,246
15 0,71951277 21,811 78,189
16 0,73050160 22,182 77,818
17 0,73932358 22,177 77,823
18 0,74590017 22,236 77,764

Forecast
Forecast dCZV | Forecast dCPV | Forecast CZV | Forecast CPV | Forecast error CZV | Forecast error CPV
-0,7008 -0,1025 88,299 97,897 3,201 0,103
2,8115 0,0310 94,312 98,031 -1,012 -0,331
1,5479 0,0856 94,848 97,786 -1,148 -0,386
-1,1313 0,3575 92,569 97,757 1,431 -0,557
0,5294 0,5901 94,529 97,790 0,271 -1,290

Cointegration analysis

If we transform non-stationary time series into stationary ones by differencing, we lose very important information about their long-run relationship.

Cointegration analysis enables us to solve this problem.

If we assume that the variables Y_t and X_t are integrated of the same order and we model them as:

Y_t = β X_t + u_t

three situations may occur:

the process u_t is white noise, i.e. I(0),

the process u_t is stationary and autocorrelated, and is also I(0),

the process u_t is I(1).

In the first case the variables are cointegrated. That is, there is a long-run relationship between them (the equilibrium relationship). The long-run multiplier is the regression coefficient β.

In the second case the variables are cointegrated as well. In this case we may write u_t = ρ u_{t-1} + ε_t, where ε_t is white noise; then the model can be rewritten in the form of an ADL(1,1), i.e.:

Y_t = ρ Y_{t-1} + β X_t - ρβ X_{t-1} + ε_t

The long-run multiplier representing the long-run relationship between the variables is:

(β_0 + β_1) / (1 - α_1), which here equals (β - ρβ) / (1 - ρ) = β

In the last case the model does not contain cointegrated time series. It is a spurious regression.

Cointegration - definition

Engle and Granger (1987) define cointegration between two variables as follows:

Definition: time series x_t and y_t are cointegrated of order d, b, where d >= b >= 0, formally x_t, y_t ~ CI(d,b), if:

both series are integrated of order d, and

a linear combination of these time series exists, i.e. α_1 x_t + α_2 y_t, which is integrated of order d - b. The vector [α_1, α_2] is called the cointegrating vector.

Economic time series are usually integrated of order 1.

To be cointegrated, the time series must therefore satisfy (a sketch follows below):

x_t, y_t ~ CI(1, 1).
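A minimal Engle-Granger style check under the assumption of two I(1) pandas Series x and y: regress y on x and test the residuals for a unit root (statsmodels' coint() wraps exactly this two-step idea):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller, coint

rng = np.random.default_rng(10)
x = pd.Series(np.cumsum(rng.normal(size=300)), name="x")   # I(1)
y = 2.0 * x + rng.normal(scale=1.0, size=300)              # cointegrated with x

# Step 1: static (cointegrating) regression y_t = beta * x_t + u_t
res = sm.OLS(y, sm.add_constant(x)).fit()
u = res.resid

# Step 2: unit root test on the residuals (note: the plain ADF critical values
# are only approximate here; coint() uses the appropriate ones).
print("ADF p-value on residuals:", round(adfuller(u)[1], 4))
print("Engle-Granger coint test p-value:", round(coint(y, x)[1], 4))
print("estimated long-run coefficient beta:", round(res.params.iloc[1], 3))
```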

VECM

VECM can be formally written:

ΔX_t = Π X_{t-1} + Σ_{s=1}^{p} C_s ΔX_{t-s} + u_t

where C_s = 0 for s > p, X_t is a k x 1 vector of variables integrated of order 1, i.e. I(1), u_1, ..., u_t are nid(0, Σ), and Π is a matrix of long-run relationships.

Construction of VECM

Linearization of time series (log transformation)

Unit root tests

Lag length choice

Estimation of the cointegrating vector

Estimation of the VECM

Verification

Orthogonalisation of residuals

Application (forecasting, impulse-response analysis); a sketch follows below
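A brief sketch with statsmodels' VECM class, assuming two cointegrated I(1) series in a DataFrame; k_ar_diff and coint_rank are illustrative choices (in practice they come from lag-length and Johansen rank tests):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.vector_ar.vecm import VECM, select_coint_rank

rng = np.random.default_rng(11)
n = 300
common = np.cumsum(rng.normal(size=n))                 # shared stochastic trend
czv = common + rng.normal(scale=0.5, size=n)
cpv = 0.8 * common + rng.normal(scale=0.5, size=n)
data = pd.DataFrame({"CZV": czv, "CPV": cpv})

# Cointegration rank via the Johansen trace test
rank = select_coint_rank(data, det_order=0, k_ar_diff=2)
print("selected cointegration rank:", rank.rank)

model = VECM(data, k_ar_diff=2, coint_rank=1, deterministic="co")
res = model.fit()
print("adjustment coefficients (alpha):\n", res.alpha)
print("cointegrating vector (beta):\n", res.beta)
print("5-step forecast:\n", res.predict(steps=5))
```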

VECM - example
Total observations: 127 Monthly Data From: 1995:01 To 2005:07
Usable observations: 121 (1995:07 to 2005:07) Degrees of Freedom: 95
Dependent variable: dCZV Dependent variable: dCPV
R²: 0,3387   SEE: 3,2482          R²: 0,4816   SEE: 0,5960
DW-test: 1,9907   SSR: 1160,5650   DW-test: 2,0148   SSR: 39,0746
Variable Coefficient (p-value) Variable Coefficient (p-value)
dCZV (1) 0,3473 (0,0004) dCPV (1) 0,3942 (0,0002)
dCZV (2) -0,1365 (0,1925) dCPV (2) 0,0401 (0,7071)
dCZV (3) -0,0159 (0,8767) dCPV (3) -0,0804 (0,9408)
dCZV (4) -0,0084(0,9311) dCPV (4) 0,0795 (0,4677)
dCZV (5) -0,1443 (0,1362) dCPV (5) 0,0382 (0,6925)
dCPV (1) 0,9239 (0,0943) dCZV (1) 0,0512 (0,0045)
dCPV (2) 0,5499 (0,3451) dCZV (2) 0,0133 (0,4862)
dCPV (3) -0,0569 (0,9231) dCZV (3) 0,0374 (0,0482)
dCPV (4) 1,0637 (0,0764) dCZV (4) 0,0123 (0,4930)
dCPV (5) 0,9094 (0,0862) dCZV (5) 0,0368 (0,0393)
EC1 (1) -0,2526 (0,0001) EC1 (1) -0,0196 (0,0796)

Impulse-response analysis
Responses to Shock in CZV
[Figure: responses of CZV and CPV, 40 periods]

Responses to Shock in CPV
[Figure: responses of CZV and CPV, 40 periods]

Decomposition of Variance for Series CZV
Step Std Error CZV CPV
1 3,09700607 100,000 0,000
2 4,81955004 98,344 1,656
3 5,77441900 93,312 6,688
4 6,34232902 87,838 12,162
5 6,93916917 80,082 19,918
6 7,67317018 69,460 30,540
7 8,41318490 61,130 38,870
8 9,06189422 56,119 43,881
9 9,61728273 52,636 47,364
10 10,08570278 49,634 50,366
11 10,48334070 46,930 53,070
12 10,83138368 44,333 55,667
13 11,14124022 41,952 58,048
14 11,40656507 40,024 59,976
15 11,62305439 38,581 61,419
16 11,79527266 37,543 62,457
17 11,92908489 36,847 63,153
18 12,03372917 36,426 63,574
Decomposition of Variance - CZV

Decomposition of Variance - CPV
Decomposition of Variance for Series CPV
Step Std Error CZV CPV
1 0,56826971 13,964 86,036
2 1,01824341 19,645 80,355
3 1,43479702 21,357 78,643
4 1,82847241 22,642 77,358
5 2,21484081 22,522 77,478
6 2,62257483 22,585 77,415
7 3,02933492 21,697 78,303
8 3,420418410 20,127 79,873
9 3,786881080 18,354 81,646
10 4,131634022 16,694 83,306
11 4,455210903 15,189 84,811
12 4,752832383 13,871 86,129
13 5,021949265 12,721 87,279
14 5,264221149 11,713 88,287
15 5,482628144 10,383 89,162
16 5,679481259 10,102 89,898
17 5,856661000 9,506 90,494
18 6,015597754 9,045 90,955

Forecast
Period | Forecast dCZV | Forecast dCPV | Forecast CZV | Forecast CPV | Forecast error CZV | Forecast error CPV
VIII.05 -0,1191 -0,5608 88,881 97,439 2,619 0,561
IX.05 -0,3177 -0,3195 91,182 97,680 2,118 0,020
X.05 -1,3773 -0,1638 91,923 97,536 1,777 -0,136
XI.05 -1,9033 -0,0610 91,797 97,339 2,203 -0,139
XII.05 -0,9422 -0,3420 93,058 96,858 1,742 -0,358
