Contents
Introduction
Section 1: Descriptive Analysis
1.1 Visual Analysis and Unit Root Tests
1.2 ARIMA Models
1.2.1 Box-Jenkins Identification
1.2.2 Model Selection
1.2.3 Forecast Evaluation and Residual Diagnostics
Section 2: Robustness Check
2.1 Pretesting
2.2 Correlations
References
Introduction
section.
Regarding the apparent nonstationarity of the
L_SAPU 2003m1 to 2014m6 in levels. The test is specified with a constant and 3 lags. As indicated by
the p-value of 0.4011, we cannot reject the hypothesis that this portion of the series is characterized by
a unit root for any traditional level of significance. This test thus corroborates the hypothesis that the
series is nonstationary.
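The decision rule behind this conclusion can be sketched in a few lines. The ADF test is left-tailed, so the unit-root null is rejected only when the test statistic is more negative than the critical value. The function name `adf_rejection_levels` is hypothetical; the numbers are the test statistic and critical values reported in the ADF output whose p-value is 0.4011.

```python
def adf_rejection_levels(t_stat, critical_values):
    """Return the significance levels at which the unit-root null is rejected.

    The ADF test is left-tailed: the null is rejected when the test
    statistic lies below (is more negative than) the critical value.
    """
    return [level for level, cv in critical_values.items() if t_stat < cv]


# Reported ADF output for L_SAPU in levels, 2003m1 to 2014m6 (constant, 3 lags).
critical_values = {"1%": -3.479656, "5%": -2.883073, "10%": -2.578331}
rejected = adf_rejection_levels(-1.755684, critical_values)

# An empty list means the unit-root null is not rejected at any listed level.
print(rejected)
```

Applying the same function to a strongly negative statistic such as -13.00195 returns all three levels, which is the pattern reported for the differenced series.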
Regarding the nature of this apparent
Figure 4: L_SAPU 2003m1 to 2014m6
| | | | | |
|---|---|---|---|---|
| t-Statistic | | -1.755684 | -9.393356 | -13.00195 |
| 1% level | -3.499167 | -3.479656 | -4.026429 | -3.479656 |
| 5% level | -2.891550 | -2.883073 | -3.442955 | -2.883073 |
| 10% level | -2.582846 | -2.578331 | -3.146165 | -2.578331 |
| Prob.* | 0.0000 | 0.4011 | 0.0000 | 0.0000 |
[...] left; skewness is most pronounced in DT(L_SAPU). The fourth central moment [...] displaying fatter tails than [...] residing in the tails of their respective [...] differencing and detrending [...]
| | 5.1 L_SAPU | 5.2 D(L_SAPU) | 5.3 DT(L_SAPU) |
|---|---|---|---|
| Mean | 4.746310 | 0.007362 | 1.03E-16 |
| Median | 4.693905 | -0.012206 | -0.008638 |
| Maximum | 5.644762 | 1.383108 | 0.612533 |
| Minimum | 3.255166 | -1.006805 | -1.142621 |
| Std. Dev. | 0.415366 | 0.345102 | 0.272905 |
| Skewness | -0.286317 | 0.357227 | -0.592941 |
| Kurtosis | 3.365061 | 4.573799 | 4.478033 |
| Jarque-Bera | 2.651773 | 17.05243 | 20.64765 |
| Probability | 0.265567 | 0.000198 | 0.000033 |
[...] observation 2005m1. [...] to affect observation [...]
| | 6.1 D(L_SAPU) | 6.2 DT(L_SAPU) |
|---|---|---|
| Mean | -1.76E-17 | 2.35E-16 |
| Median | -0.008600 | -0.015972 |
| Maximum | 0.695292 | 0.602350 |
| Minimum | -0.748037 | -0.624201 |
| Std. Dev. | 0.294703 | 0.246021 |
| Skewness | -0.010816 | -0.037858 |
| Kurtosis | 2.558378 | 2.755471 |
| Jarque-Bera | 1.091530 | 0.365859 |
| Probability | 0.579398 | 0.832827 |
[...] DT(L_SAPU), [...] observations 2005m1 and [...]
2005m1 and 2005m2 to D(L_SAPU) and a dummy variable for observation 2005m1 to DT(L_SAPU).
Table 6 shows descriptive statistics for the residuals of D(L_SAPU) regressed on a constant and these
dummies (6.1) and for DT(L_SAPU) inclusive of the 2005m1 dummy (6.2). For both series, the
Jarque-Bera statistic indicates that the hypothesis of normality cannot be rejected at any conventional
level of significance. All subsequent models presented in this section include the relevant dummy
variables and are estimated on samples which exclude 2003m2. This resolves the issue of outliers.
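The normality conclusion can be checked by hand. Under the null of normality the Jarque-Bera statistic is asymptotically chi-square with two degrees of freedom, and for that distribution the upper-tail probability has the closed form exp(-JB/2). The sketch below uses hypothetical function names and applies the formula to Jarque-Bera statistics reported for columns 5.1, 5.2, 6.1 and 6.2.

```python
import math


def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic from sample size, skewness and (raw) kurtosis."""
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


def jarque_bera_pvalue(jb):
    """Upper-tail probability of a chi-square(2) variate, which is exp(-jb / 2)."""
    return math.exp(-jb / 2.0)


# Jarque-Bera statistics as reported in the descriptive statistics tables.
reported = {"5.1": 2.651773, "5.2": 17.05243, "6.1": 1.091530, "6.2": 0.365859}
for column, jb in reported.items():
    print(column, round(jarque_bera_pvalue(jb), 6))
```

The probabilities computed this way agree with the reported ones to the precision shown in the tables, so normality is rejected for 5.2 but not for 6.1 or 6.2.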
Turning now to the issue of lag length selection, Figures 7 and 8 respectively display the ACFs and
PACFs for L_SAPU, D(L_SAPU) and DT(L_SAPU). The ACF for L_SAPU shows a markedly slow linear
decay, a feature indicative of a unit root (Enders, 2010:73); however, the ACFs for D(L_SAPU) and
DT(L_SAPU) show that either approach removes this persistence, reducing the number of significant
lags to one in both cases. Turning to Figure 8, the PACF of D(L_SAPU) indicates that our ARIMA model
may need to contain as many as three lags, while the PACF of DT(L_SAPU) indicates that one lag may
be sufficient.
Figure 7: SAPU Autocorrelation Function (series: L_SAPU, D(L_SAPU), DT(L_SAPU); significance bounds shown)
Figure 8 (series: L_SAPU, D(L_SAPU), DT(L_SAPU); significance bounds shown)
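To illustrate how a slowly decaying ACF signals a unit root and how differencing removes that persistence, the following sketch computes sample autocorrelations for a simulated random walk and for its first difference. The data are hypothetical (a seeded Gaussian random walk, not SAPU), and the helper names are assumptions made for this example; the bound 1.96/sqrt(n) is the usual approximate 95% band for white noise.

```python
import math
import random


def sample_acf(series, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def significance_bound(n, z=1.96):
    """Approximate 95% bound for the ACF of white noise: z / sqrt(n)."""
    return z / math.sqrt(n)


def first_difference(series):
    """Return the series of period-on-period changes."""
    return [b - a for a, b in zip(series, series[1:])]


# Hypothetical data: a random walk, which has a unit root by construction.
random.seed(0)
levels, value = [], 0.0
for _ in range(200):
    value += random.gauss(0.0, 1.0)
    levels.append(value)

acf_levels = sample_acf(levels, 5)
acf_diff = sample_acf(first_difference(levels), 5)
bound = significance_bound(len(levels))

print([round(r, 2) for r in acf_levels])  # stays high across lags
print([round(r, 2) for r in acf_diff])    # drops towards zero after lag 0
print(round(bound, 3))
```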
highest adjusted R-squared (of 0.511053, column 7.9) achieved by the set of ARIMA(p,1,q) models is
less than the lowest adjusted R-squared (of 0.564614, column 8.2) achieved by the set of ARIMA(p,0,q)
7.2
-0.442928***
7.3
7.4
-0.563887***
-0.300189***
-0.799862***
7.5
-0.773702***
-0.801019***
-0.179567*
0.505632
0.486799
0.052221
0.174970
0.515960
0.497521
0.031107
0.153856
0.483596
0.458769
0.114011
0.261310
7.12
7.13
-0.749900***
7.14
0.068400
-0.809714***
-0.170669*
-0.085062
-0.651971***
0.518376
0.495221
0.044286
0.191585
0.509577
0.485999
0.062389
0.209688
-0.652691***
0.503519
0.489467
0.038304
0.136503
0.440800
0.419497
0.175446
0.298196
0.503985
0.485089
0.055546
0.178295
7.7
-0.657519***
-0.450292***
-0.264077***
-0.079993
-0.811335***
0.378499
0.360910
0.262895
0.361094
7.6
AR(1)
AR(2)
AR(3)
MA(1)
MA(2)
MA(3)
R-Squared
Adjusted R-Squared
AIC
SBC
7.8
-0.689573***
-0.500107***
7.9
-0.759654***
7.10
7.11
-0.133164
-0.412994***
-0.784627***
-0.094265
-0.100864
-0.764419***
-0.738372***
-0.243409***
0.502941
0.479044
0.075830
0.223129
0.533482
0.511053
0.012419
0.159718
0.521552
0.498549
0.037670
0.184969
0.523528
0.500621
0.033530
0.180829
-0.093440
-0.046456
-0.754525***
-0.213780**
0.506587
0.482865
0.068468
0.215767
Table 8.1: L_SAPU ARIMA(p,0,q) Models (Specified with a Constant and a Linear Time Trend)
AR(1)
AR(2)
AR(3)
MA(1)
MA(2)
MA(3)
R-Squared
Adjusted R-Squared
AIC
SBC
8.1
0.281013***
8.2
8.3
0.249483**
0.124413
8.4
0.250299**
8.5
0.190037*
0.242873**
0.576597
0.564614
0.036300
0.134499
0.591906
0.572286
0.035837
0.183136
0.584635
0.568811
0.035316
0.158065
8.7
0.233942**
0.098348
0.112301
0.277499***
0.265961***
0.144951
0.591224
0.575651
0.019326
0.142075
0.586793
0.571052
0.030105
0.152854
0.593339
0.573788
0.032320
0.179619
8.13
8.14
0.092901
0.580671
0.568803
0.026632
0.124831
8.6
Table 8.2: DT(L_SAPU) ARIMA(p,0,q) Models (Specified with a Constant and a Linear Time Trend)
AR(1)
AR(2)
AR(3)
MA(1)
MA(2)
MA(3)
R-Squared
Adjusted R-Squared
AIC
SBC
8.8
8.9
0.239662**
0.125431
0.242073**
8.10
8.11
8.12
0.247522**
0.189134*
0.009099
0.262686***
0.146813
0.041488
0.275539***
0.088107
0.039288
0.587568
0.567739
0.046411
0.193710
0.585234
0.565293
0.052054
0.199353
0.587626
0.567800
0.046270
0.193569
0.591246
0.571594
0.037454
0.184753
0.106804
0.261801***
0.152479
0.125459
0.592057
0.572444
0.035468
0.182767
0.592257
0.572653
0.034978
0.182277
0.182591*
0.089720
0.256385**
0.076966
0.006766
0.594761
0.575278
0.028817
0.176116
models. This finding corroborates the first element of our hypothesis that models including a linear trend
will fit well in-sample. Note, however, that adjusted R-squared is not an appropriate criterion for
evaluating the relative goodness of fit of non-nested models (Wooldridge, *****); as mentioned above,
for this purpose we must refer to each model's AIC and SBC scores. Moreover, the AIC and SBC cannot
be used to rank models between these groups, as they do not allow for comparison across models with
different transformations of the dependent variable (Burnham & Anderson, 2002:80). Rather, the AIC
and the SBC are used here to select the best models from within each group.
For ease of evaluation, the four lowest (and thus best) AIC and SBC scores among each of the two sets
of models have been colour-coded: light blue is lowest, dark blue is second lowest, light gold is third
lowest, dark gold is fourth lowest. Of the ARIMA(p,1,q) models, the ARIMA(0,1,2) model in column 7.6
scores lowest in both AIC and SBC and is thus the strongest contender of this group; the ARIMA(1,1,2)
model (7.9) is also a strong contender, with the second lowest AIC score and the third lowest SBC
score. For the third ARIMA(p,1,q) contender, the ARIMA(0,1,1) model (7.2) is selected given that it
attains the second lowest SBC score; this statistic is unbiased in small samples (****), and thus 7.2 is
selected not only for the level of its SBC score but also for the reliability of this score vis-à-vis models
that achieved low AIC scores.
Among the ARIMA(p,0,q) models, the ARIMA(1,0,0) model (8.1) achieves the lowest SBC score and
the second lowest AIC score, and is thus selected. The ARIMA(0,0,1) model (8.2) performs poorly
relative to many of the other ARIMA(p,0,q) models with respect to its AIC score, but is selected on the
basis of its SBC score (which is second lowest among this group of models). Finally, the ARIMA(1,0,1)
model (8.5), which achieved the lowest AIC score and the third lowest SBC score, is selected.
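The within-group ranking described above can be reproduced mechanically. The sketch below ranks the three selected ARIMA(p,0,q) models by each criterion (lower is better). The (AIC, SBC) pairs are values from Table 8, matched to columns 8.1, 8.2 and 8.5 on the basis of the rankings and adjusted R-squared figures stated in the text; that matching is an assumption of this example, and the helper `rank_by` is hypothetical.

```python
# (AIC, SBC) pairs for the three selected ARIMA(p,0,q) models.
# Column assignment is inferred from the rankings described in the text.
scores = {
    "8.1 ARIMA(1,0,0)": (0.026632, 0.124831),
    "8.2 ARIMA(0,0,1)": (0.036300, 0.134499),
    "8.5 ARIMA(1,0,1)": (0.019326, 0.142075),
}


def rank_by(model_scores, index):
    """Order model names from lowest (best) to highest score.

    index 0 ranks by AIC, index 1 ranks by SBC.
    """
    return sorted(model_scores, key=lambda name: model_scores[name][index])


aic_ranking = rank_by(scores, 0)
sbc_ranking = rank_by(scores, 1)

print("AIC:", aic_ranking)
print("SBC:", sbc_ranking)
```

The two criteria disagree here because the SBC penalises the extra parameter of the ARIMA(1,0,1) model more heavily than the AIC does, which is why all three models are carried forward rather than a single winner.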
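$$\mathrm{RMSE} = \sqrt{\sum_{t=T+1}^{T+h} (\hat{y}_t - y_t)^2 / h} \qquad [1]$$

$$\mathrm{MAE} = \sum_{t=T+1}^{T+h} |\hat{y}_t - y_t| / h \qquad [2]$$

$$\mathrm{MAPE} = 100 \sum_{t=T+1}^{T+h} \left| \frac{\hat{y}_t - y_t}{y_t} \right| / h \qquad [3]$$

where the $\hat{y}_t$ are the forecasted values of the series, the $y_t$ are the actual values of the series, $h$ is the
number of observations that comprise the forecast period and $T$ is the final period of the sample used
for estimation. It can be deduced from the equations above that the RMSE and the MAE are invariant
to additive transformations of $\{\hat{y}_t\}$ and $\{y_t\}$ (the series of $\hat{y}_t$ and $y_t$ respectively) and are sensitive to
multiplicative transformations thereof;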
| | 7.2 | 7.6 | 7.9 | 8.1 | 8.2 | 8.5 |
|---|---|---|---|---|---|---|
| RMSE | 0.233025 | 0.233939 | 0.232661 | 0.227842 | 0.222692 | 0.241225 |
| MAE | 0.174137 | 0.177961 | 0.175856 | 0.176699 | 0.170752 | 0.188668 |
| MAPE | 81.88688 | 114.4257 | 119.8167 | 3.423309 | 3.303542 | 3.657336 |
| | 0.490039 | 0.478561 | 0.497255 | 0.021892 | 0.02141 | 0.023171 |
| Bias Proportion | 0.038518 | 0.045005 | 0.006261 | 0.013418 | 0.021025 | 0.009302 |
| Variance Proportion | 0.179617 | 0.135159 | 0.191533 | 0.467837 | 0.513652 | 0.379572 |
| Covariance Proportion | 0.781866 | 0.819836 | 0.802206 | 0.518745 | 0.465323 | 0.611126 |
| | 0.819317 | 0.822532 | 0.818037 | 0.043657 | 0.042670 | 0.046221 |
| | 7.2 | 7.6 | 7.9 | 8.1 | 8.2 | 8.5 |
|---|---|---|---|---|---|---|
| Jarque-Bera Statistic | 1.131019 | 0.703254 | 0.676746 | 0.980092 | 0.926902 | 0.667017 |
| Prob. | 0.568071 | 0.703542 | 0.712929 | 0.612598 | 0.629109 | 0.716406 |
| | 0.471431 | 1.804706 | 0.406063 | 1.26157 | 2.001406 | 0.996703 |
| | 0.7566 | 0.1337 | 0.8039 | 0.29 | 0.0999 | 0.413 |
| | 1.956233 | 6.898901 | 1.59509 | 5.185523 | 8.004894 | 4.176669 |
| | 0.7438 | 0.1413 | 0.8097 | 0.2688 | 0.0914 | 0.3826 |
| | 0.105423 | 0.067843 | 0.157677 | 0.087633 | 0.055419 | 0.204926 |
| | 0.746 | 0.795 | 0.6921 | 0.7678 | 0.8143 | 0.6517 |
| | 0.107288 | 0.069067 | 0.160388 | 0.089198 | 0.056425 | 0.208357 |
| | 0.7433 | 0.7927 | 0.6888 | 0.7652 | 0.8122 | 0.6481 |
the opposite is true of the MAPE. However, none of these three statistics is comparable across
difference transformations of the dependent variable, as the series $\{y_t\}$ and $\{\Delta y_t\}$ do not generally
contain the same forecastable information (*****).
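The invariance properties claimed for the three statistics can be verified numerically. The sketch below implements equations [1] to [3] directly and applies an additive shift and a multiplicative rescaling to a small pair of series. The data and the helper names `shift` and `scale` are hypothetical and chosen only for illustration.

```python
import math


def rmse(forecast, actual):
    """Root mean squared error over the forecast period (equation [1])."""
    h = len(actual)
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / h)


def mae(forecast, actual):
    """Mean absolute error over the forecast period (equation [2])."""
    h = len(actual)
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / h


def mape(forecast, actual):
    """Mean absolute percentage error over the forecast period (equation [3])."""
    h = len(actual)
    return 100.0 * sum(abs((f - a) / a) for f, a in zip(forecast, actual)) / h


def shift(series, c):
    """Additive transformation: add c to every observation."""
    return [x + c for x in series]


def scale(series, c):
    """Multiplicative transformation: multiply every observation by c."""
    return [x * c for x in series]


# Hypothetical actual and forecast values.
actual = [4.0, 5.0, 4.5, 5.5]
forecast = [4.2, 4.7, 4.6, 5.9]

# Adding a constant leaves RMSE and MAE unchanged but alters MAPE.
print(rmse(forecast, actual), rmse(shift(forecast, 10), shift(actual, 10)))
print(mape(forecast, actual), mape(shift(forecast, 10), shift(actual, 10)))

# Rescaling changes RMSE and MAE proportionally but leaves MAPE unchanged.
print(rmse(scale(forecast, 2), scale(actual, 2)))
print(mape(scale(forecast, 2), scale(actual, 2)))
```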
2.1 Pretesting
Table 12 shows the ADF critical values and test statistics for a variety of transformations of MSPC under
a variety of specifications, broken into sample periods that correspond with the data availability of
SAPU. As indicated in 12.1 and 12.2 respectively, ADF tests conducted over the full sample, specified
with a constant or a constant and a deterministic trend, produce test statistics that are too high to reject
the hypothesis of a unit root; high p-values of 0.5963 and 0.8504 respectively make this failure to reject
uncontentious. Furthermore, under 12.3 it can be seen that for the first difference of MSPC (D_MSPC)
the hypothesis of a unit root can be rejected at the one percent level of significance; hence we can
conclude from these full sample tests that MSPC is difference stationary (i.e. is I(1)). With reference to
12.10, 12.11 and 12.12, the same conclusion may be drawn with regard to the log of MSPC (L_MSPC)
and the first difference of the log of MSPC (DL_MSPC).
When MSPC is examined in sections corresponding to the data availability of SAPU, tests for the level of
integration of the series produce results similar to those reported in Section 1 for SAPU in the
corresponding sample periods: for the period 1994Q1 to 2001Q4, MSPC and L_MSPC test as I(0);
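| | 12.1 | 12.2 | 12.3 | 12.4 | 12.5 | 12.6 | 12.7 | 12.8 | 12.9 |
|---|---|---|---|---|---|---|---|---|---|
| Sample | Full sample | Full sample | Full sample | 1994Q1 to 2001Q4 | 1994Q1 to 2001Q4 | 1994Q1 to 2001Q4 | 2003Q1 to 2014Q2 | 2003Q1 to 2014Q2 | 2003Q1 to 2014Q2 |
| Dependent Variable | MSPC | MSPC | D(MSPC) | MSPC | MSPC | D(MSPC) | MSPC | MSPC | D(MSPC) |
| t-Statistic | -1.363019 | -1.411030 | -13.40636 | -5.061119 | -5.382322 | -9.542237 | -0.944454 | -3.641963 | -9.035099 |
| ADF test critical value, 1% level | -3.514426 | -4.076860 | -3.514426 | -3.661661 | -4.284580 | -3.670170 | -3.581152 | -4.170583 | -3.581152 |
| ADF test critical value, 5% level | -2.898145 | -3.466966 | -2.898145 | -2.960411 | -3.562882 | -2.963972 | -2.926622 | -3.510740 | -2.926622 |
| ADF test critical value, 10% level | -2.586351 | -3.160198 | -2.586351 | -2.619160 | -3.215267 | -2.621007 | -2.601424 | -3.185512 | -2.601424 |
| Prob.* | 0.5963 | 0.8504 | 0.0001 | 0.0003 | 0.0007 | 0.0000 | 0.7649 | 0.0371 | 0.0000 |
| Exogenous | Constant | Constant, Trend | Constant | Constant | Constant, Trend | Constant | Constant | Constant, Trend | Constant |

| | 12.10 | 12.11 | 12.12 | 12.13 | 12.14 | 12.15 | 12.16 | 12.17 | 12.18 |
|---|---|---|---|---|---|---|---|---|---|
| Sample | Full sample | Full sample | Full sample | 1994Q1 to 2001Q4 | 1994Q1 to 2001Q4 | 1994Q1 to 2001Q4 | 2003Q1 to 2014Q2 | 2003Q1 to 2014Q2 | 2003Q1 to 2014Q2 |
| Dependent Variable | L_MSPC | L_MSPC | D(L_MSPC) | L_MSPC | L_MSPC | D(L_MSPC) | L_MSPC | L_MSPC | D(L_MSPC) |
| t-Statistic | -2.313132 | -2.312348 | -12.56348 | -4.813128 | -5.105216 | -9.568363 | -0.992759 | -3.538037 | -8.323042 |
| ADF test critical value, 1% level | -3.513344 | -4.075340 | -3.514426 | -3.661661 | -4.284580 | -3.670170 | -3.581152 | -4.170583 | -3.581152 |
| ADF test critical value, 5% level | -2.897678 | -3.466248 | -2.898145 | -2.960411 | -3.562882 | -2.963972 | -2.926622 | -3.510740 | -2.926622 |
| ADF test critical value, 10% level | -2.586103 | -3.159780 | -2.586351 | -2.619160 | -3.215267 | -2.621007 | -2.601424 | -3.185512 | -2.601424 |
| Prob.* | 0.1704 | 0.4224 | 0.0001 | 0.0005 | 0.0013 | 0.0000 | 0.7482 | 0.0470 | 0.0000 |
| Exogenous | Constant | Constant, Trend | Constant | Constant | Constant, Trend | Constant | Constant | Constant, Trend | Constant |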
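| | 13.1 | 13.2 | 13.3 | 13.4 | 13.5 | 13.6 |
|---|---|---|---|---|---|---|
| Sample | 1994Q1 to 2001Q4 | 1994Q1 to 2001Q4 | 1994Q1 to 2001Q4 | 2003Q1 to 2014Q2 | 2003Q1 to 2014Q2 | 2003Q1 to 2014Q2 |
| Dependent Variable | SAPU | SAPU | D(SAPU) | SAPU | SAPU | D(SAPU) |
| t-Statistic | -2.592892 | -3.791835 | -5.627037 | -2.424996 | -3.959771 | -8.359628 |
| ADF test critical value, 1% level | -3.689194 | -4.323979 | -3.711457 | -3.581152 | -4.170583 | -3.581152 |
| ADF test critical value, 5% level | -2.971853 | -3.580623 | -2.981038 | -2.926622 | -3.510740 | -2.926622 |
| ADF test critical value, 10% level | -2.625121 | -3.225334 | -2.629906 | -2.601424 | -3.185512 | -2.601424 |
| Prob.* | 0.1063 | 0.0323 | 0.0001 | 0.1407 | 0.0172 | 0.0000 |
| Exogenous | Constant | Constant, Trend | Constant | Constant | Constant, Trend | Constant |

| | 13.7 | 13.8 | 13.9 | 13.10 | 13.11 | 13.12 |
|---|---|---|---|---|---|---|
| Sample | 1994Q1 to 2001Q4 | 1994Q1 to 2001Q4 | 1994Q1 to 2001Q4 | 2003Q1 to 2014Q2 | 2003Q1 to 2014Q2 | 2003Q1 to 2014Q2 |
| Dependent Variable | L_SAPU | L_SAPU | D(L_SAPU) | L_SAPU | L_SAPU | D(L_SAPU) |
| t-Statistic | -2.432925 | -5.319496 | -4.917714 | -1.313868 | -3.931164 | -9.590602 |
| ADF test critical value, 1% level | -3.661661 | -4.394309 | -3.752946 | -3.588509 | -4.175640 | -3.588509 |
| ADF test critical value, 5% level | -2.960411 | -3.612199 | -2.998064 | -2.929734 | -3.513075 | -2.929734 |
| ADF test critical value, 10% level | -2.619160 | -3.243079 | -2.638752 | -2.603064 | -3.186854 | -2.603064 |
| Prob.* | 0.1414 | 0.0013 | 0.0007 | 0.6149 | 0.0186 | 0.0000 |
| Exogenous | Constant | Constant, Trend | Constant | Constant | Constant, Trend | Constant |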
for the period 2003Q1 to 2014Q2, MSPC and L_MSPC test as trend stationary at the five percent level
of significance.
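The classification logic applied to each sample period can be sketched as a simple sequential rule on the three ADF p-values (levels with a constant, levels with a constant and trend, first difference). This is a simplification of the reasoning in the text rather than a formal testing procedure, and the function name and the 5% threshold are assumptions of the example. The p-values are those reported for MSPC in 12.1 to 12.9.

```python
def classify_integration(p_level_const, p_level_trend, p_diff, alpha=0.05):
    """Classify a series from three ADF p-values using a sequential rule.

    1. Reject a unit root in levels with a constant only -> I(0).
    2. Otherwise reject with a constant and trend        -> trend stationary.
    3. Otherwise reject for the first difference         -> I(1).
    """
    if p_level_const < alpha:
        return "I(0)"
    if p_level_trend < alpha:
        return "trend stationary"
    if p_diff < alpha:
        return "I(1)"
    return "undetermined"


# Reported p-values for MSPC by sample period (columns 12.1 to 12.9).
mspc_pvalues = {
    "Full sample": (0.5963, 0.8504, 0.0001),
    "1994Q1 to 2001Q4": (0.0003, 0.0007, 0.0000),
    "2003Q1 to 2014Q2": (0.7649, 0.0371, 0.0000),
}

results = {
    period: classify_integration(*pvalues)
    for period, pvalues in mspc_pvalues.items()
}
for period, label in results.items():
    print(period, "->", label)
```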
Table 13 shows the results of a similarly implemented battery of ADF tests for a variety of
transformations of SAPU; as in Section 1, full sample tests are omitted here due to the discontinuity in
the data from 2002Q1 to 2002Q4. For the period 2003Q1 to 2014Q2 the results in Table 13 are similar
to those presented in Section 1: 13.4 and 13.10 indicate that we cannot reject the hypothesis that SAPU
and L_SAPU are nonstationary, while the rejection of the hypothesis in 13.5 and 13.11 suggests that the
data is trend stationary. For 1994Q1 to 2001Q4, the results obtained for SAPU differ importantly from
those obtained for the series' monthly counterpart: for this period and at this data frequency, the
hypothesis that the series is nonstationary cannot be rejected at the ten percent level for SAPU and
L_SAPU; the rejection of the null hypothesis obtained in 13.2 and 13.8 thus suggests that this portion of
the series is trend stationary.
2.2 Correlations
Though it is not robust to sources of endogeneity, the correlation structure between variables can
provide valuable insight regarding the relationship between them. Table 14 displays the correlation
coefficients for SAPU and MSPC in levels, first differences, log levels and for the first differences of
the log levels of these series. Columns correspond to lags, leads or the contemporaneous value of SAPU
or its transformation (the relevant transformation of SAPU and MSPC for each row is indicated in the
left-hand column); a positive (negative) number indicates that the reported correlation is between MSPC
and a lead (lag) of SAPU (or transformations thereof). For convenience, positive correlations are
highlighted in yellow and negative correlations are highlighted in blue.
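The lead and lag convention used in the table can be made concrete with a short sketch. For an offset k, the function below correlates the base series at time t with the other series at time t + k, so k > 0 is a lead and k < 0 is a lag of the other series. The data are hypothetical and constructed so that the base series simply reproduces the other series one period later; the function names are assumptions of the example.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def lead_lag_correlation(base, other, k):
    """Correlation between base[t] and other[t + k].

    k > 0 pairs base with a lead of other; k < 0 pairs it with a lag.
    """
    n = len(base)
    if k >= 0:
        return correlation(base[: n - k], other[k:])
    return correlation(base[-k:], other[: n + k])


# Hypothetical data: base[t] equals other[t - 1], so other leads base by one period.
other = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
base = [0.0] + other[:-1]

table_row = {k: lead_lag_correlation(base, other, k) for k in (2, 1, 0, -1, -2)}
for k, value in table_row.items():
    print(f"{k:+d}: {value:.4f}")
```

In this constructed case the correlation peaks at the first lag (k = -1), which is the pattern one would expect if the other series drives the base series with a one-period delay.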
As can be seen in Table 14 below, correlation between MSPC and SAPU for the period 1994Q1 to
2001Q4 (14.1) is positive for contemporaneous values of SAPU, as well as for its first and second lag.
Leads of SAPU are found to be negatively correlated with MSPC, but these correlations are of a
negligible magnitude. This correlation structure is consistent with the above-stated hypotheses that
SAPU and MSPC should be positively correlated and that SAPU should be seen to drive the variation
in MSPC. However, this result is somewhat reversed for L_MSPC and L_SAPU: here, leads of L_SAPU
are found to be positively correlated with L_MSPC, and lags of L_SAPU are negligibly negatively
correlated with L_MSPC, suggesting that it is L_MSPC that drives L_SAPU. Furthermore, for D(MSPC)
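Table 14: Correlations for Leads and Lags of SAPU

1994Q1 to 2001Q4

| | +2 | +1 | 0 | -1 | -2 |
|---|---|---|---|---|---|
| Levels (14.1) | -0.0073 | -0.0141 | 0.1821 | 0.2347 | 0.2393 |
| First differences | 0.1435 | -0.1465 | 0.1907 | 0.0236 | 0.0526 |
| Log levels | 0.3056 | 0.2607 | 0.2506 | -0.0400 | -0.0789 |
| First differences of log levels | -0.0014 | -0.1424 | 0.2917 | -0.0257 | 0.1232 |

2003Q1 to 2014Q2

| | +2 | +1 | 0 | -1 | -2 |
|---|---|---|---|---|---|
| Levels | 0.7687 | 0.8255 | 0.8853 | 0.9050 | 0.8844 |
| First differences | -0.1361 | -0.0691 | 0.2428 | 0.1166 | 0.0842 |
| Log levels | 0.7987 | 0.8424 | 0.8771 | 0.8817 | 0.8661 |
| First differences of log levels | 0.0527 | 0.0230 | 0.2459 | -0.0380 | -0.1001 |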
[...] source.
[Figures: L_SAPU and L_MSPC, plotted for 1994 to 2014, 1994 to 2001 and 2003 to 2014]
[...] methodology, Baker, Bloom and Davis [...]
References
Bonate, P. 2006. Pharmacokinetic-Pharmacodynamic Modeling and Simulation. New York: Springer Science & Business Media, Inc.
Burnham, K.P. & Anderson, D.R. 2002. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. New York: Springer Inc.