Indexing terms: Load forecasting, Load trends, Decomposition technique, Electric load forecasting, Power system planning
Introduction
Model construction
2.1 Identification
When modelling a given data set, a large number of
models are often available for consideration. The purpose of the identification step is to ascertain the subset
of models that appear to hold more promise for adequately modelling the data. A visual inspection of a
graph of given observations against time can often
reveal both obvious and less apparent characteristics of
the data. Identification information that may be
gleaned from a perusal of a graph includes the following:
Autocorrelation: The autocorrelation among successive
values of data is a key tool in identifying the basic pattern of a data series and determining an appropriate model for it. In a set of completely
random data, the autocorrelation among successive
values will be close to, or equal to, zero, whereas data
values with a strong seasonal or cyclical character will
be highly correlated.
Trends: The presence of trends in a data set is a form
of nonstationarity. Trends can be classed as either
deterministic or stochastic. Deterministic trends can be
expressed as a function of time, whereas stochastic
trends can often be accounted for through differencing.
If the trends in a plot of the data do not
appear to follow the path of a deterministic function
but rather evolve in a stochastic fashion, differencing may account for them.
Seasonality: Usually it is known in advance whether or
not a data set is seasonal, and a graphical display
will simply confirm what is already obvious.
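The identification checks above can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical monthly series (a linear trend plus a 12-month seasonal swing) and an illustrative helper named autocorr; neither comes from the paper. It shows how first differencing removes the trend while the seasonal autocorrelation stays visible.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of x at a positive lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float(np.sum(d[lag:] * d[:-lag]) / np.sum(d * d))

# Hypothetical monthly series: linear trend plus a 12-month seasonal pattern.
t = np.arange(120)
load = 1000.0 + 5.0 * t + 50.0 * np.sin(2 * np.pi * t / 12)

r12_raw = autocorr(load, 12)      # high, but dominated by the trend
diffed = np.diff(load)            # first difference removes the linear trend
r12_diff = autocorr(diffed, 12)   # strongly positive: values one season apart move together
r6_diff = autocorr(diffed, 6)     # strongly negative: half a season out of phase
```

A large positive value at the seasonal lag together with a large negative value at half that lag is the kind of pattern that points to seasonality in the differenced data.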
2.2 Estimation
At the estimation step, the values of the parameters of
the different proposed trend functions are obtained.
Several estimation techniques are available, among
them the methods of least squares and maximum likelihood [6, 7]. Once the parameters of all proposed trend
functions are estimated, the most appropriate trend
function has to be selected.
In order to select only the most appropriate trend
function among the different proposed ones, three statistical tests can be applied: the coefficient of determination (R²), the F-test and the t-test.
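(a) The coefficient of determination (R²) is the ratio
of the expected (explained) variation to the total variation, and it is a
measure of how well the observations fit around a
selected forecasting model. Mathematically, R² can be
defined by the following equation [8]:

$$R^2 = \frac{\text{sum of expected variations}}{\text{sum of total variations}} = \frac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2} \qquad (1)$$

where $Y_i$ is the ith observation, $\hat{Y}_i$ is the corresponding model estimate and $\bar{Y}$ is the mean of the observations.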
An R² value ranges from 0 to 1. If the expected variation is zero, implying that the selected model is the
same as the mean, then the R² value will be equal to
zero.
(b) The F-test is the ratio of two variances (expected
and unexpected). If the expected variance is equal to
the unexpected, then the ratio will be equal to 1, which
indicates that a selected forecasting model is no better
than the mean. This test indicates the overall significance of a selected model, and can be mathematically
given by [8]:

$$F = \frac{R^2/(K - 1)}{(1 - R^2)/(n - K)} \qquad (2)$$

where n is the number of observations, and K is the
number of selected model coefficients. The appropriate
decision rules concerning the F-test significance at the 95%
confidence level, for example, are roughly as follows: if
the number of observations is between 6 and 10, the F-test must exceed a value of 6 to be significant; otherwise, the F-test must be 5 or greater to be significant
[8].
(c) The t-test is a measure of the significance of each
of the selected model parameters. Since the values of
the estimated parameters are the outcome of a single
sample procedure, and can therefore differ from the
real parameters, it is useful to know the extent of such
variations so that confidence intervals can be constructed and tests of hypotheses concerning the true
values of the estimated parameters can be performed.
The t-test is based on values of the Student t-distribution, and represents all possible values that the estimated parameters can take as the result of sampling
effect [8]. Mathematically, the t-test for each parameter is given by:

$$t_i = \frac{\gamma_i}{\sigma_i} \qquad (3)$$

where $\gamma_i$ is the value of coefficient i and $\sigma_i$ is the
standard error of coefficient i.
The rules for deciding whether a coefficient is significantly different from zero at the 95% confidence level, for example, are roughly as follows: if the number of
observations is between 5 and 15, the absolute value of the t-test
must be greater than 3 to have significance; otherwise,
the t-test must have a value greater than 2 to be significant [8].
By examining all three statistical test results for a
given data set, the trend function that best describes the
data can then be selected. An R² value close to
one, a high F-test value and a high t-test value indicate, respectively, a
good fit, a suitable choice of the trend option, and that
the selected trend function parameters are significantly
different from zero.
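The three tests can be computed directly from a least-squares trend fit. The sketch below is illustrative only: the function name trend_fit_stats, the synthetic data and the noise level are assumptions, and the standard errors come from the usual ordinary-least-squares covariance estimate rather than from any procedure stated in the paper.

```python
import numpy as np

def trend_fit_stats(t, y, degree):
    """Least-squares polynomial trend with R^2, F and t statistics."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.vander(t, degree + 1, increasing=True)    # columns: 1, t, t^2, ...
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    n, K = X.shape                                    # observations, coefficients
    ss_total = np.sum((y - y.mean()) ** 2)            # total variation
    ss_expected = np.sum((fitted - y.mean()) ** 2)    # expected (explained) variation
    ss_resid = np.sum((y - fitted) ** 2)
    r2 = ss_expected / ss_total
    f_stat = (r2 / (K - 1)) / ((1.0 - r2) / (n - K))
    # Standard errors from the usual OLS covariance estimate (an assumption here).
    cov = ss_resid / (n - K) * np.linalg.inv(X.T @ X)
    t_stats = coef / np.sqrt(np.diag(cov))
    return coef, r2, f_stat, t_stats

# Hypothetical monthly load with a parabolic trend and random noise.
rng = np.random.default_rng(0)
t = np.arange(1, 145)
y = 400.0 + 20.0 * t + 0.02 * t**2 + rng.normal(0.0, 30.0, t.size)
coef, r2, f_stat, t_stats = trend_fit_stats(t, y, degree=2)
```

Running the same function with degree=3 gives the corresponding statistics for a cubic trend, so candidate trend functions can be compared on the same data.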
case and the identified model has not removed all patterns, as indicated by the fact that the errors are not
random. The most informative approach to check for
whiteness is to examine the graph of estimated residuals against estimated observations, and then the graph of
the estimated residual autocorrelation function
(RACF). If the estimated residuals form a pattern
around their mean, then it can be concluded that the
residuals are not white. Furthermore, if the RACF is
significantly different from zero, then it is confirmed
that the residuals are not white, and the selected model
is not suitable for forecasting.
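The RACF at lag k is calculated as

$$r_k(\hat{a}) = \frac{\sum_{t=k+1}^{N} \hat{a}_t \hat{a}_{t-k}}{\sum_{t=1}^{N} \hat{a}_t^2} \qquad (4)$$

where $\hat{a}_t$ is the estimated residual at time t and N is the number of residuals.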
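A whiteness check along these lines can be sketched as follows. The function name racf, both residual series and the 1.96/sqrt(N) band are illustrative assumptions; the band is a common approximate 95% limit for white noise and is not quoted from the paper. The residuals are assumed to have zero mean, so no mean is subtracted.

```python
import numpy as np

def racf(residuals, max_lag):
    """Residual autocorrelations r_1 .. r_max_lag (residuals assumed zero-mean)."""
    a = np.asarray(residuals, dtype=float)
    denom = np.sum(a * a)
    return np.array([np.sum(a[k:] * a[:-k]) / denom
                     for k in range(1, max_lag + 1)])

# Hypothetical residuals with an obvious alternating pattern (not white).
patterned = np.array([1.0, -1.0] * 50)
r_pat = racf(patterned, 5)

# Hypothetical white residuals for comparison.
rng = np.random.default_rng(1)
white = rng.normal(0.0, 1.0, 500)
r_white = racf(white, 5)
band = 1.96 / np.sqrt(white.size)   # approximate 95% limit, an assumption here
```

For the patterned series the lag-1 and lag-2 values are close to -1 and +1, far outside any reasonable band, which is the signature of residuals that still contain structure.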
Decomposition models
3.1 Multiplicative model
The multiplicative decomposition model is expressed as

$$X_t = T_t \times C_t \times S_t \times E_t \qquad (5)$$

and can be summarised in the following four steps.
(i) Separate the trend-cycle components of the data by calculating a moving average whose number of terms is
equal to the length of the seasonality (e.g. monthly or
semi-annually). A moving average of this length contains no seasonal effect and little or no randomness.
The resulting moving average, $M_t$, can be written as
follows:

$$M_t = \frac{1}{n} \sum_{j=t}^{t+n-1} X_j = T_t \times C_t \qquad (6)$$

(ii) Divide the data by the moving average to isolate the seasonality and the randomness:

$$\frac{X_t}{M_t} \times 100 = S_t \times E_t \times 100 \qquad (7)$$

$$\bar{S}_t = \frac{X_t}{M_t} \times 100 \quad \text{for } t = 1, \ldots, N \qquad (8)$$

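(iii) Take the average of all $\bar{S}_t$ values representing the
same repeating period, excluding the largest and smallest $\bar{S}_t$ values, which represent an abnormal seasonality effect:

$$S_{t,\mathrm{av}} = \frac{\sum \bar{S}_t - S_{\mathrm{MAX}} - S_{\mathrm{MIN}}}{N - 2} \quad \text{for } t = 1, \ldots, n \qquad (9)$$

where n = length of the calculated moving averages,
$S_{\mathrm{MAX}}$ = maximum $\bar{S}_t$ value for time period t, and
$S_{\mathrm{MIN}}$ = minimum $\bar{S}_t$ value for time period t.
The averaged values $S_{t,\mathrm{av}}$ are then adjusted using a factor K, obtained from their sum $\sum_{t=1}^{n} S_{t,\mathrm{av}}$ (eqn. 10),
as follows:

$$S_t = K \times S_{t,\mathrm{av}} \quad \text{for } t = 1, \ldots, n \qquad (11)$$
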
(iv) Separate the trend from the cycle by finding an
appropriate trend function that approximates the data.
Then, by dividing eqn. 6 by the selected trend function
the cyclical factor can be obtained.
The ELF for a certain time period is achieved by
determining three values at the specified time: the trend
value, seasonality factor and cyclical factor. The trend
value is determined by substituting the time period
value into the trend equation. The seasonality factor is
determined from eqn. 11. The cyclical factor is estimated from the recent pattern in this factor. The randomness cannot be projected; therefore, the ELF
relationship is simply $T_t \times C_t \times S_t$.
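Steps (i) to (iii) of the multiplicative procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name seasonal_factors and the constant-level test series are hypothetical, the moving-average window runs forward from t, and the final rescaling (factors averaging 100%) is an assumed form of the adjustment, since the adjustment formula is not fully legible in the text. Each period needs at least three ratios so that dropping the extremes leaves something to average.

```python
import numpy as np

def seasonal_factors(x, n):
    """Return (moving average M, seasonal factors in %) for season length n."""
    x = np.asarray(x, dtype=float)
    # Step (i): n-term moving average; entry t averages x[t], ..., x[t+n-1].
    m = np.convolve(x, np.ones(n) / n, mode="valid")
    # Step (ii): ratio of each observation to its moving average, in per cent.
    ratios = x[: len(m)] / m * 100.0
    # Step (iii): medial average per period (drop the smallest and largest ratio).
    factors = np.empty(n)
    for p in range(n):
        vals = np.sort(ratios[p::n])
        factors[p] = vals[1:-1].mean()
    # Assumed adjustment: rescale so that the n factors average 100%.
    factors *= 100.0 * n / factors.sum()
    return m, factors

# Hypothetical data: constant 2000 MW level with a 12-month seasonal pattern.
true_s = 100.0 + 5.0 * np.sin(2 * np.pi * np.arange(12) / 12)
x = 2000.0 * true_s[np.arange(96) % 12] / 100.0
m, s = seasonal_factors(x, 12)
```

With this trend-free series the recovered factors match the pattern used to build it. With trending data, the forward window leaves some trend in the ratios, which is one reason the trend is fitted separately in step (iv).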
3.2 Additive model
The additive decomposition model is expressed as

$$X_t = T_t + C_t + S_t + E_t \qquad (12)$$

and can be summarised in the following steps.
(i) Separate the trend-cycle components of the data by
calculating a moving average whose number of terms is
equal to the seasonality length as follows:

$$M_t = \frac{1}{n} \sum_{j=t}^{t+n-1} X_j = T_t + C_t \qquad (13)$$

(iv) Eliminate the randomness from the values of $\bar{S}_t$ by
averaging all available values of $\bar{S}_t$ referring to the
Application
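Parabolic trend function:

$$T_t = A + Bt + Ct^2 \qquad (15)$$

S-curve trend function:

$$T_t = A + Bt + Ct^2 + Dt^3 \qquad (16)$$
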
where A, B, C and D are the trend coefficients. Hence,
the identification step recommends that the following
decomposition models can be applied to the given load
data:
(1) model 1, multiplicative decomposition model with a
parabolic trend function;
(2) model 2, multiplicative decomposition model with
an S-curve trend function;
(3) model 3, additive decomposition model with a parabolic trend function;
(4) model 4, additive decomposition model with an S-curve trend function.
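The four candidate models differ only in the trend function and in how the components are combined. The sketch below shows one way to express them, assuming that the S-curve is the cubic suggested by the four coefficients A to D and that multiplicative factors are quoted in per cent. All function names and the sample numbers are hypothetical.

```python
def parabolic(t, A, B, C):
    """Parabolic trend value at time t."""
    return A + B * t + C * t**2

def s_curve(t, A, B, C, D):
    """Assumed S-curve trend: a cubic in t with coefficients A to D."""
    return A + B * t + C * t**2 + D * t**3

def forecast_multiplicative(trend, cyclical_pct, seasonal_pct):
    """Models 1 and 2: trend scaled by cyclical and seasonal factors in per cent."""
    return trend * (cyclical_pct / 100.0) * (seasonal_pct / 100.0)

def forecast_additive(trend, cyclical, seasonal):
    """Models 3 and 4: trend plus cyclical and seasonal offsets."""
    return trend + cyclical + seasonal

# Hypothetical coefficients and factors, for illustration only.
trend_value = parabolic(150, A=400.0, B=20.0, C=0.02)
mult_forecast = forecast_multiplicative(trend_value, 100.0, 102.0)
add_forecast = forecast_additive(trend_value, -22.0, 4.0)
```

Factors of exactly 100% leave the multiplicative forecast equal to the trend value, which mirrors the interpretation of factors above or below 100% given for the multiplicative models.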
Following the decomposition model procedures discussed in Section 3, both seasonal and cyclical factors
can be obtained, and these are given in Table 1. For
multiplicative models, if a seasonal and cyclical factor
representing a specific time period is above 100%, it
implies that both seasonality and cyclicity are higher
for this time period than on average. Also, for additive
models, if the seasonal and cyclical factors representing a certain time period are taken as the base
point for seasonality and cyclicity, then the other time
periods are either above or below the base by the difference in value.
At the estimation step, the values of the parameters
of the different trend functions specified from the previous step are obtained. One way of estimating param
IEE Proc.-Gener. Transm. Distrib., Vol. 143, No. 1, January 1996
Table 1: Seasonal and cyclical factors

         Multiplicative                                  Additive
         Parabolic              S-curve                  Parabolic              S-curve
Period   Seasonal % Cyclical %  Seasonal % Cyclical %    Seasonal % Cyclical %  Seasonal % Cyclical %
1        99.34      99.694      99.34      100.049       -4.117     -22.491     -4.117     4.124
2        100.26     99.341      100.26     100.060       10.445     -22.417     10.445     4.200
3        99.38      99.016      99.38      100.070       -12.473    -22.090     ...        ...
4        99.59      98.722      99.59      100.079       -18.966    -21.604     ...        ...
5        99.90      98.442      99.90      100.079       -7.281     -21.147     ...        ...
6        100.27     98.178      100.27     100.077       6.518      -20.282     6.518      4.748
7        99.52      102.185     99.52      99.777        -8.088     -22.167     -8.088     0.432
8        100.51     101.752     100.51     99.872        7.828      -21.772     7.828      2.337
9        99.93      101.295     99.93      99.927        -19.322    -21.787     -19.322    3.161
10       100.53     100.855     100.53     99.967        4.231      -21.922     4.231      3.694
11       98.67      100.450     98.67      100.004       -16.984    -22.014     -16.984    4.102
12       102.02     100.065     102.02     100.032       46.789     -22.260     46.789     4.188

Trend function table (fragment)
Columns: Model type; Trend function; Mathematical representation; A
Parabolic: A + Bt + Ct^2
Surviving entries: 426.82; 18.75; 0.0198; Parabolic 11.990; S-curve 2.810

Table 3: Statistical tests

Model type       Trend function   R² %    F-test calculated   F-test tabulated
Multiplicative   Parabolic        89.35   843.24              3.0
Multiplicative   S-curve          99.69   21862.86            2.6
Additive         Parabolic        89.35   843.59              3.0
Additive         S-curve          99.62   17663.80            2.6
Fig. 3 Estimated residuals against estimated observations, MW
Fig. 9 A comparison between the actual data (1982-1986) and the
forecasting results, using models 2 and 4 (x-axis: 1982-1986 period, months; curves: actual data, Model 2, Model 4)
Conclusion
Fig. 7 Model 2 estimated loads (x-axis: 1970-1986 period, months)
References