Abstract—In this paper, we present a new approach for very short term electricity load demand forecasting. In particular, we apply support vector regression to predict the load demand every 5 minutes based on historical data from the Australian electricity operator NEMMCO for 2006-2008. The results show that support vector regression is a very promising approach, outperforming backpropagation neural networks, which are the most popular prediction models used by both industry forecasters and researchers. However, it is interesting to note that support vector regression gives similar results to the simpler linear regression and least mean squares models. We also discuss the performance of four different feature sets with these prediction models and the application of a correlation-based sub-set feature selection method.

I. INTRODUCTION

[...] are met in the medium term;
3) Short-term (STLF) - a day ahead; used to assist planning and market participants;
4) Very short-term (VSTLF) - hours and minutes ahead; used to assist trading and eventually dispatch.

In this paper we focus on VSTLF. In Australia, VSTLF plays an important role in the operation of the national electricity market, as its market operator NEMMCO must issue the production schedule of the generators every 5 minutes [3-4]. The 5-minute regional demand forecasting relates the demand at the beginning of a dispatch interval to the target at the end [4]. Based on this load forecasting, generators and network operators are required to notify NEMMCO of their maximum supply capacity and availability, and this information is matched against regional demand forecasts. This then enables the remainder of market [...]
Fig. 1. Daily electricity load at 5-minute intervals, for 1 week (from Monday to Sunday)

Fig. 2 plots the load differences at 5-minute intervals for 2 consecutive days (Monday and Tuesday). It confirms that there are daily patterns in the electricity demand. These difference values can be used instead of the actual values or in addition to them.

[Fig. 2. Load Difference [%] versus Time]

TABLE 2. PREDICTION ALGORITHMS USED FOR COMPARISON
Prediction algorithm    Description and parameters
Linear regression (LR)    Approximates the relationship between the outcome and independent variables with a straight line. The least squares method is used to find the line which best fits the data. We used the M5's method for variable selection: starting with all variables, at each step the weakest variable is removed based on its standardized coefficient until no improvement is observed in the error estimate.
Least mean squares (LMS)    A modification of LR using least squared regression functions generated from random subsamples of the data. The final prediction model is the least squared regression with the lowest median squared error. We used sample size=4.
Back-propagation [...]    A classical neural network trained with the back-propagation [...] a maximum number of epochs was reached (500).
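The LMS entry in Table 2 describes fitting least squares regressions on random subsamples and keeping the one with the lowest median squared error. The following is a minimal Python sketch of that idea, assuming a single predictor (the load at time t) and synthetic data. The function names, the `n_trials` parameter and the toy series are illustrative assumptions, not the implementation used in this paper.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x (one predictor)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        return mean_y, 0.0  # degenerate subsample: fall back to a constant model
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def median_squared_error(model, xs, ys):
    """Median of the squared residuals of a fitted line over all data points."""
    intercept, slope = model
    return statistics.median(
        (y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)
    )


def fit_least_median_squares(xs, ys, sample_size=4, n_trials=200, seed=0):
    """Fit lines on random subsamples; keep the one with the lowest median squared error."""
    rng = random.Random(seed)
    best_model, best_error = None, float("inf")
    for _ in range(n_trials):
        chosen = rng.sample(range(len(xs)), sample_size)
        model = fit_line([xs[i] for i in chosen], [ys[i] for i in chosen])
        error = median_squared_error(model, xs, ys)
        if error < best_error:
            best_model, best_error = model, error
    return best_model


# Synthetic stand-in data (assumption): load at time t and load 5 minutes later, in MW.
load_now = [8000.0 + 10.0 * i for i in range(40)]
load_next = [100.0 + 0.99 * x for x in load_now]
for i in range(36, 40):
    load_next[i] += 2000.0  # a few gross outliers, e.g. bad meter readings

ols_intercept, ols_slope = fit_line(load_now, load_next)
lms_intercept, lms_slope = fit_least_median_squares(load_now, load_next)
print(f"OLS slope: {ols_slope:.4f}, LMS slope: {lms_slope:.4f}")
```

On this toy series the subsample-based fit recovers the slope of the uncontaminated points, while the plain least squares fit is pulled towards the outliers. On clean data the two fits would be expected to agree closely.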
IV. MODELLING

[...] 2006, 2007 and 2008. To predict the load for August 2008, we used the data from August 2006 and 2007 as training [...]

[...] and SVR, MAPE=0.31%; on feature set 4: LR, LMS and [...]

[Figure: Electricity Load [MW] versus Time of day, 0:00 to 23:45]

In terms of the KPI, the 20% target is met by all models for feature sets 1 and 4, by LMS for feature set 2 and by all models except BPNN for feature set 3. Even though BPNN with feature sets 1 and 4 meets the target, its performance is 14-16% lower than the other algorithms.

[...] and LMS produces the best results and significantly outperforms BPNN. The KPI accuracy results of the three models are 25-26% higher than the KPI target.

Fig. 5. Scatter plots of predicted load (Xt+1) and historical load data Xt, Xt-1, XXt+1 and XXt
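The scatter plots in Fig. 5 relate each historical load feature to Xt+1. One simple way to quantify how linear such a relationship is would be the Pearson correlation coefficient. The sketch below computes it for Xt versus Xt+1 on a synthetic daily load curve; the sinusoidal series and the helper name `pearson_correlation` are assumptions made for illustration and do not reproduce the NEMMCO data or the analysis in this paper.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Synthetic stand-in (assumption): one day of 5-minute load readings in MW, 288 points.
load = [9000.0 + 1500.0 * math.sin(2.0 * math.pi * i / 288) for i in range(288)]

x_t = load[:-1]     # load at time t
x_next = load[1:]   # load 5 minutes later, i.e. Xt+1

r = pearson_correlation(x_t, x_next)
print(f"correlation between Xt and Xt+1: {r:.4f}")
```

A coefficient close to 1 indicates that a straight line explains most of the variation, which is the situation in which linear models can be expected to perform well.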
The good performance of LR and LMS indicates that there is a linear relationship between the independent variables (features used for prediction) and the dependent variable (the load that we are predicting). Fig. 5 shows scatter plots of Xt, Xt-1, XXt+1 and XXt versus the dependent variable Xt+1, confirming the strong linear relationships.

We also investigated whether the number of features in the best performing feature set (set 1) can be further reduced without deteriorating the prediction accuracy. We applied a simple, fast and efficient method for feature sub-set selection, called correlation-based feature selection (CFS) [18]. It searches for the “best” sub-set of features, where “best” is defined by a heuristic which takes into consideration 2 criteria: 1) how good the individual features are at predicting the outcome and 2) how much the individual features correlate with each other. Good feature subsets contain features that are highly correlated with the outcome and uncorrelated with each other.

To evaluate CFS, we used the data for 2006 and 2007. Table 6 shows the features selected by CFS. As can be seen, 1 or 2 features were selected for each month, which is a drastic reduction from the original 11 features. The most important features for predicting the load Xt+1 are Xt, XXt+1, XXt and XXt-1, with Xt (the load 5 minutes before) always being selected. There also seems to be a seasonal pattern that requires further investigation. Table 7 summarizes the performance of the prediction algorithms using the selected features and 10-fold cross validation as an evaluation procedure. A comparison with Table 4 shows that the accuracy results are worse: there is an increase in MAE of 0.14% for LR and LMS, 0.27% for BPNN and 0.12% for SVR. However, the time to build the prediction models is reduced, especially for BPNN and SVR.

TABLE 6. SELECTED VARIABLES BY CFS APPLIED TO FEATURE SET 1
Month        Xt    XXt+1    XXt    XXt-1
February
March
April
May
June
July
August
September
October
November
December

TABLE 7. 10-FOLD CROSS VALIDATION PERFORMANCE USING THE FEATURES SELECTED BY CFS
                  LR      LMS     BPNN    SVR
Time [seconds]    0.59    74.5    17.6    0.97
MAE [%]           0.55    0.55    0.78    0.52
RAE [%]           5.79    5.77    8.26    5.51

VI. CONCLUSION

We presented a new approach for 5-minute electricity load demand forecasting using SVR. The results showed that SVR is a very promising prediction model, outperforming the BPNN prediction algorithm, which is widely used by both industry forecasters and researchers. However, we found that simpler models such as LR and LMS produced similar accuracy results and were faster to train than SVR. We investigated the performance of four feature sets based on historical load data and information about the time and day of the week. The best performing feature set uses only historical load data, namely the load 25-30 minutes before the forecast time on the forecast day and the load on the same weekday the previous week. Correlation-based feature sub-set selection applied to this feature set was able to further reduce the number of features, and thus decrease the time to build the model, but it also reduced the prediction accuracy.

ACKNOWLEDGMENT

We would like to thank NEMMCO for providing the data and all the information needed to conduct the research.

REFERENCES

[1] G. Gross and F. D. Galiana, “Short term load forecasting,” IEEE Proceedings, vol. 75, no. 12, pp. 1558-1573, Dec. 1987.
[2] D. W. Bunn, “Short term forecasting: a review of procedures in the electricity supply industry,” Journal of the Operational Research Society, vol. 33, pp. 533-545, 1982.
[3] H. Outhred, “The competitive market for electricity in Australia: why it works so well,” in Proceedings of the 33rd Hawaii International Conference on System Sciences, Maui, 2000, pp. 1-8.
[4] “An introduction to Australia's national electricity market,” National Electricity Market Management Company Limited, 2005.
[5] W. Charytoniuk and M. S. Chen, “Very short term load forecasting using artificial neural network,” IEEE Trans. on Power Systems, vol. 15, no. 1, pp. 263-268, Feb. 2000.
[6] K. Liu, S. Subbarayan, R. R. Shoults, M. T. Manry, C. Kwan, F. L. Lewis and J. Naccarino, “Comparison of very short-term load forecasting techniques,” IEEE Trans. on Power Systems, vol. 11, no. 2, pp. 877-882, May 1996.
[7] H. Daneshi and A. Daneshi, “Real time load forecast in power system,” in Proceedings of the 3rd International Conference DRPT 2008, Nanjing, China, 2008, pp. 689-695.
[8] P. K. Dash, H. P. Satpathy, and A. C. Liew, “A real-time short-term peak and average load forecasting system using a self organising fuzzy neural network,” Engineering Applications of Artificial Intelligence, vol. 11, pp. 307-316, 1998.
[9] P. Shamsollahi, K. W. Cheung, Q. Chen, and E. H. Germain, “A neural network based very short term load forecaster for the interim ISO New England electricity market system,” Power Industry Computer Applications, pp. 217-222, May 2001.
[10] H. Y. Yang, H. Ye, G. Wang, J. Khan, and T. Hu, “Fuzzy neural very short term load forecasting based on chaotic dynamic reconstruction,” Chaos, Solitons and Fractals, vol. 29, pp. 462-469, Aug. 2006.
[11] Q. Chen, J. Milligan, E. H. Germain, R. Raub, P. Shamsollahi and K. W. Cheung, “Implementation and performance analysis of very short term load forecaster - based on the electronic dispatch project in ISO New England,” in Power Engineering, 2001.
[12] D. Chen and M. York, “Neural network based approaches to very short term load prediction,” in IEEE Power and Energy Society General Meeting - Conversion and Delivery of Electrical Energy in the 21st Century, Pittsburgh, 2008, pp. 1-8.
[13] L. F. Sugianto and X.-B. Lu, “Demand forecasting in the deregulated market: a bibliography survey,” Monash University, 2002.
[14] J. Yang, “Power system short-term load forecasting,” PhD Thesis, Technical University Darmstadt, Darmstadt, 2006.
[15] B. J. Chen, M. W. Chang, and C. J. Lin, “Load forecasting using support vector machine: a study on EUNITE competition 2001,” IEEE Trans. on Power Systems, vol. 19, no. 4, pp. 1821-1830, 2004.
[16] A. J. Smola and B. Scholkopf, “A tutorial on support vector regression,” NeuroCOLT2 Technical Report NC2-TR-1998-030, 2003.
[17] I. Witten and E. Frank, Data mining: practical machine learning tools and techniques, Second edition, Morgan Kaufmann, 2005.
[18] M. A. Hall, “Correlation-based feature selection for discrete and numeric class machine learning,” Proceedings of the 17th International Conference on Machine Learning, pp. 359-366, 2000.
[19] V. Vapnik, Statistical learning theory, Wiley, 1998.
[20] D. C. Sansom, T. Downs, and T. K. Saha, “Evaluation of support vector machine based forecasting tool in electricity price forecasting for Australian national electricity market participants,” Journal of Electrical and Electronics Engineering, vol. 22, no. 3, pp. 227-234, 2003.
[21] T. Joachims, Learning to classify text using support vector machines: methods, theory, and algorithms, Springer, 2002.