
Fault Location in Distribution Systems with Distributed Generation Using Support Vector Machines and Smart Meters

Ramón Pérez
Department of Electrical Engineering
Universidad Politécnica Salesiana
Quito-Ecuador
rperezp@ups.edu.ec

Carmen Vásquez
Department of Electrical Engineering
UNEXPO
Barquisimeto-Venezuela
cvasquez@unexpo.edu.ve

Abstract— This paper presents fault location in distribution systems with distributed generation using Support Vector Machines and information provided by smart meters located on the system. The different types of faults that can occur in a distribution system are simulated for fault resistances from 5 to 30 Ohms in steps of 5 Ohms. The Support Vector Machines were trained with rms voltage values measured at the substation, at the distributed generation source and at the smart meters. The results show that the accuracy in locating all types of faults is higher than 87%, demonstrating the strength of this tool.

Keywords—Fault location; Support Vector Machines; Distributed generation; Smart meters; Fault resistance.

I. INTRODUCTION

Power quality has become a topic of great interest to most people who enjoy electricity service today. Over the years, the growth of society has demanded a proportional growth of the services needed for its daily activities, including electricity. This growth involves not only increasing generation, transmission or distribution capacity; the system should also treat energy as a commercial product that must be delivered to the user under the highest levels of quality. Electrical distribution systems are where the highest percentage of failures occurs, so the frequency and duration of power outages are the most critical factor, since most interruptions of the power supply follow a fault. Prompt repair of a failure depends largely on how quickly the place where it occurred is found, so the rapid and efficient location of faults is a determining factor for maintaining high power quality indices.

Currently, distribution networks are presenting a high penetration of Distributed Generation (DG), which is universally accepted as an effective and economical solution to address increases in the energy demand of the system, because DG is a better option for correcting the major problems of energy loss, voltage profiles, line congestion, safety and reliability, to mention some. In addition to producing electrical energy, DG uses clean energy sources that are compatible with the environment. The integration of DG into conventional networks modifies the amplitudes of the fault signals (voltage, current), which significantly affects the performance of fault location algorithms [1]. For this reason, the problem becomes more important when DG is present in power distribution networks. A fault location scheme for distribution systems with DG is presented in [2] using impedance-based methods. The authors of [3] present the use of Support Vector Machines (SVM) for fault location in distribution systems with varying penetration of DG. The authors of [4] suggest fault location in distribution systems based on algorithmic methods and smart metering.

In this paper, fault location in distribution systems with distributed generation is proposed using SVM and smart metering. The location is made by zones of the circuit, with the help of the rms voltage values recorded by the smart meters when the fault occurs. Section II presents the theory of SVM, Section III the proposed methodology for locating faults, Section IV the results and Section V the conclusions.

II. SUPPORT VECTOR MACHINES

SVM are one of the knowledge-based methods that use patterns to solve the fault location problem in distribution networks. They were developed by Vladimir Vapnik in the early 80s using tools from statistics, optimization and artificial neural networks [1]. The architecture of the SVM depends only on the parameter C and the kernel function. In the case of the Radial Basis Function (RBF) kernel, only the parameter gamma (γ) is additionally required, which avoids requirements on architectural parameters such as the number of nodes and layers and the type of connection between layers, among others [5].

Support vector classifiers are based on hyperplanes that separate the training data into two subgroups. Among all possible separating planes between the two classes there is a single optimal separating hyperplane (OSH), for which the distance between the hyperplane and the nearest training pattern is maximum. This allows the regions where the points of each group fall to be distinguished more clearly, as shown in Fig. 1 [5].

The OSH separates the training data into two groups, each with its own label y ∈ {+1, -1}, such that the distance between the OSH and the nearest training pattern is maximum, with the intention of forcing the generalization of the learning machine.

978-1-5090-1629-7/16/$31.00 ©2016 IEEE


The expression of the optimal hyperplane is given in (1) and those of the separating hyperplanes in (2) and (3).

Fig. 1. Hyperplane that separates the data correctly [6]

Maximizing the margin corresponds to maximizing (4) subject to the constraint (5).

The objective function (4) together with (5) represents a constrained quadratic optimization problem. It can be solved using Lagrange multipliers, as shown in (6).

When the data are not linearly separable, the approach for SVM given by [7], based on [8], should be taken, which allows the constraints in (5) to be violated. It introduces the slack variables given by (7), which generate the new constraint (8).

To find the optimal soft-margin hyperplane classifier, (9) should be minimized subject to (8) [9].

The parameter C is chosen by the user; a high value of this parameter represents a high penalty on errors. Its optimum value is found by cross-validation on the input data [10].

When the data are not separable, it is possible to establish a relationship between the input space and a high-dimensional representation space. This is made possible by using a nonlinear function to map the input data to a larger space. This nonlinear function is known as the kernel, and the main ones are linear, RBF, polynomial and sigmoid. Table I lists these four (4) most commonly used types of kernel.

TABLE I. MOST USED KERNEL FUNCTIONS.

Kernel function
Linear
RBF
Polynomial
Sigmoid

The quantities that enter these functions, such as γ in the RBF kernel, are the parameters of the kernel function.

In general, the RBF kernel is a reasonable first choice. This kernel nonlinearly maps the samples into a higher-dimensional space, so, unlike the linear kernel, it can handle the case in which the relationship between the class labels and the attributes is not linear. Moreover, the linear kernel is a special case of the RBF kernel, since the linear kernel with a penalty parameter C has the same performance as the RBF kernel with some parameters (C, γ) [11]. Furthermore, the sigmoid kernel behaves like the RBF kernel for certain parameters [12]. The RBF kernel behaves very well in nonlinear cases and has fewer numerical difficulties [3]. Finally, the polynomial kernel has more hyperparameters than the RBF kernel [13].

There are two parameters for the RBF kernel: C and γ. It is not known in advance which values of (C, γ) are best for a given problem; therefore, some form of model selection (parameter search) must be carried out. The objective is to identify the best parameters (C, γ) so that the classifier can accurately predict unknown data (test data). A common strategy is to separate the dataset into two parts, one of which is considered unknown. The prediction accuracy obtained on the unknown set reflects more precisely the performance in classifying an independent dataset. An improved version of this procedure is known as cross-validation.

In v-fold cross-validation, the training data are first divided into v subsets of equal size. Each subset is then tested in turn using the classifier trained on the remaining v-1 subsets. Thus, each instance of the entire training set is predicted once, so the cross-validation accuracy is the percentage of data that are correctly classified.

SVM were originally created to solve binary classification problems and were later extended to problems with multiple classes. This paper deals with a multi-class problem, solved by the scheme proposed in [14].
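The four kernels named in Table I are commonly defined as in the LIBSVM guide [13]. The sketch below implements those common definitions in plain Python for illustration. The function names and the default values of gamma, r and d are assumptions, and the forms are the standard ones rather than a transcription of Table I.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    # K(u, v) = u . v  (no kernel parameters)
    return dot(u, v)


def rbf_kernel(u, v, gamma=0.5):
    # K(u, v) = exp(-gamma * ||u - v||^2), with gamma > 0
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def polynomial_kernel(u, v, gamma=1.0, r=1.0, d=2):
    # K(u, v) = (gamma * u . v + r)^d, with gamma > 0
    return (gamma * dot(u, v) + r) ** d


def sigmoid_kernel(u, v, gamma=0.1, r=0.0):
    # K(u, v) = tanh(gamma * u . v + r)
    return math.tanh(gamma * dot(u, v) + r)


# Illustrative feature vectors (assumed values, not data from the paper).
x1 = [1.0, 2.0]
x2 = [2.0, 0.0]
for name, kernel in [
    ("linear", linear_kernel),
    ("rbf", rbf_kernel),
    ("polynomial", polynomial_kernel),
    ("sigmoid", sigmoid_kernel),
]:
    print(name, kernel(x1, x2))
```

The RBF kernel needs only gamma, while the polynomial kernel needs gamma, r and d, which illustrates the remark that the polynomial kernel has more hyperparameters to tune.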
III. PROPOSED METHODOLOGY

Figure 2 shows the circuit under study, which corresponds to the IEEE 34 node test feeder. It has a nominal voltage of 24.9 kV and a total of 26 three-phase nodes, three single-phase nodes belonging to phase A and five single-phase nodes belonging to phase B. The data of this system are given in [15]. In addition, this research considers a distributed generation source located at node 848 with a penetration of 30%. Two smart meters, located at nodes 842 and 860, are also considered; they are needed to eliminate the uncertainty of multiple fault location estimates seen from the substation of the distribution network.

Fig. 2. IEEE 34 node test feeder [15].

This system is simulated in the Electromagnetic Transients Program (EMTP) under the graphical environment ATPDraw. Copies of the circuit are generated for the 11 types of faults to be simulated at each of the nodes, considering six (6) fault resistance values, namely 5, 10, 15, 20, 25 and 30 ohms, which represent typical values in distribution systems. The combination of fault types and resistances results in a total of 1764 simulations, which are performed through communication between ATP® and MATLAB® in order to obtain the database needed for fault location. Each type of fault is described in Table II.

This study uses 5 descriptors extracted from the behavior of the electrical variables of the system under permanent fault conditions, in order to ensure a complete description of system behavior. The descriptors considered are the rms values of the phase and line voltages measured at the substation, the rms value of the line voltage measured at the distributed generation source and the rms voltage values indicated by the smart meters.

It is proposed to locate faults by dividing the distribution network into zones formed by nearby nodes, as proposed in [16]. Table III describes the nodes in each of the nine (9) zones into which the circuit has been divided.

After the database is obtained, the attributes are scaled to prevent attributes with large numerical ranges from dominating those with small numerical ranges [13]. The full dataset is separated into two (2) parts: the first, representing approximately 70% of the total, is used to train the SVM and the rest is used for testing; the training data are not included in the test data. In the training stage the SVM is tuned by finding the best parameters (C, γ) through cross-validation. Subsequently the SVM is tested and its accuracy is determined by (10), all using the LibSVM software [17] in MATLAB®.

Accuracy (%) = (number of correctly classified data / total number of data) × 100    (10)

The algorithm used in the model is detailed in the listing titled "Fault location algorithm".

TABLE II. DESCRIPTION OF THE TYPES OF FAULTS.

Type of fault   Description
Fault 1         Single-phase fault in phase A
Fault 2         Single-phase fault in phase B
Fault 3         Single-phase fault in phase C
Fault 4         Two-phase fault between phases A and B
Fault 5         Two-phase fault between phases B and C
Fault 6         Two-phase fault between phases C and A
Fault 7         Fault between two phases and earth in phases A and B
Fault 8         Fault between two phases and earth in phases B and C
Fault 9         Fault between two phases and earth in phases C and A
Fault 10        Three-phase fault
Fault 11        Three-phase-ground fault

TABLE III. DESCRIPTION OF THE ZONES OF THE CIRCUIT.

Zone   Nodes                           Description
1      800, 802, 806, 808, 810         4 three-phase nodes and 1 single-phase node in phase B
2      812, 814, 850, 816, 824, 826    5 three-phase nodes and 1 single-phase node in phase B
3      818, 820, 822                   3 single-phase nodes in phase A
4      828, 830, 854, 856              3 three-phase nodes and 1 single-phase node in phase B
5      852, 832, 858, 864              3 three-phase nodes and 1 single-phase node in phase B
6      842, 844, 846, 848              4 three-phase nodes
7      888, 890                        2 three-phase nodes
8      834, 860, 836, 840              4 three-phase nodes
9      862, 838                        1 three-phase node and 1 single-phase node in phase B
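As a quick check of the 1764 simulations reported in this section, the count can be reproduced from the node composition in Table III. The sketch assumes that single-phase nodes only host the single-phase fault of their own phase (phase A nodes only Fault 1, phase B nodes only Fault 2) and that all 11 fault types are applied at every three-phase node. This allocation is an inference from the zone descriptions, not something the text states node by node, and the names used are hypothetical.

```python
# Node composition per zone, transcribed from Table III:
# zone -> (three-phase nodes, single-phase nodes in phase A,
#          single-phase nodes in phase B)
ZONES = {
    1: (4, 0, 1),
    2: (5, 0, 1),
    3: (0, 3, 0),
    4: (3, 0, 1),
    5: (3, 0, 1),
    6: (4, 0, 0),
    7: (2, 0, 0),
    8: (4, 0, 0),
    9: (1, 0, 1),
}
FAULT_RESISTANCES_OHM = [5, 10, 15, 20, 25, 30]
N_FAULT_TYPES = 11


def eligible_nodes(fault_type):
    """Number of nodes at which a fault type can be simulated.

    Assumption: Fault 1 (phase A to ground) also applies to the
    single-phase A nodes, Fault 2 (phase B to ground) also applies to
    the single-phase B nodes, and every other fault type needs a
    three-phase node.
    """
    three_phase = sum(z[0] for z in ZONES.values())
    phase_a = sum(z[1] for z in ZONES.values())
    phase_b = sum(z[2] for z in ZONES.values())
    if fault_type == 1:
        return three_phase + phase_a
    if fault_type == 2:
        return three_phase + phase_b
    return three_phase


def total_simulations():
    """Fault types x eligible nodes x fault resistances."""
    return sum(
        eligible_nodes(fault) * len(FAULT_RESISTANCES_OHM)
        for fault in range(1, N_FAULT_TYPES + 1)
    )


print(total_simulations())
```

Under these assumptions the total is (29 + 31 + 9 × 26) × 6 = 1764, which agrees with the figure given in the text.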
Fault location algorithm

Step 1: Input
    IEEE 34 node test feeder
    rfmin = 5 Ω
    rfmax = 30 Ω
    rpas = 5 Ω
    nres = (rfmax-rfmin)/rpas + 1;
Step 2: fid = fopen('lista_de_fallas.txt','a');
    for j = 1:11
        for i = 1:34
            for k = 1:nres
                fault_j_nodo_i_rf_k.atp
            end for
        end for
    end for
Step 3: for j = 1:11
        for i = 1:34
            for k = 1:nres
                run ATPDraw= (gnudirtpbig.exe both,A, s -r);
                run Pl42mat.exe
            end for
        end for
    end for
Step 4: for i = 1:1764
        Extract Vrms at the substation, at the distributed generation source
        and at the smart meters.
    end for
Step 5: svm-scale.exe
    ndata = 1764
    prcnt = round(ndata*0.3);
    for i = 1:11
        nmbr = ['Database',num2str(i),'.txt'];
        [m_zn,m_data] = libsvmread(fullfile(fd,nmbr));
        data_test = m_data(fila==1,:); zone_test = m_zn(fila==1,:);
        data_train = m_data(fila==0,:); zone_train = m_zn(fila==0,:);
    end for
Step 6: bestcv = 0
    for i = 1:11
        for each pair (log2c, log2g) of the search grid
            cmd = ['-q -v 5 -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
            cv = svmtrain(zone_train, data_train, cmd);
            if (cv > bestcv)
                bestcv = cv; bestc = 2^log2c; bestg = 2^log2g;
            end if
        end for
    end for
    bestParam = ['-q -c ', num2str(bestc), ' -g ', num2str(bestg)];
Step 7:
    model = svmtrain(zone_train, data_train, bestParam);
    [zone_prediction,accuracy] = svmpredict(zone_test,data_test,model);
Step 8: Output
    [matrix_conf,order] = confusionmat(zone_test,zone_prediction)
Step 9: Return

IV. RESULTS

The results of tuning the parameters of the SVM are shown in Table IV.

TABLE IV. RESULTS OF THE TUNING OF PARAMETERS OF THE SVM

Type of fault   C            γ            Cross-validation accuracy (%)
Fault 1         134217728    0.015625     93.9655
Fault 2         268435456    0.00048828   92.3077
Fault 3         33554432     0.00024414   92.3077
Fault 4         524288       0.015625     95.1923
Fault 5         536870912    0.00048828   95.1923
Fault 6         16777216     0.00097656   92.3077
Fault 7         33554432     0.0039063    97.1698
Fault 8         67108864     0.00048828   96.1538
Fault 9         8388608      0.125        97.1154
Fault 10        262144       0.015625     92.3077
Fault 11        32           32           100

The cross-validation stage adjusts the parameters C and γ of the SVM. Table IV shows that the accuracy of the SVM in its cross-validation stage is greater than 90% for every fault type. Figs. 3 and 4 show the cross-validation contours for the highest and the lowest cross-validation accuracy of the SVM, respectively. Table V shows the results of the testing phase of the SVM.

Fig. 3. Cross-validation contour for fault 11 (axes: log2(C) and log2(γ); accuracy = 100.00 %).

Fig. 4. Cross-validation contour for fault 3 (axes: log2(C) and log2(γ); accuracy = 92.31 %).
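Step 6 of the algorithm and the cross-validation contours correspond to a grid search: each candidate value on a log2 grid is scored by v-fold cross-validation and the best one is kept. The sketch below illustrates that loop in plain Python. It is not LibSVM: the classifier is a simple RBF-kernel voting rule used as a stand-in, it has only a γ parameter (no C), and the toy data, the grid range and all function names are assumptions made for illustration.

```python
import math


def rbf(u, v, gamma):
    """RBF kernel exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def predict(train_x, train_y, x, gamma):
    """Stand-in classifier (NOT an SVM): return the label whose training
    points have the largest mean RBF similarity to x."""
    sums, counts = {}, {}
    for xi, yi in zip(train_x, train_y):
        sums[yi] = sums.get(yi, 0.0) + rbf(xi, x, gamma)
        counts[yi] = counts.get(yi, 0) + 1
    return max(sorted(sums), key=lambda label: sums[label] / counts[label])


def make_folds(n, v):
    """Split indices 0..n-1 into v interleaved folds; each index is used once."""
    return [list(range(fold, n, v)) for fold in range(v)]


def cv_accuracy(xs, ys, gamma, v=5):
    """v-fold cross-validation accuracy in percent."""
    correct = 0
    for fold in make_folds(len(xs), v):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        correct += sum(
            predict(train_x, train_y, xs[i], gamma) == ys[i] for i in fold
        )
    return 100.0 * correct / len(xs)


def grid_search(xs, ys, log2_gammas, v=5):
    """Keep the first gamma = 2^k that gives the best CV accuracy."""
    best_cv, best_gamma = -1.0, None
    for log2g in log2_gammas:
        gamma = 2.0 ** log2g
        cv = cv_accuracy(xs, ys, gamma, v)
        if cv > best_cv:
            best_cv, best_gamma = cv, gamma
    return best_cv, best_gamma


# Toy data (assumed): two well-separated groups standing in for two zones.
xs = [[0.1 * i, 0.0] for i in range(10)]
xs += [[10.0 + 0.1 * i, 10.0] for i in range(10)]
ys = [1] * 10 + [2] * 10

best_cv, best_gamma = grid_search(xs, ys, range(-10, 6))
print(best_cv, best_gamma)
```

Because every instance falls in exactly one held-out fold, each one is predicted once, which is the property the cross-validation accuracy relies on. With LibSVM the inner call would be the training command with the cross-validation option, and the grid would run over both C and γ.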


TABLE V. RESULTS OF THE ACCURACY OF THE SVM IN TEST STAGE

Type of fault   Accuracy (%)
Fault 1         87.931
Fault 2         95
Fault 3         92.3077
Fault 4         98.0769
Fault 5         92.3077
Fault 6         94.2308
Fault 7         92
Fault 8         88.4615
Fault 9         98.0769
Fault 10        96.1538
Fault 11        100

The accuracy for each type of fault is represented by a confusion matrix, in which the elements on the main diagonal are the correctly classified data and the elements off the main diagonal are the data that the SVM misclassified. Tables VI to XVI show the confusion matrices for each type of fault. Zone 3 appears only for the single-phase fault in phase A, because this zone is made up only of single-phase nodes belonging to phase A, so only this type of fault can be simulated in it.

TABLE VI. CONFUSION MATRIX FOR FAULT 1

              Prediction Zone
Real Zone     1    2    3    4    5    6    7    8    9
1             8    0    0    0    0    0    0    0    0
2             0   10    0    0    0    0    0    0    0
3             0    0    6    0    0    0    0    0    0
4             0    0    0    6    0    0    0    0    0
5             0    0    0    0    6    0    0    0    0
6             0    0    0    0    0    8    0    0    0
7             0    0    0    0    0    0    4    0    0
8             0    0    0    0    0    0    0    2    6
9             0    0    0    0    0    0    0    1    1

TABLE VII. CONFUSION MATRIX FOR FAULT 2

              Prediction Zone
Real Zone     1    2    4    5    6    7    8    9
1            10    0    0    0    0    0    0    0
2             0   12    0    0    0    0    0    0
4             0    0    8    0    0    0    0    0
5             0    0    0    8    0    0    0    0
6             0    0    0    0    8    0    0    0
7             0    0    0    0    0    4    0    0
8             0    0    0    0    0    0    5    1
9             0    0    0    0    0    0    2    2

TABLE VIII. CONFUSION MATRIX FOR FAULT 3

              Prediction Zone
Real Zone     1    2    4    5    6    7    8    9
1             8    0    0    0    0    0    0    0
2             0   10    0    0    0    0    0    0
4             0    0    6    0    0    0    0    0
5             0    0    0    6    0    0    0    0
6             0    0    0    0    8    0    0    0
7             0    0    0    0    0    4    0    0
8             0    0    0    0    0    0    6    2
9             0    0    0    0    0    0    2    0

TABLE IX. CONFUSION MATRIX FOR FAULT 4

              Prediction Zone
Real Zone     1    2    4    5    6    7    8    9
1             8    0    0    0    0    0    0    0
2             0   10    0    0    0    0    0    0
4             0    0    6    0    0    0    0    0
5             0    0    0    6    0    0    0    0
6             0    0    0    0    8    0    0    0
7             0    0    0    0    0    4    0    0
8             0    0    0    0    0    0    7    1
9             0    0    0    0    0    0    0    2

TABLE X. CONFUSION MATRIX FOR FAULT 5

              Prediction Zone
Real Zone     1    2    4    5    6    7    8    9
1             8    0    0    0    0    0    0    0
2             0   10    0    0    0    0    0    0
4             0    0    6    0    0    0    0    0
5             0    0    0    6    0    0    0    0
6             0    0    0    0    8    0    0    0
7             0    0    0    0    0    4    0    0
8             0    0    0    0    0    0    6    2
9             0    0    0    0    0    0    2    0

TABLE XI. CONFUSION MATRIX FOR FAULT 6

              Prediction Zone
Real Zone     1    2    4    5    6    7    8    9
1             8    0    0    0    0    0    0    0
2             0   10    0    0    0    0    0    0
4             0    0    6    0    0    0    0    0
5             0    0    0    6    0    0    0    0
6             0    0    0    0    8    0    0    0
7             0    0    0    0    0    4    0    0
8             0    0    0    0    0    0    7    1
9             0    0    0    0    0    0    2    0

TABLE XII. CONFUSION MATRIX FOR FAULT 7

              Prediction Zone
Real Zone     1    2    4    5    6    7    8    9
1             8    0    0    0    0    0    0    0
2             0   10    0    0    0    0    0    0
4             0    0    6    0    0    0    0    0
5             0    0    0    6    0    0    0    0
6             0    0    0    0    6    0    0    0
7             0    0    0    0    0    4    0    0
8             0    0    0    0    0    0    4    4
9             0    0    0    0    0    0    0    2
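As a worked check, the test accuracy of a fault type is the sum of the main diagonal of its confusion matrix divided by the number of test cases. The sketch below does this for the Fault 1 matrix (Table VI). The function name is an assumption, and the matrix values are transcribed from the table.

```python
# Confusion matrix for Fault 1 (Table VI): rows are real zones 1..9,
# columns are predicted zones 1..9.
FAULT_1 = [
    [8, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 10, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 6, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 6, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 6, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 8, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 4, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 2, 6],
    [0, 0, 0, 0, 0, 0, 0, 1, 1],
]


def accuracy_percent(matrix):
    """Percentage of test cases on the main diagonal of a confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


print(round(accuracy_percent(FAULT_1), 3))
```

The matrix holds 51 correct cases out of 58, which gives the 87.931% listed for Fault 1 in Table V. Most of the errors are zone 8 cases assigned to zone 9.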
TABLE XIII. CONFUSION MATRIX FOR FAULT 8

              Prediction Zone
Real Zone     1    2    4    5    6    7    8    9
1             8    0    0    0    0    0    0    0
2             0   10    0    0    0    0    0    0
4             0    0    6    0    0    0    0    0
5             0    0    0    6    0    0    0    0
6             0    0    0    0    8    0    0    0
7             0    0    0    0    0    4    0    0
8             0    0    0    0    0    0    4    4
9             0    0    0    0    0    0    2    0

TABLE XIV. CONFUSION MATRIX FOR FAULT 9

              Prediction Zone
Real Zone     1    2    4    5    6    7    8    9
1             8    0    0    0    0    0    0    0
2             0   10    0    0    0    0    0    0
4             0    0    6    0    0    0    0    0
5             0    0    0    6    0    0    0    0
6             0    0    0    0    8    0    0    0
7             0    0    0    0    0    4    0    0
8             0    0    0    0    0    0    7    1
9             0    0    0    0    0    0    0    2

TABLE XV. CONFUSION MATRIX FOR FAULT 10

              Prediction Zone
Real Zone     1    2    4    5    6    7    8    9
1             8    0    0    0    0    0    0    0
2             0   10    0    0    0    0    0    0
4             0    0    6    0    0    0    0    0
5             0    0    0    6    0    0    0    0
6             0    0    0    0    8    0    0    0
7             0    0    0    0    0    4    0    0
8             0    0    0    0    0    0    8    0
9             0    0    0    0    0    0    2    0

TABLE XVI. CONFUSION MATRIX FOR FAULT 11

              Prediction Zone
Real Zone     1    2    4    5    6    7    8    9
1             8    0    0    0    0    0    0    0
2             0   10    0    0    0    0    0    0
4             0    0    6    0    0    0    0    0
5             0    0    0    6    0    0    0    0
6             0    0    0    0    8    0    0    0
7             0    0    0    0    0    4    0    0
8             0    0    0    0    0    0    8    0
9             0    0    0    0    0    0    0    2

V. CONCLUSIONS

Fault location in distribution systems with distributed generation and smart meters using SVM has been presented. The cross-validation accuracy of the SVM was over 90% for all fault types, and its accuracy in the testing stage was over 87%. The rms values of the voltage at the substation, at the smart meters and at the distributed generation source under fault conditions were used as descriptors for the SVM. The results demonstrate the high ability of this tool for fault location in electrical distribution systems with distributed generation.

REFERENCES

[1] Y. Menchafou, H. El Markhi, M. Zahri, and M. Habibi, “Impact of distributed generation integration in electric power distribution systems on fault location methods,” 3rd Int. Renew. Sustain. Energy Conf., no. 1998, 2015.
[2] S. F. Alwash, V. K. Ramachandaramurthy, and N. Mithulananthan, “Fault-Location Scheme for Power Distribution System with Distributed Generation,” IEEE Trans. Power Deliv., vol. 30, no. c, pp. 1187–1195, 2015.
[3] R. Agrawal and D. Thukaram, “Identification of fault location in power distribution system with distributed generation using support vector machines,” 2013 IEEE PES Innov. Smart Grid Technol. Conf., pp. 1–6, 2013.
[4] F. C. L. Trindade and W. Freitas, “Low Voltage Zones to Support Fault Location in Distribution Systems With Smart Meters,” pp. 1–10, 2016.
[5] C. Burges, “A Tutorial on Support Vector Machines for Pattern Recognition,” Data Min. Knowl. Discov., vol. 2, no. 2, pp. 121–167, 1998.
[6] R. Pérez, A. Aguila, and C. Vásquez, “Classification of the Status of the Voltage Supply in Induction Motors Using Support Vector Machines,” 2016.
[7] C. Cortes and V. Vapnik, “Support-Vector Networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
[8] K. Bennett and O. L. Mangasarian, “Robust linear programming discrimination of two linearly inseparable sets,” Optim. Methods Softw., vol. 1, no. 1, pp. 23–34, 1992.
[9] G. Li, C. Wen, Z. G. Li, A. Zhang, F. Yang, and K. Mao, “Model-based online learning with kernels,” IEEE Trans. Neural Networks Learn. Syst., vol. 24, no. 3, pp. 356–369, 2013.
[10] A. Astorino and A. Fuduli, “The Proximal Trajectory Algorithm in SVM Cross Validation,” IEEE Transactions on Neural Networks and Learning Systems, in press, 2015.
[11] S. S. Keerthi and C.-J. Lin, “Asymptotic behaviors of support vector machines with Gaussian kernel,” Neural Comput., vol. 15, no. 7, pp. 1667–89, 2003.
[12] H. Lin and C. Lin, “A study on sigmoid kernels for SVM and the training of non-PSD kernels by SMO-type methods,” submitted to Neural Computation, 2003.
[13] C.-W. Hsu, C.-C. Chang, and C.-J. Lin, “A Practical Guide to Support Vector Classification,” 2008. [Online]. Available: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf.
[14] C. Hsu and C. Lin, “A comparison of methods for multiclass support vector machines,” IEEE Trans. Neural Networks, vol. 13, no. 2, pp. 415–425, 2002.
[15] Distribution System Analysis Subcommittee, “IEEE 34 Node Test Feeder,” p. 16, 2000.
[16] J. José and M. Flórez, “Localización de faltas en sistemas de distribución de energía eléctrica usando métodos basados en el modelo y métodos basados en el conocimiento,” 2007.
[17] C. Chang and C. Lin, “LIBSVM: A Library for Support Vector Machines,” 2001. [Online]. Available: https://www.csie.ntu.edu.tw/~cjlin/libsvm/.
