1.0 Introduction
State estimation for electric transmission grids
was first formulated as a weighted least-squares
problem by Fred Schweppe and his research
group [1] in 1969. (Schweppe also developed
spot pricing, the precursor of modern-day
locational marginal prices (LMPs), a central
feature of electricity markets.) A state estimator
is a central part of every control center.
The basic motivation for state estimation is that
we want to perform computer analysis of the
network under the conditions characterized by
the current set of measurements.
The problem is to determine the state of the
circuit, which in this case comprises the nodal
voltages v1 and v2 and the voltage e across the
voltage source.
Fig. 1: Circuit for the example: a voltage source e (measured
by the voltmeter V) in series with resistor R1 between nodes 1
and 2; resistors R2 and R3 from nodes 1 and 2 to node 3 (the
reference node); current sources I1 and I2; and ammeters A1,
A2, A3 providing the current measurements.
Let's write each one of the measured currents in
terms of the node voltages, and we may also
write down our one voltage measurement (each
resistance is 1 Ω):

    i_m,12 = (v1 - v2 - e)/1 = v1 - v2 - e = 1.0    (1)

    i_m,13 = (v1 - 0)/1 = v1 = 3.2    (2)

    i_m,23 = (v2 - 0)/1 = v2 = 0.8    (3)

    e_m = e = 1.1    (4)
Expressing all of the above in matrix form:

    [ 1  -1  -1 ] [ v1 ]   [ 1.0 ]
    [ 1   0   0 ] [ v2 ] = [ 3.2 ]    (5)
    [ 0   1   0 ] [ e  ]   [ 0.8 ]
    [ 0   0   1 ]          [ 1.1 ]
Let's denote the terms in eq. (5) as A, x, and b, so:

    Ax = b    (6)
There is no single solution to eq. (5), but there is
a single solution that is normally thought of as
best. One way to define best is as the solution
that minimizes the sum of the squared errors
between what should be computed by each
equation, which is:
    b = [ 1.0  3.2  0.8  1.1 ]^T    (7)
and what is computed by each equation, which
is:

         [ 1  -1  -1 ] [ v1 ]
    Ax = [ 1   0   0 ] [ v2 ]    (8)
         [ 0   1   0 ] [ e  ]
         [ 0   0   1 ]
The difference, or error, is then:

             [ 1.0 ]   [ 1  -1  -1 ] [ v1 ]   [ 1.0 - (v1 - v2 - e) ]
    b - Ax = [ 3.2 ] - [ 1   0   0 ] [ v2 ] = [ 3.2 - v1            ]    (9)
             [ 0.8 ]   [ 0   1   0 ] [ e  ]   [ 0.8 - v2            ]
             [ 1.1 ]   [ 0   0   1 ]          [ 1.1 - e             ]
The squared error of each equation is then:

    (1.0 - (v1 - v2 - e))^2,  (3.2 - v1)^2,  (0.8 - v2)^2,  (1.1 - e)^2

and the sum of the squared errors is:

    (1.0 - (v1 - v2 - e))^2 + (3.2 - v1)^2 + (0.8 - v2)^2 + (1.1 - e)^2
Careful tracking of the previous expression will
indicate that it can be written as (b - Ax)^T (b - Ax).
Let's multiply this by 1/2 and give it a name:

    J = (1/2) (b - Ax)^T (b - Ax)    (10)
Our problem is then to choose x so as to
minimize J. Under requirements on the form of
J (convexity), we minimize it by setting its
gradient with respect to x to 0 and then solving
for x. To do this, we can expand J as follows:

    J = (1/2) (b - Ax)^T (b - Ax) = (1/2) (b^T - x^T A^T)(b - Ax)    (11)

      = (1/2) (b^T b - x^T A^T b - b^T Ax + x^T A^T Ax)    (12a)
Consider the second and third terms in (12a).
Using a 2x2 case to illustrate,

    x^T A^T b = [ x1  x2 ] [ A11  A21 ] [ b1 ] = b1 x1 A11 + b1 x2 A12 + b2 x1 A21 + b2 x2 A22
                           [ A12  A22 ] [ b2 ]

    b^T Ax = [ b1  b2 ] [ A11  A12 ] [ x1 ] = b1 x1 A11 + b2 x1 A21 + b1 x2 A12 + b2 x2 A22
                        [ A21  A22 ] [ x2 ]

we observe these terms are equal, which we can
prove by using (Ax)^T = x^T A^T to show that
[x^T A^T b]^T = b^T Ax and recognizing that both
terms are scalars in (12a). Therefore, (12a) becomes
    J = (1/2) (b^T b - 2 b^T Ax + x^T A^T Ax)    (12b)

which is identical to (12A.12), p. 504 of W&W.
To remind ourselves about gradients, we recall
that the gradient of J with respect to x is given by:

    ∇_x J = [ ∂J/∂x1  ∂J/∂x2  …  ∂J/∂xn ]^T    (13)
Now (12b) is written in compact notation, and it
may not be obvious how to differentiate each
term in it. To assist with this, your text, in the
Appendix to Chapter 12, develops the following
relations, which are summarized in (12A.25, p.
506) and repeated here for convenience:
        Function          Gradient
    #1  F = x^T b         ∇_x F = b
    #2  F = b^T x         ∇_x F = b
    #3  F = x^T A u       ∇_x F = A u
    #4  F = u^T A x       ∇_x F = A^T u
    #5  F = x^T A x       ∇_x F = 2 A x

Relation #5 applies only if A is symmetric.
Repeating (12b):

    J = (1/2) (b^T b - 2 b^T Ax + x^T A^T Ax)    (12c)
Using the appropriate relations in the above
table (#4 for the second term and #5 for the third
term), the gradient of (12c) can be expressed as:

    ∇_x J = (1/2) (-2 A^T b + 2 A^T A x) = -A^T b + A^T A x    (14)
(Applying #5 to the third term is acceptable
since A^T A must be symmetric.)

Setting ∇_x J = 0 in (14) gives the normal equations
A^T A x = A^T b, so the least-squares solution is
x = (A^T A)^{-1} A^T b, provided A^T A can be inverted.
Reference [3, p. 157] shows that if A has
linearly independent columns, then A^T A is
invertible.
For our example, the gain matrix G = A^T A is:

        [  1   1   0   0 ] [ 1  -1  -1 ]   [  2  -1  -1 ]
    G = [ -1   0   1   0 ] [ 1   0   0 ] = [ -1   2   1 ]
        [ -1   0   0   1 ] [ 0   1   0 ]   [ -1   1   2 ]
                           [ 0   0   1 ]

The inverse of the gain matrix is then found
from Matlab as:

                   [ 3   1   1 ]
    G^{-1} = (1/4) [ 1   3  -1 ]
                   [ 1  -1   3 ]
The pseudo-inverse is then:

                             [ 3   1   1 ] [  1   1   0   0 ]         [  1   3   1   1 ]
    A^+ = G^{-1} A^T = (1/4) [ 1   3  -1 ] [ -1   0   1   0 ] = (1/4) [ -1   1   3  -1 ]
                             [ 1  -1   3 ] [ -1   0   0   1 ]         [ -1   1  -1   3 ]
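The least-squares solution of this small example is easy to check numerically. The sketch below is a Python/NumPy stand-in for the Matlab computation (the signs in A follow the reconstruction of eqs. (1)-(4) above): it forms the gain matrix and solves the normal equations.

```python
import numpy as np

# Overdetermined measurement model Ax = b for the circuit example.
# All resistances are 1 ohm, so the conductances are 1.
A = np.array([[1.0, -1.0, -1.0],   # i_m,12 = v1 - v2 - e = 1.0
              [1.0,  0.0,  0.0],   # i_m,13 = v1          = 3.2
              [0.0,  1.0,  0.0],   # i_m,23 = v2          = 0.8
              [0.0,  0.0,  1.0]])  # e_m    = e           = 1.1
b = np.array([1.0, 3.2, 0.8, 1.1])

G = A.T @ A                          # gain matrix A^T A
x_hat = np.linalg.solve(G, A.T @ b)  # normal equations: G x = A^T b
print(x_hat)                         # estimated [v1, v2, e]
```

With these numbers the estimate comes out near [3.125, 0.875, 1.175], close to (but not exactly at) the raw measurements v1 = 3.2, v2 = 0.8, e = 1.1, since no exact solution of (5) exists.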
3.0 A motivating system and some basics

Fig. 2: A three-bus system (buses 1, 2, and 3) with
measurements V1, P12, Q12, P23, and Q23.
We denote the measured quantities as follows:

    z1 = V1
    z2 = P12
    z3 = Q12    (22)
    z4 = P23
    z5 = Q23
Then we can write:

    z_i = z_i^true + η_i    (23)

where z_i is the measured value, z_i^true is the
true value, and η_i is the measurement error.
Fig. 3: Error distribution of a measuring instrument.

Calibration curves for measuring instruments
enable determination of the variance σ_i².
Recall the expectation operator, which we
denote as E(·) (i.e., the expected value of its
argument). It is defined as:

    E(x) = ∫ x f(x) dx = x̄    (24)

which gives x̄, the mean value of a random
variable x described by the probability density
function f(x).
The variance is:

    σ_x² = E[(x - x̄)²] = ∫ (x - x̄)² f(x) dx
         = ∫ x² f(x) dx - 2x̄ ∫ x f(x) dx + (x̄)² ∫ f(x) dx    (26)
         = E(x²) - 2(x̄)² + (x̄)²
         = E(x²) - (x̄)²
From eq. (26) (which is true for any random
variable x), we see that if the mean is 0
(E(x) = 0), then the last term in eq. (26) is 0, and:

    σ_x² = E(x²)    (27)
In regard to the calibration error, characterized
by the random variable η_i, we then have:

    E(η_i) = 0       (zero mean)
    E(η_i²) = σ_i²   (variance)

Note that the larger the variance, the less
accurate the measuring device.
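A quick Monte Carlo check makes eqs. (26)-(27) concrete. The sketch below (the σ value and sample size are illustrative choices, not from the notes) draws zero-mean errors and confirms that E(x²) - (x̄)² matches the variance:

```python
import numpy as np

# Monte Carlo illustration of eqs. (26)-(27): for a zero-mean error,
# the variance equals E(x^2). sigma and sample size are illustrative.
rng = np.random.default_rng(seed=1)
sigma = 0.01
eta = rng.normal(loc=0.0, scale=sigma, size=200_000)

mean = eta.mean()
var_from_moments = (eta ** 2).mean() - mean ** 2  # E(x^2) - (xbar)^2, eq. (26)
print(mean, var_from_moments)  # mean near 0, variance near sigma^2 = 1e-4
```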
It can be shown [4] from eq. (26) that:

    σ_xy = cov(x, y) = E(xy) - E(x) E(y)    (29)

If x and y are independent, then
E(xy) = E(x)E(y). Therefore, for two
independent random variables, the covariance is:

    σ_xy = cov(x, y) = 0
Two variables having 0 covariance are
uncorrelated, i.e., variation in one gives no
information about variation in the other.
Assuming the measurement errors are independent, their
covariance matrix is diagonal:

        [ σ1²   0   …   0   ]
    R = [  0   σ2²  …   0   ]    (31)
        [  ⋮    ⋮        ⋮   ]
        [  0    0   …   σm² ]

where it is assumed that we have m measuring
instruments. We use (31) in our development.
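In code, R and its inverse are simple diagonal constructions; the σ values below are illustrative, not from the notes:

```python
import numpy as np

# Building the covariance matrix R of eq. (31) for m = 4 independent
# instruments, and the weighting matrix R^-1 used later in the WLS
# objective. The sigma values are illustrative.
sigmas = np.array([0.01, 0.02, 0.01, 0.05])
R = np.diag(sigmas ** 2)             # diagonal because errors are independent
R_inv = np.diag(1.0 / sigmas ** 2)   # weight 1/sigma_i^2 per measurement

# The least accurate instrument (largest sigma) gets the smallest weight.
print(np.diag(R_inv))
```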
Each measurement is a function of the states plus an error:

    z_i = h_i(x) + η_i    (33)

For a voltage measurement, the function h_i is
very simple:

    h_i(x) = V_k    (34)

where measurement i occurs at bus k.
inductive line.
Shunt susceptance at bus p of bp (which
In vector form:

    True values: z = [ z1  z2  …  zm ]^T    (38)

    Errors: η = [ η1  η2  …  ηm ]^T    (39)
a measurement)
Solve for x (it would need to be non-linear
Recall equation (10), which expressed our sum
of squared errors as:

    J = (1/2) (b - Ax)^T (b - Ax)    (10)
Similarly, we express the sum of squared errors as:

    J = (1/2) Σ_{i=1}^m η_i² = (1/2) (z - h(x))^T (z - h(x))    (44)

We denote the above as J because we will find
it convenient to modify it a bit. The idea is to
weight each term by the accuracy of its
measuring device:

    Bad device → large σ_i² → small weight 1/σ_i²
Therefore, we will modify eq. (44) to be:

    J = (1/2) Σ_{i=1}^m η_i²/σ_i²    (45)

Substituting η_i = z_i - h_i(x):

    J = (1/2) Σ_{i=1}^m (z_i - h_i(x))²/σ_i²    (47)
The problem then becomes to find the x that
minimizes J. Note, however, that h is nonlinear,
and so our solution will necessarily be iterative:

    minimize J = (1/2) Σ_{i=1}^m η_i²/σ_i²
               = (1/2) (z - h(x))^T R^{-1} (z - h(x))
               = (1/2) Σ_{i=1}^m (z_i - h_i(x))²/σ_i²    (48)
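As a sanity check, the summation form and the matrix form of (48) give the same number. The sketch below uses made-up measurements and model values (all numbers are illustrative):

```python
import numpy as np

# Evaluating J of eq. (48) two ways: the weighted summation form and
# the matrix form (z - h(x))^T R^-1 (z - h(x)). Numbers are illustrative.
z = np.array([1.02, 0.61, 0.22])       # measurements
h_of_x = np.array([1.00, 0.63, 0.20])  # model values h(x) at some x
sigmas = np.array([0.01, 0.02, 0.02])
R_inv = np.diag(1.0 / sigmas ** 2)

r = z - h_of_x
J_sum = 0.5 * np.sum(r ** 2 / sigmas ** 2)
J_mat = 0.5 * r @ R_inv @ r
print(J_sum, J_mat)  # the two forms agree
```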
We can apply first-order conditions, which
means that all first derivatives of the objective
function with respect to the decision variables
must be zero, i.e., that ∇_x J = 0. That is,

    ∇_x J = [ ∂J/∂x1  ∂J/∂x2  …  ∂J/∂xn ]^T = 0    (49)
For a single element of ∇_x J, we start from:

    J = (1/2) Σ_{i=1}^m (z_i - h_i(x))²/σ_i²    (50)
Differentiating with respect to x1:

    ∂J/∂x1 = (1/2) Σ_{i=1}^m [-2(z_i - h_i(x))/σ_i²] ∂h_i(x)/∂x1
           = -Σ_{i=1}^m [(z_i - h_i(x))/σ_i²] ∂h_i(x)/∂x1    (51)

This can be written in matrix form as:

    ∂J/∂x1 = -[ ∂h1(x)/∂x1  ∂h2(x)/∂x1  …  ∂hm(x)/∂x1 ] R^{-1} [ z1 - h1(x)  z2 - h2(x)  …  zm - hm(x) ]^T    (52)
And we then see how to write the vector of
derivatives, according to:

              [ ∂h1/∂x1  ∂h2/∂x1  …  ∂hm/∂x1 ]
    ∇_x J = - [ ∂h1/∂x2  ∂h2/∂x2  …  ∂hm/∂x2 ] R^{-1} [ z1 - h1(x)  z2 - h2(x)  …  zm - hm(x) ]^T    (53)
              [    ⋮         ⋮            ⋮   ]
              [ ∂h1/∂xn  ∂h2/∂xn  …  ∂hm/∂xn ]
We recognize the matrix of partial derivatives in
(53) as a sort of Jacobian matrix, but:
(a) it is n×m, i.e., it is not square, and
(b) unlike a standard Jacobian, here the rows
    vary with the variable (x1, x2, …), not the
    function (h1, h2, …).
Let's define a matrix H that does not have the
second attribute (b), i.e.,

        [ ∂h1/∂x1  ∂h1/∂x2  …  ∂h1/∂xn ]
    H = [ ∂h2/∂x1  ∂h2/∂x2  …  ∂h2/∂xn ]    (54)
        [    ⋮         ⋮            ⋮   ]
        [ ∂hm/∂x1  ∂hm/∂x2  …  ∂hm/∂xn ]

Note that H is m×n, and it is the transpose of the
first matrix in eq. (53).
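One way to check an H built by hand is to differentiate h numerically. The sketch below uses a simple linear stand-in for h (not the notes' power-flow expressions) and builds the m×n matrix of (54) column by column with central differences:

```python
import numpy as np

# Finite-difference construction of H (eq. 54): H[i, j] = dh_i/dx_j.
# h here is a simple linear stand-in with n = 2 states, m = 3 measurements.
def h(x):
    th1, th2 = x
    return np.array([5.0 * th1 - 5.0 * th2,
                     2.5 * th1,
                     -4.0 * th2])

def jacobian_fd(h, x, eps=1e-6):
    m, n = h(x).size, x.size
    H = np.zeros((m, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        H[:, j] = (h(x + dx) - h(x - dx)) / (2.0 * eps)  # central difference
    return H

H = jacobian_fd(h, np.zeros(2))
print(H)  # row i holds dh_i/dx_1 ... dh_i/dx_n
```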
In terms of H, setting (53) to zero gives:

    ∇_x J = -H^T R^{-1} (z - h(x)) = 0    (55)

Because there are n elements in the gradient
vector on the left, we observe that eq. (55) gives
n equations. Since there are n variables in x, it is
possible to solve eq. (55) explicitly for x.
Recall that in a Taylor series expansion of G(x)
(the gradient function that we wish to drive to
zero), the higher-order terms (h.o.t.) contain
products of Δx, and so if Δx is relatively small,
terms containing products of Δx will be very
small, and in fact negligible. So we will neglect
the h.o.t. in eq. (57). This results in:

    G(x⁰ + Δx) ≈ G(x⁰) + ∇_x G(x)|_{x=x⁰} Δx    (58)
Or we can write:

    x^(k+1) = x^(k) + Δx    (60)

Evaluating G at the better guess, we have:

    G(x^(k+1)) = G(x^(k) + Δx) ≈ G(x^(k)) + ∇_x G(x)|_{x=x^(k)} Δx    (61)

Setting this to zero and solving for Δx:

    Δx = -[∇_x G(x)|_{x=x^(k)}]^{-1} G(x^(k))    (63)
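The update (63) is just Newton's method applied to G(x) = 0. A one-dimensional sketch (with an arbitrary stand-in G, not the state-estimation gradient) shows the mechanics:

```python
# Newton iteration for G(x) = 0 in one dimension, illustrating eqs. (60)
# and (63). G(x) = x^2 - 2 is an arbitrary stand-in; the root is sqrt(2).
def G(x):
    return x * x - 2.0

def G_prime(x):
    return 2.0 * x

x = 1.0                       # initial guess x^(0)
for _ in range(8):
    dx = -G(x) / G_prime(x)   # eq. (63): dx = -[dG/dx]^-1 G(x)
    x = x + dx                # eq. (60): x^(k+1) = x^(k) + dx
print(x)  # converges to sqrt(2) = 1.41421356...
```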
Since there are n functional expressions in G and
n derivatives to take for each one, we can see that
∇_x G(x)|_{x=x^(k)} will be n×n, a square matrix.
derivatives of the measurement equations do not
change. Is this a good assumption? Remember,
from eq. (54),

                     [ ∂h1/∂x1  ∂h1/∂x2  …  ∂h1/∂xn ]
    H = ∂h(x)/∂x =   [ ∂h2/∂x1  ∂h2/∂x2  …  ∂h2/∂xn ]    (54)
                     [    ⋮         ⋮            ⋮   ]
                     [ ∂hm/∂x1  ∂hm/∂x2  …  ∂hm/∂xn ]

where the functions h_i are the expressions for
the measurements (voltages, real and reactive
power flows) in terms of the states (angles and
voltage magnitudes).
    ∇_x G(x) = ∂G(x)/∂x = ∂/∂x [ -H^T(x) R^{-1} (z - h(x)) ]
             = -H^T(x) R^{-1} ∂/∂x [ z - h(x) ]
             = H^T(x) R^{-1} ∂h(x)/∂x    (66)

But we recognize from eq. (54) the term ∂h(x)/∂x
in eq. (66) as H. Therefore, eq. (66) becomes:

    ∇_x G(x) = H^T(x) R^{-1} H(x)    (67)

Making this substitution into eq. (63),

    Δx = -[∇_x G(x)|_{x=x^(k)}]^{-1} G(x^(k))    (63)

results in:

    Δx = -[H^T(x) R^{-1} H(x)]^{-1}|_{x=x^(k)} G(x^(k))    (68)
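Equations (63), (67), and (68) combine into one state-estimation step. The sketch below is a generic implementation under the constant-H assumption discussed above; h and H are passed in as callables so the sketch stays independent of any particular network:

```python
import numpy as np

# One iteration of eq. (68): with G(x) = -H^T R^-1 (z - h(x)) from (66),
# the correction is dx = [H^T R^-1 H]^-1 H^T R^-1 (z - h(x)).
# h maps the n states to the m predicted measurements; H returns the
# m x n Jacobian of h at x (eq. 54).
def wls_step(x, z, h, H, R_inv):
    Hx = H(x)
    r = z - h(x)                       # residual z - h(x)
    gain = Hx.T @ R_inv @ Hx           # n x n gain matrix, eq. (67)
    rhs = Hx.T @ R_inv @ r             # equals -G(x)
    return np.linalg.solve(gain, rhs)  # dx of eq. (68)

# Usage: iterate x = x + wls_step(x, z, h, H, R_inv) until dx is small.
```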
6.0 Solution Algorithm

Given:
  measurements z = [z1, …, zm]^T
  standard deviations σ1, …, σm
  the network
Example:
Consider the system below. Real power measurements are taken as follows: P12=0.62 pu, P13=0.06 pu,
and P32=0.37 pu. All voltages are 1.0 per unit, and all measurement devices have σ=0.01. Assume the bus
3 angle is the reference, so the state vector is x=[θ1 θ2]^T. Your textbook solves this problem using
DC power flow equations on pp. 467-471, which are reviewed below. Your homework requests that you
repeat the analysis using AC power flow equations.

Fig.: Three-bus system with line reactances x12=0.2 pu, x13=0.4 pu, and x23=0.25 pu, and
measurements P12=0.62 pu, P13=0.06 pu, P32=0.37 pu.
The measurement functions, using DC power flow, are:

    h1(x) = P12 = (1/0.2)(θ1 - θ2) = 5θ1 - 5θ2
    h2(x) = P13 = (1/0.4)(θ1 - θ3) = 2.5θ1
    h3(x) = P32 = (1/0.25)(θ3 - θ2) = -4θ2

(recall θ3 = 0, since bus 3 is the reference).
b) Compute H(x^(0)) and h(x^(0)) for an estimate x^(0) = [θ1 θ2]^T = [0.024 -0.093]^T (in radians).
                     [ ∂h1/∂θ1  ∂h1/∂θ2 ]   [ ∂P12/∂θ1  ∂P12/∂θ2 ]   [ 5    -5 ]
    H = ∂h(x)/∂x =   [ ∂h2/∂θ1  ∂h2/∂θ2 ] = [ ∂P13/∂θ1  ∂P13/∂θ2 ] = [ 2.5   0 ]
                     [ ∂h3/∂θ1  ∂h3/∂θ2 ]   [ ∂P32/∂θ1  ∂P32/∂θ2 ]   [ 0    -4 ]
In this particular case, H is a constant matrix. In general, it will be a function of the
states and will thus change from iteration to iteration. It is constant in this case since
we have used linearized (DC) power flow expressions.
c) Compute A = H^T(x) R^{-1} H(x)|_{x=x^(0)} and b = H^T(x) R^{-1} (z - h(x))|_{x=x^(0)}.
though we started from different guesses, because the h_i functions are linear (and therefore
the H matrix is constant).

    r = z - h(x)|_{x=x^(k)}

To get this, we need to recompute h(x), using x = [θ1 θ2]^T = [0.0286 -0.0943]^T, as follows:
    h1(x) = P12 = 5θ1 - 5θ2 = 5(0.0286) - 5(-0.0943) = 0.6145
    h2(x) = P13 = 2.5θ1 = 2.5(0.0286) = 0.0715
    h3(x) = P32 = -4θ2 = -4(-0.0943) = 0.3772

The measurements are P12=0.62 pu, P13=0.06 pu, and P32=0.37 pu, and so the residuals
are given by:

        [ 0.62 - 0.6145 ]   [  0.0055 ]
    r = [ 0.06 - 0.0715 ] = [ -0.0115 ]
        [ 0.37 - 0.3772 ]   [ -0.0072 ]
Compare to our previous residual:

    [ 0.62 - 0.585 ]   [  0.035 ]
    [ 0.06 - 0.06  ] = [  0     ]
    [ 0.37 - 0.372 ]   [ -0.002 ]
and we see that it got better in r1 and worse in r2 and r3. However, the squared error
for this iteration is 0.000214, and that for the previous iteration is 0.0012, so in
fact our solution has improved. We are able to look at just the squared error in this example,
instead of the weighted squared error, because all of the weights are the same.

Note that your text, at the bottom of page 469, calculates J. This is the objective function,
the weighted squared error, which is a measure of how good our solution is. They
get 2.14, which can be compared to our above calculations by multiplying by σ² = 0.0001,
which gives 0.000214, the same as what we obtained.
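Because the h_i here are linear, the whole DC example collapses to a single WLS solve. The following sketch reproduces the numbers above (a check on the arithmetic, not the textbook's own code):

```python
import numpy as np

# WLS state estimate for the DC three-bus example: since h(x) = H x is
# linear, one solve of the normal equations gives the final answer.
H = np.array([[5.0, -5.0],    # dP12/dth1, dP12/dth2
              [2.5,  0.0],    # dP13/dth1, dP13/dth2
              [0.0, -4.0]])   # dP32/dth1, dP32/dth2
z = np.array([0.62, 0.06, 0.37])  # measured P12, P13, P32 (pu)
sigma = 0.01
R_inv = np.eye(3) / sigma ** 2

gain = H.T @ R_inv @ H
x_hat = np.linalg.solve(gain, H.T @ R_inv @ z)
print(x_hat)  # approx [0.0286, -0.0943] rad, as in the notes

r = z - H @ x_hat      # residuals at the solution
print(r @ R_inv @ r)   # weighted squared error, approx 2.14
```

The last line prints the weighted squared error without the 1/2 factor, matching the text's J ≈ 2.14; multiplying by σ² = 0.0001 recovers the raw squared error 0.000214.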
References:
[1] F. Schweppe, J. Wildes, and D. Rom, "Power system static state estimation: Parts I, II, and III," Power Industry
Computer Conference (PICA), Denver, Colorado, June 1969.
[2] A. Monticelli, State Estimation in Electric Power Systems: A Generalized Approach, Kluwer, Boston, 1999.
[3] G. Strang, Linear Algebra and Its Applications, third edition, Harcourt Brace, 1988.
[4] A. Papoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill, New York, 1984.