A Factorization Approach
Part II
Synthesis Lectures on Control
and Mechatronics
Editors
Chaouki Abdallah, University of New Mexico
Mark W. Spong, University of Texas at Dallas
Control System Synthesis: A Factorization Approach, Part II
Mathukumalli Vidyasagar
2011
Control System Synthesis: A Factorization Approach, Part I
Mathukumalli Vidyasagar
2011
The Reaction Wheel Pendulum
Daniel J. Block, Karl J. Åström, and Mark W. Spong
2007
Copyright © 2011 by Morgan & Claypool
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations in printed reviews, without the prior permission of the publisher.
Control System Synthesis: A Factorization Approach, Part II
Mathukumalli Vidyasagar
www.morganclaypool.com
ISBN: 9781608456611 paperback
ISBN: 9781608456635 ebook (Part II)
DOI 10.2200/S00358ED1V01Y201105CRM003
A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON CONTROL AND MECHATRONICS
Lecture #3
Series Editors: Chaouki Abdallah, University of New Mexico
Mark W. Spong, University of Texas at Dallas
Series ISSN
Synthesis Lectures on Control and Mechatronics
Print 1939-0564 Electronic 1939-0572
Control System Synthesis
A Factorization Approach
Part II
Mathukumalli Vidyasagar
University of Texas at Dallas
SYNTHESIS LECTURES ON CONTROL AND MECHATRONICS #3
ABSTRACT
This book introduces the so-called stable factorization approach to the synthesis of feedback controllers for linear control systems. The key to this approach is to view the multi-input, multi-output (MIMO) plant for which one wishes to design a controller as a matrix over the fraction field F associated with a commutative ring with identity, denoted by R, which also has no divisors of zero. In this setting, the set of single-input, single-output (SISO) stable control systems is precisely the ring R, while the set of stable MIMO control systems is the set of matrices whose elements all belong to R. The set of unstable, meaning not necessarily stable, control systems is then taken to be the field of fractions F associated with R in the SISO case, and the set of matrices with elements in F in the MIMO case.
The central notion introduced in the book is that, in most situations of practical interest, every matrix P whose elements belong to F can be factored as a "ratio" of two matrices N, D whose elements belong to R, in such a way that N, D are coprime. In the familiar case where the ring R corresponds to the set of bounded-input, bounded-output (BIBO)-stable rational transfer functions, coprimeness is equivalent to the two functions not having any common zeros in the closed right half-plane including infinity. However, the notion of coprimeness extends readily to discrete-time systems, distributed-parameter systems in both the continuous- as well as discrete-time domains, and to multi-dimensional systems. Thus the stable factorization approach enables one to capture all these situations within a common framework.
The key result in the stable factorization approach is the parametrization of all controllers that stabilize a given plant. It is shown that the set of all stabilizing controllers can be parametrized by a single parameter R, whose elements all belong to R. Moreover, every transfer matrix in the closed-loop system is an affine function of the design parameter R. Thus problems of reliable stabilization, disturbance rejection, robust stabilization, etc. can all be formulated in terms of choosing an appropriate R.
This is a reprint of the book Control System Synthesis: A Factorization Approach originally
published by M.I.T. Press in 1985.
KEYWORDS
stable factorization, coprimeness, Bezout identity, simultaneous stabilization, robust
stabilization, robust regulation, genericity
Dedicated to the memories of
My paternal grandfather,
Mathukumalli Narasimharao (1881-1965)
and
My father,
Prof. Mathukumalli Venkata Subbarao (1921-2006)
May God protect us together
May God nourish us together
May we work together with great energy
May our study be vigorous and effective
May we not have ill feelings
Let there be peace, peace, peace
(A Vedic prayer recited by the teacher and student(s) together at the commencement of studies; found in several Upanishads, including the Taittiriyopanishad.)
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Preface for the Original Edition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
6  Filtering and Sensitivity Minimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.2 Some Facts about Hardy Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6.3 Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.4 Sensitivity Minimization: Scalar Case. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.5 Sensitivity Minimization: Fat Plant Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
6.6 Sensitivity Minimization: General Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
6.7 Two-Parameter Compensator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
7  Robustness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
7.2 Graph Topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
7.3 Graph Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
7.4 Robustness Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
7.5 Robust and Optimal Regulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
7.6 Genericity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
8  Extensions to General Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
8.1 Coprime Factorizations over a Commutative Ring . . . . . . . . . . . . . . . . . . . . . . . . . 279
8.2 Coprime Factorizations for Distributed Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 298
8.3 Controller Synthesis over a General Ring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
A  Algebraic Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
A.1 Rings, Fields and Ideals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
A.2 Rings and Fields of Fractions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
A.3 Principal Ideal Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
A.4 Euclidean Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
B  Preliminaries on Matrix Rings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
B.1 Matrices and Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
B.2 Canonical Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
C  Topological Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
C.1 Topological Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
C.2 Topological Rings and Normed Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
Author's Biography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Preface
This is a reprint of my book Control System Synthesis: A Factorization Approach, originally
published by M.I.T. Press in 1985. That book went out of print about ten years after its publication,
but nevertheless continued to be cited in the published literature. I would like to believe that this is
because the contents of the book are still relevant to linear control theory as it is currently practiced.
I am therefore grateful to Morgan & Claypool for their decision to reprint my book, so that the
current generation of graduate students and researchers are able to access its contents without having
to photocopy it surreptitiously from the library.
The original text was created by me using the troff text processing system created at Bell Laboratories. Indeed, the 1985 book was the first time that I produced a camera-ready copy, instead of subjecting myself to the mercy and vagaries of typists and/or typesetters. Back then this was a novel thing to do; however, today it is standard practice. Dr. C. L. Tondo of T&T TechWorks, Inc. and his troops have expertly keyed the entire text into LaTeX, for which I am extremely grateful. There is no doubt that the physical appearance of the text has been significantly improved as a result of this switch.
With the entire text at my disposal, I could in principle have made major changes. After thinking over the matter, I decided to stick with the original text, and restricted myself to correcting typographical errors. Upon re-reading the original text after a gap of perhaps twenty years, I felt that the entire material still continues to be relevant, except for Chapter 6 on H∞-control, which was based on the earliest papers on the subject; these are cited in the book as references [40] and [115]. A few years after my book was published, the influential book [37] appeared. This was followed in short order by the paper [28], which gives a complete state-space computational procedure for H∞-control; this paper is perhaps the most cited paper in the history of control theory. Subsequently, the book [29] gave an elementary introduction to the theory, while [45, 117] are advanced treatments of the subject. It would therefore have required a massive effort on my part to rewrite Chapter 6 of the book to bring it up to date, and I felt that I could contribute nothing beyond the excellent texts already in print. So I decided to leave the book as it is, on the basis that the conceptual framework for H∞-control remains unchanged.
\frac{1}{2\pi} \int_{-\infty}^{\infty} \| z(j\omega) \|^2 \, d\omega = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left\| [W_1 (I+PC)^{-1} W_2](j\omega) \right\|^2 \, d\omega . \quad (6.1.4)
Case II. Sensitivity Minimization. Suppose u is a square-integrable input with an L_2-norm no larger than one, and suppose it is desired to minimize the largest L_2-norm that z may have in response to such an input. Here again the objective is to minimize the weighted energy of the error e, but in this case the input u is assumed to be any L_2-input of unit norm or less, rather than white noise as in Case I. In this case the objective function to be minimized is

J = \sup_{\|u\|_2 \le 1} \|z\|_2 = \sup_{\|u\|_2 \le 1} \| W_1 (I+PC)^{-1} W_2 u \|_2 = \sup_{\omega \in \mathbb{R}} \bar\sigma\big( [W_1 (I+PC)^{-1} W_2](j\omega) \big) , \quad (6.1.5)

where \|\cdot\|_2 denotes the L_2-norm, and the last equality is a standard result from [26, 94], as pointed out in the next section. Since in this case the quantity J measures the maximum error in response to a unit-norm input u, this problem is sometimes referred to as the minimax sensitivity problem.
In both the above cases, as well as in others that the reader may construct, the minimization
of J in (6.1.1) is required to be carried out subject to the constraint that C stabilizes P. As things
stand, the compensator C is thus constrained, and also enters the objective function J in a highly
nonlinear fashion. However, using the results of Chapter 5, it is possible to transform this into an
unconstrained optimization problem involving a much simpler objective function.
The set of all compensators C that stabilize a given plant P is neatly parametrized in Theorem 5.2.1. Let (N, D), (\tilde D, \tilde N) be any r.c.f. and any l.c.f. of P, and select \tilde X, \tilde Y \in M(S) such that \tilde N \tilde X + \tilde D \tilde Y = I. Then, from this theorem,

S(P) = \{ (\tilde X + DR)(\tilde Y - NR)^{-1} : R \in M(S) \text{ and } |\tilde Y - NR| \neq 0 \} . \quad (6.1.6)

Moreover, if C \in S(P) equals (\tilde X + DR)(\tilde Y - NR)^{-1} for a certain R \in M(S), then (cf. (5.2.7))

(I + PC)^{-1} = (\tilde Y - NR) \tilde D . \quad (6.1.7)
Thus, the problem of minimizing J(R) = \| W_1 (I+PC)^{-1} W_2 \| subject to the constraint that C \in S(P) is equivalent to minimizing

J(R) = \| W_1 (\tilde Y - NR) \tilde D W_2 \| , \quad (6.1.8)

subject to the constraint that |\tilde Y - NR| \neq 0. Now, from Lemma 5.2.4, |\tilde Y - NR| \neq 0 for all R belonging to an open dense subset of M(S), and if P is strictly proper, then from Lemma 5.2.6, |\tilde Y - NR| \neq 0 for all R \in M(S). Hence, the original problem of minimizing the weighted performance measure J of (6.1.1) is equivalent, in the case where P is strictly proper, to an unconstrained minimization of the objective function J(\cdot) of (6.1.8) with respect to R. Moreover, the original expression W_1 (I+PC)^{-1} W_2 has been replaced by W_1 (\tilde Y - NR) \tilde D W_2, where the noteworthy feature is that the quantity inside the norm is affine in R. If the plant is not strictly proper, the quantity R is constrained to the extent that it has to satisfy the nonsingularity constraint. However, since this constraint is only operational on a nowhere dense set, it can be ignored during the minimization and can be tested once an unconstrained minimum has been found. If by chance the optimal choice of R does not satisfy the nonsingularity constraint, then arbitrarily small perturbations in R will satisfy the constraint and will result in only an arbitrarily small increase in the performance measure (since the latter is a continuous function).
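The affine dependence of the closed-loop sensitivity on the free parameter can be checked numerically. The following sketch verifies, on a grid of frequencies, that (1 + pc)^{-1} = (y - nr)d for a scalar plant; the particular plant, factorization, and parameter below are illustrative assumptions, not data from the text.

```python
# Scalar sketch (example data assumed): plant p(s) = 1/(s-1) with stable
# coprime factorization n = 1/(s+1), d = (s-1)/(s+1), and Bezout pair
# x = 2, y = 1, so that n*x + d*y = 1.
n = lambda s: 1 / (s + 1)
d = lambda s: (s - 1) / (s + 1)
x = lambda s: 2
y = lambda s: 1
p = lambda s: n(s) / d(s)
r = lambda s: 1 / (s + 2)          # any stable r parametrizes a controller

def c(s):
    """Stabilizing controller (x + d r)/(y - n r) from the parametrization."""
    return (x(s) + d(s) * r(s)) / (y(s) - n(s) * r(s))

# The sensitivity is affine in r: (1 + p c)^(-1) = (y - n r) d.
for w in [0.1, 1.0, 3.0, 10.0]:
    s = 1j * w
    lhs = 1 / (1 + p(s) * c(s))
    rhs = (y(s) - n(s) * r(s)) * d(s)
    assert abs(lhs - rhs) < 1e-12
    assert abs(n(s) * x(s) + d(s) * y(s) - 1) < 1e-12
print("sensitivity equals (y - n r) d at all test frequencies")
```

Because the right-hand side is affine in r, any norm of it is a convex function of the free parameter, which is exactly what makes the minimization problems of this chapter tractable.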
Up to now we have only talked about minimizing the weighted energy of the error signal in response to various types of inputs. Depending on the application, one might prefer instead to minimize the weighted energy of some other signal within the loop. Since the transfer matrix between any two points can be expressed as an affine function of R using the results of Chapter 5, such a minimization is qualitatively no different from the problems discussed earlier. For instance, if it is desired to minimize the weighted energy of the plant output in response to an input u, then

J = \| W_1 PC (I+PC)^{-1} W_2 \| . \quad (6.1.9)
However, if C \in S(P), then by (5.2.7),

PC (I+PC)^{-1} = (\tilde X + DR) \tilde N , \quad (6.1.10)

for some R \in M(S), so that

J = \| W_1 (\tilde X + DR) \tilde N W_2 \| . \quad (6.1.11)

In fact, it is easy to see that the general problem becomes one of minimizing

J = \| F - GRH \| , \quad (6.1.12)
where F, G, H \in M(S) are given matrices.
Finally, the above discussion applies even when a two-parameter control scheme is used. In this case, by (5.6.15), all transfer matrices are affine functions of the matrix S = [Q \; R], and the general cost function assumes the form

J = \| F - GSH \| , \quad (6.1.13)

where F, G, H \in M(S) are given matrices. The discrete-time analogs of the above problems are obtained simply by replacing S by R_\infty.
6.2 SOME FACTS ABOUT HARDY SPACES
In the previous section, it was shown that several problems of filtering and sensitivity minimization can be formulated as follows: Minimize

J(R) = \| F - GRH \| , \quad (6.2.1)

where F, G, H \in M(S) are given and R \in M(S) is to be chosen. The solution of this class of problems is facilitated by the use of some concepts and results from the theory of H_p (or Hardy) spaces. The purpose of this section is to present a listing of the basic concepts and results from this theory, mostly without proof. The reader is referred to [34] for an excellent treatment of this theory, and to [59] for some specialized results.
Throughout this section, D denotes the open unit disc in the complex plane, and \bar D the closed unit disc. For each p \in [1, \infty), the set H_p consists of all analytic functions f on the open unit disc D with the property that

\sup_{r \in (0,1)} \left[ \frac{1}{2\pi} \int_0^{2\pi} |f(re^{j\theta})|^p \, d\theta \right]^{1/p} =: \|f\|_p < \infty . \quad (6.2.2)

The set H_\infty consists of all bounded analytic functions f on D, with

\sup_{z \in D} |f(z)| =: \|f\|_\infty < \infty . \quad (6.2.3)
If f \in H_p for some p \in [1, \infty], then strictly speaking f(z) is only defined when |z| < 1. However, for almost all \theta \in [0, 2\pi], it is possible to define f(e^{j\theta}) as a nontangential limit of f(z) as z \to e^{j\theta}. If f \in H_p, then its boundary value function f(e^{j\theta}) belongs to the Lebesgue space L_p[0, 2\pi]. Moreover,

\|f\|_p = \left[ \frac{1}{2\pi} \int_0^{2\pi} |f(e^{j\theta})|^p \, d\theta \right]^{1/p} \quad \text{if } 1 \le p < \infty ,

\|f\|_\infty = \operatorname{ess\,sup}_{\theta \in [0,2\pi]} |f(e^{j\theta})| \quad \text{if } p = \infty . \quad (6.2.4)
In other words, the norm \|f\|_p of a function f \in H_p can be computed from its boundary value function.
For each p \in [1, \infty], the space H_p is a Banach space. In addition, H_2 is a (complex) Hilbert space with inner product defined by

\langle f, g \rangle = \frac{1}{2\pi} \int_0^{2\pi} f(e^{j\theta}) \overline{g(e^{j\theta})} \, d\theta , \quad (6.2.5)
where the bar denotes complex conjugation. It turns out that the set of functions \{ z^i \}_{i=0}^{\infty} forms an orthonormal basis for H_2. Hence, a function f(\cdot) analytic in D belongs to H_2 if and only if it can be expanded in the form

f(z) = \sum_{i=0}^{\infty} f_i z^i , \quad (6.2.6)

where the sequence \{f_i\} is square-summable and the infinite sum converges in the sense of the H_2-norm. Thus, one can think of H_2 as the set of z-transforms of square-summable sequences.
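The identification of H_2 with z-transforms of square-summable sequences can be illustrated numerically. The sketch below computes the H_2-norm of an example function (an assumption chosen for illustration) both from its Taylor coefficients and from the boundary-value integral in (6.2.4).

```python
import numpy as np

# Example (assumed for illustration): f(z) = 1/(1 - 0.5 z) = sum (0.5)^i z^i.
# Its H2-norm can be computed from the coefficients (Parseval) or from the
# boundary integral (1/2π) ∫ |f(e^{jθ})|² dθ.
N = 4096
theta = 2 * np.pi * np.arange(N) / N
f_boundary = 1.0 / (1.0 - 0.5 * np.exp(1j * theta))

# Riemann-sum approximation of the boundary integral
norm_boundary = np.sqrt(np.mean(np.abs(f_boundary) ** 2))

# sum of squared coefficients: Σ 0.25^i = 1/(1 - 0.25) = 4/3
norm_coeffs = np.sqrt(1.0 / (1.0 - 0.25))

assert np.isclose(norm_boundary, norm_coeffs, atol=1e-6)
print(round(norm_coeffs ** 2, 6))  # 1.333333
```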
Similarly, if \{f_i\} is an absolutely summable sequence, then

f(z) = \sum_{i=0}^{\infty} f_i z^i \quad (6.2.7)

belongs to H_\infty, and in fact

\|f\|_\infty \le \sum_{i=0}^{\infty} |f_i| . \quad (6.2.8)

In general, the converse of the above statement is false: Not every function in H_\infty is of the form (6.2.7) where the coefficient sequence is absolutely summable. However, it is easy to verify that a rational function belongs to H_\infty if and only if it has no poles in the closed unit disc. Under pointwise multiplication of boundary functions, H_\infty becomes a Banach algebra.
Fact 6.2.1 Suppose f \in H_\infty; then the multiplication operator T_f : g \mapsto fg maps H_2 into H_2, and its induced norm equals \|f\|_\infty; or in other words,

\|T_f\| := \sup_{\|g\|_2 \le 1} \|fg\|_2 = \|f\|_\infty . \quad (6.2.10)
Suppose m is a nonnegative integer, and that \{a_i\} is a sequence of nonzero complex numbers such that |a_i| < 1 for all i, and the sum \sum_{i=1}^{\infty} (1 - |a_i|) converges. Then the function

b(z) = z^m \prod_{i=1}^{\infty} \frac{|a_i|}{a_i} \cdot \frac{z - a_i}{1 - \bar a_i z} \quad (6.2.11)

is called the Blaschke product corresponding to m and the sequence \{a_i\}. The function b(\cdot) is an element of H_\infty, and |b(e^{j\theta})| = 1 for almost all \theta. If only finitely many a_1, \ldots, a_l are given, then the function

b(z) = z^m \prod_{i=1}^{l} \frac{|a_i|}{a_i} \cdot \frac{z - a_i}{1 - \bar a_i z} \quad (6.2.12)

is called the associated (finite) Blaschke product. It too belongs to H_\infty. A function in H_\infty with no zeros in the closed unit disc is a unit of H_\infty, i.e., it has an inverse in H_\infty.
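The defining properties of a finite Blaschke product — modulus one everywhere on the unit circle, with zeros exactly at the a_i and at the origin with multiplicity m — can be checked numerically. A minimal sketch, with assumed zeros a_i:

```python
import numpy as np

# Finite Blaschke product as in (6.2.12); the zeros a and multiplicity m
# below are illustrative assumptions.
a = np.array([0.5, -0.3 + 0.4j])
m = 1  # multiplicity of the zero at the origin

def blaschke(z):
    out = z ** m
    for ai in a:
        out = out * (abs(ai) / ai) * (z - ai) / (1 - np.conj(ai) * z)
    return out

theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
on_circle = blaschke(np.exp(1j * theta))
assert np.allclose(np.abs(on_circle), 1.0)  # inner: |b(e^{jθ})| = 1
assert np.allclose(blaschke(a), 0.0)        # the a_i are zeros of b
print("finite Blaschke product has modulus one on the unit circle")
```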
The factorization described above is a special case of what is known as an inner–outer factorization. A function g \in H_\infty is inner if |g(e^{j\theta})| = 1 for almost all \theta, and a rational g \in H_\infty is outer if it has no zeros in the open unit disc D. For matrices, the relevant inner product on H_2^{n \times m} is

\langle F, G \rangle = \frac{1}{2\pi} \int_0^{2\pi} \operatorname{tr}[ F^*(e^{j\theta}) G(e^{j\theta}) ] \, d\theta , \quad (6.2.13)

where {}^* denotes the conjugate transpose. Corresponding to every F \in H_\infty^{n \times m} is the norm

\|F\|_\infty = \sup_{\theta \in [0,2\pi]} \bar\sigma( F(e^{j\theta}) ) , \quad (6.2.15)

where \bar\sigma(\cdot) denotes the largest singular value.
The matrix analog of Fact 6.2.1 is stated next. A matrix G \in H_\infty^{n \times m} is inner if G^*(e^{j\theta}) G(e^{j\theta}) = I for almost all \theta \in [0, 2\pi] (in this definition it is important to note that n need not equal m), and a matrix F \in H_\infty^{n \times m} is outer if it has full row rank^1 throughout the open unit disc. A rational matrix belongs to H_\infty^{n \times m} if and only if each of its entries is a rational function in H_\infty, and the norm defined in (2.2.21) is consistent with (6.2.4).
Fact 6.2.2 Suppose F \in H_\infty^{n \times n} is inner, G \in M(H_\infty) and H \in M(H_2); then \|FG\|_\infty = \|G\|_\infty and \|FH\|_2 = \|H\|_2.
^1 This means that at least one full-size minor of G is nonzero.
One can also define the class H_p over the open right half-plane, rather than the open unit disc as is the case above. Suppose f : C_+ \to C is analytic; then f belongs to the class H_p(C_+), where 1 \le p < \infty, if

\sup_{\sigma > 0} \left[ \frac{1}{2\pi} \int_{-\infty}^{\infty} |f(\sigma + j\omega)|^p \, d\omega \right]^{1/p} =: \|f\|_{p+} < \infty . \quad (6.2.16)

The function f belongs to H_\infty(C_+) if

\sup_{\sigma > 0} \sup_{\omega} |f(\sigma + j\omega)| =: \|f\|_{\infty+} < \infty . \quad (6.2.17)
If f \in H_p(C_+), then its domain of definition can be extended to the j\omega-axis, and the function f(j\omega) belongs to the Lebesgue space L_p(\mathbb{R}). Moreover,

\|f\|_{p+} = \left[ \frac{1}{2\pi} \int_{-\infty}^{\infty} |f(j\omega)|^p \, d\omega \right]^{1/p} \quad \text{if } p \in [1, \infty) ,

\|f\|_{\infty+} = \operatorname{ess\,sup}_{\omega} |f(j\omega)| \quad \text{if } p = \infty . \quad (6.2.18)
H_2(C_+) is a Hilbert space, with the inner product of two functions f and g defined by

\langle f, g \rangle_+ = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(j\omega) \overline{g(j\omega)} \, d\omega . \quad (6.2.19)
The notions of Blaschke product, inner and outer functions all have their counterparts in this setting. In particular, a function f \in H_\infty(C_+) is inner if |f(j\omega)| = 1 for almost all \omega; a rational f \in H_\infty(C_+) with f(j\omega) \neq 0 for all \omega is outer if and only if it is a unit. Facts 6.2.1 and 6.2.2 have their obvious analogs for the space H_\infty(C_+), obtained by replacing the unit circle by the imaginary axis and e^{j\theta} by j\omega throughout.
Suppose f \in H_\infty, and define

g(s) = f\left( \frac{s-1}{s+1} \right) . \quad (6.2.20)

Then g \in H_\infty(C_+) and satisfies \|g\|_{\infty+} = \|f\|_\infty. However, not every function in H_\infty(C_+) can be generated in this manner. For example, g(s) = e^{-s} belongs to H_\infty(C_+), but is not of the form (6.2.20) for any f \in H_\infty. Note that H_\infty is a subset of H_p for all p < \infty, and indeed H_p is a subset of H_q whenever p > q; this follows from the compactness of the interval [0, 2\pi]. However, H_p(C_+) and H_q(C_+) are distinct sets whenever p \neq q, with neither being a subset of the other.
There is however an interesting relationship, which is given next.
Fact 6.2.3 Suppose f \in H_p, for some p \in [1, \infty), and define g by (6.2.20). Let

h(s) = \frac{2^{1/p} \, g(s)}{(1+s)^{2/p}} . \quad (6.2.21)

Then h \in H_p(C_+) and \|h\|_{p+} = \|f\|_p.

Proof. By definition,

\|f\|_p^p = \frac{1}{2\pi} \int_0^{2\pi} |f(e^{j\theta})|^p \, d\theta , \quad (6.2.22)

while

\|h\|_{p+}^p = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{2 \, |g(j\omega)|^p}{1 + \omega^2} \, d\omega = \frac{1}{2\pi} \int_0^{2\pi} |f(e^{j\theta})|^p \, d\theta = \|f\|_p^p ,

where the second equality follows from the change of variables e^{j\theta} = (j\omega - 1)/(j\omega + 1), for which d\theta = 2 \, d\omega / (1 + \omega^2). \Box
Let H_2^-[0, 2\pi] denote the orthogonal complement of H_2^+[0, 2\pi] in L_2[0, 2\pi]. Then every function f \in L_2 can be expressed uniquely in the form f = f_+ + f_-, where f_+ \in H_2^+ and f_- \in H_2^- (we drop the [0, 2\pi] for brevity). This splitting up has a very intuitive interpretation.
The collection of functions \{ e^{jn\theta}, n \in \mathbb{Z} \} forms an orthonormal basis for L_2, so that every f \in L_2 has the Fourier series

f(e^{j\theta}) = \sum_{n=-\infty}^{\infty} f_n e^{jn\theta} . \quad (6.2.26)
Now H_2^+ is the span of the functions \{ e^{jn\theta}, n \in \mathbb{Z}_+ \}, while H_2^- is the span of the set \{ e^{jn\theta}, n < 0 \}. Hence,

f_+(e^{j\theta}) = \sum_{n=0}^{\infty} f_n e^{jn\theta} , \quad f_-(e^{j\theta}) = \sum_{n=-\infty}^{-1} f_n e^{jn\theta} . \quad (6.2.27)
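The splitting f = f_+ + f_- of (6.2.27) amounts to masking Fourier modes, which the FFT makes concrete. A minimal numerical sketch; the example function is an assumption, chosen so that its two partial-fraction terms are exactly its analytic and coanalytic parts.

```python
import numpy as np

# Assumed example: f(z) = 2/(1 - 0.5 z) + 1/(z - 0.3) on the unit circle.
# The first term has only modes n >= 0, the second only modes n < 0.
N = 512
z = np.exp(2j * np.pi * np.arange(N) / N)
f = 2.0 / (1.0 - 0.5 * z) + 1.0 / (z - 0.3)

coeffs = np.fft.fft(f) / N            # Fourier coefficients f_n (aliased)
modes = np.fft.fftfreq(N, d=1.0 / N)  # mode indices 0..N/2-1, -N/2..-1
plus = coeffs.copy()
plus[modes < 0] = 0                   # keep only nonnegative modes
f_plus = np.fft.ifft(plus) * N        # reconstruct f_+ on the circle

assert np.allclose(f_plus, 2.0 / (1.0 - 0.5 * z), atol=1e-8)
print("projection onto H2+ recovers the analytic part")
```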
Note that the function z \mapsto z^{-1} f_-(z^{-1}) belongs to H_2. Such a function is called coanalytic. In case f(\cdot) is rational and has no poles on the unit circle, the above decomposition is even more familiar: Expand f into partial fractions; then f_+ (resp. f_-) collects the terms whose poles lie outside (resp. inside) the unit circle.

6.3 FILTERING

The filtering problem considered here is to minimize the weighted sensitivity measure

J = \| W_1 (I + PC)^{-1} W_2 \|_{2+} , \quad (6.3.1)

over all C \in S(P). At this stage, several assumptions are made to facilitate a solution. First, note that J might not even be defined for all C \in S(P), because W_1 (I + PC)^{-1} W_2 might not be strictly proper. To get around this problem, it is assumed that either W_1 or W_2 is strictly proper. Next, it is assumed that W_1, W_2 are square, and that |W_1(s)| \neq 0, |W_2(s)| \neq 0 for all s \in C_+. Finally, for technical reasons, it is assumed that the plant P has at least as many inputs as outputs. This is the so-called square or fat plant case; the case of "tall" plants is discussed in Section 6.6.
As pointed out in Section 6.1, this problem can be reformulated in a more manageable form. Let (N, D), (\tilde D, \tilde N) be an r.c.f. and an l.c.f. of P, and let \tilde X, \tilde Y \in M(S) be any solutions of \tilde N \tilde X + \tilde D \tilde Y = I. Then, by Theorem 5.2.1, every C \in S(P) is of the form (\tilde X + DR)(\tilde Y - NR)^{-1} for some R \in M(S) such that |\tilde Y - NR| \neq 0. Moreover, the corresponding weighted sensitivity measure equals

J(R) = \| W_1 (\tilde Y - NR) \tilde D W_2 \|_2 = \| W_1 \tilde Y \tilde D W_2 - W_1 N R \tilde D W_2 \|_2 . \quad (6.3.2)
Hence, the problem is to minimize J(R) by a suitable choice of R \in M(S).^2 This problem has a very nice geometrical interpretation. As R varies over M(S), the product W_1 N R \tilde D W_2 generates a subspace of M(H_2(C_+)). To minimize J(\cdot), one has to find the closest point in this subspace to W_1 \tilde Y \tilde D W_2. Now, since at least one of the matrices W_1 or W_2 is strictly proper, it turns out that the subspace \{ W_1 N R \tilde D W_2 : R \in M(S) \} is not closed. As a result, there may not exist a closest point in this subspace to W_1 \tilde Y \tilde D W_2. In other words, the infimum of J(\cdot) is not attained by any R, except in trivial cases. It is nevertheless possible to compute the infimum of J(\cdot), and to construct a family

^2 The constraint that |\tilde Y - NR| \neq 0 is ignored since it is satisfied by almost all R (see Lemma 5.2.4).
(R_\epsilon)_{\epsilon > 0} of suboptimal elements of M(S) along which J approaches this infimum. The key tool is the following model-matching result.

Lemma 6.3.1 Suppose F, G, H \in M(S), that G has full row rank^3 with an inner–outer factorization G = G_i G_o in which G_o has a right inverse in M(S), and that H has an inner–outer factorization H = H_o H_i in which H_o is a unit. Define J_m(R) = \|F - GRH\|_2, and let A^*(s) denote A^T(-s).^4 Then

\min_R J_m(R) = \| (G_i^* F H_i^*)_- \|_2 , \quad (6.3.4)

and this minimum is attained by any R satisfying

G_o R H_o = (G_i^* F H_i^*)_+ . \quad (6.3.5)
Proof. First of all, observe that the equation G_o R H_o = A has a solution for R corresponding to every A \in M(S). The reason is that G_o has a right inverse and H_o is a unit.
To prove the lemma, note that G_i^*(s) G_i(s) = I and H_i(s) H_i^*(s) = H_i^*(s) H_i(s) = I \; \forall s, since both G_i and H_i are inner matrices. In particular,

\| G_i A \|_2 = \| A \|_2 , \quad (6.3.6)

where the norm on the left is the H_2-norm or the L_2-norm, while the norm on the right side is strictly the L_2-norm (since A need not be analytic); multiplication on the right by H_i is also norm-preserving.
Using this fact leads to

(J_m(R))^2 = \| F - GRH \|_2^2 = \| F - G_i G_o R H_o H_i \|_2^2 = \| G_i^* F H_i^* - G_o R H_o \|_2^2 . \quad (6.3.7)
If G_i^* F H_i^* is decomposed as (G_i^* F H_i^*)_+ + (G_i^* F H_i^*)_-, then by orthogonality,

(J_m(R))^2 = \| (G_i^* F H_i^*)_- \|_2^2 + \| (G_i^* F H_i^*)_+ - G_o R H_o \|_2^2 \ge \| (G_i^* F H_i^*)_- \|_2^2 . \quad (6.3.8)

On the other hand, equality can be achieved in (6.3.8) by choosing R to satisfy (6.3.5). \Box
^3 This means that G has at least as many columns as rows.
^4 In particular, if s = j\omega, then A^*(j\omega) is the conjugate transpose of A(j\omega), so that this notation is consistent with earlier usage.
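The orthogonality argument in the proof of Lemma 6.3.1 can be illustrated numerically in the scalar case: for an inner b, \|f - br\|_2^2 splits as \|(b^* f)_-\|_2^2 + \|(b^* f)_+ - r\|_2^2, so the minimum over r is the norm of the coanalytic part of b^* f. A sketch with assumed data b(z) = z^2, f(z) = 1/(1 - 0.4z):

```python
import numpy as np

N = 512
z = np.exp(2j * np.pi * np.arange(N) / N)
b = z ** 2                      # inner: |b| = 1 on the circle
f = 1.0 / (1.0 - 0.4 * z)       # f = sum 0.4^n z^n

def split(g):
    """Split boundary samples into (analytic, coanalytic) Fourier parts."""
    c = np.fft.fft(g) / N
    idx = np.fft.fftfreq(N, d=1.0 / N)
    gp = c.copy(); gp[idx < 0] = 0
    gm = c - gp
    return np.fft.ifft(gp) * N, np.fft.ifft(gm) * N

def l2norm(g):
    return np.sqrt(np.mean(np.abs(g) ** 2))

g = np.conj(b) * f              # b* f on the unit circle
g_plus, g_minus = split(g)

# the optimal r is the analytic part of b* f
r_opt = g_plus
J_min = l2norm(f - b * r_opt)
assert np.isclose(J_min, l2norm(g_minus), atol=1e-8)

# (b* f)_- keeps the first two Taylor coefficients of f, so J_min^2 = 1 + 0.16
assert np.isclose(J_min ** 2, 1.16, atol=1e-6)
print(round(J_min ** 2, 4))  # 1.16
```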
If we attempt to apply Lemma 6.3.1 to minimize the function J(\cdot) of (6.3.2), we run into the difficulty that the functions G = W_1 N and H = \tilde D W_2 may not satisfy the hypotheses of Lemma 6.3.1. In particular, if P(j\omega) has less than full row rank for some \omega, then so does N(j\omega) (i.e., G(j\omega)). Similarly, any j\omega-axis poles of P correspond to zeros of |\tilde D| and of |H|. Finally, since W_1 or W_2 is assumed to be strictly proper, either |G(\infty)| = 0 or |H(\infty)| = 0. To get around these difficulties, the following modification of Lemma 6.3.1 is required.
Lemma 6.3.2 Suppose F, G, H \in M(S), and suppose in addition that

(i) G has the form G = G_1 D_g, where G_1 \in M(S) has full row rank at all points on the j\omega-axis and at infinity, while D_g \in M(S) is a diagonal matrix D_g = \mathrm{Diag}\,\{ d_{g1}, \ldots, d_{gn} \}, where the only C_{+e}-zeros of d_{gi} (if any) are at infinity or on the j\omega-axis.

(ii) H has the form H = D_h H_1, where H_1 \in M(S) is square and nonsingular at all points on the j\omega-axis and at infinity, while D_h \in M(S) is a diagonal matrix of the form D_h = \mathrm{Diag}\,\{ d_{h1}, \ldots, d_{hm} \}, where the only C_{+e}-zeros of d_{hi} (if any) are at infinity or on the j\omega-axis.

Under these conditions,

\inf_{R \in M(S)} \| F - GRH \|_2 = \inf_{R \in M(S)} \| F - G_1 R H_1 \|_2 . \quad (6.3.9)
Proof. As shown by Lemma 6.3.1, the infimum on the right side of (6.3.9) is actually attained. Let R_0 \in M(S) be an element that achieves this minimum. Now every matrix of the form GRH is also of the form G_1 R_1 H_1 (just let R_1 = D_g R D_h). Thus, the infimum on the left side of (6.3.9) can be no smaller than that on the right side. Hence, in order to prove the lemma, it is enough to construct a family of functions (R_\epsilon)_{\epsilon > 0} in M(S) such that \| F - G R_\epsilon H \|_2 \to \| F - G_1 R_0 H_1 \|_2 as \epsilon \to 0^+.
Suppose a \in S has the property that a(s) \neq 0 whenever \mathrm{Re}\, s > 0, so that the only possible C_{+e}-zeros of a are at infinity or on the j\omega-axis. (Note that all the d_{gi}, d_{hi} are of this type.) To be specific, let j\omega_1, \ldots, j\omega_l denote the distinct finite j\omega-axis zeros of a, with multiplicities m_1, \ldots, m_l respectively, and let n denote the multiplicity of the zero of a at infinity. It is claimed that there exists a family of functions (b_\epsilon)_{\epsilon > 0} in S such that

(i) b_\epsilon(j\omega) \to 1 uniformly on all compact subsets of (-\infty, \infty) that do not contain any of the \omega_i's, as \epsilon \to 0^+, and

(ii) b_\epsilon / a is a unit of S for each \epsilon > 0.
To see this, note that a(s) is of the form

a(s) = u(s) \prod_{i=1}^{l} \left( \frac{s - j\omega_i}{s + 1} \right)^{m_i} \left( \frac{1}{s + 1} \right)^{n} , \quad (6.3.10)

for some unit u \in S. Hence,

b_\epsilon(s) = \prod_{i=1}^{l} \left( \frac{s - j\omega_i}{s + \epsilon - j\omega_i} \right)^{m_i} \left( \frac{1}{\epsilon s + 1} \right)^{n} \quad (6.3.11)

satisfies the requirements (i) and (ii).
In view of this, it follows that there exist families B_g^\epsilon, B_h^\epsilon of diagonal matrices in M(S) such that

(i) D_g^{-1} B_g^\epsilon is unimodular for all \epsilon > 0,

(ii) B_h^\epsilon D_h^{-1} is unimodular for all \epsilon > 0, and

(iii) B_g^\epsilon(j\omega) \to I, B_h^\epsilon(j\omega) \to I as \epsilon \to 0^+, uniformly on all compact subsets of (-\infty, \infty) that do not contain the zeros of |D_g|, |D_h|.
Now define R_\epsilon = D_g^{-1} B_g^\epsilon R_0 B_h^\epsilon D_h^{-1}, where R_0 \in M(S) attains the infimum on the right side of (6.3.9). Then

\| F - G R_\epsilon H \|_2 = \| F - G_1 D_g D_g^{-1} B_g^\epsilon R_0 B_h^\epsilon D_h^{-1} D_h H_1 \|_2 = \| F - G_1 B_g^\epsilon R_0 B_h^\epsilon H_1 \|_2 . \quad (6.3.12)

The proof of the lemma is complete if it can be shown that the last quantity in (6.3.12) approaches \| F - G_1 R_0 H_1 \|_2 as \epsilon \to 0^+. But this follows from the limiting behavior of B_g^\epsilon, B_h^\epsilon as described in (iii) above. \Box
We are now ready to tackle the general problem of minimizing J(\cdot) as defined in (6.3.2). Let G_s denote the Smith form of W_1 N and H_s the Smith form of \tilde D W_2, and suppose

U_g G_s V_g = W_1 N , \quad U_h H_s V_h = \tilde D W_2 , \quad (6.3.13)

where U_g, V_g, U_h, V_h \in U(S). Then

J(R) = \| W_1 \tilde Y \tilde D W_2 - W_1 N R \tilde D W_2 \|_2 = \| W_1 \tilde Y \tilde D W_2 - U_g G_s V_g R U_h H_s V_h \|_2 . \quad (6.3.14)
Since V_g, U_h are unimodular, one can replace V_g R U_h by another free parameter, say R_1. Moreover, G_s (and H_s) can be expressed as a product of two diagonal matrices E_g and D_g (resp. E_h and D_h) such that |E_g| is nonzero at all points on the j\omega-axis and at infinity, while |D_g(s)| \neq 0 whenever s \in C_+. Now, defining

G_1 = U_g E_g , \quad H_1 = E_h V_h , \quad (6.3.15)

shows that

J(R) = \| W_1 \tilde Y \tilde D W_2 - G_1 D_g R_1 D_h H_1 \|_2 , \quad (6.3.16)

where R_1 = V_g R U_h. Hence, by Lemma 6.3.2,

\inf_R J(R) = \inf_R \| W_1 \tilde Y \tilde D W_2 - G_1 R H_1 \|_2 , \quad (6.3.17)

which can be computed using Lemma 6.3.1.
The situation is much cleaner in the case of discrete-time systems. First, since the interval [0, 2\pi] is compact, every H_\infty-function is also in H_2. Further, a strictly causal plant, which is the discrete analog of a strictly proper plant, is one whose transfer matrix is zero at z = 0. As a result, in the formulation of the discrete-time filtering problem, the weighting matrices W_1 and W_2 can be any rational H_\infty-matrices and need not be strictly causal. Moreover, an optimal controller always exists and the approximation procedure of Lemma 6.3.2 is not needed.
6.4 SENSITIVITY MINIMIZATION: SCALAR CASE
Because the sensitivity minimization problem is much more involved technically than the filtering problem, the discussion of the former is divided into three parts, namely the scalar case, the fat plant case, and the general case. Accordingly, the problem studied in this section is the following: Suppose p is a given scalar plant, and w is a given element of S. The objective is to find a c \in S(p) that minimizes the weighted sensitivity measure

J = \sup_{\omega \in \mathbb{R}} \left| \frac{w(j\omega)}{1 + p(j\omega) c(j\omega)} \right| . \quad (6.4.1)

As in the previous section, one can transform this problem into an affine minimization problem. Let (n, d) be a coprime factorization of p, and let x, y \in S be such that xn + yd = 1. Then the problem becomes: Minimize

J(r) = \sup_{\omega \in \mathbb{R}} | [(y - rn) d w](j\omega) | = \| (y - rn) d w \|_\infty , \quad (6.4.2)

by suitable choice of r \in S. Moreover, it can be assumed without loss of generality that the weighting function w is outer. To see this, factor w in the form w = w_i w_o, where w_i and w_o are inner and outer, respectively. Then, since multiplication by the inner function w_i is norm-preserving, we have J(r) = \| (y - rn) d w_o \|_\infty.
In order to take full advantage of the available results on H_p-spaces, a bilinear transformation is now used to map the set S into H_\infty: given f \in S, define \tilde f(z) = f((1+z)/(1-z)), so that \tilde f \in R_\infty. Moreover,

\| \tilde f \|_\infty = \sup_{\theta \in [0,2\pi]} | \tilde f(e^{j\theta}) | = \sup_{\omega \in \mathbb{R}} | f(j\omega) | = \| f \|_S , \quad (6.4.4)

since the unit circle maps onto the imaginary axis plus the point at infinity. Conversely, suppose g \in R_\infty; then the function s \mapsto g((s-1)/(s+1)) belongs to S, and its S-norm equals \|g\|_\infty. In this way one can set up a one-to-one norm-preserving correspondence between functions in S and functions in R_\infty. Under this correspondence, the problem of minimizing (6.4.2) becomes: find an \tilde r \in R_\infty that minimizes

J = \| (\tilde y - \tilde r \tilde n) \tilde d \tilde w \|_\infty . \quad (6.4.6)

At this stage we drop the tilde and state the problem in H_\infty rather than in R_\infty: given rational f, g \in H_\infty, find a rational function r \in H_\infty that minimizes

J(r) = \| f - gr \|_\infty . \quad (6.4.7)

The definitions of f and g when J is as in (6.4.6) are obvious. The problem of minimizing J has a very simple geometric interpretation: Let gH_\infty = \{ gr : r \in H_\infty \}. Then the problem is to find the closest point to f in the subspace gH_\infty.
It turns out that the subspace gH
there
exists a unique closest point in gH
f gr
, (6.4.8)
then the inmum in (6.4.8) may or may not be attained, depending on f . In the original context,
this means that in general there does not exist a single compensator that minimizes the weighted
sensitivity measure (6.4.1), in the practically important case where the plant is strictly proper (or has
j-axis zeros).
In view of the above discussion, one would nevertheless like to know the value of J₀, as well as a sequence {rᵢ} of functions in H∞ such that ‖f − g rᵢ‖∞ → J₀. For this purpose, factor g in the form g = bvu, where b is a finite Blaschke product containing the zeros of g in the open unit disc, v ∈ H∞ contains the zeros of g on the unit circle, and u is a unit of H∞. Then

J₀ = inf_{r∈H∞} ‖f − bvur‖∞ = inf_{r∈H∞} ‖f − bvr‖∞ ,   (6.4.9)

since the unit factor u can be absorbed into the free parameter r. Now consider the modified problem: Minimize ‖f − rb‖∞ by a suitable choice of r ∈ H∞. Define

J_m = inf_{r∈H∞} ‖f − rb‖∞ ,   (6.4.10)
168   6. FILTERING AND SENSITIVITY MINIMIZATION
as the value of the modified problem. In view of the earlier discussion, the infimum in (6.4.10) is actually attained, since b has no zeros on the unit circle. Now, since every function of the form bvr is also a multiple of b, it is clear that J_m ≤ J₀. The next lemma gives conditions under which equality holds.
Lemma 6.4.1  Let e^{jθ₁}, …, e^{jθ_l} denote the distinct zeros of v on the unit circle. Then J₀ = J_m if and only if |f(e^{jθᵢ})| ≤ J_m ∀i.
Proof.  "only if"  Since v(e^{jθᵢ}) = 0 ∀i, we have (bvr)(e^{jθᵢ}) = 0 ∀i, ∀r ∈ H∞.⁶ Hence, (f − bvr)(e^{jθᵢ}) = f(e^{jθᵢ}) ∀i, and as a result

‖f − bvr‖∞ = sup_{θ∈[0,2π]} |[f − bvr](e^{jθ})|
           ≥ max_i |[f − bvr](e^{jθᵢ})|
           = max_i |f(e^{jθᵢ})| , ∀r ∈ H∞ .   (6.4.11)

Combining (6.4.11) with the hypothesis that J₀ = J_m leads to

J_m = J₀ = inf_{r∈H∞} ‖f − bvr‖∞ ≥ max_i |f(e^{jθᵢ})| .   (6.4.12)
"if"  Select an r₀ ∈ H∞ such that J_m = ‖f − r₀b‖∞; such an r₀ exists, by earlier remarks. Let f₀ = f − r₀b. Next, introduce the function

h_{θ,ε}(z) = (z − e^{jθ})/[z − (1 + ε)e^{jθ}] ,   (6.4.13)

and define

h_ε(z) = ∏_{i=1}^{l} [h_{θᵢ,ε}(z)]^{mᵢ} ,   (6.4.14)

where mᵢ is the multiplicity of e^{jθᵢ} as a zero of v(·). Now observe that (i) h_ε/v is a unit of H∞ for each ε > 0, and (ii) h_ε(e^{jθ}) → 1 uniformly on all compact subsets of [0, 2π] that do not contain any of the θᵢ's, as ε → 0⁺. Consider the family (r_ε)_{ε>0} in H∞ defined by

r_ε = r₀ h_ε/v ; then ‖f − bvr_ε‖∞ ≥ J₀ for each ε > 0 .   (6.4.15)
On the other hand, f − bvr_ε = f − br₀h_ε, and |(f − br₀h_ε)(e^{jθ})| → |f₀(e^{jθ})| ≤ J_m uniformly on all compact subsets of [0, 2π] that do not contain any of the θᵢ's, while |(f − bvr_ε)(e^{jθᵢ})| = |f(e^{jθᵢ})| ≤ J_m ∀i by hypothesis. This means that

lim sup_{ε→0⁺} ‖f − bvr_ε‖∞ ≤ J_m .   (6.4.16)
⁶ Strictly speaking, r(e^{jθᵢ}) is undefined for a general r ∈ H∞; the statement makes sense here because bvr extends continuously to the points e^{jθᵢ}, where it vanishes along with v.

Since ‖f − bvr_ε‖∞ ≥ J₀ ≥ J_m for every ε > 0, combining this with (6.4.16) shows that J₀ = J_m and that

‖f − bvr_ε‖∞ → J_m as ε → 0⁺ .   (6.4.17)

This completes the proof.  □
In the above proof, it is easy to see that there is nothing special about the function h_ε of (6.4.14): any family of functions h_ε will do, provided that (i) h_ε/v is a unit of H∞ for each ε > 0, (ii) h_ε(e^{jθᵢ}) = 0 ∀i, and (iii) h_ε(e^{jθ}) → 1 uniformly on all compact subsets of [0, 2π] that do not contain any of the θᵢ's.
Note that if f(e^{jθᵢ}) = 0 ∀i, then the condition of Lemma 6.4.1 is automatically satisfied and J₀ = J_m.

Example 6.4.2  Consider the plant

p(s) = (s − 1)/[(s − 3)(s² + 1)] ,

together with the weighting function w(s) = 1/(s + 1). Going back to (6.4.6), we have f = ỹ d̃ w̃ and g = ñ d̃ w̃, where ñ, d̃ are the numerator and denominator polynomials of the transformed plant p̃, respectively. Under the bilinear transformation s = (1 + z)/(1 − z), one obtains

ñ(z) = −z(1 − z)² , d̃(z) = 2(1 − 2z)(z² + 1) .

A particular solution of x̃ñ + ỹd̃ = 1 can be obtained by Euclidean division and back substitution, namely

x̃(z) = −6z² − 13/2 , ỹ(z) = (1/4)(6z² − 9z + 2) .

Finally,

w̃(z) = (1 − z)/2 .

Hence, J(r) = ‖f − rg‖∞, where

f(z) = (w̃ ỹ d̃)(z) = (1/4)(z − 1)(2z − 1)(z² + 1)(6z² − 9z + 2) ,
g(z) = (w̃ ñ d̃)(z) = −z(1 − z)³(1 − 2z)(z² + 1) .
Lemma 6.4.1 provides conditions under which, by solving the minimization problem in (6.4.10) (which does have a solution), it is possible to determine the value of the infimum in (6.4.9). Actually, it turns out that it is possible to compute J_m (the value of the problem in (6.4.10)) without actually computing the r₀ that attains the minimum. If the conditions of Lemma 6.4.1 hold, this J_m is also the infimum in (6.4.9). In this way, it is possible to obtain rapidly a figure of merit against which suboptimal choices of r in (6.4.9) can be compared. If the optimization problem in (6.4.10) is solved completely, i.e., if r₀ is also determined, then a family r_ε such that ‖f − g r_ε‖∞ → J_m can be computed as in the proof of Lemma 6.4.1.

The next step then is to give a procedure for computing J_m. We begin with a result that is now obvious, but has some practical significance.
Lemma 6.4.3  Suppose the plant p is strictly proper and stable, and has no zeros in the open right half-plane; suppose also that the weighting function w is stable, strictly proper, and vanishes at all jω-axis zeros (if any) of p. Then

inf_{c∈S(p)} sup_{ω∈R} | [w/(1 + pc)](jω) | = 0 .   (6.4.18)
Remarks 6.4.4  In other words, Lemma 6.4.3 states that if the plant is strictly proper, stable, and minimum phase,⁷ then the weighted sensitivity measure can be made arbitrarily small, provided the weighting function is also stable and vanishes at all RHP zeros of the plant.

⁷ Normally, the term "minimum phase" implies that the plant has no zeros in the closed right half-plane; here the usage is slightly different.
Proof. In this case d̃ is a unit, and all zeros of ñ in the closed unit disc are in fact on the unit circle. As a result, the finite Blaschke product b equals 1, and the condition of Lemma 6.4.1 is satisfied. Finally, it is clear that

inf_r ‖f − rb‖∞ = inf_r ‖f − r‖∞ = 0 ,   (6.4.19)

corresponding to the choice r = f.  □
In the general case, the infimum J_m in (6.4.10) can be computed by associating, with each h ∈ H∞, the multiplication operator T_h : H₂ → H₂ defined by T_h g = hg. Since ‖T_h‖ = ‖h‖∞ for every h ∈ H∞, it follows that

J_m = inf_{r∈H∞} ‖T_{f−rb}‖ .   (6.4.20)

Let bH₂ denote the range of the operator T_b; i.e.,

bH₂ = {bg : g ∈ H₂} ,   (6.4.21)

and let N denote the orthogonal complement of bH₂. Thus,

N = {h ∈ H₂ : ⟨h, bg⟩ = 0 ∀g ∈ H₂} .   (6.4.22)

Finally, let Π denote the orthogonal projection mapping H₂ onto N. Now, as r varies over H∞, the operator T_{f−rb} ranges over various operators. However, the operator ΠT_{f−rb} is the same for all r, and in fact equals ΠT_f. To see this, let g ∈ H₂ be arbitrary; then

ΠT_{f−rb} g = Π[(f − rb)g] = Π(fg) − Π(rbg)
            = Π(fg) − Π(brg) = Π(fg) = ΠT_f g ,   (6.4.23)

since brg ∈ bH₂ and hence its projection onto N is zero. Using this idea, one can prove the following result (see Sarason [86] for the proof):
Fact 6.4.5  Define the linear map A : N → N to be the restriction of ΠT_f to N. Then

inf_{r∈H∞} ‖f − rb‖∞ = ‖A‖ .   (6.4.24)

Moreover, if f(e^{jθ}) is a continuous function of θ, then there is a unique function r₀ ∈ H∞ that attains the infimum in (6.4.24). Let f₀ = f − r₀b, and let g ∈ N be a function such that ‖Ag‖₂ = ‖A‖ ‖g‖₂. Then f₀/‖A‖ is inner, and

f₀(z) = (Ag)(z) / g(z) .   (6.4.25)
Fact 6.4.5 is very useful because it is possible to compute all the quantities involved using only linear algebra. First of all, the subspace N is finite-dimensional, and in fact its dimension equals the number of zeros of the Blaschke product b (counted according to multiplicity). Next, one can readily find a basis for N, which can be converted to an orthonormal basis using the Gram-Schmidt procedure. Let {v₁, …, v_n} denote this orthonormal basis. Then one can compute a matrix representation of the linear operator A with respect to this basis; call it A₀. Now (6.4.24) states that the infimum of ‖f − rb‖∞ is precisely the largest singular value of A₀. Further, if p = [p₁ ⋯ p_n]′ is an eigenvector of A₀*A₀ corresponding to the eigenvalue (σ̄(A₀))², then

f₀(z) = [ Σ_{i=1}^{n} (A₀p)ᵢ vᵢ(z) ] / [ Σ_{i=1}^{n} pᵢ vᵢ(z) ] .   (6.4.26)

Let us elaborate each of these points in turn.
Lemma 6.4.6  Let b be a finite Blaschke product, and define N as in (6.4.22). Let {α₁, …, α_n} denote the distinct zeros of b, with multiplicities {m₁, …, m_n}, respectively. Then a (nonorthonormal) basis for N is given by the functions

z^{k−1} / (1 − ᾱᵢz)^k , k = 1, …, mᵢ ; i = 1, …, n .   (6.4.27)
The proof follows readily from the following lemma.
Lemma 6.4.7  Suppose α is a complex number with |α| < 1, k is an integer greater than or equal to 1, and define

φ(z) = z^{k−1} / (1 − ᾱz)^k .   (6.4.28)

Then φ ∈ H₂; moreover, for any g ∈ H₂,

⟨φ, g⟩ = g^{(k−1)}(α) / (k − 1)! .   (6.4.29)
Proof. Clearly φ ∈ H₂, since its only singularity is at 1/ᾱ, which lies outside the closed unit disc. Next,

⟨φ, g⟩ = (1/2π) ∫₀^{2π} φ̄(e^{jθ}) g(e^{jθ}) dθ
       = (1/2π) ∫₀^{2π} [e^{−jθ(k−1)} / (1 − αe^{−jθ})^k] g(e^{jθ}) dθ
       = (1/2π) ∫₀^{2π} [e^{jθ} / (e^{jθ} − α)^k] g(e^{jθ}) dθ
       = (1/2πj) ∮_C [g(z) / (z − α)^k] dz ,   (6.4.30)

where C denotes the unit circle oriented counterclockwise (note that z = e^{jθ} on C, so that dz = je^{jθ} dθ). Finally, by the Cauchy integral formula, the last term in (6.4.30) equals g^{(k−1)}(α)/(k − 1)!.  □
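For k = 1, identity (6.4.29) reads ⟨φ, g⟩ = g(α); that is, φ is the reproducing kernel of H₂ at α. Since the H₂ inner product of two power series is the sum Σ φ̄ₙgₙ of their Taylor-coefficient products, this can be checked in a few lines (the particular α and g below are arbitrary illustrations):

```python
# k = 1 case of (6.4.29): phi(z) = 1/(1 - conj(alpha) z) has Taylor
# coefficients conj(alpha)^n, so <phi, g> = sum_n alpha^n g_n = g(alpha).
alpha = 0.3 + 0.4j
g_coeffs = [2.0, -1.0, 0.5, 3.0]          # g(z) = 2 - z + 0.5 z^2 + 3 z^3

phi_coeffs = [alpha.conjugate() ** n for n in range(len(g_coeffs))]
inner = sum(p.conjugate() * gn for p, gn in zip(phi_coeffs, g_coeffs))
g_at_alpha = sum(gn * alpha ** n for n, gn in enumerate(g_coeffs))
print(abs(inner - g_at_alpha))   # 0
```

Because g is a polynomial, the truncated coefficient sum is exact and the two sides agree to machine precision.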
Proof of Lemma 6.4.6.  The subspace N is defined by (6.4.22). Let g ∈ H₂ be arbitrary. Then bg ∈ H₂ has a zero of multiplicity mᵢ at αᵢ, for i = 1, …, n. Hence, if φᵢₖ denotes the function

φᵢₖ(z) = z^{k−1} / (1 − ᾱᵢz)^k ,   (6.4.31)

then, by Lemma 6.4.7,

⟨φᵢₖ, bg⟩ = (bg)^{(k−1)}(αᵢ) / (k − 1)! = 0 ,   (6.4.32)

whenever 1 ≤ k ≤ mᵢ. Thus, each of the functions φᵢₖ belongs to N. It is routine to verify that they are linearly independent. Finally, in order to show that they span N, it is enough to show that the orthogonal complement of the span of the φᵢₖ's is contained in bH₂. Accordingly, suppose h ∈ H₂ is orthogonal to each φᵢₖ. Then, by Lemma 6.4.7, h has a zero of multiplicity mᵢ at αᵢ, for all i, and is hence of the form bg for some g ∈ H₂.  □
Once a basis has been found for N, it is easy to construct an orthonormal basis using the Gram-Schmidt procedure. Let us drop the double subscript notation, let n denote the dimension of the subspace N, and let {φ₁, …, φ_n} denote the basis for N given by Lemma 6.4.6. Let Γ ∈ C^{n×n} denote the associated Grammian matrix; i.e.,

Γᵢⱼ = ⟨φᵢ, φⱼ⟩ .   (6.4.33)

Note that the constants Γᵢⱼ can be readily computed using Lemma 6.4.7. For instance, if b has only simple zeros at α₁, …, α_n, then Lemma 6.4.6 shows that the set of functions

φᵢ(z) = 1 / (1 − ᾱᵢz) , i = 1, …, n ,   (6.4.34)

is a basis for N. Next, by Lemma 6.4.7,

Γᵢⱼ = ⟨φᵢ, φⱼ⟩ = φⱼ(αᵢ) = 1 / (1 − ᾱⱼαᵢ) .   (6.4.35)

Since Γ is the Grammian of a linearly independent set, it is Hermitian and positive definite. Hence, it is possible to factor Γ in the form B*B, where B is nonsingular. Let X = B⁻¹ and define

vⱼ(z) = Σ_{i=1}^{n} φᵢ(z) xᵢⱼ , j = 1, …, n .   (6.4.36)

Then {v₁, …, v_n} is an orthonormal basis for N. If the factor B is selected to be upper triangular, then this is just the Gram-Schmidt procedure.
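The factorization step can be sketched numerically: a Cholesky factorization Γ = LL* with L lower triangular yields the upper-triangular factor B = L*, and X = B⁻¹ then orthonormalizes the basis, since X*ΓX = I. The zeros used below (0 and 0.5) are merely illustrative:

```python
import numpy as np

# Grammian for two simple zeros, via (6.4.35): Gamma_ij = 1/(1 - conj(alpha_j) alpha_i)
alphas = [0.0, 0.5]
Gamma = np.array([[1.0 / (1.0 - np.conj(a2) * a1) for a2 in alphas]
                  for a1 in alphas])

# Gamma = B* B with B upper triangular (Cholesky returns the lower factor L)
B = np.linalg.cholesky(Gamma).conj().T
X = np.linalg.inv(B)

# The change of basis X makes the basis orthonormal: X* Gamma X = I
err = np.abs(X.conj().T @ Gamma @ X - np.eye(2)).max()
print(err)   # ~0
```

Choosing the upper-triangular factor reproduces Gram-Schmidt exactly: each new orthonormal function is a combination of the current basis function and the previous ones only.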
The next step is to obtain a matrix representation A₀ of the operator A = ΠT_f restricted to N, with respect to the basis {vᵢ}. This is given by

aᵢⱼ = ⟨vᵢ, Avⱼ⟩ = ⟨vᵢ, ΠT_f vⱼ⟩ = ⟨vᵢ, f vⱼ⟩ .   (6.4.37)

(The projection Π can be dropped in the last step because vᵢ ∈ N and Π is self-adjoint.)
Now, by Lemma 6.4.7, a quantity of the form ⟨φᵢ, fφⱼ⟩ is very easy to compute. Hence, it is desirable to expand aᵢⱼ as a summation of terms of the form ⟨φᵢ, fφⱼ⟩ using (6.4.36). Let Φ denote the matrix in C^{n×n} whose ij-th element is ⟨φᵢ, fφⱼ⟩. Then routine computations show that

A₀ = X*ΦX .   (6.4.38)
Now Fact 6.4.5 asserts that the infimum J_m in (6.4.10) equals σ̄(A₀). Further, if p ∈ Cⁿ is a column vector such that ‖A₀p‖₂ = σ̄(A₀)‖p‖₂, then the optimal function f₀ is given by (6.4.26).
Finally, as is the case with the filtering problem, the solution of the sensitivity minimization problem is a lot cleaner in the discrete-time case than in the continuous-time case. If p(z) is the transfer function of a strictly-causal plant, then p(0) = 0. If, in addition, p(·) has no poles or zeros on the unit circle, and the weighting function w(·) has no zeros on the unit circle, then an optimal controller always exists.

This section is concluded with an illustrative example.
Example 6.4.8  Consider once again the plant of Example 6.4.2, namely

p(s) = (s − 1)/[(s − 3)(s² + 1)] ,

together with the weighting function

w(s) = 1/(s + 1) .

By previous discussion, the sensitivity minimization problem reduces to finding an r ∈ H∞ that minimizes ‖f − rb‖∞, where f and b are computed in Example 6.4.2. The first objective of this example is to show that it is unnecessary to compute f and b explicitly. Since b is the Blaschke product part of g = w̃ñd̃ and f = ỹd̃w̃, it follows that if z is a zero of b, then f(z) = w̃(z) or 0, depending on whether z is a zero of ñ (i.e., a plant zero) or a zero of d̃ (i.e., a plant pole); indeed, at a zero z₀ of ñ the identity x̃ñ + ỹd̃ = 1 gives (ỹd̃)(z₀) = 1, so that f(z₀) = w̃(z₀). In the present instance, f(0) = w̃(0) = w(1) = 0.5, f(0.5) = 0. All of this discussion is made more precise in the next section, where it is shown that the sensitivity minimization problem is one of interpolation by a function of minimum norm.

To compute the optimal function f₀, we first find a basis for the subspace N of (6.4.22). By Lemma 6.4.6, such a basis is given by

φ₁(z) = 1 , φ₂(z) = 1/(1 − 0.5z) .
From (6.4.35), the Grammian matrix is

Γ = [1  1 ; 1  4/3] ,

which can be factored as

Γ = [1  0 ; 1  1/√3] [1  1 ; 0  1/√3] = B*B .

Hence,

X = B⁻¹ = [1  −√3 ; 0  √3] ,
and an orthonormal basis for N is given by

v₁(z) = φ₁(z) = 1 , v₂(z) = √3 (φ₂(z) − φ₁(z)) = √3 z/(2 − z) .

Next we need to compute A₀ = X*ΦX, where Φᵢⱼ = ⟨φᵢ, fφⱼ⟩. From Lemma 6.4.7,

⟨φ₁, fφ₁⟩ = (fφ₁)(0) = f(0) = 0.5 ,  ⟨φ₁, fφ₂⟩ = (fφ₂)(0) = 0.5 ,
⟨φ₂, fφ₁⟩ = (fφ₁)(0.5) = 0 ,  ⟨φ₂, fφ₂⟩ = (fφ₂)(0.5) = 0 .

Therefore

A₀ = X*ΦX = [0.5  0 ; −0.5√3  0] .

It is routine to compute that σ̄(A₀) = 1, and that the associated singular vector p is [1 0]′, which leads to A₀p = [0.5  −0.5√3]′ and

f₀(z) = [(A₀p)₁v₁(z) + (A₀p)₂v₂(z)] / [p₁v₁(z) + p₂v₂(z)] = 0.5 − 1.5z/(2 − z) = (1 − 2z)/(2 − z) .
6.5 SENSITIVITY MINIMIZATION: FAT PLANT CASE
The sensitivity minimization problem for multivariable systems is much deeper than the corresponding problem for scalar systems. At present, a clean solution to the problem is available only in the case where the plant has at least as many inputs as outputs. Not all of the details of this solution are given in this section, as they require some background concepts from functional analysis not covered previously.
Suppose P is a given strictly proper plant with at least as many inputs as outputs, and that W₁, W₂ ∈ M(S) are weighting matrices such that W₁ has full row rank (as a rational matrix) and W₂ is square and nonsingular. The objective is to minimize the weighted sensitivity measure

J = ‖W₁(I + PC)⁻¹W₂‖ .   (6.5.1)
Let (N, D), (D̃, Ñ) be any r.c.f. and any l.c.f. of P, and let X̃, Ỹ ∈ M(S) be any solutions of ÑX̃ + D̃Ỹ = I. Then, as shown in Section 6.2, one can replace (6.5.1) by the equivalent problem of minimizing

J(R) = ‖W₁Ỹ D̃W₂ − W₁NR D̃W₂‖ ,   (6.5.2)
as the free parameter R varies over M(S). Now define rational H∞-matrices F, G, H by

F(z) = (W₁Ỹ D̃W₂)[(1 + z)/(1 − z)] ,   (6.5.3)
G(z) = (W₁N)[(1 + z)/(1 − z)] ,   (6.5.4)
H(z) = (D̃W₂)[(1 + z)/(1 − z)] .   (6.5.5)

Then the problem becomes one of minimizing

J(R) = ‖F − GRH‖∞ ,   (6.5.6)

by suitably choosing R from the set of rational H∞-matrices.
Since the plant is strictly proper, we have that G(1) = 0. As a result, the infimum of J is not attained in general. To deal with this situation, it is necessary to extend Lemma 6.4.1 to the matrix case. For this purpose, define

J₀ = inf_{R∈M(H∞)} ‖F − GRH‖∞ .   (6.5.7)

Suppose G and H have the following properties: G is of the form G₁g, where G₁ ∈ M(H∞), g ∈ H∞, and in addition G₁(e^{jθ}) has full row rank at all θ ∈ [0, 2π], and all zeros of g(·) are on the unit circle. Similarly, H is of the form H₁h, where H₁ ∈ M(H∞), h ∈ H∞, and in addition H₁(e^{jθ}) is square and nonsingular at all θ ∈ [0, 2π], and all zeros of h(·) are on the unit circle. This means in particular that if G(e^{jθ}) has less than full rank for some θ, then G(e^{jθ}) must in fact equal the zero matrix; similar remarks apply to H. Note that the assumptions on G and H are trivially satisfied in the scalar case. Now define

J_m = inf_{R∈M(H∞)} ‖F − G₁RH₁‖∞ .   (6.5.8)

It can be shown that the infimum in (6.5.8) is achieved, owing to the rank conditions on G₁ and H₁. The next result is a generalization of Lemma 6.4.1.
Lemma 6.5.1  Let e^{jθ₁}, …, e^{jθ_l} denote the distinct zeros of the product gh on the unit circle. Then J₀ = J_m if and only if ‖F(e^{jθᵢ})‖ ≤ J_m ∀i.
Proof. Since every matrix of the form GRH is also of the form G₁R̃H₁, it is easy to see that J_m ≤ J₀.

"only if"  Since (gh)(e^{jθᵢ}) = 0 ∀i, it follows that (GRH)(e^{jθᵢ}) = 0 ∀i, ∀R ∈ M(H∞). Hence, (F − GRH)(e^{jθᵢ}) = F(e^{jθᵢ}) ∀i, ∀R ∈ M(H∞), and as a result

‖F − GRH‖∞ = sup_{θ∈[0,2π]} ‖(F − GRH)(e^{jθ})‖
           ≥ max_i ‖(F − GRH)(e^{jθᵢ})‖
           = max_i ‖F(e^{jθᵢ})‖ , ∀R ∈ M(H∞) .   (6.5.9)

Combining (6.5.9) with the hypothesis that J₀ = J_m leads to

J_m = J₀ = inf_{R∈M(H∞)} ‖F − GRH‖∞ ≥ max_i ‖F(e^{jθᵢ})‖ .   (6.5.10)
"if"  Select an R₀ ∈ M(H∞) such that J_m = ‖F − G₁R₀H₁‖∞. Such an R₀ exists, by earlier remarks. Now recall the function (cf. (6.4.13))

h_{θ,ε}(z) = (z − e^{jθ}) / [z − (1 + ε)e^{jθ}] ,   (6.5.11)

and define

h_ε(z) = ∏_{i=1}^{l} [h_{θᵢ,ε}(z)]^{mᵢ} ,   (6.5.12)

where mᵢ is the multiplicity of e^{jθᵢ} as a zero of gh. Define the family (R_ε) in M(H∞) by R_ε = R₀h_ε/(gh). Then F − GR_εH = F − G₁R₀H₁h_ε, and arguing exactly as in the proof of Lemma 6.4.1, the hypothesis ‖F(e^{jθᵢ})‖ ≤ J_m ∀i implies that ‖F − G₁R₀H₁h_ε‖∞ → ‖F − G₁R₀H₁‖∞ = J_m as ε → 0⁺. Hence, J₀ = J_m.  □
To interpret Lemma 6.5.1 in the context of the sensitivity minimization problem, suppose the plant P has blocking zeros⁸ at s = jω₁, …, jω_n and at infinity, but has full row rank at all other points on the jω-axis. Suppose also that P has no jω-axis poles. Select the weighting matrices W₁, W₂ such that W₁ has the same blocking zeros as P on the extended jω-axis, but has full row rank at all other points on the jω-axis, while W₂(jω) is square and nonsingular at all points on the extended jω-axis. Then F(e^{jθᵢ}) = 0 ∀i and the condition ‖F(e^{jθᵢ})‖ ≤ J_m is automatically satisfied; as a result, J₀ = J_m. Moreover, if R₀ is a choice of R that attains the infimum in (6.5.8), then the family R_ε = R₀h_ε/(gh) yields a corresponding family of compensators whose weighted sensitivity ‖W₁(I + PC_ε)⁻¹W₂‖ → J_m as ε → 0⁺.
Let us therefore turn our attention to the problem of computing the infimum in (6.5.8) and an R that attains this infimum. As a first step, factorize G₁ and H₁ in the form G₁ = G_iG_o, H₁ = H_oH_i, where G_i, H_i are inner, G_o ∈ M(H∞) has a right inverse in M(H∞), and H_o ∈ M(H∞) is a unit. This is possible in view of the rank conditions on G₁ and H₁ (see Fact 6.2.2). Now, for any V ∈ M(H∞), the equation V = G_oRH_o has a solution for R in M(H∞). Hence,

inf_{R∈M(H∞)} ‖F − G₁RH₁‖∞ = inf_{V∈M(H∞)} ‖F − G_iVH_i‖∞ .   (6.5.13)
Next, let G_i^{adj}, H_i^{adj} denote the adjoint matrices of G_i, H_i, respectively, and note that these adjoint matrices are themselves inner. Let b = |G_i| · |H_i| (the product of the determinants), and note that b is an inner function. Now, since multiplication by inner matrices is norm-preserving, it follows that

‖F − G_iVH_i‖∞ = ‖G_i^{adj}FH_i^{adj} − G_i^{adj}G_iVH_iH_i^{adj}‖∞ = ‖F₀ − bV‖∞ , ∀V ∈ M(H∞) ,   (6.5.14)

⁸ Recall that s₀ is a blocking zero of P if P(s₀) = 0.
where F₀ = G_i^{adj}FH_i^{adj}. Hence, from (6.5.8), (6.5.13), and (6.5.14),

J_m = inf_{V∈M(H∞)} ‖F₀ − bV‖∞ .   (6.5.15)

As a final simplification, we may as well assume that b is a (finite) Blaschke product: Since b is inner, it equals a constant times a Blaschke product; the constant can be absorbed into the free matrix V.

Let us now switch notation and restate the problem. Given a rational matrix F ∈ M(H∞) and a finite Blaschke product b, find an R ∈ M(H∞) that minimizes ‖F − bR‖∞. On the space H₂^s of s-vector-valued H₂-functions, define the inner product

⟨f, g⟩ = (1/2π) ∫₀^{2π} f*(e^{jθ}) g(e^{jθ}) dθ ,   (6.5.16)

with the corresponding norm ‖f‖₂ = ⟨f, f⟩^{1/2}.
As in the scalar case, with each matrix H ∈ H∞^{s×s} associate the multiplication operator T_H : H₂^s → H₂^s, T_H g = Hg, and note that

‖T_H‖ = ‖H‖∞ = ess sup_{θ∈[0,2π]} σ̄(H(e^{jθ})) .   (6.5.18)

It turns out that Fact 6.4.5 has a complete analog in the multivariable case in so far as computing the infimum of ‖F − bR‖∞ is concerned. Given the finite Blaschke product b, let bH₂^s = {bg : g ∈ H₂^s}, and let N denote the orthogonal complement of bH₂^s in H₂^s; i.e.,

N = {h ∈ H₂^s : ⟨h, bg⟩ = 0 ∀g ∈ H₂^s} .   (6.5.20)

Let Π denote the orthogonal projection from H₂^s onto N.

Lemma 6.5.2  Suppose 𝒯 = ΠT_Φ for some Φ ∈ H∞^{s×s}. Then there exists a matrix M ∈ H∞^{s×s} such that ΠT_M = 𝒯 and ‖M‖∞ = ‖𝒯‖.

Proof. See [86], Theorem 3.  □

Theorem 6.5.3  Suppose F ∈ H∞^{r×s} is rational with r ≤ s, and define F₀ ∈ H∞^{s×s} by appending s − r rows of zeros to F:

F₀ = [F ; 0_{(s−r)×s}] .   (6.5.21)

Let A denote the restriction to N of the operator ΠT_{F₀}. Then

inf_{R∈H∞^{r×s}} ‖F − bR‖∞ = ‖A‖ .   (6.5.22)
Remarks 6.5.4  As in the scalar case, the subspace N is finite-dimensional, so that ‖A‖ can be computed by purely algebraic means. More on this later.
Proof. As a first step, suppose r = s, so that F is square. As R varies over H∞^{s×s}, the operator T_{F−bR} ranges over various operators. However, the operator ΠT_{F−bR} is the same for all R, and equals ΠT_F. To see this, let g ∈ H₂^s be arbitrary. Then

ΠT_{F−bR} g = Π(F − bR)g = Π(Fg) − Π(bRg) = ΠT_F g ,   (6.5.23)

since bRg ∈ bH₂^s and hence its projection onto N is zero. Now ΠT_F is an operator of the type covered by Lemma 6.5.2, which states that there exists a matrix M ∈ H∞^{s×s} such that ΠT_M = ΠT_F and such that ‖M‖∞ = ‖ΠT_F‖. But ΠT_M = ΠT_F is equivalent to ΠT_{F−M} = 0, i.e.,

(F − M)g ∈ bH₂^s ∀g ∈ H₂^s .   (6.5.24)

Let eᵢ denote the i-th elementary unit vector, which has a 1 in the i-th position and zeros elsewhere, and let

g(z) = eᵢ , ∀z ∈ D .   (6.5.25)

Then g ∈ H₂^s, and applying (6.5.24) with g = eᵢ for i = 1, …, s shows that each column of F − M belongs to bH₂^s; i.e., that M = F − bR₀ for some R₀ ∈ H∞^{s×s}. It is now claimed that

‖F − bR₀‖∞ = inf_{R∈H∞^{s×s}} ‖F − bR‖∞ .   (6.5.26)
To prove (6.5.26), note that

‖F − bR‖∞ = ‖T_{F−bR}‖ , by (6.5.18),
          ≥ ‖ΠT_{F−bR}‖ , since ‖Π‖ ≤ 1,
          = ‖ΠT_F‖ , by (6.5.23),
          = ‖M‖∞ = ‖F − bR₀‖∞ .   (6.5.27)

The preceding discussion can be summarized as follows: If F is square, then

inf_{R∈H∞^{s×s}} ‖F − bR‖∞ = ‖ΠT_F‖ .   (6.5.28)
To complete the proof of the theorem in the square case, it only remains to show that ΠT_F restricted to N has the same norm as ΠT_F. Suppose g ∈ H₂^s. Then g can be expressed uniquely in the form f + bh, where f ∈ N and h ∈ H₂^s (since H₂^s is the direct sum of N and bH₂^s). Now⁹

ΠT_F g = Π(Ff + Fbh) = Π(Ff + bFh) = Π(Ff) .   (6.5.29)

Moreover, ‖g‖₂ ≥ ‖f‖₂, so that

‖ΠT_F g‖₂ / ‖g‖₂ ≤ ‖ΠT_F f‖₂ / ‖f‖₂ .   (6.5.30)

Hence, A (which is the restriction of ΠT_F to N) has the same norm as ΠT_F.
Next, the restriction that F be square is removed. Suppose r < s and define F₀ as in (6.5.21). Partition any R₀ ∈ H∞^{s×s} as R₀ = [R ; R₁], where R ∈ H∞^{r×s}. Then

‖F₀ − bR₀‖∞ = ‖ [F − bR ; −bR₁] ‖∞ = ess sup_{θ∈[0,2π]} σ̄( [ (F − bR)(e^{jθ}) ; −bR₁(e^{jθ}) ] ) .   (6.5.31)

Now for any two complex matrices A, B of compatible dimensions, it is easy to show that

σ̄( [A ; B] ) ≥ σ̄(A) .   (6.5.32)

Hence, from (6.5.31), ‖F₀ − bR₀‖∞ ≥ ‖F − bR‖∞, so that

inf_{R₀∈H∞^{s×s}} ‖F₀ − bR₀‖∞ ≥ inf_{R∈H∞^{r×s}} ‖F − bR‖∞ .   (6.5.33)

To prove the reverse inequality, let R be any matrix in H∞^{r×s} and define R₀ = [R ; 0_{(s−r)×s}]. Then ‖F₀ − bR₀‖∞ = ‖F − bR‖∞. Hence, the inequality in (6.5.33) is actually an equality. Moreover, from the previous paragraph, the infimum on the left side of (6.5.33) equals the norm of ΠT_{F₀} when restricted to N.  □

⁹ Note that this argument fails if b is replaced by an inner matrix, since the scalar function b commutes with F whereas an inner matrix need not.
As in the scalar case, it is easy to compute a basis for N.

Lemma 6.5.5  Let b be a finite Blaschke product with distinct zeros α₁, …, α_n, of multiplicities m₁, …, m_n, respectively. For i ∈ {1, …, s}, let eᵢ denote the i-th s × 1 elementary unit vector. Define N as in (6.5.20). Then, a (nonorthonormal) basis for N is given by the vector-valued functions

vⁱ_{kl}(z) = [z^{k−1} / (1 − ᾱ_l z)^k] eᵢ , 1 ≤ i ≤ s ; 1 ≤ k ≤ m_l ; 1 ≤ l ≤ n .   (6.5.34)

Proof. By Lemma 6.4.7 and (6.5.16),

⟨vⁱ_{kl}, f⟩ = f_i^{(k−1)}(α_l) / (k − 1)! , ∀f ∈ H₂^s ,   (6.5.35)

where f_i denotes the i-th component of f. Since f ∈ bH₂^s if and only if the right side of (6.5.35) is zero for all i, k, l, the lemma follows.  □
Once a basis for N has been determined, it is possible to convert it into an orthonormal basis, compute a matrix representation A₀ of A with respect to this basis, and finally to compute σ̄(A₀); the details are exactly the same as in the scalar case (see (6.4.33) et seq.). However, if one is only interested in computing σ̄(A₀), a short-cut is available, and is described next.

It should be emphasized that the procedure described next is equally applicable to the scalar case, and can be used to compute σ̄(A₀) more readily than by using the formula (6.4.38). However, there would be no advantage gained by this: In the scalar case, Fact 6.4.5 gives a method for computing the infimum of ‖f − br‖∞ as well as for finding an r that attains the infimum, all in one shot. In contrast, the procedure given below only determines the infimum, and a separate interpolation procedure has to be used to find an optimizing r. This is no disadvantage in the multivariable case, however, since such a two-stage procedure is all that is available at present.

Now for the short-cut: Theorem 6.5.3 states that the infimum of ‖F − bR‖∞ equals ‖A‖, where A : N → N is the restriction of ΠT_{F₀} to N. Lemma 6.5.5 provides a (nonorthonormal) basis for N. To compute ‖A‖, the first step is to observe that ‖A‖ = ‖A*‖, where A* : N → N is the adjoint map of A. Now compute a matrix representation Φ₀ of A* with respect to the nonorthonormal basis of Lemma 6.5.5; the norm of A* can then be found directly from Φ₀ together with the Grammian matrix of the basis, without any orthonormalization.

The effectiveness of the above short-cut derives from the fact that the matrix representation Φ₀ is block-diagonal, as shown in Lemma 6.5.6 below. For the sake of clarity of notation, Lemma 6.5.6 is stated only for the case where the Blaschke product b has simple zeros; the general case is conceptually not more difficult and is left to the reader.
Lemma 6.5.6  Suppose b is a finite Blaschke product with distinct simple zeros α₁, …, α_m, and that F ∈ H∞^{r×s} is rational. Define Φ ∈ C^{ms×mr}, Λ ∈ C^{mr×mr} by

Φ = Block Diag {F(α₁)*, …, F(α_m)*} ,   (6.5.37)
Λ = (Λᵢⱼ) , i, j = 1, …, m ,   (6.5.38)

where

Λᵢⱼ = [1/(1 − ᾱⱼαᵢ)] I_r ,   (6.5.39)

and let Λ₀ ∈ C^{ms×ms} be defined in the same way with I_r replaced by I_s (cf. (6.5.51)). Then

inf_{R∈H∞^{r×s}} ‖F − bR‖∞ = inf{σ > 0 : σ²Λ − Φ*Λ₀Φ is positive definite} .   (6.5.40)

Proof. Since b has simple zeros, Lemma 6.5.5 shows that a basis for N is given by

vᵢⱼ(z) = [1/(1 − ᾱᵢz)] eⱼ , j = 1, …, s ; i = 1, …, m .   (6.5.41)
Thus, any h ∈ N can be expanded as

h(z) = Σ_{i=1}^{m} Σ_{j=1}^{s} hᵢⱼ vᵢⱼ(z) .   (6.5.42)

Arrange the coordinates hᵢⱼ in an ms × 1 column vector as follows:

[h₁₁ ⋯ h₁s ⋯ h_{m1} ⋯ h_{ms}]′ .   (6.5.43)

Then the claim is that the matrix representation of A* with respect to this coordinate system is

Φ₀ = Block Diag {F₀(α₁)*, …, F₀(α_m)*} .   (6.5.44)
To prove the claim, let v_kl be any basis function of the form (6.5.41). We must show that the matrix (6.5.44) gives the right answer for A*v_kl. Now the coordinate vector of v_kl is [0 ⋯ 1 ⋯ 0]′, where the 1 is in the position (k − 1)s + l. Applying the matrix Φ₀ to this vector gives [0 ⋯ F₀(α_k)*e_l ⋯ 0]′, which is the coordinate vector of a combination of the functions v_kp; in other words, the claim is that

A*v_kl = Σ_{p=1}^{s} f̄₀,lp(α_k) v_kp .   (6.5.45)

To show that (6.5.45) is correct, it is enough to show that it gives the right answer for the inner products ⟨vᵢⱼ, A*v_kl⟩ for all i, j. By definition,

⟨vᵢⱼ, A*v_kl⟩ = conj⟨A*v_kl, vᵢⱼ⟩ = conj⟨v_kl, Avᵢⱼ⟩ = conj⟨v_kl, ΠT_{F₀}vᵢⱼ⟩ = conj⟨v_kl, F₀vᵢⱼ⟩
             = conj[ e_l′ F₀(α_k) vᵢⱼ(α_k) ] = f̄₀,lj(α_k) · 1/(1 − ᾱ_kαᵢ) ,   (6.5.46)

where the last two steps follow from Lemma 6.4.7 and (6.5.41), respectively. On the other hand, letting u_kl denote the function on the right side of (6.5.45) leads to

⟨vᵢⱼ, u_kl⟩ = Σ_{p=1}^{s} f̄₀,lp(α_k) ⟨vᵢⱼ, v_kp⟩ .   (6.5.47)

Note that

⟨vᵢⱼ, v_kp⟩ = [1/(1 − ᾱ_kαᵢ)] δ_{jp} ,   (6.5.48)

where δ is the Kronecker delta. Hence, (6.5.47) becomes

⟨vᵢⱼ, u_kl⟩ = f̄₀,lj(α_k) · 1/(1 − ᾱ_kαᵢ) ,   (6.5.49)

which agrees with (6.5.46).
By the previous discussion, ‖A*‖ is given by

‖A*‖ = inf{σ > 0 : σ²Λ₀ − Φ₀*Λ₀Φ₀ is positive definite} ,   (6.5.50)

where Λ₀ is the Grammian matrix of the basis in (6.5.41). One can readily compute that Λ₀ = (Λ₀ᵢⱼ)_{i,j=1}^{m}, where

Λ₀ᵢⱼ = [1/(1 − ᾱⱼαᵢ)] I_s .   (6.5.51)

Now the only difference between Λ and Λ₀ is in their dimensions. Similarly, if we define Φ as in (6.5.37), then Φ is a submatrix of Φ₀, since F is a submatrix of F₀. If r = s then we are done, since Λ = Λ₀, Φ = Φ₀, and (6.5.50) is the same as (6.5.40). Otherwise, by symmetric row and column permutations, one can transform σ²Λ₀ − Φ₀*Λ₀Φ₀ into the form

[ σ²Λ − Φ*Λ₀Φ   0 ; 0   σ²Λ̃ ] ,   (6.5.52)

where Λ̃ = (Λ̃ᵢⱼ)_{i,j=1}^{m}, and

Λ̃ᵢⱼ = [1/(1 − ᾱⱼαᵢ)] I_{s−r} .   (6.5.53)

It is clear from (6.5.52) that σ²Λ₀ − Φ₀*Λ₀Φ₀ is positive definite if and only if σ²Λ − Φ*Λ₀Φ is (Λ̃, being a Grammian, is itself positive definite); hence, (6.5.50) reduces to (6.5.40).  □
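Formula (6.5.40) reduces the norm computation to a finite family of positive-definiteness tests, which can be resolved by bisection on σ. As a sanity check, feeding in the scalar data of Example 6.4.8 (zeros 0 and 0.5 of b, F(0) = 0.5, F(0.5) = 0, so r = s = 1 and Λ = Λ₀) recovers J_m = 1:

```python
import numpy as np

alphas = [0.0, 0.5]
Fvals = [np.array([[0.5]]), np.array([[0.0]])]    # 1x1 "matrices" F(alpha_i)
m = len(alphas)

# Lambda as in (6.5.39) (here I_r = I_s = 1, so Lambda = Lambda_0)
Lam = np.array([[1/(1 - np.conj(aj)*ai) for aj in alphas] for ai in alphas])
Phi = np.zeros((m, m))
for i in range(m):
    Phi[i, i] = Fvals[i].conj().T[0, 0]           # block diagonal of F(alpha_i)*

def positive_definite(sigma):
    M = sigma**2 * Lam - Phi.conj().T @ Lam @ Phi
    return np.linalg.eigvalsh(M).min() > 0

lo, hi = 0.0, 10.0
for _ in range(60):                               # bisection on sigma
    mid = 0.5 * (lo + hi)
    lo, hi = (lo, mid) if positive_definite(mid) else (mid, hi)
print(hi)   # ~1.0, agreeing with J_m from Example 6.4.8
```

Here σ²Λ − Φ*ΛΦ becomes singular exactly at σ = 1, so the bisection converges to the same value of J_m found via σ̄(A₀).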
Now that an expression is available for the infimum of ‖F − bR‖∞, the remaining task is to find an R that attains it. Let

σ = inf_{R∈H∞^{r×s}} ‖F − bR‖∞ ,   (6.5.54)

and let α₁, …, α_n denote the (simple) zeros of b. Then the problem is to find a matrix G ∈ H∞^{r×s} such that

G(αᵢ) = F(αᵢ) , i = 1, …, n , and ‖G‖∞ = σ .   (6.5.55)

(Note that G − F = bR for some R ∈ H∞^{r×s} if and only if G(αᵢ) = F(αᵢ) ∀i.) Equivalently, if we define

Fᵢ = [F(αᵢ)]/σ ,   (6.5.56)

then the problem is one of finding a matrix Θ ∈ H∞^{r×s} such that ‖Θ‖∞ = 1 and Θ(αᵢ) = Fᵢ ∀i.
The remainder of this section is devoted to the matrix version of the so-called Nevanlinna-Pick problem,¹⁰ which can be stated as follows: Given complex numbers α₁, …, α_n, with |αᵢ| < 1 ∀i, together with complex matrices F₁, …, F_n, with ‖Fᵢ‖ ≤ 1 ∀i, find all matrix-valued functions Θ ∈ M(H∞) (if any) such that ‖Θ‖∞ ≤ 1 and Θ(αᵢ) = Fᵢ ∀i. The relevance of this problem to sensitivity minimization is clear.

¹⁰ Strictly speaking, the Pick problem only makes sense for square matrices, so it is probably more accurate to speak of the matrix Nevanlinna algorithm.
In the sequel, it is important to note that ‖·‖ denotes the norm of a complex matrix, whereas ‖·‖∞ denotes the norm of a matrix-valued H∞-function. First, consider the scalar problem of finding a θ ∈ H∞ such that ‖θ‖∞ ≤ 1 and θ(αᵢ) = fᵢ ∀i. Suppose one of the constants has magnitude exactly equal to 1, and renumber the constants if necessary so that |f₁| = 1. Since α₁ belongs to the open unit disc, it follows from the maximum modulus principle that the only H∞-function θ with ‖θ‖∞ ≤ 1 and θ(α₁) = f₁ is the constant function θ(z) ≡ f₁; hence in this case the problem has a solution if and only if fᵢ = f₁ ∀i. The next lemma is the matrix analog of this observation.

Lemma 6.5.7  Suppose Θ ∈ H∞^{k×l} and that ‖Θ‖∞ ≤ 1. Suppose σ̄(Θ(z₀)) = 1 for some z₀ in the open unit disc, and let r denote the multiplicity of 1 as a singular value of Θ(z₀) (i.e., as an eigenvalue of Θ(z₀)Θ*(z₀)). Then σ̄(Θ(z)) = 1 ∀z ∈ D. Moreover, there exist unitary matrices U ∈ C^{k×k}, V ∈ C^{l×l} such that

UΘ(z)V = [ I_r   0_{r×(l−r)} ; 0_{(k−r)×r}   Δ(z) ] ,   (6.5.58)

where σ̄(Δ(z)) < 1 ∀z ∈ D.
The proof of Lemma 6.5.7 requires the following observation on complex matrices.

Lemma 6.5.8  Suppose M ∈ C^{k×l} has norm 1. Then
(i) |mᵢⱼ| ≤ 1 ∀i, j;
(ii) If |mᵢⱼ| = 1 for some i, j, then all other elements of row i and column j are zero.

Proof. Temporarily let ‖·‖₂ denote the Euclidean vector norm on both C^k and C^l. Then

‖Mx‖₂ ≤ ‖x‖₂ ∀x ∈ C^l ,   (6.5.59)

since ‖M‖ = 1. To prove (i), suppose |mᵢⱼ| > 1 for some i, j, and let x = eⱼ, the j-th elementary unit vector in C^l. Then ‖Mx‖₂ > 1, which is a contradiction. To prove (ii), suppose |mᵢⱼ| = 1 for some i, j, and again let x = eⱼ. Then ‖Meⱼ‖₂ ≤ 1 implies that all other elements of the j-th column of M are zero. The same reasoning applied to M* establishes the claim about row i.  □
Proof of Lemma 6.5.7.  By hypothesis, 1 is an r-fold eigenvalue of Θ(z₀)Θ*(z₀), so that 1 is an r-fold singular value of Θ(z₀). Thus, there exist unitary matrices U ∈ C^{k×k}, V ∈ C^{l×l} such that

UΘ(z₀)V = [ I_r   0_{r×(l−r)} ; 0_{(k−r)×r}   M ] ,   (6.5.60)

where ‖M‖ < 1. Define y(z) = (UΘ(z)V)₁₁; i.e., y is the corner element of UΘ(z)V. Then y is analytic, satisfies |y(z)| ≤ 1 ∀z ∈ D (by Lemma 6.5.8 (i)), and y(z₀) = 1 (by (6.5.60)). Hence, by the maximum modulus theorem, y(z) = 1 ∀z ∈ D, since z₀ is in the open unit disc. Now by Lemma 6.5.8 (ii), all other elements in the first row and first column of UΘ(z)V are identically zero. The same reasoning applies to all elements of the form (UΘ(z)V)ᵢᵢ where i ≤ r. Hence, UΘ(z)V is of the form (6.5.58), and σ̄(Θ(z)) = 1 ∀z ∈ D. It only remains to show that σ̄(Δ(z)) < 1 ∀z ∈ D. Suppose to the contrary that σ̄(Δ(z₁)) = 1 for some z₁ ∈ D. Then, by previous reasoning, σ̄(Δ(z)) = 1 ∀z ∈ D. In particular, σ̄(M) = σ̄(Δ(z₀)) = 1, which contradicts the assumption that 1 is an r-fold singular value of Θ(z₀). Hence, σ̄(Δ(z)) < 1 ∀z ∈ D.  □
Let us now return to the interpolation problem, where one is given points α₁, …, α_n and matrices F₁, …, F_n, with |αᵢ| < 1 and ‖Fᵢ‖ ≤ 1 ∀i. Recall that the objective is to show that one may assume ‖Fᵢ‖ < 1 ∀i without sacrificing generality. Assume to the contrary that ‖Fᵢ‖ = 1 for some values of i. Among these, select one for which the multiplicity of 1 as a singular value is the largest, and renumber it as F₁. Select unitary matrices U, V such that

UF₁V = [ I_r  0 ; 0  G₁ ] ,   (6.5.61)

where r is the multiplicity of 1 as a singular value of F₁, and ‖G₁‖ < 1. Now, by Lemma 6.5.7, any Θ such that ‖Θ‖∞ ≤ 1 and Θ(α₁) = F₁ must satisfy

UΘ(z)V = [ I_r  0 ; 0  Δ(z) ] ∀z ∈ D .   (6.5.62)

In particular, F₂, …, F_n must satisfy

UFᵢV = [ I_r  0 ; 0  Gᵢ ] , i = 2, …, n .   (6.5.63)

Otherwise the interpolation problem has no solution. On the other hand, suppose (6.5.63) holds. Then ‖Gᵢ‖ < 1 ∀i; otherwise, if ‖Gᵢ‖ = 1 for some i, then Fᵢ has 1 as a singular value of multiplicity larger than r, contradicting the manner in which F₁ was selected. Now replace the original interpolation problem by the following modified problem: Find a Δ such that ‖Δ‖∞ ≤ 1 and Δ(αᵢ) = Gᵢ ∀i. Then there is a one-to-one correspondence between solutions of the original problem and of the modified problem, via (6.5.62). But in the modified problem all the matrices Gᵢ have norms strictly less than one.
We are now ready for the main result on the matrix Nevanlinna problem.

Theorem 6.5.9  Suppose α₁, …, α_n are distinct complex numbers and F₁, …, F_n are complex matrices with |αᵢ| < 1 and ‖Fᵢ‖ < 1 ∀i. Define a partitioned matrix

P = [ P₁₁ ⋯ P₁n ; ⋮ ⋱ ⋮ ; P_{n1} ⋯ P_{nn} ] ,   (6.5.64)

where

Pᵢⱼ = [1/(1 − αᵢᾱⱼ)] (I − FᵢFⱼ*) .   (6.5.65)

Then there exists a Θ ∈ M(H∞) such that ‖Θ‖∞ < 1 and Θ(αᵢ) = Fᵢ ∀i, if and only if the matrix P is positive definite.
In the current set-up, Lemma 6.5.6 leads to the following result: There exists a Θ ∈ M(H∞) such that ‖Θ‖∞ ≤ 1 and Θ(αᵢ) = Fᵢ ∀i if and only if the matrix P is nonnegative definite.
The proof of Theorem 6.5.9 is rather long and is based on a preliminary lemma. We shall see that an iterative algorithm for computing Θ will emerge from the proof.

Lemma 6.5.10  Suppose ‖E‖ < 1, and define

L(E) = [ A  B ; C  D ] ,   (6.5.66)

where

A = (I − EE*)^{−1/2} , B = −(I − EE*)^{−1/2}E ,
C = −(I − E*E)^{−1/2}E* , D = (I − E*E)^{−1/2} .   (6.5.67)

(Note that M^{1/2} denotes the Hermitian square root of M.) Then for each matrix X with ‖X‖ ≤ 1, the matrix

T_E(X) = (AX + B)(CX + D)⁻¹   (6.5.68)

is well-defined. Moreover,

‖T_E(X)‖ ≤ 1 (resp. < 1) ⟺ ‖X‖ ≤ 1 (resp. < 1) .   (6.5.69)
6.5. SENSITIVITY MINIMIZATION: FAT PLANT CASE 189
Proof. First of all, CX + D = (I - E^*E)^{-1/2}(I - E^*X), and since \|E^*X\| \le \|E\| \|X\| < 1, the matrix I - E^*X is nonsingular; hence T_E(X) is well-defined. Next, let J = \mathrm{diag}(I, -I); a direct computation using (6.5.67) shows that L(E) is J-unitary, i.e., L^* J L = J. Now, for any matrix M, \|M\| \le 1 (resp. < 1) \iff M^*M - I \le 0 (resp. < 0). Hence, (6.5.69) is equivalent to

    T^*T - I \le 0 (resp. < 0) \iff X^*X - I \le 0 (resp. < 0) ,    (6.5.74)

where T = T_E(X). But

    (CX + D)^* (T^*T - I)(CX + D) = (AX + B)^*(AX + B) - (CX + D)^*(CX + D)
    = [X^* \; I] \, L^* J L \begin{bmatrix} X \\ I \end{bmatrix} = [X^* \; I] \, J \begin{bmatrix} X \\ I \end{bmatrix} = X^*X - I .    (6.5.75)

Hence, (6.5.69) follows from (6.5.74) and (6.5.75). \Box
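The content of Lemma 6.5.10 is easy to check numerically. A sketch (my illustration, not from the text; the sign convention for B and C is the reconstruction used above, and NumPy is assumed):

```python
import numpy as np

def herm_inv_sqrt(M):
    # Hermitian inverse square root via eigendecomposition
    w, V = np.linalg.eigh(M)
    return V @ np.diag(w ** -0.5) @ V.conj().T

def T(E, X):
    """The linear-fractional map of (6.5.68), with B = -(I-EE*)^{-1/2}E and
    C = -(I-E*E)^{-1/2}E* (reconstructed signs)."""
    I = np.eye(E.shape[0])
    A = herm_inv_sqrt(I - E @ E.conj().T)
    B = -A @ E
    D = herm_inv_sqrt(I - E.conj().T @ E)
    C = -D @ E.conj().T
    return (A @ X + B) @ np.linalg.inv(C @ X + D)

rng = np.random.default_rng(0)
E = rng.standard_normal((2, 2))
X = rng.standard_normal((2, 2))
E /= max(1.0, np.linalg.norm(E, 2) / 0.5)   # ensure ||E|| <= 0.5 < 1
X /= max(1.0, np.linalg.norm(X, 2) / 0.9)   # ensure ||X|| <= 0.9 < 1
Y = T(E, X)
print(np.linalg.norm(Y, 2) < 1, np.allclose(T(E, E), 0))   # contractive; T_E(E) = 0
```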
Proof of Theorem 6.5.9. "if" The proof is by induction on the number n of points to be interpolated. If n = 1, the theorem is clearly true: Take \Phi(z) = F_1. To establish the inductive step, suppose the theorem is true for n - 1 points, and suppose that P is positive definite; it is shown below that this implies the existence of a \Phi \in M(H_\infty) satisfying \|\Phi\|_\infty < 1 and \Phi(\lambda_i) = F_i \forall i.

Let y be the inner function^{11}

    y(z) = \frac{|\lambda_1| (z - \lambda_1)}{\lambda_1 (1 - \bar{\lambda}_1 z)} ,    (6.5.76)

^{11} Note that y(z) = z if \lambda_1 = 0.
and define

    T_i = T_{F_1}(F_i) ,   i = 2, \dots, n ,    (6.5.77)

    G_i = \frac{1}{y(\lambda_i)} T_i ,   i = 2, \dots, n ,    (6.5.78)

where T_{F_1}(\cdot) is defined in (6.5.68). Now define a partitioned matrix

    Q = \begin{bmatrix} Q_{22} & \cdots & Q_{2n} \\ \vdots & & \vdots \\ Q_{n2} & \cdots & Q_{nn} \end{bmatrix} ,    (6.5.79)

where

    Q_{ij} = \frac{1}{1 - \lambda_i \bar{\lambda}_j} (I - G_i G_j^*) .    (6.5.80)

It is claimed that Q is positive definite. To prove the claim, compute Q_{ij} from (6.5.80), and make liberal use of the identities (6.5.71) and (6.5.72). This leads to

    Q_{ij} = \frac{1}{1 - \lambda_i \bar{\lambda}_j} \Big[ I - \frac{1}{y(\lambda_i)\overline{y(\lambda_j)}} T_i T_j^* \Big]
    = \frac{1}{1 - \lambda_i \bar{\lambda}_j} \Big\{ \Big[ 1 - \frac{1}{y(\lambda_i)\overline{y(\lambda_j)}} \Big] I + \frac{1}{y(\lambda_i)\overline{y(\lambda_j)}} (I - T_i T_j^*) \Big\} ,    (6.5.81)

where from (6.5.77),

    I - T_i T_j^* = (I - F_1 F_1^*)^{1/2} (I - F_i F_1^*)^{-1} (I - F_i F_j^*)(I - F_1 F_j^*)^{-1} (I - F_1 F_1^*)^{1/2} .    (6.5.82)

For clarity, define

    R_i^{-1} = (I - F_1 F_i^*)^{-1} (I - F_1 F_1^*)^{1/2} ,   i = 1, \dots, n ,    (6.5.83)

so that I - T_i T_j^* = R_i^{-*}(I - F_i F_j^*) R_j^{-1}, and note that

    \frac{1}{1 - \lambda_i \bar{\lambda}_j} \Big[ 1 - \frac{1}{y(\lambda_i)\overline{y(\lambda_j)}} \Big] = - \frac{1 - \lambda_1 \bar{\lambda}_1}{(\lambda_i - \lambda_1)(\bar{\lambda}_j - \bar{\lambda}_1)} .    (6.5.84)

Then, (6.5.81) becomes

    Q_{ij} = - \frac{1 - \lambda_1 \bar{\lambda}_1}{(\lambda_i - \lambda_1)(\bar{\lambda}_j - \bar{\lambda}_1)} I + \frac{R_i^{-*}}{y(\lambda_i)} P_{ij} \frac{R_j^{-1}}{\overline{y(\lambda_j)}} ,    (6.5.85)

or, upon rearrangement,

    P_{ij} = R_i^* \Big[ \frac{1 - \lambda_1 \bar{\lambda}_1}{(1 - \lambda_i \bar{\lambda}_1)(1 - \lambda_1 \bar{\lambda}_j)} I + y(\lambda_i) Q_{ij} \overline{y(\lambda_j)} \Big] R_j .    (6.5.86)
Now (6.5.86) implies that

    P = R^* \begin{bmatrix} c_1 I & 0 \\ 0 & Q \end{bmatrix} R ,    (6.5.87)

where

    c_i = \frac{1}{1 - \lambda_i \bar{\lambda}_1}   for i = 1, \dots, n ,    (6.5.88)

and R is a partitioned matrix of the form

    R = \begin{bmatrix}
    c_1 R_1 & c_2 R_2 & c_3 R_3 & \cdots & c_n R_n \\
    0 & y(\lambda_2) c_2 R_2 & 0 & \cdots & 0 \\
    0 & 0 & y(\lambda_3) c_3 R_3 & \cdots & 0 \\
    \vdots & & & & \vdots \\
    0 & 0 & 0 & \cdots & y(\lambda_n) c_n R_n
    \end{bmatrix} .    (6.5.89)
Since R is nonsingular, c_1 > 0 and P is positive definite, it follows from (6.5.87) that Q is also positive definite. In particular, Q_{ii} is positive definite for all i, and \|G_i\| < 1 for all i. By the inductive hypothesis, there exists a function \Psi \in M(H_\infty) such that \|\Psi\|_\infty < 1 satisfying \Psi(\lambda_i) = G_i for i = 2, \dots, n. Now define

    \Phi(z) = (A - y(z)\Psi(z)C)^{-1} (y(z)\Psi(z)D - B) ,    (6.5.90)

where

    A = (I - F_1 F_1^*)^{-1/2} ,  B = -(I - F_1 F_1^*)^{-1/2} F_1 ,
    C = -(I - F_1^* F_1)^{-1/2} F_1^* ,  D = (I - F_1^* F_1)^{-1/2} .    (6.5.91)

Note that (6.5.90) is the inverse of the transformation T_{F_1} defined in (6.5.68). Hence, from (6.5.77), (6.5.78), and (6.5.90),

    \Phi(\lambda_i) = F_i   for i = 2, \dots, n .    (6.5.92)

Since y(\lambda_1) = 0, (6.5.90) shows that

    \Phi(\lambda_1) = -A^{-1}B = F_1 .    (6.5.93)

Finally, by Lemma 6.5.10, \|\Psi(z)\| < 1 for all z implies that \|\Phi(z)\| < 1 for all z. Hence, \Phi is a solution to the interpolation problem with n points. This completes the proof by induction.
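The inductive step is effectively an algorithm: peel off the pair (\lambda_1, F_1) via T_{F_1} and the Blaschke factor y, and recurse on the remaining data. A hedged numerical sketch of one reduction step, (6.5.76)-(6.5.78) (illustrative data and reconstructed signs; NumPy assumed):

```python
import numpy as np

def herm_inv_sqrt(M):
    w, V = np.linalg.eigh(M)
    return V @ np.diag(w ** -0.5) @ V.conj().T

def reduce_step(lams, Fs):
    """Map an n-point problem (lams, Fs) to an (n-1)-point problem
    (lams[1:], Gs) via (6.5.76)-(6.5.78)."""
    l1, F1 = lams[0], Fs[0]
    I = np.eye(F1.shape[0])
    A = herm_inv_sqrt(I - F1 @ F1.conj().T)
    B = -A @ F1
    D = herm_inv_sqrt(I - F1.conj().T @ F1)
    C = -D @ F1.conj().T
    # Blaschke factor (6.5.76), with y(z) = z when lambda_1 = 0
    y = lambda z: z if l1 == 0 else abs(l1) * (z - l1) / (l1 * (1 - np.conj(l1) * z))
    Gs = []
    for li, Fi in zip(lams[1:], Fs[1:]):
        Ti = (A @ Fi + B) @ np.linalg.inv(C @ Fi + D)   # T_i = T_{F_1}(F_i)
        Gs.append(Ti / y(li))                           # G_i = T_i / y(lambda_i)
    return lams[1:], Gs

lams = [0.1, 0.5, -0.2]
Fs = [0.3 * np.eye(2), np.diag([0.2, -0.1]), 0.25 * np.eye(2)]
_, Gs = reduce_step(lams, Fs)
print(all(np.linalg.norm(G, 2) < 1 for G in Gs))   # reduced problem is still contractive
```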
"only if" One needs to prove the following statement: Suppose \Phi \in M(H_\infty) and that \|\Phi\|_\infty < 1. Let \lambda_1, \dots, \lambda_n be any n distinct points in the open unit disc, let F_i = \Phi(\lambda_i), and define P as in (6.5.64), (6.5.65). Then P is positive definite.
The proof is again by induction on the number n of points. The statement is clearly true if n = 1. Suppose it is true for n - 1 points. Given the n points \lambda_1, \dots, \lambda_n and the function \Phi, let F_i = \Phi(\lambda_i) and define the function \Psi \in M(H_\infty) by

    \Psi(z) = \frac{1}{y(z)} (A\Phi(z) + B)(C\Phi(z) + D)^{-1} ,    (6.5.94)

where y is as in (6.5.76) and A, B, C, D are as in (6.5.91). Then ...

...

    T_2 = \frac{1}{0.73788} \cdots = \begin{bmatrix} 0.15791 \\ 0.45135 \end{bmatrix} ,
    G_2 = \frac{1}{0.5} T_2 = \begin{bmatrix} 0.31582 \\ 0.9027 \end{bmatrix} .
One can as well choose \Psi(z) = G_2 z. Then, from (6.5.90),

    \Phi(z) = \begin{bmatrix} 1.0236 + 0.0706z & 0.0472 + 0.1412z \\ 0.0472 + 0.2018z & 1.0944 + 0.4037z \end{bmatrix}^{-1} \begin{bmatrix} 0.2236 + 0.3531z \\ 0.4472 + 1.0092z \end{bmatrix}
    = \frac{1}{1.118 + 0.4743z} \begin{bmatrix} 0.2236 + 0.3659z \\ 0.4472 + 1.0028z \end{bmatrix}
    = \frac{1}{1 + 0.4242z} \begin{bmatrix} 0.2 + 0.3273z \\ 0.4 + 0.8969z \end{bmatrix} .
6.6 SENSITIVITY MINIMIZATION: GENERAL CASE
In this section, the sensitivity minimization problem is studied in full generality, i.e., without any simplifying assumptions on the relative numbers of plant inputs and outputs, and on the dimensions of the weighting matrices. The filtering problem in the general case is not discussed separately in the interests of brevity, but reasoning similar to that given here applies to this problem as well. As shown in earlier sections, the problem is one of minimizing a cost function

    J(R) = \|F - GRH\|_\infty ,    (6.6.1)

by a suitable choice of R \in M(H_\infty), where F, G, H \in M(H_\infty). The following spectral factorization result is needed: Suppose A \in M(RH_\infty) satisfies

    A^*(z^{-1}) = A(z)   \forall z ,    (6.6.2)

    A(e^{j\theta}) \ge 0   \forall \theta \in [0, 2\pi] .    (6.6.3)

Then there exists an outer matrix U \in M(RH_\infty) such that

    A(z) = U^*(z^{-1}) U(z)   \forall z .    (6.6.4)

Further, if (6.6.3) is strengthened to

    A(e^{j\theta}) > 0   \forall \theta \in [0, 2\pi] ,    (6.6.5)

then U can be chosen to be a unit of M(RH_\infty).
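In the scalar case the spectral factorization above can be computed by root flipping: collect the zeros of z^n a(z) lying outside the unit disc. A sketch under the strict positivity assumption (6.6.5) (an illustration, not the book's algorithm; NumPy assumed, and z is the unit delay, as in the text):

```python
import numpy as np
from numpy.polynomial import polynomial as npoly

def spectral_factor(a):
    """Given Laurent coefficients a[0..n] of a(z) = a[0] + sum_k a[k](z^k + z^{-k})
    with a(e^{jt}) > 0, return coefficients of an outer u with a = u(z^{-1})u(z)."""
    q = np.r_[a[:0:-1], a]                      # z^n a(z), lowest order first
    roots = npoly.polyroots(q)
    outside = roots[np.abs(roots) > 1]          # zeros of the outer factor
    u = np.real(npoly.polyfromroots(outside))
    a1 = a[0] + 2 * np.sum(a[1:])               # a evaluated at z = 1
    u *= np.sqrt(a1) / abs(npoly.polyval(1.0, u))   # fix the constant gain
    return u

u = spectral_factor(np.array([2.6, -1.0]))      # a(z) = 2.6 - z - z^{-1} > 0 on |z| = 1
t = np.linspace(0, 2 * np.pi, 7)
z = np.exp(1j * t)
a_vals = 2.6 - 2 * np.cos(t)
u_vals = npoly.polyval(z, u)
print(np.allclose(np.abs(u_vals) ** 2, a_vals))   # |u|^2 = a on the unit circle
```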
Lemma 6.6.2  Suppose A \in RH_\infty^{n \times m}, where n > m, and that A is inner. Then there exists an inner matrix B \in RH_\infty^{n \times (n-m)} such that \Theta = [A \; B] \in RH_\infty^{n \times n} is also inner.

The matrix B is sometimes referred to as a complementary inner matrix of A.
Proof. Since A is inner, it has full column rank in M(RH_\infty). Partition A as [D' \; N']', where D \in RH_\infty^{m \times m}, N \in RH_\infty^{(n-m) \times m}; after a row permutation if necessary, it can be assumed that |D| \ne 0. Now define

    P(z) = -[D^*(z^{-1})]^{-1} N^*(z^{-1}) .    (6.6.7)

Then P \in (R(z))^{m \times (n-m)} and therefore has an r.c.f. (U, V) over M(RH_\infty). Specifically, U \in RH_\infty^{m \times (n-m)}, V \in RH_\infty^{(n-m) \times (n-m)}, and

    P(z) = U(z)[V(z)]^{-1}   \forall z .    (6.6.8)

Combining (6.6.7) and (6.6.8) shows that

    D^*(z^{-1}) U(z) + N^*(z^{-1}) V(z) = 0   \forall z .    (6.6.9)

Now define

    W = \begin{bmatrix} U \\ V \end{bmatrix} \in RH_\infty^{n \times (n-m)} ,    (6.6.10)

and express (6.6.9) in the form

    A^*(z^{-1}) W(z) = 0   \forall z .    (6.6.11)

Clearly W has full column rank as an element of M(RH_\infty). Let W = W_i W_o be an inner-outer factorization of W, where W_i \in RH_\infty^{n \times (n-m)} is inner and W_o \in RH_\infty^{(n-m) \times (n-m)} is outer. Multiplying (6.6.11) on the right by [W_o(z)]^{-1} gives

    A^*(z^{-1}) W_i(z) = 0   \forall z .    (6.6.12)

It is claimed that the choice B = W_i makes \Theta = [A \; B] inner (and of course square). Note that

    \Theta^*(z^{-1}) \Theta(z) = \begin{bmatrix} A^*(z^{-1})A(z) & A^*(z^{-1})B(z) \\ B^*(z^{-1})A(z) & B^*(z^{-1})B(z) \end{bmatrix} = I   \forall z ,    (6.6.13)

since A, B are both inner and satisfy (6.6.12). \Box
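For a constant inner (i.e., isometric) matrix A, the complementary inner matrix of Lemma 6.6.2 is just an orthonormal basis for the orthogonal complement of the range of A; this degenerate case is easy to sketch with a QR decomposition (my illustration, not from the text; NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
A, _ = np.linalg.qr(rng.standard_normal((4, 2)))      # a 4x2 isometry: A'A = I
# Complete A to a full orthonormal basis; the trailing columns are B.
Q, _ = np.linalg.qr(np.hstack([A, rng.standard_normal((4, 2))]))
B = Q[:, 2:]                                          # complementary "inner" matrix
Theta = np.hstack([A, B])
print(np.allclose(Theta.T @ Theta, np.eye(4)))        # [A B] is unitary
```

In the dynamic case the same completion must hold at every frequency with H-infinity factors, which is exactly what the inner-outer construction in the proof delivers.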
Now we come to the main result of this section. To be specific, suppose F \in RH_\infty^{n \times m}, G \in RH_\infty^{n \times l}, H \in RH_\infty^{k \times m}, and that n > l, k < m; the cases n \le l and/or k \ge m are simpler and the requisite modifications will be obvious. Recall that the problem is to select R \in RH_\infty^{l \times k} so as to minimize

    J(R) = \|F - GRH\|_\infty .    (6.6.14)

The first step in the solution procedure is to show that the above problem can be reduced to that of minimizing

    \left\| \begin{bmatrix} A - Q & B \\ C & D \end{bmatrix} \right\|_\infty ,    (6.6.15)

by a suitable choice of Q \in RH_\infty^{l \times k}, where A, B, C, D are L_\infty (not H_\infty!) matrices of appropriate dimensions. For this purpose, first factor G as G_i G_o, where G_i \in RH_\infty^{n \times l} is inner and G_o \in RH_\infty^{l \times l} is outer. Then, using Lemma 6.6.2, construct a square inner matrix \Theta \in RH_\infty^{n \times n} containing G_i as its first l columns. Then,

    G = \Theta \begin{bmatrix} G_o \\ 0_{(n-l) \times l} \end{bmatrix} .    (6.6.16)
Similar reasoning applied to H^* yields an outer matrix H_o \in RH_\infty^{k \times k} and a square inner matrix \Lambda \in RH_\infty^{m \times m} such that

    H = [H_o \; 0_{k \times (m-k)}] \Lambda .    (6.6.17)

Hence,

    J(R) = \left\| F - \Theta \begin{bmatrix} G_o \\ 0 \end{bmatrix} R [H_o \; 0] \Lambda \right\|_\infty = \left\| \Theta^* F \Lambda^* - \begin{bmatrix} G_o \\ 0 \end{bmatrix} R [H_o \; 0] \right\|_\infty ,    (6.6.18)

since multiplication by a square inner matrix (or its conjugate) does not change the L_\infty-norm. Now define

    \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \Theta^* F \Lambda^* \in M(L_\infty) ,    (6.6.19)

partitioned conformably, and let Q denote the free parameter G_o R H_o (recall that both G_o and H_o are units of M(H_\infty) by virtue of the rank conditions on G and H). Then the cost function becomes

    \left\| \begin{bmatrix} A - Q & B \\ C & D \end{bmatrix} \right\|_\infty .    (6.6.20)
To proceed further, we require another very simple result:

Lemma 6.6.3  Suppose W = [X \; Y] \in M(L_\infty) is rational. Then \|W\|_\infty \ge \|Y\|_\infty. Further, if \gamma > \|Y\|_\infty, then

    \|W\|_\infty \le \gamma \iff \|(\gamma^2 I - YY^*)^{-1/2} X\|_\infty \le 1 ,    (6.6.21)

where Y^*(z) = Y'(z^{-1}), and (\gamma^2 I - YY^*)^{1/2} denotes the spectral factor of \gamma^2 I - YY^*.

Proof. It is obvious that \|W\|_\infty \ge \|Y\|_\infty. Moreover, if \gamma > \|Y\|_\infty, then (\gamma^2 I - YY^*)(e^{j\theta}) > 0 for all \theta, so that the spectral factor (\gamma^2 I - YY^*)^{1/2} exists and is a unit of M(RH_\infty); in particular (\gamma^2 I - YY^*)^{-1/2} is well-defined. By definition,

    \|W\|_\infty = \max_{\theta \in [0, 2\pi]} \bar{\sigma}(W(e^{j\theta})) .    (6.6.22)

Hence,

    \|W\|_\infty \le \gamma \iff \bar{\sigma}(W(e^{j\theta})) \le \gamma \;\forall \theta \in [0, 2\pi]
    \iff (WW^*)(e^{j\theta}) \le \gamma^2 I \;\forall \theta
    \iff [XX^* + YY^*](e^{j\theta}) \le \gamma^2 I \;\forall \theta
    \iff (XX^*)(e^{j\theta}) \le \gamma^2 I - (YY^*)(e^{j\theta}) \;\forall \theta .    (6.6.23)

This leads immediately to (6.6.21). \Box
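Lemma 6.6.3 can be sanity-checked on constant matrices, where the L-infinity norm reduces to the spectral norm (an illustration, not from the text; NumPy assumed):

```python
import numpy as np

def herm_inv_sqrt(M):
    # Hermitian inverse square root via eigendecomposition
    w, V = np.linalg.eigh(M)
    return V @ np.diag(w ** -0.5) @ V.conj().T

# Check (6.6.21): for W = [X Y] and gamma > ||Y||,
# ||W|| <= gamma  iff  ||(gamma^2 I - Y Y')^{-1/2} X|| <= 1.
rng = np.random.default_rng(2)
X = rng.standard_normal((2, 3))
Y = rng.standard_normal((2, 2))
W = np.hstack([X, Y])
nY, nW = np.linalg.norm(Y, 2), np.linalg.norm(W, 2)
for gamma in [nY + 0.1, 0.99 * nW, 1.01 * nW, 10.0]:
    if gamma <= nY:
        continue                      # the lemma requires gamma > ||Y||
    lhs = nW <= gamma
    rhs = np.linalg.norm(herm_inv_sqrt(gamma**2 * np.eye(2) - Y @ Y.T) @ X, 2) <= 1
    assert lhs == rhs
print("equivalence verified")
```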
A parallel result to Lemma 6.6.3 is the following.

Lemma 6.6.4  Suppose

    W = \begin{bmatrix} X \\ Y \end{bmatrix} \in M(L_\infty)    (6.6.24)

is rational. Then \|W\|_\infty \ge \|Y\|_\infty. Further, if \gamma > \|Y\|_\infty, then

    \|W\|_\infty \le \gamma \iff \|X(\gamma^2 I - Y^*Y)^{-1/2}\|_\infty \le 1 .    (6.6.25)
Now let us return to the problem of minimizing the cost function in (6.6.20). Let \gamma be a current guess of the minimum, and let K denote [C \; D]. Then, from Lemma 6.6.4,

    \min_{Q \in M(H_\infty)} \left\| \begin{bmatrix} A - Q & B \\ C & D \end{bmatrix} \right\|_\infty \le \gamma ,    (6.6.26)

if and only if

    \min_{Q \in M(H_\infty)} \left\| [A - Q \;\; B](\gamma^2 I - K^*K)^{-1/2} \right\|_\infty \le 1 .    (6.6.27)

Now note that (\gamma^2 I - K^*K)^{-1/2} is a unit of M(RH_\infty), so that Q(\gamma^2 I - K^*K)^{-1/2} can be replaced by another free parameter R. If we define

    A_1 = A(\gamma^2 I - K^*K)^{-1/2} \in M(L_\infty) ,    (6.6.28)

    B_1 = B(\gamma^2 I - K^*K)^{-1/2} \in M(L_\infty) ,    (6.6.29)

then (6.6.27) is equivalent to

    \min_{R \in M(H_\infty)} \| [A_1 - R \;\; B_1] \|_\infty \le 1 .    (6.6.30)

Next, from Lemma 6.6.3, (6.6.30) holds if and only if

    \min_{R \in M(H_\infty)} \| (I - B_1 B_1^*)^{-1/2}(A_1 - R) \|_\infty \le 1 .    (6.6.31)

Once again, it is possible to replace (I - B_1 B_1^*)^{-1/2} R by another free parameter S \in M(H_\infty). If we define

    A_2 = (I - B_1 B_1^*)^{-1/2} A_1 \in M(L_\infty) ,    (6.6.32)

then (6.6.31) is equivalent to

    \min_{S \in M(H_\infty)} \| A_2 - S \|_\infty \le 1 .    (6.6.33)
The problem of finding the minimum in (6.6.33) is one of finding the closest matrix in M(H_\infty) to a given rational A_2 \in M(L_\infty). Express A_2 as T^{-1}U, where T, U \in M(RH_\infty), and factor T as T_o T_i, where T_o is outer and T_i is inner. Hence,

    A_2(z) = [T(z)]^{-1} U(z) = [T_i(z)]^{-1} [T_o(z)]^{-1} U(z) = [T_i(z)]^{-1} V(z) ,    (6.6.34)

where V = T_o^{-1} U \in M(RH_\infty). Now

    \|A_2 - S\|_\infty = \|T_i^{-1}V - S\|_\infty = \|T_i(T_i^{-1}V - S)\|_\infty = \|V - T_i S\|_\infty = \|T_i^{adj} V - bS\|_\infty ,    (6.6.35)

where b = |T_i|. This is the type of objective function covered by Theorem 6.5.3.
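The minimization in (6.6.33) is a Nehari-type problem. For a scalar symbol, a classical result (Nehari's theorem, the scalar content of the theorem cited above) gives the minimum distance to H-infinity as the largest singular value of the Hankel matrix built from the strictly anticausal Fourier coefficients. A sketch (my illustration; NumPy assumed) for the conjugate all-pass d_i^*(z) = (1 - 0.5z)/(z - 0.5) that reappears in Example 6.7.2, whose anticausal coefficients are c_k = 1.5(0.5)^k, k >= 1:

```python
import numpy as np

N = 60
c = 1.5 * 0.5 ** np.arange(1, 2 * N)                 # c[k-1] = coefficient of z^{-k}
H = np.array([[c[i + j] for j in range(N)] for i in range(N)])   # Hankel matrix
dist = np.linalg.svd(H, compute_uv=False)[0]
print(round(dist, 6))   # 1.0: the conjugate of an inner function is at distance 1 from H-infinity
```

Here the Hankel matrix is rank one, so the computation is exact up to the truncation at N terms.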
The preceding discussion can be summarized in the following iterative procedure for computing the minimum of the function in (6.6.20).

Step 1  Choose

    \gamma \ge \max \left\{ \|[C \; D]\|_\infty , \left\| \begin{bmatrix} B \\ D \end{bmatrix} \right\|_\infty \right\} .    (6.6.36)

Step 2  Form

    L = (\gamma^2 I - K^*K)^{-1/2} ,    (6.6.37)

where K = [C \; D].

Step 3  Form

    M = (I - BLL^*B^*)^{-1/2} .    (6.6.38)

If \|BL\|_\infty \ge 1, increase \gamma and return to Step 2; otherwise compute

    J_\gamma = \min_{S \in M(H_\infty)} \|MAL - S\|_\infty .    (6.6.39)

If J_\gamma > 1, increase \gamma; if J_\gamma < 1, decrease \gamma; and repeat from Step 2 until J_\gamma is sufficiently close to one.
6.7 TWO-PARAMETER COMPENSATOR
... Whether optimality is measured in the H_2-norm or the H_\infty-norm, there exist plants for which the optimum achievable with a two-parameter scheme is as small as one wishes compared to the one-parameter optimum.
To avoid purely technical details caused by the nonexistence of optimal compensators in the continuous-time case, the discussion below is addressed to the discrete-time case. However, with the aid of Lemmas 6.3.2, 6.4.1, and 6.5.1, this discussion can be translated to the continuous-time case.

Accordingly, suppose P is a given multivariable plant with transfer matrix P(z), where z is the transfer function of a unit delay. Suppose u_1 is a reference input to be tracked, while u_2 and u_3 are sensor noises to be rejected. Let W_1, W_2, W_3, W_4 be given weighting matrices in M(H_\infty). It is assumed that (i) P has no poles on the unit circle, (ii) W_1, W_3 are outer, (iii) W_2, W_4 are square outer, and (iv) P(e^{j\theta}), W_1(e^{j\theta}), \dots, W_4(e^{j\theta}) all have full row rank at all \theta \in [0, 2\pi].
For convenience, a two-parameter compensation scheme is reproduced in Figure 6.2.

Figure 6.2: Two-Parameter Compensation Scheme. [Block diagram: compensator blocks \tilde{N}_{c1}, \tilde{N}_{c2}, \tilde{D}_c^{-1} and plant factors N, D, with inputs u_1, u_2, u_3 and signals z_1, z_2, y_1, y_2.]

The
optimal tracking problem is to minimize either

    J_2 = \|W_1(I - H_{yu_1})W_2\|_2 ,    (6.7.1)

or

    J_\infty = \|W_1(I - H_{yu_1})W_2\|_\infty ,    (6.7.2)

by means of a suitable stabilizing compensator. The optimal disturbance rejection problem is to select a stabilizing compensator that minimizes either

    K_{21} = \|W_3 H_{yu_2} W_4\|_2   or   K_{22} = \|W_3 H_{vu_3} W_4\|_2 ,    (6.7.3)

or

    K_{\infty 1} = \|W_3 H_{yu_2} W_4\|_\infty   or   K_{\infty 2} = \|W_3 H_{vu_3} W_4\|_\infty .    (6.7.4)
From the discussion in Section 5.6, it is at once clear that, as far as minimizing K_{21}, K_{22}, K_{\infty 1}, K_{\infty 2} is concerned, there is no advantage to using a two-parameter scheme instead of a one-parameter scheme. This is because the sets of achievable stable transfer matrices H_{yu_2}, H_{vu_3} are exactly the same with either scheme. But with the optimal tracking problem, the situation is quite different. The minimum J_2 or J_\infty achievable with a two-parameter scheme can be attained simultaneously with the minimum of K_{21} or K_{\infty 1} (or any other combination). This is the overwhelming advantage of the two-parameter scheme. To justify the latter statement first, recall from (5.6.15) that the set of achievable stable transfer matrices from (u_1, u_2, u_3) to y are parametrized by

    [NQ \;\; N(Y - R\tilde{N}) \;\; N(X + R\tilde{D})] ,    (6.7.5)

where (N, D), (\tilde{D}, \tilde{N}) are any r.c.f. and l.c.f. of P, X, Y \in M(H_\infty) satisfy XN + YD = I, and Q, R range over M(H_\infty). To minimize J_\infty and K_{22} simultaneously, for example, one chooses Q to minimize

    J_\infty = \|W_1(I - NQ)W_2\|_\infty ,    (6.7.6)

and R to minimize

    K_{22} = \|W_3(I - N(X + R\tilde{D}))W_4\|_2 .    (6.7.7)

The associated compensator

    \tilde{N}_{c1} = Q ,  \tilde{N}_{c2} = X + R\tilde{D} ,  \tilde{D}_c = Y - R\tilde{N} ,    (6.7.8)

minimizes the weighted tracking error and optimally rejects the weighted disturbance.
From (5.6.16), the set of stable transfer matrices H_{yu_1} that can be achieved using a one-parameter compensator is \{N(X + R\tilde{D}) : R \in M(H_\infty)\}. Define

    J_\infty = \min_{Q \in M(H_\infty)} \|W_1(I - NQ)W_2\|_\infty ,    (6.7.9)

    J_2 = \min_{Q \in M(H_\infty)} \|W_1(I - NQ)W_2\|_2 ,    (6.7.10)

    h_\infty = \min_{R \in M(H_\infty)} \|W_1(I - N(X + R\tilde{D}))W_2\|_\infty ,    (6.7.11)

    h_2 = \min_{R \in M(H_\infty)} \|W_1(I - N(X + R\tilde{D}))W_2\|_2 .    (6.7.12)

Then J_\infty, J_2 (resp. h_\infty, h_2) are the best weighted tracking errors achievable with a two-parameter (resp. one-parameter) compensation scheme. Clearly J_\infty \le h_\infty, J_2 \le h_2. If P is a stable plant, then one can choose N = P, X = 0, \tilde{D} = I, in which case it is immediate that J_2 = h_2, J_\infty = h_\infty. On the other hand, there exist plants for which the ratios J_\infty/h_\infty or J_2/h_2 are as small as one wishes.
6.7. TWO-PARAMETER COMPENSATOR 201
To set up these examples, let us study the computation of the various minima in (6.7.9)-(6.7.12). Consider first the constants J_\infty, h_\infty in the scalar case, where

    J_\infty = \min_{q \in H_\infty} \|(1 - nq)w\|_\infty .    (6.7.13)

At this stage it can be assumed without loss of generality that w is outer: If w = w_i w_o is an inner-outer factorization of w, then multiplication by w_i does not affect the norm. Also, since w(e^{j\theta}) \ne 0 \;\forall \theta \in [0, 2\pi], w is actually a unit of H_\infty. Therefore

    J_\infty = \min_{q \in H_\infty} \|w - n_i n_o q w\|_\infty = \min_{q \in H_\infty} \|w - n_i q\|_\infty ,    (6.7.14)

after absorbing the units n_o and w into q. Thus, the problem is to find a function f of minimum H_\infty-norm that interpolates the values of w at the zeros of the inner function n_i. By definition,

    h_\infty = \min_{r \in H_\infty} \|[1 - n(x + rd)]w\|_\infty = \min_{r \in H_\infty} \|(y - rn)dw\|_\infty .    (6.7.15)

Factor d as d_i d_o, where d_i, d_o are respectively inner and outer, and absorb all units into the free parameter r. This leads to

    h_\infty = \min_{r \in H_\infty} \|[y d_o w - n_i r] d_i\|_\infty = \min_{r \in H_\infty} \|y d_o w - n_i r\|_\infty ,    (6.7.16)

since multiplication by d_i preserves norms. Thus, the problem is to find a function g of minimum H_\infty-norm that interpolates the values of y d_o w at the zeros of n_i; at such a zero z, (y d_o w)(z) = (w/d_i)(z), so h_\infty is in general larger than J_\infty. In fact, if p has a pole-zero pair close together inside the unit disc, then |d_i(\cdot)| will be very small at some zero of n_i, and h_\infty will be much larger than J_\infty.
Example 6.7.1  Let w = 1, and let

    p(z) = \frac{z - 0.5 - \delta}{z - 0.5} .

Then

    n = z - 0.5 - \delta ,  n_i = \frac{z - 0.5 - \delta}{1 - (0.5 + \delta)z} ,
    d = z - 0.5 ,  d_i = \frac{z - 0.5}{1 - 0.5z} .

Since n_i has only one zero, at z = 0.5 + \delta,

    J_\infty = |w(0.5 + \delta)| = 1 ,
    h_\infty = |(w/d_i)(0.5 + \delta)| = 1/|d_i(0.5 + \delta)| = \frac{0.75 - 0.5\delta}{\delta} .
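A numerical companion to the example (my illustration, not from the text): h_\infty = 1/|d_i(0.5 + \delta)| grows without bound as the pole-zero gap \delta shrinks, while J_\infty = 1 throughout.

```python
d_i = lambda z: (z - 0.5) / (1 - 0.5 * z)     # inner factor of d = z - 0.5

for delta in [0.1, 0.01, 0.001]:
    h_inf = 1 / abs(d_i(0.5 + delta))         # equals (0.75 - 0.5*delta)/delta
    print(delta, round(h_inf, 3), round(1 / h_inf, 6))   # ratio J_inf/h_inf -> 0
```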
By choosing \delta sufficiently small, the ratio J_\infty/h_\infty can be made as small as one wishes.

Next, consider J_2 and h_2. Recall from Section 6.2 that the space L_2[0, 2\pi] can be decomposed as the direct sum of the orthogonal subspaces H_2^+ and H_2^-, where H_2^+ consists of the boundary values of H_2-functions. Given f \in L_2[0, 2\pi], let f^+ (resp. f^-) denote its orthogonal projection onto H_2^+ (resp. H_2^-). Finally, if f \in H_2, then f^- = 0. If F \in M(H_\infty), then F_o (resp. F_i) denotes the outer (resp. inner) part of F.
For ease of discussion, suppose W_1 = I, and denote W_2 by W. Also, suppose W is a unit of M(H_\infty). Then

    J_2 = \min_{Q \in M(H_\infty)} \|(I - NQ)W\|_2
    = \min_{Q \in M(H_\infty)} \|W - N_i Q\|_2 ,  after absorbing N_o, W into Q
    = \min_{Q \in M(H_\infty)} \|N_i^* W - Q\|_2 = \|(N_i^* W)^-\|_2 ,    (6.7.18)

corresponding to the choice Q = (N_i^* W)^+. On the other hand,

    h_2 = \min_{R \in M(H_\infty)} \|(I - N(X + R\tilde{D}))W\|_2 .    (6.7.19)
To bring out the connection between h_2 and J_2 more clearly, select \tilde{X}, \tilde{Y} \in M(H_\infty) such that

    \begin{bmatrix} Y & X \\ -\tilde{N} & \tilde{D} \end{bmatrix} \begin{bmatrix} D & -\tilde{X} \\ N & \tilde{Y} \end{bmatrix} = \begin{bmatrix} D & -\tilde{X} \\ N & \tilde{Y} \end{bmatrix} \begin{bmatrix} Y & X \\ -\tilde{N} & \tilde{D} \end{bmatrix} = I .    (6.7.20)
This is always possible, by Theorem 4.1.16. If we let (\tilde{D}W)_o denote the outer part of \tilde{D}W, then

    h_2 = \min_{R \in M(H_\infty)} \|(\tilde{Y} - NR)\tilde{D}W\|_2
    = \min_{R \in M(H_\infty)} \|\tilde{Y}(\tilde{D}W)_o - NR(\tilde{D}W)_o\|_2
    = \min_{R \in M(H_\infty)} \|N_i^* \tilde{Y}(\tilde{D}W)_o - N_o R(\tilde{D}W)_o\|_2
    = \min_{R \in M(H_\infty)} \|N_i^* \tilde{Y}(\tilde{D}W)_o - R\|_2 = \|[N_i^* \tilde{Y}(\tilde{D}W)_o]^-\|_2 .    (6.7.21)

To compare (6.7.18) and (6.7.21), note that

    NX + \tilde{Y}\tilde{D} = I ,    (6.7.22)

so that

    N_i^* W = N_i^* (NX + \tilde{Y}\tilde{D})W = N_o XW + N_i^* \tilde{Y}\tilde{D}W .    (6.7.23)

However, since N_o XW \in M(H_2),

    (N_i^* W)^- = [N_i^* \tilde{Y}\tilde{D}W]^- .    (6.7.24)

So if we define F = N_i^* \tilde{Y}\tilde{D}W \in M(L_2), then

    J_2 = \|F^-\|_2 ,  h_2 = \|[F(\tilde{D}W)_i^*]^-\|_2 .    (6.7.25)

This implies that h_2 is in general larger than J_2.
Example 6.7.2  Consider the scalar case; let w = 1, and let

    p(z) = \frac{z^m}{1 - 2z} ,

where m is an integer. Then we can choose

    n = z^m ,  d = 1 - 2z ,
    x = 2^m ,  y = 1 + 2z + 4z^2 + \cdots + 2^{m-1} z^{m-1} .

In this case n_i = n since n is already inner. Hence, we have

    n_i = z^m ,  n_i^* = z^{-m} ,
    d_i = \frac{z - 0.5}{1 - 0.5z} ,  d_i^* = \frac{1 - 0.5z}{z - 0.5} ,
    f = n_i^* y d = n_i^*(1 - nx) = n_i^* - x = z^{-m} - 2^m ,
    J_2 = \|f^-\|_2 = \|z^{-m}\|_2 = 1 ,
    f d_i^* = (z^{-m} - 2^m) d_i^* ,
    h_2 = \|(f d_i^*)^-\|_2 = \|(z^{-m} d_i^*)^- - (2^m d_i^*)^-\|_2 \ge 2^m \|(d_i^*)^-\|_2 - \|(z^{-m} d_i^*)^-\|_2 .
Now expand d_i^*(z) in a Fourier series

    d_i^*(z) = \sum_{k=0}^{\infty} c_k z^{-k} ,

where

    c_0 = -0.5 ,  c_k = 1.5(0.5)^k   for k \ge 1 .

Then, since d_i is inner,

    1 = \|d_i^*\|_2^2 = \sum_{k=0}^{\infty} c_k^2 .

Therefore,

    \sum_{k=1}^{\infty} c_k^2 = 0.75 .

Since

    d_i^*(z) = \sum_{k=0}^{\infty} c_k z^{-k} ,  (d_i^*)^- = \sum_{k=1}^{\infty} c_k z^{-k} ,

we see that

    \|(d_i^*)^-\|_2^2 = \sum_{k=1}^{\infty} c_k^2 = 0.75 .

On the other hand, if m \ge 1, then we have

    (z^{-m} d_i^*)^- = z^{-m} d_i^* = \sum_{k=0}^{\infty} c_k z^{-(k+m)} ,  \|(z^{-m} d_i^*)^-\|_2 = \|d_i^*\|_2 = 1 .

Hence,

    h_2 \ge 2^{m-1}\sqrt{3} - 1 .

By choosing the integer m sufficiently large, the ratio J_2/h_2 can be made as small as one desires.
NOTES AND REFERENCES
Section 6.2 contains only the barest minimum of the theory of Hardy spaces. For excellent treatments of this topic, see Duren [34], Koosis [59] or Sz.-Nagy and Foias [69].
Solving the filtering problem of Section 6.3 was the original motivation of Youla, Jabr and Bongiorno [111] in parametrizing the set of all compensators that stabilize a given plant. Kucera [60] gives a solution to the discrete-time filtering problem that is similar in its approach to that of [111]. The treatment given here follows Francis [36].
The sensitivity minimization problem was formulated by Zames [113] and solved in the scalar case by Zames and Francis [40, 115]. The treatment here is similar to that in [36]. In the multivariable case, the fat plant case was solved by Chang and Pearson [12] and Francis, Helton, and Zames [38]. Our treatment is adapted from the former. The general sensitivity minimization problem was solved by Doyle [31]. Note that it is possible to carry out all of the steps of Sections 6.4 to 6.6 using purely state-space methods; see [16, 31, 44]. The paper of Sarason [86] contains some mathematical results pertinent to the sensitivity minimization problem. The Nevanlinna-Pick algorithm used in Section 6.5 is taken from the paper by Delsarte, Genin, and Kamp [18], but this paper treats only the square matrix case. Finally, Potapov [75] contains several useful results on matrices of analytic functions, such as Lemma 6.5.6.
CHAPTER 7
Robustness
This chapter begins with a detailed study of the issue of the robustness of feedback stability in
the presence of plant and/or compensator perturbations. Both the nature and the extent of the
perturbations that maintain feedback stability are discussed. Then the problem of robust regulation
(i.e., stabilization plus tracking and/or disturbance rejection) is examined, for both the one-parameter
as well as the two-parameter compensation schemes. Finally, various generic properties of control
systems are studied.
7.1 INTRODUCTION
Consider the standard feedback system shown in Figure 7.1, and suppose the compensator C stabilizes the plant P, i.e., that the closed-loop transfer matrix

    H(P, C) = \begin{bmatrix} (I + PC)^{-1} & -P(I + CP)^{-1} \\ C(I + PC)^{-1} & (I + CP)^{-1} \end{bmatrix}    (7.1.1)

belongs to M(S). The central question studied in this chapter is: What sort of perturbations can be
Figure 7.1: The standard feedback system, with inputs u_1, u_2, errors e_1, e_2, and outputs y_1, y_2 of the compensator C and plant P.

tolerated in the plant and/or the compensator while maintaining feedback stability? To study this question, suppose the plant description is given by a function P_\delta, where the uncertainty parameter \delta assumes values in some topological space \Delta. Typically \delta represents some physical parameters of the plant, such as component values, and the function P_\delta captures the manner in which this uncertainty in the physical parameters is reflected in the plant description. The parameter space \Delta may be R^n for some integer n, or a more general normed space, or any topological space. An example of the latter situation arises in the design of compensators for singularly perturbed systems, where \Delta = [0, \infty). For technical reasons, it is assumed that \Delta is a first-countable topological space (see Section C.1 for the definition). As all metric spaces are first-countable, this poses no practical restriction. Let P_{\delta_0} be the nominal plant description on which the compensator design is based. The question is: When is it possible to design a stabilizing compensator for P_{\delta_0} that is robust against the plant uncertainty? To answer this question, it is necessary first to define robust stabilization. The definition adopted here is that a compensator C \in S(P_{\delta_0}) robustly stabilizes the family of plants P_\delta if there is a neighborhood N of \delta_0 such that (i) C \in S(P_\delta) \;\forall \delta \in N, and (ii) the stable closed-loop transfer matrix H(P_\delta, C) is continuous in \delta at \delta_0. The first condition
means that C stabilizes not only the nominal plant P_{\delta_0} but also all perturbed plants P_\delta for all \delta sufficiently near \delta_0. But this alone is not enough: The objective of compensator design is not merely to stabilize a plant but also to shape the stable closed-loop response. Thus, the second condition is imposed, requiring that the perturbed (stable) closed-loop response H(P_\delta, C) be close to the nominal (stable) closed-loop response H(P_{\delta_0}, C) whenever \delta is near \delta_0. Note that the question as posed above is purely qualitative, in that one is not concerned about the size of the neighborhood N, but merely about its existence. The quantitative aspects of robustness are studied in the next two sections.

The question studied (and answered) in this section is: When does there exist a robustly stabilizing compensator for a given family P_\delta? The answer is: if and only if the function P_\delta is continuous in the graph topology at \delta = \delta_0. In other words, robust stabilization is possible if and only if any variations in the parameter \delta from its nominal value \delta_0 affect the plant description in a continuous manner, where the continuity of P_\delta is with respect to the graph topology.

7.2 GRAPH TOPOLOGY
... One can define graphs with respect to L_\infty and L_2 in a manner entirely analogous with (7.2.1), namely
    G_{L_\infty}(P) = \{(u, y) \in L_\infty^{n+m} : \hat{y} = \hat{P}\hat{u}\} ,    (7.2.3)

    G_{L_2}(P) = \{(u, y) \in L_2^{n+m} : \hat{y} = \hat{P}\hat{u}\} ,    (7.2.4)

where the hat denotes Laplace transformation. It can be shown [95] that these graphs can also be simply characterized in terms of an r.c.f. of P. Specifically, if (N, D) is any r.c.f. of P, then

    G_{L_\infty}(P) = \{(Dz, Nz) : z \in L_\infty^m\} ,    (7.2.5)

    G_{L_2}(P) = \{(Dz, Nz) : z \in L_2^m\} ,    (7.2.6)

where D and N are now interpreted as bounded operators on L_\infty or L_2. But technically Lemma 7.2.1 is easier to prove than (7.2.5) or (7.2.6).
Proof. Suppose u = Dz for some z \in S^m. Then Pu = ND^{-1}Dz = Nz \in S^n. Hence, for all z \in S^m, the pair (Dz, Nz) belongs to G_S(P). It remains to show that every point (u, y) \in G_S(P) is of the form (Dz, Nz) for some z \in S^m. Since N and D are right-coprime, one can select X, Y \in M(S) such that XN + YD = I. Now suppose (u, y) \in G_S(P), i.e., suppose u \in S^m, y \in S^n and y = Pu.
7.2. GRAPH TOPOLOGY 211
Define z = D^{-1}u \in [R(s)]^m. Then clearly (u, y) = (Dz, Nz). The proof is complete if it can be shown that z actually belongs to S^m. Note that

    z = D^{-1}u = (XN + YD)D^{-1}u = XND^{-1}u + Yu = Xy + Yu \in S^m ,    (7.2.7)

since all quantities in the last expression belong to M(S). \Box

Corollary 7.2.3  Suppose P \in S^{n \times m}. Then

    G_S(P) = \{(z, Pz) : z \in S^m\} .    (7.2.8)

Proof. Note that (P, I) is an r.c.f. of P. \Box
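Lemma 7.2.1 and (7.2.7) can be illustrated symbolically for a concrete unstable plant. The factors and Bezout pair below are my choices, not the text's (SymPy assumed): p(s) = 1/(s - 1), with n = 1/(s + 1), d = (s - 1)/(s + 1), and x = 2, y = 1 solving xn + yd = 1, all four being stable and proper.

```python
import sympy as sp

s = sp.symbols('s')
n, d = 1 / (s + 1), (s - 1) / (s + 1)     # an r.c.f. of p over S
p = sp.simplify(n / d)                    # p = 1/(s - 1)
x, y_ = 2, 1                              # Bezout pair: x*n + y_*d = 1
assert sp.simplify(x * n + y_ * d - 1) == 0

# Any stable z gives a graph point (u, y) = (d z, n z), and z is recovered
# from (u, y) via z = X y + Y u, exactly as in (7.2.7):
z = 1 / (s + 2)                           # an arbitrary element of S
u, yy = d * z, n * z
assert sp.simplify(yy - p * u) == 0       # (u, yy) lies on the graph of p
assert sp.simplify(x * yy + y_ * u - z) == 0
print("graph checks pass")
```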
Now we are ready to define the graph topology. Suppose (N, D) is a right-coprime pair, and that |D| \ne 0. Then there exists a constant \mu = \mu(N, D) such that (N_1, D_1) is also a right-coprime pair and |D_1| \ne 0 whenever \|[(N_1 - N)' \; (D_1 - D)']'\| < \mu. Now let

    \mathcal{N}_1 = \Big\{ \tilde{N}\tilde{D}^{-1} : \Big\| \begin{bmatrix} \tilde{N} - N_1 \\ \tilde{D} - D_1 \end{bmatrix} \Big\| < \mu_1 \Big\} ,    (7.2.11)

    \mathcal{N}_2 = \Big\{ \tilde{N}\tilde{D}^{-1} : \Big\| \begin{bmatrix} \tilde{N} - N_2 \\ \tilde{D} - D_2 \end{bmatrix} \Big\| < \mu_2 \Big\} ,    (7.2.12)

be two basic neighborhoods of the form (7.2.10), and suppose P \in \mathcal{N}_1 \cap \mathcal{N}_2. Then P has r.c.f.s (\tilde{N}_1, \tilde{D}_1), (\tilde{N}_2, \tilde{D}_2) such that

    \Big\| \begin{bmatrix} \tilde{N}_1 - N_1 \\ \tilde{D}_1 - D_1 \end{bmatrix} \Big\| =: r_1 < \mu_1 ,  \Big\| \begin{bmatrix} \tilde{N}_2 - N_2 \\ \tilde{D}_2 - D_2 \end{bmatrix} \Big\| =: r_2 < \mu_2 .    (7.2.13)

Moreover, there exists a U \in U(S) such that \tilde{N}_2 = \tilde{N}_1 U, \tilde{D}_2 = \tilde{D}_1 U. Now select a positive number r such that r \le \min\{\mu_1 - r_1, (\mu_2 - r_2)/\|U\|\}, and define

    \mathcal{N} = \Big\{ ND^{-1} : \Big\| \begin{bmatrix} N - \tilde{N}_1 \\ D - \tilde{D}_1 \end{bmatrix} \Big\| < r \Big\} .    (7.2.14)

Then \mathcal{N} is certainly a basic neighborhood containing P. It only remains to show that \mathcal{N} \subseteq \mathcal{N}_1 \cap \mathcal{N}_2. This follows readily: First,

    \Big\| \begin{bmatrix} N - \tilde{N}_1 \\ D - \tilde{D}_1 \end{bmatrix} \Big\| < r \implies \Big\| \begin{bmatrix} N - N_1 \\ D - D_1 \end{bmatrix} \Big\| < r_1 + r < \mu_1 ,    (7.2.15)

which shows that ND^{-1} \in \mathcal{N}_1. Next,

    \Big\| \begin{bmatrix} N - \tilde{N}_1 \\ D - \tilde{D}_1 \end{bmatrix} \Big\| < r \implies \Big\| \begin{bmatrix} NU - \tilde{N}_2 \\ DU - \tilde{D}_2 \end{bmatrix} \Big\| < r\|U\| \implies \Big\| \begin{bmatrix} NU - N_2 \\ DU - D_2 \end{bmatrix} \Big\| < r_2 + r\|U\| < \mu_2 ,    (7.2.16)

which shows that (NU)(DU)^{-1} = ND^{-1} \in \mathcal{N}_2. Hence \mathcal{N} \subseteq \mathcal{N}_1 \cap \mathcal{N}_2. \Box
The topology on M(R(s)) defined by the collection of sets of the form (7.2.10) as base is called the graph topology. In it, two plants P_1 and P_2 are close if they have r.c.f.s (N_1, D_1), (N_2, D_2) such that the difference (N_1, D_1) - (N_2, D_2) is small. In view of Lemma 7.2.1, this means that P_1 and P_2 are close if their graphs are close. This is the reason for the name "graph topology."
The central result of this section is that the graph topology on M(R(s)) is the weakest topology in which feedback stability is a robust property. In order to state this result precisely, it is necessary first to characterize the convergence of sequences and continuity of functions in the graph topology. Suppose \{P_i\} is a sequence in M(R(s)) and that P \in M(R(s)). Then by definition (Section C.1), \{P_i\} converges to P if and only if every neighborhood of P contains all but a finite number of terms of the sequence \{P_i\}. Since every neighborhood of P contains a basic neighborhood of P, it is easy to see that \{P_i\} converges to P if and only if every basic neighborhood of P contains all but a finite number of terms of \{P_i\}. All this is a ready consequence of the definition of the graph topology and of the definition of convergence in a topological space. The next lemma gives several equivalent characterizations of convergence in the graph topology.

Lemma 7.2.5  Suppose \{P_i\} is a sequence in M(R(s)) and that P \in M(R(s)). Then the following statements are equivalent:
(i) \{P_i\} converges to P in the graph topology.
(ii) For every r.c.f. (N, D) of P, there exist r.c.f.s (N_i, D_i) of P_i such that N_i \to N, D_i \to D in M(S).
(iii) There exist an r.c.f. (N, D) of P and a sequence \{(N_i, D_i)\} of r.c.f.s of P_i such that N_i \to N, D_i \to D in M(S).
(iv) For every l.c.f. (\tilde{D}, \tilde{N}) of P, there exists a sequence \{(\tilde{D}_i, \tilde{N}_i)\} of l.c.f.s of P_i such that \tilde{N}_i \to \tilde{N}, \tilde{D}_i \to \tilde{D} in M(S).
(v) There exist an l.c.f. (\tilde{D}, \tilde{N}) of P and a sequence \{(\tilde{D}_i, \tilde{N}_i)\} of l.c.f.s of P_i such that \tilde{N}_i \to \tilde{N}, \tilde{D}_i \to \tilde{D} in M(S).
Proof. (i) \implies (ii). This is a ready consequence of the definition of the graph topology and of the discussion in the paragraph preceding the lemma. Suppose \{P_i\} converges to P. Then every basic neighborhood of P contains all but a finite number of terms in the sequence \{P_i\}. In particular, let (N, D) be any r.c.f. of P and let \{\mu_k\} be any sequence of positive numbers converging to zero. (Assume without loss of generality that \mu_k < \mu(N, D) \;\forall k.) Define

    \mathcal{N}_k = \Big\{ \tilde{N}\tilde{D}^{-1} : \Big\| \begin{bmatrix} \tilde{N} - N \\ \tilde{D} - D \end{bmatrix} \Big\| < \mu_k \Big\} .    (7.2.17)

Then for each integer k there exists an integer j_k such that P_i \in \mathcal{N}_k \;\forall i \ge j_k. That is, there exist r.c.f.s (N_i, D_i) of P_i such that

    \Big\| \begin{bmatrix} N_i - N \\ D_i - D \end{bmatrix} \Big\| < \mu_k   \forall i \ge j_k .    (7.2.18)

From this, one can conclude that

    \Big\| \begin{bmatrix} N_i - N \\ D_i - D \end{bmatrix} \Big\| \to 0 ,    (7.2.19)
i.e., that N_i \to N, D_i \to D in M(S). This shows that (i) \implies (ii). The converse implication follows just as easily. Suppose (ii) is true and let \mathcal{N} be any basic neighborhood of P. Then \mathcal{N} is of the form (7.2.10) for some r.c.f. (N, D) of P and some \mu > 0. By (ii), there exists a sequence \{(N_i, D_i)\} of r.c.f.s of P_i such that (N_i, D_i) \to (N, D) in M(S). Hence [N_i' \; D_i']' \to [N' \; D']', so that P_i \in \mathcal{N} for all sufficiently large i; this establishes (i).

(ii) \implies (v). Select \tilde{X}, \tilde{Y} \in M(S) and an l.c.f. (\tilde{D}, \tilde{N}) of P such that

    \begin{bmatrix} Y & X \\ -\tilde{N} & \tilde{D} \end{bmatrix} \begin{bmatrix} D & -\tilde{X} \\ N & \tilde{Y} \end{bmatrix} = I .    (7.2.20)

This can always be done, in view of Theorem 4.1.16. Since (N_i, D_i) \to (N, D) and since U(S) is an open subset of M(S), it follows that

    F_i = \begin{bmatrix} D_i & -\tilde{X} \\ N_i & \tilde{Y} \end{bmatrix} \in U(S)   for large enough i .    (7.2.21)

Now let G_i = F_i^{-1} \in M(S) for large enough i, and partition G_i as

    G_i = \begin{bmatrix} Y_i & X_i \\ -\tilde{N}_i & \tilde{D}_i \end{bmatrix} .    (7.2.22)

Then \tilde{N}_i, \tilde{D}_i are left-coprime (for large enough i), since \tilde{N}_i\tilde{X} + \tilde{D}_i\tilde{Y} = I. Further, -\tilde{N}_i D_i + \tilde{D}_i N_i = 0 implies that \tilde{D}_i^{-1}\tilde{N}_i = N_i D_i^{-1} = P_i.^2 Hence (\tilde{D}_i, \tilde{N}_i) is an l.c.f. of P_i. Finally, since inversion of unimodular matrices is continuous, we have that

    G_i \to (\lim_i F_i)^{-1} = \begin{bmatrix} Y & X \\ -\tilde{N} & \tilde{D} \end{bmatrix} .    (7.2.23)

This shows that (\tilde{D}_i, \tilde{N}_i) \to (\tilde{D}, \tilde{N}), which is an l.c.f. of P. This is (v).

(v) \implies (iii). This is entirely analogous to the above. \Box

^2 It still has to be shown that \tilde{D}_i is nonsingular for this argument to be valid. However, this is straightforward; see the proof of Lemma 8.1.17.
The next result characterizes the continuity of functions mapping \Delta into M(R(s)). It is a ready consequence of Lemma 7.2.5.

Corollary 7.2.6  Suppose \Delta is a first-countable topological space, and that P_\delta is a function mapping \Delta into M(R(s)). Then the following are equivalent:
(i) The function P_\delta is continuous at \delta = \delta_0 when M(R(s)) is equipped with the graph topology.
(ii) For each r.c.f. (N_0, D_0) of P_{\delta_0}, there exists a family of r.c.f.s (N_\delta, D_\delta) of P_\delta such that (N_{\delta_0}, D_{\delta_0}) = (N_0, D_0), and such that N_\delta, D_\delta are continuous at \delta_0.
(iii) There exists a family (N_\delta, D_\delta) of r.c.f.s of P_\delta such that N_\delta, D_\delta are continuous at \delta_0.
(iv) For each l.c.f. (\tilde{D}_0, \tilde{N}_0) of P_{\delta_0}, there exists a family (\tilde{D}_\delta, \tilde{N}_\delta) of l.c.f.s of P_\delta such that (\tilde{D}_{\delta_0}, \tilde{N}_{\delta_0}) = (\tilde{D}_0, \tilde{N}_0), and such that \tilde{D}_\delta, \tilde{N}_\delta are continuous at \delta_0.
(v) There exists a family (\tilde{D}_\delta, \tilde{N}_\delta) of l.c.f.s of P_\delta such that \tilde{D}_\delta, \tilde{N}_\delta are continuous at \delta_0.

We can now state the main robustness result: a robustly stabilizing compensator exists for the family \{P_\delta\} if and only if the function P_\delta is continuous at \delta = \delta_0 in the graph topology; further, if P_\delta is continuous at \delta_0, then every C \in S(P_{\delta_0}) robustly stabilizes the family \{P_\delta\}.
Theorem 7.2.7  Suppose P_\delta is a function mapping \Delta into M(R(s)).
(i) Suppose P_\delta is continuous at \delta = \delta_0 when M(R(s)) is equipped with the graph topology. Let C be any element of S(P_{\delta_0}). Then there exists a neighborhood N of \delta_0 such that C \in S(P_\delta) \;\forall \delta \in N, and H(P_\delta, C) is continuous at \delta = \delta_0.
(ii) Conversely, suppose there exist a C \in S(P_{\delta_0}) and a neighborhood N of \delta_0 such that C \in S(P_\delta) \;\forall \delta \in N and H(P_\delta, C) is continuous at \delta = \delta_0. Then P_\delta is continuous at \delta = \delta_0 in the graph topology.

Proof. (i) Suppose P_\delta is continuous at \delta_0. By Corollary 7.2.6 (iii), there exists a family (N_\delta, D_\delta) of r.c.f.s of P_\delta such that N_\delta, D_\delta are continuous at \delta_0. Since C \in S(P_{\delta_0}), there is an l.c.f. (\tilde{D}_c, \tilde{N}_c) of C such that \tilde{D}_c D_{\delta_0} + \tilde{N}_c N_{\delta_0} = I (see Corollary 5.1.8). Moreover,

    H(P_{\delta_0}, C) = \begin{bmatrix} I - N_{\delta_0}\tilde{N}_c & N_{\delta_0}\tilde{D}_c \\ D_{\delta_0}\tilde{N}_c & D_{\delta_0}\tilde{D}_c \end{bmatrix} .    (7.2.24)

Now define \Gamma_\delta = \tilde{D}_c D_\delta + \tilde{N}_c N_\delta. Then \Gamma_{\delta_0} = I and, whenever \Gamma_\delta is unimodular,

    H(P_\delta, C) = \begin{bmatrix} I - N_\delta \Gamma_\delta^{-1}\tilde{N}_c & N_\delta \Gamma_\delta^{-1}\tilde{D}_c \\ D_\delta \Gamma_\delta^{-1}\tilde{N}_c & D_\delta \Gamma_\delta^{-1}\tilde{D}_c \end{bmatrix} .    (7.2.25)

Since \Gamma_\delta is continuous at \delta_0, \Gamma_{\delta_0} = I, and U(S) is open in M(S), there is a neighborhood N of \delta_0 such that \Gamma_\delta \in U(S), and hence C \in S(P_\delta), \forall \delta \in N. Finally, since inversion is continuous on U(S), it follows from (7.2.25) that H(P_\delta, C) is continuous at \delta = \delta_0.
(ii) By hypothesis, there exist a C \in S(P_{\delta_0}) and a neighborhood N of \delta_0 such that C \in S(P_\delta) \;\forall \delta \in N and H(P_\delta, C) is continuous at \delta_0. Since C \in S(P_{\delta_0}), by Corollary 5.1.8 there exists an l.c.f. (\tilde{D}_c, \tilde{N}_c) of C such that \tilde{D}_c D_{\delta_0} + \tilde{N}_c N_{\delta_0} = I. Now, since C \in S(P_\delta) \;\forall \delta \in N, it follows by symmetry that for each \delta \in N one can choose an r.c.f. (N_\delta, D_\delta) of P_\delta such that \tilde{D}_c D_\delta + \tilde{N}_c N_\delta = I \;\forall \delta \in N. Moreover,

    H(P_{\delta_0}, C) = \begin{bmatrix} I - N_{\delta_0}\tilde{N}_c & N_{\delta_0}\tilde{D}_c \\ D_{\delta_0}\tilde{N}_c & D_{\delta_0}\tilde{D}_c \end{bmatrix} ,    (7.2.26)

    H(P_\delta, C) = \begin{bmatrix} I - N_\delta\tilde{N}_c & N_\delta\tilde{D}_c \\ D_\delta\tilde{N}_c & D_\delta\tilde{D}_c \end{bmatrix} .    (7.2.27)

By hypothesis H(P_\delta, C) is continuous at \delta = \delta_0. Hence so is the function

    F_\delta = \begin{bmatrix} N_\delta\tilde{N}_c & N_\delta\tilde{D}_c \\ D_\delta\tilde{N}_c & D_\delta\tilde{D}_c \end{bmatrix} .    (7.2.28)

Since N_{\delta_0}, D_{\delta_0} are fixed elements of M(S) (i.e., independent of \delta), the function

    F_\delta \begin{bmatrix} N_{\delta_0} \\ D_{\delta_0} \end{bmatrix} = \begin{bmatrix} N_\delta \\ D_\delta \end{bmatrix}   (since \tilde{N}_c N_{\delta_0} + \tilde{D}_c D_{\delta_0} = I) ,    (7.2.29)

is also continuous at \delta_0. Thus, N_\delta, D_\delta are continuous at \delta_0. Finally, since H(P_\delta, C) is continuous at \delta_0, it follows that

    \lim_{\delta \to \delta_0} H(P_\delta, C) = H(P_{\delta_0}, C) .    (7.2.30)

Then from (7.2.29) it follows that

    \lim_{\delta \to \delta_0} \begin{bmatrix} N_\delta \\ D_\delta \end{bmatrix} = \begin{bmatrix} N_{\delta_0} \\ D_{\delta_0} \end{bmatrix} .    (7.2.31)

Now by Corollary 7.2.6 (ii), the map P_\delta is continuous at \delta_0. \Box
In the proof of Theorem 7.2.7, the only properties of M(S) used were (i) the set of units is
open, and (ii) inversion of a unit is continuous. Thus, Theorem 7.2.7 holds under extremely general
conditions. Specically, one can replace S by any topological ring with the above two properties, plus
the obvious ones (namely S is a commutative domain with identity), and Theorem 7.2.7 remains
valid. The same remark applies to Theorem 7.2.8 and Corollary 7.2.9 below.
Theorem 7.2.7 is addressed to the robustness of feedback stability in the case where the plant is perturbed but the compensator remains fixed. The next result is addressed to the case where both the plant and the compensator may be perturbed. Suppose a family of plants \{P_\delta\} and a family of compensators \{C_\delta\} are specified, and suppose that the nominal pair (P_{\delta_0}, C_{\delta_0}) is stable.

Theorem 7.2.8
(i) Suppose the functions P_\delta, C_\delta are continuous at \delta = \delta_0 in the graph topology. Then there exists a neighborhood N of \delta_0 such that the pair (P_\delta, C_\delta) is stable \;\forall \delta \in N, and H(P_\delta, C_\delta) is continuous at \delta = \delta_0 in the norm topology on M(S).
(ii) Conversely, suppose there is a neighborhood N of \delta_0 such that the pair (P_\delta, C_\delta) is stable \;\forall \delta \in N, and that H(P_\delta, C_\delta) is continuous at \delta = \delta_0 in the norm topology on M(S). Then the functions P_\delta, C_\delta are continuous at \delta = \delta_0 in the graph topology.
The proof of this theorem is deferred to Section 7.3. However, combining Theorems 7.2.8 and 7.2.7 leads to the following result, which is relevant in the context of singularly perturbed systems.

Corollary 7.2.9 Suppose the families {P_λ}, {C_λ} are such that (P_λ, C_λ) is stable for all λ in a neighborhood of λ_0, and that λ ↦ H(P_λ, C_λ) is continuous at λ = λ_0. Then there exist a C ∈ M(R(s)) and a neighborhood B of λ_0 such that (P_λ, C) is stable ∀λ ∈ B and λ ↦ H(P_λ, C) is continuous at λ = λ_0.
Remarks 7.2.10 The point of the corollary is this: Suppose a family of plants {P_λ} is specified, together with a nominal plant P_{λ_0}. If it is possible to design a family of compensators {C_λ} such that C_λ stabilizes P_λ for each λ and the resulting closed-loop transfer matrix H(P_λ, C_λ) is continuous at λ = λ_0, then by Theorem 7.2.8 both P_λ and C_λ are continuous in the graph topology at λ = λ_0. Now let C = lim_{λ→λ_0} C_λ (= C_{λ_0}). Then, by Theorem 7.2.7 (i), C robustly stabilizes the family {P_λ}. □
The preceding results show that the graph topology is of fundamental importance in the study
of the robustness of feedback stability. Hence it is worthwhile to develop a further understanding of
it, with particular emphasis on what continuity in this topology means. The next several propositions
are directed towards this aim.
Proposition 7.2.11 On the set M(S), viewed as a subset of M(R(s)), the norm topology and the graph topology are the same. Moreover, M(S) is an open subset of M(R(s)) in the graph topology.

Proof. To prove the first sentence, it is enough to show that convergence in the norm topology is equivalent to convergence in the graph topology in M(S). Accordingly, suppose {P_i} is a sequence in M(S) and that P ∈ M(S). Suppose first that ‖P_i − P‖ → 0. Then C = 0 stabilizes P as well as P_i, since P, P_i ∈ M(S). Further, since ‖P_i − P‖ → 0, it follows from (7.1.1) that H(P_i, 0) → H(P, 0) in M(S). Hence, by Theorem 7.2.7 (ii), we conclude that {P_i} converges to P in the graph topology.
To prove the converse, and incidentally to prove the second sentence, suppose {P_i} is a sequence in M(R(s)) (not necessarily in M(S)) converging in the graph topology to P ∈ M(S). Now (P, I) is an r.c.f. of P. Hence, by Lemma 7.2.5 (ii), there exist r.c.f.s (N_i, D_i) of P_i such that N_i → P, D_i → I in M(S). Now, I is a unimodular matrix. Since U(S) is an open subset of M(S) and since the inversion of unimodular matrices is continuous, this implies that D_i is unimodular for large enough i and that D_i^{−1} → I in M(S). Hence, P_i = N_i D_i^{−1} ∈ M(S) for large enough i; moreover, since ‖N_i − P‖ → 0 and ‖D_i^{−1} − I‖ → 0, it follows that ‖P_i − P‖ = ‖N_i D_i^{−1} − P‖ → 0. As it has been shown during the preceding argument that every sequence in M(R(s)) that converges to a limit in M(S) must eventually lie in M(S), we conclude that M(S) is an open subset of M(R(s)). □
Proposition 7.2.11 shows that, in the case of functions mapping into M(S), continuity in the graph topology is the same as continuity in the usual norm topology. In the case of unstable plant sequences, however, several strange things can happen. Before illustrating these difficulties, a preliminary result is presented that is often helpful in concluding lack of continuity.

Proposition 7.2.12 Suppose P_λ → P_{λ_0} in the graph topology as λ → λ_0, and suppose P_{λ_0} has poles s_1, …, s_k in the closed RHP.

(i) Let Ω be any compact subset of the open RHP such that none of the s_i lies on the boundary of Ω, and let μ denote the number of poles of P_{λ_0} inside Ω, counted according to their McMillan degrees. Then there is a neighborhood N of λ_0 such that P_λ also has exactly μ poles inside Ω, ∀λ ∈ N.

(ii) Suppose Ω contains no poles of P_{λ_0}; to allow for the possibility that P_{λ_0} has a pole at infinity, suppose in addition that Ω is bounded. Then P_λ(s) → P_{λ_0}(s) as λ → λ_0, uniformly for all s ∈ Ω.
Remarks 7.2.13 In general, the number of closed RHP poles of P_λ and of P_{λ_0} need not be equal. For example, let Λ = [0, ∞), λ_0 = 0 and p_λ(s) = 1/(s + λ); then p_λ → p_0 = 1/s in the graph topology, yet p_λ has no closed RHP poles when λ > 0, while p_0 has one.

Proof of Proposition 7.2.12. By Corollary 7.2.6 (iii), there exists a family (N_λ, D_λ) of r.c.f.s of P_λ such that N_λ, D_λ are continuous at λ_0. To prove (i), observe that |D_λ| is continuous at λ_0; hence |D_λ|(s) → |D_{λ_0}|(s) uniformly in s as λ → λ_0. Since |D_λ| and |D_{λ_0}| are both analytic in the open RHP, and since |D_{λ_0}| has no zeros on the boundary of Ω, it follows by applying the principle of the argument to |D_λ| and to |D_{λ_0}| that, for some neighborhood N of λ_0, the number of zeros of |D_λ| inside Ω equals that of |D_{λ_0}| whenever λ ∈ N. Since the closed RHP poles of P_λ are the zeros of |D_λ|, (i) follows.
To prove (ii), observe that since D_λ(s) → D_{λ_0}(s) uniformly over the closed RHP, it follows that [D_λ(s)]^{−1} → [D_{λ_0}(s)]^{−1} wherever the latter is well-defined, i.e., wherever |D_{λ_0}(s)| ≠ 0. Moreover, if we exclude a neighborhood of every closed RHP zero of |D_{λ_0}|, the convergence is uniform with respect to s. Since P_λ(s) = N_λ(s)[D_λ(s)]^{−1}, the same is true of P_λ. □
One of the nice features of the norm topology on M(S) is that it is a product topology: that is, a sequence of matrices {P^{(i)}} in M(S) converges to P ∈ M(S) if and only if each component sequence {p^{(i)}_{jk}} converges to p_{jk} for all j, k. Consequently, a function λ ↦ P_λ mapping into M(S) is continuous at λ_0 if and only if each component function λ ↦ p_{λ,jk} is continuous at λ_0. Now Proposition 7.2.11 shows that the graph topology and the norm topology coincide on M(S). Hence, when restricted to M(S), the graph topology is also a product topology. However, in the case where the function λ ↦ P_λ maps into unstable multivariable plants, the continuity of the matrix-valued map λ ↦ P_λ at λ_0 neither implies nor is implied by the continuity at λ_0 of the scalar-valued maps λ ↦ p_{λ,ij} for all i, j. This is illustrated through some examples. First a scalar example is presented to set the stage.
Example 7.2.14 Let Λ = R, λ_0 = 0, and consider the function

    p_λ(s) = (s − 1 − λ)/(s − 1) .   (7.2.32)

Then at λ = 0 the function p_0 equals 1, which belongs to S, while p_λ ∉ S whenever λ ≠ 0. Since M(S) is open in the graph topology (Proposition 7.2.11) and p_λ ∉ S for λ ≠ 0, the conclusion is that p_λ is not continuous at λ = 0.
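The instability mechanism in this example can be exhibited numerically: the RHP pole at s = 1 cancels only at λ = 0. The sketch below assumes the form p_λ(s) = (s − 1 − λ)/(s − 1) reconstructed above; the helper `uncancelled_rhp_poles` is an illustrative name, not something from the text.

```python
import numpy as np

# Hedged sketch of Example 7.2.14: for lam = 0 the RHP pole at s = 1 is
# cancelled by a zero; for any lam != 0 it is not, so p_lam leaves S
# however small lam is.
def uncancelled_rhp_poles(num, den, tol=1e-9):
    """Poles of num/den in Re s > 0 that are not cancelled by zeros."""
    poles = np.roots(den)
    zeros = np.roots(num) if len(num) > 1 else np.array([])
    return [p for p in poles
            if p.real > tol and not any(abs(p - z) < tol for z in zeros)]

# p_lam = (s - 1 - lam)/(s - 1): num = [1, -1 - lam], den = [1, -1]
assert uncancelled_rhp_poles([1.0, -1.0], [1.0, -1.0]) == []   # lam = 0: stable
for lam in [1e-6, 1e-3, 0.1]:
    assert len(uncancelled_rhp_poles([1.0, -1.0 - lam], [1.0, -1.0])) == 1
print("p_lam is unstable for every lam != 0")
```

This is exactly the pole-count discontinuity that Proposition 7.2.12 (i) detects.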
7. ROBUSTNESS
Example 7.2.15 Again let Λ = R, λ_0 = 0 and consider the function

    P_λ(s) = [ s/(s−1)   (s−1−λ)/(s−1) ; (2s²−1)/(s²−1)   (s²+s+1)/(s²−1) ] .   (7.2.33)

Then, from Example 7.2.14, the (1,2)-component of P_λ is not continuous at λ = 0. Nevertheless, P_λ itself is continuous at λ = 0 in the graph topology. To see this, define

    N_λ(s) = [ z   λz + z − 1 ; z   2z − 1 ] ,   D_λ(s) = [ z − 1   z ; z   z − 1 ] ,   (7.2.34)

where z = s/(s+1) ∈ S. Then it is easy to see that N_λ is continuous at λ = 0, and D_λ is independent of λ. Moreover, (N_λ, D_λ) is an r.c.f. of P_λ for each λ, so that P_λ = N_λ D_λ^{−1} is continuous in the graph topology at λ = 0. This example shows that the continuity of a matrix-valued function does not imply the continuity of its components.
Example 7.2.16 Once again let Λ = R, λ_0 = 0, and consider

    P_λ(s) = [ (s+1)/(s−1)   (s+1)/(s−1+λ) ] .   (7.2.35)

Then,

    P_0(s) = [ (s+1)/(s−1)   (s+1)/(s−1) ] .   (7.2.36)

It is immediate from (7.2.35) and (7.2.36) that each component of P_λ is continuous at λ = 0. However, P_0 has a single closed RHP pole (at s = 1) of McMillan degree one, whereas P_λ, for λ ≠ 0, has two poles in the closed RHP, each of McMillan degree one. Since Proposition 7.2.12 (i) fails to hold, the conclusion is that P_λ is not continuous at λ = 0 in the graph topology.

In this connection, note that the familiar rule

    lim_{λ→λ_0} (Q_λ + R_λ) = lim_{λ→λ_0} Q_λ + lim_{λ→λ_0} R_λ ,   (7.2.40)

with all limits taken in the graph topology, is not valid in general. Thus, the point of Proposition 7.2.18 is that (7.2.40) is valid provided one of the two limits on the right side of (7.2.40) belongs to M(S).
Proof. "if" Since Q_λ is continuous at λ_0, by Corollary 7.2.6 (iii) it follows that there exists a family (N_λ, D_λ) of r.c.f.s of Q_λ such that N_λ, D_λ are continuous at λ_0. Since R_λ ∈ M(S) is continuous at λ_0, (N_λ + R_λ D_λ, D_λ) is an r.c.f. of Q_λ + R_λ, and N_λ + R_λ D_λ is continuous at λ_0. Hence Q_λ + R_λ is continuous at λ_0 in the graph topology.

"only if" Write Q_λ = (Q_λ + R_λ) + (−R_λ) and apply the "if" part. □

It is left to the reader to verify that the return difference corresponding to (N_λ + R_λ D_λ, D_λ) is unimodular for all λ, and to extend Proposition 7.2.20 to the case where P_λ = Q_λ R_λ.
The final result along these lines shows that the graph topology on M(R(s)) is a product topology under certain circumstances.

Proposition 7.2.21 Suppose P maps a first-countable topological space Λ into M(R(s)), and that P_λ is block-diagonal of the form P_λ = Block Diag {P_λ^1, …, P_λ^n}, where the dimensions of the blocks are the same for all λ. Then P is continuous at λ_0 if and only if each function λ ↦ P_λ^i is continuous at λ_0.
The proof of this proposition is given in Section 7.3. One can generate some more results by combining the preceding three propositions in various ways.

Consider a function λ ↦ P_λ; it is natural to ask for a readily verifiable test of whether P_λ is continuous at λ_0. Such a test is provided by Theorem 7.2.7, and can be stated as follows: Find any C ∈ S(P_{λ_0}); then P_λ is continuous at λ_0 if and only if C robustly stabilizes the family {P_λ}. As an illustration, consider a family of plants P_λ described by the state equations

    ẋ(t) = A_λ x(t) + B_λ u(t) ,   y(t) = C_λ x(t) .   (7.2.41)
Suppose the triple (C_0, A_0, B_0) is stabilizable and detectable, and that (C_λ, A_λ, B_λ) is continuous in λ at λ = 0 in the usual norm topology on Euclidean space. Then the map λ ↦ P_λ is continuous at λ = 0 in the graph topology. To see this, choose F such that A_0 − FC_0 is Hurwitz; then an l.c.f. (D̃_λ, Ñ_λ) of P_λ is given by

    D̃_λ(s) = I − C_λ (sI − A_λ + FC_λ)^{−1} F ,   (7.2.42)

    Ñ_λ(s) = C_λ (sI − A_λ + FC_λ)^{−1} B_λ ,   (7.2.43)

for sufficiently small λ. Since D̃_λ, Ñ_λ are continuous at λ = 0, it follows from Corollary 7.2.6 that P_λ is continuous at λ = 0 in the graph topology, so that H(P_λ, C) is continuous at λ = 0.
In Proposition 7.2.22, each of the plants P_λ has the same order; the situation is more delicate when the order of the system drops at λ = 0, as happens in singularly perturbed systems with singular perturbation parameter λ > 0.

Proposition 7.2.23 Suppose Λ = [0, ∞), and for each λ > 0 let P_λ denote the transfer matrix of the singularly perturbed system

    ẋ_1 = A_11 x_1 + A_12 x_2 + B_1 u ,   λ ẋ_2 = A_21 x_1 + A_22 x_2 + B_2 u ,   y = C_1 x_1 + C_2 x_2 + D_1 u ,

with A_22 Hurwitz, and let P_0 denote the reduced-order model with matrices

    A = A_11 − A_12 A_22^{−1} A_21 ,  B = B_1 − A_12 A_22^{−1} B_2 ,  C = C_1 − C_2 A_22^{−1} A_21 ,  D_0 = D_1 − C_2 A_22^{−1} B_2 ,   (7.2.47)

obtained by setting λ = 0. The question is whether a compensator C_0 ∈ S(P_0) also stabilizes P_λ for small λ, in the sense that H(P_λ, C_0) approaches H(P_0, C_0) as λ → 0^+. On the basis of Theorem 7.2.7, we know that this is so if and only if P_λ is continuous in the graph topology at λ = 0. Now Proposition 7.2.23 states that continuity holds if C_2 (sI − A_22)^{−1} B_2 ≡ 0.
Proposition 7.2.23 has several interesting consequences. For instance, suppose the simplified system P_0 is stabilized by state feedback. This corresponds to the case y(t) = x(t) (i.e., C_1 = I, C_2 = 0), and C_0 = K, a constant matrix. Since the requisite condition holds in this case, it follows that the control law u(t) = Kx(t) also stabilizes P_λ for all sufficiently small positive λ, and H(P_λ, K) converges to H(P_0, K) as λ → 0^+. Similarly, if the control input u does not directly affect the fast modes (i.e., B_2 = 0), then it is again immediate that any compensator that stabilizes the reduced-order model also robustly stabilizes the singularly perturbed system for all sufficiently small positive λ.
Proof. Suppose C_2 (sI − A_22)^{−1} B_2 ≡ 0. The continuity of P_λ at λ = 0 is established by constructing a family (D̃_λ, Ñ_λ) of l.c.f.s of P_λ and showing that D̃_λ → D̃_0, Ñ_λ → Ñ_0 as λ → 0^+. Choose F such that Ā := A − FC is Hurwitz, and define

    D̃_0(s) = I − C (sI − A + FC)^{−1} F ,   (7.2.48)

    Ñ_0(s) = C (sI − A + FC)^{−1} (B − F D_0) + D_0 .   (7.2.49)
Then, by Theorem 4.2.1, (D̃_0, Ñ_0) is an l.c.f. of P_0. To obtain an l.c.f. of P_λ, let F^0 = [ F ; 0 ] and define

    Ã_λ = [ A_11   A_12 ; A_21/λ   A_22/λ ] − [ F ; 0 ] [ C_1   C_2 ]
        = [ A_11 − FC_1   A_12 − FC_2 ; A_21/λ   A_22/λ ] .   (7.2.50)
From [15], the eigenvalues of Ã_λ approach those of Ā, together with 1/λ times the eigenvalues of A_22; hence Ã_λ is Hurwitz for all sufficiently small positive λ. Thus, once again by Theorem 4.2.1, an l.c.f. of P_λ is given by

    D̃_λ(s) = I − C^0 (sI − Ã_λ)^{−1} F^0 ,   (7.2.52)

    Ñ_λ(s) = C^0 (sI − Ã_λ)^{−1} (B_λ − F^0 D_1) + D_1 ,   (7.2.53)

where

    B_λ = [ B_1 ; B_2/λ ] ,   C^0 = [ C_1   C_2 ] .   (7.2.54)
The remainder of the proof consists of studying Ñ_λ and D̃_λ, and showing that Ñ_λ → Ñ_0, D̃_λ → D̃_0 as λ → 0^+. Recall that the inverse of a partitioned matrix is given by

    [ X   Y ; W   V ]^{−1} = [ Δ^{−1}   −Δ^{−1} Y V^{−1} ; −V^{−1} W Δ^{−1}   V^{−1} + V^{−1} W Δ^{−1} Y V^{−1} ] ,   (7.2.55)

where

    Δ = X − Y V^{−1} W .   (7.2.56)
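The partitioned-inverse formula (7.2.55)–(7.2.56) is easy to verify numerically; the sketch below uses random blocks (the matrix sizes and the diagonal shift on V are arbitrary choices to keep everything invertible).

```python
import numpy as np

rng = np.random.default_rng(0)
# Random partitioned matrix [[X, Y], [W, V]] with invertible V and
# Schur complement Delta = X - Y V^{-1} W, as in (7.2.55)-(7.2.56).
n1, n2 = 3, 2
X = rng.standard_normal((n1, n1)); Y = rng.standard_normal((n1, n2))
W = rng.standard_normal((n2, n1)); V = rng.standard_normal((n2, n2)) + 5 * np.eye(n2)

Vi = np.linalg.inv(V)
Delta = X - Y @ Vi @ W
Di = np.linalg.inv(Delta)
# Block formula (7.2.55)
top = np.hstack([Di, -Di @ Y @ Vi])
bot = np.hstack([-Vi @ W @ Di, Vi + Vi @ W @ Di @ Y @ Vi])
block_inv = np.vstack([top, bot])

M = np.block([[X, Y], [W, V]])
assert np.allclose(block_inv, np.linalg.inv(M))
print("partitioned inverse formula verified")
```

The same Schur-complement structure is what produces Δ_λ in (7.2.57)–(7.2.58) below, with V = sI − A_22/λ and W = −A_21/λ.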
Hence

    D̃_λ(s) = I − [ C_1   C_2 ] [ sI − A_11 + FC_1   −A_12 + FC_2 ; −A_21/λ   sI − A_22/λ ]^{−1} [ F ; 0 ]
            = I − C_1 Δ_λ^{−1} F + C_2 V^{−1} W Δ_λ^{−1} F ,   (7.2.57)

where

    Δ_λ = (sI − A_11 + FC_1) − (A_12 − FC_2)(sI − A_22/λ)^{−1} A_21/λ
        = (sI − A_11 + FC_1) − (A_12 − FC_2)(λsI − A_22)^{−1} A_21 ,   (7.2.58)

    V = sI − A_22/λ ,   W = −A_21/λ .   (7.2.59)
Simplification of (7.2.57) yields

    D̃_λ = I − M_λ Δ_λ^{−1} F ,   (7.2.60)

where

    M_λ = C_1 − C_2 V^{−1} W = C_1 + C_2 (λsI − A_22)^{−1} A_21 .   (7.2.61)
Now, letting Ā denote the Hurwitz matrix A − FC, we get

    D̃_0 = I − C (sI − Ā)^{−1} F
         = I − C_1 (sI − Ā)^{−1} F + C_2 A_22^{−1} A_21 (sI − Ā)^{−1} F .   (7.2.62)
Hence, it can be concluded that D̃_λ → D̃_0 if it can be shown that

    Δ_λ^{−1} → (sI − Ā)^{−1} ,   (7.2.63)

    (λsI − A_22)^{−1} A_21 Δ_λ^{−1} → −A_22^{−1} A_21 (sI − Ā)^{−1} .   (7.2.64)

Note that (λsI − A_22)^{−1} does not approach −A_22^{−1} in the norm of M(S).⁴ Hence, (7.2.64) does not automatically follow from (7.2.63).

⁴ Observe that (λsI − A_22)^{−1} → 0 as s → ∞ whenever λ > 0. Hence, the norm of the difference between the functions (λsI − A_22)^{−1} and −A_22^{−1} is at least ‖A_22^{−1}‖, no matter how small λ is.
To prove (7.2.63), let Ā_11 denote A_11 − FC_1, and note that

    Δ_λ^{−1} = [ sI − Ā_11 − (A_12 − FC_2)(λsI − A_22)^{−1} A_21 ]^{−1}
             = { sI − Ā_11 + (A_12 − FC_2) A_22^{−1} A_21 − (A_12 − FC_2)[(λsI − A_22)^{−1} + A_22^{−1}] A_21 }^{−1}
             = (I − R_λ)^{−1} (sI − Ā)^{−1} ,   (7.2.65)

where

    R_λ = (sI − Ā)^{−1} (A_12 − FC_2)[(λsI − A_22)^{−1} + A_22^{−1}] A_21 .   (7.2.66)

Hence,

    Δ_λ^{−1} − (sI − Ā)^{−1} = [(I − R_λ)^{−1} − I](sI − Ā)^{−1} .   (7.2.67)
Now (I − R_λ)(s) approaches I uniformly on every compact subset of C_{+e} as λ → 0, and (sI − Ā)^{−1} → 0 as |s| → ∞. Hence Δ_λ^{−1} → (sI − Ā)^{−1}, and (7.2.63) is proved. To establish (7.2.64), observe that (λsI − A_22)^{−1} approaches −A_22^{−1} uniformly on all compact subsets of C_{+e} as λ → 0, and that both Δ_λ^{−1} and (sI − Ā)^{−1} are strictly proper. Thus (7.2.64) follows from (7.2.63) plus the strict properness of the quantities on both sides of (7.2.63). Finally, (7.2.63) and (7.2.64) together imply that D̃_λ → D̃_0 as λ → 0.
Next, we examine the numerator matrix Ñ_λ. From (7.2.53),

    Ñ_λ = [ C_1   C_2 ] [ sI − A_11 + FC_1   −A_12 + FC_2 ; −A_21/λ   sI − A_22/λ ]^{−1} [ B_1 − FD_1 ; B_2/λ ] + D_1 .   (7.2.68)
Expanding the above equation via (7.2.55) results in an expression for Ñ_λ consisting of five terms. The first term is C_1 Δ_λ^{−1}(B_1 − FD_1). Now (7.2.63) implies that this term converges to C_1 (sI − Ā)^{−1}(B_1 − FD_1). The second term in the expansion is

    −C_2 V^{−1} W Δ_λ^{−1} (B_1 − FD_1)
    = C_2 (sI − A_22/λ)^{−1} (A_21/λ) Δ_λ^{−1} (B_1 − FD_1)
    = C_2 (λsI − A_22)^{−1} A_21 Δ_λ^{−1} (B_1 − FD_1) .   (7.2.69)

From reasoning analogous to that used to establish (7.2.64), it follows that this term converges to −C_2 A_22^{−1} A_21 (sI − Ā)^{−1}(B_1 − FD_1). The third term is
    −C_1 Δ_λ^{−1} Y V^{−1} B_2/λ
    = C_1 Δ_λ^{−1} (A_12 − FC_2)(sI − A_22/λ)^{−1} B_2/λ
    = C_1 Δ_λ^{−1} (A_12 − FC_2)(λsI − A_22)^{−1} B_2 .   (7.2.70)

As before, this converges to

    −C_1 (sI − Ā)^{−1} (A_12 − FC_2) A_22^{−1} B_2 .   (7.2.71)
However, note that C_2 A_22^{−1} B_2 = 0, whence FC_2 A_22^{−1} B_2 = 0. Hence the above limit simplifies to

    −C_1 (sI − Ā)^{−1} A_12 A_22^{−1} B_2 .   (7.2.72)
The fourth term is the most complicated one; it equals

    C_2 V^{−1} (I + W Δ_λ^{−1} Y V^{−1}) B_2/λ
    = C_2 (sI − A_22/λ)^{−1} [ I + (A_21/λ) Δ_λ^{−1} (A_12 − FC_2)(sI − A_22/λ)^{−1} ] B_2/λ
    = C_2 (λsI − A_22)^{−1} [ I + A_21 Δ_λ^{−1} (A_12 − FC_2)(λsI − A_22)^{−1} ] B_2
    = C_2 (λsI − A_22)^{−1} B_2 + C_2 (λsI − A_22)^{−1} A_21 Δ_λ^{−1} (A_12 − FC_2)(λsI − A_22)^{−1} B_2 .   (7.2.73)
Now the first term on the right side is identically zero by hypothesis, while the second term is strictly proper. As a result, the second term converges without difficulty to C_2 A_22^{−1} A_21 (sI − Ā)^{−1} A_12 A_22^{−1} B_2. Of course, the fifth and final term is the constant matrix D_1. Combining the five limits, we see that Ñ_λ approaches

    (C_1 − C_2 A_22^{−1} A_21)(sI − Ā)^{−1} (B_1 − FD_1 − A_12 A_22^{−1} B_2) + D_1 .   (7.2.74)

Finally, using the relations (7.2.47), and noting that D_1 = D_0 since C_2 A_22^{−1} B_2 = 0, the above limit simplifies to

    C (sI − Ā)^{−1} (B − FD_0) + D_0 = Ñ_0 .   (7.2.75)

From Corollary 7.2.6, it now follows that P_λ converges to P_0 in the graph topology. □
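The subtlety flagged in footnote 4 — convergence of (λsI − A_22)^{−1} to −A_22^{−1} on compact sets but never in norm — can be seen numerically. The sketch below assumes the scalar case A_22 = −1, so the function in question is 1/(λs + 1) and the limit is 1; all numerical choices (the compact set |ω| ≤ 10, the "fast" frequency ω = 10/λ) are illustrative.

```python
import numpy as np

# Hedged sketch of footnote 4 with A22 = -1 (scalar), so
# (lam*s - A22)^{-1} = 1/(lam*s + 1) and -A22^{-1} = 1.
limit_val = 1.0
for lam in [0.1, 0.01, 0.001]:
    f = lambda w: 1.0 / (lam * 1j * w + 1.0)
    # uniform convergence on the compact set |w| <= 10:
    # |f(jw) - 1| = lam*|w|/sqrt(1 + (lam*w)^2) <= 10*lam there
    gap_compact = max(abs(f(w) - limit_val) for w in np.linspace(-10, 10, 201))
    assert gap_compact <= 10 * lam + 1e-12
    # but at the "fast" frequency w = 10/lam the gap stays order one,
    # so the H-infinity distance never shrinks
    assert abs(f(10.0 / lam) - limit_val) > 0.9
print("convergence on compacts, but not uniformly over the RHP")
```

This is precisely why (7.2.64) requires the extra strict-properness argument rather than following directly from (7.2.63).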
While the result contained in Proposition 7.2.23 is interesting, its use is limited by the fact that the condition it requires is quite restrictive. The origin of the restrictiveness is the definition of robust stabilizability. Recall that a controller C ∈ S(P_0) was not only required to stabilize P_λ for all sufficiently small positive λ, but the resulting stable family of transfer matrices H(P_λ, C) was required to converge uniformly over C_{+e} to H(P_0, C) as λ → 0. The next proposition is devoted to a study of the situation where a weaker form of convergence of the family H(P_λ, C) is acceptable.

Proposition 7.2.25 Consider the family {P_λ} of singularly perturbed systems above, and suppose C ∈ S(P_0) is strictly proper. Then C ∈ S(P_λ) for all sufficiently small positive λ, and [H(P_λ, C)](s) converges to [H(P_0, C)](s) uniformly over all compact subsets of C_{+e} as λ → 0.
Proof. In the interests of brevity, let us say that a family {F_λ} w-converges to F_0 if F_λ(s) → F_0(s) uniformly on every compact subset of C_{+e} as λ → 0; as in the proof of Proposition 7.2.23, {F_λ G} then converges in norm to F_0 G provided G is strictly proper.

Since C ∈ S(P_0), it has an r.c.f. (N_c, D_c) such that Ñ_0 N_c + D̃_0 D_c = I. Moreover, since C is strictly causal, so is N_c. Now consider the return difference matrix

    Δ_λ = Ñ_λ N_c + D̃_λ D_c .   (7.2.76)

In the proof of Proposition 7.2.23, it was shown that {Ñ_λ} converges to Ñ_0 provided C_2 (sI − A_22)^{−1} B_2 ≡ 0. However, from (7.2.73) one can easily verify that, even if the above condition is not satisfied, the family {Ñ_λ N_c} converges to Ñ_0 N_c, because N_c is strictly proper. Also D̃_λ D_c → D̃_0 D_c, since the condition was not used in the proof of this result (see (7.2.63)–(7.2.67)). Hence Δ_λ → Ñ_0 N_c + D̃_0 D_c = I, which implies that Δ_λ is unimodular for sufficiently small (positive) λ. By Theorem 5.1.6, this means that C ∈ S(P_λ) for all such λ, and that

    H(P_λ, C) = [ D_c Δ_λ^{−1} D̃_λ   −D_c Δ_λ^{−1} Ñ_λ ; N_c Δ_λ^{−1} D̃_λ   I − N_c Δ_λ^{−1} Ñ_λ ] .   (7.2.77)

All terms except D_c Δ_λ^{−1} D̃_λ contain at least one strictly proper factor, and each entry of H(P_λ, C) w-converges to the corresponding entry of H(P_0, C). □

It can also be shown by example that there exist families {P_λ} such that (7.2.47) holds, but C ∉ S(P_λ) for arbitrarily small positive λ when C is not strictly proper.
7.3 GRAPH METRIC

In this section, the graph topology introduced above is metrized. An r.c.f. (N, D) of a plant P ∈ M(R(s)) is said to be normalized if

    N*(s)N(s) + D*(s)D(s) = I   ∀s ,   (7.3.1)

where N*(s) = Nᵀ(−s).

Lemma 7.3.1 Every P ∈ M(R(s)) has a normalized r.c.f., which is unique to within right multiplication by an orthogonal matrix.

Proof. Let (N_1, D_1) be any r.c.f. of P, and define

    Φ(s) = N_1*(s)N_1(s) + D_1*(s)D_1(s) .   (7.3.2)

Then Φ(jω) > 0 for all ω, and Φ admits a spectral factorization Φ = W*W with W, W^{−1} ∈ M(S). Setting N_2 = N_1 W^{−1}, D_2 = D_1 W^{−1} yields an r.c.f. (N_2, D_2) of P such that

    N_2*(s)N_2(s) + D_2*(s)D_2(s) = I   ∀s .   (7.3.3)

Moreover, if (N, D) is any other normalized r.c.f. of P, then by Theorem 4.1.13 there is a unimodular matrix U ∈ U(S) such that N_2 = NU, D_2 = DU. Substituting this into (7.3.3) and using (7.3.1) shows that U*U = I, i.e., that U is a constant orthogonal matrix. □

Given plants P_1, P_2 ∈ M(R(s)) with normalized r.c.f.s (N_1, D_1), (N_2, D_2), let A_i = [ D_i ; N_i ] and define the directed distance and the graph metric

    δ(P_1, P_2) = inf_{Q ∈ M(S), ‖Q‖ ≤ 1} ‖A_1 − A_2 Q‖ ,   d(P_1, P_2) = max{ δ(P_1, P_2), δ(P_2, P_1) } .   (7.3.5)

Theorem 7.3.5 states that a sequence {P_i} converges to P in the graph topology if and only if d(P_i, P) → 0. For the "if" half of the proof: if d(P_i, P) → 0, there exist U_i ∈ M(S), ‖U_i‖ ≤ 1, with ‖A − A_i U_i‖ → 0, and U_i can be shown to be unimodular for large i; then (N_i U_i, D_i U_i) is an r.c.f. of P_i, and it follows from Lemma 7.2.5 that P_i → P in the graph topology.
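The normalization (7.3.1) is easy to exhibit in the scalar case. The sketch below is an assumed example (not from the text): for P(s) = 1/(s+1), the pair n(s) = 1/(s+√2), d(s) = (s+1)/(s+√2) satisfies n(−s)n(s) + d(−s)d(s) = 1 identically.

```python
import numpy as np

# Hedged sketch: a normalized r.c.f. in the sense of (7.3.1) for the
# assumed example P(s) = 1/(s+1).
r2 = np.sqrt(2.0)
n = lambda s: 1.0 / (s + r2)
d = lambda s: (s + 1.0) / (s + r2)

for s in [0.0, 1j, 3 - 2j, 0.5 + 4j]:
    # n*(s) n(s) + d*(s) d(s) = 1, with n*(s) = n(-s) (scalar case)
    assert np.isclose(n(-s) * n(s) + d(-s) * d(s), 1.0)
    # the pair is still a fraction for P
    assert np.isclose(n(s) / d(s), 1.0 / (s + 1.0))
print("normalized r.c.f. verified")
```

Algebraically, n(−s)n(s) + d(−s)d(s) = (1 + (1−s)(1+s))/((√2−s)(√2+s)) = (2 − s²)/(2 − s²) = 1, so the identity holds for all s, not just on the jω-axis.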
"only if" Suppose P_i → P in the graph topology, and let (N, D) be a normalized r.c.f. of P. Then by Lemma 7.2.5 there exist r.c.f.s (N_i, D_i) of P_i such that N_i → N, D_i → D. However, (N_i, D_i) need not be normalized. Let M_i = [ D_i ; N_i ], and suppose M_i = A_i R_i, where A_i corresponds to a normalized r.c.f. of P_i and R_i ∈ U(S). Now ‖M_i − A‖ → 0 and ‖M_i‖ = ‖A_i R_i‖ = ‖R_i‖, since A_i is an isometry. Hence ‖R_i‖ → 1. Define U_i = R_i/‖R_i‖. Then ‖U_i‖ = 1; moreover,

    ‖A − A_i U_i‖ ≤ ‖A − A_i R_i‖ + ‖A_i R_i − A_i U_i‖
                 = ‖A − M_i‖ + ‖R_i − U_i‖
                 = ‖A − M_i‖ + |1 − ‖R_i‖| → 0 as i → ∞ .   (7.3.15)

This shows that δ(P, P_i) → 0. To prove that δ(P_i, P) → 0, it is necessary to estimate ‖R_i^{−1}‖. Let μ_i = ‖A − M_i‖. Then for all vectors x ∈ M(L_2), we have ‖M_i x‖ ≥ ‖Ax‖ − ‖(A − M_i)x‖ ≥ (1 − μ_i)‖x‖. Since ‖x‖ = ‖A_i x‖ and ‖x‖ = ‖M_i R_i^{−1} x‖ ≥ (1 − μ_i)‖R_i^{−1} x‖, it follows that ‖R_i^{−1}‖ ≤ 1/(1 − μ_i) whenever μ_i < 1. In particular, ‖R_i^{−1}‖ → 1 as i → ∞. Let V_i = R_i^{−1}/‖R_i^{−1}‖. Then, as in (7.3.15), ‖V_i‖ = 1 and

    ‖A_i − A V_i‖ = ‖M_i R_i^{−1} − A V_i‖
                  ≤ ‖M_i R_i^{−1} − A R_i^{−1}‖ + ‖A R_i^{−1} − A V_i‖
                  ≤ ‖M_i − A‖ ‖R_i^{−1}‖ + ‖R_i^{−1} − V_i‖ → 0 as i → ∞ .   (7.3.16)

Thus δ(P_i, P) → 0. □
At present, the graph metric distance between two plants cannot be computed exactly, since the problem of determining the infimum in (7.3.5) is still open. However, it is possible to give upper and lower bounds.
Lemma 7.3.6 Suppose (N_1, D_1) is a normalized r.c.f. of P_1, (N_2, D_2) is a (not necessarily normalized) r.c.f. of P_2, and let A_1 = [ D_1 ; N_1 ], M_2 = [ D_2 ; N_2 ]. Suppose ‖A_1 − M_2‖ =: ε < 1. Then

    d(P_1, P_2) ≤ 2ε/(1 − ε) .   (7.3.17)
Proof. Suppose M_2 = A_2 R_2, where A_2 corresponds to a normalized r.c.f. of P_2 and R_2 ∈ U(S). Now ‖M_2‖ ≤ ‖A_1‖ + ‖M_2 − A_1‖ = 1 + ε. Hence ‖R_2‖ = ‖M_2‖ ≤ 1 + ε and ‖R_2^{−1}‖ ≤ 1/(1 − ε), as in the proof of Theorem 7.3.5. Moreover, (7.3.15) and (7.3.16) imply that

    δ(P_1, P_2) ≤ 2ε ,   (7.3.18)

    δ(P_2, P_1) ≤ 2ε/(1 − ε) ,   (7.3.19)

    d(P_1, P_2) = max{ δ(P_1, P_2), δ(P_2, P_1) } ≤ 2ε/(1 − ε) .   (7.3.20)

This completes the proof. □
To compute a lower bound on the graph metric distance between two plants, note that

    δ(P_1, P_2) ≥ inf_{R ∈ M(S)} ‖A_1 − A_2 R‖ .   (7.3.21)

This can be computed using the methods of Section 6.6.
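The upper bound of Lemma 7.3.6 can be estimated by sampling on the jω-axis. The sketch below is an assumed example: P_1(s) = 1/(s+1) and P_2(s) = 1/(s+a) with a = 1.2, each with the normalized r.c.f. built as in the Lemma 7.3.1 construction (the frequency grid and the choice a = 1.2 are arbitrary).

```python
import numpy as np

# Hedged sketch: graph-metric upper bound (7.3.17) for the assumed plants
# P1 = 1/(s+1), P2 = 1/(s+a), using A_i = [d_i; n_i] from normalized r.c.f.s
# d_i(s) = (s + a_i)/(s + g_i), n_i(s) = 1/(s + g_i), g_i = sqrt(1 + a_i^2).
a = 1.2
g1, g2 = np.sqrt(2.0), np.sqrt(1.0 + a * a)
A1 = lambda s: np.array([(s + 1) / (s + g1), 1 / (s + g1)])
M2 = lambda s: np.array([(s + a) / (s + g2), 1 / (s + g2)])

w = np.linspace(-200, 200, 4001)
# sampled estimate of eps = ||A1 - M2|| on the j-omega axis
eps = max(np.linalg.norm(A1(1j * x) - M2(1j * x)) for x in w)
assert eps < 1.0
bound = 2 * eps / (1 - eps)          # Lemma 7.3.6, inequality (7.3.17)
assert bound < 1.0                   # the two plants are close in the graph metric
print("graph-metric upper bound computed")
```

A grid estimate of the supremum is of course only a lower estimate of ε; for a guaranteed bound one would compute the H-infinity norm exactly.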
We are now in a position to undertake a proof of Proposition 7.2.21. For ease of reference it
is restated here in an equivalent form.
Proposition 7.3.7 Suppose {P_i} is a sequence in M(R(s)), P ∈ M(R(s)), and suppose P_i, P are block-diagonal in the form

    P_i = Block Diag { P_i^1, …, P_i^l } ,   (7.3.22)

    P = Block Diag { P̄^1, …, P̄^l } ,   (7.3.23)

where the partitioning is commensurate. Then d(P_i, P) → 0 if and only if d(P_i^j, P̄^j) → 0 as i → ∞, for each j in {1, …, l}.
Remarks 7.3.8 Proposition 7.3.7 is equivalent to Proposition 7.2.21 in view of Theorem 7.3.5.
Proof. It is enough to prove the proposition for the case where l = 2, as the general case will then follow by induction. Accordingly, let us change notation, and suppose we are given a sequence {P_i} of block-diagonal matrices and a block-diagonal matrix P having the form

    P_i = [ Q_i   0 ; 0   R_i ] ,   P = [ Q   0 ; 0   R ] .   (7.3.24)

The claim is that d(P_i, P) → 0 if and only if d(Q_i, Q) → 0 and d(R_i, R) → 0.

"if" Suppose d(Q_i, Q) → 0, d(R_i, R) → 0. Then, by Theorem 7.3.5, {Q_i} converges to Q and {R_i} converges to R in the graph topology. This implies, by Lemma 7.2.5, that there are r.c.f.s (N_{Qi}, D_{Qi}), (N_{Ri}, D_{Ri}), (N_Q, D_Q), (N_R, D_R) of Q_i, R_i, Q, R such that (N_{Qi}, D_{Qi}) → (N_Q, D_Q), (N_{Ri}, D_{Ri}) → (N_R, D_R). Now define

    N_i = [ N_{Qi}   0 ; 0   N_{Ri} ] ,   D_i = [ D_{Qi}   0 ; 0   D_{Ri} ] ,   (7.3.25)

and define N, D analogously from N_Q, N_R, D_Q, D_R. Then clearly (N_i, D_i) is an r.c.f. of P_i, (N, D) is an r.c.f. of P, and (N_i, D_i) → (N, D) (because the norm topology on M(S) is a product topology). Again by Lemma 7.2.5, {P_i} converges to P in the graph topology, so that d(P_i, P) → 0 by Theorem 7.3.5.
"only if" Suppose d(P_i, P) → 0. Then {P_i} converges to P in the graph topology. Let (N_Q, D_Q), (N_R, D_R) be normalized r.c.f.s of Q, R respectively. Then (N, D), defined as in (7.3.25), is easily shown to be a normalized r.c.f. of P. Similarly, let (N_{Qi}, D_{Qi}), (N_{Ri}, D_{Ri}) be normalized r.c.f.s of Q_i, R_i, respectively. Then (N_i, D_i) as defined in (7.3.25) is a normalized r.c.f. of P_i. By hypothesis, there is a sequence {U_i} of unimodular matrices such that

    [ D_{Qi}  0 ; 0  D_{Ri} ; N_{Qi}  0 ; 0  N_{Ri} ] [ U_{1i}  U_{2i} ; U_{3i}  U_{4i} ] → [ D_Q  0 ; 0  D_R ; N_Q  0 ; 0  N_R ] ,   (7.3.26)

where U_i is partitioned in the obvious way. Since the topology on M(S) is a product topology, each block in the partitioned matrix on the left side of (7.3.26) converges to the corresponding block on the right side of (7.3.26). In particular,

    [ D_{Qi} ; N_{Qi} ] U_{2i} → [ 0 ; 0 ] ,   [ D_{Ri} ; N_{Ri} ] U_{3i} → [ 0 ; 0 ] .   (7.3.27)

Since (N_{Qi}, D_{Qi}), (N_{Ri}, D_{Ri}) are both normalized r.c.f.s, (7.3.27) implies that U_{2i} → 0, U_{3i} → 0. Let A_i (resp. A) denote the block 4 × 2 matrix on the left (resp. right) side of (7.3.26). Since U_{2i} → 0, U_{3i} → 0, it follows that

    ‖ A_i [ 0   U_{2i} ; U_{3i}   0 ] ‖ → 0 .   (7.3.28)

Combining (7.3.26) and (7.3.28) gives

    A_i Ũ_i → A ,   (7.3.29)

where

    Ũ_i = [ U_{1i}   0 ; 0   U_{4i} ] .   (7.3.30)

Now (7.3.29) shows that ‖Ũ_i‖ → 1. Hence, if we define W_i = Ũ_i/‖Ũ_i‖, then (7.3.29) shows that ‖A − A_i W_i‖ → 0. On the other hand, since d(P_i, P) → 0, there exists a sequence {V_i} of matrices in the unit ball of M(S) such that ‖A_i − A V_i‖ → 0. Thus ‖A − A_i W_i‖ + ‖A_i − A V_i‖ < 1 for large enough i, and ‖W_i‖ = 1. Hence, by Lemma 7.3.4, both W_i and V_i are unimodular for large enough i. In particular, U_{1i} and U_{4i} are both unimodular for large enough i. Hence (N_{Qi} U_{1i}, D_{Qi} U_{1i}) is an r.c.f. of Q_i, and (N_{Ri} U_{4i}, D_{Ri} U_{4i}) is an r.c.f. of R_i. Since (7.3.29) implies that

    [ D_{Qi} ; N_{Qi} ] U_{1i} → [ D_Q ; N_Q ] ,   [ D_{Ri} ; N_{Ri} ] U_{4i} → [ D_R ; N_R ] ,   (7.3.31)

it follows from Lemma 7.2.5 that Q_i → Q, R_i → R in the graph topology. Now Theorem 7.3.5 shows that d(Q_i, Q) → 0, d(R_i, R) → 0. □
This section is concluded with a proof of Theorem 7.2.8. It is restated here in an equivalent
form for ease of reference.
Proposition 7.3.9 Suppose {P_i}, {C_i} are sequences in M(R(s)), and that P_0, C_0 ∈ M(R(s)). Then, the following statements are equivalent:

(i) d(P_i, P_0) → 0, d(C_i, C_0) → 0 as i → ∞.

(ii) d(H_i, H_0) → 0 as i → ∞, where H_i = H(P_i, C_i), H_0 = H(P_0, C_0).

Remarks 7.3.10 If the pair (P_0, C_0) is stable, then the above result reduces to Theorem 7.2.8. In general, it shows that the map from the plant-compensator pair to the closed-loop transfer matrix is continuous if all spaces are equipped with the graph topology, even if the pair (P_0, C_0) is not stable.
Proof. Let

    G_0 = [ C_0   0 ; 0   P_0 ] ,   G_i = [ C_i   0 ; 0   P_i ] .   (7.3.32)

Then Proposition 7.3.7 shows that (i) is equivalent to the following statement: (iii) d(G_i, G_0) → 0 as i → ∞. Thus the statement to be proved is: d(G_i, G_0) → 0 if and only if d(H_i, H_0) → 0. Define

    F = [ 0   I ; −I   0 ] .   (7.3.33)

Then (cf. Section 5.1)

    H_i = (I + FG_i)^{−1} ,   H_0 = (I + FG_0)^{−1} .   (7.3.34)
Suppose (N_i, D_i) is an r.c.f. of G_i. Then H_i = D_i (D_i + FN_i)^{−1}, from (7.3.34). Moreover, (D_i, D_i + FN_i) is an r.c.f. of H_i, since

    [ D_i + FN_i ; D_i ] = [ I   F ; I   0 ] [ D_i ; N_i ] ,   (7.3.35)

and the matrix on the right side of (7.3.35) is unimodular, with the inverse

    [ 0   I ; −F   F ] .   (7.3.36)

(Note that F^{−1} = −F.) Conversely, suppose (B_i, A_i) is an r.c.f. of H_i. Then

    [ D_i ; N_i ] = [ 0   I ; −F   F ] [ A_i ; B_i ] ,   (7.3.37)

is an r.c.f. of G_i. Similar remarks apply to G_0 and H_0.
The statement to be proved is that d(G_i, G_0) → 0 if and only if d(H_i, H_0) → 0. Suppose first that d(G_i, G_0) → 0. Then, by Theorem 7.3.5, {G_i} converges to G_0 in the graph topology. Hence, according to Lemma 7.2.5, there exist r.c.f.s (N_i, D_i) of G_i and (N_0, D_0) of G_0 such that N_i → N_0, D_i → D_0. Now define

    [ A_i ; B_i ] = [ I   F ; I   0 ] [ D_i ; N_i ] ,   [ A_0 ; B_0 ] = [ I   F ; I   0 ] [ D_0 ; N_0 ] .   (7.3.38)

Then (B_i, A_i) is an r.c.f. of H_i, (B_0, A_0) is an r.c.f. of H_0, and A_i → A_0, B_i → B_0. Hence {H_i} converges to H_0 in the graph topology and d(H_i, H_0) → 0. The reverse implication follows by reversing the above steps. □
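The identity H(P, C) = (I + FG)^{−1} underlying (7.3.32)–(7.3.34) can be checked at a single frequency. The sketch below uses assumed scalar sample values (P and C evaluated at one point s); the entry layout of H follows the closed-loop formulas used earlier in the chapter.

```python
import numpy as np

# Hedged sketch of (7.3.32)-(7.3.34) at one frequency, scalar data:
# G = diag(C, P), F = [[0, 1], [-1, 0]], and
# H(P, C) = [[1/(1+PC), -P/(1+CP)], [C/(1+PC), 1/(1+CP)]].
s = 2.0 + 1.5j
P = 1.0 / (s - 1.0)   # assumed plant sample value
C = 2.0               # assumed compensator sample value

F = np.array([[0.0, 1.0], [-1.0, 0.0]])
G = np.diag([C, P]).astype(complex)
H_fact = np.linalg.inv(np.eye(2) + F @ G)    # (I + F G)^{-1}

H_direct = np.array([[1 / (1 + P * C), -P / (1 + C * P)],
                     [C / (1 + P * C),  1 / (1 + C * P)]])
assert np.allclose(H_fact, H_direct)
print("H(P, C) = (I + F G)^{-1} verified")
```

Note that I + FG = [[1, P], [−C, 1]], whose determinant 1 + PC is the scalar return difference; this is why the block construction in the proof converts coprime factorizations of G directly into coprime factorizations of H.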
We conclude this section with a discussion of the graph metric in the case where the set of stable transfer functions is something other than the set S. First of all, the graph metric extends almost effortlessly to linear distributed systems. The central step in defining a graph metric is the existence of a normalized r.c.f. In the case of distributed systems, the existence of such a factorization is assured by Beurling's theorem in the scalar case and the Beurling-Lax theorem in the multivariable case, which states that a closed shift-invariant subspace of H_2^n is the image of an inner operator. Thus, even in the case of distributed systems, if a plant has an r.c.f., then it has a normalized r.c.f. Of course, not all distributed systems have r.c.f.s (see Section 8.1). Thus, the graph metric can only be defined between pairs of plants when each has an r.c.f. Next, in the case of discrete-time systems, the arguments given here carry over without any major modifications: One only has to replace the extended closed right half-plane by the closed unit disc. Finally, if the set of stable transfer functions is a set of the form S_D, where D is some prespecified region of the complex plane, or its discrete analog, then the graph metric can be defined by first conformally mapping D into C_{+e} (or the closed unit disc in the discrete-time case). This of course requires that the region D be simply connected.
7.4 ROBUSTNESS CONDITIONS

In this section, the focus is on designing stabilizing compensators for imprecisely known plants. Thus, a nominal plant description is available, together with a description of the plant uncertainty, and the objective is to design a compensator that stabilizes all plants lying within the specified band of uncertainty. The types of plant uncertainty considered include additive perturbations, multiplicative perturbations, stable-factor perturbations, and graph-metric perturbations. The solution of the design problem is divided into two steps: The first step is the derivation of necessary and sufficient conditions that any compensator must satisfy in order to be a solution of the design problem. The second step is a further examination of these conditions to see when there actually exists a compensator satisfying them. This leads naturally to the notion of optimally robust compensators. This section concludes with a discussion of the case where there are uncertainties in both the plant and compensator.

Throughout this section, several models of plant uncertainty are used; these are described next.
Suppose P_0 is a given nominal plant and that r ∈ S is a given function. An additive perturbation of P_0 is only defined in the case where P_0 has no poles on the imaginary axis. The class A(P_0, r) consists of all P ∈ M(R(s)) that have the same number of open RHP poles as P_0 (counted according to McMillan degree), and satisfy

    ‖P(jω) − P_0(jω)‖ < |r(jω)|   ∀ω ∈ R .   (7.4.1)

In particular, (7.4.1) implies that if P ∈ A(P_0, r), then P also has no poles on the jω-axis. However, P − P_0 need not be analytic in the open RHP, because even though P and P_0 have the same number of open RHP poles, they could be at different locations. Hence, (7.4.1) does not imply that ‖P(s) − P_0(s)‖ < |r(s)| ∀s ∈ C_{+e}. The assumption that r ∈ S is not too restrictive. Suppose it is known that P has the same number of RHP poles as P_0, and that ‖P(jω) − P_0(jω)‖ < |f(jω)| ∀ω, where f is some proper rational function. Then it is always possible to find a stable proper rational function r such that |r(jω)| = |f(jω)| ∀ω. In fact, one can assume without loss of generality that r is outer.
The class of multiplicative perturbations of P_0 is also defined under the restriction that P_0 has no jω-axis poles. The class M(P_0, r) consists of all P ∈ M(R(s)) that have the same number of open RHP poles as P_0 and satisfy

    P(s) = (I + M(s)) P_0(s) ,  M ∈ M(R(s)) ,  where ‖M(jω)‖ < |r(jω)|   ∀ω ∈ R .   (7.4.2)

Finally, the class of stable-factor perturbations is specified not just in terms of the nominal plant description P_0, but in terms of a particular r.c.f. (N_0, D_0) of P_0 and a function r ∈ S. The class S(N_0, D_0, r) consists of all plants P ∈ M(R(s)) satisfying

    P(s) = N(s)[D(s)]^{−1} ,  where  ‖ [ N(s) − N_0(s) ; D(s) − D_0(s) ] ‖ < |r(s)|   ∀s ∈ C_{+e} .   (7.4.3)

The rationale behind the above three models of plant uncertainty is fairly clear. In the case of both A(P_0, r) and M(P_0, r), the perturbed plant P has the same number of open RHP poles as the nominal plant P_0 (though possibly at different locations). In the case of A(P_0, r), |r(jω)| provides an upper bound on the absolute magnitude of the plant uncertainty at the frequency ω, while in the case of M(P_0, r), |r(jω)| provides an upper bound on the relative magnitude of uncertainty at the frequency ω. The class of stable-factor perturbations is free from the restrictive assumption that the perturbed plant and the unperturbed plant have the same number of RHP poles. As well, P_0 is permitted to have jω-axis poles, as is the perturbed plant P. Using (7.4.3) in conjunction with the principle of the argument over various subregions of C_+, one can localize the regions in C_+ where a P ∈ S(N_0, D_0, r) can have poles.
Theorems 7.4.1 and 7.4.2 below give the main robustness results in the case of additive and multiplicative perturbations, respectively.

Theorem 7.4.1 Suppose P_0 ∈ M(R(s)) and r ∈ S are specified, and suppose C stabilizes P_0. Then C stabilizes all P in the class A(P_0, r) if and only if⁵

    ‖[C(I + P_0 C)^{−1}](jω)‖ · |r(jω)| ≤ 1   ∀ω ∈ R .   (7.4.4)

Theorem 7.4.2 Suppose P_0 ∈ M(R(s)) and r ∈ S are specified, and suppose C stabilizes P_0. Then C stabilizes all P in the class M(P_0, r) if and only if

    ‖[P_0 C(I + P_0 C)^{−1}](jω)‖ · |r(jω)| ≤ 1   ∀ω ∈ R .   (7.4.5)

⁵ Recall that the norm ‖M‖ of a matrix M equals its largest singular value.
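Condition (7.4.4) can be checked on a frequency grid. The sketch below is an assumed scalar example: P_0(s) = 1/(s−1), the stabilizing compensator C = 2, and a hypothetical weight r(s) = ρ/(s+1); the helper name `additive_margin` is illustrative.

```python
import numpy as np

# Hedged sketch of condition (7.4.4): P0 = 1/(s-1), C = 2 (stabilizing),
# assumed weight r(s) = rho/(s+1). Here F = C(I + P0*C)^{-1} = 2(s-1)/(s+1),
# so |F(jw)| = 2 for every w.
P0 = lambda s: 1.0 / (s - 1.0)
C  = lambda s: 2.0 + 0 * s
F  = lambda s: C(s) / (1.0 + P0(s) * C(s))

def additive_margin(rho, w=np.linspace(-100, 100, 2001)):
    """sup over the grid of |F(jw)| * |r(jw)| for r = rho/(s+1)."""
    r = lambda s: rho / (s + 1.0)
    return max(abs(F(1j * x)) * abs(r(1j * x)) for x in w)

assert additive_margin(0.4) <= 1.0   # small enough weight: (7.4.4) holds
assert additive_margin(2.0) > 1.0    # too much uncertainty: (7.4.4) fails
print("robustness condition (7.4.4) checked on a grid")
```

Since |F(jω)| ≡ 2 here, the condition reduces to 2|r(jω)| ≤ 1, i.e. ρ ≤ 1/2, which the two assertions bracket from both sides. A grid check is only a necessary screen; a guaranteed test needs the exact H-infinity norm.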
The proofs of both theorems make use of a graphical criterion for feedback stability that generalizes the Nyquist criterion. As it is of independent interest, it is stated separately.

Lemma 7.4.3 Suppose P, C ∈ M(R(s)) have η_p, η_c poles in the open RHP, respectively, counted according to McMillan degree, and none on the jω-axis. Then C stabilizes P if and only if the plot of |(I + PC)(jω)|, as ω decreases from ∞ to −∞, does not pass through the origin, and encircles the origin η_p + η_c times in the clockwise sense.

A little notation and an appeal to the principle of the argument simplifies the proof of Lemma 7.4.3. Suppose f is a rational function that has neither poles nor zeros on the extended jω-axis. Then, as ω decreases from ∞ to −∞, the graph of f(jω) traces out a closed curve in the complex plane; let ind f denote the net number of counterclockwise encirclements of the origin by this curve (so that a clockwise encirclement contributes −1).

Proof of Theorem 7.4.1. "only if" Suppose (7.4.4) fails, so that there is an ω_0 (possibly zero or infinite) such that, with F = C(I + P_0 C)^{−1} and σ_1 the largest singular value of F(jω_0), we have σ_1 |r(jω_0)| > 1. Let F(jω_0) = U Σ V* be a singular value decomposition, let v_1 denote the first column of V, and let u_1 denote the first row of U*. Case (i): ω_0 is zero or infinite, so that U, V and r(jω_0) are real. Define

    Q(s) = − (r(s)/(σ_1 r(jω_0))) v_1 u_1 .   (7.4.12)

Then Q ∈ M(S) and

    ‖Q(jω)‖ = |r(jω)| / (σ_1 |r(jω_0)|) < |r(jω)|   ∀ω ∈ R ,   (7.4.13)

since σ_1 |r(jω_0)| > 1. Hence P := P_0 + Q belongs to the class A(P_0, r). However, it is claimed that C does not stabilize P. To prove this, it is enough, in view of Lemma 7.4.3, to show that |(I + PC)(jω_0)| = 0.
)| = 0.
Let v
1
denote the rst column of V, and note that V
v
1
= e
1
= [1 0 0]
because V is orthogonal.
Hence,
(I +QF)(j
0
)v
1
= v
1
(QF)(j
0
)v
1
= v
1
V
_
1/
1
0
0 0
_ _
0
0 0
_
e
1
= v
1
v
1
= 0 . (7.4.14)
240 7. ROBUSTNESS
Since v
1
= 0, this implies that |(I +QF)(j
0
)| = 0. Now |I +PC| = |I +(Q+P
0
C)| = |I +
QF| |I +P
0
C|, from which it follows that |(I +PC)(j
0
)| = 0.
Case (ii): ω_0 is nonzero and finite. In this case the only complication is that the matrices U, V and the constant r(jω_0) may not be real, so that the Q defined in (7.4.12) may not be a real rational matrix. To handle this possibility, write r(jω_0) = |r(jω_0)| exp(jθ), where θ ∈ [0, 2π). Let v_1, u_1 denote respectively the first column of V and the first row of U*. Express each component of v_1, u_1 in the form

    (v_1)_i = v̄_i exp(jφ_i) ,   (u_1)_i = ū_i exp(jψ_i) ,   (7.4.15)

where v̄_i, ū_i are real numbers and φ_i, ψ_i ∈ (−π, 0] ∀i. Define
    Q(s) = − (r(s)/(σ_1 |r(jω_0)|)) v(s) uᵀ(s) ,   (7.4.16)

where

    v(s) = [ v̄_1 (s − α_1)/(s + α_1) ; ⋮ ; v̄_n (s − α_n)/(s + α_n) ] ,   u(s) = [ ū_1 (s − β_1)/(s + β_1) ; ⋮ ; ū_n (s − β_n)/(s + β_n) ] ,   (7.4.17)

and the nonnegative constants α_i, β_i are adjusted such that⁶

    arg((jω_0 − α_i)/(jω_0 + α_i)) = φ_i ,   arg((jω_0 − β_i)/(jω_0 + β_i)) = ψ_i .   (7.4.18)

Then Q is a real rational matrix, Q ∈ M(S), and

    Q(jω_0) = −v_1 u_1 / σ_1 .   (7.4.19)

Moreover, Q satisfies (7.4.13). Hence P = P_0 + Q belongs to the class A(P_0, r). As before, (I + QF)(jω_0) is singular, because it maps v_1 into the zero vector. Hence C does not stabilize P.
if Suppose (7.4.4) holds. Let P be any plant in the class A(P
0
, r) and let denote P P
0
.
Then, by (7.4.1),
(j) < |r(j)| . (7.4.20)
Combining (7.4.4) with (7.4.20) and letting F denote C(I +P
0
C)
1
leads to
(j) F(j) < 1 . (7.4.21)
6
One can assume without loss of generality that
0
> 0, so that such
i
,
i
exist.
7.4. ROBUSTNESS CONDITIONS 241
Hence
|I +(j)F(j)| = 0 R, [0, 1] . (7.4.22)
Let X denote the topological space C 0 (i.e., the punctured plane) with the topology inherited
from C. Dene two closed curves f () and g() in X by
f (j) = 1 , g(j) = |I +(j)F(j)| . (7.4.23)
Now (7.4.22) shows that the function
h(, j) = |I +(j)F(j)| , (7.4.24)
continuously transforms the curve f into the curve g in X. Hence, the curves f and g are homotopic
in X, and as a result they have the same index with respect to 0. Since the index of f is clearly zero,
it follows that
ind g = ind |I +F| = 0 . (7.4.25)
Finally, note that
|I +PC| = |I +F| |I +P
0
C| . (7.4.26)
Since C stabilizes P
0
, it follows from Lemma 7.4.3 that
ind |I +P
0
C| = (
0
+
c
) , (7.4.27)
where
0
(resp.
c
) is the number of RHP poles of P
0
(resp. C) counted according to McMillan
degree. Now (7.4.25)(7.4.27) imply that
ind |I +PC| = (
0
+
c
) . (7.4.28)
Since P also has
0
RHP poles by assumption, (7.4.28) and Lemma 7.4.3 show that C stabilizes P.
2
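The index argument above can be checked numerically. The following sketch uses an illustrative scalar pair that is not from the text: p0(s) = 1/(s − 1) (one RHP pole, so π0 = 1) with the constant controller c = 4 (πc = 0), for which |I + PC| = 1 + p0c = (s + 3)/(s − 1). As ω decreases from ∞ to −∞, the curve should encircle the origin ind = −(π0 + πc) = −1 times.

```python
import numpy as np

# Map a dense angle grid onto omega running from ~ +1e4 down to ~ -1e4,
# so the phase of f(j w) is sampled finely where it varies fastest.
theta = np.linspace(np.pi / 2 - 1e-4, -np.pi / 2 + 1e-4, 200_001)
w = np.tan(theta)
s = 1j * w
f = 1 + 4.0 / (s - 1)          # 1 + p0(s) c(s) = (s + 3)/(s - 1)

# ind f = (1/2 pi) * net change in arg f along the curve
phase = np.unwrap(np.angle(f))
ind = (phase[-1] - phase[0]) / (2 * np.pi)
print(round(ind))              # -> -1, matching -(pi_0 + pi_c)
```

Since the grid stops at finite ω, the computed index is slightly above −1 in magnitude and is rounded to the nearest integer.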
Suppose we define Ā(P0, r) to be the set of all plants P that have the same number of RHP poles as P0, and satisfy

σ̄[P(jω) − P0(jω)] ≤ |r(jω)| ∀ω . (7.4.29)

(The difference between (7.4.1) and (7.4.29) is that the < is replaced by ≤.) Then the proof of Theorem 7.4.1 can be easily adapted to prove the following result.

Corollary 7.4.4 Suppose C ∈ S(P0). Then C stabilizes all plants in the class Ā(P0, r) if and only if

sup_ω σ̄{[C(I + P0C)^{−1}](jω)} |r(jω)| < 1 . (7.4.30)
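Condition (7.4.30) is easy to evaluate numerically in the scalar case. The sketch below uses an illustrative pair that is not from the text: P0(s) = 1/(s − 1), C = 4 (so C(I + P0C)^{−1} = 4(s − 1)/(s + 3)), and a constant weight r = 0.1.

```python
import numpy as np

# Grid over the nonnegative jw-axis (for a real rational F, |F(jw)| = |F(-jw)|).
w = np.linspace(0.0, 1e4, 1_000_001)
s = 1j * w
F = 4 * (s - 1) / (s + 3)      # C(I + P0 C)^{-1}
r = 0.1

lhs = np.max(np.abs(F) * r)    # sup_w sigma_bar[F(jw)] |r(jw)|
print(lhs)                     # ~ 0.4 < 1, so C stabilizes the whole class
```

Here sup_ω |F(jω)| approaches 4 as ω → ∞, so the left side of (7.4.30) is about 0.4: every plant with the same number of RHP poles and additive error below 0.1 in magnitude is stabilized by this C.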
Sketch of Proof of Theorem 7.4.2. Suppose P = (I + M)P0. Then

|I + PC| = |I + (I + M)P0C| = |I + MG| · |I + P0C| , (7.4.31)

where G = P0C(I + P0C)^{−1}. By assumption, M satisfies σ̄[M(jω)] < |r(jω)| ∀ω, and (7.4.5) states that σ̄[G(jω)] |r(jω)| ≤ 1 ∀ω. □
To derive a result analogous to Corollary 7.4.4, define M̄(P0, r) to be the set of all plants P that have the same number of RHP poles as P0, and satisfy P = (I + M)P0 where M ∈ M(R(s)) and

σ̄[M(jω)] ≤ |r(jω)| ∀ω . (7.4.32)

Corollary 7.4.5 Suppose C ∈ S(P0). Then C stabilizes all plants in the class M̄(P0, r) if and only if

sup_ω σ̄{[P0C(I + P0C)^{−1}](jω)} |r(jω)| < 1 . (7.4.33)
Finally, consider the case of stable-factor perturbations.

Theorem 7.4.6 Suppose P0 ∈ M(R(s)). Suppose an r.c.f. (N0, D0) of P0 and a function r ∈ S are specified, and define the class S(N0, D0, r) as in (7.4.3). Suppose C ∈ S(P0), and select an l.c.f. (D̃c, Ñc) of C such that D̃cD0 + ÑcN0 = I. Then C stabilizes all P in the class S(N0, D0, r) if and only if

σ̄{[D̃c(jω) Ñc(jω)]} |r(jω)| ≤ 1 ∀ω . (7.4.34)

Remarks 7.4.7 Since D̃c, Ñc ∈ M(S) and r ∈ S, the condition (7.4.34) actually implies that σ̄{[D̃c(s) Ñc(s)]} |r(s)| ≤ 1 ∀s ∈ C+e.
Proof. For convenience, define

A0 = [D0 ; N0] , A = [D ; N] , Ãc = [D̃c Ñc] , (7.4.35)

where A0 and A are block columns and Ãc is a block row.
"if": Suppose (7.4.34) holds, and suppose P is an arbitrary plant in the class S(N0, D0, r). Then P has an r.c.f. (N, D) such that σ̄[A(s) − A0(s)] < |r(s)| ∀s ∈ C+e. Now consider the return difference matrix D̃cD + ÑcN = ÃcA. Since ÃcA0 = I, it follows that ÃcA = I + Ãc(A − A0). However, from (7.4.34) we get σ̄{[Ãc(A − A0)](s)} ≤ σ̄[Ãc(s)] · σ̄{[A − A0](s)} < 1 ∀s ∈ C+e. This shows that |(ÃcA)(s)| = |I + [Ãc(A − A0)](s)| ≠ 0 ∀s ∈ C+e, so that ÃcA is unimodular. Hence, by Theorem 5.1.6, C stabilizes P.
"only if": Suppose (7.4.34) is false, and select an ω0 such that σ̄[Ãc(jω0)] |r(jω0)| > 1. Let σ1 ≥ σ2 ≥ ··· ≥ σm denote the singular values of Ãc(jω0), and note that σ1 |r(jω0)| > 1. Let Σ = Diag{σ1, ···, σm}, and select unitary matrices U, V such that⁷

U Ãc(jω0) V = [Σ 0] . (7.4.36)

Since Ãc(jω0) A0(jω0) = I, it follows that A0(jω0) must be of the form

A0(jω0) = V [Σ^{−1} ; X] U . (7.4.37)

By assumption, σ1 |r(jω0)| > 1. Now define R = Diag{1/σ1, 0, ···, 0}, and

Q(s) = −V [R ; T] U · r(s)/r(jω0) , (7.4.38)

where T is a constant matrix chosen as follows: If the first column of the matrix X in (7.4.37) is nonzero, simply choose T = 0. Otherwise, if the first column of X is zero, choose T to be a matrix whose columns are all zero except for the first column, and whose first column is nonzero and has sufficiently small norm that σ̄[Q(jω0)] < |r(jω0)|. If T is so chosen, then the first column of X − T is not identically zero. Now let P be the plant ND^{−1}, where [D ; N] = A = A0 + Q. Then P belongs to the class S(N0, D0, r), since σ̄[Q(s)] < |r(s)| ∀s ∈ C+e.⁸ However,

A(jω0) = A0(jω0) + Q(jω0) = V [Σ^{−1} − R ; X − T] U , (7.4.39)

and the first column of Σ^{−1} − R is zero. Hence Ãc(jω0) A(jω0) is singular, and ÃcA is not unimodular.
The question is: Does the singularity of |D̃cD + ÑcN| at jω0 imply that C fails to stabilize P? We can conclude that this is so, provided it can be shown that N and D are right-coprime, or at least, that any greatest common right divisor of N and D is nonsingular at jω0. To amplify this argument, suppose N = N1B, D = D1B, where B is a g.c.r.d. of N and D, and N1 and D1 are right-coprime. Then |D̃cD + ÑcN| = |D̃cD1 + ÑcN1| · |B|. Hence, if |B(jω0)| ≠ 0, then |D̃cD1 + ÑcN1| vanishes at jω0, and C does not stabilize P. Thus the proof is complete if it can be shown that |B(jω0)| ≠ 0. For this it is enough to show that the matrix [D ; N] = A0 + Q has full column rank at jω0. But this last fact is immediate from (7.4.39), since the first column of X − T is nonzero. □
⁷ Note that Ãc(s) is a "fat" matrix and has full row rank at all s ∈ C+e.
⁸ The matrix Q may not be a real rational matrix; but this can be fixed up as in the proof of Theorem 7.4.1.

There is an interesting anomaly associated with the robustness condition (7.4.34). Comparing Theorem 7.4.1 with Corollary 7.4.4, or Theorem 7.4.2 with Corollary 7.4.5, we see that it makes very little difference, in the case of additive and multiplicative perturbations, whether the class of perturbations is defined with a strict or nonstrict inequality. If the class of perturbations is defined with a strict inequality, then the robustness condition has a nonstrict inequality, and vice versa. But this is not so in the case of stable-factor perturbations. Define the class S̄(N0, D0, r) by

S̄(N0, D0, r) = {P = ND^{−1} : σ̄{[(N − N0)(s) ; (D − D0)(s)]} ≤ |r(s)| ∀s ∈ C+e} . (7.4.40)
One might be tempted to conjecture the following result: C ∈ S(P0) stabilizes all P in the class S̄(N0, D0, r) if and only if

sup_ω σ̄[Ãc(jω)] |r(jω)| < 1 . (7.4.41)

But this is false: (7.4.41) is certainly sufficient for robust stability, but not necessary, as the next example shows.
Example 7.4.8 Consider

p0(s) = 4/(s − 3) = n0(s)/d0(s) ,

where

n0(s) = 4(s + 7)/(s + 1)² , d0(s) = (s − 3)(s + 7)/(s + 1)² .

Let c(s) = 4/(s + 5). Then c stabilizes p0. Moreover, d̃c d0 + ñc n0 = 1 with

d̃c(s) = (s + 5)/(s + 7) , ñc(s) = 4/(s + 7) .

Now consider the class S̄(n0, d0, r) with r ≡ 1. Since

‖[d̃c(∞) ; ñc(∞)]‖ = ‖[1 ; 0]‖ = 1 ,

(7.4.41) does not hold. Nevertheless, c stabilizes every p in the class S̄(n0, d0, r), which shows that (7.4.41) is not always necessary for robust stability.
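The data in Example 7.4.8 can be verified numerically: the Bezout identity d̃c d0 + ñc n0 = 1 holds on the jω-axis, and the supremum in (7.4.41) equals 1 (approached as ω → ∞), so the conjectured condition indeed fails with r = 1.

```python
import numpy as np

w = np.linspace(0.0, 1e4, 100_001)
s = 1j * w
n0 = 4 * (s + 7) / (s + 1) ** 2
d0 = (s - 3) * (s + 7) / (s + 1) ** 2
dc = (s + 5) / (s + 7)            # dc~ from the example
nc = 4 / (s + 7)                  # nc~ from the example

assert np.allclose(dc * d0 + nc * n0, 1.0)       # Bezout identity
col_norm = np.sqrt(np.abs(dc) ** 2 + np.abs(nc) ** 2)
print(np.max(col_norm))                          # -> just below 1, sup = 1
```

In fact |d̃c(jω)|² + |ñc(jω)|² = (ω² + 41)/(ω² + 49) < 1 for every finite ω, with limit 1, which is why the supremum is attained only at ω = ∞.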
To establish the claim made in the above paragraph, suppose p(s) = n(s)/d(s), where

‖[(n − n0)(s) ; (d − d0)(s)]‖ ≤ 1 ∀s ∈ C+e .

Let Q = [q1 q2]′ denote [(d − d0) (n − n0)]′.⁹ We prove the contrapositive: suppose g is a rational H∞ function, and expand it about z = 1 as g(z) = Σ_{i=0}^∞ gi (z − 1)^i. If a − g has (at least) a double zero at z = 1, then

g(1) = a(1) = 1 , g′(1) = a′(1) = 1 .

Hence, for z < 1 and sufficiently close to 1, we have g(z) > 1. The existence of robustly stabilizing compensators can now be characterized using the H∞-norm minimization results derived in Sections 6.5 and 6.6.
Theorem 7.4.9 Suppose a nominal plant P0 ∈ M(R(s)), free of poles on the extended jω-axis, and a function r ∈ S are specified. Suppose r does not have any zeros on the extended jω-axis. Let (N, D), (D̃, Ñ) be any r.c.f. and any l.c.f. of P0, and let X̃, Ỹ ∈ M(S) be any particular solutions of the identity ÑX̃ + D̃Ỹ = I. Factor D, D̃ in the form

D = Di Do , D̃ = D̃o D̃i , (7.4.42)

where Di, D̃i are inner and Do, D̃o are outer, and assume without loss of generality that |Di| = |D̃i|.¹⁰ Finally, factor r as ro ri, where ro, ri are outer and inner, respectively. Under these conditions, there exists a C ∈ S(P0) that stabilizes all plants in the class A(P0, r) if and only if there exists an S ∈ M(S) such that

‖Di^adj X̃ D̃o ro + φ S‖ ≤ 1 , (7.4.43)

where φ = |Di| = |D̃i|.
Proof. By Theorem 5.2.1, the set of all C ∈ S(P0) is given by

S(P0) = {(X̃ + DS)(Ỹ − NS)^{−1} : S ∈ M(S), |Ỹ − NS| ≠ 0} , (7.4.44)

where the nonsingularity constraint is satisfied by an open dense subset of S ∈ M(S) (see Lemma 5.2.4). Moreover, from (5.2.7), if C = (X̃ + DS)(Ỹ − NS)^{−1}, then

C(I + P0C)^{−1} = (X̃ + DS)D̃ . (7.4.45)

Hence, there exists a C ∈ S(P0) satisfying (7.4.4) if and only if there exists an S ∈ M(S) such that

‖X̃D̃r + DSD̃r‖ ≤ 1 . (7.4.46)
A few simple manipulations bring (7.4.46) into the form (7.4.43). Note that

‖X̃D̃r + DSD̃r‖ = ‖X̃ D̃o D̃i ro ri + Di Do S D̃o D̃i ro ri‖
= ‖φ Di^adj X̃ D̃o ro ri + φ² Do S D̃o ro ri‖
= ‖φ ri (Di^adj X̃ D̃o ro + Q)‖
= ‖Di^adj X̃ D̃o ro + Q‖ , (7.4.47)

where, in successive steps, we multiply on the left by Di^adj and on the right by D̃i^adj, replace the matrix S by the new free parameter Q = φ Do S D̃o ro, and observe that multiplication by the inner functions ri and φ preserves norms. Hence, (7.4.46) and (7.4.43) are equivalent. □

¹⁰ By Theorem 4.1.15, |D| and |D̃| are associates; so are |Do| and |D̃o|, since both are units. Hence, |Di| and |D̃i| are associates, and since both are inner, |Di| = ±|D̃i| in any case.
If φ has only simple RHP zeros (or equivalently, if P0 has only simple RHP poles), then (7.4.43) can be stated in a more readily testable form using the results of Section 6.5.

Corollary 7.4.10 Suppose φ has only simple RHP zeros, and let λ1, ···, λn denote these zeros. Let Fj = (Di^adj X̃ D̃o ro)(λj), j = 1, ···, n, and define

Qjk = (I − Fj* Fk) / (λ̄j + λk) , 1 ≤ j, k ≤ n , (7.4.48)

Q = [Q11 ··· Q1n ; ⋮ ⋱ ⋮ ; Qn1 ··· Qnn] . (7.4.49)

Then there exists a C ∈ S(P0) that stabilizes all plants in the class A(P0, r) if and only if Q is nonnegative definite.
In (7.4.43), X̃ is any particular solution of the identity ÑX̃ + D̃Ỹ = I. Hence, it is nice to know that the test matrix Q is the same no matter which X̃ is used. To see this, let X̃1, Ỹ1 be another set of matrices in M(S) satisfying ÑX̃1 + D̃Ỹ1 = I. Then, from the proof of Theorem 5.2.1, it follows that X̃1 = X̃ + DR for some R ∈ M(S). Hence,

Di^adj X̃1 D̃o ro = Di^adj (X̃ + DR) D̃o ro = Di^adj X̃ D̃o ro + φ Do R D̃o ro , (7.4.50)

since D = Di Do (and Di^adj Di = φ I). Now the second term on the right side of (7.4.50) vanishes at all zeros of φ. Hence F1, ···, Fn are independent of which particular solution X̃ is used to compute them. Similar remarks apply even in the case where φ has repeated zeros.
If the plant P0 is scalar, then the expression for Fj (or fj in the scalar case) is more elegant. In this case, Di^adj = 1 (since by convention the adjoint of a 1 × 1 matrix is 1) and fj = (x̃ do ro)(λj). Now, in the scalar case φ = di; hence, it follows from d = di do = φ do and n x̃ + d ỹ = 1 that, at any zero λj of φ, we have n(λj) x̃(λj) = 1. Hence,

x̃(λj) do(λj) = do(λj) / n(λj) . (7.4.51)

With the plant p0, associate the stable plant q0 = p0 φ = n/do. Then

fj = [do(λj) / n(λj)] ro(λj) = (ro/q0)(λj) . (7.4.52)
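The test of Corollary 7.4.10 is easy to run numerically in the scalar case. The sketch below uses a hypothetical plant that is not from the text: p0(s) = 1/((s − 1)(s − 2)), with simple RHP poles λ = 1, 2, so q0 = n/do = 1/((s + 1)(s + 2)) and, by (7.4.52), fj = ro (λj + 1)(λj + 2) for a constant outer weight ro.

```python
import numpy as np

def pick_matrix(rho, lam=(1.0, 2.0)):
    # Build the matrix of (7.4.48)-(7.4.49) for real scalar data:
    # f_j = rho (lam_j + 1)(lam_j + 2), Q_jk = (1 - f_j f_k)/(lam_j + lam_k).
    f = [rho * (l + 1) * (l + 2) for l in lam]
    n = len(lam)
    return np.array([[(1 - f[j] * f[k]) / (lam[j] + lam[k])
                      for k in range(n)] for j in range(n)])

eig_small = np.linalg.eigvalsh(pick_matrix(0.01))   # small uncertainty weight
eig_large = np.linalg.eigvalsh(pick_matrix(0.05))   # larger uncertainty weight
print(eig_small.min() > 0, eig_large.min() > 0)     # -> True False
```

So for this plant a robustly stabilizing compensator exists for the additive uncertainty radius ro = 0.01 but not for ro = 0.05: the admissible uncertainty is limited by the RHP poles.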
The treatment of multiplicative perturbations is entirely similar to that of additive perturbations.

Theorem 7.4.11 Suppose a function r ∈ S and a nominal plant P0 ∈ M(R(s)) are specified, and suppose that P0 has no extended jω-axis poles. Let (N, D), (D̃, Ñ) be any r.c.f. and any l.c.f. of P0, and let X, Y ∈ M(S) be any particular solutions to the identity XN + YD = I. Under these conditions, there exists a C ∈ S(P0) that stabilizes all plants in the class M(P0, r) if and only if there exists an R ∈ M(S) such that

‖N(X + RD̃) r‖ ≤ 1 . (7.4.53)

Further, if P0 has at least as many inputs as outputs and r has no extended jω-axis zeros, then (7.4.53) can be simplified as follows: Factor N, D̃ in the form

N = Ni No , D̃ = D̃o D̃i , (7.4.54)

where Ni, D̃i are inner and No, D̃o are outer. Finally, factor r as ri ro, where ri is inner and ro is a unit of S. Under these conditions, there exists a C ∈ S(P0) that stabilizes all plants in the class M(P0, r) if and only if there exists an S ∈ M(S) such that

‖No X D̃i^adj ro + φ S‖ ≤ 1 , (7.4.55)

where φ = |D̃i|.

Proof. From Theorem 5.2.1, every C ∈ S(P0) is of the form (Y − RÑ)^{−1}(X + RD̃) for some R ∈ M(S). Moreover,

P0C(I + P0C)^{−1} = P0(I + CP0)^{−1}C = N(X + RD̃) . (7.4.56)

Hence (7.4.5) holds for some C ∈ S(P0) if and only if there exists an R ∈ M(S) such that (7.4.53) holds. Now suppose the additional assumptions on P0, r hold. Then, using by now familiar manipulations, we note that

‖NXr + NRD̃r‖ = ‖Ni No X ro ri + Ni No R D̃o D̃i ro ri‖ = ‖Ni ri (No X ro D̃i^adj + φ S)‖ = ‖No X ro D̃i^adj + φ S‖ , (7.4.57)

where S = No R D̃o ro and we use the fact that multiplication by the inner matrix Ni ri preserves norms. Now (7.4.57) is precisely (7.4.55). □
If the plant P0 is scalar, one can again obtain a simple expression for No X ro D̃i^adj evaluated at the zeros of φ. Suppose nx + dy = 1 and suppose φ(λ) = 0 at some point λ ∈ C+. Then d(λ) = 0, n(λ)x(λ) = 1, and (No X ro D̃i^adj)(λ) = ro(λ)/ni(λ).
7.4. ROBUSTNESS CONDITIONS 249
The foregoing results also lead very naturally to the notion of optimally robust compensators.
Suppose a nominal plant P
0
is given, together with a function f S, which represents an uncertainty
prole. Consider now the class A(P
0
, r) consisting of all plants satisfying (7.4.1), where r = f . It
is now reasonable to ask: What is the largest value of the parameter for which the class A(P
0
, r)
is robustly stabilizable, and what is a corresponding robustly stabilizing compensator? This problem
can be solved very easily using the results derived so far. Factor f as f
i
f
o
where f
i
, f
o
are respectively
inner and outer, and dene
= min
SM(S)
D
adj
i
X
D
o
f
o
+S , (7.4.58)
where all symbols are as in Theorem 7.4.9. Now, by applying Theorem 7.4.9 with r = f , it follows
that there exists a single compensator that stabilizes all the plants in the class A(P
0
, r) if and only
if 1. Hence, the largest value of for which there exists a robustly stabilizing compensator is
given by = 1/ . Further, if S is any matrix which attains the minimum in (7.4.58), then the com-
pensator C = (
X +DS)(
Y NS)
1
is an optimally robust compensator. The case of multiplicative
perturbations is entirely similar and is left to the reader.
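For a scalar plant with a single RHP pole, the minimization (7.4.58) reduces to one interpolation constraint, and the minimum of the norm over the free parameter is simply the magnitude of the interpolated value at the RHP zero of φ. The sketch below applies this to a hypothetical example not from the text: p0(s) = 1/(s − 1) with constant uncertainty profile f = 1, so φ has the single zero λ = 1 and q0 = n/do = 1/(s + 1).

```python
# Single-interpolation-point case of (7.4.58): gamma = |(fo/q0)(lam)|.
lam = 1.0
fo = 1.0                       # outer factor of the uncertainty profile f = 1
q0 = lambda s: 1.0 / (s + 1)   # stable plant associated with p0 (q0 = p0*phi)

gamma = abs(fo / q0(lam))      # = |fo(1)| * (1 + 1) = 2
alpha_max = 1.0 / gamma        # largest robustly stabilizable radius

print(gamma, alpha_max)        # -> 2.0 0.5
```

So, under these assumptions, the largest additive uncertainty radius for which a single robustly stabilizing compensator exists is ᾱ = 0.5.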
Next, the case of stable-factor perturbations is studied.

Theorem 7.4.12 Suppose a function r ∈ S and a nominal plant P0 ∈ M(R(s)) are specified, together with an r.c.f. (N0, D0) of P0. Let X, Y ∈ M(S) satisfy XN0 + YD0 = I, and let (D̃0, Ñ0) be any l.c.f. of P0. Then there exists a C ∈ S(P0) that stabilizes all plants in the class S(N0, D0, r) if and only if there exists a matrix R ∈ M(S) such that ‖(A − RB) r‖ ≤ 1, where

A = [X Y] , B = [−D̃0 Ñ0] . (7.4.59)

The proof is obvious and is left to the reader.
This section is concluded with a discussion of the case where there are perturbations in both the plant and the compensator. The results in this case make use of the graph metric introduced in Section 7.3. A bit of notation makes the presentation smoother. Given P, C ∈ M(R(s)), define

T(P, C) = H(P, C) − [I 0 ; 0 0] , (7.4.60)

where H(P, C) is defined in (7.1.1). Using this last equation, one can derive a few other useful representations of T(P, C). For instance,

T(P, C) = [−PC(I + PC)^{−1}  −P(I + CP)^{−1} ; C(I + PC)^{−1}  (I + CP)^{−1}] = [−P ; I] (I + CP)^{−1} [C I] . (7.4.61)
Theorem 7.4.13 Suppose the pair (P0, C0) is stable, and that P0, C0 are perturbed to P, C, respectively. Then the pair (P, C) is stable provided

d(P, P0) ‖T(P0, C0)‖ + d(C, C0) ‖T(C0, P0)‖ < 1 . (7.4.62)

Corollary 7.4.14 Under the hypotheses of Theorem 7.4.13, the pair (P, C0) is stable provided d(P, P0) < 1/‖T(P0, C0)‖.

The significance of Theorem 7.4.13 and Corollary 7.4.14 is as follows: The type of plant perturbations studied in these two results is the most unstructured one considered in this section, in that (i) one is permitted to simultaneously perturb both the plant and the compensator, (ii) there is no restriction that the number of RHP poles of the perturbed and unperturbed plants be the same, and (iii) the perturbations are not couched in terms of a particular coprime factorization of the plant. The stability condition (7.4.62) is interesting in that the effects of the perturbations in the plant and controller enter additively. When only the plant is perturbed, the stability condition given in Corollary 7.4.14 is reminiscent of the small gain theorem (see, e.g., [D1, Chapter VI]). These results also bring out the utility of the graph metric in obtaining quantitative robustness estimates.
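For scalar p and c, the rank-one form (7.4.61) gives σ̄[T(p, c)(jω)] = √(1 + |p|²) · √(1 + |c|²) / |1 + pc|, which makes ‖T(P0, C0)‖ easy to compute on a grid. The sketch below evaluates it for an illustrative pair not from the text: p0 = 1/(s − 1), c0 = 4; Corollary 7.4.14 then guarantees stability of (P, C0) for every plant within graph-metric distance 1/‖T(P0, C0)‖ of p0.

```python
import numpy as np

w = np.linspace(0.0, 1e4, 1_000_001)
s = 1j * w
p, c = 1 / (s - 1), 4.0

# sigma_bar of the rank-one matrix [-p; 1] (1 + cp)^{-1} [c  1]
T = np.sqrt(1 + np.abs(p) ** 2) * np.sqrt(1 + c ** 2) / np.abs(1 + p * c)

norm_T = T.max()               # sup over the jw-axis; here ~ sqrt(17)
print(norm_T, 1 / norm_T)      # robustness radius ~ 0.2425
```

For this pair σ̄[T(jω)]² = 17(2 + ω²)/(9 + ω²), so the supremum √17 is approached as ω → ∞; the grid value is slightly below it.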
Proof. Let (N0, D0), (X̃0, Ỹ0) be normalized r.c.f.'s of P0, C0, respectively. Then from the stability of the pair (P0, C0), and Problem 5.1.1, the matrix

U0 = [D0 −X̃0 ; N0 Ỹ0] , (7.4.63)

is unimodular. Let V0 = U0^{−1}, and partition V0 as

V0 = [Y0 X0 ; −Ñ0 D̃0] . (7.4.64)

Then (D̃0, Ñ0), (Y0, X0) are l.c.f.'s of P0, C0, respectively.

Now select real numbers δp > d(P, P0), δc > d(C, C0) such that

δp ‖T(P0, C0)‖ + δc ‖T(C0, P0)‖ < 1 . (7.4.65)

This is possible in view of (7.4.62). Let (Np, Dp), (Nc, Dc) be any normalized r.c.f.'s of P, C, respectively. Then, from the definition of the graph metric (see (7.3.2)), there exist matrices Wp, Wc ∈ M(S) with ‖Wp‖ ≤ 1, ‖Wc‖ ≤ 1 such that

‖[D0 − Dp Wp ; N0 − Np Wp]‖ ≤ δp , (7.4.66)

‖[Ỹ0 − Dc Wc ; X̃0 − Nc Wc]‖ ≤ δc . (7.4.67)
Define F as in (7.3.33) and observe that F is inner, so that multiplication by F is norm-preserving. Thus (7.4.67) implies that

‖[−(X̃0 − Nc Wc) ; Ỹ0 − Dc Wc]‖ ≤ δc . (7.4.68)

Next, define

U = [Dp Wp  −Nc Wc ; Np Wp  Dc Wc] = [Dp −Nc ; Np Dc] [Wp 0 ; 0 Wc] . (7.4.69)
It is now shown that (7.4.65) implies the unimodularity of U. This will show, a fortiori, that Wp and Wc are both unimodular, and that the pair (P, C) is stable, by Problem 5.1.1. Note that

U − U0 = −[A B] , (7.4.70)

where

A = [D0 − Dp Wp ; N0 − Np Wp] , B = [−(X̃0 − Nc Wc) ; Ỹ0 − Dc Wc] . (7.4.71)

Now, if ‖[A B] U0^{−1}‖ < 1, then (7.4.70) implies that U is unimodular. From (7.4.64),

[A B] U0^{−1} = [A B] V0 = A [Y0 X0] + B [−Ñ0 D̃0] . (7.4.72)

Hence

‖[A B] U0^{−1}‖ ≤ ‖A‖ ‖[Y0 X0]‖ + ‖B‖ ‖[−Ñ0 D̃0]‖ ≤ δp ‖[Y0 X0]‖ + δc ‖[−Ñ0 D̃0]‖ , (7.4.73)
where the last step follows from (7.4.66) and (7.4.68). The proof is completed by showing that

‖[Y0 X0]‖ = ‖T(P0, C0)‖ , (7.4.74)

‖[−Ñ0 D̃0]‖ = ‖T(C0, P0)‖ . (7.4.75)

Then (7.4.65) and (7.4.73) will imply the unimodularity of U and the stability of (P, C).

To prove (7.4.74), recall that (N0, D0) is a normalized r.c.f. of P0, and (Y0, X0) is the corresponding l.c.f. of C0 such that Y0D0 + X0N0 = I. Hence, from (5.1.23) and (7.4.60),

T(P0, C0) = [−N0X0  −N0Y0 ; D0X0  D0Y0] = [−N0 ; D0] [X0 Y0] ; (7.4.76)

‖T(P0, C0)‖ = ‖[X0 Y0]‖ (since [−N0 ; D0] is an isometry) = ‖[Y0 X0]‖ . (7.4.77)

The proof of (7.4.75) follows essentially by symmetry arguments, after noting that ‖[−Ñ0 D̃0]‖ = ‖[Ñ0 D̃0]‖. Recall that (X̃0, Ỹ0) is a normalized r.c.f. of C0, that P0 stabilizes C0, and that (D̃0, Ñ0) is the corresponding l.c.f. of P0 such that D̃0Ỹ0 + Ñ0X̃0 = I. □
It is not known at present how close the conditions of Theorem 7.4.13 and Corollary 7.4.14 are to being necessary. In particular, it is not known whether the condition ‖T(P0, C)‖ < r^{−1} is necessary for a controller C to stabilize all plants within a distance of r from the plant P0. Nevertheless, it is reasonable to seek, among all stabilizing controllers for P0, an optimally robust controller C0 for which ‖T(P0, C0)‖ is as small as possible. The problem of finding such a C0 can be formulated as an H∞-norm minimization problem of the form

min_{R ∈ M(S)} ‖U + VRW‖ , (7.4.78)

where

U = [−N0 ; D0] [X0 Y0] , V = [−N0 ; D0] , W = [D̃0 −Ñ0] . (7.4.79)

This type of minimization problem is studied in Section 6.6.

Finally, the results contained in this section carry over with obvious modifications to the study of the robustness of discrete-time systems, and to cases where the set of "stable" transfer functions is something other than S.
7.5 ROBUST AND OPTIMAL REGULATION

In this section, several previously derived results are brought to bear to provide solutions of two problems, namely robust regulation and optimal robust regulation. The robust regulation problem encompasses two aspects, namely robust tracking and robust disturbance rejection. The robust tracking problem can be described as follows: Suppose we are given a plant P, together with an unstable reference signal generator which is factored in the form D̃r^{−1}Ñr, where D̃r, Ñr are left-coprime. The objective is to design a compensator such that the plant output tracks the unstable reference signal, and continues to do so even as the plant is slightly perturbed. This problem is depicted in Figure 7.2, where, to achieve the greatest possible generality, a two-parameter compensation scheme is shown. Alternatively, suppose the unstable signal is a disturbance which is injected at the plant output. In this case, the objective is to design a compensator such that the plant output is insensitive to the disturbance, and remains so even as the plant is slightly perturbed. This situation is shown in Figure 7.3, and is referred to as the robust disturbance rejection problem. Note that a feedback, or one-parameter, configuration is employed in Figure 7.3 since, as shown in Section 5.6, both configurations are equivalent for the purposes of disturbance rejection.
In more precise terms, the robust disturbance rejection problem can be stated as follows: Find a compensator such that three conditions are met, namely: (R1) the closed-loop system is stable; (R2) the disturbance is rejected, i.e., (I + PC)^{−1}D̃r^{−1} ∈ M(S); and (R3) conditions (R1) and (R2) continue to hold when the plant is slightly perturbed.

(Figure 7.2 shows the two-parameter tracking configuration, with reference generator D̃r^{−1}Ñr, compensator blocks Ñc1, Ñc2, D̃c^{−1}, and plant P.)

Lemma 7.5.5 states that a compensator C ∈ S(P), with r.c.f. (Nc, Dc), solves this problem if and only if σr^{−1}Dc ∈ M(S) (i.e., if and only if σr divides every component of Dc), where σr denotes the largest invariant factor of D̃r.
Proof. Since the robustness condition (R3) involves perturbing the plant from its nominal description P, we can assume at the outset that D̃ and D̃r are right-coprime. If this is not so originally, it can be achieved by an arbitrarily small perturbation of D̃, by Lemma 7.5.4. Also, since the truth or falsity of the condition σr^{−1}Dc ∈ M(S) is unaffected by multiplying Dc by a unimodular matrix, we can as well choose the particular r.c.f. (Nc, Dc) of C that satisfies D̃Dc + ÑNc = I. Finally, note that since σr is the largest invariant factor of D̃r, we have that σr D̃r^{−1} ∈ M(S) (see Problem 7.5.2).

"if": Suppose Dc = σr T, where T ∈ M(S). Now (R1) holds by assumption, since C ∈ S(P). Next,

(I + PC)^{−1} D̃r^{−1} = Dc D̃ D̃r^{−1} = σr T D̃ D̃r^{−1} ∈ M(S) , (7.5.4)
so that (R2) holds. To prove (R3), suppose P = D̃^{−1}Ñ is replaced by P1 = D̃1^{−1}Ñ1, where D̃1 = D̃ + ΔD̃, Ñ1 = Ñ + ΔÑ. Then the corresponding return difference matrix is

M = D̃1 Dc + Ñ1 Nc = I + [ΔD̃ ΔÑ] [Dc ; Nc] , (7.5.5)

which is unimodular provided

‖[ΔD̃ ΔÑ]‖ < ε = ‖[Dc ; Nc]‖^{−1} . (7.5.6)

Hence, (R1) holds robustly. To prove that (R2) also holds robustly, note that

(I + P1C)^{−1} D̃r^{−1} = Dc M^{−1} D̃1 D̃r^{−1} = σr T M^{−1} D̃1 D̃r^{−1} ∈ M(S) , (7.5.7)

since M^{−1} ∈ M(S) due to the unimodularity of M.

"only if": Suppose (R1), (R2), (R3) hold; it is to be shown that σr^{−1}Dc ∈ M(S). This part of the proof is divided into several steps.
First, since D̃ and D̃r are right-coprime, let (A, B) be an l.c.f. of D̃D̃r^{−1}. Then, by Theorem 4.1.15, A and D̃r have the same invariant factors. In particular, σr is the largest invariant factor of A.

Step 1. It is shown that

Dc A^{−1} ∈ M(S) . (7.5.8)

By assumption, (R2) holds, i.e., Dc D̃ D̃r^{−1} = Dc A^{−1} B ∈ M(S). An application of Lemma 7.5.3 gives (7.5.8).

Step 2. It is shown that

Dc Q Ñ Nc A^{−1} ∈ M(S) ∀Q ∈ M(S) . (7.5.9)

Let Q ∈ M(S) be arbitrary, and choose ε in the interval (0, 1/‖QÑNc‖). Then I − εQÑNc is unimodular. Define M = (I − εQÑNc)^{−1}. Then

M = I + Σ_{i=1}^{∞} (εQÑNc)^i = I + Rε ÑNc , (7.5.10)

for some rational matrix Rε, and Rε → 0 as ε → 0. Hence, ‖Rε‖ < δ for sufficiently small ε, where δ is the number defined in (7.5.1). Now let P1 = D̃^{−1}(I + Rε)Ñ. By assumption, (I + P1C)^{−1}D̃r^{−1} ∈ M(S), since P1 is a valid perturbation of P. But

(I + P1C)^{−1} D̃r^{−1} = Dc [D̃Dc + (I + Rε)ÑNc]^{−1} D̃ D̃r^{−1} = Dc (I + Rε ÑNc)^{−1} A^{−1} B = Dc (I − εQÑNc) A^{−1} B ∈ M(S) . (7.5.11)

Applying Lemma 7.5.3 gives

Dc (I − εQÑNc) A^{−1} ∈ M(S) . (7.5.12)

Subtracting (7.5.8) yields −ε Dc Q Ñ Nc A^{−1} ∈ M(S), and dividing by −ε leads to (7.5.9).

Step 3. It is shown that

Dc Q A^{−1} ∈ M(S) ∀Q ∈ M(S) . (7.5.13)

Indeed, since D̃Dc + ÑNc = I,

Dc Q A^{−1} = Dc Q (D̃Dc + ÑNc) A^{−1} = (Dc Q D̃) Dc A^{−1} + Dc Q Ñ Nc A^{−1} ∈ M(S) , (7.5.14)

by (7.5.8) and (7.5.9), respectively.
Step 4. It is shown that

σr^{−1} Dc ∈ M(S) , (7.5.15)

which is the desired conclusion. Select unimodular U, V ∈ U(S) such that

U A V = Diag{σ1, ···, σr} = S , (7.5.16)

the Smith form of A. By (7.5.13),

Dc Q A^{−1} ∈ M(S) ∀Q ⟺ Dc Q V S^{−1} U ∈ M(S) ∀Q ⟺ Dc Q V S^{−1} ∈ M(S) ∀Q ⟺ Dc Q1 S^{−1} ∈ M(S) ∀Q1 , (7.5.17)

where Q1 denotes QV, and Q, Q1 range over M(S). Now let Q1 = I. Then Dc S^{−1} ∈ M(S), which shows that σr divides the last column of Dc. Next, successively let Q1 equal the permutation matrix that interchanges column j with the last column, and let the index j vary over all possible values. Then Dc Q1 S^{−1} ∈ M(S) implies that σr divides every column of Dc. □
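The content of Lemma 7.5.5 is the internal model principle: the compensator denominator must absorb σr. This can be illustrated numerically with hypothetical data not from the text: step reference generator d̃r(s) = s/(s + 1) (so σr = s), stable plant p(s) = 1/(s + 2), and two controllers, one whose denominator is divisible by σr = s (an integrator) and one without.

```python
w = 1e-6                                   # evaluate near the reference pole s = 0
s = 1j * w
p = 1 / (s + 2)
dr = s / (s + 1)                           # dr~, with sigma_r = s

c_int = 2 * (s + 1) / s                    # denominator s: sigma_r divides dc
c_no = 2.0                                 # no internal model of the step

track_int = abs(1 / (1 + p * c_int) / dr)  # |(I + PC)^{-1} Dr~^{-1}| at jw
track_no = abs(1 / (1 + p * c_no) / dr)
print(track_int, track_no)                 # bounded vs. blowing up as w -> 0
```

With the integrator the quantity (I + PC)^{−1}D̃r^{−1} stays bounded near ω = 0 (condition (R2) holds); without it, the expression grows like 1/ω, so the step disturbance is not rejected.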
The argument used in the last step is sufficiently useful that it deserves display as a separate item.

Proposition 7.5.6 Suppose A, B ∈ M(S) have the same number of columns, and in addition A is square and nonsingular. Then

B Q A^{−1} ∈ M(S) ∀Q ∈ M(S) , (7.5.18)

if and only if the largest invariant factor of A divides every element of B.
Proof of Theorem 7.5.1. It is first shown that the robust regulation problem has a solution if and only if N and σr I are left-coprime. Select X̃, Ỹ ∈ M(S) such that ÑX̃ + D̃Ỹ = I. Then by Theorem 5.2.1,

S(P) = {(X̃ + DR)(Ỹ − NR)^{−1} : R ∈ M(S)} . (7.5.19)

Thus, the set of all Dc as C varies over S(P) is given by {Ỹ − NR : R ∈ M(S)}. By Lemma 7.5.5, the robust tracking problem has a solution if and only if σr^{−1}(Ỹ − NR) ∈ M(S) for some R ∈ M(S), or equivalently, there exist R, W ∈ M(S) such that

Ỹ − NR = σr W . (7.5.20)

Now (7.5.20) can be rearranged as

NR + σr W = Ỹ . (7.5.21)

Thus, the proof is complete if it can be shown that the left-coprimeness of N and σr I is necessary and sufficient for (7.5.21) to have a solution. Accordingly, suppose first that N, σr I are left-coprime, and select T, U ∈ M(S) such that NT + σr U = I; then R = TỸ, W = UỸ is a solution of (7.5.21). Conversely, suppose N and σr I are not left-coprime, and let V be a nontrivial (i.e., nonunimodular) common left-divisor of N and σr I. Then NR + σr W is a left-multiple of V for all R, W. On the other hand, from Corollaries 4.1.18 and 4.1.7, N and Ỹ are left-coprime, so that Ỹ is not a left-multiple of V. Hence, (7.5.21) does not have a solution.
To complete the proof, it is shown that the set of all C ∈ S(P) such that σr^{−1} Dc ∈ M(S) is precisely {C1/σr : C1 ∈ S(P/σr)}, provided of course that N and σr I are left-coprime. First, suppose Dc = σr T for some T ∈ M(S). Then, since

I = D̃Dc + ÑNc = σr D̃T + ÑNc , (7.5.22)

we see that T and Nc are right-coprime, and that C1 = Nc T^{−1} stabilizes (σr D̃)^{−1}Ñ = P/σr. Hence, every solution to the robust tracking problem is of the form C1/σr, for some C1 ∈ S(P/σr). Conversely, suppose C1 ∈ S(P/σr). Now P/σr = (σr D̃)^{−1}Ñ; moreover, it can be shown that σr D̃ and Ñ are left-coprime. Hence, (σr D̃, Ñ) is an l.c.f. of P/σr. Since C1 ∈ S(P/σr), C1 has an r.c.f. (Nc1, Dc1) satisfying

σr D̃ Dc1 + Ñ Nc1 = I . (7.5.23)

Thus, C = C1/σr = Nc1 (σr Dc1)^{−1} stabilizes P, and the denominator matrix of C is a multiple of σr. Hence, C solves the robust tracking problem. □
Corollary 7.5.7 Suppose N and σr I are left-coprime, and let T, U ∈ M(S) be particular solutions of

NT + σr U = I . (7.5.24)

Let (B, A) be an r.c.f. of N/σr. Then the set of all C ∈ S(P) that solve the robust tracking problem is

{[X̃ + D(TỸ + AS)] [σr (UỸ − BS)]^{−1} : S ∈ M(S)} . (7.5.25)

Proof. From the proof of Theorem 7.5.1, we see that C solves the robust tracking problem if and only if C = (X̃ + DR)(Ỹ − NR)^{−1} for some R satisfying (7.5.21). Now it is routine to verify that the set of all solutions of (7.5.21) is

R = TỸ + AS , W = UỸ − BS , S ∈ M(S) . (7.5.26)

Substituting for R from (7.5.26) and noting that Ỹ − NR = σr W = σr (UỸ − BS) leads to (7.5.25). □
Proof of Lemma 7.5.2. Recall from Section 4.2 that N and σr I are left-coprime if and only if the matrix F = [N σr I] has full row rank at all s ∈ C+e. If σr(s) ≠ 0, the rank condition holds; therefore, it is only necessary to verify the rank condition at the C+e-zeros of σr. If σr(s) = 0, then F(s) has full row rank if and only if N(s) has full row rank. Hence, we conclude that N and σr I are left-coprime if and only if N(s) has full row rank at all C+e-zeros of σr. Clearly this is the same as requiring the largest invariant factor of N and σr to be coprime. □

Lemma 7.5.2 also holds over any principal ideal domain, and not just over S; but the proof in the general case is a little more complicated (see Problem 7.5.3).
Now let us consider the problem of robust tracking using a two-parameter compensation scheme. Recall from Section 5.6 that such a scheme can be depicted as shown in Figure 7.2. In this scheme, the tracking error between the reference input r and the plant output y is given by

e = [I − N(Ñc2 N + D̃c D)^{−1} Ñc1] r . (7.5.27)

Hence, the transfer matrix from v to e is given by

F(N, D) = [I − N M^{−1} Ñc1] D̃r^{−1} Ñr , (7.5.28)

where

M = M(N, D) = Ñc2 N + D̃c D . (7.5.29)

As before, the objectives of the robust tracking problem are to determine the compensator parameters Ñc1, Ñc2, D̃c such that the closed-loop system is stable and F(N, D) ∈ M(S) for the nominal plant as well as for all plants in some neighborhood of the nominal plant.
A partial solution to the problem is given by the next theorem.

Theorem 7.5.8 Consider the system of Figure 7.2, and suppose P has at least as many inputs as outputs. Let (N, D) be an r.c.f. of the nominal plant, and let (N1, D1) denote perturbations of (N, D). Let (Nc2, Dc) be an r.c.f. of the feedback compensator D̃c^{−1}Ñc2. Then the triple (Ñc1, Ñc2, D̃c) solves the robust tracking problem if and only if

(i) σr^{−1} D̃c ∈ M(S), and
(ii) N1 [M(N1, D1)]^{−1} (Ñc2 − Ñc1) D̃r^{−1} ∈ M(S) ∀(N1, D1).

If P is square, (ii) is equivalent to

(ii′) (Ñc2 − Ñc1) D̃r^{−1} ∈ M(S).

If P is nonsquare, (ii′) implies but is not implied by (ii).

Remarks 7.5.9 If we use a one-parameter, or feedback, compensator, then Ñc2 = Ñc1, so that (ii) above is automatically satisfied. In this case, (i) is a necessary and sufficient condition for robust tracking. This is precisely the result contained in Lemma 7.5.5.
Proof. "if": Suppose first that (i) and (ii) hold. It is desired to show that

(I − N1 M^{−1} Ñc1) D̃r^{−1} ∈ M(S) ∀(N1, D1) , (7.5.30)

where M = M(N1, D1) is as in (7.5.29). Define

S = [D1 −Nc2 ; N1 Dc] . (7.5.31)

By the robustness of closed-loop stability, S is unimodular for (N1, D1) sufficiently close to (N, D). Let T = S^{−1}, and partition T as

T = [A B ; K L] . (7.5.32)

Now [A B] is equivalent to [I 0], since it is part of a unimodular matrix. This, plus the fact that

[A B] [−Nc2 ; Dc] = 0 , (7.5.33)

implies that [A B] is an l.c.f. of D̃c^{−1}Ñc2, i.e., that

[A B] = U [D̃c Ñc2] for some U ∈ U(S) . (7.5.34)

Similarly,

[K L] = [−Ñ1 D̃1] , (7.5.35)

where (D̃1, Ñ1) is an l.c.f. of the perturbed plant N1 D1^{−1}. Now TS = I implies that

U (D̃c D1 + Ñc2 N1) = I , (7.5.36)

or U = M^{−1}. Also, ST = I implies in particular that

N1 B + Dc L = I , (7.5.37)

or

N1 M^{−1} Ñc2 + Dc D̃1 = I . (7.5.38)

Therefore

(I − N1 M^{−1} Ñc1) D̃r^{−1} = Dc D̃1 D̃r^{−1} + N1 M^{−1} (Ñc2 − Ñc1) D̃r^{−1} , (7.5.39)

which belongs to M(S) by (i) and (ii). Hence, robust tracking is achieved.
"only if": Suppose the triple (Ñ_c1, Ñ_c2, D̃_c) achieves robust tracking. It is desired to show that (i) and (ii) hold. The hypothesis is that

{I − N_1 [M(N_1, D_1)]^{-1} Ñ_c1} D_r^{-1} N_r ∈ M(S) ∀ (N_1, D_1),   (7.5.40)

7.5. ROBUST AND OPTIMAL REGULATION 261

where (N_1, D_1) range over some neighborhood of (N, D). By Lemma 7.5.3, (7.5.40) is equivalent to

{I − N_1 [M(N_1, D_1)]^{-1} Ñ_c1} D_r^{-1} ∈ M(S) ∀ (N_1, D_1).   (7.5.41)

As a prelude to the proof, observe first that it can be assumed without loss of generality that the numerator matrix N of the nominal plant and φ_r I are left-coprime. If this is not so originally, it can be achieved by an arbitrarily small perturbation of the plant: It suffices to perturb N to a nearby matrix which has full row rank at all C_{+e}-zeros of φ_r (see Lemma 7.5.2). Also, we may as well assume that the r.c.f. (N, D) of the nominal plant and the triple (Ñ_c1, Ñ_c2, D̃_c) satisfy

Ñ_c2 N + D̃_c D = I.   (7.5.42)

Finally, let (D̃, Ñ) be an l.c.f. of the nominal plant defined by the equation

[ D   −N_c2 ; N   D_c ] = [ D̃_c   Ñ_c2 ; −Ñ   D̃ ]^{-1}.   (7.5.43)

Then, since Ñ and N have the same invariant factors, it follows from Lemma 7.5.2 that Ñ and φ_r I are also left-coprime.
The first step of the proof is to observe that the hypotheses imply that Ñ_c1 and D_r are right-coprime. To see this, note that (7.5.41) implies that

W = (I − N Ñ_c1) D_r^{-1} ∈ M(S).   (7.5.44)

Hence,

N Ñ_c1 + W D_r = I,   (7.5.45)

which shows that Ñ_c1 and D_r are right-coprime. Let (A, B) be an l.c.f. of Ñ_c1 D_r^{-1}, and note that φ_r is also the largest invariant factor of A. Next, fix the plant numerator at N and vary its denominator. Then (7.5.41) states that

{I − N [M(N, D_1)]^{-1} Ñ_c1} D_r^{-1} ∈ M(S) ∀ D_1.   (7.5.46)

Subtracting (7.5.44) from (7.5.46) leads to

N {[M(N, D_1)]^{-1} − I} Ñ_c1 D_r^{-1} = N {[M(N, D_1)]^{-1} − I} A^{-1} B ∈ M(S) ∀ D_1.   (7.5.47)

Hence, from Lemma 7.5.3,

N {[M(N, D_1)]^{-1} − I} A^{-1} ∈ M(S) ∀ D_1.   (7.5.48)

Now M = M(N, D_1) = I + D̃_c (D_1 − D). By an argument analogous to that in the proof of Lemma 7.5.5 (see especially Step 2), it can be shown that M^{-1} − I can be made equal to D̃_c Q for an arbitrary matrix Q ∈ M(S), by suitable choice of D_1. Therefore, (7.5.48) implies that

N D̃_c Q A^{-1} ∈ M(S) ∀ Q ∈ M(S).   (7.5.49)
Now Proposition 7.5.6 leads to the conclusion that φ_r, the largest invariant factor of A, divides each element of N D̃_c. However, from (7.5.43), N D̃_c = D_c Ñ. Hence,

D_c φ_r^{-1} Ñ ∈ M(S).   (7.5.50)

Finally, since φ_r I and Ñ are left-coprime, an application of Lemma 7.5.3 to (7.5.50) gives

D_c φ_r^{-1} ∈ M(S).   (7.5.51)

This proves (i).

The proof of (ii) is now very easy. From previous arguments, robust tracking is equivalent to (7.5.39) belonging to M(S). However, since φ_r divides every element of D_c, we have that

D_c D̃_1 D_r^{-1} ∈ M(S) ∀ (N_1, D_1).   (7.5.52)

Subtracting (7.5.52) from (7.5.39) shows that

N_1 M^{-1} (Ñ_c2 − Ñ_c1) D_r^{-1} ∈ M(S) ∀ (N_1, D_1).   (7.5.53)

This proves (ii).
To complete the proof, it is necessary to show two things: 1) Condition (ii′) implies (ii); 2) If the plant is square, (ii) implies (ii′). Of these, the first statement is obvious: If Ñ_c2 − Ñ_c1 is a left-multiple of D_r, then (7.5.53) is clearly satisfied. To prove the second statement, define

Q = (Ñ_c2 − Ñ_c1) D_r^{-1}.   (7.5.54)

Then (7.5.53) implies in particular that NQ ∈ M(S). Now select unimodular matrices U, V ∈ U(S) such that U^{-1} Q V^{-1} is in Smith-McMillan form, and let

S = U^{-1} Q V^{-1} = Diag {s_1/t_1, ···, s_r/t_r}.   (7.5.55)

Since NQ ∈ M(S), it follows that t_i divides the i-th column of NU, for all i. Now, if Q itself does not belong to M(S), then t_1 is not a unit. Note that t_1 is a divisor of φ_r, because of (7.5.54). Now let s be any C_{+e}-zero of t_1. Then, since t_1 divides the first column of NU, NU(s) is singular, which in turn implies, by the unimodularity of U, that N(s) is singular. However, since t_1 is a divisor of φ_r, this contradicts the left-coprimeness of N and φ_r I. Hence, Q ∈ M(S). □

Theorem 7.5.8 shows that (ii) implies (ii′) for square plants. It is now shown by example that there exist situations involving nonsquare plants where (ii) holds but (ii′) is false.
Example 7.5.10  Suppose the plant

P = [ 3s/(2s − 1)   (4s + 1)/(2s − 1) ]

is to be compensated so as to robustly track a step input. In this case, we may take D_r = φ_r = s/(s + 1). For convenience, let z denote s/(s + 1), and note that z ∈ S. Then

P = [3z   3z + 1] (3z − 1)^{-1},

and an r.c.f. for P is given by

N = [z   3z + 1],   D = [ 1 + z   1 − 3z ; −2z   6z − 1 ].

Now choose

D̃_c = [ 1   0 ; −z   z ],   Ñ_c2 = [ 1 ; 1 − z ].

Then it is claimed that robust tracking is achieved provided φ_r divides the second component of Ñ_c2 − Ñ_c1, even if φ_r does not divide the first component of Ñ_c2 − Ñ_c1. This is proved by establishing that condition (ii) of Theorem 7.5.8 holds. For this purpose, suppose N, D are perturbed to N_1, D_1, and partition these as

N_1 = [n_a   n_b],   D_1 = [D_a   D_b].

Also, partition Ñ_c1, Ñ_c2, D̃_c as

Ñ_c1 = [ x_a ; x_b ],   Ñ_c2 = [ y_a ; y_b ],   D̃_c = [ D̃_a ; D̃_b ].

The key point to note here is that D̃_b is a multiple of z = φ_r. Now partition M = Ñ_c2 N_1 + D̃_c D_1 in the form

M = [ m_aa   m_ab ; m_ba   m_bb ].

Then

N_1 M^{-1} (Ñ_c2 − Ñ_c1)/φ_r = |M|^{-1} [(n_a m_bb − n_b m_ba)(y_a − x_a)]/φ_r + |M|^{-1} [(n_b m_aa − n_a m_ab)(y_b − x_b)]/φ_r.

Since φ_r divides y_b − x_b, the second term belongs to S, provided the plant perturbation is small enough to maintain the unimodularity of M, i.e., to maintain closed-loop stability. As for the first term,

m_ba = y_b n_a + D̃_b D_a,   m_bb = y_b n_b + D̃_b D_b,

so that

n_a m_bb − n_b m_ba = D̃_b (n_a D_b − n_b D_a)

is a multiple of φ_r, since φ_r divides D̃_b. Hence, (ii) holds and robust tracking is achieved, whatever be x_a!
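The change of variable z = s/(s + 1) used in the example can be checked symbolically. The following sketch (assuming SymPy is available) verifies that the entries of [3z  3z + 1](3z − 1)^{-1} reduce to the original transfer functions in s:

```python
import sympy as sp

s = sp.symbols('s')
z = s / (s + 1)  # the stable building block z = s/(s+1) of Example 7.5.10

# The two entries of P = [3z  3z+1](3z - 1)^{-1}, rewritten in terms of s:
p1 = sp.simplify(3*z / (3*z - 1))
p2 = sp.simplify((3*z + 1) / (3*z - 1))

# Mathematically, p1 = 3s/(2s - 1) and p2 = (4s + 1)/(2s - 1).
assert sp.simplify(p1 - 3*s/(2*s - 1)) == 0
assert sp.simplify(p2 - (4*s + 1)/(2*s - 1)) == 0
```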
Now let us consider the problem of optimal regulation. The problems studied in Chapter 6 can be interpreted as minimizing the effect of a stable disturbance on the plant output, or minimizing the tracking error in response to a stable reference input. With the aid of the results derived thus far in this section, it is now possible to tackle the same problems in the presence of unstable disturbances or reference inputs. Specifically, suppose P is a given plant, D_r^{-1} N_r is a given disturbance or reference input generator, and W_1, W_2 are given square weighting matrices. The objective is to minimize the objective function

J_p = ‖W_1 (I + PC)^{-1} D_r^{-1} N_r W_2‖_p,   (7.5.56)

over all C that achieve robust regulation. This can be interpreted as the problem of optimal disturbance rejection, or optimal tracking using a one-parameter compensator. Using Theorem 7.5.1 and Corollary 7.5.7, this problem can be transformed into one of the type solved in Chapter 6. With all symbols as in Theorem 7.5.1 and Corollary 7.5.7, the set of all C over which J_p is to be minimized is given by (7.5.25). Substituting from (7.5.25) into (7.5.56) shows that

(I + PC)^{-1} D_r^{-1} = (U Ỹ − B S) D̃ L,   (7.5.57)

J_p = ‖E − W_1 B S F‖_p,   (7.5.58)

where

L = φ_r D_r^{-1},   E = W_1 U Ỹ F,   F = D̃ L N_r W_2.   (7.5.59)

Note that the robust regulation problem has a solution if and only if the plant has at least as many inputs as outputs. Hence, F is square or "fat," and as a result the minimization of J_p can be carried out using the techniques presented in Section 6.5.
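In the scalar case, the substitution leading from (7.5.57) to (7.5.58) is a one-line algebraic identity. The following SymPy sketch uses scalar stand-ins for all the symbols (an assumption that sidesteps noncommutativity; in the matrix case the same computation goes through as long as the order of factors is respected):

```python
import sympy as sp

# Scalar stand-ins for the symbols of (7.5.57)-(7.5.59).
w1, w2, u, y, b, S, dt, L, nr = sp.symbols('w1 w2 u y b S dt L nr')

F = dt * L * nr * w2          # F = D~ L N_r W_2
E = w1 * u * y * F            # E = W_1 U Y~ F

# W_1 (U Y~ - B S) D~ L N_r W_2 must equal E - W_1 B S F:
lhs = w1 * (u*y - b*S) * dt * L * nr * w2
assert sp.expand(lhs - (E - w1*b*S*F)) == 0
```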
Consider now the problem of optimal robust tracking of square plants using a two-parameter compensation scheme. From Corollary 7.5.7 and Theorem 7.5.8 it follows that the set of all triples (Ñ_c1, Ñ_c2, D̃_c) that achieve robust tracking is given by

D̃_c^{-1} Ñ_c2 = [X̃ + D (T Ỹ + A S)][φ_r (U Ỹ − B S)]^{-1},
Ñ_c1 = Ñ_c2 − Q D_r,   S, Q ∈ M(S).   (7.5.60)

The above expression is not very convenient, as it does not explicitly specify the compensator parameters. Such an expression can be obtained using a doubly coprime factorization for the plant. Suppose

[ Y   X ; −Ñ   D̃ ] [ D   −X̃ ; N   Ỹ ] = I.   (7.5.61)

Then by Problem 5.2.2 it follows that

(X̃ + D R)(Ỹ − N R)^{-1} = (Y − R Ñ)^{-1} (X + R D̃) ∀ R ∈ M(S).   (7.5.62)

Hence, the set of all compensators that achieve robust tracking can be described with the aid of Corollary 7.5.7 and (7.5.62) as

D̃_c = Y − (T Ỹ + A S) Ñ,
Ñ_c2 = X + (T Ỹ + A S) D̃,
Ñ_c1 = X + (T Ỹ + A S) D̃ − Q D_r.   (7.5.63)
For the class of compensators described above, the tracking error transfer matrix can be computed from (7.5.39) as

(D_c D̃ D_r^{-1} + N Q) N_r = ((U Ỹ − B S) D̃ L + N Q) N_r.   (7.5.64)

Comparing (7.5.64) with (7.5.57), we see that the extra flexibility arising from using a two-parameter compensation scheme is the presence of the term NQ in (7.5.64). The expression (7.5.64) can actually be simplified. Since the plant is square, N and φ_r I are both left- as well as right-coprime. Hence, (N, φ_r I) is an r.c.f. of φ_r^{-1} N, or in other words, we may take A = φ_r I, B = N. Hence, in (7.5.64),

(U Ỹ − B S) D̃ L + N Q = U Ỹ D̃ L + N (Q − S D̃ L) = U Ỹ D̃ L + N R,   (7.5.65)

where R = Q − S D̃ L is a free parameter. To reiterate, the set of stable transfer matrices achievable using a feedback compensator is

(U Ỹ D̃ L − N S D̃ L) N_r,   S ∈ M(S),   (7.5.66)

whereas with a two-parameter scheme, the set is

(U Ỹ D̃ L + N Q) N_r,   Q ∈ M(S).   (7.5.67)

Now, the problem of minimizing

J_p = ‖W_1 (I − N Ñ_c1) D_r^{-1} N_r W_2‖_p,   (7.5.68)

over all robustly stabilizing two-parameter compensators can be reformulated as that of minimizing

J_p = ‖E − W_1 N Q N_r W_2‖_p,   (7.5.69)

as Q varies over M(S), where E is defined in (7.5.59). It is easy to see that the infimum in (7.5.69) is no larger than that in (7.5.58), since (7.5.69) contains (7.5.58) as a special case by choosing Q = −S D̃ L. Moreover, as in Section 6.6, there always exist infinitely many optimal compensators if a two-parameter scheme is used, and this flexibility can be exploited to achieve both good disturbance rejection and good tracking simultaneously.
Example 7.5.11  For the sake of variety we consider discrete-time systems in the next two examples. Thus z denotes the unit delay, and the z-transform of a sequence {f_0, f_1, ...} is given by

f(z) = Σ_{i=0}^∞ f_i z^i.

Consider the set-up shown in Figure 7.3, and suppose it is desired to robustly reject a step input disturbance. Since the z-transform of a sequence of ones is 1/(1 − z), we may take

n_r = 1,   d_r = φ_r = 1 − z.

Suppose the plant has two inputs and one output, and has the transfer matrix

P = [z(z − 0.5)   z(z + 0.5)] (z + 0.25)^{-1}.

An l.c.f. of this plant is given by

D̃ = z + 0.25,   Ñ = [z(z − 0.5)   z(z + 0.5)],

and a particular solution of the Bezout identity Ñ X̃ + D̃ Ỹ = I is given by

X̃ = [4   −4]′,   Ỹ = 4.

An r.c.f. of P is given by

N = [z   0],   D = [ 1 − z   −0.5 ; 1   z − 0.5 ].

Now N and φ_r are left-coprime. In fact (7.5.24) is satisfied with

T = [1   0]′,   U = 1.

Hence, the robust disturbance rejection problem has a solution. Further, all solutions of this problem can be parametrized using Corollary 7.5.7. An r.c.f. of N/φ_r is

B = [z   0],   A = [ 1 − z   0 ; 0   1 ].

From (7.5.25), the set of all compensators that achieve robust disturbance rejection is given by

C = N_c D_c^{-1},

where

D_c = (1 − z)(4 − z s_1),
N_c = (z + 0.25) [ 4z + 3 + (1 − z) s_1 − (z + 0.5) s_2 ; −4z − 3 + (z − 1) s_1 + (z − 0.5) s_2 ],

and s_1, s_2 ∈ H∞ are rational.

Now consider the problem of optimal robust disturbance rejection as stated in (7.5.56), with both weighting functions equal to unity. From (7.5.57) and (7.5.58), the set of achievable transfer functions from v to y is

f(s_1, s_2) = (4 − z s_1)(z + 0.25).

Suppose it is desired to minimize the H∞-norm of f(s_1, s_2); this can be carried out using the techniques of Section 6.5.
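The Bezout identity and the coprimeness check in this example can be verified symbolically. The sketch below (assuming SymPy; the variable names mirror the factors above) confirms Ñ X̃ + D̃ Ỹ = 1 and N T + φ_r U = 1:

```python
import sympy as sp

z = sp.symbols('z')

# l.c.f. data of the plant in Example 7.5.11 (z is the unit delay)
Nt = sp.Matrix([[z*(z - sp.Rational(1, 2)), z*(z + sp.Rational(1, 2))]])  # N~
Dt = z + sp.Rational(1, 4)                                                # D~
Xt = sp.Matrix([[4], [-4]])                                               # X~
Yt = 4                                                                    # Y~

# Bezout identity N~ X~ + D~ Y~ = 1
bez = sp.expand((Nt * Xt)[0, 0] + Dt * Yt)
assert bez == 1

# Left-coprimeness of N = [z 0] and phi_r = 1 - z:  N T + phi_r U = 1
N = sp.Matrix([[z, 0]])
T = sp.Matrix([[1], [0]])
assert sp.expand((N * T)[0, 0] + (1 - z) * 1) == 1
```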
Example 7.5.12  Consider the set-up shown in Figure 7.2, where the input to be tracked is the sinusoidal sequence {0, 1, 0, −1, 0, 1, ...}. Since the z-transform of this sequence is z/(1 + z²), we may take

n_r = z,   φ_r = 1 + z².

Suppose the plant is scalar, with the transfer function

p = z/(z + 0.5).

Then, dropping the tildes because everything is a scalar, we have

n = z,   d = z + 0.5,   x = −2,   y = 2,
a = φ_r = 1 + z²,   b = n = z,   t = −z,   u = 1.

Hence, the set of r, w that satisfy (7.5.21) is

r = t y + a s = −2z + (1 + z²) s,   w = u y − b s = 2 − z s,   s ∈ H∞.

From Corollary 7.5.7, the set of all feedback, or one-parameter compensators that achieve robust tracking is given by c = n_c/d_c, where

d_c = (2 − z s)(1 + z²),   n_c = −2 + (z + 0.5)(−2z + (1 + z²) s),

and s ∈ H∞ is rational. The set of achievable tracking error transfer functions is

n_r w d = (2 − z s)(z + 0.5) z,   s ∈ H∞.

Using the techniques of Section 6.4, one can verify that the minimum is 3, corresponding to the choice s = 0. Hence the unique feedback compensator that achieves optimal robust tracking is

c = −(1 + 0.5z + z²)/(1 + z²).

Now let us see how much better we can do using a two-parameter compensation scheme. The set of transfer functions from v to the tracking error that can be achieved is

n_r ((u y − b s) d + n q) = z[(2 − z s)(z + 0.5) + z q],   s, q ∈ H∞
                         = z (1 + z q_1),

where q_1 = q − (z + 0.5) s + 2. The H∞-norm of z(1 + z q_1) is minimized by the choice q_1 = 0, and the minimum value is 1, compared with the value 3 achievable using a feedback compensator. In fact, one can construct a sequence of reference generators n_r^{(i)} such that the ratio of the optimal tracking error achievable using a two-parameter compensation scheme to that achievable using a one-parameter scheme approaches zero as i approaches ∞.
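The two norm computations in this example are easy to confirm numerically. The sketch below (assuming NumPy and SymPy) checks the closed-loop Bezout identity symbolically and then samples the unit circle to compare the one-parameter and two-parameter optimal error norms:

```python
import numpy as np
import sympy as sp

z, s = sp.symbols('z s')

# Plant and compensator factors of Example 7.5.12:
n, d = z, z + sp.Rational(1, 2)
n_c = -2 + (z + sp.Rational(1, 2)) * (-2*z + (1 + z**2)*s)
d_c = (2 - z*s) * (1 + z**2)

# Closed-loop Bezout identity n n_c + d d_c = 1 holds for every parameter s
assert sp.expand(n*n_c + d*d_c) == 1

# H-infinity norms on the unit circle (z is the unit delay):
theta = np.linspace(0, 2*np.pi, 20001)
w = np.exp(1j * theta)
one_param = np.abs(w * (w + 0.5) * 2)   # error z(z + 0.5)(2 - zs) with s = 0
two_param = np.abs(w)                   # error z(1 + z q1) with q1 = 0

print(one_param.max())   # about 3
print(two_param.max())   # about 1
```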
7.6 GENERICITY
In this section, the focus is on the notion of genericity, which is a mathematically precise way of
expressing the intuitive idea that some statement or property is almost always true. The concept of
genericity is defined first, and then it is shown that various useful properties of control systems are
generically true. These include coprimeness, strong stabilizability, and simultaneous stabilizability
(subject of course to suitable restrictions).
We begin with a formal definition of genericity. Let X denote a topological space. Recall that a binary relation R on X is a subset of the product space X × X. It is symmetric if (x, y) ∈ R ⟺ (y, x) ∈ R. More generally, an n-ary relation is a subset of the product space X^n.

Definition 7.6.1  A binary relation R on X is generic if it is an open dense subset of X × X. It is strongly generic if the following is true: For each x ∈ X, define

B(x) = {y ∈ X : (x, y) ∈ R}.   (7.6.1)

Then the set of x for which B(x) is an open dense subset of X is itself an open dense subset of X.

One can state the same definition in words: R is generic if two statements are true: (i) If a pair (x, y) belongs to R, then there is a neighborhood of (x, y) such that every point in the neighborhood also belongs to R. (ii) If a pair (x, y) does not belong to R, then every neighborhood of this pair contains a point that belongs to R. In other words, R is generic if it has two features: (i) If a pair satisfies the relation, then sufficiently small perturbations will not destroy the relation. (ii) If a pair does not satisfy the relation, then arbitrarily small perturbations in the pair will make it satisfy the relation. A simple (and pertinent) example is that, in the polynomial ring R[s] topologized as in Problem C.2.3, the relation R defined by

R = {(x, y) : x and y have no common zeros},   (7.6.2)

is generic. The exact relationship between genericity and strong genericity is brought out in Problem 7.6.1. But, roughly speaking, the symmetric relation R is strongly generic if, for almost all x, almost all y satisfy the relation (x, y) ∈ R.
Proposition 7.6.2  Let A, B be open dense subsets of a topological space X. Then C = A ∩ B is also open and dense.

Proof. Clearly C is open, being the intersection of two open sets. Hence, it only remains to show that it is dense. For this, it is necessary to show the following: Let x ∈ X be arbitrary. Then every neighborhood of x contains a point of A ∩ B. Accordingly, let x ∈ X be chosen arbitrarily, and let N be a neighborhood of x. Since A is dense, N contains an element a ∈ A. Moreover, since A is open, there is a neighborhood N_1 ⊆ A containing a. One can assume that N_1 is also a subset of N by replacing N_1 by N ∩ N_1 if necessary. Next, since B is also dense, the neighborhood N_1 contains an element b ∈ B. Since b also belongs to A, b belongs to A ∩ B and is contained in the neighborhood N of x. □
The topological space under study in this section is the ring S of proper stable rational functions. Recall that the topology on S is the natural one induced by its norm. Thus, a base for the topology on S is provided by open balls of various radii centered at various points. Most of the arguments given below can be extended to an arbitrary topological ring whose set of units is open and where inversion of a unit is continuous, but at a considerable increase in the level of abstraction. We begin by deriving some results concerning the genericity of coprimeness.

Proposition 7.6.3  Define a symmetric binary relation R on S by

R = {(f, g) : f and g are coprime}.   (7.6.3)

Then R is strongly generic as well as generic.

Proof. It is shown only that R is strongly generic. The proof of genericity is similar and is left as an exercise (see Problem 7.6.2).

Let f ∈ S, and define

R_f = {g ∈ S : f and g are coprime}.   (7.6.4)

It is shown below that R_f is an open dense subset of S whenever f ≠ 0. Since the set of nonzero elements is open and dense in S, this would prove the strong genericity of R.

To show that R_f is open, suppose that f and g are coprime, and choose a, b ∈ S such that a f + b g = 1. If ‖h − g‖ is less than ‖b‖^{-1}, then

a f + b h = 1 + b (h − g),   (7.6.5)

is a unit of S, which shows that h and f are also coprime. Thus, if g ∈ R_f, then h ∈ R_f whenever h belongs to the ball B(g, ‖b‖^{-1}). This shows that R_f is open.

To show that R_f is dense whenever f ≠ 0, let g be an arbitrary element of S. If f and g are coprime, then nothing further needs to be done. On the other hand, if f and g are not coprime, it must be shown that every neighborhood of g contains an element which is coprime with f. Towards this end, let S denote the set of C_{+e}-zeros of the function f, and note that S is a finite set since f is nonzero. Now partition the set S into two disjoint subsets S_1 and S_2, as follows:

S_1 = {s ∈ S : g(s) = 0},   S_2 = {s ∈ S : g(s) ≠ 0}.   (7.6.6)

Define

r = min_{s ∈ S_2} |g(s)|,   (7.6.7)

and note that r > 0. Now let ε > 0 be arbitrary; it will be shown that the ball B(g, ε) contains an element h which is coprime with f. Select a positive δ < min(r, ε) and define h = g + δ. Then, it is claimed that f and h are coprime, since they do not have any common C_{+e}-zeros. To establish the claim, note that the only possible common C_{+e}-zeros of f and h are the elements of the set S. Now, if s ∈ S_1, then h(s) = δ ≠ 0, whereas if s ∈ S_2, then |h(s)| ≥ r − δ > 0. Hence, f and h are coprime. □
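The perturbation h = g + δ used in the density argument can be illustrated concretely. In the polynomial analogue below (a SymPy sketch with hypothetical f and g sharing the zero s = 1), an arbitrarily small constant shift destroys the common zero:

```python
import sympy as sp

x = sp.symbols('x')

f = sp.expand((x - 1) * (x + 2))       # f and g share the zero x = 1 ...
g = sp.expand((x - 1) * (x + 3))       # ... so they are not coprime
assert sp.gcd(f, g) == x - 1

delta = sp.Rational(1, 100)            # an arbitrarily small constant shift
h = g + delta                          # the perturbation h = g + delta of the proof
assert sp.gcd(f, h) == 1               # f and h have no common zeros
```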
Thus, Proposition 7.6.3 shows that, generically, any two functions in S are coprime. In order
to extend this result to matrices, a preliminary lemma is needed.
Proposition 7.6.4  Suppose m > 1 is some integer. Then the set

M_m = {D ∈ S^{m×m} : |D| ≠ 0},   (7.6.8)

is an open dense subset of S^{m×m}.

Proof. The map D ↦ |D| is continuous, and the set of nonzero elements in S is open. Hence, M_m, which is the preimage of an open set under a continuous map, is itself open. To show that it is also dense, suppose D is a matrix in S^{m×m} with |D| = 0, and let ε be an arbitrary positive number. It is shown next that the ball B(D, ε) contains a matrix with a nonzero determinant. Select matrices U, V ∈ U(S) such that

U D V = Diag {d_1, ···, d_r, 0, ···, 0},   (7.6.9)

is the Smith form of D, and r is the rank of D. Observe now that d_i divides d_{i+1} for all i. Now define

S = Diag {d_1, ···, d_r, δ d_r, ···, δ d_r},   (7.6.10)

and let D_1 = U^{-1} S V^{-1}. Then |D_1| ≠ 0, and D_1 can be made to lie in the ball B(D, ε) by choosing the constant δ sufficiently small. □
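A numerical analogue of this construction replaces the Smith form by the singular value decomposition: zero singular values are replaced by a small δ, producing a nonsingular matrix within δ of the original. The sketch below (assuming NumPy) illustrates the idea only; it is not the algebraic Smith-form argument itself:

```python
import numpy as np

# A singular matrix, playing the role of D with |D| = 0.
D = np.array([[1.0, 1.0],
              [1.0, 1.0]])
assert abs(np.linalg.det(D)) < 1e-12

# Diagonalize (SVD in place of the Smith form), replace the zero singular
# values by a small delta, and reassemble.
U, sig, Vt = np.linalg.svd(D)
delta = 1e-6
sig1 = np.where(sig < delta, delta, sig)
D1 = U @ np.diag(sig1) @ Vt

assert abs(np.linalg.det(D1)) > 0          # D1 is nonsingular ...
assert np.linalg.norm(D1 - D, 2) <= delta + 1e-12   # ... and within delta of D
```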
The next proposition generalizes Proposition 7.6.3 to the matrix case.

Proposition 7.6.5  Suppose m < n are positive integers, and define

R(m, n) = {A ∈ S^{m×n} : the g.c.d. of all m × m minors of A is 1}.   (7.6.11)

Then R(m, n) is an open dense subset of S^{m×n}.

Remarks 7.6.6  1) If R(m, n) is viewed as an mn-ary relation on S, then the proposition states that this relation is generic.

2) Another way of stating the proposition is: Suppose F ∈ S^{m×k}, G ∈ S^{m×l}. Then F and G are generically coprime if k + l > m.

3) The definition (7.6.11) can be equivalently stated as:

R(m, n) = {A ∈ S^{m×n} : A ∼ [I 0]}.   (7.6.12)
Proof. Clearly it is enough to prove the proposition for the case where n = m + 1. The openness of the set R(m, n) is obvious, and it only remains to show that the set is dense. This is done by induction on m. Note that the proposition is true when m = 1, by Proposition 7.6.3. Suppose the proposition is true for matrices with m − 1 rows and m (or more) columns. To establish the statement for matrices with m rows, suppose A ∈ S^{m×n} but does not belong to R(m, n). To be shown is that every neighborhood of A contains a matrix belonging to R(m, n). Let ε > 0 be any specified number, and partition A in the form

A = [A_1   a_2],   (7.6.13)

where A_1 and a_2 have dimensions m × m and m × 1, respectively. Now let A_{11} denote the submatrix consisting of the first m − 1 rows of A_1. By the inductive hypothesis and Proposition 7.6.4, there exists an m × m matrix

B_1 = [ B_{11} ; B_{12} ],   (7.6.14)

such that (i) |B_1| ≠ 0, (ii) [I 0] is a Smith form of B_{11}, and (iii) ‖B_1 − A_1‖ < ε/2. Now let S denote the set of C_{+e}-zeros of |B_1|, and partition the set S into two disjoint sets S_1 and S_2, as follows:

S_1 = {s ∈ S : all m × m minors of [B_1  a_2] are zero at s},
S_2 = {s ∈ S : not all m × m minors of [B_1  a_2] are zero at s}.   (7.6.15)

The idea, as in the proof of Proposition 7.6.3, is to perturb a_2 in such a way as to destroy all the common zeros in the set S_1 while ensuring that none of the points in S_2 becomes a common zero of all the minors. For this purpose, note that all minors of the matrix [B_1  a_2] belong to S and are thus bounded over the closed right half-plane. Let μ be an upper bound on the norms of all these minors. Now, at each point s ∈ S_1, find an (m − 1) × (m − 1) minor of B_{11} which is nonzero at s; such a minor must exist, since [I 0] is a Smith form of B_{11}. Suppose such a minor is obtained by omitting the j-th column of B_{11}. Let e_m denote the m-th elementary unit vector. Then the m × m minor of

B = [B_1   a_2 + δ e_m],   (7.6.16)

obtained by omitting the j-th column is nonzero at s, whenever δ ≠ 0. Hence, at all s ∈ S_1, at least one of the m × m minors of the matrix B in (7.6.16) is nonzero. The next step is to show that, if δ is sufficiently small, then at least one of the m × m minors of the matrix B is nonzero at all points in the set S_2 as well. This can be shown as follows: In analogy with (7.6.7), choose a real number r > 0 such that, at each s ∈ S_2, at least one m × m minor of the matrix [B_1  a_2] has magnitude no smaller than r. Now choose δ < r/μ. Then, at each s ∈ S_2, the corresponding m × m minor of B is also nonzero. Finally, since S = S_1 ∪ S_2 is the zero set of |B_1|, these are the only possible points in the closed RHP where all m × m minors of the matrix B could vanish. Hence, we conclude that, whenever δ < min(r/μ, ε/2), B has the Smith form [I 0]. □
Proposition 7.6.5 shows that a rectangular matrix generically has the Smith form [I 0]. The next result is addressed to the Smith form of square matrices.

Proposition 7.6.7  Suppose m is a positive integer, and define

T_m = {A ∈ S^{m×m} : A ∼ Diag {1, ···, 1, |A|}}.   (7.6.17)

Then T_m is an open dense subset of S^{m×m}.

Proof. The openness of T_m is left to the reader to prove. To prove that T_m is dense, suppose A does not belong to T_m, and let ε > 0 be a given number. To construct an element A_1 belonging to T_m lying in the ball B(A, ε), we proceed as follows: First, by Proposition 7.6.4, it can be assumed that |A| ≠ 0, by perturbing the matrix A slightly if necessary. Now select unimodular matrices U, V such that

S = U A V = Diag {a_1, ···, a_m},   (7.6.18)

is a Smith form of A. Then, for any δ > 0, one can find functions b_i ∈ B(a_i, δ) such that the functions b_1, ···, b_m are all pairwise coprime (see Proposition 7.6.3). Hence, the matrix

B = U^{-1} Diag {b_1, ···, b_m} V^{-1},   (7.6.19)

has the Smith form Diag {1, ···, 1, |B|}. By adjusting δ, one can ensure that B ∈ B(A, ε). □
The first principal result of this section states that a collection of r multivariable plants is generically simultaneously stabilizable provided the plants have at least r inputs or r outputs. A useful corollary is that generically a multivariable plant can be strongly stabilized. These results make use of Proposition 7.6.5 concerning the generic Smith form of rectangular matrices.

Theorem 7.6.8  Suppose l, m are given positive integers, and let M(R(s)) denote the set [R(s)]^{l×m}, equipped with the graph topology. Let r be a positive integer, and define the r-ary relation R on M(R(s)) by

R = {(P_1, ···, P_r) : ⋂_i S(P_i) ≠ ∅}.   (7.6.20)

Then R is an open dense subset of [M(R(s))]^r whenever r ≤ max(l, m).
The proof of the theorem is divided into two parts: First, it is shown that the result is true in the case where min(l, m) = 1, i.e., the case where all plants are either single-input or single-output. Then it is shown that the general case can be reduced to this special case.

Lemma 7.6.9  Suppose min(l, m) = 1. Then R defined in (7.6.20) is an open dense subset of [M(R(s))]^r if max(l, m) ≥ r.
Proof. The proof is given for the case where l = 1 and m ≥ r. The other case can be proved by taking transposes.

Given the plants P_1, ···, P_r, each of dimension 1 × m, find l.c.f.s (d_p^{(i)}, [n_{p1}^{(i)} ··· n_{pm}^{(i)}]), for i = 1, ···, r. Define the matrix Q ∈ S^{r×(m+1)} by

q_{ij} = d_p^{(i)}  if j = 1,
q_{ij} = n_{p(j−1)}^{(i)}  if j > 1.   (7.6.21)

Let C ∈ [R(s)]^{m×1} be a compensator, and let ([n_{c1}, ···, n_{cm}]′, d_c) be an r.c.f. of C. Then by Theorem 5.1.6, C stabilizes each of the plants P_i if and only if each of the functions u_i defined by

[ u_1 ; ⋮ ; u_r ] = Q [ d_c ; n_{c1} ; ⋮ ; n_{cm} ],   (7.6.22)

is a unit. Turning this argument around, the set of plants {P_1, ···, P_r} can be simultaneously stabilized if and only if there exist units u_i, i = 1, ···, r such that (7.6.22) has a solution in M(S) for the unknowns d_c and n_{ci}, i = 1, ···, m, where the first component of the solution (corresponding to d_c) is nonzero. Now, if m ≥ r, then the matrix Q has more columns than rows. Hence, by Proposition 7.6.5, Q is generically right-invertible. That is, if Q is not right-invertible, it can be made so by an arbitrarily small perturbation in each of its elements, which is precisely the same as slightly perturbing each of the plants P_i in the graph topology. Thus, by slightly perturbing each of the plants, we can ensure that (7.6.22) has a solution in M(S), whatever be the left hand side. In particular, one can choose any arbitrary set of units {u_1, ···, u_r}, and (7.6.22) has a solution in M(S). The only additional point to be worried about is that the first element of this solution (corresponding to d_c) should be nonzero. But this too can be guaranteed by perturbing the units slightly if necessary. The details of this last argument are routine and are left to the reader. □
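The right-invertibility of Q can be illustrated in a polynomial analogue. For the two hypothetical 1 × 2 plants below, the 2 × 2 minors of Q have no common zero, so their g.c.d. is 1 (a SymPy sketch):

```python
import sympy as sp

s = sp.symbols('s')

# Two 1 x 2 plants with l.c.f.s (d, [n1 n2]); polynomial analogue of (7.6.21).
# Row i of Q is [d^(i), n1^(i), n2^(i)].
Q = sp.Matrix([[s - 1, 1, s],
               [s - 2, s, 1]])

# The three 2 x 2 minors of Q ...
minors = [Q.extract([0, 1], c).det() for c in ([0, 1], [0, 2], [1, 2])]

# ... have no common zero, so Q is right-invertible over the polynomials:
g = sp.gcd(sp.gcd(minors[0], minors[1]), minors[2])
assert g == 1
```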
The next step is to reduce the case of an l × m system where l < m to the case of a single-output system (the case where l > m is handled by transposition).

Lemma 7.6.10  Suppose l, m are given positive integers. Define the relation R on the product space S^{(m+l)×m} × R^l as the set of all (D, N, v) such that (vN, D) are right-coprime. Then the relation R is generic.

Proof. Let n^{(1)}, ···, n^{(l)} denote the rows of the matrix N. By Proposition 7.6.5, it is generically true that the Smith form of the matrix

M_i = [ D ; n^{(i)} ],   (7.6.23)

is [I 0]′ for all i. In other words, this means that the g.c.d. of all m × m minors of the matrix

M = [ D ; N ],   (7.6.24)

involving exactly one row of N is equal to 1. Now, if v is a 1 × l row vector, then the pair (vN, D) is right-coprime if and only if the g.c.d. of all m × m minors of the matrix

Q = [ D ; vN ],   (7.6.25)

is equal to 1. We show that generically this is the case for almost all v ∈ R^l. Now the matrix Q has dimensions (m + 1) × m, as do the matrices M_i defined in (7.6.23). Moreover, since the determinant function is multilinear, a minor of Q involving the last row can be expressed as a linear combination of the corresponding minors of the various M_i. More precisely, let q_j denote the minor of Q obtained by omitting the j-th row of the matrix D, and define m_i^j analogously. Then

q_j = Σ_{i=1}^l v_i m_i^j.   (7.6.26)

By hypothesis, it is generically true that

g.c.d._{i,j} {m_i^j} = 1.   (7.6.27)

Hence, if the elements m_i^j satisfy the rank condition of [100], it follows that, for almost all vectors v,

g.c.d._j {q_j} = 1.   (7.6.28)

It is left to the reader to show that the rank condition is generically satisfied. This shows that (vN, D) are right-coprime and completes the proof that the relation R is generic. □
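The multilinearity identity (7.6.26) can be checked directly. The SymPy sketch below uses hypothetical 2 × 2 data for D and N and verifies that each minor of Q involving the row vN is the corresponding linear combination of the minors of the M_i:

```python
import sympy as sp

s, v1, v2 = sp.symbols('s v1 v2')

# D is 2 x 2 and N has rows n1, n2 (l = m = 2); v = [v1 v2] is a row vector.
D = sp.Matrix([[s, 1], [0, s - 1]])
n1 = sp.Matrix([[1, s]])
n2 = sp.Matrix([[s, 2]])
vN = v1*n1 + v2*n2

def top_minors(row):
    # the 2 x 2 minors of [D; row] that involve the bottom row,
    # obtained by omitting row j of D
    stacked = D.col_join(row)
    return [stacked.extract([1, 2], [0, 1]).det(),
            stacked.extract([0, 2], [0, 1]).det()]

q = top_minors(vN)
m1 = top_minors(n1)
m2 = top_minors(n2)

# (7.6.26): each minor of Q is the same linear combination of the M_i minors
for j in range(2):
    assert sp.expand(q[j] - (v1*m1[j] + v2*m2[j])) == 0
```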
The significance of Lemma 7.6.10 is in showing that, generically, a multivariable plant can be stabilized using a rank-one compensator. Suppose P is a given multivariable plant with an r.c.f. (N, D), and suppose there exists a vector v such that (vN, D) are right-coprime. Then one can find a pair of elements (a, B) ∈ M(S) such that a v N + B D = I. Note that a is a column vector. Now the compensator C = B^{-1} a v has rank one as an element of M(R(s)) and stabilizes P. Hence, one can stabilize P using a rank-one compensator if one can find a row vector v such that (vN, D) are right-coprime. Lemma 7.6.10 states that generically this is always possible.

With the aid of Lemma 7.6.10, the proof of Theorem 7.6.8 can be completed.

Proof of Theorem 7.6.8. If min(l, m) = 1, then the truth of the theorem is established by Lemma 7.6.9. If not, suppose without loss of generality that l ≤ m, so that r ≤ m. For each plant P_i, it is generically true that the set of row vectors v such that the m-input, single-output plant vP_i can be stabilized by an appropriate compensator is open and dense in R^l. Since the intersection of a finite number of open and dense subsets of R^l is again open and dense (see Proposition 7.6.2), generically the set of row vectors v such that each of the plants vP_i can be stabilized by an appropriate compensator is open and dense. Moreover, by Lemma 7.6.9, generically there exists a common stabilizing compensator C_1, since r ≤ m. Hence, C = C_1 v simultaneously stabilizes each of the plants P_i, i = 1, ···, r. □
In Section 5.4 it was shown that the problem of simultaneously stabilizing r plants is equivalent to that of simultaneously stabilizing r − 1 plants using a stable compensator. Now Theorem 7.6.8 shows that r plants of dimension l × m can be generically simultaneously stabilized provided r ≤ max(l, m). Thus, it is natural to conjecture that r plants of dimension l × m can generically be simultaneously stabilized using a stable compensator provided r < max(l, m). In fact this is true, and the proof follows very nearly the reasoning above. However, a few minor adjustments are needed. The reason for this is the following: Given r plants P_i, i = 1, ···, r, they can be simultaneously stabilized using a stable compensator if and only if the r + 1 plants 0, P_1, ···, P_r can be simultaneously stabilized. Now, the reasoning used in Lemma 7.6.9 and Theorem 7.6.8 relies on being able to perturb all of the plants as needed. However, in the present instance, one can only perturb all except one plant (namely the zero plant). However, there is a saving feature: In choosing an l.c.f. of the zero plant, the numerator matrix has to be zero, but the denominator matrix is arbitrary. Hence, the proof of the above result is complete if it can be shown that the Smith form of an (r + 1) × (m + 1) matrix of the form

M = [ d_1   N_1 ; d_2   0 ],   (7.6.29)

is generically equal to [I 0] whenever r < m. Comparing the matrix in (7.6.29) to that in Proposition 7.6.5, we see that the only difference is the presence of a row of zeros at the bottom of the matrix M in (7.6.29). Now, checking back to the proof of Proposition 7.6.5, the ingredients of the proof by induction were: (i) the fact that the r × m matrix N_1 generically had the Smith form [I 0], and (ii) the ability to adjust the corner element d_2. As both of these features are present in (7.6.29), we can conclude that the matrix M is generically right-invertible if r < m. Hence, the following corollary to Theorem 7.6.8 is proved:
Corollary 7.6.11  Suppose the plants P_i, i = 1, ···, r have dimension l × m. Then they can generically be simultaneously stabilized using a stable compensator if r < m.
PROBLEMS

7.6.1. (a) Show that if R is open then R_x is open for all x.
(b) Show that if R_x is a dense subset of X for all x belonging to a dense subset of X, then R is a dense subset of X × X.

7.6.2. Show that the relation R of (7.6.3) is generic.

7.6.3. Show that the relation T_m defined in (7.6.17) is open.
NOTES AND REFERENCES

The graph topology of Section 7.2 and the graph metric of Section 7.3 were introduced in [96]. For another metric on the set of square rational plants, see [114]. The robustness results in Section 7.4 are due to several researchers, including Doyle and Stein [32], Chen and Desoer [13], Kimura [58], and Vidyasagar and Kimura [99]. For other results on the robustness of feedback stability, specifically the gain margin of systems, see [57, 67]. Some of the results on robust regulation are from [39]. The genericity of simultaneous stabilizability was first established for two plants by Vidyasagar and Viswanadham [102]. The general results given here were first obtained by Ghosh and Byrnes [43] using state-space methods. The present treatment follows [100].
CHAPTER 8

Extensions to General Settings
The previous six chapters concentrated on the problems of controller synthesis in the case where the underlying ring of stable transfer functions is the ring S defined in Section 2.1. As pointed out in Section 4.1, the only property of S that is used repeatedly in Chapters 2–7 is the fact that S is a principal ideal domain. Thus, most if not all of the contents of these chapters remain valid if S is replaced by any p.i.d. This is why the synthesis theory of Chapters 2–7 encompasses, within a single framework, lumped continuous-time as well as discrete-time systems, and situations where the "stable" region in the complex plane is something other than the left half-plane (or the closed unit disc in the discrete-time case).
The objective of the present chapter is to generalize the earlier synthesis theory by removing
the assumption that the underlying set of stable systems is a p.i.d. In effect, the generalization
extends the scope of the theory to include linear distributed systems, 2-D systems and the like.
In contrast with earlier chapters, this chapter, and in particular Section 8.1, presupposes a
mathematical background on the part of the reader well beyond that contained in the appendices.
Several references to the literature are given at appropriate places, but in general, [14, 116] are good
references for concepts from abstract algebra, while [42] is a handy source for results on Banach
algebras.
8.1 COPRIME FACTORIZATIONS OVER A COMMUTATIVE RING
Throughout this section, R denotes a commutative domain with identity, and F denotes the field of fractions associated with R. As in the rest of the book, the symbols M(R), M(F), U(R) denote respectively the set of all matrices with elements in R, the set of all matrices with elements in F, and the set of all unimodular matrices with elements in R. If R is a principal ideal domain, then the contents of Section 4.1 show that every matrix over F has both right- and left-coprime factorizations over R. In fact, for this it is enough that R be a Bezout domain, i.e., that every finitely generated ideal in R be principal. However, if R is not Bezout, then the questions of the existence of r.c.f.'s and l.c.f.'s and their interrelationships need to be examined anew. It is the purpose of this section to do so.

The concepts of coprimeness and coprime factorizations introduced in Section 4.1 can be extended in a natural way to a general ring R. Suppose A, B ∈ M(R) have the same number of columns. Then A and B are right-coprime if there exist X, Y ∈ M(R) such that

XA + YB = I .  (8.1.1)
Suppose F ∈ M(F). A pair (N, D), where N, D ∈ M(R), is a right-coprime factorization (r.c.f.) of F if (i) D is square and |D| ≠ 0, (ii) F = N D^{-1}, and (iii) N, D are right-coprime. Left-coprime matrices and l.c.f.'s are defined analogously.

Given an arbitrary F ∈ M(F), the existence of an r.c.f. for F is not automatic unless R is a Bezout domain (see Corollary 8.1.5 below). However, if F does have an r.c.f., then it is easy enough to characterize all r.c.f.'s of F. In fact, the relevant part of Theorem 4.1.13 carries over in toto. For ease of reference, the result is repeated here, but the proof is omitted.

Lemma 8.1.1 Suppose F ∈ M(F) has an r.c.f. (N, D). Then the set of all r.c.f.'s of F consists of the set of pairs (NU, DU) as U ranges over all matrices in U(R).

But when does an F ∈ M(F) have an r.c.f.? We begin with the scalar case.
Lemma 8.1.2 Suppose f ∈ F, and express f as a/b where a, b ∈ R. Then f has a coprime factorization if and only if the ideal generated by a and b in R is principal.

Proof. "if" Suppose g is a generator of the ideal generated by a and b. Then a = ng, b = dg, g = xa + yb for appropriate elements n, d, x, y ∈ R. Since b is the denominator of a fraction, it is nonzero, which in turn implies that both d and g are nonzero. This, together with the earlier relations, leads to f = n/d, xn + yd = 1. Hence, (n, d) is a coprime factorization of f.

"only if" Suppose (n, d) is a coprime factorization of f, and choose x, y ∈ R such that xn + yd = 1. Let r = b/d. Then

r = b d^{-1} = b(x n d^{-1} + y) = b(x a b^{-1} + y) = xa + yb ∈ R .  (8.1.2)

Hence, d divides b in R, and dr = b. This implies that a = nr, and that the ideal generated by a, b is the same as the principal ideal generated by r. □
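The proof of Lemma 8.1.2 is constructive whenever a generator of the ideal (a, b) and the accompanying Bezout coefficients can be computed. As an illustration (not from the text), the sketch below carries the construction out over R = Z, where (a, b) is generated by gcd(a, b) and the extended Euclidean algorithm supplies the coefficients; the function names are ours.

```python
def xgcd(a, b):
    """Extended Euclid: return (g, x, y) with x*a + y*b = g = gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def coprime_factorization(a, b):
    """Given f = a/b over Z (b != 0), return (n, d, x, y) with
    f = n/d and x*n + y*d = 1, following the proof of Lemma 8.1.2."""
    g, _, _ = xgcd(a, b)      # g generates the ideal (a, b)
    n, d = a // g, b // g     # a = n*g, b = d*g
    _, x, y = xgcd(n, d)      # n and d are coprime, so x*n + y*d = 1
    return n, d, x, y

n, d, x, y = coprime_factorization(84, 30)
print(n, d, x * n + y * d)    # 14 5 1
```

The same pattern applies verbatim over any Euclidean domain, e.g. polynomials in one variable over a field.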
The matrix case is essentially similar, once one observes that every matrix F ∈ M(F) can be expressed in the form A B^{-1} where A, B ∈ M(R) and |B| ≠ 0. To see this, suppose f_{ij} = p_{ij}/q_{ij} where p_{ij}, q_{ij} ∈ R. Define b = ∏_i ∏_j q_{ij}, and let a_{ij} = b p_{ij}/q_{ij} ∈ R. If A denotes the matrix of the a_{ij}'s, then F = A (bI)^{-1}. The result in the matrix case is stated without proof.
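This construction is easy to mechanize. The sketch below (ours, using Python's exact rational arithmetic) forms b as the product of all entry denominators and checks that F = A (bI)^{-1} entrywise:

```python
from fractions import Fraction
from math import prod

# Entries f_ij = p_ij / q_ij of a 2 x 2 matrix over the field of fractions of Z
F = [[Fraction(1, 2), Fraction(2, 3)],
     [Fraction(3, 4), Fraction(1, 5)]]

b = prod(e.denominator for row in F for e in row)    # product of all q_ij
A = [[int(e * b) for e in row] for row in F]         # a_ij = b * p_ij / q_ij is an integer

# F = A (bI)^{-1}, i.e. f_ij = a_ij / b
assert all(Fraction(A[i][j], b) == F[i][j] for i in range(2) for j in range(2))
print(A, b)   # [[60, 80], [90, 24]] 120
```

Note that b need not be the least common denominator; any common multiple of the q_ij works.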
Lemma 8.1.3 Suppose F ∈ F^{n×m}, and express F in the form A B^{-1} where A, B ∈ M(R) and |B| ≠ 0. Define

I = {XA + YB : X ∈ R^{m×n}, Y ∈ R^{m×m}} ,  (8.1.3)

and note that I is a left ideal in R^{m×m}. Then F has an r.c.f. if and only if I is a left principal ideal.

Lemma 8.1.4 Every f ∈ F has a coprime factorization if and only if R is a Bezout domain.
Proof. The following statement is easy to prove: R is a Bezout domain if and only if every pair of elements in R generates a principal ideal. So, if R is not Bezout, then there exist a, b ∈ R such that the ideal generated by a and b is not principal. Obviously, both a and b are nonzero. Hence, f = a/b ∈ F, but f does not have a coprime factorization, by Lemma 8.1.2. This proves the "only if" part. The "if" part follows readily from Lemma 8.1.2. □

Corollary 8.1.5 Every F ∈ F^{n×m} has an r.c.f. for all integers n, m ≥ 1 if and only if R is a Bezout domain.

Proof. If R is a Bezout domain, the proof of Theorem 4.1.13 applies to show that every F ∈ F^{n×m} has an r.c.f. for all n, m ≥ 1. The converse is immediate from Lemma 8.1.4. □
The next several paragraphs are devoted to deriving conditions for a matrix in M(R) to have a left (or right) inverse in M(R), using the idea of reduction modulo maximal ideals. Since a finite set of matrices A_1, …, A_n in M(R) are left-coprime if and only if the matrix [A_1 … A_n] has a right inverse, this problem arises naturally in the study of the coprimeness of two (or more) matrices. In the case where R is a Bezout domain, several equivalent criteria for left invertibility are given in Section 4.1. It turns out that Corollary 4.1.5 is the result that extends most naturally to the case of a general ring R (see Theorem 8.1.8 below).
An ideal I in R is maximal if I ≠ R and I is not properly contained in any ideal of R other than R itself. Equivalently (see [116, p. 150]), I is maximal if and only if the factor ring R/I is a field. Let max(R) denote the set of maximal ideals of R. Then R/I is a field for each I ∈ max(R), even though for different I this field could be different. However, if R is in addition a Banach algebra over the complex field, then R/I is isomorphic to the complex field C for all I ∈ max(R) [42, p. 31]. For x ∈ R and I ∈ max(R), let x_I denote the coset x + I in the factor ring R/I.
Lemma 8.1.6 Suppose a_1, …, a_n ∈ R. Then there exist x_1, …, x_n ∈ R such that

Σ_{i=1}^n x_i a_i = 1  (8.1.4)

if and only if

rank [a_{1I} … a_{nI}] = 1  ∀ I ∈ max(R) .  (8.1.5)
Remarks 8.1.7 Note that (8.1.4) is equivalent to the statement that a_1, …, a_n generate R, while (8.1.5) states that no maximal ideal of R contains all of the a_i's. Thus, Lemma 8.1.6 is a natural restatement of a well-known result [42, p. 24] that an element of R is a unit if and only if it is not contained in any maximal ideal of R.
Proof. "if" Suppose (8.1.5) holds and let J denote the ideal generated by the a_i's. If J ≠ R, then J is contained in a maximal ideal of R [116, p. 151]. Suppose J ⊆ I ∈ max(R). Then I contains all the a_i's, in contradiction to (8.1.5). Hence, J = R, i.e., (8.1.4) holds.

"only if" Suppose (8.1.4) holds. If (8.1.5) is false, then there exists an I ∈ max(R) that contains all the a_i's. In view of (8.1.4), this implies that 1 ∈ I, which is absurd. □
Theorem 8.1.8 Suppose A ∈ R^{n×m} with n ≥ m. Then A has a left inverse in R^{m×n} if and only if¹

rank A_I = m  ∀ I ∈ max(R) .  (8.1.6)

An equivalent, and extremely useful, way of stating this theorem is: A ∈ R^{n×m} has a left inverse if and only if the m × m minors of A together generate R.
Proof. "only if" Suppose B ∈ R^{m×n} is a left inverse of A, so that BA = I_m. Since the map x ↦ x_I from R into R/I is a homomorphism for each I ∈ max(R), it follows, upon taking cosets modulo I, that

B_I A_I = I_m  ∀ I ∈ max(R) .  (8.1.7)

Now (8.1.7) implies (8.1.6).

"if" This part of the proof is made more intelligible by using the symbol S(m, n) defined in Section B.1. Recall that S(m, n) denotes the set of all strictly increasing m-tuples taken from {1, …, n}.

The case m = n is very easy: In this case (8.1.6) merely states that |A| is not contained in any maximal ideal of R. Hence, |A| is a unit of R, and A^{-1} ∈ R^{m×m}.
Now suppose m < n. For each m-tuple J ∈ S(m, n), let F_J ∈ R^{m×m} denote the m × m submatrix of A corresponding to the rows in J. In other words, if J = {j_1, …, j_m}, then

F_J = [a_{j_1 1} ⋯ a_{j_1 m}; ⋮ ; a_{j_m 1} ⋯ a_{j_m m}] .  (8.1.8)

Let f_J = |F_J| denote the m × m minor of A corresponding to the rows in J. Then (8.1.6) states that, for each I ∈ max(R), we have f_{JI} ≠ 0 for some J ∈ S(m, n). Hence, by Lemma 8.1.6, the minors f_J, J ∈ S(m, n) together generate R; i.e., there exist elements x_J, J ∈ S(m, n) such that

Σ_{J∈S(m,n)} x_J f_J = 1 .  (8.1.9)
¹Note that A_I is an n × m matrix, with elements in the field R/I, whose ij-th element is a_{ijI}.
Next, for each J ∈ S(m, n), define a matrix B_J ∈ R^{m×n} as follows: Suppose J = {j_1, …, j_m}, define F_J as in (8.1.8), and let G_J ∈ R^{m×m} denote the adjoint matrix of F_J. Let g_{J1}, …, g_{Jm} denote the columns of G_J. Then

B_J = [0 ⋯ g_{J1} g_{J2} ⋯ g_{Jm} ⋯ 0] .  (8.1.10)

In other words, columns j_1, …, j_m of B_J equal g_{J1}, …, g_{Jm}, respectively, while all other columns of B_J are zero. By construction,

B_J A = G_J F_J = f_J I_m .  (8.1.11)

Now define

B = Σ_{J∈S(m,n)} x_J B_J .  (8.1.12)

Then BA = I_m, from (8.1.11) and (8.1.9). □
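The "if" half of this proof is entirely constructive. The following sketch (ours) runs it over R = Z for a 3 × 2 matrix whose 2 × 2 minors are 2, 0 and −1: since these generate Z, Theorem 8.1.8 promises a left inverse, and here the single Bezout choice x_{(1,2)} = −1 already works.

```python
from itertools import combinations

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# A 3 x 2 integer matrix; its 2 x 2 minors are 2, 0, -1, which generate Z.
A = [[2, 0],
     [0, 1],
     [1, 0]]
n, m = 3, 2

minors = {J: det2([A[j] for j in J]) for J in combinations(range(n), m)}
x = {(0, 1): 0, (0, 2): 0, (1, 2): -1}       # Bezout: sum x_J * f_J = (-1)*(-1) = 1

# B = sum_J x_J B_J, where B_J carries the columns of adj(F_J) in positions J
B = [[0] * n for _ in range(m)]
for J, xJ in x.items():
    F_J = [A[j] for j in J]
    adj = [[F_J[1][1], -F_J[0][1]],
           [-F_J[1][0], F_J[0][0]]]          # adjugate of the 2 x 2 submatrix
    for col, j in enumerate(J):
        for row in range(m):
            B[row][j] += xJ * adj[row][col]

BA = [[sum(B[i][k] * A[k][j] for k in range(n)) for j in range(m)] for i in range(m)]
print(BA)   # [[1, 0], [0, 1]]
```

The 2 × 2 adjugate is hard-coded for brevity; for larger m one would compute adjugates via cofactors in the same way.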
Corollary 8.1.9 Suppose A, B ∈ M(R) have the same number of columns, say m. Then A and B are right-coprime if and only if, for every I ∈ max(R), the matrix obtained by stacking A_I on top of B_I has rank m.

Now suppose that R is, in addition, a commutative Banach algebra over the complex field C, and let Ω denote its maximal ideal space. The homomorphism associated with each ω ∈ Ω maps R onto C; for x ∈ R, let x̂(ω) denote the image (in C) of x under the homomorphism associated with ω. Let Ω be equipped with the weakest topology in which the map ω ↦ x̂(ω) is continuous for every x ∈ R; with this topology Ω is compact. Let C(Ω) denote the set of continuous complex-valued functions on Ω, with the norm

‖f‖ = sup_{ω∈Ω} |f(ω)| ,  (8.1.15)

and multiplication defined pointwise. Then, with each element x ∈ R, one can associate an element x̂ ∈ C(Ω) by the rule above; x̂ is known as the Gelfand transform of x.

As an illustration, let R = ℓ¹(Z₊), the set of sequences {f_i}_{i=0}^∞ that are absolutely summable, with convolution as the multiplication. As shown in [26, 97], ℓ¹(Z₊) is precisely the set of unit-pulse responses of BIBO-stable linear time-invariant discrete-time systems. The maximal ideal space of ℓ¹(Z₊) can be identified with the closed unit disc D̄: each z ∈ D̄ defines a homomorphism

f̂(z) = Σ_{i=0}^∞ f_i z^i ,  (8.1.16)

which maps ℓ¹(Z₊) into C. The map z ↦ f̂(z), which is a function in C(D̄), is the Gelfand transform of f ∈ ℓ¹(Z₊). In this case, it is just the usual z-transform of the sequence {f_i}.
More generally, let R = ℓ¹(Z₊ⁿ) denote the set of all absolutely summable n-indexed sequences {f_{i_1 ⋯ i_n}}, where each index ranges over the nonnegative integers. The maximal ideal space of R
is D̄ⁿ, the cartesian product of the closed unit disc with itself n times, also referred to as the n-polydisc. Suppose z = (z_1, …, z_n) ∈ D̄ⁿ. Then the function

f̂(z) = Σ_{i_1=0}^∞ ⋯ Σ_{i_n=0}^∞ f_{i_1 ⋯ i_n} z_1^{i_1} ⋯ z_n^{i_n}  (8.1.17)

defines a homomorphism from ℓ¹(Z₊ⁿ) into C. The map z ↦ f̂(z), which belongs to C(D̄ⁿ), is the Gelfand transform of f, and is again the usual z-transform.
Now suppose R = ℓ¹(Z), the set of absolutely summable two-sided sequences {f_i}_{i=−∞}^∞. In this case the maximal ideal space of R is the unit circle T. For each z ∈ T, the corresponding homomorphism from ℓ¹(Z) to C is again given by (8.1.16), with the summation now running from −∞ to ∞. If R = ℓ¹(Zⁿ), then the maximal ideal space of R is Tⁿ, and (8.1.17) defines a homomorphism of R into C (with the obvious change of 0 to −∞ everywhere). One can replace Zⁿ by any locally compact Abelian group and the theory is still complete; see [80].

In some applications, one does not know the maximal ideal space Ω of R explicitly, but does know a dense subset Λ of Ω. Two such instances are discussed in Applications 8.1.15 and 8.1.16 below. Theorem 8.1.8 can be easily extended to this case.
Lemma 8.1.12 Suppose R is a Banach algebra over C, with maximal ideal space Ω. Suppose Λ is a dense subset of Ω, and suppose a_1, …, a_n ∈ R. Then there exist x_1, …, x_n ∈ R such that

Σ_{i=1}^n x_i a_i = 1  (8.1.18)

if and only if

inf_{λ∈Λ} Σ_{i=1}^n |â_i(λ)| > 0 .  (8.1.19)
Proof. "if" Suppose (8.1.19) holds; then it is claimed that

Σ_{i=1}^n |â_i(ω)| ≠ 0  ∀ ω ∈ Ω ,  (8.1.20)

or equivalently, that at least one â_i(ω) is nonzero at each ω ∈ Ω. The proof of (8.1.20) is by contradiction. Suppose that, for some ω_0 ∈ Ω, we have â_i(ω_0) = 0 ∀ i. Since Λ is dense in Ω, there is a sequence³ {λ_j} in Λ converging to ω_0, and

Σ_{i=1}^n |â_i(λ_j)| → Σ_{i=1}^n |â_i(ω_0)| = 0 .  (8.1.21)
³If Ω is not a first countable topological space, this argument has to be modified by replacing all sequences by nets.
But (8.1.21) contradicts (8.1.19). Hence, (8.1.20) is true. The existence of the x_i's satisfying (8.1.18) now follows from Lemma 8.1.6.

"only if" Suppose (8.1.18) holds. Then, by Lemma 8.1.6, (8.1.20) is also true. Now, since Ω is a compact topological space, (8.1.20) implies that

inf_{ω∈Ω} Σ_{i=1}^n |â_i(ω)| > 0 .  (8.1.22)

Finally, (8.1.22) implies (8.1.19), since Λ is a subset of Ω. □
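In the case R = ℓ¹(Z₊), whose maximal ideal space is the closed unit disc, condition (8.1.19) says that finitely many sequences generate R exactly when their z-transforms are jointly bounded away from zero on the disc. A crude numerical probe of this condition (ours; sampling can only suggest, never prove, that the infimum is positive) looks like this for two polynomial sequences:

```python
import cmath

def zhat(coeffs, z):
    """z-transform (Gelfand transform) of a finitely supported sequence in l1(Z+)."""
    return sum(c * z**i for i, c in enumerate(coeffs))

f = [1.0, -0.5]   # f_hat(z) = 1 - 0.5 z; its only zero, z = 2, lies outside the disc
g = [0.0, 1.0]    # g_hat(z) = z, which vanishes at z = 0

# Sample the closed unit disc on a polar grid
samples = [r * cmath.exp(2j * cmath.pi * k / 200)
           for r in (0.0, 0.25, 0.5, 0.75, 1.0) for k in range(200)]
lower = min(abs(zhat(f, z)) + abs(zhat(g, z)) for z in samples)
print(round(lower, 6))   # 1.0: |f_hat| + |g_hat| stays away from 0, as (8.1.19) requires
```

Here the bound can also be verified by hand: |f̂(z)| + |ĝ(z)| ≥ 1 − 0.5|z| + |z| = 1 + 0.5|z| ≥ 1 on the disc.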
Lemma 8.1.13 Let R, Ω, Λ be as in Lemma 8.1.12, and suppose A ∈ R^{n×m} with n ≥ m. Then A has a left inverse in R^{m×n} if and only if

inf_{λ∈Λ} σ_m(Â(λ)) > 0 ,  (8.1.23)

where σ_m(·) denotes the smallest singular value of a matrix.

Remarks 8.1.14 An equivalent way of stating (8.1.23) is as follows: There exists an ε > 0 such that Â(λ)* Â(λ) ≥ εI ∀ λ ∈ Λ.
Proof. "if" Suppose σ_m(Â(λ)) ≥ ε ∀ λ ∈ Λ. Then a simple exercise in numerical analysis shows that

Σ_{J∈S(m,n)} |â_J(λ)| ≥ ε^m / c^{1/2} > 0  ∀ λ ∈ Λ ,  (8.1.24)

where â_J, J ∈ S(m, n) are the various m × m minors of Â, and c = n!/[m!(n − m)!]. From Lemma 8.1.12, (8.1.24) implies that the minors a_J, J ∈ S(m, n) generate R. The existence of a left inverse for A now follows as in the proof of Theorem 8.1.8.

"only if" Suppose A has a left inverse. Then, by Theorem 8.1.8, the minors a_J, J ∈ S(m, n) of A generate R. Hence, by Lemma 8.1.12,

inf_{λ∈Λ} Σ_{J∈S(m,n)} |â_J(λ)| > 0 .  (8.1.25)

It is now routine to prove (8.1.23). □
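Condition (8.1.23) is easy to probe numerically once σ_m can be evaluated. The sketch below (ours) treats a 3 × 2 polynomial matrix, computing the smallest singular value at sample points on the unit circle from the 2 × 2 Gram matrix Â(λ)*Â(λ) in closed form; as with any sampling test, this suggests, but does not prove, that the infimum is positive.

```python
import cmath, math

def sigma_min(rows):
    """Smallest singular value of an n x 2 complex matrix via its 2 x 2 Gram matrix."""
    a = sum(abs(r[0]) ** 2 for r in rows)
    c = sum(abs(r[1]) ** 2 for r in rows)
    b = sum(r[0].conjugate() * r[1] for r in rows)
    lam_min = (a + c) / 2 - math.sqrt(((a - c) / 2) ** 2 + abs(b) ** 2)
    return math.sqrt(max(lam_min, 0.0))

def A_of(z):
    # A sample 3 x 2 matrix with entries polynomial in z (an assumed example)
    return [[1.0, z], [z, 1.0], [0.5, 0.0]]

lower = min(sigma_min(A_of(cmath.exp(2j * cmath.pi * k / 500)))
            for k in range(500))
print(lower > 0.1)   # True: sigma_min stays bounded away from zero on the circle
```

The closed-form eigenvalue of the 2 × 2 Hermitian Gram matrix avoids any numerical-library dependence; for larger m one would use a general SVD routine instead.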
Application 8.1.15 Let R = H∞, the set of all functions that are analytic and bounded in the open right half-plane C₊. The maximal ideal space of H∞ is not known explicitly; however, by the corona theorem, the maximal ideals corresponding to the points of C₊ form a dense subset of it. Hence, by Lemma 8.1.12, two functions f, g ∈ H∞ are coprime if and only if

inf_{s∈C₊} {|f(s)| + |g(s)|} > 0 .  (8.1.26)

Application 8.1.16 Let A denote the set of distributions of the form

f(t) = Σ_{i=0}^∞ f_i δ(t − t_i) + f_a(t) ,  (8.1.27)

where 0 ≤ t_0 < t_1 < t_2 < ⋯ are real constants, f_a(·) is absolutely continuous, and in addition

‖f‖_A := Σ_{i=0}^∞ |f_i| + ∫_0^∞ |f_a(t)| dt < ∞ .  (8.1.28)

One can interpret A as the set of regular measures (without singular part) of bounded variation on the half-line [0, ∞); see [48, 80]. It is shown in [26] that the set A is a good model for the set of impulse responses of BIBO-stable linear time-invariant continuous-time systems. Now the maximal ideal space of A is very messy to describe. However, it is shown in [48, pp. 145 et seq.] that the closed right half of the complex plane is a dense subset of this maximal ideal space, in the following sense: For each s ∈ C₊, define a homomorphism φ_s : A → C by

φ_s(f) = f̂(s) = Σ_{i=0}^∞ f_i exp(−s t_i) + ∫_0^∞ f_a(t) exp(−s t) dt .  (8.1.29)

It is recognized that f̂ is just the Laplace transform of f(·). For each s ∈ C₊, the set of f(·) ∈ A such that f̂(s) = 0 is a maximal ideal of A. The result in [48] is that the collection of maximal ideals corresponding to all s ∈ C₊ is dense in the maximal ideal space of A. Hence [8], by Lemma 8.1.12, two functions f, g ∈ A are coprime if and only if

inf_{s∈C₊} {|f̂(s)| + |ĝ(s)|} > 0 .  (8.1.30)
Now we change direction somewhat. Lemma 8.1.4 shows that if R is not a Bezout domain, then not every matrix in M(F) has an r.c.f., and not every matrix has an l.c.f. At this stage it is natural to ask: if a matrix has one, does it have the other? The rest of this section is devoted to answering this question for various rings of interest.
Suppose A ∈ R^{m×n} with m < n; then we say that A can be complemented if there exists a unimodular matrix U ∈ R^{n×n} containing A as a submatrix. Suppose A can be complemented, and let U ∈ R^{n×n} be a unimodular matrix whose first m rows equal A. By expanding |U| along the first m rows using Laplace's expansion (Fact B.1.4), we see that the m × m minors of A must together generate R, or equivalently (by Theorem 8.1.8), that A has a right inverse in R^{n×m}. Thus, only right-invertible matrices can be complemented. But is right-invertibility sufficient?

We postpone answering this question in order to demonstrate the relevance of complementability to the issue at hand.
Lemma 8.1.17 Suppose F ∈ M(F) has an l.c.f. (D̃, Ñ). Then F has an r.c.f. if and only if the matrix A = [D̃ Ñ] can be complemented.

Proof. "if" Suppose A can be complemented. Since B = [−Ñ D̃] is obtained from A by elementary column operations, B can also be complemented.⁴ Suppose

U = [Y X; −Ñ D̃]  (8.1.31)

is a unimodular matrix containing B as a submatrix, and partition U^{-1} ∈ M(R) as

U^{-1} = [D −X̃; N Ỹ] .  (8.1.32)
The claim is that (N, D) is an r.c.f. of F. From U U^{-1} = I, we get

YD + XN = I ,  (8.1.33)
−Ñ D + D̃ N = 0 .  (8.1.34)

Thus, D and N are right-coprime from (8.1.33), and the proof is complete if it can be established that |D| ≠ 0, because (8.1.34) would then imply that N D^{-1} = D̃^{-1} Ñ = F.

To show that D is nonsingular, begin by observing that D̃ is nonsingular, since (D̃, Ñ) is an l.c.f. of F. Therefore the matrix

[I 0; −Ñ D̃] U^{-1} = [D −X̃; 0 I]  (8.1.35)

is also nonsingular. It now follows from (8.1.35) that |D| ≠ 0.
"only if" This part exactly parallels the proof of Theorem 4.1.16. Suppose (N, D) is an r.c.f. of F, and select X, Y, X̃, Ỹ ∈ M(R) such that XN + YD = I, Ñ X̃ + D̃ Ỹ = I. Define

U = [Y X; −Ñ D̃] ,  W = [D −X̃; N Ỹ] .  (8.1.36)

⁴The change from [D̃ Ñ] to [−Ñ D̃] is made simply to make U resemble the matrix in Theorem 4.1.16.
Then,

UW = [I −YX̃ + XỸ; 0 I] .  (8.1.37)

Since |UW| = 1, it follows that |U| is a unit and that U is unimodular. This shows that B = [−Ñ D̃] can be complemented. The same is true of A = [D̃ Ñ], since A can be transformed to B by elementary column operations. □
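In the scalar case the bookkeeping of this proof collapses, since everything commutes and one may take X̃ = X, Ỹ = Y, Ñ = N, D̃ = D. The sketch below (ours) checks the identities (8.1.33)–(8.1.37) numerically for f = 2/3 over R = Z:

```python
n, d = 2, 3          # f = n/d, an r.c.f. (and l.c.f.) of f over Z
x, y = -1, 1         # Bezout: x*n + y*d = (-1)*2 + 1*3 = 1

U = [[y, x], [-n, d]]    # as in (8.1.36), with the tilde factors equal to (n, d)
W = [[d, -x], [n, y]]

UW = [[sum(U[i][k] * W[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
print(UW)   # [[1, 0], [0, 1]]: |UW| = 1, so U is unimodular and [-n d] is complemented
```

That UW is exactly the identity (rather than merely upper triangular as in (8.1.37)) reflects the commutativity: the off-diagonal block −yx + xy vanishes.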
Corollary 8.1.18 Suppose F ∈ M(F) has an r.c.f. (N, D). Then F has an l.c.f. if and only if the matrix [D; N] can be complemented.

The proof is easy and is left as an exercise.
Lemma 8.1.19 Suppose A ∈ R^{m×n} can be complemented. Then by definition there exists a matrix B_0 ∈ R^{(n−m)×n} such that [A; B_0] is unimodular. Then the set of B ∈ R^{(n−m)×n} such that [A; B] is unimodular is given by

{B : B = RA + V B_0, where R ∈ M(R), V ∈ U(R)} .  (8.1.38)
Proof. First, suppose B = RA + V B_0 where R ∈ M(R) and V ∈ U(R). Then,

[A; B] = [A; RA + V B_0] = [I 0; R V] [A; B_0] ,  (8.1.39)

which is unimodular, since both matrices on the right side are in U(R). Conversely, suppose [A; B] ∈ U(R). Partition [A; B_0]^{-1} as [F G], and note that AF = I, AG = 0. Now, by assumption,

[A; B] [A; B_0]^{-1} = [A; B] [F G] = [I 0; BF BG]  (8.1.40)

is in U(R), which shows that BG ∈ U(R). Let V = BG, R = BF. Then, from (8.1.40),

[A; B] = [I 0; R V] [A; B_0] = [A; RA + V B_0] ,  (8.1.41)

which implies that B = RA + V B_0. □
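The parametrization (8.1.38) can be checked numerically for row vectors over Z. In the sketch below (ours), A = [8 5] has the complement B_0 = [3 2]; every B = RA + V B_0 with R ∈ Z and V a unit of Z (V = ±1) again yields a unimodular [A; B]:

```python
A  = [8, 5]          # a right-invertible row over Z
B0 = [3, 2]          # one complement: det [[8, 5], [3, 2]] = 1

def det(r1, r2):
    return r1[0] * r2[1] - r1[1] * r2[0]

for R in range(-3, 4):          # R ranges over (a slice of) M(Z) = Z
    for V in (1, -1):           # V ranges over U(Z) = {1, -1}
        B = [R * A[0] + V * B0[0], R * A[1] + V * B0[1]]
        assert det(A, B) in (1, -1)   # [A; B] is unimodular
print("all complements B = R*A + V*B0 verified")
```

The determinant works out to exactly V, which is why every choice in the family succeeds.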
Now we return to the question: When can every right-invertible matrix over R be complemented? The answer is provided by the notion of a Hermite ring. A row [a_1 ⋯ a_n] ∈ R^{1×n} is a unimodular row if a_1, …, a_n together generate R. A ring R is Hermite if every unimodular row can be complemented. Lemma B.2.1 shows that every Bezout domain is Hermite. But it is shown below that several other commonly encountered rings are Hermite, though they are not Bezout. Before this, a minor technicality is disposed of.
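Over a Bezout domain the complementation guaranteed by Lemma B.2.1 is explicit; for a length-two row it is just the Bezout identity. The sketch below (ours) carries this out over Z:

```python
def xgcd(a, b):
    """Extended Euclid: return (g, x, y) with x*a + y*b = g."""
    if b == 0:
        return (a, 1, 0)
    g, x, y = xgcd(b, a % b)
    return (g, y, x - (a // b) * y)

def complement_row(a1, a2):
    """Complete a unimodular row [a1 a2] over Z to a 2 x 2 matrix of determinant 1."""
    g, x, y = xgcd(a1, a2)
    assert g == 1, "row is not unimodular over Z"
    return [[a1, a2], [-y, x]]    # det = a1*x + a2*y = 1

U = complement_row(8, 5)
print(U, U[0][0] * U[1][1] - U[0][1] * U[1][0])   # [[8, 5], [3, 2]] 1
```

For rows of length n > 2 one reduces to this case by repeated gcd steps, which is in essence how Lemma B.2.1 proceeds for Bezout domains.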
Lemma 8.1.20 The following statements are equivalent:

(i) R is a Hermite ring.

(ii) If A ∈ R^{k×n} has a right inverse, then A can be complemented.

Remarks 8.1.21 (a) The objective of the lemma is to deal with "fat" matrices, not just row vectors; (b) compare (ii) with Corollary 4.1.7.
Proof. (ii) ⟹ (i) Obvious: just let k = 1.

(i) ⟹ (ii) Fix n. The proof is by induction on k. The definition of a Hermite ring shows that (ii) is true when k = 1. Suppose (ii) is true for matrices in R^{m×n} whenever m ≤ k − 1 < n − 1, and suppose A ∈ R^{k×n} has a right inverse, say C. Partition A and C as shown below:

A = [A_1; a_2] ,  C = [C_1 c_2] ,  (8.1.42)

where a_2 is the last row of A and c_2 is the last column of C. Then A_1 C_1 = I_{k−1}. Hence, by the inductive hypothesis (and taking transposes), there exists a unimodular matrix U ∈ R^{n×n} of the form U = [C_1 D]. Now

AU = [A_1; a_2] [C_1 D] = [I_{k−1} Z; 0 y] =: M .  (8.1.43)

Since A has a right inverse, so does M, and hence so does y ∈ R^{1×(n−k+1)}. Let b ∈ R^{(n−k+1)×1} be a right inverse of y. Since R is Hermite, there exists a unimodular matrix B ∈ R^{(n−k+1)×(n−k+1)} containing b as its first column. Thus,

AU [I_{k−1} 0; 0 B] = [I_{k−1} ZB; 0 yB] = [I_{k−1} ∗ ∗; 0 1 ∗] =: N ,  (8.1.44)

since the first element of yB ∈ R^{1×(n−k+1)} is a one. (In (8.1.44), ∗ is a generic symbol denoting an element or a matrix whose value is unimportant.) Now the matrix N in (8.1.44) can be complemented, since the determinant of

[I_{k−1} ∗ ∗; 0 1 ∗; 0 0 I_{n−k}]  (8.1.45)

is one. Since N is a right associate of A, it follows that A can also be complemented. □
Corollary 8.1.22 Suppose R is a Hermite ring and that A ∈ R^{m×n} with m < n. Then A has a right inverse if and only if there exist unimodular matrices V ∈ R^{m×m}, U ∈ R^{n×n} such that

VAU = [I_m 0] .  (8.1.46)

Corollary 8.1.22 shows that in a Hermite ring, even though there is no Smith form in general for arbitrary matrices, there is one for right-invertible matrices (and of course for left-invertible matrices).
Theorem 8.1.23 The following three statements are equivalent:

(i) R is a Hermite ring.

(ii) If F ∈ M(F) has an l.c.f., then it has an r.c.f.

(iii) If F ∈ M(F) has an r.c.f., then it has an l.c.f.

Proof. The equivalence of (ii) and (iii) follows readily by taking transposes.

(i) ⟹ (ii) Suppose R is Hermite, and suppose F ∈ M(F) has an l.c.f., say (D̃, Ñ). Then A = [D̃ Ñ] has a right inverse. From Lemma 8.1.20, A can be complemented, and from Lemma 8.1.17, it follows that F has an r.c.f.

(ii) ⟹ (i) Suppose [a_1 ⋯ a_n] ∈ R^{1×n} is an arbitrary unimodular row; it must be shown that this row can be complemented. Since a_1, …, a_n generate R, at least one of the a_i must be nonzero. Suppose a_1 ≠ 0 by renumbering if necessary, and define F ∈ F^{1×(n−1)} by

F = [a_2/a_1 ⋯ a_n/a_1] .  (8.1.47)

Now the unimodularity of the row [a_1 ⋯ a_n] shows that F has an l.c.f., namely (a_1, [a_2 ⋯ a_n]). Since (ii) is assumed to hold, F also has an r.c.f. By Lemma 8.1.17, this implies that the row [a_1 ⋯ a_n] can be complemented. Hence, R is Hermite. □
Theorem 8.1.23 shows that Hermite rings are the next best thing to Bezout domains, at least as far as controller synthesis theory goes. If R is Bezout, then every matrix in M(F) has both an r.c.f. and an l.c.f., and all of the synthesis theory of Chapter 5 goes through very nicely. If R is not Bezout, then not every matrix in M(F) has an r.c.f., and not every matrix has an l.c.f. However, if R is at least Hermite, then the matrices in M(F) that have an r.c.f. are precisely those that have an l.c.f. Moreover, this property characterizes Hermite rings.

It is therefore worthwhile to explore whether various commonly encountered rings are Hermite. If R is a complex Banach algebra, then there is a very general result available. The proof of this result is well beyond the scope of this book and can be found in [62, Theorem 3].
Theorem 8.1.24 Suppose R is a complex Banach algebra with maximal ideal space Ω. Then a unimodular row [a_1 ⋯ a_n] in R can be complemented if and only if the row [â_1 ⋯ â_n] can be complemented in C(Ω). In particular, if Ω is contractible, then R is Hermite.
Recall that a topological space Ω is contractible if the identity map on Ω is homotopic to a constant map. In other words, Ω is contractible if there exist an ω_0 ∈ Ω and a continuous function f : [0, 1] × Ω → Ω such that

f(0, ω) = ω ,  f(1, ω) = ω_0 ,  ∀ ω ∈ Ω .  (8.1.48)

Theorem 8.1.24 is similar in spirit to one of the basic results in the Gelfand theory of Banach algebras, which states that, given any complex Banach algebra R and any f ∈ R, f can be inverted in R if and only if f̂ can be inverted in C(Ω).

Note that a convex set in Cⁿ is contractible: If Ω ⊆ Cⁿ is convex, let ω_0 ∈ Ω be arbitrary and define

f(μ, ω) = μ ω_0 + (1 − μ) ω .  (8.1.49)

This observation leads to a very useful criterion.

Corollary 8.1.25 Suppose R is a complex Banach algebra whose maximal ideal space is a (compact) convex subset of Cⁿ for some integer n. Then R is a Hermite ring.
Examples 8.1.26 [6, 54] The disc algebra, ℓ¹(Z₊), and ℓ¹(Z₊ⁿ) are all Hermite.

Theorem 8.1.24 and Corollary 8.1.25 are not valid in general if R is a real Banach algebra. However, in [6, 54] it is shown that the following real algebras are indeed Hermite rings:

(i) The set A_s of functions in the disc algebra A whose Fourier coefficients are all real (i.e., the set of all symmetric functions in A).

(ii) The set ℓ¹_r(Z₊ⁿ) consisting of all real sequences in ℓ¹(Z₊ⁿ).

Theorem 8.1.24 suggests a simple example of a ring that is not Hermite.
Example 8.1.27 Let S² denote the unit sphere in R³, and let R = C(S²), the set of all continuous complex-valued functions on S².⁵ Then Ω = S², which is not contractible. It is claimed that R is not a Hermite ring. To show this, it is enough to demonstrate a unimodular row in R^{1×3} that cannot be complemented. Define a = [a_1 a_2 a_3] by setting a(x) equal to the unit outward normal vector at x, for each x ∈ S². Then a(·) is continuous and therefore belongs to R^{1×3}. Moreover, since a(x) ≠ 0 ∀ x ∈ S², it follows from Lemma 8.1.6 that a is a unimodular row. Now suppose a could be complemented, and suppose [a; b; c] ∈ R^{3×3} is unimodular. Then for each x ∈ S², the three vectors a(x), b(x), c(x) must span the vector space C³. Let T(x) denote the (two-dimensional) tangent space to S² at x. Since span {a(x), b(x), c(x)} = C³ contains T(x), and since a(x) is orthogonal to T(x), we see that the projections of b(x), c(x) onto T(x) must both be nonzero and must together span T(x). Let f(x), g(x) denote the projections of b(x), c(x) onto T(x), respectively. Then f(·) is a vector field on S² that does not vanish anywhere. However, this contradicts the Borsuk-Ulam

⁵This example was suggested to the author by Ken Davidson.
theorem [46, p. 116], which states that every vector field on S² must have a zero. Hence, a(·) cannot be complemented, and R is not a Hermite ring.

If we define F = [a_2/a_1 a_3/a_1], then F has an l.c.f. but not an r.c.f.
We conclude this section with a theorem that is very important in several branches of control theory, and indeed in mathematics.

Theorem 8.1.28 (Quillen-Suslin) Let k be a field, and let R = k[x_1, …, x_l], the ring of polynomials in l indeterminates with coefficients in k. Then R is a Hermite ring.

Remarks 8.1.29 (i) Until it was proven, the above result was known as "Serre's conjecture." See [61] for an excellent historical introduction to this result, its origins and its significance.

(ii) Quillen [76] shows that the result remains valid if k is replaced by any p.i.d. However, the proof given below does not apply to this case.

(iii) The ring k[[x_1, …, x_l]] of formal power series in l indeterminates with coefficients in k is also a Hermite ring; see [63].

(iv) The proof given below is taken from [79, pp. 138–145]. See [112] for another proof that applies to the special case when k = C.
The elementary proof of Theorem 8.1.28 is rather long, and makes use of a few concepts not previously introduced. A ring B is local if it contains a unique maximal ideal [4, p. 38]. Suppose A is a domain, and let P be a prime ideal in A. The set consisting of all fractions of the form a/b, where a ∈ A, b ∈ A \ P, is a local ring, called the localization of A at P, and is denoted by A_P. The unique maximal ideal of A_P is the set of fractions a/b where a ∈ P, b ∈ A \ P. The pertinent property of local rings is the following: If B is a local ring and J is its unique maximal ideal, then every b ∉ J is a unit. This follows from Lemma 8.1.6.

If B is a ring and b_1, …, b_n ∈ B, the symbol (b_1, …, b_n) denotes the ideal generated by the b_i's. If f ∈ B[y], then δ(f) denotes the degree of the polynomial f. More generally, if f ∈ B[y_1, …, y_m], then δ(f) denotes the total degree of f. That is, if f is of the form

f(y_1, …, y_l) = Σ_i f_i y_1^{i_1} ⋯ y_l^{i_l} ,  (8.1.50)

where i is a shorthand for the l-tuple (i_1, …, i_l), then δ(f) is the largest of the sums i_1 + ⋯ + i_l for which the coefficient f_i is nonzero.
The proof of Theorem 8.1.28 is basically inductive. Clearly k[x_1] is Hermite, since it is a p.i.d. (see Lemma B.2.1). The proof consists of showing that if k[x_1, …, x_{l−1}] is Hermite, then so is k[x_1, …, x_l]. This proof consists of two independent steps. The first step (which is actually proved last here, since it is more complicated than the other) is to show that if B is a Hermite ring and [a_1, …, a_n] is a unimodular row in B[x]^{1×n}, then it can be complemented, provided one of the a_i's is a monic polynomial. The second step (which is proved first) is to show that in the ring k[x_1, …, x_l], it is always possible to effect a change of variables in such a way that every unimodular row contains at least one monic polynomial in the new set of variables.
Lemma 8.1.30 (Nagata) Suppose a ∈ R, and let m = δ(a) + 1. Define a new set of indeterminates

y_l = x_l ,  y_i = x_i − x_l^{m^{l−i}} ,  1 ≤ i ≤ l − 1 .  (8.1.51)

Then a(x_1, …, x_l) = r b(y_1, …, y_l), where r ≠ 0 belongs to k, and b is a monic polynomial in y_l (with coefficients in k[y_1, …, y_{l−1}]).
Proof. Denote an l-tuple (j_1, …, j_l), where each of the j_i is nonnegative, by the symbol j. Then the polynomial a can be written as

a = Σ_j r_j x_1^{j_1} ⋯ x_l^{j_l} ,  (8.1.52)

where r_j ∈ k. Note that (8.1.51) has the inverse transformation

x_l = y_l ,  x_i = y_i + y_l^{m^{l−i}} ,  1 ≤ i ≤ l − 1 .  (8.1.53)

Substituting from (8.1.53) into (8.1.52) gives

b = Σ_j r_j (y_1 + y_l^{m^{l−1}})^{j_1} ⋯ (y_{l−1} + y_l^m)^{j_{l−1}} y_l^{j_l} .  (8.1.54)
Now let m̄ denote the l-tuple (m^{l−1}, …, m, 1). The first point to note is that if i = (i_1, …, i_l) and j = (j_1, …, j_l) are two distinct l-tuples with 0 ≤ i_k, j_k ≤ m − 1 ∀ k, then i · m̄ and j · m̄ are distinct integers, where · denotes the usual vector dot product. This follows upon noting that i · m̄ < m^l, and that the l-tuple i is just the representation of the integer i · m̄ with respect to the base m; similarly for the integer j · m̄. Returning to (8.1.54), let J = {j : r_j ≠ 0}. Then there is a unique l-tuple j_0 ∈ J such that j_0 · m̄ > j · m̄ ∀ j ∈ J \ {j_0}. Hence, when expanded, (8.1.54) is of the form

b = r_{j_0} y_l^{j_0 · m̄} + terms involving lower powers of y_l .  (8.1.55)

Hence, b equals r_{j_0} times a monic polynomial in y_l of degree j_0 · m̄. □
Note that if kis aninnite eld, thenit is possible toachieve the same result as inLemma 8.1.30
by using only a linear transformation of variables [112]. However, the more complicated transfor-
mation is needed in the case of an arbitrary (and possibly nite) eld.
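Lemma 8.1.30 is easy to check numerically. The sketch below (illustrative assumptions of mine, not from the text: l = 2, so the substitution (8.1.53) reads x_1 = y_1 + y_2^m, x_2 = y_2, and polynomials are stored as dicts from exponent pairs to integer coefficients) applies the inverse transformation to a sample polynomial and confirms that the highest power of y_2 in the result carries a constant coefficient, i.e., that b is monic in y_2 up to the scalar r:

```python
def padd(p, q):
    """Add two polynomials stored as {(j1, j2): coeff}."""
    r = dict(p)
    for e, c in q.items():
        r[e] = r.get(e, 0) + c
    return {e: c for e, c in r.items() if c != 0}

def pmul(p, q):
    """Multiply two polynomials in the same dict representation."""
    r = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = (e1[0] + e2[0], e1[1] + e2[1])
            r[e] = r.get(e, 0) + c1 * c2
    return {e: c for e, c in r.items() if c != 0}

def ppow(p, k):
    r = {(0, 0): 1}
    for _ in range(k):
        r = pmul(r, p)
    return r

def nagata(a):
    """Apply x1 = y1 + y2**m, x2 = y2, with m = (total degree of a) + 1."""
    m = max(sum(e) for e in a) + 1
    x1 = {(1, 0): 1, (0, m): 1}    # y1 + y2^m
    x2 = {(0, 1): 1}               # y2
    b = {}
    for (j1, j2), c in a.items():
        term = pmul(ppow(x1, j1), ppow(x2, j2))
        b = padd(b, {e: c * k for e, k in term.items()})
    return b, m

a = {(2, 1): 1, (1, 0): 1, (0, 0): 5}   # x1^2 x2 + x1 + 5, total degree 3, m = 4
b, m = nagata(a)
top = max(e[1] for e in b)              # highest power of y2: j0 . (m, 1) = 2*4 + 1
lead = [(e, c) for e, c in b.items() if e[1] == top]
print(top, lead)                        # 9 [((0, 9), 1)]
```

The leading y_2-power 9 is exactly j_0·(m, 1) = (2, 1)·(4, 1), and its coefficient is the constant 1, as the lemma predicts.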
8.1. COPRIME FACTORIZATIONS OVER A COMMUTATIVE RING

To prove that R = k[x_1, …, x_l] is a Hermite ring, suppose [a_1 ⋯ a_n] is an arbitrary unimodular row in R^{1×n}; it must be shown that this row can be complemented. In order to do this, we can first transform the indeterminates x_1, …, x_l into the set y_1, …, y_l using Lemma 8.1.30, such that one of the b_i's is monic in y_l, where b_i(y_1, …, y_l) = a_i(x_1, …, x_l) ∀i. Since the rings k[x_1, …, x_l] and k[y_1, …, y_l] are clearly isomorphic, it is enough to show that the row [b_1 ⋯ b_n] can be complemented. Thus, the proof of Theorem 8.1.28 is complete once the following lemma is established. Note that this lemma is of independent interest.
Lemma 8.1.31 Suppose B is a Hermite ring, and let R = B[x]. Suppose [a_1 ⋯ a_n] ∈ R^{1×n} is a unimodular row, and that at least one of the a_i's is monic. Then this row can be complemented.
The proof of Lemma 8.1.31 requires two preliminary results.
Lemma 8.1.32 (Suslin) Suppose B is a ring, R = B[x], and suppose f, g ∈ R are of the form

f(x) = x^r + a_1 x^{r-1} + ⋯ + a_r ,
g(x) = b_1 x^{r-1} + ⋯ + b_r .   (8.1.56)

Then, for each j ∈ {1, …, r} the ideal (f, g) contains a polynomial h_j such that δ(h_j) ≤ r - 1 and the leading coefficient of h_j is b_j.
Proof. Define I ⊆ B to be the set of leading coefficients of all h in the ideal (f, g) such that δ(h) ≤ r - 1. Then I is an ideal in B. To see this, suppose p_1, q_1 ∈ I, and let p, q ∈ (f, g) be the corresponding polynomials with leading coefficients p_1, q_1 and degrees less than or equal to r - 1. Assume without loss of generality that δ(p) ≤ δ(q). Then the polynomial

h(x) = q(x) - x^{δ(q)-δ(p)} p(x)   (8.1.57)

has leading coefficient q_1 - p_1 and degree no larger than r - 1. Hence, I is closed under subtraction. Finally, it is clear that if p_1 ∈ I and s ∈ B, then p_1 s ∈ I; just take h(x) = s p(x).

To prove the lemma, it is enough to establish the following statement: If h ∈ (f, g) and δ(h) ≤ r - 1, then all coefficients of h belong to I. This is established by induction, by proving that: for each j ∈ {1, …, r}, if h ∈ (f, g) and δ(h) ≤ r - 1, then the coefficients of x^{r-1}, …, x^{r-j} belong to I. Now the statement is true for j = 1, from the definition of I. Suppose the statement is true up to j - 1, and suppose h is an arbitrary element of the ideal (f, g) with δ(h) ≤ r - 1. Write

h(x) = h_1 x^{r-1} + ⋯ + h_r ,   (8.1.58)

where some of the h_i may be zero. By the inductive hypothesis, h_1, …, h_{j-1} belong to I. Now, xh(x) - h_1 f(x) ∈ (f, g) has degree no larger than r - 1. Moreover,

xh(x) - h_1 f(x) = Σ_{i=1}^{r} (h_{i+1} - h_1 a_i) x^{r-i} ,   (8.1.59)

where h_{r+1} := 0. The (j - 1)-st coefficient of this polynomial is h_j - h_1 a_{j-1}, which belongs to I by the inductive hypothesis. Since h_1 a_{j-1} ∈ I, it follows that h_j ∈ I. This completes the proof of the inductive step. □
The last preliminary result is the same as Lemma 8.1.31, except that the ring B is restricted
to be a local ring.
Lemma 8.1.33 Let B be a local ring, let R = B[x], and suppose [a_1 ⋯ a_n] ∈ R^{1×n} is a unimodular row with at least one of the a_i being monic. Then this row can be complemented.
Proof (Suslin). Assume without loss of generality that n ≥ 3,^6 and that a_1 is monic. The proof is by induction on the degree of a_1. If δ(a_1) = 0, then a_1 = 1 and there is nothing further to prove. If δ(a_1) =: r > 0, we will show that [a_1 ⋯ a_n] can be reduced, through elementary column operations, to a row [b_1 ⋯ b_n] where b_1 is monic and δ(b_1) < r.

Accordingly, suppose r = δ(a_1). Since a_1 is monic, by performing column operations if necessary we can assume that δ(a_i) < r for i = 2, …, n. Let J be the unique maximal ideal of B, and for b ∈ B let b̄ ∈ B/J denote the coset b + J. For f ∈ R = B[x], define f̄ ∈ [B/J][x] in the obvious way, namely, if f = Σ_i f_i x^i, then f̄ = Σ_i f̄_i x^i. Now the row [ā_1 ⋯ ā_n] is unimodular in [B/J][x]. Since δ(ā_1) = δ(a_1) > 0, not all of ā_2, …, ā_n can be zero. By renumbering if necessary, suppose ā_2 ≠ 0. Then some coefficient of a_2 does not belong to J. Suppose

a_2(x) = Σ_{i=1}^{r} b_{2i} x^{r-i} ,   (8.1.60)

and suppose b_{2j} ∉ J. Since B is a local ring, this implies that b_{2j} is a unit. Let us now apply Lemma 8.1.32 with f = a_1, g = a_2, and define I as in Lemma 8.1.32. Then Lemma 8.1.32 implies that b_{2j} ∈ I, i.e., that the ideal (a_1, a_2) contains a monic polynomial of degree less than r. So by elementary column operations, a_3 can be replaced by a monic polynomial of degree less than r. Now apply the inductive hypothesis. □
Proof of Lemma 8.1.31. The lemma is proved by establishing the following fact: There exists a unimodular matrix U ∈ U(R) such that aU = b, where b = [b_1 ⋯ b_n] ∈ B^{1×n} is a unimodular row. In fact, such a transformation is possible (as the following proof will reveal) for any ring B. Since B is Hermite, b can be complemented, whence a can also be complemented.

Suppose f, g ∈ R = B[x]; then f ∘ g ∈ B[x] can be defined in the obvious way, namely, if f = Σ_i f_i x^i, then (f ∘ g)(x) = Σ_i f_i (g(x))^i. Given the row a, define I to be the subset of B consisting of all b ∈ B which have the property that, for each f, g ∈ R, there exists a unimodular matrix U = U(f, g) such that a(f + bg) = a(f)U, where a(f + bg), a(f) are row vectors in R^{1×n} defined by

(a(f + bg))_i = a_i ∘ (f + bg) ,  (a(f))_i = a_i ∘ f .   (8.1.61)

^6 Note that a unimodular row consisting of exactly two elements can always be complemented, in any ring.
Then I is an ideal in B: Clearly 0 ∈ I (take U = I ∀f, g). If b, c ∈ I, then for each f, g ∈ R there exist unimodular matrices U(f, g), V(f, g) such that a(f + bg) = a(f)U, a(f + cg) = a(f)V. Hence, we have a(f + (b + c)g) = a(f + bg + cg) = a(f + bg)V(f + bg, g) = a(f)U(f, g)V(f + bg, g). Since the matrix U(f, g)V(f + bg, g) is unimodular for all f, g ∈ R, it follows that b + c ∈ I. Finally, if b ∈ I and c ∈ B, then for any f, g ∈ R we have a(f + bcg) = a(f)U(f, cg), which shows that bc ∈ I.

If it can be shown that I = B, then the proof is complete. If 1 ∈ I, then for each f, g ∈ R there exists a U(f, g) ∈ U(R) such that a(f + g) = a(f)U. In particular, let f = x, g = -x; then a(0) = a(x)U for some U ∈ U(R). If we now define b = a(0), then the desired statement follows upon observing that a(0) is a unimodular row in B.

To show that I = B, suppose by way of contradiction that I ≠ B. Then I is contained in some maximal ideal of B, say J. Let L denote the localization B_J. Then B ⊆ L, and a is a (monic) unimodular row in L[x]. By Lemma 8.1.33, a can be complemented in L[x]. In other words, there exists a unimodular matrix U over L[x] such that

a(x) = [1 0 ⋯ 0] U(x) .   (8.1.62)

Now adjoin a new indeterminate y to L[x], and define

V(x, y) = [U(x + y)]^{-1} U(x) .   (8.1.63)

Then V is also a unimodular matrix over L[x, y], and V(x, 0) = I_n. From (8.1.62), it follows that

a(x + y) = [1 0 ⋯ 0] U(x + y) ,   (8.1.64)

so that, from (8.1.64) and (8.1.63),

a(x + y) V(x, y) = [1 0 ⋯ 0] U(x) = a(x) .   (8.1.65)

Next, note that V(x, y) is a matrix over L[x, y], so that each element of V(x, y) is a polynomial in x and y with coefficients in L. Moreover, since V(x, 0) = I_n, it follows that each element of V is of the form

V_{ij}(x, y) = c_{ij} + Σ_{k=1}^{r} d_{ijk}(x) y^k ,   (8.1.66)

where c_{ij} = 0 or 1. Loosely speaking, each term in the polynomial expansion of V_{ij} is either a constant 0 or 1, or else involves a positive power of y. Recall that L consists of fractions p/q where p ∈ B, q ∈ B \ J. Let b denote the product of the denominators of all the coefficients occurring in the matrix V(x, y); thus b is the product of the denominators of the coefficients of d_{ijk}(x), for all i, j, k. Then V(x, by) is actually a unimodular matrix over B[x, y], since replacing y by by wipes out all the denominators. Now (8.1.65) implies that

a(x + by) V(x, by) = a(x) .   (8.1.67)

So for any f, g ∈ B[x], (8.1.67) implies that

a(f + bg) V(f, bg) = a(f) .   (8.1.68)

Therefore b ∈ I, by definition. On the other hand, b is the product of elements in B \ J, and hence itself belongs to B \ J. In particular, b ∉ J ⊇ I, so that b ∉ I. This contradiction shows that I = B. □
8.2 COPRIME FACTORIZATIONS FOR DISTRIBUTED SYSTEMS
Suppose R is a commutative domain with identity, and let F denote the associated field of fractions. The previous section was concerned with the question of when a matrix in M(F) has a right- or left-coprime factorization, and when the existence of one implies the existence of the other. It was shown that the notion of a Hermite ring plays an important role in answering the above questions. In particular, if R is a Hermite ring, then a matrix in M(F) has an r.c.f. if and only if it has an l.c.f. However, aside from some general and abstract results, very little was said about the existence of coprime factorizations. The objective of this section is to study this issue in more detail for the case of linear distributed systems, both continuous-time as well as discrete-time. This is done by first proving a general theorem about the existence of coprime factorizations, and then specializing this theorem to the two special cases mentioned above. As a consequence, we shall obtain two very large classes of transfer matrices whose elements all have both r.c.f.'s as well as l.c.f.'s.

The main result of this section is given next.
Theorem 8.2.1 Suppose I is a subset of R satisfying two conditions:

(I1) I is a saturated multiplicative system;
(I2) If a ∈ I and b ∈ R, then the ideal in R generated by a and b is principal.

Under these conditions, every matrix G ∈ M(I^{-1}R) has both an r.c.f. as well as an l.c.f.
Remarks 8.2.2 1) Recall from Section A.2 that a subset I of R is said to be multiplicative if a, b ∈ I implies that ab ∈ I. It is saturated if a ∈ I and b ∈ R divides a together imply that b ∈ I.

2) By induction, one can show that (I2) is equivalent to the following statement: (I2′) If a ∈ I and b_1, …, b_n ∈ R, then the ideal in R generated by a, b_1, …, b_n is principal. Let d be a generator of this principal ideal. Then one can show quite easily that d is a greatest common divisor (g.c.d.) of the set {a, b_1, …, b_n}.

3) Recall from Section A.2 that I^{-1}R denotes the ring consisting of all fractions of the form a/b, where a ∈ R and b ∈ I. One may assume without loss of generality that 1 ∈ I. By identifying a ∈ R with a/1, one can embed R as a subring of I^{-1}R.
The proof of Theorem 8.2.1 is facilitated by some preliminary results.
Lemma 8.2.3 Suppose a ∈ I, b_1, …, b_n ∈ R, and let d be a g.c.d. of the set {a, b_1, …, b_n}. Then there exists a matrix U ∈ U(R) such that

U [a b_1 ⋯ b_n]^T = [d 0 ⋯ 0]^T .   (8.2.1)
Proof. Let c_0 = a/d and c_i = b_i/d for i ≥ 1. Then c_0 ∈ I since I is saturated, and 1 is a g.c.d. of all the c_i's. By Corollary 8.2.4, there exists a U ∈ U(R) such that U [c_0 c_1 ⋯ c_n]^T = [1 0 ⋯ 0]^T; multiplying both sides by d gives (8.2.1). □

Proof of Theorem 8.2.1. Each element of G belongs to I^{-1}R, say g_ij = n_ij/a_ij with n_ij ∈ R and a_ij ∈ I. Define

a = Π_{i,j} a_ij ,  b_ij = g_ij a ,   (8.2.3)

and let B be the matrix whose elements are the b_ij's, and A = aI. Then G = BA^{-1}, and the theorem is proved if it can be shown that there exists a U ∈ U(R) such that (8.2.2) holds. Such a U is constructed
iteratively. As the first step, let d_m be a g.c.d. of the set {a, b_{1m}, …, b_{nm}}. Such a d_m exists, since a ∈ I (by the multiplicativity of the set I). Select a unimodular matrix V_m ∈ R^{(n+1)×(n+1)} such that

V_m [a b_{1m} ⋯ b_{nm}]^T = [d_m 0 ⋯ 0]^T .   (8.2.4)

Such a V_m exists, by Lemma 8.2.5. Now define

U_m = [ I_{m-1}   0
        0         V_m ] .   (8.2.5)

Then routine computations show that

U_m [ A
      B ] = [ aI_{m-1}   e_m
              X          0   ] ,   (8.2.6)

where

e_m = [0 ⋯ 0 d_m]^T ,   (8.2.7)

and X ∈ M(R). It is now possible to repeat the procedure by working on the next-to-last column of the matrix in (8.2.6). After m repetitions of the reduction, one arrives at a matrix of the form shown in (8.2.2). □
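The single-column reduction in (8.2.4) can be made concrete over R = Z, a Bezout domain used here purely as a stand-in (an illustrative assumption; the theorem itself needs only (I1)-(I2)). Each step pairs the top entry with one entry below it, and a 2×2 extended-Euclidean block of determinant 1 maps the pair (a, b) to (gcd(a, b), 0):

```python
def xgcd(a, b):
    """Extended Euclid: returns (d, p, q) with p*a + q*b == d == gcd(a, b)."""
    old_r, r, old_p, p, old_q, q = a, b, 1, 0, 0, 1
    while r:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_p, p = p, old_p - k * p
        old_q, q = q, old_q - k * q
    return old_r, old_p, old_q

def reduce_column(col):
    """Accumulate unimodular row operations U so that U * col = [d, 0, ..., 0],
    a scalar analogue of (8.2.4) over Z."""
    n = len(col)
    U = [[int(i == j) for j in range(n)] for i in range(n)]
    col = list(col)
    for i in range(1, n):
        d, p, q = xgcd(col[0], col[i])
        # The block [[p, q], [-col[i]//d, col[0]//d]] has determinant
        # (p*col[0] + q*col[i])/d = 1 and sends (col[0], col[i]) to (d, 0);
        # apply it to rows 0 and i of U.
        a0, ai = col[0] // d, col[i] // d
        row0 = [p * U[0][j] + q * U[i][j] for j in range(n)]
        rowi = [-ai * U[0][j] + a0 * U[i][j] for j in range(n)]
        U[0], U[i] = row0, rowi
        col[0], col[i] = d, 0
    return U, col

U, reduced = reduce_column([6, 10, 15])
print(reduced)   # [1, 0, 0], since gcd(6, 10, 15) = 1
```

Iterating this over the columns of [A; B], as in (8.2.5)-(8.2.6), is exactly the procedure of the proof.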
As an application of the above theory, consider the ring A described in (8.1.16), which is recalled here for convenience. The ring A consists of all distributions f(·) with support in [0, ∞) of the form

f(t) = Σ_{i=0}^{∞} f_i δ(t - t_i) + f_a(t) ,   (8.2.8)

where 0 ≤ t_0 < t_1 < t_2 < ⋯ are real numbers, the f_i are real, f_a(·) is measurable, and in addition

‖f‖_A = Σ_{i=0}^{∞} |f_i| + ∫_0^∞ |f_a(t)| dt < ∞ .   (8.2.9)

It can be shown that A is precisely the set of impulse responses of BIBO-stable linear time-invariant systems, and is a ring with pointwise addition and convolution as the product. Now, it can be shown [101] that A is not a Bezout domain. Hence, not every element in the field of fractions of A has a coprime factorization. Moreover, it is not known whether A is a Hermite ring or not. Hence, it is of interest to identify a subset B of the fraction field of A such that every matrix in M(B) has both
an r.c.f. and an l.c.f. This is done in [8] and can be described in the current framework as follows: Let

Π_i (z - z_i)^{m_i} .   (8.2.10)

Then it can be shown that the ideal in A generated by f and g is the same as the principal ideal generated by (8.2.10) (see Problem 8.2.2). Hence, (I2) holds, and every matrix whose elements all belong to I^{-1}A has both an r.c.f. and an l.c.f.
PROBLEMS

8.2.1. Show that the set I consisting of the functions in Â_0 that are bounded away from zero at infinity satisfies condition (I1).

8.2.2. Show that the subset I of the disc algebra defined above satisfies conditions (I1) and (I2).
8.3 CONTROLLER SYNTHESIS OVER A GENERAL RING
The objective of this section is to extend the results contained in Chapter 5, and specifically Section 5.2, to the case where the set S is replaced by a general ring R. The main obstacle to doing so is that the plant and compensator cannot be assumed a priori to have right- or left-coprime factorizations. It is shown below that this difficulty can be mostly circumvented.
Suppose R is a commutative domain with identity, and denote its field of fractions by F. In the present set-up, both the plant P and the compensator C are assumed to belong to M(F). The pair (P, C) is stable if

H(P, C) = [ (I + PC)^{-1}     -P(I + CP)^{-1}
            C(I + PC)^{-1}     (I + CP)^{-1}  ]   (8.3.1)

is well-defined and belongs to M(R). The symbol S(P) denotes the set of C ∈ M(F) such that (P, C) is a stable pair. The first objective is to parametrize the set S(P) corresponding to a given plant P. This is made easier by the next result.
Lemma 8.3.1 Suppose P has an r.c.f. (resp. an l.c.f.) and that C ∈ S(P). Then C has an l.c.f. (resp. an r.c.f.).
Proof. Consider the case where P has an r.c.f.; in case P has an l.c.f. the arguments are entirely similar. Suppose P = ND^{-1} where N, D ∈ M(R) are right-coprime, and select X, Y ∈ M(R) such that XN + YD = I. Now

H(P, C) = H(ND^{-1}, C)
        = [ I - N(D + CN)^{-1}C    -N(D + CN)^{-1}
            D(D + CN)^{-1}C         D(D + CN)^{-1} ]   (8.3.2)

belongs to M(R), since C ∈ S(P). Define S, T ∈ M(F) by

S = (D + CN)^{-1} ,  T = (D + CN)^{-1}C .   (8.3.3)

Then,

H(P, C) = [ I - NT   -NS
            DT        DS ] ,   (8.3.4)

which shows that NS, NT, DS, DT ∈ M(R). This in turn implies that

S = X(NS) + Y(DS) ∈ M(R) ;
T = X(NT) + Y(DT) ∈ M(R) .   (8.3.5)

Also, from (8.3.3), C = S^{-1}T. Thus, in order to prove that (S, T) is an l.c.f. of C, it only remains to establish the left-coprimeness of S and T. But this is immediate, since

SD + TN = (D + CN)^{-1}(D + CN) = I .   (8.3.6)

Hence, (S, T) is an l.c.f. of C. □
Corollary 8.3.2 Suppose P has both an r.c.f. and an l.c.f., and that C ∈ S(P). Then C has both an r.c.f. and an l.c.f.
Corollary 8.3.3 Suppose P, C ∈ M(F) and that the pair (P, C) is stable. Then the matrix

G = [ C   0
      0   P ]   (8.3.7)

has both an r.c.f. and an l.c.f.

Remarks 8.3.4 It is not claimed that P and C individually have both an r.c.f. and an l.c.f.; only that G has both.

Proof. Define

F = [ 0   I
      -I  0 ] .   (8.3.8)

Then (P, C) is stable if and only if (G, F) is stable. Since F is a constant matrix, it has both an r.c.f. as well as an l.c.f. The conclusion now follows from Corollary 8.3.2. □
Thus, if one is interested in designing stabilizing compensators for a plant that has both an r.c.f. and an l.c.f., the basic result of Chapter 5, namely Theorem 5.2.1, carries over completely. It is repeated here for emphasis.

Theorem 8.3.5 Suppose P ∈ M(F) has an r.c.f. (N, D) and an l.c.f. (D̃, Ñ). Select X, Y, X̃, Ỹ ∈ M(R) such that XN + YD = I and ÑX̃ + D̃Ỹ = I. Then

S(P) = {(Y - RÑ)^{-1}(X + RD̃) : R ∈ M(R) and |Y - RÑ| ≠ 0}
     = {(X̃ + DR)(Ỹ - NR)^{-1} : R ∈ M(R) and |Ỹ - NR| ≠ 0} .   (8.3.9)
Once a parametrization of S(P) is available, Theorem 5.4.1 concerning the simultaneous stabilizability of two plants P_0 and P_1 carries over to a general ring R.

Theorem 8.3.6 Suppose P_0, P_1 ∈ M(F) have the same dimensions, and that each has both an r.c.f. and an l.c.f. Define

A_0 = X_0 N_1 + Y_0 D_1 ,  B_0 = -Ñ_0 D_1 + D̃_0 N_1 ,   (8.3.10)

where the symbols are as in (5.4.1). Then there exists a C ∈ M(F) that stabilizes both P_0 and P_1 if and only if there exists an M ∈ M(R) such that A_0 + MB_0 ∈ U(R).
Thus, as before, the simultaneous stabilizability question for two plants can be reduced to a strong stabilizability question for a single plant. Unfortunately, this latter question is very difficult to answer, especially for multivariable plants.

The strong stabilizability question for scalar plants is the following: Given coprime elements n, d ∈ R, when does there exist a c ∈ R such that d + cn is a unit of R? The answer to this question depends strongly on the specifics of the ring R. In case R = A, the disc algebra, and n has only a finite number of zeros inside the closed unit disc, the existence or otherwise of a suitable c can be decided using the parity interlacing test, as in Section 2.4. However, if n has a sequence of zeros clustering on the unit circle, the methods of Section 2.4 are no longer applicable. For multivariable plants, the strong stabilization question is: Given a right-coprime pair N, D ∈ M(R), does there exist a C ∈ M(R) such that D + CN ∈ U(R)? If R is a proper Euclidean domain, the results of Section 4.4 can be used to transform this matrix problem into a scalar problem. Specifically, such a C exists if and only if the pair (n, d) can be strongly stabilized, where n is the g.c.d. of all elements of N and d = |D|. However, such a reduction is not possible over a general ring. As a result, the strong stabilization problem for multivariable plants over a general ring is still open.

The other contents of Chapter 5, such as the two-parameter compensation scheme for example, can also be extended to a general ring without undue difficulty. The details are left to the reader.
The results of Chapters 6 and 7 can be extended almost without modification to distributed systems. The reason is that these results are mainly based on the properties of the function spaces H_2 and H_∞, and the rationality of the various functions involved does not play a central role.
This section is concluded with a discussion of the possibility of designing lumped compensators (i.e., those with rational transfer function matrices) for distributed systems. Such compensators are desirable because of ease of implementation. Consider first the discrete-time case, and suppose the ring R is the disc algebra A. Then the polynomials are dense in A. As a consequence, the set of rational functions (i.e., ratios of polynomials) is dense in the fraction field of A in the graph topology. Thus, if c is any element (compensator) in the fraction field of A, then every neighborhood of c contains a rational function. This fact can be exploited to design lumped stabilizing compensators, as follows: First, use Theorem 8.3.5 together with any other mathematical apparatus to design a good, or even optimal, compensator C in the form N_c D_c^{-1} or D̃_c^{-1} Ñ_c, where the matrices N_c, D_c or Ñ_c, D̃_c have their elements in the ring A. By approximating the elements of these matrices by polynomials or rational H

(a/b) · (c/d) = ac/bd .   (A.2)
Note that, if b, d ∈ M, then so does bd. Hence, the right sides of (A.1) and (A.2) are valid fractions. Actually, (A.1) and (A.2) represent operations on equivalence classes; but the reader can verify that the sum and product of two fractions (i.e., two equivalence classes) do not depend on which representatives of the equivalence classes are used.

With addition and multiplication defined by (A.1) and (A.2), L is a ring. Moreover, if every a ∈ R is identified with the fraction a/1, then R is isomorphic to a subring of L. The element 1/1 serves as an identity for L. Finally, if d ∈ M, then d corresponds to d/1 ∈ L; moreover, d/1 is a unit in L with the inverse 1/d. Thus, every element of (the isomorphic image of) M is a unit in L. The ring L is called the ring of fractions of R with respect to M, and is denoted by M^{-1}R.

Since M is a subset of N (the set of absolute nondivisors of zero), the ring N^{-1}R contains the largest number of units. If in particular R is a domain, then N = R \ 0, i.e., N consists of all nonzero elements of R. In this case, the ring F = (R \ 0)^{-1}R has the property that every nonzero element of R is a unit in F. Moreover, every nonzero element of F is also a unit: If a/b ∈ F and a ≠ 0, then b/a is the inverse of a/b. Hence, F is a field; it is referred to as the field of fractions or quotient field associated with the domain R.

There is a particularly important class of fraction rings that is frequently encountered. Suppose R is a ring, and that I is a prime ideal in R. If M denotes the complement of I, then M is a multiplicative system (see Problem A.1.4). The corresponding fraction ring M^{-1}R is called the localization of R with respect to I.

Examples A.2.3 The set of integers Z is a domain. The field of fractions associated with the integers is the set of rational numbers.

In the ring Z, let M denote the set of numbers that are not divisible by 3. Then M is a multiplicative system (since 3 is a prime number), and 1 ∈ M. The ring of fractions M^{-1}Z is the set of rational numbers whose denominators (when expressed in reduced form) are not divisible by 3. This is a subring of the field of rational numbers.
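The localization in Example A.2.3 is easy to experiment with, since membership depends only on the reduced denominator. A minimal sketch (the helper name `in_localization` is mine, chosen for illustration):

```python
from fractions import Fraction

def in_localization(q, p=3):
    """Membership test for M^{-1}Z with M = {integers not divisible by p}:
    rationals whose denominator, in reduced form, is not divisible by p."""
    q = Fraction(q)
    return q.denominator % p != 0

a, b = Fraction(5, 7), Fraction(2, 11)                 # both lie in M^{-1}Z
print(in_localization(a * b), in_localization(a + b))  # closed under * and +: True True
print(in_localization(Fraction(1, 3)))                 # 1/3 is excluded: False
# The units of M^{-1}Z are the fractions whose numerator also lies in M:
u = Fraction(7, 5)
print(in_localization(u), in_localization(1 / u))      # True True
```

Note that `Fraction` reduces automatically, which is exactly the "expressed in reduced form" proviso in the example.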
A. ALGEBRAIC PRELIMINARIES

PROBLEMS

A.2.1. Suppose R is a ring, a, b, c ∈ R, and a is an absolute nondivisor of zero. Show that if ab = ac, then b = c.

A.2.2. Show that the addition and multiplication rules (A.1) and (A.2) are unambiguous, in the sense that the final answers do not depend on which representatives of the equivalence classes are used in the computation.

A.2.3. Consider the ring Z/(10), consisting of the integers 0, …, 9 with addition and multiplication defined modulo 10.
(i) Determine the units of this ring.
(ii) Show that the set N of absolute nondivisors of zero consists of just the units.
(iii) Using (ii), show that the ring of fractions with numerators in Z/(10) and denominators in N is again just Z/(10).

A.2.4. Suppose R is a ring and that I is a prime ideal in R. Let M denote the complement of I. Show that M is a multiplicative system.
A.3 PRINCIPAL IDEAL DOMAINS

Throughout this section, "ring" means a commutative ring with identity.

Definition A.3.1 A ring R is said to be a principal ideal ring if every ideal in R is principal. R is a principal ideal domain (p.i.d.) if it is a domain as well as a principal ideal ring.

Recall that a principal ideal I consists of all multiples of some element a, i.e., I = {xa : x ∈ R}. Thus, in a principal ideal ring, every ideal is generated by a single element.

If x and y are elements of a ring R with x ≠ 0, we say that x divides y, and y is a multiple of x, if there is an element z ∈ R such that y = xz; this is denoted by x | y. If x and y are elements of a ring R such that not both are zero, a greatest common divisor (GCD) of x, y is any element d ∈ R such that

(GCD1) d | x and d | y .
(GCD2) c | x, c | y ⟹ c | d .

In the above definition, it is implicit that d ≠ 0, since d divides a nonzero element by virtue of (GCD1). We say "a" greatest common divisor because a GCD is not unique. Certainly, if d is a GCD of x and y, then so is -d. Actually, the following stronger result holds:
Fact A.3.2 Suppose d is a GCD of x and y; then so is du whenever u is a unit. Suppose in addition that R is a domain; then every GCD d_1 of x and y is of the form d_1 = du for some unit u.

Proof. The first sentence is obvious. To prove the second sentence, observe that if d_1 is another GCD of x and y, then by (GCD2) d_1 | d and d | d_1. Hence, d_1 = du for some u. Moreover, du | d since d_1 | d. Since R is a domain and d ≠ 0, this implies that u | 1, i.e., that u is a unit. □
Fact A.3.2 states that if R is a domain, then once we have found one GCD of a given pair of elements, we can quickly find them all. However, a question that is not answered by Fact A.3.2 is: Does every pair of elements have a GCD? The answer to this question is provided next.

Theorem A.3.3 Suppose R is a principal ideal ring. Then every pair of elements x, y ∈ R, not both of which are zero, has a GCD d which can be expressed in the form

d = px + qy   (A.1)

for appropriate elements p, q ∈ R. Moreover, if R is a domain, then every GCD of x and y can be expressed in the form (A.1).

Proof. Given x, y ∈ R, consider the set

I = {ax + by : a, b ∈ R} .   (A.2)

In other words, I is the set of all linear combinations of x and y. It is easy to verify that I is an ideal in R.^1 Since R is a principal ideal ring, I must be a principal ideal; that is, there exists an element d ∈ I such that I is the set of all multiples of d. Since x = 1·x + 0·y ∈ I, it follows that x is a multiple of d, i.e., d | x; similarly d | y. Thus, d satisfies the axiom (GCD1). Next, since d ∈ I and I is the ideal generated by x, y, there exist p, q ∈ R such that (A.1) holds. Now suppose c | x, c | y. Then c | (px + qy), i.e., c | d. Hence, d also satisfies (GCD2) and is thus a GCD of x and y.

Up to now, we have shown that every pair of elements x and y has a GCD which can be written in the form (A.1) for a suitable choice of p and q. Now, if in addition R is a domain, then every GCD d_1 of x, y is of the form d_1 = du for some unit u, by Fact A.3.2. Thus, every GCD d_1 can be written in the form d_1 = du = upx + uqy. □

Two elements x, y ∈ R are relatively prime, or simply coprime, if every GCD of x, y is a unit. In view of Fact A.3.2, if R is a domain, this is equivalent to saying that x and y are coprime if and only if 1 is a GCD of x and y. Now suppose R is a principal ideal domain. Then, by Theorem A.3.3, x, y ∈ R are coprime if and only if the ideal I in (A.2) is the same as the ideal generated by 1. But the latter is clearly the entire ring R. This can be summarized as follows:

Fact A.3.4 Let R be a principal ideal domain. Then x, y ∈ R are coprime if and only if there exist p, q ∈ R such that px + qy = 1.
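For R = Z, the elements p and q of Theorem A.3.3, and hence the Bezout certificate of coprimeness in Fact A.3.4, can be computed constructively with the extended Euclidean algorithm. A minimal sketch (Z only, for illustration):

```python
def xgcd(x, y):
    """Extended Euclidean algorithm: returns (d, p, q) with p*x + q*y == d,
    a GCD of x and y expressed in the form (A.1)."""
    old_r, r = x, y
    old_p, p = 1, 0
    old_q, q = 0, 1
    while r != 0:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_p, p = p, old_p - k * p
        old_q, q = q, old_q - k * q
    return old_r, old_p, old_q

d, p, q = xgcd(240, 46)
print(d, p * 240 + q * 46)   # d = 2 and the identity d = p*x + q*y holds
print(xgcd(9, 16)[0])        # 1: 9 and 16 are coprime, as in Fact A.3.4
```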
One can also define a GCD of an n-tuple of elements (x_1, …, x_n), not all of which are zero. An element d ∈ R is a GCD of the collection (x_1, …, x_n) if

(i) d | x_i ∀i ,   (A.3)
(ii) c | x_i ∀i ⟹ c | d .   (A.4)

^1 I is referred to as the ideal generated by x and y.
As in Fact A.3.2, it follows that a GCD of a given n-tuple is unique to within a unit provided the ring is a domain. Moreover, Theorem A.3.3 can be generalized as follows:

Theorem A.3.5 Suppose R is a principal ideal ring, and suppose x_1, …, x_n ∈ R, with at least one element not equal to zero. Then the n-tuple (x_1, …, x_n) has a GCD d which can be expressed in the form

d = Σ_{i=1}^{n} p_i x_i   (A.5)

for appropriate elements p_1, …, p_n ∈ R. Moreover, if R is a domain, then every GCD of this n-tuple can be expressed in the form (A.5).
The proof is entirely analogous to that of Theorem A.3.3 and is left as an exercise.

In the field of rational numbers, every ratio a/b of integers has an equivalent reduced form f/g where f and g are coprime. The next result shows that such a statement is also true in the field of fractions associated with any principal ideal domain.

Fact A.3.6 Let R be a principal ideal domain, and let F be the field of fractions associated with R. Given any fraction a/b in F, there exists an equivalent fraction f/g such that f and g are coprime.

Proof. If a and b are already coprime, then there is nothing to be done. Otherwise, let d be a GCD of a, b, and define f = a/d, g = b/d. Then clearly, f/g = a/b, and it only remains to show that f and g are coprime. From Theorem A.3.3, there exist p, q ∈ R such that

d = pa + qb = pf d + qgd .   (A.6)

Cancelling d from both sides of (A.6) gives 1 = pf + qg, which shows that f, g are coprime. □
Thus, the procedure for reducing a fraction is the natural one: namely, we extract a greatest
common divisor from the numerator and the denominator.
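The reduction procedure of Fact A.3.6 can be carried out directly over R = Z (a sketch for illustration; the helper name `reduce_fraction` is mine):

```python
from math import gcd

def reduce_fraction(a, b):
    """Reduce a/b over R = Z exactly as in the proof of Fact A.3.6:
    divide the numerator and the denominator by a GCD d of the pair."""
    d = gcd(a, b)
    return a // d, b // d

f, g = reduce_fraction(84, 126)
print(f, g, gcd(f, g))   # 2 3 1: f/g is the reduced form, and f, g are coprime
```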
Two elements a, b ∈ R are associates (denoted by a ∼ b) if there is a unit u such that a = bu. One can readily verify that ∼ is an equivalence relation on R. A nonunit, nonzero element p ∈ R is a prime if the only divisors of p are either units or associates of p. An equivalent definition is the following: p is a prime if p = ab, a, b ∈ R implies that either a or b is a unit. In the ring of integers, the primes are precisely the prime numbers.
A useful property of principal ideal domains is stated next, without proof (see [116]).

Fact A.3.7 Every nonunit, nonzero element of a principal ideal domain can be expressed as a product of primes. Moreover, this factorization is unique in the following sense: If

x = Π_{i=1}^{n} p_i = Π_{i=1}^{m} q_i ,   (A.7)

where the p_i, q_i are all primes, then n = m, and the q_i's can be renumbered such that p_i ∼ q_i ∀i.

Thus, the only nonuniqueness in the prime factorization of an element arises from the possibility of multiplying some of the prime factors by a unit (for example, 6 = 2·3 = (-2)·(-3)).

Using prime factorizations, one can give a simple expression for a GCD of a set of elements. The proof is left as an exercise (see Problem A.3.7).
Fact A.3.8 Suppose x_1, …, x_n is a set of elements such that none is a unit nor zero. Express each element x_i in terms of its prime factors in the form

x_i = Π_j p_j^{α_ij} ,   (A.8)

where the p_j's are distinct (i.e., nonassociate) primes, α_ij ≥ 0, and α_ij = 0 if p_j is not a divisor of x_i. Then a GCD d of (x_1, …, x_n) is given by

d = Π_j p_j^{β_j} ,   (A.9)

where

β_j = min_i α_ij .   (A.10)
Corollary A.3.9 Two elements x, y are coprime if and only if their prime divisors are distinct.
Up to now, we have talked about the greatest common divisor of a set of elements. A parallel concept is that of the least common multiple. Suppose (x_1, …, x_n) is a set of elements, none of which is zero. We say that y is a least common multiple of this set of elements if

(LCM1) x_i | y ∀i ,
(LCM2) x_i | z ∀i implies that y | z .

Using prime factorizations, one can give a simple expression for a l.c.m. that parallels Fact A.3.8.

Fact A.3.10 Suppose (x_1, …, x_n) is a collection of nonzero, nonunit elements. Express each element x_i in terms of its prime factors as in (A.8). Then a l.c.m. of this collection of elements is given by

y = Π_j p_j^{γ_j} ,   (A.11)

where

γ_j = max_i α_ij .   (A.12)

The proof is left as an exercise (see Problem A.3.8).
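Facts A.3.8 and A.3.10 translate directly into code over R = Z: factor each element as in (A.8), then take the componentwise minimum of exponents for the GCD and the maximum for the l.c.m. A sketch (trial-division factorization, chosen only for brevity):

```python
from collections import Counter

def factorize(n):
    """Prime factorization of |n| as a Counter {prime: exponent}, as in (A.8)."""
    n = abs(n)
    c = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            c[p] += 1
            n //= p
        p += 1
    if n > 1:
        c[n] += 1
    return c

def gcd_lcm(*xs):
    """GCD via min of exponents, (A.9)-(A.10); l.c.m. via max, (A.11)-(A.12)."""
    facs = [factorize(x) for x in xs]
    primes = set().union(*facs)
    g = l = 1
    for p in primes:
        exps = [f[p] for f in facs]   # a missing prime contributes exponent 0
        g *= p ** min(exps)
        l *= p ** max(exps)
    return g, l

print(gcd_lcm(8, 24, 42))   # (2, 168): 8 = 2^3, 24 = 2^3*3, 42 = 2*3*7
```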
PROBLEMS
A.3.1. Prove that two elements x, y in a p.i.d. R are coprime if and only if there exist p, q R
such that px +qy is a unit.
A.3.2. Let R be a domain, and let x, y denote a GCD of x, y which is unique to within a unit
factor. Show that if u, v are units, then x, y = ux, vy (in other words, every GCD of
x, y is also a GCD of ux, vy and vice versa).
A.3.3. Let R be a p.i.d., and let x, y, z R. Show that x, y, z = x, y, z. More generally,
showthat if x
1
, , x
n
R, thenx
1
, , x
n
= x
1
, , x
m
, x
m+1
, , x
n
for 1
m < n.
A.3.4. Let R be a p.i.d., and let x
1
, , x
n
R. Let d be a GCD of this set of elements and
select p
1
, , p
n
R such that
p
i
x
i
= d. Show that 1 is a GCD of the set of elements
p
1
, , p
n
.
A.3.5. Suppose x, y are coprime and z divides y, in a p.i.d. R. Show that x and z are coprime.
A.3.6. Suppose u is a unit and x
1
, , x
n
are nonzero elements. Show that u, x
1
, , x
n
= 1.
A.3.7. Prove Fact A.3.8.
A.3.8. Prove Fact A.3.10.
A.3.9. (i) Find the prime factorizations of 8, 24, 42 in the ring of integers.
(ii) Find a GCD of the above three numbers using Fact A.3.8.
(iii) Find a l.c.m. of the above three numbers using Fact A.3.10.
A.3.10. Show that the following three statements are equivalent:
(i) x divides y.
(ii) x is a GCD of x, y.
(iii) y is a l.c.m. of x, y.
A.3.11. Suppose R is a p.i.d., that x, y, z ∈ R, that x and y are coprime, and that x divides yz. Show that x divides z. (Hint: Use Corollary A.3.9.)
A.4 EUCLIDEAN DOMAINS
In this section, we study a special type of ring that finds a lot of application in this book. Throughout this section, domain means a commutative domain with identity.

Definition A.4.1 A domain R is a Euclidean domain if there is a degree function δ : R \ {0} → Z_+ satisfying the following axioms:²

(ED1) For every x, y ∈ R with y ≠ 0, there exists a q ∈ R such that either r := x − qy is zero, or else δ(r) < δ(y).

(ED2) If x | y then δ(x) ≤ δ(y).

One can think of q as a quotient, and r as a remainder, after dividing x by y. The axiom (ED1) states that we can always get a remainder that is either zero or else has a smaller degree than the divisor y. We speak of a quotient and a remainder because q and r are not necessarily unique unless additional conditions are imposed on the degree function δ(·).

The axiom (ED2) implies that δ(1) ≤ δ(x) for all x ≠ 0, since 1 divides every nonzero element. Hence, it can be assumed without loss of generality that δ(1) = 0. The same axiom also implies that if x and y are associates, then they have the same degree (because if x and y are associates, then x | y and y | x). In particular, δ(u) = 0 whenever u is a unit.

Fact A.4.2 Let R be a Euclidean domain with degree function δ(·), and suppose
\[ \delta(x + y) \le \max\{\delta(x), \delta(y)\} , \tag{A.1} \]
\[ \delta(xy) = \delta(x) + \delta(y) . \tag{A.2} \]
Then, for every x, y ∈ R with y ≠ 0, there exists a unique q ∈ R such that δ(x − yq) < δ(y), where the degree of zero is taken as −∞.

Proof. By (ED1), there exists at least one such q. Now suppose δ(x − ay) < δ(y) and δ(x − by) < δ(y), and define r = x − ay, s = x − by. Then x = ay + r = by + s. Rearranging gives (a − b)y = s − r. If a ≠ b, then δ((a − b)y) = δ(a − b) + δ(y) ≥ δ(y), by (A.2). On the other hand, δ(s − r) ≤ max{δ(r), δ(s)} < δ(y). This contradiction shows that a = b, and hence also r = s. □

Definition A.4.3 A Euclidean domain R with degree function δ(·) is called a proper Euclidean domain if R is not a field and δ(·) satisfies (A.2).³

² Note that Z_+ denotes the set of nonnegative integers.
³ This is slightly different from the definition in [65, p. 30].
Note that, in a proper Euclidean domain, the division process might still produce nonunique
quotients and remainders, because (A.1) is not assumed to hold. This is the case, for example, in the
ring of proper stable rational functions, which are studied in Chapter 2.
Fact A.4.4 Every Euclidean domain is a principal ideal domain.
Proof. Let R be a Euclidean domain, and let I be an ideal in R. If I = {0}, then I is principal with 0 as the generator. So suppose I contains some nonzero elements, and let x be an element of I such that δ(x) is a minimum over all nonzero elements of I. We claim that I is the ideal generated by x and is hence principal. To prove this, let y ∈ I be chosen arbitrarily; it is shown that x divides y. By axiom (ED1), there exists a q ∈ R such that either r := y − qx is zero or else δ(r) < δ(x). If r ≠ 0, then δ(r) < δ(x) contradicts the manner in which x was chosen. Hence, r = 0, i.e., x divides y. □
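In Z, the proof of Fact A.4.4 is effectively constructive: the division axiom (ED1), applied repeatedly, is the Euclidean algorithm, and tracking the quotients expresses the resulting GCD as an element px + qy of the ideal generated by x and y (cf. Theorem A.3.3). A sketch with illustrative names:

```python
def extended_euclid(x, y):
    """Return (d, p, q) with d = p*x + q*y and d a GCD of x and y,
    using only the division step guaranteed by axiom (ED1)."""
    # Invariants: r0 = p0*x + q0*y and r1 = p1*x + q1*y throughout.
    r0, p0, q0 = x, 1, 0
    r1, p1, q1 = y, 0, 1
    while r1 != 0:
        quot = r0 // r1                      # the q of axiom (ED1)
        r0, r1 = r1, r0 - quot * r1          # the remainder has smaller degree
        p0, p1 = p1, p0 - quot * p1
        q0, q1 = q1, q0 - quot * q1
    return r0, p0, q0

d, p, q = extended_euclid(24, 42)
assert d == 6 and p * 24 + q * 42 == 6
```

The same loop works verbatim in any Euclidean domain once `//` is replaced by that domain's (ED1) division step.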
We now present a very important example of a Euclidean domain, namely the ring of polynomials in one indeterminate with coefficients in a field. To lead up to this example, an abstract definition of a polynomial is given.
Let R be a ring. Then a polynomial over R is an infinite sequence {a_0, a_1, …} such that only finitely many terms are nonzero. The sum and product of two polynomials a = {a_i} and b = {b_i} are defined by
\[ (a + b)_i = a_i + b_i , \tag{A.3} \]
\[ (ab)_i = \sum_{j=0}^{i} a_{i-j} b_j = \sum_{j=0}^{i} a_j b_{i-j} . \tag{A.4} \]
For notational convenience, a polynomial a = {a_i} can be represented by a_0 + a_1 s + a_2 s² + ⋯, where s is called the indeterminate. The highest value of the index i such that a_i ≠ 0 is called the degree of the polynomial a = {a_0, a_1, …}.⁴ Thus, if a is a polynomial of degree m, we can write
\[ a(s) = a_0 + a_1 s + \cdots + a_m s^m = \sum_{i=0}^{m} a_i s^i . \tag{A.5} \]
The set of polynomials over R is denoted by R[s], and is a commutative ring with identity. Moreover, if R is a domain, so is R[s].

Fact A.4.5 Suppose F is a field. Then F[s] is a Euclidean domain if the degree of a polynomial in F[s] is defined as above.

⁴ The degree of the zero polynomial is taken as −∞.
Proof. To prove axiom (ED1), suppose
\[ f(s) = \sum_{i=0}^{n} f_i s^i , \quad g(s) = \sum_{i=0}^{m} g_i s^i , \quad g_m \ne 0 . \tag{A.6} \]
It is necessary to show the existence of a q ∈ F[s] such that δ(f − gq) < δ(g), where δ(0) = −∞. If n < m, take q = 0. If n ≥ m, define q_1(s) = (f_n/g_m) s^{n−m}; then δ(f − gq_1) ≤ n − 1. By repeating this process if necessary on the polynomial f − gq_1, we can ultimately find a q ∈ F[s] such that δ(f − gq) < δ(g). Thus, (ED1) is satisfied. The proof of (ED2) is straightforward. □
Since the degree function δ(·) satisfies both (A.1) and (A.2), the Euclidean division process yields a unique remainder and quotient r, q corresponding to each pair f, g with g ≠ 0. Moreover, it is clear that F[s] is not a field (the polynomial s has no inverse). Hence, F[s] is a proper Euclidean domain.
The field of fractions associated with F[s] is denoted by F(s), and is called the set of rational functions over F. Note that every element of F(s) is a ratio of two polynomials (hence the name "rational function").
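The proof of Fact A.4.5 is just polynomial long division: cancel the leading term with q_1(s) = (f_n/g_m)s^{n−m} and repeat. A sketch over F = Q, storing a polynomial as its coefficient list [a_0, …, a_m] as in (A.5) (the function name and conventions are illustrative):

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Divide f by g in F[s] with F the rationals; return (q, r) with deg r < deg g.
    Polynomials are coefficient lists [a_0, a_1, ...] with nonzero last entry."""
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    while len(f) >= len(g) and any(f):
        # q_1(s) = (f_n / g_m) s^(n-m), as in the proof of Fact A.4.5
        shift = len(f) - len(g)
        coeff = f[-1] / g[-1]
        q[shift] += coeff
        for i, gc in enumerate(g):
            f[shift + i] -= coeff * gc
        while f and f[-1] == 0:     # drop the cancelled leading term
            f.pop()
    return q, f

# (s^3 + 1) divided by (s - 1): quotient s^2 + s + 1, remainder 2
q, r = poly_divmod([1, 0, 0, 1], [-1, 1])
```

Because δ satisfies (A.1) and (A.2) here, Fact A.4.2 guarantees that the q and r this loop returns are the unique ones.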
PROBLEMS
A.4.1. Show that the set of integers Z is a Euclidean domain if we define the degree of an integer
to be its absolute value. Does the division process result in unique remainders?
A.4.2. Consider the ring R[s], consisting of polynomials with real coefficients. What are the
primes of this ring?
A.4.3. Suppose R is a proper Euclidean domain and x ∈ R. Show that if δ(x) = 1, then x is a prime. Is the converse true? (Hint: See Problem A.4.2.)
A.4.4. Let R be a Euclidean domain. Show that x ∈ R is a unit if and only if δ(x) = 0. (Hint: Use axiom (ED1).)
APPENDIX B
Preliminaries on Matrix Rings
The objective of this appendix is to gather some well-known facts on matrices whose elements belong
to a ring or a field, and to state them in as much generality as possible.
B.1 MATRICES AND DETERMINANTS
Let R be a ring, and let R^{n×n} denote the set of square matrices of order n × n whose elements belong to R. If the sum and product of two matrices in R^{n×n} are defined in the familiar way, namely
\[ (A + B)_{ij} = a_{ij} + b_{ij} , \tag{B.1} \]
\[ (AB)_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} , \tag{B.2} \]
then R^{n×n} becomes a ring, usually referred to as a ring of matrices over R. If R contains an identity and n ≥ 2, then R^{n×n} is not commutative. For instance, if n = 2, we have
\[ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \ne \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} . \tag{B.3} \]
Similar examples can be constructed if n > 2. Also, if n ≥ 2, then R^{n×n} is not a domain, because
\[ \mathrm{Diag}\,\{1, 0, \ldots, 0\} \cdot \mathrm{Diag}\,\{0, 1, 0, \ldots, 0\} = 0_{n \times n} . \tag{B.4} \]
The determinant of a matrix A ∈ R^{n×n} is denoted by |A| and is defined in the familiar way, namely
\[ |A| = \sum_{\sigma \in \Sigma_n} (\operatorname{sign} \sigma) \prod_{i=1}^{n} a_{i\sigma(i)} , \tag{B.5} \]
where Σ_n denotes the set of permutations of the set N = {1, …, n} into itself, and sign σ = ±1 depending on whether σ is an even or an odd permutation.¹ Most of the usual properties of determinants hold in the present abstract setting. The required results are stated without proof, and the reader is referred to [65] for further details.
Define a function Φ : R^n × ⋯ × R^n → R as follows: For every v_1, …, v_n ∈ R^n, define Φ(v_1, …, v_n) to be the determinant of the matrix V ∈ R^{n×n} whose columns are v_1, …, v_n, in that order. Thus, Φ is just the determinant function viewed as a function of the columns of a matrix. Similarly, define Φ′(v_1, …, v_n) to be the determinant of the matrix V′ ∈ R^{n×n} whose rows are v_1, …, v_n, in that order.

¹ Throughout this section, the symbol 1 is used to denote both the integer as well as the identity element of the ring R. It is usually clear from the context which is meant.
Fact B.1.1 We have
\[ \Phi(v_1, \ldots, v_n) = \Phi'(v_1, \ldots, v_n) \quad \forall\, v_1, \ldots, v_n \in R^n . \tag{B.6} \]
The function Φ (and hence Φ′) is alternating and multilinear. That is, if two arguments of Φ are interchanged, then Φ changes sign; if two arguments of Φ are equal, then Φ equals zero; finally,
\[ \Phi(v_1 + w_1, v_2, \ldots, v_n) = \Phi(v_1, v_2, \ldots, v_n) + \Phi(w_1, v_2, \ldots, v_n) \tag{B.7} \]
for all possible choices of the arguments.
Fact B.1.2 Let A ∈ R^{n×n}, n ≥ 2. Then, for any i, j ∈ {1, …, n},
\[ |A| = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \, m_{ij}(A) , \tag{B.8} \]
\[ |A| = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} \, m_{ij}(A) , \tag{B.9} \]
where m_{ij}(A) is the ij-th minor of A, defined as the determinant of the (n−1) × (n−1) matrix obtained from A by deleting its i-th row and j-th column.²
Fact B.1.3 Let A ∈ R^{n×n}, n ≥ 2. Then
\[ \sum_{j=1}^{n} a_{ij} (-1)^{k+j} m_{kj}(A) = \begin{cases} |A| & \text{if } i = k \\ 0 & \text{if } i \ne k \end{cases} , \tag{B.10} \]
\[ \sum_{i=1}^{n} a_{ik} (-1)^{i+j} m_{ij}(A) = \begin{cases} |A| & \text{if } j = k \\ 0 & \text{if } j \ne k \end{cases} . \tag{B.11} \]
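The expansion (B.8) gives a recursive determinant that is valid over any commutative ring, and (B.10) can then be checked directly: a row paired with the cofactors of a different row sums to zero. A small sketch over Z (illustrative code, not from the text):

```python
def minor(A, i, j):
    """Delete row i and column j (0-based) from the square matrix A."""
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row, as in (B.8)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

A = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
n = len(A)
# (B.10): sum_j a_ij (-1)^(k+j) m_kj(A) equals |A| when i = k, and 0 otherwise
for i in range(n):
    for k in range(n):
        s = sum(A[i][j] * (-1) ** (k + j) * det(minor(A, k, j)) for j in range(n))
        assert s == (det(A) if i == k else 0)
```

The "alien cofactor" case i ≠ k is what makes the adjoint identity (B.19) below work.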
Now a bit of notation is introduced to make subsequent theorem statements more compact. Suppose m and n are positive integers, with m ≤ n. Then S(m, n) denotes the collection of all strictly increasing m-tuples (i_1, …, i_m), where 1 ≤ i_1 < i_2 < ⋯ < i_m ≤ n. For example,
\[ S(3, 5) = \{(1,2,3), (1,2,4), (1,2,5), (1,3,4), (1,3,5), (1,4,5), (2,3,4), (2,3,5), (2,4,5), (3,4,5)\} . \tag{B.12} \]

² To be consistent with subsequent notation, one should write a_{N\I, N\J} instead of m_{ij}(A). But the latter notation is more convenient.
If m = n, then S(m, n) is the singleton set {(1, 2, …, n)}, while if m = 0 then S(m, n) is just the empty set.

Suppose A ∈ R^{m×n}, and let I ∈ S(l, m), J ∈ S(l, n). Then a_{IJ} denotes the l × l minor of A consisting of the rows from I and the columns from J. In particular, if I and J are singleton sets of the form I = {i}, J = {j}, then a_{IJ} is just the element a_{ij}. Note that a_{IJ} ∈ R. We use A_{IJ} to denote the l × l matrix consisting of the elements from the rows in I and the columns in J. Thus, a_{IJ} = |A_{IJ}|. In some situations it is of interest to examine a submatrix of A consisting of the rows in I and all columns of A; such a submatrix is denoted by A_{I·}. The notation A_{·J} is similarly defined.
Fact B.1.4 (Laplace's Expansion of a Determinant) Suppose A ∈ R^{n×n}, and suppose I ∈ S(m, n). Then,
\[ |A| = \sum_{J \in S(m,n)} (-1)^{v(I,J)} \, a_{IJ} \, a_{N \setminus I,\, N \setminus J} , \tag{B.13} \]
where
\[ v(I, J) = \sum_{i \in I} i + \sum_{j \in J} j . \tag{B.14} \]
Equation (B.13) generalizes (B.8). The corresponding generalization of (B.9) is similar and is left to the reader.
Fact B.1.5 (Binet–Cauchy Formula) Suppose A ∈ R^{n×m}, B ∈ R^{m×l}, and let C = AB ∈ R^{n×l}. Let I ∈ S(p, n), J ∈ S(p, l). Then,
\[ c_{IJ} = \sum_{K \in S(p,m)} a_{IK} \, b_{KJ} . \tag{B.15} \]
In particular, if n = m = l, then |C| = |A| · |B|.

Note that (B.15) is a natural generalization of (B.2).
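Identity (B.15) is easy to test numerically: every p × p minor of C = AB is a sum, over selections K, of products of minors of A and B. A sketch over Z (the helper names are illustrative):

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion, valid over any commutative ring."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j+1:] for r in M[1:]])
               for j in range(len(M)))

def sub(M, rows, cols):
    return [[M[r][c] for c in cols] for r in rows]

A = [[1, 2, 0], [0, 1, 3]]                       # 2 x 3
B = [[1, 0], [2, 1], [0, 4]]                     # 3 x 2
C = [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

p, I, J = 2, (0, 1), (0, 1)
lhs = det(sub(C, I, J))
rhs = sum(det(sub(A, I, K)) * det(sub(B, K, J)) for K in combinations(range(3), p))
assert lhs == rhs == 61                          # the Binet-Cauchy identity (B.15)
```

With p = n = m = l the sum has a single term K = N, which recovers |AB| = |A| · |B|.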
Using the multilinearity of the determinant function, one can obtain an expression for the determinant of the sum of two matrices. For example, if A, B ∈ R^{2×2} and a_1, a_2, b_1, b_2 are the columns of the two matrices, then
\[ |A + B| = \Phi(a_1 + b_1, a_2 + b_2) = \Phi(a_1, a_2) + \Phi(a_1, b_2) + \Phi(b_1, a_2) + \Phi(b_1, b_2) . \tag{B.16} \]
If A, B ∈ R^{n×n}, then the formula for |A + B| will involve the sum of 2^n terms. In case one of the matrices is diagonal, each term in this expansion can be neatly expressed as a product of principal minors of A and B.
Fact B.1.6 Suppose A, B ∈ R^{n×n} and that A is diagonal (i.e., a_{ij} = 0 for i ≠ j). Then,
\[ |A + B| = \sum_{l=0}^{n} \sum_{I \in S(l,n)} a_{II} \, b_{N \setminus I,\, N \setminus I} , \tag{B.17} \]
where a_{II} is interpreted as 1 when I is empty.
If A ∈ R^{n×n}, its adjoint matrix, denoted by A^{adj}, is defined by
\[ (A^{\mathrm{adj}})_{ij} = (-1)^{i+j} m_{ji}(A) . \tag{B.18} \]
In view of (B.10) and (B.11), it is seen that, for any A ∈ R^{n×n},
\[ A A^{\mathrm{adj}} = A^{\mathrm{adj}} A = |A| \, I_n , \tag{B.19} \]
where I_n denotes the n × n identity matrix.

A matrix A ∈ R^{n×n} is unimodular if it has an inverse in R^{n×n}, i.e., if it is a unit in the ring R^{n×n}.
Fact B.1.7 A ∈ R^{n×n} is unimodular if and only if |A| is a unit in R.

Proof. "if" Suppose |A| is a unit in R, and let b = |A|^{−1} ∈ R. Then bA^{adj} ∈ R^{n×n} and A(bA^{adj}) = (bA^{adj})A = I_n. Hence, A is unimodular.

"only if" Suppose A is unimodular, and let B ∈ R^{n×n} be the inverse of A. Then 1 = |I_n| = |A| · |B|, which shows that |A| is a unit in R. □
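Over R = Z, Fact B.1.7 says that an integer matrix has an integer inverse exactly when its determinant is ±1, the units of Z. A sketch that builds A^{adj} from (B.18) and confirms (B.19) (illustrative code):

```python
def det(M):
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j+1:] for r in M[1:]])
               for j in range(len(M)))

def adjoint(A):
    """(A_adj)_ij = (-1)^(i+j) m_ji(A), as in (B.18)."""
    n = len(A)
    def m(i, j):  # the ij-th minor of A
        return det([r[:j] + r[j+1:] for k, r in enumerate(A) if k != i])
    return [[(-1) ** (i + j) * m(j, i) for j in range(n)] for i in range(n)]

A = [[2, 1], [1, 1]]                   # det = 1, so A is unimodular over Z
adj = adjoint(A)
prod = [[sum(A[i][k] * adj[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
assert prod == [[1, 0], [0, 1]]        # A * A_adj = |A| * I_n with |A| = 1
```

When |A| is a unit, b = |A|^{−1} times the adjoint is exactly the inverse constructed in the proof.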
Now consider the set F^{n×n} of matrices with elements in a field F. Since a field is also a ring, all of the preceding discussion applies. In addition, since every nonzero element of F is a unit, we see that every A ∈ F^{n×n} such that |A| ≠ 0 has an inverse in F^{n×n}. It is customary to call a matrix A nonsingular if |A| ≠ 0. From Fact B.1.3, we see that if A is nonsingular, then A^{−1} is given by
\[ (A^{-1})_{ij} = (-1)^{i+j} m_{ji}(A)/|A| . \tag{B.20} \]
The relation (B.20) is a special case of the following result, which gives the relationship between the minors of A and A^{−1}. The proof can be found in [41, pp. 21–22].
Fact B.1.8 Suppose A ∈ F^{n×n} is nonsingular and let B = A^{−1}. Let N = {1, …, n} and suppose I, J ∈ S(l, n) for some l. Then,
\[ b_{IJ} = (-1)^{v(N \setminus I,\, N \setminus J)} \, a_{N \setminus J,\, N \setminus I} \, |A|^{-1} , \tag{B.21} \]
where the function v is defined in (B.14).

If I and J are singleton sets, then (B.21) reduces to (B.20).
Suppose F ∈ F^{r×s}, and suppose F can be partitioned as
\[ F = \begin{bmatrix} A & B \\ C & D \end{bmatrix} , \tag{B.22} \]
where A is a nonsingular matrix of order n × n. Then the matrix
\[ G := D - C A^{-1} B \in \mathbf{F}^{(r-n) \times (s-n)} \tag{B.23} \]
is called the Schur complement of F with respect to A, and is sometimes denoted by F/A. The next result relates the minors of G to those of F.
Fact B.1.9 Suppose J ∈ S(t, m), K ∈ S(t, l). Then
\[ g_{JK} = |A|^{-1} f_{N \cup (\{n\}+J),\; N \cup (\{n\}+K)} , \tag{B.24} \]
where {n} + J denotes the set sum (thus if J = {j_1, …, j_t}, then {n} + J = {n + j_1, …, n + j_t}).

Proof. Observe that
\[ \begin{bmatrix} I & 0 \\ -C A^{-1} & I \end{bmatrix} F = \begin{bmatrix} A & B \\ 0 & G \end{bmatrix} =: E . \tag{B.25} \]
Now suppose P ∈ S(n + t, n + m), Q ∈ S(n + t, n + l) and that N is a subset of both P and Q. Consider the minor e_{PQ}. The rows in P of E are obtained by adding multiples of the first n rows of F to the rows in P of F. Thus, e_{PQ} = f_{PQ}, since a minor is unchanged by adding multiples of some rows to others. Observe that the matrix E_{PQ} is block-triangular, so that e_{PQ} = |A| \, e_{P \setminus N,\, Q \setminus N}. The identity (B.24) now follows by choosing P = N ∪ ({n} + J), Q = N ∪ ({n} + K). □
Fact B.1.10 Suppose A ∈ F^{n×n}, B ∈ F^{m×n}, |A| ≠ 0, and let G = BA^{−1}. Suppose J ∈ S(l, m), K ∈ S(l, n). Then |A| g_{JK} equals the determinant of the n × n matrix obtained from A by replacing the rows in K of A by the rows in J of B.
Remarks B.1.11 A simple example helps to illustrate the statement of the result. Suppose A ∈ F^{5×5}, B ∈ F^{3×5}, and let a_i, b_i denote the i-th rows of A and B, respectively. Each of these is a 1 × 5 row vector. Suppose J = (2, 3), K = (3, 5). Then the above result states that
\[ |A| \; g_{(2,3),(3,5)} = \begin{vmatrix} a_1 \\ a_2 \\ b_2 \\ a_4 \\ b_3 \end{vmatrix} . \tag{B.26} \]
Note that if B is a row vector, then the above fact reduces to Cramer's rule for solving linear equations.
Proof. Define
\[ F = \begin{bmatrix} A & I \\ B & 0 \end{bmatrix} . \tag{B.27} \]
Then G is the Schur complement of F with respect to A. By Fact B.1.9,
\[ |A| \; g_{JK} = f_{N \cup (\{n\}+J),\; N \cup (\{n\}+K)} = \begin{vmatrix} A & I_{\cdot K} \\ B_{J \cdot} & 0 \end{vmatrix} . \tag{B.28} \]
Note that each of the last l columns of the minor consists of all zeros except for a single 1. If we expand this minor about the last l columns using Laplace's expansion (Fact B.1.4), then the expansion consists of a single term, since the submatrix consisting of the last l columns has only one nonzero l × l minor. Thus, if K = {k_1, …, k_l}, then
\[ |A| \; g_{JK} = (-1)^{v} (-1)^{l} \begin{vmatrix} A_{(N \setminus K) \cdot} \\ B_{J \cdot} \end{vmatrix} , \tag{B.29} \]
where
\[ v = \sum_{i=1}^{l} (n + i) + \sum_{i=1}^{l} k_i . \tag{B.30} \]
Now the minor on the right side of (B.29) is not quite in the form stated in Fact B.1.10, since the rows of B occur below those of A, rather than substituting for them (as in Example (B.26), for instance). It is a matter of detail to verify that the parity factor (−1)^{v+l} accounts for this difference. □
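For singleton J = {j}, K = {k}, Fact B.1.10 can be checked exactly over the rationals: the entry g_{jk} of G = BA^{−1}, times |A|, equals the determinant of A with its k-th row replaced by the j-th row of B. A sketch using Fraction for exact arithmetic (illustrative code):

```python
from fractions import Fraction

def det(M):
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j+1:] for r in M[1:]])
               for j in range(len(M)))

def inverse(A):
    """A^{-1} over the rationals, via the cofactor formula (B.20)."""
    n, d = len(A), Fraction(det(A))
    def m(i, j):
        return det([r[:j] + r[j+1:] for k, r in enumerate(A) if k != i])
    return [[(-1) ** (i + j) * Fraction(m(j, i)) / d for j in range(n)]
            for i in range(n)]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]
B = [[1, 4, 2], [0, 1, 5]]
Ainv = inverse(A)
G = [[sum(Fraction(B[i][k]) * Ainv[k][j] for k in range(3)) for j in range(3)]
     for i in range(2)]
# Fact B.1.10 with J = {j}, K = {k}: |A| g_jk = det(A with row k replaced by row j of B)
for j in range(2):
    for k in range(3):
        repl = [B[j] if r == k else A[r] for r in range(3)]
        assert det(A) * G[j][k] == det(repl)
```

Taking B to be a single row recovers Cramer's rule, as noted in Remarks B.1.11.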
Fact B.1.12 Suppose A ∈ F^{n×n}, B ∈ F^{n×m}, |A| ≠ 0, and let G = A^{−1}B. Suppose J ∈ S(l, n), K ∈ S(l, m). Then |A| g_{JK} equals the determinant of the n × n matrix obtained from A by replacing the columns in J of A by the columns in K of B.

The proof is similar to that of Fact B.1.10 and is left to the reader.
B.2 CANONICAL FORMS
Let R be a principal ideal domain, and let F be the field of fractions associated with R. In this section, we study the sets R^{n×m} and F^{n×m}, consisting of n × m matrices whose elements belong to R and F, respectively. We prove the existence of two canonical forms, namely the Smith form on R^{n×m} and the Smith–McMillan form on F^{n×m}.

A matrix A ∈ R^{n×m} is a left associate of B ∈ R^{n×m} (denoted by A =_L B) if there is a unimodular matrix U ∈ R^{n×n} such that A = UB. A is a right associate of B (denoted by A =_R B) if there is a unimodular matrix V ∈ R^{m×m} such that A = BV. A is equivalent to B (denoted by A ∼ B) if there are unimodular matrices U ∈ R^{n×n}, V ∈ R^{m×m} such that A = UBV. It is left to the reader to verify that =_L, =_R, and ∼ are all equivalence relations.

We now commence our study of canonical forms.
Lemma B.2.1 Suppose a_1, …, a_n ∈ R, and let d_n be a GCD of this set of elements. Then there exists a matrix P_n ∈ R^{n×n} whose first row is [a_1 ⋯ a_n] and whose determinant is d_n.

Proof. The proof is by induction on n. If n = 2, by Theorem A.3.3 there exist p_1, p_2 ∈ R such that p_1 a_1 + p_2 a_2 = d_2. Now let
\[ P_2 = \begin{bmatrix} a_1 & a_2 \\ -p_2 & p_1 \end{bmatrix} . \tag{B.1} \]
For larger values of n, let d_{n−1} be a GCD of a_1, …, a_{n−1}. By the inductive hypothesis, we can construct a matrix P_{n−1} ∈ R^{(n−1)×(n−1)} whose first row is [a_1 ⋯ a_{n−1}] and whose determinant is d_{n−1}. By Fact B.1.3,
\[ \sum_{j=1}^{n-1} a_j (-1)^{1+j} m_{1j}(P_{n-1}) = d_{n-1} , \tag{B.2} \]
so that
\[ \sum_{j=1}^{n-1} \frac{a_j}{d_{n-1}} (-1)^{1+j} m_{1j}(P_{n-1}) = 1 . \tag{B.3} \]
For convenience, let z_j = a_j/d_{n−1}. Next, by Problem A.3.3, d_n is a GCD of d_{n−1} and a_n. By Theorem A.3.3, there exist x, y ∈ R such that x d_{n−1} + y a_n = d_n. Now define
\[ P_n = \left[ \begin{array}{c|c} P_{n-1} & \begin{matrix} a_n \\ 0 \\ \vdots \\ 0 \end{matrix} \\ \hline \begin{matrix} -y z_1 & \cdots & -y z_{n-1} \end{matrix} & x \end{array} \right] . \tag{B.4} \]
Expanding |P_n| about the last column and using (B.3) gives |P_n| = x d_{n−1} + y a_n = d_n. □
Theorem B.2.2 (Hermite Form) Every A ∈ R^{n×n} is a left associate of a matrix B that is lower triangular (i.e., b_{ij} = 0 for j > i).
Proof. Let d_n be a GCD of {a_{1n}, …, a_{nn}}, i.e., of the elements of the last column of A. By Theorem A.3.5, there exist elements p_1, …, p_n ∈ R such that
\[ \sum_{i=1}^{n} p_i a_{in} = d_n . \tag{B.5} \]
By Problem A.3.4, 1 is a GCD of the set p_1, …, p_n. Hence, by a slight variation of Lemma B.2.1, there exists a unimodular matrix U ∈ R^{n×n} whose last row is [p_1 ⋯ p_n]. Now (UA)_{nn} = d_n. Also, for i = 1, …, n−1, (UA)_{in} belongs to the ideal generated by a_{1n}, …, a_{nn} and is thus a multiple of d_n. Let z_i = (UA)_{in}/d_n for i = 1, …, n−1, and define
\[ U_n = \begin{bmatrix} 1 & 0 & \cdots & 0 & -z_1 \\ 0 & 1 & \cdots & 0 & -z_2 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -z_{n-1} \\ 0 & 0 & \cdots & 0 & 1 \end{bmatrix} . \tag{B.6} \]
Then U_n is unimodular since its determinant is 1. Moreover, U_n U A is of the form
\[ U_n U A = \left[ \begin{array}{c|c} A_{n-1} & \begin{matrix} 0 \\ \vdots \\ 0 \end{matrix} \\ \hline a_{n-1} & d_n \end{array} \right] , \tag{B.7} \]
where A_{n−1} ∈ R^{(n−1)×(n−1)} and a_{n−1} is a row vector. In other words, the last column of A has been reduced to zero above the diagonal by means of left multiplication by an appropriate unimodular matrix. One can now repeat the procedure with the (n−1) × (n−1) matrix A_{n−1}, and eventually arrive at a lower triangular matrix. For clarity, the next step of the algorithm is briefly outlined: Let d_{n−1} denote a GCD of the elements of the last column of A_{n−1}. As above, there exists a unimodular matrix Ū_{n−1} ∈ R^{(n−1)×(n−1)} such that
\[ \bar{U}_{n-1} A_{n-1} = \left[ \begin{array}{c|c} A_{n-2} & \begin{matrix} 0 \\ \vdots \\ 0 \end{matrix} \\ \hline a_{n-2} & d_{n-1} \end{array} \right] . \tag{B.8} \]
Now define the n × n unimodular matrix
\[ U_{n-1} = \mathrm{Block\ Diag}\,\{\bar{U}_{n-1}, 1\} . \tag{B.9} \]
Then,
\[ U_{n-1} U_n U A = \left[ \begin{array}{cc|c} A_{n-2} & 0 & 0 \\ a_{n-2} & d_{n-1} & 0 \\ \hline \multicolumn{2}{c|}{a_{n-1}} & d_n \end{array} \right] . \tag{B.10} \]
The rest of the proof is now obvious. □
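Over Z, the proof of Theorem B.2.2 is an algorithm: working from the last column backwards, clear each column above the diagonal by left-unimodular row operations. The sketch below uses pairwise Bezout steps in place of the full Lemma B.2.1 matrix — an equivalent way of reaching the GCD of a column — and is illustrative only:

```python
def ext_gcd(a, b):
    """Return (d, x, y) with x*a + y*b = d = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    d, x, y = ext_gcd(b, a % b)
    return d, y, x - (a // b) * y

def hermite_lower(A):
    """Reduce an integer matrix to lower-triangular form by left-unimodular
    row operations, mirroring the proof of Theorem B.2.2."""
    A = [row[:] for row in A]
    n = len(A)
    for j in range(n - 1, -1, -1):          # clear column j above the diagonal
        for i in range(j):
            a, b = A[j][j], A[i][j]
            if b == 0:
                continue
            d, x, y = ext_gcd(a, b)
            rj, ri = A[j][:], A[i][:]
            # the 2x2 block [[x, y], [-b/d, a/d]] has determinant 1, so it is unimodular
            A[j] = [x * u + y * v for u, v in zip(rj, ri)]           # pivot becomes gcd
            A[i] = [-(b // d) * u + (a // d) * v for u, v in zip(rj, ri)]  # entry zeroed
    return A

B = hermite_lower([[4, 7, 2], [3, 5, 1], [2, 2, 6]])
assert all(B[i][j] == 0 for i in range(3) for j in range(i + 1, 3))
```

Since every 2×2 block used has determinant 1, the determinant of the matrix is preserved exactly, and it ends up as the product of the diagonal entries of the result.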
Corollary B.2.3 Every matrix A ∈ R^{n×n} is a left associate of an upper triangular matrix C ∈ R^{n×n}.
Proof. In the proof of Theorem B.2.2, start with the first column of A instead of the last. □
Corollary B.2.4 Every matrix A ∈ R^{n×n} is a right associate of a lower (resp. upper) triangular matrix.
Proof. In the proof of Theorem B.2.2, start with the last (resp. first) row of A. □
For rectangular matrices, the following result holds:
Corollary B.2.5 Suppose A ∈ R^{n×m}. Then A is a left associate of a matrix of the form
\[ \begin{bmatrix} D \\ 0 \end{bmatrix} \ \text{if } n > m , \qquad [\, D \;\; E \,] \ \text{if } n < m , \tag{B.11} \]
where D can be chosen to be either lower or upper triangular.

It is left to the reader to state and prove the result analogous to Corollary B.2.5 concerning right associates of rectangular matrices.
Next, we start developing the Smith form. Recall that a matrix A ∈ R^{n×m} is said to have rank l if (i) there is an l × l submatrix of A with nonzero determinant, and (ii) every (l+1) × (l+1) minor of A is zero. An elementary row operation on the matrix A consists of one of the following: (i) interchanging two rows of A, or (ii) adding a multiple of one row to another. An elementary column operation is similarly defined. It is easy to see that an elementary row (column) operation on a matrix A can be accomplished by multiplying A on the left (right) by a unimodular matrix. Thus, a matrix obtained from A by elementary row and column operations is equivalent to A.

Now suppose A ∈ R^{n×m} is a (left or right) multiple of B ∈ R^{n×m}, and let b_l denote a GCD of all l × l minors of B. Then it follows from the Binet–Cauchy formula (Fact B.1.5) that b_l divides all l × l minors of A. Thus, if a_l denotes a GCD of all l × l minors of A, we see that b_l divides a_l.
From this, it follows that if A and B are equivalent matrices, then a_l and b_l are associates, and A and B have the same rank.
Theorem B.2.6 (Smith Form) Suppose A ∈ R^{n×m} has rank l. Then A is equivalent to a matrix H ∈ R^{n×m} of the form
\[ H = \begin{bmatrix} \mathrm{Diag}\,\{h_1, \ldots, h_l\} & 0 \\ 0 & 0 \end{bmatrix} , \tag{B.12} \]
where h_i divides h_{i+1} for i = 1, …, l−1. Moreover, h_1 ⋯ h_i is a GCD of all i × i minors of A, and the h_i are unique to within multiplication by a unit.
Remarks B.2.7 h_1, …, h_l are called the invariant factors of A.
Proof. Since A has rank l, it contains an l × l submatrix with nonzero determinant. By elementary row and column operations, this submatrix can be brought to the upper left-hand corner of A.

As in the proof of Theorem B.2.2, there is a unimodular matrix U ∈ R^{n×n} such that (UA)_{11} is a GCD of all elements in the first column of A. By elementary row operations, the first column of UA can be made to contain all zeros except in the (1,1)-position. Call the resulting matrix Ā. If ā_{11} divides all elements of the first row of Ā, then all elements of the first row (except in the (1,1) position) of ĀV can be made to equal zero by a suitable choice of a unimodular matrix V of the form
\[ V = \begin{bmatrix} 1 & v_{12} & v_{13} & \cdots & v_{1m} \\ & 1 & 0 & \cdots & 0 \\ & & 1 & \cdots & 0 \\ & & & \ddots & \vdots \\ & & & & 1 \end{bmatrix} . \tag{B.13} \]
Moreover, the first column of ĀV will continue to have zeros except in the (1,1) position. On the other hand, if ā_{11} does not divide all elements in the first row of Ā, we can choose a unimodular matrix V so that (ĀV)_{11} is a GCD of the elements of the first row of Ā. If this is done, the first column of ĀV may no longer contain zeros. In such a case, we repeat the above row and column operations. This process cannot continue indefinitely, because the original element a_{11} has only a finite number of prime factors (see Fact A.3.7) and each successive corner element is a proper (i.e., nonassociate) divisor of its predecessor. Thus, in a finite number of steps, we arrive at a matrix B equivalent to A such that b_{11} divides all elements of the first row as well as the first column of B. By elementary row and column operations, all of the elements of the first row and column of B, other than b_{11}, can be made equal to zero. Thus,
\[ A \sim B \sim \begin{bmatrix} b_{11} & 0 \\ 0 & B_1 \end{bmatrix} . \tag{B.14} \]
By proceeding to the first row and column of B_1 (i.e., the second row and column of B) and then repeating the procedure, we will eventually have
\[ A \sim \begin{bmatrix} \mathrm{Diag}\,\{d_1, \ldots, d_l\} & 0 \\ 0 & M \end{bmatrix} . \tag{B.15} \]
Now, d_i ≠ 0 for all i, since these are all divisors of the elements of the first l rows and columns of A, and none of these rows nor columns is identically zero. Next, M must be zero; otherwise A is equivalent to a matrix of rank at least l + 1, which contradicts the fact that the rank of A is l.

Thus far, we have shown that
\[ A \sim \begin{bmatrix} \mathrm{Diag}\,\{d_1, \ldots, d_l\} & 0 \\ 0 & 0 \end{bmatrix} . \tag{B.16} \]
Let D = Diag{d_1, …, d_l}. We will show that D ∼ Diag{h_1, …, h_l}, where h_i divides h_{i+1} for i = 1, …, l−1. This is enough to prove the theorem. By adding columns 2 through l to the first column of D, it follows that
\[ D \sim \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ d_2 & d_2 & & 0 \\ \vdots & & \ddots & \\ d_l & 0 & & d_l \end{bmatrix} . \tag{B.17} \]
By multiplying the latter matrix on the left by an appropriate unimodular matrix, we get another matrix E whose (1,1)-element is a GCD of {d_1, …, d_l} and whose first column is zero otherwise. Let h_1 := a GCD of {d_1, …, d_l}. Then,
\[ E = \begin{bmatrix} h_1 & \\ 0 & \\ \vdots & \bar{E} \\ 0 & \end{bmatrix} , \tag{B.18} \]
where Ē denotes the last l − 1 columns of E.
Every element of Ē is in the ideal generated by d_2, …, d_l, and is thus a multiple of h_1. Since the first row of Ē is a multiple of h_1, it follows that
\[ E \sim \begin{bmatrix} h_1 & 0 \\ 0 & E_1 \end{bmatrix} , \tag{B.19} \]
where every element of E_1 is a multiple of h_1. Now, by the preceding paragraph, E_1 ∼ Diag{g_2, …, g_l}, where h_1 divides g_i for all i. Now let h_2 be a GCD of g_2, …, g_l. Then clearly h_1 divides h_2. Moreover,
\[ E_1 \sim \begin{bmatrix} h_2 & 0 \\ 0 & E_2 \end{bmatrix} . \tag{B.20} \]
Repeating this procedure, we finally get D ∼ Diag{h_1, …, h_l}, where h_i divides h_{i+1} for all i. This completes the proof of (B.12).

The divisibility conditions on the h_i imply that the product h_1 ⋯ h_i is a GCD of all i × i minors of H. Since A and H are equivalent, this product is also a GCD of all i × i minors of A.
Now suppose A is also equivalent to another matrix
\[ G = \begin{bmatrix} \mathrm{Diag}\,\{g_1, \ldots, g_l\} & 0 \\ 0 & 0 \end{bmatrix} , \tag{B.21} \]
where g_i divides g_{i+1} for all i. Then the reasoning of the preceding paragraph shows that the product h_1 ⋯ h_i is an associate of the product g_1 ⋯ g_i for all i. Now, if h_1 and g_1 are associates, and h_1 h_2 and g_1 g_2 are associates, then h_2 and g_2 are associates. Reasoning inductively in this fashion, we conclude that h_i and g_i are associates for all i. □
Corollary B.2.8 Two matrices A, B ∈ R^{n×m} are equivalent if and only if their invariant factors are associates.
Let F be the field of fractions associated with R. We now begin a study of the set F^{n×m} of matrices with elements from F and show the existence of a canonical form known as the Smith–McMillan form.
Theorem B.2.9 (Smith–McMillan Form) Suppose F ∈ F^{n×m} has rank l. Then there exist unimodular matrices U ∈ R^{n×n}, V ∈ R^{m×m} such that
\[ U F V = \begin{bmatrix} \mathrm{Diag}\,\{a_1/b_1, \ldots, a_l/b_l\} & 0 \\ 0 & 0 \end{bmatrix} , \tag{B.22} \]
where a_i, b_i are coprime for all i; a_i divides a_{i+1} and b_{i+1} divides b_i for i = 1, …, l−1; and b_1 is a least common multiple of the denominators of all elements of F, expressed in reduced form.³
Proof. Let y denote an l.c.m. of the denominators of all elements of F, expressed in reduced form. Then yF ∈ R^{n×m}, and the rank of yF is also l. By Theorem B.2.6, there exist unimodular matrices U ∈ R^{n×n}, V ∈ R^{m×m} such that
\[ U (yF) V = \begin{bmatrix} \mathrm{Diag}\,\{h_1, \ldots, h_l\} & 0 \\ 0 & 0 \end{bmatrix} , \tag{B.23} \]
where h_i divides h_{i+1} for all i. Hence,
\[ U F V = \begin{bmatrix} \mathrm{Diag}\,\{h_1/y, \ldots, h_l/y\} & 0 \\ 0 & 0 \end{bmatrix} . \tag{B.24} \]
Let a_i/b_i be a reduced form for the fraction h_i/y, for i = 1, …, l. Since h_i | h_{i+1}, let h_{i+1} = h_i r_i, where r_i ∈ R. Let [·] denote the reduced form of a fraction. Then
\[ \frac{a_{i+1}}{b_{i+1}} = \left[ \frac{h_{i+1}}{y} \right] = \left[ \frac{h_i r_i}{y} \right] = \left[ \frac{a_i r_i}{b_i} \right] = a_i \left[ \frac{r_i}{b_i} \right] , \tag{B.25} \]
where in the last step we used the fact that a_i, b_i are coprime. Now (B.25) implies that a_i | a_{i+1} and b_{i+1} | b_i for all i.

³ See Fact A.3.6 for the definition of a reduced form.
Finally, to show that b_1 is an associate of y, it is enough to show that h_1 and y are coprime, since a_1/b_1 is a reduced form of h_1/y. This is most easily done using prime factorizations. Let p_{ij}/q_{ij} be a reduced form of the element f_{ij}, for all i, j. Then y is an l.c.m. of all the q_{ij}, and h_1 is a GCD of all y f_{ij} = y p_{ij}/q_{ij} for all i, j. Suppose t is a prime factor of y, of multiplicity α. (By this we mean that t is a prime, t^α divides y but t^{α+1} does not.) Then, from Fact A.3.10, t must be a prime divisor, of multiplicity α, of some q_{ij}. The corresponding p_{ij} is not divisible by t, since p_{ij}, q_{ij} are coprime (see Corollary A.3.9). As a result, the corresponding term y f_{ij} = y p_{ij}/q_{ij} is also not divisible by t (since t is a factor of multiplicity α of both y p_{ij} and q_{ij}). Thus, h_1, being a GCD of all the y f_{ij}, is also not divisible by t (see Fact A.3.8). Since this is true of every prime factor of y, it follows from Corollary A.3.9 that y and h_1 are coprime. □
APPENDIX C
Topological Preliminaries
In this appendix, a few basic concepts from topology are introduced. For greater detail, the reader is
referred to [56] or [89].
C.1 TOPOLOGICAL SPACES
This section contains a brief introduction to topological spaces.
Definition C.1.1 Let S be a set. A collection T of subsets of S is a topology if¹

(TOP1) Both S and ∅ (the empty set) belong to T.

(TOP2) A finite intersection of sets in T again belongs to T.

(TOP3) An arbitrary union of sets in T again belongs to T.

The ordered pair (S, T) is called a topological space, and subsets of S belonging to T are said to be open. A subset of S is closed if its complement in S is open.
Examples C.1.2 Let S be any set, and let T_1 consist of just the two sets S and ∅. Then (S, T_1) is a topological space. T_1 is referred to as the trivial topology on S. Let T_2 consist of all subsets of S. Then (S, T_2) is also a topological space. T_2 is referred to as the discrete topology on S.

Suppose S is a set, and T_1, T_2 are topologies on S. Then T_1 is weaker than T_2 (and T_2 is stronger than T_1) if T_1 is a subset of T_2, i.e., every set that is open in the topology T_1 is also open in the topology T_2. It is obvious that, for any set S, the trivial topology and the discrete topology are, respectively, the weakest and strongest topologies that can be defined on S.
To give interesting and useful examples of topological spaces, the notion of a base is introduced. To motivate this notion, recall the familiar definition of an open subset of the real line: A subset U of R is open if and only if, corresponding to every x ∈ U, there is a number ε > 0 such that the interval (x − ε, x + ε) is also contained in U. The concept of a base is an abstraction of this idea, whereby intervals of the form (x − ε, x + ε) are replaced by more general sets satisfying appropriate axioms.

Let S be a set, and let B be a collection of subsets of S that satisfies two axioms:

(B1) The sets in B cover S (i.e., the union of the sets in B is S).

(B2) Whenever B_1, B_2 are sets in B with a nonempty intersection and x ∈ B_1 ∩ B_2, there exists a B(x) in B such that x ∈ B(x) ⊆ (B_1 ∩ B_2).

¹ Throughout this section, upper case italic letters denote collections of sets, while bold face letters denote sets.
Using this collection B, another collection T of subsets of S is defined as follows: A subset U of S belongs to T if and only if, for every x ∈ U, there is a set B(x) ∈ B such that x ∈ B(x) ⊆ U.
Fact C.1.3 A subset U of S belongs to the collection T if and only if it is a union of sets in B. T
is a topology on S. Moreover, T is the weakest topology on S containing all sets in B.
Remarks C.1.4 B is referred to as a base for the topology T , and T is the topology generated by the
base B.
Example C.1.5 To clarify Fact C.1.3, consider the set R^n, consisting of n-tuples of real numbers. Let B denote the collection of balls B(x, ε) of the form
\[ B(x, \varepsilon) = \{ y \in R^n : \|x - y\| < \varepsilon \} \tag{C.1} \]
as x varies over R^n and ε varies over the positive numbers. Here ‖·‖ denotes the usual Euclidean (or any other) norm on R^n. It is a straightforward matter to verify that this collection of balls satisfies axioms (B1) and (B2). Hence, this collection forms a base for a topology on R^n, in which a set U ⊆ R^n is open if and only if, for every x ∈ U, there is a ball B(x, ε) ⊆ U. This coincides with the usual definition of open sets on R^n. The one extra bit of information that comes out of Fact C.1.3 is that a set is open if and only if it is a union of balls.
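On a finite set, the construction underlying Fact C.1.3 can be carried out exhaustively: verify (B1) and (B2) for a candidate base, form all unions of base sets, and check the topology axioms for the result. A sketch with an arbitrary illustrative base:

```python
from itertools import combinations, chain

S = frozenset({1, 2, 3})
base = [frozenset({1}), frozenset({2}), frozenset({1, 2}), frozenset({2, 3})]

# (B1): the sets in the base cover S
assert frozenset().union(*base) == S
# (B2): each point of a nonempty intersection lies in a base set inside it
for B1, B2 in combinations(base, 2):
    for x in B1 & B2:
        assert any(x in B and B <= (B1 & B2) for B in base)

# T = all unions of subcollections of the base (the empty union gives the empty set)
T = {frozenset(chain.from_iterable(sub))
     for r in range(len(base) + 1) for sub in combinations(base, r)}

assert frozenset() in T and S in T                      # (TOP1)
assert all(U & V in T for U in T for V in T)            # (TOP2), pairwise
assert all(U | V in T for U in T for V in T)            # (TOP3), pairwise
```

The resulting T here has six members: ∅, {1}, {2}, {1, 2}, {2, 3}, and S itself, matching the first sentence of Fact C.1.3 that open sets are exactly unions of base sets.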
Proof of Fact C.1.3. To prove the first sentence, suppose first that U is a union of sets in B; it is shown that U belongs to the collection T. Specifically, suppose U = ∪_{i∈I} B_i, where I is an index set and B_i is in the collection B for all i ∈ I. Let x be an arbitrary element of U. Then x ∈ B_i for some i ∈ I, and for this i we have x ∈ B_i ⊆ U. Hence, by the definition of T, U is in T. Conversely, suppose U is in T. Then for every x ∈ U there is a B(x) in B such that x ∈ B(x) ⊆ U. It is now claimed that U = ∪_{x∈U} B(x), which would show that U is a union of sets from B. To prove the claim (and thereby complete the proof of the first sentence), note that U ⊇ ∪_{x∈U} B(x) since each B(x) ⊆ U; conversely, since every x ∈ U also belongs to the corresponding B(x), it follows that U ⊆ ∪_{x∈U} B(x). Hence, the two sets are equal.
The second sentence is proved by verifying that T satisfies the three axioms (TOP1)–(TOP3).
First, the empty set ∅ belongs to T, since ∅ vacuously satisfies the defining condition for a set to belong to T.
Next, since S is the union of all sets in B, it is also in T. Hence, T satisfies (TOP1).
To establish (TOP2), it is enough to show that a nonempty intersection of two sets in T again
belongs to T; it will then follow by induction that every finite intersection of sets in T again belongs to
T. Accordingly, suppose U, V are in T and let x be any element of U ∩ V; we will show the existence
of a B(x) in B such that x ∈ B(x) ⊆ (U ∩ V), which in turn will establish that the intersection
U ∩ V is open. Since x ∈ U and x ∈ V, there exist B₁(x), B₂(x) in B such that x ∈ B₁(x) ⊆ U, x ∈
B₂(x) ⊆ V. Now, by axiom (B2), there is a B(x) in B such that x ∈ B(x) ⊆ (B₁(x) ∩ B₂(x)), which
in turn is contained in U ∩ V. Thus, T satisfies (TOP2).
To establish (TOP3), suppose {Uᵢ, i ∈ I} is a collection of sets in T. Each Uᵢ is a union of sets
belonging to B, whence their union ∪_{i∈I} Uᵢ is also a union of sets belonging to B. Thus, ∪_{i∈I} Uᵢ
is also in T.
To prove the last sentence, let T₁ be another topology on S such that every set in B is contained
in T₁. Since T₁ is a topology, arbitrary unions of sets in B again belong to T₁. Since every set in T
can be expressed as a union of sets in B, it follows that every set in T is in T₁, i.e., T is weaker than
T₁. □
A topological space (S, T) is first-countable if T has the following property: For every x ∈ S,
there is a countable collection of open sets {Bᵢ(x), i ∈ Z₊}, each containing x, such that every open
set containing x also contains some Bᵢ(x). In view of Fact C.1.3, the collection {Bᵢ(x) : x ∈ S, i ∈ Z₊} is a
base for the topology T. Since the set S may be uncountable, the base for the topology may also
be uncountable. However, in a first-countable topology, the collection of open sets containing each
particular point has a countable base.
For example, consider the set Rⁿ together with the topology of Example C.1.5. For a fixed
x ∈ Rⁿ, the collection of balls B(x, 1/m), m an integer ≥ 1, is countable; moreover, every open
set containing x also contains at least one of the balls B(x, 1/m). Hence, the topological space of
Example C.1.5 is first-countable.
A very general class of first-countable topological spaces is that of metric spaces, which are
defined next.
Definition C.1.6 A metric space (S, ρ) is a set S, together with a function ρ : S × S → R satisfying
the following axioms:

(M1) ρ(y, x) = ρ(x, y) ∀ x, y ∈ S.
(M2) ρ(x, y) ≥ 0 ∀ x, y ∈ S; ρ(x, y) = 0 ⟺ x = y.
(M3) ρ(x, z) ≤ ρ(x, y) + ρ(y, z) ∀ x, y, z ∈ S.

If (S, ρ) is a metric space, then there is a natural topology that can be defined on S. Let
B(x, ε) denote the ball

B(x, ε) = {y ∈ S : ρ(x, y) < ε} .   (C.2)

Then the collection of sets B(x, ε), as x varies over S and ε varies over all positive numbers, is a base
for a topology T on S. T is referred to as the topology on S induced by the metric ρ. In this topology,
a set U ⊆ S is open if and only if, for every x ∈ U, there is a ball B(x, ε) ⊆ U.
Every metric space is first-countable as a topological space: For a fixed x, consider the countable
collection of sets B(x, 1/m), m an integer ≥ 1. Then every open set containing x also contains at
least one of the sets B(x, 1/m).
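As a quick sanity check, the metric axioms (M1)–(M3) can be verified numerically for the Euclidean metric on a finite sample of points; a small sketch (the sampling scheme is illustrative and, of course, no substitute for a proof):

```python
import math
import random

def rho(x, y):
    # Euclidean metric on R^2
    return math.sqrt((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2)

random.seed(0)
pts = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(50)]

for x in pts:
    for y in pts:
        assert rho(x, y) == rho(y, x)            # (M1) symmetry
        assert rho(x, y) >= 0                    # (M2) nonnegativity
        assert (rho(x, y) == 0) == (x == y)      # (M2) definiteness
        for z in pts[:10]:
            # (M3) triangle inequality (tolerance for float rounding)
            assert rho(x, z) <= rho(x, y) + rho(y, z) + 1e-12
```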
The question of convergence of sequences in topological spaces is examined next. Suppose
(S, T) is a topological space. A set N is said to be a neighborhood of x ∈ S if x ∈ N, and N contains
a set in T that contains x. In other words, a set N is a neighborhood of x if it contains an open
set containing x. Note that a neighborhood itself need not be open. A sequence {xᵢ} in S is said
to converge to x ∈ S if every neighborhood of x contains all but a finite number of terms of the
sequence {xᵢ}.
Example C.1.7 Consider the set Rⁿ together with the topology defined in Example C.1.5. Then
a sequence {xᵢ} converges to x if and only if, for every ε > 0, there is a number N such that
xᵢ ∈ B(x, ε) ∀ i ≥ N. This is the familiar notion of convergence in Rⁿ.
Example C.1.8 Consider a set S together with the discrete topology of Example C.1.2. Let {xᵢ}
be a sequence in S converging to x. Since the singleton set {x} is also a neighborhood of x, it follows
that xᵢ = x for all except a finite number of values of i.
Example C.1.9 Consider a set S together with the trivial topology of Example C.1.2. Then S is
the only neighborhood of any x ∈ S. As a result, every sequence {xᵢ} converges to every x ∈ S.
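The contrast between Examples C.1.7 and C.1.8 can be made concrete: the sequence xᵢ = 1/i converges to 0 in the usual metric topology on R, but not in the discrete topology, where convergence forces the sequence to be eventually constant. A sketch under these assumptions (the helpers below test the metric and discrete criteria derived in the examples on a finite prefix of the sequence; they are not a general topology engine):

```python
def converges_metric(seq, x, eps_list=(1.0, 0.1, 0.01)):
    # Metric-topology criterion (Example C.1.7), checked on a finite
    # prefix: for each eps, all but finitely many terms lie in B(x, eps).
    return all(
        any(all(abs(t - x) < eps for t in seq[n:]) for n in range(len(seq)))
        for eps in eps_list)

def converges_discrete(seq, x):
    # Discrete-topology criterion (Example C.1.8): {x} is itself a
    # neighborhood of x, so the sequence must eventually equal x.
    return any(all(t == x for t in seq[n:]) for n in range(len(seq)))

seq = [1.0 / i for i in range(1, 200)]
assert converges_metric(seq, 0.0)        # converges in the usual topology
assert not converges_discrete(seq, 0.0)  # but not in the discrete topology
assert converges_discrete([1, 2, 3, 3, 3], 3)
```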
The preceding two examples show that the meaning of convergence is very much dependent
on the particular topology defined on a set. Moreover, in general, a sequence does not necessarily
have a unique limit. To address the latter problem, we define Hausdorff topologies. A topology T
on a set S is Hausdorff if distinct points have disjoint neighborhoods. It is easy to verify that if T is a
Hausdorff topology on a set S, then the limit of a sequence in S is unique if it exists. Also, if (S, ρ)
is a metric space, then the topology induced by the metric ρ is Hausdorff.
Let S be a set, and let T₁, T₂ be two topologies on S, with T₁ weaker than T₂. Then, whenever
a sequence {xᵢ} in S converges to x in the topology T₂, it also converges to x in the topology
T₁. However, it is possible that a sequence may converge in T₁ but not in T₂. This shows that
convergence in T₁ is in general a weaker requirement than convergence in T₂ (and helps to explain
the terminology).
Recall that a set in S is closed if its complement is open. One of the desirable features of first-
countable topologies is that one can give an alternate, and very useful, characterization of closed sets.
Suppose U is a set in S. An element x ∈ S is said to be a limit point (or a cluster point, accumulation
point) of U if every neighborhood of x contains an element of U other than x. It can be shown that
a set is closed if and only if it contains all of its limit points. Now suppose (S, T) is a first-countable
topological space. Then, for each x ∈ S, there is a countable collection of open sets {Bᵢ(x), i ∈ Z₊}
such that every neighborhood of x contains at least one of the Bᵢ(x). As a result, x is a limit point
of a set U if and only if there exists a sequence {xᵢ} in U converging to x such that xᵢ ≠ x ∀ i. Based
on this, one can show that U is closed if and only if every convergent sequence {xᵢ} in U has a limit
in U (see Problem C.1.3).
Now we come to the notion of continuity. Suppose (S₁, T₁), (S₂, T₂) are topological spaces,
and f is a function mapping S₁ into S₂. Given any subset U of S₂, its preimage under f is the subset
of S₁ denoted by f⁻¹(U) and defined by

f⁻¹(U) = {x ∈ S₁ : f(x) ∈ U} .   (C.3)
The function f is continuous at x ∈ S₁ if, whenever U is an open subset of S₂ containing f(x),
its preimage f⁻¹(U) is an open subset of S₁ (containing x). f is continuous if it is continuous at
all x ∈ S₁. Clearly, whether or not f is continuous is very much dependent on the topologies T₁
and T₂. For example, if T₁ is the discrete topology, then every function f : S₁ → S₂ is continuous.
Also, one can easily verify the following: If f is continuous at x ∈ S₁ and N is a neighborhood of
f(x), then f⁻¹(N) is a neighborhood of x (see Problem C.1.6). The converse is also true, but more
difficult to prove: If f⁻¹(N) is a neighborhood of x whenever N is a neighborhood of f(x), then f
is continuous at x. The proof can be found in [56].
Fact C.1.10 Suppose (S₁, T₁), (S₂, T₂) are topological spaces, and f : S₁ → S₂ is continuous at
x ∈ S₁. Then, whenever {xᵢ} is a sequence in S₁ converging to x, the sequence {f(xᵢ)} converges to
f(x).
Proof. Let N be any neighborhood of f(x). Since f is continuous at x, f⁻¹(N) is a neighborhood
of x. Since {xᵢ} converges to x, f⁻¹(N) contains all but a finite number of terms of the sequence
{xᵢ}. This is the same as saying that f(xᵢ) ∈ N for all but a finite number of values of i. Since this
is true for every neighborhood of f(x), we conclude that {f(xᵢ)} converges to f(x). □
If the topologies T₁, T₂ are first-countable, then the converse of Fact C.1.10 is also true: If
{f(xᵢ)} converges to f(x) for all sequences {xᵢ} converging to x, then f is continuous at x.
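Fact C.1.10 is easy to observe numerically in the metric setting of Example C.1.5: push a convergent sequence through a continuous function and the images converge to the image of the limit. A small illustration (the function f and the particular sequence are chosen arbitrarily):

```python
import math

f = math.exp              # a continuous function on R
x = 1.0
seq = [x + (-1) ** i / (i + 1) for i in range(1, 2000)]  # x_i -> x

# By Fact C.1.10, |f(x_i) - f(x)| shrinks as x_i -> x.
errors = [abs(f(t) - f(x)) for t in seq]
assert abs(seq[-1] - x) < 1e-3
assert errors[-1] < 1e-2
assert errors[-1] < errors[0]
```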
PROBLEMS
C.1.1. Let (S, ρ) be a metric space, and let B(x, ε) be defined by (C.2). Show that the collection
of sets B(x, ε) is a base for a topology on S by verifying that axiom (B1) is satisfied.
C.1.2. Let S be any set, and define ρ : S × S → R by ρ(x, y) = 0 if x = y, 1 if x ≠ y.
(i) Verify that (S, ρ) is a metric space.
(ii) Show that the topology on S induced by ρ is the discrete topology.
C.1.3. Let (S, T) be a first-countable topological space, and let U be a subset of S. Show that U
is closed if and only if the following is true: xᵢ ∈ U, {xᵢ} converges to x implies that x ∈ U
(Hint: either x = xᵢ for some i, or else x is a limit point of U).
C.1.4. Suppose (S, ρ) is a metric space, and define

d(x, y) = ρ(x, y) / (1 + ρ(x, y)) .

(i) Show that (S, d) is also a metric space.
(ii) Show that ρ and d induce the same topology on S.
C.1.5. Show that, if (S, T ) is a topological space and T is a Hausdorff topology, then a sequence
in S can converge to at most one point.
C.1.6. Suppose (S₁, T₁), (S₂, T₂) are topological spaces, and f : S₁ → S₂ is continuous at x ∈ S₁.
Show that, if N is a neighborhood of f(x), then f⁻¹(N) is a neighborhood of x.
C.2 TOPOLOGICAL RINGS AND NORMED ALGEBRAS
This section contains a brief introduction to the subject of topological rings. Roughly speaking, a
topological ring is a ring, together with a topology, such that subtraction and multiplication are
continuous operations with respect to the topology. Before talking about topological rings as such,
it is necessary to define the product topology on the cartesian product of topological spaces.
Recall that the cartesian product S₁ × S₂ of two sets S₁ and S₂ consists of all ordered pairs
(x, y) where x ∈ S₁ and y ∈ S₂. Now suppose T₁, T₂ are topologies on S₁, S₂, respectively. The
objective is to define a topology on the product set S = S₁ × S₂. This is done in the following way:
Let B denote the collection of subsets of S of the form U × V where U ∈ T₁, V ∈ T₂. In other words,
B is the collection of subsets of S formed by taking cartesian products of an open subset of S₁ and
an open subset of S₂. It can be shown that B is a base for a topology on S (see Problem C.2.1),
which is referred to as the product topology on S.
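The base axioms for this collection of "rectangles" can be checked directly on a small finite example, since the intersection of two rectangles (U₁ × V₁) ∩ (U₂ × V₂) = (U₁ ∩ U₂) × (V₁ ∩ V₂) is again a rectangle. A sketch with assumed sample topologies (the two small spaces are illustrative):

```python
from itertools import product

# Assumed data: open sets of two small topological spaces.
T1 = [set(), {0}, {1}, {0, 1}]       # discrete topology on {0, 1}
T2 = [set(), {"a", "b"}]             # trivial topology on {"a", "b"}

# Candidate base for the product topology: rectangles U x V.
base = [set(product(U, V)) for U in T1 for V in T2]

S = set(product({0, 1}, {"a", "b"}))
# (B1): the rectangles cover S1 x S2.
assert set().union(*base) == S
# (B2) holds trivially here: the intersection of two rectangles
# is again a rectangle, hence again a member of the base.
for R1 in base:
    for R2 in base:
        assert (R1 & R2) in base
```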
Now suppose Sᵢ, i = 1, 2, 3 are topological spaces, with the topologies Tᵢ, i = 1, 2, 3. Let
S = S₁ × S₂, and let T denote the product topology on S. Suppose f is a function mapping S into
S₃. By definition, f is continuous if f⁻¹(W) is an open subset of S whenever W is an open subset of
S₃. Now suppose f is continuous, that (x, y) ∈ S, and let f(x, y) =: z ∈ S₃. Since f is continuous,
it follows that, whenever N is a neighborhood of z, f⁻¹(N) is a neighborhood of (x, y). Recalling
the definition of the product topology on the set S, we see that, whenever N is a neighborhood of
z, there exist neighborhoods N₁ of x and N₂ of y such that f(x₁, y₁) ∈ N ∀ x₁ ∈ N₁, y₁ ∈ N₂.
In particular, f(x, y₁) ∈ N ∀ y₁ ∈ N₂, and f(x₁, y) ∈ N ∀ x₁ ∈ N₁. The conclusion is that if f is a
continuous function, then for each x ∈ S₁ the function f(x, ·) : S₂ → S₃ is continuous; similarly,
for each y ∈ S₂, the function f(·, y) : S₁ → S₃ is continuous.
Now we come to topological rings.
Definition C.2.1 Suppose R is a ring and T is a topology on R. The pair (R, T) is a topological
ring if the functions (x, y) ↦ x − y and (x, y) ↦ xy are continuous functions from R × R into
R, when R × R is given the product topology.
Several facts are immediate consequences of the above definition. First, if the function
(x, y) ↦ x − y is continuous, then for each fixed x, the function y ↦ x − y is a continuous function
from R into R. In particular, taking x = 0, it follows that the function y ↦ −y is continuous. Next,
since compositions of continuous functions are again continuous, the function (x, y) ↦ x + y =
x − (−y) is also continuous. Similarly, the function (x, y) ↦ y − x = −(x − y) is also continuous.
Given a subset U ⊆ R and an element x ∈ R, let x + U denote the set defined by

x + U = {x + y : y ∈ U} .   (C.1)

One can think of x + U as the set U translated by x. Now suppose U is an open subset of R, and
define the function f_x : R → R by f_x(y) = y − x. As seen earlier, f_x is continuous. Hence, the
set f_x⁻¹(U) is open. But clearly this set equals x + U. Thus, we have shown that if U is open, so is
x + U for all x ∈ R. In other words, the topology on a topological ring is translation-invariant, in
the sense that translates of open sets are again open. A consequence of this is that, once we know all
the open sets containing 0, we know all open sets.
We now speak briefly about normed algebras. Suppose A is a linear vector space over a field
F. This means that there is a concept of addition between members of A, and scalar multiplication
between an element of F and an element of A. Now A is an algebra (over F) if, in addition, one
can define a product of two elements of A in such a way that A becomes a ring. Thus, an algebra A
has associated with it a field F, and three operations: addition between elements of A, multiplication
between two elements of A, and multiplication between an element of F and an element of A. A is a
ring with respect to addition and multiplication, and A is a linear vector space with respect to addition
and scalar multiplication. For example, if F is any field, the set of matrices Fⁿˣⁿ is an algebra over F.
Now suppose A is an algebra over R (the real numbers) or C (the complex numbers). Then
(A, ‖·‖) is a normed algebra if one can define a norm ‖·‖ on A such that

(NA1) ‖a‖ ≥ 0 ∀ a ∈ A; ‖a‖ = 0 ⟺ a = 0,
(NA2) ‖αa‖ = |α| ‖a‖ ∀ α ∈ R (or C), ∀ a ∈ A,
(NA3) ‖a + b‖ ≤ ‖a‖ + ‖b‖ ∀ a, b ∈ A,
(NA4) ‖ab‖ ≤ ‖a‖ ‖b‖ ∀ a, b ∈ A.

Axioms (NA1)–(NA3) are just the usual requirements for (A, ‖·‖) to be a normed space,
and (NA4) is the additional requirement for it to be a normed algebra.
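The matrix algebra Rⁿˣⁿ with an induced (operator) norm satisfies (NA4), as Problem C.2.2 asks the reader to show. The sketch below spot-checks submultiplicativity for the norm induced by the vector 1-norm, which has the closed form ‖A‖ = maximum absolute column sum (the sample matrices are arbitrary):

```python
def mat_mul(A, B):
    # Product of two square matrices given as lists of rows.
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def norm1(A):
    # Operator norm induced by the vector 1-norm: max column abs sum.
    n = len(A)
    return max(sum(abs(A[i][j]) for i in range(n)) for j in range(n))

A = [[1.0, -2.0], [3.0, 0.5]]
B = [[0.0, 4.0], [-1.0, 2.0]]
# (NA4): ||AB|| <= ||A|| ||B||
assert norm1(mat_mul(A, B)) <= norm1(A) * norm1(B)
```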
Let (A, ‖·‖) be a normed algebra, and let B denote the collection of balls

B(x, ε) = {y ∈ A : ‖x − y‖ < ε} .   (C.2)

Then B is a base for a topology T on A. This is the same as the topology induced by the metric
ρ(x, y) = ‖x − y‖.
It is now shown that (A, T) is a topological ring. In order to do this, it is necessary to show that
the functions (x, y) ↦ x − y and (x, y) ↦ xy are both continuous. Let us begin with subtraction.
Suppose x, y ∈ A, z = x − y, and let S ⊆ A be an open set containing z. We must show that the
set {(x, y) : x − y ∈ S} =: V is an open subset of A × A. This is done by showing that, whenever
(x, y) ∈ V, there are balls B(x, δ), B(y, δ) such that B := B(x, δ) × B(y, δ) is contained in V. Since
S is open and x − y ∈ S, by definition there is an ε > 0 such that B(x − y, ε) ⊆ S. Now let δ = ε/2,
and suppose x₁ ∈ B(x, δ), y₁ ∈ B(y, δ). Then

‖(x₁ − y₁) − (x − y)‖ ≤ ‖x₁ − x‖ + ‖y₁ − y‖ < 2δ = ε .   (C.3)

Hence, (x₁, y₁) ∈ V. This shows that subtraction is continuous. Next, to show that multiplication
is continuous, suppose x, y ∈ A, let z = xy, and let S be an open set in A containing z. Define
V = {(x, y) : xy ∈ S}. We show that V is open by showing that, whenever (x, y) ∈ V, there exist
balls B(x, δ), B(y, δ) such that B(x, δ) × B(y, δ) is contained in V. Since S is open, by definition
there is an ε > 0 such that B(xy, ε) ⊆ S. Suppose x₁ ∈ B(x, δ), y₁ ∈ B(y, δ), where δ is yet to be
specified. Then

x₁y₁ − xy = [(x₁ − x) + x] [(y₁ − y) + y] − xy
= (x₁ − x)(y₁ − y) + (x₁ − x)y + x(y₁ − y) ,   (C.4)

so that

‖x₁y₁ − xy‖ ≤ ‖x₁ − x‖ ‖y₁ − y‖ + ‖x₁ − x‖ ‖y‖ + ‖x‖ ‖y₁ − y‖
< δ² + δ(‖x‖ + ‖y‖) .   (C.5)

Hence (x₁, y₁) ∈ V if δ is chosen such that

δ² + δ(‖x‖ + ‖y‖) ≤ ε .   (C.6)

Such a δ can always be found, since the left side of (C.6) is continuous in δ and equals zero when δ
is zero. This completes the proof that multiplication is continuous.
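The δ demanded by (C.6) can be computed explicitly: solving δ² + δc = ε with c = ‖x‖ + ‖y‖ gives the positive root δ = (−c + √(c² + 4ε))/2. A numerical sketch in the normed algebra R itself, with the absolute value as the norm (the specific x, y, ε are arbitrary):

```python
import math

def delta_for(eps, x_norm, y_norm):
    # Positive root of d^2 + d*(||x|| + ||y||) = eps, i.e. the largest
    # d satisfying condition (C.6) with equality.
    c = x_norm + y_norm
    return (-c + math.sqrt(c * c + 4.0 * eps)) / 2.0

x, y, eps = 3.0, -2.0, 0.1
d = delta_for(eps, abs(x), abs(y))
assert d > 0 and d * d + d * (abs(x) + abs(y)) <= eps + 1e-12

# Any x1, y1 within d of x, y keep the product within eps of xy,
# as in the estimate (C.5).
x1, y1 = x + 0.99 * d, y - 0.99 * d
assert abs(x1 * y1 - x * y) < eps
```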
Fact C.2.2 Suppose (A, ‖·‖) is a normed algebra, and let U denote the set of units of A (i.e., the
set of elements of A that have a multiplicative inverse in A). Then the function f : u ↦ u⁻¹ maps
U into itself continuously.

Remark C.2.3 By the continuity of the inversion function f, we mean the following: Given any
u ∈ U and any ε > 0, there exists a δ > 0 such that

‖u⁻¹ − v⁻¹‖ < ε whenever v ∈ U and ‖u − v‖ < δ .   (C.7)
Proof. Let u ∈ U and let ε > 0 be specified. Suppose v ∈ U and v ∈ B(u, δ), where δ is not as yet
specified. Then

‖u⁻¹ − v⁻¹‖ = ‖u⁻¹(v − u)v⁻¹‖ ≤ ‖u⁻¹‖ ‖v − u‖ ‖v⁻¹‖
≤ δ ‖u⁻¹‖ ‖v⁻¹‖ ≤ δ ‖u⁻¹‖ (‖u⁻¹‖ + ‖u⁻¹ − v⁻¹‖) .   (C.8)

Solving for ‖u⁻¹ − v⁻¹‖ from (C.8) gives

‖u⁻¹ − v⁻¹‖ ≤ δ ‖u⁻¹‖² / (1 − δ ‖u⁻¹‖) ,   (C.9)

provided δ ‖u⁻¹‖ < 1. Thus, (C.7) is satisfied if δ is chosen such that the right side of (C.9) is less than ε. But such a δ
can always be found, since the right side of (C.9) is continuous in δ and equals zero when δ is zero.
This shows that the function f is continuous. □
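In the matrix algebra of 2 × 2 real matrices (with the norm induced by the vector 1-norm), the bound (C.9) can be observed directly: perturb an invertible u by δ and the inverse moves by at most δ‖u⁻¹‖²/(1 − δ‖u⁻¹‖). A sketch with arbitrary sample data:

```python
def inv2(M):
    # Inverse of a 2x2 matrix [[a, b], [c, d]].
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def norm1(M):
    # Operator norm induced by the vector 1-norm: max column abs sum.
    return max(abs(M[0][0]) + abs(M[1][0]), abs(M[0][1]) + abs(M[1][1]))

def sub(M, N):
    return [[M[i][j] - N[i][j] for j in range(2)] for i in range(2)]

u = [[2.0, 1.0], [0.0, 3.0]]
v = [[2.0, 1.0], [0.001, 3.0]]      # a small perturbation of u, still a unit
delta = norm1(sub(u, v))            # here ||u - v|| = 0.001
k = norm1(inv2(u))                  # ||u^{-1}||
assert delta * k < 1                # hypothesis behind (C.9)
bound = delta * k * k / (1 - delta * k)
assert norm1(sub(inv2(u), inv2(v))) <= bound
```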
PROBLEMS
C.2.1. Suppose (S₁, T₁), (S₂, T₂) are topological spaces.
(i) Let T denote the collection of subsets of S₁ × S₂ of the form U × V where U ∈ T₁, V ∈
T₂. Show that T is a base for a topology on the set S₁ × S₂ (Hint: Show that T satisfies the
axiom (B1) of Section C.1).
(ii) Suppose B₁, B₂ are respectively bases for the topologies T₁, T₂. Let B denote the collection
of subsets of S₁ × S₂ of the form B₁ × B₂ where B₁ ∈ B₁, B₂ ∈ B₂. Show that B
is also a base for a topology on S₁ × S₂.
(iii) Show that B and T generate the same topology on S₁ × S₂ (Hint: Show that every set
in T is a union of sets in B).
C.2.2. Consider the ring Rⁿˣⁿ of n × n matrices with real elements. Let ‖·‖ be any norm on the
vector space Rⁿ, and define the norm of a matrix A by

‖A‖ = sup_{x≠0} ‖Ax‖ / ‖x‖ .

Show that (Rⁿˣⁿ, ‖·‖) is a normed algebra.
C.2.3. Consider the ring R[s] of polynomials in the indeterminate s with real coefficients. Define
the norm of a(s) = Σᵢ aᵢ sⁱ by

‖a(·)‖ = Σᵢ |aᵢ| .

Show that (R[s], ‖·‖) is a normed algebra.
Bibliography

[1] V. Anantharam and C. A. Desoer, On the stabilization of nonlinear systems, IEEE Trans. on Auto. Control, AC-29, pp. 569–572, June 1984. DOI: 10.1109/TAC.1984.1103584 Cited on page(s) 306
[2] B. D. O. Anderson, A note on the Youla-Bongiorno-Lu condition, Automatica, 12, pp. 387–388, July 1976. DOI: 10.1016/0005-1098(76)90060-1 Cited on page(s)
[3] B. D. O. Anderson and J. B. Moore, Optimal Filtering, Prentice-Hall, Englewood Cliffs, New Jersey, 1979. Cited on page(s) 194, 229
[4] M. F. Atiyah and I. G. MacDonald, Introduction to Commutative Algebra, Addison-Wesley, Reading, MA, 1969. Cited on page(s) 283, 293
[5] J. A. Ball and J. W. Helton, A Beurling-Lax theorem for the Lie group U(m, n) which contains most classical interpolation theory, J. Operator Theory, 9, pp. 107–142, 1983. Cited on page(s)
[6] C. I. Byrnes, M. W. Spong and T. J. Tarn, A several complex variables approach to feedback stabilization of neutral delay-differential systems, Math. Sys. Thy., 17, pp. 97–134, May 1984. DOI: 10.1007/BF01744436 Cited on page(s) 292, 306
[7] F. M. Callier and C. A. Desoer, Open-loop unstable convolution feedback systems with dynamical feedback, Automatica, 13, pp. 507–518, Dec. 1976. DOI: 10.1016/0005-1098(76)90010-8 Cited on page(s)
[8] F. M. Callier and C. A. Desoer, An algebra of transfer functions of distributed linear time-invariant systems, IEEE Trans. Circ. and Sys., CAS-25, pp. 651–662, Sept. 1978. DOI: 10.1109/TCS.1978.1084544 Cited on page(s) 287, 301, 305
[9] F. M. Callier and C. A. Desoer, Simplifications and clarifications on the paper "An algebra of transfer functions of distributed linear time-invariant systems", IEEE Trans. Circ. and Sys., CAS-27, pp. 320–323, Apr. 1980. DOI: 10.1109/TCS.1980.1084802 Cited on page(s)
[10] F. M. Callier and C. A. Desoer, Stabilization, tracking and disturbance rejection in multivariable convolution systems, Annales de la Société Scientifique de Bruxelles, 94, pp. 7–51, 1980. Cited on page(s)
[11] F. M. Callier and C. A. Desoer, Multivariable Feedback Systems, Springer-Verlag, New York, 1982. Cited on page(s)
[12] B. C. Chang and J. B. Pearson, Optimal disturbance reduction in linear multivariable systems, IEEE Trans. on Auto. Control, AC-29, pp. 880–888, Oct. 1984. DOI: 10.1109/TAC.1984.1103409 Cited on page(s) 205
[13] M. J. Chen and C. A. Desoer, Necessary and sufficient conditions for robust stability of linear distributed feedback systems, Int. J. Control, 35, pp. 255–267, 1982. DOI: 10.1080/00207178208922617 Cited on page(s) 277
[14] M. J. Chen and C. A. Desoer, Algebraic theory of robust stability of interconnected systems, IEEE Trans. on Auto. Control, AC-29, pp. 511–519, June 1984. DOI: 10.1109/TAC.1984.1103572 Cited on page(s) 279
[15] J. H. Chow and P. V. Kokotovic, Eigenvalue placement in two-time-scale systems, Proc. IFAC Symp. on Large Scale Sys., Udine, Italy, pp. 321–326, June 1976. Cited on page(s) 224
[16] C.-C. Chu and J. C. Doyle, On inner-outer and spectral factorizations, Proc. IEEE Conf. on Decision and Control, Las Vegas, Dec. 1984. Cited on page(s) 205
[17] H. T. Colebrooke, Algebra with Arithmetic and Mensuration, John Murray, London, 1817, reprinted by Dr. Martin Sandig OHg, Wiesbaden, W. Germany, 1973. Cited on page(s) xvii
[18] P. R. Delsarte, Y. Genin and Y. Kamp, The Nevanlinna-Pick problem for matrix-valued functions, SIAM J. Appl. Math., 36, pp. 47–61, Feb. 1979. DOI: 10.1137/0136005 Cited on page(s) 205
[19] C. A. Desoer and W. S. Chan, The feedback interconnection of linear time-invariant systems, J. Franklin Inst., 300, pp. 335–351, 1975. DOI: 10.1016/0016-0032(75)90161-1 Cited on page(s)
[20] C. A. Desoer and M. J. Chen, Design of multivariable feedback systems with stable plant, IEEE Trans. on Auto. Control, AC-26, pp. 408–415, April 1981. DOI: 10.1109/TAC.1981.1102594 Cited on page(s)
[21] C. A. Desoer and C. L. Gustafson, Design of multivariable feedback systems with simple unstable plant, IEEE Trans. on Auto. Control, AC-29, pp. 901–908, Oct. 1984. DOI: 10.1109/TAC.1984.1103400 Cited on page(s)
[22] C. A. Desoer and C. L. Gustafson, Algebraic theory of linear multivariable feedback systems, IEEE Trans. on Auto. Control, AC-29, pp. 909–917, Oct. 1984. DOI: 10.1109/TAC.1984.1103410 Cited on page(s)
[23] C. A. Desoer and C. A. Lin, Two-step compensation of nonlinear systems, Systems and Control Letters, 3, pp. 41–46, June 1983. DOI: 10.1016/0167-6911(83)90036-1 Cited on page(s) 306
[24] C. A. Desoer and R. W. Liu, Global parametrization of feedback systems with nonlinear plants, Systems and Control Letters, 1, pp. 249–251, Jan. 1982. DOI: 10.1016/S0167-6911(82)80006-6 Cited on page(s) 306
[25] C. A. Desoer, R. W. Liu, J. Murray and R. Saeks, Feedback system design: The fractional representation approach to analysis and synthesis, IEEE Trans. Auto. Control, AC-25, pp. 399–412, June 1980. DOI: 10.1109/TAC.1980.1102374 Cited on page(s)
[26] C. A. Desoer and M. Vidyasagar, Feedback Systems: Input-Output Properties, Academic Press, New York, 1975. Cited on page(s) 154, 210, 239, 284, 287
[27] J. C. Doyle, Robustness of multiloop linear feedback systems, Proc. 17th Conf. Decision and Control, Ft. Lauderdale, pp. 12–18, 1979. Cited on page(s)
[28] J. C. Doyle, K. Glover, P. P. Khargonekar and B. A. Francis, State space solutions to standard H₂ and H