A Factorization Approach
Part II
Synthesis Lectures on Control
and Mechatronics
Editors
Chaouki Abdallah, University of New Mexico
Mark W. Spong, University of Texas at Dallas
Control System Synthesis: A Factorization Approach, Part II
Mathukumalli Vidyasagar
2011
Control System Synthesis: A Factorization Approach, Part I
Mathukumalli Vidyasagar
2011
The Reaction Wheel Pendulum
Daniel J. Block, Karl J. Åström, and Mark W. Spong
2007
Copyright © 2011 by Morgan & Claypool
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations in printed reviews, without the prior permission of the publisher.
Control System Synthesis: A Factorization Approach, Part II
Mathukumalli Vidyasagar
www.morganclaypool.com
ISBN: 9781608456611 paperback
ISBN: 9781608456635 ebook (Part II)
DOI 10.2200/S00358ED1V01Y201105CRM003
A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON CONTROL AND MECHATRONICS
Lecture #3
Series Editors: Chaouki Abdallah, University of New Mexico
Mark W. Spong, University of Texas at Dallas
Series ISSN
Synthesis Lectures on Control and Mechatronics
Print 1939-0564 Electronic 1939-0572
Control System Synthesis
A Factorization Approach
Part II
Mathukumalli Vidyasagar
University of Texas at Dallas
SYNTHESIS LECTURES ON CONTROL AND MECHATRONICS #3
ABSTRACT
This book introduces the so-called stable factorization approach to the synthesis of feedback controllers for linear control systems. The key to this approach is to view the multi-input, multi-output (MIMO) plant for which one wishes to design a controller as a matrix over the fraction field F associated with a commutative ring with identity, denoted by R, which also has no divisors of zero. In this setting, the set of single-input, single-output (SISO) stable control systems is precisely the ring R, while the set of stable MIMO control systems is the set of matrices whose elements all belong to R. The set of unstable, meaning not necessarily stable, control systems is then taken to be the field of fractions F associated with R in the SISO case, and the set of matrices with elements in F in the MIMO case.
The central notion introduced in the book is that, in most situations of practical interest, every matrix P whose elements belong to F can be factored as a "ratio" of two matrices N, D whose elements belong to R, in such a way that N, D are coprime. In the familiar case where the ring R corresponds to the set of bounded-input, bounded-output (BIBO)-stable rational transfer functions, coprimeness is equivalent to the two functions not having any common zeros in the closed right half-plane including infinity. However, the notion of coprimeness extends readily to discrete-time systems, distributed-parameter systems in both the continuous- as well as discrete-time domains, and to multi-dimensional systems. Thus the stable factorization approach enables one to capture all these situations within a common framework.
The key result in the stable factorization approach is the parametrization of all controllers that stabilize a given plant. It is shown that the set of all stabilizing controllers can be parametrized by a single parameter R, whose elements all belong to R. Moreover, every transfer matrix in the closed-loop system is an affine function of the design parameter R. Thus problems of reliable stabilization, disturbance rejection, robust stabilization, etc. can all be formulated in terms of choosing an appropriate R.
This is a reprint of the book Control System Synthesis: A Factorization Approach originally
published by M.I.T. Press in 1985.
KEYWORDS
stable factorization, coprimeness, Bezout identity, simultaneous stabilization, robust
stabilization, robust regulation, genericity
Dedicated to the memories of
My paternal grandfather,
Mathukumalli Narasimharao (1881-1965)
and
My father,
Prof. Mathukumalli Venkata Subbarao (1921-2006)
May God protect us together
May God nourish us together
May we work together with great energy
May our study be vigorous and effective
May we not have ill feelings
Let there be peace, peace, peace
(A Vedic prayer recited by the teacher and student(s) together at the commencement of studies; found in several Upanishads, including the Taittiriyopanishad.)
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Preface for the Original Edition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
6  Filtering and Sensitivity Minimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.2 Some Facts about Hardy Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6.3 Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.4 Sensitivity Minimization: Scalar Case. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.5 Sensitivity Minimization: Fat Plant Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
6.6 Sensitivity Minimization: General Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
6.7 Two-Parameter Compensator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
7  Robustness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
7.2 Graph Topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
7.3 Graph Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
7.4 Robustness Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
7.5 Robust and Optimal Regulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
7.6 Genericity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
8  Extensions to General Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
8.1 Coprime Factorizations over a Commutative Ring . . . . . . . . . . . . . . . . . . . . . . . . . 279
8.2 Coprime Factorizations for Distributed Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 298
8.3 Controller Synthesis over a General Ring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
A  Algebraic Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
A.1 Rings, Fields and Ideals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
A.2 Rings and Fields of Fractions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
A.3 Principal Ideal Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
A.4 Euclidean Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
B  Preliminaries on Matrix Rings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
B.1 Matrices and Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
B.2 Canonical Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
C  Topological Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
C.1 Topological Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
C.2 Topological Rings and Normed Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
Author's Biography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Preface
This is a reprint of my book Control System Synthesis: A Factorization Approach, originally
published by M.I.T. Press in 1985. That book went out of print about ten years after its publication,
but nevertheless continued to be cited in the published literature. I would like to believe that this is
because the contents of the book are still relevant to linear control theory as it is currently practiced.
I am therefore grateful to Morgan & Claypool for their decision to reprint my book, so that the
current generation of graduate students and researchers are able to access its contents without having
to photocopy it surreptitiously from the library.
The original text was created by me using the troff text processing system created at Bell Laboratories. Indeed, the 1985 book was the first time that I produced a camera-ready copy, instead of subjecting myself to the mercy and vagaries of typists and/or typesetters. Back then this was a novel thing to do; however, today it is standard practice. Dr. C. L. Tondo of T&T TechWorks, Inc. and his troops have expertly keyed the entire text into LaTeX, for which I am extremely grateful. There is no doubt that the physical appearance of the text has been significantly improved as a result of this switch.
With the entire text at my disposal, I could in principle have made major changes. After thinking over the matter, I decided to stick with the original text, and restricted myself to correcting typographical errors. Upon re-reading the original text after a gap of perhaps twenty years, I felt that the entire material still continues to be relevant, except for Chapter 6 on H∞-control, which was based on the earliest papers on the subject; these are cited in the book as references [40] and [115]. A few years after my book was published, the influential book [37] appeared. This was followed in short order by the paper [28], which gives a complete state-space computational procedure for H∞-control; this paper is perhaps the most cited paper in the history of control theory. Subsequently, the book [29] gave an elementary introduction to the theory, while [45, 117] are advanced treatments of the subject. It would therefore have required a massive effort on my part to rewrite Chapter 6 of the book to bring it up to date, and I felt that I could contribute nothing beyond the excellent texts already in print. So I decided to leave the book as it is, on the basis that the conceptual framework for H∞-control remains unchanged.
\frac{1}{2\pi} \int_{-\infty}^{\infty} \| z(j\omega) \|^2 \, d\omega = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left\| [W_1 (I+PC)^{-1} W_2](j\omega) \right\|^2 \, d\omega . \quad (6.1.4)
Case II. Sensitivity Minimization. Suppose u is a square-integrable input with an L_2-norm no larger than one, and suppose it is desired to minimize the largest L_2-norm that z may have in response to such an input. Here again the objective is to minimize the weighted energy of the error e, but in this case the input u is assumed to be any L_2-input of unit norm or less, rather than white noise as in Case I. In this case the objective function to be minimized is

J = \sup_{\|u\|_2 \le 1} \|z\|_2 = \sup_{\|u\|_2 \le 1} \| W_1 (I+PC)^{-1} W_2 u \|_2 = \sup_{\omega \in \mathbb{R}} \bar\sigma\big( [W_1 (I+PC)^{-1} W_2](j\omega) \big) , \quad (6.1.5)

where \|\cdot\|_2 denotes the L_2-norm, and the last equality is a standard result from [26, 94], as pointed out in the next section. Since in this case the quantity J measures the maximum error in response to a unit-norm input u, this problem is sometimes referred to as the minimax sensitivity problem.
In both the above cases, as well as in others that the reader may construct, the minimization
of J in (6.1.1) is required to be carried out subject to the constraint that C stabilizes P. As things
stand, the compensator C is thus constrained, and also enters the objective function J in a highly
nonlinear fashion. However, using the results of Chapter 5, it is possible to transform this into an
unconstrained optimization problem involving a much simpler objective function.
The set of all compensators C that stabilize a given plant P is neatly parametrized in Theorem 5.2.1. Let (N, D), (\tilde D, \tilde N) be any r.c.f. and any l.c.f. of P, and select \tilde X, \tilde Y \in M(S) such that \tilde N \tilde X + \tilde D \tilde Y = I. Then, from this theorem,

S(P) = \{ (\tilde X + DR)(\tilde Y - NR)^{-1} : R \in M(S) \text{ and } |\tilde Y - NR| \neq 0 \} . \quad (6.1.6)

Moreover, if C \in S(P) equals (\tilde X + DR)(\tilde Y - NR)^{-1} for a certain R \in M(S), then (cf. (5.2.7))

(I + PC)^{-1} = (\tilde Y - NR) \tilde D . \quad (6.1.7)
Thus, the problem of minimizing J(R) = \| W_1 (I+PC)^{-1} W_2 \| subject to the constraint that C \in S(P) is equivalent to minimizing

J(R) = \| W_1 (\tilde Y - NR) \tilde D W_2 \| , \quad (6.1.8)

subject to the constraint that |\tilde Y - NR| \neq 0. Now, from Lemma 5.2.4, |\tilde Y - NR| \neq 0 for all R belonging to an open dense subset of M(S), and if P is strictly proper, then from Lemma 5.2.6, |\tilde Y - NR| \neq 0 for all R \in M(S). Hence, the original problem of minimizing the weighted performance measure J of (6.1.1) is equivalent, in the case where P is strictly proper, to an unconstrained minimization of the objective function J(\cdot) of (6.1.8) with respect to R. Moreover, the original expression W_1 (I+PC)^{-1} W_2 has been replaced by W_1 (\tilde Y - NR) \tilde D W_2, where the noteworthy feature is that the quantity inside the norm is affine in R. If the plant is not strictly proper, the quantity R is constrained to the extent that it has to satisfy the nonsingularity constraint. However, since this constraint is only operational on a nowhere dense set, it can be ignored during the minimization and can be tested once an unconstrained minimum has been found. If by chance the optimal choice of R does not satisfy the nonsingularity constraint, then arbitrarily small perturbations in R will satisfy the constraint and will result in only an arbitrarily small increase in the performance measure (since the latter is a continuous function).
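The affine dependence of the closed-loop sensitivity on the free parameter can be checked numerically. The following sketch verifies, on a grid of frequencies, that (1 + pc)^{-1} = (y - nr)d for a scalar plant; the particular plant, factorization, and parameter below are illustrative assumptions, not data from the text.

```python
# Scalar sketch (example data assumed): plant p(s) = 1/(s-1) with stable
# coprime factorization n = 1/(s+1), d = (s-1)/(s+1), and Bezout pair
# x = 2, y = 1, so that n*x + d*y = 1.
n = lambda s: 1 / (s + 1)
d = lambda s: (s - 1) / (s + 1)
x = lambda s: 2
y = lambda s: 1
p = lambda s: n(s) / d(s)
r = lambda s: 1 / (s + 2)          # any stable r parametrizes a controller

def c(s):
    """Stabilizing controller (x + d r)/(y - n r) from the parametrization."""
    return (x(s) + d(s) * r(s)) / (y(s) - n(s) * r(s))

# The sensitivity is affine in r: (1 + p c)^(-1) = (y - n r) d.
for w in [0.1, 1.0, 3.0, 10.0]:
    s = 1j * w
    lhs = 1 / (1 + p(s) * c(s))
    rhs = (y(s) - n(s) * r(s)) * d(s)
    assert abs(lhs - rhs) < 1e-12
    assert abs(n(s) * x(s) + d(s) * y(s) - 1) < 1e-12
print("sensitivity equals (y - n r) d at all test frequencies")
```

Because the right-hand side is affine in r, any norm of it is a convex function of the free parameter, which is exactly what makes the minimization problems of this chapter tractable.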
Up to now we have only talked about minimizing the weighted energy of the error signal in response to various types of inputs. Depending on the application, one might prefer instead to minimize the weighted energy of some other signal within the loop. Since the transfer matrix between any two points can be expressed as an affine function of R using the results of Chapter 5, such a minimization is qualitatively no different from the problems discussed earlier. For instance, if it is desired to minimize the weighted energy of the plant output in response to an input u, then

J = \| W_1 PC (I+PC)^{-1} W_2 \| . \quad (6.1.9)
However, if C \in S(P), then by (5.2.7),

PC (I+PC)^{-1} = (\tilde X + DR) \tilde N , \quad (6.1.10)

for some R \in M(S), so that

J = \| W_1 (\tilde X + DR) \tilde N W_2 \| . \quad (6.1.11)

In fact, it is easy to see that the general problem becomes one of minimizing

J = \| F - GRH \| , \quad (6.1.12)
where F, G, H \in M(S) are given matrices.
Finally, the above discussion applies even when a two-parameter control scheme is used. In this case, by (5.6.15), all transfer matrices are affine functions of the matrix S = [Q \; R], and the general cost function assumes the form

J = \| F - GSH \| , \quad (6.1.13)

where F, G, H \in M(S) are given matrices. The discrete-time analogs of the above problems are obtained simply by replacing S by R_\infty.
6.2 SOME FACTS ABOUT HARDY SPACES
In the previous section, it was shown that several problems of filtering and sensitivity minimization can be formulated as follows: Minimize

J(R) = \| F - GRH \| , \quad (6.2.1)

where F, G, H \in M(S) are given and R \in M(S) is to be chosen. The solution of this class of problems is facilitated by the use of some concepts and results from the theory of H_p (or Hardy) spaces. The purpose of this section is to present a listing of the basic concepts and results from this theory, mostly without proof. The reader is referred to [34] for an excellent treatment of this theory, and to [59] for some specialized results.
Throughout this section, D denotes the open unit disc in the complex plane, and \bar D the closed unit disc. For each p \in [1, \infty), the set H_p consists of all analytic functions f on the open unit disc D with the property that

\sup_{r \in (0,1)} \left[ \frac{1}{2\pi} \int_0^{2\pi} |f(re^{j\theta})|^p \, d\theta \right]^{1/p} =: \|f\|_p < \infty . \quad (6.2.2)

The set H_\infty consists of all bounded analytic functions f on D, with

\sup_{z \in D} |f(z)| =: \|f\|_\infty < \infty . \quad (6.2.3)
If f \in H_p for some p \in [1, \infty], then strictly speaking f(z) is only defined when |z| < 1. However, for almost all \theta \in [0, 2\pi], it is possible to define f(e^{j\theta}) as a nontangential limit of f(z) as z \to e^{j\theta}. If f \in H_p, then its boundary value function f(e^{j\theta}) belongs to the Lebesgue space L_p[0, 2\pi]. Moreover,

\|f\|_p = \left[ \frac{1}{2\pi} \int_0^{2\pi} |f(e^{j\theta})|^p \, d\theta \right]^{1/p} \quad \text{if } 1 \le p < \infty ,

\|f\|_\infty = \operatorname{ess\,sup}_{\theta \in [0,2\pi]} |f(e^{j\theta})| \quad \text{if } p = \infty . \quad (6.2.4)
In other words, the norm \|f\|_p of a function f \in H_p can be computed from its boundary value function.
For each p \in [1, \infty], the space H_p is a Banach space. In addition, H_2 is a (complex) Hilbert space with inner product defined by

\langle f, g \rangle = \frac{1}{2\pi} \int_0^{2\pi} f(e^{j\theta}) \overline{g(e^{j\theta})} \, d\theta , \quad (6.2.5)
where the bar denotes complex conjugation. It turns out that the set of functions \{ z^i \}_{i=0}^{\infty} forms an orthonormal basis for H_2. Hence, a function f(\cdot) analytic in D belongs to H_2 if and only if it can be expanded in the form

f(z) = \sum_{i=0}^{\infty} f_i z^i , \quad (6.2.6)

where the sequence \{f_i\} is square-summable and the infinite sum converges in the sense of the H_2-norm. Thus, one can think of H_2 as the set of z-transforms of square-summable sequences.
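The identification of H_2 with z-transforms of square-summable sequences can be illustrated numerically. The sketch below computes the H_2-norm of an example function (an assumption chosen for illustration) both from its Taylor coefficients and from the boundary-value integral in (6.2.4).

```python
import numpy as np

# Example (assumed for illustration): f(z) = 1/(1 - 0.5 z) = sum (0.5)^i z^i.
# Its H2-norm can be computed from the coefficients (Parseval) or from the
# boundary integral (1/2π) ∫ |f(e^{jθ})|² dθ.
N = 4096
theta = 2 * np.pi * np.arange(N) / N
f_boundary = 1.0 / (1.0 - 0.5 * np.exp(1j * theta))

# Riemann-sum approximation of the boundary integral
norm_boundary = np.sqrt(np.mean(np.abs(f_boundary) ** 2))

# sum of squared coefficients: Σ 0.25^i = 1/(1 - 0.25) = 4/3
norm_coeffs = np.sqrt(1.0 / (1.0 - 0.25))

assert np.isclose(norm_boundary, norm_coeffs, atol=1e-6)
print(round(norm_coeffs ** 2, 6))  # 1.333333
```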
Similarly, if \{f_i\} is an absolutely summable sequence, then

f(z) = \sum_{i=0}^{\infty} f_i z^i \quad (6.2.7)

belongs to H_\infty, and in fact

\|f\|_\infty \le \sum_{i=0}^{\infty} |f_i| . \quad (6.2.8)

In general, the converse of the above statement is false: Not every function in H_\infty is of the form (6.2.7) where the coefficient sequence is absolutely summable. However, it is easy to verify that a rational function belongs to H_\infty if and only if it has no poles in the closed unit disc. Under pointwise multiplication of boundary functions, H_\infty becomes a Banach algebra.
Fact 6.2.1 Suppose f \in H_\infty; then the multiplication operator T_f : g \mapsto fg maps H_2 into H_2, and its induced norm equals \|f\|_\infty; or in other words,

\|T_f\| := \sup_{\|g\|_2 \le 1} \|fg\|_2 = \|f\|_\infty . \quad (6.2.10)
Suppose m is a nonnegative integer, and that \{a_i\} is a sequence of nonzero complex numbers such that |a_i| < 1 for all i, and the sum \sum_{i=1}^{\infty} (1 - |a_i|) converges. Then the function

b(z) = z^m \prod_{i=1}^{\infty} \frac{|a_i|}{a_i} \cdot \frac{z - a_i}{1 - \bar a_i z} \quad (6.2.11)

is called the Blaschke product corresponding to m and the sequence \{a_i\}. The function b(\cdot) is an element of H_\infty, and |b(e^{j\theta})| = 1 for almost all \theta. If only finitely many a_1, \ldots, a_l are given, then the function

b(z) = z^m \prod_{i=1}^{l} \frac{|a_i|}{a_i} \cdot \frac{z - a_i}{1 - \bar a_i z} \quad (6.2.12)

is called the associated (finite) Blaschke product. It too belongs to H_\infty. A function in H_\infty with no zeros in the closed unit disc is a unit of H_\infty, i.e., it has an inverse in H_\infty.
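The defining properties of a finite Blaschke product — modulus one everywhere on the unit circle, with zeros exactly at the a_i and at the origin with multiplicity m — can be checked numerically. A minimal sketch, with assumed zeros a_i:

```python
import numpy as np

# Finite Blaschke product as in (6.2.12); the zeros a and multiplicity m
# below are illustrative assumptions.
a = np.array([0.5, -0.3 + 0.4j])
m = 1  # multiplicity of the zero at the origin

def blaschke(z):
    out = z ** m
    for ai in a:
        out = out * (abs(ai) / ai) * (z - ai) / (1 - np.conj(ai) * z)
    return out

theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
on_circle = blaschke(np.exp(1j * theta))
assert np.allclose(np.abs(on_circle), 1.0)  # inner: |b(e^{jθ})| = 1
assert np.allclose(blaschke(a), 0.0)        # the a_i are zeros of b
print("finite Blaschke product has modulus one on the unit circle")
```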
The factorization described above is a special case of what is known as an inner–outer factorization. A function g \in H_\infty is inner if |g(e^{j\theta})| = 1 for almost all \theta, and a rational g \in H_\infty is outer if it has no zeros in the open unit disc D. For matrices, the relevant inner product on H_2^{n \times m} is

\langle F, G \rangle = \frac{1}{2\pi} \int_0^{2\pi} \operatorname{tr}[ F^*(e^{j\theta}) G(e^{j\theta}) ] \, d\theta , \quad (6.2.13)

where {}^* denotes the conjugate transpose. Corresponding to every F \in H_\infty^{n \times m} is the norm

\|F\|_\infty = \sup_{\theta \in [0,2\pi]} \bar\sigma( F(e^{j\theta}) ) , \quad (6.2.15)

where \bar\sigma(\cdot) denotes the largest singular value.
The matrix analog of Fact 6.2.1 is stated next. A matrix G \in H_\infty^{n \times m} is inner if G^*(e^{j\theta}) G(e^{j\theta}) = I for almost all \theta \in [0, 2\pi] (in this definition it is important to note that n need not equal m), and a matrix F \in H_\infty^{n \times m} is outer if it has full row rank^1 throughout the open unit disc. A rational matrix belongs to H_\infty^{n \times m} if and only if each of its entries is a rational function in H_\infty, and the norm defined in (2.2.21) is consistent with (6.2.4).
Fact 6.2.2 Suppose F \in H_\infty^{n \times n} is inner, G \in M(H_\infty) and H \in M(H_2); then \|FG\|_\infty = \|G\|_\infty and \|FH\|_2 = \|H\|_2.
^1 This means that at least one full-size minor of G is nonzero.
One can also define the class H_p over the open right half-plane, rather than the open unit disc as is the case above. Suppose f : C_+ \to C is analytic; then f belongs to the class H_p(C_+), where 1 \le p < \infty, if

\sup_{\sigma > 0} \left[ \frac{1}{2\pi} \int_{-\infty}^{\infty} |f(\sigma + j\omega)|^p \, d\omega \right]^{1/p} =: \|f\|_{p+} < \infty . \quad (6.2.16)

The function f belongs to H_\infty(C_+) if

\sup_{\sigma > 0} \sup_{\omega} |f(\sigma + j\omega)| =: \|f\|_{\infty+} < \infty . \quad (6.2.17)
If f \in H_p(C_+), then its domain of definition can be extended to the j\omega-axis, and the function f(j\omega) belongs to the Lebesgue space L_p(\mathbb{R}). Moreover,

\|f\|_{p+} = \left[ \frac{1}{2\pi} \int_{-\infty}^{\infty} |f(j\omega)|^p \, d\omega \right]^{1/p} \quad \text{if } p \in [1, \infty) ,

\|f\|_{\infty+} = \operatorname{ess\,sup}_{\omega} |f(j\omega)| \quad \text{if } p = \infty . \quad (6.2.18)
H_2(C_+) is a Hilbert space, with the inner product of two functions f and g defined by

\langle f, g \rangle_+ = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(j\omega) \overline{g(j\omega)} \, d\omega . \quad (6.2.19)
The notions of Blaschke product, inner and outer functions all have their counterparts in this setting. In particular, a function f \in H_\infty(C_+) is inner if |f(j\omega)| = 1 for almost all \omega; a rational f \in H_\infty(C_+) with f(j\omega) \neq 0 for all \omega is outer if and only if it is a unit. Facts 6.2.1 and 6.2.2 have their obvious analogs for the space H_\infty(C_+), obtained by replacing the unit circle by the imaginary axis and e^{j\theta} by j\omega throughout.
Suppose f \in H_\infty, and define

g(s) = f\left( \frac{s-1}{s+1} \right) . \quad (6.2.20)

Then g \in H_\infty(C_+) and satisfies \|g\|_{\infty+} = \|f\|_\infty. However, not every function in H_\infty(C_+) can be generated in this manner. For example, g(s) = e^{-s} belongs to H_\infty(C_+), but is not of the form (6.2.20) for any f \in H_\infty. Note that H_\infty is a subset of H_p for all p < \infty, and indeed H_p is a subset of H_q whenever p > q; this follows from the compactness of the interval [0, 2\pi]. However, H_p(C_+) and H_q(C_+) are distinct sets whenever p \neq q, with neither being a subset of the other.
There is however an interesting relationship, which is given next.
Fact 6.2.3 Suppose f \in H_p, for some p \in [1, \infty), and define g by (6.2.20). Let

h(s) = \frac{2^{1/p} \, g(s)}{(1+s)^{2/p}} . \quad (6.2.21)

Then h \in H_p(C_+) and \|h\|_{p+} = \|f\|_p.

Proof. By definition,

\|f\|_p^p = \frac{1}{2\pi} \int_0^{2\pi} |f(e^{j\theta})|^p \, d\theta , \quad (6.2.22)

while

\|h\|_{p+}^p = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{2 \, |g(j\omega)|^p}{1 + \omega^2} \, d\omega = \frac{1}{2\pi} \int_0^{2\pi} |f(e^{j\theta})|^p \, d\theta = \|f\|_p^p ,

where the second equality follows from the change of variables e^{j\theta} = (j\omega - 1)/(j\omega + 1), for which d\theta = 2 \, d\omega / (1 + \omega^2). \Box
Let H_2^-[0, 2\pi] denote the orthogonal complement of H_2^+[0, 2\pi] in L_2[0, 2\pi]. Then every function f \in L_2 can be expressed uniquely in the form f = f_+ + f_-, where f_+ \in H_2^+ and f_- \in H_2^- (we drop the [0, 2\pi] for brevity). This splitting up has a very intuitive interpretation.
The collection of functions \{ e^{jn\theta}, n \in \mathbb{Z} \} forms an orthonormal basis for L_2, so that every f \in L_2 has the Fourier series

f(e^{j\theta}) = \sum_{n=-\infty}^{\infty} f_n e^{jn\theta} . \quad (6.2.26)
Now H_2^+ is the span of the functions \{ e^{jn\theta}, n \in \mathbb{Z}_+ \}, while H_2^- is the span of the set \{ e^{jn\theta}, n < 0 \}. Hence,

f_+(e^{j\theta}) = \sum_{n=0}^{\infty} f_n e^{jn\theta} , \quad f_-(e^{j\theta}) = \sum_{n=-\infty}^{-1} f_n e^{jn\theta} . \quad (6.2.27)
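The splitting f = f_+ + f_- of (6.2.27) amounts to masking Fourier modes, which the FFT makes concrete. A minimal numerical sketch; the example function is an assumption, chosen so that its two partial-fraction terms are exactly its analytic and coanalytic parts.

```python
import numpy as np

# Assumed example: f(z) = 2/(1 - 0.5 z) + 1/(z - 0.3) on the unit circle.
# The first term has only modes n >= 0, the second only modes n < 0.
N = 512
z = np.exp(2j * np.pi * np.arange(N) / N)
f = 2.0 / (1.0 - 0.5 * z) + 1.0 / (z - 0.3)

coeffs = np.fft.fft(f) / N            # Fourier coefficients f_n (aliased)
modes = np.fft.fftfreq(N, d=1.0 / N)  # mode indices 0..N/2-1, -N/2..-1
plus = coeffs.copy()
plus[modes < 0] = 0                   # keep only nonnegative modes
f_plus = np.fft.ifft(plus) * N        # reconstruct f_+ on the circle

assert np.allclose(f_plus, 2.0 / (1.0 - 0.5 * z), atol=1e-8)
print("projection onto H2+ recovers the analytic part")
```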
Note that the function z \mapsto z^{-1} f_-(z^{-1}) belongs to H_2. Such a function is called coanalytic. In case f(\cdot) is rational and has no poles on the unit circle, the above decomposition is even more familiar: Expand f into partial fractions; then f_+ (resp. f_-) collects the terms whose poles lie outside (resp. inside) the unit circle.

6.3 FILTERING

The filtering problem considered here is to minimize the weighted sensitivity measure

J = \| W_1 (I + PC)^{-1} W_2 \|_{2+} , \quad (6.3.1)

over all C \in S(P). At this stage, several assumptions are made to facilitate a solution. First, note that J might not even be defined for all C \in S(P), because W_1 (I + PC)^{-1} W_2 might not be strictly proper. To get around this problem, it is assumed that either W_1 or W_2 is strictly proper. Next, it is assumed that W_1, W_2 are square, and that |W_1(s)| \neq 0, |W_2(s)| \neq 0 for all s \in C_+. Finally, for technical reasons, it is assumed that the plant P has at least as many inputs as outputs. This is the so-called square or fat plant case; the case of "tall" plants is discussed in Section 6.6.
As pointed out in Section 6.1, this problem can be reformulated in a more manageable form. Let (N, D), (\tilde D, \tilde N) be an r.c.f. and an l.c.f. of P, and let \tilde X, \tilde Y \in M(S) be any solutions of \tilde N \tilde X + \tilde D \tilde Y = I. Then, by Theorem 5.2.1, every C \in S(P) is of the form (\tilde X + DR)(\tilde Y - NR)^{-1} for some R \in M(S) such that |\tilde Y - NR| \neq 0. Moreover, the corresponding weighted sensitivity measure equals

J(R) = \| W_1 (\tilde Y - NR) \tilde D W_2 \|_2 = \| W_1 \tilde Y \tilde D W_2 - W_1 N R \tilde D W_2 \|_2 . \quad (6.3.2)
Hence, the problem is to minimize J(R) by a suitable choice of R \in M(S).^2 This problem has a very nice geometrical interpretation. As R varies over M(S), the product W_1 N R \tilde D W_2 generates a subspace of M(H_2(C_+)). To minimize J(\cdot), one has to find the closest point in this subspace to W_1 \tilde Y \tilde D W_2. Now, since at least one of the matrices W_1 or W_2 is strictly proper, it turns out that the subspace \{ W_1 N R \tilde D W_2 : R \in M(S) \} is not closed. As a result, there may not exist a closest point in this subspace to W_1 \tilde Y \tilde D W_2. In other words, the infimum of J(\cdot) is not attained by any R, except in trivial cases. It is nevertheless possible to compute the infimum of J(\cdot), and to construct a family

^2 The constraint that |\tilde Y - NR| \neq 0 is ignored since it is satisfied by almost all R (see Lemma 5.2.4).
(R_\epsilon)_{\epsilon > 0} of suboptimal elements of M(S) along which J approaches this infimum. The key tool is the following model-matching result.

Lemma 6.3.1 Suppose F, G, H \in M(S), that G has full row rank^3 with an inner–outer factorization G = G_i G_o in which G_o has a right inverse in M(S), and that H has an inner–outer factorization H = H_o H_i in which H_o is a unit. Define J_m(R) = \|F - GRH\|_2, and let A^*(s) denote A^T(-s).^4 Then

\min_R J_m(R) = \| (G_i^* F H_i^*)_- \|_2 , \quad (6.3.4)

and this minimum is attained by any R satisfying

G_o R H_o = (G_i^* F H_i^*)_+ . \quad (6.3.5)
Proof. First of all, observe that the equation G_o R H_o = A has a solution for R corresponding to every A \in M(S). The reason is that G_o has a right inverse and H_o is a unit.
To prove the lemma, note that G_i^*(s) G_i(s) = I and H_i(s) H_i^*(s) = H_i^*(s) H_i(s) = I \; \forall s, since both G_i and H_i are inner matrices. In particular,

\| G_i A \|_2 = \| A \|_2 , \quad (6.3.6)

where the norm on the left is the H_2-norm or the L_2-norm, while the norm on the right side is strictly the L_2-norm (since A need not be analytic); multiplication on the right by H_i is also norm-preserving.
Using this fact leads to

(J_m(R))^2 = \| F - GRH \|_2^2 = \| F - G_i G_o R H_o H_i \|_2^2 = \| G_i^* F H_i^* - G_o R H_o \|_2^2 . \quad (6.3.7)
If G_i^* F H_i^* is decomposed as (G_i^* F H_i^*)_+ + (G_i^* F H_i^*)_-, then by orthogonality,

(J_m(R))^2 = \| (G_i^* F H_i^*)_- \|_2^2 + \| (G_i^* F H_i^*)_+ - G_o R H_o \|_2^2 \ge \| (G_i^* F H_i^*)_- \|_2^2 . \quad (6.3.8)

On the other hand, equality can be achieved in (6.3.8) by choosing R to satisfy (6.3.5). \Box
^3 This means that G has at least as many columns as rows.
^4 In particular, if s = j\omega, then A^*(j\omega) is the conjugate transpose of A(j\omega), so that this notation is consistent with earlier usage.
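The orthogonality argument in the proof of Lemma 6.3.1 can be illustrated numerically in the scalar case: for an inner b, \|f - br\|_2^2 splits as \|(b^* f)_-\|_2^2 + \|(b^* f)_+ - r\|_2^2, so the minimum over r is the norm of the coanalytic part of b^* f. A sketch with assumed data b(z) = z^2, f(z) = 1/(1 - 0.4z):

```python
import numpy as np

N = 512
z = np.exp(2j * np.pi * np.arange(N) / N)
b = z ** 2                      # inner: |b| = 1 on the circle
f = 1.0 / (1.0 - 0.4 * z)       # f = sum 0.4^n z^n

def split(g):
    """Split boundary samples into (analytic, coanalytic) Fourier parts."""
    c = np.fft.fft(g) / N
    idx = np.fft.fftfreq(N, d=1.0 / N)
    gp = c.copy(); gp[idx < 0] = 0
    gm = c - gp
    return np.fft.ifft(gp) * N, np.fft.ifft(gm) * N

def l2norm(g):
    return np.sqrt(np.mean(np.abs(g) ** 2))

g = np.conj(b) * f              # b* f on the unit circle
g_plus, g_minus = split(g)

# the optimal r is the analytic part of b* f
r_opt = g_plus
J_min = l2norm(f - b * r_opt)
assert np.isclose(J_min, l2norm(g_minus), atol=1e-8)

# (b* f)_- keeps the first two Taylor coefficients of f, so J_min^2 = 1 + 0.16
assert np.isclose(J_min ** 2, 1.16, atol=1e-6)
print(round(J_min ** 2, 4))  # 1.16
```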
If we attempt to apply Lemma 6.3.1 to minimize the function J(\cdot) of (6.3.2), we run into the difficulty that the functions G = W_1 N and H = \tilde D W_2 may not satisfy the hypotheses of Lemma 6.3.1. In particular, if P(j\omega) has less than full row rank for some \omega, then so does N(j\omega) (i.e., G(j\omega)). Similarly, any j\omega-axis poles of P correspond to zeros of |\tilde D| and of |H|. Finally, since W_1 or W_2 is assumed to be strictly proper, either |G(\infty)| = 0 or |H(\infty)| = 0. To get around these difficulties, the following modification of Lemma 6.3.1 is required.
Lemma 6.3.2 Suppose F, G, H \in M(S), and suppose in addition that

(i) G has the form G = G_1 D_g, where G_1 \in M(S) has full row rank at all points on the j\omega-axis and at infinity, while D_g \in M(S) is a diagonal matrix D_g = \mathrm{Diag}\,\{ d_{g1}, \ldots, d_{gn} \}, where the only C_{+e}-zeros of d_{gi} (if any) are at infinity or on the j\omega-axis.

(ii) H has the form H = D_h H_1, where H_1 \in M(S) is square and nonsingular at all points on the j\omega-axis and at infinity, while D_h \in M(S) is a diagonal matrix of the form D_h = \mathrm{Diag}\,\{ d_{h1}, \ldots, d_{hm} \}, where the only C_{+e}-zeros of d_{hi} (if any) are at infinity or on the j\omega-axis.

Under these conditions,

\inf_{R \in M(S)} \| F - GRH \|_2 = \inf_{R \in M(S)} \| F - G_1 R H_1 \|_2 . \quad (6.3.9)
Proof. As shown by Lemma 6.3.1, the infimum on the right side of (6.3.9) is actually attained. Let R_0 \in M(S) be an element that achieves this minimum. Now every matrix of the form GRH is also of the form G_1 R_1 H_1 (just let R_1 = D_g R D_h). Thus, the infimum on the left side of (6.3.9) can be no smaller than that on the right side. Hence, in order to prove the lemma, it is enough to construct a family of functions (R_\epsilon)_{\epsilon > 0} in M(S) such that \| F - G R_\epsilon H \|_2 \to \| F - G_1 R_0 H_1 \|_2 as \epsilon \to 0^+.
Suppose a \in S has the property that a(s) \neq 0 whenever \mathrm{Re}\, s > 0, so that the only possible C_{+e}-zeros of a are at infinity or on the j\omega-axis. (Note that all the d_{gi}, d_{hi} are of this type.) To be specific, let j\omega_1, \ldots, j\omega_l denote the distinct finite j\omega-axis zeros of a, with multiplicities m_1, \ldots, m_l respectively, and let n denote the multiplicity of the zero of a at infinity. It is claimed that there exists a family of functions (b_\epsilon)_{\epsilon > 0} in S such that

(i) b_\epsilon(j\omega) \to 1 uniformly on all compact subsets of (-\infty, \infty) that do not contain any of the \omega_i's, as \epsilon \to 0^+, and

(ii) b_\epsilon / a is a unit of S for each \epsilon > 0.
To see this, note that a(s) is of the form

a(s) = u(s) \prod_{i=1}^{l} \left( \frac{s - j\omega_i}{s + 1} \right)^{m_i} \left( \frac{1}{s + 1} \right)^{n} , \quad (6.3.10)

for some unit u \in S. Hence,

b_\epsilon(s) = \prod_{i=1}^{l} \left( \frac{s - j\omega_i}{s + \epsilon - j\omega_i} \right)^{m_i} \left( \frac{1}{\epsilon s + 1} \right)^{n} \quad (6.3.11)

satisfies the requirements (i) and (ii).
In view of this, it follows that there exist families B_g^\epsilon, B_h^\epsilon of diagonal matrices in M(S) such that

(i) D_g^{-1} B_g^\epsilon is unimodular for all \epsilon > 0,

(ii) B_h^\epsilon D_h^{-1} is unimodular for all \epsilon > 0, and

(iii) B_g^\epsilon(j\omega) \to I, B_h^\epsilon(j\omega) \to I as \epsilon \to 0^+, uniformly on all compact subsets of (-\infty, \infty) that do not contain the zeros of |D_g|, |D_h|.
Now define R_\epsilon = D_g^{-1} B_g^\epsilon R_0 B_h^\epsilon D_h^{-1}, where R_0 \in M(S) attains the infimum on the right side of (6.3.9). Then

\| F - G R_\epsilon H \|_2 = \| F - G_1 D_g D_g^{-1} B_g^\epsilon R_0 B_h^\epsilon D_h^{-1} D_h H_1 \|_2 = \| F - G_1 B_g^\epsilon R_0 B_h^\epsilon H_1 \|_2 . \quad (6.3.12)

The proof of the lemma is complete if it can be shown that the last quantity in (6.3.12) approaches \| F - G_1 R_0 H_1 \|_2 as \epsilon \to 0^+. But this follows from the limiting behavior of B_g^\epsilon, B_h^\epsilon as described in (iii) above. \Box
We are now ready to tackle the general problem of minimizing J(\cdot) as defined in (6.3.2). Let G_s denote the Smith form of W_1 N and H_s the Smith form of \tilde D W_2, and suppose

U_g G_s V_g = W_1 N , \quad U_h H_s V_h = \tilde D W_2 , \quad (6.3.13)

where U_g, V_g, U_h, V_h \in U(S). Then

J(R) = \| W_1 \tilde Y \tilde D W_2 - W_1 N R \tilde D W_2 \|_2 = \| W_1 \tilde Y \tilde D W_2 - U_g G_s V_g R U_h H_s V_h \|_2 . \quad (6.3.14)
Since V_g, U_h are unimodular, one can replace V_g R U_h by another free parameter, say R_1. Moreover, G_s (and H_s) can be expressed as a product of two diagonal matrices E_g and D_g (resp. E_h and D_h) such that |E_g| is nonzero at all points on the j\omega-axis and at infinity, while |D_g(s)| \neq 0 whenever s \in C_+. Now, defining

G_1 = U_g E_g , \quad H_1 = E_h V_h , \quad (6.3.15)

shows that

J(R) = \| W_1 \tilde Y \tilde D W_2 - G_1 D_g R_1 D_h H_1 \|_2 , \quad (6.3.16)

where R_1 = V_g R U_h. Hence, by Lemma 6.3.2,

\inf_R J(R) = \inf_R \| W_1 \tilde Y \tilde D W_2 - G_1 R H_1 \|_2 , \quad (6.3.17)

which can be computed using Lemma 6.3.1.
The situation is much cleaner in the case of discrete-time systems. First, since the interval [0, 2\pi] is compact, every H_\infty-function is also in H_2. Further, a strictly causal plant, which is the discrete analog of a strictly proper plant, is one whose transfer matrix is zero at z = 0. As a result, in the formulation of the discrete-time filtering problem, the weighting matrices W_1 and W_2 can be any rational H_\infty-matrices and need not be strictly causal. Moreover, an optimal controller always exists and the approximation procedure of Lemma 6.3.2 is not needed.
6.4 SENSITIVITY MINIMIZATION: SCALAR CASE
Because the sensitivity minimization problem is much more involved technically than the filtering problem, the discussion of the former is divided into three parts, namely the scalar case, the fat plant case, and the general case. Accordingly, the problem studied in this section is the following: Suppose p is a given scalar plant, and w is a given element of S. The objective is to find a c \in S(p) that minimizes the weighted sensitivity measure

J = \sup_{\omega \in \mathbb{R}} \left| \frac{w(j\omega)}{1 + p(j\omega) c(j\omega)} \right| . \quad (6.4.1)

As in the previous section, one can transform this problem into an affine minimization problem. Let (n, d) be a coprime factorization of p, and let x, y \in S be such that xn + yd = 1. Then the problem becomes: Minimize

J(r) = \sup_{\omega \in \mathbb{R}} | [(y - rn) d w](j\omega) | = \| (y - rn) d w \|_\infty , \quad (6.4.2)

by suitable choice of r \in S. Moreover, it can be assumed without loss of generality that the weighting function w is outer. To see this, factor w in the form w = w_i w_o, where w_i and w_o are inner and outer, respectively. Then, since multiplication by the inner function w_i is norm-preserving, we have J(r) = \| (y - rn) d w_o \|_\infty.
In order to take full advantage of the available results on H_p-spaces, a bilinear transformation is now used to map the set S into H_\infty: given f \in S, define \tilde f(z) = f((1+z)/(1-z)), so that \tilde f \in R_\infty. Moreover,

\| \tilde f \|_\infty = \sup_{\theta \in [0,2\pi]} | \tilde f(e^{j\theta}) | = \sup_{\omega \in \mathbb{R}} | f(j\omega) | = \| f \|_S , \quad (6.4.4)

since the unit circle maps onto the imaginary axis plus the point at infinity. Conversely, suppose g \in R_\infty; then the function s \mapsto g((s-1)/(s+1)) belongs to S, and its S-norm equals \|g\|_\infty. In this way one can set up a one-to-one norm-preserving correspondence between functions in S and functions in R_\infty. Under this correspondence, the problem of minimizing (6.4.2) becomes: find an \tilde r \in R_\infty that minimizes

J = \| (\tilde y - \tilde r \tilde n) \tilde d \tilde w \|_\infty . \quad (6.4.6)

At this stage we drop the tilde and state the problem in H_\infty rather than in R_\infty: given rational f, g \in H_\infty, find a rational function r \in H_\infty that minimizes

J(r) = \| f - gr \|_\infty . \quad (6.4.7)

The definitions of f and g when J is as in (6.4.6) are obvious. The problem of minimizing J has a very simple geometric interpretation: Let gH_\infty = \{ gr : r \in H_\infty \}. Then the problem is to find the closest point to f in the subspace gH_\infty.
It turns out that the subspace gH
there
exists a unique closest point in gH
f gr
, (6.4.8)
then the inmum in (6.4.8) may or may not be attained, depending on f . In the original context,
this means that in general there does not exist a single compensator that minimizes the weighted
sensitivity measure (6.4.1), in the practically important case where the plant is strictly proper (or has
j-axis zeros).
In view of the above discussion, one would nevertheless like to know the value of J₀, as well as a sequence {rᵢ} of functions in H∞ such that ‖f − g rᵢ‖∞ → J₀. For this purpose, factor g in the form g = bvu, where b is a finite Blaschke product containing the zeros of g in the open unit disc, v ∈ H∞ contains the zeros of g on the unit circle, and u is a unit of H∞. Then

J₀ = inf_{r∈H∞} ‖f − bvur‖∞ = inf_{r∈H∞} ‖f − bvr‖∞ ,   (6.4.9)

since the unit factor u can be absorbed into the free parameter r. Now consider the modified problem: Minimize ‖f − rb‖∞ by a suitable choice of r ∈ H∞. Define

J_m = inf_{r∈H∞} ‖f − rb‖∞ ,   (6.4.10)
168   6. FILTERING AND SENSITIVITY MINIMIZATION
as the value of the modified problem. In view of the earlier discussion, the infimum in (6.4.10) is actually attained, since b has no zeros on the unit circle. Now, since every function of the form bvr is also a multiple of b, it is clear that J_m ≤ J₀. The next lemma gives conditions under which equality holds.
Lemma 6.4.1  Let e^{jθ₁}, …, e^{jθ_l} denote the distinct zeros of v on the unit circle. Then J₀ = J_m if and only if |f(e^{jθᵢ})| ≤ J_m ∀i.
Proof.  "only if"  Since v(e^{jθᵢ}) = 0 ∀i, we have (bvr)(e^{jθᵢ}) = 0 ∀i, ∀r ∈ H∞.⁶ Hence, (f − bvr)(e^{jθᵢ}) = f(e^{jθᵢ}) ∀i, and as a result

‖f − bvr‖∞ = sup_{θ∈[0,2π]} |[f − bvr](e^{jθ})|
           ≥ max_i |[f − bvr](e^{jθᵢ})|
           = max_i |f(e^{jθᵢ})| , ∀r ∈ H∞ .   (6.4.11)

Combining (6.4.11) with the hypothesis that J₀ = J_m leads to

J_m = J₀ = inf_{r∈H∞} ‖f − bvr‖∞ ≥ max_i |f(e^{jθᵢ})| .   (6.4.12)
"if"  Select an r₀ ∈ H∞ such that J_m = ‖f − r₀b‖∞; such an r₀ exists, by earlier remarks. Let f₀ = f − r₀b. Next, introduce the function

h_{θ,ε}(z) = (z − e^{jθ})/[z − (1 + ε)e^{jθ}] ,   (6.4.13)

and define

h_ε(z) = ∏_{i=1}^{l} [h_{θᵢ,ε}(z)]^{mᵢ} ,   (6.4.14)

where mᵢ is the multiplicity of e^{jθᵢ} as a zero of v(·). Now observe that (i) h_ε/v is a unit of H∞ for each ε > 0, and (ii) h_ε(e^{jθ}) → 1 uniformly on all compact subsets of [0, 2π] that do not contain any of the θᵢ's, as ε → 0⁺. Consider the family (r_ε)_{ε>0} in H∞ defined by

r_ε = r₀ h_ε/v ; then ‖f − bvr_ε‖∞ ≥ J₀ for each ε > 0 .   (6.4.15)
On the other hand, f − bvr_ε = f − br₀h_ε, and |(f − br₀h_ε)(e^{jθ})| → |f₀(e^{jθ})| ≤ J_m uniformly on all compact subsets of [0, 2π] that do not contain any of the θᵢ's, while |(f − bvr_ε)(e^{jθᵢ})| = |f(e^{jθᵢ})| ≤ J_m ∀i by hypothesis. This means that

lim sup_{ε→0⁺} ‖f − bvr_ε‖∞ ≤ J_m .   (6.4.16)
⁶ Strictly speaking, r(e^{jθᵢ}) is undefined for a general r ∈ H∞; the statement makes sense here because bvr extends continuously to the points e^{jθᵢ}, where it vanishes along with v.

Since ‖f − bvr_ε‖∞ ≥ J₀ ≥ J_m for every ε > 0, combining this with (6.4.16) shows that J₀ = J_m and that

‖f − bvr_ε‖∞ → J_m as ε → 0⁺ .   (6.4.17)

This completes the proof.  □
In the above proof, it is easy to see that there is nothing special about the function h_ε of (6.4.14): any family of functions h_ε will do, provided that (i) h_ε/v is a unit of H∞ for each ε > 0, (ii) h_ε(e^{jθᵢ}) = 0 ∀i, and (iii) h_ε(e^{jθ}) → 1 uniformly on all compact subsets of [0, 2π] that do not contain any of the θᵢ's.
Note that if f(e^{jθᵢ}) = 0 ∀i, then the condition of Lemma 6.4.1 is automatically satisfied and J₀ = J_m.

Example 6.4.2  Consider the plant

p(s) = (s − 1)/[(s − 3)(s² + 1)] ,

together with the weighting function w(s) = 1/(s + 1). Going back to (6.4.6), we have f = ỹ d̃ w̃ and g = ñ d̃ w̃, where ñ, d̃ are the numerator and denominator polynomials of the transformed plant p̃, respectively. Under the bilinear transformation s = (1 + z)/(1 − z), one obtains

ñ(z) = −z(1 − z)² , d̃(z) = 2(1 − 2z)(z² + 1) .

A particular solution of x̃ñ + ỹd̃ = 1 can be obtained by Euclidean division and back substitution, namely

x̃(z) = −6z² − 13/2 , ỹ(z) = (1/4)(6z² − 9z + 2) .

Finally,

w̃(z) = (1 − z)/2 .

Hence, J(r) = ‖f − rg‖∞, where

f(z) = (w̃ ỹ d̃)(z) = (1/4)(z − 1)(2z − 1)(z² + 1)(6z² − 9z + 2) ,
g(z) = (w̃ ñ d̃)(z) = −z(1 − z)³(1 − 2z)(z² + 1) .
Lemma 6.4.1 provides conditions under which, by solving the minimization problem in (6.4.10) (which does have a solution), it is possible to determine the value of the infimum in (6.4.9). Actually, it turns out that it is possible to compute J_m (the value of the problem in (6.4.10)) without actually computing the r₀ that attains the minimum. If the conditions of Lemma 6.4.1 hold, this J_m is also the infimum in (6.4.9). In this way, it is possible to obtain rapidly a figure of merit against which suboptimal choices of r in (6.4.9) can be compared. If the optimization problem in (6.4.10) is solved completely, i.e., if r₀ is also determined, then a family r_ε such that ‖f − g r_ε‖∞ → J_m can be computed as in the proof of Lemma 6.4.1.

The next step then is to give a procedure for computing J_m. We begin with a result that is now obvious, but has some practical significance.
Lemma 6.4.3  Suppose the plant p is strictly proper and stable, and has no zeros in the open right half-plane; suppose also that the weighting function w is stable, strictly proper, and vanishes at all jω-axis zeros (if any) of p. Then

inf_{c∈S(p)} sup_{ω∈R} | [w/(1 + pc)](jω) | = 0 .   (6.4.18)
Remarks 6.4.4  In other words, Lemma 6.4.3 states that if the plant is strictly proper, stable, and minimum phase,⁷ then the weighted sensitivity measure can be made arbitrarily small, provided the weighting function is also stable and vanishes at all RHP zeros of the plant.

⁷ Normally, the term "minimum phase" implies that the plant has no zeros in the closed right half-plane; here the usage is slightly different.
Proof. In this case d̃ is a unit, and all zeros of ñ in the closed unit disc are in fact on the unit circle. As a result, the finite Blaschke product b equals 1, and the condition of Lemma 6.4.1 is satisfied. Finally, it is clear that

inf_r ‖f − rb‖∞ = inf_r ‖f − r‖∞ = 0 ,   (6.4.19)

corresponding to the choice r = f.  □
In the general case, the infimum J_m in (6.4.10) can be computed by associating, with each h ∈ H∞, the multiplication operator T_h : H₂ → H₂ defined by T_h g = hg. Since ‖T_h‖ = ‖h‖∞ for every h ∈ H∞, it follows that

J_m = inf_{r∈H∞} ‖T_{f−rb}‖ .   (6.4.20)

Let bH₂ denote the range of the operator T_b; i.e.,

bH₂ = {bg : g ∈ H₂} ,   (6.4.21)

and let N denote the orthogonal complement of bH₂. Thus,

N = {h ∈ H₂ : ⟨h, bg⟩ = 0 ∀g ∈ H₂} .   (6.4.22)

Finally, let Π denote the orthogonal projection mapping H₂ onto N. Now, as r varies over H∞, the operator T_{f−rb} ranges over various operators. However, the operator ΠT_{f−rb} is the same for all r, and in fact equals ΠT_f. To see this, let g ∈ H₂ be arbitrary; then

ΠT_{f−rb} g = Π[(f − rb)g] = Π(fg) − Π(rbg)
            = Π(fg) − Π(brg) = Π(fg) = ΠT_f g ,   (6.4.23)

since brg ∈ bH₂ and hence its projection onto N is zero. Using this idea, one can prove the following result (see Sarason [86] for the proof):
Fact 6.4.5  Define the linear map A : N → N to be the restriction of ΠT_f to N. Then

inf_{r∈H∞} ‖f − rb‖∞ = ‖A‖ .   (6.4.24)

Moreover, if f(e^{jθ}) is a continuous function of θ, then there is a unique function r₀ ∈ H∞ that attains the infimum in (6.4.24). Let f₀ = f − r₀b, and let g ∈ N be a function such that ‖Ag‖₂ = ‖A‖ ‖g‖₂. Then f₀/‖A‖ is inner, and

f₀(z) = (Ag)(z) / g(z) .   (6.4.25)
Fact 6.4.5 is very useful because it is possible to compute all the quantities involved using only linear algebra. First of all, the subspace N is finite-dimensional, and in fact its dimension equals the number of zeros of the Blaschke product b (counted according to multiplicity). Next, one can readily find a basis for N, which can be converted to an orthonormal basis using the Gram-Schmidt procedure. Let {v₁, …, v_n} denote this orthonormal basis. Then one can compute a matrix representation of the linear operator A with respect to this basis; call it A₀. Now (6.4.24) states that the infimum of ‖f − rb‖∞ is precisely the largest singular value of A₀. Further, if p = [p₁ ⋯ p_n]′ is an eigenvector of A₀*A₀ corresponding to the eigenvalue (σ̄(A₀))², then

f₀(z) = [ Σ_{i=1}^{n} (A₀p)ᵢ vᵢ(z) ] / [ Σ_{i=1}^{n} pᵢ vᵢ(z) ] .   (6.4.26)

Let us elaborate each of these points in turn.
Lemma 6.4.6  Let b be a finite Blaschke product, and define N as in (6.4.22). Let {α₁, …, α_n} denote the distinct zeros of b, with multiplicities {m₁, …, m_n}, respectively. Then a (nonorthonormal) basis for N is given by the functions

z^{k−1} / (1 − ᾱᵢz)^k , k = 1, …, mᵢ ; i = 1, …, n .   (6.4.27)
The proof follows readily from the following lemma.
Lemma 6.4.7  Suppose α is a complex number with |α| < 1, k is an integer greater than or equal to 1, and define

φ(z) = z^{k−1} / (1 − ᾱz)^k .   (6.4.28)

Then φ ∈ H₂; moreover, for any g ∈ H₂,

⟨φ, g⟩ = g^{(k−1)}(α) / (k − 1)! .   (6.4.29)
Proof. Clearly φ ∈ H₂, since its only singularity is at 1/ᾱ, which lies outside the closed unit disc. Next,

⟨φ, g⟩ = (1/2π) ∫₀^{2π} φ̄(e^{jθ}) g(e^{jθ}) dθ
       = (1/2π) ∫₀^{2π} [e^{−jθ(k−1)} / (1 − αe^{−jθ})^k] g(e^{jθ}) dθ
       = (1/2π) ∫₀^{2π} [e^{jθ} / (e^{jθ} − α)^k] g(e^{jθ}) dθ
       = (1/2πj) ∮_C [g(z) / (z − α)^k] dz ,   (6.4.30)

where C denotes the unit circle oriented counterclockwise (note that z = e^{jθ} on C, so that dz = je^{jθ} dθ). Finally, by the Cauchy integral formula, the last term in (6.4.30) equals g^{(k−1)}(α)/(k − 1)!.  □
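For k = 1, identity (6.4.29) reads ⟨φ, g⟩ = g(α); that is, φ is the reproducing kernel of H₂ at α. Since the H₂ inner product of two power series is the sum Σ φ̄ₙgₙ of their Taylor-coefficient products, this can be checked in a few lines (the particular α and g below are arbitrary illustrations):

```python
# k = 1 case of (6.4.29): phi(z) = 1/(1 - conj(alpha) z) has Taylor
# coefficients conj(alpha)^n, so <phi, g> = sum_n alpha^n g_n = g(alpha).
alpha = 0.3 + 0.4j
g_coeffs = [2.0, -1.0, 0.5, 3.0]          # g(z) = 2 - z + 0.5 z^2 + 3 z^3

phi_coeffs = [alpha.conjugate() ** n for n in range(len(g_coeffs))]
inner = sum(p.conjugate() * gn for p, gn in zip(phi_coeffs, g_coeffs))
g_at_alpha = sum(gn * alpha ** n for n, gn in enumerate(g_coeffs))
print(abs(inner - g_at_alpha))   # 0
```

Because g is a polynomial, the truncated coefficient sum is exact and the two sides agree to machine precision.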
Proof of Lemma 6.4.6.  The subspace N is defined by (6.4.22). Let g ∈ H₂ be arbitrary. Then bg ∈ H₂ has a zero of multiplicity mᵢ at αᵢ, for i = 1, …, n. Hence, if φᵢₖ denotes the function

φᵢₖ(z) = z^{k−1} / (1 − ᾱᵢz)^k ,   (6.4.31)

then, by Lemma 6.4.7,

⟨φᵢₖ, bg⟩ = (bg)^{(k−1)}(αᵢ) / (k − 1)! = 0 ,   (6.4.32)

whenever 1 ≤ k ≤ mᵢ. Thus, each of the functions φᵢₖ belongs to N. It is routine to verify that they are linearly independent. Finally, in order to show that they span N, it is enough to show that the orthogonal complement of the span of the φᵢₖ's is contained in bH₂. Accordingly, suppose h ∈ H₂ is orthogonal to each φᵢₖ. Then, by Lemma 6.4.7, h has a zero of multiplicity mᵢ at αᵢ, for all i, and is hence of the form bg for some g ∈ H₂.  □
Once a basis has been found for N, it is easy to construct an orthonormal basis using the Gram-Schmidt procedure. Let us drop the double subscript notation, let n denote the dimension of the subspace N, and let {φ₁, …, φ_n} denote the basis for N given by Lemma 6.4.6. Let Γ ∈ C^{n×n} denote the associated Grammian matrix; i.e.,

Γᵢⱼ = ⟨φᵢ, φⱼ⟩ .   (6.4.33)

Note that the constants Γᵢⱼ can be readily computed using Lemma 6.4.7. For instance, if b has only simple zeros at α₁, …, α_n, then Lemma 6.4.6 shows that the set of functions

φᵢ(z) = 1 / (1 − ᾱᵢz) , i = 1, …, n ,   (6.4.34)

is a basis for N. Next, by Lemma 6.4.7,

Γᵢⱼ = ⟨φᵢ, φⱼ⟩ = φⱼ(αᵢ) = 1 / (1 − ᾱⱼαᵢ) .   (6.4.35)

Since Γ is the Grammian of a linearly independent set, it is Hermitian and positive definite. Hence, it is possible to factor Γ in the form B*B, where B is nonsingular. Let X = B⁻¹ and define

vⱼ(z) = Σ_{i=1}^{n} φᵢ(z) xᵢⱼ , j = 1, …, n .   (6.4.36)

Then {v₁, …, v_n} is an orthonormal basis for N. If the factor B is selected to be upper triangular, then this is just the Gram-Schmidt procedure.
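The factorization step can be sketched numerically: a Cholesky factorization Γ = LL* with L lower triangular yields the upper-triangular factor B = L*, and X = B⁻¹ then orthonormalizes the basis, since X*ΓX = I. The zeros used below (0 and 0.5) are merely illustrative:

```python
import numpy as np

# Grammian for two simple zeros, via (6.4.35): Gamma_ij = 1/(1 - conj(alpha_j) alpha_i)
alphas = [0.0, 0.5]
Gamma = np.array([[1.0 / (1.0 - np.conj(a2) * a1) for a2 in alphas]
                  for a1 in alphas])

# Gamma = B* B with B upper triangular (Cholesky returns the lower factor L)
B = np.linalg.cholesky(Gamma).conj().T
X = np.linalg.inv(B)

# The change of basis X makes the basis orthonormal: X* Gamma X = I
err = np.abs(X.conj().T @ Gamma @ X - np.eye(2)).max()
print(err)   # ~0
```

Choosing the upper-triangular factor reproduces Gram-Schmidt exactly: each new orthonormal function is a combination of the current basis function and the previous ones only.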
The next step is to obtain a matrix representation A₀ of the operator A = ΠT_f restricted to N, with respect to the basis {vᵢ}. This is given by

aᵢⱼ = ⟨vᵢ, Avⱼ⟩ = ⟨vᵢ, ΠT_f vⱼ⟩ = ⟨vᵢ, f vⱼ⟩ .   (6.4.37)

(The projection Π can be dropped in the last step because vᵢ ∈ N and Π is self-adjoint.)
Now, by Lemma 6.4.7, a quantity of the form ⟨φᵢ, fφⱼ⟩ is very easy to compute. Hence, it is desirable to expand aᵢⱼ as a summation of terms of the form ⟨φᵢ, fφⱼ⟩ using (6.4.36). Let Φ denote the matrix in C^{n×n} whose ij-th element is ⟨φᵢ, fφⱼ⟩. Then routine computations show that

A₀ = X*ΦX .   (6.4.38)
Now Fact 6.4.5 asserts that the infimum J_m in (6.4.10) equals σ̄(A₀). Further, if p ∈ Cⁿ is a column vector such that ‖A₀p‖₂ = σ̄(A₀)‖p‖₂, then the optimal function f₀ is given by (6.4.26).
Finally, as is the case with the filtering problem, the solution of the sensitivity minimization problem is a lot cleaner in the discrete-time case than in the continuous-time case. If p(z) is the transfer function of a strictly-causal plant, then p(0) = 0. If, in addition, p(·) has no poles or zeros on the unit circle, and the weighting function w(·) has no zeros on the unit circle, then an optimal controller always exists.

This section is concluded with an illustrative example.
Example 6.4.8  Consider once again the plant of Example 6.4.2, namely

p(s) = (s − 1)/[(s − 3)(s² + 1)] ,

together with the weighting function

w(s) = 1/(s + 1) .

By previous discussion, the sensitivity minimization problem reduces to finding an r ∈ H∞ that minimizes ‖f − rb‖∞, where f and b are computed in Example 6.4.2. The first objective of this example is to show that it is unnecessary to compute f and b explicitly. Since b is the Blaschke product part of g = w̃ñd̃ and f = ỹd̃w̃, it follows that if z is a zero of b, then f(z) = w̃(z) or 0, depending on whether z is a zero of ñ (i.e., a plant zero) or a zero of d̃ (i.e., a plant pole); indeed, at a zero z₀ of ñ the identity x̃ñ + ỹd̃ = 1 gives (ỹd̃)(z₀) = 1, so that f(z₀) = w̃(z₀). In the present instance, f(0) = w̃(0) = w(1) = 0.5, f(0.5) = 0. All of this discussion is made more precise in the next section, where it is shown that the sensitivity minimization problem is one of interpolation by a function of minimum norm.

To compute the optimal function f₀, we first find a basis for the subspace N of (6.4.22). By Lemma 6.4.6, such a basis is given by

φ₁(z) = 1 , φ₂(z) = 1/(1 − 0.5z) .
From (6.4.35), the Grammian matrix is

Γ = [1  1 ; 1  4/3] ,

which can be factored as

Γ = [1  0 ; 1  1/√3] [1  1 ; 0  1/√3] = B*B .

Hence,

X = B⁻¹ = [1  −√3 ; 0  √3] ,
and an orthonormal basis for N is given by

v₁(z) = φ₁(z) = 1 , v₂(z) = √3 (φ₂(z) − φ₁(z)) = √3 z/(2 − z) .

Next we need to compute A₀ = X*ΦX, where Φᵢⱼ = ⟨φᵢ, fφⱼ⟩. From Lemma 6.4.7,

⟨φ₁, fφ₁⟩ = (fφ₁)(0) = f(0) = 0.5 ,  ⟨φ₁, fφ₂⟩ = (fφ₂)(0) = 0.5 ,
⟨φ₂, fφ₁⟩ = (fφ₁)(0.5) = 0 ,  ⟨φ₂, fφ₂⟩ = (fφ₂)(0.5) = 0 .

Therefore

A₀ = X*ΦX = [0.5  0 ; −0.5√3  0] .

It is routine to compute that σ̄(A₀) = 1, and that the associated singular vector p is [1 0]′, which leads to A₀p = [0.5  −0.5√3]′ and

f₀(z) = [(A₀p)₁v₁(z) + (A₀p)₂v₂(z)] / [p₁v₁(z) + p₂v₂(z)] = 0.5 − 1.5z/(2 − z) = (1 − 2z)/(2 − z) .
6.5 SENSITIVITY MINIMIZATION: FAT PLANT CASE
The sensitivity minimization problem for multivariable systems is much deeper than the corresponding problem for scalar systems. At present, a clean solution to the problem is available only in the case where the plant has at least as many inputs as outputs. Not all of the details of this solution are given in this section, as they require some background concepts from functional analysis not covered previously.
Suppose P is a given strictly proper plant with at least as many inputs as outputs, and that W₁, W₂ ∈ M(S) are weighting matrices such that W₁ has full row rank (as a rational matrix) and W₂ is square and nonsingular. The objective is to minimize the weighted sensitivity measure

J = ‖W₁(I + PC)⁻¹W₂‖ .   (6.5.1)
Let (N, D), (D̃, Ñ) be any r.c.f. and any l.c.f. of P, and let X̃, Ỹ ∈ M(S) be any solutions of ÑX̃ + D̃Ỹ = I. Then, as shown in Section 6.2, one can replace (6.5.1) by the equivalent problem of minimizing

J(R) = ‖W₁Ỹ D̃W₂ − W₁NR D̃W₂‖ ,   (6.5.2)
as the free parameter R varies over M(S). Now define rational H∞-matrices F, G, H by

F(z) = (W₁Ỹ D̃W₂)[(1 + z)/(1 − z)] ,   (6.5.3)
G(z) = (W₁N)[(1 + z)/(1 − z)] ,   (6.5.4)
H(z) = (D̃W₂)[(1 + z)/(1 − z)] .   (6.5.5)

Then the problem becomes one of minimizing

J(R) = ‖F − GRH‖∞ ,   (6.5.6)

by suitably choosing R from the set of rational H∞-matrices.
Since the plant is strictly proper, we have that G(1) = 0. As a result, the infimum of J is not attained in general. To deal with this situation, it is necessary to extend Lemma 6.4.1 to the matrix case. For this purpose, define

J₀ = inf_{R∈M(H∞)} ‖F − GRH‖∞ .   (6.5.7)

Suppose G and H have the following properties: G is of the form G₁g, where G₁ ∈ M(H∞), g ∈ H∞, and in addition G₁(e^{jθ}) has full row rank at all θ ∈ [0, 2π], and all zeros of g(·) are on the unit circle. Similarly, H is of the form H₁h, where H₁ ∈ M(H∞), h ∈ H∞, and in addition H₁(e^{jθ}) is square and nonsingular at all θ ∈ [0, 2π], and all zeros of h(·) are on the unit circle. This means in particular that if G(e^{jθ}) has less than full rank for some θ, then G(e^{jθ}) must in fact equal the zero matrix; similar remarks apply to H. Note that the assumptions on G and H are trivially satisfied in the scalar case. Now define

J_m = inf_{R∈M(H∞)} ‖F − G₁RH₁‖∞ .   (6.5.8)

It can be shown that the infimum in (6.5.8) is achieved, owing to the rank conditions on G₁ and H₁. The next result is a generalization of Lemma 6.4.1.
Lemma 6.5.1  Let e^{jθ₁}, …, e^{jθ_l} denote the distinct zeros of the product gh on the unit circle. Then J₀ = J_m if and only if ‖F(e^{jθᵢ})‖ ≤ J_m ∀i.
Proof. Since every matrix of the form GRH is also of the form G₁R̃H₁, it is easy to see that J_m ≤ J₀.

"only if"  Since (gh)(e^{jθᵢ}) = 0 ∀i, it follows that (GRH)(e^{jθᵢ}) = 0 ∀i, ∀R ∈ M(H∞). Hence, (F − GRH)(e^{jθᵢ}) = F(e^{jθᵢ}) ∀i, ∀R ∈ M(H∞), and as a result

‖F − GRH‖∞ = sup_{θ∈[0,2π]} ‖(F − GRH)(e^{jθ})‖
           ≥ max_i ‖(F − GRH)(e^{jθᵢ})‖
           = max_i ‖F(e^{jθᵢ})‖ , ∀R ∈ M(H∞) .   (6.5.9)

Combining (6.5.9) with the hypothesis that J₀ = J_m leads to

J_m = J₀ = inf_{R∈M(H∞)} ‖F − GRH‖∞ ≥ max_i ‖F(e^{jθᵢ})‖ .   (6.5.10)
"if"  Select an R₀ ∈ M(H∞) such that J_m = ‖F − G₁R₀H₁‖∞. Such an R₀ exists, by earlier remarks. Now recall the function (cf. (6.4.13))

h_{θ,ε}(z) = (z − e^{jθ}) / [z − (1 + ε)e^{jθ}] ,   (6.5.11)

and define

h_ε(z) = ∏_{i=1}^{l} [h_{θᵢ,ε}(z)]^{mᵢ} ,   (6.5.12)

where mᵢ is the multiplicity of e^{jθᵢ} as a zero of gh. Define the family (R_ε) in M(H∞) by R_ε = R₀h_ε/(gh). Then F − GR_εH = F − G₁R₀H₁h_ε, and arguing exactly as in the proof of Lemma 6.4.1, the hypothesis ‖F(e^{jθᵢ})‖ ≤ J_m ∀i implies that ‖F − G₁R₀H₁h_ε‖∞ → ‖F − G₁R₀H₁‖∞ = J_m as ε → 0⁺. Hence, J₀ = J_m.  □
To interpret Lemma 6.5.1 in the context of the sensitivity minimization problem, suppose the plant P has blocking zeros⁸ at s = jω₁, …, jω_n and at infinity, but has full row rank at all other points on the jω-axis. Suppose also that P has no jω-axis poles. Select the weighting matrices W₁, W₂ such that W₁ has the same blocking zeros as P on the extended jω-axis, but has full row rank at all other points on the jω-axis, while W₂(jω) is square and nonsingular at all points on the extended jω-axis. Then F(e^{jθᵢ}) = 0 ∀i and the condition ‖F(e^{jθᵢ})‖ ≤ J_m is automatically satisfied; as a result, J₀ = J_m. Moreover, if R₀ is a choice of R that attains the infimum in (6.5.8), then the family R_ε = R₀h_ε/(gh) yields a corresponding family of compensators whose weighted sensitivity ‖W₁(I + PC_ε)⁻¹W₂‖ → J_m as ε → 0⁺.
Let us therefore turn our attention to the problem of computing the infimum in (6.5.8) and an R that attains this infimum. As a first step, factorize G₁ and H₁ in the form G₁ = G_iG_o, H₁ = H_oH_i, where G_i, H_i are inner, G_o ∈ M(H∞) has a right inverse in M(H∞), and H_o ∈ M(H∞) is a unit. This is possible in view of the rank conditions on G₁ and H₁ (see Fact 6.2.2). Now, for any V ∈ M(H∞), the equation V = G_oRH_o has a solution for R in M(H∞). Hence,

inf_{R∈M(H∞)} ‖F − G₁RH₁‖∞ = inf_{V∈M(H∞)} ‖F − G_iVH_i‖∞ .   (6.5.13)
Next, let G_i^{adj}, H_i^{adj} denote the adjoint matrices of G_i, H_i, respectively, and note that these adjoint matrices are themselves inner. Let b = |G_i| · |H_i| (the product of the determinants), and note that b is an inner function. Now, since multiplication by inner matrices is norm-preserving, it follows that

‖F − G_iVH_i‖∞ = ‖G_i^{adj}FH_i^{adj} − G_i^{adj}G_iVH_iH_i^{adj}‖∞ = ‖F₀ − bV‖∞ , ∀V ∈ M(H∞) ,   (6.5.14)

⁸ Recall that s₀ is a blocking zero of P if P(s₀) = 0.
where F₀ = G_i^{adj}FH_i^{adj}. Hence, from (6.5.8), (6.5.13), and (6.5.14),

J_m = inf_{V∈M(H∞)} ‖F₀ − bV‖∞ .   (6.5.15)

As a final simplification, we may as well assume that b is a (finite) Blaschke product: Since b is inner, it equals a constant times a Blaschke product; the constant can be absorbed into the free matrix V.

Let us now switch notation and restate the problem. Given a rational matrix F ∈ M(H∞) and a finite Blaschke product b, find an R ∈ M(H∞) that minimizes ‖F − bR‖∞. On the space H₂^s of s-vector-valued H₂-functions, define the inner product

⟨f, g⟩ = (1/2π) ∫₀^{2π} f*(e^{jθ}) g(e^{jθ}) dθ ,   (6.5.16)

with the corresponding norm ‖f‖₂ = ⟨f, f⟩^{1/2}.
As in the scalar case, with each matrix H ∈ H∞^{s×s} associate the multiplication operator T_H : H₂^s → H₂^s, T_H g = Hg, and note that

‖T_H‖ = ‖H‖∞ = ess sup_{θ∈[0,2π]} σ̄(H(e^{jθ})) .   (6.5.18)

It turns out that Fact 6.4.5 has a complete analog in the multivariable case in so far as computing the infimum of ‖F − bR‖∞ is concerned. Given the finite Blaschke product b, let bH₂^s = {bg : g ∈ H₂^s}, and let N denote the orthogonal complement of bH₂^s in H₂^s; i.e.,

N = {h ∈ H₂^s : ⟨h, bg⟩ = 0 ∀g ∈ H₂^s} .   (6.5.20)

Let Π denote the orthogonal projection from H₂^s onto N.

Lemma 6.5.2  Suppose 𝒯 = ΠT_Φ for some Φ ∈ H∞^{s×s}. Then there exists a matrix M ∈ H∞^{s×s} such that ΠT_M = 𝒯 and ‖M‖∞ = ‖𝒯‖.

Proof. See [86], Theorem 3.  □

Theorem 6.5.3  Suppose F ∈ H∞^{r×s} is rational with r ≤ s, and define F₀ ∈ H∞^{s×s} by appending s − r rows of zeros to F:

F₀ = [F ; 0_{(s−r)×s}] .   (6.5.21)

Let A denote the restriction to N of the operator ΠT_{F₀}. Then

inf_{R∈H∞^{r×s}} ‖F − bR‖∞ = ‖A‖ .   (6.5.22)
Remarks 6.5.4  As in the scalar case, the subspace N is finite-dimensional, so that ‖A‖ can be computed by purely algebraic means. More on this later.
Proof. As a first step, suppose r = s, so that F is square. As R varies over H∞^{s×s}, the operator T_{F−bR} ranges over various operators. However, the operator ΠT_{F−bR} is the same for all R, and equals ΠT_F. To see this, let g ∈ H₂^s be arbitrary. Then

ΠT_{F−bR} g = Π(F − bR)g = Π(Fg) − Π(bRg) = ΠT_F g ,   (6.5.23)

since bRg ∈ bH₂^s and hence its projection onto N is zero. Now ΠT_F is an operator of the type covered by Lemma 6.5.2, which states that there exists a matrix M ∈ H∞^{s×s} such that ΠT_M = ΠT_F and such that ‖M‖∞ = ‖ΠT_F‖. But ΠT_M = ΠT_F is equivalent to ΠT_{F−M} = 0, i.e.,

(F − M)g ∈ bH₂^s ∀g ∈ H₂^s .   (6.5.24)

Let eᵢ denote the i-th elementary unit vector, which has a 1 in the i-th position and zeros elsewhere, and let

g(z) = eᵢ , ∀z ∈ D .   (6.5.25)

Then g ∈ H₂^s, and applying (6.5.24) with g = eᵢ for i = 1, …, s shows that each column of F − M belongs to bH₂^s; i.e., that M = F − bR₀ for some R₀ ∈ H∞^{s×s}. It is now claimed that

‖F − bR₀‖∞ = inf_{R∈H∞^{s×s}} ‖F − bR‖∞ .   (6.5.26)
To prove (6.5.26), note that

‖F − bR‖∞ = ‖T_{F−bR}‖ , by (6.5.18),
          ≥ ‖ΠT_{F−bR}‖ , since ‖Π‖ ≤ 1,
          = ‖ΠT_F‖ , by (6.5.23),
          = ‖M‖∞ = ‖F − bR₀‖∞ .   (6.5.27)

The preceding discussion can be summarized as follows: If F is square, then

inf_{R∈H∞^{s×s}} ‖F − bR‖∞ = ‖ΠT_F‖ .   (6.5.28)
To complete the proof of the theorem in the square case, it only remains to show that ΠT_F restricted to N has the same norm as ΠT_F. Suppose g ∈ H₂^s. Then g can be expressed uniquely in the form f + bh, where f ∈ N and h ∈ H₂^s (since H₂^s is the direct sum of N and bH₂^s). Now⁹

ΠT_F g = Π(Ff + Fbh) = Π(Ff + bFh) = Π(Ff) .   (6.5.29)

Moreover, ‖g‖₂ ≥ ‖f‖₂, so that

‖ΠT_F g‖₂ / ‖g‖₂ ≤ ‖ΠT_F f‖₂ / ‖f‖₂ .   (6.5.30)

Hence, A (which is the restriction of ΠT_F to N) has the same norm as ΠT_F.
Next, the restriction that F be square is removed. Suppose r < s and define F₀ as in (6.5.21). Partition any R₀ ∈ H∞^{s×s} as R₀ = [R ; R₁], where R ∈ H∞^{r×s}. Then

‖F₀ − bR₀‖∞ = ‖ [F − bR ; −bR₁] ‖∞ = ess sup_{θ∈[0,2π]} σ̄( [ (F − bR)(e^{jθ}) ; −bR₁(e^{jθ}) ] ) .   (6.5.31)

Now for any two complex matrices A, B of compatible dimensions, it is easy to show that

σ̄( [A ; B] ) ≥ σ̄(A) .   (6.5.32)

Hence, from (6.5.31), ‖F₀ − bR₀‖∞ ≥ ‖F − bR‖∞, so that

inf_{R₀∈H∞^{s×s}} ‖F₀ − bR₀‖∞ ≥ inf_{R∈H∞^{r×s}} ‖F − bR‖∞ .   (6.5.33)

To prove the reverse inequality, let R be any matrix in H∞^{r×s} and define R₀ = [R ; 0_{(s−r)×s}]. Then ‖F₀ − bR₀‖∞ = ‖F − bR‖∞. Hence, the inequality in (6.5.33) is actually an equality. Moreover, from the previous paragraph, the infimum on the left side of (6.5.33) equals the norm of ΠT_{F₀} when restricted to N.  □

⁹ Note that this argument fails if b is replaced by an inner matrix, since the scalar function b commutes with F whereas an inner matrix need not.
As in the scalar case, it is easy to compute a basis for N.

Lemma 6.5.5  Let b be a finite Blaschke product with distinct zeros α₁, …, α_n, of multiplicities m₁, …, m_n, respectively. For i ∈ {1, …, s}, let eᵢ denote the i-th s × 1 elementary unit vector. Define N as in (6.5.20). Then, a (nonorthonormal) basis for N is given by the vector-valued functions

vⁱ_{kl}(z) = [z^{k−1} / (1 − ᾱ_l z)^k] eᵢ , 1 ≤ i ≤ s ; 1 ≤ k ≤ m_l ; 1 ≤ l ≤ n .   (6.5.34)

Proof. By Lemma 6.4.7 and (6.5.16),

⟨vⁱ_{kl}, f⟩ = f_i^{(k−1)}(α_l) / (k − 1)! , ∀f ∈ H₂^s ,   (6.5.35)

where f_i denotes the i-th component of f. Since f ∈ bH₂^s if and only if the right side of (6.5.35) is zero for all i, k, l, the lemma follows.  □
Once a basis for N has been determined, it is possible to convert it into an orthonormal basis, compute a matrix representation A₀ of A with respect to this basis, and finally to compute σ̄(A₀); the details are exactly the same as in the scalar case (see (6.4.33) et seq.). However, if one is only interested in computing σ̄(A₀), a short-cut is available, and is described next.

It should be emphasized that the procedure described next is equally applicable to the scalar case, and can be used to compute σ̄(A₀) more readily than by using the formula (6.4.38). However, there would be no advantage gained by this: In the scalar case, Fact 6.4.5 gives a method for computing the infimum of ‖f − br‖∞ as well as for finding an r that attains the infimum, all in one shot. In contrast, the procedure given below only determines the infimum, and a separate interpolation procedure has to be used to find an optimizing r. This is no disadvantage in the multivariable case, however, since such a two-stage procedure is all that is available at present.

Now for the short-cut: Theorem 6.5.3 states that the infimum of ‖F − bR‖∞ equals ‖A‖, where A : N → N is the restriction of ΠT_{F₀} to N. Lemma 6.5.5 provides a (nonorthonormal) basis for N. To compute ‖A‖, the first step is to observe that ‖A‖ = ‖A*‖, where A* : N → N is the adjoint map of A. Now compute a matrix representation Φ₀ of A* with respect to the nonorthonormal basis of Lemma 6.5.5; the norm of A* can then be found directly from Φ₀ together with the Grammian matrix of the basis, without any orthonormalization.

The effectiveness of the above short-cut derives from the fact that the matrix representation Φ₀ is block-diagonal, as shown in Lemma 6.5.6 below. For the sake of clarity of notation, Lemma 6.5.6 is stated only for the case where the Blaschke product b has simple zeros; the general case is conceptually not more difficult and is left to the reader.
Lemma 6.5.6  Suppose b is a finite Blaschke product with distinct simple zeros α₁, …, α_m, and that F ∈ H∞^{r×s} is rational. Define Φ ∈ C^{ms×mr}, Λ ∈ C^{mr×mr} by

Φ = Block Diag {F(α₁)*, …, F(α_m)*} ,   (6.5.37)
Λ = (Λᵢⱼ) , i, j = 1, …, m ,   (6.5.38)

where

Λᵢⱼ = [1/(1 − ᾱⱼαᵢ)] I_r ,   (6.5.39)

and let Λ₀ ∈ C^{ms×ms} be defined in the same way with I_r replaced by I_s (cf. (6.5.51)). Then

inf_{R∈H∞^{r×s}} ‖F − bR‖∞ = inf{σ > 0 : σ²Λ − Φ*Λ₀Φ is positive definite} .   (6.5.40)

Proof. Since b has simple zeros, Lemma 6.5.5 shows that a basis for N is given by

vᵢⱼ(z) = [1/(1 − ᾱᵢz)] eⱼ , j = 1, …, s ; i = 1, …, m .   (6.5.41)
Thus, any h ∈ N can be expanded as

h(z) = Σ_{i=1}^{m} Σ_{j=1}^{s} hᵢⱼ vᵢⱼ(z) .   (6.5.42)

Arrange the coordinates hᵢⱼ in an ms × 1 column vector as follows:

[h₁₁ ⋯ h₁s ⋯ h_{m1} ⋯ h_{ms}]′ .   (6.5.43)

Then the claim is that the matrix representation of A* with respect to this coordinate system is

Φ₀ = Block Diag {F₀(α₁)*, …, F₀(α_m)*} .   (6.5.44)
To prove the claim, let v_kl be any basis function of the form (6.5.41). We must show that the matrix (6.5.44) gives the right answer for A*v_kl. Now the coordinate vector of v_kl is [0 ⋯ 1 ⋯ 0]′, where the 1 is in the position (k − 1)s + l. Applying the matrix Φ₀ to this vector gives [0 ⋯ F₀(α_k)*e_l ⋯ 0]′, which is the coordinate vector of a combination of the functions v_kp; in other words, the claim is that

A*v_kl = Σ_{p=1}^{s} f̄₀,lp(α_k) v_kp .   (6.5.45)

To show that (6.5.45) is correct, it is enough to show that it gives the right answer for the inner products ⟨vᵢⱼ, A*v_kl⟩ for all i, j. By definition,

⟨vᵢⱼ, A*v_kl⟩ = conj⟨A*v_kl, vᵢⱼ⟩ = conj⟨v_kl, Avᵢⱼ⟩ = conj⟨v_kl, ΠT_{F₀}vᵢⱼ⟩ = conj⟨v_kl, F₀vᵢⱼ⟩
             = conj[ e_l′ F₀(α_k) vᵢⱼ(α_k) ] = f̄₀,lj(α_k) · 1/(1 − ᾱ_kαᵢ) ,   (6.5.46)

where the last two steps follow from Lemma 6.4.7 and (6.5.41), respectively. On the other hand, letting u_kl denote the function on the right side of (6.5.45) leads to

⟨vᵢⱼ, u_kl⟩ = Σ_{p=1}^{s} f̄₀,lp(α_k) ⟨vᵢⱼ, v_kp⟩ .   (6.5.47)

Note that

⟨vᵢⱼ, v_kp⟩ = [1/(1 − ᾱ_kαᵢ)] δ_{jp} ,   (6.5.48)

where δ is the Kronecker delta. Hence, (6.5.47) becomes

⟨vᵢⱼ, u_kl⟩ = f̄₀,lj(α_k) · 1/(1 − ᾱ_kαᵢ) ,   (6.5.49)

which agrees with (6.5.46).
By the previous discussion, ‖A*‖ is given by

‖A*‖ = inf{σ > 0 : σ²Λ₀ − Φ₀*Λ₀Φ₀ is positive definite} ,   (6.5.50)

where Λ₀ is the Grammian matrix of the basis in (6.5.41). One can readily compute that Λ₀ = (Λ₀ᵢⱼ)_{i,j=1}^{m}, where

Λ₀ᵢⱼ = [1/(1 − ᾱⱼαᵢ)] I_s .   (6.5.51)

Now the only difference between Λ and Λ₀ is in their dimensions. Similarly, if we define Φ as in (6.5.37), then Φ is a submatrix of Φ₀, since F is a submatrix of F₀. If r = s then we are done, since Λ = Λ₀, Φ = Φ₀, and (6.5.50) is the same as (6.5.40). Otherwise, by symmetric row and column permutations, one can transform σ²Λ₀ − Φ₀*Λ₀Φ₀ into the form

[ σ²Λ − Φ*Λ₀Φ   0 ; 0   σ²Λ̃ ] ,   (6.5.52)

where Λ̃ = (Λ̃ᵢⱼ)_{i,j=1}^{m}, and

Λ̃ᵢⱼ = [1/(1 − ᾱⱼαᵢ)] I_{s−r} .   (6.5.53)

It is clear from (6.5.52) that σ²Λ₀ − Φ₀*Λ₀Φ₀ is positive definite if and only if σ²Λ − Φ*Λ₀Φ is (Λ̃, being a Grammian, is itself positive definite); hence, (6.5.50) reduces to (6.5.40).  □
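Formula (6.5.40) reduces the norm computation to a finite family of positive-definiteness tests, which can be resolved by bisection on σ. As a sanity check, feeding in the scalar data of Example 6.4.8 (zeros 0 and 0.5 of b, F(0) = 0.5, F(0.5) = 0, so r = s = 1 and Λ = Λ₀) recovers J_m = 1:

```python
import numpy as np

alphas = [0.0, 0.5]
Fvals = [np.array([[0.5]]), np.array([[0.0]])]    # 1x1 "matrices" F(alpha_i)
m = len(alphas)

# Lambda as in (6.5.39) (here I_r = I_s = 1, so Lambda = Lambda_0)
Lam = np.array([[1/(1 - np.conj(aj)*ai) for aj in alphas] for ai in alphas])
Phi = np.zeros((m, m))
for i in range(m):
    Phi[i, i] = Fvals[i].conj().T[0, 0]           # block diagonal of F(alpha_i)*

def positive_definite(sigma):
    M = sigma**2 * Lam - Phi.conj().T @ Lam @ Phi
    return np.linalg.eigvalsh(M).min() > 0

lo, hi = 0.0, 10.0
for _ in range(60):                               # bisection on sigma
    mid = 0.5 * (lo + hi)
    lo, hi = (lo, mid) if positive_definite(mid) else (mid, hi)
print(hi)   # ~1.0, agreeing with J_m from Example 6.4.8
```

Here σ²Λ − Φ*ΛΦ becomes singular exactly at σ = 1, so the bisection converges to the same value of J_m found via σ̄(A₀).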
Now that an expression is available for the infimum of ‖F − bR‖∞, the remaining task is to find an R that attains it. Let

σ = inf_{R∈H∞^{r×s}} ‖F − bR‖∞ ,   (6.5.54)

and let α₁, …, α_n denote the (simple) zeros of b. Then the problem is to find a matrix G ∈ H∞^{r×s} such that

G(αᵢ) = F(αᵢ) , i = 1, …, n , and ‖G‖∞ = σ .   (6.5.55)

(Note that G − F = bR for some R ∈ H∞^{r×s} if and only if G(αᵢ) = F(αᵢ) ∀i.) Equivalently, if we define

Fᵢ = [F(αᵢ)]/σ ,   (6.5.56)

then the problem is one of finding a matrix Θ ∈ H∞^{r×s} such that ‖Θ‖∞ = 1 and Θ(αᵢ) = Fᵢ ∀i.
The remainder of this section is devoted to the matrix version of the so-called Nevanlinna-Pick problem,¹⁰ which can be stated as follows: Given complex numbers α₁, …, α_n, with |αᵢ| < 1 ∀i, together with complex matrices F₁, …, F_n, with ‖Fᵢ‖ ≤ 1 ∀i, find all matrix-valued functions Θ ∈ M(H∞) (if any) such that ‖Θ‖∞ ≤ 1 and Θ(αᵢ) = Fᵢ ∀i. The relevance of this problem to sensitivity minimization is clear.

¹⁰ Strictly speaking, the Pick problem only makes sense for square matrices, so it is probably more accurate to speak of the matrix Nevanlinna algorithm.
In the sequel, it is important to note that ‖·‖ denotes the norm of a complex matrix, whereas ‖·‖∞ denotes the norm of a matrix-valued H∞-function. First, consider the scalar problem of finding a θ ∈ H∞ such that ‖θ‖∞ ≤ 1 and θ(αᵢ) = fᵢ ∀i. Suppose one of the constants has magnitude exactly equal to 1, and renumber the constants if necessary so that |f₁| = 1. Since α₁ belongs to the open unit disc, it follows from the maximum modulus principle that the only H∞-function θ with ‖θ‖∞ ≤ 1 and θ(α₁) = f₁ is the constant function θ(z) ≡ f₁; hence in this case the problem has a solution if and only if fᵢ = f₁ ∀i. The next lemma is the matrix analog of this observation.

Lemma 6.5.7  Suppose Θ ∈ H∞^{k×l} and that ‖Θ‖∞ ≤ 1. Suppose σ̄(Θ(z₀)) = 1 for some z₀ in the open unit disc, and let r denote the multiplicity of 1 as a singular value of Θ(z₀) (i.e., as an eigenvalue of Θ(z₀)Θ*(z₀)). Then σ̄(Θ(z)) = 1 ∀z ∈ D. Moreover, there exist unitary matrices U ∈ C^{k×k}, V ∈ C^{l×l} such that

UΘ(z)V = [ I_r   0_{r×(l−r)} ; 0_{(k−r)×r}   Δ(z) ] ,   (6.5.58)

where σ̄(Δ(z)) < 1 ∀z ∈ D.
The proof of Lemma 6.5.7 requires the following observation on complex matrices.

Lemma 6.5.8  Suppose M ∈ C^{k×l} has norm 1. Then
(i) |mᵢⱼ| ≤ 1 ∀i, j;
(ii) If |mᵢⱼ| = 1 for some i, j, then all other elements of row i and column j are zero.

Proof. Temporarily let ‖·‖₂ denote the Euclidean vector norm on both C^k and C^l. Then

‖Mx‖₂ ≤ ‖x‖₂ ∀x ∈ C^l ,   (6.5.59)

since ‖M‖ = 1. To prove (i), suppose |mᵢⱼ| > 1 for some i, j, and let x = eⱼ, the j-th elementary unit vector in C^l. Then ‖Mx‖₂ > 1, which is a contradiction. To prove (ii), suppose |mᵢⱼ| = 1 for some i, j, and again let x = eⱼ. Then ‖Meⱼ‖₂ ≤ 1 implies that all other elements of the j-th column of M are zero. The same reasoning applied to M* establishes the claim about row i.  □
Proof of Lemma 6.5.7.  By hypothesis, 1 is an r-fold eigenvalue of Θ(z₀)Θ*(z₀), so that 1 is an r-fold singular value of Θ(z₀). Thus, there exist unitary matrices U ∈ C^{k×k}, V ∈ C^{l×l} such that

UΘ(z₀)V = [ I_r   0_{r×(l−r)} ; 0_{(k−r)×r}   M ] ,   (6.5.60)

where ‖M‖ < 1. Define y(z) = (UΘ(z)V)₁₁; i.e., y is the corner element of UΘ(z)V. Then y is analytic, satisfies |y(z)| ≤ 1 ∀z ∈ D (by Lemma 6.5.8 (i)), and y(z₀) = 1 (by (6.5.60)). Hence, by the maximum modulus theorem, y(z) = 1 ∀z ∈ D, since z₀ is in the open unit disc. Now by Lemma 6.5.8 (ii), all other elements in the first row and first column of UΘ(z)V are identically zero. The same reasoning applies to all elements of the form (UΘ(z)V)ᵢᵢ where i ≤ r. Hence, UΘ(z)V is of the form (6.5.58), and σ̄(Θ(z)) = 1 ∀z ∈ D. It only remains to show that σ̄(Δ(z)) < 1 ∀z ∈ D. Suppose to the contrary that σ̄(Δ(z₁)) = 1 for some z₁ ∈ D. Then, by previous reasoning, σ̄(Δ(z)) = 1 ∀z ∈ D. In particular, σ̄(M) = σ̄(Δ(z₀)) = 1, which contradicts the assumption that 1 is an r-fold singular value of Θ(z₀). Hence, σ̄(Δ(z)) < 1 ∀z ∈ D.  □
Let us now return to the interpolation problem, where one is given points α₁, …, α_n and matrices F₁, …, F_n, with |αᵢ| < 1 and ‖Fᵢ‖ ≤ 1 ∀i. Recall that the objective is to show that one may assume ‖Fᵢ‖ < 1 ∀i without sacrificing generality. Assume to the contrary that ‖Fᵢ‖ = 1 for some values of i. Among these, select one for which the multiplicity of 1 as a singular value is the largest, and renumber it as F₁. Select unitary matrices U, V such that

UF₁V = [ I_r  0 ; 0  G₁ ] ,   (6.5.61)

where r is the multiplicity of 1 as a singular value of F₁, and ‖G₁‖ < 1. Now, by Lemma 6.5.7, any Θ such that ‖Θ‖∞ ≤ 1 and Θ(α₁) = F₁ must satisfy

UΘ(z)V = [ I_r  0 ; 0  Δ(z) ] ∀z ∈ D .   (6.5.62)

In particular, F₂, …, F_n must satisfy

UFᵢV = [ I_r  0 ; 0  Gᵢ ] , i = 2, …, n .   (6.5.63)

Otherwise the interpolation problem has no solution. On the other hand, suppose (6.5.63) holds. Then ‖Gᵢ‖ < 1 ∀i; otherwise, if ‖Gᵢ‖ = 1 for some i, then Fᵢ has 1 as a singular value of multiplicity larger than r, contradicting the manner in which F₁ was selected. Now replace the original interpolation problem by the following modified problem: Find a Δ such that ‖Δ‖∞ ≤ 1 and Δ(αᵢ) = Gᵢ ∀i. Then there is a one-to-one correspondence between solutions of the original problem and of the modified problem, via (6.5.62). But in the modified problem all the matrices Gᵢ have norms strictly less than one.
We are now ready for the main result on the matrix Nevanlinna problem.

Theorem 6.5.9  Suppose α₁, …, α_n are distinct complex numbers and F₁, …, F_n are complex matrices with |αᵢ| < 1 and ‖Fᵢ‖ < 1 ∀i. Define a partitioned matrix

P = [ P₁₁ ⋯ P₁n ; ⋮ ⋱ ⋮ ; P_{n1} ⋯ P_{nn} ] ,   (6.5.64)

where

Pᵢⱼ = [1/(1 − αᵢᾱⱼ)] (I − FᵢFⱼ*) .   (6.5.65)

Then there exists a Θ ∈ M(H∞) such that ‖Θ‖∞ < 1 and Θ(αᵢ) = Fᵢ ∀i, if and only if the matrix P is positive definite.
In the current set-up, Lemma 6.5.6 leads to the following result: There exists a Θ ∈ M(H∞) such that ‖Θ‖∞ ≤ 1 and Θ(αᵢ) = Fᵢ ∀i if and only if the matrix P is nonnegative definite.
The proof of Theorem 6.5.9 is rather long and is based on a preliminary lemma. We shall see that an iterative algorithm for computing Θ will emerge from the proof.

Lemma 6.5.10  Suppose ‖E‖ < 1, and define

L(E) = [ A  B ; C  D ] ,   (6.5.66)

where

A = (I − EE*)^{−1/2} , B = −(I − EE*)^{−1/2}E ,
C = −(I − E*E)^{−1/2}E* , D = (I − E*E)^{−1/2} .   (6.5.67)

(Note that M^{1/2} denotes the Hermitian square root of M.) Then for each matrix X with ‖X‖ ≤ 1, the matrix

T_E(X) = (AX + B)(CX + D)⁻¹   (6.5.68)

is well-defined. Moreover,

‖T_E(X)‖ ≤ 1 (resp. < 1) ⟺ ‖X‖ ≤ 1 (resp. < 1) .   (6.5.69)
6.5. SENSITIVITY MINIMIZATION: FAT PLANT CASE 189
Proof. First of all, CX + D = (I - E^*E)^{-1/2}(I - E^*X), and since \|E^*X\| \le \|E\| \|X\| < 1, the matrix I - E^*X is nonsingular; hence T_E(X) is well-defined. Next, let J = \mathrm{diag}(I, -I); a direct computation using (6.5.67) shows that L(E) is J-unitary, i.e., L^* J L = J. Now, for any matrix M, \|M\| \le 1 (resp. < 1) \iff M^*M - I \le 0 (resp. < 0). Hence, (6.5.69) is equivalent to

    T^*T - I \le 0 (resp. < 0) \iff X^*X - I \le 0 (resp. < 0) ,    (6.5.74)

where T = T_E(X). But

    (CX + D)^* (T^*T - I)(CX + D) = (AX + B)^*(AX + B) - (CX + D)^*(CX + D)
    = [X^* \; I] \, L^* J L \begin{bmatrix} X \\ I \end{bmatrix} = [X^* \; I] \, J \begin{bmatrix} X \\ I \end{bmatrix} = X^*X - I .    (6.5.75)

Hence, (6.5.69) follows from (6.5.74) and (6.5.75). \Box
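The content of Lemma 6.5.10 is easy to check numerically. A sketch (my illustration, not from the text; the sign convention for B and C is the reconstruction used above, and NumPy is assumed):

```python
import numpy as np

def herm_inv_sqrt(M):
    # Hermitian inverse square root via eigendecomposition
    w, V = np.linalg.eigh(M)
    return V @ np.diag(w ** -0.5) @ V.conj().T

def T(E, X):
    """The linear-fractional map of (6.5.68), with B = -(I-EE*)^{-1/2}E and
    C = -(I-E*E)^{-1/2}E* (reconstructed signs)."""
    I = np.eye(E.shape[0])
    A = herm_inv_sqrt(I - E @ E.conj().T)
    B = -A @ E
    D = herm_inv_sqrt(I - E.conj().T @ E)
    C = -D @ E.conj().T
    return (A @ X + B) @ np.linalg.inv(C @ X + D)

rng = np.random.default_rng(0)
E = rng.standard_normal((2, 2))
X = rng.standard_normal((2, 2))
E /= max(1.0, np.linalg.norm(E, 2) / 0.5)   # ensure ||E|| <= 0.5 < 1
X /= max(1.0, np.linalg.norm(X, 2) / 0.9)   # ensure ||X|| <= 0.9 < 1
Y = T(E, X)
print(np.linalg.norm(Y, 2) < 1, np.allclose(T(E, E), 0))   # contractive; T_E(E) = 0
```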
Proof of Theorem 6.5.9. "if" The proof is by induction on the number n of points to be interpolated. If n = 1, the theorem is clearly true: Take \Phi(z) = F_1. To establish the inductive step, suppose the theorem is true for n - 1 points, and suppose that P is positive definite; it is shown below that this implies the existence of a \Phi \in M(H_\infty) satisfying \|\Phi\|_\infty < 1 and \Phi(\lambda_i) = F_i \forall i.

Let y be the inner function^{11}

    y(z) = \frac{|\lambda_1| (z - \lambda_1)}{\lambda_1 (1 - \bar{\lambda}_1 z)} ,    (6.5.76)

^{11} Note that y(z) = z if \lambda_1 = 0.
and define

    T_i = T_{F_1}(F_i) ,   i = 2, \dots, n ,    (6.5.77)

    G_i = \frac{1}{y(\lambda_i)} T_i ,   i = 2, \dots, n ,    (6.5.78)

where T_{F_1}(\cdot) is defined in (6.5.68). Now define a partitioned matrix

    Q = \begin{bmatrix} Q_{22} & \cdots & Q_{2n} \\ \vdots & & \vdots \\ Q_{n2} & \cdots & Q_{nn} \end{bmatrix} ,    (6.5.79)

where

    Q_{ij} = \frac{1}{1 - \lambda_i \bar{\lambda}_j} (I - G_i G_j^*) .    (6.5.80)

It is claimed that Q is positive definite. To prove the claim, compute Q_{ij} from (6.5.80), and make liberal use of the identities (6.5.71) and (6.5.72). This leads to

    Q_{ij} = \frac{1}{1 - \lambda_i \bar{\lambda}_j} \Big[ I - \frac{1}{y(\lambda_i)\overline{y(\lambda_j)}} T_i T_j^* \Big]
    = \frac{1}{1 - \lambda_i \bar{\lambda}_j} \Big\{ \Big[ 1 - \frac{1}{y(\lambda_i)\overline{y(\lambda_j)}} \Big] I + \frac{1}{y(\lambda_i)\overline{y(\lambda_j)}} (I - T_i T_j^*) \Big\} ,    (6.5.81)

where from (6.5.77),

    I - T_i T_j^* = (I - F_1 F_1^*)^{1/2} (I - F_i F_1^*)^{-1} (I - F_i F_j^*)(I - F_1 F_j^*)^{-1} (I - F_1 F_1^*)^{1/2} .    (6.5.82)

For clarity, define

    R_i^{-1} = (I - F_1 F_i^*)^{-1} (I - F_1 F_1^*)^{1/2} ,   i = 1, \dots, n ,    (6.5.83)

so that I - T_i T_j^* = R_i^{-*}(I - F_i F_j^*) R_j^{-1}, and note that

    \frac{1}{1 - \lambda_i \bar{\lambda}_j} \Big[ 1 - \frac{1}{y(\lambda_i)\overline{y(\lambda_j)}} \Big] = - \frac{1 - \lambda_1 \bar{\lambda}_1}{(\lambda_i - \lambda_1)(\bar{\lambda}_j - \bar{\lambda}_1)} .    (6.5.84)

Then, (6.5.81) becomes

    Q_{ij} = - \frac{1 - \lambda_1 \bar{\lambda}_1}{(\lambda_i - \lambda_1)(\bar{\lambda}_j - \bar{\lambda}_1)} I + \frac{R_i^{-*}}{y(\lambda_i)} P_{ij} \frac{R_j^{-1}}{\overline{y(\lambda_j)}} ,    (6.5.85)

or, upon rearrangement,

    P_{ij} = R_i^* \Big[ \frac{1 - \lambda_1 \bar{\lambda}_1}{(1 - \lambda_i \bar{\lambda}_1)(1 - \lambda_1 \bar{\lambda}_j)} I + y(\lambda_i) Q_{ij} \overline{y(\lambda_j)} \Big] R_j .    (6.5.86)
Now (6.5.86) implies that

    P = R^* \begin{bmatrix} c_1 I & 0 \\ 0 & Q \end{bmatrix} R ,    (6.5.87)

where

    c_i = \frac{1}{1 - \lambda_i \bar{\lambda}_1}   for i = 1, \dots, n ,    (6.5.88)

and R is a partitioned matrix of the form

    R = \begin{bmatrix}
    c_1 R_1 & c_2 R_2 & c_3 R_3 & \cdots & c_n R_n \\
    0 & y(\lambda_2) c_2 R_2 & 0 & \cdots & 0 \\
    0 & 0 & y(\lambda_3) c_3 R_3 & \cdots & 0 \\
    \vdots & & & & \vdots \\
    0 & 0 & 0 & \cdots & y(\lambda_n) c_n R_n
    \end{bmatrix} .    (6.5.89)
Since R is nonsingular, c_1 > 0 and P is positive definite, it follows from (6.5.87) that Q is also positive definite. In particular, Q_{ii} is positive definite for all i, and \|G_i\| < 1 for all i. By the inductive hypothesis, there exists a function \Psi \in M(H_\infty) such that \|\Psi\|_\infty < 1 satisfying \Psi(\lambda_i) = G_i for i = 2, \dots, n. Now define

    \Phi(z) = (A - y(z)\Psi(z)C)^{-1} (y(z)\Psi(z)D - B) ,    (6.5.90)

where

    A = (I - F_1 F_1^*)^{-1/2} ,  B = -(I - F_1 F_1^*)^{-1/2} F_1 ,
    C = -(I - F_1^* F_1)^{-1/2} F_1^* ,  D = (I - F_1^* F_1)^{-1/2} .    (6.5.91)

Note that (6.5.90) is the inverse of the transformation T_{F_1} defined in (6.5.68). Hence, from (6.5.77), (6.5.78), and (6.5.90),

    \Phi(\lambda_i) = F_i   for i = 2, \dots, n .    (6.5.92)

Since y(\lambda_1) = 0, (6.5.90) shows that

    \Phi(\lambda_1) = -A^{-1}B = F_1 .    (6.5.93)

Finally, by Lemma 6.5.10, \|\Psi(z)\| < 1 for all z implies that \|\Phi(z)\| < 1 for all z. Hence, \Phi is a solution to the interpolation problem with n points. This completes the proof by induction.
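The inductive step is effectively an algorithm: peel off the pair (\lambda_1, F_1) via T_{F_1} and the Blaschke factor y, and recurse on the remaining data. A hedged numerical sketch of one reduction step, (6.5.76)-(6.5.78) (illustrative data and reconstructed signs; NumPy assumed):

```python
import numpy as np

def herm_inv_sqrt(M):
    w, V = np.linalg.eigh(M)
    return V @ np.diag(w ** -0.5) @ V.conj().T

def reduce_step(lams, Fs):
    """Map an n-point problem (lams, Fs) to an (n-1)-point problem
    (lams[1:], Gs) via (6.5.76)-(6.5.78)."""
    l1, F1 = lams[0], Fs[0]
    I = np.eye(F1.shape[0])
    A = herm_inv_sqrt(I - F1 @ F1.conj().T)
    B = -A @ F1
    D = herm_inv_sqrt(I - F1.conj().T @ F1)
    C = -D @ F1.conj().T
    # Blaschke factor (6.5.76), with y(z) = z when lambda_1 = 0
    y = lambda z: z if l1 == 0 else abs(l1) * (z - l1) / (l1 * (1 - np.conj(l1) * z))
    Gs = []
    for li, Fi in zip(lams[1:], Fs[1:]):
        Ti = (A @ Fi + B) @ np.linalg.inv(C @ Fi + D)   # T_i = T_{F_1}(F_i)
        Gs.append(Ti / y(li))                           # G_i = T_i / y(lambda_i)
    return lams[1:], Gs

lams = [0.1, 0.5, -0.2]
Fs = [0.3 * np.eye(2), np.diag([0.2, -0.1]), 0.25 * np.eye(2)]
_, Gs = reduce_step(lams, Fs)
print(all(np.linalg.norm(G, 2) < 1 for G in Gs))   # reduced problem is still contractive
```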
"only if" One needs to prove the following statement: Suppose \Phi \in M(H_\infty) and that \|\Phi\|_\infty < 1. Let \lambda_1, \dots, \lambda_n be any n distinct points in the open unit disc, let F_i = \Phi(\lambda_i), and define P as in (6.5.64), (6.5.65). Then P is positive definite.
The proof is again by induction on the number n of points. The statement is clearly true if n = 1. Suppose it is true for n - 1 points. Given the n points \lambda_1, \dots, \lambda_n and the function \Phi, let F_i = \Phi(\lambda_i) and define the function \Psi \in M(H_\infty) by

    \Psi(z) = \frac{1}{y(z)} (A\Phi(z) + B)(C\Phi(z) + D)^{-1} ,    (6.5.94)

where y is as in (6.5.76) and A, B, C, D are as in (6.5.91). Then ...

...

    T_2 = \frac{1}{0.73788} \cdots = \begin{bmatrix} 0.15791 \\ 0.45135 \end{bmatrix} ,
    G_2 = \frac{1}{0.5} T_2 = \begin{bmatrix} 0.31582 \\ 0.9027 \end{bmatrix} .
One can as well choose \Psi(z) = G_2 z. Then, from (6.5.90),

    \Phi(z) = \begin{bmatrix} 1.0236 + 0.0706z & 0.0472 + 0.1412z \\ 0.0472 + 0.2018z & 1.0944 + 0.4037z \end{bmatrix}^{-1} \begin{bmatrix} 0.2236 + 0.3531z \\ 0.4472 + 1.0092z \end{bmatrix}
    = \frac{1}{1.118 + 0.4743z} \begin{bmatrix} 0.2236 + 0.3659z \\ 0.4472 + 1.0028z \end{bmatrix}
    = \frac{1}{1 + 0.4242z} \begin{bmatrix} 0.2 + 0.3273z \\ 0.4 + 0.8969z \end{bmatrix} .
6.6 SENSITIVITY MINIMIZATION: GENERAL CASE
In this section, the sensitivity minimization problem is studied in full generality, i.e., without any simplifying assumptions on the relative numbers of plant inputs and outputs, and on the dimensions of the weighting matrices. The filtering problem in the general case is not discussed separately in the interests of brevity, but reasoning similar to that given here applies to this problem as well. As shown in earlier sections, the problem is one of minimizing a cost function

    J(R) = \|F - GRH\|_\infty ,    (6.6.1)

by a suitable choice of R \in M(H_\infty), where F, G, H \in M(H_\infty). The following spectral factorization result is needed: Suppose A \in M(RH_\infty) satisfies

    A^*(z^{-1}) = A(z)   \forall z ,    (6.6.2)

    A(e^{j\theta}) \ge 0   \forall \theta \in [0, 2\pi] .    (6.6.3)

Then there exists an outer matrix U \in M(RH_\infty) such that

    A(z) = U^*(z^{-1}) U(z)   \forall z .    (6.6.4)

Further, if (6.6.3) is strengthened to

    A(e^{j\theta}) > 0   \forall \theta \in [0, 2\pi] ,    (6.6.5)

then U can be chosen to be a unit of M(RH_\infty).
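In the scalar case the spectral factorization above can be computed by root flipping: collect the zeros of z^n a(z) lying outside the unit disc. A sketch under the strict positivity assumption (6.6.5) (an illustration, not the book's algorithm; NumPy assumed, and z is the unit delay, as in the text):

```python
import numpy as np
from numpy.polynomial import polynomial as npoly

def spectral_factor(a):
    """Given Laurent coefficients a[0..n] of a(z) = a[0] + sum_k a[k](z^k + z^{-k})
    with a(e^{jt}) > 0, return coefficients of an outer u with a = u(z^{-1})u(z)."""
    q = np.r_[a[:0:-1], a]                      # z^n a(z), lowest order first
    roots = npoly.polyroots(q)
    outside = roots[np.abs(roots) > 1]          # zeros of the outer factor
    u = np.real(npoly.polyfromroots(outside))
    a1 = a[0] + 2 * np.sum(a[1:])               # a evaluated at z = 1
    u *= np.sqrt(a1) / abs(npoly.polyval(1.0, u))   # fix the constant gain
    return u

u = spectral_factor(np.array([2.6, -1.0]))      # a(z) = 2.6 - z - z^{-1} > 0 on |z| = 1
t = np.linspace(0, 2 * np.pi, 7)
z = np.exp(1j * t)
a_vals = 2.6 - 2 * np.cos(t)
u_vals = npoly.polyval(z, u)
print(np.allclose(np.abs(u_vals) ** 2, a_vals))   # |u|^2 = a on the unit circle
```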
Lemma 6.6.2  Suppose A \in RH_\infty^{n \times m}, where n > m, and that A is inner. Then there exists an inner matrix B \in RH_\infty^{n \times (n-m)} such that \Theta = [A \; B] \in RH_\infty^{n \times n} is also inner.

The matrix B is sometimes referred to as a complementary inner matrix of A.
Proof. Since A is inner, it has full column rank in M(RH_\infty). Partition A as [D' \; N']', where D \in RH_\infty^{m \times m}, N \in RH_\infty^{(n-m) \times m}; after a row permutation if necessary, it can be assumed that |D| \ne 0. Now define

    P(z) = -[D^*(z^{-1})]^{-1} N^*(z^{-1}) .    (6.6.7)

Then P \in (R(z))^{m \times (n-m)} and therefore has an r.c.f. (U, V) over M(RH_\infty). Specifically, U \in RH_\infty^{m \times (n-m)}, V \in RH_\infty^{(n-m) \times (n-m)}, and

    P(z) = U(z)[V(z)]^{-1}   \forall z .    (6.6.8)

Combining (6.6.7) and (6.6.8) shows that

    D^*(z^{-1}) U(z) + N^*(z^{-1}) V(z) = 0   \forall z .    (6.6.9)

Now define

    W = \begin{bmatrix} U \\ V \end{bmatrix} \in RH_\infty^{n \times (n-m)} ,    (6.6.10)

and express (6.6.9) in the form

    A^*(z^{-1}) W(z) = 0   \forall z .    (6.6.11)

Clearly W has full column rank as an element of M(RH_\infty). Let W = W_i W_o be an inner-outer factorization of W, where W_i \in RH_\infty^{n \times (n-m)} is inner and W_o \in RH_\infty^{(n-m) \times (n-m)} is outer. Multiplying (6.6.11) on the right by [W_o(z)]^{-1} gives

    A^*(z^{-1}) W_i(z) = 0   \forall z .    (6.6.12)

It is claimed that the choice B = W_i makes \Theta = [A \; B] inner (and of course square). Note that

    \Theta^*(z^{-1}) \Theta(z) = \begin{bmatrix} A^*(z^{-1})A(z) & A^*(z^{-1})B(z) \\ B^*(z^{-1})A(z) & B^*(z^{-1})B(z) \end{bmatrix} = I   \forall z ,    (6.6.13)

since A, B are both inner and satisfy (6.6.12). \Box
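For a constant inner (i.e., isometric) matrix A, the complementary inner matrix of Lemma 6.6.2 is just an orthonormal basis for the orthogonal complement of the range of A; this degenerate case is easy to sketch with a QR decomposition (my illustration, not from the text; NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
A, _ = np.linalg.qr(rng.standard_normal((4, 2)))      # a 4x2 isometry: A'A = I
# Complete A to a full orthonormal basis; the trailing columns are B.
Q, _ = np.linalg.qr(np.hstack([A, rng.standard_normal((4, 2))]))
B = Q[:, 2:]                                          # complementary "inner" matrix
Theta = np.hstack([A, B])
print(np.allclose(Theta.T @ Theta, np.eye(4)))        # [A B] is unitary
```

In the dynamic case the same completion must hold at every frequency with H-infinity factors, which is exactly what the inner-outer construction in the proof delivers.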
Now we come to the main result of this section. To be specific, suppose F \in RH_\infty^{n \times m}, G \in RH_\infty^{n \times l}, H \in RH_\infty^{k \times m}, and that n > l, k < m; the cases n \le l and/or k \ge m are simpler and the requisite modifications will be obvious. Recall that the problem is to select R \in RH_\infty^{l \times k} so as to minimize

    J(R) = \|F - GRH\|_\infty .    (6.6.14)

The first step in the solution procedure is to show that the above problem can be reduced to that of minimizing

    \left\| \begin{bmatrix} A - Q & B \\ C & D \end{bmatrix} \right\|_\infty ,    (6.6.15)

by a suitable choice of Q \in RH_\infty^{l \times k}, where A, B, C, D are L_\infty (not H_\infty!) matrices of appropriate dimensions. For this purpose, first factor G as G_i G_o, where G_i \in RH_\infty^{n \times l} is inner and G_o \in RH_\infty^{l \times l} is outer. Then, using Lemma 6.6.2, construct a square inner matrix \Theta \in RH_\infty^{n \times n} containing G_i as its first l columns. Then,

    G = \Theta \begin{bmatrix} G_o \\ 0_{(n-l) \times l} \end{bmatrix} .    (6.6.16)
Similar reasoning applied to H^* yields an outer matrix H_o \in RH_\infty^{k \times k} and a square inner matrix \Lambda \in RH_\infty^{m \times m} such that

    H = [H_o \; 0_{k \times (m-k)}] \Lambda .    (6.6.17)

Hence,

    J(R) = \left\| F - \Theta \begin{bmatrix} G_o \\ 0 \end{bmatrix} R [H_o \; 0] \Lambda \right\|_\infty = \left\| \Theta^* F \Lambda^* - \begin{bmatrix} G_o \\ 0 \end{bmatrix} R [H_o \; 0] \right\|_\infty ,    (6.6.18)

since multiplication by a square inner matrix (or its conjugate) does not change the L_\infty-norm. Now define

    \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \Theta^* F \Lambda^* \in M(L_\infty) ,    (6.6.19)

partitioned conformably, and let Q denote the free parameter G_o R H_o (recall that both G_o and H_o are units of M(H_\infty) by virtue of the rank conditions on G and H). Then the cost function becomes

    \left\| \begin{bmatrix} A - Q & B \\ C & D \end{bmatrix} \right\|_\infty .    (6.6.20)
To proceed further, we require another very simple result:

Lemma 6.6.3  Suppose W = [X \; Y] \in M(L_\infty) is rational. Then \|W\|_\infty \ge \|Y\|_\infty. Further, if \gamma > \|Y\|_\infty, then

    \|W\|_\infty \le \gamma \iff \|(\gamma^2 I - YY^*)^{-1/2} X\|_\infty \le 1 ,    (6.6.21)

where Y^*(z) = Y'(z^{-1}), and (\gamma^2 I - YY^*)^{1/2} denotes the spectral factor of \gamma^2 I - YY^*.

Proof. It is obvious that \|W\|_\infty \ge \|Y\|_\infty. Moreover, if \gamma > \|Y\|_\infty, then (\gamma^2 I - YY^*)(e^{j\theta}) > 0 for all \theta, so that the spectral factor (\gamma^2 I - YY^*)^{1/2} exists and is a unit of M(RH_\infty); in particular (\gamma^2 I - YY^*)^{-1/2} is well-defined. By definition,

    \|W\|_\infty = \max_{\theta \in [0, 2\pi]} \bar{\sigma}(W(e^{j\theta})) .    (6.6.22)

Hence,

    \|W\|_\infty \le \gamma \iff \bar{\sigma}(W(e^{j\theta})) \le \gamma \;\forall \theta \in [0, 2\pi]
    \iff (WW^*)(e^{j\theta}) \le \gamma^2 I \;\forall \theta
    \iff [XX^* + YY^*](e^{j\theta}) \le \gamma^2 I \;\forall \theta
    \iff (XX^*)(e^{j\theta}) \le \gamma^2 I - (YY^*)(e^{j\theta}) \;\forall \theta .    (6.6.23)

This leads immediately to (6.6.21). \Box
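Lemma 6.6.3 can be sanity-checked on constant matrices, where the L-infinity norm reduces to the spectral norm (an illustration, not from the text; NumPy assumed):

```python
import numpy as np

def herm_inv_sqrt(M):
    # Hermitian inverse square root via eigendecomposition
    w, V = np.linalg.eigh(M)
    return V @ np.diag(w ** -0.5) @ V.conj().T

# Check (6.6.21): for W = [X Y] and gamma > ||Y||,
# ||W|| <= gamma  iff  ||(gamma^2 I - Y Y')^{-1/2} X|| <= 1.
rng = np.random.default_rng(2)
X = rng.standard_normal((2, 3))
Y = rng.standard_normal((2, 2))
W = np.hstack([X, Y])
nY, nW = np.linalg.norm(Y, 2), np.linalg.norm(W, 2)
for gamma in [nY + 0.1, 0.99 * nW, 1.01 * nW, 10.0]:
    if gamma <= nY:
        continue                      # the lemma requires gamma > ||Y||
    lhs = nW <= gamma
    rhs = np.linalg.norm(herm_inv_sqrt(gamma**2 * np.eye(2) - Y @ Y.T) @ X, 2) <= 1
    assert lhs == rhs
print("equivalence verified")
```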
A parallel result to Lemma 6.6.3 is the following.

Lemma 6.6.4  Suppose

    W = \begin{bmatrix} X \\ Y \end{bmatrix} \in M(L_\infty)    (6.6.24)

is rational. Then \|W\|_\infty \ge \|Y\|_\infty. Further, if \gamma > \|Y\|_\infty, then

    \|W\|_\infty \le \gamma \iff \|X(\gamma^2 I - Y^*Y)^{-1/2}\|_\infty \le 1 .    (6.6.25)
Now let us return to the problem of minimizing the cost function in (6.6.20). Let \gamma be a current guess of the minimum, and let K denote [C \; D]. Then, from Lemma 6.6.4,

    \min_{Q \in M(H_\infty)} \left\| \begin{bmatrix} A - Q & B \\ C & D \end{bmatrix} \right\|_\infty \le \gamma ,    (6.6.26)

if and only if

    \min_{Q \in M(H_\infty)} \left\| [A - Q \;\; B](\gamma^2 I - K^*K)^{-1/2} \right\|_\infty \le 1 .    (6.6.27)

Now note that (\gamma^2 I - K^*K)^{-1/2} is a unit of M(RH_\infty), so that Q(\gamma^2 I - K^*K)^{-1/2} can be replaced by another free parameter R. If we define

    A_1 = A(\gamma^2 I - K^*K)^{-1/2} \in M(L_\infty) ,    (6.6.28)

    B_1 = B(\gamma^2 I - K^*K)^{-1/2} \in M(L_\infty) ,    (6.6.29)

then (6.6.27) is equivalent to

    \min_{R \in M(H_\infty)} \| [A_1 - R \;\; B_1] \|_\infty \le 1 .    (6.6.30)

Next, from Lemma 6.6.3, (6.6.30) holds if and only if

    \min_{R \in M(H_\infty)} \| (I - B_1 B_1^*)^{-1/2}(A_1 - R) \|_\infty \le 1 .    (6.6.31)

Once again, it is possible to replace (I - B_1 B_1^*)^{-1/2} R by another free parameter S \in M(H_\infty). If we define

    A_2 = (I - B_1 B_1^*)^{-1/2} A_1 \in M(L_\infty) ,    (6.6.32)

then (6.6.31) is equivalent to

    \min_{S \in M(H_\infty)} \| A_2 - S \|_\infty \le 1 .    (6.6.33)
The problem of finding the minimum in (6.6.33) is one of finding the closest matrix in M(H_\infty) to a given rational A_2 \in M(L_\infty). Express A_2 as T^{-1}U, where T, U \in M(RH_\infty), and factor T as T_o T_i, where T_o is outer and T_i is inner. Hence,

    A_2(z) = [T(z)]^{-1} U(z) = [T_i(z)]^{-1} [T_o(z)]^{-1} U(z) = [T_i(z)]^{-1} V(z) ,    (6.6.34)

where V = T_o^{-1} U \in M(RH_\infty). Now

    \|A_2 - S\|_\infty = \|T_i^{-1}V - S\|_\infty = \|T_i(T_i^{-1}V - S)\|_\infty = \|V - T_i S\|_\infty = \|T_i^{adj} V - bS\|_\infty ,    (6.6.35)

where b = |T_i|. This is the type of objective function covered by Theorem 6.5.3.
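The minimization in (6.6.33) is a Nehari-type problem. For a scalar symbol, a classical result (Nehari's theorem, the scalar content of the theorem cited above) gives the minimum distance to H-infinity as the largest singular value of the Hankel matrix built from the strictly anticausal Fourier coefficients. A sketch (my illustration; NumPy assumed) for the conjugate all-pass d_i^*(z) = (1 - 0.5z)/(z - 0.5) that reappears in Example 6.7.2, whose anticausal coefficients are c_k = 1.5(0.5)^k, k >= 1:

```python
import numpy as np

N = 60
c = 1.5 * 0.5 ** np.arange(1, 2 * N)                 # c[k-1] = coefficient of z^{-k}
H = np.array([[c[i + j] for j in range(N)] for i in range(N)])   # Hankel matrix
dist = np.linalg.svd(H, compute_uv=False)[0]
print(round(dist, 6))   # 1.0: the conjugate of an inner function is at distance 1 from H-infinity
```

Here the Hankel matrix is rank one, so the computation is exact up to the truncation at N terms.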
The preceding discussion can be summarized in the following iterative procedure for computing the minimum of the function in (6.6.20).

Step 1  Choose

    \gamma \ge \max \left\{ \|[C \; D]\|_\infty , \left\| \begin{bmatrix} B \\ D \end{bmatrix} \right\|_\infty \right\} .    (6.6.36)

Step 2  Form

    L = (\gamma^2 I - K^*K)^{-1/2} ,    (6.6.37)

where K = [C \; D].

Step 3  Form

    M = (I - BLL^*B^*)^{-1/2} .    (6.6.38)

If \|BL\|_\infty \ge 1, increase \gamma and return to Step 2; otherwise compute

    J_\gamma = \min_{S \in M(H_\infty)} \|MAL - S\|_\infty .    (6.6.39)

If J_\gamma > 1, increase \gamma; if J_\gamma < 1, decrease \gamma; and repeat from Step 2 until J_\gamma is sufficiently close to one.
6.7 TWO-PARAMETER COMPENSATOR
... Whether optimality is measured in the H_2-norm or the H_\infty-norm, there exist plants for which the optimum achievable with a two-parameter scheme is as small as one wishes compared to the one-parameter optimum.
To avoid purely technical details caused by the nonexistence of optimal compensators in the continuous-time case, the discussion below is addressed to the discrete-time case. However, with the aid of Lemmas 6.3.2, 6.4.1, and 6.5.1, this discussion can be translated to the continuous-time case.

Accordingly, suppose P is a given multivariable plant with transfer matrix P(z), where z is the transfer function of a unit delay. Suppose u_1 is a reference input to be tracked, while u_2 and u_3 are sensor noises to be rejected. Let W_1, W_2, W_3, W_4 be given weighting matrices in M(H_\infty). It is assumed that (i) P has no poles on the unit circle, (ii) W_1, W_3 are outer, (iii) W_2, W_4 are square outer, and (iv) P(e^{j\theta}), W_1(e^{j\theta}), \dots, W_4(e^{j\theta}) all have full row rank at all \theta \in [0, 2\pi].
For convenience, a two-parameter compensation scheme is reproduced in Figure 6.2.

Figure 6.2: Two-Parameter Compensation Scheme. [Block diagram: compensator blocks \tilde{N}_{c1}, \tilde{N}_{c2}, \tilde{D}_c^{-1} and plant factors N, D, with inputs u_1, u_2, u_3 and signals z_1, z_2, y_1, y_2.]

The
optimal tracking problem is to minimize either

    J_2 = \|W_1(I - H_{yu_1})W_2\|_2 ,    (6.7.1)

or

    J_\infty = \|W_1(I - H_{yu_1})W_2\|_\infty ,    (6.7.2)

by means of a suitable stabilizing compensator. The optimal disturbance rejection problem is to select a stabilizing compensator that minimizes either

    K_{21} = \|W_3 H_{yu_2} W_4\|_2   or   K_{22} = \|W_3 H_{vu_3} W_4\|_2 ,    (6.7.3)

or

    K_{\infty 1} = \|W_3 H_{yu_2} W_4\|_\infty   or   K_{\infty 2} = \|W_3 H_{vu_3} W_4\|_\infty .    (6.7.4)
From the discussion in Section 5.6, it is at once clear that, as far as minimizing K_{21}, K_{22}, K_{\infty 1}, K_{\infty 2} is concerned, there is no advantage to using a two-parameter scheme instead of a one-parameter scheme. This is because the sets of achievable stable transfer matrices H_{yu_2}, H_{vu_3} are exactly the same with either scheme. But with the optimal tracking problem, the situation is quite different. The minimum J_2 or J_\infty achievable with a two-parameter scheme can be attained simultaneously with the minimum of K_{21} or K_{\infty 1} (or any other combination). This is the overwhelming advantage of the two-parameter scheme. To justify the latter statement first, recall from (5.6.15) that the set of achievable stable transfer matrices from (u_1, u_2, u_3) to y are parametrized by

    [NQ \;\; N(Y - R\tilde{N}) \;\; N(X + R\tilde{D})] ,    (6.7.5)

where (N, D), (\tilde{D}, \tilde{N}) are any r.c.f. and l.c.f. of P, X, Y \in M(H_\infty) satisfy XN + YD = I, and Q, R range over M(H_\infty). To minimize J_\infty and K_{22} simultaneously, for example, one chooses Q to minimize

    J_\infty = \|W_1(I - NQ)W_2\|_\infty ,    (6.7.6)

and R to minimize

    K_{22} = \|W_3(I - N(X + R\tilde{D}))W_4\|_2 .    (6.7.7)

The associated compensator

    \tilde{N}_{c1} = Q ,  \tilde{N}_{c2} = X + R\tilde{D} ,  \tilde{D}_c = Y - R\tilde{N} ,    (6.7.8)

minimizes the weighted tracking error and optimally rejects the weighted disturbance.
From (5.6.16), the set of stable transfer matrices H_{yu_1} that can be achieved using a one-parameter compensator is \{N(X + R\tilde{D}) : R \in M(H_\infty)\}. Define

    J_\infty = \min_{Q \in M(H_\infty)} \|W_1(I - NQ)W_2\|_\infty ,    (6.7.9)

    J_2 = \min_{Q \in M(H_\infty)} \|W_1(I - NQ)W_2\|_2 ,    (6.7.10)

    h_\infty = \min_{R \in M(H_\infty)} \|W_1(I - N(X + R\tilde{D}))W_2\|_\infty ,    (6.7.11)

    h_2 = \min_{R \in M(H_\infty)} \|W_1(I - N(X + R\tilde{D}))W_2\|_2 .    (6.7.12)

Then J_\infty, J_2 (resp. h_\infty, h_2) are the best weighted tracking errors achievable with a two-parameter (resp. one-parameter) compensation scheme. Clearly J_\infty \le h_\infty, J_2 \le h_2. If P is a stable plant, then one can choose N = P, X = 0, \tilde{D} = I, in which case it is immediate that J_2 = h_2, J_\infty = h_\infty. On the other hand, there exist plants for which the ratios J_\infty/h_\infty or J_2/h_2 are as small as one wishes.
6.7. TWO-PARAMETER COMPENSATOR 201
To set up these examples, let us study the computation of the various minima in (6.7.9)-(6.7.12). Consider first the constants J_\infty, h_\infty in the scalar case, where

    J_\infty = \min_{q \in H_\infty} \|(1 - nq)w\|_\infty .    (6.7.13)

At this stage it can be assumed without loss of generality that w is outer: If w = w_i w_o is an inner-outer factorization of w, then multiplication by w_i does not affect the norm. Also, since w(e^{j\theta}) \ne 0 \;\forall \theta \in [0, 2\pi], w is actually a unit of H_\infty. Therefore

    J_\infty = \min_{q \in H_\infty} \|w - n_i n_o q w\|_\infty = \min_{q \in H_\infty} \|w - n_i q\|_\infty ,    (6.7.14)

after absorbing the units n_o and w into q. Thus, the problem is to find a function f of minimum H_\infty-norm that interpolates the values of w at the zeros of the inner function n_i. By definition,

    h_\infty = \min_{r \in H_\infty} \|[1 - n(x + rd)]w\|_\infty = \min_{r \in H_\infty} \|(y - rn)dw\|_\infty .    (6.7.15)

Factor d as d_i d_o, where d_i, d_o are respectively inner and outer, and absorb all units into the free parameter r. This leads to

    h_\infty = \min_{r \in H_\infty} \|[y d_o w - n_i r] d_i\|_\infty = \min_{r \in H_\infty} \|y d_o w - n_i r\|_\infty ,    (6.7.16)

since multiplication by d_i preserves norms. Thus, the problem is to find a function g of minimum H_\infty-norm that interpolates the values of y d_o w at the zeros of n_i; at such a zero z, (y d_o w)(z) = (w/d_i)(z), so h_\infty is in general larger than J_\infty. In fact, if p has a pole-zero pair close together inside the unit disc, then |d_i(\cdot)| will be very small at some zero of n_i, and h_\infty will be much larger than J_\infty.
Example 6.7.1  Let w = 1, and let

    p(z) = \frac{z - 0.5 - \delta}{z - 0.5} .

Then

    n = z - 0.5 - \delta ,  n_i = \frac{z - 0.5 - \delta}{1 - (0.5 + \delta)z} ,
    d = z - 0.5 ,  d_i = \frac{z - 0.5}{1 - 0.5z} .

Since n_i has only one zero, at z = 0.5 + \delta,

    J_\infty = |w(0.5 + \delta)| = 1 ,
    h_\infty = |(w/d_i)(0.5 + \delta)| = 1/|d_i(0.5 + \delta)| = \frac{0.75 - 0.5\delta}{\delta} .
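A numerical companion to the example (my illustration, not from the text): h_\infty = 1/|d_i(0.5 + \delta)| grows without bound as the pole-zero gap \delta shrinks, while J_\infty = 1 throughout.

```python
d_i = lambda z: (z - 0.5) / (1 - 0.5 * z)     # inner factor of d = z - 0.5

for delta in [0.1, 0.01, 0.001]:
    h_inf = 1 / abs(d_i(0.5 + delta))         # equals (0.75 - 0.5*delta)/delta
    print(delta, round(h_inf, 3), round(1 / h_inf, 6))   # ratio J_inf/h_inf -> 0
```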
By choosing \delta sufficiently small, the ratio J_\infty/h_\infty can be made as small as one wishes.

Next, consider J_2 and h_2. Recall from Section 6.2 that the space L_2[0, 2\pi] can be decomposed as the direct sum of the orthogonal subspaces H_2^+ and H_2^-, where H_2^+ consists of the boundary values of H_2-functions. Given f \in L_2[0, 2\pi], let f^+ (resp. f^-) denote its orthogonal projection onto H_2^+ (resp. H_2^-). Finally, if f \in H_2, then f^- = 0. If F \in M(H_\infty), then F_o (resp. F_i) denotes the outer (resp. inner) part of F.
For ease of discussion, suppose W_1 = I, and denote W_2 by W. Also, suppose W is a unit of M(H_\infty). Then

    J_2 = \min_{Q \in M(H_\infty)} \|(I - NQ)W\|_2
    = \min_{Q \in M(H_\infty)} \|W - N_i Q\|_2 ,  after absorbing N_o, W into Q
    = \min_{Q \in M(H_\infty)} \|N_i^* W - Q\|_2 = \|(N_i^* W)^-\|_2 ,    (6.7.18)

corresponding to the choice Q = (N_i^* W)^+. On the other hand,

    h_2 = \min_{R \in M(H_\infty)} \|(I - N(X + R\tilde{D}))W\|_2 .    (6.7.19)
To bring out the connection between h_2 and J_2 more clearly, select \tilde{X}, \tilde{Y} \in M(H_\infty) such that

    \begin{bmatrix} Y & X \\ -\tilde{N} & \tilde{D} \end{bmatrix} \begin{bmatrix} D & -\tilde{X} \\ N & \tilde{Y} \end{bmatrix} = \begin{bmatrix} D & -\tilde{X} \\ N & \tilde{Y} \end{bmatrix} \begin{bmatrix} Y & X \\ -\tilde{N} & \tilde{D} \end{bmatrix} = I .    (6.7.20)
This is always possible, by Theorem 4.1.16. If we let (\tilde{D}W)_o denote the outer part of \tilde{D}W, then

    h_2 = \min_{R \in M(H_\infty)} \|(\tilde{Y} - NR)\tilde{D}W\|_2
    = \min_{R \in M(H_\infty)} \|\tilde{Y}(\tilde{D}W)_o - NR(\tilde{D}W)_o\|_2
    = \min_{R \in M(H_\infty)} \|N_i^* \tilde{Y}(\tilde{D}W)_o - N_o R(\tilde{D}W)_o\|_2
    = \min_{R \in M(H_\infty)} \|N_i^* \tilde{Y}(\tilde{D}W)_o - R\|_2 = \|[N_i^* \tilde{Y}(\tilde{D}W)_o]^-\|_2 .    (6.7.21)

To compare (6.7.18) and (6.7.21), note that

    NX + \tilde{Y}\tilde{D} = I ,    (6.7.22)

so that

    N_i^* W = N_i^* (NX + \tilde{Y}\tilde{D})W = N_o XW + N_i^* \tilde{Y}\tilde{D}W .    (6.7.23)

However, since N_o XW \in M(H_2),

    (N_i^* W)^- = [N_i^* \tilde{Y}\tilde{D}W]^- .    (6.7.24)

So if we define F = N_i^* \tilde{Y}\tilde{D}W \in M(L_2), then

    J_2 = \|F^-\|_2 ,  h_2 = \|[F(\tilde{D}W)_i^*]^-\|_2 .    (6.7.25)

This implies that h_2 is in general larger than J_2.
Example 6.7.2  Consider the scalar case; let w = 1, and let

    p(z) = \frac{z^m}{1 - 2z} ,

where m is an integer. Then we can choose

    n = z^m ,  d = 1 - 2z ,
    x = 2^m ,  y = 1 + 2z + 4z^2 + \cdots + 2^{m-1} z^{m-1} .

In this case n_i = n since n is already inner. Hence, we have

    n_i = z^m ,  n_i^* = z^{-m} ,
    d_i = \frac{z - 0.5}{1 - 0.5z} ,  d_i^* = \frac{1 - 0.5z}{z - 0.5} ,
    f = n_i^* y d = n_i^*(1 - nx) = n_i^* - x = z^{-m} - 2^m ,
    J_2 = \|f^-\|_2 = \|z^{-m}\|_2 = 1 ,
    f d_i^* = (z^{-m} - 2^m) d_i^* ,
    h_2 = \|(f d_i^*)^-\|_2 = \|(z^{-m} d_i^*)^- - (2^m d_i^*)^-\|_2 \ge 2^m \|(d_i^*)^-\|_2 - \|(z^{-m} d_i^*)^-\|_2 .
Now expand d_i^*(z) in a Fourier series

    d_i^*(z) = \sum_{k=0}^{\infty} c_k z^{-k} ,

where

    c_0 = -0.5 ,  c_k = 1.5(0.5)^k   for k \ge 1 .

Then, since d_i is inner,

    1 = \|d_i^*\|_2^2 = \sum_{k=0}^{\infty} c_k^2 .

Therefore,

    \sum_{k=1}^{\infty} c_k^2 = 0.75 .

Since

    d_i^*(z) = \sum_{k=0}^{\infty} c_k z^{-k} ,  (d_i^*)^- = \sum_{k=1}^{\infty} c_k z^{-k} ,

we see that

    \|(d_i^*)^-\|_2^2 = \sum_{k=1}^{\infty} c_k^2 = 0.75 .

On the other hand, if m \ge 1, then we have

    (z^{-m} d_i^*)^- = z^{-m} d_i^* = \sum_{k=0}^{\infty} c_k z^{-(k+m)} ,  \|(z^{-m} d_i^*)^-\|_2 = \|d_i^*\|_2 = 1 .

Hence,

    h_2 \ge 2^{m-1}\sqrt{3} - 1 .

By choosing the integer m sufficiently large, the ratio J_2/h_2 can be made as small as one desires.
NOTES AND REFERENCES
Section 6.2 contains only the barest minimum of the theory of Hardy spaces. For excellent treatments of this topic, see Duren [34], Koosis [59] or Sz.-Nagy and Foias [69].
Solving the filtering problem of Section 6.3 was the original motivation of Youla, Jabr and Bongiorno [111] in parametrizing the set of all compensators that stabilize a given plant. Kucera [60] gives a solution to the discrete-time filtering problem that is similar in its approach to that of [111]. The treatment given here follows Francis [36].
The sensitivity minimization problem was formulated by Zames [113] and solved in the scalar case by Zames and Francis [40, 115]. The treatment here is similar to that in [36]. In the multivariable case, the fat plant case was solved by Chang and Pearson [12] and Francis, Helton, and Zames [38]. Our treatment is adapted from the former. The general sensitivity minimization problem was solved by Doyle [31]. Note that it is possible to carry out all of the steps of Sections 6.4 to 6.6 using purely state-space methods; see [16, 31, 44]. The paper of Sarason [86] contains some mathematical results pertinent to the sensitivity minimization problem. The Nevanlinna-Pick algorithm used in Section 6.5 is taken from the paper by Delsarte, Genin, and Kamp [18], but this paper treats only the square matrix case. Finally, Potapov [75] contains several useful results on matrices of analytic functions, such as Lemma 6.5.6.
CHAPTER 7
Robustness
This chapter begins with a detailed study of the issue of the robustness of feedback stability in
the presence of plant and/or compensator perturbations. Both the nature and the extent of the
perturbations that maintain feedback stability are discussed. Then the problem of robust regulation
(i.e., stabilization plus tracking and/or disturbance rejection) is examined, for both the one-parameter
as well as the two-parameter compensation schemes. Finally, various generic properties of control
systems are studied.
7.1 INTRODUCTION
Consider the standard feedback system shown in Figure 7.1, and suppose the compensator C stabilizes the plant P, i.e., that the closed-loop transfer matrix

    H(P, C) = \begin{bmatrix} (I + PC)^{-1} & -P(I + CP)^{-1} \\ C(I + PC)^{-1} & (I + CP)^{-1} \end{bmatrix}    (7.1.1)

belongs to M(S). The central question studied in this chapter is: What sort of perturbations can be
Figure 7.1: The standard feedback system, with inputs u_1, u_2, errors e_1, e_2, and outputs y_1, y_2 of the compensator C and plant P.

tolerated in the plant and/or the compensator while maintaining feedback stability? To study this question, suppose the plant description is given by a function P_\delta, where the uncertainty parameter \delta assumes values in some topological space \Delta. Typically \delta represents some physical parameters of the plant, such as component values, and the function P_\delta captures the manner in which this uncertainty in the physical parameters is reflected in the plant description. The parameter space \Delta may be R^n for some integer n, or a more general normed space, or any topological space. An example of the latter situation arises in the design of compensators for singularly perturbed systems, where \Delta = [0, \infty). For technical reasons, it is assumed that \Delta is a first-countable topological space (see Section C.1 for the definition). As all metric spaces are first-countable, this poses no practical restriction. Let P_{\delta_0} be the nominal plant description on which the compensator design is based. The question is: When is it possible to design a stabilizing compensator for P_{\delta_0} that is robust against the plant uncertainty? To answer this question, it is necessary first to define robust stabilization. The definition adopted here is that a compensator C \in S(P_{\delta_0}) robustly stabilizes the family of plants P_\delta if there is a neighborhood N of \delta_0 such that (i) C \in S(P_\delta) \;\forall \delta \in N, and (ii) the stable closed-loop transfer matrix H(P_\delta, C) is continuous in \delta at \delta_0. The first condition
means that C stabilizes not only the nominal plant P_{\delta_0} but also all perturbed plants P_\delta for all \delta sufficiently near \delta_0. But this alone is not enough: The objective of compensator design is not merely to stabilize a plant but also to shape the stable closed-loop response. Thus, the second condition is imposed, requiring that the perturbed (stable) closed-loop response H(P_\delta, C) be close to the nominal (stable) closed-loop response H(P_{\delta_0}, C) whenever \delta is near \delta_0. Note that the question as posed above is purely qualitative, in that one is not concerned about the size of the neighborhood N, but merely about its existence. The quantitative aspects of robustness are studied in the next two sections.

The question studied (and answered) in this section is: When does there exist a robustly stabilizing compensator for a given family P_\delta? The answer is: if and only if the function P_\delta is continuous in the graph topology at \delta = \delta_0. In other words, robust stabilization is possible if and only if any variations in the parameter \delta from its nominal value \delta_0 affect the plant description in a continuous manner, where the continuity of P_\delta is with respect to the graph topology.

7.2 GRAPH TOPOLOGY
... One can define graphs with respect to L_\infty and L_2 in a manner entirely analogous with (7.2.1), namely
    G_{L_\infty}(P) = \{(u, y) \in L_\infty^{n+m} : \hat{y} = \hat{P}\hat{u}\} ,    (7.2.3)

    G_{L_2}(P) = \{(u, y) \in L_2^{n+m} : \hat{y} = \hat{P}\hat{u}\} ,    (7.2.4)

where the hat denotes Laplace transformation. It can be shown [95] that these graphs can also be simply characterized in terms of an r.c.f. of P. Specifically, if (N, D) is any r.c.f. of P, then

    G_{L_\infty}(P) = \{(Dz, Nz) : z \in L_\infty^m\} ,    (7.2.5)

    G_{L_2}(P) = \{(Dz, Nz) : z \in L_2^m\} ,    (7.2.6)

where D and N are now interpreted as bounded operators on L_\infty or L_2. But technically Lemma 7.2.1 is easier to prove than (7.2.5) or (7.2.6).
Proof. Suppose u = Dz for some z \in S^m. Then Pu = ND^{-1}Dz = Nz \in S^n. Hence, for all z \in S^m, the pair (Dz, Nz) belongs to G_S(P). It remains to show that every point (u, y) \in G_S(P) is of the form (Dz, Nz) for some z \in S^m. Since N and D are right-coprime, one can select X, Y \in M(S) such that XN + YD = I. Now suppose (u, y) \in G_S(P), i.e., suppose u \in S^m, y \in S^n and y = Pu.
7.2. GRAPH TOPOLOGY 211
Define z = D^{-1}u \in [R(s)]^m. Then clearly (u, y) = (Dz, Nz). The proof is complete if it can be shown that z actually belongs to S^m. Note that

    z = D^{-1}u = (XN + YD)D^{-1}u = XND^{-1}u + Yu = Xy + Yu \in S^m ,    (7.2.7)

since all quantities in the last expression belong to M(S). \Box

Corollary 7.2.3  Suppose P \in S^{n \times m}. Then

    G_S(P) = \{(z, Pz) : z \in S^m\} .    (7.2.8)

Proof. Note that (P, I) is an r.c.f. of P. \Box
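Lemma 7.2.1 and (7.2.7) can be illustrated symbolically for a concrete unstable plant. The factors and Bezout pair below are my choices, not the text's (SymPy assumed): p(s) = 1/(s - 1), with n = 1/(s + 1), d = (s - 1)/(s + 1), and x = 2, y = 1 solving xn + yd = 1, all four being stable and proper.

```python
import sympy as sp

s = sp.symbols('s')
n, d = 1 / (s + 1), (s - 1) / (s + 1)     # an r.c.f. of p over S
p = sp.simplify(n / d)                    # p = 1/(s - 1)
x, y_ = 2, 1                              # Bezout pair: x*n + y_*d = 1
assert sp.simplify(x * n + y_ * d - 1) == 0

# Any stable z gives a graph point (u, y) = (d z, n z), and z is recovered
# from (u, y) via z = X y + Y u, exactly as in (7.2.7):
z = 1 / (s + 2)                           # an arbitrary element of S
u, yy = d * z, n * z
assert sp.simplify(yy - p * u) == 0       # (u, yy) lies on the graph of p
assert sp.simplify(x * yy + y_ * u - z) == 0
print("graph checks pass")
```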
Now we are ready to define the graph topology. Suppose (N, D) is a right-coprime pair, and that |D| \ne 0. Then there exists a constant \mu = \mu(N, D) such that (N_1, D_1) is also a right-coprime pair and |D_1| \ne 0 whenever \|[(N_1 - N)' \; (D_1 - D)']'\| < \mu. Now let

    \mathcal{N}_1 = \Big\{ \tilde{N}\tilde{D}^{-1} : \Big\| \begin{bmatrix} \tilde{N} - N_1 \\ \tilde{D} - D_1 \end{bmatrix} \Big\| < \mu_1 \Big\} ,    (7.2.11)

    \mathcal{N}_2 = \Big\{ \tilde{N}\tilde{D}^{-1} : \Big\| \begin{bmatrix} \tilde{N} - N_2 \\ \tilde{D} - D_2 \end{bmatrix} \Big\| < \mu_2 \Big\} ,    (7.2.12)

be two basic neighborhoods of the form (7.2.10), and suppose P \in \mathcal{N}_1 \cap \mathcal{N}_2. Then P has r.c.f.s (\tilde{N}_1, \tilde{D}_1), (\tilde{N}_2, \tilde{D}_2) such that

    \Big\| \begin{bmatrix} \tilde{N}_1 - N_1 \\ \tilde{D}_1 - D_1 \end{bmatrix} \Big\| =: r_1 < \mu_1 ,  \Big\| \begin{bmatrix} \tilde{N}_2 - N_2 \\ \tilde{D}_2 - D_2 \end{bmatrix} \Big\| =: r_2 < \mu_2 .    (7.2.13)

Moreover, there exists a U \in U(S) such that \tilde{N}_2 = \tilde{N}_1 U, \tilde{D}_2 = \tilde{D}_1 U. Now select a positive number r such that r \le \min\{\mu_1 - r_1, (\mu_2 - r_2)/\|U\|\}, and define

    \mathcal{N} = \Big\{ ND^{-1} : \Big\| \begin{bmatrix} N - \tilde{N}_1 \\ D - \tilde{D}_1 \end{bmatrix} \Big\| < r \Big\} .    (7.2.14)

Then \mathcal{N} is certainly a basic neighborhood containing P. It only remains to show that \mathcal{N} \subseteq \mathcal{N}_1 \cap \mathcal{N}_2. This follows readily: First,

    \Big\| \begin{bmatrix} N - \tilde{N}_1 \\ D - \tilde{D}_1 \end{bmatrix} \Big\| < r \implies \Big\| \begin{bmatrix} N - N_1 \\ D - D_1 \end{bmatrix} \Big\| < r_1 + r < \mu_1 ,    (7.2.15)

which shows that ND^{-1} \in \mathcal{N}_1. Next,

    \Big\| \begin{bmatrix} N - \tilde{N}_1 \\ D - \tilde{D}_1 \end{bmatrix} \Big\| < r \implies \Big\| \begin{bmatrix} NU - \tilde{N}_2 \\ DU - \tilde{D}_2 \end{bmatrix} \Big\| < r\|U\| \implies \Big\| \begin{bmatrix} NU - N_2 \\ DU - D_2 \end{bmatrix} \Big\| < r_2 + r\|U\| < \mu_2 ,    (7.2.16)

which shows that (NU)(DU)^{-1} = ND^{-1} \in \mathcal{N}_2. Hence \mathcal{N} \subseteq \mathcal{N}_1 \cap \mathcal{N}_2. \Box
The topology on M(R(s)) defined by the collection of sets of the form (7.2.10) as base is called the graph topology. In it, two plants P_1 and P_2 are close if they have r.c.f.s (N_1, D_1), (N_2, D_2) such that the difference (N_1, D_1) - (N_2, D_2) is small. In view of Lemma 7.2.1, this means that P_1 and P_2 are close if their graphs are close. This is the reason for the name "graph topology."
The central result of this section is that the graph topology on M(R(s)) is the weakest topology in which feedback stability is a robust property. In order to state this result precisely, it is necessary first to characterize the convergence of sequences and continuity of functions in the graph topology. Suppose \{P_i\} is a sequence in M(R(s)) and that P \in M(R(s)). Then by definition (Section C.1), \{P_i\} converges to P if and only if every neighborhood of P contains all but a finite number of terms of the sequence \{P_i\}. Since every neighborhood of P contains a basic neighborhood of P, it is easy to see that \{P_i\} converges to P if and only if every basic neighborhood of P contains all but a finite number of terms of \{P_i\}. All this is a ready consequence of the definition of the graph topology and of the definition of convergence in a topological space. The next lemma gives several equivalent characterizations of convergence in the graph topology.

Lemma 7.2.5  Suppose \{P_i\} is a sequence in M(R(s)) and that P \in M(R(s)). Then the following statements are equivalent:
(i) \{P_i\} converges to P in the graph topology.
(ii) For every r.c.f. (N, D) of P, there exist r.c.f.s (N_i, D_i) of P_i such that N_i \to N, D_i \to D in M(S).
(iii) There exist an r.c.f. (N, D) of P and a sequence \{(N_i, D_i)\} of r.c.f.s of P_i such that N_i \to N, D_i \to D in M(S).
(iv) For every l.c.f. (\tilde{D}, \tilde{N}) of P, there exists a sequence \{(\tilde{D}_i, \tilde{N}_i)\} of l.c.f.s of P_i such that \tilde{N}_i \to \tilde{N}, \tilde{D}_i \to \tilde{D} in M(S).
(v) There exist an l.c.f. (\tilde{D}, \tilde{N}) of P and a sequence \{(\tilde{D}_i, \tilde{N}_i)\} of l.c.f.s of P_i such that \tilde{N}_i \to \tilde{N}, \tilde{D}_i \to \tilde{D} in M(S).
Proof. (i) \implies (ii). This is a ready consequence of the definition of the graph topology and of the discussion in the paragraph preceding the lemma. Suppose \{P_i\} converges to P. Then every basic neighborhood of P contains all but a finite number of terms in the sequence \{P_i\}. In particular, let (N, D) be any r.c.f. of P and let \{\mu_k\} be any sequence of positive numbers converging to zero. (Assume without loss of generality that \mu_k < \mu(N, D) \;\forall k.) Define

    \mathcal{N}_k = \Big\{ \tilde{N}\tilde{D}^{-1} : \Big\| \begin{bmatrix} \tilde{N} - N \\ \tilde{D} - D \end{bmatrix} \Big\| < \mu_k \Big\} .    (7.2.17)

Then for each integer k there exists an integer j_k such that P_i \in \mathcal{N}_k \;\forall i \ge j_k. That is, there exist r.c.f.s (N_i, D_i) of P_i such that

    \Big\| \begin{bmatrix} N_i - N \\ D_i - D \end{bmatrix} \Big\| < \mu_k   \forall i \ge j_k .    (7.2.18)

From this, one can conclude that

    \Big\| \begin{bmatrix} N_i - N \\ D_i - D \end{bmatrix} \Big\| \to 0 ,    (7.2.19)
i.e., that N_i \to N, D_i \to D in M(S). This shows that (i) \implies (ii). The converse implication follows just as easily. Suppose (ii) is true and let \mathcal{N} be any basic neighborhood of P. Then \mathcal{N} is of the form (7.2.10) for some r.c.f. (N, D) of P and some \mu > 0. By (ii), there exists a sequence \{(N_i, D_i)\} of r.c.f.s of P_i such that (N_i, D_i) \to (N, D) in M(S). Hence [N_i' \; D_i']' \to [N' \; D']', so that P_i \in \mathcal{N} for all sufficiently large i; this establishes (i).

(ii) \implies (v). Select \tilde{X}, \tilde{Y} \in M(S) and an l.c.f. (\tilde{D}, \tilde{N}) of P such that

    \begin{bmatrix} Y & X \\ -\tilde{N} & \tilde{D} \end{bmatrix} \begin{bmatrix} D & -\tilde{X} \\ N & \tilde{Y} \end{bmatrix} = I .    (7.2.20)

This can always be done, in view of Theorem 4.1.16. Since (N_i, D_i) \to (N, D) and since U(S) is an open subset of M(S), it follows that

    F_i = \begin{bmatrix} D_i & -\tilde{X} \\ N_i & \tilde{Y} \end{bmatrix} \in U(S)   for large enough i .    (7.2.21)

Now let G_i = F_i^{-1} \in M(S) for large enough i, and partition G_i as

    G_i = \begin{bmatrix} Y_i & X_i \\ -\tilde{N}_i & \tilde{D}_i \end{bmatrix} .    (7.2.22)

Then \tilde{N}_i, \tilde{D}_i are left-coprime (for large enough i), since \tilde{N}_i\tilde{X} + \tilde{D}_i\tilde{Y} = I. Further, -\tilde{N}_i D_i + \tilde{D}_i N_i = 0 implies that \tilde{D}_i^{-1}\tilde{N}_i = N_i D_i^{-1} = P_i.^2 Hence (\tilde{D}_i, \tilde{N}_i) is an l.c.f. of P_i. Finally, since inversion of unimodular matrices is continuous, we have that

    G_i \to (\lim_i F_i)^{-1} = \begin{bmatrix} Y & X \\ -\tilde{N} & \tilde{D} \end{bmatrix} .    (7.2.23)

This shows that (\tilde{D}_i, \tilde{N}_i) \to (\tilde{D}, \tilde{N}), which is an l.c.f. of P. This is (v).

(v) \implies (iii). This is entirely analogous to the above. \Box

^2 It still has to be shown that \tilde{D}_i is nonsingular for this argument to be valid. However, this is straightforward; see the proof of Lemma 8.1.17.
The next result characterizes the continuity of functions mapping \Delta into M(R(s)). It is a ready consequence of Lemma 7.2.5.

Corollary 7.2.6  Suppose \Delta is a first-countable topological space, and that P_\delta is a function mapping \Delta into M(R(s)). Then the following are equivalent:
(i) The function P_\delta is continuous at \delta = \delta_0 when M(R(s)) is equipped with the graph topology.
(ii) For each r.c.f. (N_0, D_0) of P_{\delta_0}, there exists a family of r.c.f.s (N_\delta, D_\delta) of P_\delta such that (N_{\delta_0}, D_{\delta_0}) = (N_0, D_0), and such that N_\delta, D_\delta are continuous at \delta_0.
(iii) There exists a family (N_\delta, D_\delta) of r.c.f.s of P_\delta such that N_\delta, D_\delta are continuous at \delta_0.
(iv) For each l.c.f. (\tilde{D}_0, \tilde{N}_0) of P_{\delta_0}, there exists a family (\tilde{D}_\delta, \tilde{N}_\delta) of l.c.f.s of P_\delta such that (\tilde{D}_{\delta_0}, \tilde{N}_{\delta_0}) = (\tilde{D}_0, \tilde{N}_0), and such that \tilde{D}_\delta, \tilde{N}_\delta are continuous at \delta_0.
(v) There exists a family (\tilde{D}_\delta, \tilde{N}_\delta) of l.c.f.s of P_\delta such that \tilde{D}_\delta, \tilde{N}_\delta are continuous at \delta_0.

We can now state the main robustness result: a robustly stabilizing compensator exists for the family \{P_\delta\} if and only if the function P_\delta is continuous at \delta = \delta_0 in the graph topology; further, if P_\delta is continuous at \delta_0, then every C \in S(P_{\delta_0}) robustly stabilizes the family \{P_\delta\}.
Theorem 7.2.7  Suppose P_\delta is a function mapping \Delta into M(R(s)).
(i) Suppose P_\delta is continuous at \delta = \delta_0 when M(R(s)) is equipped with the graph topology. Let C be any element of S(P_{\delta_0}). Then there exists a neighborhood N of \delta_0 such that C \in S(P_\delta) \;\forall \delta \in N, and H(P_\delta, C) is continuous at \delta = \delta_0.
(ii) Conversely, suppose there exist a C \in S(P_{\delta_0}) and a neighborhood N of \delta_0 such that C \in S(P_\delta) \;\forall \delta \in N and H(P_\delta, C) is continuous at \delta = \delta_0. Then P_\delta is continuous at \delta = \delta_0 in the graph topology.

Proof. (i) Suppose P_\delta is continuous at \delta_0. By Corollary 7.2.6 (iii), there exists a family (N_\delta, D_\delta) of r.c.f.s of P_\delta such that N_\delta, D_\delta are continuous at \delta_0. Since C \in S(P_{\delta_0}), there is an l.c.f. (\tilde{D}_c, \tilde{N}_c) of C such that \tilde{D}_c D_{\delta_0} + \tilde{N}_c N_{\delta_0} = I (see Corollary 5.1.8). Moreover,

    H(P_{\delta_0}, C) = \begin{bmatrix} I - N_{\delta_0}\tilde{N}_c & N_{\delta_0}\tilde{D}_c \\ D_{\delta_0}\tilde{N}_c & D_{\delta_0}\tilde{D}_c \end{bmatrix} .    (7.2.24)

Now define \Gamma_\delta = \tilde{D}_c D_\delta + \tilde{N}_c N_\delta. Then \Gamma_{\delta_0} = I and, whenever \Gamma_\delta is unimodular,

    H(P_\delta, C) = \begin{bmatrix} I - N_\delta \Gamma_\delta^{-1}\tilde{N}_c & N_\delta \Gamma_\delta^{-1}\tilde{D}_c \\ D_\delta \Gamma_\delta^{-1}\tilde{N}_c & D_\delta \Gamma_\delta^{-1}\tilde{D}_c \end{bmatrix} .    (7.2.25)

Since \Gamma_\delta is continuous at \delta_0, \Gamma_{\delta_0} = I, and U(S) is open in M(S), there is a neighborhood N of \delta_0 such that \Gamma_\delta \in U(S), and hence C \in S(P_\delta), \forall \delta \in N. Finally, since inversion is continuous on U(S), it follows from (7.2.25) that H(P_\delta, C) is continuous at \delta = \delta_0.
(ii) By hypothesis, there exist a C \in S(P_{\delta_0}) and a neighborhood N of \delta_0 such that C \in S(P_\delta) \;\forall \delta \in N and H(P_\delta, C) is continuous at \delta_0. Since C \in S(P_{\delta_0}), by Corollary 5.1.8 there exists an l.c.f. (\tilde{D}_c, \tilde{N}_c) of C such that \tilde{D}_c D_{\delta_0} + \tilde{N}_c N_{\delta_0} = I. Now, since C \in S(P_\delta) \;\forall \delta \in N, it follows by symmetry that for each \delta \in N one can choose an r.c.f. (N_\delta, D_\delta) of P_\delta such that \tilde{D}_c D_\delta + \tilde{N}_c N_\delta = I \;\forall \delta \in N. Moreover,

    H(P_{\delta_0}, C) = \begin{bmatrix} I - N_{\delta_0}\tilde{N}_c & N_{\delta_0}\tilde{D}_c \\ D_{\delta_0}\tilde{N}_c & D_{\delta_0}\tilde{D}_c \end{bmatrix} ,    (7.2.26)

    H(P_\delta, C) = \begin{bmatrix} I - N_\delta\tilde{N}_c & N_\delta\tilde{D}_c \\ D_\delta\tilde{N}_c & D_\delta\tilde{D}_c \end{bmatrix} .    (7.2.27)

By hypothesis H(P_\delta, C) is continuous at \delta = \delta_0. Hence so is the function

    F_\delta = \begin{bmatrix} N_\delta\tilde{N}_c & N_\delta\tilde{D}_c \\ D_\delta\tilde{N}_c & D_\delta\tilde{D}_c \end{bmatrix} .    (7.2.28)

Since N_{\delta_0}, D_{\delta_0} are fixed elements of M(S) (i.e., independent of \delta), the function

    F_\delta \begin{bmatrix} N_{\delta_0} \\ D_{\delta_0} \end{bmatrix} = \begin{bmatrix} N_\delta \\ D_\delta \end{bmatrix}   (since \tilde{N}_c N_{\delta_0} + \tilde{D}_c D_{\delta_0} = I) ,    (7.2.29)

is also continuous at \delta_0. Thus, N_\delta, D_\delta are continuous at \delta_0. Finally, since H(P_\delta, C) is continuous at \delta_0, it follows that

    \lim_{\delta \to \delta_0} H(P_\delta, C) = H(P_{\delta_0}, C) .    (7.2.30)

Then from (7.2.29) it follows that

    \lim_{\delta \to \delta_0} \begin{bmatrix} N_\delta \\ D_\delta \end{bmatrix} = \begin{bmatrix} N_{\delta_0} \\ D_{\delta_0} \end{bmatrix} .    (7.2.31)

Now by Corollary 7.2.6 (ii), the map P_\delta is continuous at \delta_0. \Box
In the proof of Theorem 7.2.7, the only properties of M(S) used were (i) the set of units is
open, and (ii) inversion of a unit is continuous. Thus, Theorem 7.2.7 holds under extremely general
conditions. Specically, one can replace S by any topological ring with the above two properties, plus
the obvious ones (namely S is a commutative domain with identity), and Theorem 7.2.7 remains
valid. The same remark applies to Theorem 7.2.8 and Corollary 7.2.9 below.
Theorem 7.2.7 is addressed to the robustness of feedback stability in the case where the plant is perturbed but the compensator remains fixed. The next result is addressed to the case where both the plant and the compensator may be perturbed. Suppose a family of plants \{P_\delta\} and a family of compensators \{C_\delta\} are specified, and suppose that the nominal pair (P_{\delta_0}, C_{\delta_0}) is stable.

Theorem 7.2.8
(i) Suppose the functions P_\delta, C_\delta are continuous at \delta = \delta_0 in the graph topology. Then there exists a neighborhood N of \delta_0 such that the pair (P_\delta, C_\delta) is stable \;\forall \delta \in N, and H(P_\delta, C_\delta) is continuous at \delta = \delta_0 in the norm topology on M(S).
(ii) Conversely, suppose there is a neighborhood N of \delta_0 such that the pair (P_\delta, C_\delta) is stable \;\forall \delta \in N, and that H(P_\delta, C_\delta) is continuous at \delta = \delta_0 in the norm topology on M(S). Then the functions P_\delta, C_\delta are continuous at \delta = \delta_0 in the graph topology.
The proof of this theorem is deferred to Section 7.3. However, combining Theorems 7.2.8 and 7.2.7 leads to the following result, which is relevant in the context of singularly perturbed systems.

Corollary 7.2.9 Suppose the families {P_λ}, {C_λ} are such that (P_λ, C_λ) is stable for all λ in a neighborhood of λ_0, and that λ ↦ H(P_λ, C_λ) is continuous at λ = λ_0. Then there exist a C ∈ M(R(s)) and a neighborhood B of λ_0 such that (P_λ, C) is stable ∀λ ∈ B and λ ↦ H(P_λ, C) is continuous at λ = λ_0.
Remarks 7.2.10 The point of the corollary is this: Suppose a family of plants {P_λ} is specified, together with a nominal plant P_{λ_0}. If it is possible to design a family of compensators {C_λ} such that C_λ stabilizes P_λ for each λ and the resulting closed-loop transfer matrix H(P_λ, C_λ) is continuous at λ = λ_0, then by Theorem 7.2.8 both P_λ and C_λ are continuous in the graph topology at λ = λ_0. Now let C = lim_{λ→λ_0} C_λ (= C_{λ_0}). Then, by Theorem 7.2.7 (i), C robustly stabilizes the family {P_λ}. □
The preceding results show that the graph topology is of fundamental importance in the study
of the robustness of feedback stability. Hence it is worthwhile to develop a further understanding of
it, with particular emphasis on what continuity in this topology means. The next several propositions
are directed towards this aim.
Proposition 7.2.11 On the set M(S), viewed as a subset of M(R(s)), the norm topology and the graph topology are the same. Moreover, M(S) is an open subset of M(R(s)) in the graph topology.

Proof. To prove the first sentence, it is enough to show that convergence in the norm topology is equivalent to convergence in the graph topology in M(S). Accordingly, suppose {P_i} is a sequence in M(S) and that P ∈ M(S). Suppose first that ‖P_i − P‖ → 0. Then C = 0 stabilizes P as well as P_i, since P, P_i ∈ M(S). Further, since ‖P_i − P‖ → 0, it follows from (7.1.1) that H(P_i, 0) → H(P, 0) in M(S). Hence, by Theorem 7.2.7 (ii), we conclude that {P_i} converges to P in the graph topology.
To prove the converse, and incidentally to prove the second sentence, suppose {P_i} is a sequence in M(R(s)) (not necessarily in M(S)) converging in the graph topology to P ∈ M(S). Now (P, I) is an r.c.f. of P. Hence, by Lemma 7.2.5 (ii), there exist r.c.f.s (N_i, D_i) of P_i such that N_i → P, D_i → I in M(S). Now, I is a unimodular matrix. Since U(S) is an open subset of M(S) and since the inversion of unimodular matrices is continuous, this implies that D_i is unimodular for large enough i and that D_i^{−1} → I in M(S). Hence, P_i = N_i D_i^{−1} ∈ M(S) for large enough i; moreover, since ‖N_i − P‖ → 0 and ‖D_i^{−1} − I‖ → 0, it follows that ‖P_i − P‖ = ‖N_i D_i^{−1} − P‖ → 0. As it has been shown during the preceding argument that every sequence in M(R(s)) that converges to a limit in M(S) must eventually lie in M(S), we conclude that M(S) is an open subset of M(R(s)). □
Proposition 7.2.11 shows that, in the case of functions mapping into M(S), continuity in the graph topology is the same as continuity in the usual norm topology. In the case of unstable plant sequences, however, several strange things can happen. Before illustrating these difficulties, a preliminary result is presented that is often helpful in concluding lack of continuity.

Proposition 7.2.12 Suppose P_λ → P_{λ_0} in the graph topology as λ → λ_0, and suppose P_{λ_0} has poles s_1, …, s_k in the closed RHP.

(i) Let Ω be any compact subset of the open RHP such that none of the s_i lies on the boundary of Ω, and let μ denote the number of poles of P_{λ_0} inside Ω, counted according to their McMillan degrees. Then there is a neighborhood N of λ_0 such that P_λ also has exactly μ poles inside Ω, ∀λ ∈ N.

(ii) Suppose Ω contains no poles of P_{λ_0}; to allow for the possibility that P_{λ_0} has a pole at infinity, suppose in addition that Ω is bounded. Then P_λ(s) → P_{λ_0}(s) as λ → λ_0, uniformly for all s ∈ Ω.
Remarks 7.2.13 In general, the number of closed RHP poles of P_λ and of P_{λ_0} need not be equal. For example, let Λ = [0, ∞), λ_0 = 0 and p_λ(s) = 1/(s + λ); then p_λ → p_0 = 1/s in the graph topology, yet p_λ has no closed RHP poles when λ > 0, while p_0 has one.

Proof of Proposition 7.2.12. By Corollary 7.2.6 (iii), there exists a family (N_λ, D_λ) of r.c.f.s of P_λ such that N_λ, D_λ are continuous at λ_0. To prove (i), observe that |D_λ| is continuous at λ_0; hence |D_λ|(s) → |D_{λ_0}|(s) uniformly in s as λ → λ_0. Since |D_λ| and |D_{λ_0}| are both analytic in the open RHP, and since |D_{λ_0}| has no zeros on the boundary of Ω, it follows by applying the principle of the argument to |D_λ| and to |D_{λ_0}| that, for some neighborhood N of λ_0, the number of zeros of |D_λ| inside Ω equals that of |D_{λ_0}| whenever λ ∈ N. Since the closed RHP poles of P_λ are the zeros of |D_λ|, (i) follows.
To prove (ii), observe that since D_λ(s) → D_{λ_0}(s) uniformly over the closed RHP, it follows that [D_λ(s)]^{−1} → [D_{λ_0}(s)]^{−1} wherever the latter is well-defined, i.e., wherever |D_{λ_0}(s)| ≠ 0. Moreover, if we exclude a neighborhood of every closed RHP zero of |D_{λ_0}|, the convergence is uniform with respect to s. Since P_λ(s) = N_λ(s)[D_λ(s)]^{−1}, the same is true of P_λ. □
One of the nice features of the norm topology on M(S) is that it is a product topology: that is, a sequence of matrices {P^{(i)}} in M(S) converges to P ∈ M(S) if and only if each component sequence {p^{(i)}_{jk}} converges to p_{jk} for all j, k. Consequently, a function λ ↦ P_λ mapping into M(S) is continuous at λ_0 if and only if each component function λ ↦ p_{λ,jk} is continuous at λ_0. Now Proposition 7.2.11 shows that the graph topology and the norm topology coincide on M(S). Hence, when restricted to M(S), the graph topology is also a product topology. However, in the case where the function λ ↦ P_λ maps into unstable multivariable plants, the continuity of the matrix-valued map λ ↦ P_λ at λ_0 neither implies nor is implied by the continuity at λ_0 of the scalar-valued maps λ ↦ p_{λ,ij} for all i, j. This is illustrated through some examples. First a scalar example is presented to set the stage.
Example 7.2.14 Let Λ = R, λ_0 = 0, and consider the function

    p_λ(s) = (s − 1 − λ)/(s − 1) .   (7.2.32)

Then at λ = 0 the function p_0 equals 1, which belongs to S, while p_λ ∉ S whenever λ ≠ 0. Since M(S) is open in the graph topology (Proposition 7.2.11) and p_λ ∉ S for λ ≠ 0, the conclusion is that p_λ is not continuous at λ = 0.
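The instability mechanism in this example can be exhibited numerically: the RHP pole at s = 1 cancels only at λ = 0. The sketch below assumes the form p_λ(s) = (s − 1 − λ)/(s − 1) reconstructed above; the helper `uncancelled_rhp_poles` is an illustrative name, not something from the text.

```python
import numpy as np

# Hedged sketch of Example 7.2.14: for lam = 0 the RHP pole at s = 1 is
# cancelled by a zero; for any lam != 0 it is not, so p_lam leaves S
# however small lam is.
def uncancelled_rhp_poles(num, den, tol=1e-9):
    """Poles of num/den in Re s > 0 that are not cancelled by zeros."""
    poles = np.roots(den)
    zeros = np.roots(num) if len(num) > 1 else np.array([])
    return [p for p in poles
            if p.real > tol and not any(abs(p - z) < tol for z in zeros)]

# p_lam = (s - 1 - lam)/(s - 1): num = [1, -1 - lam], den = [1, -1]
assert uncancelled_rhp_poles([1.0, -1.0], [1.0, -1.0]) == []   # lam = 0: stable
for lam in [1e-6, 1e-3, 0.1]:
    assert len(uncancelled_rhp_poles([1.0, -1.0 - lam], [1.0, -1.0])) == 1
print("p_lam is unstable for every lam != 0")
```

This is exactly the pole-count discontinuity that Proposition 7.2.12 (i) detects.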
7. ROBUSTNESS
Example 7.2.15 Again let Λ = R, λ_0 = 0 and consider the function

    P_λ(s) = [ s/(s−1)   (s−1−λ)/(s−1) ; (2s²−1)/(s²−1)   (s²+s+1)/(s²−1) ] .   (7.2.33)

Then, from Example 7.2.14, the (1,2)-component of P_λ is not continuous at λ = 0. Nevertheless, P_λ itself is continuous at λ = 0 in the graph topology. To see this, define

    N_λ(s) = [ z   λz + z − 1 ; z   2z − 1 ] ,   D_λ(s) = [ z − 1   z ; z   z − 1 ] ,   (7.2.34)

where z = s/(s+1) ∈ S. Then it is easy to see that N_λ is continuous at λ = 0, and D_λ is independent of λ. Moreover, (N_λ, D_λ) is an r.c.f. of P_λ for each λ, so that P_λ = N_λ D_λ^{−1} is continuous in the graph topology at λ = 0. This example shows that the continuity of a matrix-valued function does not imply the continuity of its components.
Example 7.2.16 Once again let Λ = R, λ_0 = 0, and consider

    P_λ(s) = [ (s+1)/(s−1)   (s+1)/(s−1+λ) ] .   (7.2.35)

Then,

    P_0(s) = [ (s+1)/(s−1)   (s+1)/(s−1) ] .   (7.2.36)

It is immediate from (7.2.35) and (7.2.36) that each component of P_λ is continuous at λ = 0. However, P_0 has a single closed RHP pole (at s = 1) of McMillan degree one, whereas P_λ, for λ ≠ 0, has two poles in the closed RHP, each of McMillan degree one. Since Proposition 7.2.12 (i) fails to hold, the conclusion is that P_λ is not continuous at λ = 0 in the graph topology.

In this connection, note that the familiar rule

    lim_{λ→λ_0} (Q_λ + R_λ) = lim_{λ→λ_0} Q_λ + lim_{λ→λ_0} R_λ ,   (7.2.40)

with all limits taken in the graph topology, is not valid in general. Thus, the point of Proposition 7.2.18 is that (7.2.40) is valid provided one of the two limits on the right side of (7.2.40) belongs to M(S).
Proof. "if" Since Q_λ is continuous at λ_0, by Corollary 7.2.6 (iii) it follows that there exists a family (N_λ, D_λ) of r.c.f.s of Q_λ such that N_λ, D_λ are continuous at λ_0. Since R_λ ∈ M(S) is continuous at λ_0, (N_λ + R_λ D_λ, D_λ) is an r.c.f. of Q_λ + R_λ, and N_λ + R_λ D_λ is continuous at λ_0. Hence Q_λ + R_λ is continuous at λ_0 in the graph topology.

"only if" Write Q_λ = (Q_λ + R_λ) + (−R_λ) and apply the "if" part. □

It is left to the reader to verify that the return difference corresponding to (N_λ + R_λ D_λ, D_λ) is unimodular for all λ, and to extend Proposition 7.2.20 to the case where P_λ = Q_λ R_λ.
The final result along these lines shows that the graph topology on M(R(s)) is a product topology under certain circumstances.

Proposition 7.2.21 Suppose P maps a first-countable topological space Λ into M(R(s)), and that P_λ is block-diagonal of the form P_λ = Block Diag {P_λ^1, …, P_λ^n}, where the dimensions of the blocks are the same for all λ. Then P is continuous at λ_0 if and only if each function λ ↦ P_λ^i is continuous at λ_0.
The proof of this proposition is given in Section 7.3. One can generate some more results by combining the preceding three propositions in various ways.

Consider a function λ ↦ P_λ; it is natural to ask for a readily verifiable test of whether P_λ is continuous at λ_0. Such a test is provided by Theorem 7.2.7, and can be stated as follows: Find any C ∈ S(P_{λ_0}); then P_λ is continuous at λ_0 if and only if C robustly stabilizes the family {P_λ}. As an illustration, consider a family of plants P_λ described by the state equations

    ẋ(t) = A_λ x(t) + B_λ u(t) ,   y(t) = C_λ x(t) .   (7.2.41)
Suppose the triple (C_0, A_0, B_0) is stabilizable and detectable, and that (C_λ, A_λ, B_λ) is continuous in λ at λ = 0 in the usual norm topology on Euclidean space. Then the map λ ↦ P_λ is continuous at λ = 0 in the graph topology. To see this, choose F such that A_0 − FC_0 is Hurwitz; then an l.c.f. (D̃_λ, Ñ_λ) of P_λ is given by

    D̃_λ(s) = I − C_λ (sI − A_λ + FC_λ)^{−1} F ,   (7.2.42)

    Ñ_λ(s) = C_λ (sI − A_λ + FC_λ)^{−1} B_λ ,   (7.2.43)

for sufficiently small λ. Since D̃_λ, Ñ_λ are continuous at λ = 0, it follows from Corollary 7.2.6 that P_λ is continuous at λ = 0 in the graph topology, so that H(P_λ, C) is continuous at λ = 0.
In Proposition 7.2.22, each of the plants P_λ has the same order; the situation is more delicate when the order of the system drops at λ = 0, as happens in singularly perturbed systems with singular perturbation parameter λ > 0.

Proposition 7.2.23 Suppose Λ = [0, ∞), and for each λ > 0 let P_λ denote the transfer matrix of the singularly perturbed system

    ẋ_1 = A_11 x_1 + A_12 x_2 + B_1 u ,   λ ẋ_2 = A_21 x_1 + A_22 x_2 + B_2 u ,   y = C_1 x_1 + C_2 x_2 + D_1 u ,

with A_22 Hurwitz, and let P_0 denote the reduced-order model with matrices

    A = A_11 − A_12 A_22^{−1} A_21 ,  B = B_1 − A_12 A_22^{−1} B_2 ,  C = C_1 − C_2 A_22^{−1} A_21 ,  D_0 = D_1 − C_2 A_22^{−1} B_2 ,   (7.2.47)

obtained by setting λ = 0. The question is whether a compensator C_0 ∈ S(P_0) also stabilizes P_λ for small λ, in the sense that H(P_λ, C_0) approaches H(P_0, C_0) as λ → 0^+. On the basis of Theorem 7.2.7, we know that this is so if and only if P_λ is continuous in the graph topology at λ = 0. Now Proposition 7.2.23 states that continuity holds if C_2 (sI − A_22)^{−1} B_2 ≡ 0.
Proposition 7.2.23 has several interesting consequences. For instance, suppose the simplified system P_0 is stabilized by state feedback. This corresponds to the case y(t) = x(t) (i.e., C_1 = I, C_2 = 0), and C_0 = K, a constant matrix. Since the requisite condition holds in this case, it follows that the control law u(t) = Kx(t) also stabilizes P_λ for all sufficiently small positive λ, and H(P_λ, K) converges to H(P_0, K) as λ → 0^+. Similarly, if the control input u does not directly affect the fast modes (i.e., B_2 = 0), then it is again immediate that any compensator that stabilizes the reduced-order model also robustly stabilizes the singularly perturbed system for all sufficiently small positive λ.
Proof. Suppose C_2 (sI − A_22)^{−1} B_2 ≡ 0. The continuity of P_λ at λ = 0 is established by constructing a family (D̃_λ, Ñ_λ) of l.c.f.s of P_λ and showing that D̃_λ → D̃_0, Ñ_λ → Ñ_0 as λ → 0^+. Choose F such that Ā := A − FC is Hurwitz, and define

    D̃_0(s) = I − C (sI − A + FC)^{−1} F ,   (7.2.48)

    Ñ_0(s) = C (sI − A + FC)^{−1} (B − F D_0) + D_0 .   (7.2.49)
Then, by Theorem 4.2.1, (D̃_0, Ñ_0) is an l.c.f. of P_0. To obtain an l.c.f. of P_λ, let F^0 = [ F ; 0 ] and define

    Ã_λ = [ A_11   A_12 ; A_21/λ   A_22/λ ] − [ F ; 0 ] [ C_1   C_2 ]
        = [ A_11 − FC_1   A_12 − FC_2 ; A_21/λ   A_22/λ ] .   (7.2.50)
From [15], the eigenvalues of Ã_λ approach those of Ā, together with 1/λ times the eigenvalues of A_22; hence Ã_λ is Hurwitz for all sufficiently small positive λ. Thus, once again by Theorem 4.2.1, an l.c.f. of P_λ is given by

    D̃_λ(s) = I − C^0 (sI − Ã_λ)^{−1} F^0 ,   (7.2.52)

    Ñ_λ(s) = C^0 (sI − Ã_λ)^{−1} (B_λ − F^0 D_1) + D_1 ,   (7.2.53)

where

    B_λ = [ B_1 ; B_2/λ ] ,   C^0 = [ C_1   C_2 ] .   (7.2.54)
The remainder of the proof consists of studying Ñ_λ and D̃_λ, and showing that Ñ_λ → Ñ_0, D̃_λ → D̃_0 as λ → 0^+. Recall that the inverse of a partitioned matrix is given by

    [ X   Y ; W   V ]^{−1} = [ Δ^{−1}   −Δ^{−1} Y V^{−1} ; −V^{−1} W Δ^{−1}   V^{−1} + V^{−1} W Δ^{−1} Y V^{−1} ] ,   (7.2.55)

where

    Δ = X − Y V^{−1} W .   (7.2.56)
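The partitioned-inverse formula (7.2.55)–(7.2.56) is easy to verify numerically; the sketch below uses random blocks (the matrix sizes and the diagonal shift on V are arbitrary choices to keep everything invertible).

```python
import numpy as np

rng = np.random.default_rng(0)
# Random partitioned matrix [[X, Y], [W, V]] with invertible V and
# Schur complement Delta = X - Y V^{-1} W, as in (7.2.55)-(7.2.56).
n1, n2 = 3, 2
X = rng.standard_normal((n1, n1)); Y = rng.standard_normal((n1, n2))
W = rng.standard_normal((n2, n1)); V = rng.standard_normal((n2, n2)) + 5 * np.eye(n2)

Vi = np.linalg.inv(V)
Delta = X - Y @ Vi @ W
Di = np.linalg.inv(Delta)
# Block formula (7.2.55)
top = np.hstack([Di, -Di @ Y @ Vi])
bot = np.hstack([-Vi @ W @ Di, Vi + Vi @ W @ Di @ Y @ Vi])
block_inv = np.vstack([top, bot])

M = np.block([[X, Y], [W, V]])
assert np.allclose(block_inv, np.linalg.inv(M))
print("partitioned inverse formula verified")
```

The same Schur-complement structure is what produces Δ_λ in (7.2.57)–(7.2.58) below, with V = sI − A_22/λ and W = −A_21/λ.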
Hence

    D̃_λ(s) = I − [ C_1   C_2 ] [ sI − A_11 + FC_1   −A_12 + FC_2 ; −A_21/λ   sI − A_22/λ ]^{−1} [ F ; 0 ]
            = I − C_1 Δ_λ^{−1} F + C_2 V^{−1} W Δ_λ^{−1} F ,   (7.2.57)

where

    Δ_λ = (sI − A_11 + FC_1) − (A_12 − FC_2)(sI − A_22/λ)^{−1} A_21/λ
        = (sI − A_11 + FC_1) − (A_12 − FC_2)(λsI − A_22)^{−1} A_21 ,   (7.2.58)

    V = sI − A_22/λ ,   W = −A_21/λ .   (7.2.59)
Simplification of (7.2.57) yields

    D̃_λ = I − M_λ Δ_λ^{−1} F ,   (7.2.60)

where

    M_λ = C_1 − C_2 V^{−1} W = C_1 + C_2 (λsI − A_22)^{−1} A_21 .   (7.2.61)
Now, letting Ā denote the Hurwitz matrix A − FC, we get

    D̃_0 = I − C (sI − Ā)^{−1} F
         = I − C_1 (sI − Ā)^{−1} F + C_2 A_22^{−1} A_21 (sI − Ā)^{−1} F .   (7.2.62)
Hence, it can be concluded that D̃_λ → D̃_0 if it can be shown that

    Δ_λ^{−1} → (sI − Ā)^{−1} ,   (7.2.63)

    (λsI − A_22)^{−1} A_21 Δ_λ^{−1} → −A_22^{−1} A_21 (sI − Ā)^{−1} .   (7.2.64)

Note that (λsI − A_22)^{−1} does not approach −A_22^{−1} in the norm of M(S).⁴ Hence, (7.2.64) does not automatically follow from (7.2.63).

⁴ Observe that (λsI − A_22)^{−1} → 0 as s → ∞ whenever λ > 0. Hence, the norm of the difference between the functions (λsI − A_22)^{−1} and −A_22^{−1} is at least ‖A_22^{−1}‖, no matter how small λ is.
To prove (7.2.63), let Ā_11 denote A_11 − FC_1, and note that

    Δ_λ^{−1} = [ sI − Ā_11 − (A_12 − FC_2)(λsI − A_22)^{−1} A_21 ]^{−1}
             = { sI − Ā_11 + (A_12 − FC_2) A_22^{−1} A_21 − (A_12 − FC_2)[(λsI − A_22)^{−1} + A_22^{−1}] A_21 }^{−1}
             = (I − R_λ)^{−1} (sI − Ā)^{−1} ,   (7.2.65)

where

    R_λ = (sI − Ā)^{−1} (A_12 − FC_2)[(λsI − A_22)^{−1} + A_22^{−1}] A_21 .   (7.2.66)

Hence,

    Δ_λ^{−1} − (sI − Ā)^{−1} = [(I − R_λ)^{−1} − I](sI − Ā)^{−1} .   (7.2.67)
Now (I − R_λ)(s) approaches I uniformly on every compact subset of C_{+e} as λ → 0, and (sI − Ā)^{−1} → 0 as |s| → ∞. Hence Δ_λ^{−1} → (sI − Ā)^{−1}, and (7.2.63) is proved. To establish (7.2.64), observe that (λsI − A_22)^{−1} approaches −A_22^{−1} uniformly on all compact subsets of C_{+e} as λ → 0, and that both Δ_λ^{−1} and (sI − Ā)^{−1} are strictly proper. Thus (7.2.64) follows from (7.2.63) plus the strict properness of the quantities on both sides of (7.2.63). Finally, (7.2.63) and (7.2.64) together imply that D̃_λ → D̃_0 as λ → 0.
Next, we examine the numerator matrix Ñ_λ. From (7.2.53),

    Ñ_λ = [ C_1   C_2 ] [ sI − A_11 + FC_1   −A_12 + FC_2 ; −A_21/λ   sI − A_22/λ ]^{−1} [ B_1 − FD_1 ; B_2/λ ] + D_1 .   (7.2.68)
Expanding the above equation via (7.2.55) results in an expression for Ñ_λ consisting of five terms. The first term is C_1 Δ_λ^{−1}(B_1 − FD_1). Now (7.2.63) implies that this term converges to C_1 (sI − Ā)^{−1}(B_1 − FD_1). The second term in the expansion is

    −C_2 V^{−1} W Δ_λ^{−1} (B_1 − FD_1)
    = C_2 (sI − A_22/λ)^{−1} (A_21/λ) Δ_λ^{−1} (B_1 − FD_1)
    = C_2 (λsI − A_22)^{−1} A_21 Δ_λ^{−1} (B_1 − FD_1) .   (7.2.69)

From reasoning analogous to that used to establish (7.2.64), it follows that this term converges to −C_2 A_22^{−1} A_21 (sI − Ā)^{−1}(B_1 − FD_1). The third term is
    −C_1 Δ_λ^{−1} Y V^{−1} B_2/λ
    = C_1 Δ_λ^{−1} (A_12 − FC_2)(sI − A_22/λ)^{−1} B_2/λ
    = C_1 Δ_λ^{−1} (A_12 − FC_2)(λsI − A_22)^{−1} B_2 .   (7.2.70)

As before, this converges to

    −C_1 (sI − Ā)^{−1} (A_12 − FC_2) A_22^{−1} B_2 .   (7.2.71)
However, note that C_2 A_22^{−1} B_2 = 0, whence FC_2 A_22^{−1} B_2 = 0. Hence the above limit simplifies to

    −C_1 (sI − Ā)^{−1} A_12 A_22^{−1} B_2 .   (7.2.72)
The fourth term is the most complicated one; it equals

    C_2 V^{−1} (I + W Δ_λ^{−1} Y V^{−1}) B_2/λ
    = C_2 (sI − A_22/λ)^{−1} [ I + (A_21/λ) Δ_λ^{−1} (A_12 − FC_2)(sI − A_22/λ)^{−1} ] B_2/λ
    = C_2 (λsI − A_22)^{−1} [ I + A_21 Δ_λ^{−1} (A_12 − FC_2)(λsI − A_22)^{−1} ] B_2
    = C_2 (λsI − A_22)^{−1} B_2 + C_2 (λsI − A_22)^{−1} A_21 Δ_λ^{−1} (A_12 − FC_2)(λsI − A_22)^{−1} B_2 .   (7.2.73)
Now the first term on the right side is identically zero by hypothesis, while the second term is strictly proper. As a result, the second term converges without difficulty to C_2 A_22^{−1} A_21 (sI − Ā)^{−1} A_12 A_22^{−1} B_2. Of course, the fifth and final term is the constant matrix D_1. Combining the five limits, we see that Ñ_λ approaches

    (C_1 − C_2 A_22^{−1} A_21)(sI − Ā)^{−1} (B_1 − FD_1 − A_12 A_22^{−1} B_2) + D_1 .   (7.2.74)

Finally, using the relations (7.2.47), and noting that D_1 = D_0 since C_2 A_22^{−1} B_2 = 0, the above limit simplifies to

    C (sI − Ā)^{−1} (B − FD_0) + D_0 = Ñ_0 .   (7.2.75)

From Corollary 7.2.6, it now follows that P_λ converges to P_0 in the graph topology. □
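The subtlety flagged in footnote 4 — convergence of (λsI − A_22)^{−1} to −A_22^{−1} on compact sets but never in norm — can be seen numerically. The sketch below assumes the scalar case A_22 = −1, so the function in question is 1/(λs + 1) and the limit is 1; all numerical choices (the compact set |ω| ≤ 10, the "fast" frequency ω = 10/λ) are illustrative.

```python
import numpy as np

# Hedged sketch of footnote 4 with A22 = -1 (scalar), so
# (lam*s - A22)^{-1} = 1/(lam*s + 1) and -A22^{-1} = 1.
limit_val = 1.0
for lam in [0.1, 0.01, 0.001]:
    f = lambda w: 1.0 / (lam * 1j * w + 1.0)
    # uniform convergence on the compact set |w| <= 10:
    # |f(jw) - 1| = lam*|w|/sqrt(1 + (lam*w)^2) <= 10*lam there
    gap_compact = max(abs(f(w) - limit_val) for w in np.linspace(-10, 10, 201))
    assert gap_compact <= 10 * lam + 1e-12
    # but at the "fast" frequency w = 10/lam the gap stays order one,
    # so the H-infinity distance never shrinks
    assert abs(f(10.0 / lam) - limit_val) > 0.9
print("convergence on compacts, but not uniformly over the RHP")
```

This is precisely why (7.2.64) requires the extra strict-properness argument rather than following directly from (7.2.63).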
While the result contained in Proposition 7.2.23 is interesting, its use is limited by the fact that the condition it requires is quite restrictive. The origin of the restrictiveness is the definition of robust stabilizability. Recall that a controller C ∈ S(P_0) was not only required to stabilize P_λ for all sufficiently small positive λ, but the resulting stable family of transfer matrices H(P_λ, C) was required to converge uniformly over C_{+e} to H(P_0, C) as λ → 0. The next proposition is devoted to a study of the situation where a weaker form of convergence of the family H(P_λ, C) is acceptable.

Proposition 7.2.25 Consider the family {P_λ} of singularly perturbed systems above, and suppose C ∈ S(P_0) is strictly proper. Then C ∈ S(P_λ) for all sufficiently small positive λ, and [H(P_λ, C)](s) converges to [H(P_0, C)](s) uniformly over all compact subsets of C_{+e} as λ → 0.
Proof. In the interests of brevity, let us say that a family {F_λ} w-converges to F_0 if F_λ(s) → F_0(s) uniformly on every compact subset of C_{+e} as λ → 0; as in the proof of Proposition 7.2.23, {F_λ G} then converges in norm to F_0 G provided G is strictly proper.

Since C ∈ S(P_0), it has an r.c.f. (N_c, D_c) such that Ñ_0 N_c + D̃_0 D_c = I. Moreover, since C is strictly causal, so is N_c. Now consider the return difference matrix

    Δ_λ = Ñ_λ N_c + D̃_λ D_c .   (7.2.76)

In the proof of Proposition 7.2.23, it was shown that {Ñ_λ} converges to Ñ_0 provided C_2 (sI − A_22)^{−1} B_2 ≡ 0. However, from (7.2.73) one can easily verify that, even if the above condition is not satisfied, the family {Ñ_λ N_c} converges to Ñ_0 N_c, because N_c is strictly proper. Also D̃_λ D_c → D̃_0 D_c, since the condition was not used in the proof of this result (see (7.2.63)–(7.2.67)). Hence Δ_λ → Ñ_0 N_c + D̃_0 D_c = I, which implies that Δ_λ is unimodular for sufficiently small (positive) λ. By Theorem 5.1.6, this means that C ∈ S(P_λ) for all such λ, and that

    H(P_λ, C) = [ D_c Δ_λ^{−1} D̃_λ   −D_c Δ_λ^{−1} Ñ_λ ; N_c Δ_λ^{−1} D̃_λ   I − N_c Δ_λ^{−1} Ñ_λ ] .   (7.2.77)

All terms except D_c Δ_λ^{−1} D̃_λ contain at least one strictly proper factor, and each entry of H(P_λ, C) w-converges to the corresponding entry of H(P_0, C). □

It can also be shown by example that there exist families {P_λ} such that (7.2.47) holds, but C ∉ S(P_λ) for arbitrarily small positive λ when C is not strictly proper.
7.3 GRAPH METRIC

In this section, the graph topology introduced above is metrized. An r.c.f. (N, D) of a plant P ∈ M(R(s)) is said to be normalized if

    N*(s)N(s) + D*(s)D(s) = I   ∀s ,   (7.3.1)

where N*(s) = Nᵀ(−s).

Lemma 7.3.1 Every P ∈ M(R(s)) has a normalized r.c.f., which is unique to within right multiplication by an orthogonal matrix.

Proof. Let (N_1, D_1) be any r.c.f. of P, and define

    Φ(s) = N_1*(s)N_1(s) + D_1*(s)D_1(s) .   (7.3.2)

Then Φ(jω) > 0 for all ω, and Φ admits a spectral factorization Φ = W*W with W, W^{−1} ∈ M(S). Setting N_2 = N_1 W^{−1}, D_2 = D_1 W^{−1} yields an r.c.f. (N_2, D_2) of P such that

    N_2*(s)N_2(s) + D_2*(s)D_2(s) = I   ∀s .   (7.3.3)

Moreover, if (N, D) is any other normalized r.c.f. of P, then by Theorem 4.1.13 there is a unimodular matrix U ∈ U(S) such that N_2 = NU, D_2 = DU. Substituting this into (7.3.3) and using (7.3.1) shows that U*U = I, i.e., that U is a constant orthogonal matrix. □

Given plants P_1, P_2 ∈ M(R(s)) with normalized r.c.f.s (N_1, D_1), (N_2, D_2), let A_i = [ D_i ; N_i ] and define the directed distance and the graph metric

    δ(P_1, P_2) = inf_{Q ∈ M(S), ‖Q‖ ≤ 1} ‖A_1 − A_2 Q‖ ,   d(P_1, P_2) = max{ δ(P_1, P_2), δ(P_2, P_1) } .   (7.3.5)

Theorem 7.3.5 states that a sequence {P_i} converges to P in the graph topology if and only if d(P_i, P) → 0. For the "if" half of the proof: if d(P_i, P) → 0, there exist U_i ∈ M(S), ‖U_i‖ ≤ 1, with ‖A − A_i U_i‖ → 0, and U_i can be shown to be unimodular for large i; then (N_i U_i, D_i U_i) is an r.c.f. of P_i, and it follows from Lemma 7.2.5 that P_i → P in the graph topology.
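The normalization (7.3.1) is easy to exhibit in the scalar case. The sketch below is an assumed example (not from the text): for P(s) = 1/(s+1), the pair n(s) = 1/(s+√2), d(s) = (s+1)/(s+√2) satisfies n(−s)n(s) + d(−s)d(s) = 1 identically.

```python
import numpy as np

# Hedged sketch: a normalized r.c.f. in the sense of (7.3.1) for the
# assumed example P(s) = 1/(s+1).
r2 = np.sqrt(2.0)
n = lambda s: 1.0 / (s + r2)
d = lambda s: (s + 1.0) / (s + r2)

for s in [0.0, 1j, 3 - 2j, 0.5 + 4j]:
    # n*(s) n(s) + d*(s) d(s) = 1, with n*(s) = n(-s) (scalar case)
    assert np.isclose(n(-s) * n(s) + d(-s) * d(s), 1.0)
    # the pair is still a fraction for P
    assert np.isclose(n(s) / d(s), 1.0 / (s + 1.0))
print("normalized r.c.f. verified")
```

Algebraically, n(−s)n(s) + d(−s)d(s) = (1 + (1−s)(1+s))/((√2−s)(√2+s)) = (2 − s²)/(2 − s²) = 1, so the identity holds for all s, not just on the jω-axis.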
"only if" Suppose P_i → P in the graph topology, and let (N, D) be a normalized r.c.f. of P. Then by Lemma 7.2.5 there exist r.c.f.s (N_i, D_i) of P_i such that N_i → N, D_i → D. However, (N_i, D_i) need not be normalized. Let M_i = [ D_i ; N_i ], and suppose M_i = A_i R_i, where A_i corresponds to a normalized r.c.f. of P_i and R_i ∈ U(S). Now ‖M_i − A‖ → 0 and ‖M_i‖ = ‖A_i R_i‖ = ‖R_i‖, since A_i is an isometry. Hence ‖R_i‖ → 1. Define U_i = R_i/‖R_i‖. Then ‖U_i‖ = 1; moreover,

    ‖A − A_i U_i‖ ≤ ‖A − A_i R_i‖ + ‖A_i R_i − A_i U_i‖
                 = ‖A − M_i‖ + ‖R_i − U_i‖
                 = ‖A − M_i‖ + |1 − ‖R_i‖| → 0 as i → ∞ .   (7.3.15)

This shows that δ(P, P_i) → 0. To prove that δ(P_i, P) → 0, it is necessary to estimate ‖R_i^{−1}‖. Let μ_i = ‖A − M_i‖. Then for all vectors x ∈ M(L_2), we have ‖M_i x‖ ≥ ‖Ax‖ − ‖(A − M_i)x‖ ≥ (1 − μ_i)‖x‖. Since ‖x‖ = ‖A_i x‖ and ‖x‖ = ‖M_i R_i^{−1} x‖ ≥ (1 − μ_i)‖R_i^{−1} x‖, it follows that ‖R_i^{−1}‖ ≤ 1/(1 − μ_i) whenever μ_i < 1. In particular, ‖R_i^{−1}‖ → 1 as i → ∞. Let V_i = R_i^{−1}/‖R_i^{−1}‖. Then, as in (7.3.15), ‖V_i‖ = 1 and

    ‖A_i − A V_i‖ = ‖M_i R_i^{−1} − A V_i‖
                  ≤ ‖M_i R_i^{−1} − A R_i^{−1}‖ + ‖A R_i^{−1} − A V_i‖
                  ≤ ‖M_i − A‖ ‖R_i^{−1}‖ + ‖R_i^{−1} − V_i‖ → 0 as i → ∞ .   (7.3.16)

Thus δ(P_i, P) → 0. □
At present, the graph metric distance between two plants cannot be computed exactly, since the problem of determining the infimum in (7.3.5) is still open. However, it is possible to give upper and lower bounds.
Lemma 7.3.6 Suppose (N_1, D_1) is a normalized r.c.f. of P_1, (N_2, D_2) is a (not necessarily normalized) r.c.f. of P_2, and let A_1 = [ D_1 ; N_1 ], M_2 = [ D_2 ; N_2 ]. Suppose ‖A_1 − M_2‖ =: ε < 1. Then

    d(P_1, P_2) ≤ 2ε/(1 − ε) .   (7.3.17)
Proof. Suppose M_2 = A_2 R_2, where A_2 corresponds to a normalized r.c.f. of P_2 and R_2 ∈ U(S). Now ‖M_2‖ ≤ ‖A_1‖ + ‖M_2 − A_1‖ = 1 + ε. Hence ‖R_2‖ = ‖M_2‖ ≤ 1 + ε and ‖R_2^{−1}‖ ≤ 1/(1 − ε), as in the proof of Theorem 7.3.5. Moreover, (7.3.15) and (7.3.16) imply that

    δ(P_1, P_2) ≤ 2ε ,   (7.3.18)

    δ(P_2, P_1) ≤ 2ε/(1 − ε) ,   (7.3.19)

    d(P_1, P_2) = max{ δ(P_1, P_2), δ(P_2, P_1) } ≤ 2ε/(1 − ε) .   (7.3.20)

This completes the proof. □
To compute a lower bound on the graph metric distance between two plants, note that

    δ(P_1, P_2) ≥ inf_{R ∈ M(S)} ‖A_1 − A_2 R‖ .   (7.3.21)

This can be computed using the methods of Section 6.6.
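The upper bound of Lemma 7.3.6 can be estimated by sampling on the jω-axis. The sketch below is an assumed example: P_1(s) = 1/(s+1) and P_2(s) = 1/(s+a) with a = 1.2, each with the normalized r.c.f. built as in the Lemma 7.3.1 construction (the frequency grid and the choice a = 1.2 are arbitrary).

```python
import numpy as np

# Hedged sketch: graph-metric upper bound (7.3.17) for the assumed plants
# P1 = 1/(s+1), P2 = 1/(s+a), using A_i = [d_i; n_i] from normalized r.c.f.s
# d_i(s) = (s + a_i)/(s + g_i), n_i(s) = 1/(s + g_i), g_i = sqrt(1 + a_i^2).
a = 1.2
g1, g2 = np.sqrt(2.0), np.sqrt(1.0 + a * a)
A1 = lambda s: np.array([(s + 1) / (s + g1), 1 / (s + g1)])
M2 = lambda s: np.array([(s + a) / (s + g2), 1 / (s + g2)])

w = np.linspace(-200, 200, 4001)
# sampled estimate of eps = ||A1 - M2|| on the j-omega axis
eps = max(np.linalg.norm(A1(1j * x) - M2(1j * x)) for x in w)
assert eps < 1.0
bound = 2 * eps / (1 - eps)          # Lemma 7.3.6, inequality (7.3.17)
assert bound < 1.0                   # the two plants are close in the graph metric
print("graph-metric upper bound computed")
```

A grid estimate of the supremum is of course only a lower estimate of ε; for a guaranteed bound one would compute the H-infinity norm exactly.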
We are now in a position to undertake a proof of Proposition 7.2.21. For ease of reference it
is restated here in an equivalent form.
Proposition 7.3.7 Suppose {P_i} is a sequence in M(R(s)), P ∈ M(R(s)), and suppose P_i, P are block-diagonal in the form

    P_i = Block Diag { P_i^1, …, P_i^l } ,   (7.3.22)

    P = Block Diag { P̄^1, …, P̄^l } ,   (7.3.23)

where the partitioning is commensurate. Then d(P_i, P) → 0 if and only if d(P_i^j, P̄^j) → 0 as i → ∞, for each j in {1, …, l}.
Remarks 7.3.8 Proposition 7.3.7 is equivalent to Proposition 7.2.21 in view of Theorem 7.3.5.
Proof. It is enough to prove the proposition for the case where l = 2, as the general case will then follow by induction. Accordingly, let us change notation, and suppose we are given a sequence {P_i} of block-diagonal matrices and a block-diagonal matrix P having the form

    P_i = [ Q_i   0 ; 0   R_i ] ,   P = [ Q   0 ; 0   R ] .   (7.3.24)

The claim is that d(P_i, P) → 0 if and only if d(Q_i, Q) → 0 and d(R_i, R) → 0.

"if" Suppose d(Q_i, Q) → 0, d(R_i, R) → 0. Then, by Theorem 7.3.5, {Q_i} converges to Q and {R_i} converges to R in the graph topology. This implies, by Lemma 7.2.5, that there are r.c.f.s (N_{Qi}, D_{Qi}), (N_{Ri}, D_{Ri}), (N_Q, D_Q), (N_R, D_R) of Q_i, R_i, Q, R such that (N_{Qi}, D_{Qi}) → (N_Q, D_Q), (N_{Ri}, D_{Ri}) → (N_R, D_R). Now define

    N_i = [ N_{Qi}   0 ; 0   N_{Ri} ] ,   D_i = [ D_{Qi}   0 ; 0   D_{Ri} ] ,   (7.3.25)

and define N, D analogously from N_Q, N_R, D_Q, D_R. Then clearly (N_i, D_i) is an r.c.f. of P_i, (N, D) is an r.c.f. of P, and (N_i, D_i) → (N, D) (because the norm topology on M(S) is a product topology). Again by Lemma 7.2.5, {P_i} converges to P in the graph topology, so that d(P_i, P) → 0 by Theorem 7.3.5.
"only if" Suppose d(P_i, P) → 0. Then {P_i} converges to P in the graph topology. Let (N_Q, D_Q), (N_R, D_R) be normalized r.c.f.s of Q, R respectively. Then (N, D), defined as in (7.3.25), is easily shown to be a normalized r.c.f. of P. Similarly, let (N_{Qi}, D_{Qi}), (N_{Ri}, D_{Ri}) be normalized r.c.f.s of Q_i, R_i, respectively. Then (N_i, D_i) as defined in (7.3.25) is a normalized r.c.f. of P_i. By hypothesis, there is a sequence {U_i} of unimodular matrices such that

    [ D_{Qi}  0 ; 0  D_{Ri} ; N_{Qi}  0 ; 0  N_{Ri} ] [ U_{1i}  U_{2i} ; U_{3i}  U_{4i} ] → [ D_Q  0 ; 0  D_R ; N_Q  0 ; 0  N_R ] ,   (7.3.26)

where U_i is partitioned in the obvious way. Since the topology on M(S) is a product topology, each block in the partitioned matrix on the left side of (7.3.26) converges to the corresponding block on the right side of (7.3.26). In particular,

    [ D_{Qi} ; N_{Qi} ] U_{2i} → [ 0 ; 0 ] ,   [ D_{Ri} ; N_{Ri} ] U_{3i} → [ 0 ; 0 ] .   (7.3.27)

Since (N_{Qi}, D_{Qi}), (N_{Ri}, D_{Ri}) are both normalized r.c.f.s, (7.3.27) implies that U_{2i} → 0, U_{3i} → 0. Let A_i (resp. A) denote the block 4 × 2 matrix on the left (resp. right) side of (7.3.26). Since U_{2i} → 0, U_{3i} → 0, it follows that

    ‖ A_i [ 0   U_{2i} ; U_{3i}   0 ] ‖ → 0 .   (7.3.28)

Combining (7.3.26) and (7.3.28) gives

    A_i Ũ_i → A ,   (7.3.29)

where

    Ũ_i = [ U_{1i}   0 ; 0   U_{4i} ] .   (7.3.30)

Now (7.3.29) shows that ‖Ũ_i‖ → 1. Hence, if we define W_i = Ũ_i/‖Ũ_i‖, then (7.3.29) shows that ‖A − A_i W_i‖ → 0. On the other hand, since d(P_i, P) → 0, there exists a sequence {V_i} of matrices in the unit ball of M(S) such that ‖A_i − A V_i‖ → 0. Thus ‖A − A_i W_i‖ + ‖A_i − A V_i‖ < 1 for large enough i, and ‖W_i‖ = 1. Hence, by Lemma 7.3.4, both W_i and V_i are unimodular for large enough i. In particular, U_{1i} and U_{4i} are both unimodular for large enough i. Hence (N_{Qi} U_{1i}, D_{Qi} U_{1i}) is an r.c.f. of Q_i, and (N_{Ri} U_{4i}, D_{Ri} U_{4i}) is an r.c.f. of R_i. Since (7.3.29) implies that

    [ D_{Qi} ; N_{Qi} ] U_{1i} → [ D_Q ; N_Q ] ,   [ D_{Ri} ; N_{Ri} ] U_{4i} → [ D_R ; N_R ] ,   (7.3.31)

it follows from Lemma 7.2.5 that Q_i → Q, R_i → R in the graph topology. Now Theorem 7.3.5 shows that d(Q_i, Q) → 0, d(R_i, R) → 0. □
This section is concluded with a proof of Theorem 7.2.8. It is restated here in an equivalent
form for ease of reference.
Proposition 7.3.9 Suppose {P_i}, {C_i} are sequences in M(R(s)), and that P_0, C_0 ∈ M(R(s)). Then, the following statements are equivalent:

(i) d(P_i, P_0) → 0, d(C_i, C_0) → 0 as i → ∞.

(ii) d(H_i, H_0) → 0 as i → ∞, where H_i = H(P_i, C_i), H_0 = H(P_0, C_0).

Remarks 7.3.10 If the pair (P_0, C_0) is stable, then the above result reduces to Theorem 7.2.8. In general, it shows that the map from the plant-compensator pair to the closed-loop transfer matrix is continuous if all spaces are equipped with the graph topology, even if the pair (P_0, C_0) is not stable.
Proof. Let

    G_0 = [ C_0   0 ; 0   P_0 ] ,   G_i = [ C_i   0 ; 0   P_i ] .   (7.3.32)

Then Proposition 7.3.7 shows that (i) is equivalent to the following statement: (iii) d(G_i, G_0) → 0 as i → ∞. Thus the statement to be proved is: d(G_i, G_0) → 0 if and only if d(H_i, H_0) → 0. Define

    F = [ 0   I ; −I   0 ] .   (7.3.33)

Then (cf. Section 5.1)

    H_i = (I + FG_i)^{−1} ,   H_0 = (I + FG_0)^{−1} .   (7.3.34)
Suppose (N_i, D_i) is an r.c.f. of G_i. Then H_i = D_i (D_i + FN_i)^{−1}, from (7.3.34). Moreover, (D_i, D_i + FN_i) is an r.c.f. of H_i, since

    [ D_i + FN_i ; D_i ] = [ I   F ; I   0 ] [ D_i ; N_i ] ,   (7.3.35)

and the matrix on the right side of (7.3.35) is unimodular, with the inverse

    [ 0   I ; −F   F ] .   (7.3.36)

(Note that F^{−1} = −F.) Conversely, suppose (B_i, A_i) is an r.c.f. of H_i. Then

    [ D_i ; N_i ] = [ 0   I ; −F   F ] [ A_i ; B_i ] ,   (7.3.37)

is an r.c.f. of G_i. Similar remarks apply to G_0 and H_0.
The statement to be proved is that d(G_i, G_0) → 0 if and only if d(H_i, H_0) → 0. Suppose first that d(G_i, G_0) → 0. Then, by Theorem 7.3.5, {G_i} converges to G_0 in the graph topology. Hence, according to Lemma 7.2.5, there exist r.c.f.s (N_i, D_i) of G_i and (N_0, D_0) of G_0 such that N_i → N_0, D_i → D_0. Now define

    [ A_i ; B_i ] = [ I   F ; I   0 ] [ D_i ; N_i ] ,   [ A_0 ; B_0 ] = [ I   F ; I   0 ] [ D_0 ; N_0 ] .   (7.3.38)

Then (B_i, A_i) is an r.c.f. of H_i, (B_0, A_0) is an r.c.f. of H_0, and A_i → A_0, B_i → B_0. Hence {H_i} converges to H_0 in the graph topology and d(H_i, H_0) → 0. The reverse implication follows by reversing the above steps. □
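The identity H(P, C) = (I + FG)^{−1} underlying (7.3.32)–(7.3.34) can be checked at a single frequency. The sketch below uses assumed scalar sample values (P and C evaluated at one point s); the entry layout of H follows the closed-loop formulas used earlier in the chapter.

```python
import numpy as np

# Hedged sketch of (7.3.32)-(7.3.34) at one frequency, scalar data:
# G = diag(C, P), F = [[0, 1], [-1, 0]], and
# H(P, C) = [[1/(1+PC), -P/(1+CP)], [C/(1+PC), 1/(1+CP)]].
s = 2.0 + 1.5j
P = 1.0 / (s - 1.0)   # assumed plant sample value
C = 2.0               # assumed compensator sample value

F = np.array([[0.0, 1.0], [-1.0, 0.0]])
G = np.diag([C, P]).astype(complex)
H_fact = np.linalg.inv(np.eye(2) + F @ G)    # (I + F G)^{-1}

H_direct = np.array([[1 / (1 + P * C), -P / (1 + C * P)],
                     [C / (1 + P * C),  1 / (1 + C * P)]])
assert np.allclose(H_fact, H_direct)
print("H(P, C) = (I + F G)^{-1} verified")
```

Note that I + FG = [[1, P], [−C, 1]], whose determinant 1 + PC is the scalar return difference; this is why the block construction in the proof converts coprime factorizations of G directly into coprime factorizations of H.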
We conclude this section with a discussion of the graph metric in the case where the set of stable transfer functions is something other than the set S. First of all, the graph metric extends almost effortlessly to linear distributed systems. The central step in defining a graph metric is the existence of a normalized r.c.f. In the case of distributed systems, the existence of such a factorization is assured by Beurling's theorem in the scalar case and the Beurling-Lax theorem in the multivariable case, which states that a closed shift-invariant subspace of H_2^n is the image of an inner operator. Thus, even in the case of distributed systems, if a plant has an r.c.f., then it has a normalized r.c.f. Of course, not all distributed systems have r.c.f.s (see Section 8.1). Thus, the graph metric can only be defined between pairs of plants when each has an r.c.f. Next, in the case of discrete-time systems, the arguments given here carry over without any major modifications: One only has to replace the extended closed right half-plane by the closed unit disc. Finally, if the set of stable transfer functions is a set of the form S_D, where D is some prespecified region of the complex plane, or its discrete analog, then the graph metric can be defined by first conformally mapping D into C_{+e} (or the closed unit disc in the discrete-time case). This of course requires that the region D be simply connected.
7.4 ROBUSTNESS CONDITIONS

In this section, the focus is on designing stabilizing compensators for imprecisely known plants. Thus, a nominal plant description is available, together with a description of the plant uncertainty, and the objective is to design a compensator that stabilizes all plants lying within the specified band of uncertainty. The types of plant uncertainty considered include additive perturbations, multiplicative perturbations, stable-factor perturbations, and graph-metric perturbations. The solution of the design problem is divided into two steps: The first step is the derivation of necessary and sufficient conditions that any compensator must satisfy in order to be a solution of the design problem. The second step is a further examination of these conditions to see when there actually exists a compensator satisfying them. This leads naturally to the notion of optimally robust compensators. This section concludes with a discussion of the case where there are uncertainties in both the plant and compensator.

Throughout this section, several models of plant uncertainty are used; these are described next.
Suppose P_0 is a given nominal plant and that r ∈ S is a given function. An additive perturbation of P_0 is only defined in the case where P_0 has no poles on the imaginary axis. The class A(P_0, r) consists of all P ∈ M(R(s)) that have the same number of open RHP poles as P_0 (counted according to McMillan degree), and satisfy

    ‖P(jω) − P_0(jω)‖ < |r(jω)|   ∀ω ∈ R .   (7.4.1)

In particular, (7.4.1) implies that if P ∈ A(P_0, r), then P also has no poles on the jω-axis. However, P − P_0 need not be analytic in the open RHP, because even though P and P_0 have the same number of open RHP poles, they could be at different locations. Hence, (7.4.1) does not imply that ‖P(s) − P_0(s)‖ < |r(s)| ∀s ∈ C_{+e}. The assumption that r ∈ S is not too restrictive. Suppose it is known that P has the same number of RHP poles as P_0, and that ‖P(jω) − P_0(jω)‖ < |f(jω)| ∀ω, where f is some proper rational function. Then it is always possible to find a stable proper rational function r such that |r(jω)| = |f(jω)| ∀ω. In fact, one can assume without loss of generality that r is outer.
The class of multiplicative perturbations of P_0 is also defined under the restriction that P_0 has no jω-axis poles. The class M(P_0, r) consists of all P ∈ M(R(s)) that have the same number of open RHP poles as P_0 and satisfy

    P(s) = (I + M(s)) P_0(s) ,  M ∈ M(R(s)) ,  where ‖M(jω)‖ < |r(jω)|   ∀ω ∈ R .   (7.4.2)

Finally, the class of stable-factor perturbations is specified not just in terms of the nominal plant description P_0, but in terms of a particular r.c.f. (N_0, D_0) of P_0 and a function r ∈ S. The class S(N_0, D_0, r) consists of all plants P ∈ M(R(s)) satisfying

    P(s) = N(s)[D(s)]^{−1} ,  where  ‖ [ N(s) − N_0(s) ; D(s) − D_0(s) ] ‖ < |r(s)|   ∀s ∈ C_{+e} .   (7.4.3)

The rationale behind the above three models of plant uncertainty is fairly clear. In the case of both A(P_0, r) and M(P_0, r), the perturbed plant P has the same number of open RHP poles as the nominal plant P_0 (though possibly at different locations). In the case of A(P_0, r), |r(jω)| provides an upper bound on the absolute magnitude of the plant uncertainty at the frequency ω, while in the case of M(P_0, r), |r(jω)| provides an upper bound on the relative magnitude of uncertainty at the frequency ω. The class of stable-factor perturbations is free from the restrictive assumption that the perturbed plant and the unperturbed plant have the same number of RHP poles. As well, P_0 is permitted to have jω-axis poles, as is the perturbed plant P. Using (7.4.3) in conjunction with the principle of the argument over various subregions of C_+, one can localize the regions in C_+ where a P ∈ S(N_0, D_0, r) can have poles.
Theorems 7.4.1 and 7.4.2 below give the main robustness results in the case of additive and multiplicative perturbations, respectively.

Theorem 7.4.1 Suppose P_0 ∈ M(R(s)) and r ∈ S are specified, and suppose C stabilizes P_0. Then C stabilizes all P in the class A(P_0, r) if and only if⁵

    ‖[C(I + P_0 C)^{−1}](jω)‖ · |r(jω)| ≤ 1   ∀ω ∈ R .   (7.4.4)

Theorem 7.4.2 Suppose P_0 ∈ M(R(s)) and r ∈ S are specified, and suppose C stabilizes P_0. Then C stabilizes all P in the class M(P_0, r) if and only if

    ‖[P_0 C(I + P_0 C)^{−1}](jω)‖ · |r(jω)| ≤ 1   ∀ω ∈ R .   (7.4.5)

⁵ Recall that the norm ‖M‖ of a matrix M equals its largest singular value.
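Condition (7.4.4) can be checked on a frequency grid. The sketch below is an assumed scalar example: P_0(s) = 1/(s−1), the stabilizing compensator C = 2, and a hypothetical weight r(s) = ρ/(s+1); the helper name `additive_margin` is illustrative.

```python
import numpy as np

# Hedged sketch of condition (7.4.4): P0 = 1/(s-1), C = 2 (stabilizing),
# assumed weight r(s) = rho/(s+1). Here F = C(I + P0*C)^{-1} = 2(s-1)/(s+1),
# so |F(jw)| = 2 for every w.
P0 = lambda s: 1.0 / (s - 1.0)
C  = lambda s: 2.0 + 0 * s
F  = lambda s: C(s) / (1.0 + P0(s) * C(s))

def additive_margin(rho, w=np.linspace(-100, 100, 2001)):
    """sup over the grid of |F(jw)| * |r(jw)| for r = rho/(s+1)."""
    r = lambda s: rho / (s + 1.0)
    return max(abs(F(1j * x)) * abs(r(1j * x)) for x in w)

assert additive_margin(0.4) <= 1.0   # small enough weight: (7.4.4) holds
assert additive_margin(2.0) > 1.0    # too much uncertainty: (7.4.4) fails
print("robustness condition (7.4.4) checked on a grid")
```

Since |F(jω)| ≡ 2 here, the condition reduces to 2|r(jω)| ≤ 1, i.e. ρ ≤ 1/2, which the two assertions bracket from both sides. A grid check is only a necessary screen; a guaranteed test needs the exact H-infinity norm.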
The proofs of both theorems make use of a graphical criterion for feedback stability that generalizes the Nyquist criterion. As it is of independent interest, it is stated separately.

Lemma 7.4.3 Suppose P, C ∈ M(R(s)) have η_p, η_c poles in the open RHP, respectively, counted according to McMillan degree, and none on the jω-axis. Then C stabilizes P if and only if the plot of |(I + PC)(jω)|, as ω decreases from ∞ to −∞, does not pass through the origin, and encircles the origin η_p + η_c times in the clockwise sense.

A little notation and an appeal to the principle of the argument simplifies the proof of Lemma 7.4.3. Suppose f is a rational function that has neither poles nor zeros on the extended jω-axis. Then, as ω decreases from ∞ to −∞, the graph of f(jω) traces out a closed curve in the complex plane; let ind f denote the net number of counterclockwise encirclements of the origin by this curve (so that a clockwise encirclement contributes −1).

Proof of Theorem 7.4.1. "only if" Suppose (7.4.4) fails, so that there is an ω_0 (possibly zero or infinite) such that, with F = C(I + P_0 C)^{−1} and σ_1 the largest singular value of F(jω_0), we have σ_1 |r(jω_0)| > 1. Let F(jω_0) = U Σ V* be a singular value decomposition, let v_1 denote the first column of V, and let u_1 denote the first row of U*. Case (i): ω_0 is zero or infinite, so that U, V and r(jω_0) are real. Define

    Q(s) = − (r(s)/(σ_1 r(jω_0))) v_1 u_1 .   (7.4.12)

Then Q ∈ M(S) and

    ‖Q(jω)‖ = |r(jω)| / (σ_1 |r(jω_0)|) < |r(jω)|   ∀ω ∈ R ,   (7.4.13)

since σ_1 |r(jω_0)| > 1. Hence P := P_0 + Q belongs to the class A(P_0, r). However, it is claimed that C does not stabilize P. To prove this, it is enough, in view of Lemma 7.4.3, to show that |(I + PC)(jω_0)| = 0.
)| = 0.
Let v
1
denote the rst column of V, and note that V
v
1
= e
1
= [1 0 0]
because V is orthogonal.
Hence,
(I +QF)(j
0
)v
1
= v
1
(QF)(j
0
)v
1
= v
1
V
_
1/
1
0
0 0
_ _
0
0 0
_
e
1
= v
1
v
1
= 0 . (7.4.14)
240 7. ROBUSTNESS
Since v
1
= 0, this implies that |(I +QF)(j
0
)| = 0. Now |I +PC| = |I +(Q+P
0
C)| = |I +
QF| |I +P
0
C|, from which it follows that |(I +PC)(j
0
)| = 0.
Case (ii): ω_0 is nonzero and finite. In this case the only complication is that the matrices U, V and the constant r(jω_0) may not be real, so that the Q defined in (7.4.12) may not be a real rational matrix. To handle this possibility, write r(jω_0) = |r(jω_0)| exp(jθ), where θ ∈ [0, 2π). Let v_1, u_1 denote respectively the first column of V and the first row of U*. Express each component of v_1, u_1 in the form

    (v_1)_i = v̄_i exp(jφ_i) ,   (u_1)_i = ū_i exp(jψ_i) ,   (7.4.15)

where v̄_i, ū_i are real numbers and φ_i, ψ_i ∈ (−π, 0] ∀i. Define
    Q(s) = − (r(s)/(σ_1 |r(jω_0)|)) v(s) uᵀ(s) ,   (7.4.16)

where

    v(s) = [ v̄_1 (s − α_1)/(s + α_1) ; ⋮ ; v̄_n (s − α_n)/(s + α_n) ] ,   u(s) = [ ū_1 (s − β_1)/(s + β_1) ; ⋮ ; ū_n (s − β_n)/(s + β_n) ] ,   (7.4.17)

and the nonnegative constants α_i, β_i are adjusted such that⁶

    arg((jω_0 − α_i)/(jω_0 + α_i)) = φ_i ,   arg((jω_0 − β_i)/(jω_0 + β_i)) = ψ_i .   (7.4.18)

Then Q is a real rational matrix, Q ∈ M(S), and

    Q(jω_0) = −v_1 u_1 / σ_1 .   (7.4.19)

Moreover, Q satisfies (7.4.13). Hence P = P_0 + Q belongs to the class A(P_0, r). As before, (I + QF)(jω_0) is singular, because it maps v_1 into the zero vector. Hence C does not stabilize P.
if Suppose (7.4.4) holds. Let P be any plant in the class A(P
0
, r) and let denote P P
0
.
Then, by (7.4.1),
(j) < |r(j)| . (7.4.20)
Combining (7.4.4) with (7.4.20) and letting F denote C(I +P
0
C)
1
leads to
(j) F(j) < 1 . (7.4.21)
6
One can assume without loss of generality that
0
> 0, so that such
i
,
i
exist.
7.4. ROBUSTNESS CONDITIONS 241
Hence
|I +(j)F(j)| = 0 R, [0, 1] . (7.4.22)
Let X denote the topological space C 0 (i.e., the punctured plane) with the topology inherited
from C. Dene two closed curves f () and g() in X by
f (j) = 1 , g(j) = |I +(j)F(j)| . (7.4.23)
Now (7.4.22) shows that the function
h(, j) = |I +(j)F(j)| , (7.4.24)
continuously transforms the curve f into the curve g in X. Hence, the curves f and g are homotopic
in X, and as a result they have the same index with respect to 0. Since the index of f is clearly zero,
it follows that
ind g = ind |I +F| = 0 . (7.4.25)
Finally, note that
|I +PC| = |I +F| |I +P
0
C| . (7.4.26)
Since C stabilizes P
0
, it follows from Lemma 7.4.3 that
ind |I +P
0
C| = (
0
+
c
) , (7.4.27)
where
0
(resp.
c
) is the number of RHP poles of P
0
(resp. C) counted according to McMillan
degree. Now (7.4.25)(7.4.27) imply that
ind |I +PC| = (
0
+
c
) . (7.4.28)
Since P also has
0
RHP poles by assumption, (7.4.28) and Lemma 7.4.3 show that C stabilizes P.
2
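The index argument above can be checked numerically. The following sketch uses an illustrative scalar pair that is not from the text: p0(s) = 1/(s − 1) (one RHP pole, so π0 = 1) with the constant controller c = 4 (πc = 0), for which |I + PC| = 1 + p0c = (s + 3)/(s − 1). As ω decreases from ∞ to −∞, the curve should encircle the origin ind = −(π0 + πc) = −1 times.

```python
import numpy as np

# Map a dense angle grid onto omega running from ~ +1e4 down to ~ -1e4,
# so the phase of f(j w) is sampled finely where it varies fastest.
theta = np.linspace(np.pi / 2 - 1e-4, -np.pi / 2 + 1e-4, 200_001)
w = np.tan(theta)
s = 1j * w
f = 1 + 4.0 / (s - 1)          # 1 + p0(s) c(s) = (s + 3)/(s - 1)

# ind f = (1/2 pi) * net change in arg f along the curve
phase = np.unwrap(np.angle(f))
ind = (phase[-1] - phase[0]) / (2 * np.pi)
print(round(ind))              # -> -1, matching -(pi_0 + pi_c)
```

Since the grid stops at finite ω, the computed index is slightly above −1 in magnitude and is rounded to the nearest integer.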
Suppose we define Ā(P0, r) to be the set of all plants P that have the same number of RHP poles as P0, and satisfy

σ̄[P(jω) − P0(jω)] ≤ |r(jω)| ∀ω . (7.4.29)

(The difference between (7.4.1) and (7.4.29) is that the < is replaced by ≤.) Then the proof of Theorem 7.4.1 can be easily adapted to prove the following result.

Corollary 7.4.4 Suppose C ∈ S(P0). Then C stabilizes all plants in the class Ā(P0, r) if and only if

sup_ω σ̄{[C(I + P0C)^{−1}](jω)} |r(jω)| < 1 . (7.4.30)
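Condition (7.4.30) is easy to evaluate numerically in the scalar case. The sketch below uses an illustrative pair that is not from the text: P0(s) = 1/(s − 1), C = 4 (so C(I + P0C)^{−1} = 4(s − 1)/(s + 3)), and a constant weight r = 0.1.

```python
import numpy as np

# Grid over the nonnegative jw-axis (for a real rational F, |F(jw)| = |F(-jw)|).
w = np.linspace(0.0, 1e4, 1_000_001)
s = 1j * w
F = 4 * (s - 1) / (s + 3)      # C(I + P0 C)^{-1}
r = 0.1

lhs = np.max(np.abs(F) * r)    # sup_w sigma_bar[F(jw)] |r(jw)|
print(lhs)                     # ~ 0.4 < 1, so C stabilizes the whole class
```

Here sup_ω |F(jω)| approaches 4 as ω → ∞, so the left side of (7.4.30) is about 0.4: every plant with the same number of RHP poles and additive error below 0.1 in magnitude is stabilized by this C.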
Sketch of Proof of Theorem 7.4.2. Suppose P = (I + M)P0. Then

|I + PC| = |I + (I + M)P0C| = |I + MG| · |I + P0C| , (7.4.31)

where G = P0C(I + P0C)^{−1}. By assumption, M satisfies σ̄[M(jω)] < |r(jω)| ∀ω, and (7.4.5) states that σ̄[G(jω)] |r(jω)| ≤ 1 ∀ω. □
To derive a result analogous to Corollary 7.4.4, define M̄(P0, r) to be the set of all plants P that have the same number of RHP poles as P0, and satisfy P = (I + M)P0 where M ∈ M(R(s)) and

σ̄[M(jω)] ≤ |r(jω)| ∀ω . (7.4.32)

Corollary 7.4.5 Suppose C ∈ S(P0). Then C stabilizes all plants in the class M̄(P0, r) if and only if

sup_ω σ̄{[P0C(I + P0C)^{−1}](jω)} |r(jω)| < 1 . (7.4.33)
Finally, consider the case of stable-factor perturbations.

Theorem 7.4.6 Suppose P0 ∈ M(R(s)). Suppose an r.c.f. (N0, D0) of P0 and a function r ∈ S are specified, and define the class S(N0, D0, r) as in (7.4.3). Suppose C ∈ S(P0), and select an l.c.f. (D̃c, Ñc) of C such that D̃cD0 + ÑcN0 = I. Then C stabilizes all P in the class S(N0, D0, r) if and only if

σ̄{[D̃c(jω) Ñc(jω)]} |r(jω)| ≤ 1 ∀ω . (7.4.34)

Remarks 7.4.7 Since D̃c, Ñc ∈ M(S) and r ∈ S, the condition (7.4.34) actually implies that σ̄{[D̃c(s) Ñc(s)]} |r(s)| ≤ 1 ∀s ∈ C+e.
Proof. For convenience, define

A0 = [D0 ; N0] , A = [D ; N] , Ãc = [D̃c Ñc] , (7.4.35)

where A0 and A are block columns and Ãc is a block row.
"if": Suppose (7.4.34) holds, and suppose P is an arbitrary plant in the class S(N0, D0, r). Then P has an r.c.f. (N, D) such that σ̄[A(s) − A0(s)] < |r(s)| ∀s ∈ C+e. Now consider the return difference matrix D̃cD + ÑcN = ÃcA. Since ÃcA0 = I, it follows that ÃcA = I + Ãc(A − A0). However, from (7.4.34) we get σ̄{[Ãc(A − A0)](s)} ≤ σ̄[Ãc(s)] · σ̄{[A − A0](s)} < 1 ∀s ∈ C+e. This shows that |(ÃcA)(s)| = |I + [Ãc(A − A0)](s)| ≠ 0 ∀s ∈ C+e, so that ÃcA is unimodular. Hence, by Theorem 5.1.6, C stabilizes P.
"only if": Suppose (7.4.34) is false, and select an ω0 such that σ̄[Ãc(jω0)] |r(jω0)| > 1. Let σ1 ≥ σ2 ≥ ··· ≥ σm denote the singular values of Ãc(jω0), and note that σ1 |r(jω0)| > 1. Let Σ = Diag{σ1, ···, σm}, and select unitary matrices U, V such that⁷

U Ãc(jω0) V = [Σ 0] . (7.4.36)

Since Ãc(jω0) A0(jω0) = I, it follows that A0(jω0) must be of the form

A0(jω0) = V [Σ^{−1} ; X] U . (7.4.37)

By assumption, σ1 |r(jω0)| > 1. Now define R = Diag{1/σ1, 0, ···, 0}, and

Q(s) = −V [R ; T] U · r(s)/r(jω0) , (7.4.38)

where T is a constant matrix chosen as follows: If the first column of the matrix X in (7.4.37) is nonzero, simply choose T = 0. Otherwise, if the first column of X is zero, choose T to be a matrix whose columns are all zero except for the first column, and whose first column is nonzero and has sufficiently small norm that σ̄[Q(jω0)] < |r(jω0)|. If T is so chosen, then the first column of X − T is not identically zero. Now let P be the plant ND^{−1}, where [D ; N] = A = A0 + Q. Then P belongs to the class S(N0, D0, r), since σ̄[Q(s)] < |r(s)| ∀s ∈ C+e.⁸ However,

A(jω0) = A0(jω0) + Q(jω0) = V [Σ^{−1} − R ; X − T] U , (7.4.39)

and the first column of Σ^{−1} − R is zero. Hence Ãc(jω0) A(jω0) is singular, and ÃcA is not unimodular.
The question is: Does the singularity of |D̃cD + ÑcN| at jω0 imply that C fails to stabilize P? We can conclude that this is so, provided it can be shown that N and D are right-coprime, or at least, that any greatest common right divisor of N and D is nonsingular at jω0. To amplify this argument, suppose N = N1B, D = D1B, where B is a g.c.r.d. of N and D, and N1 and D1 are right-coprime. Then |D̃cD + ÑcN| = |D̃cD1 + ÑcN1| · |B|. Hence, if |B(jω0)| ≠ 0, then |D̃cD1 + ÑcN1| vanishes at jω0, and C does not stabilize P. Thus the proof is complete if it can be shown that |B(jω0)| ≠ 0. For this it is enough to show that the matrix [D ; N] = A0 + Q has full column rank at jω0. But this last fact is immediate from (7.4.39), since the first column of X − T is nonzero. □
⁷ Note that Ãc(s) is a "fat" matrix and has full row rank at all s ∈ C+e.
⁸ The matrix Q may not be a real rational matrix; but this can be fixed up as in the proof of Theorem 7.4.1.

There is an interesting anomaly associated with the robustness condition (7.4.34). Comparing Theorem 7.4.1 with Corollary 7.4.4, or Theorem 7.4.2 with Corollary 7.4.5, we see that it makes very little difference, in the case of additive and multiplicative perturbations, whether the class of perturbations is defined with a strict or nonstrict inequality. If the class of perturbations is defined with a strict inequality, then the robustness condition has a nonstrict inequality, and vice versa. But this is not so in the case of stable-factor perturbations. Define the class S̄(N0, D0, r) by

S̄(N0, D0, r) = {P = ND^{−1} : σ̄{[(N − N0)(s) ; (D − D0)(s)]} ≤ |r(s)| ∀s ∈ C+e} . (7.4.40)
One might be tempted to conjecture the following result: C ∈ S(P0) stabilizes all P in the class S̄(N0, D0, r) if and only if

sup_ω σ̄[Ãc(jω)] |r(jω)| < 1 . (7.4.41)

But this is false: (7.4.41) is certainly sufficient for robust stability, but not necessary, as the next example shows.
Example 7.4.8 Consider

p0(s) = 4/(s − 3) = n0(s)/d0(s) ,

where

n0(s) = 4(s + 7)/(s + 1)² , d0(s) = (s − 3)(s + 7)/(s + 1)² .

Let c(s) = 4/(s + 5). Then c stabilizes p0. Moreover, d̃c d0 + ñc n0 = 1 with

d̃c(s) = (s + 5)/(s + 7) , ñc(s) = 4/(s + 7) .

Now consider the class S̄(n0, d0, r) with r ≡ 1. Since

‖[d̃c(∞) ; ñc(∞)]‖ = ‖[1 ; 0]‖ = 1 ,

(7.4.41) does not hold. Nevertheless, c stabilizes every p in the class S̄(n0, d0, r), which shows that (7.4.41) is not always necessary for robust stability.
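The data in Example 7.4.8 can be verified numerically: the Bezout identity d̃c d0 + ñc n0 = 1 holds on the jω-axis, and the supremum in (7.4.41) equals 1 (approached as ω → ∞), so the conjectured condition indeed fails with r = 1.

```python
import numpy as np

w = np.linspace(0.0, 1e4, 100_001)
s = 1j * w
n0 = 4 * (s + 7) / (s + 1) ** 2
d0 = (s - 3) * (s + 7) / (s + 1) ** 2
dc = (s + 5) / (s + 7)            # dc~ from the example
nc = 4 / (s + 7)                  # nc~ from the example

assert np.allclose(dc * d0 + nc * n0, 1.0)       # Bezout identity
col_norm = np.sqrt(np.abs(dc) ** 2 + np.abs(nc) ** 2)
print(np.max(col_norm))                          # -> just below 1, sup = 1
```

In fact |d̃c(jω)|² + |ñc(jω)|² = (ω² + 41)/(ω² + 49) < 1 for every finite ω, with limit 1, which is why the supremum is attained only at ω = ∞.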
To establish the claim made in the above paragraph, suppose p(s) = n(s)/d(s), where

‖[(n − n0)(s) ; (d − d0)(s)]‖ ≤ 1 ∀s ∈ C+e .

Let Q = [q1 q2]′ denote [(d − d0) (n − n0)]′.⁹ We prove the contrapositive: suppose g is a rational H∞ function, and expand it about z = 1 as g(z) = Σ_{i=0}^∞ gi (z − 1)^i. If a − g has (at least) a double zero at z = 1, then

g(1) = a(1) = 1 , g′(1) = a′(1) = 1 .

Hence, for z < 1 and sufficiently close to 1, we have g(z) > 1. The existence of robustly stabilizing compensators can now be characterized using the H∞-norm minimization results derived in Sections 6.5 and 6.6.
Theorem 7.4.9 Suppose a nominal plant P0 ∈ M(R(s)), free of poles on the extended jω-axis, and a function r ∈ S are specified. Suppose r does not have any zeros on the extended jω-axis. Let (N, D), (D̃, Ñ) be any r.c.f. and any l.c.f. of P0, and let X̃, Ỹ ∈ M(S) be any particular solutions of the identity ÑX̃ + D̃Ỹ = I. Factor D, D̃ in the form

D = Di Do , D̃ = D̃o D̃i , (7.4.42)

where Di, D̃i are inner and Do, D̃o are outer, and assume without loss of generality that |Di| = |D̃i|.¹⁰ Finally, factor r as ro ri, where ro, ri are outer and inner, respectively. Under these conditions, there exists a C ∈ S(P0) that stabilizes all plants in the class A(P0, r) if and only if there exists an S ∈ M(S) such that

‖Di^adj X̃ D̃o ro + φ S‖ ≤ 1 , (7.4.43)

where φ = |Di| = |D̃i|.
Proof. By Theorem 5.2.1, the set of all C ∈ S(P0) is given by

S(P0) = {(X̃ + DS)(Ỹ − NS)^{−1} : S ∈ M(S), |Ỹ − NS| ≠ 0} , (7.4.44)

where the nonsingularity constraint is satisfied by an open dense subset of S ∈ M(S) (see Lemma 5.2.4). Moreover, from (5.2.7), if C = (X̃ + DS)(Ỹ − NS)^{−1}, then

C(I + P0C)^{−1} = (X̃ + DS)D̃ . (7.4.45)

Hence, there exists a C ∈ S(P0) satisfying (7.4.4) if and only if there exists an S ∈ M(S) such that

‖X̃D̃r + DSD̃r‖ ≤ 1 . (7.4.46)
A few simple manipulations bring (7.4.46) into the form (7.4.43). Note that

‖X̃D̃r + DSD̃r‖ = ‖X̃ D̃o D̃i ro ri + Di Do S D̃o D̃i ro ri‖
= ‖φ Di^adj X̃ D̃o ro ri + φ² Do S D̃o ro ri‖
= ‖φ ri (Di^adj X̃ D̃o ro + Q)‖
= ‖Di^adj X̃ D̃o ro + Q‖ , (7.4.47)

where, in successive steps, we multiply on the left by Di^adj and on the right by D̃i^adj, replace the matrix S by the new free parameter Q = φ Do S D̃o ro, and observe that multiplication by the inner functions ri and φ preserves norms. Hence, (7.4.46) and (7.4.43) are equivalent. □

¹⁰ By Theorem 4.1.15, |D| and |D̃| are associates; so are |Do| and |D̃o|, since both are units. Hence, |Di| and |D̃i| are associates, and since both are inner, |Di| = ±|D̃i| in any case.
If φ has only simple RHP zeros (or equivalently, if P0 has only simple RHP poles), then (7.4.43) can be stated in a more readily testable form using the results of Section 6.5.

Corollary 7.4.10 Suppose φ has only simple RHP zeros, and let λ1, ···, λn denote these zeros. Let Fj = (Di^adj X̃ D̃o ro)(λj), j = 1, ···, n, and define

Qjk = (I − Fj* Fk) / (λ̄j + λk) , 1 ≤ j, k ≤ n , (7.4.48)

Q = [Q11 ··· Q1n ; ⋮ ⋱ ⋮ ; Qn1 ··· Qnn] . (7.4.49)

Then there exists a C ∈ S(P0) that stabilizes all plants in the class A(P0, r) if and only if Q is nonnegative definite.
In (7.4.43), X̃ is any particular solution of the identity ÑX̃ + D̃Ỹ = I. Hence, it is nice to know that the test matrix Q is the same no matter which X̃ is used. To see this, let X̃1, Ỹ1 be another set of matrices in M(S) satisfying ÑX̃1 + D̃Ỹ1 = I. Then, from the proof of Theorem 5.2.1, it follows that X̃1 = X̃ + DR for some R ∈ M(S). Hence,

Di^adj X̃1 D̃o ro = Di^adj (X̃ + DR) D̃o ro = Di^adj X̃ D̃o ro + φ Do R D̃o ro , (7.4.50)

since D = Di Do (and Di^adj Di = φ I). Now the second term on the right side of (7.4.50) vanishes at all zeros of φ. Hence F1, ···, Fn are independent of which particular solution X̃ is used to compute them. Similar remarks apply even in the case where φ has repeated zeros.
If the plant P0 is scalar, then the expression for Fj (or fj in the scalar case) is more elegant. In this case, Di^adj = 1 (since by convention the adjoint of a 1 × 1 matrix is 1) and fj = (x̃ do ro)(λj). Now, in the scalar case φ = di; hence, it follows from d = di do = φ do and n x̃ + d ỹ = 1 that, at any zero λj of φ, we have n(λj) x̃(λj) = 1. Hence,

x̃(λj) do(λj) = do(λj) / n(λj) . (7.4.51)

With the plant p0, associate the stable plant q0 = p0 φ = n/do. Then

fj = [do(λj) / n(λj)] ro(λj) = (ro/q0)(λj) . (7.4.52)
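The test of Corollary 7.4.10 is easy to run numerically in the scalar case. The sketch below uses a hypothetical plant that is not from the text: p0(s) = 1/((s − 1)(s − 2)), with simple RHP poles λ = 1, 2, so q0 = n/do = 1/((s + 1)(s + 2)) and, by (7.4.52), fj = ro (λj + 1)(λj + 2) for a constant outer weight ro.

```python
import numpy as np

def pick_matrix(rho, lam=(1.0, 2.0)):
    # Build the matrix of (7.4.48)-(7.4.49) for real scalar data:
    # f_j = rho (lam_j + 1)(lam_j + 2), Q_jk = (1 - f_j f_k)/(lam_j + lam_k).
    f = [rho * (l + 1) * (l + 2) for l in lam]
    n = len(lam)
    return np.array([[(1 - f[j] * f[k]) / (lam[j] + lam[k])
                      for k in range(n)] for j in range(n)])

eig_small = np.linalg.eigvalsh(pick_matrix(0.01))   # small uncertainty weight
eig_large = np.linalg.eigvalsh(pick_matrix(0.05))   # larger uncertainty weight
print(eig_small.min() > 0, eig_large.min() > 0)     # -> True False
```

So for this plant a robustly stabilizing compensator exists for the additive uncertainty radius ro = 0.01 but not for ro = 0.05: the admissible uncertainty is limited by the RHP poles.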
The treatment of multiplicative perturbations is entirely similar to that of additive perturbations.

Theorem 7.4.11 Suppose a function r ∈ S and a nominal plant P0 ∈ M(R(s)) are specified, and suppose that P0 has no extended jω-axis poles. Let (N, D), (D̃, Ñ) be any r.c.f. and any l.c.f. of P0, and let X, Y ∈ M(S) be any particular solutions to the identity XN + YD = I. Under these conditions, there exists a C ∈ S(P0) that stabilizes all plants in the class M(P0, r) if and only if there exists an R ∈ M(S) such that

‖N(X + RD̃) r‖ ≤ 1 . (7.4.53)

Further, if P0 has at least as many inputs as outputs and r has no extended jω-axis zeros, then (7.4.53) can be simplified as follows: Factor N, D̃ in the form

N = Ni No , D̃ = D̃o D̃i , (7.4.54)

where Ni, D̃i are inner and No, D̃o are outer. Finally, factor r as ri ro, where ri is inner and ro is a unit of S. Under these conditions, there exists a C ∈ S(P0) that stabilizes all plants in the class M(P0, r) if and only if there exists an S ∈ M(S) such that

‖No X D̃i^adj ro + φ S‖ ≤ 1 , (7.4.55)

where φ = |D̃i|.

Proof. From Theorem 5.2.1, every C ∈ S(P0) is of the form (Y − RÑ)^{−1}(X + RD̃) for some R ∈ M(S). Moreover,

P0C(I + P0C)^{−1} = P0(I + CP0)^{−1}C = N(X + RD̃) . (7.4.56)

Hence (7.4.5) holds for some C ∈ S(P0) if and only if there exists an R ∈ M(S) such that (7.4.53) holds. Now suppose the additional assumptions on P0, r hold. Then, using by now familiar manipulations, we note that

‖NXr + NRD̃r‖ = ‖Ni No X ro ri + Ni No R D̃o D̃i ro ri‖ = ‖Ni ri (No X ro D̃i^adj + φ S)‖ = ‖No X ro D̃i^adj + φ S‖ , (7.4.57)

where S = No R D̃o ro and we use the fact that multiplication by the inner matrix Ni ri preserves norms. Now (7.4.57) is precisely (7.4.55). □
If the plant P0 is scalar, one can again obtain a simple expression for No X ro D̃i^adj evaluated at the zeros of φ. Suppose nx + dy = 1 and suppose φ(λ) = 0 at some point λ ∈ C+. Then d(λ) = 0, n(λ)x(λ) = 1, and (No X ro D̃i^adj)(λ) = ro(λ)/ni(λ).
7.4. ROBUSTNESS CONDITIONS 249
The foregoing results also lead very naturally to the notion of optimally robust compensators.
Suppose a nominal plant P
0
is given, together with a function f S, which represents an uncertainty
prole. Consider now the class A(P
0
, r) consisting of all plants satisfying (7.4.1), where r = f . It
is now reasonable to ask: What is the largest value of the parameter for which the class A(P
0
, r)
is robustly stabilizable, and what is a corresponding robustly stabilizing compensator? This problem
can be solved very easily using the results derived so far. Factor f as f
i
f
o
where f
i
, f
o
are respectively
inner and outer, and dene
= min
SM(S)
D
adj
i
X
D
o
f
o
+S , (7.4.58)
where all symbols are as in Theorem 7.4.9. Now, by applying Theorem 7.4.9 with r = f , it follows
that there exists a single compensator that stabilizes all the plants in the class A(P
0
, r) if and only
if 1. Hence, the largest value of for which there exists a robustly stabilizing compensator is
given by = 1/ . Further, if S is any matrix which attains the minimum in (7.4.58), then the com-
pensator C = (
X +DS)(
Y NS)
1
is an optimally robust compensator. The case of multiplicative
perturbations is entirely similar and is left to the reader.
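For a scalar plant with a single RHP pole, the minimization (7.4.58) reduces to one interpolation constraint, and the minimum of the norm over the free parameter is simply the magnitude of the interpolated value at the RHP zero of φ. The sketch below applies this to a hypothetical example not from the text: p0(s) = 1/(s − 1) with constant uncertainty profile f = 1, so φ has the single zero λ = 1 and q0 = n/do = 1/(s + 1).

```python
# Single-interpolation-point case of (7.4.58): gamma = |(fo/q0)(lam)|.
lam = 1.0
fo = 1.0                       # outer factor of the uncertainty profile f = 1
q0 = lambda s: 1.0 / (s + 1)   # stable plant associated with p0 (q0 = p0*phi)

gamma = abs(fo / q0(lam))      # = |fo(1)| * (1 + 1) = 2
alpha_max = 1.0 / gamma        # largest robustly stabilizable radius

print(gamma, alpha_max)        # -> 2.0 0.5
```

So, under these assumptions, the largest additive uncertainty radius for which a single robustly stabilizing compensator exists is ᾱ = 0.5.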
Next, the case of stable-factor perturbations is studied.

Theorem 7.4.12 Suppose a function r ∈ S and a nominal plant P0 ∈ M(R(s)) are specified, together with an r.c.f. (N0, D0) of P0. Let X, Y ∈ M(S) satisfy XN0 + YD0 = I, and let (D̃0, Ñ0) be any l.c.f. of P0. Then there exists a C ∈ S(P0) that stabilizes all plants in the class S(N0, D0, r) if and only if there exists a matrix R ∈ M(S) such that ‖(A − RB) r‖ ≤ 1, where

A = [X Y] , B = [−D̃0 Ñ0] . (7.4.59)

The proof is obvious and is left to the reader.
This section is concluded with a discussion of the case where there are perturbations in both the plant and the compensator. The results in this case make use of the graph metric introduced in Section 7.3. A bit of notation makes the presentation smoother. Given P, C ∈ M(R(s)), define

T(P, C) = H(P, C) − [I 0 ; 0 0] , (7.4.60)

where H(P, C) is defined in (7.1.1). Using this last equation, one can derive a few other useful representations of T(P, C). For instance,

T(P, C) = [−PC(I + PC)^{−1}  −P(I + CP)^{−1} ; C(I + PC)^{−1}  (I + CP)^{−1}] = [−P ; I] (I + CP)^{−1} [C I] . (7.4.61)
Theorem 7.4.13 Suppose the pair (P0, C0) is stable, and that P0, C0 are perturbed to P, C, respectively. Then the pair (P, C) is stable provided

d(P, P0) ‖T(P0, C0)‖ + d(C, C0) ‖T(C0, P0)‖ < 1 . (7.4.62)

Corollary 7.4.14 Under the hypotheses of Theorem 7.4.13, the pair (P, C0) is stable provided d(P, P0) < 1/‖T(P0, C0)‖.

The significance of Theorem 7.4.13 and Corollary 7.4.14 is as follows: The type of plant perturbations studied in these two results is the most unstructured one considered in this section, in that (i) one is permitted to simultaneously perturb both the plant and the compensator, (ii) there is no restriction that the number of RHP poles of the perturbed and unperturbed plants be the same, and (iii) the perturbations are not couched in terms of a particular coprime factorization of the plant. The stability condition (7.4.62) is interesting in that the effects of the perturbations in the plant and controller enter additively. When only the plant is perturbed, the stability condition given in Corollary 7.4.14 is reminiscent of the small gain theorem (see, e.g., [D1, Chapter VI]). These results also bring out the utility of the graph metric in obtaining quantitative robustness estimates.
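For scalar p and c, the rank-one form (7.4.61) gives σ̄[T(p, c)(jω)] = √(1 + |p|²) · √(1 + |c|²) / |1 + pc|, which makes ‖T(P0, C0)‖ easy to compute on a grid. The sketch below evaluates it for an illustrative pair not from the text: p0 = 1/(s − 1), c0 = 4; Corollary 7.4.14 then guarantees stability of (P, C0) for every plant within graph-metric distance 1/‖T(P0, C0)‖ of p0.

```python
import numpy as np

w = np.linspace(0.0, 1e4, 1_000_001)
s = 1j * w
p, c = 1 / (s - 1), 4.0

# sigma_bar of the rank-one matrix [-p; 1] (1 + cp)^{-1} [c  1]
T = np.sqrt(1 + np.abs(p) ** 2) * np.sqrt(1 + c ** 2) / np.abs(1 + p * c)

norm_T = T.max()               # sup over the jw-axis; here ~ sqrt(17)
print(norm_T, 1 / norm_T)      # robustness radius ~ 0.2425
```

For this pair σ̄[T(jω)]² = 17(2 + ω²)/(9 + ω²), so the supremum √17 is approached as ω → ∞; the grid value is slightly below it.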
Proof. Let (N0, D0), (X̃0, Ỹ0) be normalized r.c.f.'s of P0, C0, respectively. Then from the stability of the pair (P0, C0), and Problem 5.1.1, the matrix

U0 = [D0 −X̃0 ; N0 Ỹ0] , (7.4.63)

is unimodular. Let V0 = U0^{−1}, and partition V0 as

V0 = [Y0 X0 ; −Ñ0 D̃0] . (7.4.64)

Then (D̃0, Ñ0), (Y0, X0) are l.c.f.'s of P0, C0, respectively.

Now select real numbers δp > d(P, P0), δc > d(C, C0) such that

δp ‖T(P0, C0)‖ + δc ‖T(C0, P0)‖ < 1 . (7.4.65)

This is possible in view of (7.4.62). Let (Np, Dp), (Nc, Dc) be any normalized r.c.f.'s of P, C, respectively. Then, from the definition of the graph metric (see (7.3.2)), there exist matrices Wp, Wc ∈ M(S) with ‖Wp‖ ≤ 1, ‖Wc‖ ≤ 1 such that

‖[D0 − Dp Wp ; N0 − Np Wp]‖ ≤ δp , (7.4.66)

‖[Ỹ0 − Dc Wc ; X̃0 − Nc Wc]‖ ≤ δc . (7.4.67)
Define F as in (7.3.33) and observe that F is inner, so that multiplication by F is norm-preserving. Thus (7.4.67) implies that

‖[−(X̃0 − Nc Wc) ; Ỹ0 − Dc Wc]‖ ≤ δc . (7.4.68)

Next, define

U = [Dp Wp  −Nc Wc ; Np Wp  Dc Wc] = [Dp −Nc ; Np Dc] [Wp 0 ; 0 Wc] . (7.4.69)
It is now shown that (7.4.65) implies the unimodularity of U. This will show, a fortiori, that Wp and Wc are both unimodular, and that the pair (P, C) is stable, by Problem 5.1.1. Note that

U − U0 = −[A B] , (7.4.70)

where

A = [D0 − Dp Wp ; N0 − Np Wp] , B = [−(X̃0 − Nc Wc) ; Ỹ0 − Dc Wc] . (7.4.71)

Now, if ‖[A B] U0^{−1}‖ < 1, then (7.4.70) implies that U is unimodular. From (7.4.64),

[A B] U0^{−1} = [A B] V0 = A [Y0 X0] + B [−Ñ0 D̃0] . (7.4.72)

Hence

‖[A B] U0^{−1}‖ ≤ ‖A‖ ‖[Y0 X0]‖ + ‖B‖ ‖[−Ñ0 D̃0]‖ ≤ δp ‖[Y0 X0]‖ + δc ‖[−Ñ0 D̃0]‖ , (7.4.73)
where the last step follows from (7.4.66) and (7.4.68). The proof is completed by showing that

‖[Y0 X0]‖ = ‖T(P0, C0)‖ , (7.4.74)

‖[−Ñ0 D̃0]‖ = ‖T(C0, P0)‖ . (7.4.75)

Then (7.4.65) and (7.4.73) will imply the unimodularity of U and the stability of (P, C).

To prove (7.4.74), recall that (N0, D0) is a normalized r.c.f. of P0, and (Y0, X0) is the corresponding l.c.f. of C0 such that Y0D0 + X0N0 = I. Hence, from (5.1.23) and (7.4.60),

T(P0, C0) = [−N0X0  −N0Y0 ; D0X0  D0Y0] = [−N0 ; D0] [X0 Y0] ; (7.4.76)

‖T(P0, C0)‖ = ‖[X0 Y0]‖ (since [−N0 ; D0] is an isometry) = ‖[Y0 X0]‖ . (7.4.77)

The proof of (7.4.75) follows essentially by symmetry arguments, after noting that ‖[−Ñ0 D̃0]‖ = ‖[Ñ0 D̃0]‖. Recall that (X̃0, Ỹ0) is a normalized r.c.f. of C0, that P0 stabilizes C0, and that (D̃0, Ñ0) is the corresponding l.c.f. of P0 such that D̃0Ỹ0 + Ñ0X̃0 = I. □
It is not known at present how close the conditions of Theorem 7.4.13 and Corollary 7.4.14 are to being necessary. In particular, it is not known whether the condition ‖T(P0, C)‖ < r^{−1} is necessary for a controller C to stabilize all plants within a distance of r from the plant P0. Nevertheless, it is reasonable to seek, among all stabilizing controllers for P0, an optimally robust controller C0 for which ‖T(P0, C0)‖ is as small as possible. The problem of finding such a C0 can be formulated as an H∞-norm minimization problem of the form

min_{R ∈ M(S)} ‖U + VRW‖ , (7.4.78)

where

U = [−N0 ; D0] [X0 Y0] , V = [−N0 ; D0] , W = [D̃0 −Ñ0] . (7.4.79)

This type of minimization problem is studied in Section 6.6.

Finally, the results contained in this section carry over with obvious modifications to the study of the robustness of discrete-time systems, and to cases where the set of "stable" transfer functions is something other than S.
7.5 ROBUST AND OPTIMAL REGULATION

In this section, several previously derived results are brought to bear to provide solutions of two problems, namely robust regulation and optimal robust regulation. The robust regulation problem encompasses two aspects, namely robust tracking and robust disturbance rejection. The robust tracking problem can be described as follows: Suppose we are given a plant P, together with an unstable reference signal generator which is factored in the form D̃r^{−1}Ñr, where D̃r, Ñr are left-coprime. The objective is to design a compensator such that the plant output tracks the unstable reference signal, and continues to do so even as the plant is slightly perturbed. This problem is depicted in Figure 7.2, where, to achieve the greatest possible generality, a two-parameter compensation scheme is shown. Alternatively, suppose the unstable signal is a disturbance which is injected at the plant output. In this case, the objective is to design a compensator such that the plant output is insensitive to the disturbance, and remains so even as the plant is slightly perturbed. This situation is shown in Figure 7.3, and is referred to as the robust disturbance rejection problem. Note that a feedback, or one-parameter, configuration is employed in Figure 7.3 since, as shown in Section 5.6, both configurations are equivalent for the purposes of disturbance rejection.
In more precise terms, the robust disturbance rejection problem can be stated as follows: Find a compensator such that three conditions are met, namely: (R1) the closed-loop system is stable; (R2) the disturbance is rejected, i.e., (I + PC)^{−1}D̃r^{−1} ∈ M(S); and (R3) conditions (R1) and (R2) continue to hold when the plant is slightly perturbed.

(Figure 7.2 shows the two-parameter tracking configuration, with reference generator D̃r^{−1}Ñr, compensator blocks Ñc1, Ñc2, D̃c^{−1}, and plant P.)

Lemma 7.5.5 states that a compensator C ∈ S(P), with r.c.f. (Nc, Dc), solves this problem if and only if σr^{−1}Dc ∈ M(S) (i.e., if and only if σr divides every component of Dc), where σr denotes the largest invariant factor of D̃r.
Proof. Since the robustness condition (R3) involves perturbing the plant from its nominal description P, we can assume at the outset that D̃ and D̃r are right-coprime. If this is not so originally, it can be achieved by an arbitrarily small perturbation of D̃, by Lemma 7.5.4. Also, since the truth or falsity of the condition σr^{−1}Dc ∈ M(S) is unaffected by multiplying Dc by a unimodular matrix, we can as well choose the particular r.c.f. (Nc, Dc) of C that satisfies D̃Dc + ÑNc = I. Finally, note that since σr is the largest invariant factor of D̃r, we have that σr D̃r^{−1} ∈ M(S) (see Problem 7.5.2).

"if": Suppose Dc = σr T, where T ∈ M(S). Now (R1) holds by assumption, since C ∈ S(P). Next,

(I + PC)^{−1} D̃r^{−1} = Dc D̃ D̃r^{−1} = σr T D̃ D̃r^{−1} ∈ M(S) , (7.5.4)
so that (R2) holds. To prove (R3), suppose P = D̃^{−1}Ñ is replaced by P1 = D̃1^{−1}Ñ1, where D̃1 = D̃ + ΔD̃, Ñ1 = Ñ + ΔÑ. Then the corresponding return difference matrix is

M = D̃1 Dc + Ñ1 Nc = I + [ΔD̃ ΔÑ] [Dc ; Nc] , (7.5.5)

which is unimodular provided

‖[ΔD̃ ΔÑ]‖ < ε = ‖[Dc ; Nc]‖^{−1} . (7.5.6)

Hence, (R1) holds robustly. To prove that (R2) also holds robustly, note that

(I + P1C)^{−1} D̃r^{−1} = Dc M^{−1} D̃1 D̃r^{−1} = σr T M^{−1} D̃1 D̃r^{−1} ∈ M(S) , (7.5.7)

since M^{−1} ∈ M(S) due to the unimodularity of M.

"only if": Suppose (R1), (R2), (R3) hold; it is to be shown that σr^{−1}Dc ∈ M(S). This part of the proof is divided into several steps.
First, since D̃ and D̃r are right-coprime, let (A, B) be an l.c.f. of D̃D̃r^{−1}. Then, by Theorem 4.1.15, A and D̃r have the same invariant factors. In particular, σr is the largest invariant factor of A.

Step 1. It is shown that

Dc A^{−1} ∈ M(S) . (7.5.8)

By assumption, (R2) holds, i.e., Dc D̃ D̃r^{−1} = Dc A^{−1} B ∈ M(S). An application of Lemma 7.5.3 gives (7.5.8).

Step 2. It is shown that

Dc Q Ñ Nc A^{−1} ∈ M(S) ∀Q ∈ M(S) . (7.5.9)

Let Q ∈ M(S) be arbitrary, and choose ε in the interval (0, 1/‖QÑNc‖). Then I − εQÑNc is unimodular. Define M = (I − εQÑNc)^{−1}. Then

M = I + Σ_{i=1}^{∞} (εQÑNc)^i = I + Rε ÑNc , (7.5.10)

for some rational matrix Rε, and Rε → 0 as ε → 0. Hence, ‖Rε‖ < δ for sufficiently small ε, where δ is the number defined in (7.5.1). Now let P1 = D̃^{−1}(I + Rε)Ñ. By assumption, (I + P1C)^{−1}D̃r^{−1} ∈ M(S), since P1 is a valid perturbation of P. But

(I + P1C)^{−1} D̃r^{−1} = Dc [D̃Dc + (I + Rε)ÑNc]^{−1} D̃ D̃r^{−1} = Dc (I + Rε ÑNc)^{−1} A^{−1} B = Dc (I − εQÑNc) A^{−1} B ∈ M(S) . (7.5.11)

Applying Lemma 7.5.3 gives

Dc (I − εQÑNc) A^{−1} ∈ M(S) . (7.5.12)

Subtracting (7.5.8) yields −ε Dc Q Ñ Nc A^{−1} ∈ M(S), and dividing by −ε leads to (7.5.9).

Step 3. It is shown that

Dc Q A^{−1} ∈ M(S) ∀Q ∈ M(S) . (7.5.13)

Indeed, since D̃Dc + ÑNc = I,

Dc Q A^{−1} = Dc Q (D̃Dc + ÑNc) A^{−1} = (Dc Q D̃) Dc A^{−1} + Dc Q Ñ Nc A^{−1} ∈ M(S) , (7.5.14)

by (7.5.8) and (7.5.9), respectively.
Step 4. It is shown that

σr^{−1} Dc ∈ M(S) , (7.5.15)

which is the desired conclusion. Select unimodular U, V ∈ U(S) such that

U A V = Diag{σ1, ···, σr} = S , (7.5.16)

the Smith form of A. By (7.5.13),

Dc Q A^{−1} ∈ M(S) ∀Q ⟺ Dc Q V S^{−1} U ∈ M(S) ∀Q ⟺ Dc Q V S^{−1} ∈ M(S) ∀Q ⟺ Dc Q1 S^{−1} ∈ M(S) ∀Q1 , (7.5.17)

where Q1 denotes QV, and Q, Q1 range over M(S). Now let Q1 = I. Then Dc S^{−1} ∈ M(S), which shows that σr divides the last column of Dc. Next, successively let Q1 equal the permutation matrix that interchanges column j with the last column, and let the index j vary over all possible values. Then Dc Q1 S^{−1} ∈ M(S) implies that σr divides every column of Dc. □
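The content of Lemma 7.5.5 is the internal model principle: the compensator denominator must absorb σr. This can be illustrated numerically with hypothetical data not from the text: step reference generator d̃r(s) = s/(s + 1) (so σr = s), stable plant p(s) = 1/(s + 2), and two controllers, one whose denominator is divisible by σr = s (an integrator) and one without.

```python
w = 1e-6                                   # evaluate near the reference pole s = 0
s = 1j * w
p = 1 / (s + 2)
dr = s / (s + 1)                           # dr~, with sigma_r = s

c_int = 2 * (s + 1) / s                    # denominator s: sigma_r divides dc
c_no = 2.0                                 # no internal model of the step

track_int = abs(1 / (1 + p * c_int) / dr)  # |(I + PC)^{-1} Dr~^{-1}| at jw
track_no = abs(1 / (1 + p * c_no) / dr)
print(track_int, track_no)                 # bounded vs. blowing up as w -> 0
```

With the integrator the quantity (I + PC)^{−1}D̃r^{−1} stays bounded near ω = 0 (condition (R2) holds); without it, the expression grows like 1/ω, so the step disturbance is not rejected.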
The argument used in the last step is sufficiently useful that it deserves display as a separate item.

Proposition 7.5.6 Suppose A, B ∈ M(S) have the same number of columns, and in addition A is square and nonsingular. Then

B Q A^{−1} ∈ M(S) ∀Q ∈ M(S) , (7.5.18)

if and only if the largest invariant factor of A divides every element of B.
Proof of Theorem 7.5.1. It is first shown that the robust regulation problem has a solution if and only if N and σr I are left-coprime. Select X̃, Ỹ ∈ M(S) such that ÑX̃ + D̃Ỹ = I. Then by Theorem 5.2.1,

S(P) = {(X̃ + DR)(Ỹ − NR)^{−1} : R ∈ M(S)} . (7.5.19)

Thus, the set of all Dc as C varies over S(P) is given by {Ỹ − NR : R ∈ M(S)}. By Lemma 7.5.5, the robust tracking problem has a solution if and only if σr^{−1}(Ỹ − NR) ∈ M(S) for some R ∈ M(S), or equivalently, there exist R, W ∈ M(S) such that

Ỹ − NR = σr W . (7.5.20)

Now (7.5.20) can be rearranged as

NR + σr W = Ỹ . (7.5.21)

Thus, the proof is complete if it can be shown that the left-coprimeness of N and σr I is necessary and sufficient for (7.5.21) to have a solution. Accordingly, suppose first that N, σr I are left-coprime, and select T, U ∈ M(S) such that NT + σr U = I; then R = TỸ, W = UỸ is a solution of (7.5.21). Conversely, suppose N and σr I are not left-coprime, and let V be a nontrivial (i.e., nonunimodular) common left-divisor of N and σr I. Then NR + σr W is a left-multiple of V for all R, W. On the other hand, from Corollaries 4.1.18 and 4.1.7, N and Ỹ are left-coprime, so that Ỹ is not a left-multiple of V. Hence, (7.5.21) does not have a solution.
To complete the proof, it is shown that the set of all C ∈ S(P) such that σr^{−1} Dc ∈ M(S) is precisely {C1/σr : C1 ∈ S(P/σr)}, provided of course that N and σr I are left-coprime. First, suppose Dc = σr T for some T ∈ M(S). Then, since

I = D̃Dc + ÑNc = σr D̃T + ÑNc , (7.5.22)

we see that T and Nc are right-coprime, and that C1 = Nc T^{−1} stabilizes (σr D̃)^{−1}Ñ = P/σr. Hence, every solution to the robust tracking problem is of the form C1/σr, for some C1 ∈ S(P/σr). Conversely, suppose C1 ∈ S(P/σr). Now P/σr = (σr D̃)^{−1}Ñ; moreover, it can be shown that σr D̃ and Ñ are left-coprime. Hence, (σr D̃, Ñ) is an l.c.f. of P/σr. Since C1 ∈ S(P/σr), C1 has an r.c.f. (Nc1, Dc1) satisfying

σr D̃ Dc1 + Ñ Nc1 = I . (7.5.23)

Thus, C = C1/σr = Nc1 (σr Dc1)^{−1} stabilizes P, and the denominator matrix of C is a multiple of σr. Hence, C solves the robust tracking problem. □
Corollary 7.5.7 Suppose N and σr I are left-coprime, and let T, U ∈ M(S) be particular solutions of

NT + σr U = I . (7.5.24)

Let (B, A) be an r.c.f. of N/σr. Then the set of all C ∈ S(P) that solve the robust tracking problem is

{[X̃ + D(TỸ + AS)] [σr (UỸ − BS)]^{−1} : S ∈ M(S)} . (7.5.25)

Proof. From the proof of Theorem 7.5.1, we see that C solves the robust tracking problem if and only if C = (X̃ + DR)(Ỹ − NR)^{−1} for some R satisfying (7.5.21). Now it is routine to verify that the set of all solutions of (7.5.21) is

R = TỸ + AS , W = UỸ − BS , S ∈ M(S) . (7.5.26)

Substituting for R from (7.5.26) and noting that Ỹ − NR = σr W = σr (UỸ − BS) leads to (7.5.25). □
Proof of Lemma 7.5.2. Recall from Section 4.2 that N and σr I are left-coprime if and only if the matrix F = [N σr I] has full row rank at all s ∈ C+e. If σr(s) ≠ 0, the rank condition holds; therefore, it is only necessary to verify the rank condition at the C+e-zeros of σr. If σr(s) = 0, then F(s) has full row rank if and only if N(s) has full row rank. Hence, we conclude that N and σr I are left-coprime if and only if N(s) has full row rank at all C+e-zeros of σr. Clearly this is the same as requiring the largest invariant factor of N and σr to be coprime. □

Lemma 7.5.2 also holds over any principal ideal domain, and not just over S; but the proof in the general case is a little more complicated (see Problem 7.5.3).
Now let us consider the problem of robust tracking using a two-parameter compensation scheme. Recall from Section 5.6 that such a scheme can be depicted as shown in Figure 7.2. In this scheme, the tracking error between the reference input r and the plant output y is given by

e = [I − N(Ñc2 N + D̃c D)^{−1} Ñc1] r . (7.5.27)

Hence, the transfer matrix from v to e is given by

F(N, D) = [I − N M^{−1} Ñc1] D̃r^{−1} Ñr , (7.5.28)

where

M = M(N, D) = Ñc2 N + D̃c D . (7.5.29)

As before, the objectives of the robust tracking problem are to determine the compensator parameters Ñc1, Ñc2, D̃c such that the closed-loop system is stable and F(N, D) ∈ M(S) for the nominal plant as well as for all plants in some neighborhood of the nominal plant.
A partial solution to the problem is given by the next theorem.

Theorem 7.5.8 Consider the system of Figure 7.2, and suppose P has at least as many inputs as outputs. Let (N, D) be an r.c.f. of the nominal plant, and let (N1, D1) denote perturbations of (N, D). Let (Nc2, Dc) be an r.c.f. of the feedback compensator D̃c^{−1}Ñc2. Then the triple (Ñc1, Ñc2, D̃c) solves the robust tracking problem if and only if

(i) σr^{−1} D̃c ∈ M(S), and
(ii) N1 [M(N1, D1)]^{−1} (Ñc2 − Ñc1) D̃r^{−1} ∈ M(S) ∀(N1, D1).

If P is square, (ii) is equivalent to

(ii′) (Ñc2 − Ñc1) D̃r^{−1} ∈ M(S).

If P is nonsquare, (ii′) implies but is not implied by (ii).

Remarks 7.5.9 If we use a one-parameter, or feedback, compensator, then Ñc2 = Ñc1, so that (ii) above is automatically satisfied. In this case, (i) is a necessary and sufficient condition for robust tracking. This is precisely the result contained in Lemma 7.5.5.
Proof. "if": Suppose first that (i) and (ii) hold. It is desired to show that

(I − N1 M^{−1} Ñc1) D̃r^{−1} ∈ M(S) ∀(N1, D1) , (7.5.30)

where M = M(N1, D1) is as in (7.5.29). Define

S = [D1 −Nc2 ; N1 Dc] . (7.5.31)

By the robustness of closed-loop stability, S is unimodular for (N1, D1) sufficiently close to (N, D). Let T = S^{−1}, and partition T as

T = [A B ; K L] . (7.5.32)

Now [A B] is equivalent to [I 0], since it is part of a unimodular matrix. This, plus the fact that

[A B] [−Nc2 ; Dc] = 0 , (7.5.33)

implies that [A B] is an l.c.f. of D̃c^{−1}Ñc2, i.e., that

[A B] = U [D̃c Ñc2] for some U ∈ U(S) . (7.5.34)

Similarly,

[K L] = [−Ñ1 D̃1] , (7.5.35)

where (D̃1, Ñ1) is an l.c.f. of the perturbed plant N1 D1^{−1}. Now TS = I implies that

U (D̃c D1 + Ñc2 N1) = I , (7.5.36)

or U = M^{−1}. Also, ST = I implies in particular that

N1 B + Dc L = I , (7.5.37)

or

N1 M^{−1} Ñc2 + Dc D̃1 = I . (7.5.38)

Therefore

(I − N1 M^{−1} Ñc1) D̃r^{−1} = Dc D̃1 D̃r^{−1} + N1 M^{−1} (Ñc2 − Ñc1) D̃r^{−1} , (7.5.39)

which belongs to M(S) by (i) and (ii). Hence, robust tracking is achieved.
"only if": Suppose the triple (Ñ_c1, Ñ_c2, D̃_c) achieves robust tracking. It is desired to show that (i) and (ii) hold. The hypothesis is that

{I − N_1 [M(N_1, D_1)]^{-1} Ñ_c1} D_r^{-1} N_r ∈ M(S) ∀ (N_1, D_1),   (7.5.40)

7.5. ROBUST AND OPTIMAL REGULATION 261

where (N_1, D_1) range over some neighborhood of (N, D). By Lemma 7.5.3, (7.5.40) is equivalent to

{I − N_1 [M(N_1, D_1)]^{-1} Ñ_c1} D_r^{-1} ∈ M(S) ∀ (N_1, D_1).   (7.5.41)

As a prelude to the proof, observe first that it can be assumed without loss of generality that the numerator matrix N of the nominal plant and φ_r I are left-coprime. If this is not so originally, it can be achieved by an arbitrarily small perturbation of the plant: It suffices to perturb N to a nearby matrix which has full row rank at all C_{+e}-zeros of φ_r (see Lemma 7.5.2). Also, we may as well assume that the r.c.f. (N, D) of the nominal plant and the triple (Ñ_c1, Ñ_c2, D̃_c) satisfy

Ñ_c2 N + D̃_c D = I.   (7.5.42)

Finally, let (D̃, Ñ) be an l.c.f. of the nominal plant defined by the equation

[ D   −N_c2 ; N   D_c ] = [ D̃_c   Ñ_c2 ; −Ñ   D̃ ]^{-1}.   (7.5.43)

Then, since Ñ and N have the same invariant factors, it follows from Lemma 7.5.2 that Ñ and φ_r I are also left-coprime.
The first step of the proof is to observe that the hypotheses imply that Ñ_c1 and D_r are right-coprime. To see this, note that (7.5.41) implies that

W = (I − N Ñ_c1) D_r^{-1} ∈ M(S).   (7.5.44)

Hence,

N Ñ_c1 + W D_r = I,   (7.5.45)

which shows that Ñ_c1 and D_r are right-coprime. Let (A, B) be an l.c.f. of Ñ_c1 D_r^{-1}, and note that φ_r is also the largest invariant factor of A. Next, fix the plant numerator at N and vary its denominator. Then (7.5.41) states that

{I − N [M(N, D_1)]^{-1} Ñ_c1} D_r^{-1} ∈ M(S) ∀ D_1.   (7.5.46)

Subtracting (7.5.44) from (7.5.46) leads to

N {[M(N, D_1)]^{-1} − I} Ñ_c1 D_r^{-1} = N {[M(N, D_1)]^{-1} − I} A^{-1} B ∈ M(S) ∀ D_1.   (7.5.47)

Hence, from Lemma 7.5.3,

N {[M(N, D_1)]^{-1} − I} A^{-1} ∈ M(S) ∀ D_1.   (7.5.48)

Now M = M(N, D_1) = I + D̃_c (D_1 − D). By an argument analogous to that in the proof of Lemma 7.5.5 (see especially Step 2), it can be shown that M^{-1} − I can be made equal to D̃_c Q for an arbitrary matrix Q ∈ M(S), by suitable choice of D_1. Therefore, (7.5.48) implies that

N D̃_c Q A^{-1} ∈ M(S) ∀ Q ∈ M(S).   (7.5.49)
Now Proposition 7.5.6 leads to the conclusion that φ_r, the largest invariant factor of A, divides each element of N D̃_c. However, from (7.5.43), N D̃_c = D_c Ñ. Hence,

D_c φ_r^{-1} Ñ ∈ M(S).   (7.5.50)

Finally, since φ_r I and Ñ are left-coprime, an application of Lemma 7.5.3 to (7.5.50) gives

D_c φ_r^{-1} ∈ M(S).   (7.5.51)

This proves (i).

The proof of (ii) is now very easy. From previous arguments, robust tracking is equivalent to (7.5.39) belonging to M(S). However, since φ_r divides every element of D_c, we have that

D_c D̃_1 D_r^{-1} ∈ M(S) ∀ (N_1, D_1).   (7.5.52)

Subtracting (7.5.52) from (7.5.39) shows that

N_1 M^{-1} (Ñ_c2 − Ñ_c1) D_r^{-1} ∈ M(S) ∀ (N_1, D_1).   (7.5.53)

This proves (ii).
To complete the proof, it is necessary to show two things: 1) Condition (ii′) implies (ii); 2) If the plant is square, (ii) implies (ii′). Of these, the first statement is obvious: If Ñ_c2 − Ñ_c1 is a left-multiple of D_r, then (7.5.53) is clearly satisfied. To prove the second statement, define

Q = (Ñ_c2 − Ñ_c1) D_r^{-1}.   (7.5.54)

Then (7.5.53) implies in particular that NQ ∈ M(S). Now select unimodular matrices U, V ∈ U(S) such that U^{-1} Q V^{-1} is in Smith-McMillan form, and let

S = U^{-1} Q V^{-1} = Diag {s_1/t_1, ···, s_r/t_r}.   (7.5.55)

Since NQ ∈ M(S), it follows that t_i divides the i-th column of NU, for all i. Now, if Q itself does not belong to M(S), then t_1 is not a unit. Note that t_1 is a divisor of φ_r, because of (7.5.54). Now let s be any C_{+e}-zero of t_1. Then, since t_1 divides the first column of NU, NU(s) is singular, which in turn implies, by the unimodularity of U, that N(s) is singular. However, since t_1 is a divisor of φ_r, this contradicts the left-coprimeness of N and φ_r I. Hence, Q ∈ M(S). □

Theorem 7.5.8 shows that (ii) implies (ii′) for square plants. It is now shown by example that there exist situations involving nonsquare plants where (ii) holds but (ii′) is false.
Example 7.5.10  Suppose the plant

P = [ 3s/(2s − 1)   (4s + 1)/(2s − 1) ]

is to be compensated so as to robustly track a step input. In this case, we may take D_r = φ_r = s/(s + 1). For convenience, let z denote s/(s + 1), and note that z ∈ S. Then

P = [3z   3z + 1] (3z − 1)^{-1},

and an r.c.f. for P is given by

N = [z   3z + 1],   D = [ 1 + z   1 − 3z ; −2z   6z − 1 ].

Now choose

D̃_c = [ 1   0 ; −z   z ],   Ñ_c2 = [ 1 ; 1 − z ].

Then it is claimed that robust tracking is achieved provided φ_r divides the second component of Ñ_c2 − Ñ_c1, even if φ_r does not divide the first component of Ñ_c2 − Ñ_c1. This is proved by establishing that condition (ii) of Theorem 7.5.8 holds. For this purpose, suppose N, D are perturbed to N_1, D_1, and partition these as

N_1 = [n_a   n_b],   D_1 = [D_a   D_b].

Also, partition Ñ_c1, Ñ_c2, D̃_c as

Ñ_c1 = [ x_a ; x_b ],   Ñ_c2 = [ y_a ; y_b ],   D̃_c = [ D̃_a ; D̃_b ].

The key point to note here is that D̃_b is a multiple of z = φ_r. Now partition M = Ñ_c2 N_1 + D̃_c D_1 in the form

M = [ m_aa   m_ab ; m_ba   m_bb ].

Then

N_1 M^{-1} (Ñ_c2 − Ñ_c1)/φ_r = |M|^{-1} [(n_a m_bb − n_b m_ba)(y_a − x_a)]/φ_r + |M|^{-1} [(n_b m_aa − n_a m_ab)(y_b − x_b)]/φ_r.

Since φ_r divides y_b − x_b, the second term belongs to S, provided the plant perturbation is small enough to maintain the unimodularity of M, i.e., to maintain closed-loop stability. As for the first term,

m_ba = y_b n_a + D̃_b D_a,   m_bb = y_b n_b + D̃_b D_b,

so that

n_a m_bb − n_b m_ba = D̃_b (n_a D_b − n_b D_a)

is a multiple of φ_r, since φ_r divides D̃_b. Hence, (ii) holds and robust tracking is achieved, whatever be x_a!
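The change of variable z = s/(s + 1) used in the example can be checked symbolically. The following sketch (assuming SymPy is available) verifies that the entries of [3z  3z + 1](3z − 1)^{-1} reduce to the original transfer functions in s:

```python
import sympy as sp

s = sp.symbols('s')
z = s / (s + 1)  # the stable building block z = s/(s+1) of Example 7.5.10

# The two entries of P = [3z  3z+1](3z - 1)^{-1}, rewritten in terms of s:
p1 = sp.simplify(3*z / (3*z - 1))
p2 = sp.simplify((3*z + 1) / (3*z - 1))

# Mathematically, p1 = 3s/(2s - 1) and p2 = (4s + 1)/(2s - 1).
assert sp.simplify(p1 - 3*s/(2*s - 1)) == 0
assert sp.simplify(p2 - (4*s + 1)/(2*s - 1)) == 0
```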
Now let us consider the problem of optimal regulation. The problems studied in Chapter 6 can be interpreted as minimizing the effect of a stable disturbance on the plant output, or minimizing the tracking error in response to a stable reference input. With the aid of the results derived thus far in this section, it is now possible to tackle the same problems in the presence of unstable disturbances or reference inputs. Specifically, suppose P is a given plant, D_r^{-1} N_r is a given disturbance or reference input generator, and W_1, W_2 are given square weighting matrices. The objective is to minimize the objective function

J_p = ‖W_1 (I + PC)^{-1} D_r^{-1} N_r W_2‖_p,   (7.5.56)

over all C that achieve robust regulation. This can be interpreted as the problem of optimal disturbance rejection, or optimal tracking using a one-parameter compensator. Using Theorem 7.5.1 and Corollary 7.5.7, this problem can be transformed into one of the type solved in Chapter 6. With all symbols as in Theorem 7.5.1 and Corollary 7.5.7, the set of all C over which J_p is to be minimized is given by (7.5.25). Substituting from (7.5.25) into (7.5.56) shows that

(I + PC)^{-1} D_r^{-1} = (U Ỹ − B S) D̃ L,   (7.5.57)

J_p = ‖E − W_1 B S F‖_p,   (7.5.58)

where

L = φ_r D_r^{-1},   E = W_1 U Ỹ F,   F = D̃ L N_r W_2.   (7.5.59)

Note that the robust regulation problem has a solution if and only if the plant has at least as many inputs as outputs. Hence, F is square or "fat," and as a result the minimization of J_p can be carried out using the techniques presented in Section 6.5.
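In the scalar case, the substitution leading from (7.5.57) to (7.5.58) is a one-line algebraic identity. The following SymPy sketch uses scalar stand-ins for all the symbols (an assumption that sidesteps noncommutativity; in the matrix case the same computation goes through as long as the order of factors is respected):

```python
import sympy as sp

# Scalar stand-ins for the symbols of (7.5.57)-(7.5.59).
w1, w2, u, y, b, S, dt, L, nr = sp.symbols('w1 w2 u y b S dt L nr')

F = dt * L * nr * w2          # F = D~ L N_r W_2
E = w1 * u * y * F            # E = W_1 U Y~ F

# W_1 (U Y~ - B S) D~ L N_r W_2 must equal E - W_1 B S F:
lhs = w1 * (u*y - b*S) * dt * L * nr * w2
assert sp.expand(lhs - (E - w1*b*S*F)) == 0
```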
Consider now the problem of optimal robust tracking of square plants using a two-parameter compensation scheme. From Corollary 7.5.7 and Theorem 7.5.8 it follows that the set of all triples (Ñ_c1, Ñ_c2, D̃_c) that achieve robust tracking is given by

D̃_c^{-1} Ñ_c2 = [X̃ + D (T Ỹ + A S)][φ_r (U Ỹ − B S)]^{-1},
Ñ_c1 = Ñ_c2 − Q D_r,   S, Q ∈ M(S).   (7.5.60)

The above expression is not very convenient, as it does not explicitly specify the compensator parameters. Such an expression can be obtained using a doubly coprime factorization for the plant. Suppose

[ Y   X ; −Ñ   D̃ ] [ D   −X̃ ; N   Ỹ ] = I.   (7.5.61)

Then by Problem 5.2.2 it follows that

(X̃ + D R)(Ỹ − N R)^{-1} = (Y − R Ñ)^{-1} (X + R D̃) ∀ R ∈ M(S).   (7.5.62)

Hence, the set of all compensators that achieve robust tracking can be described with the aid of Corollary 7.5.7 and (7.5.62) as

D̃_c = Y − (T Ỹ + A S) Ñ,
Ñ_c2 = X + (T Ỹ + A S) D̃,
Ñ_c1 = X + (T Ỹ + A S) D̃ − Q D_r.   (7.5.63)
For the class of compensators described above, the tracking error transfer matrix can be computed from (7.5.39) as

(D_c D̃ D_r^{-1} + N Q) N_r = ((U Ỹ − B S) D̃ L + N Q) N_r.   (7.5.64)

Comparing (7.5.64) with (7.5.57), we see that the extra flexibility arising from using a two-parameter compensation scheme is the presence of the term NQ in (7.5.64). The expression (7.5.64) can actually be simplified. Since the plant is square, N and φ_r I are both left- as well as right-coprime. Hence, (N, φ_r I) is an r.c.f. of φ_r^{-1} N, or in other words, we may take A = φ_r I, B = N. Hence, in (7.5.64),

(U Ỹ − B S) D̃ L + N Q = U Ỹ D̃ L + N (Q − S D̃ L) = U Ỹ D̃ L + N R,   (7.5.65)

where R = Q − S D̃ L is a free parameter. To reiterate, the set of stable transfer matrices achievable using a feedback compensator is

(U Ỹ D̃ L − N S D̃ L) N_r,   S ∈ M(S),   (7.5.66)

whereas with a two-parameter scheme, the set is

(U Ỹ D̃ L + N Q) N_r,   Q ∈ M(S).   (7.5.67)

Now, the problem of minimizing

J_p = ‖W_1 (I − N Ñ_c1) D_r^{-1} N_r W_2‖_p,   (7.5.68)

over all robustly stabilizing two-parameter compensators can be reformulated as that of minimizing

J_p = ‖E − W_1 N Q N_r W_2‖_p,   (7.5.69)

as Q varies over M(S), where E is defined in (7.5.59). It is easy to see that the infimum in (7.5.69) is no larger than that in (7.5.58), since (7.5.69) contains (7.5.58) as a special case by choosing Q = −S D̃ L. Moreover, as in Section 6.6, there always exist infinitely many optimal compensators if a two-parameter scheme is used, and this flexibility can be exploited to achieve both good disturbance rejection and good tracking simultaneously.
Example 7.5.11  For the sake of variety we consider discrete-time systems in the next two examples. Thus z denotes the unit delay, and the z-transform of a sequence {f_0, f_1, ...} is given by

f(z) = Σ_{i=0}^∞ f_i z^i.

Consider the set-up shown in Figure 7.3, and suppose it is desired to robustly reject a step input disturbance. Since the z-transform of a sequence of ones is 1/(1 − z), we may take

n_r = 1,   d_r = φ_r = 1 − z.

Suppose the plant has two inputs and one output, and has the transfer matrix

P = [z(z − 0.5)   z(z + 0.5)] (z + 0.25)^{-1}.

An l.c.f. of this plant is given by

D̃ = z + 0.25,   Ñ = [z(z − 0.5)   z(z + 0.5)],

and a particular solution of the Bezout identity Ñ X̃ + D̃ Ỹ = I is given by

X̃ = [4   −4]′,   Ỹ = 4.

An r.c.f. of P is given by

N = [z   0],   D = [ 1 − z   −0.5 ; 1   z − 0.5 ].

Now N and φ_r are left-coprime. In fact (7.5.24) is satisfied with

T = [1   0]′,   U = 1.

Hence, the robust disturbance rejection problem has a solution. Further, all solutions of this problem can be parametrized using Corollary 7.5.7. An r.c.f. of N/φ_r is

B = [z   0],   A = [ 1 − z   0 ; 0   1 ].

From (7.5.25), the set of all compensators that achieve robust disturbance rejection is given by

C = N_c D_c^{-1},

where

D_c = (1 − z)(4 − z s_1),
N_c = (z + 0.25) [ 4z + 3 + (1 − z) s_1 − (z + 0.5) s_2 ; −4z − 3 + (z − 1) s_1 + (z − 0.5) s_2 ],

and s_1, s_2 ∈ H∞ are rational.

Now consider the problem of optimal robust disturbance rejection as stated in (7.5.56), with both weighting functions equal to unity. From (7.5.57) and (7.5.58), the set of achievable transfer functions from v to y is

f(s_1, s_2) = (4 − z s_1)(z + 0.25).

Suppose it is desired to minimize the H∞-norm of f(s_1, s_2); this can be carried out using the techniques of Section 6.5.
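The Bezout identity and the coprimeness check in this example can be verified symbolically. The sketch below (assuming SymPy; the variable names mirror the factors above) confirms Ñ X̃ + D̃ Ỹ = 1 and N T + φ_r U = 1:

```python
import sympy as sp

z = sp.symbols('z')

# l.c.f. data of the plant in Example 7.5.11 (z is the unit delay)
Nt = sp.Matrix([[z*(z - sp.Rational(1, 2)), z*(z + sp.Rational(1, 2))]])  # N~
Dt = z + sp.Rational(1, 4)                                                # D~
Xt = sp.Matrix([[4], [-4]])                                               # X~
Yt = 4                                                                    # Y~

# Bezout identity N~ X~ + D~ Y~ = 1
bez = sp.expand((Nt * Xt)[0, 0] + Dt * Yt)
assert bez == 1

# Left-coprimeness of N = [z 0] and phi_r = 1 - z:  N T + phi_r U = 1
N = sp.Matrix([[z, 0]])
T = sp.Matrix([[1], [0]])
assert sp.expand((N * T)[0, 0] + (1 - z) * 1) == 1
```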
Example 7.5.12  Consider the set-up shown in Figure 7.2, where the input to be tracked is the sinusoidal sequence {0, 1, 0, −1, 0, 1, ...}. Since the z-transform of this sequence is z/(1 + z²), we may take

n_r = z,   φ_r = 1 + z².

Suppose the plant is scalar, with the transfer function

p = z/(z + 0.5).

Then, dropping the tildes because everything is a scalar, we have

n = z,   d = z + 0.5,   x = −2,   y = 2,
a = φ_r = 1 + z²,   b = n = z,   t = −z,   u = 1.

Hence, the set of r, w that satisfy (7.5.21) is

r = t y + a s = −2z + (1 + z²) s,   w = u y − b s = 2 − z s,   s ∈ H∞.

From Corollary 7.5.7, the set of all feedback, or one-parameter compensators that achieve robust tracking is given by c = n_c/d_c, where

d_c = (2 − z s)(1 + z²),   n_c = −2 + (z + 0.5)(−2z + (1 + z²) s),

and s ∈ H∞ is rational. The set of achievable tracking error transfer functions is

n_r w d = (2 − z s)(z + 0.5) z,   s ∈ H∞.

Using the techniques of Section 6.4, one can verify that the minimum is 3, corresponding to the choice s = 0. Hence the unique feedback compensator that achieves optimal robust tracking is

c = −(1 + 0.5z + z²)/(1 + z²).

Now let us see how much better we can do using a two-parameter compensation scheme. The set of transfer functions from v to the tracking error that can be achieved is

n_r ((u y − b s) d + n q) = z[(2 − z s)(z + 0.5) + z q],   s, q ∈ H∞
                         = z (1 + z q_1),

where q_1 = q − (z + 0.5) s + 2. The H∞-norm of z(1 + z q_1) is minimized by the choice q_1 = 0, and the minimum value is 1, compared with the value 3 achievable using a feedback compensator. In fact, one can construct a sequence of reference generators n_r^{(i)} such that the ratio of the optimal tracking error achievable using a two-parameter compensation scheme to that achievable using a one-parameter scheme approaches zero as i approaches ∞.
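The two norm computations in this example are easy to confirm numerically. The sketch below (assuming NumPy and SymPy) checks the closed-loop Bezout identity symbolically and then samples the unit circle to compare the one-parameter and two-parameter optimal error norms:

```python
import numpy as np
import sympy as sp

z, s = sp.symbols('z s')

# Plant and compensator factors of Example 7.5.12:
n, d = z, z + sp.Rational(1, 2)
n_c = -2 + (z + sp.Rational(1, 2)) * (-2*z + (1 + z**2)*s)
d_c = (2 - z*s) * (1 + z**2)

# Closed-loop Bezout identity n n_c + d d_c = 1 holds for every parameter s
assert sp.expand(n*n_c + d*d_c) == 1

# H-infinity norms on the unit circle (z is the unit delay):
theta = np.linspace(0, 2*np.pi, 20001)
w = np.exp(1j * theta)
one_param = np.abs(w * (w + 0.5) * 2)   # error z(z + 0.5)(2 - zs) with s = 0
two_param = np.abs(w)                   # error z(1 + z q1) with q1 = 0

print(one_param.max())   # about 3
print(two_param.max())   # about 1
```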
7.6 GENERICITY
In this section, the focus is on the notion of genericity, which is a mathematically precise way of
expressing the intuitive idea that some statement or property is almost always true. The concept of
genericity is defined first, and then it is shown that various useful properties of control systems are
generically true. These include coprimeness, strong stabilizability, and simultaneous stabilizability
(subject of course to suitable restrictions).
We begin with a formal definition of genericity. Let X denote a topological space. Recall that a binary relation R on X is a subset of the product space X × X. It is symmetric if (x, y) ∈ R ⟺ (y, x) ∈ R. More generally, an n-ary relation is a subset of the product space X^n.

Definition 7.6.1  A binary relation R on X is generic if it is an open dense subset of X × X. It is strongly generic if the following is true: For each x ∈ X, define

B(x) = {y ∈ X : (x, y) ∈ R}.   (7.6.1)

Then the set of x for which B(x) is an open dense subset of X is itself an open dense subset of X.

One can state the same definition in words: R is generic if two statements are true: (i) If a pair (x, y) belongs to R, then there is a neighborhood of (x, y) such that every point in the neighborhood also belongs to R. (ii) If a pair (x, y) does not belong to R, then every neighborhood of this pair contains a point that belongs to R. In other words, R is generic if it has two features: (i) If a pair satisfies the relation, then sufficiently small perturbations will not destroy the relation. (ii) If a pair does not satisfy the relation, then arbitrarily small perturbations in the pair will make it satisfy the relation. A simple (and pertinent) example is that, in the polynomial ring R[s] topologized as in Problem C.2.3, the relation R defined by

R = {(x, y) : x and y have no common zeros},   (7.6.2)

is generic. The exact relationship between genericity and strong genericity is brought out in Problem 7.6.1. But, roughly speaking, the symmetric relation R is strongly generic if, for almost all x, almost all y satisfy the relation (x, y) ∈ R.
Proposition 7.6.2  Let A, B be open dense subsets of a topological space X. Then C = A ∩ B is also open and dense.

Proof. Clearly C is open, being the intersection of two open sets. Hence, it only remains to show that it is dense. For this, it is necessary to show the following: Let x ∈ X be arbitrary. Then every neighborhood of x contains a point of A ∩ B. Accordingly, let x ∈ X be chosen arbitrarily, and let N be a neighborhood of x. Since A is dense, N contains an element a ∈ A. Moreover, since A is open, there is a neighborhood N_1 ⊆ A containing a. One can assume that N_1 is also a subset of N by replacing N_1 by N ∩ N_1 if necessary. Next, since B is also dense, the neighborhood N_1 contains an element b ∈ B. Since b also belongs to A, b belongs to A ∩ B and is contained in the neighborhood N of x. □
The topological space under study in this section is the ring S of proper stable rational functions. Recall that the topology on S is the natural one induced by its norm. Thus, a base for the topology on S is provided by open balls of various radii centered at various points. Most of the arguments given below can be extended to an arbitrary topological ring whose set of units is open and where inversion of a unit is continuous, but at a considerable increase in the level of abstraction. We begin by deriving some results concerning the genericity of coprimeness.

Proposition 7.6.3  Define a symmetric binary relation R on S by

R = {(f, g) : f and g are coprime}.   (7.6.3)

Then R is strongly generic as well as generic.

Proof. It is shown only that R is strongly generic. The proof of genericity is similar and is left as an exercise (see Problem 7.6.2).

Let f ∈ S, and define

R_f = {g ∈ S : f and g are coprime}.   (7.6.4)

It is shown below that R_f is an open dense subset of S whenever f ≠ 0. Since the set of nonzero elements is open and dense in S, this would prove the strong genericity of R.

To show that R_f is open, suppose that f and g are coprime, and choose a, b ∈ S such that a f + b g = 1. If ‖h − g‖ is less than ‖b‖^{-1}, then

a f + b h = 1 + b (h − g),   (7.6.5)

is a unit of S, which shows that h and f are also coprime. Thus, if g ∈ R_f, then h ∈ R_f whenever h belongs to the ball B(g, ‖b‖^{-1}). This shows that R_f is open.

To show that R_f is dense whenever f ≠ 0, let g be an arbitrary element of S. If f and g are coprime, then nothing further needs to be done. On the other hand, if f and g are not coprime, it must be shown that every neighborhood of g contains an element which is coprime with f. Towards this end, let S denote the set of C_{+e}-zeros of the function f, and note that S is a finite set since f is nonzero. Now partition the set S into two disjoint subsets S_1 and S_2, as follows:

S_1 = {s ∈ S : g(s) = 0},   S_2 = {s ∈ S : g(s) ≠ 0}.   (7.6.6)

Define

r = min_{s ∈ S_2} |g(s)|,   (7.6.7)

and note that r > 0. Now let ε > 0 be arbitrary; it will be shown that the ball B(g, ε) contains an element h which is coprime with f. Select a positive δ < min(r, ε) and define h = g + δ. Then, it is claimed that f and h are coprime, since they do not have any common C_{+e}-zeros. To establish the claim, note that the only possible common C_{+e}-zeros of f and h are the elements of the set S. Now, if s ∈ S_1, then h(s) = δ ≠ 0, whereas if s ∈ S_2, then |h(s)| ≥ r − δ > 0. Hence, f and h are coprime. □
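The perturbation h = g + δ used in the density argument can be illustrated concretely. In the polynomial analogue below (a SymPy sketch with hypothetical f and g sharing the zero s = 1), an arbitrarily small constant shift destroys the common zero:

```python
import sympy as sp

x = sp.symbols('x')

f = sp.expand((x - 1) * (x + 2))       # f and g share the zero x = 1 ...
g = sp.expand((x - 1) * (x + 3))       # ... so they are not coprime
assert sp.gcd(f, g) == x - 1

delta = sp.Rational(1, 100)            # an arbitrarily small constant shift
h = g + delta                          # the perturbation h = g + delta of the proof
assert sp.gcd(f, h) == 1               # f and h have no common zeros
```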
Thus, Proposition 7.6.3 shows that, generically, any two functions in S are coprime. In order
to extend this result to matrices, a preliminary lemma is needed.
Proposition 7.6.4  Suppose m > 1 is some integer. Then the set

M_m = {D ∈ S^{m×m} : |D| ≠ 0},   (7.6.8)

is an open dense subset of S^{m×m}.

Proof. The map D ↦ |D| is continuous, and the set of nonzero elements in S is open. Hence, M_m, which is the preimage of an open set under a continuous map, is itself open. To show that it is also dense, suppose D is a matrix in S^{m×m} with |D| = 0, and let ε be an arbitrary positive number. It is shown next that the ball B(D, ε) contains a matrix with a nonzero determinant. Select matrices U, V ∈ U(S) such that

U D V = Diag {d_1, ···, d_r, 0, ···, 0},   (7.6.9)

is the Smith form of D, and r is the rank of D. Observe now that d_i divides d_{i+1} for all i. Now define

S = Diag {d_1, ···, d_r, δ d_r, ···, δ d_r},   (7.6.10)

and let D_1 = U^{-1} S V^{-1}. Then |D_1| ≠ 0, and D_1 can be made to lie in the ball B(D, ε) by choosing the constant δ sufficiently small. □
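A numerical analogue of this construction replaces the Smith form by the singular value decomposition: zero singular values are replaced by a small δ, producing a nonsingular matrix within δ of the original. The sketch below (assuming NumPy) illustrates the idea only; it is not the algebraic Smith-form argument itself:

```python
import numpy as np

# A singular matrix, playing the role of D with |D| = 0.
D = np.array([[1.0, 1.0],
              [1.0, 1.0]])
assert abs(np.linalg.det(D)) < 1e-12

# Diagonalize (SVD in place of the Smith form), replace the zero singular
# values by a small delta, and reassemble.
U, sig, Vt = np.linalg.svd(D)
delta = 1e-6
sig1 = np.where(sig < delta, delta, sig)
D1 = U @ np.diag(sig1) @ Vt

assert abs(np.linalg.det(D1)) > 0          # D1 is nonsingular ...
assert np.linalg.norm(D1 - D, 2) <= delta + 1e-12   # ... and within delta of D
```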
The next proposition generalizes Proposition 7.6.3 to the matrix case.

Proposition 7.6.5  Suppose m < n are positive integers, and define

R(m, n) = {A ∈ S^{m×n} : the g.c.d. of all m × m minors of A is 1}.   (7.6.11)

Then R(m, n) is an open dense subset of S^{m×n}.

Remarks 7.6.6  1) If R(m, n) is viewed as an mn-ary relation on S, then the proposition states that this relation is generic.

2) Another way of stating the proposition is: Suppose F ∈ S^{m×k}, G ∈ S^{m×l}. Then F and G are generically coprime if k + l > m.

3) The definition (7.6.11) can be equivalently stated as:

R(m, n) = {A ∈ S^{m×n} : A ∼ [I 0]}.   (7.6.12)
Proof. Clearly it is enough to prove the proposition for the case where n = m + 1. The openness of the set R(m, n) is obvious, and it only remains to show that the set is dense. This is done by induction on m. Note that the proposition is true when m = 1, by Proposition 7.6.3. Suppose the proposition is true for matrices with m − 1 rows and m (or more) columns. To establish the statement for matrices with m rows, suppose A ∈ S^{m×n} but does not belong to R(m, n). To be shown is that every neighborhood of A contains a matrix belonging to R(m, n). Let ε > 0 be any specified number, and partition A in the form

A = [A_1   a_2],   (7.6.13)

where A_1 and a_2 have dimensions m × m and m × 1, respectively. Now let A_{11} denote the submatrix consisting of the first m − 1 rows of A_1. By the inductive hypothesis and Proposition 7.6.4, there exists an m × m matrix

B_1 = [ B_{11} ; B_{12} ],   (7.6.14)

such that (i) |B_1| ≠ 0, (ii) [I 0] is a Smith form of B_{11}, and (iii) ‖B_1 − A_1‖ < ε/2. Now let S denote the set of C_{+e}-zeros of |B_1|, and partition the set S into two disjoint sets S_1 and S_2, as follows:

S_1 = {s ∈ S : all m × m minors of [B_1  a_2] are zero at s},
S_2 = {s ∈ S : not all m × m minors of [B_1  a_2] are zero at s}.   (7.6.15)

The idea, as in the proof of Proposition 7.6.3, is to perturb a_2 in such a way as to destroy all the common zeros in the set S_1 while ensuring that none of the points in S_2 becomes a common zero of all the minors. For this purpose, note that all minors of the matrix [B_1  a_2] belong to S and are thus bounded over the closed right half-plane. Let μ be an upper bound on the norms of all these minors. Now, at each point s ∈ S_1, find an (m − 1) × (m − 1) minor of B_{11} which is nonzero at s; such a minor must exist, since [I 0] is a Smith form of B_{11}. Suppose such a minor is obtained by omitting the j-th column of B_{11}. Let e_m denote the m-th elementary unit vector. Then the m × m minor of

B = [B_1   a_2 + δ e_m],   (7.6.16)

obtained by omitting the j-th column is nonzero at s, whenever δ ≠ 0. Hence, at all s ∈ S_1, at least one of the m × m minors of the matrix B in (7.6.16) is nonzero. The next step is to show that, if δ is sufficiently small, then at least one of the m × m minors of the matrix B is nonzero at all points in the set S_2 as well. This can be shown as follows: In analogy with (7.6.7), choose a real number r > 0 such that, at each s ∈ S_2, at least one m × m minor of the matrix [B_1  a_2] has magnitude no smaller than r. Now choose δ < r/μ. Then, at each s ∈ S_2, the corresponding m × m minor of B is also nonzero. Finally, since S = S_1 ∪ S_2 is the zero set of |B_1|, these are the only possible points in the closed RHP where all m × m minors of the matrix B could vanish. Hence, we conclude that, whenever δ < min(r/μ, ε/2), B has the Smith form [I 0]. □
Proposition 7.6.5 shows that a rectangular matrix generically has the Smith form [I 0]. The next result is addressed to the Smith form of square matrices.

Proposition 7.6.7  Suppose m is a positive integer, and define

T_m = {A ∈ S^{m×m} : A ∼ Diag {1, ···, 1, |A|}}.   (7.6.17)

Then T_m is an open dense subset of S^{m×m}.

Proof. The openness of T_m is left to the reader to prove. To prove that T_m is dense, suppose A does not belong to T_m, and let ε > 0 be a given number. To construct an element A_1 belonging to T_m lying in the ball B(A, ε), we proceed as follows: First, by Proposition 7.6.4, it can be assumed that |A| ≠ 0, by perturbing the matrix A slightly if necessary. Now select unimodular matrices U, V such that

S = U A V = Diag {a_1, ···, a_m},   (7.6.18)

is a Smith form of A. Then, for any δ > 0, one can find functions b_i ∈ B(a_i, δ) such that the functions b_1, ···, b_m are all pairwise coprime (see Proposition 7.6.3). Hence, the matrix

B = U^{-1} Diag {b_1, ···, b_m} V^{-1},   (7.6.19)

has the Smith form Diag {1, ···, 1, |B|}. By adjusting δ, one can ensure that B ∈ B(A, ε). □
The first principal result of this section states that a collection of r multivariable plants is generically simultaneously stabilizable provided the plants have at least r inputs or r outputs. A useful corollary is that generically a multivariable plant can be strongly stabilized. These results make use of Proposition 7.6.5 concerning the generic Smith form of rectangular matrices.

Theorem 7.6.8  Suppose l, m are given positive integers, and let M(R(s)) denote the set [R(s)]^{l×m}, equipped with the graph topology. Let r be a positive integer, and define the r-ary relation R on M(R(s)) by

R = {(P_1, ···, P_r) : ⋂_i S(P_i) ≠ ∅}.   (7.6.20)

Then R is an open dense subset of [M(R(s))]^r whenever r ≤ max(l, m).
The proof of the theorem is divided into two parts: First, it is shown that the result is true in the case where min(l, m) = 1, i.e., the case where all plants are either single-input or single-output. Then it is shown that the general case can be reduced to this special case.

Lemma 7.6.9  Suppose min(l, m) = 1. Then R defined in (7.6.20) is an open dense subset of [M(R(s))]^r if max(l, m) ≥ r.
Proof. The proof is given for the case where l = 1 and m ≥ r. The other case can be proved by taking transposes.

Given the plants P_1, ···, P_r, each of dimension 1 × m, find l.c.f.s (d_p^{(i)}, [n_{p1}^{(i)} ··· n_{pm}^{(i)}]), for i = 1, ···, r. Define the matrix Q ∈ S^{r×(m+1)} by

q_{ij} = d_p^{(i)}  if j = 1,
q_{ij} = n_{p(j−1)}^{(i)}  if j > 1.   (7.6.21)

Let C ∈ [R(s)]^{m×1} be a compensator, and let ([n_{c1}, ···, n_{cm}]′, d_c) be an r.c.f. of C. Then by Theorem 5.1.6, C stabilizes each of the plants P_i if and only if each of the functions u_i defined by

[ u_1 ; ⋮ ; u_r ] = Q [ d_c ; n_{c1} ; ⋮ ; n_{cm} ],   (7.6.22)

is a unit. Turning this argument around, the set of plants {P_1, ···, P_r} can be simultaneously stabilized if and only if there exist units u_i, i = 1, ···, r such that (7.6.22) has a solution in M(S) for the unknowns d_c and n_{ci}, i = 1, ···, m, where the first component of the solution (corresponding to d_c) is nonzero. Now, if m ≥ r, then the matrix Q has more columns than rows. Hence, by Proposition 7.6.5, Q is generically right-invertible. That is, if Q is not right-invertible, it can be made so by an arbitrarily small perturbation in each of its elements, which is precisely the same as slightly perturbing each of the plants P_i in the graph topology. Thus, by slightly perturbing each of the plants, we can ensure that (7.6.22) has a solution in M(S), whatever be the left hand side. In particular, one can choose any arbitrary set of units {u_1, ···, u_r}, and (7.6.22) has a solution in M(S). The only additional point to be worried about is that the first element of this solution (corresponding to d_c) should be nonzero. But this too can be guaranteed by perturbing the units slightly if necessary. The details of this last argument are routine and are left to the reader. □
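The right-invertibility of Q can be illustrated in a polynomial analogue. For the two hypothetical 1 × 2 plants below, the 2 × 2 minors of Q have no common zero, so their g.c.d. is 1 (a SymPy sketch):

```python
import sympy as sp

s = sp.symbols('s')

# Two 1 x 2 plants with l.c.f.s (d, [n1 n2]); polynomial analogue of (7.6.21).
# Row i of Q is [d^(i), n1^(i), n2^(i)].
Q = sp.Matrix([[s - 1, 1, s],
               [s - 2, s, 1]])

# The three 2 x 2 minors of Q ...
minors = [Q.extract([0, 1], c).det() for c in ([0, 1], [0, 2], [1, 2])]

# ... have no common zero, so Q is right-invertible over the polynomials:
g = sp.gcd(sp.gcd(minors[0], minors[1]), minors[2])
assert g == 1
```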
The next step is to reduce the case of an l × m system where l < m to the case of a single-output system (the case where l > m is handled by transposition).

Lemma 7.6.10  Suppose l, m are given positive integers. Define the relation R on the product space S^{(m+l)×m} × R^l as the set of all (D, N, v) such that (vN, D) are right-coprime. Then the relation R is generic.

Proof. Let n^{(1)}, ···, n^{(l)} denote the rows of the matrix N. By Proposition 7.6.5, it is generically true that the Smith form of the matrix

M_i = [ D ; n^{(i)} ],   (7.6.23)

is [I 0]′ for all i. In other words, this means that the g.c.d. of all m × m minors of the matrix

M = [ D ; N ],   (7.6.24)

involving exactly one row of N is equal to 1. Now, if v is a 1 × l row vector, then the pair (vN, D) is right-coprime if and only if the g.c.d. of all m × m minors of the matrix

Q = [ D ; vN ],   (7.6.25)

is equal to 1. We show that generically this is the case for almost all v ∈ R^l. Now the matrix Q has dimensions (m + 1) × m, as do the matrices M_i defined in (7.6.23). Moreover, since the determinant function is multilinear, a minor of Q involving the last row can be expressed as a linear combination of the corresponding minors of the various M_i. More precisely, let q_j denote the minor of Q obtained by omitting the j-th row of the matrix D, and define m_i^j analogously. Then

q_j = Σ_{i=1}^l v_i m_i^j.   (7.6.26)

By hypothesis, it is generically true that

g.c.d._{i,j} {m_i^j} = 1.   (7.6.27)

Hence, if the elements m_i^j satisfy the rank condition of [100], it follows that, for almost all vectors v,

g.c.d._j {q_j} = 1.   (7.6.28)

It is left to the reader to show that the rank condition is generically satisfied. This shows that (vN, D) are right-coprime and completes the proof that the relation R is generic. □
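The multilinearity identity (7.6.26) can be checked directly. The SymPy sketch below uses hypothetical 2 × 2 data for D and N and verifies that each minor of Q involving the row vN is the corresponding linear combination of the minors of the M_i:

```python
import sympy as sp

s, v1, v2 = sp.symbols('s v1 v2')

# D is 2 x 2 and N has rows n1, n2 (l = m = 2); v = [v1 v2] is a row vector.
D = sp.Matrix([[s, 1], [0, s - 1]])
n1 = sp.Matrix([[1, s]])
n2 = sp.Matrix([[s, 2]])
vN = v1*n1 + v2*n2

def top_minors(row):
    # the 2 x 2 minors of [D; row] that involve the bottom row,
    # obtained by omitting row j of D
    stacked = D.col_join(row)
    return [stacked.extract([1, 2], [0, 1]).det(),
            stacked.extract([0, 2], [0, 1]).det()]

q = top_minors(vN)
m1 = top_minors(n1)
m2 = top_minors(n2)

# (7.6.26): each minor of Q is the same linear combination of the M_i minors
for j in range(2):
    assert sp.expand(q[j] - (v1*m1[j] + v2*m2[j])) == 0
```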
The significance of Lemma 7.6.10 is in showing that, generically, a multivariable plant can be stabilized using a rank-one compensator. Suppose P is a given multivariable plant with an r.c.f. (N, D), and suppose there exists a vector v such that (vN, D) are right-coprime. Then one can find a pair of elements (a, B) ∈ M(S) such that a v N + B D = I. Note that a is a column vector. Now the compensator C = B^{-1} a v has rank one as an element of M(R(s)) and stabilizes P. Hence, one can stabilize P using a rank-one compensator if one can find a row vector v such that (vN, D) are right-coprime. Lemma 7.6.10 states that generically this is always possible.

With the aid of Lemma 7.6.10, the proof of Theorem 7.6.8 can be completed.

Proof of Theorem 7.6.8. If min(l, m) = 1, then the truth of the theorem is established by Lemma 7.6.9. If not, suppose without loss of generality that l ≤ m, so that r ≤ m. For each plant P_i, it is generically true that the set of row vectors v such that the m-input, single-output plant vP_i can be stabilized by an appropriate compensator is open and dense in R^l. Since the intersection of a finite number of open and dense subsets of R^l is again open and dense (see Proposition 7.6.2), generically the set of row vectors v such that each of the plants vP_i can be stabilized by an appropriate compensator is open and dense. Moreover, by Lemma 7.6.9, generically there exists a common stabilizing compensator C_1, since r ≤ m. Hence, C = C_1 v simultaneously stabilizes each of the plants P_i, i = 1, ···, r. □
In Section 5.4 it was shown that the problem of simultaneously stabilizing r plants is equivalent to that of simultaneously stabilizing r − 1 plants using a stable compensator. Now Theorem 7.6.8 shows that r plants of dimension l × m can be generically simultaneously stabilized provided r ≤ max(l, m). Thus, it is natural to conjecture that r plants of dimension l × m can generically be simultaneously stabilized using a stable compensator provided r < max(l, m). In fact this is true, and the proof follows very nearly the reasoning above. However, a few minor adjustments are needed. The reason for this is the following: Given r plants P_i, i = 1, ···, r, they can be simultaneously stabilized using a stable compensator if and only if the r + 1 plants 0, P_1, ···, P_r can be simultaneously stabilized. Now, the reasoning used in Lemma 7.6.9 and Theorem 7.6.8 relies on being able to perturb all of the plants as needed. However, in the present instance, one can only perturb all except one plant (namely the zero plant). However, there is a saving feature: In choosing an l.c.f. of the zero plant, the numerator matrix has to be zero, but the denominator matrix is arbitrary. Hence, the proof of the above result is complete if it can be shown that the Smith form of an (r + 1) × (m + 1) matrix of the form

M = [ d_1   N_1 ; d_2   0 ],   (7.6.29)

is generically equal to [I 0] whenever r < m. Comparing the matrix in (7.6.29) to that in Proposition 7.6.5, we see that the only difference is the presence of a row of zeros at the bottom of the matrix M in (7.6.29). Now, checking back to the proof of Proposition 7.6.5, the ingredients of the proof by induction were: (i) the fact that the r × m matrix N_1 generically had the Smith form [I 0], and (ii) the ability to adjust the corner element d_2. As both of these features are present in (7.6.29), we can conclude that the matrix M is generically right-invertible if r < m. Hence, the following corollary to Theorem 7.6.8 is proved:
Corollary 7.6.11  Suppose the plants P_i, i = 1, ···, r have dimension l × m. Then they can generically be simultaneously stabilized using a stable compensator if r < m.
PROBLEMS

7.6.1. (a) Show that if R is open then R_x is open for all x.
(b) Show that if R_x is a dense subset of X for all x belonging to a dense subset of X, then R is a dense subset of X × X.

7.6.2. Show that the relation R of (7.6.3) is generic.

7.6.3. Show that the relation T_m defined in (7.6.17) is open.
NOTES AND REFERENCES

The graph topology of Section 7.2 and the graph metric of Section 7.3 were introduced in [96]. For another metric on the set of square rational plants, see [114]. The robustness results in Section 7.4 are due to several researchers, including Doyle and Stein [32], Chen and Desoer [13], Kimura [58], and Vidyasagar and Kimura [99]. For other results on the robustness of feedback stability, specifically the gain margin of systems, see [57, 67]. Some of the results on robust regulation are from [39]. The genericity of simultaneous stabilizability was first established for two plants by Vidyasagar and Viswanadham [102]. The general results given here were first obtained by Ghosh and Byrnes [43] using state-space methods. The present treatment follows [100].
CHAPTER 8

Extensions to General Settings
The previous six chapters concentrated on the problems of controller synthesis in the case where the underlying ring of stable transfer functions is the ring S defined in Section 2.1. As pointed out in Section 4.1, the only property of S that is used repeatedly in Chapters 2–7 is the fact that S is a principal ideal domain. Thus, most if not all of the contents of these chapters remain valid if S is replaced by any p.i.d. This is why the synthesis theory of Chapters 2–7 encompasses, within a single framework, lumped continuous-time as well as discrete-time systems, and situations where the "stable" region in the complex plane is something other than the left half-plane (or the closed unit disc in the discrete-time case).
The objective of the present chapter is to generalize the earlier synthesis theory by removing
the assumption that the underlying set of stable systems is a p.i.d. In effect, the generalization
extends the scope of the theory to include linear distributed systems, 2-D systems and the like.
In contrast with earlier chapters, this chapter, and in particular Section 8.1, presupposes a
mathematical background on the part of the reader well beyond that contained in the appendices.
Several references to the literature are given at appropriate places, but in general, [14, 116] are good
references for concepts from abstract algebra, while [42] is a handy source for results on Banach
algebras.
8.1 COPRIME FACTORIZATIONS OVER A COMMUTATIVE RING
Throughout this section, R denotes a commutative domain with identity, and F denotes the field of fractions associated with R. As in the rest of the book, the symbols M(R), M(F), U(R) denote respectively the set of all matrices with elements in R, the set of all matrices with elements in F, and the set of all unimodular matrices with elements in R. If R is a principal ideal domain, then the contents of Section 4.1 show that every matrix over F has both right- and left-coprime factorizations over R. In fact, for this it is enough that R be a Bezout domain, i.e., that every finitely generated ideal in R be principal. However, if R is not Bezout, then the questions of the existence of r.c.f.'s and l.c.f.'s and their interrelationships need to be examined anew. It is the purpose of this section to do so.

The concepts of coprimeness and coprime factorizations introduced in Section 4.1 can be extended in a natural way to a general ring R. Suppose A, B ∈ M(R) have the same number of columns. Then A and B are right-coprime if there exist X, Y ∈ M(R) such that

XA + YB = I .  (8.1.1)
Suppose F ∈ M(F). A pair (N, D), where N, D ∈ M(R), is a right-coprime factorization (r.c.f.) of F if (i) D is square and |D| ≠ 0, (ii) F = N D^{-1}, and (iii) N, D are right-coprime. Left-coprime matrices and l.c.f.'s are defined analogously.

Given an arbitrary F ∈ M(F), the existence of an r.c.f. for F is not automatic unless R is a Bezout domain (see Corollary 8.1.5 below). However, if F does have an r.c.f., then it is easy enough to characterize all r.c.f.'s of F. In fact, the relevant part of Theorem 4.1.13 carries over in toto. For ease of reference, the result is repeated here, but the proof is omitted.

Lemma 8.1.1 Suppose F ∈ M(F) has an r.c.f. (N, D). Then the set of all r.c.f.'s of F consists of the set of pairs (NU, DU) as U ranges over all matrices in U(R).

But when does an F ∈ M(F) have an r.c.f.? We begin with the scalar case.
Lemma 8.1.2 Suppose f ∈ F, and express f as a/b where a, b ∈ R. Then f has a coprime factorization if and only if the ideal generated by a and b in R is principal.

Proof. "if" Suppose g is a generator of the ideal generated by a and b. Then a = ng, b = dg, g = xa + yb for appropriate elements n, d, x, y ∈ R. Since b is the denominator of a fraction, it is nonzero, which in turn implies that both d and g are nonzero. This, together with the earlier relations, leads to f = n/d, xn + yd = 1. Hence, (n, d) is a coprime factorization of f.

"only if" Suppose (n, d) is a coprime factorization of f, and choose x, y ∈ R such that xn + yd = 1. Let r = b/d. Then

r = b d^{-1} = b(x n d^{-1} + y) = b(x a b^{-1} + y) = xa + yb ∈ R .  (8.1.2)

Hence, d divides b in R, and dr = b. This implies that a = nr, and that the ideal generated by a, b is the same as the principal ideal generated by r. □
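The proof of Lemma 8.1.2 is constructive whenever a generator of the ideal (a, b) and the accompanying Bezout coefficients can be computed. As an illustration (not from the text), the sketch below carries the construction out over R = Z, where (a, b) is generated by gcd(a, b) and the extended Euclidean algorithm supplies the coefficients; the function names are ours.

```python
def xgcd(a, b):
    """Extended Euclid: return (g, x, y) with x*a + y*b = g = gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def coprime_factorization(a, b):
    """Given f = a/b over Z (b != 0), return (n, d, x, y) with
    f = n/d and x*n + y*d = 1, following the proof of Lemma 8.1.2."""
    g, _, _ = xgcd(a, b)      # g generates the ideal (a, b)
    n, d = a // g, b // g     # a = n*g, b = d*g
    _, x, y = xgcd(n, d)      # n and d are coprime, so x*n + y*d = 1
    return n, d, x, y

n, d, x, y = coprime_factorization(84, 30)
print(n, d, x * n + y * d)    # 14 5 1
```

The same pattern applies verbatim over any Euclidean domain, e.g. polynomials in one variable over a field.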
The matrix case is essentially similar, once one observes that every matrix F ∈ M(F) can be expressed in the form A B^{-1} where A, B ∈ M(R) and |B| ≠ 0. To see this, suppose f_{ij} = p_{ij}/q_{ij} where p_{ij}, q_{ij} ∈ R. Define b = ∏_i ∏_j q_{ij}, and let a_{ij} = b p_{ij}/q_{ij} ∈ R. If A denotes the matrix of the a_{ij}'s, then F = A (bI)^{-1}. The result in the matrix case is stated without proof.
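This construction is easy to mechanize. The sketch below (ours, using Python's exact rational arithmetic) forms b as the product of all entry denominators and checks that F = A (bI)^{-1} entrywise:

```python
from fractions import Fraction
from math import prod

# Entries f_ij = p_ij / q_ij of a 2 x 2 matrix over the field of fractions of Z
F = [[Fraction(1, 2), Fraction(2, 3)],
     [Fraction(3, 4), Fraction(1, 5)]]

b = prod(e.denominator for row in F for e in row)    # product of all q_ij
A = [[int(e * b) for e in row] for row in F]         # a_ij = b * p_ij / q_ij is an integer

# F = A (bI)^{-1}, i.e. f_ij = a_ij / b
assert all(Fraction(A[i][j], b) == F[i][j] for i in range(2) for j in range(2))
print(A, b)   # [[60, 80], [90, 24]] 120
```

Note that b need not be the least common denominator; any common multiple of the q_ij works.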
Lemma 8.1.3 Suppose F ∈ F^{n×m}, and express F in the form A B^{-1} where A, B ∈ M(R) and |B| ≠ 0. Define

I = {XA + YB : X ∈ R^{m×n}, Y ∈ R^{m×m}} ,  (8.1.3)

and note that I is a left ideal in R^{m×m}. Then F has an r.c.f. if and only if I is a left principal ideal.

Lemma 8.1.4 Every f ∈ F has a coprime factorization if and only if R is a Bezout domain.
Proof. The following statement is easy to prove: R is a Bezout domain if and only if every pair of elements in R generates a principal ideal. So, if R is not Bezout, then there exist a, b ∈ R such that the ideal generated by a and b is not principal. Obviously, both a and b are nonzero. Hence, f = a/b ∈ F, but f does not have a coprime factorization, by Lemma 8.1.2. This proves the "only if" part. The "if" part follows readily from Lemma 8.1.2. □

Corollary 8.1.5 Every F ∈ F^{n×m} has an r.c.f. for all integers n, m ≥ 1 if and only if R is a Bezout domain.

Proof. If R is a Bezout domain, the proof of Theorem 4.1.13 applies to show that every F ∈ F^{n×m} has an r.c.f. for all n, m ≥ 1. The converse is immediate from Lemma 8.1.4. □
The next several paragraphs are devoted to deriving conditions for a matrix in M(R) to have a left (or right) inverse in M(R), using the idea of reduction modulo maximal ideals. Since a finite set of matrices A_1, …, A_n in M(R) are left-coprime if and only if the matrix [A_1 … A_n] has a right inverse, this problem arises naturally in the study of the coprimeness of two (or more) matrices. In the case where R is a Bezout domain, several equivalent criteria for left invertibility are given in Section 4.1. It turns out that Corollary 4.1.5 is the result that extends most naturally to the case of a general ring R (see Theorem 8.1.8 below).
An ideal I in R is maximal if I ≠ R and I is not properly contained in any ideal of R other than R itself. Equivalently (see [116, p. 150]), I is maximal if and only if the factor ring R/I is a field. Let max(R) denote the set of maximal ideals of R. Then R/I is a field for each I ∈ max(R), even though for different I this field could be different. However, if R is in addition a Banach algebra over the complex field, then R/I is isomorphic to the complex field C for all I ∈ max(R) [42, p. 31]. For x ∈ R and I ∈ max(R), let x_I denote the coset x + I in the factor ring R/I.
Lemma 8.1.6 Suppose a_1, …, a_n ∈ R. Then there exist x_1, …, x_n ∈ R such that

Σ_{i=1}^n x_i a_i = 1  (8.1.4)

if and only if

rank [a_{1I} … a_{nI}] = 1  ∀ I ∈ max(R) .  (8.1.5)
Remarks 8.1.7 Note that (8.1.4) is equivalent to the statement that a_1, …, a_n generate R, while (8.1.5) states that no maximal ideal of R contains all of the a_i's. Thus, Lemma 8.1.6 is a natural restatement of a well-known result [42, p. 24] that an element of R is a unit if and only if it is not contained in any maximal ideal of R.
Proof. "if" Suppose (8.1.5) holds and let J denote the ideal generated by the a_i's. If J ≠ R, then J is contained in a maximal ideal of R [116, p. 151]. Suppose J ⊆ I ∈ max(R). Then I contains all the a_i's, in contradiction to (8.1.5). Hence, J = R, i.e., (8.1.4) holds.

"only if" Suppose (8.1.4) holds. If (8.1.5) is false, then there exists an I ∈ max(R) that contains all the a_i's. In view of (8.1.4), this implies that 1 ∈ I, which is absurd. □
Theorem 8.1.8 Suppose A ∈ R^{n×m} with n ≥ m. Then A has a left inverse in R^{m×n} if and only if¹

rank A_I = m  ∀ I ∈ max(R) .  (8.1.6)

An equivalent, and extremely useful, way of stating this theorem is: A ∈ R^{n×m} has a left inverse if and only if the m × m minors of A together generate R.
Proof. "only if" Suppose B ∈ R^{m×n} is a left inverse of A, so that BA = I_m. Since the map x ↦ x_I from R into R/I is a homomorphism for each I ∈ max(R), it follows, upon taking cosets modulo I, that

B_I A_I = I_m  ∀ I ∈ max(R) .  (8.1.7)

Now (8.1.7) implies (8.1.6).

"if" This part of the proof is made more intelligible by using the symbol S(m, n) defined in Section B.1. Recall that S(m, n) denotes the set of all strictly increasing m-tuples taken from {1, …, n}.

The case m = n is very easy: In this case (8.1.6) merely states that |A| is not contained in any maximal ideal of R. Hence, |A| is a unit of R, and A^{-1} ∈ R^{m×m}.
Now suppose m < n. For each m-tuple J ∈ S(m, n), let F_J ∈ R^{m×m} denote the m × m submatrix of A corresponding to the rows in J. In other words, if J = {j_1, …, j_m}, then

F_J = [a_{j_1 1} ⋯ a_{j_1 m}; ⋮ ; a_{j_m 1} ⋯ a_{j_m m}] .  (8.1.8)

Let f_J = |F_J| denote the m × m minor of A corresponding to the rows in J. Then (8.1.6) states that, for each I ∈ max(R), we have f_{JI} ≠ 0 for some J ∈ S(m, n). Hence, by Lemma 8.1.6, the minors f_J, J ∈ S(m, n) together generate R; i.e., there exist elements x_J, J ∈ S(m, n) such that

Σ_{J∈S(m,n)} x_J f_J = 1 .  (8.1.9)
¹Note that A_I is an n × m matrix, with elements in the field R/I, whose ij-th element is a_{ijI}.
Next, for each J ∈ S(m, n), define a matrix B_J ∈ R^{m×n} as follows: Suppose J = {j_1, …, j_m}, define F_J as in (8.1.8), and let G_J ∈ R^{m×m} denote the adjoint matrix of F_J. Let g_{J1}, …, g_{Jm} denote the columns of G_J. Then

B_J = [0 ⋯ g_{J1} g_{J2} ⋯ g_{Jm} ⋯ 0] .  (8.1.10)

In other words, columns j_1, …, j_m of B_J equal g_{J1}, …, g_{Jm}, respectively, while all other columns of B_J are zero. By construction,

B_J A = G_J F_J = f_J I_m .  (8.1.11)

Now define

B = Σ_{J∈S(m,n)} x_J B_J .  (8.1.12)

Then BA = I_m, from (8.1.11) and (8.1.9). □
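The "if" half of this proof is entirely constructive. The following sketch (ours) runs it over R = Z for a 3 × 2 matrix whose 2 × 2 minors are 2, 0 and −1: since these generate Z, Theorem 8.1.8 promises a left inverse, and here the single Bezout choice x_{(1,2)} = −1 already works.

```python
from itertools import combinations

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# A 3 x 2 integer matrix; its 2 x 2 minors are 2, 0, -1, which generate Z.
A = [[2, 0],
     [0, 1],
     [1, 0]]
n, m = 3, 2

minors = {J: det2([A[j] for j in J]) for J in combinations(range(n), m)}
x = {(0, 1): 0, (0, 2): 0, (1, 2): -1}       # Bezout: sum x_J * f_J = (-1)*(-1) = 1

# B = sum_J x_J B_J, where B_J carries the columns of adj(F_J) in positions J
B = [[0] * n for _ in range(m)]
for J, xJ in x.items():
    F_J = [A[j] for j in J]
    adj = [[F_J[1][1], -F_J[0][1]],
           [-F_J[1][0], F_J[0][0]]]          # adjugate of the 2 x 2 submatrix
    for col, j in enumerate(J):
        for row in range(m):
            B[row][j] += xJ * adj[row][col]

BA = [[sum(B[i][k] * A[k][j] for k in range(n)) for j in range(m)] for i in range(m)]
print(BA)   # [[1, 0], [0, 1]]
```

The 2 × 2 adjugate is hard-coded for brevity; for larger m one would compute adjugates via cofactors in the same way.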
Corollary 8.1.9 Suppose A, B ∈ M(R) have the same number of columns, say m. Then A and B are right-coprime if and only if, for every I ∈ max(R), the matrix obtained by stacking A_I on top of B_I has rank m.

Now suppose that R is, in addition, a commutative Banach algebra over the complex field C, and let Ω denote its maximal ideal space. The homomorphism associated with each ω ∈ Ω maps R onto C; for x ∈ R, let x̂(ω) denote the image (in C) of x under the homomorphism associated with ω. Let Ω be equipped with the weakest topology in which the map ω ↦ x̂(ω) is continuous for every x ∈ R; with this topology Ω is compact. Let C(Ω) denote the set of continuous complex-valued functions on Ω, with the norm

‖f‖ = sup_{ω∈Ω} |f(ω)| ,  (8.1.15)

and multiplication defined pointwise. Then, with each element x ∈ R, one can associate an element x̂ ∈ C(Ω) by the rule above; x̂ is known as the Gelfand transform of x.

As an illustration, let R = ℓ¹(Z₊), the set of sequences {f_i}_{i=0}^∞ that are absolutely summable, with convolution as the multiplication. As shown in [26, 97], ℓ¹(Z₊) is precisely the set of unit-pulse responses of BIBO-stable linear time-invariant discrete-time systems. The maximal ideal space of ℓ¹(Z₊) can be identified with the closed unit disc D̄: each z ∈ D̄ defines a homomorphism

f̂(z) = Σ_{i=0}^∞ f_i z^i ,  (8.1.16)

which maps ℓ¹(Z₊) into C. The map z ↦ f̂(z), which is a function in C(D̄), is the Gelfand transform of f ∈ ℓ¹(Z₊). In this case, it is just the usual z-transform of the sequence {f_i}.
More generally, let R = ℓ¹(Z₊ⁿ) denote the set of all absolutely summable n-indexed sequences {f_{i_1 ⋯ i_n}}, where each index ranges over the nonnegative integers. The maximal ideal space of R
is D̄ⁿ, the cartesian product of the closed unit disc with itself n times, also referred to as the n-polydisc. Suppose z = (z_1, …, z_n) ∈ D̄ⁿ. Then the function

f̂(z) = Σ_{i_1=0}^∞ ⋯ Σ_{i_n=0}^∞ f_{i_1 ⋯ i_n} z_1^{i_1} ⋯ z_n^{i_n}  (8.1.17)

defines a homomorphism from ℓ¹(Z₊ⁿ) into C. The map z ↦ f̂(z), which belongs to C(D̄ⁿ), is the Gelfand transform of f, and is again the usual z-transform.
Now suppose R = ℓ¹(Z), the set of absolutely summable two-sided sequences {f_i}_{i=−∞}^∞. In this case the maximal ideal space of R is the unit circle T. For each z ∈ T, the corresponding homomorphism from ℓ¹(Z) to C is again given by (8.1.16), with the summation now running from −∞ to ∞. If R = ℓ¹(Zⁿ), then the maximal ideal space of R is Tⁿ, and (8.1.17) defines a homomorphism of R into C (with the obvious change of 0 to −∞ everywhere). One can replace Zⁿ by any locally compact Abelian group and the theory is still complete; see [80].

In some applications, one does not know the maximal ideal space Ω of R explicitly, but does know a dense subset Λ of Ω. Two such instances are discussed in Applications 8.1.15 and 8.1.16 below. Theorem 8.1.8 can be easily extended to this case.
Lemma 8.1.12 Suppose R is a Banach algebra over C, with maximal ideal space Ω. Suppose Λ is a dense subset of Ω, and suppose a_1, …, a_n ∈ R. Then there exist x_1, …, x_n ∈ R such that

Σ_{i=1}^n x_i a_i = 1  (8.1.18)

if and only if

inf_{λ∈Λ} Σ_{i=1}^n |â_i(λ)| > 0 .  (8.1.19)
Proof. "if" Suppose (8.1.19) holds; then it is claimed that

Σ_{i=1}^n |â_i(ω)| ≠ 0  ∀ ω ∈ Ω ,  (8.1.20)

or equivalently, that at least one â_i(ω) is nonzero at each ω ∈ Ω. The proof of (8.1.20) is by contradiction. Suppose that, for some ω_0 ∈ Ω, we have â_i(ω_0) = 0 ∀ i. Since Λ is dense in Ω, there is a sequence³ {λ_j} in Λ converging to ω_0, and

Σ_{i=1}^n |â_i(λ_j)| → Σ_{i=1}^n |â_i(ω_0)| = 0 .  (8.1.21)
³If Ω is not a first countable topological space, this argument has to be modified by replacing all sequences by nets.
But (8.1.21) contradicts (8.1.19). Hence, (8.1.20) is true. The existence of the x_i's satisfying (8.1.18) now follows from Lemma 8.1.6.

"only if" Suppose (8.1.18) holds. Then, by Lemma 8.1.6, (8.1.20) is also true. Now, since Ω is a compact topological space, (8.1.20) implies that

inf_{ω∈Ω} Σ_{i=1}^n |â_i(ω)| > 0 .  (8.1.22)

Finally, (8.1.22) implies (8.1.19), since Λ is a subset of Ω. □
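In the case R = ℓ¹(Z₊), whose maximal ideal space is the closed unit disc, condition (8.1.19) says that finitely many sequences generate R exactly when their z-transforms are jointly bounded away from zero on the disc. A crude numerical probe of this condition (ours; sampling can only suggest, never prove, that the infimum is positive) looks like this for two polynomial sequences:

```python
import cmath

def zhat(coeffs, z):
    """z-transform (Gelfand transform) of a finitely supported sequence in l1(Z+)."""
    return sum(c * z**i for i, c in enumerate(coeffs))

f = [1.0, -0.5]   # f_hat(z) = 1 - 0.5 z; its only zero, z = 2, lies outside the disc
g = [0.0, 1.0]    # g_hat(z) = z, which vanishes at z = 0

# Sample the closed unit disc on a polar grid
samples = [r * cmath.exp(2j * cmath.pi * k / 200)
           for r in (0.0, 0.25, 0.5, 0.75, 1.0) for k in range(200)]
lower = min(abs(zhat(f, z)) + abs(zhat(g, z)) for z in samples)
print(round(lower, 6))   # 1.0: |f_hat| + |g_hat| stays away from 0, as (8.1.19) requires
```

Here the bound can also be verified by hand: |f̂(z)| + |ĝ(z)| ≥ 1 − 0.5|z| + |z| = 1 + 0.5|z| ≥ 1 on the disc.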
Lemma 8.1.13 Let R, Ω, Λ be as in Lemma 8.1.12, and suppose A ∈ R^{n×m} with n ≥ m. Then A has a left inverse in R^{m×n} if and only if

inf_{λ∈Λ} σ_m(Â(λ)) > 0 ,  (8.1.23)

where σ_m(·) denotes the smallest singular value of a matrix.

Remarks 8.1.14 An equivalent way of stating (8.1.23) is as follows: There exists an ε > 0 such that Â(λ)* Â(λ) ≥ εI ∀ λ ∈ Λ.
Proof. "if" Suppose σ_m(Â(λ)) ≥ ε ∀ λ ∈ Λ. Then a simple exercise in numerical analysis shows that

Σ_{J∈S(m,n)} |â_J(λ)| ≥ ε^m / c^{1/2} > 0  ∀ λ ∈ Λ ,  (8.1.24)

where â_J, J ∈ S(m, n) are the various m × m minors of Â, and c = n!/[m!(n − m)!]. From Lemma 8.1.12, (8.1.24) implies that the minors a_J, J ∈ S(m, n) generate R. The existence of a left inverse for A now follows as in the proof of Theorem 8.1.8.

"only if" Suppose A has a left inverse. Then, by Theorem 8.1.8, the minors a_J, J ∈ S(m, n) of A generate R. Hence, by Lemma 8.1.12,

inf_{λ∈Λ} Σ_{J∈S(m,n)} |â_J(λ)| > 0 .  (8.1.25)

It is now routine to prove (8.1.23). □
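Condition (8.1.23) is easy to probe numerically once σ_m can be evaluated. The sketch below (ours) treats a 3 × 2 polynomial matrix, computing the smallest singular value at sample points on the unit circle from the 2 × 2 Gram matrix Â(λ)*Â(λ) in closed form; as with any sampling test, this suggests, but does not prove, that the infimum is positive.

```python
import cmath, math

def sigma_min(rows):
    """Smallest singular value of an n x 2 complex matrix via its 2 x 2 Gram matrix."""
    a = sum(abs(r[0]) ** 2 for r in rows)
    c = sum(abs(r[1]) ** 2 for r in rows)
    b = sum(r[0].conjugate() * r[1] for r in rows)
    lam_min = (a + c) / 2 - math.sqrt(((a - c) / 2) ** 2 + abs(b) ** 2)
    return math.sqrt(max(lam_min, 0.0))

def A_of(z):
    # A sample 3 x 2 matrix with entries polynomial in z (an assumed example)
    return [[1.0, z], [z, 1.0], [0.5, 0.0]]

lower = min(sigma_min(A_of(cmath.exp(2j * cmath.pi * k / 500)))
            for k in range(500))
print(lower > 0.1)   # True: sigma_min stays bounded away from zero on the circle
```

The closed-form eigenvalue of the 2 × 2 Hermitian Gram matrix avoids any numerical-library dependence; for larger m one would use a general SVD routine instead.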
Application 8.1.15 Let R = H∞, the set of all functions that are analytic and bounded in the open right half-plane C₊. The maximal ideal space of H∞ is not known explicitly; however, by the corona theorem, the maximal ideals corresponding to the points of C₊ form a dense subset of it. Hence, by Lemma 8.1.12, two functions f, g ∈ H∞ are coprime if and only if

inf_{s∈C₊} {|f(s)| + |g(s)|} > 0 .  (8.1.26)

Application 8.1.16 Let A denote the set of distributions of the form

f(t) = Σ_{i=0}^∞ f_i δ(t − t_i) + f_a(t) ,  (8.1.27)

where 0 ≤ t_0 < t_1 < t_2 < ⋯ are real constants, f_a(·) is absolutely continuous, and in addition

‖f‖_A := Σ_{i=0}^∞ |f_i| + ∫_0^∞ |f_a(t)| dt < ∞ .  (8.1.28)

One can interpret A as the set of regular measures (without singular part) of bounded variation on the half-line [0, ∞); see [48, 80]. It is shown in [26] that the set A is a good model for the set of impulse responses of BIBO-stable linear time-invariant continuous-time systems. Now the maximal ideal space of A is very messy to describe. However, it is shown in [48, pp. 145 et seq.] that the closed right half of the complex plane is a dense subset of this maximal ideal space, in the following sense: For each s ∈ C₊, define a homomorphism φ_s : A → C by

φ_s(f) = f̂(s) = Σ_{i=0}^∞ f_i exp(−s t_i) + ∫_0^∞ f_a(t) exp(−s t) dt .  (8.1.29)

It is recognized that f̂ is just the Laplace transform of f(·). For each s ∈ C₊, the set of f(·) ∈ A such that f̂(s) = 0 is a maximal ideal of A. The result in [48] is that the collection of maximal ideals corresponding to all s ∈ C₊ is dense in the maximal ideal space of A. Hence [8], by Lemma 8.1.12, two functions f, g ∈ A are coprime if and only if

inf_{s∈C₊} {|f̂(s)| + |ĝ(s)|} > 0 .  (8.1.30)
Now we change direction somewhat. Lemma 8.1.4 shows that if R is not a Bezout domain, then not every matrix in M(F) has an r.c.f., and not every matrix has an l.c.f. At this stage it is natural to ask: if a matrix has one, does it have the other? The rest of this section is devoted to answering this question for various rings of interest.
Suppose A ∈ R^{m×n} with m < n; then we say that A can be complemented if there exists a unimodular matrix U ∈ R^{n×n} containing A as a submatrix. Suppose A can be complemented, and let U ∈ R^{n×n} be a unimodular matrix whose first m rows equal A. By expanding |U| along the first m rows using Laplace's expansion (Fact B.1.4), we see that the m × m minors of A must together generate R, or equivalently (by Theorem 8.1.8), that A has a right inverse in R^{n×m}. Thus, only right-invertible matrices can be complemented. But is right-invertibility sufficient?

We postpone answering this question in order to demonstrate the relevance of complementability to the issue at hand.
Lemma 8.1.17 Suppose F ∈ M(F) has an l.c.f. (D̃, Ñ). Then F has an r.c.f. if and only if the matrix A = [D̃ Ñ] can be complemented.

Proof. "if" Suppose A can be complemented. Since B = [−Ñ D̃] is obtained from A by elementary column operations, B can also be complemented.⁴ Suppose

U = [Y X; −Ñ D̃]  (8.1.31)

is a unimodular matrix containing B as a submatrix, and partition U^{-1} ∈ M(R) as

U^{-1} = [D −X̃; N Ỹ] .  (8.1.32)
The claim is that (N, D) is an r.c.f. of F. From U U^{-1} = I, we get

YD + XN = I ,  (8.1.33)
−Ñ D + D̃ N = 0 .  (8.1.34)

Thus, D and N are right-coprime from (8.1.33), and the proof is complete if it can be established that |D| ≠ 0, because (8.1.34) would then imply that N D^{-1} = D̃^{-1} Ñ = F.

To show that D is nonsingular, begin by observing that D̃ is nonsingular, since (D̃, Ñ) is an l.c.f. of F. Therefore the matrix

[I 0; −Ñ D̃] U^{-1} = [D −X̃; 0 I]  (8.1.35)

is also nonsingular. It now follows from (8.1.35) that |D| ≠ 0.
"only if" This part exactly parallels the proof of Theorem 4.1.16. Suppose (N, D) is an r.c.f. of F, and select X, Y, X̃, Ỹ ∈ M(R) such that XN + YD = I, Ñ X̃ + D̃ Ỹ = I. Define

U = [Y X; −Ñ D̃] ,  W = [D −X̃; N Ỹ] .  (8.1.36)

⁴The change from [D̃ Ñ] to [−Ñ D̃] is made simply to make U resemble the matrix in Theorem 4.1.16.
Then,

UW = [I −YX̃ + XỸ; 0 I] .  (8.1.37)

Since |UW| = 1, it follows that |U| is a unit and that U is unimodular. This shows that B = [−Ñ D̃] can be complemented. The same is true of A = [D̃ Ñ], since A can be transformed to B by elementary column operations. □
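In the scalar case the bookkeeping of this proof collapses, since everything commutes and one may take X̃ = X, Ỹ = Y, Ñ = N, D̃ = D. The sketch below (ours) checks the identities (8.1.33)–(8.1.37) numerically for f = 2/3 over R = Z:

```python
n, d = 2, 3          # f = n/d, an r.c.f. (and l.c.f.) of f over Z
x, y = -1, 1         # Bezout: x*n + y*d = (-1)*2 + 1*3 = 1

U = [[y, x], [-n, d]]    # as in (8.1.36), with the tilde factors equal to (n, d)
W = [[d, -x], [n, y]]

UW = [[sum(U[i][k] * W[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
print(UW)   # [[1, 0], [0, 1]]: |UW| = 1, so U is unimodular and [-n d] is complemented
```

That UW is exactly the identity (rather than merely upper triangular as in (8.1.37)) reflects the commutativity: the off-diagonal block −yx + xy vanishes.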
Corollary 8.1.18 Suppose F ∈ M(F) has an r.c.f. (N, D). Then F has an l.c.f. if and only if the matrix [D; N] can be complemented.

The proof is easy and is left as an exercise.
Lemma 8.1.19 Suppose A ∈ R^{m×n} can be complemented. Then by definition there exists a matrix B_0 ∈ R^{(n−m)×n} such that [A; B_0] is unimodular. Then the set of B ∈ R^{(n−m)×n} such that [A; B] is unimodular is given by

{B : B = RA + V B_0, where R ∈ M(R), V ∈ U(R)} .  (8.1.38)
Proof. First, suppose B = RA + V B_0 where R ∈ M(R) and V ∈ U(R). Then,

[A; B] = [A; RA + V B_0] = [I 0; R V] [A; B_0] ,  (8.1.39)

which is unimodular, since both matrices on the right side are in U(R). Conversely, suppose [A; B] ∈ U(R). Partition [A; B_0]^{-1} as [F G], and note that AF = I, AG = 0. Now, by assumption,

[A; B] [A; B_0]^{-1} = [A; B] [F G] = [I 0; BF BG]  (8.1.40)

is in U(R), which shows that BG ∈ U(R). Let V = BG, R = BF. Then, from (8.1.40),

[A; B] = [I 0; R V] [A; B_0] = [A; RA + V B_0] ,  (8.1.41)

which implies that B = RA + V B_0. □
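The parametrization (8.1.38) can be checked numerically for row vectors over Z. In the sketch below (ours), A = [8 5] has the complement B_0 = [3 2]; every B = RA + V B_0 with R ∈ Z and V a unit of Z (V = ±1) again yields a unimodular [A; B]:

```python
A  = [8, 5]          # a right-invertible row over Z
B0 = [3, 2]          # one complement: det [[8, 5], [3, 2]] = 1

def det(r1, r2):
    return r1[0] * r2[1] - r1[1] * r2[0]

for R in range(-3, 4):          # R ranges over (a slice of) M(Z) = Z
    for V in (1, -1):           # V ranges over U(Z) = {1, -1}
        B = [R * A[0] + V * B0[0], R * A[1] + V * B0[1]]
        assert det(A, B) in (1, -1)   # [A; B] is unimodular
print("all complements B = R*A + V*B0 verified")
```

The determinant works out to exactly V, which is why every choice in the family succeeds.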
Now we return to the question: When can every right-invertible matrix over R be complemented? The answer is provided by the notion of a Hermite ring. A row [a_1 ⋯ a_n] ∈ R^{1×n} is a unimodular row if a_1, …, a_n together generate R. A ring R is Hermite if every unimodular row can be complemented. Lemma B.2.1 shows that every Bezout domain is Hermite. But it is shown below that several other commonly encountered rings are Hermite, though they are not Bezout. Before this, a minor technicality is disposed of.
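Over a Bezout domain the complementation guaranteed by Lemma B.2.1 is explicit; for a length-two row it is just the Bezout identity. The sketch below (ours) carries this out over Z:

```python
def xgcd(a, b):
    """Extended Euclid: return (g, x, y) with x*a + y*b = g."""
    if b == 0:
        return (a, 1, 0)
    g, x, y = xgcd(b, a % b)
    return (g, y, x - (a // b) * y)

def complement_row(a1, a2):
    """Complete a unimodular row [a1 a2] over Z to a 2 x 2 matrix of determinant 1."""
    g, x, y = xgcd(a1, a2)
    assert g == 1, "row is not unimodular over Z"
    return [[a1, a2], [-y, x]]    # det = a1*x + a2*y = 1

U = complement_row(8, 5)
print(U, U[0][0] * U[1][1] - U[0][1] * U[1][0])   # [[8, 5], [3, 2]] 1
```

For rows of length n > 2 one reduces to this case by repeated gcd steps, which is in essence how Lemma B.2.1 proceeds for Bezout domains.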
Lemma 8.1.20 The following statements are equivalent:

(i) R is a Hermite ring.

(ii) If A ∈ R^{k×n} has a right inverse, then A can be complemented.

Remarks 8.1.21 (a) The objective of the lemma is to deal with "fat" matrices, not just row vectors; (b) compare (ii) with Corollary 4.1.7.
Proof. (ii) ⟹ (i) Obvious: just let k = 1.

(i) ⟹ (ii) Fix n. The proof is by induction on k. The definition of a Hermite ring shows that (ii) is true when k = 1. Suppose (ii) is true for matrices in R^{m×n} whenever m ≤ k − 1 < n − 1, and suppose A ∈ R^{k×n} has a right inverse, say C. Partition A and C as shown below:

A = [A_1; a_2] ,  C = [C_1 c_2] ,  (8.1.42)

where a_2 is the last row of A and c_2 is the last column of C. Then A_1 C_1 = I_{k−1}. Hence, by the inductive hypothesis (and taking transposes), there exists a unimodular matrix U ∈ R^{n×n} of the form U = [C_1 D]. Now

AU = [A_1; a_2] [C_1 D] = [I_{k−1} Z; 0 y] =: M .  (8.1.43)

Since A has a right inverse, so does M, and hence so does y ∈ R^{1×(n−k+1)}. Let b ∈ R^{(n−k+1)×1} be a right inverse of y. Since R is Hermite, there exists a unimodular matrix B ∈ R^{(n−k+1)×(n−k+1)} containing b as its first column. Thus,

AU [I_{k−1} 0; 0 B] = [I_{k−1} ZB; 0 yB] = [I_{k−1} ∗ ∗; 0 1 ∗] =: N ,  (8.1.44)

since the first element of yB ∈ R^{1×(n−k+1)} is a one. (In (8.1.44), ∗ is a generic symbol denoting an element or a matrix whose value is unimportant.) Now the matrix N in (8.1.44) can be complemented, since the determinant of

[I_{k−1} ∗ ∗; 0 1 ∗; 0 0 I_{n−k}]  (8.1.45)

is one. Since N is a right associate of A, it follows that A can also be complemented. □
Corollary 8.1.22 Suppose R is a Hermite ring and that A ∈ R^{m×n} with m < n. Then A has a right inverse if and only if there exist unimodular matrices V ∈ R^{m×m}, U ∈ R^{n×n} such that

VAU = [I_m 0] .  (8.1.46)

Corollary 8.1.22 shows that in a Hermite ring, even though there is no Smith form in general for arbitrary matrices, there is one for right-invertible matrices (and of course for left-invertible matrices).
Theorem 8.1.23 The following three statements are equivalent:

(i) R is a Hermite ring.

(ii) If F ∈ M(F) has an l.c.f., then it has an r.c.f.

(iii) If F ∈ M(F) has an r.c.f., then it has an l.c.f.

Proof. The equivalence of (ii) and (iii) follows readily by taking transposes.

(i) ⟹ (ii) Suppose R is Hermite, and suppose F ∈ M(F) has an l.c.f., say (D̃, Ñ). Then A = [D̃ Ñ] has a right inverse. From Lemma 8.1.20, A can be complemented, and from Lemma 8.1.17, it follows that F has an r.c.f.

(ii) ⟹ (i) Suppose [a_1 ⋯ a_n] ∈ R^{1×n} is an arbitrary unimodular row; it must be shown that this row can be complemented. Since a_1, …, a_n generate R, at least one of the a_i must be nonzero. Suppose a_1 ≠ 0 by renumbering if necessary, and define F ∈ F^{1×(n−1)} by

F = [a_2/a_1 ⋯ a_n/a_1] .  (8.1.47)

Now the unimodularity of the row [a_1 ⋯ a_n] shows that F has an l.c.f., namely (a_1, [a_2 ⋯ a_n]). Since (ii) is assumed to hold, F also has an r.c.f. By Lemma 8.1.17, this implies that the row [a_1 ⋯ a_n] can be complemented. Hence, R is Hermite. □
Theorem 8.1.23 shows that Hermite rings are the next best thing to Bezout domains, at least as far as controller synthesis theory goes. If R is Bezout, then every matrix in M(F) has both an r.c.f. and an l.c.f., and all of the synthesis theory of Chapter 5 goes through very nicely. If R is not Bezout, then not every matrix in M(F) has an r.c.f., and not every matrix has an l.c.f. However, if R is at least Hermite, then the matrices in M(F) that have an r.c.f. are precisely those that have an l.c.f. Moreover, this property characterizes Hermite rings.

It is therefore worthwhile to explore whether various commonly encountered rings are Hermite. If R is a complex Banach algebra, then there is a very general result available. The proof of this result is well beyond the scope of this book and can be found in [62, Theorem 3].
Theorem 8.1.24 Suppose R is a complex Banach algebra with maximal ideal space Ω. Then a unimodular row [a_1 ⋯ a_n] in R can be complemented if and only if the row [â_1 ⋯ â_n] can be complemented in C(Ω). In particular, if Ω is contractible, then R is Hermite.
Recall that a topological space Ω is contractible if the identity map on Ω is homotopic to a constant map. In other words, Ω is contractible if there exist an ω_0 ∈ Ω and a continuous function f : [0, 1] × Ω → Ω such that

f(0, ω) = ω ,  f(1, ω) = ω_0 ,  ∀ ω ∈ Ω .  (8.1.48)

Theorem 8.1.24 is similar in spirit to one of the basic results in the Gelfand theory of Banach algebras, which states that, given any complex Banach algebra R and any f ∈ R, f can be inverted in R if and only if f̂ can be inverted in C(Ω).

Note that a convex set in Cⁿ is contractible: If Ω ⊆ Cⁿ is convex, let ω_0 ∈ Ω be arbitrary and define

f(μ, ω) = μ ω_0 + (1 − μ) ω .  (8.1.49)

This observation leads to a very useful criterion.

Corollary 8.1.25 Suppose R is a complex Banach algebra whose maximal ideal space is a (compact) convex subset of Cⁿ for some integer n. Then R is a Hermite ring.
Examples 8.1.26 [6, 54] The disc algebra, ℓ¹(Z₊), and ℓ¹(Z₊ⁿ) are all Hermite.

Theorem 8.1.24 and Corollary 8.1.25 are not valid in general if R is a real Banach algebra. However, in [6, 54] it is shown that the following real algebras are indeed Hermite rings:

(i) The set A_s of functions in the disc algebra A whose Fourier coefficients are all real (i.e., the set of all symmetric functions in A).

(ii) The set ℓ¹_r(Z₊ⁿ) consisting of all real sequences in ℓ¹(Z₊ⁿ).

Theorem 8.1.24 suggests a simple example of a ring that is not Hermite.
Example 8.1.27 Let S² denote the unit sphere in R³, and let R = C(S²), the set of all continuous complex-valued functions on S².⁵ Then Ω = S², which is not contractible. It is claimed that R is not a Hermite ring. To show this, it is enough to demonstrate a unimodular row in R^{1×3} that cannot be complemented. Define a = [a_1 a_2 a_3] by setting a(x) equal to the unit outward normal vector at x, for each x ∈ S². Then a(·) is continuous and therefore belongs to R^{1×3}. Moreover, since a(x) ≠ 0 ∀ x ∈ S², it follows from Lemma 8.1.6 that a is a unimodular row. Now suppose a could be complemented, and suppose [a; b; c] ∈ R^{3×3} is unimodular. Then for each x ∈ S², the three vectors a(x), b(x), c(x) must span the vector space C³. Let T(x) denote the (two-dimensional) tangent space to S² at x. Since span {a(x), b(x), c(x)} = C³ contains T(x), and since a(x) is orthogonal to T(x), we see that the projections of b(x), c(x) onto T(x) must both be nonzero and must together span T(x). Let f(x), g(x) denote the projections of b(x), c(x) onto T(x), respectively. Then f(·) is a vector field on S² that does not vanish anywhere. However, this contradicts the Borsuk-Ulam

⁵This example was suggested to the author by Ken Davidson.
theorem [46, p. 116], which states that every vector field on S² must have a zero. Hence, a(·) cannot be complemented, and R is not a Hermite ring.

If we define F = [a_2/a_1 a_3/a_1], then F has an l.c.f. but not an r.c.f.
We conclude this section with a theorem that is very important in several branches of control theory, and indeed in mathematics.

Theorem 8.1.28 (Quillen-Suslin) Let k be a field, and let R = k[x_1, …, x_l], the ring of polynomials in l indeterminates with coefficients in k. Then R is a Hermite ring.

Remarks 8.1.29 (i) Until it was proven, the above result was known as "Serre's conjecture." See [61] for an excellent historical introduction to this result, its origins and its significance.

(ii) Quillen [76] shows that the result remains valid if k is replaced by any p.i.d. However, the proof given below does not apply to this case.

(iii) The ring k[[x_1, …, x_l]] of formal power series in l indeterminates with coefficients in k is also a Hermite ring; see [63].

(iv) The proof given below is taken from [79, pp. 138–145]. See [112] for another proof that applies to the special case when k = C.
The elementary proof of Theorem 8.1.28 is rather long, and makes use of a few concepts not previously introduced. A ring B is local if it contains a unique maximal ideal [4, p. 38]. Suppose A is a domain, and let P be a prime ideal in A. The set consisting of all fractions of the form a/b, where a ∈ A, b ∈ A \ P, is a local ring, called the localization of A at P, and is denoted by A_P. The unique maximal ideal of A_P is the set of fractions a/b where a ∈ P, b ∈ A \ P. The pertinent property of local rings is the following: If B is a local ring and J is its unique maximal ideal, then every b ∉ J is a unit. This follows from Lemma 8.1.6.

If B is a ring and b_1, …, b_n ∈ B, the symbol (b_1, …, b_n) denotes the ideal generated by the b_i's. If f ∈ B[y], then δ(f) denotes the degree of the polynomial f. More generally, if f ∈ B[y_1, …, y_m], then δ(f) denotes the total degree of f. That is, if f is of the form

f(y_1, …, y_l) = Σ_i f_i y_1^{i_1} ⋯ y_l^{i_l} ,  (8.1.50)

where i is a shorthand for the l-tuple (i_1, …, i_l), then δ(f) is the largest of the sums i_1 + ⋯ + i_l for which the coefficient f_i is nonzero.
The proof of Theorem 8.1.28 is basically inductive. Clearly k[x_1] is Hermite, since it is a p.i.d. (see Lemma B.2.1). The proof consists of showing that if k[x_1, …, x_{l−1}] is Hermite, then so is k[x_1, …, x_l]. This proof consists of two independent steps. The first step (which is actually proved last here, since it is more complicated than the other) is to show that if B is a Hermite ring and [a_1, …, a_n] is a unimodular row in B[x]^{1×n}, then it can be complemented, provided one of the a_i's is a monic polynomial. The second step (which is proved first) is to show that in the ring k[x_1, …, x_l], it is always possible to effect a change of variables in such a way that every unimodular row contains at least one monic polynomial in the new set of variables.
Lemma 8.1.30 (Nagata) Suppose a ∈ R, and let m = δ(a) + 1. Define a new set of indeterminates

y_l = x_l ,  y_i = x_i − x_l^{m^{l−i}} ,  1 ≤ i ≤ l − 1 .  (8.1.51)

Then a(x_1, …, x_l) = r b(y_1, …, y_l), where r ≠ 0 belongs to k, and b is a monic polynomial in y_l (with coefficients in k[y_1, …, y_{l−1}]).
Proof. Denote an l-tuple (j_1, …, j_l), where each of the j_i is nonnegative, by the symbol j. Then the polynomial a can be written as

a = Σ_j r_j x_1^{j_1} ⋯ x_l^{j_l} ,  (8.1.52)

where r_j ∈ k. Note that (8.1.51) has the inverse transformation

x_l = y_l ,  x_i = y_i + y_l^{m^{l−i}} ,  1 ≤ i ≤ l − 1 .  (8.1.53)

Substituting from (8.1.53) into (8.1.52) gives

b = Σ_j r_j (y_1 + y_l^{m^{l−1}})^{j_1} ⋯ (y_{l−1} + y_l^m)^{j_{l−1}} y_l^{j_l} .  (8.1.54)
Now let m̄ denote the l-tuple (m^{l−1}, …, m, 1). The first point to note is that if i = (i_1, …, i_l) and j = (j_1, …, j_l) are two distinct l-tuples with 0 ≤ i_k, j_k ≤ m − 1 ∀ k, then i · m̄ and j · m̄ are distinct integers, where · denotes the usual vector dot product. This follows upon noting that i · m̄ < m^l, and that the l-tuple i is just the representation of the integer i · m̄ with respect to the base m; similarly for the integer j · m̄. Returning to (8.1.54), let J = {j : r_j ≠ 0}. Then there is a unique l-tuple j_0 ∈ J such that j_0 · m̄ > j · m̄ ∀ j ∈ J \ {j_0}. Hence, when expanded, (8.1.54) is of the form

b = r_{j_0} y_l^{j_0 · m̄} + terms involving lower powers of y_l .  (8.1.55)

Hence, b equals r_{j_0} times a monic polynomial in y_l of degree j_0 · m̄. □
Note that if kis aninnite eld, thenit is possible toachieve the same result as inLemma 8.1.30
by using only a linear transformation of variables [112]. However, the more complicated transfor-
mation is needed in the case of an arbitrary (and possibly nite) eld.
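Lemma 8.1.30 is easy to check numerically. The sketch below (illustrative assumptions of mine, not from the text: l = 2, so the substitution (8.1.53) reads x_1 = y_1 + y_2^m, x_2 = y_2, and polynomials are stored as dicts from exponent pairs to integer coefficients) applies the inverse transformation to a sample polynomial and confirms that the highest power of y_2 in the result carries a constant coefficient, i.e., that b is monic in y_2 up to the scalar r:

```python
def padd(p, q):
    """Add two polynomials stored as {(j1, j2): coeff}."""
    r = dict(p)
    for e, c in q.items():
        r[e] = r.get(e, 0) + c
    return {e: c for e, c in r.items() if c != 0}

def pmul(p, q):
    """Multiply two polynomials in the same dict representation."""
    r = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = (e1[0] + e2[0], e1[1] + e2[1])
            r[e] = r.get(e, 0) + c1 * c2
    return {e: c for e, c in r.items() if c != 0}

def ppow(p, k):
    r = {(0, 0): 1}
    for _ in range(k):
        r = pmul(r, p)
    return r

def nagata(a):
    """Apply x1 = y1 + y2**m, x2 = y2, with m = (total degree of a) + 1."""
    m = max(sum(e) for e in a) + 1
    x1 = {(1, 0): 1, (0, m): 1}    # y1 + y2^m
    x2 = {(0, 1): 1}               # y2
    b = {}
    for (j1, j2), c in a.items():
        term = pmul(ppow(x1, j1), ppow(x2, j2))
        b = padd(b, {e: c * k for e, k in term.items()})
    return b, m

a = {(2, 1): 1, (1, 0): 1, (0, 0): 5}   # x1^2 x2 + x1 + 5, total degree 3, m = 4
b, m = nagata(a)
top = max(e[1] for e in b)              # highest power of y2: j0 . (m, 1) = 2*4 + 1
lead = [(e, c) for e, c in b.items() if e[1] == top]
print(top, lead)                        # 9 [((0, 9), 1)]
```

The leading y_2-power 9 is exactly j_0·(m, 1) = (2, 1)·(4, 1), and its coefficient is the constant 1, as the lemma predicts.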
8.1. COPRIME FACTORIZATIONS OVER A COMMUTATIVE RING

To prove that R = k[x_1, …, x_l] is a Hermite ring, suppose [a_1 ⋯ a_n] is an arbitrary unimodular row in R^{1×n}; it must be shown that this row can be complemented. In order to do this, we can first transform the indeterminates x_1, …, x_l into the set y_1, …, y_l using Lemma 8.1.30, such that one of the b_i's is monic in y_l, where b_i(y_1, …, y_l) = a_i(x_1, …, x_l) ∀i. Since the rings k[x_1, …, x_l] and k[y_1, …, y_l] are clearly isomorphic, it is enough to show that the row [b_1 ⋯ b_n] can be complemented. Thus, the proof of Theorem 8.1.28 is complete once the following lemma is established. Note that this lemma is of independent interest.
Lemma 8.1.31 Suppose B is a Hermite ring, and let R = B[x]. Suppose [a_1 ⋯ a_n] ∈ R^{1×n} is a unimodular row, and that at least one of the a_i's is monic. Then this row can be complemented.
The proof of Lemma 8.1.31 requires two preliminary results.
Lemma 8.1.32 (Suslin) Suppose B is a ring, R = B[x], and suppose f, g ∈ R are of the form

f(x) = x^r + a_1 x^{r-1} + ⋯ + a_r ,
g(x) = b_1 x^{r-1} + ⋯ + b_r .   (8.1.56)

Then, for each j ∈ {1, …, r} the ideal (f, g) contains a polynomial h_j such that δ(h_j) ≤ r - 1 and the leading coefficient of h_j is b_j.
Proof. Define I ⊆ B to be the set of leading coefficients of all h in the ideal (f, g) such that δ(h) ≤ r - 1. Then I is an ideal in B. To see this, suppose p_1, q_1 ∈ I, and let p, q ∈ (f, g) be the corresponding polynomials with leading coefficients p_1, q_1 and degrees less than or equal to r - 1. Assume without loss of generality that δ(p) ≤ δ(q). Then the polynomial

h(x) = q(x) - x^{δ(q)-δ(p)} p(x)   (8.1.57)

has leading coefficient q_1 - p_1 and degree no larger than r - 1. Hence, I is closed under subtraction. Finally, it is clear that if p_1 ∈ I and s ∈ B, then p_1 s ∈ I; just take h(x) = s p(x).

To prove the lemma, it is enough to establish the following statement: If h ∈ (f, g) and δ(h) ≤ r - 1, then all coefficients of h belong to I. This is established by induction, by proving that: for each j ∈ {1, …, r}, if h ∈ (f, g) and δ(h) ≤ r - 1, then the coefficients of x^{r-1}, …, x^{r-j} belong to I. Now the statement is true for j = 1, from the definition of I. Suppose the statement is true up to j - 1, and suppose h is an arbitrary element of the ideal (f, g) with δ(h) ≤ r - 1. Write

h(x) = h_1 x^{r-1} + ⋯ + h_r ,   (8.1.58)

where some of the h_i may be zero. By the inductive hypothesis, h_1, …, h_{j-1} belong to I. Now, xh(x) - h_1 f(x) ∈ (f, g) has degree no larger than r - 1. Moreover,

xh(x) - h_1 f(x) = Σ_{i=1}^{r} (h_{i+1} - h_1 a_i) x^{r-i} ,   (8.1.59)

where h_{r+1} := 0. The (j - 1)-st coefficient of this polynomial is h_j - h_1 a_{j-1}, which belongs to I by the inductive hypothesis. Since h_1 a_{j-1} ∈ I, it follows that h_j ∈ I. This completes the proof of the inductive step. □
The last preliminary result is the same as Lemma 8.1.31, except that the ring B is restricted
to be a local ring.
Lemma 8.1.33 Let B be a local ring, let R = B[x], and suppose [a_1 ⋯ a_n] ∈ R^{1×n} is a unimodular row with at least one of the a_i being monic. Then this row can be complemented.
Proof (Suslin). Assume without loss of generality that n ≥ 3,^6 and that a_1 is monic. The proof is by induction on the degree of a_1. If δ(a_1) = 0, then a_1 = 1 and there is nothing further to prove. If δ(a_1) =: r > 0, we will show that [a_1 ⋯ a_n] can be reduced, through elementary column operations, to a row [b_1 ⋯ b_n] where b_1 is monic and δ(b_1) < r.

Accordingly, suppose r = δ(a_1). Since a_1 is monic, by performing column operations if necessary we can assume that δ(a_i) < r for i = 2, …, n. Let J be the unique maximal ideal of B, and for b ∈ B let b̄ ∈ B/J denote the coset b + J. For f ∈ R = B[x], define f̄ ∈ [B/J][x] in the obvious way, namely, if f = Σ_i f_i x^i, then f̄ = Σ_i f̄_i x^i. Now the row [ā_1 ⋯ ā_n] is unimodular in [B/J][x]. Since δ(ā_1) = δ(a_1) > 0, not all of ā_2, …, ā_n can be zero. By renumbering if necessary, suppose ā_2 ≠ 0. Then some coefficient of a_2 does not belong to J. Suppose

a_2(x) = Σ_{i=1}^{r} b_{2i} x^{r-i} ,   (8.1.60)

and suppose b_{2j} ∉ J. Since B is a local ring, this implies that b_{2j} is a unit. Let us now apply Lemma 8.1.32 with f = a_1, g = a_2, and define I as in Lemma 8.1.32. Then Lemma 8.1.32 implies that b_{2j} ∈ I, i.e., that the ideal (a_1, a_2) contains a monic polynomial of degree less than r. So by elementary column operations, a_3 can be replaced by a monic polynomial of degree less than r. Now apply the inductive hypothesis. □
Proof of Lemma 8.1.31. The lemma is proved by establishing the following fact: There exists a unimodular matrix U ∈ U(R) such that aU = b, where b = [b_1 ⋯ b_n] ∈ B^{1×n} is a unimodular row. In fact, such a transformation is possible (as the following proof will reveal) for any ring B. Since B is Hermite, b can be complemented, whence a can also be complemented.

Suppose f, g ∈ R = B[x]; then f ∘ g ∈ B[x] can be defined in the obvious way, namely, if f = Σ_i f_i x^i, then (f ∘ g)(x) = Σ_i f_i (g(x))^i. Given the row a, define I to be the subset of B consisting of all b ∈ B which have the property that, for each f, g ∈ R, there exists a unimodular matrix U = U(f, g) such that a(f + bg) = a(f)U, where a(f + bg), a(f) are row vectors in R^{1×n} defined by

(a(f + bg))_i = a_i ∘ (f + bg) ,  (a(f))_i = a_i ∘ f .   (8.1.61)

^6 Note that a unimodular row consisting of exactly two elements can always be complemented, in any ring.
Then I is an ideal in B: Clearly 0 ∈ I (take U = I ∀f, g). If b, c ∈ I, then for each f, g ∈ R there exist unimodular matrices U(f, g), V(f, g) such that a(f + bg) = a(f)U, a(f + cg) = a(f)V. Hence, we have a(f + (b + c)g) = a(f + bg + cg) = a(f + bg)V(f + bg, g) = a(f)U(f, g)V(f + bg, g). Since the matrix U(f, g)V(f + bg, g) is unimodular for all f, g ∈ R, it follows that b + c ∈ I. Finally, if b ∈ I and c ∈ B, then for any f, g ∈ R we have a(f + bcg) = a(f)U(f, cg), which shows that bc ∈ I.

If it can be shown that I = B, then the proof is complete. If 1 ∈ I, then for each f, g ∈ R there exists a U(f, g) ∈ U(R) such that a(f + g) = a(f)U. In particular, let f = x, g = -x; then a(0) = a(x)U for some U ∈ U(R). If we now define b = a(0), then the desired statement follows upon observing that a(0) is a unimodular row in B.

To show that I = B, suppose by way of contradiction that I ≠ B. Then I is contained in some maximal ideal of B, say J. Let L denote the localization B_J. Then B ⊆ L, and a is a (monic) unimodular row in L[x]. By Lemma 8.1.33, a can be complemented in L[x]. In other words, there exists a unimodular matrix U over L[x] such that

a(x) = [1 0 ⋯ 0] U(x) .   (8.1.62)

Now adjoin a new indeterminate y to L[x], and define

V(x, y) = [U(x + y)]^{-1} U(x) .   (8.1.63)

Then V is also a unimodular matrix over L[x, y], and V(x, 0) = I_n. From (8.1.62), it follows that

a(x + y) = [1 0 ⋯ 0] U(x + y) ,   (8.1.64)

so that, from (8.1.64) and (8.1.63),

a(x + y) V(x, y) = [1 0 ⋯ 0] U(x) = a(x) .   (8.1.65)

Next, note that V(x, y) is a matrix over L[x, y], so that each element of V(x, y) is a polynomial in x and y with coefficients in L. Moreover, since V(x, 0) = I_n, it follows that each element of V is of the form

V_{ij}(x, y) = c_{ij} + Σ_{k=1}^{r} d_{ijk}(x) y^k ,   (8.1.66)

where c_{ij} = 0 or 1. Loosely speaking, each term in the polynomial expansion of V_{ij} is either a constant 0 or 1, or else involves a positive power of y. Recall that L consists of fractions p/q where p ∈ B, q ∈ B \ J. Let b denote the product of the denominators of all the coefficients occurring in the matrix V(x, y); thus b is the product of the denominators of the coefficients of d_{ijk}(x), for all i, j, k. Then V(x, by) is actually a unimodular matrix over B[x, y], since replacing y by by wipes out all the denominators. Now (8.1.65) implies that

a(x + by) V(x, by) = a(x) .   (8.1.67)

So for any f, g ∈ B[x], (8.1.67) implies that

a(f + bg) V(f, bg) = a(f) .   (8.1.68)

Therefore b ∈ I, by definition. On the other hand, b is the product of elements in B \ J, and hence itself belongs to B \ J. In particular, b ∉ J ⊇ I, so that b ∉ I. This contradiction shows that I = B. □
8.2 COPRIME FACTORIZATIONS FOR DISTRIBUTED SYSTEMS
Suppose R is a commutative domain with identity, and let F denote the associated field of fractions. The previous section was concerned with the question of when a matrix in M(F) has a right- or left-coprime factorization, and when the existence of one implies the existence of the other. It was shown that the notion of a Hermite ring plays an important role in answering the above questions. In particular, if R is a Hermite ring, then a matrix in M(F) has an r.c.f. if and only if it has an l.c.f. However, aside from some general and abstract results, very little was said about the existence of coprime factorizations. The objective of this section is to study this issue in more detail for the case of linear distributed systems, both continuous-time as well as discrete-time. This is done by first proving a general theorem about the existence of coprime factorizations, and then specializing this theorem to the two special cases mentioned above. As a consequence, we shall obtain two very large classes of transfer matrices whose elements all have both r.c.f.'s as well as l.c.f.'s.

The main result of this section is given next.
Theorem 8.2.1 Suppose I is a subset of R satisfying two conditions:

(I1) I is a saturated multiplicative system;
(I2) If a ∈ I and b ∈ R, then the ideal in R generated by a and b is principal.

Under these conditions, every matrix G ∈ M(I^{-1}R) has both an r.c.f. as well as an l.c.f.
Remarks 8.2.2 1) Recall from Section A.2 that a subset I of R is said to be multiplicative if a, b ∈ I implies that ab ∈ I. It is saturated if a ∈ I and b ∈ R divides a together imply that b ∈ I.

2) By induction, one can show that (I2) is equivalent to the following statement: (I2′) If a ∈ I and b_1, …, b_n ∈ R, then the ideal in R generated by a, b_1, …, b_n is principal. Let d be a generator of this principal ideal. Then one can show quite easily that d is a greatest common divisor (g.c.d.) of the set {a, b_1, …, b_n}.

3) Recall from Section A.2 that I^{-1}R denotes the ring consisting of all fractions of the form a/b, where a ∈ R and b ∈ I. One may assume without loss of generality that 1 ∈ I. By identifying a ∈ R with a/1, one can embed R as a subring of I^{-1}R.
The proof of Theorem 8.2.1 is facilitated by some preliminary results.
Lemma 8.2.3 Suppose a ∈ I, b_1, …, b_n ∈ R, and let d be a g.c.d. of the set {a, b_1, …, b_n}. Then there exists a matrix U ∈ U(R) such that

U [a b_1 ⋯ b_n]^T = [d 0 ⋯ 0]^T .   (8.2.1)
Proof. Let c_0 = a/d and c_i = b_i/d for i ≥ 1. Then c_0 ∈ I since I is saturated, and 1 is a g.c.d. of all the c_i's. By Corollary 8.2.4, there exists a U ∈ U(R) such that U [c_0 c_1 ⋯ c_n]^T = [1 0 ⋯ 0]^T; multiplying both sides by d gives (8.2.1). □

Proof of Theorem 8.2.1. Each element of G belongs to I^{-1}R, say g_ij = n_ij/a_ij with n_ij ∈ R and a_ij ∈ I. Define

a = Π_{i,j} a_ij ,  b_ij = g_ij a ,   (8.2.3)

and let B be the matrix whose elements are the b_ij's, and A = aI. Then G = BA^{-1}, and the theorem is proved if it can be shown that there exists a U ∈ U(R) such that (8.2.2) holds. Such a U is constructed
iteratively. As the first step, let d_m be a g.c.d. of the set {a, b_{1m}, …, b_{nm}}. Such a d_m exists, since a ∈ I (by the multiplicativity of the set I). Select a unimodular matrix V_m ∈ R^{(n+1)×(n+1)} such that

V_m [a b_{1m} ⋯ b_{nm}]^T = [d_m 0 ⋯ 0]^T .   (8.2.4)

Such a V_m exists, by Lemma 8.2.5. Now define

U_m = [ I_{m-1}   0
        0         V_m ] .   (8.2.5)

Then routine computations show that

U_m [ A
      B ] = [ aI_{m-1}   e_m
              X          0   ] ,   (8.2.6)

where

e_m = [0 ⋯ 0 d_m]^T ,   (8.2.7)

and X ∈ M(R). It is now possible to repeat the procedure by working on the next-to-last column of the matrix in (8.2.6). After m repetitions of the reduction, one arrives at a matrix of the form shown in (8.2.2). □
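The single-column reduction in (8.2.4) can be made concrete over R = Z, a Bezout domain used here purely as a stand-in (an illustrative assumption; the theorem itself needs only (I1)-(I2)). Each step pairs the top entry with one entry below it, and a 2×2 extended-Euclidean block of determinant 1 maps the pair (a, b) to (gcd(a, b), 0):

```python
def xgcd(a, b):
    """Extended Euclid: returns (d, p, q) with p*a + q*b == d == gcd(a, b)."""
    old_r, r, old_p, p, old_q, q = a, b, 1, 0, 0, 1
    while r:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_p, p = p, old_p - k * p
        old_q, q = q, old_q - k * q
    return old_r, old_p, old_q

def reduce_column(col):
    """Accumulate unimodular row operations U so that U * col = [d, 0, ..., 0],
    a scalar analogue of (8.2.4) over Z."""
    n = len(col)
    U = [[int(i == j) for j in range(n)] for i in range(n)]
    col = list(col)
    for i in range(1, n):
        d, p, q = xgcd(col[0], col[i])
        # The block [[p, q], [-col[i]//d, col[0]//d]] has determinant
        # (p*col[0] + q*col[i])/d = 1 and sends (col[0], col[i]) to (d, 0);
        # apply it to rows 0 and i of U.
        a0, ai = col[0] // d, col[i] // d
        row0 = [p * U[0][j] + q * U[i][j] for j in range(n)]
        rowi = [-ai * U[0][j] + a0 * U[i][j] for j in range(n)]
        U[0], U[i] = row0, rowi
        col[0], col[i] = d, 0
    return U, col

U, reduced = reduce_column([6, 10, 15])
print(reduced)   # [1, 0, 0], since gcd(6, 10, 15) = 1
```

Iterating this over the columns of [A; B], as in (8.2.5)-(8.2.6), is exactly the procedure of the proof.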
As an application of the above theory, consider the ring A described in (8.1.16), which is recalled here for convenience. The ring A consists of all distributions f(·) with support in [0, ∞) of the form

f(t) = Σ_{i=0}^{∞} f_i δ(t - t_i) + f_a(t) ,   (8.2.8)

where 0 ≤ t_0 < t_1 < t_2 < ⋯ are real numbers, the f_i are real, f_a(·) is measurable, and in addition

‖f‖_A = Σ_{i=0}^{∞} |f_i| + ∫_0^∞ |f_a(t)| dt < ∞ .   (8.2.9)

It can be shown that A is precisely the set of impulse responses of BIBO-stable linear time-invariant systems, and is a ring with pointwise addition and convolution as the product. Now, it can be shown [101] that A is not a Bezout domain. Hence, not every element in the field of fractions of A has a coprime factorization. Moreover, it is not known whether A is a Hermite ring or not. Hence, it is of interest to identify a subset B of the fraction field of A such that every matrix in M(B) has both
an r.c.f. and an l.c.f. This is done in [8] and can be described in the current framework as follows: Let

Π_i (z - z_i)^{m_i} .   (8.2.10)

Then it can be shown that the ideal in A generated by f and g is the same as the principal ideal generated by (8.2.10) (see Problem 8.2.2). Hence, (I2) holds, and every matrix whose elements all belong to I^{-1}A has both an r.c.f. and an l.c.f.
PROBLEMS

8.2.1. Show that the set I consisting of the functions in Â_0 that are bounded away from zero at infinity satisfies condition (I1).

8.2.2. Show that the subset I of the disc algebra defined above satisfies conditions (I1) and (I2).
8.3 CONTROLLER SYNTHESIS OVER A GENERAL RING
The objective of this section is to extend the results contained in Chapter 5, and specifically Section 5.2, to the case where the set S is replaced by a general ring R. The main obstacle to doing so is that the plant and compensator cannot be assumed a priori to have right- or left-coprime factorizations. It is shown below that this difficulty can be mostly circumvented.
Suppose R is a commutative domain with identity, and denote its field of fractions by F. In the present set-up, both the plant P and the compensator C are assumed to belong to M(F). The pair (P, C) is stable if

H(P, C) = [ (I + PC)^{-1}     -P(I + CP)^{-1}
            C(I + PC)^{-1}     (I + CP)^{-1}  ]   (8.3.1)

is well-defined and belongs to M(R). The symbol S(P) denotes the set of C ∈ M(F) such that (P, C) is a stable pair. The first objective is to parametrize the set S(P) corresponding to a given plant P. This is made easier by the next result.
Lemma 8.3.1 Suppose P has an r.c.f. (resp. an l.c.f.) and that C ∈ S(P). Then C has an l.c.f. (resp. an r.c.f.).
Proof. Consider the case where P has an r.c.f.; in case P has an l.c.f. the arguments are entirely similar. Suppose P = ND^{-1} where N, D ∈ M(R) are right-coprime, and select X, Y ∈ M(R) such that XN + YD = I. Now

H(P, C) = H(ND^{-1}, C)
        = [ I - N(D + CN)^{-1}C    -N(D + CN)^{-1}
            D(D + CN)^{-1}C         D(D + CN)^{-1} ]   (8.3.2)

belongs to M(R), since C ∈ S(P). Define S, T ∈ M(F) by

S = (D + CN)^{-1} ,  T = (D + CN)^{-1}C .   (8.3.3)

Then,

H(P, C) = [ I - NT   -NS
            DT        DS ] ,   (8.3.4)

which shows that NS, NT, DS, DT ∈ M(R). This in turn implies that

S = X(NS) + Y(DS) ∈ M(R) ;
T = X(NT) + Y(DT) ∈ M(R) .   (8.3.5)

Also, from (8.3.3), C = S^{-1}T. Thus, in order to prove that (S, T) is an l.c.f. of C, it only remains to establish the left-coprimeness of S and T. But this is immediate, since

SD + TN = (D + CN)^{-1}(D + CN) = I .   (8.3.6)

Hence, (S, T) is an l.c.f. of C. □
Corollary 8.3.2 Suppose P has both an r.c.f. and an l.c.f., and that C ∈ S(P). Then C has both an r.c.f. and an l.c.f.
Corollary 8.3.3 Suppose P, C ∈ M(F) and that the pair (P, C) is stable. Then the matrix

G = [ C   0
      0   P ]   (8.3.7)

has both an r.c.f. and an l.c.f.

Remarks 8.3.4 It is not claimed that P and C individually have both an r.c.f. and an l.c.f.; only that G has both.

Proof. Define

F = [ 0   I
      -I  0 ] .   (8.3.8)

Then (P, C) is stable if and only if (G, F) is stable. Since F is a constant matrix, it has both an r.c.f. as well as an l.c.f. The conclusion now follows from Corollary 8.3.2. □
Thus, if one is interested in designing stabilizing compensators for a plant that has both an r.c.f. and an l.c.f., the basic result of Chapter 5, namely Theorem 5.2.1, carries over completely. It is repeated here for emphasis.

Theorem 8.3.5 Suppose P ∈ M(F) has an r.c.f. (N, D) and an l.c.f. (D̃, Ñ). Select X, Y, X̃, Ỹ ∈ M(R) such that XN + YD = I and ÑX̃ + D̃Ỹ = I. Then

S(P) = {(Y - RÑ)^{-1}(X + RD̃) : R ∈ M(R) and |Y - RÑ| ≠ 0}
     = {(X̃ + DR)(Ỹ - NR)^{-1} : R ∈ M(R) and |Ỹ - NR| ≠ 0} .   (8.3.9)
Once a parametrization of S(P) is available, Theorem 5.4.1 concerning the simultaneous stabilizability of two plants P_0 and P_1 carries over to a general ring R.

Theorem 8.3.6 Suppose P_0, P_1 ∈ M(F) have the same dimensions, and that each has both an r.c.f. and an l.c.f. Define

A_0 = X_0 N_1 + Y_0 D_1 ,  B_0 = -Ñ_0 D_1 + D̃_0 N_1 ,   (8.3.10)

where the symbols are as in (5.4.1). Then there exists a C ∈ M(F) that stabilizes both P_0 and P_1 if and only if there exists an M ∈ M(R) such that A_0 + MB_0 ∈ U(R).
Thus, as before, the simultaneous stabilizability question for two plants can be reduced to a strong stabilizability question for a single plant. Unfortunately, this latter question is very difficult to answer, especially for multivariable plants.

The strong stabilizability question for scalar plants is the following: Given coprime elements n, d ∈ R, when does there exist a c ∈ R such that d + cn is a unit of R? The answer to this question depends strongly on the specifics of the ring R. In case R = A, the disc algebra, and n has only a finite number of zeros inside the closed unit disc, the existence or otherwise of a suitable c can be decided using the parity interlacing test, as in Section 2.4. However, if n has a sequence of zeros clustering on the unit circle, the methods of Section 2.4 are no longer applicable. For multivariable plants, the strong stabilization question is: Given a right-coprime pair N, D ∈ M(R), does there exist a C ∈ M(R) such that D + CN ∈ U(R)? If R is a proper Euclidean domain, the results of Section 4.4 can be used to transform this matrix problem into a scalar problem. Specifically, such a C exists if and only if the pair (n, d) can be strongly stabilized, where n is the g.c.d. of all elements of N and d = |D|. However, such a reduction is not possible over a general ring. As a result, the strong stabilization problem for multivariable plants over a general ring is still open.

The other contents of Chapter 5, such as the two-parameter compensation scheme for example, can also be extended to a general ring without undue difficulty. The details are left to the reader.
The results of Chapters 6 and 7 can be extended almost without modification to distributed systems. The reason is that these results are mainly based on the properties of the function spaces H_2 and H_∞, and the rationality of the various functions involved does not play a central role.
This section is concluded with a discussion of the possibility of designing lumped compensators (i.e., those with rational transfer function matrices) for distributed systems. Such compensators are desirable because of ease of implementation. Consider first the discrete-time case, and suppose the ring R is the disc algebra A. Then the polynomials are dense in A. As a consequence, the set of rational functions (i.e., ratios of polynomials) is dense in the fraction field of A in the graph topology. Thus, if c is any element (compensator) in the fraction field of A, then every neighborhood of c contains a rational function. This fact can be exploited to design lumped stabilizing compensators, as follows: First, use Theorem 8.3.5 together with any other mathematical apparatus to design a good, or even optimal, compensator C in the form N_c D_c^{-1} or D̃_c^{-1} Ñ_c, where the matrices N_c, D_c or Ñ_c, D̃_c have their elements in the ring A. By approximating the elements of these matrices by polynomials or rational H

(a/b) · (c/d) = ac/bd .   (A.2)
Note that, if b, d ∈ M, then so does bd. Hence, the right sides of (A.1) and (A.2) are valid fractions. Actually, (A.1) and (A.2) represent operations on equivalence classes; but the reader can verify that the sum and product of two fractions (i.e., two equivalence classes) do not depend on which representatives of the equivalence classes are used.

With addition and multiplication defined by (A.1) and (A.2), L is a ring. Moreover, if every a ∈ R is identified with the fraction a/1, then R is isomorphic to a subring of L. The element 1/1 serves as an identity for L. Finally, if d ∈ M, then d corresponds to d/1 ∈ L; moreover, d/1 is a unit in L with the inverse 1/d. Thus, every element of (the isomorphic image of) M is a unit in L. The ring L is called the ring of fractions of R with respect to M, and is denoted by M^{-1}R.

Since M is a subset of N (the set of absolute nondivisors of zero), the ring N^{-1}R contains the largest number of units. If in particular R is a domain, then N = R \ 0, i.e., N consists of all nonzero elements of R. In this case, the ring F = (R \ 0)^{-1}R has the property that every nonzero element of R is a unit in F. Moreover, every nonzero element of F is also a unit: If a/b ∈ F and a ≠ 0, then b/a is the inverse of a/b. Hence, F is a field; it is referred to as the field of fractions or quotient field associated with the domain R.

There is a particularly important class of fraction rings that is frequently encountered. Suppose R is a ring, and that I is a prime ideal in R. If M denotes the complement of I, then M is a multiplicative system (see Problem A.1.4). The corresponding fraction ring M^{-1}R is called the localization of R with respect to I.

Examples A.2.3 The set of integers Z is a domain. The field of fractions associated with the integers is the set of rational numbers.

In the ring Z, let M denote the set of numbers that are not divisible by 3. Then M is a multiplicative system (since 3 is a prime number), and 1 ∈ M. The ring of fractions M^{-1}Z is the set of rational numbers whose denominators (when expressed in reduced form) are not divisible by 3. This is a subring of the field of rational numbers.
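The localization in Example A.2.3 is easy to experiment with, since membership depends only on the reduced denominator. A minimal sketch (the helper name `in_localization` is mine, chosen for illustration):

```python
from fractions import Fraction

def in_localization(q, p=3):
    """Membership test for M^{-1}Z with M = {integers not divisible by p}:
    rationals whose denominator, in reduced form, is not divisible by p."""
    q = Fraction(q)
    return q.denominator % p != 0

a, b = Fraction(5, 7), Fraction(2, 11)                 # both lie in M^{-1}Z
print(in_localization(a * b), in_localization(a + b))  # closed under * and +: True True
print(in_localization(Fraction(1, 3)))                 # 1/3 is excluded: False
# The units of M^{-1}Z are the fractions whose numerator also lies in M:
u = Fraction(7, 5)
print(in_localization(u), in_localization(1 / u))      # True True
```

Note that `Fraction` reduces automatically, which is exactly the "expressed in reduced form" proviso in the example.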
A. ALGEBRAIC PRELIMINARIES

PROBLEMS

A.2.1. Suppose R is a ring, a, b, c ∈ R, and a is an absolute nondivisor of zero. Show that if ab = ac, then b = c.

A.2.2. Show that the addition and multiplication rules (A.1) and (A.2) are unambiguous, in the sense that the final answers do not depend on which representatives of the equivalence classes are used in the computation.

A.2.3. Consider the ring Z/(10), consisting of the integers 0, …, 9 with addition and multiplication defined modulo 10.
(i) Determine the units of this ring.
(ii) Show that the set N of absolute nondivisors of zero consists of just the units.
(iii) Using (ii), show that the ring of fractions with numerators in Z/(10) and denominators in N is again just Z/(10).

A.2.4. Suppose R is a ring and that I is a prime ideal in R. Let M denote the complement of I. Show that M is a multiplicative system.
A.3 PRINCIPAL IDEAL DOMAINS

Throughout this section, "ring" means a commutative ring with identity.

Definition A.3.1 A ring R is said to be a principal ideal ring if every ideal in R is principal. R is a principal ideal domain (p.i.d.) if it is a domain as well as a principal ideal ring.

Recall that a principal ideal I consists of all multiples of some element a, i.e., I = {xa : x ∈ R}. Thus, in a principal ideal ring, every ideal is generated by a single element.

If x and y are elements of a ring R with x ≠ 0, we say that x divides y, and y is a multiple of x, if there is an element z ∈ R such that y = xz; this is denoted by x | y. If x and y are elements of a ring R such that not both are zero, a greatest common divisor (GCD) of x, y is any element d ∈ R such that

(GCD1) d | x and d | y .
(GCD2) c | x, c | y ⟹ c | d .

In the above definition, it is implicit that d ≠ 0, since d divides a nonzero element by virtue of (GCD1). We say "a" greatest common divisor because a GCD is not unique. Certainly, if d is a GCD of x and y, then so is -d. Actually, the following stronger result holds:
Fact A.3.2 Suppose d is a GCD of x and y; then so is du whenever u is a unit. Suppose in addition that R is a domain; then every GCD d_1 of x and y is of the form d_1 = du for some unit u.

Proof. The first sentence is obvious. To prove the second sentence, observe that if d_1 is another GCD of x and y, then by (GCD2) d_1 | d and d | d_1. Hence, d_1 = du for some u. Moreover, du | d since d_1 | d. Since R is a domain and d ≠ 0, this implies that u | 1, i.e., that u is a unit. □
Fact A.3.2 states that if R is a domain, then once we have found one GCD of a given pair of elements, we can quickly find them all. However, a question that is not answered by Fact A.3.2 is: Does every pair of elements have a GCD? The answer to this question is provided next.

Theorem A.3.3 Suppose R is a principal ideal ring. Then every pair of elements x, y ∈ R, not both of which are zero, has a GCD d which can be expressed in the form

d = px + qy   (A.1)

for appropriate elements p, q ∈ R. Moreover, if R is a domain, then every GCD of x and y can be expressed in the form (A.1).

Proof. Given x, y ∈ R, consider the set

I = {ax + by : a, b ∈ R} .   (A.2)

In other words, I is the set of all linear combinations of x and y. It is easy to verify that I is an ideal in R.^1 Since R is a principal ideal ring, I must be a principal ideal; that is, there exists an element d ∈ I such that I is the set of all multiples of d. Since x = 1·x + 0·y ∈ I, it follows that x is a multiple of d, i.e., d | x; similarly d | y. Thus, d satisfies the axiom (GCD1). Next, since d ∈ I and I is the ideal generated by x, y, there exist p, q ∈ R such that (A.1) holds. Now suppose c | x, c | y. Then c | (px + qy), i.e., c | d. Hence, d also satisfies (GCD2) and is thus a GCD of x and y.

Up to now, we have shown that every pair of elements x and y has a GCD which can be written in the form (A.1) for a suitable choice of p and q. Now, if in addition R is a domain, then every GCD d_1 of x, y is of the form d_1 = du for some unit u, by Fact A.3.2. Thus, every GCD d_1 can be written in the form d_1 = du = upx + uqy. □

Two elements x, y ∈ R are relatively prime, or simply coprime, if every GCD of x, y is a unit. In view of Fact A.3.2, if R is a domain, this is equivalent to saying that x and y are coprime if and only if 1 is a GCD of x and y. Now suppose R is a principal ideal domain. Then, by Theorem A.3.3, x, y ∈ R are coprime if and only if the ideal I in (A.2) is the same as the ideal generated by 1. But the latter is clearly the entire ring R. This can be summarized as follows:

Fact A.3.4 Let R be a principal ideal domain. Then x, y ∈ R are coprime if and only if there exist p, q ∈ R such that px + qy = 1.
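For R = Z, the elements p and q of Theorem A.3.3, and hence the Bezout certificate of coprimeness in Fact A.3.4, can be computed constructively with the extended Euclidean algorithm. A minimal sketch (Z only, for illustration):

```python
def xgcd(x, y):
    """Extended Euclidean algorithm: returns (d, p, q) with p*x + q*y == d,
    a GCD of x and y expressed in the form (A.1)."""
    old_r, r = x, y
    old_p, p = 1, 0
    old_q, q = 0, 1
    while r != 0:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_p, p = p, old_p - k * p
        old_q, q = q, old_q - k * q
    return old_r, old_p, old_q

d, p, q = xgcd(240, 46)
print(d, p * 240 + q * 46)   # d = 2 and the identity d = p*x + q*y holds
print(xgcd(9, 16)[0])        # 1: 9 and 16 are coprime, as in Fact A.3.4
```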
One can also define a GCD of an n-tuple of elements (x_1, …, x_n), not all of which are zero. An element d ∈ R is a GCD of the collection (x_1, …, x_n) if

(i) d | x_i ∀i ,   (A.3)
(ii) c | x_i ∀i ⟹ c | d .   (A.4)

^1 I is referred to as the ideal generated by x and y.
As in Fact A.3.2, it follows that a GCD of a given n-tuple is unique to within a unit provided the ring is a domain. Moreover, Theorem A.3.3 can be generalized as follows:

Theorem A.3.5 Suppose R is a principal ideal ring, and suppose x_1, …, x_n ∈ R, with at least one element not equal to zero. Then the n-tuple (x_1, …, x_n) has a GCD d which can be expressed in the form

d = Σ_{i=1}^{n} p_i x_i   (A.5)

for appropriate elements p_1, …, p_n ∈ R. Moreover, if R is a domain, then every GCD of this n-tuple can be expressed in the form (A.5).
The proof is entirely analogous to that of Theorem A.3.3 and is left as an exercise.

In the field of rational numbers, every ratio a/b of integers has an equivalent reduced form f/g where f and g are coprime. The next result shows that such a statement is also true in the field of fractions associated with any principal ideal domain.

Fact A.3.6 Let R be a principal ideal domain, and let F be the field of fractions associated with R. Given any fraction a/b in F, there exists an equivalent fraction f/g such that f and g are coprime.

Proof. If a and b are already coprime, then there is nothing to be done. Otherwise, let d be a GCD of a, b, and define f = a/d, g = b/d. Then clearly, f/g = a/b, and it only remains to show that f and g are coprime. From Theorem A.3.3, there exist p, q ∈ R such that

d = pa + qb = pf d + qgd .   (A.6)

Cancelling d from both sides of (A.6) gives 1 = pf + qg, which shows that f, g are coprime. □
Thus, the procedure for reducing a fraction is the natural one: namely, we extract a greatest
common divisor from the numerator and the denominator.
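The reduction procedure of Fact A.3.6 can be carried out directly over R = Z (a sketch for illustration; the helper name `reduce_fraction` is mine):

```python
from math import gcd

def reduce_fraction(a, b):
    """Reduce a/b over R = Z exactly as in the proof of Fact A.3.6:
    divide the numerator and the denominator by a GCD d of the pair."""
    d = gcd(a, b)
    return a // d, b // d

f, g = reduce_fraction(84, 126)
print(f, g, gcd(f, g))   # 2 3 1: f/g is the reduced form, and f, g are coprime
```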
Two elements a, b ∈ R are associates (denoted by a ∼ b) if there is a unit u such that a = bu. One can readily verify that ∼ is an equivalence relation on R. A nonunit, nonzero element p ∈ R is a prime if the only divisors of p are either units or associates of p. An equivalent definition is the following: p is a prime if p = ab, a, b ∈ R implies that either a or b is a unit. In the ring of integers, the primes are precisely the prime numbers.
A useful property of principal ideal domains is stated next, without proof (see [116]).

Fact A.3.7 Every nonunit, nonzero element of a principal ideal domain can be expressed as a product of primes. Moreover, this factorization is unique in the following sense: If

x = Π_{i=1}^{n} p_i = Π_{i=1}^{m} q_i ,   (A.7)

where the p_i, q_i are all primes, then n = m, and the q_i's can be renumbered such that p_i ∼ q_i ∀i.

Thus, the only nonuniqueness in the prime factorization of an element arises from the possibility of multiplying some of the prime factors by a unit (for example, 6 = 2·3 = (-2)·(-3)).

Using prime factorizations, one can give a simple expression for a GCD of a set of elements. The proof is left as an exercise (see Problem A.3.7).
Fact A.3.8 Suppose x_1, …, x_n is a set of elements such that none is a unit nor zero. Express each element x_i in terms of its prime factors in the form

x_i = Π_j p_j^{α_ij} ,   (A.8)

where the p_j's are distinct (i.e., nonassociate) primes, α_ij ≥ 0, and α_ij = 0 if p_j is not a divisor of x_i. Then a GCD d of (x_1, …, x_n) is given by

d = Π_j p_j^{β_j} ,   (A.9)

where

β_j = min_i α_ij .   (A.10)
Corollary A.3.9 Two elements x, y are coprime if and only if their prime divisors are distinct.
Up to now, we have talked about the greatest common divisor of a set of elements. A parallel concept is that of the least common multiple. Suppose (x_1, …, x_n) is a set of elements, none of which is zero. We say that y is a least common multiple of this set of elements if

(LCM1) x_i | y ∀i ,
(LCM2) x_i | z ∀i implies that y | z .

Using prime factorizations, one can give a simple expression for a l.c.m. that parallels Fact A.3.8.

Fact A.3.10 Suppose (x_1, …, x_n) is a collection of nonzero, nonunit elements. Express each element x_i in terms of its prime factors as in (A.8). Then a l.c.m. of this collection of elements is given by

y = Π_j p_j^{γ_j} ,   (A.11)

where

γ_j = max_i α_ij .   (A.12)

The proof is left as an exercise (see Problem A.3.8).
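Facts A.3.8 and A.3.10 translate directly into code over R = Z: factor each element as in (A.8), then take the componentwise minimum of exponents for the GCD and the maximum for the l.c.m. A sketch (trial-division factorization, chosen only for brevity):

```python
from collections import Counter

def factorize(n):
    """Prime factorization of |n| as a Counter {prime: exponent}, as in (A.8)."""
    n = abs(n)
    c = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            c[p] += 1
            n //= p
        p += 1
    if n > 1:
        c[n] += 1
    return c

def gcd_lcm(*xs):
    """GCD via min of exponents, (A.9)-(A.10); l.c.m. via max, (A.11)-(A.12)."""
    facs = [factorize(x) for x in xs]
    primes = set().union(*facs)
    g = l = 1
    for p in primes:
        exps = [f[p] for f in facs]   # a missing prime contributes exponent 0
        g *= p ** min(exps)
        l *= p ** max(exps)
    return g, l

print(gcd_lcm(8, 24, 42))   # (2, 168): 8 = 2^3, 24 = 2^3*3, 42 = 2*3*7
```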
PROBLEMS
A.3.1. Prove that two elements x, y in a p.i.d. R are coprime if and only if there exist p, q R
such that px +qy is a unit.
A.3.2. Let R be a domain, and let x, y denote a GCD of x, y which is unique to within a unit
factor. Show that if u, v are units, then x, y = ux, vy (in other words, every GCD of
x, y is also a GCD of ux, vy and vice versa).
A.3.3. Let R be a p.i.d., and let x, y, z R. Show that x, y, z = x, y, z. More generally,
showthat if x
1
, , x
n
R, thenx
1
, , x
n
= x
1
, , x
m
, x
m+1
, , x
n
for 1
m < n.
A.3.4. Let R be a p.i.d., and let x
1
, , x
n
R. Let d be a GCD of this set of elements and
select p
1
, , p
n
R such that
p
i
x
i
= d. Show that 1 is a GCD of the set of elements
p
1
, , p
n
.
A.3.5. Suppose x, y are coprime and z divides y, in a p.i.d. R. Show that x and z are coprime.
A.3.6. Suppose u is a unit and x
1
, , x
n
are nonzero elements. Show that u, x
1
, , x
n
= 1.
A.3.7. Prove Fact A.3.8.
A.3.8. Prove Fact A.3.10.
A.3.9. (i) Find the prime factorizations of 8, 24, 42 in the ring of integers.
(ii) Find a GCD of the above three numbers using Fact A.3.8.
(iii) Find a l.c.m. of the above three numbers using Fact A.3.10.
A.3.10. Show that the following three statements are equivalent:
(i) x divides y.
(ii) x is a GCD of x, y.
(iii) y is a l.c.m. of x, y.
A.3.11. Suppose R is a p.i.d., that x, y, z ∈ R, that x and y are coprime, and that x divides yz. Show that x divides z. (Hint: Use Corollary A.3.9.)
A.4 EUCLIDEAN DOMAINS
In this section, we study a special type of ring that finds a lot of application in this book. Throughout this section, domain means a commutative domain with identity.

Definition A.4.1 A domain R is a Euclidean domain if there is a degree function δ : R \ {0} → Z_+ satisfying the following axioms:²

(ED1) For every x, y ∈ R with y ≠ 0, there exists a q ∈ R such that either r := x − qy is zero, or else δ(r) < δ(y).

(ED2) If x | y then δ(x) ≤ δ(y).

One can think of q as a quotient, and r as a remainder, after dividing x by y. The axiom (ED1) states that we can always get a remainder that is either zero or else has a smaller degree than the divisor y. We speak of a quotient and a remainder because q and r are not necessarily unique unless additional conditions are imposed on the degree function δ(·).

The axiom (ED2) implies that δ(1) ≤ δ(x) for all x ≠ 0, since 1 divides every nonzero element. Hence, it can be assumed without loss of generality that δ(1) = 0. The same axiom also implies that if x and y are associates, then they have the same degree (because if x and y are associates, then x | y and y | x). In particular, δ(u) = 0 whenever u is a unit.

Fact A.4.2 Let R be a Euclidean domain with degree function δ(·), and suppose
\[ \delta(x + y) \le \max\{\delta(x), \delta(y)\} , \tag{A.1} \]
\[ \delta(xy) = \delta(x) + \delta(y) . \tag{A.2} \]
Then, for every x, y ∈ R with y ≠ 0, there exists a unique q ∈ R such that δ(x − yq) < δ(y), where the degree of zero is taken as −∞.

Proof. By (ED1), there exists at least one such q. Now suppose δ(x − ay) < δ(y) and δ(x − by) < δ(y), and define r = x − ay, s = x − by. Then x = ay + r = by + s. Rearranging gives (a − b)y = s − r. If a ≠ b, then δ((a − b)y) = δ(a − b) + δ(y) ≥ δ(y), by (A.2). On the other hand, δ(s − r) ≤ max{δ(r), δ(s)} < δ(y). This contradiction shows that a = b, and hence also r = s. □

Definition A.4.3 A Euclidean domain R with degree function δ(·) is called a proper Euclidean domain if R is not a field and δ(·) satisfies (A.2).³

² Note that Z_+ denotes the set of nonnegative integers.
³ This is slightly different from the definition in [65, p. 30].
Note that, in a proper Euclidean domain, the division process might still produce nonunique
quotients and remainders, because (A.1) is not assumed to hold. This is the case, for example, in the
ring of proper stable rational functions, which are studied in Chapter 2.
Fact A.4.4 Every Euclidean domain is a principal ideal domain.
Proof. Let R be a Euclidean domain, and let I be an ideal in R. If I = {0}, then I is principal with 0 as the generator. So suppose I contains some nonzero elements, and let x be an element of I such that δ(x) is a minimum over all nonzero elements of I. We claim that I is the ideal generated by x and is hence principal. To prove this, let y ∈ I be chosen arbitrarily; it is shown that x divides y. By axiom (ED1), there exists a q ∈ R such that either r := y − qx is zero or else δ(r) < δ(x). If r ≠ 0, then δ(r) < δ(x) contradicts the manner in which x was chosen. Hence, r = 0, i.e., x divides y. □
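In Z, the proof of Fact A.4.4 is effectively constructive: the division axiom (ED1), applied repeatedly, is the Euclidean algorithm, and tracking the quotients expresses the resulting GCD as an element px + qy of the ideal generated by x and y (cf. Theorem A.3.3). A sketch with illustrative names:

```python
def extended_euclid(x, y):
    """Return (d, p, q) with d = p*x + q*y and d a GCD of x and y,
    using only the division step guaranteed by axiom (ED1)."""
    # Invariants: r0 = p0*x + q0*y and r1 = p1*x + q1*y throughout.
    r0, p0, q0 = x, 1, 0
    r1, p1, q1 = y, 0, 1
    while r1 != 0:
        quot = r0 // r1                      # the q of axiom (ED1)
        r0, r1 = r1, r0 - quot * r1          # the remainder has smaller degree
        p0, p1 = p1, p0 - quot * p1
        q0, q1 = q1, q0 - quot * q1
    return r0, p0, q0

d, p, q = extended_euclid(24, 42)
assert d == 6 and p * 24 + q * 42 == 6
```

The same loop works verbatim in any Euclidean domain once `//` is replaced by that domain's (ED1) division step.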
We now present a very important example of a Euclidean domain, namely the ring of polynomials in one indeterminate with coefficients in a field. To lead up to this example, an abstract definition of a polynomial is given.
Let R be a ring. Then a polynomial over R is an infinite sequence {a_0, a_1, …} such that only finitely many terms are nonzero. The sum and product of two polynomials a = {a_i} and b = {b_i} are defined by
\[ (a + b)_i = a_i + b_i , \tag{A.3} \]
\[ (ab)_i = \sum_{j=0}^{i} a_{i-j} b_j = \sum_{j=0}^{i} a_j b_{i-j} . \tag{A.4} \]
For notational convenience, a polynomial a = {a_i} can be represented by a_0 + a_1 s + a_2 s² + ⋯, where s is called the indeterminate. The highest value of the index i such that a_i ≠ 0 is called the degree of the polynomial a = {a_0, a_1, …}.⁴ Thus, if a is a polynomial of degree m, we can write
\[ a(s) = a_0 + a_1 s + \cdots + a_m s^m = \sum_{i=0}^{m} a_i s^i . \tag{A.5} \]
The set of polynomials over R is denoted by R[s], and is a commutative ring with identity. Moreover, if R is a domain, so is R[s].

Fact A.4.5 Suppose F is a field. Then F[s] is a Euclidean domain if the degree of a polynomial in F[s] is defined as above.

⁴ The degree of the zero polynomial is taken as −∞.
Proof. To prove axiom (ED1), suppose
\[ f(s) = \sum_{i=0}^{n} f_i s^i , \quad g(s) = \sum_{i=0}^{m} g_i s^i , \quad g_m \ne 0 . \tag{A.6} \]
It is necessary to show the existence of a q ∈ F[s] such that δ(f − gq) < δ(g), where δ(0) = −∞. If n < m, take q = 0. If n ≥ m, define q_1(s) = (f_n/g_m) s^{n−m}; then δ(f − gq_1) ≤ n − 1. By repeating this process if necessary on the polynomial f − gq_1, we can ultimately find a q ∈ F[s] such that δ(f − gq) < δ(g). Thus, (ED1) is satisfied. The proof of (ED2) is straightforward. □
Since the degree function δ(·) satisfies both (A.1) and (A.2), the Euclidean division process yields a unique remainder and quotient r, q corresponding to each pair f, g with g ≠ 0. Moreover, it is clear that F[s] is not a field (the polynomial s has no inverse). Hence, F[s] is a proper Euclidean domain.
The field of fractions associated with F[s] is denoted by F(s), and is called the set of rational functions over F. Note that every element of F(s) is a ratio of two polynomials (hence the name "rational function").
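The proof of Fact A.4.5 is just polynomial long division: cancel the leading term with q_1(s) = (f_n/g_m)s^{n−m} and repeat. A sketch over F = Q, storing a polynomial as its coefficient list [a_0, …, a_m] as in (A.5) (the function name and conventions are illustrative):

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Divide f by g in F[s] with F the rationals; return (q, r) with deg r < deg g.
    Polynomials are coefficient lists [a_0, a_1, ...] with nonzero last entry."""
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    while len(f) >= len(g) and any(f):
        # q_1(s) = (f_n / g_m) s^(n-m), as in the proof of Fact A.4.5
        shift = len(f) - len(g)
        coeff = f[-1] / g[-1]
        q[shift] += coeff
        for i, gc in enumerate(g):
            f[shift + i] -= coeff * gc
        while f and f[-1] == 0:     # drop the cancelled leading term
            f.pop()
    return q, f

# (s^3 + 1) divided by (s - 1): quotient s^2 + s + 1, remainder 2
q, r = poly_divmod([1, 0, 0, 1], [-1, 1])
```

Because δ satisfies (A.1) and (A.2) here, Fact A.4.2 guarantees that the q and r this loop returns are the unique ones.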
PROBLEMS
A.4.1. Show that the set of integers Z is a Euclidean domain if we define the degree of an integer
to be its absolute value. Does the division process result in unique remainders?
A.4.2. Consider the ring R[s], consisting of polynomials with real coefficients. What are the
primes of this ring?
A.4.3. Suppose R is a proper Euclidean domain and x ∈ R. Show that if δ(x) = 1, then x is a prime. Is the converse true? (Hint: See Problem A.4.2.)
A.4.4. Let R be a Euclidean domain. Show that x ∈ R is a unit if and only if δ(x) = 0. (Hint: Use axiom (ED1).)
APPENDIX B
Preliminaries on Matrix Rings
The objective of this appendix is to gather some well-known facts on matrices whose elements belong
to a ring or a field, and to state them in as much generality as possible.
B.1 MATRICES AND DETERMINANTS
Let R be a ring, and let R^{n×n} denote the set of square matrices of order n × n whose elements belong to R. If the sum and product of two matrices in R^{n×n} are defined in the familiar way, namely
\[ (A + B)_{ij} = a_{ij} + b_{ij} , \tag{B.1} \]
\[ (AB)_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} , \tag{B.2} \]
then R^{n×n} becomes a ring, usually referred to as a ring of matrices over R. If R contains an identity and n ≥ 2, then R^{n×n} is not commutative. For instance, if n = 2, we have
\[ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \ne \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} . \tag{B.3} \]
Similar examples can be constructed if n > 2. Also, if n ≥ 2, then R^{n×n} is not a domain, because
\[ \mathrm{Diag}\,\{1, 0, \ldots, 0\} \cdot \mathrm{Diag}\,\{0, 1, 0, \ldots, 0\} = 0_{n \times n} . \tag{B.4} \]
The determinant of a matrix A ∈ R^{n×n} is denoted by |A| and is defined in the familiar way, namely
\[ |A| = \sum_{\sigma \in \Sigma_n} (\operatorname{sign} \sigma) \prod_{i=1}^{n} a_{i\sigma(i)} , \tag{B.5} \]
where Σ_n denotes the set of permutations of the set N = {1, …, n} into itself, and sign σ = ±1 depending on whether σ is an even or an odd permutation.¹ Most of the usual properties of determinants hold in the present abstract setting. The required results are stated without proof, and the reader is referred to [65] for further details.
Define a function Φ : R^n × ⋯ × R^n → R as follows: For every v_1, …, v_n ∈ R^n, define Φ(v_1, …, v_n) to be the determinant of the matrix V ∈ R^{n×n} whose columns are v_1, …, v_n, in that order. Thus, Φ is just the determinant function viewed as a function of the columns of a matrix. Similarly, define Φ′(v_1, …, v_n) to be the determinant of the matrix V′ ∈ R^{n×n} whose rows are v_1, …, v_n, in that order.

¹ Throughout this section, the symbol 1 is used to denote both the integer as well as the identity element of the ring R. It is usually clear from the context which is meant.
Fact B.1.1 We have
\[ \Phi(v_1, \ldots, v_n) = \Phi'(v_1, \ldots, v_n) \quad \forall\, v_1, \ldots, v_n \in R^n . \tag{B.6} \]
The function Φ (and hence Φ′) is alternating and multilinear. That is, if two arguments of Φ are interchanged, then Φ changes sign; if two arguments of Φ are equal, then Φ equals zero; finally,
\[ \Phi(v_1 + w_1, v_2, \ldots, v_n) = \Phi(v_1, v_2, \ldots, v_n) + \Phi(w_1, v_2, \ldots, v_n) \tag{B.7} \]
for all possible choices of the arguments.
Fact B.1.2 Let A ∈ R^{n×n}, n ≥ 2. Then, for any i, j ∈ {1, …, n},
\[ |A| = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \, m_{ij}(A) , \tag{B.8} \]
\[ |A| = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} \, m_{ij}(A) , \tag{B.9} \]
where m_{ij}(A) is the ij-th minor of A, defined as the determinant of the (n−1) × (n−1) matrix obtained from A by deleting its i-th row and j-th column.²
Fact B.1.3 Let A ∈ R^{n×n}, n ≥ 2. Then
\[ \sum_{j=1}^{n} a_{ij} (-1)^{k+j} m_{kj}(A) = \begin{cases} |A| & \text{if } i = k \\ 0 & \text{if } i \ne k \end{cases} , \tag{B.10} \]
\[ \sum_{i=1}^{n} a_{ik} (-1)^{i+j} m_{ij}(A) = \begin{cases} |A| & \text{if } j = k \\ 0 & \text{if } j \ne k \end{cases} . \tag{B.11} \]
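The expansion (B.8) gives a recursive determinant that is valid over any commutative ring, and (B.10) can then be checked directly: a row paired with the cofactors of a different row sums to zero. A small sketch over Z (illustrative code, not from the text):

```python
def minor(A, i, j):
    """Delete row i and column j (0-based) from the square matrix A."""
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row, as in (B.8)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

A = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
n = len(A)
# (B.10): sum_j a_ij (-1)^(k+j) m_kj(A) equals |A| when i = k, and 0 otherwise
for i in range(n):
    for k in range(n):
        s = sum(A[i][j] * (-1) ** (k + j) * det(minor(A, k, j)) for j in range(n))
        assert s == (det(A) if i == k else 0)
```

The "alien cofactor" case i ≠ k is what makes the adjoint identity (B.19) below work.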
Now a bit of notation is introduced to make subsequent theorem statements more compact. Suppose m and n are positive integers, with m ≤ n. Then S(m, n) denotes the collection of all strictly increasing m-tuples (i_1, …, i_m), where 1 ≤ i_1 < i_2 < ⋯ < i_m ≤ n. For example,
\[ S(3, 5) = \{(1,2,3), (1,2,4), (1,2,5), (1,3,4), (1,3,5), (1,4,5), (2,3,4), (2,3,5), (2,4,5), (3,4,5)\} . \tag{B.12} \]

² To be consistent with subsequent notation, one should write a_{N\I, N\J} instead of m_{ij}(A). But the latter notation is more convenient.
If m = n, then S(m, n) is the singleton set {(1, 2, …, n)}, while if m = 0 then S(m, n) is just the empty set.

Suppose A ∈ R^{m×n}, and let I ∈ S(l, m), J ∈ S(l, n). Then a_{IJ} denotes the l × l minor of A consisting of the rows from I and the columns from J. In particular, if I and J are singleton sets of the form I = {i}, J = {j}, then a_{IJ} is just the element a_{ij}. Note that a_{IJ} ∈ R. We use A_{IJ} to denote the l × l matrix consisting of the elements from the rows in I and the columns in J. Thus, a_{IJ} = |A_{IJ}|. In some situations it is of interest to examine a submatrix of A consisting of the rows in I and all columns of A; such a submatrix is denoted by A_{I·}. The notation A_{·J} is similarly defined.
Fact B.1.4 (Laplace's Expansion of a Determinant) Suppose A ∈ R^{n×n}, and suppose I ∈ S(m, n). Then,
\[ |A| = \sum_{J \in S(m,n)} (-1)^{v(I,J)} \, a_{IJ} \, a_{N \setminus I,\, N \setminus J} , \tag{B.13} \]
where
\[ v(I, J) = \sum_{i \in I} i + \sum_{j \in J} j . \tag{B.14} \]
Equation (B.13) generalizes (B.8). The corresponding generalization of (B.9) is similar and is left to the reader.
Fact B.1.5 (Binet–Cauchy Formula) Suppose A ∈ R^{n×m}, B ∈ R^{m×l}, and let C = AB ∈ R^{n×l}. Let I ∈ S(p, n), J ∈ S(p, l). Then,
\[ c_{IJ} = \sum_{K \in S(p,m)} a_{IK} \, b_{KJ} . \tag{B.15} \]
In particular, if n = m = l, then |C| = |A| · |B|.

Note that (B.15) is a natural generalization of (B.2).
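Identity (B.15) is easy to test numerically: every p × p minor of C = AB is a sum, over selections K, of products of minors of A and B. A sketch over Z (the helper names are illustrative):

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion, valid over any commutative ring."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j+1:] for r in M[1:]])
               for j in range(len(M)))

def sub(M, rows, cols):
    return [[M[r][c] for c in cols] for r in rows]

A = [[1, 2, 0], [0, 1, 3]]                       # 2 x 3
B = [[1, 0], [2, 1], [0, 4]]                     # 3 x 2
C = [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

p, I, J = 2, (0, 1), (0, 1)
lhs = det(sub(C, I, J))
rhs = sum(det(sub(A, I, K)) * det(sub(B, K, J)) for K in combinations(range(3), p))
assert lhs == rhs == 61                          # the Binet-Cauchy identity (B.15)
```

With p = n = m = l the sum has a single term K = N, which recovers |AB| = |A| · |B|.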
Using the multilinearity of the determinant function, one can obtain an expression for the determinant of the sum of two matrices. For example, if A, B ∈ R^{2×2} and a_1, a_2, b_1, b_2 are the columns of the two matrices, then
\[ |A + B| = \Phi(a_1 + b_1, a_2 + b_2) = \Phi(a_1, a_2) + \Phi(a_1, b_2) + \Phi(b_1, a_2) + \Phi(b_1, b_2) . \tag{B.16} \]
If A, B ∈ R^{n×n}, then the formula for |A + B| will involve the sum of 2^n terms. In case one of the matrices is diagonal, each term in this expansion can be neatly expressed as a product of principal minors of A and B.
Fact B.1.6 Suppose A, B ∈ R^{n×n} and that A is diagonal (i.e., a_{ij} = 0 for i ≠ j). Then,
\[ |A + B| = \sum_{l=0}^{n} \sum_{I \in S(l,n)} a_{II} \, b_{N \setminus I,\, N \setminus I} , \tag{B.17} \]
where a_{II} is interpreted as 1 when I is empty.
If A ∈ R^{n×n}, its adjoint matrix, denoted by A^{adj}, is defined by
\[ (A^{\mathrm{adj}})_{ij} = (-1)^{i+j} m_{ji}(A) . \tag{B.18} \]
In view of (B.10) and (B.11), it is seen that, for any A ∈ R^{n×n},
\[ A A^{\mathrm{adj}} = A^{\mathrm{adj}} A = |A| \, I_n , \tag{B.19} \]
where I_n denotes the n × n identity matrix.

A matrix A ∈ R^{n×n} is unimodular if it has an inverse in R^{n×n}, i.e., if it is a unit in the ring R^{n×n}.
Fact B.1.7 A ∈ R^{n×n} is unimodular if and only if |A| is a unit in R.

Proof. "if" Suppose |A| is a unit in R, and let b = |A|^{−1} ∈ R. Then bA^{adj} ∈ R^{n×n} and A(bA^{adj}) = (bA^{adj})A = I_n. Hence, A is unimodular.

"only if" Suppose A is unimodular, and let B ∈ R^{n×n} be the inverse of A. Then 1 = |I_n| = |A| · |B|, which shows that |A| is a unit in R. □
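Over R = Z, Fact B.1.7 says that an integer matrix has an integer inverse exactly when its determinant is ±1, the units of Z. A sketch that builds A^{adj} from (B.18) and confirms (B.19) (illustrative code):

```python
def det(M):
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j+1:] for r in M[1:]])
               for j in range(len(M)))

def adjoint(A):
    """(A_adj)_ij = (-1)^(i+j) m_ji(A), as in (B.18)."""
    n = len(A)
    def m(i, j):  # the ij-th minor of A
        return det([r[:j] + r[j+1:] for k, r in enumerate(A) if k != i])
    return [[(-1) ** (i + j) * m(j, i) for j in range(n)] for i in range(n)]

A = [[2, 1], [1, 1]]                   # det = 1, so A is unimodular over Z
adj = adjoint(A)
prod = [[sum(A[i][k] * adj[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
assert prod == [[1, 0], [0, 1]]        # A * A_adj = |A| * I_n with |A| = 1
```

When |A| is a unit, b = |A|^{−1} times the adjoint is exactly the inverse constructed in the proof.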
Now consider the set F^{n×n} of matrices with elements in a field F. Since a field is also a ring, all of the preceding discussion applies. In addition, since every nonzero element of F is a unit, we see that every A ∈ F^{n×n} such that |A| ≠ 0 has an inverse in F^{n×n}. It is customary to call a matrix A nonsingular if |A| ≠ 0. From Fact B.1.3, we see that if A is nonsingular, then A^{−1} is given by
\[ (A^{-1})_{ij} = (-1)^{i+j} m_{ji}(A)/|A| . \tag{B.20} \]
The relation (B.20) is a special case of the following result, which gives the relationship between the minors of A and A^{−1}. The proof can be found in [41, pp. 21–22].
Fact B.1.8 Suppose A ∈ F^{n×n} is nonsingular and let B = A^{−1}. Let N = {1, …, n} and suppose I, J ∈ S(l, n) for some l. Then,
\[ b_{IJ} = (-1)^{v(N \setminus I,\, N \setminus J)} \, a_{N \setminus J,\, N \setminus I} \, |A|^{-1} , \tag{B.21} \]
where the function v is defined in (B.14).

If I and J are singleton sets, then (B.21) reduces to (B.20).
Suppose F ∈ F^{r×s}, and suppose F can be partitioned as
\[ F = \begin{bmatrix} A & B \\ C & D \end{bmatrix} , \tag{B.22} \]
where A is a nonsingular matrix of order n × n. Then the matrix
\[ G := D - C A^{-1} B \in \mathbf{F}^{(r-n) \times (s-n)} \tag{B.23} \]
is called the Schur complement of F with respect to A, and is sometimes denoted by F/A. The next result relates the minors of G to those of F.
Fact B.1.9 Suppose J ∈ S(t, m), K ∈ S(t, l). Then
\[ g_{JK} = |A|^{-1} f_{N \cup (\{n\}+J),\; N \cup (\{n\}+K)} , \tag{B.24} \]
where {n} + J denotes the set sum (thus if J = {j_1, …, j_t}, then {n} + J = {n + j_1, …, n + j_t}).

Proof. Observe that
\[ \begin{bmatrix} I & 0 \\ -C A^{-1} & I \end{bmatrix} F = \begin{bmatrix} A & B \\ 0 & G \end{bmatrix} =: E . \tag{B.25} \]
Now suppose P ∈ S(n + t, n + m), Q ∈ S(n + t, n + l) and that N is a subset of both P and Q. Consider the minor e_{PQ}. The rows in P of E are obtained by adding multiples of the first n rows of F to the rows in P of F. Thus, e_{PQ} = f_{PQ}, since a minor is unchanged by adding multiples of some rows to others. Observe that the matrix E_{PQ} is block-triangular, so that e_{PQ} = |A| \, e_{P \setminus N,\, Q \setminus N}. The identity (B.24) now follows by choosing P = N ∪ ({n} + J), Q = N ∪ ({n} + K). □
Fact B.1.10 Suppose A ∈ F^{n×n}, B ∈ F^{m×n}, |A| ≠ 0, and let G = BA^{−1}. Suppose J ∈ S(l, m), K ∈ S(l, n). Then |A| g_{JK} equals the determinant of the n × n matrix obtained from A by replacing the rows in K of A by the rows in J of B.
Remarks B.1.11 A simple example helps to illustrate the statement of the result. Suppose A ∈ F^{5×5}, B ∈ F^{3×5}, and let a_i, b_i denote the i-th rows of A and B, respectively. Each of these is a 1 × 5 row vector. Suppose J = (2, 3), K = (3, 5). Then the above result states that
\[ |A| \; g_{(2,3),(3,5)} = \begin{vmatrix} a_1 \\ a_2 \\ b_2 \\ a_4 \\ b_3 \end{vmatrix} . \tag{B.26} \]
Note that if B is a row vector, then the above fact reduces to Cramer's rule for solving linear equations.
Proof. Define
\[ F = \begin{bmatrix} A & I \\ B & 0 \end{bmatrix} . \tag{B.27} \]
Then G is the Schur complement of F with respect to A. By Fact B.1.9,
\[ |A| \; g_{JK} = f_{N \cup (\{n\}+J),\; N \cup (\{n\}+K)} = \begin{vmatrix} A & I_{\cdot K} \\ B_{J \cdot} & 0 \end{vmatrix} . \tag{B.28} \]
Note that each of the last l columns of the minor consists of all zeros except for a single 1. If we expand this minor about the last l columns using Laplace's expansion (Fact B.1.4), then the expansion consists of a single term, since the submatrix consisting of the last l columns has only one nonzero l × l minor. Thus, if K = {k_1, …, k_l}, then
\[ |A| \; g_{JK} = (-1)^{v} (-1)^{l} \begin{vmatrix} A_{(N \setminus K) \cdot} \\ B_{J \cdot} \end{vmatrix} , \tag{B.29} \]
where
\[ v = \sum_{i=1}^{l} (n + i) + \sum_{i=1}^{l} k_i . \tag{B.30} \]
Now the minor on the right side of (B.29) is not quite in the form stated in Fact B.1.10, since the rows of B occur below those of A, rather than substituting for them (as in Example (B.26), for instance). It is a matter of detail to verify that the parity factor (−1)^{v+l} accounts for this difference. □
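For singleton J = {j}, K = {k}, Fact B.1.10 can be checked exactly over the rationals: the entry g_{jk} of G = BA^{−1}, times |A|, equals the determinant of A with its k-th row replaced by the j-th row of B. A sketch using Fraction for exact arithmetic (illustrative code):

```python
from fractions import Fraction

def det(M):
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j+1:] for r in M[1:]])
               for j in range(len(M)))

def inverse(A):
    """A^{-1} over the rationals, via the cofactor formula (B.20)."""
    n, d = len(A), Fraction(det(A))
    def m(i, j):
        return det([r[:j] + r[j+1:] for k, r in enumerate(A) if k != i])
    return [[(-1) ** (i + j) * Fraction(m(j, i)) / d for j in range(n)]
            for i in range(n)]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]
B = [[1, 4, 2], [0, 1, 5]]
Ainv = inverse(A)
G = [[sum(Fraction(B[i][k]) * Ainv[k][j] for k in range(3)) for j in range(3)]
     for i in range(2)]
# Fact B.1.10 with J = {j}, K = {k}: |A| g_jk = det(A with row k replaced by row j of B)
for j in range(2):
    for k in range(3):
        repl = [B[j] if r == k else A[r] for r in range(3)]
        assert det(A) * G[j][k] == det(repl)
```

Taking B to be a single row recovers Cramer's rule, as noted in Remarks B.1.11.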
Fact B.1.12 Suppose A ∈ F^{n×n}, B ∈ F^{n×m}, |A| ≠ 0, and let G = A^{−1}B. Suppose J ∈ S(l, n), K ∈ S(l, m). Then |A| g_{JK} equals the determinant of the n × n matrix obtained from A by replacing the columns in J of A by the columns in K of B.

The proof is similar to that of Fact B.1.10 and is left to the reader.
B.2 CANONICAL FORMS
Let R be a principal ideal domain, and let F be the field of fractions associated with R. In this section, we study the sets R^{n×m} and F^{n×m}, consisting of n × m matrices whose elements belong to R and F, respectively. We prove the existence of two canonical forms, namely the Smith form on R^{n×m} and the Smith–McMillan form on F^{n×m}.

A matrix A ∈ R^{n×m} is a left associate of B ∈ R^{n×m} (denoted by A =_L B) if there is a unimodular matrix U ∈ R^{n×n} such that A = UB. A is a right associate of B (denoted by A =_R B) if there is a unimodular matrix V ∈ R^{m×m} such that A = BV. A is equivalent to B (denoted by A ∼ B) if there are unimodular matrices U ∈ R^{n×n}, V ∈ R^{m×m} such that A = UBV. It is left to the reader to verify that =_L, =_R, and ∼ are all equivalence relations.

We now commence our study of canonical forms.
Lemma B.2.1 Suppose a_1, …, a_n ∈ R, and let d_n be a GCD of this set of elements. Then there exists a matrix P_n ∈ R^{n×n} whose first row is [a_1 ⋯ a_n] and whose determinant is d_n.

Proof. The proof is by induction on n. If n = 2, by Theorem A.3.3 there exist p_1, p_2 ∈ R such that p_1 a_1 + p_2 a_2 = d_2. Now let
\[ P_2 = \begin{bmatrix} a_1 & a_2 \\ -p_2 & p_1 \end{bmatrix} . \tag{B.1} \]
For larger values of n, let d_{n−1} be a GCD of a_1, …, a_{n−1}. By the inductive hypothesis, we can construct a matrix P_{n−1} ∈ R^{(n−1)×(n−1)} whose first row is [a_1 ⋯ a_{n−1}] and whose determinant is d_{n−1}. By Fact B.1.3,
\[ \sum_{j=1}^{n-1} a_j (-1)^{1+j} m_{1j}(P_{n-1}) = d_{n-1} , \tag{B.2} \]
so that
\[ \sum_{j=1}^{n-1} \frac{a_j}{d_{n-1}} (-1)^{1+j} m_{1j}(P_{n-1}) = 1 . \tag{B.3} \]
For convenience, let z_j = a_j/d_{n−1}. Next, by Problem A.3.3, d_n is a GCD of d_{n−1} and a_n. By Theorem A.3.3, there exist x, y ∈ R such that x d_{n−1} + y a_n = d_n. Now define
\[ P_n = \left[ \begin{array}{c|c} P_{n-1} & \begin{matrix} a_n \\ 0 \\ \vdots \\ 0 \end{matrix} \\ \hline \begin{matrix} -y z_1 & \cdots & -y z_{n-1} \end{matrix} & x \end{array} \right] . \tag{B.4} \]
Expanding |P_n| about the last column and using (B.3) gives |P_n| = x d_{n−1} + y a_n = d_n. □
Theorem B.2.2 (Hermite Form) Every A ∈ R^{n×n} is a left associate of a matrix B that is lower triangular (i.e., b_{ij} = 0 for j > i).
Proof. Let d_n be a GCD of {a_{1n}, …, a_{nn}}, i.e., of the elements of the last column of A. By Theorem A.3.5, there exist elements p_1, …, p_n ∈ R such that
\[ \sum_{i=1}^{n} p_i a_{in} = d_n . \tag{B.5} \]
By Problem A.3.4, 1 is a GCD of the set p_1, …, p_n. Hence, by a slight variation of Lemma B.2.1, there exists a unimodular matrix U ∈ R^{n×n} whose last row is [p_1 ⋯ p_n]. Now (UA)_{nn} = d_n. Also, for i = 1, …, n−1, (UA)_{in} belongs to the ideal generated by a_{1n}, …, a_{nn} and is thus a multiple of d_n. Let z_i = (UA)_{in}/d_n for i = 1, …, n−1, and define
\[ U_n = \begin{bmatrix} 1 & 0 & \cdots & 0 & -z_1 \\ 0 & 1 & \cdots & 0 & -z_2 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -z_{n-1} \\ 0 & 0 & \cdots & 0 & 1 \end{bmatrix} . \tag{B.6} \]
Then U_n is unimodular since its determinant is 1. Moreover, U_n U A is of the form
\[ U_n U A = \left[ \begin{array}{c|c} A_{n-1} & \begin{matrix} 0 \\ \vdots \\ 0 \end{matrix} \\ \hline a_{n-1} & d_n \end{array} \right] , \tag{B.7} \]
where A_{n−1} ∈ R^{(n−1)×(n−1)} and a_{n−1} is a row vector. In other words, the last column of A has been reduced to zero above the diagonal by means of left multiplication by an appropriate unimodular matrix. One can now repeat the procedure with the (n−1) × (n−1) matrix A_{n−1}, and eventually arrive at a lower triangular matrix. For clarity, the next step of the algorithm is briefly outlined: Let d_{n−1} denote a GCD of the elements of the last column of A_{n−1}. As above, there exists a unimodular matrix Ū_{n−1} ∈ R^{(n−1)×(n−1)} such that
\[ \bar{U}_{n-1} A_{n-1} = \left[ \begin{array}{c|c} A_{n-2} & \begin{matrix} 0 \\ \vdots \\ 0 \end{matrix} \\ \hline a_{n-2} & d_{n-1} \end{array} \right] . \tag{B.8} \]
Now define the n × n unimodular matrix
\[ U_{n-1} = \mathrm{Block\ Diag}\,\{\bar{U}_{n-1}, 1\} . \tag{B.9} \]
Then,
\[ U_{n-1} U_n U A = \left[ \begin{array}{cc|c} A_{n-2} & 0 & 0 \\ a_{n-2} & d_{n-1} & 0 \\ \hline \multicolumn{2}{c|}{a_{n-1}} & d_n \end{array} \right] . \tag{B.10} \]
The rest of the proof is now obvious. □
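Over Z, the proof of Theorem B.2.2 is an algorithm: working from the last column backwards, clear each column above the diagonal by left-unimodular row operations. The sketch below uses pairwise Bezout steps in place of the full Lemma B.2.1 matrix — an equivalent way of reaching the GCD of a column — and is illustrative only:

```python
def ext_gcd(a, b):
    """Return (d, x, y) with x*a + y*b = d = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    d, x, y = ext_gcd(b, a % b)
    return d, y, x - (a // b) * y

def hermite_lower(A):
    """Reduce an integer matrix to lower-triangular form by left-unimodular
    row operations, mirroring the proof of Theorem B.2.2."""
    A = [row[:] for row in A]
    n = len(A)
    for j in range(n - 1, -1, -1):          # clear column j above the diagonal
        for i in range(j):
            a, b = A[j][j], A[i][j]
            if b == 0:
                continue
            d, x, y = ext_gcd(a, b)
            rj, ri = A[j][:], A[i][:]
            # the 2x2 block [[x, y], [-b/d, a/d]] has determinant 1, so it is unimodular
            A[j] = [x * u + y * v for u, v in zip(rj, ri)]           # pivot becomes gcd
            A[i] = [-(b // d) * u + (a // d) * v for u, v in zip(rj, ri)]  # entry zeroed
    return A

B = hermite_lower([[4, 7, 2], [3, 5, 1], [2, 2, 6]])
assert all(B[i][j] == 0 for i in range(3) for j in range(i + 1, 3))
```

Since every 2×2 block used has determinant 1, the determinant of the matrix is preserved exactly, and it ends up as the product of the diagonal entries of the result.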
Corollary B.2.3 Every matrix A ∈ R^{n×n} is a left associate of an upper triangular matrix C ∈ R^{n×n}.
Proof. In the proof of Theorem B.2.2, start with the first column of A instead of the last. □
Corollary B.2.4 Every matrix A ∈ R^{n×n} is a right associate of a lower (resp. upper) triangular matrix.
Proof. In the proof of Theorem B.2.2, start with the last (resp. first) row of A. □
For rectangular matrices, the following result holds:
Corollary B.2.5 Suppose A ∈ R^{n×m}. Then A is a left associate of a matrix of the form
\[ \begin{bmatrix} D \\ 0 \end{bmatrix} \ \text{if } n > m , \qquad [\, D \;\; E \,] \ \text{if } n < m , \tag{B.11} \]
where D can be chosen to be either lower or upper triangular.

It is left to the reader to state and prove the result analogous to Corollary B.2.5 concerning right associates of rectangular matrices.
Next, we start developing the Smith form. Recall that a matrix A ∈ R^{n×m} is said to have rank l if (i) there is an l × l submatrix of A with nonzero determinant, and (ii) every (l+1) × (l+1) minor of A is zero. An elementary row operation on the matrix A consists of one of the following: (i) interchanging two rows of A, or (ii) adding a multiple of one row to another. An elementary column operation is similarly defined. It is easy to see that an elementary row (column) operation on a matrix A can be accomplished by multiplying A on the left (right) by a unimodular matrix. Thus, a matrix obtained from A by elementary row and column operations is equivalent to A.

Now suppose A ∈ R^{n×m} is a (left or right) multiple of B ∈ R^{n×m}, and let b_l denote a GCD of all l × l minors of B. Then it follows from the Binet–Cauchy formula (Fact B.1.5) that b_l divides all l × l minors of A. Thus, if a_l denotes a GCD of all l × l minors of A, we see that b_l divides a_l.
From this, it follows that if A and B are equivalent matrices, then a_l and b_l are associates, and A and B have the same rank.
Theorem B.2.6 (Smith Form) Suppose A ∈ R^{n×m} has rank l. Then A is equivalent to a matrix H ∈ R^{n×m} of the form
\[ H = \begin{bmatrix} \mathrm{Diag}\,\{h_1, \ldots, h_l\} & 0 \\ 0 & 0 \end{bmatrix} , \tag{B.12} \]
where h_i divides h_{i+1} for i = 1, …, l−1. Moreover, h_1 ⋯ h_i is a GCD of all i × i minors of A, and the h_i are unique to within multiplication by a unit.
Remarks B.2.7 h_1, …, h_l are called the invariant factors of A.
Proof. Since A has rank l, it contains an l × l submatrix with nonzero determinant. By elementary row and column operations, this submatrix can be brought to the upper left-hand corner of A.

As in the proof of Theorem B.2.2, there is a unimodular matrix U ∈ R^{n×n} such that (UA)_{11} is a GCD of all elements in the first column of A. By elementary row operations, the first column of UA can be made to contain all zeros except in the (1,1)-position. Call the resulting matrix Ā. If ā_{11} divides all elements of the first row of Ā, then all elements of the first row (except in the (1,1) position) of ĀV can be made to equal zero by a suitable choice of a unimodular matrix V of the form
\[ V = \begin{bmatrix} 1 & v_{12} & v_{13} & \cdots & v_{1m} \\ & 1 & 0 & \cdots & 0 \\ & & 1 & \cdots & 0 \\ & & & \ddots & \vdots \\ & & & & 1 \end{bmatrix} . \tag{B.13} \]
Moreover, the first column of ĀV will continue to have zeros except in the (1,1) position. On the other hand, if ā_{11} does not divide all elements in the first row of Ā, we can choose a unimodular matrix V so that (ĀV)_{11} is a GCD of the elements of the first row of Ā. If this is done, the first column of ĀV may no longer contain zeros. In such a case, we repeat the above row and column operations. This process cannot continue indefinitely, because the original element a_{11} has only a finite number of prime factors (see Fact A.3.7) and each successive corner element is a proper (i.e., nonassociate) divisor of its predecessor. Thus, in a finite number of steps, we arrive at a matrix B equivalent to A such that b_{11} divides all elements of the first row as well as the first column of B. By elementary row and column operations, all of the elements of the first row and column of B, other than b_{11}, can be made equal to zero. Thus,
\[ A \sim B \sim \begin{bmatrix} b_{11} & 0 \\ 0 & B_1 \end{bmatrix} . \tag{B.14} \]
By proceeding to the first row and column of B_1 (i.e., the second row and column of B) and then repeating the procedure, we will eventually have
\[ A \sim \begin{bmatrix} \mathrm{Diag}\,\{d_1, \ldots, d_l\} & 0 \\ 0 & M \end{bmatrix} . \tag{B.15} \]
Now, d_i ≠ 0 for all i, since these are all divisors of the elements of the first l rows and columns of A, and none of these rows nor columns is identically zero. Next, M must be zero; otherwise A is equivalent to a matrix of rank at least l + 1, which contradicts the fact that the rank of A is l.

Thus far, we have shown that
\[ A \sim \begin{bmatrix} \mathrm{Diag}\,\{d_1, \ldots, d_l\} & 0 \\ 0 & 0 \end{bmatrix} . \tag{B.16} \]
Let D = Diag{d_1, …, d_l}. We will show that D ∼ Diag{h_1, …, h_l}, where h_i divides h_{i+1} for i = 1, …, l−1. This is enough to prove the theorem. By adding columns 2 through l to the first column of D, it follows that
\[ D \sim \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ d_2 & d_2 & & 0 \\ \vdots & & \ddots & \\ d_l & 0 & & d_l \end{bmatrix} . \tag{B.17} \]
By multiplying the latter matrix on the left by an appropriate unimodular matrix, we get another matrix E whose (1,1)-element is a GCD of {d_1, …, d_l} and whose first column is zero otherwise. Let h_1 := a GCD of {d_1, …, d_l}. Then,
\[ E = \begin{bmatrix} h_1 & \\ 0 & \\ \vdots & \bar{E} \\ 0 & \end{bmatrix} , \tag{B.18} \]
where Ē denotes the last l − 1 columns of E.
Every element of Ē is in the ideal generated by d_2, …, d_l, and is thus a multiple of h_1. Since the first row of Ē is a multiple of h_1, it follows that
\[ E \sim \begin{bmatrix} h_1 & 0 \\ 0 & E_1 \end{bmatrix} , \tag{B.19} \]
where every element of E_1 is a multiple of h_1. Now, by the preceding paragraph, E_1 ∼ Diag{g_2, …, g_l}, where h_1 divides g_i for all i. Now let h_2 be a GCD of g_2, …, g_l. Then clearly h_1 divides h_2. Moreover,
\[ E_1 \sim \begin{bmatrix} h_2 & 0 \\ 0 & E_2 \end{bmatrix} . \tag{B.20} \]
Repeating this procedure, we finally get D ∼ Diag{h_1, …, h_l}, where h_i divides h_{i+1} for all i. This completes the proof of (B.12).

The divisibility conditions on the h_i imply that the product h_1 ⋯ h_i is a GCD of all i × i minors of H. Since A and H are equivalent, this product is also a GCD of all i × i minors of A.
Now suppose A is also equivalent to another matrix
\[ G = \begin{bmatrix} \mathrm{Diag}\,\{g_1, \ldots, g_l\} & 0 \\ 0 & 0 \end{bmatrix} , \tag{B.21} \]
where g_i divides g_{i+1} for all i. Then the reasoning of the preceding paragraph shows that the product h_1 ⋯ h_i is an associate of the product g_1 ⋯ g_i for all i. Now, if h_1 and g_1 are associates, and h_1 h_2 and g_1 g_2 are associates, then h_2 and g_2 are associates. Reasoning inductively in this fashion, we conclude that h_i and g_i are associates for all i. □
Corollary B.2.8 Two matrices A, B ∈ R^{n×m} are equivalent if and only if their invariant factors are associates.
Let F be the field of fractions associated with R. We now begin a study of the set F^{n×m} of matrices with elements from F and show the existence of a canonical form known as the Smith–McMillan form.
Theorem B.2.9 (Smith–McMillan Form) Suppose F ∈ F^{n×m} has rank l. Then there exist unimodular matrices U ∈ R^{n×n}, V ∈ R^{m×m} such that
\[ U F V = \begin{bmatrix} \mathrm{Diag}\,\{a_1/b_1, \ldots, a_l/b_l\} & 0 \\ 0 & 0 \end{bmatrix} , \tag{B.22} \]
where a_i, b_i are coprime for all i; a_i divides a_{i+1} and b_{i+1} divides b_i for i = 1, …, l−1; and b_1 is a least common multiple of the denominators of all elements of F, expressed in reduced form.³
Proof. Let y denote an l.c.m. of the denominators of all elements of F, expressed in reduced form. Then yF ∈ R^{n×m}, and the rank of yF is also l. By Theorem B.2.6, there exist unimodular matrices U ∈ R^{n×n}, V ∈ R^{m×m} such that
\[ U (yF) V = \begin{bmatrix} \mathrm{Diag}\,\{h_1, \ldots, h_l\} & 0 \\ 0 & 0 \end{bmatrix} , \tag{B.23} \]
where h_i divides h_{i+1} for all i. Hence,
\[ U F V = \begin{bmatrix} \mathrm{Diag}\,\{h_1/y, \ldots, h_l/y\} & 0 \\ 0 & 0 \end{bmatrix} . \tag{B.24} \]
Let a_i/b_i be a reduced form for the fraction h_i/y, for i = 1, …, l. Since h_i | h_{i+1}, let h_{i+1} = h_i r_i, where r_i ∈ R. Let [·] denote the reduced form of a fraction. Then
\[ \frac{a_{i+1}}{b_{i+1}} = \left[ \frac{h_{i+1}}{y} \right] = \left[ \frac{h_i r_i}{y} \right] = \left[ \frac{a_i r_i}{b_i} \right] = a_i \left[ \frac{r_i}{b_i} \right] , \tag{B.25} \]
where in the last step we used the fact that a_i, b_i are coprime. Now (B.25) implies that a_i | a_{i+1} and b_{i+1} | b_i for all i.

³ See Fact A.3.6 for the definition of a reduced form.
Finally, to show that b_1 is an associate of y, it is enough to show that h_1 and y are coprime, since a_1/b_1 is a reduced form of h_1/y. This is most easily done using prime factorizations. Let p_{ij}/q_{ij} be a reduced form of the element f_{ij}, for all i, j. Then y is an l.c.m. of all the q_{ij}, and h_1 is a GCD of all y f_{ij} = y p_{ij}/q_{ij} for all i, j. Suppose t is a prime factor of y, of multiplicity α. (By this we mean that t is a prime, t^α divides y but t^{α+1} does not.) Then, from Fact A.3.10, t must be a prime divisor, of multiplicity α, of some q_{ij}. The corresponding p_{ij} is not divisible by t, since p_{ij}, q_{ij} are coprime (see Corollary A.3.9). As a result, the corresponding term y f_{ij} = y p_{ij}/q_{ij} is also not divisible by t (since t is a factor of multiplicity α of both y p_{ij} and q_{ij}). Thus, h_1, being a GCD of all the y f_{ij}, is also not divisible by t (see Fact A.3.8). Since this is true of every prime factor of y, it follows from Corollary A.3.9 that y and h_1 are coprime. □
APPENDIX C
Topological Preliminaries
In this appendix, a few basic concepts from topology are introduced. For greater detail, the reader is
referred to [56] or [89].
C.1 TOPOLOGICAL SPACES
This section contains a brief introduction to topological spaces.
Definition C.1.1 Let S be a set. A collection T of subsets of S is a topology if¹

(TOP1) Both S and ∅ (the empty set) belong to T.

(TOP2) A finite intersection of sets in T again belongs to T.

(TOP3) An arbitrary union of sets in T again belongs to T.

The ordered pair (S, T) is called a topological space, and subsets of S belonging to T are said to be open. A subset of S is closed if its complement in S is open.
Examples C.1.2 Let S be any set, and let T_1 consist of just the two sets S and ∅. Then (S, T_1) is a topological space. T_1 is referred to as the trivial topology on S. Let T_2 consist of all subsets of S. Then (S, T_2) is also a topological space. T_2 is referred to as the discrete topology on S.

Suppose S is a set, and T_1, T_2 are topologies on S. Then T_1 is weaker than T_2 (and T_2 is stronger than T_1) if T_1 is a subset of T_2, i.e., every set that is open in the topology T_1 is also open in the topology T_2. It is obvious that, for any set S, the trivial topology and the discrete topology are, respectively, the weakest and strongest topologies that can be defined on S.
To give interesting and useful examples of topological spaces, the notion of a base is introduced. To motivate this notion, recall the familiar definition of an open subset of the real line: A subset U of R is open if and only if, corresponding to every x ∈ U, there is a number ε > 0 such that the interval (x − ε, x + ε) is also contained in U. The concept of a base is an abstraction of this idea, whereby intervals of the form (x − ε, x + ε) are replaced by more general sets satisfying appropriate axioms.

Let S be a set, and let B be a collection of subsets of S that satisfies two axioms:

(B1) The sets in B cover S (i.e., the union of the sets in B is S).

(B2) Whenever B_1, B_2 are sets in B with a nonempty intersection and x ∈ B_1 ∩ B_2, there exists a B(x) in B such that x ∈ B(x) ⊆ (B_1 ∩ B_2).

¹ Throughout this section, upper case italic letters denote collections of sets, while bold face letters denote sets.
Using this collection B, another collection T of subsets of S is defined as follows: A subset U of S belongs to T if and only if, for every x ∈ U, there is a set B(x) ∈ B such that x ∈ B(x) ⊆ U.
Fact C.1.3 A subset U of S belongs to the collection T if and only if it is a union of sets in B. T
is a topology on S. Moreover, T is the weakest topology on S containing all sets in B.
Remarks C.1.4 B is referred to as a base for the topology T , and T is the topology generated by the
base B.
Example C.1.5 To clarify Fact C.1.3, consider the set R^n, consisting of n-tuples of real numbers. Let B denote the collection of balls B(x, ε) of the form
\[ B(x, \varepsilon) = \{ y \in R^n : \|x - y\| < \varepsilon \} \tag{C.1} \]
as x varies over R^n and ε varies over the positive numbers. Here ‖·‖ denotes the usual Euclidean (or any other) norm on R^n. It is a straightforward matter to verify that this collection of balls satisfies axioms (B1) and (B2). Hence, this collection forms a base for a topology on R^n, in which a set U ⊆ R^n is open if and only if, for every x ∈ U, there is a ball B(x, ε) ⊆ U. This coincides with the usual definition of open sets on R^n. The one extra bit of information that comes out of Fact C.1.3 is that a set is open if and only if it is a union of balls.
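On a finite set, the construction underlying Fact C.1.3 can be carried out exhaustively: verify (B1) and (B2) for a candidate base, form all unions of base sets, and check the topology axioms for the result. A sketch with an arbitrary illustrative base:

```python
from itertools import combinations, chain

S = frozenset({1, 2, 3})
base = [frozenset({1}), frozenset({2}), frozenset({1, 2}), frozenset({2, 3})]

# (B1): the sets in the base cover S
assert frozenset().union(*base) == S
# (B2): each point of a nonempty intersection lies in a base set inside it
for B1, B2 in combinations(base, 2):
    for x in B1 & B2:
        assert any(x in B and B <= (B1 & B2) for B in base)

# T = all unions of subcollections of the base (the empty union gives the empty set)
T = {frozenset(chain.from_iterable(sub))
     for r in range(len(base) + 1) for sub in combinations(base, r)}

assert frozenset() in T and S in T                      # (TOP1)
assert all(U & V in T for U in T for V in T)            # (TOP2), pairwise
assert all(U | V in T for U in T for V in T)            # (TOP3), pairwise
```

The resulting T here has six members: ∅, {1}, {2}, {1, 2}, {2, 3}, and S itself, matching the first sentence of Fact C.1.3 that open sets are exactly unions of base sets.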
Proof of Fact C.1.3. To prove the first sentence, suppose first that U is a union of sets in B; it is shown that U belongs to the collection T. Specifically, suppose U = ∪_{i∈I} B_i, where I is an index set and B_i is in the collection B for all i ∈ I. Let x be an arbitrary element of U. Then x ∈ B_i for some i ∈ I, and for this i we have x ∈ B_i ⊆ U. Hence, by the definition of T, U is in T. Conversely, suppose U is in T. Then for every x ∈ U there is a B(x) in B such that x ∈ B(x) ⊆ U. It is now claimed that U = ∪_{x∈U} B(x), which would show that U is a union of sets from B. To prove the claim (and thereby complete the proof of the first sentence), note that U ⊇ ∪_{x∈U} B(x) since each B(x) ⊆ U; conversely, since every x ∈ U also belongs to the corresponding B(x), it follows that U ⊆ ∪_{x∈U} B(x). Hence, the two sets are equal.
The second sentence is proved by verifying that T satisfies the three axioms (TOP1)–(TOP3).
First, the empty set ∅ belongs to T, since ∅ vacuously satisfies the defining condition for a set to belong to T.
Next, since S is the union of all sets in B, it is also in T. Hence, T satisfies (TOP1).
To establish (TOP2), it is enough to show that a nonempty intersection of two sets in T again
belongs to T; it will then follow by induction that every finite intersection of sets in T again belongs to
T. Accordingly, suppose U, V are in T and let x be any element of U ∩ V; we will show the existence
of a B(x) in B such that x ∈ B(x) ⊆ (U ∩ V), which in turn will establish that the intersection
U ∩ V is open. Since x ∈ U and x ∈ V, there exist B₁(x), B₂(x) in B such that x ∈ B₁(x) ⊆ U, x ∈
B₂(x) ⊆ V. Now, by axiom (B2), there is a B(x) in B such that x ∈ B(x) ⊆ (B₁(x) ∩ B₂(x)), which
in turn is contained in U ∩ V. Thus, T satisfies (TOP2).
To establish (TOP3), suppose {Uᵢ, i ∈ I} is a collection of sets in T. Each Uᵢ is a union of sets
belonging to B, whence their union ∪_{i∈I} Uᵢ is also a union of sets belonging to B. Thus, ∪_{i∈I} Uᵢ
is also in T.
To prove the last sentence, let T₁ be another topology on S such that every set in B is contained
in T₁. Since T₁ is a topology, arbitrary unions of sets in B again belong to T₁. Since every set in T
can be expressed as a union of sets in B, it follows that every set in T is in T₁, i.e., T is weaker than
T₁. □
A topological space (S, T) is first-countable if T has the following property: For every x ∈ S,
there is a countable collection of open sets {Bᵢ(x), i ∈ Z₊}, each containing x, such that every open
set containing x also contains some Bᵢ(x). In view of Fact C.1.3, the collection {Bᵢ(x) : x ∈ S, i ∈ Z₊} is a
base for the topology T. Since the set S may be uncountable, the base for the topology may also
be uncountable. However, in a first-countable topology, the collection of open sets containing each
particular point has a countable base.
For example, consider the set Rⁿ together with the topology of Example C.1.5. For a fixed
x ∈ Rⁿ, the collection of balls B(x, 1/m), m an integer ≥ 1, is countable; moreover, every open
set containing x also contains at least one of the balls B(x, 1/m). Hence, the topological space of
Example C.1.5 is first-countable.
A very general class of first-countable topological spaces is that of metric spaces, which are
defined next.
Definition C.1.6 A metric space (S, ρ) is a set S, together with a function ρ : S × S → R satisfying
the following axioms:

(M1) ρ(y, x) = ρ(x, y) ∀ x, y ∈ S.
(M2) ρ(x, y) ≥ 0 ∀ x, y ∈ S; ρ(x, y) = 0 ⟺ x = y.
(M3) ρ(x, z) ≤ ρ(x, y) + ρ(y, z) ∀ x, y, z ∈ S.

If (S, ρ) is a metric space, then there is a natural topology that can be defined on S. Let
B(x, ε) denote the ball

B(x, ε) = {y ∈ S : ρ(x, y) < ε} .   (C.2)

Then the collection of sets B(x, ε), as x varies over S and ε varies over all positive numbers, is a base
for a topology T on S. T is referred to as the topology on S induced by the metric ρ. In this topology,
a set U ⊆ S is open if and only if, for every x ∈ U, there is a ball B(x, ε) ⊆ U.
Every metric space is first-countable as a topological space: For a fixed x, consider the countable
collection of sets B(x, 1/m), m an integer ≥ 1. Then every open set containing x also contains at
least one of the sets B(x, 1/m).
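As a quick sanity check, the metric axioms (M1)–(M3) can be verified numerically for the Euclidean metric on a finite sample of points; a small sketch (the sampling scheme is illustrative and, of course, no substitute for a proof):

```python
import math
import random

def rho(x, y):
    # Euclidean metric on R^2
    return math.sqrt((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2)

random.seed(0)
pts = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(50)]

for x in pts:
    for y in pts:
        assert rho(x, y) == rho(y, x)            # (M1) symmetry
        assert rho(x, y) >= 0                    # (M2) nonnegativity
        assert (rho(x, y) == 0) == (x == y)      # (M2) definiteness
        for z in pts[:10]:
            # (M3) triangle inequality (tolerance for float rounding)
            assert rho(x, z) <= rho(x, y) + rho(y, z) + 1e-12
```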
The question of convergence of sequences in topological spaces is examined next. Suppose
(S, T) is a topological space. A set N is said to be a neighborhood of x ∈ S if x ∈ N, and N contains
a set in T that contains x. In other words, a set N is a neighborhood of x if it contains an open
set containing x. Note that a neighborhood itself need not be open. A sequence {xᵢ} in S is said
to converge to x ∈ S if every neighborhood of x contains all but a finite number of terms of the
sequence {xᵢ}.
Example C.1.7 Consider the set Rⁿ together with the topology defined in Example C.1.5. Then
a sequence {xᵢ} converges to x if and only if, for every ε > 0, there is a number N such that
xᵢ ∈ B(x, ε) ∀ i ≥ N. This is the familiar notion of convergence in Rⁿ.
Example C.1.8 Consider a set S together with the discrete topology of Example C.1.2. Let {xᵢ}
be a sequence in S converging to x. Since the singleton set {x} is also a neighborhood of x, it follows
that xᵢ = x for all except a finite number of values of i.
Example C.1.9 Consider a set S together with the trivial topology of Example C.1.2. Then S is
the only neighborhood of any x ∈ S. As a result, every sequence {xᵢ} converges to every x ∈ S.
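The contrast between Examples C.1.7 and C.1.8 can be made concrete: the sequence xᵢ = 1/i converges to 0 in the usual metric topology on R, but not in the discrete topology, where convergence forces the sequence to be eventually constant. A sketch under these assumptions (the helpers below test the metric and discrete criteria derived in the examples on a finite prefix of the sequence; they are not a general topology engine):

```python
def converges_metric(seq, x, eps_list=(1.0, 0.1, 0.01)):
    # Metric-topology criterion (Example C.1.7), checked on a finite
    # prefix: for each eps, all but finitely many terms lie in B(x, eps).
    return all(
        any(all(abs(t - x) < eps for t in seq[n:]) for n in range(len(seq)))
        for eps in eps_list)

def converges_discrete(seq, x):
    # Discrete-topology criterion (Example C.1.8): {x} is itself a
    # neighborhood of x, so the sequence must eventually equal x.
    return any(all(t == x for t in seq[n:]) for n in range(len(seq)))

seq = [1.0 / i for i in range(1, 200)]
assert converges_metric(seq, 0.0)        # converges in the usual topology
assert not converges_discrete(seq, 0.0)  # but not in the discrete topology
assert converges_discrete([1, 2, 3, 3, 3], 3)
```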
The preceding two examples show that the meaning of convergence is very much dependent
on the particular topology defined on a set. Moreover, in general, a sequence does not necessarily
have a unique limit. To address the latter problem, we define Hausdorff topologies. A topology T
on a set S is Hausdorff if distinct points have disjoint neighborhoods. It is easy to verify that if T is a
Hausdorff topology on a set S, then the limit of a sequence in S is unique if it exists. Also, if (S, ρ)
is a metric space, then the topology induced by the metric ρ is Hausdorff.
Let S be a set, and let T₁, T₂ be two topologies on S, with T₁ weaker than T₂. Then, whenever
a sequence {xᵢ} in S converges to x in the topology T₂, it also converges to x in the topology
T₁. However, it is possible that a sequence may converge in T₁ but not in T₂. This shows that
convergence in T₁ is in general a weaker requirement than convergence in T₂ (and helps to explain
the terminology).
Recall that a set in S is closed if its complement is open. One of the desirable features of first-
countable topologies is that one can give an alternate, and very useful, characterization of closed sets.
Suppose U is a set in S. An element x ∈ S is said to be a limit point (or a cluster point, accumulation
point) of U if every neighborhood of x contains an element of U other than x. It can be shown that
a set is closed if and only if it contains all of its limit points. Now suppose (S, T) is a first-countable
topological space. Then, for each x ∈ S, there is a countable collection of open sets {Bᵢ(x), i ∈ Z₊}
such that every neighborhood of x contains at least one of the Bᵢ(x). As a result, x is a limit point
of a set U if and only if there exists a sequence {xᵢ} in U converging to x such that xᵢ ≠ x ∀ i. Based
on this, one can show that U is closed if and only if every convergent sequence {xᵢ} in U has a limit
in U (see Problem C.1.3).
Now we come to the notion of continuity. Suppose (S₁, T₁), (S₂, T₂) are topological spaces,
and f is a function mapping S₁ into S₂. Given any subset U of S₂, its preimage under f is the subset
of S₁ denoted by f⁻¹(U) and defined by

f⁻¹(U) = {x ∈ S₁ : f(x) ∈ U} .   (C.3)
The function f is continuous at x ∈ S₁ if, whenever U is an open subset of S₂ containing f(x),
its preimage f⁻¹(U) is an open subset of S₁ (containing x). f is continuous if it is continuous at
all x ∈ S₁. Clearly, whether or not f is continuous is very much dependent on the topologies T₁
and T₂. For example, if T₁ is the discrete topology, then every function f : S₁ → S₂ is continuous.
Also, one can easily verify the following: If f is continuous at x ∈ S₁ and N is a neighborhood of
f(x), then f⁻¹(N) is a neighborhood of x (see Problem C.1.6). The converse is also true, but more
difficult to prove: If f⁻¹(N) is a neighborhood of x whenever N is a neighborhood of f(x), then f
is continuous at x. The proof can be found in [56].
Fact C.1.10 Suppose (S₁, T₁), (S₂, T₂) are topological spaces, and f : S₁ → S₂ is continuous at
x ∈ S₁. Then, whenever {xᵢ} is a sequence in S₁ converging to x, the sequence {f(xᵢ)} converges to
f(x).
Proof. Let N be any neighborhood of f(x). Since f is continuous at x, f⁻¹(N) is a neighborhood
of x. Since {xᵢ} converges to x, f⁻¹(N) contains all but a finite number of terms of the sequence
{xᵢ}. This is the same as saying that f(xᵢ) ∈ N for all but a finite number of values of i. Since this
is true for every neighborhood of f(x), we conclude that {f(xᵢ)} converges to f(x). □
If the topologies T₁, T₂ are first-countable, then the converse of Fact C.1.10 is also true: If
{f(xᵢ)} converges to f(x) for all sequences {xᵢ} converging to x, then f is continuous at x.
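Fact C.1.10 is easy to observe numerically in the metric setting of Example C.1.5: push a convergent sequence through a continuous function and the images converge to the image of the limit. A small illustration (the function f and the particular sequence are chosen arbitrarily):

```python
import math

f = math.exp              # a continuous function on R
x = 1.0
seq = [x + (-1) ** i / (i + 1) for i in range(1, 2000)]  # x_i -> x

# By Fact C.1.10, |f(x_i) - f(x)| shrinks as x_i -> x.
errors = [abs(f(t) - f(x)) for t in seq]
assert abs(seq[-1] - x) < 1e-3
assert errors[-1] < 1e-2
assert errors[-1] < errors[0]
```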
PROBLEMS
C.1.1. Let (S, ρ) be a metric space, and let B(x, ε) be defined by (C.2). Show that the collection
of sets B(x, ε) is a base for a topology on S by verifying that axiom (B1) is satisfied.
C.1.2. Let S be any set, and define ρ : S × S → R by ρ(x, y) = 0 if x = y, 1 if x ≠ y.
(i) Verify that (S, ρ) is a metric space.
(ii) Show that the topology on S induced by ρ is the discrete topology.
C.1.3. Let (S, T) be a first-countable topological space, and let U be a subset of S. Show that U
is closed if and only if the following is true: xᵢ ∈ U, {xᵢ} converges to x implies that x ∈ U
(Hint: either x = xᵢ for some i, or else x is a limit point of U).
C.1.4. Suppose (S, ρ) is a metric space, and define

d(x, y) = ρ(x, y) / (1 + ρ(x, y)) .

(i) Show that (S, d) is also a metric space.
(ii) Show that ρ and d induce the same topology on S.
C.1.5. Show that, if (S, T ) is a topological space and T is a Hausdorff topology, then a sequence
in S can converge to at most one point.
C.1.6. Suppose (S₁, T₁), (S₂, T₂) are topological spaces, and f : S₁ → S₂ is continuous at x ∈ S₁.
Show that, if N is a neighborhood of f(x), then f⁻¹(N) is a neighborhood of x.
C.2 TOPOLOGICAL RINGS AND NORMED ALGEBRAS
This section contains a brief introduction to the subject of topological rings. Roughly speaking, a
topological ring is a ring, together with a topology, such that subtraction and multiplication are
continuous operations with respect to the topology. Before talking about topological rings as such,
it is necessary to define the product topology on the cartesian product of topological spaces.
Recall that the cartesian product S₁ × S₂ of two sets S₁ and S₂ consists of all ordered pairs
(x, y) where x ∈ S₁ and y ∈ S₂. Now suppose T₁, T₂ are topologies on S₁, S₂, respectively. The
objective is to define a topology on the product set S = S₁ × S₂. This is done in the following way:
Let B denote the collection of subsets of S of the form U × V where U ∈ T₁, V ∈ T₂. In other words,
B is the collection of subsets of S formed by taking cartesian products of an open subset of S₁ and
an open subset of S₂. It can be shown that B is a base for a topology on S (see Problem C.2.1),
which is referred to as the product topology on S.
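The base axioms for this collection of "rectangles" can be checked directly on a small finite example, since the intersection of two rectangles (U₁ × V₁) ∩ (U₂ × V₂) = (U₁ ∩ U₂) × (V₁ ∩ V₂) is again a rectangle. A sketch with assumed sample topologies (the two small spaces are illustrative):

```python
from itertools import product

# Assumed data: open sets of two small topological spaces.
T1 = [set(), {0}, {1}, {0, 1}]       # discrete topology on {0, 1}
T2 = [set(), {"a", "b"}]             # trivial topology on {"a", "b"}

# Candidate base for the product topology: rectangles U x V.
base = [set(product(U, V)) for U in T1 for V in T2]

S = set(product({0, 1}, {"a", "b"}))
# (B1): the rectangles cover S1 x S2.
assert set().union(*base) == S
# (B2) holds trivially here: the intersection of two rectangles
# is again a rectangle, hence again a member of the base.
for R1 in base:
    for R2 in base:
        assert (R1 & R2) in base
```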
Now suppose Sᵢ, i = 1, 2, 3 are topological spaces, with the topologies Tᵢ, i = 1, 2, 3. Let
S = S₁ × S₂, and let T denote the product topology on S. Suppose f is a function mapping S into
S₃. By definition, f is continuous if f⁻¹(W) is an open subset of S whenever W is an open subset of
S₃. Now suppose f is continuous, that (x, y) ∈ S, and let f(x, y) =: z ∈ S₃. Since f is continuous,
it follows that, whenever N is a neighborhood of z, f⁻¹(N) is a neighborhood of (x, y). Recalling
the definition of the product topology on the set S, we see that, whenever N is a neighborhood of
z, there exist neighborhoods N₁ of x and N₂ of y such that f(x₁, y₁) ∈ N ∀ x₁ ∈ N₁, y₁ ∈ N₂.
In particular, f(x, y₁) ∈ N ∀ y₁ ∈ N₂, and f(x₁, y) ∈ N ∀ x₁ ∈ N₁. The conclusion is that if f is a
continuous function, then for each x ∈ S₁ the function f(x, ·) : S₂ → S₃ is continuous; similarly,
for each y ∈ S₂, the function f(·, y) : S₁ → S₃ is continuous.
Now we come to topological rings.
Definition C.2.1 Suppose R is a ring and T is a topology on R. The pair (R, T) is a topological
ring if the functions (x, y) ↦ x − y and (x, y) ↦ xy are continuous functions from R × R into
R, when R × R is given the product topology.
Several facts are immediate consequences of the above definition. First, if the function
(x, y) ↦ x − y is continuous, then for each fixed x, the function y ↦ x − y is a continuous function
from R into R. In particular, taking x = 0, it follows that the function y ↦ −y is continuous. Next,
since compositions of continuous functions are again continuous, the function (x, y) ↦ x + y =
x − (−y) is also continuous. Similarly, the function (x, y) ↦ y − x = −(x − y) is also continuous.
Given a subset U ⊆ R and an element x ∈ R, let x + U denote the set defined by

x + U = {x + y : y ∈ U} .   (C.1)

One can think of x + U as the set U translated by x. Now suppose U is an open subset of R, and
define the function f_x : R → R by f_x(y) = y − x. As seen earlier, f_x is continuous. Hence, the
set f_x⁻¹(U) is open. But clearly this set equals x + U. Thus, we have shown that if U is open, so is
x + U for all x ∈ R. In other words, the topology on a topological ring is translation-invariant, in
the sense that translates of open sets are again open. A consequence of this is that, once we know all
the open sets containing 0, we know all open sets.
We now speak briefly about normed algebras. Suppose A is a linear vector space over a field
F. This means that there is a concept of addition between members of A, and scalar multiplication
between an element of F and an element of A. Now A is an algebra (over F) if, in addition, one
can define a product of two elements of A in such a way that A becomes a ring. Thus, an algebra A
has associated with it a field F, and three operations: addition between elements of A, multiplication
between two elements of A, and multiplication between an element of F and an element of A. A is a
ring with respect to addition and multiplication, and A is a linear vector space with respect to addition
and scalar multiplication. For example, if F is any field, the set of matrices Fⁿˣⁿ is an algebra over F.
Now suppose A is an algebra over R (the real numbers) or C (the complex numbers). Then
(A, ‖·‖) is a normed algebra if one can define a norm ‖·‖ on A such that

(NA1) ‖a‖ ≥ 0 ∀ a ∈ A; ‖a‖ = 0 ⟺ a = 0,
(NA2) ‖αa‖ = |α| ‖a‖ ∀ α ∈ R (or C), ∀ a ∈ A,
(NA3) ‖a + b‖ ≤ ‖a‖ + ‖b‖ ∀ a, b ∈ A,
(NA4) ‖ab‖ ≤ ‖a‖ ‖b‖ ∀ a, b ∈ A.

Axioms (NA1)–(NA3) are just the usual requirements for (A, ‖·‖) to be a normed space,
and (NA4) is the additional requirement for it to be a normed algebra.
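The matrix algebra Rⁿˣⁿ with an induced (operator) norm satisfies (NA4), as Problem C.2.2 asks the reader to show. The sketch below spot-checks submultiplicativity for the norm induced by the vector 1-norm, which has the closed form ‖A‖ = maximum absolute column sum (the sample matrices are arbitrary):

```python
def mat_mul(A, B):
    # Product of two square matrices given as lists of rows.
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def norm1(A):
    # Operator norm induced by the vector 1-norm: max column abs sum.
    n = len(A)
    return max(sum(abs(A[i][j]) for i in range(n)) for j in range(n))

A = [[1.0, -2.0], [3.0, 0.5]]
B = [[0.0, 4.0], [-1.0, 2.0]]
# (NA4): ||AB|| <= ||A|| ||B||
assert norm1(mat_mul(A, B)) <= norm1(A) * norm1(B)
```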
Let (A, ‖·‖) be a normed algebra, and let B denote the collection of balls

B(x, ε) = {y ∈ A : ‖x − y‖ < ε} .   (C.2)

Then B is a base for a topology T on A. This is the same as the topology induced by the metric
ρ(x, y) = ‖x − y‖.
It is now shown that (A, T) is a topological ring. In order to do this, it is necessary to show that
the functions (x, y) ↦ x − y and (x, y) ↦ xy are both continuous. Let us begin with subtraction.
Suppose x, y ∈ A, z = x − y, and let S ⊆ A be an open set containing z. We must show that the
set {(x, y) : x − y ∈ S} =: V is an open subset of A × A. This is done by showing that, whenever
(x, y) ∈ V, there are balls B(x, δ), B(y, δ) such that B := B(x, δ) × B(y, δ) is contained in V. Since
S is open and x − y ∈ S, by definition there is an ε > 0 such that B(x − y, ε) ⊆ S. Now let δ = ε/2,
and suppose x₁ ∈ B(x, δ), y₁ ∈ B(y, δ). Then

‖(x₁ − y₁) − (x − y)‖ ≤ ‖x₁ − x‖ + ‖y₁ − y‖ < 2δ = ε .   (C.3)

Hence, (x₁, y₁) ∈ V. This shows that subtraction is continuous. Next, to show that multiplication
is continuous, suppose x, y ∈ A, let z = xy, and let S be an open set in A containing z. Define
V = {(x, y) : xy ∈ S}. We show that V is open by showing that, whenever (x, y) ∈ V, there exist
balls B(x, δ), B(y, δ) such that B(x, δ) × B(y, δ) is contained in V. Since S is open, by definition
there is an ε > 0 such that B(xy, ε) ⊆ S. Suppose x₁ ∈ B(x, δ), y₁ ∈ B(y, δ), where δ is yet to be
specified. Then

x₁y₁ − xy = [(x₁ − x) + x] [(y₁ − y) + y] − xy
= (x₁ − x)(y₁ − y) + (x₁ − x)y + x(y₁ − y) ,   (C.4)

so that

‖x₁y₁ − xy‖ ≤ ‖x₁ − x‖ ‖y₁ − y‖ + ‖x₁ − x‖ ‖y‖ + ‖x‖ ‖y₁ − y‖
< δ² + δ(‖x‖ + ‖y‖) .   (C.5)

Hence (x₁, y₁) ∈ V if δ is chosen such that

δ² + δ(‖x‖ + ‖y‖) ≤ ε .   (C.6)

Such a δ can always be found, since the left side of (C.6) is continuous in δ and equals zero when δ
is zero. This completes the proof that multiplication is continuous.
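The δ demanded by (C.6) can be computed explicitly: solving δ² + δc = ε with c = ‖x‖ + ‖y‖ gives the positive root δ = (−c + √(c² + 4ε))/2. A numerical sketch in the normed algebra R itself, with the absolute value as the norm (the specific x, y, ε are arbitrary):

```python
import math

def delta_for(eps, x_norm, y_norm):
    # Positive root of d^2 + d*(||x|| + ||y||) = eps, i.e. the largest
    # d satisfying condition (C.6) with equality.
    c = x_norm + y_norm
    return (-c + math.sqrt(c * c + 4.0 * eps)) / 2.0

x, y, eps = 3.0, -2.0, 0.1
d = delta_for(eps, abs(x), abs(y))
assert d > 0 and d * d + d * (abs(x) + abs(y)) <= eps + 1e-12

# Any x1, y1 within d of x, y keep the product within eps of xy,
# as in the estimate (C.5).
x1, y1 = x + 0.99 * d, y - 0.99 * d
assert abs(x1 * y1 - x * y) < eps
```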
Fact C.2.2 Suppose (A, ‖·‖) is a normed algebra, and let U denote the set of units of A (i.e., the
set of elements of A that have a multiplicative inverse in A). Then the function f : u ↦ u⁻¹ maps
U into itself continuously.

Remark C.2.3 By the continuity of the inversion function f, we mean the following: Given any
u ∈ U and any ε > 0, there exists a δ > 0 such that

‖u⁻¹ − v⁻¹‖ < ε whenever v ∈ U and ‖u − v‖ < δ .   (C.7)
Proof. Let u ∈ U and let ε > 0 be specified. Suppose v ∈ U and v ∈ B(u, δ), where δ is not as yet
specified. Then

‖u⁻¹ − v⁻¹‖ = ‖u⁻¹(v − u)v⁻¹‖ ≤ ‖u⁻¹‖ ‖v − u‖ ‖v⁻¹‖
≤ δ ‖u⁻¹‖ ‖v⁻¹‖ ≤ δ ‖u⁻¹‖ (‖u⁻¹‖ + ‖u⁻¹ − v⁻¹‖) .   (C.8)

Solving for ‖u⁻¹ − v⁻¹‖ from (C.8) gives

‖u⁻¹ − v⁻¹‖ ≤ δ ‖u⁻¹‖² / (1 − δ ‖u⁻¹‖) ,   (C.9)

provided δ ‖u⁻¹‖ < 1. Thus, (C.7) is satisfied if δ is chosen such that the right side of (C.9) is less than ε. But such a δ
can always be found, since the right side of (C.9) is continuous in δ and equals zero when δ is zero.
This shows that the function f is continuous. □
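In the matrix algebra of 2 × 2 real matrices (with the norm induced by the vector 1-norm), the bound (C.9) can be observed directly: perturb an invertible u by δ and the inverse moves by at most δ‖u⁻¹‖²/(1 − δ‖u⁻¹‖). A sketch with arbitrary sample data:

```python
def inv2(M):
    # Inverse of a 2x2 matrix [[a, b], [c, d]].
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def norm1(M):
    # Operator norm induced by the vector 1-norm: max column abs sum.
    return max(abs(M[0][0]) + abs(M[1][0]), abs(M[0][1]) + abs(M[1][1]))

def sub(M, N):
    return [[M[i][j] - N[i][j] for j in range(2)] for i in range(2)]

u = [[2.0, 1.0], [0.0, 3.0]]
v = [[2.0, 1.0], [0.001, 3.0]]      # a small perturbation of u, still a unit
delta = norm1(sub(u, v))            # here ||u - v|| = 0.001
k = norm1(inv2(u))                  # ||u^{-1}||
assert delta * k < 1                # hypothesis behind (C.9)
bound = delta * k * k / (1 - delta * k)
assert norm1(sub(inv2(u), inv2(v))) <= bound
```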
PROBLEMS
C.2.1. Suppose (S₁, T₁), (S₂, T₂) are topological spaces.
(i) Let T denote the collection of subsets of S₁ × S₂ of the form U × V where U ∈ T₁, V ∈
T₂. Show that T is a base for a topology on the set S₁ × S₂ (Hint: Show that T satisfies the
axiom (B1) of Section C.1).
(ii) Suppose B₁, B₂ are respectively bases for the topologies T₁, T₂. Let B denote the collection
of subsets of S₁ × S₂ of the form B₁ × B₂ where B₁ ∈ B₁, B₂ ∈ B₂. Show that B
is also a base for a topology on S₁ × S₂.
(iii) Show that B and T generate the same topology on S₁ × S₂ (Hint: Show that every set
in T is a union of sets in B).
C.2.2. Consider the ring Rⁿˣⁿ of n × n matrices with real elements. Let ‖·‖ be any norm on the
vector space Rⁿ, and define the norm of a matrix A by

‖A‖ = sup_{x≠0} ‖Ax‖ / ‖x‖ .

Show that (Rⁿˣⁿ, ‖·‖) is a normed algebra.
C.2.3. Consider the ring R[s] of polynomials in the indeterminate s with real coefficients. Define
the norm of a(s) = Σᵢ aᵢ sⁱ by

‖a(·)‖ = Σᵢ |aᵢ| .

Show that (R[s], ‖·‖) is a normed algebra.
Bibliography

[1] V. Anantharam and C. A. Desoer, On the stabilization of nonlinear systems, IEEE Trans. on Auto. Control, AC-29, pp. 569–572, June 1984. DOI: 10.1109/TAC.1984.1103584 Cited on page(s) 306
[2] B. D. O. Anderson, A note on the Youla-Bongiorno-Lu condition, Automatica, 12, pp. 387–388, July 1976. DOI: 10.1016/0005-1098(76)90060-1 Cited on page(s)
[3] B. D. O. Anderson and J. B. Moore, Optimal Filtering, Prentice-Hall, Englewood Cliffs, New Jersey, 1979. Cited on page(s) 194, 229
[4] M. F. Atiyah and I. G. MacDonald, Introduction to Commutative Algebra, Addison-Wesley, Reading, MA, 1969. Cited on page(s) 283, 293
[5] J. A. Ball and J. W. Helton, A Beurling-Lax theorem for the Lie group U(m, n) which contains most classical interpolation theory, J. Operator Theory, 9, pp. 107–142, 1983. Cited on page(s)
[6] C. I. Byrnes, M. W. Spong and T. J. Tarn, A several complex variables approach to feedback stabilization of neutral delay-differential systems, Math. Sys. Thy., 17, pp. 97–134, May 1984. DOI: 10.1007/BF01744436 Cited on page(s) 292, 306
[7] F. M. Callier and C. A. Desoer, Open-loop unstable convolution feedback systems with dynamical feedback, Automatica, 13, pp. 507–518, Dec. 1976. DOI: 10.1016/0005-1098(76)90010-8 Cited on page(s)
[8] F. M. Callier and C. A. Desoer, An algebra of transfer functions of distributed linear time-invariant systems, IEEE Trans. Circ. and Sys., CAS-25, pp. 651–662, Sept. 1978. DOI: 10.1109/TCS.1978.1084544 Cited on page(s) 287, 301, 305
[9] F. M. Callier and C. A. Desoer, Simplifications and clarifications on the paper "An algebra of transfer functions of distributed linear time-invariant systems", IEEE Trans. Circ. and Sys., CAS-27, pp. 320–323, Apr. 1980. DOI: 10.1109/TCS.1980.1084802 Cited on page(s)
[10] F. M. Callier and C. A. Desoer, Stabilization, tracking and disturbance rejection in multivariable convolution systems, Annales de la Société Scientifique de Bruxelles, 94, pp. 7–51, 1980. Cited on page(s)
[11] F. M. Callier and C. A. Desoer, Multivariable Feedback Systems, Springer-Verlag, New York, 1982. Cited on page(s)
[12] B. C. Chang and J. B. Pearson, Optimal disturbance reduction in linear multivariable systems, IEEE Trans. on Auto. Control, AC-29, pp. 880–888, Oct. 1984. DOI: 10.1109/TAC.1984.1103409 Cited on page(s) 205
[13] M. J. Chen and C. A. Desoer, Necessary and sufficient conditions for robust stability of linear distributed feedback systems, Int. J. Control, 35, pp. 255–267, 1982. DOI: 10.1080/00207178208922617 Cited on page(s) 277
[14] M. J. Chen and C. A. Desoer, Algebraic theory of robust stability of interconnected systems, IEEE Trans. on Auto. Control, AC-29, pp. 511–519, June 1984. DOI: 10.1109/TAC.1984.1103572 Cited on page(s) 279
[15] J. H. Chow and P. V. Kokotovic, Eigenvalue placement in two-time-scale systems, Proc. IFAC Symp. on Large Scale Sys., Udine, Italy, pp. 321–326, June 1976. Cited on page(s) 224
[16] C.-C. Chu and J. C. Doyle, On inner-outer and spectral factorizations, Proc. IEEE Conf. on Decision and Control, Las Vegas, Dec. 1984. Cited on page(s) 205
[17] H. T. Colebrooke, Algebra with Arithmetic and Mensuration, John Murray, London, 1817, reprinted by Dr. Martin Sandig OHg, Wiesbaden, W. Germany, 1973. Cited on page(s) xvii
[18] P. R. Delsarte, Y. Genin and Y. Kamp, The Nevanlinna-Pick problem for matrix-valued functions, SIAM J. Appl. Math., 36, pp. 47–61, Feb. 1979. DOI: 10.1137/0136005 Cited on page(s) 205
[19] C. A. Desoer and W. S. Chan, The feedback interconnection of linear time-invariant systems, J. Franklin Inst., 300, pp. 335–351, 1975. DOI: 10.1016/0016-0032(75)90161-1 Cited on page(s)
[20] C. A. Desoer and M. J. Chen, Design of multivariable feedback systems with stable plant, IEEE Trans. on Auto. Control, AC-26, pp. 408–415, April 1981. DOI: 10.1109/TAC.1981.1102594 Cited on page(s)
[21] C. A. Desoer and C. L. Gustafson, Design of multivariable feedback systems with simple unstable plant, IEEE Trans. on Auto. Control, AC-29, pp. 901–908, Oct. 1984. DOI: 10.1109/TAC.1984.1103400 Cited on page(s)
[22] C. A. Desoer and C. L. Gustafson, Algebraic theory of linear multivariable feedback systems, IEEE Trans. on Auto. Control, AC-29, pp. 909–917, Oct. 1984. DOI: 10.1109/TAC.1984.1103410 Cited on page(s)
[23] C. A. Desoer and C. A. Lin, Two-step compensation of nonlinear systems, Systems and Control Letters, 3, pp. 41–46, June 1983. DOI: 10.1016/0167-6911(83)90036-1 Cited on page(s) 306
[24] C. A. Desoer and R. W. Liu, Global parametrization of feedback systems with nonlinear plants, Systems and Control Letters, 1, pp. 249–251, Jan. 1982. DOI: 10.1016/S0167-6911(82)80006-6 Cited on page(s) 306
[25] C. A. Desoer, R. W. Liu, J. Murray and R. Saeks, Feedback system design: The fractional representation approach to analysis and synthesis, IEEE Trans. Auto. Control, AC-25, pp. 399–412, June 1980. DOI: 10.1109/TAC.1980.1102374 Cited on page(s)
[26] C. A. Desoer and M. Vidyasagar, Feedback Systems: Input-Output Properties, Academic Press, New York, 1975. Cited on page(s) 154, 210, 239, 284, 287
[27] J. C. Doyle, Robustness of multiloop linear feedback systems, Proc. 17th Conf. Decision and Control, Ft. Lauderdale, pp. 12–18, 1979. Cited on page(s)
[28] J. C. Doyle, K. Glover, P. P. Khargonekar and B. A. Francis, State space solutions to standard H₂ and H