Invariant Subspaces
of Matrices with
Applications
SIAM's Classics in Applied Mathematics series consists of books that were previously allowed to go
out of print. These books are republished by SIAM as a professional service because they continue
to be important resources for mathematical scientists.
Editor-in-Chief
Robert E. O'Malley, Jr., University of Washington
Editorial Board
Richard A. Brualdi, University of Wisconsin-Madison
Leah Edelstein-Keshet, University of British Columbia
Nicholas J. Higham, University of Manchester
Herbert B. Keller, California Institute of Technology
Andrzej Z. Manitius, George Mason University
Hilary Ockendon, University of Oxford
Ingram Olkin, Stanford University
Peter Olver, University of Minnesota
Ferdinand Verhulst, Mathematisch Instituut, University of Utrecht
Classics in Applied Mathematics
C. C. Lin and L. A. Segel, Mathematics Applied to Deterministic Problems in the Natural Sciences
Johan G. F. Belinfante and Bernard Kolman, A Survey of Lie Groups and Lie Algebras with
Applications and Computational Methods
James M. Ortega, Numerical Analysis: A Second Course
Anthony V. Fiacco and Garth P. McCormick, Nonlinear Programming: Sequential Unconstrained
Minimization Techniques
F. H. Clarke, Optimization and Nonsmooth Analysis
George F. Carrier and Carl E. Pearson, Ordinary Differential Equations
Leo Breiman, Probability
R. Bellman and G. M. Wing, An Introduction to Invariant Imbedding
Abraham Berman and Robert J. Plemmons, Nonnegative Matrices in the Mathematical Sciences
Olvi L. Mangasarian, Nonlinear Programming
*Carl Friedrich Gauss, Theory of the Combination of Observations Least Subject to Errors:
Part One, Part Two, Supplement. Translated by G. W. Stewart
Richard Bellman, Introduction to Matrix Analysis
U. M. Ascher, R. M. M. Mattheij, and R. D. Russell, Numerical Solution of Boundary Value
Problems for Ordinary Differential Equations
K. E. Brenan, S. L. Campbell, and L. R. Petzold, Numerical Solution of Initial-Value Problems
in Differential-Algebraic Equations
Charles L. Lawson and Richard J. Hanson, Solving Least Squares Problems
J. E. Dennis, Jr. and Robert B. Schnabel, Numerical Methods for Unconstrained Optimization
and Nonlinear Equations
Richard E. Barlow and Frank Proschan, Mathematical Theory of Reliability
Cornelius Lanczos, Linear Differential Operators
Richard Bellman, Introduction to Matrix Analysis, Second Edition
Beresford N. Parlett, The Symmetric Eigenvalue Problem
Invariant Subspaces
of Matrices with
Applications
Israel Gohberg
Tel-Aviv University
Ramat-Aviv, Israel
Peter Lancaster
University of Calgary
Calgary, Alberta, Canada
Leiba Rodman
College of William & Mary
Williamsburg, Virginia
SIAM
Society for Industrial and Applied Mathematics
Philadelphia
Copyright © 2006 by the Society for Industrial and Applied Mathematics
This SIAM edition is an unabridged republication of the work first published by John
Wiley & Sons, Inc., New York, 1986.
10 9 8 7 6 5 4 3 2 1
All rights reserved. Printed in the United States of America. No part of this book
may be reproduced, stored, or transmitted in any manner without the written
permission of the publisher. For information, write to the Society for Industrial and
Applied Mathematics, 3600 University City Science Center, Philadelphia, PA
19104-2688.
QA322.G649 2006
515'.73-dc22
2006042260
Introduction
The first part of the book, taken alone or together with selections from
the other parts, can be used as a text for undergraduate courses in
mathematics, having only a first course in linear algebra as prerequisite. At
the same time, the book will be of interest to graduate students in science
and engineering. We trust that experts will also find the exposition and new
results interesting. The authors anticipate that the book will also serve as a
valuable reference work for mathematicians, scientists, and engineers. A set
of exercises is included in each chapter. In general, they are designed to
provide illustrations and training rather than extensions of the theory.
The first part of the book is devoted mainly to geometric properties of
invariant subspaces and their applications in three fields. The fields in
question are matrix polynomials, rational matrix functions, and linear
systems theory. They are each presented in self-contained form, and—rather
than being exhaustive—the focus is on those problems in which invariant
subspaces of square and nonsquare matrices play a central role. These
problems include factorization and linear fractional decompositions for matrix
functions; problems of realization for rational matrix functions; and the
problem of describing connections, or cascades, of linear systems, pole
assignment, output stabilization, and disturbance decoupling.
The second part is of a more algebraic character in which other properties
of invariant subspaces are analyzed. It contains an analysis of the extent to
which the invariant subspaces determine the parent matrix, invariant sub-
spaces common to commuting matrices, and lattices of subspaces for a single
matrix and for algebras of matrices.
The numerical computation of invariant subspaces is a difficult task as, in
general, it makes sense to compute only those invariant subspaces that
change very little after small changes in the transformation. Thus it is
important to have appropriate notions of "stable" invariant subspaces. Such
an analysis of the stability of invariant subspaces and their generalizations is
the main subject of Part 3. This analysis leads to applications in some of the
problem areas mentioned above.
The subject of Part 4 is analytic families of invariant subspaces and has
many useful applications. Here, the analysis is influenced by the theory of
complex vector bundles, although we do not make use of this theory. The
study of the connections between local and global problems is one of the
main problems studied in this part. Within reasonable bounds, Part 4 relies
only on the theory developed in this book. The material presented here
appears for the first time in a book on linear algebra and is thereby made
accessible to a wider audience.
Part One
Fundamental
Properties of
Invariant Subspaces
and Applications
Part 1 of this work comprises almost half of the entire book. It includes what
can be described as a self-contained course in linear algebra with emphasis
on invariant subspaces, together with substantial developments of applica-
tions to the theory of polynomial and rational matrix-valued functions, and
to systems theory. These applications demand extensions of the standard
material in linear algebra that are included in our treatment in a natural
way. They also serve to breathe new life into an otherwise familiar body of
knowledge. Thus there is a considerable amount of material here (including
all of Chapters 3, 4, and 6) that cannot be found in other books on linear
algebra.
Almost all of the material in this part can be understood by readers who
have completed a beginning course in linear algebra, although there are
places where basic ideas of calculus and complex analysis are required.
Chapter One

Invariant Subspaces
and

are A invariant. To verify this, let x ∈ Ker A^m, so A^m x = 0. Then A^m(Ax) =
A(A^m x) = 0, that is, Ax ∈ Ker A^m. This means that Ker A^m is A invariant.
Further, let x ∈ Im A^m, so x = A^m y for some y ∈ ℂⁿ. Then Ax = A(A^m y) =
A^m(Ay), which implies that Ax ∈ Im A^m. So Im A^m is A invariant as well.
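The verification above can be replayed in floating point. The following is a small sketch of my own (the matrix and m = 2 are arbitrary choices, not from the book): orthonormal bases for Ker A^m and Im A^m are read off the SVD of A^m, and invariance is tested by projecting A·(span) back onto the span.

```python
import numpy as np

# Toy check (not from the book): Ker A^m and Im A^m are A-invariant.
A = np.array([[0., 1., 0.],
              [0., 0., 0.],
              [0., 0., 2.]])
Am = np.linalg.matrix_power(A, 2)

# Orthonormal bases for Ker A^m and Im A^m via the SVD of A^m.
U, s, Vh = np.linalg.svd(Am)
rank = int(np.sum(s > 1e-10))
ker_basis = Vh[rank:].conj().T   # columns span Ker A^m
im_basis = U[:, :rank]           # columns span Im A^m

def is_invariant(A, basis):
    """True if A maps the span of the columns of `basis` into itself."""
    P = basis @ np.linalg.pinv(basis)  # orthogonal projector onto the span
    image = A @ basis
    return np.allclose(P @ image, image)

print(is_invariant(A, ker_basis))  # True: Ker A^m is A-invariant
print(is_invariant(A, im_basis))   # True: Im A^m is A-invariant
```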
When convenient, we shall often assume implicitly that a linear transformation
from ℂᵐ into ℂⁿ is given by an n × m matrix with respect to the standard
orthonormal bases e₁ = (1, 0, . . . , 0), e₂ = (0, 1, 0, . . . , 0), . . . ,
eₘ = (0, 0, . . . , 0, 1) in ℂᵐ and e₁, . . . , eₙ in ℂⁿ.
The following three examples of transformations and their invariant
subspaces are basic and are often used in the sequel.
belong to M as well. So
and
Later (see Proposition 2.5.4) we shall see that the set of all A-invariant
subspaces of an n × n complex matrix A is never countably infinite; it is
either finite or uncountably infinite.
we have
It turns out that these are all the invariant subspaces for A. The proof of this
fact for a general n is given later in a more general framework. So the total
number of A-invariant subspaces is
(By writing A as an n × n matrix in some basis in ℂⁿ, we easily see from the
definition of the determinant that det(λI − A) is a polynomial of degree n
with leading coefficient aₙ = 1.) Hence A^k x with k ≥ n is a linear combination of
x, Ax, . . . , A^(n−1)x, so actually
the same reasoning shows that any (αA + βI)-invariant subspace is also A
invariant.
The most primitive nontrivial invariant subspaces are those with dimension
equal to one. For a transformation A: ℂⁿ → ℂⁿ and some nonzero x ∈ ℂⁿ,
therefore, we consider an A-invariant subspace of the form M = Span{x}.
In this case there must be a λ₀ ∈ ℂ such that Ax = λ₀x. Since we then have
A(αx) = α(Ax) = λ₀(αx) for any α ∈ ℂ, the number λ₀ does not depend on
the choice of the nonzero vector in M. We call λ₀ an eigenvalue of A, and,
when Ax = λ₀x with 0 ≠ x ∈ ℂⁿ, we call x an eigenvector of A (corresponding
to the eigenvalue λ₀). Observe that, since (λ₀I − A)x = 0, the eigenvalues of
A can also be characterized as the set of complex zeros of the characteristic
polynomial of A: φ_A(λ) = det(λI − A).
The set of all eigenvalues of A is called the spectrum of A and is denoted
by σ(A). We have seen that any one-dimensional A-invariant subspace is
spanned by some eigenvector. Conversely, if x₀ is an eigenvector of A
corresponding to some eigenvalue λ₀, then Span{x₀} is A invariant. (In
other words, A is the operator of multiplication by λ₀ when restricted to
Span{x₀}.)
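As a quick numerical illustration (the 2 × 2 matrix is my example, not the book's), the zeros of the characteristic polynomial det(λI − A) coincide with the eigenvalues returned by a standard eigensolver:

```python
import numpy as np

# The eigenvalues of A are exactly the zeros of det(lambda*I - A).
A = np.array([[2., 1.],
              [0., 3.]])

coeffs = np.poly(A)                    # characteristic polynomial, leading 1
roots = np.sort(np.roots(coeffs))      # its zeros
eigs = np.sort(np.linalg.eigvals(A))   # eigenvalues from the eigensolver

print(roots)  # both approximately [2, 3]
print(eigs)
```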
Let us have a closer look at the eigenvalues. As the characteristic
polynomial φ_A(λ) = det(λI − A) is a polynomial of degree n, by the
fundamental theorem of algebra, φ_A(λ) has n (in general, complex) zeros when
counted with multiplicities. These zeros are exactly the eigenvalues of A.
Since the characteristic polynomial and eigenvalues are independent of the
choice of basis producing the matrix representation, they are properties of
the underlying transformation. So a transformation A: ℂⁿ → ℂⁿ has exactly
for some complex numbers μ₀ and μ₁. If μ₀ = 0, then x₁ is an eigenvector of
A corresponding to the eigenvalue μ₁. If μ₀ ≠ 0 and μ₁ ≠ λ₀, then the vector
y = −μ₀x₀ + (λ₀ − μ₁)x₁ is an eigenvector of A corresponding to μ₁ for
which {x₀, y} is a linearly independent set. Indeed
Proposition 1.2.1
Let λ₁, . . . , λₖ be eigenvalues of A (not necessarily distinct), and let xᵢ be an
eigenvector of A corresponding to λᵢ, i = 1, . . . , k. Then Span{x₁, . . . , xₖ}
is an A-invariant subspace.
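Proposition 1.2.1 can be checked in floating point. A sketch with an arbitrary matrix of my choosing: take two eigenvectors, and verify that A maps their span into itself via an orthogonal projector.

```python
import numpy as np

# Toy instance of Proposition 1.2.1: the span of eigenvectors is A-invariant.
A = np.array([[4., 1., 0.],
              [0., 2., 0.],
              [0., 0., 7.]])
w, V = np.linalg.eig(A)

span = V[:, :2]                       # eigenvectors for the first two eigenvalues
P = span @ np.linalg.pinv(span)       # orthogonal projector onto Span{x1, x2}
image = A @ span
assert np.allclose(P @ image, image)  # A maps the span into itself
print("Span{x1, x2} is A-invariant")
```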
The first equation (together with x₀ ≠ 0) means that x₀ is an eigenvector of
A corresponding to λ₀. The vectors x₁, . . . , xₖ are called generalized
eigenvectors of A corresponding to the eigenvalue λ₀ and the eigenvector x₀.
For example, let
As λᵢ ≠ λ_{i₀} for i ≠ i₀, we find immediately that βᵢ = 0 for i ≠ i₀. But then the
left-hand side of equation (1.3.3) is zero, a contradiction with α ≠ 0. So
there are no generalized eigenvectors for the transformation A.
Jordan chains allow us to construct more invariant subspaces.
Proposition 1.3.1
Let x₀, . . . , xₖ be a Jordan chain of a transformation A. Then the subspace
M = Span{x₀, . . . , xₖ} is A invariant.
Proof. We have
The following proposition shows how the Jordan chains behave under a
linear change in the matrix A.
Proposition 1.3.2
Let α ≠ 0 and β be complex numbers. A chain of vectors x₀, x₁, . . . , xₖ is a
Jordan chain of A corresponding to the eigenvalue λ₀ if and only if the vectors
Corollary 1.3.3
(a) The vector x₀ is an eigenvector of A corresponding to λ₀ if and only if x₀
is an eigenvector of αA + βI (here α ≠ 0, β are complex numbers)
corresponding to αλ₀ + β; (b) the vectors x₀, . . . , xₖ form a Jordan chain of A
corresponding to λ₀ if and only if these vectors constitute a Jordan chain of
A + βI corresponding to λ₀ + β for any complex number β.
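Part (b) of Corollary 1.3.3 is easy to test numerically. In this sketch (data chosen by me, not from the book) the chain relations (A − λ₀I)x₀ = 0 and (A − λ₀I)xⱼ = xⱼ₋₁ are checked for a single Jordan block, before and after the shift by βI:

```python
import numpy as np

# Corollary 1.3.3(b) on a 3x3 Jordan block (example data is mine).
lam0, beta = 5.0, 2.5
A = lam0 * np.eye(3) + np.diag([1., 1.], k=1)  # single Jordan block J_3(lam0)
chain = [np.eye(3)[:, j] for j in range(3)]    # e1, e2, e3 form a Jordan chain

def is_jordan_chain(M, mu, chain):
    """Check (M - mu*I)x0 = 0 and (M - mu*I)xj = x_{j-1} for j >= 1."""
    N = M - mu * np.eye(M.shape[0])
    if not np.allclose(N @ chain[0], 0):
        return False
    return all(np.allclose(N @ chain[j], chain[j - 1])
               for j in range(1, len(chain)))

print(is_jordan_chain(A, lam0, chain))                            # True
print(is_jordan_chain(A + beta * np.eye(3), lam0 + beta, chain))  # True
```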
Proposition 1.3.4
The vectors in a Jordan chain x₀, . . . , xₖ of A are linearly independent.

Proof. Assume the contrary, and let x_p be the first generalized
eigenvector in the Jordan chain that is a linear combination of the preceding
vectors:
Proposition 1.4.1
Let A, B: ℂⁿ → ℂⁿ be transformations, and let M ⊆ ℂⁿ be a subspace which
is simultaneously A invariant and B invariant. Then M is also invariant for
αA + βB (with any α, β ∈ ℂ) and for AB. Further, if A is invertible, then M
is also invariant for A⁻¹.
EXAMPLE 1.4.1. Let A: ℂⁿ → ℂⁿ be a transformation that is not of the form
γI for some γ ∈ ℂ (if n ≥ 2, such transformations obviously exist). By
Example 1.1.2, not all subspaces in ℂⁿ are A invariant. On the other hand,
take B = A and α + β = 0 in (1.4.1). Then the right-hand side of (1.4.1) is
the zero transformation for which every subspace in ℂⁿ is invariant. □
Proposition 1.4.2
Let transformations A and B be similar, with the similarity transformation
S: A = S⁻¹BS. Then a subspace M ⊆ ℂⁿ is A invariant if and only if the
subspace

is B invariant.
So M is A invariant.
Proposition 1.4.3
Let A and B be similar, with the similarity transformation S. Then (a)
Im B = S(Im A); (b) Ker B = S(Ker A); (c) if x₀, x₁, . . . , xₖ is a Jordan
chain of A corresponding to λ₀, then Sx₀, Sx₁, . . . , Sxₖ is a Jordan chain of
B corresponding to the same λ₀.
More generally, if 𝒮₁, 𝒮₂ are subspaces in ℂⁿ and A: 𝒮₁ → 𝒮₂ is a linear
transformation, its adjoint A*: 𝒮₂ → 𝒮₁ is defined by the relation

It is not difficult to check that the adjoint transformation always exists and is
unique. It is easily verified that for any linear transformations A and B on
ℂⁿ and any α ∈ ℂ
The same formula also holds for the transformation A written as a matrix in
any orthonormal basis in ℂⁿ, as long as A* is considered as a matrix in the
same basis.
There is a simple and useful characterization of the invariant subspaces of
the adjoint transformation A* in terms of the invariant subspaces of A, as
follows.
Proposition 1.4.4
Let A: ℂⁿ → ℂⁿ be a linear transformation. A subspace M ⊆ ℂⁿ is A*
invariant if and only if its orthogonal complement M⊥ is A invariant.
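Proposition 1.4.4 admits a direct numerical check. A sketch with a matrix of my choosing: M = Span{e₁} is invariant for an upper triangular A, so Span{e₂} = M⊥ should be invariant for the adjoint A* (the conjugate transpose in the standard basis).

```python
import numpy as np

# Toy check of Proposition 1.4.4 on a 2x2 upper triangular matrix.
A = np.array([[1., 2.],
              [0., 3.]])
M = np.array([[1.], [0.]])        # M = Span{e1}, A-invariant
M_perp = np.array([[0.], [1.]])   # orthogonal complement Span{e2}
Astar = A.conj().T                # adjoint in the standard orthonormal basis

def maps_into(T, basis):
    """True if T maps the span of the columns of `basis` into itself."""
    P = basis @ np.linalg.pinv(basis)
    image = T @ basis
    return np.allclose(P @ image, image)

print(maps_into(A, M))           # True
print(maps_into(Astar, M_perp))  # True
```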
Note the following equalities for the A-invariant subspaces Ker A and
Im A and the A*-invariant subspaces Ker A* and Im A*:
On the other hand, let x be orthogonal to Im A*. Then for every y ∈ ℂⁿ, we
have (Ax, y) = (x, A*y) = 0; so Ax ⊥ ℂⁿ, and thus Ax = 0, or x ∈ Ker A. So
(Im A*)⊥ ⊆ Ker A. Taking orthogonal complements, we obtain Im A* ⊇
(Ker A)⊥. Combining with (1.4.5), we obtain the first equality in (1.4.4).
The second equality follows from the first one applied to A* instead of A
[recall that (A*)* = A].
Later, we shall also need the following property:
Here, the inclusion ⊇ is clear. For the opposite inclusion, let x ∈ Im A.
Then x = Ay for some y. If z is the projection of y onto Ker A, then
y − z ∈ (Ker A)⊥ and also x = A(y − z). Then (1.4.4) implies that y − z ∈
Im A* and so x ∈ Im(AA*), as required.
A transformation A: ℂⁿ → ℂⁿ is called self-adjoint if A = A*. It is easily
seen that A is self-adjoint if and only if it is represented by a hermitian
matrix in some orthonormal basis (recall that a matrix [a_jk], j, k = 1, . . . , n,
is called hermitian if a_jk = ā_kj). For this important class of
transformations we have the following corollary of Proposition 1.4.4.
Corollary 1.4.5
If A is self-adjoint, then M⊥ is A invariant if and only if M is A invariant.
Theorem 1.5.1
Let P be a projector. Then (Im P, Ker P) is a pair of complementary
subspaces in ℂⁿ. Conversely, for every pair (ℒ₁, ℒ₂) of complementary
subspaces in ℂⁿ, there exists a unique projector P such that Im P = ℒ₁,
Ker P = ℒ₂.
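One concrete way to build the projector of Theorem 1.5.1 (a sketch with subspaces of my choosing, in ℝ² for simplicity): stack bases of ℒ₁ and ℒ₂ into a basis B of the whole space, and conjugate the block matrix diag(I, 0) by B.

```python
import numpy as np

# Build the unique projector P with Im P = L1 and Ker P = L2.
L1 = np.array([[1.], [0.]])   # L1 = Span{(1, 0)}
L2 = np.array([[1.], [1.]])   # L2 = Span{(1, 1)}, complementary to L1

B = np.hstack([L1, L2])       # basis of the whole space
D = np.diag([1., 0.])         # identity on L1, zero on L2
P = B @ D @ np.linalg.inv(B)  # projector on L1 along L2

assert np.allclose(P @ P, P)  # P is idempotent
assert np.allclose(P @ L1, L1)  # P acts as the identity on L1
assert np.allclose(P @ L2, 0)   # L2 = Ker P
print(P)
```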
Proposition 1.5.2
A projector P is orthogonal if and only if P is self-adjoint, that is, P* = P.
and (1.5.1) follows. In case (b), the left-hand side of equation (1.5.1) is
zero (since x ∈ Ker P) and the right-hand side is also zero in view of
the orthogonality Ker P = (Im P)⊥. In the same way, one checks (1.5.1) in
case (c). So (1.5.1) holds, and P* = P.
Here Tᵢⱼ (i, j = 1, 2) is an mᵢ × mⱼ matrix that represents in some basis the
transformation PᵢT|_{ℒⱼ}: ℒⱼ → ℒᵢ, where Pᵢ is the projector on ℒᵢ along
ℒ_{3−i} (so P₁ + P₂ = I).
Suppose now that T = P is a projector with ℒ₁ = Im P. Then representation
(1.5.2) takes the form

for some matrix X. In general, X ≠ 0. One can easily check that X = 0 if and
only if ℒ₂ = Ker P. Analogously, if ℒ₁ = Ker P, then (1.5.2) takes the form
and all these transformations are written as matrices with respect to some
chosen bases in M and M′. As M is A invariant, equation (1.5.5) implies
that (I − P)AP = 0, that is, A₂₁ = 0. Hence
Proposition 1.5.3
Let x₀, . . . , xₖ be a Jordan chain of A|_M corresponding to the eigenvalue λ₀
of A|_M. Then x₀, . . . , xₖ is also a Jordan chain of A corresponding to λ₀. In
particular, all eigenvalues of A|_M are also eigenvalues of A.
Proposition 1.5.4
Let M be an A-invariant subspace with a direct complement M′ in ℂⁿ, and let
Theorem 1.5.5
Let A: ℂⁿ → ℂᵐ be a transformation, let ℂⁿ = Ker A + N, ℂᵐ = Im A + R
for some subspaces N and R, and let P be the projector on Im A along R, Q
the projector on N along Ker A. Then (a) the transformation A₁ = A|_N is a
one-to-one transformation of N onto Im A; (b) the transformation A^I defined
on ℂᵐ by A^I y = A₁⁻¹(Py), for all y ∈ ℂᵐ, is a generalized inverse of A for which
AA^I = P and A^I A = Q; (c) all generalized inverses of A are determined as
N, R range over all complementary subspaces for Ker A, Im A, respectively.
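For one particular choice of complements in Theorem 1.5.5, namely N = (Ker A)⊥ and R = (Im A)⊥, the resulting generalized inverse is the Moore-Penrose inverse; I use that standard fact (it is an assumption here, not spelled out in the text) to illustrate the theorem numerically:

```python
import numpy as np

# With N = (Ker A)^perp and R = (Im A)^perp, the generalized inverse A^I
# of Theorem 1.5.5 is the Moore-Penrose inverse, so A A^I should be the
# orthogonal projector P onto Im A and A^I A the projector Q onto N.
A = np.array([[1., 2., 0.],
              [2., 4., 0.]])   # rank 1, with a nontrivial kernel
A_I = np.linalg.pinv(A)

P = A @ A_I   # projects onto Im A
Q = A_I @ A   # projects onto N = (Ker A)^perp

assert np.allclose(P @ P, P) and np.allclose(P, P.conj().T)
assert np.allclose(Q @ Q, Q) and np.allclose(Q, Q.conj().T)
assert np.allclose(A @ A_I @ A, A)   # defining property of a generalized inverse
print("A A^I = P and A^I A = Q, as in Theorem 1.5.5")
```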
Corollary 1.5.6
In the statement of Theorem 1.5.5, we have
is angular with respect to π. To see this, observe first that N_R is indeed a
subspace; that is, if x₁, x₂ ∈ N_R, then for some y₁, y₂ ∈ Im π
and if
Then, because, for any
then
and
Finally, if z ∈ N_R ∩ Ker π, then z = Ry + y, where y ∈ Im π and also
πz = 0. Thus
Since R maps into Ker π, πR = 0 and it follows that y = 0. Hence z = 0 and
ℂⁿ = N_R + Ker π.
The angular subspaces generated in this way are, in fact, all possible
angular subspaces.
Proposition 1.6.1
Let N be a subspace of ℂⁿ. Then N is angular with respect to π if and only if
N = N_R for some transformation R: Im π → Ker π that is uniquely determined
by N.
Consider now a transformation T: ℂⁿ → ℂⁿ. As before, let π: ℂⁿ → ℂⁿ
be a projector, so that we have ℂⁿ = Im π + Ker π. Then T has a
representation with respect to this decomposition:
Theorem 1.6.2
Let N be an angular subspace with respect to the projector π. Let T have the
representation (1.6.3) with respect to the decomposition ℂⁿ = Im π + Ker π.
Then N is T invariant if and only if the angular transformation R for N
satisfies the matrix quadratic equation
so Im π is E⁻¹TE invariant if and only if (1.6.4) holds.
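Theorem 1.6.2 can be exercised numerically. The displayed equation (1.6.4) is not reproduced in this extract, so I write it in the standard graph-subspace form T₂₁ + T₂₂R − RT₁₁ − RT₁₂R = 0 (an assumption on my part); the data below are random, with T₂₁ back-solved so that R is a solution.

```python
import numpy as np

# Sketch of Theorem 1.6.2: the angular subspace N_R = {(y, R y)} is
# T-invariant exactly when R solves T21 + T22 R - R T11 - R T12 R = 0.
rng = np.random.default_rng(0)
T11, T12, T22 = rng.normal(size=(3, 2, 2))
R = rng.normal(size=(2, 2))

# Force R to solve the quadratic equation by solving for T21.
T21 = R @ T11 + R @ T12 @ R - T22 @ R
T = np.block([[T11, T12], [T21, T22]])

# Columns (e_j, R e_j) span N_R; check T maps N_R into itself.
basis = np.vstack([np.eye(2), R])
P = basis @ np.linalg.pinv(basis)
image = T @ basis
assert np.allclose(P @ image, image)
print("N_R is T-invariant when R solves the quadratic equation")
```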
Corollary 1.6.3
If N is T invariant, then
and
Proof. We have
Let N ⊆ ℂⁿ be a subspace. We say that two vectors x, y ∈ ℂⁿ are comparable
modulo N if x − y ∈ N, and denote this by x = y (mod N). In particular,
x = 0 (mod N) if and only if x ∈ N. This relation is easily seen to be
reflexive, symmetric, and transitive. That is
Proposition 1.7.1
The set ℂⁿ/N is a vector space over ℂ with the following operations of
addition and multiplication by a complex number:
that is, x₁ + y₁ ∈ [x + y]_N. So indeed the class [x + y]_N does not depend on
the choice of x and y. Similarly, one checks that [αx]_N does not depend on
the choice of x in the class [x]_N (for fixed α).
It is a straightforward but tedious task to verify that ℂⁿ/N satisfies the
following defining properties of a vector space over ℂ: The sum is
commutative and associative: (a) x + y = y + x, (x + y) + z = x + (y + z) for every
x, y, z ∈ ℂⁿ/N; (b) there is a zero element 0 ∈ ℂⁿ/N, that is, an element 0
such that x + 0 = x for all x ∈ ℂⁿ/N; (c) for every x ∈ ℂⁿ/N there is an
additive inverse element y ∈ ℂⁿ/N, that is, such that x + y = 0; (d) for every
α, β ∈ ℂ and x, y ∈ ℂⁿ/N the following equalities hold: α(x + y) = αx + αy,
(α + β)x = αx + βx, (αβ)x = α(βx), and 1x = x (here 1 is the complex
number). We leave the verification of all these properties to the reader. □
The vector space ℂⁿ/N is isomorphic to any direct complement N′ of N in
ℂⁿ. Indeed, let a ∈ ℂⁿ/N; then there exists a unique vector y ∈ N′ such that
a = [y]_N and, in fact, y = Px, where P is the projector on N′ along N and
x is any vector in the class a. This is easily checked. We have y − x =
−(I − P)x ∈ N, so y ∈ a. If there were two different vectors y₁ and y₂ from
N′ such that [y₁]_N = [y₂]_N = a, then y₁ − y₂ ∈ N′ ∩ N and y₁ ≠ y₂, which
contradicts the choice of N′ as a direct complement to N in ℂⁿ. So we have
constructed a map φ: ℂⁿ/N → N′ defined by φ(a) = y. This map is easily seen
to be a homomorphism of vector spaces; that is
for every a, b ∈ ℂⁿ/N and every α ∈ ℂ. Moreover, if φ(a) = φ(b), then the
vector y = φ(a) = φ(b) belongs to both classes a and b of comparable
vectors modulo N, and thus a = b. So φ is one-to-one. Taking any y ∈ N′,
we see that φ([y]_N) = y, so φ is onto. Summing up, φ is an isomorphism
between the two vector spaces ℂⁿ/N and N′. In particular, dim ℂⁿ/N =
n − dim N. Assume now that N is A invariant for some transformation
A: ℂⁿ → ℂⁿ. Then the induced transformation Ā: ℂⁿ/N → ℂⁿ/N is defined
by Ā[x]_N = [Ax]_N for any x ∈ ℂⁿ. This definition does not depend on the
choice of the vector x in its class [x]_N. Indeed, if [x₁]_N = [x₂]_N, then
Proposition 1.7.2
If N is invariant for both transformations A: ℂⁿ → ℂⁿ and B: ℂⁿ → ℂⁿ, then

Proof. By Proposition 1.4.1, N is invariant for αA + βB, AB, and A⁻¹
(if A is invertible). For any x ∈ ℂⁿ we have
and
and example (c) is the set of all invariant subspaces for the zero matrix.
Proposition 1.8.2
Given a transformation A: ℂⁿ → ℂⁿ and an invertible transformation
S: ℂⁿ → ℂⁿ, we have
and
Theorem 1.8.3
If M₁ and M₂ are A-invariant subspaces, then
and
We find that
This means that for every vector x ∈ ℂⁿ there exist unique vectors
x₁ ∈ ℒ₁, . . . , xₖ ∈ ℒₖ such that x = x₁ + x₂ + ⋯ + xₖ. Now let Pᵢ be the
projector on ℒᵢ along

The projectors Pᵢ are mutually disjoint; that is, PᵢPⱼ = PⱼPᵢ = 0 for i ≠ j, and
P₁ + ⋯ + Pₖ = I.
Now any transformation A: ℂⁿ → ℂⁿ can be written as a k × k block
matrix with respect to the decomposition (1.8.6):

Then one can characterize all matrices for which (1.8.5) is a chain of (not
necessarily all) invariant subspaces in terms of the k × k block
representation as follows.
Proposition 1.8.4
All subspaces from the chain (1.8.5) are invariant for a transformation A if
and only if A has the following form in the chosen basis x₁, . . . , xₙ:

Proof. Assume that A has the form (1.8.8), which means that, in terms
of the projectors P₁, . . . , Pₖ defined above, the equalities PᵢAPⱼ = 0 for i > j
hold. For a fixed j, it follows that
A chain of subspaces
with the property that every Mᵢ is equal to some ℒⱼ coincides with the chain
(1.8.9). It is easily seen that a chain (1.8.9) is maximal if and only if
dim Mᵢ = i, i = 1, . . . , n.
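The block-triangular criterion of Proposition 1.8.4 is easy to see in coordinates. A sketch with my own example, specialized to the maximal chain Span{e₁} ⊂ Span{e₁, e₂} ⊂ ⋯: the chain is invariant exactly when everything below the diagonal blocks vanishes.

```python
import numpy as np

# An upper triangular matrix leaves every subspace of the chain
# Span{e1} in Span{e1, e2} in ... invariant; a full matrix does not.
A = np.array([[1., 2., 3.],
              [0., 4., 5.],
              [0., 0., 6.]])

def chain_invariant(A, n):
    """True if Span{e1,...,ei} is A-invariant for every i = 1,...,n."""
    for i in range(1, n + 1):
        # Invariance of Span{e1..ei} <=> rows below i of A[:, :i] vanish.
        if not np.allclose(A[i:, :i], 0):
            return False
    return True

print(chain_invariant(A, 3))                # True
print(chain_invariant(np.ones((3, 3)), 3))  # False
```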
Now if (1.8.9) is a maximal chain, we may choose a basis x₁, . . . , xₙ in
ℂⁿ in such a way that
Proposition 1.8.6
Let A: ℂⁿ → ℂⁿ be a transformation. If each subspace N ⊆ ℂⁿ has a
complementary subspace that is A invariant, then there is a basis in ℂⁿ consisting
of eigenvectors of A.
The main result of this section is the following theorem on unitary trian-
gularization of a transformation. It has important implications for the study
of invariant subspaces.
Recall that a transformation U: ℂⁿ → ℂⁿ is called unitary if it is invertible
and U⁻¹ = U* or, equivalently, if (Ux, Uy) = (x, y) for all x, y ∈ ℂⁿ. Note
that the seemingly weaker condition ‖Ux‖ = ‖x‖ for all x ∈ ℂⁿ is also
sufficient to ensure that U is unitary. Note also that the product of two
unitary transformations is unitary again, and so is the inverse of a unitary
transformation.
It will be convenient to write linear transformations from ℂⁿ into ℂⁿ as
n × n matrices with respect to the standard orthonormal basis e₁, . . . , eₙ in
ℂⁿ. We shall use the fact that a matrix is unitary if and only if its columns
form an orthonormal basis in ℂⁿ.
Theorem 1.9.1
For any n x n matrix A there exists a unitary matrix U such that
is an upper triangular matrix, that is, t_ij = 0 for i > j, and the diagonal
elements t₁₁, . . . , tₙₙ are just the eigenvalues of A.
is unitary. Write U₁ in the block matrix form U₁ = [x₁ V], where V = [x₂ ⋯ xₙ]
is an n × (n − 1) matrix. Then, because of the orthonormality of x₁, . . . , xₙ,
V*x₁ = 0. Now, using the relation Ax₁ = λ₁x₁, we obtain
Applying the same procedure to the (n − 1) × (n − 1) matrix A₂ = V*AV,
we find an (n − 1) × (n − 1) unitary matrix U₂ such that
where all subspaces Mᵢ are T invariant. Then Proposition 1.4.2 shows that
the maximal chain
Corollary 1.9.2
Any transformation (or n × n matrix) A: ℂⁿ → ℂⁿ has a maximal chain of
A-invariant subspaces. In particular, for every i, 1 ≤ i ≤ n, there exists an
i-dimensional A-invariant subspace.
Theorem 1.9.3
An n x n matrix A has a unique complete chain of invariant subspaces if and
only if A has a unique eigenvector (up to multiplication by a scalar).
Proof. We have seen in the proof of Theorem 1.9.1 that for any
eigenvector x of A the subspace Span{x} appears in some complete chain of
A-invariant subspaces. So if a complete chain of invariant subspaces is
unique, the matrix A has a unique eigenvector (up to multiplication by a
scalar).
The converse part of Theorem 1.9.3 will be proved later using the Jordan
normal form of a matrix (see Theorem 2.5.1).
Theorem 1.9.4
A transformation A: ℂⁿ → ℂⁿ is normal if and only if there is an orthonormal
basis in ℂⁿ consisting of eigenvectors for A.
1.10 EXERCISES
1.1 Prove or disprove the following statements for any linear transformation A: ℂⁿ → ℂⁿ:
(a) Im A ∔ Ker A = ℂⁿ.
(b) Im A + Ker A = ℂⁿ (the sum not necessarily direct).
(c) Im A ∩ Ker A ≠ {0}.
(d) dim Im A + dim Ker A = n.
(e) Im A is the orthogonal complement to Ker A*.
1.2 Prove or disprove statements (d) and (e) in the preceding exercise for
a transformation A: ℂᵐ → ℂⁿ, where m ≠ n.
1.3 Let A: ℂⁿ → ℂⁿ be the transformation given (in the standard
orthonormal basis) by an upper triangular Toeplitz matrix
and find the lattice Inv(A). What are the invariant subspaces of A*?
1.14 Let
1.17 Let
where
are n × n matrices.
(b) Tᵢⱼ are n × n diagonal matrices.
(c) Tᵢⱼ are n × n circulant matrices.
1.23 Prove that x₁ + M, . . . , xₖ + M is a basis in ℂⁿ/M (where
x₁, . . . , xₖ ∈ ℂⁿ) if and only if for some basis y₁, . . . , y_p in M the
vectors x₁, . . . , xₖ, y₁, . . . , y_p form a basis in ℂⁿ.
1.24 Let A = diag[a₁, . . . , aₙ]: ℂⁿ → ℂⁿ, where the numbers a₁, . . . , aₙ
are distinct. Show that for any A-invariant subspace M the induced
transformation Ā: ℂⁿ/M → ℂⁿ/M can also be written in the form
diag[b₁, . . . , bₖ] in some basis in ℂⁿ/M.
with respect to the basis e₁, e₂, e₃, find a complete chain of A-invariant
subspaces. Find a basis in which A has the upper triangular form.
1.30 Let A: ℂ²ⁿ → ℂ²ⁿ be a transformation. Prove that there exists an
orthonormal basis in ℂ²ⁿ such that, with respect to this basis, A has
the representation
We have seen in Section 1.4 and Proposition 1.8.2 that there is a strong
relationship between lattices of invariant subspaces of similar transforma-
tions, namely
for any two transformations A and B from ℂⁿ into ℂⁿ with S invertible. Thus,
for the study of invariant subspaces, it is desirable to use similarity transfor-
mations to reduce a given transformation to the simplest form, in the hope
that the lattice of invariant subspaces for the simplest form would be more
transparent than that for the original transformation. The "simplest form"
here is the Jordan form. It is obtained in this chapter and used to study
some properties of invariant subspaces. Special insights are obtained into
the structure of invariant subspaces and are exploited throughout the book.
We examine irreducible invariant subspaces, generators of invariant sub-
spaces, maximal and minimal invariant subspaces, and invariant subspaces
of functions of transformations. An interesting class of subspaces is intro-
duced and studied in Section 2.9 that we call "marked." All the subject
matter here is well known, although this exposition may be unusual in
matters of emphasis and detail that will be useful subsequently.
The Jordan Form and Invariant Subspaces
for all integers l > p. The subspace Ker(A − λ₀I)^p is called the root subspace
of A corresponding to λ₀ and is denoted ℛ_{λ₀}(A).

In other words, ℛ_{λ₀}(A) consists of all vectors x ∈ ℂⁿ such that
(A − λ₀I)^q x = 0 for some integer q ≥ 1. (This integer may depend on x.) Because
The nesting of the kernels in (2.1.1) has a dual in the (descending) nesting
of images:
But these sequences of inclusions are coupled by the fact that, for any
integer l > 0,
Proposition 2.1.1
The root subspace ℛ_{λ₀}(A) contains the vectors from any Jordan chain of A
corresponding to λ₀.
as well as for A − λ₀I, the only eigenvalue is λ₀, and the corresponding root
subspace ℛ_{λ₀}(A) is the whole of ℂⁿ. If
Theorem 2.1.2
Let λ₁, . . . , λ_r be all the different eigenvalues of a transformation
A: ℂⁿ → ℂⁿ. Then ℂⁿ decomposes into the direct sum
Lemma 2.1.3
For every eigenvalue λ₀ of A, the restriction A|_{ℛ_{λ₀}(A)} has the sole
eigenvalue λ₀.

Proof. Let B = A|_{ℛ_{λ₀}(A)}. We shall show that for every λ₁ ≠ λ₀ the
transformation λ₁I − B on ℛ_{λ₀}(A) is invertible. Let q be an integer such
that

Then clearly
Lemma 2.1.4
Given a transformation A: ℂⁿ → ℂⁿ with an eigenvalue λ₀, let q be a positive
integer for which

Then the subspaces Ker(A − λ₀I)^q and Im(A − λ₀I)^q are direct complements
to each other in ℂⁿ.
Proof. Since
It follows that
Hence
Hence A − λ₁I maps Im(A − λ₁I)^q onto itself. It follows that λ₁ is not an
eigenvalue of the restriction of A to the A-invariant subspace Im(A − λ₁I)^q.
So the restrictions of A to the subspaces Ker(A − λ₁I)^q = ℛ_{λ₁}(A) and
ℬ = Im(A − λ₁I)^q have no common eigenvalues. This property is easily
seen to imply that, for any eigenvalue λ₂ of A|_ℬ,

So we can repeat the previous argument with A replaced by A|_ℬ and with λ₁
replaced by an eigenvalue λ₂ of A|_ℬ to show that

for some A-invariant subspace M such that λ₁ and λ₂ are not eigenvalues of
A|_M. Continuing this process, we eventually prove Theorem 2.1.2. □
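The decomposition of Theorem 2.1.2 can be sketched numerically (example matrix is mine): compute each root subspace as Ker (A − λI)^n via the SVD, and check that their dimensions add up to n.

```python
import numpy as np

# Root subspaces R_lambda(A) = Ker (A - lambda*I)^n for the distinct
# eigenvalues; their dimensions must sum to n (direct sum decomposition).
A = np.array([[2., 1., 0.],
              [0., 2., 0.],
              [0., 0., 5.]])
n = A.shape[0]

def root_subspace(A, lam):
    """Orthonormal basis of Ker (A - lam*I)^n via the SVD."""
    M = np.linalg.matrix_power(A - lam * np.eye(n), n)
    U, s, Vh = np.linalg.svd(M)
    rank = int(np.sum(s > 1e-8 * max(s.max(), 1.0)))
    return Vh[rank:].conj().T

dims = [root_subspace(A, lam).shape[1] for lam in (2.0, 5.0)]
print(dims)            # [2, 1]
assert sum(dims) == n  # the root subspaces decompose C^3
```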
Another approach to the proof of Theorem 2.1.2 is based on the fact that
if q₁(λ), …, q_r(λ) are polynomials (with complex coefficients) with no
common zeros, there exist polynomials p₁(λ), …, p_r(λ) such that

p₁(λ)q₁(λ) + ··· + p_r(λ)q_r(λ) ≡ 1
Theorem 2.1.5
Let A: <p" —> <p" be a transformation, and let M be an A-invariant subspace.
Then M decomposes into a direct sum
Note that Theorem 2.1.2 is actually the particular case of Theorem 2.1.5
with M = <p". We consider now some examples in which Theorem 2.1.5
allows us to find all invariant subspaces of a given linear transformation.
R_λ(A) = Span{e_{i₁}, …, e_{i_p}}

for some indices 1 ≤ i₁ < i₂ < ··· < i_p ≤ n. This fact was stated without proof
in Example 1.1.3.
where λ₁ and λ₂ are different complex numbers. The matrix A has the
eigenvalues λ₁ and λ₂. Further,
and thus
We see (as Theorem 2.1.2 leads us to expect) that ℂ⁴ is a direct (even
orthogonal) sum of R_{λ₁}(A) and R_{λ₂}(A). Let M be any A-invariant subspace.
By Theorem 2.1.5, we obtain
It is easily seen (cf. Example 1.1.1) that the only A-invariant subspaces in
Span{e₁, e₂} are {0}, Span{e₁}, and Span{e₁, e₂}. On the other hand, any
subspace of Span{e₃, e₄} is A invariant.
One can easily describe all subspaces of Span{e₃, e₄} as follows: {0};
the one-dimensional subspaces Span{e₃ + αe₄}, where α ∈ ℂ is fixed for
each particular subspace; the one-dimensional subspace Span{e₄}; and
Span{e₃, e₄}. Finally, the following is a complete list of A-invariant
subspaces:
Theorem 2.2.1
Let A be an n × n (complex) matrix. Then there exists an invertible matrix S
such that S⁻¹AS is a direct sum of Jordan blocks:
The Jordan blocks J_{k_i}(λ_i) in the representation (2.2.1) are uniquely
determined by the matrix A (up to permutation) and do not depend on the choice
of S.
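Theorem 2.2.1 can be exercised exactly with sympy's jordan_form (assuming sympy is available); the matrix below is a hypothetical example built from blocks of sizes 3 and 1:

```python
from sympy import Matrix

# Hypothetical A: single eigenvalue 2, Jordan blocks of sizes 3 and 1.
A = Matrix([[2, 1, 0, 0],
            [0, 2, 1, 0],
            [0, 0, 2, 0],
            [0, 0, 0, 2]])
S, J = A.jordan_form()   # A = S J S^{-1}, computed exactly
print(J)
```

The block sizes in J are determined by A; only their order (and the choice of S) can vary.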
Corollary 2.2.2
If A₁ and A₂ are n₁ × n₁ and n₂ × n₂ matrices with the partial multiplicities
k₁(A₁), …, k_{m₁}(A₁) and k₁(A₂), …, k_{m₂}(A₂), respectively,
The proof of this corollary is immediate if one observes that the Jordan
form of
Corollary 2.2.3
The partial multiplicities of A at λ₀ coincide with the partial multiplicities of
the conjugate transpose matrix A* at λ̄₀.
Hence J̄ is the Jordan form of A*, and Corollary 2.2.3 follows from the
definition of partial multiplicities.
Theorem 2.2.4
Let A: <p"—> <p" be a linear transformation. Then there exists a direct sum
decomposition
Proposition 2.2.5
Let A: <p"-» <p" be a linear transformation. Then the geometric multiplicity of
any A0 E ar(A) coincides with dim Ker(A — A 0 /), and the algebraic multiplici-
ty of \Q coincides with the dimension of£%A (A), the root subspace of A0 [i.e.,
with the dimension of Ker(y4 - A0/)"].
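A quick numerical check of Proposition 2.2.5, with a hypothetical 3×3 example (numpy only):

```python
import numpy as np

# Hypothetical A: eigenvalue 4 with Jordan blocks of sizes 2 and 1, so the
# geometric multiplicity is 2 while the algebraic multiplicity is 3.
A = np.array([[4., 1., 0.],
              [0., 4., 0.],
              [0., 0., 4.]])
n = A.shape[0]

def dim_ker(M, tol=1e-9):
    # kernel dimension = number of (near-)zero singular values
    return int(sum(s < tol for s in np.linalg.svd(M, compute_uv=False)))

geometric = dim_ker(A - 4 * np.eye(n))                              # dim Ker(A - 4I)
algebraic = dim_ker(np.linalg.matrix_power(A - 4 * np.eye(n), n))   # dim Ker(A - 4I)^n
```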
Hence
As R_{λ₀}(A) is the maximal subspace of the type Ker(A − λ₀I)^q, q = 1, 2, …,
we obtain
Proposition 2.2.6
Let A: <p"-» <P" be a transformation with partial multiplicities / c l 5 . . . , km
corresponding to the eigenvalue A0 of A. Then
This equality is certainly true for q = 1 (for then both sides are equal to m).
Assume that the equality is true for q − 1. We have
Applying (A − λ₀I)^{m−1} to the left-hand side and using the property that
(A − λ₀I)^m x_i = 0 for i = 1, …, t_m, we find that
Hence Σᵢ₌₁^{t_m} a_{i0} x_i ∈ 𝒮_{m−1}, and because of (2.3.1), a_{10} = ··· = a_{t_m,0} = 0. Applying (A − λ₀I)^{m−2} to the left-hand side of (2.3.2), we show similarly that
a_{11} = ··· = a_{t_m,1} = 0, and so on. We put
We claim that
Indeed, assume
Applying (A − λ₀I)^{m−2}
to the left-hand side, we get
Applying the previous argument to (2.3.4) as with (2.3.1), we find that the
vectors
If it happens that
x_{m−2,i}, i = t_m + t_{m−1} + 1, …, t_m + t_{m−1} + t_{m−2}, in such a way that the vectors
x_{m−2,i}, i = 1, …, t_m + t_{m−1} + t_{m−2}, are linearly independent and the linear
span of these vectors is a direct complement to 𝒮_{m−3} in 𝒮_{m−2}. Then
put
Consequently
where μ₁, …, μ_k are all the distinct eigenvalues of the restriction A|_M.
A useful characterization of spectral subspaces is given by their maximality
property.
Proposition 2.4.1
An A-invariant subspace M ≠ {0} is spectral if and only if every A-invariant
subspace ℒ with the property σ(A|_ℒ) ⊆ σ(A|_M) is contained in M.
Theorem 2.4.2
The following statements are equivalent for an A-invariant subspace M: (a) M
is spectral for A; (b) there exists a direct complement 𝒩 to M such that 𝒩 is A
invariant and
(c) there exists a unique A-invariant direct complement 𝒩 to M; (d) for any
A-invariant subspace ℒ that contains M properly, σ(A|_ℒ) contains σ(A|_M)
properly.
and, because of the linear independence of the x_{i0}, all the coefficients a_{ij} are
zeros. In particular, a_{1k} = 0, and z = 0 in view of equation (2.4.6).
We have proved that (when σ(A) = {λ₀}) any nontrivial A-invariant
subspace either does not have A-invariant direct complements or has at least
two of them. This means that (c) implies (a).
is defined naturally as the n x n matrix whose entries are the integrals of the
entries of B(\):
Proposition 2.4.3
Let Λ be a subset of σ(A), where A is a transformation on ℂ^n, and let Γ be a
closed contour with Λ in its interior and σ(A) \ Λ outside Γ. Then the
transformation
where J_{k_i}(λ_i) is the k_i × k_i Jordan block with λ_i on the main diagonal. One
easily verifies that
Thus
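The transformation of Proposition 2.4.3 is the Riesz projector (1/2πi)∮_Γ (λI − A)⁻¹ dλ, and it can be approximated numerically. The sketch below (hypothetical 2×2 matrix, trapezoid rule on a circle enclosing only the eigenvalue 1) recovers the spectral projector onto R₁(A):

```python
import numpy as np

A = np.array([[1., 2.],
              [0., 6.]])          # hypothetical: eigenvalues 1 and 6
center, radius, m = 1.0, 1.0, 400  # circle around eigenvalue 1 only
P = np.zeros((2, 2), dtype=complex)
for k in range(m):
    theta = 2 * np.pi * k / m
    lam = center + radius * np.exp(1j * theta)
    dlam = 1j * radius * np.exp(1j * theta) * (2 * np.pi / m)  # d(lambda) at this node
    P += np.linalg.inv(lam * np.eye(2) - A) * dlam
P /= 2j * np.pi
```

For analytic integrands on a circle, the trapezoid rule converges exponentially, so P is a projector (P² = P) commuting with A to machine precision.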
Theorem 2.5.1
The following statements are equivalent for an A-invariant subspace M: (a) M is
irreducible; (b) each A-invariant subspace 𝒩 contained in M is irreducible; (c) M
is Jordan, that is, M has a basis consisting of vectors that form a Jordan chain of A;
(d) there is a unique eigenvector (up to multiplication by a scalar) of A in M; (e) the
lattice of invariant subspaces of A|_M is a chain, that is, for any A-invariant
subspaces ℒ₁, ℒ₂ ⊆ M either ℒ₁ ⊆ ℒ₂ or ℒ₂ ⊆ ℒ₁ holds; (f) every nonzero A-
invariant subspace that is contained in M is Jordan; (g) the spectrum of A|_M is a
singleton {λ₀}, and
If p > 1, then e₁ and e_{k₁+1} are two eigenvectors of A in M that are not scalar
multiples of each other; so (d) → (h). Further, if p > 1, then
On the other hand, statement (g) implies that the left-hand side of this
equation is also equal to max{0, k₁ + ··· + k_p − j}. In particular (for j = 1),
we have
Proposition 2.5.2
A transformation A: <p" —> <p" is unicellular if and only if the whole space ((7"
is irreducible as an A-invariant subspace.
Indeed, rewriting Theorem 2.5.1 for the particular case M - <p", one
obtains various characterizations of unicellular transformations.
Another important property of a unicellular transformation is the "near"
uniqueness of an orthonormal basis in which this transformation has upper
triangular form (see Section 1.9).
Theorem 2.5.3
A transformation A: <p" —*• <P" is unicellular if and only if for any two
orthonormal bases x,,..., xn and y , , . . . , yn in which A has an upper
triangular form we have
Proposition 2.5.4
The set Inv(A) of all invariant subspaces of a fixed transformation
A: ℂ^n → ℂ^n is either a continuum [i.e., there exists a bijection φ: Inv(A) → ℝ]
or a finite set.
Proof. In view of Theorem 2.1.5 we can assume that A has only one
eigenvalue λ₀, that is, R_{λ₀}(A) = ℂ^n. If A is unicellular, then by Example
2.1.1 the set Inv(A) is finite (namely, there are exactly n + 1 A-invariant
subspaces). If A is not unicellular, then by the equivalence (c) ⇔ (d) in
Theorem 2.5.1 there exist two linearly independent eigenvectors x and y of
A: Ax = λ₀x and Ay = λ₀y. Then {Span{x + αy} | α ∈ ℝ} is a set of A-
invariant subspaces which is a continuum. On the other hand, let ψ be the
map from the set of all n-tuples (x₁, …, x_n) of n-dimensional vectors
x₁, …, x_n onto Inv(A) defined by ψ(x₁, …, x_n) = Span{x₁, …, x_n} if the
subspace Span{x₁, …, x_n} is A invariant and ψ(x₁, …, x_n) = {0} other-
wise. As the set of all n-tuples (x₁, …, x_n), x_i ∈ ℂ^n, is a continuum, by an
elementary result in set theory it follows that Inv(A) is a continuum as
well.
For example, any basis for M forms a set of generators for M. In connection
with this definition note that for any vectors y₁, …, y_p ∈ ℂ^n the subspace
Span{y₁, …, y_p, Ay₁, …, Ay_p, A²y₁, …, A²y_p, …} is A invariant. The
particular case when M has one generator is of special interest (see also
Section 1.1), that is, when M = Span{x, Ax, A²x, …} for some x ∈ ℂ^n. In
this case we call M a cyclic invariant subspace (such a subspace is frequently
referred to as a "Krylov subspace" in the literature on numerical analysis).
The notion of generators behaves well with respect to similarity. That is,
if M is an A-invariant subspace with generators x₁, …, x_m, then SM is an
SAS⁻¹-invariant subspace with generators Sx₁, …, Sx_m (here S is any
invertible transformation). So the study of generators of A-invariant sub-
spaces can be reduced to the study of generators of J-invariant subspaces,
where J is a Jordan form for A. Let us give some examples.
and let M = ℂ² be the A-invariant subspace. The vector (1, 1) is obviously a
generator for M, so a set of minimal generators must consist of a single
vector.
On the other hand, the set of two vectors {e₁, e₂} is a set of generators of
ℂ² that is minimal. Indeed, neither of the vectors e₁ and e₂ is a generator of
ℂ².
Theorem 2.6.1
Let M be an A-invariant subspace. Then the number of vectors in a
set of minimal generators coincides with the maximal dimension m of
Ker(A − λ₀I)|_M, where λ₀ is any eigenvalue of A|_M.
where
(Such subspaces M₁ and M₂ are easily found by using the Jordan form of A.)
By the induction hypothesis we have a set of m − 1 generators x₁, …, x_{m−1}
for the A-invariant subspace M₁. Also, we have proved that there is a
generator x_m for the A-invariant subspace M₂. Then, obviously, x₁, …, x_m
is a set of generators for M.
if and only if all the coordinates x_i are different from zero. Indeed, if x_i = 0
for some i, then e_i does not belong to Span{x, Ax, A²x, …}. On the other
hand, if x_i ≠ 0 for i = 1, …, n, then
Proposition 2.7.1
A maximal A-invariant subspace in 𝒩 exists, is unique, and is equal to the sum
of all A-invariant subspaces that are contained in 𝒩.
Note that, because the dimension of 𝒩 is finite, M can actually be
expressed as the sum of a finite number of A-invariant subspaces.
Theorem 2.7.2
The maximal A-invariant subspace in 𝒩 coincides with
and x ∈ 𝒩_p. Assume inductively that we have already proved that x ∈ 𝒩_j,
j = 0, …, q − 1, for some q ≤ p. Then
Theorem 2.7.3
Given linear transformations A: ℂ^n → ℂ^n and C: ℂ^n → ℂ^r, the maximal
A-invariant subspace in Ker C is
Moreover, the subspace ℋ(C, A) coincides with ∩_{j=0}^{q−1} Ker(CA^j) for every
integer q greater than or equal to the degree of a minimal polynomial of A.
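Theorem 2.7.3 is the familiar "unobservable subspace" computation from systems theory; a numpy sketch with hypothetical C and A:

```python
import numpy as np

A = np.array([[1., 0., 0.],
              [0., 2., 0.],
              [0., 0., 3.]])
C = np.array([[1., 1., 0.]])   # hypothetical: C ignores the third coordinate
n = A.shape[0]

# stacked matrix [C; CA; ...; CA^{n-1}]; its kernel is the maximal
# A-invariant subspace contained in Ker C
O = np.vstack([C @ np.linalg.matrix_power(A, j) for j in range(n)])
dim_max_inv = n - np.linalg.matrix_rank(O)   # here Span{e3}, dimension 1
```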
It is easily seen, also, that the pair (C, A) is a null kernel pair if and only if
For j = 0, 1, …, n − 1 we have
and hence
The notion of null kernel pairs plays an important role in realization theory
for rational matrix functions and in linear systems theory, as we shall see in
Chapters 7 and 8. Here we prove that every pair of transformations has a
naturally defined null kernel part.
Theorem 2.7.4
Let C: <p"-*<p r and A: (f"1 -» <f" be transformations, and let Ml be the
maximal A-invariant subspace in Ker C. Then for every direct complement
M2 to M\ in (p", C and A have the following block matrix form with respect to
the direct sum decomposition <p" = M\ 4- M2.
where the pair C2: M2-^> (pr, A22: M2—* M2 is a null kernel pair. If <p" =
M\ + M'2 is another direct sum with respect to which C and A have the form
where the pair (C2, A'22) is a null kernel pair, then M\ is the maximal
A-invariant subspace in Ker C and there exists an invertible linear transfor-
mation S: M2-+M2 such that
and hence
On the other hand, x belongs to the domain of definition of A₂₂, that is,
x ∈ M₂. Since M₁ ∩ M₂ = {0}, the vector x must be the zero vector.
Consequently, (C₂, A₂₂) is a null kernel pair.
Now consider a direct sum decomposition ℂ^n = M₁′ ∔ M₂′ with respect to
which C and A have the form of equality (2.7.4) with the null kernel pair
(C₂′, A₂₂′). As
we have
where the last equality follows from the null kernel property of (C₂′, A₂₂′).
Hence M₁′ actually coincides with M₁. Further, write the identity transfor-
mation I: ℂ^n → ℂ^n as a 2 × 2 block matrix
Observe that if (2.7.5) holds, one can identify both M₂ and M₂′ with ℂ^m,
Our next result expresses the duality between minimal and maximal
invariant subspaces in a precise form. (Recall that by Proposition 1.4.4 the
subspace M is A invariant if and only if its orthogonal complement M⊥ is A*
invariant.)
Proposition 2.8.1
An A-invariant subspace M is minimal over 𝒩 if and only if the A*-invariant
subspace M⊥ is maximal in 𝒩⊥.
Theorem 2.8.2
The minimal A-invariant subspace M over 𝒩 coincides with
Indeed, let y ∈ A^j𝒩, so that y = A^j z for some z ∈ 𝒩. Then for every x ∈ ℂ^n
such that (A*)^j x ∈ 𝒩⊥ we have
Hence y ∈ 𝒩₁. If the equality (2.8.2) were not true, there would exist a
nonzero y₀ ∈ 𝒩₁ such that y₀ would be orthogonal to A^j𝒩. Hence for every
z ∈ 𝒩 we have
Note that the equality M = Σ_j A^j𝒩 can also be verified directly without
Theorem 2.8.3
If M is the minimal A-invariant subspace over 𝒩, and k = dim 𝒩, then for any
eigenvalue λ₀ of A|_M we have
Theorem 2.8.4
Let B: ℂ^s → ℂ^n and A: ℂ^n → ℂ^n be transformations. Then the minimal
A-invariant subspace over Im B coincides with
We say that a pair of transformations (A, B), where A: ℂ^n → ℂ^n and
B: ℂ^s → ℂ^n, is a full-range pair if the minimal A-invariant subspace over
Im B coincides with ℂ^n, or, equivalently, if
The duality generated by Proposition 2.8.1 now takes the form: the pair
(A, B) is a full-range pair if and only if the adjoint pair (B*, A*) is a null
kernel pair. This follows from the orthogonal decomposition
Then
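This duality is easy to test numerically: (A, B) is a full-range pair iff [B, AB, …, A^{n−1}B] has rank n, iff the stacked matrix built from the adjoint pair (B*, A*) has rank n. A sketch with a hypothetical controllable pair:

```python
import numpy as np

A = np.array([[0., 1.],
              [0., 0.]])
B = np.array([[0.],
              [1.]])
n = A.shape[0]

ctrb = np.hstack([np.linalg.matrix_power(A, j) @ B for j in range(n)])
obsv = np.vstack([B.conj().T @ np.linalg.matrix_power(A.conj().T, j) for j in range(n)])
full_range = np.linalg.matrix_rank(ctrb) == n    # (A, B) is a full-range pair
null_kernel = np.linalg.matrix_rank(obsv) == n   # (B*, A*) is a null kernel pair
```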
Theorem 2.8.5
Given transformations A: ℂ^n → ℂ^n and B: ℂ^s → ℂ^n, let 𝒩₁ be the minimal A-
invariant subspace over Im B. Then for every direct complement 𝒩₂ to 𝒩₁ in
ℂ^n, and with respect to the decomposition ℂ^n = 𝒩₁ ∔ 𝒩₂, the transformations
A and B have the block matrix form
with full-range pair (A₁₁′, B₁′), then 𝒩₁′ = 𝒩₁ and A₁₁′ = A₁₁, B₁′ = B₁.
so (A₁₁, B₁) is indeed a full-range pair. If (2.8.6) holds for a direct sum
decomposition ℂ^n = 𝒩₁′ ∔ 𝒩₂′, then
for some choice of integers m_i, 0 ≤ m_i ≤ k_i, is A invariant. [Here m_i = 0 is
interpreted in the sense that the vectors f_{i1}, …, f_{ik_i} do not appear in
(2.9.1) at all.] Such A-invariant subspaces are called marked (with respect to
the given basis f_{ij} in which A is in Jordan form).
The following example shows that, in general, not every A-invariant
subspace is marked (with respect to some Jordan basis for A).
Theorem 2.9.2
Let A: <£"—»• <p" be a transformation such that, for every eigenvalue A0 of A,
at least one of the following holds: (a) the geometric multiplicity of \Q is equal
Hence for fixed A the matrix f(A) depends only on the values of the
derivatives
where μ₁, …, μ_r are all the different eigenvalues of A and m_j is the height
of μ_j, that is, the maximal size of the Jordan blocks with eigenvalue μ_j in a
Jordan form of A. Equivalently, the height of μ_j is the minimal integer m
such that Ker(A − μ_jI)^m = R_{μ_j}(A). This observation allows us to define f(A)
by equality (2.10.1) not only for polynomials f(λ), but also for complex-
valued functions that are analytic in a neighbourhood of each eigenvalue of
A.
Note that for a fixed A the correspondence f(λ) → f(A) is an algebraic
homomorphism. This means that for any two functions f(λ) and g(λ) that
are analytic in a neighbourhood of each eigenvalue of A the following holds:
Also, we define
Proposition 2.10.1
Then
It is then easy to verify (2.10.3) for a Jordan block T with eigenvalue λ₀ (not
necessarily 0). Indeed, T − λ₀I has the eigenvalue 0, so by the case already
considered
Now
holds. Note that here Γ can be replaced by a composite contour that consists
of a small circle around each eigenvalue of A. (Indeed, the matrix function
(λI − A)⁻¹ is analytic outside the spectrum of A.) Using this observation and
formula (2.10.1) we see that for any function that is analytic in a neighbour-
hood of each eigenvalue of A, the formula
So a diagonable matrix has n Jordan blocks in its Jordan form, with each
block of size 1. If one knows that A is diagonable, then f(A) can be given a
meaning [by the same formula (2.10.1)] for every function f(λ) that is
defined on the set of all eigenvalues of A. So, given a diagonable A, there is
an S such that
For any function f(λ) that is defined for λ = a₁, …, λ = a_n, put
is given by the same power series as e^λ. In order to verify (2.10.7), we can
assume that A is in Jordan form:
Then, by definition
is
converges as well.
The exponential function appears naturally in the solution of systems of
differential equations of the type
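For the system x′(t) = A x(t), x(0) = x₀, the solution is x(t) = e^{tA}x₀. A self-contained sketch (power-series evaluation of e^{tA}, with a hypothetical rotation generator as A; in practice one would use scipy.linalg.expm):

```python
import numpy as np

A = np.array([[0., 1.],
              [-1., 0.]])   # hypothetical: e^{tA} is rotation by angle t
x0 = np.array([1., 0.])
t = 0.5

# e^{tA} via the partial sum of the power series sum_k (tA)^k / k!
E = np.eye(2)
term = np.eye(2)
for k in range(1, 30):
    term = term @ (t * A) / k
    E = E + term

x_t = E @ x0   # equals (cos t, -sin t) for this A
```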
Theorem 2.11.1
Let A: <p" —» <p" be a transformation with distinct eigenvalues /A, ,. . . , /ur and
partial multiplicities mn,. . . , mik corresponding to /A,., / = 1,. . . , r. Letf(\)
be an analytic function in a neighbourhood of each ti( (if all m(; are 1, it is
sufficient to require that /(/*,-) be defined for i — 1, . . . , r). For each m^
define a positive integer sif as follows: stj = m{- ifra/7 = 1 or iff(k\Hj) = 0for
& = ! , . . . , m-j — 1; otherwise /^(/x,) is the first nonvanishing derivative of
/(A) at /A,. Then the partial multiplicities of f ( A ) corresponding to the
eigenvalue A are as follows:
where f^{(s)}(μ) is the first nonvanishing derivative of f(λ) at μ. [If m = 1 or if
f^{(k)}(μ) = 0 for k = 1, …, m, we put s = m.] More generally,
Denoting the left-hand side of this relation by t_j, note that the sizes of the
Jordan blocks of f(A) are uniquely determined by the sequence t₁, …, t_m.
Indeed, the number of Jordan blocks of f(A) with size not less than j is just
t_j − t_{j−1}, where j = 1, …, m and t₀ is zero by definition. This observation,
together with (2.11.1), leads to the conclusion of the theorem.
Let us give an illustrative example for Theorem 2.11.1.
EXAMPLE 2.11.1. Let A be a 23 × 23 matrix with only two distinct eigen-
values, 0 and 1, with partial multiplicities 1, 4, 9 corresponding to the
eigenvalue 0, and with partial multiplicities 2, 7 corresponding to the eigen-
value 1. Let f(λ) = λ²(λ − 1)⁴. Then f(A) has the unique eigenvalue 0, and
the different partial multiplicities of A make the following contributions to the
partial multiplicities (PM) of f(A), according to Theorem 2.11.1:
Hence a Jordan form for the transformation A²(A − I)⁴ has four Jordan
blocks of size 1, five Jordan blocks of size 2, one Jordan block of size 4, and one
Jordan block of size 5, all corresponding to the eigenvalue zero.
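The block counts in this example can be confirmed by exact computation: for a nilpotent F, the number of Jordan blocks of size at least q is rank F^{q−1} − rank F^q. A sympy sketch, assembling A from the stated Jordan blocks:

```python
from sympy import diag, eye, zeros

def jordan_blocks(lam, sizes):
    # block-diagonal matrix of Jordan blocks J_k(lam), k in sizes
    n = sum(sizes)
    M = zeros(n, n)
    pos = 0
    for k in sizes:
        for i in range(k):
            M[pos + i, pos + i] = lam
            if i < k - 1:
                M[pos + i, pos + i + 1] = 1
        pos += k
    return M

A = diag(jordan_blocks(0, [1, 4, 9]), jordan_blocks(1, [2, 7]))  # 23 x 23
F = A**2 * (A - eye(23))**4     # f(A), nilpotent
ranks = [23] + [(F**q).rank() for q in range(1, 7)]
at_least = [ranks[q - 1] - ranks[q] for q in range(1, 7)]        # blocks of size >= q
counts = [at_least[q - 1] - (at_least[q] if q < 6 else 0) for q in range(1, 7)]
print(counts)  # counts[k-1] = number of Jordan blocks of size k
```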
Note that for a given transformation A: ℂ^n → ℂ^n and a function f(λ) such
that f(A) can be defined as above, there exists a polynomial p(λ) such that
p(A) = f(A). Indeed, take p(λ) such that
where μ₁, …, μ_r are all the different eigenvalues of A and m_j is the height
of μ_j.
Consider now the connections between invariant subspaces of A and the
invariant subspaces of a function of A.
Proposition 2.11.2
If M is an invariant subspace of a transformation A, then M is also invariant
for every transformation f(A), where f(λ) is a function for which f(A) is
defined.
Note that in general the linear transformation f(A) may have more invariant
subspaces than A, as the following example shows.
The invariant subspaces of A are {0}, Span{e₁}, and ℂ², but the invariant
subspaces of A² = 0 are all the subspaces of ℂ².
Theorem 2.11.3
(a) Assume that f(λ) is an analytic function in a neighbourhood of each
eigenvalue μ₁, …, μ_r of A (μ₁, …, μ_r are assumed to be distinct). Then
f(A) has exactly the same invariant subspaces as A if and only if the following
conditions hold: (i) f(μ_i) ≠ f(μ_j) if μ_i ≠ μ_j; (ii) f′(μ_j) ≠ 0 for every eigen-
value μ_j with height greater than 1. (b) If A is diagonable and f(λ) is a
function defined at each eigenvalue of A, then f(A) has exactly the same
invariant subspaces as A if and only if condition (i) of part (a) holds.
2.12 EXERCISES
2.1 Let
where a₁, …, a_n are complex numbers. Prove that there exists an
invertible matrix S independent of a₁, …, a_n such that SAS⁻¹ is
diagonal. [Hint: A is a polynomial in Q, where Q is defined in
Exercise 2.7.]
2.10 What is the Jordan form of the transformation
2.15 Let
(a) Prove that, for each eigenvalue, A has only one Jordan block in
its Jordan form. (Hint: Use the description of the partial multi-
plicities of A in terms of the matrix polynomial λI − A; see the
appendix.)
(b) Find the Jordan form of A.
dim Ker A^{s+1} + dim Ker A^{s−1} ≤ 2 dim Ker A^s,  s = 1, 2, …
hold.
2.27 Prove that if A: ℂ^n → ℂ^n has one-dimensional image, the minimal
number of generators of any A-invariant subspace is less than or
equal to n − 1. Show that Ker A is the only nontrivial A-invariant
subspace whose minimal number of generators is precisely n − 1.
2.28 For a given transformation A, denote by g(M) the minimal number of
generators of an A-invariant subspace M. Prove that
2.29 Let
with 2 × 2 blocks A_j
(f) Matrices A such that A2 = 0
is marked.
2.48 Prove that for any transformation A: ℂ³ → ℂ³ every invariant sub-
space is marked.
2.49 Find all Jordan forms of transformations A: ℂ⁴ → ℂ⁴ for which there
exists a nonmarked invariant subspace.
Chapter Three
Coinvariant and Semiinvariant Subspaces
EXAMPLE 3.1.1. Let A be an n × n Jordan block. Then for each i (1 ≤ i ≤ n)
Span{e_i, e_{i+1}, …, e_n} is an A-coinvariant subspace (although there are
many other A-coinvariant subspaces). For this subspace there is a unique
A-invariant subspace that is its direct complement, namely,
Span{e₁, e₂, …, e_{i−1}} ({0} if i = 1). Note that, in this case, the only
subspaces that are simultaneously A invariant and A coinvariant are the
trivial ones {0} and ℂ^n. □
For an A-coinvariant subspace M and any projector P onto M such that
Ker P is A invariant, we have PAP = PA. This follows, for instance, when
equation (1.5.5) is applied to I − P, or else it can be proved directly.
Conversely, if PAP = PA for some projector P onto a subspace M ⊆ ℂ^n,
then M is A coinvariant and Ker P is an A-invariant direct complement to M
in ℂ^n.
Given an A-coinvariant subspace M and a projector P onto M such that
Ker P is A invariant, the linear transformation A has the following block
triangular form:
Proposition 3.1.1
Let M₁ and M₂ be A-coinvariant subspaces with a common A-invariant
direct complement N. Then the compressions P_{M₁}A|_{M₁}: M₁ → M₁ and
(so S₁₁: M₁ → M₂, S₁₂: N → M₂, S₂₁: M₁ → N, S₂₂: N → N). It is easily seen
that S₁₂ = 0 and S₂₂ = I_N, the identity transformation on N. As S is invert-
ible, the transformation S₁₁ must be invertible as well, and
Now
Proposition 3.1.2
A subspace M is A coinvariant if and only if its orthogonal complement M⊥ is
A* coinvariant.
Proposition 3.1.3
A subspace M is orthogonally A-coinvariant if and only if M is invariant for
the adjoint linear transformation A*.
for all y ∈ M. But the left-hand side of (3.1.2) is just ⟨x, A*y⟩. Hence
A*y ∈ (M⊥)⊥ = M for all y ∈ M, and M is A* invariant. Reversing this
argument we find that if M is A* invariant, then Ax ∈ M⊥ for every x ∈ M⊥,
that is, M is orthogonally A coinvariant.
The only A-invariant subspaces are {0}, Span{e₁}, Span{e₁, e₂}, and ℂ³. Con-
sequently, all A-coinvariant subspaces are as follows:
for some x, y ∈ ℂ. Now Span{e₂, e₃} and Span{e₂, (1, 0, 1)} are A-coin-
variant subspaces, but their intersection (which is equal to Span{e₂}) is not.
Also, Span{e₃} and Span{(1, 0, 1)} are A-coinvariant subspaces, but their
sum (which is equal to Span{e₃, (1, 0, 1)}) is not.
Proposition 3.1.4
Any transformation has a complete chain of orthogonally coinvariant sub-
spaces.
Theorem 3.2.1
If A is diagonable, then each invariant subspace of A is reducing. Conversely,
if each invariant subspace of A is reducing, then A is diagonable.
where
Corollary 3.2.2
If a transformation A: ℂ^n → ℂ^n has n distinct eigenvalues, then every A-
invariant subspace is reducing.
Theorem 3.2.3
Every invariant subspace of A is orthogonally reducing if and only if A is
normal.
Proof. Recall first (Theorem 1.9.4) that A is normal if and only if there
is an orthonormal basis of eigenvectors x₁, …, x_n of A.
Assume that A is normal, and let x₁, …, x_n be an orthonormal basis of
eigenvectors of A that is ordered in such a way that
that any A-invariant subspace M has the form M = Σᵢ₌₁^r M ∩ R_{λ_i}(A).] The
orthogonal reducing property of R_{λ_i}(A) implies that the subspaces
R_{λ₁}(A), …, R_{λ_r}(A) are orthogonal to each other. Taking an orthonormal
basis in each R_{λ_i}(A) (which necessarily consists of eigenvectors of A
corresponding to λ_i), we obtain an orthonormal basis of ℂ^n in which A has a
diagonal form. Hence A is normal.
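A numerical illustration of Theorem 3.2.3 with a hypothetical Hermitian (hence normal) matrix: each eigenvector line is invariant, and so is its orthogonal complement, so the invariant subspace is orthogonally reducing:

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])        # normal (indeed Hermitian); eigenvalues 1 and 3
w, V = np.linalg.eigh(A)
v, u = V[:, 0], V[:, 1]         # orthonormal eigenvectors

M_invariant = np.allclose(A @ v, w[0] * v)       # M = Span{v} is A invariant
Mperp_invariant = np.allclose(A @ u, w[1] * u)   # and so is its orthogonal complement
orthogonal = abs(v @ u) < 1e-12
```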
Theorem 3.3.1
Let A: (p" —» <p" be a transformation. The following statements are equivalent
for a subspace M C <p": (a) M is semiinvariant for A; (/>) for a suitable
projector P mapping (p" onto M, we have
where we have used the property (b) twice. As the subspace ℒ is spanned by
A^j x, x ∈ M, j = 0, 1, …, we conclude that PAPy = PAy for every y ∈ ℒ,
which amounts to the equality PAPQ = PAQ, and (3.3.4) follows. Using the
equalities QAQ = AQ, QAP = AP, PAP = PAQ, we easily verify that
(Q − P)A(Q − P) = A(Q − P). This means that Im(Q − P) is A invariant.
Finally, let f(λ) be a function such that f(A) is defined. Then f(A) =
p(A), where p(λ) is a polynomial such that
where
in the orthonormal basis for ℂ^n obtained by putting together the ortho-
Proposition 3.3.2
An orthogonally A-semiinvariant subspace is also orthogonally A*-semiin-
variant.
Theorem 3.3.3
The following statements are equivalent for a transformation A: ℂ^n → ℂ^n and
a subspace M ⊆ ℂ^n: (a) M is orthogonally semiinvariant for A; (b) we have
The proof is like the proof of Theorem 3.3.1, with the only difference
that an orthogonal triinvariant decomposition is used and the projector Q is
taken to be orthogonal.
Proposition 3.4.1
Let A: <p" —» <J7" be a unicellular transformation that is represented as a
Jordan block in some basis xl, . . . , xn. Then a k-dimensional subspace
The proof follows easily from the definitions of coinvariant and semi-
invariant subspaces and from the fact that the only A-invariant subspaces
are {0} and Span{x₁, …, x_i}, i = 1, …, n.
Consider now a diagonable transformation A: ℂ^n → ℂ^n, so that A =
diag[λ₁, …, λ_n] in some basis in ℂ^n. As we have seen in Example 1.2, if all
the λ_i are different, then every subspace of ℂ^n is A coinvariant and hence also A
semiinvariant. In fact, this conclusion holds for any diagonable transfor-
mation (not necessarily with all eigenvalues distinct). Indeed, consider the
transformation B given by the matrix diag[μ₁, …, μ_n], with the μ_i all
different, in the same basis in which A is given by diag[λ₁, …, λ_n]. As every
B-invariant subspace is also A invariant, it follows that every B-coinvariant
subspace is also A coinvariant. But we have already seen that every
subspace is B-coinvariant.
We consider now the orthogonally coinvariant and semiinvariant sub-
spaces. We say that a transformation A: ℂ^n → ℂ^n is orthogonally unicellular
if there exists a Jordan chain x₁, …, x_n of A such that the vectors
x₁, …, x_n form an orthogonal basis in ℂ^n. Clearly, any orthogonally
unicellular transformation is unicellular.
Proposition 3.4.2
Let A: $" —> <p" be an orthogonally unicellular transformation, and let
x,,..., xn be its orthogonal Jordan chain. Then the only orthogonally
A-coinvariant subspaces are Span{xk, xk +l, . . . , xn}; k — I , . . . , n; and {0}.
The only orthogonally A-semiinvariant subspaces are Span{jc k ,. . . , x,},
l<k<l<n and {0}.
Theorem 3.4.3
The following statements are equivalent for a transformation: (a) A is
normal; (b) every A-invariant subspace is orthogonally A coinvariant; (c)
Proof. Obviously, (d) implies (c). Assume that A is normal, and let
λ₁, …, λ_r be all the different eigenvalues of A. Then
It follows that M is A invariant. So (a) implies (d). One sees easily that (a)
implies (b) also.
It remains to show that (c) ⇒ (a) and (b) ⇒ (a). Assume (c) holds, that is
(cf. Proposition 3.1.2), every A*-invariant subspace is A invariant. Write A*
in an upper triangular form with respect to some orthonormal basis
Comparison of (3.4.2) and (3.4.3) reveals that b_{ij} = 0 for i < j, and A is
normal.
normal.
Assume now that (b) holds, and write
3.5 EXERCISES
3.6 Prove that every subspace of ℂ^n is coinvariant for any n × n circulant
matrix.
3.9 Find all coinvariant and semiinvariant subspaces for the matrix
3.10 Prove that every reducing A-invariant subspace is also reducing for
f(A), where f(λ) is any function such that f(A) is defined. Is the
converse true?
3.11 If J is a Jordan block, for which positive integers k does the matrix J^k
have a nontrivial reducing invariant subspace? Is the reducing sub-
space unique?
Chapter Four
Jordan Forms for Extensions and Completions
Theorem 4.1.1
Let J₁ and J₂ be matrices in Jordan normal form with sizes p × p and q × q,
respectively. Let B be a p × q matrix and
Denote by J₁₀ and J₂₀ the Jordan submatrices of J₁ and J₂, respectively,
formed by those Jordan blocks with the same eigenvalue λ₀.
Then the partial multiplicities of J corresponding to λ₀ coincide with the
partial multiplicities of the submatrix
Lemma 4.1.2
Let A, B, and C be given matrices of sizes n × n, m × m, and n × m, respectively.
Consider the equation
This lemma follows immediately from the fact that, for the linear
transformation L: ℂ^{n×m} → ℂ^{n×m} defined by L(X) = AX − XB, we have σ(L) =
{λ − μ | λ ∈ σ(A) and μ ∈ σ(B)}. [See Chapter 12 of Lancaster and Tis-
menetsky (1985), for example.] Here we give a direct proof based on the
Jordan decompositions of A and B.
Thus we can restrict ourselves to equation (4.1.3). Let us write down J_A and
J_B explicitly:
Write
where H and G are the nilpotent matrices [i.e., σ(H) = σ(G) = {0}] having
1 on the first superdiagonal and zeros elsewhere. Rewrite equation (4.1.5) in
the form
are similar.
As
where J₁₁ (resp. J₂₁) are the Jordan blocks from J₁ (resp. J₂) with eigen-
values different from λ₀, and the B_{ij} are the corresponding submatrices in J.
Applying Lemma 4.1.3 twice, we see that J is similar to
which after interchanging the second and third block rows and columns (this
is a similarity operation) becomes
The following result describes the connections between the partial multi-
plicities of a transformation and those of its extension.
Theorem 4.1.4
Let M C <p" be a subspace and let A0: M —> M be a transformation. Then for
every extension A: <P"—»• <p" of A0 we have
and
Theorem 4.1.5
Let M C <p" be a subspace and A0: M—>M be a transformation. Then for
every coextension A of A0 we have a^A; A0) ^ <Xj(A0; A0), ; = 1, 2,. . . for
every A0 G <p. Conversely, let fi^ (}2> • • • be a nonincreasing sequence of
nonnegative integers such that equations (4.1.8) and (4.1.9) hold. Then there
is a coextension A of A0 such that ctj(A; A 0 ) = /3y, ;' = 1 , 2 , . . . .
Proposition 4.2.1
Let C be a completion of A and B, with the partial multiplicities of A, B, and
C at a fixed λ_0 ∈ C given by the nonincreasing sequences of nonnegative
integers {α_j}_{j=1}^∞, {β_j}_{j=1}^∞, and {γ_j}_{j=1}^∞, respectively. Then
*It is convenient here to talk about the "algebraic multiplicity of C at λ_0" rather than the
"algebraic multiplicity of λ_0" as an eigenvalue of C.
As usual in this book, the symbol Ω# represents the number of different
elements in a finite set Ω.
and thus
which is formed by the same rows and columns as Q itself. Now Q(ε) is as
close as we wish to Q provided ε is sufficiently close to 0. Take ε so small
that the matrix Q(ε) is also nonsingular. For such an ε
Comparing with (4.2.7), we obtain the desired inequality (4.2.6). Now use
Proposition 2.2.6 to obtain the inequalities (4.2.5).
Indeed, as {k | γ_k ≥ j}# = 0 for j > γ_1, and similarly for {α_k}_{k=1}^∞ and
{β_k}_{k=1}^∞, all the sums in equation (4.2.8) are finite, so (4.2.8) makes sense.
Further, for any nonincreasing sequence of nonnegative integers {δ_i}_{i=1}^∞ with
finite sum Σ_{i=1}^∞ δ_i we have
Obviously, the area of Φ is just the left-hand side of equation (4.2.9). On
the other hand, the right-hand side of (4.2.9) is also the area of Φ calculated
by the rows of Φ (indeed, {k | δ_k ≥ i}# is the area of the ith row in Φ,
counting from the bottom); hence equality holds in (4.2.9). Now appeal to
(4.2.3), and (4.2.8) follows. We need a completely different line of argument
to prove the following proposition.
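The area argument just used is the conjugate-partition identity for Young diagrams: for a nonincreasing sequence of nonnegative integers δ_1 ≥ δ_2 ≥ ⋯ with finite sum, Σ_i δ_i = Σ_i {k | δ_k ≥ i}#, both sides counting the cells of Φ. A quick check on an arbitrary example sequence:

```python
# Conjugate-partition ("area of the Young diagram") identity:
# the sum of a nonincreasing nonnegative sequence equals the sum over
# i >= 1 of the row counts #{k : delta_k >= i}.
delta = [5, 3, 3, 1, 0, 0]              # arbitrary example, finite sum

area_by_columns = sum(delta)
area_by_rows = sum(sum(1 for d in delta if d >= i)
                   for i in range(1, max(delta) + 1))

assert area_by_columns == area_by_rows == 12
```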
Proposition 4.2.2
With {α_j}_{j=1}^∞, {β_j}_{j=1}^∞, and {γ_j}_{j=1}^∞ as in Proposition 4.2.1, we have
and hence also the relation (4.2.12), follows from (4.2.4) because in this
case β_j = 0 for j > n − p + 1. Similarly, (4.2.12) holds for p < n_B. We have
proved (4.2.10) for m = 1, . . . , n. For m ≥ n the inequality (4.2.10) coin-
cides with (4.2.3), so the proof of (4.2.10) is complete.
Theorem 4.3.1
Let {α_j}_{j=1}^∞, {β_j}_{j=1}^∞, and {γ_j}_{j=1}^∞ be as in Proposition 4.2.1. Then for every
sequence r_1 < r_2 < ⋯ < r_m of positive integers we have
and
Lemma 4.3.2
Let
Proof of Theorem 4.3.1. Let C^n = M + N, and let A: M → M, B: N → N
be transformations such that {α_j}_{j=1}^∞, {β_j}_{j=1}^∞, and {γ_j}_{j=1}^∞ are the nonin-
creasing sequences of nonnegative integers representing the partial multip-
licities of A, B, and
Assume that inequality (4.3.2) is proved for all A with the property that the
size of the biggest Jordan block is less than α_1. Using a matrix similar to A
in place of A, we can assume that
But in view of Lemma 4.3.2 (applied with C'* and C* in place of B and C,
respectively)
Now combine relations (4.3.5), (4.3.6), and (4.3.7) to obtain the inequality
(4.3.2). The inequalities (4.3.1) are obtained from (4.3.2) applied to the
transformation C* written as the 2 × 2 block matrix with respect to the
direct sum decomposition C^n = N + M.
Also let
Actually, the inclusion (4.3.8) in turn implies (4.3.1) and (4.3.2). The proof
of these statements would take us too far afield; we only mention that it is
essentially the same as the proof of Theorem 10 of Lidskii (1966). It is
interesting that the geometric interpretation of inequalities (4.3.1) and
(4.3.2) is completely analogous to the geometric interpretation of the
inequalities for the eigenvalues of the sum of two hermitian matrices in
terms of the eigenvalues of each hermitian matrix [see Lidskii (1966)].
Inequalities (4.3.1) and (4.3.2) can be generalized. In fact, for any
sequence r_1 < r_2 < ⋯ < r_m of positive integers and any nonnegative integer
k < r_1 the following inequalities hold [see Thijsse (1984)]:
Proposition 4.4.1
Let α = (α_1, α_2, . . .) ∈ Ω, β = (β_1, β_2, . . .) ∈ Ω, and put m = Σ_{i=1}^∞ α_i, n =
Σ_{i=1}^∞ β_i. Then a sequence γ = (γ_1, γ_2, . . .) ∈ Ω belongs to Γ(α, β) if and only
if there is an m × n matrix A such that the partial multiplicities of the matrix
(where n_1 [resp. n_2] is the largest index such that α_{n_1} ≠ 0 [resp. β_{n_2} ≠ 0])
are γ_1, γ_2, . . . .
so the matrices
have the same Jordan form. But then (in view of Corollary 2.2.3) this is also
true for the matrices
As J_2^T and J_1^T are similar to J_2 and J_1, respectively, the conclusion Γ(α, β) =
Γ(β, α) follows.
In view of Proposition 4.4.1, in order to determine Γ(α, β), we have to
find the partial multiplicities γ_1 ≥ γ_2 ≥ ⋯ (or, what is the same, the Jordan
form) of matrices J of type (4.4.1). As
where the sum is over all the pairs p, q such that p ≤ min(k, α_i), q ≤
min(k, β_j), and p + q > k. For example, B_{11}^{(1)} has u_11 in the lower left corner
and zeros elsewhere, and B_{11}^{(2)} and B_{11}^{(3)} have the corresponding blocks
in their lower left corners and zeros elsewhere (provided α_1, β_1 ≥ 3). Let
be the m × n matrix with blocks B_{ij}^{(k)} (i = 1, . . . , n_1; j = 1, . . . , n_2).
Lemma 4.4.2
In the preceding notation we have
and hence
where E_{ab} = 0 whenever at least one of the inequalities 1 ≤ a ≤ α_i, 1 ≤ b ≤ β_j
is violated, and u_{ab} = 0 for a < 1 or b < 1.
It follows that
A^{(k)} = B^{(k)} + (terms with E_{p'q'} such that p' > k or q' > k)
By column operations from J_1^k and row operations from J_2^k, we can eliminate
all terms of A^{(k)} except those in the block B^{(k)}. Permuting the rows and
columns of the resulting matrix, we obtain the following matrix, which has the
same rank as J^k:
EXAMPLE 4.4.1. Let α = (α_1, 0, 0, . . .), β = (β_1, 0, 0, . . .), where α_1, β_1 >
0. We suppose for definiteness that α_1 ≥ β_1. If d^{(1)} ≠ 0, it is easily seen that
rank
In general, we have
where t_0 is the smallest t such that d^{(t)} ≠ 0, or t_0 = β_1 + 1 if all the d^{(t)} are zeros.
It is now clear that γ = (γ_1, γ_2, . . .) ∈ Γ(α, β) is determined completely by
the value of t_0. Further, using formula (4.4.3) and Lemma 4.4.2, we
compute
Γ(α, β) = {(α_1 + β_1, 0), (α_1 + β_1 − 1, 1), . . . , (α_1 + 1, β_1 − 1), (α_1, β_1)}
(In every γ sequence we write only the first members; the others are zeros.)
The γ sequence (α_1 + β_1 − p, p) corresponds to the value t_0 = p + 1.
The possibility of γ = (α_1 + β_1 − p, p), p = 0, . . . , β_1, is realized for the
matrix
where A_p is an α_1 × β_1 matrix with all but the (α_1 − p, 1)th entry equal to
zero, and this exceptional entry is equal to 1 (for p = β_1 we put A_p = 0). It is
not difficult to construct two independent Jordan chains of λI − J^{(p)} of
lengths α_1 + β_1 − p and p. Namely, the Jordan chain of length α_1 + β_1 − p is
e_{α_1+p}, e_{α_1+p−1}, . . . , e_{α_1+1}, e_{α_1−p}, e_{α_1−p−1}, . . . , e_1. The Jordan chain of
length p is e_{α_1} − e_{α_1+p}, e_{α_1−1} − e_{α_1+p−1}, . . . , e_{α_1−p+1} − e_{α_1+1}.
Using Lemma 4.4.2, we shall now give a complete description of the set
Γ(α, β) in the case that α = (α_1, α_2, . . . , α_n, 0, . . .) and β = (β_1, 0, 0, . . .),
where α_n and β_1 are positive.
Introduce the set Ω_0 of all n-tuples (ω_1, ω_2, . . . , ω_n), where the ω_j are
integers such that 1 ≤ ω_j ≤ Δ_j + 1 and Δ_j = min(α_j, β_1). For a given sequence
ω = (ω_1, ω_2, . . . , ω_n) ∈ Ω_0 and i = 1, 2, . . . , define integers c_i^{(ω)} as follows:
Special Case of Completions 141
where μ_j = max(α_j, β_1). Now let γ = (γ_1, γ_2, . . .) be the nonincreasing
sequence of nonnegative integers defined by the equalities
Thus for every ω ∈ Ω_0 we have constructed a sequence γ. Let us denote this
sequence by Γ(ω).
Theorem 4.4.3
For every ω ∈ Ω_0 the sequence Γ(ω) belongs to Γ(α, β). Conversely, if
γ ∈ Γ(α, β), there exists ω ∈ Ω_0 such that γ = Γ(ω).
is just the maximum of the ranks of B_1^{(k)}, B_2^{(k)}, . . . , B_n^{(k)}, that is, f_k.
Conversely, if the d_j^{(t)} are given, define ω_j as the minimal t (1 ≤ t ≤ Δ_j) such
that d_j^{(t)} ≠ 0; and if d_j^{(t)} = 0 for every t, 1 ≤ t ≤ Δ_j, put ω_j = Δ_j + 1.
4.5 EXERCISES
are similar.)
where X is an n x n matrix.
Chapter Five
Applications to Matrix Polynomials
are equivalent. (See the appendix for the notion of equivalence.) In this case
C is said to be a linearization of L(λ). The invariant, coinvariant, and
semiinvariant subspaces for C play a special role in the study of the matrix
polynomial L(λ). For example, certain invariant subspaces of C are related
to factorizations of L(λ). More precisely, certain invariant subspaces determine
monic right divisors of L(λ), certain coinvariant subspaces determine
monic left divisors, and certain semiinvariant subspaces determine three
monic factors of L(λ). In this chapter we explore these and similar
connections and study the behavior of solutions of differential and difference
equations with constant coefficients.
In this section we introduce the main tools required for the study of monic
matrix polynomials. These tools are freely used in subsequent sections.
Let L(λ) = Iλ^l + Σ_{j=0}^{l−1} A_jλ^j be a monic matrix polynomial of degree l,
where the A_j are n × n matrices with complex entries. Note that det L(λ) is
a polynomial of degree nl. A linear matrix polynomial Iλ − A of size
(n + p) × (n + p) is called a linearization of L(λ) if
Monic Matrix Polynomials 145
Theorem 5.1.1
For a monic matrix polynomial define
the nl x nl matrix
The matrix C_1 from Theorem 5.1.1 will be called the (first) companion
matrix of L(λ) and will play an important role in the sequel. From the
definition of C_1 it is clear that
In particular, the eigenvalues of L(λ), that is, the zeros of the scalar polynomial
det L(λ), and the eigenvalues of Iλ − C_1 are the same. In fact, we can say
more: since C_1 is a linearization of L(λ), it follows that the elementary
divisors (and thus also the partial multiplicities of every eigenvalue) of
Iλ − C_1 and L(λ) are the same.
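This agreement is easy to test numerically. The sketch below assumes the standard companion layout (identity blocks on the first block superdiagonal and −A_0, . . . , −A_{l−1} in the last block row), since the display itself is not reproduced in this copy; the coefficient matrices are arbitrary illustrative choices:

```python
import numpy as np

# L(lam) = I lam^2 + A1 lam + A0, with arbitrary 2 x 2 coefficients.
A0 = np.array([[2.0, 1.0], [0.0, 3.0]])
A1 = np.array([[1.0, 0.0], [1.0, 1.0]])
n, deg = 2, 2

def L(lam):
    return np.eye(n) * lam**2 + A1 * lam + A0

# Assumed companion layout: identity on the block superdiagonal,
# -A0, -A1 in the last block row.
C1 = np.block([[np.zeros((n, n)), np.eye(n)],
               [-A0, -A1]])

# Every eigenvalue of C1 is a zero of det L(lam), and there are nl of them.
eigs = np.linalg.eigvals(C1)
assert len(eigs) == n * deg
assert all(abs(np.linalg.det(L(lam))) < 1e-6 for lam in eigs)
```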
Now we prove an important result connecting the rational matrix function
L(λ)^{−1} with the resolvent function for the linearization C_1.
Proposition 5.1.2
For every λ ∈ C that is not an eigenvalue of L(λ), the following equality
holds:
where
is an n × nl matrix and
is an nl × n matrix.
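With the standard choices P_1 = [I 0 ⋯ 0] and R_1 = col[0, . . . , 0, I] (an assumption here, as the displays are not reproduced in this copy), equality (5.1.3) reads L(λ)^{−1} = P_1(λI − C_1)^{−1}R_1, and it can be verified at any point that is not an eigenvalue:

```python
import numpy as np

A0 = np.array([[2.0, 1.0], [0.0, 3.0]])
A1 = np.array([[1.0, 0.0], [1.0, 1.0]])
n = 2

C1 = np.block([[np.zeros((n, n)), np.eye(n)],
               [-A0, -A1]])
P1 = np.hstack([np.eye(n), np.zeros((n, n))])   # n x nl
R1 = np.vstack([np.zeros((n, n)), np.eye(n)])   # nl x n

lam = 0.7                                       # not an eigenvalue of L
L_lam = np.eye(n) * lam**2 + A1 * lam + A0
resolvent = P1 @ np.linalg.inv(lam * np.eye(2 * n) - C1) @ R1
assert np.allclose(np.linalg.inv(L_lam), resolvent)
```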
It is easy to see that the first n columns of the matrix [L(λ)]^{−1} have the form
(5.1.4). Now, multiplying equation (5.1.5) on the left by P_1 and on the
right by R_1, and using the relation
Proposition 5.1.3
Any two linearizations of a monic matrix polynomial L(λ) are similar.
Conversely, if a matrix T is a linearization of L(λ) and a matrix S is similar to
T, then S is also a linearization of L(λ).
This proposition and the resolvent form (5.1.3) suggest the following
important definition: a triple of matrices (X, T, Y), where T is nl × nl, X is
n × nl, and Y is nl × n, is called a standard triple of L(λ) if
For example, Proposition 5.1.2 shows that (P_1, C_1, R_1) is a standard triple
of L(λ).
It is evident from the definition that, if (X, T, Y) is a standard triple for
L(λ), then so is any other triple (X', T', Y') that is similar to (X, T, Y), that
is, such that
for some nonsingular matrix S. As we see in Theorem 5.1.5, this is the only
freedom in the choice of standard triples.
We start with some useful properties of standard triples. Here and in the
sequel we adopt the notation col[Z_i] for the column matrix
148 Applications to Matrix Polynomials
Proposition 5.1.4
If (X, T, Y) is a standard triple of a monic n × n matrix polynomial
L(λ), then the nl × nl matrices
and
hold.
Proof. We have
where Γ is a circle with centre 0 and radius sufficiently large that σ(T)
and the eigenvalues of L(λ) are inside Γ. On the other hand, since L(λ) is a
monic polynomial of degree l, the matrix function λ^{−l}L(λ) is
analytic and invertible in a neighbourhood of infinity and takes the value I at
infinity; in fact, it is analytic outside and on Γ. Hence
It follows that
We are now ready to state and prove the basic result that the standard
triple for a monic matrix polynomial is essentially unique (up to similarity).
Theorem 5.1.5
Let (X_1, T_1, Y_1) and (X_2, T_2, Y_2) be two standard triples of the monic matrix
polynomial L(λ) of degree l. Then there exists a unique nonsingular matrix S
such that
and
where δ_{ij} is the Kronecker index (δ_{ij} = 0 if i ≠ j; δ_{ij} = 1 if i = j). Obviously
where
then we have
Thus the triple (5.1.10) is similar to the standard triple given in Proposition
5.1.2. The notion of a standard triple is the main tool in the following
representation theorem.
Theorem 5.1.6
Let L(λ) be a monic matrix polynomial of degree l with
standard triple (X, T, Y). Then L(λ) admits the following representations:
(a) Right canonical form:
Note that only X and T appear in the right canonical form of L(λ),
whereas only T and Y appear in the left canonical form.
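For the companion triple (P_1, C_1, R_1) the right canonical form is particularly transparent, since col[P_1C_1^i]_{i=0}^{l−1} is the identity. Reading (5.1.13) as L(λ) = λ^lI − XT^l(V_1 + V_2λ + ⋯ + V_lλ^{l−1}) with [V_1 ⋯ V_l] = (col[XT^i]_{i=0}^{l−1})^{−1} (our assumption for the lost display), the coefficients can be recovered numerically:

```python
import numpy as np

A0 = np.array([[2.0, 1.0], [0.0, 3.0]])
A1 = np.array([[1.0, 0.0], [1.0, 1.0]])
n, deg = 2, 2

C1 = np.block([[np.zeros((n, n)), np.eye(n)],
               [-A0, -A1]])
P1 = np.hstack([np.eye(n), np.zeros((n, n))])

# Q = col[X T^i], i = 0..l-1; for the companion triple Q is the identity.
Q = np.vstack([P1 @ np.linalg.matrix_power(C1, i) for i in range(deg)])
assert np.allclose(Q, np.eye(n * deg))

# The block row X T^l Q^{-1} = [W_1, ..., W_l] then recovers the
# (negated) coefficients: W_i = -A_{i-1}.
W = P1 @ np.linalg.matrix_power(C1, deg) @ np.linalg.inv(Q)
assert np.allclose(W[:, :n], -A0)
assert np.allclose(W[:, n:], -A1)
```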
Proof. Observe that the forms (5.1.13) and (5.1.14) are independent of
the choice of the standard triple (X, T, Y). Let us check this for (5.1.13),
for example. We have to prove that if (X, T, Y) and (X', T', Y') are
standard triples of L(λ), then
Therefore
and for checking (5.1.14), we choose the standard triple defined by (5.1.12).
To prove (5.1.13), observe that
and
so
and
So
Thus
Multiplication of Monic Matrix Polynomials 153
and
Theorem 5.2.1
Let L_i(λ) be a matrix polynomial with standard triple (X_i, T_i, Y_i) for i = 1, 2,
and let L(λ) = L_1(λ)L_2(λ). Then
where
Corollary 5.2.2
If L_i(λ) are monic matrix polynomials with standard triples (X_i, T_i, Y_i) for
i = 1, 2, then the product L_1(λ)L_2(λ) has a standard triple (X, T, Y) with the
representations
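The multiplication formula can be checked directly. Writing L(λ) = L_1(λ)L_2(λ), so that L^{−1} = L_2^{−1}L_1^{−1}, the triple X = [X_2 0], T = [[T_2, Y_2X_1], [0, T_1]], Y = col[0, Y_1] satisfies X(λI − T)^{−1}Y = L(λ)^{−1}, because the top-right block of the block-triangular resolvent is (λI − T_2)^{−1}Y_2X_1(λI − T_1)^{−1}. (The exact block ordering in the lost display may differ; this is one consistent reading.) A sketch with linear factors:

```python
import numpy as np

# Linear monic factors L1(lam) = lam I + B1, L2(lam) = lam I + B2,
# with the obvious standard triples (I, -Bi, I); matrices are arbitrary.
B1 = np.array([[1.0, 2.0], [0.0, 3.0]])
B2 = np.array([[0.0, 1.0], [1.0, 1.0]])
n = 2
X1, T1, Y1 = np.eye(n), -B1, np.eye(n)
X2, T2, Y2 = np.eye(n), -B2, np.eye(n)

# Product triple for L = L1 * L2 (one consistent ordering convention).
X = np.hstack([X2, np.zeros((n, n))])
T = np.block([[T2, Y2 @ X1],
              [np.zeros((n, n)), T1]])
Y = np.vstack([np.zeros((n, n)), Y1])

lam = 0.5                     # not an eigenvalue of either factor
L_lam = (lam * np.eye(n) + B1) @ (lam * np.eye(n) + B2)
assert np.allclose(np.linalg.inv(L_lam),
                   X @ np.linalg.inv(lam * np.eye(2 * n) - T) @ Y)
```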
Theorem 5.2.3
Let L_1(λ) and L_2(λ) be n × n monic matrix polynomials. Let α, β, and γ be
the sequences of partial multiplicities of L_1(λ), L_2(λ), and L_2(λ)L_1(λ),
respectively, at λ_0. Then γ ∈ Γ(α, β). Conversely, if γ ∈ Γ(α, β), then for n
sufficiently large there exist n × n monic matrix polynomials L_1(λ) and
L_2(λ) such that the sequences of their partial multiplicities at λ_0 are α and β,
respectively, and the sequence of partial multiplicities of L_2(λ)L_1(λ) is γ.
Proof. Let (X_i, T_i, Y_i) be a standard triple for L_i(λ), i = 1, 2. By
the multiplication formula (Corollary 5.2.2), the matrix
where I is the unit r × r matrix (for some r ≤ min(r_1, r_2)). Then we can take
for some T_1, T_2, A, and the partial multiplicities of J_1 (resp. J_2) corre-
sponding to λ_0 are given by the sequence α (resp. β). Applying a similarity
to J_0, if necessary, we can assume that J_1 and J_2 are in Jordan form.
Further, in view of Theorem 4.1.1 we can assume that σ(J_1) = σ(J_2) = {λ_0}.
According to the assertion proved in the preceding paragraph, for n
sufficiently large there exist matrices X_0 and Y_0 of sizes n × r_2 and r_1 × n,
respectively (where r_1 = Σ_j α_j, r_2 = Σ_j β_j), such that Y_0X_0 = A, the rows
of Y_0 are linearly independent, and so are the columns of X_0. Choose an
n × (n − r_2) matrix X_1 such that the matrix [X_0 X_1] (of size n × n) is
invertible, and put
are the standard triples for L_2(λ) and L_1(λ), respectively. By Corollary
5.2.2 the matrix
is a linearization of L_2(λ)L_1(λ). Now Theorem 4.1.1 ensures that the partial
multiplicities of T corresponding to λ_0 are exactly those for T_0; that is, they
are given by the sequence γ.
The proof of the converse statement of Theorem 5.2.3 shows that for a
γ ∈ Γ(α, β) there exist linear monic matrix polynomials L_1(λ) and L_2(λ)
with the desired properties and with size not exceeding min(r_1, r_2) +
r_1 + r_2, where r_1 (resp. r_2) is the sum of all integers in α (resp. β).
Our analysis of partial multiplicities of completions in Sections 4.2-
4.4, combined with Theorem 5.2.3, allows us to deduce various connections
between the partial multiplicities of monic matrix polynomials and the
partial multiplicities of their product, as indicated, for instance, in the
following corollary.
Corollary 5.2.4
Let L_1(λ) and L_2(λ) be n × n monic matrix polynomials. Let α =
(α_1, α_2, . . .), β = (β_1, β_2, . . .), and γ = (γ_1, γ_2, . . .) be the sequences of partial
multiplicities of L_1(λ), L_2(λ), and L_2(λ)L_1(λ), respectively, at λ_0. Then
where the subspaces ℒ and ℒ + ℳ are T invariant. The triinvariant decom-
position (5.3.1) is called supporting [with respect to (X, T, Y)] if, for some
integers p and q, the transformations
Divisibility of Monic Matrix Polynomials 157
and
Lemma 5.3.1
Let L(λ) be a monic matrix polynomial of degree l with standard triple
(X, T, Y), and let P be a projector in C^nl. Then the transformation
is invertible.
invertible.
Thus the A_i are transformations with the following domains and ranges:
where D_1 and D_2 are nonsingular matrices. Recall that A and B are also
nonsingular by Proposition 5.1.4. But then A_1 is invertible if and only if B_4
is invertible. This may be seen as follows.
Suppose that B_4 is invertible. Then
Theorem 5.3.2
Let L(λ) be an n × n monic matrix polynomial with standard triple
(X, T, Y), and let C^nl = ℒ + ℳ + 𝒩 be a supporting triinvariant decompos-
ition associated with a T-semiinvariant subspace ℳ. Then L(λ) admits a
factorization
(c) (Z|ℳ, P_ℳT|ℳ, F) is a standard triple for L_2(λ), where P_ℳ is the projector
on ℳ along ℒ + Im P_𝒩.
[Here q ≤ l and l − p ≤ l are the unique nonnegative integers such that the
linear transformations col(XT^i)_{i=0}^{q−1}: ℒ → C^{nq} and P_ℳ[Y, . . . , T^{l−p−2}Y,
T^{l−p−1}Y]: C^{n(l−p)} → ℒ + ℳ are invertible.]
Conversely, if equation (5.3.6) is a factorization of L(λ) into a product of
three monic matrix polynomials L_1(λ), L_2(λ), and L_3(λ), then there exists a
supporting triinvariant decomposition
Similarly, using Theorem 5.3.2, one can produce formulas for L_1(λ),
L_2(λ), and L_3(λ) themselves. The proof of Theorem 5.3.2 is quite lengthy
and is relegated to the next section.
The following particular case of Theorem 5.3.2 is especially important.
We assume that L(λ) and (X, T, Y) are as in Theorem 5.3.2.
Corollary 5.3.3
Let C^nl = ℒ + ℳ + 𝒩 be a supporting triinvariant decomposition associated
with a T-semiinvariant subspace ℳ such that ℒ + ℳ = C^nl (so 𝒩 = {0} and ℳ
is actually T coinvariant). Then L(λ) admits a factorization
Proposition 5.4.1
Let L(λ) = Σ_{j=0}^{l} A_jλ^j be an n × n matrix polynomial (not necessarily monic),
and let L_1(λ) be an n × n monic matrix polynomial with standard triple
(X_1, T_1, Y_1). Then (a) L(λ) = L_2(λ)L_1(λ) for some matrix polynomial
L_2(λ) if and only if the equality
holds; (b) L(λ) = L_1(λ)L_3(λ) for some matrix polynomial L_3(λ) if and only
if the equality
holds.
and for |λ| large enough (e.g., for |λ| > ||T_1||) we have
which is zero. So
which means that all coefficients of negative powers of λ in (5.4.2) are zeros,
that is, L(λ)L_1(λ)^{−1} is a polynomial.
Statement (b) of Proposition 5.4.1 follows from the (already proved)
statement (a) when applied to the matrix polynomials L'(λ) = Σ_{j=0}^{l} A_j*λ^j
and L_1'(λ), defined analogously from L_1(λ), in place of L(λ) and L_1(λ),
respectively. [Note that (Y_1*, T_1*, X_1*) is a standard triple for L_1'(λ), and
that L(λ) = L_1(λ)L_3(λ) if and only if L'(λ) = L_3'(λ)L_1'(λ), where L_3'(λ) is a
matrix polynomial together with L_3(λ).]
where
Proof of Theorem 5.3.2 163
(so V_i: C^n → ℒ, i = 1, . . . , q). It turns out that (X|ℒ, T|ℒ, V_q) is a standard
triple of L_3(λ). Indeed, we note that the following equalities hold:
where l is the degree of L(λ) and A_j is the coefficient of λ^j in L(λ) [see
formula (5.1.6)]. Proposition 5.4.1 ensures that there exists a matrix polyno-
mial L_4(λ) such that L(λ) = L_4(λ)L_3(λ). The matrix polynomial L_4(λ) is
necessarily monic and of degree l − q. Let us find its standard triple. First
note that the transformation Q = P_ℳ + P_𝒩 is a projector on ℳ + Im P_𝒩
along ℒ. Indeed, for every x ∈ ℒ we have Qx = P_ℳx + P_𝒩x = 0 + 0 = 0, and
for every y ∈ ℳ (resp. y ∈ Im P_𝒩) we have Qy = P_ℳy + P_𝒩y = y + 0 = y
(resp. Qy = P_𝒩y = y). Then by Lemma 5.3.1, the transformation
is invertible.
Now we check that
where
By the multiplication theorem (Theorem 5.2.1) it will suffice to check that the
triple (X, T, Y) is similar to the triple
where X_1 = X|ℒ, T_1 = T|ℒ. Then P' is a projector and Im P' = ℒ. Indeed,
we obviously have P'y = y for every y ∈ ℒ. Further, formula (5.1.9) implies
that
Using (5.4.4) and the fact that P'|ℒ = I, it follows that ℒ and
Im[Y, TY, . . . , T^{l−q−1}Y] are direct complements to each other in C^nl.
Thus P' is indeed a projector.
Define S: C^nl → Im P' + Im Q by
where P' and Q are considered as transformations from C^nl into Im P' and
and QT = QTQ. The latter follows immediately from the fact that Ker Q is an
invariant subspace for T. To prove (5.4.6), take y ∈ C^nl. The case when
y ∈ Ker Q = Im P' is trivial. Therefore, assume that y ∈ Ker P'. We then
have to demonstrate that P'Ty = V_qZ_2Qy. Since y ∈ Ker P', there exist
x_0, . . . , x_{l−q−1} ∈ C^n such that y = Σ_{i=0}^{l−q−1} T^iYx_i. Hence
and so V_qZ_2Qy is also equal to V_qx_0. This completes the proof of equation
(5.4.6). Finally, the last equality in (5.4.5) is obvious because P'Y = 0.
We have now proved equality (5.4.3), from which it follows that
(Q, QTQ, QY) is a standard triple for L_4(λ).
Now define the monic matrix polynomial
where and
Then (X_1, P_𝒩T|_{Im P_𝒩}, P_𝒩Y) is a standard triple for L_1(λ). Indeed, this
follows from the equalities
where C_2 is the second companion matrix of L_1(λ). The first and third
equalities of (5.4.7) follow from the definitions; the second equality follows
from the structure of C_2, using the fact that A col[U_i] = I.
Now Proposition 5.4.1(b) implies that L_4(λ) = L_1(λ)L_2(λ) for some
(necessarily monic) matrix polynomial L_2(λ). So in order to prove the direct
statement of Theorem 5.3.2 we have only to verify that (Z|ℳ, P_ℳT|ℳ, F) is
indeed a standard triple for L_2(λ). To this end, put
where
is a standard triple for L_3(λ). Here q is the degree of L_3(λ). Let C be the
first companion matrix of L(λ). Proposition 5.4.1 implies that
In particular
5.5 EXAMPLE
Then
where L_i(λ), i = 1, 2, 3 are monic matrix polynomials of the first degree. As
Example 169
with respect to the standard triple (X, J, Y). So we are looking for a
J-semiinvariant subspace ℳ, with ℒ and ℒ + ℳ J invariant, such that the
transformations
and
and
and
One could consider all other factorizations (5.5.2) of L(λ) in a similar way.
First, we find all pairs of J-invariant subspaces (ℒ, ℒ + ℳ) with the
properties (5.5.4) and (5.5.5). Using the Jordan form (5.5.1), it is not
difficult to see that all such pairs are given by the following formulas:
(in the basis e_1 + αe_3, e_4 in ℒ and the standard basis in C^2), and this matrix
is invertible for all
which is invertible if and only if β ≠ 0. (In this calculation we have used the
formula
Factorization into Several Factors and Chains of Invariant Subspaces 171
Theorem 5.6.1
Let (X, T, Y) be a standard triple for L(λ). Then for every chain of
T-invariant subspaces
are invertible (for some positive integers m_k < m_{k−1} < ⋯ < m_2 < l) there
exists a factorization (5.6.1) of L(λ), with the factors L_j(λ) uniquely
determined by the chain (5.6.2), as follows. For j = 1, 2, . . . , k − 1, let ℳ_j be
a direct complement to ℒ_{j+1} in ℒ_j (by definition, ℒ_1 = C^nl), and let
P_{ℳ_j}: ℒ_j → ℳ_j be the projector on ℳ_j along ℒ_{j+1}. Then for j = 1, 2, . . . , k −
so the V_kq are transformations from C^n into ℒ_k for q = 1, . . . , m_k. Conversely,
for every factorization (5.6.1) of L(λ) there is a unique chain of T-invariant
subspaces (5.6.2) such that for j = 2, 3, . . . , k the transformations
where
(so the V_jq are transformations from C^n into ℒ_j for q = 1, . . . , m_j). In particular
(with j = k), formula (5.6.5) follows. Further, using the formulas for the
standard triple of the factor L_2(λ) in Corollary 5.3.3, one easily obtains the
desired formulas [equation (5.6.3)]. The converse statement also follows by
repeated application of the converse statement of Corollary 5.3.3.
A "dual" version of Theorem 5.6.1 can be obtained if one uses the left
canonical form [equation (5.1.14) instead of the right canonical form,
equation (5.1.13)] to produce formulas for L_j(λ)L_{j+1}(λ) ⋯ L_k(λ). Then
one uses (5.1.13) [instead of (5.1.14)] to derive the formulas for
L_1(λ), . . . , L_{k−1}(λ). We omit an explicit formulation of these results.
We are particularly interested in factorizations (5.6.1) with linear factors
L_j(λ): L_j(λ) = λI + A_j for some n × n matrices A_j (j = 1, . . . , k). Note
that, in contrast to the scalar case, not every monic matrix polynomial admits
such a factorization:
We claim that L(λ) cannot be factorized into the product of (two) linear
factors. Indeed, assume the contrary:
for some complex numbers a_i, b_i, c_i, d_i, i = 1, 2. Multiplying the factors on
the right-hand side and comparing entries, we obtain
Letting
property (indeed, such an A must have only the zero eigenvalue, but then
inevitably A^2 = 0).
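In the standard treatment the polynomial in question is L(λ) = Iλ² + E_12, that is, [[λ², 1], [0, λ²]] (our assumption here, since the display is not reproduced). Its companion matrix is nilpotent of index 4, a single Jordan block, which is easy to confirm:

```python
import numpy as np

# Assumed example: L(lam) = I lam^2 + E12, i.e. [[lam^2, 1], [0, lam^2]].
n = 2
A0 = np.array([[0.0, 1.0], [0.0, 0.0]])        # E12
C1 = np.block([[np.zeros((n, n)), np.eye(n)],
               [-A0, np.zeros((n, n))]])       # A1 = 0

# det L(lam) = lam^4, so C1 is nilpotent; its nilpotency index is 4,
# i.e. the Jordan form of C1 is a single 4 x 4 block at the eigenvalue 0.
assert np.allclose(np.linalg.matrix_power(C1, 4), 0)
assert not np.allclose(np.linalg.matrix_power(C1, 3), 0)
```

A factorization into two monic linear factors would force both coefficient matrices to be nilpotent 2 × 2 matrices A with A² = 0, and −A² = E_12 is then impossible, which is the contradiction described above.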
Theorem 5.6.2
Let L(λ) be an n × n monic matrix polynomial of degree l for which the
companion matrix is diagonable. Then there exist n × n matrices A_1, . . . , A_l
such that
Lemma 5.7.1
A function x(t) is a solution of (5.7.1) if and only if it has the form
so
Let 𝒳_1(T) be a fixed direct complement to 𝒳_0(T) in 𝒳_{10}(T), and note that
𝒳_1(T) is never T invariant [unless 𝒳_1(T) = {0}]. Otherwise 𝒳_1(T) would
contain an eigenvector of T that, by definition, should belong to 𝒳_0(T).
We now have the direct sum C^nl = ℛ_−(T) + 𝒳_0(T) + 𝒳_1(T) + ℛ_+(T).
For a given vector c ∈ C^nl, let
but
for every ε > 0. Obviously, such a positive number μ is unique and is called
the exponent of the exponentially increasing solution x(t). A solution x(t) of
(5.7.1) is exponentially decreasing if (5.7.5) and (5.7.6) hold for some
negative number μ [which is unique and is again called the exponent of x(t)].
We say that a solution x(t) is polynomially increasing if
for some positive integer m. Finally, we say that a solution x(t) is oscillatory
if
Theorem 5.7.2
Let x(t) = Xe^{tT}c be a solution of (5.7.1). Then (a) x(t) is exponentially
increasing if and only if c_+ ≠ 0; (b) x(t) is polynomially increasing if and only
if c_+ = 0, c_1 ≠ 0; (c) x(t) is oscillatory if and only if c_+ = c_1 = 0, c_0 ≠ 0; (d)
x(t) is exponentially decreasing if and only if c_+ = c_1 = c_0 = 0, c_− ≠ 0. In
cases (a) and (d), the exponent of x(t) is equal to the maximum of the real
parts of the eigenvalues λ_0 of T with the property that P_{λ_0}c ≠ 0, where P_{λ_0} is
the projector on ℛ_{λ_0}(T) along
Proof. We have
whereas every entry in Xe^{tT_0}c_0 is of the type (5.7.9) with all polynomials
p_i(t) constant. Finally, every entry in Xe^{tT_−}c_− is of the type
has zero kernel, and equation (5.7.12) implies c_± = 0. Also, the equality
Theorem 5.7.3
For every set of k vectors x_0, . . . , x_{k−1} in C^n there exists a unique exponen-
tially decreasing solution x(t) of (5.7.1) such that
It follows that for every set x_0, . . . , x_{k−1} ∈ C^n there exists a unique expo-
nentially decreasing solution x(t) of (5.7.1) with x^{(i)}(a) = x_i, i = 0, . . . ,
k − 1, if and only if the transformation
5.8 DIFFERENCE EQUATIONS
Let (X, T, Y) be a standard triple for L(λ). The general solution of (5.8.1) is
then
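For the companion triple the general solution x_j = XT^jc can be checked directly against the underlying recurrence x_{j+l} + A_{l−1}x_{j+l−1} + ⋯ + A_0x_j = 0 (our reading of (5.8.1), with arbitrary illustrative coefficients):

```python
import numpy as np

A0 = np.array([[2.0, 1.0], [0.0, 3.0]])
A1 = np.array([[1.0, 0.0], [1.0, 1.0]])
n = 2

C1 = np.block([[np.zeros((n, n)), np.eye(n)],
               [-A0, -A1]])
P1 = np.hstack([np.eye(n), np.zeros((n, n))])

c = np.array([1.0, -2.0, 0.5, 3.0])            # arbitrary vector c
x = [P1 @ np.linalg.matrix_power(C1, j) @ c for j in range(12)]

# x_{j+2} + A1 x_{j+1} + A0 x_j = 0 for every j.
for j in range(10):
    assert np.allclose(x[j + 2] + A1 @ x[j + 1] + A0 @ x[j], 0)
```

The first block of C1^j c acts as x_j because C1 shifts the block components, so the last block row of C1 enforces the recurrence.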
which is zero in view of Proposition 5.1.4. If the first l vectors in (5.8.2) are
zeros, that is,
but
for every positive number ε. The number q is called the multiplier of the
geometrically growing (or decaying) solution {x_j}_{j=0}^∞. The solution {x_j}_{j=0}^∞ is
said to be of arithmetic growth if for some positive integer k the inequalities
Theorem 5.8.1
Let x_j = XT^jc, j = 0, 1, . . . , be a solution of (5.8.1). Then the solution is (a) of
geometric growth if and only if c_+ ≠ 0; (b) of arithmetic growth if and only if
c_+ = 0, c_1 ≠ 0; (c) oscillatory if and only if c_+ = 0, c_1 = 0, c_0 ≠ 0; (d) of
geometric decay if and only if c_+ = c_1 = c_0 = 0, c_− ≠ 0. In cases (a) and (d)
the multiplier of {x_j}_{j=0}^∞ is equal to the maximum of the absolute values of the
eigenvalues λ_0 of T with the property that P_{λ_0}c ≠ 0, where P_{λ_0} is the projector
on ℛ_{λ_0}(T) along
Theorem 5.8.2
For every set of k vectors y_0, . . . , y_{k−1} in C^n there exists a unique geometri-
cally decaying solution {x_j}_{j=0}^∞ with x_0 = y_0, . . . , x_{k−1} = y_{k−1} if and only if
L(λ) admits a factorization L(λ) = L_2(λ)L_1(λ), where L_2(λ) and L_1(λ) are
monic matrix polynomials of degrees l − k and k, respectively, such that
Exercises 183
5.9 EXERCISES
(1)
and verify that (X, T, Y) is similar to the triple (P_1, C_1, R_1)
from Proposition 5.1.2 with the similarity matrix col[XT^i]_{i=0}^{l−1}.
(b) Show that given a right standard pair (X, T) of L(λ), there
exists a unique Y such that (X, T, Y) is a standard triple for
L(λ), and in fact Y is given by formula (1). [Hint: Use formula
(5.1.11) for the similarity between the standard triple (X, T, Y)
and the standard triple (P_1, C_1, R_1) from Proposition 5.1.2.]
5.2 A pair of matrices (T, Y) of sizes nl × nl and nl × n, respectively, is
called a left standard pair for the monic n × n matrix polynomial L(λ)
if for some n × nl matrix X the triple (X, T, Y) is a standard triple of
L(λ).
(a) Prove that a pair of matrices (T, Y) of sizes nl × nl and nl × n,
respectively, is a left standard pair for L(λ) = Iλ^l + Σ_{j=0}^{l−1} A_jλ^j if
and only if [Y, TY, . . . , T^{l−1}Y] is invertible and
(b) Show that given a left standard pair (T, Y) of L(λ), there exists
a unique X such that (X, T, Y) is a standard triple of L(λ), and
in fact
(c) Prove that (T, Y) is a left standard pair for L(λ) = Iλ^l +
Σ_{j=0}^{l−1} A_jλ^j if and only if (Y*, T*) is a right standard pair for the
monic matrix polynomial
is a left standard pair for L(λ), and find X such that (X, T, Y) is
a standard triple for L(λ).
5.4 Let L(λ) be a scalar polynomial. Show that ([1 0 ⋯ 0],
J_l(λ_0)) is a right standard pair for L(λ) and that
is a left standard pair for L(λ). Find X and Y such that ([1 0 ⋯ 0],
J_l(λ_0), Y) and (X, J_l(λ_0), col[δ_{i1}]_{i=1}^{l}) are standard triples for L(λ).
5.5 Let L(λ) be a scalar polynomial, where λ_1, . . . , λ_k
are distinct complex numbers. Show that
and
is an l_i × 1 matrix.
5.6 Let
where a_0, . . . , a_{l−1} and b_0, . . . , b_{l−1} are complex numbers. When
are all solutions exponentially decreasing? When does there exist a
nonzero oscillatory solution?
5.14 Find the solutions of the system of difference equations
5.24 Prove that for a scalar monic polynomial L(λ), every C_L-invariant
subspace is supporting.
5.25 Describe all supporting subspaces for a monic matrix polynomial
whose coefficients are circulant matrices, that is, matrices of type
Chapter Six
Invariant Subspaces for Transformations Between Different Spaces
Consider a transformation from C^{m+n} into C^n. Our objective in this section
is to develop and investigate a generalization of the notion of an invariant
subspace that will apply to such transformations and that reduces to the
familiar concept when m = 0. Let P be the projector on C^{m+n} that maps
each vector onto the corresponding vector with zeros in the last m positions.
We treat vectors of C^{m+n} in terms of their components in Im P and
190 Invariant Subspaces for Transformations Between Different Spaces
Theorem 6.1.1
Let ℳ be a subspace of C^n and [A B] be a transformation from C^{m+n} into
C^n. Then the following are equivalent: (a) ℳ is [A B] invariant; (b) there
exists a subspace 𝒴 of C^{m+n} with ℳ = P𝒴 and an extension of [A B] under
which 𝒴 is invariant; (c) the subspace ℳ satisfies
as required.
[A B]-Invariant Subspaces 191
and note that our construction ensures that (A + BH)m ∈ ℳ for any m ∈ ℳ.
Consider the extension of [A B]. It is easily verified that 𝒴 is
invariant under this extension.
This follows immediately from the definitions.
Corollary 6.1.2
With the notation of Theorem 6.1.1, if ℳ is [A B] invariant, then for any
transformation F: C^n → C^m, ℳ is [A + BF B] invariant.
Proof. We use the equivalence of statements (a) and (d) of the theorem.
The fact that ℳ is [A B] invariant implies the existence of an F_0
such that ℳ is (A + BF_0) invariant. Thus, for any F,
Consequently,
(A + BF)ℳ ⊆ ℳ + Im B
EXAMPLE 6.1.1. Let A: ℂ^3 → ℂ^3 be defined by linearity and the equalities
Ae_1 = e_2, Ae_2 = e_1, Ae_3 = e_1. Let V = Span{e_2 + e_3}. The subspaces
Span{e_1, e_2} and Span{e_1, e_3} are both A invariant (mod V). (The sub-
space Span{e_1, e_2} is actually A invariant.) However, their intersection
Span{e_1} is not A invariant (mod V). Indeed, Ae_1 = e_2 ∉ Span{e_1} +
Span{e_2 + e_3}.
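The claims of this example can be verified mechanically. A Python sketch, assuming the map Ae_1 = e_2, Ae_2 = e_1, Ae_3 = e_1 (the reading of the garbled equalities that is consistent with the conclusions stated above):

```python
import numpy as np

def is_inv_mod_V(A, M, V, tol=1e-10):
    """Check A M ⊆ M + V; M and V are given by spanning columns."""
    S = np.hstack([M, V])
    return np.linalg.matrix_rank(np.hstack([S, A @ M]), tol) == \
           np.linalg.matrix_rank(S, tol)

e = np.eye(3)
A = np.column_stack([e[:, 1], e[:, 0], e[:, 0]])   # Ae1=e2, Ae2=e1, Ae3=e1
V = (e[:, 1] + e[:, 2]).reshape(3, 1)              # Span{e2 + e3}

print(is_inv_mod_V(A, e[:, [0, 1]], V))   # Span{e1,e2}: True
print(is_inv_mod_V(A, e[:, [0, 2]], V))   # Span{e1,e3}: True
print(is_inv_mod_V(A, e[:, [0]], V))      # intersection Span{e1}: False
```

The last line is the point of the example: (mod V)-invariance, unlike ordinary invariance, is not preserved under intersection.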
Proposition 6.1.3
For every subspace M ⊆ ℂ^n there is a unique subspace of ℂ^n that is A
invariant (mod V) and maximal in M.
Proof. Let U be the sum of all subspaces that are A invariant (mod V)
and are contained in M. Because of the finite dimension of M, U is in fact
the sum of a finite number of such subspaces. Consequently, U is itself A
invariant (mod V) and thus maximal in M. The uniqueness is clear from the
definition. □
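The maximal subspace of Proposition 6.1.3 can also be computed by a decreasing iteration familiar from geometric control theory; the algorithm below is an assumption on our part (the proof above is nonconstructive), sketched in Python:

```python
import numpy as np

def orth(M, tol=1e-10):
    """Orthonormal basis for the column span of M."""
    if M.shape[1] == 0:
        return M
    u, s, _ = np.linalg.svd(M, full_matrices=False)
    return u[:, s > tol]

def nullspace(C, tol=1e-10):
    u, s, vh = np.linalg.svd(C, full_matrices=True)
    rank = int(np.sum(s > tol))
    return vh.conj().T[:, rank:]

def max_invariant_mod_V(A, M, V, tol=1e-10):
    """Largest subspace U ⊆ span(M) with A U ⊆ U + span(V): iterate
    U <- {u in U : A u in U + span(V)}; this decreasing sequence
    stabilizes in at most dim M steps."""
    n = A.shape[0]
    U = orth(M, tol)
    while U.shape[1] > 0:
        W = orth(np.hstack([U, V]), tol)
        P = np.eye(n) - W @ W.conj().T        # projector onto (U + V)⊥
        N = nullspace(P @ A @ U, tol)         # coefficients c: A U c ∈ U + V
        if N.shape[1] == U.shape[1]:          # nothing was cut off: done
            return U
        U = orth(U @ N, tol)
    return U

e = np.eye(3)
A = np.column_stack([e[:, 1], e[:, 0], e[:, 0]])
V = (e[:, 1] + e[:, 2]).reshape(3, 1)
print(max_invariant_mod_V(A, e[:, [0, 1]], V).shape[1])  # 2: invariant mod V
print(max_invariant_mod_V(A, e[:, [0]], V).shape[1])     # 0: only {0} works
```

With the data of Example 6.1.1, the maximal (mod V)-invariant subspace inside Span{e1} is {0}, in accordance with the example.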
and the additional assumption that ℂ^n is S invariant. Thus S defines an
invertible transformation on ℂ^m — the space on which the induced map acts. This
means that, with respect to this decomposition, S has the
representation
such that
such that
Now let us describe block-similar pairs [A_1 B_1] and [A_2 B_2] in two other
ways.
Theorem 6.2.1
Let [A_1 B_1] and [A_2 B_2] be transformations from ℂ^n + ℂ^m into ℂ^n. Then
the following statements are equivalent: (a) [A_1 B_1] and [A_2 B_2] are block
similar; (b) there exist invertible transformations N and M on ℂ^n and ℂ^m,
respectively, and a transformation F: ℂ^n → ℂ^m such that
(c) for any extension T_1 of [A_1 B_1] there is an extension T_2 of [A_2 B_2] and a
triangular invertible transformation S of the form (6.2.3) for which T_1 =
S T_2 S^{-1}.
Corollary 6.2.2
Let [A_1 B_1] and [A_2 B_2] be block-similar transformations with transforming
matrix S given by (6.2.3). Then M is an [A_1 B_1]-invariant subspace if and
only if N^{-1}M is an [A_2 B_2]-invariant subspace.
Corollary 6.2.3
If the transformations [A_1 B_1] and [A_2 B_2] are block similar, they have the
same rank.
Proof. Let [A_1 B_1] and [A_2 B_2] be block similar. Then Theorem 6.2.1
implies that
Block Similarity 195
Proposition 6.2.4
Let A_1 and A_2 be n × n matrices and B_1 and B_2 be n × m matrices. Then
[A_1 B_1] and [A_2 B_2] are block similar if and only if the linear matrix
polynomials [Iλ + A_1  B_1] and [Iλ + A_2  B_2] are strictly equivalent, that is,
there exist invertible matrices S and T such that
where T_{11} is n × n. Then
Hence T_{11} = S^{-1}
and
It follows that T_{12} = 0 and then that S B_1 T_{22} = B_2. Combining this relation
with (6.2.5), it follows from Theorem 6.2.1 that [A_1 B_1] and [A_2 B_2] are
block similar.
Conversely, suppose that the relations (6.2.5) hold for appropriate N, M,
and F. Then (6.2.7) holds with S = N^{-1}, T_{11} = N, T_{12} = 0, T_{21} = FN^{-1}, and
T_{22} = M.
Now we are ready to state and prove a result giving a canonical form for
block-similar transformations, known as the Brunovsky canonical form.
In the statement of the theorem, J_k(λ) will, as usual, denote the k × k
Jordan block with eigenvalue λ.
Theorem 6.2.5
Given a transformation [A B]: ℂ^n + ℂ^m → ℂ^n, there is a block-similar
transformation [A_0 B_0] that (in some bases for ℂ^n and ℂ^m) has the
representation
for some integers k_1 ≥ · · · ≥ k_p > 0, and all entries in B_0 are zero except for
those in positions (k_1, 1), (k_1 + k_2, 2), . . . , (k_1 + · · · + k_p, p), and these
exceptional entries are equal to one. Moreover, the matrices A_0 and B_0
defined in this way are uniquely determined by [A B], apart from a permutation
of the blocks J_{l_1}(λ_1), . . . , J_{l_q}(λ_q) in (6.2.9).
Thus the pair of matrices A_0, B_0, or the block matrix [A_0 B_0], may be
seen as making up the Brunovsky canonical form for the transformation
[A B]. It will be convenient to call the matrix J_{k_1}(0) ⊕ · · · ⊕ J_{k_p}(0) the
Kronecker part of A_0 and the integers k_1, . . . , k_p the Kronecker indices of
[A B]. Similarly, we call J_{l_1}(λ_1) ⊕ · · · ⊕ J_{l_q}(λ_q) the Jordan part of A_0 and
l_1, . . . , l_q the Jordan indices of [A B].
Proof. We use the terminology and results of the appendix to this book.
We may consider A and B to be n x n and n x m matrices, respectively.
Consider the linear matrix polynomial
has no nontrivial polynomial solution x(λ), the minimal row indices of C(λ)
Analysis of the Brunovsky Canonical Form 197
are absent. Further, the polynomial λC(λ^{-1}) = [I + λA, λB] obviously has
no elementary divisors at zero, so C(λ) has no elementary divisors at
infinity. Let k_1, . . . , k_p be the minimal column indices of C(λ) and let
(λ + λ_1)^{l_1}, . . . , (λ + λ_q)^{l_q} be the elementary divisors of C(λ). Then
Theorem A.7.3 ensures that C(A) is strictly equivalent to the linear matrix
polynomial
and s = max_{λ∈ℂ} (rank C(λ)) − n [and we have used the elementary fact that
−J_l(λ_0) and J_l(−λ_0) are similar]. After a permutation of columns the
polynomial (6.2.10) becomes [Iλ + A_0  B_0] with A_0 and B_0 as defined in the
statement of the theorem. The theorem itself now follows in view of
Proposition 6.2.4. □
Lemma 6.3.1
Consider any transformations as above. Then for s =
0, 1, 2, . . . we have
Hence
Assuming that the relation (6.3.1) holds when s = r - 1, this implies that the
right-hand side of (6.3.1) is contained in the left-hand side. But the opposite
inclusion follows from that already proved on replacing A by A — BF.
Theorem 6.3.2
For a transformation [A B]: ℂ^{m+n} → ℂ^n the following statements are
equivalent: (a) the pair (A, B) is a full-range pair; (b) there is a full-range pair
(A_1, B_1) for which [A_1 B_1] and [A B] are block similar; (c) in the
Brunovsky form [A_0 B_0] for [A B], the matrix A_0 has no Jordan part; (d)
the rank of the transformation [Iλ + A  B] does not depend on the complex
parameter λ.
…convinces us that the rank of [Iλ + A_0  B_0] takes the same numerical value, except
at the points λ = −λ_j, j = 1, . . . , q, where there is a reduction in rank. Thus
the rank of [Iλ + A  B] is independent of λ if and only if there is no Jordan
part in A_0, and the equivalence of (c) and (d) is proved.
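Criterion (d) gives a finite numerical test for the full-range property: the rank of [λI + A, B] can only drop when −λ is an eigenvalue of A, so it suffices to check those finitely many points. A Python sketch with hypothetical data:

```python
import numpy as np

def is_full_range(A, B, tol=1e-10):
    """Statement (d) as a finite test: check rank [λI + A, B] = n at the
    only candidate drop points λ = -λ_j, λ_j an eigenvalue of A."""
    n = A.shape[0]
    for mu in np.linalg.eigvals(A):
        if np.linalg.matrix_rank(np.hstack([(-mu) * np.eye(n) + A, B]), tol) < n:
            return False
    return True

A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])
print(is_full_range(A, np.eye(3)[:, [2]]))   # B = e3: full range
print(is_full_range(A, np.eye(3)[:, [0]]))   # B = e1: not full range
```

This is the Hautus-type rank test of linear systems theory; in that language "full range" is controllability of the pair (A, B).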
So far, the discussion of this section has focused on cases in which the
matrix A_0 of a canonical pair (A_0, B_0) has no Jordan part. This can be
described as the case q = 0 in equation (6.2.9). It is also possible that A_0 has
no Kronecker part: the case p = 0 in equation (6.2.9). In this case B_0 = 0 as
well. We return to this case in Section 6.6.
We conclude this section by showing that the Kronecker indices of the
Brunovsky form can be determined directly from geometric properties of
the transformation [A B] without resort to the computation of the minimal
column indices of [Iλ + A  B].
Proposition 6.3.3
Let [A B] be a transformation from ℂ^{m+n} into ℂ^n and define the sequence
d_{-1}, d_0, d_1, . . . by d_{-1} = 0 and, for s = 0, 1, . . .
Note that the sequence d_{-1}, d_0, . . . is ultimately constant and (if B ≠ 0)
is initially strictly increasing (see Section 2.8).
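Assuming (the displayed formula is lost in this copy) that d_s = dim(Im B + A Im B + · · · + A^s Im B), the differences d_s − d_{s−1} count the Kronecker indices that are at least s + 1, so the indices can be recovered numerically. A Python sketch under that assumption:

```python
import numpy as np

def kronecker_indices(A, B, tol=1e-10):
    """Recover the Kronecker indices of (A, B) from the sequence
    d_s = dim(Im B + A Im B + ... + A^s Im B): the number of indices
    k_i >= s + 1 equals d_s - d_{s-1}."""
    n = A.shape[0]
    S = B.copy()
    d_prev, diffs = 0, []
    for s in range(n):
        d = int(np.linalg.matrix_rank(S, tol))
        diffs.append(d - d_prev)
        d_prev = d
        S = np.hstack([S, np.linalg.matrix_power(A, s + 1) @ B])
    return sorted((sum(1 for s in range(n) if diffs[s] > i)
                   for i in range(diffs[0])), reverse=True)

# Brunovsky-form pair with Kronecker indices 2 and 1
A = np.array([[0., 1., 0.],
              [0., 0., 0.],
              [0., 0., 0.]])
B = np.array([[0., 0.],
              [1., 0.],
              [0., 1.]])
print(kronecker_indices(A, B))   # [2, 1]
```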
where M and N are invertible and [A0 B0] is block similar to [A B]. Now
Lemma 6.3.1 implies
In some special cases Theorem 6.2.5 can be used to describe explicitly all
[A B]-invariant subspaces. We consider a primitive but important
"full-range" case in this section.
Theorem 6.4.1
Let [A B] be a transformation from ℂ^{n+1} into ℂ^n for which (A, B) is a
full-range pair. Then there exists a basis f_1, . . . , f_n in ℂ^n such that every
m-dimensional [A B]-invariant subspace M ≠ {0} admits the description:
(and the right-hand side is interpreted as zero for t = 1). Indeed, equality in
the sth place (s = 1, . . . , n − 1) on both sides of (6.4.3) follows from the
easily verified combinatorial identity:
or
but the left-hand side of this equation is just the (t − 1)th derivative of the
polynomial evaluated at λ_j; so equation (6.4.4),
and hence (6.4.3), is confirmed.
We have verified that the vectors
form a Jordan chain of A + BF corresponding to λ_j. As the restriction
of A + BF is unicellular, there exists a unique (A + BF)-
Corollary 6.4.2
Let [A B] be as in Theorem 6.4.1. Then there exists a basis f_1, . . . , f_n in ℂ^n
such that, for every m-tuple of distinct complex numbers λ_1, . . . , λ_m, the
m-dimensional subspace
is [A B] invariant.
This corollary shows that (at least in the case of a full-range pair
A: ℂ^n → ℂ^n and B: ℂ → ℂ^n) there are many [A B]-invariant subspaces.
Indeed, Corollary 6.4.2 shows the existence of a family of [A B]-invariant
m-dimensional subspaces that depends on m complex parameters (namely,
λ_1, . . . , λ_m).
For the general case of a full-range pair we have the following partial
description of [A B]-invariant subspaces.
Theorem 6.4.3
Let (A, B) be a full-range pair with Kronecker indices k_1 ≥ · · · ≥ k_r. Then
there exists a basis f_{i1}, . . . , f_{ik_i}, i = 1, . . . , r, in ℂ^n such that for every r-tuple
of nonnegative integers l_1, . . . , l_r satisfying l_i ≤ k_i, i = 1, . . . , r, and for every
collection of complex numbers the subspace
is [A B] invariant.
Theorem 6.5.1
Let (A, B) be a full-range pair of transformations. Then
for every n-tuple of complex numbers λ_1, . . . , λ_n there exists a transformation
F: ℂ^n → ℂ^m such that A + BF has eigenvalues λ_1, . . . , λ_n.
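Theorem 6.5.1 is the eigenvalue-assignment (pole-placement) theorem of linear systems theory. For a single-input full-range pair it can be illustrated with Ackermann's formula — a standard device, not the construction used in the proof below:

```python
import numpy as np

# full-range pair: A the 3x3 upper shift, b = e3
A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])
b = np.array([[0.], [0.], [1.]])

target = [-1.0, -2.0, -3.0]
p = np.poly(target)                    # coefficients of (λ+1)(λ+2)(λ+3)

# Ackermann's formula: F = -e_n^T K^{-1} p(A), K the controllability matrix
K = np.hstack([b, A @ b, A @ A @ b])
pA = sum(c * np.linalg.matrix_power(A, 3 - i) for i, c in enumerate(p))
F = -np.linalg.solve(K.T, np.eye(3)[:, [2]]).T @ pA

print(np.sort(np.linalg.eigvals(A + b @ F).real))   # approx [-3, -2, -1]
```

The formula requires the controllability matrix K to be invertible, which is exactly the full-range hypothesis in the single-input case.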
Proof. With the use of Theorem 6.2.1 it is easily seen that we can
assume, without loss of generality, that A and B are in Brunovsky canonical
form. Furthermore, by Theorem 6.3.2, it follows that the Jordan part of A is
absent [see equation (6.2.9)]. So the Kronecker indices k_1, . . . , k_p
satisfy the condition
where
Theorem 6.5.2
Let A: ℂ^n → ℂ^n and B: ℂ^m → ℂ^n be a pair of transformations, and let the
l × l matrix J = J_{l_1}(λ_1) ⊕ · · · ⊕ J_{l_q}(λ_q) be the Jordan part of the Brunovsky
form for [A B]. Then, given an n-tuple of (not necessarily distinct) complex
numbers μ_1, . . . , μ_n, there exists a transformation F: ℂ^n → ℂ^m such that
A + BF has eigenvalues μ_1, . . . , μ_n if and only if at least l numbers among
μ_1, . . . , μ_n coincide with the eigenvalues of J (counting multiplicities).
Theorem 6.5.3
Given a nonempty set Ω ⊆ ℂ and a transformation [A B]: ℂ^{m+n} → ℂ^n, there
exists a transformation F: ℂ^n → ℂ^m such that σ(A + BF) ⊆ Ω if and only if
Recall that R_{λ_0}(A) = Ker(λ_0 I − A)^n is the root subspace of A correspond-
ing to the eigenvalue λ_0 and, by definition, R_{λ_0}(A) = {0} if λ_0 ∉ σ(A).
In the proof we use the following basic fact about induced transformations
in factor spaces. (Recall the definition of the induced transformation given
in Section 1.7.)
Lemma 6.5.4
Let X be a transformation with an invariant subspace L, and let the
induced transformation on the factor space be given. Then for every λ_0 ∈ ℂ we
have
The condition R_{λ_0}(A_0) ⊆ ⟨λ_0 | Im B_0⟩ for every λ_0 ∈ ℂ \ Ω means that [in
the notation of equation (6.2.9)] λ_1, . . . , λ_q ∈ Ω. It remains to apply
Theorem 6.5.2.
Now consider the general case, and let
where [A_0 B_0] is in Brunovsky canonical form. It is easily seen that there
exists a transformation F_1 such that σ(A_0 + B_0 F_1) ⊆ Ω if and only if there
exists an F_2 with σ(A + BF_2) ⊆ Ω (indeed, one can take F_2 = F_0 +
Further, using equation (6.5.1), we have
The definitions and analysis of this chapter have primarily concerned trans-
formations [A B]: ℂ^n + ℂ^m → ℂ^n. Questions arise concerning analogs for
transformations from ℂ^n to ℂ^n + ℂ^m. In this section we quickly review
some notions and results in this direction. Recall first that a subspace M of
ℂ^n will be called invariant for such a transformation if and only if M^⊥ is
[A* C*] invariant. Thus, with the characterization (d) of Theorem 6.1.1 for
[A* C*]-invariant subspaces, there is a transformation G* such that
(b)
(c) there is a transformation
Proof. It remains only to establish the equivalence of (a) and (b). This
is done by using the equivalence of statements (a) and (c) in Theorem 6.1.1.
Thus M is invariant if and only if
or
Theorem 6.6.2
The transformations
from ℂ^n to ℂ^{m+n} are block similar if
and only if there exist invertible transformations N on ℂ^n and M on ℂ^m, and a
transformation G: ℂ^m → ℂ^n such that
Exercises 209
for some integers k_1 ≥ k_2 ≥ · · · ≥ k_p, and all entries in C_0 are zero except for
those in positions (1, 1), (2, k_1 + 1), . . . , (p, k_1 + · · · + k_{p−1} + 1), and those
exceptional entries are equal to one. Moreover, the matrices A_0 and C_0
defined in this way are uniquely determined by A and C, apart from a
permutation of the blocks J_{l_1}(λ_1), . . . , J_{l_q}(λ_q) in equation (6.6.6).
The case of full-range pairs (A, B), which was one of our concerns in
Section 6.3, is now replaced by the dual case in which (C, A) is a null kernel
pair (see the definition in Section 2.7 and Theorem 2.8.2). The dual of
Theorem 6.3.2 is now as follows.
Theorem 6.6.4
For a transformation from ℂ^n to ℂ^{m+n} the following statements are equi-
valent: (a) the pair (C, A) is a null kernel pair; (b) there is a null kernel pair
(A_1, C_1) for which the transformations are block similar; (c) in the Brunovsky
and
and
are almost A invariant (so A has a tridiagonal form with respect to the
basis x_1, . . . , x_n). [Hint: Apply Gram–Schmidt orthogonalization
to a basis x_1, y_2, . . . , y_n in ℂ^n such that the chain
6.6 State and prove the analogs of Exercises 6.5 (a) and (b) for a pair
of transformations
6.7 Let (A, B) be a full-range pair of transformations. Show that for any
F the transformation A + BF has at most dim Im B Jordan
blocks corresponding to each eigenvalue in its Jordan form.
6.8 Let
Rational Matrix
Functions
where p_{ij}(λ) and q_{ij}(λ) are scalar polynomials and the q_{ij}(λ) are not identically
zero. Such functions W(λ) are called rational matrix functions.
We focus on problems for rational matrix functions in which different
types of invariant subspaces and triinvariant decompositions play a decisive
role. All these problems are motivated mostly by linear systems theory, and
their solutions are used in Chapter 8. The problems we have in mind are the
following: (1) the realization problem, which concerns representations of a
rational matrix function in the form D + C(λI − A)^{-1}B with constant
matrices A, B, C, D; (2) the problem of minimal factorization; and (3) the
problem of linear fractional decomposition.
Realizations of Rational Matrix Functions 213
Lemma 7.1.1
Let H(λ) = Σ_{j=0}^{l−1} λ^j H_j and L(λ) = λ^l I + Σ_{j=0}^{l−1} λ^j L_j be r × n and n × n matrix
polynomials, respectively. Put
Then
Theorem 7.1.2
Every r x n rational matrix function that is finite at infinity has a realization.
So the degree of H(λ) is strictly less than the degree of L(λ). We can apply
Lemma 7.1.1 to find A, B, C for which
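For a scalar function H(λ)/L(λ) the construction of Lemma 7.1.1 amounts to the companion-matrix realization; a Python sketch (the particular H and L are hypothetical):

```python
import numpy as np

# W(λ) = H(λ)/L(λ) with deg H < deg L:  H(λ) = 3 + λ,  L(λ) = λ² + 3λ + 2
H = np.array([3., 1.])        # coefficients H_0, H_1
L = np.array([2., 3.])        # L(λ) = λ² + L_1 λ + L_0, so L_0 = 2, L_1 = 3

# companion-type realization: A the companion matrix of L, B = e_l, C from H
A = np.array([[0., 1.],
              [-L[0], -L[1]]])
B = np.array([[0.], [1.]])
C = H.reshape(1, 2)

lam = 1.0
W_direct = (H[0] + H[1] * lam) / (lam**2 + L[1] * lam + L[0])
W_real = (C @ np.linalg.solve(lam * np.eye(2) - A, B))[0, 0]
print(W_direct, W_real)    # both equal 2/3
```

The identity behind the check is (λI − A)^{-1}B = (1, λ)^T / L(λ) for a companion matrix A, so C(λI − A)^{-1}B = H(λ)/L(λ).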
A realization for W(λ) is far from being unique. This can be seen from
our construction of a realization, because there are many choices for L(λ). In
general, if (A, B, C) is a realization of W(λ), then so is (Ã, B̃, C̃), where
for any matrices A_{ij}, B_i, and C_i with suitable sizes (in other words, the
matrices Ã, B̃, C̃ are of size s × s, s × n, r × s, respectively, and partitioned
with respect to the orthogonal sum, where m is the size
of A; for instance, A_{13} is a p × q matrix). Indeed, for every λ ∉ σ(Ã) we
have
and thus
Among all the realizations of W(λ), those with the properties that (C, A)
is a null kernel pair and (A, B) is a full-range pair will be of special interest.
That is, those for which
The next result shows that any realization "contains" a realization with
these properties.
Theorem 7.1.3
Any realization (A, B, C) of W(λ) is a dilation of a realization (A_0, B_0,
C_0) of W(λ) with null kernel pair (C_0, A_0) and full-range pair (A_0, B_0).
Proof. Let the subspaces be defined as above,
and recall that m is the size of A. Let us verify that equality (7.1.5) is a
triinvariant decomposition associated with an A-semiinvariant subspace M,
and that the realization
So
Also
Hence
because by construction
It turns out that a realization (A, B, C) for which conditions (7.1.4) are
satisfied is essentially unique. To state this result precisely and to prove it,
we need some observations concerning one-sided invertibility of matrices.
By Theorems 2.7.3 and 2.8.4 we have
where p is any integer not smaller than the degree of the minimal polynomial
for A. Hence there exists a left inverse [col(CA^j)_{j=0}^{p−1}]^{−L}. Thus
Note that in general the left and right inverses involved are not unique.
Theorem 7.1.4
Let (A_1, B_1, C_1) and (A_2, B_2, C_2) be realizations for a rational matrix
function W(λ) for which (C_1, A_1) and (C_2, A_2) are null kernel pairs and
(A_1, B_1), (A_2, B_2) are full-range pairs. Then the sizes of A_1 and A_2
coincide, and there exists a nonsingular matrix S such that
Here p is any integer greater than or equal to the maximum of the degrees of the
minimal polynomials for A_1 and A_2, and the superscript −L (resp. −R)
indicates a left (resp. right) inverse.
Proof. We have
For |λ| > max{‖A_1‖, ‖A_2‖} the matrices λI − A_1 and λI − A_2 are nonsingular,
and for i = 1, 2
Consequently, we have
for any λ with |λ| > max{‖A_1‖, ‖A_2‖}. Comparing coefficients, we see that
C_1 A_1^j B_1 = C_2 A_2^j B_2, j = 0, 1, . . . . This implies Ω_1 Λ_1 = Ω_2 Λ_2, where, for k =
1, 2, we write
Since we have
Similarly, one checks that . . . Because S is
invertible, the sizes of A_1 and A_2 must coincide.
It remains to check equations (7.1.7). Write
and
Theorem 7.1.5
In a realization (A, B, C) of W(λ), the pair (C, A) is a null kernel pair
and (A, B) is a full-range pair if and only if the size of A is minimal
among all possible realizations of W(λ).
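Theorem 7.1.5 turns minimality into two finite rank tests — on the controllability (full-range) and observability (null kernel) matrices. A Python sketch:

```python
import numpy as np

def is_minimal(A, B, C, tol=1e-10):
    """Minimality test of Theorem 7.1.5: (A, B) full range and
    (C, A) null kernel, via finite rank computations."""
    n = A.shape[0]
    ctrb = np.hstack([np.linalg.matrix_power(A, j) @ B for j in range(n)])
    obsv = np.vstack([C @ np.linalg.matrix_power(A, j) for j in range(n)])
    return np.linalg.matrix_rank(ctrb, tol) == n and \
           np.linalg.matrix_rank(obsv, tol) == n

# minimal realization of 1/(λ+1)
print(is_minimal(np.array([[-1.]]), np.array([[1.]]), np.array([[1.]])))  # True
# a dilation: the extra state at -5 is neither reached nor observed
print(is_minimal(np.diag([-1., -5.]),
                 np.array([[1.], [0.]]),
                 np.array([[1., 0.]])))                                   # False
```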
where E_1(λ) and E_2(λ) are rational matrix functions that are defined and
invertible at λ_0, and ν_1, . . . , ν_n are integers. Indeed, for matrix polynomials
equation (7.2.1) follows from Theorem A.3.4 in the appendix. In the
general case write W(λ) = p(λ)^{-1} W̃(λ), where W̃(λ) and p(λ) are matrix
and scalar polynomials, respectively. Since we have a representation (7.2.1)
for W̃(λ), it immediately follows that a similar representation holds for
W(λ).
The integers ν_1, . . . , ν_n in (7.2.1) are uniquely determined by W(λ) and
Partial Multiplicities and Multiplication 219
Proposition 7.2.1
Let W(λ) be as defined above and ψ_1(λ), . . . , ψ_p(λ) be a canonical set of
null functions of W(λ) (resp. W(λ)^{-1}) at λ_0. Then the number p is the
number of positive (resp. negative) partial multiplicities of W(λ) at λ_0, and
the corresponding orders k_1, . . . , k_p are the positive (resp. absolute values of
the negative) partial multiplicities of W(λ) at λ_0.
Proof. Briefly, reduce W(λ) to the local Smith form as described above and
apply the observation made in the paragraph preceding Proposition
7.2.1. □
where σ(A_p) = {λ_0} and λ_0 ∉ σ(A'_p). Note also that if λ_0 is a pole of W(λ),
then equation (7.2.4) implies that λ_0 is an eigenvalue of A.
Proposition 7.2.2
Let W(λ), A_p, and B_p be defined as above. Let λ_0 be a pole of W(λ), let φ(λ)
be a null function of W(λ)^{-1} at λ_0 of order k, and let φ_j be the coefficients of
Then
Proof. By definition, the vectors (7.2.6) form a Jordan chain for A_p at λ_0
if
and it is easily seen that q is the least positive integer for which
(A_p − λ_0 I)^q = 0. (One checks this by passing to the Jordan form of A_p.)
Now recall that ψ(λ) = W(λ)φ(λ) is analytic near λ_0; so, equating coefficients
of negative powers of (λ − λ_0) to zero and using the fact that
(A_p − λ_0 I)^q = 0, we obtain for j = 1, 2, . . .
Since ∩_{i≥0} Ker CA^i = {0}, it follows that ∩_{i≥0} Ker C_p A_p^i = {0} or, what is
the same, that col[C_p A_p^i]_{i=0}^{d} is left invertible for some integer d. As
the matrix col[C_p(A_p − λ_0 I)^i]_{i=0}^{d} is left invertible as well, and since (A_p −
λ_0 I)^s = 0 for s ≥ q, we obtain the left invertibility of col[C_p(A_p −
λ_0 I)^i]_{i=0}^{q−1}. It now follows that (A_p − λ_0 I)x_0 = 0 as required. Finally, since
φ(λ_0) = x_0, it is also true that x_0 ≠ 0. Thus, as asserted, equations (7.2.6) do
associate a Jordan chain for A_p with the null function ψ(λ).
Conversely, let x_0, x_1, . . . , x_{k−1} be a Jordan chain of A_p at λ_0. From the
definition of a minimal realization it follows that the matrix
Theorem 7.2.3
Let W(λ) be a rational n × n matrix function with det W(λ) ≢ 0, and let its
minimal realization be given by equation (7.2.4). A complex number λ_0 is a
pole of W(λ) if and only if λ_0 is an eigenvalue of A, and then the absolute
values of the negative partial multiplicities of W(λ) at λ_0 coincide with the sizes of
the Jordan blocks with eigenvalue λ_0 in the Jordan form of A, that is, with the
partial multiplicities of λ_0 as an eigenvalue of A.
A complex number λ_0 is a zero of W(λ) if and only if λ_0 is an eigenvalue
of A_1, where A_1 is taken from a minimal realization for W(λ)^{-1}:
with matrix polynomial V(λ). In this case the positive partial multiplicities of
W(λ) at λ_0 coincide with the partial multiplicities of λ_0 as an eigenvalue of A_1.
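For a scalar example the correspondence of Theorem 7.2.3 is easy to see numerically: take W(λ) = 1 + 1/(λ + 1), whose pole is at −1 and whose zero is at −2. A Python sketch:

```python
import numpy as np

# W(λ) = 1 + 1/(λ+1): minimal realization A = [-1], B = C = [1], D = 1
A = np.array([[-1.]]); B = np.array([[1.]]); C = np.array([[1.]])

print(np.linalg.eigvals(A))     # [-1.]  -> the pole of W
Ax = A - B @ C                  # state matrix of the realization of W(λ)^{-1}
print(np.linalg.eigvals(Ax))    # [-2.]  -> the zero of W, since W(-2) = 0
```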
Theorem 7.2.4
Let W_1(λ) and W_2(λ) be n × n rational matrix functions with determinants not
identically zero that take finite values at infinity. Then for every λ_0 ∈ ℂ
and j = 1, 2, . . . we have π_j ≤ δ_j, where {π_j}_{j≥1} = π(W_1 W_2; λ_0) and {δ_j}_{j≥1} is
some sequence determined by π(W_1; λ_0) and π(W_2; λ_0). If, in addition, W_1(λ) and
W_2(λ) admit minimal realizations (7.2.8) for which the realization (7.2.9) of
W_1(λ)W_2(λ) is minimal as well, then actually
The assumption that both W_1(λ) and W_2(λ) take finite values at infinity is
not essential in Theorem 7.2.4. However, we do not pursue this
generalization.
The condition that the realization (7.2.9) is minimal for some minimal
realizations (7.2.8) is important in the theory of rational matrix functions
and in the theory of linear systems. It leads to the notion of minimal
factorization and is studied in detail in the following sections.
equation (7.3.2). Indeed, equation (6.3.1) shows that the pair (A − BC, B)
is a full-range pair [because (A, B) is so]. Further, (C, A) is a null kernel
pair, or, equivalently, (A*, C*) is a full-range pair. By the same argument,
the pair (A* − C*B*, C*) is also a full-range pair. Hence (C, A − BC) is a
null kernel pair, and therefore realization (7.3.2) is minimal. In particular,
is minimal.
Let us focus on minimal factorizations (7.3.3) with three factors (p = 3).
A description of all such factorizations in terms of certain triinvariant
decompositions associated with A-semiinvariant subspaces is given. Here A
is taken from a minimal realization W(λ) = I + C(λI − A)^{-1}B. Write A^× =
A − BC, and let A and A^× be of size m.
Minimal Factorizations of Rational Matrix Functions 227
Theorem 7.3.1
Let (7.3.5) be a supporting triinvariant decomposition for W(λ). Then W(λ)
admits a minimal factorization
where π_L is the projector on L along M + N, and π_M and π_N are defined
similarly.
Conversely, for every minimal factorization W(λ) = W_1(λ)W_2(λ)W_3(λ)
where the factors are rational matrix functions with value I at infinity, there
exists a unique supporting triinvariant decomposition ℂ^m = L + M + N such
that
Note that the second equality in (7.3.6) follows from the relations
π_L A π_L = A π_L and π_N A π_N = π_N A, which express the A invariance of L
and L + M, respectively (see Section 1.5).
where
Note that
As the realizations (7.3.7) and (7.3.9) are minimal (see the first part of the
proof), there exist invertible transformations T_L: L′ → L, T_M: M′ → M, and
T_N: N′ → N such that
where the second equality follows from π_N A^× π_N = A^× π_N and π_L A^× π_L =
π_L A^×, expressing the A^× invariance of N and M + N.
An important particular case of Theorem 7.3.1 appears when N = {0} in
the supporting triinvariant decomposition (7.3.5). This corresponds to the
minimal factorization of W(λ) into the product of two factors, as follows.
Corollary 7.3.2
Let L and M be subspaces in ℂ^m that are direct complements of each other.
Assume that L is A invariant and M is A^× invariant. Then W(λ) admits a
minimal factorization
has a realization
where
Let us find all invariant subspaces for A and A^×. It is easy to see that
(1, 1, 0) is an eigenvector of A corresponding to the eigenvalue 1, whereas
the vectors (0, 0, 1), (0, 1, 0) are eigenvectors of A corresponding to
the eigenvalue 0. Hence all one-dimensional A-invariant subspaces are of
the form Span{. . .}. All
two-dimensional A-invariant subspaces are of the form
if and only if z ≠ 3(y − x). Further, one of the following four cases appears:
Theorem 7.5.1
Let m be the size of A in equation (7.5.1), and let
of W(λ), where each V_j(λ) has the form (7.5.7) for some . . . and R. First let
Lemma 7.5.2
Let A_1, A_2: ℂ^n → ℂ^n be transformations and assume that at least one of them
is diagonable. Then there exists a direct sum decomposition ℂ^n = L_1 + · · · +
L_n with one-dimensional subspaces L_j, j = 1, . . . , n, such that the complete
chains
and
Indeed, we can then use induction on n and assume that Lemma 7.5.2 is
already proved for A_1|_M and P_M A_2|_M in place of A_1 and A_2, respectively,
where P_M is the projector on M along L. (Remember that if at least one of A_1
and A_2 is diagonable, the same is true for A_1|_M and P_M A_2|_M; see Theorems
4.1.4 and 4.1.5.) Combining (7.5.9) with the result of Lemma 7.5.2 for A_1|_M
and P_M A_2|_M, we prove the lemma for A_1 and A_2.
To establish the existence of the decomposition (7.5.9), assume first that
A_1 is diagonable, and let f_1, . . . , f_n be a basis for ℂ^n consisting of eigen-
vectors of A_1. If g is an eigenvector of A_2, then (7.5.9) is satisfied with
L = Span{g} and M = Span{f_{i_1}, . . . , f_{i_{n−1}}}, where the indices i_1, . . . , i_{n−1} are
such that f_{i_1}, . . . , f_{i_{n−1}}, g form a basis in ℂ^n.
If A_2 is diagonable but A_1 is not, then use the part of the lemma already
proved with A_2^* and A_1^* in place of A_1 and A_2, respectively. We obtain an
(n − 1)-dimensional invariant subspace M̃ and a one-dimensional
invariant subspace L̃ that are direct complements of each other. Then put
M = (L̃)^⊥ and L = (M̃)^⊥ to satisfy (7.5.9). □
We can now state and prove the following sufficient condition for minimal
factorization of a rational matrix function W(λ) into the product of δ(W)
nontrivial factors.
Theorem 7.5.3
Let W(λ) be a rational n × n matrix function with a minimal realization
The following form of Theorem 7.5.3 may be more easily applied in many cases.
Theorem 7.5.4
Let W(λ) be a rational n × n matrix function with W(∞) = I. Assume that,
either in W(λ) or in W(λ)^{-1}, all the poles (if any) of each entry are of the
first order. Then W(λ) admits a factorization (7.5.11).
Proof. Assume that all the poles of each entry in W(λ) are of the first
order. The local Smith form (7.2.1) implies that all the negative partial
multiplicities (if any) of W(λ) at each point λ_0 are equal to −1. By Theorem 7.2.3,
all the partial multiplicities of the matrix A from the minimal realization
(7.5.10) are equal to 1. Hence A is diagonable and Theorem 7.5.3 applies. If all
poles of W(λ)^{-1} are of the first order, apply the above reasoning to W(λ)^{-1},
using its realization W(λ)^{-1} = I − C(λI − (A − BC))^{-1}B, which is minimal
if (7.5.10) is minimal. □
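The inversion formula used at the end of the proof can be checked at any point that is not a pole; a Python sketch with hypothetical data (here W(λ) = 1 + (λ + 3)/(λ² + 3λ + 2)):

```python
import numpy as np

# W(λ) = 1 + C(λI - A)^{-1}B  and the realization of its inverse,
# W(λ)^{-1} = 1 - C(λI - (A - BC))^{-1}B, compared at the sample point λ = 1
A = np.array([[0., 1.], [-2., -3.]])
B = np.array([[0.], [1.]])
C = np.array([[3., 1.]])

lam = 1.0
W = 1 + (C @ np.linalg.solve(lam * np.eye(2) - A, B))[0, 0]
Winv = 1 - (C @ np.linalg.solve(lam * np.eye(2) - (A - B @ C), B))[0, 0]
print(round(W * Winv, 12))    # 1.0
```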
In this section and the next we study linear fractional transformations and
decompositions of general (nonsquare) rational matrix functions. We deviate
here from our custom and denote certain matrices by lowercase Latin and
Greek letters.
Let W(λ) be a rational matrix function of size r × m written in the 2 × 2
block matrix form as follows:
where
Define transformations:
Theorem 7.6.1
We have
Proof. Write
So
using these realizations for W(λ) and the realization (7.6.5) for V(λ), by the
following rules: given two rational matrix functions with
finite values at infinity and realizations
where
Let
and
and x = 0 because (γ, α) is a null kernel pair. [This follows from the
minimality of (7.6.11).] So the pair (C, A) is also a null kernel pair.
To prove that (A, B) is a full-range pair, observe that α^k can be written
in the form
where Y_{1k} and Z_{1k} are certain matrices and the stars denote matrices of no
immediate interest. Formula (7.6.15) can be proved by induction on k by
means of formula (7.6.7). From the minimality of (7.6.11) it follows that for
every . . . there exist vectors v_i such that
for some matrices z_{1k} and y_{1k}. [Again, equation (7.6.16) can be proved by
induction on k using (7.6.7).] For every x ∈ ℂ^p, by the minimality of
(7.6.11), there exist vectors u_0, . . . , u_q ∈ ℂ^{m_1} such that
for some vectors w_0, . . . , w_q, and the full-range property of (a, b) is
proved. Hence the realization (7.6.5) is minimal as well.
and formula (7.6.11) gives another version of the realization of the product
of two rational matrix functions:
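One common block form of the product realization (an assumption here, since the displayed formula is lost): for W_i(λ) = I + C_i(λI − A_i)^{-1}B_i the product W_1(λ)W_2(λ) is realized with a block upper triangular state matrix. A Python sketch:

```python
import numpy as np

def cascade(A1, B1, C1, A2, B2, C2):
    """Realization of W1(λ)W2(λ) for W_i(λ) = I + C_i (λI - A_i)^{-1} B_i:
    block upper triangular state matrix, stacked inputs and outputs."""
    A = np.block([[A1, B1 @ C2],
                  [np.zeros((A2.shape[0], A1.shape[1])), A2]])
    B = np.vstack([B1, B2])
    C = np.hstack([C1, C2])
    return A, B, C

# scalar sample check at λ = 1
A1 = np.array([[-1.]]); B1 = np.array([[1.]]); C1 = np.array([[1.]])
A2 = np.array([[-3.]]); B2 = np.array([[2.]]); C2 = np.array([[1.]])
lam = 1.0
W1 = 1 + (C1 @ np.linalg.solve(lam*np.eye(1) - A1, B1))[0, 0]   # 1.5
W2 = 1 + (C2 @ np.linalg.solve(lam*np.eye(1) - A2, B2))[0, 0]   # 1.5
A, B, C = cascade(A1, B1, C1, A2, B2, C2)
Wprod = 1 + (C @ np.linalg.solve(lam*np.eye(2) - A, B))[0, 0]
print(W1 * W2, Wprod)    # 2.25 2.25
```

Exercises 7.10 and 7.11 below concern exactly when such a product realization is minimal; the sizes add, so minimality can fail when the factors share pole-zero structure.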
for some rational matrix functions W(λ) and V(λ) that take finite values at
infinity. In this section we describe linear fractional decompositions of U(λ)
Linear Fractional Decompositions and Invariant Subspaces 245
Indeed, assuming that (7.6.4) and (7.6.5) are minimal realizations of W(λ)
and V(λ), respectively, then by Theorem 7.6.1 U(λ) has a realization (not
necessarily minimal) δ + γ(λI − α)^{-1}β, where the size of α is t × t, with
t = δ(W) + δ(V). Hence (7.7.2) follows.
The linear fractional decomposition (7.7.1) is called minimal if equality
holds in (7.7.2), that is, δ(U) = δ(W) + δ(V). As in the preceding paragraph,
Theorem 7.6.1 implies that (7.7.1) is minimal if and only if for some
(and hence for any) minimal realizations (7.6.4) and (7.6.5) of W(λ) and
V(λ), respectively, the realization (7.6.11) of U(λ) = F_W(V) is again
minimal.
Let
Theorem 7.7.1
Assume that (M_1, M_2) is a reducing pair with respect to the realization
(7.7.3) of U(λ). The following recipe may be used to construct realizations of
rational matrix functions W(λ) and V(λ) such that
and
and any transformation d: ℂ^s → ℂ^q such that the transformations D_{11}, D_{22},
and I − D_{12}d are invertible and
(b) choose any transformations F: ℂ^t → ℂ^s and G: ℂ^q → ℂ^t for which
and
and
and V(λ) take finite values at infinity and the matrices W_{11}(∞) and W_{22}(∞) are
invertible, can be obtained by this recipe.
where α′, β′, γ′, and δ′ are given by formulas (7.6.7), (7.6.8), (7.6.9), and
(7.6.10), respectively, using the realizations (7.7.13) and (7.7.14). As
(7.7.11) is a minimal linear fractional decomposition, the realization
(7.7.15) is minimal. [The size of α′ is (n + p) × (n + p).] Comparing the
minimal realizations (7.7.3) and (7.7.15) we find, in view of Theorem 7.1.4,
that δ = δ′ and there exists an invertible transformation S: ℂ^n + ℂ^p → ℂ^t
such that
Putting
with respect to the direct sum decomposition ℂ^t = M_1 + M_2. The quadruple
(M_1, M_2; F_1, G_1) will be called a supporting quadruple [with respect to the
realization (7.7.3)]. Given a supporting quadruple, for every choice of D
and d satisfying condition (a) of Theorem 7.7.1, the recipe produces a linear
fractional decomposition of U(λ). We now have the following important
addition to Theorem 7.7.1.
Theorem 7.7.2
Assume that the realization (7.7.3) is minimal, and let (7.7.11) be a minimal
linear fractional decomposition of U(λ) such that W(λ) and V(λ) take finite
values at infinity and the matrices W_{11}(∞) and W_{22}(∞) are invertible. Then
there exists a unique supporting quadruple Q = (M_1, M_2; F_1, G_1) that produces,
together with some choice of D and d satisfying condition (a), the
decomposition (7.7.11) according to the recipe of Theorem 7.7.1.
and d, which, together with Q′, give rise to the decomposition (7.7.11), are
the same matrices chosen to produce (7.7.11) together with Q. Further, let
(7.7.8) be the block matrix representations of α, β, and γ with respect to the
direct sum decomposition ℂ^t = M_1 + M_2, and let
where A, B_i, and C_j are given by formulas (7.7.9) and A′, B′_i, and C′_j are
given by (7.7.9) with α_{11}, G_1, F_1, β_1, γ_1 replaced by α′_{11}, G′_1, F′_1, β′_1, γ′_1,
respectively. By Theorem 7.6.1, both realizations (7.7.16) are minimal, so in
view of Theorem 7.1.4 there exists an invertible transformation S_1: M_1 → M′_1
such that
Similarly, we have
where a, b, and c are given by (7.7.10) and a′, b′, and c′ are given by
(7.7.10) with α_{22}, β_2, γ_2 replaced by α′_{22}, β′_2, γ′_2, respectively. Since both
realizations (7.7.18) are minimal, we have
and
and
so
and
Theorem 7.8.1
Let
and
with respect to the direct sum decomposition ℂ² = M₁ ∔ M₂, where (1, x)
and (1, y) are chosen as bases in M₁ and M₂, respectively. Further,
is such that (α + βF)M₁ ⊆ M₁ if and only if the transformation
We conclude that for every six-tuple of complex numbers (x, y, f₁, f₂, g₁, g₂)
such that x ≠ y and (7.8.4) and (7.8.5) hold, there is a minimal linear
fractional decomposition U(λ) = ℱ_W(V), where W(λ) and V(λ) are given by
equalities (7.8.6) and (7.8.7), respectively. □
Theorem 7.8.2
Let U(λ) be a rational matrix function that has no pole at infinity, and let
m = δ(U). Then U(λ) admits a linear fractional decomposition
for any rational matrix function V(λ) of suitable size, where Wⱼ₁(λ) and
Wⱼ₂(λ) are rational matrix functions of appropriate sizes with Wⱼ₂(∞) = 0,
Wⱼ₁(∞) = I.
Observe that the decomposition (7.8.8) is minimal in the sense that
δ(U) = δ(W₁) + ⋯ + δ(W_m). So, in contrast with the factorization of
rational matrix functions (Example 7.5.1), nontrivial minimal linear frac-
tional decompositions always exist.
Hence ℱ_W(V) has the form (7.8.9). Now apply the preceding argument to
U₁(λ), and so on. Eventually we obtain the desired linear fractional
decomposition (7.8.8).
EXERCISES
7.4 Find minimal realizations for the following scalar rational functions:
7.5 Find a minimal realization for the scalar rational function with finite
value at infinity, assuming that its representation as a sum of simple
fractions is known. [Hint: Use Exercise 7.4(c) and Exercise 7.11.]
7.6 Show that if
where R₁(λ) and R₂(λ) are scalar rational functions with finite value at
infinity.
7.8 Describe a minimal realization for the n x n circulant rational matrix
function
Exercises 257
7.10 Give an example of rational matrix functions W₁(λ) and W₂(λ) with
minimal realizations (3) for which the realization (4) is not minimal.
7.11 Assume that the realizations (3) are minimal and A₁ and A₂ do not
have common eigenvalues. Prove that (4) is minimal as well. [Hint:
We have to show that ([C₁ C₂], A₁ ⊕ A₂) is a null kernel pair
be a minimal realization,
(a) Show that
is a realization of W(λ)².
(b) Is the realization of W(λ)² minimal?
(c) Is the realization minimal if, in addition, the zeros and poles of W(λ)
are disjoint?
7.19 For the minimal realization (6), show that
hold.
260 Rational Matrix Functions
7.24 Find the McMillan degree of the circulant rational matrix function
7.25 Find a minimal realization of W(λ), and, with respect to this realization,
describe all the minimal factorizations W(λ) = W₁(λ)W₂(λ) of
W(λ) in terms of subspaces ℒ and ℳ as in Corollary 7.3.2, for the
following scalar rational functions:
such that W(λ) and V(λ) take finite values at infinity and W₁₁(∞),
W₂₂(∞) are invertible. Find all the corresponding reducing pairs of
subspaces with respect to a fixed minimal realization of U(λ).
7.30 Show that all the following decompositions of a rational matrix
function U(\) are particular cases of the linear fractional decompo-
sition:
7.31 For the rational function U(λ) given in Example 7.8.1, find all
minimal linear fractional decompositions U(λ) = ℱ_W(V), with
and
Chapter Eight
Linear Systems
In this chapter we show how the concepts and results of previous chapters
are applied to the theory of time-invariant linear systems. In fact, this is a
short self-contained introduction to linear systems theory. It starts with the
analysis of controllability, observability, minimality, and state feedback and
continues with a selection of important problems with full solution. These
include cascade connections, disturbance decoupling, and output stabilization.
Reductions, Dilations, and Transfer Functions 263
Formula (8.1.3) expresses the output in terms of the input. In other words,
the input-output behaviour of the system is represented explicitly.
Now we introduce some important operations on linear systems of type
(8.1.1). It is convenient to describe (8.1.1) by the quadruple of transformations
(A, B, C, D). A linear system (A′, B′, C′, D′) with transformations
A′: ℂ^{m′} → ℂ^{m′}, B′: ℂ^{n′} → ℂ^{m′}, C′: ℂ^{m′} → ℂ^{r′}, D′: ℂ^{n′} → ℂ^{r′} will be called
similar to (A, B, C, D) if there exists an invertible transformation
S: ℂ^{m′} → ℂ^m such that
(In particular, this implies that m = m', n = n', r = r'.) We also encounter
system (8.1.1) with transformations A: M → M, B: ℂ^n → M, C: M → ℂ^r,
and D: ℂ^n → ℂ^r, where M is a subspace of ℂ^m for some m. The definition of
similarity applies equally well to this case. [In particular, similarity with the
system (A′, B′, C′, D′) described above implies dim M = m′.]
A system (A′, B′, C′, D′) with A′: ℂ^{m′} → ℂ^{m′}, B′: ℂ^n → ℂ^{m′},
C′: ℂ^{m′} → ℂ^r, D′: ℂ^n → ℂ^r will be called a dilation of (A, B, C, D) if there
exists a direct sum decomposition
with the two following properties: (1) the transformations A', B', C' have
the following block forms with respect to this decomposition
The basic property of reductions and dilations is that they have essentially
the same input-output behaviour, as follows.
Proposition 8.1.1
Let (A′, B′, C′, D′) be a dilation of (A, B, C, D). Then, for x₀ = 0, the
input-output behaviours of the systems (A', B', C', D') and (A, B, C, D)
are the same. In other words, if u(t) is any (say, continuous) n-dimensional
vector function, then the output y = y(t; 0, u) of the system (A', B', C', D')
and the output y = y(t; 0, u) of the system (A, B, C, D) coincide.
As D′ = D, and e^{(t−s)A′} (for fixed t and s) admits a power series representation
(see Section 2.6), we have only to show that for q = 0, 1, …
Now (A′, B′, C′, D′) and (A, B, C, D) are similar, so there exists an invertible
transformation S such that A = S⁻¹A′S, C = C′S, and B = S⁻¹B′. Hence
[It is assumed here that for t > 0, z(t) is a continuous function such that
|z(t)| ≤ Ke^{μt} for some positive constants K and μ. This ensures that Z(λ) is
well defined for all complex λ with Re λ > μ.] The system (8.1.1) then takes
the form
Solving the first equation for X(\) and substituting in the second equation,
we obtain the formula for the input-output behaviour in terms of the
Laplace transforms:
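The relation just derived means that the transfer function of the system is W(λ) = D + C(λI − A)⁻¹B. A minimal numerical sketch of evaluating this expression; the matrices here are illustrative, not taken from the text:

```python
import numpy as np

def transfer_function(A, B, C, D, lam):
    """Evaluate W(lam) = D + C (lam*I - A)^{-1} B for a point lam
    outside the spectrum of A."""
    n = A.shape[0]
    return D + C @ np.linalg.solve(lam * np.eye(n) - A, B)

# A small single-input single-output example (matrices chosen for illustration).
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

# For this choice, W(lam) = 1 / (lam^2 + 3*lam + 2), so W(1) = 1/6.
W = transfer_function(A, B, C, D, 1.0)
print(W[0, 0])
```

Using `solve` rather than forming the inverse explicitly is the usual numerically stable choice.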
8.2 MINIMAL LINEAR SYSTEMS: CONTROLLABILITY AND
OBSERVABILITY
and recall that this system is called minimal if the dimension of the state
space is minimal. [We omit the initial condition x(0) = x₀ from (8.2.1); so
(8.2.1) has in general many solutions x(t).]
266 Linear Systems
Theorem 8.2.1
(a) Any linear system (8.2.1) is a dilation of a minimal linear system; (b) the
linear system (8.2.1) is minimal if and only if (A, B) is a full-range pair and
(C, A) is a null kernel pair:
where m is the dimension of the state space. Moreover, in (8.2.2) one can
replace ⋂_{i=0}^{∞} Ker CAⁱ by ⋂_{i=0}^{p−1} Ker CAⁱ and Σ_{i=0}^{∞} Im(AⁱB) by Σ_{i=0}^{p−1} Im(AⁱB),
where p is any integer not smaller than the degree of the minimal polynomial
of A.
Theorem 8.2.2
The system (8.2.1) is observable if and only if (C, A) is a null kernel pair:
for t > 0 is x(t) = 0. If equality (8.2.3) were not true, there would be a
nonzero x₀ ∈ ⋂_{i=0}^{∞} Ker CAⁱ, and the function x(t) = e^{tA}x₀ would be a not
identically zero solution of equation (8.2.4). Indeed, for every t ≥ 0 we have
Minimal Linear Systems: Controllability and Observability 267
Theorem 8.2.3
The system (8.2.1) is controllable if and only if (A, B) is a full-range pair:
Lemma 8.2.4
Let G(t), t ∈ [0, t₀], be an m × n matrix depending continuously on t. Then
Then
Hence
From this equation it is clear that (8.2.1) is controllable if and only if for
every t₂ > 0 the set of m-dimensional vectors
coincides with the whole space ℂ^m. By Lemma 8.2.4, the controllability of
(8.2.1) is equivalent to the condition that Im W_t = ℂ^m for all t > 0, where
It follows that
Combining Theorem 8.2.1 with Theorems 8.2.2 and 8.2.3, we obtain the
following important fact.
Corollary 8.2.5
The linear system (8.2.1) is minimal if and only if it is controllable and
observable.
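Corollary 8.2.5 can be checked numerically: the system is minimal exactly when the controllability matrix [B, AB, …, A^{m−1}B] has full row rank (full-range pair) and the observability matrix has full column rank (null kernel pair). A sketch, with illustrative matrices:

```python
import numpy as np

def controllability_matrix(A, B):
    """[B, AB, ..., A^{m-1}B]; (A, B) is a full-range pair iff rank = m."""
    m = A.shape[0]
    blocks, Ak = [], B
    for _ in range(m):
        blocks.append(Ak)
        Ak = A @ Ak
    return np.hstack(blocks)

def observability_matrix(C, A):
    """[C; CA; ...; CA^{m-1}]; (C, A) is a null kernel pair iff rank = m."""
    m = A.shape[0]
    blocks, Ck = [], C
    for _ in range(m):
        blocks.append(Ck)
        Ck = Ck @ A
    return np.vstack(blocks)

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

m = A.shape[0]
controllable = np.linalg.matrix_rank(controllability_matrix(A, B)) == m
observable = np.linalg.matrix_rank(observability_matrix(C, A)) == m
print(controllable and observable)  # minimal iff both hold
```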
This corollary, together with Theorem 7.1.5, shows that the concepts of
minimality for systems and for realizations of rational functions are consistent,
and
Suppose also that u₁(t) and y₂(t) are from the same space. The two systems
are combined in a "cascade" form when the output y2 of the second system
becomes the input u₁ of the first system. We obtain
and
The system (8.3.3) is called a simple cascade composed of the first component
(8.3.1) and the second component (8.3.2). Note that the dimension of
the state space of the simple cascade is the sum of the state space
dimensions of its components, and the input of the simple cascade coincides
with the input of its second component, whereas the output of the simple
cascade coincides with the output of the first component.
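The block structure of a simple cascade can be written out directly: feeding the output of the second system into the input of the first gives combined state (x₁, x₂), a block upper triangular state matrix, and transfer function W₁(λ)W₂(λ). A sketch under these assumptions, with all matrices illustrative:

```python
import numpy as np

def simple_cascade(A1, B1, C1, D1, A2, B2, C2, D2):
    """Combine two systems by feeding the output of system 2 into the
    input of system 1; the cascade's transfer function is W1 * W2."""
    A = np.block([[A1, B1 @ C2],
                  [np.zeros((A2.shape[0], A1.shape[1])), A2]])
    B = np.vstack([B1 @ D2, B2])
    C = np.hstack([C1, D1 @ C2])
    D = D1 @ D2
    return A, B, C, D

def W(A, B, C, D, lam):
    return D + C @ np.linalg.solve(lam * np.eye(A.shape[0]) - A, B)

A1 = np.array([[-1.0]]); B1 = np.array([[1.0]])
C1 = np.array([[1.0]]);  D1 = np.array([[1.0]])
A2 = np.array([[-2.0]]); B2 = np.array([[1.0]])
C2 = np.array([[2.0]]);  D2 = np.array([[1.0]])

A, B, C, D = simple_cascade(A1, B1, C1, D1, A2, B2, C2, D2)
lhs = W(A, B, C, D, 1.0)
rhs = W(A1, B1, C1, D1, 1.0) @ W(A2, B2, C2, D2, 1.0)
print(np.allclose(lhs, rhs))  # True
```

Note that the state dimension of the cascade is the sum of the state dimensions of the components, as in the text.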
Similarly, one can consider the simple cascade of more than two components.
Let (A₁, B₁, C₁, D₁), …, (A_p, B_p, C_p, D_p) be linear systems of
Cascade Connections of Linear Systems 271
which implies
minimal. Now we can use the results of Sections 7.3 and 7.5 concerning
minimal factorizations of rational matrix functions to study simple cascading
decompositions of minimal linear systems. The following analog of Theorem
7.5.1 is an example.
Theorem 8.3.1
The components of every representation of a minimal system (A, B, C, I) as
a simple cascade (with the transfer functions of the components having value I
at infinity) are given by
where the projectors π₁, …, π_p and associated subspaces ℒ₁, …, ℒ_p are
defined as in Theorem 7.5.1. The transformations π_j A π_j in (8.3.7) are
understood as acting in ℒ_j, and the transformations Cπ_j and π_j B are
understood as acting from ℒ_j into ℂ^n, and from ℂ^n into ℒ_j, respectively,
where n is the number of rows in C (which is equal to the number of columns
in B).
and assume that the input vector u = u(t) and the output vector y = y(t) are
divided into two components:
Now let
be another linear system with the input s(t), output z(t), and the state w(t).
(Here a, b, c, and d are constant matrices of appropriate sizes.) We obtain a
new system by feeding the first component of the output of (8.3.8) into the
input of (8.3.10) and at the same time feeding the output of (8.3.10) into the
second component of the input of (8.3.8). [It is assumed, of course, that the
vectors y₁(t) and s(t) are in the same space, as well as the vectors u₂(t) and
z(t).] This situation is represented diagrammatically by
where
Now identify
and hence
Further
Theorem 8.3.2
Any minimal linear system with m-dimensional state space can be represented
as a minimal cascade of m linear systems each of which has one-dimensional
state space.
In this and the next section we consider two important problems from linear
system theory in which [A B]-invariant subspaces (as discussed in Chapter
6) appear naturally and play a crucial role.
Consider the linear system
The Disturbance Decoupling Problem 275
where A: ℂ^n → ℂ^n, B: ℂ^m → ℂ^n, E: ℂ^p → ℂ^n, and D: ℂ^n → ℂ^r are constant
transformations, and x(t), u(t), q(t), and z(t) are vector functions taking
values in ℂ^n, ℂ^m, ℂ^p, and ℂ^r, respectively.
As in Section 8.1, ℂ^n is interpreted as the state space of the underlying
dynamical system, and u(t) is the input. The vector function z(t) is inter-
preted as the output. The term q(t) represents a disturbance that is supposed
to be unknown and unmeasurable. We assume that q(t) is a continuous
function of t for t > 0.
An important transformation of the system (8.4.1) involves "state feed-
back." This is obtained when the state x(t) is fed through a certain constant
linear transformation F into the input, so the input of the new system is
actually the sum of the original input u(t) and the feedback. Diagrammati-
cally, we have
and thus
that the state vector y ∈ ℂ^n is reachable for the system (8.4.2) if there exist a
t₀ > 0 and a continuous function u(t) such that the solution x(t) of (8.4.2)
satisfies x(t₀) = y. As
for t > 0, it follows easily that the set of all reachable state vectors of (8.4.2)
is a subspace.
Proposition 8.4.1
The set ℛ of reachable states coincides with the minimal A-invariant subspace
that contains Im B:
for some
By equality (8.2.9)
Proposition 8.4.2
The system (8.4.1) is disturbance decoupled if and only if
The new system has the same form as the original system (8.4.1), with A
replaced by A + BF. Our mathematical problem is: given transformations
A: ℂ^n → ℂ^n and B: ℂ^m → ℂ^n, and given subspaces ℰ ⊆ ℂ^n (which plays the
role of Im E) and 𝒟 ⊆ ℂ^n (which plays the role of Ker D), find, if possible,
a transformation F: ℂ^n → ℂ^m such that the subspace
Theorem 8.4.3
In the preceding notation, there exists a transformation F: ℂ^n → ℂ^m such that
Proof. Assume that there is an F: ℂ^n → ℂ^m with the property (8.4.3).
By Theorem 2.8.4 (applied with A + BF playing the role of A and any
transformation whose image is ℰ playing the role of B) the subspace
⟨A + BF | ℰ⟩ is (A + BF) invariant, and thus (Theorem 6.1.1) it is [A B]
invariant. As ⟨A + BF | ℰ⟩ ⊇ ℰ, and the maximal (in 𝒟) [A B]-invariant
subspace 𝒰 contains ⟨A + BF | ℰ⟩, we obtain 𝒰 ⊇ ℰ.
Conversely, assume 𝒰 ⊇ ℰ. By Theorem 6.1.1 there is a transformation
F: ℂ^n → ℂ^m such that (A + BF)𝒰 ⊆ 𝒰. Now
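The maximal [A B]-invariant subspace 𝒰 contained in Ker D, on which this argument turns, can be computed by the standard decreasing subspace iteration V₀ = Ker D, V_{k+1} = V_k ∩ A⁻¹(V_k + Im B) (cf. Chapter 6). A numerical sketch, with all matrices illustrative:

```python
import numpy as np

def orth(M, tol=1e-10):
    """Orthonormal basis of the column space of M."""
    if M.size == 0:
        return np.zeros((M.shape[0], 0))
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, s > tol]

def null(M, tol=1e-10):
    """Orthonormal basis of the kernel of M."""
    _, s, Vh = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return Vh[rank:].conj().T

def intersect(U, W):
    """Basis of the intersection of the column spaces of U and W."""
    if U.shape[1] == 0 or W.shape[1] == 0:
        return np.zeros((U.shape[0], 0))
    N = null(np.hstack([U, -W]))
    return orth(U @ N[:U.shape[1], :])

def preimage(A, Q):
    """Basis of A^{-1}(col Q), computed as Ker((I - Q Q*) A)."""
    P = np.eye(A.shape[0]) - Q @ Q.conj().T
    return null(P @ A)

def max_AB_invariant(A, B, D):
    """Iterate V_{k+1} = V_k ∩ A^{-1}(V_k + Im B), starting from V_0 = Ker D."""
    V = null(D)
    while True:
        Vnew = intersect(V, preimage(A, orth(np.hstack([V, B]))))
        if Vnew.shape[1] == V.shape[1]:
            return V
        V = Vnew

A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
B = np.array([[1.0], [0.0], [0.0]])
D = np.array([[0.0, 0.0, 1.0]])
V = max_AB_invariant(A, B, D)
print(V.shape[1])  # 2: here the whole of Ker D is [A B] invariant
```

The iteration terminates because the dimensions strictly decrease until they stabilize.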
Theorem 8.4.4
Given a system (8.4.1), there exists a state feedback F: ℂ^n → ℂ^m such that the
system
where a₁, a₂, and a₃, as well as b₁, b₂, and b₃, are complex numbers, not all
zero. Using Theorem 6.4.1 and its proof, we find that a one-dimensional
subspace M is [A B] invariant if and only if
or
Consider first the case when Ker D is [A B] invariant. This happens if and
only if b₃ ≠ 0. Then obviously Ker D is the maximal [A B]-invariant
The Output Stabilization Problem 279
8.5 THE OUTPUT STABILIZATION PROBLEM
Consider the system
To study the property (8.5.2), we need the following lemma. This is, in
fact, a special case of Theorem 5.7.2, but it is convenient to have some of
the conclusions recast in the present form.
Lemma 8.5.1
Let A: ℂ^n → ℂ^n be a transformation, and let ℳ₋, ℳ₀, ℳ₊ be the sums of root
subspaces of A corresponding to the eigenvalues with negative, zero, and
positive real parts, respectively. Thus
Note that ℳ₋, ℳ₀, and ℳ₊ are A-invariant subspaces, and therefore these
subspaces are also invariant for the transformations given above.
Lemma 8.5.2
We have
with respect to the direct sum decomposition above, where 𝒩 is a
direct complement to N.
Now clearly, for every
if and only if
has all its eigenvalues in the open left-half plane. So we have to prove that
if and only if all the eigenvalues of A22 are in the open left-half plane.
Let x be an eigenvector corresponding to the eigenvalue λ₀ with
then
Theorem 8.5.3
Given a transformation and a subspace
there exists a transformation F such that for
and hence
for every λ that is not an eigenvalue of A + BF.
Conversely, assume that this holds for every λ. We have
to prove that there exists an F such that for every
Hence
so
Theorem 8.5.4
Given the linear system
for every eigenvalue λ₀ of A lying in the closed right half plane, and where 𝒰
is the maximal [A B]-invariant subspace in Ker D.
Ker D = Span{
where a₁, a₂, a₃ are complex numbers, not all zero. Here ⟨A | Im B⟩ =
Span{e₁, e₂}. If Re λ₀ < 0, then there is always an F = [f₁ f₂ f₃] with
properties as in Theorem 8.5.4 (one can take f₃ = 0 and choose f₁ and f₂ so
that the equation λ² − f₂λ − f₁ = 0 has its zeros in the open left-half plane).
So assume Re λ₀ ≥ 0. Then there exists an F as in Theorem 8.5.4 if and only
if
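The quadratic step used above, choosing f₁ and f₂ so that λ² − f₂λ − f₁ has its zeros in the open left half plane, is ordinary pole placement on the controllable part. A sketch; the 2 × 2 companion matrices are an illustrative stand-in for the restriction of the system to ⟨A | Im B⟩:

```python
import numpy as np

# Companion-form model of the controllable part (illustrative assumption).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])

# Place the closed-loop zeros at mu1, mu2 in the open left half plane:
mu1, mu2 = -1.0, -2.0
f1 = -mu1 * mu2        # lambda^2 - f2*lambda - f1 = (lambda - mu1)(lambda - mu2)
f2 = mu1 + mu2
F = np.array([[f1, f2]])

eigs = np.linalg.eigvals(A + B @ F)
print(eigs.real)  # both eigenvalues lie in the open left half plane
```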
8.6 EXERCISES
8.1 For every input u(t) find the output y(t) for the following linear
systems:
8.2 For every input u(t) find the output y(t) for the following linear
systems:
8.8 Let
is minimal as well.
8.11 Let p(A) be a polynomial in the transformation A: ℂ^n → ℂ^n. Prove
that the minimality of the system
and
where bₙ ≠ 0, find a state feedback F such that the system with
feedback
is stable, that is, all its solutions x(t) tend to zero as t → ∞.
8.15 For system (1) in Exercise 8.14 and any k > 0, find a state feedback F
such that all solutions x(t) of the system with feedback satisfy
‖x(t)‖ ≤ Ke^{−kt}, where K > 0 is a constant independent of t.
8.16 Prove that any minimal linear system with n-dimensional state space
has a state feedback for which the system with feedback can be
represented as a simple cascade of n linear systems with state spaces
of dimension 1.
8.17 Prove that controllability is a stable property in the following sense:
for every controllable system
8.18 Prove that observability and minimality of linear systems are also
stable properties. The definition of stability is, in each case, to be
similar to that of Exercise 8.17.
8.19 Show that for any system
Notes to Part 1 291
Kaashoek, and van Schagen (1980). That paper contains a more general
theory of invariant subspaces, similarity, canonical forms, and invariants of
blocks of matrices in terms of these blocks only. Some applications of these
results may be found in Gohberg, Kaashoek, and van Schagen (1981,1982).
Theorem 6.2.5 was proved (by a direct approach, without using the Kronec-
ker canonical form) in Brunovsky (1970). The connection between the
Kronecker form for linear polynomials and the state feedback problems is
given in Kalman (1971) and Rosenbrock (1970). In Theorem 6.3.2 the
equivalence of (a) and (d) is due to Hautus (1969).
The spectral assignment problem is classical, by now, and can be found in
many books [see, e.g., Kailath (1980) and Wonham (1974)]. There is a more
difficult version of this problem in which the eigenvalues and their partial
multiplicities are preassigned. This problem is not generally solvable. For
further analysis, see Rosenbrock and Hayton (1978) and Djaferis and Mitter
(1983).
Chapter 7. The concept of minimal realization is a well-known and
important tool in linear system theory [see, e.g., Wonham (1979) and
Kalman (1963)]. See also Bart, Gohberg, and Kaashoek (1979), where the
exposition matches the purposes of this chapter. Section 7.1 contains the
standard material on realization theory, and Lemma 7.1.1 is a particular
case of Theorem 2.2 in Bart, Gohberg, and Kaashoek (1979).
Section 7.2 follows the authors' paper (1983a). Sections 7.3-7.5 are based
on Chapters 1 and 4 in Bart, Gohberg, and Kaashoek (1979). Here, we
concentrate more on decompositions into three or more factors.
Linear fractional decompositions of rational matrix functions play an
important role in network theory; see Helton and Ball (1982). Theorem
7.7.1 is proved in that paper. The exposition in Sections 7.6-7.8 follows that
given in Gohberg and Rubinstein (1985).
Chapter 8. In the last 20 years linear system theory has developed into a
major field of research with very important applications. The literature in
this field is rich and includes monographs, textbooks, and specialized
journals. We mention only the following books where the reader can find
further references and historical remarks: Kalman, Falb, and Arbib (1969),
Wonham (1974), Kailath (1980), Rosenbrock (1970), and Brockett (1970).
This chapter can be viewed as an introduction to some basic concepts of
linear systems theory.
The first three sections contain standard material (except for Theorem
8.3.2). In the last two sections we follow the exposition of Wonham (1979).
Part Two
Algebraic Properties of Invariant Subspaces
Chapter Nine
Commuting Matrices and Hyperinvariant Subspaces
296 Commuting Matrices and Hyperinvariant Subspaces
(a) λ_α ≠ λ_β. We show that in this case Z_{αβ} = 0. Indeed, multiply the
left-hand side of equality (9.1.3) by λ_α − λ_β and in each term on
the right-hand side make the corresponding replacement. We
obtain
From the structure of H_α and H_β it follows that the product H_α Z_{αβ}
is obtained from Z_{αβ} by shifting all the rows one place upward and
filling the last row with zeros; similarly, Z_{αβ} H_β is obtained from Z_{αβ}
by shifting all the columns one place to the right and filling the first
column with zeros. So equation (9.1.5) gives (where ζ_{ik} is the
(i, k)th entry in Z_{αβ}, which depends, of course, on α and β):
Theorem 9.1.1
Let J = diag[J₁, …, J_u] be an n × n Jordan matrix with Jordan blocks
J₁, …, J_u and eigenvalues λ₁, …, λ_u, respectively. Then an n × n matrix Z
commutes with J if and only if Z_{αβ} = 0 for λ_α ≠ λ_β and Z_{αβ} is an upper
triangular Toeplitz matrix for λ_α = λ_β, where Z = [Z_{αβ}]_{α,β=1}^{u} is the partition
of Z consistent with the partition of J into Jordan blocks.
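Theorem 9.1.1 is easy to verify numerically for a small Jordan matrix. Here J has two Jordan blocks with the same eigenvalue, with sizes chosen equal so that every block of Z is a square upper triangular Toeplitz matrix:

```python
import numpy as np

def jordan_block(lam, k):
    """k x k Jordan block with eigenvalue lam."""
    return lam * np.eye(k) + np.diag(np.ones(k - 1), 1)

def upper_toeplitz(c):
    """Upper triangular Toeplitz matrix with first row c."""
    k = len(c)
    T = np.zeros((k, k))
    for i in range(k):
        T += c[i] * np.diag(np.ones(k - i), i)
    return T

J = np.block([[jordan_block(2.0, 2), np.zeros((2, 2))],
              [np.zeros((2, 2)), jordan_block(2.0, 2)]])
Z = np.block([[upper_toeplitz([1.0, 4.0]), upper_toeplitz([2.0, -1.0])],
              [upper_toeplitz([0.0, 3.0]), upper_toeplitz([5.0, 0.5])]])

print(np.allclose(J @ Z, Z @ J))  # True: Z commutes with J
# With distinct eigenvalues in the two blocks, the theorem would instead
# force the off-diagonal blocks of any commuting Z to be zero.
```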
Corollary 9.1.2
where the spectra of the matrices A₁ and A₂ do not intersect. Then any n × n
matrix X that commutes with A has the form
is the Jordan form of A. By Theorem 9.1.1, and since σ(J₁) ∩ σ(J₂) = ∅, any
matrix Y that commutes with J has the form
with the same
partition as in (9.1.9). Now Y commutes with J if and only if X = SYS⁻¹
commutes with A. So
Corollary 9.1.3
Every root subspace for a transformation A: ℂ^n → ℂ^n is a reducing invariant
subspace for any transformation that commutes with A.
The proof of Theorem 9.1.1 allows us to study the set 𝒞(A) of all
matrices (or transformations) that commute with the matrix (or linear
transformation) A. First, observe that 𝒞(A) is a linear vector space. Indeed,
if AX = XA and AY = YA, then also A(αX + βY) = (αX + βY)A for
any complex numbers α and β.
To compute the dimension of 𝒞(A), consider the elementary divisors of
A. Thus, for every Jordan block of size k × k and eigenvalue λ₀ in the
Jordan normal form of A we have an elementary divisor (λ − λ₀)^k of A
Theorem 9.1.4
Every matrix commuting with A is a polynomial in A if and only if A is
nonderogatory.
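For a single Jordan block, which is nonderogatory, Theorem 9.1.4 can be checked directly: any commuting matrix, here an upper triangular Toeplitz one, is a polynomial in A, and the coefficients can be recovered by solving a small linear system:

```python
import numpy as np

n = 3
A = np.diag(np.ones(n - 1), 1)   # nilpotent Jordan block: nonderogatory
X = np.array([[2.0, 3.0, 0.5],   # upper triangular Toeplitz, so X commutes with A
              [0.0, 2.0, 3.0],
              [0.0, 0.0, 2.0]])
assert np.allclose(A @ X, X @ A)

# Solve X = c0*I + c1*A + c2*A^2 for the coefficients c.
powers = [np.linalg.matrix_power(A, k) for k in range(n)]
M = np.column_stack([P.flatten() for P in powers])
c, *_ = np.linalg.lstsq(M, X.flatten(), rcond=None)
print(c)  # c = (2, 3, 0.5), so X = 2I + 3A + 0.5 A^2
```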
Corollary 9.1.5
If λ₁, …, λ_s are the distinct eigenvalues of a diagonable matrix A, then
where
Proposition 9.1.6
An n × n matrix B commutes with every n × n matrix A if and only if B is a
scalar multiple of I: B = λI for some λ ∈ ℂ.
Proof. The "if" part is obvious. So assume that B commutes with every
n × n matrix A; in particular, taking A to be diagonal with n different
eigenvalues with respect to a basis x₁, …, xₙ in ℂ^n, Corollary 9.1.2 implies
that B is also diagonal in this basis. Therefore, Bx₁ ∈ Span{x₁}. As any
nonzero vector x appears in some basis in ℂ^n, we find that Bx = λx for
every x ∈ ℂ^n \ {0}, where the number λ can depend on x: λ = λ(x). How-
ever, if Bx = λ(x)x and By = λ(y)y with λ(x) ≠ λ(y), then B(x + y) ∉
Span{x + y}, a contradiction. Hence λ is independent of x and the propos-
ition is proved.
Theorem 9.2.1
Let Ω be a set of commuting transformations from ℂ^n into ℂ^n (so AB = BA
for any A, B ∈ Ω). Then there exists a complete chain of subspaces
dim M_j = j, such that M₀, M₁, …, Mₙ are invariant for
every transformation from Ω.
Clearly ℒ(x) is a nonzero subspace that is invariant for any A ∈ Ω (in short,
Ω invariant).
Now let x₁ ∈ ℂ^n be an eigenvector of some transformation A₁ ∈ Ω
corresponding to an eigenvalue λ₁; so A₁x₁ = λ₁x₁. Hence for every
B₁, …, B_k ∈ Ω we have
so
for any
Theorem 9.2.2
Let Ω be a set of commuting transformations from ℂ^n into ℂ^n. Then there
exists an orthonormal basis x₁, …, xₙ in ℂ^n such that the representation of
any A ∈ Ω in this basis is an upper triangular matrix.
Theorem 9.2.3
Let Ω be a set of normal transformations ℂ^n → ℂ^n. Then AB = BA for any
transformations A, B ∈ Ω if and only if there is an orthonormal basis
consisting of eigenvectors that are common to all transformations in Ω.
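A quick numerical illustration of Theorem 9.2.3: two commuting real symmetric (hence normal) matrices are diagonalized by one orthonormal basis. When A has distinct eigenvalues, any orthonormal eigenbasis of A must already work for B:

```python
import numpy as np

# Two commuting normal matrices: B is a polynomial in the symmetric S = A,
# so it is normal and commutes with A (an illustrative construction).
S = np.array([[2.0, 1.0], [1.0, 2.0]])
A = S                      # eigenvalues 1 and 3 (distinct)
B = S @ S - 2.0 * S        # commutes with A
assert np.allclose(A @ B, B @ A)

_, Q = np.linalg.eigh(A)   # orthonormal eigenvectors of A
# Since A has distinct eigenvalues, the same Q must diagonalize B:
print(np.allclose(Q.T @ B @ Q, np.diag(np.diag(Q.T @ B @ Q))))  # True
```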
Theorem 9.3.1
Let A and B be n × n matrices with rank(AB − BA) ≤ 1. Then there exists a
complete chain of subspaces:
span the one-dimensional range of AB − BA. Hence for every y ∈ ℂ^n there
exists a constant μ(y) such that
It follows that
and hence
EXAMPLE 9.4.1. Let A = λI, λ ∈ ℂ. Obviously, any transformation from ℂ^n
to ℂ^n commutes with A, so the only subspaces that are invariant for every
linear transformation that commutes with A are the trivial ones: {0} and ℂ^n.
Hence A has only two hyperinvariant subspaces: {0} and ℂ^n.
Theorem 9.4.1
For a transformation A: ℂ^n → ℂ^n every A-invariant subspace is A hyperin-
variant if and only if A is nonderogatory.
Proof. We have seen already that the "if" part is true. To prove the
"only if" part, assume that A is not nonderogatory. We prove that there
exists an A-invariant subspace that is not A hyperinvariant. By assumption,
dim Ker(A − λ₀I) ≥ 2 for some eigenvalue λ₀ of A. Without loss of generali-
ty we can assume that A is a Jordan matrix
where m ≥ 2 and the first m Jordan blocks correspond to the eigenvalue λ₀,
they are arranged so that k₁ ≤ k₂, and λ_{m+1}, …, λ_p are different from λ₀.
Obviously, Span{e₁} is an A-invariant subspace. It turns out that this
subspace is not A hyperinvariant. Indeed, by Theorem 9.1.1 the matrix S
with 1 in the entries (k₁ + 1, 1), …, (2k₁, k₁) and zero elsewhere com-
mutes with A. On the other hand, Se₁ = e_{k₁+1}, so Span{e₁} is not S
invariant. □
Theorem 9.4.2
The lattice of all A-hyperinvariant subspaces coincides with the smallest lattice
𝒮_A of subspaces in ℂ^n that contains
Actually, 𝒮_A coincides with the smallest lattice of subspaces in ℂ^n that
contains
for some complex numbers p₁, p₂, p₃, p₄. So Ker p(A) can only be one of
the following subspaces: {0} (if p₁ ≠ 0); Span{e₁, e₅} (if p₁ = 0, p₂ ≠ 0);
Span{e₁, e₂, e₅, e₆} (if p₁ = p₂ = 0, p₃ ≠ 0); Span{e₁, e₂, e₃, e₅, e₆} (if p₁ =
p₂ = p₃ = 0, p₄ ≠ 0); ℂ⁶ (if pᵢ = 0, i = 1, 2, 3, 4). The subspace Im p(A) can
be one of the following: ℂ⁶; Span{e₁, e₂, e₃, e₅}; Span{e₁, e₂}; Span{e₁};
{0}. None of these subspaces coincides with ℒ. □
Proposition 9.5.1
For any λ ∈ ℂ the subspaces
Proof. Fix λ ∈ ℂ and a positive integer k, and let x be any vector from
Ker(A − λI)^k. If B commutes with A, we have
Let
Lemma 9.5.2
For every the subspace
for every
is nonzero, and let q be the maximal index i (1 ≤ i ≤ p_j) such that
We show that the subspace 𝒥_q is in
Let P_j be the projector on 𝒥_j defined by P_j f^{(i)}_α = 0 for i ≠ j and
P_j f^{(j)}_α = f^{(j)}_α (α = 1, …, p_j). Obviously, P_j B = BP_j. Therefore, the sub-
space ℒ is P_j invariant. Hence y = P_j x ∈ ℒ. For every k = 1, 2, … the linear
transformation (B − λ₀I)^k commutes with B and hence
belong to ℒ.
We have proved that ℒ has the form (9.5.3) with q₁ ≥ ⋯ ≥ q_m. Let us
verify that p₁ − q₁ ≥ ⋯ ≥ p_m − q_m. Fix i₀ < j₀ and let C: ℂ^n → ℂ^n be
defined in the block matrix form C = [C_{ij}]_{i,j=1}^{m} with respect to the basis
(9.5.1), where C_{ij} is the zero p_i × p_j matrix if i ≠ j₀ or j ≠ i₀, and C_{j₀i₀} is the
p_{j₀} × p_{i₀} matrix [0 I]. By Theorem 9.1.1, C commutes with A, so ℒ is C
invariant. If
then obviously
Otherwise
Theorem 9.6.1
For any transformation A: ℂ^n → ℂ^n the lattice Hinv(A) is distributive and
self-dual and contains exactly
elements, where p₁^{(i)} ≥ ⋯ ≥ p_{k_i}^{(i)} are the partial multiplicities of A correspond-
ing to the ith eigenvalue, i = 1, …, k, and k is the number of different
eigenvalues of A (in particular, Hinv(A) is finite).
for every M, N₁, N₂ ∈ Λ. The lattice Λ is said to be self-dual if there exists a
bijective map ψ: Λ → Λ such that ψ(M + N) = ψ(M) ∩ ψ(N), ψ(M ∩ N) =
ψ(M) + ψ(N) for every M, N ∈ Λ. [In other words, Λ is isomorphic (as a
lattice) to the dual lattice of Λ.]
and
for any A-hyperinvariant subspaces ℒ₁ and ℒ₂, we assume (without loss of
generality) that A has only a single eigenvalue
To show that the lattice of A-hyperinvariant subspaces is distributive, first
observe the following equality for any real numbers r, s, t:
and
9.7 EXERCISES
is a circulant.
9.7 Show that any matrix commuting with
Chapter Ten
Description of Invariant Subspaces and Linear Transformations with the Same Invariant Subspaces
Description of Irreducible Subspaces 317
Proposition 10.1.1
The class T_f is an algebra; that is, it is closed under the operations of addition,
multiplication by scalars, and matrix multiplication. Moreover, if A ∈ T_f and
det A ≠ 0, then A⁻¹ ∈ T_f.
Proof. All but the last assertion of Proposition 10.1.1 are immediate
consequences of the definition of T_f. To prove the last assertion, suppose that
and in general
into a direct sum of nonzero A-invariant subspaces M_i; then from the choice
of f it follows that each M_i in equality (10.1.4) is irreducible. To describe
the A-invariant subspaces, therefore, it is sufficient to describe all the
irreducible subspaces.
It follows from Theorem 2.5.1 that an A-invariant subspace ℒ is ir-
reducible if and only if the Jordan form of A|_ℒ consists of one Jordan block
only. In other words, ℒ is irreducible if and only if there exists a basis
x₁, …, x_j in ℒ and a complex number λ such that
that is, the system {x_i}_{i=1}^{j} is a Jordan basis in ℒ. Consequently, every
irreducible subspace is contained in some root subspace. (One can see this
also from Theorem 2.1.5.) Thus it is sufficient to describe all the irreducible
subspaces contained in a fixed root subspace corresponding to the eigen-
value λ. Without loss of generality, we assume that λ = 0. (Otherwise,
replace A by B = A − λI and observe that both transformations A and B
have the same invariant subspaces.)
The root subspace ℛ₀(A) is decomposed into a direct sum of Jordan
subspaces:
Let g₁, …, g_p ∈ ℒ₁ and f₁, …, f_q ∈ ℒ₂ be Jordan bases in ℒ₁ and ℒ₂,
respectively. Without loss of generality, suppose that p ≥ q. It is known that
in any nonzero irreducible subspace of A there exists only one eigenvector
(up to multiplication by a nonzero scalar). We describe first all the irreduci-
ble subspaces that contain the eigenvector g₁ [and thus are contained in
ℛ₀(A)].
In the following proposition j is a fixed integer, 1 ≤ j ≤ p.
Proposition 10.1.2
Let T^{(ν)}, where ν = min(j, q), be an upper triangular matrix of size j × j,
whose diagonal elements are zeros and the block formed by the first ν rows
and first ν columns is a Toeplitz matrix:
Indeed, if y ∈ ℒ ∩ ℒ₂ and y ≠ 0, then for some i (0 ≤ i ≤ p − 1) and some
complex number γ ≠ 0 the equality Aⁱy = γf₁ holds. So f₁ ∈ ℒ ∩ ℒ₂ ⊆ ℒ,
and since also g₁ ∈ ℒ, the irreducible subspace ℒ contains two linearly
independent eigenvectors f₁ and g₁, which is impossible. From (10.1.9) and
the inclusion ℒ₁ + ℒ₂ ⊆ ℛ₀(A) = ℒ₁ ∔ ℒ₂ it follows that dim ℒ ≤ dim ℒ₁ =
p. Now let ℒ be an irreducible subspace containing g₁ with a Jordan basis
y₁, …, y_j; so y₁ = α₀g₁ and Ay_{i+1} = y_i (i = 1, …, j − 1). We look for the
vectors y₂, …, y_j in the form of linear combinations of g₁, …, g_p,
f₁, …, f_q. Two possibilities can occur: (1) j ≤ q; (2) q + 1 ≤ j ≤ p. Consider
first the case when j ≤ q. The condition Ay₂ = y₁ implies that
Theorem 10.1.3
Let
form a Jordan basis in some irreducible subspace of A that contains the vector
f₁ (here u₁, …, u_{p′} ∈ ℒ_{r−1} and v₁, …, v_{p″} ∈ ℒ_{r+1} are Jordan bases in
ℒ_{r−1} and ℒ_{r+1}, respectively). Conversely, for every irreducible subspace ℒ of
dimension j such that f₁ ∈ ℒ there exist matrices T^{(ν₁)}, …, T^{(ν_m)} such that the
components of the column (10.1.12) form a Jordan basis in ℒ.
Indeed, let x₁ ∈ ℛ_λ(A) be an eigenvector, and let r be the minimal
integer such that x₁ ∈ ℒ₁ + ⋯ + ℒ_r. Then x₁ = a₁g₁ + ⋯ + a_r f₁, where
a₁, …, a_r ∈ ℂ and a_r ≠ 0. Consider the system of vectors x_i = a₁g_i + ⋯ +
a_r f_i, i = 1, …, p_r. Evidently, x₁, …, x_{p_r} satisfy the condition (10.1.5).
Transformations Having the Same Set of Invariant Subspaces 323
So in the representation (10.1.6) one can replace ℒ_r by 𝒳_r. Then in view of
Theorem 10.1.3 the components of the columns of form (10.1.12) describe
all the irreducible subspaces of A that contain the vector x_1 [in (10.1.12)
write x^(i) in place of f^(i)].
Observe that every irreducible subspace contains an eigenvector of A. So
the description in the preceding paragraph gives all the irreducible subspaces
of A (as the vector x_1 is varied).
where
324 Description of Invariant Subspaces and Linear Transformations
with b ≠ 0, we must verify only that every B: ℂ^5 → ℂ^5 such that Inv(B) =
Inv(A) has the form (10.2.4) with b ≠ 0 (in the basis e_1, e_2, e_3, e_4, e_5).
So assume Inv(B) = Inv(A). Then clearly B has upper triangular form,
and (see the argument in Example 10.2.1) the elements on the main
diagonal of B are all equal. Without loss of generality, we can assume that
the main diagonal of B is zero:
Theorem 10.2.1
If Inv(B) = Inv(A) for a transformation B: ℂ^n → ℂ^n, then
We relegate the lengthy proof of this theorem to the next section. The
proof will be based on the description of irreducible subspaces obtained in
Section 10.1.
We conclude this section with two corollaries of Theorem 10.2.1.
Corollary 10.2.2
Suppose that AB = BA. Then Inv(A) = Inv(B) if and only if B = f(A),
where f(λ) is a polynomial such that f(λ_i) ≠ f(λ_j) for eigenvalues λ_i ≠ λ_j of
A, and f′(λ_0) ≠ 0 whenever λ_0 ∈ σ(A) and Ker(A − λ_0 I) ≠ ℛ_{λ_0}(A).
In other words, the conditions of Theorem 2.11.3 are not only sufficient,
but also necessary, provided A and B commute.
Corollary 10.2.3
Let A: ℂ^n → ℂ^n be a transformation. Then every transformation B with
Inv(B) = Inv(A) commutes with A if and only if the following condition
holds: for every eigenvalue λ_0 of A with Ker(A − λ_0 I) ≠ ℛ_{λ_0}(A) and
dim Ker(A − λ_0 I) > 1 we have
Lemma 10.3.1
If B has the form (10.3.1) with b_2 ≠ 0, then in any Jordan basis for B the
transformation A has the form
Proof. Without loss of generality we can assume that σ(A) = {0} and
that the Jordan basis g_1, …, g_n coincides with the standard basis
where
Lemma 10.3.2
If for every m = 1, . . . , p the subspaces Span and
Span coincide, then
Proof. Use induction on p. For p = 1 the lemma is evident. Assume
the lemma holds true for p = k, and Span
in the form
Lemma 10.3.3
Let A: be such that
Proof. First we prove the necessity, that is, if then B has the
form (10.3.7). Consider first the case p = q and prove the necessity by
induction on everything is evident. Suppose that the lemma is
true for and let be irreducible subspaces of dimension
Evidently,
these are irreducible subspaces of A corresponding to the same eigenvalue. Since
Proof of Theorem 10.2.1 331
where
Assume otherwise, consider the linear transformation
where in place of B, and use the property that Inv
Inv This condition means that B is invertible.
Let ℒ be an irreducible subspace of A such that dim and
By Theorem 10.1.3, there exist numbers such that the
vectors
and
These equalities hold for every (by choosing all possible ℒ; see
Theorem 10.1.3). Therefore
we obtain Let us
show that, in fact, To this end consider a Jordan basis of A
of the form
As above, we obtain
where Let
Now
Hence
Put
for some Γ, where The first p diagonals in both blocks are the same
in view of the choice of
Now we can repeat the proof of the inclusion given above, with
A and B interchanged, so the reverse inclusion follows as well, and the proof of Theorem 10.2.1 is complete.
10.4 EXERCISES
10.1 Let
Algebras of Matrices
and Invariant Subspaces
In this chapter we consider subspaces that are invariant for every transfor-
mation from a given algebra of transformations. In fact, this framework
includes general finite-dimensional algebras over ℂ. The key result, that
every algebra of n x n matrices that is not the algebra of all n x n matrices
has a nontrivial invariant subspace, is developed with a complete proof.
Some results concerning characterization of lattices of subspaces that are
invariant for every transformation from an algebra are presented. Finally, in
the last section we study algebras of transformations for which the orthogo-
nal complement of an invariant subspace is again invariant.
A linear space V (over the field of complex numbers ℂ) is called an algebra
if an operation (usually called multiplication) is defined in V, which as-
sociates an element of V (denoted xy or x · y) with every (ordered) pair of
elements x, y from V, with the following properties: (a) α(xy) = (αx)y =
x(αy) for every α ∈ ℂ and every x, y ∈ V; (b) (xy)z = x(yz) for every
x, y, z ∈ V (associativity of multiplication); (c) (x + y)z = xz + yz, x(y +
z) = xy + xz for every x, y, z ∈ V (distributivity of multiplication with re-
spect to addition).
Note that, generally speaking, xy ≠ yx in the algebra V. The algebra V may
or may not have an identity, that is, an element e ∈ V such that ae = ea = a
for every a ∈ V.
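Both points are easy to check numerically in the basic example below, the algebra of 2 × 2 complex matrices; the particular elements x and y (matrix units) are our own choice of illustration:

```python
import numpy as np

# Two elements of the algebra M_2 of all 2 x 2 complex matrices
# (matrix units, chosen here as an example).
x = np.array([[0, 1], [0, 0]], dtype=complex)
y = np.array([[0, 0], [1, 0]], dtype=complex)

xy = x @ y          # xy = diag(1, 0)
yx = y @ x          # yx = diag(0, 1)
noncommutative = not np.allclose(xy, yx)

# M_2 has an identity element e with ae = ea = a for every a.
e = np.eye(2, dtype=complex)
```

Here xy and yx are different diagonal matrices, so the algebra is indeed noncommutative, while the identity matrix plays the role of e.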
We consider only finite-dimensional algebras, that is, those that are
finite-dimensional linear spaces. The basic example of an algebra is M_{n,n}, the
algebra of all n × n matrices with complex entries, with the usual multiplica-
tion operation. Another important example is the algebra of upper triangu-
lar n × n (complex) matrices.
The following theorem shows that actually every finite-dimensional algebra can be identified with an algebra of matrices.
Theorem 11.1.1
Let V be an algebra of dimension n (as a linear space). If V has an identity, then
V can be identified with an algebra of n × n matrices. If V does not have an
identity, it can be identified with an algebra of (n + 1) × (n + 1) matrices.
Proof. Assume first that V has the identity e. Let x_1, …, x_n be a basis
in V. For every a ∈ V the mapping â: V → V defined by â(x) = ax, x ∈ V, is a
linear transformation. Denote by M(a) the n × n matrix that represents the
linear transformation â in the fixed basis x_1, …, x_n. It is easy to check that
the mapping M: V → M_{n,n} defined above is an algebra homomorphism:
for any elements a, b ∈ V and any α ∈ ℂ. Further, the only element a ∈ V
for which M(a) = 0 is a = 0. Indeed, if M(a) = 0, then ax = 0 for every x ∈ V.
Taking x = e, we obtain a = 0. Hence we can identify V with the algebra
{M(a) | a ∈ V}, which is simply an algebra of n × n matrices.
Assume now that V does not have an identity. Define a new algebra Ṽ as all
ordered pairs (x, α) with x ∈ V, α ∈ ℂ and with the following operations:
for any x, y ∈ V and any α, β, γ ∈ ℂ. Obviously, the algebra Ṽ has the
identity (0, 1) and dimension n + 1. According to the part of Theorem
11.1.1 already proved, we can identify Ṽ with an algebra of (n + 1) × (n + 1)
matrices (clearly, dim Ṽ = n + 1). As V can be identified in turn with the
subalgebra {(x, 0) | x ∈ V} of Ṽ, the conclusion of Theorem 11.1.1
follows.
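The map a ↦ M(a) from this proof can be sketched numerically for a small example, say the three-dimensional algebra of upper triangular 2 × 2 matrices; the basis E_11, E_12, E_22 and the helper names below are our own choices, not taken from the text:

```python
import numpy as np

# Basis of the 3-dimensional algebra V of upper triangular 2x2 matrices.
E11 = np.array([[1.0, 0.0], [0.0, 0.0]])
E12 = np.array([[0.0, 1.0], [0.0, 0.0]])
E22 = np.array([[0.0, 0.0], [0.0, 1.0]])
basis = [E11, E12, E22]

def coords(m):
    """Coordinates of an upper triangular 2x2 matrix in the basis above."""
    return np.array([m[0, 0], m[0, 1], m[1, 1]])

def M(a):
    """3x3 matrix of the left-multiplication map x -> a x in the fixed basis."""
    return np.column_stack([coords(a @ b) for b in basis])

# Homomorphism property from the proof: M(ab) = M(a) M(b).
a = np.array([[1.0, 2.0], [0.0, 3.0]])
b = np.array([[4.0, 5.0], [0.0, 6.0]])
homomorphism_ok = bool(np.allclose(M(a @ b), M(a) @ M(b)))
```

Since V has the identity (the 2 × 2 identity matrix), M is injective, and M(e) is the 3 × 3 identity.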
Theorem 11.2.1
Let V be an algebra of n × n (complex) matrices with V ≠ M_{n,n}.
Then there exists a nontrivial V-invariant subspace.
We exclude the case n = 1, when every subspace in ℂ^n is trivial (in this
case the theorem fails for V = {0}). The proof of Theorem 11.2.1 is lengthy
and is based on a series of auxiliary results; it is given in the next section.
Taking a maximal chain of V-invariant subspaces and using Burnside's
theorem, we arrive at the following conclusion.
Theorem 11.2.2
For any algebra V of n × n matrices, there is a chain of V-invariant subspaces
and the set {A_pp | A ∈ V} coincides with the algebra of all transformations
from N_p into N_p, for p = 1, …, k. The chain (11.2.1) is maximal, and every
maximal chain of V-invariant subspaces has the property stated above.
The case when V is the algebra of all block upper triangular matrices with
respect to the decomposition (11.2.2) is of special interest. Then M_{n,n} is a
direct sum of two subspaces: V and W, where W is the algebra of all block
lower triangular matrices with zeros on the main block diagonal:
The subspaces
are all the invariant subspaces for W. In particular, we have the following
direct sum decompositions:
Conjecture 11.2.3
Let be nonzero subalgebras in M_{n,n} such that
Then there exist nonzero invariant subspaces M_1 and M_2 for
and respectively, which are direct complements of each other in ℂ^n.
Lemma 11.3.1
The algebra M_{n,n} has no nontrivial ideals.
Lemma 11.3.3
The only matrices that commute with every matrix in U are the scalar
multiples of I.
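This conclusion is easy to confirm numerically for the full matrix algebra: the joint commutant of a generating pair of matrices is one-dimensional, spanned by I. A sketch (the generators D and P are our own choice of example):

```python
import numpy as np

n = 3
# Two matrices that generate the full algebra M_{n,n}: a diagonal matrix
# with distinct entries and a cyclic permutation.
D = np.diag([1.0, 2.0, 3.0])
P = np.roll(np.eye(n), 1, axis=1)

def commutator_map(A):
    # vec([A, X]) = (I kron A - A^T kron I) vec(X), with column-major vec
    return np.kron(np.eye(n), A) - np.kron(A.T, np.eye(n))

# X commutes with both D and P iff vec(X) lies in the joint null space.
K = np.vstack([commutator_map(D), commutator_map(P)])
_, s, Vh = np.linalg.svd(K)
null_dim = int(np.sum(s < 1e-10))            # dimension of the commutant
X0 = Vh[-1].reshape(n, n, order="F")         # a basis vector of the commutant
```

The null space is one-dimensional and its basis vector X0 is a scalar multiple of the identity, in agreement with the lemma.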
Lemma 11.3.4
If x_1 and x_2 are linearly independent vectors in ℂ^n, then for every pair of
vectors there exists a matrix A from U such that
and
Then, for A ∈ V
Proposition 11.4.1
(a) If and are two lattices of subspaces in and then
(b) If V_1 and V_2 are algebras of n × n matrices and
then
EXAMPLE 11.4.5. Let A: ℂ^n → ℂ^n be a fixed transformation, and let V be
the algebra of all transformations that commute with A. Then Inv(V) is the
lattice of all A-hyperinvariant subspaces.
Note that
Theorem 11.4.2
A distributive lattice of subspaces in ℂ^n is reflexive. Conversely, every finite
reflexive lattice of subspaces is distributive.
The proof of Theorem 11.4.2 is beyond the scope of this book, and we
refer the reader to the original papers by Johnson (1964) and Harrison
(1974) for the full story. Here, we shall only prove two particular cases in
the form of Theorems 11.4.3 and 11.4.4.
Theorem 11.4.3
A complete chain of subspaces
is reflexive.
Reflexive Lattices 349
obviously belongs to Alg(Λ), and its only invariant subspaces are {0}, M_i,
i = 1, …, n − 1, and ℂ^n; hence Inv(Alg(Λ)) ⊆ Λ. Since the reverse inclu-
sion is clear, the conclusion of Theorem 11.4.3 follows.
The next theorem deals with lattices that are as unlike chains as possible.
A lattice Λ of subspaces in ℂ^n is called a Boolean algebra if it is
distributive and for every M ∈ Λ there is a unique complement M′ (i.e.,
a subspace that belongs to Λ). We say that a nonzero
subspace 𝒦 ∈ Λ is an atom if there are no subspaces of the lattice Λ strictly
between 𝒦 and {0}. The Boolean algebra Λ is called atomic if any M ∈ Λ is
the sum of all atoms 𝒦 contained in M. A typical example of an atomic
Boolean algebra of subspaces is Λ = {Span{x_i | i ∈ E}, where E is any
subset of {1, 2, …, n}}, and x_1, …, x_n is a fixed basis in ℂ^n.
Theorem 11.4.4
Every atomic Boolean algebra Λ of subspaces of ℂ^n is reflexive.
Proof. For every atom 𝒦 in Λ, let P_𝒦 be
the projector onto 𝒦 along the complement 𝒦′ of 𝒦 in the lattice Λ. We shall
show that Λ = Inv(V), where V is the algebra generated by the transfor-
mations of type P_𝒦 A P_𝒦, where A: ℂ^n → ℂ^n and 𝒦 is an atom. In other words, V
consists of all linear combinations of transformations of type
Since Λ is atomic, the complement M′_0 is the sum of all atoms that are
not contained in M. (Indeed, if an atom 𝒦 is contained in M′_0, then 𝒦 is not
contained in M_0 and thus, by the definition of M_0, 𝒦 is not contained in M.
Conversely, if an atom 𝒦 is not contained in M, then obviously 𝒦 is not
contained in M_0, and since 𝒦 is an atom, it must be contained in M′_0.) The
fact proved in the preceding paragraph shows that M′_0 is the sum of all the
atoms 𝒦 with the property that For any finite set 𝒦_1, …, 𝒦_p of
atoms with i = 1, …, p, we have (using the distributivity of Λ)
and
so actually
We have seen in Corollary 3.4.4 that the set Inv(A) of invariant subspaces of
a transformation A: ℂ^n → ℂ^n has the property that M ∈ Inv(A) exactly
when M^⊥ ∈ Inv(A) if and only if A is normal. This property makes it
Reductive and Self-Adjoint Algebras 351
Theorem 11.5.1
An algebra V of n × n matrices with I ∈ V is reductive if and only if V is
self-adjoint, that is, A ∈ V implies A* ∈ V.
Lemma 11.5.2
Let V be a reductive algebra of n × n matrices, and let M_1, …, M_m be a set
of mutually orthogonal V-invariant subspaces such that
and for every i the set of restrictions coincides with the algebra
M(M_i) of all transformations from M_i into M_i. Then V is self-adjoint.
As is equal to we have
and thus SAg = φ(A)Sg for every Thus φ(A) = SAS^{−1} for all A ∈ V.
Next, we show that S can be taken to be unitary, that is, S^{−1} = S*. Let M
be a subspace in ℂ^n consisting of all vectors of the form x_1 + ⋯ + x_m, where
As
As {A|_{M_i} | A ∈ V} coincides with M(M_i) and in the preceding equality x_i
can be an arbitrary vector from M_i, we obtain B = S*SBS^{−1}S*^{−1} for all
By Proposition 9.1.6, S*S = μI for some number μ that must
be positive because S*S is positive definite. Letting U = μ^{−1/2}S, we obtain a
unitary transformation U such that φ(B) = UBU^{−1} for all
We next show that V|_{M_k^⊥} is reductive. Indeed, let 𝒩
be V|_{M_k^⊥} invariant. Then clearly 𝒩 is V invariant,
and by the reductive property of V, so is 𝒩^⊥, and hence also 𝒩^⊥ ∩ M_k. It
remains to notice that this subspace coincides with the orthogonal complement
to 𝒩.
By the induction hypothesis, the restriction V|_{M_k^⊥} is self-adjoint. Therefore, for every
matrix
the transformation
and
11.6 EXERCISES
(d) Block upper triangular matrices [a_{ij}]_{i,j=1}^n, where a_{ij} are k × k
matrices and a_{ij} = 0 if i > j.
(e) Matrices of type
11.7 Show that there is no algebra A with identity strictly contained in the
algebra UT(n) of upper triangular Toeplitz matrices for which
Exercises 357
11.8 Prove that the algebra U(n) of n x n upper triangular matrices is the
unique reflexive algebra for which the lattice of all invariant sub-
spaces is the chain
11.9 Show that there exist n different algebras V_1, …, V_n whose set of
invariant subspaces coincides with (5) and for which
Real Linear
Transformations
In this chapter we review the basic facts concerning invariant subspaces for
transformations from ℝ^n into ℝ^n, focusing mainly on those results that are
different (or whose proofs are different) in the real case, or cannot be
obtained as immediate corollaries from the corresponding results for trans-
formations from ℂ^n into ℂ^n.
We note here that the applications presented in Chapters 5, 7, and 8 also
hold in the real case. That is, there are applications to matrix polynomials
with real n × n matrices A_j and to rational matrix functions W(λ) whose
values are real n × n matrices for the real values of λ that are not poles of
W(λ). In fact, the description of multiplication and divisibility of matrix
polynomials and rational matrix functions in terms of invariant subspaces (as
developed in Chapters 5 and 7) holds for matrices over any field. This
remark applies to the linear fractional decompositions of rational matrix
functions as well. In contrast, the Brunovsky canonical form (Section 6.2) is
not available in the framework of real matrices, so all the results of Chapter
6 that are based on the Brunovsky canonical form fail, in general, in this
context. Also, the results of Chapter 11 do not generally hold in the
context of finite-dimensional algebras over the field of real numbers.
where σ and τ are real numbers and τ ≠ 0. The size n of the matrix A is
obviously an even number. It is easily seen that Span{e_1, …, e_2k}, k =
1, …, n/2 are A-invariant subspaces. It turns out that A has no other
nontrivial invariant subspaces. Indeed, replacing A by A − σI, we can
assume without loss of generality that σ = 0. We prove that if M is an
A-invariant subspace and x = a_1 e_1 + ⋯ + a_2k e_2k ∈ M with at least one of the real
numbers a_{2k−1} and a_{2k} different from zero, then M ⊇ Span{e_1, …, e_2k}, and
proceed by induction on k.
In the case k = 1 we have x = a_1 e_1 + a_2 e_2 and A(a_1 e_1 + a_2 e_2) = τ a_2 e_1 −
τ a_1 e_2. The conditions τ ≠ 0 and a_1^2 + a_2^2 ≠ 0 ensure that both vectors e_1
and e_2 are linear combinations of a_1 e_1 + a_2 e_2 and τ a_2 e_1 − τ a_1 e_2, and the
assertion is proved for k = 1. Assuming that the assertion is proved for
k − 1, a computation shows that
the corresponding vector belongs to Span{e_1, …, e_{2k−2}} and in the linear
combination at least one of the coefficients β_{2k−3}, β_{2k−2} is
different from zero. Obviously this vector lies in M, so the induction assumption implies
M ⊇ Span{e_1, …, e_{2k−2}}. Hence a_{2k−1} e_{2k−1} + a_{2k} e_{2k} ∈ M; as the differ-
ence Ax − (τ a_{2k} e_{2k−1} − τ a_{2k−1} e_{2k}) belongs to Span{e_1, …, e_{2k−2}}, also
τ a_{2k} e_{2k−1} − τ a_{2k−1} e_{2k} ∈ M. Consequently, the vectors e_{2k−1} and e_{2k} belong
to M, and M ⊇ Span{e_1, …, e_2k}. In particular, A has no odd-dimensional
invariant subspaces. □
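The k = 1 step of this argument is easy to check numerically; a small sketch with numpy, where the values of σ, τ and the test vector x are arbitrary choices of ours:

```python
import numpy as np

sigma, tau = 0.5, 2.0                       # tau != 0, values arbitrary
A = np.array([[sigma, tau], [-tau, sigma]])

eigs = np.linalg.eigvals(A)                 # sigma +/- i*tau, both nonreal
no_real_eigs = bool(np.all(np.abs(eigs.imag) > 1e-12))

# For any nonzero x, the vectors x and Ax are linearly independent
# (A has no real eigenvectors), so no 1-dimensional invariant subspace
# exists: the smallest invariant subspace containing x is all of R^2.
x = np.array([1.0, 3.0])
rank = np.linalg.matrix_rank(np.column_stack([x, A @ x]))
```

Since x and Ax span ℝ², every nonzero invariant subspace of this 2 × 2 block is the whole plane, matching the claim for k = 1.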
Definition, Examples, and First Properties of Invariant Subspaces 361
Proposition 12.1.1
If the transformation A: ℝ^n → ℝ^n has no real eigenvalues, then A has no
odd-dimensional invariant subspaces.
The Jordan chains for real transformations are defined in the same way as
for complex transformations: vectors x_0, …, x_k ∈ ℝ^n form a Jordan chain
of the transformation A: ℝ^n → ℝ^n corresponding to the eigenvalue λ_0 of A if
x_0 ≠ 0 and A x_0 = λ_0 x_0; A x_j − λ_0 x_j = x_{j−1}, j = 1, …, k. The vector x_0 is
called an eigenvector. The eigenvalue λ_0 for which a Jordan chain exists must
obviously be real. Since not every real transformation has real eigenvalues,
it follows that there exist transformations A: ℝ^n → ℝ^n without Jordan chains
(and in particular without eigenvectors). On the other hand, for every real
eigenvalue λ_0 of A: ℝ^n → ℝ^n there exists an eigenvector (namely, any
nonzero vector from Ker(λ_0 I − A) ⊆ ℝ^n). In particular, A has eigenvectors
provided n is odd.
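As an illustration of the two situations just described, the standard basis of ℝ³ forms a Jordan chain for a single Jordan block, while a rotation of ℝ² has no real eigenvalues and hence no Jordan chains at all (the matrices below are our own examples):

```python
import numpy as np

# Jordan block with real eigenvalue lam0: the standard basis vectors
# form a Jordan chain x0, x1, x2.
lam0 = 2.0
A = np.array([[lam0, 1.0, 0.0],
              [0.0, lam0, 1.0],
              [0.0, 0.0, lam0]])
e = np.eye(3)

chain_ok = (
    np.allclose(A @ e[:, 0], lam0 * e[:, 0])                # A x0 = lam0 x0
    and np.allclose(A @ e[:, 1] - lam0 * e[:, 1], e[:, 0])  # A x1 - lam0 x1 = x0
    and np.allclose(A @ e[:, 2] - lam0 * e[:, 2], e[:, 1])  # A x2 - lam0 x2 = x1
)

# A rotation by 90 degrees has no real eigenvalues, hence no eigenvectors
# and no Jordan chains over R.
B = np.array([[0.0, 1.0], [-1.0, 0.0]])
has_real_eig = bool(np.any(np.abs(np.linalg.eigvals(B).imag) < 1e-12))
```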
As we have seen (e.g., in Example 12.1.1), not every real transformation
has one-dimensional invariant subspaces. In contrast, two-dimensional in-
variant subspaces always exist, as shown in the following proposition.
Proposition 12.1.2
Any transformation A: ℝ^n → ℝ^n with n ≥ 2 has at least one two-dimensional
invariant subspace.
It is clear now that Theorem 1.9.1 is generally false for real transfor-
mations. The next result is the real analog of that theorem.
Theorem 12.1.3
Let A: ℝ^n → ℝ^n be a transformation and assume that det(λI − A) has exactly
s real zeros (counting multiplicities). Then there exists an orthonormal basis
x_1, …, x_n in ℝ^n such that, with respect to this basis, the transformation A
has the form [a_{ij}]_{i,j=1}^n, where all the entries a_{ij} with i > j are zeros except for
a_{s+2,s+1}, a_{s+4,s+3}, …, a_{n,n−1}.
Theorem 12.1.4
Let A be as in Theorem 12.1.3 and assume, in addition, that A is normal.
Then there exists an orthonormal basis in ℝ^n with respect to which A has
the matrix form [a_{ij}]_{i,j=1}^n, where a_{ij} = 0 for i ≠ j except for a_{s+2,s+1},
Let A: ℝ^n → ℝ^n be a transformation. The root subspace ℛ_{λ_0}(A) correspond-
ing to the real eigenvalue λ_0 of A is defined to be Ker(λ_0 I − A)^n, as in the
complex case. Then ℛ_{λ_0}(A) is spanned by the members of all Jordan chains
of A corresponding to λ_0. For a pair of nonreal eigenvalues σ + iτ, σ − iτ of
A (here σ, τ are real and τ ≠ 0) the root subspace is defined by
If σ_1 + iτ_1, …, σ_s + iτ_s are the distinct eigenvalues of A in the open upper half
of the complex plane (if any), then
for some positive integers α_1, …, α_r, β_1, …, β_s. Using this observation, it
can be proved that there is a direct sum decomposition
(see the remark following the proof of Theorem 2.1.2). Moreover, we have:
Theorem 12.2.1
For every A-invariant subspace M the direct sum decomposition
holds.
For the deeper study of properties of invariant subspaces, the real Jordan
form of a real transformation, to be described in the following theorem, is
most useful. As usual, J_k(λ) denotes the k × k Jordan block with eigenvalue
λ. Also, we introduce the 2l × 2l matrix
Theorem 12.2.2
For every transformation A: ℝ^n → ℝ^n there exists a basis in ℝ^n in which A has
the following matrix form:
Proposition 12.2.3
If n is odd, then every transformation A: ℝ^n → ℝ^n has an invariant subspace
of any dimension k with 0 ≤ k ≤ n.
The real Jordan form can be used instead of the (complex) Jordan form
to produce results for real transformations analogous to those presented in
Chapters 3 and 4 (with the exception of Proposition 3.1.4). For this purpose
we say that a transformation A: ℝ^n → ℝ^n is diagonable if its real Jordan form
has only 1 × 1 blocks J_1(λ_i), λ_1, …, λ_p ∈ ℝ, or 2 × 2 blocks J_1(μ_j, ω_j),
j = 1, …, q. Also, we use the fact that the Jordan form of the transfor-
mation A: ℝ^n → ℝ^n with the real Jordan form (12.2.1) is
Theorem 12.3.1
Assume that the transformation A: ℝ^n → ℝ^n does not have real eigenvalues.
Let ℛ_+ ⊆ ℂ^n be the spectral subspace of A^c corresponding to the eigenvalues
in the open upper half plane. Then for every A-invariant subspace ℒ (⊆ ℝ^n)
the subspace (ℒ + iℒ) ∩ ℛ_+ is A^c invariant and contained in ℛ_+. Converse-
ly, for every A^c-invariant subspace M ⊆ ℛ_+ there exists a unique A-invariant
subspace ℒ such that (ℒ + iℒ) ∩ ℛ_+ = M.
Proof. The direct statement of Theorem 12.3.1 has already been ob-
served. To prove the converse statement, let M ⊆ ℛ_+ be an A^c-invariant
subspace. Fix a basis z_1, …, z_k in M, and write z_j = x_j + i y_j, j = 1, …, k,
where x_j, y_j ∈ ℝ^n. Put ℒ = Span{x_1, …, x_k, y_1, …, y_k} ⊆ ℝ^n. Let us
check that ℒ is A invariant. Indeed, for each j, A^c z_j is a linear combination
(with complex coefficients) of z_1, …, z_k, say
Letting α_p^(j) = β_p^(j) + i γ_p^(j), where β_p^(j) and γ_p^(j) are real, use the definition of
A^c to rewrite (12.3.1) in the form
After separation of real and imaginary parts, these equations clearly imply
that ℒ is A invariant. Further, it is easily seen that
Hence
Obviously, 𝒩 is also a subspace in ℂ^n. We have ℒ′ + iℒ′ = ℒ + iℒ. Also,
it is easy to check (e.g., by taking complex conjugates of a Jordan basis in
ℛ_+ for A^c|_{ℛ_+}) that the complex conjugate of ℛ_+ is ℛ_−. Taking complex conjugates in (12.3.2), we
have
The proof shows that Theorem 12.3.1 remains valid if the subspace ℛ_+ is
replaced by the spectral subspace of A^c corresponding to any set S of
eigenvalues of A^c such that λ_0 ∈ S implies that the conjugate λ̄_0 is not in S, and S is maximal with
respect to this property.
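The construction in the proof (pass from a basis z_1, …, z_k of M to the real span of the real and imaginary parts) is easy to carry out numerically; the block matrix A below, with eigenvalue pairs ±i and ±3i, is a hypothetical example of ours, not taken from the text:

```python
import numpy as np

# Real matrix with two pairs of nonreal eigenvalues, +/- i and +/- 3i.
A = np.zeros((4, 4))
A[0, 1], A[1, 0] = -1.0, 1.0
A[2, 3], A[3, 2] = -3.0, 3.0

# Eigenvector z of the complexification A^c for the eigenvalue i
# (a one-dimensional A^c-invariant subspace M inside R_+).
w, V = np.linalg.eig(A)
z = V[:, np.argmin(np.abs(w - 1j))]

# L = Span{Re z, Im z} over R is the A-invariant subspace produced
# by the construction in the proof.
L = np.column_stack([z.real, z.imag])
dim_L = int(np.linalg.matrix_rank(L))
invariant = bool(np.linalg.matrix_rank(np.column_stack([L, A @ L])) == dim_L)
```

Here the one-dimensional (over ℂ) invariant subspace M yields a two-dimensional (over ℝ) A-invariant subspace ℒ, as the theorem predicts.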
We pass now to the proof of Theorem 12.2.2. First, let us observe that in
terms of matrices Theorem 12.2.2 can be restated as follows.
Theorem 12.3.2
Given an n × n matrix A whose entries are real numbers, there exists an
invertible n × n matrix S with real entries such that
Complexification and Proof of the Real Jordan Form 369
We now prove the result in the latter form. The Jordan form for
transformations from ℂ^n into ℂ^n is used in the proof.
(by definition, x_{i0} = 0) and using the fact that A^c is given by a real matrix in
the standard basis, we see that
One checks easily that U_j is unitary, that is, U_j U_j* = I, and that
and μ_j and ω_j are the real and imaginary parts of λ_{p+j}, respectively (see the
paragraph preceding Theorem 12.2.2 for the definition of J_l(μ_j, ω_j)). Also,
it is easily seen that the matrix
uniqueness of the Jordan form of A^c. [Indeed, the right-hand side of (12.3.3)
is uniquely determined by the eigenvalues and partial multiplicities of A^c.]
If σ(J_α) ∩ σ(J_β) = ∅, then equation (12.4.2) has only the trivial solution
Z_{αβ} = 0 (Corollary 9.1.2). Assume σ(J_α) = σ(J_β) = {λ_0}, where λ_0 is real.
Then, as in the proof of Theorem 9.1.1, Z_{αβ} is an upper triangular Toeplitz
matrix.
To study the case σ(J_α) = σ(J_β) = {μ_0 + iω_0, μ_0 − iω_0}, it is convenient
to first verify the following lemma.
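The Toeplitz description of the commutant of a single Jordan block is easy to verify numerically; a sketch in which the coefficients c are an arbitrary choice of ours:

```python
import numpy as np

n = 4
lam0 = 0.0
J = lam0 * np.eye(n) + np.diag(np.ones(n - 1), 1)   # Jordan block J_n(lam0)
N = J - lam0 * np.eye(n)                            # nilpotent part

# An upper triangular Toeplitz matrix c0*I + c1*N + c2*N^2 + c3*N^3
# commutes with J.
c = [3.0, 1.0, -2.0, 0.5]
Z = sum(ck * np.linalg.matrix_power(N, k) for k, ck in enumerate(c))
commutes = bool(np.allclose(J @ Z, Z @ J))

# Conversely, the solution space of JX = XJ has dimension n, so the
# upper triangular Toeplitz matrices exhaust the commutant.
K = np.kron(np.eye(n), J) - np.kron(J.T, np.eye(n))
null_dim = int(np.sum(np.linalg.svd(K)[1] < 1e-10))
```

The null space of the commutator map has dimension n = 4, exactly the number of free Toeplitz coefficients, so no other solutions exist.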
Lemma 12.4.1
KA + C=AK; KC=CK
Theorem 12.4.2
Let A be an n × n matrix with the real Jordan form diag[J_1, …, J_u], so
for some invertible real n × n matrix S, where each J_α is either a Jordan block
of size m_α × m_α with real eigenvalue or a matrix of type
with real μ_α, ω_α and ω_α > 0. Then every real n × n matrix X that commutes
with A has the form X = S^{−1}ZS, where the matrix Z = [Z_{αβ}]_{α,β=1}^u, partitioned
conformally with the Jordan form diag[J_1, …, J_u], has the following struc-
ture: If σ(J_α) ∩ σ(J_β) = ∅, then Z_{αβ} = 0. If σ(J_α) = σ(J_β) = {λ_0}, λ_0 real, then
Theorem 12.5.1
Let a transformation A: ℝ^n → ℝ^n have the minimal polynomial
Hyperinvariant Subspaces 375
where λ_j, μ_j, and ω_j are real and ω_j > 0, λ_1, …, λ_k are distinct, and so are
μ_1 + iω_1, …, μ_s + iω_s. Then the lattice of all A-hyperinvariant subspaces
coincides with the smallest lattice 𝒮_A of subspaces in ℝ^n that contains
Ker(A − λ_j I)^k, Im(A − λ_j I)^k for k = 1, …, r_j; j = 1, …, k, and Ker[(A −
μ_j I)^2 + ω_j^2 I]^k, Im[(A − μ_j I)^2 + ω_j^2 I]^k for k = 1, …, s_j; j = 1, …, m.
Lemma 12.5.2
Every A-hyperinvariant subspace is of the form
and z_2 = A z_1 belong to ℒ and also to Span{f_1^(r), f_2^(r)}. Now z_1 and z_2 are not
collinear; otherwise, A would have a real eigenvalue, and this is impossible.
It follows that Span{z_1, z_2} = Span{f_1^(r), f_2^(r)}, and hence f_1^(r), f_2^(r) ∈ ℒ. If
we already know that f_1^(r), …, f_{2l−2}^(r) ∈ ℒ for some l ≥ 2, then by a similar
argument using the vectors
We are now in a position to prove Theorem 12.5.1 for the case σ(A) =
{μ + iω, μ − iω}. As in the proof of Theorem 9.4.2, one shows that every
for some sequence of integers q_1, …, q_m such that q_1 ≥ ⋯ ≥ q_m ≥ 0 and
p_1 − q_1 ≥ p_2 − q_2 ≥ ⋯ ≥ p_m − q_m ≥ 0. We prove that ℒ ∈ 𝒮_A by induction
on q_1. Assume first q_1 = 1. Then ℒ = 𝒳_1 + ⋯ + 𝒳_t for some t ≤ m. As
p_t ≥ p_{t+1}, we have
Now assume that the inclusion ℒ ∈ 𝒮_A is proved for q_1 = ν − 1, and let ℒ be
a subspace of the form (12.5.3) with q_1 = ν. Let r, a be the maximal integers
for which q_1 = ⋯ = q_r and p_a − p_r + ν > 0. Consider the subspace
The inequalities p_i − q_i ≥ p_{i+1} − q_{i+1} imply that M ⊆ ℒ. Further, the sub-
space
Proof of Theorem 12.5.1 (the general case). Again, it is easily seen that
each subspace Ker(A − λ_j I)^k, Im(A − λ_j I)^k, Ker[(A − μ_j I)^2 + ω_j^2 I]^k,
Im[(A − μ_j I)^2 + ω_j^2 I]^k is A hyperinvariant. So we must show that each
A-hyperinvariant subspace belongs to 𝒮_A. Let M be an A-hyperinvariant
subspace. By Theorem 12.2.1 we have
Write A in the real Jordan form (as in Theorem 12.2.2) and use Theorem
12.4.2 to deduce that each intersection M ∩ ℛ_{λ_j}(A) is A|_{ℛ_{λ_j}(A)} hyperinvariant
and each M ∩ ℛ_{μ_p ± iω_p}(A) is A|_{ℛ_{μ_p ± iω_p}(A)} hyperinvariant (p = 1, …, s). With the use
of Theorem 9.4.2, it follows that M ∩ ℛ_{λ_j}(A) belongs to the smallest lattice
that contains the subspaces
and
where
and
Theorem 12.6.1
Let the transformation A be given by (12.6.1) in some basis.
Then a transformation B has the same invariant subspaces as A if
and only if B has the following matrix form (in the same basis):
where
for some real numbers b_1^(i), … with b_1^(i) ≠ 0 and some (k_{i1} − k_{i2}) ×
(k_{i1} − k_{i2}) matrix F^(i);
with f_1^(j) ≠ 0 and det C_2^(j) ≠ 0 and some 2(l_{j1} − l_{j2}) × 2(l_{j1} − l_{j2}) real matrix
G^(j). Moreover, the real numbers b_1^(1), …, b_1^(p) are different, and the com-
plex numbers are different as well.
For the proof of Theorem 12.6.1, we refer the reader to Soltan (1974).
12.7 EXERCISES
12.8 Find the real Jordan form and all invariant subspaces in ℝ^n of the
real companion matrix
12.10 Find the real Jordan form and all invariant subspaces in ℝ^n of the
matrix
where
12.11 Two linear matrix polynomials λA_1 + B_1 and λA_2 + B_2 with real
matrices A_1, B_1, A_2, and B_2 are called strictly equivalent (over ℝ) if
there exist invertible real matrices P and Q such that P(λA_1 +
B_1)Q = λA_2 + B_2. Prove the following result on the canonical form
for strict equivalence (over ℝ) (the real analog of Theorem
A.7.3). A real linear matrix polynomial λA + B is strictly equivalent
(over ℝ) to a real linear polynomial of the type
12.12 Prove the following analog of the Brunovsky canonical form for real
transformations. Two transformations and
are called block similar if there exist invert-
ible transformations and and a transfor-
mation such that
where J is a matrix in the real Jordan form; B_0 has all zero entries
except for the entries
and these exceptional entries are equal to 1. (Hint: Use Exercise
12.11 in the proof of the Brunovsky canonical form.)
12.13 Let be a full-range pair of transfor-
mations. Prove that, given a sequence S = {λ_1, …, λ_n} of n (not
necessarily distinct) complex numbers such that λ_0 ∈ S implies λ̄_0 ∈ S
and λ̄_0 appears in S exactly as many times as λ_0, there exists a
transformation F: ℝ^n → ℝ^m such that λ_1, …, λ_n are the eigenvalues
of A + BF (counted with multiplicities). (Hint: Use Exercise 12.12.)
Notes to Part 2
Chapter 9. The first two sections contain standard material in linear
algebra [see, e.g., Gantmacher (1959)]. Theorem 9.3.1 is due to Laffey
(1978) and Guralnick (1979). The proof presented here follows Choi,
Laurie, and Radjavi (1981). Theorem 9.4.2 appears in Soltan (1976) and
Fillmore, Herrero, and Longstaff (1977). Our expositions of Theorem 9.4.2
and Section 9.6 follow the latter paper.
Chapter 10. The results and proofs of this chapter are from Soltan
(1973b).
Chapter 11. Theorem 11.2.1 is a well-known result (Burnside's
theorem). It may be found in books on general algebra [see, e.g., Jacobson
(1953)] but generally not in books on linear algebra. In the proof of
Theorem 11.2.1 we follow the exposition from Chapter 8 in Radjavi and
Rosenthal (1973). Other proofs are also available [see Jacobson (1953);
Halperin and Rosenthal (1980); E. Rosenthal (1984)]. Example 11.4.6 and
Theorem 11.4.4 are from Halmos (1971). In the proof of Theorem 11.5.1 we
follow Radjavi and Rosenthal (1973).
Chapter 12. The real Jordan form is a standard result, although not so
frequently included in books on linear algebra as the (complex) Jordan
form. The real Jordan form can be found in Lancaster and Tismenetsky
(1985), for instance. The proof of Theorem 12.5.1 is taken from Soltan (1981).
Part Three
Topological
Properties of
Invariant Subspaces
and Stability
There are a number of practical problems in which it is necessary to obtain
an invariant subspace of a transformation or a matrix by numerical methods.
In practice, numerical computation can be performed with only a finite
degree of precision and, in addition, the data for a problem will generally be
imprecise. In this situation, the best that we can hope to do is to obtain an
invariant subspace of a transformation that is close to the one we really have
in mind. However, simple examples show that although two transformations
may be close (in any reasonable sense), their invariant subspaces can be
completely different. This leads us to the problem of identifying all invariant
subspaces of a given transformation that are "stable" under small pertur-
bations of the transformation—that is, to identify those invariant subspaces
for which the perturbed transformation will have a "close" or "neighbour-
ing" invariant subspace, in an appropriate sense.
To develop these ideas, we must introduce a measure of distance between
subspaces and analyze further the structure of the invariant subspaces of a
given transformation. This is done in Part 3, together with descriptions of
stable invariant subspaces, using different notions of stability.
This machinery is then applied to the study of stability of divisors of
polynomial and rational matrix functions and other problems. The reader
whose interest is confined to the applications of Chapter 17 needs only to
study the material presented in Chapter 13, Section 14.3, and Chapter 15.
Chapter Thirteen
The Metric Space of Subspaces
This chapter is of an auxiliary character. We set forth the basic facts about
the topological properties of the set of subspaces in ℂ^n. Observe that all the
results and proofs of this chapter hold for the set of subspaces in ℝ^n as well.
388 The Metric Space of Subspaces
Note also that θ(ℒ, M) ≤ 1. [This property follows immediately from the
characterization given in condition (13.1.3).] It follows from (13.1.1) that
Theorem 13.1.1
Let M, ℒ be subspaces in ℂ^n. Then
If exactly one of the subspaces ℒ and M is the zero subspace, then the
right-hand side of (13.1.3) is interpreted as 1; if ℒ = M = {0}, then the
right-hand side of (13.1.3) is interpreted as 0. If P_1 and P_2 are projectors with
Im P_1 = ℒ and Im P_2 = M, not necessarily orthogonal, then
Therefore
where
Observe that
Consequently, for every we have
Now
The Gap Between Subspaces 389
Hence by (13.1.6)
So
Theorem 13.1.2
If θ(L, M) < 1, then dim L = dim M.
It also follows directly from this proof that the hypothesis θ(L, M) < 1
implies . In addition, we have
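The gap and its basic properties are easy to explore numerically. The following sketch (Python with NumPy; the helper names are ours, not the book's) computes θ(L, M) = ||P_L − P_M|| for subspaces given by spanning matrices, and illustrates Theorem 13.1.2: a gap below 1 forces equal dimensions, so subspaces of different dimensions have gap at least 1.

```python
import numpy as np

def orth_projector(A):
    """Orthogonal projector onto the column span of A (via QR)."""
    Q, _ = np.linalg.qr(A)
    return Q @ Q.conj().T

def gap(A, B):
    """Gap theta(L, M) = ||P_L - P_M|| in the spectral norm,
    where L = span(A), M = span(B)."""
    return np.linalg.norm(orth_projector(A) - orth_projector(B), 2)

e1, e2 = np.eye(3)[:, [0]], np.eye(3)[:, [1]]

print(gap(e1, e1))              # 0: identical subspaces
print(gap(e1, e2))              # 1: orthogonal lines, the maximal value
print(gap(e1, e1 + 0.1 * e2))   # small: nearby lines have a small gap
# Different dimensions force a gap of 1 (contrapositive of Theorem 13.1.2):
print(gap(e1, np.hstack([e1, e2])))
```

For two lines the gap equals the sine of the angle between them, which is why the third value is close to 0.1.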
For example, to see the first of these, observe that for any x ∈ M there is the
unique decomposition x = y + z with y ∈ L and z ∈ L^⊥, and x = P_M x = P_M y, so
that M ⊆ P_M(L). But the reverse inclusion is obvious, and so we must have
equality.
The following result makes precise the idea that direct sum decompositions
of C^n are stable under small perturbations of the subspaces, as
measured in the gap metric.
Theorem 13.1.3
Let M, M_1 ⊆ C^n be subspaces such that
and
where P_M (P_N) projects C^n onto M (onto N) along M_1, and C is a constant
depending on M and M_1, but not on N. In fact
Proof. Let us prove first that the sum N + M_1 is indeed direct. The
condition that M + M_1 = C^n is a direct sum implies that ||x − y|| ≥ δ > 0 for
every x ∈ S_{M_1} and every y ∈ S_M. Here δ is a fixed positive constant. Take N
so close to M that θ(M, N) < δ/2. Then ||z − y|| < δ/2 for every z ∈ S_N,
where y = y(z) is the orthogonal projection of z on M. Thus for x ∈ S_{M_1} and
z ∈ S_N we have
and the last inequality follows from (13.1.11). This completes the proof of
the theorem. D
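The content of Theorem 13.1.3 can be checked numerically. The sketch below (Python/NumPy; the helper names are ours) builds the projector onto one summand along the other from bases of the two subspaces, then perturbs the first summand slightly: the sum stays direct and the projector moves by an amount of the order of the perturbation.

```python
import numpy as np

def proj_along(A, B):
    """Projector onto span(A) along span(B); assumes span(A) (+) span(B) = C^n."""
    n, k = A.shape
    T = np.hstack([A, B])        # invertible iff the sum is direct and spanning
    E = np.zeros((n, n))
    E[:k, :k] = np.eye(k)
    return T @ E @ np.linalg.inv(T)

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 2))      # basis of M
B = rng.normal(size=(4, 2))      # basis of M_1; generically M (+) M_1 = R^4
P = proj_along(A, B)
assert np.allclose(P @ P, P)     # idempotent
assert np.allclose(P @ A, A)     # acts as the identity on M
assert np.allclose(P @ B, 0)     # annihilates M_1

# A small perturbation N of M still satisfies N (+) M_1 = R^4,
# and the projector along M_1 moves continuously.
N = A + 1e-3 * rng.normal(size=(4, 2))
Q = proj_along(N, B)
print(np.linalg.norm(P - Q, 2))  # small, of the order of the perturbation
```

The constant relating the two quantities depends on how well conditioned the decomposition is, in line with the constant C of the theorem.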
We remark that the definition and analysis of the gap between subspaces
presented in this section extends verbatim to a finite-dimensional vector
space V over C (or over R) on which a scalar product is defined. Namely,
there exists a complex-valued (or real-valued) function defined on all
ordered pairs x, y, where x, y ∈ V, denoted by (x, y), which satisfies the
following properties: (a) (αx + βy, z) = α(x, z) + β(y, z) for every
x, y, z ∈ V and every pair of scalars α, β; (b) (x, y) is the complex conjugate of (y, x);
(c) (x, x) ≥ 0 for all x ∈ V, and (x, x) = 0 if and only if x = 0.
There are notions of the "minimal angle" and the "spherical gap" between
two subspaces that are closely related to the gap between the subspaces. The
basic facts about these notions are presented in this and the next section. It
should be noted, however, that these notions and their properties are used
(apart from Sections 13.2 and 13.3) only in Section 13.8 and in the proof of
Theorem 15.2.1.
Given two subspaces L, M ⊆ C^n, the minimal angle φ_min(L, M) (0 ≤
φ_min(L, M) ≤ π/2) between L and M is determined by
Indeed, writing
and writing β = u + iv, where u and v are real, we see easily that the
function f(u, v) = 1 + β̄(x, y) + β(y, x) + |β|² of the two real variables u and v
has its minimum for u = −½((x, y) + (y, x)) and v = ½(i(y, x) − i(x, y)),
that is, when β = −(y, x). Thus
The Minimal Angle and the Spherical Gap 393
Proposition 13.2.1
For two nontrivial subspaces L and M, φ_min(L, M) > 0 if and only if L ∩ M = {0}.
We also need the notion of the "spherical gap" between subspaces. For
nonzero subspaces L, M in C^n the spherical gap Θ(L, M) is defined by
We also put Θ({0}, L) = Θ(L, {0}) = 1 for every nonzero subspace L in C^n
and Θ({0}, {0}) = 0. The spherical gap is also a metric on the set of all
subspaces in C^n. Indeed, the only nontrivial statement that we have to verify
for this purpose is the triangle inequality:
for all subspaces L, M, and N in C^n. If at least one of L, M, and N is the
zero subspace, (13.2.4) is evident (observe that Θ(L, M) ≤ 2 for all subspaces
L, M ⊆ C^n). So we can assume that L, M, and N are nonzero. Given
x ∈ S_L, let z_x ∈ S_M be such that ||x − z_x|| = d(x, S_M). Then for every y ∈ S_N
we have
It remains to take the supremum over x ∈ S_L and repeat the argument with
the roles of N and L interchanged, in order to verify (13.2.4).
In fact, the spherical gap Θ is not far from the gap θ in the following
sense:
The left inequality here follows from (13.1.3). To prove the right inequality
in (13.2.5) it is sufficient to check that for every x ∈ C^n with ||x|| = 1 we
have
Proposition 13.2.2
For any three subspaces L, M, N ⊆ C^n,
Theorem 13.2.3
Let M, N be subspaces in C^n such that M ∩ N = {0}. Then for every pair of
subspaces M_1, N_1 ⊆ C^n such that
and
If M_1 + N_1 ≠ C^n, then there exists a vector x ∈ C^n with ||x|| = 1 and
for all y ∈ M_1 + N_1 [e.g., one can take x ∈ (M_1 + N_1)^⊥]. We
can represent the vector x as x = y + z, y ∈ M, z ∈ N. It follows from
the definition of sin φ_min(M, N) that
and
Minimal Opening and Angular Linear Transformation 397
where Q_1 is the orthogonal projector onto M_1. But then we can use (13.3.2)
to obtain formula (13.3.3). If M_2 = {0}, then (13.3.3) holds trivially.
We use the notion of the minimal opening between two subspaces to
describe the behaviour of angular transformations when the corresponding
projectors are allowed to change.
Lemma 13.3.1
Let Π_0 be a projector defined on C^n, and let Π be another projector on C^n
such that Then, provided
we have the following estimate for the norm of the angular transformation R
of Im Π with respect to Π_0:
For 0 we have
Taking the infimum over all and using inequality (3.1), one sees
that
Lemma 13.3.2
Let P and P^× be projectors defined on C^n such that C^n = Im P ∔ Im P^×. Then
for any pair of projectors Q and Q^× defined on C^n with
sufficiently small, we have C^n = Im Q ∔ Im Q^×, and there exists an
invertible transformation which maps Im Q onto Im P, Im Q^× onto
Im P^×, and
As
equation (13.3.9) implies that
But then we may apply Theorem 13.2.3 combined with (13.2.5) to show that
Further
As this implies
that
We have already seen in Section 13.1 that the set G(C^n) of all subspaces in
C^n is a metric space with respect to the gap θ(L, M). In this section we
investigate some topological properties of G(C^n), that is, those properties
that depend on convergence (or divergence) in the sense of the gap metric.
Theorem 13.4.1
The metric space (£(£") is compact, and, therefore, complete (as a metric
space).
It is easily seen that d(u, v) is a metric on S_m, thus turning S_m into a metric
space. For each u = {u_k}_{k=1}^m ∈ S_m define A_m u = Span{u_1, . . . , u_m} ∈ G_m.
In this way we obtain a map A_m : S_m → G_m of the metric spaces S_m and G_m.
We prove that the map A_m is continuous. Indeed, let L ∈ G_m and let
v_1, . . . , v_m be an orthonormal basis in L. Pick some u ∈ S_m
(which is supposed to be in a neighbourhood of v). For u_i,
i = 1, . . . , m, we have (where M = A_m u and P_N stands for the orthogonal
projector on the subspace N):
Now, since we find that |a_i| ≤ 1 and Σ_{i=1}^m |a_i| ≤ m, and
so
Fix some y ∈ S_{L^⊥}. We wish to evaluate P_M y. For every x ∈ L, write
and
and
But
Combining (13.4.2) and (13.4.3), we find that \(t, PMy)\ <3m6(w, v) for
every re f" with ||/|| = 1. Thus
Now we can easily prove the continuity of A_m. Pick an x ∈ C^n with
||x|| = 1. Then, using (13.4.1) and (13.4.4), we have
so
(in the last inequality we have used the fact that the norm of an orthogonal
projector is 1); so P_M x = x, and x ∈ M.
Using Theorems 13.4.2 and 13.1.3, one obtains the following fact.
Theorem 13.4.3
Let L and M be direct complements to each other in C^n, and let
{L_m}_{m=1}^∞ and {M_m}_{m=1}^∞ be sequences of subspaces such that
Moreover, there exists a constant K > 0 depending on L and M only such that
Observe that, in view of Theorem 13.1.3, the subspaces L_m and M_m are
direct complements to each other for sufficiently large m.
where
As usual, d(x, N) = inf{||x − y|| : y ∈ N} is the distance between x ∈ C^n and
a subset N ⊆ C^n. In particular, for m large enough we find that
which is a contradiction.
The proof of Theorem 13.4.3 shows that actually equation (13.4.5) holds
with
Proposition 13.4.4
The set G_m(C^n) of all m-dimensional subspaces in C^n is connected.
That is, for every M, N ∈ G_m(C^n) there exists a continuous function
f : [0, 1] → G_m(C^n) such that f(0) = M, f(1) = N (and the continuity of f is
understood in the gap metric).
Proof. Using the proof of Theorem 13.4.1 and the notation introduced
there, we must show that the set S_m is connected. As any orthonormal
system u_1, . . . , u_m in C^n can be completed to an orthonormal basis in C^n,
the connectedness of S_m would follow from the connectedness of the group
U(C^n) of all n × n unitary matrices. To show that U(C^n) is connected,
observe that any X ∈ U(C^n) has the form
where S is unitary and θ_1, . . . , θ_n are real numbers (see Section 1.9). So
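The connecting path is explicit: with X = S diag(e^{iθ_1}, . . . , e^{iθ_n}) S*, the curve X(t) = S diag(e^{itθ_1}, . . . , e^{itθ_n}) S* runs inside the unitary group from I (at t = 0) to X (at t = 1). A numerical sketch (Python/NumPy; function names are ours, and we assume the sampled unitary has distinct eigenvalues, which holds generically):

```python
import numpy as np

def unitary_path(X, t):
    """Point at time t on a path in U(n) from I to X, using the spectral
    form X = S diag(e^{i theta_k}) S*. Assumes distinct eigenvalues."""
    w, S = np.linalg.eig(X)                  # eigenvalues on the unit circle
    theta = np.angle(w)
    return S @ np.diag(np.exp(1j * t * theta)) @ np.linalg.inv(S)

rng = np.random.default_rng(2)
Z = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
X, _ = np.linalg.qr(Z)                       # a random unitary matrix

assert np.allclose(unitary_path(X, 0.0), np.eye(4))   # path starts at I
assert np.allclose(unitary_path(X, 1.0), X)           # path ends at X
U = unitary_path(X, 0.5)                     # midpoint is again unitary
assert np.allclose(U @ U.conj().T, np.eye(4), atol=1e-8)
```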
Similarly, one can prove that the set G_m(R^n) of all m-dimensional
subspaces in R^n is connected. To this end use the facts that any orthonormal
system u_1, . . . , u_m in R^n (m < n) can be completed to an orthonormal basis
u_1, . . . , u_n with det[u_1, u_2, . . . , u_n] = 1 and that the set U_+(R^n) of all
orthogonal n × n matrices with determinant 1 is connected. Recall that a
real n × n matrix U is called orthogonal if U^T U = U U^T = I.
For completeness, let us prove the connectedness of U_+(R^n). It follows
from Theorem 12.1.4 that any X ∈ U_+(R^n) admits the representation
Theorem 13.5.1
Let X : C^n → C^m be a transformation, and let P_X be a projector onto Ker X.
Then there exists a constant K > 0, depending only on X and P_X, with the
following property: for every transformation Y : C^n → C^m with dim Ker Y =
dim Ker X there exists a projector P_Y onto Ker Y such that
In particular
Proof. It will suffice to prove (13.5.1) for all those Y with dim Ker Y =
dim Ker X that are sufficiently close to X, that is, ||X − Y|| < ε, where ε > 0
depends on X and P_X only. Indeed, for Y with dim Ker Y = dim Ker X and
||X − Y|| ≥ ε, use the orthogonal projector P_Y onto Ker Y and the fact that
to obtain (13.5.1) (maybe with a
bigger constant K).
Consider first the case when X is right invertible. There exists a right
inverse X^I of X such that Im X^I = Im(I − P_X) (cf. Theorem 1.5.5), and then
X^I X = I − P_X (indeed, both sides are projectors with the same kernel and
the same image). It is easy to verify that any transformation Y : C^n → C^m
with the property
Kernels and Images of Linear Transformations 407
is also right invertible, and one of the right inverses Y^I is given by the
formula Y^I = Z X^I, where
Indeed, we have
and hence
Now consider the case when X is not right invertible, and let r be the
dimension of a complementary subspace N to Im X in C^m. Consider the
transformation
Note that the equality dim Ker Ỹ = dim Ker X̃ holds automatically for ε
small enough because then such Ỹ will also be right invertible (see the first
part of this proof). Apply (13.5.4) for Ỹ of the form Ỹ(x + y) = Yx + Ly;
x ∈ C^n, y ∈ C^r, where Y : C^n → C^m is a transformation such that ||X − Y|| <
ε and dim Ker Y = dim Ker X. Let us check that Ker Ỹ ⊆ C^n. Indeed
and since Ker Y ⊆ Ker Ỹ, we have in fact Ker Ỹ = Ker Y and thus Ker Ỹ ⊆
C^n. Now put P_X = P_X̃|_{C^n}, P_Y = P_Ỹ|_{C^n} to satisfy (13.5.1), for the transformations
The condition dim Ker Y = dim Ker X is clearly necessary for inequality
(13.5.1), since otherwise we obtain a contradiction with Theorem 13.1.2
on taking a Y : C^n → C^m such that
A result analogous to Theorem 13.5.1 also holds for the images of linear
transformations. The statement of this result is obtained from Theorem
13.5.1 by replacing Ker X and Ker Y by Im X and Im Y, respectively, and
its proof is reduced to Theorem 13.5.1 by observing that Im A = (Ker A*)^⊥
for a linear transformation A and that θ(M, N) = θ(M^⊥, N^⊥) for any
subspaces M, N ⊆ C^n.
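Both reductions are easy to verify numerically. In the sketch below (Python/NumPy; the helper names are ours), image and kernel are extracted from the SVD, and the identity Im A = (Ker A*)^⊥ shows up as orthogonality together with complementary dimensions:

```python
import numpy as np

def svd_spaces(A, tol=1e-10):
    """Orthonormal bases of Im A and Ker A, read off the SVD of A."""
    U, s, Vh = np.linalg.svd(A)
    r = int((s > tol).sum())                 # numerical rank
    return U[:, :r], Vh.conj().T[:, r:]      # (basis of Im A, basis of Ker A)

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])              # rank 1, mapping C^3 to C^2
im_A, _ = svd_spaces(A)
_, ker_Astar = svd_spaces(A.conj().T)

# Im A is orthogonal to Ker A*, and the dimensions add up to dim C^2,
# so Im A = (Ker A*)^perp.
assert np.allclose(ker_Astar.conj().T @ im_A, 0)
assert im_A.shape[1] + ker_Astar.shape[1] == A.shape[0]
```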
Proposition 13.6.1
Let B(t) be a continuous m x n complex matrix function on K such that
rank B(t) = p is independent of t on K. Then Ker B(t) and Im B(t) are
continuous families of subspaces on K.
Continuous Families of Subspaces 409
where b_i(t) is the ith column of B(t). Let b_{ij}(t) be the (i, j)th entry of B(t),
and let D(t) = [b_{ij}(t)]_{i,j=1}^p and C(t) = [b_{ij}(t)]. Then the matrix
Corollary 13.6.2
Let P(t) be a continuous projector-valued function on K. Then Im P(t) and
Ker P(t) are continuous families of subspaces on K.
We have to show that rank P(t) is constant if the projector function P(t)
is continuous. But this follows from inequality (13.1.4) and the fact that the
set of subspaces of fixed dimension is open in the set of all subspaces in C^n
(Theorem 13.1.2).
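Corollary 13.6.2 can be watched in action: for a continuous projector family P(t) of constant rank, the gap between Im P(t) and Im P(s) tends to 0 as s → t. A sketch with a concrete rank-one family (our construction, for illustration only):

```python
import numpy as np

def P(t):
    """Rank-one orthogonal projector onto span{(cos t, sin t)}, continuous in t."""
    v = np.array([[np.cos(t)], [np.sin(t)]])
    return v @ v.T

def image_gap(t, s):
    """Gap between Im P(t) and Im P(s); here P(t) is already orthogonal."""
    return np.linalg.norm(P(t) - P(s), 2)

for h in [1e-1, 1e-2, 1e-3]:
    print(image_gap(0.3, 0.3 + h))           # shrinks roughly like |h|
```

For this family the gap equals |sin(t − s)|, the sine of the angle between the two image lines.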
The following characterization of continuous families of subspaces is very
useful.
Theorem 13.6.3
Let L(t) be a family of subspaces (of C^n) on a connected compact subset K
of R^m. Then the following properties are equivalent: (a) L(t) is continuous;
(b) for each t ∈ K there exists an invertible transformation S(t) : C^n → C^n
which depends continuously on t for t ∈ K, and there exists a subspace
M ⊆ C^n such that L(t) = S(t)M for all t ∈ K; (c) for each t_0 ∈ K there exist a
neighbourhood U_{t_0} of t_0 in K, an invertible transformation S_{t_0}(t) : C^n → C^n
that depends continuously on t in U_{t_0}, and a subspace M_{t_0} ⊆ C^n such that
We prove Theorem 13.6.3 only for the case K = [0, 1] (of course, the case
when K ⊆ R is easily reduced to this one). The proof when K is a connected
compact set of R^m requires mathematical tools that are beyond the scope of
this book [see Gohberg and Leiterer (1972) for the complete proof].
to satisfy (b).
Obviously, (b) implies (c). Finally, let us prove that (c) implies (a). Given
S_{t_0} and M_{t_0} as in (c), let P_0 be the orthogonal projector on M_{t_0}. Then
S_{t_0}(t) P_0 (S_{t_0}(t))^{-1} is a projector on L(t); therefore, for t ∈ U_{t_0} we have
As S_{t_0}(t) is continuous and invertible in U_{t_0}, its inverse is continuous as
well.
Corollary 13.6.4
Let L(t) be a continuous family of subspaces (of C^n) on K, where K ⊆ R^m is
a connected compact set. Then there exists a continuous basis x_1(t), . . . , x_p(t)
in L(t), where p = dim L(t). (Note that because of the connectedness of K the
dimension of L(t) is independent of t on K.)
Corollary 13.6.5
Let B(t) be a continuous m × n matrix function on a connected compact set
K, such that rank B(t) = p is independent of t. Then there exists a
Applications to Generalized Inverses 411
Theorem 13.7.1
Let X : C^n → C^m be a transformation with a generalized inverse X^I : C^m → C^n.
Then there exist constants K > 0 and ε > 0 with the property that every
transformation Y : C^n → C^m with ||Y − X|| < ε and dim Ker Y = dim Ker X
has a generalized inverse Y^I satisfying
a positive number ε_2 ≤ ε, such that for any Y ∈ B(X) with ||X − Y|| < ε_2 we
can find an invertible transformation S
and
and
Observe that the complete analog of Theorem 13.5.1 does not hold for
generalized inverses. Namely, given X and X^I as in Theorem
13.7.1, in general there is no positive constant K such that every transformation
Y : C^n → C^m with dim Ker Y = dim Ker X has a generalized inverse
Y^I satisfying (13.7.1). To produce an example of such a situation, take
n = m and let X : C^n → C^n be invertible. Then there is only one generalized
inverse of X, namely, its inverse X^{−1}. Further, let Y = aX, where a ≠ 0. If
(13.7.1) were true, we would have for some K > 0 and all a:
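The failure is easy to see numerically: with X invertible and Y = aX, the only generalized inverse of Y is a^{-1}X^{-1}, so ||X^{-1} − Y^I|| grows without bound as a → 0 while ||X − Y|| stays bounded, and no single constant K can satisfy (13.7.1). A sketch (Python/NumPy):

```python
import numpy as np

X = np.array([[2.0, 1.0],
              [0.0, 1.0]])       # invertible, so X^I = X^{-1} is unique
Xinv = np.linalg.inv(X)

for a in [1e-1, 1e-2, 1e-3]:
    Y = a * X                    # still invertible: dim Ker Y = dim Ker X = 0
    Yinv = np.linalg.inv(Y)      # the only generalized inverse: (1/a) X^{-1}
    lhs = np.linalg.norm(Xinv - Yinv, 2)     # blows up like 1/a
    rhs = np.linalg.norm(X - Y, 2)           # stays bounded by ||X||
    print(lhs / rhs)             # unbounded: no constant K in (13.7.1) works
```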
Theorem 13.7.2
Let B(t) be a continuous m × n matrix function on a connected compact set
K ⊆ R^q such that rank B(t) = p is independent of t. Then there exists a
continuous n × m matrix function X(t) on K such that, for every t ∈ K, X(t)
is a generalized inverse of B(t).
is invertible, so
for some complex numbers a_{ij}(t) that also depend on t. Again, the a_{ij}(t) are
continuous on K. Indeed, a_{ij}(t) is the unique solution of the linear system of
equations
where a(t) is the p²-dimensional vector formed by the a_{ij}(t), j = 1, . . . , p;
i = n − p + 1, . . . , n, and A(t) and C(t) are suitable matrix and vector
functions, respectively, which are continuous in t. As the solution of (13.7.4)
exists and is unique for every t ∈ K, it follows that the columns of A(t) are
linearly independent for every t ∈ K. Now fix t_0 ∈ K, and assume for
simplicity of notation that the upper p² rows of A(t_0) are linearly independent.
Partition
where A_0(t) and C_0(t) are the top p² rows of A(t) and C(t), respectively.
Then A_0(t_0) is nonsingular; as A(t) is continuous in t, the matrix A_0(t) is
nonsingular for every t in some neighbourhood U_{t_0} of t_0 in K. It follows
that
Corollary 13.7.3
Let B(t) be a continuous m × n matrix function on a connected compact set
K ⊆ R^q such that, for every t ∈ K, the matrix B(t) is left invertible (resp. right
invertible). Then there exists a left inverse (resp. right inverse) X(t) of B(t)
such that X(t) is a continuous function of t on K.
Subspaces of Normed Spaces 415
Until now we have studied the notions of gap, minimal angle, minimal
opening, and so on for subspaces of C^n, where the norm of a vector
x = (x_1, . . . , x_n) is Euclidean: ||x|| = (Σ_{i=1}^n |x_i|²)^{1/2}. Here we show how
these notions can be extended to the framework of a finite-dimensional
linear space with a norm that is not necessarily generated by a scalar
product.
Let V be a finite-dimensional linear space over C or over R. A real-valued
function defined for all elements x ∈ V, denoted by ||x||, is called a
norm if the following properties are satisfied: (a) ||x|| ≥ 0 for all x ∈ V,
and ||x|| = 0 if and only if x = 0; (b) ||λx|| = |λ| ||x|| for every x ∈ V and every
scalar λ (so λ ∈ C or λ ∈ R according as V is over C or over R);
(c) ||x + y|| ≤ ||x|| + ||y|| for all x, y ∈ V (the triangle inequality).
where x = (x_1, . . . , x_n) belongs to C^n (or to R^n). We have used this norm
throughout the book. Actually, this is a particular case of Example 13.8.1
(with the basis f_i = e_i, i = 1, . . . , n in C^n (or R^n) and p = 2).
Proposition 13.8.1
Let f_1, . . . , f_n be a basis in V, and let || · || be a norm on V. Then, given ε > 0,
there exists a δ > 0 such that the inequality
holds provided |x_j − y_j| < δ for j = 1, . . . , n, where x = Σ_{j=1}^n x_j f_j and
Theorem 13.8.2
Let || · ||′ and || · ||″ be two norms on V. Then there exists a constant K ≥ 1
such that
We stress that K depends on || · ||′ and || · ||″ only (and of course on
the underlying linear space V).
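For the concrete pair || · ||₁ and || · ||_∞ on C^n, the constant K = n works, since ||x||_∞ ≤ ||x||₁ ≤ n ||x||_∞. A quick randomized check (ours, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
K = n                               # K = n suffices for this pair of norms
for _ in range(1000):
    x = rng.normal(size=n) + 1j * rng.normal(size=n)
    one = np.abs(x).sum()           # ||x||_1
    inf = np.abs(x).max()           # ||x||_inf
    # K^{-1} ||x||_inf <= ||x||_1 <= K ||x||_inf, as in Theorem 13.8.2
    assert inf / K <= one <= K * inf
print("K =", K, "works for the 1-norm versus the sup-norm on C^5")
```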
function g attains its maximum and minimum on this bounded set. So there
exist x_1, x_2 ∈ V such that ||x_1||′ = ||x_2||′ = 1 and
be the unit sphere of M. Now the gap θ(L, M) between the subspaces L and
M in V is defined by formula (13.1.3):
EXAMPLE 13.8.3. Let R² be the normed space with the norm
So
Let α < β < γ be positive numbers such that β < 1 < γ and βγ < 1. We
compute
and
However, clearly
so the inequality
holds for sufficiently small positive α, and the triangle inequality for the gap
fails in this particular case.
is a metric. (The verification of this fact is exactly the same as that given in
Section 13.2.) Instead of inequality (13.2.5), we have in the case of a general
normed space the weaker inequality
But
and we have
Theorem 13.8.3
Let || · ||′ and || · ||″ be two norms on V, with corresponding gaps θ′(M, N)
and θ″(M, N) between subspaces M and N in V. Then there exists a constant
L ≥ 1 such that
Again, the constant L depends on the norms || · ||′ and || · ||″ only.
13.9 EXERCISES
13.6 Show that the equality θ(L, M) = 1 holds if and only if either
L^⊥ ∩ M ≠ {0} or L ∩ M^⊥ ≠ {0} (or both).
13.7 Let M_1, N_1 be subspaces in C^n and M_2, N_2 be subspaces in C^m.
Prove that
where
13.8 Find the gaps θ(Ker A, Ker B) and θ(Im A, Im B) for the following
pairs of transformations:
(a) A and B are diagonal in the same orthonormal basis.
(b) A and B are commuting normal transformations.
(c) A and B are circulant matrices in the same orthonormal basis.
(Hint: A and B can be simultaneously diagonalized by a unitary
matrix.)
Let 𝒜 ⊆ ℬ be two sets of subspaces of C^n. We say that 𝒜 is connected in ℬ
if for any subspaces L, M ∈ 𝒜 there is a continuous function f : [0, 1] → ℬ
such that f(0) = L, f(1) = M. [The continuity of f is understood in the gap
metric. Thus, for every t_0 ∈ [0, 1] and every ε > 0 there is a δ > 0 such that
imply
The set 𝒜 is called connected if 𝒜 is connected in 𝒜.
We start the study of connectedness of the set Inv(A) with the case when
A = J, a Jordan matrix with σ(J) = {0}. Let r be the geometric multiplicity
of the eigenvalue 0 of J, and let k_1 ≥ · · · ≥ k_r be the sizes of the Jordan
blocks in J. Also, denote the set of all p-dimensional J-invariant subspaces
by Inv_p.
Let (l_1, . . . , l_r) be an ordered r-tuple of integers such that 0 ≤ l_i ≤ k_i,
Σ_{i=1}^r l_i = p, and let Φ_p be the set of all such r-tuples. We associate every
such tuple with the subspace F(l) ∈ Inv_p spanned by the vectors u_j^{(i)},
j = 0, . . . , l_i − 1; i = 1, . . . , r, where the u_j^{(i)} are unit coordinate vectors in C^n:
the sole nonzero coordinate of u_j^{(i)} is equal to one and is in the place
k_1 + · · · + k_{i−1} + j + 1 (we assume k_0 = 0) for j = 0, . . . , k_i − 1 and i =
1, . . . , r. There is a one-to-one correspondence between elements of Φ_p and
424 The Metric Spaces of Invariant Subspaces
Lemma 14.1.1
Φ_p is connected in Inv_p.
Lemma 14.1.2
Let F ∈ Inv_p. Then F is connected in Inv_p with some F′ ∈ Φ_p.
by additional vectors to a basic set of vectors in
Connected Components: The Case of One Eigenvalue 425
such that
Theorem 14.1.3
Assume that the transformation A : C^n → C^n has only one eigenvalue λ_0. Then
Inv(A) has exactly n + 1 connected components, and each connected component
consists of all A-invariant subspaces of fixed dimension.
Theorem 14.2.1
Let λ_1, . . . , λ_c be all the different eigenvalues of A, and let ψ_1, . . . , ψ_c be
their respective algebraic multiplicities. Then for every integer p, 0 ≤ p ≤ n,
Connected Components: The General Case 427
and for every ordered c-tuple of integers (χ_1, . . . , χ_c) such that 0 ≤ χ_i ≤ ψ_i,
i = 1, . . . , c, and
Write A|_N as a matrix in the basis a_1, . . . , a_p, and for every A-invariant
subspace N′ that belongs to V(N), write A|_{N′} as a matrix in the basis
P_{N′}a_1, . . . , P_{N′}a_p. Using formula (14.2.2) and the continuity of the trace,
we see that there exists a δ > 0 such that, if θ(N, N′) < δ and N′ is A
invariant, then
Since χ_i(N′) assumes only integer values, it follows that χ_i(N′) is constant
in some neighbourhood of N in Inv(A) and, therefore, constant in the
connected component of Inv(A) that contains N.
We show now that if N and N′ are p-dimensional A-invariant subspaces
such that χ_i(N) = χ_i(N′) for i = 1, . . . , c, then N and N′ are connected in
Inv(A). Indeed, applying Theorem 14.1.3 to each restriction A|_{R_{λ_i}(A)} for
i = 1, . . . , c, we find that N ∩ R_{λ_i}(A) is connected with N′ ∩ R_{λ_i}(A) in the
set of all A-invariant subspaces of dimension χ_i(N) in R_{λ_i}(A). Since
and similarly for N′, it follows that N and N′ are connected in Inv(A).
It remains to show that, given integers χ_1, . . . , χ_c such that 0 ≤ χ_i ≤ ψ_i
and Σ_{i=1}^c χ_i = p, there exists a subspace N ∈ Inv(A) with χ_i(N) = χ_i for
i = 1, . . . , c. But assuming that A is in Jordan form, we can always choose
an N spanned by appropriate coordinate unit vectors. □
Corollary 14.2.2
The set Inv(A) has exactly Π_{i=1}^c (ψ_i + 1) connected components, where
ψ_1, . . . , ψ_c are the algebraic multiplicities of the different eigenvalues
λ_1, . . . , λ_c of A, respectively.
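Corollary 14.2.2 gives a count that is simple to compute from the spectrum. The sketch below is ours (Python/NumPy); the eigenvalue clustering by rounding is a crude device that assumes well-separated eigenvalues and is not part of the book:

```python
import numpy as np
from collections import Counter

def inv_components(A, decimals=6):
    """Number of connected components of Inv(A): the product of (psi_i + 1)
    over the algebraic multiplicities psi_i of the distinct eigenvalues
    (Corollary 14.2.2). Eigenvalues are clustered by rounding."""
    w = np.round(np.linalg.eigvals(A), decimals)
    multiplicities = Counter(w.tolist())
    out = 1
    for psi in multiplicities.values():
        out *= psi + 1
    return out

# diag(1, 1, 2): psi = (2, 1), so (2 + 1) * (1 + 1) = 6 components.
assert inv_components(np.diag([1.0, 1.0, 2.0])) == 6
# A single eigenvalue of multiplicity n gives n + 1 components (Thm 14.1.3).
assert inv_components(np.diag([3.0, 3.0, 3.0, 3.0])) == 5
```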
The proof of Theorems 14.1.3 and 14.2.1 shows in more detail how the
subspaces in Inv(A) belonging to the same connected component are
connected. We say that a vector function x(t) defined for t ∈ [0, 1] and with
values in C^n is piecewise linear continuous if there exist m points 0 < t_1 <
· · · < t_m < 1 and vectors y_1, . . . , y_{m+1} and z_1, . . . , z_{m+1} such that, for
i = 1, . . . , m + 1
Corollary 14.2.3
Let M and N be p-dimensional A-invariant subspaces that belong to the same
connected component in Inv(A). Then there exist piecewise linear continuous
vector functions v_1(t), . . . , v_p(t) such that, for all t ∈ [0, 1], the subspace
Span{v_1(t), . . . , v_p(t)} is p-dimensional, A invariant, and
Theorem 14.3.1
An A-invariant subspace M is isolated if and only if, for every eigenvalue λ_0
of A with dim Ker(λ_0 I − A) ≥ 2, either
Lemma 14.3.2
An A-invariant subspace M is isolated if and only if, for every eigenvalue λ_0 of
A, the subspace M ∩ R_{λ_0}(A) is isolated as an A|_{R_{λ_0}(A)}-invariant subspace.
Proof. We have
is itself.
We show now that, for every ε > 0, there exists a δ > 0 such that, for any
A-invariant subspace with
the inequalities
hold for i = 1, . . . , r. Indeed, arguing by contradiction,
assume that for some ε > 0 and some i there exists a sequence of
A-invariant subspaces such that
as m → ∞, but
Proposition 14.3.3
Every inaccessible A-invariant subspace is isolated.
Lemma 14.3.4
An A-invariant subspace M is inaccessible if and only if, for every eigenvalue
λ_0 of A, the subspace M ∩ R_{λ_0}(A) is inaccessible as an A|_{R_{λ_0}(A)}-invariant
subspace.
The proof of Lemma 14.3.4 is left to the reader. (It can be obtained along
the same lines as the proof of Lemma 14.3.2.)
Theorem 14.3.5
Every inaccessible (equivalently, isolated) A-invariant subspace is A
hyperinvariant.
The converse of Theorem 14.3.5 does not hold in general, as the next
example shows.
so
where
Theorem 14.4.1
Let A : C^n → C^n be a transformation with partial multiplicities m_1, . . . , m_k
(so m_1 + · · · + m_k = n). Then there exists a reducing A-invariant subspace of
dimension p if and only if p is admissible, that is, p is the sum of some
partial multiplicities m_{i_1}, . . . , m_{i_s}. In this case the set of all reducing A-
invariant subspaces of dimension p is open in the set of all A-invariant
subspaces.
Now consider the question of whether (for admissible p) the set Rinv_p(A)
of all p-dimensional reducing subspaces for A is dense in the set Inv_p(A) of
all p-dimensional A-invariant subspaces. We see later that the answer is, in
general, no. So a problem arises as to how one can describe the situations
when Rinv_p(A) is dense in Inv_p(A) in terms of the Jordan structure of A.
We need some preparation to state the results. Let A : C^n → C^n be a
transformation with a single eigenvalue and partial multiplicities
It follows from Section 4.1 that the partial multiplicities p_1 ≥ · · · ≥ p_t of the
restriction of A to an A-invariant subspace M satisfy the inequalities
where the union is taken over the finite set of all p-admissible sequences
For each p-admissible sequence
Theorem 14.4.2
For a fixed admissible integer p, the set Rinv_p(A) is dense in Inv_p(A) if and
only if the following condition holds: any p-admissible sequence p_1 ≥ · · · ≥ p_t
for which the number s_t(A; p_1, . . . , p_t) attains its maximal value among all
p-admissible sequences has the form p_1 = m_{i_1}, . . . , p_t = m_{i_t} for some
indices i_1, . . . , i_t. In particular, Rinv_p(A) is dense in Inv_p(A) provided
there is only one p-admissible sequence for which
is maximal.
Theorem 14.4.3
For every p-admissible sequence p_1 ≥ · · · ≥ p_t the set Inv_p(A; p_1, . . . , p_t) is,
in the topology induced by the gap metric, a connected complex manifold
whose (complex) dimension is equal to
For the proof of Theorem 14.4.3 we refer the reader to Shayman (1982).
Proof of Theorem 14.4.2. Assume that the condition fails, that is, there
exists a p-admissible sequence with maximal
Reducing Invariant Subspaces 435
Let us give an example showing that, for an admissible p, Rinv_p(A) is not
generally dense in Inv_p(A).
In the next example Rinv_p(A) is dense in Inv_p(A) for all admissible p.
Corollary 14.4.4
If the transformation A : C^n → C^n has only one eigenvalue λ_0 and
dim Ker(λ_0 I − A) = 2, then Rinv_p(A) is dense in Inv_p(A) for every p such
that Rinv_p(A) is not empty.
Theorem 14.5.1
The set Coinv(A) of all coinvariant subspaces for a transformation
A : C^n → C^n is open and dense in the set G(C^n) of all subspaces in C^n.
Furthermore, the set Coinv_p(A) of all A-coinvariant subspaces of a fixed
dimension p is connected.
for all values that are large enough. For such values the subspace
Span{· · ·} is a direct complement to N.
it follows that for η ≠ 0 and close
enough to zero. To show that M belongs to the closure of
the set of all A-coinvariant subspaces, it remains to prove that
To prove this, assume for simplicity of notation that the upper p rows in
[v_1 · · · v_p] are linearly independent. Then the same will be true for the upper
p rows (for η close enough to zero). Write
(see the proof of Theorem 14.5.1). But the subspace Span{e_2, e_3 + ηe_4} is
not A semiinvariant for η ≠ 0. Indeed, suppose that
Theorem 14.5.2
For any transformation A : C^n → C^n, the set Sinv_p(A) of all A-semiinvariant
subspaces of a fixed dimension p is connected.
Theorem 14.6.1
If A has only one eigenvalue, and this eigenvalue is real, then the set Inv_p(A)
of all A-invariant subspaces of fixed dimension p is connected.
The proof of Theorem 14.6.1 will be modeled after the proof of Theorem
14.1.3, taking into account the fact that in some basis in R^n the
transformation A has the real Jordan form (see Section 12.2). We apply the
following fact.
Lemma 14.6.2
The set GLr(n) of all real invertible n x n matrices has two connected
components; one contains the matrices with positive determinant, the other
contains those with negative determinant.
Proof. Let T be a real matrix with det T > 0 and let J be a real Jordan
form for T. We first show that J can be connected in GL_r(n) to a diagonal
matrix K with diagonal entries ±1. Indeed, J may have blocks J_p of two
types: first
respectively, for t ∈ [0, 1]. Then J_p(t) determines a continuous path of real
invertible matrices such that J_p(0) = J_p and J_p(1) is an identity matrix.
Applying the above procedures to every diagonal block in J, we see that J
is connected to K by a path in GL_r(n). Now observe that the path in GL_r(2)
defined for t ∈ [0, 1] by
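The determinant is the continuous invariant that separates the two components of Lemma 14.6.2: along any continuous path in the space of real matrices from a matrix of positive determinant to one of negative determinant, the determinant must vanish somewhere (intermediate value theorem), so the path leaves GL_r(n). A bisection sketch on the straight-line segment (our illustration):

```python
import numpy as np

def singular_point_on_segment(A, B, iters=80):
    """Given det A > 0 > det B, bisect along the segment (1-t)A + tB to
    locate a parameter t where the determinant changes sign, i.e. where
    this particular path passes through a singular matrix."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if np.linalg.det((1 - mid) * A + mid * B) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

A = np.eye(2)                      # determinant +1
B = np.diag([-1.0, 1.0])           # determinant -1
t = singular_point_on_segment(A, B)
print(t, np.linalg.det((1 - t) * A + t * B))   # t = 0.5, determinant near 0
```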
Theorem 14.6.3
If the transformation A : R^n → R^n has the only eigenvalues α ± iβ, where α
and β are real and β ≠ 0, then again the set Inv_p(A) of all A-invariant
subspaces of fixed dimension p is connected.
Note that under the condition of Theorem 14.6.3, A does not have
odd-dimensional invariant subspaces (in particular, n is even), so we can
assume that p is even (see Proposition 12.1.1).
It is easily seen from the proof of Theorem 12.3.1 that this correspondence
is actually a homeomorphism φ : Inv_p(A) → Inv
Now the connectedness of Inv_p(A) follows from the connectedness of
Inv
(Theorem 14.1.3). □
where λ_1, . . . , λ_s are all the distinct real eigenvalues of A (if any) and
μ_1, . . . , μ_t are all the distinct eigenvalues of A in the open upper
half plane. Using this observation, the proof of Theorem 14.2.1 yields the
following description of the connected components in the metric space
Inv(A) of all A-invariant subspaces in R^n for the general transformation
Theorem 14.6.4
Let λ_1, . . . , λ_s be all the different real eigenvalues of A, let their algebraic
multiplicities be ψ_1, . . . , ψ_s, respectively, and let μ_1, . . . , μ_t be all
the distinct eigenvalues of A in the open upper half plane with the algebraic
multiplicities φ_1, . . . , φ_t, respectively. Then for every (s + t)-tuple of integers
(χ_1, . . . , χ_{s+t}) such that
the set of all A-invariant subspaces L with dim L = p, where χ_i is the algebraic
multiplicity of A|_L corresponding to λ_i for i = 1, . . . , s, and χ_{s+j} is that
corresponding to μ_j for j = 1, . . . , t, is a connected
component of Inv(A), and every connected component of Inv(A) has
this form. In particular, Inv(A) has exactly
connected components.
Theorem 14.6.5
Let A : R^n → R^n be a transformation. Then an A-invariant subspace M is
isolated in Inv(A) if and only if either
Exercises 443
Proof. Using the real analog of Lemma 14.3.2 (its proof is similar to
that of Lemma 14.3.2), we can assume that one of two cases holds: (a)
In the
first case Theorem 14.6.5 is proved in the same way as Theorem 14.3.1. In
the second case use Theorem 14.3.1 and the homeomorphism between
Inv(A) and Inv given by formula (14.6.1).
14.7 EXERCISES
Continuity and Stability of Invariant Subspaces
Sequences of Invariant Subspaces 445
Corollary 15.1.2
The set of A-invariant subspaces is closed; that is, if {M_m}_{m=1}^∞ is a sequence
of A-invariant subspaces with limit M := lim_{m→∞} M_m, then M is also A
invariant.
Theorem 15.1.3
Let {A_m}_{m=1}^∞ be a sequence of transformations on C^n that converges to a
transformation A on C^n. Then Ker A contains the limit of every convergent
subsequence of the sequence {Ker A_m}_{m=1}^∞. In particular, if dim Ker A_m =
dim Ker A for every m = 1, 2, . . . , then Ker A_m and Im A_m converge, and
Theorem 15.1.4
Let M be an invariant subspace for the transformation A : C^n → C^n, and let
Ω ⊆ C be an open set such that all eigenvalues of A|_M are inside Ω. Then for
transformations B on C^n and B-invariant subspaces N, σ(B|_N) ⊆ Ω as long as
is sufficiently small.
Theorem 15.2.1
Let λ₁, …, λ_r be the distinct eigenvalues of the transformation A. A
subspace N of ℂⁿ is A-invariant and stable if and only if N = N₁ + ⋯ + N_r,
where for each j the space N_j is an arbitrary A-invariant subspace of R_{λ_j}(A)
if dim Ker(λ_jI − A) = 1; if dim Ker(λ_jI − A) ≠ 1, then either N_j = {0} or
N_j = R_{λ_j}(A).
Corollary 15.2.2
All invariant subspaces of a transformation A: ℂⁿ → ℂⁿ are stable if and only
if A is nonderogatory [i.e., dim Ker(A − λ₀I) = 1 for every eigenvalue λ₀
of A].
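The nonderogatory condition of Corollary 15.2.2 can be tested numerically by comparing geometric multiplicities with 1. A minimal numpy sketch (the helper name and tolerance are ours; rank decisions in floating point are fragile for nearly defective matrices):

```python
import numpy as np

def is_nonderogatory(A, tol=1e-8):
    """A is nonderogatory iff dim Ker(A - lam*I) == 1 for every
    eigenvalue lam, i.e. each eigenvalue has a single Jordan block."""
    n = A.shape[0]
    eigs = np.linalg.eigvals(A)
    distinct = []
    for lam in eigs:                      # cluster numerically equal eigenvalues
        if all(abs(lam - mu) > tol for mu in distinct):
            distinct.append(lam)
    for lam in distinct:
        geo = n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)
        if geo != 1:
            return False
    return True

J = np.diag([2.0, 2.0]) + np.diag([1.0], 1)   # a single 2x2 Jordan block
print(is_nonderogatory(J))                    # True
print(is_nonderogatory(np.eye(2)))            # False: geometric multiplicity 2
```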
Theorem 15.2.3
Given ε > 0, there exists a δ > 0 such that the following holds true: if B is a
transformation with ‖B − A‖ < δ and {N_j} is a complete chain of B-invariant
subspaces, then there exists a complete chain {M_j} of A-invariant
subspaces such that θ(N_j, M_j) < ε for j = 1, …, n − 1.
In general, the chain {M_j} for A will depend on the choice of B. To see
this, consider
Observe that P₁, …, P_{n−1} are orthogonal projectors. Indeed, passing to the
limit in the equalities P_{m,j} = (P_{m,j})², we find that P_j = P_j². Further,
equation (15.2.4) combined with P*_{m,j} = P_{m,j} implies that P*_j = P_j; so P_j is
an orthogonal projector (see Section 1.5).
Further, the subspace N_j = Im P_j has dimension j, j = 1, …, n − 1. This
is a consequence of Theorem 13.1.2.
By passing to the limits it follows from B_m P_{m,j} = P_{m,j} B_m P_{m,j} that A P_j =
P_j A P_j. Hence N_j is A-invariant. Since P_{m,j} = P_{m,j+1} P_{m,j}, we have P_j = P_{j+1} P_j,
and thus N_j ⊂ N_{j+1}. It follows that {N_j} is a complete chain of A-invariant
subspaces. Finally, θ(N_{m,j}, N_j) → 0 as m → ∞. But this contradicts
(15.2.3), and the proof is complete. □
Corollary 15.2.4
If A has only one eigenvalue, λ₀ say, and if dim Ker(λ₀I − A) = 1, then each
invariant subspace of A is stable.
Lemma 15.2.5
If A has only one eigenvalue, λ₀ say, and if dim Ker(λ₀I − A) ≥ 2, then the
only stable A-invariant subspaces are {0} and ℂⁿ.
otherwise
and put B_ε = J + T_ε. Then ‖B_ε − J‖ tends to 0 as ε → 0. For ε ≠ 0 the linear
transformation B_ε has exactly one j-dimensional invariant subspace, namely,
N_j = Span{e₁, …, e_j}. Here 1 ≤ j ≤ k − 1. It follows that N_j is the only
candidate for a stable J-invariant subspace of dimension j.
Now consider J̃ = diag[J_{k_s}(λ₀), …, J_{k_2}(λ₀), J_{k_1}(λ₀)]. Repeating the
argument of the previous paragraph for J̃ instead of J, we see that N_j is the
only candidate for a stable J̃-invariant subspace of dimension j. But J̃ =
SJS⁻¹, where S is the similarity transformation that reverses the order of the
blocks in J. It follows that SN_j is the only candidate for a stable J-invariant
subspace of dimension j. As s ≥ 2, however, we have SN_j ≠ N_j for 1 ≤ j ≤
k − 1, and the proof is complete. □
Corollary 15.2.4 and Lemma 15.2.5 together prove Theorem 15.2.1 for
the case when A has only one eigenvalue.
The proof of Theorem 15.2.1 in the general case is reduced to the case of
one eigenvalue considered in the preceding section. Recall the notion of the
minimal opening
Proposition 15.3.1
Let {M_m}_{m=1}^∞ be a sequence of subspaces in ℂⁿ; if M_m converges to
some subspace L, then
Indeed, if both L and M are nonzero, then the subspaces M_m are also nonzero (at least
for m large enough; see Theorem 13.1.2). Then (15.3.1) follows from
formula (13.3.2). If at least one of the subspaces is the zero subspace, then
(15.3.1) is trivial.
Let us introduce some terminology and notation that will be used in the
next two lemmas and their proofs. We use the shorthand A_m → A for
lim_{m→∞} ‖A_m − A‖ = 0, where A_m, m = 1, 2, …, and A are transformations
on ℂⁿ. Note that A_m → A if and only if the entries of the matrix representations
of A_m (in some fixed basis) converge to the corresponding entries of
the matrix representation of A.
Lemma 15.3.2
Let Γ be a simple rectifiable contour that splits the spectrum of T, let T₀ be the
restriction of T to Im P(T; Γ), and let N be a subspace of Im P(T; Γ). Then N
is a stable invariant subspace for T if and only if N is a stable invariant
subspace for T₀.
Proof. Suppose that N is a stable invariant subspace for T₀ but not for
T. Then one can find an ε > 0 such that for every positive integer m there
exists a transformation S_m such that
and
and
So
for m sufficiently large. We conclude that θ(M_m, N) → 0, and the proof is
complete. □
Lemma 15.3.3
Let N be an invariant subspace for T, and assume that the contour Γ splits the
spectrum of T. If N is stable for T, then P(T; Γ)N is a stable invariant
subspace for the restriction T₀ of T to Im P(T; Γ).
sufficiently large. But then, without loss of generality, we may assume that Γ
splits the spectrum of each S_m. Again using S_m → T, it follows that
where the matrix corresponds to the decomposition ℂⁿ = L ∔ N. Note that
T_m = E_m⁻¹S_mE_m leaves N invariant. Because R_m → 0, we have E_m → I, and
so T_m → T.
Clearly, Γ splits the spectrum of T|_N. As T_m → T and N is invariant for
T_m, the contour Γ will split the spectrum of T_m|_N too, provided m is
sufficiently large. But then we may assume that this happens for all m. Also,
we have
Theorem 15.4.1
Let A: ℂⁿ → ℂⁿ be a transformation, and let Ω ⊂ ℂ be an open set whose
boundary does not intersect σ(A). Assume that M is an A-invariant subspace
for which the intersection M ∩ R_Ω(A) is stable (with respect to A). Then any
B-invariant subspace N has the property that N ∩ R_Ω(B) is stable (with
respect to B) provided ‖B − A‖ and θ(M, N) are small enough.
Corollary 15.4.2
Let M be a stable A-invariant subspace. Then there exists an ε > 0 such that
any B-invariant subspace N is stable provided ‖B − A‖ + θ(M, N) < ε.
Lemma 15.4.3
Let A and Ω be as in Theorem 15.4.1, and let M be an A-invariant subspace.
Then for every ε > 0 there exists a δ > 0 such that every B-invariant
subspace N with ‖B − A‖ + θ(M, N) < δ satisfies the inequality
and, moreover,
Here C₁, C₂, … are positive constants that depend on A only. Actually,
one can take δ defined as follows:
Put B̃_m = S_m⁻¹B_mS_m and Ñ_m = S_m⁻¹N_m (so that Ñ_m is B̃_m-invariant). Let P_M
(resp. P_{Ñ_m}) be the orthogonal projector onto M (resp. Ñ_m). As S_m⁻¹P_{N_m}S_m is
a projector onto Ñ_m (not necessarily orthogonal), we have
Since also
Then U_m → U and W_m → W. Since N is not stable for A_m, Theorem 15.2.1
ensures the existence of a common eigenvalue λ_m of U_m and W_m such that
Now let us focus attention on the spectral A-invariant subspaces, that is,
sums of root subspaces for A (the zero subspace will also be called spectral).
Theorem 15.2.1 shows that each spectral invariant subspace is stable. The
converse is not true in general: every invariant subspace of a unicellular
transformation is stable, but the only spectral subspaces in this case are the
trivial ones.
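A spectral subspace is the image of the Riesz projector P = (1/2πi)∮(λI − A)⁻¹dλ taken around the selected eigenvalues. For a diagonalizable A this projector can be assembled directly from the eigendecomposition; the sketch below is our own illustration, restricted to the diagonalizable case (defective matrices would need the contour integral itself):

```python
import numpy as np

def spectral_projector(A, selected, tol=1e-8):
    """Projector onto the sum of root subspaces of a *diagonalizable* A
    for the eigenvalues listed in `selected` (an illustrative sketch)."""
    w, V = np.linalg.eig(A)
    W = np.linalg.inv(V)                  # rows are scaled left eigenvectors
    idx = [i for i, lam in enumerate(w)
           if any(abs(lam - s) < tol for s in selected)]
    return V[:, idx] @ W[idx, :]

A = np.diag([1.0, 2.0, 5.0])
P = spectral_projector(A, [1.0, 2.0])
print(np.allclose(P @ P, P), np.allclose(A @ P, P @ A))   # True True
```

The projector is idempotent and commutes with A, and its image (here the first two coordinate axes) is the spectral subspace for the chosen eigenvalues.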
For the spectral subspaces, an analog of Theorem 15.4.1 holds.
Theorem 15.4.4
Let A and Ω be as in Theorem 15.4.1. Assume that M is an A-invariant
subspace for which M ∩ R_Ω(A) is a spectral invariant subspace for A. Then
any B-invariant subspace N has the property that N ∩ R_Ω(B) is spectral (as a
B-invariant subspace) provided ‖B − A‖ + θ(M, N) is small enough.
Theorem 15.5.1
For a transformation A and an A-invariant subspace M the following
statements are equivalent: (a) M is Lipschitz stable; (b) M = {0} or else
M = R_{λ₁}(A) + ⋯ + R_{λ_r}(A) for some distinct eigenvalues λ₁, …, λ_r of A;
in other words, M is a spectral A-invariant subspace; (c) for every
sufficiently small ε > 0 there exists a δ > 0 such that any transformation B with
‖A − B‖ < δ has a unique invariant subspace N for which θ(M, N) < ε.
B(α) = A on all root subspaces of A other than R_{λ₀}(A). Then B(α) → A as
α → 0. Let p = dim R_{λ₀}(A) and q = dim(M ∩ R_{λ₀}(A)); so 0 < q < p. For
brevity, denote the right-hand side of (15.5.1) by K(α). To obtain a
contradiction, it is sufficient to show that for α small enough the number of
q-dimensional K(α)-invariant subspaces N such that θ(M ∩ R_{λ₀}(A), N) <
C₁α^{1/p} is exactly (p choose q) > 1 (we denote by C₁, C₂, … positive constants that
depend on p and q only).
Let us prove this assertion. The matrix K(α) has p distinct eigenvalues
ε₁, …, ε_p, which are the p distinct roots of the equation x^p = α. The
corresponding eigenvectors are y_i = (1, ε_i, …, ε_i^{p−1}), i = 1, …, p. The
only q-dimensional K(α)-invariant subspaces are those spanned by any q
vectors among y₁, …, y_p. Take such a subspace N and suppose for
notational convenience that N = Span{y₁, …, y_q}. The projector Q_N onto
N along the subspace spanned by y_{q+1}, …, y_p is given by the formula
i = 1, 2 (for large m), contradicting the assumption that N_{1m} and N_{2m} are
different.
Now we prove the equivalence of (a) and (b). In view of Theorem 15.2.1,
we have to check that the only Lipschitz stable invariant subspaces of the
Jordan block
are the trivial spaces {0} and ℂⁿ. For α > 0, let
Now use |ε| = α^{1/n}. One finds that for α sufficiently small
On the other hand, ‖J − J_α‖ = α. But then it is clear that for 1 ≤ k ≤ n − 1
the space N_k is not a Lipschitz stable invariant subspace of J, and thus J has
no nontrivial Lipschitz stable invariant subspace.
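The α^{1/n} effect behind this argument is easy to observe numerically: perturbing the nilpotent Jordan block by α in the bottom-left corner changes the matrix by α but moves its eigenvalues (and with them the invariant subspaces) by α^{1/n}. A small sketch of our own:

```python
import numpy as np

# Perturb the nilpotent Jordan block J_n(0) by putting a in the (n,1)
# corner: ||B_a - J|| = a, yet the eigenvalues of B_a are the nth roots
# of a, of modulus a**(1/n).  Invariant subspaces therefore move at rate
# a**(1/n) >> a, so they are stable but not Lipschitz stable.
n, a = 4, 1e-8
J = np.diag(np.ones(n - 1), 1)
B = J.copy()
B[n - 1, 0] = a
moduli = np.abs(np.linalg.eigvals(B))
print(np.allclose(moduli, a ** (1.0 / n)))   # True: all moduli equal a^(1/n)
```

Here a = 1e-8 but the eigenvalues have modulus 1e-2: a dramatic loss of Lipschitz behaviour.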
Theorem 15.6.1
A lattice Λ of A-invariant subspaces is stable if and only if it consists of stable
A-invariant subspaces.
Proof. Without loss of generality we can assume that {0} and ℂⁿ belong
to Λ.
Suppose first that Λ contains an A-invariant subspace M that is not stable.
Then there exist an ε₀ > 0 and a sequence of transformations {B_m}_{m=1}^∞
tending to A such that θ(M, N) > ε₀ for any B_m-invariant subspace N and
any m. Obviously, Λ cannot be stable.
Assume now that every member of Λ is a stable A-invariant subspace. As
the number of stable A-invariant subspaces is finite (by Theorem 15.2.1),
the lattice Λ is finite. Let M₁, …, M_p be all the elements in Λ. Denote by
λ₁, …, λ_r the distinct eigenvalues of A ordered so that
Then
where S runs through the set of all isomorphisms of Λ onto Λ′. Obviously,
every Lipschitz stable lattice of invariant subspaces is stable. We leave the
proof of the following result to the reader.
Theorem 15.6.2
A lattice Λ of A-invariant subspaces is Lipschitz stable if and only if Λ
consists only of spectral subspaces for A.
Given two sets X and Y of subspaces in ℂⁿ, the distance between X and Y
is introduced naturally, as the Hausdorff distance with respect to the gap:
dist(X, Y) = max{ sup_{M∈X} inf_{N∈Y} θ(M, N), sup_{N∈Y} inf_{M∈X} θ(M, N) }.
Borrowing notation from set theory, denote by 2^Z the set of all subsets of a
set Z. Then dist(X, Y) is a metric on 2^{𝒮(ℂⁿ)} [as before, 𝒮(ℂⁿ) represents the
set of all subspaces of ℂⁿ]. Indeed, the only nontrivial property that we have
to check is the triangle inequality:
Now take the supremum with respect to M and, from the resulting
inequality with the roles of M and N interchanged, it follows that
Theorem 15.7.1
Inv(A) is stable in metric if and only if A is nonderogatory, that is,
dim Ker(A − λ₀I) = 1 for every eigenvalue λ₀ of A.
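Both the gap θ(M, N) = ‖P_M − P_N‖ and the Hausdorff-type distance between sets of subspaces used in this theorem are easy to compute from orthonormal bases. The helper names below are ours; the sketch only illustrates the definitions:

```python
import numpy as np

def proj(X):
    """Orthogonal projector onto the column space of X."""
    Q, _ = np.linalg.qr(X)
    return Q @ Q.conj().T

def gap(X, Y):
    """Gap theta(M, N) = ||P_M - P_N|| between two subspaces."""
    return np.linalg.norm(proj(X) - proj(Y), 2)

def dist_sets(Xs, Ys):
    """Hausdorff-type distance between two finite sets of subspaces."""
    d1 = max(min(gap(X, Y) for Y in Ys) for X in Xs)
    d2 = max(min(gap(X, Y) for X in Xs) for Y in Ys)
    return max(d1, d2)

e1 = np.array([[1.0], [0.0]])
e2 = np.array([[0.0], [1.0]])
print(gap(e1, e1), gap(e1, e2))   # 0.0 for equal spans, 1.0 for orthogonal ones
```

Two subspaces of different dimensions always have gap 1, which is why dist(Inv(B), Inv(A)) is sensitive to whole families of invariant subspaces appearing or disappearing under perturbation.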
for any j ≠ i and any positive integer m. (Such sequences can obviously be
arranged.) Letting
for every transformation B with ‖B − A‖ < δ′. We prove now that given ε > 0
there exists a δ″ > 0 such that
for every transformation B with ‖B − A‖ < δ″. Suppose not. Then there is a
sequence of transformations {B_m} on ℂⁿ such that
and for every m there exists a B_m-invariant subspace N_m with
Now, given ε > 0, let δ = min(δ′, δ″) to see that dist(Inv(B), Inv(A)) < ε
for every transformation B with ‖B − A‖ < δ.
It follows from Theorems 15.6.1 and 15.7.1 that Inv(A) is stable if and
only if it is stable in metric.
Also, let us introduce the notion of Lipschitz stability in metric. We say
that the lattice Inv(A) of all invariant subspaces of a transformation
A: ℂⁿ → ℂⁿ is Lipschitz stable in metric if there exist positive constants K
and ε such that for any transformation B with ‖B − A‖ < ε the
inequality dist(Inv(B), Inv(A)) ≤ K‖B − A‖ holds.
Theorem 15.7.2
The lattice Inv(A) for a transformation A: ℂⁿ → ℂⁿ is Lipschitz stable in
metric if and only if A has n distinct eigenvalues.
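The Lipschitz behaviour asserted here can be observed directly: when A has n distinct eigenvalues, each one-dimensional invariant subspace moves at a rate proportional to the perturbation size. The following sketch is our own illustration (the bound 10 used below is a crude constant valid for this particular A and this normalized E, not a general estimate):

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.diag([1.0, 2.0, 3.0])               # n = 3 distinct eigenvalues
E = rng.standard_normal((3, 3))
E /= np.linalg.norm(E, 2)                  # ||E|| = 1

def proj(v):
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

for delta in [1e-2, 1e-4, 1e-6]:
    B = A + delta * E
    w, V = np.linalg.eig(B)
    i = int(np.argmin(np.abs(w - 1.0)))    # eigenvalue of B near 1
    theta = np.linalg.norm(proj(V[:, i]) - proj(np.eye(3)[:, 0]), 2)
    print(theta / delta)                   # ratio stays bounded as delta -> 0
```

Contrast this with the Jordan-block experiment above, where the corresponding ratio blows up like δ^{1/n − 1}.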
provided ‖B − A‖ < ε. Now consider the invariant subspaces of B. As A has
n distinct eigenvalues, the same is true for any transformation B sufficiently
close to A. So every B-invariant subspace N is spectral:
we find that
for all such B. In view of (15.7.4) and (15.7.5), Inv(A) is Lipschitz stable in
metric.
Conversely, if A has fewer than n distinct eigenvalues, then by Theorem
15.5.1 there exists an A-invariant subspace that is not Lipschitz stable. Then
clearly Inv(A) cannot be Lipschitz stable in metric.
It turns out that, in contrast with the case of invariant subspaces for a
single transformation, every [A B]-invariant subspace is stable and, moreover, the
stability is understood in the Lipschitz sense. More exactly, we have the
following theorem.
Theorem 15.8.1
Let A: ℂⁿ → ℂⁿ and B: ℂᵐ → ℂⁿ be a full-range pair of transformations.
Then for every [A B]-invariant subspace M there exist positive constants ε
and K such that, for every pair of transformations A′: ℂⁿ → ℂⁿ and
B′: ℂᵐ → ℂⁿ with
Proof.
and put
Corollary 15.8.2
Let A: ℂⁿ → ℂⁿ and B: ℂᵐ → ℂⁿ be a full-range pair of transformations, and
let M be an [A B]-invariant subspace. Then for every transformation
F: ℂⁿ → ℂᵐ such that (A + BF)M ⊂ M and every direct complement N of M
in ℂⁿ there exist positive constants K and ε with the property that to any pair
of transformations A′: ℂⁿ → ℂⁿ, B′: ℂᵐ → ℂⁿ with ‖A − A′‖ + ‖B − B′‖ <
ε there corresponds a transformation F′: ℂⁿ → ℂᵐ with Ker F′ ⊃ M, and a
subspace M′ ⊂ ℂⁿ, such that (A′ + B′(F + F′))M′ ⊂ M′ and
A dual version of Theorem 15.8.1 also holds. Namely, given a null kernel
pair of transformations G: ℂⁿ → ℂᵐ and A: ℂⁿ → ℂⁿ, every [G; A]-invariant
subspace is Lipschitz stable in the above sense. The proof can be obtained
by using Theorem 15.8.1 and the fact that a subspace M is [G; A] invariant if
and only if its orthogonal complement is [A* G*] invariant. We leave it to
the reader to state and prove this dual version of Corollary 15.8.2.
Lemma 15.9.2
Assume that n is odd and the transformation A: ℝⁿ → ℝⁿ has exactly one
eigenvalue (which is real) and the geometric multiplicity of this eigenvalue is
one. Then each A-invariant subspace is stable.
Proof. As n is odd, every transformation X: ℝⁿ → ℝⁿ has an invariant
subspace of every dimension k for 1 ≤ k ≤ n − 1 (this follows from the real
Jordan form for X, because X must have a real eigenvalue). Arguing as in
the proof of Theorem 15.2.3, one proves that for every ε > 0 there exists a
δ > 0 such that, if B is a transformation with ‖B − A‖ < δ and M is a
k-dimensional B-invariant subspace, there exists a k-dimensional A-invariant
subspace N with θ(M, N) < ε. Since A is unicellular, this subspace
N is unique, and its stability follows.
Lemma 15.9.3
Let n be even, and let A: ℝⁿ → ℝⁿ have exactly one eigenvalue, which is real
and has geometric multiplicity one. Then the even-dimensional A-invariant
subspaces are stable and the odd-dimensional A-invariant subspaces are not
stable.
(see Theorem 12.2.1). In this notation we have the following general result
that describes all stable A-invariant subspaces.
Theorem 15.9.5
Let A be a transformation on ℝⁿ. The A-invariant subspace N is stable if and
only if all the following properties hold: (a) N ∩ R_{λ_j}(A) is an arbitrary
even-dimensional A-invariant subspace of R_{λ_j}(A) whenever the algebraic
multiplicity of λ_j is even and the geometric multiplicity of λ_j is 1; (b) N ∩ R_{λ_j}(A) is
an arbitrary A-invariant subspace of R_{λ_j}(A) whenever the algebraic multiplicity
of λ_j is odd and the geometric multiplicity of λ_j is 1; (c) N ∩ R_{λ_j}(A) = R_{λ_j}(A), or
N ∩ R_{λ_j}(A) = {0}, whenever λ_j has geometric multiplicity at least 2; (d)
N ∩ R_{α_j±iβ_j}(A) is an arbitrary A-invariant subspace of R_{α_j±iβ_j}(A) whenever
the geometric multiplicity of α_j + iβ_j is 1; (e) N ∩ R_{α_j±iβ_j}(A) = R_{α_j±iβ_j}(A), or
N ∩ R_{α_j±iβ_j}(A) = {0}, whenever α_j + iβ_j has geometric multiplicity at least 2.
Corollary 15.9.6
For a transformation A: ℝⁿ → ℝⁿ, every stable A-invariant subspace is
isolated. Conversely, every isolated A-invariant subspace is stable if and only if
A has no real eigenvalues with even algebraic multiplicity and geometric
multiplicity 1.
We pass now to Lipschitz stable invariant subspaces for real transformations.
The definition of Lipschitz stability is the same as for transformations
on ℂⁿ. Clearly, every Lipschitz stable invariant subspace is stable.
Also, for a transformation A: ℝⁿ → ℝⁿ, every root subspace R_λ(A) corresponding
to a real eigenvalue λ of A, as well as every root subspace
R_{α±iβ}(A) corresponding to a pair α ± iβ of nonreal eigenvalues of A, is a
Lipschitz stable A-invariant subspace. Moreover, every spectral subspace for
A (i.e., a sum of root subspaces) is also a Lipschitz stable A-invariant
subspace. As in the complex case, these are all the Lipschitz stable subspaces:
Theorem 15.9.7
For a transformation A: ℝⁿ → ℝⁿ and an A-invariant subspace M ⊂ ℝⁿ,
the following statements are equivalent: (a) M is Lipschitz stable;
(b) M = R_{λ₁}(A) + ⋯ + R_{λ_r}(A) + R_{α₁±iβ₁}(A) + ⋯ + R_{α_s±iβ_s}(A) for some
distinct real eigenvalues λ₁, …, λ_r of A and some distinct eigenvalues
α₁ + iβ₁, …, α_s + iβ_s in the open upper half plane (here the terms R_λ(A) or
the terms R_{α±iβ}(A), or even both (in which case M is interpreted as the zero
subspace), may be absent); (c) for every ε > 0 small enough there exists a
δ > 0 such that every transformation B: ℝⁿ → ℝⁿ for which ‖B − A‖ < δ has
a unique invariant subspace N for which θ(M, N) < ε.
Proof. As in Lemma 15.3.2, one proves that M is Lipschitz stable if and
only if for every real eigenvalue λ of A the intersection M ∩ R_λ(A) is
Lipschitz stable as an A|_{R_λ(A)}-invariant subspace, and for every nonreal
eigenvalue α + iβ of A the intersection M ∩ R_{α±iβ}(A) is Lipschitz stable as an
A|_{R_{α±iβ}(A)}-invariant subspace.
Let us prove the equivalence (a) ⇔ (b). In view of the above remark, we
can assume that A has either exactly one real eigenvalue or exactly one pair
of nonreal eigenvalues. By Theorem 15.9.6 we have only to prove that the
transformations represented by the matrices
and
Now the proof of Theorem 15.5.1 shows that the only candidates for
nontrivial Lipschitz stable invariant subspaces for A₂ are S⁻¹(Span{e₁, …, e_{n/2}})
and S⁻¹(Span{e_{n/2+1}, …, e_n}). But since these subspaces are not real (i.e.,
cannot be obtained from subspaces in ℝⁿ by complexification), A₂ has no
nontrivial Lipschitz stable invariant subspaces.
The implication (b) ⇒ (c) is proved as in the proof of Theorem 15.5.1. To
prove the converse implication, observe that, as we have seen in the proof
of Theorem 15.5.1, it is sufficient to show that for any A₂-invariant subspace
M (⊂ ℝⁿ) of dimension q (0 < q < n) the number of q-dimensional invariant
subspaces N of A₂(α) such that
satisfies (15.9.2).
Theorem 15.10.1
Given a transformation A: ℂⁿ → ℂⁿ and a closed contour Γ with Γ ∩ σ(A) =
∅, there exists an ε > 0 such that any transformation B: ℂⁿ → ℂⁿ with
‖B − A‖ < ε has no eigenvalues on Γ and satisfies the inequalities
Now let ε > 0 be so small that if ‖B − A‖ < ε, the determinant of the top
s × s submatrix in F(λ)⁻¹(λI − B)G(λ)⁻¹ has exactly
We say that (15.10.5) is the Jordan structure sequence of A. Denote by Φ the
finite set of all ordered sequences of positive integers (15.10.5) such that
properties (a)-(c) hold and Σ_{j=1}^{s} Σ_{i=1}^{r_j} m_{ij} = n (here n is fixed).
Given a sequence Ω ∈ Φ as in (15.10.5), for every nonempty subset
Δ ⊂ {1, …, s} define
Theorem 15.10.2
Let A: ℂⁿ → ℂⁿ be a transformation with s distinct eigenvalues and Jordan
structure sequence Ω. Then, given a sequence
it is sufficient to consider only the case when, for some indices
l < q, we have m̃_l = m_{1l} + 1 and m̃_q = m_{1q} − 1, whereas m̃_j = m_{1j} for j ≠ l, j ≠ q.
Write
where the matrix Q has all zero entries except for the entry in position
(m₁₁ + ⋯ + m_{1l}, m₁₁ + ⋯ + m_{1q}), which is equal to 1. One verifies without
difficulty that the partial multiplicities of B_q are m̃_l = m_{1l} + 1, m̃_q = m_{1q} − 1,
and m̃_j = m_{1j} for j ≠ l, q.
We now construct a sequence {B_q}_{q=1}^∞ converging to A such that σ(B_q) = {λ₀} and
the Jordan structure sequence of B_q is Ω (for each q). For a fixed q, let
With this choice of the μ_i values (which depend on q), put B_q =
B_q(μ₁, …) to satisfy the requirements of Theorem 15.10.2.
15.11 EXERCISES
Perturbations of
Lattices of
Invariant Subspaces
with Restrictions
on the Jordan Structure
In this chapter we study the behaviour of the lattice Inv(X) of all invariant
subspaces of a matrix X, when X is perturbed within the class of matrices
with fixed Jordan structure (i.e., with isomorphic lattices of invariant
subspaces). A larger class of matrices with fixed Jordan structure corresponding
to the eigenvalues of geometric multiplicity greater than 1 is also
studied. For transformations A and B on ℂⁿ, our main concern is the
relationship of the distance between the lattices of invariant subspaces for A
and B to ‖A − B‖.
(16.1.1)
Proposition 16.1.1
If S: ℂⁿ → ℂⁿ is an invertible transformation and 𝒴 is a lattice of subspaces in
ℂⁿ, then S𝒴 = {SM : M ∈ 𝒴} is also a lattice of subspaces, and the correspondence
φ(M) = SM is a lattice isomorphism of 𝒴 onto S𝒴.
Proof. The definition of φ ensures that φ is onto, and invertibility of S
ensures that φ is one-to-one. Furthermore,
and
Theorem 16.1.2
Let a transformation A: ℂⁿ → ℂⁿ be given. The following statements are
equivalent for a transformation B: ℂⁿ → ℂⁿ: (a) B has the same Jordan
structure as A; (b) the lattices Inv(B) and Inv(A) are isomorphic; (c) the
lattices Inv(B) and Inv(A) are linearly isomorphic.
be a Jordan basis in R_{μ_j}(B) (so k₁, k₂, …, k_q are the partial multiplicities
of A at λ_j and of B at μ_j). Given an A-invariant subspace
spanned by the vectors
where
It is easily seen that ψ is the desired isomorphism between Inv(A) and Inv(B);
moreover, ψ(M) = SM, where S is the invertible transformation defined by
S x_{rs} = y_{rs},  s = 1, …, k_r;  r = 1, …, q.
Conversely, suppose that ψ: Inv(A) → Inv(B) is an isomorphism of lattices.
Let λ₁, …, λ_p be all the distinct eigenvalues of A, and let N_i =
ψ(R_{λ_i}(A)), i = 1, …, p. Then ℂⁿ is a direct sum of the B-invariant
subspaces N₁, …, N_p. We claim that σ(B|_{N_i}) ∩ σ(B|_{N_j}) = ∅ for i ≠ j. Indeed,
assume the contrary, that is, that σ(B|_{N_i}) ∩ σ(B|_{N_j}) ≠ ∅ for some N_i and N_j with
i ≠ j. Let N = Span{y₁ + y₂}, where y₁ (resp. y₂) is some eigenvector of B|_{N_i}
(resp. of B|_{N_j}) corresponding to the common eigenvalue. Then N is B-invariant.
Let M be the A-invariant subspace such that ψ(M) = N. Since M must
contain a one-dimensional A-invariant subspace, and since ψ is a lattice
isomorphism, the subspace M is one-dimensional. Therefore, M ⊂ R_{λ_k}(A)
for some k. This implies N = ψ(M) ⊂ ψ(R_{λ_k}(A)) = N_k, a contradiction with
the choice of N.
Further, the spectrum of each restriction B|_{N_i} is a singleton. To verify
this, assume the contrary. Then for some i the subspace N_i is a sum of at
least two root subspaces for B:
If x₁ and x₂ are eigenvectors of A|_{M₁} and of A|_{M₂}, respectively, then
Span{x₁ + x₂} is A-invariant and does not belong to any subspace M_j.
Hence ψ(Span{x₁ + x₂}) is B-invariant, belongs to N_i, but does not belong to
any subspace R_{μ_j}(B). This is impossible because ψ(Span{x₁ + x₂}) is
one-dimensional.
We have proved, therefore, that N_i = R_{μ_i}(B), i = 1, …, p, where
μ₁, …, μ_p are all the distinct eigenvalues of B.
For a fixed i, the number of partial multiplicities of A corresponding to λ_i
that are greater than or equal to a fixed integer q coincides with the
maximal number of summands in a direct sum M₁ + ⋯ + M_s, where, for
j = 1, …, s, the M_j ⊂ R_{λ_i}(A) are irreducible subspaces with dimension not less
than q. As ψ induces an isomorphism between Inv(A|_{R_{λ_i}(A)}) and
Inv(B|_{R_{μ_i}(B)}), it follows that the number of partial multiplicities of A
corresponding to λ_i that are greater than or equal to q coincides with the
number of partial multiplicities of B corresponding to μ_i that are not less
than q. Hence A and B have the same Jordan structure.
Corollary 16.1.3
Assume that A and B are transformations on ℂⁿ with one and only one
eigenvalue, λ₀. Then the lattices Inv(A) and Inv(B) are
isomorphic if and only if A and B are similar.
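Comparing Jordan structures, as Theorem 16.1.2 and this corollary require, reduces to computing partial multiplicities, which can be recovered from the rank sequence r_k = rank((A − λI)^k): the number of Jordan blocks of size ≥ k equals r_{k−1} − r_k. A numpy sketch (the helper is ours and is numerically fragile for clustered eigenvalues):

```python
import numpy as np

def partial_multiplicities(A, lam, tol=1e-8):
    """Sizes of the Jordan blocks of A at eigenvalue lam, from the rank
    sequence r_k = rank((A - lam*I)^k)."""
    n = A.shape[0]
    N = A - lam * np.eye(n)
    ranks = [n]
    M = np.eye(n)
    for _ in range(n):
        M = M @ N
        ranks.append(np.linalg.matrix_rank(M, tol=tol))
    # blocks_ge[k-1] = number of Jordan blocks of size >= k
    blocks_ge = [ranks[k] - ranks[k + 1] for k in range(n)]
    mults = []
    for k in range(n, 0, -1):
        count = blocks_ge[k - 1] - (blocks_ge[k] if k < n else 0)
        mults.extend([k] * count)
    return sorted(mults, reverse=True)

# J_2(0) ⊕ J_3(0), encoded by the superdiagonal pattern 1,0,1,1
A = np.diag([1.0, 0.0, 1.0, 1.0], 1)
print(partial_multiplicities(A, 0.0))   # [3, 2]
```

Two nilpotent matrices are similar exactly when this list agrees, which is the content of Corollary 16.1.3 for a single eigenvalue.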
Note that the set 𝒮(A, B) contains transformations arbitrarily close to zero
[indeed, take a fixed S ∈ 𝒮(A, B) and consider εS with ε → 0, ε ≠ 0].
Hence Ω(A, B) ≤ 1 for any A and B with the same Jordan structure. This
observation will be used frequently in the sequel.
The following example shows that the equality Ω(A, B) = 1 is possible.
Then
and
Theorem 16.2.1
If A and B have the same Jordan structure and Ω(A, B) < 1, then
Hence
so
Now for any subspace M ⊂ ℂⁿ the transformation S P_M S⁻¹ is a projector onto SM (we
denote by P_M the orthogonal projector onto M). So
Consequently
dist(Inv(A), Inv(B))
Now consider the case when A and B are similar. Then, evidently, A and
B have the same Jordan structure. Clearly, in this case 𝒮(A, B) contains all
the similarity transformations between A and B:
{S : S is invertible and A = S⁻¹BS}
Theorem 16.2.2
For every transformation A: ℂⁿ → ℂⁿ we have
where the suprema are taken over all transformations B that are similar to A.
In other words, the first inequality in (16.2.1) means that there exists a
positive constant K (depending on A) such that for every B that is
similar to A we have
for any transformations X, Y: ℂⁿ → ℂⁿ. So, by Theorem 16.2.1, the second
inequality in (16.2.1) follows from the first one.
To prove the first inequality in (16.2.1), consider the linear space L(ℂⁿ)
of all transformations X: ℂⁿ → ℂⁿ, with the scalar product (X, Y) = tr(XY*)
for X, Y ∈ L(ℂⁿ) (where Y* denotes the adjoint of Y defined by the
standard scalar product on ℂⁿ) and the corresponding norm ‖X‖₂ =
√(X, X) for all X ∈ L(ℂⁿ). For every B ∈ L(ℂⁿ) consider the linear
transformation
So
Hence
Then
and
So
To calculate the distance between Inv(A) and Inv(B), note that the only
invariant subspaces of A and B that differ (if x ≠ 0) are Span{…} and
Span{…}, respectively, with corresponding orthogonal projectors
Observe that
Theorem 16.3.1
Given a transformation A on ℂⁿ, we have
and
where the suprema are taken over the set J(A) of all transformations
B: ℂⁿ → ℂⁿ that have the same Jordan structure as A.
Theorem 16.3.2
Let J be a class of all linear transformations having the same Jordan structure.
Then the real function defined on J by
provided
(b) Assume now that ‖B − A‖ < ε and B ∈ J(A). As the numbers of
distinct eigenvalues of B and of A coincide, there is exactly one
eigenvalue of B, denoted μ_i, inside each circle Γ_i. We claim that for
every i = 1, …, p the eigenvalue λ_i of A and the eigenvalue μ_i of B
have the same partial multiplicities. Indeed, assuming the contrary,
it follows from (16.3.4) that
for some i₀ (1 ≤ i₀ ≤ p) and some s₀ (note that the equality
Consequently,
which is a contradiction.
Put
and for fixed i (1 ≤ i ≤ p) let S_i be the transformation constructed above for
the transformation B ∈ J(A) with ‖B − A‖ < ε₂. Define the transformation
Now
and where
(as in the proof of Theorem 16.2.1). Since (1 − q₁)⁻¹ < 2, (16.3.6) gives
|μ_i − tr B_i|
For every x ∈ ℂⁿ write x = x₁ + ⋯ + x_p, where x_i = P_i(B)x and P_i(B)
is the projector onto R_{μ_i}(B) along Σ_{j≠i} R_{μ_j}(B). As P_i(B) = (1/2πi)
∫_{Γ_i} (λI − B)⁻¹ dλ, we have
where P_i(A) is the projector onto R_{λ_i}(A) along Σ_{j≠i} R_{λ_j}(A). Denoting
we see that ‖P_i(B)‖ ≤ D_i, i = 1, …, p. Now using (16.3.11) with these
inequalities we obtain
and thus
Theorem 16.4.1
Let A: ℂⁿ → ℂⁿ be a transformation with height α. Then
satisfies (16.4.1). This is not difficult to verify using the fact that B_m has n
distinct eigenvalues m^{−1/n}ε^j, j = 0, 1, …, n − 1, where ε is a primitive nth root of
unity, with corresponding eigenvectors
Indeed, writing these out, we see that the
orthogonal projector onto the span of such an eigenvector is
So
Corollary 16.4.2
Let A: ℂⁿ → ℂⁿ be a nonderogatory transformation with height α. Then there
exists a neighbourhood 𝒰 of A in the set of all transformations on ℂⁿ such
that
Theorem 16.4.3
Let DJ be a class of all transformations having the same derogatory Jordan
structure. Then the real function defined on DJ by
for every A, B ∈ DJ that are sufficiently close to A₀, B₀, and where α, β are
the heights of A₀ and B₀, respectively.
Theorem 16.4.4
Let A: ℂⁿ → ℂⁿ be a transformation with height α, and let M be a stable
A-invariant subspace. Then
Lemma 16.5.1
Let A: ℂⁿ → ℂⁿ be a transformation with σ(A) = {0} and dim Ker A = 1.
Then, given a constant M > 0, there exists a K > 0 such that
Proof. Let B: ℂⁿ → ℂⁿ be such that ‖B − A‖ < M. We have Aⁿ = 0 and
thus
Now we prove Theorem 16.4.1 for the case when A: ℂⁿ → ℂⁿ is
nonderogatory and has only one eigenvalue.
Lemma 16.5.2
Let σ(A) = {λ₀} and dim Ker(λ₀I − A) = 1. Then there exists a constant
K > 0 such that the inequality
for any eigenvalue λ₀ of any B satisfying ‖B − A‖ < ε₂, where the positive
constants K₂ and ε₂ ≤ ε₁ depend on A only.
It is convenient to assume that A is the Jordan block with respect to the
standard orthonormal basis in ℂⁿ: A = J_n(0). For any B sufficiently close to
A write B − A = [b_{ij}]_{i,j=1}^n. Inequality (16.5.3) shows that there is an
eigenvalue λ₀ of B with |λ₀| ≤ K₁‖B − A‖^{1/n}. Using this bound and Cramer's rule, we see that for
j = 2, 3, …, n, x_j has the following structure:
where B satisfies ‖B − A‖ < ε₂. Here and in the sequel L₀, L₁, … denote
positive constants that depend on A only.
Now let x⁽¹⁾, …, x⁽ᵏ⁾ be k eigenvectors of B corresponding to k distinct
eigenvalues λ₁, …, λ_k. Construct new vectors using divided differences:
Let
Obviously
and thus Y_k is invertible (for B sufficiently close to A). Using the estimates
we easily find from (16.5.5) that
Hence
So
Consequently
for every transformation B such that ‖B − A‖ < ε₂ and such that every B-invariant
subspace is spanned by eigenvectors of B. As B must be nonderogatory, the
last condition means that B has n distinct eigenvalues.
Assume now that B is such that ‖B − A‖ < ε₂ but B does not have n
distinct eigenvalues. In particular, B is nonderogatory. Let {B_m}_{m=1}^∞ be a
sequence of transformations such that ‖B_m − A‖ < ε₂ for all m, B_m → B as
m → ∞, and B_m has n distinct eigenvalues for each m. Let M be a
k-dimensional B-invariant subspace. As M is a stable subspace (see
Theorem 15.2.1), there exists a sequence {M_m}_{m=1}^∞, where M_m is a
k-dimensional B_m-invariant subspace, such that θ(M_m, M) → 0 as m → ∞. By
(16.5.6)
hence
‖B − A‖ < ε
As
it is sufficient to prove Theorem 16.4.1 only for those B: ℂⁿ → ℂⁿ that are
close enough to A and satisfy 𝔄_j(B) = 𝔄_j(A), j = 1, 2.
Hence
Theorem 16.6.1
We have
Proof. Recall that B is nonderogatory if and only if the set of its invariant
subspaces is finite.
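For a nonderogatory B this finite set is easy to count: within each root subspace the invariant subspaces of a single Jordan block form a chain {0} ⊂ N₁ ⊂ ⋯ ⊂ N_{m_i}, and an invariant subspace of B is a direct sum of one choice per eigenvalue, giving ∏(m_i + 1) subspaces in total, where the m_i are the algebraic multiplicities. A small sketch (the counting function is our own):

```python
from collections import Counter
from math import prod

def count_invariant_subspaces(eigenvalues):
    """Number of invariant subspaces of a *nonderogatory* matrix with
    the given eigenvalue list (repetitions = algebraic multiplicities).
    Each root subspace contributes a chain of m_i + 1 subspaces."""
    mult = Counter(eigenvalues)
    return prod(m + 1 for m in mult.values())

# A single 3x3 Jordan block: {0}, N_1, N_2, and C^3
print(count_invariant_subspaces([2, 2, 2]))   # 4
# Two distinct eigenvalues: 2 * 2 = 4 invariant subspaces
print(count_invariant_subspaces([1, 2]))      # 4
```

A derogatory matrix, by contrast, has a continuum of invariant subspaces (e.g. every line through 0 is invariant for the identity), which is the dichotomy used in this proof.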
By assumption, dim Ker(λ₀I − A) > 1 for some eigenvalue λ₀ of A. Let
x and y be orthonormal vectors belonging to Ker(λ₀I − A), and put
First take the minimum with respect to t on the left-hand side and then on
the right-hand side. We obtain
for all t ∈ [0, 1]. Taking the maximum with respect to t on the right-hand
side first, and then on the left-hand side, we obtain
With the roles of ℒₜ and 𝒩ₜ switched, it also follows that
Transformations with Different Jordan Structures 509
that is
The next example shows that the answer is, in general, negative.
Clearly, for all m, Bm and A have different derogatory Jordan structure (in
particular, different Jordan structure).
One-dimensional A-invariant subspaces are Span{e₁ + βe₃} and
Span{e₃}. The orthogonal projector on Span{e₁ + βe₃} is
Now the inequalities (16.6.3) and (16.6.4) ensure that for m = 1, 2, …,
16.7 CONJECTURES
A similar question arises for the case of derogatory Jordan structure, when
(16.7.1) is replaced by
Further, for Ω given by (16.7.2) denote by P(Ω) the set of all sequences
for which there is a partition of {1, …, s′} into s disjoint nonempty sets
S₁, …, Sₛ such that the following relations hold:
Conjecture 16.7.1
Let A be a transformation with the Jordan structure Ω. Then
for any sequence Ω′ that belongs to P(Ω) and is different from Ω, there exists
a sequence of transformations {Bₘ}ₘ₌₁^∞ that converges to A, for which each
Bₘ has the Jordan structure Ω′, and for which
where η₁, …, ηₛ are the sth roots of ε, and the n × n matrix A_ε has ε in the
(s, 1) entry and zeros elsewhere. It is easy to see [by considering, e.g.,
det(λI − B_ε)] that, at least for ε close enough to zero, the matrix B_ε has the
Jordan structure Ω′. Clearly, η₁, …, ηₛ are the eigenvalues of B_ε, and
(1, ηⱼ, …, ηⱼ^{s−1}, 0, …, 0) is the only eigenvector of B_ε (up to multiplication
by a nonzero scalar) corresponding to ηⱼ, for j = 1, …, s. It follows (cf.
the remark following Theorem 16.5.1) that
and
be two sequences of the form (16.7.2). We say that Ω and Ω′ have the same
derogatory part if the number (say, M) of indices i, 1 ≤ i ≤ s, such that
rᵢ ≥ 2 coincides with the number of indices j, 1 ≤ j ≤ s′, such that rⱼ′ ≥ 2,
and, moreover, the corresponding values rᵢ and rⱼ′ coincide. If it does not
happen that Ω and Ω′ have the same derogatory part, we say that Ω and Ω′
have different derogatory parts.
Exercises 513
Conjecture 16.7.2
Let the transformation A have the Jordan structure Ω. Then
for every sequence Ω′ that belongs to P(Ω) and such that Ω and Ω′ have
different derogatory parts, there exists a sequence of transformations {Bₘ}ₘ₌₁^∞
that converges to A, for which each Bₘ has the Jordan structure Ω′, and for
which
16.8 EXERCISES
Applications
Chapters 13-16 provide us with tools for the study of stability of divisors for
monic matrix polynomials and rational matrix functions. In this chapter we
develop a complete description of stable divisors in terms of their corre-
sponding invariant subspaces and supporting projectors. Special attention is
paid to Lipschitz stable and isolated divisors. We consider also the stability
and isolatedness properties of solutions of matrix quadratic equations as well
as stability of linear fractional decompositions of rational matrix functions.
Matrix Polynomials: Preliminaries 515
are invertible (see Section 5.6). Here, l_r < ⋯ < l₂ < l are some positive
integers. The correspondence between factorizations (17.1.1) and chains of
C_L-invariant subspaces is given by the formulas from Theorem 5.6.1.
Namely, let Nⱼ be a direct complement to Mⱼ₊₁ in Mⱼ (j = 1, …, r − 1) (by
definition, M₁ = ℂ^{nl}), and let Pⱼ: Mⱼ → Nⱼ be the projector on Nⱼ along
Mⱼ₊₁. For j = 1, …, r − 1, let pⱼ be the difference lⱼ − lⱼ₊₁, where, by
definition, l₁ = l. Here l is the degree of L. Then for j = 1, …, r − 1 we
have
where
where
Also, it is
convenient to use the formulas for the products Lⱼ ⋯ L_r (cf. the proof of
Theorem 5.6.1). We have for j = 2, …, r:
where
Now fix a positive integer l. Consider the set 𝒲_r of all r-tuples
(M_r, …, M₂, L), where L is a monic matrix polynomial of
degree l and M_r ⊂ ⋯ ⊂ M₂ is a chain of C_L-invariant subspaces.
The set 𝒲_r is a metric space with the metric
Theorem 17.1.1
For each g the set Wr ^ is open in Wr .
It is evident that F_l is one-to-one and surjective, so that the map F_l^{-1} exists.
Make the set into a metric space
by defining
If X₁, X₂ are topological spaces with metrics ρ₁, ρ₂, defined on X₁ and X₂,
respectively, the map G: X₁ → X₂ is said to be locally Lipschitz continuous
if, for every x ∈ X₁, there is a neighbourhood Uₓ of x for which
Theorem 17.1.2
The maps F_l and F_l^{-1} are locally Lipschitz continuous.
Proof. Given
where the products L₁ ⋯ Lⱼ₋₁
and Lⱼ ⋯ L_r are given by (17.1.5) and (17.1.4), respectively. Then
(cf. the proof of Theorem 13.5.1). This inequality shows that x is a locally
Lipschitz continuous function, because A
and b have this property.
To establish the local Lipschitz continuity of F_l^{-1}, we consider a fixed
element. It is apparent that the polynomial L =
L₁L₂ ⋯ L_r will be a Lipschitz continuous function of L₁, …, L_r in a
neighbourhood of this fixed element. Further, let M_r ⊂ ⋯ ⊂ M₂ be the
chain of C_L-invariant subspaces corresponding to the factorization L =
L₁L₂ ⋯ L_r. Let Nⱼ = LⱼLⱼ₊₁ ⋯ L_r for j = 2, …, r, and let
where l is the degree of L and mⱼ is the degree of Nⱼ. The projector P on
Mⱼ along the subspace just defined is given by the formula
As
and
520 Applications
of a monic matrix polynomial L(λ), where Lⱼ(λ) are monic matrix
polynomials as well, is stable if for any ε > 0 there exists a δ > 0 such that any
monic matrix polynomial L̃(λ) with σ_l(L̃, L) < δ admits a factorization
L̃(λ) = L̃₁(λ) ⋯ L̃_r(λ), where the L̃ⱼ(λ) are monic matrix polynomials
satisfying
max₁≤ⱼ≤ᵣ σ(L̃ⱼ, Lⱼ) < ε
Theorem 17.2.1
Let equality (17.2.1) be a factorization of the monic matrix polynomial L(λ).
Let (M_r, …, M₂, L(λ)) = F_l^{-1}(L₁, …, L_r) be the corresponding chain of
C_L-invariant subspaces. Then the factorization (17.2.1) is stable if and only if
the chain
is stable.
From (17.2.3) and the fact that C_{Mₘ} → C_L it follows that σ_l(Mₘ, L) → 0. But
then we may assume that for all m the polynomial Mₘ admits a factorization
Corollary 17.2.2
A factorization
with monic matrix polynomials L(λ), L₁(λ), …, L_r(λ) is stable if and only
if the corresponding chain
Theorem 17.2.3
A factorization (17.2.1) is stable if and only if, for any common eigenvalue λ₀
of a pair Lᵢ(λ), Lⱼ(λ) (i ≠ j), we have dim Ker L(λ₀) = 1.
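The criterion can be evaluated numerically as dim Ker L(λ₀) = n − rank L(λ₀). A sketch with hypothetical 2 × 2 linear factors sharing the eigenvalue 0 (the specific factors are ours, chosen for illustration):

```python
import numpy as np

def dim_ker(M, tol=1e-9):
    # dimension of the kernel of a square matrix
    return M.shape[0] - np.linalg.matrix_rank(M, tol=tol)

# Two monic linear factors that share the eigenvalue 0.
L1 = lambda lam: lam * np.eye(2) - np.diag([0.0, 1.0])
L2 = lambda lam: lam * np.eye(2) - np.diag([0.0, 2.0])
L  = lambda lam: L1(lam) @ L2(lam)

# lam0 = 0 is an eigenvalue of both factors, but dim Ker L(0) = 1,
# so by the criterion this factorization is stable.
assert dim_ker(L(0.0)) == 1

# If both factors are singular "in the same direction", dim Ker L(0) = 2
# and the factorization is not stable.
M1 = lambda lam: lam * np.eye(2)
M2 = lambda lam: lam * np.eye(2) - np.diag([0.0, 1.0])
assert dim_ker(M1(0.0) @ M2(0.0)) == 2
```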
Lemma 17.2.4
Let
Matrix Polynomials: Main Results 523
be a transformation from ℂᵐ into ℂᵐ, written in matrix form with respect to
the decomposition ℂᵐ = ℂ^{m₁} ⊕ ℂ^{m₂}. Then ℂ^{m₁} is a stable
invariant subspace for A if and only if for each common eigenvalue λ₀ of A₁
and A₂ the condition dim Ker(λ₀I − A) = 1 is satisfied.
Observe that for i = 1, 2, the Laurent expansion of (λI − Aᵢ)^{-1} at λ₀ has the
form
where Qᵢⱼ are some transformations of Im Pᵢ into itself and the ellipsis on
the right-hand side of (17.2.5) represents a series in nonnegative powers of
(λ − λ₀). From (17.2.5) one sees that P has the form
where Q₁ and Q₂ are certain transformations acting from ℂ^{m₂} into ℂ^{m₁}. It
follows that {0} ≠ Pℂ^{m₁} ≠ Im P if and only if λ₀ ∈ σ(A₁) ∩ σ(A₂). Now
appeal to Theorem 15.2.1 (see the first paragraph of the proof) to finish the
proof. □
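The projector P in this argument is a Riesz projector, which can be evaluated numerically by discretizing the contour integral P = (1/2πi)∮(λI − A)⁻¹ dλ around one eigenvalue group. The matrix and discretization parameters below are our own illustration:

```python
import numpy as np

def riesz_projector(A, lam0, n_pts=400, radius=0.3):
    """Approximate P = (1/2*pi*i) * contour integral of (lam*I - A)^{-1}
    over a circle around lam0 (assumes no other eigenvalue lies inside)."""
    n = A.shape[0]
    P = np.zeros((n, n), dtype=complex)
    for k in range(n_pts):
        t = 2 * np.pi * k / n_pts
        lam = lam0 + radius * np.exp(1j * t)
        dlam = radius * 1j * np.exp(1j * t) * (2 * np.pi / n_pts)
        P += np.linalg.inv(lam * np.eye(n) - A) * dlam
    return P / (2j * np.pi)

A = np.array([[1.0, 5.0],
              [0.0, 2.0]])
P = riesz_projector(A, 1.0)         # spectral projector for the eigenvalue 1
assert np.allclose(P @ P, P, atol=1e-6)       # P is a projector
assert np.allclose(A @ P, P @ A, atol=1e-6)   # P commutes with A
assert abs(np.trace(P).real - 1) < 1e-6       # rank 1: one eigenvalue inside
```

The trapezoid rule on a circle converges very fast for this analytic integrand, so a few hundred quadrature points already give near machine precision here.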
of L(λ) with monic polynomials Mᵢ(λ) satisfying σ_{pᵢ}(Lᵢ(λ), Mᵢ(λ)) < ε (it is
assumed that the degree of Mᵢ is pᵢ) coincides with (17.2.6), that is,
Theorem 17.2.5
A factorization (17.2.6) is stable if and only if it is isolated.
Theorem 17.2.6
Assume that
Monic Matrix Polynomials 525
A factorization
of the monic matrix polynomial L(λ), where L₁(λ), …, L_r(λ) are monic
matrix polynomials as well, is called Lipschitz stable if there exist positive
constants ε and K such that any monic matrix polynomial L̃(λ) with
σ_l(L̃, L) < ε admits a factorization L̃(λ) = L̃₁(λ) ⋯ L̃_r(λ) with monic
matrix polynomials L̃ⱼ(λ) satisfying
Theorem 17.3.1
The factorization (17.3.1) is Lipschitz stable if and only if the corresponding
chain of C_L-invariant subspaces
is Lipschitz stable.
Indeed
Corollary 17.3.2
For the factorization (17.3.1) and the corresponding chain of C_L-invariant
subspaces (17.3.2), the following statements are equivalent: (a) the
factorization (17.3.1) is Lipschitz stable; (b) all the C_L-invariant subspaces
M₂, …, M_r are spectral; (c) for every ε > 0 sufficiently small there exists a
δ > 0 with the property that every nl × nl matrix B with ‖B − C_L‖ < δ has a
unique chain of invariant subspaces N_r ⊂ N_{r−1} ⊂ ⋯ ⊂ N₂ such that
max(θ(M_r, N_r), …, θ(M₂, N₂)) < ε
Now we are ready to state and prove the main result of this section,
namely, the description of Lipschitz stable factorizations. (Recall the
definition of the metric σ_k on matrix polynomials given in Section 17.1.)
Theorem 17.3.3
The following statements are equivalent for a factorization
So, the subspaces Mⱼ are spectral if and only if σ(Lⱼ) ∩ σ(Lₖ) = ∅ for j ≠ k.
Hence the equivalence (a) ⇔ (b) in Theorem 17.3.3 follows from the
equivalence (a) ⇔ (b) in Corollary 17.3.2. Similarly, the equivalence
(a) ⇔ (c) in Theorem 17.3.3 follows from the corresponding equivalence in
Corollary 17.3.2, taking account of Theorem 17.1.2. □
Theorem 17.4.1
The minimal factorization W₀(λ) = W₀₁(λ)W₀₂(λ) ⋯ W₀ₖ(λ) is stable if and
only if each common pole (zero) of W₀ⱼ and W₀ₚ (j ≠ p) is a pole (zero) of
W₀ of geometric multiplicity 1.
for which the subspaces ℒ₁ + ⋯ + ℒₚ (p = 1, …, k) are A₀-invariant and
the subspaces ℒₚ + ℒₚ₊₁ + ⋯ + ℒₖ (p = k, …, 1) are A₀^×-invariant,
where A₀^× = A₀ − B₀C₀. Moreover, the minimal factorization (17.4.3)
corresponding to the direct sum decomposition (17.4.4) is given by
where πⱼ is the projector on ℒⱼ along ℒ₁ + ⋯ + ℒⱼ₋₁ + ℒⱼ₊₁ + ⋯ + ℒₖ;
note that the realizations (17.4.5) are necessarily minimal. In the formula
(17.4.5) the transformations C₀πⱼ: ℒⱼ → ℂⁿ, πⱼA₀πⱼ: ℒⱼ → ℒⱼ, and
πⱼB₀: ℂⁿ → ℒⱼ are understood as matrices of sizes n × lⱼ, lⱼ × lⱼ, and lⱼ × n,
respectively, where lⱼ = dim ℒⱼ, with respect to some basis in ℒⱼ.
Let (A, B, C) be a triple of matrices of sizes δ × δ, δ × n, n × δ,
respectively. Consider the ordered k-tuple Π = (π₁, …, πₖ) of projectors in
ℂ^δ. We say that Π is a supporting k-tuple of projectors with respect to the
triple of matrices (A, B, C) if πᵢπⱼ = πⱼπᵢ = 0 for i ≠ j, π₁ + ⋯ + πₖ = I, the
subspaces Im(π₁ + ⋯ + πₚ) for p = 1, 2, …, k are A-invariant, and the
subspaces Im(πₚ + πₚ₊₁ + ⋯ + πₖ), p = 1, …, k, are A^×-invariant, where
A^× = A − BC. Clearly, Π is a supporting k-tuple of projectors with respect to
(A₀, B₀, C₀) if and only if the subspaces ℒⱼ = Im πⱼ (j = 1, …, k) form a direct
sum decomposition of ℂ^δ as in (17.4.4).
A supporting k-tuple of projectors Π = (π₁, …, πₖ) with respect to
(A, B, C) will be called stable if for every ε > 0 there exists an ω > 0 such
that, for any triple of matrices (A′, B′, C′) of sizes δ × δ, δ × n, n × δ,
respectively, with ‖A − A′‖ + ‖B − B′‖ + ‖C − C′‖ < ω, there exists a
supporting k-tuple of projectors Π′ = (π₁′, …, πₖ′) with respect to (A′, B′, C′)
such that
The first step in the proof of Theorem 17.4.1 is the following lemma.
Lemma 17.4.2
Let (17.4.1) be a minimal realization for W₀(λ), and let Π = (π₁, …, πₖ) be
a supporting k-tuple of projectors with respect to (A₀, B₀, C₀), with the
corresponding minimal factorization
The proof of Lemma 17.4.2 is rather long and technical and is given in
the next section.
Next, we make the connection with stable invariant subspaces.
Lemma 17.4.3
Let Π = (π₁, …, πₖ) be a supporting k-tuple of projectors with respect to
(A₀, B₀, C₀). Then Π is stable if and only if the A₀-invariant subspaces
Im(π₁ + ⋯ + πⱼ), j = 1, …, k, are stable and the A₀^×-invariant subspaces
Im(πⱼ + πⱼ₊₁ + ⋯ + πₖ), j = 1, …, k, are stable as well (as before, A₀^× =
A₀ − B₀C₀).
By Lemmas 17.4.2 and 17.4.3 the factorization (17.4.7) is stable if and only
if the A₀-invariant subspaces ℒⱼ = Im(π₁ + ⋯ + πⱼ), j = 1, …, k, are stable
and the A₀^×-invariant subspaces ℳⱼ = Im(πⱼ + πⱼ₊₁ + ⋯ + πₖ), j = 1, …, k,
are stable as well.
With respect to the decomposition ℂ^δ = Im π₁ + Im π₂ + ⋯ + Im πₖ,
write
In view of Lemma 17.2.4, ℒⱼ is stable if and only if, for every common
eigenvalue λ₀ of
we have dim Ker(λ₀I − A₀) = 1. So all the subspaces ℒ₁, …, ℒₖ are stable if
and only if every common eigenvalue of Aⱼⱼ and Aₚₚ (j ≠ p) is an eigenvalue
of A₀ of geometric multiplicity 1. Similarly, all the subspaces ℳ₁, …, ℳₖ
are stable if and only if every common eigenvalue of Aⱼⱼ^× and Aₚₚ^× with j ≠ p
is an eigenvalue of A₀^× of geometric multiplicity 1. It follows that the
factorization (17.4.7) is stable if and only if every common eigenvalue of Aⱼⱼ
and Aₚₚ (resp. of Aⱼⱼ^× and Aₚₚ^×) with j ≠ p is an eigenvalue of A₀ (resp. of
A₀^×) of geometric multiplicity 1. To finish the proof, observe that the
realizations (17.4.5) are minimal and hence, by Theorem 7.2.3, the poles
(resp. zeros) of W₀ⱼ(λ) coincide with the eigenvalues of πⱼA₀πⱼ = Aⱼⱼ (resp.
eigenvalues of πⱼA₀^×πⱼ = Aⱼⱼ^×). Also, the partial multiplicities of a pole (resp.
zero) λ₀ of W₀ⱼ are equal to the partial multiplicities of λ₀ as an eigenvalue of
Aⱼⱼ (resp. Aⱼⱼ^×). Analogous statements hold for the poles and zeros of W₀(λ)
and eigenvalues of A₀ and A₀^×. □
Then the realization (17.5.1) is minimal. By the stability of Π, there exists a
supporting k-tuple of projectors Π′ = (π₁′, …, πₖ′) with respect to
(A, B, C) such that
Use the inequalities ωⱼ < ε′ together with the estimates ‖Sⱼ^{-1}‖ ≤ (1 − ε′)^{-1} and
‖I − S^{-1}‖ ≤ ε′(1 − ε′)^{-1} (assuming ε′ < 1; cf. the proof of Theorem 16.2.1)
to get
It remains to choose ε′ < 1 in such a way that this expression is less than ε,
and the stability of factorization (17.4.6) is proved.
Conversely, let the factorization (17.4.6) be stable and assume that Π is
not stable. Then there exist an ε > 0 and sequences {Aₘ}ₘ₌₁^∞, {Bₘ}ₘ₌₁^∞,
{Cₘ}ₘ₌₁^∞ such that
is minimal for all m. In view of the stability of (17.4.6), we can also assume
that each Wₘ(λ) admits a minimal factorization
where
Proof of the Auxiliary Lemmas 535
Now let
Furthermore
Then there exist an ε > 0 and a sequence {Aₘ}ₘ₌₁^∞ such that ‖Aₘ −
A₀‖ → 0 as m → ∞ and
In this section we use Theorem 17.4.1 and its proof to derive some useful
information on stable minimal factorizations of rational matrix functions.
First, let us make Theorem 17.4.1 more precise in the sense that if the
minimal factorization
Theorem 17.6.1
Assume that (17.6.1) is a stable minimal factorization, and let
and
and
is stable provided
is small enough.
such that
then necessarily W̃ⱼ(λ) = W₀ⱼ(λ) for each j. It is easily seen that this
definition does not depend on the choice of the minimal realization (17.6.4).
Rational Matrix Functions: Further Deductions 539
From the proof of Theorem 17.4.1 and the fact that the stable invariant
subspaces coincide with the isolated ones (Section 14.3), it is found that this
property also holds for stable minimal factorizations:
Theorem 17.6.2
The minimal factorization (17.6.1) is stable if and only if it is isolated.
Theorem 17.6.3
For the minimal factorization (17.6.1), the following statements are
equivalent: (a) equation (17.6.1) is Lipschitz stable; (b) for every pair of indices
j ≠ p, the rational functions W₀ⱼ(λ) and W₀ₚ(λ) have no common zeros and
no common poles; (c) given minimal realizations (17.6.2) and (17.6.3) of
W₀(λ) and W₀₁(λ), …, W₀ₖ(λ), for every sufficiently small ε > 0 there exists
an ω > 0 such that for any triple (A, B, C) with ‖A − A₀‖ + ‖B − B₀‖ +
‖C − C₀‖ < ω the realization
satisfying
where W(λ) and V(λ) are rational matrix functions of suitable sizes that
take finite values at infinity. (See Sections 7.6-7.8 for the definition and
basic facts on linear fractional decompositions.)
In informal terms, the stability of (17.7.1) means that any rational matrix
function Ũ(λ) sufficiently close to U(λ) admits a minimal linear fractional
decomposition Ũ(λ) = ℱ_{W̃}(Ṽ), where the rational matrix functions W̃(λ)
and Ṽ(λ) are as close as we wish to W(λ) and V(λ), respectively. To make
this notion precise, we resort to minimal realizations for the matrix functions
involved. Thus let
be a minimal realization of U(λ), where α, β, γ, and δ are matrices of sizes
l × l, l × s, q × l, and q × s, respectively. Also, let
and
be minimal realizations of W(λ) and V(λ). We say that the minimal linear
fractional decomposition (17.7.1) is Lipschitz stable if there exist positive
constants ε and K such that any q × s rational matrix function Ũ(λ) that
admits a realization
with
Decompositions of Rational Matrix Functions 541
where the rational matrix functions W̃(λ) and Ṽ(λ) admit realizations
It is assumed, of course, that the sizes of two matrices coincide each time
their difference appears in the preceding inequalities.
Since any two minimal realizations of the same rational matrix function
are similar (Theorems 7.1.4 and 7.1.5), it is easily seen that the definition of
Lipschitz stability does not depend on the particular choice of minimal
realizations for U(λ), W(λ), and V(λ).
It is remarkable that a large class of minimal linear fractional decom-
positions is Lipschitz stable, as opposed to the factorization of monic matrix
polynomials and the minimal factorization of rational matrix functions,
where Lipschitz stability is exceptional in a certain sense (Sections 17.3 and
17.6).
Theorem 17.7.1
Let
which means the right invertibility of [β, αβ, …, α^{l−1}β] and the left
invertibility of
Here F₁ = F|_{M₁}: M₁ → ℂˢ and G₁ = Q_{M₁}G: ℂ^q → M₁, where Q_{M₁} stands for
the projector on M₁ along M₂, are transformations written as matrices with
respect to the basis f₁, …, fₖ (and the standard orthonormal bases in ℂˢ
and ℂ^q), and the analogous transformations are similarly
defined matrices with respect to some basis g₁, …, gₖ in M̃₁, where Q_{M̃₁} is
the projector on M̃₁ along M̃₂.
for some invertible transformation S: ℂˡ → ℂˡ such that SM₁ = M₁ and
It remains to choose
does not have nontrivial minimal factorizations at all. On the other hand,
(17.7.8) can be represented as a minimal linear fractional decomposition
with
Proposition 17.8.1
For an m × n matrix X, the subspace G(X) is T-invariant if and only if X
satisfies (17.8.1).
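Proposition 17.8.1 can be checked numerically. Writing T = [[A, B], [C, D]] and G(X) for the span of the columns of [I; X], the invariance condition reduces to the quadratic matrix equation XBX + XA − DX − C = 0; this explicit form of (17.8.1), and the sign conventions, are our reconstruction rather than a quote from the book:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
D = rng.standard_normal((m, m))
X = rng.standard_normal((m, n))

# Choose C so that X satisfies XBX + XA - DX - C = 0 by construction.
C = X @ B @ X + X @ A - D @ X
T = np.block([[A, B], [C, D]])

# G(X) = { (x, Xx) : x in C^n } is spanned by the columns of [I; X].
G = np.vstack([np.eye(n), X])
# T-invariance: every column of T @ G must lie in the column span of G.
coeffs, *_ = np.linalg.lstsq(G, T @ G, rcond=None)
assert np.allclose(G @ coeffs, T @ G)
```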
Lemma 17.8.2
Define a function 𝒢 from the set M_{m×n} of all m × n matrices to the set of all
subspaces in ℂⁿ ⊕ ℂᵐ by 𝒢(X) = G(X). Then 𝒢 is a homeomorphism (i.e.,
a bijective map that is continuous together with its inverse) between M_{m×n} and
the set of all subspaces ℳ ⊂ ℂⁿ ⊕ ℂᵐ with the property that θ(ℳ, ℋ) < 1,
where ℋ = ℂⁿ ⊕ {0}.
Here θ(ℳ, 𝒩) is the gap between ℳ and 𝒩 (see Chapter 13).
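Both objects in the lemma are easy to compute: with M = [I; X], the orthogonal projector onto G(X) is M(MᵀM)⁻¹Mᵀ with MᵀM = I + XᵀX, and the gap is the operator norm of a difference of orthogonal projectors. A sketch (the matrices are ours, chosen for illustration):

```python
import numpy as np

def graph_projector(X):
    """Orthogonal projector onto G(X), the column span of M = [I; X]:
    P = M (I + X^T X)^{-1} M^T."""
    n = X.shape[1]
    M = np.vstack([np.eye(n), X])
    return M @ np.linalg.inv(np.eye(n) + X.T @ X) @ M.T

def gap(P, Q):
    # theta(M, N) = || P_M - P_N || in the operator (spectral) norm
    return np.linalg.norm(P - Q, ord=2)

X = np.array([[0.5, -1.0]])              # 1x2, so G(X) lives in C^2 (+) C^1
P = graph_projector(X)
assert np.allclose(P @ P, P) and np.allclose(P, P.T)   # orthogonal projector

H = graph_projector(np.zeros((1, 2)))    # projector onto C^2 (+) {0}
assert gap(P, H) < 1                     # consistent with the lemma
```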
Proof. The continuity of 𝒢 and 𝒢^{-1} follows from the easily verified fact
that the orthogonal projector P on G(X) is given by
and by the formula
Isolated Solutions of Matrix Quadratic Equations 547
But L is invertible, so
then the vector v = Q^{-1}u has the property that v ∈ ℳ, P_ℋv = u − P_Xv, and
therefore y belongs to
ℳ.
Theorem 17.8.3
The following statements are equivalent: (a) X₀ is an isolated solution of
(17.8.1); (b) X₀ is an inaccessible solution of (17.8.1); (c) for every
eigenvalue λ₀ of the matrix
or
is a homeomorphism between the set of all solutions Y of (17.8.5) and the set
of T₀-invariant subspaces ℳ such that θ(ℳ, ℋ) < 1, where ℋ = ℂⁿ ⊕ {0}.
Hence 0 is an isolated (resp. inaccessible) solution of (17.8.6) if and only if
ℋ is an isolated (resp. inaccessible) T₀-invariant subspace. An application of
Theorem 14.3.1 and Proposition 14.3.3 shows that (a), (b), and (c) are
equivalent.
Further, the characteristic polynomial of T₀ is the product of the
characteristic polynomials of A + BX₀ and D − X₀B. As the multiplicity of λ₀ as a
zero of the characteristic polynomial of a matrix S is equal to the dimension
of the root subspace ℛ_{λ₀}(S), it follows that λ₀ is a common eigenvalue of A + BX₀ and
D − X₀B if and only if
from the theory of linear equations that equation (17.8.7) either has no
solutions, has a unique solution, or has infinitely many solutions. [In this
case the homogeneous equation
Corollary 17.8.4
The equation YA - DY = 0 has only the trivial solution Y = 0 if and only if
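The displayed condition completing this corollary is lost here; presumably it is σ(A) ∩ σ(D) = ∅, the standard Sylvester-equation criterion. Numerically one can test it by vectorizing the map Y ↦ YA − DY (column-major vec, so vec(YA) = (Aᵀ ⊗ I)vec(Y) and vec(DY) = (I ⊗ D)vec(Y)):

```python
import numpy as np

def sylvester_operator(A, D):
    """Matrix of Y -> YA - DY acting on vec(Y) (column-major vec)."""
    m = D.shape[0]
    n = A.shape[0]
    return np.kron(A.T, np.eye(m)) - np.kron(np.eye(n), D)

A = np.diag([1.0, 2.0])
D = np.diag([3.0, 4.0])                  # sigma(A) and sigma(D) are disjoint
S = sylvester_operator(A, D)
assert np.linalg.matrix_rank(S) == S.shape[0]   # only Y = 0 solves YA - DY = 0

D_bad = np.diag([2.0, 4.0])              # shares the eigenvalue 2 with A
S_bad = sylvester_operator(A, D_bad)
assert np.linalg.matrix_rank(S_bad) < S_bad.shape[0]   # nontrivial kernel
```

The eigenvalues of this operator are exactly the differences aᵢ − dⱼ over the two spectra, which explains the dichotomy.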
Corollary 17.8.5
If the matrix
so by Proposition 17.8.1 and Lemma 17.8.2 there exist only two solutions
given by
We remark that a T-invariant subspace ℳ has this form if and only if the
transformation [I 0]|_ℳ: ℳ → ℂⁿ is invertible. In this way we recover the
description of right divisors of L(λ) given in Section 5.3. Similarly, the
equation
Stability of Solutions of Matrix Quadratic Equations 551
the equation
has a solution Y for which ‖Y − X‖ < ε. It turns out that the situation with
regard to stability and isolatedness is analogous to that for invariant
subspaces.
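The analogy can be made concrete: when the solution is stable, a Newton iteration started at X converges to a nearby solution of a slightly perturbed equation. A sketch, again using the assumed form XBX + XA − DX − C = 0 of the quadratic equation and our own test data; each Newton step solves a Sylvester-type linear equation via Kronecker vectorization:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 2, 2
A = np.diag([1.0, 2.0])
B = np.eye(2)
D = np.diag([5.0, 6.0])
X = rng.standard_normal((m, n)) * 0.1
C = X @ B @ X + X @ A - D @ X          # X solves F(X) = 0 exactly

def newton_step(X, A, B, C, D):
    """One Newton step for F(X) = XBX + XA - DX - C = 0: solve
    H(A + BX) - (D - XB)H = -F(X) for the correction H."""
    F = X @ B @ X + X @ A - D @ X - C
    M = A + B @ X                       # right coefficient
    N = D - X @ B                       # left coefficient
    K = np.kron(M.T, np.eye(m)) - np.kron(np.eye(n), N)
    h = np.linalg.solve(K, -F.flatten(order="F"))
    return X + h.reshape((m, n), order="F")

# Perturb a coefficient slightly; Newton from X converges to a nearby solution.
Ap = A + 1e-3 * rng.standard_normal((n, n))
Y = X.copy()
for _ in range(10):
    Y = newton_step(Y, Ap, B, C, D)
assert np.linalg.norm(Y @ B @ Y + Y @ Ap - D @ Y - C) < 1e-10
assert np.linalg.norm(Y - X) < 0.1      # the solution moved only slightly
```

The linearized step is invertible here because σ(A + BX) and σ(D − XB) are well separated, which mirrors the spectral condition for Lipschitz stability below.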
Theorem 17.9.1
A solution X of equation (17.9.1) is stable if and only if X is isolated.
Proof. It is sufficient to prove the theorem for the case when C = 0 and
the solution X is the zero matrix (see the proof of Theorem 17.8.3). In this
case G(X) = ℂⁿ ⊕ {0}; so the homeomorphism described in Lemma 17.8.2
implies that X = 0 is a stable (resp. isolated) solution of
Theorem 17.9.2
Let X be a stable solution of (17.9.1). Then every solution Y of equation
where A', B', C', and D' are matrices of appropriate sizes, is stable provided
is small enough.
the equation
Theorem 17.9.3
A solution X of (17.9.1) is Lipschitz stable if and only if σ(A + BX) ∩ σ(D − XB) = ∅.
Similarly, one can obtain the following fact from Theorem 15.5.1: the
solution X in (17.9.1) is Lipschitz stable if and only if for every sufficiently
small ε > 0 there exists a δ > 0 such that
In this section we quickly review some real analogs of the results obtained in
this chapter.
where Lⱼ(λ) are monic matrix polynomials with real coefficients. Using the
results of Section 15.9 and the approach developed in the proof of Theorem
17.3.1, one obtains necessary and sufficient conditions for stability of the
factorization (17.10.1) (the analog of Corollary 17.2.2). The definition of a
stable factorization of real monic matrix polynomials is the same as in the
complex case, except that now only real matrix polynomials are allowed as
perturbations of L(λ) and as factors in a factorization of the perturbed
polynomial.
Theorem 17.10.1
Let C_L be the companion matrix of L(λ), and let
In contrast with the complex case (Theorem 17.2.5), not every isolated
real factorization (17.10.1) is stable. Using the description of isolated
invariant subspaces for real transformations (Section 15.9), one finds that
(17.10.1) is isolated if and only if the condition (a) in Theorem 17.10.1
holds.
Now we pass to the stability of minimal factorizations
of a rational matrix function W₀(λ) such that the entries of W₀(λ) are real
for real λ. (In short, such rational matrix functions are called real.) The
functions W₀ⱼ(λ) are also assumed to be real, and, in addition, we require
that all rational matrix functions involved are n × n and take the value I at
infinity. Again, the stability of (17.10.2) is defined as in the complex case
with only real rational matrix functions allowed. The main result on stability
of (17.10.2) is the following analog of Theorem 17.4.1.
The Real Case 555
Theorem 17.10.2
The minimal factorization (17.10.2) of the real rational matrix function
W₀(λ) with W₀(∞) = I, where for j = 1, 2, …, k, W₀ⱼ(λ) is also a real
rational matrix function with W₀ⱼ(∞) = I, is stable if and only if the following
conditions hold: (a) each common pole (zero) of W₀ⱼ and W₀ₚ (j ≠ p) is a
pole (zero) of W₀ of geometric multiplicity 1; (b) each even order real pole λ₀
of W₀ (resp. of W₀^{-1}) is also a pole of each W₀ⱼ (resp. of each W₀ⱼ^{-1}) of even
order (if λ₀ is a pole of W₀ⱼ or of W₀ⱼ^{-1} at all).
One verifies easily that W₀(λ) = W₀₁(λ)W₀₂(λ) and this factorization is
minimal (indeed, the McMillan degree of W₀(λ) is 2, whereas the McMillan
degrees of W₀₁(λ) and W₀₂(λ) are 1). Furthermore
so W₀₁(λ) and W₀₂(λ) do not have common zeros. It is easily seen that
λ₀ = 0 is a common pole of W₀(λ), W₀₁(λ), and W₀₂(λ) and that the only
negative partial multiplicities of W₀(λ), W₀₁(λ), and W₀₂(λ) at λ₀ are −2, −1,
and −1, respectively. Hence condition (a) of Theorem 17.10.2 is satisfied,
the equation
has a real solution Y for which ‖Y − X‖ < ε. The isolated and stable
solutions can be characterized as follows.
Theorem 17.10.3
The solution X₀ of (17.10.3) is isolated if and only if every common
eigenvalue of A + BX₀ and D − X₀B has geometric multiplicity 1 as an
eigenvalue of the matrix
The solution X₀ is stable if and only if it is isolated and, in addition, for every
real eigenvalue λ₀ of T with even algebraic multiplicity the algebraic
multiplicity of λ₀ as an eigenvalue of A + BX₀ (or of D − X₀B) is even (if λ₀ is an
eigenvalue of A + BX₀, or of D − X₀B, at all).
and thus the algebraic multiplicity m(T; λ₀) for the eigenvalue λ₀ of T is
equal to the sum of the algebraic multiplicities m(A + BX₀; λ₀) and m(D −
X₀B; λ₀). Consequently, if m(T; λ₀) is even, then the evenness of one of
the numbers m(A + BX₀; λ₀) and m(D − X₀B; λ₀) implies the evenness of
the other.
Again, we omit the proof of Theorem 17.10.3. It can be obtained by
using an argument similar to the proofs of Theorems 17.8.3 and 17.9.1,
using the description of stable and isolated invariant subspaces for real
transformations (Section 15.9) and taking into account equation (17.10.4).
17.11 EXERCISES
17.1 Find all stable factorizations (whose factors are linear matrix poly-
nomials) of the monic matrix polynomial
is stable as well.
(b) Show that the converse of statement (a) is generally false.
(c) Show that the factorization (1) is stable in the algebra of all
matrices of type
where αⱼ and βⱼ are complex numbers, and let L(λ) be a monic
matrix polynomial with coefficients from the algebra V. Describe
factorizations of L(λ) that are stable in the algebra V. (Hint: Use
Exercise 17.7.)
17.12 Find all stable minimal factorizations of the rational matrix function
Chapter 13. This chapter contains mainly well-known results. The main
ideas and results concerning the metric space of subspaces appeared first in
the infinite dimensional framework [see Krein, Krasnoselskii and Milman
(1948); Gohberg and Markus (1959); and also Gohberg and Krein (1957)], and
they are adapted here for the finite-dimensional case. The contents of Sections
13.1 and 13.4 are standard. The exposition presented here is based on that of
Chapter S4 in the authors' book (1982) [see also Kato (1976)]. Theorem 13.2.3
is from Gohberg and Markus (1959). The exposition in Section 13.3 follows
Section 7.2 in Bart, Gohberg, and Kaashoek (1979). Theorem 13.6.3, along with
other related results, was obtained in Gohberg and Leiterer (1972) as a
consequence of general properties of cocycles in certain algebras of continuous
matrix functions. Theorem 13.5.1 appears in the infinite dimensional framework
in Gohberg and Krupnik (1979); here we follow the authors' book (1983b). The
material on normed spaces presented in Section 13.8 is standard knowledge. For
the first part of this section we made use of the exposition in Lancaster and
Tismenetsky (1985).
Chapter 14. The description of connected components in the set of
invariant subspaces (Sections 14.1 and 14.2) is found in Douglas and Pearcy
(1968) [see also Shayman (1982)]. An identification of isolated invariant
subspaces is given in Douglas and Pearcy (1968). Note that in the infinite-
dimensional framework (Hilbert space and bounded linear operators) there
exist inaccessible invariant subspaces that are not isolated [see Douglas and
Pearcy (1968)]. Theorem 14.3.5 was originally proved in the infinite-
dimensional case [Douglas and Pearcy (1968)]. The results on coinvariant
and semiinvariant subspaces in Section 14.5 appear here for the first time.
Chapter 15. Theorem 15.2.1 appeared in Bart, Gohberg and Kaashoek
(1978) and Campbell and Daughtry (1979). The proof presented here
follows the exposition in Bart, Gohberg and Kaashoek (1979). The
equivalence (a) ⇔ (b) of Theorem 15.5.1 was first proved in Kaashoek, van der Mee and
Rodman (1982). The statement of Theorem 15.5.1 and the remaining proof
is taken from Ran and Rodman (1983). Theorem 15.7.1 was proved in
Conway and Halmos (1980). Theorem 15.8.1, although not stated in this
way, was proved in Gohberg and Rubinstein (1985). The material of Section
15.9 is based on Bart, Gohberg and Kaashoek (1979). Theorem 15.10.1 was
562 Notes to Part 3
proved in den Boer and Thijsse (1980) and Markus and Parilis (1980).
Theorem 15.10.2 is suggested by Theorem 2.4 in den Boer and Thijsse
(1980).
The results of this chapter play an important role in explicit numerical
computation of invariant subspaces. However, we do not touch the topic of
numerical computation in this book, and refer the reader to the following
sources: Bart, Gohberg, Kaashoek and van Dooren (1980); Golub and
Wilkinson (1976); Ruhe (1970a, 1970b); van Dooren (1981, 1983); and
Golub and van Loan (1983).
Chapter 16. Most of the results and the exposition of the material in this
chapter are taken from Gohberg and Rodman (1986). Corollary 16.1.3
appeared in Brickman and Fillmore (1967). Lemma 16.5.1 is a particular
case of a result due to Ostrowski [see pages 334-335 in Ostrowski (1973)].
Chapter 17. The main results of Section 17.2 (where the case of
factorization into the product of two factors L(λ) = L₁(λ)L₂(λ) was con-
sidered) are from Bart, Gohberg and Kaashoek (1978). The exposition of
Sections 17.1 and 17.2 follows Gohberg, Lancaster, and Rodman (1982),
where only the case of two factors was considered [see also the authors'
paper (1979)]. The results of Section 17.3 are presented here probably for
the first time. The main part of the contents of Section 17.4, as well as
Theorems 17.6.1 and 17.6.2, is taken from Bart, Gohberg and Kaashoek
(1979). Lemma 17.8.2 is taken from Campbell and Daughtry (1979). The
main results of Section 17.7 are from Gohberg and Rubinstein (1985).
Example 17.10.1 is taken from Chapter 9 in Bart, Gohberg and Kaashoek
(1979).
Part Four
Analytic Properties
of Invariant
Subspaces
This part is devoted to the study of transformations that depend analytically
on a parameter, and to the dependence of their invariant subspaces on the
parameter. We begin with the simplest invariant subspaces, the kernel and
image of the transformation, and this already requires the development of a
theory of analytic families of invariant subspaces. Also, the solution of some
basic problems is required, such as the existence of analytic bases and
analytic complements for analytic families of subspaces. This material is all
presented in Chapter 18 and is probably presented in a book on linear
algebra for the first time. More generally, these results appeared first in the
theory of analytic fibre bundles.
The study of more sophisticated objects and their dependence on the
complex parameter z is the subject of Chapter 19. These include irreducible
subspaces, the Jordan form, and Jordan bases. These results can be viewed
as extensions of perturbation theory for analytic families of transformations.
The final chapter of Part 4 (and of the book) contains applications of the
two preceding chapters to problems that have already appeared in earlier
chapters, but now in the context of analytic dependence on a parameter.
These applications include the factorization of matrix polynomials and
rational matrix functions and the solution of quadratic matrix equations.
Chapter Eighteen
Analytic Families
of Subspaces
Let Ω be a domain (i.e., a connected open set) in the complex plane ℂ, and
assume that for every z ∈ Ω a transformation A(z): ℂⁿ → ℂᵐ is given. We
say that A(z) is an analytic family on Ω if in a neighbourhood U_{z₀} of each
point z₀ ∈ Ω the transformation-valued function A(z) admits representation
as a power series
565
566 Analytic Families of Subspaces
representing A(z) in fixed bases in <p" and (pm are analytic functions of z on
the domain ftr Obviously, this definition does not depend on the choice of
these bases.
Now let {M(z)}_{z∈Ω} be a family of subspaces of ℂ^n; so for every z in Ω, M(z) is a subspace of ℂ^n. We say that the family {M(z)}_{z∈Ω} is analytic on Ω if for every z_0 ∈ Ω there exist a neighbourhood U_{z_0} ⊂ Ω of z_0, a subspace M ⊂ ℂ^n, and an invertible transformation A(z): ℂ^n → ℂ^n that depends analytically on z in U_{z_0}, such that

    M(z) = A(z)M   for all z ∈ U_{z_0} .
Proposition 18.1.1
Let x_1(z), …, x_p(z) be analytic functions of z on the domain Ω whose values are n-dimensional vectors. If for every z_0 ∈ Ω the vectors x_1(z_0), …, x_p(z_0) are linearly independent, then {Span{x_1(z), …, x_p(z)}}_{z∈Ω} is an analytic family of subspaces on Ω.
Proposition 18.1.2
Let A(z): ℂ^n → ℂ^m be an analytic family of transformations on Ω, and assume that dim Ker A(z) is constant (i.e., independent of z for z in Ω). Then Ker A(z) is an analytic family of subspaces (of ℂ^n) on Ω, whereas Im A(z) is an analytic family of subspaces (of ℂ^m) on Ω.
Note that dim Ker A(z) is constant on Ω if and only if the rank of A(z) is constant or, equivalently, the dimension of Im A(z) is constant.
size p × p of A(z_0), which will be supposed to lie in the upper left corner of A(z_0). Partition A(z) accordingly:

    A(z) = [ B(z)  C(z) ]
           [ D(z)  E(z) ]

where B(z), C(z), D(z), and E(z) are matrix functions of sizes p × p, p × (n − p), (m − p) × p, and (m − p) × (n − p), respectively, and are analytic on Ω. For some neighbourhood U of z_0 we have det B(z) ≠ 0 for z ∈ U. If a vector (x, y), x ∈ ℂ^p, y ∈ ℂ^{n−p}, belongs to Ker A(z) and z ∈ U, then x = −B(z)^{-1}C(z)y. It follows that dim Ker A(z) = dim Ker[−D(z)B(z)^{-1}C(z) + E(z)]. But dim Ker A(z) is independent of z and equal to n − p; consequently, −D(z)B(z)^{-1}C(z) + E(z) = 0 for all z ∈ U.
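The kernel-dimension identity used here, dim Ker A = dim Ker(−DB⁻¹C + E), can be tested numerically. A sketch with randomly generated blocks (our own construction, with the Schur complement forced to vanish):

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 2, 2                                   # block sizes: here m = n = p + q

B = rng.standard_normal((p, p)) + np.eye(p)   # generically invertible p x p
C = rng.standard_normal((p, q))
D = rng.standard_normal((q, p))
E = D @ np.linalg.inv(B) @ C                  # forces -D B^{-1} C + E = 0

A = np.block([[B, C], [D, E]])

def null_dim(M, tol=1e-10):
    # dimension of the kernel via singular values
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s < tol))

S = -D @ np.linalg.inv(B) @ C + E             # the Schur complement
assert null_dim(S) == q                       # S = 0 here, so dim Ker S = q
assert null_dim(A) == null_dim(S)             # the identity used in the proof
```

With a generic (nonzero) block E instead, both kernel dimensions drop to 0, again in agreement with the identity.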
Note that the singular set S(A) is discrete; that is, for every z_0 ∈ S(A) there is a neighbourhood U ⊂ Ω of z_0 such that U ∩ S(A) = {z_0}.
Theorem 18.2.1
Let A(z): ℂ^n → ℂ^m be an analytic family of transformations on Ω, and let r = max_{z∈Ω} rank A(z). Then there exist m-dimensional vector-valued functions y_1(z), …, y_r(z) and n-dimensional vector-valued functions y_{r+1}(z), …, y_n(z), all analytic on Ω, such that for every z ∈ Ω the vectors y_1(z), …, y_r(z) are linearly independent, the vectors y_{r+1}(z), …, y_n(z) are linearly independent, and the inclusions

    Im A(z) ⊂ Span{y_1(z), …, y_r(z)}     (18.2.1)
    Span{y_{r+1}(z), …, y_n(z)} ⊂ Ker A(z)     (18.2.2)

hold for every z ∈ Ω, while for every z outside the singular set of A(z) the equalities

    Im A(z) = Span{y_1(z), …, y_r(z)}     (18.2.3)
    Span{y_{r+1}(z), …, y_n(z)} = Ker A(z)     (18.2.4)

hold.
Lemma 18.2.2
Let x_1(z), …, x_r(z) be n-dimensional vector-valued functions that are analytic on a domain Ω in the complex plane. Assume that for some z_0 ∈ Ω the vectors x_1(z_0), …, x_r(z_0) are linearly independent. Then there exist n-dimensional vector functions y_1(z), …, y_r(z) with the following properties: (a) y_1(z), …, y_r(z) are analytic on Ω; (b) y_1(z), …, y_r(z) are linearly independent for every z ∈ Ω; (c) Span{y_1(z), …, y_r(z)} = Span{x_1(z), …, x_r(z)} (⊂ ℂ^n) for every z ∈ Ω \ Ω_0, where Ω_0 = {z ∈ Ω | x_1(z), …, x_r(z) are linearly dependent}. If, in addition, for some s (≤ r) the vector functions x_1(z), …, x_s(z) are linearly independent for all z ∈ Ω, then y_i(z), i = 1, …, r, can be chosen in such a way that (a)–(c) hold and, moreover, y_i(z) = x_i(z) for i = 1, …, s and all z ∈ Ω.
In the proof of Lemma 18.2.2 we use two classical results (see Chapter 3
of Markushevich (1965), Vol. 3, for example) in the theory of analytic and
meromorphic functions that are stated here for the reader's convenience.
Analytic Families of Transformations 571
Lemma 18.2.3
(Weierstrass's theorem). Let S ⊂ Ω be a discrete set, and for every z_0 ∈ S let a positive integer s(z_0) be given. Then there exists a (scalar) function f(z) that is analytic on Ω, for which the set of zeros of f(z) coincides with S and, for every z_0 ∈ S, the multiplicity of z_0 as a zero of f(z) is exactly s(z_0).
Lemma 18.2.4
(Mittag-Leffler theorem). Let S ⊂ Ω be a discrete set, and for every z_0 ∈ S let a rational function of the type

    Σ_{j=1}^{s(z_0)} a_j(z_0) (z − z_0)^{-j}

be given. Then there exists a function that is analytic on Ω \ S and whose singular part at every z_0 ∈ S coincides with the given rational function.

Proof of Lemma 18.2.2. The set Ω_0 = {z ∈ Ω | x_1(z), …, x_r(z) are linearly dependent} is discrete. Disregarding the trivial case when Ω_0 is empty, we can write Ω_0 = {ζ_1, ζ_2, …}, where ζ_j ∈ Ω, j = 1, 2, …, is a finite or countable sequence with no limit points inside Ω.
Let us show that for every j = 1, 2, … there exist a positive integer s_j and scalar functions a_{1j}(z), …, a_{r−1,j}(z), analytic in a neighbourhood of ζ_j, such that the system of n-dimensional analytic vector functions on Ω
Clearly, the columns b_1(z), …, b_r(z) of B(z) are analytic and linearly independent vector functions in a neighbourhood V(ζ_j) of ζ_j. From formula (18.2.6) it is clear that Span{x_1(z), …, x_r(z)} = Span{b_1(z), …, b_r(z)} for z ∈ V(ζ_j) \ {ζ_j}. Further, from (18.2.6) we obtain that the columns b_1(z), …, b_r(z) of B(z) have the form (18.2.5), where the a_{ij}(z) are analytic scalar functions in a neighbourhood of ζ_j.
Now choose y_1(z), …, y_r(z) in the form

where the scalar functions g_i(z) are constructed as follows: (a) g_r(z) is analytic and different from zero in Ω except for the set of poles ζ_1, ζ_2, …, with corresponding multiplicities s_1, s_2, …; (b) the functions g_i(z) (for i = 1, …, r − 1) are analytic in Ω except for the poles ζ_1, ζ_2, …, and the singular part of g_i(z) at ζ_j (for j = 1, 2, …) is equal to the singular part of a_{ij}(z)g_r(z) at ζ_j.
Let us check the existence of such functions g_i(z). Let g_r(z) be the inverse of an analytic function with zeros at ζ_1, ζ_2, …, with corresponding multiplicities s_1, s_2, … (such an analytic function exists by Lemma 18.2.3). The functions g_1(z), …, g_{r−1}(z) are constructed by using the Mittag-Leffler theorem (Lemma 18.2.4).
Property (a) ensures that y_1(z), …, y_r(z) are linearly independent for every z ∈ Ω \ {ζ_1, ζ_2, …}. In a neighbourhood of each ζ_j we have

[the last equality follows from the linear independence of x_1(z), …, x_r(z) for z ∈ Ω \ Ω_0]. We now prove that

So B(z) is analytic on V; since z′ ∈ Ω \ Ω_0 was arbitrary, B(z) is analytic on Ω \ Ω_0.
Moreover, B(z) admits analytic continuation to the whole of Ω, as follows. Let z_0 ∈ Ω_0, and let Y(z)^L be a left inverse of Y(z) that is analytic in a neighbourhood V_0 of z_0. [The existence of such a left inverse is proved as above.] Define B(z) as Y(z)^L A(z) for z ∈ V_0. Clearly, B(z) is analytic on V_0, and for z ∈ V_0 \ {z_0} this definition coincides with (18.2.11) in view of the uniqueness of B(z). So B(z) is analytic on Ω.
Now it is clear that (18.2.10) holds also for z ∈ Ω_0, which proves (18.2.9). Consideration of dimensions shows that in fact equality holds in (18.2.9), unless rank A(z) < r. Thus (18.2.1) and (18.2.3) are proved.
We pass now to the proof of the existence of y_{r+1}(z), …, y_n(z) such that (b), (18.2.2), and (18.2.4) hold. Let a_1(z), …, a_r(z) be the first r rows of A(z). By assumption, a_1(z), …, a_r(z) are linearly independent for some z ∈ Ω. Apply Lemma 18.2.2 to construct n-dimensional analytic row functions b_1(z), …, b_r(z) such that for all z ∈ Ω the rows b_1(z), …, b_r(z) are linearly independent.
Global Properties of Analytic Families of Subspaces 575
Fix z_0 ∈ Ω, and let b_{r+1}, …, b_n be n-dimensional rows such that the vectors b_1(z_0)^T, …, b_r(z_0)^T, b_{r+1}^T, …, b_n^T form a basis in ℂ^n. Applying Lemma 18.2.2 again [for x_1(z) = b_1(z)^T, …, x_r(z) = b_r(z)^T, x_{r+1}(z) = b_{r+1}^T, …, x_n(z) = b_n^T], we construct n-dimensional analytic row functions b_{r+1}(z), …, b_n(z) such that the n × n matrix B(z) with rows b_1(z), …, b_n(z) is nonsingular for all z ∈ Ω. Then the inverse B(z)^{-1} is analytic on Ω. Let y_{r+1}(z), …, y_n(z) be the last n − r columns of B(z)^{-1}. We claim that (b), (18.2.2), and (18.2.4) are satisfied with this choice.
Indeed, (b) is evident. Take z ∈ Ω \ Ω_0; from (18.2.12) and the construction of y_{r+1}(z), …, y_n(z) it follows that b_i(z)y_j(z) = 0 for i = 1, …, r and j = r + 1, …, n. But since z ∉ Ω_0, every row of A(z) is a linear combination of the first r rows, so in fact A(z)y_j(z) = 0 for j = r + 1, …, n; that is, the inclusion (18.2.13) holds. Passing to the limit as z approaches a point of Ω_0, we find that (18.2.14), as well as the inclusion (18.2.13), holds for every z ∈ Ω. Consideration of dimensions shows that equality holds in (18.2.13) if and only if rank A(z) = r. □
Theorem 18.3.1
Let {M(z)}_{z∈Ω} be an analytic family of subspaces (of ℂ^n) on Ω. Then there exist an invertible transformation A(z): ℂ^n → ℂ^n that is analytic on Ω and a subspace M ⊂ ℂ^n such that M(z) = A(z)M for all z ∈ Ω.
Theorem 18.3.2
For an analytic family of subspaces M(z) (of ℂ^n) on Ω the following properties hold: (a) there exist n-dimensional vector functions x_1(z), …, x_p(z) that are analytic on Ω and such that, for each z ∈ Ω, the vectors x_1(z), …, x_p(z) are linearly independent and Span{x_1(z), …, x_p(z)} = M(z); (b) there exists an analytic family of projectors P(z) on Ω such that Im P(z) = M(z) for all z ∈ Ω; (c) there exists an analytic family of subspaces N(z) on Ω such that M(z) ∔ N(z) = ℂ^n for all z ∈ Ω.
Note that property (b) [as well as property (a)] is characteristic for analytic families of subspaces. So, if P(z) is an analytic family of projectors on Ω, then Im P(z) is an analytic family of subspaces. We leave the verification of this statement to the reader.
In connection with Theorem 18.3.2(c), note that the orthogonal complement M(z)^⊥ is usually not an analytic family. For instance, for M(z) = Span{(1, z)} ⊂ ℂ^2 the orthogonal complement M(z)^⊥ = Span{(−z̄, 1)} depends on z̄ and hence does not depend analytically on z.
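The non-analyticity of the orthogonal complement can be made concrete. A small numerical sketch (our own instance of the phenomenon, not an example from the text):

```python
import numpy as np

# For M(z) = Span{(1, z)} in C^2 the orthogonal projector onto M(z) is
#   P(z) = v v* / (v* v),  v = (1, z).
# Its entries involve conj(z) -- e.g. P(z)[0, 0] = 1 / (1 + |z|^2) -- so
# neither P(z) nor M(z)^perp = Im(I - P(z)) depends analytically on z,
# although M(z) itself is an analytic family.

def proj(z):
    v = np.array([[1.0], [z]], dtype=complex)
    return (v @ v.conj().T) / (v.conj().T @ v)[0, 0]

P = proj(1j)
assert np.allclose(P @ P, P)            # a projector ...
assert np.allclose(P.conj().T, P)       # ... and an orthogonal one
assert abs(P[0, 0] - 0.5) < 1e-12       # 1/(1 + |i|^2) = 1/2: a function of |z|,
                                        # hence not analytic in z alone
```

By contrast, the (non-orthogonal) projector onto M(z) along the fixed complement Span{(0, 1)} has entries 1 and z only, and is analytic.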
Theorem 18.3.3
Let M(z) and N(z) be analytic families of subspaces (of ℂ^n) on Ω such that M(z) ⊂ N(z) for all z ∈ Ω. Then there exist n-dimensional vector functions x_1(z), …, x_p(z) [where p = dim N(z) − dim M(z)] that are analytic on Ω and such that, for each z ∈ Ω, the vectors x_1(z), …, x_p(z) form a basis in N(z) modulo M(z).
Proof. By Theorem 18.3.2 there are bases y_1(z), …, y_s(z) in M(z) and v_1(z), …, v_t(z) in N(z) that are analytic on Ω. By Lemma 18.2.2 there exist analytic vector functions y_{s+1}(z), …, y_t(z) such that y_1(z), …, y_t(z) are linearly independent for each z ∈ Ω and
Corollary 18.3.4
Let M_1(z), …, M_k(z) be analytic families of subspaces (of ℂ^n) on Ω, and assume that, for each z ∈ Ω, ℂ^n is a direct sum of M_1(z), …, M_k(z). Then, for a fixed z_0 ∈ Ω, there exists an analytic family of invertible transformations S(z): ℂ^n → ℂ^n on Ω such that S(z_0) = I and S(z)M_i(z_0) = M_i(z) for i = 1, …, k and all z ∈ Ω.
Proof. It follows from Theorem 18.3.1 that there exist analytic families of invertible transformations S_i(z): ℂ^n → ℂ^n, i = 1, …, k, such that S_i(z_0) = I and S_i(z)M_i(z_0) = M_i(z) for all z ∈ Ω. Now the transformation S(z): ℂ^n → ℂ^n defined by the property that S(z)x = S_i(z)x for all x ∈ M_i(z_0) satisfies the requirements of Corollary 18.3.4. □
As a first step towards the proof of Theorem 18.3.1, a result is proved in this section that can be considered a weaker version of that theorem. We say that a function f(z) (whose values may be vectors or transformations) is analytic on a compact set K ⊂ Ω if f(z) is analytic on some open set containing K.
Theorem 18.4.1
Let K ⊂ Ω be a compact set, and let M(z) ⊂ ℂ^n be an analytic family of subspaces on Ω. Then there exist vector functions f_1(z), …, f_r(z) ∈ ℂ^n that are analytic on K and such that f_1(z), …, f_r(z) is a basis in M(z) for every z ∈ K.
A factorization

    A(z) = ⁻A(z)·⁺A(z)     (18.4.1)

that holds whenever |z| = 1, where the family ⁺A(z) is nonsingular and analytic on the disc |z| ≤ 1 and the family ⁻A(z) is nonsingular and analytic on the annulus 1 ≤ |z| < ∞, is called an incomplete factorization of A(z).
Lemma 18.4.2
Every n × n matrix function A(z) that is analytic and nonsingular on a neighbourhood of the unit circle admits an incomplete factorization.
Proof. Consider first the case when A(z) is analytic on the disc |z| ≤ 1. Let z_0 be a zero of det A(z) with |z_0| < 1. Then for some invertible matrix T_0 the first row of T_0A(z) is zero at the point z_0. Put
Proof of Theorem 18.3.1 (Compact Sets) 579
Then A(z) = ⁻A_1(z)·⁺A_1(z); moreover, ⁻A_1(z) is analytic and invertible for 1 ≤ |z| < ∞, ⁺A_1(z) is analytic for |z| ≤ 1, and the number of zeros of det ⁺A_1(z) inside the unit circle is strictly less than that of det A(z). If det ⁺A_1(z) ≠ 0 for |z| ≤ 1, then A(z) = ⁻A_1(z)·⁺A_1(z) is an incomplete factorization of A(z). Otherwise, we apply the construction above to ⁺A_1(z), and after a finite number of steps an incomplete factorization of A(z) is obtained.
Now it is easy to prove Lemma 18.4.2 for the case that A(z) is meromorphic in the disc |z| < 1 (more exactly, admits a meromorphic continuation into the disc). Indeed, let z_1, …, z_k be all the poles of A(z) inside the unit disc, with orders α_1, …, α_k, respectively. Then the function B(z) = Π_{i=1}^{k} (z − z_i)^{α_i} A(z) is analytic for |z| ≤ 1 and thus (according to the assertion proved in the preceding paragraph) admits an incomplete factorization B(z) = ⁻B(z)·⁺B(z). So (18.4.1) with ⁻A(z) = {Π_{i=1}^{k} (z − z_i)^{−α_i}}·⁻B(z) and ⁺A(z) = ⁺B(z) is an incomplete factorization of A(z).
Now consider the general case. Let ε > 0 be such that A(z) is analytic and invertible in the closed annulus Φ̄ = {z ∈ ℂ | 1 − ε ≤ |z| ≤ 1 + ε}. In the sequel we use some basic and elementary facts about the structure of the set C_W of all n × n matrix functions X(z) that are continuous in the closed annulus Φ̄ and analytic in the open annulus Φ = {z ∈ ℂ | 1 − ε < |z| < 1 + ε}. The set C_W is an algebra with pointwise addition and multiplication of matrices and multiplication by scalars; that is, for z ∈ Φ̄ and X(z), Y(z) ∈ C_W we define
where X(z) ∈ C_W. It is easily seen that this is indeed a norm; that is, axioms (a)–(c) of Section 13.8 are satisfied. Moreover
Let M_+ be the set of all matrix functions from C_W that admit an analytic continuation to the set {z ∈ ℂ | |z| < 1 − ε}, and let M_− be the set of all matrix functions from C_W that admit an analytic continuation to the set {z ∈ ℂ | |z| > 1 + ε} ∪ {∞} and assume the value zero at infinity. It is easily seen (as for C_W) that M_+ and M_− are closed subspaces in the norm ‖·‖_{C_W}. Clearly, M_+ ∩ M_− = {0} (here 0 stands for the identically zero n × n matrix function on Φ̄). Furthermore, M_+ + M_− = C_W. Indeed, recall that every function X(z) ∈ C_W can be developed into the Laurent series

[see page 225 in Gohberg and Goldberg (1981), for example; the proof is based on Banach's theorem that every bounded linear operator that maps a Banach space onto itself, and is one-to-one, has a bounded inverse].
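The decomposition C_W = M₊ + M₋ is constructive: one splits the Laurent series at the constant term. A numerical sketch using the FFT on unit-circle samples (the function X(z) below is our own example):

```python
import numpy as np

K = 64                                        # number of circle sample points
zs = np.exp(2j * np.pi * np.arange(K) / K)

def X(z):
    # an element of C_W with Laurent powers z^1, z^0, z^-1, z^-2 only
    return np.array([[1.0 + z, 1.0 / z],
                     [2.0 / z ** 2, 3.0]], dtype=complex)

samples = np.stack([X(z) for z in zs])        # shape (K, 2, 2)
coeffs = np.fft.fft(samples, axis=0) / K      # coeffs[j % K] ~ A_j in sum A_j z^j

def eval_part(z, js):
    # evaluate the partial Laurent sum over the powers listed in js
    return sum(coeffs[j % K] * z ** j for j in js)

z0 = np.exp(0.3j)
Xplus = eval_part(z0, range(0, 4))            # component with powers >= 0
Xminus = eval_part(z0, range(-4, 0))          # component with powers < 0,
                                              # vanishing at infinity
assert np.allclose(Xplus + Xminus, X(z0))     # X = P_+ X + P_- X
```

The two partial sums are exactly the images of X under the projections P₊ and P₋ associated with M₊ and M₋ (up to the placement of the constant term, which the text assigns to the M₊ side).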
Return to our original matrix function A(z). Clearly A(z)^{-1} ∈ C_W, and the Laurent series A(z)^{-1} = Σ_{j=−∞}^{∞} z^j A_j converges uniformly in the annulus 1 − ε ≤ |z| ≤ 1 + ε. Therefore, for some N the matrix function A_N(z) = Σ_{j=−N}^{N} z^j A_j has the following properties: det A_N(z) ≠ 0 for 1 − ε ≤ |z| ≤ 1 + ε and
Let ⁺N and ⁻N be the images under P_+ and P_−, respectively, of the function A(z)A_N(z)^{-1} − I in the algebra C_W. (Here I represents the constant n × n identity matrix.) Denote ⁺G = (I + ⁺N)^{-1}. Then ⁺G and (⁺G)^{-1} belong to the image of P_+. In
particular, ⁺G and (⁺G)^{-1} are analytic in the disc |z| < 1. Furthermore, one checks easily that the required factorization of A(z) is obtained if one uses the fact (proved in the preceding paragraph) that the function (⁺G(z))^{-1}(A_N(z))^{-1}, which is meromorphic on the unit disc, admits an incomplete factorization. □
Lemma 18.4.3
Let f_1(z), …, f_r(z) and g_1(z), …, g_r(z) be two systems of vector functions with values in ℂ^n, analytic and linearly independent on Ω, such that

    Span{f_1(z), …, f_r(z)} = Span{g_1(z), …, g_r(z)}

for z ∈ Ω_0, where Ω_0 ⊂ Ω is a set with at least one limit point inside Ω. Then the same equality holds for every z ∈ Ω.
Proof of Theorem 18.4.1. Fix z_0 ∈ K, and let N_0 be a direct complement to M(z_0) in ℂ^n. We show first that

    M(z) ∔ N_0 = ℂ^n     (18.4.5)

is a direct sum decomposition for every z ∈ K except maybe for a finite set of points z_1, …, z_k. Indeed, by the definition of an analytic family of subspaces, for every η ∈ K there exist a neighbourhood U_η of η and an analytic and invertible matrix function B_η(z) defined on U_η such that B_η(z)M = M(z) on U_η, where M is a fixed subspace in ℂ^n. We can assume [by changing B_η(z) if necessary] that the subspace M is independent of η. [Here we use the fact that dim M(z) is constant because of the connectedness of Ω.] Actually, we assume M = M(z_0). Let x_1, …, x_r be some basis in M(z_0), and let x_{r+1}, …, x_n be a basis in N_0. Then for z ∈ U_η the subspaces M(z) and N_0 are direct complements to each other if and only if D_η(z) ≠ 0, where

    D_η(z) = det[B_η(z)x_1 ⋯ B_η(z)x_r  x_{r+1} ⋯ x_n] .

Two cases can occur: (a) D_η(z) ≡ 0 for z ∈ U_η; (b) D_η(z) ≢ 0, and then we can suppose (taking U_η smaller if necessary) that D_η(z) = 0 only at a finite number of points of U_η. Let us call the points η for which (a) holds points of the first kind, and the points η for which (b) holds points of the second kind. Since K is connected, all η ∈ K are of the same kind, and since z_0 is of the second kind, all η ∈ K are of the second kind. Further, let U_{η_1}, …, U_{η_l} be a finite covering of the compact set K. Since D_{η_j}(z) = 0 only at a finite number of points z in U_{η_j}, j = 1, …, l, we find that (18.4.5) holds for every z ∈ K except possibly for a finite number of points z_1, …, z_k ∈ K.
By the definition of an analytic family of subspaces, there exist neighbourhoods U(z_1), …, U(z_k) of z_1, …, z_k, respectively, and functions B^{(1)}(z), …, B^{(k)}(z) that are invertible and analytic on U(z_1), …, U(z_k), respectively, such that

Let x_1^{(j)}, …, x_r^{(j)} be some basis of the subspace M(z_j), and let g_i^{(j)}(z) = B^{(j)}(z)x_i^{(j)} (i = 1, …, r; z ∈ U(z_j); j = 1, …, k). Then for ρ > 0 small enough we have
For every z ∈ K \ S let P(z) be the projector on M(z) along N_0. Then we claim that P(z) is an analytic function on K \ S.
Indeed, we have to prove this assertion in a neighbourhood of every μ_0 ∈ K \ S. Let U_0 be a neighbourhood of μ_0 in the set K \ S such that, for z ∈ U_0, M(z) = B(z)M(μ_0) for some analytic and invertible matrix function B(z) on U_0. The matrix function B̃(z) defined on U_0 by the properties that B̃(z)x = B(z)x for all x ∈ M(μ_0) and B̃(z)y = y for all y ∈ N_0 is analytic and invertible. As P(z) = B̃(z)P_0(B̃(z))^{-1}, where P_0 is the projector on M(μ_0) along N_0, the analyticity of P(z) on U_0 follows.
Let us now prove that there exist vector functions f_1^{(0)}(z), …, f_r^{(0)}(z) that are analytic on K \ S and for which

for every z ∈ K \ S, except maybe for a finite set of points. (The set of exceptional points is at most finite because of the compactness of K \ S.) But from the choice of g_1^{(0)}, …, g_r^{(0)} it follows that

for every z ∈ K \ S, except perhaps for a finite set of points [viz., those points z for which the vectors g_1^{(0)}(z), …, g_r^{(0)}(z) are not linearly independent]. Thus

Now take the point z_2 and apply similar arguments, and so on. After k steps one obtains the conclusion of Theorem 18.4.1. □
In this section we finish the proof of Theorem 18.3.1. The main idea is to pass from the case of compact sets (Theorem 18.4.1) to the case of a general domain Ω. To this end we need some approximation theorems.
A set M ⊂ ℂ is called finitely connected if M is connected and ℂ \ M consists of a finite number of connected components. A set N ⊂ M is called simply connected relative to M if for every connected component Y of ℂ \ N the set Y ∩ (ℂ \ M) is not empty. The first of the necessary approximation theorems is the following.
Lemma 18.5.1
Let K ⊂ Ω be a finitely connected compact set that is also simply connected relative to Ω. Let Y_1, …, Y_s be all the bounded components of ℂ \ K and, for j = 1, …, s, let z_j ∈ Y_j \ Ω be fixed points. Let A(z) be an m × n matrix function that is analytic on K. Then for every ε > 0 there exists a rational matrix function B(z) of size m × n such that B(z) is analytic on ℂ \ {z_1, …, z_s} and, for any z ∈ K,

    ‖A(z) − B(z)‖ < ε .
Proof of Theorem 18.3.1 (General Case) 585
where the x_{jν} ∈ ℂ, and such that ‖A(z) − R(z)‖ < ε for any z ∈ K. Let U ⊂ ℂ \ {z_1, …, z_s} be a neighbourhood of K whose boundary ∂U consists of s + 1 closed simple rectifiable contours. Then for z ∈ K we obtain
Indeed, denoting the right-hand side of (18.5.3) by G_0, let us prove first that G_0 is both a closed and an open set in G. Let F ∈ G_0 and H ∈ G be such that ‖H − F‖ < ‖F^{-1}‖^{-1}. Then H = (I − M)F, where M = I − HF^{-1}. We have

that is, H ∈ G_0. So G_0 is open. Suppose now that F_j ∈ G_0, j = 1, 2, …, and ‖F_j − F‖ → 0 for some F ∈ G. Let j_0 be large enough that ‖F_{j_0} − F‖ < ‖F_{j_0}^{-1}‖^{-1}. Then F = (I − M)F_{j_0}, where ‖M‖ = ‖I − FF_{j_0}^{-1}‖ < 1; that is, F ∈ G_0. So G_0 is a closed set.
Now let us prove that G_0 is connected. Let

then

for some M_1, …, M_ν ∈ G with ‖M_j‖ < 1 for j = 1, …, ν. Rewrite this representation in the form
where ⁺D_0(z) is analytic and invertible in the disc {z ∈ ℂ | |z − η_1| < ρ} and ⁻D_0(z) is analytic and invertible for ρ ≤ |z − η_1| ≤ ∞. The equality ⁻D_0 = D_0(⁺D_0)^{-1} shows that ⁻D_0 admits analytic continuation to the whole of ℂ and that ⁻D_0(z) is invertible for all z ≠ η_1. Also, ⁺D_0 is analytic and invertible on ℂ \ {z_1, …, z_s, η_2, …, η_m} ⊃ K.
Let y(t), 0 ≤ t ≤ 1, be a continuous function with values in Y(η_1) such that y(0) = η_1 and y(1) = z(η_1). Then the formula
The following lemma is the main approximation result to be used in the transition from compact sets in Ω to the domain Ω itself.
Lemma 18.5.3
Let K ⊂ Ω be a finitely connected compact set that is also simply connected relative to Ω. Let M ⊂ ℂ^n be a fixed subspace, and let A(z) be an n × n matrix function that is analytic and invertible on K and such that A(z)M = M for z ∈ K. Then for every ε > 0 there exists a matrix function B(z) that is analytic and invertible on Ω, satisfies B(z)M = M for z ∈ Ω, and is such that ‖B(z) − A(z)‖ < ε for z ∈ K.

Because A(z) is invertible when z ∈ K, so are A_1(z) and A_2(z). Use Lemma 18.5.2 to find matrix functions B_1(z) and B_2(z) that are analytic and invertible on Ω and such that ‖B_i(z) − A_i(z)‖ < ε/3 for z ∈ K, i = 1, 2. By Lemma 18.5.1 there exists an analytic matrix function B_{12}(z) on Ω such that ‖B_{12}(z) − A_{12}(z)‖ < ε/3 for z ∈ K. Then
Lemma 18.5.4
Let K_1 ⊂ K_2 ⊂ ⋯ ⊂ Ω be a sequence of finitely connected compact sets K_m that are also simply connected relative to Ω. For m = 1, 2, …, let G_m(z) be an n × n matrix function that is analytic and invertible on K_m and satisfies G_m(z)M = M for z ∈ K_m, for some fixed subspace M ⊂ ℂ^n. Then for m = 1, 2, … there exists an n × n matrix function D_m(z) that is analytic and invertible on K_m and such that, whenever z ∈ K_m,
Then the infinite product Y = Π_{m=1}^{∞} (I + X_m) converges and ‖I − Y‖ ≤ a e^a. Indeed, for the matrices Y_m = Π_{j=1}^{m} (I + X_j) we have the estimates:

The assertion proved in the preceding paragraph ensures that for every m = 1, 2, … the infinite product
Proof of Theorem 18.3.1. Let us show first that there exists a sequence of compact sets K_1 ⊂ K_2 ⊂ ⋯ that are finitely connected, simply connected relative to Ω, and for which ∪_{m=1}^{∞} K_m = Ω. To this end choose a sequence of closed discs S_m ⊂ Ω, m = 1, 2, …, such that ∪_{m=1}^{∞} S_m = Ω. It is sufficient to construct K_m in such a way that K_m ⊃ S_m, m = 1, 2, …. Put K_1 = S_1,

which is impossible.
It turns out that although one common direct complement for an analytic family of subspaces may not exist, only two subspaces are needed to serve as "alternate" direct complements for the members of the analytic family.
Theorem 18.6.1
For an analytic family of subspaces {M(z)}_{z∈Ω} of ℂ^n there exist two subspaces N_1, N_2 ⊂ ℂ^n such that for each z ∈ Ω either M(z) ∔ N_1 = ℂ^n or M(z) ∔ N_2 = ℂ^n holds.
Proof. To prove this we first need the following observation: for any k-dimensional subspace L ⊂ ℂ^n, the set DC(L) of all direct complements to L in ℂ^n is open and dense in the set of all (n − k)-dimensional subspaces. Indeed, the openness of DC(L) follows immediately from Theorem 13.1.3. To prove denseness, let N be an (n − k)-dimensional subspace in ℂ^n with basis f_1, …, f_{n−k}, and let N_0 be a direct complement to L with basis g_1, …, g_{n−k}. For a complex number ε put N(ε) = Span{f_1 + εg_1, …, f_{n−k} + εg_{n−k}}. Clearly, the vectors f_i + εg_i, i = 1, …, n − k, are linearly independent for ε close enough to 0, so dim N(ε) = n − k. Moreover, Theorem 13.4.2 shows that N(ε) tends to N in the gap metric as ε tends to 0.
It remains to show that N(ε) belongs to DC(L). To this end pick a basis h_1, …, h_k in L, and consider the n × n matrix

    G(ε) = [h_1 ⋯ h_k  f_1 + εg_1 ⋯ f_{n−k} + εg_{n−k}] .

As the coefficient of ε^{n−k} in det G(ε) is det[h_1 ⋯ h_k g_1 ⋯ g_{n−k}] ≠ 0 (recall that N_0 is a direct complement to L), det G(ε) does not vanish identically, and since det G(ε) is a polynomial in ε it follows that det G(ε) ≠ 0 for ε ≠ 0 and sufficiently close to zero. Obviously, N(ε) ∈ DC(L) for such ε.
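The perturbation N(ε) in this density argument can be watched in a tiny case. A sketch (with the subspaces L, N, and N₀ of our own choosing):

```python
import numpy as np

# In C^2 take L = Span{e1}.  The subspace N = Span{e1} is NOT a direct
# complement of L, but N0 = Span{e2} is, and the perturbed subspace
# N(eps) = Span{f1 + eps*g1} (f1 spanning N, g1 spanning N0) is a direct
# complement of L for every small eps != 0; here det G(eps) = eps.

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
f1, g1 = e1, e2

def is_complement(l_basis, n_basis):
    G = np.column_stack(l_basis + n_basis)    # the matrix G from the text
    return abs(np.linalg.det(G)) > 1e-12

assert not is_complement([e1], [f1])                 # N itself fails
for eps in [1e-1, 1e-3, 1e-6]:
    assert is_complement([e1], [f1 + eps * g1])      # N(eps) succeeds
```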
Now we start to prove Theorem 18.6.1 itself. Fix z_0 ∈ Ω, and let N_1 be a direct complement to M(z_0) in ℂ^n. By Theorem 18.3.2 it is possible to pick vector functions x_1(z), …, x_p(z) that are analytic on Ω and such that, for every z ∈ Ω, the vectors x_1(z), …, x_p(z) form a basis in M(z). Letting f_1, …, f_{n−p} be a basis in N_1, consider the n × n matrix function

    G(z) = [x_1(z) ⋯ x_p(z)  f_1 ⋯ f_{n−p}] ,

which is analytic on Ω. As det G(z_0) ≠ 0, the determinant of G(z) is not identically zero, and thus the number of distinct zeros of det G(z) is at most countable. Let z_1, z_2, … ∈ Ω be all of these zeros. Then N_1 is a direct complement to M(z) for z ∉ {z_1, z_2, …}. On the other hand, we have seen that, for i = 1, 2, …, the sets DC(M(z_i)) are open and dense in the set of all (n − p)-dimensional subspaces of ℂ^n. As the latter set is a complete metric space in the gap topology (Section 13.4), it follows that the intersection ∩_{i=1}^{∞} DC(M(z_i)) is again dense [by the Baire category theorem; see, e.g., Kelley (1955)]. In particular, this intersection is not empty, so there exists a subspace N_2 ⊂ ℂ^n that is simultaneously a direct complement to all of M(z_1), M(z_2), …. □
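The mechanism of this proof is easy to trace on a one-parameter family. A numerical sketch (the family M(z) and the subspaces N₁, N₂ are our own illustration):

```python
import numpy as np

# M(z) = Span{(1, z)} in C^2.  With N1 = Span{(1, 0)} the determinant of
# G(z) = [x(z) f1] equals -z, so N1 fails to complement M(z) only at the
# zero z = 0 of det G.  N2 = Span{(0, 1)} complements M(0), so for every z
# at least one of N1, N2 is a direct complement, as Theorem 18.6.1 asserts.

def complements_M(z, n):
    G = np.column_stack([np.array([1.0, z], dtype=complex), n])
    return abs(np.linalg.det(G)) > 1e-12

N1 = np.array([1.0, 0.0])
N2 = np.array([0.0, 1.0])
for z in [0.0, 1.0, -2.0, 3.0j, 0.5 - 0.5j]:
    assert complements_M(z, N1) or complements_M(z, N2)
assert not complements_M(0.0, N1)      # N1 alone is not enough at z = 0
```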
Direct Complements for Analytic Families of Subspaces 593
The following result shows that for analytic families of subspaces that
appear as the kernel or the image of a linear matrix function there exists a
common direct complement. As Example 18.6.1 shows, the result is not
necessarily valid for nonlinear matrix functions.
Theorem 18.6.2
Let T_1 and T_2 be m × n matrices such that the dimension of Ker(T_1 + zT_2) is constant, that is, independent of z on ℂ [and the same is then automatically true for dim Im(T_1 + zT_2)]. Then there exist subspaces N_1 ⊂ ℂ^n and N_2 ⊂ ℂ^m such that

    Ker(T_1 + zT_2) ∔ N_1 = ℂ^n ,   Im(T_1 + zT_2) ∔ N_2 = ℂ^m   for all z ∈ ℂ .

Note that in view of Proposition 18.1.1 and Theorem 18.2.1 the families of subspaces Ker(T_1 + zT_2) and Im(T_1 + zT_2) are analytic on ℂ.
Proof. For the proof of Theorem 18.6.2 we use the Kronecker canonical form for linear matrix polynomials under strict equivalence (which is developed in the appendix to this book).
As dim Ker(T_1 + zT_2) is independent of z ∈ ℂ, the canonical form of T_1 + zT_2 does not contain terms of the form zI + J. So, in the notation of Theorem A.7.3, there exist invertible matrices Q_1 and Q_2 such that

As

it follows that
Let A(z): ℂ^n → ℂ^n be an analytic family of transformations on Ω. Our next topic concerns the analytic properties (as functions of z) of certain invariant subspaces of A(z).
We have already seen some first results in this direction in Section 18.1. Namely, if the rank of A(z) is independent of z, then Im A(z) and Ker A(z) are analytic families of subspaces. In the general case, Im A(z) and Ker A(z) become analytic families of subspaces if corrected on the singular set of A(z). The next theorem is mainly a reformulation of this statement. For convenience, let us introduce another definition: an analytic family of subspaces {M(z)}_{z∈Ω} is called A(z) invariant on Ω if the subspace M(z) is A(z) invariant for every z ∈ Ω.
Theorem 18.7.1
There exist A(z)-invariant analytic families {M(z)}_{z∈Ω} and {N(z)}_{z∈Ω} such that M(z) = Im A(z) and N(z) = Ker A(z) for every z not belonging to the singular set of A(z).
Proof. In view of Theorem 18.2.1 we have only to prove that M(z_0) and N(z_0) are A(z_0) invariant for every z_0 ∈ S(A). But this follows from Theorem 15.1.1 because lim_{z→z_0} A(z) = A(z_0) and
Another class of A(z)-invariant subspaces whose behaviour is analytic (at least locally) consists of spectral subspaces, as follows.
Theorem 18.7.2
Let Γ be a contour in the complex plane such that Γ ∩ σ(A(z_0)) = ∅ for a fixed z_0 ∈ Ω. Then the sum M_Γ(z) of the root subspaces of A(z) corresponding to the eigenvalues inside Γ is an analytic family of subspaces in some neighbourhood U of z_0.
Analytic Families of Invariant Subspaces 595
is a projector for every z ∈ U. So, to prove that M_Γ(z) is an analytic family in U, it is sufficient to check that P(z) is an analytic function on U. Indeed, |det(λI − A(z))| > δ > 0 for every λ ∈ Γ and z ∈ U, where δ is independent of λ and z. Hence ‖(λI − A(z))^{-1}‖ is bounded for λ ∈ Γ and z ∈ U, and consequently the Riemann sums
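The projector in this argument is the Riesz projector P(z) = (2πi)⁻¹ ∮_Γ (λI − A(z))⁻¹ dλ, and its Riemann sums over a circular contour Γ are easy to compute. A numerical sketch (the family A(z) = [[0, 1], [z, 0]] is our own illustration):

```python
import numpy as np

def A(z):
    return np.array([[0.0, 1.0], [z, 0.0]], dtype=complex)  # eigenvalues +-sqrt(z)

def riesz_projector(M, center, radius, K=400):
    # Riemann sum for (1 / 2 pi i) * contour integral of (lam I - M)^{-1} d lam
    I = np.eye(M.shape[0], dtype=complex)
    P = np.zeros_like(I)
    for k in range(K):
        w = np.exp(2j * np.pi * k / K)
        lam = center + radius * w
        dlam = 2j * np.pi * radius * w / K                  # d lam along the circle
        P += np.linalg.inv(lam * I - M) * dlam
    return P / (2j * np.pi)

M = A(1.0)                                      # eigenvalues +1 and -1
P = riesz_projector(M, center=1.0, radius=0.5)  # Gamma encloses only lam = 1
assert np.allclose(P @ P, P, atol=1e-8)         # P is a projector
assert abs(np.trace(P) - 1.0) < 1e-8            # of rank one
assert np.allclose(M @ P, P @ M, atol=1e-8)     # Im P is A(1)-invariant
```

Because the integrand is analytic and bounded on Γ uniformly in z near z₀ (the point made in the proof), these sums converge to an analytic function of z.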
Here the A(z)-invariant subspaces (for a fixed z) are easy to find: the only nontrivial invariant subspace of A(0) is Span{e_1} and, when z ≠ 0, the only nontrivial invariant subspaces of A(z) are

where u_1 and u_2 are the square roots of z. It is easily seen that there is no nontrivial A(z)-invariant analytic family of subspaces on Ω. □
Theorem 18.8.1
Let Ω be a simply connected domain in ℂ, and let A(z): ℂ^n → ℂ^n be an analytic family of transformations on Ω. Then Inv(A(z)) depends analytically on z ∈ Ω if and only if A(z) has fixed Jordan structure.
Invariant Subspaces and Fixed Jordan Structure 597
subspaces coincides with the set of Ã(z)-invariant subspaces for all z ∈ Ω, so it is sufficient to prove Theorem 18.8.1 for Ã(z) instead of A(z). From the definition of Ã(z) it is clear that the eigenvalues of Ã(z) are μ_1(z_0), …, μ_p(z_0); that is, they do not depend on z, and, moreover, the partial multiplicities of μ_j(z_0) as eigenvalues of Ã(z) do not depend on z either. In other words, in Theorem 18.8.1 we may assume that A(z) is similar to A(z_0) for all z ∈ Ω.
For j = 1, …, p, let m_j be the maximal partial multiplicity of μ_j(z_0) as an eigenvalue of A(z_0) [and hence as an eigenvalue of A(z) for all z in Ω]. Note that since A(z) is similar to A(z_0) for all z ∈ Ω, by Proposition 18.1.2 there is an analytic basis in Ker(A(z) − μ_j(z_0)I)^m for m = 0, 1, 2, … (i.e., for each fixed j and m). By Theorem 18.3.3 there exists a basis x_1^{(j)}(z), …, x_{k_j}^{(j)}(z) in Ker(A(z) − μ_j(z_0)I)^{m_j} modulo Ker(A(z) − μ_j(z_0)I)^{m_j − 1} that is analytic on Ω. It is easily seen that the vectors

modulo

and so on. Now define the n × n matrix T(z) formed by the columns

where j = 1, …, p. As the proof of the Jordan form of a matrix shows (see Section 2.3), the columns of T(z) form a Jordan basis of A(z). In particular,
Analytic Dependence on a Real Variable 599
In the course of the proof of Theorem 18.8.1 we have also proved the following result on analytic families of similar transformations.
Corollary 18.8.2
Let A(z): ℂ^n → ℂ^n be an analytic family of transformations on Ω, where Ω is a simply connected domain. Assume that, for a fixed point z_0 ∈ Ω, A(z) is similar to A(z_0) for all z ∈ Ω. Then there exists an invertible transformation T(z): ℂ^n → ℂ^n that is analytic on Ω and such that T(z_0) = I and T(z)^{-1}A(z)T(z) = A(z_0) for all z ∈ Ω.
Clearly, A(z) has fixed Jordan structure on Ω (the eigenvalues being the two square roots of z). The nontrivial A(z)-invariant subspaces are
The results presented in Sections 18.1–18.8 include the case when the families of transformations ℂ^n → ℂ^m and subspaces of ℂ^n are analytic in a real variable on an open interval (a, b) of the real axis. The definition of analyticity is analogous to that in the complex case: representation as a power series (this time with real coefficients) in a real neighbourhood of each point t_0 ∈ (a, b). As the radius of convergence of this power series is positive, it converges also in some complex neighbourhood of t_0.
Theorem 18.9.1
Let M(t) be a family of subspaces (of ℂ^n) that is analytic in the real variable t on (a, b). Then the orthogonal complement M(t)^⊥ is an analytic family of subspaces on (a, b) as well.
Proof. Let t_0 ∈ (a, b). Then in some real neighbourhood U_1 of t_0 there exists an analytic family of invertible transformations A(t): ℂ^n → ℂ^n such that M(t) = A(t)M, t ∈ U_1, for a fixed subspace M ⊂ ℂ^n. Assume (without loss of generality) that M = Span{e_1, …, e_p} for some p, and write A(t) as the n × n matrix, with entries that are analytic on (a, b), with respect to the standard basis in ℂ^n. Then M(t) = Im B(t) for t ∈ U_1, where B(t) is formed by the first p columns of A(t). As A(t) is invertible, the columns of B(t) are linearly independent. For notational simplicity, assume that the top p rows of B(t_0) are linearly independent and hence form a nonsingular p × p matrix. Then there is a real neighbourhood U_2 ⊂ U_1 of t_0 such that the top p rows of B(t) form a nonsingular p × p matrix C(t) as well. So for t ∈ U_2 we obtain
where A 1 / 2 is the analytic branch of the square root that takes positive values
for A positive, is well defined and Z(t)2 = S(t) (see Section 2.10). Moreover,
because of the symmetry of F, the matrix Z(t) is positive definite for all
t E l/3. Also, Z(t) is an analytic family of matrices on U3. Now one sees
easily that, for t E U3
and since rank P(t) is easily seen to be p, equality (rather than inclusion)
holds in (18.9.1). Consequently, M(t)^⊥ is the image of the analytic family of
projectors I - P(t), and thus M(t)^⊥ is analytic on U_3. As t_0 ∈ (a, b) was
arbitrary, the analyticity of M(t)^⊥ on (a, b) follows. □
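The projector step can be checked numerically. A minimal sketch, assuming a hypothetical analytic family B(t) with linearly independent columns: it forms the orthogonal projector P(t) onto M(t) = Im B(t) directly as B(B*B)^{-1}B* (rather than through the contour-integral construction of Z(t) used in the proof) and verifies that I - P(t) projects onto M(t)^⊥:

```python
import numpy as np

def orth_projector(B):
    # Orthogonal projector onto Im B for a full-column-rank B:
    # P = B (B^* B)^{-1} B^*
    G = B.conj().T @ B
    return B @ np.linalg.solve(G, B.conj().T)

def B(t):
    # hypothetical analytic 3 x 2 family, columns independent near t = 0
    return np.array([[1.0, t], [t, 1.0], [t**2, t**3]])

t = 0.3
P = orth_projector(B(t))
Q = np.eye(3) - P                          # projector onto M(t)^perp

assert np.allclose(P @ P, P)               # P is a projector
assert np.allclose(P @ B(t), B(t))         # P acts as the identity on M(t)
assert np.allclose(B(t).conj().T @ Q, 0)   # Im Q is orthogonal to M(t)
```

Since B(t) is analytic and of constant column rank, P(t) depends analytically on t, which is the content of the theorem.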
One can also consider families of real transformations from R^n into R^m,
as well as families of subspaces in the real vector space R^n, which are
analytic in a real variable t on (a, b). For such families of real linear
transformations and subspaces the results of Sections 18.1-18.8 hold as well.
However, in Theorem 18.7.2 the contour Γ should be symmetric with
respect to the real axis; and in the definition of fixed Jordan structure one
has to require, in addition, that the enumerations λ_1(z_1), . . . , λ_m(z_1) and
λ_1(z_2), . . . , λ_m(z_2) of the distinct eigenvalues of A(z_1) and A(z_2), respectively,
are such that λ_i(z_1) is the complex conjugate of λ_j(z_1) if and only if
λ_i(z_2) is the complex conjugate of λ_j(z_2).
18.1 Let
18.14 For the following analytic families M(z) of subspaces in C^n that
depend on z ∈ C, find two subspaces N_1 and N_2 such that for every
z ∈ C at least one of

holds:
Jordan Form of
Analytic Matrix Functions
Theorem 19.1.1
Let μ_1, . . . , μ_k be all the distinct eigenvalues of A(z_0), that is, the distinct
zeros of the equation det(μI - A(z_0)) = 0, where k ≤ n, and let r_i (i =
1, . . . , k) be the multiplicity of μ_i as a zero of det(μI - A(z_0)) = 0 (so
r_1 + · · · + r_k = n). Then there is a neighbourhood U of z_0 in Ω with the
following properties: (a) there exist positive integers m_{i1}, . . . , m_{is_i} (i = 1, . . . , k);
(b) the dimension γ_i of Ker(A(z) - μ_{iσ}(z)I), as well as the partial multi-
plicities m^{(1)}_{iσ} ≥ · · · ≥ m^{(γ_i)}_{iσ} (> 0) of the eigenvalue μ_{iσ}(z) of A(z), do not
depend on z (for z ∈ U \ {z_0}) and do not depend on σ; (c) for each
i = 1, . . . , k and j = 1, . . . , s_i, there exist vector-valued fractional power series
converging for z ∈ U:

where x^{(j)}_{iq} ∈ C^n, such that for each j and each z ∈ U \ {z_0} the vectors
x^{(j)}_{i1}(z), . . . , x^{(j)}_{i m_{ij}}(z) form a Jordan chain of A(z) corresponding to
μ_{ij}(z):

where by definition x^{(j)}_{i0}(z) = 0 and x^{(j)}_{i1}(z) ≠ 0. Moreover, for every
z ∈ U \ {z_0} the vectors

where μ_{iq} is interpreted as zero for q > s_i, and similarly for m^{(q)}_{ij} when
q > γ_i. As the total sum of partial multiplicities of eigenvalues near μ_i does
not change after a small perturbation of the transformation, we also have the
equality
The only eigenvalue of A(0) is zero, with partial multiplicities 3 and 1. [The
easiest way to find the partial multiplicities of A(0) is to observe that
rank A(0) = 2 and A(0)^2 ≠ 0.] To find the eigenvalues of A(z), we have to
solve the equation det(μI - A(z)) = 0, which gives (in the notation of
Theorem 19.1.1)
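The bracketed rank observation generalizes: for a nilpotent matrix, the number of Jordan blocks of size at least k equals rank A^{k-1} - rank A^k. A sketch of this computation (the matrix A(0) of the example is not reproduced above, so a stand-in nilpotent matrix with blocks of sizes 3 and 1 is used):

```python
import numpy as np

def partial_multiplicities(A, tol=1e-10):
    # Jordan block sizes (partial multiplicities) of a nilpotent matrix A:
    # the number of blocks of size >= k equals rank(A^(k-1)) - rank(A^k).
    n = A.shape[0]
    ranks = [np.linalg.matrix_rank(np.linalg.matrix_power(A, k), tol=tol)
             for k in range(n + 1)]
    sizes = []
    for k in range(1, n + 1):
        ge_k = ranks[k - 1] - ranks[k]                   # blocks of size >= k
        ge_k1 = ranks[k] - ranks[k + 1] if k < n else 0  # blocks of size >= k+1
        sizes += [k] * (ge_k - ge_k1)
    return sorted(sizes, reverse=True)

# stand-in nilpotent matrix with Jordan blocks of sizes 3 and 1:
# rank J = 2 and J^2 != 0, exactly as in the example
J = np.zeros((4, 4))
J[0, 1] = J[1, 2] = 1.0
print(partial_multiplicities(J))   # [3, 1]
```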
Corollary 19.1.2
Assume that all the eigenvalues of A(z) are analytic in a neighbourhood of z_0.
Then the distinct eigenvalues μ_1(z), . . . , μ_k(z) of A(z), z ≠ z_0, can be
enumerated so that each of them is analytic in a neighbourhood U_1 of z_0.
Further, assuming that the enumeration of the distinct eigenvalues of A(z) for
z ≠ z_0 is as above, there exist analytic n-dimensional vector functions

in a neighbourhood U_2 ⊂ U_1 of z_0 with the following properties: (a) for every
z ∈ U_2 \ {z_0}, and for i = 1, . . . , k; j = 1, . . . , s_i, the vectors
y^{(j)}_{i1}(z), . . . , y^{(j)}_{i m_{ij}}(z) form a Jordan chain of A(z) corresponding to the
eigenvalue μ_i(z); (b) for every z ∈ U_2 \ {z_0} the vectors (19.1.4) form a
basis in C^n.

Note that y_{11}(z), y_{12}(z) do not form a basis in C^2 for z = 0; also, y_{11}(0),
y_{12}(0) do not form a Jordan chain of A(0). This shows that in (a) and (b) of
Corollary 19.1.2 one cannot, in general, replace U_2 \ {z_0} by U_2. □
Theorem 19.2.1
Let A(z): C^n → C^n be an analytic family of transformations on Ω. Then for
all z ∈ Ω except a discrete set S_0 we have

for z_0 ∈ S_0 we have

Now if ν(A(z')) < ν_0 for some z' ∈ Ω, then by Theorem 19.1.1 we have
ν(A(z)) ≡ ν_0 in a deleted neighbourhood of z'. This shows that the set S_0 of
all z ∈ Ω for which ν(A(z)) < ν_0 is indeed discrete. □
The points of S_0 will be called the multiple points of the analytic family
of transformations A(z), because at these points the eigenvalues of A(z)
attain higher multiplicity than "usual."
Another way to prove Theorem 19.2.1 is by examining a suitable
resultant matrix. Let

for some scalar functions a_j(z) that are analytic on Ω, and consider the
(2n - 1) x (2n - 1) matrix R(z) whose entries are analytic functions on Ω:
This is the resultant matrix of two scalar polynomials in μ: det(μI - A(z))
and (∂/∂μ) det(μI - A(z)). A well-known property of resultant matrices
[see, e.g., Gohberg and Heinig (1975)] states that 2n - 1 - rank R(z) is
equal to the number of common zeros of these two polynomials in μ
(counting multiplicities). In other words,

or

otherwise. Since the set of common zeros of det S_1(z), . . . , det S_l(z) is
discrete, Theorem 19.2.1 follows. □
Theorem 19.1.1 shows that the distinct eigenvalues
μ_1(z), . . . , μ_ν(z) of A(z) (where ν = max_{z∈Ω} ν(A(z))) are analytic on Ω \ S_0,
where S_0 is taken from Theorem 19.2.1, and have at most algebraic branch
points in S_0. [Some of the functions μ_1(z), . . . , μ_ν(z) may also be analytic at
certain points of S_0.] Denote by S_1 the subset of S_0 consisting of all the
points z_0 such that at least one of the functions μ_j(z), j = 1, . . . , ν is not
analytic at z_0. As a subset of a discrete set, S_1 is itself discrete. The set S_1
will be called the first exceptional set of the analytic family of linear
transformations A(z), z ∈ Ω.
It may happen that S_1 ≠ S_0, as shown in the following example.
Theorem 19.2.2
Let A(z): C^n → C^n be an analytic family of transformations on Ω with the set
S_0 of multiple points, and let μ_1(z), . . . , μ_ν(z) be the distinct eigenvalues of
A(z), analytic on Ω \ S_0 and having at most branch points in S_0. Let
m_{j1}(z) ≥ · · · ≥ m_{jγ}(z), γ = γ(j, z), be the partial multiplicities of the eigen-
value μ_j(z) of A(z) for j = 1, . . . , ν, z ∉ S_0. Then there exists a discrete set S_2
in Ω such that S_2 ⊂ Ω \ S_0 and the number γ(j, z) of partial multiplicities and
the partial multiplicities m_{jk}(z) themselves, k = 1, . . . , γ(j, z), do not depend
on z in Ω \ (S_0 ∪ S_2), for j = 1, . . . , ν.
Proof. The proof follows the pattern of the proof of Theorem 19.2.1. In
view of Theorem 19.1.1, for every z_0 ∈ Ω there is a neighbourhood U_{z_0} of
z_0 such that the number of distinct eigenvalues ν = ν(z_0), as well as the
number γ_j = γ_j(z_0) of partial multiplicities and the partial multiplicities
themselves m_{j1} ≥ · · · ≥ m_{jγ_j}, m_{jk} = m_{jk}(z_0), corresponding to the jth eigen-
value, are constant for z ∈ U_{z_0} \ {z_0}. It is assumed that the distinct
eigenvalues of A(z) for z ∈ U_{z_0} \ {z_0} are enumerated so that they are
analytic and γ_1 ≥ · · · ≥ γ_ν. Denote by Λ the (finite) set of all sequences of
type

where ν, γ_j, m_{jk} are positive integers with the properties that ν ≤ n; γ_1 ≥
· · · ≥ γ_ν; m_{i1} ≥ · · · ≥ m_{iγ_i}, i = 1, . . . , ν; Σ_{i,j} m_{ij} = n. For any sequence δ ∈ Λ
as in (19.2.2), let V_δ = ∪ U_{z_0}, where the union is taken over all z_0 ∈ Ω such
that ν = ν(z_0); γ_i = γ_i(z_0), i = 1, . . . , ν; m_{ij} = m_{ij}(z_0), j = 1, . . . , γ_i; i =
1, . . . , ν. Obviously, V_δ is open and ∪_{δ∈Λ} V_δ = Ω. Also, the sets V_δ, δ ∈ Λ
are mutually disjoint. As Ω is connected, this means that all V_δ, except for
one of them, are empty. So Theorem 19.2.2 follows. □
where p_j(z) and q_j(z) are not identically zero polynomials such that

for all z ∈ C, where 1 ≤ k_1 < k_2 < · · · < k_{q-1} < k_q = n. We also assume that
the polynomials p_{k_1}(z), . . . , p_{k_q}(z) are all different. We have the set of
multiple points

the first exceptional set S_1 is empty, and the second exceptional set S_2 is the
union of S_0 and the set {z ∈ C \ S_0 | q_l(z) = 0 for some k_p + 1 ≤ l ≤ k_{p+1} - 1
and some p}. □
Theorem 19.2.3
Let A(z): C^n → C^n be an analytic family of transformations on Ω with the first
exceptional set S_1 and the second exceptional set S_2. Let μ_1(z), . . . , μ_ν(z) be
the distinct eigenvalues of A(z) (apart from the multiple points), which are
analytic on Ω \ S_1 and have at most algebraic branch points in S_1. Then there
exist n-dimensional vector functions

j = 1, . . . , ν, where m_{j1} ≥ · · · ≥ m_{jγ_j} are positive integers, with the following
properties: (a) the functions (19.2.3) are analytic on Ω \ S_1 and have at most
algebraic branch points in S_1;
It is easily seen that if μ_j(z) has an algebraic branch point at z_0 ∈ S_1, then
all eigenvectors

where a_k(z) is the kth row of A(z). If x^{(j)}_1(z) were analytic at z_0, then
(19.2.4) would imply that μ_j(z) is also analytic at z_0, a contradiction.
The proof of Theorem 19.2.3 is given in the next section.
In the particular case when A(z) is diagonable (i.e., similar to a diagonal
matrix) for every z ∉ S_1 ∪ S_2, the conclusions of Theorem 19.2.3 can be
strengthened, as follows.
Theorem 19.2.4
Let A(z) be as in Theorem 19.2.3, and assume that A(z) is diagonable for all
z ∉ S_1 ∪ S_2. Then there exist n-dimensional vector functions

with the following properties: (a) the functions (19.2.5) are analytic on Ω \ S_1
and have at most algebraic branch points in S_1; (b) for every z ∈ Ω and every
j = 1, . . . , ν the vectors x^{(j)}_1(z), . . . , x^{(j)}_{γ_j}(z) are linearly independent; (c) for
every z ∈ Ω \ (S_1 ∪ S_2) the vectors x^{(j)}_1(z), . . . , x^{(j)}_{γ_j}(z) form a basis in
Ker(μ_j(z)I - A(z)). In particular, the vectors (19.2.5) form a basis in C^n for
every z ∈ Ω \ (S_1 ∪ S_2).
It is easily seen that the singular set is discrete and coincides with the set of
all z_0 ∈ Ω with
Lemma 19.3.1
Let B(z): C^n → C^m be a branch analytic family of transformations on Ω.
Then there exist m-dimensional branch analytic vector-valued functions
y_1(z), . . . , y_r(z) on Ω, and n-dimensional branch analytic vector-valued
functions x_1(z), . . . , x_{n-r}(z) on Ω, with the following properties: (a) each
branch point of any function y_j(z), j = 1, . . . , r or x_k(z), k = 1, . . . , n - r is
also a branch point of B(z); (b) y_1(z), . . . , y_r(z) are linearly independent for
every z ∈ Ω; (c) x_1(z), . . . , x_{n-r}(z) are linearly independent for every z ∈ Ω;
(d) Span{y_1(z), . . . , y_r(z)} = Im B(z) and Span{x_1(z), . . . , x_{n-r}(z)} =
Ker B(z) for every z not belonging to S(B).
Lemma 19.3.2
Let B_1(z): C^n → C^m and B_2(z): C^n → C^m be branch analytic families of
transformations on Ω, such that

for every z ∈ Ω that does not belong to the union of the singular sets of B_1(z)
and B_2(z). Then there exist branch analytic n-dimensional vector functions
x_1(z), . . . , x_s(z), z ∈ Ω, with the following properties: (a) every branch point
of any x_j(z), j = 1, . . . , s is also a branch point of at least one of B_1(z) and
B_2(z); (b) x_1(z), . . . , x_s(z) are linearly independent for every z ∈ Ω; (c) for
every z ∈ Ω that does not belong to S(B_1) ∪ S(B_2) the vectors
x_1(z), . . . , x_s(z) form a basis in Ker B_1(z) modulo Ker B_2(z).
for all z ∈ Ω not belonging to the singular set of B_2(z). Fix z_0 ∈ Ω, and
choose x_{ν+1}, . . . , x_n in such a way that y_1(z_0), . . . , y_ν(z_0), x_{ν+1}, . . . , x_n
form a basis in C^n. Using the branch analytic version of Lemma 18.2.2 (cf.
the paragraph following Lemma 19.3.1), find branch analytic vector func-
tions y_{ν+1}(z), . . . , y_n(z), z ∈ Ω, such that y_1(z), . . . , y_ν(z),
y_{ν+1}(z), . . . , y_n(z) form a basis in C^n for every z ∈ Ω. Replacing
B_i(z) by B_i(z)S(z), i = 1, 2, if necessary, where S(z) = [y_1(z) · · · y_n(z)] is an
invertible n x n matrix function, we can assume that
Lemma 19.3.3
Let B_1(z) and B_2(z) be as in Lemma 19.3.2, and let x_1(z), . . . , x_t(z) be
branch analytic n-dimensional vector functions with the following properties:
(a) every branch point of any x_j(z), j = 1, . . . , t is also a branch point of at
least one of B_1(z) and B_2(z); (b) there exists a discrete set T ⊃ S(B_1) ∪ S(B_2)
such that x_1(z), . . . , x_t(z) belong to Ker B_1(z) and are linearly independent
modulo Ker B_2(z) for every z ∈ Ω \ T. Then there exist branch analytic
n-dimensional vector functions x_{t+1}(z), . . . , x_s(z) such that every branch point
of any x_j(z), j = t + 1, . . . , s is a branch point of at least one of B_1(z) and B_2(z),
and for every z ∈ Ω \ T the set x_1(z), . . . , x_t(z), x_{t+1}(z), . . . , x_s(z) forms a
basis in Ker B_1(z) modulo Ker B_2(z).
The case t = 0 [when the set x_1(z), . . . , x_t(z) does not appear] is not
excluded in Lemma 19.3.3.
There exist branch analytic vector functions y_{t+1}(z), . . . , y_n(z) such that
y_1(z), . . . , y_n(z) form a basis in C^n for every z ∈ Ω (cf. the proof of Lemma
19.3.2). By replacing B_1(z) by B_1(z)[y_1(z) · · · y_n(z)], we can assume that

and the proof is reduced to the case t = 0. But then Lemma 19.3.1 is
applicable. □
We are now ready to prove Theorem 19.2.3. The main idea is to mimic
the proof of the Jordan form for a transformation (Section 2.3), using
Lemma 19.3.2 where necessary.
Proof of Theorem 19.2.3. For a fixed j (j = 1, . . . , ν) let m_{j1} be the
maximal positive integer p such that

for all z ∉ S_1 ∪ S_2. By Theorem 19.2.1 and the definition of S_2 the number
m_{j1} is well defined. By Lemma 19.3.2, there exist branch analytic vector
functions x^{(j)}_1(z), . . . , x^{(j)}_{k_1}(z) on Ω that are linearly independent for
every z ∈ Ω, can have branch points only in S_1, and such that

form a basis in Ker(μ_j(z)I - A(z))^{m_{j1}} modulo Ker(μ_j(z)I - A(z))^{m_{j1}-1} for
every z ∈ Ω that does not belong to S((μ_j(z)I - A(z))^{m_{j1}}) ∪ S((μ_j(z)I -
A(z))^{m_{j1}-1}). As we have seen in the proof of the Jordan form, the vectors

are linearly independent modulo Ker(μ_j(z)I - A(z))^{m_{j1}-2} for every z ∉ S_1 ∪
S_2 (we assume here that m_{j1} ≥ 2). By Lemma 19.3.3, there exist branch
analytic vector functions on Ω:

with branch points only in S_1 and such that for every z ∉ S_1 ∪ S_2 the vectors
Theorem 19.4.1
Let A(z): C^n → C^n be an analytic family of transformations on Ω with the
first and second exceptional sets S_1 and S_2, respectively. Then, provided
z_0 ∈ Ω \ (S_2 ∪ S_1), every A(z_0)-invariant subspace M_0 is extendable to an
analytic A(z)-invariant family of subspaces on Ω \ S_1.
where

and J_k(μ) is the k x k Jordan block with eigenvalue μ. For z ∈ Ω \ S_1 let
T(z) be the n x n matrix whose columns are the vectors (19.4.1) (in this
order). Observe that T(z) is analytic on Ω \ S_1 with algebraic branch points
in S_1, and T(z) is invertible for z ∈ Ω \ (S_2 ∪ S_1) [the function T(z) is
analytic but not necessarily invertible at points of S_2]. Then we have
A(z_0)T(z_0) = T(z_0)J. Given an A(z_0)-invariant subspace M_0 and any
z ∈ Ω \ (S_1 ∪ S_2), define

Clearly, M(z) is analytic and A(z) invariant for z ∈ Ω \ (S_1 ∪ S_2), and also
M(z_0) = M_0. We show that M(z) admits an analytic and A(z)-invariant con-
tinuation into the set S_2. Let f_1, . . . , f_k be a basis in M_0; then the vectors

form a basis in M(z) for every z ∈ Ω \ (S_1 ∪ S_2). Note that g_1(z), . . . , g_k(z)
are analytic on Ω \ S_1. By Lemma 18.2.2 there exist n-dimensional vector
functions h_1(z), . . . , h_k(z) that are analytic on Ω \ S_1, linearly independent
for every z ∈ Ω \ S_1, and for which
Theorem 19.4.2
Let A(z), S_1, and S_2 be as in Theorem 19.4.1. Then every chain of A(z_0)-
invariant subspaces, where z_0 ∈ Ω \ (S_2 ∪ S_1), is extendable to an analytic
chain of A(z)-invariant subspaces on Ω \ S_1. Moreover, the analytic families
of subspaces that form this analytic chain have at most algebraic branch
points at S_1 (in the sense explained after the proof of Theorem 19.4.1).
Theorem 19.4.3
Let A(z) and S_1 be as in Theorem 19.4.1. Then every chain Λ_0 = {M_{01} ⊂
· · · ⊂ M_{0r}} of spectral subspaces of A(z_0), where z_0 ∈ Ω, is extendable to an
analytic chain of A(z)-invariant subspaces on Ω \ (S_1 \ {z_0}) that has at
most algebraic branch points at S_1 \ {z_0}.
Conjecture 19.4.4
Let A(z), S_1, and S_2 be as in Theorem 19.4.1. Then every sequentially
nonisolated A(z_0)-invariant subspace M_0, where z_0 ∈ S_2, is extendable to an
analytic A(z)-invariant family of subspaces on Ω \ S_1 that has at most
algebraic branch points in S_1 (in the same sense as in the remark following the
proof of Theorem 19.4.1).
The results of Sections 19.1-19.4 also hold for n x n matrix functions A(t)
that are analytic in the real variable t on an open interval Ω of the real line.
Of particular interest is the case when all eigenvalues of A(t) are real, as
follows.
Theorem 19.5.1
Let A(t) be an n x n matrix function that is analytic in the real variable t on
Ω. Assume that, for all t ∈ Ω, all eigenvalues of A(t) are real. Then the
eigenvalues of A(t) are also analytic functions of t on Ω.
where c_j are complex numbers. Let j_1 be the first index such that c_{j_1} ≠ 0. [If
all c_j are zeros, then λ(t) ≡ λ_0 is obviously analytic at t_0.] Then

Take t > t_0 and (t - t_0)^{1/a} positive. Since λ(t) and λ_0 are real, we find that c_{j_1}
must be real. In (19.5.1) we now take t < t_0 and
(t - t_0)^{1/a} = |t - t_0|^{1/a} (cos(π/a) + i sin(π/a)). We obtain a contradiction with
the fact that c_{j_1} is real, unless j_1 is a multiple of a. If j_2 > j_1 is the minimal
integer with c_{j_2} ≠ 0, then
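The dichotomy behind Theorem 19.5.1 can be seen on a pair of 2 x 2 families (illustrative examples, not taken from the text): when every eigenvalue stays real, the fractional power series collapses to an ordinary power series, while a family whose eigenvalues leave the real axis can branch.

```python
import sympy as sp

t = sp.symbols('t', real=True)

# All eigenvalues real for every real t: they come out analytic (here +-t)
A_real = sp.Matrix([[0, t], [t, 0]])
assert set(A_real.eigenvals()) == {t, -t}

# Eigenvalues +-sqrt(t) are not all real for t < 0, so Theorem 19.5.1 does
# not apply: they have an algebraic branch point at t = 0
A_branch = sp.Matrix([[0, 1], [t, 0]])
assert set(A_branch.eigenvals()) == {sp.sqrt(t), -sp.sqrt(t)}
```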
Combining this result with Theorems 19.2.3 and 19.4.1, we have the
following corollary.
Corollary 19.5.2
Let A(t) be an analytic n x n matrix function of a real variable t on Ω, and
assume that all eigenvalues of A(t) are real when t ∈ Ω. Let S_2 be the discrete
set of points in Ω defined by the property that either

but for at least one analytic eigenvalue μ_j(t) of A(t) the partial multiplicities of
μ_j(t_0) are different from the partial multiplicities of μ_j(t), t ≠ t_0, in a real
neighbourhood of t_0. Then there exist analytic n-dimensional vector functions

on Ω such that for every t ∈ Ω \ S_2 the vectors (19.5.2) form a basis in C^n
and, for j = 1, . . . , r, x_{j1}(t), . . . , x_{jm_j}(t) is a Jordan chain of the transfor-
mation A(t). Moreover, when t_0 ∈ Ω \ S_2, every A(t_0)-invariant subspace M_0
is extendable to an analytic A(t)-invariant family of subspaces on Ω.
19.6 EXERCISES
19.1 Find the first and second exceptional sets for the following analytic
families of transformations:
19.2 In Exercise 19.1(a) find a basis in C^2 that is analytic on C (with the
possible exception of branch points) and consists of eigenvectors of
A(z) (with the possible exception of a discrete set of values of z).
19.3 Describe the first and second exceptional sets for the following types
of analytic families of transformations A(z): C^n → C^n on Ω:
(a) A(z) = diag[a_1(z), . . . , a_n(z)] is a diagonal matrix.
(b) A(z) is a circulant matrix (with respect to a fixed basis in C^n)
for every z ∈ Ω.
(c) A(z) is an upper triangular Toeplitz matrix for every z ∈ Ω.
(d) For every z ∈ Ω, all the entries in A(z), with the possible
exception of the entries (i, j) with i = j or with i + j = n + 1,
are zeros.
19.4 Show that the analytic matrix function of type α(z)I + β(z)A, where
α(z) and β(z) are scalar analytic functions and A is a fixed n x n
matrix, has all eigenvalues analytic.
19.5 Show that if A(z) = α(z)I + β(z)A is the function of Exercise 19.4
and β(z) is a polynomial of degree l, then the second exceptional set
of A(z) contains not more than l points.
19.6 Prove that the number of exceptional points of a polynomial family
of transformations Σ_{j=0}^{k} z^j A_j, z ∈ C, is always finite. [Hint: Use the
approach based on the resultant matrix (Section 19.2).]
19.7 Let A(z) be an analytic n x n matrix function defined on Ω whose
values are circulant matrices. When is every A(z_0)-invariant sub-
space analytically extendable for every z_0 ∈ Ω?
19.8 Describe the analytically extendable A(z_0)-invariant subspaces,
where A(z) is an analytic n x n matrix function on Ω with upper
triangular Toeplitz values, and z_0 ∈ Ω.
19.9 Let A(z): C^n → C^n be an analytic family of transformations defined
on Ω, and assume that A(z_0) is nonderogatory for some z_0 ∈ Ω.
Prove that every A(z_0)-invariant subspace is sequentially noniso-
lated. (Hint: Use Theorem 15.2.3.)
and

holds. Find λ_k by taking the scalar product of (1) with f_k. By taking
the scalar product of (1) with f_j (j ≠ k) it is found that
Applications
This chapter contains applications of the results of the previous two chap-
ters. These applications are concerned with problems of factorizations of
monic matrix polynomials and rational matrix functions depending analyti-
cally on a parameter. The main problem is the analysis of analytic properties
of divisors. Solutions of a matrix quadratic equation with coefficients
depending analytically on a parameter are also analyzed.
Theorem 20.1.1
and

where L_j(λ), j = 1, . . . , r are monic matrix polynomials and S_1 (resp. S_2) is
the first (resp. second) exceptional set of L(λ, z). Then there exist monic
matrix polynomials L_1(λ, z), . . . , L_r(λ, z) whose coefficients are analytic
functions on Ω \ (S_1 ∪ S) (where S is some discrete subset of Ω \ {z_0}),
having at most poles in S and at most algebraic branch points in S_1, and such
that

Note that the case when S_1 ∩ S ≠ ∅ is not excluded. This means that the
coefficients A_{jk}(z) of L_j(λ, z) may have an algebraic branch point and a
pole at the same point z' simultaneously; that is, there is a power series
representation of the type
In the same way (using Theorem 19.4.3 in place of Theorem 19.4.2) one
proves the analytic extendability of spectral factorizations, as follows.
Theorem 20.1.2
Let z_0 ∈ Ω and

where σ(L_j) ∩ σ(L_k) = ∅ for j ≠ k. Then there exist monic matrix polyno-
mials L_1(λ, z), . . . , L_r(λ, z) with the same properties as in Theorem 20.1.1,
and whose coefficients are, in addition, analytic at z_0.
where

(a) For each i and j and for all z ∈ Ω, the polynomial q_{ij}(λ, z) in λ is not
identically zero, so the rational matrix function W(λ, z) is well
defined for every z ∈ Ω.
(b) It is convenient to make a further assumption, namely, that for
each pair of indices i, j (1 ≤ i, j ≤ n) there exists a z_0 ∈ Ω such that
the leading coefficient of q_{ij}(λ, z) is nonzero at z = z_0 and the
polynomials p_{ij}(λ, z_0) and q_{ij}(λ, z_0) are coprime, that is, have no
common zeros. In particular, this assumption rules out the case
when p_{ij}(λ, z) and q_{ij}(λ, z) have a nontrivial common divisor whose
coefficients depend analytically on z for z ∈ Ω.
(c) Finally, we assume that for every z ∈ Ω the rational matrix function
W(λ, z) (as a function of λ) is analytic at infinity and W(∞, z) = I.
Assumptions (a), (b), and (c) are maintained throughout this section.
It can happen that W(λ, z) has zeros and poles tending to infinity when z
tends to a certain point z_0 ∈ Ω. This is illustrated in the next example.
Proposition 20.2.1
The poles and zeros of W(λ, z) are bounded in a neighbourhood of each
point in Ω if and only if, for each entry p_{ij}(λ, z)/q_{ij}(λ, z) of W(λ, z), the
leading coefficient of the polynomial q_{ij}(λ, z) has no zeros in Ω (as an
analytic function of z on Ω).
Theorem 20.2.2
Let W(λ, z) be a rational n x n matrix function that depends analytically on
the parameter z ∈ Ω and satisfies assumptions (a), (b), and (c). Let the zeros
and poles of W(λ, z) be bounded in a neighbourhood of every point in Ω.
Then there exist matrix functions A(z), B(z), and C(z), analytic on Ω, of sizes
m x m, m x n, and n x m, respectively, such that

and for every z ∈ Ω, with the possible exception of a discrete set S, the
realization (20.2.1) is minimal.
Conversely, if (20.2.1) holds for some matrix functions A(z), B(z), and
C(z) of appropriate sizes that are analytic on Ω, then the zeros and poles of
W(λ, z) are bounded in a neighbourhood of every point in Ω.
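The realization (20.2.1) itself fell victim to a lost display; in the standard form used for functions with W(∞, z) = I it reads W(λ, z) = I + C(z)(λI - A(z))^{-1}B(z) (assumed here). A sketch with an illustrative analytic choice of A(z), B(z), C(z), m = n = 2:

```python
import sympy as sp

lam, z = sp.symbols('lambda z')

# A(z) is m x m, B(z) is m x n, C(z) is n x m, all analytic in z
A = sp.Matrix([[z, 1], [0, -z]])
B = sp.eye(2)
C = sp.Matrix([[z, 0], [0, z**2]])

# assumed form of the realization (20.2.1)
W = (sp.eye(2) + C * (lam * sp.eye(2) - A).inv() * B).applyfunc(sp.simplify)

# W(oo, z) = I, and the poles of W(., z) lie among the eigenvalues +-z of
# A(z), hence stay bounded near every point of the plane
assert sp.limit(W[0, 0], lam, sp.oo) == 1
assert sp.limit(W[1, 1], lam, sp.oo) == 1
assert sp.limit(W[0, 1], lam, sp.oo) == 0
```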
for some matrices C_0(z), A_0(z), and B_0(z). Further, by Proposition 20.2.1,
the leading coefficients of the denominators of the entries in W(λ, z) have
no zeros in Ω. Using this fact, the proof of Theorem 7.1.2 shows that
A_0(z), B_0(z), and C_0(z) can be chosen to be analytic matrix functions of z
on Ω. Let p x p be the size of A_0(z). By Theorem 18.2.1 we can find
families of subspaces of C^p, which are analytic on Ω and are
such that, for every z ∈ Ω, with the possible exception of a discrete set S_1, we
have

and
For z ∈ S_1 we have

and

A(z)^×] (see Section 7.2), it follows that the point z_0 belongs to the first
exceptional set of W(λ, z) if and only if there is a pole or a zero λ_0(z),
defined in a deleted neighbourhood U of z_0, such that
z_0 is an algebraic branch point of λ_0(z). Note that it can happen that
(20.2.3) is not a minimal realization for some z belonging to the first
exceptional set of W(λ, z) (see Example 20.2.2). The set

will be called the second exceptional set T_2 of W(λ, z). Denoting by δ(z) the
McMillan degree of W(λ, z), we obtain the following description of the
points in the second exceptional set: z_0 ∈ T_2 if and only if all poles and zeros
of W(λ, z) can be continued analytically (as functions of z) to z_0, and either

and for at least one zero (or pole) λ_0(z) that is analytic in a neighbourhood
U of z_0, the zero (or pole) multiplicities of W(λ, z_0) corresponding to λ_0(z_0)
are different from the zero (or pole) multiplicities of W(λ, z), z ≠ z_0,
corresponding to λ_0(z). Again, it can happen that T_2 intersects with the set of
points where the realization (20.2.3) is not minimal. Clearly, both T_1 and T_2
are discrete sets. Note also that the set T_1 ∪ T_2 contains all the points z_0 for
which δ(z_0) is smaller than the maximal McMillan degree.
Theorem 20.3.1
Let W(λ, z) be a rational n x n matrix function that depends analytically on z
for z ∈ Ω and such that W(∞, z) = I for z ∈ Ω. Assume that the denominator
and numerator of each entry in W(λ, z) are coprime for some z_0 ∈ Ω that is
not a zero of the leading coefficient of the denominator. Assume, in addition,
that the zeros and poles of W(λ, z) are bounded in a neighbourhood of each
point in Ω. Let

where δ(z) is the McMillan degree of W(λ, z), and let T_1 and T_2 be the first
and second exceptional sets of W(λ, z), respectively. Consider a minimal
factorization (20.3.1) with z = z_0. Then there exist rational
matrix functions W_1(λ, z), . . . , W_r(λ, z), the entries of which depend analyt-
ically on z in Ω (with the possible exception of algebraic branch points in T_1
and of a discrete set D ⊂ Ω of poles), and having the following properties: (a)
W_j(∞, z) = I for j = 1, . . . , r and every z; (b) the point z_0 does not
belong to D, and

for every z ∈ Ω \ D. Moreover, this factorization is
minimal for every z ∈ Ω \ (T_1 ∪ T_2 ∪ D).
where, for each i, the vector functions x_1^{(i)}(z), . . . , x_{p_i}^{(i)}(z), as well as
y_1^{(i)}(z), . . . , y_{p_i}^{(i)}(z), are linearly independent for every z ∈ Ω and analytic
on Ω except possibly for algebraic branch points in T_1. Here p_i = m_{i1} +
where D_1 is a discrete set in Ω. [Note that the sum in (20.3.4) is direct.]
Indeed, by (20.3.2) and (20.3.3) we have

where

[By definition, M_r(z) = C^m.] The inclusion ⊂ in (20.3.8) is evident from the
definition of Q_j(z). Further, for z ∈ Ω \ D_1, we have

so

[Indeed, both sides of (20.3.10) take the value 0 on vectors from N_{j+1}(z)
and from M_j(z), and take the value x on each vector x from N_j(z).]
Therefore, (I - Q_{j-1}(z))Q_j(z) is a projector that coincides with

for j = 1, . . . , r. But
Theorem 20.3.2
Let W(\, z) be as in Theorem 20.3.1, and let
The proof is obtained in the same way as the proof of Theorem 20.3.1, by
using Theorem 19.4.3 in place of Theorem 19.4.2.
To conclude this section, we discuss minimal factorizations (20.3.1) that
cannot be continued analytically (as in Theorem 20.3.1). We say that the
minimal factorization (20.3.1) is sequentially nonisolated if there is a
sequence of points {z_m} in Ω such that z_m → z_0, and sequences of
rational matrix functions W_{jm}(λ), j = 1, . . . , r, with value I at infinity,
such that

Equation (20.3.11) is understood in the sense that for each pair of indices
k, l (1 ≤ k, l ≤ n) the (k, l) entry of W_{jm}(λ) has the form
Theorem 20.4.1
For every z_0 ∈ Ω \ (S_1 ∪ S_2) and every solution X_0 of

for every z ∈ Ω that is not a pole of X(z). [The case when a point z_0 ∈ S_1 is
also a pole of X(z) is not excluded.]
Proof. By Proposition 17.8.1, the subspace
Consider an example.
Here
Theorem 20.4.2
Let X_0 be a solution of (20.4.1) and z_0 ∈ Ω. Furthermore, assume that the
T(z_0)-invariant subspace Im is spectral. Then there exists an m x n
matrix function X(z) with the properties described in Theorem 20.4.1 and, in
addition, X(z) is analytic in a neighbourhood of z_0.
The proof of Theorem 20.4.2 is obtained in the same way as the proof of
Theorem 20.4.1, but using Theorem 19.4.3 in place of Theorem 19.4.1.
In connection with Theorem 20.4.2, note the following fact. Assume that
m = n. If X_1 and X_2 are solutions of (20.4.1) such that
20.5 EXERCISES
20.1 Let

be a matrix equation, where A(z), B(z), C(z), and D(z) are analytic
matrix functions (of appropriate sizes) on a domain Ω. Assume
that all eigenvalues of the matrix
Appendix
Equivalence of
Matrix Polynomials
The Smith Form: Existence
Theorem A.1.1
An m x n matrix polynomial A(λ) is equivalent to a unique m x n matrix
polynomial D(λ), where

In other words, for every matrix polynomial A(λ) there exist matrix
polynomials E(λ) and F(λ) with constant nonzero determinants such that
has the form (A.1.2), and this form is uniquely determined by A(λ). The
matrix polynomial D(λ) of (A.1.2) is called the Smith form of A(λ) and
plays an important role in the analysis of matrix polynomials. Note that
E(λ) and F(λ) from (A.1.3) are not unique in general. Note also that the
zeros on the main diagonal in D(λ) are absent in case A(λ) has full rank
for some λ. [In particular, this happens if A(λ) is an n x n matrix
polynomial with leading coefficient I.]
If all entries of A(λ) are zeros, there is nothing to prove. Suppose that not
all the entries are zeros, and let a_{i_0 j_0}(λ) be a polynomial of minimal degree
among the nonzero entries of A(λ). We can suppose that i_0 = j_0 = 1. [Otherwise,
interchange rows and columns in A(λ).] By elementary transformations it is possible to
Theorem A.2.1
Let A = BC, where B is an m x p matrix and C is a p x n matrix. Then for
every k, 1 ≤ k ≤ min(m, n), and every minor of A of order k we
have

where the sum is taken over all sequences {a_q}_{q=1}^{k} of integers satisfying
1 ≤ a_1 < a_2 < · · · < a_k ≤ p. In particular, if k > p, then the sum on the
right-hand side of (A.2.1) is empty and the equation is interpreted as saying
that every minor of A of order k is zero.
and using the linearity of the determinant as a function of each column, this
expression is easily seen to be equal to

where the sum is taken over all k-tuples of integers (a_1, . . . , a_k) such that
1 ≤ a_q ≤ p. (Here we use the minor notation for B even when the sequence
{a_q}_{q=1}^{k} is not increasing, or when it contains repetitions of numbers.)
If not all a_1, a_2, . . . , a_k are different, then clearly the corresponding minor
of B is zero. Ignoring these summands in (A.2.2), split the remaining terms
into groups of k! terms each in such a way that the summands in the same
group differ only in the order of the indices a_1, a_2, . . . , a_k. We obtain:
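The Binet-Cauchy formula of Theorem A.2.1 is easy to spot-check numerically. A sketch with random matrices and one fixed choice of rows and columns, which also confirms the empty-sum case k > p:

```python
import numpy as np
from itertools import combinations

def minor(M, rows, cols):
    # determinant of the submatrix on the given rows and columns
    return np.linalg.det(M[np.ix_(rows, cols)])

rng = np.random.default_rng(0)
m, p, n, k = 4, 3, 5, 2
Bm = rng.standard_normal((m, p))
Cm = rng.standard_normal((p, n))
Am = Bm @ Cm

rows, cols = (0, 2), (1, 4)
lhs = minor(Am, rows, cols)
rhs = sum(minor(Bm, rows, a) * minor(Cm, a, cols)
          for a in combinations(range(p), k))   # increasing a_1 < ... < a_k
assert np.isclose(lhs, rhs)

# k > p: the sum is empty, and indeed every minor of order 4 vanishes
assert abs(minor(Am, (0, 1, 2, 3), (0, 1, 2, 3))) < 1e-9
```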
Theorem A.2.2
Let A(λ) be an m x n matrix polynomial. Let d_k(λ) be the greatest common
divisor (with leading coefficient 1) of the minors of A(λ) of order k, if not all
of them are zeros, and let d_k(λ) = 0 if all the minors of order k of A(λ) are
zeros. Let d_0(λ) = 1 and D(λ) = diag[d'_1(λ), . . . , d'_r(λ), 0, . . . , 0] be a
Smith form of A(λ) (which exists by the part of Theorem A.1.1 already
proved). Then r is the maximal integer such that d_r(λ) ≢ 0, and
d'_k(λ) = d_k(λ)/d_{k-1}(λ) for k = 1, . . . , r.
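Theorem A.2.2 gives a direct way to compute the diagonal of a Smith form without performing elementary operations: take the monic gcds d_k of the k x k minors and form the quotients d_k/d_{k-1}. A sketch in sympy (the example matrix is an illustrative choice):

```python
import sympy as sp
from functools import reduce
from itertools import combinations

lam = sp.symbols('lambda')

def minor_gcds(A):
    # d_k = monic gcd of all k x k minors of A(lambda); d_0 = 1 by convention
    m, n = A.shape
    ds = [sp.Integer(1)]
    for k in range(1, min(m, n) + 1):
        minors = [A.extract(list(r), list(c)).det()
                  for r in combinations(range(m), k)
                  for c in combinations(range(n), k)]
        g = reduce(sp.gcd, minors)
        ds.append(sp.Integer(0) if g == 0 else sp.monic(g, lam))
    return ds

A = sp.Matrix([[lam, 0], [0, lam * (lam - 1)]])
d = minor_gcds(A)                    # d_0 = 1, d_1 = lambda, d_2 = det A
# diagonal of the Smith form: d_k / d_{k-1}
inv_polys = [sp.cancel(d[k] / d[k - 1]) for k in (1, 2)]
```

Here the quotients come out as λ and λ(λ - 1), which are indeed the invariant polynomials of this diagonal example.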
In this section we study various invariants appearing in the Smith form of matrix polynomials. Let A(λ) be an m × n matrix polynomial with Smith form D(λ). The diagonal elements d_1(λ), …, d_r(λ) of D(λ) are called the invariant polynomials of A(λ). The number r of invariant polynomials can be defined as

    r = max_{λ ∈ ℂ} rank A(λ).   (A.3.1)

Indeed, since E(λ) and F(λ) from (A.1.3) are invertible matrices for every λ, we have rank A(λ) = rank D(λ). On the other hand, it is clear that rank D(λ) = r if λ is not a zero of one of the invariant polynomials, and rank D(λ) < r otherwise. So (A.3.1) follows.
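The characterization of the invariant polynomials through greatest common divisors of minors (Theorem A.2.2) translates directly into a computation. A small sketch with SymPy (the function name and the example are ours, not the book's):

```python
from itertools import combinations
import sympy as sp

lam = sp.symbols('lam')

def invariant_polynomials(A):
    """d_k = p_k / p_{k-1}, where p_k is the monic GCD of all minors
    of order k of A (Theorem A.2.2)."""
    m, n = A.shape
    p_prev, ds = sp.Integer(1), []
    for k in range(1, min(m, n) + 1):
        p_k = sp.Integer(0)
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                p_k = sp.gcd(p_k, A[list(rows), list(cols)].det())
        if p_k == 0:
            break                 # all minors of order k vanish, so r = k - 1
        p_k = sp.monic(p_k, lam) if p_k.has(lam) else sp.Integer(1)
        ds.append(sp.expand(sp.quo(p_k, p_prev, lam)))
        p_prev = p_k
    return ds

# lam*I - J for J = J_2(0) (+) J_1(0); expected Smith diagonal: 1, lam, lam**2
J = sp.Matrix([[0, 1, 0], [0, 0, 0], [0, 0, 0]])
M = lam * sp.eye(3) - J
assert invariant_polynomials(M) == [1, lam, lam**2]
```

Enumerating all minors is exponentially expensive, of course; the point here is only to mirror the definition, not to be an efficient Smith-form algorithm.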
The set of invariant polynomials forms a complete invariant for equivalence of matrix polynomials of the same size.

Theorem A.3.1
Matrix polynomials A(λ) and B(λ) of the same size are equivalent if and only if the invariant polynomials of A(λ) and B(λ) are the same.
Invariant Polynomials, Elementary Divisors, and Partial Multiplicities
Proof. Suppose the invariant polynomials of A(λ) and B(λ) are the same. Then their Smith forms are equal:

    A(λ) = E_1(λ)D(λ)F_1(λ)   and   B(λ) = E_2(λ)D(λ)F_2(λ)

for some matrix polynomials E_i(λ), F_i(λ) with constant nonzero determinants; hence A(λ) and B(λ) are equivalent.
We now take advantage of the fact that the polynomial entries of A(λ) and its Smith form D(λ) are over ℂ to represent each invariant polynomial d_i(λ) as a product of linear factors:

    d_i(λ) = (λ − λ_{i1})^{α_{i1}} ⋯ (λ − λ_{i k_i})^{α_{i k_i}},   i = 1, …, r,

where λ_{i1}, …, λ_{i k_i} are distinct complex numbers and α_{ij} ≥ 1; the factors (λ − λ_{ij})^{α_{ij}} are the elementary divisors of A(λ).
Proposition A.3.2
Let A(λ) be an n × n matrix polynomial such that det A(λ) ≢ 0. Then the sum of the degrees of its elementary divisors coincides with the degree of det A(λ).
Note that knowledge of the elementary divisors of A(λ) and of the number r of its invariant polynomials is sufficient to construct the invariant polynomials d_1(λ), …, d_r(λ). In this construction we use the fact that d_i(λ) is divisible by d_{i−1}(λ). Let λ_1, …, λ_p be all the different complex numbers that appear in the elementary divisors, and let

    (λ − λ_i)^{α_{i1}}, …, (λ − λ_i)^{α_{i k_i}}   (i = 1, …, p)

be the elementary divisors containing the number λ_i, ordered in the descending order of the degrees: α_{i1} ≥ ⋯ ≥ α_{i k_i} > 0. Clearly, the number r of invariant polynomials must be greater than or equal to max{k_1, …, k_p}. Under this condition, the invariant polynomials are given by the formulas

    d_{r+1−j}(λ) = ∏_{i=1}^{p} (λ − λ_i)^{α_{ij}},   j = 1, …, r,

where we put α_{ij} = 0 for j > k_i.
The following property of the elementary divisors is used subsequently.

Proposition A.3.3
Let A(λ) and B(λ) be matrix polynomials, and let

    C(λ) = diag[A(λ), B(λ)]

be a block-diagonal matrix polynomial. Then the set of elementary divisors of C(λ) is the union of the elementary divisors of A(λ) and B(λ).
Proof. Let D_1(λ) and D_2(λ) be the Smith forms of A(λ) and B(λ), respectively. Then clearly

    C(λ) = E(λ) diag[D_1(λ), D_2(λ)] F(λ)

for some matrix polynomials E(λ) and F(λ) with constant nonzero determinants. Let

    (λ − λ_0)^{α_1}, …, (λ − λ_0)^{α_p}   and   (λ − λ_0)^{β_1}, …, (λ − λ_0)^{β_q}

be the elementary divisors of D_1(λ) and D_2(λ), respectively, corresponding to the same complex number λ_0. Arrange the set of exponents α_1, …, α_p, β_1, …, β_q in nondecreasing order:

    0 < γ_1 ≤ γ_2 ≤ ⋯ ≤ γ_{p+q}.

Using Theorem A.2.2, it is clear that in the Smith form D = diag[d_1(λ), …] of diag[D_1(λ), D_2(λ)] the invariant polynomial d_{r+1−j}(λ) is divisible by (λ − λ_0)^{γ_{p+q+1−j}} but not by (λ − λ_0)^{γ_{p+q+1−j}+1}.
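Proposition A.3.3 is easy to test on a concrete pair of Smith forms. A sketch (the example matrices are ours), again extracting the invariant polynomials via the GCDs of minors from Theorem A.2.2:

```python
from itertools import combinations
import sympy as sp

lam = sp.symbols('lam')

def invariant_polynomials(A):
    """Smith diagonal of A via Theorem A.2.2 (monic GCDs of minors)."""
    m, n = A.shape
    p_prev, ds = sp.Integer(1), []
    for k in range(1, min(m, n) + 1):
        p_k = sp.Integer(0)
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                p_k = sp.gcd(p_k, A[list(rows), list(cols)].det())
        if p_k == 0:
            break
        p_k = sp.monic(p_k, lam) if p_k.has(lam) else sp.Integer(1)
        ds.append(sp.expand(sp.quo(p_k, p_prev, lam)))
        p_prev = p_k
    return ds

D1 = sp.diag(1, lam)            # elementary divisors: {lam}
D2 = sp.diag(lam, lam**2)       # elementary divisors: {lam, lam**2}
C = sp.diag(D1, D2)             # block diagonal of the two
# the Smith diagonal of C must realize the union {lam, lam, lam**2}:
assert invariant_polynomials(C) == [1, lam, lam, lam**2]
```

Reading off the powers of lam in the result recovers exactly the elementary divisors of D1 together with those of D2, as the proposition asserts.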
Theorem A.3.4
Let A(λ) be an n × n matrix polynomial with det A(λ) ≢ 0. Then for every λ_0 ∈ ℂ, A(λ) admits the representation

    A(λ) = E_{λ_0}(λ) diag[(λ − λ_0)^{κ_1}, …, (λ − λ_0)^{κ_n}] F_{λ_0}(λ),   (A.3.2)

where κ_1 ≤ ⋯ ≤ κ_n are nonnegative integers and E_{λ_0}(λ), F_{λ_0}(λ) are rational matrix functions that are defined and invertible at λ_0.

It remains to show that the κ_i coincide (after striking off zeros) with the degrees of the elementary divisors of A(λ) corresponding to λ_0. To this end we show that any factorization of A(λ) of type (A.3.2) with κ_1 ≤ ⋯ ≤ κ_n implies that κ_j is the multiplicity of λ_0 as a zero of d_j(λ), j = 1, …, n, where D(λ) = diag[d_1(λ), …, d_n(λ)] is the Smith form of A(λ). Indeed, let

    D_0(λ) = diag[(λ − λ_0)^{κ_1}, …, (λ − λ_0)^{κ_n}]

and apply Theorem A.2.1 again. Using the fact that E_{λ_0}(λ) and (F_{λ_0}(λ))^{−1} are rational matrix functions that are defined and invertible at λ_0, and that d_i(λ) is a divisor of d_{i+1}(λ), we deduce that the multiplicity of λ_0 as a zero of d_j(λ) is exactly κ_j. □
We study here equivalence and the Smith form for matrix polynomials of the type Iλ − A, where A is an n × n matrix. It turns out that for such matrix polynomials the notion of equivalence is closely related to similarity.

Theorem A.4.1
Iλ − A ∼ Iλ − B if and only if A and B are similar.
where Q_r(λ) is a matrix polynomial, which is called the right quotient, and R_r is a constant matrix, which is called the right remainder, on division of A(λ) by Iλ − A on the right; similarly,

    A(λ) = (Iλ − A)Q_l(λ) + R_l,   (A.4.2)

where Q_l(λ) is the left quotient, and the constant matrix R_l is the left remainder.
Let us check the existence of the representation (A.4.1); (A.4.2) can be checked in a similar way. If l = 0 [i.e., A(λ) is constant], put Q_r(λ) = 0 and R_r = A(λ). So we can suppose l ≥ 1. Write A(λ) = Σ_{j=0}^{l} λ^j A_j. Comparing the coefficients of the powers of λ on the right- and left-hand sides of (A.4.1), we can rewrite this relation as follows:
we obtain

whence

Since the degree of the matrix polynomial on the right-hand side here is 1, it follows that S(λ) = T(λ); otherwise, the degree of the matrix polynomial on the left would be at least 2. Hence

so that

Hence the matrix polynomial in the square brackets is zero, and E_0 R_0 = I. It follows that E_0 is nonsingular. □
The definitions of eigenvalues and partial multiplicities made in the preceding section can be applied to an n × n matrix polynomial of the form Iλ − A. On the other hand, as an n × n matrix (or as a transformation represented by this matrix in the standard basis e_1, …, e_n), A has eigenvalues and partial multiplicities as defined in Sections 1.2 and 2.2. It is an important fact that these notions for Iλ − A and for A coincide.
Theorem A.4.2
A complex number λ_0 is an eigenvalue of Iλ − A if and only if it is an eigenvalue of A. Moreover, the partial multiplicities of Iλ − A corresponding to its eigenvalue λ_0 coincide with the partial multiplicities of A corresponding to λ_0.

Proof. The first statement follows from the definitions: λ_0 is an eigenvalue of Iλ − A if and only if det(Iλ_0 − A) = 0, which is exactly the definition of an eigenvalue of A. For the proof of the second statement, we can assume that A is in the Jordan form. Further, using Proposition A.3.3, we reduce the proof to the case when A is a single Jordan block of size n × n with eigenvalue λ_0. In this case the minor of order n − 1 formed by crossing out the first column and the last row in Iλ − A is equal to (−1)^{n−1}. As det(Iλ − A) = (λ − λ_0)^n, Theorem A.2.2 implies that the Smith form of Iλ − A is diag[1, 1, …, 1, (λ − λ_0)^n]. So the only partial multiplicity of Iλ − A is n, which corresponds to λ_0. □
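The two determinant computations used in this proof are easy to confirm with SymPy (a sketch; the symbol names are ours):

```python
import sympy as sp

lam, lam0 = sp.symbols('lam lam0')
n = 4
J = sp.jordan_cell(lam0, n)        # n x n Jordan block with eigenvalue lam0
M = lam * sp.eye(n) - J

# crossing out the first column and the last row leaves a lower triangular
# submatrix with -1 on its diagonal:
sub = M[:n - 1, 1:]
assert sp.expand(sub.det()) == (-1) ** (n - 1)

# det(I*lam - J) = (lam - lam0)**n, so by Theorem A.2.2 the Smith form
# of I*lam - J is diag[1, ..., 1, (lam - lam0)**n]
assert sp.expand(M.det() - (lam - lam0) ** n) == 0
```

Since that (n − 1) × (n − 1) minor is a nonzero constant, the GCD p_{n−1}(λ) equals 1, which is the step the proof relies on.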
Theorem A.4.3
Let A be an n × n matrix. Let α_1 ≥ ⋯ ≥ α_m be the partial multiplicities of an eigenvalue λ_0 of A, and put α_i = 0 for i = m + 1, …, n. Then

    α_{n−p+1} + α_{n−p+2} + ⋯ + α_n

is the minimal multiplicity of λ_0 as a zero of the determinant (considered as a polynomial in λ) of any p × p submatrix of Iλ − A.
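With the partial multiplicities arranged in nonincreasing order and padded with zeros, this quantity can be checked by brute force over all p × p submatrices. A sketch on a small example of our own (A = J_2(0) ⊕ J_1(0), λ_0 = 0, partial multiplicities 2, 1, 0):

```python
from itertools import combinations
import sympy as sp

lam = sp.symbols('lam')
A = sp.Matrix([[0, 1, 0], [0, 0, 0], [0, 0, 0]])   # J_2(0) (+) J_1(0)
n, alphas = 3, [2, 1, 0]                           # nonincreasing, zero-padded
M = lam * sp.eye(n) - A

def min_multiplicity(p):
    """Minimal multiplicity of lam = 0 as a zero of det of a p x p submatrix
    of M (identically vanishing determinants are skipped)."""
    best = None
    for rows in combinations(range(n), p):
        for cols in combinations(range(n), p):
            d = sp.expand(M[list(rows), list(cols)].det())
            if d == 0:
                continue
            mult = sp.Poly(d, lam).monoms()[-1][0]   # lowest power of lam
            best = mult if best is None else min(best, mult)
    return best

for p in range(1, n + 1):
    assert min_multiplicity(p) == sum(alphas[n - p:])
```

The minimum over the p × p minors is the multiplicity of λ_0 in the GCD p_p(λ) of Theorem A.2.2, which is why the sum of the p smallest partial multiplicities appears.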
Let A + λB and A_1 + λB_1 be two linear matrix polynomials of the same size m × n. We say that A + λB and A_1 + λB_1 are strictly equivalent if there exist invertible constant matrices P and Q such that P(A + λB)Q = A_1 + λB_1.
Strict Equivalence of Linear Matrix Polynomials
Proposition A.5.1
Two regular polynomials A + λB and A_1 + λB_1 with det B ≠ 0, det B_1 ≠ 0 are strictly equivalent if and only if they have the same invariant polynomials (or, equivalently, the same elementary divisors).
One can exhibit regular polynomials A + λB and A_1 + λB_1 that have the same Smith form, that is, the same invariant polynomials, but nevertheless cannot be strictly equivalent because B and B_1 have different ranks.
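The concrete matrices of this example are garbled in the present copy, but a pair with the stated properties is easy to write down; the following choice is ours, not necessarily the book's:

```python
import sympy as sp

lam = sp.symbols('lam')
A,  B  = sp.eye(2), sp.Matrix([[0, 1], [0, 0]])
A1, B1 = sp.eye(2), sp.zeros(2, 2)
P1, P2 = A + lam * B, A1 + lam * B1

# both pencils are regular with det = 1, hence both have Smith form diag(1, 1)
assert P1.det() == 1 and P2.det() == 1
# but strict equivalence P(A + lam*B)Q preserves the rank of B, and
assert B.rank() == 1 and B1.rank() == 0
```

Since P(A + λB)Q = PAQ + λPBQ with P, Q invertible, the rank of the coefficient of λ is a strict-equivalence invariant, so P1 and P2 are not strictly equivalent despite sharing their invariant polynomials.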
Theorem A.5.2
Two regular polynomials A + λB and A_1 + λB_1 are strictly equivalent if and only if the elementary divisors of A + λB and A_1 + λB_1 are the same and their elementary divisors at infinity are the same.
for some complex numbers α_j and α′_j. (In fact, the nonzero α′_j are the reciprocals of the nonzero α_j.) Using factorizations of this kind, it is easily seen that p_1(λ, 1), …, p_n(λ, 1) are the invariant polynomials of A + λB, whereas p_1(1, μ), …, p_n(1, μ) are the invariant polynomials of μA + B.
Theorem A.5.3
Every regular linear matrix polynomial A + λB is strictly equivalent to a linear polynomial of the form

    diag[I_{k_1} + λJ_{k_1}(0), …, I_{k_p} + λJ_{k_p}(0), λI_{l_1} + J_{l_1}(λ_1), …, λI_{l_q} + J_{l_q}(λ_q)],   (A.5.1)

where J_k(λ) is the k × k Jordan block with eigenvalue λ. The linear polynomial (A.5.1) is uniquely determined by A + λB. In fact, λ^{k_1}, …, λ^{k_p} are the elementary divisors at infinity of A + λB, whereas (λ + λ_i)^{l_i}, i = 1, …, q, are the elementary divisors of A + λB.
Among all polynomial solutions x(λ) of (A.6.1) that are not identically zero, we choose one of least degree ε and write

    x(λ) = x_0 + λx_1 + ⋯ + λ^ε x_ε,   x_ε ≠ 0.

Theorem A.6.1
If ε is the minimal degree of a nonzero polynomial solution of (A.6.1), and if ε > 0, then A + λB is strictly equivalent to a linear matrix polynomial of the form

    diag[L_ε, Ã + λB̃],

where L_ε is the ε × (ε + 1) matrix polynomial with λ on the diagonal and 1 on the superdiagonal, and Ã + λB̃ is a linear matrix polynomial of size (m − ε) × (n − ε − 1).
Lemma A.6.2
Assume that the rank of U + λV is less than n. Then ε is the minimal degree of nonzero polynomial solutions y(λ) of

    (U + λV)y(λ) ≡ 0   (A.6.6)

if and only if

and

or, equivalently,
The Reduction Theorem for Singular Polynomials 669
Not all the vectors y_j are zero, and so (A.6.7) follows. Conversely, if (A.6.7) holds, we may reverse the argument and obtain a nonzero polynomial solution of (A.6.6) of degree ε. □
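Under one common convention, the singular block L_ε appearing in the reduction is the ε × (ε + 1) pencil with λ on the diagonal and 1 on the superdiagonal; its minimal-degree polynomial solution then has degree exactly ε. A sketch (the convention and names are our assumptions, not quoted from the book):

```python
import sympy as sp

lam = sp.symbols('lam')

def L(eps):
    """eps x (eps + 1) block: lam on the diagonal, 1 on the superdiagonal."""
    M = sp.zeros(eps, eps + 1)
    for i in range(eps):
        M[i, i] = lam
        M[i, i + 1] = 1
    return M

eps = 3
x = sp.Matrix([(-lam) ** i for i in range(eps + 1)])   # degree-eps column
assert (L(eps) * x).expand() == sp.zeros(eps, 1)
# the kernel of L(eps) over the rational functions is one dimensional, so
# every polynomial solution is a multiple of x and has degree >= eps:
assert L(eps).rank() == eps
```

Row i of the product reads λ(−λ)^i + (−λ)^{i+1} = 0, which is why the alternating-power column annihilates the block.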
Proof of Theorem A.6.1. The proof is given in three steps. In the first step we show that the vectors listed in (A.6.9) are linearly independent. Assume the contrary, and let Ax_h (h ≥ 1) be the first vector in (A.6.9) that is linearly dependent on the preceding ones:

where

is obtained from
for suitable matrices X and Y, we see that Theorem A.6.1 will be completely proved if we can show that the matrices X and Y can be chosen so that the matrix equation

holds.
We introduce a notation for the elements of D, F, and X, and also for the rows of Y and the columns of A and B:
The left-hand sides of these equations are linear polynomials in λ. The free term of each of the first ε − 1 of these polynomials is equal to the coefficient of λ in the next polynomial. But then the right-hand sides must also satisfy this condition. Therefore, for k = 1, 2, …, n − ε − 1, we obtain

where
are called linearly dependent if the rank of the polynomial matrix formed from these columns is less than k. In that case there exist k polynomials p_1(λ), …, p_k(λ), not all identically zero, such that

    p_1(λ)x_1(λ) + ⋯ + p_k(λ)x_k(λ) ≡ 0.
Indeed, let
Note that it may happen that some of the degrees ε_1, ε_2, … are zeros. [This is the case when (A.7.1) admits constant nonzero solutions.] In general, a fundamental series of solutions is not uniquely determined (to within scalar factors) by the pencil A + λB. However, note the following.
Proposition A.7.1
Two distinct fundamental series of solutions always have the same series of degrees.
where X_1(λ) is an n × m_1 matrix polynomial. Since the solutions x_1(λ), …, x_{m_1}(λ) are linearly independent, there is a nonzero minor f(λ) of order m_1 of X_1(λ). So for every λ ∈ ℂ that is not a zero of one of the polynomials involved, the rank of the matrix on the left-hand side of (A.7.6) is m_1. Hence (A.7.6) implies m_1 ≤ n_1. Interchanging the roles of x_i(λ) and y_i(λ), we find the opposite inequality n_1 ≤ m_1; thus m_1 = n_1, and we can repeat the above argument with n_2 and m_2 in place of n_1 and m_1, respectively, and so on. □
Proposition A.7.2
If A + λB and A_1 + λB_1 are strictly equivalent, then the minimal column indices of the polynomials A + λB and A_1 + λB_1 are the same, and the minimal row indices of these polynomials are also the same.
Theorem A.7.3
Every m × n linear matrix polynomial A + λB is strictly equivalent to a unique linear matrix polynomial of the type

    diag[0, L_{ε_1}, …, L_{ε_p}, L_{η_1}^T, …, L_{η_q}^T, I_{k_1} + λJ_{k_1}(0), …, I_{k_r} + λJ_{k_r}(0), λI_{l_1} + J_{l_1}(λ_1), …, λI_{l_s} + J_{l_s}(λ_s)].

Here ε_1 ≤ ⋯ ≤ ε_p and η_1 ≤ ⋯ ≤ η_q are positive integers; k_1, …, k_r and l_1, …, l_s are positive integers; λ_1, …, λ_s are complex numbers.
that is, all solutions that are independent of λ. Note that (A.7.11) is equivalent to the simultaneous equations
Corollary A.7.4
We have A + λB ∼ A_1 + λB_1 if and only if the polynomials A + λB and A_1 + λB_1 have the same minimal column indices, minimal row indices, elementary divisors, and elementary divisors at infinity.
Thus Corollary A.7.4 describes the full set of invariants for strict equivalence of linear matrix polynomials.