
Invariant Subspaces of Matrices with Applications
SIAM's Classics in Applied Mathematics series consists of books that were previously allowed to go
out of print. These books are republished by SIAM as a professional service because they continue
to be important resources for mathematical scientists.

Editor-in-Chief
Robert E. O'Malley, Jr., University of Washington

Editorial Board
Richard A. Brualdi, University of Wisconsin-Madison
Leah Edelstein-Keshet, University of British Columbia
Nicholas J. Higham, University of Manchester
Herbert B. Keller, California Institute of Technology
Andrzej Z. Manitius, George Mason University
Hilary Ockendon, University of Oxford
Ingram Olkin, Stanford University
Peter Olver, University of Minnesota
Ferdinand Verhulst, Mathematisch Instituut, University of Utrecht
Classics in Applied Mathematics
C. C. Lin and L. A. Segel, Mathematics Applied to Deterministic Problems in the Natural Sciences
Johan G. F. Belinfante and Bernard Kolman, A Survey of Lie Groups and Lie Algebras with
Applications and Computational Methods
James M. Ortega, Numerical Analysis: A Second Course
Anthony V. Fiacco and Garth P. McCormick, Nonlinear Programming: Sequential Unconstrained
Minimization Techniques
F. H. Clarke, Optimization and Nonsmooth Analysis
George F. Carrier and Carl E. Pearson, Ordinary Differential Equations
Leo Breiman, Probability
R. Bellman and G. M. Wing, An Introduction to Invariant Imbedding
Abraham Berman and Robert J. Plemmons, Nonnegative Matrices in the Mathematical Sciences
Olvi L. Mangasarian, Nonlinear Programming
*Carl Friedrich Gauss, Theory of the Combination of Observations Least Subject to Errors:
Part One, Part Two, Supplement. Translated by G. W. Stewart
Richard Bellman, Introduction to Matrix Analysis
U. M. Ascher, R. M. M. Mattheij, and R. D. Russell, Numerical Solution of Boundary Value
Problems for Ordinary Differential Equations
K. E. Brenan, S. L. Campbell, and L. R. Petzold, Numerical Solution of Initial-Value Problems
in Differential-Algebraic Equations
Charles L. Lawson and Richard J. Hanson, Solving Least Squares Problems
J. E. Dennis, Jr. and Robert B. Schnabel, Numerical Methods for Unconstrained Optimization
and Nonlinear Equations
Richard E. Barlow and Frank Proschan, Mathematical Theory of Reliability
Cornelius Lanczos, Linear Differential Operators
Richard Bellman, Introduction to Matrix Analysis, Second Edition
Beresford N. Parlett, The Symmetric Eigenvalue Problem

*First time in print.


Classics in Applied Mathematics (continued)

Richard Haberman, Mathematical Models: Mechanical Vibrations, Population Dynamics,
and Traffic Flow
Peter W. M. John, Statistical Design and Analysis of Experiments
Tamer Başar and Geert Jan Olsder, Dynamic Noncooperative Game Theory, Second Edition
Emanuel Parzen, Stochastic Processes
Petar Kokotovic, Hassan K. Khalil, and John O'Reilly, Singular Perturbation Methods
in Control: Analysis and Design
Jean Dickinson Gibbons, Ingram Olkin, and Milton Sobel, Selecting and Ordering Populations:
A New Statistical Methodology
James A. Murdock, Perturbations: Theory and Methods
Ivar Ekeland and Roger Temam, Convex Analysis and Variational Problems
Ivar Stakgold, Boundary Value Problems of Mathematical Physics, Volumes I and II
J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables
David Kinderlehrer and Guido Stampacchia, An Introduction to Variational Inequalities and
Their Applications
F. Natterer, The Mathematics of Computerized Tomography
Avinash C. Kak and Malcolm Slaney, Principles of Computerized Tomographic Imaging
R. Wong, Asymptotic Approximations of Integrals
O. Axelsson and V. A. Barker, Finite Element Solution of Boundary Value Problems: Theory
and Computation
David R. Brillinger, Time Series: Data Analysis and Theory
Joel N. Franklin, Methods of Mathematical Economics: Linear and Nonlinear Programming,
Fixed-Point Theorems
Philip Hartman, Ordinary Differential Equations, Second Edition
Michael D. Intriligator, Mathematical Optimization and Economic Theory
Philippe G. Ciarlet, The Finite Element Method for Elliptic Problems
Jane K. Cullum and Ralph A. Willoughby, Lanczos Algorithms for Large Symmetric Eigenvalue
Computations, Vol. I: Theory
M. Vidyasagar, Nonlinear Systems Analysis, Second Edition
Robert Mattheij and Jaap Molenaar, Ordinary Differential Equations in Theory and Practice
Shanti S. Gupta and S. Panchapakesan, Multiple Decision Procedures: Theory and Methodology
of Selecting and Ranking Populations
Eugene L. Allgower and Kurt Georg, Introduction to Numerical Continuation Methods
Leah Edelstein-Keshet, Mathematical Models in Biology
Heinz-Otto Kreiss and Jens Lorenz, Initial-Boundary Value Problems and the Navier-Stokes Equations
J. L. Hodges, Jr. and E. L. Lehmann, Basic Concepts of Probability and Statistics, Second Edition
George F. Carrier, Max Krook, and Carl E. Pearson, Functions of a Complex Variable: Theory
and Technique
Friedrich Pukelsheim, Optimal Design of Experiments
Israel Gohberg, Peter Lancaster, and Leiba Rodman, Invariant Subspaces of Matrices with
Applications

Invariant Subspaces of Matrices with Applications

Israel Gohberg
Tel-Aviv University
Ramat-Aviv, Israel

Peter Lancaster
University of Calgary
Calgary, Alberta, Canada

Leiba Rodman
College of William & Mary
Williamsburg, Virginia

siam.
Society for Industrial and Applied Mathematics
Philadelphia
Copyright © 2006 by the Society for Industrial and Applied Mathematics

This SIAM edition is an unabridged republication of the work first published by John
Wiley & Sons, Inc., New York, 1986.

10 9 8 7 6 5 4 3 2 1

All rights reserved. Printed in the United States of America. No part of this book
may be reproduced, stored, or transmitted in any manner without the written
permission of the publisher. For information, write to the Society for Industrial and
Applied Mathematics, 3600 University City Science Center, Philadelphia, PA
19104-2688.

Library of Congress Cataloging in Publication Data

Gohberg, I. (Israel), 1928-
Invariant subspaces of matrices with applications / Israel Gohberg, Peter
Lancaster, Leiba Rodman.
p. cm. — (Classics in applied mathematics ; 51)
Originally published: New York : Wiley, c1986, in series: Canadian Mathematical
Society series of monographs and advanced texts.
Includes bibliographical references and indexes.
ISBN 0-89871-608-X (pbk.)
1. Invariant subspaces. 2. Matrices. I. Lancaster, Peter, 1929-. II. Rodman, L.
III. Title. IV. Series.

QA322.G649 2006
515'.73-dc22
2006042260

siam is a registered trademark.


To our wives,
Bella, Diane, and Ella
Contents

Introduction 1

Part One Fundamental Properties of Invariant Subspaces and Applications 3

Chapter One Invariant Subspaces: Definition, Examples, and First Properties 5

1.1 Definition and Examples 5
1.2 Eigenvalues and Eigenvectors 10
1.3 Jordan Chains 12
1.4 Invariant Subspaces and Basic Operations on Linear
Transformations 16
1.5 Invariant Subspaces and Projectors 20
1.6 Angular Transformations and Matrix Quadratic
Equations 25
1.7 Transformations in Factor Spaces 28
1.8 The Lattice of Invariant Subspaces 31
1.9 Triangular Matrices and Complete Chains of Invariant
Subspaces 37
1.10 Exercises 40

Chapter Two Jordan Form and Invariant Subspaces 45


2.1 Root Subspaces 45
2.2 The Jordan Form and Partial Multiplicities 52
2.3 Proof of the Jordan Form 58
2.4 Spectral Subspaces 60
2.5 Irreducible Invariant Subspaces and Unicellular
Transformations 65
2.6 Generators of Invariant Subspaces 69
2.7 Maximal Invariant Subspace in a Given Subspace 72
2.8 Minimal Invariant Subspace over a Given Subspace 78
2.9 Marked Invariant Subspaces 83


2.10 Functions of Transformations 85


2.11 Partial Multiplicities and Invariant Subspaces of
Functions of Transformations 92
2.12 Exercises 95

Chapter Three Coinvariant and Semiinvariant Subspaces 105


3.1 Coinvariant Subspaces 105
3.2 Reducing Subspaces 109
3.3 Semiinvariant Subspaces 112
3.4 Special Classes of Transformations 116
3.5 Exercises 119

Chapter Four Jordan Form for Extensions and Completions 121


4.1 Extensions from an Invariant Subspace 121
4.2 Completions from a Pair of Invariant and Coinvariant
Subspaces 128
4.3 The Sigal Inequalities 133
4.4 Special Case of Completions 136
4.5 Exercises 142

Chapter Five Applications to Matrix Polynomials 144


5.1 Linearizations, Standard Triples, and Representations of
Monic Matrix Polynomials 144
5.2 Multiplication of Monic Matrix Polynomials and Partial
Multiplicities of a Product 153
5.3 Divisibility of Monic Matrix Polynomials 156
5.4 Proof of Theorem 5.3.2 161
5.5 Example 167
5.6 Factorization into Several Factors and Chains of
Invariant Subspaces 171
5.7 Differential Equations 175
5.8 Difference Equations 180
5.9 Exercises 183

Chapter Six Invariant Subspaces for Transformations Between Different Spaces 189
6.1 [A B]-Invariant Subspaces 189
6.2 Block Similarity 192
6.3 Analysis of the Brunovsky Canonical Form 197
6.4 Description of [A B]-Invariant Subspaces 200
6.5 The Spectral Assignment Problem 203
6.6 Some Dual Concepts 207
6.7 Exercises 209

Chapter Seven Rational Matrix Functions 212


7.1 Realizations of Rational Matrix Functions 212
7.2 Partial Multiplicities and Multiplication 218
7.3 Minimal Factorization of Rational Matrix Functions 225
7.4 Example 230
7.5 Minimal Factorizations into Several Factors and Chains
of Invariant Subspaces 234
7.6 Linear Fractional Transformations 238
7.7 Linear Fractional Decompositions and Invariant
Subspaces of Nonsquare Matrices 244
7.8 Linear Fractional Decompositions:
Further Deductions 251
7.9 Exercises 255

Chapter Eight Linear Systems 262


8.1 Reductions, Dilations, and Transfer Functions 262
8.2 Minimal Linear Systems: Controllability and
Observability 265
8.3 Cascade Connections of Linear Systems 270
8.4 The Disturbance Decoupling Problem 274
8.5 The Output Stabilization Problem 279
8.6 Exercises 285
Notes to Part 1. 290

Part Two Algebraic Properties of Invariant Subspaces 293

Chapter Nine Commuting Matrices and Hyperinvariant Subspaces 295


9.1 Commuting Matrices 295
9.2 Common Invariant Subspaces for Commuting
Matrices 301
9.3 Common Invariant Subspaces for Matrices with Rank 1
Commutators 303
9.4 Hyperinvariant Subspaces 305
9.5 Proof of Theorem 9.4.2 307
9.6 Further Properties of Hyperinvariant Subspaces 311
9.7 Exercises 313

Chapter Ten Description of Invariant Subspaces and Linear Transformations with the Same Invariant Subspaces 316
10.1 Description of Irreducible Subspaces 316
10.2 Transformations Having the Same Set of Invariant
Subspaces 323

10.3 Proof of Theorem 10.2.1 328


10.4 Exercises 338

Chapter Eleven Algebras of Matrices and Invariant Subspaces 339


11.1 Finite-Dimensional Algebras 339
11.2 Chains of Invariant Subspaces 340
11.3 Proof of Theorem 11.2.1 343
11.4 Reflexive Lattices 346
11.5 Reductive and Self-Adjoint Algebras 350
11.6 Exercises 355

Chapter Twelve Real Linear Transformations 359


12.1 Definition, Examples, and First Properties of Invariant
Subspaces 359
12.2 Root Subspaces and the Real Jordan Form 363
12.3 Complexification and Proof of the Real Jordan
Form 366
12.4 Commuting Matrices 371
12.5 Hyperinvariant Subspaces 374
12.6 Real Transformations with the Same Invariant
Subspaces 378
12.7 Exercises 380
Notes to Part 2. 384

Part Three Topological Properties of Invariant Subspaces and Stability 385

Chapter Thirteen The Metric Space of Subspaces 387


13.1 The Gap Between Subspaces 387
13.2 The Minimal Angle and the Spherical Gap 392
13.3 Minimal Opening and Angular Linear
Transformations 396
13.4 The Metric Space of Subspaces 400
13.5 Kernels and Images of Linear Transformations 406
13.6 Continuous Families of Subspaces 408
13.7 Applications to Generalized Inverses 411
13.8 Subspaces of Normed Spaces 415
13.9 Exercises 420

Chapter Fourteen The Metric Space of Invariant Subspaces 423


14.1 Connected Components: The Case of One
Eigenvalue 423
14.2 Connected Components: The General Case 426
14.3 Isolated Invariant Subspaces 428
14.4 Reducing Invariant Subspaces 432
14.5 Coinvariant and Semiinvariant Subspaces 437
14.6 The Real Case 439
14.7 Exercises 443

Chapter Fifteen Continuity and Stability of Invariant Subspaces 444


15.1 Sequences of Invariant Subspaces 444
15.2 Stable Invariant Subspaces: The Main Result 447
15.3 Proof of Theorem 15.2.1 in the General Case 451
15.4 Perturbed Stable Invariant Subspaces 455
15.5 Lipschitz Stable Invariant Subspaces 459
15.6 Stability of Lattices of Invariant Subspaces 463
15.7 Stability in Metric of the Lattice of Invariant
Subspaces 464
15.8 Stability of [A B]-Invariant Subspaces 468
15.9 Stable Invariant Subspaces for Real
Transformations 470
15.10 Partial Multiplicities of Close Linear
Transformations 475
15.11 Exercises 479

Chapter Sixteen Perturbations of Lattices of Invariant Subspaces with Restrictions on the Jordan Structure 482
16.1 Preservation of Jordan Structure and Isomorphism of
Lattices 482
16.2 Properties of Linear Isomorphisms of Lattices:
The Case of Similar Transformations 486
16.3 Distance Between Invariant Subspaces for
Transformations with the Same Jordan Structure 492
16.4 Transformations with the Same Derogatory Jordan
Structure 497
16.5 Proofs of Theorems 16.4.1 and 16.4.4 500
16.6 Distance between Invariant Subspaces for
Transformations with Different Jordan Structures 507
16.7 Conjectures 510
16.8 Exercises 513

Chapter Seventeen Applications 514


17.1 Stable Factorizations of Matrix Polynomials:
Preliminaries 514
17.2 Stable Factorizations of Matrix Polynomials:
Main Results 520
17.3 Lipschitz Stable Factorizations of Monic Matrix
Polynomials 525
17.4 Stable Minimal Factorizations of Rational Matrix
Functions: The Main Result 528
17.5 Proof of the Auxiliary Lemmas 532
17.6 Stable Minimal Factorizations of Rational Matrix
Functions: Further Deductions 537
17.7 Stability of Linear Fractional Decompositions of
Rational Matrix Functions 540
17.8 Isolated Solutions of Matrix Quadratic Equations 545
17.9 Stability of Solutions of Matrix Quadratic
Equations 551
17.10 The Real Case 553
17.11 Exercises 557
Notes to Part 3. 561

Part Four Analytic Properties of Invariant Subspaces 563


Chapter Eighteen Analytic Families of Subspaces 565
18.1 Definition and Examples 565
18.2 Kernel and Image of Analytic Families of
Transformations 569
18.3 Global Properties of Analytic Families of
Subspaces 575
18.4 Proof of Theorem 18.3.1 (Compact Sets) 578
18.5 Proof of Theorem 18.3.1 (General Case) 584
18.6 Direct Complements for Analytic Families of
Subspaces 590
18.7 Analytic Families of Invariant Subspaces 594
18.8 Analytic Dependence of the Set of Invariant Subspaces
and Fixed Jordan Structure 596
18.9 Analytic Dependence on a Real Variable 599
18.10 Exercises 601

Chapter Nineteen Jordan Form of Analytic Matrix Functions 604


19.1 Local Behaviour of Eigenvalues and Eigenvectors 604
19.2 Global Behaviour of Eigenvalues and Eigenvectors 607

19.3 Proof of Theorem 19.2.3 613


19.4 Analytic Extendability of Invariant Subspaces 616
19.5 Analytic Matrix Functions of a Real Variable 620
19.6 Exercises 622

Chapter Twenty Applications 624


20.1 Factorization of Monic Matrix Polynomials 624
20.2 Rational Matrix Functions Depending Analytically on a
Parameter 627
20.3 Minimal Factorizations of Rational Matrix
Functions 634
20.4 Matrix Quadratic Equations 639
20.5 Exercises 642
Notes to Part 4. 645

Appendix. Equivalence of Matrix Polynomials 646


A.1 The Smith Form: Existence 646
A.2 The Smith Form: Uniqueness 651
A.3 Invariant Polynomials, Elementary Divisors, and Partial
Multiplicities 654
A.4 Equivalence of Linear Matrix Polynomials 659
A.5 Strict Equivalence of Linear Matrix Polynomials:
Regular Case 662
A.6 The Reduction Theorem for Singular Polynomials 666
A.7 Minimal Indices and Strict Equivalence of Linear Matrix
Polynomials (General Case) 672
A.8 Notes to the Appendix 678

List of Notations and Conventions 679


References 683
Author Index 687
Subject Index 689
Preface to the SIAM Classics Edition

In the past 50 or 60 years, developments in mathematics have led to innovations in linear algebra and matrix theory. This progress was often initiated
by topics and problems from applied mathematics. A good example of this
is the development of mathematical systems theory. In particular, many new
and important results in linear algebra cannot even be formulated without the
notion of invariant subspaces of matrices or linear transformations. In view of
this, the authors set out to write a work on advanced linear algebra in which
invariant subspaces of matrices would be the central notion, the main sub-
ject of research, and the main tool. In other words, matrix theory was to be
presented entirely on the basis of the theory of invariant subspaces, including
the algebraic, geometric, topological, and analytic aspects of the theory. We
believed that this would give a new point of view and a better understanding
of the entire subject. It would also allow us to follow up systematically the
central role of invariant subspaces in linear algebra and matrix analysis, as
well as their role in the study of differential and difference equations, systems
theory, matrix polynomials, rational matrix functions, and algebraic Riccati
equations.
The first edition of the present book was the result. To the authors' knowl-
edge it is the only book in existence with these aims. The first parts of the
book have the character of a textbook easily accessible for undergraduate stu-
dents. As the development progresses, the exposition changes to approach the
style and content of a graduate textbook and even a research monograph until,
in the last part, recent achievements are presented. The fundamental char-
acter of the mathematics, its accessibility, and its importance in applications
makes this a widely useful book for experts and for students in mathematics,
sciences, and engineering.
The first edition sold out in early 2005, and we could not help colleagues
who found a need for it. We are grateful to Wiley-Interscience publications
for producing the first edition and for returning the copyright to us in order to
give the work a new life. We are especially thankful to SIAM for the decision
to include this work in their series Classics in Applied Mathematics.
We would like to mention some other literature with strong connections
to this book. First, there are two other relevant monographs by the present
authors: Matrix Polynomials, published by Academic Press in 1982, and Ma-
trices and Indefinite Scalar Products, published by Birkhauser Verlag in 1983.
Invariant subspaces play an important role in both of them. In fact, work on
these two books convinced us of the need for the present systematic treat-
ment. The monograph of I. Gohberg, M. A. Kaashoek, and F. van Schagen,

Partially Specified Matrices and Operators: Classification, Completion, Applications, Birkhauser Verlag, 1995, is recommended as additional reading for
Chapter 4. A later, comprehensive account of the theory of algebraic Riccati
equations, discussed in Chapters 17 and 20, can be found in the monograph
Algebraic Riccati Equations by P. Lancaster and L. Rodman, published by
Oxford University Press in 1995.
By the end of 2005 Birkhauser Verlag will also publish the authors' In-
definite Linear Algebra. This can also be recommended as a book in which
invariant subspaces play an important role.
It is a pleasure to repeat the acknowledgments appearing in the first edi-
tion. These include support from the Killam Foundation of Canada and the
Nathan and Lily Silver Chair on Mathematical Analysis and Operator The-
ory of Tel Aviv University. Continuing support was also provided by staff
at the School of Mathematical Sciences of Tel Aviv University and at the
Department of Mathematics and Statistics of the University of Calgary. In
particular, Jacqueline Gorsky in Tel Aviv and Pat Dalgetty in Calgary con-
tributed with speedy and skillful development of the first typescript. Support
from national organizations is also acknowledged: the Basic Research Fund
of the Israel Academy of Science, the U.S. National Science Foundation, and
the Natural Sciences and Engineering Research Council of Canada.
COMMENTS ON THE DEVELOPMENTS OF TWENTY YEARS
Twenty years have passed since the appearance of the first edition. Naturally, in this time advances have been made on some of the theory appearing
in the first edition, advances which have appeared in specialized journals and
books. Also, the status of some conjectures made in the first edition has
been clarified. Here, several developments of this kind are summarized for the
interested reader, together with a short bibliography.
1. Chapter 2. A characterization of matrices all of whose invariant subspaces are marked is given in [1].
2. Chapter 4. The problem of describing the Jordan forms of completions
from an invariant and a coinvariant subspace, also known as the Carlson
problem, has been solved (in terms of Littlewood-Richardson sequences). As
it turns out, it is closely related to the problem of describing the range of the
eigenvalues of A + B in terms of the eigenvalues of Hermitian matrices A and
B, solved by Klyachko [5]. See the expository paper [2] and references there.

3. Chapter 9. Various results on the existence of complete chains of invariant subspaces that extend Theorem 9.3.1 are presented in [8] (see also references there). We quote Radjavi's theorem [7]: A collection $\mathcal{S}$ of $n \times n$ complex matrices has a complete chain of common invariant subspaces if and only if the trace is permutable: $\operatorname{trace}(A_1 \cdots A_p) = \operatorname{trace}(A_{\sigma(1)} \cdots A_{\sigma(p)})$ for every $p$-tuple $A_1, \ldots, A_p$, $A_j \in \mathcal{S}$, and every permutation $\sigma$ of $\{1, 2, \ldots, p\}$.

4. Chapter 11. A simple proof of Burnside's theorem (Theorem 11.2.1 in the text) is given in [6].
Conjecture 11.2.3 was disproved in [3] (for all $n > 1$ except 7 and 11) and in [10] (for $n = 7$ and $n = 11$). It is certainly of interest to describe all pairs of complementary algebras $V_1$ and $V_2$ for which this conjecture is correct. In [3] it was proved that the conjecture is valid if the complementary algebras $V_1$ and $V_2$ are orthogonal.
5. Chapter 15. The past twenty years have seen the development of
a substantial literature concerning stability (in various senses) of invariant
subspaces of matrices, as well as of linear operators acting in an infinite-
dimensional Hilbert space. For much of this material and its applications in
the context of finite-dimensional spaces, we refer the reader to the expository
paper [9] and references there.
6. Chapter 16. Conjecture 16.7.1 is false in general. A counterexample
is given in [4]. The conjecture holds when A is nonderogatory (however, the
proof given on page 512 is erroneous, as pointed out in [4]) and when A is
diagonable. These results were established in [4] as well. An interesting open
question concerns the characterization of those Jordan structures for which
Conjecture 16.7.1 fails.

References
[1] R. Bru, L. Rodman, and H. Schneider, "Extensions of Jordan bases for
invariant subspaces of a matrix," Linear Algebra Appl. 150, 209-225
(1991).
[2] W. Fulton, "Eigenvalues, invariant factors, highest weights, and Schubert
calculus," Bull. Amer. Math. Soc. 37, 209-249 (2000).
[3] M. D. Choi, H. Radjavi and P. Rosenthal, "On complementary matrix
algebras," Integral Equations and Operator Theory 13, 165-174 (1990).
[4] J. Hartman, "On a conjecture of Gohberg and Rodman," Linear Algebra
Appl. 140, 267-278 (1990).
[5] A. A. Klyachko, "Stable bundles, representation theory and Hermitian
operators," Selecta Math. 4, 419-445 (1998).
[6] V. Lomonosov and P. Rosenthal, "The simplest proof of Burnside's the-
orem on matrix algebras," Linear Algebra Appl. 383, 45-47 (2004).
[7] H. Radjavi, "A trace condition equivalent to simultaneous triangulariz-
ability," Canad. J. Math. 38, 376-386 (1986).
[8] H. Radjavi and P. Rosenthal, Simultaneous Triangularization, Springer
Verlag, New York, 2001.

[9] A. C. M. Ran and L. Rodman, "A class of robustness problems in matrix analysis," Operator Theory: Advances and Applications 134, 337-389 (2002).

[10] T. Yoshino, "Supplemental examples: 'On complementary matrix algebras,'" Integral Equations and Operator Theory 14, 764-766 (1991).
Corrections

Page 123, line 13: For [I 0] read [0 I].
Page 137, line 3: For nondecreasing read nonincreasing.
Page 137, line 6 up: For Theorem 4.4.1 read Theorem 4.1.4.
Page 137, line 5 up: For Proposition 4.1.1 read Proposition 4.4.1.
Page 140, lines 8 and 9 up: Reverse the order of vectors in these chains.
Page 145, line 14: For L9λ) read L(λ).
Page 146, line 1 up: For n × nl read nl × n.
Page 196, line 6: For $F^{N-1}$ read $F^N$.
Page 197, line 5 up: For $\mathbb{C}^{m+n}$ read $\mathbb{C}^{m+n} \to \mathbb{C}^n$.
Page 214, line 6 up: Reverse the positions of B and C. Also B and C.
Page 221, line 11: For $x_j - 1$ read $x_{j-1}$.
Page 223, line 10: For $(\lambda I - A_1)$ read $(\lambda I - A_1)^{-1}$.
Page 225, line 4 up: For $W(\lambda)^{-1}$ read $W(\lambda)$ and replace $-C$ by $C$.
Page 360, line 11: In the bottom row of the matrix replace $r$ by $-r$.
Page 673, line 2: For $k_1$ read $k$.
Page 687, line 8 up: For "Mardsen" read "Marsden."
Introduction

Invariant subspaces are a central notion of linear algebra. However, in existing texts and expositions the notion is not easily or systematically
followed. Perhaps because the whole structure is very rich, the treatment
becomes fragmented as other related ideas and notions intervene. In
particular, the notion of an invariant subspace as an entity is often lost in the
discussion of eigenvalues, eigenvectors, generalized eigenvectors, and so on.
The importance of invariant subspaces becomes clearer in the context of
operator theory on spaces of infinite dimension. Here, it can be argued that
the structure is poorer and this is one of the few available tools for the study
of many classes of operators. Probably for this reason, the first books on
invariant subspaces appeared in the framework of infinite-dimensional
spaces. It seems to the authors that now there is a case for developing a
treatment of linear algebra in which the central role of invariant subspaces is
systematically followed up.
The need for such a treatment has become more apparent in recent years
because of developments in different fields of application and especially in
linear systems theory, where concepts such as controllability, feedback,
factorization, and realization of matrix functions are commonplace. In the
treatment of such problems new concepts and theories have been developed
that form complete new chapters in the body of linear algebra. As examples
of new concepts of linear algebra developed to meet the needs of systems
theory, we should mention invariant subspaces for nonsquare matrices and
similarity of such matrices.
In this book the reader will find a treatment of certain aspects of linear
algebra that meets the two objectives: to develop systematically the central
role of invariant subspaces in the analysis of linear transformations and to
include relevant recent developments of linear algebra stimulated by linear
systems theory. The latter are not dealt with separately, but are integrated
into the text in a way that is natural in the development of the mathematical
structure.


The first part of the book, taken alone or together with selections from
the other parts, can be used as a text for undergraduate courses in
mathematics, having only a first course in linear algebra as prerequisite. At
the same time, the book will be of interest to graduate students in science
and engineering. We trust that experts will also find the exposition and new
results interesting. The authors anticipate that the book will also serve as a
valuable reference work for mathematicians, scientists, and engineers. A set
of exercises is included in each chapter. In general, they are designed to
provide illustrations and training rather than extensions of the theory.
The first part of the book is devoted mainly to geometric properties of
invariant subspaces and their applications in three fields. The fields in
question are matrix polynomials, rational matrix functions, and linear
systems theory. They are each presented in self-contained form, and—rather
than being exhaustive—the focus is on those problems in which invariant
subspaces of square and nonsquare matrices play a central role. These
problems include factorization and linear fractional decompositions for mat-
rix functions; problems of realization for rational matrix functions; and the
problem of describing connections, or cascades, of linear systems, pole
assignment, output stabilization, and disturbance decoupling.
The second part is of a more algebraic character in which other properties
of invariant subspaces are analyzed. It contains an analysis of the extent to
which the invariant subspaces determine the parent matrix, invariant sub-
spaces common to commuting matrices, and lattices of subspaces for a single
matrix and for algebras of matrices.
The numerical computation of invariant subspaces is a difficult task as, in
general, it makes sense to compute only those invariant subspaces that
change very little after small changes in the transformation. Thus it is
important to have appropriate notions of "stable" invariant subspaces. Such
an analysis of the stability of invariant subspaces and their generalizations is
the main subject of Part 3. This analysis leads to applications in some of the
problem areas mentioned above.
The subject of Part 4 is analytic families of invariant subspaces and has
many useful applications. Here, the analysis is influenced by the theory of
complex vector bundles, although we do not make use of this theory. The
study of the connections between local and global problems is one of the
main problems studied in this part. Within reasonable bounds, Part 4 relies
only on the theory developed in this book. The material presented here
appears for the first time in a book on linear algebra and is thereby made
accessible to a wider audience.
Part One

Fundamental Properties of Invariant Subspaces and Applications
Part 1 of this work comprises almost half of the entire book. It includes what
can be described as a self-contained course in linear algebra with emphasis
on invariant subspaces, together with substantial developments of applica-
tions to the theory of polynomial and rational matrix-valued functions, and
to systems theory. These applications demand extensions of the standard
material in linear algebra that are included in our treatment in a natural
way. They also serve to breathe new life into an otherwise familiar body of
knowledge. Thus there is a considerable amount of material here (including
all of Chapters 3, 4, and 6) that cannot be found in other books on linear
algebra.
Almost all of the material in this part can be understood by readers who
have completed a beginning course in linear algebra, although there are
places where basic ideas of calculus and complex analysis are required.

Chapter One

Invariant Subspaces: Definition, Examples, and First Properties

This chapter is mainly introductory. It contains the simplest properties of invariant subspaces of a linear transformation. Some basic tools (projectors,
factor spaces, angular transformations, triangular forms) for the study of
invariant subspaces are developed. We also study the behaviour of invariant
subspaces of a transformation when the operations of similarity and taking
adjoints are applied to the transformation. The lattice of invariant sub-
spaces of a linear transformation—a notion that will be important in the
sequel—is introduced. The presentation of the material here is elementary
and does not even require use of the Jordan form.

1.1 DEFINITION AND EXAMPLES

Let A: <p"—» (p" be a linear transformation. A subspace M C <p" is called


invariant for the transformation A, or A invariant, if Ax & M for every
vector x G M. In other words, M is invariant for A means that the image of
M under A is contained in M\ AM C. M. Trivial examples of invariant
subspaces are {0} and <p". Less trivial examples are the subspaces

and

Indeed, as Ax = 0G Ker A for every x G Ker A, the subspace Ker A is A


invariant. Also, for every x G (p", the vector Ax belongs to Im^4; in
particular, A(lm A) C Im A, and Im A is A invariant.
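As a small concrete illustration (the particular $2 \times 2$ matrix below is an arbitrary choice, not part of the development), take

$$A = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}$$

Then $\operatorname{Ker} A = \operatorname{Span}\{\langle 1, -1 \rangle\}$ and $\operatorname{Im} A = \operatorname{Span}\{\langle 1, 0 \rangle\}$, and indeed $A\langle 1, -1 \rangle = 0 \in \operatorname{Ker} A$ while $A\langle 1, 0 \rangle = \langle 1, 0 \rangle \in \operatorname{Im} A$, so both subspaces are $A$ invariant.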


More generally, the subspaces

$$\operatorname{Ker} A^m = \{x \in \mathbb{C}^n \mid A^m x = 0\}$$

and

$$\operatorname{Im} A^m = \{A^m x \mid x \in \mathbb{C}^n\}$$

are $A$ invariant. To verify this, let $x \in \operatorname{Ker} A^m$, so $A^m x = 0$. Then $A^m(Ax) = A(A^m x) = 0$, that is, $Ax \in \operatorname{Ker} A^m$. This means that $\operatorname{Ker} A^m$ is $A$ invariant. Further, let $x \in \operatorname{Im} A^m$, so $x = A^m y$ for some $y \in \mathbb{C}^n$. Then $Ax = A(A^m y) = A^m(Ay)$, which implies that $Ax \in \operatorname{Im} A^m$. So $\operatorname{Im} A^m$ is $A$ invariant as well.

When convenient, we shall often assume implicitly that a linear transformation from $\mathbb{C}^m$ into $\mathbb{C}^n$ is given by an $n \times m$ matrix with respect to the standard orthonormal bases $e_1 = \langle 1, 0, \ldots, 0 \rangle$, $e_2 = \langle 0, 1, 0, \ldots, 0 \rangle, \ldots, e_m = \langle 0, 0, \ldots, 0, 1 \rangle$ in $\mathbb{C}^m$ and $e_1, \ldots, e_n$ in $\mathbb{C}^n$.

The following three examples of transformations and their invariant subspaces are basic and are often used in the sequel.

EXAMPLE 1.1.1. Let

$$A = \begin{bmatrix} \lambda_0 & 1 & 0 & \cdots & 0 \\ 0 & \lambda_0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_0 & 1 \\ 0 & 0 & \cdots & 0 & \lambda_0 \end{bmatrix}$$

(the $n \times n$ Jordan block with $\lambda_0$ on the main diagonal). Every nonzero $A$-invariant subspace is of the form $\operatorname{Span}\{e_1, \ldots, e_k\}$, where $e_i$ is the vector $\langle 0, \ldots, 0, 1, 0, \ldots, 0 \rangle$ with 1 in the $i$th place. Indeed, let $\mathcal{M}$ be a nonzero $A$-invariant subspace, and let

$$x = \sum_{i=1}^n a_i e_i$$

be a vector from $\mathcal{M}$ for which the index $k = \max\{m \mid 1 \le m \le n,\ a_m \neq 0\}$ is maximal. Then clearly

$$\mathcal{M} \subseteq \operatorname{Span}\{e_1, \ldots, e_k\}$$

On the other hand, the vector $x = \sum_{i=1}^k a_i e_i$, $a_k \neq 0$, belongs to $\mathcal{M}$. Hence, since $\mathcal{M}$ is $A$ invariant, the vectors

$$(A - \lambda_0 I)x = \sum_{i=2}^k a_i e_{i-1}, \quad (A - \lambda_0 I)^2 x = \sum_{i=3}^k a_i e_{i-2}, \quad \ldots, \quad (A - \lambda_0 I)^{k-1} x = a_k e_1$$

also belong to $\mathcal{M}$. Hence the vectors

$$e_1, \quad e_2, \quad \ldots, \quad e_k$$

(obtained from these by taking suitable linear combinations, starting with $e_1 = a_k^{-1}(A - \lambda_0 I)^{k-1}x$) belong to $\mathcal{M}$ as well. So

$$\operatorname{Span}\{e_1, \ldots, e_k\} \subseteq \mathcal{M}$$

and the equality

$$\mathcal{M} = \operatorname{Span}\{e_1, \ldots, e_k\}$$

follows. As for every $y = \sum_{i=1}^k b_i e_i$ we have

$$Ay = \sum_{i=1}^k (\lambda_0 b_i + b_{i+1})e_i \in \operatorname{Span}\{e_1, \ldots, e_k\} \qquad (b_{k+1} = 0)$$

the subspace $\operatorname{Span}\{e_1, \ldots, e_k\}$ is indeed $A$ invariant. The total number of $A$-invariant subspaces (including $\{0\}$ and $\mathbb{C}^n$) is thus $n + 1$.

In this example we have

$$\operatorname{Ker} A = \{0\}, \qquad \operatorname{Im} A = \mathbb{C}^n \qquad \text{if } \lambda_0 \neq 0$$

and

$$\operatorname{Ker} A = \operatorname{Span}\{e_1\}, \qquad \operatorname{Im} A = \operatorname{Span}\{e_1, \ldots, e_{n-1}\} \qquad \text{if } \lambda_0 = 0$$

As expected, these subspaces are $A$ invariant. □
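The structure of this example is easy to verify numerically. The following sketch (illustrative only; it uses NumPy, and the size $n = 4$ and eigenvalue $\lambda_0 = 2$ are arbitrary choices) checks that the coordinate subspaces $\operatorname{Span}\{e_1, \ldots, e_k\}$ are invariant and that no other coordinate subspace is:

```python
import numpy as np
from itertools import combinations

n, lam = 4, 2.0
A = lam * np.eye(n) + np.diag(np.ones(n - 1), 1)   # the Jordan block J_4(2)
E = np.eye(n)

def is_invariant(A, M, tol=1e-10):
    # Span of the columns of M is A-invariant iff appending the columns
    # of A @ M does not increase the rank.
    return np.linalg.matrix_rank(np.hstack([M, A @ M]), tol) == \
           np.linalg.matrix_rank(M, tol)

for k in range(1, n + 1):                          # Span{e_1,...,e_k}: invariant
    assert is_invariant(A, E[:, :k])

for r in range(1, n):                              # other coordinate spans fail
    for idx in combinations(range(n), r):
        if idx != tuple(range(r)):
            assert not is_invariant(A, E[:, list(idx)])
print("checked: exactly", n + 1, "invariant subspaces, counting {0} and C^n")
```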

EXAMPLE 1.1.2. Let $A = \lambda_0 I$, where $I$ is the $n \times n$ identity matrix. Clearly, every subspace in $\mathbb{C}^n$ is $A$ invariant. Here the number of $A$-invariant subspaces is infinite (if $n > 1$).

Note that the set $\operatorname{Inv}(A)$ of all $A$-invariant subspaces is uncountably infinite. Indeed, for linearly independent vectors $x, y \in \mathbb{C}^n$ the one-dimensional subspaces $\operatorname{Span}\{x + \alpha y\}$, $\alpha \in \mathbb{C}$, are all different and belong to $\operatorname{Inv}(A)$. So they form an uncountable set of $A$-invariant subspaces.

Conversely, if every one-dimensional subspace of $\mathbb{C}^n$ is $A$ invariant for a linear transformation $A$, then $A = \lambda_0 I$ for some $\lambda_0$. Indeed, for every $x \neq 0$ the subspace $\operatorname{Span}\{x\}$ is $A$ invariant, so $Ax = \lambda(x)x$, where $\lambda(x)$ is a complex number that may, a priori, depend on $x$. Now if $\lambda(x_1) \neq \lambda(x_2)$ for linearly independent vectors $x_1$ and $x_2$, then $\operatorname{Span}\{x_1 + x_2\}$ is not $A$ invariant, because

$$A(x_1 + x_2) = \lambda(x_1)x_1 + \lambda(x_2)x_2$$

is not a scalar multiple of $x_1 + x_2$. Hence we must have that $\lambda_0 = \lambda(x)$ is independent of $x \neq 0$, so actually $A = \lambda_0 I$. □

Later (see Proposition 2.5.4) we shall see that the set of all invariant subspaces of an $n \times n$ complex matrix $A$ is never countably infinite; it is either finite or uncountably infinite.

EXAMPLE 1.1.3. Let

$$A = \operatorname{diag}[\lambda_1, \lambda_2, \ldots, \lambda_n]$$

where the complex numbers $\lambda_1, \ldots, \lambda_n$ are distinct. For any indices $1 \le i_1 < \cdots < i_k \le n$ the subspace $\operatorname{Span}\{e_{i_1}, \ldots, e_{i_k}\}$ is $A$ invariant. Indeed, for

$$x = \sum_{j=1}^k \alpha_j e_{i_j}$$

we have

$$Ax = \sum_{j=1}^k \alpha_j \lambda_{i_j} e_{i_j} \in \operatorname{Span}\{e_{i_1}, \ldots, e_{i_k}\}$$

It turns out that these are all the invariant subspaces for $A$. The proof of this fact for a general $n$ is given later in a more general framework. So the total number of $A$-invariant subspaces is $\sum_{k=0}^{n} \binom{n}{k} = 2^n$.

Here we shall check only that the $2 \times 2$ matrix

$$A = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \qquad (\lambda_1 \neq \lambda_2)$$

has exactly two nontrivial invariant subspaces, $\operatorname{Span}\{e_1\}$ and $\operatorname{Span}\{e_2\}$. Indeed, let $\mathcal{M}$ be any one-dimensional $A$-invariant subspace

$$\mathcal{M} = \operatorname{Span}\{x_1\}, \qquad x_1 = a_1 e_1 + a_2 e_2 \neq 0$$

Then $Ax_1 = a_1\lambda_1 e_1 + a_2\lambda_2 e_2$ should belong to $\mathcal{M}$ and thus is a scalar multiple of $x_1$:

$$a_1\lambda_1 e_1 + a_2\lambda_2 e_2 = \beta(a_1 e_1 + a_2 e_2)$$

for some $\beta \in \mathbb{C}$. Comparing coefficients, we see that we obtain a contradiction $\lambda_1 = \lambda_2$ unless $a_1 = 0$ or $a_2 = 0$. In the former case $\mathcal{M} = \operatorname{Span}\{e_2\}$ and in the latter case $\mathcal{M} = \operatorname{Span}\{e_1\}$.

In this example we have $\operatorname{Ker} A = \operatorname{Span}\{e_{i_0}\}$ (when $\det A = 0$), where $i_0$ is the index for which $\lambda_{i_0} = 0$ (as we have assumed that the $\lambda_i$ are distinct and $\det A = 0$, there is exactly one such index), and $\operatorname{Im} A = \operatorname{Span}\{e_i \mid i \neq i_0\}$. □

The following observation is often useful in proving that a given subspace is $A$ invariant: A subspace $\mathcal{M} = \operatorname{Span}\{x_1, \ldots, x_k\}$ is $A$ invariant if and only if $Ax_i \in \mathcal{M}$ for $i = 1, \ldots, k$. The proof of this fact is an easy exercise.
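In computational terms this observation amounts to a simple rank test; here is a minimal sketch (illustrative only, with an arbitrarily chosen matrix):

```python
import numpy as np

def is_invariant(A, X, tol=1e-10):
    # X holds spanning vectors x_1,...,x_k as columns.  By the observation
    # above, Span{x_1,...,x_k} is A-invariant iff every A x_i lies in the
    # span, i.e. iff appending the columns of A @ X does not raise the rank.
    return np.linalg.matrix_rank(np.hstack([X, A @ X]), tol) == \
           np.linalg.matrix_rank(X, tol)

A = np.array([[2., 1.],
              [0., 3.]])
print(is_invariant(A, np.array([[1.], [0.]])))   # True:  A e_1 = 2 e_1
print(is_invariant(A, np.array([[0.], [1.]])))   # False: A e_2 = e_1 + 3 e_2
```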
For a given transformation $A: \mathbb{C}^n \to \mathbb{C}^n$ and a given vector $x \in \mathbb{C}^n$, consider the subspace

$$\mathcal{M} = \operatorname{Span}\{x, Ax, A^2x, \ldots\}$$

We now appeal to the Cayley-Hamilton theorem, which states that

$$a_0 I + a_1 A + \cdots + a_{n-1}A^{n-1} + a_n A^n = 0$$

where the complex numbers $a_0, \ldots, a_n$ are the coefficients of the characteristic polynomial $\det(\lambda I - A)$ of $A$:

$$\det(\lambda I - A) = a_0 + a_1\lambda + \cdots + a_n\lambda^n$$

(By writing $A$ as an $n \times n$ matrix in some basis in $\mathbb{C}^n$, we easily see from the definition of the determinant that $\det(\lambda I - A)$ is a polynomial of degree $n$ with $a_n = 1$.) Hence $A^kx$ with $k \ge n$ is a linear combination of $x, Ax, \ldots, A^{n-1}x$, so actually

$$\mathcal{M} = \operatorname{Span}\{x, Ax, \ldots, A^{n-1}x\}$$

The preceding observation shows immediately that $\mathcal{M}$ is $A$ invariant. Any $A$-invariant subspace $\mathcal{L}$ that contains $x$ also contains all the vectors $Ax, A^2x, \ldots$, and hence contains $\mathcal{M}$. It follows that $\mathcal{M}$ is the smallest $A$-invariant subspace that contains the vector $x$.
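Numerically, this smallest invariant subspace can be read off from the powers $x, Ax, \ldots, A^{n-1}x$. A rough sketch (illustrative only; a rank-revealing factorization would be more robust than the plain QR used here):

```python
import numpy as np

def smallest_invariant_subspace(A, x, tol=1e-10):
    # Orthonormal basis of Span{x, Ax, ..., A^(n-1) x}; by Cayley-Hamilton
    # the higher powers A^k x (k >= n) add nothing to the span.
    n = A.shape[0]
    K = np.column_stack([np.linalg.matrix_power(A, j) @ x for j in range(n)])
    Q, R = np.linalg.qr(K)
    return Q[:, np.abs(np.diag(R)) > tol]

A = np.diag([1., 2., 3.])
M = smallest_invariant_subspace(A, np.array([1., 1., 0.]))
print(M.shape[1])   # 2: the smallest invariant subspace is Span{e_1, e_2}
```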
We conclude this section with another useful fact regarding invariant subspaces. Namely, a subspace $\mathcal{M} \subseteq \mathbb{C}^n$ is $A$ invariant for a transformation $A: \mathbb{C}^n \to \mathbb{C}^n$ if and only if it is $(\alpha A + \beta I)$ invariant, where $\alpha, \beta$ are arbitrary complex numbers such that $\alpha \neq 0$. Indeed, assume that $\mathcal{M}$ is $A$ invariant. Then for every $x \in \mathcal{M}$ we see that the vector

$$(\alpha A + \beta I)x = \alpha Ax + \beta x$$

belongs to $\mathcal{M}$. So $\mathcal{M}$ is $(\alpha A + \beta I)$ invariant. As

$$A = \alpha^{-1}(\alpha A + \beta I) - \alpha^{-1}\beta I$$

the same reasoning shows that any $(\alpha A + \beta I)$ invariant subspace is also $A$ invariant.

1.2 EIGENVALUES AND EIGENVECTORS

The most primitive nontrivial invariant subspaces are those with dimension equal to one. For a transformation $A: \mathbb{C}^n \to \mathbb{C}^n$ and some nonzero $x \in \mathbb{C}^n$, therefore, we consider an $A$-invariant subspace of the form $\mathcal{M} = \operatorname{Span}\{x\}$. In this case there must be a $\lambda_0 \in \mathbb{C}$ such that $Ax = \lambda_0 x$. Since we then have $A(\alpha x) = \alpha(Ax) = \lambda_0(\alpha x)$ for any $\alpha \in \mathbb{C}$, the number $\lambda_0$ does not depend on the choice of the nonzero vector in $\mathcal{M}$. We call $\lambda_0$ an eigenvalue of $A$, and, when $Ax = \lambda_0 x$ with $0 \neq x \in \mathbb{C}^n$, we call $x$ an eigenvector of $A$ (corresponding to the eigenvalue $\lambda_0$). Observe that, since $(\lambda_0 I - A)x = 0$, the eigenvalues of $A$ can also be characterized as the set of complex zeros of the characteristic polynomial of $A$: $\varphi_A(\lambda) = \det(\lambda I - A)$.

The set of all eigenvalues of $A$ is called the spectrum of $A$ and is denoted by $\sigma(A)$. We have seen that any one-dimensional $A$-invariant subspace is spanned by some eigenvector. Conversely, if $x_0$ is an eigenvector of $A$ corresponding to some eigenvalue $\lambda_0$, then $\operatorname{Span}\{x_0\}$ is $A$ invariant. (In other words, $A$ is the operator of multiplication by $\lambda_0$ when restricted to $\operatorname{Span}\{x_0\}$.)

Let us have a closer look at the eigenvalues. As the characteristic polynomial $\varphi_A(\lambda) = \det(\lambda I - A)$ is a polynomial of degree $n$, by the fundamental theorem of algebra, $\varphi_A(\lambda)$ has $n$ (in general, complex) zeros when counted with multiplicities. These zeros are exactly the eigenvalues of $A$. Since the characteristic polynomial and eigenvalues are independent of the choice of basis producing the matrix representation, they are properties of the underlying transformation. So a transformation $A: \mathbb{C}^n \to \mathbb{C}^n$ has exactly $n$ eigenvalues when counted with multiplicities, and, in any event, the


number of distinct eigenvalues of A does not exceed n. Note that this is a
property of transformations over the field of complex numbers (or, more
generally, over an algebraically closed field). As we shall see later, a
transformation from ft" into ft" does not always have (real) eigenvalues.
Since at least one eigenvector corresponds to any eigenvalue A0 of A it
follows that every linear transformation A: <p"—»<p" has at least one one-
dimensional invariant subspace. Example 1.1.1 shows that in certain cases a
linear transformation has exactly one one-dimensional invariant subspace.
We pass now to the description of two-dimensional ^-invariant subspaces
in terms of eigenvalues and eigenvectors. So assume that M is a two-
dimensional ^-invariant subspace. Then, in a natural way, A determines a
transformation from M into M. We have seen above that for every trans-
formation in a (complex) finite-dimensional vector space (which can be
identified with <pm for some m) there is an eigenvalue and a corresponding
eigenvector. So there exists an *() G M\{Q} and a complex number A0 such
that Ax0 = A0^0. Now let xl be a vector in M for which {x0, x } } is a linearly
independent set; in other words, M — Span{jc0, x^}. Since M is A invariant it
follows that

for some complex numbers /i0 and /Xj. If /AO = 0, then xl is an eigenvector of
A corresponding to the eigenvalue /i^. If /AO ^ 0 and ^ ^ A 0 , then the vector
v = —/Lt()jc0 + (A 0 —/LtJjCj is an eigenvector of A corresponding to ^ for
which {XQ, y} is a linearly independent set. Indeed

Finally, if /i 0 ^0 and ^ = A 0 , then XQ is the only eigenvector (up to


multiplication by a nonzero complex number) of A in M. To check this,
assume that aQx0 + a,*,, t*j ^ 0, is an eigenvector of A corresponding to an
eigenvalue VQ. Then

But the left-hand side of this equality is

and comparing this with equality (2.1), we obtain


12 Invariant Subspaces

which (with a, 7^0) implies A0 = VQ and a,^t0 = 0, a contradiction with the


assumption /u,0 ^ 0. However, note that the vectors z = (1 II*>Q)X\ and JCQ form
a linearly independent set and z has the property that Az - A0z = *0. Such a
vector z will be called a generalized eigenvector of A corresponding to the
eigenvector jc0.
In conclusion, the two-dimensional invariant subspace M is spanned by
two eigenvectors if and only if either /x() = 0 or /t0 T^ 0 and /LI, ¥^ A0. If /AQ ^ 0
and /A, = A 0 , then ^ is spanned by an eigenvector and a corresponding
generalized eigenvector.
A study of invariant subspaces of dimension greater than 2 along these
lines becomes tedious. Nevertheless, it can be done and leads to the
well-known Jordan normal form of a matrix (or transformation) (see Chap-
ter 2).
Using eigenvectors, one can generally produce numerous invariant sub-
spaces, as demonstrated by the following proposition.

Proposition 1.2.1
Let $\lambda_1, \ldots, \lambda_k$ be eigenvalues of $A$ (not necessarily distinct), and let $x_i$ be an eigenvector of $A$ corresponding to $\lambda_i$, $i = 1, \ldots, k$. Then $\operatorname{Span}\{x_1, \ldots, x_k\}$ is an $A$-invariant subspace.

Proof. For any $x = \sum_{i=1}^k \alpha_i x_i \in \operatorname{Span}\{x_1, \ldots, x_k\}$, where $\alpha_i \in \mathbb{C}$, we have

$$Ax = \sum_{i=1}^k \alpha_i \lambda_i x_i \in \operatorname{Span}\{x_1, \ldots, x_k\}$$

so indeed $\operatorname{Span}\{x_1, \ldots, x_k\}$ is $A$ invariant. □
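A numerical version of this proposition is immediate (an illustrative sketch; the matrix is an arbitrary choice): the span of any selection of eigenvectors returned by an eigensolver is invariant.

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [0., 3., 0.],
              [0., 0., 5.]])
w, V = np.linalg.eig(A)      # A V = V diag(w): columns of V are eigenvectors
X = V[:, :2]                 # span of the first two eigenvectors
# A X = X diag(w_1, w_2), so A maps the span of the columns of X into itself:
print(np.allclose(A @ X, X @ np.diag(w[:2])))    # True
```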

For some transformations all invariant subspaces are spanned by eigenvectors as in Proposition 1.2.1, and for some transformations not all invariant subspaces are of this form. Indeed, in Example 1.1.1 only one of the $n$ nonzero invariant subspaces is spanned by eigenvectors. On the other hand, in Example 1.1.2 every nonzero vector is an eigenvector corresponding to $\lambda_0$, so obviously every $A$-invariant subspace is spanned by eigenvectors.

1.3 JORDAN CHAINS

We have seen in the description of two-dimensional invariant subspaces that eigenvectors alone are not always sufficient for description of all invariant subspaces. This fact necessitates consideration of generalized eigenvectors as well. Let us make a general definition that will include this notion. Let $\lambda_0$ be an eigenvalue of a linear transformation $A: \mathbb{C}^n \to \mathbb{C}^n$. A chain of vectors $x_0, x_1, \ldots, x_k$ is called a Jordan chain of $A$ corresponding to $\lambda_0$ if $x_0 \neq 0$ and the following relations hold:

$$Ax_0 = \lambda_0 x_0, \qquad Ax_i = \lambda_0 x_i + x_{i-1}, \quad i = 1, \ldots, k \qquad (1.3.1)$$

The first equation (together with $x_0 \neq 0$) means that $x_0$ is an eigenvector of $A$ corresponding to $\lambda_0$. The vectors $x_1, \ldots, x_k$ are called generalized eigenvectors of $A$ corresponding to the eigenvalue $\lambda_0$ and the eigenvector $x_0$.

For example, let

$$A = \begin{bmatrix} \lambda_0 & 1 & & \\ & \lambda_0 & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_0 \end{bmatrix}$$

as in Example 1.1.1. Then $e_1$ is an eigenvector of $A$ corresponding to $\lambda_0$, and $e_1, e_2, \ldots, e_n$ is a Jordan chain. This Jordan chain is by no means unique; for instance, $e_1, e_2 + ae_1, \ldots, e_n + ae_{n-1}$ is again a Jordan chain of $A$, where $a \in \mathbb{C}$ is any number.
In Example 1.1.3 the matrix $A$ does not have generalized eigenvectors at all; that is, every Jordan chain consists of an eigenvector only. Indeed, we have $A = \operatorname{diag}[\lambda_1, \lambda_2, \ldots, \lambda_n]$, where $\lambda_1, \ldots, \lambda_n$ are distinct complex numbers; therefore

$$\det(\lambda I - A) = \prod_{i=1}^n (\lambda - \lambda_i)$$

So $\lambda_1, \ldots, \lambda_n$ are exactly the eigenvalues of $A$. It is easily seen that any eigenvector of $A$ corresponding to $\lambda_{i_0}$ is of the form $\alpha e_{i_0}$ with a nonzero scalar $\alpha$. Assuming that there is a Jordan chain $\alpha e_{i_0}, x$ of $A$ corresponding to $\lambda_{i_0}$, equations (1.3.1) imply

$$Ax = \lambda_{i_0} x + \alpha e_{i_0} \qquad (1.3.2)$$

Writing $x = \sum_{j=1}^n \beta_j e_j$, we obtain

$$\sum_{j=1}^n (\lambda_j - \lambda_{i_0})\beta_j e_j = \alpha e_{i_0} \qquad (1.3.3)$$

As $\lambda_j \neq \lambda_{i_0}$ for $j \neq i_0$, we find immediately that $\beta_j = 0$ for $j \neq i_0$. But then the left-hand side of equation (1.3.3) is zero, a contradiction with $\alpha \neq 0$. So there are no generalized eigenvectors for the transformation $A$.
Jordan chains allow us to construct more invariant subspaces.

Proposition 1.3.1
Let $x_0, \ldots, x_k$ be a Jordan chain of a transformation $A$. Then the subspace $\mathcal{M} = \operatorname{Span}\{x_0, \ldots, x_k\}$ is $A$ invariant.

Proof. We have

$$Ax_0 = \lambda_0 x_0 \in \mathcal{M}$$

where $\lambda_0$ is the eigenvalue of $A$ to which $x_0, \ldots, x_k$ corresponds; and for $i = 1, \ldots, k$

$$Ax_i = \lambda_0 x_i + x_{i-1} \in \mathcal{M}$$

Hence the $A$ invariance of $\mathcal{M}$ follows. □
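The relations (1.3.1) are immediate to verify numerically in the setting of Example 1.1.1; a small check (illustrative only, with $n = 3$ and $\lambda_0 = 5$ chosen arbitrarily):

```python
import numpy as np

lam0, n = 5.0, 3
A = lam0 * np.eye(n) + np.diag(np.ones(n - 1), 1)   # Jordan block J_3(5)
x0, x1, x2 = np.eye(n)                              # the chain e_1, e_2, e_3

assert np.allclose(A @ x0, lam0 * x0)               # A x_0 = lam0 x_0
assert np.allclose(A @ x1, lam0 * x1 + x0)          # A x_1 = lam0 x_1 + x_0
assert np.allclose(A @ x2, lam0 * x2 + x1)          # A x_2 = lam0 x_2 + x_1
print("e_1, e_2, e_3 is a Jordan chain of J_3(5)")
```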

The following proposition shows how the Jordan chains behave under a linear change in the matrix $A$.

Proposition 1.3.2
Let $\alpha \neq 0$ and $\beta$ be complex numbers. A chain of vectors $x_0, x_1, \ldots, x_k$ is a Jordan chain of $A$ corresponding to the eigenvalue $\lambda_0$ if and only if the vectors

$$x_0, \quad \alpha^{-1}x_1, \quad \alpha^{-2}x_2, \quad \ldots, \quad \alpha^{-k}x_k \qquad (1.3.4)$$

form a Jordan chain of $\alpha A + \beta I$ corresponding to the eigenvalue $\alpha\lambda_0 + \beta$.

Proof. Assume that $x_0, \ldots, x_k$ is a Jordan chain of $A$ corresponding to $\lambda_0$, that is, equalities (1.3.1) hold. Then we have

$$(\alpha A + \beta I)x_0 = (\alpha\lambda_0 + \beta)x_0$$

and in general for $i = 1, \ldots, k$

$$(\alpha A + \beta I)(\alpha^{-i}x_i) = \alpha^{1-i}(\lambda_0 x_i + x_{i-1}) + \beta\alpha^{-i}x_i = (\alpha\lambda_0 + \beta)(\alpha^{-i}x_i) + \alpha^{-(i-1)}x_{i-1}$$

So by definition the vectors in equality (1.3.4) form a Jordan chain of $\alpha A + \beta I$ corresponding to $\alpha\lambda_0 + \beta$.

Conversely, assume that equality (1.3.4) is a Jordan chain of $\alpha A + \beta I$ corresponding to $\alpha\lambda_0 + \beta$. As

$$A = \alpha^{-1}(\alpha A + \beta I) - \alpha^{-1}\beta I$$

the first part of the proof shows that the vectors

$$x_0, \quad x_1, \quad \ldots, \quad x_k$$

form a Jordan chain of $A$ corresponding to the eigenvalue

$$\alpha^{-1}(\alpha\lambda_0 + \beta) - \alpha^{-1}\beta = \lambda_0 \qquad \square$$

Two corollaries from Proposition 1.3.2 will be especially useful in the sequel.

Corollary 1.3.3
(a) The vector $x_0$ is an eigenvector of $A$ corresponding to $\lambda_0$ if and only if $x_0$ is an eigenvector of $\alpha A + \beta I$ (here $\alpha \neq 0$, $\beta$ are complex numbers) corresponding to $\alpha\lambda_0 + \beta$; (b) the vectors $x_0, \ldots, x_k$ form a Jordan chain of $A$ corresponding to $\lambda_0$ if and only if these vectors constitute a Jordan chain of $A + \beta I$ corresponding to $\lambda_0 + \beta$ for any complex number $\beta$.

In many instances Corollary 1.3.3 allows us to reduce the consideration of eigenvalues and Jordan chains to cases when the eigenvalue is zero. Our first example of this device appears in the proof of the following proposition.

Proposition 1.3.4
The vectors in a Jordan chain $x_0, \ldots, x_k$ of $A$ are linearly independent.

Proof. Assume the contrary, and let $x_p$ be the first generalized eigenvector in the Jordan chain that is a linear combination of the preceding vectors:

$$x_p = \sum_{i=0}^{p-1} \alpha_i x_i$$

We can assume that the eigenvalue $\lambda_0$ of $A$ to which the Jordan chain $x_0, \ldots, x_k$ corresponds is zero. (Otherwise, in view of Corollary 1.3.3(b), we consider $A - \lambda_0 I$ in place of $A$.) So we have $Ax_p = x_{p-1}$. On the other hand, we have

$$Ax_p = \sum_{i=0}^{p-1} \alpha_i Ax_i = \sum_{i=1}^{p-1} \alpha_i x_{i-1}$$

Comparing both expressions, we see that $x_{p-1}$ is a linear combination of the vectors $x_0, \ldots, x_{p-2}$. This contradicts the choice of $x_p$ as the first vector in the Jordan chain that is a linear combination of the preceding vectors. □

1.4 INVARIANT SUBSPACES AND BASIC OPERATIONS ON LINEAR TRANSFORMATIONS

In this section we first consider questions concerning invariant subspaces of sums, compositions, and inverses of linear transformations. We shall also develop the connection between invariant subspaces for a linear transformation and those of similar and adjoint transformations.

The basic result for the first three algebraic operations is given in the following proposition.

Proposition 1.4.1
Let $A, B: \mathbb{C}^n \to \mathbb{C}^n$ be transformations, and let $\mathcal{M} \subseteq \mathbb{C}^n$ be a subspace that is simultaneously $A$ invariant and $B$ invariant. Then $\mathcal{M}$ is also invariant for $\alpha A + \beta B$ (with any $\alpha, \beta \in \mathbb{C}$) and for $AB$. Further, if $A$ is invertible, then $\mathcal{M}$ is also invariant for $A^{-1}$.

Proof. For every $x \in \mathcal{M}$ we have

$$(\alpha A + \beta B)x = \alpha Ax + \beta Bx \in \mathcal{M}$$

and $(AB)x = A(Bx) \in \mathcal{M}$ because $Bx \in \mathcal{M}$.

Assume now that $A$ is invertible, and let $x_1, \ldots, x_p$ be a basis in $\mathcal{M}$. Then the vectors $y_1 = Ax_1, \ldots, y_p = Ax_p$ are linearly independent (because $A$ is invertible) and belong to $\mathcal{M}$ (because $\mathcal{M}$ is $A$ invariant). So $y_1, \ldots, y_p$ is also a basis in $\mathcal{M}$. Now every $x \in \mathcal{M}$ can be written $x = \sum_{i=1}^p \alpha_i y_i$, and then

$$A^{-1}x = \sum_{i=1}^p \alpha_i x_i \in \mathcal{M} \qquad \square$$

For any transformation $A$, we denote by $\operatorname{Inv}(A)$ the set of all $A$-invariant subspaces. Then Proposition 1.4.1 means, in short, that

$$\operatorname{Inv}(A) \cap \operatorname{Inv}(B) \subseteq \operatorname{Inv}(\alpha A + \beta B) \qquad (1.4.1)$$

$$\operatorname{Inv}(A) \cap \operatorname{Inv}(B) \subseteq \operatorname{Inv}(AB) \qquad (1.4.2)$$

and, for invertible $A$,

$$\operatorname{Inv}(A) \subseteq \operatorname{Inv}(A^{-1}) \qquad (1.4.3)$$

By applying equality (1.4.3) with $A$ replaced by $A^{-1}$, we get $\operatorname{Inv}(A^{-1}) \subseteq \operatorname{Inv}(A)$, so actually equality holds in (1.4.3). It is very easy to produce examples when the equality fails in (1.4.1) or (1.4.2). For instance:

EXAMPLE 1.4.1. Let $A: \mathbb{C}^n \to \mathbb{C}^n$ be a transformation that is not of the form $\gamma I$ for some $\gamma \in \mathbb{C}$ (if $n \ge 2$, such transformations obviously exist). By Example 1.1.2, not all subspaces in $\mathbb{C}^n$ are $A$ invariant. On the other hand, take $B = A$ and $\alpha + \beta = 0$ in (1.4.1). Then the right-hand side of (1.4.1) is the zero transformation for which every subspace in $\mathbb{C}^n$ is invariant. □

The following example of strict inclusion in (1.4.2) is instructive.

EXAMPLE 1.4.2. Let

$$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}$$

An easy analysis (using Example 1.1.1) shows that $A$ and $B$ have no nontrivial common invariant subspaces. Thus $\operatorname{Inv}(A) \cap \operatorname{Inv}(B) = \{\{0\}, \mathbb{C}^2\}$. On the other hand, $AB$ must have an eigenvector that spans a nontrivial $AB$-invariant subspace. Again, the inclusion (1.4.2) is strict. □

Consider now the notion of similarity. Recall that two transformations $A$ and $B$ on $\mathbb{C}^n$ are called similar if $A = S^{-1}BS$ for some invertible transformation $S$ (called a similarity transformation between $A$ and $B$). Evidently, similar transformations have the same characteristic polynomial and, consequently, the same eigenvalues. The next proposition reveals the close connection between invariant subspaces of similar transformations.

Proposition 1.4.2
Let transformations $A$ and $B$ be similar, with the similarity transformation $S$: $A = S^{-1}BS$. Then a subspace $\mathcal{M} \subseteq \mathbb{C}^n$ is $A$ invariant if and only if the subspace

$$S\mathcal{M} = \{Sx \mid x \in \mathcal{M}\}$$

is $B$ invariant.

Proof. Let $\mathcal{M}$ be $A$ invariant, and let $x \in S\mathcal{M}$, so that $x = Sy$ for some $y \in \mathcal{M}$. Then $Bx = BSy = SAy$, and since $Ay \in \mathcal{M}$, we find that $Bx \in S\mathcal{M}$. So $S\mathcal{M}$ is $B$ invariant.

Conversely, assume that $S\mathcal{M}$ is $B$ invariant. Then for $y \in \mathcal{M}$ we have $BSy \in S\mathcal{M}$ and thus

$$Ay = S^{-1}BSy \in S^{-1}(S\mathcal{M}) = \mathcal{M}$$

So $\mathcal{M}$ is $A$ invariant. □
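Proposition 1.4.2 is easy to test numerically; a sketch (illustrative only, with an arbitrary diagonal $A$ and a random similarity $S$):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = np.diag([1., 1., 2., 3.])            # Span{e_1, e_2} is A-invariant
S = rng.standard_normal((n, n))          # a generic S is invertible
B = S @ A @ np.linalg.inv(S)             # then A = S^{-1} B S

def is_invariant(T, X, tol=1e-8):
    return np.linalg.matrix_rank(np.hstack([X, T @ X]), tol) == \
           np.linalg.matrix_rank(X, tol)

M = np.eye(n)[:, :2]                     # basis of M (as columns)
print(is_invariant(A, M), is_invariant(B, S @ M))   # True True
```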

Proposition 1.4.2 shows, in particular, that there is a natural correspondence between the sets of invariant subspaces of similar transformations. Let us check this correspondence more closely in some of the examples of invariant subspaces already introduced.

Proposition 1.4.3
Let $A$ and $B$ be similar, with the similarity transformation $S$. Then (a) $\operatorname{Im} B = S(\operatorname{Im} A)$; (b) $\operatorname{Ker} B = S(\operatorname{Ker} A)$; (c) if $x_0, x_1, \ldots, x_k$ is a Jordan chain of $A$ corresponding to $\lambda_0$, then $Sx_0, Sx_1, \ldots, Sx_k$ is a Jordan chain of $B$ corresponding to the same $\lambda_0$.

Proof. The proof is straightforward. Let us check (b). Take $x \in \operatorname{Ker} A$, so $Ax = 0$. Then $Ax = S^{-1}BSx = 0$, and as $S$ is invertible, $BSx = 0$, that is, $Sx \in \operatorname{Ker} B$. Reversing the order of this argument, we see that if $Sx \in \operatorname{Ker} B$ for some $x \in \mathbb{C}^n$, then $x \in \operatorname{Ker} A$. The proofs of (a) and (c) proceed in a similar way. □

Consider now the operation of taking adjoints. Let $A: \mathbb{C}^n \to \mathbb{C}^n$ be a transformation. Recall that the adjoint transformation $A^*: \mathbb{C}^n \to \mathbb{C}^n$ is defined by the relation

$$(Ax, y) = (x, A^*y), \qquad x, y \in \mathbb{C}^n$$

where $(\cdot\,, \cdot)$ is the standard scalar product in $\mathbb{C}^n$:

$$(x, y) = \sum_{i=1}^n x_i \bar{y}_i, \qquad x = \langle x_1, \ldots, x_n \rangle, \quad y = \langle y_1, \ldots, y_n \rangle$$

More generally, if $\mathcal{T}_1, \mathcal{T}_2$ are subspaces in $\mathbb{C}^n$ and $A: \mathcal{T}_1 \to \mathcal{T}_2$ is a linear transformation, its adjoint $A^*: \mathcal{T}_2 \to \mathcal{T}_1$ is defined by the relation

$$(Ax, y) = (x, A^*y), \qquad x \in \mathcal{T}_1, \quad y \in \mathcal{T}_2$$

It is not difficult to check that the adjoint transformation always exists and is unique. It is easily verified that for any linear transformations $A$ and $B$ on $\mathbb{C}^n$ and any $\alpha \in \mathbb{C}$

$$(A + B)^* = A^* + B^*, \qquad (\alpha A)^* = \bar{\alpha}A^*, \qquad (AB)^* = B^*A^*, \qquad (A^*)^* = A$$

If (in the standard basis $e_1, \ldots, e_n$)

$$A = [a_{jk}]_{j,k=1}^n$$

then the adjoint transformation is given by the formula

$$A^* = [\bar{a}_{kj}]_{j,k=1}^n$$

The same formula also holds for the transformation $A$ written as a matrix in any orthonormal basis in $\mathbb{C}^n$ as long as $A^*$ is considered as a matrix in the same basis.
There is a simple and useful characterization of the invariant subspaces of
the adjoint transformation A* in terms of the invariant subspaces of A, as
follows.

Proposition 1.4.4
Let A: <p" —>• <p" be a linear transformation. A subspace M C (J7" is A*
invariant if and only if its orthogonal complement ML is A invariant.

Proof. Assume that M is A* invariant, and let x e M ±. We must prove


that Ax G M L'. Indeed, for every y E M we have

because A*y E M and x E M. x. Conversely, assume that M ± is A invariant,


and take y E M. Then for every x E M *~ we have

which means that A*y E. M. So M is A* invariant.

Note the following equalities for the $A$-invariant subspaces $\operatorname{Ker} A$ and $\operatorname{Im} A$ and the $A^*$-invariant subspaces $\operatorname{Ker} A^*$ and $\operatorname{Im} A^*$:

$$\operatorname{Im} A^* = (\operatorname{Ker} A)^\perp, \qquad \operatorname{Ker} A^* = (\operatorname{Im} A)^\perp \qquad (1.4.4)$$

Indeed, let $x = A^*y$ and $z \in \operatorname{Ker} A$. Then $(x, z) = (A^*y, z) = \overline{(z, A^*y)} = \overline{(Az, y)} = 0$; so $x \in (\operatorname{Ker} A)^\perp$. Hence we have proved that

$$\operatorname{Im} A^* \subseteq (\operatorname{Ker} A)^\perp \qquad (1.4.5)$$

On the other hand, let $x$ be orthogonal to $\operatorname{Im} A^*$. Then for every $y \in \mathbb{C}^n$, we have $(Ax, y) = (x, A^*y) = 0$; so $Ax \perp \mathbb{C}^n$, and thus $Ax = 0$, or $x \in \operatorname{Ker} A$. So $(\operatorname{Im} A^*)^\perp \subseteq \operatorname{Ker} A$. Taking orthogonal complements, we obtain $\operatorname{Im} A^* \supseteq (\operatorname{Ker} A)^\perp$. Combining with (1.4.5), we obtain the first equality in (1.4.4). The second equality follows from the first one applied to $A^*$ instead of $A$ [recall that $(A^*)^* = A$].

Later, we shall also need the following property:

$$\operatorname{Im} A = \operatorname{Im}(AA^*) \qquad (1.4.6)$$

Here, the inclusion $\supseteq$ is clear. For the opposite inclusion, let $x \in \operatorname{Im} A$. Then $x = Ay$ for some $y$. If $z$ is the projection of $y$ onto $\operatorname{Ker} A$, then $y - z \in (\operatorname{Ker} A)^\perp$ and also $x = A(y - z)$. Then (1.4.4) implies that $y - z \in \operatorname{Im} A^*$ and so $x \in \operatorname{Im}(AA^*)$, as required.

A transformation $A: \mathbb{C}^n \to \mathbb{C}^n$ is called self-adjoint if $A = A^*$. It is easily seen that $A$ is self-adjoint if and only if it is represented by a hermitian matrix in some orthonormal basis (recall that a matrix $[a_{jk}]_{j,k=1}^n$ is called hermitian if $a_{jk} = \bar{a}_{kj}$, $j, k = 1, \ldots, n$). For this important class of transformations we have the following corollary of Proposition 1.4.4.

Corollary 1.4.5
If $A$ is self-adjoint, then $\mathcal{M}^\perp$ is $A$ invariant if and only if $\mathcal{M}$ is $A$ invariant.
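The defining relation of the adjoint is also easy to check numerically; a sketch (illustrative only, with random complex data):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Astar = A.conj().T                          # matrix of the adjoint A*

x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# (Ax, y) = (x, A*y).  With (u, v) = sum_i u_i conj(v_i), and np.vdot
# conjugating its *first* argument, (u, v) is np.vdot(v, u):
print(np.allclose(np.vdot(y, A @ x),        # (Ax, y)
                  np.vdot(Astar @ y, x)))   # (x, A*y)  -> True
```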

1.5 INVARIANT SUBSPACES AND PROJECTORS

A linear transformation $P: \mathbb{C}^n \to \mathbb{C}^n$ is called a projector if $P^2 = P$. The important feature of projectors is that there exists a one-to-one correspondence between the set of all projectors and the set of all pairs of complementary subspaces in $\mathbb{C}^n$. This correspondence is described in Theorem 1.5.1.

Recall first that if $\mathcal{M}, \mathcal{L}$ are subspaces of $\mathbb{C}^n$, then $\mathcal{M} + \mathcal{L} = \{z \in \mathbb{C}^n \mid z = x + y,\ x \in \mathcal{M},\ y \in \mathcal{L}\}$. This sum is said to be direct if $\mathcal{M} \cap \mathcal{L} = \{0\}$, in which case we write $\mathcal{M} \dotplus \mathcal{L}$ for the sum. The subspaces $\mathcal{M}, \mathcal{L}$ are complementary (are direct complements of each other) if $\mathcal{M} \cap \mathcal{L} = \{0\}$ and $\mathcal{M} \dotplus \mathcal{L} = \mathbb{C}^n$. Nontrivial subspaces $\mathcal{M}, \mathcal{L}$ are orthogonal if for each $x \in \mathcal{M}$ and $y \in \mathcal{L}$ we have $(x, y) = 0$, and they are orthogonal complements if, in addition, they are complementary. In this case, we write $\mathcal{M} = \mathcal{L}^\perp$, $\mathcal{L} = \mathcal{M}^\perp$.

Theorem 1.5.1
Let $P$ be a projector. Then $(\operatorname{Im} P, \operatorname{Ker} P)$ is a pair of complementary subspaces in $\mathbb{C}^n$. Conversely, for every pair $(\mathcal{L}_1, \mathcal{L}_2)$ of complementary subspaces in $\mathbb{C}^n$, there exists a unique projector $P$ such that $\operatorname{Im} P = \mathcal{L}_1$, $\operatorname{Ker} P = \mathcal{L}_2$.

Proof. Let $x \in \mathbb{C}^n$. Then $x = (x - Px) + Px$. Clearly, $Px \in \operatorname{Im} P$ and $x - Px \in \operatorname{Ker} P$ (because $P^2 = P$). So $\operatorname{Im} P + \operatorname{Ker} P = \mathbb{C}^n$. Further, if $x \in \operatorname{Im} P \cap \operatorname{Ker} P$, then $x = Py$ for some $y \in \mathbb{C}^n$ and $Px = 0$. So

$$x = Py = P^2y = P(Py) = Px = 0$$

and $\operatorname{Im} P \cap \operatorname{Ker} P = \{0\}$. Hence $\operatorname{Im} P$ and $\operatorname{Ker} P$ are indeed complementary subspaces.

Conversely, let $\mathcal{L}_1$ and $\mathcal{L}_2$ be a pair of complementary subspaces. Let $P$ be the unique linear transformation in $\mathbb{C}^n$ such that $Px = x$ for $x \in \mathcal{L}_1$ and $Px = 0$ for $x \in \mathcal{L}_2$. Then clearly $P^2 = P$, $\mathcal{L}_1 \subseteq \operatorname{Im} P$, and $\mathcal{L}_2 \subseteq \operatorname{Ker} P$. But we already know from the first part of the proof that $\operatorname{Im} P \dotplus \operatorname{Ker} P = \mathbb{C}^n$. By dimensional considerations we have, consequently, $\mathcal{L}_1 = \operatorname{Im} P$ and $\mathcal{L}_2 = \operatorname{Ker} P$. So $P$ is a projector with the desired properties. The uniqueness of $P$ follows from the property that $Px = x$ for every $x \in \operatorname{Im} P$ (which, in turn, is a consequence of the equality $P^2 = P$). □
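The second half of the proof also gives a concrete recipe for building the projector: send a basis of $\mathcal{L}_1$ to itself and a basis of $\mathcal{L}_2$ to zero. A sketch (illustrative only; the subspaces below are arbitrary choices):

```python
import numpy as np

def projector(L1, L2):
    # Projector on Im L1 along Im L2; the columns of L1 and L2 together
    # must form a basis of C^n.  P is determined by P L1 = L1, P L2 = 0.
    B = np.hstack([L1, L2])                    # basis of C^n
    C = np.hstack([L1, np.zeros_like(L2)])     # prescribed images
    return C @ np.linalg.inv(B)

L1 = np.array([[1.], [0.], [0.]])
L2 = np.array([[1., 0.], [1., 1.], [0., 1.]])
P = projector(L1, L2)
print(np.allclose(P @ P, P))                            # P^2 = P
print(np.allclose(P @ L1, L1), np.allclose(P @ L2, 0))  # Im P, Ker P as required
```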

We say that $P$ is the projector on $\mathcal{L}_1$ along $\mathcal{L}_2$ if $\operatorname{Im} P = \mathcal{L}_1$, $\operatorname{Ker} P = \mathcal{L}_2$. A projector $P$ is called orthogonal if $\operatorname{Ker} P = (\operatorname{Im} P)^\perp$. Thus the corresponding complementary subspaces are mutually orthogonal. Orthogonal projectors are particularly important and can be characterized as follows.

Proposition 1.5.2
A projector P is orthogonal if and only if P is self-adjoint, that is, P* = P.

Proof. Suppose that P* = P, and let x G Im P, y e Ker P. Then (x, y) =


(Px, y) = (x, Py) - (x, 0) = 0, that is, Ker P is orthogonal to Im P. Since by
Theorem 1.5.1 Ker P and Im P are complementary, it follows that in fact
KerP = (ImP) 1 .
Conversely, let Ker P = (Im P)1. To prove that P* = P, we have to check
the equality

Because of the sesquilinearity of the function (Px, y) in the arguments


x, v G <p", and in view of Theorem 1.5.1, it is sufficient to prove equation
(1.5.1) for the following four cases: (a) x, v G l m P ; (b) *EKerP, yE
Im P; (c) x G Im P, y e Ker P; (d) x, y e Ker P. In case (d), equality (1.5.1)
22 Invariant Subspaces

is trivial because both sides are 0. In case (a) we have

and (1.5.1) follows. In case (b), the left-hand side of equation (1.5.1) is
zero (since x £ Ker P) and the right-hand side is also zero in view of
the orthogonality Ker P = (Im P)1. In the same way, one checks (1.5.1) in
case (c).
So (1.5.1) holds, and P* = P.

Note that if P is a projector, so is / - P. Indeed, (/ — P)2 =


7 - 2 P + P 2 = / - 2 P + P = /-P. Moreover, KerP = Im(/-P) and
Im P = Ker(/-P). It is natural to call the projectors P and I— P com-
plementary projectors.
We now give useful representations of a projector with respect to a
decomposition of <p" into a sum of two complementary subspaces. Let
T: <£"-» <p" be a transformation and let ^, J£2 be a pair of complementary
subspaces in <p". Denote m, = dim ^ (i = 1, 2); then m, + m2 = n. The
transformation T may be written as a 2 x 2 block matrix with respect to the
decomposition <$?, + Z£2 = <£"":

Here Tif (i, j = 1,2) is an ra( x my matrix that represents in some basis the
transformation P,T|^: J2?y—»«£), where P, is the projector on 5€t along ^_i
(so P, + P2 = /).
Suppose now that T — P is a projector on =$?, = Im P. Then representation
(1.5.2) takes the form

for some matrix X. In general, X ¥^ 0. One can easily check that X = 0 if and
only if J^ = Ker P. Analogously, if «$?, = Ker P, then (1.5.2) takes the form

and Y — 0 if and only if 1£2 — Im P. By the way, the direct multiplicatio


P-P, where P is given by (1.5.3) or (1.5.4), shows that P is indeed a
projector: P2 = P.
Consider now an invariant subspace M for a transformation A: $"—* <p".
For any projector P with Im P = M we obtain
Invariant Subspaces and Projectors 23

Indeed, if x £ Ker P, we obviously have

If x E Im P = M, we see that Ax belongs to M as well and thus

once more. Since <p" = Ker P + Im P, (1.5.5) follows. Conversely, if P is a


projector for which (1.5.5) holds, then for every x E Im P we have PAx =
Ax] in other words, Im P is A invariant. So a subspace M is A invariant if
and only if it is the image of a projector P for which (1.5.5) holds.
Let M be an A-invariant subspace and let P be a projector on M [so that
(1.5.5) holds]. Denoting by M' the kernel of P, represent A as a 2 x 2 block
matrix

with respect to the direct sum decomposition <£" = M 4- M'. Here A} t


is a transformation PAP\M:M-*M, A12 is a linear transformation
PA(I-P)\M.:M'-+M,

and all these transformations are written as matrices with respect to some
chosen bases in M and M'. As M is A invariant, equation (1.5.5) implies
that (/- P)AP = Q, that is, A2l = 0. Hence

Using this representation of the matrix A, we can deduce some important


connections between the restriction A\M — Al} and the matrix A itself.

Proposition 1.5.3
Let XQ, . . . , xk be a Jordan chain of A\M corresponding to the eigenvalue A0
of A\M. Then JCQ, . . . , xk is also a Jordan chain of A corresponding to A 0 . In
particular, all eigenvalues of A\M are also eigenvalues of A.

Proof. We have x0 ^ 0; jr.. £ M for i = 0,. . . , A;, and


24 Invariant Subspaces

As these relations can be rewritten as

But PXj — x^ i = 0,1,. . . , k, and we obtain the relations defining


JCQ, . . . , xk as a Jordan chain of A corresponding to A0.

The last statement in Proposition 1.5.3 can also be proved in the


following way. Suppose that \0Eo-(A{l), that is, Ker(A 0 7- An) ^ {0}.
The representation (1.5.6) implies that any nonzero vector from
Ker(A 0 /-,4 n ) belongs to Ker(A 0 /-X). Thus Ker( A07 - A) * {0}, and
\0e<r(A).
In fact, a more general result holds.

Proposition 1.5.4
Let M be an A-invariant subspace with a direct complement M' in <p", and let

be the representation of A with respect to the decomposition <p" = M + M'.


Then

Proof. This follows immediately from the fact that det(A/- J( 4) =


det(A/-,4 n ) det(A/-,4 2 2 ).

As an example in which projectors and the subspaces Im A and Ker A of


a transformation A all play important roles, let us describe here a construc-
tion of generalized inverses for A.
Given a transformation A: <P"-* <pm, the transformation X: <£""—»<p" is
called a generalized inverse of A if the following holds: for any b E Im A the
linear system Ax = b has a solution x = Xb, and for any b E Im X the linear
system Xx — b has a solution x — Ab. So this is a natural generalization of
the notion of the inverse transformation.
Observe that X is a generalized inverse of A if and only if AX A = A and
XAX = X. Indeed, let X be a generalized inverse of A. Then AXb = b for
every b E Im A, that is, for every b of the form b = Ay. So AX Ay = Ay for
all y E <p", and AX A = A. Similarly, one checks that XAX = X. Conversely,
if AX A = A, then for every b of the form b - Ay the vector Xb = XAy is
obviously a solution of the linear equation Ax ==• b.
The descrition of all generalized inverses of A, which implies, in particu-
lar, that a generalized inverse of A always exists, is given by the following
theorem.
Angular Transformations and Matrix Quadratic Equations 25

Theorem 1.5.5
Let A: ("-* <pm be a transformation, let <p" = Ker A + N, <fm = Im A + R
for some subspaces N and R, and let P be the projector on Im A along R, Q
the projector on N along Ker A. Then (a) the transformation Al = A\^ is a
one-to-one transformation of N onto Im A] (b) the transformation A1 defined
on$mbyAIy = A^l(Py),forallyE.$m,isa generalized in verse of A for which
A A = P and AA = Q; (c) all generalized inverses of A are determined as
N, R range over all complementary subspaces for Ker A, Im A, respectively.

The proof of Theorem 1.5.5 is straightforward.


It is easily seen that, in the hypothesis of the theorem, complementary
subspaces R, N are simply the range and null-space of the generalized
inverse that they determine.

Corollary 1.5.6
In the statement of Theorem 1.5.5, we have

1.6 ANGULAR TRANSFORMATIONS AND


MATRIX QUADRATIC EQUATIONS

In this section we study angular transformations and their connections with


matrix quadratic equations and invariant subspaces. The correspondence
between the invariant subspaces of similar transformations described in
Proposition 1.4.2 is useful here.
This discussion can be seen as the first step in the examination of
solutions of matrix quadratic equations. In this program, we first need the
notion of a subspace "angular with respect to a projector." In Chapter 13
we discuss the topological properties of such subspaces in preparation for
the applications to quadratic equations to be made in Chapters 17 and 20.
Let TT be a projector defined on <p". Transformations acting on (f"1 in this
section are written in 2 x 2 block matrix form with respect to the decompo-
sition <p" = Ker TT + Im TT.
A subspace N of <p" is said to be angular with respect to IT if N + Ker TT =
<p". That is, if and only if N and Ker TT are complementary subspaces of <p".
Thus Im TT is angular with respect to TT, but more generally, if R is any
transformation from Im TT into Ker TT, then the subspace
26 Invariant Subspaces

is angular with respect to TT. To see this, observe first that NR is indeed a
subspace; that is, if jc,, jc2 E NK, then for some y^ y2 G Im TT

and if

Thenecause,forany
then

and
Finally, if z E NR fl Ker TT, then z = Ry + y, where y £ Im TT and also
77-z = 0. Thus

Since ft is into Ker TT, TrR - 0 and it follows that y = 0. Hence z = 0 and
$n = tfR + Ker TT.
The angular subspaces generated in this way are, in fact, all possible
angular subspaces.

Proposition 1.6.1
Let N be a subspace of <p". Then Jf is angular with respect to TT if and only if
j{ - NR for some transformation R: Im TT + Ker TT that is uniquely determined
by N.

Proof. If jV = JVR, we have already checked that ^V is angular. To prove


the converse, assume that Jf is angular with respect to TT, and let Q be the
projector of <p" onto N along Ker TT. Put

Then N = NR. Indeed

that is, R: Im TT—>Ker TT, and we have to show that N = NR.


If x G jVw, then for some y = Try,
Angular Transformations and Matrix Quadratic Equations

Thus NR C N. Conversely, if y 6 N then

thus N = NR, as required.


To prove the uniqueness of R, we show that any defining transformation
R in (1.6.1) must have the form (1.6.2). Thus let jVbe angular with respect
to TT, and let R: Im IT—»Ker TT satisfy (1.6.1). Let _ y £ l m 7 r and x =
Ry + y E. N. Then, since / - Q is onto Ker TT along jV

But QR = Q and QTT = 0 so that Ry = (Q - ir)y.

The transformation R appearing in the preceding proposition is called the


angular transformation for N. Note that R can be defined as the restriction
of a difference of projectors:

Consider now a transformation T\ <P"—» <p". As before, let TT: <p"—» <f"
be a projector so that we have <p" = Im TT + Ker TT. Then T has a represen-
tation with respect to this decomposition:

It is clear that Im TT is invariant under T if and only if T2l — 0. Similarly,


Ker TT is T invariant if and only if r,2 = 0. More generally, what is the
condition that a subspace Jf that is angular with respect to TT be T invariant?

Theorem 1.6.2
Let Ji be an angular subspace with respect to the projector TT. Let T have the
representation (1.6.3) with respect to the decomposition <p" = Im TT + Ker TT.
Then Ji is T invariant if and only if the angular transformation R for N
satisfies the matrix quadratic equation.

Proof. If /,, 72 are the identity transformations on Im TT and Ker TT,


respectively, then since R: Im TT—»Ker TT we can define the transformation
28 Invariant Subspaces

which is written as a 2 x 2 matrix with respect to the decomposition


<P" - Im TT 4- Ker TT. The transformation E is obviously invertible and

For every x £ Im TT we have Ex — x + Rx E N. So E maps Im TT onto N and


E~l maps JV back onto Im TT. By Proposition 1.4.2, jV is T invariant if and
only if Im TT is E~1TE invariant. Now observe that

1
so Im TT is E TE invariant if and only if (1.6.4) holds.

Another important observation follows from the similarity (1.6.5).

Corollary 1.6.3
If N is T invariant, then

and

Proof. We have

Now use Proposition1.5.4to obtain(1.6.6).Furth e r ,


so Im TT is E 1TE invariant if and only if (1.6.4) holds.

1.7 TRANSFORMATIONS IN FACTOR SPACES

Let N C <p" be a subspace. We say that two vectors x, y E <p" are compar-
able modulo jV if x - y E Jf, and denote this by x = y (mod JV). In particu-
lar, x = 0 (mod JV) if and only if x G jV. This relation is easily seen to be
reflexive, symmetrical, and transitive. That is

x = ;c(mod Jf) for all x G <p"


x = Xmod -^0 ^ y = *(mod jV)
jr = Xm°d^V) and y = z(mod JV)4>jc s z(mod Jf)
Transformations in Factor Spaces 29

Thus we have an equivalence relation on <p". It follows that <p" is decom-


posed into disjoint classes of vectors with the properties that in each class
the vectors are comparable modulo Jf, and in different classes the vectors
are not comparable modulo JV. We denote by [x] v- the class of vectors that
are comparable modulo M to a given vector x G <£"". The set of all such
classes of vectors defined by comparability modulo JV is denoted $"IN.

Proposition 1.7.1
Let set $"/Jf be a vector space over (p with the following operations of
addition and multiplication by a complex number:

Proof. We have to check first that these definitions do not depend on


the choice of the representatives x G [x]^ and .yGf}']^. If xl G [x]^ and
y i ^ M j f ^ then

that is, Jtj + v, G [x + y]^ . So indeed the class [jc + y]^ does not depend on
the choice of x and y. Similarly, one checks that [ax]^ does not depend on
the choice of x in the class [x]^ (for fixed a).
It is a straightforward but tedious task to verify that $nIN satisfies the
following defining properties of a vector space over <p: The sum is commuta-
tive and associative: (a) x + y = y + x, (x + y) + z = x + (y + z) for every
x, y, z G <p"/jV; (b) there is a zero element OE (p"/jV, that is, an element 0
such that jc + 0 = x for all x E. <p%V; (c) for every x €E $"/JV there is an
additive inverse element y G (p7^V, that is, such that x + y = 0; (d) for every
a, (3 G <p and x, y G <p"/OVthe following equalities hold: a(x + y) = ax + ay,
(a + fi}x = ax + fix, (a(3)x = a(fix), and Ix = x (here 1 is the complex
number). We leave the verification of all these properties to the reader. D

The vector space <p"/jVis isomorphic to any direct complement N' of .yVin
<p". Indeed, let a G <p"/^V; then there exists a unique vector y G N' such that
a = [y]tf and in fact, y — Px, where P is the projector on JV' along JV and
x is any vector in the class a. This is easily checked. We have y — x =
—(I — P)x G JV, so v G a. If there were two different vectors y, and v 2 from
N' such that [y^^ — [y2]jf = a, then y ] = _y2 G jV' H JV and _y, ¥= v 2 , which
contradicts the choice of Jf' as a direct complement to ^Vin <p". So we have
constructed a map <p: <p"~* N' defined by <p(a) = y. This map is easily seen
to be a homomorphism of vector spaces; that is
30 Invariant Subspaces

for every a, b E <p"A/V and every a E (p. Moreover, if <p(a) = <p(b), then the
vector y = <p(a) = <p(/>) belongs to both classes a and b of comparable
vectors modulo N, and thus a = b. So 9 is one-to-one. Taking any y E N\
we see that <p([.yL) ~ y> so <P is onto. Summing up, <p is an isomorphism
between the two vector spaces <p"/jV and N'. In particular, dim <p7^V =
n — dim N. Assume now that N is A invariant for some transformation
A: <p"-» <p". Then the induced transformation A: f"W-> f"/Jf is defined
by ^4[*] v = [^-^l.v f°r an V *€• $"• This definition does not depend on the
choice of the vector x in its class [*] A . Indeed, if [jc,]^ = [^l^v' then

because xl - x2 E Ji and N is A invariant.


We now present some basic properties of the induced linear transfor-
mation A.

Proposition 1.7.2
If X is invariant for both transformations A: (p"—»<P" and B: <p"-» <£", then

If, in addition, A is invertible, then

l
Proof. By Proposition 1.4.1, jVis invariant for aA + @B, AB, and A
(if A is invertible). For any jcE <p" we have

Further, by definition of the induced transformation we have

and

for every *e<p". Finally, (1.7.2) is a particular case of (1.7.1) (with


B — A~l), taking into account the fact that / = /.
The Lattice of Invariant Subspaces 31

It may happen that A is not invertible but A is invertible. For instance,


let A: <P"—»• <p" be any transformation with the property that <p" = Ker A 4-
Im A. (There are many transformations with this property; those represen-
ted by a diagonal matrix in some basis for <£"", for example.) Put Ji = Ker A.
Then for every vector x E <f" that is not in N we have-i4[jt] v = [Ax]A 5^0.
Thus Ker A = {0} and A is invertible. The following proposition clarifies the
situation.
Proposition 1.7.3
7/A 0 is an eigenvalue of A and Ker(/l - A 0 /) is not contained in N, then A0 is
also an eigenvalue of A. Conversely, every eigenvalue A0 of A is an eigen-
value of A and Ker(/l - A07) is not contained in N.
The proof is immediate: if Ax - AOJC with x^N, then ^4[jc] v = A 0 [*] v
with [x]^ T^ 0, and conversely.

1.8 THE LATTICE OF INVARIANT SUBSPACES

We start with the notion of a lattice of subspaces in <p". A set S of subspaces


in <p" is called a lattice if {0} and <p" belong to S and S contains the
intersection and sum of any two subspaces belonging to S. The following are
examples of lattices of subspaces: (a) S = {{0}, M, M±, <p"}, where M is a
fixed subspace in (p"; (b) S = {{0}, S p a n f ^ , . . . , ek} for k = 1, . . . , «}; (c)
S is the set of all subspaces in <p". For us, the following example of a lattice
of subspaces will be the most important.
Proposition 1.8.1
The set lnv(A) of all invariant subspaces for a fixed transformation
A: <p"-»<p" is a lattice.
Proof. Let M, jVelnv(y4). If x&M PuV, then because of the A in-
variance of M and N we have Ax G M and Ax G JV, so M D JV* is .4 invariant.
Now let jc G J( + jV, so that x = xl+x2, where x^M, x2 G N. Then
Ax = Ax{ + Ax2 E.M. + N, and Jf + JV is A invariant as well. Finally, both
{0} and <p" obviously belong to Inv(A).
Actually, examples (b) and (c) are particular cases of Proposition 1.8.1:
(b) is just the set of all A -invariant subspaces for

and example (c) is the set of all invariant subspaces for the zero matrix.
32 Invariant Subspaces

In contrast, if n > 2, the lattice of example (a) is never the lattice of


invariant subspaces of a fixed transformation A. Indeed, assuming the
contrary, the restriction A\M has a one-dimensional invariant subspace (a
subspace spanned by an eigenvector; here we consider A\M as a transfor-
mation from M into M). By Proposition 1.5.3, this subspace is also an
invariant subspace of A. Hence necessarily dim M = 1, and for the same
reason dim M L - 1. Since <p" = M + M1 we obtain a contradiction when
n>2.
In terms of the lattices of invariant subspaces, Propositions 1.4.2 and
1.4.4 can be restated as follows. We define [Inv(/l)]1 to be the set of
subspaces M1 for which M G Inv(y4).

Proposition 1.8.2
Given a transformation A: $" —»(p" and an invertible transformation
S: <F"-*<F", we have

and

We know that if Ml and M2 are A invariant, then so are Ml + M2 and


M}r\M2. It is of interest to find out how the spectra of the restrictions
A\M +M and A\M nM are related to the spectra of A\M and A\M^.

Theorem 1.8.3
If M{ and M2 are A-invariant subspaces, then

and

Recall that cr(B) stands for the set of eigenvalues of a transformation B.

Proof. Proposition 1.5.3 shows that the inclusion D holds in (1.8.1). To


prove the opposite inclusion, write

where M\ is a subspace in Ml such that M\ + (M^r\ M2) = Mlt and


M'2<^M2 satisfies M2 + (Ml C\M2) = M2. Write A\Mt+M2 as the 3 x 3 block
matrix with respect to decomposition (1.8.3):
The Lattice of Invariant Subspaces 33

Here, A^^PjAPj, and P{ (resp. P3) is the projector on M\ along


(Ml r\M2) + M'2 [resp. on M'2 along (M{ r\M2) + M(], and P2 = I- P^-
P3. As we have seen above, the A invariance of Ml implies A3l — A32 — 0,
and the A invariance of M2 implies A ]2 = A13 = 0. So

We find that

and hence that \Gcr(A\Mi+M ) implies that \Eo-(A\Mi) or \Ecr(A\M2).


For the proof of (1.8.2) note that M^(^M2CMl, and hence by Propos-
ition 1.5.3, (T(A\MinM2)Ca-(A\M). S Similarly, v(A\M^M7)Ca(A\M), and
(1.8.2) follows.
The following example shows that the inclusion in (1.8.2) may be strict.

EXAMPLE 1.8.1. Let

Then Ji} and M2 are A invariant and <r(A\M nM ) = {1}; <T(A\M) =


o-G4L2) = {i,o).
A set 5 of subspaces in <p" is called a chain if {0} and <p" belong to S and
either M C Ji or N C M (with proper inclusions) for every pair of different
subspaces M , J f E . S . Obviously, a chain is also a lattice. Also, a chain of
subspaces is always finite (actually, it cannot contain more than n + 1
subspaces), in contrast to lattices that may be infinite, as in example (c)
above.
Let

be a chain of different subspaces. We choose a direct complement ^ to


34 Invariant Subspaces

Mj_{ in the subspace Mt (i = 1,. . . , k). Then we obtain a decomposition of


<p" into a direct sum

This means that for every vector x E (p" there exists unique vectors
xl £ ^,. . . , xk G %k such that jc = *, + *2 + • • • + xk. Now let Pi be the
projector on ^£i along

The projectors P. are mutually disjoint; that is, P-P- = PjPi = 0 for / ^ /, and
Pi + -~ + Pk = L
Now any transformation A: $"—> <P" can be written as a A: x k block
matrix with respect to the decomposition (1.8.6):

where each transformation Aify - P./IF.L:* /' jj j£—»


i J£' is written as a matrix in
some fixed bases in ^ and ^..
Choose a basis j c , , . . . , xn in <p" in such a way that

where 0 < pv < p2 < • • • < pk = /i, and let

Then one can characterize all matrices for which (1.8.5) is a chain of (not
necessarily all) invariant subspaces in terms of the k x k block representa-
tion as follows.

Proposition 1.8.4
All subspaces from the chain (1.8.5) are invariant for a transformation A if
and only if A has the following form in the chosen basis xlt. . . ,xn:

where Aif is a ( P J - pi-1)*(pj- pt-i) matrix, 1 < i </ ^ k (and we define


/>o = 0).
The Lattice of Invariant Subspaces 35

Proof. Assume that A has the form (1.8.8), which means that in terms
of the projectors Plt . . . , Pk defined above the equalities PfiP- = 0 for /' >;'
hold. For a fixed ;', it follows that

As Qj = Pl + • • • + Pj is a projector on M f and Pj+l + • • • + Pk = I - Qr we


obtain (/ - Qj)AQf = 0, which means that Mj - Im Qj is A invariant.
Conversely, if M},M2,. . . ,Mk are all A invariant, then the equality
(/ - QJAQ; = 0 holds for j = 1, . . . , k. So /ViP, = 0 for i >/, and A has
the form (1.8.8).

A chain of subspaces

is called maximal (or complete) if it cannot be extended to a larger chain,


that is, any chain of subspaces

with the property that every Mt is equal to some ^, coincides with the chain
(1.8.9). It is easily seen that a chain (1.8.9) is maximal if and only if
dim Mt: = /, / = ! , . . . , « .
Now if (1.8.9) is a maximal chain, we may choose a basis *,,. . . , xn in
<p" in such a way that

As a particular case of Proposition 1.8.4, we find that all the subspaces


Mlt . . . , Mn are A invariant for a transformation A if and only if A has
upper triangular form in the basis x{, . . . , xn:

We conclude this section with a useful result on chains of invariant


subspaces for a transformation having a basis of eigenvectors in <p". It turns
out that such chains can be chosen to be complementary to any chain of
subspaces given in advance.
Theorem 1.8.5
Let A: <p" -* <p" be a transformation having a basis in <p" formed by
eigenvectors of A. Then for every chain of subspaces jV, C • • • C Jfp in <p"
36 Invariant Subspaces

there exists a chain of A-invariant subspaces M, D • • • D Mp such that M • is a


direct complement to jV y , / = 1 , . . . , / ? .

Proof. Let x{, x2,. . . , xn be a basis in <p" consisting of eigenvectors of


A. We show first of all that there exists a set of indices K{ C {1,. . . , n} such
that the subspace M{ = {*, | i E Kt} is a direct complement to N} in (p". Let
/, be the first index such that xf does not belong to ^V,. If /, < i'2 < • • • < is
(^n) are already chosen, let is+l be the first index such that xi does not
belong to SpanfjC; , . . . , * , • } + .A",. This process will stop after t steps (say)
when the equality Span{jt, , . . . , jc,} + jVj = <p" is reached. Now one can put
K{ = { / , , . . . , /,} to ensure that Span{jt, | / E K { } is a direct complement to
JV,.
By the same token, there is a set K2 C Kl such that M2 = Spanjjc^ j / E
K2} is a direct complement to M^ H Jf2 in Ml. As N2 — (Ml ft jV2) 4- .A^,
clearly J<2 is a direct complement to N2 in <p". Let ^ 3 = Span{jc, | / E K3},
where K3 C K2 and Ji3 is a direct complement to M2C\ jV3 in J(2, and so on.
Clearly, all the subspaces M^ are A invariant. D

In connection with Theorem 1.8.5 we emphasize that not every transfor-


mation has a basis of eigenvectors. Indeed, we have seen in Example 1.1.1 a
transformation with only one eigenvector (up to multiplication by a nonzero
complex number); obviously one cannot form a basis in <p" from the
eigenvectors of A. Furthermore, the transformation A of Example 1.1.1
does not satisfy the conclusion of Theorem 1.8.5. We leave it to the reader
to verify the following fact concerning this transformation A: for a chain
jV, C • • • C Np of subspaces in <p" there is a chain Ml D • • • D Mp of A-
invariant subspaces such that Mi + Jtfi = (p", / = 1, . . . , p if and only if each
NI is spanned by the vectors of type en+fn, en_l + / „ _ , , . . . , en.fj+l +
/„_,,+ !, where r. = dim JV; and the vectors /„, /„_,, . . . , /„_,.+, belong to
Span{el, e2, . . . , en_r}. (As usual, ek stands for the £th unit coordinate
vector in (p".)
The converse of Theorem 1.8.5 is also true: if for every chain of sub-
spaces *W\ C • • • C Np in <p" there exists a chain of ^4-invariant subspaces
Ml D • • • D Mp such that Mj is a direct complement to JV"y., / = 1,. . . , p,
then there exists a basis of eigenvectors of A. However, a stronger state-
ment holds.

Proposition 1.8.6
Let A: (p"-* <p" be a transformation. If each subspace N C <p" has a com-
plementary subspace that is A invariant, there is a basis in <p" consisting of
eigenvectors of A.

Proof. Let JV0 be the subspace spanned by all the eigenvectors of A. We


have to prove that ^V0 = (p". Assume the contrary: NQ ^ (p". The hypothesis
Triangular Matrices 37

of the theorem implies that there is an A-invariant subspace M^ that is a


direct complement to Nn. Clearly, M ( ] ^ { Q } . Hence there exists an eigen-
vector JCG of A in MQ: Ax() = A0;t0, x(}^Q. Since xttj£N(}, we contradict the
definition of jV0.

1.9 TRIANGULAR MATRICES AND COMPLETE CHAINS OF


INVARIANT SUBSPACES

The main result of this section is the following theorem on unitary trian-
gularization of a transformation. It has important implications for the study
of invariant subspaces.
Recall that a transformation U: <p"—»(p" is called unitary if it is invertible
and U~l = U* or, equivalently, if (Ux, Uv) = (x, y) for all x, y E <p". Note
that the seemingly weaker condition \\Ux\\ - \\x\\ for all x E <p" is also
sufficient to ensure that U is unitary. Note also that the product of two
unitary transformations is unitary again, and so is the inverse of a unitary
transformation.
It will be convenient to write linear transformations from <p" into <p" as
n x n matrices with respect to the standard orthonormal basis e^, . . . , en in
<p". We shall use the fact that a matrix is unitary if and only if its columns
form an orthonormal basis in <p".

Theorem 1.9.1
For any n x n matrix A there exists a unitary matrix U such that

is an upper triangular matrix, that is, /,y — 0 for i>j, and the diagonal
elements t ] } , . . . , tnn are just the eigenvalues of A.

Proof. Let A, be an eigenvalue of A with an eigenvector jc, and assume


that ||jc,j| = 1. Let jc2, . . . , xn be vectors in (p" that, together with *,, form
an orthonormal basis for <p". Then the matrix

is unitary. Write Ul in a block matrix form t/, = [*,K], where V= [x2 • • • xn]
is an n x (n - 1) matrix. Then because of the orthonormality of *,,. . . , xn,
Y*XI =0. Now, using the relation Axl = Ajjc,, we obtain
38 Invariant Subspaces
def
Applying the same procedure to the ( w - l ) x ( n - l ) matrix A2 = V*AV,
we find an (n - 1) x (n - 1) unitary matrix U2 such that

for some eigenvalue A 2 of A2 and some (n - 2) x (n - 2) matrix A3. Apply


the same procedure to A3 using a suitable (n - 2) x (n - 2) unitary matrix
U3, and so on.
Then for the n x n unitary matrix

the product U*AU is upper triangular. Finally, as U* = U \ we have

so that f , , , . . . , tnn are the eigenvalues of A.

Let T= U*AU be a triangular form of the matrix A as in Theorem 1.9.1.


Then it follows from Proposition 1.8.4 that there is a maximal chain

where all subspaces M, are T invariant. Then Proposition 1.4.2 shows that
the maximal chain

consists of ^-invariant subspaces. We have obtained the following fact.

Corollary 1.9.2
Any transformation (or n x n matrix) A: <f"~* <P" has a maximal chain of
A-invariant subspaces. In particular, for every /', ! < / < n , there exists an
i-dimensional A-invariant subspace.

In general, a complete chain of /^-invariant subspaces is not unique. An


extreme case of this situation is provided by A = al, a £ <p. For such an A,
every complete chain of subspaces is a complete chain of A -invariant
subspaces. Clearly, there are many complete chains of subspaces in <|7"
(unless n — \}.
Let us characterize the matrices A for which there is a unique complete
chain of invariant subspaces.
Triangular Matrices 39

Theorem 1.9.3
An n x n matrix A has a unique complete chain of invariant subspaces if and
only if A has a unique eigenvector (up to multiplication by a scalar).

Proof. We have seen in the proof of Theorem 1.9.1 that for any
eigenvector x of A the subspace Span {A:} appears in some complete chain of
-^-invariant subspaces. So if a complete chain of invariant subspaces is
unique, the matrix A has a unique eigenvector (up to multiplication by a
scalar).
The converse part of Theorem 1.9.3 will be proved later using the Jordan
normal form of a matrix (see Theorem 2.5.1).

Theorem 1.9.1 has important consequences for normal transformations.


A transformation A: (p"—> <p" is called normal if A A* = A* A. Self-adjoint
and unitary transformations are normal, of course, but there are also normal
transformations that are neither self-adjoint nor unitary.

Theorem 1.9.4
A transformation A: <p" —» <p" is normal if and only if there is an orthonormal
basis in <p" consisting of eigenvectors for A.

Proof. Write A as an n x n matrix. Assuming that A is normal, the


matrix T from (1.9.1) is easily seen to be normal as well:

But T is upper triangular:

Hence the (1,1) entry in T*T is |f n | 2 , whereas this entry in TT* is


k n l 2 + ki 2 | 2 + "' + I ' m 2 As T*T= rr*, it follows that f 12 = • . • = *,„= 0.
Comparing the (2, 2) entries in T* T and TT*, we now find that t23 = • • • =
t2n = 0, and so on. It turns out that T is diagonal. Now Uel, . . . , Uen is an
orthonormal basis in <J7" consisting of eigenvectors of A.
Conversely, assume that A has a set of eigenvectors/!, . . . , / „ that form
an orthonormal basis in <p". Then the matrix U = [/,/2 • • • / „ ] is unitary and
40 Invariant Subspaces

where A, is the eigenvalue of A corresponding to /. So

As the diagonal matrix T is obviously normal, we find that A is normal as


well.

1.10 EXERCISES

1.1 Prove or disprove the following statements for any linear transfor-
mation A: ("->(":
(a) Im A + Ker A = <p".
(b) Im A + Ker A = <p" (the sum not necessarily direct).
(c) Im A H Ker ,4^(0}.
(d) dim Im A + dim Ker A = n.
(e) Im A is the orthogonal complement to Ker A *.
1.2 Prove or disprove statements (d) and (e) in the preceding exercise for
a transformation A: <pm-* <p", where m^n.
1.3 Let A: <p"—»<p" be the transformation given (in the standard ortho-
normal basis) by an upper triangular Toeplitz matrix

where « 0 , . . . , « „ _ , are complex numbers. Find the subspaces Im A


and Ker A.
1.4 Given A:$n—» (p" as in Example 1.1.3, identify the /4-invariant
subspaces Im Ak and Ker Ak, k = 0,1,. . . .
1.5 Identify Im Ak and Ker Ak, k = 0,1,. . . , where

is given by a lower triangular Toeplitz matrix.


1.6 Find all one-dimensional invariant subspaces of the following trans-
formations (written as matrices with respect to the standard ortho-
normal basis):
Exercises 41

1.7 In the preceding exercise, which transformations have a Jordan chain


consisting of more than one vector? Find these Jordan chains.
1.8 Show that all invariant subspaces for the projector P on the subspace
Ji are of the form Ml + J\f,, where Ml (resp. ^V,) is a subspace in M
(resp. JV). Find the lattice Inv P*.
1.9 Given P as in Exercise 1.8, find all the invariant subspaces of
<*!/* + a2(I — P), where al and a2 are complex numbers.
1.10 Let A: f "-> <p" be a transformation with A 2 = /. Show that Im(/ +
A) and Im(7 - A) are the subspaces consisting of the zero vector and
all eigenvectors of A corresponding to the eigenvalues 1 and -1,
respectively.
1.11 Find all invariant subspaces of a transformation A: <£"-» <p" such that
A2 = I.
1.12 Let

(a) Show that A is similar to

(b) Find all invariant subspaces of A.


1.13 Let

Show that A is similar to a matrix of type

and find the lattice Inv(y4). What are the invariant subspaces of A*7
1.14 Let
42 Invariant Subspaces

Prove that the eigenvalues of Q are cos(27r/c/«) + /sin(27rkin},


& = 0 , 1 , . . . , « - ! . Find the corresponding eigenvectors.
1.15 Show that the transformation

has eigenvectors xp whose yth coordinate is sin{jpir/(n +1)},


p = 0,. . . , n - 1 (independently of a). What are the corresponding
eigenvalues?
1.16 Let

where a 0 , . . . , « „ _ , are complex numbers. Show that A0 is an eigen-


value of A if and only if A0 is a zero of the equation

1.17 Let

(a) Find all eigenvalues and eigenvectors of A.


(b) Find a longest Jordan chain.
(c) Show that A is similar to a matrix of the form

and find the similarity matrix.


Exercises 43

(d) Find the lattice Inv(A) of all invariant subspaces of A.


(e) Find all invariant subspaces of the transposed matrix AT.
1.18 Let A: <£"—»<£" be a transformation represented as a matrix in the
standard orthonormal basis. Show that all invariant subspaces for the
transposed matrix AT are given by the formula Span{^,. . . , xk},
where xl, . . . ,xk is a basis in the orthogonal complement to some
^-invariant subspace, and for a vector y = (y^ . . . , yn) G <p" we
denote y = ( y , , . . . , yn).
1.19 Prove a generalization of Proposition 1.4.4: if A: <p" —» <p" is a
transformation and J£, ./V are subspaces in <p", then >U£ C jV holds if
and only if A*N±CM\
1.20 Give an example of a transformation A: £"—><£" that is not self-
adjoint but nevertheless AM x C M 1 for every ^-invariant subspace
./#.
1.21 Let

Find the angular transformations of M} and M2 with respect to the


projector on <p" © {0} along {0} © <p".
1.22 Find at least one solution of the quadratic equation

where

are n x n matrices.
(b) Tl} are n x « diagonal matrices.
(c) TV are n x n circulant matrices.
1.23 Prove that x{ + M,. . . , xk + Jt is a basis in <p"/^ (where
*,, . . . , xk G <p) if and only if for some basis y^ . . . , yp in M the
vectors j t j , . . . , xk, ylt . . . , yp form a basis in <p".
1.24 Let A = diagfflj, . . . , aj: <p"—> (f"1, where the numbers « , , . . . , # „
are distinct. Show that for any ^4-invariant subspace M the induced
transformation A: $"IM —> $ IM can also be written in the form
diag[6j, . . . , bk] in some basis in $nIM.
44 Invariant Subspaces

1.25 Find all the induced transformations A: $"/M-+ $"IM, where

and M is any A -invariant subspace.


1.26 Show that if P is a projector on (p" and P is the induced transfor-
mation on $"IM, where M is a P-invariant subspace, then P is a
projector as well. Find Im P and Ker P.
1.27 Let

be in a triangular form. Show that

is also in a triangular form. Hence the triangular form of a matrix is


not unique, in general.
1.28 Find complete chains of invariant subspaces for the transformations
given in Exercise 1.6. Check for uniqueness in each case.
1.29 Given a transformation in a matrix form

with respect to the basis el, e2, e3, find a complete chain of A-
invariant subspaces. Find a basis in which A has the upper triangular
form.
1.30 Let A: (p2"—» <p2" be a transformation. Prove that there exists an
orthonormal basis in <p2" such that, with respect to this basis, A has
the representation

where, for each / and /', Atj is an upper triangular matrix.


Chapter Two

The Jordan Form and


Invariant Subspaces

We have seen in Section 1.4 and Proposition 1.8.2 that there is a strong
relationship between lattices of invariant subspaces of similar transforma-
tions, namely

for any two tranformations A and 5 from <p" into <p" with S invertible. Thus,
for the study of invariant subspaces, it is desirable to use similarity transfor-
mations to reduce a given transformation to the simplest form, in the hope
that the lattice of invariant subspaces for the simplest form would be more
transparent than that for the original transformation. The "simplest form"
here is the Jordan form. It is obtained in this chapter and used to study
some properties of invariant subspaces. Special insights are obtained into
the structure of invariant subspaces and are exploited throughout the book.
We examine irreducible invariant subspaces, generators of invariant sub-
spaces, maximal and minimal invariant subspaces, and invariant subspaces
of functions of transformations. An interesting class of subspaces is intro-
duced and studied in Section 2.9 that we call "marked." All the subject
matter here is well known, although this exposition may be unusual in
matters of emphasis and detail that will be useful subsequently.

2.1 ROOT SUBSPACES

In this section we introduce the root subspaces of a transformation. The


study of these subspaces is the first step towards an understanding of the
Jordan form. At the same time it will be seen that the root subspaces are
important examples of invariant subspaces that can be described in terms of
Jordan chains.

45
46 The Jordan Form and Invariant Subspaces

We consider now some ideas leading up to the definition of root


subspaces. Let A: <p"-» <p" be a transformation and let A0 be an eigenvalue
of A. Consider the subspaces Ker(.A — A0/)', / = 1, 2 , . . . . For / = 1 the
subspace Ker(A - A()7) ^ {0} is just the subspace spanned by the eigenvec-
tors of A corresponding to A 0 . As (A — A O /)'JC = 0 implies (A — A O /)' + IJC = 0,
we have

Consequently, Ker(A - A () /)' + 1 ^ Ker(A - A () /)' if and only if dim Ker(A -


A0/)' + 1 >dimKer(v4 - A0/)'. Since the dimensions of the subspaces
Ker(A - A 0 /)', / = 1, 2 , . .. are bounded above by «, there exists a minimal
integer p > \ such that

for all integers / > p. The subspace Ker(/4 — A0I)P is called the root subspace
of A corresponding to A() and is denoted £%A (A).
In other words, 91A (A) consists of all vectors x E <p" such that (^4 -
A O /)^JC = 0 for some integer q ^ 1. (This integer may depend on x.) Because

all subspaces in (2.1.1) are A invariant. In particular, the root subspace


3$A (A) is A invariant.
By definition, &^(A) = Ker(,4 - A 0 /) p is the biggest subspace in the
chain (2.1.1). We see later that, in fact, p is the minimal integer i> 1 for
which the equality Ker(A - A0/)' = Ker(v4 - A0/)' + 1 holds, and that p < n.
Hence we also have

The nesting of the kernels in (2.1.1) has a dual in the (descending) nesting
of images:

But these sequences of inclusions are coupled by the fact that, for any
integer / > 0,

Consequently, if p is the least integer for which Ker(A - \0I)p +l =


Ker(A - \0I)P, it is also the least integer which lm( A - A 0 /) p + 1 = Im(a - \QlY-
Root Subspaces 47

Proposition 2.1.1
The root subspace <3lx (A) contains the vectors from any Jordan chain of A
corresponding to A 0 .

Proof. Let AT O , . . . , xk be a Jordan chain of A corresponding to A 0 . Then

Hence all the vectors *, (/ = 0,. . . , /c) belong to £%A (A).


Let us look at the simplest examples. For

as well as for A — A07, the only eigenvalue is A 0 , and the corresponding root
subspace $A (A) is the whole of <f". If

then the root subspace 3ftx(A) is one-dimensional and is spanned by et for


j = l , 2 , . . . ,/i.
Later, we also use the following fact: if A,S: <p"—»(p" are transfor-
mations with 5 invertible, then

for every eigenvalue A0 of A. An analogous property holds also for every


member of the chain (2.1.1). The proof of equation (2.1.2) follows the same
lines as the proof of Proposition 1.4.3.
The following property of root subspaces is crucial.

Theorem 2.1.2
Let A,,..., A,, be all the differenteigenvalues of a transformation
A: (p" —> <p". Then <p" decomposes into the direct sum

We need some preparations to prove this theorem.


48 The Jordan Form and Invariant Subspaces

Lemma 2.1.3
For every eigenvalue A0 of A, the restriction A[9l A (A) has the sole eigenvalue
o
A0.

Proof. Let B^A^ (AY We shall show that for every A, ¥^ A0 the
^0
transformation \ll — B on $1K (A) is invertible. Let q be an integer such
that

Then clearly

Since this implies that

and since A! ^ A 0 , the invertibility of \J - B follows.

Lemma 2.1.4
Given a transformation A: $"—> <P" w^h an eigenvalue A 0 , let q be a positive
integer for which

Then the subspaces Ker(/l - 0\iy and lm(A - A 0 /) 9 are direct complements
to each other in <p".

Proof. Since

we have only to check that

Arguing by contradiction, assume that there is an x ^ 0 in the left-hand side


of equation (2.1.5). Then x = (A — A I)qy for some y. On the other hand,
for some integer r > 1 we have
Root Subspaces 49

It follows that

Hence

a contradiction with (2.1.4) and the definition of a root subspace.

Proof of Theorem 2.1.2 Let A, be an eigenvalue of A. Lemma 2.1.4


shows that

where q is some positive integer for which

By Lemma 2.1.3, the restriction of A to Ker(/4 - A,/)*7 has the sole


eigenvalue A^ On the other hand, A, is not an eigenvalue of the restriction
of A to lm(A - A0/)9.
To see this, observe that we also have

Hence A — A,/ maps lm(A - A,/) 9 onto itself. It follows that A, is not an
eigenvalue of the restriction of A to the ^-invariant subspace lm(A — A,/) 9 .
So the restrictions of A to the subspaces Ker(/4 - Aj/) 4 = $1K (A) and
58 = lm(A — A,/) 9 have no common eigenvalues. This property is easily
seen to imply that, for any eigenvalue A2 of A^

So we can repeat the previous argument with A replaced by A\% and with A,
replaced by an eigenvalue A2 of A\y, to show that

for some v4-invariant subspace M such that A t and A2 are not eigenvalues of
A\M. Continuing this process, we eventually prove Theorem 2.1.2. D

Another approach to the proof of Theorem 2.1.2 is based on the fact that
if <7,(A), . . . , qr(\) are polynomials (with complex coefficients) with no
common zeros, there exist polynomails p,(A),. . . , pr(\) such that
50 The Jordan Form and Invariant Subspaces

(This is easily proved by induction on r, using the Euclidean algorithm for


the case r = 2.) Now let the characteristic polynomial ^(A) = det( A/ - A)
be factorized in the form

where A , , . . . , A r are different complex numbers (and are, of course, just


the eigenvalues of A) and *>,,. . . , vr are positive integers. Define

for y = l , . . . , r . Using the fact that <pA(A) = Q (the Cayley-Hamilton


theorem) one verifies that actually

for y = 1,. . . , r. Finally, take advantage of the existence of polynomials


p,(A),. . . , p r (A) such that equality (2.1.6) holds, and use equation (2.1.7),
to prove Theorem 2.1.2. This approach can be used to prove results
analogous to Theorem 2.1.2 for matrices over fields other than (p.
Now let M be an >l-invaraint subspace. Consider the restriction A\M as a
linear transformation from M into M, and note that

for every A0 that is an eigenvalue of A\M. If A0 is an eigenvalue of A but not


an eigenvalue of A\M, then &^(A\M) = {0}; but also M n ^o(A) = {0}. So
the equality Sft^(A\M) = M O &lXo(A) holds for any A 0 e<r(,4). Applying
Theorem 2.1.2 for the linear transformation A\M and using the above
remark, we obtain the following result.

Theorem 2.1.5
Let A: <p" —> <p" be a transformation, and let M be an A-invariant subspace.
Then M decomposes into a direct sum

where A , , . . . , A r are all the different eigenvalues of A.


Root Subspaces 51

Note that Theorem 2.1.2 is actually the particular case of Theorem 2.1.5
with M = <p". We consider now some examples in which Theorem 2.1.5
allows us to find all invariant subspaces of a given linear transformation.

EXAMPLE 2.1.1. Let A = diag[A,, A 2 , . . . , Aj where A , , . . . , \n are differ-


ent complex numbers (as in Example 1.1.3). Then o-(A) — { A 1 ? . . . , \n}, and

$A(,4) = Span{e,} ,

By Theorem 2.1.5, any /1-invariant subspace M is a direct sum

As M nSpan{e,} is either {0} or Span{e,}, it follows that any ^-invariant


subspace is of the form

for some indices \<i^<i2<---<ip<n. This fact was stated without proof
in Example 1.1.3.

EXAMPLE 2.1.2. Let

where A, and A2 are different complex numbers. The matrix A has the
eigenvalues A, and A 2 . Further,

and thus

So £% A (A) — Span{e1? e2}. For the eigenvalue A2 we have £% A (A) =


Span{e3,e 4 }.
52 The Jordan Form and Invariant Subspaces

We see (as Theorem 2.1.2 leads us to expect) that <J7" is a direct (even
orthogonal) sum of R^ (A) and £%A (A). Let M be any ^-invariant subspace.
By Theorem 2.1.5, we obtain

It is easily seen (cf. Example 1.1.1) that the only /1-invariant subspaces in
Span{e l5 e2} are {0}, Spanjtf,}, and Span{e,, e2}. On the other hand, any
subspace in Span{e3, e4} is A invariant.
One can easily describe all subspaces in Span{e3, e4} as follows: {0};
the one-dimensional subspaces Span{e3 4- ae4}, where a G <p is fixed for
each particular subspace; the one-dimensional subspace Span{e4}; and
Span{e 3 ,e 4 ). Finally, the following is a complete list of ^-invariant sub-
spaces:

2.2 THE JORDAN FORM AND PARTIAL MMULTIPLICITIES

Let A be an n x n matrix. In this section we state one of the most important


results in linear algebra—the canonical form of a matrix A under similarity
transformations A—>S~1AS, where S is an invertible n x n matrix.
We start with some notations. The Jordan block of size k x k with
eigenvalue A0 is the matrix

Clearly, det( A/ - Jk(A0)) = (A - A0)*, so A0 is the only eigenvalue of Jk(\o).


Further
The Jordan Form and Partial Multiplicities 53

so the only eigenvector of Jk(kQ) (up to multiplication by a nonzero complex


number) is el. The invariant subspaces ofJk(\0) were described in Example
1.1.1; they form a complete chain of subspaces in <p*:

It turns out that a similarity transformation can always be found trans-


forming a matrix into a direct sum of Jordan blocks.

Theorem 2.2.1
Let A be an n x n (complex) matrix. Then there exists an invertible matrix S
such that S~1AS is a direct sum of Jordan blocks:

The Jordan blocks Jk(\f) in the representation (2.2.1) are uniquely deter-
mined by the matrix A (up to permutation) and do not depend on the choice
ofS.

Since the eigenvalues of a matrix are invariant under similarity, it is clear


that the numbers A,, . . . , \p are the eigenvalues of A. Note that they are
not necessarily distinct.
We stress that this result holds only for complex matrices. For real
matrices there is also a canonical form under similarity with a real similarity
matrix. This canonical form is dealt with in Chapter 12.
The right-hand side of equality (2.2.1) is called a Jordan form of the
matrix (or the linear transformation) A. For a given eigenvalue A0 of A, let
Jk (A, ) , . . . , Jk (A. ) be all the Jordan blocks in the Jordan form of A for
which A, = A 0 , ( j r = l , . . . , m . The positive integer m is called the geometric
multiplicity of A0 as an eigenvalue of A, and the integers kt, ,. . . , /c, are
called the partial multiplicities of A 0 . So the number of partial multiplicities
of A() as an eigenvalue of A coincides with the geometric multiplicity of A0. In
view of Theorem 2.2.1, the geometric multiplicity and the partial multi-
plicities depend on A and A0 only and do not depend on the choice of the
invertible matrix S for which (2.2.1) holds. The sum ki + • • • + ki of the
partial multiplicities of A0 is called the algebraic multiplicity of A0 (as an
eigenvalue of A). Obviously, the algebraic multiplicity of A0 is not less than
its geometric multiplicity.
The following property of the partial multiplicities will be useful in the
sequel.

Corollary 2.2.2
If A{ and A2 are n{ x n, and n2 x n2 matrices with the partial multiplicities
£,(,4,),. . . ,kmi(Al)andkl(A2),. . . , km2(A2) of A, and A2, respectively,
54 The Jordan Form and Invariant Subspaces

all corresponding to the common eigenvalue A 0 , then A:,(y4,),. . . , km^(Al),


kl(A2),. . . , km^(A-,) are the partial multiplicities of the matrix

corresponding to A 0 . In particular, the geometric (resp. algebraic) multiplicity


of

at A0 is the sum of the algebraic (resp. geometric) multiplicities of A, and A2


at A 0 .

The proof of this corollary is immediate if one observes that the Jordan
form of

can be obtained as a direct sum of the Jordan forms of A} and A2,


We also need the following property of partial multiplicities.

Corollary 2.2.3
The partial multiplicities of A at A0 coincide with the partial multiplicities of
the conjugate transpose matrix A* at A 0 .

Proof. Write A = SJS ~ \ where J is the Jordan form of A and S is a


nonsingular matrix. Then A* = S~l*J*S*. Now the conjugate transpose J*
of the matrix / is similar to the matrix / that is obtained from / by replacing
each entry by its complex conjugate. Indeed, if we define the permutation
("rotation") matrix R with elements r/; defined in terms of the Kronecker
delta by rif = S J> + 1 _ y ., then it is easily verified that R~l = R and

Hence / is the Jordan form of A*, and Corollary 2.2.3 follows from the
definition of partial multiplicities.

To describe the result of Theorem 2.2.1 in terms of linear transfor-


mations, let us introduce the following definition. An A -invariant subspace
M is called a Jordan subspace corresponding to the eigenvalue A0 of A if M
is spanned by the vectors of some Jordan chain of A corresponding to A 0 .
The Jordan Form and Partial Multiplicities 55

Theorem 2.2.4
Let A: <p"—> <p" be a linear transformation. Then there exists a direct sum
decomposition

where Mt is a Jordan subspace of A corresponding to an eigenvalue A, {here


A t , . . . , A are not necessarily different}.
If <f?" = jV, + • • • 4- Nq is another direct sum decomposition with Jordan
subspaces Nt corresponding to eigenvalues /A,, / = 1, . . . , q, then q — p, and
(possibly after a permutation of Jf}, . . . , Nq) dim Mi = dim JV) and A, = /n,
for i = l, . . . , q.
Note that in general the decomposition (2.2.2) is not unique. For
example, if A = /, then one can take Mi = Span{jc,}, where * , , . . . , * „ is
any basis in (p".
Theorem 2.2.1 follows easily from Theorem 2.2.2 and vice versa. Indeed,
let S be as in Theorem 2.2.1. Then put

to satisfy equality (2.2.2).


Conversely, if Mi are as in (2.2.2), choose a basis x\'\ . . . , x(k'} in Mt
whose vectors form a Jordan chain for A. Then put

The direct sum decomposition (2.2.2) ensures that S is an n x n nonsingular


matrix, and the definition of a Jordan chain ensures that S~1AS has the form
(2.2.1).
Theorem 2.2.1 (or Theorem 2.2.4) is proved in the next section. Note
that because of Theorem 2.1.2 one has to prove Theorem 2.2.1 only for the
case when $t^(A) - (p", that is, A has only one eigenvalue A 0 . In this sense
the property of root subspaces described in Theorem 2.1.2 is the first step
toward a proof of the Jordan form.
In view of Proposition 1.4.2, there are many cases in which the Jordan
form allows us to reduce the consideration of invariant subspaces of a
general linear transformation to the consideration of invariant subspaces of
a linear transformation that is given by the Jordan normal form in the
standard orthonormal basis. This reduction is used many times in the sequel.
As a first example of such a reduction we note the following simple fact.
56 The Jordan Form and Invariant Subspaces

Proposition 2.2.5
Let A: <p"-» <p" be a linear transformation. Then the geometric multiplicity of
any A0 E ar(A) coincides with dim Ker(A — A 0 /), and the algebraic multiplici-
ty of \Q coincides with the dimension of£%A (A), the root subspace of A0 [i.e.,
with the dimension of Ker(y4 - A0/)"].

Proof. By (2.1.2) and Theorem 2.2.1 we can assume without loss of


generality that

Then for any A0 E <£ we have

From the definition of the Jordan block it is easily seen that

Hence

is the number of indices / for which A0 = A y , and, by definition, this number


coincides [in case A0 G cr(A)] with the geometric multiplicity of A0.
Similarly

So for q = 1, 2 , . . . and A0 E <p we have

As £%A (A) is the maximal subspace of the type Ker(,4 — A 0 /)^, q = 1,2, . . . ,
we obtain
The Jordan Form and Partial Multiplicities 57

which, by definition, is just the algebraic multiplicity of A 0 .


Proposition 2.2.5 is actually a particular case of the following general
proposition.

Proposition 2.2.6
Let A: <p"-» <P" be a transformation with partial multiplicities / c l 5 . . . , km
corresponding to the eigenvalue A0 of A. Then

where H # represents the number of different elements in a finite set ft.

Proof. In view of formula (2.2.3) we have only to show that

This equality is certainly true for q = 1 (for then both sides are equal to m).
Assume that the equality is true for q — 1. We have

Adding the relation

(which is just the induction hypothesis) we verify (2.2.4).

It follows from Proposition 2.2.6 that if

for some positive integer q, then actually

for all p ^ q, that is


58 The Jordan Form and Invariant Subspaces

2.3 PROOF OF THE JORDAN FORM

In this section we prove Theorem 2.2.4. In view of Theorem 2.1.5, it is


sufficient to consider A\& (A), where A 0 G cr(A) is fixed, in place of A. In
other words, we can assume that A has only one eigenvalue A0, possibly with
several partial multiplicities.
Let f?j = Kcr(A - A,,/)', ] = 1, 2,. . . , m, where m is chosen so that
^m = ^A0(^) but <fm_i*9t^(A). Note that ^ C V2 C • • • C <fm. Let
x(^\ . . . , x(^m) be a basis in !fm modulo ^ m _ , , that is, a linearly independent
set in ym such that

(the sum here is direct). We claim that the mtm vectors

are linearly independent. Indeed, assume

Applying (a - A0/)m ' to the left-hand side and using the property that
(a - A 0 /) m jc^ = 0 for / = 1, . . . , / „ , we find that

Hence £|™, ai0x(^ e ^m-i and because of (2.3.1), a10 = • • • = a, 0 =0. Ap-
plying (A — A 0 /) w ~ 2 to the left-hand side of (2.3.2) we show similarly that
an- • • • — at , = 0, and so on, We put

As we have just seen, the sum M} + M2 + • • • + Mt is direct.


Consider now the vectors

We claim that
Proof of the Jordan Form

Indeed, assume

Applying (A - A 0 /) m 2
to the left-hand side, we get

which implies a t = • • • = a, = 0 in view of equality (2.3.1). So equation


(2.3.3) follows.
Assume first that Span does not coincide with
^m _ 1 . Then there exist vectors h in <fm_v such that the
set {*^-]}!=!'m~1 is linearly independent and

Applying the previous argument to (2.3.4) as with (2.3.1), we find that the
vectors

are linearly independent. Now put

If it happens that

then put formally tm_l =0.


At the next step put

and show similarly that

Assuming that ^m_3 + Span{*^_2, i = 1, . . . , tm + t m _ { } ^ 5^,_2, choose


60 The Jordan Form and Invariant Subspaces

*Si-2. ' = '« + '*-i + 1» • • • • ' « + '*-i + fm-2»" such a way that the vectors
x^_2, i = 1,. . . , tm + tm_l + tm__2 are linearly independent and the linear
span of these vectors is a direct complement to 5^,_3 in ^m_2- Then
put

for /' = ! , . . . , f / n _ 2 - We continue this process of construction of Mt, i =


1,. . . , p, where p = tm + tm_{ + • • • + t { . The construction shows that each
Miis a Jordan subspace of A and the sume Mt + • • • + Mp is a direct sum.
Also

because of our assumption that cr(A) = {cr0}. Hence (2.2.2) holds.


Let us prove the uniqueness part of Theorem 2.2.4. Assume that (2.2.2)
holds, and let A , , . . . , \k be all the different eigenvalues of A. Denoting
by Ej the set of all integers /, 1 < / < / ? , such that A, = A y , we have for
f = 0,l,2,...:

Consequently

In particular (taking / = !), the number of elements in £; coincides with


dim Ker(>4 — Ay7). This proves that for a direct sum decomposition <p" =
N\ + • • • + Nq as in Theorem 2.2.4 we have q — p and for a fixed j the
number of /u,, values that are equal to A y coincides with the numbers of A,
values that are equal to A y . Hence we can assume /U, = A,, / = 1,. . . , p.
Further, (2.3.5) implies that (for fixed A y ) the number

coincides with the number of indices / G Ey such that dim Mt >t(t =


1,2,...), and thus it also coincides with the number of indices i G £. such
that dim J^fi > t. This implies the uniqueness part of Theorem 2.2.4.

2.4 SPECTRAL SUBSPACES

Let A: (p" —> <p" be a transformation. A subspace M C <(?" is called a spectral


subspace for A if M is a sum of root subspaces for A. The zero subspace is
also considered spectral. Since root subspaces are A invariant, a spectral
Spectral Subspaces 61

subspace for A is A invariant. It is easily seen that the total number of


spectral subspaces for A is 2r, where r is the number of distinct eigenvalues
of A
By Theorem 2.1.5, for every A invariant subspace M, we have

where A j , . . . , A p are all the distinct eigenvalues of A. From this formula it


is clear that M is spectral if and only if for every A; either M n S?A (A) = {0}
or the inclusion £%A (A) C M holds. Another consequence of formula (2.4.1)
is that, for any nonzero spectral subspace M of A,

where /i,,. . . , /*> are all the distinct eigenvalues of the restriction A\M.
A useful characterization of spectral subspaces is given by their maximali-
ty property.

Proposition 2.4.1
An A-invariant subspace M T£ {0} is spectral if and only if any A-invariant, v
subspace J£ with the property (r(A\<e) C a-(A\M) is contained in M.

Proof. Assume that M is not spectral so that, in particular, {0} ^


M fl £%A (A) T^ £%A (A) for some A0 G cr(A). Define the ^-invariant subspace
!£ by the equalities

for all eigenvalues A, of A different from A 0 . Obviously, a(A\^) = cr(A\M)


but J£ is not contained in M (actually, ^contains M properly).
On the other hand, assume that M is spectral. If !£ is A invariant with
cr{A\<£) C cr(/4|^), then the equality

(where A 1 5 . . . , \p are the distinct eigenvalues of A) implies that t£ n


$tio(A) = 0 for every A 0 Go-(y4) not belonging to the spectrum of A\M. It
follows then from (2.4.2) that

where /i 1} . . . , ^ are the distinct eigenvalues of A\M. As the right-hand side


of (2.4.3) is equal to M, the inclusion & C M follows.

Another characterization of spectral subspaces can be given in terms of


direct complements.
62 The Jordan Form and Invariant Subspaces

Theorem 2.4.2
The following statements are equivalent for an A-invariant subspace M: (a) M
is spectral for A; (b) there exists a direct complement Jito M such that N is A
invariant and

(c) there exists a unique A-invariant direct complement N to M; (d) for any
A-invariant subspace !£ that contains M properly, cr(A^) contains cr(A\M}
properly.

To accommodate the cases M - {0} and M - <p" in Theorem 2.4.2 w


adopt the convention that the spectrum of the restriction of A to the zero
subspace is empty.

Proof. The equivalence of (a) and (d) follows immediately from


Theorem 2.4.1. By Theorem 2.1.5 (considering each root subspace of A
separately) we can assume that ^ (A) = <p", that is, A has the single
eigenvalue A 0 . Then the only spectral subspaces of A are {0} and <p".
Further, since cr(A\y) = {\()} for every nonzero ,4-invariant subspace ££,
equation (2.4.4) implies that either (r(j4|^) or o^/lj^-) is empty; in other
words, either M - {0} or Ji={0}. But if the latter case holds, then
obviously M = <J7". Thus M = {0} and M = <p" are the only subspaces
satisfying (b), and (a) and (b) are equivalent.
Obviously, (a) implies (c). So it remains to prove that (c) implies (a).
Let M be a nontrivial ^-invariant subspace (i.e., different from {0} and
<p") that has an ^-invariant direct complement JV. Then N is nontrivial as
well. We now use the Jordan form (Theorem 2.2.4) for the restriction A\jfi

where x\'\ . . . , x(^ is a Jordan chain (necessarily with eigenvalue A 0 ) of A,


i = 1, . . . , q. It is easily seen (cf. Proposition 1.3.4) that the vectors xf} ,
j = 1,. . . , kt, i = 1, . . . , q are linearly independent and hence form a basis
in X.
We now construct another direct complement for M that is A invariant.
Let y (^0) be an eigenvector of A in M, and put

As Ay = \0y, one checks easily that N' is A invariant. Also, N'^Jf,


Spectral Subspaces 63

because otherwise y would belong to N, a contradiction with the direct sum


M +jV=<p". We verify that Jf' is a direct complement to M. Indeed,
observe that the vectors x\l);. .. ; x ^ ^ ; x [ l ) + y; jcj°, ; = 1,.. ., kt, i =
2,. . . , q are linearly independent and hence dim N' = dim N. So we must
only check that M H Ji' = {0}. Let

where ai; are complex numbers. The condition M fl jV = {0} implies

which in turn implies

and, because of the linear independence of jtj 0 , all the coefficients a/; are
zeros. In particular, alk =0, and z — 0 in view of equation (2.4.6).
We have proved that (when a-(A) = {\0}) any nontrivial ,4-invariant
subspace either does not have A -invariant direct complements or has at least
two of them. This means that (c) implies (a).

We deduce immediately from Theorem 2.4.2 that the unique /1-invariant


direct complement ^V to a spectral subspace M is spectral as well: if
M = &^(A) + • • • + ^s(A), then JV = dt^A) + ••• + ^(A), where
/u,j,. . . , /AS , v ^ , . . . , vt is a complete list of all the distinct eigenvalues of A.
We say that the spectral subspace M for A corresponds to the part A of
the spectrum of A if <j(A\M) — I^. Obviously, there is a unique spectral
subspace corresponding to any given subset A of cr(A) [with the understand-
ing that <r(/l| {() j) = 0]. This spectral subspace can easily be described in case
A is given by an n x n matrix in Jordan form as in equation (2.2.1). Indeed,
using the notation of that equation, if A C cr(A), define the kt x kt matrix AT,
by Kt = / if A, G A and AT, = 0 if A, ^A. Then the subspace

is the spectral subspace for A corresponding to A. Its only ^-invariant direct


complement is

We conclude this section with a description of spectral subspaces in terms


of contour integrals. (Actually, this description is a particular case of the
64 The Jordan Form and Invariant Subspaces

properties of functions of transformations that are studied in more detail in


Section 2.10.) Let F be a simple, closed, rectifiable, and positively oriented
contour in the complex plane. In fact, for our purposes polygonal contours
will suffice. Given an n x n matrix B(\) = [£>,,(A)]" / = 1 , that depends con-
tinuously on the variable A E F (this means that each entry b, 7 (A) in B( A) is
a continuous function of A on F) the integral

is defined naturally as the n x n matrix whose entries are the integrals of the
entries of B(\):

The same definition of a contour integral applies also for transformations


fi(A): <p"—><p" that are continuous functions of A on F. We have only to
write B(\) as a matrix [^/y(A)]" / = 1 in a fixed basis, and then interpret
f r B( A) d\ as a transformation represented by the matrix [Jr 6 (y (A) dA]" ;=1
in the same basis. One checks easily that this defintion is independent of the
chosen basis.

Proposition 2.4.3
Let A be a subset of cr(A) where A is a transformation on <p", and let F be a
closed contour having A in its interior and o-(A) ^ A outside F. Then the trans-
formation

is a projector (known as a Riesz projector) onto the spectral subspace associated


with A and along the spectral subspace associated with o-(A) ^ A.

Proof. Using the relation 5(A7- A)~lS~l = (A/ - SAS'')"', equation


(2.1.2), and the Jordan form, we can assume that A is an n x n matrix given
by

where /*.(A,.) is the k{ x /c. Jordan block with A, on the main diagonal. One
easily verifies that
Irreducible Invariant Subspaces and Unicellular Transformations 65

As a first consequenc of this formula we see immediately that, because


a(A)r\r = 0, ( A / - A)~l is indeed continuous on T. Further, the Cauchy
formula gives

Thus

where Kt = I if A e A and K. = 0 if A^A. Thus the matrix (2.4.7) is indeed a


projector with image and kernel as prescribed by the theorem.

2.5 IRREDUCIBLE INVARIANT SUBSPACES


AND UNICELLULAR TRANSFORMATIONS

In this section we use the Jordan form to study irreducible invariant


subspaces. An invariant subspace M of a transformation A:$n—> <p" is
called reducible if M can be represented as a direct sum of nonzero
y4-invariant subspaces Ml and M2\ otherwise M is called irreducible.
Let us consider some examples.

EXAMPLE 2.5.1. Let A be a Jordan block. Then, as Example 1.1.1 shows,


each nonzero A -invariant subspace (including (p" itself) is irreducible.

EXAMPLE 2.5.2. Let A = A0/, A0 E <p. Then an ^4-invariant subspace is


irreducible if and only if it is one-dimensional.

EXAMPLE 2.5.3. Let

According to Theorem 2.1.5, the ^-invariant subspaces are as follows: {0};


Span{ael + fte^} for fixed numbers a, j8 G <p with at least one of them
different from zero; Span{ej, e2}\ Span{e,, e3}; <p3. Among these subspaces
Spanje,, e3} and <p3 are reducible and the rest are irreducible.

The following theorem gives various characterizations of irreducible


invariant subspaces.
66 The Jordan Form and Invariant Subspaces

Theorem 2.5.1
The following statements are equivalent for an A-invariant subspace M:(a}Mis
irreducible; (b) each A-invariant subspace N contained in Mis irreducible', (c)M
is Jordan, that is, has a basis consisting of vectors that form a Jordan chain of A;
(d) there is a unique eigenvector (up to multiplication by a scalar) of A in M;(e) the
lattice of invariant subspaces of A\M is a chain—that is, for any A-invariant
subspaces j£,, j£, C M either £{ C 2£2 or 5£2 C <£", holds; (/) every nonzero A-
invariant subspace that is contained in M is Jordan; (g) the spectrum ofA\M is a
singleton {A 0 }, and

(h) the Jordan form of the linear transformation A M consists of a single


Jordan block.

Proof. The definition of a Jordan block and the description of its


invariant subspaces (Example 1.1.1) show that (h) implies all the other
statements in Theorem 2.5.1.
The implications (f)—»(c) and (b)—»(a) are obvious. Let us show that
(c)—>(d). Let xl,. . . , xk be a basis in M such that Ax} = \Qxl; Ax2 — \0x2 =
x{;. . . ; Axk - \0xk = xlt^l. The matiix of A\M in this basis is the k x k
Jordan block with A0 on the main diagonal, so the spectrum of A\M is the
singleton {A 0 }. If x — T.^laixi is an eigenvector of A (necessarily corre-
sponding to A 0 ), then (A — A O /)JC = 0, which implies E*=2 aixi_l = 0. As
jt,, . . . , xk are linearly independent, <x2 = • • • = ak = 0, and x is a scalar
multiple of*,. So (d) holds.
If x and y are two eigenvectors of A\M such that Span{jc) 7^Span{_y},
then for the ^-invariant subspaces J£, = Span{jc} and J£, = Span{_y} we have
^,^^2 and ^2^^,. So (e) implies (d).
It remains, therefore, to show that (d)—»(h), (a)—»(h), and (g)—>(h). To
this end we can assume that A\M is in Jordan form (written as a matrix in a
suitable basis in A 0 ):

If p > 1, then el and ek + 1 are two eigenvectors of A in M that are not scalar
multiples of each other; so (d)—»(h). Further, if p>\, then

is a direct sum of two nonzero A -in variant subspaces. Hence (a)—»(h).


Finally, assume that (g) holds. Then we have A, = A 2 = • • • = A p = A0 in
equation (2.5.1), and this equation implies
Irreducible Invariant Subspaces and Unicellular Transformations 67

On the other hand, the statement (g) implies that the left-hand side of this
equation is also equal to max{0, £, + • • • + kp - /}. In particular (for / = 1),
we have

which implies p — l. So (h) holds, and Theorem 2.5.1 is proved.

Observe that with M = <p", Theorem 1.9.3 is just the equivalence


(d)O(e). Thus the proof of that theorem is now complete.
A transformation A: (pw —» <p" is called unicellular if the Jordan form of A
consists of a single Jordan block. Comparing statements (a) and (h) of
Theorem 2.5.1, we obtain another characterization of a unicellular trans-
formation.

Proposition 2.5.2
A transformation A: <p" —> <p" is unicellular if and only if the whole space ((7"
is irreducible as an A-invariant subspace.

Indeed, rewriting Theorem 2.5.1 for the particular case M - <p", one
obtains various characterizations of unicellular transformations.
Another important property of a unicellular transformation is the "near"
uniqueness of an orthonormal basis in which this transformation has upper
triangular form (see Section 1.9).

Theorem 2.5.3
A transformation A: <p" —*• <P" is unicellular if and only if for any two
orthonormal bases x,,..., xn and y , , . . . , yn in which A has an upper
triangular form we have

where 6}E<p and |6> | = 1.

Proof. Assume that A is not unicellular. By Theorem 2.5.1 there exist


two eigenvectors jc, and yl (which can be assumed to have norm 1) such that
Spanl^J T^Spanjy,}. The proof of Theorem 1.9.1 shows that there exists
an orthonormal basis whose first vector is jc, and in which A has a
triangular form. Similarly, there exists such a basis whose first vector is _>>,.
So equation (2.5.2) does not hold for j = 1.
Assume now that A is unicellular, and let z , , . . . , zn be a Jordan basis
for A in (". So
68 The Jordan Form and Invariant Subspaces

For / = 1, 2,. . . , n define jc, to be a vector in Spanjz,,. . . , z,} that is


orthogonal to Span{zj,. . . , Z j _ , } and has norm 1. (By definition, xl =
a Z j / I J z J I for some a e (p with |a| = 1.) Then

and these subspaces are A invariant. By Proposition 1.8.4, A has an upper


triangular form with respect to the orthonormal basis * t , . . . , xn.
If A also has an upper triangular form in an orthonormal basis
y 1 ? . . . , ) > „ , then

is a chain of ^4-invariant subspaces. But the lattice of all ^-invariant


subspaces is a chain (Example 1.1.1); therefore, (2.5.3) is a unique com-
plete chain of A -invariant subspaces. Hence the chain (2.5.3) coincides with

Hence Spanjyj, . . . , y.) = Span{zj,. . . , z,} for i = 1, 2 , . . . , « , and the


orthonormality of y 1 5 . . . , yn implies that (2.5.2) holds.

We conclude this section with a proposition that was promised in


Section 1.1.

Proposition 2.5.4
The set Inv(^4) of all invariant subspaces of a fixed transformation
A: <£"-» <p" is either a continuum [i.e., there exists a bijection <p: Inv(A)-* ft]
or a finite set.

Proof. In view of Theorem 2.1.5 we can assume that A has only one
eigenvalue A0, that is, £%A (A) - <p". If A is unicellular, then by Example
2.1.1 the set Inv(y4) is finite (namely, there are exactly n + l ,4-invariant
subspaces). If A is not unicellular, then by the equivalence (c)O(d) in
Theorem 2.5.1 there exist two linearly independent eigenvectors x and v of
A: Ax = \Qx, and Ay = A0y. Then (Span{;t + ay} \ a E ft] is a set of A-
invariant subspaces which is a continuum. On the other hand, let ^ be the
map from the set of all /i-tuples (jc,,. . . , xn) of n — dimensional vectors
* ! , . . . , * „ onto Inv(yl) defined by <K*i> • • • , *„) = Span{jCj,. . . , xn] if the
subspace Span{jc,,. . . , xn} is A invariant and «K*i, • • • , xn) = {0} other-
wise. As the set of all n-tuples (*,,. . . , *„), jc, G <p" is a continuum, by an
elementary result in set theory it follows that lnv(A) is a continuum as
well.
Generators of Invariant Subspaces 69

2.6 GENERATORS OF INVARIANT SUBSPACES

Let M be an invariant subspace for the transformation A: <p"—><p". The


vectors jc,,. . . , xm £ <p" are called generators for M if

For example, any basis for M forms a set of generators for M. In connection
with this definition note that for any vectors yl, . . . , yp G <p" the subspace
Span{y,, . . . , yp, Aylt . . . , Ayp, A2ylt . . . , A2yp, . . .} is A invariant. The
particular case when M has one generator is of special interest (see also
Section 1.1), that is, when M = Span{*, Ax, A2x,. . .} for some jc G <p". In
this case we call M a cyclic invariant subspace (and is frequently referred to
as a "Krylov subspace" in the literature on numerical analysis).
The notion of generators behaves well with respect to similarity. That is,
if M is an ,4-invariant subspace with generators j c , , . . . , xm, then SM is an
SAS~{-invariant subspace with generators Sxl,...1Sxm (here 5 is any
invertible transformation). So the study of generators of /1-invariant sub-
spaces can be reduced to the study of generators of /-invariant subspaces,
where J is a Jordan form for A. Let us give some examples.

EXAMPLE 2.6.1. Let A = I (or, more generally, A = al, where a G <p).


Then a /c-dimensional subspace M in <p" (which is obviously ^-invariant) has
not less than k generators. Any set of vectors that span M is a set of
generators.

EXAMPLE 2.6.2. Let A = Jn( A) be the n x n Jordan block with eigenvalue A.


An ^-invariant subspace Mk — Span{e l5 . . . , ek] is cyclic with the generator
ek. D

The generators x\, . . . , xm of M are called minimal generators for M\im


is the smallest number of generators of M. Obviously, any set of minimal
generators is a minimal set of generators. (A set of generators xl,. . . , xp for
the ^-invariant subspace M is called minimal if any proper subset of
{AT,, . . . , xp] does not constitute a set of generators for M.) However, not
every minimal set of generators is a set of minimal generators. Let us
demonstrate this in an example.

EXAMPLE 2.6.3. Let

and let M = <f 2 be the ^-invariant subspace. The vector (1,1) is obviously a
generator for M, so a set of minimal generators must consist of a single
vector.
70 The Jordan Form and Invariant Subspaces

On the other hand, the set of two vectors {el, e2} is a set of generators of
<(72 that is minimal. Indeed, neither of the vectors e} and e2 is a generator of
2
<F .

The number of vectors in a set of minimal generators admits an intrinsic


characterization as follows.

Theorem 2.6.1
Let M be an A-invariant subspace. Then the number of vectors in a
set of minimal generators coincides with the maximal dimension m of
Ker(yl - \0I)\M, where A0 is any eigenvalue ofA\M.

Proof. We can assume that

a matrix in Jordan form (with respect to a certain basis in M). Further, we


can assume that A, = • • • = \m where m^p (recall that m is the maximal
number of Jordan blocks corresponding to any eigenvalue). Let xl,. . . , xq
be generators of M. Let yt be the m-dimensional vector formed by the k^h,
(&j + A: 2 )th, . . . , (kt + k2 + • • • + km)th coordinates of jt, ( / = ! , . . . , q).
Now

Examining the £jth, . . . , (kl + k2 + • • • + km)th coordinates of xi and using


the condition A, = • • • = A m , we see that et £Spanf}^, . . . , yq}. Similarly,
the condition

gives rise to the conclusion that e2 ESpanj^,. . . , yq}. Continuing in this


way, we eventually find that ei G Span{vj,. . . , yq}, i = 1, 2,. . . , m. So
yl,. . . , yq span the whole space <pOT, and thus q^m.
We now prove that there is a set of m generators for M. We proceed by
induction on m.
Suppose first that m = 1, that is, the eigenvalues A , , . . . , A p are all
different. Then the vector x = ek + ek +k + - • • + ek +...+k is a generator
for M. Indeed

Because of the form (2.6.1) of A\M the matrix (A - A 2 /)* 2 - • • (A - Ap/)*^


has the form T{ © 0^. © • • • © 0A , where T{ is an upper triangular non-
Generators of Invariant Subspaces 71

singular matrix. Hence the k{lh coordinate /, k of /, is nonzero. Now


(A - A,/) 7 /, has (k{ -y')th coordinate equal to/, k (and thus nonzero) and
all the coordinates of (A - A,/)'/, below the (A:, -;)th coordinate are zeros
(j = 1,. . . , &i - 1). Consequently, the vectors e{,. . . , ek belong to the
span of/,, (A - A,/)/,,. . . , (A - A,/)* 1 " 1 /!- Similarly, one shows that the
span of vectors

where

contains vectors ek + 1 , . . . , ek +k . Proceeding in this way we find


eventually that all the vectors et, i = 1, . . . , & , + • • • + kp belong to
Span{*, Ax, A2x,. . .}.
Assume now that m > 1. Suppose that for any transformation B and any
B-invariant subspace j£ such that

there exists a set of m — 1 generators in !£. Given the transformation


A: $"-+(", write

where Ml and M2 are some yl-invariant subspaces such that

(Such subspaces Ml and M2 are easily found by using the Jordan form of A.)
By the induction hypothesis we have a set of m — 1 generators jc,,. . . , xm_l
for the /4-invariant subspace M^ Also, we have proved that there is a
generator xm for the /4-invariant subspace M2. Then, obviously, jc,,. . . , xm
is a set of generators for M.

In particular, an v4-invariant subspace M, is cyclic if and only if there is


only one eigenvector (up to multiplication by a nonzero number) in M
corresponding to any eigenvalue of the restricton A\M.
We conclude this section with an example.
72 The Jordan Form and Invariant Subspaces

EXAMPLE 2.6.4. Let A = diag[A, A2 • • • A n ] where A p .. . , An are different


complex numbers. Then <p" is a cyclic subspace for A. A vector x =
(xlt. . . , xn) G <p" is cyclic, that is

if and only if all the coordinates xt are different from zero. Indeed, if xt = 0
for some i, then e, does not belong to Span{;t, Ax, A2x,. . .}. On the other
hand, if xf ^ 0 for / = ! , . . . , « , then

The determinant on the right-hand side is known as the Vandermonde


determinant, and it is well known that it is equal to n /<; (A y — A ( ) 5^0. So
det[*, Ax,. . . , A"~lx] ^0. It follows that the vectors x, Ax,. . . , A"~lx are
linearly independent and thus span <p".

2.7 MAXIMAL INVARIANT SUBSPACE IN A GIVEN SUBSPACE

Given a transformation A: <p"—> <p" and a subspace Ji C <p", we say that an


/1-invariant subspace M is maximal in jV if M C jV and there is no ^4-
invariant subspace that is contained in N and contains M properly.

Proposition 2.7.1
A maximal A-invaraint subspace in Jf exists, is unique and is equal to the sum
of all A-invariant subspaces that are contained in Ji.
Note that, because the dimension of N is finite, M can actually be
expressed as the sum of a finite number of j4-invariant subspaces.

Proof. Clearly, M is y4-invariant and contained in N. Also, M is maximal in


JV. This follows from the definition of M that implies that every /l-invariant
subspace in ^Vis contained in M.
For the uniqueness, assume that there are two different maximal ^4-invariant
subspaces in ^V, say, Ml and M2. Then M{ + M2 is any4-invariant subspace in jV
that contains M{ properly, a contradiction with the definition of a maximal
A-invariant subspace in JV.
Maximal Invariant Subspace in a Given Subspace 73

Observe that if .A" is A invariant, the maximal ^-invariant subspace in jV


coincides with N itself. At the other extreme, assume that Jf does not
contain any eigenvector, it follows that JV does not contain nonzero A-
invariant subspaces. Hence the maximal /1-invariant subspace in jV is the
zero subspace.
Let us consider some examples.

EXAMPLES 2.7.1. Let A = diag[A t , A 2 , . . . , A n ], where A 1 5 . . . , \n are dif-


ferent complex numbers. Then the maximal ,4-invariant subspace in Ji is
Span{e yj ,. . . , eik}, where e} , p = 1,. . . , k are all the vectors among
e t , . . . , en that are contained in JV (by definition, Span{efi,. . . , 6j } = {0}
if none of the vectors e{,. .. ,en belongs to JV).

EXAMPLE 2.7.2. Let A = -/ n (A 0 ), the n x n Jordan block with eigenvalue A 0 .


Then the maximal ^-invariant subspace in JV is Span{e,,. . . , e p _j},
where p is the minimal index such that ep^N (again, we put
Span{ij,. . . ,ep-\} — {0} if N does not contain e}).

The following more explicit description of maximal A -invariant subspaces


is sometimes useful.

Theorem 2.7.2
The maximal A-invariant subspace in Ji coincides with

where . (in particular, N(} = .A").

Proof. We have M C jV0 = JV. Further, M is ^4-invariant. For, if jc G M,


then A'x — yt for some yy e N (j — 0, 1, . . .) and

Hence Ax E M. It remains to verify that M is maximal in N. Let !£ be an


,4-invariant subspace contained in JV. Then for j - 0,1, . . . ,

and (because 3? is A invariant)

Combining these inclusions, we have


74 The Jordan Form and Invariant Subspaces

and M is indeed maximal in N.

In connection with Theorem 2.7.2 observe that JV* = A 'N with A is


invertible.
Given a transformation A: $"—> <p", it is well known that there are scalar
polynomials/(A) such that, if /(A) = E, = 0 a ( A', then

Indeed, the characteristic polynomial of A has this property (the Cayley-


Hamilton theorem). A nonzero polynomial g( A) of least degree—say, p—for
which g(A) - 0 is called a minimal polynomial for A (it can be shown that p
is uniquely defined). Then it is clear that for any integer j^p, we can
equate A' io a polynomial in A of degree less than p. Thus (in the notation
of Theorem 2.7.2)

where p is the degree of a minimal polynomial of A. Indeed, the inclusion C


in equation (2.7.1) is obvious. To prove the opposite inclusion, let <?(A) =
\p + SjT0' a-\' be a minimal polynomial of A, so

Let x E n^J N., so A'x E Jf for j = 0, . . . , p - 1. Then

and xE.Jfp. Assume inductively that we have already proved that xEJV^.,
j = 0,. . . , q — 1 for some q^p. Then

and x£.Nq. So actually x E nJL 0 ^, and equation (2.7.1) is proved.


Observe that (2.7.1) implies

for every q >p — I . In particular, equation (2.7.2) holds with q — n — 1.


Maximal Invariant Subspace in a Given Subspace 75

The case when ^V = KerC and C: $"->(' is a transformation is of


particular interest. In this case one can describe the maximal ^-invariant
subspaces in Ker C in terms of the kernels of transformations CA\ j =
0,1,....

Theorem 2.7.3
Given linear transformations A: (pn —> <p" and C:<p"—»<p r , the maximal
A-invariant subspace in Ker C is

Moreover, the subspace 3f(C, A) coincides with nj=0' Ker(CM') for every
integer q greater than or equal to the degree of a minimal polynomial of A.

Proof. In view of Theorem 2.7.2 and equality (2.7.2), we have only to


show that

However, this equality is immediately verified using the defintions of Ker C


and Ker(CA'). D

We say that a pair of linear transformations (C, A) where A: <p"—»<p"


and C: <£""—» <pr is a null kernel pair if the maximal ^-invariant subspace in
Ker C is the zero subspace, or, equivalently, if

It is easily seen that, also, the pair (C, A) is a null kernel pair if and only if

EXAMPLE 2.7.3. Let C = [c, • • • cj: <p"—»<p, and

For j = 0, 1 , . . . , « — 1 we have
76 The Jordan Form and Invariant Subspaces

and hence

where k is the smallest index such that ck ^ 0. In particular, (C, A) is a null


kernel pair if and only if c, 7^0.

The notion of null kernel pairs plays important roles in realization theory
for rational matrix functions and in linear systems theory, as we see in
Chapters 7 and 8. Here we prove that every pair of transformations has a
naturally defined null kernel part.

Theorem 2.7.4
Let C: <p"-*<p r and A: (f"1 -» <f" be transformations, and let Ml be the
maximal A-invariant subspace in Ker C. Then for every direct complement
M2 to M\ in (p", C and A have the following block matrix form with respect to
the direct sum decomposition <p" = M\ 4- M2.

where the pair C2: M2-^> (pr, A22: M2—* M2 is a null kernel pair. If <p" =
M\ + M'2 is another direct sum with respect to which C and A have the form

where the pair (C2, A'22) is a null kernel pair, then M\ is the maximal
A-invariant subspace in Ker C and there exists an invertible linear transfor-
mation S: M2-+M2 such that

Proof. As M, is A invariant and Cx - 0 for every x E M l, the transfor-


mations C and A indeed have the form of equality (2.7.3). Let us show that
the pair (C2, A22) is null kernel. Assume x G n°l 0 Ker(C 2 >422). As

where by * we denote a transformation of no immediate interest, we have


Maximal Invariant Subspace in a Given Subspace 77

and hence

On the other hand, x belongs to the domain of definition of A22, that is,
xE.M2. Since M1C\M2 = {Q}, the vector x must be the zero vector.
Consequently, (C2, A22) is a null kernel pair.
Now consider a direct sum decomposition <p" = M{ + M2, with respect to
which C and A have the form of equality (2.7.4) with the null kernel pair
(C2, A'22). As

we have

where the last equality follows from the null kernel property of (C2, A'22).
Hence M{ actually coincides with M^. Further, write the identity transfor-
mation /: <p"-> <p" as a 2 x 2 block matrix

Here 5: M2—>M2is a linear transformation that must be invertible in view


of the invertibility of /. The inverse of / (which is / itself) written as a 2 x 2
block matrix with respect to the direct sum decompositions (p" = M} + M2 =
Ml + M2 has the form

We obtain the equalities

which imply equality (2.7.5).

Observe that if (2.7.5) holds, one can identify both M2 and M'2 with (pm,
78 The Jordan Form and Invariant Subspaces

for some integer m. Write C2 and A22 as r x m and m x m matrices,


respectively, with respect to a fixed basis in <pr and some basis in <pm. Then
C'2 and A22 are transformations represented by the matrices C2 and A22,
respectively, with respect to the same basis in <pr and a possibly different
basis in <pm. So the pairs (C2, A22) and (C2, A22) are essentially the same.
We conclude this section with an example.

EXAMPLE 2.7.4. Let C and A be as in Example 2.7.3, and assume that


Cj = • • • = ck_l = 0, ck T^0 (k > 1). Then (C, A) is not a null kernel pair. The
null kernel part (C2, A22) of (C, A) (as in Theorem 2.7.3) is given by

2.8 MINIMAL INVARIANT SUBSPACES OVER A GIVEN SUBSPACE

Here we present properties of invariant subspaces that contain a given


subspace and are minimal with respect to this property. It turns out that
such subspaces are in a certain sense dual to the maximal subspaces studied
in the preceding section. We also see a connection with generators of
invariant subspaces, as studied in Section 2.6.
Given a transformation A: (p" -» <p" and a subspace N C <p", we say that
an .A-invariant subspace M is minimal over N if M D JV and there is no
/I-invariant subspace that contains .A" and is contained properly in M. As an
analog of Proposition 2.7.1 we see that a minimal A-invariant subspace over
N exists, is unique, and is equal to the intersection of all A-invariant
subspaces that contain N. The proof of this statement is left to the reader.
If N is A invariant, then the minimal ^-invariant subspace over Jf
coincides with M itself. On the other hand, it can happen that (p" is the
minimal .A-invariant subspace over N, even when JV is one-dimensional.

EXAMPLE 2.8.1. Let A = diag[A 1? A 2 , . . . , Aj, with different complex num-


bers A n . . . , \n. Let jV = Span £"=1 aiei be a one-dimensional subspace.
Then the minimal ^-invariant subspace over JV is Span{ey | ay ^0}. In
particular, if all ay are different from zero, then the minimal /1-invariant
subspace over ^V is <p".

Our next result expresses the duality between minimal and maximal
invariant subspaces in a precise form. (Recall that by Proposition 1.4.4 the
subspace M is A invariant if and only if its orthogonal complement M x is A*
invariant.)

Proposition 2.8.1
An A-invariant subspace M is minimal over Jf if and only if the A*-invariant
subspace M x is maximal in JV1.
Minimal Invariant Subspaces Over a Given Subspace 79

Proof. Assume that the ^-invariant subspace M is minimal over N. In


particular, M D jV, so M ± C N^. If there were an A*-invariant subspace «2*
such that Jt±C£CJf± and ML ^ g, the subspace £L would be ^
invariant and ./^DJ^D.yV, M¥^ ^^. This contradicts the definition of ^ as
a minimal /1-invariant subspace over N. Hence ML is a maximal .4*-
invariant subspace in jV1. Reversing the argument, we find that, if the
A*-invariant subspace M L is a maximal in JfL, the /l-invariant subspace M
is minimal over N.

Proposition 2.8.1 allows us to obtain many properties of minimal in-


variant subspaces from the corresponding properties of maximal invariant
subspaces proved in the preceding section. For example, let us prove an
analogue of Theorem 2.7.2 in this way.

Theorem 2.8.2
The minimal A-invariant subspace M over N coincides with

Proof. By Proposition 2.8.1 and Theorem 2.7.2, we have

where ^ = {xE. <£" | A*'x £ JV 1 }. It is not difficult to check that for


;' = 0 , 1 , . . .

Indeed, let y G A'N, so that y = A'z for some z G jV. Then for every x G <p"
such that A * 'x €! Jf ^ we have

Hence yE.Nf. If the equality (2.8.2) were not true, there would exist a
nonzero y0 G N*~ such that y0 would be orthogonal to A'Jf. Hence for every
2 G jV we have

which implies _y0 G jV;, a contradiction with y0 G Ji^.


Now (2.8.1) and (2.8.2) give

Note that the equality M = EJL 0 A'Jf can also be verified directly without
80 The Jordan Form and Invariant Subspaces

difficulty. To this end, observe that the subspace


invariant: if x = A'z for some z £ N, then Ax — A'*lz belongs to M0.
Obviously, MQ contains JV. If M' is an A -in variant subspace that contains N,
then

So M0 is indeed the minimal ^-invariant subspace over N.


As all subspaces under consideration are finite dimensional, the sum
T,J=0AjJi is actually the sum of a finite number of subspaces A'N (/ =
0,1,...). In fact

where q is any integer greater than or equal to the degree p of a minimal


polynomial for A. Indeed, it is sufficient to verify equation (2.8.3) for q = p.
Let r(A) = \p + SpJ a y A ; be a minimal polynomial of A, so

Assuming by induction that we have already proved the inclusion

for some s>p, for x = Asy, y G Jf we have

So the inclusion follows, and by induction we have proved


the inclusion C in (2.8.3) (with q — p). As the opposite inclusion is obvious,
(2.8.3) is proved.
Going back to Theorem 2.8.2, observe that

where / j , . . . , fk is a basis in Jf. In other words, the ^-invariant subspace M


has a set of k generators, where k = dim J{. Combining this observation with
Theorem 2.6.1, we obtain the following fact.
Minimal Invariant Subspaces Over a Given Subspace 81

Theorem 2.8.3
If M is the minimal A-invariant subspace over N, and k = dim N, then for any
eigenvalue A0 of A \M we have

In particular, the theorem implies that if jVis one-dimensional, then M is


cyclic.
It is easy to produce examples when the inequality in (2.8.4) is strict. For
instance, in the extreme case when JV = (p" and A has n distinct eigenvalues,
we have M - <p" and

The case when N — Im B, and B: <p* —> <p" is a transformation is of special


interest. Noting that A'(lm B) = Im(A'B), Theorem 2.8.2 together with
(2.8.3) gives the following.

Theorem 2.8.4
Let #: <p j —»<p" and A: $"—>$" be transformations. Then the minimal
A-invariant subspace over Im B coincides with

for every integer q greater than or equal to the degree of a minimal


polynomial for A. [In particular, J>(A, B) = £":* ImCA'B).]

We say that a pair of transformations (A, B), where A: (p" —» <p" and
B: <ps —> (p", is a full-range pair if the minimal ^-invariant subspace over
Im B coincides with <p", or, equivalently, if

It is easy to see that, also, (A, B) is a full-range pair if and only if

The duality generated by Proposition 2.8.1 now takes the form: the pair
(A, B) is a full-range pair if and only if the adjoint pair (B*, A*) is a null
kernel pair. This follows from the orthogonal decomposition
82 The Jordan Form and Invariant Subspaces

which is obtained directly from Proposition 1.4.4.

EXAMPLE 2.8.2. Let

Then

where m is the index determined by the properties that bm^Q, bm+ l =


• • • = bn = 0. In particular, $(A, B) = {0} if and only if B = 0, and the pair
(A, B) is full range if and only if bn9^0.

As with null kernel pairs, full-range pairs will be important in realization


theory for rational matrix functions and in linear systems theory (see
Chapters 7 and 8).
We conclude this section with an analog of Theorem 2.7.4 concerning the
full-range part of a pair of transformations.

Theorem 2.8.5
Given transformations A: <p w —»<p", B: (p*—»<p", let N^ be the minimal A-
invariant subspace over Im B. Then for every direct complement N2 to Nr in
<p", and with respect to the decomposition <p" = JV, + Jf2, the transformations
A and B have the block matrix form

where the pair A, l: JV, - » J f , Bl: (p5 -» Jf} is full-range. If f = Jf { + Jf '2 is


another direct sum decomposition with respect to which A and B have the
form

with full-range pair (A'n, B{), then Jf[ = JV, and A'n = An, B{ = Br
Marked Invariant Subspaces 83

Proof. Equality (2.8.5) holds because jVj is A invariant and


Further, in view of (2.8.5) we have

so (/4 U , #,) is indeed a full-range pair. If (2.8.6) holds for a direct sum
decomposition <p" = M\ + N'2, then

which is equal to N\ in view of the full-range property of (A'n, B { ) . Hence


N\ is the minimal /4-invariant subspace over Im B and thus N\ = .yV,. Now
clearly A\, = A n (which is the restriction of A to JVJ = JV,) and B{ = B^. D

2.9 MARKED INVARIANT SUBSPACES

Let A: <p"—»(p" be a transformation, and let

be a basis in which A has the Jordan form

Obviously, any subspace of the form

for some choice of integers ra(, 0< mi < kf, is .4 invariant. [Here ra, = 0 is
interpreted in the sense that the vectors fn, . . . , f i k do not appear in
(2.9.1) at all.] Such .A-invariant subspaces are called marked (with respect to
the given basis _/) - in which A is in the Jordan form).
The following example shows that, in general, not every A-invariant
subspace is marked (with respect to some Jordan basis for A).

EXAMPLE 2.9.1. Let


84 The Jordan Form and Invariant Subspaces

We shall verify that the A-invanant subspace M = Span{e,, e2} is not


marked in any Jordan basis for A. Indeed, it is easy to see (because A2 ^Q
and rank A = 2) that the Jordan form of A is

So any Jordan basis of A is of the form /,, /2, /3, g, where


A/2 = /i» A/3 = /2- If -^ were marked with respect to this basis, we would
have either M = S p a n { f l , g } or M = Span{/,, /2}. The former case is
impossible because A\M 7^0, and the latter case is impossible because it
implies M, C Im A, which is not true (e2^lm A).
The description of marked invariant subspaces can be reduced to the
description of invariant subspaces which are marked with respect to a fixed
Jordan basis. This reduction is achieved with the use of matrices commuting
with J.
Theorem 2.9.1
Let J be an n x n matrix in Jordan form. Then every marked J-invariant
subspace t£ can be represented in the form !£ = BM, where M is marked (with
respect to the standard basis, e{,. . . , en in <J7") and B is an.nxn matrix
commuting with J.
Proof. Assume j£ = BM, where M is marked (with respect to the
standard basis) and BJ = JB. Denoting by / , , . . . , / „ the columns of fi, we
find that J^is a marked ./-invariant subspace in the basis/,,. . . , fn. (In view
of the equality BJ = JB, the matrix J has the same Jordan form in the basis
/„•..,/„.)
Conversely, if 56 is marked with respect to some Jordan basis / , , . . . , / „
of J, then, denoting B = [/,/2 • • • / „ ] and M = B~1S£, we obtain L in the
required form $ = BM.

Note that the characteristic property of a marked invariant subspace


depends only on the parts of this subspace corresponding to each eigen-
value: an yl-invariant subspace M is marked if and only if for every
eigenvalue A0 of A the A\<% (x) -invariant subspace M n 5£A (v4) is marked.
This follows immediately from the definition of a marked subspace.
In view of Example 2.9.1 it is of interest to find transformations for which
every invariant subspace is marked. We have the following result.

Theorem 2.9.2
Let A: <£"—»• <p" be a transformation such that, for every eigenvalue A0 of A,
at least one of the following holds: (a) the geometric multiplicity of \Q is equal
Functions of Transformations 85

to its algebraic multiplicity; (b) dim Ker(v4 - A07) = 1. Then every A-


invariant subspace is marked.

Proof. Considering M n £%A 0 (A) and A\^ A (A) in place of M and A,


0

respectively, we can assume that A has a single eigenvalue A0.


If dim Ker(A - A07) = 1, then there is a unique maximal chain of A-
invariant subspaces:

where /,, /2, . . . , / „ is any Jordan chain for A. So obviously every A-


invariant subspace is marked.
Assume now that the geometric multiplicity of the eigenvalue A0 of A is
equal to its algebraic multiplicity. Then A & (A) = A0/, and since every
nonzero vector in £%A (A) is an eigenvector tor A, again every ^-invariant
subspace is marked.

It is easy to produce examples of transforamtions for which the hypoth-


eses of Theorem 2.9.2 fail, but nevertheless every invariant subspace is
marked; for example

2.10 FUNCTIONS OF TRANSFORMATIONS

We recall the definition of functions of matrices. Let /(A) = E' =O A'/ be a


scalar polynomial of the complex variable A, and let /I: <p" —>• <p" be a
transformation written as a matrix in the standard basis. Then f(A) is
defined as f ( A ) = E' =O ftA'. Letting Jk( A) be the Jordan block of size k with
eigenvalue A, define

a Jordan form for A. Then

A computation shows that


86 The Jordan Form and Invariant Subspaces

and in general the (s, q) entry of A (A)' is if q 2= 5 and zero


otherwise (here = i\l[(q-s)\(i-(q-s))\\ if i^q-s and

= 0 if i < q - s). It follows that

Hence for fixed A the matrix f(A) depends only on the values of the
derivatives

where /AJ , . . . , fir are all the different eigenvalues of A and m; is the height
of /Uy, that is, the maximal size of Jordan blocks with eigenvalue ^ in a
Jordan form of A. Equivalently, the height of /uy is the minimal integer m
such that Ker(/l - /iy/)m = R^A). This observation allows us to define f(A)
by equality (2.10.1) not only for polynomials /(A), but also for complex-
valued functions that are analytic in a neighbourhood of each eigenvalue of
A.
Note that for a fixed A the correspondence f(\)—>f(A) is an algebraic
homomorphism. This means that for any two functions/(A) and g(A) that
are analytic in a neighbourhood of each eigenvalue of A the following holds:
Functions of Transformations 87

On the left-hand side the function af + fig (which is analytic in a neighbour-


hood of each eigenvalue of A) is naturally defined by

Also, we define

These properties can be verified by a straightforward computation using


(2.10.1). For example:

where /(A) and g( A) are analytic functions in a neighbourhood of A0 and


h(\) = /(A)g(A). In particular, the property (2.10.2) ensures that
f(A)g(A) - g(A)f(A) for any functions /(A) and g( A) that are analytic in a
neighbourhood of each eigenvalue of A.
In the sequel we need integral formulas for functions of matrices. Let A
90 The Jordan Form and Invariant Subspaces

be an n x n matrix, and let F be any simple rectifiable contour in the


complex plane with the property that all eigenvalues of A are inside F. For
instance, one can take F to be a circle with center 0 and radius greater than
\\A\\ (here and elsewhere the norm \\A\\ of a transformation A: <p" —» <p" is
defined by

where for a vector x —

Proposition 2.10.1

Proof. Suppose first that T is a Jordan block with eigenvalue A = 0:

Then

(recall that n is the size of T). So


Functions of Transformations 89

It is then easy to verify (2.10.3) for a Jordan block T with eigenvalue A0 (not
necessarily 0). Indeed, T - A0/ has an eigenvalue 0, so by the case already
considered

where F0 = {A - A0 | A EF}. The change of variables /x = A + A0 on the


left-hand side leads to

Now

so (2.10.3) holds for the block T.


Applying (2.10.3) separately for each Jordan block, we can carry the
result further for arbitrary Jordan matrices J. Finally, for a given matrix A
there exists a Jordan matrix / and an invertible matrix 5 such that T =
S~1JS. Since (2.10.3) is already proved for J, we have

As a consequence of Proposition 2.10.1 we see that for a scalar poly-


nomial /(A) the formula

holds. Note that here F can be replaced by a composite contour that consists
of a small circle around each eigenvalue of A. (Indeed, the matrix function
(/A - A)"1 is analytic outside the spectrum of A.) Using this observation and
formula (2.10.1) we see that for any function that is analytic in a neighbour-
hood of each eigenvalue of A, the formula

holds, where F consists of a sufficiently small circle around each eigenvalue


of A [so that /(A) is analytic inside and on F].
90 The Jordan Form and Invariant Subspaces

A transformation A: <p" -»<p" (or an n x n matrix /4) is called diagon-


able if there exist eigenvectors Jtj, . . . , * „ of /I that form a basis in <p".
Equivalently, an n x n matrix .4 is diagonable if for some nonsingular matrix
S the matrix S~1AS has a diagonal form:

So a diagonable matrix has n Jordan blocks in its Jordan form with each
block of size 1. If one knows that A is diagonable, then f ( A ) can be given a
meaning [by the same formula (2.10.1)] for every function /(A) that is
defined on the set of all eigenvalues of A. So, given a diagonable A, there is
an 5 such that

For any function /(A) that is defined for A = a,,. . . , A = an, put

In particular, f ( A ) is defined for a hermitian A and any function/defined on


^?. Also, for a unitary A and any function / defined on the unit circle, the
matrix f(A) is well defined in this way.
Consider now the application of these ideas to the exponential function.
This is subsequently used in connection with the solution of systems of
differential equations with constant coefficients. As/(A) = e* is analytic on
the whole complex plane, the linear transformation f(A) - eA is defined for
every linear transfomation A: <P"—>• <p". In fact

is given by the same power series as ek. In order to verify (2.10.7), we can
assume that A is in the Jordan form:

Then, by definition

On the other hand


Functions of Transformations 91

/ = 0,1,. . . . So the (s, q) (q^s) entry in the matrix

is

Hence formula (2.10.7) follows.


This argument shows that the series (2.10.7) converges for every
transformation A. Actually, it converges absolutely in the sense that the series

converges as well.
The exponential function appears naturally in the solution of systems of
differential equations of the type

Here akj are fixed (i.e., independent of /) complex numbers, and


xt(t),. . . ,*„(/) are functions of the real variable / to be found. Denoting
A = [akj\"k i=i and x(t) = (x,(f),. . . ,*„(/)), we rewrite this system in the
form

A general solution is given by the formula


92 The Jordan Form and Invariant Subspaces

where x0 = *(0) is the initial value of x(t).


In connection with this formula observe that e(t+s)A = e'AesA, as follows,
for instance, from (2.10.7). In fact, eA+B -eAeB provided A and B com-
mute. However, eA+B is not equal to eA • eB in general.

2.11 PARTIAL MULTIPLICITIES AND INVARIANT SUBSPACES


OF FUNCTIONS OF TRANSFORMATIONS

From the definition of a function of a transformation A: <p n —» <p" it follows


immediately that if \l,. . . , An are the eigenvalues of A (not necessarily
distinct), then /(A,),. . . , /(A n ) are the eigenvalues of f(A). Moreover we
compute the partial multiplicities of f(A), as follows.

Theorem 2.11.1
Let A: <p" —» <p" be a transformation with distinct eigenvalues /A, ,. . . , /ur and
partial multiplicities mn,. . . , mik corresponding to /A,., / = 1,. . . , r. Letf(\)
be an analytic function in a neighbourhood of each ti( (if all m(; are 1, it is
sufficient to require that /(/*,-) be defined for i — 1, . . . , r). For each m^
define a positive integer sif as follows: stj = m{- ifra/7 = 1 or iff(k\Hj) = 0for
& = ! , . . . , m-j — 1; otherwise /^(/x,) is the first nonvanishing derivative of
/(A) at /A,. Then the partial multiplicities of f ( A ) corresponding to the
eigenvalue A are as follows:

for all indices i such that /(/a ( ) = A.

Proof. By Corollary 2.2.2, if suffices to consider the case when A =


Jm(n>) is a Jordan block. Using equations (2.10.1), we see that

where f(s\ /x) is the first nonvanishing derivative of /(A) at /LA. [If m = 1 or if
f(k\u.) = 0 for k = 1, . . . , m, we put s = m.] More generally
Partial Multiplicities and Invariant Subspaces 93

Denoting the left-hand side of this relation by f y , note that the sizes of
Jordan blocks of f(A) are uniquely determined by the sequence 11,. . . , tm.
Indeed, the number of Jordan blocks of f ( A ) with size not less than; is just
tj - /,_}, where ; = 1,. . . , m and tQ is zero by definition. This observation,
together with (2.11.1), leads to the conclusion of the theorem.
Let us give an illustrative example for Theorem 2.11.1.
EXAMPLE 2.11.1. Let A be a 23 x 23 matrix with only two distinct eigen-
values 0 and 1, and with partial multiplicities 1,4,9 corresponding to the
eigenvalue 0, and with partial multiplicities 2,7 corresponding to the eigen-
value 1. Let /(A) = A 2 ( A - I) 4 Then f ( A ) has the unique eigenvalue 0, and
the different partial multiplicities of A have the following contribution to the
partial multiplicity (PM) of f(A), according to Theorem 2.7.1:

The PM 1 for A gives rise to the PM 1 of f(A).


The PM 4 of A gives rise to the PM values 2,2 of f ( A ) .
The PM 9 of A gives rise to the PM values 4,5 of f(A).
The PM 2 of A gives rise to the PM values 1,1 of f(A).
The PM 7 of A gives rise to the PM values 1,2,2,2 of f(A).

Hence a Jordan form for the transformation A2(A - I)4 has four Jordan
blocks of size 1,5 Jordan blocks of size 2, one Jordan block of size 4 and one
Jordan block of size 5, all corresponding to the eigenvalue zero.

Note that for a given transformation A: <p" —»<p" and a function /(A) such
that f(A) can be defined as above, there exists a polynomial p(\) such that
p(A) =f(A). Indeed, take p(\) such that

where /Xj, . . . , /u,r are all the different eigenvalues of A and my is the height
Of f l j .
Consider now the connections between invariant subspaces of A and the
invariant subspaces of a function of A.

Proposition 2.11.2
If M is an invariant subspace of a transformation A, then M is also invariant
for every transformation f(A), where /(A) is a function for which f(A) is
defined.

The proof is immediate:


94 The Jordan Form and Invariant Subspaces

for some polynomial p( A) = E;10 p ; A ; ; so for every x E M we have A'x G M,


I = 0, . . . , ra, and thus

Note that in general the linear transformation f(A) may have more invariant
subspaces than A, as the following example shows.

EXAMPLE 2.11.2. Let

The invariant subspaces of A are {0}, Span{e,}, <p2, but the invariant
subspaces of A2 = 0 are all the subspaces in <p2.

We characterize the cases when f ( A ) has exactly the same invariant


subspaces as A.

Theorem 2.11.3
(a) Assume that f( A) is an analytic function in a neighbourhood of each
eigenvalue / u , , . . . , inr of A (/u,,, . . . , /u,r are assumed to be distinct). Then
f(A) has exactly the same invariant subspaces as A if and only if the following
conditions hold: (i) /(^,) ^/(/u. y ) if /LI, ^ /t y ; (ii) /'(/"•,•) ^0 /or every eigen-
value jjLj with height greater than 1. (b) If A is diagonable and f( A) is a
function defined at each eigenvalue of A, then f ( A ) has exactly the same
invariant subspaces as A if and only if condition (/) of part (a) holds.

Proof. We shall assume that A has the Jordan form

where each A, coincides with some /n;, \< j< r.


Suppose that (i) does not hold, and suppose, for instance, that A, ^ A 2
but/^Aj) = /(A 2 ). Formula (2.10.1) shows that et + ek{ + t is an eigenvector of
f ( A ) corresponding to the eigenvalue/(A,). Hence Span{e, + ek +,} is/(A)
invariant; but this subspace is easily seen not to be A invariant.
Suppose that (ii) does not hold; say, £ , > ! and /'(A,) = 0. Formula
(2.6.1) implies that el + e2 is an eigenvector oif(A) corresponding to/^).
So Span{e, + e2} is an f(A)-invariant subspace that is not A invariant.
Assume now that (i) and (ii) hold. As f ( A ) = p(A) for some polynomial
p(A), we can assume that/(A) is itself a polynomial. Condition (i) imposed
on the polynomial / ensures that the root subspace of A corresponding to
some eigenvalue A0 is also a root subspace of f ( A ) corresponding to the
Exercises 95

eigenvalue/(A 0 ). Since every ,4-invariant [resp. f(A)-invariant] subspace is


a direct sum of ^-invariant [resp. f(A)-invariant] subspaces, each summand
belonging to a root subspace, we can assume that cr(A) consists of a single
point; say, cr(A) = {0}. Replacing, if necessary, /(A) by a/(A) + /3, where
a, j3 E <p are constants and a 7^0—such a replacement does not alter the set
lnv(f(A)) of all /(v4)-invariant subspaces—we can assume that /(O) = 0,
/'(0) = 1. In this case

But then f(A)=AF, where F= I + Ef = / ai +lA' is an invertible matrix.


Clearly, every v4-invariant subspace is also AF invariant. Note that F~l is a
polynomial in AF (this can be checked, for instance, by direct computation
in each Jordan block of A, using the fact that A is a Jordan matrix and
(r(A) = {0}); so every A F- in variant subspace is also (AF • F ~ l ) invariant,
that is, A invariant. Thus we have proved that lnv(f(A)) - Inv A.

2.12 EXERCISES

2.1 Let

where A ^ : <p m —»<p m and A2: <p"—»<p" are transformations.


(a) Prove or disprove the following statement: every ^-invariant
subspace is a direct sum of an Al -invariant subspace and an
/!2-invariant subspace.
(b) Prove or disprove the preceding statement under the additional
condition that the spectra of A, and A2 do not intersect.
(c) Prove or disprove the preceding statement under the additional
condition that A, and A2 are unicellular with the same eigenvalue.
2.2 Let A: $"—> <f" be a transformation with A2 = I. Describe the root
subspaces of A.
2.3 Describe the root subspaces of a transformation A such that A* = /.
How many spectral /4-invariant subspaces are there?
2.4 Find the root subspaces of the transformation

where B: <p"-» <p" is some transformation and A, ^ A 2 . Is it true that


& A .(X) = Ker( A,/ - A), i - 1, 2?
% The Jordan Form and Invariant Subspaces

2.5 Find the Jordan form for the following matrices A:

For each one of the matrices A and each eigenvalue A0 of A, check


whether 9t^(A) = Ker( A07 - A).
2.6 Find all possible Jordan forms of transformations A: <p"-* <p" satisfy-
ing A2 = 0. Express the number of Jordan blocks of size 2 in terms of
A.
2.7 Find the Jordan form of the transformation

2.8 What is the Jordan form of Qk, k = 2, 3,. . . , where Q is given in


Exercise 2.7.]
2.9 Describe the Jordan form of a circulant matrix

where al,. , . ,an are complex numbers. Prove that there exists an
invertible matrix S independent of al,. . . , an such that SAS~l is
diagonal. [Hint: A is a polynomial in Q, where Q is defined in
Exercise 2.7].
2.10 What is the Jordan form of the transformation
Exercises 97

2.11 Find the Jordan form of the transformation

2.12 Let AI, A2,. • • , An be transformations on <p2, and define

(a) Show that A is similar to a block diagonal matrix with 2 x 2


blocks on the main diagonal. [Hint: On writing

for /' = 1,. . . , n, A is similar to

where B is the circulant matrix

and analogously for C, D, and F. Now use the existence of one


similarity transformation that takes B, C, D, and Fto the Jordan
form (Exercise 2.9).]
(b) Prove that in the Jordan form of A only Jordan blocks of size 2
or 1 may appear.
(c) Show that if all Aj, j = 1,. . . , n are diagonal matrices, then A is
diagonable, that is, the Jordan form of A is a diagonal matrix. Give
an example of nondiagonal A j , . . . , An for which A is diagonable
nevertheless.
98 The Jordan Form and Invariant Subspaces

2.13 Prove that the block circulant matrix

where A\,. . . , An are k x k matrices, has Jordan blocks of sizes less


than or equal to k in its Jordan form.
2.14 Find the Jordan form for the transformation

where a 0 , . . . , « „ _ , E <p and the polynomial A" - E"=0' af\' has n


distinct zeros. Show that a similarity that takes A to its Jordan form is
given by the Vandermonde matrix of type

2.15 Let

(a) Prove that, for each eigenvalue, A has only one Jordan block in
its Jordan form. (Hint: Use the description of partial multi-
plicities of A in terms of the matrix polynomial A/ — A; see the
appendix.)
(b) Find the Jordan form of A.
Exercises 99

2.16 Show that any matrix of the type

where At are k x k matrices, has not more than k Jordan blocks


corresponding to each eigenvalue in its Jordan form.
2.17 What is the Jordan form of the upper triangular Toeplitz matrix

where « 0 ,. . . , an_l are complex numbers with a{ ^0?


2.18 Find the Jordan form of (/„( A0))*, k = 2, 3, Show that [/„(())]*
has infinitely many invariant subspaces if k>2.
2.19 Describe the Jordan form of the matrix in Exercise 2.17 without the
restriction flj^O. When does this matrix have infinitely many in-
variant subspaces? [Hint: Observe that the matrix is a polynomial in
7n(0) and use Theorem 2.11.1.]
2.20 Prove that an n x n matrix A is similar to its transpose AT.
2.21 Let A: $"-> <f" be a transformation such that p(A) = 0, where p( A) is
a polynomial of degree k with k distinct zeros A,, . . . , \k.
(a) Show that Ker( A/ - A) * {0}, ; = 1, . . . , k.
(b) Verify the direct sum decomposition

(c) Prove that A is diagonable.


2.22 Assume that the transformation A: <p" —> <p" satisfies the equation
p(A) = 0, where p(A) is a polynomial. Let A0 be a zero of/?(A), and
let k be its multiplicity. Show that the .A-invariant subspace Im q(A),
where q(X) = p( A)( A - A 0 )~*, is spectral.
2.23 Prove that for any transformation A: (p" —> (p" the inequalities

dim Ker As+l + dim Ker ,4s'1 <2dim Ker As, 5= 1,2,...

hold.
100 The Jordan Form and Invariant Subspaces

2.24 Prove that a transformation A: <p"—><p" has the property that


AM L C M ± for every ^-invariant subspace M if and only if A is
normal.
2.25 Show that a transformation has only one-dimensional irreducible
subspaces if and only if A is diagonable.
2.26 Find the minimal number of generators in ℂ^n of the following
transformations:
(a) The circulant

(b) The lower triangular Toeplitz matrix

(c) The companion matrix

2.27 Prove that if A: ℂ^n → ℂ^n has one-dimensional image, the minimal
number of generators of any A-invariant subspace is less than or
equal to n − 1. Show that Ker A is the only nontrivial A-invariant
subspace whose minimal number of generators is precisely n − 1.
2.28 For a given transformation A, denote by g(M) the minimal number of
generators in an A-invariant subspace M. Prove that

where the maximum is taken over all eigenvalues λ_0 of A [g({0}) is
interpreted as zero].

2.29 Let

where A_1 and A_2 are transformations such that every invariant
subspace of each of them is cyclic. Prove or disprove the following
statements:
(a) Every A-invariant subspace is cyclic.
(b) Every A-invariant subspace has not more than two minimal
generators.
2.30 Show that the vector (0, 0, . . . , 0, 1) ∈ ℂ^n is a generator of ℂ^n as an
invariant subspace of a companion matrix.
2.31 Find the minimal A-invariant subspace over Im B for the following
pairs of transformations:

Here a_1, . . . , a_n are complex numbers.


2.32 Find the maximal A-invariant subspace in Ker C for the following
pairs of transformations:
(a) C = [1 0 ⋯ 0]; A is a companion matrix.
(b) C = [1 0 ⋯ 0]; A is an upper triangular Toeplitz matrix.
(c) C = [I_k 0 ⋯ 0]; A is as in Exercise 2.31, (c).

2.33 Prove or disprove the following statements:


(a) If M_1 is the maximal A-invariant subspace in V_1 and M_2 is the
maximal A-invariant subspace in V_2, then M_1 + M_2 is the maxi-
mal A-invariant subspace in V_1 + V_2.
(b) If M_i and V_i (i = 1, 2) are as in (a), then M_1 ∩ M_2 is the maximal
A-invariant subspace in V_1 ∩ V_2.
(c) The analog of (a) for the case of minimal A-invariant subspaces M_i
over V_i, i = 1, 2.
(d) The analog of (b) for the case of minimal A-invariant subspaces
M_i over V_i, i = 1, 2.
2.34 Find when the following pairs of matrices are full-range pairs:

(b) (A, B), where A is an n × n matrix with A^n = 0 and B is an
n × 1 matrix.
2.35 Find when the following pairs of matrices are null kernel pairs:
(a) ([c_1, . . . , c_n], J_n(λ_0)^k), where k ≥ 1 is a fixed integer and
c_1, . . . , c_n ∈ ℂ.
(b) (C, A), where C is a 1 × n matrix and A is an n × n upper triangular
matrix with zeros on the main diagonal.
2.36 Given a full-range pair A: ℂ^n → ℂ^n, B: ℂ^m → ℂ^n, prove that if
A': ℂ^n → ℂ^n, B': ℂ^m → ℂ^n are transformations sufficiently close to A
and B, respectively (i.e., ‖A' − A‖ < ε, ‖B' − B‖ < ε, where ε > 0
depends on A and B only), then (A', B') is a full-range pair as well.
2.37 Prove that for every pair of transformations A: ℂ^n → ℂ^n, B: ℂ^m → ℂ^n
there exists a sequence of full-range pairs (A_p, B_p), p = 1, 2, . . . such
that lim_{p→∞} ‖A_p − A‖ = 0 and lim_{p→∞} ‖B_p − B‖ = 0.
2.38 State and prove the analogs of Exercises 2.36 and 2.37 for null kernel
pairs.
2.39 Let A and B be transformations on ℂ^n. Show that the biggest
A-invariant (or, equivalently, B-invariant) subspace M for which
A|_M = B|_M consists of all vectors x ∈ ℂ^n such that A^j x = B^j x,
j = 1, 2, . . . .
2.40 Let A_1, . . . , A_k be transformations on ℂ^n. Show that the biggest
A_1-invariant subspace M for which A_1|_M = A_p|_M for p = 1, . . . , k
consists of all x ∈ ℂ^n such that A_1^j x = A_p^j x for p = 1, . . . , k and
j = 1, 2, . . . .

2.41 Show that the transformation e^A is nonsingular for every transfor-
mation A: ℂ^n → ℂ^n. Find the eigenvalues and the partial multiplicities
of e^A in terms of the eigenvalues and the partial multiplicities of A.
2.42 Give an example of a transformation A such that Inv(A) is finite but
Inv(e^A) is infinite.
2.43 Show that for a transformation A the series

converges provided all eigenvalues of A are less than 1 in absolute
value. For such an A prove that A = e^{f(A)} − I, so one can write
f(A) = ln(I + A). Prove that A and ln(I + A) have exactly the same
invariant subspaces.
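A small numerical sketch of the convergence claim, assuming the omitted display is the usual logarithm series f(A) = Σ_{j≥1} (−1)^{j+1} A^j / j:

```python
import numpy as np
from scipy.linalg import logm

A = np.array([[0.2, 0.5],
              [0.0, -0.3]])         # eigenvalues 0.2 and -0.3, both < 1 in modulus
S = np.zeros_like(A)
term = np.eye(2)
for j in range(1, 200):
    term = term @ A                  # term = A^j
    S += (-1) ** (j + 1) * term / j
assert np.allclose(S, logm(np.eye(2) + A))   # f(A) = ln(I + A)
```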
2.44 Find all marked A-invariant subspaces for the transformation A of
Example 2.9.1.
2.45 Show that for any transformation A, all A-hyperinvariant subspaces
are marked.
2.46 For which of the following classes of n × n matrices are all invariant
subspaces marked?
(a) Companion matrices
(b) Block companion matrices

with 2 × 2 blocks A_j (p = n/2)

(c) Upper triangular Toeplitz matrices
(d) Circulant matrices
(e) Block circulant matrices

with 2 × 2 blocks A_j
(f) Matrices A such that A² = 0

2.47 Prove that every invariant subspace of a matrix of type

is marked.
2.48 Prove that for any transformation A: ℂ^3 → ℂ^3 every invariant sub-
space is marked.
2.49 Find all Jordan forms of transformations A: ℂ^4 → ℂ^4 for which there
exists a nonmarked invariant subspace.
Chapter Three

Coinvariant and
Semiinvariant
Subspaces

In this chapter we study two classes of subspaces closely related to invariant
ones; namely, coinvariant and semiinvariant subspaces. A subspace is called
coinvariant if it is a direct complement to an invariant subspace. A subspace
is called semiinvariant if it is a coinvariant part of an invariant subspace.
Also, we introduce here the related notion of a triinvariant decomposition
for a transformation. This requires a decomposition of the whole space into
a direct sum of three subspaces with respect to which the transformation has
a block upper triangular form. It follows that the first, second, and
third subspace are invariant, semiinvariant, and coinvariant, respectively.
The triinvariant decomposition will play an important role in subsequent
applications.

3.1 COINVARIANT SUBSPACES

A subspace M ⊆ ℂ^n is called coinvariant for the transformation A: ℂ^n → ℂ^n
(or, in short, A coinvariant) if there is an A-invariant direct complement to
M in ℂ^n. Consider some simple examples.

EXAMPLE 3.1.1. Let A be an n × n Jordan block. Then for each i (1 ≤ i ≤ n)
Span{e_i, e_{i+1}, . . . , e_n} is an A-coinvariant subspace (although there are
many other A-coinvariant subspaces). For this subspace there is a unique
A-invariant subspace that is its direct complement, namely,
Span{e_1, e_2, . . . , e_{i−1}} ({0} if i = 1). Note that, in this case, the only
subspaces that are simultaneously A invariant and A coinvariant are the
trivial ones {0} and ℂ^n. □


EXAMPLE 3.1.2. Let A = diag[λ_1, . . . , λ_n], where all λ_i are different. As we
have seen in Example 1.1.3, the only A-invariant subspaces are {0}, ℂ^n, and
Span{e_{i_1}, . . . , e_{i_k}}, k = 1, . . . , n − 1, for any choice of i_1 < i_2 < ⋯ < i_k. In
contrast, every subspace in ℂ^n is A coinvariant. Indeed, let M =
Span{x_1, . . . , x_q}, where x_1, . . . , x_q are linearly independent vectors in ℂ^n.
Then the columns of the n × q matrix X = [x_1 x_2 ⋯ x_q] are linearly indepen-
dent. So there exist q rows of X, say, the i_1th, . . . , i_qth rows, which are also
linearly independent. Put {j_1, . . . , j_{n−q}} = {1, . . . , n} \ {i_1, . . . , i_q} and
N = Span{e_{j_1}, . . . , e_{j_{n−q}}}, so that N is an A-invariant subspace. As, by
construction, the n × n matrix

is nonsingular, N is a direct complement to M in ℂ^n. Thus M is A
coinvariant. □

EXAMPLE 3.1.3. If A = αI, α ∈ ℂ, then every subspace in ℂ^n is obviously A
coinvariant. For every A-coinvariant subspace M there is a continuum of
A-invariant subspaces that are direct complements to M in ℂ^n. □

For an A-coinvariant subspace M and any projector P onto M such that
Ker P is A invariant, we have PAP = PA. This follows, for instance, when
equation (1.5.5) is applied to I − P, or else it can be proved directly.
Conversely, if PAP = PA for some projector P onto a subspace M ⊆ ℂ^n,
then M is A coinvariant and Ker P is an A-invariant direct complement to M
in ℂ^n.
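The equality PAP = PA is easy to test numerically. A minimal sketch with hypothetical data, taking M = Span{e_1, e_2} and the A-invariant complement N = Span{e_3, e_4, e_5}:

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 2, 5
A = rng.standard_normal((n, n))
A[:k, k:] = 0.0                       # force N = Span{e_3, e_4, e_5} to be A-invariant
P = np.zeros((n, n))
P[:k, :k] = np.eye(k)                 # projector onto M = Span{e_1, e_2} along N
assert np.allclose(P @ A @ P, P @ A)  # PAP = PA
```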
Given an A-coinvariant subspace M and a projector P onto M such that
Ker P is A invariant, the linear transformation A has the following block
triangular form:

with respect to the decomposition ℂ^n = Im P + Ker P. In particular, we find
that every eigenvalue of the compression PA|_M: M → M of A to its coin-
variant subspace M is also an eigenvalue of A. Indeed, in the represent-
ation (3.1.1) the compression PA|_M coincides with A_{11}, and this immediately
implies that σ(PA|_M) ⊆ σ(A).
We note that, essentially, the compression to a coinvariant subspace
depends on the invariant direct complement only. (Actually, we have
encountered this property already in Theorem 2.7.4 and its proof.)

Proposition 3.1.1
Let M_1 and M_2 be A-coinvariant subspaces with a common A-invariant
direct complement N. Then the compressions P_1 A|_{M_1}: M_1 → M_1 and
P_2 A|_{M_2}: M_2 → M_2 (where P_j is the projector on M_j along N for j = 1, 2) are
similar.

Proof. Write

with respect to the direct sum decompositions ℂ^n = M_1 + N and ℂ^n =
M_2 + N, respectively. Also, write the identity transformation I: ℂ^n → ℂ^n in
the 2 × 2 block matrix form

(so S_11: M_1 → M_2, S_12: N → M_2, S_21: M_1 → N, S_22: N → N). It is easily seen
that S_12 = 0 and S_22 = I_N, the identity transformation on N. As I is invert-
ible, the transformation S_11 must be invertible as well, and

Now

which gives, in particular, A_11 = S_11^{−1} A'_11 S_11. It remains to observe that
P_1 A|_{M_1} = A_11 and P_2 A|_{M_2} = A'_11. □

The following property of coinvariant subspaces is analogous to the
property of A-invariant subspaces proved in Section 1.4.

Proposition 3.1.2
A subspace M is A coinvariant if and only if its orthogonal complement M^⊥ is
A* coinvariant.

Proof. Assume that M is A coinvariant, and let N be an A-invariant
direct complement to M in ℂ^n. Then M^⊥ ∩ N^⊥ = (M + N)^⊥ = (ℂ^n)^⊥ = {0},
and since dim M^⊥ + dim N^⊥ = (n − dim M) + (n − dim N) = n, we have
M^⊥ + N^⊥ = ℂ^n. As N^⊥ is A* invariant (see Section 1.4), it follows that M^⊥
is A* coinvariant. Conversely, if M^⊥ is A* coinvariant, then by the part of
this proposition already proved, the subspace (M^⊥)^⊥ = M is (A*)* coin-
variant, that is, M is A coinvariant. □

A subspace M ⊆ ℂ^n is called orthogonally coinvariant for the transfor-
mation A: ℂ^n → ℂ^n (in short, orthogonally A coinvariant) if the orthogonal
complement M^⊥ of M is A invariant.

Proposition 3.1.3
A subspace M is orthogonally A-coinvariant if and only if M is invariant for
the adjoint linear transformation A*.

Proof. Assume that M is orthogonally A coinvariant, so Ax ∈ M^⊥ for
every x ∈ M^⊥. Then we have

for all y ∈ M. But the left-hand side of (3.1.2) is just ⟨x, A*y⟩. Hence
A*y ∈ (M^⊥)^⊥ = M for all y ∈ M, and M is A* invariant. Reversing this
argument we find that if M is A* invariant, then Ax ∈ M^⊥ for every x ∈ M^⊥,
that is, M is orthogonally A coinvariant. □

We observe that, in general, A-coinvariant subspaces do not form a
lattice; that is, the sum and intersection of A-coinvariant subspaces need not
be A coinvariant. This is illustrated in the following example.

EXAMPLE 3.1.4. Let

The only A-invariant subspaces are {0}, Span{e_1}, Span{e_1, e_2}, ℂ^3. Con-
sequently, all A-coinvariant subspaces are as follows:

Indeed, assume Span{u, v} is a two-dimensional subspace for which
Span{e_1} is a direct complement. Writing u = (u_1, u_2, u_3), v = (v_1, v_2, v_3),
we see that det [[u_2, u_3], [v_2, v_3]] ≠ 0. Hence, replacing u and v with their
linear combinations if necessary, we see that

for some x, y ∈ ℂ. Now Span{e_2, e_3} and Span{e_2, (1, 0, 1)} are A-coin-
variant subspaces but their intersection (which is equal to Span{e_2}) is not.
Also, Span{e_3} and Span{(1, 0, 1)} are A-coinvariant subspaces but their
sum (which is equal to Span{e_3, (1, 0, 1)}) is not. □

In contrast, it follows immediately from Proposition 3.1.3 that the set of
all orthogonally A-coinvariant subspaces is a lattice. Note also the following
property of orthogonally coinvariant subspaces.

Proposition 3.1.4
Any transformation has a complete chain of orthogonally coinvariant sub-
spaces.

Proof. Let A: ℂ^n → ℂ^n be a transformation. As we have seen in Section
1.9, there is an orthonormal basis x_1, . . . , x_n for ℂ^n in which A has the
upper triangular form:

Clearly, the subspaces Span{x_k, . . . , x_n}, k = 1, . . . , n are orthogonally A
coinvariant and form a complete chain. □
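Numerically, such a basis is produced by the Schur decomposition. A minimal sketch (hypothetical data; SciPy assumed available):

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
T, Q = schur(A, output='complex')        # A = Q T Q*, T upper triangular
assert np.allclose(Q @ T @ Q.conj().T, A)
k = 2                                    # Span{q_1, ..., q_k} is A-invariant, so
B = Q[:, :k]                             # Span{q_{k+1}, ..., q_n} is orthogonally
img = A @ B                              # A-coinvariant
assert np.allclose(img - B @ (B.conj().T @ img), 0)
```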

3.2 REDUCING SUBSPACES

An invariant subspace L of a transformation A: ℂ^n → ℂ^n is called reducing
for A if L + M = ℂ^n (a direct sum) for some other A-invariant subspace M. In other
words, a subspace L ⊆ ℂ^n is reducing for A if it is simultaneously A
invariant and A coinvariant. In particular, {0} and ℂ^n are trivially reducing.
A more important example follows from Theorem 2.1.2. This shows that the
root subspaces R_{λ_0}(A) are reducing for A. A unicellular linear transfor-
mation is an example in which the only reducing subspaces are the trivial
ones {0} and ℂ^n. On the other hand, A = I is a linear transformation for
which every subspace in ℂ^n is invariant and reducing.
As a transformation on ℂ^n with only one Jordan block (i.e., a unicellular
transformation) has the smallest possible number of reducing subspaces, one
might expect that a transformation with the most Jordan blocks has the most
reducing subspaces. This is indeed so. Recall that a transformation is called
diagonable if its Jordan form is a diagonal matrix.

Theorem 3.2.1
If A is diagonable, then each invariant subspace of A is reducing. Conversely,
if each invariant subspace of A is reducing, then A is diagonable.

Proof. Assume that A is diagonable. Using Proposition 1.4.2, it is easily
seen that each invariant subspace of A is reducing if and only if the same is
true for S^{−1}AS, for any nonsingular matrix S. So we can assume that
A = diag[α_1 ⋯ α_n] for some α_1, . . . , α_n ∈ ℂ. Let λ_1, . . . , λ_p be all the
different numbers among the α_i values, and for notational convenience
assume that

where

are integers. Obviously, the eigenvalues of A are λ_1, . . . , λ_p, and the root
subspaces of A are

(by definition we put k_0 = 0). By Theorem 2.1.5 any A-invariant subspace M
has the form

where M_i ⊆ R_{λ_i}(A). Let N_i be any direct complement to M_i in R_{λ_i}(A). As
Ax = λ_i x for every x ∈ R_{λ_i}(A), the subspace N_i is obviously A invariant.
Hence the subspace N = N_1 + ⋯ + N_p, which is a direct complement to M
in ℂ^n, is also A invariant. This means, by definition, that M is reducing.
Conversely, assume that A is not diagonable. Let M be the A-invariant
subspace of A spanned by its eigenvectors. As A is not diagonable, M ≠ ℂ^n.
If N is any other A-invariant subspace and x is an eigenvector of A|_N, then x
is also an eigenvector of A, and thus x ∈ M. So M ∩ N ≠ {0} for every
nonzero A-invariant N. Consequently, M is not reducing. □

An important class of diagonable transformations A: ℂ^n → ℂ^n are those
that have n distinct eigenvalues λ_1, . . . , λ_n. Indeed, the corresponding
eigenvectors x_1, . . . , x_n are linearly independent (and, therefore, form a
basis in ℂ^n) because x_i ∈ R_{λ_i}(A) and the subspaces R_{λ_1}(A), . . . , R_{λ_n}(A)
form a direct sum. We have the following.

Corollary 3.2.2
If a transformation A: ℂ^n → ℂ^n has n distinct eigenvalues, then every A-
invariant subspace is reducing.

Consider now the situation in which an A-invariant subspace is reducing
and is orthogonal to its A-invariant complementary subspace. An invariant
subspace M of a transformation A: ℂ^n → ℂ^n is called orthogonally reducing
if its orthogonal complement M^⊥ is also A invariant.

Theorem 3.2.3
Every invariant subspace of A is orthogonally reducing if and only if A is
normal.

Proof. Recall first (Theorem 1.9.4) that A is normal if and only if there
is an orthonormal basis of eigenvectors x_1, . . . , x_n of A.
Assume that A is normal, and let x_1, . . . , x_n be an orthonormal basis of
eigenvectors of A that is ordered in such a way that x_1, . . . , x_{k_1}
correspond to the eigenvalue λ_1; x_{k_1+1}, . . . , x_{k_2} correspond to the
eigenvalue λ_2; . . . ; x_{k_{p−1}+1}, . . . , x_{k_p} correspond to the eigenvalue λ_p.
Here λ_1, . . . , λ_p are all the different eigenvalues of A. Arguing as in the
proof of Theorem 3.2.1, we see that any A-invariant subspace is of the form

where M_i ⊆ Span{x_{k_{i−1}+1}, . . . , x_{k_i}}, i = 1, . . . , p (by definition k_0 = 0), and its
orthogonal complement

in ℂ^n is also A invariant. Here M_i^⊥ is the orthogonal complement to M_i in
the space Span{x_{k_{i−1}+1}, . . . , x_{k_i}}.
Conversely, assume that every A-invariant subspace is orthogonally
reducing. In particular, every A-invariant subspace is reducing, and by
Theorem 3.2.1, A = diag[α_1, . . . , α_n] in a certain basis in ℂ^n.
Denoting by λ_1, . . . , λ_p all the different eigenvalues of A, it follows that
R_{λ_i}(A) is spanned by the eigenvectors of A corresponding to λ_i. Now for
each i_0, 1 ≤ i_0 ≤ p, the subspace R_{λ_{i_0}}(A) is the unique A-invariant subspace
that is a direct complement to Σ_{i≠i_0} R_{λ_i}(A) in ℂ^n. [This follows from the fact
that any A-invariant subspace M has the form M = Σ_{i=1}^p M ∩ R_{λ_i}(A).] The
orthogonal reducing property of R_{λ_{i_0}}(A) implies that the subspaces
R_{λ_1}(A), . . . , R_{λ_p}(A) are orthogonal to each other. Taking an orthonormal
basis in each R_{λ_i}(A) (which necessarily consists of eigenvectors of A
corresponding to λ_i), we obtain an orthonormal basis in ℂ^n in which A has a
diagonal form. Hence A is normal. □

The proof of Theorem 3.2.3 shows that if every A-invariant subspace is
reducing and every root subspace for A is orthogonally reducing, then every
A-invariant subspace is orthogonally reducing.
Note also the important special cases of Theorem 3.2.3: every invariant
subspace of a hermitian or unitary transformation is orthogonally reducing.
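For the hermitian case this is easy to illustrate numerically: a span of eigenvectors and its orthogonal complement are both invariant. A small sketch with hypothetical data:

```python
import numpy as np

rng = np.random.default_rng(2)
H = rng.standard_normal((4, 4))
H = H + H.T                             # hermitian (real symmetric)
w, V = np.linalg.eigh(H)                # orthonormal eigenvectors
M, Mperp = V[:, :2], V[:, 2:]           # M and its orthogonal complement
for B in (M, Mperp):                    # both subspaces are H-invariant
    img = H @ B
    assert np.allclose(img - B @ (B.T @ img), 0)
```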

3.3 SEMIINVARIANT SUBSPACES

A subspace M ⊆ ℂ^n is called semiinvariant for a transformation A: ℂ^n → ℂ^n
(or, in short, A semiinvariant) if there exists an A-invariant subspace N such
that N ∩ M = {0} and the sum M + N is again A invariant. By taking
N = {0} we see that any A-invariant subspace is also A semiinvariant.
If M is an A-coinvariant subspace, then there is an A-invariant direct
complement N to M in ℂ^n (so the conditions that N ∩ M = {0} and that
M + N is A invariant are automatically satisfied). Thus we see that any
A-coinvariant subspace is also A semiinvariant. In general, a subspace
M ⊆ ℂ^n is A semiinvariant if and only if M is A|_L-coinvariant for some
A-invariant subspace L containing M.

EXAMPLE 3.3.1. Let A be an n × n Jordan block. Then it is easily seen that
the subspaces Span{e_i, e_{i+1}, . . . , e_j}, where 1 ≤ i ≤ j ≤ n, are A semiin-
variant (but there are many other A-semiinvariant subspaces). This example
shows that in general there exist semiinvariant subspaces that are neither
invariant nor coinvariant. □

Consider now the A-semiinvariant subspace M, and let N be an A-
invariant subspace such that N ∩ M = {0} and M + N is A invariant. Then
we have a direct sum decomposition

where L is a direct complement to M + N in ℂ^n. To emphasize the fact that
this is a decomposition of ℂ^n into the sum of invariant, semiinvariant, and
coinvariant subspaces, respectively, we call equation (3.3.1) a triinvariant
decomposition associated with the A-semiinvariant subspace M. Triinvariant
decompositions play an important role in the applications of Chapters 5
and 7.

Note that in general a triinvariant decomposition associated with a given
M is not unique. With respect to the triinvariant decomposition (3.3.1), the
transformation A has the following 3 × 3 block form:

Here A_11: N → N, A_22: M → M, A_33: L → L, A_12: M → N, A_23: L → M,
A_13: L → N. The presence of zeros in (3.3.2) follows from the A invariance
of N and M + N (see Section 1.5). The converse is also true: if A is a
transformation from ℂ^n into ℂ^n, and A has the form (3.3.2) with respect to
some direct sum decomposition (3.3.1), then M is A semiinvariant, and the
A-invariant subspace N is such that M + N is A invariant as well.
In particular, it follows from the formula (3.3.2) that the spectrum of the
compression PA|_M (where P: M + N → M + N is the projector on M along
N) of A to its semiinvariant subspace M is contained in the spectrum of A.
We characterize A-semiinvariant subspaces in terms of functions of A,
as follows.

Theorem 3.3.1
Let A: ℂ^n → ℂ^n be a transformation. The following statements are equivalent
for a subspace M ⊆ ℂ^n: (a) M is semiinvariant for A; (b) for a suitable
projector P mapping ℂ^n onto M, we have

(c) for any function f(λ) such that f(A) is defined, we have

where P is a suitable projector with Im P = M.

In (b), PA^m|_M is understood as a transformation from M into M. Recall
that f(A) is certainly defined for a function f(λ) that is analytic on the
spectrum of A and, if A is diagonable, for any function f(λ) that is merely
defined on the spectrum of A. As the spectrum of PA|_M is contained in the
spectrum of A (provided M is A semiinvariant), it follows that f(PA|_M) is
well defined if f(λ) is analytic on the spectrum of A. We shall see in Section
4.1 that if A is diagonable, so is PA|_M (provided M is A semiinvariant), and
thus f(PA|_M) is well defined in the case when A is diagonable and f(λ) is
defined on the spectrum of A.

Proof. Assume that M is A semiinvariant, and write A as in (3.3.2)
with respect to the triinvariant decomposition (3.3.1). Let P be the projec-
tor on M along N + L. Then PA|_M = A_22. Now a straightforward calcula-
tion shows that

so that PA^m|_M = A_22^m = (PA|_M)^m.
Now assume that (b) holds. Let L be the smallest A-invariant subspace
containing M. (In other words, L is the intersection of all A-invariant
subspaces that contain M.) Equivalently, L is the span of all vectors of type
A^j x, where x ∈ M and j = 0, 1, . . . . In particular, L ⊇ M. Let Q be a
projector on L such that Ker Q ⊆ Ker P (e.g., take any direct complement
N' to L ∩ Ker P in Ker P, so that Ker P = N' + (L ∩ Ker P), and let Q be
the projector on L along N'). Then Im(I − Q) ⊆ Ker P or, equivalently,
P(I − Q) = 0, that is, PQ = P. As L ⊇ M, the equality QP = P obviously
holds. Now

so Q − P is a projector, and Im(Q − P) is a direct complement to M in L.
We shall prove that Im(Q − P) is A invariant, which shows that M is
semiinvariant for A. Clearly, QAQ = AQ (because Im Q = L is A in-
variant) and QAP = AP (because for every vector x ∈ Im P = M, the vector
Ax belongs to L and thus QAx = Ax). Let us show that

For every x ∈ M and for any j = 0, 1, 2, . . . we have

where we have used the property (b) twice. As the subspace L is spanned by
A^j x, x ∈ M, j = 0, 1, . . . , we conclude that PAPy = PAy for every y ∈ L,
which amounts to the equality PAPQ = PAQ, and (3.3.4) follows. Using the
equalities QAQ = AQ, QAP = AP, PAP = PAQ, we easily verify that
(Q − P)A(Q − P) = A(Q − P). This means that Im(Q − P) is A invariant.
Finally, let f(λ) be a function such that f(A) is defined. Then f(A) =
p(A), where p(λ) is a polynomial such that

where λ_1, . . . , λ_s are all the distinct eigenvalues of A, and m_k is the height
of λ_k (k = 1, . . . , s). Such a polynomial p(λ) always exists: for example, the
Lagrange-Sylvester interpolation polynomial, which is given by the formula

where

and ψ_k(λ) = (λ − λ_k)^{−m_k} Π_{i=1}^{s} (λ − λ_i)^{m_i}, k = 1, . . . , s [see, e.g., Chapter V
of Gantmacher (1959)]. As the eigenvalues of PA|_M are also eigenvalues of
A, and the height of λ_0 ∈ σ(PA|_M) does not exceed the height of λ_0 as an
eigenvalue of A (see Section 4.1), we obtain f(PA|_M) = p(PA|_M). Now
equality (3.3.3) follows from (b). Conversely, (c) obviously implies (b). □
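Statement (c) can be checked numerically for, say, f = exp. A minimal sketch with hypothetical block sizes, building A directly in the triangular form (3.3.2):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
n1, n2, n3 = 2, 2, 2                  # dim N, dim M, dim L
n = n1 + n2 + n3
A = rng.standard_normal((n, n))
A[n1:, :n1] = 0.0                     # N = Span{e_1, e_2} is A-invariant
A[n1 + n2:, :n1 + n2] = 0.0           # M + N is A-invariant, M = Span{e_3, e_4}
M = slice(n1, n1 + n2)
A22 = A[M, M]                         # A22 = PA|_M, P the projector along N + L
assert np.allclose(expm(A)[M, M], expm(A22))   # f(PA|_M) = P f(A)|_M for f = exp
```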

Given an A-semiinvariant subspace M with an associated triinvariant
decomposition ℂ^n = N + M + L, the proof of Theorem 3.3.1 shows that (b)
holds with P being the projector on M along N + L. And conversely, if a
projector P satisfies (b), then Ker P = N + L, where N and L are A-
invariant and A-coinvariant subspaces, respectively, taken from some tri-
invariant decomposition associated with M.
Extending the notion of orthogonally coinvariant subspaces, we introduce
the notion of orthogonally semiinvariant subspaces as follows. A subspace
M ⊆ ℂ^n is called orthogonally semiinvariant for a transformation A: ℂ^n → ℂ^n
if there exists an A-invariant subspace N such that M + N is again A
invariant and M is the orthogonal complement to N in M + N. Clearly, an
orthogonally semiinvariant subspace is semiinvariant. For an orthogonally
A-semiinvariant subspace M there exists an orthogonal decomposition

where L = (M + N)^⊥. Decomposition (3.3.5) will be called an orthogonal
triinvariant decomposition associated with M. Again, for a given M there are
generally many associated orthogonal triinvariant decompositions. (The
extreme case of this situation appears for A = 0.)
Consider the orthogonal triinvariant decomposition (3.3.5), and choose
orthonormal bases in N, M, and L. Then we represent A as the 3 × 3 block
matrix

in the orthonormal basis for ℂ^n obtained by putting together the ortho-
normal bases in N, M, and L. As the representation (3.3.6) is in an ortho-
normal basis, we have

This leads to the following conclusion.

Proposition 3.3.2
An orthogonally A-semiinvariant subspace is also orthogonally A* semiin-
variant.

Indeed, if equation (3.3.5) holds, then L is A* invariant, and M is the
orthogonal complement to L in the A*-invariant subspace N^⊥ = M ⊕ L.
An analog of Theorem 3.3.1 holds for orthogonally semiinvariant sub-
spaces.

Theorem 3.3.3
The following statements are equivalent for a transformation A: ℂ^n → ℂ^n and
a subspace M ⊆ ℂ^n: (a) M is orthogonally semiinvariant for A; (b) we have

where P_M is the orthogonal projector on M; (c) for any function f(λ) such
that f(A) is defined we have

The proof is like the proof of Theorem 3.3.1, with the only difference
that an orthogonal triinvariant decomposition is used and the projector Q is
taken to be orthogonal.

3.4 SPECIAL CLASSES OF TRANSFORMATIONS

In this section we shall describe coinvariant and semiinvariant subspaces for
certain classes of transformations. We start with the relatively simple case of
unicellular transformations.

Proposition 3.4.1
Let A: ℂ^n → ℂ^n be a unicellular transformation that is represented as a
Jordan block in some basis x_1, . . . , x_n. Then a k-dimensional subspace
M ⊆ ℂ^n is A-coinvariant if and only if M is spanned by a set of vectors
y_1, . . . , y_k with the property that x_1, . . . , x_{n−k}, y_1, . . . , y_k is a basis in ℂ^n.
A k-dimensional subspace M is A semiinvariant if and only if M =
Span{y_1, . . . , y_k}, where the vectors y_1, . . . , y_k are such that, for some
index l with k ≤ l ≤ n, we have y_i ∈ Span{x_1, . . . , x_l}, i = 1, . . . , k, and
x_1, . . . , x_{l−k}, y_1, . . . , y_k is a basis in Span{x_1, . . . , x_l}.

The proof follows easily from the definitions of coinvariant and semi-
invariant subspaces and from the fact that the only A-invariant subspaces
are {0} and Span{x_1, . . . , x_l}, l = 1, . . . , n.
Consider now a diagonable transformation A: ℂ^n → ℂ^n, so that A =
diag[λ_1, . . . , λ_n] in some basis in ℂ^n. As we have seen in Example 3.1.2, if all
λ_i are different, then every subspace in ℂ^n is A coinvariant and hence also A
semiinvariant. In fact, this conclusion holds for any diagonable transfor-
mation (not necessarily with all eigenvalues distinct). Indeed, consider the
transformation B given by the matrix diag[μ_1, . . . , μ_n] with different μ_i
values in the same basis in which A is given by diag[λ_1, . . . , λ_n]. As every
B-invariant subspace is also A invariant, it follows that every B-coinvariant
subspace is also A coinvariant. But we have already seen that every
subspace is B-coinvariant.
We consider now the orthogonally coinvariant and semiinvariant sub-
spaces. We say that a transformation A: ℂ^n → ℂ^n is orthogonally unicellular
if there exists a Jordan chain x_1, . . . , x_n of A such that the vectors
x_1, . . . , x_n form an orthogonal basis in ℂ^n. Clearly, any orthogonally
unicellular transformation is unicellular.

Proposition 3.4.2
Let A: ℂ^n → ℂ^n be an orthogonally unicellular transformation, and let
x_1, . . . , x_n be its orthogonal Jordan chain. Then the only orthogonally
A-coinvariant subspaces are Span{x_k, x_{k+1}, . . . , x_n}, k = 1, . . . , n, and {0}.
The only orthogonally A-semiinvariant subspaces are Span{x_k, . . . , x_l},
1 ≤ k ≤ l ≤ n, and {0}.

Again, Proposition 3.4.2 follows from the description of all A-invariant
subspaces.
Consider a normal transformation A: ℂ^n → ℂ^n: AA* = A*A. By
Theorem 1.9.4, A has an orthonormal basis of eigenvectors (and conversely,
if a transformation has an orthonormal basis of eigenvectors, it is normal). It
turns out that normal transformations are exactly those for which the classes
of invariant subspaces and of orthogonally semiinvariant subspaces coincide.

Theorem 3.4.3
The following statements are equivalent for a transformation: (a) A is
normal; (b) every A-invariant subspace is orthogonally A coinvariant; (c)
every orthogonally A-coinvariant subspace is A invariant; (d) every or-
thogonally A-semiinvariant subspace is A invariant.

Proof. Obviously, (d) implies (c). Assume that A is normal, and let
λ_1, . . . , λ_k be all the different eigenvalues of A. Then

is an orthogonal sum, and A|_{R_{λ_i}(A)} = λ_i I. Let M be an orthogonally A-
semiinvariant subspace, so that M is the orthogonal complement to an
A-invariant subspace N in another A-invariant subspace L. We have

where N_i ⊆ L_i ⊆ R_{λ_i}(A), i = 1, . . . , k. Denoting by M_i the orthogonal com-
plement of N_i in L_i, the definition of M implies that

It follows that M is A invariant. So (a) implies (d). One sees easily that (a)
implies (b) also.
It remains to show that (c) ⇒ (a) and (b) ⇒ (a). Assume (c) holds; that is
(cf. Proposition 3.1.3), every A*-invariant subspace is A-invariant. Write A*
in an upper triangular form with respect to some orthonormal basis:

As Span{x_1, . . . , x_k}, k = 1, . . . , n are A*-invariant subspaces, they are
also A invariant. Hence (Proposition 1.8.4) A also has an upper triangular
form in the same basis:

On the other hand, equality (3.4.1) implies


Comparison of (3.4.2) and (3.4.3) reveals that b_{ij} = 0 for i < j, and A is
normal.
Assume now that (b) holds, and write

in some orthonormal basis x_1, . . . , x_n in ℂ^n. The subspaces
Span{x_1, . . . , x_k}, k = 1, . . . , n are A invariant and, by (b), orthogonally A
coinvariant. Hence Span{x_{k+1}, . . . , x_n}, k = 1, . . . , n − 1 are A-invariant
subspaces, which means that A has a lower triangular form

Comparing equations (3.4.4) and (3.4.5), we find that A is normal. □

As a corollary of Theorem 3.4.3 we obtain the following characterization
of a normal transformation in terms of its invariant subspaces.

Corollary 3.4.4
A transformation A: ℂ^n → ℂ^n is normal if and only if a subspace M is A
invariant exactly when its orthogonal complement is A invariant.

Indeed, it follows from the definition that the subspace M^⊥ is A invariant
if and only if M is orthogonally A coinvariant.

3.5 EXERCISES

3.1 Prove that, in Example 3.1.2, there is a unique A-invariant direct
complement to the A-coinvariant subspace M if and only if M itself is
A invariant.
3.2 Prove that a subspace M is A coinvariant (resp. A semiinvariant) if
and only if M is (αA + βI) coinvariant [resp. (αA + βI) semiin-
variant]. Here α, β are complex numbers and α ≠ 0.

3.3 Show that a subspace M is A coinvariant (resp. A semiinvariant) if
and only if SM is SAS^{−1} coinvariant (resp. SAS^{−1} semiinvariant),
where S is an invertible transformation.

3.4 Let A: ℂ^n → ℂ^n (n ≥ 3) be a unicellular transformation. Give an
example of a subspace M ⊆ ℂ^n that is not A semiinvariant. List
all such subspaces when n = 3.

3.5 Show that every subspace in ℂ^n is A coinvariant if and only if A is
diagonable (i.e., it is similar to a diagonal matrix).

3.6 Prove that every subspace in ℂ^n is coinvariant for any n × n circulant
matrix.

3.7 Give an example of a nondiagonable transformation A: ℂ^n → ℂ^n such
that every subspace in ℂ^n is A semiinvariant.

3.8 Find all the coinvariant subspaces for the matrices

3.9 Find all coinvariant and semiinvariant subspaces for the matrix

3.10 Prove that every reducing A-invariant subspace is reducing also for
f(A), where f(λ) is any function such that f(A) is defined. Is the
converse true?

3.11 If J is a Jordan block, for which positive integers k does the matrix J^k
have a nontrivial reducing invariant subspace? Is the reducing sub-
space unique?

3.12 Prove that an A-invariant subspace M is reducing if and only if
M ∩ R_{λ_0}(A) is reducing for every eigenvalue λ_0 of A.

3.13 Find all the triinvariant decompositions ℂ^3 = N + M + L with
dim N = dim M = dim L = 1 for the following matrices:
Chapter Four

Jordan Forms
for Extensions
and Completions

Consider a transformation A: ℂ^n → ℂ^n and an A-coinvariant subspace M.
Thus there is an A-invariant subspace N such that ℂ^n = M + N, and there is
a projector P onto M along N. The main problem of this chapter is: given
Jordan normal forms for A|_N and PA|_M, what are the possible Jordan forms
for A itself? In general, this problem is open. Here we present partial
results and important inequalities.

4.1 EXTENSIONS FROM AN INVARIANT SUBSPACE

Let M ⊆ ℂ^n be a subspace, and consider a transformation A_0: M → M. A
linear transformation A: ℂ^n → ℂ^n is called an extension of A_0 if Ax = A_0 x
for every x ∈ M. Then, in particular, M is A invariant. Also, A_0 is called the
restriction of A to M. We are interested in the Jordan form (or, equivalently,
the partial multiplicities) of A_0 and its extensions.
We start with a relatively simple but important case in which A_0 as well as
its extension A are in the Jordan form and have special spectral properties.
These spectral properties ensure that the partial multiplicities corresponding
to a particular eigenvalue λ_0 are the same for A_0 and its extension A.

Theorem 4.1.1
Let J_1 and J_2 be matrices in Jordan normal form with sizes p × p and q × q,
respectively. Let B be a p × q matrix and

Denote by J_10 and J_20 the Jordan submatrices of J_1 and J_2, respectively,
formed by those Jordan blocks with the same eigenvalue λ_0.
Then the partial multiplicities of J corresponding to λ_0 coincide with the
partial multiplicities of the submatrix

of J, where B_0 is the submatrix of J formed by the rows that belong to the
rows of J_10 and by the columns that belong to the columns of J_20 (so actually
B_0 is a submatrix of B).

Theorem 4.1.1 is used later to reduce problems concerning the Jordan
form of an extension to the case when the transformations involved have
only one eigenvalue. The proof of Theorem 4.1.1 is based on two lemmas,
which are also independently important.

Lemma 4.1.2
Let A, B, C be given matrices of sizes n × n, m × m, and n × m, respectively.
Consider the equation

where X is an n × m matrix to be found. Equation (4.1.1) has a unique
solution X for every C if and only if σ(A) ∩ σ(B) = ∅.

This lemma follows immediately from the fact that, for the linear
transformation L: ℂ^{n×m} → ℂ^{n×m} defined by L(X) = AX − XB, σ(L) =
{λ − μ | λ ∈ σ(A) and μ ∈ σ(B)}. [See Chapter 12 of Lancaster and Tis-
menetsky (1985), for example.] Here we give a direct proof based on the
Jordan decompositions of A and B.
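Equation (4.1.1) is the Sylvester equation, for which standard numerical solvers exist. A quick sketch with hypothetical data (SciPy's routine solves AX + XB = Q, so −B is passed):

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(4)
A = np.diag([1.0, 2.0])                   # σ(A) = {1, 2}
B = np.diag([3.0, 4.0, 5.0])              # σ(B) = {3, 4, 5}, disjoint from σ(A)
C = rng.standard_normal((2, 3))
X = solve_sylvester(A, -B, C)             # unique solution of AX - XB = C
assert np.allclose(A @ X - X @ B, C)
```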

Proof. Equation (4.1.1) may be regarded as a system of linear equa-
tions in the nm variables x_{ij} (i = 1, . . . , n; j = 1, . . . , m) that form the entries
in the matrix X. Thus it is sufficient to prove that the homogeneous equation

has only the trivial solution X = 0 if and only if σ(A) ∩ σ(B) = ∅.
Let J_A and J_B be the Jordan forms of A and B, respectively; so
A = S_A J_A S_A^{−1}, B = S_B J_B S_B^{−1} for some invertible matrices S_A and S_B. It
follows that X is a solution of (4.1.2) if and only if Z = S_A^{−1} X S_B is a
solution of

Thus we can restrict ourselves to equation (4.1.3). Let us write down J_A and
J_B explicitly:

where J_{A,i} (resp. J_{B,j}) is a Jordan block of size m_{A,i} (resp. m_{B,j}) with
eigenvalue λ_{A,i} (resp. λ_{B,j}). The matrix Z from (4.1.3) is decomposed into
blocks accordingly:

where Z_{ij} is of size m_{A,i} × m_{B,j}.
Suppose first that σ(A) ∩ σ(B) ≠ ∅. Without loss of generality we can
assume that λ_{A,1} = λ_{B,1}. Then we can construct a nonzero solution Z of
equation (4.1.3) as follows. In the representation (4.1.4) put
Z_{ij} = 0, except for the case i = j = 1; and let

(according as m_{A,1} ≥ m_{B,1} or m_{A,1} < m_{B,1}). Direct examination shows that
such a matrix Z satisfies (4.1.3).
Suppose now that σ(A) ∩ σ(B) = ∅. Let Z be given by (4.1.4) and
suppose that Z satisfies (4.1.3). We have to prove that Z = 0.
Equation (4.1.3) means that

Write

where H and G are the nilpotent matrices [i.e., σ(H) = σ(G) = {0}] having
1 on the first superdiagonal and zeros elsewhere. Rewrite equation (4.1.5) in
the form

Multiply the left-hand side by λ_{A,i} − λ_{B,j}, and in each term on the right-hand
side replace (λ_{A,i} − λ_{B,j}) Z_{ij} by Z_{ij} G − H Z_{ij}. We obtain

Repeating this process, we obtain for every p = 1, 2, . . .

Choose p large enough so that either H^q = 0 or G^{p−q} = 0 for every
q = 0, . . . , p. Then the right-hand side of equation (4.1.6) is zero, and since
λ_{A,i} ≠ λ_{B,j}, we find that Z_{ij} = 0. Thus Z = 0. □
^ ,- ^ A B r we find that
Lemma 4.1.3
If A and B are n × n and m × m matrices, respectively, with σ(A) ∩ σ(B) =
∅, then for every n × m matrix C the (m + n) × (m + n) matrices

are similar.

Proof. By Lemma 4.1.2, for every n × m matrix C there is a unique
n × m matrix X such that AX − XB = −C. With this X, one verifies that

As

the lemma follows. □
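The similarity is also easy to exhibit numerically; a sketch with hypothetical data, using the X supplied by Lemma 4.1.2 and S = [[I, X], [0, I]]:

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(5)
A = np.diag([1.0, 2.0])
B = np.diag([3.0, 4.0])
C = rng.standard_normal((2, 2))
X = solve_sylvester(A, -B, -C)            # AX - XB = -C
n, m = A.shape[0], B.shape[0]
S = np.block([[np.eye(n), X], [np.zeros((m, n)), np.eye(m)]])
T1 = np.block([[A, C], [np.zeros((m, n)), B]])
T2 = np.block([[A, np.zeros((n, m))], [np.zeros((m, n)), B]])
assert np.allclose(np.linalg.solve(S, T1 @ S), T2)   # S^{-1} T1 S = T2
```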

Proof of Theorem 4.1.1. For notational simplicity assume that

where J_11 (resp. J_21) are the Jordan blocks from J_1 (resp. J_2) with eigen-
values different from λ_0, and B_{ij} are the corresponding submatrices in J.
Applying Lemma 4.1.3 twice, we see that J is similar to

which after interchanging the second and third block rows and columns (this
is a similarity operation) becomes

It remains to apply Lemma 4.1.3 once more to prove that J is similar to

It is convenient to describe the partial multiplicities of a transformation
A: ℂ^n → ℂ^n at an eigenvalue λ_0 as a nonincreasing sequence of nonnegative
integers α_1(A; λ_0) ≥ α_2(A; λ_0) ≥ α_3(A; λ_0) ≥ ⋯, where the nonzero mem-
bers of this sequence are exactly the partial multiplicities of A at λ_0. In
particular, not more than n of the numbers α_i(A; λ_0) are different from
zero. Also, if λ_0 is not an eigenvalue of A, we define α_i(A; λ_0) = 0 for
i = 1, 2, . . . . Thus the nonnegative integers α_i(A; λ_0) are defined for all
λ_0 ∈ ℂ, and we have

The following result describes the connections between the partial multi-
plicities of a transformation and those of its extension.
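As an aside, the numbers α_i(A; λ_0) are computable from kernel dimensions: the number of Jordan blocks at λ_0 of size at least j is dim Ker(A − λ_0 I)^j − dim Ker(A − λ_0 I)^{j−1}, and the α_i form the conjugate partition. A sketch (hypothetical helper, tolerance-based numerical ranks):

```python
import numpy as np

def partial_multiplicities(A, lam0, tol=1e-9):
    n = A.shape[0]
    N = A - lam0 * np.eye(n)
    kdim, P = [0], np.eye(n)
    for _ in range(n):
        P = P @ N
        kdim.append(n - np.linalg.matrix_rank(P, tol=tol))
    counts = [kdim[j] - kdim[j - 1] for j in range(1, n + 1)]  # blocks of size >= j
    return [sum(1 for c in counts if c >= i) for i in range(1, counts[0] + 1)]

J = np.diag([1.0, 0.0], 1)                 # J_2(0) ⊕ J_1(0) as a 3 x 3 matrix
assert partial_multiplicities(J, 0.0) == [2, 1]
```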

Theorem 4.1.4
Let M ⊆ ℂ^n be a subspace and let A_0: M → M be a transformation. Then for
every extension A: ℂ^n → ℂ^n of A_0 we have

for every λ_0 ∈ ℂ. Conversely, let β_1 ≥ β_2 ≥ ⋯ be a nonincreasing sequence of
nonnegative integers such that

and

for a fixed complex number λ_0. Then there is an extension A of A_0 such that
α_j(A; λ_0) = β_j, j = 1, 2, . . . .
Proof. We prove (4.1.7) for an extension A of A_0. In view of Theorem
4.1.1, we may restrict ourselves to the case when σ(A) = {λ_0}. (Indeed,
without loss of generality it can be assumed that A_0 is in the Jordan form.
Furthermore, the transformation PA|_N: N → N, where N is a direct comple-
ment to M and P is the projector on N along M, may also be assumed to
have Jordan normal form.) There exists a chain of A-invariant subspaces

where dim M_i = m + i, i = 0, 1, . . . , n − m (so m = dim M). This can be
seen by considering the transformation Ā: ℂ^n/M → ℂ^n/M induced by A and
using the existence of a complete chain of Ā-invariant subspaces.
In view of the chain (4.1.10), and using induction on the index i of M_i, it
will suffice to prove inequalities (4.1.7) for the case dim M = n − 1. Writing
A_0 in a basis for M in which A_0 has a Jordan form, we can assume

where J = J_{k_1}(λ_0) ⊕ ⋯ ⊕ J_{k_p}(λ_0), k_1 ≥ ⋯ ≥ k_p, is the Jordan form of A_0
and B is an (n − 1)-dimensional vector.
Let j be the first index (1 ≤ j ≤ p) for which the (k_1 + k_2 + ⋯ + k_j)th
coordinate of B is nonzero (if such a j exists). Let S be the (n − 1) × (n − 1)
matrix

where Q_m is the k_m × k_j matrix of the form [0 I_{k_m}] and a_{j+1}, . . . , a_p are
complex numbers chosen so that the (k_1 + k_2 + ⋯ + k_m)th coordinates of
SB are zeros for m = j + 1, . . . , p. If all coordinates k_1, k_1 + k_2, . . . ,
k_1 + ⋯ + k_p of B are zeros, put S = I_{n−1}. It is easy to see that SJ = JS and S
is nonsingular. Moreover, the k_1th, (k_1 + k_2)th, . . . , (k_1 + k_2 + ⋯ + k_p)th
coordinates of SB are all zero except for at most one of them. Further, let X
be an (n − 1)-dimensional vector such that the nonzero coordinates of the
vector

can appear only in the places k_1, k_1 + k_2, . . . , k_1 + k_2 + ⋯ + k_p (this is
possible because

).
Now a computation shows that

As

is the inverse of

it follows that

and

have the same partial multiplicities. Now the partial multiplicities
are easy to discover: they are k_1, . . . , k_p, 1 if Y = 0, and
k_1, . . . , k_{j−1}, k_j + 1, k_{j+1}, . . . , k_p if Y ≠ 0 and the nonzero coordinate of Y
(by construction of Y there is exactly one) appears in the place k_1 + ⋯ + k_j.
So the inequalities (4.1.7) are satisfied. If B = 0, then (4.1.7) is obviously
satisfied.
Now let β_i be a sequence with the properties described in the theorem.
Let x_1, . . . , x_k be a basis in M in which A_0 has the Jordan form. We
assume also that the first p Jordan blocks in the Jordan form have
eigenvalue λ_0 and sizes α_1(A_0; λ_0), . . . , α_p(A_0; λ_0), respectively. (Here,
α_1(A_0; λ_0), . . . , α_p(A_0; λ_0) are all the nonzero integers in the sequence
{α_j(A_0; λ_0)}_{j=1}^∞.) So in the basis x_1, . . . , x_k we have

where λ_1, . . . , λ_u are different from λ_0, and α_j = α_j(A_0; λ_0). Now let
y_1, . . . , y_{n−k} be vectors in ℂ^n such that x_1, . . . , x_k, y_1, . . . , y_{n−k} is a basis
in ℂ^n. Put

where s = Σ_{i=1}^q β_i, r = Σ_{i=1}^q (β_i − α_i), and q is the number of positive β_i
values. Further, setting t = Σ_{i=1}^q α_i, put

Now let A: ℂ^n → ℂ^n be a transformation that is given in the basis z_1, . . . , z_n
by the matrix

where J is any (n − k − r) × (n − k − r) matrix in the Jordan form with the
property that λ_0 is not an eigenvalue of J. From the construction of A it is
clear that β_1, . . . , β_q are the partial multiplicities of A corresponding to λ_0
and that A is an extension of A_0. □

In particular, the theorem shows that if A is diagonable, then so is the
restriction of A to any A-invariant subspace.
For coinvariant subspaces the notions of coextension and corestriction
become natural. Let M ⊆ ℂ^n be a subspace, and let A_0: M → M be a linear
transformation. A transformation A: ℂ^n → ℂ^n is called a coextension of A_0 if
there exists an A-invariant direct complement N to M in ℂ^n such that
PA|_M = A_0, where P is the projector on M along N. Clearly, in this case M
is an A-coinvariant subspace. There is a connection between the partial
multiplicities of a transformation and those of a coextension of the kind
described in Theorem 4.1.4.

Theorem 4.1.5
Let M ⊆ ℂ^n be a subspace and A_0: M → M be a transformation. Then for
every coextension A of A_0 we have α_j(A; λ_0) ≥ α_j(A_0; λ_0), j = 1, 2, . . . for
every λ_0 ∈ ℂ. Conversely, let β_1 ≥ β_2 ≥ ⋯ be a nonincreasing sequence of
nonnegative integers such that equations (4.1.8) and (4.1.9) hold. Then there
is a coextension A of A_0 such that α_j(A; λ_0) = β_j, j = 1, 2, . . . .

The proof of Theorem 4.1.5 is similar to the proof of Theorem 4.1.4.
Given a transformation A_0: M → M, where M ⊆ ℂ^n, we say that a
transformation A: ℂ^n → ℂ^n is a dilation of A_0 if there exists an A-invariant
subspace N for which N ∩ M = {0}, M + N is A invariant as well, and
PA|_M = A_0, where P is some projector on M with N ⊆ Ker P. (The term
"semiextension" would be more logical in the context of our terminology;
however, "dilation" is widely used in the literature.) In this case M is an
A-semiinvariant subspace and A_0 is the reduction of A (again, the term
"semirestriction" would be consistent with our terminology, but "reduction"
is already widely used). Thus there is a subspace L of ℂ^n for which the
decomposition (3.3.1) holds, and this decomposition determines a triangular
representation such as (3.3.2) for A in which A_22 = A_0. A result similar to
Theorems 4.1.4 and 4.1.5 also holds for dilations, and it can be proved by
first applying one of these theorems and then applying the second. In
particular, if A is diagonable, so is any reduction of A.

4.2 COMPLETIONS FROM A PAIR OF INVARIANT AND
COINVARIANT SUBSPACES

Let A: M → M and B: N → N be transformations, where M and N are
subspaces in ℂ^n that are direct complements to each other. A transfor-
mation C: ℂ^n → ℂ^n is called a completion of A and B if M is C invariant and
C|_M = A, PC|_N = B, where P is the projector on N along M. So with respect
to the direct sum decomposition ℂ^n = M + N, C has the form

for some matrix D.


Let al > «2 > • • • (resp. j8, s: /32 > • • •) be a sequence of nonnegative
integers whose nonzero elements are exactly the partial multiplicities of A
(resp. B) corresponding to a fixed point A0 E (p. Assuming that C is a
completion of A and B, let y, > y2 > • • • be a sequence of nonnegative
integers such that the nonzero y. values are the partial multiplicities of C at
A 0 . In this section we study the connections between ait @f, and y,. In view
of Theorem 4.1.1, these connections describe the Jordan form of C in terms
of the Jordan forms of A and B.
Some such connections are easily seen. We have

for every A G (p. Now the algebraic multiplicity of an eigenvalue A0 of a


matrix X coincides with the multiplicity of A0 as a zero of the polynomial
det(A^ - A/). (When A0 is not an eigenvalue of X this statement is also true if
we accept the convention that, in this case, the algebraic multiplicity of A0 is
zero.) It follows from equation (4.2.2) that the algebraic multiplicity* of C
at A0 is equal to the sum of the algebraic multiplicities of A and B at A 0 . In
other words

Further, as C is an extension of A and a coextension of B, Theorems 4.1.4


and 4.1.5 imply that

The following inequality between {a,.}".,, {ty},".,, and {yj}^=l is deeper.

Proposition 4.2.1
Let C be a completion of A and B, with the partial multiplicities of A, B, and
C at a fixed λ_0 ∈ ℂ given by the nonincreasing sequences of nonnegative
integers {α_i}_{i=1}^∞, {β_i}_{i=1}^∞, and {γ_i}_{i=1}^∞, respectively. Then

*It is convenient here to talk about the "algebraic multiplicity of C at λ_0" rather than the
"algebraic multiplicity of λ_0" as an eigenvalue of C.

As usual in this book, the symbol Ω^# represents the number of different
elements in a finite set Ω.

Proof. First we prove the following inequalities:

Indeed, for every ε ≠ 0 we have [using formula (4.2.1)]

and thus

Fix some l, and let

So there exists an m × m nonsingular submatrix Q in (A − λ_0 I)^l ⊕
(B − λ_0 I)^l. Consider the m × m submatrix Q(ε) of

which is formed by the same rows and columns as Q itself. Now Q(ε) is as
close as we wish to Q provided ε is sufficiently close to 0. Take ε so small
that the matrix Q(ε) is also nonsingular. For such an ε

Comparing with (4.2.7), we obtain the desired inequality (4.2.6). Now use
Proposition 2.2.6 to obtain the inequalities (4.2.5). □
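A numerical sanity check of the rank inequality (4.2.6), with hypothetical data of the completion form (4.2.1):

```python
import numpy as np

rng = np.random.default_rng(6)
A = np.diag([0.0, 0.0, 1.0])
B = np.diag([0.0, 2.0])
D = rng.standard_normal((3, 2))
C = np.block([[A, D], [np.zeros((2, 3)), B]])    # a completion of A and B
lam0 = 0.0
for l in range(1, 4):
    rC = np.linalg.matrix_rank(np.linalg.matrix_power(C - lam0 * np.eye(5), l))
    rA = np.linalg.matrix_rank(np.linalg.matrix_power(A - lam0 * np.eye(3), l))
    rB = np.linalg.matrix_rank(np.linalg.matrix_power(B - lam0 * np.eye(2), l))
    assert rC >= rA + rB
```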

In connection with inequalities (4.2.5), note that

Indeed, as {k | γ_k ≥ j}^# = 0 for j > γ_1, and similarly for {α_k}_{k=1}^∞ and
{β_k}_{k=1}^∞, all the sums in equation (4.2.8) are finite, so (4.2.8) makes sense.
Further, for any nonincreasing sequence of nonnegative integers {δ_i}_{i=1}^∞ with
finite sum Σ_{i=1}^∞ δ_i we have

The easiest way to verify (4.2.9) is by representing each nonzero δ_i as the
rectangle with height δ_i and width 1 and putting these rectangles one next to
another. The result is a ladderlike figure Φ. For instance, if δ_1 = 5, δ_2 = δ_3 =
4, δ_4 = 1, δ_j = 0 for j > 4, then Φ is the corresponding staircase figure.
Obviously, the area of Φ is just the left-hand side of equation (4.2.9). On
the other hand, the right-hand side of (4.2.9) is also the area of Φ calculated
by the rows of Φ (indeed, {k | δ_k ≥ i}^# is the area of the ith row in Φ
counting from the bottom); hence equality holds in (4.2.9). Now appeal to
(4.2.3), and (4.2.8) follows.
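The identity (4.2.9) is the familiar fact that a partition and its conjugate have equal sums; a one-line check of the δ = (5, 4, 4, 1) example above:

```python
delta = [5, 4, 4, 1]
lhs = sum(delta)                                 # left-hand side of (4.2.9)
rhs = sum(sum(1 for d in delta if d >= i)        # row areas {k | δ_k >= i}^#
          for i in range(1, max(delta) + 1))
assert lhs == rhs == 14
```

We need a completely different line of argument to prove the following proposition.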

Proposition 4.2.2
With {α_i}_{i=1}^∞, {β_i}_{i=1}^∞, and {γ_i}_{i=1}^∞ as in Proposition 4.2.1, we have

Proof. Assuming that C is given by (4.2.1), one easily obtains

Using Theorem A.4.3 of the appendix, pick a p × p submatrix C_0(λ) in
C − λI such that λ_0 is a zero of det C_0(λ) of multiplicity γ_n + ⋯ + γ_{n−p+1}

(here n × n is the size of C). The integer p is assumed to be greater than
max(n_A, n_B), where n_A × n_A is the size of A and n_B × n_B is the size of B (so
n = n_A + n_B). By the Binet-Cauchy formula (Theorem A.2.1 of the appen-
dix) we have

where the factors are p × p submatrices of

and , respectively, and the summation is taken over certain
triples i, j, k. Note that det B_i(λ) = 0 unless B_i(λ) is of the form I_s ⊕ B̃_i(λ),
where B̃_i(λ) is a (p − s) × (p − s) submatrix of B − λI (here s is an integer
that may depend on i and for which 0 ≤ s ≤ n_A). Similarly, det A_k(λ) = 0
unless A_k(λ) is of the form I_t ⊕ Ã_k(λ), where Ã_k(λ) is a (p − t) × (p − t)
submatrix of A − λI (0 ≤ t ≤ n_B). Taking these observations into account,
rewrite equation (4.2.11) as follows:

Now the size of B̃_i(λ) is at least (p − n_A) × (p − n_A), so by the same
theorem, Theorem A.4.3, the multiplicity of λ_0 as a zero of det B̃_i(λ) is at
least

(here we use n_B + n_A = n and β_j = 0 for j > n_B). Similarly, the multiplicity
of λ_0 as a zero of det Ã_k(λ) is at least α_n + α_{n−1} + ⋯ + α_{n−p+1}. We find
that the multiplicity Σ_{j=n−p+1}^n γ_j of λ_0 as a zero of det C_0(λ) is at least
Σ_{j=n−p+1}^n (α_j + β_j). It follows from equation (4.2.3) that

If it happens that p < n_A, then the inequality

and hence also the relation (4.2.12), follows from (4.2.4) because in this
case β_j = 0 for j > n − p + 1. Similarly, (4.2.12) holds for p < n_B. We have
proved (4.2.10) for m = 1, . . . , n. For m ≥ n the inequality (4.2.10) coin-
cides with (4.2.3), so the proof of (4.2.10) is complete. □

We have proved various inequalities and equalities relating the sequences
{α_i}_{i=1}^∞, {β_i}_{i=1}^∞, and {γ_i}_{i=1}^∞ [relations (4.2.3), (4.2.5), (4.2.8), (4.2.10)].
These relations are by no means the only connections between these
sequences. More specifically, there exist nonincreasing sequences of non-
negative integers {α_i}_{i=1}^∞, {β_i}_{i=1}^∞, and {γ_i}_{i=1}^∞, only a finite number of them
nonzero, that satisfy equations (4.2.3), (4.2.5), (4.2.8), and (4.2.10), but for
which there is no completion C of A and B with the property that for some
λ_0 ∈ ℂ the sequences {α_i}_{i=1}^∞, {β_i}_{i=1}^∞, and {γ_i}_{i=1}^∞ give the partial multi-
plicities of A, B, and C, respectively, corresponding to λ_0. In the next
section we see more general inequalities, but even they do not completely
describe the connections between the partial multiplicities of completions of
A and B and the partial multiplicities of A and B. The problem of describing
all such connections is open.

4.3 THE SIGAL INEQUALITIES

The main result in this section is the following generalization of Proposition
4.2.2.

Theorem 4.3.1
Let {α_i}_{i=1}^∞, {β_i}_{i=1}^∞, and {γ_i}_{i=1}^∞ be as in Proposition 4.2.1. Then for every
sequence r_1 < r_2 < ⋯ < r_m of positive integers we have

and

Proposition 4.2.2 is obtained from this theorem by putting r_j = j, j =
1, . . . , m. It will be convenient to prove a lemma (which is actually a
particular case of Theorem 4.3.1) before proving the theorem itself.

Lemma 4.3.2
Let

where B is (n − k) × (n − k) with σ(B) = {0}. If {γ_i}_{i=1}^∞ and {β_i}_{i=1}^∞ are the
nonincreasing sequences of partial multiplicities of C and B, respectively, then
γ_i = β_i + δ_i, i = 1, 2, . . . , where each δ_i is zero or one, and Σ_{i=1}^∞ δ_i = k.

Proof. Let x_1, . . . , x_l (l ≥ 2) be a Jordan chain for C:

Write x_i = (y_i, z_i), where y_i is a k-dimensional vector and z_i is (n − k)-
dimensional. Equalities (4.3.3) then imply y_1 = ⋯ = y_{l−1} = 0 and B z_{i+1} = z_i,
i = 1, . . . , l − 2, z_1 ≠ 0. In other words, z_1, . . . , z_{l−1} is a Jordan chain for
B. Moreover, if X y_l = 0, then z_1, . . . , z_l is also a Jordan chain for B.
Now let

be a basis in ℂ^n consisting of Jordan chains for C (so q is the maximal
index such that γ_q > 0). Denoting by p the maximal index such that
γ_p ≥ 2, let Z be the subspace spanned by the Jordan chains
z_{11}, . . . , z_{1 l_1}; . . . ; z_{p1}, . . . , z_{p l_p} for B constructed as in the preceding
paragraph from the Jordan chains x_{j1}, . . . , x_{j γ_j}, j = 1, . . . , p of C. Here l_j is
either γ_j − 1 or γ_j. The order of Jordan chains in equation (4.3.4) of the
same length can be adjusted so that l_1 ≥ ⋯ ≥ l_p. Since Z is B invariant,
Theorem 4.1.4 gives β_i ≥ l_i, i = 1, . . . , p. On the other hand, by Theorem
4.1.5, γ_i ≥ β_i, i = 1, 2, . . . . So we obtain γ_i − β_i = δ_i, i = 1, 2, . . . , where
each δ_i is either zero or one. The equality Σ_{i=1}^∞ δ_i = k follows from the fact
that the sum of the partial multiplicities of C (resp. of B) is n (resp.
n − k). □

Proof of Theorem 4.3.1. Let ℂ^n = M + N, and let A: M → M, B: N → N
be transformations such that {α_i}_{i=1}^∞, {β_i}_{i=1}^∞, and {γ_i}_{i=1}^∞ are the nonin-
creasing sequences of nonnegative integers representing the partial multip-
licities of A, B, and

respectively, corresponding to the eigenvalue λ_0 (here D is some transfor-
mation from N into M). Applying a similarity transformation, if necessary,
we can assume that M and N are coordinate subspaces.
Without loss of generality (Theorem 4.1.1) we can assume also that
λ_0 = 0 and σ(A) = σ(B) = {0} (then also σ(C) = {0}). We can assume also
that A is in the Jordan form:

We use induction on the size α_1 of the biggest Jordan block in A. If

α_1 = 1, then A = 0 and by Lemma 4.3.2 (applied to B* and C* in place of B
and C, respectively) we have

Assume that inequality (4.3.2) is proved for all A with the property that the
size of the biggest Jordan block is less than α_1. Using a matrix similar to A
in place of A, we can assume that

where A_2 is a Jordan matrix with partial multiplicities {α_i'}_{i=1}^∞ satisfying

With the corresponding partition and using the induction
hypothesis, the partial multiplicities {γ_i'}_{i=1}^∞ of the matrix C'

satisfy the inequalities:

But in view of Lemma 4.3.2 (applied with C'* and C* in place of B and C,
respectively),

Now combine relations (4.3.5), (4.3.6), and (4.3.7) to obtain the inequality
(4.3.2). The inequalities (4.3.1) are obtained from (4.3.2) applied to the
transformation C* written as the 2 × 2 block matrix with respect to the
direct sum decomposition ℂ^n = N + M. □

Inequalities (4.3.1) and (4.3.2) admit the following geometric interpre-
tation. Let q be any index such that γ_i = 0 for i > q (e.g., q = Σ_{i=1}^∞ α_i +
Σ_{i=1}^∞ β_i). Denote by K_1 ⊆ ℝ^q the convex hull of the points

where π is any permutation of {1, 2, . . . , q}; that is

Also let

Then inequalities (4.3.1) and (4.3.2) imply

Actually, the inclusion (4.3.8) in turn implies (4.3.1) and (4.3.2). The proof
of these statements would take us too far afield; we only mention that it is
essentially the same as the proof of Theorem 10 of Lidskii (1966). It is
interesting that the geometric interpretation of inequalities (4.3.1) and
(4.3.2) is completely analogous to the geometric interpretation of the
inequalities for the eigenvalues of the sum of two hermitian matrices in
terms of the eigenvalues of each hermitian matrix [see Lidskii (1966)].
Inequalities (4.3.1) and (4.3.2) can be generalized. In fact, for any
sequence r_1 < r_2 < ⋯ < r_m of positive integers and any nonnegative integer
k < r_1 the following inequalities hold [see Thijsse (1984)]:

Theorem 4.3.1 is a particular case of (4.3.9) with k = 0.
We have seen that, given the sequences {α_i}_{i=1}^∞ and {β_i}_{i=1}^∞ of partial
multiplicities of A and B, respectively, corresponding to λ_0, the sequence
{γ_i}_{i=1}^∞ of partial multiplicities corresponding to λ_0 of any completion C of A
and B satisfies the properties (4.2.3), (4.2.4), (4.2.5), (4.3.1), and
(4.3.2); moreover, (4.3.9) is satisfied as well. However, the following
example shows that, in general, these properties do not characterize the
partial multiplicities of completions.

EXAMPLE 4.3.1 . Let al = a2 = 3, a, = 0 for i > 2; fa = /32 = 5; 03 = 4; ft = 0


for / > 3 ; y i = 7 , y2-6, •y3 = 4, y4 = 3, yi , = 0 for i>4. One verifies that
relations (4.2.3), (4.2.4), (4.2.5), and (4.3.9) hold [the verification of
(4.3.9) is lengthy because of the many possibilities involved]. However,
Theorem 7 of Rodman and Schaps (1979) implies that there is no com-
pletion C of A and B such that the partial multiplicities of A, B, and C
corresponding to some A0 are given by {a,}^, {ft}"!,, and
respectively.

4.4 SPECIAL CASE OF COMPLETIONS

In this section we describe all the possible sequences of partial multiplicities


corresponding to A0 for completions of A and B in case at least one of A and
B has only one partial multiplicity at A0. First, we establish some general
Special Case of Completions 137

observations on partial multiplicities of completions that are used in this


description.
It is convenient to introduce the set ft of all nondecreasing sequences of
nonnegative integers such that, in each sequence, only a finite number of
integers is different from zero. For a = (aj, a 2 ,. . .), /3 = (/?,, /3 2 ,. . .) E ft
denote by F(a, )3) the set of all sequences y = (yl, y2,. . .) E ft with the
following properties: (a) there is a transformation C: <£""—»• £" (for some n)
and a C-invariant subspace M such that the restriction C\M has partial
multiplicities a,, a 2 ,. . . corresponding to a certain eigenvalue A 0 ; (b) the
compression of C to a coinvariant subspace that is a complement to M has
partial multiplicities j8 t , )3 2 ,. . . corresponding to A 0 , and (c) C itself has partial
multiplicities yl,y2, • • - corresponding to the same A 0 .

Proposition 4.4.1
Let a — (otj, a 2 , . . .) E ft, /8 = (/3,, /32, . . .) Eft, and put m — E*=1 a,, n =
£°°=1 )3,. TTzen fl sequence y = (7,, y2, . . .) E ft belongs to F(a, /3) if and only
if there is an m x n matrix A such that the partial multiplicities of the matrix

is the largest
O
index such that a_/I j ^0 [resp.
I /^
B n 7^0]J are y,,
"^2
y,, . . . .
/1 " » / '

Proof. As the part "if" follows from the definition of F(a,/8), we


have only to prove the "only if part. Assume y EF(a, /3). By definition,
there is a matrix C partitioned as follows:

where for some eigenvalue A0 of C the partial multiplicities of C (resp. C u ,


C22) at A0 are given by y (resp. a, /3). Replacing C by C- A07, we can
assume A0 = 0. Furthermore, we can assume that Cn and C22 are matrices in
the Jordan form. It remains to appeal to Theorem 4.4'.l.i
It follows immediately from Proposition 4.4.1 that F(a, j8) = F(j3, a).
Indeed, in the notation of Proposition 4.4.1 we have

so the matrices
138 Jordan Forms for Extensions and Completions

have the same Jordan form. But then (in view of Corollary 2.2.3) this is also
true for the matrices

As / 2 ar|d ^* are similar to J2 and /,, respectively, the conclusion F(a, /3) =
F(j3, a) follows.
In view of Proposition 4.4.1, in order to determine F(a, /3), we have to
find the partial multiplicities -y, > -y2 > • • • (or, what is the same, the Jordan
form) of matrices J of type (4.4.1). As

(by definition, /° = /), we focus on a formula for computation of the ranks


of/ 1 , / = 1,2, ____
Divide the matrix A into blocks Atj, i = 1, . . . , n, ; / = 1, . . . , «2 accord-
ing to the sizes of Jordan blocks in /, and /2 (so the size of Atj is a, x /3;).
For fixed i and j, write Atj = T,%=l E^, "„£,,, where Epq is an a, x p.
matrix with 1 in the intersection of the (a, - p + l)th row and gth column
and zero in all other places. Let

(we put upq = 0 if p > ofj or q > /3y). Define

where the sum is over all the pairs p,q such that p <min(A;, a,), q <
min(/:, jSy), and p + q> k. For example, BJ^ has u n in the lower left corner
and zeros elsewhere, B|,112) has in the lower lefgt corner and

zero elsewhere, B^3) has

in the lower left corner and zeros elsewhere (provided «., /3 ; >3). Let
be the mx n matrix with blocks B^f\i = 1, . . . , n,; / = 1, . . . , n 2 ).

Lemma 4.4.2
In the preceding notation we have

rank /* = rank /* -t- rank /* + rank B(k) , k = 1, 2, . . .


Special Case of Completions 139

Proof. Let A(k) be defined by An easy indctiom


argument on k shows that

and hence

where Eab = 0 whenever at least one of the inequalities 1 < a ^ a,; 1 < ft ^ /3y
is violated, and Ma6 = 0 for a < 1 or ft < 1.
It follows that
A = fl
ij !> + (terms with Ep.q. such that p' > A: or q' > A:)

By column operations from /* and row operations from Jk2, we can eliminate
all terms of A(k) except those in the block B(k). Permuting the rows and
columns of the resulting matrix, we obtain the following matrix that has the
same rank as /*:

where ak = rank Jk and bk = rank /*• Lemma 4.4.2 follows.

It is an immediate consequence of the lemma that the sequence (y;}7=i


depends only on the diagonal sums d^, for /<min(a / , ft). Thus we can
replace each Afj by a matrix in which only the first column can contain
nonzero entries. Alternatively, we can presume that only the bottom row of
A^ can contain nonzero entries.
For illustration of Lemma 4.4.2, consider the following example.

EXAMPLE 4.4.1. Let a =(a,,0, 0,. . .), p = (ft, 0, 0,. . .), where a,, fa >
0. We suppose for definiteness that a, > ft. If d ( l ) 7^0, it is easily seen that
t40 jordan Forms for Extensions and Completions

rank

In general, we have

where t0 is the smallest t such that d*'/ ^ 0, or /0 = ftl + 1 if all rf^'/ are zeros.
It is now clear that y = (y,, y 2 ,. . .)eF(a, 0) is determined completely by
the value of t0. Further, using formula (4.4.3) and Lemma 4.4.2, we
compute

Computation shows that

F(a, 0) = {(a, + jS,, 0), (a, + j8, - 1,1), . . . , K + 1,0,- 1), (a,, 0,)}

(In every y sequence we write only the first members; the others are zeros.)
The y sequence (a{ + 0t — p, p) corresponds to the value f 0 = p + 1.
The possibility of y = (QJ -I- 0j — p, p), p = 0,. . . , /3j, is realized for the
matrix

where Ap is an al x 0, matrix with all but the (a, — p, l)th entry equal to
zero, and this exceptional entry is equal to 1 (for p = 0j we put Ap = 0). It is
not difficult to construct two independent Jordan chains of A/ — J(p) of
lengths aj + 0, — p and p. Namely, the Jordan chain of length a, + 0, — p is
e
a +p ' ea +p - i > • • • » e a, + i> ^a^p' e a,-p-i> • • • , ^ i • The Jordan chain of
length p is ea] - eai+p, e Q ] _, - e a i + p _ i , . . . , « ffll - P + i ~ e«1+ i -

Using Lemma 4.4.2, we shall now give a complete description of the set
F(a, 0) in the case that a = (alt a2, . . . , a n ,0,. . .) and 0 = (0^0,0,. . .)
where an and 0j are positive.
Introduce the set O0 of all w-tuples (co}, a>2,. . . , eon), where o>; are
integers such that 1 < cu; < A y + 1 and A y = min(a y , 0J. For a given sequence
o> = (o)l5 <u2, . . . , & ) „ ) E ft0 and / = 1, 2,. . . , define integers cj"* as follows:
Special Case of Completions 141

where /ay = max(a y , /3,). Now let y = (y,, y2, • • •) be the nonincreasing
sequence of nonnegative integers defined by the equalities

Thus for every &> G fl0 we have constructed a sequence y. Let us denote this
sequence by F(o>).

Theorem 4.4.3
For every eoGfl ( ) the sequence F((o) belongs to Y(a, B). Conversely, if
y G F(a, j3), there exists a) G O0 s«c/z tfza/ y = F(w).

proof. Recall that

Inview of lemma 4.42, we find that

It remains to check, therefore, that for every w Gfl it is possible to pick


the complex numbers dfi ( ! < / < « , / = 1 , 2 , . . . ) in such a way that
fk - rank B(k) for A: = 1, 2, . . . , where fk is defined by equation (4.4.4) and
B(k) is defined as in Lemma 4.4.2; and conversely, for every choice of d^ it
is possible to find an o> G H0 such that fk — rank B(k\
Note that B(k) depends on d(^ with t < A ; , so we restrict ourselves only to
these values of t.
Given w — (otl, . . . , '<u n )Gft 0 , choose d^ in such a way that at- is the
smallest index / with the property that d(^ ¥^ 0 [if a))• = A y + 1, put d^ — 0 for
all t]. It is easy to see that cff is just the rank of the matrix B^ [defined by
(4.4.2)]. Observe that after crossing out some zero columns and rows, if
necessary, B(k^ is an upper triangular Toeplitz matrix with min(Ac, /3,)
columns. Thus the rank of
142 Jordan Forms for Extensions and Completions

is just the maximum of the ranks of B\\\ B^, . . . , flj,*/, that is, fk.
Conversely, if d^ are given, define o>; as the minimal /(I < t < A y ) such
that rfj'/ ^ 0; and if d^ = 0 for every /, 1 < t < A y , put wy = Ay- + 1 .

4.5 EXERCISES

4.1 Supply a proof of Theorem 4.1.5.


4.2 State and prove a result for dilations analogous to Theorems 4.1.4 and
4.1.5.
4.3 Prove that the maximal dimension of an irreducible /1-invariant sub-
space coincides with the maximal dimension of a Jordan block in the
Jordan form of A.
4.4 Find all possibilities for the partial multiplicities of matrices of type

where X is any n x m matrix.


4.5 What is the answer to the preceding exercise under the restriction that
rank X < k, where A: is a fixed positive integer?
4.6 Find all possibilities for the partial multiplicities of matrices of the
following types:

where X is any n x m matrix of rank 1. (Hint: Prove that there


exists an n x m matrix X0 with exactly one nonzero entry such
that

are similar.)

(c) What happens if we allow matrices X of rank 2?


Exercises 143

4.7 Find all possibilities for partial multiplicities of matrices of type

where X is any n x m matrix.


4.8 Let

be circulant matrices. Find all possibilities for the partial multiplicities


of matrices of type

where X is an n x n matrix.
Chapter Five

Applications to
Matrix Polynomials

Let A0, A j , . . . , A,_)be complex n x n matrices. We call the matrix-valued


function L(A) = I\' + ZJIg Aj\' a monic matrix polynomial of degree /. It
will be seen that there are In x In matrices C such that

are equivalent. (See the appendix for the notion of equivalence.) In this case
C is said to be a linearization of L(A). The invariant, coinvariant, and
semiinvariant subspaces for C play a special role in the study of the matrix
polynomial L(A). For example, certain invariant subspaces of C are related
to factorizations of L(A). More precisely, certain invariant subspaces deter-
mine monic right divisors of L(A), certain coinvariant subspaces determine
monic left divisors, and certain semiinvariant subspaces determine three
monic factors of L(A). In this chapter we explore these and similar
connections and study the behavior of solutions of differential and differ-
ence equations with constant coefficients.

5.1 LINEARIZATIONS,s STANDARD TRIPLES, AND


REPRESENTATIONS OF MONIC MATRIX POLYNOMIALS

In this section we introduce the main tools required for the study of monic
matrix polynomials. These tools are freely used in subsequent sections.
Let L(A) = /A' + EJIo Af\' be a monic matrix polynomial of degree /,
where the A- are n x n matrices with complex entries. Note that det L(A) is
a polynomial of degree nl. A linear matrix polynomial /A — A of size
(n + p) x (n + p) is called a linearization of L( A) if

144
Monic Matrix Polynomials 145

where E( A) and F( A) are (n + p) x (n + p) matrix polynomials with con-


stant nonzero determinants. Admitting a small abuse of language, we also
call matrix A from equation (5.1.1) a linearization of L(A). Comparing
determinants on both sides of (5.1.1), we conclude that det(/A - A) is a
polynomial of degree nl, where / is the degree of L(A). So the size of a
linearization A of L(A) is necessarily nl.
As an illustration of the notion of linearization, consider the lineariz-
ations of a scalar polynomial (n = 1). Let L(A) = flf =1 (A — A,)"1 be a scalar
polynomial having different zeros A , , . . . , Afc with multiplicities a 1 5 . . . , ak,
respectively. To construct a linearization of L(A), let/, (i = 1,. . . , k) be the
Jordan block of size a, with eigenvalue A ( , and consider the linear polyno-
mial A/ - / of size E*=1 a^ where J = diag[/,]f=1. Then / is a linearization of
L9A). Indeed, /A - J and have the same elementary divisors;
so using Theorem A.3.1, we find that J is a linearization of L(A).
The following theorem describes a linearization of a monic matrix
polynomial directly in terms of the coefficients of the polynomial.

Theorem 5.1.1
For a monic matrix polynomial definc
the nl x nl matrix

Then C\ is a linearization of L(A).

Proof. Define nl x n/ matrix polynomials £(A) and F(A) as follows:


146 Applicaions toMatrix Polynomials

where# 0 (A) = / a n d B r + l ( A ) = AB r (A) + A,_r_l for r = 0, 1, . . . , / - 2. It is


immediately seen that del F( A) = 1 and det E( A) = ±1. Direct multiplication
on both sides shows that

and Theorem 5.1.1 follows. D

The matrix C, from Theorem 5.1.1 will be called the (first) companion
matrix of L(A), and will play an important role in the sequel. From the
definition of C, it is clear that

In particular, the eigenvalues of L( A), that is, zeros of the scalar polynomial
det L(A), and the eigenvalues of /A - C, are the same. In fact, we can say
more: since Cl is a linearization of L(A), it follows that the elementary
divisors (and thus also the partial multiplicities of every eigenvalue) of
/A — Cj and L(A) are the same.
Now we prove an important result connecting the rational matrix function
L ( A ) ~ l with the resolvent function for the linearization C,.
Proposition 5.1.2
For every A £ <p that is not an eigenvalue of L(A), the following equality
holds:

where

is an n x nl matrix and

is an n x nl matrix.
Monic Matrix Polynomials 147

Proof. Consider the equality (5.1.2) used in the proof of Theorem


5.1.1. We have

It is easy to see that the first n columns of the matrix [£(A)] ' have the form
(5.1.4). Now, multiplying equation (5.1.5) on the left by P, and on the
right by P] and using the relation

We obtain the desired formula (5.1.3).

Formula (5.1.3) is referred to as a resolvent form of the monic matrix


polynomial L(A). The following result follows directly from the definition of
a linearization and Theorem A. 4.1.

Proposition 5.1.3
Any two linearizations of a monic matrix polynomial L(A) are similar.
Conversely, if a matrix T is a linearization of L(\) and matrix S is similar to
T, then S is also a linearization of L(\).

This proposition and the resolvent form (5.1.3) suggest the following
important definition: a triple of matrices (X, T, Y), where T is nl x «/, X is
n x «/, and Y is nl x n, is called a standard triple of L(A) if

For example, Proposition 5.1.2 shows that ( P } , C}, /?,) is a standard triple
of L(A).
It is evident from the definition that, if (X, T, Y) is a standard triple for
L(A), then so is any other triple (X, f, Y) that is similar to (X, T, Y), that
is, such that

for some nonsingular matrix 5. As we see in Theorem 5.1.5, this is the only
freedom in the choice of standard triples.
We start with some useful properties of standard triples. Here and in the
sequel we adopt the notation col[Z(]f=0 for the column matrix
148 Applications to Matrix Polynomials

Proposition 5.1.4
If (X, T, Y) is a standard triple of a monic n x n matrix polynomial
/ then the nl x nl matrices

are nonsingular. Further, the equalities

and

hold.

Proof. We have

and by Proposition 2.10.1,

where F is a circle with centre 0 and sufficiently large radius so that cr(7)
and the eigenvalues of L( A) are inside T. On the other hand, since L( A) is a
monic polynomial of degree /, the matrix function L(A) = A~'L(A) is
analytic and invertible in a neighbourhood of infinity and takes the value / at
infinity. In fact, L(A) is analytic outside and on F. Hence

and representing L(A)" 1 as a power series 7 + E£ =1 A ~ k L k , we see that

Combining this with (5.1.8), we have


Monic Matrix Polynomials 149

As the right-hand side in equation (5.1.9) is nonsingular, the nl x nl matrices


col[AT]!:J and [7 TF • • • r'^y] are both nonsingular.
Now use equation (5.1.8) again and we find that, for / = 0, 1, . . . , / — 1,

It follows that

and since the second factor is nonsingular, formula (5.1.6) follows.


Similarly, starting with the equality

formula (5.1.7) can be verified.

We are now ready to state and prove the basic result that the standard
triple for a monic matrix polynomial is essentially unique (up to similarity).

Theorem 5.1.5
Let (X1, Tl, y t ) and (X2, T2, Y2) be two standard triples of the monic matrix
polynomial L( A) of degree I. Then there exists a unique nonsingular matrix S
such that

The matrix S is given by the formula

where the invertibility of the matrices involved is ensured by Proposition


150 Applications to Matrix Polynomials

5.1.4. In particular, if (X, T, Y) is a standard triple of L(A), then T is a


linearization of L(A).

Proof. Assume we have already found a nonsingular S such that


(5.1.10) holds. Then

and

Thus formulas (5.1.11) hold and consequently 5 is unique.


Now we prove the existence of an 5 such that (5.1.10) holds. Without loss
of generality, and taking advantage of Proposition 5.1.2, we can assume that
X2 = P,, T2 = C}, y2 = /?,. Using (5.1.6) [with (X, T, Y) replaced by
(A",, T,, y,)], the equality

where C\ is the companion matrix of L(A), is easily verified. Also, (5.1.9)


implies

where 8tj is the Kronecker index (8;/ = 0 if 17^;; 5(> = 1 if i =/). Obviously

and equations (5.1.10) hold with S = col[A'17",]!:,'.


Finally, if (A', T, Y) is a standard triple of L(A), then, by the part of
Theorem 5.1.4 already proved, T is similar to the companion matrix C, of
L(A), and thus T is also a linearization of L(A).

Proposition 5.1.2 gives an example of a standard triple based on the


companion matrix of L(A). Another useful example of a standard triple is

where

and is called the second companion matrix of L(\). Indeed, if we define


151

then we have

Thus the triple (5.1.10) is similar to the standard triple given in Proposition
5.1.2. The notion of a standard triple is the main tool in the following
representation theorem.

Theorem 5.1.6
Let be a monic matrix polynomial of degree I with
standard triple (X, 7", Y). Then L(\) admits the following representations:
(a) Right canonical form :

where V- are nl x n matrices such that

(b) Left canonical form:

where W are n x nl matrices such that

Note that only X and T appear in the right canonical form of L(A),
whereas only T and y appear in the left canonical form.

Proof. Observe that the forms (5.1.13) and (5.1.14) are independent of
the choice of the standard triple (X, T, Y). Let us check this for (5.1.13),
for example. We have to prove that if (X, T, Y) and (A", T', Y') are
standard triples of L(A), then

where
152

But these standard triples are similar:

Therefore

and (5.1.15) follows.


Thus it suffices to check equation (5.1.13) only for the special standard
triple

and for checking (5.1.14), we choose the standard triple defined by (5.1.12).
To prove (5.1.13), observe that

and

so

and (5.1.13) becomes evident. To prove (5.1.14), note that by direct


computation one easily checks that for the standard triple (5.1.12)

and

So

Thus
Multiplication of Monic Matrix Polynomials 153

and

So equations (5.1.14) follows.

5.2 MULTIPLICATION OF MONIC MATRIXx pPOLYNOMIALS


AND PARTIAL MULTIPLICITIES OF A PRODUCT

In this section we describe multiplication of monic matrix polynomials in


terms of their standard triples. First we compute the inverse L~l(\) of the
product L(A) = L 2 (A)Lj(A) of two monic matrix polynomials /^(A) and
L 2 (A).

Theorem 5.2.1
Let L,( A) be a matrix polynomial with standard triple (Xt, T,, Y^for i = 1,2,
and let L(A) = L^AJL^A). Then

whrer

Proof. It is easily verified that

The product on the right of equation (5.2.1) is then found to be

But, using the definition of standard triples, this is just ), an


the theorem follows immediately.

Corollary 5.2.2
If L.( A) are monic matrix polynomials with standard triples {X^ Tt, Y,) for
i = 1,2, then has a standard triple (X, T, Y) with the
representations
154 Applications to Matrix Polynomials

Proof. Combine Theorem 5.1.5 with Theorem 5.2.1.

Corollary 5.2.2 allows us to describe the partial multiplicities of a produc


of monic matrix polynomials. We first give some necessary definitions. For a
monic matrix polynomial L(A) and its eigenvalue A0 [i.e., det L( A 0 ) = 0], let
«j > a2 > • • • > ar be the degrees of the elementary divisors of L( A) corre-
sponding to A 0 . The integers a, are called the partial multiplicities of L(A)
corresponding to A(). It is convenient to augment the a, values by zeros and
call the sequence a — (an «2, . . . , ar, 0, . . .) the sequence of partial multi-
plicities of L( A) at A 0 . Thus a G ft (see Section 4.4 for the definition of ft).
Also, we shall say formally that the partial multiplicities of L( A) correspond
ing to a complex number that is not an eigenvalue of L(A) are all zeros.
Recall also the definition of the set F(a, /3) given in Section 4.4.

Theorem 5.2.3
Let Lj(A) and L2(\) be n x n monic matrix polynomials. Let a, /3 and y be
the sequences of partial multiplicities of L,(A), L2(\), and L 2 (A)L,(A),
respectively, at A 0 . Then y E F(a, /3). Conversely, if y G F(a, /3), then for n
sufficiently large there exist n x n monic matrix polynomials L,(A) and
L2( A), such that the sequence of their partial multiplicities at A0 are a and j8,
respectively, and the sequence of partial multiplicities of L 2 (A)Lj(A) is y.

Proof. Let ( X t , Tt, Yt} be a standard triple for L,(A) and i = 1,2. By
the multiplication formula (Corollary 5.2.2), the matrix

is a linearization of L 2 (A)Lj(A). From the properties of a linearization it


follows that y is also the sequence of partial multiplicities of T at A 0 . Now
from the structure of T it is clear that y G F(a, /3), and the first part of the
theorem follows.
To prove the second part of Theorem 5.2.3, we first prove the following
assertion: let A be an r t x r2 matrix. Then for n sufficiently large there exist
an r, x n matrix Y and an n x r2 matrix X such that YX - A, the rows of Y
are linearly independent, and the columns of X are linearly independent.
Indeed, multiplying A by invertible matrices from the left and the right (if
necessary), we can suppose that
Multiplication of Monic Matrix Polynomials 155

where / is the unit r x r matrix (for some r < min(r,, r 2 )). Then we can take

where Y} is an (r t - r) x r, matrix with linearly independent rows and


Xl is an r2 x (r2 - r) matrix with linearly independent columns. Then n =
r + r, + r 2 , of course.
Now let y G F(a, /3), so that y is the sequence of partial multiplicities of

for some 7",, T2, A, and the partial multiplicities of 7, (resp. 72) corre-
sponding to A0 are given by the sequence a (resp. /3). Applying a similarity
to 70, if necessary, we can assume that 7, and 72 are in Jordan form.
Further, in view of Theorem 4.1.1 we can assume that o-(7,) = o-(72) =
{A,,}-
According to the assertion proved in the preceding paragraph, for n
sufficiently large there exist matrices X0 and yo of sizes n x r2 and r, x n,
respectively (where r, = EJL, a;, r2 = EJL, )3y) such that K^, = /I, the rows
of yo are linearly independent, and so are the columns of X(). Choose an
n x (n - r 2 ) matrix Xt such that the matrix [A r () A r ,] (of size n x n) is
invertible, and put

where z is some complex number different from A 0 . Similarly, choose an


matrux Y1 such that is nonsingular, and put

As 72 © z/ (resp. 7, © z/) is a linearization of L 2 ( A) [resp. of L,( A)], it


follows that the partial multiplicities of L2(\) [resp. of L,(A)] corresponding
to A() are given by the sequence j8 (resp. a). Further

are the standard triples for L 2 (A) and L,(A), respectively. By Corollary
5.2.2 the matrix
156 Applications to Matrix Polynomials

is a linearization of L 2 ( A)L,( A). Now Theorem 4.1.1 ensures that the partial
multiplicities of T corresponding to A0 are exactly those for F0; that is, they
are given by the sequence y.

The proof of the converse statement of Theorem 5.2.3 shows that for a
yEF(a, /3) there exist linear monic matrix polynomials L t (A) and L 2 (A)
with the desired properties and with the size not exceeding min(rj,r 2 ) +
TJ + r 2 , where r, (resp. r 2 ) is the sum of all integers in a (resp. )3).
Our analysis of partial multiplicities of completions in sections 4.2-
4.4, combined with Theorem 5.2.3, allows us to deduce various connec-
tions between the partial multiplicities of monic matrix polynomials and the
partial multiplicities of their product, as indicated, for instance, in the
following corollary.

Corollary 5.2.4
Let Lj(A) and L 2 (A) be n x n monic matrix polynomials. Let a =
(a,, a 2 ,. . .), /3 - (j§,, /32, . . .), and y = (y,, -y2, . . .) be sequences of partial
multiplicities o/L,(A), L 2 (A), and L 2 (A)L,(A), respectively, at A 0 . Then

/or an_y sequence r^<- • • <rm of positive integers.

The corollary follows from Theorems 4.3.1 and 5.2.3.

5.3 DIVISIBILITY OF MONIC MATRIX POLYNOMIALS

Let L( A) be an nx n monic matrix polynomial of degree /, and let


(X, T, y) be a standard triple for L(A). Consider a F-semiinvariant sub-
space M. Thus there exists a triinvariant decomposition (see Section 3.3)
associated with M:

where the subspaces j£ and ^ 4- J< are F invariant. The triinvariant decom-
position (5.3.1) is called supporting [with respect to (X, F, y)] if, for some
integers p and q, the transformations
Divisibility of Monk Matrix Polynomials 157

and

are invertible (in particular, this implies that dim(j£ + M) = np,


dim j£' = nq).
Cases in which j£= {0} are of particular interest; then M is T invariant
and condition (5.3.3) is vacuous. Also, if JV= (0), then M is T coinvariant
and the condition (5.3.2) is satisfied automatically with p = /. (Indeed,
we have seen in Proposition 5.1.4 that the matrix col[AT']'lo is n
singular.)
The definition of a supporting triinvariant decomposition is given in terms
of (X, T) only. However, if P# is a projector with Ker P^ — JV, the
following lemma shows that the invertibility of (5.3.2) is equivalent to
the invertibility of the transformation from C n( '~ p) into Im P^ defined by

Similarly, (5.3.3) is invertible if and only if

is invertible, where P^ is a projector with Ker Py — 58 (note that because of


the T invariance of £ and M we have

Lemma 5.3.1
Let L(\) be a monic matrix polynomial of degree I with standard triple
(X, T, Y), and let P be a projector in (p"'. Then the transformation

(where k < I) is invertible if and only if the transformation

is invertible.
invertible.
158 Applications to Matrix Polynomials

Proof. Put and with


spect to the decompositions and
write

Thus the At are transformations with the following domains and ranges:

and similarly for the Bt.


Observe that Al and B4 coincide with the transformations (5.3.4) and
(5.3.5), respectively. By formula (5.1.9) the product AB has the form

where D, and D2 are nonsingular matrices. Recall that A and B are also
nonsingular by Proposition 5.1.4. But then Al is invertible if and only if B4
is invertible. This may be seen as follows.
Suppose that B4 is invertible. Then

is invertible in view of the invertibility of fi, and then also B} - B2B4l B3 is


invertible. The special form of AB implies A^B2 + A2B4 =0. Hence D, =
A ,B, +A2B3 = A1B1 -A {B2B~l B^ = Al(Bl -B2B4}B3) and it follows
that A\ is invertible. A similar argument shows that invertibility of Al
implies the invertibility of B4. This proves the lemma.

The importance of supporting triinvariant decompositions stems from the


following result describing factorizations of a monic matrix polynomial L(A)
in terms of supporting triinvariant decompositions associated with a lineariz-
ation of L(\}.
Divisibility of Monic Matrix Polynomials 159

Theorem 5.3.2
Let =^(A) be an n x n monic matrix polynomial with standard triple
(X, T, y), and let df" = ££ + M+Jfbea supporting triinvariant decompos-
ition associated with a T-semiinvariant subspace M. Then L(A) admits a
factorization

where L ( (A), / = 1 , 2 , 3 are monic matrix polynomials with the following


property: (a) (X\y, T^, Y) is a standard triple of L 3 (A), where

(b) (X, PjfT\lmP , P^Y) is a standard triple for L,(A), where P v is a


projector with Ker P v = !£ 4- M and

(c) (Z\M, PMT\M, F) is a standard triple for L 2 (A), where PM is the projector
on M along !£ + Im P A ,

[Here q^l and / - / ? < / are the unique nonnegative integers such that the
linear transformations co\(XTi)t>:^- £-+ $"" and P^[Y,. . . , Tl'p~2Y,
T'~P~1Y]: $n(l-p)->% + M are invertible.}
Conversely, if equation (5.3.6) is a factorization of L(\) into a product of
three monic matrix polynomials Lj(A), L 2 (A), and L 3 (A), there exists a
supporting triinvariant decomposition
160 Applications to Matrix Polynomials

associated with a T-semiinvariant subspace M such that the standard triples


o/L,(A), L2(\),and L 3 (A) are (X, P x r l l m / P^Y), (Z]M, PMT\M, PMY),
and 7 " , y), respectively, where P^ is a projector with Ker P^ =
!£ 4- M , PM is the projector on M along !£ + Im PA , and X, Y, Z, Y are given
by (5.3.8), (5.3.7), (5.3.9) and (5.3.10), respectively.
Moreover, the T-invariant subspaces !£ and t£ + M in (5.3.11) are unique-
ly determined by the factors Lj(A), L 2 (A), and L 3 (A).
It is assumed in Theorem 5.3.2 that P where
/ is the degree of L(A).
As a monic matrix polynomial M(A) and its inverse are uniquely deter-
mined by any standard triple (see Theorem 5.1.6 and the definition of a
standard triple), Theorem 5.3.2 provides an explicit description of the
factors L,(A) in (5.3.6) in terms of supporting triinvariant decompositions.
For instance, if <p" = !£ + M 4- N is a supporting triinvariant decom-
position (associated with a r-semiinvariant subspace M ) and L( A) =
L!(A)L 2 (A)L 3 (A) is the corresponding factorization of L(A), then (in the
notation of Theorem 5.3.2) we have

Similarly, using Theorem 5.3.2, one can produce the formulas for Lj(A),
L 2 (A), and L 3 (A) themselves. The proof of Theorem 5.3.2 is quite lengthy
and is relegated to the next section.
The following particular case of Theorem 5.3.2 is especially important.
We assume that L(A) and (X, T, Y) are as in Theorem 5.3.2.
Corollary 5.3.3
Let <p"' = ££ + M+Nbea supporting triinvariant decomposition associated
with a T-semiinvariant subspace M such that !£ + M = <prt/ (so Jf — {0} and M
is actually T coinvariant). Then L(\) admits a factorization

where L3( A) is a monic matrix polynomial of degree q with a standard triple


of the form (X^, T^, Y), where
Proof of Theorem 5.3.2 161

Also, L 2 (A) is a monic matrix polynomial of degree I- p with a standard


triple of the form (XIM, PM T\M, PM Y) where

and PM is the projector on M along ££.


Conversely, if equation (5.3.12) is a factorization of L(\) into a product
of two monic matrix polynomials L 2 (A) and L3(X), there exists a unique
T-invariant subspace J£ such that the triinvariant decomposition <p" = 2£ +
M 4- {0} (where M is a direct complement to 3?) is supporting and the
standard triples of L 2 (A) and L3(X) are as described above.

Note that under the conditions of Corollary 5.3.3 we have q = I - p (cf.


Lemma 5.3.1).
Again, as in Theorem 5.3.2, one can write down explicit formulas for the
factors in (5.3.12) and their inverses using the triinvariant decomposition
<p"' = % + M + Jf with M = {0}. For example

in the notation of Corollary 5.3.3.

5.4 PROOF OF THEOREM 5.3.2

We need the following fact.

Proposition 5.4.1
Let L( A) = E / = 0 Aj\' be an n x n matrix polynomial (not necessarily monic)
and let L{(\) be an n x n monic matrix polynomial with standard triple
(Xl,Tl,Yl). Then (a) L(A) = L 2 (A)Li(A) for some matrix polynomial
L 2 (A) if and only if the equality

holds; (b) L(A) = Lj( A)L 3 (A)/or some matrix polynomial L 3 (A) if and only
if the equality

holds.

Proof. Let us prove (a). We have


Therefore
162 Applications to Matrix Polynomials

and for |A| large enough (e.g., for |A| > \\T} ||) we have

Now assume L(A)L 1 (A)~ 1 is a polynomial. Then in formula (5.4.2) the


coefficients of negative powers of A are zeros. But the coefficient of
A~ y "'(; = 0, 1,.. .) in (5.4.2) is

which is zero. So

As [y,, 7,, y,, . . . , r ^ ' y j is nonsingular [where k is the degree of L,(A);


see Proposition 5.1.4], we obtain equality (5.4.1).
Conversely, if (5.4.1) holds, then

which means that all coefficients of negative powers of A in (5.4.2) are zeros,
that is, L(A)L 1 (A)~ I is a polynomial.
Statement (b) of Proposition 5.4.1 follows from the (already proved)
statement (a) when applied to the matrix polynomials L(A) = (L(A))* =
. . _ def _
£}=0 A*\' and L,(A) = (Li(A))* in place of L(A) and L,(A), respectively.
(Note that (y*, T*,X*) is a standard triple for L,(A), and that L(A) =
L,(A)L 3 (A) if and only if L(A) = L 3 (A)L,(A), where L 3 (A) = [L 3 (A)]* is a
matrix polynomial together with L 3 (A).}

Assume now that <p"' = 3? + M + N is a supporting triinvariant decom-


position associated with 7-semiinvariant subspace M, as in Theorem 5.3.2.
As [coltAT'lfJo1]^: ££—> (p"9 is an invertible transformation, we can define
the n x n monic matrix polynomial L3(\) by the formula

where
Proof of Theorem 5.3.2 163

(so V): <p"—»j£, i = 1,. . . , q). It turns out that (A^>, T^, Vq) is a standard
triple of L 3 (A). Indeed, we note that the following equalities hold:

where C3 is the companion matrix for L 3 (A), and

(The second equality is obtained from

on premultiplication by ATj^,.) Hence (A^, 7^,, V ) is similar to the


standard triple (P,, C,, /?,) for the matrix polynomial L 3 (A) (in the notation
of Proposition 5.1.2), so (A^, 7^, V^) is itself a standard triple for L 3 (A).
Because of the equality

where / is the degree of L(A) and Aj is the coefficient of A' in L(A) [see
formula (5.1.6)], Proposition 5.4.1 ensures that there exists a matrix polyno-
mial L 4 (A) such that L(A)= L 4 (A)L 3 (A). The matrix polynomial L 4 (A) is
necessarily monic and of degree def/ — q. Let us find its standard triple. First
note that the transformation Q = P ^ + P v is a projector on M + Im P v
along 3?. Indeed, for every x E 3? we have Qx = PMx + Pxx = 0 + 0 = 0, and
for every yE.M (resp. y e Im P v ) we have Qy — PMy + P^y = y + 0 = 0
(resp. Qy = P^y = y). Then by Lemma 5.3.1, the transformation

is invertible.
Now we check that

where

and QTQ is considered as a transformation from Im Q into itself. In view of


164 Applications to Matrix Polynomials

the multiplication theorem (Theorem 5.2.1) it will suffice to check that the
triple (X, T, Y) is similar to the triple

For then we have where L 4 (A) is the right-han


side of (5.4.3) and thus L 4 (A) = L 4 (A). To this end define

where X} = X\<g, Tl = T^. Then P' is a projector and Im P' = !£. Indeed,
we obviously have P'y = y for every y G !£. Further, formula (5.1.9) implies
that

In fact, we have the equality:

To check this, let y £ Ker P'. As [y, 7Y,. . . , r'^'Y] is invertible, we


have y = E|lo r'Yx, for some x0,. . . , x,_l E <p". Now

and formula (5.1.9) easily implies that xt_q = • • • = jc 7 _j = 0. Hence (5.4.4)


follows.
In view of Lemma 5.3.1 the transformation [Y, TY,. . . , Tl~q~lY] is
one-to-one; therefore

Using (5.4.4) and the fact that P'^ = I, it follows that X and
Im[Y, TY,. . . , T'~q~lY] are direct complements to each other in <pn/.
Thus P' is indeed a projector.
Define 5: f'^lm P + Im Q by

where P' and (? are considered as transformations from <p" into Im P' and
Proof of Theorem 5.3.2 165

Im Q, respectively. One verifies easily that S is invertible. We show that

Take y E <£"'. Then P> E 2 and col^TV'P'.y]*^ = col[AT'~^]*=1. In


particular, XlP'y = A'y. This proves that [A^ 0]S = X. The second equality
in (5.4.5) is equivalent to the relations

and QT= QTQ. The last follows immediately from the fact that Ker Q is an
invariant subspace for T. To prove (5.4.6), take y E <pn/. The case when
_y E Ker (? = Im P' is trivial. Therefore, assume that y E Ker P'. We then
have to demonstrate that P'Ty-VlZ2Qy. Since y E K e r P ' , there exist
x0, . . . , * , _ , _ , E <f" such that u = ^I? r'^^'Fjc,.,. Hence

with u E K e r P ' and, as a consequence, P'Ty = P'T''qYxQ. But then it


follows from the definition of P' that

On the other hand, putting , we obtain

and so V^ZQy is also equal to VqxQ. This completes the proof of equation
(5.4.6). Finally, the last equality in (5.4.5) is obvious because P'y = 0.
We have now proved equality (5.4.3), from which it follows that
(Q, QTQ, QY) is a standard triple for L 4 (A).
Now define the monic matrix polynomial

where and

Then (A^, PxT^lmP ., P^y) is a standard triple for Lj(A). Indeed, this
follows from the equalities
166 Applications to Matrix Polynomials

where C2 is the second companion matrix of L,(A). The first and third
equations of (5.4.7) follow from the definitions; the second equality follows
from the structure of C2 using the fact that A col[l/,]Jl^ = /.
Now Proposition 5.4.1, (b) implies that L 4 (A) = L,(A)L 2 (A) for some
(necessarily monic) matrix polynomial L 2 (A). So in order to prove the direct
statement of Theorem 5.3.2 we have only to verify that (Z\M, PM^\M-> ^) ^s
indeed a standard triple for L 2 (A). To this end, put

where

Note that, in view of Lemma 5.3.1, the invertibility of the transformation on


the right-hand side of (5.4.8) follows from the invertibility of A. As shown
earlier in this section, (Z^, PMT\M, Y) is a standard triple for L2(\), and
L 4 (A) = L!(A)L 2 ( A) for some monic matrix polynomial L,(A) with standard
triple (X, P v T | I m / v P,Y). Hence L,(A) = L,(A), and thus L 2 (A) = L 2 (A).
Consider now the proof of the converse statement of Theorem 5.3.2. This
statement amounts to the following: if L(A) = L 4 (A)L 3 (A) for some monic
matrix polynomials L 4 (A) and L 3 (A), then there is a unique ^-invariant
subspace !£ such that (A^, T\</,, Y), with

is a standard triple for L 3 (A). Here q is the degree of L 3 (A). Let C be the
first companion matrix of L(A). Proposition 5.4.1 implies that

where is a standard triple for Also

Eliminating C from (5.4.9 and (5.4.10), we obtain

This readily implies that the subspace


Example 167

is T invariant. Moreover, it is easily seen that the columns of


are linearly independent; equation (5.4.11)
implies that in the basis of j£ formed by these columns, 7^ is represented by
the matrix 7,.
Further

so X\<£ is represented in the same basis in Z£ by the matrix X. Now it is clear


that is similar to (A',, T and thus Y) is alsot
standard triple for L 3 (A).
It remains to prove the uniqueness of Z£. Assume that J£" is also a
7-invariant subspace such that (X^., 7^,, Y) is a standard triple for L 3 (A)
(for some admissible Y). As any two standard triples of L 3 (A) are similar,
there exists an invertible transformation 5: =2" —> £ such that Xl(f. = X^S,
I \<r>' — >j 11 <fj. 1 nen

In particular

But the matrix col[AT'']!=J is invertible, so


Theorem 5.3.2 is proved completely.

5.5 EXAMPLE

We illustrate Theorem 5.3.2 with an example. Let

Then

is the standard triple for L(A) of Proposition 5.1.2, where


168 Applications to Matrix Polynomials

is the companion matrix for L(A). As we are concerned with semiinvariant


subspaces for C, it is more convenient to use a Jordan form for C in place of
C itself. The only eigenvalues of L(A) (and thus also of C) are 0, 1, and 2. A
calculation shows that the vectors

form a Jordan chain of C corresponding to 0; the vector x3 -


(1, 0, 0, 0, 0, 0) is an eigenvector of C corresponding to 0; the vectors

form a Jordan chain of C corresponding to 1 ; and the vector jc6 =


(-1,1, -2, 2, -4, 4) is an eigenvector of C corresponding to 2. The vectors
jc,, . . . ,jc 6 are easily seen to be linearly independent. Denoting by S the
invertible 6 x 6 matrix with columns Jt l 5 . . . , jc6, let

(/ is the Jordan form of C); and

Clearly, (X, J, Y) is a standard triple for L(A).


We now find some factorizations

where L,( A), / = 1, 2, 3 are monic matrix polynomials of the first degree. As
Example 169

in Theorem 5.3.2, we express these factorizations in terms of the supporting


triinvariant decompositions

with respect to the standard triple (X, J, Y). So we are looking for
/-semiinvariant subspace M with 5£ and SB + M J invariant, such that the
transformations

and

are invertible. In particular, dim !£ = dim M = dim N = 2. As £ and £6 + M


are J invariant, we have

and

where ^?A (/) is the root subspace of / corresponding to the eigenvalue A 0 .


We consider only those supporting triinvariant decomposition (5.5.3) for
which

and

In other words, we consider only those factorizations (5.5.1) for


which detL and or
equivalently

One could consider all other factorizations (5.5.2) of L(A) in a similar way.
First, we find all pairs of /-invariant subspaces (=$?, 3? + M} with the
170 Applications to Matrix Polynomials

properties (5.5.4) and (5.5.5). Using the Jordan form (5.5.1), it is not
difficult to see that all such pairs are given by the following formulas:

Let us check which of these pairs (£, !£ + M ) give rise to supporting


triinvariant decompositions, that is, for which pairs the transformations

are invertible. We have for

(in the basis e, + ae 3 , e4 in J^and the standard basis in <p 2 ), and this matrix
is invertible for all

which is not invertible. For

which is invertible. For

which is invertible if and only if /3 ^ 0. (In this calculation we have used the
formula
Factorization into Several Factors and Chains of Invariant Subspaces 171

Summarizing, one obtains all the supporting triinvariant decompositions


(5.5.3) with the properties (5.5.4) and (5.5.5) where either £ = Span{e, +
ae3,e4} for some a E <(7, M is a direct complement to Span{e l5 e 3 , e4, e6} in
<p6, or, for some nonzero ft E <p we have ^£- Span{e,, e 4 }, M is a direct
complement to !£ in Spanf^, e2 + 0e3, e 4 , ej, and JV is a direct comple-
ment to Span{e,, e2 + (3e3, e 4 , e6} in <p .
Using the formulas given in Theorem 5.3.2, one finds all the factoriza-
tions (5.5.2) corresponding to the supporting triinvariant decomposition
with properties (5.5.4) and (5.5.5) (here a E <p and j8 E <p, jS^O are as
above):

5.6 FACTORIZATION INTO SEVERAL FACTORS AND


CHAINS OF INVARIANT SUBSPACES

In this section we study factorizations of the monic n x n matrix polynomial


L(A) of degree / into the product of several factors:

where L,(A), . . . , L A (A) are monic n x n matrix polynomials of positive


degrees /, lk, respectively (of course, /, + • • • + lk = /). We have al-
ready encountered particular cases of factorizations (5.6.1) in Theorem
5.3.2 (with k = 3) and in Corollary 5.3.3 (with A: = 2). In Theorem 5.3.2
factorizations (5.6.1) with k = 3 were described in terms of supporting
triinvariant decompositions associated with semiinvariant subspaces of a
linearization of L(\). In contrast, the description of (5.6.1) is to be given
in terms of chains of invariant subspaces for a linearization of L(\).
The following main result can be regarded as a generalization of Corol-
lary 5.3.3.

Theorem 5.6.1
Let (X, T, Y) be a standard triple for L(A). Then for every chain of
T-invariant subspaces
172 Applications to Matrix Polynomials

satisfying the property that the transformations

are invertible (for some positive integers mk < mk_l < • • • < m2 < /) there
exists a factorization (5.6.1) of L(A), with the factors L ; (A) uniquely
determined by the chain (5.6.2), as follows. For j = 1,2,. . . , k - 1, let M) be
a direct complement to ^i+l in ^ (by definition, ^ = <p"') and let
PM : j£y —» M •be the projector on M . along &j+l. Then for j = 1,2,. . . , k —

so Vkq are transformations from <p" into ££k for q = 1, . . . , mk. Conversely,
for every factorization (5.6.1) of L( A) there is a unique chain of T-invariant
subspaces (5.6.2) such that for j = 2,3,. . . , k the transformations

where m (/; is the degree of L y ), are invertible and


formulas (5.6.3) and (5.6.5) hold.

Observe that in view of Proposition 3.1.1 formulas (5.6.3) do not depend


on the choice of M - .

Proof. Apply Corollary 5.3.3 several times to see that factorization


Factorization into Several Factors and Chains of Invariant Subspaces 173

(5.6.1) holds for the monic matrix polynomial


having the standard triple (X^, T^, Yj), where Yf is given by (5.6.4)
(y = 2,. . . , k). Now use Theorem 5.1.6 to produce the formulas

where

(so V-q are transformations from (pn into J^ for g = 1,. . . , m y ). In particular
(with j = k), formula (5.6.5) follows. Further, using the formulas for the
standard triple of the factor L2( A) in Corollary 5.3.3, one easily obtains the
desired formulas [equation (5.6.3)]. The converse statement also follows by
repeated application of the converse statement of Corollary 5.3.3.

A "dual" version of Theorem 5.6.1 can be obtained if one uses the left
canonical form [equation (5.1.14) instead of the right canonical form
equation (5.1.13)] to produce formulas for L y (A)Ly + 1 (A) • • • Lk(\). Then
one uses (5.1.13) [instead of (5.1.14)] to derive the formulas for
L , ( A ) , . . . , L^.^A). We omit an explicit formulation of these results.
We are interested particularly in factorizations (5.6.1) with linear factors
L y (A): L ; (A) = A / + Af for some n x n matrices At (y = 1,. . . , k). Note
that in contrast to the scalar case, not every monic matrix polynomial admits
such a factorization:

EXAMPLE 5.6.1. Let

We claim that L(A) cannot be factorized into the product of (two) linear
factors. Indeed, assume the contrary:

for some complex numbers a,, bt, cit dt, i = 1, 2. Multiplying the factors on
the right-hand side and comparing entries, we obtain

Letting
174 Applications to Matrix Polynomials

we can rewrite equality (5.6.6) in the form

which implies However, there is no 2*2matrix A wiht thie

property (indeed, such an A must have only the zero eigenvalue, but then
inevitably A2 = 0).

As we shall see in the next theorem, a necessary (but not sufficient)


condition for a monic matrix polynomial L( A) not to be decomposable into a
product of monic linear factors is that the linearization of L ( A ) is not

diagonable. Indeed, in Example 5.6.1 the linearization of has


only one Jordan block /4(0) in its Jordan form.

Theorem 5.6.2
Let L( A) be an n x n monic matrix polynomial of degree I for which the
companion matrix is diagonable. Then there exist n x n matrices A^,. . . , A,
such that

Proof. Let (X, T, Y) be a standard triple for L(A), and let

Obviously, the JV; are subspaces in <J7n/ and

By Theorem 1.8.5 there exist ^-invariant subspaces M\ C M2 C • • • C Mt_}


such that M • is a direct complement to N^ in (p"'. The transformations

are invertible. Indeed, by the choice of Mt we have KerfcoHAT1]'"^.) =


{0}. As the matrix co^XT']'^ is invertible, the matrix colfAT'^I,', 'has
linearly independent rows and thus ImfcolIAT'jri^ ) = Im(col[AT'"]{-lJ) =
<p"', y = ! , . . . , / - ! . Invertibility of (5.6.7) now follows. The proof is
completed by applying Theorem 5.6.1. D
Differential Equations 175

5.7 DIFFERENTIAL EQUATIONS

Consider the homogeneous system of differential equations with constant


coefficients:

wherere n x n (complex) matrices, ans ann -


dimensional vector function of t to be found. The behaviour of solutions of
equation (5.7.1) as t—»°° is an important question in applications to physical
systems. We look for solutions with prescribed growth (or decay) at infinity.
It will turn out that such solutions depend on certain invariant subspaces of
a linearization of the monic matrix polynomial

connected with (5.7.1).


First we observe that a solution of (5.7.1) is uniquely defined by the
initial data x('\d) = Xj, j = 0 , . . . , / — 1, with given initial vectors
Xn. . . . , X i _ i . Indeed, denoting by y(t) the n/-dimensional vector

equation (5.7.1) is equivalent to the following equation:

As it is well known [cf. Section 2.10, especially formula (2.10.8)], a solution of


equation (5.7.2) is uniquely defined by the initial data y(a), which amounts
to the initial data x(n(d), j = 0 , . . . , / - 1 for equation (5.7.1). In particular,
the dimension of the set of all solutions of (5.7.1) (this set obviously is a
linear space) is nl, the number of (complex) parameters in the n-dimensional
vectors jc () ,. . . , xl_l that determine the initial data of a solution and thus
the solution itself.
It will be convenient to describe the general solution of (5.7.1) in terms
of a standard triple (X, T, Y) of the monic matrix polynomial L(\).
176 Applications to Matrix Polynomials

Lemma 5.7.1
A function x(t) is a solution of (5.7.1) if and only if it has the form

for some vector c E <p" .

Proof. Differentiating (5.7.3), we obtain

so

which is equal to zero in view of Proposition 5.1.4. It remains to show that


every solution of (5.7.1) is of the type (5.7.3) for some c E <pn/. As the linear
space of all solutions of (5.7.1) has dimension nl it will suffice to show that
the solutions Xe'Tc},. . . , Xe'Tcnl that correspond to a basis c,,. . . , cnl in
(pn/ are linearly independent. In other words, we should prove that Xe'Tc = 0
for all t^a implies c = 0. Indeed, differentiating the relation Xe'Tc = Q
j times, we obtain XT'e'Tc = 0 for 7 = 0,1, 2,. . . . In particular

As the matrices e'T and col(AT')|lo are nonsingular (Proposition 5.1.4), it


follows that c = 0.

Now let us introduce some T-invariant subspaces:+ffl (T)[resp.


is the sum of all root subspaces of T corresponding to its eigenvalues with
positive real part (resp. with negative real part); &10(T) is the sum of all root
subspaces of T corresponding to its pure imaginary eigenvalues (including
zero); and

Obviously, 3^0(T) is a /"-invariant subspace contained in &10(T). If it


happens that T has no eigenvalues with positive real part, we set £% + (T) =
(0). A similar convention will apply for &_(T), 020(T), and 3/C0(T).
Differential Equations 177

Let 3£i(T) be a fixed direct complement to ^Q(T} in $10(T) and note that
%i(T) is never T invariant [unless 3^(7) = {0}]. Otherwise ^(T) would
contain an eigenvector of T that, by definition, should belong to 5T0(r).
We now have the direct sum ("' = 9l_(T) + %0(T) 4- %t(T) + 9l+(T).
For a given vector c E <p"', let

where c_ e &_(r), c0 £ 3f 0 (r), c, £ %}(T), c + £ & + (r). We describe the


qualitative behaviour of solutions of (5.7.1) in terms of this decomposition
of the initial value of the solution x(t).
A solution x(t) of (5.7.1) is said to be exponentially increasing if for some
positive number /a

but

for every e > 0. Obviously, such a positive number n is unique and is called
the exponent of the exponentially increasing solution x(t). A solution x(t) of
(5.7.1) is exponentially decreasing if (5.7.5) and (5.7.6) hold for some
negative number /i [which is unique and is called again the exponent of x(t)].
We say that a solution x(t) is polynomially increasing if

for some positive integer m. Finally, we say that a solution x(t) is oscillatory
if

These classes of solutions of (5.7.1) can be distinguished according to the


decomposition (5.7.4) of the vector c, as follows.

Theorem 5.7.2
Let x(t) = Xe'Tc be a solution of (5.7.1). Then (a) x(t) is exponentially
increasing if and only ifc+ ^ 0; (b) x(t) is polynomially increasing if and only
if c+ — 0, Cj T^ 0; (c) x(t) is oscillatory if and only if c+ = Cj = 0, c0 ^ 0; (d)
x(t) is exponentially decreasing if and only if c+ = cl = CQ = 0, c_ 5^0. In
cases (a) and (d), the exponent of x(t) is equal to the maximum of the real
parts of the eigenvalues A0 of T with the property that PA c ^ 0, where PAo is
the projector on ^o(T) along
178 Applications to Matrix Polynomials

Proof. We have

where Without loss of generalit


[passing to a similar triple (X, T, Y), if necessary] we can assume that T_,
T0, T+ are matrices in Jordan form.
Note that for the Jordan block Jk(\) we have (according to Section 2.10)

So every entry in Xe'T+c+ is a function of the type

for some polynomials /?,(/)• Also, every entry in Xe'T°(cQ + c t ) is of the


type

whereas every entry in Xe'r°c0 is of the type (5.7.9) with all polynomials
pf(t) constant. Finally, every entry in Xe'T~c_ is of the type

Further, note that

if and only if c ± =0. Indeed, if equality (5.7.11) holds, then successive


differentiation gives XT'±e'T-c± = 0, / = 0 , 1 , . . . . In particular

As is a nonsingular matrix, the transformation


Differential Equations 179

has zero kernel, and equation (5.7.12) implies c ± =0. Also, the equality

holds if and only if c0 + c, = 0. Also

if and only if c0 = 0. According to the observation made in this and the


preceding paragraphs, statements (a)-(d) follow easily from formula (5.7.7).
For instance, assume that x(t) is exponentially increasing. In view of
(5.7.8)-(5.7.10), this means that Xe'T+c+*Q (since \ez\ = e*ez for any
complex number z), and this is equivalent to the inequality c+ ¥^ 0.

A special case with X — [I 0 • • • 0] and T the companion matrix of


L(A) deserves special attention. In this case the matrix col[AT']|lo is just
the identity, and thus

Exponentially decreasing solutions of (5.7.1) are of particular interest. We


present one result on existence and uniqueness of exponentially decreasing
solutions in which only partial initial data are prescribed.

Theorem 5.7.3
For every set of k vectors JC Q , . . . , jc^., in <p" there exists a unique exponen-
tially decreasing solution x(t) of (5.1 A) such that

if and only if the matrix polynomial L( A) admits a factorization


are monic matrix polynomials of
degrees k and I - k, respectively, such that Re A <0 for all A G o-(/.,) an
<&t A > 0 / o r fl//AGo-(L 2 ).

Proof. In the notation of Theorem 5.7.2 the solution x(t) is exponential-


ly decreasing if and only if

where c_ G £%_(T). When x(t) is given by (5.7.10) we have


180 Applications to Matrix Polynomials

It follows that for every set JCD, . . . , xk_l £ <p" there exists a unique expo-
nentially decreasing solution x(t) of (5.7.1) with x('\a) = jc,, / = 0 , . . . ,
A: - 1 if and only if the transformation

is one-to-one and onto. This amounts to the invertibility of


col[Ar(<r|ijM7.))l]fj01, which in turn is equivalent (by Corollary 5.3.3) to the
existence of a factorization L(A) = L 2 (A)L,(A). Moreover, in this factoriza-
tion [ X \ # _ ( T ) , T\m_(T)-> Y] is a standard triple for L t (A) (for a suitable Y"),
whereas (^, PT\lmp, PY) is a standard triple for L 2 (A) for a suitable X,
where P is the projector on 3fcQ(T) + 3ft+(T) along $._(T). As 7 1 | g8 _ (r) and
PT\lm P are linearizations of Lj(A) and L 2 ( A), respectively (Theorem 5.1.5),
it follows that indeed <3le A < 0 for all AGo-(L,), and S^^ A > 0 for all
Ae<r(L 2 ). D

5.«
S.8 DIFFERENCE EQUATIONS

In this section we consider the system of difference equations

where /10, . . . , A,_l are given n x n matrices, and {^}JL0 is a sequence of


n-dimensional vectors to be found. Clearly, given / initial vectors
XQ, . . . ,x,_l, the vectors x,, xl+1, and so on are determined uniquely from
(5.8.1). Hence, a solution {jcy.}°°=0 of equation (5.8.1) is determined by its
first / vectors.
Again, it will turn out that the asymptotic behaviour of solutions of
(5.8.1) can be described in terms of certain invariant subspaces of a
linearization of the associated monic matrix polynomial

Let (X, T, F) be a standard triple for L( A). The general solution of (5.8.1) is
then

where c E <p"' is an arbitrary vector. Indeed, putting jcy = XT'c, j =


0, 1, . . . , we have
0,1,.
Difference Equations 181

which is zero in view of Proposition 5.1.4. If the first / vectors in (5.8.2) are
zeros, that is

then by the nonsingularity of col[AT']Jlo we obtain c = 0. This means that


the solutions (5.8.2) are indeed all the solution of (5.8.1).
The solutions of (5.8.1) are now to be classified according to the rate of
growth of the sequence {jcy}JL0. We say that the solution {*y}JL0 is of
geometric growth (resp. geometric decay) if there exists a number q > 1
(resp. a positive number q < 1) such that

but

for every positive number e. The number q is called the multiplier of the
geometrically growing (or decaying) solution {jcy}JL0. The solution {*y}JL0 is
said to be of arithmetic growth if for some positive integer k the inequalities

holds. Finally, {jty.}°°=0 is oscillatory if

The classification of the solution Xj = XT'c, y = 0,1,. . . of (5.8.1) in


terms of c E <p"' is based on certain TMnvariant subspaces. Let us introduce
these subspaces. Denote by 2ft+(T) [resp. £%~(T)] the sum of all root
subspaces of T corresponding to the eigenvalues A0 of T with | A 0 | > 1 (resp.
with |A0| < 1), and let 3£l(T) be a direct complement to the subspace

in the sum of all root substances of T corresponding to the eigenvalues An


with |A 0 | = 1. Observe that ^ + (T), 9t~(T), and %°(T) are T invariant. We
have a direct sum decomposition
182 Applications to Matrix Polynomials

according to which every vector c G <pw/ will be represented as

Theorem 5.8.1
Let (xt: — XT'c}~=f) be a solution of (5.8.1). Then the solution is (a) of
geometric growth if and only if c+ 7^0; (b) of arithmetic growth if and only if
c+ = 0, c1 7^0; (c) oscillatory if and only if c+ = 0, c1 =0, cVO; (d) of
geometric decay if and only if c+ — c = c ( = 0 , c ~ 7 ^ 0 . In cases (a) and (d)
the multiplier of {jc,}^=0 is equal to the maximum of the absolute values of the
eigenvalues A0 of T with the property that PA c ^ 0, where PA is the projector
on £%A (T) along

The proof of Theorem 5.8.1 is similar to the proof of Theorem 5.7.2 if we


first observe that the rath power of the Jordan block of size k x k with
eigenvalue A is

(It is assumed here that This formula can be easily


verified by induction on ra.
The following result on existence of geometrically decaying solutions of
equation (5.8.1) can be established using a proof similar to that of Theorem
5.7.3.

Theorem 5.8.2
For every set of k vectors _y 0 , . . . , y k _ { in <p" there exists a unique geometri-
cally decaying solution {jtj}JL 0 with x(} = y ( } , . . . , x k _ { = yk_l if and only if
L(A) admits a factorization L(\) = L 2 (A)L,(A), where L2(\) and L,(A) are
monic matrix polynomials of degrees I - k and k, respectively, such that
Exercises 183

5.9 EXERCISES

5.1 For a monic n x n matrix polynomial L(A) of degree /, the pair of


matrices (X, T), where X and T have sizes n x nl and n/ x «/,
respectively, is called a right standard pair for L(A) if (X, T, Y) is a
standard triple of L(A), for some n x n matrix V.
(a) Prove that a pair of matrices (A', T) of sizes n x nl and «/ x «/,
respectively, is a right standard pair for a monic matrix poly-
nomial if and only if col|
vertible and

[Hint: The necessity follows from Proposition 5.1.4. To prove


sufficiency, define

(1)

and verify that (X, T, Y} is similar to the triple (F,, C,, #,)
from Proposition 5.1.2 with the similarity matrix col[AT'][lJ.]
(b) Show that given a right standard pair (X, T) of L(A), there
exists a unique Y such that (X, T, Y) is a standard triple for
L(A), and in fact Y is given by formula (1). [Hint: Use formula
(5.1.11) for the similarity between the standard triple (X, T, Y)
and the standard triple (Pj, C,, R^ from Proposition 5.1.2.]
5.2 A pair of matrices (T, Y) of sizes nl x nl and nl x n, respectively, is
called a left standard pair for the monic n x n matrix polynomial L( A)
if for some n x nl matrix X the triple (X, T, Y) is a standard triple of
L(A).
(a) Prove that a pair of matrices (T, Y) of sizes nl x nl and nl x n,
respectively, is a left standard pair for L(A) = /A 7 + EJIo Aj\' if
and only if [Y, TY, . . . , Tl~lY] is invertible and

(b) Show that given a left standard pair (T, Y) of L(A), there exists
a unique ^T such that (X, T, Y) is a standard triple of L(A), and
in fact

(c) Prove that (T, Y) is a left standard pair for L(A) = /A' +
Ejlj Aj\' if and only if (Y*,. T*) is a right standard pair for the
monic matrix polynomial
184 Applications to Matrix Polynomials

5.3 be a scalar polynomial with / distinct zeros

(a) Show that

is a right standard pair for L( A). Find Y such that (X, 7\ Y) is a


standard triple for L(A).
(b) Show that

is a left standard pair for L( A), and find X such that (X, T, Y) is
a standard triple for L(A).
5.4 Let be a scalar polynomial. Show that
J,(\0)) is a right standard pair or L(\) and that
a left standard pair for L(A). Find X and Y such that ([1 0 • • • OJ,
J,(A 0 ), Y) and (X, //(A 0 ), col[5,,]J=1) are standard triples for L(A).
5.5 be a scalar polynomial, where
are distinct complex numbers. Show that

and

are right and left standard pairs, respectively, of L(A), where


is an [. matrix and

is an lt x 1 matrix.
Exercises 185

5.6 Let

be a monic matrix polynomial, and let and


be standard triples for the polynomials L,(A) and L 2 (A), respectively.
Find a standard triple for the polynomial L(A).
5.7 Given a standard triple for the polynomial L(A), find a standard triple
for the polynomial S~'L(A + a)S, where S is an invertible matrix,
and a is a complex number.
5.8 Let (X, T, Y) be a standard triple for L(A). Show that

is a standard triple for the matrix polynomial L(A 2 ).


5.9 Given a standard triple for the matrix polynomial L(A), find a
standard triple for the polynomial L(/?(A)), where
is a scalar polynomial.
5.10 Let

be a 3 x 3 matrix polynomial whose coefficients are circulants:

(a A , bk, and c^ are complex numbers). Describe right and left


standard pairs of L(A). [Hint: Find an invertible 5 such that
S~1L(\)S is diagonal and use the results of Exercises 5.5-5.7.]
5.11 Identify right and left standard pairs of a monic n x n matrix
polynomial with circulant coefficients.
5.12 Using the right standard pair of a scalar polynomial given in Exercise
5.5, describe:
(a) The solutions of differential equation

where a 0 , . . . , al^l are complex numbers;


(b) The solutions of difference equations
186 Applications to Matrix Polynomials

5.13 Find the solution of the system of differential equations

where a0,. . . , at_l and b0,. . . , b,_l are complex numbers. When
are all solutions exponentially decreasing? When does there exist a
nonzero oscillatory solution?
5.14 Find the solutions of the system of difference equations

When do all nonzero solutions have geometric growth?


5.15 Find the supporting triinvariant decomposition !£ + M + {0} = <p'
corresponding to the divisor (A - A , ) " ' • • • (A - A^)"* of the scalar
polynomial (A - A,)* 1 • • • (A - \k)Pk (here af < 0;, y = 1,. . . , A:, and
at; are nonnegative integers). Use the standard triple determined by
the right standard pair described in Exercise 5.5.
5.16 Let A/ — A', and A/ - X2 be linear n x n matrix polynomials such
that the matrix Xl - X2 is invertible. Construct a monic n x n matrix
polynomial of second degree with right divisors A/ — Xl and A/ — X2.
[Hint: Look for a matrix polynomial with the standard pair ([/ /],
*,e*2).i
5.17 Let Lj(A) and L 2 (A) be monic matrix polynomials with no partial
multiplicities greater than 1. Show that the product L 1 (A)L 2 (A) has
no partial multiplicities greater than 2.
5.18 State and prove a generalization of the preceding exercise for the
product of k monic matrix polynomials with no partial multiplicities
greater than 1.
5.19 Show that a monic n x n matrix polynomial has not more than n
partial multiplicities corresponding to any zero of its determinant.
(Hint: Use Exercise 2.16.)
5.20 Prove that a monic n x n matrix polynomial of degree / with
circulant coefficients has not more than / partial multiplicities corres-
ponding to any zero of its determinant.
5.21 Describe all supporting triinvariant decompositions for the scalar
polynomial (A - A0)".
Exercises 187

5.22 Given an n x n monic matrix polynomial L( A) of degree /, a


CL-invariant subspace !£ is called supporting if the direct sum
decomposition 56 + M 4- {0} = <p"' is a supporting triinvariant de-
composition with respect to the standard triple

Find all supporting subspaces for the scalar polynomial

5.23 Find all supporting subspaces for the scalar polynomial

5.24 Prove that for a scalar monic polynomial L(A), every CL-invariant
subspace is supporting.
5.25 Describe all supporting subspaces for a monic matrix polynomial
whose coefficients are circulant matrices, that is, matrices of type

5.26 Give an example of a monic matrix polynomial of second degree


with nondiagonable companion matrix that admits factorization into
linear factors.
5.27 Prove the following extension of Theorem 5.6.2 for polynomials of
second degree. Let L(A) be a monic n x n matrix polynomial of
second degree such that its companion matrix has at least 2n - \
blocks in its Jordan form. Then L(\) admits a factorization into
linear factors ( A / - A^XI- A2). [Hint: Let (X, /) be a right
standard pair of L(A) with / in the Jordan form. Arguing by
contradiction, assume that every n columns of X formed by the
eigenvectors of L(A) are linearly dependent. Then the columns in
that correspond to the eigenvectors of L(A) are linearly
dependent, and this contradicts the invertibility of
188 Applications to Matrix Polynomials

5.28 A factorization L(A) = L 2 (A)L 3 (A) of a monic matrix polynomial


L( A) is called spectral if det L2( A) and det L3( A) have no common
zeros. Show that the factorization is spectral if and only if in the
corresponding triinvariant decomposition t£ 4- M + {0} = <pn/
(Corollary 5.3.3) the ^-invariant subspace SE is spectral.
5.29 Prove or disprove the following statement: each monic matrix poly-
nomial L(A) has a spectral factorization corresponding to every
triinvariant decomposition j£ 4- M + {0} = <p" with spectral T-
invariant subspaces $£ and M, where T is a linearization for L(A).
5.30 Let a,, « 2 , a 3 , a4 be distinct complex numbers, and let

(a) Show that

is a right standard pair for L(A).


(b) Find Y such that (AT, 7, 7) is a standard triple for L(A).
(c) Using the supporting triinvariant decomposition !£ 4- M 4- {0} =
<p4 with spectral T- invariant subspace #, find all spectral
factorizations of L( A).
5.31 Let M( A) and N( A) be a monic matrix polynomials of sizes n x n
and m x m, respectively, and of the same degree /, and let

be a direct sum of M(A) and N(\). Prove or disprove each of the


following statements: (a) the monic matrix polynomials Lj(A) and
L2( A) in every factorization L(A) = Lj( A)L 2 (A) are also direct sums;
(b) same as (a) with the extra assumption that M( A) and N( A) do not
have common eigenvalues.
5.32 Verify formula (5.8.3).
5.33 Supply the details for the proof of Theorem 5.8.1.
5.34 Prove Theorem 5.8.2.
Chapter Six

Invariant Subspaces
For Transformations
Between Different
Spaces

We are now to generalize the notion of an invariant subspace for transfor-


mations from <p" into <p" in such a way that it will apply to transformations
from fm+H into <p", or from <p" into $m+". The definitions introduced will
have associated with them a natural generalization of similarity, called
"block similarity", that will apply to transformations between different
spaces. This will form an equivalence relation on the class of transform-
ations between two given (generally different) spaces. A canonical form is
developed for this similarity that is a generalization of the Jordan normal
form. These ideas and results are then applied to the resolution of two
spectral assignment problems. This really means analysis of the changes in
spectra brought about by block similarity transformations.
Although this material is based on the theory of feedback in time-
invariant linear systems, the presentation here is in the framework of linear
algebra.

6.1 [A B]-INVARIANT t SUBSPACES

Consider a transformation from <p m+ " into (p". Our objective in this section
is to develop and investigate a generalization of the notion of an invariant
subspace that will apply to such transformations and that reduces to the
familiar concept when m =0. Let P be the projector on $m+n that maps
each vector onto the corresponding vector with zeros in the last m positions.
We treat vectors of §m+n in terms of their components in Im P and

189
190 Invariant Subspaces for Transformations Between Different Spaces

Im ), respectively, and, fo we identify


with (jtj, . . . , xn} E <p". Then we may repre-
sent any x E <p m+ " as an ordered pair (Px, (I - P)x) and, with respect to this
decomposition, a transformation from <p m+n into <p" can be written in the
block form [A B] where v4:<p"^<p" and fi: <p m -+(p". We also write

A subspace M of <p" will be said to be [A B] invariant if there is a


subspace ^ of <p m+ " with M = P3> and [A B}& C P& = M. Of course, when
m = 0, P = /, and this is interpreted as the familiar definition AM C M for A
invariance.
We now characterize this concept in different ways and, for this purpose,
introduce another definition. Given a transformation [A B]: <p" 4-
<p m -»<F' 1 , a transformation T: <p m+ "^ fm+n is called an extension of
[v4 5] if it has the form

for some transformations

Theorem 6.LI
Let M be a subspace of <p" and [A B] be a transformation from $m+n into
<p". Then the following are equivalent: (a) M is [A B] invariant; (b) there
exists a subspace y of <p m+ " with M = Ptf and an extension of [A B] under
which y is invariant; (c) the subspace M satisfies

(d) there is a transformation such that

Proof. The theorem will be proved by verifying the implications


(a)=>(d):»(c)=>(b)=>(a).
(a)=^>(d): Since M is [A B] invariant, there is a subspace with
M - Py and [A B\V C M. Let xl,. . . , xk be a basis for M. Then there exist
zl,...,zkESe such that *. = Pz-, / = 1, 2, . . . , * . Define y. = (/ - P)z- E
<pm, y - 1, 2,. . . , k and then, since E 5^, [A B\tf C J( implies that, for
; = 1, 2,. . . , k, Axf + Byj E M. Now define a transformation F: <p"-» <pm by
setting FXJ — yt for y = 1,. . . , k and letting F be arbitrary on some direct
complement to M in <p". Then for any m = S^, a^ E ^ we have

as required.
[A fi]-Invariant Subspaces 191

(d)=>(c): Given condition (6.1.2) we have, for any xEM

and (6.1.1) follows.


(c)=>(b): Let *,,. . . , xk be a basis for M and, using formula (6.1.1), let
y{,. . . , yk be vectors in <pm for which Ax} + Byj £ M for j = 1, 2, . . . , & .
Define a transformation //: <p"—><p in by means of the relation Hxj = yi,
j = 1, 2,. . . , k and letting H be arbitrary on some direct complement to M
in (p". Then define the subspace <f of <p m+ ' 1 by

and note our construction ensures that (A + BH)m E M for any mEM.
Consider the extension of \A B]. It is easily verified that & is
invariant under this extension.
This follows immediately from the definitions.

We will find the next simple corollary useful.

Corollary 6.1.2
With the notation of Theorem 6.1.1, if M is [A B] invariant, then for any
transformation F: <p"—*• <pm, M is [A + BF B] invariant.

Proof. We use the equivalence of statements (a) and (d) of the theorem.
The fact that M is [A B] invariant implies the existence of an
such that M is (A + BFQ)invariant.Thus, for any
Consequently,
AMCM+Y

Subspaces characterized by equation (6.1.1) are described in more


geometric terms by replacement of Im B by some subspace °V of ((7". In this
context it is useful to describe a subspace M as A invariant (mod Y) if

When V = {0} a subspace is A invariant (mod V) if and only if it is A


invariant. At the other extreme, when V = <p", every subspace is A invariant
(mod T).
For a given transformation A: <p" —> <p" and a subspace °V of (p", consider
the class of all subspaces that are A invariant (mod T). It is easy to see that
this class is closed under addition of subspaces, but is not closed under
intersection. This is illustrated in the next example. We observe that
192 Invariant Subspaces for Transformations Between Different Spaces

(reverting to the language of transformations), although the set of all


A -invariant subspaces form a lattice, the same is not generally true for the
set of all [A /?]-invariant subspaces.

EXAMPLE 6.1.1. Let A: <p 3 —»<p 3 be defined by linearity and the equalities
Ael — e2, Ae2 = e^, Ae3 = el. Let °V - Span{e2 + e3}. The subspaces
Span{el,e2} and Span{e,,e 3 } are both A invariant (mod Y). (The sub-
space Span{e,,e 2 } is actually A invariant.) However, their intersection
Span{e,} is not A invariant (mod V). Indeed, Ae\ = e2f£Span{el} +
Span{e2 + e>3}.

Given A and Y as above, it is natural to look for a "largest" subspace


among all of those that are A invariant (mod Y). More generally (cf.
Section 2.7), given a subspace M of (p", a subspace °IL of M that is A
invariant (mod V), is said to be maximal in M if °H contains all other
subspaces of M that are A invariant (mod V).

Proposition 6.1.3
For every subspace M C (p" there is a unique subspace of <p" that is A
invariant (mod V) and maximal in M.

Proof. Let °\L be the sum of all subspaces that are A invariant (mod V)
and are contained in M. Because of the finite dimension of M, °U is in fact
the sum of a finite number of such subspaces. Consequently, °U is itself A
invariant (mod Y) and thus maximal in M. The uniqueness is clear from the
definition. D

6.2 BLOCK SIMILARITY

In the preceding section the idea of [A B]-invariant subspaces has been


developed where [A B] is viewed as a transformation from (p" + <pm into
<P". We must also consider transformations of the other kind, namely, those
acting from <p" to (p" 4- (p"1. Such transformations can be written in the form
where A: <p"—»<p" and C: <p"—»<p'". For these transformations, we
need a dual concept of -invariant subspaces where is viewed as a
transformation from (p" into (p" 4- <pm. Thus, guided by Proposition 1.4.4, it
[ A I invariant if an only if M x is
[A* C*] invariant in the sense of Section 6.1. We develop this idea in
Section 6.6. The purpose of this section is to generalize the notion of
similarity to transformations [A B] and in a way that will be consistent
with the definitions of these generalized invariant subspaces.
Block Similarity 193

Let us begin with similarity for transformations from <p" into


1
<p" 4- (p" . In this case it is natural to say that a transformation is

similar to if there is an invertible transformation 5 on <p" 4- <f"m such


that

and the additional assumption that <p" is S invariant. Thus S\(n defines an
invertible transformation on (f"1 — the space on which acts. This
means that, with respect to the decomposition S has the
representation

where X, Y are invertible transformations on <p" and <pm, respectively. The


formal definition is thus as follows: transformations from
1 1
<p" into (f " 4- (p" are said to be block similar if there is an invertible
transformation

such that

Going to the adjoint transformations, this leads us to the dual definition:


transformations [Al fij and [A2 B2] from <p" 4- <pm into <p" are said to be
block similar if there is an invertible transformation

such that

Now let us describe block-similar pairs [Al fij and [A2 B2] in two other
ways.
194 Invariant Subspaces for Transformations Between Different Spaces

Theorem 6.2.1
Let [Al 5J and [A2 B2] be transformations from <p" + <pm into <p". Then
the following statements are equivalent: (a) [A^ Bv] and [A2 B2] are block
similar; (b) there exist invertible transformations N and M on <p" and <pm,
respectively, and a transformation F: <p n —» <f"" such that

(c) for any extension Tl of [A l Bv] there is an extension T2 of [A2 B2] and a
triangular invertible transformation S of the form (6.2.3) for which T{ =
ST2S~\

Proof. Given statement (a) and, hence, equation (6.2.4), let F= LN \


and it is found immediately that equation (6.2.4) implies the relations
(6.2.5). So (a)=>(b).
Given statement (b), define 5 as in (6.2.3), let L = FN, and let

be an extension of [ A } J?J. Then it is easily verified that 5 1T^S is an


extension of [A2 B2], and statement (c) follows.
Finally, statement (c) implies that for any extension 7, of [Al #J [as in
(6.2.6)] there is an extension T2 of [A2 B2] such that T2 = S^T^S with S as
in (6.2.3). This immediately implies equation (6.2.4). Thus (c)=>(a).

Corollary 6.2.2
Let [Al Z?j] and [A2 B2] be block-similar transformations with transform-
ing matrix S given by [6.2.3]. Then Mis an [A t B ^-invariant subspace if and
only if N~1M is an [A2 B2]-invariant subspace.

Proof. Assume that M is [Al Bv] invariant. By Theorem 6.1.1 there is


an extension Tl of [Al #J and a subspace & such that M = Py and
T,ycy. Since But also, using (6.2.3
P(S'iy) = N~lPy=N'lM. Hence [A2 fl2](S~V)C AT1^ and, by defin-
ition, N~}M is [A2 B2] invariant.
If we are given that N~1M is [A2 B2] invariant, it follows from T2 =
5~ 1 r i 5that^is[^l 2 5J invariant.

Corollary 6.2.3
If transformations [Al #J and [A2 B2] are block similar, they have the
same rank.

Proof. Let [A{ B{] and [A2 B2] be block similar. Then Theorem 6.2.1
implies that
Block Similarity 195

Writing G = FNM we see that

But it is easily verified that Im Im and s


rank[,4 2 B2] = rank[y4, £J.

By use of the characterizations of block-similar transformations de-


veloped in Theorem 6.2.1, it is easily verified that block similarity deter-
mines an equivalence relation on the class of all transformations from
<p" + <pm into <p". This immediately raises the problem of finding a canonical
form for representations of the transformations in the equivalence classes
determined by this relation. The rest of this section is devoted to the
derivation of such a form. It will, of course, be a generalization of (and so
be more complicated than) the Jordan normal form, which is associated with
similarity of transformations in the usual sense, and which appears herein as
Theorem 2.2.1.
Our argument will make use of the Kronecker canonical form for linear
matrix polynomials under strict equivalence, as developed in the appendix.
The following proposition is an important step in the argument. Note that it
is convenient to work with matrices here. The previous analysis applies, of
course, when they are viewed as transformations in the natural way.

Proposition 6.2.4
Let Al and A2 be n x n matrices and Bl and B2 be n x m matrices. Then
[Al B,J and [A2 B2] are block similar if and only if the linear matrix
polynomials [/A + A^ B\\ and [I\+ A2 B2] are strictly equivalent, that is,
there exist invertible matrices S and T such that

Proof. Assume that (6.2.7) holds and write

where Tn is n x n. Then

Hence 7\, =
= S~\
5 \and
and

Equation (6.2.7) also implies that


196 Invariant Subspaces for Transformations Between Different Spaces

It follows that r,2 = 0 and then that SfijT^ = B2. Combining this relation
with (6.2.5), it follows from Theorem 6.2.1 that [Al Bt] and [A2 B2] are
block-similar.
Conversely, suppose that the relations (6.2.5) hold for appropriate N, M
and F. Then (6.2.7) holds with 5 = AT1, Tn = N, T12 = 0, T2l = FN~r, and
T22 = M.

Now we are ready to state and prove a result giving a canonical form for
block-similar transformations and known as the Brunovsky canonical form.
In the statement of the theorem /*(A) will, as usual, denote the k x k
Jordan block with eigenvalue A.

Theorem 6.2.5
Given a transformation [A B]: <p" + <f""-^ <p", there is a block-similar
transformation [A0 B0] that (in some bases for <p" and <f"") has the
representation

for some integers kl > • • • ^ kp > 0 and all entries in BQ are zero except for
those in positions ( k { , 1), (fcj + k2, 2),. . ., (k} + • • • + kp, p), and these
exceptional entries are equal to one. Moreover, the matrices AQ and BQ
defined in this way are uniquely determined by [A B], apart from a permu-
tation of the blocks J^( A j ) , . . . , / / (A^) in (6.2.9).

Thus the pair of matrices A0, BQ or the block matrix [AQ B0] may be
seen as making up the Brunovsky canonical form for the transformation
\A B]. It will be convenient to call the matrix the
Kronecker part of A0 and the integers kl, . . . , kp the Kronecker indices of
[A B]. Similarly, we call the Jordan part of AQ and
/ j , . . . , lq the Jordan indices of {A B].

Proof. We use the terminology and results of the appendix to this book.
We may consider A and B to be n x n and n x m matrices, respectively.
Consider the linear matrix polynomial

of size n x (n + m). As the equation

has no nontrivial polynomial solution *(A), the minimal row indices of C(A)
Analysis of the Brunovsky Canonical Form 197

are absent. Further, the polynomial AC( A ') = [/ + \A, \B] obviously has
no elementary divisors at zero, so C(A) has no elementary divisors at
infinitv. Let k /:_ be the minimal column indices of C(A) and
' be the elementary divisors of C(A). Then
Theorem A.7.3 ensures that C(A) is strictly equivalent to the linear matrix
polynomial

where Lk is the k x (k + 1) matrix

and s — max Ae( p (rank C(A)) - n [and we have used the elementary fact that
— //(AO) and J,(-\0) are similar]. After a permutation of columns the
polynomial (6.2.10) becomes [/A + A() B0] with A0 and BQ as defined in the
statement of the theorem. The theorem itself now follows in view of
Proposition 6.2.4. D

6.3 ANALYSIS OF THE BRUNOVSKY CANONICAL FORM

We first draw attention to an important special case of Theorem 6.2.5. This


concerns transformations [A B]: <p m + n —> <p" in which the pair (A, B) is a
full-range pair in the sense defined in Section 2.8. That is, when

where p is the degree of a minimal polynomial for A.


The following lemma will be useful.

Lemma 6.3.1
Consider any transformations
0,1, 2 , . . . we have

Proof. The proof is by induction on s. When s = 0, equation (6.3.1) is


trivially true. Using a binomial expansion it is found that
198 Invariant Subspaces for Transformations Between Different Spaces

Hence

Assuming that the relation (6.3.1) holds when s = r - 1, this implies that the
right-hand side of (6.3.1) is contained in the left-hand side. But the opposite
inclusion follows from that already proved on replacing A by A — BF.

We now formulate other characterizations of full-range pairs (A, B).

Theorem 6.3.2
For a transformation [A B]: $m+n—> <p" the following statements are
equivalent: (a) the pair (A, B) is a full-range pair; (b) there is a full-range pair
( A } , B I ) for which [ A t £,] and [A B] are block-similar; (c) in the
Brunovsky form [AQ B0] for [A B], the matrix A0 has no Jordan part; (d)
the rank of the transformation [I\ + A B] does not depend on the complex
parameter A.

Proof. Consider statement (b). If [Al #,] and [A B] are block-similar,


then, by Theorem 6.2.1, there are invertible transformations N, M and a
transformation F such that

Thus From the definition of full-range pairs and


Lemma 6.3.1 it follows that (A, B) is a full-range pair. So (a) and (b) are
equivalent.
Now consider a canonical pair (AQ, B0) as defined in Theorem 6.2.5. It is
easily verified that such a pair is a full-range pair if and only if the Jordan
part of AQ is absent. Since [A B] is block-similar to a canonical pair
[A0 BQ] (by Theorem 6.2.5), the equivalence of (a) and (c) follows from the
equivalence of (a) and (b).
Consider condition (d). It follows from Corollary 6.2.3 that the rank of
[/A + A B] for any A £ <f is just that of [/A + A0 BQ] where [A0 BQ] is a
Brunovsky form for [A B]. A moment's examination of A0 and BQ convin-
Analysis of the Brunovsky Canonical Form 199

ces us that the rank of [/A + AQ B0] takes the same numerical value, except
at the points A = - A y , j = 1, . . . , q, where there is a reduction in rank. Thus
the rank of [/A + A B] is independent of A if and only if there is no Jordan
part in A0, and the equivalence of (c) and (d) is proved.

So far, the discussion of this section has focussed on cases in which the
matrix AQ of a canonical pair (^4 0 , BQ) has no Jordan part. This can be
described as the case q = 0 in equation (6.2.9). It is also possible that AQ has
no Kronecker part; the case p = 0 in equation (6.2.9). In this case BQ = 0 as
well. We return to this case in Section 6.6.
We conclude this section by showing that the Kronecker indices of the
Brunovsky form can be determined directly from geometric properties of
the transformation [A B] without resort to the computation of the minimal
column indices of [/A + A B].

Proposition 6.3.3
Let [A B] be a transformation from <p m+ " into <p" and define the sequence
d_i, dQ, dt, . . . by d_l = 0 and, for s = 0, 1, . . .

Then the Kronecker indices £,,. . . , kp of [A B] are determined by the


relations

Note that the sequence d_lt d0, . . . is ultimately constant and (if B ^0),
is initially strictly increasing (see Section 2.8).

Proof. Use Theorems 6.2.1 and 6.2.5 to write

where M and N are invertible and [A0 B0] is block similar to [A B]. Now
Lemma 6.3.1 implies

Consequently, the integers ds defined by formula (6.3.2) are invariant under


block similarity. Now formula (6.3.3) is easily verified for a canonical pair
A , B .
200 Invariant Subspaces for Transformations Between Different Spaces

Note that the number of Kronecker indices p is given by equation (6.3.3)


in the case s = 0. Thus

khjjkjkh8huhioytyyy[[piiiouiutiutugfjjugijugifuiugfiuiguigu ioug

In some special cases Theorem 6.2.5 can be used to describe explicitly all
[A B] -invariant subspaces. We consider a primitive but important "full-
range" case in this section.
Theorem 6.4.1
Let [A B] be a transformation from <p" + 1 into (p" for which (A, B) is a
full-range pair. Then there exists a basis / , , . . . , / „ in (p" such that every
m-dimensional [A B]-invariant subspace M ^ {0} admits the description:

where r } , . . . , r, are positive integers with rl + • • • + r, = m and A t , . . . , A,


are distinct complex numbers with the
understanding that 0! = 1 and that Conversly, every
subspace M C <p" of the form (6.4.1) is [A B] invariant.

Proof. Taking advantage of the equivalence (a)<=>(b) in Theorem 6.3.2,


we can assume that

Let M ^ {0} be an [A #]-invariant subspace. Then, by Theorem 6.1.1,


there exists a 1 x n matrix such that M is invariant for
the matrix
Description of [A BJ-In variant Subspaces 201

Let r,, . . . , r, be all the partial multiplicities of (^4 + BF)\M (so r{ + • • • +


r, = dim Jt), and let A , , . . . , A, be the corresponding eigenvalues. For every
A0 G <p the matrix A 0 7 —(A + BF) has a nonsingular ( n - l ) x ( n - l ) sub-
matrix (namely, that formed by the rows 1,2, . . . , n - l and columns
2, 3, . . . , n). It follows that dim Ker( A07 - (A + BF)) = I for every A0 e
cr(A). So there is exactly one Jordan block in the Jordan form of A + BF
corresponding to each A 0 E a(A + BF). Hence the same property holds for
(A + BF)\M, and the eigenvalues A 1 5 . . . , A, must be distinct. It follows that
in order to prove that M has the form (6.4.1), it will suffice to verify that for
any Jordan chain g, , . . . , gr of (A + BF)\M corresponding to A y we have

Observe first that

and consequently A y is a zero of the polynomial


of multiplicity at least r. Further, for t = 1, 2, . . . , r

(and the right-hand side is interpreted as zero for t = 1). Indeed, equality in
the 5th place (s = 1, . . . , n - 1) on both sides of (6.4.3) follows from the
easily verified combinatorial identity:

Equality in the nth place on both sides of (6.4.3) amounts to

or

but the left-hand side of this equation is just the (t - l)th derivative of the
polynomial evaluated at A y ; so equation (6.4.4),
and hence (6.4.3), is confirmed.
We have verified that the vectors
form a Jordan chain of A + BF corresponding to A ; . As the restriction
(A is unicellular, there exists a unique (A + BF)-
202 Invariant Subspaces for Transformations Between Different Spaces

invariant subspace in &tk(A + BF) of dimension r, and this subspace is


spanned by the vectors in any Jordan chain of (A + BF) of length r
corresponding to A y . So (6.4.2) follows.
Conversely, let M be given by (6.4.1) (with fk replaced by ek, k =
! , . . . , « ) . Let/(A) = A" - fln_1A"~1 — • • • — a0 be a polynomial such that A y
is a zero of /(A) of multiplicity of at least r , / = 1 , . . . , / . As we have seen
above, the vectors form a Jordan
chain of A + BF corresponding to A for
. So by Theorem 6.1.1, M is [A B] invariant.

The case /= m in Theorem 6.4.1 deserves special attention.

Corollary 6.4.2
Let [A B] be as in Theorem 6.4.1. Then there exists a basis f^, . . . , / „ m <p"
such that, for every m-tuple of distinct complex numbers A j , . . . , A m , the
m-dimensional subspace

is [A B] invariant.

This corollary shows that (at least in the case of a full-range pair
A: <p"—* <P" and B: <p-» <p") there are a lot of [A B]-invariant subspaces.
Indeed, Corollary 6.4.2 shows the existence of a family of [A B]-invariant
m-dimensional subspaces that depends on m complex parameters (namely,

For the general case of a full-range pair we have the following partial
description of [A /?]-invariant subspaces.

Theorem 6.4.3
Let (A, B) be a full-range pair with Kronecker indices /c, > • • • > /c r . Then
there exists a basis fn,. . . , /^ , / = 1,. . . , r in <p" such that for every r-tuple
of nonnegative integers lt,. . . , lr satisfying lt:^ kt, i = 1,. . . , r, and for every
collection } of complex numbers the subspace

is [A B] invariant.

The proof of Theorem 6.4.3 is obtained by combining Theorem 6.3.2 and


Corollary 6.4.2.
The Spectral Assignment Problem 203

jhyuuyuyuiuiyuiuyuiyiuioyuiouyiouiojjklhjjhkjkfhdjdffffffffaaasseeui

For a transformation A on <p" the eigenvalues are invariant under similarity


transformations. More generally, if A is defined by a transformation
[A B]: <p" + <pm-» (f"1, then, by Theorem 6.2.1, block similarity transforms
A into N~l(A + BF)N for some invertible N. Thus the eigenvalues of A are
no longer invariant, but are transformed to those of A + BF, where F
depends on the similarity. Now we ask, for given [A B], what are the
attainable eigenvalues of A + BF? We do not answer this question directly,
but we present solutions to two closely related problems.
First, suppose that we are given n complex numbers A } , . . . , An (possibly
with repetitions) that are candidates for the eigenvalues of A + BF. Under
what conditions on the transformation [A B] does a transformation
F: <p"-» <p" exist such that the numbers A ] 5 . . . , \n are just the eigenvalues
of A + BF, counting algebraic multiplicities? This is known as the spectral
assignment problem. It is important in its own right and is also relevant to
our discussion of the stability of [A B]-invariant subspaces.
Clearly, when B - 0, the problem is not generally solvable. Another
extreme case arises if B — I when it is easily seen that a solution can always
be found by using diagonable matrices F. We show first that the problem is
always solvable as long as (A, B) is a full-range pair.

Theorem 6.5.1
be a full-range pair of transformations. Then
for every n-tuple of complex numbers A \n there exists a transformation
F: <p"-» <pm such that A + BF has eigenvalues A , , . . . , \n.

Proof. With the use of Theorem 6.2.1 it is easily seen that we can
assume, without loss of generality, that A and B are in Brunovsky canonical
form. Furthermore, by Theorem 6.3.2, it follows that the Jordan part of A is
absent [see equation (6.2.9)]. So the Kronecker indices
satisfy the condition

be the scalar polynomial with zeros where


(and we define /0 = 0). Let

where F, is the m x ki matrix whose /th row is


and the other rows are zeros. Then
204 Invariant Subspaces for Transformations Between Different Spaces

where

is a kt x kt matrix for / = 1,. . . , p [the companion matrix of a,(-)]. It is well


known that the eigenvalues of At are exactly A, + 1 , . . . , A,. This proves the
theorem. D

The argument used in proving Theorem 6.5.1 can also be utilized to


obtain a full description of the solvable cases of the spectral assignment
problem. We omit the details of the proof.

Theorem 6.5.2
Let A: <p"—»<p" and B: <p m —» <p" be a pair of transformations, and let the
/ x / matrix J = / / ( A, ) © • • • © / / ( A^) be the Jordan part of the Brunovsky
form for [A B]. Then, given an n-tuple of (not necessarily distinct) complex
numbers /u,j,. . . , /n n , there exists a transformation F: <j7"-*<p'" such that
A + BF has eigenvalues yu,,, . . . , fj,n if and only if at least I numbers among
f i l t . . . , fj.n coincide with the eigenvalues of J (counting multiplicities).

We need another version of the spectral assignment problem, known as


the spectral shifting problem. Given a transformation [A B]: <p w + n —> <p"
and a nonempty set O C <p, when does there exist a transformation
F: <p"^> <pm such that cr(A + fiF)CH? When (A, B) is a full-range pair,
such an F always exists in view of Theorem 6.3.2. In general, the answer
depends on the relationship between the root subspaces of A and the
minimal /1-invariant subspace over Im B:

[known as the "controllable subspace" of the pair (A, B) in the systems


theory literature; see also Proposition 8.4.1]. Observe first that the subspace
( A | Im B) is the minimal ^-invariant subspace over Im B (see Theorem
2.8.4). In particular, (A \ Im B} is A invariant. Also, equation (6.3.1) can
be expressed in the form

for any transformation F: <p" —> (pm.


The Spectral Assignment Problem 205

Theorem 6.5.3
Given a nonempty set ftC <p and a transformation [A B\. <f"" +n —» <p", there
exists a transformation F: <p"~* 4-"" such that a(A + BF) C ft if and only if

for every eigenvalue A0 of A that does not belong to ft.

Recall that S2A (A) = Ker( A0/ - A)" is the root subspace of A correspond-
ing to the eigenvalue A0 and, by definition, £%A (A) = {0} if \0^a(A).
In the proof we use the following basic fact about induced transformations
in factor spaces. (Recall the definition of the induced transformation given
in Section 1.7.)

Lemma 6.5.4
Let X: be a transformation with an invariant subspace Z£, and let
be the induced transformation. Then for every A0 G <p we
have

where P: <f"-» <p"/^is the canonical transformation: Px = x + %, x E. <p". In


particular, every eigenvalue of X is also an eigenvalue of X.

Proof. Then for every with


we have

So p(X)x e £. Let <7(A) = n;r=1 ( A y - A)", where A , , . . . , A, are all the


eigenvalues of X different from A 0 . As p(\) and q(\) are polynomials with
no common zeros, there exist polynomials g(\) and h(\) such that
g(A)p(A) + /i(A)<?(A) = 1. (This is well known and is easily deduced from
the Euclidean algorithm.) Hence

Since we also havee On the other hand, the


Cayley-Hamilton theorem ensures that p(X)h(X)g(X)x = 0, that is, the
vector u = h(X)g(X)x belongs to 3ix(X). Now equation (6.5.4) implies
206 Invariant Subspaces for Transformations Between Different Spaces

We have proved the inclusion C in equality (6.5.3). The opposite inclusion


follows from the relation

for every vector y E <p".

Proof of Theorem 6.5.3. First consider a pair (A 0 , B0) in the Brunovsky


canonical form, as described in Theorem 6.2.5. Then

The condition <3lK (A0) C { A0 \ Im B0) for every A0 E <J7 --ft means that [in
the notation of equation (6.2.9)] A , , . . . , A^ Eft. It remains to apply
Theorem 6.5.2.
Now consider the general case, and let

where [AQ B0] is in Brunovsky canonical form. It is easily seen that there
exists a transformation Fl such that cr(A0 + Z? 0 F,)Cft if and only if there
exists an F2 with cr(/4 + #F2) Cft (indeed, one can take F2 = F0 +
Further, using equation (6.5.1), we have

and obviously, for any

So it remains to show that (6.5.2) holds if and only if

This is done by using Lemma 6.5.4. Denote by P: $"—> $"/(A Im B) the


canonical transformation

For a transformation X: (p n —»<p" with invariant subspace (A \ Im B), let

be the induced transformation. Using (6.5.1), we see that A and A + BFare


well defined. Further, for every
Some Dual Concepts 207

so A = A + BF0. Now, assuming that (6.5.5) holds, and in view of Lemma


6.5.4, we find that for every

Hence A similar argument shows that (6.5.2) implies


(6.5.5).

6.6 SOME DUAL CONCEPTS

The definitions and analysis of this chapter have primarily concerned trans-
formations [A B]: (J7" 4- <p m —> <p". Questions arise concerning analogs for
transformations : <p" -* <J7" 4- <pm. In this section we quickly review
some notions and results in this direction. Recall first that a subspace M of
<P" will be called invariant if and only if M^ is [A* C*] invariant.
Thus, with the characterization (d) of Theorem 6.1.1 for [A* C*]-invariant
subspaces, there is a transformation G* such that

if and only if M is invariant. Using Proposition 1.4.4, we see that this is


equivalent to

We include this discussion as part of the following statement.


Theorem 6.6.1
Let M be a subspace of <t7" and \ \ be a transformation from (p" into
Then the following are equivalent: (a) M is invariant;

(b)
(c) there is a transformation

Proof. It remains only to establish the equivalence of (a) and (b). This
is done by using the equivalence of statements (a) and (c) in Theorem 6.1.1.
Thus M \s\ \ invariant if and only if
208 Invariant Subspaces for Transformations Between Different Spaces

Now it is easily verified that, for subspaces % Y, and a transformation A,


the relations ATC^U and /l*^ 1 C < F 1 are equivalent. Thus equation
(6.6.4) is equivalent to

or

which is condition (6.6.2).

It is useful to have a terminology involving an arbitrary subspace in the


place of Ker C in (6.6.2). Thus, if A is a transformation on <p" and Tis a
subspace of <p", we say that a subspace M is A invariant intersect V, or A
invariant (int T), if A(M CiT)CM.
Through extension of the terminology of Section 2.8 for any given
subspace M, a subspace ^U that is A invariant (int T) is said to be minimal
over M if °lt D M and there is no other ^-invariant (int T") subspace that
contains M and is contained in aU.
Now consider a generalization of similarity for transformations from (t"1
to <p" 4- <pm. If is such a transformation, an extension of is a
transformation T on (p" 4- (pm of the form

Then we say that transformations rom (

are block similar if, given any extension 7\ of there is an extension


such that T, and T2 are similar. Comparing this with the
corresponding definition of Section 6.2, we see that this is equivalent to the
block similarity of [A* C*] and [A* C*]. We may thus apply Theorem
6.2.1 to obtain the following theorem.

Theorem 6.6.2
The transformations l
from <p" to <p m+ " are block similar if
and only if there exist invertible transformations N on <p" and M on <pOT, and a
transformation G: (p™-» (p" such that
Exercises 209

Once again, it is found that block similarity determines an equivalence


relation on all transformations from <p" into <p" 4- <pm. Furthermore, the
canonical forms in the corresponding equivalence follow immediately from
the Brunovsky form of Theorem 6.2.5 by duality.
Theorem 6.6.3
Given a transformation there is a block-similar trans-
formation that (in some bases for <£"" and <p") has the representation

for some integers kl ^ k2 > • • • S: kp, and al entries in C0 are zero except for
those in positions (1,1), (2, k{ + 1),. . . , ( p , kl + • • • + kp_l + 1), and those
exceptional entries are equal to one. Moreover, the matrices A0 and CQ
defined in this way are uniquely determined by A and C, apart from a
permutation of the blocks J, ( A , ) , . . . , / / (A^) in equation (6.6.6).

The case of full-range pairs (A, B), which was one of our concerns in
Section 6.3, is now replaced by the dual case in which (C, A) is a null kernel
pair (see the definition in Section 2.7 and Theorem 2.8.2). The dual of
Theorem 6.3.2 is now as follows.
Theorem 6.6.4
m
For a transformation the following statements are equi-
valent: (a) the pair (A, C) is a null kernel pair; (b) there is a null kernel pair
l
(y4j, C,) for which \ re block similar; (c) in the Brunovsky

form e matrix AA0 has


has no Jordan part; (d) the rank of th

transformation does not depend on the complex parameter A.

jgughgghjghjghghghghhhh

6.1 Let A: <p"—»• <p" be a transformation. A chain

of subspaces in (p" will be called almost A invariant if AM{ C Mi +i,


i-l,...,k-l. Show that the chain (1) is almost A invariant if and
only if A has the block matrix form A = [-Aj/lfJJi with Aif = 0 for
/ - / > ! , with respect to the direct sum decomposition
•' • + &k + i, where ^ is a direct complement to Mt_^ in Mi (by
definition, M0 = {0} and
210 Invariant Subspaces for Transformations Between Different Spaces

6.2 Prove that every transformation A: $"-^> $" has an almost A-


invariant chain

consisting of n + I distinct subspaces, where Ml is any given one-


dimensional subspace. (Hint: For a given ./^ = Span{jt),
jt^O, put M2 = Span{jt, Ax},. . . , Mk = Span (AT, Ax,. . . , Ak~lx},
where k is the least positive integer such that the vectors
x, Ax,. . . , Akx are linearly dependent. Use the preceding exercise.)
6.3 A block matrix A = [Aij]^j:=l is called tridiagonal if v4/7 = 0 for
\i - j\ > 1. Show that a transformation A has tridiagonal block matrix
form with respect to a direct sum decomposition <p" = 2£{ + • • • + 2£p if
and only if the chains

and

are almost A invariant.


6.4 be a self-adjoint transformation. Prove that for any
with norm 1 there exists an orthonormal basis
1
such that the chains

and

are almost A invariant (so A has a tridiagonal form with respect to the
basis JCj, . . . , jc n ). [Hint: Apply Gram-Schmidt orthogonaliza-
tion to a basis *,, y2,. . . , yn in <p" such that the chain

is almost A invariant (Exercise 6.2) and use the self-adjointness


of A.]
6.5 Let A: (p"-» <p" and B: fm-> (pn be transformations.
(a) Show that

for every A G (p with the possible exception of not more than


n - (dim Im B) points.
(b) Show that if equation (2) holds at k eigenvalues of A (counting
multiplicities) then for every £-tuple /i,,. . . , pk there exists a
transformation F: (£n-^ <pm such that /*,, . . . , fik e a(A + BF).
Exercises 211

6.6 State and prove the analogs of Exercises 6.5 (a) and (b) for a pair
of transformations
6.7 Let (A, B) be a full-range pair of transformations. Show that for any
F the transformation A + BF has not more than dim Im B Jordan
blocks corresponding to each eigenvalue in its Jordan form.
6.8 Let

(a) Show that (A, B) is a full-range pair.


(b) Find matrices N, M and F, where N and M are invertible, such
that the pair N~l(A + BF)N, N'1BM is in the Brunovsky
canonical form.
(c) Find G such that A + BG has the eigenvalues 0, 2, -1.
6.9 Let A: <p" -» <p" be a transformation, and let x E <p" be a cyclic vector
in <p" for /I (i.e., <p" = Span{*, A*, /I 2 jc,. . .}). Show that for any
n-tuple of not necessarily distinct complex numbers A n . . . , \n there
exists a transformation B: <p"—»(p" with Im B C Span{*} such that
A + B has the eigenvalues A , , . . . , A w .
6.10 Let A: <p" -> (p" be a transformation, and let M C <p" be a subspace such
that <p" is the minimal ^-invariant subspace that contains M. Show
that for n-tuple A j , . . . , \n of not necessarily distinct complex numbers
there exists a transformation B: <p"'—» <p"withlm J? C ^ such that A + B
has eigenvalues
6.11 Let

(a) Show that (C, A) is a null kernel pair.


(b) Find matrices N, M and F, where N and M are invertible such
that MCN~\ N(A + FC)N~l are in the canonical form as
described in Theorem 6.6.3.
(c) Find G such that A + GC has the eigenvalues 1, —1,0.
6.12 Let A: <p"-» <p" be a transformation, and let M C <p" be a subspace
such that {0} is the maximal >4-invariant subspace in M. Prove that
for any n-tuple of not necessarily distinct complex numbers
A,, . . . , An there exists a transformation C: (p"—> <p" with Ker C CM
such that A + C has eigenvalues
Chapter Seven

Rational Matrix
Functions

In this chapter we study r x n matrices VV( A) whose elements are rational


functions of a complex variable A. Thus we may write

where p (; (A) and <7,;(A) are scalar polynomials and <7,7(A) are not identically
zero. Such functions W(\) are called rational matrix functions.
We focus on problems for rational matrix functions in which different
types of invariant subspaces and triinvariant decompositions play a decisive
role. All these problems are motivated mostly by linear systems theory, and
their solutions are used in Chapter 8. The problems we have in mind are the
following: (1) the realization problem, which concerns representations of a
rational matrix function in the form D + C(A7 - A)~1B with constant
matrices A, B, C, D; (2) the problem of minimal factorization; and (3) the
problem of linear fractional decomposition.

7.1 REALIZATIONS OF RATIONAL MATRIX FUNCTIONS

Let W(\) be an r x n rational matrix function. We assume that W(\) is finite


at infinity; that is, in each entry p,7( A) /<7,7( A) of W(A) the degree of the
polynomial p /7 (A) is less than or equal to the degree of ^-(A).
A realization of the rational matrix function W( A) is a representation of
the form

where A, B, C, D are matrices of sizes m x m, m x n, r x m, r x n,

212
Realizations of Rational Matrix Functions 213

respectively. Observe that Hm [To verify this, assume


that A is in the Jordan form and use formula (1.9.5).] So if there exists a
realization (7.1.1), then necessarily D = W(o°). We may thus identify such a
realization with the triple (A, B, C). The following lemma is useful in the
proof of existence of a realization.

Lemma 7.1.1
Let H(\) = EjlJ \'Hj and L(A) = A7/ -I- Ej Let H(\) = EjlJ \'Hj and L(A) = A7/ -I- Ej Let H(\) = EjlJ \'Hj and L(A) = A7/ -I- Ej
polynomials, respectively. Put

Then

Proof. We know already (see Section 5.1) that for

where Q = [I 0 ••• 0]. We may define C,(A),. . . , C,(A) for all


by

From equation (7.1.2) we see that

the special form of A yields

It follows that C( A/ - It follows that C( A


is complete.

Theorem 7.1.2
Every r x n rational matrix function that is finite at infinity has a realization.

Proof. Let W( A) be an r x n rational matrix function with finite value at


infinity. There exists a monic scalar polynomial /(A) such that /(A)W(A) is a
214 Rational Matrix Functions

(matrix) polynomial. For instance, take /(A) to be a least common multiple


of the denominators of entries in W(\). Put //(A) = /(A)(W(A) - W(oo)).
Then //(A) is an r x n matrix polynomial. Clearly, L(A) = /(A)/ n is monic
and W(A) = JV(o>) + //(A)L( A)'1. Further

So the degree of //(A) is strictly less than the degree of L(A). We can apply
Lemma 7.1.1 to find A, B, C for which

This is a realization of W(\).

A realization for W(\) is far from being unique. This can be seen from
our construction of a realization because there are many choices for /(A). In
general, if (A, B, C) is a realization of W(A), then so is (^4, B, C), where

for any matrices At/, Bv, and Cl with suitable sizes (in other words, the
matrices A, B, C are of size s x s, s x n, r x 5, respectively, and partitioned
with respect to the orthogonal sum ( where m is the size
of A; for instance, A}3 is a p x q matrix). Indeed, for every \$£cr(A) we
have

and thus

Among all the realizations of W( A) those with the properties that (C, A)
is a null kernel pair and (A, B) is a full-range pair will be of special interest.
That is, for which

The next result shows that any realization "contains" a realization with
Realizations of Rational Matrix Functions 215

those properties. To make this precise it is convenient to introduce another


definition. Let (A, B, C) be a realization of W(A), and let m x m be the size
of A. Given a triinvariant decomposition <pm = t£ + M + JV associated with
an /4-semiinvariant subspace M (so that the subspaces !£ and !£ + M are A
invariant) with the property that C\y — 0 and Im B C !£ + M, a realization
(PMA\ M, PMB, C\M), where PM: (p"1'—>• M is a projector on M with
K e r P ^ D ^ , is called a reduction of (A, B, C). Note that
(PMA\M, PMB, C\M} is again a realization for the same W(\). [See the
proof that (7.1.3) is a realization of W(\) if (A, B, C) is.] We shall also say
that (A, B, C) is a dilation of (PMA\M, PMB, C\M) (in a natural extension of
the terminology introduced at the end of Section 4.1).

Theorem 7.1.3
Any realization (A., fi, C) of W(\) is the dilation of a realization (A^, B0,
C0) of W(\) with null kernel pair (C0, A0) and full-range pair (Att, #0).

Proof. and be

the maximal /1-invariant subspace in Ker C and the minimal ^-invariant


subspace over Im B, respectively.
Put £=jtC(C, A), let M be a direct complement of %n/(A,B) in
#(A, B), and choose M so that

and we recall that m is the size of A. Let us verify that equality (7.1.5) is a
triivariant decomposition associated with an /l-semiinvariant subspace M,
and that the realization

where P: (p M —» M is the projector on M along !£ + JV", is a reduction of


(./4, B, C), and has the required properties. Indeed, !£ and X + M =
3£(C, A) + $(A, B) are ^4-invariant subspaces, so (7.1.5) is indeed a tri-
invariant decomposition. Further, C\<g — 0 and

so (7.1.6) is a reduction of (A, B, C). It remains only to prove that the


realization (7.1.6) of W(\) has the null kernel and full-range properties.
Indeed

So
216 Rational Matrix Functions

Also

Hence

because by construction

It turns out that a realization (^4, B, C) for which conditions (7.1.4) are
satisfied is essentially unique. To state this result precisely and to prove it,
we need some observations concerning one-sided invertibility of matrices.
By Theorems 2.7.3 and 2.8.4 we have

where p is any integer not smaller than the degree of the minimal polyno-
mial for A. Hence there exists a left inverse [col[C47]p0']~L. Thus

Also, there exists a right inverse

Note that in general the left and right inverses involved are not unique.

Theorem 7.1.4
Let (/Ij, fij, Q) and (A2,B2,C2) be realizations for a rational matrix
function W(\) for which (C,, A,) and (C2, A2) are null kernel pairs and
(A{,B}), (A2, B2) are full-range pairs. Then the sizes of A^ and A2
coincide, and there exists a nonsingular matrix S such that

Moreover, the matrix S is unique and is given by


Realizations of Rational Matrix Functions 217

Here p is any integer greater than or equal to the maximum of the degrees of
minimal polynomials for Al and A2, and the superscript —L (resp. — R)
indicates left (resp. right) inverse.

Proof. We have

For |A| >max{||y4 1 ||, ||^42||} the matrices \I-Al and A/ - A2 are non-
singular and for / = 1, 2

Consequently, we have

for any A with | A| > max{ || /i JUI^II}- Comparing coefficients, we see that
ClA\Bl = C2A'2B2, j = 0, 1, . . . . This implies ftjAj = H 2 A 2 , where, for k =
1 , 2 we write

Premultiplying by a left inverse of H2 and postmultiplying by a right inverse


of A 2 , we find that the second equality in (7.1.8) holds. Now define 5 as in
(7.1.8). Let us check first that S is (two-sided) invertible. Indeed, we can
verify the relations

Since we have
/. Similarly, one checks that Because S is
invertible, the sizes of A^ and A2 must coincide
It remains to check equations (7.1.7). Write

Premultiply by H2 L and postmultiply by A 2 * to obtain A2S = SA}. Now

and
218 Rational Matrix Functions

Theorems 7.1.3 and 7.1.4 allow us to deduce the following important


fact.

Theorem 7.1.5
In a realization (A, B, C) of W(\), (C, A) and (A, B) are null kernel pairs
and full-range pairs, respectively, if and only if the size of A is minimal
among all possible realizations of W( A).

Proof. Assume that the size m of A is minimal. By Theorem 7.1.3,


there is a reduction (A1, B', C') of (A, B, C) that is a realization for W(\)
and satisfies conditions (7.1.4). But because of the minimality of m the
realizations (A', B', C") and (A, B, C) must be similar, and this implies that
(A, B, C) also satisfies condition (7.1.4).
Conversely, assume that (A, B, C) satisfies conditions (7.1.4). Arguing
by contradiction, suppose that there is a realization (A', B', C') with A' of
smaller size than A. By Theorem 7.1.3, there is a reduction (A", B", C") of
(A', B', C') that satisfies conditions (7.1.4). But then the size of A" is
smaller than that of A, which contradicts Theorem 7.1.4. D

Realizations of the kind described in this theorem are, naturally, called


minimal realizations of W(\). That is, they are those realizations for which
the dimension of the space on which A acts is as small as possible.

7.2 PARTIAL MULTIPLICITIES AND MULTIPLICATION

In this section we study multiplication and partial multiplicities of rational


matrix functions. To facilitate the presentation, it is assumed that the
functions take values in the square matrices and that the determinant
function is not identically zero.
Let W( A) by an n x n rational matrix function with det W( A) 7^0. In a
neighbourhood of each point A0 G <p the function W( A) admits the repre-
sentation, called the local Smith form of W(\), at A 0 :

where £\(A) and E2(\) are rational matrix functions that are defined and
invertible at A 0 , and i>,,. . . , vn are integers. Indeed, for matrix polynomials
equation (7.2.1) follows from Theorem A.3.4 in the appendix. In the
general case write W(\) = p(A)~ l VV(A), where W(\) and p(\) are matrix
and scalar polynomials, respectively. Since we have a representation (7.2.1)
for W(A), it immediately follows that a similar representation holds for
W(A).
The integers i/,, . . . , vn in (7.2.1) are uniquely determined by W(A) and
Partial Multiplicities and Multiplication 219

A0 up to permutation and do not depend on the particular choice of the local


Smith form (7.2.1). To see this, assume that j/, < • • • < f n , and define the
multiplicity of a scalar rational function g( A) ^0 at A0 as the integer v such
that the function g(\)(X - \0)~" is analytic and nonzero at A 0 . Then, using
the Cauchy-Binet formula (Theorem A.2.1 in the appendix), we see that
i>j + • • • + i>. is the minimal multiplicity at A0 of the not identically zero
minors of size / x i of W(A), / = 1,. . . , n. Thus the numbers ^ + • • • + v{,
/ = ! , . . . , « , and, consequently, i>,,. . . , vn are uniquely determined by
W(\).
The integers i/,,. . . , vn from the local Smith form (7.2.1) of W(\) are
called the partial multiplicities of W(\) at A () .
Note that A 0 E (f1 is a pole of W(\) [i.e., a pole of at least one entry in
W(\)] if and only if W(\) has a negative partial multiplicity at A 0 . Indeed,
the minimal partial multiplicity of W(\) at A0 coincides with the minimal
multiplicity at A0 of the not identically zero entries of W( A). Also, A0 G <p is
a zero of W(A) [by definition, this means that A() is a pole of W^A)" 1 ] if and
only if W(\) has a positive partial multiplicity. In particular, for every
A0 G <J7, except for a finite number of points, all partial multiplicities are
zeros.
There is a close relationship between the partial multiplicities of W(\)
and the minimal realization of W(A). Namely, let W(\) be a rational n x n
matrix function with determinant not identically zero. Let

be the Laurent series of W( A) at infinity (here q is some nonnegative integer


and the coefficients Wt are n x n matrices): write U(\) — £J=0 A'W. for the
polynomial part of W(\). Thus W(\) - t/(A) takes the value 0 at infinity,
and we may write

where C( A/ - A) 1B is a minimal realization of the rational matrix function


W( A) - L^A). We say that (7.2.3) is a minimal realization of W(A). We see
later (Theorem 7.2.3) that A 0 e <p is a pole of W(\) if and only if A0 is an
eigenvalue of A. Moreover, for a fixed pole of W( A) the number of negative
partial multiplicities of W(A) at A0 coincides with the number of Jordan
blocks with eigenvalue A0 in the Jordan normal form of A , and the absolute
values of these partial multiplicities coincide with the sizes of these Jordan
blocks. A similar statement holds for the zeros of W(\).
An analytic n-dimensional vector function
220 Rational Matrix Functions

defined on a neighbourhood of A0 G <p is said to be a null function of a


rational matrix function W(A) at A0 if «A 0 ^0> W(A)i/f(A) is analytic in a
neighbourhood of A 0 , and [W(A)^(A)] A = A =0. The multiplicity of A0 as a
zero of the vector function W(A)^(A) is the order of «H A), and $Q is the null
vector of «/f(A). From this definition it follows immediately that for n x n
matrix-valued functions U( A) and V( A) that are rational and invertible in a
neighbourhood of A 0 , «/r(A) is a null function of V(\)W(\)U(\) at A0 of
order k if and only if U(\)tff( A) is a null function of W( A) at A0 of order k. A
set of null functions <A,(A), . . . , «/^( A) of W( A) at A0 with orders A;,, . . . , kp,
respectively, is said to be canonical if the null vectors ^(AQ), . . . , &p(\0)
are linearly independent and the sum kl + k2 + • • • + kp is maximal among
all sets of null functions with linearly independent null vectors.

Proposition 7.2.1
Let W(\) be as defined above and «/f,(A), . . . , «A p (A) be a canonical set of
null functions of W(\) (resp. W(\)~l) at A 0 . Then the number p is the
number of positive (resp. negative} partial multiplicities of W(\) at A 0 , and
the corresponding orders £,, . . . , kp are the positive (resp. absolute values of
the negative) partial multiplicities of W(\) at A0.

Proof. Briefly, reduce W(\) to local Smith form as described above and
apply the observation made in the paragraph preceding Proposition
7.2.1.

Now we fix an n x n rational matrix function W(\) with det W(\)^Q.


Let

be its minimal realization, and fix an eigenvalue A0 of A. Replacing (7.2.4),


if necessary, by a similar realization, we can assume that

where o-(Ap) — { A0} and \0^o-(A'p). Note also that if A0 is a pole of W(\),
then equation (7.2.4) implies that A0 is an eigenvalue of A.

Proposition 7.2.2
Let W( A), Ap , and Bp be defined as above. Let A0 be a pole of W( A), let $( A)
be a null function of W(\)~l at A0 of order k, and let <?„ be the coefficients of
Partial Multiplicities and Multiplication 221

Then

is a Jordan chain for A at A 0 . Conversely, if A0 is an eigenvalue of A and


x(), . . . , xk_l is a Jordan chain of A at A 0 , there is a null function «A(A) of
W( A)" 1 at A0 with order not less than k for which (7.2.6) holds [in particular,
A0 is a pole of W(\)].

Note that as <r(Ap) - {A 0 }, the series in (7.2.6) is actually finite.

Proof. By definition, vectors (7.2.6) form the Jordan chain for Ap at A()
if

The last k — 1 statements follow immediately from (7.2.6). Also

Now the Laurent series for W( A) at A 0 , say, W( A) = EJL_ 9 ( A - A 0 ) ; W,, has


the following coefficients of negative powers of (A — A 0 ):

and it is easily seen that q is the least positive integer for which
(Ap — A0/)* = 0. (One checks this by passing to the Jordan form of Ap.)
Now recall that «A(A) = W(A)<p(A) is analytic near A0; so equating coeffi-
cients of negative powers of (A — A 0 ) to zero and using the fact that
(AD - AO/)* = 0, we obtain for ; = 1, 2,. . .
224 Rational Matrix Functions

Since n*L 0 Ker CA' = {0}, it follows that nJL 0 Ker CpA'p = {0} or, what is
the same, that col[C A' ]'ld is left invertible for some integer r. As

the matrix co\[Cp(Ap - A 0 /)'], r = d is left invertible as well, and since (Ap -
A 0 /)' = 0 for s > q, we obtain the left invertibility of co\[Cp(Ap-
\J)' ']/=]• K now follows that (Ap - A O /)JC O = 0 as required. Finally, since
«/>( A () ) = JC G , it is also true that x(} 7^0. Thus, as asserted, equations (7.2.6) do
associate a Jordan chain for Ap with the null function «KA).
Conversely, let jc(), j c , , . . . , xk_l be a Jordan chain of Ap at A 0 . From the
definition of a minimal realization it follows that the matrix

is right invertible for some integer m. Consequently, there exist vectors


<pk, <pk +l,. . . , with only finitely many nonzero, such that

The definition of a Jordan chain includes (Ap - A 0 /)jt y = xf_} for / =


1, 2, . . . , k - 1, and so equations (7.2.6) follow immediately from (7.2.7). It
remains only to check that W(A)(p(A) is now a null function of W(X)~l at A 0 ,
where <p( A) = EJL* ( A - A 0 ) Vy-
Observe first that *0 ^ 0 and that Cpj4j,*0 = A y C p Jc 0 for y = 0,1, 2 , . . . . As
the matrix co\[Cp(Ap - A0/y]y"L0' is left invertible for some integer m, so is
col[C A']"1^, and it follows that C jc 0 ^0. But using (7.2.6), we obtain

If the Jordan chain jc () ,. . . , xk_l of Ap at A0 cannot be prolonged, then


xk_l^lm(Ap - A 0 7), and it follows from (7.2.7) that <pk ^0. Thus a maxi-
mal Jordan chain of length k determines, by means of (7.2.6), an associated
null function ^(A) of W(X)~' of order k.
Propositions 7.2.1 and 7.2.2 prove the following result. [The second part
of Theorem 7.2.3 concerning zeros of W(A) is obtained by applying the first
part to W(\)~1.]
Partial Multiplicities and Multiplication 223

Theorem 7.2.3
Let W(\) be a rational n x n matrix function with del JV(A)^0, and let its
minimal realization be given by equation (7.2.4). A complex number A0 is a
pole of JV( A) if and only if A0 is an eigenvalue of A, and then the absolute
values of negative partial multiplicities of W( A) at A0 coincide with the sizes of
Jordan blocks with eigenvalue A0 in the Jordan form of A, that is, with the
partial multiplicities of A0 as an eigenvalue of A.
A complex number A0 is a zero of W(\) if and only if A0 is an eigenvalue
of A i , where Al is taken from a minimal realization for W(\)~1:

with matrix polynomial V(\). In this case the positive partial multiplicities of
W( A) at A() coincide with the partial multiplicities of A0 as an eigenvalue ofA}.

Now we apply Theorem 7.2.3 to study the partial multiplicities of a


product of two rational matrix functions. Let W,(A) and W2(\) be rational
n x n matrix functions with realizations

for / = 1 and 2. [Of course, the existence of realizations (7.2.8) presumes


that W,(A) and W2(\) are finite at infinity.] Then the product Wt(\)W2(\)
has a realization

Indeed, the following formula is easily verified by multiplication:

so the right-hand side of (7.2.9) is equal to


224 Rational Matrix Functions

So formula (7.2.9) produces a realization for the product W}W2 in terms


of the realizations for each factor. Easy examples show that (7.2.9) is not
necessarily minimal even if the realizations (7.2.8) are minimal. See the
following example, for instance.

EXAMPLE 2.1. Let

Minimal realizations for W,(A), / = 1,2 are not difficult to obtain:

Formula (7.2.9) gives

which is a realization of the rational matrix function /, but not a minimal


one. More generally, if W 2 (A) = W{(\y\ then the realization (7.2.9) is not
minimal [unless VKj(A) is a constant].

Let W(\) be an n x n rational matrix function with determinant not


identically zero. For A 0 €E <p, denote by ir(W; A0) = {7r;.}°°=1 the nonincreas-
ing sequence of absolute values of negative partial multiplicities of W( A) at
A 0 . This means that TTI ^ 7r2 ^ • • • are nonnegative integers with only a finite
number of them nonzero (say, Trk > ?rk + l = 0), and — TT{, — Tr2, . . . ,—irk are
the negative partial multiplicities of W(\) at A 0 .
Consider nonincreasing sequences a = {ay-}*=1 and /8 = (ftj}^=l of non-
negative integers such that only finitely many of them are nonzero, and
recall the definition of the set F(a, j8) given in Section 4.4.

Theorem 7.2.4
Let Wj( A) and W 2 (A) be n x n rational matrix functions with determinant not
identically zero and that take finite value at infinity. Then for every A0 E <p
and j = 1,2, . . . we take iry =s 5y, where {^JJl, = tr(WlW2; A 0 ) and {Sf}J=l is
some sequence from V^lW^, A 0 ), 7r(W2; A0)). //, in addition, W t (A) and
W2( A) arfmtY minimal realizations (7.2.8) /or nTi/c/i r/ze realization (7.2.9) o/
Wj(A)W 2 (A) w minimal as well, then actually

Proof. Let ^(A) and W 2 (A) have minimal realizations as in equation


(7.2.8). Using Theorem 7.2.3 and the definition of r(ir(Wi; A 0 ),
Minimal Factorizations of Rational Matrix Functions 225

7r(W 2 ; A 0 )), we see that the nonincreasing sequence 8 - (5,-}JLi of partial


multiplicities of the matrix

belongs to Y(TT(W}; A 0 ), 7r(W2; A 0 )). Now (7.2.9) is a realization (not


necessarily minimal) of Wj(A)W 2 (A). Theorem 7.1.2 shows that there is a
restriction to some ,4 -semii variant sub-
space such that the realization

is minimal. Then {7r;-}JL, is the sequence of partial multiplicities of A0 at A 0 .


But as A0 is a restriction of A to M, we have 7ry ^ 5y, for y = 1, 2, . . . (see
Section 4.1).

The assumption that both H^(A) and W 2 (A) take finite values at infinity is
not essential in Theorem 7.2.4. However, we do not pursue this generaliz-
ation.
The condition that the realization (7.2.9) is minimal for some minimal
realizations (7.2.8) is important in the theory of rational matrix functions
and in the theory of linear systems. It leads to the notion of minimal
factorization and is studied in detail in the following sections.

7.3 MINIMAL FACTORIZATIONS OF RATIONAL MATRIX


FUNCTIONS
In this section we describe the minimal factorizations of a rational matrix
function in terms of certain invariant subspaces. To make the presentation
more transparent, we restrict ourselves to the case when the rational matrix
functions involved are n x n and have value / at infinity. (The same analysis
applies to the case when the matrix function has invertible value at infinity.)
We start with a definition. The McMillan degree of a rational n x n matrix
function W( A) [with W(°°) = I], denoted 8(W), is the size of the matrix A in
a minimal realization

It is easily verified that

where Moreover, if realization is minimal, so is


226 Rational Matrix Functions

equation (7.3.2). Indeed, equation (6.3.1) shows that the pair (A - BC, B)
is a full-range pair [because (A, B} is so]. Further, (C, A) is a null kernel
pair, or, equivalently, (A*, C*) is a full-range pair. By the same argument,
the pair (A* - C*#*, C*) is also a full-range pair. Hence (C, A - BC) is a
null kernel pair, and therefore realization (7.3.2) is minimal. In particular,

Consider the factorization

where, for ; = 1, . . . , / ? , Wj(\) are n x n rational matrix functions with


minimal realizations

Formula (7.2.9) applied several times yields a realization for W(\):

This realization is not neceswsaruly minimal, so we have (in view of Theorem


7.12)

We say that the factorization (7.3.3) is minimal if actually 8(W) = 8(Wl) +


• • • + 8(Wp), that is, realization (7.3.4) is minimal as well. In informal
terms, minimality of (7.3.3) means that zero-pole cancellation does not
occur between the factors Wj(\). Because the McMillan degrees of a
rational matrix function (with value / at infinity) and of its inverse are the
same, (7.3.3) is minimal if and only if the corresponding factorization for
the inverse matrix function

is minimal.
Let us focus on minimal factorizations (7.3.3) with three factors (p = 3).
A description of all such factorizations in terms of certain triinvariant
decompositions associated with A-semiinvariant subspaces is given. Here A
is taken from a minimal realization W(\) = I + C(A7- A)~1B. Write /4 X =
A - BC, and let A and Ax be of size m.
Minimal Factorizations of Rational Matrix Functions 227

We say that a direct sum decomposition

is a supporting triinvariant decomposition for W(\) if (7.3.5) is a triinvariant


decomposition associated with an ,4-semiinvariant subspace M (so !£ and
Z£ + M are A invariant) and at the same time M is Ax semiinvariant with
associated triinvariant decomposition <pm = N + M 4-^ (i.e., N and jV 4- M
are A* invariant). Note that a supporting triinvariant decomposition for
W(\) depends on the choice of minimal realization. We assume, however,
that the minimal realization of W(\) is fixed and thereby suppresses the
dependence of supporting triinvariant decompositions on this choice. (In
view of Theorems 7.1.4 and 7.1.5, there is no loss of generality in making
this assumption.)
The role of supporting triinvariant decompositions in the minimal fac
torization problem is revealed in the next theorem.

Theorem 7.3.1
Let (7.3.5) be a supporting triinvariant decomposition for W(X). Then W( A)
admits a minimal factorization

where TT^ is the projector on £ along M + Jf, and TTM and TTV are defined
similarly.
Conversely, for every minimal factorization W(\) = Wl(\)W2(\)W3(\)
where the factors are rational matrix functions with value I at infinity there
exists a unique supporting triinvariant decomposition <pm = !£ + M + N such
that

Note that the second equality in (7.3.6) follows from the relations
TT^ATT•<£ = ATT<£ and ir^Air^ = Tr^A, which express the A in variance of !£
and !£ 4- M, respectively (see Section 1.5).
228 Rational Matrix Functions

Proof. With respect to the direct sum decomposition (7.3.5), write

Note, in particular, that the triangular form of A* implies A12 = B^C2,


/413 = #iC 3 , and A23 = B2C3. Applying formula (7.2.9) twice, we now see
that the product on the right-hand side of (7.3.6) is indeed W(A). Further,
denoting for #=.$?, M, or Jf, we
obviously have 8( Wx) ^ dim 5f. Hence

Since, by definition, ra = 8(W), it follows that

and the factorization (7.3.6) is minimal.


Next assume that W= W{W2W3 is a minimal factorization of W, and for
i = l , 2 , 3 let

be a minimal realization of W,(A). By the multiplication formula (7.2.9)

where

Note that

As the factorization W= W1W2W3 is minimal, the realization (7.3.8) is


Minimal Factorizations of Rational Matrix Functions 229

minimal. Hence, by Theorem 7.1.4, for some invertible matrix S we have

To satisfy (7.3.7), put 2= S&, M = SM, and Jf = SJf, where

and A. has size p, for / = 1, 2, 3.


It remains to prove the uniqueness of j£, ./#, and JV. Assume that
<pm = ,2" + M ' + JV' is also a supporting triin variant decomposition such that

As the realizations (7.3.7) and (7.3.9) are minimal (see the first part of the
proof), there exist invertible transformations T%:!£'-* t£, TM:M'-*M,
Tjfi jV'-*JVsuch that

Therefore, the invertible transformation ( <pm defined y


for 3( — Z£, M, N is a similarity between the minimal realization
and itself:

Because of the uniqueness of such a similarity (Theorem 7.1.4), we must


have

Using formula (7.3.2), we can rewrite the minimal factorization (7.3.6) in


terms of the minimal factorization of the inverse matrix function:
230 Rational Matrix Functions

where the second equality follows from ir^A*^- = AXTTJV- and Tr^A*TT^ =
TTyAx, expressing the Ax invariance of Ji and M + Jf.
An important particular case of Theorem 7.3.1 appears when N — {0} in
the supporting triinvariant decomposition (7.3.5). This corresponds to the
minimal factorization of W(\) into the product of two factors, as follows.

Corollary 7.3.2
Let 1£ and M be subspaces in <p"" that are direct complements of each other.
Assume that ^ is A invariant and M is A* invariant. Then W(\) admits a
minimal factorization

where ir^ is the projector on 3?along M. Conversely, if W(\) = Wl(\)W2( A)


is a minimal factorization with then there exists a
unique direct sum decomposition <pm = $£ 4- M, where 5£ is A invariant,
M is A* invariant, and such that

000000000000000

Let us illustrate the description of minimal factorizations obtained in


Theorem 7.3.1. The rational matrix function

has a realization

where

This realization is minimal. Indeed, the matrix

has rank 3 and hence zero kernel. The matrix


Example 231

has rank 3, and hence its image is <p3. Further

Let us find all invariant subspaces for A and Ax. It is easy to see that
(1,1,0) is an eigenvector of A corresponding to the eigenvalue 1, whereas
the vectors (0,0,1), (0,1,0} are the eigenvectors of A corresponding to
the eigenvalue 0. Hence all one-dimensional ^4-invariant subspaces are of
the form Span . Al
two-dimensional /4-invariant subspaces are of the form

Passing to Ax, wefin*dthat A* has three eigenvalues -1, y = | ( l + /V3),


and y with corresponding eigenvectors (1,2,3), ( l , y , 0), and ( l , y , 0),
respectively. There are three one-dimensional Ax-invariant subspaces
Span{(l,2,3}}, Span{(l, y, 0)}, Span{(l, y, 0)}, and three two-dimen-
sional y4 x -invariant subspaces Span {(1,2,3), ( l , y , 0)}, Span{(l, 2, 3),
(1, y,0)}, and Span«l, 0,0), (0,1,0)}.
Now we describe supporing triinvariant decompositions

of W(A) with ^=Span{(l,2,3)}, % = Span{(l, 1,0)}. If we let M =


Span{(;t, y, z)}, we easily see that

if and only if z ^ 3(y - x). Further, one of the following four cases appears:

In cases (a) and (c) we obtain J/ = Span{(l,y, 0)} and M~


Span{(l, y, 0}}, respectively. In case (b) we have
232 Rational Matrix Functions

for some complex numbers p, q, r, s. Consider the second equality in (7.4.3)


as an equation with unknowns p, q, r, s. Solving this equation and putting
r = 1 — y, we get q = 3 — 3y, 5 = 1 — 3a, p = 2 — 3a — y, and M is spanned by
(2 - 3a — y, 2 - 3 ay — y, 3 — 3y), where a ¥* \. [This condition reflects the
inequality z^3(y-jc).] Similarly, in case (d) we obtain M — Span{(2 —
3a - y, 2 — 3ay - y, 3 - 3y}}, where a ^ 3. To summarize, the subspaces
M for which

is a supporting triinvariant decomposition for W(\) are exactly the follow-


ing:
and
To compute the corresponding minimal factorizations according to for-
mula (7.3.6), write the matrices A, B, C (understood as transformations in
the standard orthonormal bases in <p2 and (p3) with respect to the basis
(1, 1,0), < l , y , 0 ) , (1,2,3) in (p3 and the standard basis (1,0), (0,1) in
t2:

So the minimal factorization corresponding to the supporting triinvariant


decomposition (7.4.3) with M = Span{(l, y,0)} is
where
Example 233

Replacing y by y in these expressions we obtain the minimal factorization


corresponding to (7.4.3) with M = Span{(l, 7,0}}.
Now for a ^ \ write A, B, and C in the basis

The corresponding minimal factorization is given by


Th

Taking y in the place of y in these expressions, we obtain the mini-


mal factorization corresponding to (7.4.3) with

Note that these four factorizations exhaust all minimal factorizations

with not identically constant rational 2 x 2 matrix functions with value


/ at infinity and for which W,( A) has a pole at = 1 and has a zero a
0000 000000000

7.5 MINIMAL FACTORIZATIONS INTO SEVERAL FACTORS AND


CHAINS OF INVARIANT SUBSPACES

Let W( A) be an n x n rational matrix function with minimal realization

so that, in particular, W(°°) = I. We study minimal factorizations of W( A) by


means of the realization (7.5.1), and in terms of chains of invariant
subspaces for A and A* = A - EC. We state the main theorem of this
section.

Theorem 7.5.1
Let m be the size of A in equation (7.5.1), and let

where the chain

consists of A-invariant subspaces, whereas the chain

consists of Ax-invariant subspaces. Then W admits the minimal factoriz-


ation

where TT, is the projector on ^£- along J^


Conversely, for every minimal factorization
Factors and Chains of Invariant Subspaces 235

where Wt( re rational n x n matrix functions with W^) /, there exists a


unique direct sum decomposition (7.5.2) with the property that the chains
(7.5.3) and (7.5.4) consist of invariant subspaces for A and A", respectively,
such that

The proof is obtained by p — 1 consecutive applications of Corollary


7.3.2.
As in the remark following the proof of Theorem 7.3.1, the factorization
(7.5.5) implies the minimal factorization for W

We are interested in the case when p, the number of factors in the


minimal factorization (7.5.6), is maximal [of course, we exclude the case
when some of the W- values are identically equal to /]. Obviously, p
cannot exceed the McMillan degree of W (W), for then each facto
Wj(\) must have McMillan degree 1. It is not difficult to find a general form
of rational n~x n matrix functions V( A) with V(°°) = / and S(V) — 1; namely

where A0 is a complex number and R is an n x n matrix of rank 1. Indeed, if


V(A) has the form (7.5.7), then by writing R = C0B0, where C0 is an n x 1
matrix and BQ is a 1 x n matrix, we obtain a realization
of V(A) that is obviously minimal. So d(V) = l. Conversely, if 8(V) = l,
then we take a minimal realization and put
R = C0B0 to obtain (7.5.7).
Note that if V(\) has the form (7.5.7), then so does [because
8(V~l) = S(V} = I]. Indeed, by equation (7.3.2)

where tr R is the trace of R (the sum of its diagonal entries).


We arrive at the following problem: study minimal factoriza-
tions

of W(A), where each V^(A) has the form (7.5.7) for some and R. First let
236 Rational Matrix Functions

us see an example showing that not every ) admits a minimal factoriza-


tion of this type.

EXAMPLE 5.1. Let

This realization (with

is easily seen to be minimal. As BC = 0, we have

Obviously, there is no (nontrivial) direct sum decomposition


where j?, and j£2 are A invariant. So by Theorem 7.5.1 (or Corollary 7.3.2)
W(\) does not admit minimal factorizations, except for the trivial ones

We give a sufficient condition for the existence of a minimal factorization


(7.5.8). This condition is based on the following independently interesting
property of chains of invariant subspaces.

Lemma 7.5.2
Let A j, A 2 : <p" —> <p" be transformations and assume that at least one of them
is diagonable. Then there exists a direct sum decomposition <p" = Jz^ + • • • +
££n with one-dimensional subspaces j£y, / = 1, . . . , n, such that the complete
chains

and

consist of A ^invariant and A2-invariant subspaces, respectively.

Proof. It is sufficient to prove the existence of a direct sum decompo-


sition

where dimdim M is A, invariant, an£ is A


hactors and Chyains of Invariant Subspaces 237

Indeed, we can then use induction on n and assume that Lemma 7.5.2 is
already proved for A^M and PMA2\M in place of Al and A2, respectively,
where PM is projector on M along j£ (Remember that if at least one of Al
and A2 is diagonable, the same is true for A^M and PMA2\M\ see Theorems
4.1.4 and 4.1.5.) Combining (7.5.9) with the result of Lemma 7.5.2 for A^M
and PMA2\M, we prove the lemma for Al and A2.
To establish the existence of the decomposition (7.5.9), assume first that
A i is diagonable, and let /,, . . . , / „ be a basis for <p" consisting of eigen-
vectors of v4,. If g is an eigenvector of A2, (7.5.9) is satisfied with
£ = Span{/) , . . . , / , }, where the indices / j , . . . , / „ _ , are such that
/ - , , . . . , f i n _ \ , g form"a basis in <p".
If A2 is diagonable but A, is not, then use the part of the theorem already
proved with A2 and A* in place of A{ and A2, respectively. We obtain an
(n - 1)-dimensional A*-invariant subspace M and a one-dimensional A*-
invariant subspace & that are direct complements of each other. Then put
M = (.&)1 and £ = (M)x to satisfy (7.5.9). D

We can now state and prove the following sufficient condition for minimal
factorization of a rational matrix function W(\) into the product of 8(W)
nontrivial factors.

Theorem 7.5.3
Let W( A) be a rational n x n matrix function with a minimal realization

and assume that at least one of the matrices A and A — BC is diagonable.


Then W( A) admits a minimal factorization of the form

where are complex numbers and Rl, . . . , Rm are n x n matrices


of rank 1.

The proof of Theorem 7.5.3 is obtained by combining Theorem 7.5.1 and


Corollary 7.5.2. In Example 7.5.1 the hypothesis of Theorem 7.5.3 is
obviously violated. Inded, the matrix os not diagonable. THe

following form of Theorem 7.5.3 may be more easily applied in many cases.

Theorem 7.5.4
Let W( A) be a rational n x n matrix function with W(°°) = I. Assume that
either in W(A), or in W(\)~\ all the poles (if any) of each entry are of the
first order. Then W(\) admits a factorization (7.5.11).
238 Rational Matrix Functions

Recall that the order of a pole A of a scalar rational matrix /( A) is defined


as the minimal positive integer r such that lim A _ A [(A — A 0 ) r /(A)] is finite.

Proof. Assume that all the poles of each entry in W( A) are of the first
order. The local Smith form (7.2.1) implies that all the negative partial
multiplicities (if any) of W(\) at each point A0 are —Is. By Theorem 7.2.3,
all the partial multiplicities of the matrix A from the minimal realization
(7.5.10) are Is. Hence A is diagonable and Theorem 7.5.3 applies. If all
poles of W(\yl are of the first order, apply the above reasoning to W(\)~l,
using its realization W(A)^ = / - C(A7- (A - BC))~1B, which is minimal
if (7.5.10) is minimal. D

000000000000000000000000000000000000000000000

In this and the next sections we study linear fractional transformations and
decompositions of general (nonsquare) rational matrix functions. We deviate
here from our custom and denote certain matrices by lower case Latin and
Greek letters.
Let W( A) be a rational matrix function of size r x m written in a 2 x 2
block matrix form as follows:

Here m, and ri (/' = !, 2) are positive integers such that m = ml+m2,


r=r\ + r2. Let V(\) be a rational m2 x rl matrix function for which
det(7- W 12 (A)V(A))^0, and define matrix function

So U( A) is a rational matrix function of size r2 x ml. It is called the linear


fractional transformation of ) by ) [with respect to the block matrix
form of (7.6.1)] and is denoted by SFW(V). It is easily seen that when
m and can be rewritten in the form

where

Conversely, if (7.6.3) holds, then we have (7.6.2) with


Linear Fractional Transformations 239

The form (7.6.3) justifies the terminology "linear fractional transforma-


tion", however, the form (7.6.2) will be more convenient for our analysis.
Observe that multiplication of rational matrix functions is a particular
case of the linear fractional transformation, which is obtained in case
and either
Assume now that both W(\) and K(A) take finite values at infinity. Then
(see Section 7.1) there exist realizations

where A, B, C, and D are matrices of sizes n x H, n x m, r x n, and r x m,


respectively, and

with matrices a, 6, c, d of size p ~ * p , pxr\, m2xp, and w 2 x r , ,


respectively. At this point we do not require that the realizations (7.6.4) and
(7.6.5) be minimal. We are to find a realization of &W(V) in terms of the
realizations (7.6.4) and (7.6.5) of W(A) and V(\).
With respect to the direct sum decompositions (pm = (£""' + (p™2 and
(p = <f ri + <p r2 , we write B, C, and D as block matrices
r

As D = Vy(°°), formula (7.6.2) shows that 3*W(V) is analytic at infinity (i.e.,


has no poles there) provided the matrix / — Dl2d is invertible; in this case

We restrict our attention to rational matrix functions that are analytic at


infinity, so it will be assumed that / - Dl2d is invertible. Then / — dDl2 is
invertible as well and

Indeed, multiplication gives


240 Rational Matrix Functions

Define transformations:

Theorem 7.6.1
We have

Further, if this realization of &W(V) is minimal, then the realizations (7.6.4)


and (7.6.5) of W(\) and V(\), respectively, are minimal as well.

Proof. Write

So

We use a step-by-step procedure to compute a realization for

using these realizations for W ) and the realization (7.6.5) for by the
following rules: given two rational matrix functions d X with
finite values at infinity and realizations

realizations for Xl(\) + X2(\), Xl(\)X2(\), and A\(A) - 1 can be found as


follows [cf. formulas (7.2.9) and (7.3.2)]:
Linear Fractional Transformations 241

(it is assumed in the last formula that D} is invertible). A computation shows


that

where

Let
242 Rational Matrix Functions

where n x n and p x p are the sizes of ^4 and a, respectively. Then

and

Writing (7.6.12) in the form

we see that formula (7.6.11) follows.


0000000000000000000000000000000000000000000000000 00000000
that

for all nonnegative integers k. Using formula (7.6.7), one proves by


induction on k that

Indeed, (7.6.14) holds for k = 0. Assuming that (7.6.14) is true for k - 1, we


have

where the last equality follows in view of (7.6.13). Now


Linear Fractional Transformations 243

and x — 0 because (y, a) is a null kernel pair. [This follows from the
minimality of (7.6.11).] So the pair (C, A) is also a null kernel pair.
To prove that (A, B) is a full-range pair, observe that a* can be written
in the form

where Ylk and Zlk are certain matrices and the stars denote matrices of no
immediate interest. Formula (7.6.15) can be proved by induction on k by
means of formula (7.6.7). From the minimality of (7.6.11) it follows that for
every there exist vectors i; such that

But then, using (7.6.15) and (7.6.8), we have

and (/I, B) is a full-range pair. So the realization (7.6.4) is minimal.


Now consider the realizaton (7.6.5). sucg that

One proves that using an aggu-


ment analogous to that used in obtaining (7.6.14). Hence

In view of the minimality of (7.6.11) we obtain x = 0, and (c, a) is a null


kernel pair. Finally, write a* in the form

for some matrices zlk and ylk. [Again, equation (7.6.16) can be proved by
induction on k using (7.6.7).] For every x E <pp by the minimality of
(7.6.11) there exist vectors w 0 , . . . , uq G (p'"1 such that
244 Rational Matrix Functions

From (7.6.16), it follows that

for some vectors w(), . . . , wg, and the full-range property of (a, b) is
proved. Hence the realization (7.6.5) is minimal as well.

Observe that if D 1 2 =0, £> 2 1 =0, Dn = /, C, =0, ^ = 0 , we have


W 21 (A) = 0, W, 2 (A) = 0, W U (A) = 7, and so

On the other hand, formulas (7.6.7)-(7.6.10) take the form

which coincides with formula (7.2.9) for the realization of a product of


rational matrix functions. So (7.6.11) is a generalization of (7.2.9). On the
other hand, putting D I 2 = 0, D21 = 0, D22 = /, C2 = 0, B2 = 0, we have

and formula (7.6.11) gives another version for the realization of the product
of two rational matrix functions:

7.7 LINEAR FRACTIONAL DECOMPOSITIONS AND INVARIANT


SUBSPACES FOR NONSQUARE MATRICES

Let (J( A) be a rational matrix function of size q x s with finite value at


infinity. A linear fractional decomposition of U(\) is a representation of
(/(A) in the form

for some rational matrix functions W(\) and K(A) that take finite values at
infinity. In this section we describe linear fractional decompositions of U( A)
Linear Fractional Decompositions and Invariant Subspaces 245

in terms of certain invariant subspaces for nonsquare matrices related to a


realization of U(\).
Minimal linear fractional decompositions (7.7.1) are of particular inter-
est. First observe that the definition of the McMillan degree of a rational
matrix function with value / at infinity (given in Section 7.3) extends
verbatim to a (possibly rectangular) rational matrix function W(\) with
finite value at infinity: namely, 8(W) is the size of the matrix A taken from
any minimal realization

of W(\). In any linear fractional decomposition (7.7.1) of U(\) for which


the rational functions W(A) and V(\) take finite values at infinity, we have

Indeed, assuming that (7.6.4) and (7.6.5) are minimal realizations of W(\)
and V(A), respectively, then by Theorem 7.6.1 U(\) has a realization (not
necessarily minimal) 5 + y(\I- a)~lfi, where the size of a is t x /, with
t = 8(W) + 8(V). Hence (7.7.2) follows.
The linear fractional decomposition (7.7.1) is called minimal if equality
holds in (7.7.2), that is, 8(U) = d(W) + S(V). As in the preceding para-
graph, Theorem 7.6.1 implies that (7.7.1) is minimal if and only if for some
(and hence for any) minimal realizations (7.6.4) and (7.6.5) of W(A) and
£/(A), respectively, the realization (7.6.11) of U(\) = &W(V) is again
minimal.
Let

be a realization (not necessarily minimal) of f/(A), where a, /3, y, and 8 are


matrices of sizes / x /, / x s, q x /, and q x s, respectively. Recall from
Theorem 6.1.1 that a subspace M C <p' is [a (3] invariant if and only if there
exists an 5 x / matrix F such that M is invariant for a + /3F. Also (see
Theorem 6.6.1), a subspace X C (p' is invariant if and only if there
exists an / x q matrix G such that (a + Gy)N C N. For the purpose of this
section we can accept these properties as definitions of [a /3]-invariant and
-invariant subspaces, respectively.
A pair of subspaces (M}, M2) of <(7' will be called reducing with respect to
realization (7.7.3) if M\ is [a )8] invariant, M2 is invariant, and M^ and
M2 are direct complements to each other in <p'.
The following theorem provides a geometrical characterization of mini-
mal linear fractional decompositions of t/(A) in terms of its realization
(7.7.3).
246 Rational Matrix Functions

Theorem 7.7.1
Assume that ( M l , M 2 ) is a reducing pair with respect to the realization
(7.7.3) of £/( A). The following recipe may be used to construct realizations of
rational matrix functions W( A) and V( A) such that

and

with a transformation A: Mt — » M , , and

with a transformation a: M2—>M2: (a) choose any transformation

and any transformation d: (p 5 —*(p^ such that the transformations Du, D22
and I — Dl2d are invertible and

(b) choose any transformations F: <p'—» (p* and G:^q—>^' for which
and

be block matrix representations with respect to the direct sum decomposition


<p =M} + M2. Then, defining
Linear Fractional Decompositions and Invariant Subspaces 247

and

equation (7.7.4) holds. Moreover, if, in addition, the realization (7.7.3) is


minimal, the linear fractional decomposition (1.1 A) is minimal as well; and
conversely, any minimal linear fractional decomposition

of U( A) where the rational matrix functions

and V(\) take finite values at infinity and the matrices Wu(°°) and V^22(oo) are
invertible, can be obtained by this recipe.

Proof. Let A, B-, Cj, Dtj and a, b, c, d be defined as in the recipe.


Then, using the relationships (7.7.7), (7.7.9), and (7.7.10) and the
equalities a21 + /3 2 F 1 =0 and a 1 2 +G 1 -y 2 = 0 (which follow from the in-
variance of Ml and M2 under the transformations a + /3F and a + Gy,
respectively), one checks that the equalities (7.6.7)-(7.6.10) hold. Now by
Theorem 7.6.1. we obtain the linear fractional decompo-
sition (7.7.4).
Assume now that (7.7.3) is a minimal realization of t/(A); hence 5(U) -
I. By Theorem 7.6.1 the realizations (7.7.5) and (7.7.6) are minimal, so

As M{ and M2 are direct complements to each other in <p', we have


8(U) = S(W) + S(V), and the minimality of the linear fractional decom-
position (7.7.4) follows.
248 Rational Matrix Functions

Conversely, assume that (7.7.3) is a minimal realization of and let


(7.7.10) be a minimal linear fractional decomposition of , where the
rational functions W(\) and V(A) are finite at infinity and
are invertible. Here

0000 000000000000000000000000000000 00000 00000000000000


formula (7.7.11) and by the invertibility of Wn(°°) and W22(°°); in particular,
the matrix functions W,,(A) and W22(\) must be square.] Let

be a minimal realization of partitioned as in (7.7.12), where the


matrix A has size n x n, n- 8(W). Let

be a minimal realization of V( A) in which a is p x p, p = 8(V). By Theorem


7.6.1, form a realization

where a', (3', y', and 8' are given by formulas (7.6.7), (7.6.8), (7.6.9), and
(7.6.10), respectively, using the realizations (7.7.13) and (7.7.14). As
(7.7.11) is a minimal linear fractional decomposition, the realization
(7.7.15) is minimal. [The size of a' is (n + p) x (n +/?).] Comparing the
minimal realizations (7.7.3) and (7.7.15) we find, in view of Theorem 7.1.4,
that 8 = 8' and there exists an invertible transformation S: <p" 4- (p^—»<p'
such that

Putting

one verifies that and the minimal


linear fractional decomposition (7.7.11) is given by our recipe.

Observe that the linear fractional decomposition of U( A) described in the


recipe of Theorem 7.7.1 depends on the reducing pair ( M l , M 2 ) i on the
choice of D and d such that condition (a) holds, and on the choice of F and
Linear Fractional Decompositions and Invariant Subspaces 249

G such that (a + ^F)M1CM1, (a + Gy)M2CM2. [We assume that the


realization (7.7.3) of U(\) is fixed in advance.] We determine the parts of
this information that are uniquely defined by the linear fractional decompo-
sition. Let us introduce the following definition. Let (Mlf M 2 ) be a reducing
pair [with respect to the realization (7.7.3)] and F: <f'^ <fJ, G: £*-»> <p' be
transformations such that (a + (3F)Ml C M^, (a + Gy)M2 C M2, and write

with respect to the direct sum decomposition <p' = M^ + M2. The quadruple
(Mi,M2;F{,G\) will be called a supporting quadruple [with respect to the
realization (7.7.3)]. Given a supporting quadruple, for every choice of D
and d satisfying condition (a) of Theorem 7.7.1, the recipe produces a linear
fractional decomposition of U(X). We now have the following important
addition to Theorem 7.7.1.

Theorem 7.7.2
Assume that the realization (7.7.3) is minimal, and let (7.7.11) be a minimal
linear fractional decomposition of U(\} such that W (/ (A) and V(\) take finite
values at infinity and the matrices Wn(°°) and W22(<*>) are invertible. Then
there exists a unique supporting quadruple Q = (Ml, M2; FI} G,) that pro-
duces, together with some choice of D and d satisfying condition (a), the
decomposition (7.7.11) according to the recipe of Theorem 7.7.1.

Proof. The existence of Q is ensured by Theorem 7.7.1. To prove the


uniqueness of Q, assume that Q' - (M(, M'2\ F[, G|) is another supporting
quadruple that gives rise (with some choice of D and d) to the same
decomposition (7.7.11). As D = W(QO), d = V(<x>), we see that actually the
matrices

and d, which, together with Q', give rise to the decomposition (7.7.11) are
the same matrices chosen to produce (7.7.11), together with Q. Further, let
(7.7.8) be the block matrix representations of a, j8, and y with respect to the
direct sum decomposition (p7 = M\ 4- M2, and let

be the corresponding representations with respect to the direct sum <p' =


M\ + M2. We now have two realizations for W(A):
250 Rational Matrix Functions

where A, Bt, and Cj are given by formulas (7.7.9) and A', B't, and C'j are
given by (7.7.9) with a n , G,, F,, j8,, y, replaced by «;,, GJ, FJ, /3|, y{,
respectively. By Theorem 7.6.1, both realizations (7.7.16) are minimal, so in
view of Theorem 7.1.3 there exists an invertible transformation S: M\-+ M\
such that

Similarly, we have

where a, b, and c are given by (7.7.10) and a', b', and c' are given by
(7.7.10) with a 22 , j32, y2 replaced by a 22 , /3 2 , y 2 > respectively. Since both
realizations (7.7.18) are minimal, we have

for some invertible transformation


We now verify that

Indeed, formulas (7.7.9) together with (7.7.17) give

and

so a n = S la'nS. From formulas (7.7.10) and (7.7.19) one obtains


Linear Fractional Decompositions: Further Deductions 251

and

Further, the definition of the supporting quadruples Q and Q' implies

so

and

All the established relationships verify the equalities (7.7.20).


It remains to observe that the transformation V = is a similarity
of the minimal realization (7.7.3) with itself. Since such a similarity must be
unique (Theorem 7.1.4), it follows that V= /and hence

7.8. LINEAR FRACTIONAL DECOMPOSITIONS:


FURTHER DEDUCTIONS

We consider here some deductions, examples, and results on linear fraction-


al decompositions that follow from the main theorems, Theorems 7.7.1 and
7.7.2.
The particular case when 8 - I, D — /, and d - I in Theorems 7.7.1 an
7.7.2 is of special interest. In this case condition (a) of Theorem 7.7.1 is
satisfied automatically, and we have the following.

Theorem 7.8.1
Let

be a minimal realization of the rational q x q matrix function U(\). Let


(Jti,M2) be a reducing pair for the realization (7.8.1), and write
252 Rational Matrix Functions

Choose any transformations

in such a way that (a + J3F)M1 CMl, (a + Gy)M2 C M2. Then

and

produce a minimal linear fractional decomposition U( A) = &W(V). Converse-


ly, every minimal linear fractional decomposition U(\) = &W(V) with
W(°o) = / and V(o°) = / can be obtained in this way, and the quadruple
(Ml, M2\ F t , G,) is determined uniquely by W(\) and V(A).

Let us give a simple example illustrating Theorem 7.8.1.

EXAMPLE 7.8.1. Let

A minimal realization for U(\) is easy to find:

with 8 = -y = /3 = 7, « = - We find all nontrivial [i.e., such that


W(\)^I, V(A)^7] minimal linear fractional decompositions U(\)-
&W(V) such that W(«>) = /, K(°o) = /. Every subspace in <p2 is [a /3]
invariant, as well as invariant. We consider the case when the one-
dimensional subspaces Ml and M2 and <p2 that are direct complements to each
other are of the form
Linear Fractional Decompositions: Further Deductions 253

Then one computes

with respect to the direct sum decomposition (p2 = Ml + M2, where (1, x)
and (1, y) are chosen as bases in M} and M2, respectively. Further,
is such that (or + f$F)M\ C M, if and only if the transformation

The transformation G = is such that (a + Gy)M


2 C M2 if and only if

forG, =[g, g2] we have

Now formulas (7.8.2) and (7.8.3) give

We conclude that for every six-tuple of complex numbers (x, y, /j, f 2 , ' g l , g2]
such that x^y and (7.8.4) and (7.8.5) hold, there is a minimal linear
fractional decomposition U(\) - &W(V) where W(\) and K(A) are given by
equalities (7.8.6) and (7.8.7), respectively. D

As an application of Theorem 7.7.1, let us consider linear fractional


decompositions with several factors.
254 Rational Matrix Functions

Theorem 7.8.2
Let U(A) be a rational matrix function that has no pole at infinity, and let
m = 8(U). Then U(\) admits a linear fractional decomposition

where for j = 1, . . . , m Wj( A) is a rational matrix function that is finite at


infinity with McMillan degree 1. Moreover, VV^A) can be chosen in such a
way that

for any rational matrix function V(\) of suitable size, where Wfl(\) and
WJ2(\) are rational matrix functions of appropriate sizes with Wr/2('») = 0,
WW») = /.
Observe that the decomposition (7.8.8) is minimal in the sense that
8(U) = 8(Wl)+--- + 8(Wm). So, in contrast with the factorization of
rational matrix functions (Example 7.5.1), nontrivial minimal linear frac-
tional decompositions always exist.

Proof. Choose a minimal realization

By the pole assignment theorem (Theorem 6.5.1), there exists a transfor-


mation F such that o-(a + ftp) = ( A , , . . . , A,} with distinct numbers
A , , . . . , A, (here / x / is the size of a). So there is a basis g,, . . . g, in (p7 such
that (a + (3F)gj — A^g y , 7' = 1 , . . . , / . On the other hand, for any transform-
ation G: (p*—*• <p' there is a basis / , , . . . , / , in (p7 in which the matrix of
a + Gy has a lower triangular form (Theorem 1.9.1). Choose gy in such a
way that g;, /2, /3, . . . , / / are linearly independent and put Ml — Span{g;},
M2 = Span{/2, /3, . . . , / / } • Then (M^ M 2 ) is a reducing pair and the recipe
of Theorem 7.7.1 (with D = I, d — 8) produces a minimal linear fractional
decomposition U(\)=&w(Ul), where d(W) = l and W(<*>) = I. Moreover,
taking G =0 it follows that W(A) has the form

Hence &W(V) has the form (7.8.9). Now apply the preceding argument to
(/^A), and so on. Eventually we obtain the desired linear fractional
decomposition (7.8.8).

Observe that, because 8(Wj) = l, each function W ; (A) from Theorem


7.8.2 has only one pole /u,., and the multiplicity of this pole is 1. The proof of
000000000000 255

Theorem 7.8.2, together with formula (7.7.8) for the transformation A,


shows that the functions Wj(\) can be chosen with the additional property
that f i l , . .. , n.m are the eigenvalues (counted with multiplicities) of the
transformation a taken from a minimal realization (7.8.3) of f/(A).

00000000000000000

7.1 Find realizations for the following rational matrix functions:

Determine whether these realizations are minimal.


7.2 Find the McMillan degree and a minimal realization for the following
rational matrix functions:

7.3 Reduce the following realizations to minimal realizations

where Cp is the 1 x n matrix with 1 in the pth place and zeros


elsewhere;
256 Rational Matrix Functions

7.4 Find minimal realizations for the following scalar rational functions:

where A; is a positive integer

[Hint: In the minimal realization / + C(\I - A)~1B the matrix


A is the Jordan block of size k with eigenvalue A

7.5 Find a minimal realization for the scalar rational function with finite
value at infinity, assuming that its representation as a sum of simple
fractions is known, that is, of the form Er [Hint:
Use Exercise 7.4 (c) and Exercise 7.11.]
7.6 Show that if

are realizations for n x n and m x m rational matrix functions W,(A)


and W2(\), then the (n + m) x (n + m) rational matrix function
W,(A)@W 2 (A) has realization

Show, furthermore, that (2) is minimal if and only if each realization


(1) is minimal.
7.7 Describe a minimal realization for the 2 x 2 circulant rational matrix
function

where 0,(A) and fl2(A) are scalar rational functions with finite value at
infinity.
7.8 Describe a minimal realization for the n x n circulant rational matrix
function
Exercises 257

[As usual, assume that W(o°) is finite at infinity.]


7.9 Let ^(A) and W2(\) be rational matrix functions with realizations

Show that the sum W{(\) + W2(\) has the realization

7.10 Give an example of rational matrix functions W,(A) and W2(\) with
minimal realizations (3) for which the realization (4) is not minimal.
7.11 Assume that the realizations (3) are minimal and A} and A2 do not
have common eigenvalues. Prove that (4) is minimal as well. [Hint:
We have to show that ([C^ C 1 is a null kernel pair

and I l i s a full-range pair. Suppose that x and


such that

for k = 0,1, Because (r(At) H cr(A2) = 0, for k = 0,1, . . . there


exists a polynomial pk(\) such that pk(Al) = Q, pk(A2)= A2. Then

Hence y = 0. Similarly, one proves that x - 0.]


7.12 Let

be a rational n x n matrix function, where A,, . . . , A^. are distinct


complex numbers. Show that W(\) admits a realization
258 Rational Matrix Functions

When is this realization minimal?


7.13 Find a realization for a rational n x n matrix function of the form

(where A , , . . . , A^ are distinct complex numbers). When is the ob-


tained realization minimal?
7.14 Given a realization W(A) = C(A- A)~1B, find a realization for the
rational matrix function

Is it minimal if the realization W(A) = C( A - A) ]B is minimal?


7.15 Given a realization

of a rational matrix function, find a realization for W(a\ + )3), where


a T^ 0 and /3 are fixed complex numbers. Assuming that (5) is
minimal, determine whether the obtained realization is minimal as
well.
7.16 Given a realization (5), show that W(A 2 ) has a realization

If (5) is minimal, is this realization minimal as well?


7.17 Given a realization (5), find a realization for W(p( A)), where p(A) is
a scalar polynomial of third degree. Is the realization obtained
minimal if (5) is minimal?
7.18 Let

be a minimal realization,
(a) Show that
Exercises 259

is a realization of W(\)2.
(b) Is the realization of W( A)2 minimal?
(c) Is the realization minimal if, in addition, the zeros and poles of W( A)
are disjoints?
7.19 For the minimal realization (6), show that

is a realization of W( A) . Is it minimal? Is it minimal if the zeros and


poles of W( A) are disjoint?
7.20 Show that a realization W( A) = / + C( \I - A)~1B is minimal if A and
A — BC do not have common eigenvalues. (Hint: Use Theorem
7.1.3.)
7.21 Let W( A) be an n x n rational matrix function with W(°°) = I and
assume that W(\) is hermitian for all real A that are not poles of W(\).
Prove that for every minimal realization

there exists a unique invertible matrix 5 such that

7.22 Show that the McMillan degree of

where are distinct complex numbers, is equal to the sum


of ranks of Zl,. . . , Zk.
7.23 Show that for rational n x n matrix functions Wj(A) and W2(\) with
finite values at infinity the inequalities

hold.
260 Rational Matrix Functions

7.24 Find the McMillan degree of the circulant rational matrix function

7.25 Find a minimal realization of W(\), and, with respect to this realiz-
ation, describe all the minimal factorizations W(\) = Wl(\)W2(\) of
W(A) in terms of subspaces 2£ and M as in Corollary 7.3.2, for the
following scalar rational functions:

whree si a fixed integer

7.26 When is the realization /„ + / n (A/ n — A) 1B, where A is upper tri-


angular with zeros on the main diagonal and B is diagonal with
distinct eigenvalues, minimal? Show that in this case W(\) admits a
minimal factorization with factors having McMillan degree 1.
7.27 Prove that a circulant rational matrix function (Exercise 7.24) wit
value / at infinity admits a minimal factorization with factors having
McMillan degree 1.
7.28 Let

be a minimal realization, and assume that BC = 0.


(a) Prove that
(b) Prove that W( A) admits a nontrivial minimal factorization if and
only if A is not unicellular.
7.29 Let

be a scalar rational function. Use the recipe of Theorem 7.7.1 to


construct all minimal linear fractional decompositions t/(A) = ^W(V),
Exercises 261

such that W(\) and V(\) take finite values at infinity and Wn(oo),
W22(o°) are invertible. Find all the corresponding reducing pairs of
subspaces with respect to a fixed minimal realization of t/(A).
7.30 Show that all the following decompositions of a rational matrix
function U(\) are particular cases of the linear fractional decompo-
sition:

7.31 For the rational function U(\) given in Example 7.8.1, find all
minimal linear fractional decompositions U(\) = &W(V), with
and
Chapter Eight

Linear Systems

In this chapter we show how the concepts and results of previous chapters
are applied to the theory of time-invariant linear systems. In fact, this is a
short self-contained introduction to linear systems theory. It starts with the
analysis of controllability, observability, minimality, and state feedback and
continues with a selection of important problems with full solution. These
include cascade connections, disturbance decoupling, and output stabiliz-
ation.

8.1 REDUCTIONS, DILATIONS, AND TRANSFER FUNCTIONS

Consider the system of linear differential equations

where are constant


transformations (i.e., independent of /). Here u(t) is an n-dimensional
vector function on t > 0 that is at our disposal and is referred to as the input
(or control) of the linear system [equations (8.1.1)]. The r-dimensional
vector function y(t) is the output of (8.1.1), and the m-dimensional function
x(t) is the state of (8.1.1). Usually the state of the system (8.1.1) is unknown
to us and must be inferred from the input (which we know) and the output
(which we may be able to observe, at least partially).
Let Jt(/;* 0 , M) be the solution of the first equation in (8.1.1) [with the
initial value *(0) = *0]. It follows from the basic theory of ordinary differen-
tial equations [see Coddington and Levinson (1955), for example] that the
solution x(t;x0, u) is unique and is given by the formula

262
Reductions, Dilations, and Transfer Functions 263

Substituting into the second equation of (8.1.1), we have

Formula (8.1.3) expresses the output in terms of the input. In other words,
the input-output behaviour of the system is represented explicitly.
Now we introduce some important operations on linear systems of type
(8.1.1). It is convenient to describe (8.1.1) by the quadruple of transfor-
mations (A, B, C, D). A linear system (A1, B', C', D ) with transformations
A'\ (pm'-^4:m', B': (p"'-»<p m ', C': <p m '-^(p r ', D': <p"'-*<p r ' will be called
similar to (A, B, C, D) if there exists an invertible transformation
5: <f""'-» <pm such that

(In particular, this implies that m = m', n = n', r = r'.) We also encounter
system (8.1.1) with transformations A:M^>M, B:("->M, C:M-^-(r,
and D: (p"—> (pr, where M is a subspace of <£"" for some m. The definition of
similarity applies equally well to this case. [In particular, similarity with the
system (A1, B', C', D') described above implies dim M = m'.]
A system (A, B', C', D') with A': <pm'-* <pm', B': <p"-» <f w ',
C': <p w '-» <fr, D': <f"^ <fr will be called a dilation of (4, B, C, D) if there
exists a direct sum decomposition

with the two following properties: (1) the transformations A', B', C' have
the following block forms with respect to this decomposition

where the stars denote entries of no immediate concern (so A: M—>M,


C:M^>fr, fl: <£"•-»./#); (2) the system (A, B, C, D') is similar to
(A, B, C, D). In particular, if (A, B', C', D') is a dilation of (A, B, C, /)),
then D' = D. The form (8.1.5) for A' shows that the subspaces 58 and
5£ + M are A' invariant; in other words, (8.1.4) is a triinvariant decompo-
sition associated with the ^4-semiinvariant subspace M. Similarity is actually
a particular case of dilation, with M — <pm and Jz? = Jf = {0}.
We say that (A, B, C, D) is a reduction of (A', B', C', D') if
(A1, B', C', D') is a dilation of (A, B, C, D).
264 Linear Systems

The basic property of reductions and dilations is that they have essentially
the same input-output behaviour; as follows.

Proposition 8.1.1
Let (A', B', C', D') be a dilation of (A, B, C, D). Then, for *0 = 0, the
input-output behaviours of the systems (A', B', C', D') and (A, B, C, D)
are the same. In other words, if u(t) is any (say, continuous) n-dimensional
vector function, then the output y = y(t; 0, u) of the system (A', B', C', D')
and the output y = y(t; 0, u) of the system (A, B, C, D) coincide.

Proof. Formula (8.1.3) gives

As D' = D, and e('~s)A (for a fixed / and s) admits a power series represen-
tation (see Section 2.6), we have only to show that for q =0,1,. . .

Using formula (8.1.5), we obtain

Now (A, B, C, D') and (A, B, C, D) are similar, so there exists an invert-
ible transformation S such that A = S~1AS, C = CS, and B = S~1B. Hence

and (8.1.6) follows.

In practice one is concerned about the dimension m of the state space of


a given system (8.1.1). It is desirable to make this dimension as small as
possible without changing the input-output behaviour. We say that the
system (8.1.1) is minimal if the dimension m of its state space is minimal
among all linear systems (A', B', C', D') that exhibit the same input-output
behaviour given the initial condition that that state vector is zero [i.e.,
jc(0) = 0]. In view of Proposition 8.1.1, the following problem arises: given
the linear system (8.1.1), not necessarily minimal, produce a minimal system
by reduction of (8.1.1). We see later that this is always possible.
Minimal Linear Systems: Controllability and Observability 265

To study this and other problems in linear system theory, it is convenient


to introduce the transfer function. Consider the system (8.1.1) with Jt(0) = 0,
and apply the Laplace transform. Denote by the capital Roman letter the
Laplace transform of the function designated by the corresponding small
letter; thus

[It is assumed here that for f > 0 z(/) is a continuous function such that
|z(0l ^ Ke*' for some positive constants K and /x. This ensures that Z( A) is
well defined for all complex A with Re A > /a.] The system (8.1.1) then takes
the form

Solving the first equation for X(\) and substituting in the second equation,
we obtain the formula for the input-output behaviour in terms of the
Laplace transforms:

So the function W(A) = D + C( A/ - A)~1B performs the input-outut map


of the system (8.1.1), following application of the Laplace transform. This
function is called the transfer function of the linear system (8.1.1). Observe
that the transfer function is a rational matrix function of size r x n that has
finite value (=D) at infinity. Observe also that the transfer functions of two
linear systems coincide if and only if the systems have the same input-
output behaviour. In particular, systems obtained from each other by
reductions and dilations have the same transfer functions.

00000000000000000000000000000000000000
0000000000000000000000000000000000000000000000

Consider once more the linear system of the preceding section:

and recall that this system is called minimal if the dimension of the state
space is minimal. [We omit the initial condition *(0) = *0 from (8.2.1); so
(8.2.1) has in general many solutions x(t).]
266 Linear Systems

Applying the results of Section 7.1 to transfer functions, we obtain the


following information on minimality of the system (8.2.1).

Theorem 8.2.1
(a) Any linear system (8.2.1) is a dilation of a minimal linear system; (b) the
linear system (8.2.1) is minimal if and only if (A, B) is a full-range pair and
(C, A) is a null kernel pair:

where m is the dimension of the state space. Moreover, in (8.2.2) one can
replace nJL 0 Ker CA' by np()' Ker CA' and Z^lm(B'A) by E^1 Im(B'A),
where p is any integer not smaller than the degree of the minimal polynomial
of A.

Indeed, (a) is a restatement of Theorem 7.1.3, and (b) follows from


Theorem 7.1.5.
It turns out that the conditions (8.2.2) obtained in Chapter 7 from
mathematical considerations have important physical meanings, namely,
"controllability" and "observability" of the linear system (8.2.1). Let us
introduce these notions.
The system (8.2.1) is called observable if for every continuous input u(t)
and output y(t) there is at most one solution x(t). In other words, by
knowing the input and output one can determine the state (including the
initial value) in a unique way.

Theorem 8.2.2
The system (8.2.1) is observable if and only if (C, A) is a null kernel pair:

Proof. Assume that (8.2.1) is observable. With y(t) = Q and u(t) = 0,


the definition implies that the only solution of the system

for f > 0 is x(t) = Q. If equality (8.2.3) were not true, there would be a
nonzero A:O £ nj!0 Ker CA' and the function x(t) — e'Ax0 would be a not
identically zero solution of equation (8.2.4). Indeed, for every 12^0 we have
Minimal Linear Systems: Controllability and Observability 267

Thus observability implies the condition stated in equality (8.2.3).


Now assume that (8.2.3) holds but (arguing by contradiction) the system
(8.2.1) is not observable. Then there exist continuous vector functions y(t)
and u(t) such that for j - 1, 2 and all t >0, we obtain

for some x { ( t ) and x2(t) that do not coincide everywhere. Subtracting


(8.2.5) with / = 2 from (8.2.5) withy = 1, and denoting x(t) = *,(0 - x2(t)^
0, we have

In particular, it is found that

Hence x0 = 0 by (8.2.3); but this contradicts x(t)^Q.

The system (8.2.1) is called controllable if by a suitable choice of input


the state can be driven from any position to any other position in a
prescribed period of time. Formally, this means that for every jct G (p"1,
x2 G <£"", and t2 > tl ^ 0 there is a continuous function u(t) such that
x ( t } ) = jc,, x(t2) = *2 f°r some solution *(/) of

Note that in the definition of controllability the second equation y(t) =


Cx(t) + Du(t) of equation (8.2.1) is irrelevant. Further, by replacing x(t) by
x(t - / j ) we can assume in the definition of controllability that f j is always 0.

Theorem 8.2.3
The system (8.2.1) is controllable if and only if (A, B) is a full-range pair:

We need the following lemma for the proof of Theorem 8.2.3.


268 Linear Systems

Lemma 8.2.4
Let G(t), f G [0, /0] be an mx n matrix depending continuously on t. Then

Proof. Let W= J^0 G(t)[G(t)]* dt. Assume x G <pm is such that x = Wy


for some y G <p". Then putting «(f) = [G(/)]*_y we find that x belongs to the
left-hand side of (8.2.7).
Conversely, if xl £lm W, then there exists an x2 G <pm such that Wx2 = 0
and (jtj, x2)^Q. [Here we use the property that W= W* and thus Im W —
(Ker W)1.] Arguing by contradiction, assume that there exists a continuous
vector function u(t) such that

Then

On the other hand

and since the norm is nonnegative and G(t)* continuous, we obtain


G(t)*x2 = 0, or x£G(f) = 0 for all / G [0, t2]. But this contradicts (8.2.8). D

Proof of Theorem 8.2.3. By formula (8.1.2) for every solution x(t) of


(8.2.6) with jc(0) = x{ we have

Hence

From this equation it is clear that (8.2.1) is controllable if and only if for
every t2 > 0 the set of m-dimensional vectors
Minimal Linear Systems: Controllability and Observability 269

coincides with the whole space <pm. By Lemma 8.2.4, the controllability of
(8.2.1) is equivalent to the condition that Im Wt = (p™ for all f >0, where

We prove Theorem 8.2.3 by showing that for all / > 0

If x e Ker Wt, then x*W,x = 0, that is

So B*e sA x - 0, Q<s^t. [Otherwise, in view of the continuity of


H/?*^**!!2 as a function of 5, we obtain a contradiction with (8.2.10).]
Repeated differentiation with respect to s and putting s = 0 gives

It follows that

Assume now that Then B*A*' lx = 0,i = l,2, It


sA
follows that B*e 'x = 0 when s > 0, and hence x*Wtx = 0 for t > 0. But W,
is nonnegative definite, so actually Wtx - 0, that is, jc E Ker Wt.

Combining Theorem 8.2.1 with Theorems 8.2.2 and 8.2.3, we obtain the
following important fact.

Corollary 8.2.5
The linear system (8.2.1) is minimal if and only if it is controllable and
observable.

This corollary, together with Theorem 7.1.5, shows that the concept of
minimality for systems and realizations of rational functions are consistent,
270 Linear Systems

in the sense that a system is minimal precisely when it determines a minimal


realization for its transfer function.

8.3 CASCADE CONNECTIONS OF LINEAR SYSTEMS

Consider two systems of type (8.1.1) (with initial value zero):

and

Suppose also that u } ( t ) and y2(t) are from the same space. The two systems
are combined in a "cascade" form when the output y2 of the second system
becomes the input w, of the first system. We obtain

and

Writing x(t) = we obtain a new system of the same type:

The system (8.3.3) is called a simple cascade composed of the first compo-
nent (8.3.1) and the second component (8.3.2). Note that the dimension of
the state space of the simple cascade is the sum of the state space
dimensions of its components, and the input of the simple cascade coincides
with the input of its second component, whereas the output of the simple
cascade coincides with the output of the first component.
Similarly, one can consider the simple cascade of more than two compo-
nents. Let (Av, /?,, C,, D,),. . . , (A , B , C , D ) be linear systems of
Cascade Connections of Linear Systems 271

type (8.1.1). A linear system that is obtained by identifying the output of


(A,, B,, C,., D,) with the input of (A,_lt £,_,, C,_,, D,_,), i = 2,3,. . . , p
will be called the simple cascade of the systems (.4,, Blt C,, D , ) , . . . ,
(Ap, Bp, Cp, Dp). By applying formula (8.3.2) p - 1 times, we see that such
a simple cascade has the form

In the language of transfer functions the simple cascading connection has


a very simple interpretation: formula (7.2.9) shows that the transfer
function of the simple cascade of two systems is the product of the trans-
fer functions of its first and second components (in this order). More
generally, if (A, B, C, D) is the simple cascade of (A^ 5,, C 1? £ > , ) , . . . ,
(Ap, Bpt Cp, D,), then

The following problem is of considerable interest: describe the represen-


tation of a given linear system (A, B, C, D) as a simple cascade of other
linear systems. We can assume that (A, B, C, D) is minimal (otherwise
replace it by a minimal system with the same input-output behaviour). In
order to relate this problem to the factorization problem for rational matrix
functions described in Sections 7.3 and 7.5, we shall assume that D — I and
that in each component (At, /?,, C,, D-) of the simple cascade (A, B, C, 7)
we have D, = /. Equation (8.3.4) shows that if (A, B, C, D) is a simple
cascade with components (At, Bt, C,, D,), / = ! , . . . , / ? , then the size of A
[or, what is the same, the McMillan degree S(W) of the transfer function
W(A) of (A, B, C, /)] is equal to ra, + • • • + mp, where mi is the size of At,
Denoting by W,(A) the transfer function of (At, Biy C,, £>,), we have
8(Wi) < mr On the other hand, as we have seen in the preceding paragraph

which implies

So equality holds throughout (8.3.6), which means that the factorization


(8.3.5) is minimal and that each system (A(, 5,, C,, D,), / = 1,. . . , p is
272 Linear Systems

minimal. Now we can use the results of Sections 7.3 and 7.5 concerning
minimal factorizations of rational matrix functions to study simple cascading
decompositions of minimal linear systems. The following analog of Theorem
7.5.1 is an example.

Theorem 8.3.1
The components of every representation of a minimal system (A, B, C, /) as
a simple cascade (with the transfer functions of the components having value I
at infinity) are given by

where the projectors TT, , . . . , irp and associated subspaces !£l,. . . , !£p are
defined as in Theorem 7.5.1. The transformations TTjA^ in (8.3.7) are
understood as acting in X^ and the transformations CTT^ and 7r(B are
understood as acting from j^ into <p", and from <p" into ^, respectively,
where n is the number of rows in C (which is equal to the number of columns
in B).

We now describe a more general way to connect two linear systems.


Consider the linear system

and assume that the input vector u = u(t) and the output vector y = y(t) are
divided into two components:

Now let

be another linear system with the input s(t), output z(t), and the state w(t).
(Here a, b, c, and d are constant matrices of appropriate sizes.) We obtain a
new system by feeding the first component of the output of (8.3.8) into the
input of (8.3.10) and at the same time feeding the output of (8.3.10) into the
second component of the input of (8.3.8). [It is assumed, of course, that the
vectors y\(t) and s(t) are in the same space, as well as the vectors « 2 (0 anc*
z(t).] This situation is represented diagrammatically by
Cascade Connections of Linear Systems 273

Here S t and S2 represent the linear systems described by equations (8.3.8)


and (8.3.10), respectively. The new system has u v ( t ) as an input and y2(t) as
an output and is called the cascade of (8.3.10) by (8.3.8). The "simple
cascade" described in the first part of this section is a particular case of a
cascade. Indeed, if the first component of the output y^(t) in the system
(8.3.8) depends on M,(/) only, and y2(t) = « 2 (0> then tne cascade described
by (8.3.11) is actually a simple cascade.
We turn now to a description of the cascade in terms of transfer
functions. First, rewrite (8.3.8) in the form

where

are the block matrix representations of B, C, and D conforming with the


division [equations (8.3.9)] of y(t) and u(t). The transfer function of this
system is

where W^(A) = Dif + C,( A/ - A) '#,; /, j =1,2. So passing to the Laplace


transforms, we have

where, as usual, the capital Roman letters indicate Laplace transforms of


the functions designated by the corresponding lowercase letters. Let V( A) be
the transfer function of (8.3.10); then

Now identify

Using (8.3.12)-(8.3.14) we have (omitting the variable A)


274 Linear Systems

and hence

Further

So the cascade of a linear system with the transfer function V( A) by a linear


system with the transfer function

has the transfer function t/(A) given by the formula

We recognize that U(\) is just a linear fractional transformation, t/(A) =


3PW(V), as discussed in Chapter 7. Consequently, the results of Sections 7.6,
7.7, and 7.8 can be interpreted in terms of minimal cascades of linear
systems. The cascade of (7.3.10) by (7.3.8) will be called minimal if the
corresponding linear fractional decomposition U = 2FW(V) is minimal. As an
example, let us restate Theorem 7.8.2 in these terms.

Theorem 8.3.2
Any minimal linear system with in-dimensional state space can be represented
as a minimal cascade of m linear systems each of which has one-dimensional
state space.

8.4 THE DISTURBANCE DECOUPLING PROBLEM

In this and the next section we consider two important problems from linear
system theory in which [A B]-invariant subspaces (as discussed in Chapter
6) appear naturally and play a crucial role.
Consider the linear system
The Disturbance Decoupling Problem 275

where A: <£"-+ f, B: <f m -» <p", E: <pp-» <p", and D. <p"-» <pr are constant
transformations, and *(/), w(f), q(t), and z(f) are vector functions taking
values in <p", <[?w, <pp, and £r, respectively.
As in Section 8.1, <£" is interpreted as the state space of the underlying
dynamical system, and u(t) is the input. The vector function z(t) is inter-
preted as the output. The term q(t) represents a disturbance that is supposed
to be unknown and unmeasurable. We assume that q(t) is a continuous
function of t for t > 0.
An important transformation of the system (8.4.1) involves "state feed-
back." This is obtained when the state x(t) is fed through a certain constant
linear transformation F into the input, so the input of the new system is
actually the sum of the original input u(t) and the feedback. Diagrammati-
cally, we have

Our problem is to determine (if possible) a state feedback F in such a way


that, in the new system, the output is independent of the disturbance q(t).
To express this problem in mathematical terms we introduce the follow-
ing definition. The system (8.4.1) is called disturbance decoupled if for every
x0 G <p" the output z(t} of the system (8.4.1) with jt(0) = jc0 is the same for
every continuous function q(t). We have (cf. Section 8.1)

and thus

Hence the system (8.4.1) is disturbance decoupled if and only if

for every continuous function q(t).


We need one more notion from linear system theory. Consider the linear
system

where A: <p"—»<p" and B: <p m —» <p" are constant transformations. We say


276 Linear Systems

that the state vector y G (p" is reachable for the system (8.4.2) if there exist a
t0 > 0 and a continuous function u(t) such that the solution x(t) of (8.4.2)
satisfies x(tQ) — y. As

for / >0, it follows easily that the set of all reachable state vectors of (8.4.2)
is a subspace.

Proposition 8.4.1
The set 91 of reachable states coincides with the minimal A-invarant subspace
that contains Im B:

Proof. By Lemma 8.2.4 we find that x e <3l if and only if

for some

By equality (8.2.9)

or, taking into account the hermitian property of W,o

which coincides with (A \ Im B} in view of Theorem 2.7.3.

Using this proposition, we obtain the following characterization of distur-


bance decoupled systems.

Proposition 8.4.2
The system (8.4.1) is disturbance decoupled if and only if
The Disturbance Decoupling Problem 277

Returning to the problem mentioned above, note that state feedback is


described by a transformation F: <p"-» <pm, and substituting u(t) + Fx(t) in
place of u(t) in the system (8.4.1), we obtain the system with state feedback:

The new system has the same form as the original system (8.4.1), with A
replaced by A + BF. Our mathematical problem is: given transformations
A: <p"-» <p" and B: <p m ^ <p", and given subspaces g C <p" (which plays the
role of Im E) and 3) C <p" (which plays the role of Ker D), find, if possible,
a transformation F: <p" -»<f"" such that the subspace

[which is the minimal (A + BF)-invariant subspace containing <£] is con-


tained in 2).
The solution to this problem depends on the notion of [A B]-invariant
subspaces, as developed in Chapter 6.

Theorem 8.4.3
In the preceding notation, there exists a transformation F: <p" —> <pm such that

if and only if the [A B]-invariant subspace °U that is maximal in 9) contains


%. In this case any transformation F: <p"-* (pm with (A + BF)°U C % (which
exists by Theorem 6.1.1) has the property (8.4.3).

Proof. Assume that there is an F: <p"—» <pm with the property (8.4.3).
By Theorem 2.8.4 (applied with A + BF playing the role of A and any
transformation whose image is <£ playing the role of B) the subspace
(A + BF | g) is (A + BF) invariant, and thus (Theorem 6.1.1) it is [A B]
invariant. As (A + BF %) D <£, and the maximal (in 2)) [A B]-invariant
subspace °U contains < A + BF \ %), we obtain ^ D g.
Conversely, assume °ti D <£. By Theorem 6.1.1 there is a transformation
F: <p"-» <pm such that (A + BF)^ C <U. Now

and (8.4.3) follows.


278 Linear Systems

When applied to the disturbance decoupling problem, Theorem 8.4.3 can


be restated in the following form.

Theorem 8.4.4
Given a system (8.4.1), there exists a state feedback F: <p"—> <pm such that the
system

is disturbance decoupled if and only if the [A B]-invariant subspace °U that


is maximal in Ker D, contains Im E. In this case the system (8.4.4) is distur-
bance decoupled for every transformation F: <p"—» <p™ with the property that
°U is (A + BF) invariant.
We illustrate Theorem 8.4.4 by a simple example.

EXAMPLE 8.4.1. Let

where al, a2, and « 3 , as well as £,, b2, and b3 are complex numbers not all
zero. Using Theorem 6.4.1 and its proof, we find that a one-dimensional
subspace M is [A B] invariant if and only if

for some A G <p, and a two-dimensional subspace M is [A B\ invariant if and


only if either

or

Consider first the case when Ker D is [A B] invariant. This happens if and
only if 6 3 ^0. Then obviously Ker D is the maximal [A #]-invariant
The Output Stabilization Problem 279

sukkkbspaceindefrjd,nanddefral;fkjl; jubkf;ehuf aklfjsabdj


l;kfjkijb;lnkjkajfgl;tkaljskofkjhsijbfonkgkusfabnd
jl;kj;lnast
kjl;j;elkunjjlj fshefhuxj
j fss XKLDJFL;AHFL/NH;H
H LKJL;JLK
so, when DF 3 ^0, there exists a 1 x 3 matrix JKDLAF such that the
system (8.4.4) is disturbance decoupled if and only if a,6, + a2b2 + a3b3 = 0,
and in this case one can take XBVGXD in such a way that the polynomial
bl + b2x + b3x2 divides DSFSFSGFVSGSDGGDKJ;ALKJF;SKOAJFA;SKLJ;KLJL;JKL;KJFLKJALKLKJKJFJOJWIOJFJSFJSJLK;NJAJKSFA;LKS;LKJFAS;LKJFEJ;IOJK;DJF SJKAssume
FKOIEKN;IOEL;KNKL;J now that Ker D is not
ALJKLJ;LJLJLJAALA;KFJ BDFNLKA;LKJFIEKJL;KJDISATHKILYARAJAFL;KJL;IEKLKJ THEN THE MAAXIKJAL AAL;KJKL;SJF
sukkbspace in der dis span WHEREWHEREWHEREWHEREWHERE INTHISCASWWEN
have Span im E if and only if

So, if b3 = 0 and b2 7^0, there exists an F= [/i/ 2 / 3 ] as in Theorem 8.4.4 if


and only if (8.4.5) holds, in which case one can take/y in such a way that the
polynomial b} + b2x divides -/} - f2x - f3x2 + jc3. Finally, assume b2 =
b3 — Q. Then the maximal [A fi]-invariant subspace in Ker D is the
zero subspace, and there is no F for which the system (8.4.4) is distur-
bance decoupled. D

ALSKJFLSJ;FKSSL;JFKLSJFLJSKDJOUKTJPLUOTSTAHB;IOESAT;IOJ
8.5 THE OUTPUT STABILIZATIONKLPRONELM DJLJSOFKOLN
PROBLEM

tConsider theSYSTEM
CONSIERTHE system

where the transformations and ARE


constant. The problem we deal with in this section is that of stabilizing the
output z(t) by means of a state feedback while still maintaining the freedom
to apply a control function
function u(t). More exactly, the problem is to find a
TRANSFORMATION 9QHIXHHREPRESENTS THE STAE FEED BACKKK90
solution of the new system

WITH IENTICALY ZEERKOONKPLAKJFLH FKOR EVERY INITIAL VALUKE


AS
ThTTTh linear System
this condition amounts to

To study the property (8.5.2), we need the following lemma. This is, in
fact, a special case of Theorem 5.7.2, but it is convenient to have some of
the conclusions recast in the present form.

Lemma 8.5.1
lettyth be stranfsojsrmfation , anfd sklfj
Q, M + be the sum of root
subspaces of A corresponding to the eigenvalues with negative, zero, a
positive real parts, respectively. Thus

and fkodsr skome wen have

(c) fkor all

Note that Jt_, M(), and M+ are /1-invariant subspaces, and therefore
these subnspaces farse also oknvariant fro the ftransofkormations
gijvefn al. b,d, as in fdjl;skjfldfjla;kjf;skanjktransofosrmadtion

be the maximal (A + BF)-invariant subspace in Ker D. The condition


(8.5.2) can be expressed in terms of the root subspaces of A + BF as
follows.

Lemma 8.5.2
We have

if and only if £% fkoreverhyki efifefn asdvljliule sufch that j


4re

Proof. By Theorem 2.7.4 we have


The Output Stabilization Problem 281

cccc
with respect tko thef diect sli skjflksjfl;jedckojklpjkojtisoitijojnl ljlsfjlsjfs;s;s;s;s;s;s;slslslsllll where
where i is ais a
direct complement to N alsoalso jis a null ellhance

Nowcklearlhy
now clearly for everhy
fkor every
with with if andj only jilfj jifn hkas all iltls eigen vslues iln theffj kkoopen
left-half
left . hjalf plane.
j;plane.l So we jhkavfe
so jwe have tootprkovke
prove that
that

if and only if all the eigenvalues of A22 are in the open left-half plane.
Let jc be an eigen vfectkor of lkof ;lff;lcokrresopojnding tko the eigenvaluke4 vallirhj with
then

and bykl lemma al;js;jl;sfl;jks;lfj;sljfjkl

;aflj but if thfen

which contradicts the definition of NF. Hence in equality (8.5.4) holds,


and thus (8.5.3) does not.
Conversely, if <r(/4 22 ) lies in the open left-half plane, then (8.5.3) holds
by Lemma 8.5.1 (where A22 plays the role of A). D
Now we can reformulate the problem of stabilizing the output by state
fedbacjk as followa givfefn transpfokrmationsl and s
subspace 9whuich plays thfe rkokle kof ker gh such
that everyk rkootj subspace of abf ckorrepkondinf ttko an egtenvslkue fa;llkydlvflk7ue a;llllllllvalk7ue wfith
nkonnefgtatjiveff rfeal jipart jis contained in
In this formulation there is nothing special about the set of eigenvalues
with nonnegative real parts. In general, we can consider any proper subset
kodk (the "bad:":domain )domain domain
klkklklkllof thfe colkosed riht _half plane. cloksefd right .lhalf oplane right half plane. plane.
Now we can prove a general result on solvability of this problem in terms
of [A /?]-invariant subspaces.

Theorem 8.5.3
Given transformation, and a substpace
ace
there exists a transformation such thatjat for
t282 linjear systems

every eigenvaluy4r ifa and only if , for evferyi eigen vaklue of


A in H 6 , we have

where (A \ Im B) is the minimal A-invariant subspace containing Im B and


is the mazimal invariant subspace in

proof,. for fafgevfe t4ransofkormation f be the maximal


invariant asw F is also [A B] invariant, we have

Assume now that F is such that

for every eigenvaklue4 id WBD RGr vwkibfa ri then ,by lemmma


for every ve

and hence

CCCCCCCCCCC;LAKJSL;KJF;THE CANKOJ;LICSL LINEFARTRANSFROMAFTION FJ


FOREVERY ANDFORATRANSFRONATION
FOR WHILCH IJS AN INVGARIANT SUBSPACE, LKET BE THE
TRANSFORMATION INDUCED BY XKON ONE EASILY CHFECKS FTHAT A
NOW USE LEMMA 654TOIJBRub

for every sjilmilarlsyu

for every that ils jnot6j ank ejiotn vslkukefkj ofj a=bf h,cibnsyqhubtkgtv

for every
onvefrselyk, assume thaqt 98560holsj for eve4ry we hjave
tko oporovef fthat6 jtherfe efxists an f such thatj fkor evefryk
The Output Stabilization Problem 283

eigenvalkue kof ja=bfthatj blkelogs tykl let be aj transfor


mation sfuch thatj that 9a-bf it ijs jfeasily sen thatj the subspace
is a invariant. we have = A, where
where the upper
the jupper
bar jdenotes thef jilnducecd transfkormation kon enoting byklb the
canonnical tansfkormation wesee thatj lemma 6.5,.4andj equality
se r y i

Hence

further,u t5he jilncu jimpljies that indwe


havfe seen thaincluksjinon at thej begilnnilng jof jthlis ;rrko.
opposdfite injcluksiion, taskke then fkor we
have ece andj thej inclusiion
folllows. be the canoknocslby
thej transfokrmation inducfefd btykl it is assumed
that isz invgarjiant 0a

Nopw Theorem 6.5.3 implies the existence of a transformaton


such that the spectrum of liies in the comple-
ment of We have

which means that

By Lemma 6.5.4 again, for every

so

and F = F0 + FI is the desired transformation. D


284 Linear Systems

The proof of Theorem 8.5.3 shows that, assuming $k(A)C


(A | Im B) + °U for every A0 E Clb, the transformation F: $"-+ fm such that
91A (A + BF) C % for every A0 E Ht can be constructed as follows: F =
F0 + F,P, where F0: f"-* $m is such that (A + BF0)*U C °U; P: ("W is the
canonical transformation; and Fl: $nl°U —> <pm has the property that the
spectrum of the transformation on <p"/^ induced by the transformation
A + B(FQ + F,P) on <p" lies outside H 6 .
Applying Theorem 8.5.3 to the output stabilization problem, we obtain
the following result.

Theorem 8.5.4
Given the linear system

with constant transformations A: f "-+$", B: fm-*f"t and D: $"->$'>


there exists a transformation (state feedback} F: <p" —* (pm such that, for every
initial value jc(0), the solution of (8.5.8) with u(t) = Fx(t) satisfies
lim,^.^ z(r) = 0 if and only if

for every eigenvalue A0 of A lying in the closed right half plane, and where °ll
is the maximal [A B]-invariant subspace in Ker D.

We conclude this section with an example illustrating Theorem 8.5.4.

EXAMPLE 8.5.1. Let

Ker D — Span{

where al, a2, a3 are complex numbers not all zeros. Here (A Im B) =
Span{el,e2}. If £%eA 0 <0, then there is always an F=[flf2f3] with
properties as in Theorem 8.5.4 (one can take /3 = 0 and choose /i and /2
that the equation A 2 -/ 2 A -/, = 0 has its zeros in the open left-half plane).
So assume 9le A0 > 0. Then there exists an F as in Theorem 8.5.4 if and only
if
Exercises 285

If «3 - 0, then (8.5.9) is always false. If «3 ^0, then (8.5.9) happens if and


only if the subspace Ker D is [A B] invariant, or, equivalently, if Ker D is
(A + BG) invariant for some G = [g,g 2 g 3 ]. An easy verification shows that
this is the case if and only if So there exists an as in
Theorem 8.5.4 if and only if « 3 ^ 0 and a 2 = A 0 a 1 . In this case /3 =
Aoflj — /jflj - A 0 / 2 a, and /, and /2 for which the zeros of A 2 — / 2 A — /, = 0 are
in the left half plane will do. D

8.6 EXERCISES

8.1 For every input u(t) find the output y(t) for the following linear
systems:

8.2 For every input w(f) find the output y(t) for the following linear
systems:

where B is the (k^ + k2) x 2 matrix whose first column is ek and


second column is ek +k , and C is the 2 x (kl + k2) matrix whose first
row is e\ and second row is eTk +1 .

where A is an nx n lower triangular matrix and C=


[0 • • • 0 1 0 • • • 0] with 1 in the &th place.
286 Linear Systems

8.3 Consider the linear system

When is this system controllable? observable? minimal?


8.4 Find transfer functions for the linear systems given in Exercises 8.1
and 8.2.
8.5 Build minimal linear systems with the following transfer functions:

(b) p( A) ', where p( A) = E*=0 a ; A y is a scalar polynomial.


(c) (L(\)y\ where L(A) is a monic n x n matrix polynomial of
degree /.
8.6 Show that the system

is controllable and observable.


8.7 For the system in Exercise 8.6, given the n-tuple of complex numbers
A,, . . . , \n, find a state feedback F such that A + BF has eigenvalues
A , , . . . , A n . Also, find G such that A + GC has eigenvalues

8.8 Let

be a linear system with n x n circulant matrix A and n x 1 and 1 x n


matrices B and C, respectively. When is the system controllable?
Observable? Minimal?
Exercises 287

8.9 Consider the linear system

where / is a nilpotent n x n Jordan matrix (i.e., with J" = 0) and B


and C are n x 1 and 1 x n matrices, respectively. When is this system
controllable? Observable? Minimal?
8.10 Prove or disprove: if the system

is minimal, then the system

is minimal as well.
8.11 Let p(A) be a polynomial of the transformation A: <p"-» <p". Prove
that the minimality of the system

implies the minimality of

Is the converse true?


8.12 Let

and

be two systems, and assume that A2-p(Al), where p(A) is a


polynomial such that p(A,) 5^p(A 2 ) for any pair of different eigen-
values A! and A2 of A l and p'( A)| A=A ^ 0 for every eigenvalue A0 of A l
such that A^ (A ) is not diagonable. Prove that the systems are
simultaneously minimal or nonminimal.
288 Linear Systems

8.13 Show that if the system

is controllable, then for every A0 G <p the system

is controllable as well. Is this property true for the observability of


systems?
8.14 For a controllable system

where bn ¥=• 0, find a state feedback F such that the system with
feedback

is stable, that is, all its solutions x(t) tend to zero as t—»°°.
8.15 For system (1) in Exercise 8.14 and any k >0, find a state feedback F
such that all solutions x(t) of the system with feedback satisfy
||*0)|| < Ke~kl, where K>0 is constant independent of t.
8.16 Prove that any minimal linear system with n-dimensional state space
has a state feedback for which the system with feedback can be
represented as a simple cascade of n linear systems with state spaces
of dimension 1.
8.17 Prove that controllability is a stable property in the following sense:
for everv controllable svstem

there exists an e > 0 such that any linear system

with is controllable as well.


Exercises 289

8.18 Prove that observability and minimality of linear systems are also
stable properties. The definition of stability is, in each case, to be
similar to that of Exercise 8.17.
8.19 Show that for any system

there exists a sequence of minimal systems


Notes to
Part 1

Chapter 1. The material here is quite elementary and well known,


although not everything is readily available in the literature. Part of Section
1.5 is based on the exposition in Chapter S4 of the authors' book (1982).
More about angular transformations and matrix quadratic equations can be
found in Bart, Gohberg, and Kaashoek (1979). Angular subspaces and
operators for the infinite dimensional case were introduced and studied in
Krein (1970).
Chapter 2. The proof of the Jordan form presented here is standard and
can be found in many books in linear algebra; for example, see Gantmacher
(1959) or Lancaster and Tismenetsky (1985). A proof of the Jordan form
can be obtained also by analyzing the properties of the set of all invariant
subspaces as a lattice. This was done in Soltan (1973a). In this approach, the
invariance of the Jordan form follows from the well-known Schmidt-Ore
theorem in lattice theory [see, e.g., Kurosh (1965)].
"The yl-invariant subspace maximal in JV" and "the .A-invariant subspace
minimal over JV" are phrases that are introduced here probably for the first
time, although the notions themselves had been developed and are now well
known in the context of linear systems theory. In general, the whole
material of Sections 2.7 and 2.8 is influenced by linear system theory.
However, our presentation here is independent of that theory and leads us
to abandon its well-established terminology. In particular, in linear systems
theory, "full-range" and "null kernel" pairs are known as "controllable"
and "observable" pairs, respectively. Marked invariant subspaces are prob-
ably introduced for the first time. The existence of nonmarked invariant
subspaces is often overlooked. The description of partial multiplicities and
invariant subspaces of functions will hold no surprises for the specialist, but,
again, these are results that are not easily found in the standard literature on
linear algebra.

290
Notes to Part 1 291

Chapter 3. The material of this chapter (except for Theorem 3.3.1) is


well known. Theorem 3.3.1 in the infinite dimensional case was proved by
Sarason (1965). Here we follow his proof.
Chapter 4. The problem of analysis of partial multiplicities of extensions
from an invariant and a coinvariant subspace was stated in Gohberg and
Kaashoek (1979). This problem was connected there with the description of
partial multiplicities of products of matrix polynomials in terms of partial
multiplicities of each factor and reappears in this context in Section 5.2. The
first results concerning this description were proved in Sigal (1973). In
particular, Theorem 3.3.1 was proved in that paper. Example 4.3.1 and the
material in Section 4.4 (except for Proposion 4.4.1) is taken from Rodman
and Schaps (1979). For further information and more inequalities concern-
ing the partial multiplicities, see Thijsse (1980,1984) and Rodman and
Schaps (1979).
When this book was finalized, the authors learned about another impor-
tant line of development concerning the problem of partial multiplicities of
products of matrix polynomials. This has been intensively studied (even in a
more general setting) by several authors. The reader is referred to recent
work of Thompson (1983 and 1985) for details and further references.
Chapter 5. The theory presented in this chapter can be viewed as a
generalization of the familiar spectral theory of a matrix A but, in this
context, identified with the linear matrix polynomial A / — A. This theory of
matrix polynomials was developed by the authors and summarized in the
book by Gohberg, Lancaster, and Rodman (1982). The material and
presentation in this chapter is based on the first four chapters of that book.
It also contains further results on matrix polynomials including least com-
mon multiples, greatest common divisors, matrix polynomials with her-
mitian coefficients, nonmonic matrix polynomials, and connections with
differential and difference equations. Lists of relevant references and
historical comments on this subject are found in the above-mentioned
monograph by the authors (1982). In this presentation we focus more
closely on decompositions into three or more factors. Theorem 5.2.3 is close to
the original theorem of Sigal (1973) concerning matrix-valued functions. See
also Thompson (1983 and 1985).
Chapter 6. The main results of this chapter were first obtained in a
different form in the theory of linear systems [see, e.g., monographs by
Wonham (1974) and Kailath (1980)]. In this chapter the presentation is
independent of linear systems theory and is given in a pure linear algebraic
form. This approach led us to change the terminology, which is well
established in the theory of linear systems, and to make it more suitable for
linear algebra.
The ideas of block similarity in Sections 6.2 and 6.6, as well as of
[A B]-invariant and -invariant subspaces, are taken from Gohberg,
292 Notes to Part 1

Kaashoek, and van Schagen (1980). That paper contains a more general
theory of invariant subspaces, similarity, canonical forms, and invariants of
blocks of matrices in terms of these blocks only. Some applications of these
results may be found in Gohberg, Kaashoek, and van Schagen (1981,1982).
Theorem 6.2.5 was proved (by a direct approach, without using the Kronec-
ker canonical form) in Brunovsky (1970). The connection between the
Kronecker form for linear polynomials and the state feedback problems is
given in Kalman (1971) and Rosenbrock (1970). In Theorem 6.3.2 the
equivalence of (a) and (d) is due to Hautus (1969).
The spectral assignment problem is classical, by now, and can be found in
many books [see, e.g., Kailath (1980) and Wonham (1974)]. There is a more
difficult version of this problem in which the eigenvalues and their partial
multiplicities are preassigned. This problem is not generally solvable. For
further analysis, see Rosenbrock and Hayton (1978) and Djaferis and Mitter
(1983).
Chapter 7. The concept of minimal realization is a well-known and
important tool in linear system theory [see, e.g., Wonham (1979) and
Kalman (1963)]. See also Bart, Gohberg, and Kaashoek (1979), where the
exposition matches the purposes of this chapter. Section 7.1 contains the
standard material on realization theory, and Lemma 7.1.1 is a particular
case of Theorem 2.2 in Bart, Gohberg, and Kaashoek (1979).
Section 7.2 follows the authors' paper (1983a). Sections 7.3-7.5 are based
on Chapters 1 and 4 in Bart, Gohberg, and Kaashoek (1979). Here, we
concentrate more on decompositions into three or more factors.
Linear fractional decompositions of rational matrix functions play an
important role in network theory; see Helton and Ball (1982). Theorem
7.7.1 is proved in that paper. The exposition in Sections 7.6-7.8 follows that
given in Gohberg and Rubinstein (1985).
Chapter 8. In the last 20 years linear system theory has developed into a
major field of research with very important applications. The literature in
this field is rich and includes monographs, textbooks, and specialized
journals. We mention only the following books where the reader can find
further references and historical remarks: Kalman, Falb, and Arbib (1969),
Wonham (1974), Kailath (1980), Rosenbrock (1970), and Brockett (1970).
This chapter can be viewed as an introduction to some basic concepts of
linear systems theory.
The first three sections contain standard material (except for Theorem
8.3.2). In the last two sections we follow the exposition of Wonham (1979).
Part Two

Algebraic
Properties of
Invariant Subspaces

In Chapters 9-12 we develop material that supplements the theory of Part 1.


In particular, we go more deeply into the algebraic structure of invariant
subspaces. We include a description of the set of all invariant subspaces for a
given transformation and examine to what extent a transformation is defined
by its lattice of invariant subspaces. Special attention is paid to invariant
subspaces of commuting transformations and of algebras of transforma-
tions . In the final chapter the theory of the first two parts (developed for complex
linear transformations) is reviewed in the context of real linear transformations.

293
This page intentionally left blank
Chapter Nine

Commuting Matrices
and Hyperinvariant
Subspaces

In this chapter we study lattices of invariant subspaces that are common to


different commuting transformations. The description of all transformations
that commute with a given transformation is a necessary part of the
investigation of this problem. This description is used later in the chapter to
study the hyperinvariant subspaces for a transformation A, that is, those
subspaces that are invariant for any transformation commuting with A.

9.1 COMMUTING MATRICES

Matrices A and B (both of the same size n x n) are said to commute if


AB = BA. In this section we describe the set of all matrices which commute
with a given matrix A. In other words, we wish to find all the solutions of
the eauation

where X is an n x n matrix to be found.


We can restrict ourselves to the case that A is in the Jordan form. Indeed,
let / = S ~ 1AS be a Jordan matrix for some nonsingular matrix 5. Then X is a
solution of equation (9.1.1) if and only if Z = 5 ~1XS is a solution of

So we shall assume that A - J is in the Jordan form. Write

295
296 Commuting Matrices and Hyperinvariant Subspaces

where Ja(a = 1,. . . , u) is a Jordan block of size ma x ma, Ja = \ala + Ha,


where la is the unit matrix of size ma x ma, and Ha is the ma x ma nilpotent
Jordan block:

Let Z be a matrix that satisfies (9.1.2) and write

where Za/3 is a ma x m^ matrix. Rewrite equality (9.1.2) in the form

Two cases can occur:

(a) Aa T^ Ap. We show that in this case Za/3 = 0. Indeed, multiply the
left-hand side of equality (9.1.3) by Aa — A^ and in each term in
the right-hand side replace We
obtain

Repeating this process, we obtain for every

Choose p large enough that either Hqa = 0 or Hp^ q = 0 for every


q = 0, . . . , / ? . Then the right-hand side of equation (9.1.4) is zero,
and since \a ^ A^, we find that Zaft = 0.
(b) Att = A0.Then

From the structure of Ha and Hft it follows that the product HaZaf}
is obtained from Zafi by shifting all the rows one place upward and
filling the last row with zeros; similarly, ZaftHft is obtained from Zafi
Commuting Matrices 297

by shifting all the columns one place to the right and filling the first
column with zeros. So equation (9.1.5) gives (where £ik is the
(j, /c)th entry in Za/3, which depends, of course, on a and |8):

where by definition £.0 = £ m + ltk — 0. These equalities mean that the


matrix Z ~ has one of the following structures:

where Qpq stands for the zero p x q matrix. Matrices of types


(9.1.6)-(9.1.8) are referred to as upper triangular Toeplitz matrices.
So we have proved the following result.

Theorem 9.1.1
Let / = diag[/,, • • . , Ju] be an n x n Jordan matrix with Jordan blocks
7,, . . . , Ju and eigenvalues A n . . . , A M , respectively. Then an n x n matrix Z
commutes with J if and only if Za/3 = 0 for \a ^ A^ and Za/3 is an upper
triangular Toeplitz matrix for A0 = A^, where Z = [Z ap ]^^ = 1 is the partition
of Z consistent with the partition of J into Jordan blocks.

We repeat that Theorem 9.1.1 gives, after applying a suitable similarity


transformation, a description of all matrices commuting with a fixed matrix
A. This theorem has a number of important corollaries.

Corollary 9.1.2

Let A be an n8n matrix partitionaed as follows:


298 Commuting Matrices and Hyperinvariant Subspaces

where the spectra of the matrices A l and A2 do not intersect. Then any n x n
matrix X that commutes with A has the form

with the same partition as in equality (9.1.9).

Proof. Let /, (resp. 72) be the Jordan form of A} (resp. A2), so


/. = S^1 AiSi for some nonsingular matrices Sl and S2. Then

is the Jordan form of A. By Theorem 9.1.1, and since ^(J,) fl (r(J2) — 0, any
matrix Y that commutes with J has the form 1
•*•
with the same.
partition as in (9.1.9). Now Y commutes with J if and only if X= SYS
commutes with A, where So

has the desired structure.

This corollary, reformulated in terms of transformations, runs as follows:


let A: <p"—»<p" be a transformation, and let Ml and M2 be A-invariant
subspaces that are complementary to each other and for which the restric-
tions A\M and A\M have no common eigenvalues. Then M\ and M2 are
invariant subspaces for every transformation that commutes with A. To
prove this, write A in the 2 x 2 block matrix form with respect to the direct
sum decomposition and use Corollary 9.1.2. The next result
is a special case.

Corollary 9.1.3
Every root subspace for a transformation A: <p"-» <p" is a reducing invariant
subspace for any transformation that commutes with A.

The proof of Theorem 9.1.1 allows us to study the set ^(A) of all
matrices (or transformations) that commute with the matrix (or linear
transformation) A. First, observe that ^(A) is a linear vector space. Indeed,
if
if AX for and then also doe
any complex numbers a and j8.
To compute the dimension of ^(A), consider the elementary divisors of
A. Thus, for every Jordan block or size k x k and eigenvalue A0 in the
Jordan normal form of A we have an elementary divisor (A 0 - A 0 ) of A
Commuting Matrices 299

(which is a polynomial in A). The greatest common divisor of two elemen-


tary divisors ( A - A j ) * 1 and ( A - A 2 ) * 2 of A is (A - \l)mtn(kl'*2) if A, = A 2
and is 1 if \l^\2. Taking this observation into account, Theorem 9.1.1
shows that the dimension of ^(A) is Ef , = 1 asl, where a^, is the degree of
the greatest common divisor of (A — A5)*J and (A-A,)*', and
are all the elementary divisors of A. In particular

where n is the size of A.


We have seen that, quite obviously, any polynomial in A commutes with
A, and we now ask about conditions on A such that, conversely, each matrix
commuting with A is a polynomial in A.
To this end we need the following notion. An n x n matrix (or transfor-
mation A: <p" —> <(7") is called nonderogatory if there is only one Jordan block
in the Jordan form of A associated with each eigenvalue. It turns out that A
is nonderogatory if and only if any one of the following four equivalent
statements holds: (a) dim Ker( A/ - A) < 1 for every A E <p; (b) A is similar
to a matrix

for some complex numbers « 0 ,. . . , «„_,; (c) the minimal polynomial of A


coincides with the characteristic polynomial of A; and (d) A is cyclic, that is,
there exists an x E (p" such that

Indeed, by assuming that A is in the Jordan form, condition (a) is clearly


equivalent to A having only one Jordan block for each eigenvalue. By
Theorem 2.6.1, (d) is equivalent to A being nonderogatory. Further, the
ap
minimal polynomial for A is easily seen to be p) ,
where A j , . . . , A p are all the distinct eigenvalues of A and ay is the maximal
size of the Jordan blocks of A corresponding to A. From this description it is
clear that (c) is equivalent to (a). We have proved, therefore, that (a), (c),
and (d) are equivalent to each other and to the condition that A is
nonderogatory.
Let A be the matrix (9.1.11). We want to prove that (a) holds. Let
300 Commuting Matrices and Hyperinvariant Subspaces

and be eigenvectors of A corresponding


to the eigenvalue A 0 . Thus Ax = A0Jt, Ay = \0y, and x7^0, y^O. The
structure of A implies that jc, = Aj,"1.*,, yt - ^lyl for / = ! , . . . , n. But
then necessarily xl ^ 0, _y, T^ 0, and jc = (yi/*i).y, that is, jc and y are linearly
dependent. Hence (a) holds.
Finally, we show that (d) implies (b). First observe that if (9.1.12) holds,
then the vectors x, Ax,. . . , A"~lx are linearly independent (otherwise <p"
would be spanned by less than n vectors, which is impossible). In the basis
jc, Ax,. . . , A"~lx the matrix A has the form (9.1.11).

Theorem 9.1.4
Every matrix commuting with A is a polynomial in A if and only if A is
nonderogatory.

Proof. First recall that in view of the Cayley-Hamilton theorem the


number of linearly independent powers of A does not exceed n. Thus, if
AX - XA implies that ^ is a polynomial of A, then X can be rewritten as
X = p(A), where / = d e g p ( A ) = £ n and all powers /, A, A2,. . . , Al~l are
linearly independent. So in this case dim ^(A) = / < n. Inequality (9.1.10)
then implies that dim Vo(A) = n. This means [again in view of (9.1.10)] that
ast = 0 for s ¥ ^ t . So in the Jordan form of A there is only one Jordan block
associated with each eigenvalue of A.
Conversely, assume that the Jordan form of A is

where A 1 ? . . . , \s are different complex numbers. As we have seen, the


solution X of AX = XA is then similar to a direct sum of upper triangular
Toeplitz matrices

More exactly, where 5 is a nonsingular matrix such


that / = S~1AS. Now a polynomial p( A) satisfying the conditions

for / = 1,. . . , s gives the desired result:


Common Invariant Subspaces for Commuting Matrices 301

Note that p( A) can be chosen with degree not exceeding

We now confine our attention to matrices commuting with a diagonable


matrix. Recall that an n x n matrix A is diagonable if and only if there is a
basis in <p" of eigenvectors of A. The following corollary is obtained from
Theorem 9.1.1.

Corollary 9.1.5
//A,, . . . , As are the distinct eigenvalues of a diagonable matrix A, then

where

For future reference let us also indicate the following fact.

Proposition 9.1.6
An n x n matrix B commutes with every n x n matrix A if and only if B is a
scalar multiple of /: B = A/ for some A E (f1.

Proof. The part "if" is obvious. So assume that B commutes with every
n x n matrix A—in particular, taking A to be diagonal with n different
eigenvalues with respect to a basis * , , . . . , xn in <p", Corollary 9.1.2 implies
that B is also diagonal in this basis. Therefore, Bxl ESpan{x,}. As any
nonzero vector xx appears in some basis in <p", we find that Bx = Ax for
every * E <p" -> {0}, where the number can depend on x: A = A(z). How
ever, if Bx=A(x)x, By = \(y)y with A(x)*\(y), then B(x + y)0
Span{x + y], a contradiction. Hence A is independent of x and the propos
ition is proved.

9.2 COMMON INVARIANT SUBSPACES FOR COMMUTING MATRIC

In this section we establish a fundamental property of a set of commuting


transformations, namely, that there is always a complete chain of subspaces
that are invariant for every transformation of the set.
302 Commuting Matrices and Hyperinvariant Subspaces

Theorem 9.2.1
Let ft be a set of commuting transformations from <p" into <p" (so AB - BA
for any A, B E. ft). Then there exists a complete chain of subspaces
dim Mt = j, such that MG, M.^ . . . , Mn are invariant for
every transformation from ft.

Proof. For every nonzero vector x E <t" write

Clearly Z£(x) is a nonzero subspace that is invariant for any A E ft (in short,
ft invariant).
Now let xl E <p" be an eigenvector of some transformation / 4 , E f t
corresponding to an eigenvalue A,; so >!,*, = A,*,. Hence for every
Bl,. . . , Bk £ ft we have

so

Let Jt 2 EJ£(jt,) be an eigenvector of some A2 Eft: A2x2 = \2x2. Then


A2\<f(x } = A 2 /, and Z£(x2) C <£(x}). We continue the construction of nonzero
subsoaces

where -A,|^(;t j = \fl, i = 1, . . . , k for some Al, . . . , Ak E. ft and complex


numbers A,, . . . , \k, until we encounter the situation where 2£(y) — <&(xk)
for every eigenvector y&^(xk) corresponding to any eigenvalue A of any
transformation BE ft. In this case every B E f t has an eigenvalue \B with
the property that B^(x ) — \BI. Let y\ be any nonzero vector from ££(xk).
Then the subspace Ml =Span{_y 1 } is ft invariant.
Let jVj be a direct complement to M} in (p". With respect to the
decomposition M , + ^V, = (f1", we have

for any

The condition AB = BA implies that A . Repeating the abov


procedure, we find a common eigenvector y2 G ^ of all linear transfor-
mations from ft. Put M2 = Span{.y,, y2}, and so on. Eventually we obtain a
complete chain of common ft-invariant subspaces.

In terms of bases, Theorem 9.2.1 can be stated as follows.


Common Invariant Subspaces for Matrices with Rank 1 Commutators 303

Theorem 9.2.2
Let fl be a set of commuting transformations from <p" into (f"1. Then there
exists an orthonormal basis *,,. . . , xn in <p" such that the representation of
any AE.fl in this basis is an upper triangular matrix.

Proof. be a complete chain of


subspaces as in Theorem 9.2.1. Now construct an orthonormal basis
X i , . . . , x in such a way that Span for

If every transformation from the set ft is normal, the upper triangular


matrices of Theorem 9.2.2 are actually diagonal (cf. the proof of Theorem
1.9.4). As a result we obtain the "only if" part of the following result.

Theorem 9.2.3
Let ft be a set of normal transformations <p" —> £". Then AB = BA for any
transformations A, BE.il if and only if there is an orthonormal basis
consisting of eigenvectors that are common to all transformations in ft.

The part "if" of this theorem is clear: if x{,. . . , xn is an orthonormal


basis in <p" formed by common eigenvectors of A and B, where A, B E ft,
then in this basis we have

9.3 COMMON INVARIANT SUBSPACES FOR MATRICES WITH


RANK 1 COMMUTATORS

For n x n matrices A and B, the commutator of A and B is, by definition,


the matrix AB - BA. So the commutator measures the extent to which A
and B fail to commute. We have seen in the preceding section that if A and
B commute, that is, if their commutator is zero, then there exists a complete
chain of common invariant subspaces of A and B. It turns out that this result
is still true if the commutator is small in the sense of rank.

Theorem 9.3.1
Let A and B be n x n matrices with rank(AB - BA) s 1. Then there exists a
complete chain of subspaces:

such that each M is both A invariant and B invariant.


304 Commuting Matrices and Hyperinvariant Subspaces

Proof. We shall assume that rank(AB - BA) = 1. (If AB-BA = Q,


Theorem 9.3.1 is contained in Theorem 9.2.1.) We can also assume that A is
singular. (If necessary, replace A by A - A0/ for a suitable A 0 , and note that
the commutators of A and B and of A - A0/ and B are the same.) We claim
that either Ker A or Im A is B invariant. Indeed, if Ker A is not B
invariant, then there exists a nonzero vector x E <p" such that Ax - 0 and
ABx^O. Thus

span the one-dimensional range of AB - BA. Hence for every y G <p" there
exists a constant ii(y) such that

It follows that

and hence

so Im A is B invariant. We have shown that there is a nontrivial subspace Ji


that is invariant for both A and B.
Write A and B as 2 x 2 block matrices with respect to the decomposition
N + N' = <t", where N' is some direct complement to jV:

Then rank(AlBl - B^^^ 1 and rank(A2B2 - B2A2)< 1. So we can appl


the preceding argument to find a nontrivial common invariant subspace for
Al and Bl (if d i m j V > l ) . Similarly, there exists a nontrivial common
invariant subspace for A2 and B2 (if dim N' > 1). Continuing in this way, we
ultimately obtain the result of the theorem.

Theorem 9.3.1 can also be restated in terms of simultaneous trianguliza-


tions of A and B, just as Theorem 9.2.1 was recast in the form of Theorem
9.2.2. In contrast with Theorem 9.2.1, the result of Theorem 9.3.1 does not
generally hold for sets of more than two matrices.

EXAMPLE 9.3.1. Let

It is easily checked that


Hyperinvariant Subspaces 305

Nevertheless, there is no one-dimensional common invariant subspace for


Al, A2, and A3. Indeed, A3 has exactly two one-dimensional invariant
subspaces, Span{e,} and Span{e2}, and neither of them is invariant for both
Al and A2.

9.4 HYPERINVARIANT SUBSPACES

Let A: <p" —»<p" be a transformation. A subspace M C <p" is called hyperm-


variant for A (or A hyperinvariant) if M is invariant for any transformation
that commutes with A. In particular, an /l-hyperinvariant subspace is A
invariant. Let us study two simple examples.

EXAMPLE 9.4.1. Let A = A/, A E (p. Obviously, any transformation from <p"
to <p" commutes with /I, so the only subspaces which are invariant for every
linear transformation that commutes with A are the trivial ones: {0} and <p".
Hence A has only two hyperinvariant subspaces: {0} and <p".

EXAMPLE 9.4.2. Assume that A: (p"—»(p" has n distinct eigenvalues


A,, . . . , \n with corresponding eigenvectors jc n . . . , xn. Then A has exactly
2" invariant subspaces Spanjjc, | / G K ] , where K is any subset in { 1 , . . . , « }
(see Example 1.1.3). By Theorem 9.1.4, the only transformations that
commute with A are the polynomials in A. Since every ^-invariant subspace
is invariant also for any polynomial of A, we find that every ^-invariant
subspace is A hyperinvariant.

More generally, let A be a nonderogatory transformation. Then Theorem


9.1.4 shows that every /1-invariant subspace is also A hyperinvariant. This
property is characteristic for nonderogatory transformations.

Theorem 9.4.1
For a transformation A: <p" —* <p" every A-invariant subspace is A hyperin-
variant if and only if A is nonderogatory.

Proof. We have seen already that the part "if" is true. To prove the
"only if" part, assume that A is not nonderogatory. We prove that there
exists an ^-invariant subspace that is not A hyperinvariant. By assumption,
dim Ker(y4 - A 0 /) > 2 for some eigenvalue A0 of A. Without loss of generali-
ty we can assume that A is a Jordan matrix

where m>2 and the first m Jordan blocks correspond to the eigenvalue A 0 ,
306 Commuting Matrices and Hyperinvariant Subspaces

they are arranged so that k} < k2 and Am + 1 , . . . , \p are different from A0.
Obviously, Span{e,} is an .A-invariant subspace. It turns out that this
subspace is not A hyperinvariant. Indeed, by Theorem 9.1.1 the matrix 5
with 1 in the entries (/c, + 1 , 1 ) , . . . , (2kt, A:,) and zero elsewhere, com-
mutes with A. On the other hand, Sel = ek+l, so Spanf^} is not S
invariant. D

It is easily seen that all the A -hyperinvariant subspaces form a lattice,


that is, the intersection and sum of v4-hyperinvariant subspaces are again A
hyperinvariant. Denote this lattice by Hinv(yl). Now we can state the main
result concerning the structure of Hinv(v4).

Theorem 9.4.2
The lattice of all A-hyperinvariant subspaces coincides with the smallest lattice
tfA of subspaces in (p" that contains

Actually, $fA coincides with the smallest lattice of subspaces in <£" that
contains

where is the minimal polynimial of A.Indeed.


for and for
and
The proof of Theorem 9.4.2 is given in the next section.
The following example shows that, in general, not every /1-hyperinvariant
subspace is the image or the kernel of a polynomial in A.

EXAMPLE 9.4.3. Let A be the 0 X 6 matrix

According to Theorem 9.4.2, the subspace


Im A2 is A hyperinvariant. On the other hand, there is no polynomial p( A)
such that 56 = Ker p(A) or !£- Im p(A). Indeed, for any polynomial p(A)
the matrix p(A) has the form (see Section 9.2.10):
Proof of Theorem 9.4.2 307

for some complex numbers /?,, p 2 , p3, p4. So Ker p(A) can be only one of
the following subspaces: {0} (if pl ^0); Span{e,,e 5 } (if pl = 0, p 2 ^0);
Span{e,,e 2 ,e 5 ,c 6 } (if^, =/? 2 = 0,/? 3 5^0); Span{e l5 e 2 , «?3, e 5 , <?6} (if p, =
P2 — PJ — 0, p3 ^ 0); (p (if p- = 0, / = 1, 2, 3,4). The subspace Im p(v4) can
be one of the following: (p6; Span{<?j, e2, e3, e5}; Span{e!,e 2 }; Span{e,};
{0}. None of these subspaces coincides with !£. D

9.5 PROOF OF THEOREM 9.4.2

The proof of Theorem 9.4.2 requires some preparation. We first prove


several auxiliary results that are useful in their own right.

Proposition 9.5.1
For any A G <p the subspaces

are A hyper invariant.

Proof. Fix A G (J7 and a positive integer k, and let x be any vector from
Ker(j4«- A/)*. If B commutes with A, we have

So Bx G Ker(,4 — A/) , and the subspace Ker(^4 - A/) is A hyperinvariant.


Similarly, let y G lm(A - A/)* and BA = AB. Then for any z G <p" such that
(/4 - A/) z = y, we obtain

So #y G Im(v4 — A/)*; therefore, lm(A - A/)* is A hyperinvariant.

We proceed now with the identification of Hinv(A), assuming that A


has only one eigenvalue. Given positive integers pl>--->pm, let
A(/?,,. . . , pm) be the set of all m-tuples of integers ( < 7 j , . . . , qm) such that
F°r evei7 two
308 Commuting Matrices and Hyperinvariant Subspaces

sequences and from

then min(<7', q") belong to A(/7 1 5 . . . , pm).


Let B: <p" —» <p" be a transformation with a single eigenvalue A 0 , and let

be a Jordan basis in (p" for 5, where p^>p2>--->pm. So in this basis 5


has the form

Let

Lemma 9.5.2
For every the subspace

is B hyperinvariant. Conversely, every B-hyperinvariant subspace !£ has the


form <p(<7 1? . . . , qm) for some (q^ . . . , <? m )e A(p,, . . . , p m ). Moreover

for every

Proof. Let ^ be a nonzero fi-hyperinvariant subspace, and let x E


an arbitrary nonzero vector. Write * as a linear combination of the basis
vectors:

Assume that for some j the vector


Proof of Theorem 9.4.2 309

is nonzero, and let q be the maximal index / (1 < i^pj) such that
We show that the subspace J{'q is in
Let Pj be the projector on 3{'p defined by P,/l' ) = = 0 for iVy and
P,./^ =/l y) (a = 1, . . . , p,). Obviously, PfB = BPr Therefore, the sub-
space J^is Pj invariant. Hence y = PjX G 56. For every k — 1, 2, . . . the linear
transformation (5 - A0/)* commutes with B and hence

Then the vectors

also belong to «£ Thus 3C'q C 5£.


Furthermore, we show that if 3C'gC<e (y'^2), then also %*'*£&.
Indeed, let X: £"—* $" be the linear transformation given in the basis
(9.5.1) by the matrix

where X^v is a pv x p^ matrix, and X^ v = 0 for all /i, i/ except for


which is given as follows:

Theorem 9.1.1 shows that X commutes with B. Consequently, !£ is X


invariant and the vectors

belong to £.
We have proved that j£ has the form (9.5.3) with ql> — ->qm. Let us
verify that pl - q} > • • • >p m - qm. Fix i0<jQ and let C: $"-* <p" be
defined in the block matrix from C=[C (; ]™ y=1 with respect to the basis
(9.5.1) where C,7 is the zero p, x p. matrix if i^jQ or y T^ i0 and C;o/o is the
Pi * pf matrix [0 /]. By Theorem 9.1.1, C commutes with A, so £ is C
invariant. If then obviously
Otherwise

which implies , again.


310 Commuting Matrices and Hyperinvariant Subspaces

It remains to show that every subspace

with ( < ? , , . . . , <7 m )G A(p 1 5 . . . , pm) is B hyperinvariant. Let


We must prove that J^is C invariant. With respect to the basis (9.5.1), write
C as the block matrix C = [C /y ]™ y==l , where C,; is a p{ x p. matrix of one of
the following types (see Theorem 9.1.1):

[in the notation of (9.1.6)-(9.1.8)]. From the structure of C it is easily seen


that £ is C invariant if and only if the ^th column in every C,; has all entries
zero in the places qi + 1,. . . , pt;. In case / > j the first nonzero entry in
the <7yth column of C,7 can be in the [pt•.- (PJ - <? y )]th place; but pi..—
(p/-^-)^^-because ( < ? , , . . . , ^ m ) e A ( p , , . . . , pm). In case i<j the first
nonzero entry in the ^;th column of Ctj can be in the g;th place; but qf < qt,
so we are done in this case also. Finally, in case / = /' obviously the ^th
column of C,y has zeros in places qi; + 1, . . . , / ? , . We have verified that !£ is
indeed C invariant.
Finally, equalities (9.5.4) and (9.5.5) are clear from the definitions of
min(<y', q") and max(g', q").

Now we begin the proof of Theorem 9.4.2 itself. In view of Proposition


9.5.1, every element in the lattice 5^, the smallest lattice containing the
subspaces (9.4.1), is A hyperinvariant. Now let ££ be an A -hyperinvariant
subspace. Then Z£ is, in particular, A invariant; therefore

where A , , . . . , \m are all the distinct eigenvalues of A. Now


is also an ^4-hyperinvariant subspace. [Recall that the
integers ri are defined by the minimal polynomial (A - A,) r ' • • • (A - Am)r"' of
A.] Thus, to show that 3?E.yA, we can assume that A has only one
eigenvalue A 0 . Letting /?, > • • • > / ? , be the partial multiplicities of /4, in view
of Lemma 9.5.2 it will suffice to verify that

where (<?,, . . . , < 7 / ) E A ( p j , . . . , p,) and 3^ are defined as in equation


(9.5.2) [with respect to a Jordan basis f(-} of A]. Actually
Further Properties of Hyperinvariant Subspaces 311

where N=A-\0I. Indeed, as %' CKer A^'fllm N"1'*', / = ! , . . . , / , the


inclusion C in (9.5.7) is obvious. For the opposite inclusion, let
r G K e r A^'nimyV'^' so x = Np'~q'y for some y with N"'y=Q.
Write y = y1 + y2 + "- + y,, where ^GSpanl/J'*,. . . , f(pn}. Then x =
E,'=1 Af'^.and

We want to show that AfPl ^'y ; G 3^. or, equivalently

But since ( < ? , , . . . , ^ ) e A ( p , , . . . , p,), we have q,+ p, - qi >min(p,, p y ),


! < / < / , and (9.5.9) follows from (9.5.8). Theorem 9.4.2 is proved.

9.6 FURTHER PROPERTIES OF HYPERINVARIANT SUBSPACES

We present here some properties of the lattice Hinv(A) of all /i-hyperin-


variant subspaces.

Theorem 9.6.1
For any transformation A: <p"—»(p" the lattice Hmv(A) is distributive and
self-dual and contains exactly

elements, where ///' > • • • > p^' are the partial multiplicities of A correspond-
ing to the ith eigenvalue, i=\, . . . ,k, and k is the number of different
eigenvalues of A (in particular, Hinv(A) is finite).

Let us explain the terms that appear in this theorem. By definition, a


lattice A of subspaces in (p" is called distributive if

for every M, ^V,, JV2 G A. The lattice A is said to be self-dual if there exists a
bijective map $: A-» A such that «/<^ + -V) = «A(^) n «//(JV), ty(M n JV) =
«//(J^) + (/'(•'V) for every M, N E: A. [In other words, A is isomorphic (as a
lattice) to the dual lattice of A.]

Proof. Note that every /4-hyperinvariant subspace J£ admits the repre-


sentation
312 Commuting Matrices and Hyperinvariant Subspaces

where A , , . . . , A^ are all the distinct eigenvalues of A. As

and

for any .A-hyperinvariant subspaces =$?, and j£2, we assume (without loss of
generality) that A has only a single eigenvalue
To show that the lattice of A-hyperinvariant subspaces is distributive, first
observe the following equality for any real numbers r, s, t:

This equality can be easily verified by assuming (without loss of generality)


that r < 5 , and then by considering three cases separately: (1) / < r <s; (2)
r < t ^ s; (3) r =£ s < t. Now let M^, M2, M3 be y4-hyperinvariant subspaces.
According to Lemma 9.5.2, write

in the notation of Lemma 9.5.2, where


A(PJ, . . . , pm), i- 1,2, 3, and / ? , > • • • >/? m are the partial multiplicities of
A. Using (9.5.4) and (9.5.5), we have

and

Using (9.6.2), we obtain equality between (9.6.3) and (9.6.4).


To prove the self-duality of Hinv(>l), observe that, in view of Lemma
9.5.2, the map $: Hinv(j4)—»Hinv(yl) defined by

where (q}, . . . , qm) G A(p 1? . . . , pm) satisfies the definition of a self-dual


lattice. For instance:
Exercises 313

It remains to verify the Hinv(,4) has exactly

elements. Instead of Hinv(>l), we count elements in A( / ? , , . . . , p m ). Using


induction on m [formula (9.6.5) obviously holds for m = 1], assume that
A
(P 2 > • • • . Pm) has exactly [H^1 (pf-pj+l + 1)] (/?„, + 1) elements. Now
observe that (<72 + 5, #2, . . . , qm) belongs to A( / ? , , . . . , pm) if and only if
(<7 2 > • • • ' ^m) belongs to A(/? 2 , . . . , pm) and 0 < * < p , - p2. This com-
pletes the induction step.

We conclude this section by observing that the number of ,4-hyperin-


variant subspaces for A: <p"—> <p" lies between 2 and 2", and both bounds
can be attained. Indeed, the transformation / has only trivial hyperinvariant
subspaces, whereas a diagonable transformation with n distinct eigenvalues
has 2" hyperinvariant subspaces (see Examples 9.4.1 and 9.4.2). That the
number of A -hyperinvariant subspaces cannot exceed 2" follows from a
general result in lattice theory [see, e.g., Theorem 148 in Donnellan (1968)
using the fact that Hinv(,4) is distributive and each chain in Hinv(v4)
contains not more than n + 1 different subspaces.

9.7 EXERCISES

9.1 Consider the transformation

written as a matrix with respect to the standard basis e^, e2,e3.


(a) Find all transformations that commute with A.
(b) Find all A-hyperinvariant subspaces.
314 Commuting Matrices and Hyperinvariant Subspaces

9.2 Show that if a transformation A: (p"—» <p" has n distinct eigenvalues,


then every transformation commuting with A is diagonable. Con-
versely, if every transformation commuting with A is diagonable, then
A has n distinct eigenvalues.
9.3 Supply a proof for Corollary 9.1.5.
9.4 Show that if AJn(\Q) = Jn(\0)A, then A is diagonable if and only if A
is a scalar multiple of the identity.
9.5 Prove or disprove each of the following statements for any commuting
transformations A: <f" -* <p" and B: <f" -» <p":
(a) There exists an orthonormal basis in which A and B have the
lower triangular form.
(b) There exists a basis in which both A and B have Jordan form.
(c) Both A and B have the same eigenvectors (possibly corres-
ponding to different eigenvalues).
(d) Both A and B have the same invariant subspaces.

9.6 Show that any matrix commuting with

is a circulant.
9.7 Show that any matrix commuting with

where « 0 , «,, . . . , #„_, are given complex numbers, is a polynomial


of A.
9.8 Describe all matrices commuting with
Exercises 315

Are all of these polynomials of Q1 Find all <2-hyperinvariant sub-


spaces.
9.9 Describe all transformations commuting with a transformation
A: <£""—> <JT" of rank 1. Find all ^4-hyperinvariant subspaces.
9.10 Let A: <p" —» (J7" be a transformation. Prove that every A-hyperin-
variant subspace is the image of some transformation which com-
mutes with A. (Hint: Use Lemma 9.5.2.)
9.11 Show that every /1-hyperinvariant subspace is the kernel of some
transformation which commutes with A.
9.12 Prove that for the matrix A from Exercise 9.7 we have Hinv(^4) =
lnv(A).
9.13 Is Hinv(y4) = lnv(A) true for any block companion matrix

where Aj are 2 x 2 matrices?


9.14 Show that for circulant matrices A in general Hinv(v4) ^ ln\(A). Find
necessary and sufficient conditions on the circulant matrix A in order
that Hinv(A) = Inv(^).
9.15 Give an example of a transformation A and of an v4-hyperinvariant
subspace M that does not belong to the smallest lattice of subspaces
containing the images of all polynomials in A.
9.16 Give an example analogous to Exercise 9.15 with "images" replaced
by "kernels."
9.17 Give an example of a transformation A such that Inv(y4) is not
distributive.
Chapter Ten

Description of
Invariant Subspaces and
Linear Transformations
with the Same
Invariant Subspaces

In this chapter we consider two related problems: (a) description of all


invariant subspaces of a given transformation and (b) to what extent a
transformation is determined by its lattice of all invariant subspaces.
We have seen in Chapter 2 that every invariant subspace of a linear
transformation A: (p" -» <p" is a direct sum of irreducible ^-invariant sub-
spaces, that is, such that the restriction of A to each one of these subspaces
has only one Jordan block in its Jordan form. Thus, to solve the first
problem mentioned above it will be sufficient to describe all irreducible
yi-invariant subspaces. This is done in Section 10.1.
The second objective of this chapter is a characterization of transfor-
mations having exactly the same set of invariant subspaces. It turns out that,
in general, not all such transformations are polynomials of each other. Our
characterization (given in Section 10.2) will depend on the description of
irreducible invariant subspaces given in Section 10.1.

10.1 DESCRIPTION OF IRREDUCIBLE SUBSPACES

In the description of invariant subspaces upper triangular Toeplitz matrices,


and matrices that resemble upper triangular Toeplitz matrices, play an
important role, as we see later. We recall first some simple facts about
Toeplitz matrices.

316
Description of Irreducible Subspaces 317

A matrix A of size / x j is called Toeplitz if its entries have the following


structure

where . Denote by T- the class of all


upper triangular Toeplitz matrices of size j x y, that is, such that «, = • • • =
a ; _, = 0 in equation (10.1.1).

Proposition 10.1.1
The class T, is an algebra, that is, it is closed under the operations of addition,
multiplication by scalars, and matrix multiplication. Moreover, if A E T- and
det.4^0, then A'1 £ 7}.

Proof. All but the last assertions of Proposition 10.1.1 are immediate
consequence of the definition of Tf. To prove the last assertion, suppose that

Ome dedices seaso;u thjat Further

and in general

(It is assumed that bki = 0 whenever k =sO.) Equations (10.1.2) define


recursively:

Using (10.1.3), we can prove by induction on k (starting with k = 0) that


bi_k • does not depend on /. But this means exactly that the matrix [b^]^ k =l
is Toeplitz.
318 Description of Invariant Subspaces and Linear Transformations

Let A: <p" —»<p" be a transformation. It is clear that each .A-invariant


subspace M can be represented as a direct sum of nonzero A-invariant
subspaces M } , . . . , Mk, each of which is irreducible, that is, not represent-
able as a direct sum of smaller invariant subspaces (indeed, let / be the
maximal number of factors in a decomposition

into a direct sum of nonzero ^-invariant subspaces Mf; then from the choice
of / it follows that each Mi in equality (10.1.4) is irreducible). To describe
the ,4-invariant subspaces, therefore, it is sufficient to describe all the
irreducible subspaces.
It follows from Theorem 2.5.1 that an ^-invariant subspace SB is ir-
reducible if and only if the Jordan form of A ^ consists of one Jordan block
only. In other words, !£ is irreducible if and only if there exists a basis
J t j , . . . , x in Z£ and a complex number \ such that

that is, the system {Af,}f =1 is a Jordan basis in Z£. Consequently, every
irreducible subspace is contained in some root subspace. (One can see this
also from Theorem 2.1.5.) Thus it is sufficient to describe all the irreducible
subspaces contained in a fixed root subspace corresponding to the eigen-
value A. Without loss of generality, we assume that A = 0. (Otherwise,
replace A by B = A- A/ and observe that both transformations A and B
have the same invariant subspaces.)
The root subspace ^(A) is decomposed into a direct sum of Jordan
subspaces:

The description of the Jordan subspaces contained in &10(A) is given


according to the number m of irreducible subspaces in the decomposition

If $10(A) is an irreducible subspace [i.e., m = 1 in (10.1.6)] and the


vectors (*,}f=1 form a Jordan basis in £%0(A), then Span{jCj,. . . , *y},
j — 1, . . . , p are all the .^-invariant subspaces in &t0(A), and all of them are
irreducible subspaces.
Consider now the case when m = 2 in (10.1.6). We use the following
notation: if {2,}f=1 is a system of vectors z, E (p", denote by z ( y ) the column
formed by vectors, as follows:
Description of Irreducible Subspaces 319

Let g,, . . . , gp E .Sfj and / , , . . . , jq G J£2 "e Jordan bases in J£t and Ja?2,
respectively. Without loss of generality, suppose that p > q. It is known that
in any irreducible subspace ^(^0) of A there exists only one eigenvector
(up to multiplication by a nonzero scalar). We describe first all the irreduci-
ble subspaces that contain the eigenvector g, [and thus are contained in
^(A)].
In the following proposition / is a fixed integer, 1 < / < p .

Proposition 10.1.2
Let T(v\ where v = min(y, q), be an upper triangular matrix of size j x /',
whose diagonal elements are zeros and the block formed by the first v rows
and first v columns is a Toeplitz matrix :

Then the components of the column

form a Jordan basis of some j-dimensional A-invariant irreducible subspace


that contains g p Conversely, every irreducible subspace of dimension j of A
that contains g, has a Jordan basis given by the components of (10.1.8),
where T(v) is some matrix of type (10.1.7).
The multiplication in T ( v ) f ( i ) is performed componentwise: for complex
numbers xrs and n-dimensional vectors z , , . . . , z; we define
320 Description of Invariant Subspaces and Linear Transformations

Note also that the dimension of every irreducible subspace of A contained


in 2ft.Q(A) does not exceed p [recall that m = 2 in (10.1.6) and that dim ^ =
p > dim 5£2 > 1]; so Proposition 10.1.2 does indeed give the description of all
irreducible subspaces that contain gj.

Proof. First observe that if SB is an irreducible subspace and g, G Z£y


then

Indeed, if y E 3? n «S?2 ^ {0}, then for some i ( 0 < / < / ? - ! ) and some
complex number y ^ 0 the equality Ay = y/, holds. So /t E ^ O ^ C ^,
and since also g t e <£, the irreducible subspace =$? contains two linearly
independent eigenvectors/! and g l y which is impossible. From (10.1.9) and
the inclusion % + &2 C ^(A) = &{ + %2 it follows that dim £ < dim <£", =
p. Now let £ be an irreducible subspace containing gl with a Jordan basis
y l t . . . , y y ; so y} = a0g, and Ay, + , = ^.(/ = 1, . . . , y - 1). We look for the
vectors y2,. . . , yf in the form of linear combinations of gl,. . . , g ,
/ " , , . . . , fq. Two possibilities can occur: (1) y < ^; (2) q + 1 < y < /?. Consider
first the case when y'< <7. Condition Ay2 = y^ implies that

Condition lies that


Continuing these arguments, we obtain

where a t , . . . , a y _,, /8,,. . . , p ; -_j are some numbers. In case


one finds analogously

where are some complex numbers


Description of Irreducible Subspaces 321

Formulas (10.1.10) and (10.1.11) can be written in the form

where C and 5 (t)) are j x / matrices, and C is an upper triangular invertible


Toeplitz matrix (invertible because its diagonal element is a 0 ^0). By
Proposition 10.1.1, C~l is also an upper triangular Toeplitz matrix. It is easy
to see that the matrix C~}S(v) has the form T(v} [see (10.1.7)]: T(v)
C~}S(V\ Put z ( ] ) = C ~ l y ( > ) = g(i)+T(v)f(j\ It is easy to see that
Span{yl}( =Span{zi}{ and the vectors z , , . . . , z y satisfy (10.1.5). So
the components of z ( / ) form a Jordan basis in £, D

Now let xl (^agj) be an arbitrary eigenvector of A contained in


&Q(A) = ^ + <$£>. Evidently, xl = ^gl + TJ/, (£ ^0). Consider the system of
vectors *,• = £& + TJ/J-, i - 1, • • • , q. Clearly, the vectors xl, . . . , xq satisfy
the condition (10.1.5); therefore, they form a Jordan basis of some ir-
reducible subspace &C$10(A). It can easily be verified that ^+3? =
2ft0(A). Hence dim&=q. By Proposition 10.1.2, for every irreducible
subspace j£ containing the vector jc, (the dimension j of !£ is necessarily not
larger than q) there exists a matrix T(i) of the form (10.1.7) such that the
components of the column v(i) = x ( j ) + TU)g(l) form a Jordan basis in j£
Conversely, for every matrix T(i) of size / x j the components of the column
v(n form a Jordan basis in some irreducible subspace of A. Thus a complete
description of the irreducible subspaces contained in the root subspaces
&o(A) = «£, i <#2, is obtained.
This description for the case when m = 2 in the decomposition (10.1.6)
can be generalized for an arbitrary m. This is the content of the following
theorem.

Theorem 10.1.3
Let

be a decomposition of the root subspace £%A (A) of the transformation


A: <p"—> <p" into a direct sum of irreducible subspaces =$?,,. . . , 2£m. Let
g,,. . . t g p i e ^ . . . ;/„. .. , / P r e ^ r ; . . . ; * , , . . . , / i ^ e ^ 6e /orto
bases /n ^,. . . , ^.,. . . , £m, respectively ( / ? , > • • • ^pm). Let j be an
integer such that \<j<pr — dim !£r. For every i — 1,. . . , m let v( —
min(/, p ( ). Then for every set of matrices T\"l\ . . . , T(^m) of the form
(10.1.7) and of size j x j the components of the column
322 Description of Invariant Subspaces and Linear Transformations

form a Jordan basis in some irreducible subspace of A that contains the vector
/! {here ul, . . . , up E =5^._, and vlt . . . , vp + E 2£r+l are Jordan bases in
cS?r_i and £r+l, respectively). Conversely, for every irreducible subspace t£ of
dimension j such that f^ E SB there exist matrices T("l\ . . . , T("m) such that the
components of the column (10.1.12) form a Jordan basis in t£.

Proof. Use induction on the number m of subspaces in the decompo-


sition (10.1.6). For m = 2 this theorem coincides with Proposition 10.1.2.
Suppose that the theorem holds for m ^ k - 1 , and assume that £%A (A) =
j£, + ---- *-«#*• If % is an irreducible subspace such that /, E SB, then

where Indeed, for every y E SB ^ {0}


there exist a nonnegative integer / and a complex number y ^ 0 such that
A'y = y/j. If, in addition, y E SBT, then y/j = A'y E «2?r, which contradicts the
direct decomposition From (10.1.13) and from the
inclusion S we deduce that dim S£ < dim S£r. As-
sume that /•</:. (The case r = k can be considered in a similar way.) If
then by the induction hypothesis the components of a
column of the form (10.1.12) form a Jordan basis in S£. HSefe^ + ••• +
%k_i, consider the subspace £' = (<€ + Sek) n (^, + • • • + ^.J. Since ^ n
«S?fc = {0}, the equality dim.2" = dim(«Sf -i- 5^) + dim(«2>, + • • • -f ^ _ , ) -
dim(^ 4- • • • 4- S£k) = dim <S? holds. Evidently, S£' is ,4 invariant. Let us
show that S£' is an irreducible subspace. Suppose the contrary; then there
exists an eigenvector gE J£" of A that is not a scalar multiple of / t . Since
SB' C j£j + <3?£, the vector g is a linear combination of the eigenvectors/! and
hly where /i, E <£k. But then /ij E SB' C «#, 4- • • • + SEk_^ which means that
the sum (Ja?, -i- • • • 4- S£k_^) + !£k is not direct, and this is a contradiction with
our assumptions. So S£' is an irreducible subspace. Since S£' C SB, + • • • 4-
SBk_\, by the assumption of induction the components of the column z(i)
form a Jordan basis in SB' for some T\"l), . . . , T(k"^l}. The property that
£'C£ + J£k implies the inclusion SBCSB' + S£k. As it has been proved
above, there exists a matrix T ( k ) such that the components of the column
rm a Jordan
basis in J£.

Theorem 10.1.3 also gives a description of all irreducible subspaces of A


that contain an arbitrarily given eigenvector of A from the root subspace

Indeed, let AC, E £%A (/I) be an eigenvector, and let r be the minimal
integer such that xl E SB\ + • • - + S6r. Then xl = algl + • • • + a r f x , where
a, , . . . , ar E <p and a r T^ 0. Consider the system of vectors xt = a}g, + • • • +
arff, i = 1, . . . , pr. Evidently, jc,, . . . , xp satisfy the condition (10.1.5).
Transformations Having the Same Set of Invariant Subspaces 323

Therefore, their linear span is an irreducible sub-


space. It is easily seen that

So in the representation (10.1.6) one can replace !£r by <Er. Then in view of
Theorem 10.1.3 the components of the columns of form (10.1.12) describe
all the irreducible subspaces of A, which contain the vector jc, [in (10.1.12)
write x' in place of / ( / ) j.
Observe that every irreducible subspace contains an eigenvector of A. So
the description in the preceding paragraph gives all the irreducible subspaces
of A (if the vector xl is varied).

10.2 TRANSFORMATIONS HAVING THE SAME SET


OF INVARIANT SUBSPACES

Consider a transformation A: $"—> (p". In this section we describe the class


of all transformations B: $"-*$" such that lnv(A) = Inv(B). A relative
simple case of this situation has already been pointed out in Theorem 2.11.3
(when one transformation is a polynomial in the other). Surprisingly
enough, it turns out that the set of transformations B such that Inv(fi) =
Inv(v4) does not generally consist only of the transformations/(^4), where
/(A) is a polynomial with the properties indicated in Theorem 2.11.3. It can
even happen that noncommuting transformations have the same set of
invariant subspaces.
Before we embark on the statement and proof of the main theorem
describing the transformations with the same set of invariant subspaces
(which is quite complicated), let us study some examples.

EXAMPLE 10.2.1. Let A be the n x n Jordan block Jn(\a). The invariant


subspaces of A are J^ = Span{e,, . . . , e^, j - 0, . . . , n (by definition ^0 =
{0}). Let us find all transformations B: (pn —> <p" for which lnv(A) = lnv(B).
It turns out that Inv(,4) = Inv(B) if and only if (in the basis e^ . . . , en) B
has the form

where
324 Description of Invariant Subspaces and Linear Transformations

Indeed, suppose lnv(B) = Inv(.A). Then clearly the matrix representing B


has the triangular form (10.2.1). Moreover, it is easy to see that a n = • • • =
ann. Indeed, the numbers a , , , . . . , ann are the eigenvalues of B\ if they are
not all equal, then there exists a pair of nonzero complemented invariant
subspaces of B, namely, the root subspaces corresponding to a pair of
complemented nonempty subsets in cr(B). But the existence of a pair of
nonzero complemented subspaces contradicts the assumption that lnv(B) =
ln\(A).
Let us show that al2 a23 • - • « „ _ , „ ^0. Consider the transformation C =
B-anI, which has the same invariant subspaces as B. If for some /
(1 < / < n — 1) we have then Hence

Since any nonzero vector in Ker C spans a one-dimensional C-invariant


subspace, inequality (10.2.3) contradicts the assumption Inv(fi) = lnv(A)
again.
Conversely, suppose that B satisfies (10.2.1) and (10.2.2). Put C =
B-auI. We show that Ker C = ^. Let x = E" =1 ^jej £ Ker C and x ^0.
Let p be such that £p + j = • • • = £„ = 0 and £,p ^ 0. Then p = \. Indeed, if p
were greater than 1, then Cx = ap p +lep + l + • • • 7^0. So x = ^lel, that is,
Ker C = «$?!. This means that any two eigenvectors of B are collinear.
Appeal to Theorem 2.5.1 [(d)<£>(e)] and deduce that for any two B-
invariant subspaces M^ and M2, either M}CM2 or M2CM}. Since
J^o, jj?,, . . . , 5£n are B invariant and dim !£)= j (j = 0, . . . , n), it follows
that any B-invariant subspace coincides with one of ^. D
Example 10.2.1 provides a situation when Inv(^4) = Inv(S) but A and B
do not commute [take A = / n (A 0 ), n >3, and B as in (10.2.1) with distinct
nonzero numbers ay / + 1 , y = 1,. . . , n — 1].
If A has more than one Jordan block, the situation may be completely
different from Example 10.2.1.

EXAMPLE 10.2.2. Let

It turns out that Inv(fi) = Inv(A) if and only if B is a polynomial in A,


B - p(A), such that p'(\) 9*0. In other words, B has the form

for some a, b, c e <f where b ^ 0.


As by Theorem 2.11.3 Inv(B) = lnv(A) for every B in the form (10.2.4)
Transformations Having the Same Set of Invariant Subspaces 325

with b ^0, we must verify only that every B: <p 5 —» <f5 such that Inv(B) =
lnv(A) has the form (10.2.4) with b^Q (in the basis e}, e2, e3, e4, e5).
So assume Inv(Z?) = lnv(A). Then clearly B has upper triangular form,
and (see the argument in Example 10.2.1) the elements on the main
diagonal of B are all equal. Without loss of generality, we can assume that
the main diagonal in B is zero:

As Span{e4,e5} is A invariant and hence belongs to Inv(fi), we have


a
i4 ~ flis ~ °24 = a2s = ^34 ~fl35= ®- ^ one °^ tne numbers 0 12 , fl23, or a45
were zero, then B would have three one-dimensional invariant subspaces
whose sum is direct. This contradicts the assumption Inv(fi) = Inv(j4) (^4
cannot have more than two one-dimensional invariant subspaces whose sum
is direct). Hence al2, a23, and a45 are different from zero. It remains to show
that fl,2 = «23 ~ a 45- To this end observe that Span{e, + e 4 , e2 + e5} is A
invariant and hence B invariant. So

which implies an = a^- A similar analysis of the iJ-invanant subspace


Span{en e2 + e4, e3 + e5} leads to the conclusion that a23 = a45. D

Now we state the main theorem, which describes all transformations


B :$"-+<£" with Inv(fl) = Inv(/l), where the transformation A: <p" -> (p" is
given. This description will contain the results of Examples 10.2.1 and 10.2.2
as very special cases. Note that without loss of generality we can assume
(and we do) that A is an n x n matrix in the Jordan form

where are all the different eigenvalues of A,


and

where / ? , > • • • >p m . Of course, the number m, as well as / > , , . . . , pm,


depend on /; we suppress this dependence in the notation for the sake of
clarity. The notation for upper triangular Toeplitz matrices will be ab-
breviated to the form
326 Description of Invariant Subspaces and Linear Transformations

Finally, we use the notation

where F is the (q - p)x (q — p) upper triangular matrix whose (/, y) entry


is fif (i<j). It is assumed, of course, that p^q. In other words,
Uq(aQ,. . . , « p _ i ; F) is a q x q matrix whose first p superdiagonals (starting
from the main diagonal) have the structure of a Toeplitz matrix, whereas
the next q — p superdiagonals contain the upper triangular part of the
matrix F, which is not necessarily Toeplitz. If p = q, F is empty and

Theorem 10.2.1
If Inv(B) = Inv(y4) for a transformation B: <p"—» (p", then

(in a chosen Jordan basis for A), where each block


has the form

for some complex numbers


b and an upper triangular matrix F of size
numbers b2,. . . , bp , as well as the matrix F, depend on j. Conversely, if B
has the form (10.2.5), (10.2.6) and //,,, b} and F have the above properties,
then lnv(B) = Inv(yl).
Transformations Having the Same Set of Invariant Subspaces 327

We relegate the lengthy proof of this theorem to the next section. The
proof will be based on the description of irreducible subspaces obtained in
Section 10.1.
We conclude this section with two corollaries of Theorem 10.2.1.

Corollary 10.2.2
Suppose that AB = BA. Then lnv(A) = Inv(B) if and only if B=f(A),
where /(A) is a polynomial such that f(\t) ^/(A y ) for eigenvalues A, ^ A y of
A,f'(\0)*Q whenever A0 G o-(A) and Ker(

In other words, the conditions of Theorem 2.11.3 are not only sufficient,
but also necessary, provided A and B commute.

Proof. In view of Theorem 2.11.3 it is necessary to prove merely the


"only if" statement. So assume Inv(^4) = Inv(fi). Let \lt . . . , \k be the
different eigenvalues of A, and let

be the decomposition of $l^(A) into a direct sum of Jordan subspaces


^n ' • • • ' ^j,m- sucri tnat dim &j\ — '"— dim ^j,m- The restrictions A\^ and
B\% commute; so in view of Theorem 9.1.1 (observing that A\^ has only
one Jordan block) there exists a polynomial py( A) such that B\^ = pj(A\x ).
It follows now from Theorem 10.2.1 that B\^= pt(A\^). Since the minimal
polynomials of A\^ , j - 1, . . . , k are relatively prime ,' there exists a poly-
nomial p(\) such that B = p(A). Indeed, let p(\) be an interpolating poly-
nomial such that
where k- = dim .2), and qr (a) (A 0 ) denotes the ath
derivative of the polynomial g(A) evaluated at A 0 . (See Gantmacher (1959),
Lancaster and Tismenetsky (1985), for example, for information on inter-
polating polynomials (see also Section 2.10).)
From the definition of a function of the matrix A (see Section 2.10), it
follows that Z?|yj . = p(A\gt) for / ' = ! , . . . , & and, consequently, B = p(A).
Using Theorem 10.2.1 once more, we deduce that p(A,) ^/?(A y ) for i^j
a n d / ? ' ( A , ) ^ 0 f or / = !,. . . , k.

Corollary 10.2.3
Let A: §" —> <p" be a transformation. Then every transformation B with
Inv(B) = Inv(>4) commutes with A if and only if the following condition
holds: for every eigenvalue A0 of A with Ker(y4 - A 0 /) ^ S?A (A) and
dim Ker(>4 - A07) > 1 we have
328 Description of Invariant Subspaces and Linear Transformations

where p = p( A 0 ) is the maximal integer such that Ker(^4 - A () /) p ^ 9£A (A).


Further, the set of all transformations B with Inv(fi) = Inv(A) coincides with
the set of all transformations commuting with A if and only if dim Ker(A -
A 0 /) = 1 for every eigenvalue A0 of A, that is, A is nonderogatory.
The proof is obtained by combining Theorem 10.2.1 with the description
of all matrices commuting with A (Theorem 9.1.1).

10.3 PROOF OF THEOREM 10.2.1

We start with three lemmas to be used in the proof of Theorem 10.2.1.


Let A: <p"-» (p" be a unicellular transformation. (Recall that A is called
unicellular if <p" is its irreducible subspace.) Let g,,. . . , gn be a Jordan basis
of A. Let B be a transformation such that its matrix in the basis g t , . . . , gn
has the form

for some 6, E <p and an (n - k) x (n — k) upper triangular matrix F.

Lemma 10.3.1
If B has the form (10.3.1) with b2^0, then in any Jordan basis for B the
transformation A has the form

for some a, €E <p with a2 ^ 0, and some upper triangular matrix G.

Proof. Without loss of generality we can assume that &(A) — {0} and
the Jordan basis g j , . . . , gn coincides with the standard
where

for some bk +l,. . . , bn G £ and upper triangular matrix F' of size (n — k) x


(n - k). Since 6 2 ^ 0 , it follows from Example 10.2.1 that the transfor-
mations A, B, and Bl have the same invariant subspaces. Hence (recalling
the equivalence (a)<=>(e) in Theorem 2.5.1) the transformations B and Bl
are also unicellular.
As ABl = B^A and B^ is unicellular, it follows from Theorem 9.1.1 that
A = p(B}) for some polynomial p(A).
Let / , , . . . , / „ be a Jordan basis for fl. We claim that the matrix of C in
the basis /i, . . - , / „ has the form (10.3.3) again, possibly with another
matrix F'. Indeed, the only nonzero B-invariant subspaces are
Proof of Theorem 10.2.1 329

Span (because they are .A-invariant sub-


spaces). On the other hand, Example 10.2.1 ensures that the only nonzero
jB-invariant subspaces are Span It follows that
f Now it is easily seen that the matrix
of C in the basis has the form (10.3.3).
Consider the following relations

where every summand in H contains C as a factor. Consequently, the matrix


of H in the basis /i , • • • , / „ is upper triangular and the first k diagonals
(counting from the main diagonal) are zeros. Now (10.3.2) follows from
(10.3.4) and, by Example 10.2.1,

Let vectors be linearly independent in <J7". In the


sequel we shall encounter systems of vectors of the form

;lahsdfn';lmdv'cxn/fds;',/';j;cx/.lknfds;ljn;fds';jfdsmnksfd';123456

where fl(/ are certain numbers.

Lemma 10.3.2
If for every m = 1, . . . , p the subspaces Span and
Span coincide, then
Proof. Use induction on p. For p - 1 the lemma is evident. Assume
the lemma holds true for p = k, and Span
330 Description of Invariant Subspaces and Linear Transformations

Span By the induction hypothesis,


For every vector we have
Rewrite the equation

in the form

this will contradict the linear


independence of So we must have

Let ^ be the set of all irreducible subspaces of a transformation


Since every invariant subspace for a
transformation can be represented as a direct sum of irreducible subspaces,
the equality lnv(A) = Inv(B) holds if and only if Now consider a
special case of this equality.

Lemma 10.3.3
Let A: be such that

where ^ (£ = 1,2) are irreducible subspaces of A corresponding to the


same eigenvalue. Let dim
be Jordan bases in these subspaces. Then for
if and only if the matrix of B in the basis
has the form

where are complex numbers with is an upper


triangular matrix of size

Proof. First we prove the necessity, that is, if then B has the
form (10.3.7). Consider first the case p = q and prove the necessity by
induction on everything is evident. Suppose that the lemma i
true for and let be irreducible subspaces of dimension
Le vidently,
are irreducible subspaces of A corresponding to the same eigenvalue. Since
Proof of Theorem 10.2.1 331

by assumption, the subspaces are irreducible


subspaces for B. By Example 10.2.1 and the induction hypothesis, the
matrix representation of B in the basis has the
form

where
We assume otherwise, consider the linear transformation
where in place of B and use the property that lnv
lnv This condition means that B is invertible.
Let j£ be an irreducible subspace of A such that dim and
By Theorem 10.1.3, there exist numbers such that the
vectors

form a Jordan basis in !£. Since 6, ^0, it follows that

It follows from the form of B that

and
33 Descripccccc

evidently, Span Thaen by L:emma


we have

These equalities hold for every (by choosing all possible j£; see
Theorem 10.1.3). Therefore

Similarly, considering Jordan bases of the form

we obtain Let us
show that, in fact, To this end consider a Jordan basis of A
of the form

where £ and 17 are arbitrary numbers. We have

As above, we obtain
proof of theorem

ByLemmalQ.3.2, Since 17 can be arbitrary,


Thus the necessity part of Lemma 10.3.3 is proved for the case
Now consider the case and proceed by induction on
h. Assume that the necessity part of Lemma 10.3.3 holds for h ^ k, and let
be irreducible subspaces for A with dim and
dim By Example 10.2.1 and the assumption of induction, the matrix
representation of B in the basis has the form

where Let
334 Description of Invariant Subspaces and Linear Transformations

be a Jordan basis (for A) of an arbitrary irreducible subspace of


dimension p + k + 1 and such that d\ G $£. As above, we obtain

Now

Hence

Put

Since Span or every


« , , . . . , « , Lemma 10.3.2 implies that

The necessity part of Lemma 10.3.3 is proved.


Let us prove the sufficiency of the conditions of Lemma 10.3.3. Assume
that B has the form (10.3.7) in a Jordan basis for A. Let J£be an irreducible
subspace for A with dim and be an eigenvector.
Then umbers £ and 17. Put
Suppose that In view of Proposition 10.1.2 (see also the
remark after its proof), there are some number for which the
vectors
Proof of Theorem 10.2.1 335

form a Jordan basis of A in [If lace d by


respectively in (10.3.8).] A straightforward computation reveals
that ^ is B invariant, and in the basis we have:

As in Example 10.2.1, implies that J^is an irreducible subspace for B.


Now let $£ be an irreducible subspace for A such that dim S£ — m
It is easily seen that d, G Z£ and (by Proposition 10.1.2)
there exist numbers such that the vectors

form a Jordan basis of A in ££. Again, a straightforward calculation shows


that !£ is B invariant and in the basis

where the (/', /) entry of Since it follows from


Example 10.2.1 that the subspace j£is an irreducible subspace of B. So we
have proved that
Let us prove the opposite inclusion be a Jordan
basis of B in the subspace 2£2. Write pu
Evidently, the vectors form a
Jordan basis for Z? in pan{d
We show that the sequence can be augmented by vectors
so that is a Jordan basis of B in %l. (Observe tha
by Example 10.2.1, ^ is an irreducible subspace for B.) Assume that the
vectors are already constructed. Then
for the following equation must be satisfied in order that
336 Description of Invariant Subspaces and Linear Transformations

where Zr is the submatrix of formed by the first


rows and the columns From (10.3.7) and it
follows that Zr is invertible, so (10.3.11) always has a (unique) solution
is constructed. By Lemma 10.3.1, A has the followi following
form in the basis

for some F, where The first p diagonals in both blocks are the same
in view of the choice of
Now we can repeat the proof of the inclusion $A C $B given above, with
A and B interchanged. So follows and, therefore, also and

Now we are prepared to prove Theorem 10.2.1 itself.

Proof of Theorem 10.2.1. As every A- in variant subspace is the sum of


its intersections with the root subspaces of A, we may restrict ourselves to
the case when <p" is a root subspace for A. Let

be the decomposition of <p" into a direct sum of irreducible subspaces


be
«JC
Jordan bases in respectively. Assume (without loss of generali-
ty) that
Now let t*6 a transformation, and suppose that the invariant
subspaces of B and those of A are the same. Applying Lemma 10.3.3 to the
restrictions we find that /? has the form described in
Theorem 10.2.1.
Conversely, assume that B has the form

with We now prove that Inv(B) = lnv(A). Suppose for definiteness


that Let us show that every irreducible subspace for A is also
an irreducible subspace of B. Let !£ be an irreducible subspace for A
with dim and let be an eigenvector of A. Then
Span — Ker A. Write with
for some Put It is
Proof of Theorem 10.2.1 337

easily seen that Then the vector given b


(10.1.12) (replacing form a Jordan basis for A, for
some numbers wo possibilities occur for the number j
(=dim j£): Consider first the case Taking
into account the form of B, it is easy to check that Ja^is B invariant and the
matrix of B \^ in the basis is of the form

with Then by Example 10.2.1 !£ is an irreducible subspace for B.


Now suppose that Since j clearly r — 1. This means
that the eigenvector xl G £ is collinear with dlE.5£l. Taking into account
the form of B [given by (10.3.13)] we conclude that 3? is B invariant and the
matrix of B\<g in the basis is given by (10.3.10) with and
By Example 10.2.1, £ is an irreducible subspace for B.
We show that every irreducible subspace for B is also an irreducible
subspace for A. As we have already proved, the subspaces
[which appear in (10.3.12)] are also irreducible subspaces for B. Let
be a Jordan basis of B in Z£m\ then

where is the Jordan basis of A in be a


Jordan basis of A in Construct the vectors as follows
(recall that

Since the vectors form a Jordan basis in !£m, the vectors


satisfy the equalities
Because is an irreducible subspace for B, there exists
vectors sucn that the system forms a Jordan
basis for B in (See the last paragraph in the proof of Lemma 10.3.3.)
Express by means of the Jordan basis for A in

Continuing these constructions, we obtain Jordan bases for B in each of the


subspaces From the choice of these bases and Lemma
10.3.1 it follows that the matrix of A in the union of these bases has the
form
338 Description of Invariant Subspaces and Linear Transformations

where A is the eigenvalue of A and As it was proved above, every


irreducible subspace for B is also irreducible for A. Thus the equality
ln\(A) = lnv(B) holds.

10.4 EXERCISES
10.1 Let

(a) Describe all irreducible /l-invariant subspaces that contain


(b) Describe all irreducible y4-invariant subspaces that contain
10.2 Let Describe all irreducible ^-invariant subspaces that
contain
10.3 Prove or disprove the following statement: if are
transformations with and with ln\(A) =
Inv(fi), then A and B are similar.
10.4 Show that if have the same set of hyperinvariant
subspaces and if then A and B are
similar.
10.5 Show that two lower triangular Toeplitz matrices have the same
invariant subspaces if and only if each matrix is a polynomial in the
other.
10.6 Show that two circulants have the same invariant subspaces if and
only if each circulant is a polynomial in the other.
10.7 Is the property expressed in Exercise 10.6 true for two block circu-
lants of type

where A- are 2 x 2 matrices? What happens if Af are 3 x 3 matrices?


10.8 Show that two companion matrices have the same invariant subspaces
if and only if each is a polynomial in the other. Is this property true
for block companion matrices

with 2 x 2 blocks A ? For block companion matrices with 3 x 3 blocks


-v
Chapter Eleven

Algebras of Matrices
and Invariant Subspaces

In this chapter we consider subspaces that are invariant for every transfor-
mation from a given algebra of transformations. In fact, this framework
includes general finite-dimensional algebras over (p. The key result, that
every algebra of n x n matrices that is not the algebra of all n x n matrices
has a nontrivial invariant subspace, is developed with a complete proof.
Some results concerning characterization of lattices of subspaces that are
invariant for every transformation from an algebra are presented. Finally, in
the last section we study algebras of transformations for which the orthogo-
nal complement of an invariant subspace is again invariant.

//./ FINITE-DIMENSIONAL ALGEBRAS

A linear space V (over the field of complex numbers <p) is called an algebra
if an operation (usually called multiplication) is defined in V, which as-
sociates an element in V (denoted xy or x • y) with every (ordered) pair of
elements jc, y from V with the following properties: (a) a(xy) = (ax)y =
x(ay) for every a G <p and every jc, y G V; (b) (xy)z = x(yz) for every
jc, y, z G V (associativity of multiplication); (c) (x + y)z = xz + yz, x(y +
z) — xy + xz for every jc, y, z G V (distributivity of multiplication with re-
spect to addition).
Note that generally speaking xy ¥* yx in the algebra V. The algebra V may
or may not have an identity, that is, an element e G V such that ae — ea = a
for every a G V.
We consider only finite-dimensional algebras, that is, those that are
finite-dimensional linear spaces. The basic example of an algebra is Mn „, the
algebra of all n x n matrices with complex entries, with the usual multiplica-
tion operation. Another important example is the algebra of upper triangu-
lar n x n (complex) matrices.
The following theorem shows that actually every (finite-dimensional)

339
340 Algebras of Matrices and Invariant Subspaces

algebra is an algebra of (not necessarily all) matrices. This is the basic


simple result concerning representations of finite-dimensional algebras.

Theorem 11.1.1
Let V be an algebra of dimension n (as a linear space). If V has identity, then
V can be identified with an algebra of n x n matrices. If V does not have
identity, it can be identified with an algebra of (n + 1) x (n + 1) matrices.

Proof. Assume first that V has the identity e. Let x^,. . . , xn be a basis
in V. For every a E V the mapping a: V-* V defined by a(x) — ax, x E V is a
linear transformation. Denote by M(a) the n x n matrix that represents the
linear transformation a in the fixed basis x{,. . . , xn. It is easy to check that
the mapping M: F-» Mn n defined above is an algebraic homomorphism:

for any elements «, b E V and any a E <p. Further, the only element a E V
for which M(a) = 0 is a = 0. Indeed, if M(d) = 0, then a* = 0 for every x E V.
Taking x = e, we obtain a = 0. Hence we can identify V with the algebra
{M(a) | a e V}, which is simply an algebra of n x n matrices.
Assume now that V does not have identity. Define a new algebra V as all
ordered pairs (x, a) with jc E V, a e <p and with the following operations:

for any x, y € V and any a, /3, y G <p. Obviously, the algebra V has the
identity (0,1) and dimension n + 1. According to the part of Theorem
11.1.1 already proved, we can identify V with an algebra of (n + 1) x (n + 1)
matrices (clearly, dim V= n + l). As V can be identified in turn with the
subalgebra {(x, 0) | oc £ V} of V, the conclusion of Theorem 11.1.1
follows.

In view of Theorem 11.1.1 we consider only algebras of matrices in the


sequel.

11.2 CHAINS OF INVARIANT SUBSPACES

Let V be an algebra of (not necessarily all) n x n matrices. A subspace


is called V invariant if is invariant for any matrix from V. The
Chains of Invariant Subspaces 341

following basic fact (known as Burnside's theorem) establishes the existence


of nontrivial invariant subspaces for algebras of matrices.

Theorem 11.2.1
Let V be an algebra of n x n (complex) matrices with
Then there exists a nontrivial V-invariant subspace.

We exclude the case n — 1, when every subspace in <p" is trivial (in this
case the theorem fails for V= {0}). The proof of Theorem 11.2.1 is lengthy
and based on a series of auxiliary results; it is given in the next section.
Taking a maximal chain of V-invariant subspaces and using Burnside's
theorem we arrive at the following conclusion.

Theorem 11.2.2
For any algebra Vofnxn matrices, there is a chain of V-invariant subspaces

such that, with respect to a direct sum decomposition

where Np is a direct complement to every


transformation A E V has a block triangular form

and the set (App \ A £ V}, coincides with the algebra of all transformations
from Np into Np,for p = 1,. . . , k. The chain (11.2.1) is maximal, and every
maximal chain of V-invariant subspaces has the property stated above.

The case when V is the algebra of all block upper triangular matrices with
respect to the decomposition (11.2.2) is of special interest. Then Mnn is a
direct sum of two subspaces: V and W, where W is the algebra of all lower
block triangular matrices with zeros on the main block diagonal:

The subspaces
342 Algebras of Matrices and Invariant Subspaces

are all the invariant subspaces for W. In particular, we have the following
direct sum decompositions:

This motivates the following conjecture.

Conjecture 11.2.3
Let be nonzero subalgebras in Mn n such that
Then there exist nonzero invariant subspaces M} and M2 for
and respectively, which are direct complements of each other in <p".

We are able to prove a partial result in the direction of this conjecture.


Namely, if and are subalgebras in such that then
for every V^-invariant subspace and every V2-invariant subspace
either or (or both) holds. Indeed, assuming
the contrary, let be a direct complement to in i = 1, 2, and
let Jibe a direct complement to in Then we have a direct sum
decomposition

With respect to this decomposition, every has a block matrix


representation of type

[the zeros appear because of the V invariance of


whereas every has a block matrix representation of type

[the zeros appear because of the invariance of


So every matrix in has a zero in the (4,2) block entry, which
contradicts the assumption that
Proof of Theorem 11.2.1 343

11.3 PROOF OF THEOREM 11.2.1

We start with auxiliary results. A subset Q of an algebra U of n x n matrices


is called an ideal if Q is a subalgebra; that is, implie
and for every complex number a; and, in addition,
AB and ZL4 belong to Q as long as and . Trivial examples of
ideals are and

Lemma 11.3.1
The algebra has no nontrivial ideals.

Proof. Let Q be a nonzero ideal in and let It is


easily seen that for every pair of indices there are
matrices such that has a one in the entry and zeros
elsewhere. Now any matrix can be written

and thus belongs to Q. Hence

Now let U be an algebra of n x n matrices (n > 2) that has no nontrivial


invariant subspaces. We prove that thereby proving Theorem
11.2.1. The first observation is that without loss of generality we can assume
. Indeed, consider the algebra Obvious-
ly, U has no nontrivial invariant subspaces as well. Also, U is an ideal in U.
Hence, if we know already that then Lemma 11.3.1 implies that
either or But the latter case is excluded by the definition
of and the condition So it is assumed that /
Lemma 11.3.2 For every nonzero vector x in and every there
exists a matrix such that

Proof. The set is an invariant subspace for U. This


subspace is nonzero because x = I • x is a nonzero vector in M (recall that
By our assumption on U the subspace coincides with . Hence
for every there exists an such that

Lemma 11.3.3
The only matrices that commute with every matrix in U are the scalar
multiples of I.

Proof. Let be such that SA = AS for every Let be


an eigenvalue of S with corresponding eigenvector Then for every
we have
344 Algebras of Matrices and Invariant Subspaces

By Lemma 11.3.2, for every there is an A in U with So


equations (11.3.1) mean that

Lemma 11.3.4
If xl and x2 are linearly independent vectors in (p", then for every pair of
vectors there exists a matrix A from U such that
and

It is sufficient to show that there exist


Proof. such that
and Indeed, we may then us
Lemma 11.3.2 to find with Henc

We now prove the existence of A\. (The existence of A2 is proved


similarly.) Arguing by contradiction, assume that implies
for every Then one can define a transformation by the
requirement that for all Indeed, if for some
A and B in U, then and thus also which
means . So T is correctly defined. Further,
by Lemma 11.3.2; hence T is defined on the whole of Now for any A
and B in U we have

and since we find that for all By


Lemma 11.3.3, T = al for some a G <p. Therefore, for all
. But this contradicts Lemma 11.3.2.

We say that an algebra V of n x n matrices is /c transitive if for every set


of /c linearly independent vectors in and every set of k vectors
in there exists a matrix A such that
Evidently, every fc-transitive algebra is p transitive for
Lemma 11.3.4 says that the algebra U is 2 transitive.

Proof of Theorem 11.2.1 In view of Lemma 11.3.4 it is sufficient to


prove that every 2-transitive algebra V of n x n matrices is n transitive.
Assume by induction that V is k transitive, and we will prove that V is
transitive (here
So let be linearly independent vectors in It will suffice to
verify that for every there exists a matrix such that
(indeed, for given the 1 transitiv-
Proof of Theorem 11.2.1 345

ity of V implies the existence of such that then for


we have
We will prove the existence of one has simply to
permute the indices). Suppose that no such exists; that is,
implies that . Consider the algebra

of 2/i x In matrices. It turns out (because of the 2 transitivity of V) that any


V (2) -invariant subspace is one of the subspaces
for some A G (p. Indeed, the V^-invariant subspace M
(which we can assume to be nonzero) is a sum of cyclic V (2) -invariant
subspaces: M — M^ where

Fix an index /. For any n x / i matrix B, assuming xn,xi2 are linearly


independent, and by the assumption of 2 transitivity of V, we have Bxn —
Axn, Bxi2 — Axi2 for some hence invariant, where

Now because of the obvious 2 transitivity of Mn n, we find that Mf =


. Assume now that and xi2 are linearly dependent. Then 1
transitivity of V implies again that Ml is M^\ invariant. If xn = 0, we get
and if x for some et

Consequently, is equal to <p2" except for the two cases: (1)


xn = 0 for all / = 1, . . . , p\ (2) / = 1, . . . , p for the same
In the first case in the second case M

Now we return to the proof of the existence of By the induction


hypothesis, for each / ( ! < / < & ) there is some with and
The subspace

is V (2) invariant; therefore (according to the fact proved in the preceding


346 Algebras of Matrices and Invariant Subspaces

paragraph), there exists a complex number a such that


for all The induction hypothesis implies that

and the assumption that = 0 whenever for j = 1, . . , k shows


that a mapping T: is unambiguously defined by

Obviously, T is linear. Further, for A V and j - 1,. . . , k we have (where


the term ACjXj appears in the yth place)

Since the subspace coincided with by the 1


transitivity of V. So the linearity of T gives

Then, for A £ V

Hence {x \ Ax — 0 for all } is a nontrivial V-invariant subspace. This


contradicts the 1 transitivity of V.

11.4 REFLEXIVE LATTICES

Let A be a lattice of subspaces in (p". The set of all n x n matrices A such


that A3?C3? for every J2?EA, denoted Alg(A), is an algebra. Indeed, if
then
Reflexive Lattices 347

for every subspace On the other hand, for an algebra V of n x n


matrices the set Inv(V) of all V- in variant subspaces in <p" is easily seen to be
a lattice of subspaces [i.e., n v ( K ) implies In\(V) and
The following properties of Alg( ) and Inv(V) are
immediate consequences of the definitions.

Proposition 11.4.1
(a) If and are two lattices of subspaces in and then
(b) If V, and V-, are algebras of n x n matrices and
then

Let us check property (c), for example. Assum At or


every /4 lg(A). Hence £ is Alg( invariant; that is, Inv(Alg(

EXAMPLE 11.4.1. Let A be the chain

Then Alg(A) is the algebra of all upper triangular matrices.

EXAMPLE 11.4.2. Let A be the set of subspaces Span where K


runs over all subsets of { ! , . . . , « } . Clearly A is a lattice. The algebra
Alg(A) is easily seen to be the algebra of all diagonal matrices.

EXAMPLE 11.4.3. For a fixed subspace let A be the lattice of all


subspaces that are contained in M. Then Alg(A) is the algebra of all
transformations A having the form

with respect to the direct sum decomposition M + N (for a fixed direct


complement Jf to M).

EXAMPLE 11.4.4. Let V be the algebra of polynomials


where is a fixed linear transformation. Then Inv(V) is the lattice
of all /1-invariant subspaces.

EXAMPLE 11.4.5. Let A: $" —»• <p" be a fixed transformation, and let V be
the algebra of all transformations that commute with A. Then Inv(V) is the
lattice of all A-hyperinvariant subspaces.

Note that

    Alg(Inv(Alg(Λ))) = Alg(Λ)    (11.4.1)

for every lattice Λ of subspaces in ℂⁿ. Indeed, the inclusion ⊆ in
equation (11.4.1) follows from (c) and (a). To prove the opposite inclusion,
let A ∈ Alg(Λ). Then any subspace M belonging to Inv(Alg(Λ)) is invariant
for every transformation in Alg(Λ); in particular, M is A invariant. This
shows that A ∈ Alg(Inv(Alg(Λ))). Similarly, one proves that

    Inv(Alg(Inv(V))) = Inv(V)    (11.4.2)

for every algebra V of transformations.


A lattice of subspaces in is called reflexive if Inv(Alg(
Equality (11.4.2) shows, for example, that any lattice of the form Inv(V) for
some algebra V is reflexive. Let us give an example of a nonreflexive lattice.

EXAMPLE 11.4.6. Let Λ be the following lattice of subspaces in ℂ²:

    Λ = { {0}, L = Span{e₁}, M = Span{e₂}, N = Span{e₁ + e₂}, ℂ² }

Let us find the algebra Alg(Λ). The 2 × 2 matrix A = [[a, b], [c, d]] has
invariant subspaces L and M if and only if b = c = 0. Further, N is A
invariant if and only if a = d. So

    Alg(Λ) = { aI : a ∈ ℂ }

and Inv(Alg(Λ)) consists of all subspaces in ℂ².
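A quick symbolic check of this computation (a sketch in Python with sympy, not
part of the original text):

    import sympy as sp

    a, b, c, d = sp.symbols('a b c d')
    A = sp.Matrix([[a, b], [c, d]])

    # Invariance of Span{e1} and Span{e2} forces c = 0 and b = 0; invariance
    # of Span{e1 + e2} forces A*(e1 + e2) to be a multiple of e1 + e2.
    w = A * sp.Matrix([1, 1])
    constraints = [A[1, 0], A[0, 1], w[0] - w[1]]
    print(sp.solve(constraints, [b, c, d]))    # {b: 0, c: 0, d: a}, i.e., A = aI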

Many results are known about sufficient conditions for reflexivity of a


lattice of subspaces. Often the key ingredient in such conditions is distribu-
tivity. Recall the definition of a distributive lattice of subspaces given in
Section 9.6.

Theorem 11.4.2
A distributive lattice of subspaces in ℂⁿ is reflexive. Conversely, every finite
reflexive lattice of subspaces is distributive.

The proof of Theorem 11.4.2 is beyond the scope of this book, and we
refer the reader to the original papers by Johnson (1964) and Harrison
(1974) for the full story. Here, we shall only prove two particular cases in
the form of Theorems 11.4.3 and 11.4.4.

Theorem 11.4.3
A complete chain of subspaces

    {0} = M₀ ⊂ M₁ ⊂ ⋯ ⊂ M_{n−1} ⊂ M_n = ℂⁿ,  dim M_i = i,

is reflexive.

Proof. Let f₁, . . . , f_n be a basis in ℂⁿ such that Span{f₁, . . . , f_i} = M_i,
i = 1, . . . , n − 1, and write linear transformations as matrices with respect to
this basis. Example 11.4.1 shows that Alg(Λ) consists of all upper triangular
matrices. As the linear transformation given by the single Jordan block

    J_n(0)

obviously belongs to Alg(Λ), and its only invariant subspaces are {0}, M_i,
i = 1, . . . , n − 1, and ℂⁿ, we have Inv(Alg(Λ)) ⊆ Λ. Since the reverse inclu-
sion is clear, the conclusion of Theorem 11.4.3 follows. □

The next theorem deals with lattices that are as unlike chains as possible.
A lattice Λ of subspaces in ℂⁿ is called a Boolean algebra if it is
distributive and for every M ∈ Λ there is a unique complement M′ (i.e., a
subspace M′ that belongs to Λ and satisfies M ∩ M′ = {0}, M + M′ = ℂⁿ).
We say that a nonzero subspace X ∈ Λ is an atom if there are no subspaces
of the lattice Λ strictly between X and {0}. The Boolean algebra Λ is called
atomic if any M ∈ Λ is the sum of all atoms X contained in M. A typical
example of an atomic Boolean algebra of subspaces is Λ = {Span{x_i | i ∈ E},
where E is any subset in {1, 2, . . . , n}}, and x₁, . . . , x_n is a fixed basis in ℂⁿ.

Theorem 11.4.4
Every atomic Boolean algebra Λ of subspaces of ℂⁿ is reflexive.

Proof. Let K be the set of all atoms in Λ, and for every X ∈ K let P_X be
the projector on X along the complement X′ of X in the lattice Λ. We shall
show that Λ = Inv(V), where V is the algebra generated by the transfor-
mations of type P_X A P_X, where A: ℂⁿ → ℂⁿ and X ∈ K. In other words, V
consists of all linear combinations of transformations of type

    P_{X₁} A₁ P_{X₁} + ⋯ + P_{X_m} A_m P_{X_m}

where A_j: ℂⁿ → ℂⁿ and X₁, . . . , X_m are atoms in Λ.


Let L be an atom in Λ. For any atom X, we have either L ⊆ X or L ⊆ X′.
(This follows from the distributivity of Λ:

    L = L ∩ (X + X′) = (L ∩ X) + (L ∩ X′)

as L is an atom, either L = L ∩ X or L = L ∩ X′ holds.) In the former
case Im(P_X A P_X) ⊆ X = L for every transformation A: ℂⁿ → ℂⁿ, and in the latter
case L ⊆ Ker(P_X A P_X). In either case L is P_X A P_X invariant. Hence
L ∈ Inv(V). Now every M ∈ Λ is a sum of the (finitely many) atoms contained in
M. Hence M ∈ Inv(V). In other words, Λ ⊆ Inv(V).
To prove the reverse inclusion, it is convenient to use the following fact:
if X is an atom in Λ and M ∈ Inv(V), then either X ⊆ M or M ⊆ X′.
Indeed, suppose that M is not contained in X′, so there exists a vector
f ∈ M such that P_X f ≠ 0. Since P_X f ≠ 0, it follows that every vector x in
X has the form A P_X f for some transformation A: ℂⁿ → ℂⁿ. Then also
x = P_X A P_X f. As f ∈ M and M ∈ Inv(V), we have x ∈ M; that is, X ⊆ M.
Return to the proof of the inclusion Inv(V) ⊆ Λ. Let M ∈ Inv(V), and let
M₀ be the sum of all the atoms in Λ that are contained in M. Also, let N₀
be the intersection of all the complements of atoms in Λ such that
these complements contain M. Obviously

    M₀ ⊆ M ⊆ N₀    (11.4.3)

Since Λ is atomic, the complement M₀′ is the sum of all atoms that are
not contained in M. (Indeed, if an atom X is contained in M₀′, then X is not
contained in M₀, and thus, by the definition of M₀, X is not contained in M.
Conversely, if an atom X is not contained in M, then obviously X is not
contained in M₀, and since X is an atom, it must be contained in M₀′.) The
fact proved in the preceding paragraph shows that M₀′ is the sum of all the
atoms X with the property that M ⊆ X′. For any finite set X₁, . . . , X_p of
such atoms, with M ⊆ X_i′ for i = 1, . . . , p, we have (using the distributivity of Λ)

    (X₁ + ⋯ + X_p) ∩ X₁′ ∩ ⋯ ∩ X_p′ = {0}

and N₀ ⊆ X₁′ ∩ ⋯ ∩ X_p′, so actually

    N₀ ∩ M₀′ = {0}

This shows that dim N₀ ≤ dim M₀; hence N₀ = M₀. Combining this with (11.4.3), we
see that M = M₀ ∈ Λ, and thus Inv(V) ⊆ Λ. □

11.5 REDUCTIVE AND SELF-ADJOINT ALGEBRAS

We have seen in Corollary 3.4.4 that the set Inv(A) of invariant subspaces of
a transformation A: ℂⁿ → ℂⁿ has the property that M ∈ Inv(A) exactly
when M⊥ ∈ Inv(A) if and only if A is normal. This property makes it
natural to introduce the following definition: an algebra V of n × n matrices
is called reductive if it contains I and for every subspace belonging to Inv(V)
its orthogonal complement belongs to Inv(V) as well. Thus the algebra P(A)
of all polynomials p(A), where A is a normal transformation, is reduc-
tive. This algebra P(A) has the property that X ∈ P(A) implies X* ∈ P(A).
Indeed, we have only to show that, for the normal transformation A, the
adjoint A* is a polynomial in A. Passing, if necessary, to the orthonormal
basis of eigenvectors of A, we can assume that A is diagonal: A =
diag[λ₁, . . . , λ_n]. Now let p be a scalar polynomial satisfying the
conditions p(λ_i) = λ̄_i, i = 1, . . . , n. Then clearly p(A) = A*.
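This interpolation argument can be checked numerically; here is a minimal
sketch (Python with numpy, our illustration rather than the book's) that builds
p by solving a Vandermonde system for a normal matrix with distinct eigenvalues:

    import numpy as np

    lam = np.array([1.0 + 2.0j, -0.5j, 3.0])            # distinct eigenvalues
    Q, _ = np.linalg.qr(np.random.rand(3, 3) + 1j * np.random.rand(3, 3))
    A = Q @ np.diag(lam) @ Q.conj().T                   # a normal matrix

    # Coefficients of the polynomial p with p(lam_i) = conj(lam_i)
    coeffs = np.linalg.solve(np.vander(lam, increasing=True), lam.conj())
    pA = sum(c * np.linalg.matrix_power(A, k) for k, c in enumerate(coeffs))
    print(np.allclose(pA, A.conj().T))                  # True: p(A) = A*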
The next theorem shows that this property of the reductive algebra P(A)
is a particular case of a much more general fact.

Theorem 11.5.1
An algebra V of n × n matrices with I ∈ V is reductive if and only if V is
self-adjoint, that is, A ∈ V implies A* ∈ V.

As a subspace M is A invariant if and only if M⊥ is A* invariant, it
follows immediately that every self-adjoint algebra with identity is reductive.
To prove the converse, we need the following basic property of invariant
subspaces of reductive algebras.

Lemma 11.5.2
Let V be a reductive algebra of n × n matrices, and let M₁, . . . , M_m be a set
of mutually orthogonal V-invariant subspaces such that

    M₁ ⊕ M₂ ⊕ ⋯ ⊕ M_m = ℂⁿ

and for every i the set of restrictions {A|_{M_i} : A ∈ V} coincides with the algebra
M(M_i) of all transformations from M_i into M_i. Then V is self-adjoint.

Proof. We proceed by induction on m. For m = 1, that is, M₁ = ℂⁿ and
V = M(ℂⁿ), the lemma is obvious. So assume that the lemma is proved
already for m − 1 subspaces, and we prove the lemma for m subspaces.
It is convenient to distinguish two cases. In case 1, there exist distinct
integers j and k between 1 and m and an algebraic isomorphism
φ: M(M_j) → M(M_k) such that φ(A|_{M_j}) = A|_{M_k} for every A ∈ V. This means
that φ is a one-to-one and onto map with the following properties:

    φ(αA + βB) = αφ(A) + βφ(B),  φ(AB) = φ(A)φ(B)

As {A|_{M_j} : A ∈ V} is equal to M(M_j), φ is defined on the whole algebra M(M_j).

We show first that there exists an invertible transformation S: M_j → M_k
such that φ(X) = SXS⁻¹ for every X ∈ M(M_j). Note that φ takes
rank 1 projectors into rank 1 projectors. Indeed, if P² = P and
rank P = 1, then φ(P)² = φ(P), so φ(P) is a projector. Moreover, the
one-dimensional subspace

    { PXP : X ∈ M(M_j) }

is mapped by φ onto the subspace

    { φ(P)Yφ(P) : Y ∈ M(M_k) }

so the latter subspace is also one-dimensional; hence
rank φ(P) = 1. Now fix any nonzero vector f ∈ M_j, and let A₀ be
the orthogonal projector on Span{f}. As φ(A₀): M_k → M_k is also a one-
dimensional projector, there exists an invertible transformation
S₀: M_j → M_k such that φ(A₀) = S₀A₀S₀⁻¹. (This follows from the fact that
the Jordan form of any one-dimensional projector is the same:
diag[1, 0, . . . , 0].) Define S: M_j → M_k by

    S(Af) = φ(A)S₀f,  A ∈ M(M_j)

Let us show that this definition is correct. Indeed, if A₁f = A₂f, then
(A₁ − A₂)A₀ = 0. Consequently, (φ(A₁) − φ(A₂))φ(A₀) = 0, and since
φ(A₀) is a projector onto Span{S₀f}, we obtain φ(A₁)S₀f = φ(A₂)S₀f.
In other words, A₁f = A₂f happens only if φ(A₁)S₀f = φ(A₂)S₀f. Hence S
is correctly defined. Clearly, S is linear and onto. If φ(A)S₀f = 0, then
φ(AA₀) = φ(A)φ(A₀) = 0, which implies AA₀ = 0 and Af = 0. This shows that
Ker S = {0}. Hence S is invertible. Finally, for every A, B ∈ M(M_j) we
have

    S(ABf) = φ(AB)S₀f = φ(A)φ(B)S₀f = φ(A)S(Bf)

and thus SAg = φ(A)Sg for every g ∈ M_j. Thus φ(A) = SAS⁻¹ for all A ∈ M(M_j).

Next, we show that S can be taken to be unitary, that is, S⁻¹ = S*. Let L
be the subspace in ℂⁿ consisting of all vectors of the form x₁ + ⋯ + x_m, where
x_i ∈ M_i for each i and x_k = Sx_j. As

    A|_{M_k}(Sx_j) = φ(A|_{M_j})Sx_j = S(A|_{M_j}x_j)

for every A ∈ V, it follows that L is V invariant. Since V is reductive, L⊥ is
V invariant as well. A computation shows that

The fact that for all implies that is for


and then

As {<<4|^| AEV} coincides with M(Mt) and in the preceding equality jty.
can be an arbitrary vector from M -^ we obtain B = S*SBS~lS*~l for all
By Proposition 9.1.6, for some number that must
1/2
be positive because 5*5 is positive definite. Letting 5, we obtain a
l
unitary transformation U such that (B) = UBU~ for all
We next show that V|_{M_k⊥} is reductive. Indeed, let N ⊆ M_k⊥
be V|_{M_k⊥} invariant. Then clearly N is V invariant,
and by the reductive property of V, so is N⊥, and hence also N⊥ ∩ M_k⊥. It
remains to notice that N⊥ ∩ M_k⊥ coincides with the orthogonal complement
to N in M_k⊥.
By the induction hypothesis, V|_{M_k⊥} is self-adjoint. Therefore, for every
matrix A ∈ V the transformation

    (11.5.1)

belongs to V. As for every A ∈ V we have

it follows that

But U is unitary, so the transformation (11.5.1) is just
A*. We have proved that V is self-adjoint (in case 1).
Consider now case 2. For any pair of distinct integers j and k between 1
and m, there is no algebraic isomorphism φ as in case 1. If for fixed j and k,
A|_{M_j} = 0 implies A|_{M_k} = 0 for any A ∈ V and vice versa, then we can
correctly define an algebraic isomorphism φ: M(M_j) → M(M_k) by putting
φ(A|_{M_j}) = A|_{M_k} for all A ∈ V [recall that {A|_{M_j} : A ∈ V} = M(M_j)]. Thus our assump-
tion in case 2 implies the following. For each pair j, k of distinct integers
between 1 and m there exists a matrix A ∈ V such that exactly one of the
transformations A|_{M_j}, A|_{M_k} is zero.
We now prove that there exists a matrix A ∈ V such that A|_{M_j} is different
from zero for exactly one index j. Choose A ∈ V different from zero so that
the number p of indices j with A|_{M_j} ≠ 0 is minimal. Permuting M₁, . . . , M_m
if necessary, we can assume that

    A|_{M_j} ≠ 0 for j = 1, . . . , p;  A|_{M_j} = 0 for j = p + 1, . . . , m
We must show that p = 1. Assume the contrary, that is, p ≥ 2.
Interchanging M₁ and M_p if necessary, we can assume that C = A|_{M₁} ≠ 0
for some matrix A ∈ V with A|_{M_p} = 0. Denote by I₁ the set of all transformations
B: M₁ → M₁ such that B = B̃|_{M₁} for some B̃ ∈ V with B̃|_{M_p} = 0. The fact
that {A|_{M₁} : A ∈ V} = M(M₁) implies that I₁ is an ideal in M(M₁). Since C ∈ I₁ and
C ≠ 0, Lemma 11.3.1 shows that actually I₁ = M(M₁). Similarly, the set I₂
of all transformations B: M₁ → M₁ such that B = B̃|_{M₁} for some B̃ ∈ V with
B̃|_{M_{p+1}} = 0, . . . , B̃|_{M_m} = 0, is a nonzero ideal in M(M₁), and thus I₂ = M(M₁).
Now the identity transformation I: M₁ → M₁ belongs to both I₁ and
I₂. Therefore, there exist transformations B̃_j ∈ V
(j = 2, 3, . . . , p) such that

and

belongs to V. Then also the product of these matrices belongs to V, and

However, this contradicts the choice of p. So, indeed, p = 1.
As the ideal I₂ constructed above coincides with M(M₁), it follows that
every matrix B from V is the sum of its two restrictions B|_{M₁} and B|_{M₁⊥}.
Since V|_{M₁} = M(M₁), we find that V is self-adjoint provided V|_{M₁⊥} is. But the
algebra V|_{M₁⊥} is easily seen to be reductive because V is. Now the self-
adjointness of V|_{M₁⊥} follows from the induction hypothesis. Lemma 11.5.2 is
proved completely. □

Now we are ready to prove the converse statement of Theorem 11.5.1. If
V has no nontrivial invariant subspaces, then by Theorem 11.2.1 V = M(ℂⁿ),
and obviously V is self-adjoint. If V has nontrivial invariant subspaces, then
it has a minimal one, say, M₁. As V is reductive, M₁⊥ is also V invariant, and
the restriction V|_{M₁⊥} is reductive. If V|_{M₁⊥} is not the algebra of all transfor-
mations M₁⊥ → M₁⊥, then there exists a minimal nontrivial V-invariant sub-
space M₂ ⊆ M₁⊥. Proceeding in this manner, we obtain a sequence of
mutually orthogonal V-invariant subspaces M₁, . . . , M_m such that

    M₁ ⊕ ⋯ ⊕ M_m = ℂⁿ

and for each j there are no nontrivial V-invariant subspaces in M_j. By
Theorem 11.2.1 the restriction V|_{M_j} (j = 1, . . . , m) coincides with the
algebra of all transformations M_j → M_j. It remains to apply Lemma 11.5.2. □

11.6 EXERCISES

11.1 Prove or disprove that the following sets of n × n matrices are


algebras:
(a) Upper triangular Toeplitz matrices:

(b) Toeplitz matrices:

(c) Circulant matrices:

(d) Companion matrices:

(e) Upper triangular matrices where


11.2 Prove or disprove that the following sets of nk × nk matrices are
algebras:
(a) Block upper triangular Toeplitz matrices (1), where a_i are k × k
matrices, i = 1, . . . , n.
(b) Block Toeplitz matrices (2), where a_j are k × k matrices,
j = −n + 1, . . . , n − 1.
(c) Block circulant matrices (3), where a_j are k × k matrices,
j = 1, . . . , n.

(d) Block upper triangular matrices [a_{ij}]ⁿ_{i,j=1}, where a_{ij} are k × k
matrices and a_{ij} = 0 if i > j.
(e) Matrices of type

where a_{ij} are k × k matrices.


11.3 Show that the set of all n × n matrices of type

is an algebra. Find all invariant subspaces of this algebra.


11.4 Let A be an n × n matrix.
(a) Show that the set

is not necessarily an algebra.

(b) Prove that the closure of Q, that is, the set of all n × n matrices X for
which there exists a sequence {X_m}_{m=1}^∞ with X_m ∈ Q for m =
1, 2, . . . and lim_{m→∞} X_m = X, is an algebra with identity.
(c) Describe all invariant subspaces of the closure of Q.
11.5 Show that the algebra of all n × n upper triangular Toeplitz matrices
and the algebra of all n × n upper triangular matrices have exactly
the same lattice of invariant subspaces.
11.6 Show that the algebra of all upper triangular n × n matrices contains
any algebra A for which

11.7 Show that there is no algebra A with identity strictly contained in the
algebra UT(n) of upper triangular Toeplitz matrices for which

11.8 Prove that the algebra U(n) of n x n upper triangular matrices is the
unique reflexive algebra for which the lattice of all invariant sub-
spaces is the chain

11.9 Show that there exist n different algebras V₁, . . . , V_n whose set of
invariant subspaces coincides with (5) and for which

11.10 Find all invariant subspaces of the algebra of all 2n × 2n matrices of


type

where A, B, C, and D are upper triangular matrices.


11.11 As Exercise 11.10 but now, in addition, B and C have zeros along
the main diagonal.
11.12 Find all invariant subspaces of the algebra of all 2n × 2n matrices
, where A, B, C, and D are n × n circulant matrices.
11.13 Let A be an n × n matrix that is not a scalar multiple of the identity.
Find a nontrivial invariant subspace for the algebra of all matrices
that commute with A. Does there exist such a subspace of dimension
1?
11.14 Let A be an n × n matrix and

be the algebra of polynomials in A. Give necessary and sufficient
conditions for reflexivity of V in terms of the structure of the Jordan
form of A.
11.15 Indicate which of the following algebras are reflexive:
(a) n × n upper triangular Toeplitz matrices.
(b) n × n upper triangular matrices.
(c) n × n circulant matrices.
(d) nk × nk block circulant matrices (with k × k blocks).
(e) nk × nk block upper triangular matrices (with k × k blocks).
(f) nk × nk block upper triangular Toeplitz matrices (with k × k
blocks).
(g) the algebra from Exercise 11.3.

11.16 Let Q be as in Exercise 11.4. When is the closure of Q a reflexive
algebra?
11.17 Given a chain of subspaces

construct reflexive and nonreflexive algebras whose set of invariant
subspaces coincides with (6).
11.18 Let x₁, . . . , x_n be a basis in ℂⁿ, and let Λ be the minimal lattice of
subspaces that contains Span{x₁}, . . . , Span{x_n}. Prove that there
exists a unique algebra V for which Λ = Inv(V). Is V reflexive?
11.19 Let V be an algebra of n × n matrices without identity and such that
Aⁿ = 0 for every A ∈ V. Prove that A₁A₂ ⋯ A_n = 0 for every n-tuple
of matrices A₁, . . . , A_n from V. (Hint: Use Theorem 11.2.2.)
Chapter Twelve

Real Linear
Transformations

In this chapter we review the basic facts concerning invariant subspaces for
transformations from ℝⁿ into ℝⁿ, focusing mainly on those results that are
different (or whose proofs are different) in the real case, or that cannot be
obtained as immediate corollaries from the corresponding results for trans-
formations from ℂⁿ into ℂⁿ.
We note here that the applications presented in Chapters 5, 7, and 8 also
hold in the real case. That is, applications to matrix polynomials
with real n × n coefficients A_j and to rational matrix functions W(λ) whose
values are real n × n matrices for the real values of λ that are not poles of
W(λ).
polynomials and rational matrix functions in terms of invariant subspaces (as
developed in Chapters 5 and 7) holds for matrices over any field. This
remark applies for the linear fractional decompositions of rational matrix
functions as well. In contrast, the Brunovsky canonical form (Section 6.2) is
not available in the framework of real matrices, so all the results of Chapter
6 that are based on the Brunovsky canonical form fail, in general, in this
context. Also, the results of Chapter 11 do not generally hold in the
context of finite-dimensional algebras over the field of real numbers.

12.1 DEFINITION, EXAMPLES, AND FIRST PROPERTIES OF INVARIANT SUBSPACES

Let A: ℝⁿ → ℝⁿ be a linear transformation. As in the case of linear
transformations on a complex space, we say that a subspace M ⊆ ℝⁿ is
invariant for A (or A invariant) if Ax ∈ M for every x ∈ M. The whole of ℝⁿ
and the zero subspace are trivially A invariant, and the same applies to
Im A and Ker A. As in the complex case, one checks that all the nonzero
invariant subspaces of the n × n Jordan block with real eigenvalue (con-
sidered as a transformation from ℝⁿ into ℝⁿ written as a matrix in the
standard orthonormal basis e₁, . . . , e_n) are Span{e₁, . . . , e_k}, k = 1, . . . , n.
Also, for the diagonal matrix A = diag[λ₁, . . . , λ_n], where λ₁, . . . , λ_n are
distinct real numbers, all the invariant subspaces are of the form
Span{e_i | i ∈ K} with K ⊆ {1, . . . , n} (Span{e_i | i ∈ ∅} is interpreted as the
zero subspace).
In addition to these examples, the following example is basic and
especially significant for real transformations.

EXAMPLE 12.1.1. Let

    A = [[K, I₂, 0, . . . , 0], [0, K, I₂, . . . , 0], . . . , [0, . . . , 0, K]],  K = [[σ, τ], [−τ, σ]]

(a block matrix built from 2 × 2 blocks), where σ and τ are real numbers and
τ ≠ 0. The size n of the matrix A is
obviously an even number. It is easily seen that Span{e₁, . . . , e_{2k}}, k =
1, . . . , n/2 are A-invariant subspaces. It turns out that A has no other
nontrivial invariant subspaces. Indeed, replacing A by A − σI, we can
assume without loss of generality that σ = 0. We prove that if M is an
A-invariant subspace and x = Σ_{i=1}^{2k} a_i e_i ∈ M with at least one of the real
numbers a_{2k−1} and a_{2k} different from zero, then M ⊇ Span{e₁, . . . , e_{2k}}, and
proceed by induction on k.
In the case k = 1 we have x = a₁e₁ + a₂e₂ and A(a₁e₁ + a₂e₂) = τa₂e₁ −
τa₁e₂. The conditions τ ≠ 0 and a₁² + a₂² ≠ 0 ensure that both vectors e₁
and e₂ are linear combinations of a₁e₁ + a₂e₂ and τa₂e₁ − τa₁e₂, and the
assertion is proved for k = 1. Assume now that the assertion is proved for
k − 1. A computation shows that
the vector y = A²x + τ²x belongs to Span{e₁, . . . , e_{2k−2}}, and in the linear
combination y = Σ β_i e_i at least one of the numbers β_{2k−3}, β_{2k−2} is
different from zero. Obviously y ∈ M, so the induction assumption implies
M ⊇ Span{e₁, . . . , e_{2k−2}}. Hence a_{2k−1}e_{2k−1} + a_{2k}e_{2k} ∈ M; as the differ-
ence Ax − (τa_{2k}e_{2k−1} − τa_{2k−1}e_{2k}) belongs to Span{e₁, . . . , e_{2k−2}}, also
τa_{2k}e_{2k−1} − τa_{2k−1}e_{2k} ∈ M. Consequently, the vectors e_{2k−1} and e_{2k} belong
to M, and M ⊇ Span{e₁, . . . , e_{2k}}. In particular, A has no odd-dimensional
invariant subspaces. □
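The block structure of this example is easy to reproduce numerically. The
following sketch (Python with numpy; the builder real_jordan_block is our
naming) constructs a matrix of the above form and confirms both its eigenvalues
and the invariance of Span{e₁, . . . , e_{2k}}:

    import numpy as np

    def real_jordan_block(sigma, tau, half_size):
        # 2k x 2k matrix: blocks [[sigma, tau], [-tau, sigma]] on the block
        # diagonal and 2 x 2 identities on the block superdiagonal.
        K = np.array([[sigma, tau], [-tau, sigma]])
        A = np.zeros((2 * half_size, 2 * half_size))
        for j in range(half_size):
            A[2*j:2*j+2, 2*j:2*j+2] = K
            if j + 1 < half_size:
                A[2*j:2*j+2, 2*j+2:2*j+4] = np.eye(2)
        return A

    A = real_jordan_block(1.0, 2.0, 3)          # 6 x 6, eigenvalues 1 +/- 2i
    print(np.unique(np.linalg.eigvals(A).round(6)))
    # Span{e1,...,e_{2k}} is A-invariant: the first 2k columns vanish below row 2k.
    print(all(np.allclose(A[2*k:, :2*k], 0) for k in range(1, 4)))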

We say that a complex number λ₀ is an eigenvalue of A if det(λ₀I − A) =
0. Note that we admit nonreal numbers as eigenvalues of the real transfor-
mation A. As before, the set of all eigenvalues of A will be called the
spectrum of A and denoted by σ(A). Since the polynomial det(λI − A) has
real coefficients (as one can see by writing A in matrix form in some basis in
ℝⁿ), it follows that the spectrum of A is symmetric with respect to the real
axis: if λ₀ is an eigenvalue of A, so is λ̄₀, and the multiplicity of λ₀ as a zero
of det(λI − A) is equal to that of λ̄₀.
Not every transformation A: ℝⁿ → ℝⁿ has real eigenvalues. For instance,
in Example 12.1.1 the eigenvalues of A are σ + iτ and σ − iτ. However, if n
is odd, then A must have at least one real eigenvalue. Indeed, det(λI − A)
is a monic polynomial of degree n with real coefficients; hence for n odd
det(λI − A) has real zeros. This implies the following fact (which has already
been observed in the case of Example 12.1.1).

Proposition 12.1.1
If the transformation A: ℝⁿ → ℝⁿ has no real eigenvalues, then A has no
odd-dimensional invariant subspaces.

Proof. If M ⊆ ℝⁿ were an odd-dimensional A-invariant subspace, the
restriction A|_M would have a real eigenvalue, which contradicts the fact that
A has no real eigenvalues. (As in the complex case, the eigenvalues of any
restriction A|_M to an A-invariant subspace are necessarily eigenvalues of
A.) □

The Jordan chains for real transformations are defined in the same way as
for complex transformations: vectors x₀, . . . , x_k ∈ ℝⁿ form a Jordan chain
of the transformation A: ℝⁿ → ℝⁿ corresponding to the eigenvalue λ₀ of A if
x₀ ≠ 0 and Ax₀ = λ₀x₀; Ax_j − λ₀x_j = x_{j−1}, j = 1, . . . , k. The vector x₀ is
called an eigenvector. The eigenvalue λ₀ for which a Jordan chain exists must
obviously be real. Since not every real transformation has real eigenvalues,
it follows that there exist transformations A: ℝⁿ → ℝⁿ without Jordan chains
(and in particular without eigenvectors). On the other hand, for every real
eigenvalue λ₀ of A: ℝⁿ → ℝⁿ there exists an eigenvector (which is any
nonzero vector from Ker(λ₀I − A) ⊆ ℝⁿ). In particular, A has eigenvectors
provided n is odd.
As we have seen (e.g., in Example 12.1.1), not every real transformation
has one-dimensional invariant subspaces. In contrast, two-dimensional in-
variant subspaces always exist, as shown in the following proposition.

Proposition 12.1.2
Any transformation A: ℝⁿ → ℝⁿ with n ≥ 2 has at least one two-dimensional
invariant subspace.

Proof. Assume first that A has a pair of nonreal eigenvalues σ + iτ,
σ − iτ (σ, τ are real, τ ≠ 0). Then

    det[A² − 2σA + (σ² + τ²)I] = 0

Let x ∈ ℝⁿ \ {0} be such that

    [A² − 2σA + (σ² + τ²)I]x = 0    (12.1.1)

Then clearly the subspace M = Span{x, Ax} is A-invariant. Further, M
cannot be one-dimensional because otherwise Ax = μx for some μ ∈ ℝ,
which in view of equality (12.1.1) would imply μ² − 2μσ + (σ² + τ²) = 0, or
(μ − σ)² + τ² = 0, which is impossible since τ ≠ 0.
If A has no nonreal eigenvalues, then (leaving aside the trivial case when
A is a scalar multiple of I) the subspace Span{x, y}, where x and y are
eigenvectors of A corresponding to different eigenvalues, is two-dimensional
and A invariant. □
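The first half of this proof is entirely constructive; the following sketch
(Python with numpy and scipy, our illustration) extracts a two-dimensional
invariant subspace Span{x, Ax} from a null vector of A² − 2σA + (σ² + τ²)I:

    import numpy as np
    from scipy.linalg import null_space

    A = np.array([[0., 1., 0., 0.],
                  [-1., 0., 0., 0.],
                  [0., 0., 0., 2.],
                  [0., 0., -2., 0.]])      # eigenvalues +/- i, +/- 2i: none real

    sigma, tau = 0.0, 1.0                  # target the pair sigma +/- i*tau = +/- i
    Qm = A @ A - 2 * sigma * A + (sigma**2 + tau**2) * np.eye(4)
    x = null_space(Qm)[:, 0]               # nonzero x with Qm @ x = 0
    M = np.column_stack([x, A @ x])        # basis of Span{x, Ax}
    P = M @ np.linalg.pinv(M)
    print(np.allclose(P @ (A @ M), A @ M)) # True: Span{x, Ax} is A-invariant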

It is clear now that Theorem 1.9.1 is generally false for real transfor-
mations. The next result is the real analog of that theorem.

Theorem 12.1.3
Let A: ℝⁿ → ℝⁿ be a transformation and assume that det(λI − A) has exactly
s real zeros (counting multiplicities). Then there exists an orthonormal basis
x₁, . . . , x_n in ℝⁿ such that, with respect to this basis, the transformation A
has the form [a_{ij}]ⁿ_{i,j=1} where all the entries a_{ij} with i > j are zeros except
possibly for a_{s+2,s+1}, a_{s+4,s+3}, . . . , a_{n,n−1}.

So the matrix [a_{ij}]ⁿ_{i,j=1} is "almost" upper triangular.

Proof. Apply induction on n. If A has a real eigenvalue, then use the
proof of Theorem 1.9.1. If A has no real eigenvalues, then pick a two-
dimensional A-invariant subspace M (which exists by Proposition 12.1.2) with
an orthonormal basis x, y. Write A as a 2 × 2 block matrix with respect to
the orthogonal decomposition ℝⁿ = M ⊕ M⊥:

    A = [[A₁₁, A₁₂], [0, A₂₂]]

and apply the induction hypothesis to the transformation A₂₂: M⊥ → M⊥. □
It follows from Theorem 12.1.3 that a transformation A: ℝⁿ → ℝⁿ with
det(λI − A) having s real zeros has a chain of p + 1 = ½(n + s) + 1 invariant
subspaces:

    {0} = M₀ ⊂ M₁ ⊂ ⋯ ⊂ M_p = ℝⁿ

(Observe that n − s is the number of nonreal zeros of det(λI − A). So n − s
and n + s are even numbers.) We leave it to the reader to verify that
½(n + s) + 1 is the maximal number of elements in a chain of A-invariant
subspaces.
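The "almost" upper triangular form of Theorem 12.1.3 is essentially what
numerical libraries call the real Schur form; for instance (a sketch using
scipy, not part of the original text):

    import numpy as np
    from scipy.linalg import schur

    A = np.random.rand(5, 5)
    T, Z = schur(A, output='real')          # A = Z @ T @ Z.T with Z orthogonal
    # T is quasi-upper-triangular: 1 x 1 diagonal blocks for real eigenvalues,
    # 2 x 2 diagonal blocks for complex-conjugate pairs.
    print(np.allclose(Z @ T @ Z.T, A), np.allclose(np.tril(T, -2), 0))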
We say that a transformation A: ℝⁿ → ℝⁿ is self-adjoint if (Ax, y) =
(x, Ay) for every x, y ∈ ℝⁿ. [As usual, (·, ·) stands for the standard scalar
product in ℝⁿ.] In other words, A is self-adjoint if A = A*. Also, a
transformation A is called unitary if A* = A⁻¹ and normal if AA* = A*A.
Note that in an orthonormal basis a self-adjoint transformation is represen-
ted by a symmetric matrix, and a unitary transformation is represented by
an orthogonal matrix. (Recall that a real matrix U is called orthogonal if
UUᵀ = UᵀU = I.)
For normal transformations the "almost" triangular form of Theorem
12.1.3 is actually "almost" diagonal:

Theorem 12.1.4
Let A be as in Theorem 12.1.3 and assume, in addition, that A is normal.
Then there exists an orthonormal basis in ℝⁿ with respect to which A has
the matrix form [a_{ij}]ⁿ_{i,j=1}, where a_{ij} = 0 for i ≠ j except possibly for
a_{s+2,s+1}, a_{s+1,s+2}, . . . , a_{n,n−1}, a_{n−1,n}.

Proof. Use an orthonormal basis in ℝⁿ with the properties described in
Theorem 12.1.3, and observe that the equality A*A = AA* implies that
actually a_{ij} = 0 for i < j except a_{s+1,s+2}, . . . , a_{n−1,n}. □

12.2 ROOT SUBSPACES AND THE REAL JORDAN FORM

Let A: ℝⁿ → ℝⁿ be a transformation. The root subspace R_{λ₀}(A) correspond-
ing to the real eigenvalue λ₀ of A is defined to be Ker(λ₀I − A)ⁿ, as in the
complex case. Then R_{λ₀}(A) is spanned by the members of all Jordan chains
of A corresponding to λ₀. For a pair of nonreal eigenvalues σ + iτ, σ − iτ of
A (here σ, τ are real and τ ≠ 0) the root subspace is defined by

    R_{σ±iτ}(A) = Ker[(A − σI)² + τ²I]^p

where p is a positive integer such that

    Ker[(A − σI)² + τ²I]^p = Ker[(A − σI)² + τ²I]^{p+k}

for every positive integer k.
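Numerically, such a root subspace can be obtained as a null space; a minimal
sketch (Python with numpy/scipy; taking the exponent p = n is always a safe
upper bound):

    import numpy as np
    from scipy.linalg import null_space

    def real_root_subspace(A, sigma, tau):
        # Orthonormal basis of Ker[((A - sigma*I)^2 + tau^2*I)^n] for the
        # conjugate pair sigma +/- i*tau.
        n = A.shape[0]
        M = (A - sigma * np.eye(n)) @ (A - sigma * np.eye(n)) + tau**2 * np.eye(n)
        return null_space(np.linalg.matrix_power(M, n))

    A = np.array([[1., 2., 0.],
                  [-2., 1., 0.],
                  [0., 0., 5.]])            # eigenvalues 1 +/- 2i and 5
    print(real_root_subspace(A, 1.0, 2.0).shape)   # (3, 2): two-dimensional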


Note that, if λ₁, . . . , λ_r are the distinct real eigenvalues of A (if any) and
σ₁ + iτ₁, . . . , σ_s + iτ_s are the distinct eigenvalues of A in the open upper half
of the complex plane (if any), then

    det(λI − A) = (λ − λ₁)^{α₁} ⋯ (λ − λ_r)^{α_r} [(λ − σ₁)² + τ₁²]^{β₁} ⋯ [(λ − σ_s)² + τ_s²]^{β_s}

for some positive integers α₁, . . . , α_r, β₁, . . . , β_s. Using this observation, it
can be proved that there is a direct sum decomposition

    ℝⁿ = R_{λ₁}(A) ∔ ⋯ ∔ R_{λ_r}(A) ∔ R_{σ₁±iτ₁}(A) ∔ ⋯ ∔ R_{σ_s±iτ_s}(A)

(see the remark following the proof of Theorem 2.1.2). Moreover, we have:

Theorem 12.2.1
For every A-invariant subspace M the direct sum decomposition

    M = (M ∩ R_{λ₁}(A)) ∔ ⋯ ∔ (M ∩ R_{λ_r}(A)) ∔ (M ∩ R_{σ₁±iτ₁}(A)) ∔ ⋯ ∔ (M ∩ R_{σ_s±iτ_s}(A))

holds.

For the deeper study of properties of invariant subspaces, the real Jordan
form of a real transformation, to be described in the following theorem, is
most useful. As usual, J_k(λ) denotes the k × k Jordan block with eigenvalue
λ. Also, we introduce the 2l × 2l matrix

    J_l(μ ± iw) = [[K, I₂, 0, . . . , 0], [0, K, I₂, . . . , 0], . . . , [0, . . . , 0, K]],  K = [[μ, w], [−w, μ]]

where μ and w are real numbers with w ≠ 0, and I₂
represents the 2 × 2 identity matrix.

Theorem 12.2.2
For every transformation A: ℝⁿ → ℝⁿ there exists a basis in ℝⁿ in which A has
the following matrix form:

    diag[J_{k₁}(λ₁), . . . , J_{k_p}(λ_p), J_{l₁}(μ₁ ± iw₁), . . . , J_{l_q}(μ_q ± iw_q)]    (12.2.1)

where λ₁, . . . , λ_p and μ₁, . . . , μ_q are real numbers (not necessarily
distinct) and w₁, . . . , w_q are positive. In the representation (12.2.1) the
blocks J_{k_i}(λ_i) and J_{l_j}(μ_j ± iw_j) are uniquely determined by A up to permu-
tation.

The proof of Theorem 12.2.2 will be relegated to the next section.
The right-hand side of equality (12.2.1) is called a real Jordan form of A.
Clearly, λ₁, . . . , λ_p are the real eigenvalues of A, and μ₁ ± iw₁, . . . , μ_q ±
iw_q are the nonreal eigenvalues of A. Given λ₀ ∈ σ(A), λ₀ real, the partial
multiplicities and the algebraic and geometric multiplicity of A correspond-
ing to λ₀ are defined as in the complex case. For a nonreal eigenvalue μ + iw
of A, the partial multiplicities of A corresponding to μ + iw are, by
definition, the half-sizes l_j of the blocks J_{l_j}(μ_j ± iw_j) with μ_j = μ and w_j = ±w.
The number of partial multiplicities of A corresponding to μ + iw is the
geometric multiplicity of μ + iw, and the sum of the partial multiplicities is the
algebraic multiplicity of μ + iw.
By use of the real Jordan form, it is not difficult to prove the following
fact, which we need later.

Proposition 12.2.3
If n is odd, then every transformation A: ℝⁿ → ℝⁿ has an invariant subspace
of any dimension k with 0 ≤ k ≤ n.

Proof. Without loss of generality we can assume that A is given by an
n × n matrix in the real Jordan form. As n is odd, A has a real eigenvalue,
so that blocks J_{k_i}(λ_i) in the real Jordan form (12.2.1) of A are present.
Since the subspaces Span{e₁, . . . , e_j}, j = 1, . . . , k_i are J_{k_i}(λ_i) invariant,
and the subspaces Span{e₁, . . . , e_{2j}}, j = 1, . . . , l_i are J_{l_i}(μ_i ± iw_i) invariant,
we obtain the existence of A-invariant subspaces of any dimension k,
0 ≤ k ≤ n. □

Analogs of the results on spectral and irreducible invariant subspaces
proved in Chapter 2 can be stated and proved for transformations from ℝⁿ
to ℝⁿ. (As in the complex case we say that an A-invariant subspace M is
irreducible if M cannot be represented as a direct sum of two A-invariant
subspaces.) For example, see Theorem 12.2.4.

Theorem 12.2.4
Let A: ℝⁿ → ℝⁿ be a transformation. The following
statements are equivalent for an A-invariant subspace M:
(a) M is irreducible.
(b) Each A-invariant subspace contained in M is irreducible.
(c) The Jordan form of the restriction A|_M is either J_n(λ), λ ∈ ℝ, or (in
case n is even) J_{n/2}(μ ± iw), μ, w ∈ ℝ, w ≠ 0.

(d) There is either a unique eigenvector (up to multiplication by a nonzero real
number) of A in M or (in case A|_M has no eigenvectors) a unique
two-dimensional A-invariant subspace in M.
(e) The lattice of A-invariant subspaces contained in M is a chain.
(f) The spectrum of A|_M is either a singleton {λ₀}, λ₀ ∈ ℝ, or a pair of
nonreal eigenvalues {μ + iw, μ − iw}, and

    dim Ker(λ₀I − A|_M) = 1

in the former case and

    dim Ker[(A|_M − μI)² + w²I] = 2

in the latter case.

The real Jordan form can be used instead of the (complex) Jordan form
to produce results for real transformations analogous to those presented in
Chapters 3 and 4 (with the exception of Proposition 3.1.4). For this purpose
we say that a transformation A: ℝⁿ → ℝⁿ is diagonable if its real Jordan form
has only 1 × 1 blocks J₁(λ_j), λ₁, . . . , λ_p ∈ ℝ, or 2 × 2 blocks J₁(μ_j ± iw_j),
j = 1, . . . , q. Also, we use the fact that the Jordan form of the transfor-
mation A: ℝⁿ → ℝⁿ with the real Jordan form (12.2.1) is

    diag[J_{k₁}(λ₁), . . . , J_{k_p}(λ_p), J_{l₁}(μ₁ + iw₁), J_{l₁}(μ₁ − iw₁), . . . , J_{l_q}(μ_q + iw_q), J_{l_q}(μ_q − iw_q)]

12.3 COMPLEXIFICATION AND PROOF OF THE REAL JORDAN FORM

We describe here a standard method for constructing a transformation
ℂⁿ → ℂⁿ from a given transformation ℝⁿ → ℝⁿ with similar spectral proper-
ties. In many cases this method allows us to obtain results on real transfor-
mations from the corresponding results on complex transformations. In
particular, it is used in the proof of Theorem 12.2.2.
Let A: ℝⁿ → ℝⁿ be a transformation. Define the complexification
A^c: ℂⁿ → ℂⁿ of A as follows: A^c(x + iy) = Ax + iAy, where x, y ∈ ℝⁿ. Obvi-
ously, A^c is a linear transformation. If A is given by an n × n matrix in some
basis in ℝⁿ, then this same basis may be considered as a basis in ℂⁿ and A^c is
given by the same matrix. It is clear from this observation that the
eigenvalues and the corresponding partial multiplicities of A and of A^c are
the same.
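For instance, the conjugate symmetry of the spectrum noted in Section 12.1 is
immediate for the complexification, since A^c is given by the same real matrix
(a small numpy check, ours rather than the book's):

    import numpy as np

    A = np.random.rand(5, 5)          # a real transformation, viewed as A^c on C^5
    ev = np.linalg.eigvals(A)
    # The spectrum is symmetric with respect to the real axis:
    print(np.allclose(np.sort_complex(ev), np.sort_complex(ev.conj())))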
Let M be a subspace in ℝⁿ. Then M + iM = {x + iy | x, y ∈ M} is a
subspace in ℂⁿ. Moreover, if M is A invariant, then M + iM is easily seen to
be A^c invariant.
We need the following basic connection between the invariant subspaces
of a real transformation and the invariant subspaces of its complexification.

Theorem 12.3.1
Assume that the transformation A: ℝⁿ → ℝⁿ does not have real eigenvalues.
Let R₊ ⊆ ℂⁿ be the spectral subspace of A^c corresponding to the eigenvalues
in the open upper half plane. Then for every A-invariant subspace L (⊆ ℝⁿ)
the subspace (L + iL) ∩ R₊ is A^c invariant and contained in R₊. Converse-
ly, for every A^c-invariant subspace M ⊆ R₊ there exists a unique A-invariant
subspace L such that (L + iL) ∩ R₊ = M.

Proof. The direct statement of Theorem 12.3.1 has already been ob-
served. To prove the converse statement, let M ⊆ R₊ be an A^c-invariant
subspace. Fix a basis z₁, . . . , z_k in M, and write z_j = x_j + iy_j, j = 1, . . . , k,
where x_j, y_j ∈ ℝⁿ. Put L = Span{x₁, . . . , x_k, y₁, . . . , y_k} ⊆ ℝⁿ. Let us
check that L is A invariant. Indeed, for each j, A^c z_j is a linear combination
(with complex coefficients) of z₁, . . . , z_k, say

    A^c z_j = Σ_{p=1}^k α_p^{(j)} z_p    (12.3.1)

Letting α_p^{(j)} = β_p^{(j)} + iγ_p^{(j)}, where β_p^{(j)} and γ_p^{(j)} are real, use the definition of
A^c to rewrite (12.3.1) in the form

    Ax_j + iAy_j = Σ_{p=1}^k (β_p^{(j)} + iγ_p^{(j)})(x_p + iy_p)

After separation of real and imaginary parts, these equations clearly imply
that L is A invariant. Further, it is easily seen that

    L + iL = M + M̄

where M̄ = Span{z̄₁, . . . , z̄_k} and z̄_j = x_j − iy_j, j = 1, . . . , k. Equality (12.3.1) implies that the subspace
M̄ = Span{z̄₁, . . . , z̄_k} is A^c invariant and

This statement is easily verified by letting z₁, . . . , z_k be a Jordan basis for
A^c|_M, for example. As M ⊆ R₊, we have M̄ ⊆ R₋, where R₋ is the spectral
subspace of A^c corresponding to the eigenvalues in the open lower half
plane. Now

Hence

    (L + iL) ∩ R₊ = (M + M̄) ∩ R₊ = M

It remains to prove the uniqueness of L. Let L′ be another A-invariant
subspace such that

    (L′ + iL′) ∩ R₊ = M    (12.3.2)

For a given subspace N ⊆ ℂⁿ, define its complex conjugate:

    N̄ = { x − iy : x + iy ∈ N, x, y ∈ ℝⁿ }

Obviously, N̄ is also a subspace in ℂⁿ. We have (L′ + iL′)‾ = L′ + iL′. Also,
it is easy to check (e.g., by taking complex conjugates of a Jordan basis in
R₊ for A^c|_{R₊}) that R̄₊ = R₋. Taking complex conjugates in (12.3.2), we
have

    (L′ + iL′) ∩ R₋ = M̄

and

    L′ + iL′ = M + M̄ = L + iL

As L + iL = {x + iy | x, y ∈ L}, and similarly for L′ + iL′, the equality of
L′ and L follows. □

The proof shows that Theorem 12.3.1 remains valid if the subspace R₊ is
replaced by the spectral subspace of A^c corresponding to any set S of
eigenvalues of A^c such that λ₀ ∈ S implies λ̄₀ ∉ S and S is maximal with
respect to this property.
We pass now to the proof of Theorem 12.2.2. First, let us observe that in
terms of matrices Theorem 12.2.2 can be restated as follows.

Theorem 12.3.2
Given an n × n matrix A whose entries are real numbers, there exists an
invertible n × n matrix S with real entries such that

    S⁻¹AS = diag[J_{k₁}(λ₁), . . . , J_{k_p}(λ_p), J_{l₁}(μ₁ ± iw₁), . . . , J_{l_q}(μ_q ± iw_q)]    (12.3.3)

where λ_i, μ_j, and w_j are as in Theorem 12.2.2. The right-hand side of
(12.3.3) is uniquely determined by A up to permutations of blocks J_{k_i}(λ_i) and
J_{l_j}(μ_j ± iw_j).
We now prove the result in the latter form. The Jordan form for
transformations from <p" into (p" is used in the proof.

Proof. Let A^c be the complexification of A. Let R_{λ₀}(A^c) ⊆ ℂⁿ be the
root subspace of A^c corresponding to a real eigenvalue λ₀. As the matrices
(A^c − λ₀I)^l, l = 0, 1, 2, . . . have real entries, there exists a basis in each
subspace Ker(A^c − λ₀I)^l ⊆ ℂⁿ that consists of n-dimensional vectors with
real coordinates. (Here, we use the fact that vectors x₁, . . . , x_k ∈ ℝⁿ
are linearly independent over ℝ if and only if they are linearly indepen-
dent over ℂ.) Further, if m is such that Ker(A^c − λ₀I)^m = R_{λ₀}(A^c) but
Ker(A^c − λ₀I)^{m−1} ≠ R_{λ₀}(A^c), then, by using the same fact, we see that there
is a basis in R_{λ₀}(A^c) modulo Ker(A^c − λ₀I)^{m−1} consisting of real vectors. We
can now repeat the arguments from the proof of the Jordan form to show
that there exists a basis in R_{λ₀}(A^c) consisting of Jordan
chains of A^c (in short, a Jordan basis) with real coordinates.
Further, let x_{i1}, . . . , x_{i,m_i}; i = 1, . . . , p be a Jordan basis in R_{λ₀}(A^c),
where λ₀ is a nonreal eigenvalue of A^c (so for each i the vectors
x_{i1}, . . . , x_{i,m_i} form a Jordan chain of A^c corresponding to λ₀). By taking
complex conjugates in the equalities

    A^c x_{ij} = λ₀x_{ij} + x_{i,j−1},  j = 1, . . . , m_i

(by definition, x_{i0} = 0) and using the fact that A^c is given by a real matrix in
the standard basis, we see that

    x̄_{i1}, . . . , x̄_{i,m_i};  i = 1, . . . , p    (12.3.4)

are Jordan chains of A^c corresponding to λ̄₀. The vectors (12.3.4)
inherit linear independence from the vectors x_{ij}. Further, dim R_{λ̄₀}(A^c) =
dim R_{λ₀}(A^c) (because the algebraic multiplicities of A^c at λ₀ and at λ̄₀ are
the same); hence the vectors (12.3.4) form a basis in R_{λ̄₀}(A^c).
Putting together Jordan bases for each R_{λ₀}(A^c), where λ₀ ∈ ℝ ∩ σ(A^c),
which consist of vectors with real coordinates, and Jordan bases for each
pair of subspaces R_{λ₀}(A^c) and R_{λ̄₀}(A^c) (where λ₀ is a nonreal eigenvalue of
A^c) that are obtained from each other by complex conjugation, we obtain
the following equality:

Here A,, . . . , A p are real numbers, A p + I , . . . , \p+q are nonreal numbers


(which can be assumed to have positive imaginary parts), and R is an
invertible n x n matrix that, when partitioned according to the sizes of
Jordan blocks in the right-hand side of (12.3.5), say

has the property that are real and

aand consder the

One checks easily that U- is unitary, that is, UjU* = /, and that

and fjij and wy are the real and imaginary parts of A p + / , repectively (see the
paragraph preceding Theorem 12.2.2 for the definition of .//(/A,, vv ; )). Also,
it is easily seen that the matrix

has real entries. Multiplying (12.3.5) from the right by

and denoting the real invertible matrix RU by Q, we have

and formula (12.3.3) follows.


The uniqueness of the right-hand side of (12.3.3) follows from the
uniqueness of the Jordan form of A^c. [Indeed, the right-hand side of (12.3.3)
is uniquely determined by the eigenvalues and partial multiplicities of A^c.] □

12.4 COMMUTING MATRICES

Let A be an n × n matrix with real entries. In this section we study the


general form of real matrices that commute with A. This result is applied in
the next section to characterize the lattice of hyperinvariant subspaces of a
real transformation.
In view of Theorem 12.2.2, we can assume that

    A = diag[J₁, . . . , J_u]    (12.4.1)

where each J_α is either a Jordan block of size m_α × m_α with real eigenvalue
λ_α, or J_α = J_{m_α/2}(μ_α ± iw_α) (in the notation introduced before Theorem
12.2.2). Let Z be a real matrix such that AZ = ZA. Partition Z according to
(12.4.1): Z = [Z_{αβ}]^u_{α,β=1}, where Z_{αβ} is an m_α × m_β real matrix. Then we
have

    J_α Z_{αβ} = Z_{αβ} J_β    (12.4.2)

If σ(J_α) ∩ σ(J_β) = ∅, then equation (12.4.2) has only the trivial solution
Z_{αβ} = 0 (Corollary 9.1.2). Assume σ(J_α) = σ(J_β) = {λ₀}, where λ₀ is real.
Then, as in the proof of Theorem 9.1.1, Z_{αβ} is an upper triangular Toeplitz
matrix.
To study the case σ(J_α) = σ(J_β) = {μ₀ + iw₀, μ₀ − iw₀}, it is convenient
to first verify the following lemma.

Lemma 12.4.1
Let

    K = [[μ, w], [−w, μ]]

be a 2 × 2 matrix with real μ, w such that w ≠ 0. Then
the system of equations

    KA + C = AK,  KC = CK

for unknown 2 × 2 matrices A and C implies C = 0.

The lemma is verified by a direct computation after writing out the
entries of A and C explicitly.
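The "direct computation" can be delegated to a computer algebra system; here
is one way to do it (a sympy sketch, assuming nothing beyond the lemma's
statement):

    import sympy as sp

    mu, w = sp.symbols('mu w', real=True)
    K = sp.Matrix([[mu, w], [-w, mu]])
    A = sp.Matrix(2, 2, sp.symbols('a0:4'))
    C = sp.Matrix(2, 2, sp.symbols('c0:4'))

    # Solve K*A + C = A*K together with K*C = C*K (linear in the 8 unknowns).
    eqs = list(K * A + C - A * K) + list(K * C - C * K)
    sol = sp.solve(eqs, list(A) + list(C), dict=True)
    # Generically (w != 0) every c-entry is forced to vanish:
    print([{k: v for k, v in s.items() if str(k).startswith('c')} for s in sol])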
Now return to the case

    σ(J_α) = σ(J_β) = {μ₀ + iw₀, μ₀ − iw₀}

in equations (12.4.2), letting K = [[μ₀, w₀], [−w₀, μ₀]] and writing Z_{αβ} = [U_{ij}]
as a matrix of 2 × 2 blocks U_{ij}.

Comparing the block entries (m_α/2, 1) and then (m_α/2 − 1, 1) in this
equation, we obtain

By Lemma 12.4.1, U_{m_α/2,1} = 0. Now compare the block entries in positions
(m_α/2 − 1, 1) and (m_α/2 − 2, 1), and reapplying Lemma 12.4.1, it follows
that U_{m_α/2−1,1} = 0. Continuing in this way, it is found that

    (12.4.4)

Equality (12.4.4) implies that for j = 1, . . . , p/2, KU_{jj} = U_{jj}K, and for

In view of Lemma 12.4.1, U₁₁ = U₂₂ = ⋯ = U_{p/2,p/2}; hence U_{j−1,j} commutes
with K for j = 2, . . . , p/2. Further, KU_{j−2,j} + U_{j−1,j} = U_{j−2,j−1} + U_{j−2,j}K
for j = 3, . . . , p/2. Using Lemma 12.4.1 again, U_{j−1,j} = U_{j−2,j−1} and
KU_{j−2,j} = U_{j−2,j}K. Continuing in this way, we find that U_{ij} (i ≤ j) depends
only on the difference between j and i and commutes with K. Because of the
latter property U_{ij} must have the form

    U_{ij} = [[a, b], [−b, a]]

for some real numbers a and b (which depend, of course, on i and j).
Putting all the above information together, we arrive at the following
description of all real matrices that commute with a given real n × n matrix
A.

Theorem 12.4.2
Let A be an n × n matrix with the real Jordan form diag[J₁, . . . , J_u], so

    SAS⁻¹ = diag[J₁, . . . , J_u]

for some invertible real n × n matrix S, where each J_α is either a Jordan block
of size m_α × m_α with real eigenvalue or a matrix of type

    J_{m_α/2}(μ_α ± iw_α)

with real μ_α, w_α and w_α > 0. Then every real n × n matrix X that commutes
with A has the form X = S⁻¹ZS, where the matrix Z = [Z_{αβ}]^u_{α,β=1}, partitioned
conformally with the Jordan form diag[J₁, . . . , J_u], has the following struc-
ture. If σ(J_α) ∩ σ(J_β) = ∅, then Z_{αβ} = 0. If σ(J_α) = σ(J_β) = {λ₀}, λ₀ real, then
Z_{αβ} is a real upper triangular Toeplitz matrix. If

    σ(J_α) = σ(J_β) = {μ + iw, μ − iw}

where μ and w > 0 are real, then again Z_{αβ} is a block upper triangular Toeplitz
matrix, and in this case

where the 2 × 2 blocks of Z_{αβ} have the form

    [[u, v], [−v, u]]

for some real numbers u and v.
12.5 HYPERINVARIANT SUBSPACES

Let A: ℝⁿ → ℝⁿ be a transformation. A subspace M ⊆ ℝⁿ is called A
hyperinvariant if M is invariant for every transformation X: ℝⁿ → ℝⁿ that
commutes with A. It is easily seen that the set of all A-hyperinvariant
subspaces is a lattice. In this section we obtain another characterization of
this lattice, one that is analogous to Theorem 9.4.2. The description of
commuting matrices obtained in Theorem 12.4.2 is used in the proof.
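Hyperinvariance can be tested mechanically: compute a basis of the commutant
{X : AX = XA} from the null space of the Sylvester operator, then check
invariance under each basis element. A sketch (Python with numpy/scipy; the
function names are ours):

    import numpy as np
    from scipy.linalg import null_space

    def commutant_basis(A):
        # vec(AX - XA) = (I kron A - A^T kron I) vec(X), column-major vec
        n = A.shape[0]
        K = np.kron(np.eye(n), A) - np.kron(A.T, np.eye(n))
        return [v.reshape(n, n, order='F') for v in null_space(K).T]

    def is_hyperinvariant(A, B):
        # Is the column span of B invariant under every X commuting with A?
        P = B @ np.linalg.pinv(B)
        return all(np.allclose(P @ (X @ B), X @ B) for X in commutant_basis(A))

    A = np.diag([1., 1., 2.])
    print(is_hyperinvariant(A, np.eye(3)[:, :2]))  # True:  Ker(A - I)
    print(is_hyperinvariant(A, np.eye(3)[:, :1]))  # False: invariant, not hyper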

Theorem 12.5.1
Let a transformation A: ℝⁿ → ℝⁿ have the minimal polynomial

    Π_{j=1}^k (λ − λ_j)^{r_j} Π_{j=1}^m [(λ − μ_j)² + w_j²]^{s_j}

where λ_j, μ_j, and w_j are real and w_j > 0; λ₁, . . . , λ_k are distinct, and so are
μ₁ + iw₁, . . . , μ_m + iw_m. Then the lattice of all A-hyperinvariant subspaces
coincides with the smallest lattice Γ_A of subspaces in ℝⁿ that contains
Ker(A − λ_jI)^p, Im(A − λ_jI)^p for p = 1, . . . , r_j; j = 1, . . . , k, and Ker[(A −
μ_jI)² + w_j²I]^p, Im[(A − μ_jI)² + w_j²I]^p for p = 1, . . . , s_j; j = 1, . . . , m.

We consider first a particular case of Theorem 12.5.1, when the spec-
trum of A consists only of one pair of nonreal eigenvalues μ + iw, μ − iw
(μ, w ∈ ℝ, w ≠ 0). Let

    f₁^{(1)}, . . . , f_{2p₁}^{(1)}; . . . ; f₁^{(m)}, . . . , f_{2p_m}^{(m)}    (12.5.1)

be a Jordan basis in ℝⁿ, where p₁ ≥ ⋯ ≥ p_m, so that, in this basis, A is
represented by the matrix

    diag[J_{p₁}(μ ± iw), . . . , J_{p_m}(μ ± iw)]
The following lemma is an analog of Lemma 9.5.2.

Lemma 12.5.2
Every A-hyperinvariant subspace is of the form

    H^1_{q₁} + H^2_{q₂} + ⋯ + H^m_{q_m},  where H^r_q = Span{f₁^{(r)}, . . . , f_{2q}^{(r)}}    (12.5.2)

and q₁ ≥ ⋯ ≥ q_m is a nonincreasing sequence of nonnegative integers such
that p₁ − q₁ ≥ ⋯ ≥ p_m − q_m ≥ 0.

If q_i = 0 for some i, then, of course, H^i_{q_i} is interpreted as the zero
subspace. We see later that, conversely, every subspace of the form (12.5.2)
is A hyperinvariant.

Proof. Let L be a nonzero A-hyperinvariant subspace, and let x ∈ L.
Write x as a linear combination of the vectors (12.5.1):

    x = Σ_{r=1}^m Σ_{i=1}^{2p_r} ξ_i^{(r)} f_i^{(r)}

We claim that each vector y_r = Σ_{i=1}^{2p_r} ξ_i^{(r)} f_i^{(r)} belongs to L. Indeed, let P_r be
the projector on H^r_{p_r} defined by P_r f_i^{(s)} = 0 for s ≠ r and P_r f_i^{(r)} = f_i^{(r)} for
i = 1, . . . , 2p_r. It follows from Theorem 12.4.2 that P_r commutes with A.
Hence L is P_r invariant, and y_r = P_r x ∈ L.

Fix an integer r between 1 and m and denote by α the maximal index i of
a nonzero coefficient ξ_i^{(r)} (i = 1, . . . , 2p_r). Without loss of generality, we
can assume that α = 2β is even (otherwise consider Ax in place of x). Let us
show that all the vectors f₁^{(r)}, . . . , f_α^{(r)} belong to L. Indeed, the vectors
z₁ and z₂ = Az₁, where

belong to L and also to Span{f₁^{(r)}, f₂^{(r)}}. Now z₁ and z₂ are not
collinear; otherwise, A would have a real eigenvalue, and this is impossible.
It follows that Span{z₁, z₂} = Span{f₁^{(r)}, f₂^{(r)}}, and hence f₁^{(r)}, f₂^{(r)} ∈ L. If
we already know that f₁^{(r)}, . . . , f_{2i−2}^{(r)} ∈ L for some i ≥ 2, then by a similar
argument using the vectors z_{2i−1} and z_{2i} = Az_{2i−1}, where

we find that f_{2i−1}^{(r)}, f_{2i}^{(r)} ∈ L. For i = β we have f₁^{(r)}, . . . , f_α^{(r)} ∈ L.
As the vector x ∈ L was arbitrary, it follows that L = H^1_{q₁} + ⋯ + H^m_{q_m}
for some integers q_i such that 0 ≤ q_i ≤ p_i, i = 1, . . . , m. To prove that
q₁ ≥ ⋯ ≥ q_m, we must show that H^r_α ⊆ L implies H^{r−1}_α ⊆ L. Consider the
transformation B: ℝⁿ → ℝⁿ that, in the basis (12.5.1), has the block matrix
form B = [X_{ij}]^m_{i,j=1}, where X_{ij} is the 2p_i × 2p_j zero matrix (i, j = 1, . . . , m),
except for

Theorem 12.4.2 ensures that B commutes with A. Hence L is B invariant,
and f_i^{(r−1)} = Bf_i^{(r)} ∈ L, i = 1, . . . , 2α. In other words, H^{r−1}_α ⊆ L.
Further, consider the transformation C: ℝⁿ → ℝⁿ that, in the basis
(12.5.1), has the block matrix form C = [Y_{ij}]^m_{i,j=1}, where Y_{ij} is the 2p_i × 2p_j
zero matrix except for

Then by Theorem 12.4.2, C commutes with A, and assuming 2q_r >
2(p_r − p_{r+1}), we have

This implies 2q_r − 2(p_r − p_{r+1}) ≤ 2q_{r+1}, or p_r − q_r ≥ p_{r+1} − q_{r+1}. If q_r ≤
p_r − p_{r+1}, then the inequality p_r − q_r ≥ p_{r+1} − q_{r+1} is obvious. □

We are now in a position to prove Theorem 12.5.1 for the case σ(A) =
{μ + iw, μ − iw}. As in the proof of Theorem 9.4.2, one shows that every
subspace of the form Ker[(A − μI)² + w²I]^k or Im[(A − μI)² + w²I]^k is A
hyperinvariant. So we have only to show that every A-hyperinvariant
subspace L belongs to the lattice Γ_A. By Lemma 12.5.2

    L = H^1_{q₁} + ⋯ + H^m_{q_m}    (12.5.3)

for some sequence of integers q₁, . . . , q_m such that q₁ ≥ ⋯ ≥ q_m ≥ 0 and
p₁ − q₁ ≥ p₂ − q₂ ≥ ⋯ ≥ p_m − q_m ≥ 0. We prove that L ∈ Γ_A by induction
on q₁. Assume first q₁ = 1. Then L = H^1₁ + ⋯ + H^t₁ for some t ≤ m. As
p_t ≥ p_{t+1}, we have

Now assume that the inclusion L ∈ Γ_A is proved for q₁ = ν − 1, and let L be
a subspace of the form (12.5.3) with q₁ = ν. Let r and a be the maximal integers
for which q₁ = ⋯ = q_r and p_a − p_r + ν > 0. Consider the subspace

It is easily seen that

The inequalities p_i − q_i ≥ p_{i+1} − q_{i+1} imply that M ⊆ L. Further, the sub-
space

is A hyperinvariant, and since

the induction hypothesis ensures that N ∈ Γ_A. Finally, L = M + N belongs
to Γ_A as well.
We have proved Theorem 12.5.1 for the case when the spectrum of A
consists of exactly one pair of nonreal eigenvalues. As the proof shows, the
converse statement of Lemma 12.5.2 is also true: every subspace of the form
(12.5.2) is A hyperinvariant.

Proof of Theorem 12.5.1 (the general case). Again, it is easily seen that
each subspace Ker(A − λ_jI)^k, Im(A − λ_jI)^k, Ker[(A − μ_jI)² + w_j²I]^k,
Im[(A − μ_jI)² + w_j²I]^k is A hyperinvariant. So we must show that each
A-hyperinvariant subspace belongs to Γ_A. Let M be an A-hyperinvariant
subspace. By Theorem 12.2.1 we have

    M = (M ∩ R_{λ₁}(A)) ∔ ⋯ ∔ (M ∩ R_{λ_k}(A)) ∔ (M ∩ R_{μ₁±iw₁}(A)) ∔ ⋯ ∔ (M ∩ R_{μ_m±iw_m}(A))

Write A in the real Jordan form (as in Theorem 12.2.2) and use Theorem
12.4.2 to deduce that each intersection M ∩ R_{λ_j}(A) is A|_{R_{λ_j}(A)} hyperinvariant
and each M ∩ R_{μ_j±iw_j}(A) is A|_{R_{μ_j±iw_j}(A)} hyperinvariant (j = 1, . . . , m). With the use
of Theorem 9.4.2, it follows that M ∩ R_{λ_j}(A) belongs to the smallest lattice
that contains the subspaces

and

Similarly, by the part of the theorem already proved, we find that M ∩
R_{μ_j±iw_j}(A) belongs to the smallest lattice that contains the subspaces

It follows that M ∈ Γ_A, and Theorem 12.5.1 is proved completely. □

12.6 REAL TRANSFORMATIONS WITH THE SAME INVARIANT SUBSPACES

In this section we describe transformations B: ℝⁿ → ℝⁿ that have the
same invariant subspaces as a given transformation A: ℝⁿ → ℝⁿ. This de-
scription is a real analog of Theorem 10.2.1.
By Theorem 12.2.2, we can assume that, in a certain basis in ℝⁿ, A has
the matrix form

    A = diag[A₁, . . . , A_p, A′₁, . . . , A′_q]    (12.6.1)

where

    A_i = diag[J_{k_{i1}}(λ_i), . . . , J_{k_{i,m_i}}(λ_i)]

with different real numbers λ₁, . . . , λ_p, and

    A′_j = diag[J_{l_{j1}}(μ_j ± iw_j), . . . , J_{l_{j,n_j}}(μ_j ± iw_j)]

with different complex numbers μ₁ + iw₁, . . . , μ_q + iw_q in the open upper
half plane. We use the notation introduced in Section 12.2, and also assume
that k_{i1} ≥ ⋯ ≥ k_{i,m_i} and l_{j1} ≥ ⋯ ≥ l_{j,n_j}.
Now introduce the following notation (partly used in Section 10.2): given
real numbers a₀, . . . , a_{s−1}, denote by T_s(a₀, . . . , a_{s−1}) the s × s upper
triangular Toeplitz matrix

    T_s(a₀, . . . , a_{s−1}) = [[a₀, a₁, . . . , a_{s−1}], [0, a₀, . . . , a_{s−2}], . . . , [0, . . . , 0, a₀]]    (12.6.2)

Further, for positive integers s ≤ t let

    (12.6.3)

where F is a real (t − s) × (t − s) upper triangular matrix

    (12.6.4)

Similarly, if a₀, . . . , a_{s−1} are real 2 × 2 matrices, we
define the 2s × 2s upper triangular Toeplitz matrix T_s^{2×2}(a₀, . . . , a_{s−1}) by
the same formula (12.6.2). If, in addition, the real 2 × 2 matrices f_{jk}
(1 ≤ j < k ≤ t − s) are given, denote by U_t^{2×2}(a₀, . . . , a_{s−1}; F) the 2t × 2t
matrix given by (12.6.3) with F given by (12.6.4). By definition, for s = t we
have

and

We can now give a description of all transformations B: ℝⁿ → ℝⁿ with the
same invariant subspaces as A.

Theorem 12.6.1
Let the transformation A: ℝⁿ → ℝⁿ be given by (12.6.1), in some basis in ℝⁿ.
Then a transformation B: ℝⁿ → ℝⁿ has the same invariant subspaces as A if
and only if B has the following matrix form (in the same basis):

where

for some real numbers b₀^{(i)}, . . . , b_{k_{i1}−1}^{(i)} with b₁^{(i)} ≠ 0 and some (k_{i1} − k_{i2}) ×
(k_{i1} − k_{i2}) matrix F^{(i)};

for some 2 × 2 real blocks

with f₁^{(j)} ≠ 0 and det c₂^{(j)} ≠ 0, and some 2(l_{j1} − l_{j2}) × 2(l_{j1} − l_{j2}) real matrix
G^{(j)}. Moreover, the real numbers b₀^{(1)}, . . . , b₀^{(p)} are different, and the com-
plex numbers are different as well.

For the proof of Theorem 12.6.1, we refer the reader to Soltan (1974).
For the proof of Theorem 12.6.1, we refer the reader to Soltan (1974).

12.7 EXERCISES

12.1 Prove that the transformation of rotation through an angle φ:

    [[cos φ, −sin φ], [sin φ, cos φ]]: ℝ² → ℝ²

has no nontrivial invariant subspaces except when φ is an integer
multiple of π.

12.2 Give an example of a transformation A: ℝ²ⁿ → ℝ²ⁿ such that A has
no eigenvectors but A² has a basis of eigenvectors in ℝ²ⁿ.
12.3 Show that if A: ℝⁿ → ℝⁿ is such that A² has an eigenvector corre-
sponding to a nonnegative eigenvalue λ₀, then A has an eigenvector
as well.
12.4 Show that if A: ℝⁿ → ℝⁿ is a transformation with det A < 0, then A
has at least two distinct real eigenvalues.
12.5 Find the real Jordan form of the n × n matrix

Find all the invariant subspaces in ℝⁿ of this matrix.

12.6 Describe the real Jordan form and all invariant subspaces in ℝ³ of the
3 × 3 real circulant matrix

12.7 Find the real Jordan form of an n × n real circulant matrix

12.8 Find the real Jordan form and all invariant subspaces in ℝⁿ of the
real companion matrix

assuming that the polynomial

has n distinct complex zeros.
12.9 What is the real Jordan form of a real n × n companion matrix?

12.10 Find the real Jordan form and all invariant subspaces in ℝⁿ of the
matrix

where
12.11 Two linear matrix polynomials λA₁ + B₁ and λA₂ + B₂ with real
matrices A₁, B₁, A₂, and B₂ are called strictly equivalent (over ℝ) if
there exist invertible real matrices P and Q such that P(λA₁ +
B₁)Q = λA₂ + B₂. Prove the following result on the canonical form
for strict equivalence (over ℝ) (the real analog of Theorem
A.7.3). A real linear matrix polynomial λA + B is strictly equivalent
(over ℝ) to a real linear polynomial of the type

    (1)

where 0 is the p × q zero matrix; L_ε is the ε × (ε + 1) matrix

M_ε is the transpose of L_ε; λ₁, . . . , λ_u are real numbers;

with

and μ_j, ω_j are real numbers with ω_j > 0 for
j = 1, . . . , v. Moreover, the form (1) is uniquely determined by
λA + B up to permutations of blocks. (Hint: In the proof of
Theorem A.7.3 use the real Jordan form in place of the complex
Jordan form.)

12.12 Prove the following analog of the Brunovsky canonical form for real
transformations. Two pairs of transformations (A₁, B₁) and (A₂, B₂),
where A_i: ℝⁿ → ℝⁿ and B_i: ℝᵐ → ℝⁿ, are called block similar if there exist invert-
ible transformations M: ℝⁿ → ℝⁿ and N: ℝᵐ → ℝᵐ and a transfor-
mation F: ℝⁿ → ℝᵐ such that

Prove that every pair of transformations (A, B) is block
similar to a pair [A₀, B₀] of the following form (written as
matrices with respect to certain bases in ℝᵐ and ℝⁿ):

where J is a matrix in the real Jordan form; B₀ has all zero entries
except for the entries

and these exceptional entries are equal to 1. (Hint: Use Exercise
12.11 in the proof of the Brunovsky canonical form.)
12.13 Let (A, B), where A: ℝⁿ → ℝⁿ and B: ℝᵐ → ℝⁿ, be a full-range pair of transfor-
mations. Prove that given a sequence S = {λ₁, . . . , λ_n} of n (not
necessarily distinct) complex numbers such that λ₀ ∈ S implies λ̄₀ ∈ S
and λ₀ appears in S exactly as many times as λ̄₀, there exists a
transformation F: ℝⁿ → ℝᵐ such that λ₁, . . . , λ_n are the eigenvalues
of A + BF (counted with multiplicities). (Hint: Use Exercise 12.12.)
Notes to Part 2
Chapter 9. The first two sections contain standard material in linear
algebra [see, e.g., Gantmacher (1959)]. Theorem 9.3.1 is due to Laffey
(1978) and Guralnick (1979). The proof presented here follows Choi,
Laurie, and Radjavi (1981). Theorem 9.4.2 appears in Soltan (1976) and
Fillmore, Herrero, and Longstaff (1977). Our expositions of Theorem 9.4.2
and Section 9.6 follow the latter paper.
Chapter 10. The results and proofs of this chapter are from Soltan
(1973b).
Chapter 11. Theorem 11.2.1 is a well-known result (Burnside's
theorem). It may be found in books on general algebra [see, e.g., Jacobson
(1953)] but generally not in books on linear algebra. In the proof of
Theorem 11.2.1 we follow the exposition from Chapter 8 in Radjavi and
Rosenthal (1973). Other proofs are also available [see Jacobson (1953);
Halperin and Rosenthal (1980); E. Rosenthal (1984)]. Example 11.4.6 and
Theorem 11.4.4 are from Halmos (1971). In the proof of Theorem 11.5.1 we
are following Radjavi and Rosenthal (1973).
Chapter 12. The real Jordan form is a standard result, although not so
frequently included in books on linear algebra as the (complex) Jordan
form. The real Jordan form can be found in Lancaster and Tismenetsky
(1985), for instance. The proof of Theorem 12.5.1 is taken from Soltan (1981).

Part Three

Topological
Properties of
Invariant Subspaces
and Stability
There are a number of practical problems in which it is necessary to obtain
an invariant subspace of a transformation or a matrix by numerical methods.
In practice, numerical computation can be performed with only a finite
degree of precision and, in addition, the data for a problem will generally be
imprecise. In this situation, the best that we can hope to do is to obtain an
invariant subspace of a transformation that is close to the one we really have
in mind. However, simple examples show that although two transformations
may be close (in any reasonable sense), their invariant subspaces can be
completely different. This leads us to the problem of identifying all invariant
subspaces of a given transformation that are "stable" under small pertur-
bations of the transformation—that is, to identify those invariant subspaces
for which the perturbed transformation will have a "close" or "neighbour-
ing" invariant subspace, in an appropriate sense.
To develop these ideas, we must introduce a measure of distance between
subspaces and analyze further the structure of the invariant subspaces of a
given transformation. This is done in Part 3, together with descriptions of
stable invariant subspaces, using different notions of stability.
This machinery is then applied to the study of stability of divisors of
polynomial and rational matrix functions and other problems. The reader
whose interest is confined to the applications of Chapter 17 needs only to
study the material presented in Chapter 13, Section 14.3, and Chapter 15.

Chapter Thirteen

The Metric Space of Subspaces

This chapter is of an auxiliary character. We set forth the basic facts about
the topological properties of the set of subspaces in ℂⁿ. Observe that all the
results and proofs of this chapter hold for the set of subspaces in ℝⁿ as well.

13.1 THE GAP BETWEEN SUBSPACES

We consider ℂⁿ endowed with the standard scalar product. If
x = ⟨x₁, . . . , x_n⟩ and y = ⟨y₁, . . . , y_n⟩, then (x, y) = Σ_{i=1}^n x_i ȳ_i, and the cor-
responding norm is

    ‖x‖ = (x, x)^(1/2) = (Σ_{i=1}^n |x_i|²)^(1/2)

The norm of an n × n matrix A (or a transformation A: ℂⁿ → ℂⁿ) is defined
accordingly:

    ‖A‖ = max_{‖x‖=1} ‖Ax‖

Now we introduce a concept that serves as a measure of distance between
subspaces. The gap between subspaces L and M (in ℂⁿ) is defined as

    θ(L, M) = ‖P_L − P_M‖    (13.1.1)

where P_L and P_M are the orthogonal projectors on L and M, respectively. It
is clear from the definition that θ(L, M) is a metric in the set of all
subspaces in ℂⁿ; that is, θ(L, M) enjoys the following properties: (a)
θ(L, M) ≥ 0, and θ(L, M) = 0 if and only if L = M; (b) θ(L, M) =
θ(M, L); (c) θ(L, M) ≤ θ(L, N) + θ(N, M) (the triangle inequality).


Note also that θ(L, M) ≤ 1. [This property follows immediately from the
characterization given in condition (13.1.3).] It follows from (13.1.1) that

    θ(L, M) = θ(L⊥, M⊥)    (13.1.2)

where L⊥ and M⊥ denote orthogonal complements. Indeed,

    θ(L⊥, M⊥) = ‖P_{L⊥} − P_{M⊥}‖ = ‖(I − P_L) − (I − P_M)‖ = ‖P_M − P_L‖ = θ(L, M)
In the following paragraphs we denote by S_X the unit sphere in a subspace
X ⊆ ℂⁿ, that is, S_X = {x ∈ X : ‖x‖ = 1}. We also need the concept of the
distance d(x, Z) from x ∈ ℂⁿ to a set Z ⊆ ℂⁿ. This is defined by

    d(x, Z) = inf{‖x − z‖ : z ∈ Z}
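The definition translates directly into a few lines of code. In the sketch below
(Python with numpy; the helper names are ours) the gap between two lines in ℝ²
at angle φ comes out as sin φ, as one expects from (13.1.1):

    import numpy as np

    def orth_projector(B):
        Q, _ = np.linalg.qr(B)              # orthonormal basis of the column span
        return Q @ Q.conj().T

    def gap(L, M):
        # theta(L, M) = || P_L - P_M || in the spectral norm
        return np.linalg.norm(orth_projector(L) - orth_projector(M), 2)

    phi = 0.3
    L = np.array([[1.0], [0.0]])
    M = np.array([[np.cos(phi)], [np.sin(phi)]])
    print(gap(L, M), np.sin(phi))           # both approximately 0.29552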

Theorem 13.1.1
Let L, M be subspaces in ℂⁿ. Then

    θ(L, M) = max{ sup_{x ∈ S_L} d(x, M), sup_{x ∈ S_M} d(x, L) }    (13.1.3)

If exactly one of the subspaces L and M is the zero subspace, then the
right-hand side of (13.1.3) is interpreted as 1; if L = M = {0}, then the
right-hand side of (13.1.3) is interpreted as 0. If P₁ and P₂ are projectors with
Im P₁ = L and Im P₂ = M, not necessarily orthogonal, then

    θ(L, M) ≤ ‖P₁ − P₂‖    (13.1.4)

Proof. For every x ∈ S_L we have

    d(x, M) ≤ ‖x − P₂x‖ = ‖(P₁ − P₂)x‖ ≤ ‖P₁ − P₂‖

Therefore sup_{x ∈ S_L} d(x, M) ≤ ‖P₁ − P₂‖. Similarly, sup_{x ∈ S_M} d(x, L) ≤ ‖P₁ − P₂‖; so

    max{ sup_{x ∈ S_L} d(x, M), sup_{x ∈ S_M} d(x, L) } ≤ ‖P₁ − P₂‖    (13.1.5)

where the left-hand side will be denoted γ.
Observe that for the orthogonal projector P_M we have d(x, M) = ‖x − P_M x‖.
Consequently, for every x ∈ S_L we have

    ‖(I − P_M)x‖ ≤ γ    (13.1.6)

Now

Hence by (13.1.6)

On the other hand, using the relation

and the orthogonality of P_M, we obtain

Taking advantage of (13.1.6) and (13.1.7) we obtain

So

    ‖P_L − P_M‖ ≤ γ

Using (13.1.5) (with P₁ = P_L, P₂ = P_M), we obtain (13.1.3). The inequality
(13.1.4) follows now from (13.1.5). □

It is an important property of the metric θ(L, M) that, in a neighbour-
hood of every subspace L ⊆ ℂⁿ, all the subspaces have the same dimension
(equal to dim L). This is a consequence of the following theorem.

Theorem 13.1.2
If 0(£, M)<1, then dim $ = dim M.

Proof. The condition 8($, M)<\ implies that £ n M ^ = {0} and


. Indeed, suppose the contrary, and assume, for instance, that
Then d(x,m)=1,and by (13.1.3)
a contradiction. Now l O } implies that dim 3
dim M, and = {0} implies that dim ^ > dim M.

It also follows directly from this proof that the hypothesis 0(3?, M)<\
implies . In addition, we have
hh The Metric Space of Subspaces

For example, to see the first of these observe that for any x E. M there is the
unique decomposition x = y + z, M x = PMy so.
that M C PM(*£). But the reverse inclusion is obvious, and so we must have
equality.
The following result makes precise the idea that direct sum decompo-
sitions of <p" are stable under small perturbations of the subspaces, as
measured in the gap metric.

Theorem 13.1.3
Let M, M j C <p" be subspaces such that

If jV is a subspace in <p" such that Q(M, jV) is sufficiently small, then

and

where PM(Pjf) projects <p" onto M (onto N) along Jll and C is a constant
depending on M and M , but not on Ji. In fact

Proof. Let us prove first that the sum Ji + M\ is indeed direct. The
condition that M + M{ — (p" is a direct sum implies that \\x — _ y | | > 6 > 0 for
every and every Here 8 is a fixed positive constant. Take Ji
so close to M that 0 ( M , J f ) < d / 2 . Then ||z-^||<6/2 for every zG5, v ,
where y = y(z) is the orthogonal projection of z on M. Thus for x E SM and
we have

so ^n^,-{0}. By Theorem 13.1.2 di so


dimensional considerations tell us that M + Ml = <p" for 6(M, jV) < 1, and
equation (13.1.8) follows.
To establish the right-hand inequality in (13.1.9) two preliminary remarks
are needed. First note that for any xE.M, and yEMl we have x =
PM(x + y) so that

It is claimed that, for 6(M, N) small enough


The Gap Between Subspaces 391

for all and


Without loss of generality, assume \\z\\ =
— 1.
\. Suppose that
and let x£M. Then,
Then,using
using(13.1.10),
(13.1.10),we
weobtain
obtain

But then x - (x - z) + z implies ||*|| > 1 - S, and so

and, for S small enough, (13.1.11) is established.


The second remark is that, for any x E. <£"

for some constant C0. To establish (13.1.12), it is sufficient to consider the


case that xE-Jt^ and ||jt|| = 1. But then, obviously, we can take

Now for any x E. 5 V , by use of (13.1.12) and (13.1.3), we obtain

Then, if and it follows that

and the last inequality follows from (13.1.11). This completes the proof of
the theorem. D

We remark that the definition and analysis of the gap between subspaces
presented in this section extends verbatim to a finite-dimensional vector
space V over <p (or over J|?) on which a scalar product is defined. Namely,
there exists a complex-valued (or real-valued) function defined on all the
ordered pairs, x, y, where x, y E. V, denoted by (x, y), which satisfies the
following properties: for enery
x, y, z and every
(c) (x, x) > 0 for all x E. V; and (x, x) = 0 if and only if x = 0.
392 The Metric Space of Subspaces

13.2 THE MINIMAL ANGLE AND THE SPHERICAL GAP

There are notions of the "minimal angle" and the "spherical gap" between
two subspaces that are closely related to the gap between the subspaces. The
basic facts about these notions are exposed in this and the next sections. It
should be noted, however, that these notions and their properties are used
(apart from Sections 13.2 and 13.3) only in Section 13.8 and in the proof of
Theorem 15.2.1.
Given two subspaces 3?, M C (p", the minimal angle <pmin(&, M) (0<
<pmin(J3?, M}< IT/2) between Jz? and M is determined by

The minimal angle can also be defined by the equality

Indeed, writing

for any we have

Now for ||jc|| = ||y|| = 1

and writing /? = u + iv, where u and v are real, we see easily that the
function /(M, v) — 1 + ($(x, y) + /8(y, jc) + |^3|2 of two real variables u and v
has its minimum for u = —^((x, y) + (y, x)) and v = \(i(y, x) - i(x, y)),
that is, when )3 = -(y, *). Thus
The Minimal Angle and the Spherical Gap 393

Denote by a and b the right-hand sides of equations (13.2.2) and (13.2.1),


respectively. Then

In view of (13.2.3) the equality 1 - a2 = b2 follows, and this means that,


indeed, formulas (13.2.1) and (13.2.2) define the same angle <pmin(j£, M}
with 0<<p min (^, M)<Trl2.

Proposition 13.2.1
For two nontrivial subspaces // and only if

Proof. Obviously, if x £ £ fl M is a vector of norm 1, then

so Conversely, assume As the set


is closed and bounded, the
continuous function \\x + y\\ has a minimum in the set 4>, which in our case
is zero. In other words, ||jc() + yQ\\ = 0 for some x0 E £, y{}E.M, where at
least one of ||jt()|| and \\yQ\\ is equal to 1. But then, clearly, JCG G & n M ^
{0}. D

We also need the notion of the "spherical gap" between subspaces. For
nonzero subspaces =$?, M in (p" the spherical gap 0(3?, M ) is defined by

We also put «({0}, ^) = 0(%, {0}) - 1 for every nonzero subspace £ in <p"
and 6*({0}, {0}) = 0. The spherical gap is also a metric in the set of all
subspaces in <p". Indeed, the only nontrivial statement that we have to verify
for this purpose is the triangle inequality:

for all subspaces ^, M , and N in <p". If at least one of j£, M , and Ji is the
zero subspace, (13.2.4) is evident (observe that 0(j£, M)^2 for all sub-
spaces ££, M C <p"). So we can assume that £, M and jVare nonzero. Given
x G 5^., let zx e 5^ be such that ||* - zx\\ = d(x, SM). Then for every y E 5^,
we have
a94 The Metric Space of nSubspaces

and taking the infinum with respect to y, it follows that

It remains to take the supremum over x E S^ and repeat the argument with
the roles of JV and !£ interchanged, in order to verify (13.2.4).
In fact, the spherical gap 0 is not far away from the gap 6 in the following
sense:

The left inequality here follows from (13.1.3). To prove the right inequality
in (13.2.5) it is sufficient to check that for every x E <p" with ||jc|| = 1 we
have

where £ C <p" is a subspace. Let y = P^x, where Py is the orthogonal


projector on !£. If y — 0, then jtl j£, and for every 2 G 5^ we have

So (13.2.6) follows. If ^7^0, then, in the two-dimensional real subspace


spanned by x and y, there is an acute angle between the vectors x and y.
Consider the isosceles triangle with sides ;c, y / H . y l l and enclosing this acute
angle. In this triangle the angle between the sides .y/H.yll and x — y/\\y\\ is
greater than Tr/4. Consequently

and (13.2.6) follows again.

Proposition 13.2.2
For any three subspaces £, M, N C <p",

Proof. Letet a aand be arbitrary vectors satisfying


max{||_y 1 ||, ||_y 3 ||} = 1. Letting e be any fixed positive number, choose
y2 e M such that || y2 \\ = \\y3\\ and
The Minimal Angle and the Spherical Gap 395

Indeed, if y3 = 0, choose y2 - 0; if y3 7^0, then the definition of d(M, jV)


allows us to choose a suitable y2. Now

As e > 0 was arbitrary, the inequality (13.2.7) follows. D

The angle between subspaces allows us to give a qualitative description of


the result of Theorem 13.1.3.

Theorem 13.2.3
Let M, Jf be subspaces in <p" such that M Pi JV = {0}. Then for every pair of
subspaces M , , JV, C <J7" such that

we have ^ , n ^ V , = {0}. //, in addition, M+N=<£", then every pair


of subspaces Ml,Jil satisfying (13.2.8) has the additional property that

Proof. In view of Proposition 13.2.2 we have

and

Adding these inequalities, and using (13.2.8) and Proposition 13.2.1, we


find that Ml n ^, = {0}.
Assume now that, in addition, M + N= <p". Suppose first that M = M } .
Let e > 0 be so small that

If M + .yV, ^ <p", then there exists a vector jc £ <p" with ||A:|| = 1 and
for all y E M + ^ [e.g., one can take x^(M+J^ iY\. We
can represent the vector x as x = y + z, y&M, ze^V. It follows from
the definition of sin q>min(M, jV) that

Indeed, denoting u = max{||y||, ||z||}, we have


a94 The Metric Space of nSubspaces

By the definition of 6(t£, M ) we can find a vector 2 , from .AT, with

The last inequality contradicts the choice of *, because z — z, = jc — t, where


t = y + zlEM + JV,, and ||*-f||<6.
Now consider the general case. Inequality (13.2.8) implies
and, in view of Propositionn 13.2.2,
he part of Theorem 13.2.3 already proved, weroved, we
obtain M + JV, = <p" and then M , -I- JV, = (p". D

13.3 MINIMAL OPENING AND ANGULAR


LINEAR TRANSFORMATION

In this section we study the properties of angular transformations in terms of


the minimal angle between subspaces.

Let MI, M2 be subspaces in <p". The number

is called the minimal opening between Ml and M2. So

where <pmtn(M}, M 2 ) is the minimal angle between M{ and M2. By conven-


tion, T/({0}, {0}) =00. If FI is any projector defined on <p", then

To see this, note that for each z £ <p"

We would like to mention also the following properties of the minimal


opening. If Q, and Q2 are nontrivial (i.e., different from 0 and /) orthogo-
nal projectors of (p" onto the subspaces M\ and M2, respectively, then

and
ainimal Opening and Angular Linear Tranformation 397

Indeed, these formulas follow from the equality

for every x G M,, and from

As a consequence of (13.3.2) we obtain the following connection between


the minimal opening and the distance from one subspace to another. For
two subspaces M\ and M2\n <p", put

[if M{ - {0}, then define p(M^, M2) = 0]. Then we have

whenever Ml ¥^ {0}. To see this, note that for M2 ^ {0}

where <2, is the orthogonal projector onto M } . But then we can use (13.3.2)
to obtain formula (13.3.3). If M2 = {0}, then (13.3.3) holds trivially.
We use the notion of the minimal opening between two subspaces to
describe the behaviour of angular transformations when the corresponding
projectors are allowed to change.
398 The Metric Space of Subspaces

Lemma 13.3.1
Let I10 be a projector defined on <p", and let O be another projector on <f""
such that Then, provided

we have the following estimate for the norm of the angular transformation R
of Im O with respect to I10:

Proof. (). Recall tt

For 0 we have

Taking the infimum over all and using inequality (3.1), one sees
that

Now recall thatt fofofor e Ass


, we see from (13.3.5) that

So, using (13.3.6), we obtain

It follows from (13.3.7) that for each


which proves the inequality (13.3.4). D

The following lemma will be useful.

Lemma 13.3.2
Let P and P x be projectors defined on <p" such that <p" = Im P 4- Im P x. Then
for any pair of projectors Q and Q* defined on
sufficiently small, we have <p" = Im Q + Im Q x, and there exists an
invertible transformation which maps Im Q on Im P, Im Q x on
x
Im P , and
Minimal Opening and Angular Linear Transformation 399

where the positive constant /3 depends on P and P* only.

Proof. Let and assume that the


projectors Q and Q* satisfy

As
tion (13.3.9) implies that

But then we may apply Theorem 13.2.3 combined with (13.2.5) to show that

Note that (13.3.9) implies that ||P-<2||<|. Hence S, =/ + /»- G is


invertible, and we can write As
/ — P-f- Q is invertible also, we have

Further

Let n o (Il) be the projector of <p" along Im P (Im Q) onto Im P x


( I m < 2 x ) and put n = S,IISj"1. Then O is again a projector, and by
(13.3.10) we have Ker fl = Ker O 0 . Further, Im fl = Im S^^S^ , and so we
have

Hence, if R denotes the angular transformation of Im fl with respect to Il(),


then because of equation (13.3.4) of Lemma 13.3.1, we obtain
400 The Metric Space of Subspaces

As this implies
that

Next, put S2- I - RH(}, and take 5 = 525,. Clearly, 52 is invertible; in


fact, S^1 = I + Rll0. It follows that 5 is invertible also. From the properties
of the angular transformation one easily sees that 5(Im Q} - Im P,
S ( I m < 2 x ) = ImP x .
To prove (13.3.8), we simplify our notation. Put
Q x ||, and let 77 = 7j(Im P, Im P x ). From S = (I- RU0)(I + P - Q) and th
fact that ||P - e|| < | , one deduces
For \\R\\ an upper bound is given by (13.3.11), and from (13.3.1) we know
that JlnJIsSTj' 1 . It follows that

Finally, we consider S~l. Recall that S^l = I + V with


- Hence

and (13.3.8) follows in view of (13.3.12).

13.4 THE METRIC SPACE OF SUBSPACESces

We have already seen in Section 13.1 that the set <p(<P") °f a'l subspaces in
<p" is a metric space with respect to the gap 6(2£, M ) . In this section we
investigate some topological properties of <£(<£"), that is, those properties
that depend on convergence (or divergence) in the sense of the gap metric.

Theorem 13.4.1
The metric space (£(£") is compact, and, therefore, complete (as a metric
space).

Recall that compactness of 4/(<p") means that for every sequence


Jjfj, j£2, . . . of subspaces in <j/(<p") there exists a converging subsequence
.2) , ^,2, . . . , that is, such that
The Metric Space of Subspaces 401

for some Completeness of means that every sequence


of subspaces J^, / = 1, 2, . . . , for which lim, is convergent.

Proof. In view of Theorem 13.1.2, the metric space ([/((p") is decom-


posed into components (p w , m = 0, . . . , n, where <$m is a closed and open
set in <j/(<P") consisting of all w-dimensional subspaces in <p".
Obviously, it is sufficient to prove the compactness of each (|7m. To this
end consider the set $m of all orthonormal systems u = {uk}™=l consisting of
m vectors ul, . . . , um in <p"-
For define

It is easily seen that d(u, v) is a metric in $m, thus turning $m into a metric
space. For each u - {uk}^l E $m define yl m M = Span{M t , . . . , um] E (j/m.
In this way we obtain a map ^4OT: $m —> ($m of metric spaces $m and (J/m .
We prove that the map Am is continuous. Indeed, let ^ E §m and let
v}, . . . ,vm be an orthonormal basis in j£ Pick some
(which is supposed to be in a neighbourhood of For u ( ,
/ = 1, . . . , m, we have (where M = Amu and P^ stands for the orthogonal
projector on the subspace N):

Now, since we find that \a\ < 1 and E" , |a,| < m, and
so

Fix some y E 5^±. We wish to evaluate P^y. For every x E j£, write

and

by (13.4.1). On the other hand, write


402 The Metric Space of Subspaces

then for every z E !£1

and

But

Combining (13.4.2) and (13.4.3), we find that \(t, PMy)\ <3m6(w, v) for
every re f" with ||/|| = 1. Thus

Now we can easily prove the continuity of Am. Pick an x £ < p " with
||jc|| = 1. Thus, using (13.4.1) and (13.4.4) we have

so

which obviously implies the continuity of Am.


It is easily seen that $m is compact. Indeed, this follows from the
compactness of the unit sphere {jtE<p"|||jc|| = l} in <p". Since
Am: $m—> 4/m is a continuous map onto (frm , the metric space <$m is compact
as well.
Finally, let us prove the completeness of (pm. Let ^ , «2*2 , . . . be a
Cauchy sequence in (p m , that is, 0(J£t, «^)—>0 as /, j—»°°. By compactness,
there exists a subsequence ^ such that lim A _^ 6(^ , =^) = 0 for some
£ £ $ m . But then it is easily seen that in fact .Sf = lim,-.,, ^,. D
Next we develop a useful characterization of limits in (J/((p").
Theorem 13.4.2
Let M}, M2, . . . be a sequence of m-dimensional subspaces in 4/(<P"), such
that 6(Mp, M)— »Oflsp-»°° for some subspace M C (J7". Then M consists of
exactly those vectors x £ <p" for which there exists a sequence of vectors
xp £ <p", /? = 1, 2, . . . such that
The Metric Space of Subspaces 403

Proof. Denoting by /% the orthogonal projector on the subspace N C


<p", for every x G M we have:

So xp = P M x has the properties that xpE:Jip and lim^^ xp = x.


Conversely, let xp G Mp, p = 1, 2, . . . be such that lim^,, xp = x. Then

(in the last inequality we have used the fact that the norm of an orthogonal
projector is 1); so PMx - x, and x E M.

Using Theorems 13.4.2 and 13.1.3, one obtains the following fact.

Theorem 13.4.3
Let ££ and M be direct complements to each other in <£", and let
{^m}», = i be sequences of subspaces such that

Then, denoting by P (resp. Pm) the projector on £ along M (resp. on !£m


along Mm} we have

Moreover, there exists a constant K>Q depending on 5€ and M only such that

for all sufficiently large m.

Observe that, in view of Theorem 13.1.3, the subspaces !£m and Mm are
direct complements to each other for sufficiently large m.

Proof. Let Pm^ be the projector on Mm along !£ and PM m be the


projector on M along 3?m (for sufficiently large m). By Theorem 13.1.3 we
have
a
The Metric Space of Subspaces

where

As usual, d(jt, N) = inf{||jic - _y|| | y G JV} is the distance between x G <p" and
a subset JV C (p". In particular, for m large enough we find that

When Theorem 13.1.3 is applied again, it follows that

Now use (13.4.6) and deduce (for sufficiently large m):

We finish the proof by showing that

for m sufficiently large.


Arguing by contradiction, assume that (13.4.7) does not hold. Then there
exists a subsequence {Mm }^ =] and vectors xm E Mm with norm 1 such that

As the sequence {xm }^ =1 of n-dimensional vectors is bounded, it has a


converging subsequence. So we can assume that xm —> x0 as k—>°°. Clearly,
||jc0|| = l, and by Theorem 13.4.2, x0E.M. In view of (13.4.8) for each
k — 1, 2, . . . , there is a vector yk G 56 such that

In particular, the sequence ( y k } k =l is bounded, and we can assume that


yk-*y0 as &—><», for some y0E&. Passing to the limit in (13.4.9) when k
tends to infinity, we obtain the inequality
The Metric Space of Subspaces 405

which is contradictory.

The proof of Theorem 13.4.3 shows that actually equation (13.4.5) holds
with

We conclude this section with the following simple observation.

Proposition 13.4.4
The set $m($n) of all m-dimensional subspaces in (p" is connected.
That is, for every M, N G (£„,(<£") there exists a continuous function
f : [0, 1]-* $m(() such that /(O) = M, /(I) = JV (and the continuity of f is
understood in the gap metric).

Proof. Using the proof of Theorem 13.4.1 and the notation introduced
there, we must show that the set $m is connected. As any orthonormal
system «,, . . . , um in <p" can be completed to an orthonormal basis in <p",
the connectedness of $m would follow from the connectedness of the group
£/(<p") of all n x n unitary matrices. To show that £/(<£") is connected,
observe that any X G £/(<p") has the form

where S is unitary and 0,, . . . , 0n are real numbers (see Section 1.9). So

is a continuous £/(<p")- valued function that connects / and X.

Similarly, one can prove that the set <$m($") of all m-dimensional
subspaces in Jj?" is connected. To this end use the facts that any orthonormal
systems u l 5 . . . , um in ^" (m < n) can be completed to an orthonormal basis
« ! , . . . , ! / „ with det[« 1 , u2, . . . , un] = 1 and that the set £/+(Jj?") of all
orthogonal n x n matrices with determinant 1 is connected. Recall that a
real n x n matrix U is called orthogonal if U TU = UU T = I.
For completeness, let us prove the connectedness of f/ + (J(f"). It follows
from Theorem 12.1.4 that any x £ t/ + (JJf") admits the representation

where S is orthogonal and each K- is either the scalar ± 1 or the 2 x 2 matrix


406 The Metric Space of Subspaces

f°r some 9, 0 ^ 0 ^ ZTT, which depends on y. As det


also det[£,, K2,. . . , Kp] = 1, which means that the number of
indices / such that Kf. = -1 is even. Since= wecanassume

that each Kj is either 1 or %, 6 = 0(j). Putting

where X;-(r) = Kt if /Cy = 1 and Ky(0 = ^/# if Kj = ^,, we obtain a f/ + (J|O-


valued continuous function that connects / and X.

13.5 KERNELS AND IMAGES OF LINEARTtr TRANSFORMATIONS

Important examples of subspaces in <p" are images of transformations into


<(7" and kernels of transformations from <p". We study here the behaviour of
these subspaces when the transformation is allowed to change. The main
result in this direction is the following theorem.

Theorem 13.5.1
Let X: <£"-* <p'" be a transformation, and let Px be a projector on Ker X.
Then there exists a constant /C>0, depending only on X and Px, with the
following property: for every transformation Y: <p"-^ (p™ w^tn dim Ker Y =
dim Ker X there exists a projector PY on Ker Y such that

In particular

Proof. It will suffice to prove (13.5.1) for all those y with dim Ker Y -
dim Ker X that are sufficiently close to X, that is, ||X — Y\\ < e, where e > 0
depends on X and Px only. Indeed, for Y with dim Ker Y - dim Ker X and
||A'— y|| > e, use the orthogonal projector PY on Y and the fact that
to obtain (13.5.1) (maybe with a
bigger constant K}.
Consider first the case when X is right invertible. There exists a right
inverse X1 of X such that Im X1 = Im(/ - Px) (cf. Theorem 1.5.5), and then
X'X= I - Px (indeed, both sides are projectors with the same kernel and
the same image). It is easy to verify that any transformation Y: <p"—^Cp" 1
with the property
Kernels and Images of Linear Transformations 407

is also right invertible and one of the right inverses Y1 is given by the
formula Y1 = ZX1 where

Indeed, we have

and hence

where the penultimate equality follows from (13.5.3), because

A similar argument shows that Z is invertible and


. Now p . We have

So (13.5.1) holds for every Y satisfying || Y- *|| < IH*'!!"1, with

Now consider the case when * is not right invertible, and let r be the
dimension of a complementary subspace N to Im * in <pm- Consider the
transformation

defined by X(x + y) = Xx + Ly ; x £ <p", y E <f r, where L: $r -^ jV is some


invertible transformation. As the image of * is the whole space <p™ the
transformation * is right invertible. Also Ker * = Ker *. Let P^ be a
projector on Ker * defined by P%(x + y) = Pxx\ x E <p", y E <pr. Applying
the part of Theorem 13.5.1 already proved to *, we find positive con-
stants e and K such that, for every transformation with
||*- Y|| < e, there exists a projector Py on Ker Y such that
408 The Metric Space of Subspaces

Note that the equality dim Ker Y = dim Ker X holds automatically for e
small enough because then such Y will also be right invertible (see the first
part of this proof). Apply (13.5.4) for Y of the form Y(x + y) = Yx + Ly;
x E <p", y G <f", where Y: <p"-> <pm is a transformation such that \\X - Y\\ <
€ and dim Ker Y = dim Ker X. Let us check that Ker Y C <p". Indeed

and since Ker Y C Ker Y, we have in fact Ker Y = Ker Y and thus Ker Y C
<f". Now put Px = Py| n , Py = P y j n to satisfy (13.5.1), for transformations

Finally, observe that (13.5.2) follows from (13.5.1) in view of Theorem

The condition dim Ker Y = dim Ker X is clearly necessary for the inequal-
ity (13.5.1), since otherwise we obtain a contradiction with Theorem 13.1.2
on taking a Y: f-* <£" such that
A result analogous to Theorem 13.5.1 also holds for the images of linear
transformations. The statement of this result is obtained from Theorem
13.5.1 by replacing Ker X and Ker Y by Im X and Im Y, respectively, and
its proof is reduced to Theorem 13.5.1 by observing that Im A = (Ker A*Y
for a linear transformation A and that 0(M, N) = 8{M \ ^V x ) for any
subspaces M, Ji C <p".

13.6 CONTINUOUS FAMILIES OF SUBSPACES

As before, we denote by <J7(<P") the set of all subspaces in <p" seen as a


metric space in the gap metric.
In this section we consider subspace-valued families ^(/) defined on some
fixed compact set K C Jf"", that is, for each t G K, %(t) is a subspace in <p".
The family 3?(t) will be called continuous (on K) if for every /0 G K and
every e > 0 there is S >0 such that ||/-f 0 ||<5, / e / C implies Q(!£(t},
j£(f0)) < e (the norm ||f - t0\\ is understood as the Euclidean norm, that is,
generated by the standard scalar product (x, y) = £™ = , xlyi for x =
( j t j , . . . , xm), y = (7,,. . . , ym) G $m). In other words, the continuity is
understood in the sense of the gap metric.
Examples of continuous families of subspaces are provided by the
following proposition.

Proposition 13.6.1
Let B(t) be a continuous m x n complex matrix function on K such that
rank B(t) = p is independent of t on K. Then Ker B(i) and Im B(t) are
continuous families of subspaces on K.
Continuous Families of Subspaces 409

Proof. Take f 0 E K. There exists a nonzero minor of size p x p of B(t0).


For simplicity of notation assume that this minor is in the upper left corner
of B(t0). By continuity the p x p minor in the upper left corner of B(t) is
also nonzero as long as t belongs to some neighbourhood U0 of tQ. So [here
we use the assumption that rank B(t) is independent of t] for t E UQ

where bf(t) is the /th column of Z?(f). Let b /y (f) be the (i, y)th entry in B(t)\
and let D(t) = [^(0];:^,:f=i; C(0 = [M')Ky-i- Then the matrix

is a continuous projector with Im P(t) = Im B(t). Hence P(t) is uniformly


continuous on Ulf where U} is a neighbourhood of t0 in K such that t/j C U0.
By Theorem 13.1.1 [inequality (13.1.4)] the orthogonal projector on Im B(t)
is also uniformly continuous on Ul.
The statement concerning Ker B(t) can be reduced to that already
considered because Ker B(t) is the orthogonal complement to lm(B(t))*
(note that B(t)* is continuous in t if B(t) is).

In particular, we obtain an important case.

Corollary 13.6.2
Let P(t) be a continuous projector-valued function on K. Then Im P(t) and
Ker P(t) are continuous families of subspaces on K.

We have to show that rank P(t) is constant if the projector function P(t)
is continuous. But this follows from inequality (13.1.4) and the fact that the
set of subspaces of fixed dimension is open in the set of all subspaces in <p"
(Theorem 13.1.2).
The following characterization of continuous families of subspaces is very
useful.

Theorem 13.6.3
Let ££(t) be a family of subspaces (of <p" ) on a connected compact subset K
of $m. Then the following properties are equivalent: (a) ££(f) is continuous;
(b) for each tE.K there exists an invertible transformation S(t): <p"—»<p"
which depends continuously on t for t E K, and there exists a subspace
M C <f" such that £(f) = S(t}M for all t E K; (c) for each t0£K there exist a
neighbourhood Ut of 10 in K, an invertible transformation St(t): <p"—>(p"
that depends continuously on t in Ut, and a subspace Mt C <p" such that
410 The Metric Space of Subspaces

We prove Theorem 13.6.3 only for the case K = [0, 1] (of course, the case
when K C $ is easily reduced to this one). The proof when K is a connected
compact set of jj?m requires mathematical tools that are beyond the scope of
this book [see Gohberg and Leiterer (1972) for the complete proof].

Proof. Assume that &(t) is continuous on


be points with the property that

Here PA is the orthogonal projector on the subspace JV C <p". For each


/ = 0, . . . , p - 1, the transformation £,(17), r, < 17 < f, + 1 , defined by 5. (77) =
/ - ( P M ( I ) - PM^)) maPs ^(f,-) °n *M(y)-> is invertible and •$,•('/) = /. Now
put

to satisfy (b).
Obviously, (b) implies (c). Finally, let us prove that (c) implies (a). Given
5, and M, as in (c), let P(} be the orthogonal projector on M, . Then
5, (t)P0(S, (0) ' is a projector on &(t)\ therefore, for t E U, we have

As 5,t (t)
\ / is continuous and invertible in U,r ,' its inverse is continuous as well,»
a ()

and the continuity of t£(t) follows from the preceding inequality.

Corollary 13.6.4
Let !£(t) be a continuous family of subspaces (of (p") on K, where K C jjf" is
a connected compact set. Then there exists a continuous basis *,(/), • • • , xp(t)
in !£(t), where p = dim ££(t). (Note that because of the connectedness of K the
dimension of ££(t) is independent of t on K.)

Indeed, use Theorem 13.6.3, (b) and put Xj(t) = S(t)Xj, j = 1, . . . , p,


where xl, . . . , xp is a basis in M.

Corollary 13.6.5
Let B(i) be a continuous m x n matrix function on a connected compact set
, such that rank B(t) = p is independent of t. Then there exists a
Applications to Generalized Inverses 411

continuous basis xl(?),..., xn_p(t) in Ker B(t) and a continuous basis

This corollary follows from Corollary 13.6.4, taking into account


Proposition 13.6.1.

13.7 APPLICATIONS TO GENERALIZEDD INVERSES

In this section we apply results of the preceding sections to study the


behaviour of a generalized inverse of a transformation when this transfor-
mation is allowed to change. Recall that a transformation B: (p"1—> < p is
called a generalized inverse of a transformation A: <p"—>• (p"1 if the equalities
BAB = B, ABA = A hold (see Section 1.5).
As an application of Theorem 13.5.1, we have the following result
concerning close generalized inverses for close linear transformations.

Theorem 13.7.1
Let X: <p" —> <J7m be a transformation with a generalized inverse X1: (pm —> <p".
Then there exist constants K>0 and e > 0 with the property that every
transformation Y: <p"-» fm with \\Y-X\\<c and dim Ker Y = dim Ker X
has a generalized inverse Y1 satisfying

Proof. By Theorem 1.5.5, the generalized inverse X1 is determined by a


direct complement N to Ker X in <p" and by a direct complement M to Im X
in <f"", as follows:

where Px is the projector on Im A' along M, and X{ : N— >lm X is the


invertible transformation defined by X^x — Xx, XE.N. Denote by 3£(X) the
set of all transformations Y: <p" -» (pm such that dim Ker Y = dim Ker X.
Using Theorem 13.1.3 and inequality (13.5.2), choose € j > 0 in such a
way that JV is a direct complement to Ker Y for every Y G 3f{(X) with
HA'- Y||<e!. Using the analog of Theorem 13.5.1 for images of linear
transformations, we find a projector PY on Im Y such that

for every Ye 3£(X). Here the constant K} depends on X and Px only.


Our next observation is that, by Lemma 13.3.2 and (13.7.2), there exists
412 The Metric Space of Subspaces

a positive number e2 ^ e, such that for any Y E 3%(X) with \\X - Y\\ < e2 we
can find an invertible transformation S
and

where the positive constant K2 depends on X and Px only. Let Y = SYY,


and note that for every generalized inverse Y1 of Y the transformation Y1SY
is a generalized inverse for Y. Now for we
have

so it is sufficient to prove Theorem 13.7.1 for Y in place of Y. In other


words, we can (and will) assume that the transformation Y from Theorem
13.7.1 satisfies the additional property that Im Y — Im X.
Now we verify (13.7.1) for the generalized inverse Y1 = Y^1PY, where
r, : JV-> Im Y = Im X is defined by Y,* = Yx, x £ Jf. Indeed

and

But the norms l l ^ ^ l l are bounded provided the transformation YE.3£(X)


with I m r = I m ^ is such that \\X - Y\\ < 5||A'1"1||. Theorem 13.7.1 is
proved. D

Observe that the complete analog of Theorem 13.5.1 does not hold for
the case of generalized inverses. Namely, given X and X1 as in Theorem
13.7.1, in general there is no positive constant K such that any transfor-
mation Y : <p" —> <pm with dim Ker y = dim Ker X has a generalized inverse
Y7 satisfying (13.7.1). To produce an example of such a situation, take
n = m and let X: <P"~* <P" be invertible. Then there is only one generalized
inverse of X, namely, its inverse X~l. Further, let Y = aX, where a ^0. If
(13.7.1) were true, we would have for some K>0 and all a:

which is contradictory for a close to zero.


Now we consider continuous families of transformations and their
generalized inverses. It is convenient to use the language of matrices with
the usual understanding that n x m matrices represent transformations from
<pm into <p" in fixed bases in <pm and <p".
Applications to Generalized Inverses 413

Theorem 13.7.2
Let B(t) be a continuous m x n matrix function on a connected compact set
K C $q such that rank 5(0 — p is independent of t. Then there exists a
continuous n x m matrix function X(t) on K such that, for every t E. K, X(t)
is a generalized inverse of B(t).

Proof. In view of Corollary 13.6.5 there exists a continuous basis


x,(r), • • • , xn_p(t) in Ker 5(0, as well as a continuous basis ^,(0, • • • . 3^(0
in Im 5(0- By the same corollary there exist a continuous basis
xn_p +l(t),. . . , xn(t) in Im 5(0* and a continuous basis yp +l(t),. . . , ym(t)
in Ker 5(0*. As Im 5(0* = (Ker B(t)}\ it follows that j r ^ f ) , . . . ,*„(*)
is a basis in <f"" for all tE K. Also, yv(t),. . . , ym(t) is a basis in <pm for all
r e K. Define a transformation X(t): <pw—>• (pn as follows: X(f)yj(i) = Q,
j = p + 1, . . . ,m; and for j = 1, . . . , p X(t)yj(t) is the unique vector in
Im5(0* such that B(t)X(t)y;(t) = yfi). Theorem 1.5.5 shows that X(t) is
indeed a generalized inverse of 5(0 for all tE. K. It remains to show that
X(t) is continuous.
For a fixed vector z e <pm and any r e A", write f°r
some complex numbers z,(r) that depend on t. These numbers z,(r) turn out
to be continuous, because

Further, the transformation

is invertible, so

for some complex numbers a;,(0 that also depend on t. Again, a;/(0 are
continuous on K. Indeed, a;j(0 ^s the unique solution of the linear system of
equations

Writing y ; (r), j = 1,. . . , p in terms of linear combinations of the standard


basis vectors e^ . . . , em, and writing * ; (0> i = n- p + 1, . . . , n in terms of
414 The Metric Space of Subspaces

linear combinations of e{,. . . , en we can represent the system (13.7.3) in


the form

where a(t) is the p 2 -dimensional vector formed by a y ,(0> / = 1> • • • > P'->
i- n — p + I , . . . , n, and A(t) and C(t) are suitable matrix and vector
functions, respectively, which are continuous in /. As the solution of (13.7.4)
exists and is unique for every f G K, it follows that the columns of A(t) are
linearly independent for every tEK. Now fix t(} G K, and assume for
simplicity of notation that the upper p2 rows of A(t()) are linearly indepen-
dent. Partition

where A0(t) and C 0 (f) are the top p2 rows of A(t) and C(t), respectively.
Then A()(t()) is nonsingular; as A(t) is continuous in f, the matrix AQ(t) is
nonsingular for every / from some neighbourhood U, of t0 in K. It follows
that

is continuous in f for / E t/, . As t0 G A^ was arbitrary, the functions a;/(0 are


continuous on K.
Returning to our generalized inverse X(t), we have for every

the following equalities:

and so ^Tr is continuous on K.

A particular case of Theorem 13.7.2 deserves to be mentioned explicitly.

Corollary 13.7.3.
Let B(t) be a continuous m x n matrix function on a connected compact set
K C ftg such that, for every t G K, the matrix B(t) is left invertible (resp. right
invertible). Then there exists a left inverse (resp. a right inverse) X(t) of B(t)
such that X(t) is a continuous function of t on K.
Subspaces of Normed Spaces 415

13.8 SUBSPACES OF NORMED SPACES

Until now we have studied the notions of gaps, minimal angle, minimal
opening, and so on for subspaces of <p" where the norm of a vector
x = (jc,, . . . , xn} is Euclidean: ||jc|| = (£" =1 I*,) 2 ) 1 ' 2 . Here we show how
these notions can be extended to the framework of a finite-dimensional
linear space with a norm that is not necessarily generated by a scalar
product.
Let V be a finite-dimensional linear space over <p or over %. A real-
valued function defined for all elements # E V, denoted by ||jc||, is called a
norm if the following properties are satisfied: (a) ||je||^0 for all x E V ;
if and only id for every jcE V and every
scalar A (so A E <p or A E J j ? according as V is over ((7 or over
(the triangle inequality).

EXAMPLE 13.8.1. Let/,, . . . , / „ be a basis in V, and fix a number/? > 1. For


every put

Also, define ||*|U = maxdaj, . . . , |aj). We leave it to the reader to verify


that | | - | j p ( p > l ) and H ' l l ^ are norms (one should use the Minkowski
inequality for this purpose): for any complex numbers * , , . . . , * „ ,
yl , . . . , yn and any p > 1 we have

EXAMPLE 13.8.2. For V= <p" (or V= ft") let

where x = (x,, . . . , xn) belongs to (p" (or to JjJ"). We have used this norm
throughout the book. Actually, this is a particular case of Example 13.8.1
(with the basis fl ; = e^ i = 1, . . . , n in <p" (or jj?") and p = 2).

Any norm on V is continuous, as proved in the following proposition.

Proposition 13.8.1
Let /, , . . . , fn be a basis in V, and let \ \ - \ \ be a norm in V. Then, given e > 0
there exists a 8 > 0 such that the inequality
416 The Metric Space of Subspaces

holds provided \Xj — yt\ < S for / = ! , . . . , « , where x — E"=1 Xjfj and

Proof. Letting M = max lsysn ||/j.||, choose 8 = eM n . Then for


every with we have

It remains to use the inequality

which follows easily from the axioms of a norm.

It is important to recognize that different norms on a given finite-


dimensional vector space are equivalent in the following sense.

Theorem 13.8.2
Let || • ||' and || • ||" be two norms in V. Then there exists a constant K^l
such that

for every xE.V.

We stress the fact that K depends on || • ||', || • ||" only (and of course on
the underlying linear space V).

Proof. Let / ! , . . . , / „ be a basis in V. It is sufficient to prove the


theorem for the case when

Consider the real-valued continuous function g defined on <p" by

As the set is closed and bounded, the


Subspaces of Normed Spaces 417

function g attains its maximum and minimum on this bounded set. So there
exist *,, x2 £ V such that \\x{\\' = \\x2\\' = 1 and

for every v E V with || i; ||' = 1. Now for x £ V, x ¥= 0 we have


and hence

Thus inequality (13.8.1) holds with K = max(||jc2||", l/||jt,||"). D

In the rest of this section we assume that an arbitrary norm || • || is given


in the finite-dimensional linear space V.
For any subspace M C V, let

be the unit sphere of M. Now the gap 6(!£, M ) between the subspaces !£ and
M in V is defined by formula (13.1.3):

where d(x, Z) = inf, ez ||;c - /|| for a set Z C V.


The gap has two properties of a metric: (a) 0(<&, M) = B(M, 2£) for all
subspaces However, the
triangle inequality

for all subspaces £, M, N in V fails in general, although it is true when the


norm is defined by means of a scalar product (x, y ) , as Theorem 13.1.1
shows. The following example illustrates this fact.

EXAMPLE 13.8.3. Let tjjl2 be the normed space with the norm

Consider a family of one-dimensional subspaces

We compute 0(#(a), ^(j8)). Take x £ 5(^(/3)), so that x - (7, y/3), where


. Now
418 The Metric Space of Subspaces

As the function f(fJi) — \y - /n| + |y/3 — f i a \ is piecewise linear, we have

So

Let a < /3 < y be positive numbers such that /3 < 1 < y and / 3 y < l . We
compute

and

However, clearly

so the inequality

holds for sufficiently small positive a, and the triangle inequality for the gap
fails in this particular case.

In contrast, the spherical gap

is a metric. (The verification of this fact is exactly the same as that given in
Section 13.2.) Instead of inequality (13.2.5), we have in the case of a general
normed space the weaker inequality
Subspaces of Normed Spaces 419

for any subspaces J£, M C V. Indeed, the left-hand inequality of (13.8.3) is


evident from the definitions of S($, M) and 0(£,M). To prove the
right-hand inequality in (13.8.3), it is sufficient to verify that for every
vector v E V with ||u|| = 1 and every subspace M C V we have

For a given e > 0 there exists a v G Jf such that

and we can assume that i>^0. [Otherwise, replace v by a nonzero vector


sufficiently close to zero so that (13.8.5) still holds.] Then
and hence

But

and we have

As e > 0 is arbitrary, the desired inequality (13.8.4) follows.


The minimal angle between two subspaces is defined in a normed space
by the formula (13.2.1). With this definition, Proposition 13.2.2 and
Theorem 13.2.3 are valid in this case. Without going into details, we remark
that Lemmas 13.3.1 and 13.3.2 also can be extended to the normed space
context.
Concerning the metric space properties of the set of all subspaces in the
spherical gap metric (such as compactness, completeness), it follows from
inequality (13.8.3) and the following result that these do not depend on the
particular choice of the norm.

Theorem 13.8.3
Let || • ||' and \\ • \\" be two norms in V, with the corresponding gaps 8'(M, N)
and 0"(M, JV) between subspaces M and N in V. Then there exists a constant
L > 1 such that

for all subspaces M and X.


420 The Metric Space of Subspaces

Again, the constant L depends on the norm || • ||' and || • ||" only.

Proof. By Theorem 13.8.2 we have for any x G V

where the constant K > 1 is independent of x. Hence

In view of the definition of 8(3?, M ) we obtain the left-hand inequality in


(13.8.6) with L = K2. The right-hand inequality in (13.8.6) follows
similarly. D

13.9 EXERCISES

13.1 Compute the gap 0(M, jV), where

and x and y are complex numbers such that


13.2 Compute the gap 6(M, Jf), spherical gap 0(Ji, ^V), minimal opening
T)(M, N) and minimal angle (pmin(M, N), where

and x and y are real numbers such that |jc| — |y|.


13.3 Compute 0(M,N), r)(M,N), and <pmin(M,N) for any two one-
dimensional subspaces M and ^V in /(?".
13.4 Let U: <p"-^ <p" be a unitary transformation. Prove that

for any pair of subspaces M , M C <J7".


Exercises 421

13.5 Prove that for subspaces Jz?, M in <p"

13.6 Show that the equality 0(3?, M) = \ holds if and only if either
£± n M ^ {0} or # n M1 ^ {0} (or both).
13.7 Let M I , N ! be subspaces in <p" and M2,N2 be subspaces in <pm.
Prove that

where
13.8 Find the gaps 0(Ker A, Ker B) and 0(Im A, Im 5) for the following
pairs of transformations
(a) j4 and B are diagonal in the same orthonormal basis.
(b) A and # are commuting normal transformations.
(c) A and B are circulant matrices in the same orthonormal basis.
(Hint: A and B can be simultaneously diagonalized by a unitary
matrix.)

in the same orthonormal basis, where ay and /3y are complex


numbers.
13.9 For each of cases (a)-(d) in Exercise 13.8, find

13.10 Let A: fn-+f" be a transformation. Then 0(J<,JV) = 1 for any


distinct ^-invariant subspaces M and ^V if and only if A is normal
with n distinct eigenvalues.
13.11 Show that if A(t), t £ [0, 1] is a continuous family of n x « circulant
matrices and dim Ker A(t) is constant (i.e., independent of t), then
the subspaces Ker A(t) and Im A(t) are constant.
13.12 Prove or disprove the following:
(a) If A(t) is a continuous family of upper triangular Toeplitz n x n
matrices for / £ [0, 1], then dim Ker A(t) is constant if and only
if Ker A(t} and Im A(t) are constant.
422 The Metric Space of Subspaces

(b) Same as (a) for

where a y (f) are continuous scalar functions of t e [0, 1] and A is


a fixed n x n matrix.
13.13 Show that a circulant matrix has a generalized inverse that is also a
circulant.
13.14 Let A(t) be a continuous family of circulant matrices with
dim Ker A(t) constant for f £ [ 0 , 1]. Show that there exists a
continuous family B(t) of generalized inverses of A(t) on [0, 1] that
also consists of circulant matrices.
13.15 Solve Exercises 13.13 and 13.14 with "circulant" replaced by "upper
triangular Toeplitz."
13.16 Assume the hypotheses of Lemma 13.3.1 and, in addition, assume
that the projector I10 is orthogonal. Prove that \\R\\ = cotan <p min ,
where <pmin is the minimal angle between Ker I10 and Im II.
13.17 Find the minimal angle between any two one-dimensional subspaces
in the normed space ^f 2 with the following norms:
Chapter Fourteen

The Metric Spaces


of Invariant Subspaces

We study the structure of the set Inv(^4) of all invariant subspaces of a


transformation A: <p"~* <P" in the context of the metric space <jj(<p n ) °f au<
subspaces in <f"'. Throughout this chapter <p" is considered with the standard
scalar product and the gap metric determined by this scalar product on
(J7((p"), as studied in the preceding chapter. With the exception of Section
14.3, the results of this chapter are not used subsequently in this book.

14.1 CONNECTED COMPONENTS: THE CASE OF ONE EIGENVALUE

Let s£ C <$ be two sets of subspaces of (p". We say that s& is connected in Sft
if for any subspaces <&, M C stf there is a continuous function /: [0, 1]—» 8ft
such that /(O) = j£, /(I) = M. [The continuity of / is understood in the gap
metric. Thus, for every tQ G [0, 1] and every e > 0 there is a 8 > 0 such that
imply The set si is called con-
nected if s& is connected in M.
We start the study of connectedness of the set Inv(y4) with the case when
A — 7, a Jordan matrix with &(J) = {0}. Let r be the geometric multiplicity
of the eigenvalue 0 of J, and let kl>--->kr be the sizes of the Jordan
blocks in /. Also, denote the set of all /^-dimensional J-invariant subspaces
by Inv p .
r) be an ordered r-tuple of integers such that 0 < /i. < A:,,
E; = 1 /,- = />, and let 4>p be the set of all such /--tuples. We associate every
with the subspace $(/) G Inv p , spanned by vectors wj 0 ;
y = 0, . . . , / y — 1; / = 1, . . . , /-, where wj'* are unit coordinate vectors in <(?"
and the sole nonzero coordinate of «j° is equal to one and is in the place
k^ + — • + ki_l+ j + 1 (we assume k0 = 0) for / = 0,. . . , kf - I and i =
1,. . . , r. There is a one-to-one correspondence between elements of <I>p and

423
a24 The Metric paces of Invariant Subspaces

subspaces from Inv/; spanned by unit coordinate vectors. So we can assume


that <£p C Inv;, .

Lemma 14.1.1
<$p is connected in ln\p.

Proof. Let / = ( / , , . . . , l r ) and / = ( / , , . . . , l r ) be r-tuples from <&p, and


suppose, for example, that /, > /, and / 2 < / 2 - Let ^(e) G Invp be the
subspace spanned by vectors w j ' l , + ew}, 2) , u{()l\ . . . , u(,^2, u ( - } for / =
0, . . . , /; - 1 and / = 2, . . . , r, where e is a complex number. Then ^(0) =
(the subspace corresponding to the r-tuple /) and

So / = ( / , , . . . , l r ) and (/,-!, / 2 + 1, /-,, • • • , l r ) are connected in Inv p .


Applying this procedure several times, we obtain a connection between /
and /.

Lemma 14.1.2
Let ^, E ln\p. Then oF, is connected in ln\p with some £F, E4> p .

Proof. For / = 0, 1, 2 , . . . , let

Then 0 = S?0 C S^, C • • • C ^ = <p" for some integer 5 (5 is the minimal


integer such that 7* =0). We construct the basic set of vectors in ^, in the
following way (see the proof of the Jordan form in Section 2.3). Let /0 be the
greatest index that satisfies (£%, "•"{%,._,) H ^, ^0. Take a basis
v,'o; ', , . . . , v,'o<?/
• „0 in &.i fl £%,'o modulo $£,'(i ~ ', . Then the vectors 7'u,'o ', , . . . , J'v/'o^i„0
are linearly independent in &} C\ ffl t _,. modulo 3? , - _ , - _ , ; / = ! , . . . , /0 - 1.
We complete the set Jvf !,..., JVj by additional vectors
'(i" 1 - 1 ' ' 'a~l'1i0-\
to form a basis in ^, l fl £%, 'o^,1 modulo S?, 'o,-/ . Then the
vectors

are linearly independent in ^, D £% y _ ; modulo ^ ( _,._, for / = 2, . . . , /0 - 1.


Complete the set

by additional vectors
y ;to a basic set of vectors in
Connected Components: The Case of One Eigenvalue 425

fl £%,. _ 2 modulo ^, _ 3 , and so on. So we obtain the basic set of vectors in

To connect ^, with some subspace ^2 E 4>p, we use the following procedure.


Take a set of q t coordinate unit vectors V,' ', , - . . , ^V,,,
~ '() y
'0^i ,
(
in <3l,'() that are
independent modulo £%, _ , . For / = 1, 2, . . . , ^ , put

where A is a complex parameter. Then the u, 0 y(A) are linearly independent


modulo £%; _ , for every A G <p except possibly for a finite set S{. Indeed, let
*,, . . . , * £ be a basis in £%, _ , , and put

Then u/ ( ) y(A), / = 1, . . . , ^ are linearly independent modulo ^, 0 _i if and


only if the columns of B( A) are linearly independent. Let b( A) be a minor of
B(\) of order /0 + k such that ^(0)^0 (such a minor exists because y ( ( / ,
y = 1, . . . , <7, are linearly independent modulo £%,. _ , ) . So fc(A) is a poly-
nomial that is not identically zero. Clearly, for every A that does not belong
to the finite set 5, of zeros of b(\), the vectors u / ( / (A), j = 1, . . . , q^
are linearly independent modulo £%, _ , . Observe that 5, does not contain 0
and 1.
Further, take a set of q t , coordinate unit vectors
in &l _ such that the vectors

are independent modulo S?, _ 2 . Putting

for y = 1, . . . , g,: _ , , we see similarly that the vectors

are independent modulo S? , _ 2 for A G <p ^ ^2' where 52 D 5j is a finite set of


complex numbers (not including 0 and 1). We continue this procedure and
obtain vectors

such that
426 The Metric Spaces of Invariant Subspaces

are linearly independent for A G <p" ^ 5, where 5 is finite set of complex


numbers not including 0 and 1 and vti(l) are coordinate unit vectors. From
this procedure it follows also that vij( A) G £%, for A G <p "~ S. Therefore, the
subspace ^(A) in <p" spanned by vectors (14.1.1) for A G ( p ^ 5 is a
/-invariant subspace with dimension not depending on A. Since S is finite we
can connect between 0 and 1 by a continuous curve F such that F D S — 0.
Then ^(A), A G F carries out the connection between ^, = ^(0) and
&2 = (1), where ^2 G 4>p.

We say that a set ^ C <]/((p") has connected components ja?,, . . . , s&m if


each stfj, / = l , . . . , r a is a nonempty connected set, but there is no
continuous function /: [0, 1]-+ $(<p") such that /(O)G^, / ( l ) G ^ , and
i^j. (In other words, each sdt is a maximal connected set in stf.)
Lemmas 14.1.1 and 14.1.2 allow us to settle the question of connected
components of the set Inv(/l) when the transformation A has only one
eigenvalue.

Theorem 14.1.3
Assume that the transformation A: $"—> <p" has only one eigenvalue A0. Then
Inv(y4) has exactly n + 1 connected components, and each connected compo-
nent consists of all A-invariant subspaces of fixed dimension.

Proof. Without loss of generality we can assume A() = 0. Let J be the


Jordan form of A , and A = S ' 1JS for some invertible transformation 5.
Obviously, ln\(A) = 5"'(Inv(/)) and Inv^A) = S ~ l ( l n v p ( J ) ) , where
ln\p(A) is the set of all ,4-invariant subspaces of dimension p. Lemmas
14.1.1 and 14.1.2 show that lnvp(J) and, therefore, lnvp(A) are connected.
On the other hand, if !£ G lnvf)(A) and M G Inv^/4) with p ^ q, then there
is no continuous function /: [0, l]-» <£(£") with /(O) - <£ and f ( \ } = M.
Indeed, if there were such a function/, then dim /(/) would not be constant
in a neighbourhood of some point f ( , G [0, 1]. This contradicts the continuity
of / i n view of Theorem 13.1.2.

14.2 CONNECTED COMPONENTS: THE GENERAL CASE

The description of connected components in Inv(^4) for a general transfor-


mation A: <p"—» <P" is given in the following theorem.

Theorem 14.2.1
Let A,, . . . , At be all the different eigenvalues of A, and let ( / / , , . . . , «/>c be
their respective algebraic multiplicities. Then for every integer p, Q< p < n,
Connected Components: The General Case 427

and for every ordered c-tuple of integers (xl , . . . , *c) such that 0 < x, — ^
i = l , . .. ,cand

{!£ £ Inv A \ dim !£ = p and the algebraic multiplicity of

is a connected component of ln\(A), and each connected component of


ln\(A) has the form (14.2.1) for a suitable p and suitable c-tuple

Proof. In the proof we use the following well-known properties of the


trace of a transformation A: $"—* (f"1, denoted by tr(/4) [e.g., see Section
3.5 in Hoffman and Kunze (1967)]. We may define tr(/4) to be the sum of
eigenvalues of A. If A is written as an n x n matrix in any basis in <p", then
tr(.A) is also the sum of diagonal elements of A. We have tr(AB) = lr(BA)
for any transformations A , B : <p" —» <p" ; in particular, tr(5~1/15) = if (A) for
any invertible 5. The trace (considered as a map from the set of all
transformations <p"—» <p" onto (p) is a continuous function.
Returning to the proof of Theorem 14.2.1, let F( be a small circle around
A,, with no other eigenvalue of A inside or on F.. Let JV be an /i- in variant
subspace, and let Xi(N) be the geometric multiplicity of A, for the transfor-
mation A\v. Using the Jordan form of A\ N , for instance, it is easily seen that

Let «,, . . . , ap be an orthonormal basis in JV. Then in some neighbourhood


V(N) of .TV, Pv-a,, . . . , Pjf.ap will be a basis in the subspace N' e V(JT),
where P v is the orthogonal projector on JV'. We have

Write /i| v as a matrix in the basis a,, ... ,a p , and for every /1-invariant
subspace N' that belongs to V(N], write ^4 v , as a matrix in the basis
P v , a , , . . . , Pjf.ap. Using formula (14.2.2) and the continuity of the trace,
we see that there exists a S > 0 such that, if 0(jV, N')<8 and N' is A
invariant, then

Since Xi(N') assumes only integer values, it follows that A/,(^') 's constant
in some neighbourhood of N in lnv(A) and, therefore, constant in the
connected component of Inv(^4) that contains N.
We show now that if N and Jf' are p-dimensional ^-invariant subspaces
428 The Metric Spaces of Invariant Subspaces

such that Xi(N) = X i ( N ' ) for / = 1, . . . , c, then JV and N' are connected in
Inv(yl). Indeed, applying Theorem 14.1.3 to each restriction A\^ (A) for
/ = 1, . . . , c, we find that N n &Ai(A) is connected with Jf' n ^(A) in the
set of all /4-invariant subspaces of dimension Xi(N) in 9l^(A). Since

and similarly for M', it follows that JV and .A"' are connected in
It remains to show that, given integers x\* • • • •> Xc suc^ tnat 0 — A', — 'A,
and L c i = l X i — p , there exists a subspace jVelnv(/4) with ^,(JV) = xt, for
/ = 1, . . . , c. But assuming that A is in Jordan form, we can always choose
an jV spanned by appropriate coordinate unit vectors. D

Corollary 14.2.2
The set ln\(A) has exactly Hli=l ((f/l ; + 1) connected components, where
«/fj , . . . , «/rc are the algebraic multiplicities of the different eigenvalues
\i, . . . , \c of A, respectively.

The proof of Theorems 14.1.3 and 14.2.1 shows in more detail how the
subspaces in Inv A belonging to the same connected component are con-
nected. We say that a vector function x(t) defined for fE[0, 1] and with
values in <p" is piecewise linear continuous if there exist m points 0 < tl <
"•<tm<\ and vectors y,, . . . , _y m + 1 and z , , . . . , z m + 1 such that, for
i = 1 , . . . , m +1

(by definition, f 0 = 0, tm + l = 1), and for / = 1, . . . , m, we obtain

Corollary 14.2.3
Let M and M be p-dimensional A-invariant subspaces that belong to the same
connected component in Inv A. Then there exist piecewise linear continuous
vector functions v { ( t ) , . . . , vp(t) such that, for all ?G[0, 1], the subspace
Span{ el), • • • , vp(t)} is p-dimensional, A invariant, and

14.3 ISOLATED INVARIANT SUBSPACES

Let A: <p" —> (f"1 be a transformation. An /1-invariant subspace M is called


isolated if there is an e > 0 such that the only /l-invariant subspace N
satisfying 0(M, N) < e is M itself.
Isolated Invariant Subspaces 429

Theorem 14.3.1
An A-invariant subspace M is isolated if and only if, for every eigenvalue
of A with dim Ker( either

To prove Theorem 14.3.1, we use a lemma that allows us to reduce the


problem to the case when A has only one eigenvalue.

Lemma 14.3.2
An A-invariant subspace M is isolated if and only if for every eigenvalue A0 of
A the subspace (A) is isolated as an A\ invariant subspace.

Proof. We have

where are all the different eigenvalues of A.


Assume that is isolated. If for some A, the subspace
is not isolated [as an A\ invariant subspace], then there exists a
sequence of ^-invariant subspaces such that
For m = 1, 2,. . . , let

Obviously, is A invariant. Let ^ be a direct complement to


in (4) for y = 1,. . . , r, and put Then 'is
direct complement to in Theorem 13.1.3 shows that for m sufficiently
large, t is a direct complement to K(A), and therefore is a direct
complement to in Letting be the projector on
(resp. along J e have (cf. (13.1.4))

where P, is the projector on along

and mis the projector on


along Theorem 13.1.3 shows that for large m

where the constant C > 0 is independent of m. Comparing with (14.3.1), we


430 The Metric Spaces of Invariant Subspaces

obtain as m—»°°, a contradiction with the fact that is


isolated.
Assume now that, for / = 1,. . . , r, K(A) is isolated as an A
invariant subspace. So there exists an >0 such that the only .A-invanant
subspace (^4) satisfying

is itself.
We show now that, for every there exists re aexists
8 > 0a hat,
8 > 0forsuch
anythat, for any
A-invariant subspace with the inequalities
hold for / = 1,. . . , r. Indeed, arguing by contradiction,
assume that for some and some / there exists a sequence of
/l-invariant subspaces such that as w but

Let Then, in particular, and by Theorem 13.4.2


there exists a sequence such that for m = 1 , 2 , . . . and

Write x where Apply


the projector on A (A) along the sum of all other root subspaces of A to
both sides of (14.3.3). We see that y = lim Conversely, if y =
lim for some x then obviously and,
by Theorem 13.4.2, we also have Now by the same Theorem
13.4.2 any limit point of the sequence m = 1, 2, . . . coincides
with Since Theorem 13.4.1 ensures that the limit points of
exist, we obtain a contradiction with (14.3.2).
Now take min Then for with the property de-
scribed in the preceding paragraph, we find that for every .A-invariant
subspace the equa lit hold
for; = 1,. . . , r. But these equalities imply t is, M is isolated. D

Proof of Theorem 14.3.1 In view of Lemma 14.3.2, we can assume that


If dim Ker( then A is unicellular and has a
unique complete chain of invariant subspaces. Obviously, every ^-invariant
subspace is isolated. Now assume that dim Ker In view of
Theorem 14.1.3, the set lnvp(A) of all y4-invariant subspaces of fixed
dimension p is connected. So to prove that the only isolated A-invariant
subspaces are {0} and we must show that ln\p(A) has at least two
members for However, for every p with 0 < p < n, and in a fixe
Jordan basis for A, the transformation A has at least two invariant subspaces
of the same dimension p spanned by some vectors from this basis.
Isolated Invariant Subspaces 431

An A-invariant subspace is called inaccessible if the only continuous


mapping of the interval [0,1] into the lattice ln\(A) of A -invariant subspaces
with is the constant map Clearly, every isolated in-
variant subspace is inaccessible. The converse is also true, as follows.

Proposition 14.3.3
Every inaccessible A-invariant subspace is isolated.

Indeed, if A has only one eigenvalue and dim Ker then


any A -invariant subspace is obviously inaccessible and isolated. It can be
proved by using the arcwise connectedness of ln\p(A) for 0</? < n that, if
(A) = and dim Ker then any nontrivial /1-invariant
subspace is not inaccessible (Corollary 14.2.3). The reduction of the general
case to this special case is achieved with the following lemma.

Lemma 14.3.4
An A-invariant subspace M is inaccessible if and only if, for every eigenvalue
of A, the subspace (v4) is inaccessible as an A -invariant
subspace.

The proof of Lemma 14.3.4 is left to the reader. (It can be obtained along
the same lines as the proof of Lemma 14.3.2.)

Theorem 14.3.5
Every inaccessible (equivalently, isolated) A-invariant subspace is A hyper-
invariant.

Proof. Let be the distinct eigenvalues of A (if any) wit


dim Ker for / = ! , . . . , $ , and let be other
distinct eigenvalues of A (if any). For a given isolated ^-invariant subspace
we have, by Theorem 14.3.1

and some t with Le tting a, be the dimension of we


have =Kerp(A) t where A
every transformation that commutes with A also commutes with p(A), the
subspace is A hyperinvariant. D

The converse of Theorem 14.3.5 does not hold in general, as the next
example shows.
432 The Metric Spaces of Invariant Subspaces

EXAMPLE 14.3.1. Let

The subspace = Sp is the kernel of T and is thus T


hyperinvariant. For any complex number a, the subspace
Span is easily seen to be T invariant. We have

so

and as the norm of a hermitian matrix is equal to the maximal absolute


value of its eigenvalues, a computation shows that

where

So the subspace valued function F defined on is continu-


ous and nonconstant and takes T-invariant values. As the T-
invariant subspace is not inaccessible.

14.4 REDUCING INVARIANT SUBSPACES

Recall that an invariant subspace of a transformation is called


reducing if there exists an ^4-invariant subspace jV that is a direct comple-
ment to n
Reducing Invariant Subspaces 433

The question of existence and openness of the set of reducing ,4-invariant


subspaces of fixed dimension p is settled by the following theorem.

Theorem 14.4.1
Let A: be a transformation with partial multiplicities
(so ml + • • • + mk = n). Then there exists a reducing A-invariant subspace of
imension if and only if p is admissible, that is, is the sum of some
partial multiplicities ra,,. . . , ra, . In this case the set of all reducing A-
invariant subspaces of dimension p is open in the set of all A-invariant
subspaces.

Proof. If p is admissible, then obviously a reducing ^-invariant sub-


space of dimension p exists. Conversely, assume that M is a reducing
/1-invariant subspace of dimension p with an ,4-invariant complement N.
Write

with respect to the direct sum decomposition Taking Jordan


forms of A\ and A2, we see that p is admissible.
For an admissible p, let Rm\p(A) be the set of all p-dimensional reducing
y4-invariant subspaces. For a subspace M Rinvp(A), let Jf be a direct
complement to M that is A invariant. Theorem 13.1.3 shows that there exists
an e < 0 such that ^Vis a direct complement for any ^-invariant subspace M\
with Hence Rmvp(A) is open in the set lnvp(A) of all
/7-dimensional ^-invariant subspaces. D

Now consider the question of whether (for admissible /?) the set R\n\p(A)
of all p-dimensional reducing subspaces for A is dense in the set lnvp(A) of
all p-dimensional /1-invariant subspaces. We see later that the answer is, in
general, no. So a problem arises as to how one can describe the situations
when Rm\p(A) is dense in lnvp(A) in terms of the Jordan structure of A.
We need some preparation to state the results. Let A: be a
transformation with single eigenvalue and partial multiplicities
It follows from Section 4.1 that the partial multiplicities/?
p, of the restriction A to an ^-invariant subspace M satisfy the inequalities

Given an integer p with 1 let p , be a sequence of positive


integers such that (14.4.1) holds and /?, + ---+p,-p; a sequence with
these properties is called p admissible. For a p admissible sequence
denote by \n\p(A; / ? , , . . . , p,) the (nonempty) set of all A-
invariant subspaces M such that the restriction A\ has the partial multi-
434 The Metric Spaces of Invariant Subspaces

plicities pl , . . . , p,, Clearly, dim M = p for every


Moreover

where the union is taken over the finite set of all p-admissible sequences
For each p-admissible sequence

where and K* indi-


cates the number of elements in the finite set K. In connection with the
definition of (-4; / ? , , . . . , p,), observe that c for y = 1, 2,. . . (so each
summand on the right-hand side of (14.4.2) is a nonnegative integer), and/?,
is the maximal index with
We now give a necessary and sufficient condition for the denseness of
Rinv /3 (/l) in \n\p(A), for a transformation A: <p"—»<p n with single eigen-
value and partial multiplicities m, > • • • S: mr.

Theorem 14.4.2
For a fixed admissible integer p, the set K\n\p(A) is dense in lnvp(A) if and
only if the following condition holds: any p-admissible sequence p} >•• ->pt
for which the number s,(A\ pt,. . . , p,) attains its maximal value among all
p-admissible sequences has the form p\ = mt, , . . . , p, — mt for some indices
In particular, Rinvp(A) is dense in ln\p(A) pro-
vided there is only one p-admissible sequence for which
is maximal.

In the proof of Theorem 14.4.2 we apply a result proved in Shayman


(1982) concerning a representation of ln\p(A) as a union of complex
(analytic) manifolds. In this proof (and only in this proof) we assume some
familiarity with the definition and simple properties of complex manifolds
that can be found, for instance, in Wells (1980).

Theorem 14.4.3
For every p-admissible sequence p the set Inv /; (^4; /?j, . . . , p,) is,
in the topology induced by the gap metric, a connected complex manifold
whose (complex) dimension is equal to

For the proof of Theorem 14.4.3 we refer the reader to Shayman (1982).

Proof of Theorem 14.4.2. Assume that the condition fails, that is, there
exists a p-admissible sequence with maximal
Reducing Invariant Subspaces 435

that is not of the form By


Theorem 14.4.3 the complex manifold \n\p(A; p , , . . . , p,) has maximal
dimension among all the complex manifolds whose union is ln\p(A). On the
other hand, it is easily seen that ln\p(A; p , , . . . , p,) does not contain any
reducing subspace for A (cf. the proof of Theorem 14.4.1). So Rinv p (/4) is
not dense in Inv p(A).
Assume now that the condition holds. Then every complex manifold
ln\p(A; p,, . . . , p,) with maximal will contain a reducing
subspace { , . . . , p,) for A. Fix such a p-admissible sequence
p,, and let JVbe an ^-invariant direct complement to M(p}, . . . , p,) in
It follows from Theorem 7 in Shayman (1982) that the complex manifold
In\p(A; p,, . . . , p,) can be covered by a finite number of analytic charts
and that each chart is of the form p(A; p , , . . . , p,)
with Span where
A-,(Z), . . . , *p(z) are analytic vector functions in <p9. Now it is easily seen
that the set of all subspaces M E:lnvp(A; p , , . . . , p,) that are not direct
complements to Jf is an analytic set (i.e., the union of the sets of zeros
of a finite number of analytic functions that are not identically zero) in
each of the charts mentioned above. Denoting by K the union of all
ln\p(A\ p,, . . . , p,) for which is maximal, it follows that
Rinv /J (/4) K is dense in K. As ln\p(A) is connected (Theorem 14.1.3), it
follows from Theorem 14.4.3 that the closure of K coincides with Inv p (/4);
hence Rinvp(A) is dense in ln\p(A).
Finally, suppose that there exists only one p-admissible sequence
for which is maximal. As the set ln\p(A) is
connected, and Theorem 14.4.3 implies that lnvp(A) is the closure of
lnvp(A; p j , . . . , p j - ) . Since p is admissible, there exists a p-dimensional
/4-invariant subspace such that for some ^4-invariant
subspace . So there exists a subspace M in ln\ (suf-
ficiently close to for which is a direct complement. Now we can
repeat the arguments in the preceding paragraph to show that Rinv p (v4) is
dense in lnvp(A).

Let us give an example showing that, for an admissible p, Rinv p (,A) is not
generally dense in ln\p(A).

EXAMPLE 14.4.1. Let

where Jm(Q) is the Jordan block of size m with eigenvalue 0. Clearly, p = 5 is


admissible. However, Rinv 5 (/4) is not dense in Inv 5 (^4). According to
Theorem 14.4.3, the connected set Inv5(/4) is the disjoint union of five
analytic manifolds S,, 52, S3, 54, S5 described as follows: let
7 Then for j = 1,. . . , 5,
436 The Metric Spaces of Invariant Subspaces

Sj consists of all five dimensional ,4-invariant subspaces M such that the


restriction A\M has partial multiplicities given by yf. Further, the (complex)
dimensions of 51? S2, S3, S4, 55 are 4, 4, 3, 2, 0, respectively. It is easily seen
that there is no reducing subspace for A in S2. Indeed, the sum of a
subspace from 52 and any four-dimensional A -invariant subspace fails to
contain the vector e Since the dimension of S2 is maximal among the
dimensions of Sj, j = I , . . . ,5, it follows that Rinv 5 (A) is not dense in
Inv5(,4). D

In the next example Rinv p (/4) is dense in ln\p(A), for all admissible p.

EXAMPLE 14.4.2. Let

Obviously, all p - 0, 1, 2, 3 are admissible. Among the one-dimensional


/4-invariant subspaces Span (where all are re-
ducing with the exception of Span (i.e., when . Indeed

for a T^ 1. So Rinv^/l) is dense in Inv,(v4). Further, in the set

of two-dimensional v4-invariant subspaces the reducing ones are


Span that is, again a dense set.

We note the following corollary from Theorem 14.4.2.

Corollary 14.4.4
If the transformation A: has only one eigenvalue and
dim Ker 7 - A) = 2, then Rmvp(A) is dense in lnvp(A) for every p such
that Rmvp(A) is not empty.

Proof. Indeed, let be the partial multiplicities of A. A simple


calculation shows that for every p-admissible sequence we have
and for the p-admissible sequence consisting of one
integer p, only, we have Hence there exists only one
p-admissible sequence for which is maximal,
and the second part of Theorem 14.4.2 applies.
Covariant and Semiinvariant Subspaces 437

14.5 COVARIANT AND SEMIINVARIANT SUBSPACES

In this section we study topological properties of the sets of coinvariant and


semiinvariant subspaces for a transformation y4 . As usual, the
topology on these sets is the metric topology induced by the gap metric.
For the coinvariant subspaces we have the following basic result.

Theorem 14.5.1
The set Coinv(y4) of all coinvariant subspaces for a transformation
A: is open and dense in the set < of all subspaces in
Furthermore, the set Coinvp(A) of all A-coinvariant subspaces of a fixed
dimension p is connected.

Proof. Let M be A coinvariant, so there is an ^-invariant subspace N


that is a direct complement to M in (p". By Theoreml3.1.3 there exists an
such that JV is a direct complement to any subspace with
Hence Coinv(v4) is open.
We now prove that Coinv(y4) is dense. Let M — Span{u,, . . . , vp] be a
p-dimensional subspace in There exists an (n - p)-dimensional A-
invariant subspace N. Let be a basis for JV. Denoting by
W j , . . . , w a basis for some direct complement to N, put

where 17 7^0 is a complex number. As v^ . . . ,vp are linearly independent


for 17 close enough to zero the vectors v are linear
independent as well. Hence dim ^(17) = p for 77 close enough to zero.
Further, the determinant of the n x n matrix [ « , • • • un_pwl • • • wp] is non-
zero. If the determinant of is a
polynomial in £ that is not identically zero, and it follows that

or all such that is large enough. For such the sub space Spa
is a direct complement to JV.
Span it follows that for 17 5^0 and close
nough to zero To show that M belongs to the closure of
the set of all >l-coinvariant subspaces, it remains to prove that

To prove this, assume for simplicity of notation that the upper p rows in
[i>, • • • vp] are linearly independent. Then the same will be true for the upper
p rows of (for 17 close enough to zero). Write
438 The Metric Spaces of Invariant Subspaces

where is a nonsingular p x p matrix and 0(17) is an (n - p) x p matrix.


Then the matrix

where and ' is the orthogonal


projector on s the entries of P(r e continuous functions of 17,
equality (14.5.1) follows.
Finally, let us verify the connectedness of Coin\p(A). Let
Com\p(A). So for some (n - p)-dimensional A-
invariant subspaces and et lt . . . , vp and M , , . . . , up be bases in
and spectively, and consider the subspaces M(rf panju,
where 17 G (p. As in the preceding proof of the dense
ness of Coinv(^4), one verifies that for all 17 with the possible exception of
finite set 4> (= the set of zeros of a certain polynomial), M(rj) is a direct
complement to the least one of the subspaces and Pick a continuous
curve where and that does not intersect 3> and such
that Then is the desired connection
between and in the set Coinvp(,4). D

Now we consider the semiinvariant subspaces. As any /l-coinvariant


subspace is also A semiinvariant, Theorem 14.5.1 implies that the set
Sinv(yl) of all A -semiinvariant subspaces is dense in <Jj(<p" )• However,
Sinv(/4) is not necessarily open, as the following example shows.

EXAMPLE 14.5.1. Let A = -7 The two-dimensional subspace


Span{e2, e3] is obviously A semiinvariant, and

(see the proof of Theorem 14.5.1). But the subspace Span{e2, e3 + rje4} is
not A semiinvariant for 17 ^ 0. Indeed, suppose that

where N and M are A invariant. As the only nonzero /1-invariant subspaces


are Span for / = 1, 2, 3, 4, and (14.5.2) implies e
it follows that Then dinuV = 2. Hence Ji must be Span 2},
which contradicts (14.5.2). D
The Real Case 439

Theorem 14.5.2
For any transformation A: the set Sinvp (/l) of all A-semiinvariant
subspaces of a fixed dimension p is connected.

Proof. Given an ^-invariant subspace N with dimension not less than p,


denote by S the set of all ,4-semiinvariant subspaces of dimension p
such that for some /1-invariant subspace M (in other words, is
A}^ coinvariant). It will suffice to show that for any jVand any
!£ there exists a continuous function/: Sin\p(A) such that
where J*2 is ,4 invariant, and let
/ , , . . . , /p and g , , . . . , gp be bases in j£, and «$?,, respectively. Denote by 5
the finite set of all i?E<P for which Span{/, + rjgj, . . . ,
is not a direct complement to ^ Then put f(t) =
Span{f for and , where
5 is any continuous function with T(0) = 0,

14.6 THE REAL CASE

Consider now a transformation A: $"—*%". We study here the connected


components and isolated subspaces in the set Inv*(/4) of all v4-invariant
subspaces in J|?".

Theorem 14.6.1
If A has only one eigenvalue, and this eigenvalue is real, then the set Inv*(y4)
of all A-invariant subspaces of fixed dimension p is connected.

The proof of Theorem 14.6.1 will be modeled after the proof of Theorem
14.3.1, taking into account the fact that in some basis in $." the transfor-
mation A has the real Jordan form (see Section 12.2). We apply the
following fact.

Lemma 14.6.2
The set GLr(n) of all real invertible n x n matrices has two connected
components; one contains the matrices with positive determinant, the other
contains those with negative determinant.

Proof. Let T be a real matrix with det T > 0 and let J be a real Jordan
form for T. We first show that J can be connected in GLr(ri) to a diagonal
matrix K with diagonal entries ±1. Indeed, J may have blocks Jp of two
types: first
440 The Metric Spaces of Invariant Subspaces

in this case we define

for any where \ p(t) is a continuous path of nonzero real numbers


such that according as
Second, a Jordan block Jp may have the form

where for real and r with Then J p(t) is


defined to have the same zero blocks as / , whereas the diagonal and
superdiagonal blocks are replaced by

respectively, for / £ [0, 1]. Then Jp(t) determines a continuous path of real
invertible matrices such that /p(0) = Jp and J p ( l ) is an identity matrix.
Applying the above procedures to every diagonal block in /, we see that J
is connected to K by a path in GLr(n). Now observe that the path in GLr(2)
defined for by

connects Consequently K, and hence J, is connect-


The Real Case 441

ed in GLr(n) with either / or diag[-l, 1 , 1 , . . . , 1]. But del T>Q implies


d e t J > 0 , and so the latter case is excluded. Since T=S~1JS for some
invertible real S, we can hold S fixed and observe that the path in
connecting J and / will also connect T and 7.
Now assume r(n) and d e t T < 0 . Then det 7' >0, where T' =
T diag[-l, !,...,!]. Using the argument above, we find that T' is connect-
ed with / in GLr(n). Hence T' is connected with diag[-l, !,...,!] in

Proof of Theorem 14.6.1. Without loss of generality we can assume that


A = /„(()). Let be the sizes of Jordan blocks in A. Let the
set of all ordered r-tuples of nonnegative integers / , , . . . , lr such that
i / , = p . As in Section 14.1, each is iden-
tified with a certain p-dimensional ^-invariant subspace; so can be
supposed to be contained in lnv The proof of Lemma 14.1.1 shows
that <I> is connected in lnv Further, we apply the proof of Lemma
14.1.2 to show that any is connected in Inv*(/4) with some
Take vectors as in the
proof of Lemma 14.1.2. Let pt — dim £%. - dim £%,_j for / = / 0 , /0 — 1, . . . , 1.
As the vectors u, ()1 , . . . , u,o<? are linearly independent modulo ^, 0 -i, the
matrix formed by the rows
of the 0
matrix 0 0"|Q
has linearly independent columns. Fo
^

simplicity of notation assume that the top q submatrix of is


nonsingular. Now Lemma 14.6.2 allows us to connect the vectors
with ±e respectively (the
sign -I- or - coincides with the sign of the nonzero real number det in
the set of all qt -tuples of vectors in £%. that are linearly independent modulo
for ; = 2, . . . , <7,o in the proof of
Lemma 14.1.2. Using an analogous rule for the choice of ytj at each step of
the procedure described in the proof of Lemma 14.1.2, we finish the proof
of Theorem 14.6.1.

Theorem 14.6.3
If the transformation A: has the only eigenvalues where a
and (3 are real and P^Q, then again the set Inv of all A-invariant
subspaces of fixed dimension p is connected.

Note that under the condition of Theorem 14.6.3, A does not have
odd-dimensional invariant subspaces (in particular, n is even), so we can
assume that p is even (see Proposition 12.1.1).

Proof. Consider A as the n x n real matrix that represents the trans-


formation A in the basis e and let Ac be the complexification
442 The Metric Spaces of Invariant Subspaces

of A; so By Theorem 12.3.1, there exists a one-to-one


correspondence between the ^'-invariant (/?/2)-dimensional subspaces J< in
and the y4-invariant p-dimensional subspaces 2£, which is given
by the formula

It is easily seen from the proof of Theorem 12.3.1 that this correspondence
is actually a homeomorphism (p: ln\p(A)^>lnv
Now the connectedness of Inv*(/4) follows from the connectedness of
Inv
heorem 14.1.3). D

Recall that as shown in Chapter 12, any /l-invariant subspace t£ admits


the decomposition

where are all the distinct real eigenvalues of A (if any) and
are all the distinct eigenvalues of A in the open upper
half plane. Using this observation, the proof of Theorem 14.2.1 yields the
following description of the connected components in the metric space
Inv (A) of all /1-invariant subspaces in ft" for the general transformation

Theorem 14.6.4
Let A,, . . . , A^ be all the different real eigenvalues of A, let their algebraic
multiplicities be ifr respectively, and let be all
the distinct eigenvalues of A in the open upper half plane with the algebraic
multiplicities <p,,. . . , <p,, respectively. Then for every (s + t)-tuple of integers
such that
the set dim 2£ = p\ \i is tne algebraic multiplicity of
A\y corresponding to A, for i - 1,. . . , s; \s+j is that corresponding to
for where connect-
ed component of Inv (A) and every connected component of Inv*(/4) has
this form. In particular, Inv*(^4) has exactly
connected components.

Finally, consider the isolated subspaces in Inv*(y4).

Theorem 14.6.5
Let A: be a transformation. Then an A-invariant subspace M is
isolated in lnv*(A) if and only if either
Exercises 443

for every real eigenvalue A0 of A with dim Ke and either


for any nonreal eigenvalue of
A with geometric multiplicity greater than 1.

Proof. Using the real analog of Lemma 14.3.2 (its proof is similar to
that of Lemma 14.3.2), we can assume that one of two cases holds: (a)
In the
first case Theorem 14.6.5 is proved in the same way as Theorem 14.3.1. In
the second case use Theorem 14.3.1 and the homeomorphism between
Inv*(^) and lnv given by formula (14.6.1).

14.7 EXERCISES

14.1 Supply the details for the proof of Lemma 14.3.4.


14.2 Prove that for a transformation A the sets of /4-hyperinvariant
subspaces and isolated ^-invariant subspaces coincide if and only if A
is diagonable. In this case an ^-invariant subspace is isolated if and
only if it is a root subspace.
14.3 What is the number of isolated invariant subspaces of the companion
matrix

14.4 Let A =diag Is the set of all reducing


/4-invariant subspaces dense in Inv(/4)?
14.5 Show that there exists a converging sequence of semiinvariant sub-
spaces for the matrix -/3(0) whose limit is not J 3 (0)-semiinvariant.
Chapter Fifteen

Continuity and
Stability of
Invariant Sub spaces

It has already been mentioned that computational problems for invariant


subspaces naturally lead to the problem of describing a class of invariant
subspaces that are stable after small perturbations. Only such subspaces can
be amenable to numerical computations. The analysis of stability of in-
variant subspaces is the main topic of this chapter. We also include related
material on stability of other classes of subspaces (notably, [A J5]-invariant
subspaces), and on stability of lattices of invariant subspaces. Different types
of stability are analyzed.

15.1 SEQUENCES OF INVARIANT SUBSPACES

In this section we consider the continuity of invariant subspaces for trans-


formations from <(7n into <|7". We start with the following simple fact.
Theorem 15.1.1
Let {Am}^m = l be a sequence of transformations from <p" into <p" that
converges to a linear transformation A: <J7"—» ((7". // Mm is an Am-invariant
subspace for m = 1, 2,. . . such that Mm—> M for some subspace M C <p",
then M is A invariant.

Proof. Let x G M. Then, by Theorem 13.4.2, there exists a sequence


such that xm G Mm for each m and limm_x Now

444
Sequences of Invariant Subspaces 445

As Am—> A, the norms are bounded; K for some positive


constant K independent of m. So as m—»°°,

As ^ m is Am invariant, we have AmxmE.Mm for each m, and Theorem


13.4.2 can be applied to conclude that Ax EM.

The continuity property of invariant subspaces expressed in Theorem


15.1.1 does not hold for the classes of coinvariant and semiinvariant
subspaces.

EXAMPLE 15.1.1. For m = 1, 2,. . . , let

The subspace Span{el} is Am coin variant for every m. (Indeed, Span is


a direct complement to SpanfeJ, which is Am invariant.) However,
Span{e,} is not A coinvariant, where is the limit of Am. The
same subspace SpanfeJ is also Am reducing, but not A reducing. D

EXAMPLE 15.1.2. For m = 1, 2 , . . . , let

The eigenvectors of Am are (up to multiplication by a scalar) el, mev + me2,


m2el - me2 + 2e3. Consequently, the subspace Span{el,e3} is Am semi-
invariant for all m (because Span{m^j + e2} is a direct complement to
Span{e1? e3}, which is an Am-invariant subspace). However, Span{el, e3} is
not A semiinvariant, where

is the limit of Am if m—»<». D


446 Continuity and Stability of Invariant Subspaces

Corollary 15.1.2
The set of A-invariant subspaces is closed; that is, if {Mm}*m = l is a sequence
of A-invariant subspaces with limit M:=\\mm_^M, then M is also A
invariant.

Simple examples show that the ,4-invariant subspaces Ker A and Im A


are not generally continuous in the sense of Theorem 15.1.1. Thus it may
happen that {Ker/l m }~ = 1 does not converge to Ker A and {lmAm}^ = }
does not converge to Im A as Am-* A. The following result shows that the
only obstruction to convergence of Ker Am and Im Am is the dimension.

Theorem 15.1.3
Let {Am}~m = l be a sequence of transformations on <p" that converges to a
transformation A on <p". Then Ker A contains the limit of every convergent
subsequence of the sequence {Ker Am}^, =l. In particular, if dim Ker Am =
dim Ker A for every m = 1, 2, . . . then Ker Am and Im Am converge, and

Proof. For k — 1, 2,. . . , let Ker Am converge to some M C <p". Then


for every x&M there exists a sequence xm EKer.4^ , such that xm —>x.
As Am xm =0, we have also Ax — 0, that is, x E Ker A.
Now let Im Am be a sequence converging to some Ji C <p". Then [see
formula (13.1.1)] *

Since A^-* A*, by the part of the theorem already proved,


(and so Jf D Im A.
Assume in addition that dim Ker Am = dim Ker A for all m = 1 , 2 , . . . . If
£ is a limit of a converging subsequence from the sequence {Ker yl m }^ =1 ,
then (see Theorem 13.1.2) dim «2* = dim Ker A. From the first part of the
theorem we know that Jz? C Ker A. So actually Z£ = Ker A. Hence Ker A is
a limit of every converging subsequence of {Ker Am}^, =l. It follows [using
the compactness of (f/(<p")] that Ker/4 m converges to Ker A. Further, we
also have dim Im Am — dim Im A for each m. A similar argument shows that
Im Am converges to Im A. D

Let M be an A -invariant subspace and O be an open set in <p. We


conclude this section by showing that the inclusion a(A\M)CCl is preserved
under small perturbations. Recall that 6 denotes the "gap" metric intro-
duced in Chapter 13.
Stable Invariant Subspaces: The Main Result 447

Theorem 15.1.4
Let M be an invariant subspace for the transformation A: <p"~* <P"> and let
ft C <p be an open set such that all eigenvalues of A\M are inside ft. Then for
transformations B on <p" and B-invariant subspaces Jf, cr(B\ v ) C ft as long as
is sufficiently small.

Proof. Arguing by contradiction, suppose that there exists a sequence


of transformations (Bm}^n = l on (p" and a sequence of subspaces
such that Jim is Bm invariant,

and For each m, let Am be an eigenvalue of outside

Since as the norms are bounded;


hence the sequence {A OT } m = 1 is bounded as well. Passing to subsequences in
formula (15.1.1), if necessary, we can assume that A OT —» A0 and xm-+x0 (as
m—»o°), for some A0 G (p and *„ G <{7". By Theorem 13.4.2, x0E:M, and
clearly JC ( ) T^O. As Ax() — AOJCO, A0 is an eigenvalue of v 4 | w , which, by
hypothesis, belongs to ft. But this contradicts Am ^ft for m = 1, 2, . . . . D

15.2 STABLE INVARIANT SUBSPACES: THE MAIN RESULT

Let A : <p" —» <p" be a transformation. An /l-invariant subspace ^V is called


stable if, given e > 0 , there exists a 5 > 0 such that ||fi — /1||<8 for a
transformation B: <p"—» <p" implies that B has an invariant subspace M with
0(./^, ^V ) < e. The same definition applies for matrices.
This concept is particularly important from the point of view of numerical
computation. It is generally true that the process of finding a matrix
representation for a linear transformation and then finding invariant sub
spaces can be performed only approximately. Consequently, the stable
invariant subspaces will generally be the only ones amenable to numerical
computation.
Suppose that N is a direct sum of root subspaces of A . The JV is a stable
invariant subspace for A. This follows from the fact that JV appears as the
image of a Riesz projector

where F is a suitable closed rectifiable contour in (p such that the eigenvalue


448 Continuity and Stability of Invariant Subspaces

A0 of A is inside F if 9t^(A) C Ji and outside F if ^o(A) n JV = {0} (see


Proposition 2.4.3). Further, the function F(A) = (/A - A) -1 is a continuous
function of A on F. This follows from the formula

where Adj(/A - ^4) is the matrix of algebraic adjoints of /A — A, and from


the continuity of det(/A — A) and Adj(/A — ^4) as functions of A. Since F is
compact, the number K is well defined. Now an
l
transformation fl:(p"—><p" with A has the property that
/A - B is invertible for all A £ F. [Indeed, for A G F we have

and since the invertibility of follows.]


Moreover

which implies that H j R / j - K ^ H is arbitrarily small if ||y4-B|| is small


enough.
Theorem 13.1.1 shows that

so 0(Jf, M) is small together with \\RB - RA\\.


However, it will turn out that not every stable invariant subspace is
spectral. On the other hand, if dim K e r ( A y / — ^ 4 ) > 1 and .A" is a one-
dimensional subspace of Ker(A ; 7 — A), it is intuitively clear that a small
perturbation of A can result in a large change in the gap between invariant
subspaces. The following simple example provides such a situation. Let A be
the 2 x 2 zero matrix, and let JV = Span| C <p2 Clearly, N is A in-
variant, but N is unstable. Indeed, let B = diag[0, e], where e 7^0 is close
enough to zero. The only one-dimensional /^-invariant subspaces are Ml =
Spanj \\ and M2 — Spanj \\, and both are far from JV: computation
shows that

The following theorem gives the description of all stable invariant


subspaces.

Theorem 15.2.1
Let A,, . . . , \r be the different eigenvalues of the transformation A. A
subspace N of (f"1 is A invariant and stable if and only if
Stable Invariant Subspaces: The Main Result 449

where for each j the space Nt is an arbitrary A-invariant subspace of £%A (A)
if dim Ker( A / - A) = 1; if dim Ker( A y / - A) * 1 then either ^ = {0}' or

Comparing this theorem with Theorem 14.3.1, we obtain the following


important fact: an A-invariant subspace Jf is stable if and only if N is isolated
in the metric space lnv(A) of all A-invariant subspaces.
An interesting corollary is easily detained from Theorem 15.2.1.

Corollary 15.2.2
All invariant subspaces of a transformation A: <p"—» <p" are stable if and only
if A is nonderogatory [i.e., dim Ker(A - A 0 7) = 1 for every eigenvalue A0
of A}.

The proof of Theorem 15.2.1 will be based on a series of lemmas and an


auxiliary theorem that is of some interest in itself. We will also take
advantage of an observation that follows immediately from the definition of
a stable subspace: the ^-invariant subspace N is stable if and only if the
5-45-1-invariant subspace SN is stable. Here 5: (p w —»(p w is an arbitrary
invertible transformation.
First we present results leading to the proof of Theorem 15.2.1 for the
case when A has only one eigenvalue. To state the next theorem we need
the following notion: a chain of A-invariant sub
spaces is said to be complete if dim -Mj—j for / ' = ! , . . . , « — 1.

Theorem 15.2.3
Given e > 0, there exists a 8 > 0 such that the following holds true: if B is a
transformation with and is a complete chain of B-
invariant subspaces, then there exists a complete chain {Jij} of A-invariant
subspaces such that 6(Nj, M j ) < e for / = ! , . . . , « — 1.

In general, the chain (M;} for A will depend on the choice of B. To see
this, consider

where i; E (p. Observe that for v ^= 0 the only one-dimensional invariant


subspace of Bv is Span{e2}, and for B'v, i>^0, the only one-dimensional
invariant subspace is Span{e,}.
Proof. Assume that the conclusion of the theorem is not correct. Then
there exists an e > 0 with the property that for every positive integer m there
exists a transformation Bm satisfying \\Bm - A\\ < 1 Im and a complete chain
{Mmj} of Bm-invariant subspaces such that for every complete chain {Jif}
of ^-invariant subspaces
450 Continuity and Stability of Invariant Subspaces

Denote by Pmj the orthogonal projector on Mmj.


Since there exists a subsequence of the sequence of
positive integers and transformations P,,. . . , Pn _, on (p", such that

Observe that P,,. . . , Pn_i are orthogonal projectors. Indeed, passing to the
limit in the equalities P m . y = (P m ., y ) 2 , we find that P; = P]. Further,
equation (15.2.4) combined with P*m. y = Pm . implies that P* = P;; so P;. is
an orthogonal projector (see Section 1.5).
Further, the subspace jVy = Im Py has dimension y, / = 1, . . . , n — 1. This
is a consequence of Theorem 13.1.2.
By passing to the limits it follows from BmPmj = PmjBmPmi that AP- =
PjAPj. Hence ^ is A invariant. Since Pmj = PmJ+lPmj we have Py = Pj+lPjt
and thus JV / -C^ / + 1 . It follows that JVy. is a complete chain of A-invariant
subspaces. Finally, 0(^V), M But this contradicts
(15.2.3), and the proof is complete. D

Corollary 15.2.4
If A has only one eigenvalue, A 0 , say, and if dim Ker( A07 — A) = 1, then each
invariant subspace of A is stable.

Proof. The conditions on A are equivalent to the requirement that for


each 1 < j < n — 1 the operator A has only one /-dimensional invariant
subspace and the nontrivial invariant subspaces form a complete chain (see
Section 2.5). So we may apply the previous theorem to obtain the desired
results. D

Lemma 15.2.5
If A has only one eigenvalue, A0 say, and if dim Ker(A 0 7— A) >2, then the
only stable A-invariant subspaces are {0} and <p".

Proof. Let J = diag[ ] be the Jordan form for A. As


dim Ker( A 0 7 — A) > 2, we have s > 2. By similarity, it suffices to prove that /
has no nontrivial stable invariant subspace.
For € G (p, define the transformation Tf on <p" by setting

otherwise
and put Bf = J+Te. Then || Bf - J \\ tends to 0 as e -» 0. For e * 0 the linear
transformation Bf has exactly one/-dimensional invariant subspace, namely,
Proof of Theorem 15.2.1 in the General Case 451

,Y} = Span{e,,. . . ,e y ). Here 1 < / < £ - ! . It follows that Jff is the only
candidate for a stable /-invariant subspace of dimension j.
Now consider / = diag[Jk ( A 0 ) , . . . , /* 2 (A 0 ), / A | (A 0 )j. Repeating the
argument of the previous paragraph for / instead of /, we see that Nf is the
only candidate for a stable /-invariant subspace of dimension /. But / =
575 ~ l , where S is the similarity transformation that reverses the order of the
blocks in /. It follows that SNj is the only candidate for a stable /-invariant
subspace of dimension /'. As s > 2, however, we have SNj ^ Nj for 1 < / <
k — 1, and the proof is complete. D

Corollary 15.2.4 and Lemma 15.2.5 together prove Theorem 15.2.1 for
the case when A has one eigenvalue only.

15.3 PROOF OF THEOREM 15.2.1 IN THE GENERAL CASE

The proof of Theorem 15.2.1 in the general case is reduced to the case of
one eigenvalue considered in the preceding section. Recall the notion of the
minimal opening

between subspaces M and ^(Section 13.3). Always 0<rj(^, N) < 1, except


when both M and M are the zero subspace, in which case ri(M, N) = °°.
Note that j](M, Jf) >0 if and only if M D N = {0} (Proposition 13.2.1). We
need to apply the following fact.

Proposition 15.3.1
Let be a sequence of subspaces in for
some subspace !£, then

for every subspace N.

Indeed, if both t£ and M are nonzero, then also Mm are nonzero (at least
for m large enough; see Theorem 13.1.2). Then (15.3.1) follows from
formula (13.3.2). If at least one of % and jV is the zero subspace, then
(15.3.1) is trivial.
Let us introduce some terminology and notation that will be used in the
next two lemmas and their proofs. We use the shorthand Am—* A for
lim^^ || Am - A\\ — 0, where Am, m — 1, 2 , . . . , and A are transformations
on (p". Note that A m —> A if and only if the entries of the matrix represen-
tations of Am (in some fixed basis) converge to the corresponding entries of
452 Continuity and Stability of Invariant Subspaces

A (represented as a matrix in the same basis). We say that a simple


rectifiable contour F splits the spectrum of a transformation T if tr(T) D <£ =
0. In that case we can associate with T and F the Riesz projector

The following observation is used subsequently. If T is a transformation


for which F splits the spectrum, then F splits the spectrum for every
transformation S that is sufficiently close to T (i.e., \\S - T\\ is close enough
to zero). Indeed, this follows from the continuity of eigenvalues of a linear
transformation as functions of this transformation.

Lemma 15.3.2
Let F be a simple rectifiable contour that splits the spectrum of T, let TQ be the
restriction of T to Im P(T; F), and let N be a subspace of Im P(T; F). Then Ji
is a stable invariant subspace for T if and only if jV is a stable invariant
subspace for T().

Proof. Suppose that ^Vis a stable invariant subspace for TQ, but not for
T. Then one can find an e > 0 such that for every positive integer m there
exists a transformation Sm such that

From (15.3.2) it is clear that Sm—> T. By assumption, F splits the spectrum


of T. Thus, for m sufficiently large, the contour F will split the spectrum of
Sm. Moreover, P(Sm; F)-» P(T; F), and hence ImP(5 m ;F) tends to
Im P(T; F) in the gap topology. But then, for m sufficiently large,

(cf. Theorem 13.1.3).


Let Rm be the angular transformation of Im P(Sm; T) with respect to
P(T; F). Here, as in what follows, m is supposed to be sufficiently large. As
P(Sm,T)^ P(T-r), we have Rm-+Q. Put

where the matrix representation corresponds to the decomposition


Proof of Theorem 15.2.1 in the General Case 453

Then Em is invertible with inverse

Also, Em Im P(T- F) = Im P(Sm- F), and £„,->/.


Put Tm = EmlSmEm. Then Tm Im P(T; F)C Im P(T; F) and T^T. Let
7m be the restriction of Tm to Im P(T; F). Then Tm —» ro. As ^V is a stable
invariant subspace for 70, there exists a sequence {^Vm} of subspaces of
Im P(T; T) such that Nm is rifl(j invariant and e(Jfm, ^V)->0. Note that JVW is
also 7^ invariant.
Now put Mm = EmNm. Then Mm is an invariant subspace for Sm. From
Em—> I one can easily deduce that 0(J/ m , JV m )—»0. Together with
S(Mm, ^)-^0, this gives 0(Mm, ^V)-*0, which contradicts (15.3.3).
Next assume that jV C Im P(T\ F) is a stable invariant subspace for 7\ but
not for TQ. Then one can find an e >0 such that, for every positive integer
m, there exists a transformation Sm on Im P(T\ F) satisfying

and

Let T, be the restriction of T to Ker P(T; F) and write

where the matrix representation corresponds to the decomposition (15.3.4).


From (15.3.5) it is clear that Sm—> T. Hence, as jV is a stable invariant
subspace for T, there exists a sequence {^Vm} of subspaces of (p" such that
Mm is 5m invariant and B(Jim, M)-*0. Put Mm = P(T; r)JV w . Since P(7; F)
commutes with 5OT, then Mm is an invariant subspace for Sm . We now prove
that 6(-Mm, jV)—»0, thus obtaining a contradiction with (15.3.6).
Take y^Mm with ||_y|| < 1, and let xE.Jfm be such that y = P(T\ T)x.
Then

By Proposition 15.3.1, implies that


454 Continuity and Stability of Invariant Subspaces

where So, for m sufficiently large,


Together with (15.3.7), this gives

for m sufficiently large. Using this inequality, we obtain

and

So

for m sufficiently large. We conclude that Q(Mm, N)—>0, and the proof is
complete. D

Lemma 15.3.3
Let jV be an invariant subspace for T, and assume that the contour F splits the
spectrum of T. If M is stable for T, then P(T;T)Jf is a stable invariant
subspace for the restriction T0 of T to Im P(T; F).

Proof. It is clear that M = P(T\ F)jV is T0 invariant.


Assume that M is not stable for T0. Then M is not stable for 7", either, by
Lemma 15.3.2. Hence there exist e > 0 and a sequence {Sm} such that
Sm —» T and

As N is stable for T, one can find a sequence of subspaces {Nm} such


that SmJimCJ^m and 0(Jfm,Jf)-*Q. Further, since F splits the spectrum of
T and Sm —» T, the contour F will split the spectrum of Sm for m suf-
Perturbed Stable Invariant Subspaces 455

ficiently large. But then, without loss of generality, we may assume that F
splits the spectrum of each Sm . Again using Sm —> T, it follows that

Let 3? be a direct complement of M in <p". As Q(Nm, .yV)-»0, we have


<p" = 3? + Jfm for m sufficiently large (Theorem 13.1.3). So, without loss of
generality, we may assume that (p" = 3? + Nm for each m. Let Rm be the
angular transformation of Nm with respect to the projector of ((7" along 3£
onto JV, and put

where the matrix corresponds to the decomposition (f1" = 2£. 4- N. Note that
Tm = Em*SmEm leaves Jf invariant. Because Rm-*Q, we have £ M -»7, and
sor m -»r.
Clearly, F splits the spectrum of T\v. As Tm-* T and JV is invariant for
Tm, the contour F will split the spectrum of Tm ^ too, provided m is
sufficiently large. But then we may assume that this happens for all m. Also,
we have

Hence in the gap topology.


Now consider !£m - EmMm. Then ££m is an S m -invariant subspace. From
Em^I it follows that 6(£m,Mm)-+Q. This, together with e(Mm,M)-*Q,
gives 0(<^m, M)^>Q. So we arrive at a contradiction to (15.3.8) and the
proof is complete.
After this long preparation we are now able to give a short proof of
Theorem 15.2.1.
Proof of Theorem 15.2.1. Suppose that .TV is a stable invariant subspace
for A. Put Then N = ^ + • • • + Nr. By Lemma 15.3.3,
the space JV^ is a stable invariant subspace for the restriction Aj of A to
91^ (A). But, by Lemma 2.1.3, Aj has one eigenvalue only, namely, A ; . So
we may apply Lemma 15.2.5 to prove that jV; has the desired form.
Conversely, assume that each Nj has the desired form, and let us prove
that ^V = JVj + • • • + Nr is a stable invariant subspace for A. By Corollary
15.2.4, the space ^ is a stable invariant subspace for the restriction A- of A
to £%A (A). Hence we may apply Lemma 15.3.2 to show that each Jfj is a
stable invariant subspace for A. But then the same is true for the direct sum
jv = jv + ••• + jf.

15.4 PERTURBED STABLE INVARIANT SUBSPACES

In this section we show that the stability of an /1-invariant subspace M is


preserved under small perturbations of M and A. This is true also when we
restrict our attention to the intersection of M and a fixed spectral subspace
456 Continuity and Stability of Invariant Subspaces

of A. To state this result precisely, denote by $ln(A) the spectral subspace


of A (the sum of root subspaces for A) corresponding to those eigenvalues
of A that lie in an open set ft.

Theorem 15.4.1
Let A : <p" —> <p" be a transformation, and let ft C <p be an open set whose
boundary does not intersect a-(A). Assume that M is an A-invariant subspace
for which the intersection M D 3ft.n(A) is stable (with respect to A}. Then any
B-invariant subspace jV has the property that N D &ln(B) is stable (with
respect to B) provided \\B-A\\ and 0(M, JV) are small enough.

The particular case of Theorem 15.4.1 when ft = <p is especially


important.

Corollary 15.4.2
Let M be a stable A-invariant subspace. Then there exists an € > 0 such that
any B-invariant subspace N is stable provided

We need the following lemma for the proof of Theorem 15.4.1.

Lemma 15.4.3
Let A and ft be as in Theorem 15.4.1, and let M be an A-invariant subspace.
Then for every e > 0 there exists a 8 > 0 such that every B-invariant sub-
space M with ||B — A\\ + 0(M, N)<8 satisfies the inequality

Proof. Arguing by contradiction, assume that there is a sequence of


transformations {Bm}^, =l and a sequence of subspaces {•'Vm}~ = 1 such that
invariant for each
w, but

where e does not depend on m.


Denote by P n (# m ) [resp. P^A)]tne R'esz projector onto ^ ft (B m ) [resp.
onto £%n(^4)]. By Lemma 13.3.2, for m large enough there exists an
invertible transformation Sm: <p" —> <p" such that

and, moreover,
Perturbed Stable Invariant Subspaces 457

Here C,, C2, . . . are positive constants that depend on A only. Actually,
one can take S defined as follows:

Put Bm = SmlBmSm and jfm = Sm^m (so that Jfm is Bm invariant). Let P*
(resp. Pjf ) be the orthogonal projector onto M (resp. Nm). As SmlPx Sm is
a projector onto -A"m (not necessarily orthogonal), we have

where the first inequality follows from (13.1.4). Hence

It is easily seen that and Ker P n (/* m ) = Ker Pn(^4) (for m


large enough). Consequently

Since also

Theorem 13.4.2, together with (15.4.2), implies that

(cf. the proof of Lemma 14.3.2). Now, as in (15.4.2), we have

which contradicts (15.4.1) in view of (15.4.4).

Proof of Theorem 15.4.1. Consider first the case ft = <p (i.e.,


where n is the size of A). Arguing by contradiction, assume that the
statement of the theorem is not true (for Cl = <p). Then there exist an e >0
and a sequence [Bm}^J==l of transformations on <p" converging to A such that
Q{M, N) > € for every stable Bm-invariant subspace jV, m = 1, 2,. . . Since
M is stable and Bm—> A, there exists a sequence {^m}^ = 1 of subspaces in
<p" with BmMmC.Mm for each w and 6(Mm, M)^>$. For m sufficiently
large we have Q(Mm, M)<e, and hence the # m -invariant subspace Mm is
not stable.
458 Continuity and Stability of Invariant Subspaces

Let 3? be a direct complement of M in <p". We may assume that 3? is also a


direct complement to each Mm (Theorem 13.1.3). Let Rm be the angular
transformation of Mm with respect to the projector onto M along '2L. Then
/?-.-*•<). Write

where the matrix representation is taken with respect to the decomposition


<p" = % + M. Then Em is invertible, EmM = Mm, and Em-^> L Put Am =
E~^EmEm. Obviously, Am—=> A and AmM CM. Note that M is not stable
for
With respect to the decomposition <p" = M + 3£, we write

Then Um—>U and W m —» W. Since ^ is not stable for Am, Theorem 15.2.1
ensures the existence of a common eigenvalue \m of Um and Wm such that

Now | A w | < ||Um\\ and {t/m} converges to U. Hence the sequence {A m }


is bounded. Passing, if necessary, to a subsequence, we may assume
that A m -»A 0 for some A 0 E < p . But then \ml - Um-> \0I - U and
\ml — Wm —» A () / — W. It follows that A0 is a common eigenvalue of U and W.
Again applying Theorem 15.2.1, we see that A() is an eigenvalue of
geometric multiplicity one: dim Ker( A () 7 - A)= I . So there exists a nonzero
( n - l ) x ( r t - l ) minor in A,,/ - A. Then, for m large enough, the corres-
ponding minor in A m / - Am is also nonzero, a contradiction with (15.4.5).
Now consider the general case of Theorem 15.4.1. It is seen from the
proof of Lemma 15.4.3 that we can assume that B satisfies 5?n(#) = $tn(A).
But then we can apply the part of Theorem 15.4.1 already proved with <p",
A and B replaced by 9in(A), A\jf ( A ) and B\.A ( B ) , respectively.

Now let us focus attention on the spectral A -invariant subspaces, that is,
sums of root subspaces for A (the zero subspace will also be called spectral).
Theorem 15.2.1 shows that each spectral invariant subspace is stable. The
converse is not true in general: every invariant subspace of a unicellular
transformation is stable, but the only spectral subspaces in this case are the
trivial ones.
For the spectral subspaces, an analog of Theorem 15.4.1 holds.

Theorem 15.4.4
Let A and O be as in Theorem 15.4.1. Assume that M is an A-invariant
subspace for which M H ^i{l(A) is a spectral invariant subspace for A. Then
Lipschitz Stable Invariant Subspaces 459

any B-invariant subspace X has the property that N n £%n(B) is spectral (as a
B-invariant subspace) provided \\B — A\\ + 6(M, N) is small enough.

Proof. As in the proof of Theorem 15.4.1, the general case can be


reduced to the case O = <p. So assume £1 = <p.
Since every invariant subspace is the sum of its intersections with the root
subspaces, it follows that an A -invariant subspace Z£ is spectral if and only if
there is an /t-invariant direct complement ££' to !£ such that a(A\^)r\
°"(^U ) ~ 0- Let A be an open set containing o-(A\ u), and let A' be an open
set disjoint with A that contains all other eigenvalues of A (if any). Then
cr(A\ y . ) C A' for an ^-invariant direct complement M' to M (actually, M' is
the spectral A -invariant subspace corresponding to the eigenvalues in A').
By Theorem 15.1.4, any B-invariant subspace jV satisfies cr(B| v )CA
provided \\B - A\\ + 9(M, N) is small enough. On the other hand, by
Theorems 15.2.1 and 15.1.4 there exists a B-invariant subspace N' such that
o-(/?| v ,)CA' and 6(M',N') is as small as we wish provided ||B-v4|| is
small enough. As X' is a direct complement to Ji (Theorem 13.1.3) and
c r ( B \ H , ) Pi cr(B\ N -,) = 0, it follows that JV is spectral. D

The proof of Theorem 15.4.4 shows that if M is a spectral ^-invariant


subspace with cr(y4|^)cn, where O C <p is an open set, then for any
B-invariant subspace jV such that ||B — A\\ + 0(M, JV) is small enough, we
also have

15.5 LIPSCHITZ STABLE INVARIANT SUBSPACES

In this section we study a stronger version of stability for invariant sub-


spaces. A subspace M C <p" that is invariant for a transformation
A: (p"—»<p" is said to be Lipschitz stable (with respect to A) if there exist
positive constants K and e such that every transformation B: (p" —» <p" with
| | f i - X | | < e has an invariant subspace N with
Clearly, every Lipschitz stable subspace is stable; the converse is not true in
general.
The following theorem decribes Lipschitz stability.

Theorem 15.5.1
For a transformation A and an A-invariant subspace M the following
statements are equivalent: (a) M is Lipschitz stable; (b) M — {0} or else
M = £%A (A) + • • • + £%A (A) for some different eigenvalues A,, . . . , \r of A;
in other words, M is a spectral A-invariant subspace; (c) for every suf-
ficiently small e > 0 there exists a 8 > 0 such that any transformation B with
\\A — B\\ < 8 has a unique invariant subspace N for which 0(M, N) < €.
460 Continuity and Stability of Invariant Subspaces

The emphasis in (c) is on the uniqueness of Jf\ if the word "unique" is


omitted in (c), we obtain the definition of stability of M.

Proof. First, arguing as in the proof of Lemma 15.3.2, one shows


that M is a Lipschitz stable A -in variant subspace if and only if each
intersection M D £% ^ (A) is Lipschitz stable (with respect to the restriction
A\a (A)} f°r J = 1> • • • > 5 » where /A, , . . . , /u^ are all the distinct eigenvalues
of A.
Assume that (c) holds but (b) does not. Then M is a stable subspace, and
Theorem 15.2.1 ensures that for some eigenvalue A() of A with
dim Ker( A07 - A) = 1 we have

in a Jordan basis for A in and define the transformation B(a),


where 0 < a < 1 , as follows:

B(a) — A on all root subspaces of A other than £%A (^4). Then B(a)—> A as
a-*0. Let p = dim ffl^A); q = dim Ker M n ^Ao(X) ; so 0<q<p. For
brevity, denote the right-hand side of (15.5.1) by K(a). To obtain a
contradiction, it is sufficient to show that for a small enough the number of
g-dimensional ^(a)-invariant subspaces N such that B(M fl £%A (A), Jf)<
C, a 'lp is exactly ( ) > 1 (we denote by C, , C2 , . . . positive constants that
depend on p and q only).
Let us prove this assertion. The matrix K(a) has p different eigenvalues
el , . . . , ep , which are the p different roots of the equation xp — a. The
corresponding eigenvectors are y, = (1, e,, . . . , ef" 1 ), / = 1, . . . , p. The
only ^-dimensional K(a)- invariant subspaces are those spanned by any q
vectors among y , , . . . , yp . Take such a subspace «AT and suppose for
notational convenience that Jf = Span{ y,, . . . , yq}. The projector QN onto
M along the subspace spanned by eg+l, . . . , e is given by the formula
Lipschitz Stable Invariant Subspaces 461

where Yq (resp. Y p _ q ) is the q x q [resp. (p — q) x <y] matrix formed by the


first q (resp. last p - q) rows of the matrix [y, _y2 • • • yq}. As Fg is a
Vandermonde matrix, det y^ = n i < | . <>S9 (e ; .-e ( -)^0 (cf. Example 2.6.4).
Let Zq - Adj Yq be the matrix of algebraic adjoints to the elements of Yq,
so that y"1 = l/(det Yq)Zq. From the form of Yq it is easily seen that
||ZJ<CX /P > where r = 1 + 2 + • • • + (q -2) = £ ( g - l ) ( ? - 2 ) . Further,
|det yj = C 3 a v/p , where 5 = \ q(q - 1) is the number of all pairs of integers
( i , / ) such that l<i<j<q. As ||y p _J < C 4 a* /p , it follows that
l l y ^ ^ y ^ H < Csa(r+'+q)lp = C 5 a l / p . Consequently,

where As Q (resp. Q v ) is a projector onto M n <%A (.4)


(resp. onto N) we have

[see (13.1.4)]. Combining this inequality with (15.5.2), we find that


small enough. Since the number
of ^-dimensional K(a) -invariant subspaces N is exactly the required
assertion is proved.
Conversely, assume that (b) holds but (c) does not. Since M is a stable
subspace (by Theorem 15.2.1), this implies the existence of a sequence
{Bm}*l = } and the existence of two different Bm -invariant subspaces jVlm and
.yV2/w such that \\Bm - A\\<(\lm) and

for / = 1 and 2. Let F (resp. A) be a closed simple rectifiable contour such


that cr(A) D F = 0 [resp. cr(A) fl A = 0] and A,, . . . , \r are the only eigen-
values of A inside F (resp. outside A). Letting <3il .(C) be the image of the
Riesz projector (27r/)~' J, (A/ - C)~' JA, where the matrix C has no
eigenvalues on F, we have M = $ll(A). Since e(^(Bm),&tr(A))^0 as
m —>°o, we find in view of (15.5.3) that

Now ^ combining this with


(15.5.4), it is easily seen that Jfim n &±(Bm) = {0}, at least for large m.
(Indeed, argue by contradiction and use the properties that the set of all
subspaces in (p" is compact and that the limit of a converging sequence of
nonzero subspaces is again nonzero.) So Nim C 9tv(Bm). But (15.5.3) implies
that (for large m) dim Jfim = dim M = dim 9lv(Bm). Hence Jfim = 9lr(Bm),
462 Continuity and Stability of Invariant Subspaces

/ = 1,2 (for large m), contradicting the assumption that Nlm and M2m are
different.
Now we prove the equivalence of (a) and (b). In view of Theorem 15.2.1,
we have to check that the only Lipschitz stable invariant subspaces of the
Jordan block

are the trivial spaces {0} and <p". For a > 0, let

For k = I , . . . , n — 1, the only A>dimensional /-invariant subspace Nk is


spanned by the first k unit coordinate vectors. Denote by Pk the orthogonal
projector onto Nk, and let Pk a denote the orthogonal projector onto a
^-dimensional 7 a -invariant subspace Nk ( ! < & < « - ! ) . We have
where So

Now use |e| = Va. One finds that for a sufficiently small

On the other hand, ||/ — Ja \\ = a. But then it is clear that for 1 < k < n — 1
the space Nk is not a Lipschitz stable invariant subspace of /, and thus / has
no nontrivial Lipschitz stable invariant subspace.

The property of being a Lipschitz stable subspace is stable in the


following sense: let M be an ^-invariant Lipschitz stable subspace. Then any
/^-invariant subspace Ji is Lipschitz stable (with respect to B} provided
\\B - A\\ and 0(M, JV) are small enough. In view of Theorem 15.5.1, this is
simply a reformulation of Theorem 15.4.4.
It follows from Theorem 15.5.1 that a transformation A: £"—»• <p" has
Stability of Lattices of Invariant Subspaces 463

exactly 2r different Lipschitz stable invariant subspaces, where r is the


number of distinct eigenvalues of A.

15.6 STABILITY OF LATTICES OF INVARIANT SUBSPACES

In this section we extend the notion of stable invariant subspaces to the


lattices of invariant subspaces.
Recall that a set A of subspaces in <p" is called a lattice if M , N G A
implies M + Jf G A and M D JV E A. Two lattices A and A' of subspaces in
<J7" are isomorphic if there exists a bijective map S: A —»A' such that
S(M H JV) = SM n SJV and S(J* + JV) = £/« + SN for any two members M
and jV of A. In this case 5 is called an isomorphism of A onto A'.
Let A be a lattice of (not necessarily all) invariant subspaces of a
transformation A: $"—> <p". The lattice A is called stable if for every e > 0
there exists a 5 > 0 such that, for any transformation Z? : <p" —» <p" with
||/4-Z?||<5, there exists a lattice A' of (not necessarily all) fi-invariant
subspaces that is isomorphic to A and satisfies sup^ eA 6(3?, S(£)) < e for
some isomorphism S: A—» A'. If A consists of just one subspace, we obtain
the definition of a stable invariant subspace.

Theorem 15.6.1
A lattice A of A-invariant subspaces is stable if and only if it consists of stable
A-invariant subspaces.

Proof. Without loss of generality we can assume that {0} and C^n belong
to Λ.
Suppose first that Λ contains an A-invariant subspace M that is not stable.
Then there exist an ε_0 > 0 and a sequence of transformations {B_m}_{m=1}^∞
tending to A such that θ(M, N) > ε_0 for any B_m-invariant subspace N and
any m. Obviously, Λ cannot be stable.
Assume now that every member of Λ is a stable A-invariant subspace. As
the number of stable A-invariant subspaces is finite (by Theorem 15.2.1),
the lattice Λ is finite. Let M_1, . . . , M_p be all the elements in Λ. Denote by
λ_1, . . . , λ_r the distinct eigenvalues of A ordered so that

Then

where N_{ij} ⊆ ℛ_{λ_j}(A), and N_{ij} is equal either to {0} or to ℛ_{λ_j}(A) for
j = s + 1, . . . , r. Let Γ_j (j = 1, . . . , r) be a small circle around λ_j such that
λ_j is the only eigenvalue of A inside or on Γ_j. There exists a δ_0 > 0 such that
all transformations B: C^n → C^n with ||B − A|| < δ_0 have all their eigenvalues
inside the circles Γ_j; for such a B denote by ℛ_j(B) the sum of root subspaces
of B corresponding to the eigenvalues inside Γ_j. Now put

where, for j = s + 1, . . . , r, N′_{ij} is equal to {0} or to ℛ_j(B) according as N_{ij}
is {0} or ℛ_{λ_j}(A); for j = 1, . . . , s we take N′_{ij} as follows. Let {0} = L_0 ⊂
L_1 ⊂ · · · ⊂ L_m = ℛ_j(B) be a complete chain of B-invariant subspaces in
ℛ_j(B); then N′_{ij} is equal to that subspace L_k whose dimension coincides
with the dimension of N_{ij}. Clearly, M′_i is B invariant. Further, it is clear
from the construction that M′_i ⊆ M′_k if and only if M_i ⊆ M_k. Using
Theorem 15.2.3, it is not difficult to see that, given ε > 0, there exists a
positive δ ≤ δ_0 such that max_{1≤i≤p} θ(M_i, M′_i) < ε for any transformation
B: C^n → C^n with ||B − A|| < δ. Putting Λ′ = {M′_1, . . . , M′_p}, we find that Λ
is stable. □
The case when the lattice Λ is a chain is of special interest for us.
We say that a chain L_1 ⊂ · · · ⊂ L_r of A-invariant subspaces is stable if for
every ε > 0 there exists a δ > 0 such that any transformation B: C^n → C^n
with ||B − A|| < δ has a chain L′_1 ⊂ · · · ⊂ L′_r of invariant subspaces such that
θ(L_i, L′_i) < ε for i = 1, . . . , r. It follows from Theorem 15.6.1 that a chain
of A-invariant subspaces is stable if and only if each member of this chain is
a stable A-invariant subspace.
The notion of Lipschitz stability of a lattice of invariant subspaces is
introduced naturally: a lattice Λ of (not necessarily all) A-invariant sub-
spaces is called Lipschitz stable if there exist positive constants ε and K such
that every transformation B with ||B − A|| < ε has a lattice Λ′ of invariant
subspaces that is isomorphic to Λ and satisfies

where S runs through the set of all isomorphisms of Λ onto Λ′. Obviously,
every Lipschitz stable lattice of invariant subspaces is stable. We leave the
proof of the following result to the reader.

Theorem 15.6.2
A lattice Λ of A-invariant subspaces is Lipschitz stable if and only if Λ
consists only of spectral subspaces for A.

15.7 STABILITY IN METRIC OF THE LATTICES OF INVARIANT SUBSPACES

If the lattice Λ consists of all A-invariant subspaces, then a different notion
of stability (based on the distance between sets) is also of interest. To
introduce this notion, we start with some terminology.

Given two sets X and Y of subspaces in C^n, the distance between X and Y
is introduced naturally:

dist(X, Y) = max{ sup_{M∈X} inf_{N∈Y} θ(M, N), sup_{N∈Y} inf_{M∈X} θ(M, N) }

Borrowing notation from set theory, denote by 2^Z the set of all subsets of a
set Z. Then dist(X, Y) is a metric on 2^{𝒢(C^n)} [as before, 𝒢(C^n) represents the
set of all subspaces in C^n]. Indeed, the only nontrivial property that we have
to check is the triangle inequality:

dist(X, Z) ≤ dist(X, Y) + dist(Y, Z)

for any subsets X, Y, Z of 𝒢(C^n). For M ∈ X, N ∈ Y, L ∈ Z we have

θ(M, L) ≤ θ(M, N) + θ(N, L)                    (15.7.1)

Fix M and ε > 0, and take N ∈ Y in such a way that

θ(M, N) ≤ inf_{N′∈Y} θ(M, N′) + ε

Taking the infimum in (15.7.1) with respect to L, we obtain

inf_{L∈Z} θ(M, L) ≤ inf_{N′∈Y} θ(M, N′) + ε + dist(Y, Z)

Now take the supremum with respect to M, and, from the resulting
inequality with the roles of X and Z interchanged, it follows that

dist(X, Z) ≤ dist(X, Y) + dist(Y, Z) + ε

As ε > 0 was arbitrary, the triangle inequality follows.
Note also that dist(X, Y) ≤ 1 for any X and Y.
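In computations, the gap θ and the distance just introduced can be evaluated from orthogonal projectors; a minimal sketch (the function names and the use of QR factorization to obtain orthonormal bases are our own choices):

```python
import numpy as np

def gap(M, N):
    """Gap theta(M, N) = ||P_M - P_N|| between the column spans of M and N,
    where P_M, P_N are the orthogonal projectors and ||.|| the spectral norm."""
    QM, _ = np.linalg.qr(M)
    QN, _ = np.linalg.qr(N)
    return np.linalg.norm(QM @ QM.conj().T - QN @ QN.conj().T, 2)

def dist_sets(X, Y):
    """dist(X, Y) for finite sets X, Y of subspaces, each subspace given
    by a matrix whose columns span it."""
    d1 = max(min(gap(M, N) for N in Y) for M in X)
    d2 = max(min(gap(M, N) for M in X) for N in Y)
    return max(d1, d2)

# Example: two one-dimensional subspaces of C^2.
e1 = np.array([[1.0], [0.0]])
v = np.array([[1.0], [0.1]])
print(gap(e1, v))                 # small: the spans are close
print(dist_sets([e1], [v]))
```

Representing each subspace by a spanning matrix keeps the definitions above literal: the gap is the spectral norm of the difference of orthogonal projectors, and dist is the larger of the two sup-inf quantities.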
The lattice Inv(A) of all invariant subspaces of a transformation
A: C^n → C^n is called stable in metric if for every ε > 0 there exists a δ > 0
such that the lattice Inv(B) of any transformation B: C^n → C^n with
||B − A|| < δ satisfies dist(Inv(B), Inv(A)) < ε. The following theorem
describes all transformations with stable lattices of invariant subspaces.

Theorem 15.7.1
Inv(A) is stable in metric if and only if A is nonderogatory, that is,
dim Ker(A − λ_0 I) = 1 for every eigenvalue λ_0 of A.

Proof. Assume that A is derogatory. Then obviously Inv(A) is an
infinite set. Without loss of generality we can assume that A is a matrix in
Jordan form:

Here λ_1, . . . , λ_r are (not necessarily distinct) eigenvalues of A. Let

be sequences of numbers tending to zero as m → ∞ and such that

for any i ≠ j and any positive integer m. (Such sequences can obviously be
arranged.) Letting

we obtain ||A_m − A|| → 0 as m → ∞. Moreover, the number of A_m-invariant
subspaces is exactly (k_1 + 1) · · · (k_r + 1), and the lattice of A_m-invariant
subspaces is independent of m. As Inv(A) is infinite, clearly dist(Inv(A),
Inv(A_m)) ≥ ε > 0, where ε does not depend on m. Hence Inv(A) is not
stable in metric.
Assume now that A is nonderogatory. Then the lattice Inv(A) is finite.
Let M_1, . . . , M_p be all the A-invariant subspaces. Theorem 15.2.1 shows
that every M_i is stable. That is, given ε > 0, there exists a δ_i > 0 such that
any transformation B with ||B − A|| < δ_i has an invariant subspace N_i such
that θ(M_i, N_i) < ε. Taking δ′ = min(δ_1, . . . , δ_p), we have

sup_{M∈Inv(A)} inf_{N∈Inv(B)} θ(M, N) < ε

for every transformation B with ||B − A|| < δ′. We prove now that, given
ε > 0, there exists a δ″ > 0 such that

sup_{N∈Inv(B)} inf_{M∈Inv(A)} θ(M, N) < ε

for every transformation B with ||B − A|| < δ″. Suppose not. Then there is a
sequence of transformations {B_m} with ||B_m − A|| → 0 such that for every m
there exists a B_m-invariant subspace N_m with

inf_{M∈Inv(A)} θ(M, N_m) ≥ ε_0                   (15.7.2)

where ε_0 is independent of m. Using the compactness of the set of all
subspaces in C^n, we can assume that lim_{m→∞} θ(N_m, N) = 0 for a certain
subspace N in C^n. Then (15.7.2) gives

inf_{M∈Inv(A)} θ(M, N) ≥ ε_0                     (15.7.3)

However, by Theorem 15.1.1, N ∈ Inv(A), which contradicts (15.7.3).
Now, given ε > 0, let δ = min(δ′, δ″) to see that dist(Inv(B), Inv(A)) < ε
for every transformation B with ||B − A|| < δ. □

It follows from Theorems 15.6.1 and 15.7.1 that Inv(A) is stable if and
only if it is stable in metric.
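The nonderogatory condition in Theorem 15.7.1 can be tested numerically, at least when the eigenvalues are well separated; a rough sketch (the helper is our own, with the caveats noted in the comments):

```python
import numpy as np

def is_nonderogatory(A, tol=1e-8):
    """Rough numerical test that dim Ker(A - lam0*I) = 1, i.e. that
    rank(A - lam0*I) >= n - 1, for every computed eigenvalue lam0.
    (For badly defective matrices the computed eigenvalues are
    ill-conditioned, so this is only a sketch, not a robust test.)"""
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        if np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol) < n - 1:
            return False
    return True

J3 = np.diag([1.0, 1.0], k=1)   # one 3x3 Jordan block at 0: nonderogatory
Z2 = np.zeros((2, 2))           # two 1x1 blocks at 0: derogatory
print(is_nonderogatory(J3), is_nonderogatory(Z2))   # True False
```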
Also, let us introduce the notion of Lipschitz stability in metric. We say
that the lattice Inv(A) of all invariant subspaces of a transformation
A: C^n → C^n is Lipschitz stable in metric if there exist positive constants K
and ε such that for any transformation B with ||B − A|| < ε the inequality

dist(Inv(B), Inv(A)) ≤ K||B − A||

holds.

Theorem 15.7.2
The lattice Inv(A) for a transformation A: C^n → C^n is Lipschitz stable in
metric if and only if A has n distinct eigenvalues.

Proof. Assume that A has n distinct eigenvalues. Then every A-
invariant subspace is spectral, and by Theorem 15.5.1 every A-invariant
subspace is Lipschitz stable. Let M_1, . . . , M_p be all the A-invariant sub-
spaces (their number is finite). So there exist positive constants K_i, ε_i
such that any transformation B with ||B − A|| < ε_i has an invariant subspace
N_i with θ(M_i, N_i) ≤ K_i||B − A||. Letting K = max(K_1, . . . , K_p), ε =
min(ε_1, . . . , ε_p), we find that

sup_{M∈Inv(A)} inf_{N∈Inv(B)} θ(M, N) ≤ K||B − A||          (15.7.4)

provided ||B − A|| < ε. Now consider the invariant subspaces of B. As A has
n distinct eigenvalues, the same is true for any transformation B sufficiently
close to A. So every B-invariant subspace N is spectral:

N = ℛ_Γ(B)

for a suitable contour Γ. We can assume that Γ ∩ σ(A) = ∅. Then, letting
M = ℛ_Γ(A), we find that

θ(M, N) ≤ K′||B − A||

for every transformation B sufficiently close to A (cf. the verification of
stability of a direct sum of root subspaces at the beginning of Section 15.2).
Hence

sup_{N∈Inv(B)} inf_{M∈Inv(A)} θ(M, N) ≤ K′||B − A||         (15.7.5)

for all such B. In view of (15.7.4) and (15.7.5), Inv(A) is Lipschitz stable in
metric.
Conversely, if A has fewer than n distinct eigenvalues, then by Theorem
15.5.1 there exists an A-invariant subspace that is not Lipschitz stable. Then
clearly Inv(A) cannot be Lipschitz stable in metric. □
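For a matrix with n distinct eigenvalues, the Lipschitz behaviour asserted by Theorem 15.7.2 is easy to observe numerically; a minimal sketch (the perturbation matrix is an arbitrary choice of ours):

```python
import numpy as np

def gap(M, N):
    """theta(M, N) via orthogonal projectors (cf. the sketch in Section 15.7)."""
    QM, _ = np.linalg.qr(M)
    QN, _ = np.linalg.qr(N)
    return np.linalg.norm(QM @ QM.conj().T - QN @ QN.conj().T, 2)

# A = diag(1, 2) has n = 2 distinct eigenvalues, so Inv(A) is Lipschitz
# stable in metric: each invariant subspace of the perturbed matrix stays
# within O(||B - A||) of the matching invariant subspace of A.
A = np.diag([1.0, 2.0])
E = np.array([[0.3, 1.0], [0.5, -0.2]])
for t in [1e-2, 1e-4, 1e-6]:
    B = A + t * E
    w, V = np.linalg.eig(B)
    i = int(np.argmin(np.abs(w - 1.0)))        # eigenvector of B near lambda = 1
    d = gap(np.eye(2)[:, [0]], V[:, [i]])
    print(t, d, d / np.linalg.norm(t * E, 2))  # the ratio stays bounded
```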

15.8 STABILITY OF [A B]-INVARIANT SUBSPACES

In this section we treat the stability of [A B]-invariant subspaces. In view of
the important part they play in our applications (see Section 17.7), the
reader can anticipate subsequent applications of this material.
Let A: C^n → C^n and B: C^m → C^n be linear transformations. Recall from
Chapter 6 that a subspace M ⊆ C^n is called [A B] invariant if there exists a
transformation F: C^n → C^m such that (A + BF)M ⊆ M (actually, this is a
property that is equivalent to the definition of an [A B]-invariant subspace,
as proved in Theorem 6.1.1). We restrict our attention to the case most
important in applications, when the pair (A, B) is a full-range pair; thus

Im B + Im(AB) + · · · + Im(A^{n−1}B) = C^n
It turns out that, in contrast with the case of invariant subspaces for a
transformation, every [A B]-invariant subspace is stable and, moreover, the
stability holds in the Lipschitz sense. More exactly, we have the following
theorem.

Theorem 15.8.1
Let A: C^n → C^n and B: C^m → C^n form a full-range pair of transformations.
Then for every [A B]-invariant subspace M there exist positive constants ε
and K such that, for every pair of transformations A′: C^n → C^n and
B′: C^m → C^n with

||A − A′|| + ||B − B′|| < ε

there exists an [A′ B′]-invariant subspace M′ satisfying

θ(M, M′) ≤ K(||A − A′|| + ||B − B′||)
Proof. Let F: C^n → C^m be a transformation such that (A + BF)M ⊆ M,
and write A + BF and B as block matrices with respect to the decomposition
C^n = M ∔ N, where N is some direct complement to M:

We claim that (A_22, B_2) is a full-range pair. Indeed, since (A, B) is a
full-range pair, so is (A + BF, B) (Lemma 6.3.1). Now for every x =
x_M + x_N ∈ C^n with x_M ∈ M, x_N ∈ N we have

Hence, in view of the full-range property of (A + BF, B), we find that

This implies the full-range property of (A_22, B_2).
We appeal to the spectral assignment theorem (Theorem 6.5.1). Accord-
ing to this theorem, there exists a transformation G_1: N → C^m such that

(15.8.2)

Put F_0 = F + G, where the transformation G: C^n → C^m is defined by the
properties that Gx = 0 for all x ∈ M and Gx = G_1 x for all x ∈ N. Clearly

Condition (15.8.2) ensures that M is a spectral invariant subspace for A +
BF_0. By Theorem 15.5.1, M is Lipschitz stable [as an (A + BF_0)-invariant
subspace]. So there exist constants ε′, K′ > 0 such that every transformation
H: C^n → C^n with ||A + BF_0 − H|| < ε′ has an invariant subspace M′ such
that

θ(M, M′) ≤ K′||A + BF_0 − H||

It remains to choose ε in such a way that

and put

□
We emphasize that the full-range property of (A, B) is crucial in
Theorem 15.8.1. Indeed, in the extreme case when B = 0 the [A B]-
invariant subspaces coincide with the A-invariant subspaces and, in general,
not every A-invariant subspace is Lipschitz stable.
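Assuming the pair is given as matrices, the full-range property is a rank condition on the block matrix [B, AB, . . . , A^{n−1}B] and is easy to test; a minimal sketch (the helper name and the test pair are our own):

```python
import numpy as np

def is_full_range(A, B, tol=1e-10):
    """Full-range pair: Im B + Im(AB) + ... + Im(A^(n-1) B) = C^n,
    i.e. the block matrix [B, AB, ..., A^(n-1) B] has rank n."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.linalg.matrix_rank(np.hstack(blocks), tol=tol) == n

A = np.diag(np.ones(3), k=1)                 # the 4x4 Jordan block J_4(0)
b = np.array([[0.0], [0.0], [0.0], [1.0]])
print(is_full_range(A, b))                   # True: (A, b) is full range
print(is_full_range(A, np.zeros((4, 1))))    # False: the extreme case B = 0
```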
The proof of Theorem 15.8.1 reveals some additional information about
the stability of [A B]-invariant subspaces:

Corollary 15.8.2
Let A: C^n → C^n and B: C^m → C^n form a full-range pair of transformations,
and let M be an [A B]-invariant subspace. Then for every transformation
F: C^n → C^m such that (A + BF)M ⊆ M and every direct complement N to M
in C^n there exist positive constants K and ε with the property that to any pair
of transformations A′: C^n → C^n, B′: C^m → C^n with ||A − A′|| + ||B − B′|| <
ε there corresponds a transformation F′: C^n → C^m with Ker F′ ⊇ M, and a
subspace M′ ⊆ C^n, such that (A′ + B′(F + F′))M′ ⊆ M′ and

A dual version of Theorem 15.8.1 also holds. Namely, given a null kernel
pair of transformations G: C^n → C^m and A: C^n → C^n, every [G; A]-invariant
subspace is Lipschitz stable in the above sense (here [G; A] denotes the
column block formed by G and A). The proof can be obtained by using
Theorem 15.8.1 and the fact that a subspace M is [G; A] invariant if
and only if its orthogonal complement is [A* G*] invariant. We leave it to
the reader to state and prove this dual version of Corollary 15.8.2.

15.9 STABLE INVARIANT SUBSPACES FOR REAL TRANSFORMATIONS

Let A: R^n → R^n be a transformation. The definition of stable invariant
subspaces of A is analogous to that for transformations from C^n to C^n.
Namely, an A-invariant subspace M ⊆ R^n is called stable if for every ε > 0
there exists a δ > 0 such that any transformation B: R^n → R^n with ||B −
A|| < δ has an invariant subspace N with θ(M, N) < ε. However, it turns out
that, in contrast with the complex case, the classes of stable and of isolated
invariant subspaces no longer coincide. More exactly, every stable invariant
subspace is isolated, but, in general, not every isolated invariant subspace is
stable.
To describe the stable invariant subspaces of real transformations, we
start with several basic particular cases.

Lemma 15.9.1
Let A: R^n → R^n be a transformation such that σ(A) consists of either exactly
one real eigenvalue or exactly one pair of nonreal eigenvalues. Let the
geometric multiplicity (multiplicities) be greater than one in either case. Then
there is no nontrivial stable A-invariant subspace.

The proof of this lemma is similar to the proof of Lemma 15.2.5.

Lemma 15.9.2
Assume that n is odd and the transformation A: R^n → R^n has exactly one
eigenvalue (which is real) and the geometric multiplicity of this eigenvalue is
one. Then each A-invariant subspace is stable.

Proof. As n is odd, every transformation X: R^n → R^n has an invariant
subspace of every dimension k for 1 ≤ k ≤ n − 1 (this follows from the real
Jordan form for X, because X must have a real eigenvalue). Arguing as in
the proof of Theorem 15.2.3, one proves that for every ε > 0 there exists a
δ > 0 such that, if B is a transformation with ||B − A|| < δ and M is a
k-dimensional B-invariant subspace, there exists a k-dimensional A-
invariant subspace N with θ(M, N) < ε. Since A is unicellular, this subspace
N is unique, and its stability follows. □
Lemma 15.9.3
Let n be even, and let A: R^n → R^n have exactly one real eigenvalue. Let its
geometric multiplicity be one. Then the even-dimensional A-invariant sub-
spaces are stable and the odd-dimensional A-invariant subspaces are not
stable.

Proof. If k is even, then the stability of the k-dimensional A-invariant
subspace (which is unique) is proved in the same way as Lemma 15.9.2,
using the existence of a k-dimensional invariant subspace for every trans-
formation on R^n (for even k this follows from the real Jordan form).
Now let M be a k-dimensional A-invariant subspace where k is odd.
Without loss of generality we can assume A = J_n(0). For every positive ε,
the transformation A(ε) = S(ε) + A, where S(ε) has −ε in the entries (2, 1),
(4, 3), . . . , (n − 2, n − 3), (n, n − 1), and zeros in all other entries, has no
real eigenvalues. Hence A(ε) has no k-dimensional invariant subspaces, so
θ(M, N) = 1 for every A(ε)-invariant subspace N (Theorem 13.1.2). There-
fore, M is not stable. □
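The perturbation used in this proof is easy to reproduce numerically (we take the entries of S(ε) to be −ε, the sign for which the 2 × 2 diagonal blocks acquire the nonreal eigenvalues ±i√ε); a minimal sketch:

```python
import numpy as np

def A_eps(n, eps):
    """J_n(0) perturbed by -eps in the entries (2,1), (4,3), ..., (n, n-1),
    n even.  The characteristic polynomial is (lam**2 + eps)**(n//2),
    so every eigenvalue is +/- i*sqrt(eps): none of them is real."""
    M = np.diag(np.ones(n - 1), k=1)
    for i in range(1, n, 2):          # 0-based rows 1, 3, ..., n-1
        M[i, i - 1] = -eps
    return M

eigs = np.linalg.eigvals(A_eps(6, 1e-3))
print(np.min(np.abs(eigs.imag)))      # > 0: no real eigenvalues survive
```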
Lemma 15.9.4
Assume that A: R^n → R^n has exactly one pair of nonreal eigenvalues α ± iβ,
and that their geometric multiplicity is one. Then every A-invariant subspace
is stable.

Proof. Using the real Jordan form of A, we can assume that

(In particular, n is even.) Theorem 12.2.4 shows that the lattice of A-
invariant subspaces is a chain; so for every even integer k with 0 ≤ k ≤ n,
there exists exactly one A-invariant subspace of dimension k. Also, there
exists an ε > 0 such that any transformation B with ||B − A|| < ε has no real
eigenvalues. [Indeed, for a suitable ε all the eigenvalues of B will be in the
union of the two discs {λ ∈ C : |λ − (α ± iβ)| < β/2}, which do not intersect
the real axis.] Now one can use the proof of Lemma 15.9.2. □

Now we are prepared to handle the general case of a transformation
A: R^n → R^n. Let λ_1, . . . , λ_s be all the distinct real eigenvalues of A, and let
α_1 ± iβ_1, . . . , α_t ± iβ_t be all the distinct eigenvalues of A in the open upper
half plane (so the α_j are real and the β_j are positive). We have

R^n = ℛ_{λ_1}(A) ∔ · · · ∔ ℛ_{λ_s}(A) ∔ ℛ_{α_1±iβ_1}(A) ∔ · · · ∔ ℛ_{α_t±iβ_t}(A)

For every A-invariant subspace N we also have

N = (N ∩ ℛ_{λ_1}(A)) ∔ · · · ∔ (N ∩ ℛ_{λ_s}(A)) ∔ (N ∩ ℛ_{α_1±iβ_1}(A)) ∔ · · · ∔ (N ∩ ℛ_{α_t±iβ_t}(A))

(see Theorem 12.2.1). In this notation we have the following general result
that describes all stable A-invariant subspaces.

Theorem 15.9.5
Let A be a transformation on R^n. The A-invariant subspace N is stable if and
only if all the following properties hold: (a) N ∩ ℛ_{λ_j}(A) is an arbitrary
even-dimensional A-invariant subspace of ℛ_{λ_j}(A) whenever the algebraic
multiplicity of λ_j is even and the geometric multiplicity of λ_j is 1; (b)
N ∩ ℛ_{λ_j}(A) is an arbitrary A-invariant subspace of ℛ_{λ_j}(A) whenever the
algebraic multiplicity of λ_j is odd and the geometric multiplicity of λ_j is 1; (c)
N ⊇ ℛ_{λ_j}(A), or N ∩ ℛ_{λ_j}(A) = {0}, whenever λ_j has geometric multiplicity
at least 2; (d) N ∩ ℛ_{α_j±iβ_j}(A) is an arbitrary A-invariant subspace of
ℛ_{α_j±iβ_j}(A) whenever the geometric multiplicity of α_j + iβ_j is 1, and
N ⊇ ℛ_{α_j±iβ_j}(A) or N ∩ ℛ_{α_j±iβ_j}(A) = {0} whenever α_j + iβ_j has geometric
multiplicity at least 2.

Proof. As in Lemma 15.3.2, one proves that N is stable if and only if
each intersection N ∩ ℛ_{λ_j}(A) is stable as an A|_{ℛ_{λ_j}(A)}-invariant subspace,
and each intersection N ∩ ℛ_{α_j±iβ_j}(A) is stable as an A|_{ℛ_{α_j±iβ_j}(A)}-invariant
subspace. Now apply Lemmas 15.9.1–15.9.4. □

Comparing Theorem 15.9.5 with Theorem 14.6.5, we obtain the follow-
ing corollary.
Corollary 15.9.6
For a transformation A: R^n → R^n, every stable A-invariant subspace is iso-
lated. Conversely, every isolated A-invariant subspace is stable if and only if
A has no real eigenvalues with even algebraic multiplicity and geometric
multiplicity 1.

We pass now to Lipschitz stable invariant subspaces for real transfor-
mations. The definition of Lipschitz stability is the same as for transfor-
mations on C^n. Clearly, every Lipschitz stable invariant subspace is stable.
Also, for a transformation A: R^n → R^n, every root subspace ℛ_λ(A) corres-
ponding to a real eigenvalue λ of A, as well as every root subspace
ℛ_{α±iβ}(A) corresponding to a pair α ± iβ of nonreal eigenvalues of A, is a
Lipschitz stable A-invariant subspace. Moreover, every spectral subspace for
A (i.e., a sum of root subspaces) is also a Lipschitz stable A-invariant
subspace. As in the complex case, these are all the Lipschitz stable subspaces:

Theorem 15.9.7
For a transformation A: R^n → R^n and an A-invariant subspace M ⊆ R^n,
the following statements are equivalent: (a) M is Lipschitz stable;
(b) M = ℛ_{λ_1}(A) + · · · + ℛ_{λ_r}(A) + ℛ_{α_1±iβ_1}(A) + · · · + ℛ_{α_s±iβ_s}(A) for some
distinct real eigenvalues λ_1, . . . , λ_r of A and some distinct eigenvalues
α_1 + iβ_1, . . . , α_s + iβ_s in the open upper half plane (here the terms ℛ_λ(A), or
the terms ℛ_{α±iβ}(A), or even both may be absent, in which case M is
interpreted as the zero subspace); (c) for every ε > 0 small enough there exists
a δ > 0 such that every transformation B: R^n → R^n for which ||B − A|| < δ
has a unique invariant subspace N for which θ(M, N) < ε.
Proof. As in Lemma 15.3.2, one proves that M is Lipschitz stable if and
only if for every real eigenvalue λ of A the intersection M ∩ ℛ_λ(A) is
Lipschitz stable as an A|_{ℛ_λ(A)}-invariant subspace and for every nonreal
eigenvalue α + iβ of A the intersection M ∩ ℛ_{α±iβ}(A) is Lipschitz stable as
an A|_{ℛ_{α±iβ}(A)}-invariant subspace.
Let us prove the equivalence (a) ⇔ (b). In view of the above remark, we
can assume that A has either exactly one real eigenvalue or exactly one pair
of nonreal eigenvalues. By Theorem 15.9.5 we have only to prove that the
transformations represented by the matrices

and

have no nontrivial Lipschitz stable invariant subspaces. For A_1, one shows
this as in the proof of Theorem 15.5.1. Consider now A_2. By a direct
computation one shows that

where n is the size of A_2, and

(For convenience, note that

Moreover, denoting by T the n × n matrix that has 1 in the entries (n/2, 1)
and (n, n/2 + 1) and zeros elsewhere, we have (for a ∈ R)

(15.9.1)

Now the proof of Theorem 15.5.1 shows that the only candidates for
nontrivial Lipschitz stable invariant subspaces for A_2 are
S^{-1}(Span{e_1, . . . , e_{n/2}}) and S^{-1}(Span{e_{n/2+1}, . . . , e_n}). But since these
subspaces are not real (i.e., cannot be obtained from subspaces in R^n by
complexification), A_2 has no nontrivial Lipschitz stable invariant subspaces.
The implication (b) ⇒ (c) is proved as in the proof of Theorem 15.5.1. To
prove the converse implication, observe that, as we have seen in the proof
of Theorem 15.5.1, it is sufficient to show that for any A_2-invariant subspace
M (⊆ R^n) of dimension q (0 < q < n) the number of q-dimensional invariant
subspaces N of A_2(a) such that

(15.9.2)

is at least two. (Here a is positive and sufficiently close to zero, and C is a
positive constant depending on q and n only.) Observe that q, as well as n,
must be even. Using formula (15.9.1) and arguing as in the proof of
Theorem 15.5.1, we find that for any choice of distinct complex numbers
ε_1, . . . , ε_{q/2} with ε_j^n = 1, j = 1, . . . , q/2, the subspace N spanned by the
columns of the real matrix

satisfies (15.9.2). □

15.10 PARTIAL MULTIPLICITIES OF CLOSE LINEAR TRANSFORMATIONS

In this chapter we have studied, up to now, the behaviour of invariant
subspaces under perturbations of the given linear transformation. We have
found that certain information about the transformation (e.g., its spectral
invariant subspaces) remains stable under small changes in the transfor-
mation. Here we study the corresponding problem of stability of the partial
multiplicities of transformations.
Given a transformation A: C^n → C^n, denote by k_1(λ, A), . . . , k_p(λ, A)
the partial multiplicities of A corresponding to its eigenvalue λ, and put
k_r(λ, A) = 0 for r > p (here p is the geometric multiplicity of the eigenvalue
λ). For a closed contour Γ in the complex plane that does not intersect the
spectrum of A, let

k_j(Γ, A) = k_j(λ_1, A) + · · · + k_j(λ_r, A),    j = 1, 2, . . .

where λ_1, . . . , λ_r are all the distinct eigenvalues of A inside Γ. If there are
no eigenvalues of A inside Γ, put formally k_j(Γ, A) = 0 for j = 1, 2, . . . .

Theorem 15.10.1
Given a transformation A: C^n → C^n and a closed contour Γ with Γ ∩ σ(A) =
∅, there exists an ε > 0 such that any transformation B: C^n → C^n with
||B − A|| < ε has no eigenvalues on Γ and satisfies the inequalities

Σ_{j=1}^{s} k_j(Γ, B) ≥ Σ_{j=1}^{s} k_j(Γ, A),    s = 1, 2, . . .      (15.10.1)

and the equality

Σ_{j≥1} k_j(Γ, B) = Σ_{j≥1} k_j(Γ, A)                                 (15.10.2)
Proof. Let n(Γ, f) be the number of zeros (counting multiplicities) of a
scalar polynomial f inside Γ. (It is assumed that f does not have zeros on Γ.)
For s = 1, 2, . . . , n, we have the relations

(15.10.3)

where f_s(λ) is the greatest common divisor of all determinants of s × s
submatrices in λI − A. (Here and in the sequel all transformations on C^n
are regarded as n × n matrices in a fixed basis in C^n.) Indeed, (15.10.3)
follows from Theorem A.4.3 (in the appendix).
Consider the Smith form of λI − A (see the appendix):

λI − A = F(λ) diag[a_1(λ), . . . , a_n(λ)] G(λ)

where F(λ) and G(λ) are n × n matrix polynomials with constant nonzero
determinant, and a_1(λ), . . . , a_n(λ) are scalar polynomials such that a_i(λ) is
divisible by a_{i−1}(λ) for i = 2, . . . , n. By the Binet–Cauchy formula
(Theorem A.2.1), f_s(λ) coincides with the greatest common divisor of all
determinants of s × s submatrices in diag[a_1(λ), . . . , a_n(λ)], and this is
equal to the product a_1(λ) · · · a_s(λ) in view of the properties of
a_1(λ), . . . , a_n(λ). So for s = 1, 2, . . . , n

f_s(λ) = a_1(λ) · · · a_s(λ)                                           (15.10.4)

Now let ε > 0 be so small that if ||B − A|| < ε, the determinant of the top
s × s submatrix in F(λ)^{-1}(λI − B)G(λ)^{-1} has exactly n(Γ; a_1(λ) · · · a_s(λ))
zeros inside Γ. [Such an ε exists by Rouché's theorem in the theory of
functions of a complex variable; e.g., see Marsden (1973).] Denote by h_s(λ)
the greatest common divisor of determinants of all s × s submatrices of
F(λ)^{-1}(λI − B)G(λ)^{-1}. Then h_s(λ) coincides (again by the Binet–Cauchy
formula) with the greatest common divisor of determinants of all s × s
submatrices in λI − B. When ||B − A|| < ε we obviously have
n(Γ; a_1(λ) · · · a_s(λ)) ≥ n(Γ; h_s). Combining this inequality with (15.10.4)
and using (15.10.3) with A replaced by B, we find that, for s = 1, . . . , n,

As the inequalities (15.10.1) with s > n are trivial, (15.10.1) is proved.
Further, Σ_{j≥1} k_j(Γ; A) coincides with the number of zeros of det(λI − A)
inside Γ, counting multiplicities. This number does not change after suf-
ficiently small perturbations of A, again by Rouché's theorem. □
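For small examples the partial multiplicities can be read off a computer-algebra Jordan form, and the effect of a perturbation is consistent with (15.10.1) and (15.10.2); a minimal sketch (the 3 × 3 example is our own):

```python
from sympy import Matrix

# Eigenvalue 0 with partial multiplicities (2, 1): blocks J_2(0), J_1(0).
A = Matrix([[0, 1, 0],
            [0, 0, 0],
            [0, 0, 0]])
P, J = A.jordan_form()
print(J)        # k_1(0, A) = 2, k_2(0, A) = 1

# An arbitrarily small perturbation t*E_{23} (any t != 0; the Jordan
# structure does not depend on the scale of t) merges the two blocks:
B = A.copy()
B[1, 2] = 1
P2, J2 = B.jordan_form()
print(J2)       # a single block J_3(0): k_1 jumps to 3, k_1 + k_2 stays 3
```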

The following question arises in connection with Theorem 15.10.1: are
the restrictions (15.10.1) and (15.10.2) imposed on the transformation B
sufficient for the existence of such a B arbitrarily close to A? Before we
answer this question (it turns out that the answer is yes), let us introduce a
convenient notation for the partial multiplicities of a transformation.
Given a transformation A: C^n → C^n, let

Ω = (s; r_1, . . . , r_s; m_{11}, . . . , m_{1r_1}; . . . ; m_{s1}, . . . , m_{sr_s})      (15.10.5)

be an ordered sequence, where s is the number of distinct eigenvalues of A,
and the ith eigenvalue has geometric multiplicity r_i and partial multiplicities
m_{i1}, . . . , m_{ir_i}. So Σ_{i=1}^{s} Σ_{j=1}^{r_i} m_{ij} = n. The order in (15.10.5) is
determined by the following properties: (a) r_1 ≥ r_2 ≥ · · · ≥ r_s; (b) if
r_i = r_{i+1}, then

Σ_{j=1}^{r_i} m_{ij} ≥ Σ_{j=1}^{r_{i+1}} m_{i+1,j}                       (15.10.6)

(c) if r_i = r_{i+1} and equality holds in (15.10.6), then (m_{i1}, m_{i2}, . . .) is
greater than or equal to (m_{i+1,1}, m_{i+1,2}, . . .) in the lexicographic order.
We say that (15.10.5) is the Jordan structure sequence of A. Denote by Φ the
finite set of all ordered sequences of positive integers (15.10.5) such that
properties (a)–(c) hold and Σ_{i=1}^{s} Σ_{j=1}^{r_i} m_{ij} = n (here n is fixed).
Given the sequence Ω ∈ Φ as in (15.10.5), for every nonempty subset
Δ ⊆ {1, . . . , s} define

(m_{pj} is interpreted as zero for j > r_p). Now we have the following.

Theorem 15.10.2
Let A: C^n → C^n be a transformation with s distinct eigenvalues and Jordan
structure sequence Ω. Then, given a sequence

Ω′ = (s′; r′_1, . . . , r′_{s′}; m′_{11}, . . . , m′_{1r′_1}; . . . ; m′_{s′1}, . . . , m′_{s′r′_{s′}}) ∈ Φ

there exists a sequence of transformations on C^n, say {B_m}_{m=1}^∞, that
converges to A and has common Jordan structure sequence Ω′ if and only if
there is a partition of {1, 2, . . . , s′} into s disjoint nonempty sets Δ_1, . . . , Δ_s
such that the following inequalities hold:

Informally, if λ_1, . . . , λ_s are the distinct eigenvalues of A ordered as in
Ω, and if λ_{1,m}, . . . , λ_{s′,m} are the distinct eigenvalues of B_m ordered as in Ω′,
then the eigenvalues {λ_{j,m}}_{j∈Δ_p} cluster around λ_p, for p = 1, . . . , s.

Proof of Theorem 15.10.2. The necessity of conditions (15.10.7) and
(15.10.8) follows from Theorem 15.10.1.
To prove sufficiency, we can restrict our attention to the case s = 1. Let λ_0
be the eigenvalue of A, and write m̃_j = Σ_{p=1}^{s′} m′_{pj} (recall that r′_1 =
max{r′_1, . . . , r′_{s′}} and that m′_{pj} is zero by definition if j > r′_p). We then
have the inequalities

and the equality

Now we construct a sequence {B_q}_{q=1}^∞ converging to A such that λ_0 is the
only eigenvalue of B_q and, for each q, the Jordan structure sequence of B_q is
Ω̃ = (1; r′_1; m̃_1, . . . , m̃_{r′_1}). Using induction on the number
Σ_k (Σ_{j=1}^{k} m̃_j − Σ_{j=1}^{k} m_{1j}), it is sufficient to consider only the case when,
for some indices l < q, we have m̃_l = m_{1l} + 1, m̃_q = m_{1q} − 1, whereas
m̃_j = m_{1j} for j ≠ l, j ≠ q. Write

as a matrix in some Jordan basis for A. Let

where the matrix Q has all zero entries except for the entry in position
(m_{11} + · · · + m_{1l}, m_{11} + · · · + m_{1q}), which is equal to 1. One verifies
without difficulty that the partial multiplicities of B_q are m̃_l = m_{1l} + 1,
m̃_q = m_{1q} − 1, and m̃_j = m_{1j} for j ≠ l, q.
Given a sequence {B_q}_{q=1}^∞ converging to A such that σ(B_q) = {λ_0} and
the Jordan structure sequence of B_q is Ω̃ (for each q), for a fixed q, let

(15.10.9)

be a Jordan basis for B_q; in other words, x_{j1}, . . . , x_{j,m̃_j} is a Jordan chain for
B_q for j = 1, . . . , r′_1. Let μ_1, . . . , μ_{s′} be distinct complex numbers; define
the transformation B̃_q(μ_1, . . . , μ_{s′}) by the requirement that in the basis
(15.10.9) it has the matrix form

(15.10.10)

where I_l is the l × l unit matrix (a term is absent from (15.10.10) whenever
its index range is empty). Clearly, B̃_q(μ_1, . . . , μ_{s′}) has the Jordan structure
sequence Ω′, and by a suitable choice of the μ_j values one can ensure that

With this choice of the μ_j values (which depend on q), put B_m =
B̃_m(μ_1, . . . , μ_{s′}) to satisfy the requirements of Theorem 15.10.2. □

15.11 EXERCISES

15.1 When are all invariant subspaces of the following transformations
A: C^n → C^n (written as matrices in the standard orthonormal basis)
stable?
(a) A is an upper triangular Toeplitz matrix.
(b) A is a circulant matrix.
(c) A is a companion matrix.
15.2 Describe all stable invariant subspaces for the classes (a), (b), and (c)
in Exercise 15.1.
15.3 Describe all stable invariant subspaces of a block circulant matrix
with blocks of size 2 × 2.
15.4 Show that any transformation A: C^n → C^n with rank A ≤ n − 2 has a
nonstable invariant subspace and identify it.
15.5 Prove that for every transformation A there exists a transformation
B such that every invariant subspace of A + εB is stable. Show that
one can always ensure, in addition, that rank B = n − 1.
15.6 Give an example of a transformation A: C^n → C^n such that there is
no transformation B: C^n → C^n with rank B ≤ n − 2 such that, for
some ε ∈ C, all invariant subspaces of A + εB are stable.
15.7 Given transformations A: C^n → C^n and B: C^n → C^n, an A-invariant
subspace L will be called B stable if for every ε_0 > 0 there exists
δ_0 > 0 such that each transformation A + δB with |δ| < δ_0 has an
invariant subspace M such that θ(L, M) < ε_0. Clearly, every stable
A-invariant subspace is B stable for every B. Give an example of a
B-stable A-invariant subspace that is not stable.
15.8 Show that if A and B commute, then there is a complete chain of
B-stable A-invariant subspaces.
15.9 Give an example of transformations A and B with the property that
an A-invariant subspace is stable if and only if it is B stable.
15.10 Show that an A-invariant subspace is stable if and only if it is B
stable for every B.
15.11 Show that the set of all stable invariant subspaces of a transformation
A: C^n → C^n is a lattice. When is this lattice trivial, that is, when does
it consist of {0} and C^n only? When does this lattice coincide with
Inv(A)?
15.12 Show that every stable invariant subspace is hyperinvariant. Is the
converse true?
15.13 Prove that the transformation A: C^n → C^n has the following property
if and only if A is nonderogatory: for every orthonormal basis
x_1, . . . , x_n in which A has an upper triangular form and any ε > 0
there exists a δ > 0 such that any transformation B: C^n → C^n with
||B − A|| < δ has an upper triangular form in some orthonormal
basis y_1, . . . , y_n that satisfies

15.14 Let A: R^n → R^n and B: R^m → R^n be a full-range pair of real trans-
formations. Show that every [A B]-invariant subspace is stable (in
the class of real transformations and real subspaces). [Hint: Use the
spectral assignment theorem for real transformations (Exercise
12.13).]
15.15 Let A be an upper triangular Toeplitz matrix. Find all possible
partial multiplicities for upper triangular Toeplitz matrices that are
arbitrarily close to A.
15.16 Let A and B be circulant matrices. Compute dist(Inv(A), Inv(B)).
Chapter Sixteen

Perturbations of Lattices of Invariant Subspaces
with Restrictions on the Jordan Structure

In this chapter we study the behaviour of the lattice Inv(X) of all invariant
subspaces of a matrix X when X is perturbed within the class of matrices
with fixed Jordan structure (i.e., with isomorphic lattices of invariant
subspaces). A larger class of matrices, with fixed Jordan structure corres-
ponding only to the eigenvalues of geometric multiplicity greater than 1, is
also studied. For transformations A and B on C^n, our main concern is the
relationship of the distance between the lattices of invariant subspaces for A
and B to ||A − B||.

16.1 PRESERVATION OF JORDAN STRUCTURE AND
ISOMORPHISM OF LATTICES

We start with a definition. Transformations A, B: C^n → C^n are said to
have the same Jordan structure if they have the same number of distinct
eigenvalues [so that we may write σ(A) = {λ_1, . . . , λ_s} and σ(B) =
{μ_1, . . . , μ_s}], and the eigenvalues can be ordered in such a way that the
partial multiplicities of λ_i as an eigenvalue of A coincide with the partial
multiplicities of μ_i as an eigenvalue of B, i = 1, . . . , s.
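For explicit matrices, "same Jordan structure" can be checked by comparing the block sizes of the Jordan forms, grouped by eigenvalue but with the eigenvalues themselves discarded; a minimal sketch (the helper is our own):

```python
from collections import defaultdict
from sympy import Matrix

def structure(A):
    """Jordan block sizes grouped by distinct eigenvalue, with the
    eigenvalues themselves forgotten - exactly the data that the
    definition of 'same Jordan structure' compares."""
    _, J = Matrix(A).jordan_form()
    sizes = defaultdict(list)
    i, n = 0, J.rows
    while i < n:
        j = i
        while j + 1 < n and J[j, j + 1] == 1:   # walk along one Jordan block
            j += 1
        sizes[J[i, i]].append(j - i + 1)
        i = j + 1
    return sorted(tuple(sorted(v, reverse=True)) for v in sizes.values())

# One 2x2 block at 2 versus one 2x2 block at 5: same Jordan structure.
print(structure([[2, 1], [0, 2]]) == structure([[5, 1], [0, 5]]))   # True
# One 2x2 block versus two 1x1 blocks: different Jordan structure.
print(structure([[2, 1], [0, 2]]) == structure([[2, 0], [0, 2]]))   # False
```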
Given a transformation A, denote by J(A) the set of all transformations
with the same Jordan structure as A. This structure is determined by the
sequence of positive integers (which was also a useful tool in Section 15.10):

Ω = (s; r_1, . . . , r_s; m_{11}, . . . , m_{1r_1}; . . . ; m_{s1}, . . . , m_{sr_s})      (16.1.1)

where s is the number of distinct eigenvalues of A, and the ith eigenvalue
has geometric multiplicity r_i and partial multiplicities m_{i1}, . . . , m_{ir_i}. Thus
Σ_{i=1}^{s} Σ_{j=1}^{r_i} m_{ij} = n. The parameters of this sequence are ordered in such
a way that r_1 ≥ r_2 ≥ · · · ≥ r_s, and if r_i = r_{i+1}, then

Σ_{j=1}^{r_i} m_{ij} ≥ Σ_{j=1}^{r_{i+1}} m_{i+1,j}                       (16.1.2)

and, furthermore, if r_i = r_{i+1} and equality holds in inequality (16.1.2), the
integers m_{ij} and m_{i+1,j} are ordered in such a way that (m_{i1}, m_{i2}, . . .) is
greater than or equal to (m_{i+1,1}, m_{i+1,2}, . . .) in the lexicographic order.
Clearly, the property of having the same Jordan structure induces an
equivalence relation on the set of all transformations on C^n. The number of
equivalence classes under this relation is finite and is equal to the number of
all different sequences of type (16.1.1) with the order properties described.
It is shown in the first theorem that transformations have the same Jordan
structure if and only if they have isomorphic (or linearly isomorphic) lattices
of invariant subspaces.
Let us define the notion of isomorphism of lattices. First, let 𝒴_1 and 𝒴_2 be
two lattices of subspaces in C^n. A map φ: 𝒴_1 → 𝒴_2 is called a lattice
homomorphism if φ(M ∩ N) = φ(M) ∩ φ(N) and φ(M + N) = φ(M) + φ(N)
for every two subspaces M, N ∈ 𝒴_1.
A lattice homomorphism φ is called a lattice isomorphism if φ is
one-to-one and onto; in this case the lattices 𝒴_1 and 𝒴_2 are said to be
isomorphic. An example of a lattice isomorphism is provided by the
following proposition.

Proposition 16.1.1
If S: C^n → C^n is an invertible transformation and 𝒴 is a lattice of subspaces
in C^n, then S𝒴 = {SM : M ∈ 𝒴} is also a lattice of subspaces, and the
correspondence φ(M) = SM is a lattice isomorphism of 𝒴 onto S𝒴.

Proof. The definition of φ ensures that φ is onto, and the invertibility of
S ensures that φ is one-to-one. Furthermore,

S(M ∩ N) = SM ∩ SN                                                  (16.1.3)

for any subspaces M and N in C^n. Indeed, the inclusion ⊆ in equation
(16.1.3) is evident. To prove the opposite inclusion, take x ∈ SM ∩ SN, so
x = Sm = Sn for some m ∈ M, n ∈ N. As S is invertible, actually m = n
and x ∈ S(M ∩ N). Finally, the equality S(M + N) = SM + SN is
evident. □

The lattice isomorphisms described in Proposition 16.1.1 are called linear.
So the lattices 𝒴_1 and 𝒴_2 are called linearly isomorphic if there exists a
transformation S: C^n → C^n (necessarily invertible) such that L ∈ 𝒴_1 if and
only if SL ∈ 𝒴_2.
It is easy to provide examples of lattices of subspaces that are isomorphic
but not linearly isomorphic. For instance, two chains of subspaces

M_1 ⊂ M_2 ⊂ · · · ⊂ M_k    and    L_1 ⊂ L_2 ⊂ · · · ⊂ L_l

are lattice isomorphic if and only if k = l (it is assumed that M_i ≠ M_j and
L_i ≠ L_j for i ≠ j). However, there exists an invertible matrix S such that
SM_i = L_i for i = 1, . . . , k if and only if dim M_i = dim L_i for each i.
The following theorem shows, in particular, that for the lattices of all
invariant subspaces isomorphism and linear isomorphism are the same.

Theorem 16.1.2
Let a transformation A: C^n → C^n be given. The following statements are
equivalent for a transformation B: C^n → C^n: (a) B has the same Jordan
structure as A; (b) the lattices Inv(B) and Inv(A) are isomorphic; (c) the
lattices Inv(B) and Inv(A) are linearly isomorphic.

Proof. Assume B ∈ J(A). Let λ_1, . . . , λ_p and μ_1, . . . , μ_p be all the
distinct eigenvalues of A and B, respectively, and let them be numbered so
that the partial multiplicities of A at λ_j coincide with the partial multi-
plicities of B at μ_j for j = 1, . . . , p. For a fixed j, let

be a Jordan basis in ℛ_{λ_j}(A), and let

be a Jordan basis in ℛ_{μ_j}(B) (so k_1, k_2, . . . , k_q are the partial multiplicities
of A at λ_j and of B at μ_j). Given an A-invariant subspace L ⊆ ℛ_{λ_j}(A)
spanned by the vectors

[here the coefficients are complex numbers], put

where

Clearly, ψ_j(L) is a B-invariant subspace that belongs to ℛ_{μ_j}(B). Now for
any A-invariant subspace M put

It is easily seen that ψ is a desired isomorphism between Inv(A) and Inv(B);
moreover, ψ(M) = SM, where S is the invertible transformation defined by
Sx_{rs} = y_{rs}, s = 1, . . . , k_r; r = 1, . . . , q.
Conversely, suppose that ψ: Inv(A) → Inv(B) is an isomorphism of lat-
tices. Let λ_1, . . . , λ_p be all the distinct eigenvalues of A, and let N_j =
ψ(ℛ_{λ_j}(A)), j = 1, . . . , p. Then C^n is a direct sum of the B-invariant sub-
spaces N_1, . . . , N_p. We claim that σ(B|_{N_i}) ∩ σ(B|_{N_j}) = ∅ for i ≠ j. Indeed,
assume the contrary, that is, λ_0 ∈ σ(B|_{N_i}) ∩ σ(B|_{N_j}) for some N_i and N_j
with i ≠ j. Let N = Span{y_1 + y_2}, where y_1 (resp. y_2) is some eigenvector
of B|_{N_i} (resp. of B|_{N_j}) corresponding to the eigenvalue λ_0. Then N is B
invariant. Let M be the A-invariant subspace such that ψ(M) = N. Since M
must contain a one-dimensional A-invariant subspace, and since ψ is a lattice
isomorphism, the subspace M is one-dimensional. Therefore, M ⊆ ℛ_{λ_k}(A)
for some k. This implies N = ψ(M) ⊆ ψ(ℛ_{λ_k}(A)) = N_k, a contradiction
with the choice of N.
Further, the spectrum of each restriction B|_{N_i} is a singleton. To verify
this, assume the contrary. Then for some i the subspace N_i is a sum of at
least two root subspaces for B:

Letting M_f be the A-invariant subspace such that ψ(M_f) = ℛ_{μ_f}(B), f =
1, . . . , k, we have

If x_1 and x_2 are eigenvectors of A|_{M_1} and of A|_{M_2}, respectively, then
Span{x_1 + x_2} is A-invariant and does not belong to any subspace M_f.
Hence ψ(Span{x_1 + x_2}) is B invariant, belongs to N_i, but does not belong
to any subspace ℛ_{μ_f}(B). This is impossible because ψ(Span{x_1 + x_2}) is
one-dimensional, and every one-dimensional B-invariant subspace lies in a
single root subspace of B.
We have proved, therefore, that ψ(ℛ_{λ_j}(A)) = ℛ_{μ_j}(B), j = 1, . . . , p,
where μ_1, . . . , μ_p are all the distinct eigenvalues of B.
For a fixed j, the number of partial multiplicities of A corresponding to λ_j
that are greater than or equal to a fixed integer q coincides with the
maximal number of summands in a direct sum L_1 ∔ · · · ∔ L_s, where, for
i = 1, . . . , s, the L_i ⊆ ℛ_{λ_j}(A) are irreducible subspaces with dimension not
less than q. As ψ induces an isomorphism between Inv(A|_{ℛ_{λ_j}(A)}) and
Inv(B|_{ℛ_{μ_j}(B)}), it follows that the number of partial multiplicities of A
corresponding to λ_j that are greater than or equal to q coincides with the
number of partial multiplicities of B corresponding to μ_j that are not less
than q. Hence A and B have the same Jordan structure. □

Corollary 16.1.3
Assume that A and B are transformations on C^n with one and only one
eigenvalue λ_0. Then the lattices Inv(A) and Inv(B) are isomorphic if and
only if A and B are similar.

16.2 PROPERTIES OF LINEAR ISOMORPHISMS OF LATTICES:
THE CASE OF SIMILAR TRANSFORMATIONS

In view of Theorem 16.1.2, for transformations A and B with the same
Jordan structure, the set 𝒮(A, B) of all invertible transformations S such
that L ∈ Inv(A) if and only if SL ∈ Inv(B) is not empty. Denote

Ω(A, B) = inf{ ||I − S|| : S ∈ 𝒮(A, B) }

Note that the set 𝒮(A, B) contains transformations arbitrarily close to zero.
[Indeed, take a fixed S ∈ 𝒮(A, B) and consider εS with ε → 0, ε ≠ 0.]
Hence Ω(A, B) ≤ 1 for any A and B with the same Jordan structure. This
observation will be used frequently in the sequel.
The following example shows that the equality Ω(A, B) = 1 is possible.

EXAMPLE 16.2.1. Let

Then

and

However, it is easily seen that the norm of I − S is at least 1 for any choice
of a, b, and c, and can be arbitrarily close to 1. Hence Ω(A, B) = 1. □
The number Ω(A, B) is closely related to the distance between Inv(A)
and Inv(B), as we shall see in the next theorem. Recall that

dist(Inv(A), Inv(B)) = max{ sup_{M∈Inv(A)} inf_{N∈Inv(B)} θ(M, N),
                            sup_{N∈Inv(B)} inf_{M∈Inv(A)} θ(M, N) }

Theorem 16.2.1
If A and B have the same Jordan structure and Ω(A, B) < 1, then
Proof. For positive ε < 1 − Ω(A, B), let S ∈ 𝒮(A, B) be such that

||I − S|| ≤ Ω(A, B) + ε

For every nonzero x, denoting y = S^{-1}x, we have

Hence

Now for any subspace M ⊆ C^n the transformation S P_M S^{-1} is a projector
on SM (we denote by P_M the orthogonal projector on M). So

Consequently, dist(Inv(A), Inv(B)) admits the bound stated in the theorem
with Ω(A, B) + ε in place of Ω(A, B), and since ε > 0 was arbitrary,
Theorem 16.2.1 follows. □

Now consider the case when A and B are similar. Then, evidently, A and
B have the same Jordan structure. Clearly, in this case 𝒮(A, B) contains all
the similarity transformations between A and B:

𝒮(A, B) ⊇ { S : S is invertible and A = S^{-1}BS }

We remark that this inclusion can be proper. Indeed, in Example 16.2.1
above, the similarity transformations between A and B have the form

which is a proper subset of 𝒮(A, B).

Theorem 16.2.2
For every transformation A: C^n → C^n we have

where the suprema are taken over all transformations B that are similar to A.

In other words, the first inequality in (16.2.1) means that there exists a
positive constant K (depending on A) such that for every B that is similar
to A we have

||I − T|| ≤ K||A − B||

for some invertible transformation T satisfying A = TBT^{-1}.
In the next section the result of Theorem 16.2.2 is generalized to include
all the transformations B with the same Jordan structure as A.

Proof of Theorem 16.2.2. As θ(M, N) ≤ 1 for any subspaces M, N in C^n
[this follows, for instance, from formula (13.1.3)], we have

dist(Inv(X), Inv(Y)) ≤ 1

for any transformations X, Y: C^n → C^n. So, by Theorem 16.2.1, the second
inequality in (16.2.1) follows from the first one.
To prove the first inequality in (16.2.1), consider the linear space L(C^n)
of all transformations X: C^n → C^n, with the scalar product (X, Y) = tr(XY*)
for X, Y ∈ L(C^n) (where Y* denotes the adjoint of Y defined by the
standard scalar product on C^n) and the corresponding norm ||X||_1 =
√(X, X) for all X ∈ L(C^n). For every B ∈ L(C^n) consider the linear
transformation

W_B: L(C^n) → L(C^n),    W_B(X) = AX − XB

so that, in particular, I ∈ Ker W_A. If B is similar to A, then
dim Ker W_B = dim Ker W_A (indeed, Ker W_B = {XS : X ∈ Ker W_A}, where
S is a fixed invertible transformation such that B = S^{-1}AS). Let P_A be a
fixed projector on Ker W_A. [Thus P_A: L(C^n) → L(C^n).] By Theorem 13.5.1
there exists a positive constant K_1 such that, if B is similar to A, then

||P_A − P_B||_1 ≤ K_1 ||W_A − W_B||_1

for some projector P_B on Ker W_B. Here

||W||_1 = sup{ ||W(X)||_1 : X ∈ L(C^n), ||X||_1 = 1 }

is the norm induced by || · ||_1, and similarly for || · ||.
Observe that the norm || · ||_1 is multiplicative: ||XY||_1 ≤ ||X||_1 ||Y||_1 for
all transformations X, Y: C^n → C^n. Indeed, if || · || is the norm induced on
transformations by the standard norm on C^n, then it is easily verified that
||Y||^2 I − YY* is positive semidefinite and hence that X(||Y||^2 I − YY*)X*
is positive semidefinite. Thus

||XY||_1 ≤ ||X||_1 ||Y||                                             (16.2.3)

Further, denoting by λ_1 ≥ · · · ≥ λ_n (≥ 0) the eigenvalues of the positive
semidefinite transformation Y*Y, we have, for every f ∈ C^n with ||f|| = 1,

||Yf||^2 = (Y*Yf, f) ≤ λ_1 ≤ λ_1 + · · · + λ_n = ||Y||_1^2

so ||Y|| ≤ ||Y||_1. Substitution in (16.2.3) yields the desired inequality.
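The norm inequalities used in this argument are easy to confirm numerically; a minimal sketch (the random test matrices are our own choice):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

def fro(Z):
    """||Z||_1 = sqrt(tr(Z Z*)), i.e. the Frobenius norm."""
    return np.sqrt(np.trace(Z @ Z.conj().T).real)

print(fro(X @ Y) <= fro(X) * fro(Y))               # multiplicativity of ||.||_1
print(fro(X @ Y) <= np.linalg.norm(X, 2) * fro(Y)) # the analogue of (16.2.3)
print(np.linalg.norm(Y, 2) <= fro(Y))              # ||Y|| <= ||Y||_1
```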

Note that (W_A − W_B)(X) = X(B − A), so the multiplicative property of
|| · ||_1 implies

||W_A − W_B||_1 ≤ ||A − B||

for every transformation B similar to A.
The identity transformation I belongs to Ker W_A; so P_A(I) = I and

||I − P_B(I)||_1 = ||P_A(I) − P_B(I)||_1 ≤ K_1 ||A − B||

If, in addition, K_1 ||A − B|| < 1, then P_B(I) is an invertible transfor-
mation. In this case P_B(I) ∈ 𝒮(B, A); hence

Ω(B, A) ≤ ||I − P_B(I)|| ≤ K_2 ||A − B||                             (16.2.4)

for every B ∈ L(C^n) that is similar to A and such that K_1 ||A − B|| < 1,
where the constant K_2 > 0 depends on A only.
Taking into account the fact that Ω(B, A) ≤ 1 for all B similar to A, we
find that (16.2.1) follows from (16.2.4). □

Results analogous to Theorem 16.2.2 hold also for other classes of


subspaces connected with a linear transformation. For example, the lattice
of all invariant subspaces in Theorem 16.2.2 can be replaced by any one of
the following sets: coinvariant subspaces, semiinvariant subspaces, hyper-
invariant subspaces, reducing invariant subspaces, and root subspaces. The
proof remains the same in all these cases.
Theorem 16.2.2 fails in general if we drop the requirement that B is
similar to A. The next example illustrates this fact.

EXAMPLE 16.2.2. Let

Let us compute dist(Inv(A), Inv(B(δ))) for δ ≠ 0. We have, for a complex
number z,

So

Hence

As any subspace in C^2 is A invariant, we obviously have

for every L ∈ Inv(B(δ)). Thus dist(Inv(A), Inv(B(δ))) = 1. In the limit as
δ → 0, we see that the conclusion of Theorem 16.2.2 fails for this particular
A if we drop the condition that B be similar to A. □

We conclude this section with a simple example in which Ω(B, A) and
dist(Inv(B), Inv(A)) can be calculated explicitly.

EXAMPLE 16.2.3. Let

Then

and

Taking a = 1, it follows that

On the other hand, taking u = 0, we have

So

An elementary calculation (using the stationary points of |xd|^2 + |1 − d|^2
considered as a function of the two real variables ℜd and ℑd) yields

To calculate the distance between Inv(A) and Inv(B), note that the only
invariant subspaces of A and B that differ (if x ≠ 0) are the one-dimensional
subspaces Span{·} and Span{·}, respectively, with corresponding orthogonal
projectors

Observe that

and, letting P_3 = I − P_1, we find that

These inequalities, together with the fact that θ(M, N) = 1 if dim M ≠
dim N (see Theorem 13.1.2), allow us to verify that

It is curious that Ω(A, B) = dist(Inv(A), Inv(B)) in this example. □

16.3 DISTANCE BETWEEN INVARIANT SUBSPACES FOR
TRANSFORMATIONS WITH THE SAME JORDAN STRUCTURE

We state the main result of this chapter.

Theorem 16.3.1
Given a transformation A on C^n, we have

sup { Ω(A, B) / ||A − B|| } < ∞                                      (16.3.1)

and

sup { dist(Inv(A), Inv(B)) / ||A − B|| } < ∞                         (16.3.2)

where the suprema are taken over the set J(A) of all transformations
B: C^n → C^n that have the same Jordan structure as A.
Before we proceed with the proof of Theorem 16.3.1 (which is quite
long), let us mention the following result on Lipschitz continuity of
dist(Inv(A), Inv(B)), whose proof is facilitated by the use of Theorem
16.3.1.

Theorem 16.3.2
Let J be a class of all linear transformations having the same Jordan structure.
Then the real function defined on J by

f(A, B) = dist(Inv(A), Inv(B))

for all A, B ∈ J is Lipschitz continuous at every pair A_0, B_0 ∈ J; that is,

|dist(Inv(A), Inv(B)) − dist(Inv(A_0), Inv(B_0))| ≤ K(||A − A_0|| + ||B − B_0||)

for every A, B ∈ J, where the constant K > 0 depends on A_0 and B_0 only.

Proof. We need the following observation (the triangle inequality proved
in Section 15.7):

dist(Inv(A), Inv(B)) ≤ dist(Inv(A), Inv(C)) + dist(Inv(C), Inv(B))  (16.3.3)

for any transformations A, B, C: C^n → C^n. Using (16.3.3) and (16.3.2), we
obtain the required bound for fixed A_0 and B_0. □

Proof of Theorem 16.3.1. Since Theorem 16.2.1, together with (16.3.1),
implies (16.3.2), we have only to prove (16.3.1). The main idea of the proof
is to reduce it to Theorem 16.2.2. For the reader's convenience the proof is
divided into three parts.

(a) Let λ_1, . . . , λ_p be all the distinct eigenvalues of A, and let Γ_i be a
circle around λ_i, i = 1, . . . , p, chosen so small that Γ_i ∩ Γ_j = ∅ for
i ≠ j and λ_i is the unique eigenvalue of A inside Γ_i. For every Γ_i and
every transformation B: C^n → C^n that has no eigenvalues on Γ_i,
define

where μ_1, . . . , μ_u are all the eigenvalues of B inside Γ_i and
k_r(μ_m, B), r = 1, 2, . . . , are the partial multiplicities of B at μ_m
(we put k_r(μ_m, B) = 0 for r greater than the geometric multiplicity
of μ_m as an eigenvalue of B). By Theorem 13.5.1 there exists an
ε_1 > 0 such that any transformation B with ||B − A|| < ε_1 has all its
eigenvalues in the union of the interiors of Γ_1, . . . , Γ_p; moreover,
the sum of algebraic multiplicities of the eigenvalues of B inside a
fixed circle Γ_i is equal to the algebraic multiplicity of the eigenvalue
λ_i of A, for i = 1, . . . , p; further,

(16.3.4)

provided ||B − A|| < ε_1.
(b) Assume now that ||B − A|| < ε_1 and B ∈ J(A). As the numbers of
distinct eigenvalues of B and of A coincide, there is exactly one
eigenvalue of B, denoted μ_i, inside each circle Γ_i. We claim that for
every i = 1, . . . , p the eigenvalue λ_i of A and the eigenvalue μ_i of B
have the same partial multiplicities. Indeed, assuming the contrary,
it follows from (16.3.4) that

for some i_0 (1 ≤ i_0 ≤ p) and some s_0 (note that the equality

holds for i = 1, . . . , p). For notational simplicity assume that i_0 = 1,
and that λ_1, λ_2, . . . , λ_{p_0} are exactly those eigenvalues of A whose
algebraic multiplicities are equal to the algebraic multiplicity of λ_1.
As B ∈ J(A), there is a permutation π of {1, 2, . . . , p_0} such that

(16.3.5)

Consequently,

However, (16.3.4) and (16.3.5) imply

which is a contradiction.
(c) Observe that a transformation F: C^n → C^n with ||F − A|| < ε_1 has
no eigenvalue on Γ_1 ∪ Γ_2 ∪ · · · ∪ Γ_p. So the number

is well defined. For a transformation B ∈ J(A) with ||B − A|| < ε_1/2 we
have [using (13.1.4)]

where Λ_i is the length of Γ_i. Let

Then

and (provided ||B − A|| is small enough)

Put

and for fixed i (1 ≤ i ≤ p) let S_i be the transformation constructed above for
the transformation B ∈ J(A) with ||B − A|| < ε_2. Define the transformation

where S̃_i = S_i|_{ℛ_{λ_i}(A)}. Obviously, μ_i is the only eigenvalue of B_i. Further,
for the transformation B_i we have

Now

(cf. the proof of Theorem 16.2.1). Since (1 − q_i)^{-1} ≤ 2, (16.3.6) gives

Now we have

On the other hand, for any orthonormal basis f_1, . . . , f_k in ℛ_{λ_i}(A) the
inequality (16.3.8) gives

Taking into account (16.3.9), we obtain

Now define the transformation B′: C^n → C^n by

Then B′ is clearly similar to A. As every invariant subspace of a transfor-
mation is the direct sum of its intersections with all the root subspaces of this
transformation, it follows that Inv(B) = Inv(B′). Moreover, inequality
(16.3.10) shows that for all x_i

For every x ∈ C^n write x = x_1 + · · · + x_p, where x_i = P_i(B)x and P_i(B)
is the projector on ℛ_{μ_i}(B) along Σ_{j≠i} ℛ_{μ_j}(B). As P_i(B) = (1/2πi)
∫_{Γ_i} (λI − B)^{-1} dλ, we have

where P_i(A) is the projector on ℛ_{λ_i}(A) along Σ_{j≠i} ℛ_{λ_j}(A). Denoting

we see that ||P_i(B)|| ≤ D_i, i = 1, . . . , p. Now using (16.3.11) with these
inequalities, we obtain

and thus

By Theorem 16.2.2 there exists K_3 > 0 such that for any transformation X
that is similar to A there exists an invertible S ∈ 𝒮(X, A) with ||I − S|| ≤
K_3||X − A||. Applying this result for X = B′ and bearing in mind that
Inv(B′) = Inv(B), we obtain

(16.3.13)

for any B ∈ J(A) with ||B − A|| < ε_2.
As Ω(A, B) ≤ 1 for any B ∈ J(A), (16.3.1) follows from (16.3.13). □

16.4 TRANSFORMATIONS WITH THE SAME DEROGATORY
JORDAN STRUCTURE

The result on continuity of Inv(A) that is contained in Theorem 16.3.1 can
be extended to admit pairs of transformations that are close to one another
and have different Jordan structures, provided the variations in this structure
are confined to those eigenvalues with geometric multiplicity 1. To make
this idea precise, we introduce the following definition. We say that trans-
formations A: C^n → C^n and B: C^n → C^n have the same derogatory Jordan
structure if A|_{𝒞(A)} and B|_{𝒞(B)} have the same Jordan structure, where 𝒞(A)
is the sum of the root subspaces of A corresponding to eigenvalues λ_0 with
dim Ker(λ_0 I − A) > 1. By definition, 𝒞(A) = {0} if dim Ker(λ_0 I − A) = 1
for every eigenvalue λ_0 of A.
Denote by DJ(A) the set of all transformations that have the same
derogatory Jordan structure as A.
We need one more definition to state the next theorem. For a transfor-
mation A, the height of A is the maximal partial multiplicity of A corres-
ponding to the eigenvalues λ_0 with dim Ker(λ_0 I − A) = 1. If A has no such
eigenvalues, its height is defined to be 1.

Theorem 16.4.1
Let A: C^n → C^n be a transformation with height α. Then

sup { dist(Inv(A), Inv(B)) / ||A − B||^{1/α} } < ∞

where the supremum is taken over all B ∈ DJ(A).

The inequality in Theorem 16.4.1 is exact in the sense that in general α
cannot be replaced by a smaller number. Namely, given a transformation A
with height α, there exists a sequence {B_m}_{m=1}^∞ of transformations
converging to A with B_m ∈ DJ(A) such that

(16.4.1)

Indeed, it is sufficient to consider the case when A = J_n(0) is a Jordan block.
Then the sequence

B_m = J_n(0) + (1/m) E_{n1}

(E_{n1} denotes the matrix with 1 in the (n, 1) entry and zeros elsewhere)
satisfies (16.4.1). This is not difficult to verify using the fact that B_m has n
distinct eigenvalues ε m^{−1/n} with corresponding eigenvectors

where ε runs over the nth roots of unity. Indeed, writing ξ = m^{−1/n}, we see
that the orthogonal projector on the span of such an eigenvector is

so

where the positive constant C is independent of m. Hence for m large
enough (such that C|ξ| < 1) we have

and (16.4.1) follows.


The proof of Theorem 16.4.1 is given in the next section. For the time
being, note the following important special case.

Corollary 16.4.2
Let A: <p"~* <P" ^e a nonderogatory transformation with height a. Then there
exists a neighbourhood °IL of A in the set of all transformations on <p" such
that

Recall that a transformation A is called nonderogatory if dimKer(A/ —


A) — 1 for every eigenvalue A of A, and note that the set of all nonderoga-
tory transformations is open. Indeed, if A: <P"~* <P" 's nonderogatory, then
rank(>4 - A 0 /) = n - 1 for every eigenvalue A0 of A. Write A as an n x n
matrix in some basis in (p", and let A0 be an (n — 1) x (n - 1) nonsingular
submatrix of A — A07. Then, for B sufficiently close to A and A sufficiently
close to A () , the corresponding ( n - l ) x ( « — 1) submatrix B(} of B - A/ will
also be nonsingular. Consequently

for all such B and A. Now the eigenvalues of a transformation depend


continuously on that transformation. So the set of A values for which
(16.4.2) holds will contain all eigenvalues of B (if B is close enough to A),
which means that B is nonderogatory.
Using the openness of the set of all nonderogatory linear transformations,
we see that Corollary 16.4.2 follows immediately from Theorem 16.4.1.
The following result on continuity of dist(Inv(A), Inv(B)) can be ob-
tained from Theorem 16.4.1 in the same way that Theorem 16.3.2 was
obtained from Theorem 16.3.1.

Theorem 16.4.3
Let DJ be a class of all transformations having the same derogatory Jordan
structure. Then the real function defined on DJ by

f(A, B) = dist(Inv(A), Inv(B))

for every A, B ∈ DJ is continuous. Moreover, for every pair A_0, B_0 ∈ DJ
there exists a constant K > 0 such that

|dist(Inv(A), Inv(B)) − dist(Inv(A_0), Inv(B_0))|
        ≤ K( ||A − A_0||^{1/α} + ||B − B_0||^{1/β} )

for every A, B ∈ DJ that are sufficiently close to A_0, B_0, where α and β are
the heights of A_0 and B_0, respectively.

Now we consider stable invariant subspaces. Recall from Section 15.2
that an A-invariant subspace M is called stable if for every ε > 0 there exists
δ > 0 such that any transformation B with ||B − A|| < δ has an invariant
subspace N with the property that θ(M, N) < ε. Using Theorem 16.4.1 and
its proof, we can prove a stronger property of stable invariant subspaces:

Theorem 16.4.4
Let A: C^n → C^n be a transformation with height α, and let M be a stable
A-invariant subspace. Then

sup { inf_{N∈Inv(B)} θ(M, N) / ||A − B||^{1/α} } < ∞

where the supremum is taken over all transformations B: C^n → C^n.

It will be convenient to prove Theorem 16.4.4 in the next section,
following the proof of Theorem 16.4.1.

16.5 PROOFS OF THEOREMS 16.4.1 AND 16.4.4

We start with a preliminary result.

Lemma 16.5.1
Let A: C^n → C^n be a transformation with σ(A) = {0} and dim Ker A = 1.
Then, given a constant M > 0, there exists a K > 0 such that

|λ_0| ≤ K ||B − A||^{1/n}                                            (16.5.1)

for every eigenvalue λ_0 of every transformation B: C^n → C^n satisfying
||B − A|| ≤ M.

Proof. Let B: C^n → C^n be such that ||B − A|| ≤ M. We have A^n = 0
and thus

||B^n|| = ||B^n − A^n|| ≤ K_1 ||B − A||

where K_1 depends only on M and ||A|| (expand B^n − A^n as a telescoping
sum of products). On the other hand, if λ_0 is an eigenvalue of B, then λ_0^n
is an eigenvalue of B^n (as one can easily see by passing to the Jordan form
of B). Hence |λ_0|^n ≤ ||B^n||. If this inequality is combined with the
preceding one and nth roots of both sides are taken, the lemma follows. □

Now we prove Theorem 16.4.1 for the case when A: C^n → C^n is non-
derogatory and has only one eigenvalue.

Lemma 16.5.2
Let σ(A) = {λ_0} and dim Ker(λ_0 I − A) = 1. Then there exists a constant
K > 0 such that the inequality

dist(Inv(A), Inv(B)) ≤ K ||B − A||^{1/n}                             (16.5.2)

holds for every transformation B: C^n → C^n.

Proof. It will suffice to prove (16.5.2) for all B belonging to some
neighbourhood of A. We can assume λ_0 = 0. By Lemma 16.5.1 there exist
K_1 > 0 and ε_1 > 0 such that any eigenvalue λ_0 of a B with ||B − A|| < ε_1
satisfies |λ_0| ≤ K_1 ||B − A||^{1/n}. As the set of nonderogatory transforma-
tions is open, we can assume also that every B with ||B − A|| < ε_1 is non-
derogatory. Now for such a B and its eigenvalue λ_0 let x_0 be the corre-
sponding eigenvector: (B − λ_0 I)x_0 = 0, x_0 ≠ 0. Then dim Ker(B − λ_0 I) =
dim Ker A = 1, and using Theorem 13.5.1, we find that

(16.5.3)

for any eigenvalue λ_0 of any B satisfying ||B − A|| < ε_2, where the positive
constants K_2 and ε_2 ≤ ε_1 depend on A only.
It is convenient to assume that A is the Jordan block with respect to the
standard orthonormal basis in C^n: A = J_n(0). For any B sufficiently close to
A write B − A = [b_{ij}]_{i,j=1}^n. Inequality (16.5.3) shows that there is an
eigenvector x of B corresponding to an eigenvalue λ_0 of the form x =
(1, x_2, x_3, . . . , x_n)^T, where x_2, . . . , x_n ∈ C. The equation (B − λ_0 I)x = 0
has the form

Rewrite the first n − 1 equations in the form

Using |λ_0| ≤ K_1 ||B − A||^{1/n} and Cramer's rule, we see that, for
j = 2, 3, . . . , n, x_j has the following structure:

(16.5.4)

where the coefficients are scalar functions of the n^2 variables b_{ij} such that

for every B satisfying ||B − A|| < ε_2. Here and in the sequel L_0, L_1, . . .
denote positive constants that depend on A only.
Now let x^{(1)}, . . . , x^{(k)} be k eigenvectors of B corresponding to k different
eigenvalues λ_1, . . . , λ_k. Construct new vectors using divided differences:

Let
Proofs of Theorems 16.4.1 and 16.4.4 503

be the homogeneous polynomial of degree k in variables y , , . . . , y,. A


simple induction argument [using (16.5.4)] shows that u(ik) has the following
form (where s = k - j and the first s coordinates in u(ik} are zeros):

Here f u w = f u t v ( b ). The induction argument is based on the following


equality (where we put formally pQ = 1):

Now consider the subspace

Obviously

On the other hand, the matrix


504 Perturbations of Lattices of Invariant Subspaces

is a projector on &, where Yk (resp. Y n _ k ) is the k x k [resp. (n - k) x A:]


matrix formed by the upper k (resp. lower n - k) rows of the n x k matrix

Using formulas (16.5.5), we see that

and thus, Yk is invertible (for B sufficiently close to A). Using the estimates
we easily find from (16.5.5) that

*. Hence

So

Consequently

for every transformation B such that \\B - A\\ < e2 and every ^-invariant
subspace is spanned by its eigenvectors. As B must be nonderogatory, the
last condition means that B has n distinct eigenvalues.
Assume now that B is such that \\B - A\\ < e2, but B does not have n
distinct eigenvalues. In particular, B is nonderogatory. Let {# m }™ =1 be a
sequence of transformations such that \\Bm - A\\ < e2 for all ra, Bm—> B as
m —»°°, and Bm has n distinct eigenvalues for each m. Let M be a
A:- dimensional fi-invariant subspace. As M is a stable subspace (see
Theorem 15.2.1), there exists a sequence {Mm}^ =l, where Mm is a k-
dimensional /?„, -invariant subspace such that 0(Mm,M,)-*Q as m—»<». By
(16.5.6)

Passing to the limit in this inequality as m—»«>, we obtain

hence
Proofs of Theorems 16.4.1 and 16.4.4 505

\\B-A\\<e

Proof of Theorem 16.4.1. We now start to prove Theorem 16.4.1 in full


generality. Let F, and F2 be two closed contours in the complex plane such
that F, H F2 = 0 and the eigenvalues A0 of A lying inside F\ (resp. F2) are
exactly those for which dim Ker( A 0 7 — A) - 1 (resp. dim Ker( A()/ - A) > 1).
Let 6 , > 0 be chosen so that any transformation £:(p"—>(p" with
\\B - A\\ < 5, has no eigenvalues on F, U F2. For such a B, let

and define the transformation S: <p"—»<p" by Sx = S^x for xELtft^A), the


spectral subspace associated with the eigenvalues of A inside Fy. Denote by
F, the projector on ^,(^4) along $12(A)\ then for any jcG <p" with jjjc|| = 1
we have

where Ay is the length of Fy and

(cf. the proof of Theorem 16.3.1). Letting


, we ha Hence for
the transformation 5 is invertible and
j= 1,2. Now put B = S~1BS. Then (cf. the proof of Theorem 16.2.1)

As

it is sufficient to prove Theorem 16.4.1 only for those B: <p"~* <P" that are
close enough to A and satisfy ^t(B) - &j(A), j = 1, 2.
506 Perturbations of Lattices of Invariant Subspaces

Note that for any transformation B sufficiently close to A with


$lj(A), every ^-invariant subspace JV is of the form where
Let M be an /^-invariant subspace, and let
where Then, denoting by Py (resp. Qy^ the orthogonal
projector on the subspace j£, (resp. j£2) in £%,(>!) [resp. £%2(-A)], we have

Hence

Further, we remark that if B is sufficiently close to A, and 2ftj(B) = &tj(A)


for; = 1,2, then B\^(A) is nonderogatory, that is, dimKer(A 0 /- B\^^) = 1
for every eigenvalue A0 of B\.A ( A ) . Indeed, this follows from the choice of
£%j(;4), which ensures that A\^ (A) is nonderogatory and from the openness
of the set of nonderogatory transformations. If, in addition, B e DJ(A), it
now follows that ^|^ 2(/ D and B\^(A} have the same Jordan structure. Hence
in view of (16.5.8) and Theorem 16.3.1, we only need to prove the
inequality

In other words, we can assume that A is nonderogatory. Moreover, using


the arguments similar to those employed above, we can assume in addition
that A has only one eigenvalue, and this case is covered already in Lemma
16.5.2.
Theorem 16.4.1 is proved completely

Proof of Theorem 16.4.4. It is sufficient to prove that there exist


positive constants e and K such that the inequality

holds for every transformation B satisfying ||6 - A\\ s e.


Observe that for any transformations B, B: <f"—»• <p" the inequality

holds. Indeed, for every JVelnv(B) and inv(B) we ha


Transformations with Different Jordan Structures 507

Taking the infimum over all NEIn\(B) it follows that

It remains to take the infimum over all jV"Elnv(Z?) to obtain (16.5.10).


Using the arguments from the proof of Theorem 16.4.1 [when (16.5.10) is
used instead of (16.5.7)], we reduce the proof of (16.5.9) to the case when B
has the property that every root subspace ^(A) of A is a spectral subspace
for B and, moreover, the spectra of B\M ( A ) and B\A ( A ) do not intersect if
A, ^ A 2 . Let A,, . . . , A r be all the distinct eigenvalues of A; then

Also, for every B- invariant subspace Ji we have

Arguing as in the proof of Theorem 16.4.1, we obtain

So in order to prove (16.5.9), we can assume without loss of generality that


A has only one eigenvalue, say . If dim K e 1, then by
Theorem 15.2.1 (here we use the assumption that M is stable) M — {0} or
in which case (16.5.9) is trivial. If dim A,7 - A) = 1,
(16.5.9) follows from Theorem 16.4.1. [Note that in this case B e DJ(A) for
all B sufficiently close to A.]

16.6 DISTANCE BETWEEN INVARIANT SUBSPACES FOR


TRANSFORMATIONS WITH DIFFERENT JORDAN STRUCTURES

In this section we investigate the behaviour of dist(Inv(y4), Inv(Z?)) when A


and B have different Jordan structures or different derogatory Jordan
structures. The basic result in this direction is as follows.

Theorem 16.6.1
We have
508 Perturbations of Lattices of Invariant Subspaces

where the infimum is taken over all pairs of transformations A, B:


such that A is derogatory and B is nonderogatory. [The infimum in
depends on n.]

Proof. Recall that B is nonderogatory if and only if the set of its invariant
subspaces is finite.
By assumption, dim Ker( 0/ - A) > 1 from some eigenvalue of A. Let
x and y be orthonormal vectors belonging to Ker 0 /- A), and put

Clearly, the subspaces M(t) are A invariant.


On the other hand, for every nonderogatory B: it *s easily seen
that the number of /^-invariant subspaces does not exceed

where the maximum is taken over all sequences p , , . . . , ps of positive


integers with pl + • • • + ps - n.
Now for any set of 2" subspaces j£, , . . . , «$£>/• in <p" put

As 0(./#(r), ^) is a continuous function of t on [0,1], so is


mm
is/s2"^(-^(0> •=£/)» hence F(J£,, . . . , j£2,,) is well defined. Let us show
thatF(J£j, . . . , £ 2n) is a continuous function of j£, , . . . , ^ 2 «. For some 5 >0,
let JV) , i = 1 , . . . , 2" be subspaces in such tha i)<8 for each i. The
for i = 1, . . . , 2" an 0, 1], we obtai

First take the minimum with respect to / on the left-hand side and then on
the right-hand side. We obtain

for all /E[0, 1]. Taking the maximum with respect to t on the right-hand
side first, and then on the left-hand side, we obtain

With the roles of j£). and Nt, switched it also follows that
Transformations with Different Jordan Structures 509

that is

which proves the continuity of F^,, . . . , j£,/,). Obviously,


F( ) > 0 for all 56 t. As the set of all 2"-tuples of subspaces in <p" is
compact, there exists an e.>0 such that F for all
i = 1 , . . . , 2". From the definition of F(££\ , . . . , Jz^") 's ^ easily seen that
does not depend on the choice of x and y (because any pair of orthonormal
vectors in can be mapped to any other pair of such vectors by a unitary
transformation). Hence the theorem follows. D

When the transformations A and B are both derogatory, or both non-


derogatory, with different Jordan structures, the situation is more compli-
cated. The following question arises naturally: if [Bm}^n = l is a sequence of
transformations converging to A and such that each Bm has Jordan structure
different from that of A, does it follow that

The next example shows that the answer is, in general, negative.

EXAMPLE 16.6.1. For m = 1, 2, . . . , let

Clearly, for all m, Bm and A have different derogatory Jordan structure (in
particular, different Jordan structure).
One-dimensional ^4-invariant subspaces are Span{e, + Be and
Span{e3}. The orthogonal projector on Span{ej + Be3} is

One-dimensional flm-invariant subspaces are Span{e,}, Span{e 3 ), and


Span{<?! + m~le2 + Be3} where . The orthogonal projector on
SPAN IS
510 Perturbations of Lattices of Invariant Subspaces

Now there exists a constant L, >0 (independent of and m) such that

Two-dimensional /1-invariant subspaces are Span{ej, e2 + fte3} where


and Span{^,,e3 }. Two-dimensional fim-invariant subspaces are
Span{e, + m~ e2, e3}, Span{e,, e3}, and Spanje,, e2 + /3e3}, where
The orthogonal projector on is

There exists a constant L2 > 0 (independent of m) such that

Now the inequalities (16.6.3) and (16.6.4) ensure that for m = 1,2, . . . ,

In the last example both A and Bm are derogatory. Taking

we obtain an example contradicting (16.6.2) with both A and /?,„ non-


derogatory.

16.7 CONJECTURES

In view of Example 16.6.1 the following question arises: Given a transfor-


mation A: with a certain Jordan structure, it is true that for an
other Jordan structure there exists a sequence of linear transformations
(Bm}m = \ tnat have this other Jordan structure, for which Bm—> A, and for
which
Conjectures 511

A similar question arises for the case of derogatory Jordan structure, when
(16.7.1) is replaced by

and a is the height of A. Of course, certain conditions should be imposed on


the Jordan structure (or on the derogatory Jordan structure) of {Bm}^=l to
ensure the existence of a sequence {Bm}^ = l converging to A. A complete
set of such conditions is given in Theorem 15.10.2.
Let us describe the Jordan structure of transformations on <p" in terms of
sequences as in (16.1.1), and let <I> be the set of all such sequences. As in
Section 15.10, for

and for every nonempty set (1, . . . , s] define

Further, for O given by (16.7.2) denote by P(il) the set of all sequences

for which there is a partition of (1, . . . , 5'} into s disjoint nonempty sets
5 such that the following relations hold:

Note that always (one takes = {/?}, p - 1, . . . , 5). The set


consists of O if and only if represents the Jordan structure
corresponding to n distinct eigenvalues, that is, 5 = n.
Note that by Theorem 15.10.2, P represents exactly those Jordan
structures for which there is a sequence of transformations converging to a
given transformation with the Jordan structure
We propose the following conjecture.
512 Perturbations of Lattices of Invariant Subspaces

Conjecture 16.7.1
Let A: be a transformation with the Jordan structure f Then
for any sequence ft' that belongs to P(ft) and is different from ft, there exists
a sequence of transformations [Bm}^=l that converges to A, for which each
Bm has the Jordan structure ft', and for which

It is not difficult to verify this conjecture when A is nonderogatory.


Indeed, without loss of generality we can assume that A is the nx n Jordan
block with eigenvalue zero. In view of Theorem 15.10.2, any sequence ft'
belonging to P(ft) (here ft is the Jordan structure of A) has the form

where s> 1 and m, are positive integers with Given such


consider the following n x n matrix (we denote by Qm and Im the m x m zero
and identity matrices, respectively):

where 17,,. . . , ^ are the 5th roots of e, and the n x n matrix Ae has e in the
(5,1) entry and zeros elsewhere. It is easy to see [by considering, e.g.,
det( /- B()] that, at least for e close enough to zero, the matrix B( has the
Jordan structure ft'. Clearly, 17,, . . . , 17, are the eigenvalues of B f , and
(1, 77,,. . . , 17; ', 0, . . . , 0} is the only eigenvector of B (up to multiplica-
tion by a nonzero scalar) corresponding to 17, for / = 1, . . . , s. It follows (cf.
the remark following Theorem 16.5.1) that

and Conjecture 16.7.1 is verified for the matrix A.


To formulate the corresponding conjecture for derogatory Jordan struc-
ture, we introduce one more notion. Let

and

be two sequences from <£. We say that ft and ft' have the same derogatory
part if the number (say, M) of indices /, 1 </ <s such that r- >2 coincide
with the number of indices ;', ! < / < f such that r j > 2 , and, moreover,
r
f it does not
happen that and have the same derogatory part, we say that and
have different derogatory parts.
Exercises 513

Conjecture 16.7.2
Let the transformation A: have the Jordan structure Then
for every sequence that belongs to P( and such that and have
different derogatory parts there exists a sequence of transformations {Bm}^ =l
that converges to A, for which each Bm has the Jordan structure and for
which

where a is the height of A.

16.8 EXERCISES

16.1 Given an n x n upper triangular Toeplitz matrix A, find all possible


Jordan structures of upper triangular Toeplitz n x n matrices that are
arbitrarily close to A. Are there additional Jordan structures if the
perturbed matrix is not necessarily upper triangular Toeplitz?
16.2 Solve Exercise 16.1 for the class of n x n companion matrices.
16.3 Solve Exercise 16.1 for the class of n x n circulant matrices.
16.4 Solve Exercise 16.1 for the class of n x n matrices A such that A2 — 0.
16.5 Prove or disprove each one of the following statements (a), (b), and
(c) : for every transformation A : there exists an 0 such
that any transformation B: with \\B - A\\ has the prop-
erty that (a) the height of B is equal to the height of A; (b) the height
of B is not greater than the height of A; (c) the height of B is not
smaller than the height of A.
16.6 Prove Conjecture 16.7.1 for the case when A = /3(0).
16.7 Given a transformation A: and a number an A-
invariant subspace M is called a stable if there exist positive constants
K and such that every transformation B: h \\B - A\\ < e
has an invariant subspace ^V satisfying

Show that all invariant subspaces of the Jordan block / n ( re a.


stable if n. (Hint: Use Lemma 16.5.2.)
16.8 (a) For every > l , give an example of an -stable y4-invariant
subspace that is not Lipschitz stable, (b) For every l , give an
example of a stable A-invariant subspace that is not a stable.
16.9 Are there a -stable invariant subspaces with 0< < 1?
Chapter Seventeen

Applications

Chapters 13-16 provide us with tools for the study of stability of divisors for
monic matrix polynomials and rational matrix functions. In this chapter we
develop a complete description of stable divisors in terms of their corre-
sponding invariant subspaces and supporting projectors. Special attention is
paid to Lipschitz stable and isolated divisors. We consider also the stability
and isolatedness properties of solutions of matrix quadratic equations as well
as stability of linear fractional decompositions of rational matrix functions.

17.1 STABLE FACTORIZATIONS OF MATRIX POLYNOMIALS:


PRELIMINARIES

Let L be an n x n monic matrix polynomial, and let

be a factorization of L into a product of n x n monic polynomials


Lt We say that the factorization (17.1.1) is stable if, after
sufficiently small changes in the coefficients of L , the new matrix
polynomial again admits a factorization of type (17.1.1) with only small
changes in the factors L y In the next section we study stability of the
factorization of type (17.1.1) in terms of invariant subspaces for the
linearization of the matrix polynomial L In this section we establish the
framework for this study and prove results on continuity of the correspon-
dence between factorizations and invariant subspaces to be used in the next
section.
Let CL be the companion matrix for L

514
Matrix Polynomials: Preliminaries 515

where L As we have seen in Chapter 5, the triple


(X where

is a standard triple for L . Further, there is a one-to-one correspondence


between the factorizations (17.1.1) of L and chains of CL-invariant
subspaces

with the property that the transformations

are invertible (see Section 5.6). Here, lr<: • • < 12< I are some positive
integers. The correspondence between factorizations (17.1.1) and chains of
CL- in variant subspaces is given by the formulas from Theorem 5.6.1.
Namely, let ^ be a direct complement to Mi+l in M} (j = 1,. . . , r - 1) (by (by
definition, Jll = and let P^: Mj—>Nj be the projector on jVy along
MJ+l. For / = !, . . . , r - l , let p. be the difference lj+l — lj where, by
definition, /, = /. Here / is the degree of L Then for /' = 1,. . . , r — 1 we
have

where

and the transformations are


are determined
determined by
by
516 Applications

(As usual, 8UV


uv denotes the Kronecker symbol:
symbol: and if
u T^ v.) For the last factor L
L r(\) we
we have

where

Also, it is
convenient to use the formulas for the products
AB (cf. the proof of
Theorem 5.6.1). We have for / = 2, . . . , r:

where

(Observe that when i = r formula (17.1.4) coincides with the preceding


formula for Lr.) Also, for / = 2, . . . , r:

where is a direct complement to is the projector on


along and

Our next step is to show that this correspondence between factorizations


of monic matrix polynomials L and chains of certain CL-invariant
subspaces is continuous. To this end define a metric on the set of all
n x n monic matrix polynomials of degree k\
Matrix Polynomials: Preliminaries 517

Now fix a positive integer /. Consider the set °Wr of all r-tuples
where L is a monic matrix polynomial of
degree /, and is a chain of CL-invariant subspaces.
The set Wr is a metric space with the metric

For every increasing sequence ofofpositive


positive integers
integers /,/,
with 12 < I, define the subset Wr^ of Wr consisting of the elements
(Mr, . . . , M2, L from Wr with the additional property that the trans-
formations (17.1.3) are invertible.

Theorem 17.1.1
For each g the set Wr ^ is open in Wr .

Proof. Define the subspace OF by the condition


if and only if here As

for p = 1, . . . , / , it follows that the transformation (17.1.3) is invertible if


and only if Mt is a direct complement to ^,_/ in (pn/. From Theorem 13.1.3 it
follows that, if M , 4- then for € > 0 sufficiently small we also have
for every subspace M\ in with 0(^,., M' i)< Hence
Tr,f is open in Wr.

Now define a map

where is an increasing sequence of positive integers


lr, / r _i, . . . , /2 with /2 < /, as follows. Given
V r>f , the image of this element is (L t where the monic
matrix polynomials L, are taken from the factorization

which corresponds to the chain M rot CL- in variant subspaces. It


518 Applications

is evident that Ff is one-to-one and surjective, so that the map FJ1 exists.
Make the set into a metric space
by defining

If A^ , A"2 are topological spaces with metrics pl, p2, defined on A^ and X2,
respectively, the map G: Xl—> X2 is said to be locally Lipschitz continuous
if, for every x Xl , there is a deleted neighbourhood Ux of jc for which

Obviously, a locally Lipschitz continuous map is continuous. It is easy to see


that the composition of two locally Lipschitz continuous maps is again locally
Lipschitz continuous.

Theorem 17.1.2
The maps Ff and F^1 are locally Lipschitz continuous.

Proof. Given
where the products Ll • • • Lf_v
and L, • • • Lr are given by (17.1.5) and (17.1.4), respectively. Then

We show first that the coefficients of Af, i = 1,. . . , r — 1 are


locally Lipschitz continuous. Observe that in the representations (17.1.4)
and (17.1.5) the coefficients of M, and Af are uniformly bounded in some
neighbourhood of (Mr,. . . , M2, L . It is then easily seen that in order
to establish the local Lipschitz continuity of the coefficients of M, and Ni it is
sufficient to verify the following assertion: for a fixed
there exist positive constants and C such that,
for a set of subspaces satisfying for / = 2,. . . , r,
it follows that

Here %_t = {<0, . . . ,0, a,, . . . , and


and
PM || < C0(<3^, J<.), where P^, (resp. PM) is the projector on ^ (resp. ^,)
along ^ _ / . But this conclusion follows from Theorem 13.1.3. Hence the
coefficients of A/,-(A) and W,-(A) are locally Lipschitz continuous functions of
an element in °Wr^. In particular, L t = M2 and L r = Nr are locally Lipschitz
continuous.
Matrix Polynomials: Preliminaries 519

To prove this property for L 2 , . . . , Lr_l, note that

Regard the equalities (17.1.6) as a system of linear equations

where A and b are formed by the entries of coefficients of M ( (A) and


and the unknown vector x is formed by the
entries of the coefficients of L 2 , . . . , Lr_{. The system (17.1.7) has a unique
solution jc; hence the matrix A is left invertible. So jc = A'b, where A1 is a
left inverse of A. Observe that every matrix B with is
1
also left invertible with a left inverse B satisfying

(cf. the proof of Theorem 13.5.1). This inequality shows that x is a locally
Lipschitz continuous function of because A
and b have this property.
To establish the local Lipschitz continuity of F^ 1 , we consider a fixed
element It is apparent that the polynomial L
L,L 2 • • • Lr will be a Lipschitz continuous function of L ] } . . . , Lr in a
neighbourhood of this fixed element. Further, let Mr C • • • C M2 be the
chain of CL- in variant subspaces corresponding to the factorization L =
L,L 2 • • • Lr. Let NJ = LtLi+l • • • Lr for / = 2, . . . , r, and let

where / is the degree of L and ra, is the degree of 7V(. The projector PM on
Mt along %_m is given by the formula

where A',, = [/ 0 ••• 0] and CN is the companion matrix of Nf. Indeed,


obviously, PM is a projector'and K&rPM=(Sl_m. Let us check that
Im PM = Mt. Recall (see the proof of the converse statement of Theorem
5.3.2) that Mi is given by the formula

As

and
520 Applications

we find that ^ ^ I m = Im PM . Formula (17.1.8) implies the local


Lipschitz continuity of PM '(as a function of (L, , . . . , Lr)} and, therefore, also
of M( (cf. Theorem 13.1.1). D

17.2 STABLE FACTORIZATIONSOF MATRIX POLYNOMIALS:


MAIN RESULTS

We say that a factorization

of a monic matrix polynomial L(A), where L t (A) are monic matrix poly-
nomials as well, is stable if for any e >0 there exists a 6 >0 such that any
monic matrix polynomial L(\) with cr,(L,L)<8 admits a factorization
L(A) = L j ( A ) - • • L r (A), where L,(A) are monic matrix polynomials satis-
fying

max

Here / is the degree of L and L, whereas for / = 2,. . . , r, /. is the degree of


the products Li+l • • • Lr and Li+l • • • Lr.
Recall the definition of a stable chain of invariant subspaces given in
Section 15.6.

Theorem 17.2.1
Let equality (17.2.1) be a factorization of the monic matrix polynomial L(A).
Let (Mr,. . . , M2, L(A)) = F^(Llt. . . , L r ) be the corresponding chain of
CL-invariant subspaces. Then the factorization (17.2.1) is stable if and only if
the chain

is stable.

Proof. If the chain (17.2.2) is stable, then by Theorem 17.1.2 the


factorization (17.2.1) is stable.
Now conversely, suppose that the factorization (17.2.1) is stable but the
chain (17.2.2) is not. Then there exists an e >0 and a sequence of matrice
(Cm}^ = ,, such that lim^^.^, Cm = CL and for any chain
Matrix Polynomials: Main Results 521

of Cm-invariant subspaces the inequality

holds. Put Q = co\[8ill]'i=l and

Then Sm converges to colf^C'"1]^, which is equal to the unit nl x nl


matrix. So without loss of generality we may assume that Sm is nonsingular
for all m. Let and note that

A straightforward calculation shows that SmCmS~^ is the companion matrix


associated with the monic matrix polynomial

From (17.2.3) and the fact that Cm-^ CL it follows that a,(Mm, L)^0. But
then we may assume that for all m the polynomial Mm admits a factorization

where crp.(L(>I(A), L,(A))-»0 for / = 1, . . . , r (here pt is the degree of L,,


which is also equal to the degree of Lim for m = 1, 2, . . .).
Let Mr m C • • • C M2 m be the chain of CM -invariant subspaces corre-
sponding to the factorization (17.2.4), that is

By Theorem 17.1.2 we have

Put Yiim = Sm* M^m for i = 2,. . . , r and m = 1,2,. . . . Then ^ m is an


invariant subspace for Cm for each m. Moreover, it follows from Sm—>/
that, for i = 2 , . . . ,r, d(1^ lm , Mi^m)-*0 as m-»oo. (Indeed,
[Indeed, by Theorem
13.1.1
522 Applications

where Pim is the orthogonal projector on Mim. Now

which tends to zero as m tends to infinity.) But then 6(Tim, jti^—^Q as


m—»°°, for / = 2, . . . , r. This contradicts the choice of C m , and the proof of
Theorem 17.2.1 is complete. D

Comparing Theorem 17.2.1 with Corollary 14.6.2 and Theorem 14.2.1,


we obtain the next result.

Corollary 17.2.2
A factorization

with monic matrix polynomials L(A), L t (A), . . . , Lr(\) is stable if and only
if the corresponding chain

of CL-invariant subspaces satisfies the condition that for every eigenvalue A0


of CL with dim Ker(CL - A 0 /) > 1 and for every i (2< i < r) either Mt D

One can formulate a criterion for stability of factorizations of this kind in


terms of eigenvalues of the polynomials L ( (A) rather than the companion
matrix (as we have done in Corollary 17.2.2), as follows.

Theorem 17.2.3
A factorization (17.2.1) is stable if and only if, for any common eigenvalue A0
of a pair L,( A), Ly( A) (i ^/) we have dim Ker L( A 0 ) = 1.

The proof of Theorem 17.2.3 is based on the following lemma.

Lemma 17.2.4
Let
Matrix Polynomials: Main Results 523

be a transformation from (p"1 into <p™, written in matrix form with respect to
the decomposition Then fm is a stable
invariant subspace for A if and only if for each common eigenvalue \0ofAl
and A 2 the condition dim Ker( A07 - A) — 1 is satisfied.

Proof. It is clear that <p mi is an invariant subspace for A. We know from


Theorem 15.2.1 that (p™1 is stable if and only if for each Riesz projector P of
A corresponding to an eigenvalue A0 with dim Ker(A 0 7— A) ^2, we have
P<pm> = 0 or P<pm' = Im P.
Let P be a Riesz projector of A corresponding to an arbitrary eigenvalue
A0. Also for / = 1 , 2 , let P; be the Riesz projector associated with Aj and A 0 :

for y = l,2, where e > 0 is sufficiently small. Then

l
Observe that for i — 1, 2, the Laurent expansion of (/A — At) at
at A0 has
has the
form

where Qtj are some transformations of Im P. into itself and the ellipsis on
the right-hand side of (17.2.5) represents a series in nonnegative powers of
(A - A 0 ). From (17.2.5) one sees that P has the form

where Ql and Q2 are certain transformations acting from (p"*2 into (p7"1. It
follows that {0} ^ P(pmi ^ Im P if and only if A0 e 0-^4,) n o-(A2). Now
appeal to Theorem 15.2.1 (see first paragraph of the proof) to finish the
proof. D

Proof of Theorem 17.2.3. Let (Mr,... ,M2, L( A)) = F f l ( L l f . . . , L r )


be the chain of CL- in variant subspaces corresponding to the factorization
(17.2.1). From Theorem 17.2.1 (taking into account Corollary 17.2.2) we
know that this factorization is stable if and only if M2,. . . , Mr are stable
Q-invariant subspaces. Let / be the degree of L, let r, be the degree of
L1L2- • • LI, and let
524 Applications

Then <p"' = Mi; + ty. With respect to this decomposition, write

As we know (see Corollary 5.3.3), a(Li+l —• Lr) = a(Cu) and


cr(L, • • • Lj) = cr(C1(). Also (r(CL) = o-(L); the desired result is now ob-
tained by applying Lemma 17.2.4.

Another characterization of stable factorizations of monic matrix poly-


nomials can be given in terms of isolatedness. Consider a factorization

of a monic matrix polynomial L(A) into the product of monic polynomials


L j ( A ) , . . . , Lr(\), and let pi be the degree of L, for / = ! , . . . , r. This
factorization is called isolated if there exists an e > 0 such that any
factorization

of L( A) with monic polynomials M,( A) satisfying <7p.(L,-( A), Af,( A)) < e (it is
assumed that the degree of M, is p() coincides with (17.2.6), that is,

Theorem 17.2.5
A factorization (17.2.6) is stable if and only if it is isolated.

Proof. Let ( M r , . . . , M2, L(A)) = F~'(L,, L 2 ,. . . , L r ) be the corre-


sponding chain of CL-invariant subspaces. By Theorems 17.1.2 and 17.2.1,
the factorization (17.2.6) is isolated if and only if each M( satisfies the
condition that either Mt D % (C L ) or M( n £%A (CL) = {0} for every eigen-
value A0 of CL with dimKer
dimKer(C t - A 0 /)>1. Now it remains to appeal to
Corollary 17.2.2.

We conclude this section with a statement concerning stability of the


property that a given factorization of a monic matrix polynomial is stable.

Theorem 17.2.6
Assume that
Monic Matrix Polynomials 525

is a stable factorization with monic matrix polynomials Lj(A),


L 2 ( A ) , . . . , L r (A). Then there exists an e >0 such that every factorization

with monic matrix polynomials Mj(A), . . . , Mr(\) is stable provided

where for i = 2 , . . . , r, /. is f/ie degree of the products L, • • • Lr and Mt: • • • Mr.

The proof of Theorem 17.2.6 is obtained by combining Theorem 17.2.1


and Corollary 15.4.2.

17.3 LIPSCHITZ STABLE FACTORIZATIONS OF MONIC


MATRIX POLYNOMIALS

A factorization

of the monic matrix polynomial L(A), where L,(A), . . . , Lr(\) are monic
matrix polynomials as well, is called Lipschitz stable if there exist positive
constants e and K such that any monic matrix polynomial L(A) with
cr,(L, L) < e admits a factorization L( A) = L,( A) • • • Lr(\) with monic mat-
rix polynomials Lf(\) satisfying

Obviously, every Lipschitz stable factorization is stable. The converse is not


true in general, as one can see from the results of this section.
We start with the correspondence between the factorization (17.3.1) and
chains of C L -invariant subspaces, where CL is the companion matrix for
L(A), described in Section 17.1.

Theorem 17.3.1
The factorization (17.3.1) is Lipschitz stable if and only if the corresponding
chain of CL-invariant subspaces

is Lipschitz stable.
526 Applications

The Lipschitz stability of (17.3.2) is understood in the sense of Lipschitz


stability of lattices of invariant subspaces (Section 15.6). In the particular
case of chains, the chain (17.3.2) is, by definition, Lipschitz stable if there
exist positive constants c and K [that depend on CL and the chain (17.3.2)]
with the property that every nl x nl matrix A with \\A — CL\\ < e has a chain

of invariant subspaces such that

Proof. If the chain (17.3.2) is Lipschitz stable, then by Theorem 17.1.2


the factorization (17.3.1) is Lipschitz stable. Conversely, assume that the
factorization (17.3.1) is Lipschitz stable but the chain (17.3.2) is not. Then
there exists a sequence (Cm}^ = 1 of nl x nl matrices such that \\Cm - CL\\ <
(1/ra) and for every chain !£r C • • • C !£2 of Cm-invariant subspaces the
inequality

holds. We continue now with an argument analogous to that used in the


proof of Theorem 17.2.1. Putting Sm = col [QC^1] I = l , where
where QQ = col[5n][=1,
we verify that Sm is nonsingular (at least for large m) and that SmCmSml is
the companion matrix associated with the matrix polynomial

where [Uml, Um2,. . . , Uml] = Sml. We assume that Sm is nonsingular for


m = 1, 2 , . . . . Observe that co\[QC'^l]'i=l is the unit matrix /; so it is not
difficult to check that for m - 1 , 2 , . . .

Here and in the sequel we denote certain positive constants independent of


m by Kt, K2,.... As the factorization (17.3.1) is Lipschitz stable, for m
sufficiently large the polynomial M m (A) admits a factorization

with monic matrix polynomials M l n ) (A),. . . , Mrm(\) such that


Monic Matrix Polynomials 527

Let Mr m C • • • C M2m be the chain of CM -invariant subspaces correspond-


ing to the factorization (17.3.5). By Theorem 17.1.2 we have

From (17.3.4), (17.3.6), and (17.3.7) one obtains

Put r,M = S;X« for / = 2, . . . , r and m = 1, 2, . . . . Then y,>in is Cm


invariant for each m. Further, the formula for Sm shows that

Indeed

and (17.3.9) follows. Now (cf. the proof of Theorem 17.2.1)

Using this inequality and (17.3.8), we obtain

a contradiction with (17.3.3). D

Combining Theorem 17.3.1 with Theorems 15.6.2 and 15.5.1, we obtain


the following corollary.

Corollary 17.3.2
For the factorization (17.3.1) and the corresponding chain of CL-invariant
subspaces (17.3.2), the following statements are equivalent: (a) the factoriz-
ation (17.3.1) is Lipschitz stable; (b) all the CL-invariant subspaces
528 Applications

M2, . . . , Mr are spectral; (c) for every e > 0 sufficiently small there exists a
8 >0 with the property every nl x nl matrihx BBwith with \\B-C.\
\\B - CL\\ < 8 has
has a
unique chain of invariant subspaces J i r C J " f r _ l C ' - - C J i 2 such that
ma\(0(Mr,JVr),

Now we are ready to state and prove the main result of this section,
namely, the description of Lipschitz stable factorizations. (Recall the defin-
ition of the metric o-k on matrix polynomials given in Section 17.1.)

Theorem 17.3.3
The following statements are equivalent for a factorization

of the monic n x n matrix polynomial L(A) of degree /, where


L t ( A), . . . , Lr( A) are also monic matrix polynomials of degrees p ^ , . . . , pr,
respectively: (a) the factorization (17.3.10) is Lipschitz stable; (b) cr(Ly) fl
o-(Lk) = 0 for j ^ k; (c) for every € > 0 sufficiently small there exists a 8 > 0
such that any monic matrix polynomial L(A) with cr,(L, L ) < 5 has a
unique factorization L(A) = Lj(A)- • • Lr(\) with the property that
^L^ Lj), . . . , crp (Lr, L r )) < e.

Proof. Observe that for j = 2, . . . , r,

where Mr C • • • C M2 is the chain of CL- in variant subspaces corresponding


to the factorization (17.3.10) (see formula (17.1.4)). Also, denoting by Mj a
direct complement to Mf in M j _ l for j = 2, . . . , r, defining Ml = <p"', and
letting Pj\ M j _ l ^ > M ' j be the projector on jVy' along M ., we have

So, the subspaces M-t are spectral if and only if cr(L;) fl cr(L^) = 0 for; ¥^ k.
Hence the equivalence (a)<£>(b) in Theorem 17.3.3 follows from the
equivalence (a)<£>(b) in Corollary 17.3.2. Similarly, the equivalence
(a)O(c) in Theorem 17.3.3 follows from the corresponding equivalence in
Corollary 17.3.2, taking account of Theorem 17.1.2. D

17.4 STABLE MINIMAL FACTORIZATIONS OF RATIONAL MATRIX


FUNCTIONS: THE MAIN RESULT

Throughout this section VK 0 (A), W 01 (A), W02(\), . . . , W0k(\) are rational


n x n matrix functions that take the value / at infinity. We assume that
Rational Matrix Functions: The Main Result 529

W k( A) and that this factorization is minimal.


following notion of stability of this factorization is natural. Let

be the minimal realizations for W0 and W01, W 0 2 ,. . . , WQk (so d is the


McMillan degree of W0, and 5, is the McMillan degree of W) for i =
! , . . . , & ) . The minimal factorization W0 = W01 • • • W0k is called stable if for
each e > 0 there exists a w >0 such that \\A - A0\\ + \\B - B0\\ + ||C-
C0|| < o> implies that the realization

is minimal and W admits a minimal factorization W = Wl W2 , . . . , Wk , where


for * = 1, . . . , A:, the rational matrix function W)(A) has a minimal reali-
zation

with the extra property that


Since all minimal realizations of a given rational matrix function are
mutually similar (Theorems 7.1.4 and 7.1.5), this definition does not depend
on the choice of the minimal realizations (17.4.1) and (17.4.2).
The next theorem characterizes stability of minimal factorizations in
terms of spectral data.

Theorem 17.4.1
The minimal factorization W 0 (A) = W 01 (A)W 02 (A) • • • W0k(\) is stable if and
only if each common pole (zero) of WOJ and W0p (j ^ p) is a pole (zero) of
W0 of geometric multiplicity 1.

The geometric multiplicity of a pole (zero) A0 of a rational matrix function


W( A) is the number of negative (positive) partial multiplicities of W( A) at A0
(see Section 7.2).
We need some preliminary discussion before starting the proof of
Theorem 17.4.1. As we have seen in Theorem 7.5.1, the minimal fac-
torizations

of W 0 (A) are in one-to-one correspondence with those direct sum decom-


positions
530 Applkations

for which the subspaces «2\ 4- • • • 4- <£p (p = 1,. . . , k) are A0-invariant and
the subspaces ££k 4- 3?k + l + • • • + Z£p (p = /c, . . . , 1 ) are AQ invariant,
where AQ — A0 — B(}C0. Moreover, the minimal factorization (17.4.3) cor-
responding to the direct sum decomposition (17.4.4) is given by

where 7ry is the projector on J£J along JS?, + • • • 4- .$?._, -i- j£J + 1 4- • • • + j£A;
note that the realizations (17.4.5) are necessarily minimal. In the formula
(17.4.5) the transformations COTT;: ^.-» <p", 7r y A 0 ir y : ^->^, and
7r;£?0: <p"—»J^ are understood as matrices of sizes n x / y , /y x /;, and /; x «,
respectively, where /; = dim «2?;-, with respect to some basis in ^.
Let (^4, 5, C) be a triple of matrices of sizes d x 8, 8 x n, n x 6,
respectively. Consider the ordered /c-tuple II = (TT,, . . . , Trk) of projectors in
((7s. We say that FI is a supporting k-tuple of projectors with respect to the
triple of matrices (A, B, C) if 77,77-, = 77-.7r; = 0 for / ^j, TT, + • • • + TT^ = /, the
subspaces Im(7r, + • • • + irp) for p = 1, 2 , . . . , k are A invariant, and the
subspaces lm(7rp + irp + l + • • • + irk), p = 1,. . . , k, are AK invariant, where
A* - A - BC. Clearly, II is a supporting /c-tuple of projectors with respect to
(A(), B0, C0) if and only if the subspaces^ = Im TT-(/ = 1,. . . , k) form a direct
sum decomposition of (p5 as in (17.4.4).
A supporting /c-tuple of projectors II = (TT,, . . . , irk) with respect to
(A, B, C) will be called stable if for every e >0 there exists an o> >0 such
that, for any triple of matrices (A', B', C') of sizes 5 x 5 , S x / j , n x 8,
respectively, with \\A - A'\\ + \\B - B'\\ + \\C - C'\\ < a>, there exists a sup-
porting /c-tuple of projectors IT = (TT^, . . . , TT£) with respect to (A', B',C')
such that

The first step in the proof of Theorem 17.4.1 is the following lemma.

Lemma 17.4.2
Let (17.4.1) be a minimal realization for WQ(\), and let H = (irl,. . . , Trk) be
a supporting k-tuple of projectors with respect to (A0, B0, C0), with the
corresponding minimal factorization

(so that, f o r j = l,...,k, W0;.( A) = / + C 0 7r ; (A/ - Al}) l7r;J50 with respect to


some basis x^\ . . . , x ( , ' } in Im TT;). Then H is stable if and only if the
factorization (17.4.6) is stable.
Rational Matrix Functions: The Main Result 531

The proof of Lemma 17.4.2 is rather long and technical and is given in
the next section.
Next, we make the connection with stable invariant subspaces.

Lemma 17.4.3
Let II = (TTI, . . . , Trk) be a supporting k-tuple of projectors with respect to
(AQ, BQ, CQ). Then II is stable if and only if the A0-invariant subspaces
Im(7r, + • • • + 7T;), / = 1, . . . , / : are stable and the A^-invariant subspaces
Im(7ry + TT / + I + • • • + TTfr), y = 1, . . . , k are stable as well (as before, A^ =
^o ~~ "n^-o)-

Again, it will be convenient to relegate the proof of Lemma 17.4.3 to the


next section.

Proof of Theorem 17.4.1. Let II = (ir l5 . . . , TJ^) be the supporting


A;- tuple of projectors with respect to (A0, J?0, C0) that corresponds to the
minimal factorization

By Lemmas 17.4.2 and 17.4.3 the factorization (17.4.7) is stable if and only
if the y40-invariant subspaces J^ = Im^ + f- 77y), j = 1,. . . , k are stable
and the/io -invariantsubspaces^^ = Im(7ry- + irj+l + • • • + Trk)J= 1, . . . , k
are stable as well.
With respect to the decomposition <ps = Im 17, + Im ir2 + • • • + Im irk,
write

In view of Lemma 17.2.4, ^E. is stable if and only if, for every common
eigenvalue A0 of

we have dim Ker( A07 - A) = 1. So all the subspaces j?,,. . . , &k are stable if
and only if every common eigenvalue of Ajf and App (j ^ p) is an eigenvalue
of AQ of geometric multiplicity 1. Similarly, all the subspaces M I } . . . ,Mk
532 Applications

are stable if and only if every common eigenvalue of A* and A*p with j' ^p
is an eigenvalue of AQ of geometric multiplicity 1. It follows that the
factorization (17.4.7) is stable if and only if every common eigenvalue of Ajf
and App (resp. of A* and A*p} with j ^ p is an eigenvalue of A0 (resp. of
AQ) of geometric multiplicity 1. To finish the proof, observe that the
realizations (17.4.5) are minimal and hence, by Theorem 7.2.3, the poles
(resp. zeros) of WQj(\) coincide with the eigenvalues of TtjA^Tr- = An (resp.
eigenvalues of -nyl^ TT, - A*}). Also, the partial multiplicities of a pole (resp.
zero) A0 of WOJ are equal to the partial multiplicities of A0 as an eigenvalue of
AJJ (resp. A*j). Analogous statements hold for the poles and zeros of WQ(\)
and eigenvalues of AQ and A^. D

17.5 PROOF OF THE AUXILIARY LEMMAS

We start with the proof of Lemma 17.4.2.


Assume that II is stable. Given e > 0, let e' be a positive number that we
fix later. By Lemma 13.3.2 there exists an a)l > 0 with the property that, for
any projector TT'J such that ||TT;' - 7ry|| < o>j, there exists an invert-
ible transformation Sf: <p6-* <p6 with 5y(Im iry) = Im TT'J and \\I - Sj\\ < e'.
We also assume that o>t < min(e', 1). Further, let w2 be the number corres-
ponding to w, as defined by the stability of II.
As the realization (17.4.1) is minimal, in view of Theorem 7.1.5 the
matrix colfQ/l^l^rJ is left invertible, where p is the degree of the minimal
polynomial for A0, and the matrix [B0, AQB0, . . . , AQ~IBQ] is right invert-
ible. Since the left (right) invertibility of a matrix X is stable under small
perturbations [indeed, if ||Y- A"|| < H^'H" 1 , then Y is also left (right)
invertible], there exists a r > 0 such that the realization

is minimal provided \\A — A0\\ + \\B — BQ\\ + ||C — C0|| < T.


Now put &> = min(w1, cu2, T, e') and let (A, B, C) be such that

Then the realization (17.5.1) is minimal. By the stability of II, there exists a
supporting £-tuple of projectors II' = (TT|, . . . , TT£) with respect to
(A, B, C) such that

For y = l , . . . , let 5;: (ps-^(ps be invertible transformations with


Sj(lm 7T;) = Im TT) and ||/ - 5y|| < e'. Now put
Proof of the Auxiliary Lemmas 533

for each y, where the transformation 5; is understood as 5^: Im 7r y —»Im TT'J.


Also, we regard the rational functions (17.5.2) as matrix functions with
respect to the basis introduced for Im it-. We have the minimal factorization

Moreover, writing p = max

Use the inequalities wt < e', w ^ e' and the inequalities ||5y !
||7- S'1]! ^ e'(l - e')"1 (assuming e' < 1; cf. the proof of Theorem 16.2.1)
to get

It remains to choose e' < 1 in such a way that this expression is less than e,
and the stability of factorization (17.4.6) is proved.
Conversely, let the factorization (17.4.6) be stable and assume that II is
not stable. Then there exist an e > 0 and sequences {Am}^=l, {5m}^=1,
{Cmrm^ such that
534 Applications

where O = (irj, . . . , ir'k) is any supporting A:-tuple of projectors with respect


to at least one of the triples (Am, Bm, COT), m = I, 2, . . . . Since (17. 5. 1) is a
minimal realization, we can assume (using Theorem 7.1.5 and the fact that
the full-range and null kernel properties of a pair of transformations are
preserved under sufficiently small perturbation of this pair) that

is minimal for all m. In view of the stability of (17.4.6), we can also assume
that each W A admits a minimal factorization

where for j = 1, 2, . . . k, we obtain

are transformations written as matrices with respect to the basis introduced


for Im iTj with the property that

For fixed m, consider the minimal realization

whre
Proof of the Auxiliary Lemmas 535

obtained from the minimal factorization (17.5.3) [cf. formula (7.3.4)]. As


any two minimal realizations of Wm(\) are similar, there exists an invertible
transformation S : Im TTI + • • • + Im irk —* <p5 such that

Actually, such an Sm is unique, and from the explicit formula for Sm


(Theorem 7.1.3) we find, using (17.5.3) and (17.5.6), that S m -»/as m-»oo.
Now let n("° = (7r\m\ . . . , Tr(m)) be the supporting fc-tuple of projectors
with respect to (Am, Bm, Cm), which corresponds to the minimal factoriz-
ation (17.5.4). Thus, for / = 1,. . . , k we have

and hence ir}m> = 5 IM 7r / 5~ I . We find that £;=1 \\ir}m) - 7ry.||-*0 as wi-»oo, a


contradiction with the choice of (Am, Bm, Cm). Lemma 17.4.2 is proved.
We pass on to the proof of Lemma 17.4.3. Assume that the subspaces
Im^! + • • • + TTJ), I = 1,.. . , k are stable ,40-invariant subspaces and that
Im(7T-y + TT/ + I + • • • + TTk) are stable AQ -invariant subspaces. Arguing by
contradiction, assume that II is not stable. Then there exist an e > 0 and
sequences {v4J~ = 1 , (5J~ = 1 , and {Cm}^, such that

for every supporting A>tuple of projectors (TT[, . . . , Tr'k) with respect to


(Am*Bm,C ), m = l , 2 , . . . . Then clearly Am-*A0 and Axmd=Am-
BmCm-+ AQ as m—»«>. By assumption, and using Theorem 15.6.1, for each
positive integer m there exists a sequence of chains of subspaces {0} C
^im) C • • • C &?\ C ^ - (p5, such that %(r\ . . . , ^ are m invariant
A

Similarly, there exists a sequence of chains of subspaces

such that ^!m), / = 1,. . . , k are A* invariant and


536 Applications

As Im^ + • • • + TT,.) + Im(7r/ + 1 + • • • + irk) = <ps, for j = 1, . . . , k - 1 and


sufficiently large m, we find, using Lemma 13.3.2, that

Now let

It is easy to see that

Furthermore

Indeed, (17.5.12) obviously holds for / = ! . Assuming that (17.5.12) is


nroved for i = n — 1 we have

where is clearly contained in ^m). Take x£&™\ and write x = y + z,


where yE.£(™\ and z^M™. Then z = x - y(=£(pm\ and jce^\ +
CS^n^*,"0). So (17.5.12) is proved. Combining (17.5.11) and (17.5.12),
we find that

Developing an analog of the proof of (17.5.12), one proves that

For sufficiently large m, let TT{M) be the projector on tfjm) along


Then the A:-tuple of projectors
(7r ( r>, IT™, . . . , 7ri m) ) is supporting for (Amt Bm, CJ. Denoting by r^
the projector on ^m) along M(™\ (j = 1,. . . , k - 1), we have r{m) =
7r(,m) + • • • + 7rjm). On the other hand, (17.5.9) and (17.5.10) imply, in view
of Theorem 13.4.3, that for j = 1,. . . , k - 1.

and so lim^^.^ ||7rjm) - ir;-|| =0, a contradiction with (17.5.8).


Conversely, assume that II is stable, but one of the j40-invariant sub-
spaces Im 77j,. . . , Im(7r, + • • • + 7rk), say, lm('rrl + • - • + T^), is not stable.
Rational Matrix Functions: Further Deductions 537

Then there exist an e > 0 and a sequence {Am}fn =l such that \\Am -
A0\\-*Q as m —»<» and

for every A m - in variant subspace M (m = 1, 2, . . .). As n is stable, there


exists a sequence of A>tuples of projectors n (w) = (TT^ , . . . , TT^), m =
1 , 2 , . . . such that n(OT) is supporting for (Am, B0, C0) and

Hence for the ^-invariant subspace Im(7Tim) + • • • + Trjm}) we have

a contradiction with (17.5.13). In a similar way, one arrives at a contradic-


tion if II is stable but one of the ^4^-invariant subspaces Im(7r; + irf+l +
• • • + Trk), j = 1,. . . , k, is not stable.

Lemma 17.4.3 is proved completely.

17.6 STABLE MINIMAL FACTORIZATIONS OF RATIONAL MATRIX


FUNCTIONS: FURTHER DEDUCTIONS

In this section we use Theorem 17.4.1 and its proof to derive some useful
information on stable minimal factorizations of rational matrix functions.
First, let us make Theorem 17.4.1 more precise in the sense that if the
minimal factorization

is stable, then so is every minimal factorization sufficiently close to

Theorem 17.6.1
Assume that (17.6.1) is a stable minimal factorization, and let

and
538 Applications

be minimal realizations of W^A) and W 0j (A). Then every minimal fac-


torization

with minimal realizations

and

is stable provided

is small enough.

The proof of this result is obtained by combining Corollary 15.4.2 with


Lemmas 17.4.2 and 17.4.3.
Let us clarify the connection between isolatedness and stability for
minimal factorizations. The minimal factorization (17.6.1) is called isolated
if the following holds: given minimal realizations

for j = 1,. . . , k, there exists e > 0 such that, if

is a minimal factorization with rational matrix functions Wol(\)- • - W^A)


that admit minimal realizations

such that

then necessarily VV^A) = W 0/ (A) for each ;'. It is easily seen that this
definition does not depend on the choice of the minimal realization (17.6.4).
Rational Matrix Functions: Further Deductions 539

From the proof of Theorem 17.4.1 and the fact that the stable invariant
subspaces coincide with the isolated ones (Section 14.3), it is found that this
property also holds for stable minimal factorizations:

Theorem 17.6.2
The minimal factorization (17.6.1) is stable if and only if it is isolated.

Consider again the minimal factorization (17.6.1) with given minimal


realizations (17.6.2) and (17.6.3) for W0(\) and W 01 (A), . . . , W ot (A). We
say that (17.6.1) is Lipschitz stable if there exist positive constants e and K
with the following property: for every triple of matrices (A, B, C) with
appropriate sizes and with the
realization

is minimal and W(A) admits a minimal factorization W=W1W2---Wk such


that, for ; = 1, . . . , k, W;( A) has a minimal realization

where, for each j

Again, the proof of Theorem 17.4.1, together with the description of


Lipschitz stable invariant subspaces (Theorem 15.5.1), yields a characteriza-
tion of Lipschitz stable minimal factorizations, as follows.

Theorem 17.6.3
For the minimal factorization (17.6.1), the following statements are equiva-
lent: (a) equation (17.6.1) is Lipschitz stable; (b) for every pair of indices
j^p, the rational functions W 0y (A) and W0p(\) have no common zeros and
no common poles; (c) given minimal realizations (17.6.2) and (17.6.3) of
W0( A) and W 01 (A), . . . , W0k( A), for every sufficiently small e >0 there exists
an a) >0 such that for any triple (A, B, C) with \\A- A0\\ + \\B - BQ\\ +
|| C - C0|| < to the realization

is minimal and W( A) admits a unique minimal factorization W( A) =


Wi(A)W 2 ( A) • • • Wk(\) with the property that for j = 1, . . . , k each Wf( A) has
a minimal realization
540 Applications

satisfying

17.7 STABILITY OF LINEAR FRACTIONAL DECOMPOSITIONS OF


RATIONAL MATRIX FUNCTIONS

Let t/(A) be a rational q x s matrix function with finite value at infinity. In


this section we study stability of minimal linear fractional decompositions

where W(X) and V(X) are rational matrix functions of suitable sizes that
take finite values at infinity. (See Sections 7.6-7.8 for the definition and
basic facts on linear fractional decompositions.)
In informal terms, the stability of (17.7.1) means that any rational matrix
function U(X) sufficiently close to U(\) admits a minimal linear fractional
decomposition U(\) = 3^^,(V), where the rational matrix functions W(A)
and V(\) are as close as we wish to W(\) and V(A), respectively. To make
this notion precise, we resort to minimal realizations for the matrix functions
involved. Thus let

be a minimal realization of (/(A), where a, /3, -y, and 8 are matrices of sizes
/ x /, / x 5, q x /, and q x s, respectively. Also, let

and

be minimal realizations of W(A) and V(A). We say that the minimal linear
fractional decomposition (17.7.1) is Lipschitz stable if there exist positive
constants e and K such that any q x s rational matrix function U( A) that
admits a realization

with
Decompositions of Rational Matrix Functions 541

has a minimal linear fractional decomposition

where the rational matrix functions W(X) and V(\) admit realizations

with the property that

It is assumed, of course, that the sizes of two matrices coincide each time
their difference appears in the preceding inequalities.
Since any two minimal realizations of the same rational matrix function
are similar (Theorems 7.1.4 and 7.1.5), it is easily seen that the definition of
Lipschitz stability does not depend on the particular choice of minimal
realizations for £/(A), W(\), and V(A).
It is remarkable that a large class of minimal linear fractional decom-
positions is Lipschitz stable, as opposed to the factorization of monic matrix
polynomials and the minimal factorization of rational matrix functions,
where Lipschitz stability is exceptional in a certain sense (Sections 17.3 and
17.6).

Theorem 17.7.1
Let

be a minimal linear fractional decomposition, where

is a suitable partition of W(A). Assume that the rational matrix functions


W(\) and £/(A) take finite values at infinity, and assume, in addition, that the
matrices VKU(<») and W22(°°) are invertible. Then (17.7.6) is Lipschitz stable.
542 Applications

Proof. We make use of Theorem 7.7.1, which describes minimal linear


fractional decompositions in terms of reducing pairs of subspaces with
respect to the minimal realization (17.7.2). Thus there exists an [a /3]-
invariant subspace Ml C <p' and an -invariant subspace M2 C £', which
are direct complements to each other and such that for some transfor-
mations F:<p'-*<p J and G: (*-+(' with (a + pF)M^ C Ml and (a +
Gy)M2CM2 the formulas (7.7.5)-(7.7.10) hold.
Moreover, one can choose F and G in such a way that Ml is a spectral
invariant subspaces (i.e., a sum of root subspaces) for a + /3F and M2 is
a spectral invariant subspace for a + Gy. Indeed, Theorem 7.7.2 shows
that the linear fractional decomposition (17.7.6) depends on
(Ml,M2\ F\M , Qu G) only, where QM is the projector on Ml along M2.
[Of course, it is assumed that the minimal realization (17.7.2) of U(\) is
fixed.] But the proof of Theorem 15.8.1 shows that there exists a transfor-
mation F': $'-+(* such that F'x = 0 for all x^M{ and the (a + fi(F +
F'))-invariant subspace Ml is spectral. So we can replace F by F + F'.
Similarly, one proves that G can be chosen with spectral (a + Gy)-invariant
subspace M2. In the rest of the proof we assume that F and G satisfy this
additional property.
Now let U( A) be another rational q x s matrix function with finite value
at infinity that admits a realization (17.7.3) with the property (17.7.4). Here
the positive number e > 0 is sufficiently small and is chosen later.
First, observe that for e >0 small enough the realization (17.7.3) is also
minimal. Indeed, by Theorem 7.6.1 we have

which means the right invertibility of [j8, a/3,. . . , a1 l(3] and the left
invertibility of

Since one-sided invertibility of a matrix is a property that is preserved under


small perturbations of that matrix, our conclusion concerning minimality of
(17.7.3) follows.
Recall (Theorem 15.8.1) that the spectral invariant subspaces Ml and M2
for (a + ftF) and for (a + G-y), respectively, are Lipschitz stable. It follows
that there exists a constant /^ >0 such that d + /3F and a + Gy have
invariant subspaces Ml and M2, respectively, with the property that
Decompositions of Rational Matrix Functions 543

provided c is small enough. By Lemma 13.3.2, by choosing sufficiently small


e we ensure that Ml and M2 are again direct complements to each other. In
other words, ( M l , M 2 ) is a reducing pair with respect to the realization
(17.7.3). Let d = d, Dn = D n , D22 = D22, D12 = D12, and

Also, put F=F, G-G. By Theorem 7.7.1 we obtain a minimal linear


fractional decomposition U(\) = ^^(V), where the functions W(\) and
V(\) are given by formulas (7.7.5)-(7.7.10) except that each letter (with the
exception of (p^, <p*, <p') has a tilde. These formulas show that for e > 0
small enough there is a positive constant K satisfying (17.7.5) provided F
and G satisfy the following property: given a basis /,,. . . , fk in M x , there
exists a positive constant K2 (which depends on this basis only) such that

Here Fl = F\M{: Ml-^(s and Gl = QMG: f-^Mlt where QMi stands for
the projector on Ml along M2, are transformations written as matrices with
respect to the basis / , , . . . , fk (and the standard orthonormal bases in <p*
and <£*), and are similarly
defined matrices with respect to some basis g{,. . . , gkin M^ where <2^ is
the projector on Ml along M2.

To prove the existence of a constant /C 2 >0 with the property (17.7.7),


we appeal to Lemma 13.3.2. In view of this lemma, in case M\ and M2 are
sufficiently close to Ml and M2, respectively, there exists a constant ^ 3 >0
(depending on Mx and M2 only) such that

for some invertible transformation S: <p'—» <p' such that SMl=Ml and
It remains to choose

It is instructive to compare Theorem 17.7.1 with Theorems 17.4.1 and


17.6.3. Thus any minimal factorization U(\) = (/!(A)t/ 2 (A), where £/,(A)
and t/ 2 (A) are n x n rational matrix functions with value / at infinity, is
Lipschitz stable in the class of minimal linear fractional decompositions. In
contrast, this minimal factorization need not be Lipschitz stable (or even
stable) in the class of minimal factorizations. The following example illus-
trates this point:
544 Applications

EXAMPLE 17.7.1. Let

It is easily seen that t/(A) admits a minimal factorization

This minimal factorization is not stable because the perturbed rational


matrix function

does not have nontrivial minimal factorizations at all. On the other hand,
(17.7.8) can be represented as a minimal linear fractional decomposition
with

Observe that W(\) has a minimal realization

Now Uf (A) also admits a minimal linear fractional decomposition Uf (A) =


&W((V), where

Moreover, We(\) has a minimal realization

Hence, as predicted by Theorem 17.7.1, the minimal factorization (17.7.8)


is Lipschitz stable when understood as a minimal linear fractional
decomposition.
Isolated Solutions of Matrix Quadratic Equations 545

17.8 ISOLATED SOLUTIONS OF MATRIX QUADRATIC EQUATIONS

Consider the matrix quadratic equation

where A, B, C, D are known matrices of sizes n x «, n x m, m x n, m x m,


respectively, and A!" is a matrix of size m x n to be found.
For any m x n matrix X, let

be the graph of X. The following proposition connects the solutions of


(17.8.1) with invariant subspaces of the (m + n) x (m + n) matrix

Proposition 17.8.1
For an m x n matrix X, the subspace G(X) is T invariant if and only if X
satisfies (17.8.1).

Proof. Assume that G(X) is T invariant. So for every x E <f" there


exists a y E <J7" such that

The correspondence jc—* y is clearly linear; so v = Z* for some n x « matrix


Z, and we have

for all x G <p", or

This implies Z = A + BX and

which means that (17.8.1) holds.


Conversely, if (17.8.1) holds and Z d= A + BX, then (17.8.2) holds. This
implies the T invariance of G(X).
546 Applications

To take advantage of Proposition 17.8.1 in describing isolated solutions


of (17.8.1), we need a preliminary result.

Lemma 17.8.2
Define a function G from the set M m x n of all m x n matrices to the set of all
subspaces in <p" © <pm by G(X) = G(X). Then G is a homeomorphism (i.e.,
a bijective map that is continuous together with its inverse) between Mmxn and
the set of all subspaces M C <p" © <£"" witn tne property that 0(M, $?)<!,
where ^=<p"©{0}.

Here 6(M, jV) is the gap between M and N (see Chapter 13).

Proof. The continuity of G and G ~ ' follows from the easily verified fact
that the orthogonal projector P on G(X) is given by

where L = X*X)~l. Let us check that 0(G(X), %)<l. By Theorem


13.1.1

where Px is the orthogonal projector on 9€. The second supremum is

where = 1, that is, is


uniformly bounded, it follows that ||y|| is bounded away from zero. Hence
the second supremum in (17.8.4) is less than 1.
To show that the first supremum in (17.8.4) is also less than 1, assume
(arguing by contradiction) that

and by formula
Isolated Solutions of Matrix Quadratic Equations 547

But L is invertible, so

and (17.8.5) is impossible. Thus 0(G(X), %) < 1 as claimed.


Now we must show that every subspace M C <p" © Cm with 6(M, X) -
a < 1 is a graph subspace, that is, M — G(X) for some X. First, Theorem
13.1.2 shows that dim M — dim $?= n. Further, assume that P^ jt = 0 for
some *£./#. Denoting by P the orthogonal projector on M, we have
INI = IK*** ~ f*)*ll» which, in view of the condition 0(M, %) = \\PM -
PX\\<1, implies jc = 0. Hence Q = PX\M\ M-* 3C is an invertible linear
transformation. Now M = G((I - P*)O~l). Indeed, if jc£ M, then

On the other hand, if for some M E $?

then the vector v - Q lu has the property that vE.M, P^y = u - Pxv and
therefore, y belongs to
M.

A solution X of (17.8.1) is called isolated if there exists a neighbourhood


of X in the linear space A/ mX/J of all m x n matrices that does not contain
other solutions of (17.6.1). A solution X is called inaccessible if the only
continuous function <p: [0,1]—» MmKn such that <p(0) = X and <p(f) is a
solution of (17.8.1) for every fE[0,1], is the constant function <p(t) = X.
Clearly, every isolated solution is inaccessible.
We now have a characterization of isolated and inaccessible solutions of
(17.8.1).

Theorem 17.8.3
The following statements are equivalent: (a) X0 is an isolated solution of
(17.8.1); (b) XQ is an inaccessible solution of (17.8.1); (c) for every eigen-
value A0 of the matrix

with dim Ker(70 - A 0 /) > 1, either


548 Applications

or

(d) every common eigenvalue of A + BXQ and D — XQB has geometric


multiplicity one as an eigenvalue of T0.

Proof. Making a change of variable Y = X - XQ, we see that X satisfies


(17.8.1) if and only if Y satisfies the equation

Hence XQ is an isolated (or inaccessible) solution of (17.8.1) if and only if 0


is an isolated (or inaccessible) solution of (17.8.6). By Proposition 17.8.1
and Lemma 17.8.2, the correspondence

is a homeomorphism between the set of all solutions Y of (17.8.5) and the set
of ro-invariant subspaces M such that 0 ( M , f f l ) < l , where 2i?=
Hence 0 is an isolated (resp. inaccessible) solution of (17.8.6) if and only if
3€ is an isolated (resp. inaccessible) T0- in variant subspace. An application of
Theorem 14.3.1 and Proposition 14.3.3 shows that (a), (b), and (c) are
equivalent.
Further, the characteristic polynomial of T0 is the product of the charac-
teristic polynomials of A + BXQ and D — XQB. As the multiplicity of A0 as a
zero of the characteristic polynomial of a matrix 5 is equal to the dimension
of £%A (5), it follows that A0 is a common eigenvalue of A + BX0 and
D - XQB if and only if

So (c) and (d) are equivalent.

An interesting particular case appears when B = 0. Then we have the


equation

which is a system of linear equations in the entries of X. It is well known


Isolated Solutions of Matrix Quadratic Equations 549

from the theory of linear equations that equation (17.8.7) either has no
solutions, has a unique solution, or has infinitely many solutions. [In this
case the homogeneous equation

has nontrivial solutions, and the general form of solutions of (17.8.7) is


X0 + y, where X0 is a particular solutions of (17.8.7) and Y is the general
solution of the homogeneous equation.] Clearly, a solution X of (17.8.7) is
isolated if and only if (17.8.8) has only the trivial solution. Using the
criterion of Theorem 17.8.3, we obtain the following well-known result.

Corollary 17.8.4
The equation YA - DY = 0 has only the trivial solution Y = 0 if and only if

Reconsidering the general case of equation (17.8.1), let us give some


sufficient conditions for isolatedness of the solutions.

Corollary 17.8.5
If the matrix

is nonderogatory [i.e., dimKer(r- A07) = 1 for every eigenvalue A0 of T],


then the number of solutions of (17.8.1) (if they exist) is finite and,
consequently, every solution is isolated.

Proof. The matrix T has a finite number of invariant subspaces; namely,


there are exactly II'=1 (dim £% A (jT) + 1) of them, where A I } . . . , Ar are all
the distinct eigenvalues of T. It remains to appeal to Proposition 17.8.1. D
EXAMPLE 17.8.1. Consider the equation

The only one-dimensional T-invariant subspaces are M-l=Span{el} and


M2 = Spanf^j - e3}. Defining 9C = Spanf^}, we have
550 Applications

so by Proposition 17.8.1 and Lemma 17.8.2 there exist only two solutions
given by

As expected from Corollary 17.8.5, the number of solutions of (17.8.9) is


finite

Another particular case of (17.8.1) is of interest. Consider the equation

where Al and A0 are given n x n matrices, and X is an n x n matrix to be


found. Equation (17.8.10) is a particular case of (17.8.1) with B = /,
C = — A0, D = — A,, and A = 0 and is sometimes described as "unilateral."
The matrix T turns out to be just the companion matrix of the matrix
polynomial L(A) = A 2 / + \A{ + A0:

Proposition 17.8.1 gives a one-to-one correspondence between the set of


solutions X of (17.8.10) and the set of T-invariant subspaces of the form

We remark that a 7-invariant subspace M has this form if and only if the
transformation [/ 0]|^: Jt—>$" is invertible. In this way we recover the
description of right divisors of L(A) given in Section 5.3. Similarly, the
equation
Stability of Solutions of Matrix Quadratic Equations 551

considered as a particular case of (17.8.1) gives rise (by using Proposition


17.8.1) to a description of left divisors of the matrix polynomial A2/ +
\Al + A0.

17.9 STABILITY OF SOLUTIONS OF MATRIX QUADRATIC


EQUATIONS

Consider the equation

with the same assumptions on the matrices A, B, C, D as in the preceding


section. We say that a solution X of (17.9.1) is stable if for any e > 0 there is
8 > 0 such that whenever A', B', C", D' are matrices of appropriate size with

thje equation

has a solution Y for which \\Y — X\\ < e. It turns out that the situation with
regard to stability and isolatedness is analogous to that for invariant
subspaces.

Theorem 17.9.1
A solution X of equation (17.9.1) is stable if and only if X is isolated.

Proof. It is sufficient to prove the theorem for the case when C = 0 and
the solution X is the zero matrix (see the proof of Theorem 17.8.3). In this
case G(X) = <P"©{0); so the homeomorphism described in Lemma 17.8.2
implies that X=0 is a stable (resp. isolated) solution of

if and only if <p"0{0} is a stable (resp. isolated) -invariant


subspace. Now use the fact that the isolated invariant subspaces for a linear
transformation coincide with the stable ones (Theorems 15.2.1 and
14.3.1). D

In view of Theorem 7.9.1, statements (c) and (d) in Theorem 17.8.3


describe the stable solutions of equation (17.9.1). In the particular case
when B - 0 we find that the solution X of XA - DX = C is stable if and only
if cr(>l)ncr(D) = 0.
552 Applications

As a solution X of (17.9.1) is stable if and only if the subspace Im is


stable as a /-invariant subspace, where

we can deduce some properties of stable solutions of (17.9.1) from the


corresponding properties of stable T-invariant subspaces. For instance, the
set of stable solutions of (17.9.1) is always finite (it may also be empty), and
the number of stable solutions of (17.9.1) does not exceed the number 7(7")
of n-dimensional stable /"-invariant subspaces, which can be calculated .as
follows. Let A , , . . . , \p be all the distinct eigenvalues of T with algebraic
multiplicities m , , . . . , m p , respectively; then y(T) is the number of
sequences of type (ql,. . . , qp), where qt are nonnegative integers with the
properties that q.-^m^ either <7y = 0 or qj = mj for every ;' such that
dim Ker( A y / - T) > 1, and ql + • • • + qp = n.
Using Corollary 15.4.2, we obtain the following property of stable
solutions of (17.9.1).

Theorem 17.9.2
Let X be a stable solution of (17.9.1). Then every solution Y of equation

where A', B', C', and D' are matrices of appropriate sizes, is stable provided

is small enough.

The notion of Lipschitz stability of solutions of (17.7.1) is introduced


naturally: a solution X of (17.7.1) is called Lipschitz stable if there exist
positive constants e and K such that, for any matrices A\ B', C\ D' of
appropriate sizes with

the equation

has a solution Y satisfying


The Real Case 553

Theorem 17.9.3
A solution of (17.9.1) is Lipschitz stable if and only if &(A + BX) fl

Proof. Again, we can assume without loss of generality that C = 0 and


^ = 0. Formula (17.8.3) shows that the function G introduced in Lemma
17.8.2 is locally Lipschitz continuous; that is, for every m x n matrix Y there
exists a neighbourhood °U of Y and a positive constant K such that

for every Z E W.. The inverse function G is locally Lipschitz continuous as


well. So the zero matrix is a Lipschitz stable solution of (17.9.1) (where
C = 0) if and only if the subspace 9€ — <p" 0 {0} is Lipschitz stable as an
invariant subspace for the matrix

By Theorem 15.5.1, 3C is Lipschitz stable if and only if it is a spectral


invariant subspace for T. This means that o-(A)r\o~(D) = 0. Indeed, if
o-(A) fl cr(D) ^0, then there exists a T-invariant subspace j£ strictly bigger
than $?and such that cr(T\y) = (T\x) [e.g., £= 3C + Span{*0}, where *0 is
an eigenvector of D corresponding to an eigenvalue A0 G <r(A) fl a(D)]. So
2? is not spectral. Conversely, if o-(A) D o-(D) = 0, then with the use of
Lemma 4.1.3, it follows that %C is spectral.

Similarly, one can obtain the following fact from Theorem 15.5.1: the
solution X in (17.9.1) is Lipschitz stable if and only if for every sufficiently
small e > 0 there exists a 5 > 0 such that

implies that the equation

has a unique solution Y satisfying

17.10 THE REAL CASE

In this section we quickly review some real analogs of the results obtained in
this chapter.
554 Applications

Let L( A) be a monic matrix polynomial whose coefficients are real n x n


matrices, and consider a factorization

where L y (A) are monic matrix polynomials with real coefficients. Using the
results of Section 15.9 and the approach developed in the proof of Theorem
17.3.1, one obtains necessary and sufficient conditions for stability of the
factorization (17.10.1) (the analog of Corollary 17.2.2). The definition of a
stable factorization of real monic matrix polynomials is the same as in the
complex case, except that now only real matrix polynomials are allowed as
perturbations of L(A) and as factors in a factorization of the perturbed
polynomial.

Theorem 17.10.1
Let CL be the companion matrix of L( A), and let

be the chain of CL-invariant subspaces in $"' [where I is the degree of L( A)]


corresponding to the factorization (17.10.1). Then (17.10.1) is stable if and
only if the following conditions are satisfied: (a) for every eigenvalue A0 of CL
with geometric multiplicity greater than 1 and for every i (2< / < r), either
Mi D &i0(CL) or M, n ^Ao(CL) = {0}; (b) for every real eigenvalue A0 of CL
with geometric multiplicity of 1 and even algebraic multiplicity, the algebraic
multiplicity of A0 as an eigenvalue of each restriction CL\M (if A0 is an
eigenvalue of CL\M at all) is also even.

In contrast with the complex case (Theorem 17.2.5), not every isolated
real factorization (17.10.1) is stable. Using the description of isolated
invariant subspaces for real transformations (Section 15.9), one finds that
(17.10.1) is isolated if and only if the condition (a) in Theorem 17.10.1
holds.
Now we pass to the stability of minimal factorizations

of a rational matrix function W 0 (A) such that the entries of W 0 (A) are real
for real A. (In short, such rational matrix functions are called real.) The
functions W0i(\) are also assumed to be real, and, in addition, we require
that all rational matrix functions involved are n x n and take value / at
infinity. Again, the stability of (17.10.2) is defined as in the complex case
with only real rational matrix functions allowed. The main result on stability
of (17.10.2) is the following analog of Theorem 17.4.1.
The Real Case 555

Theorem 17.10.2
The minimal factorization (17.10.2) of the real rational matrix function
W0( A) with W0(<x>) = I, where for j = 1, 2,. . . , k, W0j(\) is also a real
rational matrix function with W0/(o°) = I, is stable if and only if the following
conditions hold: (a) each common pole (zero) of W0; and W0p ( j ^ p ) is a
pole (zero) of WQ of geometric multiplicity I', (b) each even order real pole A0
of W0 (resp. of WQ*) is also a pole of each W0j (resp. of each VK^ 1 ) of even
order (if A0 is a pole of WQj or of W^1 at all).

Recall that the geometric multiplicity of a pole (zero) A0 of a rational


matrix function W(\) is the number of negative (positive) partial multi-
plicities of W(\) at A(). In connection with condition (b), observe that the
order of a pole A0 of W 0 (A) is the least positive integer p such that
(A — \Q)PW0(A) is analytic in a neighbourhood of A0. It coincides with the
greatest absolute value of a negative partial multiplicity of W 0 (A) at A 0 , as
one can easily see using the local Smith form for W 0 (A) at A 0 .
We omit the proof of Theorem 17.10.2. It can be obtained in a similar
way to the proof of Theorem 17.4.1 by using the description of stable
invariant subspaces for real transformations presented in Section 15.9.
As in the case of matrix polynomials, not every isolated minimal fac-
torization of a real rational matrix function with real factors is stable (in the
class of real factorizations). It is found that (17.10.2) is isolated if and only if
condition (a) of Theorem 17.10.2 holds. Let us give an example of an
isolated but not stable minimal factorization of real rational matrix func-
tions.

EXAMPLE 17.10.1. Let

One verifies easily that W0(\) = W 01 (A)W 02 (A) and this factorization is
minimal (indeed, the McMillan degree of W0(\) is 2, whereas the McMillan
degree of W 01 (A) and W 02 (A) is 1). Furthermore

so W 01 (A) and W 02 (A) do not have common zeros. It is easily seen that
A0 = 0 is a common pole of W 0 (A), Wol(\), and W 02 (A) and that the only
negative partial multiplicities of W 0 (A), W01( A), and Wm(\) at A 0 are -2, -1
and —1, respectively. Hence condition (a) of Theorem 17.10.2 is satisfied,
556 Applications

but condition (b) is not. It follows that the factorization W0(\) =


W 01 (A)W 02 (A) is isolated but not stable in the class of minimal factorizations
of real rational matrix functions. D

Finally, consider the matrix quadratic equation

where A, B, C, D are known real matrices of sizes n x n, n x m, m x «,


m x m, respectively, and X is a real matrix of size m x n to be found. The
solution of X of (17.10.3) is called isolated if there exists e >0 such that the
set of all real matrices Y satisfying \\X — Y\\ < e does not contain solutions
of (17.10.3) other than X. The solution of (17.10.3) is called stable if for any
c >0 there is 8 >0 such that whenever A', B', C", D' are real matrices of
appropriate sizes with

the equation

has a real solution Y for which || Y - X\\ < e. The isolated and stable
solutions-can be characterized as follows.

Theorem 17.10.3
The solution X0 of (17.10.3) is isolated if and only if every common
eigenvalue of A + BX0 and D — XQB has geometric multiplicity 1 as an
eigenvalue of the matrix

The solution X0 is stable if and only if it is isolated and, in addition, for every
real eigenvalue A0 of T with even algebraic multiplicity the algebraic multiplic-
ity of A0 as an eigenvalue of A + BXQ (or of D — XQB) is even (if A0 is an
eigenvalue of A + BXn, or of D — X0B at all).

In connection with the second statement in this theorem, observe that

and thus the algebraic multiplicity m(T\ A0) for the eigenvalue A0 of T is
equal to the sum of the algebraic multiplicities m(A + BX0; A0) and m(D -
X0B; A0). Consequently, if m(T; A 0 ) is even, then the evenness of one of
Exercises 557

the numbers m(A + BX0; A0) and m(D — X0B; A0) implies the evenness of
the other.
Again, we omit the proof of Theorem 17.10.3. It can be obtained by
using an argument similar to the proofs of Theorems 17.8.3 and 17.9.1,
using the description of stable and isolated invariant subspaces for real
transformations (Section 15.9) and taking into account equation (17.10.4).

17.11 EXERCISES

17.1 Find all stable factorizations (whose factors are linear matrix poly-
nomials) of the monic matrix polynomial

Does L(A) have a nonstable factorization?


17.2 Solve Exercise 17.1 for the matrix polynomial

17.3 Let L( A) be a monic n x n matrix polynomial of degree / such that


CL has nl distinct eigenvalues. Show that any factorization of L(\)
(whose factors are monic matrix polynomials as well) is stable.
17.4 Is any factorization of monic matrix polynomial L(A) stable if CL is
diagonable?
17.5 Show that the factorization L = L1L2L3 of a monic matrix poly-
nomial L(A) is stable if and only if each of the factorizations
L = L 2 M, M = L2L3 is stable, where M = L^1L.
17.6 Is the property expressed in Exercise 17.5 true for Lipschitz
stability?
17.7 Show that a factorization of 2 x 2 monic matrix polynomials L =
L,L2 is stable if and only if one of L{( A 0 ) and L2( A0) is invertible for
every A0 e (p such that L( A 0 ) = 0.
17.8 Let L(A) = /A +Y.i~l0Ai\' be an « x « matrix polynomial whose
coefficients A: are circulant matrices. Show that any factorization

where for / = 1,. . . , r, Ly( A) is a monic matrix polynomial with cir-


culant coefficients, is stable in the algebra of circulant matrices,
in the following sense: for every e > 0 there exists a 8 > 0 such that
every monic matrix polynomial L(A) of degree / with circulant
coefficients that satisfies o-,(L, L)<8 admits a factorization
558 Applications

where L , ( A ) , . . . , Lr(\) are monic matrix polynomials with cir-


culant coefficients and such that

(Here /?; is the degree of Ly and of L y , for ; = 1,. . . , r.)


17.9 Give an example of a nonstable factorization of an n x n matrix
polynomial with circulant coefficients.
17.10 Let L(A) = diagfM^A), M 2 (A)], where M,( A) and M 2 (A) are monic
matrix polynomials of sizes nl x nl and n 2 x « 2 , respectively, and let

be a factorization of L(A), where for ; = l , . . . , r , M1;(A) and


M2J(\) have sizes nl x nl and n2x n2, respectively,
(a) Prove that if (1) is stable, then each factorization

is stable as well.
(b) Show that the converse of statement (a) is generally false.
(c) Show that the factorization (1) is stable in the algebra of all
matrices of type

where A^ (resp. A2) is any n : x nl (resp. n2 x n2) matrix if and


only if each factorization (2) is stable. (Stability in the algebra
of all matrices of type (3) is understood in the same way as
stability in the algebra of circulant matrices, as explained in
Exercise 17.8.)
17.11 Let V be the algebra of all n x n matrices of type

where a; and /3y are complex numbers, and let L(A) be a monic
matrix polynomial with coefficients from the algebra V. Describe
factorizations of L(A) that are stable in the algebra V. (Hint: Use
Exercise 17.7.)
Exercises 559

17.12 Find all stable minimal factorizations of the rational matrix function

Is there a nonstable factorization of this function?


17.13 Prove that every minimal factorization of a scalar rational function
with value / at infinity is stable. (It is assumed that the factors are
scalar rational functions with value / at infinity as well.)
17.14 Let W(\) be a rational matrix function with value / at infinity.
Assume that W(\) has 8 distinct zeros and 8 distinct poles, where 8
is the McMillan degree of W(\). Show that every minimal factoriz-
ation of W(\) is stable.
17.15 Let W( A) be an n x n rational matrix function with value / at infinity
that is a circulant, that is, of type

where w>j(A), vv 2 (A),. . . , wn(\} are scalar rational functions. Show


that every minimal factorization of W(A) is stable in the class of
circulant rational matrix functions.
17.16 Give an example of nonstable minimal factorization of a circulant
rational matrix function with value / at infinity whose factors are also
from this class.
17.17 Let W(A) be a rational matrix function with W(<x>) = /, and let

be a factorization of W(A), where W ; (A) are also rational matrix


functions with value / at infinity. Show that if

is a minimal factorization, then (4) is also minimal. Is the converse


true?
(a) Find all solutions of the matrix quadratic equation

(b) Find all stable solutions of this equation.


(c) Find all Lipschitz stable solutions of this equation.
560 Applications

17.19 (a) Describe all circulant solutions of the equation

with circulant matrices A, B, C, and D.


(b) Can one obtain all circulant solutions of (5), in the event that B
is invertible, by the formula \(D - A)B~l + (\(D - A)2B~2 +
4flCV /2 ?
17.20 Solve the quadratic equation
Notes to Part 3

Chapter 13. This chapter contains mainly well-known results. The main
ideas and results concerning the metric space of subspaces appeared first in
the infinite dimensional framework [see Krein, Krasnoselskii and Milman
(1948); Gohberg and Markus (1959); and also Gohberg and Krein (1957)], and
they are adapted here for the finite-dimensional case. The contents of Sections
13.1 and 13.4 are standard. The exposition presented here is based on that of
Chapters.4 in the authors' book (1982) [see also Kato (1976)]. Theorem 13.2.3
is from Gohberg and Markus (1959). The exposition in Section 13.3 follows
Section 7.2 in Bart, Gohberg, and Kaashoek (1979). Theorem 13.6.3, along with
other related results, was obtained in Gohberg and Leiterer (1972) as a
consequence of general properties of cocycles in certain algebras of continuous
matrix functions. Theorem 13.5.1 appears in the infinite dimensional framework
in Gohberg and Krupnik (1979); here we follow the authors' book (1983b).The
material on normed spaces presented in Section 13.8 is standard knowledge. For
the first part of this section we made use of the exposition in Lancaster and
Tismenetsky (1985).
Chapter 14. The description of connected components in the set of
invariant subspaces (Sections 14.1 and 14.2) is found in Douglas and Pearcy
(1968) [see also Shayman (1982)]. An identification of isolated invariant
subspaces is given in Douglas and Pearcy (1968). Note that in the infinite-
dimensional framework (Hilbert space and bounded linear operators) there
exist inaccessible invariant subspaces that are not isolated [see Douglas and
Pearcy (1968)]. Theorem 14.3.5 was originally proved in the infinite-
dimensional case [Douglas and Pearcy (1968)]. The results on coinvariant
and semiinvariant subspaces in Section 14.5 appear here for the first time.
Chapter 15. Theorem 15.2.1 appeared in Bart, Gohberg and Kaashoek
(1978) and Campbell and Daughtry (1979). The proof presented here
follows the exposition in Bart, Gohberg and Kaashoek (1979). Parts
(a)O(b) of Theorem 15.5.1 was first proved in Kaashoek, van der Mee and
Rodman (1982). The statement of Theorem 15.5.1 and the remaining proof
is taken from Ran and Rodman (1983). Theorem 15.7.1 was proved in
Conway and Halmos (1980). Theorem 15.8.1, although not stated in this
way, was proved in Gohberg and Rubinstein (1985). The material of Section
15.9 is based on Bart, Gohberg and Kaashoek (1979). Theorem 15.10.1 was

561
562 Notes to Part 3

proved in den Boer and Thijsse (1980) and Markus and Parilis (1980).
Theorem 15.10.2 is suggested by Theorem 2.4 in den Boer and Thijsse
(1980).
The results of this chapter play an important role in explicit numerical
computation of invariant subspaces. However, we do not touch the topic of
numerical computation in this book, and refer the reader to the following
sources: Bart, Gohberg, Kaashoek and van Dooren (1980); Golub and
Wilkinson (1976); Ruhe (1970,1970b); van Dooren (1981, 1983); and
Golub and van Loan (1983).
Chapter 16. Most of the results and expositions of the material in this
chapter is taken from Gohberg and Rodman (1986). Corollary 16.1.3
appeared in Brickman and Fillmore (1967). Lemma 16.5.1 is a particular
case of a result due to Ostrowski [see pages 334-335 in Ostrowski (1973)].
Chapter 17. The main results of Section 17.2 (where the case of
factorization into the product of two factors L(A) = Lj(A)L 2 (A) was con-
sidered) are from Bart, Gohberg and Kaashoek (1978). The exposition of
Sections 17.1 and 17.2 follows Gohberg, Lancaster, and Rodman (1982),
where only the case of two factors was considered [see also the authors'
paper (1979)]. The results of Section 17.3 are presented here probably for
the first time. The main part of the contents of Section 17.4, as well as
Theorems 17.6.1 and 17.6.2, is taken from Bart, Gohberg and Kaashoek
(1979). Lemma 17.8.2 is taken from Campbell and Daughtry (1979). The
main results of Section 17.7 are from Gohberg and Rubinstein (1985).
Example 17.10.1 is taken from Chapter 9 in Bart, Gohberg and Kaashoek
(1979).
Part Four

Analytic Properties
of Invariant
Subspaces
This part is devoted to the study of transformations that depend analytically
on a parameter, and to the dependence of their invariant subspaces on the
parameter. We begin with the simplest invariant subspaces, the kernel and
image of the transformation, and this already requires the development of a
theory of analytic families of invariant subspaces. Also, the solution of some
basic problems is required, such as the existence of analytic bases and
analytic complements for analytic families of subspaces. This material is all
presented in Chapter 18 and is probably presented in a book on linear
algebra for the first time. More generally, these results appeared first in the
theory of analytic fibre bundles.
The study of more sophisticated objects and their dependence on the
complex parameter z is the subject of Chapter 19. These include irreducible
subspaces, the Jordan form, and Jordan bases. These results can be viewed
as extensions of perturbation theory for analytic families of transformations.
The final chapter of Part 4 (and of the book) contains applications of the
two preceding chapters to problems that have already appeared in earlier
chapters, but now in the context of analytic dependence on a parameter.
These applications include the factorization of matrix polynomials and
rational matrix functions and the solution of quadratic matrix equations.

563
This page intentionally left blank
Chapter Eighteen

Analytic Families
of Subspaces

In this chapter we study analytic families of transformations and analytic


families of their invariant subspaces. For this purpose, the basic notion of an
analytic family of subspaces is introduced and studied. This notion is of a
local character, and the analysis of its global properties is one of the main
problems of this chapter. In the proofs of Lemmas 18.4.2 and 18.5.2 (only)
we use some basic methods from the theory of infinite-dimensional spaces,
and this leads us beyond the prerequisites in linear algebra required up to
this point. It is shown that the kernel and image of an analytic family of
transformations form two analytic families of subspaces (possibly after
correction at a discrete set of points). Other classes of invariant subspaces
whose behaviour is analytic (at least locally) are also studied. In Section 18.8
we analyze the case when the whole lattice of invariant subspaces behaves
analytically. This occurs for analytic families of transformations with a fixed
Jordan structure.

/*./ DEFINITION AND EXAMPLES

Let O be a domain (i.e., a connected open set) in the complex plane <p, and
assume that for every z E f t a transformation A(z)\ <P"-* 4-"" *s given. We
say that A(z) is an analytic family on (1 if in a neighbourhood Uz of each
point ZQ Efl the transformation valued function A(z) admits representation
as a power series

where A0,Al,..., are transformations from <p" into <pm. Equivalently,


A(z) is said to depend analytically on z in fl if the entries in the matrix

565
566 Analytic Families of Subspaces

representing A(z) in fixed bases in <p" and (pm are analytic functions of z on
the domain ftr Obviously, this definition does not depend on the choice of
these bases.
Now let (M(z)}zen be a family of subspaces in <p". So for every z in ft,
M(z) is a subspace in <p". We say that the family {-M(z)}zefl is analytic on ft
if for every z 0 E f t there exists a neighbourhood Uz Cft of z 0 , a subspace
J< C <p", and an invertible transformation A(z): $"-* <p" that depends
analytically on z in f/z and

It is easily seen that for an analytic family of subspaces {^(z)}zeft the


dimension of M(z) is independent of z. Indeed, (18.1.1) shows that
dim M(z) is fixed for z belonging to the neighbourhood Uz of z0. Since ft is
connected, for any two points z', z"Eft there is a sequence z0 =
z', Z j , . . . , zk = z" of points in ft such that the intersections Uz f~\ Uz ,
i = 1,. . . , k are not empty. Then obviously dim M(z^) = dim M(zi_l), i =
! , . . . , & , and hence dim M(z') = dim M(z").
Let us give some examples of analytic families of subspaces.

Proposition 18.1.1
Let x^z), . . . , xp(z) be analytic functions of z on the domain ft whose values
are n-dimensional vectors. If for every z() €E ft the vectors x}(z0), . . . , xp(zn)
are linearly independent, then

is an analytic family of subspaces.

Proof. Take z 0 G f t , and let yp +l, • . . , yn be vectors in <p" such that


A^ZO), . . . , xp(zQ), yp+l, . . . , } > „ form a basis in <p". Then

As the determinant is a continuous function of its entries and Xj(z),


/'= 1,. . . , p are analytic (and hence continuous) functions of z on H, it
follows that

for all z belonging to some neighbourhood U of z0. Hence


Definition and Examples 567

where M is spanned by the first p coordinate unit vectors in <p", and


Span{jt,(z),. . . , Jtp(z)} is, by definition, analytic on ft.

We see later that the property described in Proposition 18.1.1 is charac-


teristic in the sense that for every analytic family of subspaces, there exists a
basis that consists of analytic vector functions.

Proposition 18.1.2
Let A(z): <£"—»<£"" be an analytic family of transformations on ft, and
assume that dim Ker A(z) is constant (i.e., independent of z for z in ft).
Then Ker A(z) is an analytic family of subspaces (of (p") on ft, whereas
Im A(z) is an analytic family of subspaces (of (p"1) on ft.

Note that dim Ker A(z) is constant on ft if and only if the rank of A(z) is
constant, or, equivalently, the dimension of Im A(z) is constant.

Proof. Write A(z) as an m x n matrix with respect to fixed bases in (p"1


and <p". Take z0 E ft. There exists a nonzero minor of size p x p of A(z0),
where by assumption, p = rank A(z) is independent of z. For simplicity of
notation assume that this minor is in the upper left corner of A(z0). As the
entries of A(z) depend analytically on z, this p x p minor is also nonzero for
all z in a sufficiently small neighbourhood U0 of z 0 . So for any z E U0 [here
we use the assumption that rank A(z) is independent of z], we obtain

where a^z)I is the Jth column of A(z). Let bp +l,. . . , bm be m-dimensional


vectors suetti that fl,(z0), . . . , ap(z0), bp +l, . . . , bm form a basis in (pm, that
is

Again, by the analyticity of flj(z),. . . , a p ( z ) , there exists a neighbourhood


V0 C U0 such that

for all z G VQ. Now for z £ V0 we have

where M = Span{e t ,. . . , ep] C <pm. So, by definition, Im A(z) is an analytic


family of subspaces.
Now consider Ker A(z) and fix a z 0 in ft. There exists a nonzero minor of
568 Analytic Families of Subspaces

size p x p of ^(z 0 ), which will be supposed to lie in the left upper corner of
A(z0). Partition A(z) accordingly:

where £(z), C(z), D(z), and E(z} are matrix functions of sizes p x p,
p x (n - p), (m - p) x m, (m - p) x (H - p), respectively, and are analytic
on ft. For some neighbourhood t/ of z0 we have del B(z) ^ 0 for z E U. If
the vector * , A: e (pp, y E <p"~ p belongs to Ker A(z) and z e U, then

It follows that dim Ker A(z) = dim Ker[-D(z)#(z) 'C(z) + E(z)}. But dim
Ker^4(z) is independent of z and equal to n — p; consequently,
D(z)fi(z)-1C(z) + £(z) = 0 for all z e U. Now, obviously

where - Hence Ker is an analytic family on


ft. D

We see later that the examples of analytic families of subspaces given in


Proposition 18.1.2 are basic. In fact, any analytic family of subspaces is the
image (or the kernel) of an analytic transformation whose values are
projectors.
More generally, without the extra assumption that the dimension of
Ker A(z) is independent of z, the families of subspaces Ker A(z) and
Im A(z), where A(z): $"—*• <pm is an analytic family on ft, are not analytic
on ft. Let us give a simple example illustrating this fact.

exam[le 18.1.1 let

Obviously, A(z): (p -* £ is an analytic family on <p (written as a matrix in


the standard basis in (p ). We have
Analytic Families of Transformations 569

As dim Im A(z) is not constant, the family of subspaces Im A(z) is not


analytic on <p. Similarly, Ker A(z) is not analytic on <p. Note, however, that
by changing Im A(z) at the single point z = 0 (replacing {0} by Span ,
we obtain a family of one-dimensional subspaces Span that is analytic on
<p) (indeed,

Similarly, by changing Ker A(z) at the single point z = 0 we obtain an


analytic family of subspaces Span

18.2 KERNEL AND IMAGE OF ANALYTIC FAMILIES


OF TRANSFORMATIONS

We have observed in the preceding section that, if A(z): <p"—»<p m is an


analytic family of transformations, then, in general, Ker A(z) and Im A(z)
are not analytic families of subspaces. However, Example 18.1.1 suggests
that after a change at certain points Ker A(z) and Im A(z) become analytic
families. It turns out that this is true in general. To make this statement
more precise, it is convenient to introduce some terminology. Let
A(z): <£"-» <P™ be an analytic family of transformations on ft. The singular
set S(A) of A(z) is the set of all z0 £ ft for which

Note that the singular set is discrete; that is, for every z0 £ S(A) there is a
neighbourhood U C ft of z0 such that U fl S(A) = {z0}.

Theorem 18.2.1
Let A(z): <p n —» 4-"" be an analytic family of transformations on ft, and let
r = max zen rank A(z). Then there exist m-dimensional vector-valued func-
570 Analytic Families of Subspaces

tions y \ ( z ) , • . . , yr(z) and n-dimensional vector-valued functions


*j(z), . . . , x n _ r ( z ) that are all analytic on ft and have the following proper-
ties: (a) .Vi(z), . . . , yr(z) are linearly independent for every z E f t ; (b)
jCj(z), . . . , x n _ r ( z ) are linearly independent for every z G ft; (c) for every z
not belonging to the singular set of A(z)

and

For andy z belongting to the singular set of A(z) the inclusing

and

hold.

In particular (Proposition 18.1.1), S p a n { y l ( z ) , . . . , yr(z)} is an analytic


family of subspaces that coincides with Im A(z) outside the singular set oi
A(z}. Similarly, Span{jCi(z), . . . , xn_r(z)} is an analytic family of subspaces
that coincides with Ker A(z) outside S(A).
The proof of Theorem 18.2.1 is based on the following lemma.

Lemma 18.2.2
Let jc,(z), . . . , xr(z) be n-dimensional vector-valued functions that are analy-
tic on a domain ft in the complex plane. Assume that for some z0 G ft, the
vectors *j(z 0 ),. . . , *,(2o) are linearly independent. Then there exist n-
dimensional vector functions ^(z), . . . , yr(z) with the following properties:
(a) y { ( z ) , . . . , yr(z) are analytic on ft; (b) y^z),. . . , yr(z) are linearly
independent for every zGft; (c) Span{_y,(z), . . . , yr(z)} =
Span{x,(z),. . . , x r ( z ) } (C(f") for every z G n ^ n o , where O0 = {zG
H Jc,(z), . . . , xr(z) are linearly dependent}. If, in addition, for some s (^r)
the vector functions JCj(z), . . . , xs(z) are linearly independent for all z G ft,
then y,(z), i — 1,. . . , r can be chosen in such a way that (a)-(c) hold, and
moreover, for all
In the proof of Lemma 18.2.2 we use two classical results (see Chapter 3
of Markushevich (1965), Vol. 3, for example) in the theory of analytic and
meromorphic functions that are stated here for the reader's convenience.
Analytic Families of Transformations 571

Recall that a set SCO is called discrete if for every z G 5 there is a


neighbourhood V of z such that V fl 5 = {z}. (In particular, the empty set
and the finite sets are discrete.) Note also that a discrete set is at most
countable.

Lemma 18.2.3
(Weierstrass's theorem). Let S C H be a discrete set, and for every z 0 G 5 let a
positive integer s(z0) be given. Then there exists a (scalar) function f(z) that is
analytic on H and for which the set of zeros off(z) coincides with S, and for
every z() G 5 the multiplicity of z0 as a zero of /(z) is exactly s(z0).

Lemma 18.2.4
(Mittag-Leffter theorem). Let S C fl be a discrete set, and for every z() G S let
a rational function of type

be given, where k is a positive integer (depending on z 0 ) and ay are complex


numbers (also depending on z 0 ). Then there exists a function f(z) that is
meromorphic on O, for which the set of poles of /(z) coincides with S, and
for every z 0 G S, the singular part of f(z) at z0 coincides with qz (z); that is,
/OO ~ <7z0(z) « analytic at z 0 .

Proof of Lemma 18.2.2. We proceed by induction on r. Consider


first the case r — 1. Let g(z) be an analytic scalar function on O with the
property that every zero of g(z) is also a zero of x } ( z ) having the same
multiplicity, and vice versa. The existence of such a g(z) is ensured by the
Weierstrass theorem given above. Put y to prove
Lemma 18.2.2 in the case r = 1.
Now we can pass on to the general case. Using the induction assumption,
we can suppose that jc t (z),. . . , xr_^(z) are linearly independent for every
z G O. Let X0(z) be an r x r submatrix of the n x r matrix [.^(z), . . . , * r (z)]
such that det X0(zQ) •=£ 0. It is well known in the theory of analytic functions
that the set of zeros of the not identically zero analytic function det XQ(z) is
discrete. Since det X0(zQ) 7^0 implies that the vectors x,(z 0 ),. . . , xr(z0) are
linearly independent, it follows that the set

arr\e linealy dependentt}

is also discrete. Disregarding the trivial case when O0 is empty, we can write
H0 = (£1, £ 2 ,. . .}, where £ E f t , / = 1 , 2 , . . . , is a finite or countable
sequence with no limit points inside fl.
Let us show that for every j = I , 2,. . . , there exist a positive integer Sj
572 Analytic Families of Subspaces

and scalar functions flly(z),. . . , ar_l ;(z) that are analytic in a neighbour-
hood of £y such that the system of /i-dimensional analytic vector functions on
n

has the following properties: for each z ^ £; it is linearly equivalent to the


system *j(z),. . . , xr(z) (i.e., both systems span the same subspace in <p n );
for z = £y it is linearly independent. Indeed, consider the n x r matrix B(z)
whose columns are formed by xv(z),. . . , xr(z). By the induction
hypothesis, there exists an (r — 1) x (r — 1) submatrix B0(z) in the first r — 1
columns of B(z) such that det B0( £;.) ^ 0. For simplicity of notation suppose
that BJz) is formed by the first r - 1 columns and rows in B(z]\ so

where Bj(z), B2(z), and B3(z) are of sizes (r - 1) x 1, (n - r 4-1) x (r - 1),


and ( n - r + l ) x l , respectively. Since fi0(z) *s invertible in a neighbour-
hood of £y, we can write

where W(z) = B3(z) - B2(z)BQ l(z)Bl(z) is an (n - r + 1) x 1 matrix. Let st


be the multiplicity of £; as a zero of the vector function W(z). Consider the
matrix function

Clearly, the columns 6j(z),. . . , br(z) of fi(z) are analytic and linearly
independent vector functions in a neighbourhood V(£/) of £ y . From formula
(18.2.6) it is clear that Spanfjc^z),. . . , xr(z}} = Spanf^^z),. . . , br(z)}
for z G V(£.) ^ T. Further, from (18.2.6) we obtain

and
Analytic Families of Transformations 573

So the columns b,(z),. . ., br(z) of B(z) have the form (18.2.5), where
a^z) are analytic scalar functions in a neighbourhood of £;.
Now choose y^z),... , yr(z) in the form

where the scalar functions gt(z) are constructed as follows: (a) gr(z) is
analytic and different from zero in O except for the set of poles £,, £2, . . . ,
with corresponding multiplicities s{,s2,...; (b) the functions gt(z) (for
/ = l , . . . , r - l ) are analytic in H except for the poles £j, £ 2 ,. . . , and the
singular part of g,(z) at £/ (f°r / = 1, 2,. . .) is equal to the singular part of
a,7(z)£r(z) at Cr
Let us check the existence of such functions g^z). Let gr(z) be the inverse
of an analytic function with zeros at £j, £ 2 ,. . . , with corresponding multi-
plicities slys2,. . . (such an analytic function exists by Lemma 18.2.3). The
functions g { ( z ) , . .. , g r _j(z) are constructed by using the Mittag-Leffler
theorem (Lemma 18.2.4).
Property (a) ensures that y^z),. . . , yr(z) are linearly independent for
every z G H "- { £,, £ 2 ,. . .}. In a neighbourhood of each L we have

where the final ellipsis denotes a vector function that is analytic in a


neighbourhood of £y and assumes the value zero at £;. Formula (18.2.7) and
the linear independence of vectors (18.2.5) for z = ^ ensures that
y\(Cj), • • • , yr(£j) are linearly independent. Finally, the last statement of
Lemma 18.2.2 follows from the proof of the first part of this lemma.

Proof of Theorem 18.2.1. Let A0(z) be an r x r submatrix of A(z) that


is nonsingular for some z e H, that is, del AQ(z) ^ 0. So the set O0 of zeros
of the analytic function det>! 0 (z) is either empty or consists of isolated
points. In what follows we assume for simplicity that AQ(z) is located in the
top left corner of A(z) of size r x r.
Let ^ t (z),. . . , xr(z) be the first r columns of A(z), and let
yt(z),. . . , yr(z) be the vector functions constructed in Lemma 18.2.2. Then
for each z G O "^ ft0 we have
574 Analytic Families of Subspaces

[The last equality follows from the linear independence of Jt^z),. . . , xr(z)
for z E O ^ O0.] We now prove that

Equality (18.2.8) means that for every z E H ^ n o there exists an r x r


matrix B(z} such that

where Y(z) - [y,(z),. . . , yr(z)]. Note that B(z) is necessarily unique.


[Indeed, if B'(z) also satisfies (18.2.10), we have Y(z)(B(z) - B ' ( z ) ) = 0,
and, in view of the linear independence of the columns of Y(z), B(z) =
B'(z).] Further, B(z) is analytic in fl^fl 0 . To check this, pick an arbitrary
z ' E f t ^ n o , and let yo(z) be an rxr submatrix of Y(z) such that
det(y o (z'))^0. [For simplicity of notation assume that F0(z) occupies the
top r rows of F(z).] Then det(y o (z))^0 in some neighbourhood V of z',
and (Y0(z)yl is analytic on z e V. Now Y(z)~L =f [(yo(z))"!, 0] is a left
inverse of Y(z); premultiplying (18.2.10) by Y(z)~L, we obtain

So B(z) is analytic on z E V; since z' G H "^ ft0 was arbitrary, B(z) is analytic
on fl ^n o .
Moreover, B(z) admits analytic continuation to the whole of Q, as
follows. Let z 0 e n o , and let Y(z)~L be a left inverse of Y(z), which is
analytic in a neighbourhood V0 of z 0 . [The existence of such Y(z) is proved
as above.] Define B(z) as y(z)^^4(z) for z e V0. Clearly, B(z) is analytic on
V{}, and for z E V0 ^ (z 0 ), this definition coincides with (18.2.11) in view of
the uniqueness of B(z). So B(z) is analytic on il.
Now it is clear that (18.2.10) holds also for zEfl 0 , which proves
(18.2.9). Consideration of dimensions shows that in fact we have an equality
in (18.2.9), unless rank A(z) < r. Thus (18.2.1) and (18.2.3) are proved.
We pass now to the proof of existence of y r + 1 (z),. . . , yn(z) such that
(b), (18.2.2), and (18.2.4) hold. Let a,(z),. . . , ar(z) be the first r rows of
A(z). By assumption fl,(z")' . . . , a r ( z ) are linearly independent for some
z E11. Apply Lemma 18.2.2 to construct ^-dimensional analytic row func-
tions bj(z), . . . , br(z) such that for all z E H the rows b,(z),. . . , br(z) are
linearly independf ~* — J r <- r» -^ r»
Global Properties of Analytic Families of Subspaces 575

Fix z0 E ft, and let br+l,. . . , br be n-dimensional rows such that the vectors
b,(z 0 ) r ,. . . , br(z0)T, bTr+l,. . . , bTn form a basis in <p". Applying Lemma
18.2.2 again [for *,(z) = b,(z}\ . . . , x,(z) = b,(z)\ *,+1(z) =
fcj+i,. - . , Jtn(z) = bj], we construct n-dimensional analytic row functions
br+i(z), • • • > ^«( z ) sucn tnat tne rt x n matrix

is nonsingular for all z E f t . Then the inverse fi(z)"1 is analytic on ft. Let
y r + 1 (z),. . . , yn(z) be the last (n - r) columns of B(z)~l. We claim that (b),
(18.2.2), and (18.2.4) are satisfied with this choice.
Indeed, (b) is evident. Take z E ft ^ ft0; from (18.2.12) and the construc-
tion of yr+l(z),. . . , yn(z) it follows that

But since z^ft 0 , every row of A(z) is a linear combination of the first r
rows. So in fact

Now (18.2.13) implies that for z £ f t ^ f t

Passing to the limit when z approaches a point from ft0, we find that
(18.2.14), as well as the inclusion (18.2.13), holds for every z E f t . Con-
sideration of dimensions shows that the equality holds in (18.2.13) if and
only if rank A(z) = r. D

18.3 GLOBAL PROPERTIES OF ANALYTIC FAMILIES OF SUBSPACES

In the definition of an analytic family of subspaces the transformation A(z)


and the subspace M depend on z 0 , so the definition of an analytic family of
subspaces has a local character. However, it turns out that for a given
analytic family of subspaces M(z) there exists an analytic family A(z) and a
subspace M independent of z0 for which the equality M(z) = A(z)M holds.
576 Analytic Families of Subspaces

Theorem 18.3.1
Let {^(z)}z6ft bg an analytic family of subspaces (o/<p") on ft. Then there
exist invertible transformations A(z)\ $"—> <p" that are analytic on ft, and a
subspace M C $" such that M(z) = A(z}M, for all z £ ft.

The lengthy proof of Theorem 18.3.1 is relegated to the next two


sections. First, we wish to emphasize that this is a particularly important
result concerning analytic families of subspaces and has many consequences,
some of which we describe now.

Theorem 18.3.2
For an analytic family of subspaces M(z) (of <JT") on ft the following
properties hold: (a) there exist n-dimensional vector functions
jCj(z),. . . , xp(z) that are analytic on ft and such that, for each z Eft, the
vectors ^(z),. . . , xp(z) are linearly independent and

(b) there is an analytic family of projectors P(z) defined on ft such that


M(z) = Im P(z) for all z E ft; (c) for every z E ft there exists a direct
complement N(z) to M(z) in £" such that the family of subspaces N(z) is
analytic.

Proof. Let A(z) and M be as in Theorem 18.3.1, and let xl, . . . , xp be a


basis in M. Then x,(z) = A(z)xi, i — 1,. . . , p satisfy (a). To satisfy (b), put
P(z) = A(z)PA(z)~l, where P is a projector on M. Finally, the family of
subspaces ^V(z) = A(z)N, where N is a direct complement in M in (p",
satisfies (c). D

Note that property (b) [as well as property (a)] is characteristic for
analytic families of subspaces. So, if P(z) is an analytic family of projectors
on ft, then Im P(z) is an analytic family of subspaces. We leave the
verification of this statement to the reader.
In connection with Theorem 18.3.2 (c), note that the orthogonal comple-
ment M{z)^ is usually not an analytic family, as the next example shows.

EXAMPLE 18.3.1. For any z E <p let

Then
Global Properties of Analytic Families of Subspaces 577

which is not analytic. Indeed, if M(z)L were analytic, then for z in a


neighbourhood U of each point z0 E <J7 we would have

where A (z) is a 2 x 2 analytic family of invertible matrices and M is a fixed


one-dimensional subspace that, without loss of generality, may be assumed
equal to Span{el}. So

on U, where he first column of A(z). Hence a

However, the function z is not analytic in U, so (18.3.1) cannot happen. D

In the next section we will need the following generalization of Theorem


18.3.2.

Theorem 18.3.3
Let M(z) and N(z) be analytic families of subspaces (of <p") on ft such that
M(z) C ^V(z) for all z E ft. Then there exist n-dimensional vector functions
jc t (z),. . . , xp(z) [where p - dim N(z) - dim M(z)] that are analytic on ft
and such that, for each z E ft, the vectors JCj(z),. . . , xp(z) form a basis in
N(z) modulo M(z).

Proof. By Theorem 18.3.2 there are bases y\(z),. . . , ys(z) in M(z) and
i>,(z), . . . , v((z} in N(z) that are analytic on ft. By Lemma 18.2.2 there
exist analytic vector functions ys+l(z), . . . , y,(z) such that y ^ ( z ) , . . . , y,(z)
are linearly independent for each z E ft and

Obviously, ys+l(z),. . . , y((z) is the desired analytic basis in ^V(z) modulo


M(z).

We note one more consequence of Theorem 18.3.1.

Corollary 18.3.4
Let Jt^z),. . . , M k ( z ) be analytic families of subspaces (of <p") on ft, and
assume that for each z Eft, (p" is a direct sum of M ^ ( z ) , . . . , M k ( z ) . Then,
578 Analytic Families of Subspaces

given ZQ E H, there exists a family ofinvertible transformations 5(z): <pn —* <p"


that is analytic on ft and for which S(z)Mi(zQ) = M^z} on ft, and S(z0) = /.

Proof. It follows from Theorem 18.3.1 that there exist analytic families
of invertible transformations S,(z): <p"-» (p", / = !,..., k, such that
Sj(z 0 ) = / and 5 ( (z)^,( 2 o) = ^/( z ) f°r a U z E <p. Now the transformation
5(z): <p"-» <p" defined by the property that S(z)x = St(z)x for all x E M^z^}
satisfies the requirements of Corollary 18.3.4.

18.4 PROOF OF THEOREM 18.3.1 (COMPACT SETS)

As a first step towards the proof of Theorem 18.3.1, a result is proved in this
section that can be considered as a weaker version of that theorem. We
say that a function /(z) (whose values may be vectors, or transformations) is
analytic on a compact set ATCft if /(z) is analytic on some open set
containing K.

Theorem 18.4.1
Let K C n be a compact set, and let M(z) C <p" be an analytic family of
subspaces on H. Then there exist vector functions /,(z),. . . , fr(z) E <p" that
are analytic on K and such that / t (z),. . . , fr(z) is a basis in M(z) for every
zG/C.

In turn, we need some preliminaries for the proof of Theorem 18.4.1.


First, we introduce the notion of an incomplete factorization. Let A(z) be an
n x n matrix function that is analytic on a neighbourhood of the unit circle
and is nonsingular on the unit circle. An incomplete factorization of A(z) is a
representation of the form

that holds whenever |z| = 1 and the family +A(z) is nonsingular and analytic
on the disc |z| < 1, and the family A(z) is nonsingular and analytic on the
annulus 1 ^ |z| <°°.

Lemma 18.4.2
Every n x n matrix function A(z) that is analytic and nonsingular on a
neighbourhood of the unit circle admits an incomplete factorization.

Proof. Consider first the case when A(z) is analytic on the disc |z ^ 1.
Let z 0 be a zero of det A(z) with |zj < 1. Then for some invertible matrix
T0 the first row of T0A(z) is zero at the point z0. Put
Proof of Theorem 18.3.1 (Compact Sets) 579

Then A(z) - A^(z) +A^(z)\ moreover, A,(z) is analytic and invertible for
1 < |z| <oo ? +Al(z) is analytic and invertible for |z| ^ 1, and the number of
zeros of det +A j(z) inside the unit circle is strictly less than that of det A(z).
If det +A j(z) ^ 0 for |z| ^ 1, then /l(z) = ^4 ,(z) +/1 ,(2) is an incomplete fac-
torization of A(z). Otherwise, we apply the construction above to ^(z),
and after a finite number of steps an incomplete factorization of A(z) is
obtained.
Now it is easy to prove Lemma 18.4.2 for the case that A(z) is
meromorphic in the disc |z| < i (more exactly, admits a meromorphic con-
tinuation into the disc). Indeed, let z t , . . . , zk be all the poles of A(z) inside
the unit disc with orders al,...,ak, respectively. Then the function
B(z) = nf =1 (z - ziYiA(z) is analytic for z < 1 and thus (according to the
assertion proved in the preceding paragraph) admits an incomplete fac-
torization: B(z) = ~B(z) +B(z). So (18.4.1) with 'A(z) = {n*=1(z -
z-) -Q( '} B(z); +A(z) = *B(z) is an incomplete factorization of A(z).
Now consider the general case. Let e > 0 be such that A(z) is analytic and
invertible in the closed annulus <i> = { z E < p l - e ^ | z | < l + e}. In the
sequel we use some basic and elementary facts about the structure of the set
Cw of aU n x n matrix functions X(z) that are continuous in the closed
annulus 4> and analytic in the open annulus 4> = { z E < p | l — e < | z | < l + e}.
The set Cw is an algebra with pointwise addition and multiplication of
matrices and multiplication by scalars, that is, for z e O and A^z), Y(z) E CM
we define

Introduce the following norm in €„:

where A r (z)G Cw. It is easily seen that this is indeed a norm; that is, the
axioms (a)-(c) of Section 13.8 are satisfied. Moreover

for X, Y£ Cw. In fact, the normed algebra CM is a Banach algebra, which


means that each Cauchy sequence converges in the norm || • ||c to some
function in CM. This follows from the fact that the uniform limit of
continuous functions on <I> is itself a continuous function on <f>, and the limit
of analytic function on <I> which is uniform on each compact set in <J> is itself
analytic on $>.
580 Analytic Families of Subspaces

Let M + be the set of all matrix functions from C^ that admit an analytic
continuation to the set ( z E < p | | z | < l — e} and let M_ be the set of all
matrix functions from Cw that admit an analytic continuation to the set
( z E < p | | z | > l + e}U {°°} and assume the zero value at infinity. It is easily
seen (as for C w ) that M+ and M_ are closed subspaces in the norm || • ||c .
Clearly, M+ n M_ = {0} (here 0 stands for the identically zero n x « matrix
function on <f>). Furthermore, M+ + M_ = Cli>. Indeed, recall that every
function X(z) £ Cw can be developed into the Laurent series

where the functions

belong to M+ and M _ , respectively. Denoting P+(X(z)) = X+(z), we obtain


a projector P+: Cw -» Cw with Im P+ - M+ and Ker P+ = M _ . It turns out
that P+ is bounded, that is

[See page 225 in Gohberg and Goldberg (1981), for example; the proof is
based on Banach's theorem that every bounded linear operator that maps a
Banach space onto itself, and is one-to-one, has a bounded inverse.]
Return to our original matrix function A(z). Clearly A(z)⁻¹ ∈ C_ω, and
the Laurent series A(z)⁻¹ = Σ_{j=−∞}^{∞} z^j A_j converges uniformly in the
annulus 1 − ε ≤ |z| ≤ 1 + ε. Therefore, for some N the matrix function
A_N(z) = Σ_{j=−N}^{N} z^j A_j has the following properties: det A_N(z) ≠ 0 for
1 − ε ≤ |z| ≤ 1 + ε and

where M(z) ∈ C_ω and

Let

⁺N = P₊(M)

Then, since ‖⁺N‖_C ≤ ‖P₊‖·‖M‖_C < 1, the function I + ⁺N is invertible in the
algebra C_ω. (Here I represents the constant n × n identity matrix.) Denote
⁺G = (I + ⁺N)⁻¹. Then ⁺G and (⁺G)⁻¹ belong to the image of P₊. In

particular, ⁺G and (⁺G)⁻¹ are analytic in the disc |z| < 1. Furthermore, one
checks easily that

so the function ⁻G = (I + ⁺N)(I − M) is analytic for 1 ≤ |z| < ∞ and at
infinity. As

⁻G is invertible in C_ω. Since both ⁻G and I belong to the (closed)
subalgebra C⁻ = {αI + Ker P₊ | α ∈ ℂ} of C_ω, also (⁻G)⁻¹ ∈ C⁻. Now
write

and use the fact (proved in the preceding paragraph) that the function
(⁺G(z))⁻¹(A_N(z))⁻¹, which is meromorphic on the unit disc, admits an
incomplete factorization. □

Lemma 18.4.3
Let f₁, …, f_r ∈ ℂⁿ and g₁, …, g_r ∈ ℂⁿ be two systems of analytic and
linearly independent vectors on Ω such that

for z ∈ Ω₀, where Ω₀ ⊂ Ω is a set with at least one limit point inside Ω. Then

Span{f₁(z), …, f_r(z)} = Span{g₁(z), …, g_r(z)}   (18.4.3)

for every z ∈ Ω and

where A(z) is an r × r matrix function that is invertible and analytic on Ω.
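Equation (18.4.4) is missing here. Judging from the uniqueness claim and the use of Cramer's formulas in the proof below, it is presumably the change-of-basis relation

\[
% assumption: direction of the change of basis inferred from the proof
[\,f_1(z), \ldots, f_r(z)\,] = [\,g_1(z), \ldots, g_r(z)\,]\, A(z), \qquad z \in \Omega,
\tag{18.4.4}
\]

read as an equality of n × r matrix functions.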

Proof. Consider the system Π = {f₁, …, f_r, g₁, …, g_r} of 2r n-
dimensional vectors. Then rank Π(z) = r for z ∈ Ω₀. On the other hand, the
set {z₀ ∈ Ω | rank Π(z₀) < max_{z∈Ω} rank Π(z)} is discrete. Thus
r = max_{z∈Ω} rank Π(z), and (18.4.3) holds for every z ∈ Ω because both systems
f₁(z), …, f_r(z) and g₁(z), …, g_r(z) are linearly independent. Consequently,
there exists a unique matrix function A(z) such that (18.4.4) holds. It
remains to prove that A(z) is analytic on Ω. Let z₀ ∈ Ω, and suppose, for
example, that the square matrix X(z) formed by the upper r rows of

[g₁(z), …, g_r(z)] is invertible for z = z₀. Computing A(z) in a neighbour-
hood of z₀ by Cramer's formulas, we see that A(z) is analytic in a
neighbourhood of z₀. Thus A(z) is analytic on Ω. □

Proof of Theorem 18.4.1. Without loss of generality, we can suppose
that K is a connected set (otherwise consider a larger compact set). Fix
z₀ ∈ K, and let N₀ be some direct complement to M(z₀) in ℂⁿ. Then

is a direct sum decomposition for every z ∈ K except maybe for a finite set
of points z₁, …, z_k. Indeed, by the definition of an analytic family of
subspaces, for every η ∈ K there exists a neighbourhood U_η of η and an
analytic and invertible matrix function B_η(z) defined on U_η such that
B_η(z)M = M(z) on U_η, where M is a fixed subspace in ℂⁿ. We can assume
[by changing B_η(z) if necessary] that the subspace M is independent of η.
[Here we use the fact that dim M(z) is constant because of the connected-
ness of Ω.] Actually, we assume M = M(z₀). Let x₁, …, x_r be some basis in
M(z₀), and let x_{r+1}, …, x_n be a basis in N₀. Then for z ∈ U_η the subspaces
M(z) and N₀ are direct complements to each other if and only if

Two cases can occur: (a) D_η(z) ≡ 0 for z ∈ U_η; (b) D_η(z) ≢ 0, and then we
can suppose (taking U_η smaller if necessary) that D_η(z) = 0 only at a finite
number of points of U_η. Let us call the points η for which (a) holds points of
the first kind, and the points η for which (b) holds points of the second kind.
Since K is connected, all η ∈ K are of the same kind, and since z₀ is of the
second kind, all η ∈ K are of the second kind. Further, let U_{η₁}, …, U_{η_l} be a
finite covering of the compact set K. Since D_{η_j}(z) = 0 only at a finite number
of points z in U_{η_j}, j = 1, …, l, we find that (18.4.5) holds for every z ∈ K
except possibly for a finite number of points z₁, …, z_k ∈ K.
By the definition of an analytic family of subspaces, there exist neighbour-
hoods U(z₁), …, U(z_k) of z₁, …, z_k, respectively, and functions
B^{(1)}(z), …, B^{(k)}(z) that are invertible and analytic on U(z₁), …, U(z_k),
respectively, such that

Let x₁^{(j)}, …, x_r^{(j)} be some basis of the subspace M(z_j), and let
g_i^{(j)}(z) = B^{(j)}(z)x_i^{(j)} (i = 1, …, r; z ∈ U(z_j); j = 1, …, k). Then for ρ > 0 small
enough we have

as long as |z − z_j| ≥ ρ for j = 1, …, k. Let

For every z ∈ K ∖ S let P(z) be the projector on M(z) along N₀. Then we
claim that P(z) is an analytic function on K ∖ S.
Indeed, we have to prove this assertion in a neighbourhood of every
μ₀ ∈ K ∖ S. Let U₀ be a neighbourhood of μ₀ in the set K ∖ S such that,
when z ∈ U₀, M(z) = B(z)M(μ₀) for some analytic and invertible matrix
function B(z) on U₀. The matrix function B̃(z) defined on U₀ by the
properties that B̃(z)x = B(z)x for all x ∈ M(μ₀) and B̃(z)y = y for all y ∈ N₀
is analytic and invertible. As P(z) = B̃(z)P₀(B̃(z))⁻¹, where P₀ is the
projector on M(μ₀) along N₀, the analyticity of P(z) on U₀ follows.
Let us now prove that there exist vector functions f₁^{(0)}(z), …, f_r^{(0)}(z)
that are analytic on K ∖ S and for which

where z ∈ K ∖ S. Indeed, let z₀ ∈ K ∖ S be a fixed point. Then
dim Im P(z₀) = r; let g₁^{(0)}(z), …, g_r^{(0)}(z) be columns of P(z) that are
linearly independent for z = z₀. In view of Lemma 18.2.2, there exist
analytic and linearly independent vector functions f₁^{(0)}(z), …, f_r^{(0)}(z)
defined on K ∖ S such that

for every z ∈ K ∖ S, except maybe for a finite set of points. (The set of
exceptional points is at most finite because of the compactness of K ∖ S.)
But from the choice of g₁^{(0)}, …, g_r^{(0)} it follows that

for every z ∈ K ∖ S, except perhaps for a finite set of points [viz., those
points z for which the vectors g₁^{(0)}(z), …, g_r^{(0)}(z) are not linearly indepen-
dent]. Thus

for every z ∈ K ∖ S except maybe for a finite number of points. As both
sides of (18.4.6) are analytic families of subspaces on K ∖ S (Proposition
18.1.1), it is easily seen that, in fact, (18.4.6) holds for every z ∈ K ∖ S.
Consider now the systems {f₁^{(0)}(z), …, f_r^{(0)}(z)} and
{g₁^{(1)}(z), …, g_r^{(1)}(z)}. These systems form two bases for M(z) that are
analytic in a neighbourhood of the circle |z − z₁| = ρ. Therefore, by Lemma

18.4.3 there exists an r × r matrix function A(z) analytic and invertible on a
neighbourhood U of the set {z ∈ ℂ | |z − z₁| = ρ} and such that, for all

By Lemma 18.4.2, the function A(z) admits an incomplete factorization
relative to the circle {z | |z − z₁| = ρ}: A(z) = ⁻A(z)·⁺A(z) (|z − z₁| = ρ).
In view of (18.4.7), we find that, when |z − z₁| = ρ

Clearly, the functions f₁^{(1)}(z), …, f_r^{(1)}(z) can be continued analytically
to the set K ∖ (S₂ ∪ ⋯ ∪ S_k). Moreover, since ⁺A(z) [resp. ⁻A(z)] is invert-
ible for |z − z₁| < ρ (resp. |z − z₁| > ρ), the set f₁^{(1)}(z), …, f_r^{(1)}(z) is
linearly independent for every z ∈ K ∖ (S₂ ∪ ⋯ ∪ S_k). Furthermore, for
any z ∈ K ∖ (S₂ ∪ ⋯ ∪ S_k), we obtain

Now take the point z₂ and apply similar arguments, and so on. After k steps
one obtains the conclusion of Theorem 18.4.1. □

18.5 PROOF OF THEOREM 18.3.1 (GENERAL CASE)

In this section we finish the proof of Theorem 18.3.1. The main idea is to
pass from the case of compact sets (Theorem 18.4.1) to the case of a general
domain Ω. To this end we need some approximation theorems.
A set M ⊂ ℂ is called finitely connected if M is connected and ℂ ∖ M
consists of a finite number of connected components. A set N ⊂ M is called
simply connected relative to M if for every connected component Y of ℂ ∖ N
the set Y ∩ (ℂ ∖ M) is not empty. The first of the necessary approximation
theorems is the following.

Lemma 18.5.1
Let K ⊂ Ω be a finitely connected compact set that is also simply connected
relative to Ω. Let Y₁, …, Y_s be all the bounded components of ℂ ∖ K, and,
for j = 1, …, s, let z_j ∈ Y_j ∖ Ω be fixed points. Let A(z) be an m × n matrix
function that is analytic on K. Then for every ε > 0 there exists a rational
matrix function B(z) of size m × n such that B(z) is analytic on
ℂ ∖ {z₁, …, z_s} and, for any z ∈ K,

Proof. Without loss of generality we will suppose that m = n = 1; that
is, the functions A(z) and B(z) are scalars. We prove that it is possible to
choose a rational function of the form

where the x_{jv} ∈ ℂ, such that |A(z) − R(z)| < ε for any z ∈ K. Let
U ⊂ Ω ∖ {z₁, …, z_s} be a neighbourhood of K whose boundary ∂U con-
sists of s + 1 closed simple rectifiable contours. Then for z ∈ K, we obtain

Since this integral can be uniformly approximated by Riemann sums, we
have to prove only that the function (η − z)⁻¹ can be uniformly approxi-
mated by functions of the form Σ_{v=0}^{k} (z − z_j)^{−v} x_v, x_v ∈ ℂ, where
η ∈ ∂U ∩ Y_j (j = 1, …, s), and that (η − z)⁻¹ can be approximated uniformly by
the polynomials Σ_{v=0}^{k} z^v x_v (x_v ∈ ℂ) where η ∈ ∂U ∩ (ℂ ∖ (K ∪ Y₁ ∪ ⋯ ∪
Y_s)). But this assertion follows from Runge's theorem [Chapter 4 of
Markushevich (1965), Vol. 1], which states that, given a simply connected
domain F in ℂ ∪ {∞} and a point ζ in the interior of (ℂ ∪ {∞}) ∖ F, any
analytic function f(z) on F is the limit of a sequence of rational functions
with their only pole at ζ, and the convergence of this sequence to f(z) is
uniform on every compact subset of F. Indeed, for j = 1, …, s the set
bounded by the contour ∂U ∩ Y_j is simply connected, as is the set
(ℂ ∪ {∞}) ∖ (K ∪ Y₁ ∪ ⋯ ∪ Y_s). □
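The two displays missing from this proof are presumably the rational approximant and the Cauchy integral formula; a reconstruction consistent with the surrounding argument (the coefficients x_{jv} and the degree k are assumptions):

\[
% assumption: form of R(z) with poles only at z_1,...,z_s and infinity
R(z) = \sum_{v=0}^{k} x_{0v}\, z^v
     + \sum_{j=1}^{s} \sum_{v=1}^{k} x_{jv}\,(z - z_j)^{-v},
\qquad
A(z) = \frac{1}{2\pi i} \int_{\partial U} \frac{A(\eta)}{\eta - z}\, d\eta \quad (z \in K).
\]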
Lemma 18.5.2
Let K and z₁, …, z_s be as in Lemma 18.5.1. If A(z) is an n × n matrix
function that is analytic on K and invertible for every z ∈ K, then for every
ε > 0 there exists an analytic and invertible matrix function B(z) defined on
ℂ ∖ {z₁, …, z_s} such that (18.5.1) holds for any z ∈ K.

Proof. Denote by G the group of all n × n matrix functions M(z) that
are analytic on K and invertible for every z ∈ K, together with the topology
induced by the norm ‖M‖_G = max_{z∈K} ‖M(z)‖. Let G_I be the connected
component of the topological space G that contains I, the constant n × n
identity matrix. In fact

G_I = {X ∈ G | there exist an integer ν > 0 and M₁, …, M_ν ∈ G, ‖M_j‖_G < 1,
such that X = (I − M₁) ⋯ (I − M_ν)}   (18.5.3)

Indeed, denoting the right-hand side of (18.5.3) by G₀, let us prove first
that G₀ is both a closed and an open set in G. Let F ∈ G₀ and H ∈ G be
such that ‖H − F‖_G < ‖F⁻¹‖_G⁻¹. Then H = (I − M)F, where M = I − HF⁻¹.
We have

that is, H ∈ G₀. So G₀ is open. Suppose now that F_j ∈ G₀, j = 1, 2, …, and
‖F_j − F‖_G → 0 for some F ∈ G. Let j₀ be large enough so that
‖F_{j₀} − F‖_G < ‖F⁻¹‖_G⁻¹. Then F = (I − M)F_{j₀}, where ‖M‖_G = ‖I − F F_{j₀}⁻¹‖_G < 1;
that is, F ∈ G₀. So G₀ is a closed set.
Now let us prove that G₀ is connected. Let

then

is a continuous function that connects X and I in G₀. So G₀ is connected and
thus is the connected component of G that contains I. So (18.5.3) is proved.
As a side observation, note that G₀ is also a subgroup of G. Indeed, let
X, Y ∈ G₀. Then the set X·G₀⁻¹ is connected and contains I; therefore,
X·G₀⁻¹ is contained in the connected component G₀. In particular XY⁻¹ ∈ G₀,
so G₀ is a subgroup of G.
Now let A(z) be as in Lemma 18.5.2, and suppose first that A ∈ G_I. Then

for some M₁, …, M_ν ∈ G with ‖M_j‖_G < 1 for j = 1, …, ν. Rewrite this
representation in the form

A = exp(ln(I − M₁)) ⋯ exp(ln(I − M_ν))

where

By Lemma 18.5.1, for each j = 1, …, ν there exists a rational n × n matrix
function D_j whose poles are contained in {z₁, …, z_s, ∞}, with the property
that D_j approximates the analytic function ln(I − M_j(z)) well enough to
ensure that the analytic matrix function B(z) = exp(D₁(z)) ⋯ exp(D_ν(z))

satisfies (18.5.1) for every z ∈ K. Clearly, B(z) is invertible for every
z ∈ ℂ ∖ {z₁, …, z_s}, so the lemma is proved in the case A(z) ∈ G_I.
We now pass to the general case. Let G_A be the connected component of
G that contains A(z). It suffices to show that there exists an n × n matrix
function D(z) that is analytic and invertible in ℂ ∖ {z₁, …, z_s} and such
that D(z) ∈ G_A. Indeed, then A(z)D(z)⁻¹ ∈ G_I, and as we have seen
already, there exists an analytic and invertible matrix function B̃(z) in
ℂ ∖ {z₁, …, z_s} with the property that ‖B̃ − AD⁻¹‖_G < ε‖D‖_G⁻¹. The
matrix function B(z) = B̃(z)D(z) is the desired one.
Thus let us prove the existence of D(z). According to Lemma 18.5.1, for
every δ > 0 there exists a rational matrix function D₀(z) that is analytic on
ℂ ∖ {z₁, …, z_s} and such that ‖D₀(z) − A(z)‖ < δ when z ∈ K. Choose
δ > 0 small enough to ensure that D₀(z) is invertible for z ∈ K and
D₀ ∈ G_A. Since D₀(z) is a rational function, det D₀(z) ≠ 0 for every
z ∈ ℂ ∖ {z₁, …, z_s} except perhaps for a finite set of points
η₁, …, η_m ∈ ℂ ∖ {z₁, …, z_s}, which do not belong to K.
Denote by Y(η₁) the connected component of ℂ ∖ K that contains η₁,
and let z(η₁) be the point from {∞, z₁, …, z_s} that belongs to Y(η₁). Let
ρ > 0 be such that the disc {z ∈ ℂ | |z − η₁| < ρ} is contained in
Y(η₁) ∖ {z(η₁), η₂, …, η_m}. By Lemma 18.4.2 there exists an incomplete
factorization of D₀(z) with respect to the circle |z − η₁| = ρ:

where ⁺D₀(z) is analytic and invertible in the disc {z ∈ ℂ | |z − η₁| < ρ} and
⁻D₀(z) is analytic and invertible for ρ ≤ |z − η₁| ≤ ∞. The equality
⁻D₀ = D₀(⁺D₀)⁻¹ shows that ⁻D₀ admits analytic continuation to the whole of ℂ,
and ⁻D₀(z) is invertible for all z ≠ η₁. Also, ⁺D₀ is analytic and invertible on
ℂ ∖ {z₁, …, z_s, η₂, …, η_m} ⊃ K.
Let y(t), 0 ≤ t ≤ 1, be a continuous function with values in Y(η₁) such that
y(0) = η₁, y(1) = z(η₁). Then the formula

defines a continuous map F: [0, 1] → G with F₀ = D₀. Hence
D₁ = F₁·⁺D₀ ∈ G_A. As ⁺D₀ is invertible on ℂ ∖ {z₁, …, z_s, η₂, …, η_m}
and F₁(z) = ⁻D₀(z + η₁ − z(η₁)) is invertible on ℂ ∖ {z(η₁)}, it follows that
D₁(z) is analytic and invertible on ℂ ∖ {z₁, …, z_s, η₂, …, η_m}. Repeating
this argument m − 1 times with respect to the points η₂, …, η_m, we obtain
the desired function D(z). □

The following lemma is the main approximation result that will be used in
the transition from compact sets in Ω to the domain Ω itself.

Lemma 18.5.3
Let K ⊂ Ω be a finitely connected compact set that is also simply connected
relative to Ω. Let M ⊂ ℂⁿ be a fixed subspace and A(z) be an n × n matrix
function that is analytic and invertible on K and such that A(z)M = M for
z ∈ K. Then for every ε > 0 there exists a matrix function B(z) that is analytic
and invertible on Ω and such that

for all z ∈ K and B(z)M = M for all z ∈ Ω.

Proof. Without loss of generality, we can assume that
M = Span{e₁, …, e_r} for some r. Then, in the 2 × 2 block matrix representation
with respect to the direct sum decomposition M ∔ M^⊥ = ℂⁿ,
we have

Because A(z) is invertible when z ∈ K, so are A₁(z) and A₂(z). Use Lemma
18.5.2 to find matrix functions B₁(z) and B₂(z) that are analytic and
invertible on Ω and such that ‖B_i(z) − A_i(z)‖ < ε/3 for z ∈ K, i = 1, 2. By
Lemma 18.5.1 there exists an analytic matrix function B₁₂(z) on Ω such that
‖B₁₂(z) − A₁₂(z)‖ < ε/3 for z ∈ K. Then

satisfies the requirements of Lemma 18.5.3. □

The following result allows us to pass from the compact sets in Ω to Ω
itself.

Lemma 18.5.4
Let K₁ ⊂ K₂ ⊂ ⋯ ⊂ Ω be a sequence of finitely connected compact sets K_m
that are also simply connected relative to Ω. For m = 1, 2, …, let G_m(z) be
an n × n matrix function that is analytic and invertible on K_m and satisfies
G_m(z)M = M for z ∈ K_m and for some fixed subspace M ⊂ ℂⁿ. Then for
m = 1, 2, …, there exists an n × n matrix function D_m(z) that is analytic and
invertible on K_m and such that, whenever z ∈ K_m

Proof. We need the following simple assertion. Let X₁, X₂, … be a
sequence of n × n matrices such that

Then the infinite product Y = ∏_{m=1}^{∞} (I + X_m) converges and ‖I − Y‖ ≤ a eᵃ.
Indeed, for the matrices Y_m = ∏_{j=1}^{m} (I + X_j) we have the estimates:

Thus, in view of (18.5.5), the infinite product Y = ∏_{m=1}^{∞} (I + X_m) converges.
Moreover
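Condition (18.5.5) and the displayed estimates are missing. Assuming, as the bound a eᵃ suggests, that (18.5.5) reads Σ_{m=1}^{∞} ‖X_m‖ ≤ a, the standard estimates (with Y₀ = I) are presumably:

\[
% assumption: (18.5.5) is \sum_{m\ge 1} \|X_m\| \le a
\|Y_m - Y_{m-1}\| = \|Y_{m-1} X_m\|
\le \|X_m\| \prod_{j=1}^{m-1} \bigl(1 + \|X_j\|\bigr) \le \|X_m\|\, e^{a},
\qquad
\|I - Y\| \le \sum_{m=1}^{\infty} \|Y_m - Y_{m-1}\| \le a\, e^{a}.
\]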

We now prove Lemma 18.5.4 itself. Applying Lemma 18.5.3 repeatedly,
we find for m = 1, 2, … a matrix function H_m(z) that is analytic and
invertible on K_m, for which H₁(z) = I, and for z ∈ K_m, H_m(z)M = M and

The assertion proved in the preceding paragraph ensures that for every
m = 1, 2, …, the infinite product

converges uniformly on K_m, and ‖I − E_m(z)‖ ≤ 2^{−m} exp(2^{−m}) < 1 for
z ∈ K_m. Consequently, E_m(z) is invertible for every z ∈ K_m. Further,
E_m(z)M = M (z ∈ K_m; m = 1, 2, …). Indeed, since E_m(z) is invertible, it is
sufficient to prove that E_m(z)M ⊂ M. But this follows from the equalities
H_m(z)M = M, G_m(z)M = M and the definition of E_m. Now we can put
D_m(z) = H_m(z)E_m(z), because E_m = H_m⁻¹ G_m H_{m+1} E_{m+1} and consequently
G_m = (H_m E_m)(H_{m+1} E_{m+1})⁻¹. □

We are now prepared to prove Theorem 18.3.1.

Proof of Theorem 18.3.1. Let us show first that there exists a sequence
of compact sets K₁ ⊂ K₂ ⊂ ⋯ that are finitely connected, simply connected
relative to Ω, and for which ⋃_{m=1}^{∞} K_m = Ω. To this end choose a sequence of
closed discs S_m ⊂ Ω, m = 1, 2, …, such that ⋃_{m=1}^{∞} S_m = Ω. It is sufficient to
construct K_m in such a way that K_m ⊃ S_m, m = 1, 2, …. Put K₁ = S₁,

suppose that K₁, …, K_m are already constructed, with K_j ⊃ S_j for
j = 1, …, m. Let M be a connected compact set such that M ⊃ K_m ∪ S_{m+1}, and
let V₁, …, V_k ⊂ Ω be a finite set of closed discs from {S_m}_{m=1}^{∞} such that
N = ⋃_{j=1}^{k} V_j ⊃ M. Clearly, N is a finitely connected compact set. If N is also
simply connected relative to Ω, then put K_{m+1} = N. Otherwise, put
K_{m+1} = N ∪ Y₁ ∪ ⋯ ∪ Y_s, where Y₁, …, Y_s are all the bounded connected
components of the set ℂ ∖ N that lie entirely in Ω.
Given the sequence K₁ ⊂ K₂ ⊂ ⋯ constructed in the preceding para-
graph, choose z₀ ∈ K₁ and put M₀ = M(z₀) [here M(z) is the analytic family
of subspaces (of ℂⁿ) on Ω given in Theorem 18.3.1]. Without loss of
generality we can assume that M₀ = Span{e₁, …, e_r}. By Theorem 18.4.1,
there exist analytic vector functions f₁^{(m)}(z), …, f_r^{(m)}(z) on K_m that form a
basis in M(z) for every z ∈ K_m. Using Lemma 18.2.2, we find analytic vector
functions f_{r+1}^{(m)}(z), …, f_n^{(m)}(z) defined on K_m such that the vectors
f₁^{(m)}(z), …, f_n^{(m)}(z) form a basis in ℂⁿ for every z ∈ K_m [indeed, apply
Lemma 18.2.2 with x₁(z) = f₁^{(m)}(z), …, x_r(z) = f_r^{(m)}(z), x_{r+1}(z) =
g₁, …, x_n(z) = g_{n−r}, where g₁, …, g_{n−r} is a basis in a fixed direct com-
plement to M₀]. Then the matrix function
A_m(z) = [f₁^{(m)}(z), f₂^{(m)}(z), …, f_n^{(m)}(z)] is analytic and invertible on K_m and satisfies

where z ∈ K_m. Put G_m(z) = A_m⁻¹(z)A_{m+1}(z) for z ∈ K_m. Then (18.5.6)
ensures that G_m(z)M₀ = M₀ (z ∈ K_m, m = 1, 2, …). By Lemma 18.5.4 (for
m = 1, 2, …) there exists an analytic and invertible matrix function D_m(z)
on K_m such that G_m = D_m D_{m+1}⁻¹ and, for z ∈ K_m

Since A_{m+1}(z)D_{m+1}(z) = A_m(z)D_m(z) (z ∈ K_m; m = 1, 2, …), the relation
A(z) = A_m(z)D_m(z), which holds for all z ∈ K_m, defines an analytic and
invertible matrix function A(z) on Ω. Now the relation A(z)M₀ = M(z) for
z ∈ Ω follows from (18.5.6) and (18.5.7). □

18.6 DIRECT COMPLEMENTS FOR ANALYTIC FAMILIES
OF SUBSPACES

Let M(z) be an analytic family of subspaces of ℂⁿ defined on a domain Ω. If
N is a direct complement to M(z₀) and z₀ ∈ Ω, then the results of Chapter
13 (Theorem 13.1.3) imply that N is also a direct complement to M(z) as
long as z is sufficiently close to z₀. This local property of direct complements
raises the corresponding global question: does there exist a subspace N of
ℂⁿ that is a direct complement to M(z) for all z ∈ Ω? The simple example
below shows that the answer is generally no.

EXAMPLE 18.6.1. Let

As the polynomials z² − z + 1 and z² + z do not have common zeros, it
follows that M(z) is an analytic family of subspaces. Indeed, if z₀ is such
that z₀² + z₀ ≠ 0, then in a neighbourhood of z₀ we have

and if z₀ is such that z₀² − z₀ + 1 ≠ 0, then there is a neighbourhood of z₀ in
which

However, there is no one-dimensional subspace Span{(a, b)} (with at least one
of the complex numbers a, b nonzero) such that

for all z ∈ ℂ. Indeed, (18.6.1) means

which is impossible.
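The displays of this example are missing. A family consistent with every statement made about it (an assumption, not the original text) is

\[
% assumption: this specific spanning vector is a reconstruction
M(z) = \operatorname{Span}\left\{ \begin{pmatrix} z^2 - z + 1 \\ z^2 + z \end{pmatrix} \right\},
\qquad z \in \mathbb{C};
\]

then (18.6.1) would require the polynomial b(z² − z + 1) − a(z² + z) to have no zeros in ℂ, which forces it to be a nonzero constant; comparing coefficients gives a = b = 0, a contradiction.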

It turns out that although one common direct complement for an analytic
family of subspaces may not exist, only two subspaces are needed to serve as
"alternate" direct complements for each member of the analytic family.

Theorem 18.6.1
For an analytic family of subspaces {M(z)}_{z∈Ω} of ℂⁿ there exist two
subspaces N₁, N₂ ⊂ ℂⁿ such that for each z ∈ Ω either M(z) ∔ N₁ = ℂⁿ or
M(z) ∔ N₂ = ℂⁿ holds.

Proof. To prove this we first need the following observation: for any
k-dimensional subspace ℒ ⊂ ℂⁿ, the set DC(ℒ) of all direct complements to
ℒ in ℂⁿ is open and dense in the set of all (n − k)-dimensional subspaces.
Indeed, the openness of DC(ℒ) follows immediately from Theorem 13.1.3.
To prove denseness, let N be an (n − k)-dimensional subspace in ℂⁿ with
basis f₁, …, f_{n−k}, and let N₀ be a direct complement to ℒ with basis
g₁, …, g_{n−k}. For a complex number ε put
N(ε) = Span{f₁ + εg₁, …, f_{n−k} + εg_{n−k}}. Clearly, the vectors f_i + εg_i,
i = 1, …, n − k, are linearly independent for ε close enough to 0, so
dim N(ε) = n − k. Moreover, Theorem 13.4.2 shows that

It remains to show that N(ε) belongs to DC(ℒ). To this end pick a basis
h₁, …, h_k in ℒ, and consider the n × n matrix

As

(recall that N₀ ∔ ℒ = ℂⁿ); also

it follows that det G(ε) ≢ 0, and since det G(ε) is a polynomial in ε it follows
that det G(ε) ≠ 0 for ε ≠ 0 and sufficiently close to zero. Obviously,
N(ε) ∈ DC(ℒ) for such ε.
Now we start to prove Theorem 18.6.1 itself. Fix z₀ ∈ Ω, and let N₁ be a
direct complement to M(z₀) in ℂⁿ. By Theorem 18.3.2 it is possible to pick
vector functions x₁(z), …, x_p(z), analytic on Ω, such that,
for every z ∈ Ω, the vectors x₁(z), …, x_p(z) form a basis in M(z). Letting
f₁, …, f_{n−p} be a basis in N₁, consider the n × n matrix function

which is analytic on Ω. As det G(z₀) ≠ 0, the determinant of G(z) is not
identically zero, and thus the number of distinct zeros of det G(z) is at most
countable. Let z₁, z₂, … ∈ Ω be all of these zeros. Then N₁ is a direct
complement to M(z) for z ∉ {z₁, z₂, …}. On the other hand, we have seen
that, for i = 1, 2, …, the sets DC(M(z_i)) are open and dense in the set of
all (n − p)-dimensional subspaces in ℂⁿ. As the latter set is a complete
metric space in the gap topology (Section 13.4), it follows that the inter-
section ⋂_{i=1}^{∞} DC(M(z_i)) is again dense [the Baire category theorem;
e.g., see Kelley (1955)]. In particular, this intersection is not empty, so
there exists a subspace N₂ ⊂ ℂⁿ that is simultaneously a direct complement
to all of M(z₁), M(z₂), …. □

The following result shows that for analytic families of subspaces that
appear as the kernel or the image of a linear matrix function there exists a
common direct complement. As Example 18.6.1 shows, the result is not
necessarily valid for nonlinear matrix functions.

Theorem 18.6.2
Let T₁ and T₂ be m × n matrices such that the dimension of Ker(T₁ + zT₂) is
constant, that is, independent of z on ℂ [and the same is automatically
true for dim Im(T₁ + zT₂)]. Then there exist subspaces N₁ ⊂ ℂⁿ, N₂ ⊂ ℂᵐ
such that

for all z ∈ ℂ.

Note that in view of Proposition 18.1.1 and Theorem 18.2.1 the families
of subspaces Ker(T₁ + zT₂) and Im(T₁ + zT₂) are analytic on ℂ.

Proof. For the proof of Theorem 18.6.2 we use the Kronecker canonical
form for linear matrix polynomials under strict equivalence (which is
developed in the appendix to this book).
As dim Ker(T₁ + zT₂) is independent of z ∈ ℂ, the canonical form of
T₁ + zT₂ does not have the term zI + J. So, in the notation of Theorem
A.7.3, there exist invertible matrices Q₁ and Q₂ such that

It is easily seen that

for all z ∈ ℂ, and that

So there exists a direct complement M₁ to Ker[Q₁(T₁ + zT₂)Q₂] for all
z ∈ ℂ, given as follows:

As

it follows that

The part of Theorem 18.6.2 concerning Im(T₁ + zT₂) is proved similarly,
taking into account the facts that

and that, for each z ∈ ℂ, Im L^T has a direct complement Span{e₁}. □

18.7 ANALYTIC FAMILIES OF INVARIANT SUBSPACES

Let A(z): ℂⁿ → ℂⁿ be an analytic family of transformations on Ω. Our next
topic concerns the analytic properties (as functions of z) of certain invariant
subspaces of A(z).
We have already seen some first results in this direction in Section 18.1.
Namely, if the rank of A(z) is independent of z, then Im A(z) and Ker A(z)
are analytic families of subspaces. In the general case, Im A(z) and
Ker A(z) become analytic families of subspaces if corrected on the singular
set of A(z). The next theorem is mainly a reformulation of this statement.
For convenience, let us introduce another definition: an analytic family of
subspaces {M(z)}_{z∈Ω} is called A(z) invariant on Ω if the subspace M(z) is
A(z) invariant for every z ∈ Ω.

Theorem 18.7.1
There exist A(z)-invariant analytic families {M(z)}_{z∈Ω} and {N(z)}_{z∈Ω} such
that M(z) = Im A(z) and N(z) = Ker A(z) for every z not belonging to the
singular set of A(z).

Proof. In view of Theorem 18.2.1 we have only to prove that M(z₀) and
N(z₀) are A(z₀) invariant for every z₀ ∈ S(A). But this follows from
Theorem 15.1.1 because lim_{z→z₀} A(z) = A(z₀) and

Another class of A(z)-invariant subspaces whose behaviour is analytic (at
least locally) includes spectral subspaces, as follows.

Theorem 18.7.2
Let Γ be a contour in the complex plane such that Γ ∩ σ(A(z₀)) = ∅ for a
fixed z₀ ∈ Ω. Then the sum M_Γ(z) of the root subspaces of A(z) correspond-

ing to the eigenvalues inside Γ is an A(z)-invariant analytic family of
subspaces in a neighbourhood U of z₀.

Proof. As A(z) is a continuous function of z on Ω, the eigenvalues of
A(z) also depend continuously on z. Hence there is a neighbourhood U of
z₀ such that A(z) has no eigenvalues on Γ for any z in the closure of U. Now
for z ∈ U we have

We have seen in Section 2.4 that

is a projector for every z ∈ U. So, to prove that M_Γ(z) is an analytic family
in U, it is sufficient to check that P(z) is an analytic function on U. Indeed,
|det(λI − A(z))| > δ > 0 for every λ ∈ Γ and z ∈ U, where δ is independent
of λ and z. Hence ‖(λI − A(z))⁻¹‖ is bounded for λ ∈ Γ and z ∈ U, and
consequently the Riemann sums

where λ₀, …, λ_m are consecutive points in the positive direction on Γ with
λ_m = λ₀, converge to the integral (18.7.2) uniformly on every compact set in
U. As each Riemann sum is obviously analytic on U, so is the integral
(18.7.2). □
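The integral (18.7.2) is the standard Riesz projector; the missing displays presumably read

\[
M_\Gamma(z) = \operatorname{Im} P(z), \qquad
P(z) = \frac{1}{2\pi i} \oint_{\Gamma} \bigl(\lambda I - A(z)\bigr)^{-1}\, d\lambda,
\tag{18.7.2}
\]

with Riemann sums of the form (2πi)⁻¹ Σ_{k=1}^{m} (λ_k − λ_{k−1})(λ_{k−1}I − A(z))⁻¹.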

In view of Theorems 18.7.1 and 18.7.2, the following question arises
naturally: does there exist an A(z)-invariant analytic family that is nontrivial
(i.e., different from {0} and ℂⁿ)? Without restrictions on A(z) the answer is
no, as the following example shows.

EXAMPLE 18.7.1. Define an analytic family on ℂ by

Here the A(z)-invariant subspaces (for a fixed z) are easy to find: the only
nontrivial invariant subspace of A(0) is Span{e₁}, and, when z ≠ 0, the only
nontrivial invariant subspaces of A(z) are

where u₁ and u₂ are the square roots of z. It is easily seen that there is no
nontrivial, A(z)-invariant, analytic family of subspaces on ℂ. □
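The matrix of this example is missing. A standard family with exactly the stated invariant subspaces (an assumption consistent with the text) is

\[
% assumption: this matrix is a reconstruction
A(z) = \begin{pmatrix} 0 & 1 \\ z & 0 \end{pmatrix},
\qquad
\operatorname{Span}\left\{ \begin{pmatrix} 1 \\ u_1 \end{pmatrix} \right\},\quad
\operatorname{Span}\left\{ \begin{pmatrix} 1 \\ u_2 \end{pmatrix} \right\},
\qquad u_1^2 = u_2^2 = z;
\]

indeed A(z)(1, u)ᵀ = (u, z)ᵀ = u·(1, u)ᵀ whenever u² = z, while A(0) is a nilpotent Jordan block whose single nontrivial invariant subspace is Span{e₁}.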

In the next section we study A(z)-invariant analytic families of subspaces
under the extra condition that A(z) have the same Jordan structure for all
z ∈ Ω. We see that, in this case, nontrivial A(z)-invariant analytic families of
subspaces always exist. On the other hand, we have seen in Example 18.7.1
that there exists a nontrivial A(z)-invariant family of subspaces that is
analytic in ℂ except for the branch point at zero. Such phenomena occur
more generally and are studied in detail in Chapter 19.

18.8 ANALYTIC DEPENDENCE OF THE SET OF INVARIANT
SUBSPACES AND FIXED JORDAN STRUCTURE

Given a family of transformations A(z): ℂⁿ → ℂⁿ that depends analytically
on the parameter z in a domain Ω ⊂ ℂ, we say that the lattice Inv(A(z))
depends analytically on z ∈ Ω if there exists an invertible transformation
S(z): ℂⁿ → ℂⁿ that is analytic on Ω and such that
Inv(A(z)) = S(z)(Inv(A(z₀))) for all z ∈ Ω and some fixed point z₀ ∈ Ω. This definition
does not depend on the choice of z₀. Indeed, if

then for every z₀′ ∈ Ω we have

Also, replacing S(z) by S(z)S(z₀)⁻¹, we can require in the definition of
analytic dependence of Inv(A(z)) that S(z₀) = I.
Since Inv(A), Inv(B) are linearly isomorphic if and only if A and B have
the same Jordan structure (Theorem 16.1.2), a necessary condition for
analytic dependence of Inv(A(z)) on z is that A(z) have fixed Jordan
structure; that is, the number m of different eigenvalues of A(z) is indepen-
dent of z on Ω, and for every pair z₁, z₂ ∈ Ω the different eigenvalues
λ₁(z₁), …, λ_m(z₁) and λ₁(z₂), …, λ_m(z₂) of A(z₁) and A(z₂), respective-
ly, can be enumerated so that the partial multiplicities of λ_j(z₁) [as an
eigenvalue of A(z₁)] coincide with the partial multiplicities of λ_j(z₂) [as an
eigenvalue of A(z₂)], for j = 1, …, m.
Using Theorem 16.1.2, we find that the family A(z) has fixed Jordan
structure if and only if, for every z₁, z₂ ∈ Ω, the lattices Inv(A(z₁)) and
Inv(A(z₂)) are isomorphic. Clearly, this property is necessary for the lattice
Inv(A(z)) to depend analytically on z ∈ Ω. The following result shows that
this property is also sufficient as long as Ω is simply connected.

Theorem 18.8.1
Let Ω be a simply connected domain in ℂ, and let A(z): ℂⁿ → ℂⁿ be an
analytic family of transformations on Ω. Then Inv(A(z)) depends analytically
on z ∈ Ω if and only if A(z) has fixed Jordan structure.

In particular, the condition of a fixed Jordan structure ensures the existence
of at least as many A(z)-invariant analytic families of subspaces as there are
A(z₀)-invariant subspaces.

Proof. We assume that A(z) is represented as a matrix-valued function
with respect to some basis in ℂⁿ that is independent of z on Ω. Fix a z₀ in Ω.
Let λ₁, …, λ_p be all the distinct eigenvalues of A(z₀), and let Γ_i be a circle
around λ_i chosen so small that Γ_i ∩ Γ_j = ∅ for i ≠ j. As the proof of Theorem
16.3.1 shows, there exists an ε > 0 with the property that if B: ℂⁿ → ℂⁿ is a
transformation with the same Jordan structure as A(z₀), and if
‖B − A(z₀)‖ < ε, then there is a unique eigenvalue μ_i(B) of B in each circle Γ_i
(1 ≤ i ≤ p), and, moreover, the partial multiplicities of μ_i(B) (as an eigen-
value of B) coincide with the partial multiplicities of λ_i (as an eigenvalue of
A(z₀)). Hence, for every z from some neighbourhood U₁ of z₀, there is a
unique eigenvalue [denoted by μ_i(z)] of A(z) in the circle Γ_i (1 ≤ i ≤ p), and
the partial multiplicities of μ_i(z) coincide with those of λ_i. Obviously,
μ_i(z₀) = λ_i.
Let us prove that μ_i(z) is analytic on U₁. Indeed, denoting by m_i the
algebraic multiplicity of λ_i [as an eigenvalue of A(z₀)], we have

which is an analytic function of z on U₁.
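A standard expression behind this step (a reconstruction, not the original display) is

\[
\mu_i(z) = \frac{1}{m_i}\cdot\frac{1}{2\pi i}\,\operatorname{tr}
\oint_{\Gamma_i} \lambda\,\bigl(\lambda I - A(z)\bigr)^{-1}\, d\lambda,
\]

since the contour integral equals the sum of the eigenvalues of A(z) inside Γ_i, counted with algebraic multiplicities, and this sum is m_i μ_i(z) here.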


We have proved that in a neighbourhood of each point z0 E ft the distinct
eigenvalues of A(z) are analytic functions of z. It follows that the eigen-
values of A(z) admit analytic continuation along any curve in ft. By the
monodromy theorem [see, e.g. Rudin (1974); this is where the simple
connectedness of fl is used] the distinct eigenvalues /u-i(z),. . . , ^p(z) of
A(z) are analytic functions on ft.
Now fix z₀ ∈ Ω and define the family of transformations B(z): ℂⁿ → ℂⁿ,
z ∈ Ω, by the requirement that B(z)x = [μ_j(z₀) − μ_j(z)]x for any x belonging
to the root subspace of A(z) corresponding to the eigenvalue μ_j(z). It is
easily seen that B(z) is analytic on Ω. Indeed, for every z₁ ∈ Ω let
Γ₁′, …, Γ_p′ be circles around μ₁(z₁), …, μ_p(z₁), respectively, so small that
μ_j(z₁) is the only eigenvalue of A(z₁) inside or on the circle Γ_j′ for
j = 1, …, p. There is a neighbourhood V of z₁ such that any A(z) with
z ∈ V has the unique eigenvalue μ_j(z) inside the circle Γ_j′, j = 1, …, p.
Then

which is analytic on V in view of the analyticity of A(z) and μ_j(z) for
j = 1, …, p. Put Ã(z) = A(z) + B(z). Obviously, the set of Ã(z)-invariant

subspaces coincides with the set of A(z)-invariant subspaces for all z ∈ Ω, so
it is sufficient to prove Theorem 18.8.1 for Ã(z) instead of A(z). From the
definition of Ã(z) it is clear that the eigenvalues of Ã(z) are
μ₁(z₀), …, μ_p(z₀), that is, they do not depend on z, and, moreover, the
partial multiplicities of μ_j(z₀) as eigenvalues of Ã(z) do not depend on z,
either. In other words, in Theorem 18.8.1 we may assume that A(z) is
similar to A(z₀) for all z ∈ Ω.
For j = 1, …, p, let m_j be the maximal partial multiplicity of μ_j(z₀) as an
eigenvalue of A(z₀) [and hence as an eigenvalue of A(z) for all z in Ω].
Note that since A(z) is similar to A(z₀) for all z ∈ Ω, by Proposition 18.1.2
there is an analytic basis in Ker(A(z) − μ_j(z₀)I)^m for m = 0, 1, 2, … (i.e.,
for each fixed j and m). By Theorem 18.3.3 there exists a basis
x₁^{(j)}(z), …, x_{k_j}^{(j)}(z) in Ker(A(z) − μ_j(z₀)I)^{m_j} modulo
Ker(A(z) − μ_j(z₀)I)^{m_j−1} that is analytic on Ω. It is easily seen that the vectors

are linearly independent for all z ∈ Ω and belong to
Ker(A(z) − μ_j(z₀)I)^{m_j−1}. Hence by Theorem 18.3.3 again there is a basis
x̃₁^{(j)}(z), …, x̃_{l_j}^{(j)}(z) in Ker(A(z) − μ_j(z₀)I)^{m_j−1} modulo

which is analytic on Ω. Next we find an analytic basis

modulo

and so on. Now define the n × n matrix T(z) formed by the columns

where j = 1, …, p. As the proof of the Jordan form of a matrix shows (see
Section 2.3), the columns of T(z) form a Jordan basis of A(z). In particular,

T(z) is invertible for all z ∈ Ω. Clearly, T(z) is analytic on Ω. As
T(z)⁻¹A(z)T(z) is a constant matrix (i.e., independent of z) and is in
Jordan form, the assertion of Theorem 18.8.1 follows. □

In the course of the proof of Theorem 18.8.1 we have also proved the
following result on analytic families of similar transformations.

Corollary 18.8.2
Let A(z): ℂⁿ → ℂⁿ be an analytic family of transformations on Ω, where Ω is
a simply connected domain. Assume that, for a fixed point z₀ ∈ Ω, A(z) is
similar to A(z₀) for all z ∈ Ω. Then there exists an invertible transformation
T(z): ℂⁿ → ℂⁿ that is analytic on Ω and such that T(z₀) = I and
T(z)⁻¹A(z)T(z) = A(z₀) for all z ∈ Ω.

The assumption that Ω is simply connected in Theorem 18.8.1 is neces-
sary, as the next example shows.

EXAMPLE 18.8.1. Let Ω = ℂ ∖ {0}, and let

Clearly, A(z) has fixed Jordan structure on Ω (the eigenvalues being the two
square roots of z). The nontrivial A(z)-invariant subspaces are

Clearly, there is no (single-valued) invertible 2 × 2 matrix function S(z) that
is analytic on ℂ ∖ {0} and satisfies the conditions of Theorem 18.8.1. □
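The displays are missing here as well; the family of Example 18.7.1 fits every statement made, so presumably (as an assumption)

\[
% assumption: same matrix as in Example 18.7.1
A(z) = \begin{pmatrix} 0 & 1 \\ z & 0 \end{pmatrix}, \qquad
\Omega = \mathbb{C}\setminus\{0\},
\qquad
\operatorname{Span}\left\{ \begin{pmatrix} 1 \\ \pm\sqrt{z} \end{pmatrix} \right\};
\]

these subspaces are single-valued only after a branch of √z is chosen, and no such branch exists on all of ℂ ∖ {0}.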

Note that in the proof of Theorem 18.8.1 the existence of an analytic
Jordan basis [formula (18.8.1)] of A(z) also follows from a general result on
analytic perturbations of matrices (see Section 19.2).

18.9 ANALYTIC DEPENDENCE ON A REAL VARIABLE

The results presented in Sections 18.1–18.8 include the case when the
families of transformations ℂⁿ → ℂᵐ and subspaces of ℂⁿ are analytic in a
real variable on an open interval (a, b) of the real axis. The definition of
analyticity is analogous to that in the complex case: representation as a
power series (this time with real coefficients) in a real neighbourhood of each
point t₀ ∈ (a, b). As the radius of convergence of this power series is
positive, it converges also in some complex neighbourhood of t₀. Con-
sequently, a family of transformations from ℂⁿ into ℂᵐ (or of subspaces of
ℂⁿ) that is analytic on (a, b) can be extended to a family of linear
transformations (or subspaces) that is analytic in some complex neighbour-
hood Ω of (a, b), and the results presented in Sections 18.1–18.8 do apply.
It is noteworthy that, in contrast to the complex variable case, the
orthogonal complement preserves analyticity, as follows.

Theorem 18.9.1
Let M(t) be a family of subspaces (of ℂⁿ) that is analytic in the real variable t
on (a, b). Then the orthogonal complement M(t)^⊥ is an analytic family of
subspaces on (a, b) as well.

Proof. Let t₀ ∈ (a, b). Then in some real neighbourhood U₁ of t₀ there
exists an analytic family of invertible transformations A(t): ℂⁿ → ℂⁿ such
that M(t) = A(t)M, t ∈ U₁, for a fixed subspace M ⊂ ℂⁿ. Assume (without
loss of generality) that M = Span{e₁, …, e_p} for some p, and write A(t) as
an n × n matrix, with entries that are analytic on (a, b), with respect to the
standard basis in ℂⁿ. Then M(t) = Im B(t) for t ∈ U₁, where B(t) is formed
by the first p columns of A(t). As A(t) is invertible, the columns of B(t) are
linearly independent. For notational simplicity, assume that the top p rows
of B(t₀) are linearly independent and hence form a nonsingular p × p
matrix. Then there is a real neighbourhood U₂ ⊂ U₁ of t₀ such that the top p
rows of B(t) form a nonsingular p × p matrix C(t) as well. So for t ∈ U₂, we
obtain

where D(t) is the (n − p) × p matrix formed by the bottom n − p rows of
B(t). Denoting X(t) = D(t)C(t)⁻¹, consider the p × p matrix function
S(t) = (I + X(t)*X(t))⁻¹ for t ∈ U₂. Note that I + X(t)*X(t) is positive definite and
thus invertible. Clearly, S(t) is positive definite and analytic on U₂. Let Γ be
a contour that lies in the open right half plane, is symmetric with respect
to the real axis, and contains all the eigenvalues of S(t₀) in its interior. Then
all eigenvalues of S(t), where t is taken from some neighbourhood U₃ ⊂ U₂
of t₀, will also be in the interior of Γ. For such a t the integral

where λ^{1/2} is the analytic branch of the square root that takes positive values
for positive λ, is well defined and Z(t)² = S(t) (see Section 2.10). Moreover,
because of the symmetry of Γ, the matrix Z(t) is positive definite for all
t ∈ U₃. Also, Z(t) is an analytic family of matrices on U₃. Now one sees
easily that, for t ∈ U₃,

is the orthogonal projector on M(t). Indeed, a straightforward computation
verifies that P(t)² = P(t) = P(t)*. So P(t) is an orthogonal projector.
Furthermore, it is clear that

and since rank P(t) is easily seen to be p, equality (rather than inclusion)
holds in (18.9.1). Consequently, M(t)^⊥ is the image of the analytic family of
projectors I − P(t), and thus M(t)^⊥ is analytic on U₃. As t₀ ∈ (a, b) was
arbitrary, the analyticity of M(t)^⊥ on (a, b) follows. □
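The missing display for P(t) can be reconstructed from the ingredients of the proof. Writing M(t) as the range of the block column (I; X(t)) (an assumed but natural normalization), a formula satisfying P(t)² = P(t) = P(t)* and Im P(t) = M(t) is

\[
% assumption: block-column normalization of M(t)
P(t) = \begin{pmatrix} I \\ X(t) \end{pmatrix} S(t)
\begin{pmatrix} I & X(t)^{*} \end{pmatrix}
= \begin{pmatrix} I \\ X(t) \end{pmatrix}
\bigl(I + X(t)^{*}X(t)\bigr)^{-1}
\begin{pmatrix} I & X(t)^{*} \end{pmatrix},
\]

and the inclusion (18.9.1) is then presumably Im P(t) ⊆ M(t). Idempotency follows from (I  X*)(I; X) = I + X*X, and self-adjointness from S(t)* = S(t).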

One can also consider families of real transformations from ℝⁿ into ℝᵐ,
as well as families of subspaces in the real vector space ℝⁿ, which are
analytic in a real variable t on (a, b). For such families of real linear
transformations and subspaces the results of Sections 18.1–18.8 hold as well.
However, in Theorem 18.7.2 the contour Γ should be symmetric with
respect to the real axis; and in the definition of fixed Jordan structure one
has to require, in addition, that the enumeration λ₁(z₁), …, λ_m(z₁) and
λ₁(z₂), …, λ_m(z₂) of distinct eigenvalues of A(z₁) and A(z₂), respective-
ly, is such that λ_i(z₁) = λ̄_j(z₁) holds if and only if λ_i(z₂) = λ̄_j(z₂).

EXERCISES

18.1 Let

be an analytic family of transformations written as a matrix in the
standard orthonormal bases in ℂ² and ℂ³.
(a) Are Im A(z) and Ker A(z) analytic families of subspaces?
(b) Find an analytic vector function y(z) such that y(z) ≠ 0 for all
z ∈ ℂ and Span{y(z)} = Ker A(z) for all z ∈ ℂ with the excep-
tion of a discrete set.
(c) Find linearly independent and analytic (in ℂ) vector functions
y₁(z), y₂(z) such that Span{y₁(z), y₂(z)} = Im A(z) for all z ∈
ℂ with the exception of a discrete set. [Hint: Use the Smith
form for the matrix polynomial A(z).]

18.2 Solve Exercise 18.1 for

18.3 Let P(z) be an analytic family of projectors. Show that Im P(z) is an
analytic family of subspaces.
18.4 Let

where, for j = 1, …, k, A_j(z) is an analytic family of transformations
on a domain Ω. Prove that the following statements are equivalent:
(a) Im A(z) and Ker A(z) are analytic families of subspaces.
(b) Im A_j(z) is an analytic family of subspaces, for j = 1, …, k.
(c) Ker A_j(z) is an analytic family of subspaces, for j = 1, …, k.
18.5 Let A(z): ℂⁿ → ℂⁿ be an analytic family of transformations on Ω
such that A(z)² = I for all z ∈ Ω. Prove that the families of subspaces
Im(A(z) − I) and Im(A(z) + I) are analytic on Ω.
18.6 Let A(z) be an analytic family of transformations on Ω such that
p(A(z)) = 0 for all z ∈ Ω, where p(λ) is a scalar polynomial of
degree m with distinct zeros λ₁, …, λ_m. Prove that the families of
subspaces Ker(λ_jI − A(z)), j = 1, …, m, are analytic on Ω.
18.7 Does the result of Exercise 18.6 hold if p(λ) has fewer than m distinct
zeros?
18.8 Given matrices A and B of sizes n × n and n × m, respectively, show
that Ker[λI + A, B] is an analytic family of subspaces if and only if
(A, B) is a full-range pair.
18.9 Given matrices C and A of sizes p × n and n × n, respectively, show
that Im […] is an analytic family of subspaces if
and only if (C, A) is a null kernel pair.
18.10 Given an analytic n × n matrix function A(z) on Ω that is upper
triangular for all z ∈ Ω, when is Ker A(z) analytic on Ω?
18.11 For the following analytic vector functions x₁(z), x₂(z), where z ∈ ℂ,
find analytic vector functions y₁(z), y₂(z) of z ∈ ℂ such that y₁(z)
and y₂(z) are linearly independent for every z ∈ ℂ and

Span{x₁(z), x₂(z)} = Span{y₁(z), y₂(z)}

for every z ∈ ℂ except for a discrete set:
(a) x₁(z) = ⟨z², 1 − z, 0⟩, x₂(z) = ⟨z³, 1 − z², z² − z⟩
(b) x₁(z) = ⟨1, −z, z⟩, x₂(z) = ⟨1, z², z² + z⟩
[Hint: Use the Smith form for the matrix polynomial [x₁(z), x₂(z)].]

18.12 Let x₁(z), …, x_k(z) be n-dimensional vector polynomials such that,
for at least one value z₀ ∈ ℂ, the vectors x₁(z₀), …, x_k(z₀) are
linearly independent. Prove that one can construct n-dimensional
vector polynomials y₁(z), …, y_k(z) such that y₁(z), …, y_k(z) are
linearly independent for all z ∈ ℂ and

for all z ∈ ℂ with the possible exception of a finite set, as follows.
Let

be the Smith form of the n × k matrix [x₁(z), …, x_k(z)]; then put
18.13 Complete the following linearly independent analytic families of
vectors in ℂ⁴ (depending on the complex variable z ∈ ℂ) to analytic
families of vectors that form a basis in ℂ⁴ for every z ∈ ℂ:

18.14 For the following analytic families M(z) of subspaces in ℂⁿ that
depend on z ∈ ℂ, find two subspaces N₁ and N₂ such that for every
z ∈ ℂ at least one of

holds:

18.15 For each n ≥ 2 give an example of an analytic family of transfor-
mations A(z): ℂⁿ → ℂⁿ defined on Ω that has no nontrivial A(z)-
invariant analytic family of subspaces on Ω.
18.16 Let A(z) be an analytic family of transformations defined on Ω such
that p(A(z)) = 0 for all z ∈ Ω, where p(λ) is a scalar polynomial of
degree m with m distinct zeros. Prove that there are at least 2^m
A(z)-invariant analytic families of subspaces on Ω.
Chapter Nineteen

Jordan Form of
Analytic Matrix Functions

In this chapter we study the behaviour of eigenvalues and eigenvectors of a
transformation that depends analytically on a parameter, in both the local
and global frameworks. It turns out that this behaviour is analytic except for
isolated singularities that are described in detail. The results obtained allow
us to solve (at least partially) the problem of analytic extendability of an
invariant subspace. In turn, the solution of this problem is used in Chapter
20 for the solution of various problems concerning divisors of monic matrix
polynomials, minimal factorization of rational matrix functions, and solu-
tions of matrix quadratic equations, all of which involve analytic dependence
on a parameter. Clearly, the material of this chapter relies on more
advanced complex analysis than does that of the preceding chapters. How-
ever, this is not a prerequisite for understanding the main results.

19.1 LOCAL BEHAVIOUR OF EIGENVALUES AND EIGENVECTORS

Let A(z): ℂⁿ → ℂⁿ be a family of transformations that is analytic on a
domain Ω. In this section we study the behaviour of eigenvalues and
eigenvectors as functions of z in a neighbourhood of a fixed point z₀ ∈ Ω.
First let us state the main result in this direction.

Theorem 19.1.1
Let μ₁, …, μ_k be all the distinct eigenvalues of A(z₀), that is, the distinct
zeros of the equation det(μI − A(z₀)) = 0, where k ≤ n, and let r_i (i =
1, …, k) be the multiplicity of μ_i as a zero of det(μI − A(z₀)) = 0 (so
r₁ + ⋯ + r_k = n). Then there is a neighbourhood 𝒰 of z₀ in Ω with the
following properties: (a) there exist positive integers m₁₁, …, m_{1s₁};

m₂₁, …, m_{2s₂}; …; m_{k1}, …, m_{ks_k} such that the n eigenvalues (not neces-
sarily distinct) of A(z) for z ∈ 𝒰 ∖ {z₀} are given by the fractional power
series:

where a_{aij} ∈ ℂ and, for σ = 1, …, m_{ij},

(b) the dimension γ_{ij} of Ker(A(z) − μ_{ijσ}(z)I), as well as the partial multi-
plicities m_{ij}^{(1)} ≥ ⋯ ≥ m_{ij}^{(γ_{ij})} (>0) of the eigenvalue μ_{ijσ}(z) of A(z), do not
depend on z (for z ∈ 𝒰 ∖ {z₀}) and do not depend on σ; (c) for each
i = 1, …, k and j = 1, …, s_i there exist vector-valued fractional power series
converging for z ∈ 𝒰:

where x_{ij}^{(q)} ∈ ℂⁿ, such that for each γ and each z ∈ 𝒰 ∖ {z₀} the vectors
x_{ij}^{(1)}(z), …, x_{ij}^{(m_{ij}^{(γ)})}(z) form a Jordan chain of A(z) corresponding to
μ_{ijσ}(z):

where by definition x_{ij}^{(0)}(z) = 0, and x_{ij}^{(1)}(z) ≠ 0. Moreover, for every
z ∈ 𝒰 ∖ {z₀} the vectors

form a basis in ℂⁿ.
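The fractional power series (19.1.3) is a Puiseux expansion; in the notation of the theorem it presumably reads

\[
% assumption: standard Puiseux form of (19.1.3)
\mu_{ij\sigma}(z) = \mu_i + \sum_{a=1}^{\infty} a_{aij}\,
\Bigl(\epsilon_\sigma\,(z - z_0)^{1/m_{ij}}\Bigr)^{a},
\qquad
\epsilon_\sigma = e^{2\pi i \sigma / m_{ij}}, \quad \sigma = 1, \ldots, m_{ij},
\]

for a fixed branch of (z − z₀)^{1/m_{ij}} on 𝒰 ∖ {z₀}.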


The full proof of Theorem 19.1.1 is too long to be presented here. We
refer the reader to the book of Baumgärtel (1985), and especially Section
IX.3 there, for a complete proof.
Let us make some remarks concerning this important theorem. First, in
the expansion (19.1.3), if m_{ij} > 1, then the greatest common divisor of all

positive integers a such that a_{aij} ≠ 0 is 1 [so the μ_{ijσ}(z), σ = 1, …, m_{ij}, have a
branch point at z₀ of branch multiplicity m_{ij} and not less]. If m_{ij} = 1, then
μ_{ijσ}(z) is analytic on a neighbourhood of z₀; it may even happen that
μ_{ijσ}(z) is the constant function μ_i (see Example 19.1.2). Second, the
theorem does not say anything explicit about the partial multiplicities
p_{i1} ≥ ⋯ ≥ p_{it_i} of the eigenvalue μ_i of A(z₀) (we know only that
Σ_{j=1}^{t_i} p_{ij} = r_i for i = 1, …, k). However, there is a connection between the partial
multiplicities m_{ij}^{(1)} ≥ ⋯ ≥ m_{ij}^{(γ_{ij})} of the eigenvalues μ_{ijσ}(z) of A(z)
(z ∈ 𝒰 ∖ {z₀}) and the partial multiplicities of the eigenvalue μ_i of A(z₀). This
connection is given by the following formula (see Theorem 15.10.2):

where p_{iq} is interpreted as zero for q > t_i, and similarly for m_{ij}^{(q)} when
q > γ_{ij}. As the total sum of partial multiplicities of eigenvalues near μ_i does
not change after a small perturbation of the transformation, we also have the
equality

Let us illustrate Theorem 19.1.1 with an example.

EXAMPLE 19.1.1. Let z₀ = 0 and

The only eigenvalue of A(0) is zero, with partial multiplicities 3 and 1. [The
easiest way to find the partial multiplicities of A(0) is to observe that
rank A(0) = 2 and A(0)² ≠ 0.] To find the eigenvalues of A(z), we have to
solve the equation det(μI − A(z)) = 0, which gives (in the notation of
Theorem 19.1.1)

(so we have k = 1, s₁ = 1, m₁₁ = 2). It is not difficult to see that the only
partial multiplicity of μ_{11σ}(z) is m₁₁^{(1)} = 2. The Jordan chain of A(z) corre-
sponding to μ_{11σ}(z) is

An important particular case of Theorem 19.1.1 appears when the
eigenvalues of A(z) are analytic in a neighbourhood of z₀, that is, all the
integers m_{ij} are equal to 1, as follows.

Corollary 19.1.2
Assume that all the eigenvalues of A(z) are analytic in a neighbourhood of z₀.
Then the distinct eigenvalues μ₁(z), …, μ_k(z) of A(z), z ≠ z₀, can be
enumerated so that they are analytic in a neighbourhood 𝒰₁ of z₀.
Further, assuming that the enumeration of the distinct eigenvalues of A(z) for
z ≠ z₀ is as above, there exist analytic n-dimensional vector functions

in a neighbourhood 𝒰₂ ⊂ 𝒰₁ of z₀ with the following properties: (a) for every
z ∈ 𝒰₂ ∖ {z₀} and for i = 1, …, k; j = 1, …, s_i, the vectors
y_{ij}^{(1)}(z), …, y_{ij}^{(γ_{ij})}(z) form a Jordan chain of A(z) corresponding to the
eigenvalue μ_i(z); (b) for every z ∈ 𝒰₂ ∖ {z₀} the vectors (19.1.4) form a
basis in ℂⁿ.

The following example illustrates this corollary.

EXAMPLE 19.1.2. Let

Obviously, the eigenvalues of A(z) are analytic (even constant). It is easy to
find analytic vector functions y_{ij}^{(q)}(z) as in Corollary 19.1.2: we have k = 1,
s₁ = 1, γ₁₁ = 2, and

Note that y₁₁^{(1)}(z), y₁₁^{(2)}(z) do not form a basis in ℂ² for z = 0; also,
y₁₁^{(1)}(0), y₁₁^{(2)}(0) do not form a Jordan chain of A(0). This shows that in (a) and
(b) in Corollary 19.1.2 one cannot, in general, replace 𝒰₂ ∖ {z₀} by 𝒰₂. □
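The displays are missing; one concrete family consistent with the description (constant eigenvalue, k = 1, s₁ = 1, γ₁₁ = 2, and failure of the chain at z = 0) is the assumed example

\[
% assumption: this matrix and chain are a reconstruction
A(z) = \begin{pmatrix} 0 & z \\ 0 & 0 \end{pmatrix},
\qquad
y_{11}^{(1)}(z) = \begin{pmatrix} z \\ 0 \end{pmatrix},
\quad
y_{11}^{(2)}(z) = \begin{pmatrix} 0 \\ 1 \end{pmatrix};
\]

then A(z)y₁₁^{(2)}(z) = y₁₁^{(1)}(z) for all z, so for z ≠ 0 these vectors form a Jordan chain for the eigenvalue 0, while y₁₁^{(1)}(0) = 0.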

19.2 GLOBAL BEHAVIOUR OF EIGENVALUES AND EIGENVECTORS

The result of Theorem 19.1.1 allows us to derive some global properties of
eigenvalues and eigenvectors of an analytic family of transformations
A(z): ℂⁿ → ℂⁿ defined on Ω. As before, Ω is a domain in the complex
plane.

For a transformation X: ℂⁿ → ℂⁿ we denote by ν(X) the number of
distinct eigenvalues of X. Obviously, 1 ≤ ν(X) ≤ n.

Theorem 19.2.1
Let A(z): ℂⁿ → ℂⁿ be an analytic family of transformations on Ω. Then for
all z ∈ Ω except a discrete set S₀ we have

ν(A(z)) = max_{w∈Ω} ν(A(w))

and for z₀ ∈ S₀ we have

ν(A(z₀)) < max_{w∈Ω} ν(A(w))

Proof. Theorem 19.1.1 shows that for every z₀ ∈ Ω there is a neigh-
bourhood 𝒰_{z₀} of z₀ such that ν(A(z)) is constant (equal to ν₀, say) for
z ∈ 𝒰_{z₀} ∖ {z₀} and ν(A(z₀)) ≤ ν₀. A priori, it appears that ν₀ may depend on
z₀. Let us show that actually ν₀ is independent of z₀. For ν = 1, …, n, let
𝒱_ν = ⋃ 𝒰_{z₀}, where the union is taken over all z₀ ∈ Ω such that ν(A(z)) = ν in
a deleted neighbourhood 𝒰_{z₀} ∖ {z₀} of z₀. Obviously, 𝒱₁, …, 𝒱_n are open
sets whose union is Ω, and it is easily seen that they are mutually disjoint.
This can happen only if all the 𝒱_ν are empty except for 𝒱_{ν₀}; therefore,
𝒱_{ν₀} = Ω. It is clear also that

Now if ν(A(z′)) < ν₀ for some z′ ∈ Ω, then by Theorem 19.1.1 we have
ν(A(z)) ≡ ν₀ in a deleted neighbourhood of z′. This shows that the set S₀ of
all z ∈ Ω for which ν(A(z)) < ν₀ is indeed discrete. □

The points from S₀ will be called the multiple points of the analytic family
of transformations A(z), because at these points the eigenvalues of A(z)
attain higher multiplicity than "usual."
Another way to prove Theorem 19.2.1 is by examining a suitable
resultant matrix. Let

for some scalar functions a_j(z) that are analytic on Ω, and consider the
(2n − 1) × (2n − 1) matrix whose entries are analytic functions on Ω:

This is the resultant matrix of two scalar polynomials in μ: det(μI − A(z))
and (∂/∂μ)(det(μI − A(z))). A well-known property of resultant matrices
[see, e.g., Gohberg and Heinig (1975)] states that 2n − 1 − rank R(z) is
equal to the number of common zeros of these two polynomials in μ
(counting multiplicities). In other words

or
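Writing p(μ, z) = det(μI − A(z)) = μⁿ + a_{n−1}(z)μ^{n−1} + ⋯ + a₀(z), the matrix R(z) is presumably the Sylvester resultant matrix of p and ∂p/∂μ, and the missing relations (19.2.1) then read

\[
% assumption: reconstruction of (19.2.1) from the resultant property quoted above
2n - 1 - \operatorname{rank} R(z)
= \deg \gcd\Bigl(p(\cdot, z),\ \tfrac{\partial p}{\partial \mu}(\cdot, z)\Bigr)
= n - \nu(A(z)),
\qquad\text{equivalently}\qquad
\nu(A(z)) = \operatorname{rank} R(z) - (n - 1).
\]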

Now let k (n ≤ k ≤ 2n − 1) be the largest size of a square submatrix of R(z)
whose determinant is not identically zero. Denoting by S₁(z), …, S_l(z) all
such submatrices of R(z), we obviously have rank R(z) = k if at least one of
det S₁(z), …, det S_l(z) is different from zero, and rank R(z) < k otherwise.
Comparing with (19.2.1), we obtain

if not all the numbers det S₁(z), …, det S_l(z) are zeros;

otherwise. Since the set of common zeros of det S₁(z), …, det S_l(z) is
discrete, Theorem 19.2.1 follows. □
Theorem 19.1.1 shows that the distinct eigenvalues of A(z),
μ₁(z), …, μ_ν(z) [where ν = max_{z∈Ω} ν(A(z))], are analytic on Ω ∖ S₀,
where S₀ is taken from Theorem 19.2.1, and have at most algebraic branch
points in S₀. [Some of the functions μ₁(z), …, μ_ν(z) may also be analytic at
certain points in S₀.] Denote by S₁ the subset of S₀ consisting of all the
points z₀ such that at least one of the functions μ_j(z), j = 1, …, ν, is not
analytic at z₀. As a subset of a discrete set, S₁ is itself discrete. The set S₁
will be called the first exceptional set of the analytic family of linear
transformations A(z), z ∈ Ω.
It may happen that S₁ ≠ S₀, as shown in the following example.

EXAMPLE 19.2.1. Let Ω = ℂ and

The eigenvalues of A(z) are ±z, so in this case S₀ = {0} but S₁ = ∅. □
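The matrix is missing; one family with eigenvalues ±z (an assumption consistent with the text) is

\[
% assumption: this matrix is a reconstruction
A(z) = \begin{pmatrix} 0 & z \\ z & 0 \end{pmatrix},
\]

whose eigenvalues ±z are entire functions that merely collide at z = 0, so 0 is a multiple point but not a branch point.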

Example 19.1.2 shows that in general, when z ∈ Ω ∖ S₁, one cannot
expect that there will be a Jordan basis of A(z) that depends analytically on
z. To achieve that, we must exclude from consideration a second exceptional
set, which is described now.

Theorem 19.2.2
Let A(z): ℂⁿ → ℂⁿ be an analytic family of transformations on Ω with the set
S₀ of multiple points, and let μ₁(z), …, μ_ν(z) be the distinct eigenvalues of
A(z), analytic on Ω ∖ S₀ and having at most branch points in S₀. Let
m_{j1}(z) ≥ ⋯ ≥ m_{jγ}(z), γ = γ(j, z), be the partial multiplicities of the eigen-
value μ_j(z) of A(z) for j = 1, …, ν, z ∉ S₀. Then there exists a discrete set S₂′
in Ω such that S₂′ ⊂ Ω ∖ S₀, and the number γ(j, z) of partial multiplicities and
the partial multiplicities m_{jk}(z) themselves, k = 1, …, γ(j, z), do not depend
on z in Ω ∖ (S₀ ∪ S₂′), for j = 1, …, ν.

Proof. The proof follows the pattern of the proof of Theorem 19.2.1. In
view of Theorem 19.1.1, for every z₀ ∈ Ω there is a neighbourhood 𝒰_{z₀} of
z₀ such that the number of distinct eigenvalues ν = ν(z₀), as well as the
numbers γ_j = γ_j(z₀) of partial multiplicities and the partial multiplicities
themselves m_{j1} ≥ ⋯ ≥ m_{jγ_j}, m_{jk} = m_{jk}(z₀), corresponding to the jth eigen-
value, are constant for z ∈ 𝒰_{z₀} ∖ {z₀}. It is assumed that the distinct
eigenvalues of A(z) for z ∈ 𝒰_{z₀} ∖ {z₀} are enumerated so that they are
analytic and γ₁ ≥ ⋯ ≥ γ_ν. Denote by Δ the (finite) set of all sequences of
type

where ν, γ_j, m_{jk} are positive integers with the properties that ν ≤ n;
γ₁ ≥ ⋯ ≥ γ_ν; m_{i1} ≥ ⋯ ≥ m_{iγ_i}, i = 1, …, ν; Σ_{i,j} m_{ij} = n. For any sequence
δ ∈ Δ as in (19.2.2) let 𝒱_δ = ⋃ 𝒰_{z₀}, where the union is taken over all z₀ ∈ Ω such
that ν = ν(z₀); γ_j = γ_j(z₀), j = 1, …, ν; m_{ij} = m_{ij}(z₀), j = 1, …, γ_i;
i = 1, …, ν. Obviously, 𝒱_δ is open and ⋃_{δ∈Δ} 𝒱_δ = Ω. Also, the sets 𝒱_δ, δ ∈ Δ,
are mutually disjoint. As Ω is connected, this means that all the 𝒱_δ, except for
one of them, are empty. So Theorem 19.2.2 follows. □

The set S₂ = S₂′ ∪ (S₀ ∖ S₁), where S₂′ is taken from Theorem 19.2.2 and
S₀ and S₁ are the set of multiple points and the first exceptional set of A(z),
respectively, is called the second exceptional set of A(z). Note that S₂ ∩ S₁ =

∅. The second exceptional set is characterized by the properties that the
distinct analytic eigenvalues of A(z) can be continued analytically into any
point z₀ ∈ S₂, but for every z₀ ∈ S₂, either ν(A(z₀)) < max_{z∈Ω} ν(A(z)), or
ν(A(z₀)) = max_{z∈Ω} ν(A(z)) and for at least one analytic eigenvalue μ_j(z) of
A(z) the partial multiplicities of μ_j(z₀) are different from the partial
multiplicities of μ_j(z), z ≠ z₀, in a neighbourhood of z₀.

EXAMPLE 19.2.2. Let

where p_j(z) and q_j(z) are not identically zero polynomials such that

for all z ∈ ℂ, where 1 ≤ k₁ < k₂ < ⋯ < k_{q−1} < k_q = n. We also assume that
the polynomials p_{k₁}(z), …, p_{k_q}(z) are all different. We have the set of
multiple points

the first exceptional set S₁ is empty, and the second exceptional set S₂ is the
union of S₀ and the set {z ∈ ℂ ∖ S₀ | q_i(z) = 0 for some k_p + 1 ≤ i ≤ k_{p+1} − 1
and some p}. □

Now we state the result on the existence of an analytic Jordan basis for an
analytic family of transformations.

Theorem 19.2.3
Let A(z): ℂⁿ → ℂⁿ be an analytic family of transformations on Ω with the first
exceptional set S₁ and the second exceptional set S₂. Let μ₁(z), …, μ_ν(z) be
the distinct eigenvalues of A(z) (apart from the multiple points), which are
analytic on Ω ∖ S₁ and have at most algebraic branch points in S₁. Then there
exist n-dimensional vector functions

j = 1, …, ν, where m_{j1} ≥ ⋯ ≥ m_{jγ_j} are positive integers, with the following
properties: (a) the functions (19.2.3) are analytic on Ω ∖ S₁ and have at most
612 Jordan Form of Analytic Matrix Functions

algebraic branch points in 5,; (b) for every z E f t ^ ( S j U S 2 ) the vectors


(19.2.3) form a basis in <p"; (c) for every z EII ^ (5, U S2) the vectors

form a Jordan chain of the transformation A(z) corresponding to the


eigenvalue ^(z), for k = 1, . . . , ?;; / = 1, . . . , v.

It is easily seen that if μ_j(z) has an algebraic branch point at z_0 ∈ S_1, then all eigenvectors

of A(z) corresponding to μ_j(z) also have an algebraic branch point at z_0. Indeed, let y(z) be some (say, the kth) coordinate of the eigenvector x^{(j)}(z) that is not identically zero. The equality (A(z) − μ_j(z)I)x^{(j)}(z) = 0 for z in Ω \ S_2 implies that

a_k(z)x^{(j)}(z) = μ_j(z)y(z)    (19.2.4)

where a_k(z) is the kth row of A(z). If x^{(j)}(z) were analytic at z_0, then (19.2.4) would imply that μ_j(z) is also analytic at z_0, a contradiction.
The proof of Theorem 19.2.3 is given in the next section.
In the particular case when A(z) is diagonable (i.e., similar to a diagonal matrix) for every z ∉ S_1 ∪ S_2, the conclusions of Theorem 19.2.3 can be strengthened, as follows.

Theorem 19.2.4
Let A(z) be as in Theorem 19.2.3, and assume that A(z) is diagonable for all z ∉ S_1 ∪ S_2. Then there exist n-dimensional vector functions

(19.2.5)

with the following properties: (a) the functions (19.2.5) are analytic on Ω \ S_1 and have at most algebraic branch points in S_1; (b) for every z ∈ Ω and every j = 1, …, ν the vectors x_1^{(j)}(z), …, x_{γ_j}^{(j)}(z) are linearly independent; (c) for every z ∈ Ω \ (S_1 ∪ S_2) the vectors x_1^{(j)}(z), …, x_{γ_j}^{(j)}(z) form a basis in Ker(μ_j(z)I − A(z)). In particular, the vectors (19.2.5) form a basis in ℂ^n for every z ∈ Ω \ (S_1 ∪ S_2).

The strengthening of Theorem 19.2.3 arises in statement (b), where the linear independence is asserted for all z ∈ Ω and not only for z ∈ Ω \ (S_1 ∪ S_2) as asserted in Theorem 19.2.3. The proof of Theorem 19.2.4 is obtained in the course of the proof of Theorem 19.2.3.
We illustrate Theorem 19.2.4 with a simple example.

EXAMPLE 19.2.3. Let

Here S_1 = ∅; S_2 = {0}. The eigenvectors x_1(z) and x_2(z) corresponding to the eigenvalues 0 and z^2 of A(z), respectively, are analytic and nonzero for all z ∈ ℂ (including the point z = 0), as ensured by Theorem 19.2.4. However, x_1(z) and x_2(z) are not linearly independent for z = 0. □
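The behaviour in this example is easy to reproduce numerically. The matrix below is an assumed illustrative choice (the displayed family above is not fully legible in this copy): A(z) = [[0, z], [0, z^2]] has eigenvalues 0 and z^2 with analytic eigenvectors (1, 0)^T and (1, z)^T, which degenerate at z = 0.

```python
import numpy as np

def A(z):
    # assumed illustrative family with eigenvalues 0 and z**2
    return np.array([[0.0, z], [0.0, z**2]], dtype=complex)

x1 = lambda z: np.array([1.0, 0.0], dtype=complex)  # eigenvector for eigenvalue 0
x2 = lambda z: np.array([1.0, z], dtype=complex)    # eigenvector for eigenvalue z**2

for z in [1.0, 0.1, 0.0]:
    r0 = np.linalg.norm(A(z) @ x1(z))                     # residual for eigenvalue 0
    r1 = np.linalg.norm(A(z) @ x2(z) - z**2 * x2(z))      # residual for eigenvalue z**2
    det = np.linalg.det(np.column_stack([x1(z), x2(z)]))  # equals z, vanishes at z = 0
    print(f"z={z}: residuals {r0:.1e}, {r1:.1e}; |det[x1 x2]| = {abs(det):.3f}")
```

Both eigenvectors stay analytic and nonzero through z = 0, yet det[x_1(z) x_2(z)] = z, so the basis property fails exactly on S_2 = {0}.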

19.3 PROOF OF THEOREM 19.2.3

We need some preparation for the proof of Theorem 19.2.3.
A family of transformations B(z): ℂ^n → ℂ^n is called branch analytic on Ω if B(z) is analytic on Ω except for a discrete set of algebraic (as opposed to logarithmic) branch points. The same definition applies to n-dimensional vector functions as well. The singular set of a family of transformations B(z): ℂ^n → ℂ^n that is branch analytic on Ω is, by definition, the set of all z_0 ∈ Ω such that

It is easily seen that the singular set is discrete and coincides with the set of all z_0 ∈ Ω with

We use the notation S(B) to designate the singular set of B(z).

Lemma 19.3.1
Let B(z): ℂ^n → ℂ^m be a branch analytic family of transformations on Ω. Then there exist m-dimensional branch analytic vector-valued functions y_1(z), …, y_r(z) on Ω, and n-dimensional branch analytic vector-valued functions x_1(z), …, x_{n−r}(z) on Ω, with the following properties: (a) each branch point of any function y_j(z), j = 1, …, r, or x_k(z), k = 1, …, n − r, is also a branch point of B(z); (b) y_1(z), …, y_r(z) are linearly independent for every z ∈ Ω; (c) x_1(z), …, x_{n−r}(z) are linearly independent for every z ∈ Ω; (d) Span{y_1(z), …, y_r(z)} = Im B(z) and Span{x_1(z), …, x_{n−r}(z)} = Ker B(z) for every z not belonging to S(B).

The proof of this lemma can be obtained by repeating the proofs of Lemma 18.2.2 and Theorem 18.2.1 with the following modification: in place of the Weierstrass and Mittag-Leffler theorems (Lemmas 18.2.3 and 18.2.4), one must use the branch analytic and branch meromorphic versions of these theorems. [In the context of Riemann surfaces, these versions can be found in Kra (1972).]

Lemma 19.3.2
Let B_1(z): ℂ^n → ℂ^m and B_2(z): ℂ^n → ℂ^m be branch analytic families of transformations on Ω such that

Ker B_1(z) ⊇ Ker B_2(z)

for every z ∈ Ω that does not belong to the union of the singular sets of B_1(z) and B_2(z). Then there exist branch analytic n-dimensional vector functions x_1(z), …, x_s(z), z ∈ Ω, with the following properties: (a) every branch point of any x_j(z), j = 1, …, s, is also a branch point of at least one of B_1(z) and B_2(z); (b) x_1(z), …, x_s(z) are linearly independent for every z ∈ Ω; (c) for every z ∈ Ω that does not belong to S(B_1) ∪ S(B_2) the vectors x_1(z), …, x_s(z) form a basis in Ker B_1(z) modulo Ker B_2(z).

An analogous result also holds in case Im B_1(z) ⊇ Im B_2(z) for all z ∈ Ω, with the possible exception of singular points of B_1(z) and B_2(z).

Proof. We regard B_1(z) and B_2(z) as m × n matrix functions, with respect to fixed bases in ℂ^n and ℂ^m. By Lemma 19.3.1, find linearly independent branch analytic vector-valued functions y_1(z), …, y_v(z) on Ω such that

Span{y_1(z), …, y_v(z)} = Ker B_2(z)    (19.3.1)

for all z ∈ Ω not belonging to the singular set of B_2(z). Fix z_0 ∈ Ω, and choose x_{v+1}, …, x_n in such a way that y_1(z_0), …, y_v(z_0), x_{v+1}, …, x_n form a basis in ℂ^n. Using the branch analytic version of Lemma 18.2.2 (cf. the paragraph following Lemma 19.3.1), find branch analytic vector functions y_{v+1}(z), …, y_n(z), z ∈ Ω, such that y_1(z), …, y_v(z), y_{v+1}(z), …, y_n(z) form a basis in ℂ^n for every z ∈ Ω. Replacing, if necessary, B_i(z) by B_i(z)S(z), i = 1, 2, where S(z) = [y_1(z) ⋯ y_n(z)] is an invertible n × n matrix function, we can assume that

where the B_i′(z) are branch analytic m × (n − v) matrix functions, and Ker B_2′(z) = 0 for all z ∈ Ω with the possible exception of a discrete set of points. By Lemma 19.3.1 again, find branch analytic linearly independent ℂ^{n−v}-valued functions x̃_1(z), …, x̃_s(z), z ∈ Ω, such that x̃_1(z), …, x̃_s(z) is a basis in Ker B_1′(z) for all z ∈ Ω except for the singular points of B_1′(z). Then the vector functions

satisfy the requirements of Lemma 19.3.2. □

Lemma 19.3.3
Let B_1(z) and B_2(z) be as in Lemma 19.3.2, and let x_1(z), …, x_t(z) be branch analytic n-dimensional vector functions with the following properties: (a) every branch point of any x_j(z), j = 1, …, t, is also a branch point of at least one of B_1(z) and B_2(z); (b) there exists a discrete set T ⊃ S(B_1) ∪ S(B_2) such that x_1(z), …, x_t(z) belong to Ker B_1(z) and are linearly independent modulo Ker B_2(z) for every z ∈ Ω \ T. Then there exist branch analytic n-dimensional vector functions x_{t+1}(z), …, x_s(z) such that every branch point of any x_j(z), j = t + 1, …, s, is a branch point of at least one of B_1(z) and B_2(z), and for every z ∈ Ω \ T the set x_1(z), …, x_t(z), x_{t+1}(z), …, x_s(z) forms a basis in Ker B_1(z) modulo Ker B_2(z).

The case t = 0 [when the set x_1(z), …, x_t(z) does not appear] is not excluded in Lemma 19.3.3.

Proof. Arguing as in the proof of Lemma 19.3.2, we can assume that Ker B_2(z) = 0 for every z ∉ S(B_2). Replacing T by T ∪ S(B_2), we can assume that S(B_2) = ∅.
Further, by the branch analytic version of Lemma 18.2.2, there exist branch analytic and linearly independent vector functions y_1(z), …, y_t(z) with

There exist branch analytic vector functions y_{t+1}(z), …, y_n(z) such that y_1(z), …, y_n(z) form a basis in ℂ^n for every z ∈ Ω (cf. the proof of Lemma 19.3.2). By replacing B_1(z) by B_1(z)[y_1(z) ⋯ y_n(z)], we can assume that

and the proof is reduced to the case t = 0. But then Lemma 19.3.1 is applicable. □

We are ready now to prove Theorem 19.2.3. The main idea is to mimic
the proof of the Jordan form for a transformation (Section 2.3) using
Lemma 19.3.2 when necessary.

Proof of Theorem 19.2.3. For a fixed j (j = 1, …, ν) let m_{j1} be the maximal positive integer p such that

for all z ∉ S_1 ∪ S_2. By Theorem 19.2.1 and the definition of S_2 the number m_{j1} is well defined. By Lemma 19.3.2, there exist branch analytic vector functions on Ω that are linearly independent for every z ∈ Ω, can have branch points only in S_1, and are such that

form a basis in Ker(μ_j(z)I − A(z))^{m_{j1}} modulo Ker(μ_j(z)I − A(z))^{m_{j1}−1}, for every z ∈ Ω that does not belong to S((μ_j(z)I − A(z))^{m_{j1}}) ∪ S((μ_j(z)I − A(z))^{m_{j1}−1}). As we have seen in the proof of the Jordan form, the vectors

are linearly independent modulo Ker(μ_j(z)I − A(z))^{m_{j1}−2} for every z ∉ S_1 ∪ S_2 (we assume here that m_{j1} ≥ 2). By Lemma 19.3.3, there exist branch analytic vector functions on Ω:

with branch points only in S_1 and such that for every z ∉ S_1 ∪ S_2 the vectors

form a basis in Ker(μ_j(z)I − A(z))^{m_{j1}−1} modulo Ker(μ_j(z)I − A(z))^{m_{j1}−2}. Continuing this process as in the proof of the Jordan form, we obtain the vector functions (19.2.3) with the desired properties. □

19.4 ANALYTIC EXTENDABILITY OF INVARIANT SUBSPACES

In this section we study the following problem: given an analytic family of transformations A(z) on Ω and an invariant subspace M_0 of A(z_0), when is there a family of subspaces M(z) that is analytic in some domain Ω′ ⊂ Ω with z_0 ∈ Ω′, and such that M(z_0) = M_0 and M(z) is A(z) invariant for all z ∈ Ω′? (As before, Ω is a domain in ℂ.) If this happens, we say that M_0 is extendable to an analytic A(z)-invariant family of subspaces on Ω′. The main result in this direction is given in the following theorem.

Theorem 19.4.1
Let A(z): ℂ^n → ℂ^n be an analytic family of transformations on Ω with the first and second exceptional sets S_1 and S_2, respectively. Then, provided z_0 ∈ Ω \ (S_2 ∪ S_1), every A(z_0)-invariant subspace M_0 is extendable to an analytic A(z)-invariant family of subspaces on Ω \ S_1.

Proof. For j = 1, …, ν, let

(19.4.1)

be n-dimensional vector functions as in Theorem 19.2.3. We consider A(z) and the vectors (19.4.1) as an n × n matrix function and n-dimensional vector functions, respectively, written in the standard orthonormal basis in ℂ^n. Let z_0 ∈ Ω \ (S_2 ∪ S_1), and let J be the Jordan form of A(z_0):

where

and J_k(μ) is the k × k Jordan block with eigenvalue μ. For z ∈ Ω \ S_1 let T(z) be the n × n matrix whose columns are the vectors (19.4.1) (in this order). Observe that T(z) is analytic on Ω \ S_1 with algebraic branch points in S_1, and T(z) is invertible for z ∈ Ω \ (S_2 ∪ S_1) [the function T(z) is analytic but not necessarily invertible at points in S_2]. Then we have A(z_0)T(z_0) = T(z_0)J. Given an A(z_0)-invariant subspace M_0 and any z ∈ Ω \ (S_1 ∪ S_2), define

Clearly, M(z) is analytic and A(z) invariant for z ∈ Ω \ (S_1 ∪ S_2), and also M(z_0) = M_0. We show that M(z) admits an analytic and A(z)-invariant continuation into the set S_2. Let f_1, …, f_k be a basis in M_0; then the vectors

form a basis in M(z) for every z ∈ Ω \ (S_1 ∪ S_2). Note that g_1(z), …, g_k(z) are analytic on Ω \ S_1. By Lemma 18.2.2 there exist n-dimensional vector functions h_1(z), …, h_k(z) that are analytic on Ω \ S_1, linearly independent for every z ∈ Ω \ S_1, and for which

Span{h_1(z), …, h_k(z)} = Span{g_1(z), …, g_k(z)}

whenever z ∉ S_1 ∪ S_2. Putting M(z) = Span{h_1(z), …, h_k(z)} for z ∈ S_2, we clearly obtain an analytic extension of the analytic family {M(z)}_{z ∈ Ω \ (S_1 ∪ S_2)} to the points in S_2. As for a fixed z_0 ∈ S_2 we have

it follows in view of Theorem 13.4.2 that M(z_0) is A(z_0) invariant. □

The proof of Theorem 19.4.1 shows that the analytic A(z)-invariant family of subspaces M(z) on Ω \ S_1 with M(z_0) = M_0 has at most algebraic branch points in S_1, in the following sense. For every z′ ∈ S_1, either M(z) can be analytically continued into z′ (i.e., there exists a subspace M′, which is necessarily A(z′) invariant, and for which the family of subspaces N(z), z ∈ (Ω \ S_1) ∪ {z′}, defined by N(z) = M(z) on Ω \ S_1, N(z′) = M′, is analytic on (Ω \ S_1) ∪ {z′}), or M(z) = S(z)M_0 in a neighbourhood of z′, where S(z) is an invertible family of transformations that is analytic on a deleted neighbourhood of z′ and has an algebraic branch point at z′.
Looking ahead to the applications of the next chapter, we introduce the notion of analytic extendability of chains of invariant subspaces. Let A(z): ℂ^n → ℂ^n be an analytic family of transformations on Ω, and let

Λ_0 = (M_{01} ⊂ ⋯ ⊂ M_{0r})

be a chain of A(z_0)-invariant subspaces. We say that Λ_0 is extendable to an analytic chain of A(z)-invariant subspaces on a set Ω′ ⊂ Ω containing z_0 if there exist analytic families of subspaces M_{01}(z), …, M_{0r}(z) on Ω′ such that M_{0j}(z_0) = M_{0j} for j = 1, …, r, M_{0j}(z) ⊂ M_{0k}(z) for j < k and z ∈ Ω′, and M_{0j}(z) is A(z) invariant for all z ∈ Ω′. Clearly, this is a generalization of the notion of extendability of a single invariant subspace dealt with in Theorem 19.4.1. The arguments used in the proof of Theorem 19.4.1 also prove the following result on analytic extendability of chains of invariant subspaces.

Theorem 19.4.2
Let A(z), S_1, and S_2 be as in Theorem 19.4.1. Then every chain of A(z_0)-invariant subspaces, where z_0 ∈ Ω \ (S_2 ∪ S_1), is extendable to an analytic chain of A(z)-invariant subspaces on Ω \ S_1. Moreover, the analytic families of subspaces that form this analytic chain have at most algebraic branch points at S_1 (in the sense explained after the proof of Theorem 19.4.1).

Chains consisting of spectral subspaces are important examples of chains of subspaces that are always analytically extendable. Recall that an A-invariant subspace M is called spectral if M is a sum of root subspaces of A.

Theorem 19.4.3
Let A(z) and S_1 be as in Theorem 19.4.1. Then every chain Λ_0 = (M_{01} ⊂ ⋯ ⊂ M_{0r}) of spectral subspaces of A(z_0), where z_0 ∈ Ω, is extendable to an analytic chain of A(z)-invariant subspaces on Ω \ (S_1 \ {z_0}) that has at most algebraic branch points at S_1 \ {z_0}.

Proof. For j = 1, …, r write M_{0j} = Im P_j(z_0), where

P_j(z_0) = (2πi)^{−1} ∮_{Γ_j} (λI − A(z_0))^{−1} dλ

is the Riesz projector of A(z_0) corresponding to a suitable simple rectifiable contour Γ_j. We can assume that Γ_k lies in the interior of Γ_j for j > k. Let 𝒰 ⊂ Ω be a neighbourhood of z_0 that is so small that A(z) has no eigenvalues on Γ_1 ∪ ⋯ ∪ Γ_r for z ∈ 𝒰. Clearly, for z ∈ 𝒰 we find that

where the M_j(z) = Im P_j(z) form an analytic chain of A(z)-invariant subspaces in 𝒰. Fix z̃ ∈ 𝒰 \ (S_1 ∪ S_2), and let M̃_j(z) be the analytic A(z)-invariant family of subspaces (cf. the proof of Theorem 19.4.1) to which M_j(z̃) is extendable. It is easily seen that M̃_j(z) = M_j(z) for z ∈ 𝒰 \ (S_1 ∪ S_2), so Λ_0 admits the desired extension. □
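The Riesz projector used in this proof can be approximated numerically by discretizing the contour integral. The following sketch is illustrative only (the test matrix and the circular contour are assumptions, with the circle chosen so that it separates the spectrum):

```python
import numpy as np

def riesz_projector(A, center, radius, n_nodes=400):
    # P = (2*pi*i)^{-1} * contour integral of (lambda*I - A)^{-1} d(lambda)
    # over the circle |lambda - center| = radius (trapezoidal rule)
    n = A.shape[0]
    P = np.zeros((n, n), dtype=complex)
    for t in 2 * np.pi * np.arange(n_nodes) / n_nodes:
        lam = center + radius * np.exp(1j * t)                        # contour point
        dlam = 1j * radius * np.exp(1j * t) * (2 * np.pi / n_nodes)   # d(lambda)
        P += np.linalg.solve(lam * np.eye(n) - A, np.eye(n)) * dlam
    return P / (2j * np.pi)

A = np.diag([0.0, 0.1, 3.0])   # two eigenvalues inside the unit circle, one outside
P = riesz_projector(A, center=0.0, radius=1.0)
print(np.round(P.real, 6))     # projector onto the sum of the root subspaces inside
```

For a chain of spectral subspaces one would use nested contours Γ_1, Γ_2, …, exactly as in the proof.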

To analyze the extendability of A(z_0)-invariant subspaces when z_0 ∈ S_2, we need the following notion. An invariant subspace M_0 of A(z_0), z_0 ∈ Ω, is called sequentially isolated (in Ω) if there is no sequence z_m ≠ z_0, m = 1, 2, …, of points in Ω tending to z_0 such that, for some A(z_m)-invariant subspaces M_m (m = 1, 2, …), we have lim_{m→∞} θ(M_m, M_0) = 0. Theorem 19.4.1 shows, in particular, that every A(z_0)-invariant subspace with z_0 ∈ Ω \ (S_1 ∪ S_2) is sequentially nonisolated. However, certain A(z_0)-invariant subspaces with z_0 ∈ S_2 may be sequentially isolated, as follows.

EXAMPLE 19.4.1. Let

$$A(z) = \begin{bmatrix} 0 & z \\ 0 & 0 \end{bmatrix}.$$

Here S_1 is empty, S_2 = {0}. Any A(0)-invariant subspace of the form Span{(γ, 1)^T}, where γ is a complex number, is sequentially isolated. On the other hand, the A(0)-invariant subspace Span{(1, 0)^T} is sequentially nonisolated. □

Clearly, a sequentially isolated A(z_0)-invariant subspace is not extendable to an analytic A(z)-invariant family of subspaces on a neighbourhood of z_0.
We conjecture that these are the only nonextendable invariant subspaces.

Conjecture 19.4.4
Let A(z), S_1, and S_2 be as in Theorem 19.4.1. Then every sequentially nonisolated A(z_0)-invariant subspace M_0, where z_0 ∈ S_2, is extendable to an analytic A(z)-invariant family of subspaces on Ω \ S_1 that has at most algebraic branch points in S_1 (in the same sense as in the remark following the proof of Theorem 19.4.1).

Theorem 19.4.3 verifies this conjecture in case M_0 is a spectral subspace.

19.5 ANALYTIC MATRIX FUNCTIONS OF A REAL VARIABLE

The results of Sections 19.1–19.4 hold also for n × n matrix functions A(t) that are analytic in the real variable t on an open interval Ω of the real line. Of particular interest is the case when all eigenvalues of A(t) are real, as follows.

Theorem 19.5.1
Let A(t) be an n × n matrix function that is analytic in the real variable t on Ω. Assume that, for all t ∈ Ω, all eigenvalues of A(t) are real. Then the eigenvalues of A(t) are also analytic functions of t on Ω.

Proof. Let t_0 ∈ Ω. By Theorem 19.1.1, all eigenvalues of A(t), for t in a neighbourhood of t_0, are given by fractional power series of the form

λ(t) = λ_0 + Σ_{j=1}^∞ c_j (t − t_0)^{j/α}

where the c_j are complex numbers. Let j_1 be the first index such that c_{j_1} ≠ 0. [If all the c_j are zeros, then λ(t) = λ_0 is obviously analytic at t_0.] Then

λ(t) = λ_0 + c_{j_1}(t − t_0)^{j_1/α} + o(|t − t_0|^{j_1/α})    (19.5.1)

Take t > t_0 and (t − t_0)^{1/α} positive. Since λ(t) and λ_0 are real, we find that c_{j_1} must be real. In (19.5.1) we now take t < t_0 and (t − t_0)^{1/α} = |t − t_0|^{1/α}(cos(π/α) + i sin(π/α)). We obtain a contradiction with the fact that c_{j_1} is real unless j_1 is a multiple of α. If j_2 > j_1 is the minimal integer with c_{j_2} ≠ 0, then

λ(t) = λ_0 + c_{j_1}(t − t_0)^{j_1/α} + c_{j_2}(t − t_0)^{j_2/α} + o(|t − t_0|^{j_2/α})

and the preceding argument shows that c_{j_2} is real and j_2 is a multiple of α. Continuing in this way, we conclude that λ(t) is analytic in a neighbourhood of t_0. As t_0 was arbitrary in Ω, the analyticity of λ(t) on Ω follows. □

Combining this result with Theorems 19.2.3 and 19.4.1, we have the
following corollary.

Corollary 19.5.2
Let A(t) be an analytic n × n matrix function of a real variable t on Ω, and assume that all eigenvalues of A(t) are real when t ∈ Ω. Let S_2 be the discrete set of points in Ω defined by the property that either

where ν(t) is the number of distinct eigenvalues of A(t), or

but for at least one analytic eigenvalue μ_j(t) of A(t) the partial multiplicities of μ_j(t_0) are different from the partial multiplicities of μ_j(t), t ≠ t_0, in a real neighbourhood of t_0. Then there exist analytic n-dimensional vector functions

(19.5.2)

on Ω such that for every t ∈ Ω \ S_2 the vectors (19.5.2) form a basis in ℂ^n and, for j = 1, …, r, x_{j1}(t), …, x_{jm_j}(t) is a Jordan chain of the transformation A(t). Moreover, when t_0 ∈ Ω \ S_2, every A(t_0)-invariant subspace M_0 is extendable to an analytic A(t)-invariant family of subspaces on Ω.

In particular, the conclusions of Corollary 19.5.2 hold for an analytic n × n matrix function A(t) of the real variable t ∈ Ω that is diagonable and all eigenvalues of which are real for every t ∈ Ω. These properties are satisfied, for example, if A(t) is an analytic matrix function on Ω that is hermitian for all t ∈ Ω.
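As a quick numerical illustration of the hermitian case (sketch only; the family below is an arbitrary choice), note that a standard eigensolver returns eigenvalues in nondecreasing order, so at a crossing the sorted branches have a kink while the analytic branches simply cross:

```python
import numpy as np

def B(t):
    # hermitian family, analytic in the real variable t, with a crossing at t = 0;
    # its analytic eigenvalue branches are t and -t
    return np.array([[t, 0.0], [0.0, -t]])

for t in [-0.2, -0.1, 0.0, 0.1, 0.2]:
    print(f"t={t:+.1f}: sorted eigenvalues {np.linalg.eigvalsh(B(t))}")
# the sorted output is (-|t|, |t|): the analytic branches t and -t are recovered
# only after swapping the two sorted values on one side of the crossing
```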

19.6 EXERCISES

19.1 Find the first and second exceptional sets for the following analytic
families of transformations:

19.2 In Exercise 19.1 (a) find a basis in ℂ^2 that is analytic on ℂ (with the possible exception of branch points) and consists of eigenvectors of A(z) (with the possible exception of a discrete set of values of z).
19.3 Describe the first and second exceptional sets for the following types of analytic families of transformations A(z): ℂ^n → ℂ^n on Ω:
(a) A(z) = diag[a_1(z), …, a_n(z)] is a diagonal matrix.
(b) A(z) is a circulant matrix (with respect to a fixed basis in ℂ^n) for every z ∈ Ω.
(c) A(z) is an upper triangular Toeplitz matrix for every z ∈ Ω.
(d) For every z ∈ Ω, all the entries of A(z), with the possible exception of the entries (i, j) with i = j or with i + j = n + 1, are zeros.
19.4 Show that the analytic matrix function of type α(z)I + β(z)A, where α(z) and β(z) are scalar analytic functions and A is a fixed n × n matrix, has all eigenvalues analytic.
19.5 Show that if A(z) = α(z)I + β(z)A is the function of Exercise 19.4 and β(z) is a polynomial of degree l, then the second exceptional set of A(z) contains not more than l points.
19.6 Prove that the number of exceptional points of a polynomial family of transformations Σ_{j=0}^k z^j A_j, z ∈ ℂ, is always finite. [Hint: Use the approach based on the resultant matrix (Section 19.2).]
19.7 Let A(z) be an analytic n × n matrix function defined on Ω whose values are circulant matrices. When is every A(z_0)-invariant subspace analytically extendable for every z_0 ∈ Ω?
19.8 Describe the analytically extendable A(z_0)-invariant subspaces, where A(z) is an analytic n × n matrix function on Ω with upper triangular Toeplitz values, and z_0 ∈ Ω.
19.9 Let A(z): ℂ^n → ℂ^n be an analytic family of transformations defined on Ω, and assume that A(z_0) is nonderogatory for some z_0 ∈ Ω. Prove that every A(z_0)-invariant subspace is sequentially nonisolated. (Hint: Use Theorem 15.2.3.)

19.10 Let A(z) be an analytic n × n matrix function of the real variable z ∈ Ω, where Ω is an open interval on the real line, such that A(z) is hermitian for every z ∈ Ω. Prove that there exist analytic families x_1(z), …, x_n(z) of n-dimensional vectors on Ω such that for every z_0 ∈ Ω the vectors x_1(z_0), …, x_n(z_0) form an orthonormal basis of eigenvectors of A(z_0). [Hint: Let λ_0(z) be an eigenvalue of A(z) that is analytic on Ω (one exists by Theorem 19.5.1). Choose an analytic vector function x_1(z) ∈ Ker(A(z) − λ_0(z)I) on Ω with ‖x_1(z)‖ = 1. Repeat this argument for the restriction of A(z) to Span{x_1(z)}^⊥ (recall that Span{x_1(z)}^⊥ is an analytic family of subspaces on Ω), and so on.]
19.11 Let A and B be hermitian n × n matrices, and assume that A has n distinct eigenvalues λ_1, λ_2, …, λ_n. Show that in the power series

representing the eigenvalue λ_k(z) of A + zB, and the corresponding power series representing the eigenvector f_k(z) of A + zB, for z sufficiently close to zero, we have

where the a_k are pure imaginary numbers. It is assumed that ‖f_k(z)‖ = 1 for real z sufficiently close to zero. [Hint: By Exercise 19.10, the eigenvalue λ_k(z) and the corresponding eigenvector f_k(z) are analytic functions of z. Show that the equality

(1)

holds. Find λ_k^{(1)} by taking the scalar product of (1) with f_k. By taking the scalar product of (1) with f_j (j ≠ k) it is found that

The condition ‖f_k(z)‖ = 1 gives (f_k^{(1)}, f_k) + (f_k, f_k^{(1)}) = 0.]


Chapter Twenty

Applications

This chapter contains applications of the results of the previous two chap-
ters. These applications are concerned with problems of factorizations of
monic matrix polynomials and rational matrix functions depending analyti-
cally on a parameter. The main problem is the analysis of analytic properties
of divisors. Solutions of a matrix quadratic equation with coefficients
depending analytically on a parameter are also analyzed.

20.1 FACTORIZATION OF MONIC MATRIX POLYNOMIALS

Consider a monic matrix polynomial L(λ) = Iλ^l + Σ_{j=0}^{l−1} A_j λ^j, where A_0, …, A_{l−1} are n × n matrices that depend analytically on the parameter z for z ∈ Ω, and Ω is a domain in the complex plane. We write A_j = A_j(z) and L(λ) = L(λ, z). In this section we study the behaviour of factorizations L(λ, z) = L_1(λ, z) ⋯ L_r(λ, z) of L(λ, z) as functions of z. Our attention is focused on the problem of analytic extension of factorizations from a given z_0 ∈ Ω.
Let

$$C(z) = \begin{bmatrix} 0 & I & 0 & \cdots & 0 \\ 0 & 0 & I & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & I \\ -A_0(z) & -A_1(z) & -A_2(z) & \cdots & -A_{l-1}(z) \end{bmatrix}$$

be the companion matrix of L(λ, z). Obviously, C(z) is an analytic nl × nl matrix function on Ω. The first (resp. second) exceptional set of C(z) is called the first (resp. second) exceptional set of L(λ, z). In other words (see Chapter 19), z_0 ∈ Ω belongs to the first exceptional set S_1 of L(λ, z) if and only if not all solutions of det L(λ, z) = 0 (as functions of z) are analytic at z_0. The point z_0 belongs to the second exceptional set S_2 of L(λ, z) if and only if all solutions of det L(λ, z) = 0 are analytic in a neighbourhood of z_0 and, denoting by λ_1(z), …, λ_r(z) all the different analytic functions in a neighbourhood of z_0 satisfying det L(λ_j(z), z) = 0, j = 1, …, r, we have either (a) λ_j(z_0) = λ_k(z_0) for some j ≠ k, or (b) all the numbers λ_1(z_0), …, λ_r(z_0) are different, but for at least one λ_j(z) the partial multiplicities of L(λ, z) at λ_j(z) are not the same when z = z_0 and when z ≠ z_0 (and z is sufficiently close to z_0).
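The companion matrix is straightforward to assemble numerically; the sketch below (with arbitrary test coefficients) checks that its eigenvalues are exactly the solutions of det L(λ, z) = 0 for the fixed z chosen.

```python
import numpy as np

def companion(coeffs):
    # companion matrix of L(lambda) = I*lambda**l + sum_j coeffs[j]*lambda**j,
    # where coeffs = [A_0, ..., A_{l-1}] are n x n blocks
    n, l = coeffs[0].shape[0], len(coeffs)
    C = np.zeros((n * l, n * l), dtype=complex)
    C[: n * (l - 1), n:] = np.eye(n * (l - 1))       # identity blocks above the diagonal
    for j, Aj in enumerate(coeffs):
        C[n * (l - 1):, n * j: n * (j + 1)] = -Aj    # last block row: -A_0, ..., -A_{l-1}
    return C

# test data: L(lambda, z) = I*lambda**2 + A_1(z)*lambda + A_0(z) at z = 2
z = 2.0
A0 = np.array([[z, 0.0], [0.0, 1.0]])
A1 = np.array([[0.0, z], [0.0, 0.0]])
C = companion([A0, A1])
print(np.sort_complex(np.linalg.eigvals(C)))  # the nl = 4 zeros of det L(., z)
```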
Now we state the main result on analytic extendability of factorizations of
L(λ, z).

Theorem 20.1.1
Let z_0 ∈ Ω \ (S_1 ∪ S_2) and

L(λ, z_0) = L_1(λ) ⋯ L_r(λ)    (20.1.1)

where L_j(λ), j = 1, …, r, are monic matrix polynomials and S_1 (resp. S_2) is the first (resp. second) exceptional set of L(λ, z). Then there exist monic matrix polynomials L_1(λ, z), …, L_r(λ, z) whose coefficients are analytic functions on Ω \ (S_1 ∪ S) (where S is some discrete subset of Ω \ {z_0}), having at most poles in S and at most algebraic branch points in S_1, and such that

L(λ, z) = L_1(λ, z) ⋯ L_r(λ, z),  L_j(λ, z_0) = L_j(λ), j = 1, …, r.

Note that the case when S_1 ∩ S ≠ ∅ is not excluded. This means that the coefficients A_{jk}(z) of L_j(λ, z) may have an algebraic branch point and a pole at the same point z′ simultaneously; that is, there is a power series representation of type

in a deleted neighbourhood of z′, where p and q are positive integers.

Proof. We use the description of factorizations of monic matrix polynomials in terms of invariant subspaces developed in Chapter 5. Let

(20.1.2)

be a standard triple for L(λ), and let

(20.1.3)

be the chain of C(z_0)-invariant subspaces corresponding to the factorization (20.1.1) [with respect to the triple (20.1.2)]. In particular, for j = 1, …, r − 1, the transformations

are invertible, where p_j is the sum of degrees of the matrix polynomials L_{r−j+1}(λ), …, L_r(λ). By Theorem 19.4.2 the chain (20.1.3) is extendable to a chain M_1(z) ⊂ ⋯ ⊂ M_{r−1}(z) of C(z)-invariant subspaces that is analytic in Ω \ S_1 and has at most algebraic branch points in S_1. Let S = S^{(1)} ∪ ⋯ ∪ S^{(r−1)}, where S^{(j)} is the discrete set of all z ∈ Ω for which the transformation

is not invertible. For z ∈ Ω \ S, let

be the factorization of L(λ, z) that corresponds to the chain M_1(z) ⊂ ⋯ ⊂ M_{r−1}(z) of C(z)-invariant subspaces [with respect to the triple (20.1.2)]. Formulas (5.6.3) and (5.6.5) show that the coefficients of L_j(λ, z) have all the desired properties. □

In the same way (using Theorem 19.4.3 in place of Theorem 19.4.2) one
proves the analytic extendability of spectral factorization, as follows.

Theorem 20.1.2
Let z_0 ∈ Ω and

where σ(L_j) ∩ σ(L_k) = ∅ for j ≠ k. Then there exist monic matrix polynomials L_1(λ, z), …, L_r(λ, z) with the same properties as in Theorem 20.1.1, whose coefficients are, in addition, analytic at z_0.

We say that a factorization

(20.1.4)

of monic matrix polynomials L_j(λ) = Iλ^{l_j} + Σ_{k=0}^{l_j−1} A_{jk}λ^k, j = 1, …, r, is sequentially nonisolated if there is a sequence of points {z_m}_{m=1}^∞ in Ω \ {z_0} such that lim_{m→∞} z_m = z_0 and a sequence of factorizations

where

with lim_{m→∞} A_{jk}^{(m)} = A_{jk} for k = 0, …, l_j − 1 and j = 1, …, r. Theorem 20.1.1 shows, in particular, that every factorization (20.1.4) with z_0 ∉ S_1 ∪ S_2 is sequentially nonisolated. Simple examples show that sequentially isolated factorizations do exist, for instance:

EXAMPLE 20.1.1. Let C(z) be any matrix depending analytically on z in a domain Ω with the property that for z = z_0 ∈ Ω, C(z_0) has a square root and for z ≠ z_0, z in a neighbourhood of z_0, C(z) has no square root. The prime example here is

$$C(z) = \begin{bmatrix} 0 & z \\ 0 & 0 \end{bmatrix}, \qquad z_0 = 0.$$

Then define L(λ, z) = Iλ^2 − C(z). It is easily seen that if L(λ, z) has a right divisor Iλ − A(z), then L(λ, z) = Iλ^2 − A(z)^2, and hence L(λ, z) has a monic right divisor of degree 1 if and only if C(z) has a square root. Thus, under the hypotheses stated, L(λ, z) has an isolated divisor at z_0. □

It is an open question whether every sequentially nonisolated factorization L(λ, z_0) = L_1(λ) ⋯ L_r(λ) of monic matrix polynomials with z_0 belonging to the second exceptional set S_2 of L(λ, z) is analytically extendable in the sense of Theorem 20.1.1. (It is clear that sequential nonisolatedness is a necessary condition for analytic extendability.) A proof of Conjecture 19.4.4 will answer this question in the affirmative.

20.2 RATIONAL MATRIX FUNCTIONS DEPENDING ANALYTICALLY ON A PARAMETER

In this section we study the realizations and exceptional points of rational matrix functions that depend analytically on a parameter. This will serve as a background for the study of analytic extendability of minimal factorizations of such functions, to be dealt with in the next section.
Let W(λ, z) = [w_{ij}(λ, z)]_{i,j=1}^n be a rational n × n matrix function that depends analytically on the parameter z for z ∈ Ω, where Ω is a domain in ℂ. That is, each entry w_{ij}(λ, z) is a function of type p_{ij}(λ, z)/q_{ij}(λ, z), where p_{ij}(λ, z) and q_{ij}(λ, z) are (scalar) polynomials in λ whose coefficients are analytic functions of z on Ω. We assume that:

(a) For each i and j and for all z ∈ Ω, the polynomial q_{ij}(λ, z) in λ is not identically zero, so the rational matrix function W(λ, z) is well defined for every z ∈ Ω.
(b) It is convenient to make a further assumption, namely, that for each pair of indices i, j (1 ≤ i, j ≤ n) there exists a z_0 ∈ Ω such that the leading coefficient of q_{ij}(λ, z) is nonzero at z = z_0 and the polynomials p_{ij}(λ, z_0) and q_{ij}(λ, z_0) are coprime, that is, have no common zeros. In particular, this assumption rules out the case when p_{ij}(λ, z) and q_{ij}(λ, z) have a nontrivial common divisor whose coefficients depend analytically on z for z ∈ Ω.
(c) Finally, we assume that for every z ∈ Ω the rational matrix function W(λ, z) (as a function of λ) is analytic at infinity and W(∞, z) = I.

Assumptions (a), (b), and (c) are maintained throughout this section.
It can happen that W(λ, z) has zeros and poles tending to infinity when z tends to a certain point z_0 ∈ Ω. This is illustrated in the next example.

EXAMPLE 20.2.1. Let

$$W(\lambda, z) = \frac{1 + \lambda z}{z + 1 + \lambda z}.$$

Obviously, W(λ, z) satisfies conditions (a), (b), and (c). Specifically, W(λ, z) depends analytically on z for z ∈ ℂ, W(∞, z) = 1 for all z ∈ ℂ, and the polynomials 1 + λz and z + 1 + λz have no common zeros for z = 1. However, W(λ, z) has a zero at λ = −z^{−1} and a pole at λ = −(z + 1)z^{−1}, and both tend to infinity as z → 0. □
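Assuming the formula for W(λ, z) reconstructed above, the escape of the zero and the pole as z → 0 can be tabulated directly (sketch only):

```python
zero = lambda z: -1.0 / z          # root of 1 + lambda*z
pole = lambda z: -(z + 1.0) / z    # root of z + 1 + lambda*z

for z in [1.0, 0.1, 0.01, 0.001]:
    print(f"z={z}: zero at {zero(z):.1f}, pole at {pole(z):.1f}")
# both run off to infinity as z -> 0, while W(., 0) = 1 has no zeros or poles at all
```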

A convenient criterion for boundedness of zeros and poles in a neighbourhood of each point in Ω can be given in terms of the entries of W(λ, z), as follows.

Proposition 20.2.1
The poles and zeros of W(λ, z) are bounded in a neighbourhood of each point in Ω if and only if, for each entry p_{ij}(λ, z)/q_{ij}(λ, z) of W(λ, z), the leading coefficient of the polynomial q_{ij}(λ, z) has no zeros in Ω (as an analytic function of z on Ω).
Proof. Assume that the leading coefficient of each q_{ij}(λ, z) has no zeros in Ω. Fix z_0 ∈ Ω. Write q_{ij}(λ, z) = Σ_{k=0}^{s} a_k(z)λ^k, where a_s(z) has no zeros in Ω (in general, s depends on i and j). First, the zeros of q_{ij}(λ, z) are bounded in a neighbourhood of z_0. Indeed, writing the polynomial in the monic form λ^s + Σ_{k=0}^{s−1}(a_k(z)/a_s(z))λ^k, its zeros are all in the disc |λ| ≤ 1 + max_{z∈𝒰, 0≤k≤s−1} |a_k(z)/a_s(z)|, where 𝒰 is a suitably chosen neighbourhood of z_0. As the poles of W(λ, z) must also be zeros of at least one of the polynomials q_{ij}(λ, z), i, j = 1, …, n, it follows that there exists an M > 0 such that the poles of W(λ, z) are all in the disc |λ| ≤ M for every z ∈ 𝒰. Arguing by contradiction, assume that the zeros of W(λ, z) are not bounded in any neighbourhood of z_0. So there exist sequences {z_m} and {λ_m} such that z_m → z_0, |λ_m| → ∞, and λ_m is a zero of W(λ, z_m). Then W(λ_m, z_m)x_m = 0 for some vector x_m of norm 1. [Here we use the fact that λ_m is not a pole of W(λ, z_m).] Passing to a subsequence, if necessary, we can assume that x_m → x_0 for some x_0 with ‖x_0‖ = 1. Using also the fact that the entries of W(λ, z) are analytic in each variable separately for |λ| > M and z ∈ 𝒰, it follows that W(λ, z) is continuous on the set {(λ, z) : |λ| > M, z ∈ 𝒰} (including λ = ∞). A quick way to verify this is by using a general result that says that if a function f(z_1, …, z_m) of complex variables z_1, …, z_m, defined on V_1 × ⋯ × V_m where each V_j is a domain in ℂ, is analytic in each variable separately (when all other variables are fixed), then f(z_1, …, z_m) is analytic (in particular, continuous) on V_1 × ⋯ × V_m. For the proof of this result see, for example, Bochner and Martin (1948). Now the continuity of W(λ, z) implies W(∞, z_0)x_0 = lim_{m→∞} W(λ_m, z_m)x_m = 0, a contradiction with the fact that W(∞, z_0) = I.
Conversely, let z_0 be a zero of the leading coefficient of some q_{ij}(λ, z). Then there is a zero λ(z) of the polynomial q_{ij}(λ, z) such that λ(z) tends to infinity as z tends to z_0. As λ(z) is a pole of w_{ij}(λ, z) provided λ(z) is not a zero of p_{ij}(λ, z), we have only to show that λ(z) is not a zero of p_{ij}(λ, z) for z ≠ z_0 sufficiently close to z_0.
To this end, use the existence of a point z_1 such that the leading coefficient of q_{ij}(λ, z_1) is nonzero and the polynomials p_{ij}(λ, z_1), q_{ij}(λ, z_1) are coprime. The coprimeness of p_{ij}(λ, z) and q_{ij}(λ, z) is equivalent to the invertibility of the resultant matrix R(z) of the pair, as long as the leading coefficient of q_{ij}(λ, z) is nonzero [e.g., see Uspensky (1978)]. So det R(z_1) ≠ 0, and since det R(z) is an analytic function of z on Ω, it follows that det R(z) ≠ 0 for all z ≠ z_0 sufficiently close to z_0. Hence, indeed, p_{ij}(λ(z), z) ≠ 0 for z ≠ z_0 in some neighbourhood of z_0. □
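The resultant test for coprimeness used in the proof amounts to checking that the Sylvester matrix of the two polynomials is invertible; a minimal sketch, with an arbitrary pair of scalar polynomials, follows.

```python
import numpy as np

def sylvester(p, q):
    # Sylvester resultant matrix of polynomials p, q given as coefficient
    # lists (highest degree first); det != 0 iff p and q are coprime
    m, n = len(p) - 1, len(q) - 1
    R = np.zeros((m + n, m + n))
    for i in range(n):                      # n shifted copies of p
        R[i, i: i + m + 1] = p
    for i in range(m):                      # m shifted copies of q
        R[n + i, i: i + n + 1] = q
    return R

# p = (x + 1)(x + 2) and q = x + 3 are coprime: nonzero determinant
print(np.linalg.det(sylvester([1, 3, 2], [1, 3])))   # 2.0
# p = (x + 1)(x + 2) and q = x + 2 share a zero: determinant 0
print(np.linalg.det(sylvester([1, 3, 2], [1, 2])))   # 0.0
```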

It turns out that the boundedness of the poles and zeros of W(λ, z) is precisely the condition needed for the existence of an analytic minimal realization, in the following sense.

Theorem 20.2.2
Let W(λ, z) be a rational n × n matrix function that depends analytically on the parameter z ∈ Ω and satisfies assumptions (a), (b), and (c). Let the zeros and poles of W(λ, z) be bounded in a neighbourhood of every point in Ω. Then there exist analytic matrix functions on Ω, A(z), B(z), and C(z), of sizes m × m, m × n, and n × m, respectively, such that

W(λ, z) = I + C(z)(λI − A(z))^{−1}B(z)    (20.2.1)

and for every z ∈ Ω, with the possible exception of a discrete set S, the realization (20.2.1) is minimal.
Conversely, if (20.2.1) holds for some matrix functions A(z), B(z), and C(z) of appropriate sizes that are analytic on Ω, then the zeros and poles of W(λ, z) are bounded in a neighbourhood of every point in Ω.

Proof. By Theorem 7.1.2, for every z ∈ Ω there exists a realization

for some matrices C_0(z), A_0(z), and B_0(z). Further, by Proposition 20.2.1, the leading coefficients of the denominators of the entries in W(λ, z) have no zeros in Ω. Using this fact, the proof of Theorem 7.1.2 shows that A_0(z), B_0(z), and C_0(z) can be chosen to be analytic matrix functions of z on Ω. Let p × p be the size of A_0(z). By Theorem 18.2.1 we can find families of subspaces of ℂ^p that are analytic on Ω and are such that, for every z ∈ Ω with the possible exception of a discrete set S_1, we have

and

For z ∈ S_1 we have

and

By Theorem 18.3.2, when z ∈ Ω we may write

where the summands are the ranges of analytic families of projectors on Ω. Using the same Theorem 18.2.1, we find an analytic family of subspaces ℒ(z) on Ω such that

for every z ∈ Ω, except possibly for a discrete set S_2. For each z ∈ S_2 we have

In view of Theorem 18.3.2 there exists an analytic family of subspaces such that

for all z ∈ Ω. Also, Lemma 19.3.2, with

ensures the existence of an analytic family of subspaces ℳ(z) on Ω such that

for all z ∈ Ω. Let

where the operator involved is the projector on ℳ(z). We regard A(z), B(z), and C(z) as matrices with respect to a fixed basis x_1(z), …, x_m(z) in ℳ(z) such that the x_j(z) are analytic functions on Ω (such a basis exists in view of Theorem 18.3.2). It is easily seen that A(z), B(z), and C(z) are analytic on Ω. The proof of Theorem 6.1.3, together with Theorem 6.1.5, shows that

W(λ, z) = I + C(z)(λI − A(z))^{−1}B(z)    (20.2.2)

for every z ∈ Ω \ (S_1 ∪ S_2), and that (20.2.2) is a minimal realization for W(λ, z) when z ∉ S_1 ∪ S_2. By continuity, equation (20.2.2) holds also for z ∈ S_1 ∪ S_2, and the first part of Theorem 20.2.2 is proved.
Assume now that (20.2.1) holds for some analytic matrix functions A(z), B(z), and C(z). It follows from Theorem 7.2.3 that every pole of W(λ, z) is an eigenvalue of A(z) and every zero of W(λ, z) is an eigenvalue of A(z)^× = A(z) − B(z)C(z) (although the converse need not be true). As the eigenvalues of A(z) and of A(z) − B(z)C(z) depend continuously on z, they are bounded in a neighbourhood of each point in Ω, and the converse statement of Theorem 20.2.2 follows. □
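The realization formula (20.2.1) is easy to evaluate numerically; the following sketch (with arbitrary test data at a fixed z) confirms W(∞, z) = I and reads off candidate poles and zeros from A(z) and A(z) − B(z)C(z).

```python
import numpy as np

def W(lam, A, B, C):
    # realization W(lambda) = I + C (lambda*I - A)^{-1} B
    m, n = A.shape[0], B.shape[1]
    return np.eye(n) + C @ np.linalg.solve(lam * np.eye(m) - A, B)

A = np.array([[0.0, 1.0], [0.0, 2.0]])   # arbitrary data playing the roles of A(z), B(z), C(z)
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])

print(W(1e8, A, B, C))                  # approximately I: analytic at infinity
print(np.linalg.eigvals(A))             # candidate poles of W: eigenvalues of A
print(np.linalg.eigvals(A - B @ C))     # candidate zeros: eigenvalues of A - BC
```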

As the proof of Theorem 20.2.2 shows, the converse statement of this theorem remains true if the matrix functions A(z), B(z), and C(z) satisfying (20.2.1) are merely assumed to be continuous on Ω.
The discrete set S from Theorem 20.2.2 consists of exactly those points z where the McMillan degree of W(λ, z) is less than m. This follows from Theorems 7.1.3 and 7.1.5. Note also that the McMillan degree of W(λ, z) is equal to m for every z ∈ Ω \ S.
From now on it will be assumed (in addition to the assumptions made at the beginning of this section) that the zeros and poles of W(λ, z) are bounded in a neighbourhood of each point in Ω. Let

W(λ, z) = I + C(z)(λI − A(z))^{−1}B(z)    (20.2.3)

be a minimal realization of W(λ, z) as in Theorem 20.2.2. Here S is the set of all z ∈ Ω such that the realization (20.2.3) is not minimal. Denote by S_1 and S_2 the first and second exceptional sets, respectively, of the analytic matrix function A(z), as defined in Section 19.2. Similarly, let S_1^× and S_2^× be the first and second exceptional sets, respectively, of A(z)^× = A(z) − B(z)C(z). The set S_1 ∪ S_1^× will be called the first exceptional set T_1 of W(λ, z). As the poles (resp. zeros) of W(λ, z), when z ∉ S, are exactly the eigenvalues of A(z) [resp. of A(z)^×] (see Section 7.2), it follows that the point z_0 belongs to the first exceptional set of W(λ, z) if and only if there is a pole or a zero λ(z), z ∈ 𝒰 \ {z_0}, where 𝒰 is a neighbourhood of z_0, such that z_0 is an algebraic branch point of λ(z). Note that it can happen that (20.2.3) is not a minimal realization for some z belonging to the first exceptional set of W(λ, z) (see Example 20.2.2). The set

will be called the second exceptional set T_2 of W(λ, z). Denoting by δ(z) the McMillan degree of W(λ, z), we obtain the following description of the points in the second exceptional set: z_0 ∈ T_2 if and only if all poles and zeros of W(λ, z) can be continued analytically (as functions of z) to z_0, and either

or

and for at least one zero (or pole) λ_0(z) that is analytic in a neighbourhood 𝒰 of z_0, the zero (or pole) multiplicities of W(λ, z_0) corresponding to λ_0(z_0) are different from the zero (or pole) multiplicities of W(λ, z) at λ_0(z), z ∈ 𝒰 \ {z_0}. Again, it can happen that T_2 intersects with the set of points where the realization (20.2.3) is not minimal. Clearly, both T_1 and T_2 are discrete sets. Note also that the set T_1 ∪ T_2 contains all the points z_0 for which δ(z_0) < m.

EXAMPLE 20.2.2. Let

be a scalar rational function depending analytically on z ∈ ℂ. Clearly, W(∞, z) = 1, and the zeros and poles of W(λ, z) are bounded in a neighbourhood of each point (cf. Proposition 20.2.1). In the notation introduced above we have S_1 = {0, 2}; S_2 = {1}; S = {0}. Further,

and a calculation shows that

So the eigenvalues of A(z)^× are given by the formula

It is easily seen that S_1^× = {z_1, z_2, z_3}, where z_1, z_2, z_3 are the zeros of the polynomial −2z^3 + 6z^2 − 4z + 1, and S_2^× is empty. The first exceptional set of W(λ, z) is T_1 = {0, 2, z_1, z_2, z_3}, whereas the second exceptional set of W(λ, z) consists of the single point {1}. □

20.3 MINIMAL FACTORIZATIONS OF RATIONAL MATRIX FUNCTIONS

Let W(λ, z) be a rational n × n matrix function depending analytically on the parameter z for z ∈ Ω, as in the preceding section. Let

W(λ, z_0) = W_1(λ) ⋯ W_r(λ)    (20.3.1)

be a minimal factorization of W(λ, z_0) for some z_0 ∈ Ω. Here W_1(λ), …, W_r(λ) are n × n rational matrix functions with value I at infinity. We study the problem of continuation of (20.3.1) to an analytic family of minimal factorizations. In case z_0 does not belong to the exceptional sets of W(λ, z), such a continuation is always possible, as the following theorem shows.

Theorem 20.3.1
Let W(λ, z) be a rational n × n matrix function that depends analytically on z for z ∈ Ω and such that W(∞, z) = I for z ∈ Ω. Assume that the denominator and numerator of each entry in W(λ, z) are coprime for some z_0 ∈ Ω that is not a zero of the leading coefficient of the denominator. Assume, in addition, that the zeros and poles of W(λ, z) are bounded in a neighbourhood of each point in Ω. Let

S = {z ∈ Ω : δ(z) < m}

where δ(z) is the McMillan degree of W(λ, z), and let T_1 and T_2 be the first and second exceptional sets of W(λ, z), respectively. Consider a minimal factorization (20.3.1) with z_0 ∈ Ω \ (T_1 ∪ T_2 ∪ S). Then there exist rational matrix functions W_1(λ, z), …, W_r(λ, z), the entries of which depend analytically on z in Ω (with the possible exception of algebraic branch points in T_1 and of a discrete set D ⊂ Ω of poles), having the following properties: (a) W_j(∞, z) = I for j = 1, …, r and every z ∈ Ω \ D; (b) the point z_0 does not belong to D, and W_j(λ, z_0) = W_j(λ) for j = 1, …, r; (c) W(λ, z) = W_1(λ, z) ⋯ W_r(λ, z) for every z ∈ Ω \ D. Moreover, this factorization is minimal for every z ∈ Ω \ (T_1 ∪ T_2 ∪ S ∪ D).

The set D of poles of W_j(λ, z) in Theorem 20.3.1 generally depends on the factorization (20.3.1), and not only on the original function W(λ, z). This is in contrast with the sets T_1 and T_2, which depend on W(λ, z) only.

Proof. Let A(z), B(z), and C(z) be as in Theorem 20.2.2, so that the realization (20.2.1) is minimal for all z ∈ Ω \ S. Using Theorem 7.5.1, let

(20.3.2)

be the direct sum decomposition corresponding to the minimal factorization (20.3.1), with respect to the minimal realization

Thus, for j = 1, …, r − 1 the subspaces

are A(z_0) invariant, whereas the subspaces

(20.3.3)

are A(z_0)^× invariant. [Here, as usual, A(z)^× = A(z) − B(z)C(z).] Now, by Theorem 19.4.2, there exist families of subspaces ℳ_j(z) for j = 1, …, r − 1, and 𝒩_j(z) for j = 2, …, r, which are analytic on Ω, except possibly for algebraic branch points in T_1, and have the following properties: the ℳ_j(z) are A(z) invariant, and the 𝒩_j(z) are A(z)^× invariant; ℳ_j(z_0) = ℳ_{j0}, j = 1, …, r − 1; 𝒩_j(z_0) = 𝒩_{j0}, j = 2, …, r.
Let m_j be the dimension of the jth summand in the direct sum decomposition (20.3.2), j = 1, …, r (so m_1 + ⋯ + m_r = m). It follows from the proof of Theorem 19.4.2 that

where, for each j, the vector functions x_1^{(j)}(z), …, x_{p_j}^{(j)}(z), as well as y_1^{(j)}(z), …, y_{q_j}^{(j)}(z), are linearly independent for every z ∈ Ω and analytic on Ω except possibly for algebraic branch points in T_1. Here p_j = m_1 + ⋯ + m_j is the dimension of ℳ_j(z) and q_j = m_j + m_{j+1} + ⋯ + m_r is the dimension of 𝒩_j(z).
Our next observation is that

(20.3.4)

where D_j is a discrete set in Ω. [Note that the sum in (20.3.4) is direct.] Indeed, by (20.3.2) and (20.3.3) we have

where

(20.3.5)

is a matrix function of size m × (p_j + q_{j+1}) = m × m. It remains to observe that det F_j(z) is not identically zero [because det F_j(z_0) ≠ 0] and that (20.3.4) holds with D_j being the set of zeros of det F_j(z).
Let D = D_1 ∪ ⋯ ∪ D_{r−1}. In particular, we have

(20.3.6)

Note also that z_0 does not belong to D.
Consider the subspaces ℒ_j(z) = ℳ_j(z) ∩ 𝒩_j(z) for j = 2, …, r − 1. First, it is clear that

(20.3.7)

where we put ℒ_1(z) = ℳ_1(z) and ℒ_r(z) = 𝒩_r(z). Indeed, it is sufficient to verify that

(20.3.8)

[By definition, ℳ_r(z) = ℂ^m.] The inclusion ⊂ in (20.3.8) is evident from the definition of ℒ_j(z). Further, for z ∈ Ω \ D, we have

in view of (20.3.4). Now, using (20.3.6), we have for z ∈ Ω \ D

so

and (20.3.8) follows.
By Theorem 7.5.1, for z ∈ Ω \ (D ∪ S), there exists a minimal factorization

(20.3.9)

which corresponds to the direct sum decomposition (20.3.7), with respect to the minimal realization

If we show that each projector π_j(z) on ℒ_j(z) along ℒ_1(z) + ⋯ + ℒ_{j−1}(z) + ℒ_{j+1}(z) + ⋯ + ℒ_r(z) is analytic in Ω, except possibly for algebraic branch points in T_1 and poles in D, then formula (7.5.5) shows that the W_j(λ, z) have all the properties required in Theorem 20.3.1. [Note that, by continuity, factorization (20.3.9) holds also for z ∈ S \ D, but it is not minimal at these points.]
To verify these properties of π_j(z), introduce for z ∈ Ω \ D the projector Q_j(z) on ℳ_j(z) along 𝒩_{j+1}(z), for j = 1, …, r − 1. Define also Q_0(z) = 0 and Q_r(z) = I. One checks easily that for j = 1, …, r

π_j(z) = (I − Q_{j−1}(z))Q_j(z)    (20.3.10)

[Indeed, both sides of (20.3.10) take the value 0 on vectors from 𝒩_{j+1}(z) and from ℒ_1(z), …, ℒ_{j−1}(z), and take the value x on each vector x from ℒ_j(z).] Therefore, (I − Q_{j−1}(z))Q_j(z) is a projector that coincides with π_j(z) for j = 1, …, r. But

where F_j(z) is given by (20.3.5); so Q_j(z) is analytic on Ω except possibly for algebraic branch points in T_1 and poles in D. Hence π_j(z) also enjoys these properties. □

Consider now an important case of analytic continuation of minimal factorizations that can also be achieved when z_0 ∈ T_1 ∪ T_2.

Theorem 20.3.2
Let W(λ, z) be as in Theorem 20.3.1, and let

be a minimal factorization of W(λ, z_0), z_0 ∈ Ω. [As usual, W_1(λ), …, W_r(λ) are rational matrix functions with value I at infinity.] Assume that W_j(λ) and W_k(λ) have no common zeros and no common poles when j ≠ k. Then there exist rational matrix functions W_j(λ, z), j = 1, …, r, with the properties described in Theorem 20.3.1 that, in addition, are analytic on a neighbourhood of z_0.

The proof is obtained in the same way as the proof of Theorem 20.3.1, by using Theorem 19.4.3 in place of Theorem 19.4.2.
To conclude this section we discuss minimal factorizations (20.3.1) that cannot be continued analytically (as in Theorem 20.3.1). We say that the minimal factorization (20.3.1) is sequentially nonisolated if there is a sequence of points {z_m}_{m=1}^∞ in Ω \ {z_0} such that z_m → z_0, and sequences of rational matrix functions W_{jm}(λ), j = 1, …, r, with value I at infinity, such that

is a minimal factorization of W(λ, z_m), m = 1, 2, …, and for j = 1, …, r

(20.3.11)

Equation (20.3.11) is understood in the sense that for each pair of indices k, l (1 ≤ k, l ≤ n) the (k, l) entry of W_{jm}(λ) has the form

where the coefficients are complex numbers (depending, of course, on j, k, and l) that converge, as m → ∞, to the corresponding coefficients of the (k, l) entry in W_j(λ).
Clearly, if an analytic continuation (as in Theorem 20.3.1) of the minimal factorization (20.3.1) exists, then this factorization is sequentially nonisolated. In particular, Theorem 20.3.1 shows that every minimal factorization (20.3.1) with z_0 ∉ T_1 ∪ T_2 ∪ S is sequentially nonisolated. Also, Theorem 20.3.2 shows that a minimal factorization (20.3.1) is sequentially nonisolated provided W_j(λ) and W_k(λ) have no common zeros and no common poles when j ≠ k.
It turns out that not every minimal factorization of W(λ, z_0) can be continued analytically; indeed, we exhibit next a sequentially isolated minimal factorization.

EXAMPLE 20.3.1. Let

and consider the minimal factorization of W(λ, 0):

(20.3.12)

We verify that this factorization is sequentially isolated. To this end we find all minimal factorizations of W(λ, z), where z ≠ 0. A minimal realization of W(λ, z) is easily found:

In the notation of Theorem 20.2.2, we have

Theorem 7.5.1 shows that all nontrivial minimal factorizations of W(λ, z) are given by the formulas

So the minimal factorization (20.3.12) is indeed sequentially isolated. □

20.4 MATRIX QUADRATIC EQUATIONS

Consider the matrix quadratic equation

XBX + XA − DX − C = 0    (20.4.1)

where A, B, C, D are known matrices of sizes n × n, n × m, m × n, m × m, respectively, and X is an m × n matrix to be found. We assume that A = A(z), B = B(z), C = C(z), and D = D(z) are analytic functions of z on Ω, where Ω is a domain in the complex plane. The analytic properties of the solutions X as functions of z are studied.
Let

$$T(z) = \begin{bmatrix} A(z) & B(z) \\ C(z) & D(z) \end{bmatrix}$$

be the corresponding (m + n) × (m + n) analytic matrix function, and let S_1 and S_2 be the first and second exceptional sets of T(z) as defined in Section 19.2. We have the following main result.

Theorem 20.4.1
For every z_0 ∈ Ω \ (S_1 ∪ S_2) and every solution X_0 of

X_0 B(z_0)X_0 + X_0 A(z_0) − D(z_0)X_0 − C(z_0) = 0    (20.4.2)

there exists an m × n matrix function X(z) that is analytic on Ω, except possibly for algebraic branch points in S_1 and a discrete set of poles in Ω, and such that X(z_0) = X_0 and

X(z)B(z)X(z) + X(z)A(z) − D(z)X(z) − C(z) = 0

for every z ∈ Ω that is not a pole of X(z). [The case when a point of S_1 is also a pole of X(z) is not excluded.]
Proof. By Proposition 17.8.1, the subspace

$$M_0 = \operatorname{Im}\begin{bmatrix} I \\ X_0 \end{bmatrix}$$

is T(z_0) invariant. By Theorem 19.4.1, there is a family of subspaces M(z) that is analytic on Ω except possibly for algebraic branch points in S_1, for which M(z_0) = M_0, and for which M(z) is T(z) invariant for all z ∈ Ω \ S_1. By Theorem 18.3.2 there exists an (m + n) × n analytic matrix function S(z) on Ω with linearly independent columns such that, for all z ∈ Ω \ S_1, M(z) = Im S(z).
Write

$$S(z) = \begin{bmatrix} S_1(z) \\ S_2(z) \end{bmatrix}$$

where S_1(z) is of size n × n and S_2(z) is of size m × n, and observe that det S_1(z) is not identically zero [we may normalize S(z) so that S_1(z_0) = I]. Now, by the same Proposition 17.8.1,

X(z) = S_2(z)S_1(z)^{−1}

is the desired solution of (20.4.2). □
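Numerically, the same recipe can be carried out by taking for the invariant subspace the span of n eigenvectors of T; the sketch below is illustrative (the helper solve_quadratic and its eigenvector selection rule are not from the text).

```python
import numpy as np

def solve_quadratic(A, B, C, D):
    # solve X B X + X A - D X - C = 0 via an invariant subspace of
    # T = [[A, B], [C, D]]: if Im [S1; S2] is T-invariant and S1 is
    # invertible, then X = S2 S1^{-1} is a solution
    n = A.shape[0]
    T = np.block([[A, B], [C, D]])
    vals, vecs = np.linalg.eig(T)
    S = vecs[:, np.argsort(vals.real)[:n]]   # span of n eigenvectors (one choice)
    S1, S2 = S[:n, :], S[n:, :]
    return S2 @ np.linalg.solve(S1, np.eye(n))

# scalar check: x*x - 4 = 0 (A = D = 0, B = 1, C = 4)
A = np.zeros((1, 1)); B = np.eye(1); C = 4 * np.eye(1); D = np.zeros((1, 1))
print(solve_quadratic(A, B, C, D))   # -2 (the other root comes from the other eigenvector)
```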

Consider an example.

EXAMPLE 20.4.1. Let C(z) be an n × n analytic matrix function on Ω with det C(z) ≢ 0, and assume that the eigenvalues of C(z) are analytic functions. [This will be the case if, for instance, C(z) has an upper triangular form.] Assume in addition that C(z) has n distinct eigenvalues for every z ∈ Ω. Consider the equation

X^2 = C(z).    (20.4.3)

Here

$$T(z) = \begin{bmatrix} 0 & I \\ C(z) & 0 \end{bmatrix}$$

and it is easily seen that det(λI − T(z)) = det(λ^2 I − C(z)). So λ_0 is an eigenvalue of T(z_0) if and only if λ_0^2 is an eigenvalue of C(z_0). It follows that the first exceptional set of T(z) is contained in the set S = {z ∈ Ω : det C(z) = 0}. As for every z ∈ Ω \ S the matrix T(z) has 2n distinct eigenvalues, it follows that the second exceptional set of T(z) is also contained in S. By Theorem 20.4.1, every solution X_0 of (20.4.3) with z_0 ∈ Ω \ S can be extended to a family of solutions X(z) of (20.4.3) that is meromorphic on Ω except possibly for algebraic branch points in S. □

In addition, let us indicate a case when an analytic extension of a solution of (20.4.1) is always possible.

Theorem 20.4.2
Let X_0 be a solution of (20.4.1) with z = z_0 ∈ Ω. Furthermore, assume that the T(z_0)-invariant subspace Im [I; X_0] is spectral. Then there exists an m × n matrix function X(z) with the properties described in Theorem 20.4.1 that, in addition, is analytic in a neighbourhood of z_0.
The proof of Theorem 20.4.2 is obtained in the same way as the proof of Theorem 20.4.1, but using Theorem 19.4.3 in place of Theorem 19.4.1.
In connection with Theorem 20.4.2, note the following fact. Assume that m = n. If X_1 and X_2 are solutions of (20.4.1) such that

σ(A(z_0) + B(z_0)X_1) ∩ σ(A(z_0) + B(z_0)X_2) = ∅

then both T(z_0)-invariant subspaces

M_1 = Im [I; X_1],  M_2 = Im [I; X_2]

are spectral. Indeed, the restriction of T(z_0) to M_i is similar to A(z_0) + B(z_0)X_i, so σ(T(z_0)|_{M_1}) ∩ σ(T(z_0)|_{M_2}) = ∅. In particular, M_1 ∩ M_2 = {0}. As dim M_1 = dim M_2 = n, it follows that M_1 ∔ M_2 = ℂ^{2n}. (Here we use the assumption that m = n.) Hence both M_1 and M_2 are spectral.
The following example shows that not every solution of equation (20.4.1) can be continued analytically as in Theorem 20.4.1. (Of course, it is then necessary that z_0 ∈ S_1 ∪ S_2.)

EXAMPLE 20.4.2. Consider the scalar equation

x^2 = z.    (20.4.4)

The solution x = 0 of (20.4.4) with z_0 = 0 cannot be continued analytically. □

20.5 EXERCISES
20.1 Let

Find the analytic continuation (as in Theorem 20.1.1) of the factorization

What are the poles of this analytic continuation?


20.2 Let L(λ, z) be a monic n × n matrix polynomial of degree l whose coefficients are analytic on Ω, and assume that for every z ∈ Ω, det L(λ, z) has nl distinct zeros. Prove that for every factorization L(λ, z_0) = L_1(λ)L_2(λ), where L_1(λ) and L_2(λ) are monic matrix polynomials, there exist monic matrix polynomials L_1(λ, z), L_2(λ, z) whose coefficients are analytic on Ω and such that L(λ, z) = L_1(λ, z)L_2(λ, z) with L_j(λ, z_0) = L_j(λ).

20.3 Show that if the polynomial L(λ, z) of Theorem 20.1.1 is scalar, then the analytic continuations of L_1(λ), …, L_r(λ) do not have poles (so S = ∅ in the notation of Theorem 20.1.1).
20.4 Let L(λ, z) be a monic matrix polynomial whose coefficients are circulant matrices analytic on Ω. Prove that the analytic continuation of every factorization L(λ, z_0) = L_1(λ) ⋯ L_r(λ) (as in Theorem 20.1.1) has no poles in Ω.
20.5 Prove that every factorization of a monic scalar polynomial with coefficients depending analytically on z ∈ Ω is sequentially nonisolated. (Hint: Use Exercise 19.9.)
20.6 Find the first and second exceptional sets for the following rational matrix functions depending analytically on a parameter z:

20.7 Let W(λ, z) be as in Exercise 20.6 (a). Find the analytic continuations (as in Theorem 20.3.1) of all minimal factorizations of the rational matrix function W(λ, z_0).
20.8 Let W(λ, z) be a rational matrix function that satisfies the hypotheses of Theorem 20.3.1. Assume that for some z_0 ∈ Ω, W(λ, z_0) has δ distinct zeros and δ distinct poles, where δ is the maximum of the McMillan degrees of W(λ, z) for z ∈ Ω. Prove that every minimal factorization

admits an analytic continuation into a neighbourhood of z_0; that is, there exist rational matrix functions W_1(λ, z), …, W_r(λ, z) that are analytic in z on a neighbourhood 𝒰 of z_0 such that

is a minimal factorization for every z ∈ 𝒰, and W_j(λ, z_0) = W_j(λ), j = 1, …, r.
20.9 Let

(1)

be a matrix equation, where A(z), B(z), C(z), and D(z) are analytic matrix functions (of appropriate sizes) on a domain Ω. Assume that all eigenvalues of the matrix

are distinct, for every z ∈ Ω. Prove that, given a solution X_0 of (1) with z = z_0 ∈ Ω, there exists an analytic matrix function X(z) on Ω with X(z_0) = X_0 such that X(z) is a solution of (1) for every z ∈ Ω.
20.10 We say that a solution X_0 of (1) with z = z_0 is sequentially nonisolated if there exist a sequence {z_m} such that z_m → z_0 as m → ∞, and a sequence {X_m}, m = 1, 2, …, such that

for m = 1, 2, …, which satisfies

Prove that if the matrix

is nonderogatory, then every solution of (1) with z = z_0 is sequentially nonisolated.
20.11 Give an example of a solution of (1) that is sequentially isolated.
Notes to Part 4

Chapter 18. This chapter is an introduction to the basic facts on analytic families of subspaces. The main result is Theorem 18.3.1, which connects the local and global properties of an analytic family of subspaces. This result (in a more general framework) appeared first in the theory of analytic fibre bundles [Grauert (1958), Allan (1967), Shubin (1979)]. Here we follow Gohberg and Leiterer (1972, 1973) in the proof of this theorem.
The result of Theorem 18.2.1 goes back to Shmuljan (1957) [see also Gohberg and Rodman (1981)]. The proof of Theorem 18.2.1 presented here is from the authors' book (1982). The results of Section 18.6 seem to be new. In the case of a function λI − A, where A is a bounded linear operator acting in an infinite dimensional Banach space, the result of Theorem 18.6.2 was proved in Saphar (1965).
Chapter 19. The starting point for the material in this chapter (Theorem
19.1.1) is taken from the book by Baumgärtel (1985). Theorem 19.5.1 was
proved in Porsching (1968). The analytic extendability problem for invariant
subspaces is probably treated here for the first time.
Chapter 20. We consider in this chapter some of the applications dealt
with in Chapters 5, 7, and 17, but in the new circumstances when the
matrices involved depend analytically on a complex parameter. All the
results (except those in Section 20.1) seem to be new. In Section 20.1 we
adapt and generalize the results developed in Chapter 5 of Gohberg,
Lancaster, and Rodman (1982). Example 20.1.1 is Example 20.5.4 of the
authors' book (1982).

Appendix

Equivalence of
Matrix Polynomials

To make this work more self-contained, we present in this appendix the basic facts about equivalence of matrix polynomials that are used in the main body of the book. Two concepts of equivalence are discussed. For the first of these, two matrix polynomials A(λ) and B(λ) are said to be equivalent if one is obtained from the other by premultiplication and postmultiplication with square matrix polynomials having constant nonzero determinant. Elementary divisors (or, alternatively, invariant polynomials) form the full set of invariants for this concept of equivalence, and the Smith form (which is diagonal) is the canonical form. This equivalence is studied in detail in Sections A.1–A.4.
The second concept of equivalence is the strict equivalence of linear matrix polynomials λA + B. This means that λA_1 + B_1 = P(λA_2 + B_2)Q for some invertible matrices P and Q. For strict equivalence the full set of invariants comprises minimal column indices, minimal row indices, elementary divisors, and elementary divisors at infinity. The Kronecker form (which is block diagonal) is the canonical form. A thorough treatment of strict equivalence is presented in Sections A.5–A.7. The canonical form for equivalence of matrix polynomials is a natural prerequisite for this presentation.

A.1 THE SMITH FORM: EXISTENCE

In this and subsequent sections we consider matrix polynomials A(λ) = Σ_{j=0}^{k} A_j λ^j, where the A_j are m × n matrices whose entries are complex numbers (so that we admit the case of rectangular matrices A_j). Of course, the sizes of all the A_j must be the same. Two m × n matrix polynomials A(λ) and B(λ) are said to be equivalent if

A(λ) = E(λ)B(λ)F(λ)    (A.1.1)

for some matrix polynomials E(λ) and F(λ) of sizes m × m and n × n, respectively, with constant nonzero determinants (i.e., independent of λ).
We use the symbol A(λ) ∼ B(λ) to mean that A(λ) and B(λ) are equivalent.
It is easy to see that ∼ is an equivalence relation, that is: (a) A(λ) ∼ A(λ) for every matrix polynomial A(λ); (b) A(λ) ∼ B(λ) implies B(λ) ∼ A(λ); and (c) A(λ) ∼ B(λ) and B(λ) ∼ C(λ) imply A(λ) ∼ C(λ). Indeed, if A(λ) = B(λ), then (A.1.1) holds with E(λ) = I_m, F(λ) = I_n. Further, assume that (A.1.1) holds for matrix polynomials A(λ) and B(λ). As det E(λ) = const ≠ 0, the formula for the inverse matrix in terms of cofactors implies that E(λ)^{−1} is a matrix polynomial as well, and since det E(λ)^{−1} = (det E(λ))^{−1}, it follows that det E(λ)^{−1} is also a nonzero constant. Similarly, F(λ)^{−1} is a matrix polynomial for which det F(λ)^{−1} is a nonzero constant. Now we have

B(λ) = E(λ)^{−1}A(λ)F(λ)^{−1},

which means that B(λ) ∼ A(λ). Finally, let us check (c). We have

A(λ) = E_1(λ)B(λ)F_1(λ),  B(λ) = E_2(λ)C(λ)F_2(λ),

where E_1(λ), F_1(λ), E_2(λ), F_2(λ) have constant nonzero determinants. Then A(λ) = E_1(λ)E_2(λ)C(λ)F_2(λ)F_1(λ), and the determinants det(E_1(λ)E_2(λ)) and det(F_2(λ)F_1(λ)) are constant and nonzero, so A(λ) ∼ C(λ).

The central result on equivalence of matrix polynomials is the Smith form, which describes the simplest matrix polynomial in each equivalence class, as follows.

Theorem A. 1.1
An m x n matrix polynomial is equivalent to a unique ra x « matrix
polynomial where

is a diagonal polynomial matrix with monic scalar polynomials such


that d is divisible by

In other words, for every matrix polynomial ^4(A) there exist matrix
polynomials with constant nonzero determinants such that
648 Appendix

has the form (A. 1.2), and this form is uniquely determined by The
matrix polynomial of (A. 1.2) is called the Smith form of and
plays an important role in the analysis of matrix polynomials. Note that
) from (A.1.3) are not unique in general. Note also that the
zeros on the main diagonal in are absent in case has full rank
for some [In particular, this happens if is an n x n matrix
polynomial with leading coefficient /.]

Proof of Theorem A. 1.1 (First Part). Here we prove the existence of a


of the form (A. 1.2) that is equivalent to a given . We use the
following elementary transformations of a matrix polynomial of size
m x n: (a) interchange two rows, (b) add to some row another row
multiplied by a scalar polynomial, and (c) multiply a row by a nonzero
complex number, together with the three corresponding operations on
columns.
Note that each of these transformations is equivalent to the multiplication
of by an invertible matrix as follows. Interchange of rows (columns) i
and j in is equivalent to multiplication on the left (right) by

Adding to the /th row of A( A) they'th row multiplied by the polynomial


is equivalent to multiplication on the left by
The Smith Form: Existence 649

the same operation for columns is equivalent to multiplication on the right


by the matrix

Finally, multiplication of the /th row (column) in A(\) by a number a 5^0 is


equivalent to the multiplication on the left (right) by

[Empty spaces in (A.1.4)-(A.1.7) are assumed to be zeros.] Matrices of the


form (A.1.4)-(A.1.7) are be called elementary. It is apparent that the
determinant of any elementary matrix is a nonzero constant. Consequently,
it is sufficient to prove that, by applying a sequence of elementary trans-
formations, every matrix polynomial can be reduced to a diagonal
form: diag[ . . . ,0], where are scalar
polynomials such that the quotients / = 1, 2, . . . , r - 1, are
also scalar polynomials. We prove this statement by induction on m and n.
For m — n = I it is evident.
Consider now the case m = 1 , n > 1 ; that is

If all are zeros, there is nothing to prove. Suppose that not all the
are zeros, and let a- be a polynomial of minimal degree among the
nonzero entries of We can suppose that ;0 = 1. [Otherwise, inter-
change columns in By elementary transformations it is possible to
650 Appendix

replace all the other entries in by zero. Indeed, let fl Divide


fl where is the remainder and
its degree is less than the degree of . Add to the y'th
column the first column multiplied by . Then r will appear in the
yth position of the new matrix. If then put in the firs
position, and if there is still a nonzero entry [different from , apply the
same argument again. Namely, divide this (say, the A:th) entry by and
add to the fcth column the first multiplied by minus the quotient of the
division, and so on. Since the degrees of the remainders decrease, after a
finite number of steps [not more than the degree of we find that all
the entries in our matrix, except the first, are zeros. This proves Theorem
A. 1.1 in the case m = 1, n > 1. The case m > 1, n = 1 is treated in a similar
way.
Assume now that m, n > 1, and assume that the theorem is proved for
matrices with m - 1 rows and n - 1 columns. We can suppose that the (1,1
entry of is nonzero and has the minimal degree among the nonzero
entries of [Indeed, if , we can reach this condition by
interchanging rows and/or columns in , Theorem A. 1.1 is
trivial.] With the help of the procedure described in the previous paragraph
[applied for the first row and the first column of by a finite number of
elementary transformations we reduce to the form

Suppose that for some /, and is not divisible by


(without remainder). Then add to the first row the /th row and apply the
above arguments again. We obtain a matrix polynomial of the form

where the degree of is less than the degree of . If there still


exists some entry 2) A ) that is not divisible by a j ( A ) , repeat the same
procedure once more, and so on. After a finite number of steps we obtain
the matrix
The Smith Form: Uniqueness 651

where every is divisible by Multiply the first row (or column)


by a nonzero constant to make the leading coefficient of the polynomial
equal to 1. Now define the (m - 1) x (n - 1) matrix polynomial

and apply the induction hypothesis for to complete the proof of


existence of a Smith form

A.2 THE SMITH FORM: UNIQUENESS

We need some preparations to prove the uniqueness of the Smith form


in Theorem A.1.1. Let A = be an m x « matrix with complex
entries. Choose k rows, 1 < / , < • • • < ik < m, and k columns, 1 < / j < • • • <
jk < n, in A, and consider the determinant det °f tne kxk
submatrix of A formed by these rows and columns. This determinant is
called a minor of A. Loosely speaking, we can say that this minor is of order
k and is composed of the rows i1,. . . , ik and columns / , , . . . , jk of A. It is
denoted by A( . We establish the important Binet-Cauchy for-
mula, which expresses the minors of a product of two matrices in terms of
the minors of each factor, as follows.

Theorem A. 2.1
Let A = BC, where B is a m x p matrix, and C is a p x n matrix. Then for
every k, \<k< min(m, n) and every minor A( .I of order k we
have

where the sum is taken over all sequences {«^}*=1 of integers satisfying
I =£ a t < a2 < • • • < ak < p. In particular, if k > p, then the sum on the
right hand side of (A. 2.1) is empty and the equation is interpreted as
652 Appendix

Note that for k = 1 formula (A.2.1) is just the rule of multiplication of


two matrices. On the other hand, if m=p — n and k = n, then (A.2.1)
gives the familiar multiplication formula for determinants: det(BC) =
det B • det C.

Proof. As the rank of A does not exceed p, we have


as long as k>p. So we can assume £</?. For simplicity of notation
assume also iq=jq = q, q = 1,. . . , k. Letting A =
C = [Cy]£'/*= i, we may write ^4( ) in the form

and using the linearity of the determinant as a function of each column, this
expression is easily seen to be equal to

where the sum is taken over all /c-tuples of integers (a l 5 . . . , ak) such that
l ^ a t < p . (Here we use the notation B\ I to denote
\a
ta 1*9=1 even when the the sequence =1 is not increasing, or when
it contains repetitions of numbers.) If not all aly a2,. . . , ak are different,
then clearly B\ I ^ O . Ignoring these summands in
(A.2.2), split the remaining terms into groups of &! terms each in such a way
that the summands in the same group differ only in the order of
indices a,, a 2 ,. . . , ak. We obtain:
The Smith Form: Uniqueness 653

where the internal summation is over all permutations of {1,2, . . . , & } .


Denoting by the sign of is 1 if TT is even and -1 if TT is odd), we
find that the right-hand side of (A.2.3) is

and the theorem is proved. D

Returning to matrix polynomials observe that the minors of a matrix


polynomial A (A) are (scalar) polynomials, so we can speak about their
greatest common divisors.

Theorem A. 2.2
Let A( A) be an m x n matrix polynomial. Let be the greatest common
divisor (with leading coefficient 1) of the minors ofA(\) of order k, if not all
of them are zeros, and let = 0 if all the minors of order k of are
zeros. Let p = l and = diag[ , . . . , 0] be a
Smith form of (which exists by the part of Theorem A.I.I already
proved). Then r is the maximal integer such that

Proof. Let us show that if -^i(A) and are equivalent matrix


polynomials, then the greatest common divisors of the
minors of order k of respectively, are equal. Indeed, we
have

for some matrix polynomials E( A) and F( A) with constant nonzero determi-


nants. Apply Theorem A. 2.1 twice to express a minor of of order k as
654 Appendix

a linear combination of minors of .<4 2 (A) °f tne same order. Therefore, it


follows that p is a divisor of . But the equation

implies thatp^ j(A) is a divisor of In the same


way one shows that the maximal integer r, such that coincides
with the maximal integer r2 such that
Now apply this observation for the matrix polynomials and It
follows that we have to prove Theorem A. 2. 2 only in the case that
itself is in the diagonal form A(A) = D(\). From the structure of it is
clear that

is the greatest common divisor of the minors of of order s. So


1,.. . ,r, and (A. 2.4) follows

Theorem A. 2. 2 immediately implies the uniqueness of the Smith form


(A. 1.2). Indeed, Theorem A. 2. 2 shows that the number r of not identi-
cally zero entries in the Smith form of A(\), as well as the entries
themselves, can be expressed explicitly in terms of
that is, r and are uniquely determined by

A. 3 INVARIANT POLYNOMIALS, ELEMENTARY DIVISORS, AND


PARTIAL l MULTIPLICITIESes

In this section we study various invariants appearing in the Smith form for
the matrix polynomials. Let A(\) be an m x n matrix polynomial with the
Smith form D(\). The diagonal elements are called
the invariant polynomials of . The number r of invariant polynomia
can be defined as

Indeed, since E(A) and F(A) from (A. 1.3) are invertible matrices for every
we have rank oNOn the other hand, its
clear that rank is not a zero of one of the invariant polyno-
mials, and rank otherwise. So (A.3.1) follows.
The set of invariant polynomials forms a complete invariant for equiva-
lence of matrix polynomials of the same size.

Theorem A.3.1
Matrix polynomials and ) of the same size are equivalent if and only
if the invariant polynomials of and B are the same.
Invariant Polynomials, Elementary Divisors, and Partial Multiplicities 655

Proof. Suppose the invariant polynomials of A(\) and B(A) are the
same. Then their Smith forms are equal:

and

where Since and


F 2 (A) are matrix polynomials with constant nonzero determinants, the same
is true for and, co n and aonsequently and So

Conversely, suppose , where det = const ^0,


del ) = const ^0. Let D(A) be the Smith form for

Then sdi also the Smith formfor

By the uniqueness of the Smith form for [more exactly, by the


uniqueness of the invariant polynomials of it follows that the
invariant polynomials of ) are the same as those of

We now take advantage of the fact that the polynomial entries of A (A)
and its Smith form D( A) are over <p to represent each invariant polynomial
d,(A) as a product of linear factors:

where are different complex numbers and a (1 , . . . , aik are


positive integers. The factors / = 1, . . . , kf, i = 1, . . . , r are
called the elementary divisors of
Some different elementary divisors may contain the same polynomial
(this happens, for example, in case d for some /);
the total number of elementary divisors of A(\) is thus
The degrees atj of the elementary divisors form an important characteris-
tic of the matrix polynomial Here we mention only the following
simple property of the elementary divisors, whose verification is left to the
reader.
656 Appendix

Proposition A.3.2
Let A(\) be an n x n matrix polynomial such that det 0. Then the
sum of degrees of its elementary divisors coincides
with the degree of det

Note that the knowledge of the elementary divisors of A (A) and the
number r of its invariant polynomials is sufficient to
construct d^ In this construction we use the fact that is
divisible by ^_,(A). Let p be all the different complex numbers
that appear in the elementary divisors, and let
(/ = 1, . . . ,p) be the elementary divisors containing the number A / ? and
ordered in the descending order of the degrees an > • • • > a ( k > 0. Clearly,
the number r of invariant polynomials must be greater than or equal to
max{&,, . . . , kp}. Under this condition, the invariant polynomials
are given by the formulas

where we put t.
The following property of the elementary divisors is used subsequently.

Proposition A.3.3
Let A(\) and B(\) be matrix polynomials, and let
a block-diagonal matrix polynomial. Then the set of elementary divisors of
C(A) is the union of the elementary divisors of A(\) and B(A).

Proof. Let D,(A) and D 2 (A) be the Smith forms of A(\) and B(\),
respectively. Then clearly

for some matrix polynomials £(A) and F(A) with constant nonzero deter-
minant. Let
the elementary divisors of /^(A) and D2(\), respectively, corresponding to
the same complex number A0. Arrange the set of exponents
« ! , . . . , ap, j3,,. . . , fiq, in a nonincreasing order:

where 0< y\ ^ • • • < yp+q- Using Theorem A.2.2 it is clear that in the Smith
form D = diag[ of diag[ the in-
variant polynomial d is divisible by but not by
Invariant Polynomials, Elementary Divisors, and Partial Multiplicities 657

is divisible by ' but not by (


and so on. It follows that the elementary divisors of

[and thus also those of corresponding to 0 , are just


and Proposition A. 3. 3 is proved.

In the rest of this section we assume that (as in Proposition A. 3. 2) the


matrix polynomial is square and that the determinant of is not
identically zero. In this case, complex numbers 0 such that det,
are called the eigenvalues of A(X). Clearly, the set of eigenvalues is finite [it
contains not more than degree (det A(\)) points], and is an eigenvalue of
A(\) if and only if there is an elementary divisor of A(\) of type (A - A0)°.
Let A0 be an eigenvalue of A (A), and let be all
the elementary divisors of A( A) that are divisible by The exponents
a,, . . . , a.p are called the partial multiplicities of A(\) corresponding to A0.
Recall that some of the numbers a,, . . . , ap may be equal; the number a;
appears in the list of partial multiplicities as many times as there are
elementary divisors . The partial multiplicities play an
important role in the following representation of matrix polynomials.

Theorem A.3.4
Let A(\) be an nx n matrix polynomial with det Then for every
admits the representation

where and are matrix polynomials invertible at and


KI ^ • • • < Kn are nonnegative integers, which coincide (after striking off
zeros) with the partial multiplicities of A(\) corresponding to

Proof. The existence of representation (A. 3. 2) follows easily from the


Smith form. Namely, let = diag[d be the Smith form
of A (A), and let

where det = const ^ 0, det = const ^0. Represent each in


the form
658 Appendix

where and K,>0. Since is divisible by we have


K. > K . _ J . Now (A. 3. 2) follows from (A. 3. 3), where

It remains to show that the K, coincide (after striking off zeros) with the
degrees of elementary divisors of A (A) corresponding to A0. To this end we
show that any factorization of A (A) of type (A. 3. 2) with KJ ^ • • • < Kn
implies that K; is the multiplicity of A0 as a zero of d^( A), j = 1, . . . , n, where
D(A) = diag[d,(A), . . . , dn(\)] is the Smith form of A(\). Indeed, let

where E( A) and F( A) are matrix polynomials with constant nonzero deter-


minants. Comparing with (A.
(A.3.2),
3. 2), write

diag

where are matrix polynomials in-


vertible at A0. Applying Theorem A. 2.1, we obtain

where . m is a minor of order /0 of


[resp. diag F(A)], and the sum in (A.3.5) is
taken over a certain set of triples (/, j, k). It follows from (A.3.5) and the
condition K, =£ • • • < *„, that A0 is a zero of the product
of multiplicity at least K, + K2 + ••• + *,-. Rewrite (A.3.4) in the form

and apply Theorem A. 2.1 again. Using the fact that and
(F A (A))~ ! are rational matrix functions that are defined and invertible at
0 , and that dt( A) is a divisor of we deduce that

where 4>, (A) is a rational function defined at is not a pole of


Equivalence of Linear Matrix Polynomials 659

O. (A)). It follows that is a zero of of multiplicity


exactly KJ + K2 + • • • + K,. , i = 1 , . . . , n. Hence KJ is exactly the multiplicity
of as a zero of ) for / = 1,. . . , n; that is, the nonzero numbers (if
any) among K,, . . . , Kn are the partial multiplicities of A(\) corresponding
to

As a consequence of Theorem A.3.4, note that there are nonzero K, in


the representation (A.3.2) if and only if is an eigenvalue of

A.4 EQUIVALENCE OF LINEAR MATRIX POLYNOMIALSIALS

We study here equivalence and the Smith form for matrix polynomials of
type /A — A, where A is an n x n matrix. It turns out that for such matrix
polynomials the notion of equivalence is closely related to similarity.

Theorem A.4.1
/A — A ~ /A - B if and only if A and B are similar.

To prove this theorem, we have to introduce division of matrix poly-


nomials.
We restrict ourselves to the case when the dividend is a general matrix
polynomial and the divisor is a matrix polynomial of type
/A + X, where A' is a constant n x n matrix. In this case the following
representation holds:

where Qr(\) is a matrix polynomial, which is called the right quotient, and
Rr is a constant matrix, which is called the right remainder, on division of

where Q,(\) is the left quotient, and the constant matrix R, is the left
remainder.
Let us check the existence of representation (A.4.1); (A.4.2) can be
checked in a similar way. If / = 0 [i.e., A( A) is constant], put = 0 and
R . So we can suppose / > 1. Write Comparing
the coefficients of powers of A on the right- and left-hand sides of (A.4.1),
we can rewrite this relation as follows:
660 Appendix

Clearly, these relations define r, sequentially.


It follows from this argument that the left and right quotient and
remainder are uniquely denned.

Proof of Theorem A.4.1. In one direction this result is immediate: if


A = SBS~l for some nonsingular S, then the equality /A — A = S(I\ —
B)S~l proves the equivalence of and . Conversely, suppose
- f l . Then for some matrix polynomials E(A) and F(A) with
constant nonzero determinant we have

Suppose that division of on the left by /A - A and of F( A) on the


right by /A - B yield

Substituting in the equation

we obtain

whence

Since the degree of the matrix polynomial on the right-hand side here is 1 , it
follows that 5( A) = T( A); otherwise, the degree of the matrix polynomial on
the left is at least 2. Hence

so that

It remains only to prove that E0 is nonsingular. To this end divide E(A)


on the left by IX- B:

Then, using (A.4.3) and


and (A.4.4), we have
Equivalence of Linear Matrix Polynomials 661

Hence the matrix polynomial in the square brackets is zero, and E0R0 - I. It
follows that EQ is nonsingular. D
The definitions of eigenvalues and partial multiplicities made in the
preceding section can be applied to an n x n matrix polynomial of the form
—A. On the other hand, as an n x n matrix (or as a transformation
represented by this matrix in the standard basis el,. . . , en), A has eigen-
values and partial multiplicities as defined in Sections 1.2 and 2.2. It is an
important fact that these notions for IX — A and for A coincide.

Theorem A.4.2
A complex number is an eigenvalue of IX- A if and only if it is an
eigenvalue of A. Moreover, the partial multiplicities of IX — A corresponding
to its eigenvalue A0 coincide with the partial multiplicities of A corresponding
to A 0 .
Proof. The first statement follows from the definitions: 0 is an eigen-
value of IX — A if and only if det(/A — A) = 0, which is exactly the definition
of an eigenvalue of A. For the proof of the second statement, we can
assume that A is in the Jordan form. Further, using Proposition A.3.3, we
reduce the proof to the case when A is a single Jordan block of size n x n:

The partial multiplicity of A is clearly n, corresponding to the eigenvalue A0.


To find the partial multiplicities of IX - A, observe that

has a nonzero minor of order n - 1 that is independent of A (namely, the


662 Appendix

minor formed by crossing out the first column and the last row in
As det Theorem A.2.2 implies that the Smith form of
/A - A is diag[l, 1, . . . , 1 So the only partial multiplicity of
/A — A is n, which corresponds to 0. D

We also need the following connection between the partial multiplicities


of a matrix A and submatrices of

Theorem A.4.3
Let A be an n x n matrix. Let a, ^ • • • > am be the partial multiplicities of an
eigenvalue 0 of A, and put a. = 0 for i = m + 1, . . . , n. Then
is the minimal multiplicity of as a zero of the determinant
{considered as a polynomial in A) of any p* p submatrix in IX— A.

Proof. By Theorems A. 4. 2 and A. 3. 4 we have the following repre-


sentation:

where are matrix polynomials invertible for A = A 0 . Now


the Binet-Cauchy formula (Theorem A. 2.1) implies that the multiplicity of
0 as a zero of the determinant of any p x p submatrix in A is at least
Rewriting (A.4.5) in the form

and using the Binet-Cauchy formula again, we find that

where are certain p x p submatrices in /A - A, and


<p,(A), / = 1, . . . , s are rational functions defined at is not a pole of
any <ft(^)]- It follows from equation (A.4.6) that at least one of the minors
has a zero at A0 with multiplicity exactly equal to

A.S STRICT EQUIVALENCE OF LINEAR MATRIX POLYNOMIALS:


REGULAR CASE

Let A + \B and Al + \Bl be two linear matrix polynomials of the same size
m x n. We say that A + \B and A j + Afi, are strictly equivalent if there exist
Strict Equivalence of Linear Matrix Polynomials 663

invertible matrices P and Q of sizes m x m and n x n, respectively, indepen-


dent of A, such that, for all we obtain

We denote strict equivalence by l + AZJj. It is easily seen that


strict equivalence is indeed an equivalence relation, that is, that the three
following properties hold: A + \B —A + XB for every polynomial A + \B.
, then als
and then
Obviously, strict equivalence of linear matrix polynomials implies their
equivalence. The converse is not true in general, as we see later in this
section.
In this and subsequent sections we find the invariants of strict equival-
ence, as well as the simplest representative (the canonical form) in each
class of strictly equivalent linear matrix polynomials. This section is devoted
to the regular case. That is, when A and B are square matrices and
det(y4 + \B) does not vanish identically. In particular, the polynomials
A + \B with squares matrices A and B and d e t B ^ O are regular. This
hypothesis is used in our first result.

Proposition A. 5.1
Two regular polynomials A + \B and A} + \B^ with det B 7^0, det Bl^0
are strictly equivalent if and only if they have the same invariant polynomials
(or, equivalently , the same elementary divisors).

The proof is easily obtained by combining Theorems A. 3.1 and A. 4.1.


However, the result of Proposition A. 5.1 is false, in general, if we omit the
conditions det B ^0, det Bl ^ 0 and require only that the polynomials are
regular.

EXAMPLE A. 5.1. Let

The polynomials

and

are obviously regular, and both have the Smith form , that is, the
same invariant polynomials. However, they cannot be strictly equivalent
because B and Bl have different ranks. were
664 Appendix

strictly equivalent, we would have B = PB^Q for some invertible P and Q,


and this would imply the equality of the ranks of B and 5, .) D

To extend the result of Proposition A. 5.1 to the class of all regular


polynomials A + \B, we must introduce the elementary divisors at infinity.
We say that A p is an elementary divisor at infinity of a regular polynomial
is an elementary divisor of \A + B. Clearly, there exist
elementary divisors at infinity of A + \B if and only if del B = 0.

Theorem A. 5.2
Two regular polynomials A + \B and A , + \B^ are strictly equivalent if and
only if the elementary divisors of A + \B and Al + \B^ are the same and
their elementary divisors at infinity are the same.

Proof. Assume that A + \B and Al + Afl, are strictly equivalent. Then


obviously A + \B and Al + \B} are equivalent, so by Theorem A. 3.1 they
have the same elementary divisors. Moreover, \A + B and A/4, + Bl are
equivalent as well, so, by the same Theorem A. 3.1, A + \B and .A, + \Bl
have the same elementary divisors at infinity.
To prove the second part of the theorem, we introduce homogeneous
linear matrix polynomials. Thus we consider the polynomial pA + \B where
/a, A E (p. Note that every minor m(\, IJL) of order r of pA + \B is a
polynomial of two complex variables /A and A that is homogeneous of order r
in the sense that

for every a, . For a fixed r, 1 < r ^ n, le be the greatest


common divisor in the set of homogeneous polynomials of all the nonzero
minors m^A, /A), . . . , ms(\, /A) of order r of tt/4 + \B. In other words,
pr(\, fji) is a homogeneous polynomial that divides each m,(A, /z), and if
q(\, /t) is another homogeneous polynomial with this property, then
divides p Clearly, divides p The poly-
nomials p t (A, /n), . . . , pn(\, M) are called the invariant polynomials of
As each minor is a homogeneous polyno-
mial in A and /A, it admits factorizations of the form

for some complex numbers o^ and aj. (In fact, the nonzero a j values are the
reciprocals of the nonzero a; values.) Using factorizations of this kind, it is
easily seen that l) are the invariant polynomials of
A + \B, whereas p^l, /A), . . . , p n (l, pt) are the invariant polynomials
of A + B.
Strict Equivalence of Linear Matrix Polynomials 665

Returning to the proof of Theorem A. 5. 2, assume that the elementary


divisors of A + \B and A{ + \B}, including those at infinity, are the same.
This means that the invariant polynomials of A + \B and Al + A/?j are the
same, and so are the invariant polynomials of p. A + B and pA^ + B\. Since
a homogeneous polynomial /?(A, /u,) of A and /ti is uniquely defined by
p(A, 1) and p(l, p), it follows from the discussion in the preceding para-
graph that the invariant polynomials of pA + \B and of p,Al + \B\ are the
same. Now we make a change of variables:
where xly2 - x2y} ^0. Then the invariant polynomials of and of
jiAl + \Bl are again the same, where A = y2A + x2B, B = y}A + x}B,
A i = y2Al + x2Bl, Bl = ylAl + xlBl. As the polynomials A + \B and ^4, +
\Bl are regular, we can choose xl and yl in such a way that det B 7^0 and
det /?! 7^0. Apply Proposition A. 5.1 to deduce that A + \B and Al + \Bl
are strictly equivalent: PAQ — Al, PBQ = Bl for some invertible matrices P
and Q. Since

where and similarly for A } and Blf we obtain PAQ = /4,,


PBQ = #], and the strict equivalence of A + \B and A} + \B} follows. O

Theorem A. 5. 2 allows us to obtain the canonical form for strict equival-


ence of regular linear matrix polynomials, as follows.

Theorem A. 5.3
Every regular, linear, matrix polynomial A + \B is strictly equivalent to a
linear polynomial of the form

where Jk(h) is the k x k Jordan block with eigenvalue A. The linear poly-
nomial (/i.5.1) is uniquely determined by A + \B. In fact, A*1, . . . , A p are
the elementary divisors at infinity ofA + \B, whereas / = 1, . . . , q
are the elementary divisors of

Proof. Let A^ + XB^ be the polynomial (A.5.1). Using Proposition


A. 3. 3, we see immediately that = 1, . . . , q are the elementary
divisors of A , . . . , p are its elementary divisors at
infinity. If the strict equivalence claimed by the theorem holds, it follows
from Theorem A. 5. 2 that (A.5.1) is uniquely determined by A + \B, and
that A + \B must have the specified elementary divisors.
666 Appendix

It remains to prove that there is a strict equivalence of the required form.


Let c e (p be such that det(,4 + cB) ^ 0. Write A + \B = (A + cB) + ( A -
c)B, multiply on the left by (A + cB)~\ and apply a similarity transfor-
mation reducing (A + cB)~1B to the Jordan form. We obtain

where 70 is a nilpotent Jordan matrix (i.e., J'Q = 0 for some /) and Jl is an


invertible Jordan matrix.
Multiply the first diagonal block on the right-hand side of (A.5.2) by
(/ - c/0)~ . It is easily verified that

and since J is also nilpotent, is similar to a


matrix of the form

Multiply the second diagonal block on the right-hand side of (A.5.2) by


7J"1 and reduce J\l to its Jordan form by similarity. We find that

for some complex numbers A,, . . . , A and some positive integers

A.6 THE REDUCTION THEOREM FOR SINGULAR POLYNOMIALS

Consider now the singular polynomial A + \B, where A and B are m x n


matrices. Singularity means that either m ^ n or m = n but det(-4 + \B) is
identically zero. Let r be the rank of A + \B, that is, the size of the largest
minors in A + \B that do not vanish identically. Then either r < m or r < n
holds (or both).
Assume r<n. Then the columns of the matrix polynomial are
linearly dependent, that is, the equation

where x is an unknown vector, has a nonzero solution.


Let us check first that there is a vector polynomial 0 for which
(A.6.1) is satisfied. For this purpose we can use the Smith form D(A) of
A + \B in place of A + AB itself (see Theorem A. 1.1). But because of the
The Reduction Theorem for Singular Polynomials 667

assumption r<n, the last column of D(\) is zero. Hence = 0 is


satisfied with x = (0,. . . ,0,1).
The following example is important in the sequel.

EXAMPLE A.6.1. Let

be an e x (e + 1) linear matrix polynomial (e = 1, 2, . . .). We claim that the


minimal degree of a nonzero vector polynomial solution x(A) of the
equation

is e. Indeed, rewrite this equation in the form


* = 0, where * ; (A) is the yth coordinate of
x( A). So

and the minimal degree for x( A) (which is equal to e) is obtained by taking


to be a nonzero constant. D

Among all not identically zero polynomial solutions *(A) of (A. 6.1), we
choose one of least degree e and write

The following reduction theorem holds.

Theorem A. 6.1
I f e i s the minimal degree of a nonzero polynomial solution of (A. 6. 1), and if
e > 0, then A + \B is strictly equivalent to a linear matrix polynomial of the
form

where
668 Appendix

is an e x (e + 1) matrix, and the equation

has no nonzero polynomial solutions of degree less than e.

It is convenient to state and prove a lemma to be used in the proof of


Theorem A.6.1. For an m x n matrix polynomial U + AK, let

be a matrix of size m(i + 2) x n(i + 1) for / = 0 , 1 , 2 , . . . .

Lemma A.6.2
Assume that the rank of U + XV is less than n. Then e is the minimal degree
of nonzero polynomial solutions y( A) of

if and only if

and

Proof. Let be a nonzero polynomial solution of (A.6.6)


of the least degree. Then

or equivalently
The Reduction Theorem for Singular Polynomials 669

Not all the vectors yj are zero, and so (A.6.7) follows. Conversely, if (A.6.7)
holds, we may reverse the argument and obtain a nonzero polynomial
solution of (A.6.6) of degree e. D

Proof of Theorem A.6.1. The proof is given in three steps. In the first
step we show that

for suitable matrices A, B, D, and F, then we show that A + \B satisfies the


conclusions of Theorem A.6.1, and finally we prove that

(a) Let (A.6.2) be a vector polynomial satisfying (A.6.1):

where xf ¥=• 0. This is equivalent to

We claim that the vectors

are linearly independent. Assume the contrary, and let Axh (h = l) be the
first vector in (A.6.9) that is linearly dependent on the preceding ones:

By (A.6.8) this equation can be rewritten as follows:

that is, Bxl_l =0, where


670 Appendix

Furthermore, again by (A.6.8), we have

MHERE

Continuing the process and introducing the vectors

we obtain the equations

From (A.6.10) it follows that

is a nonzero solution of (A.6.1) with degree not exceeding h ~ l< e, which


is impossible. [The fact that this solution is not identically zero follows
because jtQ = jt 0 ^0; for if x0 were zero, then \~1x(\) would be a poly-
nomial solution of (A.6.1) of degree less than e.] Thus the vectors (A.6.9)
are linearly independent.
But then the vectors x0,. . . , xe are linearly independent as well. Indeed,
= = 0, and by the linear independence of
and since *0 ^0 we find that also

Now write A + \B in a basis in <p" whose first e + 1 vectors are


xQ, *!,. . . , xf and in a basis in (pm whose first e vectors are Axl,. . . , Axe.
In view of equations (A.6.8), the polynomial A + \B in the new bases has
the form

for some D, F, A, and B.


In the second step we show that the equation (^4 + \B)x = 0 has no
nonzero polynomial solutions of degree less than e. Note that

is obtained from
The Reduction Theorem for Singular Polynomials 671

by a suitable permutation of rows and columns. By Lemma A.6.2 the rank


of (A.6.11) is equal to en; that is, the columns of (A.6.11) are linearly
independent. By the same lemma, taking into account Example A.6.1,
rank M ; that is, the square matrix
M € _,[L e ] is invertible. As the columns of (A.6.12) are linearly independent
as well, we find that the columns of are linearly independent,
that is, rank Using Lemma A.6.2 again, we
find that (A + \B)x = 0 has no solutions of degree less than e.
In the third step, replacing

for suitable matrices X and Y, we see that Theorem A. 6.1 will be complete-
ly proved if we can show that the matrices X and Y can be chosen so that
the matrix equation

holds.
We introduce a notation for the elements of D, F, X and also for the rows
of Y and the columns of A and B :

Then the matrix equation (A.6.13) can be replaced by a system of scalar


equations that expresses the equality of the elements of the fcth column on
the right- and left-hand sides of (A.6.13). For fc = l,2, . . . , / t - e - l , we
obtain
672 Appendix

The left-hand sides of these equations are linear polynomials in A. The free
term of each of the first e — 1 of these polynomials is equal to the coefficient
of A in the next polynomial. But then the right-hand sides must also satisfy
this condition. Therefore, for fc = l , 2 , . . . , n — e — 1, we obtain

If (A.6.15) holds, then the required elements of X can obviously be


determined from (A.6.14).
It now remains to show that the system of equations (A.6.15) for the
elements of Y always has a solution for arbitrary d ik
fc = l,2, . . . , / z - e - l ) . Rewrite (A.6.15) in the form

where

and use the left invertibility of Mf_2[A + \B] (ensured by Lemma A. 6. 2) to


verify that (A.6.15) has a solution
, where the subscript "L" denotes a left
inverse. Theorem A. 6.1 is now proved completely. D

A.7 MINIMAL INDICES AND STRICT EQUIVALENCE OF LINEAR


MATRIX POLYNOMIALS (GENERAL CASE)

We introduce the important notion of minimal indices for linear matrix


polynomials. Let A + \B be an arbitrary linear matrix polynomial of size
m x n. Then the k polynomial columns that are
solutions of the equation
Minimal Indices and Strict Equivalence of Linear Matrix Polynomials 673

are called linearly dependent if the rank of the polynomial matrix formed
from these columns is less than k\ In that
case there exist k polynomials not all identically
zero, such that

Indeed, let

be the Smith form of X(\), where [resp. ] is an n x n (resp.


k x k) matrix polynomial with constant nonzero determinant, and

with nonzero polynomials d, As the rank r of X(\) is less


than k, the last column of D(\) is zero. One verifies that (A.7.2) is satisfied
with 1). If polynomials /7,(A)
(not all zero) with the property (A.7.2) do not exist, then the rank of X is k
and we say that the solutions are linearly independent.
Among all the polynomial solutions of (A.7.1) we choose a nonzero
solution *,(A) of least degree €l. Among all polynomial solutions x( A) of the
same equation for which ^(A) and x( A) are linearly independent, we take a
solution x2(\) of least degree e2. Obviously, e 1 <6 2 . We continue the
process, choosing from the polynomial solutions x(\) for which
x ) are linearly independent a solution x3( A) of minimal degree
c3, and so on. Since the number of linearly independent solutions of (A.7.1)
is always at most n, the process must come to an end. We obtain a
fundamental series of solutions of (A.7.1)

having the degrees

Note that it may happen that some degrees e,, . . . , ey are zeros. [This is the
case when (A.7.1) admits constant nonzero solutions.] In general, a funda-
mental series of solutions is not uniquely determined (to within scalar
factors) by the pencil A + \B. However, note the following.
674 Appendix

Proposition A. 7.1
Two distinct fundamental series of solutions always have the same series of
degrees

Proof. In addition to (A. 7. 3), consider another fundamental series of


solutions . with the degrees eltc2,.... Suppose that in
(A.7.4)

and similarly, in the series

Obviously, For every vector ) (/ = 1, . . . , ra,) there exists a


polynomial such that

for some polynomials p . (Otherwise, *,, jc 1? . . . , xni would be linearly


independent and one could replace xn + , by *,, which is of smaller degree,
contrary to the definition of xn +l.) Rewrite (A.7.5) in the form

where
m j matrix polynomial. are linearly independent, there
is a nonzero minor /(A) of order m, of . So for every A G (p
that is not a zero of one of the polynomials
rank of the matrix on the left-hand side of (A.7.6) is ml. Hence (A.7.6)
implies ml^nl. Interchanging the roles of Jt,(A) and *,(A), we find the
opposite inequality , we have and we can
repeat the above argument with n2 and m 2 in place of n, and m 1?
respectively, and so on. D

The degrees p of polynomials in any fundamental series of


polynomial solutions of (A. 7.1) are called the minimal column indices of
A + \B, As Proposition A. 7.1 shows, the number p of the minimal column
indices and the indices themselves do not depend on the choice of the
fundamental series. If there are no nonzero solutions of (A. 7.1) (i.e., the
rank of A + XB is equal to n), we say that the number of minimal column
indices is zero, in this case no such indices are defined.
We define the minimal row indices of A + \B as the minimal column
indices of
Minimal Indices and Strict Equivalence of Linear Matrix Polynomials 675

EXAMPLE A.7.1. Let Le be as in Example A.6.1. The polynomial Lf has the


single minimal column index e, whereas the minimal row indices are absent.
Indeed, as in Example A.6.1, observe that every nonzero polynomial
solution x( A) =

has the form

and a solution x(\) of minimal degree e is obtained by taking !.


Hence the first minimal column index of Le is e. As (A.7.8) shows, every
other solution JC(A) of (A.7.7) has the form , where
is the first coordinate of *(A). So jt(A) and Jc t (A) are linearly dependent,
which means that there are no more minimal column indices.
As the rows of Lf are linearly independent for every A, the minimal row
indices are absent.
Similarly, we conclude that the transposed polynomial LI has the single
minimal row index e and no minimal column indices. D

The importance of minimal indices stems from their invariance under


strict equivalence, as follows.

Proposition A.7.2
then the minimal column indices of the polynomials
are the same, and the minimal row indices of these
polynomials are also the same.

The proof is immediate: if P for invertible mat-


rices P and Q, then the solutions of = 0 are obtained from the
solutions of by multiplication by
which preserves linear dependence and independence and also implies that
jf(A) and y(\) have the same degree.
We are now in a position to state and prove the main result concerning
strict equivalence of linear matrix polynomials in general. We denote by Lf
the e x (e + 1) linear polynomial

and Ljis its transpose [which is an (e + 1) x e linear polynomial]. Then O uxt ,


676 Appendix

will denote the zero u x v matrix. As before, /^(A 0 ) represents the k x k


Jordan block with eigenvalue A0.

Theorem A. 7.3
Every m x n linear matrix polynomial A + \B is strictly equivalent to a
unique linear matrix polynomial of type

Here £j < • • • < ep a«d 171 < • • • ^ 17^ are positive integers; kl , . . . , kr and
l{, . . . , ls are positive integers; \l, . . . , \s are complex numbers.

The uniqueness of the linear matrix polynomial of type (A.7.10) to which


A + \B is strictly equivalent means that the parameters w, v, p, q, r, s,
are uniquely determined by the
polynomial A + \B. It may happen that some of the numbers u, v, p, q, r,
and s are zeros. This means that the corresponding part is missing from
formula (A.7.10).

Proof of Theorem A. 7.3. Let be a basis in the linear


space of all constant solutions of the equation

that is, all solutions that are independent of A. Note that (A.7.11) is
equivalent to the simultaneous equations

Likewise, let be a basis in the linear space of all constant


solutions of

or, what is the same, the simultaneous equations

Write A + \B (understood for each A G <p as a transformation written in the


standard orthonormal bases in <p" and <p m ) as a matrix with respect to the
basis in <p" whose first v vectors are *,, . . . , xv and the basis in <pm whose
Minimal Indices and Strict Equivalence of Linear Matrix Polynomials 677

first u vectors are yl1 . . . , yu and the others are orthogonal to


Span{_Vj, . . . , yu}. Because Im A = (Ker A*)^~ C(Span{_x 1 , . . . , yu})*~ and
also Im B C (Spanl)^, . . . , y M }) x , it follows that, with respect to the indi-
cated bases, A + \B has the form Here Al + \Bl has
the property that neither has con-
stant nonzero solutions.
If the rank of A} + \Bl is less than the number of columns in Al + Afij,
apply the reduction theorem (Theorem A. 6.1) several times to show that

where is such that the equation = 0 has no nonzero


polynomial solutions x = x(\). From the property of in Theorem
A. 6.1 it is clear that p . It is also clear that the process of
consecutive applications of Theorem A. 6.1 must terminate for the simple
reason that the size of A is finite. The Smith form of
(Theorem A. 1.1) shows that the number of columns of the polynomial
A2 + \B2 coincides with its rank.
If it happens that the rank of A2 + \B2 is less than the number of its
rows, apply the above procedure to After taking adjoints, we
find that

where 0 g and the rank of A3 + \B3 coincides with the number


of columns and the number of rows of (A3 + \B3). In other words,
A 3 + \B3 is regular. It remains to apply Theorem A. 5. 3 in order to show
that the original polynomial A + \B is strictly equivalent to a polynomial of
type (A. 7. 10).
It remains to show that such a polynomial (A. 7. 10) is unique. Proposition
A. 7. 2 and Example A. 7.1 show that the minimal column indices of A + \B
are 0, . . . , 0, e, , . . . , ep (where 0 appears « times) and the minimal row
indices of A + \B are 0, . . . , 0, 17^ . . . , v}q (where 0 appears v times).
Hence the parameters u, v, p, q, =], and are uniquely deter-
mined by A + \B. Further, observe that Lf and Lr( have no elementary
divisors; that is, their Smith forms are [7e 0] and " , respectively. (This
follows from Theorem A.2.2 since both Le and LTt have an € x e minor that
is equal to 1.) Using Proposition A. 3. 3, we see that the elementary divisors
1
of (A. 7. 10) are ,. which must coincide with the
elementary divisors of A + \B because of the strict equivalence of A + \B
and (A. 7. 10) (Theorem A. 3.1). Hence the parameters s, {/,}/ =1 , and
{ A,}^=1 are also uniquely determined by A + \B. Applying this argument for
\A + B in place of A + \B, we see that r and are uniquely
determined by A + \B as well. D
678 Appendix

The matrix polynomial (A.7.10) is called the Kronecker canonical form of


A + \B. Here 0,. . . , 0, el,. . . , ep (u times 0) are the minimal column
indices of A + \B; 0,. . . , 0, T/J, . . . , t}q (v times 0) are the minimal row
indices of A + \B; A * 1 , . . . , A*r are the elementary divisors of A + \B at
infinity; and e the (finite) elementary divisors of
A + \B. We obtain the following corollary from Theorem A.7.3.

Corollary A.7.4
We have A + \B ~A{ + \Bt if and only if the polynomials A + \B and
Al + \Bl have the same minimal column indices, minimal row indices,
elementary divisors, and elementary divisors at infinity.

Thus Corollary A.7.4 describes the full set of invariants for strict equival-
ence of linear matrix polynomials.

A.8 NOTES TO THE APPENDIX

This appendix contains well-known results on matrix polynomials. Essential-


ly the entire material can be found in Chapters 6 and 12 of Gantmacher
(1959), for example. In our exposition of Sections A.5-A.7 we follow this
book. In the exposition of Sections A.1-A.4 we follow Gohberg, Lancaster,
and Rodman (1982).
List of Notations
and Conventions

inclusion between sets X and Y


(equality not excluded)
the field of real numbers
the space of all n-dimensional real
column vectors
the field of complex numbers
the space of all n-dimensional com-
plex column vectors
complex conjugate of complex num-
ber x
the real part of x
the imaginary part of x
the n-dimensional column vector

the standard scalar product in <p";


««„..., o, <&,,...,&„»

the norm of a vector

679
680 List of Notations and Conventions

e, = (0,. . . , 0,1,0,. . ., 0} (with 1 in the ith place) the ith unit


coordinate vector in <p"; its size
n will be clear from the context
"Linear transformation" often abbreviated to "transfor-
mation"—when convenient, a linear
transformation from <pm into <p" is
assumed to be given by an n x m
matrix with respect to the bases
e , , . ..,€„ in <p" and e t , . . . , e m in
<pm, consequently, when convenient,
an n x m matrix will be considered
as a linear transformation written in
the standard bases el,.. . , en and
*!»•••>*«
m x n matrix whose entry in the
(/, /) place is afi
unit matrix; identity linear trans-
formation (the size of / is under-
stood from the context)
the k x k unit matrix
the transpose of a matrix A
the adjoint of a transformation A;
the conjugate transpose of a matrix
A
complex conjugate in every entry of
a matrix A
left inverse of a matrix (or trans-
formation) A
right inverse of a matrix (or trans-
formation) A
one-sided inverse (left or right) of
A; generalized inverse of A
the trace of a matrix (or transfor-
mation) A
the norm of a transformation A
the restriction of a transformation A
to its invariant subspace M
the image of a transformation
A'.^-^f"
the kernel of a transformation A
the spectrum of a matrix (or trans-
formation) A
the root subspace of A correspond-
ing to its eigenvalue X
List of Notations and Conventions 681

the Jordan block of size k x k with


eigenvalue A
the block diagonal matrix with the
matrices A,,..., Ap along the main
diagonal; or, the direct sum of the
linear transformations Alt . . . , A

a block column matrix

the set of all A-invariant subspaces


the set of all p-dimensional A-
invariant subspaces
the set of all coinvariant subspaces
for A
the set of all semiivariant sub-
spaces for A
the set of all reducing invariant sub-
spaces for A
the set of all p-dimensional reducing
invariant subspaces for A
the set of all hyperinvariant sub-
spaces for A
the set of all real invariant subspaces
for a real transformation A
the set of all transformations (or
matrices) that commute with a
transformation (or matrix) A
the zero subspace
the orthogonal complement to a sub-
space M
direct sum of subspaces M and Jf
orthogonal sum of subspaces M and
Jf
the unit sphere in a subspace M
the distance between a point x G <p"
and a set Z C <p"
the distance between sets X and Y
the gap between Z£ and M
the minimal opening between J£ and
M
the spherical gap between 2£ and M
the minimal angle between sub-
spaces Z£ and M
682 References

tric space of all subspaces in


the set of all m-dimensional sub-
spaces of <p"
the subspace spanned by vectors
A T , , . . . , Xk

the algebra of all n x n matrices


the algebra of all transformations on
a linear space !£
the algebra of all upper triangular
Toeplitz matrices of size j x j
the lattice of all invariant subspaces
for an algebra V
the algebra of all transformations for
which every subspace from a lattice
A is invariant
the set of all n x n unitary matrices
the set of all n x n real orthogonal
matrices with determinant 1
the set of all real invertible nx n
matrices
hthe McMillan degree of a rational
matrix function W(A)
the singular set of an analytic family
of transformations A(z)
Kronecker index Kronecker
tj = 0 index:
if i¥> /;d

are positive integers:

the number of distinct elements in a


finite set K
end of a proof or an example
References

Alien, G, R., "Hoiomorphic vector-valued functions on a domain of holomorphy," J. London


Math. Soc. 42, 509-513 (1967),
Bart, H., I. Gohberg, and M. A, Kaashoek, "Stable factorization of monk matrix polynomials
and stable invariant stibspaces," Integral Equations and Operator Theory I, 496-517
0978).
Bart, H., L Gohberg, and M. A. Kaashoek, Minimal Factorization of Mark and Operator
Functions (Operator Theory: Advances and Applications, Vol. 1) Birkhauser, Basei, 1979,
Bart, H., I. Gohberg, M. A, Kaashoek, and P, Van Dooren, "Factorizations of transfer
functions," SIAM J. Control Optim. 18(6), 675-696 (1980).
Baumgartei, H. Analytic Perturbation Theory for Matrices and Operators (Operator Theo
Advances and Applications, Vol. IS) Birkhauser, Basel-Boston-Stuttgart, 1985.
den Boer, H., and G. Ph. A. Thijsse, "Semistability of sums of partial multiplicities under
additive perturbations," Integral Equations and Operator Theory 3, 23-42 (1980).
Bochner, S., and W, T. Martin, Several Complex Variables, Princeton University Press,
Princeton, NJ, 1948.
Brickman, L., and P. A. Fillmore, "The invariant subspace lattice of a linear transformaton,"
Carmd. J. Math, 19, 810-822 (1967).
Brockett, R,, Finite Dimensional Linear Systems, John Wiley & Sons, New York, 1970.
Bmnovsky, P., "A classification of linear controllable systems," Kybentetika (Praha) 3,
173-187 (1970).
Campbell, S., and J. Daughtry, "The stable solutions of quadratic matrix equations," Proc.
AMS 74, 19-23 (W79).
Choi, M.-D,, C, Laurie, and H. Radjavi, "On comnitttators and invariant subspaces," Linear
and Multilinear Algebra 9, 329-340 (1981).
Coddington, E, A., and N. Levinson, Theory of Ordinary Differential Equations, McGraw-
Hill, New York, 1955.
Conway, J. B., and P. R, Hataos, "Knite-dimensiona! points of continuity of Lat," Linear
Algebra Appl. 31, 93-102 (1980).
Djaferis, T. E., and S. K. Mitter, "Some generic invariant factor assignment results using
dynamic output feedback," Linear Algebra Appl. St, 103-131 (1983).
DonneHan, T., Lattice Theory, Pergamon Press, Oxford, 1968.
Douglas, R. G., and C. Pearcy, "On a topology for invariant subspaces," J. Functional Armly.
2, 323-341 (1968).
Fillmore, P. A., D. A. Herrero, and W. E. Longstaff, "The hyperinvariant subspaces lattice of
a linear transformation," Linear Algebra Appl. IT, 125-132 (1977).
Ganttnacher, F. R., The Theory of Matrices, Vote. I and II, Chelsea, New York, 1959;
Gochberg, L Z., and i. Loiterer, "Uher Algebren stetiger Operatorfunctionen," Stadia
Mathematka, Vol. LVI1, 1-26, 1976.
Gohberg, L, and S. Goldberg, Basic Operator Theory, Birkhauser, Basel, 1981.

683
684 References

Gohberg, I., and G. Heinig, "The resultant matrix and its generalizations, I. The resultant
operator for matrix polynomials," Acta Set. Math. (Szeged) 37, 41-61 (Russian) (1975).
Gohberg, I., and M. A. Kaashoek, "Unsolved problems in matrix and operator theory, II.
Partial multiplicities of a product," Integral Equations and Operator Theory 2, 116-120
(1979).
Gohberg, I., M. A. Kaashoek, and F. van Schagen, "Similarity of operator blocks and
canonical forms. I. General results, feedback equivalence and Kronecker indices," Integral
Equations and Operator Theory 3, 350-396 (1980).
Gohberg, I., M. A. Kaashoek, and F. van Schagen, "Similarity of operator blocks and
canonical forms. II. Infinite dimensional case and Wiener-Hopf factorization," in Topics
in Modern Operator Theory. Operator Theory: Advances and Applications, Vol. 2,
Birkhauser-Verlag, 1981, pp. 121-170.
Gohberg, I., M. A. Kaashoek, and F. van Schagen, "Rational matrix and operator functions
with prescribed singularities," Integral Equations and Operator Theory 5, 673-717 (1982).
Gohberg, I. C., and M. G. Krein, "The basic propositions on defect numbers, root numbers
and indices of linear operators," Uspehi Mat. Nauk 12, 43-118 (1957); translation, Russian
Math. Surveys 13, 185-264 (1960).
Gohberg, I., and N. Krupnik, Einfiihrung in die Theorie der eindimensionalen singuldren
Integraloperatoren, Birkhauser, Basel, 1979.
Gohberg, I., P. Lancaster, and L. Rodman, "Perturbation theory for divisors of operator
polynomials," SIAM J. Math. Anal. 10, 1161-1183 (1979).
Gohberg, I., P. Lancaster, and L. Rodman, Matrix Polynomials, Academic Press, New York,
1982.
Gohberg, I., P. Lancaster, and L. Rodman, "A sign characteristic for self-adjoint meromorphic
matrix functions," Applicable Analysis 16, 165-185 (1983a).
Gohberg, L, P. Lancaster, and L. Rodman, Matrices and Indefinite Scalar Products (Operator
Theory: Advances and Applications, Vol. 8) Birkhauser-Verlag, Basel, 1983b.
Gohberg, I., and Ju. Leiterer, "On holomorphic vector-functions of one variable, I. Functions
on a compact set," Matem. Issled. 7, 60-84 (Russian) (1972).
Gohberg, L, and Ju. Leiterer, "On holomorphic vector-functions of one variable, II. Functions
on domains," Matem. Issled. 8, 37-58 (Russian) (1973).
Gohberg, I. C. and A. S. Markus, "Two theorems on the gap between subspaces of a Banach
space," Uspehi Mat. Nauk 14, 135-140 (Russian) (1959).
Gohberg, L, and L. Rodman, "Analytic matrix functions with prescribed local data," /.
d1 Analyse Math. 40, 90-128 (1981).
Gohberg, I., and L. Rodman, "On distance between lattices of invariant subspaces of
matrices," Linear Algebra Appl. 76, 85-120 (1986).
Gohberg, I., and S. Rubinstein, "Stability of minimal fractional decompositions of rational
matrix functions," in Operator Theory: Advances and Applications, Vol. 18, Birkhauser,
Basel, 1986, pp. 249-270.
Golub, G. H., and C. F. van Loan, Matrix Computations, The Johns Hopkins University Press,
Baltimore, 1983.
Golub, G. H., and J. H. Wilkinson, "Ill-conditioned eigensystems and the computation of the
Jordan canonical form," SIAM Review 18, 578-619 (1976).
Grauert, H., "Analytische Faserungen iiber holomorph vollstandigen Raumen," Math. Ann.
135, 263-273 (1958).
Guralnick, R. M., "A note on pairs of matrices with rank one commutator," Linear and
Multilinear Algebra 8, 97-99 (1979).
Halmos, P. R., "Reflexive lattices of subspaces," J. London Math. Soc. 4, 257-263 (1971).
Halperin, L, and P. Rosenthal, "Burnside's theorem on algebras of matrices," Am. Math.
Monthly 87, 810 (1980).
Harrison, K. J., "Certain distributive lattices of subspaces are reflexive," J. London Math. Soc.
8, 51-56 (1974).
Hautus, M. L. J., "Controllability and observability conditions of linear autonomous systems,"
Ned. Akad. Wet. Proc., Ser. A, 12, 443-448 (1969).
References 685

Helton, J. W., and J. A. Ball, "The cascade decompositions of a given system vs the linear
fractional decompositions of its transfer function," Integral Equations and Operator Theory
5, 341-385 (1982).
Hoffman, K., and R. Kunze, Linear Algebra, Prentice-Hall of India, New Delhi, 1967.
Jacobson, N., Lectures in Abstract Algebra II: Linear Algebra, Van Nostrand, Princeton, NJ,
1953.
Johnson, R. E., "Distinguished rings of linear transformations," Trans. Am. Math. Soc. I l l ,
400-412 (1964).
Kaashoek, M. A., C. V. M. van der Mee, and L. Rodman, "Analytic operator functions with
compact spectrum, II. Spectral pairs and factorization," Integral Equations and Operator
Theory 5, 791-827 (1982).
Kailath, T., Linear Systems, Prentice-Hall, Englewood Cliffs, NJ, 1980.
Kalman, R. E., "Mathematical description of linear dynamical systems," SI AM J. Control 1,
152-192 (1963).
Kalman, R. E., "Kronecker invariants and feedback," Proceedings of Conference on Ordinary
Differential Equations, Math. Research Center, Naval Research Laboratory, Washington,
DC, 1971.
Kalman, R. E., P. L. Falb, and M. A. Arbib, Topics in Mathematical System Theory,
McGraw-Hill, New York, 1969.
Kato, T., Perturbation Theory for Linear Operators, 2nd ed., Springer-Verlag, Berlin, 1976.
Kelley, J. L., General Topology, van Nostrand, New York, 1955.
Kra, I., Automorphic Forms and Kleinian Groups, Benjamin, Reading, MA, 1972.
Krein, M. G., "Introduction to the geometry of indefinite ./-spaces and to the theory of
operators in these spaces," Am. Math. Soc. Translations (2) 93, 103-176 (1970).
Krein, M. G., M. A. Krasnoselskii, and D. P. Milman, "On the defect numbers of linear
operators in Banach space and on some geometric problems," Sbornik Trud. Inst. Mat.
Akad. Nauk Ukr. SSR 11, 97-112 (Russian) (1948).
Kurosh, A. G., Lectures in General Algebra, Pergamon Press, Oxford, 1965.
Laffey, T. J., "Simultaneous triangularization of matrices—low rank cases and the non-
derogatory case," Linear and Multilinear Algebra 6, 269-305 (1978).
Lancaster, P., Theory of Matrices, Academic Press, New York, 1969.
Lancaster, P., and M. Tismenetsky, The Theory of Matrices with Applications, Academic Press,
New York, 1985.
Lidskii, V. B., "Inequalities for eigenvalues and singular values," appendix in F. R. Gantmach-
er, The Theory of Matrices, Nauka, Moscow, 1966, pp. 535-559 (Russian).
Markus, A. S., and E. E. Parilis, "Change in the Jordan structure of a matrix under small
perturbations," Matem. Issled. 54, 98-109 (Russian) (1980).
Markushevich, A. I., Theory of Analytic Functions, Vols. I-III, Prentice-Hall, Englewood
Cliffs, NJ, 1965.
Marsden, J. E., Basic Complex Analysis, Freeman, San Francisco, 1973.
Ostrowski, A. M., Solution of Equations in Euclidean and Banach Spaces, Academic Press, New
York, 1973.
Porsching, T. A., "Analytic eigenvalues and eigenvectors," Duke Math. J. 35, 363-367 (1968).
Radjavi, H., and P. Rosenthal, Invariant Subspaces, Springer-Verlag, Berlin, 1973.
Ran, A. C. M., and L. Rodman, "Stability of neutral invariant subspaces in indefinite inner
products and stable symmetric factorizations," Integral Equations and Operator Theory 6,
536-571 (1983).
Rodman, L., and M. Schaps, "On the partial multiplicities of a product of two matrix
polynomials," Integral Equations and Operator Theory 2, 565-599 (1979).
Rosenbrock, H. H., State Space and Multivariable Theory, Nelson, London, 1970.
Rosenbrock, H. H., and C. E. Hayton, "The general problem of pole assignment," Intern. J.
Control 27, 837-852 (1978).
Rosenthal, E., "A remark on Burnside's theorem on matrix algebras," Linear Algebra Appl.
63, 175-177 (1984).
Rudin, W., Real and Complex Analysis, 2nd ed., Tata McGraw-Hill, New Delhi.
Ruhe, A., "Perturbation bounds for means of eigenvalues and invariant subspaces," Nordisk
Tidskrift fur Informations Behandlung (BIT) 10, 343-354 (1970a).
Ruhe, A., "An algorithm for numerical determination of the structure of a general matrix,"
Nordisk Tidskrift fur Informations Behandlung (BIT) 10, 196-216 (1970b).
Saphar, P., "Sur les applications linéaires dans un espace de Banach. II," Ann. Sci. École
Norm. Sup. 82, 205-240 (1965).
Sarason, D., "On spectral sets having connected complement," Acta Sci. Math. (Szeged) 26,
289-299 (1965).
Shayman, M. A., "On the variety of invariant subspaces of a finite-dimensional linear
operator," Trans. AMS 274, 721-747 (1982).
Shmuljan, Yu. L., "Finite dimensional operators depending analytically on a parameter,"
Ukrainian Math. J. 9(2), 195-204 (Russian) (1957).
Shubin, M. A., "On holomorphic families of subspaces of a Banach space," Integral Equations
and Operator Theory 2, 407-420 (translation from Russian) (1979).
Sigal, E. I., "Partial multiplicities of a product of operator functions," Matem. Issled. 8(3),
65-79 (Russian) (1973).
Soltan, V. P., "The Jordan form of matrices and its connection with lattice theory," Matem.
Issled. 8(27), 152-170 (Russian) (1973a).
Soltan, V. P., "On finite dimensional linear operators with the same invariant subspaces,"
Matem. Issled. 8(30), 80-100 (Russian) (1973b).
Soltan, V. P., "On finite dimensional linear operators in real space with the same invariant
subspaces," Matem. Issled. 9, 153-189 (Russian) (1974).
Soltan, V. P., "The structure of hyperinvariant subspaces of a finite dimensional operator," in
Nonselfadjoint Operators, Stiinca, Kishinev, 1976, pp. 192-203 (Russian).
Soltan, V. P., "The lattice of hyperinvariant subspaces for a real finite dimensional operator,"
Matem. Issled. 61, 148-154, Stiinca, Kishinev (Russian) (1981).
Thijsse, G. Ph. A., "Rules for the partial multiplicities of the product of holomorphic matrix
functions," Integral Equations and Operator Theory 3, 515-528 (1980).
Thijsse, G. Ph. A., Partial Multiplicities of Products of Holomorphic Matrix Functions,
Habilitationsschrift, Dortmund, 1984.
Thompson, R. C., "Author vs. referee: A case history for middle level mathematicians," Am.
Math. Monthly, 90(10), 661-668 (1983).
Thompson, R. C., "Some invariants of a product of integral matrices," in Proceedings of the
1984 Joint Summer Research Conference on Linear Algebra and its Role in Systems Theory,
1985.
Uspensky, J. V., Theory of Equations, McGraw-Hill, New York, 1978.
Van Dooren, P., "The generalized eigenstructure problem in linear system theory," IEEE
Trans. Aut. Contr. AC-26, 111-129 (1981).
Van Dooren, P., "Reducing subspaces: Definitions, properties and algorithms," in A. Ruhe and
B. Kågström, Eds., Matrix Pencils, Lecture Notes in Mathematics, Vol. 973, Springer, New
York, 1983, pp. 58-73.
Wells, R. O., Differential Analysis on Complex Manifolds, Springer-Verlag, New York, 1980.
Wonham, W. M., Linear Multivariable Control: A Geometric Approach, Springer-Verlag,
Berlin, 1979.
Author Index

Allan, G.R., 645
Arbib, M.A., 292
Ball, J.A., 292
Bart, H., 290, 292, 561, 562
Baumgärtel, H., 605, 645
Bochner, S., 629
den Boer, H., 562
Brickman, L., 562
Brockett, R., 292
Brunovsky, P., 292
Campbell, S., 561, 562
Choi, M.D., 384
Coddington, E.A., 262
Conway, J.B., 561
Daughtry, J., 561, 562
Djaferis, T.E., 292
Donnellan, T., 313
Douglas, R.G., 561
Falb, P.L., 292
Fillmore, P.A., 384, 562
Gantmacher, F.R., 115, 290, 384, 678
Gohberg, I., 290, 291, 292, 410, 561, 562, 580, 609, 645, 678
Goldberg, S., 580
Golub, G., 562
Grauert, H., 645
Guralnick, R.M., 384
Halmos, P., 384, 561
Halperin, I., 384
Harrison, K.J., 348
Hautus, M.L.J., 292
Hayton, C.E., 292
Heinig, G., 609
Helton, J.W., 292
Herrero, D.A., 384
Hoffman, K., 427
Jacobson, N., 384
Johnson, R.E., 348
Kaashoek, M.A., 290, 291, 292, 561, 562
Kailath, T., 291, 292
Kalman, R.E., 292
Kato, T., 561
Kelley, J.L., 592
Kra, I., 614
Krasnoselskii, M.A., 561
Krein, M.G., 290, 561
Krupnick, N., 561
Kunze, R., 427
Kurosh, A.G., 290
Laffey, T.J., 384
Lancaster, P., 122, 290, 291, 327, 384, 561, 562, 645, 678
Laurie, C., 384
Leiterer, Ju., 410, 561, 645
Levinson, N., 262
Lidskii, V.B., 136
Longstaff, W.E., 384
Markus, A.S., 561, 562
Markushevich, A.I., 570, 585
Marsden, J.E., 477
Martin, W.T., 629
Milman, D.P., 561
Mitter, S.K., 292
Ostrowski, A.M., 562
Parilis, E.E., 562
Pearcy, C., 561
Porsching, T.A., 645
Radjavi, H., 384
Ran, A.C.M., 561
Rodman, L., 136, 291, 561, 562, 645, 678
Rosenbrock, H., 292
Rosenthal, E., 384
Rosenthal, P., 384
Rubinstein, S., 292, 561, 562
Rudin, W., 597
Ruhe, A., 562
Saphar, P., 645
Sarason, D., 291
Schaps, M., 136, 291
Shayman, M.A., 434, 561
Shmuljan, Yu.L., 645
Shubin, M.A., 645
Sigal, E.I., 291
Soltan, V.P., 290, 380, 384
Thijsse, G.Ph.A., 136, 562
Thompson, R.C., 291
Tismenetsky, M., 122, 290, 327, 384, 561
Uspensky, J.V., 630
van der Mee, C.V.M., 561
van Dooren, P., 562
van Loan, C.F., 562
van Schagen, F., 292
Wells, R.O., 434
Wilkinson, J.H., 562
Wonham, W.M., 291, 292
Subject Index

Algebra, 339
  k-transitive, 344
  reductive, 351
  self-adjoint, 351
  see also Boolean algebra
Analytic family:
  of subspaces, 566
    A(z)-invariant, 594
    direct complement for, 590
    real, 600
  of transformations, 565, 599, 604
    analytic Jordan basis for, 611
    diagonable, 612
    eigenvalues of, 604, 609
    eigenvectors of, 605
    first exceptional set, 609, 624, 632
    image of, 569
    incomplete factorization of, 578
    kernel of, 569
    multiple points of, 608
    real, 600
    second exceptional set of, 610, 624, 633
    singular set of, 569
Angular subspace, 25
Angular transformation, 27, 398
Atom, 349
Baire category theorem, 592
Binet-Cauchy formula, 651
Block similarity, 193, 208, 383
Boolean algebra, 349
  atomic, 349
Branch analytic family, 613
  singular set of, 613
Brunovsky canonical form, 196, 359, 383
Burnside's theorem, 341
Cascade (of linear systems), 273
  minimal, 274
  simple, 270
Chain (of subspaces), 33
  almost invariant, 209
  analytic extendability of, 618
  complete, 35, 348, 449
  Lipschitz stable, 526
  maximal, 35
  stable, 464
Characteristic polynomial, 10
Circulant matrix, 43, 96, 256, 260
Coextension, 128
Coinvariant subspace, 105, 437, 490
  orthogonally, 108
Col, 147
Column indices, minimal, 674
Commutator, 303
Commuting matrices, 295, 371
Companion matrix, 146, 515
  second, 150
Completion, 128
Complexification, 366
Compression, 106
Connected components, 426, 442
Connected set, 423
  finitely, 584
  simply, 584
Connected subspaces, 405, 423, 437
Continuous families:
  of subspaces, 408, 445
  of transformations, 412
Controllable pair, 290
Controllable system, 267
Diagonable transformation, 109, 366
Difference equation, 180
Differential equation, 175
Dilation, 128
  of linear system, 263
Direct sum of subspaces, 20
Distance:
  between sets of subspaces, 465
  between subspaces, 397
  from point to set, 388
Disturbance decoupling, 275
Eigenvalue, 10, 146, 361, 604, 609, 657, 661
Eigenvector, 10, 361, 605
  generalized, 12, 13
Elementary divisors, 298, 655, 665
  at infinity, 664, 665
Elementary matrices, 694
Equivalent matrix polynomials, 646
  strictly, 195, 382, 662, 665
Extension, 121
Factorization:
  of matrix polynomials, 159, 160, 171, 554, 624
    analytic extendability, 625, 626
    isolated, 524, 554
    Lipschitz stable, 525
    sequentially nonisolated, 627
    stable, 520, 524, 554
  of rational matrix functions, 226, 554
    analytic continuation, 634
    isolated, 538, 539, 555
    Lipschitz stable, 539
    minimal, 226, 529, 634
    sequentially nonisolated, 638
    stable, 529, 537, 539, 554
Factor space, 29
Feedback, 275, 277, 279
Fractional power series, 605
Full range pair, 81, 197, 290, 468
Gap, 387, 417
  spherical, 393, 418
Generalized inverse, 24
  continuity of, 411, 413
Generators, 69, 100
  minimal, 69
Graph (of matrix), 545
Height:
  of eigenvalue, 86
  of transformation, 498, 513
Hyperinvariant subspace, 305-313, 374, 431, 490
Ideal, in algebra, 343
Image, 5, 406
Incomplete factorization, 578
Input (of linear system), 262
Invariant polynomials, 654, 664
Invariant subspace, 5, 359
  of algebra, 340
  a-stable, 513
  analytic extendability of, 616
  B-stable, 480
  common to different matrices, 301, 378
  cyclic, 69
  inaccessible, 431
  intersect v, 208
  irreducible, 65, 365
  isolated, 428, 442, 473
  Jordan, 54
  Lipschitz stable, 459, 473
  marked, 83
  maximal, 72
  minimal, 78
  mod v, 191
  orthogonal reducing, 111
  real, 359
  reducible, 65
  reducing, 109, 298, 432, 490
  sequentially isolated, 619
  spectral, 60, 365, 458, 618
  stable, 447
  supporting, 187
Jordan block, 6, 52
Jordan chain, 13, 361
Jordan form, 53
  real, 365
Jordan indices, 196
Jordan part (of Brunovsky form), 196
Jordan structure, 482
  derogatory, 497
  fixed, 596
Jordan structure sequence, 477, 483
  derogatory part, 512
Jordan subspace, 54
Kernel, 5, 406
Kronecker canonical form, 678
Kronecker indices, 196, 199
Kronecker part (of Brunovsky form), 196
Laplace transform, 265
Lattice, 31
  analytic dependence, 596
  distributive, 311, 348
  linear isomorphism, 484
  reflexive, 348
  self-dual, 311
Lattice homomorphism, 483
Lattice of invariant subspaces, 463, 470
  analytic dependence, 596
  Lipschitz stable, 464
    in metric, 467
  stable, 464
    in metric, 465
Lattice isomorphism, 463, 483, 596
Left inverse, 216
  continuity of, 414
Left quotient, 659
Left remainder, 659
Linear equation (in matrices), 548, 551
Linear fractional decomposition, 244, 274
  Lipschitz stable, 540
  minimal, 245, 274
Linear fractional transformation, 238
Linear isomorphism (of lattices), 484
Linearization, 144
Linear system, 262
  controllable, 267
  disturbance decoupled, 275
  minimal, 264
  observable, 266
  similar, 263
Linear transformation:
  diagonable, 109, 366
  normal, 39, 363
  self-adjoint, 363
  unitary, 363
Lipschitz continuous map, locally, 518
Lipschitz stability, 467
Lyapunov equation, see Linear equation
McMillan degree, 225, 245, 632
Matrix:
  block:
    circulant, 98
    tridiagonal, 210
  circulant, 96, 97, 314
  companion, 98, 100, 299, 314
  cyclic, 299
  diagonable, 90
  hermitian, 20
  nonderogatory, 299, 449, 465, 499
  normal, 100, 111, 117, 303
  orthogonal, 363, 405
  Toeplitz, 317
Matrix polynomial, 646
  monic, 144
  see also Factorization, of matrix polynomials
Metric, 387
Metric space:
  compact, 400
  complete, 401
  connected, 405
Minimal angle, 392, 419
Minimal opening, 396, 451
Minimal polynomial, 74
Minimal realization, 218, 219
Minimal system, 264
Minor, 651
Mittag-Leffler theorem, 571, 614
Monodromy theorem, 597
Multiplicity:
  algebraic, 53, 365
  geometric, 53, 365
  partial, 53, 365
Norm, 88, 415
Normed space, 415
Null function, 220
  associated, 222
  canonical, 220
  order of, 220
Null kernel pair, 75, 81, 209, 290
Null vector, 220
Observable pair, 290
Observable system, 266
Output (of linear system), 262
Output stabilization, 279
Partial multiplicities, 154, 219, 657, 661
  stability of, 475
Pole (of rational function), 219, 223
  geometric multiplicity of, 529
Projector, 20
  complementary, 22
  orthogonal, 21
Quadratic equation (in matrices), 27, 545, 637
  inaccessible solution of, 547
  isolated solution of, 547, 551, 556
  Lipschitz stable solution of, 552
  stable solution of, 551, 552, 556
  unilateral, 550
Rational matrix function, 212
  analytic dependence, 628
  analytic minimal realization of, 630
  exceptional sets of, 632, 633
  minimal realization of, 218
  partial multiplicities of, 219
  pole of, 219
  realization of, 212
  zero of, 219
  see also Factorization, of rational matrix functions
Reachable vector, 276
Realization, see Rational matrix function
Reducing subspaces, 245, 251
Reduction:
  of linear system, 263
  of realization, 215
Regular linear matrix polynomial, 663
Resolvent form, 147
Restriction of transformation, 121
Riccati equation, see Quadratic equation
Riesz projector, 64, 447, 452
Right inverse, 216
  continuity of, 414
Right quotient, 659
Right remainder, 659
Root subspace, 46, 363, 490
Rotation matrix, 54
Row indices, minimal, 674
Scalar product, 391
Schmidt-Ore theorem, 290
Self-adjoint transformation, 20
Semiinvariant subspace, 112, 438, 490
  orthogonally, 115
Sigal inequalities, 133
Similarity, 17
  of standard triple, 147
  of systems, 263
Simply connected set, 584
Smith canonical form, 647
  local, 218
  uniqueness of, 651
Spectral assignment, 203, 383
Spectral factorization, 187
Spectral shifting, 204
Spectral subspace, 60
Spectrum, 10
Standard pair, 183
Standard triple, 147
  similarity of, 147
State vector, 262
Subspace:
  [A B]-invariant, 190, 481
  -invariant, 192
  angular, 25
  coinvariant, 105, 437, 490
  complementary, 20
  controllable, 204
  irreducible, 65
  Jordan, 54
  orthogonally coinvariant, 108
  orthogonally semiinvariant, 115
  reducible, 65
  root, 46, 363, 490
  semiinvariant, 112, 438, 490
  spectral, 60
  see also Invariant subspace
Supporting k-tuple, 530
  stable, 530
Supporting quadruple, 249
Toeplitz matrix, 40, 317
  upper triangular, 297, 317
Trace, 427
Transfer function, 265
Transformation:
  adjoint, 18
  angular, 27, 398
  coextension of, 128
  diagonable, 90, 100
  dilation of, 128
  extension of, 121, 190, 208
  function of, 85
  induced, 30
  nonderogatory, 299, 449, 465, 499
  normal, 39, 303
  orthogonally unicellular, 117
  reduction of, 128
  self-adjoint, 20
  unicellular, 67
Triinvariant decomposition, 112, 253
  orthogonal, 115
  supporting, 156, 277
Unitary matrix, 37
Vandermonde, 72, 98
Weierstrass' theorem, 571, 614
Zero:
  geometric multiplicity of, 529
  of rational function, 219, 223
