
Invariant Subspaces of Matrices with Applications
SIAM's Classics in Applied Mathematics series consists of books that were previously allowed to go
out of print. These books are republished by SIAM as a professional service because they continue
to be important resources for mathematical scientists.

Editor-in-Chief
Robert E. O'Malley, Jr., University of Washington

Editorial Board
Richard A. Brualdi, University of Wisconsin-Madison
Leah Edelstein-Keshet, University of British Columbia
Nicholas J. Higham, University of Manchester
Herbert B. Keller, California Institute of Technology
Andrzej Z. Manitius, George Mason University
Hilary Ockendon, University of Oxford
Ingram Olkin, Stanford University
Peter Olver, University of Minnesota
Ferdinand Verhulst, Mathematisch Instituut, University of Utrecht
Classics in Applied Mathematics
C. C. Lin and L. A. Segel, Mathematics Applied to Deterministic Problems in the Natural Sciences
Johan G. F. Belinfante and Bernard Kolman, A Survey of Lie Groups and Lie Algebras with
Applications and Computational Methods
James M. Ortega, Numerical Analysis: A Second Course
Anthony V. Fiacco and Garth P. McCormick, Nonlinear Programming: Sequential Unconstrained
Minimization Techniques
F. H. Clarke, Optimization and Nonsmooth Analysis
George F. Carrier and Carl E. Pearson, Ordinary Differential Equations
Leo Breiman, Probability
R. Bellman and G. M. Wing, An Introduction to Invariant Imbedding
Abraham Berman and Robert J. Plemmons, Nonnegative Matrices in the Mathematical Sciences
Olvi L. Mangasarian, Nonlinear Programming
*Carl Friedrich Gauss, Theory of the Combination of Observations Least Subject to Errors:
Part One, Part Two, Supplement. Translated by G. W. Stewart
Richard Bellman, Introduction to Matrix Analysis
U. M. Ascher, R. M. M. Mattheij, and R. D. Russell, Numerical Solution of Boundary Value
Problems for Ordinary Differential Equations
K. E. Brenan, S. L. Campbell, and L. R. Petzold, Numerical Solution of Initial-Value Problems
in Differential-Algebraic Equations
Charles L. Lawson and Richard J. Hanson, Solving Least Squares Problems
J. E. Dennis, Jr. and Robert B. Schnabel, Numerical Methods for Unconstrained Optimization
and Nonlinear Equations
Richard E. Barlow and Frank Proschan, Mathematical Theory of Reliability
Cornelius Lanczos, Linear Differential Operators
Richard Bellman, Introduction to Matrix Analysis, Second Edition
Beresford N. Parlett, The Symmetric Eigenvalue Problem

*First time in print.


Classics in Applied Mathematics (continued)

Richard Haberman, Mathematical Models: Mechanical Vibrations, Population Dynamics,
and Traffic Flow
Peter W. M. John, Statistical Design and Analysis of Experiments
Tamer Başar and Geert Jan Olsder, Dynamic Noncooperative Game Theory, Second Edition
Emanuel Parzen, Stochastic Processes
Petar Kokotovic, Hassan K. Khalil, and John O'Reilly, Singular Perturbation Methods
in Control: Analysis and Design
Jean Dickinson Gibbons, Ingram Olkin, and Milton Sobel, Selecting and Ordering Populations:
A New Statistical Methodology
James A. Murdock, Perturbations: Theory and Methods
Ivar Ekeland and Roger Temam, Convex Analysis and Variational Problems
Ivar Stakgold, Boundary Value Problems of Mathematical Physics, Volumes I and II
J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables
David Kinderlehrer and Guido Stampacchia, An Introduction to Variational Inequalities and
Their Applications
F. Natterer, The Mathematics of Computerized Tomography
Avinash C. Kak and Malcolm Slaney, Principles of Computerized Tomographic Imaging
R. Wong, Asymptotic Approximations of Integrals
O. Axelsson and V. A. Barker, Finite Element Solution of Boundary Value Problems: Theory
and Computation
David R. Brillinger, Time Series: Data Analysis and Theory
Joel N. Franklin, Methods of Mathematical Economics: Linear and Nonlinear Programming,
Fixed-Point Theorems
Philip Hartman, Ordinary Differential Equations, Second Edition
Michael D. Intriligator, Mathematical Optimization and Economic Theory
Philippe G. Ciarlet, The Finite Element Method for Elliptic Problems
Jane K. Cullum and Ralph A. Willoughby, Lanczos Algorithms for Large Symmetric Eigenvalue
Computations, Vol. I: Theory
M. Vidyasagar, Nonlinear Systems Analysis, Second Edition
Robert Mattheij and Jaap Molenaar, Ordinary Differential Equations in Theory and Practice
Shanti S. Gupta and S. Panchapakesan, Multiple Decision Procedures: Theory and Methodology
of Selecting and Ranking Populations
Eugene L. Allgower and Kurt Georg, Introduction to Numerical Continuation Methods
Leah Edelstein-Keshet, Mathematical Models in Biology
Heinz-Otto Kreiss and Jens Lorenz, Initial-Boundary Value Problems and the Navier-Stokes Equations
J. L. Hodges, Jr. and E. L. Lehmann, Basic Concepts of Probability and Statistics, Second Edition
George F. Carrier, Max Krook, and Carl E. Pearson, Functions of a Complex Variable: Theory
and Technique
Friedrich Pukelsheim, Optimal Design of Experiments
Israel Gohberg, Peter Lancaster, and Leiba Rodman, Invariant Subspaces of Matrices with
Applications

Invariant Subspaces of Matrices with Applications

Israel Gohberg
Tel-Aviv University
Ramat-Aviv, Israel

Peter Lancaster
University of Calgary
Calgary, Alberta, Canada

Leiba Rodman
College of William & Mary
Williamsburg, Virginia

siam.
Society for Industrial and Applied Mathematics
Philadelphia
Copyright © 2006 by the Society for Industrial and Applied Mathematics

This SIAM edition is an unabridged republication of the work first published by John
Wiley & Sons, Inc., New York, 1986.

10 9 8 7 6 5 4 3 2 1

All rights reserved. Printed in the United States of America. No part of this book
may be reproduced, stored, or transmitted in any manner without the written
permission of the publisher. For information, write to the Society for Industrial and
Applied Mathematics, 3600 University City Science Center, Philadelphia, PA
19104-2688.

Library of Congress Cataloging in Publication Data

Gohberg, I. (Israel), 1928-
Invariant subspaces of matrices with applications / Israel Gohberg, Peter
Lancaster, Leiba Rodman.
p. cm. — (Classics in applied mathematics ; 51)
Originally published: New York : Wiley, c1986, in series: Canadian Mathematical
Society series of monographs and advanced texts.
Includes bibliographical references and indexes.
ISBN 0-89871-608-X (pbk.)
1. Invariant subspaces. 2. Matrices. I. Lancaster, Peter, 1929-. II. Rodman, L.
III. Title. IV. Series.

QA322.G649 2006
515'.73-dc22
2006042260

siam is a registered trademark.


To our wives,
Bella, Diane, and Ella
Contents

Introduction 1

Part One Fundamental Properties of Invariant Subspaces and Applications 3

Chapter One Invariant Subspaces: Definition, Examples, and First Properties 5

1.1 Definition and Examples 5
1.2 Eigenvalues and Eigenvectors 10
1.3 Jordan Chains 12
1.4 Invariant Subspaces and Basic Operations on Linear
Transformations 16
1.5 Invariant Subspaces and Projectors 20
1.6 Angular Transformations and Matrix Quadratic
Equations 25
1.7 Transformations in Factor Spaces 28
1.8 The Lattice of Invariant Subspaces 31
1.9 Triangular Matrices and Complete Chains of Invariant
Subspaces 37
1.10 Exercises 40

Chapter Two Jordan Form and Invariant Subspaces 45


2.1 Root Subspaces 45
2.2 The Jordan Form and Partial Multiplicities 52
2.3 Proof of the Jordan Form 58
2.4 Spectral Subspaces 60
2.5 Irreducible Invariant Subspaces and Unicellular
Transformations 65
2.6 Generators of Invariant Subspaces 69
2.7 Maximal Invariant Subspace in a Given Subspace 72
2.8 Minimal Invariant Subspace over a Given Subspace 78
2.9 Marked Invariant Subspaces 83


2.10 Functions of Transformations 85


2.11 Partial Multiplicities and Invariant Subspaces of
Functions of Transformations 92
2.12 Exercises 95

Chapter Three Coinvariant and Semiinvariant Subspaces 105


3.1 Coinvariant Subspaces 105
3.2 Reducing Subspaces 109
3.3 Semiinvariant Subspaces 112
3.4 Special Classes of Transformations 116
3.5 Exercises 119

Chapter Four Jordan Form for Extensions and Completions 121


4.1 Extensions from an Invariant Subspace 121
4.2 Completions from a Pair of Invariant and Coinvariant
Subspaces 128
4.3 The Sigal Inequalities 133
4.4 Special Case of Completions 136
4.5 Exercises 142

Chapter Five Applications to Matrix Polynomials 144


5.1 Linearizations, Standard Triples, and Representations of
Monic Matrix Polynomials 144
5.2 Multiplication of Monic Matrix Polynomials and Partial
Multiplicities of a Product 153
5.3 Divisibility of Monic Matrix Polynomials 156
5.4 Proof of Theorem 5.3.2 161
5.5 Example 167
5.6 Factorization into Several Factors and Chains of
Invariant Subspaces 171
5.7 Differential Equations 175
5.8 Difference Equations 180
5.9 Exercises 183

Chapter Six Invariant Subspaces for Transformations Between Different Spaces 189
6.1 [A B]-Invariant Subspaces 189
6.2 Block Similarity 192
6.3 Analysis of the Brunovsky Canonical Form 197
6.4 Description of [A B]-Invariant Subspaces 200
6.5 The Spectral Assignment Problem 203
6.6 Some Dual Concepts 207
6.7 Exercises 209

Chapter Seven Rational Matrix Functions 212


7.1 Realizations of Rational Matrix Functions 212
7.2 Partial Multiplicities and Multiplication 218
7.3 Minimal Factorization of Rational Matrix Functions 225
7.4 Example 230
7.5 Minimal Factorizations into Several Factors and Chains
of Invariant Subspaces 234
7.6 Linear Fractional Transformations 238
7.7 Linear Fractional Decompositions and Invariant
Subspaces of Nonsquare Matrices 244
7.8 Linear Fractional Decompositions:
Further Deductions 251
7.9 Exercises 255

Chapter Eight Linear Systems 262


8.1 Reductions, Dilations, and Transfer Functions 262
8.2 Minimal Linear Systems: Controllability and
Observability 265
8.3 Cascade Connections of Linear Systems 270
8.4 The Disturbance Decoupling Problem 274
8.5 The Output Stabilization Problem 279
8.6 Exercises 285
Notes to Part 1. 290

Part Two Algebraic Properties of Invariant Subspaces 293

Chapter Nine Commuting Matrices and Hyperinvariant Subspaces 295


9.1 Commuting Matrices 295
9.2 Common Invariant Subspaces for Commuting
Matrices 301
9.3 Common Invariant Subspaces for Matrices with Rank 1
Commutators 303
9.4 Hyperinvariant Subspaces 305
9.5 Proof of Theorem 9.4.2 307
9.6 Further Properties of Hyperinvariant Subspaces 311
9.7 Exercises 313

Chapter Ten Description of Invariant Subspaces and Linear Transformations with the Same Invariant Subspaces 316
10.1 Description of Irreducible Subspaces 316
10.2 Transformations Having the Same Set of Invariant
Subspaces 323

10.3 Proof of Theorem 10.2.1 328


10.4 Exercises 338

Chapter Eleven Algebras of Matrices and Invariant Subspaces 339


11.1 Finite-Dimensional Algebras 339
11.2 Chains of Invariant Subspaces 340
11.3 Proof of Theorem 11.2.1 343
11.4 Reflexive Lattices 346
11.5 Reductive and Self-Adjoint Algebras 350
11.6 Exercises 355

Chapter Twelve Real Linear Transformations 359


12.1 Definition, Examples, and First Properties of Invariant
Subspaces 359
12.2 Root Subspaces and the Real Jordan Form 363
12.3 Complexification and Proof of the Real Jordan
Form 366
12.4 Commuting Matrices 371
12.5 Hyperinvariant Subspaces 374
12.6 Real Transformations with the Same Invariant
Subspaces 378
12.7 Exercises 380
Notes to Part 2. 384

Part Three Topological Properties of Invariant Subspaces and Stability 385

Chapter Thirteen The Metric Space of Subspaces 387


13.1 The Gap Between Subspaces 387
13.2 The Minimal Angle and the Spherical Gap 392
13.3 Minimal Opening and Angular Linear
Transformations 396
13.4 The Metric Space of Subspaces 400
13.5 Kernels and Images of Linear Transformations 406
13.6 Continuous Families of Subspaces 408
13.7 Applications to Generalized Inverses 411
13.8 Subspaces of Normed Spaces 415
13.9 Exercises 420

Chapter Fourteen The Metric Space of Invariant Subspaces 423


14.1 Connected Components: The Case of One
Eigenvalue 423
14.2 Connected Components: The General Case 426
14.3 Isolated Invariant Subspaces 428
14.4 Reducing Invariant Subspaces 432
14.5 Coinvariant and Semiinvariant Subspaces 437
14.6 The Real Case 439
14.7 Exercises 443

Chapter Fifteen Continuity and Stability of Invariant Subspaces 444


15.1 Sequences of Invariant Subspaces 444
15.2 Stable Invariant Subspaces: The Main Result 447
15.3 Proof of Theorem 15.2.1 in the General Case 451
15.4 Perturbed Stable Invariant Subspaces 455
15.5 Lipschitz Stable Invariant Subspaces 459
15.6 Stability of Lattices of Invariant Subspaces 463
15.7 Stability in Metric of the Lattice of Invariant
Subspaces 464
15.8 Stability of [A B]-Invariant Subspaces 468
15.9 Stable Invariant Subspaces for Real
Transformations 470
15.10 Partial Multiplicities of Close Linear
Transformations 475
15.11 Exercises 479

Chapter Sixteen Perturbations of Lattices of Invariant Subspaces with Restrictions on the Jordan Structure 482
16.1 Preservation of Jordan Structure and Isomorphism of
Lattices 482
16.2 Properties of Linear Isomorphisms of Lattices:
The Case of Similar Transformations 486
16.3 Distance Between Invariant Subspaces for
Transformations with the Same Jordan Structure 492
16.4 Transformations with the Same Derogatory Jordan
Structure 497
16.5 Proofs of Theorems 16.4.1 and 16.4.4 500
16.6 Distance between Invariant Subspaces for
Transformations with Different Jordan Structures 507
16.7 Conjectures 510
16.8 Exercises 513

Chapter Seventeen Applications 514


17.1 Stable Factorizations of Matrix Polynomials:
Preliminaries 514
17.2 Stable Factorizations of Matrix Polynomials:
Main Results 520
17.3 Lipschitz Stable Factorizations of Monic Matrix
Polynomials 525
17.4 Stable Minimal Factorizations of Rational Matrix
Functions: The Main Result 528
17.5 Proof of the Auxiliary Lemmas 532
17.6 Stable Minimal Factorizations of Rational Matrix
Functions: Further Deductions 537
17.7 Stability of Linear Fractional Decompositions of
Rational Matrix Functions 540
17.8 Isolated Solutions of Matrix Quadratic Equations 545
17.9 Stability of Solutions of Matrix Quadratic
Equations 551
17.10 The Real Case 553
17.11 Exercises 557
Notes to Part 3. 561

Part Four Analytic Properties of Invariant Subspaces 563


Chapter Eighteen Analytic Families of Subspaces 565
18.1 Definition and Examples 565
18.2 Kernel and Image of Analytic Families of
Transformations 569
18.3 Global Properties of Analytic Families of
Subspaces 575
18.4 Proof of Theorem 18.3.1 (Compact Sets) 578
18.5 Proof of Theorem 18.3.1 (General Case) 584
18.6 Direct Complements for Analytic Families of
Subspaces 590
18.7 Analytic Families of Invariant Subspaces 594
18.8 Analytic Dependence of the Set of Invariant Subspaces
and Fixed Jordan Structure 596
18.9 Analytic Dependence on a Real Variable 599
18.10 Exercises 601

Chapter Nineteen Jordan Form of Analytic Matrix Functions 604


19.1 Local Behaviour of Eigenvalues and Eigenvectors 604
19.2 Global Behaviour of Eigenvalues and Eigenvectors 607

19.3 Proof of Theorem 19.2.3 613


19.4 Analytic Extendability of Invariant Subspaces 616
19.5 Analytic Matrix Functions of a Real Variable 620
19.6 Exercises 622

Chapter Twenty Applications 624


20.1 Factorization of Monic Matrix Polynomials 624
20.2 Rational Matrix Functions Depending Analytically on a
Parameter 627
20.3 Minimal Factorizations of Rational Matrix
Functions 634
20.4 Matrix Quadratic Equations 639
20.5 Exercises 642
Notes to Part 4. 645

Appendix. Equivalence of Matrix Polynomials 646


A.1 The Smith Form: Existence 646
A.2 The Smith Form: Uniqueness 651
A.3 Invariant Polynomials, Elementary Divisors, and Partial
Multiplicities 654
A.4 Equivalence of Linear Matrix Polynomials 659
A.5 Strict Equivalence of Linear Matrix Polynomials:
Regular Case 662
A.6 The Reduction Theorem for Singular Polynomials 666
A.7 Minimal Indices and Strict Equivalence of Linear Matrix
Polynomials (General Case) 672
A.8 Notes to the Appendix 678

List of Notations and Conventions 679


References 683
Author Index 687
Subject Index 689
Preface to the SIAM Classics Edition

In the past 50 or 60 years, developments in mathematics have led to innovations in linear algebra and matrix theory. This progress was often initiated
by topics and problems from applied mathematics. A good example of this
is the development of mathematical systems theory. In particular, many new
and important results in linear algebra cannot even be formulated without the
notion of invariant subspaces of matrices or linear transformations. In view of
this, the authors set out to write a work on advanced linear algebra in which
invariant subspaces of matrices would be the central notion, the main sub-
ject of research, and the main tool. In other words, matrix theory was to be
presented entirely on the basis of the theory of invariant subspaces, including
the algebraic, geometric, topological, and analytic aspects of the theory. We
believed that this would give a new point of view and a better understanding
of the entire subject. It would also allow us to follow up systematically the
central role of invariant subspaces in linear algebra and matrix analysis, as
well as their role in the study of differential and difference equations, systems
theory, matrix polynomials, rational matrix functions, and algebraic Riccati
equations.
The first edition of the present book was the result. To the authors' knowl-
edge it is the only book in existence with these aims. The first parts of the
book have the character of a textbook easily accessible for undergraduate stu-
dents. As the development progresses, the exposition changes to approach the
style and content of a graduate textbook and even a research monograph until,
in the last part, recent achievements are presented. The fundamental char-
acter of the mathematics, its accessibility, and its importance in applications
makes this a widely useful book for experts and for students in mathematics,
sciences, and engineering.
The first edition sold out in early 2005, and we could not help colleagues
who found a need for it. We are grateful to Wiley-Interscience publications
for producing the first edition and for returning the copyright to us in order to
give the work a new life. We are especially thankful to SIAM for the decision
to include this work in their series Classics in Applied Mathematics.
We would like to mention some other literature with strong connections
to this book. First, there are two other relevant monographs by the present
authors: Matrix Polynomials, published by Academic Press in 1982, and Ma-
trices and Indefinite Scalar Products, published by Birkhauser Verlag in 1983.
Invariant subspaces play an important role in both of them. In fact, work on
these two books convinced us of the need for the present systematic treat-
ment. The monograph of I. Gohberg, M. A. Kaashoek, and F. van Schagen,

Partially Specified Matrices and Operators: Classification, Completion, Applications, Birkhauser Verlag, 1995, is recommended as additional reading for
Chapter 4. A later, comprehensive account of the theory of algebraic Riccati
equations, discussed in Chapters 17 and 20, can be found in the monograph
Algebraic Riccati Equations by P. Lancaster and L. Rodman, published by
Oxford University Press in 1995.
By the end of 2005 Birkhauser Verlag will also publish the authors' In-
definite Linear Algebra. This can also be recommended as a book in which
invariant subspaces play an important role.
It is a pleasure to repeat the acknowledgments appearing in the first edi-
tion. These include support from the Killam Foundation of Canada and the
Nathan and Lily Silver Chair on Mathematical Analysis and Operator The-
ory of Tel Aviv University. Continuing support was also provided by staff
at the School of Mathematical Sciences of Tel Aviv University and at the
Department of Mathematics and Statistics of the University of Calgary. In
particular, Jacqueline Gorsky in Tel Aviv and Pat Dalgetty in Calgary con-
tributed with speedy and skillful development of the first typescript. Support
from national organizations is also acknowledged: the Basic Research Fund
of the Israel Academy of Science, the U.S. National Science Foundation, and
the Natural Sciences and Engineering Research Council of Canada.
COMMENTS ON THE DEVELOPMENTS OF TWENTY YEARS
Twenty years have passed since the appearance of the first edition. Naturally, in this time advances have been made on some of the theory appearing
in the first edition, advances which have appeared in specialized journals and
books. Also, the status of some conjectures made in the first edition has
been clarified. Here, several developments of this kind are summarized for the
interested reader, together with a short bibliography.
1. Chapter 2. A characterization of matrices all of whose invariant subspaces are marked is given in [1].
2. Chapter 4. The problem of describing the Jordan forms of completions
from an invariant and a coinvariant subspace, also known as the Carlson
problem, has been solved (in terms of Littlewood-Richardson sequences). As
it turns out, it is closely related to the problem of describing the range of the
eigenvalues of A + B in terms of the eigenvalues of Hermitian matrices A and
B, solved by Klyachko [5]. See the expository paper [2] and references there.

3. Chapter 9. Various results on the existence of complete chains of invariant subspaces that extend Theorem 9.3.1 are presented in [8] (see also references there). We quote Radjavi's theorem [7]: A collection $\mathcal{S}$ of $n \times n$ complex matrices has a complete chain of common invariant subspaces if and only if the trace is permutable: $\operatorname{trace}(A_1 \cdots A_p) = \operatorname{trace}(A_{\sigma(1)} \cdots A_{\sigma(p)})$ for every $p$-tuple $A_1, \ldots, A_p$, $A_j \in \mathcal{S}$, and every permutation $\sigma$ of $\{1, 2, \ldots, p\}$.

4. Chapter 11. A simple proof of Burnside's theorem (Theorem 11.2.1 in the text) is given in [6].
Conjecture 11.2.3 was disproved in [3] (for all $n > 1$ except 7 and 11) and in [10] (for $n = 7$ and $n = 11$). It is certainly of interest to describe all pairs of complementary algebras $V_1$ and $V_2$ for which this conjecture is correct. In [3] it was proved that the conjecture is valid if the complementary algebras $V_1$ and $V_2$ are orthogonal.
5. Chapter 15. The past twenty years have seen the development of
a substantial literature concerning stability (in various senses) of invariant
subspaces of matrices, as well as of linear operators acting in an infinite-
dimensional Hilbert space. For much of this material and its applications in
the context of finite-dimensional spaces, we refer the reader to the expository
paper [9] and references there.
6. Chapter 16. Conjecture 16.7.1 is false in general. A counterexample
is given in [4]. The conjecture holds when A is nonderogatory (however, the
proof given on page 512 is erroneous, as pointed out in [4]) and when A is
diagonable. These results were established in [4] as well. An interesting open
question concerns the characterization of those Jordan structures for which
Conjecture 16.7.1 fails.

References
[1] R. Bru, L. Rodman, and H. Schneider, "Extensions of Jordan bases for
invariant subspaces of a matrix," Linear Algebra Appl. 150, 209-225
(1991).
[2] W. Fulton, "Eigenvalues, invariant factors, highest weights, and Schubert
calculus," Bull. Amer. Math. Soc. 37, 209-249 (2000).
[3] M. D. Choi, H. Radjavi and P. Rosenthal, "On complementary matrix
algebras," Integral Equations and Operator Theory 13, 165-174 (1990).
[4] J. Hartman, "On a conjecture of Gohberg and Rodman," Linear Algebra
Appl. 140, 267-278 (1990).
[5] A. A. Klyachko, "Stable bundles, representation theory and Hermitian
operators," Selecta Math. 4, 419-445 (1998).
[6] V. Lomonosov and P. Rosenthal, "The simplest proof of Burnside's the-
orem on matrix algebras," Linear Algebra Appl. 383, 45-47 (2004).
[7] H. Radjavi, "A trace condition equivalent to simultaneous triangulariz-
ability," Canad. J. Math. 38, 376-386 (1986).
[8] H. Radjavi and P. Rosenthal, Simultaneous Triangularization, Springer
Verlag, New York, 2001.

[9] A. C. M. Ran and L. Rodman, "A class of robustness problems in matrix analysis," Operator Theory: Advances and Applications 134, 337-389 (2002).

[10] T. Yoshino, "Supplemental examples: 'On complementary matrix algebras,'" Integral Equations and Operator Theory 14, 764-766 (1991).
Corrections

Page 123, line 13: For [I 0] read [0 I].
Page 137, line 3: For nondecreasing read nonincreasing.
Page 137, line 6 up: For Theorem 4.4.1 read Theorem 4.1.4.
Page 137, line 5 up: For Proposition 4.1.1 read Proposition 4.4.1.
Page 140, lines 8 and 9 up: Reverse the order of vectors in these chains.
Page 145, line 14: For L9λ) read L(λ).
Page 146, line 1 up: For n × nl read nl × n.
Page 196, line 6: For $F^{N-1}$ read $F^N$.
Page 197, line 5 up: For $\mathbb{C}^{m+n}$ read $\mathbb{C}^{m+n} \to \mathbb{C}^n$.
Page 214, line 6 up: Reverse the positions of B and C. Also B and C.
Page 221, line 11: For $x_j - 1$ read $x_{j-1}$.
Page 223, line 10: For $(\lambda I - A_1)$ read $(\lambda I - A_1)^{-1}$.
Page 225, line 4 up: For $W(\lambda)^{-1}$ read $W(\lambda)$ and replace $-C$ by $C$.
Page 360, line 11: In the bottom row of the matrix replace $r$ by $-r$.
Page 673, line 2: For $k_1$ read $k$.
Page 687, line 8 up: For "Mardsen" read "Marsden."
Introduction

Invariant subspaces are a central notion of linear algebra. However, in existing texts and expositions the notion is not easily or systematically
followed. Perhaps because the whole structure is very rich, the treatment
becomes fragmented as other related ideas and notions intervene. In
particular, the notion of an invariant subspace as an entity is often lost in the
discussion of eigenvalues, eigenvectors, generalized eigenvectors, and so on.
The importance of invariant subspaces becomes clearer in the context of
operator theory on spaces of infinite dimension. Here, it can be argued that
the structure is poorer and this is one of the few available tools for the study
of many classes of operators. Probably for this reason, the first books on
invariant subspaces appeared in the framework of infinite-dimensional
spaces. It seems to the authors that now there is a case for developing a
treatment of linear algebra in which the central role of invariant subspaces is
systematically followed up.
The need for such a treatment has become more apparent in recent years
because of developments in different fields of application and especially in
linear systems theory, where concepts such as controllability, feedback,
factorization, and realization of matrix functions are commonplace. In the
treatment of such problems new concepts and theories have been developed
that form complete new chapters in the body of linear algebra. As examples
of new concepts of linear algebra developed to meet the needs of systems
theory, we should mention invariant subspaces for nonsquare matrices and
similarity of such matrices.
In this book the reader will find a treatment of certain aspects of linear
algebra that meets the two objectives: to develop systematically the central
role of invariant subspaces in the analysis of linear transformations and to
include relevant recent developments of linear algebra stimulated by linear
systems theory. The latter are not dealt with separately, but are integrated
into the text in a way that is natural in the development of the mathematical
structure.


The first part of the book, taken alone or together with selections from
the other parts, can be used as a text for undergraduate courses in
mathematics, having only a first course in linear algebra as prerequisite. At
the same time, the book will be of interest to graduate students in science
and engineering. We trust that experts will also find the exposition and new
results interesting. The authors anticipate that the book will also serve as a
valuable reference work for mathematicians, scientists, and engineers. A set
of exercises is included in each chapter. In general, they are designed to
provide illustrations and training rather than extensions of the theory.
The first part of the book is devoted mainly to geometric properties of
invariant subspaces and their applications in three fields. The fields in
question are matrix polynomials, rational matrix functions, and linear
systems theory. They are each presented in self-contained form, and—rather
than being exhaustive—the focus is on those problems in which invariant
subspaces of square and nonsquare matrices play a central role. These
problems include factorization and linear fractional decompositions for mat-
rix functions; problems of realization for rational matrix functions; and the
problem of describing connections, or cascades, of linear systems, pole
assignment, output stabilization, and disturbance decoupling.
The second part is of a more algebraic character in which other properties
of invariant subspaces are analyzed. It contains an analysis of the extent to
which the invariant subspaces determine the parent matrix, invariant sub-
spaces common to commuting matrices, and lattices of subspaces for a single
matrix and for algebras of matrices.
The numerical computation of invariant subspaces is a difficult task as, in
general, it makes sense to compute only those invariant subspaces that
change very little after small changes in the transformation. Thus it is
important to have appropriate notions of "stable" invariant subspaces. Such
an analysis of the stability of invariant subspaces and their generalizations is
the main subject of Part 3. This analysis leads to applications in some of the
problem areas mentioned above.
The subject of Part 4 is analytic families of invariant subspaces and has
many useful applications. Here, the analysis is influenced by the theory of
complex vector bundles, although we do not make use of this theory. The
study of the connections between local and global problems is one of the
main problems studied in this part. Within reasonable bounds, Part 4 relies
only on the theory developed in this book. The material presented here
appears for the first time in a book on linear algebra and is thereby made
accessible to a wider audience.
Part One

Fundamental Properties of Invariant Subspaces and Applications
Part 1 of this work comprises almost half of the entire book. It includes what
can be described as a self-contained course in linear algebra with emphasis
on invariant subspaces, together with substantial developments of applica-
tions to the theory of polynomial and rational matrix-valued functions, and
to systems theory. These applications demand extensions of the standard
material in linear algebra that are included in our treatment in a natural
way. They also serve to breathe new life into an otherwise familiar body of
knowledge. Thus there is a considerable amount of material here (including
all of Chapters 3, 4, and 6) that cannot be found in other books on linear
algebra.
Almost all of the material in this part can be understood by readers who
have completed a beginning course in linear algebra, although there are
places where basic ideas of calculus and complex analysis are required.

Chapter One

Invariant Subspaces: Definition, Examples, and First Properties

This chapter is mainly introductory. It contains the simplest properties of invariant subspaces of a linear transformation. Some basic tools (projectors,
factor spaces, angular transformations, triangular forms) for the study of
invariant subspaces are developed. We also study the behaviour of invariant
subspaces of a transformation when the operations of similarity and taking
adjoints are applied to the transformation. The lattice of invariant sub-
spaces of a linear transformation—a notion that will be important in the
sequel—is introduced. The presentation of the material here is elementary
and does not even require use of the Jordan form.

1.1 DEFINITION AND EXAMPLES

Let A: <p"—» (p" be a linear transformation. A subspace M C <p" is called


invariant for the transformation A, or A invariant, if Ax & M for every
vector x G M. In other words, M is invariant for A means that the image of
M under A is contained in M\ AM C. M. Trivial examples of invariant
subspaces are {0} and <p". Less trivial examples are the subspaces

and

Indeed, as Ax = 0G Ker A for every x G Ker A, the subspace Ker A is A


invariant. Also, for every x G (p", the vector Ax belongs to Im^4; in
particular, A(lm A) C Im A, and Im A is A invariant.
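As a small concrete illustration (the particular $2 \times 2$ matrix below is an arbitrary choice, not part of the development), take

$$A = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}$$

Then $\operatorname{Ker} A = \operatorname{Span}\{\langle 1, -1 \rangle\}$ and $\operatorname{Im} A = \operatorname{Span}\{\langle 1, 0 \rangle\}$, and indeed $A\langle 1, -1 \rangle = 0 \in \operatorname{Ker} A$ while $A\langle 1, 0 \rangle = \langle 1, 0 \rangle \in \operatorname{Im} A$, so both subspaces are $A$ invariant.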


More generally, the subspaces

$$\operatorname{Ker} A^m = \{x \in \mathbb{C}^n \mid A^m x = 0\}$$

and

$$\operatorname{Im} A^m = \{A^m x \mid x \in \mathbb{C}^n\}$$

are $A$ invariant. To verify this, let $x \in \operatorname{Ker} A^m$, so $A^m x = 0$. Then $A^m(Ax) = A(A^m x) = 0$, that is, $Ax \in \operatorname{Ker} A^m$. This means that $\operatorname{Ker} A^m$ is $A$ invariant. Further, let $x \in \operatorname{Im} A^m$, so $x = A^m y$ for some $y \in \mathbb{C}^n$. Then $Ax = A(A^m y) = A^m(Ay)$, which implies that $Ax \in \operatorname{Im} A^m$. So $\operatorname{Im} A^m$ is $A$ invariant as well.

When convenient, we shall often assume implicitly that a linear transformation from $\mathbb{C}^m$ into $\mathbb{C}^n$ is given by an $n \times m$ matrix with respect to the standard orthonormal bases $e_1 = \langle 1, 0, \ldots, 0 \rangle$, $e_2 = \langle 0, 1, 0, \ldots, 0 \rangle, \ldots, e_m = \langle 0, 0, \ldots, 0, 1 \rangle$ in $\mathbb{C}^m$ and $e_1, \ldots, e_n$ in $\mathbb{C}^n$.

The following three examples of transformations and their invariant subspaces are basic and are often used in the sequel.

EXAMPLE 1.1.1. Let

$$A = \begin{bmatrix} \lambda_0 & 1 & 0 & \cdots & 0 \\ 0 & \lambda_0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_0 & 1 \\ 0 & 0 & \cdots & 0 & \lambda_0 \end{bmatrix}$$

(the $n \times n$ Jordan block with $\lambda_0$ on the main diagonal). Every nonzero $A$-invariant subspace is of the form $\operatorname{Span}\{e_1, \ldots, e_k\}$, where $e_i$ is the vector $\langle 0, \ldots, 0, 1, 0, \ldots, 0 \rangle$ with 1 in the $i$th place. Indeed, let $\mathcal{M}$ be a nonzero $A$-invariant subspace, and let

$$x = \sum_{i=1}^n a_i e_i$$

be a vector from $\mathcal{M}$ for which the index $k = \max\{m \mid 1 \le m \le n,\ a_m \neq 0\}$ is maximal. Then clearly

$$\mathcal{M} \subseteq \operatorname{Span}\{e_1, \ldots, e_k\}$$

On the other hand, the vector $x = \sum_{i=1}^k a_i e_i$, $a_k \neq 0$, belongs to $\mathcal{M}$. Hence, since $\mathcal{M}$ is $A$ invariant, the vectors

$$(A - \lambda_0 I)x = \sum_{i=2}^k a_i e_{i-1}, \quad (A - \lambda_0 I)^2 x = \sum_{i=3}^k a_i e_{i-2}, \quad \ldots, \quad (A - \lambda_0 I)^{k-1} x = a_k e_1$$

also belong to $\mathcal{M}$. Hence the vectors

$$e_1, \quad e_2, \quad \ldots, \quad e_k$$

(obtained from these by taking suitable linear combinations, starting with $e_1 = a_k^{-1}(A - \lambda_0 I)^{k-1}x$) belong to $\mathcal{M}$ as well. So

$$\operatorname{Span}\{e_1, \ldots, e_k\} \subseteq \mathcal{M}$$

and the equality

$$\mathcal{M} = \operatorname{Span}\{e_1, \ldots, e_k\}$$

follows. As for every $y = \sum_{i=1}^k b_i e_i$ we have

$$Ay = \sum_{i=1}^k (\lambda_0 b_i + b_{i+1})e_i \in \operatorname{Span}\{e_1, \ldots, e_k\} \qquad (b_{k+1} = 0)$$

the subspace $\operatorname{Span}\{e_1, \ldots, e_k\}$ is indeed $A$ invariant. The total number of $A$-invariant subspaces (including $\{0\}$ and $\mathbb{C}^n$) is thus $n + 1$.

In this example we have

$$\operatorname{Ker} A = \{0\}, \qquad \operatorname{Im} A = \mathbb{C}^n \qquad \text{if } \lambda_0 \neq 0$$

and

$$\operatorname{Ker} A = \operatorname{Span}\{e_1\}, \qquad \operatorname{Im} A = \operatorname{Span}\{e_1, \ldots, e_{n-1}\} \qquad \text{if } \lambda_0 = 0$$

As expected, these subspaces are $A$ invariant. □
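The structure of this example is easy to verify numerically. The following sketch (illustrative only; it uses NumPy, and the size $n = 4$ and eigenvalue $\lambda_0 = 2$ are arbitrary choices) checks that the coordinate subspaces $\operatorname{Span}\{e_1, \ldots, e_k\}$ are invariant and that no other coordinate subspace is:

```python
import numpy as np
from itertools import combinations

n, lam = 4, 2.0
A = lam * np.eye(n) + np.diag(np.ones(n - 1), 1)   # the Jordan block J_4(2)
E = np.eye(n)

def is_invariant(A, M, tol=1e-10):
    # Span of the columns of M is A-invariant iff appending the columns
    # of A @ M does not increase the rank.
    return np.linalg.matrix_rank(np.hstack([M, A @ M]), tol) == \
           np.linalg.matrix_rank(M, tol)

for k in range(1, n + 1):                          # Span{e_1,...,e_k}: invariant
    assert is_invariant(A, E[:, :k])

for r in range(1, n):                              # other coordinate spans fail
    for idx in combinations(range(n), r):
        if idx != tuple(range(r)):
            assert not is_invariant(A, E[:, list(idx)])
print("checked: exactly", n + 1, "invariant subspaces, counting {0} and C^n")
```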

EXAMPLE 1.1.2. Let $A = \lambda_0 I$, where $I$ is the $n \times n$ identity matrix. Clearly, every subspace in $\mathbb{C}^n$ is $A$ invariant. Here the number of $A$-invariant subspaces is infinite (if $n > 1$).

Note that the set $\operatorname{Inv}(A)$ of all $A$-invariant subspaces is uncountably infinite. Indeed, for linearly independent vectors $x, y \in \mathbb{C}^n$ the one-dimensional subspaces $\operatorname{Span}\{x + \alpha y\}$, $\alpha \in \mathbb{C}$, are all different and belong to $\operatorname{Inv}(A)$. So they form an uncountable set of $A$-invariant subspaces.

Conversely, if every one-dimensional subspace of $\mathbb{C}^n$ is $A$ invariant for a linear transformation $A$, then $A = \lambda_0 I$ for some $\lambda_0$. Indeed, for every $x \neq 0$ the subspace $\operatorname{Span}\{x\}$ is $A$ invariant, so $Ax = \lambda(x)x$, where $\lambda(x)$ is a complex number that may, a priori, depend on $x$. Now if $\lambda(x_1) \neq \lambda(x_2)$ for linearly independent vectors $x_1$ and $x_2$, then $\operatorname{Span}\{x_1 + x_2\}$ is not $A$ invariant, because

$$A(x_1 + x_2) = \lambda(x_1)x_1 + \lambda(x_2)x_2$$

is not a scalar multiple of $x_1 + x_2$. Hence we must have that $\lambda_0 = \lambda(x)$ is independent of $x \neq 0$, so actually $A = \lambda_0 I$. □

Later (see Proposition 2.5.4) we shall see that the set of all invariant subspaces of an $n \times n$ complex matrix $A$ is never countably infinite; it is either finite or uncountably infinite.

EXAMPLE 1.1.3. Let

$$A = \operatorname{diag}[\lambda_1, \lambda_2, \ldots, \lambda_n]$$

where the complex numbers $\lambda_1, \ldots, \lambda_n$ are distinct. For any indices $1 \le i_1 < \cdots < i_k \le n$ the subspace $\operatorname{Span}\{e_{i_1}, \ldots, e_{i_k}\}$ is $A$ invariant. Indeed, for

$$x = \sum_{j=1}^k \alpha_j e_{i_j}$$

we have

$$Ax = \sum_{j=1}^k \alpha_j \lambda_{i_j} e_{i_j} \in \operatorname{Span}\{e_{i_1}, \ldots, e_{i_k}\}$$

It turns out that these are all the invariant subspaces for $A$. The proof of this fact for a general $n$ is given later in a more general framework. So the total number of $A$-invariant subspaces is $\sum_{k=0}^{n} \binom{n}{k} = 2^n$.

Here we shall check only that the $2 \times 2$ matrix

$$A = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \qquad (\lambda_1 \neq \lambda_2)$$

has exactly two nontrivial invariant subspaces, $\operatorname{Span}\{e_1\}$ and $\operatorname{Span}\{e_2\}$. Indeed, let $\mathcal{M}$ be any one-dimensional $A$-invariant subspace

$$\mathcal{M} = \operatorname{Span}\{x_1\}, \qquad x_1 = a_1 e_1 + a_2 e_2 \neq 0$$

Then $Ax_1 = a_1\lambda_1 e_1 + a_2\lambda_2 e_2$ should belong to $\mathcal{M}$ and thus is a scalar multiple of $x_1$:

$$a_1\lambda_1 e_1 + a_2\lambda_2 e_2 = \beta(a_1 e_1 + a_2 e_2)$$

for some $\beta \in \mathbb{C}$. Comparing coefficients, we see that we obtain a contradiction $\lambda_1 = \lambda_2$ unless $a_1 = 0$ or $a_2 = 0$. In the former case $\mathcal{M} = \operatorname{Span}\{e_2\}$ and in the latter case $\mathcal{M} = \operatorname{Span}\{e_1\}$.

In this example we have $\operatorname{Ker} A = \operatorname{Span}\{e_{i_0}\}$ (when $\det A = 0$), where $i_0$ is the index for which $\lambda_{i_0} = 0$ (as we have assumed that the $\lambda_i$ are distinct and $\det A = 0$, there is exactly one such index), and $\operatorname{Im} A = \operatorname{Span}\{e_i \mid i \neq i_0\}$. □

The following observation is often useful in proving that a given subspace is $A$ invariant: A subspace $\mathcal{M} = \operatorname{Span}\{x_1, \ldots, x_k\}$ is $A$ invariant if and only if $Ax_i \in \mathcal{M}$ for $i = 1, \ldots, k$. The proof of this fact is an easy exercise.
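In computational terms this observation amounts to a simple rank test; here is a minimal sketch (illustrative only, with an arbitrarily chosen matrix):

```python
import numpy as np

def is_invariant(A, X, tol=1e-10):
    # X holds spanning vectors x_1,...,x_k as columns.  By the observation
    # above, Span{x_1,...,x_k} is A-invariant iff every A x_i lies in the
    # span, i.e. iff appending the columns of A @ X does not raise the rank.
    return np.linalg.matrix_rank(np.hstack([X, A @ X]), tol) == \
           np.linalg.matrix_rank(X, tol)

A = np.array([[2., 1.],
              [0., 3.]])
print(is_invariant(A, np.array([[1.], [0.]])))   # True:  A e_1 = 2 e_1
print(is_invariant(A, np.array([[0.], [1.]])))   # False: A e_2 = e_1 + 3 e_2
```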
For a given transformation $A: \mathbb{C}^n \to \mathbb{C}^n$ and a given vector $x \in \mathbb{C}^n$, consider the subspace

$$\mathcal{M} = \operatorname{Span}\{x, Ax, A^2x, \ldots\}$$

We now appeal to the Cayley-Hamilton theorem, which states that

$$a_0 I + a_1 A + \cdots + a_{n-1}A^{n-1} + a_n A^n = 0$$

where the complex numbers $a_0, \ldots, a_n$ are the coefficients of the characteristic polynomial $\det(\lambda I - A)$ of $A$:

$$\det(\lambda I - A) = a_0 + a_1\lambda + \cdots + a_n\lambda^n$$

(By writing $A$ as an $n \times n$ matrix in some basis in $\mathbb{C}^n$, we easily see from the definition of the determinant that $\det(\lambda I - A)$ is a polynomial of degree $n$ with $a_n = 1$.) Hence $A^kx$ with $k \ge n$ is a linear combination of $x, Ax, \ldots, A^{n-1}x$, so actually

$$\mathcal{M} = \operatorname{Span}\{x, Ax, \ldots, A^{n-1}x\}$$

The preceding observation shows immediately that $\mathcal{M}$ is $A$ invariant. Any $A$-invariant subspace $\mathcal{L}$ that contains $x$ also contains all the vectors $Ax, A^2x, \ldots$, and hence contains $\mathcal{M}$. It follows that $\mathcal{M}$ is the smallest $A$-invariant subspace that contains the vector $x$.
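Numerically, this smallest invariant subspace can be read off from the powers $x, Ax, \ldots, A^{n-1}x$. A rough sketch (illustrative only; a rank-revealing factorization would be more robust than the plain QR used here):

```python
import numpy as np

def smallest_invariant_subspace(A, x, tol=1e-10):
    # Orthonormal basis of Span{x, Ax, ..., A^(n-1) x}; by Cayley-Hamilton
    # the higher powers A^k x (k >= n) add nothing to the span.
    n = A.shape[0]
    K = np.column_stack([np.linalg.matrix_power(A, j) @ x for j in range(n)])
    Q, R = np.linalg.qr(K)
    return Q[:, np.abs(np.diag(R)) > tol]

A = np.diag([1., 2., 3.])
M = smallest_invariant_subspace(A, np.array([1., 1., 0.]))
print(M.shape[1])   # 2: the smallest invariant subspace is Span{e_1, e_2}
```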
We conclude this section with another useful fact regarding invariant subspaces. Namely, a subspace $\mathcal{M} \subseteq \mathbb{C}^n$ is $A$ invariant for a transformation $A: \mathbb{C}^n \to \mathbb{C}^n$ if and only if it is $(\alpha A + \beta I)$ invariant, where $\alpha, \beta$ are arbitrary complex numbers such that $\alpha \neq 0$. Indeed, assume that $\mathcal{M}$ is $A$ invariant. Then for every $x \in \mathcal{M}$ we see that the vector

$$(\alpha A + \beta I)x = \alpha Ax + \beta x$$

belongs to $\mathcal{M}$. So $\mathcal{M}$ is $(\alpha A + \beta I)$ invariant. As

$$A = \alpha^{-1}(\alpha A + \beta I) - \alpha^{-1}\beta I$$

the same reasoning shows that any $(\alpha A + \beta I)$ invariant subspace is also $A$ invariant.

1.2 EIGENVALUES AND EIGENVECTORS

The most primitive nontrivial invariant subspaces are those with dimension equal to one. For a transformation $A: \mathbb{C}^n \to \mathbb{C}^n$ and some nonzero $x \in \mathbb{C}^n$, therefore, we consider an $A$-invariant subspace of the form $\mathcal{M} = \operatorname{Span}\{x\}$. In this case there must be a $\lambda_0 \in \mathbb{C}$ such that $Ax = \lambda_0 x$. Since we then have $A(\alpha x) = \alpha(Ax) = \lambda_0(\alpha x)$ for any $\alpha \in \mathbb{C}$, the number $\lambda_0$ does not depend on the choice of the nonzero vector in $\mathcal{M}$. We call $\lambda_0$ an eigenvalue of $A$, and, when $Ax = \lambda_0 x$ with $0 \neq x \in \mathbb{C}^n$, we call $x$ an eigenvector of $A$ (corresponding to the eigenvalue $\lambda_0$). Observe that, since $(\lambda_0 I - A)x = 0$, the eigenvalues of $A$ can also be characterized as the set of complex zeros of the characteristic polynomial of $A$: $\varphi_A(\lambda) = \det(\lambda I - A)$.

The set of all eigenvalues of $A$ is called the spectrum of $A$ and is denoted by $\sigma(A)$. We have seen that any one-dimensional $A$-invariant subspace is spanned by some eigenvector. Conversely, if $x_0$ is an eigenvector of $A$ corresponding to some eigenvalue $\lambda_0$, then $\operatorname{Span}\{x_0\}$ is $A$ invariant. (In other words, $A$ is the operator of multiplication by $\lambda_0$ when restricted to $\operatorname{Span}\{x_0\}$.)

Let us have a closer look at the eigenvalues. As the characteristic polynomial $\varphi_A(\lambda) = \det(\lambda I - A)$ is a polynomial of degree $n$, by the fundamental theorem of algebra, $\varphi_A(\lambda)$ has $n$ (in general, complex) zeros when counted with multiplicities. These zeros are exactly the eigenvalues of $A$. Since the characteristic polynomial and eigenvalues are independent of the choice of basis producing the matrix representation, they are properties of the underlying transformation. So a transformation $A: \mathbb{C}^n \to \mathbb{C}^n$ has exactly $n$ eigenvalues when counted with multiplicities, and, in any event, the


number of distinct eigenvalues of A does not exceed n. Note that this is a
property of transformations over the field of complex numbers (or, more
generally, over an algebraically closed field). As we shall see later, a
transformation from ft" into ft" does not always have (real) eigenvalues.
Since at least one eigenvector corresponds to any eigenvalue A0 of A it
follows that every linear transformation A: <p"—»<p" has at least one one-
dimensional invariant subspace. Example 1.1.1 shows that in certain cases a
linear transformation has exactly one one-dimensional invariant subspace.
We pass now to the description of two-dimensional ^-invariant subspaces
in terms of eigenvalues and eigenvectors. So assume that M is a two-
dimensional ^-invariant subspace. Then, in a natural way, A determines a
transformation from M into M. We have seen above that for every trans-
formation in a (complex) finite-dimensional vector space (which can be
identified with <pm for some m) there is an eigenvalue and a corresponding
eigenvector. So there exists an *() G M\{Q} and a complex number A0 such
that Ax0 = A0^0. Now let xl be a vector in M for which {x0, x } } is a linearly
independent set; in other words, M — Span{jc0, x^}. Since M is A invariant it
follows that

for some complex numbers /i0 and /Xj. If /AO = 0, then xl is an eigenvector of
A corresponding to the eigenvalue /i^. If /AO ^ 0 and ^ ^ A 0 , then the vector
v = —/Lt()jc0 + (A 0 —/LtJjCj is an eigenvector of A corresponding to ^ for
which {XQ, y} is a linearly independent set. Indeed

Finally, if /i 0 ^0 and ^ = A 0 , then XQ is the only eigenvector (up to


multiplication by a nonzero complex number) of A in M. To check this,
assume that aQx0 + a,*,, t*j ^ 0, is an eigenvector of A corresponding to an
eigenvalue VQ. Then

But the left-hand side of this equality is

and comparing this with equality (2.1), we obtain


12 Invariant Subspaces

which (with a, 7^0) implies A0 = VQ and a,^t0 = 0, a contradiction with the


assumption /u,0 ^ 0. However, note that the vectors z = (1 II*>Q)X\ and JCQ form
a linearly independent set and z has the property that Az - A0z = *0. Such a
vector z will be called a generalized eigenvector of A corresponding to the
eigenvector jc0.
In conclusion, the two-dimensional invariant subspace M is spanned by
two eigenvectors if and only if either /x() = 0 or /t0 T^ 0 and /LI, ¥^ A0. If /AQ ^ 0
and /A, = A 0 , then ^ is spanned by an eigenvector and a corresponding
generalized eigenvector.
A study of invariant subspaces of dimension greater than 2 along these
lines becomes tedious. Nevertheless, it can be done and leads to the
well-known Jordan normal form of a matrix (or transformation) (see Chap-
ter 2).
Using eigenvectors, one can generally produce numerous invariant sub-
spaces, as demonstrated by the following proposition.

Proposition 1.2.1
Let $\lambda_1, \ldots, \lambda_k$ be eigenvalues of $A$ (not necessarily distinct), and let $x_i$ be an eigenvector of $A$ corresponding to $\lambda_i$, $i = 1, \ldots, k$. Then $\operatorname{Span}\{x_1, \ldots, x_k\}$ is an $A$-invariant subspace.

Proof. For any $x = \sum_{i=1}^k \alpha_i x_i \in \operatorname{Span}\{x_1, \ldots, x_k\}$, where $\alpha_i \in \mathbb{C}$, we have

$$Ax = \sum_{i=1}^k \alpha_i \lambda_i x_i \in \operatorname{Span}\{x_1, \ldots, x_k\}$$

so indeed $\operatorname{Span}\{x_1, \ldots, x_k\}$ is $A$ invariant. □
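A numerical version of this proposition is immediate (an illustrative sketch; the matrix is an arbitrary choice): the span of any selection of eigenvectors returned by an eigensolver is invariant.

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [0., 3., 0.],
              [0., 0., 5.]])
w, V = np.linalg.eig(A)      # A V = V diag(w): columns of V are eigenvectors
X = V[:, :2]                 # span of the first two eigenvectors
# A X = X diag(w_1, w_2), so A maps the span of the columns of X into itself:
print(np.allclose(A @ X, X @ np.diag(w[:2])))    # True
```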

For some transformations all invariant subspaces are spanned by eigenvectors as in Proposition 1.2.1, and for some transformations not all invariant subspaces are of this form. Indeed, in Example 1.1.1 only one of the $n$ nonzero invariant subspaces is spanned by eigenvectors. On the other hand, in Example 1.1.2 every nonzero vector is an eigenvector corresponding to $\lambda_0$, so obviously every $A$-invariant subspace is spanned by eigenvectors.

1.3 JORDAN CHAINS

We have seen in the description of two-dimensional invariant subspaces that eigenvectors alone are not always sufficient for description of all invariant subspaces. This fact necessitates consideration of generalized eigenvectors as well. Let us make a general definition that will include this notion. Let $\lambda_0$ be an eigenvalue of a linear transformation $A: \mathbb{C}^n \to \mathbb{C}^n$. A chain of vectors $x_0, x_1, \ldots, x_k$ is called a Jordan chain of $A$ corresponding to $\lambda_0$ if $x_0 \neq 0$ and the following relations hold:

$$Ax_0 = \lambda_0 x_0, \qquad Ax_i = \lambda_0 x_i + x_{i-1}, \quad i = 1, \ldots, k \qquad (1.3.1)$$

The first equation (together with $x_0 \neq 0$) means that $x_0$ is an eigenvector of $A$ corresponding to $\lambda_0$. The vectors $x_1, \ldots, x_k$ are called generalized eigenvectors of $A$ corresponding to the eigenvalue $\lambda_0$ and the eigenvector $x_0$.

For example, let

$$A = \begin{bmatrix} \lambda_0 & 1 & & \\ & \lambda_0 & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_0 \end{bmatrix}$$

as in Example 1.1.1. Then $e_1$ is an eigenvector of $A$ corresponding to $\lambda_0$, and $e_1, e_2, \ldots, e_n$ is a Jordan chain. This Jordan chain is by no means unique; for instance, $e_1, e_2 + ae_1, \ldots, e_n + ae_{n-1}$ is again a Jordan chain of $A$, where $a \in \mathbb{C}$ is any number.
In Example 1.1.3 the matrix $A$ does not have generalized eigenvectors at all; that is, every Jordan chain consists of an eigenvector only. Indeed, we have $A = \operatorname{diag}[\lambda_1, \lambda_2, \ldots, \lambda_n]$, where $\lambda_1, \ldots, \lambda_n$ are distinct complex numbers; therefore

$$\det(\lambda I - A) = \prod_{i=1}^n (\lambda - \lambda_i)$$

So $\lambda_1, \ldots, \lambda_n$ are exactly the eigenvalues of $A$. It is easily seen that any eigenvector of $A$ corresponding to $\lambda_{i_0}$ is of the form $\alpha e_{i_0}$ with a nonzero scalar $\alpha$. Assuming that there is a Jordan chain $\alpha e_{i_0}, x$ of $A$ corresponding to $\lambda_{i_0}$, equations (1.3.1) imply

$$Ax = \lambda_{i_0} x + \alpha e_{i_0} \qquad (1.3.2)$$

Writing $x = \sum_{j=1}^n \beta_j e_j$, we obtain

$$\sum_{j=1}^n (\lambda_j - \lambda_{i_0})\beta_j e_j = \alpha e_{i_0} \qquad (1.3.3)$$

As $\lambda_j \neq \lambda_{i_0}$ for $j \neq i_0$, we find immediately that $\beta_j = 0$ for $j \neq i_0$. But then the left-hand side of equation (1.3.3) is zero, a contradiction with $\alpha \neq 0$. So there are no generalized eigenvectors for the transformation $A$.
Jordan chains allow us to construct more invariant subspaces.

Proposition 1.3.1
Let $x_0, \ldots, x_k$ be a Jordan chain of a transformation $A$. Then the subspace $\mathcal{M} = \operatorname{Span}\{x_0, \ldots, x_k\}$ is $A$ invariant.

Proof. We have

$$Ax_0 = \lambda_0 x_0 \in \mathcal{M}$$

where $\lambda_0$ is the eigenvalue of $A$ to which $x_0, \ldots, x_k$ corresponds; and for $i = 1, \ldots, k$

$$Ax_i = \lambda_0 x_i + x_{i-1} \in \mathcal{M}$$

Hence the $A$ invariance of $\mathcal{M}$ follows. □
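The relations (1.3.1) are immediate to verify numerically in the setting of Example 1.1.1; a small check (illustrative only, with $n = 3$ and $\lambda_0 = 5$ chosen arbitrarily):

```python
import numpy as np

lam0, n = 5.0, 3
A = lam0 * np.eye(n) + np.diag(np.ones(n - 1), 1)   # Jordan block J_3(5)
x0, x1, x2 = np.eye(n)                              # the chain e_1, e_2, e_3

assert np.allclose(A @ x0, lam0 * x0)               # A x_0 = lam0 x_0
assert np.allclose(A @ x1, lam0 * x1 + x0)          # A x_1 = lam0 x_1 + x_0
assert np.allclose(A @ x2, lam0 * x2 + x1)          # A x_2 = lam0 x_2 + x_1
print("e_1, e_2, e_3 is a Jordan chain of J_3(5)")
```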

The following proposition shows how the Jordan chains behave under a linear change in the matrix $A$.

Proposition 1.3.2
Let $\alpha \neq 0$ and $\beta$ be complex numbers. A chain of vectors $x_0, x_1, \ldots, x_k$ is a Jordan chain of $A$ corresponding to the eigenvalue $\lambda_0$ if and only if the vectors

$$x_0, \quad \alpha^{-1}x_1, \quad \alpha^{-2}x_2, \quad \ldots, \quad \alpha^{-k}x_k \qquad (1.3.4)$$

form a Jordan chain of $\alpha A + \beta I$ corresponding to the eigenvalue $\alpha\lambda_0 + \beta$.

Proof. Assume that $x_0, \ldots, x_k$ is a Jordan chain of $A$ corresponding to $\lambda_0$, that is, equalities (1.3.1) hold. Then we have

$$(\alpha A + \beta I)x_0 = (\alpha\lambda_0 + \beta)x_0$$

and in general for $i = 1, \ldots, k$

$$(\alpha A + \beta I)(\alpha^{-i}x_i) = \alpha^{1-i}(\lambda_0 x_i + x_{i-1}) + \beta\alpha^{-i}x_i = (\alpha\lambda_0 + \beta)(\alpha^{-i}x_i) + \alpha^{-(i-1)}x_{i-1}$$

So by definition the vectors in equality (1.3.4) form a Jordan chain of $\alpha A + \beta I$ corresponding to $\alpha\lambda_0 + \beta$.

Conversely, assume that equality (1.3.4) is a Jordan chain of $\alpha A + \beta I$ corresponding to $\alpha\lambda_0 + \beta$. As

$$A = \alpha^{-1}(\alpha A + \beta I) - \alpha^{-1}\beta I$$

the first part of the proof shows that the vectors

$$x_0, \quad x_1, \quad \ldots, \quad x_k$$

form a Jordan chain of $A$ corresponding to the eigenvalue

$$\alpha^{-1}(\alpha\lambda_0 + \beta) - \alpha^{-1}\beta = \lambda_0 \qquad \square$$

Two corollaries from Proposition 1.3.2 will be especially useful in the sequel.

Corollary 1.3.3
(a) The vector $x_0$ is an eigenvector of $A$ corresponding to $\lambda_0$ if and only if $x_0$ is an eigenvector of $\alpha A + \beta I$ (here $\alpha \neq 0$, $\beta$ are complex numbers) corresponding to $\alpha\lambda_0 + \beta$; (b) the vectors $x_0, \ldots, x_k$ form a Jordan chain of $A$ corresponding to $\lambda_0$ if and only if these vectors constitute a Jordan chain of $A + \beta I$ corresponding to $\lambda_0 + \beta$ for any complex number $\beta$.

In many instances Corollary 1.3.3 allows us to reduce the consideration of eigenvalues and Jordan chains to cases when the eigenvalue is zero. Our first example of this device appears in the proof of the following proposition.

Proposition 1.3.4
The vectors in a Jordan chain $x_0, \ldots, x_k$ of $A$ are linearly independent.

Proof. Assume the contrary, and let $x_p$ be the first generalized eigenvector in the Jordan chain that is a linear combination of the preceding vectors:

$$x_p = \sum_{i=0}^{p-1} \alpha_i x_i$$

We can assume that the eigenvalue $\lambda_0$ of $A$ to which the Jordan chain $x_0, \ldots, x_k$ corresponds is zero. (Otherwise, in view of Corollary 1.3.3(b), we consider $A - \lambda_0 I$ in place of $A$.) So we have $Ax_p = x_{p-1}$. On the other hand, we have

$$Ax_p = \sum_{i=0}^{p-1} \alpha_i Ax_i = \sum_{i=1}^{p-1} \alpha_i x_{i-1}$$

Comparing both expressions, we see that $x_{p-1}$ is a linear combination of the vectors $x_0, \ldots, x_{p-2}$. This contradicts the choice of $x_p$ as the first vector in the Jordan chain that is a linear combination of the preceding vectors. □

1.4 INVARIANT SUBSPACES AND BASIC OPERATIONS ON LINEAR TRANSFORMATIONS

In this section we first consider questions concerning invariant subspaces of sums, compositions, and inverses of linear transformations. We shall also develop the connection between invariant subspaces for a linear transformation and those of similar and adjoint transformations.

The basic result for the first three algebraic operations is given in the following proposition.

Proposition 1.4.1
Let $A, B: \mathbb{C}^n \to \mathbb{C}^n$ be transformations, and let $\mathcal{M} \subseteq \mathbb{C}^n$ be a subspace that is simultaneously $A$ invariant and $B$ invariant. Then $\mathcal{M}$ is also invariant for $\alpha A + \beta B$ (with any $\alpha, \beta \in \mathbb{C}$) and for $AB$. Further, if $A$ is invertible, then $\mathcal{M}$ is also invariant for $A^{-1}$.

Proof. For every $x \in \mathcal{M}$ we have

$$(\alpha A + \beta B)x = \alpha Ax + \beta Bx \in \mathcal{M}$$

and $(AB)x = A(Bx) \in \mathcal{M}$ because $Bx \in \mathcal{M}$.

Assume now that $A$ is invertible, and let $x_1, \ldots, x_p$ be a basis in $\mathcal{M}$. Then the vectors $y_1 = Ax_1, \ldots, y_p = Ax_p$ are linearly independent (because $A$ is invertible) and belong to $\mathcal{M}$ (because $\mathcal{M}$ is $A$ invariant). So $y_1, \ldots, y_p$ is also a basis in $\mathcal{M}$. Now every $x \in \mathcal{M}$ can be written $x = \sum_{i=1}^p \alpha_i y_i$, and then

$$A^{-1}x = \sum_{i=1}^p \alpha_i x_i \in \mathcal{M} \qquad \square$$

For any transformation $A$, we denote by $\operatorname{Inv}(A)$ the set of all $A$-invariant subspaces. Then Proposition 1.4.1 means, in short, that

$$\operatorname{Inv}(A) \cap \operatorname{Inv}(B) \subseteq \operatorname{Inv}(\alpha A + \beta B) \qquad (1.4.1)$$

$$\operatorname{Inv}(A) \cap \operatorname{Inv}(B) \subseteq \operatorname{Inv}(AB) \qquad (1.4.2)$$

and, for invertible $A$,

$$\operatorname{Inv}(A) \subseteq \operatorname{Inv}(A^{-1}) \qquad (1.4.3)$$

By applying equality (1.4.3) with $A$ replaced by $A^{-1}$, we get $\operatorname{Inv}(A^{-1}) \subseteq \operatorname{Inv}(A)$, so actually equality holds in (1.4.3). It is very easy to produce examples when the equality fails in (1.4.1) or (1.4.2). For instance:

EXAMPLE 1.4.1. Let $A: \mathbb{C}^n \to \mathbb{C}^n$ be a transformation that is not of the form $\gamma I$ for some $\gamma \in \mathbb{C}$ (if $n \ge 2$, such transformations obviously exist). By Example 1.1.2, not all subspaces in $\mathbb{C}^n$ are $A$ invariant. On the other hand, take $B = A$ and $\alpha + \beta = 0$ in (1.4.1). Then the right-hand side of (1.4.1) is the zero transformation for which every subspace in $\mathbb{C}^n$ is invariant. □

The following example of strict inclusion in (1.4.2) is instructive.

EXAMPLE 1.4.2. Let

$$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}$$

An easy analysis (using Example 1.1.1) shows that $A$ and $B$ have no nontrivial common invariant subspaces. Thus $\operatorname{Inv}(A) \cap \operatorname{Inv}(B) = \{\{0\}, \mathbb{C}^2\}$. On the other hand, $AB$ must have an eigenvector that spans a nontrivial $AB$-invariant subspace. Again, the inclusion (1.4.2) is strict. □

Consider now the notion of similarity. Recall that two transformations $A$ and $B$ on $\mathbb{C}^n$ are called similar if $A = S^{-1}BS$ for some invertible transformation $S$ (called a similarity transformation between $A$ and $B$). Evidently, similar transformations have the same characteristic polynomial and, consequently, the same eigenvalues. The next proposition reveals the close connection between invariant subspaces of similar transformations.

Proposition 1.4.2
Let transformations $A$ and $B$ be similar, with the similarity transformation $S$: $A = S^{-1}BS$. Then a subspace $\mathcal{M} \subseteq \mathbb{C}^n$ is $A$ invariant if and only if the subspace

$$S\mathcal{M} = \{Sx \mid x \in \mathcal{M}\}$$

is $B$ invariant.

Proof. Let $\mathcal{M}$ be $A$ invariant, and let $x \in S\mathcal{M}$, so that $x = Sy$ for some $y \in \mathcal{M}$. Then $Bx = BSy = SAy$, and since $Ay \in \mathcal{M}$, we find that $Bx \in S\mathcal{M}$. So $S\mathcal{M}$ is $B$ invariant.

Conversely, assume that $S\mathcal{M}$ is $B$ invariant. Then for $y \in \mathcal{M}$ we have $BSy \in S\mathcal{M}$ and thus

$$Ay = S^{-1}BSy \in S^{-1}(S\mathcal{M}) = \mathcal{M}$$

So $\mathcal{M}$ is $A$ invariant. □
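Proposition 1.4.2 is easy to test numerically; a sketch (illustrative only, with an arbitrary diagonal $A$ and a random similarity $S$):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = np.diag([1., 1., 2., 3.])            # Span{e_1, e_2} is A-invariant
S = rng.standard_normal((n, n))          # a generic S is invertible
B = S @ A @ np.linalg.inv(S)             # then A = S^{-1} B S

def is_invariant(T, X, tol=1e-8):
    return np.linalg.matrix_rank(np.hstack([X, T @ X]), tol) == \
           np.linalg.matrix_rank(X, tol)

M = np.eye(n)[:, :2]                     # basis of M (as columns)
print(is_invariant(A, M), is_invariant(B, S @ M))   # True True
```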

Proposition 1.4.2 shows, in particular, that there is a natural correspondence between the sets of invariant subspaces of similar transformations. Let us check this correspondence more closely in some of the examples of invariant subspaces already introduced.

Proposition 1.4.3
Let $A$ and $B$ be similar, with the similarity transformation $S$. Then (a) $\operatorname{Im} B = S(\operatorname{Im} A)$; (b) $\operatorname{Ker} B = S(\operatorname{Ker} A)$; (c) if $x_0, x_1, \ldots, x_k$ is a Jordan chain of $A$ corresponding to $\lambda_0$, then $Sx_0, Sx_1, \ldots, Sx_k$ is a Jordan chain of $B$ corresponding to the same $\lambda_0$.

Proof. The proof is straightforward. Let us check (b). Take $x \in \operatorname{Ker} A$, so $Ax = 0$. Then $Ax = S^{-1}BSx = 0$, and as $S$ is invertible, $BSx = 0$, that is, $Sx \in \operatorname{Ker} B$. Reversing the order of this argument, we see that if $Sx \in \operatorname{Ker} B$ for some $x \in \mathbb{C}^n$, then $x \in \operatorname{Ker} A$. The proofs of (a) and (c) proceed in a similar way. □

Consider now the operation of taking adjoints. Let $A: \mathbb{C}^n \to \mathbb{C}^n$ be a transformation. Recall that the adjoint transformation $A^*: \mathbb{C}^n \to \mathbb{C}^n$ is defined by the relation

$$(Ax, y) = (x, A^*y), \qquad x, y \in \mathbb{C}^n$$

where $(\cdot\,, \cdot)$ is the standard scalar product in $\mathbb{C}^n$:

$$(x, y) = \sum_{i=1}^n x_i \bar{y}_i, \qquad x = \langle x_1, \ldots, x_n \rangle, \quad y = \langle y_1, \ldots, y_n \rangle$$

More generally, if $\mathcal{T}_1, \mathcal{T}_2$ are subspaces in $\mathbb{C}^n$ and $A: \mathcal{T}_1 \to \mathcal{T}_2$ is a linear transformation, its adjoint $A^*: \mathcal{T}_2 \to \mathcal{T}_1$ is defined by the relation

$$(Ax, y) = (x, A^*y), \qquad x \in \mathcal{T}_1, \quad y \in \mathcal{T}_2$$

It is not difficult to check that the adjoint transformation always exists and is unique. It is easily verified that for any linear transformations $A$ and $B$ on $\mathbb{C}^n$ and any $\alpha \in \mathbb{C}$

$$(A + B)^* = A^* + B^*, \qquad (\alpha A)^* = \bar{\alpha}A^*, \qquad (AB)^* = B^*A^*, \qquad (A^*)^* = A$$

If (in the standard basis $e_1, \ldots, e_n$)

$$A = [a_{jk}]_{j,k=1}^n$$

then the adjoint transformation is given by the formula

$$A^* = [\bar{a}_{kj}]_{j,k=1}^n$$

The same formula also holds for the transformation $A$ written as a matrix in any orthonormal basis in $\mathbb{C}^n$ as long as $A^*$ is considered as a matrix in the same basis.
There is a simple and useful characterization of the invariant subspaces of
the adjoint transformation A* in terms of the invariant subspaces of A, as
follows.

Proposition 1.4.4
Let A: <p" —>• <p" be a linear transformation. A subspace M C (J7" is A*
invariant if and only if its orthogonal complement ML is A invariant.

Proof. Assume that M is A* invariant, and let x e M ±. We must prove


that Ax G M L'. Indeed, for every y E M we have

because A*y E M and x E M. x. Conversely, assume that M ± is A invariant,


and take y E M. Then for every x E M *~ we have

which means that A*y E. M. So M is A* invariant.

Note the following equalities for the $A$-invariant subspaces $\operatorname{Ker} A$ and $\operatorname{Im} A$ and the $A^*$-invariant subspaces $\operatorname{Ker} A^*$ and $\operatorname{Im} A^*$:

$$\operatorname{Im} A^* = (\operatorname{Ker} A)^\perp, \qquad \operatorname{Ker} A^* = (\operatorname{Im} A)^\perp \qquad (1.4.4)$$

Indeed, let $x = A^*y$ and $z \in \operatorname{Ker} A$. Then $(x, z) = (A^*y, z) = \overline{(z, A^*y)} = \overline{(Az, y)} = 0$; so $x \in (\operatorname{Ker} A)^\perp$. Hence we have proved that

$$\operatorname{Im} A^* \subseteq (\operatorname{Ker} A)^\perp \qquad (1.4.5)$$

On the other hand, let $x$ be orthogonal to $\operatorname{Im} A^*$. Then for every $y \in \mathbb{C}^n$, we have $(Ax, y) = (x, A^*y) = 0$; so $Ax \perp \mathbb{C}^n$, and thus $Ax = 0$, or $x \in \operatorname{Ker} A$. So $(\operatorname{Im} A^*)^\perp \subseteq \operatorname{Ker} A$. Taking orthogonal complements, we obtain $\operatorname{Im} A^* \supseteq (\operatorname{Ker} A)^\perp$. Combining with (1.4.5), we obtain the first equality in (1.4.4). The second equality follows from the first one applied to $A^*$ instead of $A$ [recall that $(A^*)^* = A$].

Later, we shall also need the following property:

$$\operatorname{Im} A = \operatorname{Im}(AA^*) \qquad (1.4.6)$$

Here, the inclusion $\supseteq$ is clear. For the opposite inclusion, let $x \in \operatorname{Im} A$. Then $x = Ay$ for some $y$. If $z$ is the projection of $y$ onto $\operatorname{Ker} A$, then $y - z \in (\operatorname{Ker} A)^\perp$ and also $x = A(y - z)$. Then (1.4.4) implies that $y - z \in \operatorname{Im} A^*$ and so $x \in \operatorname{Im}(AA^*)$, as required.

A transformation $A: \mathbb{C}^n \to \mathbb{C}^n$ is called self-adjoint if $A = A^*$. It is easily seen that $A$ is self-adjoint if and only if it is represented by a hermitian matrix in some orthonormal basis (recall that a matrix $[a_{jk}]_{j,k=1}^n$ is called hermitian if $a_{jk} = \bar{a}_{kj}$, $j, k = 1, \ldots, n$). For this important class of transformations we have the following corollary of Proposition 1.4.4.

Corollary 1.4.5
If $A$ is self-adjoint, then $\mathcal{M}^\perp$ is $A$ invariant if and only if $\mathcal{M}$ is $A$ invariant.
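The defining relation of the adjoint is also easy to check numerically; a sketch (illustrative only, with random complex data):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Astar = A.conj().T                          # matrix of the adjoint A*

x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# (Ax, y) = (x, A*y).  With (u, v) = sum_i u_i conj(v_i), and np.vdot
# conjugating its *first* argument, (u, v) is np.vdot(v, u):
print(np.allclose(np.vdot(y, A @ x),        # (Ax, y)
                  np.vdot(Astar @ y, x)))   # (x, A*y)  -> True
```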

1.5 INVARIANT SUBSPACES AND PROJECTORS

A linear transformation $P: \mathbb{C}^n \to \mathbb{C}^n$ is called a projector if $P^2 = P$. The important feature of projectors is that there exists a one-to-one correspondence between the set of all projectors and the set of all pairs of complementary subspaces in $\mathbb{C}^n$. This correspondence is described in Theorem 1.5.1.

Recall first that if $\mathcal{M}, \mathcal{L}$ are subspaces of $\mathbb{C}^n$, then $\mathcal{M} + \mathcal{L} = \{z \in \mathbb{C}^n \mid z = x + y,\ x \in \mathcal{M},\ y \in \mathcal{L}\}$. This sum is said to be direct if $\mathcal{M} \cap \mathcal{L} = \{0\}$, in which case we write $\mathcal{M} \dotplus \mathcal{L}$ for the sum. The subspaces $\mathcal{M}, \mathcal{L}$ are complementary (are direct complements of each other) if $\mathcal{M} \cap \mathcal{L} = \{0\}$ and $\mathcal{M} \dotplus \mathcal{L} = \mathbb{C}^n$. Nontrivial subspaces $\mathcal{M}, \mathcal{L}$ are orthogonal if for each $x \in \mathcal{M}$ and $y \in \mathcal{L}$ we have $(x, y) = 0$, and they are orthogonal complements if, in addition, they are complementary. In this case, we write $\mathcal{M} = \mathcal{L}^\perp$, $\mathcal{L} = \mathcal{M}^\perp$.

Theorem 1.5.1
Let $P$ be a projector. Then $(\operatorname{Im} P, \operatorname{Ker} P)$ is a pair of complementary subspaces in $\mathbb{C}^n$. Conversely, for every pair $(\mathcal{L}_1, \mathcal{L}_2)$ of complementary subspaces in $\mathbb{C}^n$, there exists a unique projector $P$ such that $\operatorname{Im} P = \mathcal{L}_1$, $\operatorname{Ker} P = \mathcal{L}_2$.

Proof. Let $x \in \mathbb{C}^n$. Then $x = (x - Px) + Px$. Clearly, $Px \in \operatorname{Im} P$ and $x - Px \in \operatorname{Ker} P$ (because $P^2 = P$). So $\operatorname{Im} P + \operatorname{Ker} P = \mathbb{C}^n$. Further, if $x \in \operatorname{Im} P \cap \operatorname{Ker} P$, then $x = Py$ for some $y \in \mathbb{C}^n$ and $Px = 0$. So

$$x = Py = P^2y = P(Py) = Px = 0$$

and $\operatorname{Im} P \cap \operatorname{Ker} P = \{0\}$. Hence $\operatorname{Im} P$ and $\operatorname{Ker} P$ are indeed complementary subspaces.

Conversely, let $\mathcal{L}_1$ and $\mathcal{L}_2$ be a pair of complementary subspaces. Let $P$ be the unique linear transformation in $\mathbb{C}^n$ such that $Px = x$ for $x \in \mathcal{L}_1$ and $Px = 0$ for $x \in \mathcal{L}_2$. Then clearly $P^2 = P$, $\mathcal{L}_1 \subseteq \operatorname{Im} P$, and $\mathcal{L}_2 \subseteq \operatorname{Ker} P$. But we already know from the first part of the proof that $\operatorname{Im} P \dotplus \operatorname{Ker} P = \mathbb{C}^n$. By dimensional considerations we have, consequently, $\mathcal{L}_1 = \operatorname{Im} P$ and $\mathcal{L}_2 = \operatorname{Ker} P$. So $P$ is a projector with the desired properties. The uniqueness of $P$ follows from the property that $Px = x$ for every $x \in \operatorname{Im} P$ (which, in turn, is a consequence of the equality $P^2 = P$). □
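The second half of the proof also gives a concrete recipe for building the projector: send a basis of $\mathcal{L}_1$ to itself and a basis of $\mathcal{L}_2$ to zero. A sketch (illustrative only; the subspaces below are arbitrary choices):

```python
import numpy as np

def projector(L1, L2):
    # Projector on Im L1 along Im L2; the columns of L1 and L2 together
    # must form a basis of C^n.  P is determined by P L1 = L1, P L2 = 0.
    B = np.hstack([L1, L2])                    # basis of C^n
    C = np.hstack([L1, np.zeros_like(L2)])     # prescribed images
    return C @ np.linalg.inv(B)

L1 = np.array([[1.], [0.], [0.]])
L2 = np.array([[1., 0.], [1., 1.], [0., 1.]])
P = projector(L1, L2)
print(np.allclose(P @ P, P))                            # P^2 = P
print(np.allclose(P @ L1, L1), np.allclose(P @ L2, 0))  # Im P, Ker P as required
```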

We say that $P$ is the projector on $\mathcal{L}_1$ along $\mathcal{L}_2$ if $\operatorname{Im} P = \mathcal{L}_1$, $\operatorname{Ker} P = \mathcal{L}_2$. A projector $P$ is called orthogonal if $\operatorname{Ker} P = (\operatorname{Im} P)^\perp$. Thus the corresponding complementary subspaces are mutually orthogonal. Orthogonal projectors are particularly important and can be characterized as follows.

Proposition 1.5.2
A projector P is orthogonal if and only if P is self-adjoint, that is, P* = P.

Proof. Suppose that P* = P, and let x G Im P, y e Ker P. Then (x, y) =


(Px, y) = (x, Py) - (x, 0) = 0, that is, Ker P is orthogonal to Im P. Since by
Theorem 1.5.1 Ker P and Im P are complementary, it follows that in fact
KerP = (ImP) 1 .
Conversely, let Ker P = (Im P)1. To prove that P* = P, we have to check
the equality

Because of the sesquilinearity of the function (Px, y) in the arguments


x, v G <p", and in view of Theorem 1.5.1, it is sufficient to prove equation
(1.5.1) for the following four cases: (a) x, v G l m P ; (b) *EKerP, yE
Im P; (c) x G Im P, y e Ker P; (d) x, y e Ker P. In case (d), equality (1.5.1)
22 Invariant Subspaces

is trivial because both sides are 0. In case (a) we have

and (1.5.1) follows. In case (b), the left-hand side of equation (1.5.1) is
zero (since x £ Ker P) and the right-hand side is also zero in view of
the orthogonality Ker P = (Im P)1. In the same way, one checks (1.5.1) in
case (c).
So (1.5.1) holds, and P* = P.

Note that if P is a projector, so is / - P. Indeed, (/ — P)2 =


7 - 2 P + P 2 = / - 2 P + P = /-P. Moreover, KerP = Im(/-P) and
Im P = Ker(/-P). It is natural to call the projectors P and I— P com-
plementary projectors.
We now give useful representations of a projector with respect to a
decomposition of <p" into a sum of two complementary subspaces. Let
T: <£"-» <p" be a transformation and let ^, J£2 be a pair of complementary
subspaces in <p". Denote m, = dim ^ (i = 1, 2); then m, + m2 = n. The
transformation T may be written as a 2 x 2 block matrix with respect to the
decomposition <$?, + Z£2 = <£"":

Here Tif (i, j = 1,2) is an ra( x my matrix that represents in some basis the
transformation P,T|^: J2?y—»«£), where P, is the projector on 5€t along ^_i
(so P, + P2 = /).
Suppose now that T — P is a projector on =$?, = Im P. Then representation
(1.5.2) takes the form

for some matrix X. In general, X ¥^ 0. One can easily check that X = 0 if and
only if J^ = Ker P. Analogously, if «$?, = Ker P, then (1.5.2) takes the form

and Y — 0 if and only if 1£2 — Im P. By the way, the direct multiplicatio


P-P, where P is given by (1.5.3) or (1.5.4), shows that P is indeed a
projector: P2 = P.
Consider now an invariant subspace M for a transformation A: $"—* <p".
For any projector P with Im P = M we obtain
Invariant Subspaces and Projectors 23

Indeed, if x £ Ker P, we obviously have

If x E Im P = M, we see that Ax belongs to M as well and thus

once more. Since <p" = Ker P + Im P, (1.5.5) follows. Conversely, if P is a


projector for which (1.5.5) holds, then for every x E Im P we have PAx =
Ax] in other words, Im P is A invariant. So a subspace M is A invariant if
and only if it is the image of a projector P for which (1.5.5) holds.
Let M be an A-invariant subspace and let P be a projector on M [so that
(1.5.5) holds]. Denoting by M' the kernel of P, represent A as a 2 x 2 block
matrix

with respect to the direct sum decomposition <£" = M 4- M'. Here A} t


is a transformation PAP\M:M-*M, A12 is a linear transformation
PA(I-P)\M.:M'-+M,

and all these transformations are written as matrices with respect to some
chosen bases in M and M'. As M is A invariant, equation (1.5.5) implies
that (/- P)AP = Q, that is, A2l = 0. Hence

Using this representation of the matrix A, we can deduce some important


connections between the restriction A\M — Al} and the matrix A itself.

Proposition 1.5.3
Let XQ, . . . , xk be a Jordan chain of A\M corresponding to the eigenvalue A0
of A\M. Then JCQ, . . . , xk is also a Jordan chain of A corresponding to A 0 . In
particular, all eigenvalues of A\M are also eigenvalues of A.

Proof. We have x0 ^ 0; jr.. £ M for i = 0,. . . , A;, and


24 Invariant Subspaces

As these relations can be rewritten as

But PXj — x^ i = 0,1,. . . , k, and we obtain the relations defining


JCQ, . . . , xk as a Jordan chain of A corresponding to A0.

The last statement in Proposition 1.5.3 can also be proved in the


following way. Suppose that \0Eo-(A{l), that is, Ker(A 0 7- An) ^ {0}.
The representation (1.5.6) implies that any nonzero vector from
Ker(A 0 /-,4 n ) belongs to Ker(A 0 /-X). Thus Ker( A07 - A) * {0}, and
\0e<r(A).
In fact, a more general result holds.

Proposition 1.5.4
Let M be an A-invariant subspace with a direct complement M' in <p", and let

be the representation of A with respect to the decomposition <p" = M + M'.


Then

Proof. This follows immediately from the fact that det(A/- J( 4) =


det(A/-,4 n ) det(A/-,4 2 2 ).

As an example in which projectors and the subspaces Im A and Ker A of


a transformation A all play important roles, let us describe here a construc-
tion of generalized inverses for A.
Given a transformation A: <P"-* <pm, the transformation X: <£""—»<p" is
called a generalized inverse of A if the following holds: for any b E Im A the
linear system Ax = b has a solution x = Xb, and for any b E Im X the linear
system Xx — b has a solution x — Ab. So this is a natural generalization of
the notion of the inverse transformation.
Observe that X is a generalized inverse of A if and only if AX A = A and
XAX = X. Indeed, let X be a generalized inverse of A. Then AXb = b for
every b E Im A, that is, for every b of the form b = Ay. So AX Ay = Ay for
all y E <p", and AX A = A. Similarly, one checks that XAX = X. Conversely,
if AX A = A, then for every b of the form b - Ay the vector Xb = XAy is
obviously a solution of the linear equation Ax ==• b.
The descrition of all generalized inverses of A, which implies, in particu-
lar, that a generalized inverse of A always exists, is given by the following
theorem.
Angular Transformations and Matrix Quadratic Equations 25

Theorem 1.5.5
Let A: ("-* <pm be a transformation, let <p" = Ker A + N, <fm = Im A + R
for some subspaces N and R, and let P be the projector on Im A along R, Q
the projector on N along Ker A. Then (a) the transformation Al = A\^ is a
one-to-one transformation of N onto Im A] (b) the transformation A1 defined
on$mbyAIy = A^l(Py),forallyE.$m,isa generalized in verse of A for which
A A = P and AA = Q; (c) all generalized inverses of A are determined as
N, R range over all complementary subspaces for Ker A, Im A, respectively.

The proof of Theorem 1.5.5 is straightforward.


It is easily seen that, in the hypothesis of the theorem, complementary
subspaces R, N are simply the range and null-space of the generalized
inverse that they determine.

Corollary 1.5.6
In the statement of Theorem 1.5.5, we have

1.6 ANGULAR TRANSFORMATIONS AND


MATRIX QUADRATIC EQUATIONS

In this section we study angular transformations and their connections with


matrix quadratic equations and invariant subspaces. The correspondence
between the invariant subspaces of similar transformations described in
Proposition 1.4.2 is useful here.
This discussion can be seen as the first step in the examination of
solutions of matrix quadratic equations. In this program, we first need the
notion of a subspace "angular with respect to a projector." In Chapter 13
we discuss the topological properties of such subspaces in preparation for
the applications to quadratic equations to be made in Chapters 17 and 20.
Let TT be a projector defined on <p". Transformations acting on (f"1 in this
section are written in 2 x 2 block matrix form with respect to the decompo-
sition <p" = Ker TT + Im TT.
A subspace N of <p" is said to be angular with respect to IT if N + Ker TT =
<p". That is, if and only if N and Ker TT are complementary subspaces of <p".
Thus Im TT is angular with respect to TT, but more generally, if R is any
transformation from Im TT into Ker TT, then the subspace
26 Invariant Subspaces

is angular with respect to TT. To see this, observe first that NR is indeed a
subspace; that is, if jc,, jc2 E NK, then for some y^ y2 G Im TT

and if

Thenecause,forany
then

and
Finally, if z E NR fl Ker TT, then z = Ry + y, where y £ Im TT and also
77-z = 0. Thus

Since ft is into Ker TT, TrR - 0 and it follows that y = 0. Hence z = 0 and
$n = tfR + Ker TT.
The angular subspaces generated in this way are, in fact, all possible
angular subspaces.

Proposition 1.6.1
Let N be a subspace of <p". Then Jf is angular with respect to TT if and only if
j{ - NR for some transformation R: Im TT + Ker TT that is uniquely determined
by N.

Proof. If jV = JVR, we have already checked that ^V is angular. To prove


the converse, assume that Jf is angular with respect to TT, and let Q be the
projector of <p" onto N along Ker TT. Put

Then N = NR. Indeed

that is, R: Im TT—>Ker TT, and we have to show that N = NR.


If x G jVw, then for some y = Try,
Angular Transformations and Matrix Quadratic Equations

Thus NR C N. Conversely, if y 6 N then

thus N = NR, as required.


To prove the uniqueness of R, we show that any defining transformation
R in (1.6.1) must have the form (1.6.2). Thus let jVbe angular with respect
to TT, and let R: Im IT—»Ker TT satisfy (1.6.1). Let _ y £ l m 7 r and x =
Ry + y E. N. Then, since / - Q is onto Ker TT along jV

But QR = Q and QTT = 0 so that Ry = (Q - ir)y.

The transformation R appearing in the preceding proposition is called the


angular transformation for N. Note that R can be defined as the restriction
of a difference of projectors:

Consider now a transformation T\ <P"—» <p". As before, let TT: <p"—» <f"
be a projector so that we have <p" = Im TT + Ker TT. Then T has a represen-
tation with respect to this decomposition:

It is clear that Im TT is invariant under T if and only if T2l — 0. Similarly,


Ker TT is T invariant if and only if r,2 = 0. More generally, what is the
condition that a subspace Jf that is angular with respect to TT be T invariant?

Theorem 1.6.2
Let Ji be an angular subspace with respect to the projector TT. Let T have the
representation (1.6.3) with respect to the decomposition <p" = Im TT + Ker TT.
Then Ji is T invariant if and only if the angular transformation R for N
satisfies the matrix quadratic equation.

Proof. If /,, 72 are the identity transformations on Im TT and Ker TT,


respectively, then since R: Im TT—»Ker TT we can define the transformation
28 Invariant Subspaces

which is written as a 2 x 2 matrix with respect to the decomposition


<P" - Im TT 4- Ker TT. The transformation E is obviously invertible and

For every x £ Im TT we have Ex — x + Rx E N. So E maps Im TT onto N and


E~l maps JV back onto Im TT. By Proposition 1.4.2, jV is T invariant if and
only if Im TT is E~1TE invariant. Now observe that

1
so Im TT is E TE invariant if and only if (1.6.4) holds.

Another important observation follows from the similarity (1.6.5).

Corollary 1.6.3
If N is T invariant, then

and

Proof. We have

Now use Proposition1.5.4to obtain(1.6.6).Furth e r ,


so Im TT is E 1TE invariant if and only if (1.6.4) holds.

1.7 TRANSFORMATIONS IN FACTOR SPACES

Let N C <p" be a subspace. We say that two vectors x, y E <p" are compar-
able modulo jV if x - y E Jf, and denote this by x = y (mod JV). In particu-
lar, x = 0 (mod JV) if and only if x G jV. This relation is easily seen to be
reflexive, symmetrical, and transitive. That is

x = ;c(mod Jf) for all x G <p"


x = Xmod -^0 ^ y = *(mod jV)
jr = Xm°d^V) and y = z(mod JV)4>jc s z(mod Jf)
Transformations in Factor Spaces 29

Thus we have an equivalence relation on <p". It follows that <p" is decom-


posed into disjoint classes of vectors with the properties that in each class
the vectors are comparable modulo Jf, and in different classes the vectors
are not comparable modulo JV. We denote by [x] v- the class of vectors that
are comparable modulo M to a given vector x G <£"". The set of all such
classes of vectors defined by comparability modulo JV is denoted $"IN.

Proposition 1.7.1
Let set $"/Jf be a vector space over (p with the following operations of
addition and multiplication by a complex number:

Proof. We have to check first that these definitions do not depend on


the choice of the representatives x G [x]^ and .yGf}']^. If xl G [x]^ and
y i ^ M j f ^ then

that is, Jtj + v, G [x + y]^ . So indeed the class [jc + y]^ does not depend on
the choice of x and y. Similarly, one checks that [ax]^ does not depend on
the choice of x in the class [x]^ (for fixed a).
It is a straightforward but tedious task to verify that $nIN satisfies the
following defining properties of a vector space over <p: The sum is commuta-
tive and associative: (a) x + y = y + x, (x + y) + z = x + (y + z) for every
x, y, z G <p"/jV; (b) there is a zero element OE (p"/jV, that is, an element 0
such that jc + 0 = x for all x E. <p%V; (c) for every x €E $"/JV there is an
additive inverse element y G (p7^V, that is, such that x + y = 0; (d) for every
a, (3 G <p and x, y G <p"/OVthe following equalities hold: a(x + y) = ax + ay,
(a + fi}x = ax + fix, (a(3)x = a(fix), and Ix = x (here 1 is the complex
number). We leave the verification of all these properties to the reader. D

The vector space <p"/jVis isomorphic to any direct complement N' of .yVin
<p". Indeed, let a G <p"/^V; then there exists a unique vector y G N' such that
a = [y]tf and in fact, y — Px, where P is the projector on JV' along JV and
x is any vector in the class a. This is easily checked. We have y — x =
—(I — P)x G JV, so v G a. If there were two different vectors y, and v 2 from
N' such that [y^^ — [y2]jf = a, then y ] = _y2 G jV' H JV and _y, ¥= v 2 , which
contradicts the choice of Jf' as a direct complement to ^Vin <p". So we have
constructed a map <p: <p"~* N' defined by <p(a) = y. This map is easily seen
to be a homomorphism of vector spaces; that is
30 Invariant Subspaces

for every a, b E <p"A/V and every a E (p. Moreover, if <p(a) = <p(b), then the
vector y = <p(a) = <p(/>) belongs to both classes a and b of comparable
vectors modulo N, and thus a = b. So 9 is one-to-one. Taking any y E N\
we see that <p([.yL) ~ y> so <P is onto. Summing up, <p is an isomorphism
between the two vector spaces <p"/jV and N'. In particular, dim <p7^V =
n — dim N. Assume now that N is A invariant for some transformation
A: <p"-» <p". Then the induced transformation A: f"W-> f"/Jf is defined
by ^4[*] v = [^-^l.v f°r an V *€• $"• This definition does not depend on the
choice of the vector x in its class [*] A . Indeed, if [jc,]^ = [^l^v' then

because xl - x2 E Ji and N is A invariant.


We now present some basic properties of the induced linear transfor-
mation A.

Proposition 1.7.2
If X is invariant for both transformations A: (p"—»<P" and B: <p"-» <£", then

If, in addition, A is invertible, then

l
Proof. By Proposition 1.4.1, jVis invariant for aA + @B, AB, and A
(if A is invertible). For any jcE <p" we have

Further, by definition of the induced transformation we have

and

for every *e<p". Finally, (1.7.2) is a particular case of (1.7.1) (with


B — A~l), taking into account the fact that / = /.
The Lattice of Invariant Subspaces 31

It may happen that A is not invertible but A is invertible. For instance,


let A: <P"—»• <p" be any transformation with the property that <p" = Ker A 4-
Im A. (There are many transformations with this property; those represen-
ted by a diagonal matrix in some basis for <£"", for example.) Put Ji = Ker A.
Then for every vector x E <f" that is not in N we have-i4[jt] v = [Ax]A 5^0.
Thus Ker A = {0} and A is invertible. The following proposition clarifies the
situation.
Proposition 1.7.3
7/A 0 is an eigenvalue of A and Ker(/l - A 0 /) is not contained in N, then A0 is
also an eigenvalue of A. Conversely, every eigenvalue A0 of A is an eigen-
value of A and Ker(/l - A07) is not contained in N.
The proof is immediate: if Ax - AOJC with x^N, then ^4[jc] v = A 0 [*] v
with [x]^ T^ 0, and conversely.

1.8 THE LATTICE OF INVARIANT SUBSPACES

We start with the notion of a lattice of subspaces in <p". A set S of subspaces


in <p" is called a lattice if {0} and <p" belong to S and S contains the
intersection and sum of any two subspaces belonging to S. The following are
examples of lattices of subspaces: (a) S = {{0}, M, M±, <p"}, where M is a
fixed subspace in (p"; (b) S = {{0}, S p a n f ^ , . . . , ek} for k = 1, . . . , «}; (c)
S is the set of all subspaces in <p". For us, the following example of a lattice
of subspaces will be the most important.
Proposition 1.8.1
The set lnv(A) of all invariant subspaces for a fixed transformation
A: <p"-»<p" is a lattice.
Proof. Let M, jVelnv(y4). If x&M PuV, then because of the A in-
variance of M and N we have Ax G M and Ax G JV, so M D JV* is .4 invariant.
Now let jc G J( + jV, so that x = xl+x2, where x^M, x2 G N. Then
Ax = Ax{ + Ax2 E.M. + N, and Jf + JV is A invariant as well. Finally, both
{0} and <p" obviously belong to Inv(A).
Actually, examples (b) and (c) are particular cases of Proposition 1.8.1:
(b) is just the set of all A -invariant subspaces for

and example (c) is the set of all invariant subspaces for the zero matrix.
32 Invariant Subspaces

In contrast, if n > 2, the lattice of example (a) is never the lattice of


invariant subspaces of a fixed transformation A. Indeed, assuming the
contrary, the restriction A\M has a one-dimensional invariant subspace (a
subspace spanned by an eigenvector; here we consider A\M as a transfor-
mation from M into M). By Proposition 1.5.3, this subspace is also an
invariant subspace of A. Hence necessarily dim M = 1, and for the same
reason dim M L - 1. Since <p" = M + M1 we obtain a contradiction when
n>2.
In terms of the lattices of invariant subspaces, Propositions 1.4.2 and
1.4.4 can be restated as follows. We define [Inv(/l)]1 to be the set of
subspaces M1 for which M G Inv(y4).

Proposition 1.8.2
Given a transformation A: $" —»(p" and an invertible transformation
S: <F"-*<F", we have

and

We know that if Ml and M2 are A invariant, then so are Ml + M2 and


M}r\M2. It is of interest to find out how the spectra of the restrictions
A\M +M and A\M nM are related to the spectra of A\M and A\M^.

Theorem 1.8.3
If M{ and M2 are A-invariant subspaces, then

and

Recall that cr(B) stands for the set of eigenvalues of a transformation B.

Proof. Proposition 1.5.3 shows that the inclusion D holds in (1.8.1). To


prove the opposite inclusion, write

where M\ is a subspace in Ml such that M\ + (M^r\ M2) = Mlt and


M'2<^M2 satisfies M2 + (Ml C\M2) = M2. Write A\Mt+M2 as the 3 x 3 block
matrix with respect to decomposition (1.8.3):
The Lattice of Invariant Subspaces 33

Here, A^^PjAPj, and P{ (resp. P3) is the projector on M\ along


(Ml r\M2) + M'2 [resp. on M'2 along (M{ r\M2) + M(], and P2 = I- P^-
P3. As we have seen above, the A invariance of Ml implies A3l — A32 — 0,
and the A invariance of M2 implies A ]2 = A13 = 0. So

We find that

and hence that \Gcr(A\Mi+M ) implies that \Eo-(A\Mi) or \Ecr(A\M2).


For the proof of (1.8.2) note that M^(^M2CMl, and hence by Propos-
ition 1.5.3, (T(A\MinM2)Ca-(A\M). S Similarly, v(A\M^M7)Ca(A\M), and
(1.8.2) follows.
The following example shows that the inclusion in (1.8.2) may be strict.

EXAMPLE 1.8.1. Let

Then Ji} and M2 are A invariant and <r(A\M nM ) = {1}; <T(A\M) =


o-G4L2) = {i,o).
A set 5 of subspaces in <p" is called a chain if {0} and <p" belong to S and
either M C Ji or N C M (with proper inclusions) for every pair of different
subspaces M , J f E . S . Obviously, a chain is also a lattice. Also, a chain of
subspaces is always finite (actually, it cannot contain more than n + 1
subspaces), in contrast to lattices that may be infinite, as in example (c)
above.
Let

be a chain of different subspaces. We choose a direct complement ^ to


34 Invariant Subspaces

Mj_{ in the subspace Mt (i = 1,. . . , k). Then we obtain a decomposition of


<p" into a direct sum

This means that for every vector x E (p" there exists unique vectors
xl £ ^,. . . , xk G %k such that jc = *, + *2 + • • • + xk. Now let Pi be the
projector on ^£i along

The projectors P. are mutually disjoint; that is, P-P- = PjPi = 0 for / ^ /, and
Pi + -~ + Pk = L
Now any transformation A: $"—> <P" can be written as a A: x k block
matrix with respect to the decomposition (1.8.6):

where each transformation Aify - P./IF.L:* /' jj j£—»


i J£' is written as a matrix in
some fixed bases in ^ and ^..
Choose a basis j c , , . . . , xn in <p" in such a way that

where 0 < pv < p2 < • • • < pk = /i, and let

Then one can characterize all matrices for which (1.8.5) is a chain of (not
necessarily all) invariant subspaces in terms of the k x k block representa-
tion as follows.

Proposition 1.8.4
All subspaces from the chain (1.8.5) are invariant for a transformation A if
and only if A has the following form in the chosen basis xlt. . . ,xn:

where Aif is a ( P J - pi-1)*(pj- pt-i) matrix, 1 < i </ ^ k (and we define


/>o = 0).
The Lattice of Invariant Subspaces 35

Proof. Assume that A has the form (1.8.8), which means that in terms
of the projectors Plt . . . , Pk defined above the equalities PfiP- = 0 for /' >;'
hold. For a fixed ;', it follows that

As Qj = Pl + • • • + Pj is a projector on M f and Pj+l + • • • + Pk = I - Qr we


obtain (/ - Qj)AQf = 0, which means that Mj - Im Qj is A invariant.
Conversely, if M},M2,. . . ,Mk are all A invariant, then the equality
(/ - QJAQ; = 0 holds for j = 1, . . . , k. So /ViP, = 0 for i >/, and A has
the form (1.8.8).

A chain of subspaces

is called maximal (or complete) if it cannot be extended to a larger chain,


that is, any chain of subspaces

with the property that every Mt is equal to some ^, coincides with the chain
(1.8.9). It is easily seen that a chain (1.8.9) is maximal if and only if
dim Mt: = /, / = ! , . . . , « .
Now if (1.8.9) is a maximal chain, we may choose a basis *,,. . . , xn in
<p" in such a way that

As a particular case of Proposition 1.8.4, we find that all the subspaces


Mlt . . . , Mn are A invariant for a transformation A if and only if A has
upper triangular form in the basis x{, . . . , xn:

We conclude this section with a useful result on chains of invariant


subspaces for a transformation having a basis of eigenvectors in <p". It turns
out that such chains can be chosen to be complementary to any chain of
subspaces given in advance.
Theorem 1.8.5
Let A: <p" -* <p" be a transformation having a basis in <p" formed by
eigenvectors of A. Then for every chain of subspaces jV, C • • • C Jfp in <p"
36 Invariant Subspaces

there exists a chain of A-invariant subspaces M, D • • • D Mp such that M • is a


direct complement to jV y , / = 1 , . . . , / ? .

Proof. Let x{, x2,. . . , xn be a basis in <p" consisting of eigenvectors of


A. We show first of all that there exists a set of indices K{ C {1,. . . , n} such
that the subspace M{ = {*, | i E Kt} is a direct complement to N} in (p". Let
/, be the first index such that xf does not belong to ^V,. If /, < i'2 < • • • < is
(^n) are already chosen, let is+l be the first index such that xi does not
belong to SpanfjC; , . . . , * , • } + .A",. This process will stop after t steps (say)
when the equality Span{jt, , . . . , jc,} + jVj = <p" is reached. Now one can put
K{ = { / , , . . . , /,} to ensure that Span{jt, | / E K { } is a direct complement to
JV,.
By the same token, there is a set K2 C Kl such that M2 = Spanjjc^ j / E
K2} is a direct complement to M^ H Jf2 in Ml. As N2 — (Ml ft jV2) 4- .A^,
clearly J<2 is a direct complement to N2 in <p". Let ^ 3 = Span{jc, | / E K3},
where K3 C K2 and Ji3 is a direct complement to M2C\ jV3 in J(2, and so on.
Clearly, all the subspaces M^ are A invariant. D

In connection with Theorem 1.8.5 we emphasize that not every transfor-


mation has a basis of eigenvectors. Indeed, we have seen in Example 1.1.1 a
transformation with only one eigenvector (up to multiplication by a nonzero
complex number); obviously one cannot form a basis in <p" from the
eigenvectors of A. Furthermore, the transformation A of Example 1.1.1
does not satisfy the conclusion of Theorem 1.8.5. We leave it to the reader
to verify the following fact concerning this transformation A: for a chain
jV, C • • • C Np of subspaces in <p" there is a chain Ml D • • • D Mp of A-
invariant subspaces such that Mi + Jtfi = (p", / = 1, . . . , p if and only if each
NI is spanned by the vectors of type en+fn, en_l + / „ _ , , . . . , en.fj+l +
/„_,,+ !, where r. = dim JV; and the vectors /„, /„_,, . . . , /„_,.+, belong to
Span{el, e2, . . . , en_r}. (As usual, ek stands for the £th unit coordinate
vector in (p".)
The converse of Theorem 1.8.5 is also true: if for every chain of sub-
spaces *W\ C • • • C Np in <p" there exists a chain of ^4-invariant subspaces
Ml D • • • D Mp such that Mj is a direct complement to JV"y., / = 1,. . . , p,
then there exists a basis of eigenvectors of A. However, a stronger state-
ment holds.

Proposition 1.8.6
Let A: (p"-* <p" be a transformation. If each subspace N C <p" has a com-
plementary subspace that is A invariant, there is a basis in <p" consisting of
eigenvectors of A.

Proof. Let JV0 be the subspace spanned by all the eigenvectors of A. We


have to prove that ^V0 = (p". Assume the contrary: NQ ^ (p". The hypothesis
Triangular Matrices 37

of the theorem implies that there is an A-invariant subspace M^ that is a


direct complement to Nn. Clearly, M ( ] ^ { Q } . Hence there exists an eigen-
vector JCG of A in MQ: Ax() = A0;t0, x(}^Q. Since xttj£N(}, we contradict the
definition of jV0.

1.9 TRIANGULAR MATRICES AND COMPLETE CHAINS OF


INVARIANT SUBSPACES

The main result of this section is the following theorem on unitary trian-
gularization of a transformation. It has important implications for the study
of invariant subspaces.
Recall that a transformation U: <p"—»(p" is called unitary if it is invertible
and U~l = U* or, equivalently, if (Ux, Uv) = (x, y) for all x, y E <p". Note
that the seemingly weaker condition \\Ux\\ - \\x\\ for all x E <p" is also
sufficient to ensure that U is unitary. Note also that the product of two
unitary transformations is unitary again, and so is the inverse of a unitary
transformation.
It will be convenient to write linear transformations from <p" into <p" as
n x n matrices with respect to the standard orthonormal basis e^, . . . , en in
<p". We shall use the fact that a matrix is unitary if and only if its columns
form an orthonormal basis in <p".

Theorem 1.9.1
For any n x n matrix A there exists a unitary matrix U such that

is an upper triangular matrix, that is, /,y — 0 for i>j, and the diagonal
elements t ] } , . . . , tnn are just the eigenvalues of A.

Proof. Let A, be an eigenvalue of A with an eigenvector jc, and assume


that ||jc,j| = 1. Let jc2, . . . , xn be vectors in (p" that, together with *,, form
an orthonormal basis for <p". Then the matrix

is unitary. Write Ul in a block matrix form t/, = [*,K], where V= [x2 • • • xn]
is an n x (n - 1) matrix. Then because of the orthonormality of *,,. . . , xn,
Y*XI =0. Now, using the relation Axl = Ajjc,, we obtain
38 Invariant Subspaces
def
Applying the same procedure to the ( w - l ) x ( n - l ) matrix A2 = V*AV,
we find an (n - 1) x (n - 1) unitary matrix U2 such that

for some eigenvalue A 2 of A2 and some (n - 2) x (n - 2) matrix A3. Apply


the same procedure to A3 using a suitable (n - 2) x (n - 2) unitary matrix
U3, and so on.
Then for the n x n unitary matrix

the product U*AU is upper triangular. Finally, as U* = U \ we have

so that f , , , . . . , tnn are the eigenvalues of A.

Let T= U*AU be a triangular form of the matrix A as in Theorem 1.9.1.


Then it follows from Proposition 1.8.4 that there is a maximal chain

where all subspaces M, are T invariant. Then Proposition 1.4.2 shows that
the maximal chain

consists of ^-invariant subspaces. We have obtained the following fact.

Corollary 1.9.2
Any transformation (or n x n matrix) A: <f"~* <P" has a maximal chain of
A-invariant subspaces. In particular, for every /', ! < / < n , there exists an
i-dimensional A-invariant subspace.

In general, a complete chain of /^-invariant subspaces is not unique. An


extreme case of this situation is provided by A = al, a £ <p. For such an A,
every complete chain of subspaces is a complete chain of A -invariant
subspaces. Clearly, there are many complete chains of subspaces in <|7"
(unless n — \}.
Let us characterize the matrices A for which there is a unique complete
chain of invariant subspaces.
Triangular Matrices 39

Theorem 1.9.3
An n x n matrix A has a unique complete chain of invariant subspaces if and
only if A has a unique eigenvector (up to multiplication by a scalar).

Proof. We have seen in the proof of Theorem 1.9.1 that for any
eigenvector x of A the subspace Span {A:} appears in some complete chain of
-^-invariant subspaces. So if a complete chain of invariant subspaces is
unique, the matrix A has a unique eigenvector (up to multiplication by a
scalar).
The converse part of Theorem 1.9.3 will be proved later using the Jordan
normal form of a matrix (see Theorem 2.5.1).

Theorem 1.9.1 has important consequences for normal transformations.


A transformation A: (p"—> <p" is called normal if A A* = A* A. Self-adjoint
and unitary transformations are normal, of course, but there are also normal
transformations that are neither self-adjoint nor unitary.

Theorem 1.9.4
A transformation A: <p" —» <p" is normal if and only if there is an orthonormal
basis in <p" consisting of eigenvectors for A.

Proof. Write A as an n x n matrix. Assuming that A is normal, the


matrix T from (1.9.1) is easily seen to be normal as well:

But T is upper triangular:

Hence the (1,1) entry in T*T is |f n | 2 , whereas this entry in TT* is


k n l 2 + ki 2 | 2 + "' + I ' m 2 As T*T= rr*, it follows that f 12 = • . • = *,„= 0.
Comparing the (2, 2) entries in T* T and TT*, we now find that t23 = • • • =
t2n = 0, and so on. It turns out that T is diagonal. Now Uel, . . . , Uen is an
orthonormal basis in <J7" consisting of eigenvectors of A.
Conversely, assume that A has a set of eigenvectors/!, . . . , / „ that form
an orthonormal basis in <p". Then the matrix U = [/,/2 • • • / „ ] is unitary and
40 Invariant Subspaces

where A, is the eigenvalue of A corresponding to /. So

As the diagonal matrix T is obviously normal, we find that A is normal as


well.

1.10 EXERCISES

1.1 Prove or disprove the following statements for any linear transfor-
mation A: ("->(":
(a) Im A + Ker A = <p".
(b) Im A + Ker A = <p" (the sum not necessarily direct).
(c) Im A H Ker ,4^(0}.
(d) dim Im A + dim Ker A = n.
(e) Im A is the orthogonal complement to Ker A *.
1.2 Prove or disprove statements (d) and (e) in the preceding exercise for
a transformation A: <pm-* <p", where m^n.
1.3 Let A: <p"—»<p" be the transformation given (in the standard ortho-
normal basis) by an upper triangular Toeplitz matrix

where « 0 , . . . , « „ _ , are complex numbers. Find the subspaces Im A


and Ker A.
1.4 Given A:$n—» (p" as in Example 1.1.3, identify the /4-invariant
subspaces Im Ak and Ker Ak, k = 0,1,. . . .
1.5 Identify Im Ak and Ker Ak, k = 0,1,. . . , where

is given by a lower triangular Toeplitz matrix.


1.6 Find all one-dimensional invariant subspaces of the following trans-
formations (written as matrices with respect to the standard ortho-
normal basis):
Exercises 41

1.7 In the preceding exercise, which transformations have a Jordan chain


consisting of more than one vector? Find these Jordan chains.
1.8 Show that all invariant subspaces for the projector P on the subspace
Ji are of the form Ml + J\f,, where Ml (resp. ^V,) is a subspace in M
(resp. JV). Find the lattice Inv P*.
1.9 Given P as in Exercise 1.8, find all the invariant subspaces of
<*!/* + a2(I — P), where al and a2 are complex numbers.
1.10 Let A: f "-> <p" be a transformation with A 2 = /. Show that Im(/ +
A) and Im(7 - A) are the subspaces consisting of the zero vector and
all eigenvectors of A corresponding to the eigenvalues 1 and -1,
respectively.
1.11 Find all invariant subspaces of a transformation A: <£"-» <p" such that
A2 = I.
1.12 Let

(a) Show that A is similar to

(b) Find all invariant subspaces of A.


1.13 Let

Show that A is similar to a matrix of type

and find the lattice Inv(y4). What are the invariant subspaces of A*7
1.14 Let
42 Invariant Subspaces

Prove that the eigenvalues of Q are cos(27r/c/«) + /sin(27rkin},


& = 0 , 1 , . . . , « - ! . Find the corresponding eigenvectors.
1.15 Show that the transformation

has eigenvectors xp whose yth coordinate is sin{jpir/(n +1)},


p = 0,. . . , n - 1 (independently of a). What are the corresponding
eigenvalues?
1.16 Let

where a 0 , . . . , « „ _ , are complex numbers. Show that A0 is an eigen-


value of A if and only if A0 is a zero of the equation

1.17 Let

(a) Find all eigenvalues and eigenvectors of A.


(b) Find a longest Jordan chain.
(c) Show that A is similar to a matrix of the form

and find the similarity matrix.


Exercises 43

(d) Find the lattice Inv(A) of all invariant subspaces of A.


(e) Find all invariant subspaces of the transposed matrix AT.
1.18 Let A: <£"—»<£" be a transformation represented as a matrix in the
standard orthonormal basis. Show that all invariant subspaces for the
transposed matrix AT are given by the formula Span{^,. . . , xk},
where xl, . . . ,xk is a basis in the orthogonal complement to some
^-invariant subspace, and for a vector y = (y^ . . . , yn) G <p" we
denote y = ( y , , . . . , yn).
1.19 Prove a generalization of Proposition 1.4.4: if A: <p" —» <p" is a
transformation and J£, ./V are subspaces in <p", then >U£ C jV holds if
and only if A*N±CM\
1.20 Give an example of a transformation A: £"—><£" that is not self-
adjoint but nevertheless AM x C M 1 for every ^-invariant subspace
./#.
1.21 Let

Find the angular transformations of M} and M2 with respect to the


projector on <p" © {0} along {0} © <p".
1.22 Find at least one solution of the quadratic equation

where

are n x n matrices.
(b) Tl} are n x « diagonal matrices.
(c) TV are n x n circulant matrices.
1.23 Prove that x{ + M,. . . , xk + Jt is a basis in <p"/^ (where
*,, . . . , xk G <p) if and only if for some basis y^ . . . , yp in M the
vectors j t j , . . . , xk, ylt . . . , yp form a basis in <p".
1.24 Let A = diagfflj, . . . , aj: <p"—> (f"1, where the numbers « , , . . . , # „
are distinct. Show that for any ^4-invariant subspace M the induced
transformation A: $"IM —> $ IM can also be written in the form
diag[6j, . . . , bk] in some basis in $nIM.
44 Invariant Subspaces

1.25 Find all the induced transformations A: $"/M-+ $"IM, where

and M is any A -invariant subspace.


1.26 Show that if P is a projector on (p" and P is the induced transfor-
mation on $"IM, where M is a P-invariant subspace, then P is a
projector as well. Find Im P and Ker P.
1.27 Let

be in a triangular form. Show that

is also in a triangular form. Hence the triangular form of a matrix is


not unique, in general.
1.28 Find complete chains of invariant subspaces for the transformations
given in Exercise 1.6. Check for uniqueness in each case.
1.29 Given a transformation in a matrix form

with respect to the basis el, e2, e3, find a complete chain of A-
invariant subspaces. Find a basis in which A has the upper triangular
form.
1.30 Let A: (p2"—» <p2" be a transformation. Prove that there exists an
orthonormal basis in <p2" such that, with respect to this basis, A has
the representation

where, for each / and /', Atj is an upper triangular matrix.


Chapter Two

The Jordan Form and


Invariant Subspaces

We have seen in Section 1.4 and Proposition 1.8.2 that there is a strong
relationship between lattices of invariant subspaces of similar transforma-
tions, namely

for any two tranformations A and 5 from <p" into <p" with S invertible. Thus,
for the study of invariant subspaces, it is desirable to use similarity transfor-
mations to reduce a given transformation to the simplest form, in the hope
that the lattice of invariant subspaces for the simplest form would be more
transparent than that for the original transformation. The "simplest form"
here is the Jordan form. It is obtained in this chapter and used to study
some properties of invariant subspaces. Special insights are obtained into
the structure of invariant subspaces and are exploited throughout the book.
We examine irreducible invariant subspaces, generators of invariant sub-
spaces, maximal and minimal invariant subspaces, and invariant subspaces
of functions of transformations. An interesting class of subspaces is intro-
duced and studied in Section 2.9 that we call "marked." All the subject
matter here is well known, although this exposition may be unusual in
matters of emphasis and detail that will be useful subsequently.

2.1 ROOT SUBSPACES

In this section we introduce the root subspaces of a transformation. The


study of these subspaces is the first step towards an understanding of the
Jordan form. At the same time it will be seen that the root subspaces are
important examples of invariant subspaces that can be described in terms of
Jordan chains.

45
46 The Jordan Form and Invariant Subspaces

We consider now some ideas leading up to the definition of root


subspaces. Let A: <p"-» <p" be a transformation and let A0 be an eigenvalue
of A. Consider the subspaces Ker(.A — A0/)', / = 1, 2 , . . . . For / = 1 the
subspace Ker(A - A()7) ^ {0} is just the subspace spanned by the eigenvec-
tors of A corresponding to A 0 . As (A — A O /)'JC = 0 implies (A — A O /)' + IJC = 0,
we have

Consequently, Ker(A - A () /)' + 1 ^ Ker(A - A () /)' if and only if dim Ker(A -


A0/)' + 1 >dimKer(v4 - A0/)'. Since the dimensions of the subspaces
Ker(A - A 0 /)', / = 1, 2 , . .. are bounded above by «, there exists a minimal
integer p > \ such that

for all integers / > p. The subspace Ker(/4 — A0I)P is called the root subspace
of A corresponding to A() and is denoted £%A (A).
In other words, 91A (A) consists of all vectors x E <p" such that (^4 -
A O /)^JC = 0 for some integer q ^ 1. (This integer may depend on x.) Because

all subspaces in (2.1.1) are A invariant. In particular, the root subspace


3$A (A) is A invariant.
By definition, &^(A) = Ker(,4 - A 0 /) p is the biggest subspace in the
chain (2.1.1). We see later that, in fact, p is the minimal integer i> 1 for
which the equality Ker(A - A0/)' = Ker(v4 - A0/)' + 1 holds, and that p < n.
Hence we also have

The nesting of the kernels in (2.1.1) has a dual in the (descending) nesting
of images:

But these sequences of inclusions are coupled by the fact that, for any
integer / > 0,

Consequently, if p is the least integer for which Ker(A - \0I)p +l =


Ker(A - \0I)P, it is also the least integer which lm( A - A 0 /) p + 1 = Im(a - \QlY-
Root Subspaces 47

Proposition 2.1.1
The root subspace <3lx (A) contains the vectors from any Jordan chain of A
corresponding to A 0 .

Proof. Let AT O , . . . , xk be a Jordan chain of A corresponding to A 0 . Then

Hence all the vectors *, (/ = 0,. . . , /c) belong to £%A (A).


Let us look at the simplest examples. For

as well as for A — A07, the only eigenvalue is A 0 , and the corresponding root
subspace $A (A) is the whole of <f". If

then the root subspace 3ftx(A) is one-dimensional and is spanned by et for


j = l , 2 , . . . ,/i.
Later, we also use the following fact: if A,S: <p"—»(p" are transfor-
mations with 5 invertible, then

for every eigenvalue A0 of A. An analogous property holds also for every


member of the chain (2.1.1). The proof of equation (2.1.2) follows the same
lines as the proof of Proposition 1.4.3.
The following property of root subspaces is crucial.

Theorem 2.1.2
Let A,,..., A,, be all the differenteigenvalues of a transformation
A: (p" —> <p". Then <p" decomposes into the direct sum

We need some preparations to prove this theorem.


48 The Jordan Form and Invariant Subspaces

Lemma 2.1.3
For every eigenvalue A0 of A, the restriction A[9l A (A) has the sole eigenvalue
o
A0.

Proof. Let B^A^ (AY We shall show that for every A, ¥^ A0 the
^0
transformation \ll — B on $1K (A) is invertible. Let q be an integer such
that

Then clearly

Since this implies that

and since A! ^ A 0 , the invertibility of \J - B follows.

Lemma 2.1.4
Given a transformation A: $"—> <P" w^h an eigenvalue A 0 , let q be a positive
integer for which

Then the subspaces Ker(/l - 0\iy and lm(A - A 0 /) 9 are direct complements
to each other in <p".

Proof. Since

we have only to check that

Arguing by contradiction, assume that there is an x ^ 0 in the left-hand side


of equation (2.1.5). Then x = (A — A I)qy for some y. On the other hand,
for some integer r > 1 we have
Root Subspaces 49

It follows that

Hence

a contradiction with (2.1.4) and the definition of a root subspace.

Proof of Theorem 2.1.2 Let A, be an eigenvalue of A. Lemma 2.1.4


shows that

where q is some positive integer for which

By Lemma 2.1.3, the restriction of A to Ker(/4 - A,/)*7 has the sole


eigenvalue A^ On the other hand, A, is not an eigenvalue of the restriction
of A to lm(A - A0/)9.
To see this, observe that we also have

Hence A — A,/ maps lm(A - A,/) 9 onto itself. It follows that A, is not an
eigenvalue of the restriction of A to the ^-invariant subspace lm(A — A,/) 9 .
So the restrictions of A to the subspaces Ker(/4 - Aj/) 4 = $1K (A) and
58 = lm(A — A,/) 9 have no common eigenvalues. This property is easily
seen to imply that, for any eigenvalue A2 of A^

So we can repeat the previous argument with A replaced by A\% and with A,
replaced by an eigenvalue A2 of A\y, to show that

for some v4-invariant subspace M such that A t and A2 are not eigenvalues of
A\M. Continuing this process, we eventually prove Theorem 2.1.2. D

Another approach to the proof of Theorem 2.1.2 is based on the fact that
if <7,(A), . . . , qr(\) are polynomials (with complex coefficients) with no
common zeros, there exist polynomails p,(A),. . . , pr(\) such that
50 The Jordan Form and Invariant Subspaces

(This is easily proved by induction on r, using the Euclidean algorithm for


the case r = 2.) Now let the characteristic polynomial ^(A) = det( A/ - A)
be factorized in the form

where A , , . . . , A r are different complex numbers (and are, of course, just


the eigenvalues of A) and *>,,. . . , vr are positive integers. Define

for y = l , . . . , r . Using the fact that <pA(A) = Q (the Cayley-Hamilton


theorem) one verifies that actually

for y = 1,. . . , r. Finally, take advantage of the existence of polynomials


p,(A),. . . , p r (A) such that equality (2.1.6) holds, and use equation (2.1.7),
to prove Theorem 2.1.2. This approach can be used to prove results
analogous to Theorem 2.1.2 for matrices over fields other than (p.
Now let M be an >l-invaraint subspace. Consider the restriction A\M as a
linear transformation from M into M, and note that

for every A0 that is an eigenvalue of A\M. If A0 is an eigenvalue of A but not


an eigenvalue of A\M, then &^(A\M) = {0}; but also M n ^o(A) = {0}. So
the equality Sft^(A\M) = M O &lXo(A) holds for any A 0 e<r(,4). Applying
Theorem 2.1.2 for the linear transformation A\M and using the above
remark, we obtain the following result.

Theorem 2.1.5
Let A: <p" —> <p" be a transformation, and let M be an A-invariant subspace.
Then M decomposes into a direct sum

where A , , . . . , A r are all the different eigenvalues of A.


Root Subspaces 51

Note that Theorem 2.1.2 is actually the particular case of Theorem 2.1.5
with M = <p". We consider now some examples in which Theorem 2.1.5
allows us to find all invariant subspaces of a given linear transformation.

EXAMPLE 2.1.1. Let A = diag[A,, A 2 , . . . , Aj where A , , . . . , \n are differ-


ent complex numbers (as in Example 1.1.3). Then o-(A) — { A 1 ? . . . , \n}, and

$A(,4) = Span{e,} ,

By Theorem 2.1.5, any /1-invariant subspace M is a direct sum

As M nSpan{e,} is either {0} or Span{e,}, it follows that any ^-invariant


subspace is of the form

for some indices \<i^<i2<---<ip<n. This fact was stated without proof
in Example 1.1.3.

EXAMPLE 2.1.2. Let

where A, and A2 are different complex numbers. The matrix A has the
eigenvalues A, and A 2 . Further,

and thus

So £% A (A) — Span{e1? e2}. For the eigenvalue A2 we have £% A (A) =


Span{e3,e 4 }.
52 The Jordan Form and Invariant Subspaces

We see (as Theorem 2.1.2 leads us to expect) that <J7" is a direct (even
orthogonal) sum of R^ (A) and £%A (A). Let M be any ^-invariant subspace.
By Theorem 2.1.5, we obtain

It is easily seen (cf. Example 1.1.1) that the only /1-invariant subspaces in
Span{e l5 e2} are {0}, Spanjtf,}, and Span{e,, e2}. On the other hand, any
subspace in Span{e3, e4} is A invariant.
One can easily describe all subspaces in Span{e3, e4} as follows: {0};
the one-dimensional subspaces Span{e3 4- ae4}, where a G <p is fixed for
each particular subspace; the one-dimensional subspace Span{e4}; and
Span{e 3 ,e 4 ). Finally, the following is a complete list of ^-invariant sub-
spaces:

2.2 THE JORDAN FORM AND PARTIAL MMULTIPLICITIES

Let A be an n x n matrix. In this section we state one of the most important


results in linear algebra—the canonical form of a matrix A under similarity
transformations A—>S~1AS, where S is an invertible n x n matrix.
We start with some notations. The Jordan block of size k x k with
eigenvalue A0 is the matrix

Clearly, det( A/ - Jk(A0)) = (A - A0)*, so A0 is the only eigenvalue of Jk(\o).


Further
The Jordan Form and Partial Multiplicities 53

so the only eigenvector of Jk(kQ) (up to multiplication by a nonzero complex


number) is el. The invariant subspaces ofJk(\0) were described in Example
1.1.1; they form a complete chain of subspaces in <p*:

It turns out that a similarity transformation can always be found trans-


forming a matrix into a direct sum of Jordan blocks.

Theorem 2.2.1
Let A be an n x n (complex) matrix. Then there exists an invertible matrix S
such that S~1AS is a direct sum of Jordan blocks:

The Jordan blocks Jk(\f) in the representation (2.2.1) are uniquely deter-
mined by the matrix A (up to permutation) and do not depend on the choice
ofS.

Since the eigenvalues of a matrix are invariant under similarity, it is clear


that the numbers A,, . . . , \p are the eigenvalues of A. Note that they are
not necessarily distinct.
We stress that this result holds only for complex matrices. For real
matrices there is also a canonical form under similarity with a real similarity
matrix. This canonical form is dealt with in Chapter 12.
The right-hand side of equality (2.2.1) is called a Jordan form of the
matrix (or the linear transformation) A. For a given eigenvalue A0 of A, let
Jk (A, ) , . . . , Jk (A. ) be all the Jordan blocks in the Jordan form of A for
which A, = A 0 , ( j r = l , . . . , m . The positive integer m is called the geometric
multiplicity of A0 as an eigenvalue of A, and the integers kt, ,. . . , /c, are
called the partial multiplicities of A 0 . So the number of partial multiplicities
of A() as an eigenvalue of A coincides with the geometric multiplicity of A0. In
view of Theorem 2.2.1, the geometric multiplicity and the partial multi-
plicities depend on A and A0 only and do not depend on the choice of the
invertible matrix S for which (2.2.1) holds. The sum ki + • • • + ki of the
partial multiplicities of A0 is called the algebraic multiplicity of A0 (as an
eigenvalue of A). Obviously, the algebraic multiplicity of A0 is not less than
its geometric multiplicity.
The following property of the partial multiplicities will be useful in the
sequel.

Corollary 2.2.2
If A{ and A2 are n{ x n, and n2 x n2 matrices with the partial multiplicities
£,(,4,),. . . ,kmi(Al)andkl(A2),. . . , km2(A2) of A, and A2, respectively,
54 The Jordan Form and Invariant Subspaces

all corresponding to the common eigenvalue A 0 , then A:,(y4,),. . . , km^(Al),


kl(A2),. . . , km^(A-,) are the partial multiplicities of the matrix

corresponding to A 0 . In particular, the geometric (resp. algebraic) multiplicity


of

at A0 is the sum of the algebraic (resp. geometric) multiplicities of A, and A2


at A 0 .

The proof of this corollary is immediate if one observes that the Jordan
form of

can be obtained as a direct sum of the Jordan forms of A} and A2,


We also need the following property of partial multiplicities.

Corollary 2.2.3
The partial multiplicities of A at A0 coincide with the partial multiplicities of
the conjugate transpose matrix A* at A 0 .

Proof. Write A = SJS ~ \ where J is the Jordan form of A and S is a


nonsingular matrix. Then A* = S~l*J*S*. Now the conjugate transpose J*
of the matrix / is similar to the matrix / that is obtained from / by replacing
each entry by its complex conjugate. Indeed, if we define the permutation
("rotation") matrix R with elements r/; defined in terms of the Kronecker
delta by rif = S J> + 1 _ y ., then it is easily verified that R~l = R and

Hence / is the Jordan form of A*, and Corollary 2.2.3 follows from the
definition of partial multiplicities.

To describe the result of Theorem 2.2.1 in terms of linear transfor-


mations, let us introduce the following definition. An A -invariant subspace
M is called a Jordan subspace corresponding to the eigenvalue A0 of A if M
is spanned by the vectors of some Jordan chain of A corresponding to A 0 .
The Jordan Form and Partial Multiplicities 55

Theorem 2.2.4
Let A: <p"—> <p" be a linear transformation. Then there exists a direct sum
decomposition

where Mt is a Jordan subspace of A corresponding to an eigenvalue A, {here


A t , . . . , A are not necessarily different}.
If <f?" = jV, + • • • 4- Nq is another direct sum decomposition with Jordan
subspaces Nt corresponding to eigenvalues /A,, / = 1, . . . , q, then q — p, and
(possibly after a permutation of Jf}, . . . , Nq) dim Mi = dim JV) and A, = /n,
for i = l, . . . , q.
Note that in general the decomposition (2.2.2) is not unique. For
example, if A = /, then one can take Mi = Span{jc,}, where * , , . . . , * „ is
any basis in (p".
Theorem 2.2.1 follows easily from Theorem 2.2.2 and vice versa. Indeed,
let S be as in Theorem 2.2.1. Then put

to satisfy equality (2.2.2).


Conversely, if Mi are as in (2.2.2), choose a basis x\'\ . . . , x(k'} in Mt
whose vectors form a Jordan chain for A. Then put

The direct sum decomposition (2.2.2) ensures that S is an n x n nonsingular


matrix, and the definition of a Jordan chain ensures that S~1AS has the form
(2.2.1).
Theorem 2.2.1 (or Theorem 2.2.4) is proved in the next section. Note
that because of Theorem 2.1.2 one has to prove Theorem 2.2.1 only for the
case when $t^(A) - (p", that is, A has only one eigenvalue A 0 . In this sense
the property of root subspaces described in Theorem 2.1.2 is the first step
toward a proof of the Jordan form.
In view of Proposition 1.4.2, there are many cases in which the Jordan
form allows us to reduce the consideration of invariant subspaces of a
general linear transformation to the consideration of invariant subspaces of
a linear transformation that is given by the Jordan normal form in the
standard orthonormal basis. This reduction is used many times in the sequel.
As a first example of such a reduction we note the following simple fact.
56 The Jordan Form and Invariant Subspaces

Proposition 2.2.5
Let A: <p"-» <p" be a linear transformation. Then the geometric multiplicity of
any A0 E ar(A) coincides with dim Ker(A — A 0 /), and the algebraic multiplici-
ty of \Q coincides with the dimension of£%A (A), the root subspace of A0 [i.e.,
with the dimension of Ker(y4 - A0/)"].

Proof. By (2.1.2) and Theorem 2.2.1 we can assume without loss of


generality that

Then for any A0 E <£ we have

From the definition of the Jordan block it is easily seen that

Hence

is the number of indices / for which A0 = A y , and, by definition, this number


coincides [in case A0 G cr(A)] with the geometric multiplicity of A0.
Similarly

So for q = 1, 2 , . . . and A0 E <p we have

As £%A (A) is the maximal subspace of the type Ker(,4 — A 0 /)^, q = 1,2, . . . ,
we obtain
The Jordan Form and Partial Multiplicities 57

which, by definition, is just the algebraic multiplicity of A 0 .


Proposition 2.2.5 is actually a particular case of the following general
proposition.

Proposition 2.2.6
Let A: <p"-» <P" be a transformation with partial multiplicities / c l 5 . . . , km
corresponding to the eigenvalue A0 of A. Then

where H # represents the number of different elements in a finite set ft.

Proof. In view of formula (2.2.3) we have only to show that

This equality is certainly true for q = 1 (for then both sides are equal to m).
Assume that the equality is true for q — 1. We have

Adding the relation

(which is just the induction hypothesis) we verify (2.2.4).

It follows from Proposition 2.2.6 that if

for some positive integer q, then actually

for all p ^ q, that is


58 The Jordan Form and Invariant Subspaces

2.3 PROOF OF THE JORDAN FORM

In this section we prove Theorem 2.2.4. In view of Theorem 2.1.5, it is


sufficient to consider A\& (A), where A 0 G cr(A) is fixed, in place of A. In
other words, we can assume that A has only one eigenvalue A0, possibly with
several partial multiplicities.
Let f?j = Kcr(A - A,,/)', ] = 1, 2,. . . , m, where m is chosen so that
^m = ^A0(^) but <fm_i*9t^(A). Note that ^ C V2 C • • • C <fm. Let
x(^\ . . . , x(^m) be a basis in !fm modulo ^ m _ , , that is, a linearly independent
set in ym such that

(the sum here is direct). We claim that the mtm vectors

are linearly independent. Indeed, assume

Applying (a - A0/)m ' to the left-hand side and using the property that
(a - A 0 /) m jc^ = 0 for / = 1, . . . , / „ , we find that

Hence £|™, ai0x(^ e ^m-i and because of (2.3.1), a10 = • • • = a, 0 =0. Ap-
plying (A — A 0 /) w ~ 2 to the left-hand side of (2.3.2) we show similarly that
an- • • • — at , = 0, and so on, We put

As we have just seen, the sum M} + M2 + • • • + Mt is direct.


Consider now the vectors

We claim that
Proof of the Jordan Form

Indeed, assume

Applying (A - A 0 /) m 2
to the left-hand side, we get

which implies a t = • • • = a, = 0 in view of equality (2.3.1). So equation


(2.3.3) follows.
Assume first that Span does not coincide with
^m _ 1 . Then there exist vectors h in <fm_v such that the
set {*^-]}!=!'m~1 is linearly independent and

Applying the previous argument to (2.3.4) as with (2.3.1), we find that the
vectors

are linearly independent. Now put

If it happens that

then put formally tm_l =0.


At the next step put

and show similarly that

Assuming that ^m_3 + Span{*^_2, i = 1, . . . , tm + t m _ { } ^ 5^,_2, choose


60 The Jordan Form and Invariant Subspaces

*Si-2. ' = '« + '*-i + 1» • • • • ' « + '*-i + fm-2»" such a way that the vectors
x^_2, i = 1,. . . , tm + tm_l + tm__2 are linearly independent and the linear
span of these vectors is a direct complement to 5^,_3 in ^m_2- Then
put

for /' = ! , . . . , f / n _ 2 - We continue this process of construction of Mt, i =


1,. . . , p, where p = tm + tm_{ + • • • + t { . The construction shows that each
Miis a Jordan subspace of A and the sume Mt + • • • + Mp is a direct sum.
Also

because of our assumption that cr(A) = {cr0}. Hence (2.2.2) holds.


Let us prove the uniqueness part of Theorem 2.2.4. Assume that (2.2.2)
holds, and let A , , . . . , \k be all the different eigenvalues of A. Denoting
by Ej the set of all integers /, 1 < / < / ? , such that A, = A y , we have for
f = 0,l,2,...:

Consequently

In particular (taking / = !), the number of elements in £; coincides with


dim Ker(>4 — Ay7). This proves that for a direct sum decomposition <p" =
N\ + • • • + Nq as in Theorem 2.2.4 we have q — p and for a fixed j the
number of /u,, values that are equal to A y coincides with the numbers of A,
values that are equal to A y . Hence we can assume /U, = A,, / = 1,. . . , p.
Further, (2.3.5) implies that (for fixed A y ) the number

coincides with the number of indices / G Ey such that dim Mt >t(t =


1,2,...), and thus it also coincides with the number of indices i G £. such
that dim J^fi > t. This implies the uniqueness part of Theorem 2.2.4.

2.4 SPECTRAL SUBSPACES

Let A: (p" —> <p" be a transformation. A subspace M C <(?" is called a spectral


subspace for A if M is a sum of root subspaces for A. The zero subspace is
also considered spectral. Since root subspaces are A invariant, a spectral
Spectral Subspaces 61

subspace for A is A invariant. It is easily seen that the total number of


spectral subspaces for A is 2r, where r is the number of distinct eigenvalues
of A
By Theorem 2.1.5, for every A invariant subspace M, we have

where A j , . . . , A p are all the distinct eigenvalues of A. From this formula it


is clear that M is spectral if and only if for every A; either M n S?A (A) = {0}
or the inclusion £%A (A) C M holds. Another consequence of formula (2.4.1)
is that, for any nonzero spectral subspace M of A,

where /i,,. . . , /*> are all the distinct eigenvalues of the restriction A\M.
A useful characterization of spectral subspaces is given by their maximali-
ty property.

Proposition 2.4.1
An A-invariant subspace M T£ {0} is spectral if and only if any A-invariant, v
subspace J£ with the property (r(A\<e) C a-(A\M) is contained in M.

Proof. Assume that M is not spectral so that, in particular, {0} ^


M fl £%A (A) T^ £%A (A) for some A0 G cr(A). Define the ^-invariant subspace
!£ by the equalities

for all eigenvalues A, of A different from A 0 . Obviously, a(A\^) = cr(A\M)


but J£ is not contained in M (actually, ^contains M properly).
On the other hand, assume that M is spectral. If !£ is A invariant with
cr{A\<£) C cr(/4|^), then the equality

(where A 1 5 . . . , \p are the distinct eigenvalues of A) implies that t£ n


$tio(A) = 0 for every A 0 Go-(y4) not belonging to the spectrum of A\M. It
follows then from (2.4.2) that

where /i 1} . . . , ^ are the distinct eigenvalues of A\M. As the right-hand side


of (2.4.3) is equal to M, the inclusion & C M follows.

Another characterization of spectral subspaces can be given in terms of


direct complements.
62 The Jordan Form and Invariant Subspaces

Theorem 2.4.2
The following statements are equivalent for an A-invariant subspace M: (a) M
is spectral for A; (b) there exists a direct complement Jito M such that N is A
invariant and

(c) there exists a unique A-invariant direct complement N to M; (d) for any
A-invariant subspace !£ that contains M properly, cr(A^) contains cr(A\M}
properly.

To accommodate the cases M - {0} and M - <p" in Theorem 2.4.2 w


adopt the convention that the spectrum of the restriction of A to the zero
subspace is empty.

Proof. The equivalence of (a) and (d) follows immediately from


Theorem 2.4.1. By Theorem 2.1.5 (considering each root subspace of A
separately) we can assume that ^ (A) = <p", that is, A has the single
eigenvalue A 0 . Then the only spectral subspaces of A are {0} and <p".
Further, since cr(A\y) = {\()} for every nonzero ,4-invariant subspace ££,
equation (2.4.4) implies that either (r(j4|^) or o^/lj^-) is empty; in other
words, either M - {0} or Ji={0}. But if the latter case holds, then
obviously M = <J7". Thus M = {0} and M = <p" are the only subspaces
satisfying (b), and (a) and (b) are equivalent.
Obviously, (a) implies (c). So it remains to prove that (c) implies (a).
Let M be a nontrivial ^-invariant subspace (i.e., different from {0} and
<p") that has an ^-invariant direct complement JV. Then N is nontrivial as
well. We now use the Jordan form (Theorem 2.2.4) for the restriction A\jfi

where x\'\ . . . , x(^ is a Jordan chain (necessarily with eigenvalue A 0 ) of A,


i = 1, . . . , q. It is easily seen (cf. Proposition 1.3.4) that the vectors xf} ,
j = 1,. . . , kt, i = 1, . . . , q are linearly independent and hence form a basis
in X.
We now construct another direct complement for M that is A invariant.
Let y (^0) be an eigenvector of A in M, and put

As Ay = \0y, one checks easily that N' is A invariant. Also, N'^Jf,


Spectral Subspaces 63

because otherwise y would belong to N, a contradiction with the direct sum


M +jV=<p". We verify that Jf' is a direct complement to M. Indeed,
observe that the vectors x\l);. .. ; x ^ ^ ; x [ l ) + y; jcj°, ; = 1,.. ., kt, i =
2,. . . , q are linearly independent and hence dim N' = dim N. So we must
only check that M H Ji' = {0}. Let

where ai; are complex numbers. The condition M fl jV = {0} implies

which in turn implies

and, because of the linear independence of jtj 0 , all the coefficients a/; are
zeros. In particular, alk =0, and z — 0 in view of equation (2.4.6).
We have proved that (when a-(A) = {\0}) any nontrivial ,4-invariant
subspace either does not have A -invariant direct complements or has at least
two of them. This means that (c) implies (a).

We deduce immediately from Theorem 2.4.2 that the unique /1-invariant


direct complement ^V to a spectral subspace M is spectral as well: if
M = &^(A) + • • • + ^s(A), then JV = dt^A) + ••• + ^(A), where
/u,j,. . . , /AS , v ^ , . . . , vt is a complete list of all the distinct eigenvalues of A.
We say that the spectral subspace M for A corresponds to the part A of
the spectrum of A if <j(A\M) — I^. Obviously, there is a unique spectral
subspace corresponding to any given subset A of cr(A) [with the understand-
ing that <r(/l| {() j) = 0]. This spectral subspace can easily be described in case
A is given by an n x n matrix in Jordan form as in equation (2.2.1). Indeed,
using the notation of that equation, if A C cr(A), define the kt x kt matrix AT,
by Kt = / if A, G A and AT, = 0 if A, ^A. Then the subspace

is the spectral subspace for A corresponding to A. Its only ^-invariant direct


complement is

We conclude this section with a description of spectral subspaces in terms


of contour integrals. (Actually, this description is a particular case of the
64 The Jordan Form and Invariant Subspaces

properties of functions of transformations that are studied in more detail in


Section 2.10.) Let F be a simple, closed, rectifiable, and positively oriented
contour in the complex plane. In fact, for our purposes polygonal contours
will suffice. Given an n x n matrix B(\) = [£>,,(A)]" / = 1 , that depends con-
tinuously on the variable A E F (this means that each entry b, 7 (A) in B( A) is
a continuous function of A on F) the integral

is defined naturally as the n x n matrix whose entries are the integrals of the
entries of B(\):

The same definition of a contour integral applies also for transformations


fi(A): <p"—><p" that are continuous functions of A on F. We have only to
write B(\) as a matrix [^/y(A)]" / = 1 in a fixed basis, and then interpret
f r B( A) d\ as a transformation represented by the matrix [Jr 6 (y (A) dA]" ;=1
in the same basis. One checks easily that this defintion is independent of the
chosen basis.

Proposition 2.4.3
Let A be a subset of cr(A) where A is a transformation on <p", and let F be a
closed contour having A in its interior and o-(A) ^ A outside F. Then the trans-
formation

is a projector (known as a Riesz projector) onto the spectral subspace associated


with A and along the spectral subspace associated with o-(A) ^ A.

Proof. Using the relation 5(A7- A)~lS~l = (A/ - SAS'')"', equation


(2.1.2), and the Jordan form, we can assume that A is an n x n matrix given
by

where /*.(A,.) is the k{ x /c. Jordan block with A, on the main diagonal. One
easily verifies that
Irreducible Invariant Subspaces and Unicellular Transformations 65

As a first consequenc of this formula we see immediately that, because


a(A)r\r = 0, ( A / - A)~l is indeed continuous on T. Further, the Cauchy
formula gives

Thus

where Kt = I if A e A and K. = 0 if A^A. Thus the matrix (2.4.7) is indeed a


projector with image and kernel as prescribed by the theorem.

2.5 IRREDUCIBLE INVARIANT SUBSPACES


AND UNICELLULAR TRANSFORMATIONS

In this section we use the Jordan form to study irreducible invariant


subspaces. An invariant subspace M of a transformation A:$n—> <p" is
called reducible if M can be represented as a direct sum of nonzero
y4-invariant subspaces Ml and M2\ otherwise M is called irreducible.
Let us consider some examples.

EXAMPLE 2.5.1. Let A be a Jordan block. Then, as Example 1.1.1 shows,


each nonzero A -invariant subspace (including (p" itself) is irreducible.

EXAMPLE 2.5.2. Let A = A0/, A0 E <p. Then an ^4-invariant subspace is


irreducible if and only if it is one-dimensional.

EXAMPLE 2.5.3. Let

According to Theorem 2.1.5, the ^-invariant subspaces are as follows: {0};


Span{ael + fte^} for fixed numbers a, j8 G <p with at least one of them
different from zero; Span{ej, e2}\ Span{e,, e3}; <p3. Among these subspaces
Spanje,, e3} and <p3 are reducible and the rest are irreducible.

The following theorem gives various characterizations of irreducible


invariant subspaces.
66 The Jordan Form and Invariant Subspaces

Theorem 2.5.1
The following statements are equivalent for an A-invariant subspace M:(a}Mis
irreducible; (b) each A-invariant subspace N contained in Mis irreducible', (c)M
is Jordan, that is, has a basis consisting of vectors that form a Jordan chain of A;
(d) there is a unique eigenvector (up to multiplication by a scalar) of A in M;(e) the
lattice of invariant subspaces of A\M is a chain—that is, for any A-invariant
subspaces j£,, j£, C M either £{ C 2£2 or 5£2 C <£", holds; (/) every nonzero A-
invariant subspace that is contained in M is Jordan; (g) the spectrum ofA\M is a
singleton {A 0 }, and

(h) the Jordan form of the linear transformation A M consists of a single


Jordan block.

Proof. The definition of a Jordan block and the description of its


invariant subspaces (Example 1.1.1) show that (h) implies all the other
statements in Theorem 2.5.1.
The implications (f)—»(c) and (b)—»(a) are obvious. Let us show that
(c)—>(d). Let xl,. . . , xk be a basis in M such that Ax} = \Qxl; Ax2 — \0x2 =
x{;. . . ; Axk - \0xk = xlt^l. The matiix of A\M in this basis is the k x k
Jordan block with A0 on the main diagonal, so the spectrum of A\M is the
singleton {A 0 }. If x — T.^laixi is an eigenvector of A (necessarily corre-
sponding to A 0 ), then (A — A O /)JC = 0, which implies E*=2 aixi_l = 0. As
jt,, . . . , xk are linearly independent, <x2 = • • • = ak = 0, and x is a scalar
multiple of*,. So (d) holds.
If x and y are two eigenvectors of A\M such that Span{jc) 7^Span{_y},
then for the ^-invariant subspaces J£, = Span{jc} and J£, = Span{_y} we have
^,^^2 and ^2^^,. So (e) implies (d).
It remains, therefore, to show that (d)—»(h), (a)—»(h), and (g)—>(h). To
this end we can assume that A\M is in Jordan form (written as a matrix in a
suitable basis in A 0 ):

If p > 1, then el and ek + 1 are two eigenvectors of A in M that are not scalar
multiples of each other; so (d)—»(h). Further, if p>\, then

is a direct sum of two nonzero A -in variant subspaces. Hence (a)—»(h).


Finally, assume that (g) holds. Then we have A, = A 2 = • • • = A p = A0 in
equation (2.5.1), and this equation implies
Irreducible Invariant Subspaces and Unicellular Transformations 67

On the other hand, the statement (g) implies that the left-hand side of this
equation is also equal to max{0, £, + • • • + kp - /}. In particular (for / = 1),
we have

which implies p — l. So (h) holds, and Theorem 2.5.1 is proved.

Observe that with M = <p", Theorem 1.9.3 is just the equivalence


(d)O(e). Thus the proof of that theorem is now complete.
A transformation A: (pw —» <p" is called unicellular if the Jordan form of A
consists of a single Jordan block. Comparing statements (a) and (h) of
Theorem 2.5.1, we obtain another characterization of a unicellular trans-
formation.

Proposition 2.5.2
A transformation A: <p" —> <p" is unicellular if and only if the whole space ((7"
is irreducible as an A-invariant subspace.

Indeed, rewriting Theorem 2.5.1 for the particular case M - <p", one
obtains various characterizations of unicellular transformations.
Another important property of a unicellular transformation is the "near"
uniqueness of an orthonormal basis in which this transformation has upper
triangular form (see Section 1.9).

Theorem 2.5.3
A transformation A: <p" —*• <P" is unicellular if and only if for any two
orthonormal bases x,,..., xn and y , , . . . , yn in which A has an upper
triangular form we have

where 6}E<p and |6> | = 1.

Proof. Assume that A is not unicellular. By Theorem 2.5.1 there exist


two eigenvectors jc, and yl (which can be assumed to have norm 1) such that
Spanl^J T^Spanjy,}. The proof of Theorem 1.9.1 shows that there exists
an orthonormal basis whose first vector is jc, and in which A has a
triangular form. Similarly, there exists such a basis whose first vector is _>>,.
So equation (2.5.2) does not hold for j = 1.
Assume now that A is unicellular, and let z , , . . . , zn be a Jordan basis
for A in (". So
68 The Jordan Form and Invariant Subspaces

For / = 1, 2,. . . , n define jc, to be a vector in Spanjz,,. . . , z,} that is


orthogonal to Span{zj,. . . , Z j _ , } and has norm 1. (By definition, xl =
a Z j / I J z J I for some a e (p with |a| = 1.) Then

and these subspaces are A invariant. By Proposition 1.8.4, A has an upper


triangular form with respect to the orthonormal basis * t , . . . , xn.
If A also has an upper triangular form in an orthonormal basis
y 1 ? . . . , ) > „ , then

is a chain of ^4-invariant subspaces. But the lattice of all ^-invariant


subspaces is a chain (Example 1.1.1); therefore, (2.5.3) is a unique com-
plete chain of A -invariant subspaces. Hence the chain (2.5.3) coincides with

Hence Spanjyj, . . . , y.) = Span{zj,. . . , z,} for i = 1, 2 , . . . , « , and the


orthonormality of y 1 5 . . . , yn implies that (2.5.2) holds.

We conclude this section with a proposition that was promised in


Section 1.1.

Proposition 2.5.4
The set Inv(^4) of all invariant subspaces of a fixed transformation
A: <£"-» <p" is either a continuum [i.e., there exists a bijection <p: Inv(A)-* ft]
or a finite set.

Proof. In view of Theorem 2.1.5 we can assume that A has only one
eigenvalue A0, that is, £%A (A) - <p". If A is unicellular, then by Example
2.1.1 the set Inv(y4) is finite (namely, there are exactly n + l ,4-invariant
subspaces). If A is not unicellular, then by the equivalence (c)O(d) in
Theorem 2.5.1 there exist two linearly independent eigenvectors x and v of
A: Ax = \Qx, and Ay = A0y. Then (Span{;t + ay} \ a E ft] is a set of A-
invariant subspaces which is a continuum. On the other hand, let ^ be the
map from the set of all /i-tuples (jc,,. . . , xn) of n — dimensional vectors
* ! , . . . , * „ onto Inv(yl) defined by <K*i> • • • , *„) = Span{jCj,. . . , xn] if the
subspace Span{jc,,. . . , xn} is A invariant and «K*i, • • • , xn) = {0} other-
wise. As the set of all n-tuples (*,,. . . , *„), jc, G <p" is a continuum, by an
elementary result in set theory it follows that lnv(A) is a continuum as
well.
Generators of Invariant Subspaces 69

2.6 GENERATORS OF INVARIANT SUBSPACES

Let M be an invariant subspace for the transformation A: <p"—><p". The


vectors jc,,. . . , xm £ <p" are called generators for M if

For example, any basis for M forms a set of generators for M. In connection
with this definition note that for any vectors yl, . . . , yp G <p" the subspace
Span{y,, . . . , yp, Aylt . . . , Ayp, A2ylt . . . , A2yp, . . .} is A invariant. The
particular case when M has one generator is of special interest (see also
Section 1.1), that is, when M = Span{*, Ax, A2x,. . .} for some jc G <p". In
this case we call M a cyclic invariant subspace (and is frequently referred to
as a "Krylov subspace" in the literature on numerical analysis).
The notion of generators behaves well with respect to similarity. That is,
if M is an ,4-invariant subspace with generators j c , , . . . , xm, then SM is an
SAS~{-invariant subspace with generators Sxl,...1Sxm (here 5 is any
invertible transformation). So the study of generators of /1-invariant sub-
spaces can be reduced to the study of generators of /-invariant subspaces,
where J is a Jordan form for A. Let us give some examples.

EXAMPLE 2.6.1. Let A = I (or, more generally, A = al, where a G <p).


Then a /c-dimensional subspace M in <p" (which is obviously ^-invariant) has
not less than k generators. Any set of vectors that span M is a set of
generators.

EXAMPLE 2.6.2. Let A = Jn( A) be the n x n Jordan block with eigenvalue A.


An ^-invariant subspace Mk — Span{e l5 . . . , ek] is cyclic with the generator
ek. D

The generators x\, . . . , xm of M are called minimal generators for M\im


is the smallest number of generators of M. Obviously, any set of minimal
generators is a minimal set of generators. (A set of generators xl,. . . , xp for
the ^-invariant subspace M is called minimal if any proper subset of
{AT,, . . . , xp] does not constitute a set of generators for M.) However, not
every minimal set of generators is a set of minimal generators. Let us
demonstrate this in an example.

EXAMPLE 2.6.3. Let

and let M = <f 2 be the ^-invariant subspace. The vector (1,1) is obviously a
generator for M, so a set of minimal generators must consist of a single
vector.
70 The Jordan Form and Invariant Subspaces

On the other hand, the set of two vectors {el, e2} is a set of generators of
<(72 that is minimal. Indeed, neither of the vectors e} and e2 is a generator of
2
<F .

The number of vectors in a set of minimal generators admits an intrinsic


characterization as follows.

Theorem 2.6.1
Let M be an A-invariant subspace. Then the number of vectors in a
set of minimal generators coincides with the maximal dimension m of
Ker(yl - \0I)\M, where A0 is any eigenvalue ofA\M.

Proof. We can assume that

a matrix in Jordan form (with respect to a certain basis in M). Further, we


can assume that A, = • • • = \m where m^p (recall that m is the maximal
number of Jordan blocks corresponding to any eigenvalue). Let xl,. . . , xq
be generators of M. Let yt be the m-dimensional vector formed by the k^h,
(&j + A: 2 )th, . . . , (kt + k2 + • • • + km)th coordinates of jt, ( / = ! , . . . , q).
Now

Examining the £jth, . . . , (kl + k2 + • • • + km)th coordinates of xi and using


the condition A, = • • • = A m , we see that et £Spanf}^, . . . , yq}. Similarly,
the condition

gives rise to the conclusion that e2 ESpanj^,. . . , yq}. Continuing in this


way, we eventually find that ei G Span{vj,. . . , yq}, i = 1, 2,. . . , m. So
yl,. . . , yq span the whole space <pOT, and thus q^m.
We now prove that there is a set of m generators for M. We proceed by
induction on m.
Suppose first that m = 1, that is, the eigenvalues A , , . . . , A p are all
different. Then the vector x = ek + ek +k + - • • + ek +...+k is a generator
for M. Indeed

Because of the form (2.6.1) of A\M the matrix (A - A 2 /)* 2 - • • (A - Ap/)*^


has the form T{ © 0^. © • • • © 0A , where T{ is an upper triangular non-
Generators of Invariant Subspaces 71

singular matrix. Hence the k{lh coordinate /, k of /, is nonzero. Now


(A - A,/) 7 /, has (k{ -y')th coordinate equal to/, k (and thus nonzero) and
all the coordinates of (A - A,/)'/, below the (A:, -;)th coordinate are zeros
(j = 1,. . . , &i - 1). Consequently, the vectors e{,. . . , ek belong to the
span of/,, (A - A,/)/,,. . . , (A - A,/)* 1 " 1 /!- Similarly, one shows that the
span of vectors

where

contains vectors ek + 1 , . . . , ek +k . Proceeding in this way we find


eventually that all the vectors et, i = 1, . . . , & , + • • • + kp belong to
Span{*, Ax, A2x,. . .}.
Assume now that m > 1. Suppose that for any transformation B and any
B-invariant subspace j£ such that

there exists a set of m — 1 generators in !£. Given the transformation


A: $"-+(", write

where Ml and M2 are some yl-invariant subspaces such that

(Such subspaces Ml and M2 are easily found by using the Jordan form of A.)
By the induction hypothesis we have a set of m — 1 generators jc,,. . . , xm_l
for the /4-invariant subspace M^ Also, we have proved that there is a
generator xm for the /4-invariant subspace M2. Then, obviously, jc,,. . . , xm
is a set of generators for M.

In particular, an v4-invariant subspace M, is cyclic if and only if there is


only one eigenvector (up to multiplication by a nonzero number) in M
corresponding to any eigenvalue of the restricton A\M.
We conclude this section with an example.
72 The Jordan Form and Invariant Subspaces

EXAMPLE 2.6.4. Let A = diag[A, A2 • • • A n ] where A p .. . , An are different


complex numbers. Then <p" is a cyclic subspace for A. A vector x =
(xlt. . . , xn) G <p" is cyclic, that is

if and only if all the coordinates xt are different from zero. Indeed, if xt = 0
for some i, then e, does not belong to Span{;t, Ax, A2x,. . .}. On the other
hand, if xf ^ 0 for / = ! , . . . , « , then

The determinant on the right-hand side is known as the Vandermonde


determinant, and it is well known that it is equal to n /<; (A y — A ( ) 5^0. So
det[*, Ax,. . . , A"~lx] ^0. It follows that the vectors x, Ax,. . . , A"~lx are
linearly independent and thus span <p".

2.7 MAXIMAL INVARIANT SUBSPACE IN A GIVEN SUBSPACE

Given a transformation A: <p"—> <p" and a subspace Ji C <p", we say that an


/1-invariant subspace M is maximal in jV if M C jV and there is no ^4-
invariant subspace that is contained in N and contains M properly.

Proposition 2.7.1
A maximal A-invaraint subspace in Jf exists, is unique and is equal to the sum
of all A-invariant subspaces that are contained in Ji.
Note that, because the dimension of N is finite, M can actually be
expressed as the sum of a finite number of j4-invariant subspaces.

Proof. Clearly, M is y4-invariant and contained in N. Also, M is maximal in


JV. This follows from the definition of M that implies that every /l-invariant
subspace in ^Vis contained in M.
For the uniqueness, assume that there are two different maximal ^4-invariant
subspaces in ^V, say, Ml and M2. Then M{ + M2 is any4-invariant subspace in jV
that contains M{ properly, a contradiction with the definition of a maximal
A-invariant subspace in JV.
Maximal Invariant Subspace in a Given Subspace 73

Observe that if .A" is A invariant, the maximal ^-invariant subspace in jV


coincides with N itself. At the other extreme, assume that Jf does not
contain any eigenvector, it follows that JV does not contain nonzero A-
invariant subspaces. Hence the maximal /1-invariant subspace in jV is the
zero subspace.
Let us consider some examples.

EXAMPLES 2.7.1. Let A = diag[A t , A 2 , . . . , A n ], where A 1 5 . . . , \n are dif-


ferent complex numbers. Then the maximal ,4-invariant subspace in Ji is
Span{e yj ,. . . , eik}, where e} , p = 1,. . . , k are all the vectors among
e t , . . . , en that are contained in JV (by definition, Span{efi,. . . , 6j } = {0}
if none of the vectors e{,. .. ,en belongs to JV).

EXAMPLE 2.7.2. Let A = -/ n (A 0 ), the n x n Jordan block with eigenvalue A 0 .


Then the maximal ^-invariant subspace in JV is Span{e,,. . . , e p _j},
where p is the minimal index such that ep^N (again, we put
Span{ij,. . . ,ep-\} — {0} if N does not contain e}).

The following more explicit description of maximal A -invariant subspaces


is sometimes useful.

Theorem 2.7.2
The maximal A-invariant subspace in Ji coincides with

where . (in particular, N(} = .A").

Proof. We have M C jV0 = JV. Further, M is ^4-invariant. For, if jc G M,


then A'x — yt for some yy e N (j — 0, 1, . . .) and

Hence Ax E M. It remains to verify that M is maximal in N. Let !£ be an


,4-invariant subspace contained in JV. Then for j - 0,1, . . . ,

and (because 3? is A invariant)

Combining these inclusions, we have


74 The Jordan Form and Invariant Subspaces

and M is indeed maximal in N.

In connection with Theorem 2.7.2 observe that JV* = A 'N with A is


invertible.
Given a transformation A: $"—> <p", it is well known that there are scalar
polynomials/(A) such that, if /(A) = E, = 0 a ( A', then

Indeed, the characteristic polynomial of A has this property (the Cayley-


Hamilton theorem). A nonzero polynomial g( A) of least degree—say, p—for
which g(A) - 0 is called a minimal polynomial for A (it can be shown that p
is uniquely defined). Then it is clear that for any integer j^p, we can
equate A' io a polynomial in A of degree less than p. Thus (in the notation
of Theorem 2.7.2)

where p is the degree of a minimal polynomial of A. Indeed, the inclusion C


in equation (2.7.1) is obvious. To prove the opposite inclusion, let <?(A) =
\p + SjT0' a-\' be a minimal polynomial of A, so

Let x E n^J N., so A'x E Jf for j = 0, . . . , p - 1. Then

and xE.Jfp. Assume inductively that we have already proved that xEJV^.,
j = 0,. . . , q — 1 for some q^p. Then

and x£.Nq. So actually x E nJL 0 ^, and equation (2.7.1) is proved.


Observe that (2.7.1) implies

for every q >p — I . In particular, equation (2.7.2) holds with q — n — 1.


Maximal Invariant Subspace in a Given Subspace 75

The case when ^V = KerC and C: $"->(' is a transformation is of


particular interest. In this case one can describe the maximal ^-invariant
subspaces in Ker C in terms of the kernels of transformations CA\ j =
0,1,....

Theorem 2.7.3
Given linear transformations A: (pn —> <p" and C:<p"—»<p r , the maximal
A-invariant subspace in Ker C is

Moreover, the subspace 3f(C, A) coincides with nj=0' Ker(CM') for every
integer q greater than or equal to the degree of a minimal polynomial of A.

Proof. In view of Theorem 2.7.2 and equality (2.7.2), we have only to


show that

However, this equality is immediately verified using the defintions of Ker C


and Ker(CA'). D

We say that a pair of linear transformations (C, A) where A: <p"—»<p"


and C: <£""—» <pr is a null kernel pair if the maximal ^-invariant subspace in
Ker C is the zero subspace, or, equivalently, if

It is easily seen that, also, the pair (C, A) is a null kernel pair if and only if

EXAMPLE 2.7.3. Let C = [c, • • • cj: <p"—»<p, and

For j = 0, 1 , . . . , « — 1 we have
76 The Jordan Form and Invariant Subspaces

and hence

where k is the smallest index such that ck ^ 0. In particular, (C, A) is a null


kernel pair if and only if c, 7^0.

The notion of null kernel pairs plays important roles in realization theory
for rational matrix functions and in linear systems theory, as we see in
Chapters 7 and 8. Here we prove that every pair of transformations has a
naturally defined null kernel part.

Theorem 2.7.4
Let C: <p"-*<p r and A: (f"1 -» <f" be transformations, and let Ml be the
maximal A-invariant subspace in Ker C. Then for every direct complement
M2 to M\ in (p", C and A have the following block matrix form with respect to
the direct sum decomposition <p" = M\ 4- M2.

where the pair C2: M2-^> (pr, A22: M2—* M2 is a null kernel pair. If <p" =
M\ + M'2 is another direct sum with respect to which C and A have the form

where the pair (C2, A'22) is a null kernel pair, then M\ is the maximal
A-invariant subspace in Ker C and there exists an invertible linear transfor-
mation S: M2-+M2 such that

Proof. As M, is A invariant and Cx - 0 for every x E M l, the transfor-


mations C and A indeed have the form of equality (2.7.3). Let us show that
the pair (C2, A22) is null kernel. Assume x G n°l 0 Ker(C 2 >422). As

where by * we denote a transformation of no immediate interest, we have


Maximal Invariant Subspace in a Given Subspace 77

and hence

On the other hand, x belongs to the domain of definition of A22, that is,
xE.M2. Since M1C\M2 = {Q}, the vector x must be the zero vector.
Consequently, (C2, A22) is a null kernel pair.
Now consider a direct sum decomposition <p" = M{ + M2, with respect to
which C and A have the form of equality (2.7.4) with the null kernel pair
(C2, A'22). As

we have

where the last equality follows from the null kernel property of (C2, A'22).
Hence M{ actually coincides with M^. Further, write the identity transfor-
mation /: <p"-> <p" as a 2 x 2 block matrix

Here 5: M2—>M2is a linear transformation that must be invertible in view


of the invertibility of /. The inverse of / (which is / itself) written as a 2 x 2
block matrix with respect to the direct sum decompositions (p" = M} + M2 =
Ml + M2 has the form

We obtain the equalities

which imply equality (2.7.5).

Observe that if (2.7.5) holds, one can identify both M2 and M'2 with (pm,
78 The Jordan Form and Invariant Subspaces

for some integer m. Write C2 and A22 as r x m and m x m matrices,


respectively, with respect to a fixed basis in <pr and some basis in <pm. Then
C'2 and A22 are transformations represented by the matrices C2 and A22,
respectively, with respect to the same basis in <pr and a possibly different
basis in <pm. So the pairs (C2, A22) and (C2, A22) are essentially the same.
We conclude this section with an example.

EXAMPLE 2.7.4. Let C and A be as in Example 2.7.3, and assume that


Cj = • • • = ck_l = 0, ck T^0 (k > 1). Then (C, A) is not a null kernel pair. The
null kernel part (C2, A22) of (C, A) (as in Theorem 2.7.3) is given by

2.8 MINIMAL INVARIANT SUBSPACES OVER A GIVEN SUBSPACE

Here we present properties of invariant subspaces that contain a given


subspace and are minimal with respect to this property. It turns out that
such subspaces are in a certain sense dual to the maximal subspaces studied
in the preceding section. We also see a connection with generators of
invariant subspaces, as studied in Section 2.6.
Given a transformation A: (p" -» <p" and a subspace N C <p", we say that
an .A-invariant subspace M is minimal over N if M D JV and there is no
/I-invariant subspace that contains .A" and is contained properly in M. As an
analog of Proposition 2.7.1 we see that a minimal A-invariant subspace over
N exists, is unique, and is equal to the intersection of all A-invariant
subspaces that contain N. The proof of this statement is left to the reader.
If N is A invariant, then the minimal ^-invariant subspace over Jf
coincides with M itself. On the other hand, it can happen that (p" is the
minimal .A-invariant subspace over N, even when JV is one-dimensional.

EXAMPLE 2.8.1. Let A = diag[A 1? A 2 , . . . , Aj, with different complex num-


bers A n . . . , \n. Let jV = Span £"=1 aiei be a one-dimensional subspace.
Then the minimal ^-invariant subspace over JV is Span{ey | ay ^0}. In
particular, if all ay are different from zero, then the minimal /1-invariant
subspace over ^V is <p".

Our next result expresses the duality between minimal and maximal
invariant subspaces in a precise form. (Recall that by Proposition 1.4.4 the
subspace M is A invariant if and only if its orthogonal complement M x is A*
invariant.)

Proposition 2.8.1
An A-invariant subspace M is minimal over Jf if and only if the A*-invariant
subspace M x is maximal in JV1.
Minimal Invariant Subspaces Over a Given Subspace 79

Proof. Assume that the ^-invariant subspace M is minimal over N. In


particular, M D jV, so M ± C N^. If there were an A*-invariant subspace «2*
such that Jt±C£CJf± and ML ^ g, the subspace £L would be ^
invariant and ./^DJ^D.yV, M¥^ ^^. This contradicts the definition of ^ as
a minimal /1-invariant subspace over N. Hence ML is a maximal .4*-
invariant subspace in jV1. Reversing the argument, we find that, if the
A*-invariant subspace M L is a maximal in JfL, the /l-invariant subspace M
is minimal over N.

Proposition 2.8.1 allows us to obtain many properties of minimal in-


variant subspaces from the corresponding properties of maximal invariant
subspaces proved in the preceding section. For example, let us prove an
analogue of Theorem 2.7.2 in this way.

Theorem 2.8.2
The minimal A-invariant subspace M over N coincides with

Proof. By Proposition 2.8.1 and Theorem 2.7.2, we have

where ^ = {xE. <£" | A*'x £ JV 1 }. It is not difficult to check that for


;' = 0 , 1 , . . .

Indeed, let y G A'N, so that y = A'z for some z G jV. Then for every x G <p"
such that A * 'x €! Jf ^ we have

Hence yE.Nf. If the equality (2.8.2) were not true, there would exist a
nonzero y0 G N*~ such that y0 would be orthogonal to A'Jf. Hence for every
2 G jV we have

which implies _y0 G jV;, a contradiction with y0 G Ji^.


Now (2.8.1) and (2.8.2) give

Note that the equality M = EJL 0 A'Jf can also be verified directly without
80 The Jordan Form and Invariant Subspaces

difficulty. To this end, observe that the subspace


invariant: if x = A'z for some z £ N, then Ax — A'*lz belongs to M0.
Obviously, MQ contains JV. If M' is an A -in variant subspace that contains N,
then

So M0 is indeed the minimal ^-invariant subspace over N.


As all subspaces under consideration are finite dimensional, the sum
T,J=0AjJi is actually the sum of a finite number of subspaces A'N (/ =
0,1,...). In fact

where q is any integer greater than or equal to the degree p of a minimal


polynomial for A. Indeed, it is sufficient to verify equation (2.8.3) for q = p.
Let r(A) = \p + SpJ a y A ; be a minimal polynomial of A, so

Assuming by induction that we have already proved the inclusion

for some s>p, for x = Asy, y G Jf we have

So the inclusion follows, and by induction we have proved


the inclusion C in (2.8.3) (with q — p). As the opposite inclusion is obvious,
(2.8.3) is proved.
Going back to Theorem 2.8.2, observe that

where / j , . . . , fk is a basis in Jf. In other words, the ^-invariant subspace M


has a set of k generators, where k = dim J{. Combining this observation with
Theorem 2.6.1, we obtain the following fact.
Minimal Invariant Subspaces Over a Given Subspace 81

Theorem 2.8.3
If M is the minimal A-invariant subspace over N, and k = dim N, then for any
eigenvalue A0 of A \M we have

In particular, the theorem implies that if jVis one-dimensional, then M is


cyclic.
It is easy to produce examples when the inequality in (2.8.4) is strict. For
instance, in the extreme case when JV = (p" and A has n distinct eigenvalues,
we have M - <p" and

The case when N — Im B, and B: <p* —> <p" is a transformation is of special


interest. Noting that A'(lm B) = Im(A'B), Theorem 2.8.2 together with
(2.8.3) gives the following.

Theorem 2.8.4
Let #: <p j —»<p" and A: $"—>$" be transformations. Then the minimal
A-invariant subspace over Im B coincides with

for every integer q greater than or equal to the degree of a minimal


polynomial for A. [In particular, J>(A, B) = £":* ImCA'B).]

We say that a pair of transformations (A, B), where A: (p" —» <p" and
B: <ps —> (p", is a full-range pair if the minimal ^-invariant subspace over
Im B coincides with <p", or, equivalently, if

It is easy to see that, also, (A, B) is a full-range pair if and only if

The duality generated by Proposition 2.8.1 now takes the form: the pair
(A, B) is a full-range pair if and only if the adjoint pair (B*, A*) is a null
kernel pair. This follows from the orthogonal decomposition
82 The Jordan Form and Invariant Subspaces

which is obtained directly from Proposition 1.4.4.

EXAMPLE 2.8.2. Let

Then

where m is the index determined by the properties that bm^Q, bm+ l =


• • • = bn = 0. In particular, $(A, B) = {0} if and only if B = 0, and the pair
(A, B) is full range if and only if bn9^0.

As with null kernel pairs, full-range pairs will be important in realization


theory for rational matrix functions and in linear systems theory (see
Chapters 7 and 8).
We conclude this section with an analog of Theorem 2.7.4 concerning the
full-range part of a pair of transformations.

Theorem 2.8.5
Given transformations A: <p w —»<p", B: (p*—»<p", let N^ be the minimal A-
invariant subspace over Im B. Then for every direct complement N2 to Nr in
<p", and with respect to the decomposition <p" = JV, + Jf2, the transformations
A and B have the block matrix form

where the pair A, l: JV, - » J f , Bl: (p5 -» Jf} is full-range. If f = Jf { + Jf '2 is


another direct sum decomposition with respect to which A and B have the
form

with full-range pair (A'n, B{), then Jf[ = JV, and A'n = An, B{ = Br
Marked Invariant Subspaces 83

Proof. Equality (2.8.5) holds because jVj is A invariant and


Further, in view of (2.8.5) we have

so (/4 U , #,) is indeed a full-range pair. If (2.8.6) holds for a direct sum
decomposition <p" = M\ + N'2, then

which is equal to N\ in view of the full-range property of (A'n, B { ) . Hence


N\ is the minimal /4-invariant subspace over Im B and thus N\ = .yV,. Now
clearly A\, = A n (which is the restriction of A to JVJ = JV,) and B{ = B^. D

2.9 MARKED INVARIANT SUBSPACES

Let A: <p"—»(p" be a transformation, and let

be a basis in which A has the Jordan form

Obviously, any subspace of the form

for some choice of integers ra(, 0< mi < kf, is .4 invariant. [Here ra, = 0 is
interpreted in the sense that the vectors fn, . . . , f i k do not appear in
(2.9.1) at all.] Such .A-invariant subspaces are called marked (with respect to
the given basis _/) - in which A is in the Jordan form).
The following example shows that, in general, not every A-invariant
subspace is marked (with respect to some Jordan basis for A).

EXAMPLE 2.9.1. Let


84 The Jordan Form and Invariant Subspaces

We shall verify that the A-invanant subspace M = Span{e,, e2} is not


marked in any Jordan basis for A. Indeed, it is easy to see (because A2 ^Q
and rank A = 2) that the Jordan form of A is

So any Jordan basis of A is of the form /,, /2, /3, g, where


A/2 = /i» A/3 = /2- If -^ were marked with respect to this basis, we would
have either M = S p a n { f l , g } or M = Span{/,, /2}. The former case is
impossible because A\M 7^0, and the latter case is impossible because it
implies M, C Im A, which is not true (e2^lm A).
The description of marked invariant subspaces can be reduced to the
description of invariant subspaces which are marked with respect to a fixed
Jordan basis. This reduction is achieved with the use of matrices commuting
with J.
Theorem 2.9.1
Let J be an n x n matrix in Jordan form. Then every marked J-invariant
subspace t£ can be represented in the form !£ = BM, where M is marked (with
respect to the standard basis, e{,. . . , en in <J7") and B is an.nxn matrix
commuting with J.
Proof. Assume j£ = BM, where M is marked (with respect to the
standard basis) and BJ = JB. Denoting by / , , . . . , / „ the columns of fi, we
find that J^is a marked ./-invariant subspace in the basis/,,. . . , fn. (In view
of the equality BJ = JB, the matrix J has the same Jordan form in the basis
/„•..,/„.)
Conversely, if 56 is marked with respect to some Jordan basis / , , . . . , / „
of J, then, denoting B = [/,/2 • • • / „ ] and M = B~1S£, we obtain L in the
required form $ = BM.

Note that the characteristic property of a marked invariant subspace


depends only on the parts of this subspace corresponding to each eigen-
value: an yl-invariant subspace M is marked if and only if for every
eigenvalue A0 of A the A\<% (x) -invariant subspace M n 5£A (v4) is marked.
This follows immediately from the definition of a marked subspace.
In view of Example 2.9.1 it is of interest to find transformations for which
every invariant subspace is marked. We have the following result.

Theorem 2.9.2
Let A: <£"—»• <p" be a transformation such that, for every eigenvalue A0 of A,
at least one of the following holds: (a) the geometric multiplicity of \Q is equal
Functions of Transformations 85

to its algebraic multiplicity; (b) dim Ker(v4 - A07) = 1. Then every A-


invariant subspace is marked.

Proof. Considering M n £%A 0 (A) and A\^ A (A) in place of M and A,


0

respectively, we can assume that A has a single eigenvalue A0.


If dim Ker(A - A07) = 1, then there is a unique maximal chain of A-
invariant subspaces:

where /,, /2, . . . , / „ is any Jordan chain for A. So obviously every A-


invariant subspace is marked.
Assume now that the geometric multiplicity of the eigenvalue A0 of A is
equal to its algebraic multiplicity. Then A & (A) = A0/, and since every
nonzero vector in £%A (A) is an eigenvector tor A, again every ^-invariant
subspace is marked.

It is easy to produce examples of transforamtions for which the hypoth-


eses of Theorem 2.9.2 fail, but nevertheless every invariant subspace is
marked; for example

2.10 FUNCTIONS OF TRANSFORMATIONS

We recall the definition of functions of matrices. Let /(A) = E' =O A'/ be a


scalar polynomial of the complex variable A, and let /I: <p" —>• <p" be a
transformation written as a matrix in the standard basis. Then f(A) is
defined as f ( A ) = E' =O ftA'. Letting Jk( A) be the Jordan block of size k with
eigenvalue A, define

a Jordan form for A. Then

A computation shows that


86 The Jordan Form and Invariant Subspaces

and in general the (s, q) entry of A (A)' is if q 2= 5 and zero


otherwise (here = i\l[(q-s)\(i-(q-s))\\ if i^q-s and

= 0 if i < q - s). It follows that

Hence for fixed A the matrix f(A) depends only on the values of the
derivatives

where /AJ , . . . , fir are all the different eigenvalues of A and m; is the height
of /Uy, that is, the maximal size of Jordan blocks with eigenvalue ^ in a
Jordan form of A. Equivalently, the height of /uy is the minimal integer m
such that Ker(/l - /iy/)m = R^A). This observation allows us to define f(A)
by equality (2.10.1) not only for polynomials /(A), but also for complex-
valued functions that are analytic in a neighbourhood of each eigenvalue of
A.
Note that for a fixed A the correspondence f(\)—>f(A) is an algebraic
homomorphism. This means that for any two functions/(A) and g(A) that
are analytic in a neighbourhood of each eigenvalue of A the following holds:
Functions of Transformations 87

On the left-hand side the function af + fig (which is analytic in a neighbour-


hood of each eigenvalue of A) is naturally defined by

Also, we define

These properties can be verified by a straightforward computation using


(2.10.1). For example:

where /(A) and g( A) are analytic functions in a neighbourhood of A0 and


h(\) = /(A)g(A). In particular, the property (2.10.2) ensures that
f(A)g(A) - g(A)f(A) for any functions /(A) and g( A) that are analytic in a
neighbourhood of each eigenvalue of A.
In the sequel we need integral formulas for functions of matrices. Let A
90 The Jordan Form and Invariant Subspaces

be an n x n matrix, and let F be any simple rectifiable contour in the


complex plane with the property that all eigenvalues of A are inside F. For
instance, one can take F to be a circle with center 0 and radius greater than
\\A\\ (here and elsewhere the norm \\A\\ of a transformation A: <p" —» <p" is
defined by

where for a vector x —

Proposition 2.10.1

Proof. Suppose first that T is a Jordan block with eigenvalue A = 0:

Then

(recall that n is the size of T). So


Functions of Transformations 89

It is then easy to verify (2.10.3) for a Jordan block T with eigenvalue A0 (not
necessarily 0). Indeed, T - A0/ has an eigenvalue 0, so by the case already
considered

where F0 = {A - A0 | A EF}. The change of variables /x = A + A0 on the


left-hand side leads to

Now

so (2.10.3) holds for the block T.


Applying (2.10.3) separately for each Jordan block, we can carry the
result further for arbitrary Jordan matrices J. Finally, for a given matrix A
there exists a Jordan matrix / and an invertible matrix 5 such that T =
S~1JS. Since (2.10.3) is already proved for J, we have

As a consequence of Proposition 2.10.1 we see that for a scalar poly-


nomial /(A) the formula

holds. Note that here F can be replaced by a composite contour that consists
of a small circle around each eigenvalue of A. (Indeed, the matrix function
(/A - A)"1 is analytic outside the spectrum of A.) Using this observation and
formula (2.10.1) we see that for any function that is analytic in a neighbour-
hood of each eigenvalue of A, the formula

holds, where F consists of a sufficiently small circle around each eigenvalue


of A [so that /(A) is analytic inside and on F].
90 The Jordan Form and Invariant Subspaces

A transformation A: <p" -»<p" (or an n x n matrix /4) is called diagon-


able if there exist eigenvectors Jtj, . . . , * „ of /I that form a basis in <p".
Equivalently, an n x n matrix .4 is diagonable if for some nonsingular matrix
S the matrix S~1AS has a diagonal form:

So a diagonable matrix has n Jordan blocks in its Jordan form with each
block of size 1. If one knows that A is diagonable, then f ( A ) can be given a
meaning [by the same formula (2.10.1)] for every function /(A) that is
defined on the set of all eigenvalues of A. So, given a diagonable A, there is
an 5 such that

For any function /(A) that is defined for A = a,,. . . , A = an, put

In particular, f ( A ) is defined for a hermitian A and any function/defined on


^?. Also, for a unitary A and any function / defined on the unit circle, the
matrix f(A) is well defined in this way.
Consider now the application of these ideas to the exponential function.
This is subsequently used in connection with the solution of systems of
differential equations with constant coefficients. As/(A) = e* is analytic on
the whole complex plane, the linear transformation f(A) - eA is defined for
every linear transfomation A: <P"—>• <p". In fact

is given by the same power series as ek. In order to verify (2.10.7), we can
assume that A is in the Jordan form:

Then, by definition

On the other hand


Functions of Transformations 91

/ = 0,1,. . . . So the (s, q) (q^s) entry in the matrix

is

Hence formula (2.10.7) follows.


This argument shows that the series (2.10.7) converges for every
transformation A. Actually, it converges absolutely in the sense that the series

converges as well.
The exponential function appears naturally in the solution of systems of
differential equations of the type

Here akj are fixed (i.e., independent of /) complex numbers, and


xt(t),. . . ,*„(/) are functions of the real variable / to be found. Denoting
A = [akj\"k i=i and x(t) = (x,(f),. . . ,*„(/)), we rewrite this system in the
form

A general solution is given by the formula


92 The Jordan Form and Invariant Subspaces

where x0 = *(0) is the initial value of x(t).


In connection with this formula observe that e(t+s)A = e'AesA, as follows,
for instance, from (2.10.7). In fact, eA+B -eAeB provided A and B com-
mute. However, eA+B is not equal to eA • eB in general.

2.11 PARTIAL MULTIPLICITIES AND INVARIANT SUBSPACES


OF FUNCTIONS OF TRANSFORMATIONS

From the definition of a function of a transformation A: <p n —» <p" it follows


immediately that if \l,. . . , An are the eigenvalues of A (not necessarily
distinct), then /(A,),. . . , /(A n ) are the eigenvalues of f(A). Moreover we
compute the partial multiplicities of f(A), as follows.

Theorem 2.11.1
Let A: <p" —» <p" be a transformation with distinct eigenvalues /A, ,. . . , /ur and
partial multiplicities mn,. . . , mik corresponding to /A,., / = 1,. . . , r. Letf(\)
be an analytic function in a neighbourhood of each ti( (if all m(; are 1, it is
sufficient to require that /(/*,-) be defined for i — 1, . . . , r). For each m^
define a positive integer sif as follows: stj = m{- ifra/7 = 1 or iff(k\Hj) = 0for
& = ! , . . . , m-j — 1; otherwise /^(/x,) is the first nonvanishing derivative of
/(A) at /A,. Then the partial multiplicities of f ( A ) corresponding to the
eigenvalue A are as follows:

for all indices i such that /(/a ( ) = A.

Proof. By Corollary 2.2.2, if suffices to consider the case when A =


Jm(n>) is a Jordan block. Using equations (2.10.1), we see that

where f(s\ /x) is the first nonvanishing derivative of /(A) at /LA. [If m = 1 or if
f(k\u.) = 0 for k = 1, . . . , m, we put s = m.] More generally
Partial Multiplicities and Invariant Subspaces 93

Denoting the left-hand side of this relation by f y , note that the sizes of
Jordan blocks of f(A) are uniquely determined by the sequence 11,. . . , tm.
Indeed, the number of Jordan blocks of f ( A ) with size not less than; is just
tj - /,_}, where ; = 1,. . . , m and tQ is zero by definition. This observation,
together with (2.11.1), leads to the conclusion of the theorem.
Let us give an illustrative example for Theorem 2.11.1.
EXAMPLE 2.11.1. Let A be a 23 x 23 matrix with only two distinct eigen-
values 0 and 1, and with partial multiplicities 1,4,9 corresponding to the
eigenvalue 0, and with partial multiplicities 2,7 corresponding to the eigen-
value 1. Let /(A) = A 2 ( A - I) 4 Then f ( A ) has the unique eigenvalue 0, and
the different partial multiplicities of A have the following contribution to the
partial multiplicity (PM) of f(A), according to Theorem 2.7.1:

The PM 1 for A gives rise to the PM 1 of f(A).


The PM 4 of A gives rise to the PM values 2,2 of f ( A ) .
The PM 9 of A gives rise to the PM values 4,5 of f(A).
The PM 2 of A gives rise to the PM values 1,1 of f(A).
The PM 7 of A gives rise to the PM values 1,2,2,2 of f(A).

Hence a Jordan form for the transformation A2(A - I)4 has four Jordan
blocks of size 1,5 Jordan blocks of size 2, one Jordan block of size 4 and one
Jordan block of size 5, all corresponding to the eigenvalue zero.

Note that for a given transformation A: <p" —»<p" and a function /(A) such
that f(A) can be defined as above, there exists a polynomial p(\) such that
p(A) =f(A). Indeed, take p(\) such that

where /Xj, . . . , /u,r are all the different eigenvalues of A and my is the height
Of f l j .
Consider now the connections between invariant subspaces of A and the
invariant subspaces of a function of A.

Proposition 2.11.2
If M is an invariant subspace of a transformation A, then M is also invariant
for every transformation f(A), where /(A) is a function for which f(A) is
defined.

The proof is immediate:


94 The Jordan Form and Invariant Subspaces

for some polynomial p( A) = E;10 p ; A ; ; so for every x E M we have A'x G M,


I = 0, . . . , ra, and thus

Note that in general the linear transformation f(A) may have more invariant
subspaces than A, as the following example shows.

EXAMPLE 2.11.2. Let

The invariant subspaces of A are {0}, Span{e,}, <p2, but the invariant
subspaces of A2 = 0 are all the subspaces in <p2.

We characterize the cases when f ( A ) has exactly the same invariant


subspaces as A.

Theorem 2.11.3
(a) Assume that f( A) is an analytic function in a neighbourhood of each
eigenvalue / u , , . . . , inr of A (/u,,, . . . , /u,r are assumed to be distinct). Then
f(A) has exactly the same invariant subspaces as A if and only if the following
conditions hold: (i) /(^,) ^/(/u. y ) if /LI, ^ /t y ; (ii) /'(/"•,•) ^0 /or every eigen-
value jjLj with height greater than 1. (b) If A is diagonable and f( A) is a
function defined at each eigenvalue of A, then f ( A ) has exactly the same
invariant subspaces as A if and only if condition (/) of part (a) holds.

Proof. We shall assume that A has the Jordan form

where each A, coincides with some /n;, \< j< r.


Suppose that (i) does not hold, and suppose, for instance, that A, ^ A 2
but/^Aj) = /(A 2 ). Formula (2.10.1) shows that et + ek{ + t is an eigenvector of
f ( A ) corresponding to the eigenvalue/(A,). Hence Span{e, + ek +,} is/(A)
invariant; but this subspace is easily seen not to be A invariant.
Suppose that (ii) does not hold; say, £ , > ! and /'(A,) = 0. Formula
(2.6.1) implies that el + e2 is an eigenvector oif(A) corresponding to/^).
So Span{e, + e2} is an f(A)-invariant subspace that is not A invariant.
Assume now that (i) and (ii) hold. As f ( A ) = p(A) for some polynomial
p(A), we can assume that/(A) is itself a polynomial. Condition (i) imposed
on the polynomial / ensures that the root subspace of A corresponding to
some eigenvalue A0 is also a root subspace of f ( A ) corresponding to the
Exercises 95

eigenvalue/(A 0 ). Since every ,4-invariant [resp. f(A)-invariant] subspace is


a direct sum of ^-invariant [resp. f(A)-invariant] subspaces, each summand
belonging to a root subspace, we can assume that cr(A) consists of a single
point; say, cr(A) = {0}. Replacing, if necessary, /(A) by a/(A) + /3, where
a, j3 E <p are constants and a 7^0—such a replacement does not alter the set
lnv(f(A)) of all /(v4)-invariant subspaces—we can assume that /(O) = 0,
/'(0) = 1. In this case

But then f(A)=AF, where F= I + Ef = / ai +lA' is an invertible matrix.


Clearly, every v4-invariant subspace is also AF invariant. Note that F~l is a
polynomial in AF (this can be checked, for instance, by direct computation
in each Jordan block of A, using the fact that A is a Jordan matrix and
(r(A) = {0}); so every A F- in variant subspace is also (AF • F ~ l ) invariant,
that is, A invariant. Thus we have proved that lnv(f(A)) - Inv A.

2.12 EXERCISES

2.1 Let

where A ^ : <p m —»<p m and A2: <p"—»<p" are transformations.


(a) Prove or disprove the following statement: every ^-invariant
subspace is a direct sum of an Al -invariant subspace and an
/!2-invariant subspace.
(b) Prove or disprove the preceding statement under the additional
condition that the spectra of A, and A2 do not intersect.
(c) Prove or disprove the preceding statement under the additional
condition that A, and A2 are unicellular with the same eigenvalue.
2.2 Let A: $"—> <f" be a transformation with A2 = I. Describe the root
subspaces of A.
2.3 Describe the root subspaces of a transformation A such that A* = /.
How many spectral /4-invariant subspaces are there?
2.4 Find the root subspaces of the transformation

where B: <p"-» <p" is some transformation and A, ^ A 2 . Is it true that


& A .(X) = Ker( A,/ - A), i - 1, 2?
% The Jordan Form and Invariant Subspaces

2.5 Find the Jordan form for the following matrices A:

For each one of the matrices A and each eigenvalue A0 of A, check


whether 9t^(A) = Ker( A07 - A).
2.6 Find all possible Jordan forms of transformations A: <p"-* <p" satisfy-
ing A2 = 0. Express the number of Jordan blocks of size 2 in terms of
A.
2.7 Find the Jordan form of the transformation

2.8 What is the Jordan form of Qk, k = 2, 3,. . . , where Q is given in


Exercise 2.7.]
2.9 Describe the Jordan form of a circulant matrix

where al,. , . ,an are complex numbers. Prove that there exists an
invertible matrix S independent of al,. . . , an such that SAS~l is
diagonal. [Hint: A is a polynomial in Q, where Q is defined in
Exercise 2.7].
2.10 What is the Jordan form of the transformation
Exercises 97

2.11 Find the Jordan form of the transformation

2.12 Let AI, A2,. • • , An be transformations on <p2, and define

(a) Show that A is similar to a block diagonal matrix with 2 x 2


blocks on the main diagonal. [Hint: On writing

for /' = 1,. . . , n, A is similar to

where B is the circulant matrix

and analogously for C, D, and F. Now use the existence of one


similarity transformation that takes B, C, D, and Fto the Jordan
form (Exercise 2.9).]
(b) Prove that in the Jordan form of A only Jordan blocks of size 2
or 1 may appear.
(c) Show that if all Aj, j = 1,. . . , n are diagonal matrices, then A is
diagonable, that is, the Jordan form of A is a diagonal matrix. Give
an example of nondiagonal A j , . . . , An for which A is diagonable
nevertheless.
98 The Jordan Form and Invariant Subspaces

2.13 Prove that the block circulant matrix

where A\,. . . , An are k x k matrices, has Jordan blocks of sizes less


than or equal to k in its Jordan form.
2.14 Find the Jordan form for the transformation

where a 0 , . . . , « „ _ , E <p and the polynomial A" - E"=0' af\' has n


distinct zeros. Show that a similarity that takes A to its Jordan form is
given by the Vandermonde matrix of type

2.15 Let

(a) Prove that, for each eigenvalue, A has only one Jordan block in
its Jordan form. (Hint: Use the description of partial multi-
plicities of A in terms of the matrix polynomial A/ — A; see the
appendix.)
(b) Find the Jordan form of A.
Exercises 99

2.16 Show that any matrix of the type

where At are k x k matrices, has not more than k Jordan blocks


corresponding to each eigenvalue in its Jordan form.
2.17 What is the Jordan form of the upper triangular Toeplitz matrix

where « 0 ,. . . , an_l are complex numbers with a{ ^0?


2.18 Find the Jordan form of (/„( A0))*, k = 2, 3, Show that [/„(())]*
has infinitely many invariant subspaces if k>2.
2.19 Describe the Jordan form of the matrix in Exercise 2.17 without the
restriction flj^O. When does this matrix have infinitely many in-
variant subspaces? [Hint: Observe that the matrix is a polynomial in
7n(0) and use Theorem 2.11.1.]
2.20 Prove that an n x n matrix A is similar to its transpose AT.
2.21 Let A: $"-> <f" be a transformation such that p(A) = 0, where p( A) is
a polynomial of degree k with k distinct zeros A,, . . . , \k.
(a) Show that Ker( A/ - A) * {0}, ; = 1, . . . , k.
(b) Verify the direct sum decomposition

(c) Prove that A is diagonable.


2.22 Assume that the transformation A: <p" —> <p" satisfies the equation
p(A) = 0, where p(A) is a polynomial. Let A0 be a zero of/?(A), and
let k be its multiplicity. Show that the .A-invariant subspace Im q(A),
where q(X) = p( A)( A - A 0 )~*, is spectral.
2.23 Prove that for any transformation A: (p" —> (p" the inequalities

dim Ker As+l + dim Ker ,4s'1 <2dim Ker As, 5= 1,2,...

hold.
100 The Jordan Form and Invariant Subspaces

2.24 Prove that a transformation A: <p"—><p" has the property that


AM L C M ± for every ^-invariant subspace M if and only if A is
normal.
2.25 Show that a transformation has only one-dimensional irreducible
subspaces if and only if A is diagonable.
2.26 Find the minimal number of generators in ℂ^n of the following
transformations:
(a) The circulant

(b) The lower triangular Toeplitz matrix

(c) The companion matrix

2.27 Prove that if A: ℂ^n → ℂ^n has one-dimensional image, the minimal
number of generators of any A-invariant subspace is less than or
equal to n − 1. Show that Ker A is the only nontrivial A-invariant
subspace whose minimal number of generators is precisely n − 1.
2.28 For a given transformation A, denote by g(M) the minimal number of
generators in an A-invariant subspace M. Prove that

where the maximum is taken over all eigenvalues λ_0 of A [g({0}) is
interpreted as zero].

2.29 Let

where A_1 and A_2 are transformations such that every invariant
subspace of each of them is cyclic. Prove or disprove the following
statements:
(a) Every A-invariant subspace is cyclic.
(b) Every A-invariant subspace has not more than two minimal
generators.
2.30 Show that the vector (0, 0, . . . , 0, 1) ∈ ℂ^n is a generator of ℂ^n as an
invariant subspace of a companion matrix.
2.31 Find the minimal A-invariant subspace over Im B for the following
pairs of transformations:

Here a_1, . . . , a_n are complex numbers.


2.32 Find the maximal A-invariant subspace in Ker C for the following
pairs of transformations:
(a) C = [1 0 ⋯ 0]; A is a companion matrix.
(b) C = [1 0 ⋯ 0]; A is an upper triangular Toeplitz matrix.
(c) C = [I_k 0 ⋯ 0]; A is as in Exercise 2.31, (c).

2.33 Prove or disprove the following statements:


(a) If M_1 is the maximal A-invariant subspace in V_1 and M_2 is the
maximal A-invariant subspace in V_2, then M_1 + M_2 is the maxi-
mal A-invariant subspace in V_1 + V_2.
(b) If M_i and V_i (i = 1, 2) are as in (a), then M_1 ∩ M_2 is the maximal
A-invariant subspace in V_1 ∩ V_2.
(c) The analog of (a) for the case of minimal A-invariant subspaces M_i
over V_i, i = 1, 2.
(d) The analog of (b) for the case of minimal A-invariant subspaces
M_i over V_i, i = 1, 2.
2.34 Find when the following pairs of matrices are full-range pairs:

(b) (A, B), where A is an n × n matrix with A^n = 0 and B is an
n × 1 matrix.
2.35 Find when the following pairs of matrices are null kernel pairs:
(a) ([c_1, . . . , c_n], J_n(λ_0)^k), where k ≥ 1 is a fixed integer and
c_1, . . . , c_n ∈ ℂ.
(b) (C, A), where C is a 1 × n matrix and A is an n × n upper triangular
matrix with zeros on the main diagonal.
2.36 Given a full-range pair A: ℂ^n → ℂ^n, B: ℂ^m → ℂ^n, prove that if
A': ℂ^n → ℂ^n, B': ℂ^m → ℂ^n are transformations sufficiently close to A
and B, respectively (i.e., ‖A' − A‖ < ε, ‖B' − B‖ < ε, where ε > 0
depends on A and B only), then (A', B') is a full-range pair as well.
2.37 Prove that for every pair of transformations A: ℂ^n → ℂ^n, B: ℂ^m → ℂ^n
there exists a sequence of full-range pairs (A_p, B_p), p = 1, 2, . . . such
that lim_{p→∞} ‖A_p − A‖ = 0 and lim_{p→∞} ‖B_p − B‖ = 0.
2.38 State and prove the analogs of Exercises 2.36 and 2.37 for null kernel
pairs.
2.39 Let A and B be transformations on ℂ^n. Show that the biggest
A-invariant (or, equivalently, B-invariant) subspace M for which
A|_M = B|_M consists of all vectors x ∈ ℂ^n such that A^j x = B^j x,
j = 1, 2, . . . .
2.40 Let A_1, . . . , A_k be transformations on ℂ^n. Show that the biggest
A_1-invariant subspace M for which A_1|_M = A_p|_M for p = 1, . . . , k
consists of all x ∈ ℂ^n such that A_1^j x = A_p^j x for p = 1, . . . , k and
j = 1, 2, . . . .

2.41 Show that the transformation e^A is nonsingular for every transfor-
mation A: ℂ^n → ℂ^n. Find the eigenvalues and the partial multiplicities
of e^A in terms of the eigenvalues and the partial multiplicities of A.
2.42 Give an example of a transformation A such that Inv(A) is finite but
Inv(e^A) is infinite.
2.43 Show that for a transformation A the series

converges provided all eigenvalues of A are less than 1 in absolute
value. For such an A prove that A = e^{f(A)} − I, so one can write
f(A) = ln(I + A). Prove that A and ln(I + A) have exactly the same
invariant subspaces.
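A small numerical sketch of the convergence claim, assuming the omitted display is the usual logarithm series f(A) = Σ_{j≥1} (−1)^{j+1} A^j / j:

```python
import numpy as np
from scipy.linalg import logm

A = np.array([[0.2, 0.5],
              [0.0, -0.3]])         # eigenvalues 0.2 and -0.3, both < 1 in modulus
S = np.zeros_like(A)
term = np.eye(2)
for j in range(1, 200):
    term = term @ A                  # term = A^j
    S += (-1) ** (j + 1) * term / j
assert np.allclose(S, logm(np.eye(2) + A))   # f(A) = ln(I + A)
```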
2.44 Find all marked A-invariant subspaces for the transformation A of
Example 2.9.1.
2.45 Show that for any transformation A, all A-hyperinvariant subspaces
are marked.
2.46 For which of the following classes of n × n matrices are all invariant
subspaces marked?
(a) Companion matrices
(b) Block companion matrices

with 2 × 2 blocks A_j (p = n/2)

(c) Upper triangular Toeplitz matrices
(d) Circulant matrices
(e) Block circulant matrices

with 2 × 2 blocks A_j
(f) Matrices A such that A² = 0

2.47 Prove that every invariant subspace of a matrix of type

is marked.
2.48 Prove that for any transformation A: ℂ^3 → ℂ^3 every invariant sub-
space is marked.
2.49 Find all Jordan forms of transformations A: ℂ^4 → ℂ^4 for which there
exists a nonmarked invariant subspace.
Chapter Three

Coinvariant and
Semiinvariant
Subspaces

In this chapter we study two classes of subspaces closely related to invariant
ones; namely, coinvariant and semiinvariant subspaces. A subspace is called
coinvariant if it is a direct complement to an invariant subspace. A subspace
is called semiinvariant if it is a coinvariant part of an invariant subspace.
Also, we introduce here the related notion of a triinvariant decomposition
for a transformation. This requires a decomposition of the whole space into
a direct sum of three subspaces with respect to which the transformation has
a block upper triangular form. It follows that the first, second, and
third subspace are invariant, semiinvariant, and coinvariant, respectively.
The triinvariant decomposition will play an important role in subsequent
applications.

3.1 COINVARIANT SUBSPACES

A subspace M ⊆ ℂ^n is called coinvariant for the transformation A: ℂ^n → ℂ^n
(or, in short, A coinvariant) if there is an A-invariant direct complement to
M in ℂ^n. Consider some simple examples.

EXAMPLE 3.1.1. Let A be an n × n Jordan block. Then for each i (1 ≤ i ≤ n)
Span{e_i, e_{i+1}, . . . , e_n} is an A-coinvariant subspace (although there are
many other A-coinvariant subspaces). For this subspace there is a unique
A-invariant subspace that is its direct complement, namely,
Span{e_1, e_2, . . . , e_{i−1}} ({0} if i = 1). Note that, in this case, the only
subspaces that are simultaneously A invariant and A coinvariant are the
trivial ones {0} and ℂ^n. □


EXAMPLE 3.1.2. Let A = diag[λ_1, . . . , λ_n], where all λ_i are different. As we
have seen in Example 1.1.3, the only A-invariant subspaces are {0}, ℂ^n, and
Span{e_{i_1}, . . . , e_{i_k}}, k = 1, . . . , n − 1, for any choice of i_1 < i_2 < ⋯ < i_k. In
contrast, every subspace in ℂ^n is A coinvariant. Indeed, let M =
Span{x_1, . . . , x_q}, where x_1, . . . , x_q are linearly independent vectors in ℂ^n.
Then the columns of the n × q matrix X = [x_1 x_2 ⋯ x_q] are linearly indepen-
dent. So there exist q rows of X, say, the i_1th, . . . , i_qth rows, which are also
linearly independent. Put {j_1, . . . , j_{n−q}} = {1, . . . , n} \ {i_1, . . . , i_q} and
N = Span{e_{j_1}, . . . , e_{j_{n−q}}}, so that N is an A-invariant subspace. As, by
construction, the n × n matrix

is nonsingular, N is a direct complement to M in ℂ^n. Thus M is A
coinvariant. □

EXAMPLE 3.1.3. If A = αI, α ∈ ℂ, then every subspace in ℂ^n is obviously A
coinvariant. For every A-coinvariant subspace M there is a continuum of
A-invariant subspaces that are direct complements to M in ℂ^n. □

For an A-coinvariant subspace M and any projector P onto M such that
Ker P is A invariant, we have PAP = PA. This follows, for instance, when
equation (1.5.5) is applied to I − P, or else it can be proved directly.
Conversely, if PAP = PA for some projector P onto a subspace M ⊆ ℂ^n,
then M is A coinvariant and Ker P is an A-invariant direct complement to M
in ℂ^n.
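The equality PAP = PA is easy to test numerically. A minimal sketch with hypothetical data, taking M = Span{e_1, e_2} and the A-invariant complement N = Span{e_3, e_4, e_5}:

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 2, 5
A = rng.standard_normal((n, n))
A[:k, k:] = 0.0                       # force N = Span{e_3, e_4, e_5} to be A-invariant
P = np.zeros((n, n))
P[:k, :k] = np.eye(k)                 # projector onto M = Span{e_1, e_2} along N
assert np.allclose(P @ A @ P, P @ A)  # PAP = PA
```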
Given an A-coinvariant subspace M and a projector P onto M such that
Ker P is A invariant, the linear transformation A has the following block
triangular form:

with respect to the decomposition ℂ^n = Im P + Ker P. In particular, we find
that every eigenvalue of the compression PA|_M: M → M of A to its coin-
variant subspace M is also an eigenvalue of A. Indeed, in the represent-
ation (3.1.1) the compression PA|_M coincides with A_{11}, and this immediately
implies that σ(PA|_M) ⊆ σ(A).
We note that, essentially, the compression to a coinvariant subspace
depends on the invariant direct complement only. (Actually, we have
encountered this property already in Theorem 2.7.4 and its proof.)

Proposition 3.1.1
Let M_1 and M_2 be A-coinvariant subspaces with a common A-invariant
direct complement N. Then the compressions P_1 A|_{M_1}: M_1 → M_1 and
P_2 A|_{M_2}: M_2 → M_2 (where P_j is the projector on M_j along N for j = 1, 2) are
similar.

Proof. Write

with respect to the direct sum decompositions ℂ^n = M_1 + N and ℂ^n =
M_2 + N, respectively. Also, write the identity transformation I: ℂ^n → ℂ^n in
the 2 × 2 block matrix form

(so S_11: M_1 → M_2, S_12: N → M_2, S_21: M_1 → N, S_22: N → N). It is easily seen
that S_12 = 0 and S_22 = I_N, the identity transformation on N. As I is invert-
ible, the transformation S_11 must be invertible as well, and

Now

which gives, in particular, A_11 = S_11^{−1} A'_11 S_11. It remains to observe that
P_1 A|_{M_1} = A_11 and P_2 A|_{M_2} = A'_11. □

The following property of coinvariant subspaces is analogous to the
property of A-invariant subspaces proved in Section 1.4.

Proposition 3.1.2
A subspace M is A coinvariant if and only if its orthogonal complement M^⊥ is
A* coinvariant.

Proof. Assume that M is A coinvariant, and let N be an A-invariant
direct complement to M in ℂ^n. Then M^⊥ ∩ N^⊥ = (M + N)^⊥ = (ℂ^n)^⊥ = {0},
and since dim M^⊥ + dim N^⊥ = (n − dim M) + (n − dim N) = n, we have
M^⊥ + N^⊥ = ℂ^n. As N^⊥ is A* invariant (see Section 1.4), it follows that M^⊥
is A* coinvariant. Conversely, if M^⊥ is A* coinvariant, then by the part of
this proposition already proved, the subspace (M^⊥)^⊥ = M is (A*)* coin-
variant, that is, M is A coinvariant. □

A subspace M ⊆ ℂ^n is called orthogonally coinvariant for the transfor-
mation A: ℂ^n → ℂ^n (in short, orthogonally A coinvariant) if the orthogonal
complement M^⊥ of M is A invariant.

Proposition 3.1.3
A subspace M is orthogonally A-coinvariant if and only if M is invariant for
the adjoint linear transformation A*.

Proof. Assume that M is orthogonally A coinvariant, so Ax ∈ M^⊥ for
every x ∈ M^⊥. Then we have

for all y ∈ M. But the left-hand side of (3.1.2) is just ⟨x, A*y⟩. Hence
A*y ∈ (M^⊥)^⊥ = M for all y ∈ M, and M is A* invariant. Reversing this
argument we find that if M is A* invariant, then Ax ∈ M^⊥ for every x ∈ M^⊥,
that is, M is orthogonally A coinvariant. □

We observe that, in general, A-coinvariant subspaces do not form a
lattice; that is, the sum and intersection of A-coinvariant subspaces need not
be A coinvariant. This is illustrated in the following example.

EXAMPLE 3.1.4. Let

The only A-invariant subspaces are {0}, Span{e_1}, Span{e_1, e_2}, ℂ^3. Con-
sequently, all A-coinvariant subspaces are as follows:

Indeed, assume Span{u, v} is a two-dimensional subspace for which
Span{e_1} is a direct complement. Writing u = (u_1, u_2, u_3), v = (v_1, v_2, v_3),
we see that det [[u_2, u_3], [v_2, v_3]] ≠ 0. Hence, replacing u and v with their
linear combinations if necessary, we see that

for some x, y ∈ ℂ. Now Span{e_2, e_3} and Span{e_2, (1, 0, 1)} are A-coin-
variant subspaces but their intersection (which is equal to Span{e_2}) is not.
Also, Span{e_3} and Span{(1, 0, 1)} are A-coinvariant subspaces but their
sum (which is equal to Span{e_3, (1, 0, 1)}) is not. □

In contrast, it follows immediately from Proposition 3.1.3 that the set of
all orthogonally A-coinvariant subspaces is a lattice. Note also the following
property of orthogonally coinvariant subspaces.

Proposition 3.1.4
Any transformation has a complete chain of orthogonally coinvariant sub-
spaces.

Proof. Let A: ℂ^n → ℂ^n be a transformation. As we have seen in Section
1.9, there is an orthonormal basis x_1, . . . , x_n for ℂ^n in which A has the
upper triangular form:

Clearly, the subspaces Span{x_k, . . . , x_n}, k = 1, . . . , n are orthogonally A
coinvariant and form a complete chain. □
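Numerically, such a basis is produced by the Schur decomposition. A minimal sketch (hypothetical data; SciPy assumed available):

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
T, Q = schur(A, output='complex')        # A = Q T Q*, T upper triangular
assert np.allclose(Q @ T @ Q.conj().T, A)
k = 2                                    # Span{q_1, ..., q_k} is A-invariant, so
B = Q[:, :k]                             # Span{q_{k+1}, ..., q_n} is orthogonally
img = A @ B                              # A-coinvariant
assert np.allclose(img - B @ (B.conj().T @ img), 0)
```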

3.2 REDUCING SUBSPACES

An invariant subspace L of a transformation A: ℂ^n → ℂ^n is called reducing
for A if L + M = ℂ^n (a direct sum) for some other A-invariant subspace M. In other
words, a subspace L ⊆ ℂ^n is reducing for A if it is simultaneously A
invariant and A coinvariant. In particular, {0} and ℂ^n are trivially reducing.
A more important example follows from Theorem 2.1.2. This shows that the
root subspaces R_{λ_0}(A) are reducing for A. A unicellular linear transfor-
mation is an example in which the only reducing subspaces are the trivial
ones {0} and ℂ^n. On the other hand, A = I is a linear transformation for
which every subspace in ℂ^n is invariant and reducing.
As a transformation on ℂ^n with only one Jordan block (i.e., a unicellular
transformation) has the smallest possible number of reducing subspaces, one
might expect that a transformation with the most Jordan blocks has the most
reducing subspaces. This is indeed so. Recall that a transformation is called
diagonable if its Jordan form is a diagonal matrix.

Theorem 3.2.1
If A is diagonable, then each invariant subspace of A is reducing. Conversely,
if each invariant subspace of A is reducing, then A is diagonable.

Proof. Assume that A is diagonable. Using Proposition 1.4.2, it is easily
seen that each invariant subspace of A is reducing if and only if the same is
true for S^{−1}AS, for any nonsingular matrix S. So we can assume that
A = diag[α_1 ⋯ α_n] for some α_1, . . . , α_n ∈ ℂ. Let λ_1, . . . , λ_p be all the
different numbers among the α_i values, and for notational convenience
assume that

where

are integers. Obviously, the eigenvalues of A are λ_1, . . . , λ_p, and the root
subspaces of A are

(by definition we put k_0 = 0). By Theorem 2.1.5 any A-invariant subspace M
has the form

where M_i ⊆ R_{λ_i}(A). Let N_i be any direct complement to M_i in R_{λ_i}(A). As
Ax = λ_i x for every x ∈ R_{λ_i}(A), the subspace N_i is obviously A invariant.
Hence the subspace N = N_1 + ⋯ + N_p, which is a direct complement to M
in ℂ^n, is also A invariant. This means, by definition, that M is reducing.
Conversely, assume that A is not diagonable. Let M be the A-invariant
subspace of A spanned by its eigenvectors. As A is not diagonable, M ≠ ℂ^n.
If N is any other A-invariant subspace and x is an eigenvector of A|_N, then x
is also an eigenvector of A, and thus x ∈ M. So M ∩ N ≠ {0} for every
nonzero A-invariant N. Consequently, M is not reducing. □

An important class of diagonable transformations A: ℂ^n → ℂ^n are those
that have n distinct eigenvalues λ_1, . . . , λ_n. Indeed, the corresponding
eigenvectors x_1, . . . , x_n are linearly independent (and, therefore, form a
basis in ℂ^n) because x_i ∈ R_{λ_i}(A) and the subspaces R_{λ_1}(A), . . . , R_{λ_n}(A)
form a direct sum. We have the following.

Corollary 3.2.2
If a transformation A: ℂ^n → ℂ^n has n distinct eigenvalues, then every A-
invariant subspace is reducing.

Consider now the situation in which an A-invariant subspace is reducing
and is orthogonal to its A-invariant complementary subspace. An invariant
subspace M of a transformation A: ℂ^n → ℂ^n is called orthogonally reducing
if its orthogonal complement M^⊥ is also A invariant.

Theorem 3.2.3
Every invariant subspace of A is orthogonally reducing if and only if A is
normal.

Proof. Recall first (Theorem 1.9.4) that A is normal if and only if there
is an orthonormal basis of eigenvectors x_1, . . . , x_n of A.
Assume that A is normal, and let x_1, . . . , x_n be an orthonormal basis of
eigenvectors of A that is ordered in such a way that x_1, . . . , x_{k_1}
correspond to the eigenvalue λ_1; x_{k_1+1}, . . . , x_{k_2} correspond to the
eigenvalue λ_2; . . . ; x_{k_{p−1}+1}, . . . , x_{k_p} correspond to the eigenvalue λ_p.
Here λ_1, . . . , λ_p are all the different eigenvalues of A. Arguing as in the
proof of Theorem 3.2.1, we see that any A-invariant subspace is of the form

where M_i ⊆ Span{x_{k_{i−1}+1}, . . . , x_{k_i}}, i = 1, . . . , p (by definition k_0 = 0), and its
orthogonal complement

in ℂ^n is also A invariant. Here M_i^⊥ is the orthogonal complement to M_i in
the space Span{x_{k_{i−1}+1}, . . . , x_{k_i}}.
Conversely, assume that every A-invariant subspace is orthogonally
reducing. In particular, every A-invariant subspace is reducing, and by
Theorem 3.2.1, A = diag[α_1, . . . , α_n] in a certain basis in ℂ^n.
Denoting by λ_1, . . . , λ_p all the different eigenvalues of A, it follows that
R_{λ_i}(A) is spanned by the eigenvectors of A corresponding to λ_i. Now for
each i_0, 1 ≤ i_0 ≤ p, the subspace R_{λ_{i_0}}(A) is the unique A-invariant subspace
that is a direct complement to Σ_{i≠i_0} R_{λ_i}(A) in ℂ^n. [This follows from the fact
that any A-invariant subspace M has the form M = Σ_{i=1}^p M ∩ R_{λ_i}(A).] The
orthogonal reducing property of R_{λ_{i_0}}(A) implies that the subspaces
R_{λ_1}(A), . . . , R_{λ_p}(A) are orthogonal to each other. Taking an orthonormal
basis in each R_{λ_i}(A) (which necessarily consists of eigenvectors of A
corresponding to λ_i), we obtain an orthonormal basis in ℂ^n in which A has a
diagonal form. Hence A is normal. □

The proof of Theorem 3.2.3 shows that if every A-invariant subspace is
reducing and every root subspace for A is orthogonally reducing, then every
A-invariant subspace is orthogonally reducing.
Note also the important special cases of Theorem 3.2.3: every invariant
subspace of a hermitian or unitary transformation is orthogonally reducing.
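For the hermitian case this is easy to illustrate numerically: a span of eigenvectors and its orthogonal complement are both invariant. A small sketch with hypothetical data:

```python
import numpy as np

rng = np.random.default_rng(2)
H = rng.standard_normal((4, 4))
H = H + H.T                             # hermitian (real symmetric)
w, V = np.linalg.eigh(H)                # orthonormal eigenvectors
M, Mperp = V[:, :2], V[:, 2:]           # M and its orthogonal complement
for B in (M, Mperp):                    # both subspaces are H-invariant
    img = H @ B
    assert np.allclose(img - B @ (B.T @ img), 0)
```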

3.3 SEMIINVARIANT SUBSPACES

A subspace M ⊆ ℂ^n is called semiinvariant for a transformation A: ℂ^n → ℂ^n
(or, in short, A semiinvariant) if there exists an A-invariant subspace N such
that N ∩ M = {0} and the sum M + N is again A invariant. By taking
N = {0} we see that any A-invariant subspace is also A semiinvariant.
If M is an A-coinvariant subspace, then there is an A-invariant direct
complement N to M in ℂ^n (so the conditions that N ∩ M = {0} and that
M + N is A invariant are automatically satisfied). Thus we see that any
A-coinvariant subspace is also A semiinvariant. In general, a subspace
M ⊆ ℂ^n is A semiinvariant if and only if M is A|_L-coinvariant for some
A-invariant subspace L containing M.

EXAMPLE 3.3.1. Let A be an n × n Jordan block. Then it is easily seen that
the subspaces Span{e_i, e_{i+1}, . . . , e_j}, where 1 ≤ i ≤ j ≤ n, are A semiin-
variant (but there are many other A-semiinvariant subspaces). This example
shows that in general there exist semiinvariant subspaces that are neither
invariant nor coinvariant. □

Consider now the A-semiinvariant subspace M, and let N be an A-
invariant subspace such that N ∩ M = {0} and M + N is A invariant. Then
we have a direct sum decomposition

where L is a direct complement to M + N in ℂ^n. To emphasize the fact that
this is a decomposition of ℂ^n into the sum of invariant, semiinvariant, and
coinvariant subspaces, respectively, we call equation (3.3.1) a triinvariant
decomposition associated with the A-semiinvariant subspace M. Triinvariant
decompositions play an important role in the applications of Chapters 5
and 7.

Note that in general a triinvariant decomposition associated with a given
M is not unique. With respect to the triinvariant decomposition (3.3.1), the
transformation A has the following 3 × 3 block form:

Here A_11: N → N, A_22: M → M, A_33: L → L, A_12: M → N, A_23: L → M,
A_13: L → N. The presence of zeros in (3.3.2) follows from the A invariance
of N and M + N (see Section 1.5). The converse is also true: if A is a
transformation from ℂ^n into ℂ^n, and A has the form (3.3.2) with respect to
some direct sum decomposition (3.3.1), then M is A semiinvariant, and the
A-invariant subspace N is such that M + N is A invariant as well.
In particular, it follows from the formula (3.3.2) that the spectrum of the
compression PA|_M (where P: M + N → M + N is the projector on M along
N) of A to its semiinvariant subspace M is contained in the spectrum of A.
We characterize A-semiinvariant subspaces in terms of functions of A,
as follows.

Theorem 3.3.1
Let A: ℂ^n → ℂ^n be a transformation. The following statements are equivalent
for a subspace M ⊆ ℂ^n: (a) M is semiinvariant for A; (b) for a suitable
projector P mapping ℂ^n onto M, we have

(c) for any function f(λ) such that f(A) is defined, we have

where P is a suitable projector with Im P = M.

In (b), PA^m|_M is understood as a transformation from M into M. Recall
that f(A) is certainly defined for a function f(λ) that is analytic on the
spectrum of A and, if A is diagonable, for any function f(λ) that is merely
defined on the spectrum of A. As the spectrum of PA|_M is contained in the
spectrum of A (provided M is A semiinvariant), it follows that f(PA|_M) is
well defined if f(λ) is analytic on the spectrum of A. We shall see in Section
4.1 that if A is diagonable, so is PA|_M (provided M is A semiinvariant), and
thus f(PA|_M) is well defined in the case when A is diagonable and f(λ) is
defined on the spectrum of A.

Proof. Assume that M is A semiinvariant, and write A as in (3.3.2)
with respect to the triinvariant decomposition (3.3.1). Let P be the projec-
tor on M along N + L. Then PA|_M = A_22. Now a straightforward calcula-
tion shows that

so that PA^m|_M = A_22^m = (PA|_M)^m.
Now assume that (b) holds. Let L be the smallest A-invariant subspace
containing M. (In other words, L is the intersection of all A-invariant
subspaces that contain M.) Equivalently, L is the span of all vectors of type
A^j x, where x ∈ M and j = 0, 1, . . . . In particular, L ⊇ M. Let Q be a
projector on L such that Ker Q ⊆ Ker P (e.g., take any direct complement
N' to L ∩ Ker P in Ker P, so that Ker P = N' + (L ∩ Ker P), and let Q be
the projector on L along N'). Then Im(I − Q) ⊆ Ker P or, equivalently,
P(I − Q) = 0, that is, PQ = P. As L ⊇ M, the equality QP = P obviously
holds. Now

so Q − P is a projector, and Im(Q − P) is a direct complement to M in L.
We shall prove that Im(Q − P) is A invariant, which shows that M is
semiinvariant for A. Clearly, QAQ = AQ (because Im Q = L is A in-
variant) and QAP = AP (because for every vector x ∈ Im P = M, the vector
Ax belongs to L and thus QAx = Ax). Let us show that

For every x ∈ M and for any j = 0, 1, 2, . . . we have

where we have used the property (b) twice. As the subspace L is spanned by
A^j x, x ∈ M, j = 0, 1, . . . , we conclude that PAPy = PAy for every y ∈ L,
which amounts to the equality PAPQ = PAQ, and (3.3.4) follows. Using the
equalities QAQ = AQ, QAP = AP, PAP = PAQ, we easily verify that
(Q − P)A(Q − P) = A(Q − P). This means that Im(Q − P) is A invariant.
Finally, let f(λ) be a function such that f(A) is defined. Then f(A) =
p(A), where p(λ) is a polynomial such that

where λ_1, . . . , λ_s are all the distinct eigenvalues of A, and m_k is the height
of λ_k (k = 1, . . . , s). Such a polynomial p(λ) always exists: for example, the
Lagrange-Sylvester interpolation polynomial, which is given by the formula

where

and ψ_k(λ) = (λ − λ_k)^{−m_k} Π_{i=1}^{s} (λ − λ_i)^{m_i}, k = 1, . . . , s [see, e.g., Chapter V
of Gantmacher (1959)]. As the eigenvalues of PA|_M are also eigenvalues of
A, and the height of λ_0 ∈ σ(PA|_M) does not exceed the height of λ_0 as an
eigenvalue of A (see Section 4.1), we obtain f(PA|_M) = p(PA|_M). Now
equality (3.3.3) follows from (b). Conversely, (c) obviously implies (b). □
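Statement (c) can be checked numerically for, say, f = exp. A minimal sketch with hypothetical block sizes, building A directly in the triangular form (3.3.2):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
n1, n2, n3 = 2, 2, 2                  # dim N, dim M, dim L
n = n1 + n2 + n3
A = rng.standard_normal((n, n))
A[n1:, :n1] = 0.0                     # N = Span{e_1, e_2} is A-invariant
A[n1 + n2:, :n1 + n2] = 0.0           # M + N is A-invariant, M = Span{e_3, e_4}
M = slice(n1, n1 + n2)
A22 = A[M, M]                         # A22 = PA|_M, P the projector along N + L
assert np.allclose(expm(A)[M, M], expm(A22))   # f(PA|_M) = P f(A)|_M for f = exp
```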

Given an A-semiinvariant subspace M with an associated triinvariant
decomposition ℂ^n = N + M + L, the proof of Theorem 3.3.1 shows that (b)
holds with P being the projector on M along N + L. And conversely, if a
projector P satisfies (b), then Ker P = N + L, where N and L are A-
invariant and A-coinvariant subspaces, respectively, taken from some tri-
invariant decomposition associated with M.
Extending the notion of orthogonally coinvariant subspaces, we introduce
the notion of orthogonally semiinvariant subspaces as follows. A subspace
M ⊆ ℂ^n is called orthogonally semiinvariant for a transformation A: ℂ^n → ℂ^n
if there exists an A-invariant subspace N such that M + N is again A
invariant and M is the orthogonal complement to N in M + N. Clearly, an
orthogonally semiinvariant subspace is semiinvariant. For an orthogonally
A-semiinvariant subspace M there exists an orthogonal decomposition

where L = (M + N)^⊥. Decomposition (3.3.5) will be called an orthogonal
triinvariant decomposition associated with M. Again, for a given M there are
generally many associated orthogonal triinvariant decompositions. (The
extreme case of this situation appears for A = 0.)
Consider the orthogonal triinvariant decomposition (3.3.5), and choose
orthonormal bases in N, M, and L. Then we represent A as the 3 × 3 block
matrix

in the orthonormal basis for ℂ^n obtained by putting together the ortho-
normal bases in N, M, and L. As the representation (3.3.6) is in an ortho-
normal basis, we have

This leads to the following conclusion.

Proposition 3.3.2
An orthogonally A-semiinvariant subspace is also orthogonally A* semiin-
variant.

Indeed, if equation (3.3.5) holds, then L is A* invariant, and M is the
orthogonal complement to L in the A*-invariant subspace N^⊥ = M ⊕ L.
An analog of Theorem 3.3.1 holds for orthogonally semiinvariant sub-
spaces.

Theorem 3.3.3
The following statements are equivalent for a transformation A: ℂ^n → ℂ^n and
a subspace M ⊆ ℂ^n: (a) M is orthogonally semiinvariant for A; (b) we have

where P_M is the orthogonal projector on M; (c) for any function f(λ) such
that f(A) is defined we have

The proof is like the proof of Theorem 3.3.1, with the only difference
that an orthogonal triinvariant decomposition is used and the projector Q is
taken to be orthogonal.

3.4 SPECIAL CLASSES OF TRANSFORMATIONS

In this section we shall describe coinvariant and semiinvariant subspaces for
certain classes of transformations. We start with the relatively simple case of
unicellular transformations.

Proposition 3.4.1
Let A: ℂ^n → ℂ^n be a unicellular transformation that is represented as a
Jordan block in some basis x_1, . . . , x_n. Then a k-dimensional subspace
M ⊆ ℂ^n is A-coinvariant if and only if M is spanned by a set of vectors
y_1, . . . , y_k with the property that x_1, . . . , x_{n−k}, y_1, . . . , y_k is a basis in ℂ^n.
A k-dimensional subspace M is A semiinvariant if and only if M =
Span{y_1, . . . , y_k}, where the vectors y_1, . . . , y_k are such that, for some
index l with k ≤ l ≤ n, we have y_i ∈ Span{x_1, . . . , x_l}, i = 1, . . . , k, and
x_1, . . . , x_{l−k}, y_1, . . . , y_k is a basis in Span{x_1, . . . , x_l}.

The proof follows easily from the definitions of coinvariant and semi-
invariant subspaces and from the fact that the only A-invariant subspaces
are {0} and Span{x_1, . . . , x_l}, l = 1, . . . , n.
Consider now a diagonable transformation A: ℂ^n → ℂ^n, so that A =
diag[λ_1, . . . , λ_n] in some basis in ℂ^n. As we have seen in Example 3.1.2, if all
λ_i are different, then every subspace in ℂ^n is A coinvariant and hence also A
semiinvariant. In fact, this conclusion holds for any diagonable transfor-
mation (not necessarily with all eigenvalues distinct). Indeed, consider the
transformation B given by the matrix diag[μ_1, . . . , μ_n] with different μ_i
values in the same basis in which A is given by diag[λ_1, . . . , λ_n]. As every
B-invariant subspace is also A invariant, it follows that every B-coinvariant
subspace is also A coinvariant. But we have already seen that every
subspace is B-coinvariant.
We consider now the orthogonally coinvariant and semiinvariant sub-
spaces. We say that a transformation A: ℂ^n → ℂ^n is orthogonally unicellular
if there exists a Jordan chain x_1, . . . , x_n of A such that the vectors
x_1, . . . , x_n form an orthogonal basis in ℂ^n. Clearly, any orthogonally
unicellular transformation is unicellular.

Proposition 3.4.2
Let A: ℂ^n → ℂ^n be an orthogonally unicellular transformation, and let
x_1, . . . , x_n be its orthogonal Jordan chain. Then the only orthogonally
A-coinvariant subspaces are Span{x_k, x_{k+1}, . . . , x_n}, k = 1, . . . , n, and {0}.
The only orthogonally A-semiinvariant subspaces are Span{x_k, . . . , x_l},
1 ≤ k ≤ l ≤ n, and {0}.

Again, Proposition 3.4.2 follows from the description of all A-invariant
subspaces.
Consider a normal transformation A: ℂ^n → ℂ^n: AA* = A*A. By
Theorem 1.9.4, A has an orthonormal basis of eigenvectors (and conversely,
if a transformation has an orthonormal basis of eigenvectors, it is normal). It
turns out that normal transformations are exactly those for which the classes
of invariant subspaces and of orthogonally semiinvariant subspaces coincide.

Theorem 3.4.3
The following statements are equivalent for a transformation: (a) A is
normal; (b) every A-invariant subspace is orthogonally A coinvariant; (c)
every orthogonally A-coinvariant subspace is A invariant; (d) every or-
thogonally A-semiinvariant subspace is A invariant.

Proof. Obviously, (d) implies (c). Assume that A is normal, and let
λ_1, . . . , λ_k be all the different eigenvalues of A. Then

is an orthogonal sum, and A|_{R_{λ_i}(A)} = λ_i I. Let M be an orthogonally A-
semiinvariant subspace, so that M is the orthogonal complement to an
A-invariant subspace N in another A-invariant subspace L. We have

where N_i ⊆ L_i ⊆ R_{λ_i}(A), i = 1, . . . , k. Denoting by M_i the orthogonal com-
plement of N_i in L_i, the definition of M implies that

It follows that M is A invariant. So (a) implies (d). One sees easily that (a)
implies (b) also.
It remains to show that (c) ⇒ (a) and (b) ⇒ (a). Assume (c) holds; that is
(cf. Proposition 3.1.3), every A*-invariant subspace is A-invariant. Write A*
in an upper triangular form with respect to some orthonormal basis:

As Span{x_1, . . . , x_k}, k = 1, . . . , n are A*-invariant subspaces, they are
also A invariant. Hence (Proposition 1.8.4) A also has an upper triangular
form in the same basis:

On the other hand, equality (3.4.1) implies


Comparison of (3.4.2) and (3.4.3) reveals that b_{ij} = 0 for i < j, and A is
normal.
Assume now that (b) holds, and write

in some orthonormal basis x_1, . . . , x_n in ℂ^n. The subspaces
Span{x_1, . . . , x_k}, k = 1, . . . , n are A invariant and, by (b), orthogonally A
coinvariant. Hence Span{x_{k+1}, . . . , x_n}, k = 1, . . . , n − 1 are A-invariant
subspaces, which means that A has a lower triangular form

Comparing equations (3.4.4) and (3.4.5), we find that A is normal. □

As a corollary of Theorem 3.4.3 we obtain the following characterization
of a normal transformation in terms of its invariant subspaces.

Corollary 3.4.4
A transformation A: ℂ^n → ℂ^n is normal if and only if a subspace M is A
invariant exactly when its orthogonal complement is A invariant.

Indeed, it follows from the definition that the subspace M^⊥ is A invariant
if and only if M is orthogonally A coinvariant.

3.5 EXERCISES

3.1 Prove that, in Example 3.1.2, there is a unique A-invariant direct
complement to the A-coinvariant subspace M if and only if M itself is
A invariant.
3.2 Prove that a subspace M is A coinvariant (resp. A semiinvariant) if
and only if M is (αA + βI) coinvariant [resp. (αA + βI) semiin-
variant]. Here α, β are complex numbers and α ≠ 0.

3.3 Show that a subspace M is A coinvariant (resp. A semiinvariant) if
and only if SM is SAS^{−1} coinvariant (resp. SAS^{−1} semiinvariant),
where S is an invertible transformation.

3.4 Let A: ℂ^n → ℂ^n (n ≥ 3) be a unicellular transformation. Give an
example of a subspace M ⊆ ℂ^n that is not A semiinvariant. List
all such subspaces when n = 3.

3.5 Show that every subspace in ℂ^n is A coinvariant if and only if A is
diagonable (i.e., it is similar to a diagonal matrix).

3.6 Prove that every subspace in ℂ^n is coinvariant for any n × n circulant
matrix.

3.7 Give an example of a nondiagonable transformation A: ℂ^n → ℂ^n such
that every subspace in ℂ^n is A semiinvariant.

3.8 Find all the coinvariant subspaces for the matrices

3.9 Find all coinvariant and semiinvariant subspaces for the matrix

3.10 Prove that every reducing A-invariant subspace is reducing also for
f(A), where f(λ) is any function such that f(A) is defined. Is the
converse true?

3.11 If J is a Jordan block, for which positive integers k does the matrix J^k
have a nontrivial reducing invariant subspace? Is the reducing sub-
space unique?

3.12 Prove that an A-invariant subspace M is reducing if and only if
M ∩ R_{λ_0}(A) is reducing for every eigenvalue λ_0 of A.

3.13 Find all the triinvariant decompositions ℂ^3 = N + M + L with
dim N = dim M = dim L = 1 for the following matrices:
Chapter Four

Jordan Forms
for Extensions
and Completions

Consider a transformation A: ℂ^n → ℂ^n and an A-coinvariant subspace M.
Thus there is an A-invariant subspace N such that ℂ^n = M + N, and there is
a projector P onto M along N. The main problem of this chapter is: given
Jordan normal forms for A|_N and PA|_M, what are the possible Jordan forms
for A itself? In general, this problem is open. Here we present partial
results and important inequalities.

4.1 EXTENSIONS FROM AN INVARIANT SUBSPACE

Let M ⊆ ℂ^n be a subspace, and consider a transformation A_0: M → M. A
linear transformation A: ℂ^n → ℂ^n is called an extension of A_0 if Ax = A_0 x
for every x ∈ M. Then, in particular, M is A invariant. Also, A_0 is called the
restriction of A to M. We are interested in the Jordan form (or, equivalently,
the partial multiplicities) of A_0 and its extensions.
We start with a relatively simple but important case in which A_0 as well as
its extension A are in the Jordan form and have special spectral properties.
These spectral properties ensure that the partial multiplicities corresponding
to a particular eigenvalue λ_0 are the same for A_0 and its extension A.

Theorem 4.1.1
Let J_1 and J_2 be matrices in Jordan normal form with sizes p × p and q × q,
respectively. Let B be a p × q matrix and

Denote by J_10 and J_20 the Jordan submatrices of J_1 and J_2, respectively,
formed by those Jordan blocks with the same eigenvalue λ_0.
Then the partial multiplicities of J corresponding to λ_0 coincide with the
partial multiplicities of the submatrix

of J, where B_0 is the submatrix of J formed by the rows that belong to the
rows of J_10 and by the columns that belong to the columns of J_20 (so actually
B_0 is a submatrix of B).

Theorem 4.1.1 is used later to reduce problems concerning the Jordan
form of an extension to the case when the transformations involved have
only one eigenvalue. The proof of Theorem 4.1.1 is based on two lemmas,
which are also independently important.

Lemma 4.1.2
Let A, B, C be given matrices of sizes n × n, m × m, and n × m, respectively.
Consider the equation

where X is an n × m matrix to be found. Equation (4.1.1) has a unique
solution X for every C if and only if σ(A) ∩ σ(B) = ∅.

This lemma follows immediately from the fact that, for the linear
transformation L: ℂ^{n×m} → ℂ^{n×m} defined by L(X) = AX − XB, σ(L) =
{λ − μ | λ ∈ σ(A) and μ ∈ σ(B)}. [See Chapter 12 of Lancaster and Tis-
menetsky (1985), for example.] Here we give a direct proof based on the
Jordan decompositions of A and B.
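Equation (4.1.1) is the Sylvester equation, for which standard numerical solvers exist. A quick sketch with hypothetical data (SciPy's routine solves AX + XB = Q, so −B is passed):

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(4)
A = np.diag([1.0, 2.0])                   # σ(A) = {1, 2}
B = np.diag([3.0, 4.0, 5.0])              # σ(B) = {3, 4, 5}, disjoint from σ(A)
C = rng.standard_normal((2, 3))
X = solve_sylvester(A, -B, C)             # unique solution of AX - XB = C
assert np.allclose(A @ X - X @ B, C)
```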

Proof. Equation (4.1.1) may be regarded as a system of linear equa-
tions in the nm variables x_{ij} (i = 1, . . . , n; j = 1, . . . , m) that form the entries
in the matrix X. Thus it is sufficient to prove that the homogeneous equation

has only the trivial solution X = 0 if and only if σ(A) ∩ σ(B) = ∅.
Let J_A and J_B be the Jordan forms of A and B, respectively; so
A = S_A J_A S_A^{−1}, B = S_B J_B S_B^{−1} for some invertible matrices S_A and S_B. It
follows that X is a solution of (4.1.2) if and only if Z = S_A^{−1} X S_B is a
solution of

Thus we can restrict ourselves to equation (4.1.3). Let us write down J_A and
J_B explicitly:

where J_{A,i} (resp. J_{B,j}) is a Jordan block of size m_{A,i} (resp. m_{B,j}) with
eigenvalue λ_{A,i} (resp. λ_{B,j}). The matrix Z from (4.1.3) is decomposed into
blocks accordingly:

where Z_{ij} is of size m_{A,i} × m_{B,j}.
Suppose first that σ(A) ∩ σ(B) ≠ ∅. Without loss of generality we can
assume that λ_{A,1} = λ_{B,1}. Then we can construct a nonzero solution Z of
equation (4.1.3) as follows. In the representation (4.1.4) put
Z_{ij} = 0, except for the case i = j = 1; and let

(according as m_{A,1} ≥ m_{B,1} or m_{A,1} < m_{B,1}). Direct examination shows that
such a matrix Z satisfies (4.1.3).
Suppose now that σ(A) ∩ σ(B) = ∅. Let Z be given by (4.1.4) and
suppose that Z satisfies (4.1.3). We have to prove that Z = 0.
Equation (4.1.3) means that

Write

where H and G are the nilpotent matrices [i.e., σ(H) = σ(G) = {0}] having
1 on the first superdiagonal and zeros elsewhere. Rewrite equation (4.1.5) in
the form

Multiply the left-hand side by λ_{A,i} − λ_{B,j}, and in each term on the right-hand
side replace (λ_{A,i} − λ_{B,j}) Z_{ij} by Z_{ij} G − H Z_{ij}. We obtain

Repeating this process, we obtain for every p = 1, 2, . . .

Choose p large enough so that either H^q = 0 or G^{p−q} = 0 for every
q = 0, . . . , p. Then the right-hand side of equation (4.1.6) is zero, and since
λ_{A,i} ≠ λ_{B,j}, we find that Z_{ij} = 0. Thus Z = 0. □
^ ,- ^ A B r we find that
Lemma 4.1.3
If A and B are n × n and m × m matrices, respectively, with σ(A) ∩ σ(B) =
∅, then for every n × m matrix C the (m + n) × (m + n) matrices

are similar.

Proof. By Lemma 4.1.2, for every n × m matrix C there is a unique
n × m matrix X such that AX − XB = −C. With this X, one verifies that

As

the lemma follows. □
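The similarity is also easy to exhibit numerically; a sketch with hypothetical data, using the X supplied by Lemma 4.1.2 and S = [[I, X], [0, I]]:

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(5)
A = np.diag([1.0, 2.0])
B = np.diag([3.0, 4.0])
C = rng.standard_normal((2, 2))
X = solve_sylvester(A, -B, -C)            # AX - XB = -C
n, m = A.shape[0], B.shape[0]
S = np.block([[np.eye(n), X], [np.zeros((m, n)), np.eye(m)]])
T1 = np.block([[A, C], [np.zeros((m, n)), B]])
T2 = np.block([[A, np.zeros((n, m))], [np.zeros((m, n)), B]])
assert np.allclose(np.linalg.solve(S, T1 @ S), T2)   # S^{-1} T1 S = T2
```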

Proof of Theorem 4.1.1. For notational simplicity assume that

where J_11 (resp. J_21) are the Jordan blocks from J_1 (resp. J_2) with eigen-
values different from λ_0, and B_{ij} are the corresponding submatrices in J.
Applying Lemma 4.1.3 twice, we see that J is similar to

which after interchanging the second and third block rows and columns (this
is a similarity operation) becomes

It remains to apply Lemma 4.1.3 once more to prove that J is similar to

It is convenient to describe the partial multiplicities of a transformation
A: ℂ^n → ℂ^n at an eigenvalue λ_0 as a nonincreasing sequence of nonnegative
integers α_1(A; λ_0) ≥ α_2(A; λ_0) ≥ α_3(A; λ_0) ≥ ⋯, where the nonzero mem-
bers of this sequence are exactly the partial multiplicities of A at λ_0. In
particular, not more than n of the numbers α_i(A; λ_0) are different from
zero. Also, if λ_0 is not an eigenvalue of A, we define α_i(A; λ_0) = 0 for
i = 1, 2, . . . . Thus the nonnegative integers α_i(A; λ_0) are defined for all
λ_0 ∈ ℂ, and we have

The following result describes the connections between the partial multi-
plicities of a transformation and those of its extension.
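As an aside, the numbers α_i(A; λ_0) are computable from kernel dimensions: the number of Jordan blocks at λ_0 of size at least j is dim Ker(A − λ_0 I)^j − dim Ker(A − λ_0 I)^{j−1}, and the α_i form the conjugate partition. A sketch (hypothetical helper, tolerance-based numerical ranks):

```python
import numpy as np

def partial_multiplicities(A, lam0, tol=1e-9):
    n = A.shape[0]
    N = A - lam0 * np.eye(n)
    kdim, P = [0], np.eye(n)
    for _ in range(n):
        P = P @ N
        kdim.append(n - np.linalg.matrix_rank(P, tol=tol))
    counts = [kdim[j] - kdim[j - 1] for j in range(1, n + 1)]  # blocks of size >= j
    return [sum(1 for c in counts if c >= i) for i in range(1, counts[0] + 1)]

J = np.diag([1.0, 0.0], 1)                 # J_2(0) ⊕ J_1(0) as a 3 x 3 matrix
assert partial_multiplicities(J, 0.0) == [2, 1]
```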

Theorem 4.1.4
Let M ⊆ ℂ^n be a subspace and let A_0: M → M be a transformation. Then for
every extension A: ℂ^n → ℂ^n of A_0 we have

for every λ_0 ∈ ℂ. Conversely, let β_1 ≥ β_2 ≥ ⋯ be a nonincreasing sequence of
nonnegative integers such that

and

for a fixed complex number λ_0. Then there is an extension A of A_0 such that
α_j(A; λ_0) = β_j, j = 1, 2, . . . .
Proof. We prove (4.1.7) for an extension A of A_0. In view of Theorem
4.1.1, we may restrict ourselves to the case when σ(A) = {λ_0}. (Indeed,
without loss of generality it can be assumed that A_0 is in the Jordan form.
Furthermore, the transformation PA|_N: N → N, where N is a direct comple-
ment to M and P is the projector on N along M, may also be assumed to
have Jordan normal form.) There exists a chain of A-invariant subspaces

where dim M_i = m + i, i = 0, 1, . . . , n − m (so m = dim M). This can be
seen by considering the transformation Ā: ℂ^n/M → ℂ^n/M induced by A and
using the existence of a complete chain of Ā-invariant subspaces.
In view of the chain (4.1.10), and using induction on the index i of M_i, it
will suffice to prove inequalities (4.1.7) for the case dim M = n − 1. Writing
A_0 in a basis for M in which A_0 has a Jordan form, we can assume

where J = J_{k_1}(λ_0) ⊕ ⋯ ⊕ J_{k_p}(λ_0), k_1 ≥ ⋯ ≥ k_p, is the Jordan form of A_0
and B is an (n − 1)-dimensional vector.
Let j be the first index (1 ≤ j ≤ p) for which the (k_1 + k_2 + ⋯ + k_j)th
coordinate of B is nonzero (if such a j exists). Let S be the (n − 1) × (n − 1)
matrix

where Q_m is the k_m × k_j matrix of the form [0 I_{k_m}] and a_{j+1}, . . . , a_p are
complex numbers chosen so that the (k_1 + k_2 + ⋯ + k_m)th coordinates of
SB are zeros for m = j + 1, . . . , p. If all coordinates k_1, k_1 + k_2, . . . ,
k_1 + ⋯ + k_p of B are zeros, put S = I_{n−1}. It is easy to see that SJ = JS and S
is nonsingular. Moreover, the k_1th, (k_1 + k_2)th, . . . , (k_1 + k_2 + ⋯ + k_p)th
coordinates of SB are all zero except for at most one of them. Further, let X
be an (n − 1)-dimensional vector such that the nonzero coordinates of the
vector

can appear only in the places k_1, k_1 + k_2, . . . , k_1 + k_2 + ⋯ + k_p (this is
possible because

).
Now a computation shows that

As

is the inverse of

it follows that

and

have the same partial multiplicities. Now the partial multiplicities
are easy to discover: they are k_1, . . . , k_p, 1 if Y = 0, and
k_1, . . . , k_{j−1}, k_j + 1, k_{j+1}, . . . , k_p if Y ≠ 0 and the nonzero coordinate of Y
(by construction of Y there is exactly one) appears in the place k_1 + ⋯ + k_j.
So the inequalities (4.1.7) are satisfied. If B = 0, then (4.1.7) is obviously
satisfied.
Now let β_i be a sequence with the properties described in the theorem.
Let x_1, . . . , x_k be a basis in M in which A_0 has the Jordan form. We
assume also that the first p Jordan blocks in the Jordan form have
eigenvalue λ_0 and sizes α_1(A_0; λ_0), . . . , α_p(A_0; λ_0), respectively. (Here,
α_1(A_0; λ_0), . . . , α_p(A_0; λ_0) are all the nonzero integers in the sequence
{α_j(A_0; λ_0)}_{j=1}^∞.) So in the basis x_1, . . . , x_k we have

where λ_1, . . . , λ_u are different from λ_0, and α_j = α_j(A_0; λ_0). Now let
y_1, . . . , y_{n−k} be vectors in ℂ^n such that x_1, . . . , x_k, y_1, . . . , y_{n−k} is a basis
in ℂ^n. Put

where s = Σ_{i=1}^q β_i, r = Σ_{i=1}^q (β_i − α_i), and q is the number of positive β_i
values. Further, setting t = Σ_{i=1}^q α_i, put

Now let A: ℂ^n → ℂ^n be a transformation that is given in the basis z_1, . . . , z_n
by the matrix

where J is any (n − k − r) × (n − k − r) matrix in the Jordan form with the
property that λ_0 is not an eigenvalue of J. From the construction of A it is
clear that β_1, . . . , β_q are the partial multiplicities of A corresponding to λ_0
and that A is an extension of A_0. □

In particular, the theorem shows that if A is diagonable, then so is the
restriction of A to any A-invariant subspace.
For coinvariant subspaces the notions of coextension and corestriction
become natural. Let M ⊆ ℂ^n be a subspace, and let A_0: M → M be a linear
transformation. A transformation A: ℂ^n → ℂ^n is called a coextension of A_0 if
there exists an A-invariant direct complement N to M in ℂ^n such that
PA|_M = A_0, where P is the projector on M along N. Clearly, in this case M
is an A-coinvariant subspace. There is a connection between the partial
multiplicities of a transformation and those of a coextension of the kind
described in Theorem 4.1.4.

Theorem 4.1.5
Let M ⊆ ℂ^n be a subspace and A_0: M → M be a transformation. Then for
every coextension A of A_0 we have α_j(A; λ_0) ≥ α_j(A_0; λ_0), j = 1, 2, . . . for
every λ_0 ∈ ℂ. Conversely, let β_1 ≥ β_2 ≥ ⋯ be a nonincreasing sequence of
nonnegative integers such that equations (4.1.8) and (4.1.9) hold. Then there
is a coextension A of A_0 such that α_j(A; λ_0) = β_j, j = 1, 2, . . . .

The proof of Theorem 4.1.5 is similar to the proof of Theorem 4.1.4.
Given a transformation A_0: M → M, where M ⊆ ℂ^n, we say that a
transformation A: ℂ^n → ℂ^n is a dilation of A_0 if there exists an A-invariant
subspace N for which N ∩ M = {0}, M + N is A invariant as well, and
PA|_M = A_0, where P is some projector on M with N ⊆ Ker P. (The term
"semiextension" would be more logical in the context of our terminology;
however, "dilation" is widely used in the literature.) In this case M is an
A-semiinvariant subspace and A_0 is the reduction of A (again, the term
"semirestriction" would be consistent with our terminology, but "reduction"
is already widely used). Thus there is a subspace L of ℂ^n for which the
decomposition (3.3.1) holds, and this decomposition determines a triangular
representation such as (3.3.2) for A in which A_22 = A_0. A result similar to
Theorems 4.1.4 and 4.1.5 also holds for dilations, and it can be proved by
first applying one of these theorems and then applying the second. In
particular, if A is diagonable, so is any reduction of A.

4.2 COMPLETIONS FROM A PAIR OF INVARIANT AND
COINVARIANT SUBSPACES

Let A: M → M and B: N → N be transformations, where M and N are
subspaces in ℂ^n that are direct complements to each other. A transfor-
mation C: ℂ^n → ℂ^n is called a completion of A and B if M is C invariant and
C|_M = A, PC|_N = B, where P is the projector on N along M. So with respect
to the direct sum decomposition ℂ^n = M + N, C has the form

for some matrix D.


Let al > «2 > • • • (resp. j8, s: /32 > • • •) be a sequence of nonnegative
integers whose nonzero elements are exactly the partial multiplicities of A
(resp. B) corresponding to a fixed point A0 E (p. Assuming that C is a
completion of A and B, let y, > y2 > • • • be a sequence of nonnegative
integers such that the nonzero y. values are the partial multiplicities of C at
A 0 . In this section we study the connections between ait @f, and y,. In view
of Theorem 4.1.1, these connections describe the Jordan form of C in terms
of the Jordan forms of A and B.
Some such connections are easily seen. We have

for every A G (p. Now the algebraic multiplicity of an eigenvalue A0 of a


matrix X coincides with the multiplicity of A0 as a zero of the polynomial
det(A^ - A/). (When A0 is not an eigenvalue of X this statement is also true if
we accept the convention that, in this case, the algebraic multiplicity of A0 is
zero.) It follows from equation (4.2.2) that the algebraic multiplicity* of C
at A0 is equal to the sum of the algebraic multiplicities of A and B at A 0 . In
other words

Further, as C is an extension of A and a coextension of B, Theorems 4.1.4


and 4.1.5 imply that

The following inequality between {a,.}".,, {ty},".,, and {yj}^=l is deeper.

Proposition 4.2.1
Let C be a completion of A and B, with the partial multiplicities of A, B, and
C at a fixed λ_0 ∈ ℂ given by the nonincreasing sequences of nonnegative
integers {α_i}_{i=1}^∞, {β_i}_{i=1}^∞, and {γ_i}_{i=1}^∞, respectively. Then

*It is convenient here to talk about the "algebraic multiplicity of C at λ_0" rather than the
"algebraic multiplicity of λ_0" as an eigenvalue of C.

As usual in this book, the symbol Ω^# represents the number of different
elements in a finite set Ω.

Proof. First we prove the following inequalities:

Indeed, for every ε ≠ 0 we have [using formula (4.2.1)]

and thus

Fix some l, and let

So there exists an m × m nonsingular submatrix Q in (A − λ_0 I)^l ⊕
(B − λ_0 I)^l. Consider the m × m submatrix Q(ε) of

which is formed by the same rows and columns as Q itself. Now Q(ε) is as
close as we wish to Q provided ε is sufficiently close to 0. Take ε so small
that the matrix Q(ε) is also nonsingular. For such an ε

Comparing with (4.2.7), we obtain the desired inequality (4.2.6). Now use
Proposition 2.2.6 to obtain the inequalities (4.2.5). □
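A numerical sanity check of the rank inequality (4.2.6), with hypothetical data of the completion form (4.2.1):

```python
import numpy as np

rng = np.random.default_rng(6)
A = np.diag([0.0, 0.0, 1.0])
B = np.diag([0.0, 2.0])
D = rng.standard_normal((3, 2))
C = np.block([[A, D], [np.zeros((2, 3)), B]])    # a completion of A and B
lam0 = 0.0
for l in range(1, 4):
    rC = np.linalg.matrix_rank(np.linalg.matrix_power(C - lam0 * np.eye(5), l))
    rA = np.linalg.matrix_rank(np.linalg.matrix_power(A - lam0 * np.eye(3), l))
    rB = np.linalg.matrix_rank(np.linalg.matrix_power(B - lam0 * np.eye(2), l))
    assert rC >= rA + rB
```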

In connection with inequalities (4.2.5), note that

Indeed, as {k | γ_k ≥ j}^# = 0 for j > γ_1, and similarly for {α_k}_{k=1}^∞ and
{β_k}_{k=1}^∞, all the sums in equation (4.2.8) are finite, so (4.2.8) makes sense.
Further, for any nonincreasing sequence of nonnegative integers {δ_i}_{i=1}^∞ with
finite sum Σ_{i=1}^∞ δ_i we have

The easiest way to verify (4.2.9) is by representing each nonzero δ_i as the
rectangle with height δ_i and width 1 and putting these rectangles one next to
another. The result is a ladderlike figure Φ. For instance, if δ_1 = 5, δ_2 = δ_3 =
4, δ_4 = 1, δ_j = 0 for j > 4, then Φ is the corresponding staircase figure.
Obviously, the area of Φ is just the left-hand side of equation (4.2.9). On
the other hand, the right-hand side of (4.2.9) is also the area of Φ calculated
by the rows of Φ (indeed, {k | δ_k ≥ i}^# is the area of the ith row in Φ
counting from the bottom); hence equality holds in (4.2.9). Now appeal to
(4.2.3), and (4.2.8) follows.
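The identity (4.2.9) is the familiar fact that a partition and its conjugate have equal sums; a one-line check of the δ = (5, 4, 4, 1) example above:

```python
delta = [5, 4, 4, 1]
lhs = sum(delta)                                 # left-hand side of (4.2.9)
rhs = sum(sum(1 for d in delta if d >= i)        # row areas {k | δ_k >= i}^#
          for i in range(1, max(delta) + 1))
assert lhs == rhs == 14
```

We need a completely different line of argument to prove the following proposition.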

Proposition 4.2.2
With {α_i}_{i=1}^∞, {β_i}_{i=1}^∞, and {γ_i}_{i=1}^∞ as in Proposition 4.2.1, we have

Proof. Assuming that C is given by (4.2.1), one easily obtains

Using Theorem A.4.3 of the appendix, pick a p × p submatrix C_0(λ) in
C − λI such that λ_0 is a zero of det C_0(λ) of multiplicity γ_n + ⋯ + γ_{n−p+1}

(here n × n is the size of C). The integer p is assumed to be greater than
max(n_A, n_B), where n_A × n_A is the size of A and n_B × n_B is the size of B (so
n = n_A + n_B). By the Binet-Cauchy formula (Theorem A.2.1 of the appen-
dix) we have

where the factors are p × p submatrices of

and , respectively, and the summation is taken over certain
triples i, j, k. Note that det B_i(λ) = 0 unless B_i(λ) is of the form I_s ⊕ B̃_i(λ),
where B̃_i(λ) is a (p − s) × (p − s) submatrix of B − λI (here s is an integer
that may depend on i and for which 0 ≤ s ≤ n_A). Similarly, det A_k(λ) = 0
unless A_k(λ) is of the form I_t ⊕ Ã_k(λ), where Ã_k(λ) is a (p − t) × (p − t)
submatrix of A − λI (0 ≤ t ≤ n_B). Taking these observations into account,
rewrite equation (4.2.11) as follows:

Now the size of B̃_i(λ) is at least (p − n_A) × (p − n_A), so by the same
theorem, Theorem A.4.3, the multiplicity of λ_0 as a zero of det B̃_i(λ) is at
least

(here we use n_B + n_A = n and β_j = 0 for j > n_B). Similarly, the multiplicity
of λ_0 as a zero of det Ã_k(λ) is at least α_n + α_{n−1} + ⋯ + α_{n−p+1}. We find
that the multiplicity Σ_{j=n−p+1}^n γ_j of λ_0 as a zero of det C_0(λ) is at least
Σ_{j=n−p+1}^n (α_j + β_j). It follows from equation (4.2.3) that

If it happens that p < n_A, then the inequality

and hence also the relation (4.2.12), follows from (4.2.4) because in this
case β_j = 0 for j > n − p + 1. Similarly, (4.2.12) holds for p < n_B. We have
proved (4.2.10) for m = 1, . . . , n. For m ≥ n the inequality (4.2.10) coin-
cides with (4.2.3), so the proof of (4.2.10) is complete. □

We have proved various inequalities and equalities relating the sequences
{α_i}_{i=1}^∞, {β_i}_{i=1}^∞, and {γ_i}_{i=1}^∞ [relations (4.2.3), (4.2.5), (4.2.8), (4.2.10)].
These relations are by no means the only connections between these
sequences. More specifically, there exist nonincreasing sequences of non-
negative integers {α_i}_{i=1}^∞, {β_i}_{i=1}^∞, and {γ_i}_{i=1}^∞, only a finite number of them
nonzero, that satisfy equations (4.2.3), (4.2.5), (4.2.8), and (4.2.10), but for
which there is no completion C of A and B with the property that for some
λ_0 ∈ ℂ the sequences {α_i}_{i=1}^∞, {β_i}_{i=1}^∞, and {γ_i}_{i=1}^∞ give the partial multi-
plicities of A, B, and C, respectively, corresponding to λ_0. In the next
section we see more general inequalities, but even they do not completely
describe the connections between the partial multiplicities of completions of
A and B and the partial multiplicities of A and B. The problem of describing
all such connections is open.

4.3 THE SIGAL INEQUALITIES

The main result in this section is the following generalization of Proposition
4.2.2.

Theorem 4.3.1
Let {α_i}_{i=1}^∞, {β_i}_{i=1}^∞, and {γ_i}_{i=1}^∞ be as in Proposition 4.2.1. Then for every
sequence r_1 < r_2 < ⋯ < r_m of positive integers we have

and

Proposition 4.2.2 is obtained from this theorem by putting r_j = j, j =
1, . . . , m. It will be convenient to prove a lemma (which is actually a
particular case of Theorem 4.3.1) before proving the theorem itself.

Lemma 4.3.2
Let

where B is (n − k) × (n − k) with σ(B) = {0}. If {γ_i}_{i=1}^∞ and {β_i}_{i=1}^∞ are the
nonincreasing sequences of partial multiplicities of C and B, respectively, then
γ_i = β_i + δ_i, i = 1, 2, . . . , where each δ_i is zero or one, and Σ_{i=1}^∞ δ_i = k.

Proof. Let x_1, . . . , x_l (l ≥ 2) be a Jordan chain for C:

Write x_i = (y_i, z_i), where y_i is a k-dimensional vector and z_i is (n − k)-
dimensional. Equalities (4.3.3) then imply y_1 = ⋯ = y_{l−1} = 0 and B z_{i+1} = z_i,
i = 1, . . . , l − 2, z_1 ≠ 0. In other words, z_1, . . . , z_{l−1} is a Jordan chain for
B. Moreover, if X y_l = 0, then z_1, . . . , z_l is also a Jordan chain for B.
Now let

be a basis in ℂ^n consisting of Jordan chains for C (so q is the maximal
index such that γ_q > 0). Denoting by p the maximal index such that
γ_p ≥ 2, let Z be the subspace spanned by the Jordan chains
z_{11}, . . . , z_{1 l_1}; . . . ; z_{p1}, . . . , z_{p l_p} for B constructed as in the preceding
paragraph from the Jordan chains x_{j1}, . . . , x_{j γ_j}, j = 1, . . . , p of C. Here l_j is
either γ_j − 1 or γ_j. The order of Jordan chains in equation (4.3.4) of the
same length can be adjusted so that l_1 ≥ ⋯ ≥ l_p. Since Z is B invariant,
Theorem 4.1.4 gives β_i ≥ l_i, i = 1, . . . , p. On the other hand, by Theorem
4.1.5, γ_i ≥ β_i, i = 1, 2, . . . . So we obtain γ_i − β_i = δ_i, i = 1, 2, . . . , where
each δ_i is either zero or one. The equality Σ_{i=1}^∞ δ_i = k follows from the fact
that the sum of the partial multiplicities of C (resp. of B) is n (resp.
n − k). □

Proof of Theorem 4.3.1. Let ℂ^n = M + N, and let A: M → M, B: N → N
be transformations such that {α_i}_{i=1}^∞, {β_i}_{i=1}^∞, and {γ_i}_{i=1}^∞ are the nonin-
creasing sequences of nonnegative integers representing the partial multip-
licities of A, B, and

respectively, corresponding to the eigenvalue λ_0 (here D is some transfor-
mation from N into M). Applying a similarity transformation, if necessary,
we can assume that M and N are coordinate subspaces.
Without loss of generality (Theorem 4.1.1) we can assume also that
λ_0 = 0 and σ(A) = σ(B) = {0} (then also σ(C) = {0}). We can assume also
that A is in the Jordan form:

We use induction on the size α_1 of the biggest Jordan block in A. If

α_1 = 1, then A = 0 and by Lemma 4.3.2 (applied to B* and C* in place of B
and C, respectively) we have

Assume that inequality (4.3.2) is proved for all A with the property that the
size of the biggest Jordan block is less than α_1. Using a matrix similar to A
in place of A, we can assume that

where A_2 is a Jordan matrix with partial multiplicities {α_i'}_{i=1}^∞ satisfying

With the corresponding partition and using the induction
hypothesis, the partial multiplicities {γ_i'}_{i=1}^∞ of the matrix C'

satisfy the inequalities:

But in view of Lemma 4.3.2 (applied with C'* and C* in place of B and C,
respectively),

Now combine relations (4.3.5), (4.3.6), and (4.3.7) to obtain the inequality
(4.3.2). The inequalities (4.3.1) are obtained from (4.3.2) applied to the
transformation C* written as the 2 × 2 block matrix with respect to the
direct sum decomposition ℂ^n = N + M. □

Inequalities (4.3.1) and (4.3.2) admit the following geometric interpre-
tation. Let q be any index such that γ_i = 0 for i > q (e.g., q = Σ_{i=1}^∞ α_i +
Σ_{i=1}^∞ β_i). Denote by K_1 ⊆ ℝ^q the convex hull of the points

where π is any permutation of {1, 2, . . . , q}; that is

Also let

Then inequalities (4.3.1) and (4.3.2) imply

Actually, the inclusion (4.3.8) in turn implies (4.3.1) and (4.3.2). The proof
of these statements would take us too far afield; we only mention that it is
essentially the same as the proof of Theorem 10 of Lidskii (1966). It is
interesting that the geometric interpretation of inequalities (4.3.1) and
(4.3.2) is completely analogous to the geometric interpretation of the
inequalities for the eigenvalues of the sum of two hermitian matrices in
terms of the eigenvalues of each hermitian matrix [see Lidskii (1966)].
Inequalities (4.3.1) and (4.3.2) can be generalized. In fact, for any
sequence r_1 < r_2 < ⋯ < r_m of positive integers and any nonnegative integer
k < r_1 the following inequalities hold [see Thijsse (1984)]:

Theorem 4.3.1 is a particular case of (4.3.9) with k = 0.
We have seen that, given the sequences {α_i}_{i=1}^∞ and {β_i}_{i=1}^∞ of partial
multiplicities of A and B, respectively, corresponding to λ_0, the sequence
{γ_i}_{i=1}^∞ of partial multiplicities corresponding to λ_0 of any completion C of A
and B satisfies the properties (4.2.3), (4.2.4), (4.2.5), (4.3.1), and
(4.3.2); moreover, (4.3.9) is satisfied as well. However, the following
example shows that, in general, these properties do not characterize the
partial multiplicities of completions.

EXAMPLE 4.3.1 . Let al = a2 = 3, a, = 0 for i > 2; fa = /32 = 5; 03 = 4; ft = 0


for / > 3 ; y i = 7 , y2-6, •y3 = 4, y4 = 3, yi , = 0 for i>4. One verifies that
relations (4.2.3), (4.2.4), (4.2.5), and (4.3.9) hold [the verification of
(4.3.9) is lengthy because of the many possibilities involved]. However,
Theorem 7 of Rodman and Schaps (1979) implies that there is no com-
pletion C of A and B such that the partial multiplicities of A, B, and C
corresponding to some A0 are given by {a,}^, {ft}"!,, and
respectively.

4.4 SPECIAL CASE OF COMPLETIONS

In this section we describe all the possible sequences of partial multiplicities


corresponding to A0 for completions of A and B in case at least one of A and
B has only one partial multiplicity at A0. First, we establish some general
Special Case of Completions 137

observations on partial multiplicities of completions that are used in this


description.
It is convenient to introduce the set ft of all nondecreasing sequences of
nonnegative integers such that, in each sequence, only a finite number of
integers is different from zero. For a = (aj, a 2 ,. . .), /3 = (/?,, /3 2 ,. . .) E ft
denote by F(a, )3) the set of all sequences y = (yl, y2,. . .) E ft with the
following properties: (a) there is a transformation C: <£""—»• £" (for some n)
and a C-invariant subspace M such that the restriction C\M has partial
multiplicities a,, a 2 ,. . . corresponding to a certain eigenvalue A 0 ; (b) the
compression of C to a coinvariant subspace that is a complement to M has
partial multiplicities j8 t , )3 2 ,. . . corresponding to A 0 , and (c) C itself has partial
multiplicities yl,y2, • • - corresponding to the same A 0 .

Proposition 4.4.1
Let a — (otj, a 2 , . . .) E ft, /8 = (/3,, /32, . . .) Eft, and put m — E*=1 a,, n =
£°°=1 )3,. TTzen fl sequence y = (7,, y2, . . .) E ft belongs to F(a, /3) if and only
if there is an m x n matrix A such that the partial multiplicities of the matrix

is the largest
O
index such that a_/I j ^0 [resp.
I /^
B n 7^0]J are y,,
"^2
y,, . . . .
/1 " » / '

Proof. As the part "if" follows from the definition of F(a,/8), we


have only to prove the "only if part. Assume y EF(a, /3). By definition,
there is a matrix C partitioned as follows:

where for some eigenvalue A0 of C the partial multiplicities of C (resp. C u ,


C22) at A0 are given by y (resp. a, /3). Replacing C by C- A07, we can
assume A0 = 0. Furthermore, we can assume that Cn and C22 are matrices in
the Jordan form. It remains to appeal to Theorem 4.4'.l.i
It follows immediately from Proposition 4.4.1 that F(a, j8) = F(j3, a).
Indeed, in the notation of Proposition 4.4.1 we have

so the matrices
138 Jordan Forms for Extensions and Completions

have the same Jordan form. But then (in view of Corollary 2.2.3) this is also
true for the matrices

As / 2 ar|d ^* are similar to J2 and /,, respectively, the conclusion F(a, /3) =
F(j3, a) follows.
In view of Proposition 4.4.1, in order to determine F(a, /3), we have to
find the partial multiplicities -y, > -y2 > • • • (or, what is the same, the Jordan
form) of matrices J of type (4.4.1). As

(by definition, /° = /), we focus on a formula for computation of the ranks


of/ 1 , / = 1,2, ____
Divide the matrix A into blocks Atj, i = 1, . . . , n, ; / = 1, . . . , «2 accord-
ing to the sizes of Jordan blocks in /, and /2 (so the size of Atj is a, x /3;).
For fixed i and j, write Atj = T,%=l E^, "„£,,, where Epq is an a, x p.
matrix with 1 in the intersection of the (a, - p + l)th row and gth column
and zero in all other places. Let

(we put upq = 0 if p > ofj or q > /3y). Define

where the sum is over all the pairs p,q such that p <min(A;, a,), q <
min(/:, jSy), and p + q> k. For example, BJ^ has u n in the lower left corner
and zeros elsewhere, B|,112) has in the lower lefgt corner and

zero elsewhere, B^3) has

in the lower left corner and zeros elsewhere (provided «., /3 ; >3). Let
be the mx n matrix with blocks B^f\i = 1, . . . , n,; / = 1, . . . , n 2 ).

Lemma 4.4.2
In the preceding notation we have

rank /* = rank /* -t- rank /* + rank B(k) , k = 1, 2, . . .


Special Case of Completions 139

Proof. Let A(k) be defined by An easy indctiom


argument on k shows that

and hence

where Eab = 0 whenever at least one of the inequalities 1 < a ^ a,; 1 < ft ^ /3y
is violated, and Ma6 = 0 for a < 1 or ft < 1.
It follows that
A = fl
ij !> + (terms with Ep.q. such that p' > A: or q' > A:)

By column operations from /* and row operations from Jk2, we can eliminate
all terms of A(k) except those in the block B(k). Permuting the rows and
columns of the resulting matrix, we obtain the following matrix that has the
same rank as /*:

where ak = rank Jk and bk = rank /*• Lemma 4.4.2 follows.

It is an immediate consequence of the lemma that the sequence (y;}7=i


depends only on the diagonal sums d^, for /<min(a / , ft). Thus we can
replace each Afj by a matrix in which only the first column can contain
nonzero entries. Alternatively, we can presume that only the bottom row of
A^ can contain nonzero entries.
For illustration of Lemma 4.4.2, consider the following example.

EXAMPLE 4.4.1. Let a =(a,,0, 0,. . .), p = (ft, 0, 0,. . .), where a,, fa >
0. We suppose for definiteness that a, > ft. If d ( l ) 7^0, it is easily seen that
t40 jordan Forms for Extensions and Completions

rank

In general, we have

where t0 is the smallest t such that d*'/ ^ 0, or /0 = ftl + 1 if all rf^'/ are zeros.
It is now clear that y = (y,, y 2 ,. . .)eF(a, 0) is determined completely by
the value of t0. Further, using formula (4.4.3) and Lemma 4.4.2, we
compute

Computation shows that

F(a, 0) = {(a, + jS,, 0), (a, + j8, - 1,1), . . . , K + 1,0,- 1), (a,, 0,)}

(In every y sequence we write only the first members; the others are zeros.)
The y sequence (a{ + 0t — p, p) corresponds to the value f 0 = p + 1.
The possibility of y = (QJ -I- 0j — p, p), p = 0,. . . , /3j, is realized for the
matrix

where Ap is an al x 0, matrix with all but the (a, — p, l)th entry equal to
zero, and this exceptional entry is equal to 1 (for p = 0j we put Ap = 0). It is
not difficult to construct two independent Jordan chains of A/ — J(p) of
lengths aj + 0, — p and p. Namely, the Jordan chain of length a, + 0, — p is
e
a +p ' ea +p - i > • • • » e a, + i> ^a^p' e a,-p-i> • • • , ^ i • The Jordan chain of
length p is ea] - eai+p, e Q ] _, - e a i + p _ i , . . . , « ffll - P + i ~ e«1+ i -

Using Lemma 4.4.2, we shall now give a complete description of the set
F(a, 0) in the case that a = (alt a2, . . . , a n ,0,. . .) and 0 = (0^0,0,. . .)
where an and 0j are positive.
Introduce the set O0 of all w-tuples (co}, a>2,. . . , eon), where o>; are
integers such that 1 < cu; < A y + 1 and A y = min(a y , 0J. For a given sequence
o> = (o)l5 <u2, . . . , & ) „ ) E ft0 and / = 1, 2,. . . , define integers cj"* as follows:
Special Case of Completions 141

where /ay = max(a y , /3,). Now let y = (y,, y2, • • •) be the nonincreasing
sequence of nonnegative integers defined by the equalities

Thus for every &> G fl0 we have constructed a sequence y. Let us denote this
sequence by F(o>).

Theorem 4.4.3
For every eoGfl ( ) the sequence F((o) belongs to Y(a, B). Conversely, if
y G F(a, j3), there exists a) G O0 s«c/z tfza/ y = F(w).

proof. Recall that

Inview of lemma 4.42, we find that

It remains to check, therefore, that for every w Gfl it is possible to pick


the complex numbers dfi ( ! < / < « , / = 1 , 2 , . . . ) in such a way that
fk - rank B(k) for A: = 1, 2, . . . , where fk is defined by equation (4.4.4) and
B(k) is defined as in Lemma 4.4.2; and conversely, for every choice of d^ it
is possible to find an o> G H0 such that fk — rank B(k\
Note that B(k) depends on d(^ with t < A ; , so we restrict ourselves only to
these values of t.
Given w — (otl, . . . , '<u n )Gft 0 , choose d^ in such a way that at- is the
smallest index / with the property that d(^ ¥^ 0 [if a))• = A y + 1, put d^ — 0 for
all t]. It is easy to see that cff is just the rank of the matrix B^ [defined by
(4.4.2)]. Observe that after crossing out some zero columns and rows, if
necessary, B(k^ is an upper triangular Toeplitz matrix with min(Ac, /3,)
columns. Thus the rank of
142 Jordan Forms for Extensions and Completions

is just the maximum of the ranks of B\\\ B^, . . . , flj,*/, that is, fk.
Conversely, if d^ are given, define o>; as the minimal /(I < t < A y ) such
that rfj'/ ^ 0; and if d^ = 0 for every /, 1 < t < A y , put wy = Ay- + 1 .

4.5 EXERCISES

4.1 Supply a proof of Theorem 4.1.5.


4.2 State and prove a result for dilations analogous to Theorems 4.1.4 and
4.1.5.
4.3 Prove that the maximal dimension of an irreducible /1-invariant sub-
space coincides with the maximal dimension of a Jordan block in the
Jordan form of A.
4.4 Find all possibilities for the partial multiplicities of matrices of type

where X is any n x m matrix.


4.5 What is the answer to the preceding exercise under the restriction that
rank X < k, where A: is a fixed positive integer?
4.6 Find all possibilities for the partial multiplicities of matrices of the
following types:

where X is any n x m matrix of rank 1. (Hint: Prove that there


exists an n x m matrix X0 with exactly one nonzero entry such
that

are similar.)

(c) What happens if we allow matrices X of rank 2?


Exercises 143

4.7 Find all possibilities for partial multiplicities of matrices of type

where X is any n x m matrix.


4.8 Let

be circulant matrices. Find all possibilities for the partial multiplicities


of matrices of type

where X is an n x n matrix.
Chapter Five

Applications to
Matrix Polynomials

Let A0, A j , . . . , A,_)be complex n x n matrices. We call the matrix-valued


function L(A) = I\' + ZJIg Aj\' a monic matrix polynomial of degree /. It
will be seen that there are In x In matrices C such that

are equivalent. (See the appendix for the notion of equivalence.) In this case
C is said to be a linearization of L(A). The invariant, coinvariant, and
semiinvariant subspaces for C play a special role in the study of the matrix
polynomial L(A). For example, certain invariant subspaces of C are related
to factorizations of L(A). More precisely, certain invariant subspaces deter-
mine monic right divisors of L(A), certain coinvariant subspaces determine
monic left divisors, and certain semiinvariant subspaces determine three
monic factors of L(A). In this chapter we explore these and similar
connections and study the behavior of solutions of differential and differ-
ence equations with constant coefficients.

5.1 LINEARIZATIONS,s STANDARD TRIPLES, AND


REPRESENTATIONS OF MONIC MATRIX POLYNOMIALS

In this section we introduce the main tools required for the study of monic
matrix polynomials. These tools are freely used in subsequent sections.
Let L(A) = /A' + EJIo Af\' be a monic matrix polynomial of degree /,
where the A- are n x n matrices with complex entries. Note that det L(A) is
a polynomial of degree nl. A linear matrix polynomial /A — A of size
(n + p) x (n + p) is called a linearization of L( A) if

144
Monic Matrix Polynomials 145

where E( A) and F( A) are (n + p) x (n + p) matrix polynomials with con-


stant nonzero determinants. Admitting a small abuse of language, we also
call matrix A from equation (5.1.1) a linearization of L(A). Comparing
determinants on both sides of (5.1.1), we conclude that det(/A - A) is a
polynomial of degree nl, where / is the degree of L(A). So the size of a
linearization A of L(A) is necessarily nl.
As an illustration of the notion of linearization, consider the lineariz-
ations of a scalar polynomial (n = 1). Let L(A) = flf =1 (A — A,)"1 be a scalar
polynomial having different zeros A , , . . . , Afc with multiplicities a 1 5 . . . , ak,
respectively. To construct a linearization of L(A), let/, (i = 1,. . . , k) be the
Jordan block of size a, with eigenvalue A ( , and consider the linear polyno-
mial A/ - / of size E*=1 a^ where J = diag[/,]f=1. Then / is a linearization of
L9A). Indeed, /A - J and have the same elementary divisors;
so using Theorem A.3.1, we find that J is a linearization of L(A).
The following theorem describes a linearization of a monic matrix
polynomial directly in terms of the coefficients of the polynomial.

Theorem 5.1.1
For a monic matrix polynomial definc
the nl x nl matrix

Then C\ is a linearization of L(A).

Proof. Define nl x n/ matrix polynomials £(A) and F(A) as follows:


146 Applicaions toMatrix Polynomials

where# 0 (A) = / a n d B r + l ( A ) = AB r (A) + A,_r_l for r = 0, 1, . . . , / - 2. It is


immediately seen that del F( A) = 1 and det E( A) = ±1. Direct multiplication
on both sides shows that

and Theorem 5.1.1 follows. D

The matrix C, from Theorem 5.1.1 will be called the (first) companion
matrix of L(A), and will play an important role in the sequel. From the
definition of C, it is clear that

In particular, the eigenvalues of L( A), that is, zeros of the scalar polynomial
det L(A), and the eigenvalues of /A - C, are the same. In fact, we can say
more: since Cl is a linearization of L(A), it follows that the elementary
divisors (and thus also the partial multiplicities of every eigenvalue) of
/A — Cj and L(A) are the same.
Now we prove an important result connecting the rational matrix function
L ( A ) ~ l with the resolvent function for the linearization C,.
Proposition 5.1.2
For every A £ <p that is not an eigenvalue of L(A), the following equality
holds:

where

is an n x nl matrix and

is an n x nl matrix.
Monic Matrix Polynomials 147

Proof. Consider the equality (5.1.2) used in the proof of Theorem


5.1.1. We have

It is easy to see that the first n columns of the matrix [£(A)] ' have the form
(5.1.4). Now, multiplying equation (5.1.5) on the left by P, and on the
right by P] and using the relation

We obtain the desired formula (5.1.3).

Formula (5.1.3) is referred to as a resolvent form of the monic matrix


polynomial L(A). The following result follows directly from the definition of
a linearization and Theorem A. 4.1.

Proposition 5.1.3
Any two linearizations of a monic matrix polynomial L(A) are similar.
Conversely, if a matrix T is a linearization of L(\) and matrix S is similar to
T, then S is also a linearization of L(\).

This proposition and the resolvent form (5.1.3) suggest the following
important definition: a triple of matrices (X, T, Y), where T is nl x «/, X is
n x «/, and Y is nl x n, is called a standard triple of L(A) if

For example, Proposition 5.1.2 shows that ( P } , C}, /?,) is a standard triple
of L(A).
It is evident from the definition that, if (X, T, Y) is a standard triple for
L(A), then so is any other triple (X, f, Y) that is similar to (X, T, Y), that
is, such that

for some nonsingular matrix 5. As we see in Theorem 5.1.5, this is the only
freedom in the choice of standard triples.
We start with some useful properties of standard triples. Here and in the
sequel we adopt the notation col[Z(]f=0 for the column matrix
148 Applications to Matrix Polynomials

Proposition 5.1.4
If (X, T, Y) is a standard triple of a monic n x n matrix polynomial
/ then the nl x nl matrices

are nonsingular. Further, the equalities

and

hold.

Proof. We have

and by Proposition 2.10.1,

where F is a circle with centre 0 and sufficiently large radius so that cr(7)
and the eigenvalues of L( A) are inside T. On the other hand, since L( A) is a
monic polynomial of degree /, the matrix function L(A) = A~'L(A) is
analytic and invertible in a neighbourhood of infinity and takes the value / at
infinity. In fact, L(A) is analytic outside and on F. Hence

and representing L(A)" 1 as a power series 7 + E£ =1 A ~ k L k , we see that

Combining this with (5.1.8), we have


Monic Matrix Polynomials 149

As the right-hand side in equation (5.1.9) is nonsingular, the nl x nl matrices


col[AT]!:J and [7 TF • • • r'^y] are both nonsingular.
Now use equation (5.1.8) again and we find that, for / = 0, 1, . . . , / — 1,

It follows that

and since the second factor is nonsingular, formula (5.1.6) follows.


Similarly, starting with the equality

formula (5.1.7) can be verified.

We are now ready to state and prove the basic result that the standard
triple for a monic matrix polynomial is essentially unique (up to similarity).

Theorem 5.1.5
Let (X1, Tl, y t ) and (X2, T2, Y2) be two standard triples of the monic matrix
polynomial L( A) of degree I. Then there exists a unique nonsingular matrix S
such that

The matrix S is given by the formula

where the invertibility of the matrices involved is ensured by Proposition


150 Applications to Matrix Polynomials

5.1.4. In particular, if (X, T, Y) is a standard triple of L(A), then T is a


linearization of L(A).

Proof. Assume we have already found a nonsingular S such that


(5.1.10) holds. Then

and

Thus formulas (5.1.11) hold and consequently 5 is unique.


Now we prove the existence of an 5 such that (5.1.10) holds. Without loss
of generality, and taking advantage of Proposition 5.1.2, we can assume that
X2 = P,, T2 = C}, y2 = /?,. Using (5.1.6) [with (X, T, Y) replaced by
(A",, T,, y,)], the equality

where C\ is the companion matrix of L(A), is easily verified. Also, (5.1.9)


implies

where 8tj is the Kronecker index (8;/ = 0 if 17^;; 5(> = 1 if i =/). Obviously

and equations (5.1.10) hold with S = col[A'17",]!:,'.


Finally, if (A', T, Y) is a standard triple of L(A), then, by the part of
Theorem 5.1.4 already proved, T is similar to the companion matrix C, of
L(A), and thus T is also a linearization of L(A).

Proposition 5.1.2 gives an example of a standard triple based on the


companion matrix of L(A). Another useful example of a standard triple is

where

and is called the second companion matrix of L(\). Indeed, if we define


151

then we have

Thus the triple (5.1.10) is similar to the standard triple given in Proposition
5.1.2. The notion of a standard triple is the main tool in the following
representation theorem.

Theorem 5.1.6
Let be a monic matrix polynomial of degree I with
standard triple (X, 7", Y). Then L(\) admits the following representations:
(a) Right canonical form :

where V- are nl x n matrices such that

(b) Left canonical form:

where W are n x nl matrices such that

Note that only X and T appear in the right canonical form of L(A),
whereas only T and y appear in the left canonical form.

Proof. Observe that the forms (5.1.13) and (5.1.14) are independent of
the choice of the standard triple (X, T, Y). Let us check this for (5.1.13),
for example. We have to prove that if (X, T, Y) and (A", T', Y') are
standard triples of L(A), then

where
152

But these standard triples are similar:

Therefore

and (5.1.15) follows.


Thus it suffices to check equation (5.1.13) only for the special standard
triple

and for checking (5.1.14), we choose the standard triple defined by (5.1.12).
To prove (5.1.13), observe that

and

so

and (5.1.13) becomes evident. To prove (5.1.14), note that by direct


computation one easily checks that for the standard triple (5.1.12)

and

So

Thus
Multiplication of Monic Matrix Polynomials 153

and

So equations (5.1.14) follows.

5.2 MULTIPLICATION OF MONIC MATRIXx pPOLYNOMIALS


AND PARTIAL MULTIPLICITIES OF A PRODUCT

In this section we describe multiplication of monic matrix polynomials in


terms of their standard triples. First we compute the inverse L~l(\) of the
product L(A) = L 2 (A)Lj(A) of two monic matrix polynomials /^(A) and
L 2 (A).

Theorem 5.2.1
Let L,( A) be a matrix polynomial with standard triple (Xt, T,, Y^for i = 1,2,
and let L(A) = L^AJL^A). Then

whrer

Proof. It is easily verified that

The product on the right of equation (5.2.1) is then found to be

But, using the definition of standard triples, this is just ), an


the theorem follows immediately.

Corollary 5.2.2
If L.( A) are monic matrix polynomials with standard triples {X^ Tt, Y,) for
i = 1,2, then has a standard triple (X, T, Y) with the
representations
154 Applications to Matrix Polynomials

Proof. Combine Theorem 5.1.5 with Theorem 5.2.1.

Corollary 5.2.2 allows us to describe the partial multiplicities of a produc


of monic matrix polynomials. We first give some necessary definitions. For a
monic matrix polynomial L(A) and its eigenvalue A0 [i.e., det L( A 0 ) = 0], let
«j > a2 > • • • > ar be the degrees of the elementary divisors of L( A) corre-
sponding to A 0 . The integers a, are called the partial multiplicities of L(A)
corresponding to A(). It is convenient to augment the a, values by zeros and
call the sequence a — (an «2, . . . , ar, 0, . . .) the sequence of partial multi-
plicities of L( A) at A 0 . Thus a G ft (see Section 4.4 for the definition of ft).
Also, we shall say formally that the partial multiplicities of L( A) correspond
ing to a complex number that is not an eigenvalue of L(A) are all zeros.
Recall also the definition of the set F(a, /3) given in Section 4.4.

Theorem 5.2.3
Let Lj(A) and L2(\) be n x n monic matrix polynomials. Let a, /3 and y be
the sequences of partial multiplicities of L,(A), L2(\), and L 2 (A)L,(A),
respectively, at A 0 . Then y E F(a, /3). Conversely, if y G F(a, /3), then for n
sufficiently large there exist n x n monic matrix polynomials L,(A) and
L2( A), such that the sequence of their partial multiplicities at A0 are a and j8,
respectively, and the sequence of partial multiplicities of L 2 (A)Lj(A) is y.

Proof. Let ( X t , Tt, Yt} be a standard triple for L,(A) and i = 1,2. By
the multiplication formula (Corollary 5.2.2), the matrix

is a linearization of L 2 (A)Lj(A). From the properties of a linearization it


follows that y is also the sequence of partial multiplicities of T at A 0 . Now
from the structure of T it is clear that y G F(a, /3), and the first part of the
theorem follows.
To prove the second part of Theorem 5.2.3, we first prove the following
assertion: let A be an r t x r2 matrix. Then for n sufficiently large there exist
an r, x n matrix Y and an n x r2 matrix X such that YX - A, the rows of Y
are linearly independent, and the columns of X are linearly independent.
Indeed, multiplying A by invertible matrices from the left and the right (if
necessary), we can suppose that
Multiplication of Monic Matrix Polynomials 155

where / is the unit r x r matrix (for some r < min(r,, r 2 )). Then we can take

where Y} is an (r t - r) x r, matrix with linearly independent rows and


Xl is an r2 x (r2 - r) matrix with linearly independent columns. Then n =
r + r, + r 2 , of course.
Now let y G F(a, /3), so that y is the sequence of partial multiplicities of

for some 7",, T2, A, and the partial multiplicities of 7, (resp. 72) corre-
sponding to A0 are given by the sequence a (resp. /3). Applying a similarity
to 70, if necessary, we can assume that 7, and 72 are in Jordan form.
Further, in view of Theorem 4.1.1 we can assume that o-(7,) = o-(72) =
{A,,}-
According to the assertion proved in the preceding paragraph, for n
sufficiently large there exist matrices X0 and yo of sizes n x r2 and r, x n,
respectively (where r, = EJL, a;, r2 = EJL, )3y) such that K^, = /I, the rows
of yo are linearly independent, and so are the columns of X(). Choose an
n x (n - r 2 ) matrix Xt such that the matrix [A r () A r ,] (of size n x n) is
invertible, and put

where z is some complex number different from A 0 . Similarly, choose an


matrux Y1 such that is nonsingular, and put

As 72 © z/ (resp. 7, © z/) is a linearization of L 2 ( A) [resp. of L,( A)], it


follows that the partial multiplicities of L2(\) [resp. of L,(A)] corresponding
to A() are given by the sequence j8 (resp. a). Further

are the standard triples for L 2 (A) and L,(A), respectively. By Corollary
5.2.2 the matrix
156 Applications to Matrix Polynomials

is a linearization of L 2 ( A)L,( A). Now Theorem 4.1.1 ensures that the partial
multiplicities of T corresponding to A0 are exactly those for F0; that is, they
are given by the sequence y.

The proof of the converse statement of Theorem 5.2.3 shows that for a
yEF(a, /3) there exist linear monic matrix polynomials L t (A) and L 2 (A)
with the desired properties and with the size not exceeding min(rj,r 2 ) +
TJ + r 2 , where r, (resp. r 2 ) is the sum of all integers in a (resp. )3).
Our analysis of partial multiplicities of completions in sections 4.2-
4.4, combined with Theorem 5.2.3, allows us to deduce various connec-
tions between the partial multiplicities of monic matrix polynomials and the
partial multiplicities of their product, as indicated, for instance, in the
following corollary.

Corollary 5.2.4
Let Lj(A) and L 2 (A) be n x n monic matrix polynomials. Let a =
(a,, a 2 ,. . .), /3 - (j§,, /32, . . .), and y = (y,, -y2, . . .) be sequences of partial
multiplicities o/L,(A), L 2 (A), and L 2 (A)L,(A), respectively, at A 0 . Then

/or an_y sequence r^<- • • <rm of positive integers.

The corollary follows from Theorems 4.3.1 and 5.2.3.

5.3 DIVISIBILITY OF MONIC MATRIX POLYNOMIALS

Let L( A) be an nx n monic matrix polynomial of degree /, and let


(X, T, y) be a standard triple for L(A). Consider a F-semiinvariant sub-
space M. Thus there exists a triinvariant decomposition (see Section 3.3)
associated with M:

where the subspaces j£ and ^ 4- J< are F invariant. The triinvariant decom-
position (5.3.1) is called supporting [with respect to (X, F, y)] if, for some
integers p and q, the transformations
Divisibility of Monk Matrix Polynomials 157

and

are invertible (in particular, this implies that dim(j£ + M) = np,


dim j£' = nq).
Cases in which j£= {0} are of particular interest; then M is T invariant
and condition (5.3.3) is vacuous. Also, if JV= (0), then M is T coinvariant
and the condition (5.3.2) is satisfied automatically with p = /. (Indeed,
we have seen in Proposition 5.1.4 that the matrix col[AT']'lo is n
singular.)
The definition of a supporting triinvariant decomposition is given in terms
of (X, T) only. However, if P# is a projector with Ker P^ — JV, the
following lemma shows that the invertibility of (5.3.2) is equivalent to
the invertibility of the transformation from C n( '~ p) into Im P^ defined by

Similarly, (5.3.3) is invertible if and only if

is invertible, where P^ is a projector with Ker Py — 58 (note that because of


the T invariance of £ and M we have

Lemma 5.3.1
Let L(\) be a monic matrix polynomial of degree I with standard triple
(X, T, Y), and let P be a projector in (p"'. Then the transformation

(where k < I) is invertible if and only if the transformation

is invertible.
invertible.
158 Applications to Matrix Polynomials

Proof. Put and with


spect to the decompositions and
write

Thus the At are transformations with the following domains and ranges:

and similarly for the Bt.


Observe that Al and B4 coincide with the transformations (5.3.4) and
(5.3.5), respectively. By formula (5.1.9) the product AB has the form

where D, and D2 are nonsingular matrices. Recall that A and B are also
nonsingular by Proposition 5.1.4. But then Al is invertible if and only if B4
is invertible. This may be seen as follows.
Suppose that B4 is invertible. Then

is invertible in view of the invertibility of fi, and then also B} - B2B4l B3 is


invertible. The special form of AB implies A^B2 + A2B4 =0. Hence D, =
A ,B, +A2B3 = A1B1 -A {B2B~l B^ = Al(Bl -B2B4}B3) and it follows
that A\ is invertible. A similar argument shows that invertibility of Al
implies the invertibility of B4. This proves the lemma.

The importance of supporting triinvariant decompositions stems from the


following result describing factorizations of a monic matrix polynomial L(A)
in terms of supporting triinvariant decompositions associated with a lineariz-
ation of L(\}.
Divisibility of Monic Matrix Polynomials 159

Theorem 5.3.2
Let =^(A) be an n x n monic matrix polynomial with standard triple
(X, T, y), and let df" = ££ + M+Jfbea supporting triinvariant decompos-
ition associated with a T-semiinvariant subspace M. Then L(A) admits a
factorization

where L ( (A), / = 1 , 2 , 3 are monic matrix polynomials with the following


property: (a) (X\y, T^, Y) is a standard triple of L 3 (A), where

(b) (X, PjfT\lmP , P^Y) is a standard triple for L,(A), where P v is a


projector with Ker P v = !£ 4- M and

(c) (Z\M, PMT\M, F) is a standard triple for L 2 (A), where PM is the projector
on M along !£ + Im P A ,

[Here q^l and / - / ? < / are the unique nonnegative integers such that the
linear transformations co\(XTi)t>:^- £-+ $"" and P^[Y,. . . , Tl'p~2Y,
T'~P~1Y]: $n(l-p)->% + M are invertible.}
Conversely, if equation (5.3.6) is a factorization of L(\) into a product of
three monic matrix polynomials Lj(A), L 2 (A), and L 3 (A), there exists a
supporting triinvariant decomposition
160 Applications to Matrix Polynomials

associated with a T-semiinvariant subspace M such that the standard triples


o/L,(A), L2(\),and L 3 (A) are (X, P x r l l m / P^Y), (Z]M, PMT\M, PMY),
and 7 " , y), respectively, where P^ is a projector with Ker P^ =
!£ 4- M , PM is the projector on M along !£ + Im PA , and X, Y, Z, Y are given
by (5.3.8), (5.3.7), (5.3.9) and (5.3.10), respectively.
Moreover, the T-invariant subspaces !£ and t£ + M in (5.3.11) are unique-
ly determined by the factors Lj(A), L 2 (A), and L 3 (A).
It is assumed in Theorem 5.3.2 that P where
/ is the degree of L(A).
As a monic matrix polynomial M(A) and its inverse are uniquely deter-
mined by any standard triple (see Theorem 5.1.6 and the definition of a
standard triple), Theorem 5.3.2 provides an explicit description of the
factors L,(A) in (5.3.6) in terms of supporting triinvariant decompositions.
For instance, if <p" = !£ + M 4- N is a supporting triinvariant decom-
position (associated with a r-semiinvariant subspace M ) and L( A) =
L!(A)L 2 (A)L 3 (A) is the corresponding factorization of L(A), then (in the
notation of Theorem 5.3.2) we have

Similarly, using Theorem 5.3.2, one can produce the formulas for Lj(A),
L 2 (A), and L 3 (A) themselves. The proof of Theorem 5.3.2 is quite lengthy
and is relegated to the next section.
The following particular case of Theorem 5.3.2 is especially important.
We assume that L(A) and (X, T, Y) are as in Theorem 5.3.2.
Corollary 5.3.3
Let <p"' = ££ + M+Nbea supporting triinvariant decomposition associated
with a T-semiinvariant subspace M such that !£ + M = <prt/ (so Jf — {0} and M
is actually T coinvariant). Then L(\) admits a factorization

where L3( A) is a monic matrix polynomial of degree q with a standard triple


of the form (X^, T^, Y), where
Proof of Theorem 5.3.2 161

Also, L 2 (A) is a monic matrix polynomial of degree I- p with a standard


triple of the form (XIM, PM T\M, PM Y) where

and PM is the projector on M along ££.


Conversely, if equation (5.3.12) is a factorization of L(\) into a product
of two monic matrix polynomials L 2 (A) and L3(X), there exists a unique
T-invariant subspace J£ such that the triinvariant decomposition <p" = 2£ +
M 4- {0} (where M is a direct complement to 3?) is supporting and the
standard triples of L 2 (A) and L3(X) are as described above.

Note that under the conditions of Corollary 5.3.3 we have q = I - p (cf.


Lemma 5.3.1).
Again, as in Theorem 5.3.2, one can write down explicit formulas for the
factors in (5.3.12) and their inverses using the triinvariant decomposition
<p"' = % + M + Jf with M = {0}. For example

in the notation of Corollary 5.3.3.

5.4 PROOF OF THEOREM 5.3.2

We need the following fact.

Proposition 5.4.1
Let L( A) = E / = 0 Aj\' be an n x n matrix polynomial (not necessarily monic)
and let L{(\) be an n x n monic matrix polynomial with standard triple
(Xl,Tl,Yl). Then (a) L(A) = L 2 (A)Li(A) for some matrix polynomial
L 2 (A) if and only if the equality

holds; (b) L(A) = Lj( A)L 3 (A)/or some matrix polynomial L 3 (A) if and only
if the equality

holds.

Proof. Let us prove (a). We have


Therefore
162 Applications to Matrix Polynomials

and for |A| large enough (e.g., for |A| > \\T} ||) we have

Now assume L(A)L 1 (A)~ 1 is a polynomial. Then in formula (5.4.2) the


coefficients of negative powers of A are zeros. But the coefficient of
A~ y "'(; = 0, 1,.. .) in (5.4.2) is

which is zero. So

As [y,, 7,, y,, . . . , r ^ ' y j is nonsingular [where k is the degree of L,(A);


see Proposition 5.1.4], we obtain equality (5.4.1).
Conversely, if (5.4.1) holds, then

which means that all coefficients of negative powers of A in (5.4.2) are zeros,
that is, L(A)L 1 (A)~ I is a polynomial.
Statement (b) of Proposition 5.4.1 follows from the (already proved)
statement (a) when applied to the matrix polynomials L(A) = (L(A))* =
. . _ def _
£}=0 A*\' and L,(A) = (Li(A))* in place of L(A) and L,(A), respectively.
(Note that (y*, T*,X*) is a standard triple for L,(A), and that L(A) =
L,(A)L 3 (A) if and only if L(A) = L 3 (A)L,(A), where L 3 (A) = [L 3 (A)]* is a
matrix polynomial together with L 3 (A).}

Assume now that <p"' = 3? + M + N is a supporting triinvariant decom-


position associated with 7-semiinvariant subspace M, as in Theorem 5.3.2.
As [coltAT'lfJo1]^: ££—> (p"9 is an invertible transformation, we can define
the n x n monic matrix polynomial L3(\) by the formula

where
Proof of Theorem 5.3.2 163

(so V): <p"—»j£, i = 1,. . . , q). It turns out that (A^>, T^, Vq) is a standard
triple of L 3 (A). Indeed, we note that the following equalities hold:

where C3 is the companion matrix for L 3 (A), and

(The second equality is obtained from

on premultiplication by ATj^,.) Hence (A^, 7^,, V ) is similar to the


standard triple (P,, C,, /?,) for the matrix polynomial L 3 (A) (in the notation
of Proposition 5.1.2), so (A^, 7^, V^) is itself a standard triple for L 3 (A).
Because of the equality

where / is the degree of L(A) and Aj is the coefficient of A' in L(A) [see
formula (5.1.6)], Proposition 5.4.1 ensures that there exists a matrix polyno-
mial L 4 (A) such that L(A)= L 4 (A)L 3 (A). The matrix polynomial L 4 (A) is
necessarily monic and of degree def/ — q. Let us find its standard triple. First
note that the transformation Q = P ^ + P v is a projector on M + Im P v
along 3?. Indeed, for every x E 3? we have Qx = PMx + Pxx = 0 + 0 = 0, and
for every yE.M (resp. y e Im P v ) we have Qy — PMy + P^y = y + 0 = 0
(resp. Qy = P^y = y). Then by Lemma 5.3.1, the transformation

is invertible.
Now we check that

where

and QTQ is considered as a transformation from Im Q into itself. In view of


164 Applications to Matrix Polynomials

the multiplication theorem (Theorem 5.2.1) it will suffice to check that the
triple (X, T, Y) is similar to the triple

For then we have where L 4 (A) is the right-han


side of (5.4.3) and thus L 4 (A) = L 4 (A). To this end define

where X} = X\<g, Tl = T^. Then P' is a projector and Im P' = !£. Indeed,
we obviously have P'y = y for every y G !£. Further, formula (5.1.9) implies
that

In fact, we have the equality:

To check this, let y £ Ker P'. As [y, 7Y,. . . , r'^'Y] is invertible, we


have y = E|lo r'Yx, for some x0,. . . , x,_l E <p". Now

and formula (5.1.9) easily implies that xt_q = • • • = jc 7 _j = 0. Hence (5.4.4)


follows.
In view of Lemma 5.3.1 the transformation [Y, TY,. . . , Tl~q~lY] is
one-to-one; therefore

Using (5.4.4) and the fact that P'^ = I, it follows that X and
Im[Y, TY,. . . , T'~q~lY] are direct complements to each other in <pn/.
Thus P' is indeed a projector.
Define 5: f'^lm P + Im Q by

where P' and (? are considered as transformations from <p" into Im P' and
Proof of Theorem 5.3.2 165

Im Q, respectively. One verifies easily that S is invertible. We show that

Take y E <£"'. Then P> E 2 and col^TV'P'.y]*^ = col[AT'~^]*=1. In


particular, XlP'y = A'y. This proves that [A^ 0]S = X. The second equality
in (5.4.5) is equivalent to the relations

and QT= QTQ. The last follows immediately from the fact that Ker Q is an
invariant subspace for T. To prove (5.4.6), take y E <pn/. The case when
_y E Ker (? = Im P' is trivial. Therefore, assume that y E Ker P'. We then
have to demonstrate that P'Ty-VlZ2Qy. Since y E K e r P ' , there exist
x0, . . . , * , _ , _ , E <f" such that u = ^I? r'^^'Fjc,.,. Hence

with u E K e r P ' and, as a consequence, P'Ty = P'T''qYxQ. But then it


follows from the definition of P' that

On the other hand, putting , we obtain

and so V^ZQy is also equal to VqxQ. This completes the proof of equation
(5.4.6). Finally, the last equality in (5.4.5) is obvious because P'y = 0.
We have now proved equality (5.4.3), from which it follows that
(Q, QTQ, QY) is a standard triple for L 4 (A).
Now define the monic matrix polynomial

where and

Then (A^, PxT^lmP ., P^y) is a standard triple for Lj(A). Indeed, this
follows from the equalities
166 Applications to Matrix Polynomials

where C2 is the second companion matrix of L,(A). The first and third
equations of (5.4.7) follow from the definitions; the second equality follows
from the structure of C2 using the fact that A col[l/,]Jl^ = /.
Now Proposition 5.4.1, (b) implies that L 4 (A) = L,(A)L 2 (A) for some
(necessarily monic) matrix polynomial L 2 (A). So in order to prove the direct
statement of Theorem 5.3.2 we have only to verify that (Z\M, PM^\M-> ^) ^s
indeed a standard triple for L 2 (A). To this end, put

where

Note that, in view of Lemma 5.3.1, the invertibility of the transformation on


the right-hand side of (5.4.8) follows from the invertibility of A. As shown
earlier in this section, (Z^, PMT\M, Y) is a standard triple for L2(\), and
L 4 (A) = L!(A)L 2 ( A) for some monic matrix polynomial L,(A) with standard
triple (X, P v T | I m / v P,Y). Hence L,(A) = L,(A), and thus L 2 (A) = L 2 (A).
Consider now the proof of the converse statement of Theorem 5.3.2. This
statement amounts to the following: if L(A) = L 4 (A)L 3 (A) for some monic
matrix polynomials L 4 (A) and L 3 (A), then there is a unique ^-invariant
subspace !£ such that (A^, T\</,, Y), with

is a standard triple for L 3 (A). Here q is the degree of L 3 (A). Let C be the
first companion matrix of L(A). Proposition 5.4.1 implies that

where is a standard triple for Also

Eliminating C from (5.4.9 and (5.4.10), we obtain

This readily implies that the subspace


Example 167

is T invariant. Moreover, it is easily seen that the columns of


are linearly independent; equation (5.4.11)
implies that in the basis of j£ formed by these columns, 7^ is represented by
the matrix 7,.
Further

so X\<£ is represented in the same basis in Z£ by the matrix X. Now it is clear


that is similar to (A',, T and thus Y) is alsot
standard triple for L 3 (A).
It remains to prove the uniqueness of Z£. Assume that J£" is also a
7-invariant subspace such that (X^., 7^,, Y) is a standard triple for L 3 (A)
(for some admissible Y). As any two standard triples of L 3 (A) are similar,
there exists an invertible transformation 5: =2" —> £ such that Xl(f. = X^S,
I \<r>' — >j 11 <fj. 1 nen

In particular

But the matrix col[AT'']!=J is invertible, so


Theorem 5.3.2 is proved completely.

5.5 EXAMPLE

We illustrate Theorem 5.3.2 with an example. Let

Then

is the standard triple for L(A) of Proposition 5.1.2, where


168 Applications to Matrix Polynomials

is the companion matrix for L(A). As we are concerned with semiinvariant


subspaces for C, it is more convenient to use a Jordan form for C in place of
C itself. The only eigenvalues of L(A) (and thus also of C) are 0, 1, and 2. A
calculation shows that the vectors

form a Jordan chain of C corresponding to 0; the vector x3 -


(1, 0, 0, 0, 0, 0) is an eigenvector of C corresponding to 0; the vectors

form a Jordan chain of C corresponding to 1 ; and the vector jc6 =


(-1,1, -2, 2, -4, 4) is an eigenvector of C corresponding to 2. The vectors
jc,, . . . ,jc 6 are easily seen to be linearly independent. Denoting by S the
invertible 6 x 6 matrix with columns Jt l 5 . . . , jc6, let

(/ is the Jordan form of C); and

Clearly, (X, J, Y) is a standard triple for L(A).


We now find some factorizations

where L,( A), / = 1, 2, 3 are monic matrix polynomials of the first degree. As
Example 169

in Theorem 5.3.2, we express these factorizations in terms of the supporting


triinvariant decompositions

with respect to the standard triple (X, J, Y). So we are looking for
/-semiinvariant subspace M with 5£ and SB + M J invariant, such that the
transformations

and

are invertible. In particular, dim !£ = dim M = dim N = 2. As £ and £6 + M


are J invariant, we have

and

where ^?A (/) is the root subspace of / corresponding to the eigenvalue A 0 .


We consider only those supporting triinvariant decomposition (5.5.3) for
which

and

In other words, we consider only those factorizations (5.5.1) for


which detL and or
equivalently

One could consider all other factorizations (5.5.2) of L(A) in a similar way.
First, we find all pairs of /-invariant subspaces (=$?, 3? + M} with the
170 Applications to Matrix Polynomials

properties (5.5.4) and (5.5.5). Using the Jordan form (5.5.1), it is not
difficult to see that all such pairs are given by the following formulas:

Let us check which of these pairs (£, !£ + M ) give rise to supporting


triinvariant decompositions, that is, for which pairs the transformations

are invertible. We have for

(in the basis e, + ae 3 , e4 in J^and the standard basis in <p 2 ), and this matrix
is invertible for all

which is not invertible. For

which is invertible. For

which is invertible if and only if /3 ^ 0. (In this calculation we have used the
formula
Factorization into Several Factors and Chains of Invariant Subspaces 171

Summarizing, one obtains all the supporting triinvariant decompositions


(5.5.3) with the properties (5.5.4) and (5.5.5) where either £ = Span{e, +
ae3,e4} for some a E <(7, M is a direct complement to Span{e l5 e 3 , e4, e6} in
<p6, or, for some nonzero ft E <p we have ^£- Span{e,, e 4 }, M is a direct
complement to !£ in Spanf^, e2 + 0e3, e 4 , ej, and JV is a direct comple-
ment to Span{e,, e2 + (3e3, e 4 , e6} in <p .
Using the formulas given in Theorem 5.3.2, one finds all the factoriza-
tions (5.5.2) corresponding to the supporting triinvariant decomposition
with properties (5.5.4) and (5.5.5) (here a E <p and j8 E <p, jS^O are as
above):

5.6 FACTORIZATION INTO SEVERAL FACTORS AND


CHAINS OF INVARIANT SUBSPACES

In this section we study factorizations of the monic n x n matrix polynomial


L(A) of degree / into the product of several factors:

where L,(A), . . . , L A (A) are monic n x n matrix polynomials of positive


degrees /, lk, respectively (of course, /, + • • • + lk = /). We have al-
ready encountered particular cases of factorizations (5.6.1) in Theorem
5.3.2 (with k = 3) and in Corollary 5.3.3 (with A: = 2). In Theorem 5.3.2
factorizations (5.6.1) with k = 3 were described in terms of supporting
triinvariant decompositions associated with semiinvariant subspaces of a
linearization of L(\). In contrast, the description of (5.6.1) is to be given
in terms of chains of invariant subspaces for a linearization of L(\).
The following main result can be regarded as a generalization of Corol-
lary 5.3.3.

Theorem 5.6.1
Let (X, T, Y) be a standard triple for L(A). Then for every chain of
T-invariant subspaces
172 Applications to Matrix Polynomials

satisfying the property that the transformations

are invertible (for some positive integers mk < mk_l < • • • < m2 < /) there
exists a factorization (5.6.1) of L(A), with the factors L ; (A) uniquely
determined by the chain (5.6.2), as follows. For j = 1,2,. . . , k - 1, let M) be
a direct complement to ^i+l in ^ (by definition, ^ = <p"') and let
PM : j£y —» M •be the projector on M . along &j+l. Then for j = 1,2,. . . , k —

so Vkq are transformations from <p" into ££k for q = 1, . . . , mk. Conversely,
for every factorization (5.6.1) of L( A) there is a unique chain of T-invariant
subspaces (5.6.2) such that for j = 2,3,. . . , k the transformations

where m (/; is the degree of L y ), are invertible and


formulas (5.6.3) and (5.6.5) hold.

Observe that in view of Proposition 3.1.1 formulas (5.6.3) do not depend


on the choice of M - .

Proof. Apply Corollary 5.3.3 several times to see that factorization


Factorization into Several Factors and Chains of Invariant Subspaces 173

(5.6.1) holds for the monic matrix polynomial


having the standard triple (X^, T^, Yj), where Yf is given by (5.6.4)
(y = 2,. . . , k). Now use Theorem 5.1.6 to produce the formulas

where

(so V-q are transformations from (pn into J^ for g = 1,. . . , m y ). In particular
(with j = k), formula (5.6.5) follows. Further, using the formulas for the
standard triple of the factor L2( A) in Corollary 5.3.3, one easily obtains the
desired formulas [equation (5.6.3)]. The converse statement also follows by
repeated application of the converse statement of Corollary 5.3.3.

A "dual" version of Theorem 5.6.1 can be obtained if one uses the left
canonical form [equation (5.1.14) instead of the right canonical form
equation (5.1.13)] to produce formulas for L y (A)Ly + 1 (A) • • • Lk(\). Then
one uses (5.1.13) [instead of (5.1.14)] to derive the formulas for
L , ( A ) , . . . , L^.^A). We omit an explicit formulation of these results.
We are interested particularly in factorizations (5.6.1) with linear factors
L y (A): L ; (A) = A / + Af for some n x n matrices At (y = 1,. . . , k). Note
that in contrast to the scalar case, not every monic matrix polynomial admits
such a factorization:

EXAMPLE 5.6.1. Let

We claim that L(A) cannot be factorized into the product of (two) linear
factors. Indeed, assume the contrary:

for some complex numbers a,, bt, cit dt, i = 1, 2. Multiplying the factors on
the right-hand side and comparing entries, we obtain

Letting
174 Applications to Matrix Polynomials

we can rewrite equality (5.6.6) in the form

which implies However, there is no 2*2matrix A wiht thie

property (indeed, such an A must have only the zero eigenvalue, but then
inevitably A2 = 0).

As we shall see in the next theorem, a necessary (but not sufficient)


condition for a monic matrix polynomial L( A) not to be decomposable into a
product of monic linear factors is that the linearization of L ( A ) is not

diagonable. Indeed, in Example 5.6.1 the linearization of has


only one Jordan block /4(0) in its Jordan form.

Theorem 5.6.2
Let L( A) be an n x n monic matrix polynomial of degree I for which the
companion matrix is diagonable. Then there exist n x n matrices A^,. . . , A,
such that

Proof. Let (X, T, Y) be a standard triple for L(A), and let

Obviously, the JV; are subspaces in <J7n/ and

By Theorem 1.8.5 there exist ^-invariant subspaces M\ C M2 C • • • C Mt_}


such that M • is a direct complement to N^ in (p"'. The transformations

are invertible. Indeed, by the choice of Mt we have KerfcoHAT1]'"^.) =


{0}. As the matrix co^XT']'^ is invertible, the matrix colfAT'^I,', 'has
linearly independent rows and thus ImfcolIAT'jri^ ) = Im(col[AT'"]{-lJ) =
<p"', y = ! , . . . , / - ! . Invertibility of (5.6.7) now follows. The proof is
completed by applying Theorem 5.6.1. D
Differential Equations 175

5.7 DIFFERENTIAL EQUATIONS

Consider the homogeneous system of differential equations with constant


coefficients:

wherere n x n (complex) matrices, ans ann -


dimensional vector function of t to be found. The behaviour of solutions of
equation (5.7.1) as t—»°° is an important question in applications to physical
systems. We look for solutions with prescribed growth (or decay) at infinity.
It will turn out that such solutions depend on certain invariant subspaces of
a linearization of the monic matrix polynomial

connected with (5.7.1).


First we observe that a solution of (5.7.1) is uniquely defined by the
initial data x('\d) = Xj, j = 0 , . . . , / — 1, with given initial vectors
Xn. . . . , X i _ i . Indeed, denoting by y(t) the n/-dimensional vector

equation (5.7.1) is equivalent to the following equation:

As it is well known [cf. Section 2.10, especially formula (2.10.8)], a solution of


equation (5.7.2) is uniquely defined by the initial data y(a), which amounts
to the initial data x(n(d), j = 0 , . . . , / - 1 for equation (5.7.1). In particular,
the dimension of the set of all solutions of (5.7.1) (this set obviously is a
linear space) is nl, the number of (complex) parameters in the n-dimensional
vectors jc () ,. . . , xl_l that determine the initial data of a solution and thus
the solution itself.
It will be convenient to describe the general solution of (5.7.1) in terms
of a standard triple (X, T, Y) of the monic matrix polynomial L(\).
176 Applications to Matrix Polynomials

Lemma 5.7.1
A function x(t) is a solution of (5.7.1) if and only if it has the form

for some vector c E <p" .

Proof. Differentiating (5.7.3), we obtain

so

which is equal to zero in view of Proposition 5.1.4. It remains to show that


every solution of (5.7.1) is of the type (5.7.3) for some c E <pn/. As the linear
space of all solutions of (5.7.1) has dimension nl it will suffice to show that
the solutions Xe'Tc},. . . , Xe'Tcnl that correspond to a basis c,,. . . , cnl in
(pn/ are linearly independent. In other words, we should prove that Xe'Tc = 0
for all t^a implies c = 0. Indeed, differentiating the relation Xe'Tc = Q
j times, we obtain XT'e'Tc = 0 for 7 = 0,1, 2,. . . . In particular

As the matrices e'T and col(AT')|lo are nonsingular (Proposition 5.1.4), it


follows that c = 0.

Now let us introduce some T-invariant subspaces:+ffl (T)[resp.


is the sum of all root subspaces of T corresponding to its eigenvalues with
positive real part (resp. with negative real part); &10(T) is the sum of all root
subspaces of T corresponding to its pure imaginary eigenvalues (including
zero); and

Obviously, 3^0(T) is a /"-invariant subspace contained in &10(T). If it


happens that T has no eigenvalues with positive real part, we set £% + (T) =
(0). A similar convention will apply for &_(T), 020(T), and 3/C0(T).
Differential Equations 177

Let 3£i(T) be a fixed direct complement to ^Q(T} in $10(T) and note that
%i(T) is never T invariant [unless 3^(7) = {0}]. Otherwise ^(T) would
contain an eigenvector of T that, by definition, should belong to 5T0(r).
We now have the direct sum ("' = 9l_(T) + %0(T) 4- %t(T) + 9l+(T).
For a given vector c E <p"', let

where c_ e &_(r), c0 £ 3f 0 (r), c, £ %}(T), c + £ & + (r). We describe the


qualitative behaviour of solutions of (5.7.1) in terms of this decomposition
of the initial value of the solution x(t).
A solution x(t) of (5.7.1) is said to be exponentially increasing if for some
positive number /a

but

for every e > 0. Obviously, such a positive number n is unique and is called
the exponent of the exponentially increasing solution x(t). A solution x(t) of
(5.7.1) is exponentially decreasing if (5.7.5) and (5.7.6) hold for some
negative number /i [which is unique and is called again the exponent of x(t)].
We say that a solution x(t) is polynomially increasing if

for some positive integer m. Finally, we say that a solution x(t) is oscillatory
if

These classes of solutions of (5.7.1) can be distinguished according to the


decomposition (5.7.4) of the vector c, as follows.

Theorem 5.7.2
Let x(t) = Xe'Tc be a solution of (5.7.1). Then (a) x(t) is exponentially
increasing if and only ifc+ ^ 0; (b) x(t) is polynomially increasing if and only
if c+ — 0, Cj T^ 0; (c) x(t) is oscillatory if and only if c+ = Cj = 0, c0 ^ 0; (d)
x(t) is exponentially decreasing if and only if c+ = cl = CQ = 0, c_ 5^0. In
cases (a) and (d), the exponent of x(t) is equal to the maximum of the real
parts of the eigenvalues A0 of T with the property that PA c ^ 0, where PAo is
the projector on ^o(T) along
178 Applications to Matrix Polynomials

Proof. We have

where Without loss of generalit


[passing to a similar triple (X, T, Y), if necessary] we can assume that T_,
T0, T+ are matrices in Jordan form.
Note that for the Jordan block Jk(\) we have (according to Section 2.10)

So every entry in Xe'T+c+ is a function of the type

for some polynomials /?,(/)• Also, every entry in Xe'T°(cQ + c t ) is of the


type

whereas every entry in Xe'r°c0 is of the type (5.7.9) with all polynomials
pf(t) constant. Finally, every entry in Xe'T~c_ is of the type

Further, note that

if and only if c ± =0. Indeed, if equality (5.7.11) holds, then successive


differentiation gives XT'±e'T-c± = 0, / = 0 , 1 , . . . . In particular

As is a nonsingular matrix, the transformation


Differential Equations 179

has zero kernel, and equation (5.7.12) implies c ± =0. Also, the equality

holds if and only if c0 + c, = 0. Also

if and only if c0 = 0. According to the observation made in this and the


preceding paragraphs, statements (a)-(d) follow easily from formula (5.7.7).
For instance, assume that x(t) is exponentially increasing. In view of
(5.7.8)-(5.7.10), this means that Xe'T+c+*Q (since \ez\ = e*ez for any
complex number z), and this is equivalent to the inequality c+ ¥^ 0.

A special case with X — [I 0 • • • 0] and T the companion matrix of


L(A) deserves special attention. In this case the matrix col[AT']|lo is just
the identity, and thus

Exponentially decreasing solutions of (5.7.1) are of particular interest. We


present one result on existence and uniqueness of exponentially decreasing
solutions in which only partial initial data are prescribed.

Theorem 5.7.3
For every set of k vectors JC Q , . . . , jc^., in <p" there exists a unique exponen-
tially decreasing solution x(t) of (5.1 A) such that

if and only if the matrix polynomial L( A) admits a factorization


are monic matrix polynomials of
degrees k and I - k, respectively, such that Re A <0 for all A G o-(/.,) an
<&t A > 0 / o r fl//AGo-(L 2 ).

Proof. In the notation of Theorem 5.7.2 the solution x(t) is exponential-


ly decreasing if and only if

where c_ G £%_(T). When x(t) is given by (5.7.10) we have


180 Applications to Matrix Polynomials

It follows that for every set JCD, . . . , xk_l £ <p" there exists a unique expo-
nentially decreasing solution x(t) of (5.7.1) with x('\a) = jc,, / = 0 , . . . ,
A: - 1 if and only if the transformation

is one-to-one and onto. This amounts to the invertibility of


col[Ar(<r|ijM7.))l]fj01, which in turn is equivalent (by Corollary 5.3.3) to the
existence of a factorization L(A) = L 2 (A)L,(A). Moreover, in this factoriza-
tion [ X \ # _ ( T ) , T\m_(T)-> Y] is a standard triple for L t (A) (for a suitable Y"),
whereas (^, PT\lmp, PY) is a standard triple for L 2 (A) for a suitable X,
where P is the projector on 3fcQ(T) + 3ft+(T) along $._(T). As 7 1 | g8 _ (r) and
PT\lm P are linearizations of Lj(A) and L 2 ( A), respectively (Theorem 5.1.5),
it follows that indeed <3le A < 0 for all AGo-(L,), and S^^ A > 0 for all
Ae<r(L 2 ). D

5.«
S.8 DIFFERENCE EQUATIONS

In this section we consider the system of difference equations

where /10, . . . , A,_l are given n x n matrices, and {^}JL0 is a sequence of


n-dimensional vectors to be found. Clearly, given / initial vectors
XQ, . . . ,x,_l, the vectors x,, xl+1, and so on are determined uniquely from
(5.8.1). Hence, a solution {jcy.}°°=0 of equation (5.8.1) is determined by its
first / vectors.
Again, it will turn out that the asymptotic behaviour of solutions of
(5.8.1) can be described in terms of certain invariant subspaces of a
linearization of the associated monic matrix polynomial

Let (X, T, F) be a standard triple for L( A). The general solution of (5.8.1) is
then

where c E <p"' is an arbitrary vector. Indeed, putting jcy = XT'c, j =


0, 1, . . . , we have
0,1,.
Difference Equations 181

which is zero in view of Proposition 5.1.4. If the first / vectors in (5.8.2) are
zeros, that is

then by the nonsingularity of col[AT']Jlo we obtain c = 0. This means that


the solutions (5.8.2) are indeed all the solution of (5.8.1).
The solutions of (5.8.1) are now to be classified according to the rate of
growth of the sequence {jcy}JL0. We say that the solution {*y}JL0 is of
geometric growth (resp. geometric decay) if there exists a number q > 1
(resp. a positive number q < 1) such that

but

for every positive number e. The number q is called the multiplier of the
geometrically growing (or decaying) solution {jcy}JL0. The solution {*y}JL0 is
said to be of arithmetic growth if for some positive integer k the inequalities

holds. Finally, {jty.}°°=0 is oscillatory if

The classification of the solution Xj = XT'c, y = 0,1,. . . of (5.8.1) in


terms of c E <p"' is based on certain TMnvariant subspaces. Let us introduce
these subspaces. Denote by 2ft+(T) [resp. £%~(T)] the sum of all root
subspaces of T corresponding to the eigenvalues A0 of T with | A 0 | > 1 (resp.
with |A0| < 1), and let 3£l(T) be a direct complement to the subspace

in the sum of all root substances of T corresponding to the eigenvalues An


with |A 0 | = 1. Observe that ^ + (T), 9t~(T), and %°(T) are T invariant. We
have a direct sum decomposition
182 Applications to Matrix Polynomials

according to which every vector c G <pw/ will be represented as

Theorem 5.8.1
Let (xt: — XT'c}~=f) be a solution of (5.8.1). Then the solution is (a) of
geometric growth if and only if c+ 7^0; (b) of arithmetic growth if and only if
c+ = 0, c1 7^0; (c) oscillatory if and only if c+ = 0, c1 =0, cVO; (d) of
geometric decay if and only if c+ — c = c ( = 0 , c ~ 7 ^ 0 . In cases (a) and (d)
the multiplier of {jc,}^=0 is equal to the maximum of the absolute values of the
eigenvalues A0 of T with the property that PA c ^ 0, where PA is the projector
on £%A (T) along

The proof of Theorem 5.8.1 is similar to the proof of Theorem 5.7.2 if we


first observe that the rath power of the Jordan block of size k x k with
eigenvalue A is

(It is assumed here that This formula can be easily


verified by induction on ra.
The following result on existence of geometrically decaying solutions of
equation (5.8.1) can be established using a proof similar to that of Theorem
5.7.3.

Theorem 5.8.2
For every set of k vectors _y 0 , . . . , y k _ { in <p" there exists a unique geometri-
cally decaying solution {jtj}JL 0 with x(} = y ( } , . . . , x k _ { = yk_l if and only if
L(A) admits a factorization L(\) = L 2 (A)L,(A), where L2(\) and L,(A) are
monic matrix polynomials of degrees I - k and k, respectively, such that
Exercises 183

5.9 EXERCISES

5.1 For a monic n x n matrix polynomial L(A) of degree /, the pair of


matrices (X, T), where X and T have sizes n x nl and n/ x «/,
respectively, is called a right standard pair for L(A) if (X, T, Y) is a
standard triple of L(A), for some n x n matrix V.
(a) Prove that a pair of matrices (A', T) of sizes n x nl and «/ x «/,
respectively, is a right standard pair for a monic matrix poly-
nomial if and only if col|
vertible and

[Hint: The necessity follows from Proposition 5.1.4. To prove


sufficiency, define

(1)

and verify that (X, T, Y} is similar to the triple (F,, C,, #,)
from Proposition 5.1.2 with the similarity matrix col[AT'][lJ.]
(b) Show that given a right standard pair (X, T) of L(A), there
exists a unique Y such that (X, T, Y) is a standard triple for
L(A), and in fact Y is given by formula (1). [Hint: Use formula
(5.1.11) for the similarity between the standard triple (X, T, Y)
and the standard triple (Pj, C,, R^ from Proposition 5.1.2.]
5.2 A pair of matrices (T, Y) of sizes nl x nl and nl x n, respectively, is
called a left standard pair for the monic n x n matrix polynomial L( A)
if for some n x nl matrix X the triple (X, T, Y) is a standard triple of
L(A).
(a) Prove that a pair of matrices (T, Y) of sizes nl x nl and nl x n,
respectively, is a left standard pair for L(A) = /A 7 + EJIo Aj\' if
and only if [Y, TY, . . . , Tl~lY] is invertible and

(b) Show that given a left standard pair (T, Y) of L(A), there exists
a unique ^T such that (X, T, Y) is a standard triple of L(A), and
in fact

(c) Prove that (T, Y) is a left standard pair for L(A) = /A' +
Ejlj Aj\' if and only if (Y*,. T*) is a right standard pair for the
monic matrix polynomial
184 Applications to Matrix Polynomials

5.3 be a scalar polynomial with / distinct zeros

(a) Show that

is a right standard pair for L( A). Find Y such that (X, 7\ Y) is a


standard triple for L(A).
(b) Show that

is a left standard pair for L( A), and find X such that (X, T, Y) is
a standard triple for L(A).
5.4 Let be a scalar polynomial. Show that
J,(\0)) is a right standard pair or L(\) and that
a left standard pair for L(A). Find X and Y such that ([1 0 • • • OJ,
J,(A 0 ), Y) and (X, //(A 0 ), col[5,,]J=1) are standard triples for L(A).
5.5 be a scalar polynomial, where
are distinct complex numbers. Show that

and

are right and left standard pairs, respectively, of L(A), where


is an [. matrix and

is an lt x 1 matrix.
Exercises 185

5.6 Let

be a monic matrix polynomial, and let and


be standard triples for the polynomials L,(A) and L 2 (A), respectively.
Find a standard triple for the polynomial L(A).
5.7 Given a standard triple for the polynomial L(A), find a standard triple
for the polynomial S~'L(A + a)S, where S is an invertible matrix,
and a is a complex number.
5.8 Let (X, T, Y) be a standard triple for L(A). Show that

is a standard triple for the matrix polynomial L(A 2 ).


5.9 Given a standard triple for the matrix polynomial L(A), find a
standard triple for the polynomial L(/?(A)), where
is a scalar polynomial.
5.10 Let

be a 3 x 3 matrix polynomial whose coefficients are circulants:

(a A , bk, and c^ are complex numbers). Describe right and left


standard pairs of L(A). [Hint: Find an invertible 5 such that
S~1L(\)S is diagonal and use the results of Exercises 5.5-5.7.]
5.11 Identify right and left standard pairs of a monic n x n matrix
polynomial with circulant coefficients.
5.12 Using the right standard pair of a scalar polynomial given in Exercise
5.5, describe:
(a) The solutions of differential equation

where a 0 , . . . , al^l are complex numbers;


(b) The solutions of difference equations
186 Applications to Matrix Polynomials

5.13 Find the solution of the system of differential equations

where a0,. . . , at_l and b0,. . . , b,_l are complex numbers. When
are all solutions exponentially decreasing? When does there exist a
nonzero oscillatory solution?
5.14 Find the solutions of the system of difference equations

When do all nonzero solutions have geometric growth?


5.15 Find the supporting triinvariant decomposition !£ + M + {0} = <p'
corresponding to the divisor (A - A , ) " ' • • • (A - A^)"* of the scalar
polynomial (A - A,)* 1 • • • (A - \k)Pk (here af < 0;, y = 1,. . . , A:, and
at; are nonnegative integers). Use the standard triple determined by
the right standard pair described in Exercise 5.5.
5.16 Let A/ — A', and A/ - X2 be linear n x n matrix polynomials such
that the matrix Xl - X2 is invertible. Construct a monic n x n matrix
polynomial of second degree with right divisors A/ — Xl and A/ — X2.
[Hint: Look for a matrix polynomial with the standard pair ([/ /],
*,e*2).i
5.17 Let Lj(A) and L 2 (A) be monic matrix polynomials with no partial
multiplicities greater than 1. Show that the product L 1 (A)L 2 (A) has
no partial multiplicities greater than 2.
5.18 State and prove a generalization of the preceding exercise for the
product of k monic matrix polynomials with no partial multiplicities
greater than 1.
5.19 Show that a monic n x n matrix polynomial has not more than n
partial multiplicities corresponding to any zero of its determinant.
(Hint: Use Exercise 2.16.)
5.20 Prove that a monic n x n matrix polynomial of degree / with
circulant coefficients has not more than / partial multiplicities corres-
ponding to any zero of its determinant.
5.21 Describe all supporting triinvariant decompositions for the scalar
polynomial (A - A0)".
Exercises 187

5.22 Given an n x n monic matrix polynomial L( A) of degree /, a


CL-invariant subspace !£ is called supporting if the direct sum
decomposition 56 + M 4- {0} = <p"' is a supporting triinvariant de-
composition with respect to the standard triple

Find all supporting subspaces for the scalar polynomial

5.23 Find all supporting subspaces for the scalar polynomial

5.24 Prove that for a scalar monic polynomial L(A), every CL-invariant
subspace is supporting.
5.25 Describe all supporting subspaces for a monic matrix polynomial
whose coefficients are circulant matrices, that is, matrices of type

5.26 Give an example of a monic matrix polynomial of second degree


with nondiagonable companion matrix that admits factorization into
linear factors.
5.27 Prove the following extension of Theorem 5.6.2 for polynomials of
second degree. Let L(A) be a monic n x n matrix polynomial of
second degree such that its companion matrix has at least 2n - \
blocks in its Jordan form. Then L(\) admits a factorization into
linear factors ( A / - A^XI- A2). [Hint: Let (X, /) be a right
standard pair of L(A) with / in the Jordan form. Arguing by
contradiction, assume that every n columns of X formed by the
eigenvectors of L(A) are linearly dependent. Then the columns in
that correspond to the eigenvectors of L(A) are linearly
dependent, and this contradicts the invertibility of
188 Applications to Matrix Polynomials

5.28 A factorization L(A) = L 2 (A)L 3 (A) of a monic matrix polynomial


L( A) is called spectral if det L2( A) and det L3( A) have no common
zeros. Show that the factorization is spectral if and only if in the
corresponding triinvariant decomposition t£ 4- M + {0} = <pn/
(Corollary 5.3.3) the ^-invariant subspace SE is spectral.
5.29 Prove or disprove the following statement: each monic matrix poly-
nomial L(A) has a spectral factorization corresponding to every
triinvariant decomposition j£ 4- M + {0} = <p" with spectral T-
invariant subspaces $£ and M, where T is a linearization for L(A).
5.30 Let a,, « 2 , a 3 , a4 be distinct complex numbers, and let

(a) Show that

is a right standard pair for L(A).


(b) Find Y such that (AT, 7, 7) is a standard triple for L(A).
(c) Using the supporting triinvariant decomposition !£ 4- M 4- {0} =
<p4 with spectral T- invariant subspace #, find all spectral
factorizations of L( A).
5.31 Let M( A) and N( A) be a monic matrix polynomials of sizes n x n
and m x m, respectively, and of the same degree /, and let

be a direct sum of M(A) and N(\). Prove or disprove each of the


following statements: (a) the monic matrix polynomials Lj(A) and
L2( A) in every factorization L(A) = Lj( A)L 2 (A) are also direct sums;
(b) same as (a) with the extra assumption that M( A) and N( A) do not
have common eigenvalues.
5.32 Verify formula (5.8.3).
5.33 Supply the details for the proof of Theorem 5.8.1.
5.34 Prove Theorem 5.8.2.
Chapter Six

Invariant Subspaces
For Transformations
Between Different
Spaces

We are now to generalize the notion of an invariant subspace for transfor-


mations from <p" into <p" in such a way that it will apply to transformations
from fm+H into <p", or from <p" into $m+". The definitions introduced will
have associated with them a natural generalization of similarity, called
"block similarity", that will apply to transformations between different
spaces. This will form an equivalence relation on the class of transform-
ations between two given (generally different) spaces. A canonical form is
developed for this similarity that is a generalization of the Jordan normal
form. These ideas and results are then applied to the resolution of two
spectral assignment problems. This really means analysis of the changes in
spectra brought about by block similarity transformations.
Although this material is based on the theory of feedback in time-
invariant linear systems, the presentation here is in the framework of linear
algebra.

6.1 [A B]-INVARIANT t SUBSPACES

Consider a transformation from <p m+ " into (p". Our objective in this section
is to develop and investigate a generalization of the notion of an invariant
subspace that will apply to such transformations and that reduces to the
familiar concept when m =0. Let P be the projector on $m+n that maps
each vector onto the corresponding vector with zeros in the last m positions.
We treat vectors of §m+n in terms of their components in Im P and

189
190 Invariant Subspaces for Transformations Between Different Spaces

Im ), respectively, and, fo we identify


with (jtj, . . . , xn} E <p". Then we may repre-
sent any x E <p m+ " as an ordered pair (Px, (I - P)x) and, with respect to this
decomposition, a transformation from <p m+n into <p" can be written in the
block form [A B] where v4:<p"^<p" and fi: <p m -+(p". We also write

A subspace M of <p" will be said to be [A B] invariant if there is a


subspace ^ of <p m+ " with M = P3> and [A B}& C P& = M. Of course, when
m = 0, P = /, and this is interpreted as the familiar definition AM C M for A
invariance.
We now characterize this concept in different ways and, for this purpose,
introduce another definition. Given a transformation [A B]: <p" 4-
<p m -»<F' 1 , a transformation T: <p m+ "^ fm+n is called an extension of
[v4 5] if it has the form

for some transformations

Theorem 6.LI
Let M be a subspace of <p" and [A B] be a transformation from $m+n into
<p". Then the following are equivalent: (a) M is [A B] invariant; (b) there
exists a subspace y of <p m+ " with M = Ptf and an extension of [A B] under
which y is invariant; (c) the subspace M satisfies

(d) there is a transformation such that

Proof. The theorem will be proved by verifying the implications


(a)=>(d):»(c)=>(b)=>(a).
(a)=^>(d): Since M is [A B] invariant, there is a subspace with
M - Py and [A B\V C M. Let xl,. . . , xk be a basis for M. Then there exist
zl,...,zkESe such that *. = Pz-, / = 1, 2, . . . , * . Define y. = (/ - P)z- E
<pm, y - 1, 2,. . . , k and then, since E 5^, [A B\tf C J( implies that, for
; = 1, 2,. . . , k, Axf + Byj E M. Now define a transformation F: <p"-» <pm by
setting FXJ — yt for y = 1,. . . , k and letting F be arbitrary on some direct
complement to M in <p". Then for any m = S^, a^ E ^ we have

as required.
[A fi]-Invariant Subspaces 191

(d)=>(c): Given condition (6.1.2) we have, for any xEM

and (6.1.1) follows.


(c)=>(b): Let *,,. . . , xk be a basis for M and, using formula (6.1.1), let
y{,. . . , yk be vectors in <pm for which Ax} + Byj £ M for j = 1, 2, . . . , & .
Define a transformation //: <p"—><p in by means of the relation Hxj = yi,
j = 1, 2,. . . , k and letting H be arbitrary on some direct complement to M
in (p". Then define the subspace <f of <p m+ ' 1 by

and note our construction ensures that (A + BH)m E M for any mEM.
Consider the extension of \A B]. It is easily verified that & is
invariant under this extension.
This follows immediately from the definitions.

We will find the next simple corollary useful.

Corollary 6.1.2
With the notation of Theorem 6.1.1, if M is [A B] invariant, then for any
transformation F: <p"—*• <pm, M is [A + BF B] invariant.

Proof. We use the equivalence of statements (a) and (d) of the theorem.
The fact that M is [A B] invariant implies the existence of an
such that M is (A + BFQ)invariant.Thus, for any
Consequently,
AMCM+Y

Subspaces characterized by equation (6.1.1) are described in more


geometric terms by replacement of Im B by some subspace °V of ((7". In this
context it is useful to describe a subspace M as A invariant (mod Y) if

When V = {0} a subspace is A invariant (mod V) if and only if it is A


invariant. At the other extreme, when V = <p", every subspace is A invariant
(mod T).
For a given transformation A: <p" —> <p" and a subspace °V of (p", consider
the class of all subspaces that are A invariant (mod T). It is easy to see that
this class is closed under addition of subspaces, but is not closed under
intersection. This is illustrated in the next example. We observe that
192 Invariant Subspaces for Transformations Between Different Spaces

(reverting to the language of transformations), although the set of all


A -invariant subspaces form a lattice, the same is not generally true for the
set of all [A /?]-invariant subspaces.

EXAMPLE 6.1.1. Let A: <p 3 —»<p 3 be defined by linearity and the equalities
Ael — e2, Ae2 = e^, Ae3 = el. Let °V - Span{e2 + e3}. The subspaces
Span{el,e2} and Span{e,,e 3 } are both A invariant (mod Y). (The sub-
space Span{e,,e 2 } is actually A invariant.) However, their intersection
Span{e,} is not A invariant (mod V). Indeed, Ae\ = e2f£Span{el} +
Span{e2 + e>3}.

Given A and Y as above, it is natural to look for a "largest" subspace


among all of those that are A invariant (mod Y). More generally (cf.
Section 2.7), given a subspace M of (p", a subspace °IL of M that is A
invariant (mod V), is said to be maximal in M if °H contains all other
subspaces of M that are A invariant (mod V).

Proposition 6.1.3
For every subspace M C (p" there is a unique subspace of <p" that is A
invariant (mod V) and maximal in M.

Proof. Let °\L be the sum of all subspaces that are A invariant (mod V)
and are contained in M. Because of the finite dimension of M, °U is in fact
the sum of a finite number of such subspaces. Consequently, °U is itself A
invariant (mod Y) and thus maximal in M. The uniqueness is clear from the
definition. D

6.2 BLOCK SIMILARITY

In the preceding section the idea of [A B]-invariant subspaces has been


developed where [A B] is viewed as a transformation from (p" + <pm into
<P". We must also consider transformations of the other kind, namely, those
acting from <p" to (p" 4- (p"1. Such transformations can be written in the form
where A: <p"—»<p" and C: <p"—»<p'". For these transformations, we
need a dual concept of -invariant subspaces where is viewed as a
transformation from (p" into (p" 4- <pm. Thus, guided by Proposition 1.4.4, it
[ A I invariant if an only if M x is
[A* C*] invariant in the sense of Section 6.1. We develop this idea in
Section 6.6. The purpose of this section is to generalize the notion of
similarity to transformations [A B] and in a way that will be consistent
with the definitions of these generalized invariant subspaces.
Block Similarity 193

Let us begin with similarity for transformations from <p" into


1
<p" 4- (p" . In this case it is natural to say that a transformation is

similar to if there is an invertible transformation 5 on <p" 4- <f"m such


that

and the additional assumption that <p" is S invariant. Thus S\(n defines an
invertible transformation on (f"1 — the space on which acts. This
means that, with respect to the decomposition S has the
representation

where X, Y are invertible transformations on <p" and <pm, respectively. The


formal definition is thus as follows: transformations from
1 1
<p" into (f " 4- (p" are said to be block similar if there is an invertible
transformation

such that

Going to the adjoint transformations, this leads us to the dual definition:


transformations [Al fij and [A2 B2] from <p" 4- <pm into <p" are said to be
block similar if there is an invertible transformation

such that

Now let us describe block-similar pairs [Al fij and [A2 B2] in two other
ways.
194 Invariant Subspaces for Transformations Between Different Spaces

Theorem 6.2.1
Let [Al 5J and [A2 B2] be transformations from <p" + <pm into <p". Then
the following statements are equivalent: (a) [A^ Bv] and [A2 B2] are block
similar; (b) there exist invertible transformations N and M on <p" and <pm,
respectively, and a transformation F: <p n —» <f"" such that

(c) for any extension Tl of [A l Bv] there is an extension T2 of [A2 B2] and a
triangular invertible transformation S of the form (6.2.3) for which T{ =
ST2S~\

Proof. Given statement (a) and, hence, equation (6.2.4), let F= LN \


and it is found immediately that equation (6.2.4) implies the relations
(6.2.5). So (a)=>(b).
Given statement (b), define 5 as in (6.2.3), let L = FN, and let

be an extension of [ A } J?J. Then it is easily verified that 5 1T^S is an


extension of [A2 B2], and statement (c) follows.
Finally, statement (c) implies that for any extension 7, of [Al #J [as in
(6.2.6)] there is an extension T2 of [A2 B2] such that T2 = S^T^S with S as
in (6.2.3). This immediately implies equation (6.2.4). Thus (c)=>(a).

Corollary 6.2.2
Let [Al Z?j] and [A2 B2] be block-similar transformations with transform-
ing matrix S given by [6.2.3]. Then Mis an [A t B ^-invariant subspace if and
only if N~1M is an [A2 B2]-invariant subspace.

Proof. Assume that M is [Al Bv] invariant. By Theorem 6.1.1 there is


an extension Tl of [Al #J and a subspace & such that M = Py and
T,ycy. Since But also, using (6.2.3
P(S'iy) = N~lPy=N'lM. Hence [A2 fl2](S~V)C AT1^ and, by defin-
ition, N~}M is [A2 B2] invariant.
If we are given that N~1M is [A2 B2] invariant, it follows from T2 =
5~ 1 r i 5that^is[^l 2 5J invariant.

Corollary 6.2.3
If transformations [Al #J and [A2 B2] are block similar, they have the
same rank.

Proof. Let [A{ B{] and [A2 B2] be block similar. Then Theorem 6.2.1
implies that
Block Similarity 195

Writing G = FNM we see that

But it is easily verified that Im Im and s


rank[,4 2 B2] = rank[y4, £J.

By use of the characterizations of block-similar transformations de-


veloped in Theorem 6.2.1, it is easily verified that block similarity deter-
mines an equivalence relation on the class of all transformations from
<p" + <pm into <p". This immediately raises the problem of finding a canonical
form for representations of the transformations in the equivalence classes
determined by this relation. The rest of this section is devoted to the
derivation of such a form. It will, of course, be a generalization of (and so
be more complicated than) the Jordan normal form, which is associated with
similarity of transformations in the usual sense, and which appears herein as
Theorem 2.2.1.
Our argument will make use of the Kronecker canonical form for linear
matrix polynomials under strict equivalence, as developed in the appendix.
The following proposition is an important step in the argument. Note that it
is convenient to work with matrices here. The previous analysis applies, of
course, when they are viewed as transformations in the natural way.

Proposition 6.2.4
Let Al and A2 be n x n matrices and Bl and B2 be n x m matrices. Then
[Al B,J and [A2 B2] are block similar if and only if the linear matrix
polynomials [/A + A^ B\\ and [I\+ A2 B2] are strictly equivalent, that is,
there exist invertible matrices S and T such that

Proof. Assume that (6.2.7) holds and write

where Tn is n x n. Then

Hence 7\, =
= S~\
5 \and
and

Equation (6.2.7) also implies that


196 Invariant Subspaces for Transformations Between Different Spaces

It follows that r,2 = 0 and then that SfijT^ = B2. Combining this relation
with (6.2.5), it follows from Theorem 6.2.1 that [Al Bt] and [A2 B2] are
block-similar.
Conversely, suppose that the relations (6.2.5) hold for appropriate N, M
and F. Then (6.2.7) holds with 5 = AT1, Tn = N, T12 = 0, T2l = FN~r, and
T22 = M.

Now we are ready to state and prove a result giving a canonical form for
block-similar transformations and known as the Brunovsky canonical form.
In the statement of the theorem /*(A) will, as usual, denote the k x k
Jordan block with eigenvalue A.

Theorem 6.2.5
Given a transformation [A B]: <p" + <f""-^ <p", there is a block-similar
transformation [A0 B0] that (in some bases for <p" and <f"") has the
representation

for some integers kl > • • • ^ kp > 0 and all entries in BQ are zero except for
those in positions ( k { , 1), (fcj + k2, 2),. . ., (k} + • • • + kp, p), and these
exceptional entries are equal to one. Moreover, the matrices AQ and BQ
defined in this way are uniquely determined by [A B], apart from a permu-
tation of the blocks J^( A j ) , . . . , / / (A^) in (6.2.9).

Thus the pair of matrices A0, BQ or the block matrix [AQ B0] may be
seen as making up the Brunovsky canonical form for the transformation
\A B]. It will be convenient to call the matrix the
Kronecker part of A0 and the integers kl, . . . , kp the Kronecker indices of
[A B]. Similarly, we call the Jordan part of AQ and
/ j , . . . , lq the Jordan indices of {A B].

Proof. We use the terminology and results of the appendix to this book.
We may consider A and B to be n x n and n x m matrices, respectively.
Consider the linear matrix polynomial

of size n x (n + m). As the equation

has no nontrivial polynomial solution *(A), the minimal row indices of C(A)
Analysis of the Brunovsky Canonical Form 197

are absent. Further, the polynomial AC( A ') = [/ + \A, \B] obviously has
no elementary divisors at zero, so C(A) has no elementary divisors at
infinitv. Let k /:_ be the minimal column indices of C(A) and
' be the elementary divisors of C(A). Then
Theorem A.7.3 ensures that C(A) is strictly equivalent to the linear matrix
polynomial

where Lk is the k x (k + 1) matrix

and s — max Ae( p (rank C(A)) - n [and we have used the elementary fact that
— //(AO) and J,(-\0) are similar]. After a permutation of columns the
polynomial (6.2.10) becomes [/A + A() B0] with A0 and BQ as defined in the
statement of the theorem. The theorem itself now follows in view of
Proposition 6.2.4. D

6.3 ANALYSIS OF THE BRUNOVSKY CANONICAL FORM

We first draw attention to an important special case of Theorem 6.2.5. This


concerns transformations [A B]: <p m + n —> <p" in which the pair (A, B) is a
full-range pair in the sense defined in Section 2.8. That is, when

where p is the degree of a minimal polynomial for A.


The following lemma will be useful.

Lemma 6.3.1
Consider any transformations
0,1, 2 , . . . we have

Proof. The proof is by induction on s. When s = 0, equation (6.3.1) is


trivially true. Using a binomial expansion it is found that
198 Invariant Subspaces for Transformations Between Different Spaces

Hence

Assuming that the relation (6.3.1) holds when s = r - 1, this implies that the
right-hand side of (6.3.1) is contained in the left-hand side. But the opposite
inclusion follows from that already proved on replacing A by A — BF.

We now formulate other characterizations of full-range pairs (A, B).

Theorem 6.3.2
For a transformation [A B]: $m+n—> <p" the following statements are
equivalent: (a) the pair (A, B) is a full-range pair; (b) there is a full-range pair
( A } , B I ) for which [ A t £,] and [A B] are block-similar; (c) in the
Brunovsky form [AQ B0] for [A B], the matrix A0 has no Jordan part; (d)
the rank of the transformation [I\ + A B] does not depend on the complex
parameter A.

Proof. Consider statement (b). If [Al #,] and [A B] are block-similar,


then, by Theorem 6.2.1, there are invertible transformations N, M and a
transformation F such that

Thus From the definition of full-range pairs and


Lemma 6.3.1 it follows that (A, B) is a full-range pair. So (a) and (b) are
equivalent.
Now consider a canonical pair (AQ, B0) as defined in Theorem 6.2.5. It is
easily verified that such a pair is a full-range pair if and only if the Jordan
part of AQ is absent. Since [A B] is block-similar to a canonical pair
[A0 BQ] (by Theorem 6.2.5), the equivalence of (a) and (c) follows from the
equivalence of (a) and (b).
Consider condition (d). It follows from Corollary 6.2.3 that the rank of
[/A + A B] for any A £ <f is just that of [/A + A0 BQ] where [A0 BQ] is a
Brunovsky form for [A B]. A moment's examination of A0 and BQ convin-
Analysis of the Brunovsky Canonical Form 199

ces us that the rank of [/A + AQ B0] takes the same numerical value, except
at the points A = - A y , j = 1, . . . , q, where there is a reduction in rank. Thus
the rank of [/A + A B] is independent of A if and only if there is no Jordan
part in A0, and the equivalence of (c) and (d) is proved.

So far, the discussion of this section has focussed on cases in which the
matrix AQ of a canonical pair (^4 0 , BQ) has no Jordan part. This can be
described as the case q = 0 in equation (6.2.9). It is also possible that AQ has
no Kronecker part; the case p = 0 in equation (6.2.9). In this case BQ = 0 as
well. We return to this case in Section 6.6.
We conclude this section by showing that the Kronecker indices of the
Brunovsky form can be determined directly from geometric properties of
the transformation [A B] without resort to the computation of the minimal
column indices of [/A + A B].

Proposition 6.3.3
Let [A B] be a transformation from <p m+ " into <p" and define the sequence
d_i, dQ, dt, . . . by d_l = 0 and, for s = 0, 1, . . .

Then the Kronecker indices £,,. . . , kp of [A B] are determined by the


relations

Note that the sequence d_lt d0, . . . is ultimately constant and (if B ^0),
is initially strictly increasing (see Section 2.8).

Proof. Use Theorems 6.2.1 and 6.2.5 to write

where M and N are invertible and [A0 B0] is block similar to [A B]. Now
Lemma 6.3.1 implies

Consequently, the integers ds defined by formula (6.3.2) are invariant under


block similarity. Now formula (6.3.3) is easily verified for a canonical pair
A , B .
200 Invariant Subspaces for Transformations Between Different Spaces

Note that the number of Kronecker indices p is given by equation (6.3.3)


in the case s = 0. Thus

khjjkjkh8huhioytyyy[[piiiouiutiutugfjjugijugifuiugfiuiguigu ioug

In some special cases Theorem 6.2.5 can be used to describe explicitly all
[A B] -invariant subspaces. We consider a primitive but important "full-
range" case in this section.
Theorem 6.4.1
Let [A B] be a transformation from <p" + 1 into (p" for which (A, B) is a
full-range pair. Then there exists a basis / , , . . . , / „ in (p" such that every
m-dimensional [A B]-invariant subspace M ^ {0} admits the description:

where r } , . . . , r, are positive integers with rl + • • • + r, = m and A t , . . . , A,


are distinct complex numbers with the
understanding that 0! = 1 and that Conversly, every
subspace M C <p" of the form (6.4.1) is [A B] invariant.

Proof. Taking advantage of the equivalence (a)<=>(b) in Theorem 6.3.2,


we can assume that

Let M ^ {0} be an [A #]-invariant subspace. Then, by Theorem 6.1.1,


there exists a 1 x n matrix such that M is invariant for
the matrix
Description of [A BJ-In variant Subspaces 201

Let r,, . . . , r, be all the partial multiplicities of (^4 + BF)\M (so r{ + • • • +


r, = dim Jt), and let A , , . . . , A, be the corresponding eigenvalues. For every
A0 G <p the matrix A 0 7 —(A + BF) has a nonsingular ( n - l ) x ( n - l ) sub-
matrix (namely, that formed by the rows 1,2, . . . , n - l and columns
2, 3, . . . , n). It follows that dim Ker( A07 - (A + BF)) = I for every A0 e
cr(A). So there is exactly one Jordan block in the Jordan form of A + BF
corresponding to each A 0 E a(A + BF). Hence the same property holds for
(A + BF)\M, and the eigenvalues A 1 5 . . . , A, must be distinct. It follows that
in order to prove that M has the form (6.4.1), it will suffice to verify that for
any Jordan chain g, , . . . , gr of (A + BF)\M corresponding to A y we have

Observe first that

and consequently A y is a zero of the polynomial


of multiplicity at least r. Further, for t = 1, 2, . . . , r

(and the right-hand side is interpreted as zero for t = 1). Indeed, equality in
the 5th place (s = 1, . . . , n - 1) on both sides of (6.4.3) follows from the
easily verified combinatorial identity:

Equality in the nth place on both sides of (6.4.3) amounts to

or

but the left-hand side of this equation is just the (t - l)th derivative of the
polynomial evaluated at A y ; so equation (6.4.4),
and hence (6.4.3), is confirmed.
We have verified that the vectors
form a Jordan chain of A + BF corresponding to A ; . As the restriction
(A is unicellular, there exists a unique (A + BF)-
202 Invariant Subspaces for Transformations Between Different Spaces

invariant subspace in &tk(A + BF) of dimension r, and this subspace is


spanned by the vectors in any Jordan chain of (A + BF) of length r
corresponding to A y . So (6.4.2) follows.
Conversely, let M be given by (6.4.1) (with fk replaced by ek, k =
! , . . . , « ) . Let/(A) = A" - fln_1A"~1 — • • • — a0 be a polynomial such that A y
is a zero of /(A) of multiplicity of at least r , / = 1 , . . . , / . As we have seen
above, the vectors form a Jordan
chain of A + BF corresponding to A for
. So by Theorem 6.1.1, M is [A B] invariant.

The case /= m in Theorem 6.4.1 deserves special attention.

Corollary 6.4.2
Let [A B] be as in Theorem 6.4.1. Then there exists a basis f^, . . . , / „ m <p"
such that, for every m-tuple of distinct complex numbers A j , . . . , A m , the
m-dimensional subspace

is [A B] invariant.

This corollary shows that (at least in the case of a full-range pair
A: <p"—* <P" and B: <p-» <p") there are a lot of [A B]-invariant subspaces.
Indeed, Corollary 6.4.2 shows the existence of a family of [A B]-invariant
m-dimensional subspaces that depends on m complex parameters (namely,

For the general case of a full-range pair we have the following partial
description of [A /?]-invariant subspaces.

Theorem 6.4.3
Let (A, B) be a full-range pair with Kronecker indices /c, > • • • > /c r . Then
there exists a basis fn,. . . , /^ , / = 1,. . . , r in <p" such that for every r-tuple
of nonnegative integers lt,. . . , lr satisfying lt:^ kt, i = 1,. . . , r, and for every
collection } of complex numbers the subspace

is [A B] invariant.

The proof of Theorem 6.4.3 is obtained by combining Theorem 6.3.2 and


Corollary 6.4.2.
The Spectral Assignment Problem 203

jhyuuyuyuiuiyuiuyuiyiuioyuiouyiouiojjklhjjhkjkfhdjdffffffffaaasseeui

For a transformation A on <p" the eigenvalues are invariant under similarity


transformations. More generally, if A is defined by a transformation
[A B]: <p" + <pm-» (f"1, then, by Theorem 6.2.1, block similarity transforms
A into N~l(A + BF)N for some invertible N. Thus the eigenvalues of A are
no longer invariant, but are transformed to those of A + BF, where F
depends on the similarity. Now we ask, for given [A B], what are the
attainable eigenvalues of A + BF? We do not answer this question directly,
but we present solutions to two closely related problems.
First, suppose that we are given n complex numbers A } , . . . , An (possibly
with repetitions) that are candidates for the eigenvalues of A + BF. Under
what conditions on the transformation [A B] does a transformation
F: <p"-» <p" exist such that the numbers A ] 5 . . . , \n are just the eigenvalues
of A + BF, counting algebraic multiplicities? This is known as the spectral
assignment problem. It is important in its own right and is also relevant to
our discussion of the stability of [A B]-invariant subspaces.
Clearly, when B - 0, the problem is not generally solvable. Another
extreme case arises if B — I when it is easily seen that a solution can always
be found by using diagonable matrices F. We show first that the problem is
always solvable as long as (A, B) is a full-range pair.

Theorem 6.5.1
be a full-range pair of transformations. Then
for every n-tuple of complex numbers A \n there exists a transformation
F: <p"-» <pm such that A + BF has eigenvalues A , , . . . , \n.

Proof. With the use of Theorem 6.2.1 it is easily seen that we can
assume, without loss of generality, that A and B are in Brunovsky canonical
form. Furthermore, by Theorem 6.3.2, it follows that the Jordan part of A is
absent [see equation (6.2.9)]. So the Kronecker indices
satisfy the condition

be the scalar polynomial with zeros where


(and we define /0 = 0). Let

where F, is the m x ki matrix whose /th row is


and the other rows are zeros. Then
204 Invariant Subspaces for Transformations Between Different Spaces

where

is a kt x kt matrix for / = 1,. . . , p [the companion matrix of a,(-)]. It is well


known that the eigenvalues of At are exactly A, + 1 , . . . , A,. This proves the
theorem. D

The argument used in proving Theorem 6.5.1 can also be utilized to


obtain a full description of the solvable cases of the spectral assignment
problem. We omit the details of the proof.

Theorem 6.5.2
Let A: <p"—»<p" and B: <p m —» <p" be a pair of transformations, and let the
/ x / matrix J = / / ( A, ) © • • • © / / ( A^) be the Jordan part of the Brunovsky
form for [A B]. Then, given an n-tuple of (not necessarily distinct) complex
numbers /u,j,. . . , /n n , there exists a transformation F: <j7"-*<p'" such that
A + BF has eigenvalues yu,,, . . . , fj,n if and only if at least I numbers among
f i l t . . . , fj.n coincide with the eigenvalues of J (counting multiplicities).

We need another version of the spectral assignment problem, known as


the spectral shifting problem. Given a transformation [A B]: <p w + n —> <p"
and a nonempty set O C <p, when does there exist a transformation
F: <p"^> <pm such that cr(A + fiF)CH? When (A, B) is a full-range pair,
such an F always exists in view of Theorem 6.3.2. In general, the answer
depends on the relationship between the root subspaces of A and the
minimal /1-invariant subspace over Im B:

[known as the "controllable subspace" of the pair (A, B) in the systems


theory literature; see also Proposition 8.4.1]. Observe first that the subspace
( A | Im B) is the minimal ^-invariant subspace over Im B (see Theorem
2.8.4). In particular, (A \ Im B} is A invariant. Also, equation (6.3.1) can
be expressed in the form

for any transformation F: <p" —> (pm.


The Spectral Assignment Problem 205

Theorem 6.5.3
Given a nonempty set ftC <p and a transformation [A B\. <f"" +n —» <p", there
exists a transformation F: <p"~* 4-"" such that a(A + BF) C ft if and only if

for every eigenvalue A0 of A that does not belong to ft.

Recall that S2A (A) = Ker( A0/ - A)" is the root subspace of A correspond-
ing to the eigenvalue A0 and, by definition, £%A (A) = {0} if \0^a(A).
In the proof we use the following basic fact about induced transformations
in factor spaces. (Recall the definition of the induced transformation given
in Section 1.7.)

Lemma 6.5.4
Let X: be a transformation with an invariant subspace Z£, and let
be the induced transformation. Then for every A0 G <p we
have

where P: <f"-» <p"/^is the canonical transformation: Px = x + %, x E. <p". In


particular, every eigenvalue of X is also an eigenvalue of X.

Proof. Then for every with


we have

So p(X)x e £. Let <7(A) = n;r=1 ( A y - A)", where A , , . . . , A, are all the


eigenvalues of X different from A 0 . As p(\) and q(\) are polynomials with
no common zeros, there exist polynomials g(\) and h(\) such that
g(A)p(A) + /i(A)<?(A) = 1. (This is well known and is easily deduced from
the Euclidean algorithm.) Hence

Since we also havee On the other hand, the


Cayley-Hamilton theorem ensures that p(X)h(X)g(X)x = 0, that is, the
vector u = h(X)g(X)x belongs to 3ix(X). Now equation (6.5.4) implies
206 Invariant Subspaces for Transformations Between Different Spaces

We have proved the inclusion C in equality (6.5.3). The opposite inclusion


follows from the relation

for every vector y E <p".

Proof of Theorem 6.5.3. First consider a pair (A 0 , B0) in the Brunovsky


canonical form, as described in Theorem 6.2.5. Then

The condition <3lK (A0) C { A0 \ Im B0) for every A0 E <J7 --ft means that [in
the notation of equation (6.2.9)] A , , . . . , A^ Eft. It remains to apply
Theorem 6.5.2.
Now consider the general case, and let

where [AQ B0] is in Brunovsky canonical form. It is easily seen that there
exists a transformation Fl such that cr(A0 + Z? 0 F,)Cft if and only if there
exists an F2 with cr(/4 + #F2) Cft (indeed, one can take F2 = F0 +
Further, using equation (6.5.1), we have

and obviously, for any

So it remains to show that (6.5.2) holds if and only if

This is done by using Lemma 6.5.4. Denote by P: $"—> $"/(A Im B) the


canonical transformation

For a transformation X: (p n —»<p" with invariant subspace (A \ Im B), let

be the induced transformation. Using (6.5.1), we see that A and A + BFare


well defined. Further, for every
Some Dual Concepts 207

so A = A + BF0. Now, assuming that (6.5.5) holds, and in view of Lemma


6.5.4, we find that for every

Hence A similar argument shows that (6.5.2) implies


(6.5.5).

6.6 SOME DUAL CONCEPTS

The definitions and analysis of this chapter have primarily concerned trans-
formations [A B]: (J7" 4- <p m —> <p". Questions arise concerning analogs for
transformations : <p" -* <J7" 4- <pm. In this section we quickly review
some notions and results in this direction. Recall first that a subspace M of
<P" will be called invariant if and only if M^ is [A* C*] invariant.
Thus, with the characterization (d) of Theorem 6.1.1 for [A* C*]-invariant
subspaces, there is a transformation G* such that

if and only if M is invariant. Using Proposition 1.4.4, we see that this is


equivalent to

We include this discussion as part of the following statement.


Theorem 6.6.1
Let M be a subspace of <t7" and \ \ be a transformation from (p" into
Then the following are equivalent: (a) M is invariant;

(b)
(c) there is a transformation

Proof. It remains only to establish the equivalence of (a) and (b). This
is done by using the equivalence of statements (a) and (c) in Theorem 6.1.1.
Thus M \s\ \ invariant if and only if
208 Invariant Subspaces for Transformations Between Different Spaces

Now it is easily verified that, for subspaces % Y, and a transformation A,


the relations ATC^U and /l*^ 1 C < F 1 are equivalent. Thus equation
(6.6.4) is equivalent to

or

which is condition (6.6.2).

It is useful to have a terminology involving an arbitrary subspace in the


place of Ker C in (6.6.2). Thus, if A is a transformation on <p" and Tis a
subspace of <p", we say that a subspace M is A invariant intersect V, or A
invariant (int T), if A(M CiT)CM.
Through extension of the terminology of Section 2.8 for any given
subspace M, a subspace ^U that is A invariant (int T) is said to be minimal
over M if °lt D M and there is no other ^-invariant (int T") subspace that
contains M and is contained in aU.
Now consider a generalization of similarity for transformations from (t"1
to <p" 4- <pm. If is such a transformation, an extension of is a
transformation T on (p" 4- (pm of the form

Then we say that transformations rom (

are block similar if, given any extension 7\ of there is an extension


such that T, and T2 are similar. Comparing this with the
corresponding definition of Section 6.2, we see that this is equivalent to the
block similarity of [A* C*] and [A* C*]. We may thus apply Theorem
6.2.1 to obtain the following theorem.

Theorem 6.6.2
The transformations l
from <p" to <p m+ " are block similar if
and only if there exist invertible transformations N on <p" and M on <pOT, and a
transformation G: (p™-» (p" such that
Exercises 209

Once again, it is found that block similarity determines an equivalence


relation on all transformations from <p" into <p" 4- <pm. Furthermore, the
canonical forms in the corresponding equivalence follow immediately from
the Brunovsky form of Theorem 6.2.5 by duality.
Theorem 6.6.3
Given a transformation there is a block-similar trans-
formation that (in some bases for <£"" and <p") has the representation

for some integers kl ^ k2 > • • • S: kp, and al entries in C0 are zero except for
those in positions (1,1), (2, k{ + 1),. . . , ( p , kl + • • • + kp_l + 1), and those
exceptional entries are equal to one. Moreover, the matrices A0 and CQ
defined in this way are uniquely determined by A and C, apart from a
permutation of the blocks J, ( A , ) , . . . , / / (A^) in equation (6.6.6).

The case of full-range pairs (A, B), which was one of our concerns in
Section 6.3, is now replaced by the dual case in which (C, A) is a null kernel
pair (see the definition in Section 2.7 and Theorem 2.8.2). The dual of
Theorem 6.3.2 is now as follows.
Theorem 6.6.4
m
For a transformation the following statements are equi-
valent: (a) the pair (A, C) is a null kernel pair; (b) there is a null kernel pair
l
(y4j, C,) for which \ re block similar; (c) in the Brunovsky

form e matrix AA0 has


has no Jordan part; (d) the rank of th

transformation does not depend on the complex parameter A.

jgughgghjghjghghghghhhh

6.1 Let A: <p"—»• <p" be a transformation. A chain

of subspaces in (p" will be called almost A invariant if AM{ C Mi +i,


i-l,...,k-l. Show that the chain (1) is almost A invariant if and
only if A has the block matrix form A = [-Aj/lfJJi with Aif = 0 for
/ - / > ! , with respect to the direct sum decomposition
•' • + &k + i, where ^ is a direct complement to Mt_^ in Mi (by
definition, M0 = {0} and
210 Invariant Subspaces for Transformations Between Different Spaces

6.2 Prove that every transformation A: $"-^> $" has an almost A-


invariant chain

consisting of n + I distinct subspaces, where Ml is any given one-


dimensional subspace. (Hint: For a given ./^ = Span{jt),
jt^O, put M2 = Span{jt, Ax},. . . , Mk = Span (AT, Ax,. . . , Ak~lx},
where k is the least positive integer such that the vectors
x, Ax,. . . , Akx are linearly dependent. Use the preceding exercise.)
6.3 A block matrix A = [Aij]^j:=l is called tridiagonal if v4/7 = 0 for
\i - j\ > 1. Show that a transformation A has tridiagonal block matrix
form with respect to a direct sum decomposition <p" = 2£{ + • • • + 2£p if
and only if the chains

and

are almost A invariant.


6.4 be a self-adjoint transformation. Prove that for any
with norm 1 there exists an orthonormal basis
1
such that the chains

and

are almost A invariant (so A has a tridiagonal form with respect to the
basis JCj, . . . , jc n ). [Hint: Apply Gram-Schmidt orthogonaliza-
tion to a basis *,, y2,. . . , yn in <p" such that the chain

is almost A invariant (Exercise 6.2) and use the self-adjointness


of A.]
6.5 Let A: (p"-» <p" and B: fm-> (pn be transformations.
(a) Show that

for every A G (p with the possible exception of not more than


n - (dim Im B) points.
(b) Show that if equation (2) holds at k eigenvalues of A (counting
multiplicities) then for every £-tuple /i,,. . . , pk there exists a
transformation F: (£n-^ <pm such that /*,, . . . , fik e a(A + BF).
Exercises 211

6.6 State and prove the analogs of Exercises 6.5 (a) and (b) for a pair
of transformations
6.7 Let (A, B) be a full-range pair of transformations. Show that for any
F the transformation A + BF has not more than dim Im B Jordan
blocks corresponding to each eigenvalue in its Jordan form.
6.8 Let

(a) Show that (A, B) is a full-range pair.


(b) Find matrices N, M and F, where N and M are invertible, such
that the pair N~l(A + BF)N, N'1BM is in the Brunovsky
canonical form.
(c) Find G such that A + BG has the eigenvalues 0, 2, -1.
6.9 Let A: <p" -» <p" be a transformation, and let x E <p" be a cyclic vector
in <p" for /I (i.e., <p" = Span{*, A*, /I 2 jc,. . .}). Show that for any
n-tuple of not necessarily distinct complex numbers A n . . . , \n there
exists a transformation B: <p"—»(p" with Im B C Span{*} such that
A + B has the eigenvalues A , , . . . , A w .
6.10 Let A: <p" -> (p" be a transformation, and let M C <p" be a subspace such
that <p" is the minimal ^-invariant subspace that contains M. Show
that for n-tuple A j , . . . , \n of not necessarily distinct complex numbers
there exists a transformation B: <p"'—» <p"withlm J? C ^ such that A + B
has eigenvalues
6.11 Let

(a) Show that (C, A) is a null kernel pair.


(b) Find matrices N, M and F, where N and M are invertible such
that MCN~\ N(A + FC)N~l are in the canonical form as
described in Theorem 6.6.3.
(c) Find G such that A + GC has the eigenvalues 1, —1,0.
6.12 Let A: <p"-» <p" be a transformation, and let M C <p" be a subspace
such that {0} is the maximal >4-invariant subspace in M. Prove that
for any n-tuple of not necessarily distinct complex numbers
A,, . . . , An there exists a transformation C: (p"—> <p" with Ker C CM
such that A + C has eigenvalues
Chapter Seven

Rational Matrix
Functions

In this chapter we study r x n matrices VV( A) whose elements are rational


functions of a complex variable A. Thus we may write

where p (; (A) and <7,;(A) are scalar polynomials and <7,7(A) are not identically
zero. Such functions W(\) are called rational matrix functions.
We focus on problems for rational matrix functions in which different
types of invariant subspaces and triinvariant decompositions play a decisive
role. All these problems are motivated mostly by linear systems theory, and
their solutions are used in Chapter 8. The problems we have in mind are the
following: (1) the realization problem, which concerns representations of a
rational matrix function in the form D + C(A7 - A)~1B with constant
matrices A, B, C, D; (2) the problem of minimal factorization; and (3) the
problem of linear fractional decomposition.

7.1 REALIZATIONS OF RATIONAL MATRIX FUNCTIONS

Let W(\) be an r x n rational matrix function. We assume that W(\) is finite


at infinity; that is, in each entry p,7( A) /<7,7( A) of W(A) the degree of the
polynomial p /7 (A) is less than or equal to the degree of ^-(A).
A realization of the rational matrix function W( A) is a representation of
the form

where A, B, C, D are matrices of sizes m x m, m x n, r x m, r x n,

212
Realizations of Rational Matrix Functions 213

respectively. Observe that Hm [To verify this, assume


that A is in the Jordan form and use formula (1.9.5).] So if there exists a
realization (7.1.1), then necessarily D = W(o°). We may thus identify such a
realization with the triple (A, B, C). The following lemma is useful in the
proof of existence of a realization.

Lemma 7.1.1
Let H(\) = EjlJ \'Hj and L(A) = A7/ -I- Ej Let H(\) = EjlJ \'Hj and L(A) = A7/ -I- Ej Let H(\) = EjlJ \'Hj and L(A) = A7/ -I- Ej
polynomials, respectively. Put

Then

Proof. We know already (see Section 5.1) that for

where Q = [I 0 ••• 0]. We may define C,(A),. . . , C,(A) for all


by

From equation (7.1.2) we see that

the special form of A yields

It follows that C( A/ - It follows that C( A


is complete.

Theorem 7.1.2
Every r x n rational matrix function that is finite at infinity has a realization.

Proof. Let W( A) be an r x n rational matrix function with finite value at


infinity. There exists a monic scalar polynomial /(A) such that /(A)W(A) is a
214 Rational Matrix Functions

(matrix) polynomial. For instance, take /(A) to be a least common multiple


of the denominators of entries in W(\). Put //(A) = /(A)(W(A) - W(oo)).
Then //(A) is an r x n matrix polynomial. Clearly, L(A) = /(A)/ n is monic
and W(A) = JV(o>) + //(A)L( A)'1. Further

So the degree of //(A) is strictly less than the degree of L(A). We can apply
Lemma 7.1.1 to find A, B, C for which

This is a realization of W(\).

A realization for W(\) is far from being unique. This can be seen from
our construction of a realization because there are many choices for /(A). In
general, if (A, B, C) is a realization of W(A), then so is (^4, B, C), where

for any matrices At/, Bv, and Cl with suitable sizes (in other words, the
matrices A, B, C are of size s x s, s x n, r x 5, respectively, and partitioned
with respect to the orthogonal sum ( where m is the size
of A; for instance, A}3 is a p x q matrix). Indeed, for every \$£cr(A) we
have

and thus

Among all the realizations of W( A) those with the properties that (C, A)
is a null kernel pair and (A, B) is a full-range pair will be of special interest.
That is, for which

The next result shows that any realization "contains" a realization with
Realizations of Rational Matrix Functions 215

those properties. To make this precise it is convenient to introduce another


definition. Let (A, B, C) be a realization of W(A), and let m x m be the size
of A. Given a triinvariant decomposition <pm = t£ + M + JV associated with
an /4-semiinvariant subspace M (so that the subspaces !£ and !£ + M are A
invariant) with the property that C\y — 0 and Im B C !£ + M, a realization
(PMA\ M, PMB, C\M), where PM: (p"1'—>• M is a projector on M with
K e r P ^ D ^ , is called a reduction of (A, B, C). Note that
(PMA\M, PMB, C\M} is again a realization for the same W(\). [See the
proof that (7.1.3) is a realization of W(\) if (A, B, C) is.] We shall also say
that (A, B, C) is a dilation of (PMA\M, PMB, C\M) (in a natural extension of
the terminology introduced at the end of Section 4.1).

Theorem 7.1.3
Any realization (A., fi, C) of W(\) is the dilation of a realization (A^, B0,
C0) of W(\) with null kernel pair (C0, A0) and full-range pair (Att, #0).

Proof. and be

the maximal /1-invariant subspace in Ker C and the minimal ^-invariant


subspace over Im B, respectively.
Put £=jtC(C, A), let M be a direct complement of %n/(A,B) in
#(A, B), and choose M so that

and we recall that m is the size of A. Let us verify that equality (7.1.5) is a
triivariant decomposition associated with an /l-semiinvariant subspace M,
and that the realization

where P: (p M —» M is the projector on M along !£ + JV", is a reduction of


(./4, B, C), and has the required properties. Indeed, !£ and X + M =
3£(C, A) + $(A, B) are ^4-invariant subspaces, so (7.1.5) is indeed a tri-
invariant decomposition. Further, C\<g — 0 and

so (7.1.6) is a reduction of (A, B, C). It remains only to prove that the


realization (7.1.6) of W(\) has the null kernel and full-range properties.
Indeed

So
216 Rational Matrix Functions

Also

Hence

because by construction

It turns out that a realization (^4, B, C) for which conditions (7.1.4) are
satisfied is essentially unique. To state this result precisely and to prove it,
we need some observations concerning one-sided invertibility of matrices.
By Theorems 2.7.3 and 2.8.4 we have

where p is any integer not smaller than the degree of the minimal polyno-
mial for A. Hence there exists a left inverse [col[C47]p0']~L. Thus

Also, there exists a right inverse

Note that in general the left and right inverses involved are not unique.

Theorem 7.1.4
Let (/Ij, fij, Q) and (A2,B2,C2) be realizations for a rational matrix
function W(\) for which (C,, A,) and (C2, A2) are null kernel pairs and
(A{,B}), (A2, B2) are full-range pairs. Then the sizes of A^ and A2
coincide, and there exists a nonsingular matrix S such that

Moreover, the matrix S is unique and is given by


Realizations of Rational Matrix Functions 217

Here p is any integer greater than or equal to the maximum of the degrees of
minimal polynomials for Al and A2, and the superscript —L (resp. — R)
indicates left (resp. right) inverse.

Proof. We have

For |A| >max{||y4 1 ||, ||^42||} the matrices \I-Al and A/ - A2 are non-
singular and for / = 1, 2

Consequently, we have

for any A with | A| > max{ || /i JUI^II}- Comparing coefficients, we see that
ClA\Bl = C2A'2B2, j = 0, 1, . . . . This implies ftjAj = H 2 A 2 , where, for k =
1 , 2 we write

Premultiplying by a left inverse of H2 and postmultiplying by a right inverse


of A 2 , we find that the second equality in (7.1.8) holds. Now define 5 as in
(7.1.8). Let us check first that S is (two-sided) invertible. Indeed, we can
verify the relations

Since we have
/. Similarly, one checks that Because S is
invertible, the sizes of A^ and A2 must coincide
It remains to check equations (7.1.7). Write

Premultiply by H2 L and postmultiply by A 2 * to obtain A2S = SA}. Now

and
218 Rational Matrix Functions

Theorems 7.1.3 and 7.1.4 allow us to deduce the following important


fact.

Theorem 7.1.5
In a realization (A, B, C) of W(\), (C, A) and (A, B) are null kernel pairs
and full-range pairs, respectively, if and only if the size of A is minimal
among all possible realizations of W( A).

Proof. Assume that the size m of A is minimal. By Theorem 7.1.3,


there is a reduction (A1, B', C') of (A, B, C) that is a realization for W(\)
and satisfies conditions (7.1.4). But because of the minimality of m the
realizations (A', B', C") and (A, B, C) must be similar, and this implies that
(A, B, C) also satisfies condition (7.1.4).
Conversely, assume that (A, B, C) satisfies conditions (7.1.4). Arguing
by contradiction, suppose that there is a realization (A', B', C') with A' of
smaller size than A. By Theorem 7.1.3, there is a reduction (A", B", C") of
(A', B', C') that satisfies conditions (7.1.4). But then the size of A" is
smaller than that of A, which contradicts Theorem 7.1.4. D

Realizations of the kind described in this theorem are, naturally, called


minimal realizations of W(\). That is, they are those realizations for which
the dimension of the space on which A acts is as small as possible.

7.2 PARTIAL MULTIPLICITIES AND MULTIPLICATION

In this section we study multiplication and partial multiplicities of rational


matrix functions. To facilitate the presentation, it is assumed that the
functions take values in the square matrices and that the determinant
function is not identically zero.
Let W( A) by an n x n rational matrix function with det W( A) 7^0. In a
neighbourhood of each point A0 G <p the function W( A) admits the repre-
sentation, called the local Smith form of W(\), at A 0 :

where £\(A) and E2(\) are rational matrix functions that are defined and
invertible at A 0 , and i>,,. . . , vn are integers. Indeed, for matrix polynomials
equation (7.2.1) follows from Theorem A.3.4 in the appendix. In the
general case write W(\) = p(A)~ l VV(A), where W(\) and p(\) are matrix
and scalar polynomials, respectively. Since we have a representation (7.2.1)
for W(A), it immediately follows that a similar representation holds for
W(A).
The integers i/,, . . . , vn in (7.2.1) are uniquely determined by W(A) and
Partial Multiplicities and Multiplication 219

A0 up to permutation and do not depend on the particular choice of the local


Smith form (7.2.1). To see this, assume that j/, < • • • < f n , and define the
multiplicity of a scalar rational function g( A) ^0 at A0 as the integer v such
that the function g(\)(X - \0)~" is analytic and nonzero at A 0 . Then, using
the Cauchy-Binet formula (Theorem A.2.1 in the appendix), we see that
i>j + • • • + i>. is the minimal multiplicity at A0 of the not identically zero
minors of size / x i of W(A), / = 1,. . . , n. Thus the numbers ^ + • • • + v{,
/ = ! , . . . , « , and, consequently, i>,,. . . , vn are uniquely determined by
W(\).
The integers i/,,. . . , vn from the local Smith form (7.2.1) of W(\) are
called the partial multiplicities of W(\) at A () .
Note that A 0 E (f1 is a pole of W(\) [i.e., a pole of at least one entry in
W(\)] if and only if W(\) has a negative partial multiplicity at A 0 . Indeed,
the minimal partial multiplicity of W(\) at A0 coincides with the minimal
multiplicity at A0 of the not identically zero entries of W( A). Also, A0 G <p is
a zero of W(A) [by definition, this means that A() is a pole of W^A)" 1 ] if and
only if W(\) has a positive partial multiplicity. In particular, for every
A0 G <J7, except for a finite number of points, all partial multiplicities are
zeros.
There is a close relationship between the partial multiplicities of W(\)
and the minimal realization of W(A). Namely, let W(\) be a rational n x n
matrix function with determinant not identically zero. Let

be the Laurent series of W( A) at infinity (here q is some nonnegative integer


and the coefficients Wt are n x n matrices): write U(\) — £J=0 A'W. for the
polynomial part of W(\). Thus W(\) - t/(A) takes the value 0 at infinity,
and we may write

where C( A/ - A) 1B is a minimal realization of the rational matrix function


W( A) - L^A). We say that (7.2.3) is a minimal realization of W(A). We see
later (Theorem 7.2.3) that A 0 e <p is a pole of W(\) if and only if A0 is an
eigenvalue of A. Moreover, for a fixed pole of W( A) the number of negative
partial multiplicities of W(A) at A0 coincides with the number of Jordan
blocks with eigenvalue A0 in the Jordan normal form of A , and the absolute
values of these partial multiplicities coincide with the sizes of these Jordan
blocks. A similar statement holds for the zeros of W(\).
An analytic n-dimensional vector function
220 Rational Matrix Functions

defined on a neighbourhood of A0 G <p is said to be a null function of a


rational matrix function W(A) at A0 if «A 0 ^0> W(A)i/f(A) is analytic in a
neighbourhood of A 0 , and [W(A)^(A)] A = A =0. The multiplicity of A0 as a
zero of the vector function W(A)^(A) is the order of «H A), and $Q is the null
vector of «/f(A). From this definition it follows immediately that for n x n
matrix-valued functions U( A) and V( A) that are rational and invertible in a
neighbourhood of A 0 , «/r(A) is a null function of V(\)W(\)U(\) at A0 of
order k if and only if U(\)tff( A) is a null function of W( A) at A0 of order k. A
set of null functions <A,(A), . . . , «/^( A) of W( A) at A0 with orders A;,, . . . , kp,
respectively, is said to be canonical if the null vectors ^(AQ), . . . , &p(\0)
are linearly independent and the sum kl + k2 + • • • + kp is maximal among
all sets of null functions with linearly independent null vectors.

Proposition 7.2.1
Let W(\) be as defined above and «/f,(A), . . . , «A p (A) be a canonical set of
null functions of W(\) (resp. W(\)~l) at A 0 . Then the number p is the
number of positive (resp. negative} partial multiplicities of W(\) at A 0 , and
the corresponding orders £,, . . . , kp are the positive (resp. absolute values of
the negative) partial multiplicities of W(\) at A0.

Proof. Briefly, reduce W(\) to local Smith form as described above and
apply the observation made in the paragraph preceding Proposition
7.2.1.

Now we fix an n x n rational matrix function W(\) with det W(\)^Q.


Let

be its minimal realization, and fix an eigenvalue A0 of A. Replacing (7.2.4),


if necessary, by a similar realization, we can assume that

where o-(Ap) — { A0} and \0^o-(A'p). Note also that if A0 is a pole of W(\),
then equation (7.2.4) implies that A0 is an eigenvalue of A.

Proposition 7.2.2
Let W( A), Ap , and Bp be defined as above. Let A0 be a pole of W( A), let $( A)
be a null function of W(\)~l at A0 of order k, and let <?„ be the coefficients of
Partial Multiplicities and Multiplication 221

Then

is a Jordan chain for A at A 0 . Conversely, if A0 is an eigenvalue of A and


x(), . . . , xk_l is a Jordan chain of A at A 0 , there is a null function «A(A) of
W( A)" 1 at A0 with order not less than k for which (7.2.6) holds [in particular,
A0 is a pole of W(\)].

Note that as <r(Ap) - {A 0 }, the series in (7.2.6) is actually finite.

Proof. By definition, vectors (7.2.6) form the Jordan chain for Ap at A()
if

The last k — 1 statements follow immediately from (7.2.6). Also

Now the Laurent series for W( A) at A 0 , say, W( A) = EJL_ 9 ( A - A 0 ) ; W,, has


the following coefficients of negative powers of (A — A 0 ):

and it is easily seen that q is the least positive integer for which
(Ap — A0/)* = 0. (One checks this by passing to the Jordan form of Ap.)
Now recall that «A(A) = W(A)<p(A) is analytic near A0; so equating coeffi-
cients of negative powers of (A — A 0 ) to zero and using the fact that
(AD - AO/)* = 0, we obtain for ; = 1, 2,. . .
224 Rational Matrix Functions

Since n*L 0 Ker CA' = {0}, it follows that nJL 0 Ker CpA'p = {0} or, what is
the same, that col[C A' ]'ld is left invertible for some integer r. As

the matrix co\[Cp(Ap - A 0 /)'], r = d is left invertible as well, and since (Ap -
A 0 /)' = 0 for s > q, we obtain the left invertibility of co\[Cp(Ap-
\J)' ']/=]• K now follows that (Ap - A O /)JC O = 0 as required. Finally, since
«/>( A () ) = JC G , it is also true that x(} 7^0. Thus, as asserted, equations (7.2.6) do
associate a Jordan chain for Ap with the null function «KA).
Conversely, let jc(), j c , , . . . , xk_l be a Jordan chain of Ap at A 0 . From the
definition of a minimal realization it follows that the matrix

is right invertible for some integer m. Consequently, there exist vectors


<pk, <pk +l,. . . , with only finitely many nonzero, such that

The definition of a Jordan chain includes (Ap - A 0 /)jt y = xf_} for / =


1, 2, . . . , k - 1, and so equations (7.2.6) follow immediately from (7.2.7). It
remains only to check that W(A)(p(A) is now a null function of W(X)~l at A 0 ,
where <p( A) = EJL* ( A - A 0 ) Vy-
Observe first that *0 ^ 0 and that Cpj4j,*0 = A y C p Jc 0 for y = 0,1, 2 , . . . . As
the matrix co\[Cp(Ap - A0/y]y"L0' is left invertible for some integer m, so is
col[C A']"1^, and it follows that C jc 0 ^0. But using (7.2.6), we obtain

If the Jordan chain jc () ,. . . , xk_l of Ap at A0 cannot be prolonged, then


xk_l^lm(Ap - A 0 7), and it follows from (7.2.7) that <pk ^0. Thus a maxi-
mal Jordan chain of length k determines, by means of (7.2.6), an associated
null function ^(A) of W(X)~' of order k.
Propositions 7.2.1 and 7.2.2 prove the following result. [The second part
of Theorem 7.2.3 concerning zeros of W(A) is obtained by applying the first
part to W(\)~1.]
Partial Multiplicities and Multiplication 223

Theorem 7.2.3
Let W(\) be a rational n x n matrix function with del JV(A)^0, and let its
minimal realization be given by equation (7.2.4). A complex number A0 is a
pole of JV( A) if and only if A0 is an eigenvalue of A, and then the absolute
values of negative partial multiplicities of W( A) at A0 coincide with the sizes of
Jordan blocks with eigenvalue A0 in the Jordan form of A, that is, with the
partial multiplicities of A0 as an eigenvalue of A.
A complex number A0 is a zero of W(\) if and only if A0 is an eigenvalue
of A i , where Al is taken from a minimal realization for W(\)~1:

with matrix polynomial V(\). In this case the positive partial multiplicities of
W( A) at A() coincide with the partial multiplicities of A0 as an eigenvalue ofA}.

Now we apply Theorem 7.2.3 to study the partial multiplicities of a


product of two rational matrix functions. Let W,(A) and W2(\) be rational
n x n matrix functions with realizations

for / = 1 and 2. [Of course, the existence of realizations (7.2.8) presumes


that W,(A) and W2(\) are finite at infinity.] Then the product Wt(\)W2(\)
has a realization

Indeed, the following formula is easily verified by multiplication:

so the right-hand side of (7.2.9) is equal to


224 Rational Matrix Functions

So formula (7.2.9) produces a realization for the product W}W2 in terms


of the realizations for each factor. Easy examples show that (7.2.9) is not
necessarily minimal even if the realizations (7.2.8) are minimal. See the
following example, for instance.

EXAMPLE 2.1. Let

Minimal realizations for W,(A), / = 1,2 are not difficult to obtain:

Formula (7.2.9) gives

which is a realization of the rational matrix function /, but not a minimal


one. More generally, if W 2 (A) = W{(\y\ then the realization (7.2.9) is not
minimal [unless VKj(A) is a constant].

Let W(\) be an n x n rational matrix function with determinant not


identically zero. For A 0 €E <p, denote by ir(W; A0) = {7r;.}°°=1 the nonincreas-
ing sequence of absolute values of negative partial multiplicities of W( A) at
A 0 . This means that TTI ^ 7r2 ^ • • • are nonnegative integers with only a finite
number of them nonzero (say, Trk > ?rk + l = 0), and — TT{, — Tr2, . . . ,—irk are
the negative partial multiplicities of W(\) at A 0 .
Consider nonincreasing sequences a = {ay-}*=1 and /8 = (ftj}^=l of non-
negative integers such that only finitely many of them are nonzero, and
recall the definition of the set F(a, j8) given in Section 4.4.

Theorem 7.2.4
Let Wj( A) and W 2 (A) be n x n rational matrix functions with determinant not
identically zero and that take finite value at infinity. Then for every A0 E <p
and j = 1,2, . . . we take iry =s 5y, where {^JJl, = tr(WlW2; A 0 ) and {Sf}J=l is
some sequence from V^lW^, A 0 ), 7r(W2; A0)). //, in addition, W t (A) and
W2( A) arfmtY minimal realizations (7.2.8) /or nTi/c/i r/ze realization (7.2.9) o/
Wj(A)W 2 (A) w minimal as well, then actually

Proof. Let ^(A) and W 2 (A) have minimal realizations as in equation


(7.2.8). Using Theorem 7.2.3 and the definition of r(ir(Wi; A 0 ),
Minimal Factorizations of Rational Matrix Functions 225

7r(W 2 ; A 0 )), we see that the nonincreasing sequence 8 - (5,-}JLi of partial


multiplicities of the matrix

belongs to Y(TT(W}; A 0 ), 7r(W2; A 0 )). Now (7.2.9) is a realization (not


necessarily minimal) of Wj(A)W 2 (A). Theorem 7.1.2 shows that there is a
restriction to some ,4 -semii variant sub-
space such that the realization

is minimal. Then {7r;-}JL, is the sequence of partial multiplicities of A0 at A 0 .


But as A0 is a restriction of A to M, we have 7ry ^ 5y, for y = 1, 2, . . . (see
Section 4.1).

The assumption that both H^(A) and W 2 (A) take finite values at infinity is
not essential in Theorem 7.2.4. However, we do not pursue this generaliz-
ation.
The condition that the realization (7.2.9) is minimal for some minimal
realizations (7.2.8) is important in the theory of rational matrix functions
and in the theory of linear systems. It leads to the notion of minimal
factorization and is studied in detail in the following sections.

7.3 MINIMAL FACTORIZATIONS OF RATIONAL MATRIX


FUNCTIONS
In this section we describe the minimal factorizations of a rational matrix
function in terms of certain invariant subspaces. To make the presentation
more transparent, we restrict ourselves to the case when the rational matrix
functions involved are n x n and have value / at infinity. (The same analysis
applies to the case when the matrix function has invertible value at infinity.)
We start with a definition. The McMillan degree of a rational n x n matrix
function W( A) [with W(°°) = I], denoted 8(W), is the size of the matrix A in
a minimal realization

It is easily verified that

where Moreover, if realization is minimal, so is


226 Rational Matrix Functions

equation (7.3.2). Indeed, equation (6.3.1) shows that the pair (A - BC, B)
is a full-range pair [because (A, B} is so]. Further, (C, A) is a null kernel
pair, or, equivalently, (A*, C*) is a full-range pair. By the same argument,
the pair (A* - C*#*, C*) is also a full-range pair. Hence (C, A - BC) is a
null kernel pair, and therefore realization (7.3.2) is minimal. In particular,

Consider the factorization

where, for ; = 1, . . . , / ? , Wj(\) are n x n rational matrix functions with


minimal realizations

Formula (7.2.9) applied several times yields a realization for W(\):

This realization is not neceswsaruly minimal, so we have (in view of Theorem


7.12)

We say that the factorization (7.3.3) is minimal if actually 8(W) = 8(Wl) +


• • • + 8(Wp), that is, realization (7.3.4) is minimal as well. In informal
terms, minimality of (7.3.3) means that zero-pole cancellation does not
occur between the factors Wj(\). Because the McMillan degrees of a
rational matrix function (with value / at infinity) and of its inverse are the
same, (7.3.3) is minimal if and only if the corresponding factorization for
the inverse matrix function

is minimal.
Let us focus on minimal factorizations (7.3.3) with three factors (p = 3).
A description of all such factorizations in terms of certain triinvariant
decompositions associated with A-semiinvariant subspaces is given. Here A
is taken from a minimal realization W(\) = I + C(A7- A)~1B. Write /4 X =
A - BC, and let A and Ax be of size m.
Minimal Factorizations of Rational Matrix Functions 227

We say that a direct sum decomposition

is a supporting triinvariant decomposition for W(\) if (7.3.5) is a triinvariant


decomposition associated with an ,4-semiinvariant subspace M (so !£ and
Z£ + M are A invariant) and at the same time M is Ax semiinvariant with
associated triinvariant decomposition <pm = N + M 4-^ (i.e., N and jV 4- M
are A* invariant). Note that a supporting triinvariant decomposition for
W(\) depends on the choice of minimal realization. We assume, however,
that the minimal realization of W(\) is fixed and thereby suppresses the
dependence of supporting triinvariant decompositions on this choice. (In
view of Theorems 7.1.4 and 7.1.5, there is no loss of generality in making
this assumption.)
The role of supporting triinvariant decompositions in the minimal fac
torization problem is revealed in the next theorem.

Theorem 7.3.1
Let (7.3.5) be a supporting triinvariant decomposition for W(X). Then W( A)
admits a minimal factorization

where TT^ is the projector on £ along M + Jf, and TTM and TTV are defined
similarly.
Conversely, for every minimal factorization W(\) = Wl(\)W2(\)W3(\)
where the factors are rational matrix functions with value I at infinity there
exists a unique supporting triinvariant decomposition <pm = !£ + M + N such
that

Note that the second equality in (7.3.6) follows from the relations
TT^ATT•<£ = ATT<£ and ir^Air^ = Tr^A, which express the A in variance of !£
and !£ 4- M, respectively (see Section 1.5).
228 Rational Matrix Functions

Proof. With respect to the direct sum decomposition (7.3.5), write

Note, in particular, that the triangular form of A* implies A12 = B^C2,


/413 = #iC 3 , and A23 = B2C3. Applying formula (7.2.9) twice, we now see
that the product on the right-hand side of (7.3.6) is indeed W(A). Further,
denoting for #=.$?, M, or Jf, we
obviously have 8( Wx) ^ dim 5f. Hence

Since, by definition, ra = 8(W), it follows that

and the factorization (7.3.6) is minimal.


Next assume that W= W{W2W3 is a minimal factorization of W, and for
i = l , 2 , 3 let

be a minimal realization of W,(A). By the multiplication formula (7.2.9)

where

Note that

As the factorization W= W1W2W3 is minimal, the realization (7.3.8) is


Minimal Factorizations of Rational Matrix Functions 229

minimal. Hence, by Theorem 7.1.4, for some invertible matrix S we have

To satisfy (7.3.7), put 2= S&, M = SM, and Jf = SJf, where

and A. has size p, for / = 1, 2, 3.


It remains to prove the uniqueness of j£, ./#, and JV. Assume that
<pm = ,2" + M ' + JV' is also a supporting triin variant decomposition such that

As the realizations (7.3.7) and (7.3.9) are minimal (see the first part of the
proof), there exist invertible transformations T%:!£'-* t£, TM:M'-*M,
Tjfi jV'-*JVsuch that

Therefore, the invertible transformation ( <pm defined y


for 3( — Z£, M, N is a similarity between the minimal realization
and itself:

Because of the uniqueness of such a similarity (Theorem 7.1.4), we must


have

Using formula (7.3.2), we can rewrite the minimal factorization (7.3.6) in


terms of the minimal factorization of the inverse matrix function:
230 Rational Matrix Functions

where the second equality follows from ir^A*^- = AXTTJV- and Tr^A*TT^ =
TTyAx, expressing the Ax invariance of Ji and M + Jf.
An important particular case of Theorem 7.3.1 appears when N — {0} in
the supporting triinvariant decomposition (7.3.5). This corresponds to the
minimal factorization of W(\) into the product of two factors, as follows.

Corollary 7.3.2
Let 1£ and M be subspaces in <p"" that are direct complements of each other.
Assume that ^ is A invariant and M is A* invariant. Then W(\) admits a
minimal factorization

where ir^ is the projector on 3?along M. Conversely, if W(\) = Wl(\)W2( A)


is a minimal factorization with then there exists a
unique direct sum decomposition <pm = $£ 4- M, where 5£ is A invariant,
M is A* invariant, and such that

000000000000000

Let us illustrate the description of minimal factorizations obtained in


Theorem 7.3.1. The rational matrix function

has a realization

where

This realization is minimal. Indeed, the matrix

has rank 3 and hence zero kernel. The matrix


Example 231

has rank 3, and hence its image is <p3. Further

Let us find all invariant subspaces for A and Ax. It is easy to see that
(1,1,0) is an eigenvector of A corresponding to the eigenvalue 1, whereas
the vectors (0,0,1), (0,1,0} are the eigenvectors of A corresponding to
the eigenvalue 0. Hence all one-dimensional ^4-invariant subspaces are of
the form Span . Al
two-dimensional /4-invariant subspaces are of the form

Passing to Ax, wefin*dthat A* has three eigenvalues -1, y = | ( l + /V3),


and y with corresponding eigenvectors (1,2,3), ( l , y , 0), and ( l , y , 0),
respectively. There are three one-dimensional Ax-invariant subspaces
Span{(l,2,3}}, Span{(l, y, 0)}, Span{(l, y, 0)}, and three two-dimen-
sional y4 x -invariant subspaces Span {(1,2,3), ( l , y , 0)}, Span{(l, 2, 3),
(1, y,0)}, and Span«l, 0,0), (0,1,0)}.
Now we describe supporing triinvariant decompositions

of W(A) with ^=Span{(l,2,3)}, % = Span{(l, 1,0)}. If we let M =


Span{(;t, y, z)}, we easily see that

if and only if z ^ 3(y - x). Further, one of the following four cases appears:

In cases (a) and (c) we obtain J/ = Span{(l,y, 0)} and M~


Span{(l, y, 0}}, respectively. In case (b) we have
232 Rational Matrix Functions

for some complex numbers p, q, r, s. Consider the second equality in (7.4.3)


as an equation with unknowns p, q, r, s. Solving this equation and putting
r = 1 — y, we get q = 3 — 3y, 5 = 1 — 3a, p = 2 — 3a — y, and M is spanned by
(2 - 3a — y, 2 - 3 ay — y, 3 — 3y), where a ¥* \. [This condition reflects the
inequality z^3(y-jc).] Similarly, in case (d) we obtain M — Span{(2 —
3a - y, 2 — 3ay - y, 3 - 3y}}, where a ^ 3. To summarize, the subspaces
M for which

is a supporting triinvariant decomposition for W(\) are exactly the follow-


ing:
and
To compute the corresponding minimal factorizations according to for-
mula (7.3.6), write the matrices A, B, C (understood as transformations in
the standard orthonormal bases in <p2 and (p3) with respect to the basis
(1, 1,0), < l , y , 0 ) , (1,2,3) in (p3 and the standard basis (1,0), (0,1) in
t2:

So the minimal factorization corresponding to the supporting triinvariant


decomposition (7.4.3) with M = Span{(l, y,0)} is
where
Example 233

Replacing y by y in these expressions we obtain the minimal factorization


corresponding to (7.4.3) with M = Span{(l, 7,0}}.
Now for a ^ \ write A, B, and C in the basis

The corresponding minimal factorization is given by


Th

Taking y in the place of y in these expressions, we obtain the mini-


mal factorization corresponding to (7.4.3) with

Note that these four factorizations exhaust all minimal factorizations

with not identically constant rational 2 x 2 matrix functions with value


/ at infinity and for which W,( A) has a pole at = 1 and has a zero a
0000 000000000

7.5 MINIMAL FACTORIZATIONS INTO SEVERAL FACTORS AND


CHAINS OF INVARIANT SUBSPACES

Let W( A) be an n x n rational matrix function with minimal realization

so that, in particular, W(°°) = I. We study minimal factorizations of W( A) by


means of the realization (7.5.1), and in terms of chains of invariant
subspaces for A and A* = A - EC. We state the main theorem of this
section.

Theorem 7.5.1
Let m be the size of A in equation (7.5.1), and let

where the chain

consists of A-invariant subspaces, whereas the chain

consists of Ax-invariant subspaces. Then W admits the minimal factoriz-


ation

where TT, is the projector on ^£- along J^


Conversely, for every minimal factorization
Factors and Chains of Invariant Subspaces 235

where Wt( re rational n x n matrix functions with W^) /, there exists a


unique direct sum decomposition (7.5.2) with the property that the chains
(7.5.3) and (7.5.4) consist of invariant subspaces for A and A", respectively,
such that

The proof is obtained by p — 1 consecutive applications of Corollary


7.3.2.
As in the remark following the proof of Theorem 7.3.1, the factorization
(7.5.5) implies the minimal factorization for W

We are interested in the case when p, the number of factors in the


minimal factorization (7.5.6), is maximal [of course, we exclude the case
when some of the W- values are identically equal to /]. Obviously, p
cannot exceed the McMillan degree of W (W), for then each facto
Wj(\) must have McMillan degree 1. It is not difficult to find a general form
of rational n~x n matrix functions V( A) with V(°°) = / and S(V) — 1; namely

where A0 is a complex number and R is an n x n matrix of rank 1. Indeed, if


V(A) has the form (7.5.7), then by writing R = C0B0, where C0 is an n x 1
matrix and BQ is a 1 x n matrix, we obtain a realization
of V(A) that is obviously minimal. So d(V) = l. Conversely, if 8(V) = l,
then we take a minimal realization and put
R = C0B0 to obtain (7.5.7).
Note that if V(\) has the form (7.5.7), then so does [because
8(V~l) = S(V} = I]. Indeed, by equation (7.3.2)

where tr R is the trace of R (the sum of its diagonal entries).


We arrive at the following problem: study minimal factoriza-
tions

of W(A), where each V^(A) has the form (7.5.7) for some and R. First let
236 Rational Matrix Functions

us see an example showing that not every ) admits a minimal factoriza-


tion of this type.

EXAMPLE 5.1. Let

This realization (with

is easily seen to be minimal. As BC = 0, we have

Obviously, there is no (nontrivial) direct sum decomposition


where j?, and j£2 are A invariant. So by Theorem 7.5.1 (or Corollary 7.3.2)
W(\) does not admit minimal factorizations, except for the trivial ones

We give a sufficient condition for the existence of a minimal factorization


(7.5.8). This condition is based on the following independently interesting
property of chains of invariant subspaces.

Lemma 7.5.2
Let A j, A 2 : <p" —> <p" be transformations and assume that at least one of them
is diagonable. Then there exists a direct sum decomposition <p" = Jz^ + • • • +
££n with one-dimensional subspaces j£y, / = 1, . . . , n, such that the complete
chains

and

consist of A ^invariant and A2-invariant subspaces, respectively.

Proof. It is sufficient to prove the existence of a direct sum decompo-


sition

where dimdim M is A, invariant, an£ is A


hactors and Chyains of Invariant Subspaces 237

Indeed, we can then use induction on n and assume that Lemma 7.5.2 is
already proved for A^M and PMA2\M in place of Al and A2, respectively,
where PM is projector on M along j£ (Remember that if at least one of Al
and A2 is diagonable, the same is true for A^M and PMA2\M\ see Theorems
4.1.4 and 4.1.5.) Combining (7.5.9) with the result of Lemma 7.5.2 for A^M
and PMA2\M, we prove the lemma for Al and A2.
To establish the existence of the decomposition (7.5.9), assume first that
A i is diagonable, and let /,, . . . , / „ be a basis for <p" consisting of eigen-
vectors of v4,. If g is an eigenvector of A2, (7.5.9) is satisfied with
£ = Span{/) , . . . , / , }, where the indices / j , . . . , / „ _ , are such that
/ - , , . . . , f i n _ \ , g form"a basis in <p".
If A2 is diagonable but A, is not, then use the part of the theorem already
proved with A2 and A* in place of A{ and A2, respectively. We obtain an
(n - 1)-dimensional A*-invariant subspace M and a one-dimensional A*-
invariant subspace & that are direct complements of each other. Then put
M = (.&)1 and £ = (M)x to satisfy (7.5.9). D

We can now state and prove the following sufficient condition for minimal
factorization of a rational matrix function W(\) into the product of 8(W)
nontrivial factors.

Theorem 7.5.3
Let W( A) be a rational n x n matrix function with a minimal realization

and assume that at least one of the matrices A and A — BC is diagonable.


Then W( A) admits a minimal factorization of the form

where are complex numbers and Rl, . . . , Rm are n x n matrices


of rank 1.

The proof of Theorem 7.5.3 is obtained by combining Theorem 7.5.1 and


Corollary 7.5.2. In Example 7.5.1 the hypothesis of Theorem 7.5.3 is
obviously violated. Inded, the matrix os not diagonable. THe

following form of Theorem 7.5.3 may be more easily applied in many cases.

Theorem 7.5.4
Let W( A) be a rational n x n matrix function with W(°°) = I. Assume that
either in W(A), or in W(\)~\ all the poles (if any) of each entry are of the
first order. Then W(\) admits a factorization (7.5.11).
238 Rational Matrix Functions

Recall that the order of a pole A of a scalar rational matrix /( A) is defined


as the minimal positive integer r such that lim A _ A [(A — A 0 ) r /(A)] is finite.

Proof. Assume that all the poles of each entry in W( A) are of the first
order. The local Smith form (7.2.1) implies that all the negative partial
multiplicities (if any) of W(\) at each point A0 are —Is. By Theorem 7.2.3,
all the partial multiplicities of the matrix A from the minimal realization
(7.5.10) are Is. Hence A is diagonable and Theorem 7.5.3 applies. If all
poles of W(\yl are of the first order, apply the above reasoning to W(\)~l,
using its realization W(A)^ = / - C(A7- (A - BC))~1B, which is minimal
if (7.5.10) is minimal. D

000000000000000000000000000000000000000000000

In this and the next sections we study linear fractional transformations and
decompositions of general (nonsquare) rational matrix functions. We deviate
here from our custom and denote certain matrices by lower case Latin and
Greek letters.
Let W( A) be a rational matrix function of size r x m written in a 2 x 2
block matrix form as follows:

Here m, and ri (/' = !, 2) are positive integers such that m = ml+m2,


r=r\ + r2. Let V(\) be a rational m2 x rl matrix function for which
det(7- W 12 (A)V(A))^0, and define matrix function

So U( A) is a rational matrix function of size r2 x ml. It is called the linear


fractional transformation of ) by ) [with respect to the block matrix
form of (7.6.1)] and is denoted by SFW(V). It is easily seen that when
m and can be rewritten in the form

where

Conversely, if (7.6.3) holds, then we have (7.6.2) with


Linear Fractional Transformations 239

The form (7.6.3) justifies the terminology "linear fractional transforma-


tion", however, the form (7.6.2) will be more convenient for our analysis.
Observe that multiplication of rational matrix functions is a particular
case of the linear fractional transformation, which is obtained in case
and either
Assume now that both W(\) and K(A) take finite values at infinity. Then
(see Section 7.1) there exist realizations

where A, B, C, and D are matrices of sizes n x H, n x m, r x n, and r x m,


respectively, and

with matrices a, 6, c, d of size p ~ * p , pxr\, m2xp, and w 2 x r , ,


respectively. At this point we do not require that the realizations (7.6.4) and
(7.6.5) be minimal. We are to find a realization of &W(V) in terms of the
realizations (7.6.4) and (7.6.5) of W(A) and V(\).
With respect to the direct sum decompositions (pm = (£""' + (p™2 and
(p = <f ri + <p r2 , we write B, C, and D as block matrices
r

As D = Vy(°°), formula (7.6.2) shows that 3*W(V) is analytic at infinity (i.e.,


has no poles there) provided the matrix / — Dl2d is invertible; in this case

We restrict our attention to rational matrix functions that are analytic at


infinity, so it will be assumed that / - Dl2d is invertible. Then / — dDl2 is
invertible as well and

Indeed, multiplication gives


240 Rational Matrix Functions

Define transformations:

Theorem 7.6.1
We have

Further, if this realization of &W(V) is minimal, then the realizations (7.6.4)


and (7.6.5) of W(\) and V(\), respectively, are minimal as well.

Proof. Write

So

We use a step-by-step procedure to compute a realization for

using these realizations for W ) and the realization (7.6.5) for by the
following rules: given two rational matrix functions d X with
finite values at infinity and realizations

realizations for Xl(\) + X2(\), Xl(\)X2(\), and A\(A) - 1 can be found as


follows [cf. formulas (7.2.9) and (7.3.2)]:
Linear Fractional Transformations 241

(it is assumed in the last formula that D} is invertible). A computation shows


that

where

Let
242 Rational Matrix Functions

where n x n and p x p are the sizes of ^4 and a, respectively. Then

and

Writing (7.6.12) in the form

we see that formula (7.6.11) follows.


0000000000000000000000000000000000000000000000000 00000000
that

for all nonnegative integers k. Using formula (7.6.7), one proves by


induction on k that

Indeed, (7.6.14) holds for k = 0. Assuming that (7.6.14) is true for k - 1, we


have

where the last equality follows in view of (7.6.13). Now


Linear Fractional Transformations 243

and x — 0 because (y, a) is a null kernel pair. [This follows from the
minimality of (7.6.11).] So the pair (C, A) is also a null kernel pair.
To prove that (A, B) is a full-range pair, observe that a* can be written
in the form

where Ylk and Zlk are certain matrices and the stars denote matrices of no
immediate interest. Formula (7.6.15) can be proved by induction on k by
means of formula (7.6.7). From the minimality of (7.6.11) it follows that for
every there exist vectors i; such that

But then, using (7.6.15) and (7.6.8), we have

and (/I, B) is a full-range pair. So the realization (7.6.4) is minimal.


Now consider the realizaton (7.6.5). sucg that

One proves that using an aggu-


ment analogous to that used in obtaining (7.6.14). Hence

In view of the minimality of (7.6.11) we obtain x = 0, and (c, a) is a null


kernel pair. Finally, write a* in the form

for some matrices zlk and ylk. [Again, equation (7.6.16) can be proved by
induction on k using (7.6.7).] For every x E <pp by the minimality of
(7.6.11) there exist vectors w 0 , . . . , uq G (p'"1 such that
244 Rational Matrix Functions

From (7.6.16), it follows that

for some vectors w(), . . . , wg, and the full-range property of (a, b) is
proved. Hence the realization (7.6.5) is minimal as well.

Observe that if D 1 2 =0, £> 2 1 =0, Dn = /, C, =0, ^ = 0 , we have


W 21 (A) = 0, W, 2 (A) = 0, W U (A) = 7, and so

On the other hand, formulas (7.6.7)-(7.6.10) take the form

which coincides with formula (7.2.9) for the realization of a product of


rational matrix functions. So (7.6.11) is a generalization of (7.2.9). On the
other hand, putting D I 2 = 0, D21 = 0, D22 = /, C2 = 0, B2 = 0, we have

and formula (7.6.11) gives another version for the realization of the product
of two rational matrix functions:

7.7 LINEAR FRACTIONAL DECOMPOSITIONS AND INVARIANT


SUBSPACES FOR NONSQUARE MATRICES

Let (J( A) be a rational matrix function of size q x s with finite value at


infinity. A linear fractional decomposition of U(\) is a representation of
(/(A) in the form

for some rational matrix functions W(\) and K(A) that take finite values at
infinity. In this section we describe linear fractional decompositions of U( A)
Linear Fractional Decompositions and Invariant Subspaces 245

in terms of certain invariant subspaces for nonsquare matrices related to a


realization of U(\).
Minimal linear fractional decompositions (7.7.1) are of particular inter-
est. First observe that the definition of the McMillan degree of a rational
matrix function with value / at infinity (given in Section 7.3) extends
verbatim to a (possibly rectangular) rational matrix function W(\) with
finite value at infinity: namely, 8(W) is the size of the matrix A taken from
any minimal realization

of W(\). In any linear fractional decomposition (7.7.1) of U(\) for which


the rational functions W(A) and V(\) take finite values at infinity, we have

Indeed, assuming that (7.6.4) and (7.6.5) are minimal realizations of W(\)
and V(A), respectively, then by Theorem 7.6.1 U(\) has a realization (not
necessarily minimal) 5 + y(\I- a)~lfi, where the size of a is t x /, with
t = 8(W) + 8(V). Hence (7.7.2) follows.
The linear fractional decomposition (7.7.1) is called minimal if equality
holds in (7.7.2), that is, 8(U) = d(W) + S(V). As in the preceding para-
graph, Theorem 7.6.1 implies that (7.7.1) is minimal if and only if for some
(and hence for any) minimal realizations (7.6.4) and (7.6.5) of W(A) and
£/(A), respectively, the realization (7.6.11) of U(\) = &W(V) is again
minimal.
Let

be a realization (not necessarily minimal) of f/(A), where a, /3, y, and 8 are


matrices of sizes / x /, / x s, q x /, and q x s, respectively. Recall from
Theorem 6.1.1 that a subspace M C <p' is [a (3] invariant if and only if there
exists an 5 x / matrix F such that M is invariant for a + /3F. Also (see
Theorem 6.6.1), a subspace X C (p' is invariant if and only if there
exists an / x q matrix G such that (a + Gy)N C N. For the purpose of this
section we can accept these properties as definitions of [a /3]-invariant and
-invariant subspaces, respectively.
A pair of subspaces (M}, M2) of <(7' will be called reducing with respect to
realization (7.7.3) if M\ is [a )8] invariant, M2 is invariant, and M^ and
M2 are direct complements to each other in <p'.
The following theorem provides a geometrical characterization of mini-
mal linear fractional decompositions of t/(A) in terms of its realization
(7.7.3).
246 Rational Matrix Functions

Theorem 7.7.1
Assume that ( M l , M 2 ) is a reducing pair with respect to the realization
(7.7.3) of £/( A). The following recipe may be used to construct realizations of
rational matrix functions W( A) and V( A) such that

and

with a transformation A: Mt — » M , , and

with a transformation a: M2—>M2: (a) choose any transformation

and any transformation d: (p 5 —*(p^ such that the transformations Du, D22
and I — Dl2d are invertible and

(b) choose any transformations F: <p'—» (p* and G:^q—>^' for which
and

be block matrix representations with respect to the direct sum decomposition


<p =M} + M2. Then, defining
Linear Fractional Decompositions and Invariant Subspaces 247

and

equation (7.7.4) holds. Moreover, if, in addition, the realization (7.7.3) is


minimal, the linear fractional decomposition (1.1 A) is minimal as well; and
conversely, any minimal linear fractional decomposition

of U( A) where the rational matrix functions

and V(\) take finite values at infinity and the matrices Wu(°°) and V^22(oo) are
invertible, can be obtained by this recipe.

Proof. Let A, B-, Cj, Dtj and a, b, c, d be defined as in the recipe.


Then, using the relationships (7.7.7), (7.7.9), and (7.7.10) and the
equalities a21 + /3 2 F 1 =0 and a 1 2 +G 1 -y 2 = 0 (which follow from the in-
variance of Ml and M2 under the transformations a + /3F and a + Gy,
respectively), one checks that the equalities (7.6.7)-(7.6.10) hold. Now by
Theorem 7.6.1. we obtain the linear fractional decompo-
sition (7.7.4).
Assume now that (7.7.3) is a minimal realization of t/(A); hence 5(U) -
I. By Theorem 7.6.1 the realizations (7.7.5) and (7.7.6) are minimal, so

As M{ and M2 are direct complements to each other in <p', we have


8(U) = S(W) + S(V), and the minimality of the linear fractional decom-
position (7.7.4) follows.
248 Rational Matrix Functions

Conversely, assume that (7.7.3) is a minimal realization of and let


(7.7.10) be a minimal linear fractional decomposition of , where the
rational functions W(\) and V(A) are finite at infinity and
are invertible. Here

0000 000000000000000000000000000000 00000 00000000000000


formula (7.7.11) and by the invertibility of Wn(°°) and W22(°°); in particular,
the matrix functions W,,(A) and W22(\) must be square.] Let

be a minimal realization of partitioned as in (7.7.12), where the


matrix A has size n x n, n- 8(W). Let

be a minimal realization of V( A) in which a is p x p, p = 8(V). By Theorem


7.6.1, form a realization

where a', (3', y', and 8' are given by formulas (7.6.7), (7.6.8), (7.6.9), and
(7.6.10), respectively, using the realizations (7.7.13) and (7.7.14). As
(7.7.11) is a minimal linear fractional decomposition, the realization
(7.7.15) is minimal. [The size of a' is (n + p) x (n +/?).] Comparing the
minimal realizations (7.7.3) and (7.7.15) we find, in view of Theorem 7.1.4,
that 8 = 8' and there exists an invertible transformation S: <p" 4- (p^—»<p'
such that

Putting

one verifies that and the minimal


linear fractional decomposition (7.7.11) is given by our recipe.

Observe that the linear fractional decomposition of U( A) described in the


recipe of Theorem 7.7.1 depends on the reducing pair ( M l , M 2 ) i on the
choice of D and d such that condition (a) holds, and on the choice of F and
Linear Fractional Decompositions and Invariant Subspaces 249

G such that (a + ^F)M1CM1, (a + Gy)M2CM2. [We assume that the


realization (7.7.3) of U(\) is fixed in advance.] We determine the parts of
this information that are uniquely defined by the linear fractional decompo-
sition. Let us introduce the following definition. Let (Mlf M 2 ) be a reducing
pair [with respect to the realization (7.7.3)] and F: <f'^ <fJ, G: £*-»> <p' be
transformations such that (a + (3F)Ml C M^, (a + Gy)M2 C M2, and write

with respect to the direct sum decomposition <p' = M^ + M2. The quadruple
(Mi,M2;F{,G\) will be called a supporting quadruple [with respect to the
realization (7.7.3)]. Given a supporting quadruple, for every choice of D
and d satisfying condition (a) of Theorem 7.7.1, the recipe produces a linear
fractional decomposition of U(X). We now have the following important
addition to Theorem 7.7.1.

Theorem 7.7.2
Assume that the realization (7.7.3) is minimal, and let (7.7.11) be a minimal
linear fractional decomposition of U(\} such that W (/ (A) and V(\) take finite
values at infinity and the matrices Wn(°°) and W22(<*>) are invertible. Then
there exists a unique supporting quadruple Q = (Ml, M2; FI} G,) that pro-
duces, together with some choice of D and d satisfying condition (a), the
decomposition (7.7.11) according to the recipe of Theorem 7.7.1.

Proof. The existence of Q is ensured by Theorem 7.7.1. To prove the


uniqueness of Q, assume that Q' - (M(, M'2\ F[, G|) is another supporting
quadruple that gives rise (with some choice of D and d) to the same
decomposition (7.7.11). As D = W(QO), d = V(<x>), we see that actually the
matrices

and d, which, together with Q', give rise to the decomposition (7.7.11) are
the same matrices chosen to produce (7.7.11), together with Q. Further, let
(7.7.8) be the block matrix representations of a, j8, and y with respect to the
direct sum decomposition (p7 = M\ 4- M2, and let

be the corresponding representations with respect to the direct sum <p' =


M\ + M2. We now have two realizations for W(A):
250 Rational Matrix Functions

where A, Bt, and Cj are given by formulas (7.7.9) and A', B't, and C'j are
given by (7.7.9) with a n , G,, F,, j8,, y, replaced by «;,, GJ, FJ, /3|, y{,
respectively. By Theorem 7.6.1, both realizations (7.7.16) are minimal, so in
view of Theorem 7.1.3 there exists an invertible transformation S: M\-+ M\
such that

Similarly, we have

where a, b, and c are given by (7.7.10) and a', b', and c' are given by
(7.7.10) with a 22 , j32, y2 replaced by a 22 , /3 2 , y 2 > respectively. Since both
realizations (7.7.18) are minimal, we have

for some invertible transformation


We now verify that

Indeed, formulas (7.7.9) together with (7.7.17) give

and

so a n = S la'nS. From formulas (7.7.10) and (7.7.19) one obtains


Linear Fractional Decompositions: Further Deductions 251

and

Further, the definition of the supporting quadruples Q and Q' implies

so

and

All the established relationships verify the equalities (7.7.20).


It remains to observe that the transformation V = is a similarity
of the minimal realization (7.7.3) with itself. Since such a similarity must be
unique (Theorem 7.1.4), it follows that V= /and hence

7.8. LINEAR FRACTIONAL DECOMPOSITIONS:


FURTHER DEDUCTIONS

We consider here some deductions, examples, and results on linear fraction-


al decompositions that follow from the main theorems, Theorems 7.7.1 and
7.7.2.
The particular case when 8 - I, D — /, and d - I in Theorems 7.7.1 an
7.7.2 is of special interest. In this case condition (a) of Theorem 7.7.1 is
satisfied automatically, and we have the following.

Theorem 7.8.1
Let

be a minimal realization of the rational q x q matrix function U(\). Let


(Jti,M2) be a reducing pair for the realization (7.8.1), and write
252 Rational Matrix Functions

Choose any transformations

in such a way that (a + J3F)M1 CMl, (a + Gy)M2 C M2. Then

and

produce a minimal linear fractional decomposition U( A) = &W(V). Converse-


ly, every minimal linear fractional decomposition U(\) = &W(V) with
W(°o) = / and V(o°) = / can be obtained in this way, and the quadruple
(Ml, M2\ F t , G,) is determined uniquely by W(\) and V(A).

Let us give a simple example illustrating Theorem 7.8.1.

EXAMPLE 7.8.1. Let

A minimal realization for U(\) is easy to find:

with 8 = -y = /3 = 7, « = - We find all nontrivial [i.e., such that


W(\)^I, V(A)^7] minimal linear fractional decompositions U(\)-
&W(V) such that W(«>) = /, K(°o) = /. Every subspace in <p2 is [a /3]
invariant, as well as invariant. We consider the case when the one-
dimensional subspaces Ml and M2 and <p2 that are direct complements to each
other are of the form
Linear Fractional Decompositions: Further Deductions 253

Then one computes

with respect to the direct sum decomposition (p2 = Ml + M2, where (1, x)
and (1, y) are chosen as bases in M} and M2, respectively. Further,
is such that (or + f$F)M\ C M, if and only if the transformation

The transformation G = is such that (a + Gy)M


2 C M2 if and only if

forG, =[g, g2] we have

Now formulas (7.8.2) and (7.8.3) give

We conclude that for every six-tuple of complex numbers (x, y, /j, f 2 , ' g l , g2]
such that x^y and (7.8.4) and (7.8.5) hold, there is a minimal linear
fractional decomposition U(\) - &W(V) where W(\) and K(A) are given by
equalities (7.8.6) and (7.8.7), respectively. D

As an application of Theorem 7.7.1, let us consider linear fractional


decompositions with several factors.
254 Rational Matrix Functions

Theorem 7.8.2
Let U(A) be a rational matrix function that has no pole at infinity, and let
m = 8(U). Then U(\) admits a linear fractional decomposition

where for j = 1, . . . , m Wj( A) is a rational matrix function that is finite at


infinity with McMillan degree 1. Moreover, VV^A) can be chosen in such a
way that

for any rational matrix function V(\) of suitable size, where Wfl(\) and
WJ2(\) are rational matrix functions of appropriate sizes with Wr/2('») = 0,
WW») = /.
Observe that the decomposition (7.8.8) is minimal in the sense that
8(U) = 8(Wl)+--- + 8(Wm). So, in contrast with the factorization of
rational matrix functions (Example 7.5.1), nontrivial minimal linear frac-
tional decompositions always exist.

Proof. Choose a minimal realization

By the pole assignment theorem (Theorem 6.5.1), there exists a transfor-


mation F such that o-(a + ftp) = ( A , , . . . , A,} with distinct numbers
A , , . . . , A, (here / x / is the size of a). So there is a basis g,, . . . g, in (p7 such
that (a + (3F)gj — A^g y , 7' = 1 , . . . , / . On the other hand, for any transform-
ation G: (p*—*• <p' there is a basis / , , . . . , / , in (p7 in which the matrix of
a + Gy has a lower triangular form (Theorem 1.9.1). Choose gy in such a
way that g;, /2, /3, . . . , / / are linearly independent and put Ml — Span{g;},
M2 = Span{/2, /3, . . . , / / } • Then (M^ M 2 ) is a reducing pair and the recipe
of Theorem 7.7.1 (with D = I, d — 8) produces a minimal linear fractional
decomposition U(\)=&w(Ul), where d(W) = l and W(<*>) = I. Moreover,
taking G =0 it follows that W(A) has the form

Hence &W(V) has the form (7.8.9). Now apply the preceding argument to
(/^A), and so on. Eventually we obtain the desired linear fractional
decomposition (7.8.8).

Observe that, because 8(Wj) = l, each function W ; (A) from Theorem


7.8.2 has only one pole /u,., and the multiplicity of this pole is 1. The proof of
000000000000 255

Theorem 7.8.2, together with formula (7.7.8) for the transformation A,


shows that the functions Wj(\) can be chosen with the additional property
that f i l , . .. , n.m are the eigenvalues (counted with multiplicities) of the
transformation a taken from a minimal realization (7.8.3) of f/(A).

00000000000000000

7.1 Find realizations for the following rational matrix functions:

Determine whether these realizations are minimal.


7.2 Find the McMillan degree and a minimal realization for the following
rational matrix functions:

7.3 Reduce the following realizations to minimal realizations

where Cp is the 1 x n matrix with 1 in the pth place and zeros


elsewhere;
256 Rational Matrix Functions

7.4 Find minimal realizations for the following scalar rational functions:

where A; is a positive integer

[Hint: In the minimal realization / + C(\I - A)~1B the matrix


A is the Jordan block of size k with eigenvalue A

7.5 Find a minimal realization for the scalar rational function with finite
value at infinity, assuming that its representation as a sum of simple
fractions is known, that is, of the form Er [Hint:
Use Exercise 7.4 (c) and Exercise 7.11.]
7.6 Show that if

are realizations for n x n and m x m rational matrix functions W,(A)


and W2(\), then the (n + m) x (n + m) rational matrix function
W,(A)@W 2 (A) has realization

Show, furthermore, that (2) is minimal if and only if each realization


(1) is minimal.
7.7 Describe a minimal realization for the 2 x 2 circulant rational matrix
function

where 0,(A) and fl2(A) are scalar rational functions with finite value at
infinity.
7.8 Describe a minimal realization for the n x n circulant rational matrix
function
Exercises 257

[As usual, assume that W(o°) is finite at infinity.]


7.9 Let ^(A) and W2(\) be rational matrix functions with realizations

Show that the sum W{(\) + W2(\) has the realization

7.10 Give an example of rational matrix functions W,(A) and W2(\) with
minimal realizations (3) for which the realization (4) is not minimal.
7.11 Assume that the realizations (3) are minimal and A} and A2 do not
have common eigenvalues. Prove that (4) is minimal as well. [Hint:
We have to show that ([C^ C 1 is a null kernel pair

and I l i s a full-range pair. Suppose that x and


such that

for k = 0,1, Because (r(At) H cr(A2) = 0, for k = 0,1, . . . there


exists a polynomial pk(\) such that pk(Al) = Q, pk(A2)= A2. Then

Hence y = 0. Similarly, one proves that x - 0.]


7.12 Let

be a rational n x n matrix function, where A,, . . . , A^. are distinct


complex numbers. Show that W(\) admits a realization
258 Rational Matrix Functions

When is this realization minimal?


7.13 Find a realization for a rational n x n matrix function of the form

(where A , , . . . , A^ are distinct complex numbers). When is the ob-


tained realization minimal?
7.14 Given a realization W(A) = C(A- A)~1B, find a realization for the
rational matrix function

Is it minimal if the realization W(A) = C( A - A) ]B is minimal?


7.15 Given a realization

of a rational matrix function, find a realization for W(a\ + )3), where


a T^ 0 and /3 are fixed complex numbers. Assuming that (5) is
minimal, determine whether the obtained realization is minimal as
well.
7.16 Given a realization (5), show that W(A 2 ) has a realization

If (5) is minimal, is this realization minimal as well?


7.17 Given a realization (5), find a realization for W(p( A)), where p(A) is
a scalar polynomial of third degree. Is the realization obtained
minimal if (5) is minimal?
7.18 Let

be a minimal realization,
(a) Show that
Exercises 259

is a realization of W(\)2.
(b) Is the realization of W( A)2 minimal?
(c) Is the realization minimal if, in addition, the zeros and poles of W( A)
are disjoints?
7.19 For the minimal realization (6), show that

is a realization of W( A) . Is it minimal? Is it minimal if the zeros and


poles of W( A) are disjoint?
7.20 Show that a realization W( A) = / + C( \I - A)~1B is minimal if A and
A — BC do not have common eigenvalues. (Hint: Use Theorem
7.1.3.)
7.21 Let W( A) be an n x n rational matrix function with W(°°) = I and
assume that W(\) is hermitian for all real A that are not poles of W(\).
Prove that for every minimal realization

there exists a unique invertible matrix 5 such that

7.22 Show that the McMillan degree of

where are distinct complex numbers, is equal to the sum


of ranks of Zl,. . . , Zk.
7.23 Show that for rational n x n matrix functions Wj(A) and W2(\) with
finite values at infinity the inequalities

hold.
260 Rational Matrix Functions

7.24 Find the McMillan degree of the circulant rational matrix function

7.25 Find a minimal realization of W(\), and, with respect to this realiz-
ation, describe all the minimal factorizations W(\) = Wl(\)W2(\) of
W(A) in terms of subspaces 2£ and M as in Corollary 7.3.2, for the
following scalar rational functions:

whree si a fixed integer

7.26 When is the realization /„ + / n (A/ n — A) 1B, where A is upper tri-


angular with zeros on the main diagonal and B is diagonal with
distinct eigenvalues, minimal? Show that in this case W(\) admits a
minimal factorization with factors having McMillan degree 1.
7.27 Prove that a circulant rational matrix function (Exercise 7.24) wit
value / at infinity admits a minimal factorization with factors having
McMillan degree 1.
7.28 Let

be a minimal realization, and assume that BC = 0.


(a) Prove that
(b) Prove that W( A) admits a nontrivial minimal factorization if and
only if A is not unicellular.
7.29 Let

be a scalar rational function. Use the recipe of Theorem 7.7.1 to


construct all minimal linear fractional decompositions t/(A) = ^W(V),
Exercises 261

such that W(\) and V(\) take finite values at infinity and Wn(oo),
W22(o°) are invertible. Find all the corresponding reducing pairs of
subspaces with respect to a fixed minimal realization of t/(A).
7.30 Show that all the following decompositions of a rational matrix
function U(\) are particular cases of the linear fractional decompo-
sition:

7.31 For the rational function U(\) given in Example 7.8.1, find all
minimal linear fractional decompositions U(\) = &W(V), with
and
Chapter Eight

Linear Systems

In this chapter we show how the concepts and results of previous chapters
are applied to the theory of time-invariant linear systems. In fact, this is a
short self-contained introduction to linear systems theory. It starts with the
analysis of controllability, observability, minimality, and state feedback and
continues with a selection of important problems with full solution. These
include cascade connections, disturbance decoupling, and output stabiliz-
ation.

8.1 REDUCTIONS, DILATIONS, AND TRANSFER FUNCTIONS

Consider the system of linear differential equations

where are constant


transformations (i.e., independent of /). Here u(t) is an n-dimensional
vector function on t > 0 that is at our disposal and is referred to as the input
(or control) of the linear system [equations (8.1.1)]. The r-dimensional
vector function y(t) is the output of (8.1.1), and the m-dimensional function
x(t) is the state of (8.1.1). Usually the state of the system (8.1.1) is unknown
to us and must be inferred from the input (which we know) and the output
(which we may be able to observe, at least partially).
Let Jt(/;* 0 , M) be the solution of the first equation in (8.1.1) [with the
initial value *(0) = *0]. It follows from the basic theory of ordinary differen-
tial equations [see Coddington and Levinson (1955), for example] that the
solution x(t;x0, u) is unique and is given by the formula

262
Reductions, Dilations, and Transfer Functions 263

Substituting into the second equation of (8.1.1), we have

Formula (8.1.3) expresses the output in terms of the input. In other words,
the input-output behaviour of the system is represented explicitly.
Now we introduce some important operations on linear systems of type
(8.1.1). It is convenient to describe (8.1.1) by the quadruple of transfor-
mations (A, B, C, D). A linear system (A1, B', C', D ) with transformations
A'\ (pm'-^4:m', B': (p"'-»<p m ', C': <p m '-^(p r ', D': <p"'-*<p r ' will be called
similar to (A, B, C, D) if there exists an invertible transformation
5: <f""'-» <pm such that

(In particular, this implies that m = m', n = n', r = r'.) We also encounter
system (8.1.1) with transformations A:M^>M, B:("->M, C:M-^-(r,
and D: (p"—> (pr, where M is a subspace of <£"" for some m. The definition of
similarity applies equally well to this case. [In particular, similarity with the
system (A1, B', C', D') described above implies dim M = m'.]
A system (A, B', C', D') with A': <pm'-* <pm', B': <p"-» <f w ',
C': <p w '-» <fr, D': <f"^ <fr will be called a dilation of (4, B, C, D) if there
exists a direct sum decomposition

with the two following properties: (1) the transformations A', B', C' have
the following block forms with respect to this decomposition

where the stars denote entries of no immediate concern (so A: M—>M,


C:M^>fr, fl: <£"•-»./#); (2) the system (A, B, C, D') is similar to
(A, B, C, D). In particular, if (A, B', C', D') is a dilation of (A, B, C, /)),
then D' = D. The form (8.1.5) for A' shows that the subspaces 58 and
5£ + M are A' invariant; in other words, (8.1.4) is a triinvariant decompo-
sition associated with the ^4-semiinvariant subspace M. Similarity is actually
a particular case of dilation, with M — <pm and Jz? = Jf = {0}.
We say that (A, B, C, D) is a reduction of (A', B', C', D') if
(A1, B', C', D') is a dilation of (A, B, C, D).
264 Linear Systems

The basic property of reductions and dilations is that they have essentially
the same input-output behaviour; as follows.

Proposition 8.1.1
Let (A', B', C', D') be a dilation of (A, B, C, D). Then, for *0 = 0, the
input-output behaviours of the systems (A', B', C', D') and (A, B, C, D)
are the same. In other words, if u(t) is any (say, continuous) n-dimensional
vector function, then the output y = y(t; 0, u) of the system (A', B', C', D')
and the output y = y(t; 0, u) of the system (A, B, C, D) coincide.

Proof. Formula (8.1.3) gives

As D' = D, and e('~s)A (for a fixed / and s) admits a power series represen-
tation (see Section 2.6), we have only to show that for q =0,1,. . .

Using formula (8.1.5), we obtain

Now (A, B, C, D') and (A, B, C, D) are similar, so there exists an invert-
ible transformation S such that A = S~1AS, C = CS, and B = S~1B. Hence

and (8.1.6) follows.

In practice one is concerned about the dimension m of the state space of


a given system (8.1.1). It is desirable to make this dimension as small as
possible without changing the input-output behaviour. We say that the
system (8.1.1) is minimal if the dimension m of its state space is minimal
among all linear systems (A', B', C', D') that exhibit the same input-output
behaviour given the initial condition that that state vector is zero [i.e.,
jc(0) = 0]. In view of Proposition 8.1.1, the following problem arises: given
the linear system (8.1.1), not necessarily minimal, produce a minimal system
by reduction of (8.1.1). We see later that this is always possible.
Minimal Linear Systems: Controllability and Observability 265

To study this and other problems in linear system theory, it is convenient


to introduce the transfer function. Consider the system (8.1.1) with Jt(0) = 0,
and apply the Laplace transform. Denote by the capital Roman letter the
Laplace transform of the function designated by the corresponding small
letter; thus

[It is assumed here that for f > 0 z(/) is a continuous function such that
|z(0l ^ Ke*' for some positive constants K and /x. This ensures that Z( A) is
well defined for all complex A with Re A > /a.] The system (8.1.1) then takes
the form

Solving the first equation for X(\) and substituting in the second equation,
we obtain the formula for the input-output behaviour in terms of the
Laplace transforms:

So the function W(A) = D + C( A/ - A)~1B performs the input-outut map


of the system (8.1.1), following application of the Laplace transform. This
function is called the transfer function of the linear system (8.1.1). Observe
that the transfer function is a rational matrix function of size r x n that has
finite value (=D) at infinity. Observe also that the transfer functions of two
linear systems coincide if and only if the systems have the same input-
output behaviour. In particular, systems obtained from each other by
reductions and dilations have the same transfer functions.

00000000000000000000000000000000000000
0000000000000000000000000000000000000000000000

Consider once more the linear system of the preceding section:

and recall that this system is called minimal if the dimension of the state
space is minimal. [We omit the initial condition *(0) = *0 from (8.2.1); so
(8.2.1) has in general many solutions x(t).]
266 Linear Systems

Applying the results of Section 7.1 to transfer functions, we obtain the


following information on minimality of the system (8.2.1).

Theorem 8.2.1
(a) Any linear system (8.2.1) is a dilation of a minimal linear system; (b) the
linear system (8.2.1) is minimal if and only if (A, B) is a full-range pair and
(C, A) is a null kernel pair:

where m is the dimension of the state space. Moreover, in (8.2.2) one can
replace nJL 0 Ker CA' by np()' Ker CA' and Z^lm(B'A) by E^1 Im(B'A),
where p is any integer not smaller than the degree of the minimal polynomial
of A.

Indeed, (a) is a restatement of Theorem 7.1.3, and (b) follows from


Theorem 7.1.5.
It turns out that the conditions (8.2.2) obtained in Chapter 7 from
mathematical considerations have important physical meanings, namely,
"controllability" and "observability" of the linear system (8.2.1). Let us
introduce these notions.
The system (8.2.1) is called observable if for every continuous input u(t)
and output y(t) there is at most one solution x(t). In other words, by
knowing the input and output one can determine the state (including the
initial value) in a unique way.

Theorem 8.2.2
The system (8.2.1) is observable if and only if (C, A) is a null kernel pair:

Proof. Assume that (8.2.1) is observable. With y(t) = Q and u(t) = 0,


the definition implies that the only solution of the system

for f > 0 is x(t) = Q. If equality (8.2.3) were not true, there would be a
nonzero A:O £ nj!0 Ker CA' and the function x(t) — e'Ax0 would be a not
identically zero solution of equation (8.2.4). Indeed, for every 12^0 we have
Minimal Linear Systems: Controllability and Observability 267

Thus observability implies the condition stated in equality (8.2.3).


Now assume that (8.2.3) holds but (arguing by contradiction) the system
(8.2.1) is not observable. Then there exist continuous vector functions y(t)
and u(t) such that for j - 1, 2 and all t >0, we obtain

for some x { ( t ) and x2(t) that do not coincide everywhere. Subtracting


(8.2.5) with / = 2 from (8.2.5) withy = 1, and denoting x(t) = *,(0 - x2(t)^
0, we have

In particular, it is found that

Hence x0 = 0 by (8.2.3); but this contradicts x(t)^Q.

The system (8.2.1) is called controllable if by a suitable choice of input


the state can be driven from any position to any other position in a
prescribed period of time. Formally, this means that for every jct G (p"1,
x2 G <£"", and t2 > tl ^ 0 there is a continuous function u(t) such that
x ( t } ) = jc,, x(t2) = *2 f°r some solution *(/) of

Note that in the definition of controllability the second equation y(t) =


Cx(t) + Du(t) of equation (8.2.1) is irrelevant. Further, by replacing x(t) by
x(t - / j ) we can assume in the definition of controllability that f j is always 0.

Theorem 8.2.3
The system (8.2.1) is controllable if and only if (A, B) is a full-range pair:

We need the following lemma for the proof of Theorem 8.2.3.


268 Linear Systems

Lemma 8.2.4
Let G(t), f G [0, /0] be an mx n matrix depending continuously on t. Then

Proof. Let W= J^0 G(t)[G(t)]* dt. Assume x G <pm is such that x = Wy


for some y G <p". Then putting «(f) = [G(/)]*_y we find that x belongs to the
left-hand side of (8.2.7).
Conversely, if xl £lm W, then there exists an x2 G <pm such that Wx2 = 0
and (jtj, x2)^Q. [Here we use the property that W= W* and thus Im W —
(Ker W)1.] Arguing by contradiction, assume that there exists a continuous
vector function u(t) such that

Then

On the other hand

and since the norm is nonnegative and G(t)* continuous, we obtain


G(t)*x2 = 0, or x£G(f) = 0 for all / G [0, t2]. But this contradicts (8.2.8). D

Proof of Theorem 8.2.3. By formula (8.1.2) for every solution x(t) of


(8.2.6) with jc(0) = x{ we have

Hence

From this equation it is clear that (8.2.1) is controllable if and only if for
every t2 > 0 the set of m-dimensional vectors
Minimal Linear Systems: Controllability and Observability 269

coincides with the whole space <pm. By Lemma 8.2.4, the controllability of
(8.2.1) is equivalent to the condition that Im Wt = (p™ for all f >0, where

We prove Theorem 8.2.3 by showing that for all / > 0

If x e Ker Wt, then x*W,x = 0, that is

So B*e sA x - 0, Q<s^t. [Otherwise, in view of the continuity of


H/?*^**!!2 as a function of 5, we obtain a contradiction with (8.2.10).]
Repeated differentiation with respect to s and putting s = 0 gives

It follows that

Assume now that Then B*A*' lx = 0,i = l,2, It


sA
follows that B*e 'x = 0 when s > 0, and hence x*Wtx = 0 for t > 0. But W,
is nonnegative definite, so actually Wtx - 0, that is, jc E Ker Wt.

Combining Theorem 8.2.1 with Theorems 8.2.2 and 8.2.3, we obtain the
following important fact.

Corollary 8.2.5
The linear system (8.2.1) is minimal if and only if it is controllable and
observable.

This corollary, together with Theorem 7.1.5, shows that the concept of
minimality for systems and realizations of rational functions are consistent,
270 Linear Systems

in the sense that a system is minimal precisely when it determines a minimal


realization for its transfer function.

8.3 CASCADE CONNECTIONS OF LINEAR SYSTEMS

Consider two systems of type (8.1.1) (with initial value zero):

and

Suppose also that u } ( t ) and y2(t) are from the same space. The two systems
are combined in a "cascade" form when the output y2 of the second system
becomes the input w, of the first system. We obtain

and

Writing x(t) = we obtain a new system of the same type:

The system (8.3.3) is called a simple cascade composed of the first compo-
nent (8.3.1) and the second component (8.3.2). Note that the dimension of
the state space of the simple cascade is the sum of the state space
dimensions of its components, and the input of the simple cascade coincides
with the input of its second component, whereas the output of the simple
cascade coincides with the output of the first component.
Similarly, one can consider the simple cascade of more than two compo-
nents. Let (Av, /?,, C,, D,),. . . , (A , B , C , D ) be linear systems of
Cascade Connections of Linear Systems 271

type (8.1.1). A linear system that is obtained by identifying the output of


(A,, B,, C,., D,) with the input of (A,_lt £,_,, C,_,, D,_,), i = 2,3,. . . , p
will be called the simple cascade of the systems (.4,, Blt C,, D , ) , . . . ,
(Ap, Bp, Cp, Dp). By applying formula (8.3.2) p - 1 times, we see that such
a simple cascade has the form

In the language of transfer functions the simple cascading connection has


a very simple interpretation: formula (7.2.9) shows that the transfer
function of the simple cascade of two systems is the product of the trans-
fer functions of its first and second components (in this order). More
generally, if (A, B, C, D) is the simple cascade of (A^ 5,, C 1? £ > , ) , . . . ,
(Ap, Bpt Cp, D,), then

The following problem is of considerable interest: describe the represen-


tation of a given linear system (A, B, C, D) as a simple cascade of other
linear systems. We can assume that (A, B, C, D) is minimal (otherwise
replace it by a minimal system with the same input-output behaviour). In
order to relate this problem to the factorization problem for rational matrix
functions described in Sections 7.3 and 7.5, we shall assume that D — I and
that in each component (At, /?,, C,, D-) of the simple cascade (A, B, C, 7)
we have D, = /. Equation (8.3.4) shows that if (A, B, C, D) is a simple
cascade with components (At, Bt, C,, D,), / = ! , . . . , / ? , then the size of A
[or, what is the same, the McMillan degree S(W) of the transfer function
W(A) of (A, B, C, /)] is equal to ra, + • • • + mp, where mi is the size of At,
Denoting by W,(A) the transfer function of (At, Biy C,, £>,), we have
8(Wi) < mr On the other hand, as we have seen in the preceding paragraph

which implies

So equality holds throughout (8.3.6), which means that the factorization


(8.3.5) is minimal and that each system (A(, 5,, C,, D,), / = 1,. . . , p is
272 Linear Systems

minimal. Now we can use the results of Sections 7.3 and 7.5 concerning
minimal factorizations of rational matrix functions to study simple cascading
decompositions of minimal linear systems. The following analog of Theorem
7.5.1 is an example.

Theorem 8.3.1
The components of every representation of a minimal system (A, B, C, /) as
a simple cascade (with the transfer functions of the components having value I
at infinity) are given by

where the projectors TT, , . . . , irp and associated subspaces !£l,. . . , !£p are
defined as in Theorem 7.5.1. The transformations TTjA^ in (8.3.7) are
understood as acting in X^ and the transformations CTT^ and 7r(B are
understood as acting from j^ into <p", and from <p" into ^, respectively,
where n is the number of rows in C (which is equal to the number of columns
in B).

We now describe a more general way to connect two linear systems.


Consider the linear system

and assume that the input vector u = u(t) and the output vector y = y(t) are
divided into two components:

Now let

be another linear system with the input s(t), output z(t), and the state w(t).
(Here a, b, c, and d are constant matrices of appropriate sizes.) We obtain a
new system by feeding the first component of the output of (8.3.8) into the
input of (8.3.10) and at the same time feeding the output of (8.3.10) into the
second component of the input of (8.3.8). [It is assumed, of course, that the
vectors y\(t) and s(t) are in the same space, as well as the vectors « 2 (0 anc*
z(t).] This situation is represented diagrammatically by
Cascade Connections of Linear Systems 273

Here S t and S2 represent the linear systems described by equations (8.3.8)


and (8.3.10), respectively. The new system has u v ( t ) as an input and y2(t) as
an output and is called the cascade of (8.3.10) by (8.3.8). The "simple
cascade" described in the first part of this section is a particular case of a
cascade. Indeed, if the first component of the output y^(t) in the system
(8.3.8) depends on M,(/) only, and y2(t) = « 2 (0> then tne cascade described
by (8.3.11) is actually a simple cascade.
We turn now to a description of the cascade in terms of transfer
functions. First, rewrite (8.3.8) in the form

where

are the block matrix representations of B, C, and D conforming with the


division [equations (8.3.9)] of y(t) and u(t). The transfer function of this
system is

where W^(A) = Dif + C,( A/ - A) '#,; /, j =1,2. So passing to the Laplace


transforms, we have

where, as usual, the capital Roman letters indicate Laplace transforms of


the functions designated by the corresponding lowercase letters. Let V( A) be
the transfer function of (8.3.10); then

Now identify

Using (8.3.12)-(8.3.14) we have (omitting the variable A)


274 Linear Systems

and hence

Further

So the cascade of a linear system with the transfer function V( A) by a linear


system with the transfer function

has the transfer function t/(A) given by the formula

We recognize that U(\) is just a linear fractional transformation, t/(A) =


3PW(V), as discussed in Chapter 7. Consequently, the results of Sections 7.6,
7.7, and 7.8 can be interpreted in terms of minimal cascades of linear
systems. The cascade of (7.3.10) by (7.3.8) will be called minimal if the
corresponding linear fractional decomposition U = 2FW(V) is minimal. As an
example, let us restate Theorem 7.8.2 in these terms.

Theorem 8.3.2
Any minimal linear system with in-dimensional state space can be represented
as a minimal cascade of m linear systems each of which has one-dimensional
state space.

8.4 THE DISTURBANCE DECOUPLING PROBLEM

In this and the next section we consider two important problems from linear
system theory in which [A B]-invariant subspaces (as discussed in Chapter
6) appear naturally and play a crucial role.
Consider the linear system
The Disturbance Decoupling Problem 275

where A: <£"-+ f, B: <f m -» <p", E: <pp-» <p", and D. <p"-» <pr are constant
transformations, and *(/), w(f), q(t), and z(f) are vector functions taking
values in <p", <[?w, <pp, and £r, respectively.
As in Section 8.1, <£" is interpreted as the state space of the underlying
dynamical system, and u(t) is the input. The vector function z(t) is inter-
preted as the output. The term q(t) represents a disturbance that is supposed
to be unknown and unmeasurable. We assume that q(t) is a continuous
function of t for t > 0.
An important transformation of the system (8.4.1) involves "state feed-
back." This is obtained when the state x(t) is fed through a certain constant
linear transformation F into the input, so the input of the new system is
actually the sum of the original input u(t) and the feedback. Diagrammati-
cally, we have

Our problem is to determine (if possible) a state feedback F in such a way


that, in the new system, the output is independent of the disturbance q(t).
To express this problem in mathematical terms we introduce the follow-
ing definition. The system (8.4.1) is called disturbance decoupled if for every
x0 G <p" the output z(t} of the system (8.4.1) with jt(0) = jc0 is the same for
every continuous function q(t). We have (cf. Section 8.1)

and thus

Hence the system (8.4.1) is disturbance decoupled if and only if

for every continuous function q(t).


We need one more notion from linear system theory. Consider the linear
system

where A: <p"—»<p" and B: <p m —» <p" are constant transformations. We say


276 Linear Systems

that the state vector y G (p" is reachable for the system (8.4.2) if there exist a
t0 > 0 and a continuous function u(t) such that the solution x(t) of (8.4.2)
satisfies x(tQ) — y. As

for / >0, it follows easily that the set of all reachable state vectors of (8.4.2)
is a subspace.

Proposition 8.4.1
The set 91 of reachable states coincides with the minimal A-invarant subspace
that contains Im B:

Proof. By Lemma 8.2.4 we find that x e <3l if and only if

for some

By equality (8.2.9)

or, taking into account the hermitian property of W,o

which coincides with (A \ Im B} in view of Theorem 2.7.3.

Using this proposition, we obtain the following characterization of distur-


bance decoupled systems.

Proposition 8.4.2
The system (8.4.1) is disturbance decoupled if and only if
The Disturbance Decoupling Problem 277

Returning to the problem mentioned above, note that state feedback is


described by a transformation F: <p"-» <pm, and substituting u(t) + Fx(t) in
place of u(t) in the system (8.4.1), we obtain the system with state feedback:

The new system has the same form as the original system (8.4.1), with A
replaced by A + BF. Our mathematical problem is: given transformations
A: <p"-» <p" and B: <p m ^ <p", and given subspaces g C <p" (which plays the
role of Im E) and 3) C <p" (which plays the role of Ker D), find, if possible,
a transformation F: <p" -»<f"" such that the subspace

[which is the minimal (A + BF)-invariant subspace containing <£] is con-


tained in 2).
The solution to this problem depends on the notion of [A B]-invariant
subspaces, as developed in Chapter 6.

Theorem 8.4.3
In the preceding notation, there exists a transformation F: <p" —> <pm such that

if and only if the [A B]-invariant subspace °U that is maximal in 9) contains


%. In this case any transformation F: <p"-* (pm with (A + BF)°U C % (which
exists by Theorem 6.1.1) has the property (8.4.3).

Proof. Assume that there is an F: <p"—» <pm with the property (8.4.3).
By Theorem 2.8.4 (applied with A + BF playing the role of A and any
transformation whose image is <£ playing the role of B) the subspace
(A + BF | g) is (A + BF) invariant, and thus (Theorem 6.1.1) it is [A B]
invariant. As (A + BF %) D <£, and the maximal (in 2)) [A B]-invariant
subspace °U contains < A + BF \ %), we obtain ^ D g.
Conversely, assume °ti D <£. By Theorem 6.1.1 there is a transformation
F: <p"-» <pm such that (A + BF)^ C <U. Now

and (8.4.3) follows.


278 Linear Systems

When applied to the disturbance decoupling problem, Theorem 8.4.3 can


be restated in the following form.

Theorem 8.4.4
Given a system (8.4.1), there exists a state feedback F: <p"—> <pm such that the
system

is disturbance decoupled if and only if the [A B]-invariant subspace °U that


is maximal in Ker D, contains Im E. In this case the system (8.4.4) is distur-
bance decoupled for every transformation F: <p"—» <p™ with the property that
°U is (A + BF) invariant.
We illustrate Theorem 8.4.4 by a simple example.

EXAMPLE 8.4.1. Let

where al, a2, and « 3 , as well as £,, b2, and b3 are complex numbers not all
zero. Using Theorem 6.4.1 and its proof, we find that a one-dimensional
subspace M is [A B] invariant if and only if

for some A G <p, and a two-dimensional subspace M is [A B\ invariant if and


only if either

or

Consider first the case when Ker D is [A B] invariant. This happens if and
only if 6 3 ^0. Then obviously Ker D is the maximal [A #]-invariant
The Output Stabilization Problem 279

sukkkbspaceindefrjd,nanddefral;fkjl; jubkf;ehuf aklfjsabdj


l;kfjkijb;lnkjkajfgl;tkaljskofkjhsijbfonkgkusfabnd
jl;kj;lnast
kjl;j;elkunjjlj fshefhuxj
j fss XKLDJFL;AHFL/NH;H
H LKJL;JLK
so, when DF 3 ^0, there exists a 1 x 3 matrix JKDLAF such that the
system (8.4.4) is disturbance decoupled if and only if a,6, + a2b2 + a3b3 = 0,
and in this case one can take XBVGXD in such a way that the polynomial
bl + b2x + b3x2 divides DSFSFSGFVSGSDGGDKJ;ALKJF;SKOAJFA;SKLJ;KLJL;JKL;KJFLKJALKLKJKJFJOJWIOJFJSFJSJLK;NJAJKSFA;LKS;LKJFAS;LKJFEJ;IOJK;DJF SJKAssume
FKOIEKN;IOEL;KNKL;J now that Ker D is not
ALJKLJ;LJLJLJAALA;KFJ BDFNLKA;LKJFIEKJL;KJDISATHKILYARAJAFL;KJL;IEKLKJ THEN THE MAAXIKJAL AAL;KJKL;SJF
sukkbspace in der dis span WHEREWHEREWHEREWHEREWHERE INTHISCASWWEN
have Span im E if and only if

So, if b3 = 0 and b2 7^0, there exists an F= [/i/ 2 / 3 ] as in Theorem 8.4.4 if


and only if (8.4.5) holds, in which case one can take/y in such a way that the
polynomial b} + b2x divides -/} - f2x - f3x2 + jc3. Finally, assume b2 =
b3 — Q. Then the maximal [A fi]-invariant subspace in Ker D is the
zero subspace, and there is no F for which the system (8.4.4) is distur-
bance decoupled. D

ALSKJFLSJ;FKSSL;JFKLSJFLJSKDJOUKTJPLUOTSTAHB;IOESAT;IOJ
8.5 THE OUTPUT STABILIZATIONKLPRONELM DJLJSOFKOLN
PROBLEM

tConsider theSYSTEM
CONSIERTHE system

where the transformations and ARE


constant. The problem we deal with in this section is that of stabilizing the
output z(t) by means of a state feedback while still maintaining the freedom
to apply a control function
function u(t). More exactly, the problem is to find a
TRANSFORMATION 9QHIXHHREPRESENTS THE STAE FEED BACKKK90
solution of the new system

WITH IENTICALY ZEERKOONKPLAKJFLH FKOR EVERY INITIAL VALUKE


AS
ThTTTh linear System
this condition amounts to

To study the property (8.5.2), we need the following lemma. This is, in
fact, a special case of Theorem 5.7.2, but it is convenient to have some of
the conclusions recast in the present form.

Lemma 8.5.1
lettyth be stranfsojsrmfation , anfd sklfj
Q, M + be the sum of root
subspaces of A corresponding to the eigenvalues with negative, zero, a
positive real parts, respectively. Thus

and fkodsr skome wen have

(c) fkor all

Note that Jt_, M(), and M+ are /1-invariant subspaces, and therefore
these subnspaces farse also oknvariant fro the ftransofkormations
gijvefn al. b,d, as in fdjl;skjfldfjla;kjf;skanjktransofosrmadtion

be the maximal (A + BF)-invariant subspace in Ker D. The condition


(8.5.2) can be expressed in terms of the root subspaces of A + BF as
follows.

Lemma 8.5.2
We have

if and only if £% fkoreverhyki efifefn asdvljliule sufch that j


4re

Proof. By Theorem 2.7.4 we have


The Output Stabilization Problem 281

cccc
with respect tko thef diect sli skjflksjfl;jedckojklpjkojtisoitijojnl ljlsfjlsjfs;s;s;s;s;s;s;slslslsllll where
where i is ais a
direct complement to N alsoalso jis a null ellhance

Nowcklearlhy
now clearly for everhy
fkor every
with with if andj only jilfj jifn hkas all iltls eigen vslues iln theffj kkoopen
left-half
left . hjalf plane.
j;plane.l So we jhkavfe
so jwe have tootprkovke
prove that
that

if and only if all the eigenvalues of A22 are in the open left-half plane.
Let jc be an eigen vfectkor of lkof ;lff;lcokrresopojnding tko the eigenvaluke4 vallirhj with
then

and bykl lemma al;js;jl;sfl;jks;lfj;sljfjkl

;aflj but if thfen

which contradicts the definition of NF. Hence in equality (8.5.4) holds,


and thus (8.5.3) does not.
Conversely, if <r(/4 22 ) lies in the open left-half plane, then (8.5.3) holds
by Lemma 8.5.1 (where A22 plays the role of A). D
Now we can reformulate the problem of stabilizing the output by state
fedbacjk as followa givfefn transpfokrmationsl and s
subspace 9whuich plays thfe rkokle kof ker gh such
that everyk rkootj subspace of abf ckorrepkondinf ttko an egtenvslkue fa;llkydlvflk7ue a;llllllllvalk7ue wfith
nkonnefgtatjiveff rfeal jipart jis contained in
In this formulation there is nothing special about the set of eigenvalues
with nonnegative real parts. In general, we can consider any proper subset
kodk (the "bad:":domain )domain domain
klkklklkllof thfe colkosed riht _half plane. cloksefd right .lhalf oplane right half plane. plane.
Now we can prove a general result on solvability of this problem in terms
of [A /?]-invariant subspaces.

Theorem 8.5.3
Given transformation, and a substpace
ace
there exists a transformation such thatjat for
t282 linjear systems

every eigenvaluy4r ifa and only if , for evferyi eigen vaklue of


A in H 6 , we have

where (A \ Im B) is the minimal A-invariant subspace containing Im B and


is the mazimal invariant subspace in

proof,. for fafgevfe t4ransofkormation f be the maximal


invariant asw F is also [A B] invariant, we have

Assume now that F is such that

for every eigenvaklue4 id WBD RGr vwkibfa ri then ,by lemmma


for every ve

and hence

CCCCCCCCCCC;LAKJSL;KJF;THE CANKOJ;LICSL LINEFARTRANSFROMAFTION FJ


FOREVERY ANDFORATRANSFRONATION
FOR WHILCH IJS AN INVGARIANT SUBSPACE, LKET BE THE
TRANSFORMATION INDUCED BY XKON ONE EASILY CHFECKS FTHAT A
NOW USE LEMMA 654TOIJBRub

for every sjilmilarlsyu

for every that ils jnot6j ank ejiotn vslkukefkj ofj a=bf h,cibnsyqhubtkgtv

for every
onvefrselyk, assume thaqt 98560holsj for eve4ry we hjave
tko oporovef fthat6 jtherfe efxists an f such thatj fkor evefryk
The Output Stabilization Problem 283

eigenvalkue kof ja=bfthatj blkelogs tykl let be aj transfor


mation sfuch thatj that 9a-bf it ijs jfeasily sen thatj the subspace
is a invariant. we have = A, where
where the upper
the jupper
bar jdenotes thef jilnducecd transfkormation kon enoting byklb the
canonnical tansfkormation wesee thatj lemma 6.5,.4andj equality
se r y i

Hence

further,u t5he jilncu jimpljies that indwe


havfe seen thaincluksjinon at thej begilnnilng jof jthlis ;rrko.
opposdfite injcluksiion, taskke then fkor we
have ece andj thej inclusiion
folllows. be the canoknocslby
thej transfokrmation inducfefd btykl it is assumed
that isz invgarjiant 0a

Nopw Theorem 6.5.3 implies the existence of a transformaton


such that the spectrum of liies in the comple-
ment of We have

which means that

By Lemma 6.5.4 again, for every

so

and F = F0 + FI is the desired transformation. D


284 Linear Systems

The proof of Theorem 8.5.3 shows that, assuming $k(A)C


(A | Im B) + °U for every A0 E Clb, the transformation F: $"-+ fm such that
91A (A + BF) C % for every A0 E Ht can be constructed as follows: F =
F0 + F,P, where F0: f"-* $m is such that (A + BF0)*U C °U; P: ("W is the
canonical transformation; and Fl: $nl°U —> <pm has the property that the
spectrum of the transformation on <p"/^ induced by the transformation
A + B(FQ + F,P) on <p" lies outside H 6 .
Applying Theorem 8.5.3 to the output stabilization problem, we obtain
the following result.

Theorem 8.5.4
Given the linear system

with constant transformations A: f "-+$", B: fm-*f"t and D: $"->$'>


there exists a transformation (state feedback} F: <p" —* (pm such that, for every
initial value jc(0), the solution of (8.5.8) with u(t) = Fx(t) satisfies
lim,^.^ z(r) = 0 if and only if

for every eigenvalue A0 of A lying in the closed right half plane, and where °ll
is the maximal [A B]-invariant subspace in Ker D.

We conclude this section with an example illustrating Theorem 8.5.4.

EXAMPLE 8.5.1. Let

Ker D — Span{

where al, a2, a3 are complex numbers not all zeros. Here (A Im B) =
Span{el,e2}. If £%eA 0 <0, then there is always an F=[flf2f3] with
properties as in Theorem 8.5.4 (one can take /3 = 0 and choose /i and /2
that the equation A 2 -/ 2 A -/, = 0 has its zeros in the open left-half plane).
So assume 9le A0 > 0. Then there exists an F as in Theorem 8.5.4 if and only
if
Exercises 285

If «3 - 0, then (8.5.9) is always false. If «3 ^0, then (8.5.9) happens if and


only if the subspace Ker D is [A B] invariant, or, equivalently, if Ker D is
(A + BG) invariant for some G = [g,g 2 g 3 ]. An easy verification shows that
this is the case if and only if So there exists an as in
Theorem 8.5.4 if and only if « 3 ^ 0 and a 2 = A 0 a 1 . In this case /3 =
Aoflj — /jflj - A 0 / 2 a, and /, and /2 for which the zeros of A 2 — / 2 A — /, = 0 are
in the left half plane will do. D

8.6 EXERCISES

8.1 For every input u(t) find the output y(t) for the following linear
systems:

8.2 For every input w(f) find the output y(t) for the following linear
systems:

where B is the (k^ + k2) x 2 matrix whose first column is ek and


second column is ek +k , and C is the 2 x (kl + k2) matrix whose first
row is e\ and second row is eTk +1 .

where A is an nx n lower triangular matrix and C=


[0 • • • 0 1 0 • • • 0] with 1 in the &th place.
286 Linear Systems

8.3 Consider the linear system

When is this system controllable? observable? minimal?


8.4 Find transfer functions for the linear systems given in Exercises 8.1
and 8.2.
8.5 Build minimal linear systems with the following transfer functions:

(b) p( A) ', where p( A) = E*=0 a ; A y is a scalar polynomial.


(c) (L(\)y\ where L(A) is a monic n x n matrix polynomial of
degree /.
8.6 Show that the system

is controllable and observable.


8.7 For the system in Exercise 8.6, given the n-tuple of complex numbers
A,, . . . , \n, find a state feedback F such that A + BF has eigenvalues
A , , . . . , A n . Also, find G such that A + GC has eigenvalues

8.8 Let

be a linear system with n x n circulant matrix A and n x 1 and 1 x n


matrices B and C, respectively. When is the system controllable?
Observable? Minimal?
Exercises 287

8.9 Consider the linear system

where / is a nilpotent n x n Jordan matrix (i.e., with J" = 0) and B


and C are n x 1 and 1 x n matrices, respectively. When is this system
controllable? Observable? Minimal?
8.10 Prove or disprove: if the system

is minimal, then the system

is minimal as well.
8.11 Let p(A) be a polynomial of the transformation A: <p"-» <p". Prove
that the minimality of the system

implies the minimality of

Is the converse true?


8.12 Let

and

be two systems, and assume that A2-p(Al), where p(A) is a


polynomial such that p(A,) 5^p(A 2 ) for any pair of different eigen-
values A! and A2 of A l and p'( A)| A=A ^ 0 for every eigenvalue A0 of A l
such that A^ (A ) is not diagonable. Prove that the systems are
simultaneously minimal or nonminimal.
288 Linear Systems

8.13 Show that if the system

is controllable, then for every A0 G <p the system

is controllable as well. Is this property true for the observability of


systems?
8.14 For a controllable system

where bn ¥=• 0, find a state feedback F such that the system with
feedback

is stable, that is, all its solutions x(t) tend to zero as t—»°°.
8.15 For system (1) in Exercise 8.14 and any k >0, find a state feedback F
such that all solutions x(t) of the system with feedback satisfy
||*0)|| < Ke~kl, where K>0 is constant independent of t.
8.16 Prove that any minimal linear system with n-dimensional state space
has a state feedback for which the system with feedback can be
represented as a simple cascade of n linear systems with state spaces
of dimension 1.
8.17 Prove that controllability is a stable property in the following sense:
for everv controllable svstem

there exists an e > 0 such that any linear system

with is controllable as well.


Exercises 289

8.18 Prove that observability and minimality of linear systems are also
stable properties. The definition of stability is, in each case, to be
similar to that of Exercise 8.17.
8.19 Show that for any system

there exists a sequence of minimal systems


Notes to
Part 1

Chapter 1. The material here is quite elementary and well known,


although not everything is readily available in the literature. Part of Section
1.5 is based on the exposition in Chapter S4 of the authors' book (1982).
More about angular transformations and matrix quadratic equations can be
found in Bart, Gohberg, and Kaashoek (1979). Angular subspaces and
operators for the infinite dimensional case were introduced and studied in
Krein (1970).
Chapter 2. The proof of the Jordan form presented here is standard and
can be found in many books in linear algebra; for example, see Gantmacher
(1959) or Lancaster and Tismenetsky (1985). A proof of the Jordan form
can be obtained also by analyzing the properties of the set of all invariant
subspaces as a lattice. This was done in Soltan (1973a). In this approach, the
invariance of the Jordan form follows from the well-known Schmidt-Ore
theorem in lattice theory [see, e.g., Kurosh (1965)].
"The yl-invariant subspace maximal in JV" and "the .A-invariant subspace
minimal over JV" are phrases that are introduced here probably for the first
time, although the notions themselves had been developed and are now well
known in the context of linear systems theory. In general, the whole
material of Sections 2.7 and 2.8 is influenced by linear system theory.
However, our presentation here is independent of that theory and leads us
to abandon its well-established terminology. In particular, in linear systems
theory, "full-range" and "null kernel" pairs are known as "controllable"
and "observable" pairs, respectively. Marked invariant subspaces are prob-
ably introduced for the first time. The existence of nonmarked invariant
subspaces is often overlooked. The description of partial multiplicities and
invariant subspaces of functions will hold no surprises for the specialist, but,
again, these are results that are not easily found in the standard literature on
linear algebra.

290
Notes to Part 1 291

Chapter 3. The material of this chapter (except for Theorem 3.3.1) is


well known. Theorem 3.3.1 in the infinite dimensional case was proved by
Sarason (1965). Here we follow his proof.
Chapter 4. The problem of analysis of partial multiplicities of extensions
from an invariant and a coinvariant subspace was stated in Gohberg and
Kaashoek (1979). This problem was connected there with the description of
partial multiplicities of products of matrix polynomials in terms of partial
multiplicities of each factor and reappears in this context in Section 5.2. The
first results concerning this description were proved in Sigal (1973). In
particular, Theorem 3.3.1 was proved in that paper. Example 4.3.1 and the
material in Section 4.4 (except for Proposion 4.4.1) is taken from Rodman
and Schaps (1979). For further information and more inequalities concern-
ing the partial multiplicities, see Thijsse (1980,1984) and Rodman and
Schaps (1979).
When this book was finalized, the authors learned about another impor-
tant line of development concerning the problem of partial multiplicities of
products of matrix polynomials. This has been intensively studied (even in a
more general setting) by several authors. The reader is referred to recent
work of Thompson (1983 and 1985) for details and further references.
Chapter 5. The theory presented in this chapter can be viewed as a
generalization of the familiar spectral theory of a matrix A but, in this
context, identified with the linear matrix polynomial A / — A. This theory of
matrix polynomials was developed by the authors and summarized in the
book by Gohberg, Lancaster, and Rodman (1982). The material and
presentation in this chapter is based on the first four chapters of that book.
It also contains further results on matrix polynomials including least com-
mon multiples, greatest common divisors, matrix polynomials with her-
mitian coefficients, nonmonic matrix polynomials, and connections with
differential and difference equations. Lists of relevant references and
historical comments on this subject are found in the above-mentioned
monograph by the authors (1982). In this presentation we focus more
closely on decompositions into three or more factors. Theorem 5.2.3 is close to
the original theorem of Sigal (1973) concerning matrix-valued functions. See
also Thompson (1983 and 1985).
Chapter 6. The main results of this chapter were first obtained in a
different form in the theory of linear systems [see, e.g., monographs by
Wonham (1974) and Kailath (1980)]. In this chapter the presentation is
independent of linear systems theory and is given in a pure linear algebraic
form. This approach led us to change the terminology, which is well
established in the theory of linear systems, and to make it more suitable for
linear algebra.
The ideas of block similarity in Sections 6.2 and 6.6, as well as of
[A B]-invariant and -invariant subspaces, are taken from Gohberg,
292 Notes to Part 1

Kaashoek, and van Schagen (1980). That paper contains a more general
theory of invariant subspaces, similarity, canonical forms, and invariants of
blocks of matrices in terms of these blocks only. Some applications of these
results may be found in Gohberg, Kaashoek, and van Schagen (1981,1982).
Theorem 6.2.5 was proved (by a direct approach, without using the Kronec-
ker canonical form) in Brunovsky (1970). The connection between the
Kronecker form for linear polynomials and the state feedback problems is
given in Kalman (1971) and Rosenbrock (1970). In Theorem 6.3.2 the
equivalence of (a) and (d) is due to Hautus (1969).
The spectral assignment problem is classical, by now, and can be found in
many books [see, e.g., Kailath (1980) and Wonham (1974)]. There is a more
difficult version of this problem in which the eigenvalues and their partial
multiplicities are preassigned. This problem is not generally solvable. For
further analysis, see Rosenbrock and Hayton (1978) and Djaferis and Mitter
(1983).
Chapter 7. The concept of minimal realization is a well-known and
important tool in linear system theory [see, e.g., Wonham (1979) and
Kalman (1963)]. See also Bart, Gohberg, and Kaashoek (1979), where the
exposition matches the purposes of this chapter. Section 7.1 contains the
standard material on realization theory, and Lemma 7.1.1 is a particular
case of Theorem 2.2 in Bart, Gohberg, and Kaashoek (1979).
Section 7.2 follows the authors' paper (1983a). Sections 7.3-7.5 are based
on Chapters 1 and 4 in Bart, Gohberg, and Kaashoek (1979). Here, we
concentrate more on decompositions into three or more factors.
Linear fractional decompositions of rational matrix functions play an
important role in network theory; see Helton and Ball (1982). Theorem
7.7.1 is proved in that paper. The exposition in Sections 7.6-7.8 follows that
given in Gohberg and Rubinstein (1985).
Chapter 8. In the last 20 years linear system theory has developed into a
major field of research with very important applications. The literature in
this field is rich and includes monographs, textbooks, and specialized
journals. We mention only the following books where the reader can find
further references and historical remarks: Kalman, Falb, and Arbib (1969),
Wonham (1974), Kailath (1980), Rosenbrock (1970), and Brockett (1970).
This chapter can be viewed as an introduction to some basic concepts of
linear systems theory.
The first three sections contain standard material (except for Theorem
8.3.2). In the last two sections we follow the exposition of Wonham (1979).
Part Two

Algebraic
Properties of
Invariant Subspaces

In Chapters 9-12 we develop material that supplements the theory of Part 1.


In particular, we go more deeply into the algebraic structure of invariant
subspaces. We include a description of the set of all invariant subspaces for a
given transformation and examine to what extent a transformation is defined
by its lattice of invariant subspaces. Special attention is paid to invariant
subspaces of commuting transformations and of algebras of transforma-
tions . In the final chapter the theory of the first two parts (developed for complex
linear transformations) is reviewed in the context of real linear transformations.

293
This page intentionally left blank
Chapter Nine

Commuting Matrices
and Hyperinvariant
Subspaces

In this chapter we study lattices of invariant subspaces that are common to


different commuting transformations. The description of all transformations
that commute with a given transformation is a necessary part of the
investigation of this problem. This description is used later in the chapter to
study the hyperinvariant subspaces for a transformation A, that is, those
subspaces that are invariant for any transformation commuting with A.

9.1 COMMUTING MATRICES

Matrices A and B (both of the same size n x n) are said to commute if


AB = BA. In this section we describe the set of all matrices which commute
with a given matrix A. In other words, we wish to find all the solutions of
the eauation

where X is an n x n matrix to be found.


We can restrict ourselves to the case that A is in the Jordan form. Indeed,
let / = S ~ 1AS be a Jordan matrix for some nonsingular matrix 5. Then X is a
solution of equation (9.1.1) if and only if Z = 5 ~1XS is a solution of

So we shall assume that A - J is in the Jordan form. Write

295
296 Commuting Matrices and Hyperinvariant Subspaces

where Ja(a = 1,. . . , u) is a Jordan block of size ma x ma, Ja = \ala + Ha,


where la is the unit matrix of size ma x ma, and Ha is the ma x ma nilpotent
Jordan block:

Let Z be a matrix that satisfies (9.1.2) and write

where Za/3 is a ma x m^ matrix. Rewrite equality (9.1.2) in the form

Two cases can occur:

(a) Aa T^ Ap. We show that in this case Za/3 = 0. Indeed, multiply the
left-hand side of equality (9.1.3) by Aa — A^ and in each term in
the right-hand side replace We
obtain

Repeating this process, we obtain for every

Choose p large enough that either Hqa = 0 or Hp^ q = 0 for every


q = 0, . . . , / ? . Then the right-hand side of equation (9.1.4) is zero,
and since \a ^ A^, we find that Zaft = 0.
(b) Att = A0.Then

From the structure of Ha and Hft it follows that the product HaZaf}
is obtained from Zafi by shifting all the rows one place upward and
filling the last row with zeros; similarly, ZaftHft is obtained from Zafi
Commuting Matrices 297

by shifting all the columns one place to the right and filling the first
column with zeros. So equation (9.1.5) gives (where £ik is the
(j, /c)th entry in Za/3, which depends, of course, on a and |8):

where by definition £.0 = £ m + ltk — 0. These equalities mean that the


matrix Z ~ has one of the following structures:

where Qpq stands for the zero p x q matrix. Matrices of types


(9.1.6)-(9.1.8) are referred to as upper triangular Toeplitz matrices.
So we have proved the following result.

Theorem 9.1.1
Let / = diag[/,, • • . , Ju] be an n x n Jordan matrix with Jordan blocks
7,, . . . , Ju and eigenvalues A n . . . , A M , respectively. Then an n x n matrix Z
commutes with J if and only if Za/3 = 0 for \a ^ A^ and Za/3 is an upper
triangular Toeplitz matrix for A0 = A^, where Z = [Z ap ]^^ = 1 is the partition
of Z consistent with the partition of J into Jordan blocks.

We repeat that Theorem 9.1.1 gives, after applying a suitable similarity


transformation, a description of all matrices commuting with a fixed matrix
A. This theorem has a number of important corollaries.

Corollary 9.1.2

Let A be an n8n matrix partitionaed as follows:


298 Commuting Matrices and Hyperinvariant Subspaces

where the spectra of the matrices A l and A2 do not intersect. Then any n x n
matrix X that commutes with A has the form

with the same partition as in equality (9.1.9).

Proof. Let /, (resp. 72) be the Jordan form of A} (resp. A2), so


/. = S^1 AiSi for some nonsingular matrices Sl and S2. Then

is the Jordan form of A. By Theorem 9.1.1, and since ^(J,) fl (r(J2) — 0, any
matrix Y that commutes with J has the form 1
•*•
with the same.
partition as in (9.1.9). Now Y commutes with J if and only if X= SYS
commutes with A, where So

has the desired structure.

This corollary, reformulated in terms of transformations, runs as follows:


let A: <p"—»<p" be a transformation, and let Ml and M2 be A-invariant
subspaces that are complementary to each other and for which the restric-
tions A\M and A\M have no common eigenvalues. Then M\ and M2 are
invariant subspaces for every transformation that commutes with A. To
prove this, write A in the 2 x 2 block matrix form with respect to the direct
sum decomposition and use Corollary 9.1.2. The next result
is a special case.

Corollary 9.1.3
Every root subspace for a transformation A: <p"-» <p" is a reducing invariant
subspace for any transformation that commutes with A.

The proof of Theorem 9.1.1 allows us to study the set ^(A) of all
matrices (or transformations) that commute with the matrix (or linear
transformation) A. First, observe that ^(A) is a linear vector space. Indeed,
if
if AX for and then also doe
any complex numbers a and j8.
To compute the dimension of ^(A), consider the elementary divisors of
A. Thus, for every Jordan block or size k x k and eigenvalue A0 in the
Jordan normal form of A we have an elementary divisor (A 0 - A 0 ) of A
Commuting Matrices 299

(which is a polynomial in A). The greatest common divisor of two elemen-


tary divisors ( A - A j ) * 1 and ( A - A 2 ) * 2 of A is (A - \l)mtn(kl'*2) if A, = A 2
and is 1 if \l^\2. Taking this observation into account, Theorem 9.1.1
shows that the dimension of ^(A) is Ef , = 1 asl, where a^, is the degree of
the greatest common divisor of (A — A5)*J and (A-A,)*', and
are all the elementary divisors of A. In particular

where n is the size of A.


We have seen that, quite obviously, any polynomial in A commutes with
A, and we now ask about conditions on A such that, conversely, each matrix
commuting with A is a polynomial in A.
To this end we need the following notion. An n x n matrix (or transfor-
mation A: <p" —> <(7") is called nonderogatory if there is only one Jordan block
in the Jordan form of A associated with each eigenvalue. It turns out that A
is nonderogatory if and only if any one of the following four equivalent
statements holds: (a) dim Ker( A/ - A) < 1 for every A E <p; (b) A is similar
to a matrix

for some complex numbers « 0 ,. . . , «„_,; (c) the minimal polynomial of A


coincides with the characteristic polynomial of A; and (d) A is cyclic, that is,
there exists an x E (p" such that

Indeed, by assuming that A is in the Jordan form, condition (a) is clearly


equivalent to A having only one Jordan block for each eigenvalue. By
Theorem 2.6.1, (d) is equivalent to A being nonderogatory. Further, the
ap
minimal polynomial for A is easily seen to be p) ,
where A j , . . . , A p are all the distinct eigenvalues of A and ay is the maximal
size of the Jordan blocks of A corresponding to A. From this description it is
clear that (c) is equivalent to (a). We have proved, therefore, that (a), (c),
and (d) are equivalent to each other and to the condition that A is
nonderogatory.
Let A be the matrix (9.1.11). We want to prove that (a) holds. Let
300 Commuting Matrices and Hyperinvariant Subspaces

and be eigenvectors of A corresponding


to the eigenvalue A 0 . Thus Ax = A0Jt, Ay = \0y, and x7^0, y^O. The
structure of A implies that jc, = Aj,"1.*,, yt - ^lyl for / = ! , . . . , n. But
then necessarily xl ^ 0, _y, T^ 0, and jc = (yi/*i).y, that is, jc and y are linearly
dependent. Hence (a) holds.
Finally, we show that (d) implies (b). First observe that if (9.1.12) holds,
then the vectors x, Ax,. . . , A"~lx are linearly independent (otherwise <p"
would be spanned by less than n vectors, which is impossible). In the basis
jc, Ax,. . . , A"~lx the matrix A has the form (9.1.11).

Theorem 9.1.4
Every matrix commuting with A is a polynomial in A if and only if A is
nonderogatory.

Proof. First recall that in view of the Cayley-Hamilton theorem the


number of linearly independent powers of A does not exceed n. Thus, if
AX - XA implies that ^ is a polynomial of A, then X can be rewritten as
X = p(A), where / = d e g p ( A ) = £ n and all powers /, A, A2,. . . , Al~l are
linearly independent. So in this case dim ^(A) = / < n. Inequality (9.1.10)
then implies that dim Vo(A) = n. This means [again in view of (9.1.10)] that
ast = 0 for s ¥ ^ t . So in the Jordan form of A there is only one Jordan block
associated with each eigenvalue of A.
Conversely, assume that the Jordan form of A is

where A 1 ? . . . , \s are different complex numbers. As we have seen, the


solution X of AX = XA is then similar to a direct sum of upper triangular
Toeplitz matrices

More exactly, where 5 is a nonsingular matrix such


that / = S~1AS. Now a polynomial p( A) satisfying the conditions

for / = 1,. . . , s gives the desired result:


Common Invariant Subspaces for Commuting Matrices 301

Note that p( A) can be chosen with degree not exceeding

We now confine our attention to matrices commuting with a diagonable


matrix. Recall that an n x n matrix A is diagonable if and only if there is a
basis in <p" of eigenvectors of A. The following corollary is obtained from
Theorem 9.1.1.

Corollary 9.1.5
//A,, . . . , As are the distinct eigenvalues of a diagonable matrix A, then

where

For future reference let us also indicate the following fact.

Proposition 9.1.6
An n x n matrix B commutes with every n x n matrix A if and only if B is a
scalar multiple of /: B = A/ for some A E (f1.

Proof. The part "if" is obvious. So assume that B commutes with every
n x n matrix A—in particular, taking A to be diagonal with n different
eigenvalues with respect to a basis * , , . . . , xn in <p", Corollary 9.1.2 implies
that B is also diagonal in this basis. Therefore, Bxl ESpan{x,}. As any
nonzero vector xx appears in some basis in <p", we find that Bx = Ax for
every * E <p" -> {0}, where the number can depend on x: A = A(z). How
ever, if Bx=A(x)x, By = \(y)y with A(x)*\(y), then B(x + y)0
Span{x + y], a contradiction. Hence A is independent of x and the propos
ition is proved.

9.2 COMMON INVARIANT SUBSPACES FOR COMMUTING MATRIC

In this section we establish a fundamental property of a set of commuting


transformations, namely, that there is always a complete chain of subspaces
that are invariant for every transformation of the set.
302 Commuting Matrices and Hyperinvariant Subspaces

Theorem 9.2.1
Let ft be a set of commuting transformations from <p" into <p" (so AB - BA
for any A, B E. ft). Then there exists a complete chain of subspaces
dim Mt = j, such that MG, M.^ . . . , Mn are invariant for
every transformation from ft.

Proof. For every nonzero vector x E <t" write

Clearly Z£(x) is a nonzero subspace that is invariant for any A E ft (in short,
ft invariant).
Now let xl E <p" be an eigenvector of some transformation / 4 , E f t
corresponding to an eigenvalue A,; so >!,*, = A,*,. Hence for every
Bl,. . . , Bk £ ft we have

so

Let Jt 2 EJ£(jt,) be an eigenvector of some A2 Eft: A2x2 = \2x2. Then


A2\<f(x } = A 2 /, and Z£(x2) C <£(x}). We continue the construction of nonzero
subsoaces

where -A,|^(;t j = \fl, i = 1, . . . , k for some Al, . . . , Ak E. ft and complex


numbers A,, . . . , \k, until we encounter the situation where 2£(y) — <&(xk)
for every eigenvector y&^(xk) corresponding to any eigenvalue A of any
transformation BE ft. In this case every B E f t has an eigenvalue \B with
the property that B^(x ) — \BI. Let y\ be any nonzero vector from ££(xk).
Then the subspace Ml =Span{_y 1 } is ft invariant.
Let jVj be a direct complement to M} in (p". With respect to the
decomposition M , + ^V, = (f1", we have

for any

The condition AB = BA implies that A . Repeating the abov


procedure, we find a common eigenvector y2 G ^ of all linear transfor-
mations from ft. Put M2 = Span{.y,, y2}, and so on. Eventually we obtain a
complete chain of common ft-invariant subspaces.

In terms of bases, Theorem 9.2.1 can be stated as follows.


Common Invariant Subspaces for Matrices with Rank 1 Commutators 303

Theorem 9.2.2
Let fl be a set of commuting transformations from <p" into (f"1. Then there
exists an orthonormal basis *,,. . . , xn in <p" such that the representation of
any AE.fl in this basis is an upper triangular matrix.

Proof. be a complete chain of


subspaces as in Theorem 9.2.1. Now construct an orthonormal basis
X i , . . . , x in such a way that Span for

If every transformation from the set ft is normal, the upper triangular


matrices of Theorem 9.2.2 are actually diagonal (cf. the proof of Theorem
1.9.4). As a result we obtain the "only if" part of the following result.

Theorem 9.2.3
Let ft be a set of normal transformations <p" —> £". Then AB = BA for any
transformations A, BE.il if and only if there is an orthonormal basis
consisting of eigenvectors that are common to all transformations in ft.

The part "if" of this theorem is clear: if x{,. . . , xn is an orthonormal


basis in <p" formed by common eigenvectors of A and B, where A, B E ft,
then in this basis we have

9.3 COMMON INVARIANT SUBSPACES FOR MATRICES WITH


RANK 1 COMMUTATORS

For n x n matrices A and B, the commutator of A and B is, by definition,


the matrix AB - BA. So the commutator measures the extent to which A
and B fail to commute. We have seen in the preceding section that if A and
B commute, that is, if their commutator is zero, then there exists a complete
chain of common invariant subspaces of A and B. It turns out that this result
is still true if the commutator is small in the sense of rank.

Theorem 9.3.1
Let A and B be n x n matrices with rank(AB - BA) s 1. Then there exists a
complete chain of subspaces:

such that each M is both A invariant and B invariant.


304 Commuting Matrices and Hyperinvariant Subspaces

Proof. We shall assume that rank(AB - BA) = 1. (If AB-BA = Q,


Theorem 9.3.1 is contained in Theorem 9.2.1.) We can also assume that A is
singular. (If necessary, replace A by A - A0/ for a suitable A 0 , and note that
the commutators of A and B and of A - A0/ and B are the same.) We claim
that either Ker A or Im A is B invariant. Indeed, if Ker A is not B
invariant, then there exists a nonzero vector x E <p" such that Ax - 0 and
ABx^O. Thus

span the one-dimensional range of AB - BA. Hence for every y G <p" there
exists a constant ii(y) such that

It follows that

and hence

so Im A is B invariant. We have shown that there is a nontrivial subspace Ji


that is invariant for both A and B.
Write A and B as 2 x 2 block matrices with respect to the decomposition
N + N' = <t", where N' is some direct complement to jV:

Then rank(AlBl - B^^^ 1 and rank(A2B2 - B2A2)< 1. So we can appl


the preceding argument to find a nontrivial common invariant subspace for
Al and Bl (if d i m j V > l ) . Similarly, there exists a nontrivial common
invariant subspace for A2 and B2 (if dim N' > 1). Continuing in this way, we
ultimately obtain the result of the theorem.

Theorem 9.3.1 can also be restated in terms of simultaneous trianguliza-


tions of A and B, just as Theorem 9.2.1 was recast in the form of Theorem
9.2.2. In contrast with Theorem 9.2.1, the result of Theorem 9.3.1 does not
generally hold for sets of more than two matrices.

EXAMPLE 9.3.1. Let

It is easily checked that


Hyperinvariant Subspaces 305

Nevertheless, there is no one-dimensional common invariant subspace for


Al, A2, and A3. Indeed, A3 has exactly two one-dimensional invariant
subspaces, Span{e,} and Span{e2}, and neither of them is invariant for both
Al and A2.

9.4 HYPERINVARIANT SUBSPACES

Let A: <p" —»<p" be a transformation. A subspace M C <p" is called hyperm-


variant for A (or A hyperinvariant) if M is invariant for any transformation
that commutes with A. In particular, an /l-hyperinvariant subspace is A
invariant. Let us study two simple examples.

EXAMPLE 9.4.1. Let A = A/, A E (p. Obviously, any transformation from <p"
to <p" commutes with /I, so the only subspaces which are invariant for every
linear transformation that commutes with A are the trivial ones: {0} and <p".
Hence A has only two hyperinvariant subspaces: {0} and <p".

EXAMPLE 9.4.2. Assume that A: (p"—»(p" has n distinct eigenvalues


A,, . . . , \n with corresponding eigenvectors jc n . . . , xn. Then A has exactly
2" invariant subspaces Spanjjc, | / G K ] , where K is any subset in { 1 , . . . , « }
(see Example 1.1.3). By Theorem 9.1.4, the only transformations that
commute with A are the polynomials in A. Since every ^-invariant subspace
is invariant also for any polynomial of A, we find that every ^-invariant
subspace is A hyperinvariant.

More generally, let A be a nonderogatory transformation. Then Theorem


9.1.4 shows that every /1-invariant subspace is also A hyperinvariant. This
property is characteristic for nonderogatory transformations.

Theorem 9.4.1
For a transformation A: <p" —* <p" every A-invariant subspace is A hyperin-
variant if and only if A is nonderogatory.

Proof. We have seen already that the part "if" is true. To prove the
"only if" part, assume that A is not nonderogatory. We prove that there
exists an ^-invariant subspace that is not A hyperinvariant. By assumption,
dim Ker(y4 - A 0 /) > 2 for some eigenvalue A0 of A. Without loss of generali-
ty we can assume that A is a Jordan matrix

where m>2 and the first m Jordan blocks correspond to the eigenvalue A 0 ,
306 Commuting Matrices and Hyperinvariant Subspaces

they are arranged so that k} < k2 and Am + 1 , . . . , \p are different from A0.
Obviously, Span{e,} is an .A-invariant subspace. It turns out that this
subspace is not A hyperinvariant. Indeed, by Theorem 9.1.1 the matrix 5
with 1 in the entries (/c, + 1 , 1 ) , . . . , (2kt, A:,) and zero elsewhere, com-
mutes with A. On the other hand, Sel = ek+l, so Spanf^} is not S
invariant. D

It is easily seen that all the A -hyperinvariant subspaces form a lattice,


that is, the intersection and sum of v4-hyperinvariant subspaces are again A
hyperinvariant. Denote this lattice by Hinv(yl). Now we can state the main
result concerning the structure of Hinv(v4).

Theorem 9.4.2
The lattice of all A-hyperinvariant subspaces coincides with the smallest lattice
tfA of subspaces in (p" that contains

Actually, $fA coincides with the smallest lattice of subspaces in <£" that
contains

where is the minimal polynimial of A.Indeed.


for and for
and
The proof of Theorem 9.4.2 is given in the next section.
The following example shows that, in general, not every /1-hyperinvariant
subspace is the image or the kernel of a polynomial in A.

EXAMPLE 9.4.3. Let A be the 0 X 6 matrix

According to Theorem 9.4.2, the subspace


Im A2 is A hyperinvariant. On the other hand, there is no polynomial p( A)
such that 56 = Ker p(A) or !£- Im p(A). Indeed, for any polynomial p(A)
the matrix p(A) has the form (see Section 9.2.10):
Proof of Theorem 9.4.2 307

for some complex numbers /?,, p 2 , p3, p4. So Ker p(A) can be only one of
the following subspaces: {0} (if pl ^0); Span{e,,e 5 } (if pl = 0, p 2 ^0);
Span{e,,e 2 ,e 5 ,c 6 } (if^, =/? 2 = 0,/? 3 5^0); Span{e l5 e 2 , «?3, e 5 , <?6} (if p, =
P2 — PJ — 0, p3 ^ 0); (p (if p- = 0, / = 1, 2, 3,4). The subspace Im p(v4) can
be one of the following: (p6; Span{<?j, e2, e3, e5}; Span{e!,e 2 }; Span{e,};
{0}. None of these subspaces coincides with !£. D

9.5 PROOF OF THEOREM 9.4.2

The proof of Theorem 9.4.2 requires some preparation. We first prove


several auxiliary results that are useful in their own right.

Proposition 9.5.1
For any A G <p the subspaces

are A hyper invariant.

Proof. Fix A G (J7 and a positive integer k, and let x be any vector from
Ker(j4«- A/)*. If B commutes with A, we have

So Bx G Ker(,4 — A/) , and the subspace Ker(^4 - A/) is A hyperinvariant.


Similarly, let y G lm(A - A/)* and BA = AB. Then for any z G <p" such that
(/4 - A/) z = y, we obtain

So #y G Im(v4 — A/)*; therefore, lm(A - A/)* is A hyperinvariant.

We proceed now with the identification of Hinv(A), assuming that A


has only one eigenvalue. Given positive integers pl>--->pm, let
A(/?,,. . . , pm) be the set of all m-tuples of integers ( < 7 j , . . . , qm) such that
F°r evei7 two
308 Commuting Matrices and Hyperinvariant Subspaces

sequences and from

then min(<7', q") belong to A(/7 1 5 . . . , pm).


Let B: <p" —» <p" be a transformation with a single eigenvalue A 0 , and let

be a Jordan basis in (p" for 5, where p^>p2>--->pm. So in this basis 5


has the form

Let

Lemma 9.5.2
For every the subspace

is B hyperinvariant. Conversely, every B-hyperinvariant subspace !£ has the


form <p(<7 1? . . . , qm) for some (q^ . . . , <? m )e A(p,, . . . , p m ). Moreover

for every

Proof. Let ^ be a nonzero fi-hyperinvariant subspace, and let x E


an arbitrary nonzero vector. Write * as a linear combination of the basis
vectors:

Assume that for some j the vector


Proof of Theorem 9.4.2 309

is nonzero, and let q be the maximal index / (1 < i^pj) such that
We show that the subspace J{'q is in
Let Pj be the projector on 3{'p defined by P,/l' ) = = 0 for iVy and
P,./^ =/l y) (a = 1, . . . , p,). Obviously, PfB = BPr Therefore, the sub-
space J^is Pj invariant. Hence y = PjX G 56. For every k — 1, 2, . . . the linear
transformation (5 - A0/)* commutes with B and hence

Then the vectors

also belong to «£ Thus 3C'q C 5£.


Furthermore, we show that if 3C'gC<e (y'^2), then also %*'*£&.
Indeed, let X: £"—* $" be the linear transformation given in the basis
(9.5.1) by the matrix

where X^v is a pv x p^ matrix, and X^ v = 0 for all /i, i/ except for


which is given as follows:

Theorem 9.1.1 shows that X commutes with B. Consequently, !£ is X


invariant and the vectors

belong to £.
We have proved that j£ has the form (9.5.3) with ql> — ->qm. Let us
verify that pl - q} > • • • >p m - qm. Fix i0<jQ and let C: $"-* <p" be
defined in the block matrix from C=[C (; ]™ y=1 with respect to the basis
(9.5.1) where C,7 is the zero p, x p. matrix if i^jQ or y T^ i0 and C;o/o is the
Pi * pf matrix [0 /]. By Theorem 9.1.1, C commutes with A, so £ is C
invariant. If then obviously
Otherwise

which implies , again.


310 Commuting Matrices and Hyperinvariant Subspaces

It remains to show that every subspace

with ( < ? , , . . . , <7 m )G A(p 1 5 . . . , pm) is B hyperinvariant. Let


We must prove that J^is C invariant. With respect to the basis (9.5.1), write
C as the block matrix C = [C /y ]™ y==l , where C,; is a p{ x p. matrix of one of
the following types (see Theorem 9.1.1):

[in the notation of (9.1.6)-(9.1.8)]. From the structure of C it is easily seen


that £ is C invariant if and only if the ^th column in every C,; has all entries
zero in the places qi + 1,. . . , pt;. In case / > j the first nonzero entry in
the <7yth column of C,7 can be in the [pt•.- (PJ - <? y )]th place; but pi..—
(p/-^-)^^-because ( < ? , , . . . , ^ m ) e A ( p , , . . . , pm). In case i<j the first
nonzero entry in the ^;th column of Ctj can be in the g;th place; but qf < qt,
so we are done in this case also. Finally, in case / = /' obviously the ^th
column of C,y has zeros in places qi; + 1, . . . , / ? , . We have verified that !£ is
indeed C invariant.
Finally, equalities (9.5.4) and (9.5.5) are clear from the definitions of
min(<y', q") and max(g', q").

Now we begin the proof of Theorem 9.4.2 itself. In view of Proposition


9.5.1, every element in the lattice 5^, the smallest lattice containing the
subspaces (9.4.1), is A hyperinvariant. Now let ££ be an A -hyperinvariant
subspace. Then Z£ is, in particular, A invariant; therefore

where A , , . . . , \m are all the distinct eigenvalues of A. Now


is also an ^4-hyperinvariant subspace. [Recall that the
integers ri are defined by the minimal polynomial (A - A,) r ' • • • (A - Am)r"' of
A.] Thus, to show that 3?E.yA, we can assume that A has only one
eigenvalue A 0 . Letting /?, > • • • > / ? , be the partial multiplicities of /4, in view
of Lemma 9.5.2 it will suffice to verify that

where (<?,, . . . , < 7 / ) E A ( p j , . . . , p,) and 3^ are defined as in equation


(9.5.2) [with respect to a Jordan basis f(-} of A]. Actually
Further Properties of Hyperinvariant Subspaces 311

where N=A-\0I. Indeed, as %' CKer A^'fllm N"1'*', / = ! , . . . , / , the


inclusion C in (9.5.7) is obvious. For the opposite inclusion, let
r G K e r A^'nimyV'^' so x = Np'~q'y for some y with N"'y=Q.
Write y = y1 + y2 + "- + y,, where ^GSpanl/J'*,. . . , f(pn}. Then x =
E,'=1 Af'^.and

We want to show that AfPl ^'y ; G 3^. or, equivalently

But since ( < ? , , . . . , ^ ) e A ( p , , . . . , p,), we have q,+ p, - qi >min(p,, p y ),


! < / < / , and (9.5.9) follows from (9.5.8). Theorem 9.4.2 is proved.

9.6 FURTHER PROPERTIES OF HYPERINVARIANT SUBSPACES

We present here some properties of the lattice Hinv(A) of all /i-hyperin-


variant subspaces.

Theorem 9.6.1
For any transformation A: <p"—»(p" the lattice Hmv(A) is distributive and
self-dual and contains exactly

elements, where ///' > • • • > p^' are the partial multiplicities of A correspond-
ing to the ith eigenvalue, i=\, . . . ,k, and k is the number of different
eigenvalues of A (in particular, Hinv(A) is finite).

Let us explain the terms that appear in this theorem. By definition, a


lattice A of subspaces in (p" is called distributive if

for every M, ^V,, JV2 G A. The lattice A is said to be self-dual if there exists a
bijective map $: A-» A such that «/<^ + -V) = «A(^) n «//(JV), ty(M n JV) =
«//(J^) + (/'(•'V) for every M, N E: A. [In other words, A is isomorphic (as a
lattice) to the dual lattice of A.]

Proof. Note that every /4-hyperinvariant subspace J£ admits the repre-


sentation
312 Commuting Matrices and Hyperinvariant Subspaces

where A , , . . . , A^ are all the distinct eigenvalues of A. As

and

for any .A-hyperinvariant subspaces =$?, and j£2, we assume (without loss of
generality) that A has only a single eigenvalue
To show that the lattice of A-hyperinvariant subspaces is distributive, first
observe the following equality for any real numbers r, s, t:

This equality can be easily verified by assuming (without loss of generality)


that r < 5 , and then by considering three cases separately: (1) / < r <s; (2)
r < t ^ s; (3) r =£ s < t. Now let M^, M2, M3 be y4-hyperinvariant subspaces.
According to Lemma 9.5.2, write

in the notation of Lemma 9.5.2, where


A(PJ, . . . , pm), i- 1,2, 3, and / ? , > • • • >/? m are the partial multiplicities of
A. Using (9.5.4) and (9.5.5), we have

and

Using (9.6.2), we obtain equality between (9.6.3) and (9.6.4).


To prove the self-duality of Hinv(>l), observe that, in view of Lemma
9.5.2, the map $: Hinv(j4)—»Hinv(yl) defined by

where (q}, . . . , qm) G A(p 1? . . . , pm) satisfies the definition of a self-dual


lattice. For instance:
Exercises 313

It remains to verify the Hinv(,4) has exactly

elements. Instead of Hinv(>l), we count elements in A( / ? , , . . . , p m ). Using


induction on m [formula (9.6.5) obviously holds for m = 1], assume that
A
(P 2 > • • • . Pm) has exactly [H^1 (pf-pj+l + 1)] (/?„, + 1) elements. Now
observe that (<72 + 5, #2, . . . , qm) belongs to A( / ? , , . . . , pm) if and only if
(<7 2 > • • • ' ^m) belongs to A(/? 2 , . . . , pm) and 0 < * < p , - p2. This com-
pletes the induction step.

We conclude this section by observing that the number of ,4-hyperin-


variant subspaces for A: <p"—> <p" lies between 2 and 2", and both bounds
can be attained. Indeed, the transformation / has only trivial hyperinvariant
subspaces, whereas a diagonable transformation with n distinct eigenvalues
has 2" hyperinvariant subspaces (see Examples 9.4.1 and 9.4.2). That the
number of A -hyperinvariant subspaces cannot exceed 2" follows from a
general result in lattice theory [see, e.g., Theorem 148 in Donnellan (1968)
using the fact that Hinv(,4) is distributive and each chain in Hinv(v4)
contains not more than n + 1 different subspaces.

9.7 EXERCISES

9.1 Consider the transformation

written as a matrix with respect to the standard basis e^, e2,e3.


(a) Find all transformations that commute with A.
(b) Find all A-hyperinvariant subspaces.
314 Commuting Matrices and Hyperinvariant Subspaces

9.2 Show that if a transformation A: (p"—» <p" has n distinct eigenvalues,


then every transformation commuting with A is diagonable. Con-
versely, if every transformation commuting with A is diagonable, then
A has n distinct eigenvalues.
9.3 Supply a proof for Corollary 9.1.5.
9.4 Show that if AJn(\Q) = Jn(\0)A, then A is diagonable if and only if A
is a scalar multiple of the identity.
9.5 Prove or disprove each of the following statements for any commuting
transformations A: <f" -* <p" and B: <f" -» <p":
(a) There exists an orthonormal basis in which A and B have the
lower triangular form.
(b) There exists a basis in which both A and B have Jordan form.
(c) Both A and B have the same eigenvectors (possibly corres-
ponding to different eigenvalues).
(d) Both A and B have the same invariant subspaces.

9.6 Show that any matrix commuting with

is a circulant.
9.7 Show that any matrix commuting with

where « 0 , «,, . . . , #„_, are given complex numbers, is a polynomial


of A.
9.8 Describe all matrices commuting with
Exercises 315

Are all of these polynomials of Q1 Find all <2-hyperinvariant sub-


spaces.
9.9 Describe all transformations commuting with a transformation
A: <£""—> <JT" of rank 1. Find all ^4-hyperinvariant subspaces.
9.10 Let A: <p" —» (J7" be a transformation. Prove that every A-hyperin-
variant subspace is the image of some transformation which com-
mutes with A. (Hint: Use Lemma 9.5.2.)
9.11 Show that every /1-hyperinvariant subspace is the kernel of some
transformation which commutes with A.
9.12 Prove that for the matrix A from Exercise 9.7 we have Hinv(^4) =
lnv(A).
9.13 Is Hinv(y4) = lnv(A) true for any block companion matrix

where Aj are 2 x 2 matrices?


9.14 Show that for circulant matrices A in general Hinv(v4) ^ ln\(A). Find
necessary and sufficient conditions on the circulant matrix A in order
that Hinv(A) = Inv(^).
9.15 Give an example of a transformation A and of an v4-hyperinvariant
subspace M that does not belong to the smallest lattice of subspaces
containing the images of all polynomials in A.
9.16 Give an example analogous to Exercise 9.15 with "images" replaced
by "kernels."
9.17 Give an example of a transformation A such that Inv(y4) is not
distributive.
Chapter Ten

Description of
Invariant Subspaces and
Linear Transformations
with the Same
Invariant Subspaces

In this chapter we consider two related problems: (a) description of all


invariant subspaces of a given transformation and (b) to what extent a
transformation is determined by its lattice of all invariant subspaces.
We have seen in Chapter 2 that every invariant subspace of a linear
transformation A: (p" -» <p" is a direct sum of irreducible ^-invariant sub-
spaces, that is, such that the restriction of A to each one of these subspaces
has only one Jordan block in its Jordan form. Thus, to solve the first
problem mentioned above it will be sufficient to describe all irreducible
yi-invariant subspaces. This is done in Section 10.1.
The second objective of this chapter is a characterization of transfor-
mations having exactly the same set of invariant subspaces. It turns out that,
in general, not all such transformations are polynomials of each other. Our
characterization (given in Section 10.2) will depend on the description of
irreducible invariant subspaces given in Section 10.1.

10.1 DESCRIPTION OF IRREDUCIBLE SUBSPACES

In the description of invariant subspaces upper triangular Toeplitz matrices,


and matrices that resemble upper triangular Toeplitz matrices, play an
important role, as we see later. We recall first some simple facts about
Toeplitz matrices.

316
Description of Irreducible Subspaces 317

A matrix A of size / x j is called Toeplitz if its entries have the following


structure

where . Denote by T- the class of all


upper triangular Toeplitz matrices of size j x y, that is, such that «, = • • • =
a ; _, = 0 in equation (10.1.1).

Proposition 10.1.1
The class T, is an algebra, that is, it is closed under the operations of addition,
multiplication by scalars, and matrix multiplication. Moreover, if A E T- and
det.4^0, then A'1 £ 7}.

Proof. All but the last assertions of Proposition 10.1.1 are immediate
consequence of the definition of Tf. To prove the last assertion, suppose that

Ome dedices seaso;u thjat Further

and in general

(It is assumed that bki = 0 whenever k =sO.) Equations (10.1.2) define


recursively:

Using (10.1.3), we can prove by induction on k (starting with k = 0) that


bi_k • does not depend on /. But this means exactly that the matrix [b^]^ k =l
is Toeplitz.
318 Description of Invariant Subspaces and Linear Transformations

Let A: <p" —»<p" be a transformation. It is clear that each .A-invariant


subspace M can be represented as a direct sum of nonzero A-invariant
subspaces M } , . . . , Mk, each of which is irreducible, that is, not represent-
able as a direct sum of smaller invariant subspaces (indeed, let / be the
maximal number of factors in a decomposition

into a direct sum of nonzero ^-invariant subspaces Mf; then from the choice
of / it follows that each Mi in equality (10.1.4) is irreducible). To describe
the ,4-invariant subspaces, therefore, it is sufficient to describe all the
irreducible subspaces.
It follows from Theorem 2.5.1 that an ^-invariant subspace SB is ir-
reducible if and only if the Jordan form of A ^ consists of one Jordan block
only. In other words, !£ is irreducible if and only if there exists a basis
J t j , . . . , x in Z£ and a complex number \ such that

that is, the system {Af,}f =1 is a Jordan basis in Z£. Consequently, every
irreducible subspace is contained in some root subspace. (One can see this
also from Theorem 2.1.5.) Thus it is sufficient to describe all the irreducible
subspaces contained in a fixed root subspace corresponding to the eigen-
value A. Without loss of generality, we assume that A = 0. (Otherwise,
replace A by B = A- A/ and observe that both transformations A and B
have the same invariant subspaces.)
The root subspace ^(A) is decomposed into a direct sum of Jordan
subspaces:

The description of the Jordan subspaces contained in &10(A) is given


according to the number m of irreducible subspaces in the decomposition

If $10(A) is an irreducible subspace [i.e., m = 1 in (10.1.6)] and the


vectors (*,}f=1 form a Jordan basis in £%0(A), then Span{jCj,. . . , *y},
j — 1, . . . , p are all the .^-invariant subspaces in &t0(A), and all of them are
irreducible subspaces.
Consider now the case when m = 2 in (10.1.6). We use the following
notation: if {2,}f=1 is a system of vectors z, E (p", denote by z ( y ) the column
formed by vectors, as follows:
Description of Irreducible Subspaces 319

Let g,, . . . , gp E .Sfj and / , , . . . , jq G J£2 "e Jordan bases in J£t and Ja?2,
respectively. Without loss of generality, suppose that p > q. It is known that
in any irreducible subspace ^(^0) of A there exists only one eigenvector
(up to multiplication by a nonzero scalar). We describe first all the irreduci-
ble subspaces that contain the eigenvector g, [and thus are contained in
^(A)].
In the following proposition / is a fixed integer, 1 < / < p .

Proposition 10.1.2
Let T(v\ where v = min(y, q), be an upper triangular matrix of size j x /',
whose diagonal elements are zeros and the block formed by the first v rows
and first v columns is a Toeplitz matrix :

Then the components of the column

form a Jordan basis of some j-dimensional A-invariant irreducible subspace


that contains g p Conversely, every irreducible subspace of dimension j of A
that contains g, has a Jordan basis given by the components of (10.1.8),
where T(v) is some matrix of type (10.1.7).
The multiplication in T ( v ) f ( i ) is performed componentwise: for complex
numbers xrs and n-dimensional vectors z , , . . . , z; we define
320 Description of Invariant Subspaces and Linear Transformations

Note also that the dimension of every irreducible subspace of A contained


in 2ft.Q(A) does not exceed p [recall that m = 2 in (10.1.6) and that dim ^ =
p > dim 5£2 > 1]; so Proposition 10.1.2 does indeed give the description of all
irreducible subspaces that contain gj.

Proof. First observe that if SB is an irreducible subspace and g, G Z£y


then

Indeed, if y E 3? n «S?2 ^ {0}, then for some i ( 0 < / < / ? - ! ) and some
complex number y ^ 0 the equality Ay = y/, holds. So /t E ^ O ^ C ^,
and since also g t e <£, the irreducible subspace =$? contains two linearly
independent eigenvectors/! and g l y which is impossible. From (10.1.9) and
the inclusion % + &2 C ^(A) = &{ + %2 it follows that dim £ < dim <£", =
p. Now let £ be an irreducible subspace containing gl with a Jordan basis
y l t . . . , y y ; so y} = a0g, and Ay, + , = ^.(/ = 1, . . . , y - 1). We look for the
vectors y2,. . . , yf in the form of linear combinations of gl,. . . , g ,
/ " , , . . . , fq. Two possibilities can occur: (1) y < ^; (2) q + 1 < y < /?. Consider
first the case when y'< <7. Condition Ay2 = y^ implies that

Condition lies that


Continuing these arguments, we obtain

where a t , . . . , a y _,, /8,,. . . , p ; -_j are some numbers. In case


one finds analogously

where are some complex numbers


Description of Irreducible Subspaces 321

Formulas (10.1.10) and (10.1.11) can be written in the form

where C and 5 (t)) are j x / matrices, and C is an upper triangular invertible


Toeplitz matrix (invertible because its diagonal element is a 0 ^0). By
Proposition 10.1.1, C~l is also an upper triangular Toeplitz matrix. It is easy
to see that the matrix C~}S(v) has the form T(v} [see (10.1.7)]: T(v)
C~}S(V\ Put z ( ] ) = C ~ l y ( > ) = g(i)+T(v)f(j\ It is easy to see that
Span{yl}( =Span{zi}{ and the vectors z , , . . . , z y satisfy (10.1.5). So
the components of z ( / ) form a Jordan basis in £, D

Now let xl (^agj) be an arbitrary eigenvector of A contained in


&Q(A) = ^ + <$£>. Evidently, xl = ^gl + TJ/, (£ ^0). Consider the system of
vectors *,• = £& + TJ/J-, i - 1, • • • , q. Clearly, the vectors xl, . . . , xq satisfy
the condition (10.1.5); therefore, they form a Jordan basis of some ir-
reducible subspace &C$10(A). It can easily be verified that ^+3? =
2ft0(A). Hence dim&=q. By Proposition 10.1.2, for every irreducible
subspace j£ containing the vector jc, (the dimension j of !£ is necessarily not
larger than q) there exists a matrix T(i) of the form (10.1.7) such that the
components of the column v(i) = x ( j ) + TU)g(l) form a Jordan basis in j£
Conversely, for every matrix T(i) of size / x j the components of the column
v(n form a Jordan basis in some irreducible subspace of A. Thus a complete
description of the irreducible subspaces contained in the root subspaces
&o(A) = «£, i <#2, is obtained.
This description for the case when m = 2 in the decomposition (10.1.6)
can be generalized for an arbitrary m. This is the content of the following
theorem.

Theorem 10.1.3
Let

be a decomposition of the root subspace £%A (A) of the transformation


A: <p"—> <p" into a direct sum of irreducible subspaces =$?,,. . . , 2£m. Let
g,,. . . t g p i e ^ . . . ;/„. .. , / P r e ^ r ; . . . ; * , , . . . , / i ^ e ^ 6e /orto
bases /n ^,. . . , ^.,. . . , £m, respectively ( / ? , > • • • ^pm). Let j be an
integer such that \<j<pr — dim !£r. For every i — 1,. . . , m let v( —
min(/, p ( ). Then for every set of matrices T\"l\ . . . , T(^m) of the form
(10.1.7) and of size j x j the components of the column
322 Description of Invariant Subspaces and Linear Transformations

form a Jordan basis in some irreducible subspace of A that contains the vector
/! {here ul, . . . , up E =5^._, and vlt . . . , vp + E 2£r+l are Jordan bases in
cS?r_i and £r+l, respectively). Conversely, for every irreducible subspace t£ of
dimension j such that f^ E SB there exist matrices T("l\ . . . , T("m) such that the
components of the column (10.1.12) form a Jordan basis in t£.

Proof. Use induction on the number m of subspaces in the decompo-


sition (10.1.6). For m = 2 this theorem coincides with Proposition 10.1.2.
Suppose that the theorem holds for m ^ k - 1 , and assume that £%A (A) =
j£, + ---- *-«#*• If % is an irreducible subspace such that /, E SB, then

where Indeed, for every y E SB ^ {0}


there exist a nonnegative integer / and a complex number y ^ 0 such that
A'y = y/j. If, in addition, y E SBT, then y/j = A'y E «2?r, which contradicts the
direct decomposition From (10.1.13) and from the
inclusion S we deduce that dim S£ < dim S£r. As-
sume that /•</:. (The case r = k can be considered in a similar way.) If
then by the induction hypothesis the components of a
column of the form (10.1.12) form a Jordan basis in S£. HSefe^ + ••• +
%k_i, consider the subspace £' = (<€ + Sek) n (^, + • • • + ^.J. Since ^ n
«S?fc = {0}, the equality dim.2" = dim(«Sf -i- 5^) + dim(«2>, + • • • -f ^ _ , ) -
dim(^ 4- • • • 4- S£k) = dim <S? holds. Evidently, S£' is ,4 invariant. Let us
show that S£' is an irreducible subspace. Suppose the contrary; then there
exists an eigenvector gE J£" of A that is not a scalar multiple of / t . Since
SB' C j£j + <3?£, the vector g is a linear combination of the eigenvectors/! and
hly where /i, E <£k. But then /ij E SB' C «#, 4- • • • + SEk_^ which means that
the sum (Ja?, -i- • • • 4- S£k_^) + !£k is not direct, and this is a contradiction with
our assumptions. So S£' is an irreducible subspace. Since S£' C SB, + • • • 4-
SBk_\, by the assumption of induction the components of the column z(i)
form a Jordan basis in SB' for some T\"l), . . . , T(k"^l}. The property that
£'C£ + J£k implies the inclusion SBCSB' + S£k. As it has been proved
above, there exists a matrix T ( k ) such that the components of the column
rm a Jordan
basis in J£.

Theorem 10.1.3 also gives a description of all irreducible subspaces of A


that contain an arbitrarily given eigenvector of A from the root subspace

Indeed, let AC, E £%A (/I) be an eigenvector, and let r be the minimal
integer such that xl E SB\ + • • - + S6r. Then xl = algl + • • • + a r f x , where
a, , . . . , ar E <p and a r T^ 0. Consider the system of vectors xt = a}g, + • • • +
arff, i = 1, . . . , pr. Evidently, jc,, . . . , xp satisfy the condition (10.1.5).
Transformations Having the Same Set of Invariant Subspaces 323

Therefore, their linear span is an irreducible sub-


space. It is easily seen that

So in the representation (10.1.6) one can replace !£r by <Er. Then in view of
Theorem 10.1.3 the components of the columns of form (10.1.12) describe
all the irreducible subspaces of A, which contain the vector jc, [in (10.1.12)
write x' in place of / ( / ) j.
Observe that every irreducible subspace contains an eigenvector of A. So
the description in the preceding paragraph gives all the irreducible subspaces
of A (if the vector xl is varied).

10.2 TRANSFORMATIONS HAVING THE SAME SET


OF INVARIANT SUBSPACES

Consider a transformation A: $"—> (p". In this section we describe the class


of all transformations B: $"-*$" such that lnv(A) = Inv(B). A relative
simple case of this situation has already been pointed out in Theorem 2.11.3
(when one transformation is a polynomial in the other). Surprisingly
enough, it turns out that the set of transformations B such that Inv(fi) =
Inv(v4) does not generally consist only of the transformations/(^4), where
/(A) is a polynomial with the properties indicated in Theorem 2.11.3. It can
even happen that noncommuting transformations have the same set of
invariant subspaces.
Before we embark on the statement and proof of the main theorem
describing the transformations with the same set of invariant subspaces
(which is quite complicated), let us study some examples.

EXAMPLE 10.2.1. Let A be the n x n Jordan block Jn(\a). The invariant


subspaces of A are J^ = Span{e,, . . . , e^, j - 0, . . . , n (by definition ^0 =
{0}). Let us find all transformations B: (pn —> <p" for which lnv(A) = lnv(B).
It turns out that Inv(,4) = Inv(B) if and only if (in the basis e^ . . . , en) B
has the form

where
324 Description of Invariant Subspaces and Linear Transformations

Indeed, suppose lnv(B) = Inv(.A). Then clearly the matrix representing B


has the triangular form (10.2.1). Moreover, it is easy to see that a n = • • • =
ann. Indeed, the numbers a , , , . . . , ann are the eigenvalues of B\ if they are
not all equal, then there exists a pair of nonzero complemented invariant
subspaces of B, namely, the root subspaces corresponding to a pair of
complemented nonempty subsets in cr(B). But the existence of a pair of
nonzero complemented subspaces contradicts the assumption that lnv(B) =
ln\(A).
Let us show that al2 a23 • - • « „ _ , „ ^0. Consider the transformation C =
B-anI, which has the same invariant subspaces as B. If for some /
(1 < / < n — 1) we have then Hence

Since any nonzero vector in Ker C spans a one-dimensional C-invariant


subspace, inequality (10.2.3) contradicts the assumption Inv(fi) = lnv(A)
again.
Conversely, suppose that B satisfies (10.2.1) and (10.2.2). Put C =
B-auI. We show that Ker C = ^. Let x = E" =1 ^jej £ Ker C and x ^0.
Let p be such that £p + j = • • • = £„ = 0 and £,p ^ 0. Then p = \. Indeed, if p
were greater than 1, then Cx = ap p +lep + l + • • • 7^0. So x = ^lel, that is,
Ker C = «$?!. This means that any two eigenvectors of B are collinear.
Appeal to Theorem 2.5.1 [(d)<£>(e)] and deduce that for any two B-
invariant subspaces M^ and M2, either M}CM2 or M2CM}. Since
J^o, jj?,, . . . , 5£n are B invariant and dim !£)= j (j = 0, . . . , n), it follows
that any B-invariant subspace coincides with one of ^. D
Example 10.2.1 provides a situation when Inv(^4) = Inv(S) but A and B
do not commute [take A = / n (A 0 ), n >3, and B as in (10.2.1) with distinct
nonzero numbers ay / + 1 , y = 1,. . . , n — 1].
If A has more than one Jordan block, the situation may be completely
different from Example 10.2.1.

EXAMPLE 10.2.2. Let

It turns out that Inv(fi) = Inv(A) if and only if B is a polynomial in A,


B - p(A), such that p'(\) 9*0. In other words, B has the form

for some a, b, c e <f where b ^ 0.


As by Theorem 2.11.3 Inv(B) = lnv(A) for every B in the form (10.2.4)
Transformations Having the Same Set of Invariant Subspaces 325

with b ^0, we must verify only that every B: <p 5 —» <f5 such that Inv(B) =
lnv(A) has the form (10.2.4) with b^Q (in the basis e}, e2, e3, e4, e5).
So assume Inv(Z?) = lnv(A). Then clearly B has upper triangular form,
and (see the argument in Example 10.2.1) the elements on the main
diagonal of B are all equal. Without loss of generality, we can assume that
the main diagonal in B is zero:

As Span{e4,e5} is A invariant and hence belongs to Inv(fi), we have


a
i4 ~ flis ~ °24 = a2s = ^34 ~fl35= ®- ^ one °^ tne numbers 0 12 , fl23, or a45
were zero, then B would have three one-dimensional invariant subspaces
whose sum is direct. This contradicts the assumption Inv(fi) = Inv(j4) (^4
cannot have more than two one-dimensional invariant subspaces whose sum
is direct). Hence al2, a23, and a45 are different from zero. It remains to show
that fl,2 = «23 ~ a 45- To this end observe that Span{e, + e 4 , e2 + e5} is A
invariant and hence B invariant. So

which implies an = a^- A similar analysis of the iJ-invanant subspace


Span{en e2 + e4, e3 + e5} leads to the conclusion that a23 = a45. D

Now we state the main theorem, which describes all transformations


B :$"-+<£" with Inv(fl) = Inv(/l), where the transformation A: <p" -> (p" is
given. This description will contain the results of Examples 10.2.1 and 10.2.2
as very special cases. Note that without loss of generality we can assume
(and we do) that A is an n x n matrix in the Jordan form

where are all the different eigenvalues of A,


and

where / ? , > • • • >p m . Of course, the number m, as well as / > , , . . . , pm,


depend on /; we suppress this dependence in the notation for the sake of
clarity. The notation for upper triangular Toeplitz matrices will be ab-
breviated to the form
326 Description of Invariant Subspaces and Linear Transformations

Finally, we use the notation

where F is the (q - p)x (q — p) upper triangular matrix whose (/, y) entry


is fif (i<j). It is assumed, of course, that p^q. In other words,
Uq(aQ,. . . , « p _ i ; F) is a q x q matrix whose first p superdiagonals (starting
from the main diagonal) have the structure of a Toeplitz matrix, whereas
the next q — p superdiagonals contain the upper triangular part of the
matrix F, which is not necessarily Toeplitz. If p = q, F is empty and

Theorem 10.2.1
If Inv(B) = Inv(y4) for a transformation B: <p"—» (p", then

(in a chosen Jordan basis for A), where each block


has the form

for some complex numbers


b and an upper triangular matrix F of size
numbers b2,. . . , bp , as well as the matrix F, depend on j. Conversely, if B
has the form (10.2.5), (10.2.6) and //,,, b} and F have the above properties,
then lnv(B) = Inv(yl).
Transformations Having the Same Set of Invariant Subspaces 327

We relegate the lengthy proof of this theorem to the next section. The
proof will be based on the description of irreducible subspaces obtained in
Section 10.1.
We conclude this section with two corollaries of Theorem 10.2.1.

Corollary 10.2.2
Suppose that AB = BA. Then lnv(A) = Inv(B) if and only if B=f(A),
where /(A) is a polynomial such that f(\t) ^/(A y ) for eigenvalues A, ^ A y of
A,f'(\0)*Q whenever A0 G o-(A) and Ker(

In other words, the conditions of Theorem 2.11.3 are not only sufficient,
but also necessary, provided A and B commute.

Proof. In view of Theorem 2.11.3 it is necessary to prove merely the


"only if" statement. So assume Inv(^4) = Inv(fi). Let \lt . . . , \k be the
different eigenvalues of A, and let

be the decomposition of $l^(A) into a direct sum of Jordan subspaces


^n ' • • • ' ^j,m- sucri tnat dim &j\ — '"— dim ^j,m- The restrictions A\^ and
B\% commute; so in view of Theorem 9.1.1 (observing that A\^ has only
one Jordan block) there exists a polynomial py( A) such that B\^ = pj(A\x ).
It follows now from Theorem 10.2.1 that B\^= pt(A\^). Since the minimal
polynomials of A\^ , j - 1, . . . , k are relatively prime ,' there exists a poly-
nomial p(\) such that B = p(A). Indeed, let p(\) be an interpolating poly-
nomial such that
where k- = dim .2), and qr (a) (A 0 ) denotes the ath
derivative of the polynomial g(A) evaluated at A 0 . (See Gantmacher (1959),
Lancaster and Tismenetsky (1985), for example, for information on inter-
polating polynomials (see also Section 2.10).)
From the definition of a function of the matrix A (see Section 2.10), it
follows that Z?|yj . = p(A\gt) for / ' = ! , . . . , & and, consequently, B = p(A).
Using Theorem 10.2.1 once more, we deduce that p(A,) ^/?(A y ) for i^j
a n d / ? ' ( A , ) ^ 0 f or / = !,. . . , k.

Corollary 10.2.3
Let A: §" —> <p" be a transformation. Then every transformation B with
Inv(B) = Inv(>4) commutes with A if and only if the following condition
holds: for every eigenvalue A0 of A with Ker(y4 - A 0 /) ^ S?A (A) and
dim Ker(>4 - A07) > 1 we have
328 Description of Invariant Subspaces and Linear Transformations

where p = p( A 0 ) is the maximal integer such that Ker(^4 - A () /) p ^ 9£A (A).


Further, the set of all transformations B with Inv(fi) = Inv(A) coincides with
the set of all transformations commuting with A if and only if dim Ker(A -
A 0 /) = 1 for every eigenvalue A0 of A, that is, A is nonderogatory.
The proof is obtained by combining Theorem 10.2.1 with the description
of all matrices commuting with A (Theorem 9.1.1).

10.3 PROOF OF THEOREM 10.2.1

We start with three lemmas to be used in the proof of Theorem 10.2.1.


Let A: <p"-» (p" be a unicellular transformation. (Recall that A is called
unicellular if <p" is its irreducible subspace.) Let g,,. . . , gn be a Jordan basis
of A. Let B be a transformation such that its matrix in the basis g t , . . . , gn
has the form

for some 6, E <p and an (n - k) x (n — k) upper triangular matrix F.

Lemma 10.3.1
If B has the form (10.3.1) with b2^0, then in any Jordan basis for B the
transformation A has the form

for some a, €E <p with a2 ^ 0, and some upper triangular matrix G.

Proof. Without loss of generality we can assume that &(A) — {0} and
the Jordan basis g j , . . . , gn coincides with the standard
where

for some bk +l,. . . , bn G £ and upper triangular matrix F' of size (n — k) x


(n - k). Since 6 2 ^ 0 , it follows from Example 10.2.1 that the transfor-
mations A, B, and Bl have the same invariant subspaces. Hence (recalling
the equivalence (a)<=>(e) in Theorem 2.5.1) the transformations B and Bl
are also unicellular.
As ABl = B^A and B^ is unicellular, it follows from Theorem 9.1.1 that
A = p(B}) for some polynomial p(A).
Let / , , . . . , / „ be a Jordan basis for fl. We claim that the matrix of C in
the basis /i, . . - , / „ has the form (10.3.3) again, possibly with another
matrix F'. Indeed, the only nonzero B-invariant subspaces are
Proof of Theorem 10.2.1 329

Span (because they are .A-invariant sub-


spaces). On the other hand, Example 10.2.1 ensures that the only nonzero
jB-invariant subspaces are Span It follows that
f Now it is easily seen that the matrix
of C in the basis has the form (10.3.3).
Consider the following relations

where every summand in H contains C as a factor. Consequently, the matrix


of H in the basis /i , • • • , / „ is upper triangular and the first k diagonals
(counting from the main diagonal) are zeros. Now (10.3.2) follows from
(10.3.4) and, by Example 10.2.1,

Let vectors be linearly independent in <J7". In the


sequel we shall encounter systems of vectors of the form

;lahsdfn';lmdv'cxn/fds;',/';j;cx/.lknfds;ljn;fds';jfdsmnksfd';123456

where fl(/ are certain numbers.

Lemma 10.3.2
If for every m = 1, . . . , p the subspaces Span and
Span coincide, then
Proof. Use induction on p. For p - 1 the lemma is evident. Assume
the lemma holds true for p = k, and Span
330 Description of Invariant Subspaces and Linear Transformations

Span By the induction hypothesis,


For every vector we have
Rewrite the equation

in the form

this will contradict the linear


independence of So we must have

Let ^ be the set of all irreducible subspaces of a transformation


Since every invariant subspace for a
transformation can be represented as a direct sum of irreducible subspaces,
the equality lnv(A) = Inv(B) holds if and only if Now consider a
special case of this equality.

Lemma 10.3.3
Let A: be such that

where ^ (£ = 1,2) are irreducible subspaces of A corresponding to the


same eigenvalue. Let dim
be Jordan bases in these subspaces. Then for
if and only if the matrix of B in the basis
has the form

where are complex numbers with is an upper


triangular matrix of size

Proof. First we prove the necessity, that is, if then B has the
form (10.3.7). Consider first the case p = q and prove the necessity by
induction on everything is evident. Suppose that the lemma i
true for and let be irreducible subspaces of dimension
Le vidently,
are irreducible subspaces of A corresponding to the same eigenvalue. Since
Proof of Theorem 10.2.1 331

by assumption, the subspaces are irreducible


subspaces for B. By Example 10.2.1 and the induction hypothesis, the
matrix representation of B in the basis has the
form

where
We assume otherwise, consider the linear transformation
where in place of B and use the property that lnv
lnv This condition means that B is invertible.
Let j£ be an irreducible subspace of A such that dim and
By Theorem 10.1.3, there exist numbers such that the
vectors

form a Jordan basis in !£. Since 6, ^0, it follows that

It follows from the form of B that

and
33 Descripccccc

evidently, Span Thaen by L:emma


we have

These equalities hold for every (by choosing all possible j£; see
Theorem 10.1.3). Therefore

Similarly, considering Jordan bases of the form

we obtain Let us
show that, in fact, To this end consider a Jordan basis of A
of the form

where £ and 17 are arbitrary numbers. We have

As above, we obtain
proof of theorem

ByLemmalQ.3.2, Since 17 can be arbitrary,


Thus the necessity part of Lemma 10.3.3 is proved for the case
Now consider the case and proceed by induction on
h. Assume that the necessity part of Lemma 10.3.3 holds for h ^ k, and let
be irreducible subspaces for A with dim and
dim By Example 10.2.1 and the assumption of induction, the matrix
representation of B in the basis has the form

where Let
334 Description of Invariant Subspaces and Linear Transformations

be a Jordan basis (for A) of an arbitrary irreducible subspace of


dimension p + k + 1 and such that d\ G $£. As above, we obtain

Now

Hence

Put

Since Span or every


« , , . . . , « , Lemma 10.3.2 implies that

The necessity part of Lemma 10.3.3 is proved.


Let us prove the sufficiency of the conditions of Lemma 10.3.3. Assume
that B has the form (10.3.7) in a Jordan basis for A. Let J£be an irreducible
subspace for A with dim and be an eigenvector.
Then umbers £ and 17. Put
Suppose that In view of Proposition 10.1.2 (see also the
remark after its proof), there are some number for which the
vectors
Proof of Theorem 10.2.1 335

form a Jordan basis of A in [If lace d by


respectively in (10.3.8).] A straightforward computation reveals
that ^ is B invariant, and in the basis we have:

As in Example 10.2.1, implies that J^is an irreducible subspace for B.


Now let $£ be an irreducible subspace for A such that dim S£ — m
It is easily seen that d, G Z£ and (by Proposition 10.1.2)
there exist numbers such that the vectors

form a Jordan basis of A in ££. Again, a straightforward calculation shows


that !£ is B invariant and in the basis

where the (/', /) entry of Since it follows from


Example 10.2.1 that the subspace j£is an irreducible subspace of B. So we
have proved that
Let us prove the opposite inclusion be a Jordan
basis of B in the subspace 2£2. Write pu
Evidently, the vectors form a
Jordan basis for Z? in pan{d
We show that the sequence can be augmented by vectors
so that is a Jordan basis of B in %l. (Observe tha
by Example 10.2.1, ^ is an irreducible subspace for B.) Assume that the
vectors are already constructed. Then
for the following equation must be satisfied in order that
336 Description of Invariant Subspaces and Linear Transformations

where Zr is the submatrix of formed by the first


rows and the columns From (10.3.7) and it
follows that Zr is invertible, so (10.3.11) always has a (unique) solution
is constructed. By Lemma 10.3.1, A has the followi following
form in the basis

for some F, where The first p diagonals in both blocks are the same
in view of the choice of
Now we can repeat the proof of the inclusion $A C $B given above, with
A and B interchanged. So follows and, therefore, also and

Now we are prepared to prove Theorem 10.2.1 itself.

Proof of Theorem 10.2.1. As every A- in variant subspace is the sum of


its intersections with the root subspaces of A, we may restrict ourselves to
the case when <p" is a root subspace for A. Let

be the decomposition of <p" into a direct sum of irreducible subspaces


be
«JC
Jordan bases in respectively. Assume (without loss of generali-
ty) that
Now let t*6 a transformation, and suppose that the invariant
subspaces of B and those of A are the same. Applying Lemma 10.3.3 to the
restrictions we find that /? has the form described in
Theorem 10.2.1.
Conversely, assume that B has the form

with We now prove that Inv(B) = lnv(A). Suppose for definiteness


that Let us show that every irreducible subspace for A is also
an irreducible subspace of B. Let !£ be an irreducible subspace for A
with dim and let be an eigenvector of A. Then
Span — Ker A. Write with
for some Put It is
Proof of Theorem 10.2.1 337

easily seen that Then the vector given b


(10.1.12) (replacing form a Jordan basis for A, for
some numbers wo possibilities occur for the number j
(=dim j£): Consider first the case Taking
into account the form of B, it is easy to check that Ja^is B invariant and the
matrix of B \^ in the basis is of the form

with Then by Example 10.2.1 !£ is an irreducible subspace for B.


Now suppose that Since j clearly r — 1. This means
that the eigenvector xl G £ is collinear with dlE.5£l. Taking into account
the form of B [given by (10.3.13)] we conclude that 3? is B invariant and the
matrix of B\<g in the basis is given by (10.3.10) with and
By Example 10.2.1, £ is an irreducible subspace for B.
We show that every irreducible subspace for B is also an irreducible
subspace for A. As we have already proved, the subspaces
[which appear in (10.3.12)] are also irreducible subspaces for B. Let
be a Jordan basis of B in Z£m\ then

where is the Jordan basis of A in be a


Jordan basis of A in Construct the vectors as follows
(recall that

Since the vectors form a Jordan basis in !£m, the vectors


satisfy the equalities
Because is an irreducible subspace for B, there exists
vectors sucn that the system forms a Jordan
basis for B in (See the last paragraph in the proof of Lemma 10.3.3.)
Express by means of the Jordan basis for A in

Continuing these constructions, we obtain Jordan bases for B in each of the


subspaces From the choice of these bases and Lemma
10.3.1 it follows that the matrix of A in the union of these bases has the
form
338 Description of Invariant Subspaces and Linear Transformations

where A is the eigenvalue of A and As it was proved above, every


irreducible subspace for B is also irreducible for A. Thus the equality
ln\(A) = lnv(B) holds.

10.4 EXERCISES
10.1 Let

(a) Describe all irreducible /l-invariant subspaces that contain


(b) Describe all irreducible y4-invariant subspaces that contain
10.2 Let Describe all irreducible ^-invariant subspaces that
contain
10.3 Prove or disprove the following statement: if are
transformations with and with ln\(A) =
Inv(fi), then A and B are similar.
10.4 Show that if have the same set of hyperinvariant
subspaces and if then A and B are
similar.
10.5 Show that two lower triangular Toeplitz matrices have the same
invariant subspaces if and only if each matrix is a polynomial in the
other.
10.6 Show that two circulants have the same invariant subspaces if and
only if each circulant is a polynomial in the other.
10.7 Is the property expressed in Exercise 10.6 true for two block circu-
lants of type

where A- are 2 x 2 matrices? What happens if Af are 3 x 3 matrices?


10.8 Show that two companion matrices have the same invariant subspaces
if and only if each is a polynomial in the other. Is this property true
for block companion matrices

with 2 x 2 blocks A ? For block companion matrices with 3 x 3 blocks


-v
Chapter Eleven

Algebras of Matrices
and Invariant Subspaces

In this chapter we consider subspaces that are invariant for every transfor-
mation from a given algebra of transformations. In fact, this framework
includes general finite-dimensional algebras over (p. The key result, that
every algebra of n x n matrices that is not the algebra of all n x n matrices
has a nontrivial invariant subspace, is developed with a complete proof.
Some results concerning characterization of lattices of subspaces that are
invariant for every transformation from an algebra are presented. Finally, in
the last section we study algebras of transformations for which the orthogo-
nal complement of an invariant subspace is again invariant.

//./ FINITE-DIMENSIONAL ALGEBRAS

A linear space V (over the field of complex numbers <p) is called an algebra
if an operation (usually called multiplication) is defined in V, which as-
sociates an element in V (denoted xy or x • y) with every (ordered) pair of
elements jc, y from V with the following properties: (a) a(xy) = (ax)y =
x(ay) for every a G <p and every jc, y G V; (b) (xy)z = x(yz) for every
jc, y, z G V (associativity of multiplication); (c) (x + y)z = xz + yz, x(y +
z) — xy + xz for every jc, y, z G V (distributivity of multiplication with re-
spect to addition).
Note that generally speaking xy ¥* yx in the algebra V. The algebra V may
or may not have an identity, that is, an element e G V such that ae — ea = a
for every a G V.
We consider only finite-dimensional algebras, that is, those that are
finite-dimensional linear spaces. The basic example of an algebra is Mn „, the
algebra of all n x n matrices with complex entries, with the usual multiplica-
tion operation. Another important example is the algebra of upper triangu-
lar n x n (complex) matrices.
The following theorem shows that actually every (finite-dimensional)

339
340 Algebras of Matrices and Invariant Subspaces

algebra is an algebra of (not necessarily all) matrices. This is the basic


simple result concerning representations of finite-dimensional algebras.

Theorem 11.1.1
Let V be an algebra of dimension n (as a linear space). If V has identity, then
V can be identified with an algebra of n x n matrices. If V does not have
identity, it can be identified with an algebra of (n + 1) x (n + 1) matrices.

Proof. Assume first that V has the identity e. Let x^,. . . , xn be a basis
in V. For every a E V the mapping a: V-* V defined by a(x) — ax, x E V is a
linear transformation. Denote by M(a) the n x n matrix that represents the
linear transformation a in the fixed basis x{,. . . , xn. It is easy to check that
the mapping M: F-» Mn n defined above is an algebraic homomorphism:

for any elements «, b E V and any a E <p. Further, the only element a E V
for which M(a) = 0 is a = 0. Indeed, if M(d) = 0, then a* = 0 for every x E V.
Taking x = e, we obtain a = 0. Hence we can identify V with the algebra
{M(a) | a e V}, which is simply an algebra of n x n matrices.
Assume now that V does not have identity. Define a new algebra V as all
ordered pairs (x, a) with jc E V, a e <p and with the following operations:

for any x, y € V and any a, /3, y G <p. Obviously, the algebra V has the
identity (0,1) and dimension n + 1. According to the part of Theorem
11.1.1 already proved, we can identify V with an algebra of (n + 1) x (n + 1)
matrices (clearly, dim V= n + l). As V can be identified in turn with the
subalgebra {(x, 0) | oc £ V} of V, the conclusion of Theorem 11.1.1
follows.

In view of Theorem 11.1.1 we consider only algebras of matrices in the


sequel.

11.2 CHAINS OF INVARIANT SUBSPACES

Let V be an algebra of (not necessarily all) n x n matrices. A subspace


is called V invariant if is invariant for any matrix from V. The
Chains of Invariant Subspaces 341

following basic fact (known as Burnside's theorem) establishes the existence


of nontrivial invariant subspaces for algebras of matrices.

Theorem 11.2.1
Let V be an algebra of n x n (complex) matrices with
Then there exists a nontrivial V-invariant subspace.

We exclude the case n — 1, when every subspace in <p" is trivial (in this
case the theorem fails for V= {0}). The proof of Theorem 11.2.1 is lengthy
and based on a series of auxiliary results; it is given in the next section.
Taking a maximal chain of V-invariant subspaces and using Burnside's
theorem we arrive at the following conclusion.

Theorem 11.2.2
For any algebra Vofnxn matrices, there is a chain of V-invariant subspaces

such that, with respect to a direct sum decomposition

where Np is a direct complement to every


transformation A E V has a block triangular form

and the set (App \ A £ V}, coincides with the algebra of all transformations
from Np into Np,for p = 1,. . . , k. The chain (11.2.1) is maximal, and every
maximal chain of V-invariant subspaces has the property stated above.

The case when V is the algebra of all block upper triangular matrices with
respect to the decomposition (11.2.2) is of special interest. Then Mnn is a
direct sum of two subspaces: V and W, where W is the algebra of all lower
block triangular matrices with zeros on the main block diagonal:

The subspaces
342 Algebras of Matrices and Invariant Subspaces

are all the invariant subspaces for W. In particular, we have the following
direct sum decompositions:

This motivates the following conjecture.

Conjecture 11.2.3
Let be nonzero subalgebras in Mn n such that
Then there exist nonzero invariant subspaces M} and M2 for
and respectively, which are direct complements of each other in <p".

We are able to prove a partial result in the direction of this conjecture.


Namely, if and are subalgebras in such that then
for every V^-invariant subspace and every V2-invariant subspace
either or (or both) holds. Indeed, assuming
the contrary, let be a direct complement to in i = 1, 2, and
let Jibe a direct complement to in Then we have a direct sum
decomposition

With respect to this decomposition, every has a block matrix


representation of type

[the zeros appear because of the V invariance of


whereas every has a block matrix representation of type

[the zeros appear because of the invariance of


So every matrix in has a zero in the (4,2) block entry, which
contradicts the assumption that
Proof of Theorem 11.2.1 343

11.3 PROOF OF THEOREM 11.2.1

We start with auxiliary results. A subset Q of an algebra U of n x n matrices


is called an ideal if Q is a subalgebra; that is, implie
and for every complex number a; and, in addition,
AB and ZL4 belong to Q as long as and . Trivial examples of
ideals are and

Lemma 11.3.1
The algebra has no nontrivial ideals.

Proof. Let Q be a nonzero ideal in and let It is


easily seen that for every pair of indices there are
matrices such that has a one in the entry and zeros
elsewhere. Now any matrix can be written

and thus belongs to Q. Hence

Now let U be an algebra of n x n matrices (n > 2) that has no nontrivial


invariant subspaces. We prove that thereby proving Theorem
11.2.1. The first observation is that without loss of generality we can assume
. Indeed, consider the algebra Obvious-
ly, U has no nontrivial invariant subspaces as well. Also, U is an ideal in U.
Hence, if we know already that then Lemma 11.3.1 implies that
either or But the latter case is excluded by the definition
of and the condition So it is assumed that /
Lemma 11.3.2 For every nonzero vector x in and every there
exists a matrix such that

Proof. The set is an invariant subspace for U. This


subspace is nonzero because x = I • x is a nonzero vector in M (recall that
By our assumption on U the subspace coincides with . Hence
for every there exists an such that

Lemma 11.3.3
The only matrices that commute with every matrix in U are the scalar
multiples of I.

Proof. Let be such that SA = AS for every Let be


an eigenvalue of S with corresponding eigenvector Then for every
we have
344 Algebras of Matrices and Invariant Subspaces

By Lemma 11.3.2, for every there is an A in U with So


equations (11.3.1) mean that

Lemma 11.3.4
If xl and x2 are linearly independent vectors in (p", then for every pair of
vectors there exists a matrix A from U such that
and

It is sufficient to show that there exist


Proof. such that
and Indeed, we may then us
Lemma 11.3.2 to find with Henc

We now prove the existence of A\. (The existence of A2 is proved


similarly.) Arguing by contradiction, assume that implies
for every Then one can define a transformation by the
requirement that for all Indeed, if for some
A and B in U, then and thus also which
means . So T is correctly defined. Further,
by Lemma 11.3.2; hence T is defined on the whole of Now for any A
and B in U we have

and since we find that for all By


Lemma 11.3.3, T = al for some a G <p. Therefore, for all
. But this contradicts Lemma 11.3.2.

We say that an algebra V of n x n matrices is /c transitive if for every set


of /c linearly independent vectors in and every set of k vectors
in there exists a matrix A such that
Evidently, every fc-transitive algebra is p transitive for
Lemma 11.3.4 says that the algebra U is 2 transitive.

Proof of Theorem 11.2.1 In view of Lemma 11.3.4 it is sufficient to


prove that every 2-transitive algebra V of n x n matrices is n transitive.
Assume by induction that V is k transitive, and we will prove that V is
transitive (here
So let be linearly independent vectors in It will suffice to
verify that for every there exists a matrix such that
(indeed, for given the 1 transitiv-
Proof of Theorem 11.2.1 345

ity of V implies the existence of such that then for


we have
We will prove the existence of one has simply to
permute the indices). Suppose that no such exists; that is,
implies that . Consider the algebra

of 2/i x In matrices. It turns out (because of the 2 transitivity of V) that any


V (2) -invariant subspace is one of the subspaces
for some A G (p. Indeed, the V^-invariant subspace M
(which we can assume to be nonzero) is a sum of cyclic V (2) -invariant
subspaces: M — M^ where

Fix an index /. For any n x / i matrix B, assuming xn,xi2 are linearly


independent, and by the assumption of 2 transitivity of V, we have Bxn —
Axn, Bxi2 — Axi2 for some hence invariant, where

Now because of the obvious 2 transitivity of Mn n, we find that Mf =


. Assume now that and xi2 are linearly dependent. Then 1
transitivity of V implies again that Ml is M^\ invariant. If xn = 0, we get
and if x for some et

Consequently, is equal to <p2" except for the two cases: (1)


xn = 0 for all / = 1, . . . , p\ (2) / = 1, . . . , p for the same
In the first case in the second case M

Now we return to the proof of the existence of By the induction


hypothesis, for each / ( ! < / < & ) there is some with and
The subspace

is V (2) invariant; therefore (according to the fact proved in the preceding


346 Algebras of Matrices and Invariant Subspaces

paragraph), there exists a complex number a such that


for all The induction hypothesis implies that

and the assumption that = 0 whenever for j = 1, . . , k shows


that a mapping T: is unambiguously defined by

Obviously, T is linear. Further, for A V and j - 1,. . . , k we have (where


the term ACjXj appears in the yth place)

Since the subspace coincided with by the 1


transitivity of V. So the linearity of T gives

Then, for A £ V

Hence {x \ Ax — 0 for all } is a nontrivial V-invariant subspace. This


contradicts the 1 transitivity of V.

11.4 REFLEXIVE LATTICES

Let A be a lattice of subspaces in (p". The set of all n x n matrices A such


that A3?C3? for every J2?EA, denoted Alg(A), is an algebra. Indeed, if
then
Reflexive Lattices 347

for every subspace On the other hand, for an algebra V of n x n


matrices the set Inv(V) of all V- in variant subspaces in <p" is easily seen to be
a lattice of subspaces [i.e., n v ( K ) implies In\(V) and
The following properties of Alg( ) and Inv(V) are
immediate consequences of the definitions.

Proposition 11.4.1
(a) If and are two lattices of subspaces in and then
(b) If V, and V-, are algebras of n x n matrices and
then

Let us check property (c), for example. Assum At or


every /4 lg(A). Hence £ is Alg( invariant; that is, Inv(Alg(

EXAMPLE 11.4.1. Let A be the chain

Then Alg(A) is the algebra of all upper triangular matrices.

EXAMPLE 11.4.2. Let A be the set of subspaces Span where K


runs over all subsets of { ! , . . . , « } . Clearly A is a lattice. The algebra
Alg(A) is easily seen to be the algebra of all diagonal matrices.

EXAMPLE 11.4.3. For a fixed subspace let A be the lattice of all


subspaces that are contained in M. Then Alg(A) is the algebra of all
transformations A having the form

with respect to the direct sum decomposition M + N (for a fixed direct


complement Jf to M).

EXAMPLE 11.4.4. Let V be the algebra of polynomials


where is a fixed linear transformation. Then Inv(V) is the lattice
of all /1-invariant subspaces.

EXAMPLE 11.4.5. Let A: $" —»• <p" be a fixed transformation, and let V be
the algebra of all transformations that commute with A. Then Inv(V) is the
lattice of all A-hyperinvariant subspaces.

Note that

    Alg(Inv(Alg(Λ))) = Alg(Λ)    (11.4.1)

for every lattice Λ of subspaces in ℂⁿ. Indeed, the inclusion ⊆ in
equation (11.4.1) follows from (c) and (a). To prove the opposite inclusion,
let A ∈ Alg(Λ). Then any subspace M belonging to Inv(Alg(Λ)) is invariant
for every transformation in Alg(Λ); in particular, M is A invariant. This
shows that A ∈ Alg(Inv(Alg(Λ))). Similarly, one proves that

    Inv(Alg(Inv(V))) = Inv(V)    (11.4.2)

for every algebra V of transformations.


A lattice of subspaces in is called reflexive if Inv(Alg(
Equality (11.4.2) shows, for example, that any lattice of the form Inv(V) for
some algebra V is reflexive. Let us give an example of a nonreflexive lattice.

EXAMPLE 11.4.6. Let Λ be the following lattice of subspaces in ℂ²:

    Λ = { {0}, L = Span{e₁}, M = Span{e₂}, N = Span{e₁ + e₂}, ℂ² }

Let us find the algebra Alg(Λ). The 2 × 2 matrix A = [[a, b], [c, d]] has
invariant subspaces L and M if and only if b = c = 0. Further, N is A
invariant if and only if a = d. So

    Alg(Λ) = { aI : a ∈ ℂ }

and Inv(Alg(Λ)) consists of all subspaces in ℂ².
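A quick symbolic check of this computation (a sketch in Python with sympy, not
part of the original text):

    import sympy as sp

    a, b, c, d = sp.symbols('a b c d')
    A = sp.Matrix([[a, b], [c, d]])

    # Invariance of Span{e1} and Span{e2} forces c = 0 and b = 0; invariance
    # of Span{e1 + e2} forces A*(e1 + e2) to be a multiple of e1 + e2.
    w = A * sp.Matrix([1, 1])
    constraints = [A[1, 0], A[0, 1], w[0] - w[1]]
    print(sp.solve(constraints, [b, c, d]))    # {b: 0, c: 0, d: a}, i.e., A = aI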

Many results are known about sufficient conditions for reflexivity of a


lattice of subspaces. Often the key ingredient in such conditions is distribu-
tivity. Recall the definition of a distributive lattice of subspaces given in
Section 9.6.

Theorem 11.4.2
A distributive lattice of subspaces in ℂⁿ is reflexive. Conversely, every finite
reflexive lattice of subspaces is distributive.

The proof of Theorem 11.4.2 is beyond the scope of this book, and we
refer the reader to the original papers by Johnson (1964) and Harrison
(1974) for the full story. Here, we shall only prove two particular cases in
the form of Theorems 11.4.3 and 11.4.4.

Theorem 11.4.3
A complete chain of subspaces

    {0} = M₀ ⊂ M₁ ⊂ ⋯ ⊂ M_{n−1} ⊂ M_n = ℂⁿ,  dim M_i = i,

is reflexive.

Proof. Let f₁, . . . , f_n be a basis in ℂⁿ such that Span{f₁, . . . , f_i} = M_i,
i = 1, . . . , n − 1, and write linear transformations as matrices with respect to
this basis. Example 11.4.1 shows that Alg(Λ) consists of all upper triangular
matrices. As the linear transformation given by the single Jordan block

    J_n(0)

obviously belongs to Alg(Λ), and its only invariant subspaces are {0}, M_i,
i = 1, . . . , n − 1, and ℂⁿ, we have Inv(Alg(Λ)) ⊆ Λ. Since the reverse inclu-
sion is clear, the conclusion of Theorem 11.4.3 follows. □

The next theorem deals with lattices that are as unlike chains as possible.
A lattice Λ of subspaces in ℂⁿ is called a Boolean algebra if it is
distributive and for every M ∈ Λ there is a unique complement M′ (i.e., a
subspace M′ that belongs to Λ and satisfies M ∩ M′ = {0}, M + M′ = ℂⁿ).
We say that a nonzero subspace X ∈ Λ is an atom if there are no subspaces
of the lattice Λ strictly between X and {0}. The Boolean algebra Λ is called
atomic if any M ∈ Λ is the sum of all atoms X contained in M. A typical
example of an atomic Boolean algebra of subspaces is Λ = {Span{x_i | i ∈ E},
where E is any subset in {1, 2, . . . , n}}, and x₁, . . . , x_n is a fixed basis in ℂⁿ.

Theorem 11.4.4
Every atomic Boolean algebra Λ of subspaces of ℂⁿ is reflexive.

Proof. Let K be the set of all atoms in Λ, and for every X ∈ K let P_X be
the projector on X along the complement X′ of X in the lattice Λ. We shall
show that Λ = Inv(V), where V is the algebra generated by the transfor-
mations of type P_X A P_X, where A: ℂⁿ → ℂⁿ and X ∈ K. In other words, V
consists of all linear combinations of transformations of type

    P_{X₁} A₁ P_{X₁} + ⋯ + P_{X_m} A_m P_{X_m}

where A_j: ℂⁿ → ℂⁿ and X₁, . . . , X_m are atoms in Λ.


Let L be an atom in Λ. For any atom X, we have either L ⊆ X or L ⊆ X′.
(This follows from the distributivity of Λ:

    L = L ∩ (X + X′) = (L ∩ X) + (L ∩ X′)

as L is an atom, either L = L ∩ X or L = L ∩ X′ holds.) In the former
case Im(P_X A P_X) ⊆ X = L for every transformation A: ℂⁿ → ℂⁿ, and in the latter
case L ⊆ Ker(P_X A P_X). In either case L is P_X A P_X invariant. Hence
L ∈ Inv(V). Now every M ∈ Λ is a sum of the (finitely many) atoms contained in
M. Hence M ∈ Inv(V). In other words, Λ ⊆ Inv(V).
To prove the reverse inclusion, it is convenient to use the following fact:
if X is an atom in Λ and M ∈ Inv(V), then either X ⊆ M or M ⊆ X′.
Indeed, suppose that M is not contained in X′, so there exists a vector
f ∈ M such that P_X f ≠ 0. Since P_X f ≠ 0, it follows that every vector x in
X has the form A P_X f for some transformation A: ℂⁿ → ℂⁿ. Then also
x = P_X A P_X f. As f ∈ M and M ∈ Inv(V), we have x ∈ M; that is, X ⊆ M.
Return to the proof of the inclusion Inv(V) ⊆ Λ. Let M ∈ Inv(V), and let
M₀ be the sum of all the atoms in Λ that are contained in M. Also, let N₀
be the intersection of all the complements of atoms in Λ such that
these complements contain M. Obviously

    M₀ ⊆ M ⊆ N₀    (11.4.3)

Since Λ is atomic, the complement M₀′ is the sum of all atoms that are
not contained in M. (Indeed, if an atom X is contained in M₀′, then X is not
contained in M₀, and thus, by the definition of M₀, X is not contained in M.
Conversely, if an atom X is not contained in M, then obviously X is not
contained in M₀, and since X is an atom, it must be contained in M₀′.) The
fact proved in the preceding paragraph shows that M₀′ is the sum of all the
atoms X with the property that M ⊆ X′. For any finite set X₁, . . . , X_p of
such atoms, with M ⊆ X_i′ for i = 1, . . . , p, we have (using the distributivity of Λ)

    (X₁ + ⋯ + X_p) ∩ X₁′ ∩ ⋯ ∩ X_p′ = {0}

and N₀ ⊆ X₁′ ∩ ⋯ ∩ X_p′, so actually

    N₀ ∩ M₀′ = {0}

This shows that dim N₀ ≤ dim M₀; hence N₀ = M₀. Combining this with (11.4.3), we
see that M = M₀ ∈ Λ, and thus Inv(V) ⊆ Λ. □

11.5 REDUCTIVE AND SELF-ADJOINT ALGEBRAS

We have seen in Corollary 3.4.4 that the set Inv(A) of invariant subspaces of
a transformation A: ℂⁿ → ℂⁿ has the property that M ∈ Inv(A) exactly
when M⊥ ∈ Inv(A) if and only if A is normal. This property makes it
natural to introduce the following definition: an algebra V of n × n matrices
is called reductive if it contains I and for every subspace belonging to Inv(V)
its orthogonal complement belongs to Inv(V) as well. Thus the algebra P(A)
of all polynomials p(A), where A is a normal transformation, is reduc-
tive. This algebra P(A) has the property that X ∈ P(A) implies X* ∈ P(A).
Indeed, we have only to show that, for the normal transformation A, the
adjoint A* is a polynomial in A. Passing, if necessary, to the orthonormal
basis of eigenvectors of A, we can assume that A is diagonal: A =
diag[λ₁, . . . , λ_n]. Now let p be a scalar polynomial satisfying the
conditions p(λ_i) = λ̄_i, i = 1, . . . , n. Then clearly p(A) = A*.
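This interpolation argument can be checked numerically; here is a minimal
sketch (Python with numpy, our illustration rather than the book's) that builds
p by solving a Vandermonde system for a normal matrix with distinct eigenvalues:

    import numpy as np

    lam = np.array([1.0 + 2.0j, -0.5j, 3.0])            # distinct eigenvalues
    Q, _ = np.linalg.qr(np.random.rand(3, 3) + 1j * np.random.rand(3, 3))
    A = Q @ np.diag(lam) @ Q.conj().T                   # a normal matrix

    # Coefficients of the polynomial p with p(lam_i) = conj(lam_i)
    coeffs = np.linalg.solve(np.vander(lam, increasing=True), lam.conj())
    pA = sum(c * np.linalg.matrix_power(A, k) for k, c in enumerate(coeffs))
    print(np.allclose(pA, A.conj().T))                  # True: p(A) = A*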
The next theorem shows that this property of the reductive algebra P(A)
is a particular case of a much more general fact.

Theorem 11.5.1
An algebra V of n × n matrices with I ∈ V is reductive if and only if V is
self-adjoint, that is, A ∈ V implies A* ∈ V.

As a subspace M is A invariant if and only if M⊥ is A* invariant, it
follows immediately that every self-adjoint algebra with identity is reductive.
To prove the converse, we need the following basic property of invariant
subspaces of reductive algebras.

Lemma 11.5.2
Let V be a reductive algebra of n × n matrices, and let M₁, . . . , M_m be a set
of mutually orthogonal V-invariant subspaces such that

    M₁ ⊕ M₂ ⊕ ⋯ ⊕ M_m = ℂⁿ

and for every i the set of restrictions {A|_{M_i} : A ∈ V} coincides with the algebra
M(M_i) of all transformations from M_i into M_i. Then V is self-adjoint.

Proof. We proceed by induction on m. For m = 1, that is, M₁ = ℂⁿ and
V = M(ℂⁿ), the lemma is obvious. So assume that the lemma is proved
already for m − 1 subspaces, and we prove the lemma for m subspaces.
It is convenient to distinguish two cases. In case 1, there exist distinct
integers j and k between 1 and m and an algebraic isomorphism
φ: M(M_j) → M(M_k) such that φ(A|_{M_j}) = A|_{M_k} for every A ∈ V. This means
that φ is a one-to-one and onto map with the following properties:

    φ(αA + βB) = αφ(A) + βφ(B),  φ(AB) = φ(A)φ(B)

As {A|_{M_j} : A ∈ V} is equal to M(M_j), φ is defined on the whole algebra M(M_j).

We show first that there exists an invertible transformation S: M_j → M_k
such that φ(X) = SXS⁻¹ for every X ∈ M(M_j). Note that φ takes
rank 1 projectors into rank 1 projectors. Indeed, if P² = P and
rank P = 1, then φ(P)² = φ(P), so φ(P) is a projector. Moreover, the
one-dimensional subspace

    { PXP : X ∈ M(M_j) }

is mapped by φ onto the subspace

    { φ(P)Yφ(P) : Y ∈ M(M_k) }

so the latter subspace is also one-dimensional; hence
rank φ(P) = 1. Now fix any nonzero vector f ∈ M_j, and let A₀ be
the orthogonal projector on Span{f}. As φ(A₀): M_k → M_k is also a one-
dimensional projector, there exists an invertible transformation
S₀: M_j → M_k such that φ(A₀) = S₀A₀S₀⁻¹. (This follows from the fact that
the Jordan form of any one-dimensional projector is the same:
diag[1, 0, . . . , 0].) Define S: M_j → M_k by

    S(Af) = φ(A)S₀f,  A ∈ M(M_j)

Let us show that this definition is correct. Indeed, if A₁f = A₂f, then
(A₁ − A₂)A₀ = 0. Consequently, (φ(A₁) − φ(A₂))φ(A₀) = 0, and since
φ(A₀) is a projector onto Span{S₀f}, we obtain φ(A₁)S₀f = φ(A₂)S₀f.
In other words, A₁f = A₂f happens only if φ(A₁)S₀f = φ(A₂)S₀f. Hence S
is correctly defined. Clearly, S is linear and onto. If φ(A)S₀f = 0, then
φ(AA₀) = φ(A)φ(A₀) = 0, which implies AA₀ = 0 and Af = 0. This shows that
Ker S = {0}. Hence S is invertible. Finally, for every A, B ∈ M(M_j) we
have

    S(ABf) = φ(AB)S₀f = φ(A)φ(B)S₀f = φ(A)S(Bf)

and thus SAg = φ(A)Sg for every g ∈ M_j. Thus φ(A) = SAS⁻¹ for all A ∈ M(M_j).

Next, we show that S can be taken to be unitary, that is, S⁻¹ = S*. Let L
be the subspace in ℂⁿ consisting of all vectors of the form x₁ + ⋯ + x_m, where
x_i ∈ M_i for each i and x_k = Sx_j. As

    A|_{M_k}(Sx_j) = φ(A|_{M_j})Sx_j = S(A|_{M_j}x_j)

for every A ∈ V, it follows that L is V invariant. Since V is reductive, L⊥ is
V invariant as well. A computation shows that

The fact that for all implies that is for


and then

As {<<4|^| AEV} coincides with M(Mt) and in the preceding equality jty.
can be an arbitrary vector from M -^ we obtain B = S*SBS~lS*~l for all
By Proposition 9.1.6, for some number that must
1/2
be positive because 5*5 is positive definite. Letting 5, we obtain a
l
unitary transformation U such that (B) = UBU~ for all
We next show that V|_{M_k⊥} is reductive. Indeed, let N ⊆ M_k⊥
be V|_{M_k⊥} invariant. Then clearly N is V invariant,
and by the reductive property of V, so is N⊥, and hence also N⊥ ∩ M_k⊥. It
remains to notice that N⊥ ∩ M_k⊥ coincides with the orthogonal complement
to N in M_k⊥.
By the induction hypothesis, V|_{M_k⊥} is self-adjoint. Therefore, for every
matrix A ∈ V the transformation

    (11.5.1)

belongs to V. As for every A ∈ V we have

it follows that

But U is unitary, so the transformation (11.5.1) is just
A*. We have proved that V is self-adjoint (in case 1).
Consider now case 2. For any pair of distinct integers j and k between 1
and m, there is no algebraic isomorphism φ as in case 1. If for fixed j and k,
A|_{M_j} = 0 implies A|_{M_k} = 0 for any A ∈ V and vice versa, then we can
correctly define an algebraic isomorphism φ: M(M_j) → M(M_k) by putting
φ(A|_{M_j}) = A|_{M_k} for all A ∈ V [recall that {A|_{M_j} : A ∈ V} = M(M_j)]. Thus our assump-
tion in case 2 implies the following. For each pair j, k of distinct integers
between 1 and m there exists a matrix A ∈ V such that exactly one of the
transformations A|_{M_j}, A|_{M_k} is zero.
We now prove that there exists a matrix A ∈ V such that A|_{M_j} is different
from zero for exactly one index j. Choose A ∈ V different from zero so that
the number p of indices j with A|_{M_j} ≠ 0 is minimal. Permuting M₁, . . . , M_m
if necessary, we can assume that

    A|_{M_j} ≠ 0 for j = 1, . . . , p;  A|_{M_j} = 0 for j = p + 1, . . . , m
We must show that p = 1. Assume the contrary, that is, p ≥ 2.
Interchanging M₁ and M_p if necessary, we can assume that C = A|_{M₁} ≠ 0
for some matrix A ∈ V with A|_{M_p} = 0. Denote by I₁ the set of all transformations
B: M₁ → M₁ such that B = B̃|_{M₁} for some B̃ ∈ V with B̃|_{M_p} = 0. The fact
that {A|_{M₁} : A ∈ V} = M(M₁) implies that I₁ is an ideal in M(M₁). Since C ∈ I₁ and
C ≠ 0, Lemma 11.3.1 shows that actually I₁ = M(M₁). Similarly, the set I₂
of all transformations B: M₁ → M₁ such that B = B̃|_{M₁} for some B̃ ∈ V with
B̃|_{M_{p+1}} = 0, . . . , B̃|_{M_m} = 0, is a nonzero ideal in M(M₁), and thus I₂ = M(M₁).
Now the identity transformation I: M₁ → M₁ belongs to both I₁ and
I₂. Therefore, there exist transformations B̃_j ∈ V
(j = 2, 3, . . . , p) such that

and

belongs to V. Then also the product of these matrices belongs to V, and

However, this contradicts the choice of p. So, indeed, p = 1.
As the ideal I₂ constructed above coincides with M(M₁), it follows that
every matrix B from V is the sum of its two restrictions B|_{M₁} and B|_{M₁⊥}.
Since V|_{M₁} = M(M₁), we find that V is self-adjoint provided V|_{M₁⊥} is. But the
algebra V|_{M₁⊥} is easily seen to be reductive because V is. Now the self-
adjointness of V|_{M₁⊥} follows from the induction hypothesis. Lemma 11.5.2 is
proved completely. □

Now we are ready to prove the converse statement of Theorem 11.5.1. If
V has no nontrivial invariant subspaces, then by Theorem 11.2.1 V = M(ℂⁿ),
and obviously V is self-adjoint. If V has nontrivial invariant subspaces, then
it has a minimal one, say, M₁. As V is reductive, M₁⊥ is also V invariant, and
the restriction V|_{M₁⊥} is reductive. If V|_{M₁⊥} is not the algebra of all transfor-
mations M₁⊥ → M₁⊥, then there exists a minimal nontrivial V-invariant sub-
space M₂ ⊆ M₁⊥. Proceeding in this manner, we obtain a sequence of
mutually orthogonal V-invariant subspaces M₁, . . . , M_m such that

    M₁ ⊕ ⋯ ⊕ M_m = ℂⁿ

and for each j there are no nontrivial V-invariant subspaces in M_j. By
Theorem 11.2.1 the restriction V|_{M_j} (j = 1, . . . , m) coincides with the
algebra of all transformations M_j → M_j. It remains to apply Lemma 11.5.2. □

11.6 EXERCISES

11.1 Prove or disprove that the following sets of n × n matrices are


algebras:
(a) Upper triangular Toeplitz matrices:

(b) Toeplitz matrices:

(c) Circulant matrices:

(d) Companion matrices:

(e) Upper triangular matrices where


11.2 Prove or disprove that the following sets of nk × nk matrices are
algebras:
(a) Block upper triangular Toeplitz matrices (1), where a_i are k × k
matrices, i = 1, . . . , n.
(b) Block Toeplitz matrices (2), where a_j are k × k matrices,
j = −n + 1, . . . , n − 1.
(c) Block circulant matrices (3), where a_j are k × k matrices,
j = 1, . . . , n.

(d) Block upper triangular matrices [a_{ij}]ⁿ_{i,j=1}, where a_{ij} are k × k
matrices and a_{ij} = 0 if i > j.
(e) Matrices of type

where a_{ij} are k × k matrices.


11.3 Show that the set of all n × n matrices of type

is an algebra. Find all invariant subspaces of this algebra.


11.4 Let A be an n × n matrix.
(a) Show that the set

is not necessarily an algebra.

(b) Prove that the closure of Q, that is, the set of all n × n matrices X for
which there exists a sequence {X_m}_{m=1}^∞ with X_m ∈ Q for m =
1, 2, . . . and lim_{m→∞} X_m = X, is an algebra with identity.
(c) Describe all invariant subspaces of the closure of Q.
11.5 Show that the algebra of all n × n upper triangular Toeplitz matrices
and the algebra of all n × n upper triangular matrices have exactly
the same lattice of invariant subspaces.
11.6 Show that the algebra of all upper triangular n × n matrices contains
any algebra A for which

11.7 Show that there is no algebra A with identity strictly contained in the
algebra UT(n) of upper triangular Toeplitz matrices for which

11.8 Prove that the algebra U(n) of n x n upper triangular matrices is the
unique reflexive algebra for which the lattice of all invariant sub-
spaces is the chain

11.9 Show that there exist n different algebras V₁, . . . , V_n whose set of
invariant subspaces coincides with (5) and for which

11.10 Find all invariant subspaces of the algebra of all 2n × 2n matrices of


type

where A, B, C, and D are upper triangular matrices.


11.11 As Exercise 11.10 but now, in addition, B and C have zeros along
the main diagonal.
11.12 Find all invariant subspaces of the algebra of all 2n × 2n matrices
, where A, B, C, and D are n × n circulant matrices.
11.13 Let A be an n × n matrix that is not a scalar multiple of the identity.
Find a nontrivial invariant subspace for the algebra of all matrices
that commute with A. Does there exist such a subspace of dimension
1?
11.14 Let A be an n × n matrix and

be the algebra of polynomials in A. Give necessary and sufficient
conditions for reflexivity of V in terms of the structure of the Jordan
form of A.
11.15 Indicate which of the following algebras are reflexive:
(a) n × n upper triangular Toeplitz matrices.
(b) n × n upper triangular matrices.
(c) n × n circulant matrices.
(d) nk × nk block circulant matrices (with k × k blocks).
(e) nk × nk block upper triangular matrices (with k × k blocks).
(f) nk × nk block upper triangular Toeplitz matrices (with k × k
blocks).
(g) the algebra from Exercise 11.3.

11.16 Let Q be as in Exercise 11.4. When is the closure of Q a reflexive
algebra?
11.17 Given a chain of subspaces

construct reflexive and nonreflexive algebras whose set of invariant
subspaces coincides with (6).
11.18 Let x₁, . . . , x_n be a basis in ℂⁿ, and let Λ be the minimal lattice of
subspaces that contains Span{x₁}, . . . , Span{x_n}. Prove that there
exists a unique algebra V for which Λ = Inv(V). Is V reflexive?
11.19 Let V be an algebra of n × n matrices without identity and such that
Aⁿ = 0 for every A ∈ V. Prove that A₁A₂ ⋯ A_n = 0 for every n-tuple
of matrices A₁, . . . , A_n from V. (Hint: Use Theorem 11.2.2.)
Chapter Twelve

Real Linear
Transformations

In this chapter we review the basic facts concerning invariant subspaces for
transformations from ℝⁿ into ℝⁿ, focusing mainly on those results that are
different (or whose proofs are different) in the real case, or that cannot be
obtained as immediate corollaries from the corresponding results for trans-
formations from ℂⁿ into ℂⁿ.
We note here that the applications presented in Chapters 5, 7, and 8 also
hold in the real case. That is, applications to matrix polynomials
with real n × n coefficients A_j and to rational matrix functions W(λ) whose
values are real n × n matrices for the real values of λ that are not poles of
W(λ).
polynomials and rational matrix functions in terms of invariant subspaces (as
developed in Chapters 5 and 7) holds for matrices over any field. This
remark applies for the linear fractional decompositions of rational matrix
functions as well. In contrast, the Brunovsky canonical form (Section 6.2) is
not available in the framework of real matrices, so all the results of Chapter
6 that are based on the Brunovsky canonical form fail, in general, in this
context. Also, the results of Chapter 11 do not generally hold in the
context of finite-dimensional algebras over the field of real numbers.

12.1 DEFINITION, EXAMPLES, AND FIRST PROPERTIES OF INVARIANT SUBSPACES

Let A: ℝⁿ → ℝⁿ be a linear transformation. As in the case of linear
transformations on a complex space, we say that a subspace M ⊆ ℝⁿ is
invariant for A (or A invariant) if Ax ∈ M for every x ∈ M. The whole of ℝⁿ
and the zero subspace are trivially A invariant, and the same applies to
Im A and Ker A. As in the complex case, one checks that all the nonzero
invariant subspaces of the n × n Jordan block with real eigenvalue (con-
sidered as a transformation from ℝⁿ into ℝⁿ written as a matrix in the
standard orthonormal basis e₁, . . . , e_n) are Span{e₁, . . . , e_k}, k = 1, . . . , n.
Also, for the diagonal matrix A = diag[λ₁, . . . , λ_n], where λ₁, . . . , λ_n are
distinct real numbers, all the invariant subspaces are of the form
Span{e_i | i ∈ K} with K ⊆ {1, . . . , n} (Span{e_i | i ∈ ∅} is interpreted as the
zero subspace).
In addition to these examples, the following example is basic and
especially significant for real transformations.

EXAMPLE 12.1.1. Let

    A = [[K, I₂, 0, . . . , 0], [0, K, I₂, . . . , 0], . . . , [0, . . . , 0, K]],  K = [[σ, τ], [−τ, σ]]

(a block matrix built from 2 × 2 blocks), where σ and τ are real numbers and
τ ≠ 0. The size n of the matrix A is
obviously an even number. It is easily seen that Span{e₁, . . . , e_{2k}}, k =
1, . . . , n/2 are A-invariant subspaces. It turns out that A has no other
nontrivial invariant subspaces. Indeed, replacing A by A − σI, we can
assume without loss of generality that σ = 0. We prove that if M is an
A-invariant subspace and x = Σ_{i=1}^{2k} a_i e_i ∈ M with at least one of the real
numbers a_{2k−1} and a_{2k} different from zero, then M ⊇ Span{e₁, . . . , e_{2k}}, and
proceed by induction on k.
In the case k = 1 we have x = a₁e₁ + a₂e₂ and A(a₁e₁ + a₂e₂) = τa₂e₁ −
τa₁e₂. The conditions τ ≠ 0 and a₁² + a₂² ≠ 0 ensure that both vectors e₁
and e₂ are linear combinations of a₁e₁ + a₂e₂ and τa₂e₁ − τa₁e₂, and the
assertion is proved for k = 1. Assume now that the assertion is proved for
k − 1. A computation shows that
the vector y = A²x + τ²x belongs to Span{e₁, . . . , e_{2k−2}}, and in the linear
combination y = Σ β_i e_i at least one of the numbers β_{2k−3}, β_{2k−2} is
different from zero. Obviously y ∈ M, so the induction assumption implies
M ⊇ Span{e₁, . . . , e_{2k−2}}. Hence a_{2k−1}e_{2k−1} + a_{2k}e_{2k} ∈ M; as the differ-
ence Ax − (τa_{2k}e_{2k−1} − τa_{2k−1}e_{2k}) belongs to Span{e₁, . . . , e_{2k−2}}, also
τa_{2k}e_{2k−1} − τa_{2k−1}e_{2k} ∈ M. Consequently, the vectors e_{2k−1} and e_{2k} belong
to M, and M ⊇ Span{e₁, . . . , e_{2k}}. In particular, A has no odd-dimensional
invariant subspaces. □
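The block structure of this example is easy to reproduce numerically. The
following sketch (Python with numpy; the builder real_jordan_block is our
naming) constructs a matrix of the above form and confirms both its eigenvalues
and the invariance of Span{e₁, . . . , e_{2k}}:

    import numpy as np

    def real_jordan_block(sigma, tau, half_size):
        # 2k x 2k matrix: blocks [[sigma, tau], [-tau, sigma]] on the block
        # diagonal and 2 x 2 identities on the block superdiagonal.
        K = np.array([[sigma, tau], [-tau, sigma]])
        A = np.zeros((2 * half_size, 2 * half_size))
        for j in range(half_size):
            A[2*j:2*j+2, 2*j:2*j+2] = K
            if j + 1 < half_size:
                A[2*j:2*j+2, 2*j+2:2*j+4] = np.eye(2)
        return A

    A = real_jordan_block(1.0, 2.0, 3)          # 6 x 6, eigenvalues 1 +/- 2i
    print(np.unique(np.linalg.eigvals(A).round(6)))
    # Span{e1,...,e_{2k}} is A-invariant: the first 2k columns vanish below row 2k.
    print(all(np.allclose(A[2*k:, :2*k], 0) for k in range(1, 4)))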

We say that a complex number λ₀ is an eigenvalue of A if det(λ₀I − A) =
0. Note that we admit nonreal numbers as eigenvalues of the real transfor-
mation A. As before, the set of all eigenvalues of A will be called the
spectrum of A and denoted by σ(A). Since the polynomial det(λI − A) has
real coefficients (as one can see by writing A in matrix form in some basis in
ℝⁿ), it follows that the spectrum of A is symmetric with respect to the real
axis: if λ₀ is an eigenvalue of A, so is λ̄₀, and the multiplicity of λ₀ as a zero
of det(λI − A) is equal to that of λ̄₀.
Not every transformation A: ℝⁿ → ℝⁿ has real eigenvalues. For instance,
in Example 12.1.1 the eigenvalues of A are σ + iτ and σ − iτ. However, if n
is odd, then A must have at least one real eigenvalue. Indeed, det(λI − A)
is a monic polynomial of degree n with real coefficients; hence for n odd
det(λI − A) has real zeros. This implies the following fact (which has already
been observed in the case of Example 12.1.1).

Proposition 12.1.1
If the transformation A: ℝⁿ → ℝⁿ has no real eigenvalues, then A has no
odd-dimensional invariant subspaces.

Proof. If M ⊆ ℝⁿ were an odd-dimensional A-invariant subspace, the
restriction A|_M would have a real eigenvalue, which contradicts the fact that
A has no real eigenvalues. (As in the complex case, the eigenvalues of any
restriction A|_M to an A-invariant subspace are necessarily eigenvalues of
A.) □

The Jordan chains for real transformations are defined in the same way as
for complex transformations: vectors x₀, . . . , x_k ∈ ℝⁿ form a Jordan chain
of the transformation A: ℝⁿ → ℝⁿ corresponding to the eigenvalue λ₀ of A if
x₀ ≠ 0 and Ax₀ = λ₀x₀; Ax_j − λ₀x_j = x_{j−1}, j = 1, . . . , k. The vector x₀ is
called an eigenvector. The eigenvalue λ₀ for which a Jordan chain exists must
obviously be real. Since not every real transformation has real eigenvalues,
it follows that there exist transformations A: ℝⁿ → ℝⁿ without Jordan chains
(and in particular without eigenvectors). On the other hand, for every real
eigenvalue λ₀ of A: ℝⁿ → ℝⁿ there exists an eigenvector (which is any
nonzero vector from Ker(λ₀I − A) ⊆ ℝⁿ). In particular, A has eigenvectors
provided n is odd.
As we have seen (e.g., in Example 12.1.1), not every real transformation
has one-dimensional invariant subspaces. In contrast, two-dimensional in-
variant subspaces always exist, as shown in the following proposition.

Proposition 12.1.2
Any transformation A: ℝⁿ → ℝⁿ with n ≥ 2 has at least one two-dimensional
invariant subspace.

Proof. Assume first that A has a pair of nonreal eigenvalues σ + iτ,
σ − iτ (σ, τ are real, τ ≠ 0). Then

    det[A² − 2σA + (σ² + τ²)I] = 0

Let x ∈ ℝⁿ \ {0} be such that

    [A² − 2σA + (σ² + τ²)I]x = 0    (12.1.1)

Then clearly the subspace M = Span{x, Ax} is A-invariant. Further, M
cannot be one-dimensional because otherwise Ax = μx for some μ ∈ ℝ,
which in view of equality (12.1.1) would imply μ² − 2μσ + (σ² + τ²) = 0, or
(μ − σ)² + τ² = 0, which is impossible since τ ≠ 0.
If A has no nonreal eigenvalues, then (leaving aside the trivial case when
A is a scalar multiple of I) the subspace Span{x, y}, where x and y are
eigenvectors of A corresponding to different eigenvalues, is two-dimensional
and A invariant. □
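The first half of this proof is entirely constructive; the following sketch
(Python with numpy and scipy, our illustration) extracts a two-dimensional
invariant subspace Span{x, Ax} from a null vector of A² − 2σA + (σ² + τ²)I:

    import numpy as np
    from scipy.linalg import null_space

    A = np.array([[0., 1., 0., 0.],
                  [-1., 0., 0., 0.],
                  [0., 0., 0., 2.],
                  [0., 0., -2., 0.]])      # eigenvalues +/- i, +/- 2i: none real

    sigma, tau = 0.0, 1.0                  # target the pair sigma +/- i*tau = +/- i
    Qm = A @ A - 2 * sigma * A + (sigma**2 + tau**2) * np.eye(4)
    x = null_space(Qm)[:, 0]               # nonzero x with Qm @ x = 0
    M = np.column_stack([x, A @ x])        # basis of Span{x, Ax}
    P = M @ np.linalg.pinv(M)
    print(np.allclose(P @ (A @ M), A @ M)) # True: Span{x, Ax} is A-invariant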

It is clear now that Theorem 1.9.1 is generally false for real transfor-
mations. The next result is the real analog of that theorem.

Theorem 12.1.3
Let A: ℝⁿ → ℝⁿ be a transformation and assume that det(λI − A) has exactly
s real zeros (counting multiplicities). Then there exists an orthonormal basis
x₁, . . . , x_n in ℝⁿ such that, with respect to this basis, the transformation A
has the form [a_{ij}]ⁿ_{i,j=1} where all the entries a_{ij} with i > j are zeros except
possibly for a_{s+2,s+1}, a_{s+4,s+3}, . . . , a_{n,n−1}.

So the matrix [a_{ij}]ⁿ_{i,j=1} is "almost" upper triangular.

Proof. Apply induction on n. If A has a real eigenvalue, then use the
proof of Theorem 1.9.1. If A has no real eigenvalues, then pick a two-
dimensional A-invariant subspace M (which exists by Proposition 12.1.2) with
an orthonormal basis x, y. Write A as a 2 × 2 block matrix with respect to
the orthogonal decomposition ℝⁿ = M ⊕ M⊥:

    A = [[A₁₁, A₁₂], [0, A₂₂]]

and apply the induction hypothesis to the transformation A₂₂: M⊥ → M⊥. □
It follows from Theorem 12.1.3 that a transformation A: ℝⁿ → ℝⁿ with
det(λI − A) having s real zeros has a chain of p + 1 = ½(n + s) + 1 invariant
subspaces:

    {0} = M₀ ⊂ M₁ ⊂ ⋯ ⊂ M_p = ℝⁿ

(Observe that n − s is the number of nonreal zeros of det(λI − A). So n − s
and n + s are even numbers.) We leave it to the reader to verify that
½(n + s) + 1 is the maximal number of elements in a chain of A-invariant
subspaces.
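The "almost" upper triangular form of Theorem 12.1.3 is essentially what
numerical libraries call the real Schur form; for instance (a sketch using
scipy, not part of the original text):

    import numpy as np
    from scipy.linalg import schur

    A = np.random.rand(5, 5)
    T, Z = schur(A, output='real')          # A = Z @ T @ Z.T with Z orthogonal
    # T is quasi-upper-triangular: 1 x 1 diagonal blocks for real eigenvalues,
    # 2 x 2 diagonal blocks for complex-conjugate pairs.
    print(np.allclose(Z @ T @ Z.T, A), np.allclose(np.tril(T, -2), 0))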
We say that a transformation A: ℝⁿ → ℝⁿ is self-adjoint if (Ax, y) =
(x, Ay) for every x, y ∈ ℝⁿ. [As usual, (·, ·) stands for the standard scalar
product in ℝⁿ.] In other words, A is self-adjoint if A = A*. Also, a
transformation A is called unitary if A* = A⁻¹ and normal if AA* = A*A.
Note that in an orthonormal basis a self-adjoint transformation is represen-
ted by a symmetric matrix, and a unitary transformation is represented by
an orthogonal matrix. (Recall that a real matrix U is called orthogonal if
UUᵀ = UᵀU = I.)
For normal transformations the "almost" triangular form of Theorem
12.1.3 is actually "almost" diagonal:

Theorem 12.1.4
Let A be as in Theorem 12.1.3 and assume, in addition, that A is normal.
Then there exists an orthonormal basis in ℝⁿ with respect to which A has
the matrix form [a_{ij}]ⁿ_{i,j=1}, where a_{ij} = 0 for i ≠ j except possibly for
a_{s+2,s+1}, a_{s+1,s+2}, . . . , a_{n,n−1}, a_{n−1,n}.

Proof. Use an orthonormal basis in ℝⁿ with the properties described in
Theorem 12.1.3, and observe that the equality A*A = AA* implies that
actually a_{ij} = 0 for i < j except a_{s+1,s+2}, . . . , a_{n−1,n}. □

12.2 ROOT SUBSPACES AND THE REAL JORDAN FORM

Let A: ℝⁿ → ℝⁿ be a transformation. The root subspace R_{λ₀}(A) correspond-
ing to the real eigenvalue λ₀ of A is defined to be Ker(λ₀I − A)ⁿ, as in the
complex case. Then R_{λ₀}(A) is spanned by the members of all Jordan chains
of A corresponding to λ₀. For a pair of nonreal eigenvalues σ + iτ, σ − iτ of
A (here σ, τ are real and τ ≠ 0) the root subspace is defined by

    R_{σ±iτ}(A) = Ker[(A − σI)² + τ²I]^p

where p is a positive integer such that

    Ker[(A − σI)² + τ²I]^p = Ker[(A − σI)² + τ²I]^{p+k}

for every positive integer k.
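Numerically, such a root subspace can be obtained as a null space; a minimal
sketch (Python with numpy/scipy; taking the exponent p = n is always a safe
upper bound):

    import numpy as np
    from scipy.linalg import null_space

    def real_root_subspace(A, sigma, tau):
        # Orthonormal basis of Ker[((A - sigma*I)^2 + tau^2*I)^n] for the
        # conjugate pair sigma +/- i*tau.
        n = A.shape[0]
        M = (A - sigma * np.eye(n)) @ (A - sigma * np.eye(n)) + tau**2 * np.eye(n)
        return null_space(np.linalg.matrix_power(M, n))

    A = np.array([[1., 2., 0.],
                  [-2., 1., 0.],
                  [0., 0., 5.]])            # eigenvalues 1 +/- 2i and 5
    print(real_root_subspace(A, 1.0, 2.0).shape)   # (3, 2): two-dimensional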


Note that, if λ₁, . . . , λ_r are the distinct real eigenvalues of A (if any) and
σ₁ + iτ₁, . . . , σ_s + iτ_s are the distinct eigenvalues of A in the open upper half
of the complex plane (if any), then

    det(λI − A) = (λ − λ₁)^{α₁} ⋯ (λ − λ_r)^{α_r} [(λ − σ₁)² + τ₁²]^{β₁} ⋯ [(λ − σ_s)² + τ_s²]^{β_s}

for some positive integers α₁, . . . , α_r, β₁, . . . , β_s. Using this observation, it
can be proved that there is a direct sum decomposition

    ℝⁿ = R_{λ₁}(A) ∔ ⋯ ∔ R_{λ_r}(A) ∔ R_{σ₁±iτ₁}(A) ∔ ⋯ ∔ R_{σ_s±iτ_s}(A)

(see the remark following the proof of Theorem 2.1.2). Moreover, we have:

Theorem 12.2.1
For every A-invariant subspace M the direct sum decomposition

    M = (M ∩ R_{λ₁}(A)) ∔ ⋯ ∔ (M ∩ R_{λ_r}(A)) ∔ (M ∩ R_{σ₁±iτ₁}(A)) ∔ ⋯ ∔ (M ∩ R_{σ_s±iτ_s}(A))

holds.

For the deeper study of properties of invariant subspaces, the real Jordan
form of a real transformation, to be described in the following theorem, is
most useful. As usual, J_k(λ) denotes the k × k Jordan block with eigenvalue
λ. Also, we introduce the 2l × 2l matrix

    J_l(μ ± iw) = [[K, I₂, 0, . . . , 0], [0, K, I₂, . . . , 0], . . . , [0, . . . , 0, K]],  K = [[μ, w], [−w, μ]]

where μ and w are real numbers with w ≠ 0, and I₂
represents the 2 × 2 identity matrix.

Theorem 12.2.2
For every transformation A: ℝⁿ → ℝⁿ there exists a basis in ℝⁿ in which A has
the following matrix form:

    diag[J_{k₁}(λ₁), . . . , J_{k_p}(λ_p), J_{l₁}(μ₁ ± iw₁), . . . , J_{l_q}(μ_q ± iw_q)]    (12.2.1)

where λ₁, . . . , λ_p and μ₁, . . . , μ_q are real numbers (not necessarily
distinct) and w₁, . . . , w_q are positive. In the representation (12.2.1) the
blocks J_{k_i}(λ_i) and J_{l_j}(μ_j ± iw_j) are uniquely determined by A up to permu-
tation.

The proof of Theorem 12.2.2 will be relegated to the next section.
The right-hand side of equality (12.2.1) is called a real Jordan form of A.
Clearly, λ₁, . . . , λ_p are the real eigenvalues of A, and μ₁ ± iw₁, . . . , μ_q ±
iw_q are the nonreal eigenvalues of A. Given λ₀ ∈ σ(A), λ₀ real, the partial
multiplicities and the algebraic and geometric multiplicity of A correspond-
ing to λ₀ are defined as in the complex case. For a nonreal eigenvalue μ + iw
of A, the partial multiplicities of A corresponding to μ + iw are, by
definition, the half-sizes l_j of the blocks J_{l_j}(μ_j ± iw_j) with μ_j = μ and w_j = ±w.
The number of partial multiplicities of A corresponding to μ + iw is the
geometric multiplicity of μ + iw, and the sum of the partial multiplicities is the
algebraic multiplicity of μ + iw.
By use of the real Jordan form, it is not difficult to prove the following
fact, which we need later.

Proposition 12.2.3
If n is odd, then every transformation A: ℝⁿ → ℝⁿ has an invariant subspace
of any dimension k with 0 ≤ k ≤ n.

Proof. Without loss of generality we can assume that A is given by an
n × n matrix in the real Jordan form. As n is odd, A has a real eigenvalue,
so that blocks J_{k_i}(λ_i) in the real Jordan form (12.2.1) of A are present.
Since the subspaces Span{e₁, . . . , e_j}, j = 1, . . . , k_i are J_{k_i}(λ_i) invariant,
and the subspaces Span{e₁, . . . , e_{2j}}, j = 1, . . . , l_i are J_{l_i}(μ_i ± iw_i) invariant,
we obtain the existence of A-invariant subspaces of any dimension k,
0 ≤ k ≤ n. □

Analogs of the results on spectral and irreducible invariant subspaces
proved in Chapter 2 can be stated and proved for transformations from ℝⁿ
to ℝⁿ. (As in the complex case we say that an A-invariant subspace M is
irreducible if M cannot be represented as a direct sum of two A-invariant
subspaces.) For example, see Theorem 12.2.4.

Theorem 12.2.4
Let A: ℝⁿ → ℝⁿ be a transformation. The following
statements are equivalent for an A-invariant subspace M:
(a) M is irreducible.
(b) Each A-invariant subspace contained in M is irreducible.
(c) The Jordan form of the restriction A|_M is either J_n(λ), λ ∈ ℝ, or (in
case n is even) J_{n/2}(μ ± iw), μ, w ∈ ℝ, w ≠ 0.

(d) There is either a unique eigenvector (up to multiplication by a nonzero real
number) of A in M or (in case A|_M has no eigenvectors) a unique
two-dimensional A-invariant subspace in M.
(e) The lattice of A-invariant subspaces contained in M is a chain.
(f) The spectrum of A|_M is either a singleton {λ₀}, λ₀ ∈ ℝ, or a pair of
nonreal eigenvalues {μ + iw, μ − iw}, and

    dim Ker(λ₀I − A|_M) = 1

in the former case and

    dim Ker[(A|_M − μI)² + w²I] = 2

in the latter case.

The real Jordan form can be used instead of the (complex) Jordan form
to produce results for real transformations analogous to those presented in
Chapters 3 and 4 (with the exception of Proposition 3.1.4). For this purpose
we say that a transformation A: ℝⁿ → ℝⁿ is diagonable if its real Jordan form
has only 1 × 1 blocks J₁(λ_j), λ₁, . . . , λ_p ∈ ℝ, or 2 × 2 blocks J₁(μ_j ± iw_j),
j = 1, . . . , q. Also, we use the fact that the Jordan form of the transfor-
mation A: ℝⁿ → ℝⁿ with the real Jordan form (12.2.1) is

    diag[J_{k₁}(λ₁), . . . , J_{k_p}(λ_p), J_{l₁}(μ₁ + iw₁), J_{l₁}(μ₁ − iw₁), . . . , J_{l_q}(μ_q + iw_q), J_{l_q}(μ_q − iw_q)]

12.3 COMPLEXIFICATION AND PROOF OF THE REAL JORDAN FORM

We describe here a standard method for constructing a transformation
ℂⁿ → ℂⁿ from a given transformation ℝⁿ → ℝⁿ with similar spectral proper-
ties. In many cases this method allows us to obtain results on real transfor-
mations from the corresponding results on complex transformations. In
particular, it is used in the proof of Theorem 12.2.2.
Let A: ℝⁿ → ℝⁿ be a transformation. Define the complexification
A^c: ℂⁿ → ℂⁿ of A as follows: A^c(x + iy) = Ax + iAy, where x, y ∈ ℝⁿ. Obvi-
ously, A^c is a linear transformation. If A is given by an n × n matrix in some
basis in ℝⁿ, then this same basis may be considered as a basis in ℂⁿ and A^c is
given by the same matrix. It is clear from this observation that the
eigenvalues and the corresponding partial multiplicities of A and of A^c are
the same.
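For instance, the conjugate symmetry of the spectrum noted in Section 12.1 is
immediate for the complexification, since A^c is given by the same real matrix
(a small numpy check, ours rather than the book's):

    import numpy as np

    A = np.random.rand(5, 5)          # a real transformation, viewed as A^c on C^5
    ev = np.linalg.eigvals(A)
    # The spectrum is symmetric with respect to the real axis:
    print(np.allclose(np.sort_complex(ev), np.sort_complex(ev.conj())))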
Let M be a subspace in ℝⁿ. Then M + iM = {x + iy | x, y ∈ M} is a
subspace in ℂⁿ. Moreover, if M is A invariant, then M + iM is easily seen to
be A^c invariant.
We need the following basic connection between the invariant subspaces
of a real transformation and the invariant subspaces of its complexification.

Theorem 12.3.1
Assume that the transformation A: ℝⁿ → ℝⁿ does not have real eigenvalues.
Let R₊ ⊆ ℂⁿ be the spectral subspace of A^c corresponding to the eigenvalues
in the open upper half plane. Then for every A-invariant subspace L (⊆ ℝⁿ)
the subspace (L + iL) ∩ R₊ is A^c invariant and contained in R₊. Converse-
ly, for every A^c-invariant subspace M ⊆ R₊ there exists a unique A-invariant
subspace L such that (L + iL) ∩ R₊ = M.

Proof. The direct statement of Theorem 12.3.1 has already been ob-
served. To prove the converse statement, let M ⊆ R₊ be an A^c-invariant
subspace. Fix a basis z₁, . . . , z_k in M, and write z_j = x_j + iy_j, j = 1, . . . , k,
where x_j, y_j ∈ ℝⁿ. Put L = Span{x₁, . . . , x_k, y₁, . . . , y_k} ⊆ ℝⁿ. Let us
check that L is A invariant. Indeed, for each j, A^c z_j is a linear combination
(with complex coefficients) of z₁, . . . , z_k, say

    A^c z_j = Σ_{p=1}^k α_p^{(j)} z_p    (12.3.1)

Letting α_p^{(j)} = β_p^{(j)} + iγ_p^{(j)}, where β_p^{(j)} and γ_p^{(j)} are real, use the definition of
A^c to rewrite (12.3.1) in the form

    Ax_j + iAy_j = Σ_{p=1}^k (β_p^{(j)} + iγ_p^{(j)})(x_p + iy_p)

After separation of real and imaginary parts, these equations clearly imply
that L is A invariant. Further, it is easily seen that

    L + iL = M + M̄

where M̄ = Span{z̄₁, . . . , z̄_k} and z̄_j = x_j − iy_j, j = 1, . . . , k. Equality (12.3.1) implies that the subspace
M̄ = Span{z̄₁, . . . , z̄_k} is A^c invariant and

This statement is easily verified by letting z₁, . . . , z_k be a Jordan basis for
A^c|_M, for example. As M ⊆ R₊, we have M̄ ⊆ R₋, where R₋ is the spectral
subspace of A^c corresponding to the eigenvalues in the open lower half
plane. Now

Hence

    (L + iL) ∩ R₊ = (M + M̄) ∩ R₊ = M

It remains to prove the uniqueness of L. Let L′ be another A-invariant
subspace such that

    (L′ + iL′) ∩ R₊ = M    (12.3.2)

For a given subspace N ⊆ ℂⁿ, define its complex conjugate:

    N̄ = { x − iy : x + iy ∈ N, x, y ∈ ℝⁿ }

Obviously, N̄ is also a subspace in ℂⁿ. We have (L′ + iL′)‾ = L′ + iL′. Also,
it is easy to check (e.g., by taking complex conjugates of a Jordan basis in
R₊ for A^c|_{R₊}) that R̄₊ = R₋. Taking complex conjugates in (12.3.2), we
have

    (L′ + iL′) ∩ R₋ = M̄

and

    L′ + iL′ = M + M̄ = L + iL

As L + iL = {x + iy | x, y ∈ L}, and similarly for L′ + iL′, the equality of
L′ and L follows. □

The proof shows that Theorem 12.3.1 remains valid if the subspace R₊ is
replaced by the spectral subspace of A^c corresponding to any set S of
eigenvalues of A^c such that λ₀ ∈ S implies λ̄₀ ∉ S and S is maximal with
respect to this property.
We pass now to the proof of Theorem 12.2.2. First, let us observe that in
terms of matrices Theorem 12.2.2 can be restated as follows.

Theorem 12.3.2
Given an n × n matrix A whose entries are real numbers, there exists an
invertible n × n matrix S with real entries such that

    S⁻¹AS = diag[J_{k₁}(λ₁), . . . , J_{k_p}(λ_p), J_{l₁}(μ₁ ± iw₁), . . . , J_{l_q}(μ_q ± iw_q)]    (12.3.3)

where λ_i, μ_j, and w_j are as in Theorem 12.2.2. The right-hand side of
(12.3.3) is uniquely determined by A up to permutations of blocks J_{k_i}(λ_i) and
J_{l_j}(μ_j ± iw_j).
We now prove the result in the latter form. The Jordan form for
transformations from <p" into (p" is used in the proof.

Proof. Let A^c be the complexification of A. Let R_{λ₀}(A^c) ⊆ ℂⁿ be the
root subspace of A^c corresponding to a real eigenvalue λ₀. As the matrices
(A^c − λ₀I)^l, l = 0, 1, 2, . . . have real entries, there exists a basis in each
subspace Ker(A^c − λ₀I)^l ⊆ ℂⁿ that consists of n-dimensional vectors with
real coordinates. (Here, we use the fact that vectors x₁, . . . , x_k ∈ ℝⁿ
are linearly independent over ℝ if and only if they are linearly indepen-
dent over ℂ.) Further, if m is such that Ker(A^c − λ₀I)^m = R_{λ₀}(A^c) but
Ker(A^c − λ₀I)^{m−1} ≠ R_{λ₀}(A^c), then, by using the same fact, we see that there
is a basis in R_{λ₀}(A^c) modulo Ker(A^c − λ₀I)^{m−1} consisting of real vectors. We
can now repeat the arguments from the proof of the Jordan form to show
that there exists a basis in R_{λ₀}(A^c) consisting of Jordan
chains of A^c (in short, a Jordan basis) with real coordinates.
Further, let x_{i1}, . . . , x_{i,m_i}; i = 1, . . . , p be a Jordan basis in R_{λ₀}(A^c),
where λ₀ is a nonreal eigenvalue of A^c (so for each i the vectors
x_{i1}, . . . , x_{i,m_i} form a Jordan chain of A^c corresponding to λ₀). By taking
complex conjugates in the equalities

    A^c x_{ij} = λ₀x_{ij} + x_{i,j−1},  j = 1, . . . , m_i

(by definition, x_{i0} = 0) and using the fact that A^c is given by a real matrix in
the standard basis, we see that

    x̄_{i1}, . . . , x̄_{i,m_i};  i = 1, . . . , p    (12.3.4)

are Jordan chains of A^c corresponding to λ̄₀. The vectors (12.3.4)
inherit linear independence from the vectors x_{ij}. Further, dim R_{λ̄₀}(A^c) =
dim R_{λ₀}(A^c) (because the algebraic multiplicities of A^c at λ₀ and at λ̄₀ are
the same); hence the vectors (12.3.4) form a basis in R_{λ̄₀}(A^c).
Putting together Jordan bases for each R_{λ₀}(A^c), where λ₀ ∈ ℝ ∩ σ(A^c),
which consist of vectors with real coordinates, and Jordan bases for each
pair of subspaces R_{λ₀}(A^c) and R_{λ̄₀}(A^c) (where λ₀ is a nonreal eigenvalue of
A^c) that are obtained from each other by complex conjugation, we obtain
the following equality:

Here A,, . . . , A p are real numbers, A p + I , . . . , \p+q are nonreal numbers


(which can be assumed to have positive imaginary parts), and R is an
invertible n x n matrix that, when partitioned according to the sizes of
Jordan blocks in the right-hand side of (12.3.5), say

has the property that are real and

aand consder the

One checks easily that U- is unitary, that is, UjU* = /, and that

and fjij and wy are the real and imaginary parts of A p + / , repectively (see the
paragraph preceding Theorem 12.2.2 for the definition of .//(/A,, vv ; )). Also,
it is easily seen that the matrix

has real entries. Multiplying (12.3.5) from the right by

and denoting the real invertible matrix RU by Q, we have

and formula (12.3.3) follows.


The uniqueness of the right-hand side of (12.3.3) follows from the
uniqueness of the Jordan form of A^c. [Indeed, the right-hand side of (12.3.3)
is uniquely determined by the eigenvalues and partial multiplicities of A^c.] □

12.4 COMMUTING MATRICES

Let A be an n × n matrix with real entries. In this section we study the


general form of real matrices that commute with A. This result is applied in
the next section to characterize the lattice of hyperinvariant subspaces of a
real transformation.
In view of Theorem 12.2.2, we can assume that

    A = diag[J₁, . . . , J_u]    (12.4.1)

where each J_α is either a Jordan block of size m_α × m_α with real eigenvalue
λ_α, or J_α = J_{m_α/2}(μ_α ± iw_α) (in the notation introduced before Theorem
12.2.2). Let Z be a real matrix such that AZ = ZA. Partition Z according to
(12.4.1): Z = [Z_{αβ}]^u_{α,β=1}, where Z_{αβ} is an m_α × m_β real matrix. Then we
have

    J_α Z_{αβ} = Z_{αβ} J_β    (12.4.2)

If σ(J_α) ∩ σ(J_β) = ∅, then equation (12.4.2) has only the trivial solution
Z_{αβ} = 0 (Corollary 9.1.2). Assume σ(J_α) = σ(J_β) = {λ₀}, where λ₀ is real.
Then, as in the proof of Theorem 9.1.1, Z_{αβ} is an upper triangular Toeplitz
matrix.
To study the case σ(J_α) = σ(J_β) = {μ₀ + iw₀, μ₀ − iw₀}, it is convenient
to first verify the following lemma.

Lemma 12.4.1
Let

    K = [[μ, w], [−w, μ]]

be a 2 × 2 matrix with real μ, w such that w ≠ 0. Then
the system of equations

    KA + C = AK,  KC = CK

for unknown 2 × 2 matrices A and C implies C = 0.

The lemma is verified by a direct computation after writing out the
entries of A and C explicitly.
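The "direct computation" can be delegated to a computer algebra system; here
is one way to do it (a sympy sketch, assuming nothing beyond the lemma's
statement):

    import sympy as sp

    mu, w = sp.symbols('mu w', real=True)
    K = sp.Matrix([[mu, w], [-w, mu]])
    A = sp.Matrix(2, 2, sp.symbols('a0:4'))
    C = sp.Matrix(2, 2, sp.symbols('c0:4'))

    # Solve K*A + C = A*K together with K*C = C*K (linear in the 8 unknowns).
    eqs = list(K * A + C - A * K) + list(K * C - C * K)
    sol = sp.solve(eqs, list(A) + list(C), dict=True)
    # Generically (w != 0) every c-entry is forced to vanish:
    print([{k: v for k, v in s.items() if str(k).startswith('c')} for s in sol])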
Now return to the case

    σ(J_α) = σ(J_β) = {μ₀ + iw₀, μ₀ − iw₀}

in equations (12.4.2), letting K = [[μ₀, w₀], [−w₀, μ₀]] and writing Z_{αβ} = [U_{ij}]
as a matrix of 2 × 2 blocks U_{ij}.

Comparing the block entries (m_α/2, 1) and then (m_α/2 − 1, 1) in this
equation, we obtain

By Lemma 12.4.1, U_{m_α/2,1} = 0. Now compare the block entries in positions
(m_α/2 − 1, 1) and (m_α/2 − 2, 1), and reapplying Lemma 12.4.1, it follows
that U_{m_α/2−1,1} = 0. Continuing in this way, it is found that

    (12.4.4)

Equality (12.4.4) implies that for j = 1, . . . , p/2, KU_{jj} = U_{jj}K, and for

In view of Lemma 12.4.1, U₁₁ = U₂₂ = ⋯ = U_{p/2,p/2}; hence U_{j−1,j} commutes
with K for j = 2, . . . , p/2. Further, KU_{j−2,j} + U_{j−1,j} = U_{j−2,j−1} + U_{j−2,j}K
for j = 3, . . . , p/2. Using Lemma 12.4.1 again, U_{j−1,j} = U_{j−2,j−1} and
KU_{j−2,j} = U_{j−2,j}K. Continuing in this way, we find that U_{ij} (i ≤ j) depends
only on the difference between j and i and commutes with K. Because of the
latter property U_{ij} must have the form

    U_{ij} = [[a, b], [−b, a]]

for some real numbers a and b (which depend, of course, on i and j).
Putting all the above information together, we arrive at the following
description of all real matrices that commute with a given real n × n matrix
A.

Theorem 12.4.2
Let A be an n × n matrix with the real Jordan form diag[J₁, . . . , J_u], so

    SAS⁻¹ = diag[J₁, . . . , J_u]

for some invertible real n × n matrix S, where each J_α is either a Jordan block
of size m_α × m_α with real eigenvalue or a matrix of type

    J_{m_α/2}(μ_α ± iw_α)

with real μ_α, w_α and w_α > 0. Then every real n × n matrix X that commutes
with A has the form X = S⁻¹ZS, where the matrix Z = [Z_{αβ}]^u_{α,β=1}, partitioned
conformally with the Jordan form diag[J₁, . . . , J_u], has the following struc-
ture. If σ(J_α) ∩ σ(J_β) = ∅, then Z_{αβ} = 0. If σ(J_α) = σ(J_β) = {λ₀}, λ₀ real, then
Z_{αβ} is a real upper triangular Toeplitz matrix. If

    σ(J_α) = σ(J_β) = {μ + iw, μ − iw}

where μ and w > 0 are real, then again Z_{αβ} is a block upper triangular Toeplitz
matrix, and in this case

where the 2 × 2 blocks of Z_{αβ} have the form

    [[u, v], [−v, u]]

for some real numbers u and v.
12.5 HYPERINVARIANT SUBSPACES

Let A: ℝⁿ → ℝⁿ be a transformation. A subspace M ⊆ ℝⁿ is called A
hyperinvariant if M is invariant for every transformation X: ℝⁿ → ℝⁿ that
commutes with A. It is easily seen that the set of all A-hyperinvariant
subspaces is a lattice. In this section we obtain another characterization of
this lattice, one that is analogous to Theorem 9.4.2. The description of
commuting matrices obtained in Theorem 12.4.2 is used in the proof.
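Hyperinvariance can be tested mechanically: compute a basis of the commutant
{X : AX = XA} from the null space of the Sylvester operator, then check
invariance under each basis element. A sketch (Python with numpy/scipy; the
function names are ours):

    import numpy as np
    from scipy.linalg import null_space

    def commutant_basis(A):
        # vec(AX - XA) = (I kron A - A^T kron I) vec(X), column-major vec
        n = A.shape[0]
        K = np.kron(np.eye(n), A) - np.kron(A.T, np.eye(n))
        return [v.reshape(n, n, order='F') for v in null_space(K).T]

    def is_hyperinvariant(A, B):
        # Is the column span of B invariant under every X commuting with A?
        P = B @ np.linalg.pinv(B)
        return all(np.allclose(P @ (X @ B), X @ B) for X in commutant_basis(A))

    A = np.diag([1., 1., 2.])
    print(is_hyperinvariant(A, np.eye(3)[:, :2]))  # True:  Ker(A - I)
    print(is_hyperinvariant(A, np.eye(3)[:, :1]))  # False: invariant, not hyper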

Theorem 12.5.1
Let a transformation A: ℝⁿ → ℝⁿ have the minimal polynomial

    Π_{j=1}^k (λ − λ_j)^{r_j} Π_{j=1}^m [(λ − μ_j)² + w_j²]^{s_j}

where λ_j, μ_j, and w_j are real and w_j > 0; λ₁, . . . , λ_k are distinct, and so are
μ₁ + iw₁, . . . , μ_m + iw_m. Then the lattice of all A-hyperinvariant subspaces
coincides with the smallest lattice Γ_A of subspaces in ℝⁿ that contains
Ker(A − λ_jI)^p, Im(A − λ_jI)^p for p = 1, . . . , r_j; j = 1, . . . , k, and Ker[(A −
μ_jI)² + w_j²I]^p, Im[(A − μ_jI)² + w_j²I]^p for p = 1, . . . , s_j; j = 1, . . . , m.

We consider first a particular case of Theorem 12.5.1, when the spec-
trum of A consists only of one pair of nonreal eigenvalues μ + iw, μ − iw
(μ, w ∈ ℝ, w ≠ 0). Let

    f₁^{(1)}, . . . , f_{2p₁}^{(1)}; . . . ; f₁^{(m)}, . . . , f_{2p_m}^{(m)}    (12.5.1)

be a Jordan basis in ℝⁿ, where p₁ ≥ ⋯ ≥ p_m, so that, in this basis, A is
represented by the matrix

    diag[J_{p₁}(μ ± iw), . . . , J_{p_m}(μ ± iw)]
The following lemma is an analog of Lemma 9.5.2.

Lemma 12.5.2
Every A-hyperinvariant subspace is of the form

    H^1_{q₁} + H^2_{q₂} + ⋯ + H^m_{q_m},  where H^r_q = Span{f₁^{(r)}, . . . , f_{2q}^{(r)}}    (12.5.2)

and q₁ ≥ ⋯ ≥ q_m is a nonincreasing sequence of nonnegative integers such
that p₁ − q₁ ≥ ⋯ ≥ p_m − q_m ≥ 0.

If q_i = 0 for some i, then, of course, H^i_{q_i} is interpreted as the zero
subspace. We see later that, conversely, every subspace of the form (12.5.2)
is A hyperinvariant.

Proof. Let L be a nonzero A-hyperinvariant subspace, and let x ∈ L.
Write x as a linear combination of the vectors (12.5.1):

    x = Σ_{r=1}^m Σ_{i=1}^{2p_r} ξ_i^{(r)} f_i^{(r)}

We claim that each vector y_r = Σ_{i=1}^{2p_r} ξ_i^{(r)} f_i^{(r)} belongs to L. Indeed, let P_r be
the projector on H^r_{p_r} defined by P_r f_i^{(s)} = 0 for s ≠ r and P_r f_i^{(r)} = f_i^{(r)} for
i = 1, . . . , 2p_r. It follows from Theorem 12.4.2 that P_r commutes with A.
Hence L is P_r invariant, and y_r = P_r x ∈ L.

Fix an integer r between 1 and m and denote by α the maximal index i of
a nonzero coefficient ξ_i^{(r)} (i = 1, . . . , 2p_r). Without loss of generality, we
can assume that α = 2β is even (otherwise consider Ax in place of x). Let us
show that all the vectors f₁^{(r)}, . . . , f_α^{(r)} belong to L. Indeed, the vectors
z₁ and z₂ = Az₁, where

belong to L and also to Span{f₁^{(r)}, f₂^{(r)}}. Now z₁ and z₂ are not
collinear; otherwise, A would have a real eigenvalue, and this is impossible.
It follows that Span{z₁, z₂} = Span{f₁^{(r)}, f₂^{(r)}}, and hence f₁^{(r)}, f₂^{(r)} ∈ L. If
we already know that f₁^{(r)}, . . . , f_{2i−2}^{(r)} ∈ L for some i ≥ 2, then by a similar
argument using the vectors z_{2i−1} and z_{2i} = Az_{2i−1}, where

we find that f_{2i−1}^{(r)}, f_{2i}^{(r)} ∈ L. For i = β we have f₁^{(r)}, . . . , f_α^{(r)} ∈ L.
As the vector x ∈ L was arbitrary, it follows that L = H^1_{q₁} + ⋯ + H^m_{q_m}
for some integers q_i such that 0 ≤ q_i ≤ p_i, i = 1, . . . , m. To prove that
q₁ ≥ ⋯ ≥ q_m, we must show that H^r_α ⊆ L implies H^{r−1}_α ⊆ L. Consider the
transformation B: ℝⁿ → ℝⁿ that, in the basis (12.5.1), has the block matrix
form B = [X_{ij}]^m_{i,j=1}, where X_{ij} is the 2p_i × 2p_j zero matrix (i, j = 1, . . . , m),
except for

Theorem 12.4.2 ensures that B commutes with A. Hence L is B invariant,
and f_i^{(r−1)} = Bf_i^{(r)} ∈ L, i = 1, . . . , 2α. In other words, H^{r−1}_α ⊆ L.
Further, consider the transformation C: ℝⁿ → ℝⁿ that, in the basis
(12.5.1), has the block matrix form C = [Y_{ij}]^m_{i,j=1}, where Y_{ij} is the 2p_i × 2p_j
zero matrix except for

Then by Theorem 12.4.2, C commutes with A, and assuming 2q_r >
2(p_r − p_{r+1}), we have

This implies 2q_r − 2(p_r − p_{r+1}) ≤ 2q_{r+1}, or p_r − q_r ≥ p_{r+1} − q_{r+1}. If q_r ≤
p_r − p_{r+1}, then the inequality p_r − q_r ≥ p_{r+1} − q_{r+1} is obvious. □

We are now in a position to prove Theorem 12.5.1 for the case σ(A) =
{μ + iw, μ − iw}. As in the proof of Theorem 9.4.2, one shows that every
subspace of the form Ker[(A − μI)² + w²I]^k or Im[(A − μI)² + w²I]^k is A
hyperinvariant. So we have only to show that every A-hyperinvariant
subspace L belongs to the lattice Γ_A. By Lemma 12.5.2

    L = H^1_{q₁} + ⋯ + H^m_{q_m}    (12.5.3)

for some sequence of integers q₁, . . . , q_m such that q₁ ≥ ⋯ ≥ q_m ≥ 0 and
p₁ − q₁ ≥ p₂ − q₂ ≥ ⋯ ≥ p_m − q_m ≥ 0. We prove that L ∈ Γ_A by induction
on q₁. Assume first q₁ = 1. Then L = H^1₁ + ⋯ + H^t₁ for some t ≤ m. As
p_t ≥ p_{t+1}, we have

Now assume that the inclusion L ∈ Γ_A is proved for q₁ = ν − 1, and let L be
a subspace of the form (12.5.3) with q₁ = ν. Let r and a be the maximal integers
for which q₁ = ⋯ = q_r and p_a − p_r + ν > 0. Consider the subspace

It is easily seen that

The inequalities p_i − q_i ≥ p_{i+1} − q_{i+1} imply that M ⊆ L. Further, the sub-
space

is A hyperinvariant, and since

the induction hypothesis ensures that N ∈ Γ_A. Finally, L = M + N belongs
to Γ_A as well.
We have proved Theorem 12.5.1 for the case when the spectrum of A
consists of exactly one pair of nonreal eigenvalues. As the proof shows, the
converse statement of Lemma 12.5.2 is also true: every subspace of the form
(12.5.2) is A hyperinvariant.

Proof of Theorem 12.5.1 (the general case). Again, it is easily seen that
each subspace Ker(A − λ_jI)^k, Im(A − λ_jI)^k, Ker[(A − μ_jI)² + w_j²I]^k,
Im[(A − μ_jI)² + w_j²I]^k is A hyperinvariant. So we must show that each
A-hyperinvariant subspace belongs to Γ_A. Let M be an A-hyperinvariant
subspace. By Theorem 12.2.1 we have

    M = (M ∩ R_{λ₁}(A)) ∔ ⋯ ∔ (M ∩ R_{λ_k}(A)) ∔ (M ∩ R_{μ₁±iw₁}(A)) ∔ ⋯ ∔ (M ∩ R_{μ_m±iw_m}(A))

Write A in the real Jordan form (as in Theorem 12.2.2) and use Theorem
12.4.2 to deduce that each intersection M ∩ R_{λ_j}(A) is A|_{R_{λ_j}(A)} hyperinvariant
and each M ∩ R_{μ_j±iw_j}(A) is A|_{R_{μ_j±iw_j}(A)} hyperinvariant (j = 1, . . . , m). With the use
of Theorem 9.4.2, it follows that M ∩ R_{λ_j}(A) belongs to the smallest lattice
that contains the subspaces

and

Similarly, by the part of the theorem already proved, we find that M ∩
R_{μ_j±iw_j}(A) belongs to the smallest lattice that contains the subspaces

It follows that M ∈ Γ_A, and Theorem 12.5.1 is proved completely. □

12.6 REAL TRANSFORMATIONS WITH THE SAME INVARIANT SUBSPACES

In this section we describe transformations B: ℝⁿ → ℝⁿ that have the
same invariant subspaces as a given transformation A: ℝⁿ → ℝⁿ. This de-
scription is a real analog of Theorem 10.2.1.
By Theorem 12.2.2, we can assume that, in a certain basis in ℝⁿ, A has
the matrix form

    A = diag[A₁, . . . , A_p, A′₁, . . . , A′_q]    (12.6.1)

where

    A_i = diag[J_{k_{i1}}(λ_i), . . . , J_{k_{i,m_i}}(λ_i)]

with different real numbers λ₁, . . . , λ_p, and

    A′_j = diag[J_{l_{j1}}(μ_j ± iw_j), . . . , J_{l_{j,n_j}}(μ_j ± iw_j)]

with different complex numbers μ₁ + iw₁, . . . , μ_q + iw_q in the open upper
half plane. We use the notation introduced in Section 12.2, and also assume
that k_{i1} ≥ ⋯ ≥ k_{i,m_i} and l_{j1} ≥ ⋯ ≥ l_{j,n_j}.
Now introduce the following notation (partly used in Section 10.2): given
real numbers a₀, . . . , a_{s−1}, denote by T_s(a₀, . . . , a_{s−1}) the s × s upper
triangular Toeplitz matrix

    T_s(a₀, . . . , a_{s−1}) = [[a₀, a₁, . . . , a_{s−1}], [0, a₀, . . . , a_{s−2}], . . . , [0, . . . , 0, a₀]]    (12.6.2)

Further, for positive integers s ≤ t let

    (12.6.3)

where F is a real (t − s) × (t − s) upper triangular matrix

    (12.6.4)

Similarly, if a₀, . . . , a_{s−1} are real 2 × 2 matrices, we
define the 2s × 2s upper triangular Toeplitz matrix T_s^{2×2}(a₀, . . . , a_{s−1}) by
the same formula (12.6.2). If, in addition, the real 2 × 2 matrices f_{jk}
(1 ≤ j < k ≤ t − s) are given, denote by U_t^{2×2}(a₀, . . . , a_{s−1}; F) the 2t × 2t
matrix given by (12.6.3) with F given by (12.6.4). By definition, for s = t we
have

and

We can now give a description of all transformations B: ℝⁿ → ℝⁿ with the
same invariant subspaces as A.

Theorem 12.6.1
Let the transformation A: ℝⁿ → ℝⁿ be given by (12.6.1), in some basis in ℝⁿ.
Then a transformation B: ℝⁿ → ℝⁿ has the same invariant subspaces as A if
and only if B has the following matrix form (in the same basis):

where

for some real numbers b₀^{(i)}, . . . , b_{k_{i1}−1}^{(i)} with b₁^{(i)} ≠ 0 and some (k_{i1} − k_{i2}) ×
(k_{i1} − k_{i2}) matrix F^{(i)};

for some 2 × 2 real blocks

with f₁^{(j)} ≠ 0 and det c₂^{(j)} ≠ 0, and some 2(l_{j1} − l_{j2}) × 2(l_{j1} − l_{j2}) real matrix
G^{(j)}. Moreover, the real numbers b₀^{(1)}, . . . , b₀^{(p)} are different, and the com-
plex numbers are different as well.

For the proof of Theorem 12.6.1, we refer the reader to Soltan (1974).
For the proof of Theorem 12.6.1, we refer the reader to Soltan (1974).

12.7 EXERCISES

12.1 Prove that the transformation of rotation through an angle φ:

    [[cos φ, −sin φ], [sin φ, cos φ]]: ℝ² → ℝ²

has no nontrivial invariant subspaces except when φ is an integer
multiple of π.

12.2 Give an example of a transformation A: ℝ²ⁿ → ℝ²ⁿ such that A has
no eigenvectors but A² has a basis of eigenvectors in ℝ²ⁿ.
12.3 Show that if A: ℝⁿ → ℝⁿ is such that A² has an eigenvector corre-
sponding to a nonnegative eigenvalue λ₀, then A has an eigenvector
as well.
12.4 Show that if A: ℝⁿ → ℝⁿ is a transformation with det A < 0, then A
has at least two distinct real eigenvalues.
12.5 Find the real Jordan form of the n × n matrix

Find all the invariant subspaces in ℝⁿ of this matrix.

12.6 Describe the real Jordan form and all invariant subspaces in ℝ³ of the
3 × 3 real circulant matrix

12.7 Find the real Jordan form of an n × n real circulant matrix

12.8 Find the real Jordan form and all invariant subspaces in ℝⁿ of the
real companion matrix

assuming that the polynomial

has n distinct complex zeros.
12.9 What is the real Jordan form of a real n × n companion matrix?

12.10 Find the real Jordan form and all invariant subspaces in ℝⁿ of the
matrix

where
12.11 Two linear matrix polynomials λA₁ + B₁ and λA₂ + B₂ with real
matrices A₁, B₁, A₂, and B₂ are called strictly equivalent (over ℝ) if
there exist invertible real matrices P and Q such that P(λA₁ +
B₁)Q = λA₂ + B₂. Prove the following result on the canonical form
for strict equivalence (over ℝ) (the real analog of Theorem
A.7.3). A real linear matrix polynomial λA + B is strictly equivalent
(over ℝ) to a real linear polynomial of the type

    (1)

where 0 is the p × q zero matrix; L_ε is the ε × (ε + 1) matrix

M_ε is the transpose of L_ε; λ₁, . . . , λ_u are real numbers;

with

and μ_j, ω_j are real numbers with ω_j > 0 for
j = 1, . . . , v. Moreover, the form (1) is uniquely determined by
λA + B up to permutations of blocks. (Hint: In the proof of
Theorem A.7.3 use the real Jordan form in place of the complex
Jordan form.)

12.12 Prove the following analog of the Brunovsky canonical form for real
transformations. Two pairs of transformations (A₁, B₁) and (A₂, B₂),
where A_i: ℝⁿ → ℝⁿ and B_i: ℝᵐ → ℝⁿ, are called block similar if there exist invert-
ible transformations M: ℝⁿ → ℝⁿ and N: ℝᵐ → ℝᵐ and a transfor-
mation F: ℝⁿ → ℝᵐ such that

Prove that every pair of transformations (A, B) is block
similar to a pair [A₀, B₀] of the following form (written as
matrices with respect to certain bases in ℝᵐ and ℝⁿ):

where J is a matrix in the real Jordan form; B₀ has all zero entries
except for the entries

and these exceptional entries are equal to 1. (Hint: Use Exercise
12.11 in the proof of the Brunovsky canonical form.)
12.13 Let (A, B), where A: ℝⁿ → ℝⁿ and B: ℝᵐ → ℝⁿ, be a full-range pair of transfor-
mations. Prove that given a sequence S = {λ₁, . . . , λ_n} of n (not
necessarily distinct) complex numbers such that λ₀ ∈ S implies λ̄₀ ∈ S
and λ₀ appears in S exactly as many times as λ̄₀, there exists a
transformation F: ℝⁿ → ℝᵐ such that λ₁, . . . , λ_n are the eigenvalues
of A + BF (counted with multiplicities). (Hint: Use Exercise 12.12.)
Notes to Part 2
Chapter 9. The first two sections contain standard material in linear
algebra [see, e.g., Gantmacher (1959)]. Theorem 9.3.1 is due to Laffey
(1978) and Guralnick (1979). The proof presented here follows Choi,
Laurie, and Radjavi (1981). Theorem 9.4.2 appears in Soltan (1976) and
Fillmore, Herrero, and Longstaff (1977). Our expositions of Theorem 9.4.2
and Section 9.6 follow the latter paper.
Chapter 10. The results and proofs of this chapter are from Soltan
(1973b).
Chapter 11. Theorem 11.2.1 is a well-known result (Burnside's
theorem). It may be found in books on general algebra [see, e.g., Jacobson
(1953)] but generally not in books on linear algebra. In the proof of
Theorem 11.2.1 we follow the exposition from Chapter 8 in Radjavi and
Rosenthal (1973). Other proofs are also available [see Jacobson (1953);
Halperin and Rosenthal (1980); E. Rosenthal (1984)]. Example 11.4.6 and
Theorem 11.4.4 are from Halmos (1971). In the proof of Theorem 11.5.1 we
are following Radjavi and Rosenthal (1973).
Chapter 12. The real Jordan form is a standard result, although not so
frequently included in books on linear algebra as the (complex) Jordan
form. The real Jordan form can be found in Lancaster and Tismenetsky
(1985), for instance. The proof of Theorem 12.5.1 is taken from Soltan (1981).

Part Three

Topological
Properties of
Invariant Subspaces
and Stability
There are a number of practical problems in which it is necessary to obtain
an invariant subspace of a transformation or a matrix by numerical methods.
In practice, numerical computation can be performed with only a finite
degree of precision and, in addition, the data for a problem will generally be
imprecise. In this situation, the best that we can hope to do is to obtain an
invariant subspace of a transformation that is close to the one we really have
in mind. However, simple examples show that although two transformations
may be close (in any reasonable sense), their invariant subspaces can be
completely different. This leads us to the problem of identifying all invariant
subspaces of a given transformation that are "stable" under small pertur-
bations of the transformation—that is, to identify those invariant subspaces
for which the perturbed transformation will have a "close" or "neighbour-
ing" invariant subspace, in an appropriate sense.
To develop these ideas, we must introduce a measure of distance between
subspaces and analyze further the structure of the invariant subspaces of a
given transformation. This is done in Part 3, together with descriptions of
stable invariant subspaces, using different notions of stability.
This machinery is then applied to the study of stability of divisors of
polynomial and rational matrix functions and other problems. The reader
whose interest is confined to the applications of Chapter 17 needs only to
study the material presented in Chapter 13, Section 14.3, and Chapter 15.

Chapter Thirteen

The Metric Space of Subspaces

This chapter is of an auxiliary character. We set forth the basic facts about
the topological properties of the set of subspaces in ℂⁿ. Observe that all the
results and proofs of this chapter hold for the set of subspaces in ℝⁿ as well.

13.1 THE GAP BETWEEN SUBSPACES

We consider ℂⁿ endowed with the standard scalar product. If
x = ⟨x₁, . . . , x_n⟩ and y = ⟨y₁, . . . , y_n⟩, then (x, y) = Σ_{i=1}^n x_i ȳ_i, and the cor-
responding norm is

    ‖x‖ = (x, x)^(1/2) = (Σ_{i=1}^n |x_i|²)^(1/2)

The norm of an n × n matrix A (or a transformation A: ℂⁿ → ℂⁿ) is defined
accordingly:

    ‖A‖ = max_{‖x‖=1} ‖Ax‖

Now we introduce a concept that serves as a measure of distance between
subspaces. The gap between subspaces L and M (in ℂⁿ) is defined as

    θ(L, M) = ‖P_L − P_M‖    (13.1.1)

where P_L and P_M are the orthogonal projectors on L and M, respectively. It
is clear from the definition that θ(L, M) is a metric in the set of all
subspaces in ℂⁿ; that is, θ(L, M) enjoys the following properties: (a)
θ(L, M) ≥ 0, and θ(L, M) = 0 if and only if L = M; (b) θ(L, M) =
θ(M, L); (c) θ(L, M) ≤ θ(L, N) + θ(N, M) (the triangle inequality).


Note also that θ(L, M) ≤ 1. [This property follows immediately from the
characterization given in condition (13.1.3).] It follows from (13.1.1) that

    θ(L, M) = θ(L⊥, M⊥)    (13.1.2)

where L⊥ and M⊥ denote orthogonal complements. Indeed,

    θ(L⊥, M⊥) = ‖P_{L⊥} − P_{M⊥}‖ = ‖(I − P_L) − (I − P_M)‖ = ‖P_M − P_L‖ = θ(L, M)
In the following paragraphs we denote by S_X the unit sphere in a subspace
X ⊆ ℂⁿ, that is, S_X = {x ∈ X : ‖x‖ = 1}. We also need the concept of the
distance d(x, Z) from x ∈ ℂⁿ to a set Z ⊆ ℂⁿ. This is defined by

    d(x, Z) = inf{‖x − z‖ : z ∈ Z}
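The definition translates directly into a few lines of code. In the sketch below
(Python with numpy; the helper names are ours) the gap between two lines in ℝ²
at angle φ comes out as sin φ, as one expects from (13.1.1):

    import numpy as np

    def orth_projector(B):
        Q, _ = np.linalg.qr(B)              # orthonormal basis of the column span
        return Q @ Q.conj().T

    def gap(L, M):
        # theta(L, M) = || P_L - P_M || in the spectral norm
        return np.linalg.norm(orth_projector(L) - orth_projector(M), 2)

    phi = 0.3
    L = np.array([[1.0], [0.0]])
    M = np.array([[np.cos(phi)], [np.sin(phi)]])
    print(gap(L, M), np.sin(phi))           # both approximately 0.29552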

Theorem 13.1.1
Let L, M be subspaces in ℂⁿ. Then

    θ(L, M) = max{ sup_{x ∈ S_L} d(x, M), sup_{x ∈ S_M} d(x, L) }    (13.1.3)

If exactly one of the subspaces L and M is the zero subspace, then the
right-hand side of (13.1.3) is interpreted as 1; if L = M = {0}, then the
right-hand side of (13.1.3) is interpreted as 0. If P₁ and P₂ are projectors with
Im P₁ = L and Im P₂ = M, not necessarily orthogonal, then

    θ(L, M) ≤ ‖P₁ − P₂‖    (13.1.4)

Proof. For every x ∈ S_L we have

    d(x, M) ≤ ‖x − P₂x‖ = ‖(P₁ − P₂)x‖ ≤ ‖P₁ − P₂‖

Therefore sup_{x ∈ S_L} d(x, M) ≤ ‖P₁ − P₂‖. Similarly, sup_{x ∈ S_M} d(x, L) ≤ ‖P₁ − P₂‖; so

    max{ sup_{x ∈ S_L} d(x, M), sup_{x ∈ S_M} d(x, L) } ≤ ‖P₁ − P₂‖    (13.1.5)

where the left-hand side will be denoted γ.
Observe that for the orthogonal projector P_M we have d(x, M) = ‖x − P_M x‖.
Consequently, for every x ∈ S_L we have

    ‖(I − P_M)x‖ ≤ γ    (13.1.6)

Now

Hence by (13.1.6)

On the other hand, using the relation

and the orthogonality of P_M, we obtain

Taking advantage of (13.1.6) and (13.1.7) we obtain

So

    ‖P_L − P_M‖ ≤ γ

Using (13.1.5) (with P₁ = P_L, P₂ = P_M), we obtain (13.1.3). The inequality
(13.1.4) follows now from (13.1.5). □

It is an important property of the metric θ(L, M) that, in a neighbour-
hood of every subspace L ⊆ ℂⁿ, all the subspaces have the same dimension
(equal to dim L). This is a consequence of the following theorem.

Theorem 13.1.2
If 0(£, M)<1, then dim $ = dim M.

Proof. The condition 8($, M)<\ implies that £ n M ^ = {0} and


. Indeed, suppose the contrary, and assume, for instance, that
Then d(x,m)=1,and by (13.1.3)
a contradiction. Now l O } implies that dim 3
dim M, and = {0} implies that dim ^ > dim M.

It also follows directly from this proof that the hypothesis 0(3?, M)<\
implies . In addition, we have
hh The Metric Space of Subspaces

For example, to see the first of these observe that for any x E. M there is the
unique decomposition x = y + z, M x = PMy so.
that M C PM(*£). But the reverse inclusion is obvious, and so we must have
equality.
The following result makes precise the idea that direct sum decompo-
sitions of <p" are stable under small perturbations of the subspaces, as
measured in the gap metric.

Theorem 13.1.3
Let M, M j C <p" be subspaces such that

If jV is a subspace in <p" such that Q(M, jV) is sufficiently small, then

and

where PM(Pjf) projects <p" onto M (onto N) along Jll and C is a constant
depending on M and M , but not on Ji. In fact

Proof. Let us prove first that the sum Ji + M\ is indeed direct. The
condition that M + M{ — (p" is a direct sum implies that \\x — _ y | | > 6 > 0 for
every and every Here 8 is a fixed positive constant. Take Ji
so close to M that 0 ( M , J f ) < d / 2 . Then ||z-^||<6/2 for every zG5, v ,
where y = y(z) is the orthogonal projection of z on M. Thus for x E SM and
we have

so ^n^,-{0}. By Theorem 13.1.2 di so


dimensional considerations tell us that M + Ml = <p" for 6(M, jV) < 1, and
equation (13.1.8) follows.
To establish the right-hand inequality in (13.1.9) two preliminary remarks
are needed. First note that for any xE.M, and yEMl we have x =
PM(x + y) so that

It is claimed that, for 6(M, N) small enough


The Gap Between Subspaces 391

for all and


Without loss of generality, assume \\z\\ =
— 1.
\. Suppose that
and let x£M. Then,
Then,using
using(13.1.10),
(13.1.10),we
weobtain
obtain

But then x - (x - z) + z implies ||*|| > 1 - S, and so

and, for S small enough, (13.1.11) is established.


The second remark is that, for any x E. <£"

for some constant C0. To establish (13.1.12), it is sufficient to consider the


case that xE-Jt^ and ||jt|| = 1. But then, obviously, we can take

Now for any x E. 5 V , by use of (13.1.12) and (13.1.3), we obtain

Then, if and it follows that

and the last inequality follows from (13.1.11). This completes the proof of
the theorem. D

We remark that the definition and analysis of the gap between subspaces
presented in this section extends verbatim to a finite-dimensional vector
space V over <p (or over J|?) on which a scalar product is defined. Namely,
there exists a complex-valued (or real-valued) function defined on all the
ordered pairs, x, y, where x, y E. V, denoted by (x, y), which satisfies the
following properties: for enery
x, y, z and every
(c) (x, x) > 0 for all x E. V; and (x, x) = 0 if and only if x = 0.
392 The Metric Space of Subspaces

13.2 THE MINIMAL ANGLE AND THE SPHERICAL GAP

There are notions of the "minimal angle" and the "spherical gap" between
two subspaces that are closely related to the gap between the subspaces. The
basic facts about these notions are exposed in this and the next sections. It
should be noted, however, that these notions and their properties are used
(apart from Sections 13.2 and 13.3) only in Section 13.8 and in the proof of
Theorem 15.2.1.
Given two subspaces 3?, M C (p", the minimal angle <pmin(&, M) (0<
<pmin(J3?, M}< IT/2) between Jz? and M is determined by

The minimal angle can also be defined by the equality

Indeed, writing

for any we have

Now for ||jc|| = ||y|| = 1

and writing /? = u + iv, where u and v are real, we see easily that the
function /(M, v) — 1 + ($(x, y) + /8(y, jc) + |^3|2 of two real variables u and v
has its minimum for u = —^((x, y) + (y, x)) and v = \(i(y, x) - i(x, y)),
that is, when )3 = -(y, *). Thus
The Minimal Angle and the Spherical Gap 393

Denote by a and b the right-hand sides of equations (13.2.2) and (13.2.1),


respectively. Then

In view of (13.2.3) the equality 1 - a2 = b2 follows, and this means that,


indeed, formulas (13.2.1) and (13.2.2) define the same angle <pmin(j£, M}
with 0<<p min (^, M)<Trl2.

Proposition 13.2.1
For two nontrivial subspaces // and only if

Proof. Obviously, if x £ £ fl M is a vector of norm 1, then

so Conversely, assume As the set


is closed and bounded, the
continuous function \\x + y\\ has a minimum in the set 4>, which in our case
is zero. In other words, ||jc() + yQ\\ = 0 for some x0 E £, y{}E.M, where at
least one of ||jt()|| and \\yQ\\ is equal to 1. But then, clearly, JCG G & n M ^
{0}. D

We also need the notion of the "spherical gap" between subspaces. For
nonzero subspaces =$?, M in (p" the spherical gap 0(3?, M ) is defined by

We also put «({0}, ^) = 0(%, {0}) - 1 for every nonzero subspace £ in <p"
and 6*({0}, {0}) = 0. The spherical gap is also a metric in the set of all
subspaces in <p". Indeed, the only nontrivial statement that we have to verify
for this purpose is the triangle inequality:

for all subspaces ^, M , and N in <p". If at least one of j£, M , and Ji is the
zero subspace, (13.2.4) is evident (observe that 0(j£, M)^2 for all sub-
spaces ££, M C <p"). So we can assume that £, M and jVare nonzero. Given
x G 5^., let zx e 5^ be such that ||* - zx\\ = d(x, SM). Then for every y E 5^,
we have
a94 The Metric Space of nSubspaces

and taking the infinum with respect to y, it follows that

It remains to take the supremum over x E S^ and repeat the argument with
the roles of JV and !£ interchanged, in order to verify (13.2.4).
In fact, the spherical gap 0 is not far away from the gap 6 in the following
sense:

The left inequality here follows from (13.1.3). To prove the right inequality
in (13.2.5) it is sufficient to check that for every x E <p" with ||jc|| = 1 we
have

where £ C <p" is a subspace. Let y = P^x, where Py is the orthogonal


projector on !£. If y — 0, then jtl j£, and for every 2 G 5^ we have

So (13.2.6) follows. If ^7^0, then, in the two-dimensional real subspace


spanned by x and y, there is an acute angle between the vectors x and y.
Consider the isosceles triangle with sides ;c, y / H . y l l and enclosing this acute
angle. In this triangle the angle between the sides .y/H.yll and x — y/\\y\\ is
greater than Tr/4. Consequently

and (13.2.6) follows again.

Proposition 13.2.2
For any three subspaces £, M, N C <p",

Proof. Letet a aand be arbitrary vectors satisfying


max{||_y 1 ||, ||_y 3 ||} = 1. Letting e be any fixed positive number, choose
y2 e M such that || y2 \\ = \\y3\\ and
The Minimal Angle and the Spherical Gap 395

Indeed, if y3 = 0, choose y2 - 0; if y3 7^0, then the definition of d(M, jV)


allows us to choose a suitable y2. Now

As e > 0 was arbitrary, the inequality (13.2.7) follows. D

The angle between subspaces allows us to give a qualitative description of


the result of Theorem 13.1.3.

Theorem 13.2.3
Let M, Jf be subspaces in <p" such that M Pi JV = {0}. Then for every pair of
subspaces M , , JV, C <J7" such that

we have ^ , n ^ V , = {0}. //, in addition, M+N=<£", then every pair


of subspaces Ml,Jil satisfying (13.2.8) has the additional property that

Proof. In view of Proposition 13.2.2 we have

and

Adding these inequalities, and using (13.2.8) and Proposition 13.2.1, we


find that Ml n ^, = {0}.
Assume now that, in addition, M + N= <p". Suppose first that M = M } .
Let e > 0 be so small that

If M + .yV, ^ <p", then there exists a vector jc £ <p" with ||A:|| = 1 and
for all y E M + ^ [e.g., one can take x^(M+J^ iY\. We
can represent the vector x as x = y + z, y&M, ze^V. It follows from
the definition of sin q>min(M, jV) that

Indeed, denoting u = max{||y||, ||z||}, we have


a94 The Metric Space of nSubspaces

By the definition of 6(t£, M ) we can find a vector 2 , from .AT, with

The last inequality contradicts the choice of *, because z — z, = jc — t, where


t = y + zlEM + JV,, and ||*-f||<6.
Now consider the general case. Inequality (13.2.8) implies
and, in view of Propositionn 13.2.2,
he part of Theorem 13.2.3 already proved, weroved, we
obtain M + JV, = <p" and then M , -I- JV, = (p". D

13.3 MINIMAL OPENING AND ANGULAR


LINEAR TRANSFORMATION

In this section we study the properties of angular transformations in terms of


the minimal angle between subspaces.

Let MI, M2 be subspaces in <p". The number

is called the minimal opening between Ml and M2. So

where <pmtn(M}, M 2 ) is the minimal angle between M{ and M2. By conven-


tion, T/({0}, {0}) =00. If FI is any projector defined on <p", then

To see this, note that for each z £ <p"

We would like to mention also the following properties of the minimal


opening. If Q, and Q2 are nontrivial (i.e., different from 0 and /) orthogo-
nal projectors of (p" onto the subspaces M\ and M2, respectively, then

and
ainimal Opening and Angular Linear Tranformation 397

Indeed, these formulas follow from the equality

for every x G M,, and from

As a consequence of (13.3.2) we obtain the following connection between


the minimal opening and the distance from one subspace to another. For
two subspaces M\ and M2\n <p", put

[if M{ - {0}, then define p(M^, M2) = 0]. Then we have

whenever Ml ¥^ {0}. To see this, note that for M2 ^ {0}

where <2, is the orthogonal projector onto M } . But then we can use (13.3.2)
to obtain formula (13.3.3). If M2 = {0}, then (13.3.3) holds trivially.
We use the notion of the minimal opening between two subspaces to
describe the behaviour of angular transformations when the corresponding
projectors are allowed to change.
398 The Metric Space of Subspaces

Lemma 13.3.1
Let I10 be a projector defined on <p", and let O be another projector on <f""
such that Then, provided

we have the following estimate for the norm of the angular transformation R
of Im O with respect to I10:

Proof. (). Recall tt

For 0 we have

Taking the infimum over all and using inequality (3.1), one sees
that

Now recall thatt fofofor e Ass


, we see from (13.3.5) that

So, using (13.3.6), we obtain

It follows from (13.3.7) that for each


which proves the inequality (13.3.4). D

The following lemma will be useful.

Lemma 13.3.2
Let P and P x be projectors defined on <p" such that <p" = Im P 4- Im P x. Then
for any pair of projectors Q and Q* defined on
sufficiently small, we have <p" = Im Q + Im Q x, and there exists an
invertible transformation which maps Im Q on Im P, Im Q x on
x
Im P , and
Minimal Opening and Angular Linear Transformation 399

where the positive constant /3 depends on P and P* only.

Proof. Let and assume that the


projectors Q and Q* satisfy

As
tion (13.3.9) implies that

But then we may apply Theorem 13.2.3 combined with (13.2.5) to show that

Note that (13.3.9) implies that ||P-<2||<|. Hence S, =/ + /»- G is


invertible, and we can write As
/ — P-f- Q is invertible also, we have

Further

Let n o (Il) be the projector of <p" along Im P (Im Q) onto Im P x


( I m < 2 x ) and put n = S,IISj"1. Then O is again a projector, and by
(13.3.10) we have Ker fl = Ker O 0 . Further, Im fl = Im S^^S^ , and so we
have

Hence, if R denotes the angular transformation of Im fl with respect to Il(),


then because of equation (13.3.4) of Lemma 13.3.1, we obtain
400 The Metric Space of Subspaces

As this implies
that

Next, put S2- I - RH(}, and take 5 = 525,. Clearly, 52 is invertible; in


fact, S^1 = I + Rll0. It follows that 5 is invertible also. From the properties
of the angular transformation one easily sees that 5(Im Q} - Im P,
S ( I m < 2 x ) = ImP x .
To prove (13.3.8), we simplify our notation. Put
Q x ||, and let 77 = 7j(Im P, Im P x ). From S = (I- RU0)(I + P - Q) and th
fact that ||P - e|| < | , one deduces
For \\R\\ an upper bound is given by (13.3.11), and from (13.3.1) we know
that JlnJIsSTj' 1 . It follows that

Finally, we consider S~l. Recall that S^l = I + V with


- Hence

and (13.3.8) follows in view of (13.3.12).

13.4 THE METRIC SPACE OF SUBSPACESces

We have already seen in Section 13.1 that the set <p(<P") °f a'l subspaces in
<p" is a metric space with respect to the gap 6(2£, M ) . In this section we
investigate some topological properties of <£(<£"), that is, those properties
that depend on convergence (or divergence) in the sense of the gap metric.

Theorem 13.4.1
The metric space (£(£") is compact, and, therefore, complete (as a metric
space).

Recall that compactness of 4/(<p") means that for every sequence


Jjfj, j£2, . . . of subspaces in <j/(<p") there exists a converging subsequence
.2) , ^,2, . . . , that is, such that
The Metric Space of Subspaces 401

for some Completeness of means that every sequence


of subspaces J^, / = 1, 2, . . . , for which lim, is convergent.

Proof. In view of Theorem 13.1.2, the metric space ([/((p") is decom-


posed into components (p w , m = 0, . . . , n, where <$m is a closed and open
set in <j/(<P") consisting of all w-dimensional subspaces in <p".
Obviously, it is sufficient to prove the compactness of each (|7m. To this
end consider the set $m of all orthonormal systems u = {uk}™=l consisting of
m vectors ul, . . . , um in <p"-
For define

It is easily seen that d(u, v) is a metric in $m, thus turning $m into a metric
space. For each u - {uk}^l E $m define yl m M = Span{M t , . . . , um] E (j/m.
In this way we obtain a map ^4OT: $m —> ($m of metric spaces $m and (J/m .
We prove that the map Am is continuous. Indeed, let ^ E §m and let
v}, . . . ,vm be an orthonormal basis in j£ Pick some
(which is supposed to be in a neighbourhood of For u ( ,
/ = 1, . . . , m, we have (where M = Amu and P^ stands for the orthogonal
projector on the subspace N):

Now, since we find that \a\ < 1 and E" , |a,| < m, and
so

Fix some y E 5^±. We wish to evaluate P^y. For every x E j£, write

and

by (13.4.1). On the other hand, write


402 The Metric Space of Subspaces

then for every z E !£1

and

But

Combining (13.4.2) and (13.4.3), we find that \(t, PMy)\ <3m6(w, v) for
every re f" with ||/|| = 1. Thus

Now we can easily prove the continuity of Am. Pick an x £ < p " with
||jc|| = 1. Thus, using (13.4.1) and (13.4.4) we have

so

which obviously implies the continuity of Am.


It is easily seen that $m is compact. Indeed, this follows from the
compactness of the unit sphere {jtE<p"|||jc|| = l} in <p". Since
Am: $m—> 4/m is a continuous map onto (frm , the metric space <$m is compact
as well.
Finally, let us prove the completeness of (pm. Let ^ , «2*2 , . . . be a
Cauchy sequence in (p m , that is, 0(J£t, «^)—>0 as /, j—»°°. By compactness,
there exists a subsequence ^ such that lim A _^ 6(^ , =^) = 0 for some
£ £ $ m . But then it is easily seen that in fact .Sf = lim,-.,, ^,. D
Next we develop a useful characterization of limits in (J/((p").
Theorem 13.4.2
Let M}, M2, . . . be a sequence of m-dimensional subspaces in 4/(<P"), such
that 6(Mp, M)— »Oflsp-»°° for some subspace M C (J7". Then M consists of
exactly those vectors x £ <p" for which there exists a sequence of vectors
xp £ <p", /? = 1, 2, . . . such that
The Metric Space of Subspaces 403

Proof. Denoting by /% the orthogonal projector on the subspace N C


<p", for every x G M we have:

So xp = P M x has the properties that xpE:Jip and lim^^ xp = x.


Conversely, let xp G Mp, p = 1, 2, . . . be such that lim^,, xp = x. Then

(in the last inequality we have used the fact that the norm of an orthogonal
projector is 1); so PMx - x, and x E M.

Using Theorems 13.4.2 and 13.1.3, one obtains the following fact.

Theorem 13.4.3
Let ££ and M be direct complements to each other in <£", and let
{^m}», = i be sequences of subspaces such that

Then, denoting by P (resp. Pm) the projector on £ along M (resp. on !£m


along Mm} we have

Moreover, there exists a constant K>Q depending on 5€ and M only such that

for all sufficiently large m.

Observe that, in view of Theorem 13.1.3, the subspaces !£m and Mm are
direct complements to each other for sufficiently large m.

Proof. Let Pm^ be the projector on Mm along !£ and PM m be the


projector on M along 3?m (for sufficiently large m). By Theorem 13.1.3 we
have
a
The Metric Space of Subspaces

where

As usual, d(jt, N) = inf{||jic - _y|| | y G JV} is the distance between x G <p" and
a subset JV C (p". In particular, for m large enough we find that

When Theorem 13.1.3 is applied again, it follows that

Now use (13.4.6) and deduce (for sufficiently large m):

We finish the proof by showing that

for m sufficiently large.


Arguing by contradiction, assume that (13.4.7) does not hold. Then there
exists a subsequence {Mm }^ =] and vectors xm E Mm with norm 1 such that

As the sequence {xm }^ =1 of n-dimensional vectors is bounded, it has a


converging subsequence. So we can assume that xm —> x0 as k—>°°. Clearly,
||jc0|| = l, and by Theorem 13.4.2, x0E.M. In view of (13.4.8) for each
k — 1, 2, . . . , there is a vector yk G 56 such that

In particular, the sequence ( y k } k =l is bounded, and we can assume that


yk-*y0 as &—><», for some y0E&. Passing to the limit in (13.4.9) when k
tends to infinity, we obtain the inequality
The Metric Space of Subspaces 405

which is contradictory.

The proof of Theorem 13.4.3 shows that actually equation (13.4.5) holds
with

We conclude this section with the following simple observation.

Proposition 13.4.4
The set $m($n) of all m-dimensional subspaces in (p" is connected.
That is, for every M, N G (£„,(<£") there exists a continuous function
f : [0, 1]-* $m(() such that /(O) = M, /(I) = JV (and the continuity of f is
understood in the gap metric).

Proof. Using the proof of Theorem 13.4.1 and the notation introduced
there, we must show that the set $m is connected. As any orthonormal
system «,, . . . , um in <p" can be completed to an orthonormal basis in <p",
the connectedness of $m would follow from the connectedness of the group
£/(<p") of all n x n unitary matrices. To show that £/(<£") is connected,
observe that any X G £/(<p") has the form

where S is unitary and 0,, . . . , 0n are real numbers (see Section 1.9). So

is a continuous £/(<p")- valued function that connects / and X.

Similarly, one can prove that the set <$m($") of all m-dimensional
subspaces in Jj?" is connected. To this end use the facts that any orthonormal
systems u l 5 . . . , um in ^" (m < n) can be completed to an orthonormal basis
« ! , . . . , ! / „ with det[« 1 , u2, . . . , un] = 1 and that the set £/+(Jj?") of all
orthogonal n x n matrices with determinant 1 is connected. Recall that a
real n x n matrix U is called orthogonal if U TU = UU T = I.
For completeness, let us prove the connectedness of f/ + (J(f"). It follows
from Theorem 12.1.4 that any x £ t/ + (JJf") admits the representation

where S is orthogonal and each K- is either the scalar ± 1 or the 2 x 2 matrix


406 The Metric Space of Subspaces

f°r some 9, 0 ^ 0 ^ ZTT, which depends on y. As det


also det[£,, K2,. . . , Kp] = 1, which means that the number of
indices / such that Kf. = -1 is even. Since= wecanassume

that each Kj is either 1 or %, 6 = 0(j). Putting

where X;-(r) = Kt if /Cy = 1 and Ky(0 = ^/# if Kj = ^,, we obtain a f/ + (J|O-


valued continuous function that connects / and X.

13.5 KERNELS AND IMAGES OF LINEARTtr TRANSFORMATIONS

Important examples of subspaces in <p" are images of transformations into


<(7" and kernels of transformations from <p". We study here the behaviour of
these subspaces when the transformation is allowed to change. The main
result in this direction is the following theorem.

Theorem 13.5.1
Let X: <£"-* <p'" be a transformation, and let Px be a projector on Ker X.
Then there exists a constant /C>0, depending only on X and Px, with the
following property: for every transformation Y: <p"-^ (p™ w^tn dim Ker Y =
dim Ker X there exists a projector PY on Ker Y such that

In particular

Proof. It will suffice to prove (13.5.1) for all those y with dim Ker Y -
dim Ker X that are sufficiently close to X, that is, ||X — Y\\ < e, where e > 0
depends on X and Px only. Indeed, for Y with dim Ker Y - dim Ker X and
||A'— y|| > e, use the orthogonal projector PY on Y and the fact that
to obtain (13.5.1) (maybe with a
bigger constant K}.
Consider first the case when X is right invertible. There exists a right
inverse X1 of X such that Im X1 = Im(/ - Px) (cf. Theorem 1.5.5), and then
X'X= I - Px (indeed, both sides are projectors with the same kernel and
the same image). It is easy to verify that any transformation Y: <p"—^Cp" 1
with the property
Kernels and Images of Linear Transformations 407

is also right invertible and one of the right inverses Y1 is given by the
formula Y1 = ZX1 where

Indeed, we have

and hence

where the penultimate equality follows from (13.5.3), because

A similar argument shows that Z is invertible and


. Now p . We have

So (13.5.1) holds for every Y satisfying || Y- *|| < IH*'!!"1, with

Now consider the case when * is not right invertible, and let r be the
dimension of a complementary subspace N to Im * in <pm- Consider the
transformation

defined by X(x + y) = Xx + Ly ; x £ <p", y E <f r, where L: $r -^ jV is some


invertible transformation. As the image of * is the whole space <p™ the
transformation * is right invertible. Also Ker * = Ker *. Let P^ be a
projector on Ker * defined by P%(x + y) = Pxx\ x E <p", y E <pr. Applying
the part of Theorem 13.5.1 already proved to *, we find positive con-
stants e and K such that, for every transformation with
||*- Y|| < e, there exists a projector Py on Ker Y such that
408 The Metric Space of Subspaces

Note that the equality dim Ker Y = dim Ker X holds automatically for e
small enough because then such Y will also be right invertible (see the first
part of this proof). Apply (13.5.4) for Y of the form Y(x + y) = Yx + Ly;
x E <p", y G <f", where Y: <p"-> <pm is a transformation such that \\X - Y\\ <
€ and dim Ker Y = dim Ker X. Let us check that Ker Y C <p". Indeed

and since Ker Y C Ker Y, we have in fact Ker Y = Ker Y and thus Ker Y C
<f". Now put Px = Py| n , Py = P y j n to satisfy (13.5.1), for transformations

Finally, observe that (13.5.2) follows from (13.5.1) in view of Theorem

The condition dim Ker Y = dim Ker X is clearly necessary for the inequal-
ity (13.5.1), since otherwise we obtain a contradiction with Theorem 13.1.2
on taking a Y: f-* <£" such that
A result analogous to Theorem 13.5.1 also holds for the images of linear
transformations. The statement of this result is obtained from Theorem
13.5.1 by replacing Ker X and Ker Y by Im X and Im Y, respectively, and
its proof is reduced to Theorem 13.5.1 by observing that Im A = (Ker A*Y
for a linear transformation A and that 0(M, N) = 8{M \ ^V x ) for any
subspaces M, Ji C <p".

13.6 CONTINUOUS FAMILIES OF SUBSPACES

As before, we denote by <J7(<P") the set of all subspaces in <p" seen as a


metric space in the gap metric.
In this section we consider subspace-valued families ^(/) defined on some
fixed compact set K C Jf"", that is, for each t G K, %(t) is a subspace in <p".
The family 3?(t) will be called continuous (on K) if for every /0 G K and
every e > 0 there is S >0 such that ||/-f 0 ||<5, / e / C implies Q(!£(t},
j£(f0)) < e (the norm ||f - t0\\ is understood as the Euclidean norm, that is,
generated by the standard scalar product (x, y) = £™ = , xlyi for x =
( j t j , . . . , xm), y = (7,,. . . , ym) G $m). In other words, the continuity is
understood in the sense of the gap metric.
Examples of continuous families of subspaces are provided by the
following proposition.

Proposition 13.6.1
Let B(t) be a continuous m x n complex matrix function on K such that
rank B(t) = p is independent of t on K. Then Ker B(i) and Im B(t) are
continuous families of subspaces on K.
Continuous Families of Subspaces 409

Proof. Take f 0 E K. There exists a nonzero minor of size p x p of B(t0).


For simplicity of notation assume that this minor is in the upper left corner
of B(t0). By continuity the p x p minor in the upper left corner of B(t) is
also nonzero as long as t belongs to some neighbourhood U0 of tQ. So [here
we use the assumption that rank B(t) is independent of t] for t E UQ

where bf(t) is the /th column of Z?(f). Let b /y (f) be the (i, y)th entry in B(t)\
and let D(t) = [^(0];:^,:f=i; C(0 = [M')Ky-i- Then the matrix

is a continuous projector with Im P(t) = Im B(t). Hence P(t) is uniformly


continuous on Ulf where U} is a neighbourhood of t0 in K such that t/j C U0.
By Theorem 13.1.1 [inequality (13.1.4)] the orthogonal projector on Im B(t)
is also uniformly continuous on Ul.
The statement concerning Ker B(t) can be reduced to that already
considered because Ker B(t) is the orthogonal complement to lm(B(t))*
(note that B(t)* is continuous in t if B(t) is).

In particular, we obtain an important case.

Corollary 13.6.2
Let P(t) be a continuous projector-valued function on K. Then Im P(t) and
Ker P(t) are continuous families of subspaces on K.

We have to show that rank P(t) is constant if the projector function P(t)
is continuous. But this follows from inequality (13.1.4) and the fact that the
set of subspaces of fixed dimension is open in the set of all subspaces in <p"
(Theorem 13.1.2).
The following characterization of continuous families of subspaces is very
useful.

Theorem 13.6.3
Let ££(t) be a family of subspaces (of <p" ) on a connected compact subset K
of $m. Then the following properties are equivalent: (a) ££(f) is continuous;
(b) for each tE.K there exists an invertible transformation S(t): <p"—»<p"
which depends continuously on t for t E K, and there exists a subspace
M C <f" such that £(f) = S(t}M for all t E K; (c) for each t0£K there exist a
neighbourhood Ut of 10 in K, an invertible transformation St(t): <p"—>(p"
that depends continuously on t in Ut, and a subspace Mt C <p" such that
410 The Metric Space of Subspaces

We prove Theorem 13.6.3 only for the case K = [0, 1] (of course, the case
when K C $ is easily reduced to this one). The proof when K is a connected
compact set of jj?m requires mathematical tools that are beyond the scope of
this book [see Gohberg and Leiterer (1972) for the complete proof].

Proof. Assume that &(t) is continuous on


be points with the property that

Here PA is the orthogonal projector on the subspace JV C <p". For each


/ = 0, . . . , p - 1, the transformation £,(17), r, < 17 < f, + 1 , defined by 5. (77) =
/ - ( P M ( I ) - PM^)) maPs ^(f,-) °n *M(y)-> is invertible and •$,•('/) = /. Now
put

to satisfy (b).
Obviously, (b) implies (c). Finally, let us prove that (c) implies (a). Given
5, and M, as in (c), let P(} be the orthogonal projector on M, . Then
5, (t)P0(S, (0) ' is a projector on &(t)\ therefore, for t E U, we have

As 5,t (t)
\ / is continuous and invertible in U,r ,' its inverse is continuous as well,»
a ()

and the continuity of t£(t) follows from the preceding inequality.

Corollary 13.6.4
Let !£(t) be a continuous family of subspaces (of (p") on K, where K C jjf" is
a connected compact set. Then there exists a continuous basis *,(/), • • • , xp(t)
in !£(t), where p = dim ££(t). (Note that because of the connectedness of K the
dimension of ££(t) is independent of t on K.)

Indeed, use Theorem 13.6.3, (b) and put Xj(t) = S(t)Xj, j = 1, . . . , p,


where xl, . . . , xp is a basis in M.

Corollary 13.6.5
Let B(i) be a continuous m x n matrix function on a connected compact set
, such that rank B(t) = p is independent of t. Then there exists a
Applications to Generalized Inverses 411

continuous basis xl(?),..., xn_p(t) in Ker B(t) and a continuous basis

This corollary follows from Corollary 13.6.4, taking into account


Proposition 13.6.1.

13.7 APPLICATIONS TO GENERALIZEDD INVERSES

In this section we apply results of the preceding sections to study the


behaviour of a generalized inverse of a transformation when this transfor-
mation is allowed to change. Recall that a transformation B: (p"1—> < p is
called a generalized inverse of a transformation A: <p"—>• (p"1 if the equalities
BAB = B, ABA = A hold (see Section 1.5).
As an application of Theorem 13.5.1, we have the following result
concerning close generalized inverses for close linear transformations.

Theorem 13.7.1
Let X: <p" —> <J7m be a transformation with a generalized inverse X1: (pm —> <p".
Then there exist constants K>0 and e > 0 with the property that every
transformation Y: <p"-» fm with \\Y-X\\<c and dim Ker Y = dim Ker X
has a generalized inverse Y1 satisfying

Proof. By Theorem 1.5.5, the generalized inverse X1 is determined by a


direct complement N to Ker X in <p" and by a direct complement M to Im X
in <f"", as follows:

where Px is the projector on Im A' along M, and X{ : N— >lm X is the


invertible transformation defined by X^x — Xx, XE.N. Denote by 3£(X) the
set of all transformations Y: <p" -» (pm such that dim Ker Y = dim Ker X.
Using Theorem 13.1.3 and inequality (13.5.2), choose € j > 0 in such a
way that JV is a direct complement to Ker Y for every Y G 3f{(X) with
HA'- Y||<e!. Using the analog of Theorem 13.5.1 for images of linear
transformations, we find a projector PY on Im Y such that

for every Ye 3£(X). Here the constant K} depends on X and Px only.


Our next observation is that, by Lemma 13.3.2 and (13.7.2), there exists
412 The Metric Space of Subspaces

a positive number e2 ^ e, such that for any Y E 3%(X) with \\X - Y\\ < e2 we
can find an invertible transformation S
and

where the positive constant K2 depends on X and Px only. Let Y = SYY,


and note that for every generalized inverse Y1 of Y the transformation Y1SY
is a generalized inverse for Y. Now for we
have

so it is sufficient to prove Theorem 13.7.1 for Y in place of Y. In other


words, we can (and will) assume that the transformation Y from Theorem
13.7.1 satisfies the additional property that Im Y — Im X.
Now we verify (13.7.1) for the generalized inverse Y1 = Y^1PY, where
r, : JV-> Im Y = Im X is defined by Y,* = Yx, x £ Jf. Indeed

and

But the norms l l ^ ^ l l are bounded provided the transformation YE.3£(X)


with I m r = I m ^ is such that \\X - Y\\ < 5||A'1"1||. Theorem 13.7.1 is
proved. D

Observe that the complete analog of Theorem 13.5.1 does not hold for
the case of generalized inverses. Namely, given X and X1 as in Theorem
13.7.1, in general there is no positive constant K such that any transfor-
mation Y : <p" —> <pm with dim Ker y = dim Ker X has a generalized inverse
Y7 satisfying (13.7.1). To produce an example of such a situation, take
n = m and let X: <P"~* <P" be invertible. Then there is only one generalized
inverse of X, namely, its inverse X~l. Further, let Y = aX, where a ^0. If
(13.7.1) were true, we would have for some K>0 and all a:

which is contradictory for a close to zero.


Now we consider continuous families of transformations and their
generalized inverses. It is convenient to use the language of matrices with
the usual understanding that n x m matrices represent transformations from
<pm into <p" in fixed bases in <pm and <p".
Applications to Generalized Inverses 413

Theorem 13.7.2
Let B(t) be a continuous m x n matrix function on a connected compact set
K C $q such that rank 5(0 — p is independent of t. Then there exists a
continuous n x m matrix function X(t) on K such that, for every t E. K, X(t)
is a generalized inverse of B(t).

Proof. In view of Corollary 13.6.5 there exists a continuous basis


x,(r), • • • , xn_p(t) in Ker 5(0, as well as a continuous basis ^,(0, • • • . 3^(0
in Im 5(0- By the same corollary there exist a continuous basis
xn_p +l(t),. . . , xn(t) in Im 5(0* and a continuous basis yp +l(t),. . . , ym(t)
in Ker 5(0*. As Im 5(0* = (Ker B(t)}\ it follows that j r ^ f ) , . . . ,*„(*)
is a basis in <f"" for all tE K. Also, yv(t),. . . , ym(t) is a basis in <pm for all
r e K. Define a transformation X(t): <pw—>• (pn as follows: X(f)yj(i) = Q,
j = p + 1, . . . ,m; and for j = 1, . . . , p X(t)yj(t) is the unique vector in
Im5(0* such that B(t)X(t)y;(t) = yfi). Theorem 1.5.5 shows that X(t) is
indeed a generalized inverse of 5(0 for all tE. K. It remains to show that
X(t) is continuous.
For a fixed vector z e <pm and any r e A", write f°r
some complex numbers z,(r) that depend on t. These numbers z,(r) turn out
to be continuous, because

Further, the transformation

is invertible, so

for some complex numbers a;,(0 that also depend on t. Again, a;/(0 are
continuous on K. Indeed, a;j(0 ^s the unique solution of the linear system of
equations

Writing y ; (r), j = 1,. . . , p in terms of linear combinations of the standard


basis vectors e^ . . . , em, and writing * ; (0> i = n- p + 1, . . . , n in terms of
414 The Metric Space of Subspaces

linear combinations of e{,. . . , en we can represent the system (13.7.3) in


the form

where a(t) is the p 2 -dimensional vector formed by a y ,(0> / = 1> • • • > P'->
i- n — p + I , . . . , n, and A(t) and C(t) are suitable matrix and vector
functions, respectively, which are continuous in /. As the solution of (13.7.4)
exists and is unique for every f G K, it follows that the columns of A(t) are
linearly independent for every tEK. Now fix t(} G K, and assume for
simplicity of notation that the upper p2 rows of A(t()) are linearly indepen-
dent. Partition

where A0(t) and C 0 (f) are the top p2 rows of A(t) and C(t), respectively.
Then A()(t()) is nonsingular; as A(t) is continuous in f, the matrix AQ(t) is
nonsingular for every / from some neighbourhood U, of t0 in K. It follows
that

is continuous in f for / E t/, . As t0 G A^ was arbitrary, the functions a;/(0 are


continuous on K.
Returning to our generalized inverse X(t), we have for every

the following equalities:

and so ^Tr is continuous on K.

A particular case of Theorem 13.7.2 deserves to be mentioned explicitly.

Corollary 13.7.3.
Let B(t) be a continuous m x n matrix function on a connected compact set
K C ftg such that, for every t G K, the matrix B(t) is left invertible (resp. right
invertible). Then there exists a left inverse (resp. a right inverse) X(t) of B(t)
such that X(t) is a continuous function of t on K.
Subspaces of Normed Spaces 415

13.8 SUBSPACES OF NORMED SPACES

Until now we have studied the notions of gaps, minimal angle, minimal
opening, and so on for subspaces of <p" where the norm of a vector
x = (jc,, . . . , xn} is Euclidean: ||jc|| = (£" =1 I*,) 2 ) 1 ' 2 . Here we show how
these notions can be extended to the framework of a finite-dimensional
linear space with a norm that is not necessarily generated by a scalar
product.
Let V be a finite-dimensional linear space over <p or over %. A real-
valued function defined for all elements # E V, denoted by ||jc||, is called a
norm if the following properties are satisfied: (a) ||je||^0 for all x E V ;
if and only id for every jcE V and every
scalar A (so A E <p or A E J j ? according as V is over ((7 or over
(the triangle inequality).

EXAMPLE 13.8.1. Let/,, . . . , / „ be a basis in V, and fix a number/? > 1. For


every put

Also, define ||*|U = maxdaj, . . . , |aj). We leave it to the reader to verify


that | | - | j p ( p > l ) and H ' l l ^ are norms (one should use the Minkowski
inequality for this purpose): for any complex numbers * , , . . . , * „ ,
yl , . . . , yn and any p > 1 we have

EXAMPLE 13.8.2. For V= <p" (or V= ft") let

where x = (x,, . . . , xn) belongs to (p" (or to JjJ"). We have used this norm
throughout the book. Actually, this is a particular case of Example 13.8.1
(with the basis fl ; = e^ i = 1, . . . , n in <p" (or jj?") and p = 2).

Any norm on V is continuous, as proved in the following proposition.

Proposition 13.8.1
Let /, , . . . , fn be a basis in V, and let \ \ - \ \ be a norm in V. Then, given e > 0
there exists a 8 > 0 such that the inequality
416 The Metric Space of Subspaces

holds provided \Xj — yt\ < S for / = ! , . . . , « , where x — E"=1 Xjfj and

Proof. Letting M = max lsysn ||/j.||, choose 8 = eM n . Then for


every with we have

It remains to use the inequality

which follows easily from the axioms of a norm.

It is important to recognize that different norms on a given finite-


dimensional vector space are equivalent in the following sense.

Theorem 13.8.2
Let || • ||' and || • ||" be two norms in V. Then there exists a constant K^l
such that

for every xE.V.

We stress the fact that K depends on || • ||', || • ||" only (and of course on
the underlying linear space V).

Proof. Let / ! , . . . , / „ be a basis in V. It is sufficient to prove the


theorem for the case when

Consider the real-valued continuous function g defined on <p" by

As the set is closed and bounded, the


Subspaces of Normed Spaces 417

function g attains its maximum and minimum on this bounded set. So there
exist *,, x2 £ V such that \\x{\\' = \\x2\\' = 1 and

for every v E V with || i; ||' = 1. Now for x £ V, x ¥= 0 we have


and hence

Thus inequality (13.8.1) holds with K = max(||jc2||", l/||jt,||"). D

In the rest of this section we assume that an arbitrary norm || • || is given


in the finite-dimensional linear space V.
For any subspace M C V, let

be the unit sphere of M. Now the gap 6(!£, M ) between the subspaces !£ and
M in V is defined by formula (13.1.3):

where d(x, Z) = inf, ez ||;c - /|| for a set Z C V.


The gap has two properties of a metric: (a) 0(<&, M) = B(M, 2£) for all
subspaces However, the
triangle inequality

for all subspaces £, M, N in V fails in general, although it is true when the


norm is defined by means of a scalar product (x, y ) , as Theorem 13.1.1
shows. The following example illustrates this fact.

EXAMPLE 13.8.3. Let tjjl2 be the normed space with the norm

Consider a family of one-dimensional subspaces

We compute 0(#(a), ^(j8)). Take x £ 5(^(/3)), so that x - (7, y/3), where


. Now
418 The Metric Space of Subspaces

As the function f(fJi) — \y - /n| + |y/3 — f i a \ is piecewise linear, we have

So

Let a < /3 < y be positive numbers such that /3 < 1 < y and / 3 y < l . We
compute

and

However, clearly

so the inequality

holds for sufficiently small positive a, and the triangle inequality for the gap
fails in this particular case.

In contrast, the spherical gap

is a metric. (The verification of this fact is exactly the same as that given in
Section 13.2.) Instead of inequality (13.2.5), we have in the case of a general
normed space the weaker inequality
Subspaces of Normed Spaces 419

for any subspaces J£, M C V. Indeed, the left-hand inequality of (13.8.3) is


evident from the definitions of S($, M) and 0(£,M). To prove the
right-hand inequality in (13.8.3), it is sufficient to verify that for every
vector v E V with ||u|| = 1 and every subspace M C V we have

For a given e > 0 there exists a v G Jf such that

and we can assume that i>^0. [Otherwise, replace v by a nonzero vector


sufficiently close to zero so that (13.8.5) still holds.] Then
and hence

But

and we have

As e > 0 is arbitrary, the desired inequality (13.8.4) follows.


The minimal angle between two subspaces is defined in a normed space
by the formula (13.2.1). With this definition, Proposition 13.2.2 and
Theorem 13.2.3 are valid in this case. Without going into details, we remark
that Lemmas 13.3.1 and 13.3.2 also can be extended to the normed space
context.
Concerning the metric space properties of the set of all subspaces in the
spherical gap metric (such as compactness, completeness), it follows from
inequality (13.8.3) and the following result that these do not depend on the
particular choice of the norm.

Theorem 13.8.3
Let || • ||' and \\ • \\" be two norms in V, with the corresponding gaps 8'(M, N)
and 0"(M, JV) between subspaces M and N in V. Then there exists a constant
L > 1 such that

for all subspaces M and X.


420 The Metric Space of Subspaces

Again, the constant L depends on the norm || • ||' and || • ||" only.

Proof. By Theorem 13.8.2 we have for any x G V

where the constant K > 1 is independent of x. Hence

In view of the definition of 8(3?, M ) we obtain the left-hand inequality in


(13.8.6) with L = K2. The right-hand inequality in (13.8.6) follows
similarly. D

13.9 EXERCISES

13.1 Compute the gap 0(M, jV), where

and x and y are complex numbers such that


13.2 Compute the gap 6(M, Jf), spherical gap 0(Ji, ^V), minimal opening
T)(M, N) and minimal angle (pmin(M, N), where

and x and y are real numbers such that |jc| — |y|.


13.3 Compute 0(M,N), r)(M,N), and <pmin(M,N) for any two one-
dimensional subspaces M and ^V in /(?".
13.4 Let U: <p"-^ <p" be a unitary transformation. Prove that

for any pair of subspaces M , M C <J7".


Exercises 421

13.5 Prove that for subspaces Jz?, M in <p"

13.6 Show that the equality 0(3?, M) = \ holds if and only if either
£± n M ^ {0} or # n M1 ^ {0} (or both).
13.7 Let M I , N ! be subspaces in <p" and M2,N2 be subspaces in <pm.
Prove that

where
13.8 Find the gaps 0(Ker A, Ker B) and 0(Im A, Im 5) for the following
pairs of transformations
(a) j4 and B are diagonal in the same orthonormal basis.
(b) A and # are commuting normal transformations.
(c) A and B are circulant matrices in the same orthonormal basis.
(Hint: A and B can be simultaneously diagonalized by a unitary
matrix.)

in the same orthonormal basis, where ay and /3y are complex


numbers.
13.9 For each of cases (a)-(d) in Exercise 13.8, find

13.10 Let A: fn-+f" be a transformation. Then 0(J<,JV) = 1 for any


distinct ^-invariant subspaces M and ^V if and only if A is normal
with n distinct eigenvalues.
13.11 Show that if A(t), t £ [0, 1] is a continuous family of n x « circulant
matrices and dim Ker A(t) is constant (i.e., independent of t), then
the subspaces Ker A(t) and Im A(t) are constant.
13.12 Prove or disprove the following:
(a) If A(t) is a continuous family of upper triangular Toeplitz n x n
matrices for / £ [0, 1], then dim Ker A(t) is constant if and only
if Ker A(t} and Im A(t) are constant.
422 The Metric Space of Subspaces

(b) Same as (a) for

where a y (f) are continuous scalar functions of t e [0, 1] and A is


a fixed n x n matrix.
13.13 Show that a circulant matrix has a generalized inverse that is also a
circulant.
13.14 Let A(t) be a continuous family of circulant matrices with
dim Ker A(t) constant for f £ [ 0 , 1]. Show that there exists a
continuous family B(t) of generalized inverses of A(t) on [0, 1] that
also consists of circulant matrices.
13.15 Solve Exercises 13.13 and 13.14 with "circulant" replaced by "upper
triangular Toeplitz."
13.16 Assume the hypotheses of Lemma 13.3.1 and, in addition, assume
that the projector I10 is orthogonal. Prove that \\R\\ = cotan <p min ,
where <pmin is the minimal angle between Ker I10 and Im II.
13.17 Find the minimal angle between any two one-dimensional subspaces
in the normed space ^f 2 with the following norms:
Chapter Fourteen

The Metric Spaces


of Invariant Subspaces

We study the structure of the set Inv(^4) of all invariant subspaces of a


transformation A: <p"~* <P" in the context of the metric space <jj(<p n ) °f au<
subspaces in <f"'. Throughout this chapter <p" is considered with the standard
scalar product and the gap metric determined by this scalar product on
(J7((p"), as studied in the preceding chapter. With the exception of Section
14.3, the results of this chapter are not used subsequently in this book.

14.1 CONNECTED COMPONENTS: THE CASE OF ONE EIGENVALUE

Let s£ C <$ be two sets of subspaces of (p". We say that s& is connected in Sft
if for any subspaces <&, M C stf there is a continuous function /: [0, 1]—» 8ft
such that /(O) = j£, /(I) = M. [The continuity of / is understood in the gap
metric. Thus, for every tQ G [0, 1] and every e > 0 there is a 8 > 0 such that
imply The set si is called con-
nected if s& is connected in M.
We start the study of connectedness of the set Inv(y4) with the case when
A — 7, a Jordan matrix with &(J) = {0}. Let r be the geometric multiplicity
of the eigenvalue 0 of J, and let kl>--->kr be the sizes of the Jordan
blocks in /. Also, denote the set of all /^-dimensional J-invariant subspaces
by Inv p .
r) be an ordered r-tuple of integers such that 0 < /i. < A:,,
E; = 1 /,- = />, and let 4>p be the set of all such /--tuples. We associate every
with the subspace $(/) G Inv p , spanned by vectors wj 0 ;
y = 0, . . . , / y — 1; / = 1, . . . , /-, where wj'* are unit coordinate vectors in <(?"
and the sole nonzero coordinate of «j° is equal to one and is in the place
k^ + — • + ki_l+ j + 1 (we assume k0 = 0) for / = 0,. . . , kf - I and i =
1,. . . , r. There is a one-to-one correspondence between elements of <I>p and

423
a24 The Metric paces of Invariant Subspaces

subspaces from Inv/; spanned by unit coordinate vectors. So we can assume


that <£p C Inv;, .

Lemma 14.1.1
<$p is connected in ln\p.

Proof. Let / = ( / , , . . . , l r ) and / = ( / , , . . . , l r ) be r-tuples from <&p, and


suppose, for example, that /, > /, and / 2 < / 2 - Let ^(e) G Invp be the
subspace spanned by vectors w j ' l , + ew}, 2) , u{()l\ . . . , u(,^2, u ( - } for / =
0, . . . , /; - 1 and / = 2, . . . , r, where e is a complex number. Then ^(0) =
(the subspace corresponding to the r-tuple /) and

So / = ( / , , . . . , l r ) and (/,-!, / 2 + 1, /-,, • • • , l r ) are connected in Inv p .


Applying this procedure several times, we obtain a connection between /
and /.

Lemma 14.1.2
Let ^, E ln\p. Then oF, is connected in ln\p with some £F, E4> p .

Proof. For / = 0, 1, 2 , . . . , let

Then 0 = S?0 C S^, C • • • C ^ = <p" for some integer 5 (5 is the minimal


integer such that 7* =0). We construct the basic set of vectors in ^, in the
following way (see the proof of the Jordan form in Section 2.3). Let /0 be the
greatest index that satisfies (£%, "•"{%,._,) H ^, ^0. Take a basis
v,'o; ', , . . . , v,'o<?/
• „0 in &.i fl £%,'o modulo $£,'(i ~ ', . Then the vectors 7'u,'o ', , . . . , J'v/'o^i„0
are linearly independent in &} C\ ffl t _,. modulo 3? , - _ , - _ , ; / = ! , . . . , /0 - 1.
We complete the set Jvf !,..., JVj by additional vectors
'(i" 1 - 1 ' ' 'a~l'1i0-\
to form a basis in ^, l fl £%, 'o^,1 modulo S?, 'o,-/ . Then the
vectors

are linearly independent in ^, D £% y _ ; modulo ^ ( _,._, for / = 2, . . . , /0 - 1.


Complete the set

by additional vectors
y ;to a basic set of vectors in
Connected Components: The Case of One Eigenvalue 425

fl £%,. _ 2 modulo ^, _ 3 , and so on. So we obtain the basic set of vectors in

To connect ^, with some subspace ^2 E 4>p, we use the following procedure.


Take a set of q t coordinate unit vectors V,' ', , - . . , ^V,,,
~ '() y
'0^i ,
(
in <3l,'() that are
independent modulo £%, _ , . For / = 1, 2, . . . , ^ , put

where A is a complex parameter. Then the u, 0 y(A) are linearly independent


modulo £%; _ , for every A G <p except possibly for a finite set S{. Indeed, let
*,, . . . , * £ be a basis in £%, _ , , and put

Then u/ ( ) y(A), / = 1, . . . , ^ are linearly independent modulo ^, 0 _i if and


only if the columns of B( A) are linearly independent. Let b( A) be a minor of
B(\) of order /0 + k such that ^(0)^0 (such a minor exists because y ( ( / ,
y = 1, . . . , <7, are linearly independent modulo £%,. _ , ) . So fc(A) is a poly-
nomial that is not identically zero. Clearly, for every A that does not belong
to the finite set 5, of zeros of b(\), the vectors u / ( / (A), j = 1, . . . , q^
are linearly independent modulo £%, _ , . Observe that 5, does not contain 0
and 1.
Further, take a set of q t , coordinate unit vectors
in &l _ such that the vectors

are independent modulo S?, _ 2 . Putting

for y = 1, . . . , g,: _ , , we see similarly that the vectors

are independent modulo S? , _ 2 for A G <p ^ ^2' where 52 D 5j is a finite set of


complex numbers (not including 0 and 1). We continue this procedure and
obtain vectors

such that
426 The Metric Spaces of Invariant Subspaces

are linearly independent for A G <p" ^ 5, where 5 is finite set of complex


numbers not including 0 and 1 and vti(l) are coordinate unit vectors. From
this procedure it follows also that vij( A) G £%, for A G <p "~ S. Therefore, the
subspace ^(A) in <p" spanned by vectors (14.1.1) for A G ( p ^ 5 is a
/-invariant subspace with dimension not depending on A. Since S is finite we
can connect between 0 and 1 by a continuous curve F such that F D S — 0.
Then ^(A), A G F carries out the connection between ^, = ^(0) and
&2 = (1), where ^2 G 4>p.

We say that a set ^ C <]/((p") has connected components ja?,, . . . , s&m if


each stfj, / = l , . . . , r a is a nonempty connected set, but there is no
continuous function /: [0, 1]-+ $(<p") such that /(O)G^, / ( l ) G ^ , and
i^j. (In other words, each sdt is a maximal connected set in stf.)
Lemmas 14.1.1 and 14.1.2 allow us to settle the question of connected
components of the set Inv(/l) when the transformation A has only one
eigenvalue.

Theorem 14.1.3
Assume that the transformation A: $"—> <p" has only one eigenvalue A0. Then
Inv(y4) has exactly n + 1 connected components, and each connected compo-
nent consists of all A-invariant subspaces of fixed dimension.

Proof. Without loss of generality we can assume A() = 0. Let J be the


Jordan form of A , and A = S ' 1JS for some invertible transformation 5.
Obviously, ln\(A) = 5"'(Inv(/)) and Inv^A) = S ~ l ( l n v p ( J ) ) , where
ln\p(A) is the set of all ,4-invariant subspaces of dimension p. Lemmas
14.1.1 and 14.1.2 show that lnvp(J) and, therefore, lnvp(A) are connected.
On the other hand, if !£ G lnvf)(A) and M G Inv^/4) with p ^ q, then there
is no continuous function /: [0, l]-» <£(£") with /(O) - <£ and f ( \ } = M.
Indeed, if there were such a function/, then dim /(/) would not be constant
in a neighbourhood of some point f ( , G [0, 1]. This contradicts the continuity
of / i n view of Theorem 13.1.2.

14.2 CONNECTED COMPONENTS: THE GENERAL CASE

The description of connected components in Inv(^4) for a general transfor-


mation A: <p"—» <P" is given in the following theorem.

Theorem 14.2.1
Let A,, . . . , At be all the different eigenvalues of A, and let ( / / , , . . . , «/>c be
their respective algebraic multiplicities. Then for every integer p, Q< p < n,
Connected Components: The General Case 427

and for every ordered c-tuple of integers (xl , . . . , *c) such that 0 < x, — ^
i = l , . .. ,cand

{!£ £ Inv A \ dim !£ = p and the algebraic multiplicity of

is a connected component of ln\(A), and each connected component of


ln\(A) has the form (14.2.1) for a suitable p and suitable c-tuple

Proof. In the proof we use the following well-known properties of the


trace of a transformation A: $"—* (f"1, denoted by tr(/4) [e.g., see Section
3.5 in Hoffman and Kunze (1967)]. We may define tr(/4) to be the sum of
eigenvalues of A. If A is written as an n x n matrix in any basis in <p", then
tr(.A) is also the sum of diagonal elements of A. We have tr(AB) = lr(BA)
for any transformations A , B : <p" —» <p" ; in particular, tr(5~1/15) = if (A) for
any invertible 5. The trace (considered as a map from the set of all
transformations <p"—» <p" onto (p) is a continuous function.
Returning to the proof of Theorem 14.2.1, let F( be a small circle around
A,, with no other eigenvalue of A inside or on F.. Let JV be an /i- in variant
subspace, and let Xi(N) be the geometric multiplicity of A, for the transfor-
mation A\v. Using the Jordan form of A\ N , for instance, it is easily seen that

Let «,, . . . , ap be an orthonormal basis in JV. Then in some neighbourhood


V(N) of .TV, Pv-a,, . . . , Pjf.ap will be a basis in the subspace N' e V(JT),
where P v is the orthogonal projector on JV'. We have

Write /i| v as a matrix in the basis a,, ... ,a p , and for every /1-invariant
subspace N' that belongs to V(N], write ^4 v , as a matrix in the basis
P v , a , , . . . , Pjf.ap. Using formula (14.2.2) and the continuity of the trace,
we see that there exists a S > 0 such that, if 0(jV, N')<8 and N' is A
invariant, then

Since Xi(N') assumes only integer values, it follows that A/,(^') 's constant
in some neighbourhood of N in lnv(A) and, therefore, constant in the
connected component of Inv(^4) that contains N.
We show now that if N and Jf' are p-dimensional ^-invariant subspaces
428 The Metric Spaces of Invariant Subspaces

such that Xi(N) = X i ( N ' ) for / = 1, . . . , c, then JV and N' are connected in
Inv(yl). Indeed, applying Theorem 14.1.3 to each restriction A\^ (A) for
/ = 1, . . . , c, we find that N n &Ai(A) is connected with Jf' n ^(A) in the
set of all /4-invariant subspaces of dimension Xi(N) in 9l^(A). Since

and similarly for M', it follows that JV and .A"' are connected in
It remains to show that, given integers x\* • • • •> Xc suc^ tnat 0 — A', — 'A,
and L c i = l X i — p , there exists a subspace jVelnv(/4) with ^,(JV) = xt, for
/ = 1, . . . , c. But assuming that A is in Jordan form, we can always choose
an jV spanned by appropriate coordinate unit vectors. D

Corollary 14.2.2
The set ln\(A) has exactly Hli=l ((f/l ; + 1) connected components, where
«/fj , . . . , «/rc are the algebraic multiplicities of the different eigenvalues
\i, . . . , \c of A, respectively.

The proof of Theorems 14.1.3 and 14.2.1 shows in more detail how the
subspaces in Inv A belonging to the same connected component are con-
nected. We say that a vector function x(t) defined for fE[0, 1] and with
values in <p" is piecewise linear continuous if there exist m points 0 < tl <
"•<tm<\ and vectors y,, . . . , _y m + 1 and z , , . . . , z m + 1 such that, for
i = 1 , . . . , m +1

(by definition, f 0 = 0, tm + l = 1), and for / = 1, . . . , m, we obtain

Corollary 14.2.3
Let M and M be p-dimensional A-invariant subspaces that belong to the same
connected component in Inv A. Then there exist piecewise linear continuous
vector functions v { ( t ) , . . . , vp(t) such that, for all ?G[0, 1], the subspace
Span{ el), • • • , vp(t)} is p-dimensional, A invariant, and

14.3 ISOLATED INVARIANT SUBSPACES

Let A: <p" —> (f"1 be a transformation. An /1-invariant subspace M is called


isolated if there is an e > 0 such that the only /l-invariant subspace N
satisfying 0(M, N) < e is M itself.
Isolated Invariant Subspaces 429

Theorem 14.3.1
An A-invariant subspace M is isolated if and only if, for every eigenvalue
of A with dim Ker( either

To prove Theorem 14.3.1, we use a lemma that allows us to reduce the


problem to the case when A has only one eigenvalue.

Lemma 14.3.2
An A-invariant subspace M is isolated if and only if for every eigenvalue A0 of
A the subspace (A) is isolated as an A\ invariant subspace.

Proof. We have

where are all the different eigenvalues of A.


Assume that is isolated. If for some A, the subspace
is not isolated [as an A\ invariant subspace], then there exists a
sequence of ^-invariant subspaces such that
For m = 1, 2,. . . , let

Obviously, is A invariant. Let ^ be a direct complement to


in (4) for y = 1,. . . , r, and put Then 'is
direct complement to in Theorem 13.1.3 shows that for m sufficiently
large, t is a direct complement to K(A), and therefore is a direct
complement to in Letting be the projector on
(resp. along J e have (cf. (13.1.4))

where P, is the projector on along

and mis the projector on


along Theorem 13.1.3 shows that for large m

where the constant C > 0 is independent of m. Comparing with (14.3.1), we


430 The Metric Spaces of Invariant Subspaces

obtain as m—»°°, a contradiction with the fact that is


isolated.
Assume now that, for / = 1,. . . , r, K(A) is isolated as an A
invariant subspace. So there exists an >0 such that the only .A-invanant
subspace (^4) satisfying

is itself.
We show now that, for every there exists re aexists
8 > 0a hat,
8 > 0forsuch
anythat, for any
A-invariant subspace with the inequalities
hold for / = 1,. . . , r. Indeed, arguing by contradiction,
assume that for some and some / there exists a sequence of
/l-invariant subspaces such that as w but

Let Then, in particular, and by Theorem 13.4.2


there exists a sequence such that for m = 1 , 2 , . . . and

Write x where Apply


the projector on A (A) along the sum of all other root subspaces of A to
both sides of (14.3.3). We see that y = lim Conversely, if y =
lim for some x then obviously and,
by Theorem 13.4.2, we also have Now by the same Theorem
13.4.2 any limit point of the sequence m = 1, 2, . . . coincides
with Since Theorem 13.4.1 ensures that the limit points of
exist, we obtain a contradiction with (14.3.2).
Now take min Then for with the property de-
scribed in the preceding paragraph, we find that for every .A-invariant
subspace the equa lit hold
for; = 1,. . . , r. But these equalities imply t is, M is isolated. D

Proof of Theorem 14.3.1 In view of Lemma 14.3.2, we can assume that


If dim Ker( then A is unicellular and has a
unique complete chain of invariant subspaces. Obviously, every ^-invariant
subspace is isolated. Now assume that dim Ker In view of
Theorem 14.1.3, the set lnvp(A) of all y4-invariant subspaces of fixed
dimension p is connected. So to prove that the only isolated A-invariant
subspaces are {0} and we must show that ln\p(A) has at least two
members for However, for every p with 0 < p < n, and in a fixe
Jordan basis for A, the transformation A has at least two invariant subspaces
of the same dimension p spanned by some vectors from this basis.
Isolated Invariant Subspaces 431

An A-invariant subspace is called inaccessible if the only continuous


mapping of the interval [0,1] into the lattice ln\(A) of A -invariant subspaces
with is the constant map Clearly, every isolated in-
variant subspace is inaccessible. The converse is also true, as follows.

Proposition 14.3.3
Every inaccessible A-invariant subspace is isolated.

Indeed, if A has only one eigenvalue and dim Ker then


any A -invariant subspace is obviously inaccessible and isolated. It can be
proved by using the arcwise connectedness of ln\p(A) for 0</? < n that, if
(A) = and dim Ker then any nontrivial /1-invariant
subspace is not inaccessible (Corollary 14.2.3). The reduction of the general
case to this special case is achieved with the following lemma.

Lemma 14.3.4
An A-invariant subspace M is inaccessible if and only if, for every eigenvalue
of A, the subspace (v4) is inaccessible as an A -invariant
subspace.

The proof of Lemma 14.3.4 is left to the reader. (It can be obtained along
the same lines as the proof of Lemma 14.3.2.)

Theorem 14.3.5
Every inaccessible (equivalently, isolated) A-invariant subspace is A hyper-
invariant.

Proof. Let be the distinct eigenvalues of A (if any) wit


dim Ker for / = ! , . . . , $ , and let be other
distinct eigenvalues of A (if any). For a given isolated ^-invariant subspace
we have, by Theorem 14.3.1

and some t with Le tting a, be the dimension of we


have =Kerp(A) t where A
every transformation that commutes with A also commutes with p(A), the
subspace is A hyperinvariant. D

The converse of Theorem 14.3.5 does not hold in general, as the next
example shows.
432 The Metric Spaces of Invariant Subspaces

EXAMPLE 14.3.1. Let

The subspace = Sp is the kernel of T and is thus T


hyperinvariant. For any complex number a, the subspace
Span is easily seen to be T invariant. We have

so

and as the norm of a hermitian matrix is equal to the maximal absolute


value of its eigenvalues, a computation shows that

where

So the subspace valued function F defined on is continu-


ous and nonconstant and takes T-invariant values. As the T-
invariant subspace is not inaccessible.

14.4 REDUCING INVARIANT SUBSPACES

Recall that an invariant subspace of a transformation is called


reducing if there exists an ^4-invariant subspace jV that is a direct comple-
ment to n
Reducing Invariant Subspaces 433

The question of existence and openness of the set of reducing ,4-invariant


subspaces of fixed dimension p is settled by the following theorem.

Theorem 14.4.1
Let A: be a transformation with partial multiplicities
(so ml + • • • + mk = n). Then there exists a reducing A-invariant subspace of
imension if and only if p is admissible, that is, is the sum of some
partial multiplicities ra,,. . . , ra, . In this case the set of all reducing A-
invariant subspaces of dimension p is open in the set of all A-invariant
subspaces.

Proof. If p is admissible, then obviously a reducing ^-invariant sub-


space of dimension p exists. Conversely, assume that M is a reducing
/1-invariant subspace of dimension p with an ,4-invariant complement N.
Write

with respect to the direct sum decomposition Taking Jordan


forms of A\ and A2, we see that p is admissible.
For an admissible p, let Rm\p(A) be the set of all p-dimensional reducing
y4-invariant subspaces. For a subspace M Rinvp(A), let Jf be a direct
complement to M that is A invariant. Theorem 13.1.3 shows that there exists
an e < 0 such that ^Vis a direct complement for any ^-invariant subspace M\
with Hence Rmvp(A) is open in the set lnvp(A) of all
/7-dimensional ^-invariant subspaces. D

Now consider the question of whether (for admissible /?) the set R\n\p(A)
of all p-dimensional reducing subspaces for A is dense in the set lnvp(A) of
all p-dimensional /1-invariant subspaces. We see later that the answer is, in
general, no. So a problem arises as to how one can describe the situations
when Rm\p(A) is dense in lnvp(A) in terms of the Jordan structure of A.
We need some preparation to state the results. Let A: be a
transformation with single eigenvalue and partial multiplicities
It follows from Section 4.1 that the partial multiplicities/?
p, of the restriction A to an ^-invariant subspace M satisfy the inequalities

Given an integer p with 1 let p , be a sequence of positive


integers such that (14.4.1) holds and /?, + ---+p,-p; a sequence with
these properties is called p admissible. For a p admissible sequence
denote by \n\p(A; / ? , , . . . , p,) the (nonempty) set of all A-
invariant subspaces M such that the restriction A\ has the partial multi-
434 The Metric Spaces of Invariant Subspaces

plicities pl , . . . , p,, Clearly, dim M = p for every


Moreover

where the union is taken over the finite set of all p-admissible sequences
For each p-admissible sequence

where and K* indi-


cates the number of elements in the finite set K. In connection with the
definition of (-4; / ? , , . . . , p,), observe that c for y = 1, 2,. . . (so each
summand on the right-hand side of (14.4.2) is a nonnegative integer), and/?,
is the maximal index with
We now give a necessary and sufficient condition for the denseness of
Rinv /3 (/l) in \n\p(A), for a transformation A: <p"—»<p n with single eigen-
value and partial multiplicities m, > • • • S: mr.

Theorem 14.4.2
For a fixed admissible integer p, the set K\n\p(A) is dense in lnvp(A) if and
only if the following condition holds: any p-admissible sequence p} >•• ->pt
for which the number s,(A\ pt,. . . , p,) attains its maximal value among all
p-admissible sequences has the form p\ = mt, , . . . , p, — mt for some indices
In particular, Rinvp(A) is dense in ln\p(A) pro-
vided there is only one p-admissible sequence for which
is maximal.

In the proof of Theorem 14.4.2 we apply a result proved in Shayman


(1982) concerning a representation of ln\p(A) as a union of complex
(analytic) manifolds. In this proof (and only in this proof) we assume some
familiarity with the definition and simple properties of complex manifolds
that can be found, for instance, in Wells (1980).

Theorem 14.4.3
For every p-admissible sequence p the set Inv /; (^4; /?j, . . . , p,) is,
in the topology induced by the gap metric, a connected complex manifold
whose (complex) dimension is equal to

For the proof of Theorem 14.4.3 we refer the reader to Shayman (1982).

Proof of Theorem 14.4.2. Assume that the condition fails, that is, there
exists a p-admissible sequence with maximal
Reducing Invariant Subspaces 435

that is not of the form By


Theorem 14.4.3 the complex manifold \n\p(A; p , , . . . , p,) has maximal
dimension among all the complex manifolds whose union is ln\p(A). On the
other hand, it is easily seen that ln\p(A; p , , . . . , p,) does not contain any
reducing subspace for A (cf. the proof of Theorem 14.4.1). So Rinv p (/4) is
not dense in Inv p(A).
Assume now that the condition holds. Then every complex manifold
ln\p(A; p,, . . . , p,) with maximal will contain a reducing
subspace { , . . . , p,) for A. Fix such a p-admissible sequence
p,, and let JVbe an ^-invariant direct complement to M(p}, . . . , p,) in
It follows from Theorem 7 in Shayman (1982) that the complex manifold
In\p(A; p,, . . . , p,) can be covered by a finite number of analytic charts
and that each chart is of the form p(A; p , , . . . , p,)
with Span where
A-,(Z), . . . , *p(z) are analytic vector functions in <p9. Now it is easily seen
that the set of all subspaces M E:lnvp(A; p , , . . . , p,) that are not direct
complements to Jf is an analytic set (i.e., the union of the sets of zeros
of a finite number of analytic functions that are not identically zero) in
each of the charts mentioned above. Denoting by K the union of all
ln\p(A\ p,, . . . , p,) for which is maximal, it follows that
Rinv /J (/4) K is dense in K. As ln\p(A) is connected (Theorem 14.1.3), it
follows from Theorem 14.4.3 that the closure of K coincides with Inv p (/4);
hence Rinvp(A) is dense in ln\p(A).
Finally, suppose that there exists only one p-admissible sequence
for which is maximal. As the set ln\p(A) is
connected, and Theorem 14.4.3 implies that lnvp(A) is the closure of
lnvp(A; p j , . . . , p j - ) . Since p is admissible, there exists a p-dimensional
/4-invariant subspace such that for some ^4-invariant
subspace . So there exists a subspace M in ln\ (suf-
ficiently close to for which is a direct complement. Now we can
repeat the arguments in the preceding paragraph to show that Rinv p (v4) is
dense in lnvp(A).

Let us give an example showing that, for an admissible p, Rinv p (,A) is not
generally dense in ln\p(A).

EXAMPLE 14.4.1. Let

where Jm(Q) is the Jordan block of size m with eigenvalue 0. Clearly, p = 5 is


admissible. However, Rinv 5 (/4) is not dense in Inv 5 (^4). According to
Theorem 14.4.3, the connected set Inv5(/4) is the disjoint union of five
analytic manifolds S,, 52, S3, 54, S5 described as follows: let
7 Then for j = 1,. . . , 5,
436 The Metric Spaces of Invariant Subspaces

Sj consists of all five dimensional ,4-invariant subspaces M such that the


restriction A\M has partial multiplicities given by yf. Further, the (complex)
dimensions of 51? S2, S3, S4, 55 are 4, 4, 3, 2, 0, respectively. It is easily seen
that there is no reducing subspace for A in S2. Indeed, the sum of a
subspace from 52 and any four-dimensional A -invariant subspace fails to
contain the vector e Since the dimension of S2 is maximal among the
dimensions of Sj, j = I , . . . ,5, it follows that Rinv 5 (A) is not dense in
Inv5(,4). D

In the next example Rinv p (/4) is dense in ln\p(A), for all admissible p.

EXAMPLE 14.4.2. Let

Obviously, all p - 0, 1, 2, 3 are admissible. Among the one-dimensional


/4-invariant subspaces Span (where all are re-
ducing with the exception of Span (i.e., when . Indeed

for a T^ 1. So Rinv^/l) is dense in Inv,(v4). Further, in the set

of two-dimensional v4-invariant subspaces the reducing ones are


Span that is, again a dense set.

We note the following corollary from Theorem 14.4.2.

Corollary 14.4.4
If the transformation A: has only one eigenvalue and
dim Ker 7 - A) = 2, then Rmvp(A) is dense in lnvp(A) for every p such
that Rmvp(A) is not empty.

Proof. Indeed, let be the partial multiplicities of A. A simple


calculation shows that for every p-admissible sequence we have
and for the p-admissible sequence consisting of one
integer p, only, we have Hence there exists only one
p-admissible sequence for which is maximal,
and the second part of Theorem 14.4.2 applies.
Covariant and Semiinvariant Subspaces 437

14.5 COVARIANT AND SEMIINVARIANT SUBSPACES

In this section we study topological properties of the sets of coinvariant and


semiinvariant subspaces for a transformation y4 . As usual, the
topology on these sets is the metric topology induced by the gap metric.
For the coinvariant subspaces we have the following basic result.

Theorem 14.5.1
The set Coinv(y4) of all coinvariant subspaces for a transformation
A: is open and dense in the set < of all subspaces in
Furthermore, the set Coinvp(A) of all A-coinvariant subspaces of a fixed
dimension p is connected.

Proof. Let M be A coinvariant, so there is an ^-invariant subspace N


that is a direct complement to M in (p". By Theoreml3.1.3 there exists an
such that JV is a direct complement to any subspace with
Hence Coinv(v4) is open.
We now prove that Coinv(y4) is dense. Let M — Span{u,, . . . , vp] be a
p-dimensional subspace in There exists an (n - p)-dimensional A-
invariant subspace N. Let be a basis for JV. Denoting by
W j , . . . , w a basis for some direct complement to N, put

where 17 7^0 is a complex number. As v^ . . . ,vp are linearly independent


for 17 close enough to zero the vectors v are linear
independent as well. Hence dim ^(17) = p for 77 close enough to zero.
Further, the determinant of the n x n matrix [ « , • • • un_pwl • • • wp] is non-
zero. If the determinant of is a
polynomial in £ that is not identically zero, and it follows that

or all such that is large enough. For such the sub space Spa
is a direct complement to JV.
Span it follows that for 17 5^0 and close
nough to zero To show that M belongs to the closure of
the set of all >l-coinvariant subspaces, it remains to prove that

To prove this, assume for simplicity of notation that the upper p rows in
[i>, • • • vp] are linearly independent. Then the same will be true for the upper
p rows of (for 17 close enough to zero). Write
438 The Metric Spaces of Invariant Subspaces

where is a nonsingular p x p matrix and 0(17) is an (n - p) x p matrix.


Then the matrix

where and ' is the orthogonal


projector on s the entries of P(r e continuous functions of 17,
equality (14.5.1) follows.
Finally, let us verify the connectedness of Coin\p(A). Let
Com\p(A). So for some (n - p)-dimensional A-
invariant subspaces and et lt . . . , vp and M , , . . . , up be bases in
and spectively, and consider the subspaces M(rf panju,
where 17 G (p. As in the preceding proof of the dense
ness of Coinv(^4), one verifies that for all 17 with the possible exception of
finite set 4> (= the set of zeros of a certain polynomial), M(rj) is a direct
complement to the least one of the subspaces and Pick a continuous
curve where and that does not intersect 3> and such
that Then is the desired connection
between and in the set Coinvp(,4). D

Now we consider the semiinvariant subspaces. As any /l-coinvariant


subspace is also A semiinvariant, Theorem 14.5.1 implies that the set
Sinv(yl) of all A -semiinvariant subspaces is dense in <Jj(<p" )• However,
Sinv(/4) is not necessarily open, as the following example shows.

EXAMPLE 14.5.1. Let A = -7 The two-dimensional subspace


Span{e2, e3] is obviously A semiinvariant, and

(see the proof of Theorem 14.5.1). But the subspace Span{e2, e3 + rje4} is
not A semiinvariant for 17 ^ 0. Indeed, suppose that

where N and M are A invariant. As the only nonzero /1-invariant subspaces


are Span for / = 1, 2, 3, 4, and (14.5.2) implies e
it follows that Then dinuV = 2. Hence Ji must be Span 2},
which contradicts (14.5.2). D
The Real Case 439

Theorem 14.5.2
For any transformation A: the set Sinvp (/l) of all A-semiinvariant
subspaces of a fixed dimension p is connected.

Proof. Given an ^-invariant subspace N with dimension not less than p,


denote by S the set of all ,4-semiinvariant subspaces of dimension p
such that for some /1-invariant subspace M (in other words, is
A}^ coinvariant). It will suffice to show that for any jVand any
!£ there exists a continuous function/: Sin\p(A) such that
where J*2 is ,4 invariant, and let
/ , , . . . , /p and g , , . . . , gp be bases in j£, and «$?,, respectively. Denote by 5
the finite set of all i?E<P for which Span{/, + rjgj, . . . ,
is not a direct complement to ^ Then put f(t) =
Span{f for and , where
5 is any continuous function with T(0) = 0,

14.6 THE REAL CASE

Consider now a transformation A: $"—*%". We study here the connected


components and isolated subspaces in the set Inv*(/4) of all v4-invariant
subspaces in J|?".

Theorem 14.6.1
If A has only one eigenvalue, and this eigenvalue is real, then the set Inv*(y4)
of all A-invariant subspaces of fixed dimension p is connected.

The proof of Theorem 14.6.1 will be modeled after the proof of Theorem
14.3.1, taking into account the fact that in some basis in $." the transfor-
mation A has the real Jordan form (see Section 12.2). We apply the
following fact.

Lemma 14.6.2
The set GLr(n) of all real invertible n x n matrices has two connected
components; one contains the matrices with positive determinant, the other
contains those with negative determinant.

Proof. Let T be a real matrix with det T > 0 and let J be a real Jordan
form for T. We first show that J can be connected in GLr(ri) to a diagonal
matrix K with diagonal entries ±1. Indeed, J may have blocks Jp of two
types: first
440 The Metric Spaces of Invariant Subspaces

in this case we define

for any where \ p(t) is a continuous path of nonzero real numbers


such that according as
Second, a Jordan block Jp may have the form

where for real and r with Then J p(t) is


defined to have the same zero blocks as / , whereas the diagonal and
superdiagonal blocks are replaced by

respectively, for / £ [0, 1]. Then Jp(t) determines a continuous path of real
invertible matrices such that /p(0) = Jp and J p ( l ) is an identity matrix.
Applying the above procedures to every diagonal block in /, we see that J
is connected to K by a path in GLr(n). Now observe that the path in GLr(2)
defined for by

connects Consequently K, and hence J, is connect-


The Real Case 441

ed in GLr(n) with either / or diag[-l, 1 , 1 , . . . , 1]. But del T>Q implies


d e t J > 0 , and so the latter case is excluded. Since T=S~1JS for some
invertible real S, we can hold S fixed and observe that the path in
connecting J and / will also connect T and 7.
Now assume r(n) and d e t T < 0 . Then det 7' >0, where T' =
T diag[-l, !,...,!]. Using the argument above, we find that T' is connect-
ed with / in GLr(n). Hence T' is connected with diag[-l, !,...,!] in

Proof of Theorem 14.6.1. Without loss of generality we can assume that


A = /„(()). Let be the sizes of Jordan blocks in A. Let the
set of all ordered r-tuples of nonnegative integers / , , . . . , lr such that
i / , = p . As in Section 14.1, each is iden-
tified with a certain p-dimensional ^-invariant subspace; so can be
supposed to be contained in lnv The proof of Lemma 14.1.1 shows
that <I> is connected in lnv Further, we apply the proof of Lemma
14.1.2 to show that any is connected in Inv*(/4) with some
Take vectors as in the
proof of Lemma 14.1.2. Let pt — dim £%. - dim £%,_j for / = / 0 , /0 — 1, . . . , 1.
As the vectors u, ()1 , . . . , u,o<? are linearly independent modulo ^, 0 -i, the
matrix formed by the rows
of the 0
matrix 0 0"|Q
has linearly independent columns. Fo
^

simplicity of notation assume that the top q submatrix of is


nonsingular. Now Lemma 14.6.2 allows us to connect the vectors
with ±e respectively (the
sign -I- or - coincides with the sign of the nonzero real number det in
the set of all qt -tuples of vectors in £%. that are linearly independent modulo
for ; = 2, . . . , <7,o in the proof of
Lemma 14.1.2. Using an analogous rule for the choice of ytj at each step of
the procedure described in the proof of Lemma 14.1.2, we finish the proof
of Theorem 14.6.1.

Theorem 14.6.3
If the transformation A: has the only eigenvalues where a
and (3 are real and P^Q, then again the set Inv of all A-invariant
subspaces of fixed dimension p is connected.

Note that under the condition of Theorem 14.6.3, A does not have
odd-dimensional invariant subspaces (in particular, n is even), so we can
assume that p is even (see Proposition 12.1.1).

Proof. Consider A as the n x n real matrix that represents the trans-


formation A in the basis e and let Ac be the complexification
442 The Metric Spaces of Invariant Subspaces

of A; so By Theorem 12.3.1, there exists a one-to-one


correspondence between the ^'-invariant (/?/2)-dimensional subspaces J< in
and the y4-invariant p-dimensional subspaces 2£, which is given
by the formula

It is easily seen from the proof of Theorem 12.3.1 that this correspondence
is actually a homeomorphism (p: ln\p(A)^>lnv
Now the connectedness of Inv*(/4) follows from the connectedness of
Inv
heorem 14.1.3). D

Recall that as shown in Chapter 12, any /l-invariant subspace t£ admits


the decomposition

where are all the distinct real eigenvalues of A (if any) and
are all the distinct eigenvalues of A in the open upper
half plane. Using this observation, the proof of Theorem 14.2.1 yields the
following description of the connected components in the metric space
Inv (A) of all /1-invariant subspaces in ft" for the general transformation

Theorem 14.6.4
Let A,, . . . , A^ be all the different real eigenvalues of A, let their algebraic
multiplicities be ifr respectively, and let be all
the distinct eigenvalues of A in the open upper half plane with the algebraic
multiplicities <p,,. . . , <p,, respectively. Then for every (s + t)-tuple of integers
such that
the set dim 2£ = p\ \i is tne algebraic multiplicity of
A\y corresponding to A, for i - 1,. . . , s; \s+j is that corresponding to
for where connect-
ed component of Inv (A) and every connected component of Inv*(/4) has
this form. In particular, Inv*(^4) has exactly
connected components.

Finally, consider the isolated subspaces in Inv*(y4).

Theorem 14.6.5
Let A: be a transformation. Then an A-invariant subspace M is
isolated in lnv*(A) if and only if either
Exercises 443

for every real eigenvalue A0 of A with dim Ke and either


for any nonreal eigenvalue of
A with geometric multiplicity greater than 1.

Proof. Using the real analog of Lemma 14.3.2 (its proof is similar to
that of Lemma 14.3.2), we can assume that one of two cases holds: (a)
In the
first case Theorem 14.6.5 is proved in the same way as Theorem 14.3.1. In
the second case use Theorem 14.3.1 and the homeomorphism between
Inv*(^) and lnv given by formula (14.6.1).

14.7 EXERCISES

14.1 Supply the details for the proof of Lemma 14.3.4.


14.2 Prove that for a transformation A the sets of /4-hyperinvariant
subspaces and isolated ^-invariant subspaces coincide if and only if A
is diagonable. In this case an ^-invariant subspace is isolated if and
only if it is a root subspace.
14.3 What is the number of isolated invariant subspaces of the companion
matrix

14.4 Let A =diag Is the set of all reducing


/4-invariant subspaces dense in Inv(/4)?
14.5 Show that there exists a converging sequence of semiinvariant sub-
spaces for the matrix -/3(0) whose limit is not J 3 (0)-semiinvariant.
Chapter Fifteen

Continuity and
Stability of
Invariant Sub spaces

It has already been mentioned that computational problems for invariant


subspaces naturally lead to the problem of describing a class of invariant
subspaces that are stable after small perturbations. Only such subspaces can
be amenable to numerical computations. The analysis of stability of in-
variant subspaces is the main topic of this chapter. We also include related
material on stability of other classes of subspaces (notably, [A J5]-invariant
subspaces), and on stability of lattices of invariant subspaces. Different types
of stability are analyzed.

15.1 SEQUENCES OF INVARIANT SUBSPACES

In this section we consider the continuity of invariant subspaces for trans-


formations from <(7n into <|7". We start with the following simple fact.
Theorem 15.1.1
Let {Am}^m = l be a sequence of transformations from <p" into <p" that
converges to a linear transformation A: <J7"—» ((7". // Mm is an Am-invariant
subspace for m = 1, 2,. . . such that Mm—> M for some subspace M C <p",
then M is A invariant.

Proof. Let x G M. Then, by Theorem 13.4.2, there exists a sequence


such that xm G Mm for each m and limm_x Now

444
Sequences of Invariant Subspaces 445

As Am—> A, the norms are bounded; K for some positive


constant K independent of m. So as m—»°°,

As ^ m is Am invariant, we have AmxmE.Mm for each m, and Theorem


13.4.2 can be applied to conclude that Ax EM.

The continuity property of invariant subspaces expressed in Theorem


15.1.1 does not hold for the classes of coinvariant and semiinvariant
subspaces.

EXAMPLE 15.1.1. For m = 1, 2,. . . , let

The subspace Span{el} is Am coin variant for every m. (Indeed, Span is


a direct complement to SpanfeJ, which is Am invariant.) However,
Span{e,} is not A coinvariant, where is the limit of Am. The
same subspace SpanfeJ is also Am reducing, but not A reducing. D

EXAMPLE 15.1.2. For m = 1, 2 , . . . , let

The eigenvectors of Am are (up to multiplication by a scalar) el, mev + me2,


m2el - me2 + 2e3. Consequently, the subspace Span{el,e3} is Am semi-
invariant for all m (because Span{m^j + e2} is a direct complement to
Span{e1? e3}, which is an Am-invariant subspace). However, Span{el, e3} is
not A semiinvariant, where

is the limit of Am if m—»<». D


446 Continuity and Stability of Invariant Subspaces

Corollary 15.1.2
The set of A-invariant subspaces is closed; that is, if {Mm}*m = l is a sequence
of A-invariant subspaces with limit M:=\\mm_^M, then M is also A
invariant.

Simple examples show that the ,4-invariant subspaces Ker A and Im A


are not generally continuous in the sense of Theorem 15.1.1. Thus it may
happen that {Ker/l m }~ = 1 does not converge to Ker A and {lmAm}^ = }
does not converge to Im A as Am-* A. The following result shows that the
only obstruction to convergence of Ker Am and Im Am is the dimension.

Theorem 15.1.3
Let {Am}~m = l be a sequence of transformations on <p" that converges to a
transformation A on <p". Then Ker A contains the limit of every convergent
subsequence of the sequence {Ker Am}^, =l. In particular, if dim Ker Am =
dim Ker A for every m = 1, 2, . . . then Ker Am and Im Am converge, and

Proof. For k — 1, 2,. . . , let Ker Am converge to some M C <p". Then


for every x&M there exists a sequence xm EKer.4^ , such that xm —>x.
As Am xm =0, we have also Ax — 0, that is, x E Ker A.
Now let Im Am be a sequence converging to some Ji C <p". Then [see
formula (13.1.1)] *

Since A^-* A*, by the part of the theorem already proved,


(and so Jf D Im A.
Assume in addition that dim Ker Am = dim Ker A for all m = 1 , 2 , . . . . If
£ is a limit of a converging subsequence from the sequence {Ker yl m }^ =1 ,
then (see Theorem 13.1.2) dim «2* = dim Ker A. From the first part of the
theorem we know that Jz? C Ker A. So actually Z£ = Ker A. Hence Ker A is
a limit of every converging subsequence of {Ker Am}^, =l. It follows [using
the compactness of (f/(<p")] that Ker/4 m converges to Ker A. Further, we
also have dim Im Am — dim Im A for each m. A similar argument shows that
Im Am converges to Im A. D

Let M be an A -invariant subspace and O be an open set in <p. We


conclude this section by showing that the inclusion a(A\M)CCl is preserved
under small perturbations. Recall that 6 denotes the "gap" metric intro-
duced in Chapter 13.
Stable Invariant Subspaces: The Main Result 447

Theorem 15.1.4
Let M be an invariant subspace for the transformation A: <p"~* <P"> and let
ft C <p be an open set such that all eigenvalues of A\M are inside ft. Then for
transformations B on <p" and B-invariant subspaces Jf, cr(B\ v ) C ft as long as
is sufficiently small.

Proof. Arguing by contradiction, suppose that there exists a sequence


of transformations (Bm}^n = l on (p" and a sequence of subspaces
such that Jim is Bm invariant,

and For each m, let Am be an eigenvalue of outside

Since as the norms are bounded;


hence the sequence {A OT } m = 1 is bounded as well. Passing to subsequences in
formula (15.1.1), if necessary, we can assume that A OT —» A0 and xm-+x0 (as
m—»o°), for some A0 G (p and *„ G <{7". By Theorem 13.4.2, x0E:M, and
clearly JC ( ) T^O. As Ax() — AOJCO, A0 is an eigenvalue of v 4 | w , which, by
hypothesis, belongs to ft. But this contradicts Am ^ft for m = 1, 2, . . . . D

15.2 STABLE INVARIANT SUBSPACES: THE MAIN RESULT

Let A : <p" —» <p" be a transformation. An /l-invariant subspace ^V is called


stable if, given e > 0 , there exists a 5 > 0 such that ||fi — /1||<8 for a
transformation B: <p"—» <p" implies that B has an invariant subspace M with
0(./^, ^V ) < e. The same definition applies for matrices.
This concept is particularly important from the point of view of numerical
computation. It is generally true that the process of finding a matrix
representation for a linear transformation and then finding invariant sub
spaces can be performed only approximately. Consequently, the stable
invariant subspaces will generally be the only ones amenable to numerical
computation.
Suppose that N is a direct sum of root subspaces of A . The JV is a stable
invariant subspace for A. This follows from the fact that JV appears as the
image of a Riesz projector

where F is a suitable closed rectifiable contour in (p such that the eigenvalue


448 Continuity and Stability of Invariant Subspaces

A0 of A is inside F if 9t^(A) C Ji and outside F if ^o(A) n JV = {0} (see


Proposition 2.4.3). Further, the function F(A) = (/A - A) -1 is a continuous
function of A on F. This follows from the formula

where Adj(/A - ^4) is the matrix of algebraic adjoints of /A — A, and from


the continuity of det(/A — A) and Adj(/A — ^4) as functions of A. Since F is
compact, the number K is well defined. Now an
l
transformation fl:(p"—><p" with A has the property that
/A - B is invertible for all A £ F. [Indeed, for A G F we have

and since the invertibility of follows.]


Moreover

which implies that H j R / j - K ^ H is arbitrarily small if ||y4-B|| is small


enough.
Theorem 13.1.1 shows that

so 0(Jf, M) is small together with \\RB - RA\\.


However, it will turn out that not every stable invariant subspace is
spectral. On the other hand, if dim K e r ( A y / — ^ 4 ) > 1 and .A" is a one-
dimensional subspace of Ker(A ; 7 — A), it is intuitively clear that a small
perturbation of A can result in a large change in the gap between invariant
subspaces. The following simple example provides such a situation. Let A be
the 2 x 2 zero matrix, and let JV = Span| C <p2 Clearly, N is A in-
variant, but N is unstable. Indeed, let B = diag[0, e], where e 7^0 is close
enough to zero. The only one-dimensional /^-invariant subspaces are Ml =
Spanj \\ and M2 — Spanj \\, and both are far from JV: computation
shows that

The following theorem gives the description of all stable invariant


subspaces.

Theorem 15.2.1
Let A,, . . . , \r be the different eigenvalues of the transformation A. A
subspace N of (f"1 is A invariant and stable if and only if
Stable Invariant Subspaces: The Main Result 449

where for each j the space Nt is an arbitrary A-invariant subspace of £%A (A)
if dim Ker( A / - A) = 1; if dim Ker( A y / - A) * 1 then either ^ = {0}' or

Comparing this theorem with Theorem 14.3.1, we obtain the following


important fact: an A-invariant subspace Jf is stable if and only if N is isolated
in the metric space lnv(A) of all A-invariant subspaces.
An interesting corollary is easily detained from Theorem 15.2.1.

Corollary 15.2.2
All invariant subspaces of a transformation A: <p"—» <p" are stable if and only
if A is nonderogatory [i.e., dim Ker(A - A 0 7) = 1 for every eigenvalue A0
of A}.

The proof of Theorem 15.2.1 will be based on a series of lemmas and an


auxiliary theorem that is of some interest in itself. We will also take
advantage of an observation that follows immediately from the definition of
a stable subspace: the ^-invariant subspace N is stable if and only if the
5-45-1-invariant subspace SN is stable. Here 5: (p w —»(p w is an arbitrary
invertible transformation.
First we present results leading to the proof of Theorem 15.2.1 for the
case when A has only one eigenvalue. To state the next theorem we need
the following notion: a chain of A-invariant sub
spaces is said to be complete if dim -Mj—j for / ' = ! , . . . , « — 1.

Theorem 15.2.3
Given e > 0, there exists a 8 > 0 such that the following holds true: if B is a
transformation with and is a complete chain of B-
invariant subspaces, then there exists a complete chain {Jij} of A-invariant
subspaces such that 6(Nj, M j ) < e for / = ! , . . . , « — 1.

In general, the chain (M;} for A will depend on the choice of B. To see
this, consider

where i; E (p. Observe that for v ^= 0 the only one-dimensional invariant


subspace of Bv is Span{e2}, and for B'v, i>^0, the only one-dimensional
invariant subspace is Span{e,}.
Proof. Assume that the conclusion of the theorem is not correct. Then
there exists an e > 0 with the property that for every positive integer m there
exists a transformation Bm satisfying \\Bm - A\\ < 1 Im and a complete chain
{Mmj} of Bm-invariant subspaces such that for every complete chain {Jif}
of ^-invariant subspaces
450 Continuity and Stability of Invariant Subspaces

Denote by Pmj the orthogonal projector on Mmj.


Since there exists a subsequence of the sequence of
positive integers and transformations P,,. . . , Pn _, on (p", such that

Observe that P,,. . . , Pn_i are orthogonal projectors. Indeed, passing to the
limit in the equalities P m . y = (P m ., y ) 2 , we find that P; = P]. Further,
equation (15.2.4) combined with P*m. y = Pm . implies that P* = P;; so P;. is
an orthogonal projector (see Section 1.5).
Further, the subspace jVy = Im Py has dimension y, / = 1, . . . , n — 1. This
is a consequence of Theorem 13.1.2.
By passing to the limits it follows from BmPmj = PmjBmPmi that AP- =
PjAPj. Hence ^ is A invariant. Since Pmj = PmJ+lPmj we have Py = Pj+lPjt
and thus JV / -C^ / + 1 . It follows that JVy. is a complete chain of A-invariant
subspaces. Finally, 0(^V), M But this contradicts
(15.2.3), and the proof is complete. D

Corollary 15.2.4
If A has only one eigenvalue, A 0 , say, and if dim Ker( A07 — A) = 1, then each
invariant subspace of A is stable.

Proof. The conditions on A are equivalent to the requirement that for


each 1 < j < n — 1 the operator A has only one /-dimensional invariant
subspace and the nontrivial invariant subspaces form a complete chain (see
Section 2.5). So we may apply the previous theorem to obtain the desired
results. D

Lemma 15.2.5
If A has only one eigenvalue, A0 say, and if dim Ker(A 0 7— A) >2, then the
only stable A-invariant subspaces are {0} and <p".

Proof. Let J = diag[ ] be the Jordan form for A. As


dim Ker( A 0 7 — A) > 2, we have s > 2. By similarity, it suffices to prove that /
has no nontrivial stable invariant subspace.
For € G (p, define the transformation Tf on <p" by setting

otherwise
and put Bf = J+Te. Then || Bf - J \\ tends to 0 as e -» 0. For e * 0 the linear
transformation Bf has exactly one/-dimensional invariant subspace, namely,
Proof of Theorem 15.2.1 in the General Case 451

,Y} = Span{e,,. . . ,e y ). Here 1 < / < £ - ! . It follows that Jff is the only
candidate for a stable /-invariant subspace of dimension j.
Now consider / = diag[Jk ( A 0 ) , . . . , /* 2 (A 0 ), / A | (A 0 )j. Repeating the
argument of the previous paragraph for / instead of /, we see that Nf is the
only candidate for a stable /-invariant subspace of dimension /. But / =
575 ~ l , where S is the similarity transformation that reverses the order of the
blocks in /. It follows that SNj is the only candidate for a stable /-invariant
subspace of dimension /'. As s > 2, however, we have SNj ^ Nj for 1 < / <
k — 1, and the proof is complete. D

Corollary 15.2.4 and Lemma 15.2.5 together prove Theorem 15.2.1 for
the case when A has one eigenvalue only.

15.3 PROOF OF THEOREM 15.2.1 IN THE GENERAL CASE

The proof of Theorem 15.2.1 in the general case is reduced to the case of
one eigenvalue considered in the preceding section. Recall the notion of the
minimal opening

between subspaces M and ^(Section 13.3). Always 0<rj(^, N) < 1, except


when both M and M are the zero subspace, in which case ri(M, N) = °°.
Note that j](M, Jf) >0 if and only if M D N = {0} (Proposition 13.2.1). We
need to apply the following fact.

Proposition 15.3.1
Let be a sequence of subspaces in for
some subspace !£, then

for every subspace N.

Indeed, if both t£ and M are nonzero, then also Mm are nonzero (at least
for m large enough; see Theorem 13.1.2). Then (15.3.1) follows from
formula (13.3.2). If at least one of % and jV is the zero subspace, then
(15.3.1) is trivial.
Let us introduce some terminology and notation that will be used in the
next two lemmas and their proofs. We use the shorthand Am—* A for
lim^^ || Am - A\\ — 0, where Am, m — 1, 2 , . . . , and A are transformations
on (p". Note that A m —> A if and only if the entries of the matrix represen-
tations of Am (in some fixed basis) converge to the corresponding entries of
452 Continuity and Stability of Invariant Subspaces

A (represented as a matrix in the same basis). We say that a simple


rectifiable contour F splits the spectrum of a transformation T if tr(T) D <£ =
0. In that case we can associate with T and F the Riesz projector

The following observation is used subsequently. If T is a transformation


for which F splits the spectrum, then F splits the spectrum for every
transformation S that is sufficiently close to T (i.e., \\S - T\\ is close enough
to zero). Indeed, this follows from the continuity of eigenvalues of a linear
transformation as functions of this transformation.

Lemma 15.3.2
Let F be a simple rectifiable contour that splits the spectrum of T, let TQ be the
restriction of T to Im P(T; F), and let N be a subspace of Im P(T; F). Then Ji
is a stable invariant subspace for T if and only if jV is a stable invariant
subspace for T().

Proof. Suppose that ^Vis a stable invariant subspace for TQ, but not for
T. Then one can find an e > 0 such that for every positive integer m there
exists a transformation Sm such that

From (15.3.2) it is clear that Sm—> T. By assumption, F splits the spectrum


of T. Thus, for m sufficiently large, the contour F will split the spectrum of
Sm. Moreover, P(Sm; F)-» P(T; F), and hence ImP(5 m ;F) tends to
Im P(T; F) in the gap topology. But then, for m sufficiently large,

(cf. Theorem 13.1.3).


Let Rm be the angular transformation of Im P(Sm; T) with respect to
P(T; F). Here, as in what follows, m is supposed to be sufficiently large. As
P(Sm,T)^ P(T-r), we have Rm-+Q. Put

where the matrix representation corresponds to the decomposition


Proof of Theorem 15.2.1 in the General Case 453

Then Em is invertible with inverse

Also, Em Im P(T- F) = Im P(Sm- F), and £„,->/.


Put Tm = EmlSmEm. Then Tm Im P(T; F)C Im P(T; F) and T^T. Let
7m be the restriction of Tm to Im P(T; F). Then Tm —» ro. As ^V is a stable
invariant subspace for 70, there exists a sequence {^Vm} of subspaces of
Im P(T; T) such that Nm is rifl(j invariant and e(Jfm, ^V)->0. Note that JVW is
also 7^ invariant.
Now put Mm = EmNm. Then Mm is an invariant subspace for Sm. From
Em—> I one can easily deduce that 0(J/ m , JV m )—»0. Together with
S(Mm, ^)-^0, this gives 0(Mm, ^V)-*0, which contradicts (15.3.3).
Next assume that jV C Im P(T\ F) is a stable invariant subspace for 7\ but
not for TQ. Then one can find an e >0 such that, for every positive integer
m, there exists a transformation Sm on Im P(T\ F) satisfying

and

Let T, be the restriction of T to Ker P(T; F) and write

where the matrix representation corresponds to the decomposition (15.3.4).


From (15.3.5) it is clear that Sm—> T. Hence, as jV is a stable invariant
subspace for T, there exists a sequence {^Vm} of subspaces of (p" such that
Mm is 5m invariant and B(Jim, M)-*0. Put Mm = P(T; r)JV w . Since P(7; F)
commutes with 5OT, then Mm is an invariant subspace for Sm . We now prove
that 6(-Mm, jV)—»0, thus obtaining a contradiction with (15.3.6).
Take y^Mm with ||_y|| < 1, and let xE.Jfm be such that y = P(T\ T)x.
Then

By Proposition 15.3.1, implies that


454 Continuity and Stability of Invariant Subspaces

where So, for m sufficiently large,


Together with (15.3.7), this gives

for m sufficiently large. Using this inequality, we obtain

and

So

for m sufficiently large. We conclude that Q(Mm, N)—>0, and the proof is
complete. D

Lemma 15.3.3
Let jV be an invariant subspace for T, and assume that the contour F splits the
spectrum of T. If M is stable for T, then P(T;T)Jf is a stable invariant
subspace for the restriction T0 of T to Im P(T; F).

Proof. It is clear that M = P(T\ F)jV is T0 invariant.


Assume that M is not stable for T0. Then M is not stable for 7", either, by
Lemma 15.3.2. Hence there exist e > 0 and a sequence {Sm} such that
Sm —» T and

As N is stable for T, one can find a sequence of subspaces {Nm} such


that SmJimCJ^m and 0(Jfm,Jf)-*Q. Further, since F splits the spectrum of
T and Sm —» T, the contour F will split the spectrum of Sm for m suf-
Perturbed Stable Invariant Subspaces 455

ficiently large. But then, without loss of generality, we may assume that F
splits the spectrum of each Sm . Again using Sm —> T, it follows that

Let 3? be a direct complement of M in <p". As Q(Nm, .yV)-»0, we have


<p" = 3? + Jfm for m sufficiently large (Theorem 13.1.3). So, without loss of
generality, we may assume that (p" = 3? + Nm for each m. Let Rm be the
angular transformation of Nm with respect to the projector of ((7" along 3£
onto JV, and put

where the matrix corresponds to the decomposition (f1" = 2£. 4- N. Note that
Tm = Em*SmEm leaves Jf invariant. Because Rm-*Q, we have £ M -»7, and
sor m -»r.
Clearly, F splits the spectrum of T\v. As Tm-* T and JV is invariant for
Tm, the contour F will split the spectrum of Tm ^ too, provided m is
sufficiently large. But then we may assume that this happens for all m. Also,
we have

Hence in the gap topology.


Now consider !£m - EmMm. Then ££m is an S m -invariant subspace. From
Em^I it follows that 6(£m,Mm)-+Q. This, together with e(Mm,M)-*Q,
gives 0(<^m, M)^>Q. So we arrive at a contradiction to (15.3.8) and the
proof is complete.
After this long preparation we are now able to give a short proof of
Theorem 15.2.1.
Proof of Theorem 15.2.1. Suppose that .TV is a stable invariant subspace
for A. Put Then N = ^ + • • • + Nr. By Lemma 15.3.3,
the space JV^ is a stable invariant subspace for the restriction Aj of A to
91^ (A). But, by Lemma 2.1.3, Aj has one eigenvalue only, namely, A ; . So
we may apply Lemma 15.2.5 to prove that jV; has the desired form.
Conversely, assume that each Nj has the desired form, and let us prove
that ^V = JVj + • • • + Nr is a stable invariant subspace for A. By Corollary
15.2.4, the space ^ is a stable invariant subspace for the restriction A- of A
to £%A (A). Hence we may apply Lemma 15.3.2 to show that each Jfj is a
stable invariant subspace for A. But then the same is true for the direct sum
jv = jv + ••• + jf.

15.4 PERTURBED STABLE INVARIANT SUBSPACES

In this section we show that the stability of an /1-invariant subspace M is


preserved under small perturbations of M and A. This is true also when we
restrict our attention to the intersection of M and a fixed spectral subspace
456 Continuity and Stability of Invariant Subspaces

of A. To state this result precisely, denote by $ln(A) the spectral subspace


of A (the sum of root subspaces for A) corresponding to those eigenvalues
of A that lie in an open set ft.

Theorem 15.4.1
Let A : <p" —> <p" be a transformation, and let ft C <p be an open set whose
boundary does not intersect a-(A). Assume that M is an A-invariant subspace
for which the intersection M D 3ft.n(A) is stable (with respect to A}. Then any
B-invariant subspace jV has the property that N D &ln(B) is stable (with
respect to B) provided \\B-A\\ and 0(M, JV) are small enough.

The particular case of Theorem 15.4.1 when ft = <p is especially


important.

Corollary 15.4.2
Let M be a stable A-invariant subspace. Then there exists an € > 0 such that
any B-invariant subspace N is stable provided

We need the following lemma for the proof of Theorem 15.4.1.

Lemma 15.4.3
Let A and ft be as in Theorem 15.4.1, and let M be an A-invariant subspace.
Then for every e > 0 there exists a 8 > 0 such that every B-invariant sub-
space M with ||B — A\\ + 0(M, N)<8 satisfies the inequality

Proof. Arguing by contradiction, assume that there is a sequence of


transformations {Bm}^, =l and a sequence of subspaces {•'Vm}~ = 1 such that
invariant for each
w, but

where e does not depend on m.


Denote by P n (# m ) [resp. P^A)]tne R'esz projector onto ^ ft (B m ) [resp.
onto £%n(^4)]. By Lemma 13.3.2, for m large enough there exists an
invertible transformation Sm: <p" —> <p" such that

and, moreover,
Perturbed Stable Invariant Subspaces 457

Here C,, C2, . . . are positive constants that depend on A only. Actually,
one can take S defined as follows:

Put Bm = SmlBmSm and jfm = Sm^m (so that Jfm is Bm invariant). Let P*
(resp. Pjf ) be the orthogonal projector onto M (resp. Nm). As SmlPx Sm is
a projector onto -A"m (not necessarily orthogonal), we have

where the first inequality follows from (13.1.4). Hence

It is easily seen that and Ker P n (/* m ) = Ker Pn(^4) (for m


large enough). Consequently

Since also

Theorem 13.4.2, together with (15.4.2), implies that

(cf. the proof of Lemma 14.3.2). Now, as in (15.4.2), we have

which contradicts (15.4.1) in view of (15.4.4).

Proof of Theorem 15.4.1. Consider first the case ft = <p (i.e.,


where n is the size of A). Arguing by contradiction, assume that the
statement of the theorem is not true (for Cl = <p). Then there exist an e >0
and a sequence [Bm}^J==l of transformations on <p" converging to A such that
Q{M, N) > € for every stable Bm-invariant subspace jV, m = 1, 2,. . . Since
M is stable and Bm—> A, there exists a sequence {^m}^ = 1 of subspaces in
<p" with BmMmC.Mm for each w and 6(Mm, M)^>$. For m sufficiently
large we have Q(Mm, M)<e, and hence the # m -invariant subspace Mm is
not stable.
458 Continuity and Stability of Invariant Subspaces

Let 3? be a direct complement of M in <p". We may assume that 3? is also a


direct complement to each Mm (Theorem 13.1.3). Let Rm be the angular
transformation of Mm with respect to the projector onto M along '2L. Then
/?-.-*•<). Write

where the matrix representation is taken with respect to the decomposition


<p" = % + M. Then Em is invertible, EmM = Mm, and Em-^> L Put Am =
E~^EmEm. Obviously, Am—=> A and AmM CM. Note that M is not stable
for
With respect to the decomposition <p" = M + 3£, we write

Then Um—>U and W m —» W. Since ^ is not stable for Am, Theorem 15.2.1
ensures the existence of a common eigenvalue \m of Um and Wm such that

Now | A w | < ||Um\\ and {t/m} converges to U. Hence the sequence {A m }


is bounded. Passing, if necessary, to a subsequence, we may assume
that A m -»A 0 for some A 0 E < p . But then \ml - Um-> \0I - U and
\ml — Wm —» A () / — W. It follows that A0 is a common eigenvalue of U and W.
Again applying Theorem 15.2.1, we see that A() is an eigenvalue of
geometric multiplicity one: dim Ker( A () 7 - A)= I . So there exists a nonzero
( n - l ) x ( r t - l ) minor in A,,/ - A. Then, for m large enough, the corres-
ponding minor in A m / - Am is also nonzero, a contradiction with (15.4.5).
Now consider the general case of Theorem 15.4.1. It is seen from the
proof of Lemma 15.4.3 that we can assume that B satisfies 5?n(#) = $tn(A).
But then we can apply the part of Theorem 15.4.1 already proved with <p",
A and B replaced by 9in(A), A\jf ( A ) and B\.A ( B ) , respectively.

Now let us focus attention on the spectral A -invariant subspaces, that is,
sums of root subspaces for A (the zero subspace will also be called spectral).
Theorem 15.2.1 shows that each spectral invariant subspace is stable. The
converse is not true in general: every invariant subspace of a unicellular
transformation is stable, but the only spectral subspaces in this case are the
trivial ones.
For the spectral subspaces, an analog of Theorem 15.4.1 holds.

Theorem 15.4.4
Let A and O be as in Theorem 15.4.1. Assume that M is an A-invariant
subspace for which M H ^i{l(A) is a spectral invariant subspace for A. Then
Lipschitz Stable Invariant Subspaces 459

any B-invariant subspace X has the property that N n £%n(B) is spectral (as a
B-invariant subspace) provided \\B — A\\ + 6(M, N) is small enough.

Proof. As in the proof of Theorem 15.4.1, the general case can be


reduced to the case O = <p. So assume £1 = <p.
Since every invariant subspace is the sum of its intersections with the root
subspaces, it follows that an A -invariant subspace Z£ is spectral if and only if
there is an /t-invariant direct complement ££' to !£ such that a(A\^)r\
°"(^U ) ~ 0- Let A be an open set containing o-(A\ u), and let A' be an open
set disjoint with A that contains all other eigenvalues of A (if any). Then
cr(A\ y . ) C A' for an ^-invariant direct complement M' to M (actually, M' is
the spectral A -invariant subspace corresponding to the eigenvalues in A').
By Theorem 15.1.4, any B-invariant subspace jV satisfies cr(B| v )CA
provided \\B - A\\ + 9(M, N) is small enough. On the other hand, by
Theorems 15.2.1 and 15.1.4 there exists a B-invariant subspace N' such that
o-(/?| v ,)CA' and 6(M',N') is as small as we wish provided ||B-v4|| is
small enough. As X' is a direct complement to Ji (Theorem 13.1.3) and
c r ( B \ H , ) Pi cr(B\ N -,) = 0, it follows that JV is spectral. D

The proof of Theorem 15.4.4 shows that if M is a spectral ^-invariant


subspace with cr(y4|^)cn, where O C <p is an open set, then for any
B-invariant subspace jV such that ||B — A\\ + 0(M, JV) is small enough, we
also have

15.5 LIPSCHITZ STABLE INVARIANT SUBSPACES

In this section we study a stronger version of stability for invariant sub-


spaces. A subspace M C <p" that is invariant for a transformation
A: (p"—»<p" is said to be Lipschitz stable (with respect to A) if there exist
positive constants K and e such that every transformation B: (p" —» <p" with
| | f i - X | | < e has an invariant subspace N with
Clearly, every Lipschitz stable subspace is stable; the converse is not true in
general.
The following theorem decribes Lipschitz stability.

Theorem 15.5.1
For a transformation A and an A-invariant subspace M the following
statements are equivalent: (a) M is Lipschitz stable; (b) M — {0} or else
M = £%A (A) + • • • + £%A (A) for some different eigenvalues A,, . . . , \r of A;
in other words, M is a spectral A-invariant subspace; (c) for every suf-
ficiently small e > 0 there exists a 8 > 0 such that any transformation B with
\\A — B\\ < 8 has a unique invariant subspace N for which 0(M, N) < €.
460 Continuity and Stability of Invariant Subspaces

The emphasis in (c) is on the uniqueness of Jf\ if the word "unique" is


omitted in (c), we obtain the definition of stability of M.

Proof. First, arguing as in the proof of Lemma 15.3.2, one shows


that M is a Lipschitz stable A -in variant subspace if and only if each
intersection M D £% ^ (A) is Lipschitz stable (with respect to the restriction
A\a (A)} f°r J = 1> • • • > 5 » where /A, , . . . , /u^ are all the distinct eigenvalues
of A.
Assume that (c) holds but (b) does not. Then M is a stable subspace, and
Theorem 15.2.1 ensures that for some eigenvalue A() of A with
dim Ker( A07 - A) = 1 we have

in a Jordan basis for A in and define the transformation B(a),


where 0 < a < 1 , as follows:

B(a) — A on all root subspaces of A other than £%A (^4). Then B(a)—> A as
a-*0. Let p = dim ffl^A); q = dim Ker M n ^Ao(X) ; so 0<q<p. For
brevity, denote the right-hand side of (15.5.1) by K(a). To obtain a
contradiction, it is sufficient to show that for a small enough the number of
g-dimensional ^(a)-invariant subspaces N such that B(M fl £%A (A), Jf)<
C, a 'lp is exactly ( ) > 1 (we denote by C, , C2 , . . . positive constants that
depend on p and q only).
Let us prove this assertion. The matrix K(a) has p different eigenvalues
el , . . . , ep , which are the p different roots of the equation xp — a. The
corresponding eigenvectors are y, = (1, e,, . . . , ef" 1 ), / = 1, . . . , p. The
only ^-dimensional K(a)- invariant subspaces are those spanned by any q
vectors among y , , . . . , yp . Take such a subspace «AT and suppose for
notational convenience that Jf = Span{ y,, . . . , yq}. The projector QN onto
M along the subspace spanned by eg+l, . . . , e is given by the formula
Lipschitz Stable Invariant Subspaces 461

where Yq (resp. Y p _ q ) is the q x q [resp. (p — q) x <y] matrix formed by the


first q (resp. last p - q) rows of the matrix [y, _y2 • • • yq}. As Fg is a
Vandermonde matrix, det y^ = n i < | . <>S9 (e ; .-e ( -)^0 (cf. Example 2.6.4).
Let Zq - Adj Yq be the matrix of algebraic adjoints to the elements of Yq,
so that y"1 = l/(det Yq)Zq. From the form of Yq it is easily seen that
||ZJ<CX /P > where r = 1 + 2 + • • • + (q -2) = £ ( g - l ) ( ? - 2 ) . Further,
|det yj = C 3 a v/p , where 5 = \ q(q - 1) is the number of all pairs of integers
( i , / ) such that l<i<j<q. As ||y p _J < C 4 a* /p , it follows that
l l y ^ ^ y ^ H < Csa(r+'+q)lp = C 5 a l / p . Consequently,

where As Q (resp. Q v ) is a projector onto M n <%A (.4)


(resp. onto N) we have

[see (13.1.4)]. Combining this inequality with (15.5.2), we find that


small enough. Since the number
of ^-dimensional K(a) -invariant subspaces N is exactly the required
assertion is proved.
Conversely, assume that (b) holds but (c) does not. Since M is a stable
subspace (by Theorem 15.2.1), this implies the existence of a sequence
{Bm}*l = } and the existence of two different Bm -invariant subspaces jVlm and
.yV2/w such that \\Bm - A\\<(\lm) and

for / = 1 and 2. Let F (resp. A) be a closed simple rectifiable contour such


that cr(A) D F = 0 [resp. cr(A) fl A = 0] and A,, . . . , \r are the only eigen-
values of A inside F (resp. outside A). Letting <3il .(C) be the image of the
Riesz projector (27r/)~' J, (A/ - C)~' JA, where the matrix C has no
eigenvalues on F, we have M = $ll(A). Since e(^(Bm),&tr(A))^0 as
m —>°o, we find in view of (15.5.3) that

Now ^ combining this with


(15.5.4), it is easily seen that Jfim n &±(Bm) = {0}, at least for large m.
(Indeed, argue by contradiction and use the properties that the set of all
subspaces in (p" is compact and that the limit of a converging sequence of
nonzero subspaces is again nonzero.) So Nim C 9tv(Bm). But (15.5.3) implies
that (for large m) dim Jfim = dim M = dim 9lv(Bm). Hence Jfim = 9lr(Bm),
462 Continuity and Stability of Invariant Subspaces

/ = 1,2 (for large m), contradicting the assumption that Nlm and M2m are
different.
Now we prove the equivalence of (a) and (b). In view of Theorem 15.2.1,
we have to check that the only Lipschitz stable invariant subspaces of the
Jordan block

are the trivial spaces {0} and <p". For a > 0, let

For k = I , . . . , n — 1, the only A>dimensional /-invariant subspace Nk is


spanned by the first k unit coordinate vectors. Denote by Pk the orthogonal
projector onto Nk, and let Pk a denote the orthogonal projector onto a
^-dimensional 7 a -invariant subspace Nk ( ! < & < « - ! ) . We have
where So

Now use |e| = Va. One finds that for a sufficiently small

On the other hand, ||/ — Ja \\ = a. But then it is clear that for 1 < k < n — 1
the space Nk is not a Lipschitz stable invariant subspace of /, and thus / has
no nontrivial Lipschitz stable invariant subspace.

The property of being a Lipschitz stable subspace is stable in the


following sense: let M be an ^-invariant Lipschitz stable subspace. Then any
/^-invariant subspace Ji is Lipschitz stable (with respect to B} provided
\\B - A\\ and 0(M, JV) are small enough. In view of Theorem 15.5.1, this is
simply a reformulation of Theorem 15.4.4.
It follows from Theorem 15.5.1 that a transformation A: £"—»• <p" has
Stability of Lattices of Invariant Subspaces 463

exactly 2r different Lipschitz stable invariant subspaces, where r is the


number of distinct eigenvalues of A.

15.6 STABILITY OF LATTICES OF INVARIANT SUBSPACES

In this section we extend the notion of stable invariant subspaces to the


lattices of invariant subspaces.
Recall that a set A of subspaces in <p" is called a lattice if M , N G A
implies M + Jf G A and M D JV E A. Two lattices A and A' of subspaces in
<J7" are isomorphic if there exists a bijective map S: A —»A' such that
S(M H JV) = SM n SJV and S(J* + JV) = £/« + SN for any two members M
and jV of A. In this case 5 is called an isomorphism of A onto A'.
Let A be a lattice of (not necessarily all) invariant subspaces of a
transformation A: $"—> <p". The lattice A is called stable if for every e > 0
there exists a 5 > 0 such that, for any transformation Z? : <p" —» <p" with
||/4-Z?||<5, there exists a lattice A' of (not necessarily all) fi-invariant
subspaces that is isomorphic to A and satisfies sup^ eA 6(3?, S(£)) < e for
some isomorphism S: A—» A'. If A consists of just one subspace, we obtain
the definition of a stable invariant subspace.

Theorem 15.6.1
A lattice A of A-invariant subspaces is stable if and only if it consists of stable
A-invariant subspaces.

Proof. Without loss of generality we can assume that {0} and C^n belong
to Λ.
Suppose first that Λ contains an A-invariant subspace M that is not stable.
Then there exist an ε_0 > 0 and a sequence of transformations {B_m}_{m=1}^∞
tending to A such that θ(M, N) > ε_0 for any B_m-invariant subspace N and
any m. Obviously, Λ cannot be stable.
Assume now that every member of Λ is a stable A-invariant subspace. As
the number of stable A-invariant subspaces is finite (by Theorem 15.2.1),
the lattice Λ is finite. Let M_1, . . . , M_p be all the elements in Λ. Denote by
λ_1, . . . , λ_r the distinct eigenvalues of A ordered so that

Then

where N_{ij} ⊆ ℛ_{λ_j}(A), and N_{ij} is equal either to {0} or to ℛ_{λ_j}(A) for
j = s + 1, . . . , r. Let Γ_j (j = 1, . . . , r) be a small circle around λ_j such that
λ_j is the only eigenvalue of A inside or on Γ_j. There exists a δ_0 > 0 such that
all transformations B: C^n → C^n with ||B − A|| < δ_0 have all their eigenvalues
inside the circles Γ_j; for such a B denote by ℛ_j(B) the sum of root subspaces
of B corresponding to the eigenvalues inside Γ_j. Now put

where, for j = s + 1, . . . , r, N′_{ij} is equal to {0} or to ℛ_j(B) according as N_{ij}
is {0} or ℛ_{λ_j}(A); for j = 1, . . . , s we take N′_{ij} as follows. Let {0} = L_0 ⊂
L_1 ⊂ · · · ⊂ L_m = ℛ_j(B) be a complete chain of B-invariant subspaces in
ℛ_j(B); then N′_{ij} is equal to that subspace L_k whose dimension coincides
with the dimension of N_{ij}. Clearly, M′_i is B invariant. Further, it is clear
from the construction that M′_i ⊆ M′_k if and only if M_i ⊆ M_k. Using
Theorem 15.2.3, it is not difficult to see that, given ε > 0, there exists a
positive δ ≤ δ_0 such that max_{1≤i≤p} θ(M_i, M′_i) < ε for any transformation
B: C^n → C^n with ||B − A|| < δ. Putting Λ′ = {M′_1, . . . , M′_p}, we find that Λ
is stable. □
The case when the lattice Λ is a chain is of special interest for us.
We say that a chain L_1 ⊂ · · · ⊂ L_r of A-invariant subspaces is stable if for
every ε > 0 there exists a δ > 0 such that any transformation B: C^n → C^n
with ||B − A|| < δ has a chain L′_1 ⊂ · · · ⊂ L′_r of invariant subspaces such that
θ(L_i, L′_i) < ε for i = 1, . . . , r. It follows from Theorem 15.6.1 that a chain
of A-invariant subspaces is stable if and only if each member of this chain is
a stable A-invariant subspace.
The notion of Lipschitz stability of a lattice of invariant subspaces is
introduced naturally: a lattice Λ of (not necessarily all) A-invariant sub-
spaces is called Lipschitz stable if there exist positive constants ε and K such
that every transformation B with ||B − A|| < ε has a lattice Λ′ of invariant
subspaces that is isomorphic to Λ and satisfies

where S runs through the set of all isomorphisms of Λ onto Λ′. Obviously,
every Lipschitz stable lattice of invariant subspaces is stable. We leave the
proof of the following result to the reader.

Theorem 15.6.2
A lattice Λ of A-invariant subspaces is Lipschitz stable if and only if Λ
consists only of spectral subspaces for A.

15.7 STABILITY IN METRIC OF THE LATTICES OF INVARIANT SUBSPACES

If the lattice Λ consists of all A-invariant subspaces, then a different notion
of stability (based on the distance between sets) is also of interest. To
introduce this notion, we start with some terminology.

Given two sets X and Y of subspaces in C^n, the distance between X and Y
is introduced naturally:

dist(X, Y) = max{ sup_{M∈X} inf_{N∈Y} θ(M, N), sup_{N∈Y} inf_{M∈X} θ(M, N) }

Borrowing notation from set theory, denote by 2^Z the set of all subsets of a
set Z. Then dist(X, Y) is a metric on 2^{𝒢(C^n)} [as before, 𝒢(C^n) represents the
set of all subspaces in C^n]. Indeed, the only nontrivial property that we have
to check is the triangle inequality:

dist(X, Z) ≤ dist(X, Y) + dist(Y, Z)

for any subsets X, Y, Z of 𝒢(C^n). For M ∈ X, N ∈ Y, L ∈ Z we have

θ(M, L) ≤ θ(M, N) + θ(N, L)                    (15.7.1)

Fix M and ε > 0, and take N ∈ Y in such a way that

θ(M, N) ≤ inf_{N′∈Y} θ(M, N′) + ε

Taking the infimum in (15.7.1) with respect to L, we obtain

inf_{L∈Z} θ(M, L) ≤ inf_{N′∈Y} θ(M, N′) + ε + dist(Y, Z)

Now take the supremum with respect to M, and, from the resulting
inequality with the roles of X and Z interchanged, it follows that

dist(X, Z) ≤ dist(X, Y) + dist(Y, Z) + ε

As ε > 0 was arbitrary, the triangle inequality follows.
Note also that dist(X, Y) ≤ 1 for any X and Y.
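In computations, the gap θ and the distance just introduced can be evaluated from orthogonal projectors; a minimal sketch (the function names and the use of QR factorization to obtain orthonormal bases are our own choices):

```python
import numpy as np

def gap(M, N):
    """Gap theta(M, N) = ||P_M - P_N|| between the column spans of M and N,
    where P_M, P_N are the orthogonal projectors and ||.|| the spectral norm."""
    QM, _ = np.linalg.qr(M)
    QN, _ = np.linalg.qr(N)
    return np.linalg.norm(QM @ QM.conj().T - QN @ QN.conj().T, 2)

def dist_sets(X, Y):
    """dist(X, Y) for finite sets X, Y of subspaces, each subspace given
    by a matrix whose columns span it."""
    d1 = max(min(gap(M, N) for N in Y) for M in X)
    d2 = max(min(gap(M, N) for M in X) for N in Y)
    return max(d1, d2)

# Example: two one-dimensional subspaces of C^2.
e1 = np.array([[1.0], [0.0]])
v = np.array([[1.0], [0.1]])
print(gap(e1, v))                 # small: the spans are close
print(dist_sets([e1], [v]))
```

Representing each subspace by a spanning matrix keeps the definitions above literal: the gap is the spectral norm of the difference of orthogonal projectors, and dist is the larger of the two sup-inf quantities.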
The lattice Inv(A) of all invariant subspaces of a transformation
A: C^n → C^n is called stable in metric if for every ε > 0 there exists a δ > 0
such that the lattice Inv(B) of any transformation B: C^n → C^n with
||B − A|| < δ satisfies dist(Inv(B), Inv(A)) < ε. The following theorem
describes all transformations with stable lattices of invariant subspaces.

Theorem 15.7.1
Inv(A) is stable in metric if and only if A is nonderogatory, that is,
dim Ker(A − λ_0 I) = 1 for every eigenvalue λ_0 of A.

Proof. Assume that A is derogatory. Then obviously Inv(A) is an
infinite set. Without loss of generality we can assume that A is a matrix in
Jordan form:

Here λ_1, . . . , λ_r are (not necessarily distinct) eigenvalues of A. Let

be sequences of numbers tending to zero as m → ∞ and such that

for any i ≠ j and any positive integer m. (Such sequences can obviously be
arranged.) Letting

we obtain ||A_m − A|| → 0 as m → ∞. Moreover, the number of A_m-invariant
subspaces is exactly (k_1 + 1) · · · (k_r + 1), and the lattice of A_m-invariant
subspaces is independent of m. As Inv(A) is infinite, clearly dist(Inv(A),
Inv(A_m)) ≥ ε > 0, where ε does not depend on m. Hence Inv(A) is not
stable in metric.
Assume now that A is nonderogatory. Then the lattice Inv(A) is finite.
Let M_1, . . . , M_p be all the A-invariant subspaces. Theorem 15.2.1 shows
that every M_i is stable. That is, given ε > 0, there exists a δ_i > 0 such that
any transformation B with ||B − A|| < δ_i has an invariant subspace N_i such
that θ(M_i, N_i) < ε. Taking δ′ = min(δ_1, . . . , δ_p), we have

sup_{M∈Inv(A)} inf_{N∈Inv(B)} θ(M, N) < ε

for every transformation B with ||B − A|| < δ′. We prove now that, given
ε > 0, there exists a δ″ > 0 such that

sup_{N∈Inv(B)} inf_{M∈Inv(A)} θ(M, N) < ε

for every transformation B with ||B − A|| < δ″. Suppose not. Then there is a
sequence of transformations {B_m} with ||B_m − A|| → 0 such that for every m
there exists a B_m-invariant subspace N_m with

inf_{M∈Inv(A)} θ(M, N_m) ≥ ε_0                   (15.7.2)

where ε_0 is independent of m. Using the compactness of the set of all
subspaces in C^n, we can assume that lim_{m→∞} θ(N_m, N) = 0 for a certain
subspace N in C^n. Then (15.7.2) gives

inf_{M∈Inv(A)} θ(M, N) ≥ ε_0                     (15.7.3)

However, by Theorem 15.1.1, N ∈ Inv(A), which contradicts (15.7.3).
Now, given ε > 0, let δ = min(δ′, δ″) to see that dist(Inv(B), Inv(A)) < ε
for every transformation B with ||B − A|| < δ. □

It follows from Theorems 15.6.1 and 15.7.1 that Inv(A) is stable if and
only if it is stable in metric.
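The nonderogatory condition in Theorem 15.7.1 can be tested numerically, at least when the eigenvalues are well separated; a rough sketch (the helper is our own, with the caveats noted in the comments):

```python
import numpy as np

def is_nonderogatory(A, tol=1e-8):
    """Rough numerical test that dim Ker(A - lam0*I) = 1, i.e. that
    rank(A - lam0*I) >= n - 1, for every computed eigenvalue lam0.
    (For badly defective matrices the computed eigenvalues are
    ill-conditioned, so this is only a sketch, not a robust test.)"""
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        if np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol) < n - 1:
            return False
    return True

J3 = np.diag([1.0, 1.0], k=1)   # one 3x3 Jordan block at 0: nonderogatory
Z2 = np.zeros((2, 2))           # two 1x1 blocks at 0: derogatory
print(is_nonderogatory(J3), is_nonderogatory(Z2))   # True False
```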
Also, let us introduce the notion of Lipschitz stability in metric. We say
that the lattice Inv(A) of all invariant subspaces of a transformation
A: C^n → C^n is Lipschitz stable in metric if there exist positive constants K
and ε such that for any transformation B with ||B − A|| < ε the inequality

dist(Inv(B), Inv(A)) ≤ K||B − A||

holds.

Theorem 15.7.2
The lattice Inv(A) for a transformation A: C^n → C^n is Lipschitz stable in
metric if and only if A has n distinct eigenvalues.

Proof. Assume that A has n distinct eigenvalues. Then every A-
invariant subspace is spectral, and by Theorem 15.5.1 every A-invariant
subspace is Lipschitz stable. Let M_1, . . . , M_p be all the A-invariant sub-
spaces (their number is finite). So there exist positive constants K_i, ε_i
such that any transformation B with ||B − A|| < ε_i has an invariant subspace
N_i with θ(M_i, N_i) ≤ K_i||B − A||. Letting K = max(K_1, . . . , K_p), ε =
min(ε_1, . . . , ε_p), we find that

sup_{M∈Inv(A)} inf_{N∈Inv(B)} θ(M, N) ≤ K||B − A||          (15.7.4)

provided ||B − A|| < ε. Now consider the invariant subspaces of B. As A has
n distinct eigenvalues, the same is true for any transformation B sufficiently
close to A. So every B-invariant subspace N is spectral:

N = ℛ_Γ(B)

for a suitable contour Γ. We can assume that Γ ∩ σ(A) = ∅. Then, letting
M = ℛ_Γ(A), we find that

θ(M, N) ≤ K′||B − A||

for every transformation B sufficiently close to A (cf. the verification of
stability of a direct sum of root subspaces at the beginning of Section 15.2).
Hence

sup_{N∈Inv(B)} inf_{M∈Inv(A)} θ(M, N) ≤ K′||B − A||         (15.7.5)

for all such B. In view of (15.7.4) and (15.7.5), Inv(A) is Lipschitz stable in
metric.
Conversely, if A has fewer than n distinct eigenvalues, then by Theorem
15.5.1 there exists an A-invariant subspace that is not Lipschitz stable. Then
clearly Inv(A) cannot be Lipschitz stable in metric. □
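For a matrix with n distinct eigenvalues, the Lipschitz behaviour asserted by Theorem 15.7.2 is easy to observe numerically; a minimal sketch (the perturbation matrix is an arbitrary choice of ours):

```python
import numpy as np

def gap(M, N):
    """theta(M, N) via orthogonal projectors (cf. the sketch in Section 15.7)."""
    QM, _ = np.linalg.qr(M)
    QN, _ = np.linalg.qr(N)
    return np.linalg.norm(QM @ QM.conj().T - QN @ QN.conj().T, 2)

# A = diag(1, 2) has n = 2 distinct eigenvalues, so Inv(A) is Lipschitz
# stable in metric: each invariant subspace of the perturbed matrix stays
# within O(||B - A||) of the matching invariant subspace of A.
A = np.diag([1.0, 2.0])
E = np.array([[0.3, 1.0], [0.5, -0.2]])
for t in [1e-2, 1e-4, 1e-6]:
    B = A + t * E
    w, V = np.linalg.eig(B)
    i = int(np.argmin(np.abs(w - 1.0)))        # eigenvector of B near lambda = 1
    d = gap(np.eye(2)[:, [0]], V[:, [i]])
    print(t, d, d / np.linalg.norm(t * E, 2))  # the ratio stays bounded
```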

15.8 STABILITY OF [A B]-INVARIANT SUBSPACES

In this section we treat the stability of [A B]-invariant subspaces. In view of
the important part they play in our applications (see Section 17.7), the
reader can anticipate subsequent applications of this material.
Let A: C^n → C^n and B: C^m → C^n be linear transformations. Recall from
Chapter 6 that a subspace M ⊆ C^n is called [A B] invariant if there exists a
transformation F: C^n → C^m such that (A + BF)M ⊆ M (actually, this is a
property that is equivalent to the definition of an [A B]-invariant subspace,
as proved in Theorem 6.1.1). We restrict our attention to the case most
important in applications, when the pair (A, B) is a full-range pair; thus

Im B + Im(AB) + · · · + Im(A^{n−1}B) = C^n
It turns out that, in contrast with the case of invariant subspaces for a
transformation, every [A B]-invariant subspace is stable and, moreover, the
stability holds in the Lipschitz sense. More exactly, we have the following
theorem.

Theorem 15.8.1
Let A: C^n → C^n and B: C^m → C^n form a full-range pair of transformations.
Then for every [A B]-invariant subspace M there exist positive constants ε
and K such that, for every pair of transformations A′: C^n → C^n and
B′: C^m → C^n with

||A − A′|| + ||B − B′|| < ε

there exists an [A′ B′]-invariant subspace M′ satisfying

θ(M, M′) ≤ K(||A − A′|| + ||B − B′||)
Proof. Let F: C^n → C^m be a transformation such that (A + BF)M ⊆ M,
and write A + BF and B as block matrices with respect to the decomposition
C^n = M ∔ N, where N is some direct complement to M:

We claim that (A_22, B_2) is a full-range pair. Indeed, since (A, B) is a
full-range pair, so is (A + BF, B) (Lemma 6.3.1). Now for every x =
x_M + x_N ∈ C^n with x_M ∈ M, x_N ∈ N we have

Hence, in view of the full-range property of (A + BF, B), we find that

This implies the full-range property of (A_22, B_2).
We appeal to the spectral assignment theorem (Theorem 6.5.1). Accord-
ing to this theorem, there exists a transformation G_1: N → C^m such that

(15.8.2)

Put F_0 = F + G, where the transformation G: C^n → C^m is defined by the
properties that Gx = 0 for all x ∈ M and Gx = G_1 x for all x ∈ N. Clearly

Condition (15.8.2) ensures that M is a spectral invariant subspace for A +
BF_0. By Theorem 15.5.1, M is Lipschitz stable [as an (A + BF_0)-invariant
subspace]. So there exist constants ε′, K′ > 0 such that every transformation
H: C^n → C^n with ||A + BF_0 − H|| < ε′ has an invariant subspace M′ such
that

θ(M, M′) ≤ K′||A + BF_0 − H||

It remains to choose ε in such a way that

and put

□
We emphasize that the full-range property of (A, B) is crucial in
Theorem 15.8.1. Indeed, in the extreme case when B = 0 the [A B]-
invariant subspaces coincide with the A-invariant subspaces and, in general,
not every A-invariant subspace is Lipschitz stable.
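Assuming the pair is given as matrices, the full-range property is a rank condition on the block matrix [B, AB, . . . , A^{n−1}B] and is easy to test; a minimal sketch (the helper name and the test pair are our own):

```python
import numpy as np

def is_full_range(A, B, tol=1e-10):
    """Full-range pair: Im B + Im(AB) + ... + Im(A^(n-1) B) = C^n,
    i.e. the block matrix [B, AB, ..., A^(n-1) B] has rank n."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.linalg.matrix_rank(np.hstack(blocks), tol=tol) == n

A = np.diag(np.ones(3), k=1)                 # the 4x4 Jordan block J_4(0)
b = np.array([[0.0], [0.0], [0.0], [1.0]])
print(is_full_range(A, b))                   # True: (A, b) is full range
print(is_full_range(A, np.zeros((4, 1))))    # False: the extreme case B = 0
```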
The proof of Theorem 15.8.1 reveals some additional information about
the stability of [A B]-invariant subspaces:

Corollary 15.8.2
Let A: C^n → C^n and B: C^m → C^n form a full-range pair of transformations,
and let M be an [A B]-invariant subspace. Then for every transformation
F: C^n → C^m such that (A + BF)M ⊆ M and every direct complement N to M
in C^n there exist positive constants K and ε with the property that to any pair
of transformations A′: C^n → C^n, B′: C^m → C^n with ||A − A′|| + ||B − B′|| <
ε there corresponds a transformation F′: C^n → C^m with Ker F′ ⊇ M, and a
subspace M′ ⊆ C^n, such that (A′ + B′(F + F′))M′ ⊆ M′ and

A dual version of Theorem 15.8.1 also holds. Namely, given a null kernel
pair of transformations G: C^n → C^m and A: C^n → C^n, every [G; A]-invariant
subspace is Lipschitz stable in the above sense (here [G; A] denotes the
column block formed by G and A). The proof can be obtained by using
Theorem 15.8.1 and the fact that a subspace M is [G; A] invariant if
and only if its orthogonal complement is [A* G*] invariant. We leave it to
the reader to state and prove this dual version of Corollary 15.8.2.

15.9 STABLE INVARIANT SUBSPACES FOR REAL TRANSFORMATIONS

Let A: R^n → R^n be a transformation. The definition of stable invariant
subspaces of A is analogous to that for transformations from C^n to C^n.
Namely, an A-invariant subspace M ⊆ R^n is called stable if for every ε > 0
there exists a δ > 0 such that any transformation B: R^n → R^n with ||B −
A|| < δ has an invariant subspace N with θ(M, N) < ε. However, it turns out
that, in contrast with the complex case, the classes of stable and of isolated
invariant subspaces no longer coincide. More exactly, every stable invariant
subspace is isolated, but, in general, not every isolated invariant subspace is
stable.
To describe the stable invariant subspaces of real transformations, we
start with several basic particular cases.

Lemma 15.9.1
Let A: R^n → R^n be a transformation such that σ(A) consists of either exactly
one real eigenvalue or exactly one pair of nonreal eigenvalues. Let the
geometric multiplicity (multiplicities) be greater than one in either case. Then
there is no nontrivial stable A-invariant subspace.

The proof of this lemma is similar to the proof of Lemma 15.2.5.

Lemma 15.9.2
Assume that n is odd and the transformation A: R^n → R^n has exactly one
eigenvalue (which is real) and the geometric multiplicity of this eigenvalue is
one. Then each A-invariant subspace is stable.

Proof. As n is odd, every transformation X: R^n → R^n has an invariant
subspace of every dimension k for 1 ≤ k ≤ n − 1 (this follows from the real
Jordan form for X, because X must have a real eigenvalue). Arguing as in
the proof of Theorem 15.2.3, one proves that for every ε > 0 there exists a
δ > 0 such that, if B is a transformation with ||B − A|| < δ and M is a
k-dimensional B-invariant subspace, there exists a k-dimensional A-
invariant subspace N with θ(M, N) < ε. Since A is unicellular, this subspace
N is unique, and its stability follows. □
Lemma 15.9.3
Let n be even, and let A: R^n → R^n have exactly one real eigenvalue. Let its
geometric multiplicity be one. Then the even-dimensional A-invariant sub-
spaces are stable and the odd-dimensional A-invariant subspaces are not
stable.

Proof. If k is even, then the stability of the k-dimensional A-invariant
subspace (which is unique) is proved in the same way as Lemma 15.9.2,
using the existence of a k-dimensional invariant subspace for every trans-
formation on R^n (for even k this follows from the real Jordan form).
Now let M be a k-dimensional A-invariant subspace where k is odd.
Without loss of generality we can assume A = J_n(0). For every positive ε,
the transformation A(ε) = S(ε) + A, where S(ε) has −ε in the entries (2, 1),
(4, 3), . . . , (n − 2, n − 3), (n, n − 1), and zeros in all other entries, has no
real eigenvalues. Hence A(ε) has no k-dimensional invariant subspaces, so
θ(M, N) = 1 for every A(ε)-invariant subspace N (Theorem 13.1.2). There-
fore, M is not stable. □
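The perturbation used in this proof is easy to reproduce numerically (we take the entries of S(ε) to be −ε, the sign for which the 2 × 2 diagonal blocks acquire the nonreal eigenvalues ±i√ε); a minimal sketch:

```python
import numpy as np

def A_eps(n, eps):
    """J_n(0) perturbed by -eps in the entries (2,1), (4,3), ..., (n, n-1),
    n even.  The characteristic polynomial is (lam**2 + eps)**(n//2),
    so every eigenvalue is +/- i*sqrt(eps): none of them is real."""
    M = np.diag(np.ones(n - 1), k=1)
    for i in range(1, n, 2):          # 0-based rows 1, 3, ..., n-1
        M[i, i - 1] = -eps
    return M

eigs = np.linalg.eigvals(A_eps(6, 1e-3))
print(np.min(np.abs(eigs.imag)))      # > 0: no real eigenvalues survive
```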
Lemma 15.9.4
Assume that A: R^n → R^n has exactly one pair of nonreal eigenvalues α ± iβ,
and that their geometric multiplicity is one. Then every A-invariant subspace
is stable.

Proof. Using the real Jordan form of A, we can assume that

(In particular, n is even.) Theorem 12.2.4 shows that the lattice of A-
invariant subspaces is a chain; so for every even integer k with 0 ≤ k ≤ n,
there exists exactly one A-invariant subspace of dimension k. Also, there
exists an ε > 0 such that any transformation B with ||B − A|| < ε has no real
eigenvalues. [Indeed, for a suitable ε all the eigenvalues of B will be in the
union of the two discs {λ ∈ C : |λ − (α ± iβ)| < β/2}, which do not intersect
the real axis.] Now one can use the proof of Lemma 15.9.2. □

Now we are prepared to handle the general case of a transformation
A: R^n → R^n. Let λ_1, . . . , λ_s be all the distinct real eigenvalues of A, and let
α_1 ± iβ_1, . . . , α_t ± iβ_t be all the distinct eigenvalues of A in the open upper
half plane (so the α_j are real and the β_j are positive). We have

R^n = ℛ_{λ_1}(A) ∔ · · · ∔ ℛ_{λ_s}(A) ∔ ℛ_{α_1±iβ_1}(A) ∔ · · · ∔ ℛ_{α_t±iβ_t}(A)

For every A-invariant subspace N we also have

N = (N ∩ ℛ_{λ_1}(A)) ∔ · · · ∔ (N ∩ ℛ_{λ_s}(A)) ∔ (N ∩ ℛ_{α_1±iβ_1}(A)) ∔ · · · ∔ (N ∩ ℛ_{α_t±iβ_t}(A))

(see Theorem 12.2.1). In this notation we have the following general result
that describes all stable A-invariant subspaces.

Theorem 15.9.5
Let A be a transformation on R^n. The A-invariant subspace N is stable if and
only if all the following properties hold: (a) N ∩ ℛ_{λ_j}(A) is an arbitrary
even-dimensional A-invariant subspace of ℛ_{λ_j}(A) whenever the algebraic
multiplicity of λ_j is even and the geometric multiplicity of λ_j is 1; (b)
N ∩ ℛ_{λ_j}(A) is an arbitrary A-invariant subspace of ℛ_{λ_j}(A) whenever the
algebraic multiplicity of λ_j is odd and the geometric multiplicity of λ_j is 1; (c)
N ⊇ ℛ_{λ_j}(A), or N ∩ ℛ_{λ_j}(A) = {0}, whenever λ_j has geometric multiplicity
at least 2; (d) N ∩ ℛ_{α_j±iβ_j}(A) is an arbitrary A-invariant subspace of
ℛ_{α_j±iβ_j}(A) whenever the geometric multiplicity of α_j + iβ_j is 1, and
N ⊇ ℛ_{α_j±iβ_j}(A) or N ∩ ℛ_{α_j±iβ_j}(A) = {0} whenever α_j + iβ_j has geometric
multiplicity at least 2.

Proof. As in Lemma 15.3.2, one proves that N is stable if and only if
each intersection N ∩ ℛ_{λ_j}(A) is stable as an A|_{ℛ_{λ_j}(A)}-invariant subspace,
and each intersection N ∩ ℛ_{α_j±iβ_j}(A) is stable as an A|_{ℛ_{α_j±iβ_j}(A)}-invariant
subspace. Now apply Lemmas 15.9.1–15.9.4. □

Comparing Theorem 15.9.5 with Theorem 14.6.5, we obtain the follow-
ing corollary.
Corollary 15.9.6
For a transformation A: R^n → R^n, every stable A-invariant subspace is iso-
lated. Conversely, every isolated A-invariant subspace is stable if and only if
A has no real eigenvalues with even algebraic multiplicity and geometric
multiplicity 1.

We pass now to Lipschitz stable invariant subspaces for real transfor-
mations. The definition of Lipschitz stability is the same as for transfor-
mations on C^n. Clearly, every Lipschitz stable invariant subspace is stable.
Also, for a transformation A: R^n → R^n, every root subspace ℛ_λ(A) corres-
ponding to a real eigenvalue λ of A, as well as every root subspace
ℛ_{α±iβ}(A) corresponding to a pair α ± iβ of nonreal eigenvalues of A, is a
Lipschitz stable A-invariant subspace. Moreover, every spectral subspace for
A (i.e., a sum of root subspaces) is also a Lipschitz stable A-invariant
subspace. As in the complex case, these are all the Lipschitz stable subspaces:

Theorem 15.9.7
For a transformation A: R^n → R^n and an A-invariant subspace M ⊆ R^n,
the following statements are equivalent: (a) M is Lipschitz stable;
(b) M = ℛ_{λ_1}(A) + · · · + ℛ_{λ_r}(A) + ℛ_{α_1±iβ_1}(A) + · · · + ℛ_{α_s±iβ_s}(A) for some
distinct real eigenvalues λ_1, . . . , λ_r of A and some distinct eigenvalues
α_1 + iβ_1, . . . , α_s + iβ_s in the open upper half plane (here the terms ℛ_λ(A), or
the terms ℛ_{α±iβ}(A), or even both may be absent, in which case M is
interpreted as the zero subspace); (c) for every ε > 0 small enough there exists
a δ > 0 such that every transformation B: R^n → R^n for which ||B − A|| < δ
has a unique invariant subspace N for which θ(M, N) < ε.
Proof. As in Lemma 15.3.2, one proves that M is Lipschitz stable if and
only if for every real eigenvalue λ of A the intersection M ∩ ℛ_λ(A) is
Lipschitz stable as an A|_{ℛ_λ(A)}-invariant subspace and for every nonreal
eigenvalue α + iβ of A the intersection M ∩ ℛ_{α±iβ}(A) is Lipschitz stable as
an A|_{ℛ_{α±iβ}(A)}-invariant subspace.
Let us prove the equivalence (a) ⇔ (b). In view of the above remark, we
can assume that A has either exactly one real eigenvalue or exactly one pair
of nonreal eigenvalues. By Theorem 15.9.5 we have only to prove that the
transformations represented by the matrices

and

have no nontrivial Lipschitz stable invariant subspaces. For A_1, one shows
this as in the proof of Theorem 15.5.1. Consider now A_2. By a direct
computation one shows that

where n is the size of A_2, and

(For convenience, note that

Moreover, denoting by T the n × n matrix that has 1 in the entries (n/2, 1)
and (n, n/2 + 1) and zeros elsewhere, we have (for a ∈ R)

(15.9.1)

Now the proof of Theorem 15.5.1 shows that the only candidates for
nontrivial Lipschitz stable invariant subspaces for A_2 are
S^{-1}(Span{e_1, . . . , e_{n/2}}) and S^{-1}(Span{e_{n/2+1}, . . . , e_n}). But since these
subspaces are not real (i.e., cannot be obtained from subspaces in R^n by
complexification), A_2 has no nontrivial Lipschitz stable invariant subspaces.
The implication (b) ⇒ (c) is proved as in the proof of Theorem 15.5.1. To
prove the converse implication, observe that, as we have seen in the proof
of Theorem 15.5.1, it is sufficient to show that for any A_2-invariant subspace
M (⊆ R^n) of dimension q (0 < q < n) the number of q-dimensional invariant
subspaces N of A_2(a) such that

(15.9.2)

is at least two. (Here a is positive and sufficiently close to zero, and C is a
positive constant depending on q and n only.) Observe that q, as well as n,
must be even. Using formula (15.9.1) and arguing as in the proof of
Theorem 15.5.1, we find that for any choice of distinct complex numbers
ε_1, . . . , ε_{q/2} with ε_j^n = 1, j = 1, . . . , q/2, the subspace N spanned by the
columns of the real matrix

satisfies (15.9.2). □

15.10 PARTIAL MULTIPLICITIES OF CLOSE LINEAR TRANSFORMATIONS

In this chapter we have studied, up to now, the behaviour of invariant
subspaces under perturbations of the given linear transformation. We have
found that certain information about the transformation (e.g., its spectral
invariant subspaces) remains stable under small changes in the transfor-
mation. Here we study the corresponding problem of stability of the partial
multiplicities of transformations.
Given a transformation A: C^n → C^n, denote by k_1(λ, A), . . . , k_p(λ, A)
the partial multiplicities of A corresponding to its eigenvalue λ, and put
k_r(λ, A) = 0 for r > p (here p is the geometric multiplicity of the eigenvalue
λ). For a closed contour Γ in the complex plane that does not intersect the
spectrum of A, let

k_j(Γ, A) = k_j(λ_1, A) + · · · + k_j(λ_r, A),    j = 1, 2, . . .

where λ_1, . . . , λ_r are all the distinct eigenvalues of A inside Γ. If there are
no eigenvalues of A inside Γ, put formally k_j(Γ, A) = 0 for j = 1, 2, . . . .

Theorem 15.10.1
Given a transformation A: C^n → C^n and a closed contour Γ with Γ ∩ σ(A) =
∅, there exists an ε > 0 such that any transformation B: C^n → C^n with
||B − A|| < ε has no eigenvalues on Γ and satisfies the inequalities

Σ_{j=1}^{s} k_j(Γ, B) ≥ Σ_{j=1}^{s} k_j(Γ, A),    s = 1, 2, . . .      (15.10.1)

and the equality

Σ_{j≥1} k_j(Γ, B) = Σ_{j≥1} k_j(Γ, A)                                 (15.10.2)
Proof. Let n(Γ, f) be the number of zeros (counting multiplicities) of a
scalar polynomial f inside Γ. (It is assumed that f does not have zeros on Γ.)
For s = 1, 2, . . . , n, we have the relations

(15.10.3)

where f_s(λ) is the greatest common divisor of all determinants of s × s
submatrices in λI − A. (Here and in the sequel all transformations on C^n
are regarded as n × n matrices in a fixed basis in C^n.) Indeed, (15.10.3)
follows from Theorem A.4.3 (in the appendix).
Consider the Smith form of λI − A (see the appendix):

λI − A = F(λ) diag[a_1(λ), . . . , a_n(λ)] G(λ)

where F(λ) and G(λ) are n × n matrix polynomials with constant nonzero
determinant, and a_1(λ), . . . , a_n(λ) are scalar polynomials such that a_i(λ) is
divisible by a_{i−1}(λ) for i = 2, . . . , n. By the Binet–Cauchy formula
(Theorem A.2.1), f_s(λ) coincides with the greatest common divisor of all
determinants of s × s submatrices in diag[a_1(λ), . . . , a_n(λ)], and this is
equal to the product a_1(λ) · · · a_s(λ) in view of the properties of
a_1(λ), . . . , a_n(λ). So for s = 1, 2, . . . , n

f_s(λ) = a_1(λ) · · · a_s(λ)                                           (15.10.4)

Now let ε > 0 be so small that if ||B − A|| < ε, the determinant of the top
s × s submatrix in F(λ)^{-1}(λI − B)G(λ)^{-1} has exactly n(Γ; a_1(λ) · · · a_s(λ))
zeros inside Γ. [Such an ε exists by Rouché's theorem in the theory of
functions of a complex variable; e.g., see Marsden (1973).] Denote by h_s(λ)
the greatest common divisor of determinants of all s × s submatrices of
F(λ)^{-1}(λI − B)G(λ)^{-1}. Then h_s(λ) coincides (again by the Binet–Cauchy
formula) with the greatest common divisor of determinants of all s × s
submatrices in λI − B. When ||B − A|| < ε we obviously have
n(Γ; a_1(λ) · · · a_s(λ)) ≥ n(Γ; h_s). Combining this inequality with (15.10.4)
and using (15.10.3) with A replaced by B, we find that, for s = 1, . . . , n,

As the inequalities (15.10.1) with s > n are trivial, (15.10.1) is proved.
Further, Σ_{j≥1} k_j(Γ; A) coincides with the number of zeros of det(λI − A)
inside Γ, counting multiplicities. This number does not change after suf-
ficiently small perturbations of A, again by Rouché's theorem. □
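For small examples the partial multiplicities can be read off a computer-algebra Jordan form, and the effect of a perturbation is consistent with (15.10.1) and (15.10.2); a minimal sketch (the 3 × 3 example is our own):

```python
from sympy import Matrix

# Eigenvalue 0 with partial multiplicities (2, 1): blocks J_2(0), J_1(0).
A = Matrix([[0, 1, 0],
            [0, 0, 0],
            [0, 0, 0]])
P, J = A.jordan_form()
print(J)        # k_1(0, A) = 2, k_2(0, A) = 1

# An arbitrarily small perturbation t*E_{23} (any t != 0; the Jordan
# structure does not depend on the scale of t) merges the two blocks:
B = A.copy()
B[1, 2] = 1
P2, J2 = B.jordan_form()
print(J2)       # a single block J_3(0): k_1 jumps to 3, k_1 + k_2 stays 3
```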

The following question arises in connection with Theorem 15.10.1: are
the restrictions (15.10.1) and (15.10.2) imposed on the transformation B
sufficient for the existence of such a B arbitrarily close to A? Before we
answer this question (it turns out that the answer is yes), let us introduce a
convenient notation for the partial multiplicities of a transformation.
Given a transformation A: C^n → C^n, let

Ω = (s; r_1, . . . , r_s; m_{11}, . . . , m_{1r_1}; . . . ; m_{s1}, . . . , m_{sr_s})      (15.10.5)

be an ordered sequence, where s is the number of distinct eigenvalues of A,
and the ith eigenvalue has geometric multiplicity r_i and partial multiplicities
m_{i1}, . . . , m_{ir_i}. So Σ_{i=1}^{s} Σ_{j=1}^{r_i} m_{ij} = n. The order in (15.10.5) is
determined by the following properties: (a) r_1 ≥ r_2 ≥ · · · ≥ r_s; (b) if
r_i = r_{i+1}, then

Σ_{j=1}^{r_i} m_{ij} ≥ Σ_{j=1}^{r_{i+1}} m_{i+1,j}                       (15.10.6)

(c) if r_i = r_{i+1} and equality holds in (15.10.6), then (m_{i1}, m_{i2}, . . .) is
greater than or equal to (m_{i+1,1}, m_{i+1,2}, . . .) in the lexicographic order.
We say that (15.10.5) is the Jordan structure sequence of A. Denote by Φ the
finite set of all ordered sequences of positive integers (15.10.5) such that
properties (a)–(c) hold and Σ_{i=1}^{s} Σ_{j=1}^{r_i} m_{ij} = n (here n is fixed).
Given the sequence Ω ∈ Φ as in (15.10.5), for every nonempty subset
Δ ⊆ {1, . . . , s} define

(m_{pj} is interpreted as zero for j > r_p). Now we have the following.

Theorem 15.10.2
Let A: C^n → C^n be a transformation with s distinct eigenvalues and Jordan
structure sequence Ω. Then, given a sequence

Ω′ = (s′; r′_1, . . . , r′_{s′}; m′_{11}, . . . , m′_{1r′_1}; . . . ; m′_{s′1}, . . . , m′_{s′r′_{s′}}) ∈ Φ

there exists a sequence of transformations on C^n, say {B_m}_{m=1}^∞, that
converges to A and has common Jordan structure sequence Ω′ if and only if
there is a partition of {1, 2, . . . , s′} into s disjoint nonempty sets Δ_1, . . . , Δ_s
such that the following inequalities hold:

Informally, if λ_1, . . . , λ_s are the distinct eigenvalues of A ordered as in
Ω, and if λ_{1,m}, . . . , λ_{s′,m} are the distinct eigenvalues of B_m ordered as in Ω′,
then the eigenvalues {λ_{j,m}}_{j∈Δ_p} cluster around λ_p, for p = 1, . . . , s.

Proof of Theorem 15.10.2. The necessity of conditions (15.10.7) and
(15.10.8) follows from Theorem 15.10.1.
To prove sufficiency, we can restrict our attention to the case s = 1. Let λ_0
be the eigenvalue of A, and write m̃_j = Σ_{p=1}^{s′} m′_{pj} (recall that r′_1 =
max{r′_1, . . . , r′_{s′}} and that m′_{pj} is zero by definition if j > r′_p). We then
have the inequalities

and the equality

Now we construct a sequence {B_q}_{q=1}^∞ converging to A such that λ_0 is the
only eigenvalue of B_q and, for each q, the Jordan structure sequence of B_q is
Ω̃ = (1; r′_1; m̃_1, . . . , m̃_{r′_1}). Using induction on the number
Σ_k (Σ_{j=1}^{k} m̃_j − Σ_{j=1}^{k} m_{1j}), it is sufficient to consider only the case when,
for some indices l < q, we have m̃_l = m_{1l} + 1, m̃_q = m_{1q} − 1, whereas
m̃_j = m_{1j} for j ≠ l, j ≠ q. Write

as a matrix in some Jordan basis for A. Let

where the matrix Q has all zero entries except for the entry in position
(m_{11} + · · · + m_{1l}, m_{11} + · · · + m_{1q}), which is equal to 1. One verifies
without difficulty that the partial multiplicities of B_q are m̃_l = m_{1l} + 1,
m̃_q = m_{1q} − 1, and m̃_j = m_{1j} for j ≠ l, q.
Given a sequence {B_q}_{q=1}^∞ converging to A such that σ(B_q) = {λ_0} and
the Jordan structure sequence of B_q is Ω̃ (for each q), for a fixed q, let

(15.10.9)

be a Jordan basis for B_q; in other words, x_{j1}, . . . , x_{j,m̃_j} is a Jordan chain for
B_q for j = 1, . . . , r′_1. Let μ_1, . . . , μ_{s′} be distinct complex numbers; define
the transformation B̃_q(μ_1, . . . , μ_{s′}) by the requirement that in the basis
(15.10.9) it has the matrix form

(15.10.10)

where I_l is the l × l unit matrix (a term is absent from (15.10.10) whenever
its index range is empty). Clearly, B̃_q(μ_1, . . . , μ_{s′}) has the Jordan structure
sequence Ω′, and by a suitable choice of the μ_j values one can ensure that

With this choice of the μ_j values (which depend on q), put B_m =
B̃_m(μ_1, . . . , μ_{s′}) to satisfy the requirements of Theorem 15.10.2. □

15.11 EXERCISES

15.1 When are all invariant subspaces of the following transformations
A: C^n → C^n (written as matrices in the standard orthonormal basis)
stable?
(a) A is an upper triangular Toeplitz matrix.
(b) A is a circulant matrix.
(c) A is a companion matrix.
15.2 Describe all stable invariant subspaces for the classes (a), (b), and (c)
in Exercise 15.1.
15.3 Describe all stable invariant subspaces of a block circulant matrix
with blocks of size 2 × 2.
15.4 Show that any transformation A: C^n → C^n with rank A ≤ n − 2 has a
nonstable invariant subspace and identify it.
15.5 Prove that for every transformation A there exists a transformation
B such that every invariant subspace of A + εB is stable. Show that
one can always ensure, in addition, that rank B = n − 1.
15.6 Give an example of a transformation A: C^n → C^n such that there is
no transformation B: C^n → C^n with rank B ≤ n − 2 such that, for
some ε ∈ C, all invariant subspaces of A + εB are stable.
15.7 Given transformations A: C^n → C^n and B: C^n → C^n, an A-invariant
subspace L will be called B stable if for every ε_0 > 0 there exists
δ_0 > 0 such that each transformation A + δB with |δ| < δ_0 has an
invariant subspace M such that θ(L, M) < ε_0. Clearly, every stable
A-invariant subspace is B stable for every B. Give an example of a
B-stable A-invariant subspace that is not stable.
15.8 Show that if A and B commute, then there is a complete chain of
B-stable A-invariant subspaces.
15.9 Give an example of transformations A and B with the property that
an A-invariant subspace is stable if and only if it is B stable.
15.10 Show that an A-invariant subspace is stable if and only if it is B
stable for every B.
15.11 Show that the set of all stable invariant subspaces of a transformation
A: C^n → C^n is a lattice. When is this lattice trivial, that is, when does
it consist of {0} and C^n only? When does this lattice coincide with
Inv(A)?
15.12 Show that every stable invariant subspace is hyperinvariant. Is the
converse true?
15.13 Prove that the transformation A: C^n → C^n has the following property
if and only if A is nonderogatory: for every orthonormal basis
x_1, . . . , x_n in which A has an upper triangular form and any ε > 0
there exists a δ > 0 such that any transformation B: C^n → C^n with
||B − A|| < δ has an upper triangular form in some orthonormal
basis y_1, . . . , y_n that satisfies

15.14 Let A: R^n → R^n and B: R^m → R^n be a full-range pair of real trans-
formations. Show that every [A B]-invariant subspace is stable (in
the class of real transformations and real subspaces). [Hint: Use the
spectral assignment theorem for real transformations (Exercise
12.13).]
15.15 Let A be an upper triangular Toeplitz matrix. Find all possible
partial multiplicities for upper triangular Toeplitz matrices that are
arbitrarily close to A.
15.16 Let A and B be circulant matrices. Compute dist(Inv(A), Inv(B)).
Chapter Sixteen

Perturbations of Lattices of Invariant Subspaces
with Restrictions on the Jordan Structure

In this chapter we study the behaviour of the lattice Inv(X) of all invariant
subspaces of a matrix X when X is perturbed within the class of matrices
with fixed Jordan structure (i.e., with isomorphic lattices of invariant
subspaces). A larger class of matrices, with fixed Jordan structure corres-
ponding only to the eigenvalues of geometric multiplicity greater than 1, is
also studied. For transformations A and B on C^n, our main concern is the
relationship of the distance between the lattices of invariant subspaces for A
and B to ||A − B||.

16.1 PRESERVATION OF JORDAN STRUCTURE AND
ISOMORPHISM OF LATTICES

We start with a definition. Transformations A, B: C^n → C^n are said to
have the same Jordan structure if they have the same number of distinct
eigenvalues [so that we may write σ(A) = {λ_1, . . . , λ_s} and σ(B) =
{μ_1, . . . , μ_s}], and the eigenvalues can be ordered in such a way that the
partial multiplicities of λ_i as an eigenvalue of A coincide with the partial
multiplicities of μ_i as an eigenvalue of B, i = 1, . . . , s.
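For explicit matrices, "same Jordan structure" can be checked by comparing the block sizes of the Jordan forms, grouped by eigenvalue but with the eigenvalues themselves discarded; a minimal sketch (the helper is our own):

```python
from collections import defaultdict
from sympy import Matrix

def structure(A):
    """Jordan block sizes grouped by distinct eigenvalue, with the
    eigenvalues themselves forgotten - exactly the data that the
    definition of 'same Jordan structure' compares."""
    _, J = Matrix(A).jordan_form()
    sizes = defaultdict(list)
    i, n = 0, J.rows
    while i < n:
        j = i
        while j + 1 < n and J[j, j + 1] == 1:   # walk along one Jordan block
            j += 1
        sizes[J[i, i]].append(j - i + 1)
        i = j + 1
    return sorted(tuple(sorted(v, reverse=True)) for v in sizes.values())

# One 2x2 block at 2 versus one 2x2 block at 5: same Jordan structure.
print(structure([[2, 1], [0, 2]]) == structure([[5, 1], [0, 5]]))   # True
# One 2x2 block versus two 1x1 blocks: different Jordan structure.
print(structure([[2, 1], [0, 2]]) == structure([[2, 0], [0, 2]]))   # False
```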
Given a transformation A, denote by J(A) the set of all transformations
with the same Jordan structure as A. This structure is determined by the
sequence of positive integers (which was also a useful tool in Section 15.10):

Ω = (s; r_1, . . . , r_s; m_{11}, . . . , m_{1r_1}; . . . ; m_{s1}, . . . , m_{sr_s})      (16.1.1)

where s is the number of distinct eigenvalues of A, and the ith eigenvalue
has geometric multiplicity r_i and partial multiplicities m_{i1}, . . . , m_{ir_i}. Thus
Σ_{i=1}^{s} Σ_{j=1}^{r_i} m_{ij} = n. The parameters of this sequence are ordered in such
a way that r_1 ≥ r_2 ≥ · · · ≥ r_s, and if r_i = r_{i+1}, then

Σ_{j=1}^{r_i} m_{ij} ≥ Σ_{j=1}^{r_{i+1}} m_{i+1,j}                       (16.1.2)

and, furthermore, if r_i = r_{i+1} and equality holds in inequality (16.1.2), the
integers m_{ij} and m_{i+1,j} are ordered in such a way that (m_{i1}, m_{i2}, . . .) is
greater than or equal to (m_{i+1,1}, m_{i+1,2}, . . .) in the lexicographic order.
Clearly, the property of having the same Jordan structure induces an
equivalence relation on the set of all transformations on C^n. The number of
equivalence classes under this relation is finite and is equal to the number of
all different sequences of type (16.1.1) with the order properties described.
It is shown in the first theorem that transformations have the same Jordan
structure if and only if they have isomorphic (or linearly isomorphic) lattices
of invariant subspaces.
Let us define the notion of isomorphism of lattices. First, let 𝒴_1 and 𝒴_2 be
two lattices of subspaces in C^n. A map φ: 𝒴_1 → 𝒴_2 is called a lattice
homomorphism if φ(M ∩ N) = φ(M) ∩ φ(N) and φ(M + N) = φ(M) + φ(N)
for every two subspaces M, N ∈ 𝒴_1.
A lattice homomorphism φ is called a lattice isomorphism if φ is
one-to-one and onto; in this case the lattices 𝒴_1 and 𝒴_2 are said to be
isomorphic. An example of a lattice isomorphism is provided by the
following proposition.

Proposition 16.1.1
If S: C^n → C^n is an invertible transformation and 𝒴 is a lattice of subspaces
in C^n, then S𝒴 = {SM : M ∈ 𝒴} is also a lattice of subspaces, and the
correspondence φ(M) = SM is a lattice isomorphism of 𝒴 onto S𝒴.

Proof. The definition of φ ensures that φ is onto, and the invertibility of
S ensures that φ is one-to-one. Furthermore,

S(M ∩ N) = SM ∩ SN                                                  (16.1.3)

for any subspaces M and N in C^n. Indeed, the inclusion ⊆ in equation
(16.1.3) is evident. To prove the opposite inclusion, take x ∈ SM ∩ SN, so
x = Sm = Sn for some m ∈ M, n ∈ N. As S is invertible, actually m = n
and x ∈ S(M ∩ N). Finally, the equality S(M + N) = SM + SN is
evident. □

The lattice isomorphisms described in Proposition 16.1.1 are called linear.
So the lattices 𝒴_1 and 𝒴_2 are called linearly isomorphic if there exists a
transformation S: C^n → C^n (necessarily invertible) such that L ∈ 𝒴_1 if and
only if SL ∈ 𝒴_2.
It is easy to provide examples of lattices of subspaces that are isomorphic
but not linearly isomorphic. For instance, two chains of subspaces

M_1 ⊂ M_2 ⊂ · · · ⊂ M_k    and    L_1 ⊂ L_2 ⊂ · · · ⊂ L_l

are lattice isomorphic if and only if k = l (it is assumed that M_i ≠ M_j and
L_i ≠ L_j for i ≠ j). However, there exists an invertible matrix S such that
SM_i = L_i for i = 1, . . . , k if and only if dim M_i = dim L_i for each i.
The following theorem shows, in particular, that for the lattices of all
invariant subspaces isomorphism and linear isomorphism are the same.

Theorem 16.1.2
Let a transformation A: C^n → C^n be given. The following statements are
equivalent for a transformation B: C^n → C^n: (a) B has the same Jordan
structure as A; (b) the lattices Inv(B) and Inv(A) are isomorphic; (c) the
lattices Inv(B) and Inv(A) are linearly isomorphic.

Proof. Assume B ∈ J(A). Let λ_1, . . . , λ_p and μ_1, . . . , μ_p be all the
distinct eigenvalues of A and B, respectively, and let them be numbered so
that the partial multiplicities of A at λ_j coincide with the partial multi-
plicities of B at μ_j for j = 1, . . . , p. For a fixed j, let

be a Jordan basis in ℛ_{λ_j}(A), and let

be a Jordan basis in ℛ_{μ_j}(B) (so k_1, k_2, . . . , k_q are the partial multiplicities
of A at λ_j and of B at μ_j). Given an A-invariant subspace L ⊆ ℛ_{λ_j}(A)
spanned by the vectors

[here the coefficients are complex numbers], put

where

Clearly, ψ_j(L) is a B-invariant subspace that belongs to ℛ_{μ_j}(B). Now for
any A-invariant subspace M put

It is easily seen that ψ is a desired isomorphism between Inv(A) and Inv(B);
moreover, ψ(M) = SM, where S is the invertible transformation defined by
Sx_{rs} = y_{rs}, s = 1, . . . , k_r; r = 1, . . . , q.
Conversely, suppose that ψ: Inv(A) → Inv(B) is an isomorphism of lat-
tices. Let λ_1, . . . , λ_p be all the distinct eigenvalues of A, and let N_j =
ψ(ℛ_{λ_j}(A)), j = 1, . . . , p. Then C^n is a direct sum of the B-invariant sub-
spaces N_1, . . . , N_p. We claim that σ(B|_{N_i}) ∩ σ(B|_{N_j}) = ∅ for i ≠ j. Indeed,
assume the contrary, that is, λ_0 ∈ σ(B|_{N_i}) ∩ σ(B|_{N_j}) for some N_i and N_j
with i ≠ j. Let N = Span{y_1 + y_2}, where y_1 (resp. y_2) is some eigenvector
of B|_{N_i} (resp. of B|_{N_j}) corresponding to the eigenvalue λ_0. Then N is B
invariant. Let M be the A-invariant subspace such that ψ(M) = N. Since M
must contain a one-dimensional A-invariant subspace, and since ψ is a lattice
isomorphism, the subspace M is one-dimensional. Therefore, M ⊆ ℛ_{λ_k}(A)
for some k. This implies N = ψ(M) ⊆ ψ(ℛ_{λ_k}(A)) = N_k, a contradiction
with the choice of N.
Further, the spectrum of each restriction B|_{N_i} is a singleton. To verify
this, assume the contrary. Then for some i the subspace N_i is a sum of at
least two root subspaces for B:

Letting M_f be the A-invariant subspace such that ψ(M_f) = ℛ_{μ_f}(B), f =
1, . . . , k, we have

If x_1 and x_2 are eigenvectors of A|_{M_1} and of A|_{M_2}, respectively, then
Span{x_1 + x_2} is A-invariant and does not belong to any subspace M_f.
Hence ψ(Span{x_1 + x_2}) is B invariant, belongs to N_i, but does not belong
to any subspace ℛ_{μ_f}(B). This is impossible because ψ(Span{x_1 + x_2}) is
one-dimensional, and every one-dimensional B-invariant subspace lies in a
single root subspace of B.
We have proved, therefore, that ψ(ℛ_{λ_j}(A)) = ℛ_{μ_j}(B), j = 1, . . . , p,
where μ_1, . . . , μ_p are all the distinct eigenvalues of B.
For a fixed j, the number of partial multiplicities of A corresponding to λ_j
that are greater than or equal to a fixed integer q coincides with the
maximal number of summands in a direct sum L_1 ∔ · · · ∔ L_s, where, for
i = 1, . . . , s, the L_i ⊆ ℛ_{λ_j}(A) are irreducible subspaces with dimension not
less than q. As ψ induces an isomorphism between Inv(A|_{ℛ_{λ_j}(A)}) and
Inv(B|_{ℛ_{μ_j}(B)}), it follows that the number of partial multiplicities of A
corresponding to λ_j that are greater than or equal to q coincides with the
number of partial multiplicities of B corresponding to μ_j that are not less
than q. Hence A and B have the same Jordan structure. □

Corollary 16.1.3
Assume that A and B are transformations on C^n with one and only one
eigenvalue λ_0. Then the lattices Inv(A) and Inv(B) are isomorphic if and
only if A and B are similar.

16.2 PROPERTIES OF LINEAR ISOMORPHISMS OF LATTICES:
THE CASE OF SIMILAR TRANSFORMATIONS

In view of Theorem 16.1.2, for transformations A and B with the same
Jordan structure, the set 𝒮(A, B) of all invertible transformations S such
that L ∈ Inv(A) if and only if SL ∈ Inv(B) is not empty. Denote

Ω(A, B) = inf{ ||I − S|| : S ∈ 𝒮(A, B) }

Note that the set 𝒮(A, B) contains transformations arbitrarily close to zero.
[Indeed, take a fixed S ∈ 𝒮(A, B) and consider εS with ε → 0, ε ≠ 0.]
Hence Ω(A, B) ≤ 1 for any A and B with the same Jordan structure. This
observation will be used frequently in the sequel.
The following example shows that the equality Ω(A, B) = 1 is possible.

EXAMPLE 16.2.1. Let

Then

and

However, it is easily seen that the norm of I − S is at least 1 for any choice
of a, b, and c, and can be arbitrarily close to 1. Hence Ω(A, B) = 1. □
The number Ω(A, B) is closely related to the distance between Inv(A)
and Inv(B), as we shall see in the next theorem. Recall that

dist(Inv(A), Inv(B)) = max{ sup_{M∈Inv(A)} inf_{N∈Inv(B)} θ(M, N),
                            sup_{N∈Inv(B)} inf_{M∈Inv(A)} θ(M, N) }

Theorem 16.2.1
If A and B have the same Jordan structure and Ω(A, B) < 1, then
Proof. For positive ε < 1 − Ω(A, B), let S ∈ 𝒮(A, B) be such that

||I − S|| ≤ Ω(A, B) + ε

For every nonzero x, denoting y = S^{-1}x, we have

Hence

Now for any subspace M ⊆ C^n the transformation S P_M S^{-1} is a projector
on SM (we denote by P_M the orthogonal projector on M). So

Consequently, dist(Inv(A), Inv(B)) admits the bound stated in the theorem
with Ω(A, B) + ε in place of Ω(A, B), and since ε > 0 was arbitrary,
Theorem 16.2.1 follows. □

Now consider the case when A and B are similar. Then, evidently, A and
B have the same Jordan structure. Clearly, in this case 𝒮(A, B) contains all
the similarity transformations between A and B:

𝒮(A, B) ⊇ { S : S is invertible and A = S^{-1}BS }

We remark that this inclusion can be proper. Indeed, in Example 16.2.1
above, the similarity transformations between A and B have the form

which is a proper subset of 𝒮(A, B).

Theorem 16.2.2
For every transformation A: C^n → C^n we have

where the suprema are taken over all transformations B that are similar to A.

In other words, the first inequality in (16.2.1) means that there exists a
positive constant K (depending on A) such that for every B that is similar
to A we have

||I − T|| ≤ K||A − B||

for some invertible transformation T satisfying A = TBT^{-1}.
In the next section the result of Theorem 16.2.2 is generalized to include
all the transformations B with the same Jordan structure as A.

Proof of Theorem 16.2.2. As θ(M, N) ≤ 1 for any subspaces M, N in C^n
[this follows, for instance, from formula (13.1.3)], we have

dist(Inv(X), Inv(Y)) ≤ 1

for any transformations X, Y: C^n → C^n. So, by Theorem 16.2.1, the second
inequality in (16.2.1) follows from the first one.
To prove the first inequality in (16.2.1), consider the linear space L(C^n)
of all transformations X: C^n → C^n, with the scalar product (X, Y) = tr(XY*)
for X, Y ∈ L(C^n) (where Y* denotes the adjoint of Y defined by the
standard scalar product on C^n) and the corresponding norm ||X||_1 =
√(X, X) for all X ∈ L(C^n). For every B ∈ L(C^n) consider the linear
transformation

W_B: L(C^n) → L(C^n),    W_B(X) = AX − XB

so that, in particular, I ∈ Ker W_A. If B is similar to A, then
dim Ker W_B = dim Ker W_A (indeed, Ker W_B = {XS : X ∈ Ker W_A}, where
S is a fixed invertible transformation such that B = S^{-1}AS). Let P_A be a
fixed projector on Ker W_A. [Thus P_A: L(C^n) → L(C^n).] By Theorem 13.5.1
there exists a positive constant K_1 such that, if B is similar to A, then

||P_A − P_B||_1 ≤ K_1 ||W_A − W_B||_1

for some projector P_B on Ker W_B. Here

||W||_1 = sup{ ||W(X)||_1 : X ∈ L(C^n), ||X||_1 = 1 }

is the norm induced by || · ||_1, and similarly for || · ||.
Observe that the norm || · ||_1 is multiplicative: ||XY||_1 ≤ ||X||_1 ||Y||_1 for
all transformations X, Y: C^n → C^n. Indeed, if || · || is the norm induced on
transformations by the standard norm on C^n, then it is easily verified that
||Y||^2 I − YY* is positive semidefinite and hence that X(||Y||^2 I − YY*)X*
is positive semidefinite. Thus

||XY||_1 ≤ ||X||_1 ||Y||                                             (16.2.3)

Further, denoting by λ_1 ≥ · · · ≥ λ_n (≥ 0) the eigenvalues of the positive
semidefinite transformation Y*Y, we have, for every f ∈ C^n with ||f|| = 1,

||Yf||^2 = (Y*Yf, f) ≤ λ_1 ≤ λ_1 + · · · + λ_n = ||Y||_1^2

so ||Y|| ≤ ||Y||_1. Substitution in (16.2.3) yields the desired inequality.
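The norm inequalities used in this argument are easy to confirm numerically; a minimal sketch (the random test matrices are our own choice):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

def fro(Z):
    """||Z||_1 = sqrt(tr(Z Z*)), i.e. the Frobenius norm."""
    return np.sqrt(np.trace(Z @ Z.conj().T).real)

print(fro(X @ Y) <= fro(X) * fro(Y))               # multiplicativity of ||.||_1
print(fro(X @ Y) <= np.linalg.norm(X, 2) * fro(Y)) # the analogue of (16.2.3)
print(np.linalg.norm(Y, 2) <= fro(Y))              # ||Y|| <= ||Y||_1
```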

Note that (W_A − W_B)(X) = X(B − A), so the multiplicative property of
|| · ||_1 implies

||W_A − W_B||_1 ≤ ||A − B||

for every transformation B similar to A.
The identity transformation I belongs to Ker W_A; so P_A(I) = I and

||I − P_B(I)||_1 = ||P_A(I) − P_B(I)||_1 ≤ K_1 ||A − B||

If, in addition, K_1 ||A − B|| < 1, then P_B(I) is an invertible transfor-
mation. In this case P_B(I) ∈ 𝒮(B, A); hence

Ω(B, A) ≤ ||I − P_B(I)|| ≤ K_2 ||A − B||                             (16.2.4)

for every B ∈ L(C^n) that is similar to A and such that K_1 ||A − B|| < 1,
where the constant K_2 > 0 depends on A only.
Taking into account the fact that Ω(B, A) ≤ 1 for all B similar to A, we
find that (16.2.1) follows from (16.2.4). □

Results analogous to Theorem 16.2.2 hold also for other classes of


subspaces connected with a linear transformation. For example, the lattice
of all invariant subspaces in Theorem 16.2.2 can be replaced by any one of
the following sets: coinvariant subspaces, semiinvariant subspaces, hyper-
invariant subspaces, reducing invariant subspaces, and root subspaces. The
proof remains the same in all these cases.
Theorem 16.2.2 fails in general if we drop the requirement that B is
similar to A. The next example illustrates this fact.

EXAMPLE 16.2.2. Let

Let us compute dist(Inv(A), Inv(B(δ))) for δ ≠ 0. We have, for a complex
number z,

So

Hence

As any subspace in C^2 is A invariant, we obviously have

for every L ∈ Inv(B(δ)). Thus dist(Inv(A), Inv(B(δ))) = 1. In the limit as
δ → 0, we see that the conclusion of Theorem 16.2.2 fails for this particular
A if we drop the condition that B be similar to A. □

We conclude this section with a simple example in which Ω(B, A) and
dist(Inv(B), Inv(A)) can be calculated explicitly.

EXAMPLE 16.2.3. Let

Then

and

Taking a = 1, it follows that

On the other hand, taking u = 0, we have

So

An elementary calculation (using the stationary points of |xd|^2 + |1 − d|^2
considered as a function of the two real variables ℜd and ℑd) yields

To calculate the distance between Inv(A) and Inv(B), note that the only
invariant subspaces of A and B that differ (if x ≠ 0) are the one-dimensional
subspaces Span{·} and Span{·}, respectively, with corresponding orthogonal
projectors

Observe that

and, letting P_3 = I − P_1, we find that

These inequalities, together with the fact that θ(M, N) = 1 if dim M ≠
dim N (see Theorem 13.1.2), allow us to verify that

It is curious that Ω(A, B) = dist(Inv(A), Inv(B)) in this example. □

16.3 DISTANCE BETWEEN INVARIANT SUBSPACES FOR
TRANSFORMATIONS WITH THE SAME JORDAN STRUCTURE

We state the main result of this chapter.

Theorem 16.3.1
Given a transformation A on C^n, we have

sup { Ω(A, B) / ||A − B|| } < ∞                                      (16.3.1)

and

sup { dist(Inv(A), Inv(B)) / ||A − B|| } < ∞                         (16.3.2)

where the suprema are taken over the set J(A) of all transformations
B: C^n → C^n that have the same Jordan structure as A.
Before we proceed with the proof of Theorem 16.3.1 (which is quite
long), let us mention the following result on Lipschitz continuity of
dist(Inv(A), Inv(B)), whose proof is facilitated by the use of Theorem
16.3.1.

Theorem 16.3.2
Let J be a class of all linear transformations having the same Jordan structure.
Then the real function defined on J by

f(A, B) = dist(Inv(A), Inv(B))

for all A, B ∈ J is Lipschitz continuous at every pair A_0, B_0 ∈ J; that is,

|dist(Inv(A), Inv(B)) − dist(Inv(A_0), Inv(B_0))| ≤ K(||A − A_0|| + ||B − B_0||)

for every A, B ∈ J, where the constant K > 0 depends on A_0 and B_0 only.

Proof. We need the following observation (the triangle inequality proved
in Section 15.7):

dist(Inv(A), Inv(B)) ≤ dist(Inv(A), Inv(C)) + dist(Inv(C), Inv(B))  (16.3.3)

for any transformations A, B, C: C^n → C^n. Using (16.3.3) and (16.3.2), we
obtain the required bound for fixed A_0 and B_0. □

Proof of Theorem 16.3.1. Since Theorem 16.2.1, together with (16.3.1),
implies (16.3.2), we have only to prove (16.3.1). The main idea of the proof
is to reduce it to Theorem 16.2.2. For the reader's convenience the proof is
divided into three parts.

(a) Let λ_1, . . . , λ_p be all the distinct eigenvalues of A, and let Γ_i be a
circle around λ_i, i = 1, . . . , p, chosen so small that Γ_i ∩ Γ_j = ∅ for
i ≠ j and λ_i is the unique eigenvalue of A inside Γ_i. For every Γ_i and
every transformation B: C^n → C^n that has no eigenvalues on Γ_i,
define

where μ_1, . . . , μ_u are all the eigenvalues of B inside Γ_i and
k_r(μ_m, B), r = 1, 2, . . . , are the partial multiplicities of B at μ_m
(we put k_r(μ_m, B) = 0 for r greater than the geometric multiplicity
of μ_m as an eigenvalue of B). By Theorem 13.5.1 there exists an
ε_1 > 0 such that any transformation B with ||B − A|| < ε_1 has all its
eigenvalues in the union of the interiors of Γ_1, . . . , Γ_p; moreover,
the sum of algebraic multiplicities of the eigenvalues of B inside a
fixed circle Γ_i is equal to the algebraic multiplicity of the eigenvalue
λ_i of A, for i = 1, . . . , p; further,

(16.3.4)

provided ||B − A|| < ε_1.
(b) Assume now that ||B − A|| < ε_1 and B ∈ J(A). As the numbers of
distinct eigenvalues of B and of A coincide, there is exactly one
eigenvalue of B, denoted μ_i, inside each circle Γ_i. We claim that for
every i = 1, . . . , p the eigenvalue λ_i of A and the eigenvalue μ_i of B
have the same partial multiplicities. Indeed, assuming the contrary,
it follows from (16.3.4) that

for some i_0 (1 ≤ i_0 ≤ p) and some s_0 (note that the equality

holds for i = 1, . . . , p). For notational simplicity assume that i_0 = 1,
and that λ_1, λ_2, . . . , λ_{p_0} are exactly those eigenvalues of A whose
algebraic multiplicities are equal to the algebraic multiplicity of λ_1.
As B ∈ J(A), there is a permutation π of {1, 2, . . . , p_0} such that

(16.3.5)

Consequently,

However, (16.3.4) and (16.3.5) imply

which is a contradiction.
(c) Observe that a transformation F: C^n → C^n with ||F − A|| < ε_1 has
no eigenvalue on Γ_1 ∪ Γ_2 ∪ · · · ∪ Γ_p. So the number

is well defined. For a transformation B ∈ J(A) with ||B − A|| < ε_1/2 we
have [using (13.1.4)]

where Λ_i is the length of Γ_i. Let

Then

and (provided ||B − A|| is small enough)

Put

and for fixed i (1 ≤ i ≤ p) let S_i be the transformation constructed above for
the transformation B ∈ J(A) with ||B − A|| < ε_2. Define the transformation

where S̃_i = S_i|_{ℛ_{λ_i}(A)}. Obviously, μ_i is the only eigenvalue of B_i. Further,
for the transformation B_i we have

Now

(cf. the proof of Theorem 16.2.1). Since (1 − q_i)^{-1} ≤ 2, (16.3.6) gives

Now we have

On the other hand, for any orthonormal basis f_1, . . . , f_k in ℛ_{λ_i}(A) the
inequality (16.3.8) gives

Taking into account (16.3.9), we obtain

Now define the transformation B′: C^n → C^n by

Then B′ is clearly similar to A. As every invariant subspace of a transfor-
mation is the direct sum of its intersections with all the root subspaces of this
transformation, it follows that Inv(B) = Inv(B′). Moreover, inequality
(16.3.10) shows that for all x_i

For every x ∈ C^n write x = x_1 + · · · + x_p, where x_i = P_i(B)x and P_i(B)
is the projector on ℛ_{μ_i}(B) along Σ_{j≠i} ℛ_{μ_j}(B). As P_i(B) = (1/2πi)
∫_{Γ_i} (λI − B)^{-1} dλ, we have

where P_i(A) is the projector on ℛ_{λ_i}(A) along Σ_{j≠i} ℛ_{λ_j}(A). Denoting

we see that ||P_i(B)|| ≤ D_i, i = 1, . . . , p. Now using (16.3.11) with these
inequalities, we obtain

and thus

By Theorem 16.2.2 there exists K_3 > 0 such that for any transformation X
that is similar to A there exists an invertible S ∈ 𝒮(X, A) with ||I − S|| ≤
K_3||X − A||. Applying this result for X = B′ and bearing in mind that
Inv(B′) = Inv(B), we obtain

(16.3.13)

for any B ∈ J(A) with ||B − A|| < ε_2.
As Ω(A, B) ≤ 1 for any B ∈ J(A), (16.3.1) follows from (16.3.13). □

16.4 TRANSFORMATIONS WITH THE SAME DEROGATORY
JORDAN STRUCTURE

The result on continuity of Inv(A) that is contained in Theorem 16.3.1 can
be extended to admit pairs of transformations that are close to one another
and have different Jordan structures, provided the variations in this structure
are confined to those eigenvalues with geometric multiplicity 1. To make
this idea precise, we introduce the following definition. We say that trans-
formations A: C^n → C^n and B: C^n → C^n have the same derogatory Jordan
structure if A|_{𝒞(A)} and B|_{𝒞(B)} have the same Jordan structure, where 𝒞(A)
is the sum of the root subspaces of A corresponding to eigenvalues λ_0 with
dim Ker(λ_0 I − A) > 1. By definition, 𝒞(A) = {0} if dim Ker(λ_0 I − A) = 1
for every eigenvalue λ_0 of A.
Denote by DJ(A) the set of all transformations that have the same
derogatory Jordan structure as A.
We need one more definition to state the next theorem. For a transfor-
mation A, the height of A is the maximal partial multiplicity of A corres-
ponding to the eigenvalues λ_0 with dim Ker(λ_0 I − A) = 1. If A has no such
eigenvalues, its height is defined to be 1.

Theorem 16.4.1
Let A: C^n → C^n be a transformation with height α. Then

sup { dist(Inv(A), Inv(B)) / ||A − B||^{1/α} } < ∞

where the supremum is taken over all B ∈ DJ(A).

The inequality in Theorem 16.4.1 is exact in the sense that in general α
cannot be replaced by a smaller number. Namely, given a transformation A
with height α, there exists a sequence {B_m}_{m=1}^∞ of transformations
converging to A with B_m ∈ DJ(A) such that

(16.4.1)

Indeed, it is sufficient to consider the case when A = J_n(0) is a Jordan block.
Then the sequence

B_m = J_n(0) + (1/m) E_{n1}

(E_{n1} denotes the matrix with 1 in the (n, 1) entry and zeros elsewhere)
satisfies (16.4.1). This is not difficult to verify using the fact that B_m has n
distinct eigenvalues ε m^{−1/n} with corresponding eigenvectors

where ε runs over the nth roots of unity. Indeed, writing ξ = m^{−1/n}, we see
that the orthogonal projector on the span of such an eigenvector is

so

where the positive constant C is independent of m. Hence for m large
enough (such that C|ξ| < 1) we have

and (16.4.1) follows.


The proof of Theorem 16.4.1 is given in the next section. For the time
being, note the following important special case.

Corollary 16.4.2
Let A: <p"~* <P" ^e a nonderogatory transformation with height a. Then there
exists a neighbourhood °IL of A in the set of all transformations on <p" such
that

Recall that a transformation A is called nonderogatory if dimKer(A/ —


A) — 1 for every eigenvalue A of A, and note that the set of all nonderoga-
tory transformations is open. Indeed, if A: <P"~* <P" 's nonderogatory, then
rank(>4 - A 0 /) = n - 1 for every eigenvalue A0 of A. Write A as an n x n
matrix in some basis in (p", and let A0 be an (n — 1) x (n - 1) nonsingular
submatrix of A — A07. Then, for B sufficiently close to A and A sufficiently
close to A () , the corresponding ( n - l ) x ( « — 1) submatrix B(} of B - A/ will
also be nonsingular. Consequently

for all such B and A. Now the eigenvalues of a transformation depend


continuously on that transformation. So the set of A values for which
(16.4.2) holds will contain all eigenvalues of B (if B is close enough to A),
which means that B is nonderogatory.
Using the openness of the set of all nonderogatory linear transformations,
we see that Corollary 16.4.2 follows immediately from Theorem 16.4.1.
The following result on continuity of dist(Inv(A), Inv(B)) can be ob-
tained from Theorem 16.4.1 in the same way that Theorem 16.3.2 was
obtained from Theorem 16.3.1.

Theorem 16.4.3
Let DJ be a class of all transformations having the same derogatory Jordan
structure. Then the real function defined on DJ by

f(A, B) = dist(Inv(A), Inv(B))

for every A, B ∈ DJ is continuous. Moreover, for every pair A_0, B_0 ∈ DJ
there exists a constant K > 0 such that

|dist(Inv(A), Inv(B)) − dist(Inv(A_0), Inv(B_0))|
        ≤ K( ||A − A_0||^{1/α} + ||B − B_0||^{1/β} )

for every A, B ∈ DJ that are sufficiently close to A_0, B_0, where α and β are
the heights of A_0 and B_0, respectively.

Now we consider stable invariant subspaces. Recall from Section 15.2
that an A-invariant subspace M is called stable if for every ε > 0 there exists
δ > 0 such that any transformation B with ||B − A|| < δ has an invariant
subspace N with the property that θ(M, N) < ε. Using Theorem 16.4.1 and
its proof, we can prove a stronger property of stable invariant subspaces:

Theorem 16.4.4
Let A: C^n → C^n be a transformation with height α, and let M be a stable
A-invariant subspace. Then

sup { inf_{N∈Inv(B)} θ(M, N) / ||A − B||^{1/α} } < ∞

where the supremum is taken over all transformations B: C^n → C^n.

It will be convenient to prove Theorem 16.4.4 in the next section,
following the proof of Theorem 16.4.1.

16.5 PROOFS OF THEOREMS 16.4.1 AND 16.4.4

We start with a preliminary result.

Lemma 16.5.1
Let A: C^n → C^n be a transformation with σ(A) = {0} and dim Ker A = 1.
Then, given a constant M > 0, there exists a K > 0 such that

|λ_0| ≤ K ||B − A||^{1/n}                                            (16.5.1)

for every eigenvalue λ_0 of every transformation B: C^n → C^n satisfying
||B − A|| ≤ M.

Proof. Let B: C^n → C^n be such that ||B − A|| ≤ M. We have A^n = 0
and thus

||B^n|| = ||B^n − A^n|| ≤ K_1 ||B − A||

where K_1 depends only on M and ||A|| (expand B^n − A^n as a telescoping
sum of products). On the other hand, if λ_0 is an eigenvalue of B, then λ_0^n
is an eigenvalue of B^n (as one can easily see by passing to the Jordan form
of B). Hence |λ_0|^n ≤ ||B^n||. If this inequality is combined with the
preceding one and nth roots of both sides are taken, the lemma follows. □

Now we prove Theorem 16.4.1 for the case when A: C^n → C^n is non-
derogatory and has only one eigenvalue.

Lemma 16.5.2
Let σ(A) = {λ_0} and dim Ker(λ_0 I − A) = 1. Then there exists a constant
K > 0 such that the inequality

dist(Inv(A), Inv(B)) ≤ K ||B − A||^{1/n}                             (16.5.2)

holds for every transformation B: C^n → C^n.

Proof. It will suffice to prove (16.5.2) for all B belonging to some
neighbourhood of A. We can assume λ_0 = 0. By Lemma 16.5.1 there exist
K_1 > 0 and ε_1 > 0 such that any eigenvalue λ_0 of a B with ||B − A|| < ε_1
satisfies |λ_0| ≤ K_1 ||B − A||^{1/n}. As the set of nonderogatory transforma-
tions is open, we can assume also that every B with ||B − A|| < ε_1 is non-
derogatory. Now for such a B and its eigenvalue λ_0 let x_0 be the corre-
sponding eigenvector: (B − λ_0 I)x_0 = 0, x_0 ≠ 0. Then dim Ker(B − λ_0 I) =
dim Ker A = 1, and using Theorem 13.5.1, we find that

(16.5.3)

for any eigenvalue λ_0 of any B satisfying ||B − A|| < ε_2, where the positive
constants K_2 and ε_2 ≤ ε_1 depend on A only.
It is convenient to assume that A is the Jordan block with respect to the
standard orthonormal basis in C^n: A = J_n(0). For any B sufficiently close to
A write B − A = [b_{ij}]_{i,j=1}^n. Inequality (16.5.3) shows that there is an
eigenvector x of B corresponding to an eigenvalue λ_0 of the form x =
(1, x_2, x_3, . . . , x_n)^T, where x_2, . . . , x_n ∈ C. The equation (B − λ_0 I)x = 0
has the form

Rewrite the first n − 1 equations in the form

Using |λ_0| ≤ K_1 ||B − A||^{1/n} and Cramer's rule, we see that, for
j = 2, 3, . . . , n, x_j has the following structure:

(16.5.4)

where the coefficients are scalar functions of the n^2 variables b_{ij} such that

for every B satisfying ||B − A|| < ε_2. Here and in the sequel L_0, L_1, . . .
denote positive constants that depend on A only.
Now let x^{(1)}, . . . , x^{(k)} be k eigenvectors of B corresponding to k different
eigenvalues λ_1, . . . , λ_k. Construct new vectors using divided differences:

Let
Proofs of Theorems 16.4.1 and 16.4.4 503

be the homogeneous polynomial of degree k in variables y , , . . . , y,. A


simple induction argument [using (16.5.4)] shows that u(ik) has the following
form (where s = k - j and the first s coordinates in u(ik} are zeros):

Here f u w = f u t v ( b ). The induction argument is based on the following


equality (where we put formally pQ = 1):

Now consider the subspace

Obviously

On the other hand, the matrix


504 Perturbations of Lattices of Invariant Subspaces

is a projector on &, where Yk (resp. Y n _ k ) is the k x k [resp. (n - k) x A:]


matrix formed by the upper k (resp. lower n - k) rows of the n x k matrix

Using formulas (16.5.5), we see that

and thus, Yk is invertible (for B sufficiently close to A). Using the estimates
we easily find from (16.5.5) that

*. Hence

So

Consequently

for every transformation B such that \\B - A\\ < e2 and every ^-invariant
subspace is spanned by its eigenvectors. As B must be nonderogatory, the
last condition means that B has n distinct eigenvalues.
Assume now that B is such that \\B - A\\ < e2, but B does not have n
distinct eigenvalues. In particular, B is nonderogatory. Let {# m }™ =1 be a
sequence of transformations such that \\Bm - A\\ < e2 for all ra, Bm—> B as
m —»°°, and Bm has n distinct eigenvalues for each m. Let M be a
A:- dimensional fi-invariant subspace. As M is a stable subspace (see
Theorem 15.2.1), there exists a sequence {Mm}^ =l, where Mm is a k-
dimensional /?„, -invariant subspace such that 0(Mm,M,)-*Q as m—»<». By
(16.5.6)

Passing to the limit in this inequality as m—»«>, we obtain

hence
Proofs of Theorems 16.4.1 and 16.4.4 505

\\B-A\\<e

Proof of Theorem 16.4.1. We now start to prove Theorem 16.4.1 in full


generality. Let F, and F2 be two closed contours in the complex plane such
that F, H F2 = 0 and the eigenvalues A0 of A lying inside F\ (resp. F2) are
exactly those for which dim Ker( A 0 7 — A) - 1 (resp. dim Ker( A()/ - A) > 1).
Let 6 , > 0 be chosen so that any transformation £:(p"—>(p" with
\\B - A\\ < 5, has no eigenvalues on F, U F2. For such a B, let

and define the transformation S: <p"—»<p" by Sx = S^x for xELtft^A), the


spectral subspace associated with the eigenvalues of A inside Fy. Denote by
F, the projector on ^,(^4) along $12(A)\ then for any jcG <p" with jjjc|| = 1
we have

where Ay is the length of Fy and

(cf. the proof of Theorem 16.3.1). Letting


, we ha Hence for
the transformation 5 is invertible and
j= 1,2. Now put B = S~1BS. Then (cf. the proof of Theorem 16.2.1)

As

it is sufficient to prove Theorem 16.4.1 only for those B: <p"~* <P" that are
close enough to A and satisfy ^t(B) - &j(A), j = 1, 2.
506 Perturbations of Lattices of Invariant Subspaces

Note that for any transformation B sufficiently close to A with


$lj(A), every ^-invariant subspace JV is of the form where
Let M be an /^-invariant subspace, and let
where Then, denoting by Py (resp. Qy^ the orthogonal
projector on the subspace j£, (resp. j£2) in £%,(>!) [resp. £%2(-A)], we have

Hence

Further, we remark that if B is sufficiently close to A, and 2ftj(B) = &tj(A)


for; = 1,2, then B\^(A) is nonderogatory, that is, dimKer(A 0 /- B\^^) = 1
for every eigenvalue A0 of B\.A ( A ) . Indeed, this follows from the choice of
£%j(;4), which ensures that A\^ (A) is nonderogatory and from the openness
of the set of nonderogatory transformations. If, in addition, B e DJ(A), it
now follows that ^|^ 2(/ D and B\^(A} have the same Jordan structure. Hence
in view of (16.5.8) and Theorem 16.3.1, we only need to prove the
inequality

In other words, we can assume that A is nonderogatory. Moreover, using


the arguments similar to those employed above, we can assume in addition
that A has only one eigenvalue, and this case is covered already in Lemma
16.5.2.
Theorem 16.4.1 is proved completely

Proof of Theorem 16.4.4. It is sufficient to prove that there exist


positive constants e and K such that the inequality

holds for every transformation B satisfying ||6 - A\\ s e.


Observe that for any transformations B, B: <f"—»• <p" the inequality

holds. Indeed, for every JVelnv(B) and inv(B) we ha


Transformations with Different Jordan Structures 507

Taking the infimum over all NEIn\(B) it follows that

It remains to take the infimum over all jV"Elnv(Z?) to obtain (16.5.10).


Using the arguments from the proof of Theorem 16.4.1 [when (16.5.10) is
used instead of (16.5.7)], we reduce the proof of (16.5.9) to the case when B
has the property that every root subspace ^(A) of A is a spectral subspace
for B and, moreover, the spectra of B\M ( A ) and B\A ( A ) do not intersect if
A, ^ A 2 . Let A,, . . . , A r be all the distinct eigenvalues of A; then

Also, for every B- invariant subspace Ji we have

Arguing as in the proof of Theorem 16.4.1, we obtain

So in order to prove (16.5.9), we can assume without loss of generality that


A has only one eigenvalue, say . If dim K e 1, then by
Theorem 15.2.1 (here we use the assumption that M is stable) M — {0} or
in which case (16.5.9) is trivial. If dim A,7 - A) = 1,
(16.5.9) follows from Theorem 16.4.1. [Note that in this case B e DJ(A) for
all B sufficiently close to A.]

16.6 DISTANCE BETWEEN INVARIANT SUBSPACES FOR


TRANSFORMATIONS WITH DIFFERENT JORDAN STRUCTURES

In this section we investigate the behaviour of dist(Inv(y4), Inv(Z?)) when A


and B have different Jordan structures or different derogatory Jordan
structures. The basic result in this direction is as follows.

Theorem 16.6.1
We have
508 Perturbations of Lattices of Invariant Subspaces

where the infimum is taken over all pairs of transformations A, B:


such that A is derogatory and B is nonderogatory. [The infimum in
depends on n.]

Proof. Recall that B is nonderogatory if and only if the set of its invariant
subspaces is finite.
By assumption, dim Ker( 0/ - A) > 1 from some eigenvalue of A. Let
x and y be orthonormal vectors belonging to Ker 0 /- A), and put

Clearly, the subspaces M(t) are A invariant.


On the other hand, for every nonderogatory B: it *s easily seen
that the number of /^-invariant subspaces does not exceed

where the maximum is taken over all sequences p , , . . . , ps of positive


integers with pl + • • • + ps - n.
Now for any set of 2" subspaces j£, , . . . , «$£>/• in <p" put

As 0(./#(r), ^) is a continuous function of t on [0,1], so is


mm
is/s2"^(-^(0> •=£/)» hence F(J£,, . . . , j£2,,) is well defined. Let us show
thatF(J£j, . . . , £ 2n) is a continuous function of j£, , . . . , ^ 2 «. For some 5 >0,
let JV) , i = 1 , . . . , 2" be subspaces in such tha i)<8 for each i. The
for i = 1, . . . , 2" an 0, 1], we obtai

First take the minimum with respect to / on the left-hand side and then on
the right-hand side. We obtain

for all /E[0, 1]. Taking the maximum with respect to t on the right-hand
side first, and then on the left-hand side, we obtain

With the roles of j£). and Nt, switched it also follows that
Transformations with Different Jordan Structures 509

that is

which proves the continuity of F^,, . . . , j£,/,). Obviously,


F( ) > 0 for all 56 t. As the set of all 2"-tuples of subspaces in <p" is
compact, there exists an e.>0 such that F for all
i = 1 , . . . , 2". From the definition of F(££\ , . . . , Jz^") 's ^ easily seen that
does not depend on the choice of x and y (because any pair of orthonormal
vectors in can be mapped to any other pair of such vectors by a unitary
transformation). Hence the theorem follows. D

When the transformations A and B are both derogatory, or both non-


derogatory, with different Jordan structures, the situation is more compli-
cated. The following question arises naturally: if [Bm}^n = l is a sequence of
transformations converging to A and such that each Bm has Jordan structure
different from that of A, does it follow that

The next example shows that the answer is, in general, negative.

EXAMPLE 16.6.1. For m = 1, 2, . . . , let

Clearly, for all m, Bm and A have different derogatory Jordan structure (in
particular, different Jordan structure).
One-dimensional ^4-invariant subspaces are Span{e, + Be and
Span{e3}. The orthogonal projector on Span{ej + Be3} is

One-dimensional flm-invariant subspaces are Span{e,}, Span{e 3 ), and


Span{<?! + m~le2 + Be3} where . The orthogonal projector on
SPAN IS
510 Perturbations of Lattices of Invariant Subspaces

Now there exists a constant L, >0 (independent of and m) such that

Two-dimensional /1-invariant subspaces are Span{ej, e2 + fte3} where


and Span{^,,e3 }. Two-dimensional fim-invariant subspaces are
Span{e, + m~ e2, e3}, Span{e,, e3}, and Spanje,, e2 + /3e3}, where
The orthogonal projector on is

There exists a constant L2 > 0 (independent of m) such that

Now the inequalities (16.6.3) and (16.6.4) ensure that for m = 1,2, . . . ,

In the last example both A and Bm are derogatory. Taking

we obtain an example contradicting (16.6.2) with both A and /?,„ non-


derogatory.

16.7 CONJECTURES

In view of Example 16.6.1 the following question arises: Given a transfor-


mation A: with a certain Jordan structure, it is true that for an
other Jordan structure there exists a sequence of linear transformations
(Bm}m = \ tnat have this other Jordan structure, for which Bm—> A, and for
which
Conjectures 511

A similar question arises for the case of derogatory Jordan structure, when
(16.7.1) is replaced by

and a is the height of A. Of course, certain conditions should be imposed on


the Jordan structure (or on the derogatory Jordan structure) of {Bm}^=l to
ensure the existence of a sequence {Bm}^ = l converging to A. A complete
set of such conditions is given in Theorem 15.10.2.
Let us describe the Jordan structure of transformations on <p" in terms of
sequences as in (16.1.1), and let <I> be the set of all such sequences. As in
Section 15.10, for

and for every nonempty set (1, . . . , s] define

Further, for O given by (16.7.2) denote by P(il) the set of all sequences

for which there is a partition of (1, . . . , 5'} into s disjoint nonempty sets
5 such that the following relations hold:

Note that always (one takes = {/?}, p - 1, . . . , 5). The set


consists of O if and only if represents the Jordan structure
corresponding to n distinct eigenvalues, that is, 5 = n.
Note that by Theorem 15.10.2, P represents exactly those Jordan
structures for which there is a sequence of transformations converging to a
given transformation with the Jordan structure
We propose the following conjecture.
512 Perturbations of Lattices of Invariant Subspaces

Conjecture 16.7.1
Let A: be a transformation with the Jordan structure f Then
for any sequence ft' that belongs to P(ft) and is different from ft, there exists
a sequence of transformations [Bm}^=l that converges to A, for which each
Bm has the Jordan structure ft', and for which

It is not difficult to verify this conjecture when A is nonderogatory.


Indeed, without loss of generality we can assume that A is the nx n Jordan
block with eigenvalue zero. In view of Theorem 15.10.2, any sequence ft'
belonging to P(ft) (here ft is the Jordan structure of A) has the form

where s> 1 and m, are positive integers with Given such


consider the following n x n matrix (we denote by Qm and Im the m x m zero
and identity matrices, respectively):

where 17,,. . . , ^ are the 5th roots of e, and the n x n matrix Ae has e in the
(5,1) entry and zeros elsewhere. It is easy to see [by considering, e.g.,
det( /- B()] that, at least for e close enough to zero, the matrix B( has the
Jordan structure ft'. Clearly, 17,, . . . , 17, are the eigenvalues of B f , and
(1, 77,,. . . , 17; ', 0, . . . , 0} is the only eigenvector of B (up to multiplica-
tion by a nonzero scalar) corresponding to 17, for / = 1, . . . , s. It follows (cf.
the remark following Theorem 16.5.1) that

and Conjecture 16.7.1 is verified for the matrix A.


To formulate the corresponding conjecture for derogatory Jordan struc-
ture, we introduce one more notion. Let

and

be two sequences from <£. We say that ft and ft' have the same derogatory
part if the number (say, M) of indices /, 1 </ <s such that r- >2 coincide
with the number of indices ;', ! < / < f such that r j > 2 , and, moreover,
r
f it does not
happen that and have the same derogatory part, we say that and
have different derogatory parts.
Exercises 513

Conjecture 16.7.2
Let the transformation A: have the Jordan structure Then
for every sequence that belongs to P( and such that and have
different derogatory parts there exists a sequence of transformations {Bm}^ =l
that converges to A, for which each Bm has the Jordan structure and for
which

where a is the height of A.

16.8 EXERCISES

16.1 Given an n x n upper triangular Toeplitz matrix A, find all possible


Jordan structures of upper triangular Toeplitz n x n matrices that are
arbitrarily close to A. Are there additional Jordan structures if the
perturbed matrix is not necessarily upper triangular Toeplitz?
16.2 Solve Exercise 16.1 for the class of n x n companion matrices.
16.3 Solve Exercise 16.1 for the class of n x n circulant matrices.
16.4 Solve Exercise 16.1 for the class of n x n matrices A such that A2 — 0.
16.5 Prove or disprove each one of the following statements (a), (b), and
(c) : for every transformation A : there exists an 0 such
that any transformation B: with \\B - A\\ has the prop-
erty that (a) the height of B is equal to the height of A; (b) the height
of B is not greater than the height of A; (c) the height of B is not
smaller than the height of A.
16.6 Prove Conjecture 16.7.1 for the case when A = /3(0).
16.7 Given a transformation A: and a number an A-
invariant subspace M is called a stable if there exist positive constants
K and such that every transformation B: h \\B - A\\ < e
has an invariant subspace ^V satisfying

Show that all invariant subspaces of the Jordan block / n ( re a.


stable if n. (Hint: Use Lemma 16.5.2.)
16.8 (a) For every > l , give an example of an -stable y4-invariant
subspace that is not Lipschitz stable, (b) For every l , give an
example of a stable A-invariant subspace that is not a stable.
16.9 Are there a -stable invariant subspaces with 0< < 1?
Chapter Seventeen

Applications

Chapters 13-16 provide us with tools for the study of stability of divisors for
monic matrix polynomials and rational matrix functions. In this chapter we
develop a complete description of stable divisors in terms of their corre-
sponding invariant subspaces and supporting projectors. Special attention is
paid to Lipschitz stable and isolated divisors. We consider also the stability
and isolatedness properties of solutions of matrix quadratic equations as well
as stability of linear fractional decompositions of rational matrix functions.

17.1 STABLE FACTORIZATIONS OF MATRIX POLYNOMIALS:


PRELIMINARIES

Let L be an n x n monic matrix polynomial, and let

be a factorization of L into a product of n x n monic polynomials


Lt We say that the factorization (17.1.1) is stable if, after
sufficiently small changes in the coefficients of L , the new matrix
polynomial again admits a factorization of type (17.1.1) with only small
changes in the factors L y In the next section we study stability of the
factorization of type (17.1.1) in terms of invariant subspaces for the
linearization of the matrix polynomial L In this section we establish the
framework for this study and prove results on continuity of the correspon-
dence between factorizations and invariant subspaces to be used in the next
section.
Let CL be the companion matrix for L

514
Matrix Polynomials: Preliminaries 515

where L As we have seen in Chapter 5, the triple


(X where

is a standard triple for L . Further, there is a one-to-one correspondence


between the factorizations (17.1.1) of L and chains of CL-invariant
subspaces

with the property that the transformations

are invertible (see Section 5.6). Here, lr<: • • < 12< I are some positive
integers. The correspondence between factorizations (17.1.1) and chains of
CL- in variant subspaces is given by the formulas from Theorem 5.6.1.
Namely, let ^ be a direct complement to Mi+l in M} (j = 1,. . . , r - 1) (by (by
definition, Jll = and let P^: Mj—>Nj be the projector on jVy along
MJ+l. For / = !, . . . , r - l , let p. be the difference lj+l — lj where, by
definition, /, = /. Here / is the degree of L Then for /' = 1,. . . , r — 1 we
have

where

and the transformations are


are determined
determined by
by
516 Applications

(As usual, 8UV


uv denotes the Kronecker symbol:
symbol: and if
u T^ v.) For the last factor L
L r(\) we
we have

where

Also, it is
convenient to use the formulas for the products
AB (cf. the proof of
Theorem 5.6.1). We have for / = 2, . . . , r:

where

(Observe that when i = r formula (17.1.4) coincides with the preceding


formula for Lr.) Also, for / = 2, . . . , r:

where is a direct complement to is the projector on


along and

Our next step is to show that this correspondence between factorizations


of monic matrix polynomials L and chains of certain CL-invariant
subspaces is continuous. To this end define a metric on the set of all
n x n monic matrix polynomials of degree k\
Matrix Polynomials: Preliminaries 517

Now fix a positive integer /. Consider the set °Wr of all r-tuples
where L is a monic matrix polynomial of
degree /, and is a chain of CL-invariant subspaces.
The set Wr is a metric space with the metric

For every increasing sequence ofofpositive


positive integers
integers /,/,
with 12 < I, define the subset Wr^ of Wr consisting of the elements
(Mr, . . . , M2, L from Wr with the additional property that the trans-
formations (17.1.3) are invertible.

Theorem 17.1.1
For each g the set Wr ^ is open in Wr .

Proof. Define the subspace OF by the condition


if and only if here As

for p = 1, . . . , / , it follows that the transformation (17.1.3) is invertible if


and only if Mt is a direct complement to ^,_/ in (pn/. From Theorem 13.1.3 it
follows that, if M , 4- then for € > 0 sufficiently small we also have
for every subspace M\ in with 0(^,., M' i)< Hence
Tr,f is open in Wr.

Now define a map

where is an increasing sequence of positive integers


lr, / r _i, . . . , /2 with /2 < /, as follows. Given
V r>f , the image of this element is (L t where the monic
matrix polynomials L, are taken from the factorization

which corresponds to the chain M rot CL- in variant subspaces. It


518 Applications

is evident that Ff is one-to-one and surjective, so that the map FJ1 exists.
Make the set into a metric space
by defining

If A^ , A"2 are topological spaces with metrics pl, p2, defined on A^ and X2,
respectively, the map G: Xl—> X2 is said to be locally Lipschitz continuous
if, for every x Xl , there is a deleted neighbourhood Ux of jc for which

Obviously, a locally Lipschitz continuous map is continuous. It is easy to see


that the composition of two locally Lipschitz continuous maps is again locally
Lipschitz continuous.

Theorem 17.1.2
The maps Ff and F^1 are locally Lipschitz continuous.

Proof. Given
where the products Ll • • • Lf_v
and L, • • • Lr are given by (17.1.5) and (17.1.4), respectively. Then

We show first that the coefficients of Af, i = 1,. . . , r — 1 are


locally Lipschitz continuous. Observe that in the representations (17.1.4)
and (17.1.5) the coefficients of M, and Af are uniformly bounded in some
neighbourhood of (Mr,. . . , M2, L . It is then easily seen that in order
to establish the local Lipschitz continuity of the coefficients of M, and Ni it is
sufficient to verify the following assertion: for a fixed
there exist positive constants and C such that,
for a set of subspaces satisfying for / = 2,. . . , r,
it follows that

Here %_t = {<0, . . . ,0, a,, . . . , and


and
PM || < C0(<3^, J<.), where P^, (resp. PM) is the projector on ^ (resp. ^,)
along ^ _ / . But this conclusion follows from Theorem 13.1.3. Hence the
coefficients of A/,-(A) and W,-(A) are locally Lipschitz continuous functions of
an element in °Wr^. In particular, L t = M2 and L r = Nr are locally Lipschitz
continuous.
Matrix Polynomials: Preliminaries 519

To prove this property for L 2 , . . . , Lr_l, note that

Regard the equalities (17.1.6) as a system of linear equations

where A and b are formed by the entries of coefficients of M ( (A) and


and the unknown vector x is formed by the
entries of the coefficients of L 2 , . . . , Lr_{. The system (17.1.7) has a unique
solution jc; hence the matrix A is left invertible. So jc = A'b, where A1 is a
left inverse of A. Observe that every matrix B with is
1
also left invertible with a left inverse B satisfying

(cf. the proof of Theorem 13.5.1). This inequality shows that x is a locally
Lipschitz continuous function of because A
and b have this property.
To establish the local Lipschitz continuity of F^ 1 , we consider a fixed
element It is apparent that the polynomial L
L,L 2 • • • Lr will be a Lipschitz continuous function of L ] } . . . , Lr in a
neighbourhood of this fixed element. Further, let Mr C • • • C M2 be the
chain of CL- in variant subspaces corresponding to the factorization L =
L,L 2 • • • Lr. Let NJ = LtLi+l • • • Lr for / = 2, . . . , r, and let

where / is the degree of L and ra, is the degree of 7V(. The projector PM on
Mt along %_m is given by the formula

where A',, = [/ 0 ••• 0] and CN is the companion matrix of Nf. Indeed,


obviously, PM is a projector'and K&rPM=(Sl_m. Let us check that
Im PM = Mt. Recall (see the proof of the converse statement of Theorem
5.3.2) that Mi is given by the formula

As

and
520 Applications

we find that ^ ^ I m = Im PM . Formula (17.1.8) implies the local


Lipschitz continuity of PM '(as a function of (L, , . . . , Lr)} and, therefore, also
of M( (cf. Theorem 13.1.1). D

17.2 STABLE FACTORIZATIONSOF MATRIX POLYNOMIALS:


MAIN RESULTS

We say that a factorization

of a monic matrix polynomial L(A), where L t (A) are monic matrix poly-
nomials as well, is stable if for any e >0 there exists a 6 >0 such that any
monic matrix polynomial L(\) with cr,(L,L)<8 admits a factorization
L(A) = L j ( A ) - • • L r (A), where L,(A) are monic matrix polynomials satis-
fying

max

Here / is the degree of L and L, whereas for / = 2,. . . , r, /. is the degree of


the products Li+l • • • Lr and Li+l • • • Lr.
Recall the definition of a stable chain of invariant subspaces given in
Section 15.6.

Theorem 17.2.1
Let equality (17.2.1) be a factorization of the monic matrix polynomial L(A).
Let (Mr,. . . , M2, L(A)) = F^(Llt. . . , L r ) be the corresponding chain of
CL-invariant subspaces. Then the factorization (17.2.1) is stable if and only if
the chain

is stable.

Proof. If the chain (17.2.2) is stable, then by Theorem 17.1.2 the


factorization (17.2.1) is stable.
Now conversely, suppose that the factorization (17.2.1) is stable but the
chain (17.2.2) is not. Then there exists an e >0 and a sequence of matrice
(Cm}^ = ,, such that lim^^.^, Cm = CL and for any chain
Matrix Polynomials: Main Results 521

of Cm-invariant subspaces the inequality

holds. Put Q = co\[8ill]'i=l and

Then Sm converges to colf^C'"1]^, which is equal to the unit nl x nl


matrix. So without loss of generality we may assume that Sm is nonsingular
for all m. Let and note that

A straightforward calculation shows that SmCmS~^ is the companion matrix


associated with the monic matrix polynomial

From (17.2.3) and the fact that Cm-^ CL it follows that a,(Mm, L)^0. But
then we may assume that for all m the polynomial Mm admits a factorization

where crp.(L(>I(A), L,(A))-»0 for / = 1, . . . , r (here pt is the degree of L,,


which is also equal to the degree of Lim for m = 1, 2, . . .).
Let Mr m C • • • C M2 m be the chain of CM -invariant subspaces corre-
sponding to the factorization (17.2.4), that is

By Theorem 17.1.2 we have

Put Yiim = Sm* M^m for i = 2,. . . , r and m = 1,2,. . . . Then ^ m is an


invariant subspace for Cm for each m. Moreover, it follows from Sm—>/
that, for i = 2 , . . . ,r, d(1^ lm , Mi^m)-*0 as m-»oo. (Indeed,
[Indeed, by Theorem
13.1.1
522 Applications

where Pim is the orthogonal projector on Mim. Now

which tends to zero as m tends to infinity.) But then 6(Tim, jti^—^Q as


m—»°°, for / = 2, . . . , r. This contradicts the choice of C m , and the proof of
Theorem 17.2.1 is complete. D

Comparing Theorem 17.2.1 with Corollary 14.6.2 and Theorem 14.2.1,


we obtain the next result.

Corollary 17.2.2
A factorization

with monic matrix polynomials L(A), L t (A), . . . , Lr(\) is stable if and only
if the corresponding chain

of CL-invariant subspaces satisfies the condition that for every eigenvalue A0


of CL with dim Ker(CL - A 0 /) > 1 and for every i (2< i < r) either Mt D

One can formulate a criterion for stability of factorizations of this kind in


terms of eigenvalues of the polynomials L ( (A) rather than the companion
matrix (as we have done in Corollary 17.2.2), as follows.

Theorem 17.2.3
A factorization (17.2.1) is stable if and only if, for any common eigenvalue A0
of a pair L,( A), Ly( A) (i ^/) we have dim Ker L( A 0 ) = 1.

The proof of Theorem 17.2.3 is based on the following lemma.

Lemma 17.2.4
Let
Matrix Polynomials: Main Results 523

be a transformation from (p"1 into <p™, written in matrix form with respect to
the decomposition Then fm is a stable
invariant subspace for A if and only if for each common eigenvalue \0ofAl
and A 2 the condition dim Ker( A07 - A) — 1 is satisfied.

Proof. It is clear that <p mi is an invariant subspace for A. We know from


Theorem 15.2.1 that (p™1 is stable if and only if for each Riesz projector P of
A corresponding to an eigenvalue A0 with dim Ker(A 0 7— A) ^2, we have
P<pm> = 0 or P<pm' = Im P.
Let P be a Riesz projector of A corresponding to an arbitrary eigenvalue
A0. Also for / = 1 , 2 , let P; be the Riesz projector associated with Aj and A 0 :

for y = l,2, where e > 0 is sufficiently small. Then

l
Observe that for i — 1, 2, the Laurent expansion of (/A — At) at
at A0 has
has the
form

where Qtj are some transformations of Im P. into itself and the ellipsis on
the right-hand side of (17.2.5) represents a series in nonnegative powers of
(A - A 0 ). From (17.2.5) one sees that P has the form

where Ql and Q2 are certain transformations acting from (p"*2 into (p7"1. It
follows that {0} ^ P(pmi ^ Im P if and only if A0 e 0-^4,) n o-(A2). Now
appeal to Theorem 15.2.1 (see first paragraph of the proof) to finish the
proof. D

Proof of Theorem 17.2.3. Let (Mr,... ,M2, L( A)) = F f l ( L l f . . . , L r )


be the chain of CL- in variant subspaces corresponding to the factorization
(17.2.1). From Theorem 17.2.1 (taking into account Corollary 17.2.2) we
know that this factorization is stable if and only if M2,. . . , Mr are stable
Q-invariant subspaces. Let / be the degree of L, let r, be the degree of
L1L2- • • LI, and let
524 Applications

Then <p"' = Mi; + ty. With respect to this decomposition, write

As we know (see Corollary 5.3.3), a(Li+l —• Lr) = a(Cu) and


cr(L, • • • Lj) = cr(C1(). Also (r(CL) = o-(L); the desired result is now ob-
tained by applying Lemma 17.2.4.

Another characterization of stable factorizations of monic matrix poly-


nomials can be given in terms of isolatedness. Consider a factorization

of a monic matrix polynomial L(A) into the product of monic polynomials


L j ( A ) , . . . , Lr(\), and let pi be the degree of L, for / = ! , . . . , r. This
factorization is called isolated if there exists an e > 0 such that any
factorization

of L( A) with monic polynomials M,( A) satisfying <7p.(L,-( A), Af,( A)) < e (it is
assumed that the degree of M, is p() coincides with (17.2.6), that is,

Theorem 17.2.5
A factorization (17.2.6) is stable if and only if it is isolated.

Proof. Let ( M r , . . . , M2, L(A)) = F~'(L,, L 2 ,. . . , L r ) be the corre-


sponding chain of CL-invariant subspaces. By Theorems 17.1.2 and 17.2.1,
the factorization (17.2.6) is isolated if and only if each M( satisfies the
condition that either Mt D % (C L ) or M( n £%A (CL) = {0} for every eigen-
value A0 of CL with dimKer
dimKer(C t - A 0 /)>1. Now it remains to appeal to
Corollary 17.2.2.

We conclude this section with a statement concerning stability of the


property that a given factorization of a monic matrix polynomial is stable.

Theorem 17.2.6
Assume that
Monic Matrix Polynomials 525

is a stable factorization with monic matrix polynomials Lj(A),


L 2 ( A ) , . . . , L r (A). Then there exists an e >0 such that every factorization

with monic matrix polynomials Mj(A), . . . , Mr(\) is stable provided

where for i = 2 , . . . , r, /. is f/ie degree of the products L, • • • Lr and Mt: • • • Mr.

The proof of Theorem 17.2.6 is obtained by combining Theorem 17.2.1


and Corollary 15.4.2.

17.3 LIPSCHITZ STABLE FACTORIZATIONS OF MONIC


MATRIX POLYNOMIALS

A factorization

of the monic matrix polynomial L(A), where L,(A), . . . , Lr(\) are monic
matrix polynomials as well, is called Lipschitz stable if there exist positive
constants e and K such that any monic matrix polynomial L(A) with
cr,(L, L) < e admits a factorization L( A) = L,( A) • • • Lr(\) with monic mat-
rix polynomials Lf(\) satisfying

Obviously, every Lipschitz stable factorization is stable. The converse is not


true in general, as one can see from the results of this section.
We start with the correspondence between the factorization (17.3.1) and
chains of C L -invariant subspaces, where CL is the companion matrix for
L(A), described in Section 17.1.

Theorem 17.3.1
The factorization (17.3.1) is Lipschitz stable if and only if the corresponding
chain of CL-invariant subspaces

is Lipschitz stable.
526 Applications

The Lipschitz stability of (17.3.2) is understood in the sense of Lipschitz


stability of lattices of invariant subspaces (Section 15.6). In the particular
case of chains, the chain (17.3.2) is, by definition, Lipschitz stable if there
exist positive constants c and K [that depend on CL and the chain (17.3.2)]
with the property that every nl x nl matrix A with \\A — CL\\ < e has a chain

of invariant subspaces such that

Proof. If the chain (17.3.2) is Lipschitz stable, then by Theorem 17.1.2


the factorization (17.3.1) is Lipschitz stable. Conversely, assume that the
factorization (17.3.1) is Lipschitz stable but the chain (17.3.2) is not. Then
there exists a sequence (Cm}^ = 1 of nl x nl matrices such that \\Cm - CL\\ <
(1/ra) and for every chain !£r C • • • C !£2 of Cm-invariant subspaces the
inequality

holds. We continue now with an argument analogous to that used in the


proof of Theorem 17.2.1. Putting Sm = col [QC^1] I = l , where
where QQ = col[5n][=1,
we verify that Sm is nonsingular (at least for large m) and that SmCmSml is
the companion matrix associated with the matrix polynomial

where [Uml, Um2,. . . , Uml] = Sml. We assume that Sm is nonsingular for


m = 1, 2 , . . . . Observe that co\[QC'^l]'i=l is the unit matrix /; so it is not
difficult to check that for m - 1 , 2 , . . .

Here and in the sequel we denote certain positive constants independent of


m by Kt, K2,.... As the factorization (17.3.1) is Lipschitz stable, for m
sufficiently large the polynomial M m (A) admits a factorization

with monic matrix polynomials M l n ) (A),. . . , Mrm(\) such that


Monic Matrix Polynomials 527

Let Mr m C • • • C M2m be the chain of CM -invariant subspaces correspond-


ing to the factorization (17.3.5). By Theorem 17.1.2 we have

From (17.3.4), (17.3.6), and (17.3.7) one obtains

Put r,M = S;X« for / = 2, . . . , r and m = 1, 2, . . . . Then y,>in is Cm


invariant for each m. Further, the formula for Sm shows that

Indeed

and (17.3.9) follows. Now (cf. the proof of Theorem 17.2.1)

Using this inequality and (17.3.8), we obtain

a contradiction with (17.3.3). D

Combining Theorem 17.3.1 with Theorems 15.6.2 and 15.5.1, we obtain


the following corollary.

Corollary 17.3.2
For the factorization (17.3.1) and the corresponding chain of CL-invariant
subspaces (17.3.2), the following statements are equivalent: (a) the factoriz-
ation (17.3.1) is Lipschitz stable; (b) all the CL-invariant subspaces
528 Applications

M2, . . . , Mr are spectral; (c) for every e > 0 sufficiently small there exists a
8 >0 with the property every nl x nl matrihx BBwith with \\B-C.\
\\B - CL\\ < 8 has
has a
unique chain of invariant subspaces J i r C J " f r _ l C ' - - C J i 2 such that
ma\(0(Mr,JVr),

Now we are ready to state and prove the main result of this section,
namely, the description of Lipschitz stable factorizations. (Recall the defin-
ition of the metric o-k on matrix polynomials given in Section 17.1.)

Theorem 17.3.3
The following statements are equivalent for a factorization

of the monic n x n matrix polynomial L(A) of degree /, where


L t ( A), . . . , Lr( A) are also monic matrix polynomials of degrees p ^ , . . . , pr,
respectively: (a) the factorization (17.3.10) is Lipschitz stable; (b) cr(Ly) fl
o-(Lk) = 0 for j ^ k; (c) for every € > 0 sufficiently small there exists a 8 > 0
such that any monic matrix polynomial L(A) with cr,(L, L ) < 5 has a
unique factorization L(A) = Lj(A)- • • Lr(\) with the property that
^L^ Lj), . . . , crp (Lr, L r )) < e.

Proof. Observe that for j = 2, . . . , r,

where Mr C • • • C M2 is the chain of CL- in variant subspaces corresponding


to the factorization (17.3.10) (see formula (17.1.4)). Also, denoting by Mj a
direct complement to Mf in M j _ l for j = 2, . . . , r, defining Ml = <p"', and
letting Pj\ M j _ l ^ > M ' j be the projector on jVy' along M ., we have

So, the subspaces M-t are spectral if and only if cr(L;) fl cr(L^) = 0 for; ¥^ k.
Hence the equivalence (a)<£>(b) in Theorem 17.3.3 follows from the
equivalence (a)<£>(b) in Corollary 17.3.2. Similarly, the equivalence
(a)O(c) in Theorem 17.3.3 follows from the corresponding equivalence in
Corollary 17.3.2, taking account of Theorem 17.1.2. D

17.4 STABLE MINIMAL FACTORIZATIONS OF RATIONAL MATRIX


FUNCTIONS: THE MAIN RESULT

Throughout this section VK 0 (A), W 01 (A), W02(\), . . . , W0k(\) are rational


n x n matrix functions that take the value / at infinity. We assume that
Rational Matrix Functions: The Main Result 529

W k( A) and that this factorization is minimal.


following notion of stability of this factorization is natural. Let

be the minimal realizations for W0 and W01, W 0 2 ,. . . , WQk (so d is the


McMillan degree of W0, and 5, is the McMillan degree of W) for i =
! , . . . , & ) . The minimal factorization W0 = W01 • • • W0k is called stable if for
each e > 0 there exists a w >0 such that \\A - A0\\ + \\B - B0\\ + ||C-
C0|| < o> implies that the realization

is minimal and W admits a minimal factorization W = Wl W2 , . . . , Wk , where


for * = 1, . . . , A:, the rational matrix function W)(A) has a minimal reali-
zation

with the extra property that


Since all minimal realizations of a given rational matrix function are
mutually similar (Theorems 7.1.4 and 7.1.5), this definition does not depend
on the choice of the minimal realizations (17.4.1) and (17.4.2).
The next theorem characterizes stability of minimal factorizations in
terms of spectral data.

Theorem 17.4.1
The minimal factorization W 0 (A) = W 01 (A)W 02 (A) • • • W0k(\) is stable if and
only if each common pole (zero) of WOJ and W0p (j ^ p) is a pole (zero) of
W0 of geometric multiplicity 1.

The geometric multiplicity of a pole (zero) A0 of a rational matrix function


W( A) is the number of negative (positive) partial multiplicities of W( A) at A0
(see Section 7.2).
We need some preliminary discussion before starting the proof of
Theorem 17.4.1. As we have seen in Theorem 7.5.1, the minimal fac-
torizations

of W 0 (A) are in one-to-one correspondence with those direct sum decom-


positions
530 Applkations

for which the subspaces «2\ 4- • • • 4- <£p (p = 1,. . . , k) are A0-invariant and
the subspaces ££k 4- 3?k + l + • • • + Z£p (p = /c, . . . , 1 ) are AQ invariant,
where AQ — A0 — B(}C0. Moreover, the minimal factorization (17.4.3) cor-
responding to the direct sum decomposition (17.4.4) is given by

where 7ry is the projector on J£J along JS?, + • • • 4- .$?._, -i- j£J + 1 4- • • • + j£A;
note that the realizations (17.4.5) are necessarily minimal. In the formula
(17.4.5) the transformations COTT;: ^.-» <p", 7r y A 0 ir y : ^->^, and
7r;£?0: <p"—»J^ are understood as matrices of sizes n x / y , /y x /;, and /; x «,
respectively, where /; = dim «2?;-, with respect to some basis in ^.
Let (^4, 5, C) be a triple of matrices of sizes d x 8, 8 x n, n x 6,
respectively. Consider the ordered /c-tuple II = (TT,, . . . , Trk) of projectors in
((7s. We say that FI is a supporting k-tuple of projectors with respect to the
triple of matrices (A, B, C) if 77,77-, = 77-.7r; = 0 for / ^j, TT, + • • • + TT^ = /, the
subspaces Im(7r, + • • • + irp) for p = 1, 2 , . . . , k are A invariant, and the
subspaces lm(7rp + irp + l + • • • + irk), p = 1,. . . , k, are AK invariant, where
A* - A - BC. Clearly, II is a supporting /c-tuple of projectors with respect to
(A(), B0, C0) if and only if the subspaces^ = Im TT-(/ = 1,. . . , k) form a direct
sum decomposition of (p5 as in (17.4.4).
A supporting /c-tuple of projectors II = (TT,, . . . , irk) with respect to
(A, B, C) will be called stable if for every e >0 there exists an o> >0 such
that, for any triple of matrices (A', B', C') of sizes 5 x 5 , S x / j , n x 8,
respectively, with \\A - A'\\ + \\B - B'\\ + \\C - C'\\ < a>, there exists a sup-
porting /c-tuple of projectors IT = (TT^, . . . , TT£) with respect to (A', B',C')
such that

The first step in the proof of Theorem 17.4.1 is the following lemma.

Lemma 17.4.2
Let (17.4.1) be a minimal realization for WQ(\), and let H = (irl,. . . , Trk) be
a supporting k-tuple of projectors with respect to (A0, B0, C0), with the
corresponding minimal factorization

(so that, f o r j = l,...,k, W0;.( A) = / + C 0 7r ; (A/ - Al}) l7r;J50 with respect to


some basis x^\ . . . , x ( , ' } in Im TT;). Then H is stable if and only if the
factorization (17.4.6) is stable.
Rational Matrix Functions: The Main Result 531

The proof of Lemma 17.4.2 is rather long and technical and is given in
the next section.
Next, we make the connection with stable invariant subspaces.

Lemma 17.4.3
Let II = (TTI, . . . , Trk) be a supporting k-tuple of projectors with respect to
(AQ, BQ, CQ). Then II is stable if and only if the A0-invariant subspaces
Im(7r, + • • • + 7T;), / = 1, . . . , / : are stable and the A^-invariant subspaces
Im(7ry + TT / + I + • • • + TTfr), y = 1, . . . , k are stable as well (as before, A^ =
^o ~~ "n^-o)-

Again, it will be convenient to relegate the proof of Lemma 17.4.3 to the


next section.

Proof of Theorem 17.4.1. Let II = (ir l5 . . . , TJ^) be the supporting


A;- tuple of projectors with respect to (A0, J?0, C0) that corresponds to the
minimal factorization

By Lemmas 17.4.2 and 17.4.3 the factorization (17.4.7) is stable if and only
if the y40-invariant subspaces J^ = Im^ + f- 77y), j = 1,. . . , k are stable
and the/io -invariantsubspaces^^ = Im(7ry- + irj+l + • • • + Trk)J= 1, . . . , k
are stable as well.
With respect to the decomposition <ps = Im 17, + Im ir2 + • • • + Im irk,
write

In view of Lemma 17.2.4, ^E. is stable if and only if, for every common
eigenvalue A0 of

we have dim Ker( A07 - A) = 1. So all the subspaces j?,,. . . , &k are stable if
and only if every common eigenvalue of Ajf and App (j ^ p) is an eigenvalue
of AQ of geometric multiplicity 1. Similarly, all the subspaces M I } . . . ,Mk
532 Applications

are stable if and only if every common eigenvalue of A* and A*p with j' ^p
is an eigenvalue of AQ of geometric multiplicity 1. It follows that the
factorization (17.4.7) is stable if and only if every common eigenvalue of Ajf
and App (resp. of A* and A*p} with j ^ p is an eigenvalue of A0 (resp. of
AQ) of geometric multiplicity 1. To finish the proof, observe that the
realizations (17.4.5) are minimal and hence, by Theorem 7.2.3, the poles
(resp. zeros) of WQj(\) coincide with the eigenvalues of TtjA^Tr- = An (resp.
eigenvalues of -nyl^ TT, - A*}). Also, the partial multiplicities of a pole (resp.
zero) A0 of WOJ are equal to the partial multiplicities of A0 as an eigenvalue of
AJJ (resp. A*j). Analogous statements hold for the poles and zeros of WQ(\)
and eigenvalues of AQ and A^. D

17.5 PROOF OF THE AUXILIARY LEMMAS

We start with the proof of Lemma 17.4.2.


Assume that II is stable. Given e > 0, let e' be a positive number that we
fix later. By Lemma 13.3.2 there exists an a)l > 0 with the property that, for
any projector TT'J such that ||TT;' - 7ry|| < o>j, there exists an invert-
ible transformation Sf: <p6-* <p6 with 5y(Im iry) = Im TT'J and \\I - Sj\\ < e'.
We also assume that o>t < min(e', 1). Further, let w2 be the number corres-
ponding to w, as defined by the stability of II.
As the realization (17.4.1) is minimal, in view of Theorem 7.1.5 the
matrix colfQ/l^l^rJ is left invertible, where p is the degree of the minimal
polynomial for A0, and the matrix [B0, AQB0, . . . , AQ~IBQ] is right invert-
ible. Since the left (right) invertibility of a matrix X is stable under small
perturbations [indeed, if ||Y- A"|| < H^'H" 1 , then Y is also left (right)
invertible], there exists a r > 0 such that the realization

is minimal provided \\A — A0\\ + \\B — BQ\\ + ||C — C0|| < T.


Now put &> = min(w1, cu2, T, e') and let (A, B, C) be such that

Then the realization (17.5.1) is minimal. By the stability of II, there exists a
supporting £-tuple of projectors II' = (TT|, . . . , TT£) with respect to
(A, B, C) such that

For y = l , . . . , let 5;: (ps-^(ps be invertible transformations with


Sj(lm 7T;) = Im TT) and ||/ - 5y|| < e'. Now put
Proof of the Auxiliary Lemmas 533

for each y, where the transformation 5; is understood as 5^: Im 7r y —»Im TT'J.


Also, we regard the rational functions (17.5.2) as matrix functions with
respect to the basis introduced for Im it-. We have the minimal factorization

Moreover, writing p = max

Use the inequalities wt < e', w ^ e' and the inequalities ||5y !
||7- S'1]! ^ e'(l - e')"1 (assuming e' < 1; cf. the proof of Theorem 16.2.1)
to get

It remains to choose e' < 1 in such a way that this expression is less than e,
and the stability of factorization (17.4.6) is proved.
Conversely, let the factorization (17.4.6) be stable and assume that II is
not stable. Then there exist an e > 0 and sequences {Am}^=l, {5m}^=1,
{Cmrm^ such that
534 Applications

where O = (irj, . . . , ir'k) is any supporting A:-tuple of projectors with respect


to at least one of the triples (Am, Bm, COT), m = I, 2, . . . . Since (17. 5. 1) is a
minimal realization, we can assume (using Theorem 7.1.5 and the fact that
the full-range and null kernel properties of a pair of transformations are
preserved under sufficiently small perturbation of this pair) that

is minimal for all m. In view of the stability of (17.4.6), we can also assume
that each W A admits a minimal factorization

where for j = 1, 2, . . . k, we obtain

are transformations written as matrices with respect to the basis introduced


for Im iTj with the property that

For fixed m, consider the minimal realization

whre
Proof of the Auxiliary Lemmas 535

obtained from the minimal factorization (17.5.3) [cf. formula (7.3.4)]. As


any two minimal realizations of Wm(\) are similar, there exists an invertible
transformation S : Im TTI + • • • + Im irk —* <p5 such that

Actually, such an Sm is unique, and from the explicit formula for Sm


(Theorem 7.1.3) we find, using (17.5.3) and (17.5.6), that S m -»/as m-»oo.
Now let n("° = (7r\m\ . . . , Tr(m)) be the supporting fc-tuple of projectors
with respect to (Am, Bm, Cm), which corresponds to the minimal factoriz-
ation (17.5.4). Thus, for / = 1,. . . , k we have

and hence ir}m> = 5 IM 7r / 5~ I . We find that £;=1 \\ir}m) - 7ry.||-*0 as wi-»oo, a


contradiction with the choice of (Am, Bm, Cm). Lemma 17.4.2 is proved.
We pass on to the proof of Lemma 17.4.3. Assume that the subspaces
Im^! + • • • + TTJ), I = 1,.. . , k are stable ,40-invariant subspaces and that
Im(7T-y + TT/ + I + • • • + TTk) are stable AQ -invariant subspaces. Arguing by
contradiction, assume that II is not stable. Then there exist an e > 0 and
sequences {v4J~ = 1 , (5J~ = 1 , and {Cm}^, such that

for every supporting A>tuple of projectors (TT[, . . . , Tr'k) with respect to


(Am*Bm,C ), m = l , 2 , . . . . Then clearly Am-*A0 and Axmd=Am-
BmCm-+ AQ as m—»«>. By assumption, and using Theorem 15.6.1, for each
positive integer m there exists a sequence of chains of subspaces {0} C
^im) C • • • C &?\ C ^ - (p5, such that %(r\ . . . , ^ are m invariant
A

Similarly, there exists a sequence of chains of subspaces

such that ^!m), / = 1,. . . , k are A* invariant and


536 Applications

As Im^ + • • • + TT,.) + Im(7r/ + 1 + • • • + irk) = <ps, for j = 1, . . . , k - 1 and


sufficiently large m, we find, using Lemma 13.3.2, that

Now let

It is easy to see that

Furthermore

Indeed, (17.5.12) obviously holds for / = ! . Assuming that (17.5.12) is


nroved for i = n — 1 we have

where is clearly contained in ^m). Take x£&™\ and write x = y + z,


where yE.£(™\ and z^M™. Then z = x - y(=£(pm\ and jce^\ +
CS^n^*,"0). So (17.5.12) is proved. Combining (17.5.11) and (17.5.12),
we find that

Developing an analog of the proof of (17.5.12), one proves that

For sufficiently large m, let TT{M) be the projector on tfjm) along


Then the A:-tuple of projectors
(7r ( r>, IT™, . . . , 7ri m) ) is supporting for (Amt Bm, CJ. Denoting by r^
the projector on ^m) along M(™\ (j = 1,. . . , k - 1), we have r{m) =
7r(,m) + • • • + 7rjm). On the other hand, (17.5.9) and (17.5.10) imply, in view
of Theorem 13.4.3, that for j = 1,. . . , k - 1.

and so lim^^.^ ||7rjm) - ir;-|| =0, a contradiction with (17.5.8).


Conversely, assume that II is stable, but one of the j40-invariant sub-
spaces Im 77j,. . . , Im(7r, + • • • + 7rk), say, lm('rrl + • - • + T^), is not stable.
Rational Matrix Functions: Further Deductions 537

Then there exist an e > 0 and a sequence {Am}fn =l such that \\Am -
A0\\-*Q as m —»<» and

for every A m - in variant subspace M (m = 1, 2, . . .). As n is stable, there


exists a sequence of A>tuples of projectors n (w) = (TT^ , . . . , TT^), m =
1 , 2 , . . . such that n(OT) is supporting for (Am, B0, C0) and

Hence for the ^-invariant subspace Im(7Tim) + • • • + Trjm}) we have

a contradiction with (17.5.13). In a similar way, one arrives at a contradic-


tion if II is stable but one of the ^4^-invariant subspaces Im(7r; + irf+l +
• • • + Trk), j = 1,. . . , k, is not stable.

Lemma 17.4.3 is proved completely.

17.6 STABLE MINIMAL FACTORIZATIONS OF RATIONAL MATRIX


FUNCTIONS: FURTHER DEDUCTIONS

In this section we use Theorem 17.4.1 and its proof to derive some useful
information on stable minimal factorizations of rational matrix functions.
First, let us make Theorem 17.4.1 more precise in the sense that if the
minimal factorization

is stable, then so is every minimal factorization sufficiently close to

Theorem 17.6.1
Assume that (17.6.1) is a stable minimal factorization, and let

and
538 Applications

be minimal realizations of W^A) and W 0j (A). Then every minimal fac-


torization

with minimal realizations

and

is stable provided

is small enough.

The proof of this result is obtained by combining Corollary 15.4.2 with


Lemmas 17.4.2 and 17.4.3.
Let us clarify the connection between isolatedness and stability for
minimal factorizations. The minimal factorization (17.6.1) is called isolated
if the following holds: given minimal realizations

for j = 1,. . . , k, there exists e > 0 such that, if

is a minimal factorization with rational matrix functions Wol(\)- • - W^A)


that admit minimal realizations

such that

then necessarily VV^A) = W 0/ (A) for each ;'. It is easily seen that this
definition does not depend on the choice of the minimal realization (17.6.4).
Rational Matrix Functions: Further Deductions 539

From the proof of Theorem 17.4.1 and the fact that the stable invariant
subspaces coincide with the isolated ones (Section 14.3), it is found that this
property also holds for stable minimal factorizations:

Theorem 17.6.2
The minimal factorization (17.6.1) is stable if and only if it is isolated.

Consider again the minimal factorization (17.6.1) with given minimal


realizations (17.6.2) and (17.6.3) for W0(\) and W 01 (A), . . . , W ot (A). We
say that (17.6.1) is Lipschitz stable if there exist positive constants e and K
with the following property: for every triple of matrices (A, B, C) with
appropriate sizes and with the
realization

is minimal and W(A) admits a minimal factorization W=W1W2---Wk such


that, for ; = 1, . . . , k, W;( A) has a minimal realization

where, for each j

Again, the proof of Theorem 17.4.1, together with the description of


Lipschitz stable invariant subspaces (Theorem 15.5.1), yields a characteriza-
tion of Lipschitz stable minimal factorizations, as follows.

Theorem 17.6.3
For the minimal factorization (17.6.1), the following statements are equiva-
lent: (a) equation (17.6.1) is Lipschitz stable; (b) for every pair of indices
j^p, the rational functions W 0y (A) and W0p(\) have no common zeros and
no common poles; (c) given minimal realizations (17.6.2) and (17.6.3) of
W0( A) and W 01 (A), . . . , W0k( A), for every sufficiently small e >0 there exists
an a) >0 such that for any triple (A, B, C) with \\A- A0\\ + \\B - BQ\\ +
|| C - C0|| < to the realization

is minimal and W( A) admits a unique minimal factorization W( A) =


Wi(A)W 2 ( A) • • • Wk(\) with the property that for j = 1, . . . , k each Wf( A) has
a minimal realization
540 Applications

satisfying

17.7 STABILITY OF LINEAR FRACTIONAL DECOMPOSITIONS OF


RATIONAL MATRIX FUNCTIONS

Let t/(A) be a rational q x s matrix function with finite value at infinity. In


this section we study stability of minimal linear fractional decompositions

where W(X) and V(X) are rational matrix functions of suitable sizes that
take finite values at infinity. (See Sections 7.6-7.8 for the definition and
basic facts on linear fractional decompositions.)
In informal terms, the stability of (17.7.1) means that any rational matrix
function U(X) sufficiently close to U(\) admits a minimal linear fractional
decomposition U(\) = 3^^,(V), where the rational matrix functions W(A)
and V(\) are as close as we wish to W(\) and V(A), respectively. To make
this notion precise, we resort to minimal realizations for the matrix functions
involved. Thus let

be a minimal realization of (/(A), where a, /3, -y, and 8 are matrices of sizes
/ x /, / x 5, q x /, and q x s, respectively. Also, let

and

be minimal realizations of W(A) and V(A). We say that the minimal linear
fractional decomposition (17.7.1) is Lipschitz stable if there exist positive
constants e and K such that any q x s rational matrix function U( A) that
admits a realization

with
Decompositions of Rational Matrix Functions 541

has a minimal linear fractional decomposition

where the rational matrix functions W(X) and V(\) admit realizations

with the property that

It is assumed, of course, that the sizes of two matrices coincide each time
their difference appears in the preceding inequalities.
Since any two minimal realizations of the same rational matrix function
are similar (Theorems 7.1.4 and 7.1.5), it is easily seen that the definition of
Lipschitz stability does not depend on the particular choice of minimal
realizations for £/(A), W(\), and V(A).
It is remarkable that a large class of minimal linear fractional decom-
positions is Lipschitz stable, as opposed to the factorization of monic matrix
polynomials and the minimal factorization of rational matrix functions,
where Lipschitz stability is exceptional in a certain sense (Sections 17.3 and
17.6).

Theorem 17.7.1
Let

be a minimal linear fractional decomposition, where

is a suitable partition of W(A). Assume that the rational matrix functions


W(\) and £/(A) take finite values at infinity, and assume, in addition, that the
matrices VKU(<») and W22(°°) are invertible. Then (17.7.6) is Lipschitz stable.
542 Applications

Proof. We make use of Theorem 7.7.1, which describes minimal linear


fractional decompositions in terms of reducing pairs of subspaces with
respect to the minimal realization (17.7.2). Thus there exists an [a /3]-
invariant subspace Ml C <p' and an -invariant subspace M2 C £', which
are direct complements to each other and such that for some transfor-
mations F:<p'-*<p J and G: (*-+(' with (a + pF)M^ C Ml and (a +
Gy)M2CM2 the formulas (7.7.5)-(7.7.10) hold.
Moreover, one can choose F and G in such a way that Ml is a spectral
invariant subspaces (i.e., a sum of root subspaces) for a + /3F and M2 is
a spectral invariant subspace for a + Gy. Indeed, Theorem 7.7.2 shows
that the linear fractional decomposition (17.7.6) depends on
(Ml,M2\ F\M , Qu G) only, where QM is the projector on Ml along M2.
[Of course, it is assumed that the minimal realization (17.7.2) of U(\) is
fixed.] But the proof of Theorem 15.8.1 shows that there exists a transfor-
mation F': $'-+(* such that F'x = 0 for all x^M{ and the (a + fi(F +
F'))-invariant subspace Ml is spectral. So we can replace F by F + F'.
Similarly, one proves that G can be chosen with spectral (a + Gy)-invariant
subspace M2. In the rest of the proof we assume that F and G satisfy this
additional property.
Now let U( A) be another rational q x s matrix function with finite value
at infinity that admits a realization (17.7.3) with the property (17.7.4). Here
the positive number e > 0 is sufficiently small and is chosen later.
First, observe that for e >0 small enough the realization (17.7.3) is also
minimal. Indeed, by Theorem 7.6.1 we have

which means the right invertibility of [j8, a/3,. . . , a1 l(3] and the left
invertibility of

Since one-sided invertibility of a matrix is a property that is preserved under


small perturbations of that matrix, our conclusion concerning minimality of
(17.7.3) follows.
Recall (Theorem 15.8.1) that the spectral invariant subspaces Ml and M2
for (a + ftF) and for (a + G-y), respectively, are Lipschitz stable. It follows
that there exists a constant /^ >0 such that d + /3F and a + Gy have
invariant subspaces Ml and M2, respectively, with the property that
Decompositions of Rational Matrix Functions 543

provided c is small enough. By Lemma 13.3.2, by choosing sufficiently small


e we ensure that Ml and M2 are again direct complements to each other. In
other words, ( M l , M 2 ) is a reducing pair with respect to the realization
(17.7.3). Let d = d, Dn = D n , D22 = D22, D12 = D12, and

Also, put F=F, G-G. By Theorem 7.7.1 we obtain a minimal linear


fractional decomposition U(\) = ^^(V), where the functions W(\) and
V(\) are given by formulas (7.7.5)-(7.7.10) except that each letter (with the
exception of (p^, <p*, <p') has a tilde. These formulas show that for e > 0
small enough there is a positive constant K satisfying (17.7.5) provided F
and G satisfy the following property: given a basis /,,. . . , fk in M x , there
exists a positive constant K2 (which depends on this basis only) such that

Here Fl = F\M{: Ml-^(s and Gl = QMG: f-^Mlt where QMi stands for
the projector on Ml along M2, are transformations written as matrices with
respect to the basis / , , . . . , fk (and the standard orthonormal bases in <p*
and <£*), and are similarly
defined matrices with respect to some basis g{,. . . , gkin M^ where <2^ is
the projector on Ml along M2.

To prove the existence of a constant /C 2 >0 with the property (17.7.7),


we appeal to Lemma 13.3.2. In view of this lemma, in case M\ and M2 are
sufficiently close to Ml and M2, respectively, there exists a constant ^ 3 >0
(depending on Mx and M2 only) such that

for some invertible transformation S: <p'—» <p' such that SMl=Ml and
It remains to choose

It is instructive to compare Theorem 17.7.1 with Theorems 17.4.1 and


17.6.3. Thus any minimal factorization U(\) = (/!(A)t/ 2 (A), where £/,(A)
and t/ 2 (A) are n x n rational matrix functions with value / at infinity, is
Lipschitz stable in the class of minimal linear fractional decompositions. In
contrast, this minimal factorization need not be Lipschitz stable (or even
stable) in the class of minimal factorizations. The following example illus-
trates this point:
544 Applications

EXAMPLE 17.7.1. Let

It is easily seen that t/(A) admits a minimal factorization

This minimal factorization is not stable because the perturbed rational


matrix function

does not have nontrivial minimal factorizations at all. On the other hand,
(17.7.8) can be represented as a minimal linear fractional decomposition
with

Observe that W(\) has a minimal realization

Now Uf (A) also admits a minimal linear fractional decomposition Uf (A) =


&W((V), where

Moreover, We(\) has a minimal realization

Hence, as predicted by Theorem 17.7.1, the minimal factorization (17.7.8)


is Lipschitz stable when understood as a minimal linear fractional
decomposition.
Isolated Solutions of Matrix Quadratic Equations 545

17.8 ISOLATED SOLUTIONS OF MATRIX QUADRATIC EQUATIONS

Consider the matrix quadratic equation

where A, B, C, D are known matrices of sizes n x «, n x m, m x n, m x m,


respectively, and A!" is a matrix of size m x n to be found.
For any m x n matrix X, let

be the graph of X. The following proposition connects the solutions of


(17.8.1) with invariant subspaces of the (m + n) x (m + n) matrix

Proposition 17.8.1
For an m x n matrix X, the subspace G(X) is T invariant if and only if X
satisfies (17.8.1).

Proof. Assume that G(X) is T invariant. So for every x E <f" there


exists a y E <J7" such that

The correspondence jc—* y is clearly linear; so v = Z* for some n x « matrix


Z, and we have

for all x G <p", or

This implies Z = A + BX and

which means that (17.8.1) holds.


Conversely, if (17.8.1) holds and Z d= A + BX, then (17.8.2) holds. This
implies the T invariance of G(X).
546 Applications

To take advantage of Proposition 17.8.1 in describing isolated solutions


of (17.8.1), we need a preliminary result.

Lemma 17.8.2
Define a function G from the set M m x n of all m x n matrices to the set of all
subspaces in <p" © <pm by G(X) = G(X). Then G is a homeomorphism (i.e.,
a bijective map that is continuous together with its inverse) between Mmxn and
the set of all subspaces M C <p" © <£"" witn tne property that 0(M, $?)<!,
where ^=<p"©{0}.

Here 6(M, jV) is the gap between M and N (see Chapter 13).

Proof. The continuity of G and G ~ ' follows from the easily verified fact
that the orthogonal projector P on G(X) is given by

where L = X*X)~l. Let us check that 0(G(X), %)<l. By Theorem


13.1.1

where Px is the orthogonal projector on 9€. The second supremum is

where = 1, that is, is


uniformly bounded, it follows that ||y|| is bounded away from zero. Hence
the second supremum in (17.8.4) is less than 1.
To show that the first supremum in (17.8.4) is also less than 1, assume
(arguing by contradiction) that

and by formula
Isolated Solutions of Matrix Quadratic Equations 547

But L is invertible, so

and (17.8.5) is impossible. Thus 0(G(X), %) < 1 as claimed.


Now we must show that every subspace M C <p" © Cm with 6(M, X) -
a < 1 is a graph subspace, that is, M — G(X) for some X. First, Theorem
13.1.2 shows that dim M — dim $?= n. Further, assume that P^ jt = 0 for
some *£./#. Denoting by P the orthogonal projector on M, we have
INI = IK*** ~ f*)*ll» which, in view of the condition 0(M, %) = \\PM -
PX\\<1, implies jc = 0. Hence Q = PX\M\ M-* 3C is an invertible linear
transformation. Now M = G((I - P*)O~l). Indeed, if jc£ M, then

On the other hand, if for some M E $?

then the vector v - Q lu has the property that vE.M, P^y = u - Pxv and
therefore, y belongs to
M.

A solution X of (17.8.1) is called isolated if there exists a neighbourhood


of X in the linear space A/ mX/J of all m x n matrices that does not contain
other solutions of (17.6.1). A solution X is called inaccessible if the only
continuous function <p: [0,1]—» MmKn such that <p(0) = X and <p(f) is a
solution of (17.8.1) for every fE[0,1], is the constant function <p(t) = X.
Clearly, every isolated solution is inaccessible.
We now have a characterization of isolated and inaccessible solutions of
(17.8.1).

Theorem 17.8.3
The following statements are equivalent: (a) X0 is an isolated solution of
(17.8.1); (b) XQ is an inaccessible solution of (17.8.1); (c) for every eigen-
value A0 of the matrix

with dim Ker(70 - A 0 /) > 1, either


548 Applications

or

(d) every common eigenvalue of A + BXQ and D — XQB has geometric


multiplicity one as an eigenvalue of T0.

Proof. Making a change of variable Y = X - XQ, we see that X satisfies


(17.8.1) if and only if Y satisfies the equation

Hence XQ is an isolated (or inaccessible) solution of (17.8.1) if and only if 0


is an isolated (or inaccessible) solution of (17.8.6). By Proposition 17.8.1
and Lemma 17.8.2, the correspondence

is a homeomorphism between the set of all solutions Y of (17.8.5) and the set
of ro-invariant subspaces M such that 0 ( M , f f l ) < l , where 2i?=
Hence 0 is an isolated (resp. inaccessible) solution of (17.8.6) if and only if
3€ is an isolated (resp. inaccessible) T0- in variant subspace. An application of
Theorem 14.3.1 and Proposition 14.3.3 shows that (a), (b), and (c) are
equivalent.
Further, the characteristic polynomial of T0 is the product of the charac-
teristic polynomials of A + BXQ and D — XQB. As the multiplicity of A0 as a
zero of the characteristic polynomial of a matrix 5 is equal to the dimension
of £%A (5), it follows that A0 is a common eigenvalue of A + BX0 and
D - XQB if and only if

So (c) and (d) are equivalent.

An interesting particular case appears when B = 0. Then we have the


equation

which is a system of linear equations in the entries of X. It is well known


Isolated Solutions of Matrix Quadratic Equations 549

from the theory of linear equations that equation (17.8.7) either has no
solutions, has a unique solution, or has infinitely many solutions. [In this
case the homogeneous equation

has nontrivial solutions, and the general form of solutions of (17.8.7) is


X0 + y, where X0 is a particular solutions of (17.8.7) and Y is the general
solution of the homogeneous equation.] Clearly, a solution X of (17.8.7) is
isolated if and only if (17.8.8) has only the trivial solution. Using the
criterion of Theorem 17.8.3, we obtain the following well-known result.

Corollary 17.8.4
The equation YA - DY = 0 has only the trivial solution Y = 0 if and only if

Reconsidering the general case of equation (17.8.1), let us give some


sufficient conditions for isolatedness of the solutions.

Corollary 17.8.5
If the matrix

is nonderogatory [i.e., dimKer(r- A07) = 1 for every eigenvalue A0 of T],


then the number of solutions of (17.8.1) (if they exist) is finite and,
consequently, every solution is isolated.

Proof. The matrix T has a finite number of invariant subspaces; namely,


there are exactly II'=1 (dim £% A (jT) + 1) of them, where A I } . . . , Ar are all
the distinct eigenvalues of T. It remains to appeal to Proposition 17.8.1. D
EXAMPLE 17.8.1. Consider the equation

The only one-dimensional T-invariant subspaces are M-l=Span{el} and


M2 = Spanf^j - e3}. Defining 9C = Spanf^}, we have
550 Applications

so by Proposition 17.8.1 and Lemma 17.8.2 there exist only two solutions
given by

As expected from Corollary 17.8.5, the number of solutions of (17.8.9) is


finite

Another particular case of (17.8.1) is of interest. Consider the equation

where Al and A0 are given n x n matrices, and X is an n x n matrix to be


found. Equation (17.8.10) is a particular case of (17.8.1) with B = /,
C = — A0, D = — A,, and A = 0 and is sometimes described as "unilateral."
The matrix T turns out to be just the companion matrix of the matrix
polynomial L(A) = A 2 / + \A{ + A0:

Proposition 17.8.1 gives a one-to-one correspondence between the set of


solutions X of (17.8.10) and the set of T-invariant subspaces of the form

We remark that a 7-invariant subspace M has this form if and only if the
transformation [/ 0]|^: Jt—>$" is invertible. In this way we recover the
description of right divisors of L(A) given in Section 5.3. Similarly, the
equation
Stability of Solutions of Matrix Quadratic Equations 551

considered as a particular case of (17.8.1) gives rise (by using Proposition


17.8.1) to a description of left divisors of the matrix polynomial A2/ +
\Al + A0.

17.9 STABILITY OF SOLUTIONS OF MATRIX QUADRATIC


EQUATIONS

Consider the equation

with the same assumptions on the matrices A, B, C, D as in the preceding


section. We say that a solution X of (17.9.1) is stable if for any e > 0 there is
8 > 0 such that whenever A', B', C", D' are matrices of appropriate size with

thje equation

has a solution Y for which \\Y — X\\ < e. It turns out that the situation with
regard to stability and isolatedness is analogous to that for invariant
subspaces.

Theorem 17.9.1
A solution X of equation (17.9.1) is stable if and only if X is isolated.

Proof. It is sufficient to prove the theorem for the case when C = 0 and
the solution X is the zero matrix (see the proof of Theorem 17.8.3). In this
case G(X) = <P"©{0); so the homeomorphism described in Lemma 17.8.2
implies that X=0 is a stable (resp. isolated) solution of

if and only if <p"0{0} is a stable (resp. isolated) -invariant


subspace. Now use the fact that the isolated invariant subspaces for a linear
transformation coincide with the stable ones (Theorems 15.2.1 and
14.3.1). D

In view of Theorem 7.9.1, statements (c) and (d) in Theorem 17.8.3


describe the stable solutions of equation (17.9.1). In the particular case
when B - 0 we find that the solution X of XA - DX = C is stable if and only
if cr(>l)ncr(D) = 0.
552 Applications

As a solution X of (17.9.1) is stable if and only if the subspace Im is


stable as a /-invariant subspace, where

we can deduce some properties of stable solutions of (17.9.1) from the


corresponding properties of stable T-invariant subspaces. For instance, the
set of stable solutions of (17.9.1) is always finite (it may also be empty), and
the number of stable solutions of (17.9.1) does not exceed the number 7(7")
of n-dimensional stable /"-invariant subspaces, which can be calculated .as
follows. Let A , , . . . , \p be all the distinct eigenvalues of T with algebraic
multiplicities m , , . . . , m p , respectively; then y(T) is the number of
sequences of type (ql,. . . , qp), where qt are nonnegative integers with the
properties that q.-^m^ either <7y = 0 or qj = mj for every ;' such that
dim Ker( A y / - T) > 1, and ql + • • • + qp = n.
Using Corollary 15.4.2, we obtain the following property of stable
solutions of (17.9.1).

Theorem 17.9.2
Let X be a stable solution of (17.9.1). Then every solution Y of equation

where A', B', C', and D' are matrices of appropriate sizes, is stable provided

is small enough.

The notion of Lipschitz stability of solutions of (17.7.1) is introduced


naturally: a solution X of (17.7.1) is called Lipschitz stable if there exist
positive constants e and K such that, for any matrices A\ B', C\ D' of
appropriate sizes with

the equation

has a solution Y satisfying


The Real Case 553

Theorem 17.9.3
A solution of (17.9.1) is Lipschitz stable if and only if &(A + BX) fl

Proof. Again, we can assume without loss of generality that C = 0 and


^ = 0. Formula (17.8.3) shows that the function G introduced in Lemma
17.8.2 is locally Lipschitz continuous; that is, for every m x n matrix Y there
exists a neighbourhood °U of Y and a positive constant K such that

for every Z E W.. The inverse function G is locally Lipschitz continuous as


well. So the zero matrix is a Lipschitz stable solution of (17.9.1) (where
C = 0) if and only if the subspace 9€ — <p" 0 {0} is Lipschitz stable as an
invariant subspace for the matrix

By Theorem 15.5.1, 3C is Lipschitz stable if and only if it is a spectral


invariant subspace for T. This means that o-(A)r\o~(D) = 0. Indeed, if
o-(A) fl cr(D) ^0, then there exists a T-invariant subspace j£ strictly bigger
than $?and such that cr(T\y) = (T\x) [e.g., £= 3C + Span{*0}, where *0 is
an eigenvector of D corresponding to an eigenvalue A0 G <r(A) fl a(D)]. So
2? is not spectral. Conversely, if o-(A) D o-(D) = 0, then with the use of
Lemma 4.1.3, it follows that %C is spectral.

Similarly, one can obtain the following fact from Theorem 15.5.1: the
solution X in (17.9.1) is Lipschitz stable if and only if for every sufficiently
small e > 0 there exists a 5 > 0 such that

implies that the equation

has a unique solution Y satisfying

17.10 THE REAL CASE

In this section we quickly review some real analogs of the results obtained in
this chapter.
554 Applications

Let L( A) be a monic matrix polynomial whose coefficients are real n x n


matrices, and consider a factorization

where L y (A) are monic matrix polynomials with real coefficients. Using the
results of Section 15.9 and the approach developed in the proof of Theorem
17.3.1, one obtains necessary and sufficient conditions for stability of the
factorization (17.10.1) (the analog of Corollary 17.2.2). The definition of a
stable factorization of real monic matrix polynomials is the same as in the
complex case, except that now only real matrix polynomials are allowed as
perturbations of L(A) and as factors in a factorization of the perturbed
polynomial.

Theorem 17.10.1
Let CL be the companion matrix of L( A), and let

be the chain of CL-invariant subspaces in $"' [where I is the degree of L( A)]


corresponding to the factorization (17.10.1). Then (17.10.1) is stable if and
only if the following conditions are satisfied: (a) for every eigenvalue A0 of CL
with geometric multiplicity greater than 1 and for every i (2< / < r), either
Mi D &i0(CL) or M, n ^Ao(CL) = {0}; (b) for every real eigenvalue A0 of CL
with geometric multiplicity of 1 and even algebraic multiplicity, the algebraic
multiplicity of A0 as an eigenvalue of each restriction CL\M (if A0 is an
eigenvalue of CL\M at all) is also even.

In contrast with the complex case (Theorem 17.2.5), not every isolated
real factorization (17.10.1) is stable. Using the description of isolated
invariant subspaces for real transformations (Section 15.9), one finds that
(17.10.1) is isolated if and only if the condition (a) in Theorem 17.10.1
holds.
Now we pass to the stability of minimal factorizations

of a rational matrix function W 0 (A) such that the entries of W 0 (A) are real
for real A. (In short, such rational matrix functions are called real.) The
functions W0i(\) are also assumed to be real, and, in addition, we require
that all rational matrix functions involved are n x n and take value / at
infinity. Again, the stability of (17.10.2) is defined as in the complex case
with only real rational matrix functions allowed. The main result on stability
of (17.10.2) is the following analog of Theorem 17.4.1.
The Real Case 555

Theorem 17.10.2
The minimal factorization (17.10.2) of the real rational matrix function
W0( A) with W0(<x>) = I, where for j = 1, 2,. . . , k, W0j(\) is also a real
rational matrix function with W0/(o°) = I, is stable if and only if the following
conditions hold: (a) each common pole (zero) of W0; and W0p ( j ^ p ) is a
pole (zero) of WQ of geometric multiplicity I', (b) each even order real pole A0
of W0 (resp. of WQ*) is also a pole of each W0j (resp. of each VK^ 1 ) of even
order (if A0 is a pole of WQj or of W^1 at all).

Recall that the geometric multiplicity of a pole (zero) A0 of a rational


matrix function W(\) is the number of negative (positive) partial multi-
plicities of W(\) at A(). In connection with condition (b), observe that the
order of a pole A0 of W 0 (A) is the least positive integer p such that
(A — \Q)PW0(A) is analytic in a neighbourhood of A0. It coincides with the
greatest absolute value of a negative partial multiplicity of W 0 (A) at A 0 , as
one can easily see using the local Smith form for W 0 (A) at A 0 .
We omit the proof of Theorem 17.10.2. It can be obtained in a similar
way to the proof of Theorem 17.4.1 by using the description of stable
invariant subspaces for real transformations presented in Section 15.9.
As in the case of matrix polynomials, not every isolated minimal fac-
torization of a real rational matrix function with real factors is stable (in the
class of real factorizations). It is found that (17.10.2) is isolated if and only if
condition (a) of Theorem 17.10.2 holds. Let us give an example of an
isolated but not stable minimal factorization of real rational matrix func-
tions.

EXAMPLE 17.10.1. Let

One verifies easily that W0(\) = W 01 (A)W 02 (A) and this factorization is
minimal (indeed, the McMillan degree of W0(\) is 2, whereas the McMillan
degree of W 01 (A) and W 02 (A) is 1). Furthermore

so W 01 (A) and W 02 (A) do not have common zeros. It is easily seen that
A0 = 0 is a common pole of W 0 (A), Wol(\), and W 02 (A) and that the only
negative partial multiplicities of W 0 (A), W01( A), and Wm(\) at A 0 are -2, -1
and —1, respectively. Hence condition (a) of Theorem 17.10.2 is satisfied,
556 Applications

but condition (b) is not. It follows that the factorization W0(\) =


W 01 (A)W 02 (A) is isolated but not stable in the class of minimal factorizations
of real rational matrix functions. D

Finally, consider the matrix quadratic equation

where A, B, C, D are known real matrices of sizes n x n, n x m, m x «,


m x m, respectively, and X is a real matrix of size m x n to be found. The
solution of X of (17.10.3) is called isolated if there exists e >0 such that the
set of all real matrices Y satisfying \\X — Y\\ < e does not contain solutions
of (17.10.3) other than X. The solution of (17.10.3) is called stable if for any
c >0 there is 8 >0 such that whenever A', B', C", D' are real matrices of
appropriate sizes with

the equation

has a real solution Y for which || Y - X\\ < e. The isolated and stable
solutions-can be characterized as follows.

Theorem 17.10.3
The solution X0 of (17.10.3) is isolated if and only if every common
eigenvalue of A + BX0 and D — XQB has geometric multiplicity 1 as an
eigenvalue of the matrix

The solution X0 is stable if and only if it is isolated and, in addition, for every
real eigenvalue A0 of T with even algebraic multiplicity the algebraic multiplic-
ity of A0 as an eigenvalue of A + BXQ (or of D — XQB) is even (if A0 is an
eigenvalue of A + BXn, or of D — X0B at all).

In connection with the second statement in this theorem, observe that

and thus the algebraic multiplicity m(T\ A0) for the eigenvalue A0 of T is
equal to the sum of the algebraic multiplicities m(A + BX0; A0) and m(D -
X0B; A0). Consequently, if m(T; A 0 ) is even, then the evenness of one of
Exercises 557

the numbers m(A + BX0; A0) and m(D — X0B; A0) implies the evenness of
the other.
Again, we omit the proof of Theorem 17.10.3. It can be obtained by
using an argument similar to the proofs of Theorems 17.8.3 and 17.9.1,
using the description of stable and isolated invariant subspaces for real
transformations (Section 15.9) and taking into account equation (17.10.4).

17.11 EXERCISES

17.1 Find all stable factorizations (whose factors are linear matrix poly-
nomials) of the monic matrix polynomial

Does L(A) have a nonstable factorization?


17.2 Solve Exercise 17.1 for the matrix polynomial

17.3 Let L( A) be a monic n x n matrix polynomial of degree / such that


CL has nl distinct eigenvalues. Show that any factorization of L(\)
(whose factors are monic matrix polynomials as well) is stable.
17.4 Is any factorization of monic matrix polynomial L(A) stable if CL is
diagonable?
17.5 Show that the factorization L = L1L2L3 of a monic matrix poly-
nomial L(A) is stable if and only if each of the factorizations
L = L 2 M, M = L2L3 is stable, where M = L^1L.
17.6 Is the property expressed in Exercise 17.5 true for Lipschitz
stability?
17.7 Show that a factorization of 2 x 2 monic matrix polynomials L =
L,L2 is stable if and only if one of L{( A 0 ) and L2( A0) is invertible for
every A0 e (p such that L( A 0 ) = 0.
17.8 Let L(A) = /A +Y.i~l0Ai\' be an « x « matrix polynomial whose
coefficients A: are circulant matrices. Show that any factorization

where for / = 1,. . . , r, Ly( A) is a monic matrix polynomial with cir-


culant coefficients, is stable in the algebra of circulant matrices,
in the following sense: for every e > 0 there exists a 8 > 0 such that
every monic matrix polynomial L(A) of degree / with circulant
coefficients that satisfies o-,(L, L)<8 admits a factorization
558 Applications

where L , ( A ) , . . . , Lr(\) are monic matrix polynomials with cir-


culant coefficients and such that

(Here /?; is the degree of Ly and of L y , for ; = 1,. . . , r.)


17.9 Give an example of a nonstable factorization of an n x n matrix
polynomial with circulant coefficients.
17.10 Let L(A) = diagfM^A), M 2 (A)], where M,( A) and M 2 (A) are monic
matrix polynomials of sizes nl x nl and n 2 x « 2 , respectively, and let

be a factorization of L(A), where for ; = l , . . . , r , M1;(A) and


M2J(\) have sizes nl x nl and n2x n2, respectively,
(a) Prove that if (1) is stable, then each factorization

is stable as well.
(b) Show that the converse of statement (a) is generally false.
(c) Show that the factorization (1) is stable in the algebra of all
matrices of type

where A^ (resp. A2) is any n : x nl (resp. n2 x n2) matrix if and


only if each factorization (2) is stable. (Stability in the algebra
of all matrices of type (3) is understood in the same way as
stability in the algebra of circulant matrices, as explained in
Exercise 17.8.)
17.11 Let V be the algebra of all n x n matrices of type

where a; and /3y are complex numbers, and let L(A) be a monic
matrix polynomial with coefficients from the algebra V. Describe
factorizations of L(A) that are stable in the algebra V. (Hint: Use
Exercise 17.7.)
Exercises 559

17.12 Find all stable minimal factorizations of the rational matrix function

Is there a nonstable factorization of this function?


17.13 Prove that every minimal factorization of a scalar rational function
with value / at infinity is stable. (It is assumed that the factors are
scalar rational functions with value / at infinity as well.)
17.14 Let W(\) be a rational matrix function with value / at infinity.
Assume that W(\) has 8 distinct zeros and 8 distinct poles, where 8
is the McMillan degree of W(\). Show that every minimal factoriz-
ation of W(\) is stable.
17.15 Let W( A) be an n x n rational matrix function with value / at infinity
that is a circulant, that is, of type

where w>j(A), vv 2 (A),. . . , wn(\} are scalar rational functions. Show


that every minimal factorization of W(A) is stable in the class of
circulant rational matrix functions.
17.16 Give an example of nonstable minimal factorization of a circulant
rational matrix function with value / at infinity whose factors are also
from this class.
17.17 Let W(A) be a rational matrix function with W(<x>) = /, and let

be a factorization of W(A), where W ; (A) are also rational matrix


functions with value / at infinity. Show that if

is a minimal factorization, then (4) is also minimal. Is the converse


true?
(a) Find all solutions of the matrix quadratic equation

(b) Find all stable solutions of this equation.


(c) Find all Lipschitz stable solutions of this equation.
560 Applications

17.19 (a) Describe all circulant solutions of the equation

with circulant matrices A, B, C, and D.


(b) Can one obtain all circulant solutions of (5), in the event that B
is invertible, by the formula \(D - A)B~l + (\(D - A)2B~2 +
4flCV /2 ?
17.20 Solve the quadratic equation
Notes to Part 3

Chapter 13. This chapter contains mainly well-known results. The main
ideas and results concerning the metric space of subspaces appeared first in
the infinite dimensional framework [see Krein, Krasnoselskii and Milman
(1948); Gohberg and Markus (1959); and also Gohberg and Krein (1957)], and
they are adapted here for the finite-dimensional case. The contents of Sections
13.1 and 13.4 are standard. The exposition presented here is based on that of
Chapters.4 in the authors' book (1982) [see also Kato (1976)]. Theorem 13.2.3
is from Gohberg and Markus (1959). The exposition in Section 13.3 follows
Section 7.2 in Bart, Gohberg, and Kaashoek (1979). Theorem 13.6.3, along with
other related results, was obtained in Gohberg and Leiterer (1972) as a
consequence of general properties of cocycles in certain algebras of continuous
matrix functions. Theorem 13.5.1 appears in the infinite dimensional framework
in Gohberg and Krupnik (1979); here we follow the authors' book (1983b).The
material on normed spaces presented in Section 13.8 is standard knowledge. For
the first part of this section we made use of the exposition in Lancaster and
Tismenetsky (1985).
Chapter 14. The description of connected components in the set of
invariant subspaces (Sections 14.1 and 14.2) is found in Douglas and Pearcy
(1968) [see also Shayman (1982)]. An identification of isolated invariant
subspaces is given in Douglas and Pearcy (1968). Note that in the infinite-
dimensional framework (Hilbert space and bounded linear operators) there
exist inaccessible invariant subspaces that are not isolated [see Douglas and
Pearcy (1968)]. Theorem 14.3.5 was originally proved in the infinite-
dimensional case [Douglas and Pearcy (1968)]. The results on coinvariant
and semiinvariant subspaces in Section 14.5 appear here for the first time.
Chapter 15. Theorem 15.2.1 appeared in Bart, Gohberg and Kaashoek
(1978) and Campbell and Daughtry (1979). The proof presented here
follows the exposition in Bart, Gohberg and Kaashoek (1979). Parts
(a)O(b) of Theorem 15.5.1 was first proved in Kaashoek, van der Mee and
Rodman (1982). The statement of Theorem 15.5.1 and the remaining proof
is taken from Ran and Rodman (1983). Theorem 15.7.1 was proved in
Conway and Halmos (1980). Theorem 15.8.1, although not stated in this
way, was proved in Gohberg and Rubinstein (1985). The material of Section
15.9 is based on Bart, Gohberg and Kaashoek (1979). Theorem 15.10.1 was

561
562 Notes to Part 3

proved in den Boer and Thijsse (1980) and Markus and Parilis (1980).
Theorem 15.10.2 is suggested by Theorem 2.4 in den Boer and Thijsse
(1980).
The results of this chapter play an important role in explicit numerical
computation of invariant subspaces. However, we do not touch the topic of
numerical computation in this book, and refer the reader to the following
sources: Bart, Gohberg, Kaashoek and van Dooren (1980); Golub and
Wilkinson (1976); Ruhe (1970,1970b); van Dooren (1981, 1983); and
Golub and van Loan (1983).
Chapter 16. Most of the results and expositions of the material in this
chapter is taken from Gohberg and Rodman (1986). Corollary 16.1.3
appeared in Brickman and Fillmore (1967). Lemma 16.5.1 is a particular
case of a result due to Ostrowski [see pages 334-335 in Ostrowski (1973)].
Chapter 17. The main results of Section 17.2 (where the case of
factorization into the product of two factors L(A) = Lj(A)L 2 (A) was con-
sidered) are from Bart, Gohberg and Kaashoek (1978). The exposition of
Sections 17.1 and 17.2 follows Gohberg, Lancaster, and Rodman (1982),
where only the case of two factors was considered [see also the authors'
paper (1979)]. The results of Section 17.3 are presented here probably for
the first time. The main part of the contents of Section 17.4, as well as
Theorems 17.6.1 and 17.6.2, is taken from Bart, Gohberg and Kaashoek
(1979). Lemma 17.8.2 is taken from Campbell and Daughtry (1979). The
main results of Section 17.7 are from Gohberg and Rubinstein (1985).
Example 17.10.1 is taken from Chapter 9 in Bart, Gohberg and Kaashoek
(1979).
Part Four

Analytic Properties
of Invariant
Subspaces
This part is devoted to the study of transformations that depend analytically
on a parameter, and to the dependence of their invariant subspaces on the
parameter. We begin with the simplest invariant subspaces, the kernel and
image of the transformation, and this already requires the development of a
theory of analytic families of invariant subspaces. Also, the solution of some
basic problems is required, such as the existence of analytic bases and
analytic complements for analytic families of subspaces. This material is all
presented in Chapter 18 and is probably presented in a book on linear
algebra for the first time. More generally, these results appeared first in the
theory of analytic fibre bundles.
The study of more sophisticated objects and their dependence on the
complex parameter z is the subject of Chapter 19. These include irreducible
subspaces, the Jordan form, and Jordan bases. These results can be viewed
as extensions of perturbation theory for analytic families of transformations.
The final chapter of Part 4 (and of the book) contains applications of the
two preceding chapters to problems that have already appeared in earlier
chapters, but now in the context of analytic dependence on a parameter.
These applications include the factorization of matrix polynomials and
rational matrix functions and the solution of quadratic matrix equations.

563
This page intentionally left blank
Chapter Eighteen

Analytic Families
of Subspaces

In this chapter we study analytic families of transformations and analytic


families of their invariant subspaces. For this purpose, the basic notion of an
analytic family of subspaces is introduced and studied. This notion is of a
local character, and the analysis of its global properties is one of the main
problems of this chapter. In the proofs of Lemmas 18.4.2 and 18.5.2 (only)
we use some basic methods from the theory of infinite-dimensional spaces,
and this leads us beyond the prerequisites in linear algebra required up to
this point. It is shown that the kernel and image of an analytic family of
transformations form two analytic families of subspaces (possibly after
correction at a discrete set of points). Other classes of invariant subspaces
whose behaviour is analytic (at least locally) are also studied. In Section 18.8
we analyze the case when the whole lattice of invariant subspaces behaves
analytically. This occurs for analytic families of transformations with a fixed
Jordan structure.

/*./ DEFINITION AND EXAMPLES

Let O be a domain (i.e., a connected open set) in the complex plane <p, and
assume that for every z E f t a transformation A(z)\ <P"-* 4-"" *s given. We
say that A(z) is an analytic family on (1 if in a neighbourhood Uz of each
point ZQ Efl the transformation valued function A(z) admits representation
as a power series

where A0,Al,..., are transformations from <p" into <pm. Equivalently,


A(z) is said to depend analytically on z in fl if the entries in the matrix

565
566 Analytic Families of Subspaces

representing A(z) in fixed bases in <p" and (pm are analytic functions of z on
the domain ftr Obviously, this definition does not depend on the choice of
these bases.
Now let (M(z)}zen be a family of subspaces in <p". So for every z in ft,
M(z) is a subspace in <p". We say that the family {-M(z)}zefl is analytic on ft
if for every z 0 E f t there exists a neighbourhood Uz Cft of z 0 , a subspace
J< C <p", and an invertible transformation A(z): $"-* <p" that depends
analytically on z in f/z and

It is easily seen that for an analytic family of subspaces {^(z)}zeft the


dimension of M(z) is independent of z. Indeed, (18.1.1) shows that
dim M(z) is fixed for z belonging to the neighbourhood Uz of z0. Since ft is
connected, for any two points z', z"Eft there is a sequence z0 =
z', Z j , . . . , zk = z" of points in ft such that the intersections Uz f~\ Uz ,
i = 1,. . . , k are not empty. Then obviously dim M(z^) = dim M(zi_l), i =
! , . . . , & , and hence dim M(z') = dim M(z").
Let us give some examples of analytic families of subspaces.

Proposition 18.1.1
Let x^z), . . . , xp(z) be analytic functions of z on the domain ft whose values
are n-dimensional vectors. If for every z() €E ft the vectors x}(z0), . . . , xp(zn)
are linearly independent, then

is an analytic family of subspaces.

Proof. Take z 0 G f t , and let yp +l, • . . , yn be vectors in <p" such that


A^ZO), . . . , xp(zQ), yp+l, . . . , } > „ form a basis in <p". Then

As the determinant is a continuous function of its entries and Xj(z),


/'= 1,. . . , p are analytic (and hence continuous) functions of z on H, it
follows that

for all z belonging to some neighbourhood U of z0. Hence


Definition and Examples 567

where M is spanned by the first p coordinate unit vectors in <p", and


Span{jt,(z),. . . , Jtp(z)} is, by definition, analytic on ft.

We see later that the property described in Proposition 18.1.1 is charac-


teristic in the sense that for every analytic family of subspaces, there exists a
basis that consists of analytic vector functions.

Proposition 18.1.2
Let A(z): <£"—»<£"" be an analytic family of transformations on ft, and
assume that dim Ker A(z) is constant (i.e., independent of z for z in ft).
Then Ker A(z) is an analytic family of subspaces (of (p") on ft, whereas
Im A(z) is an analytic family of subspaces (of (p"1) on ft.

Note that dim Ker A(z) is constant on ft if and only if the rank of A(z) is
constant, or, equivalently, the dimension of Im A(z) is constant.

Proof. Write A(z) as an m x n matrix with respect to fixed bases in (p"1


and <p". Take z0 E ft. There exists a nonzero minor of size p x p of A(z0),
where by assumption, p = rank A(z) is independent of z. For simplicity of
notation assume that this minor is in the upper left corner of A(z0). As the
entries of A(z) depend analytically on z, this p x p minor is also nonzero for
all z in a sufficiently small neighbourhood U0 of z 0 . So for any z E U0 [here
we use the assumption that rank A(z) is independent of z], we obtain

where a^z)I is the Jth column of A(z). Let bp +l,. . . , bm be m-dimensional


vectors suetti that fl,(z0), . . . , ap(z0), bp +l, . . . , bm form a basis in (pm, that
is

Again, by the analyticity of flj(z),. . . , a p ( z ) , there exists a neighbourhood


V0 C U0 such that

for all z G VQ. Now for z £ V0 we have

where M = Span{e t ,. . . , ep] C <pm. So, by definition, Im A(z) is an analytic


family of subspaces.
Now consider Ker A(z) and fix a z 0 in ft. There exists a nonzero minor of
568 Analytic Families of Subspaces

size p x p of ^(z 0 ), which will be supposed to lie in the left upper corner of
A(z0). Partition A(z) accordingly:

where £(z), C(z), D(z), and E(z} are matrix functions of sizes p x p,
p x (n - p), (m - p) x m, (m - p) x (H - p), respectively, and are analytic
on ft. For some neighbourhood t/ of z0 we have del B(z) ^ 0 for z E U. If
the vector * , A: e (pp, y E <p"~ p belongs to Ker A(z) and z e U, then

It follows that dim Ker A(z) = dim Ker[-D(z)#(z) 'C(z) + E(z)}. But dim
Ker^4(z) is independent of z and equal to n — p; consequently,
D(z)fi(z)-1C(z) + £(z) = 0 for all z e U. Now, obviously

where - Hence Ker is an analytic family on


ft. D

We see later that the examples of analytic families of subspaces given in


Proposition 18.1.2 are basic. In fact, any analytic family of subspaces is the
image (or the kernel) of an analytic transformation whose values are
projectors.
More generally, without the extra assumption that the dimension of
Ker A(z) is independent of z, the families of subspaces Ker A(z) and
Im A(z), where A(z): $"—*• <pm is an analytic family on ft, are not analytic
on ft. Let us give a simple example illustrating this fact.

exam[le 18.1.1 let

Obviously, A(z): (p -* £ is an analytic family on <p (written as a matrix in


the standard basis in (p ). We have
Analytic Families of Transformations 569

As dim Im A(z) is not constant, the family of subspaces Im A(z) is not


analytic on <p. Similarly, Ker A(z) is not analytic on <p. Note, however, that
by changing Im A(z) at the single point z = 0 (replacing {0} by Span ,
we obtain a family of one-dimensional subspaces Span that is analytic on
<p) (indeed,

Similarly, by changing Ker A(z) at the single point z = 0 we obtain an


analytic family of subspaces Span

18.2 KERNEL AND IMAGE OF ANALYTIC FAMILIES


OF TRANSFORMATIONS

We have observed in the preceding section that, if A(z): <p"—»<p m is an


analytic family of transformations, then, in general, Ker A(z) and Im A(z)
are not analytic families of subspaces. However, Example 18.1.1 suggests
that after a change at certain points Ker A(z) and Im A(z) become analytic
families. It turns out that this is true in general. To make this statement
more precise, it is convenient to introduce some terminology. Let
A(z): <£"-» <P™ be an analytic family of transformations on ft. The singular
set S(A) of A(z) is the set of all z0 £ ft for which

Note that the singular set is discrete; that is, for every z0 £ S(A) there is a
neighbourhood U C ft of z0 such that U fl S(A) = {z0}.

Theorem 18.2.1
Let A(z): <p n —» 4-"" be an analytic family of transformations on ft, and let
r = max zen rank A(z). Then there exist m-dimensional vector-valued func-
570 Analytic Families of Subspaces

tions y \ ( z ) , • . . , yr(z) and n-dimensional vector-valued functions


*j(z), . . . , x n _ r ( z ) that are all analytic on ft and have the following proper-
ties: (a) .Vi(z), . . . , yr(z) are linearly independent for every z E f t ; (b)
jCj(z), . . . , x n _ r ( z ) are linearly independent for every z G ft; (c) for every z
not belonging to the singular set of A(z)

and

For andy z belongting to the singular set of A(z) the inclusing

and

hold.

In particular (Proposition 18.1.1), S p a n { y l ( z ) , . . . , yr(z)} is an analytic


family of subspaces that coincides with Im A(z) outside the singular set oi
A(z}. Similarly, Span{jCi(z), . . . , xn_r(z)} is an analytic family of subspaces
that coincides with Ker A(z) outside S(A).
The proof of Theorem 18.2.1 is based on the following lemma.

Lemma 18.2.2
Let jc,(z), . . . , xr(z) be n-dimensional vector-valued functions that are analy-
tic on a domain ft in the complex plane. Assume that for some z0 G ft, the
vectors *j(z 0 ),. . . , *,(2o) are linearly independent. Then there exist n-
dimensional vector functions ^(z), . . . , yr(z) with the following properties:
(a) y { ( z ) , . . . , yr(z) are analytic on ft; (b) y^z),. . . , yr(z) are linearly
independent for every zGft; (c) Span{_y,(z), . . . , yr(z)} =
Span{x,(z),. . . , x r ( z ) } (C(f") for every z G n ^ n o , where O0 = {zG
H Jc,(z), . . . , xr(z) are linearly dependent}. If, in addition, for some s (^r)
the vector functions JCj(z), . . . , xs(z) are linearly independent for all z G ft,
then y,(z), i — 1,. . . , r can be chosen in such a way that (a)-(c) hold, and
moreover, for all
In the proof of Lemma 18.2.2 we use two classical results (see Chapter 3
of Markushevich (1965), Vol. 3, for example) in the theory of analytic and
meromorphic functions that are stated here for the reader's convenience.
Analytic Families of Transformations 571

Recall that a set SCO is called discrete if for every z G 5 there is a


neighbourhood V of z such that V fl 5 = {z}. (In particular, the empty set
and the finite sets are discrete.) Note also that a discrete set is at most
countable.

Lemma 18.2.3
(Weierstrass's theorem). Let S C H be a discrete set, and for every z 0 G 5 let a
positive integer s(z0) be given. Then there exists a (scalar) function f(z) that is
analytic on H and for which the set of zeros off(z) coincides with S, and for
every z() G 5 the multiplicity of z0 as a zero of /(z) is exactly s(z0).

Lemma 18.2.4
(Mittag-Leffter theorem). Let S C fl be a discrete set, and for every z() G S let
a rational function of type

be given, where k is a positive integer (depending on z 0 ) and ay are complex


numbers (also depending on z 0 ). Then there exists a function f(z) that is
meromorphic on O, for which the set of poles of /(z) coincides with S, and
for every z 0 G S, the singular part of f(z) at z0 coincides with qz (z); that is,
/OO ~ <7z0(z) « analytic at z 0 .

Proof of Lemma 18.2.2. We proceed by induction on r. Consider


first the case r — 1. Let g(z) be an analytic scalar function on O with the
property that every zero of g(z) is also a zero of x } ( z ) having the same
multiplicity, and vice versa. The existence of such a g(z) is ensured by the
Weierstrass theorem given above. Put y to prove
Lemma 18.2.2 in the case r = 1.
Now we can pass on to the general case. Using the induction assumption,
we can suppose that jc t (z),. . . , xr_^(z) are linearly independent for every
z G O. Let X0(z) be an r x r submatrix of the n x r matrix [.^(z), . . . , * r (z)]
such that det X0(zQ) •=£ 0. It is well known in the theory of analytic functions
that the set of zeros of the not identically zero analytic function det XQ(z) is
discrete. Since det X0(zQ) 7^0 implies that the vectors x,(z 0 ),. . . , xr(z0) are
linearly independent, it follows that the set

arr\e linealy dependentt}

is also discrete. Disregarding the trivial case when O0 is empty, we can write
H0 = (£1, £ 2 ,. . .}, where £ E f t , / = 1 , 2 , . . . , is a finite or countable
sequence with no limit points inside fl.
Let us show that for every j = I , 2,. . . , there exist a positive integer Sj
572 Analytic Families of Subspaces

and scalar functions flly(z),. . . , ar_l ;(z) that are analytic in a neighbour-
hood of £y such that the system of /i-dimensional analytic vector functions on
n

has the following properties: for each z ^ £; it is linearly equivalent to the


system *j(z),. . . , xr(z) (i.e., both systems span the same subspace in <p n );
for z = £y it is linearly independent. Indeed, consider the n x r matrix B(z)
whose columns are formed by xv(z),. . . , xr(z). By the induction
hypothesis, there exists an (r — 1) x (r — 1) submatrix B0(z) in the first r — 1
columns of B(z) such that det B0( £;.) ^ 0. For simplicity of notation suppose
that BJz) is formed by the first r - 1 columns and rows in B(z]\ so

where Bj(z), B2(z), and B3(z) are of sizes (r - 1) x 1, (n - r 4-1) x (r - 1),


and ( n - r + l ) x l , respectively. Since fi0(z) *s invertible in a neighbour-
hood of £y, we can write

where W(z) = B3(z) - B2(z)BQ l(z)Bl(z) is an (n - r + 1) x 1 matrix. Let st


be the multiplicity of £; as a zero of the vector function W(z). Consider the
matrix function

Clearly, the columns 6j(z),. . . , br(z) of fi(z) are analytic and linearly
independent vector functions in a neighbourhood V(£/) of £ y . From formula
(18.2.6) it is clear that Spanfjc^z),. . . , xr(z}} = Spanf^^z),. . . , br(z)}
for z G V(£.) ^ T. Further, from (18.2.6) we obtain

and
Analytic Families of Transformations 573

So the columns b,(z),. . ., br(z) of B(z) have the form (18.2.5), where
a^z) are analytic scalar functions in a neighbourhood of £;.
Now choose y^z),... , yr(z) in the form

where the scalar functions gt(z) are constructed as follows: (a) gr(z) is
analytic and different from zero in O except for the set of poles £,, £2, . . . ,
with corresponding multiplicities s{,s2,...; (b) the functions gt(z) (for
/ = l , . . . , r - l ) are analytic in H except for the poles £j, £ 2 ,. . . , and the
singular part of g,(z) at £/ (f°r / = 1, 2,. . .) is equal to the singular part of
a,7(z)£r(z) at Cr
Let us check the existence of such functions g^z). Let gr(z) be the inverse
of an analytic function with zeros at £j, £ 2 ,. . . , with corresponding multi-
plicities slys2,. . . (such an analytic function exists by Lemma 18.2.3). The
functions g { ( z ) , . .. , g r _j(z) are constructed by using the Mittag-Leffler
theorem (Lemma 18.2.4).
Property (a) ensures that y^z),. . . , yr(z) are linearly independent for
every z G H "- { £,, £ 2 ,. . .}. In a neighbourhood of each L we have

where the final ellipsis denotes a vector function that is analytic in a


neighbourhood of £y and assumes the value zero at £;. Formula (18.2.7) and
the linear independence of vectors (18.2.5) for z = ^ ensures that
y\(Cj), • • • , yr(£j) are linearly independent. Finally, the last statement of
Lemma 18.2.2 follows from the proof of the first part of this lemma.

Proof of Theorem 18.2.1. Let A0(z) be an r x r submatrix of A(z) that


is nonsingular for some z e H, that is, del AQ(z) ^ 0. So the set O0 of zeros
of the analytic function det>! 0 (z) is either empty or consists of isolated
points. In what follows we assume for simplicity that AQ(z) is located in the
top left corner of A(z) of size r x r.
Let ^ t (z),. . . , xr(z) be the first r columns of A(z), and let
yt(z),. . . , yr(z) be the vector functions constructed in Lemma 18.2.2. Then
for each z G O "^ ft0 we have
574 Analytic Families of Subspaces

[The last equality follows from the linear independence of Jt^z),. . . , xr(z)
for z E O ^ O0.] We now prove that

Equality (18.2.8) means that for every z E H ^ n o there exists an r x r


matrix B(z} such that

where Y(z) - [y,(z),. . . , yr(z)]. Note that B(z) is necessarily unique.


[Indeed, if B'(z) also satisfies (18.2.10), we have Y(z)(B(z) - B ' ( z ) ) = 0,
and, in view of the linear independence of the columns of Y(z), B(z) =
B'(z).] Further, B(z) is analytic in fl^fl 0 . To check this, pick an arbitrary
z ' E f t ^ n o , and let yo(z) be an rxr submatrix of Y(z) such that
det(y o (z'))^0. [For simplicity of notation assume that F0(z) occupies the
top r rows of F(z).] Then det(y o (z))^0 in some neighbourhood V of z',
and (Y0(z)yl is analytic on z e V. Now Y(z)~L =f [(yo(z))"!, 0] is a left
inverse of Y(z); premultiplying (18.2.10) by Y(z)~L, we obtain

So B(z) is analytic on z E V; since z' G H "^ ft0 was arbitrary, B(z) is analytic
on fl ^n o .
Moreover, B(z) admits analytic continuation to the whole of Q, as
follows. Let z 0 e n o , and let Y(z)~L be a left inverse of Y(z), which is
analytic in a neighbourhood V0 of z 0 . [The existence of such Y(z) is proved
as above.] Define B(z) as y(z)^^4(z) for z e V0. Clearly, B(z) is analytic on
V{}, and for z E V0 ^ (z 0 ), this definition coincides with (18.2.11) in view of
the uniqueness of B(z). So B(z) is analytic on il.
Now it is clear that (18.2.10) holds also for zEfl 0 , which proves
(18.2.9). Consideration of dimensions shows that in fact we have an equality
in (18.2.9), unless rank A(z) < r. Thus (18.2.1) and (18.2.3) are proved.
We pass now to the proof of existence of y r + 1 (z),. . . , yn(z) such that
(b), (18.2.2), and (18.2.4) hold. Let a,(z),. . . , ar(z) be the first r rows of
A(z). By assumption fl,(z")' . . . , a r ( z ) are linearly independent for some
z E11. Apply Lemma 18.2.2 to construct ^-dimensional analytic row func-
tions bj(z), . . . , br(z) such that for all z E H the rows b,(z),. . . , br(z) are
linearly independf ~* — J r <- r» -^ r»
Global Properties of Analytic Families of Subspaces 575

Fix z0 E ft, and let br+l,. . . , br be n-dimensional rows such that the vectors
b,(z 0 ) r ,. . . , br(z0)T, bTr+l,. . . , bTn form a basis in <p". Applying Lemma
18.2.2 again [for *,(z) = b,(z}\ . . . , x,(z) = b,(z)\ *,+1(z) =
fcj+i,. - . , Jtn(z) = bj], we construct n-dimensional analytic row functions
br+i(z), • • • > ^«( z ) sucn tnat tne rt x n matrix

is nonsingular for all z E f t . Then the inverse fi(z)"1 is analytic on ft. Let
y r + 1 (z),. . . , yn(z) be the last (n - r) columns of B(z)~l. We claim that (b),
(18.2.2), and (18.2.4) are satisfied with this choice.
Indeed, (b) is evident. Take z E ft ^ ft0; from (18.2.12) and the construc-
tion of yr+l(z),. . . , yn(z) it follows that

But since z^ft 0 , every row of A(z) is a linear combination of the first r
rows. So in fact

Now (18.2.13) implies that for z £ f t ^ f t

Passing to the limit when z approaches a point from ft0, we find that
(18.2.14), as well as the inclusion (18.2.13), holds for every z E f t . Con-
sideration of dimensions shows that the equality holds in (18.2.13) if and
only if rank A(z) = r. D

18.3 GLOBAL PROPERTIES OF ANALYTIC FAMILIES OF SUBSPACES

In the definition of an analytic family of subspaces the transformation A(z)


and the subspace M depend on z 0 , so the definition of an analytic family of
subspaces has a local character. However, it turns out that for a given
analytic family of subspaces M(z) there exists an analytic family A(z) and a
subspace M independent of z0 for which the equality M(z) = A(z)M holds.
576 Analytic Families of Subspaces

Theorem 18.3.1
Let {^(z)}z6ft bg an analytic family of subspaces (o/<p") on ft. Then there
exist invertible transformations A(z)\ $"—> <p" that are analytic on ft, and a
subspace M C $" such that M(z) = A(z}M, for all z £ ft.

The lengthy proof of Theorem 18.3.1 is relegated to the next two


sections. First, we wish to emphasize that this is a particularly important
result concerning analytic families of subspaces and has many consequences,
some of which we describe now.

Theorem 18.3.2
For an analytic family of subspaces M(z) (of <JT") on ft the following
properties hold: (a) there exist n-dimensional vector functions
jCj(z),. . . , xp(z) that are analytic on ft and such that, for each z Eft, the
vectors ^(z),. . . , xp(z) are linearly independent and

(b) there is an analytic family of projectors P(z) defined on ft such that


M(z) = Im P(z) for all z E ft; (c) for every z E ft there exists a direct
complement N(z) to M(z) in £" such that the family of subspaces N(z) is
analytic.

Proof. Let A(z) and M be as in Theorem 18.3.1, and let xl, . . . , xp be a


basis in M. Then x,(z) = A(z)xi, i — 1,. . . , p satisfy (a). To satisfy (b), put
P(z) = A(z)PA(z)~l, where P is a projector on M. Finally, the family of
subspaces ^V(z) = A(z)N, where N is a direct complement in M in (p",
satisfies (c). D

Note that property (b) [as well as property (a)] is characteristic for
analytic families of subspaces. So, if P(z) is an analytic family of projectors
on ft, then Im P(z) is an analytic family of subspaces. We leave the
verification of this statement to the reader.
In connection with Theorem 18.3.2 (c), note that the orthogonal comple-
ment M{z)^ is usually not an analytic family, as the next example shows.

EXAMPLE 18.3.1. For any z E <p let

Then
Global Properties of Analytic Families of Subspaces 577

which is not analytic. Indeed, if M(z)L were analytic, then for z in a


neighbourhood U of each point z0 E <J7 we would have

where A (z) is a 2 x 2 analytic family of invertible matrices and M is a fixed


one-dimensional subspace that, without loss of generality, may be assumed
equal to Span{el}. So

on U, where he first column of A(z). Hence a

However, the function z is not analytic in U, so (18.3.1) cannot happen. D

In the next section we will need the following generalization of Theorem


18.3.2.

Theorem 18.3.3
Let M(z) and N(z) be analytic families of subspaces (of <p") on ft such that
M(z) C ^V(z) for all z E ft. Then there exist n-dimensional vector functions
jc t (z),. . . , xp(z) [where p - dim N(z) - dim M(z)] that are analytic on ft
and such that, for each z E ft, the vectors JCj(z),. . . , xp(z) form a basis in
N(z) modulo M(z).

Proof. By Theorem 18.3.2 there are bases y\(z),. . . , ys(z) in M(z) and
i>,(z), . . . , v((z} in N(z) that are analytic on ft. By Lemma 18.2.2 there
exist analytic vector functions ys+l(z), . . . , y,(z) such that y ^ ( z ) , . . . , y,(z)
are linearly independent for each z E ft and

Obviously, ys+l(z),. . . , y((z) is the desired analytic basis in ^V(z) modulo


M(z).

We note one more consequence of Theorem 18.3.1.

Corollary 18.3.4
Let Jt^z),. . . , M k ( z ) be analytic families of subspaces (of <p") on ft, and
assume that for each z Eft, (p" is a direct sum of M ^ ( z ) , . . . , M k ( z ) . Then,
578 Analytic Families of Subspaces

given ZQ E H, there exists a family ofinvertible transformations 5(z): <pn —* <p"


that is analytic on ft and for which S(z)Mi(zQ) = M^z} on ft, and S(z0) = /.

Proof. It follows from Theorem 18.3.1 that there exist analytic families
of invertible transformations S,(z): <p"-» (p", / = !,..., k, such that
Sj(z 0 ) = / and 5 ( (z)^,( 2 o) = ^/( z ) f°r a U z E <p. Now the transformation
5(z): <p"-» <p" defined by the property that S(z)x = St(z)x for all x E M^z^}
satisfies the requirements of Corollary 18.3.4.

18.4 PROOF OF THEOREM 18.3.1 (COMPACT SETS)

As a first step towards the proof of Theorem 18.3.1, a result is proved in this
section that can be considered as a weaker version of that theorem. We
say that a function /(z) (whose values may be vectors, or transformations) is
analytic on a compact set ATCft if /(z) is analytic on some open set
containing K.

Theorem 18.4.1
Let K C n be a compact set, and let M(z) C <p" be an analytic family of
subspaces on H. Then there exist vector functions /,(z),. . . , fr(z) E <p" that
are analytic on K and such that / t (z),. . . , fr(z) is a basis in M(z) for every
zG/C.

In turn, we need some preliminaries for the proof of Theorem 18.4.1.


First, we introduce the notion of an incomplete factorization. Let A(z) be an
n x n matrix function that is analytic on a neighbourhood of the unit circle
and is nonsingular on the unit circle. An incomplete factorization of A(z) is a
representation of the form

that holds whenever |z| = 1 and the family +A(z) is nonsingular and analytic
on the disc |z| < 1, and the family A(z) is nonsingular and analytic on the
annulus 1 ^ |z| <°°.

Lemma 18.4.2
Every n x n matrix function A(z) that is analytic and nonsingular on a
neighbourhood of the unit circle admits an incomplete factorization.

Proof. Consider first the case when A(z) is analytic on the disc |z ^ 1.
Let z 0 be a zero of det A(z) with |zj < 1. Then for some invertible matrix
T0 the first row of T0A(z) is zero at the point z0. Put
Proof of Theorem 18.3.1 (Compact Sets) 579

Then A(z) - A^(z) +A^(z)\ moreover, A,(z) is analytic and invertible for
1 < |z| <oo ? +Al(z) is analytic and invertible for |z| ^ 1, and the number of
zeros of det +A j(z) inside the unit circle is strictly less than that of det A(z).
If det +A j(z) ^ 0 for |z| ^ 1, then /l(z) = ^4 ,(z) +/1 ,(2) is an incomplete fac-
torization of A(z). Otherwise, we apply the construction above to ^(z),
and after a finite number of steps an incomplete factorization of A(z) is
obtained.
Now it is easy to prove Lemma 18.4.2 for the case that A(z) is
meromorphic in the disc |z| < i (more exactly, admits a meromorphic con-
tinuation into the disc). Indeed, let z t , . . . , zk be all the poles of A(z) inside
the unit disc with orders al,...,ak, respectively. Then the function
B(z) = nf =1 (z - ziYiA(z) is analytic for z < 1 and thus (according to the
assertion proved in the preceding paragraph) admits an incomplete fac-
torization: B(z) = ~B(z) +B(z). So (18.4.1) with 'A(z) = {n*=1(z -
z-) -Q( '} B(z); +A(z) = *B(z) is an incomplete factorization of A(z).
Now consider the general case. Let e > 0 be such that A(z) is analytic and
invertible in the closed annulus <i> = { z E < p l - e ^ | z | < l + e}. In the
sequel we use some basic and elementary facts about the structure of the set
Cw of aU n x n matrix functions X(z) that are continuous in the closed
annulus 4> and analytic in the open annulus 4> = { z E < p | l — e < | z | < l + e}.
The set Cw is an algebra with pointwise addition and multiplication of
matrices and multiplication by scalars, that is, for z e O and A^z), Y(z) E CM
we define

Introduce the following norm in €„:

where A r (z)G Cw. It is easily seen that this is indeed a norm; that is, the
axioms (a)-(c) of Section 13.8 are satisfied. Moreover

for X, Y£ Cw. In fact, the normed algebra CM is a Banach algebra, which


means that each Cauchy sequence converges in the norm || • ||c to some
function in CM. This follows from the fact that the uniform limit of
continuous functions on <I> is itself a continuous function on <f>, and the limit
of analytic function on <I> which is uniform on each compact set in <J> is itself
analytic on $>.
580 Analytic Families of Subspaces

Let M + be the set of all matrix functions from C^ that admit an analytic
continuation to the set ( z E < p | | z | < l — e} and let M_ be the set of all
matrix functions from Cw that admit an analytic continuation to the set
( z E < p | | z | > l + e}U {°°} and assume the zero value at infinity. It is easily
seen (as for C w ) that M+ and M_ are closed subspaces in the norm || • ||c .
Clearly, M+ n M_ = {0} (here 0 stands for the identically zero n x « matrix
function on <f>). Furthermore, M+ + M_ = Cli>. Indeed, recall that every
function X(z) £ Cw can be developed into the Laurent series

where the functions

belong to M+ and M _ , respectively. Denoting P+(X(z)) = X+(z), we obtain


a projector P+: Cw -» Cw with Im P+ - M+ and Ker P+ = M _ . It turns out
that P+ is bounded, that is

[See page 225 in Gohberg and Goldberg (1981), for example; the proof is
based on Banach's theorem that every bounded linear operator that maps a
Banach space onto itself, and is one-to-one, has a bounded inverse.]
Return to our original matrix function A(z). Clearly A(z)⁻¹ ∈ C_ω, and
the Laurent series A(z)⁻¹ = Σ_{j=−∞}^{∞} z^j A_j converges uniformly in the
annulus 1 − ε ≤ |z| ≤ 1 + ε. Therefore, for some N the matrix function
A_N(z) = Σ_{j=−N}^{N} z^j A_j has the following properties: det A_N(z) ≠ 0 for
1 − ε ≤ |z| ≤ 1 + ε and

where M(z) ∈ C_ω and

Let

⁺N = P₊(M)

Then, since ‖⁺N‖_C ≤ ‖P₊‖·‖M‖_C < 1, the function I + ⁺N is invertible in the
algebra C_ω. (Here I represents the constant n × n identity matrix.) Denote
⁺G = (I + ⁺N)⁻¹. Then ⁺G and (⁺G)⁻¹ belong to the image of P₊. In

particular, ⁺G and (⁺G)⁻¹ are analytic in the disc |z| < 1. Furthermore, one
checks easily that

so the function ⁻G = (I + ⁺N)(I − M) is analytic for 1 ≤ |z| < ∞ and at
infinity. As

⁻G is invertible in C_ω. Since both ⁻G and I belong to the (closed)
subalgebra C⁻ = {αI + Ker P₊ | α ∈ ℂ} of C_ω, also (⁻G)⁻¹ ∈ C⁻. Now
write

and use the fact (proved in the preceding paragraph) that the function
(⁺G(z))⁻¹(A_N(z))⁻¹, which is meromorphic on the unit disc, admits an
incomplete factorization. □

Lemma 18.4.3
Let f₁, …, f_r ∈ ℂⁿ and g₁, …, g_r ∈ ℂⁿ be two systems of analytic and
linearly independent vectors on Ω such that

for z ∈ Ω₀, where Ω₀ ⊂ Ω is a set with at least one limit point inside Ω. Then

Span{f₁(z), …, f_r(z)} = Span{g₁(z), …, g_r(z)}   (18.4.3)

for every z ∈ Ω and

where A(z) is an r × r matrix function that is invertible and analytic on Ω.
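Equation (18.4.4) is missing here. Judging from the uniqueness claim and the use of Cramer's formulas in the proof below, it is presumably the change-of-basis relation

\[
% assumption: direction of the change of basis inferred from the proof
[\,f_1(z), \ldots, f_r(z)\,] = [\,g_1(z), \ldots, g_r(z)\,]\, A(z), \qquad z \in \Omega,
\tag{18.4.4}
\]

read as an equality of n × r matrix functions.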

Proof. Consider the system Π = {f₁, …, f_r, g₁, …, g_r} of 2r n-
dimensional vectors. Then rank Π(z) = r for z ∈ Ω₀. On the other hand, the
set {z₀ ∈ Ω | rank Π(z₀) < max_{z∈Ω} rank Π(z)} is discrete. Thus
r = max_{z∈Ω} rank Π(z), and (18.4.3) holds for every z ∈ Ω because both systems
f₁(z), …, f_r(z) and g₁(z), …, g_r(z) are linearly independent. Consequently,
there exists a unique matrix function A(z) such that (18.4.4) holds. It
remains to prove that A(z) is analytic on Ω. Let z₀ ∈ Ω, and suppose, for
example, that the square matrix X(z) formed by the upper r rows of

[g₁(z), …, g_r(z)] is invertible for z = z₀. Computing A(z) in a neighbour-
hood of z₀ by Cramer's formulas, we see that A(z) is analytic in a
neighbourhood of z₀. Thus A(z) is analytic on Ω. □

Proof of Theorem 18.4.1. Without loss of generality, we can suppose
that K is a connected set (otherwise consider a larger compact set). Fix
z₀ ∈ K, and let N₀ be some direct complement to M(z₀) in ℂⁿ. Then

is a direct sum decomposition for every z ∈ K except maybe for a finite set
of points z₁, …, z_k. Indeed, by the definition of an analytic family of
subspaces, for every η ∈ K there exists a neighbourhood U_η of η and an
analytic and invertible matrix function B_η(z) defined on U_η such that
B_η(z)M = M(z) on U_η, where M is a fixed subspace in ℂⁿ. We can assume
[by changing B_η(z) if necessary] that the subspace M is independent of η.
[Here we use the fact that dim M(z) is constant because of the connected-
ness of Ω.] Actually, we assume M = M(z₀). Let x₁, …, x_r be some basis in
M(z₀), and let x_{r+1}, …, x_n be a basis in N₀. Then for z ∈ U_η the subspaces
M(z) and N₀ are direct complements to each other if and only if

Two cases can occur: (a) D_η(z) ≡ 0 for z ∈ U_η; (b) D_η(z) ≢ 0, and then we
can suppose (taking U_η smaller if necessary) that D_η(z) = 0 only at a finite
number of points of U_η. Let us call the points η for which (a) holds points of
the first kind, and the points η for which (b) holds points of the second kind.
Since K is connected, all η ∈ K are of the same kind, and since z₀ is of the
second kind, all η ∈ K are of the second kind. Further, let U_{η₁}, …, U_{η_l} be a
finite covering of the compact set K. Since D_{η_j}(z) = 0 only at a finite number
of points z in U_{η_j}, j = 1, …, l, we find that (18.4.5) holds for every z ∈ K
except possibly for a finite number of points z₁, …, z_k ∈ K.
By the definition of an analytic family of subspaces, there exist neighbour-
hoods U(z₁), …, U(z_k) of z₁, …, z_k, respectively, and functions
B^{(1)}(z), …, B^{(k)}(z) that are invertible and analytic on U(z₁), …, U(z_k),
respectively, such that

Let x₁^{(j)}, …, x_r^{(j)} be some basis of the subspace M(z_j), and let
g_i^{(j)}(z) = B^{(j)}(z)x_i^{(j)} (i = 1, …, r; z ∈ U(z_j); j = 1, …, k). Then for ρ > 0 small
enough we have

as long as |z − z_j| ≥ ρ for j = 1, …, k. Let

For every z ∈ K ∖ S let P(z) be the projector on M(z) along N₀. Then we
claim that P(z) is an analytic function on K ∖ S.
Indeed, we have to prove this assertion in a neighbourhood of every
μ₀ ∈ K ∖ S. Let U₀ be a neighbourhood of μ₀ in the set K ∖ S such that,
when z ∈ U₀, M(z) = B(z)M(μ₀) for some analytic and invertible matrix
function B(z) on U₀. The matrix function B̃(z) defined on U₀ by the
properties that B̃(z)x = B(z)x for all x ∈ M(μ₀) and B̃(z)y = y for all y ∈ N₀
is analytic and invertible. As P(z) = B̃(z)P₀(B̃(z))⁻¹, where P₀ is the
projector on M(μ₀) along N₀, the analyticity of P(z) on U₀ follows.
Let us now prove that there exist vector functions f₁^{(0)}(z), …, f_r^{(0)}(z)
that are analytic on K ∖ S and for which

where z ∈ K ∖ S. Indeed, let z₀ ∈ K ∖ S be a fixed point. Then
dim Im P(z₀) = r; let g₁^{(0)}(z), …, g_r^{(0)}(z) be columns of P(z) that are
linearly independent for z = z₀. In view of Lemma 18.2.2, there exist
analytic and linearly independent vector functions f₁^{(0)}(z), …, f_r^{(0)}(z)
defined on K ∖ S such that

for every z ∈ K ∖ S, except maybe for a finite set of points. (The set of
exceptional points is at most finite because of the compactness of K ∖ S.)
But from the choice of g₁^{(0)}, …, g_r^{(0)} it follows that

for every z ∈ K ∖ S, except perhaps for a finite set of points [viz., those
points z for which the vectors g₁^{(0)}(z), …, g_r^{(0)}(z) are not linearly indepen-
dent]. Thus

for every z ∈ K ∖ S except maybe for a finite number of points. As both
sides of (18.4.6) are analytic families of subspaces on K ∖ S (Proposition
18.1.1), it is easily seen that, in fact, (18.4.6) holds for every z ∈ K ∖ S.
Consider now the systems {f₁^{(0)}(z), …, f_r^{(0)}(z)} and
{g₁^{(1)}(z), …, g_r^{(1)}(z)}. These systems form two bases for M(z) that are
analytic in a neighbourhood of the circle |z − z₁| = ρ. Therefore, by Lemma

18.4.3 there exists an r × r matrix function A(z) analytic and invertible on a
neighbourhood U of the set {z ∈ ℂ | |z − z₁| = ρ} and such that, for all

By Lemma 18.4.2, the function A(z) admits an incomplete factorization
relative to the circle {z | |z − z₁| = ρ}: A(z) = ⁻A(z)·⁺A(z) (|z − z₁| = ρ).
In view of (18.4.7), we find that, when |z − z₁| = ρ

Clearly, the functions f₁^{(1)}(z), …, f_r^{(1)}(z) can be continued analytically
to the set K ∖ (S₂ ∪ ⋯ ∪ S_k). Moreover, since ⁺A(z) [resp. ⁻A(z)] is invert-
ible for |z − z₁| < ρ (resp. |z − z₁| > ρ), the set f₁^{(1)}(z), …, f_r^{(1)}(z) is
linearly independent for every z ∈ K ∖ (S₂ ∪ ⋯ ∪ S_k). Furthermore, for
any z ∈ K ∖ (S₂ ∪ ⋯ ∪ S_k), we obtain

Now take the point z₂ and apply similar arguments, and so on. After k steps
one obtains the conclusion of Theorem 18.4.1. □

18.5 PROOF OF THEOREM 18.3.1 (GENERAL CASE)

In this section we finish the proof of Theorem 18.3.1. The main idea is to
pass from the case of compact sets (Theorem 18.4.1) to the case of a general
domain Ω. To this end we need some approximation theorems.
A set M ⊂ ℂ is called finitely connected if M is connected and ℂ ∖ M
consists of a finite number of connected components. A set N ⊂ M is called
simply connected relative to M if for every connected component Y of ℂ ∖ N
the set Y ∩ (ℂ ∖ M) is not empty. The first of the necessary approximation
theorems is the following.

Lemma 18.5.1
Let K ⊂ Ω be a finitely connected compact set that is also simply connected
relative to Ω. Let Y₁, …, Y_s be all the bounded components of ℂ ∖ K, and,
for j = 1, …, s, let z_j ∈ Y_j ∖ Ω be fixed points. Let A(z) be an m × n matrix
function that is analytic on K. Then for every ε > 0 there exists a rational
matrix function B(z) of size m × n such that B(z) is analytic on
ℂ ∖ {z₁, …, z_s} and, for any z ∈ K,

Proof. Without loss of generality we will suppose that m = n = 1; that
is, the functions A(z) and B(z) are scalars. We prove that it is possible to
choose a rational function of the form

where the x_{jv} ∈ ℂ, such that |A(z) − R(z)| < ε for any z ∈ K. Let
U ⊂ Ω ∖ {z₁, …, z_s} be a neighbourhood of K whose boundary ∂U con-
sists of s + 1 closed simple rectifiable contours. Then for z ∈ K, we obtain

Since this integral can be uniformly approximated by Riemann sums, we
have to prove only that the function (η − z)⁻¹ can be uniformly approxi-
mated by functions of the form Σ_{v=0}^{k} (z − z_j)^{−v} x_v, x_v ∈ ℂ, where
η ∈ ∂U ∩ Y_j (j = 1, …, s), and that (η − z)⁻¹ can be approximated uniformly by
the polynomials Σ_{v=0}^{k} z^v x_v (x_v ∈ ℂ) where η ∈ ∂U ∩ (ℂ ∖ (K ∪ Y₁ ∪ ⋯ ∪
Y_s)). But this assertion follows from Runge's theorem [Chapter 4 of
Markushevich (1965), Vol. 1], which states that, given a simply connected
domain F in ℂ ∪ {∞} and a point ζ in the interior of (ℂ ∪ {∞}) ∖ F, any
analytic function f(z) on F is the limit of a sequence of rational functions
with their only pole at ζ, and the convergence of this sequence to f(z) is
uniform on every compact subset of F. Indeed, for j = 1, …, s the set
bounded by the contour ∂U ∩ Y_j is simply connected, as is the set
(ℂ ∪ {∞}) ∖ (K ∪ Y₁ ∪ ⋯ ∪ Y_s). □
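The two displays missing from this proof are presumably the rational approximant and the Cauchy integral formula; a reconstruction consistent with the surrounding argument (the coefficients x_{jv} and the degree k are assumptions):

\[
% assumption: form of R(z) with poles only at z_1,...,z_s and infinity
R(z) = \sum_{v=0}^{k} x_{0v}\, z^v
     + \sum_{j=1}^{s} \sum_{v=1}^{k} x_{jv}\,(z - z_j)^{-v},
\qquad
A(z) = \frac{1}{2\pi i} \int_{\partial U} \frac{A(\eta)}{\eta - z}\, d\eta \quad (z \in K).
\]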
Lemma 18.5.2
Let K and z₁, …, z_s be as in Lemma 18.5.1. If A(z) is an n × n matrix
function that is analytic on K and invertible for every z ∈ K, then for every
ε > 0 there exists an analytic and invertible matrix function B(z) defined on
ℂ ∖ {z₁, …, z_s} such that (18.5.1) holds for any z ∈ K.

Proof. Denote by G the group of all n × n matrix functions M(z) that
are analytic on K and invertible for every z ∈ K, together with the topology
induced by the norm ‖M‖_G = max_{z∈K} ‖M(z)‖. Let G_I be the connected
component of the topological space G that contains I, the constant n × n
identity matrix. In fact

G_I = {X ∈ G | there exist an integer ν > 0 and M₁, …, M_ν ∈ G, ‖M_j‖_G < 1,
such that X = (I − M₁) ⋯ (I − M_ν)}   (18.5.3)

Indeed, denoting the right-hand side of (18.5.3) by G₀, let us prove first
that G₀ is both a closed and an open set in G. Let F ∈ G₀ and H ∈ G be
such that ‖H − F‖_G < ‖F⁻¹‖_G⁻¹. Then H = (I − M)F, where M = I − HF⁻¹.
We have

that is, H ∈ G₀. So G₀ is open. Suppose now that F_j ∈ G₀, j = 1, 2, …, and
‖F_j − F‖_G → 0 for some F ∈ G. Let j₀ be large enough so that
‖F_{j₀} − F‖_G < ‖F⁻¹‖_G⁻¹. Then F = (I − M)F_{j₀}, where ‖M‖_G = ‖I − F F_{j₀}⁻¹‖_G < 1;
that is, F ∈ G₀. So G₀ is a closed set.
Now let us prove that G₀ is connected. Let

then

is a continuous function that connects X and I in G₀. So G₀ is connected and
thus is the connected component of G that contains I. So (18.5.3) is proved.
As a side observation, note that G₀ is also a subgroup of G. Indeed, let
X, Y ∈ G₀. Then the set X·G₀⁻¹ is connected and contains I; therefore,
X·G₀⁻¹ is contained in the connected component G₀. In particular XY⁻¹ ∈ G₀,
so G₀ is a subgroup of G.
Now let A(z) be as in Lemma 18.5.2, and suppose first that A ∈ G_I. Then

for some M₁, …, M_ν ∈ G with ‖M_j‖_G < 1 for j = 1, …, ν. Rewrite this
representation in the form

A = exp(ln(I − M₁)) ⋯ exp(ln(I − M_ν))

where

By Lemma 18.5.1, for each j = 1, …, ν there exists a rational n × n matrix
function D_j whose poles are contained in {z₁, …, z_s, ∞}, with the property
that D_j approximates the analytic function ln(I − M_j(z)) well enough to
ensure that the analytic matrix function B(z) = exp(D₁(z)) ⋯ exp(D_ν(z))

satisfies (18.5.1) for every z ∈ K. Clearly, B(z) is invertible for every
z ∈ ℂ ∖ {z₁, …, z_s}, so the lemma is proved in the case A(z) ∈ G_I.
We now pass to the general case. Let G_A be the connected component of
G that contains A(z). It suffices to show that there exists an n × n matrix
function D(z) that is analytic and invertible in ℂ ∖ {z₁, …, z_s} and such
that D(z) ∈ G_A. Indeed, then A(z)D(z)⁻¹ ∈ G_I, and as we have seen
already, there exists an analytic and invertible matrix function B̃(z) in
ℂ ∖ {z₁, …, z_s} with the property that ‖B̃ − AD⁻¹‖_G < ε‖D‖_G⁻¹. The
matrix function B(z) = B̃(z)D(z) is the desired one.
Thus let us prove the existence of D(z). According to Lemma 18.5.1, for
every δ > 0 there exists a rational matrix function D₀(z) that is analytic on
ℂ ∖ {z₁, …, z_s} and such that ‖D₀(z) − A(z)‖ < δ when z ∈ K. Choose
δ > 0 small enough to ensure that D₀(z) is invertible for z ∈ K and
D₀ ∈ G_A. Since D₀(z) is a rational function, det D₀(z) ≠ 0 for every
z ∈ ℂ ∖ {z₁, …, z_s} except perhaps for a finite set of points
η₁, …, η_m ∈ ℂ ∖ {z₁, …, z_s}, which do not belong to K.
Denote by Y(η₁) the connected component of ℂ ∖ K that contains η₁,
and let z(η₁) be the point from {∞, z₁, …, z_s} that belongs to Y(η₁). Let
ρ > 0 be such that the disc {z ∈ ℂ | |z − η₁| < ρ} is contained in
Y(η₁) ∖ {z(η₁), η₂, …, η_m}. By Lemma 18.4.2 there exists an incomplete
factorization of D₀(z) with respect to the circle |z − η₁| = ρ:

where ⁺D₀(z) is analytic and invertible in the disc {z ∈ ℂ | |z − η₁| < ρ} and
⁻D₀(z) is analytic and invertible for ρ ≤ |z − η₁| ≤ ∞. The equality
⁻D₀ = D₀(⁺D₀)⁻¹ shows that ⁻D₀ admits analytic continuation to the whole of ℂ,
and ⁻D₀(z) is invertible for all z ≠ η₁. Also, ⁺D₀ is analytic and invertible on
ℂ ∖ {z₁, …, z_s, η₂, …, η_m} ⊃ K.
Let y(t), 0 ≤ t ≤ 1, be a continuous function with values in Y(η₁) such that
y(0) = η₁, y(1) = z(η₁). Then the formula

defines a continuous map F: [0, 1] → G with F₀ = D₀. Hence
D₁ = F₁·⁺D₀ ∈ G_A. As ⁺D₀ is invertible on ℂ ∖ {z₁, …, z_s, η₂, …, η_m}
and F₁(z) = ⁻D₀(z + η₁ − z(η₁)) is invertible on ℂ ∖ {z(η₁)}, it follows that
D₁(z) is analytic and invertible on ℂ ∖ {z₁, …, z_s, η₂, …, η_m}. Repeating
this argument m − 1 times with respect to the points η₂, …, η_m, we obtain
the desired function D(z). □

The following lemma is the main approximation result that will be used in
the transition from compact sets in Ω to the domain Ω itself.

Lemma 18.5.3
Let K ⊂ Ω be a finitely connected compact set that is also simply connected
relative to Ω. Let M ⊂ ℂⁿ be a fixed subspace and A(z) be an n × n matrix
function that is analytic and invertible on K and such that A(z)M = M for
z ∈ K. Then for every ε > 0 there exists a matrix function B(z) that is analytic
and invertible on Ω and such that

for all z ∈ K and B(z)M = M for all z ∈ Ω.

Proof. Without loss of generality, we can assume that
M = Span{e₁, …, e_r} for some r. Then, in the 2 × 2 block matrix representation
with respect to the direct sum decomposition M ∔ M^⊥ = ℂⁿ,
we have

Because A(z) is invertible when z ∈ K, so are A₁(z) and A₂(z). Use Lemma
18.5.2 to find matrix functions B₁(z) and B₂(z) that are analytic and
invertible on Ω and such that ‖B_i(z) − A_i(z)‖ < ε/3 for z ∈ K, i = 1, 2. By
Lemma 18.5.1 there exists an analytic matrix function B₁₂(z) on Ω such that
‖B₁₂(z) − A₁₂(z)‖ < ε/3 for z ∈ K. Then

satisfies the requirements of Lemma 18.5.3. □

The following result allows us to pass from the compact sets in Ω to Ω
itself.

Lemma 18.5.4
Let K₁ ⊂ K₂ ⊂ ⋯ ⊂ Ω be a sequence of finitely connected compact sets K_m
that are also simply connected relative to Ω. For m = 1, 2, …, let G_m(z) be
an n × n matrix function that is analytic and invertible on K_m and satisfies
G_m(z)M = M for z ∈ K_m and for some fixed subspace M ⊂ ℂⁿ. Then for
m = 1, 2, …, there exists an n × n matrix function D_m(z) that is analytic and
invertible on K_m and such that, whenever z ∈ K_m

Proof. We need the following simple assertion. Let X₁, X₂, … be a
sequence of n × n matrices such that

Then the infinite product Y = ∏_{m=1}^{∞} (I + X_m) converges and ‖I − Y‖ ≤ a eᵃ.
Indeed, for the matrices Y_m = ∏_{j=1}^{m} (I + X_j) we have the estimates:

Thus, in view of (18.5.5), the infinite product Y = ∏_{m=1}^{∞} (I + X_m) converges.
Moreover
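Condition (18.5.5) and the displayed estimates are missing. Assuming, as the bound a eᵃ suggests, that (18.5.5) reads Σ_{m=1}^{∞} ‖X_m‖ ≤ a, the standard estimates (with Y₀ = I) are presumably:

\[
% assumption: (18.5.5) is \sum_{m\ge 1} \|X_m\| \le a
\|Y_m - Y_{m-1}\| = \|Y_{m-1} X_m\|
\le \|X_m\| \prod_{j=1}^{m-1} \bigl(1 + \|X_j\|\bigr) \le \|X_m\|\, e^{a},
\qquad
\|I - Y\| \le \sum_{m=1}^{\infty} \|Y_m - Y_{m-1}\| \le a\, e^{a}.
\]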

We now prove Lemma 18.5.4 itself. Applying Lemma 18.5.3 repeatedly,
we find for m = 1, 2, … a matrix function H_m(z) that is analytic and
invertible on K_m, for which H₁(z) = I, and for z ∈ K_m, H_m(z)M = M and

The assertion proved in the preceding paragraph ensures that for every
m = 1, 2, …, the infinite product

converges uniformly on K_m, and ‖I − E_m(z)‖ ≤ 2^{−m} exp(2^{−m}) < 1 for
z ∈ K_m. Consequently, E_m(z) is invertible for every z ∈ K_m. Further,
E_m(z)M = M (z ∈ K_m; m = 1, 2, …). Indeed, since E_m(z) is invertible, it is
sufficient to prove that E_m(z)M ⊂ M. But this follows from the equalities
H_m(z)M = M, G_m(z)M = M and the definition of E_m. Now we can put
D_m(z) = H_m(z)E_m(z), because E_m = H_m⁻¹ G_m H_{m+1} E_{m+1} and consequently
G_m = (H_m E_m)(H_{m+1} E_{m+1})⁻¹. □

We are now prepared to prove Theorem 18.3.1.

Proof of Theorem 18.3.1. Let us show first that there exists a sequence
of compact sets K₁ ⊂ K₂ ⊂ ⋯ that are finitely connected, simply connected
relative to Ω, and for which ⋃_{m=1}^{∞} K_m = Ω. To this end choose a sequence of
closed discs S_m ⊂ Ω, m = 1, 2, …, such that ⋃_{m=1}^{∞} S_m = Ω. It is sufficient to
construct K_m in such a way that K_m ⊃ S_m, m = 1, 2, …. Put K₁ = S₁,

suppose that K₁, …, K_m are already constructed, with K_j ⊃ S_j for
j = 1, …, m. Let M be a connected compact set such that M ⊃ K_m ∪ S_{m+1}, and
let V₁, …, V_k ⊂ Ω be a finite set of closed discs from {S_m}_{m=1}^{∞} such that
N = ⋃_{j=1}^{k} V_j ⊃ M. Clearly, N is a finitely connected compact set. If N is also
simply connected relative to Ω, then put K_{m+1} = N. Otherwise, put
K_{m+1} = N ∪ Y₁ ∪ ⋯ ∪ Y_s, where Y₁, …, Y_s are all the bounded connected
components of the set ℂ ∖ N that lie entirely in Ω.
Given the sequence K₁ ⊂ K₂ ⊂ ⋯ constructed in the preceding para-
graph, choose z₀ ∈ K₁ and put M₀ = M(z₀) [here M(z) is the analytic family
of subspaces (of ℂⁿ) on Ω given in Theorem 18.3.1]. Without loss of
generality we can assume that M₀ = Span{e₁, …, e_r}. By Theorem 18.4.1,
there exist analytic vector functions f₁^{(m)}(z), …, f_r^{(m)}(z) on K_m that form a
basis in M(z) for every z ∈ K_m. Using Lemma 18.2.2, we find analytic vector
functions f_{r+1}^{(m)}(z), …, f_n^{(m)}(z) defined on K_m such that the vectors
f₁^{(m)}(z), …, f_n^{(m)}(z) form a basis in ℂⁿ for every z ∈ K_m [indeed, apply
Lemma 18.2.2 with x₁(z) = f₁^{(m)}(z), …, x_r(z) = f_r^{(m)}(z), x_{r+1}(z) =
g₁, …, x_n(z) = g_{n−r}, where g₁, …, g_{n−r} is a basis in a fixed direct com-
plement to M₀]. Then the matrix function
A_m(z) = [f₁^{(m)}(z), f₂^{(m)}(z), …, f_n^{(m)}(z)] is analytic and invertible on K_m and satisfies

where z ∈ K_m. Put G_m(z) = A_m⁻¹(z)A_{m+1}(z) for z ∈ K_m. Then (18.5.6)
ensures that G_m(z)M₀ = M₀ (z ∈ K_m, m = 1, 2, …). By Lemma 18.5.4 (for
m = 1, 2, …) there exists an analytic and invertible matrix function D_m(z)
on K_m such that G_m = D_m D_{m+1}⁻¹ and, for z ∈ K_m

Since A_{m+1}(z)D_{m+1}(z) = A_m(z)D_m(z) (z ∈ K_m; m = 1, 2, …), the relation
A(z) = A_m(z)D_m(z), which holds for all z ∈ K_m, defines an analytic and
invertible matrix function A(z) on Ω. Now the relation A(z)M₀ = M(z) for
z ∈ Ω follows from (18.5.6) and (18.5.7). □

18.6 DIRECT COMPLEMENTS FOR ANALYTIC FAMILIES
OF SUBSPACES

Let M(z) be an analytic family of subspaces of ℂⁿ defined on a domain Ω. If
N is a direct complement to M(z₀) and z₀ ∈ Ω, then the results of Chapter
13 (Theorem 13.1.3) imply that N is also a direct complement to M(z) as
long as z is sufficiently close to z₀. This local property of direct complements
raises the corresponding global question: does there exist a subspace N of
ℂⁿ that is a direct complement to M(z) for all z ∈ Ω? The simple example
below shows that the answer is generally no.

EXAMPLE 18.6.1. Let

As the polynomials z² − z + 1 and z² + z do not have common zeros, it
follows that M(z) is an analytic family of subspaces. Indeed, if z₀ is such
that z₀² + z₀ ≠ 0, then in a neighbourhood of z₀ we have

and if z₀ is such that z₀² − z₀ + 1 ≠ 0, then there is a neighbourhood of z₀ in
which

However, there is no one-dimensional subspace Span{(a, b)} (with at least one
of the complex numbers a, b nonzero) such that

for all z ∈ ℂ. Indeed, (18.6.1) means

which is impossible.
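The displays of this example are missing. A family consistent with every statement made about it (an assumption, not the original text) is

\[
% assumption: this specific spanning vector is a reconstruction
M(z) = \operatorname{Span}\left\{ \begin{pmatrix} z^2 - z + 1 \\ z^2 + z \end{pmatrix} \right\},
\qquad z \in \mathbb{C};
\]

then (18.6.1) would require the polynomial b(z² − z + 1) − a(z² + z) to have no zeros in ℂ, which forces it to be a nonzero constant; comparing coefficients gives a = b = 0, a contradiction.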

It turns out that although one common direct complement for an analytic
family of subspaces may not exist, only two subspaces are needed to serve as
"alternate" direct complements for each member of the analytic family.

Theorem 18.6.1
For an analytic family of subspaces {M(z)}_{z∈Ω} of ℂⁿ there exist two
subspaces N₁, N₂ ⊂ ℂⁿ such that for each z ∈ Ω either M(z) ∔ N₁ = ℂⁿ or
M(z) ∔ N₂ = ℂⁿ holds.

Proof. To prove this we first need the following observation: for any
k-dimensional subspace ℒ ⊂ ℂⁿ, the set DC(ℒ) of all direct complements to
ℒ in ℂⁿ is open and dense in the set of all (n − k)-dimensional subspaces.
Indeed, the openness of DC(ℒ) follows immediately from Theorem 13.1.3.
To prove denseness, let N be an (n − k)-dimensional subspace in ℂⁿ with
basis f₁, …, f_{n−k}, and let N₀ be a direct complement to ℒ with basis
g₁, …, g_{n−k}. For a complex number ε put
N(ε) = Span{f₁ + εg₁, …, f_{n−k} + εg_{n−k}}. Clearly, the vectors f_i + εg_i,
i = 1, …, n − k, are linearly independent for ε close enough to 0, so
dim N(ε) = n − k. Moreover, Theorem 13.4.2 shows that

It remains to show that N(ε) belongs to DC(ℒ). To this end pick a basis
h₁, …, h_k in ℒ, and consider the n × n matrix

As

(recall that N₀ ∔ ℒ = ℂⁿ); also

it follows that det G(ε) ≢ 0, and since det G(ε) is a polynomial in ε it follows
that det G(ε) ≠ 0 for ε ≠ 0 and sufficiently close to zero. Obviously,
N(ε) ∈ DC(ℒ) for such ε.
Now we start to prove Theorem 18.6.1 itself. Fix z₀ ∈ Ω, and let N₁ be a
direct complement to M(z₀) in ℂⁿ. By Theorem 18.3.2 it is possible to pick
vector functions x₁(z), …, x_p(z), analytic on Ω, such that,
for every z ∈ Ω, the vectors x₁(z), …, x_p(z) form a basis in M(z). Letting
f₁, …, f_{n−p} be a basis in N₁, consider the n × n matrix function

which is analytic on Ω. As det G(z₀) ≠ 0, the determinant of G(z) is not
identically zero, and thus the number of distinct zeros of det G(z) is at most
countable. Let z₁, z₂, … ∈ Ω be all of these zeros. Then N₁ is a direct
complement to M(z) for z ∉ {z₁, z₂, …}. On the other hand, we have seen
that, for i = 1, 2, …, the sets DC(M(z_i)) are open and dense in the set of
all (n − p)-dimensional subspaces in ℂⁿ. As the latter set is a complete
metric space in the gap topology (Section 13.4), it follows that the inter-
section ⋂_{i=1}^{∞} DC(M(z_i)) is again dense [the Baire category theorem;
e.g., see Kelley (1955)]. In particular, this intersection is not empty, so
there exists a subspace N₂ ⊂ ℂⁿ that is simultaneously a direct complement
to all of M(z₁), M(z₂), …. □

The following result shows that for analytic families of subspaces that
appear as the kernel or the image of a linear matrix function there exists a
common direct complement. As Example 18.6.1 shows, the result is not
necessarily valid for nonlinear matrix functions.

Theorem 18.6.2
Let T₁ and T₂ be m × n matrices such that the dimension of Ker(T₁ + zT₂) is
constant, that is, independent of z on ℂ [and the same is automatically
true for dim Im(T₁ + zT₂)]. Then there exist subspaces N₁ ⊂ ℂⁿ, N₂ ⊂ ℂᵐ
such that

for all z ∈ ℂ.

Note that in view of Proposition 18.1.1 and Theorem 18.2.1 the families
of subspaces Ker(T₁ + zT₂) and Im(T₁ + zT₂) are analytic on ℂ.

Proof. For the proof of Theorem 18.6.2 we use the Kronecker canonical
form for linear matrix polynomials under strict equivalence (which is
developed in the appendix to this book).
As dim Ker(T₁ + zT₂) is independent of z ∈ ℂ, the canonical form of
T₁ + zT₂ does not have the term zI + J. So, in the notation of Theorem
A.7.3, there exist invertible matrices Q₁ and Q₂ such that

It is easily seen that

for all z ∈ ℂ, and that

So there exists a direct complement M₁ to Ker[Q₁(T₁ + zT₂)Q₂] for all
z ∈ ℂ, given as follows:

As

it follows that

The part of Theorem 18.6.2 concerning Im(T₁ + zT₂) is proved similarly,
taking into account the facts that

and that, for each z ∈ ℂ, Im L^T has a direct complement Span{e₁}. □

18.7 ANALYTIC FAMILIES OF INVARIANT SUBSPACES

Let A(z): ℂⁿ → ℂⁿ be an analytic family of transformations on Ω. Our next
topic concerns the analytic properties (as functions of z) of certain invariant
subspaces of A(z).
We have already seen some first results in this direction in Section 18.1.
Namely, if the rank of A(z) is independent of z, then Im A(z) and Ker A(z)
are analytic families of subspaces. In the general case, Im A(z) and
Ker A(z) become analytic families of subspaces if corrected on the singular
set of A(z). The next theorem is mainly a reformulation of this statement.
For convenience, let us introduce another definition: an analytic family of
subspaces {M(z)}_{z∈Ω} is called A(z) invariant on Ω if the subspace M(z) is
A(z) invariant for every z ∈ Ω.

Theorem 18.7.1
There exist A(z)-invariant analytic families {M(z)}_{z∈Ω} and {N(z)}_{z∈Ω} such
that M(z) = Im A(z) and N(z) = Ker A(z) for every z not belonging to the
singular set of A(z).

Proof. In view of Theorem 18.2.1 we have only to prove that M(z₀) and
N(z₀) are A(z₀) invariant for every z₀ ∈ S(A). But this follows from
Theorem 15.1.1 because lim_{z→z₀} A(z) = A(z₀) and

Another class of A(z)-invariant subspaces whose behaviour is analytic (at
least locally) includes spectral subspaces, as follows.

Theorem 18.7.2
Let Γ be a contour in the complex plane such that Γ ∩ σ(A(z₀)) = ∅ for a
fixed z₀ ∈ Ω. Then the sum M_Γ(z) of the root subspaces of A(z) correspond-

ing to the eigenvalues inside Γ is an A(z)-invariant analytic family of
subspaces in a neighbourhood U of z₀.

Proof. As A(z) is a continuous function of z on Ω, the eigenvalues of
A(z) also depend continuously on z. Hence there is a neighbourhood U of
z₀ such that A(z) has no eigenvalues on Γ for any z in the closure of U. Now
for z ∈ U we have

We have seen in Section 2.4 that

is a projector for every z ∈ U. So, to prove that M_Γ(z) is an analytic family
in U, it is sufficient to check that P(z) is an analytic function on U. Indeed,
|det(λI − A(z))| > δ > 0 for every λ ∈ Γ and z ∈ U, where δ is independent
of λ and z. Hence ‖(λI − A(z))⁻¹‖ is bounded for λ ∈ Γ and z ∈ U, and
consequently the Riemann sums

where λ₀, …, λ_m are consecutive points in the positive direction on Γ with
λ_m = λ₀, converge to the integral (18.7.2) uniformly on every compact set in
U. As each Riemann sum is obviously analytic on U, so is the integral
(18.7.2). □
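The integral (18.7.2) is the standard Riesz projector; the missing displays presumably read

\[
M_\Gamma(z) = \operatorname{Im} P(z), \qquad
P(z) = \frac{1}{2\pi i} \oint_{\Gamma} \bigl(\lambda I - A(z)\bigr)^{-1}\, d\lambda,
\tag{18.7.2}
\]

with Riemann sums of the form (2πi)⁻¹ Σ_{k=1}^{m} (λ_k − λ_{k−1})(λ_{k−1}I − A(z))⁻¹.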

In view of Theorems 18.7.1 and 18.7.2, the following question arises
naturally: does there exist an A(z)-invariant analytic family that is nontrivial
(i.e., different from {0} and ℂⁿ)? Without restrictions on A(z) the answer is
no, as the following example shows.

EXAMPLE 18.7.1. Define an analytic family on ℂ by

Here the A(z)-invariant subspaces (for a fixed z) are easy to find: the only
nontrivial invariant subspace of A(0) is Span{e₁}, and, when z ≠ 0, the only
nontrivial invariant subspaces of A(z) are

where u₁ and u₂ are the square roots of z. It is easily seen that there is no
nontrivial, A(z)-invariant, analytic family of subspaces on ℂ. □
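The matrix of this example is missing. A standard family with exactly the stated invariant subspaces (an assumption consistent with the text) is

\[
% assumption: this matrix is a reconstruction
A(z) = \begin{pmatrix} 0 & 1 \\ z & 0 \end{pmatrix},
\qquad
\operatorname{Span}\left\{ \begin{pmatrix} 1 \\ u_1 \end{pmatrix} \right\},\quad
\operatorname{Span}\left\{ \begin{pmatrix} 1 \\ u_2 \end{pmatrix} \right\},
\qquad u_1^2 = u_2^2 = z;
\]

indeed A(z)(1, u)ᵀ = (u, z)ᵀ = u·(1, u)ᵀ whenever u² = z, while A(0) is a nilpotent Jordan block whose single nontrivial invariant subspace is Span{e₁}.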

In the next section we study A(z)-invariant analytic families of subspaces
under the extra condition that A(z) have the same Jordan structure for all
z ∈ Ω. We see that, in this case, nontrivial A(z)-invariant analytic families of
subspaces always exist. On the other hand, we have seen in Example 18.7.1
that there exists a nontrivial A(z)-invariant family of subspaces that is
analytic in ℂ except for the branch point at zero. Such phenomena occur
more generally and are studied in detail in Chapter 19.

18.8 ANALYTIC DEPENDENCE OF THE SET OF INVARIANT
SUBSPACES AND FIXED JORDAN STRUCTURE

Given a family of transformations A(z): ℂⁿ → ℂⁿ that depends analytically
on the parameter z in a domain Ω ⊂ ℂ, we say that the lattice Inv(A(z))
depends analytically on z ∈ Ω if there exists an invertible transformation
S(z): ℂⁿ → ℂⁿ that is analytic on Ω and such that
Inv(A(z)) = S(z)(Inv(A(z₀))) for all z ∈ Ω and some fixed point z₀ ∈ Ω. This definition
does not depend on the choice of z₀. Indeed, if

then for every z₀′ ∈ Ω we have

Also, replacing S(z) by S(z)S(z₀)⁻¹, we can require in the definition of
analytic dependence of Inv(A(z)) that S(z₀) = I.
Since Inv(A), Inv(B) are linearly isomorphic if and only if A and B have
the same Jordan structure (Theorem 16.1.2), a necessary condition for
analytic dependence of Inv(A(z)) on z is that A(z) have fixed Jordan
structure; that is, the number m of different eigenvalues of A(z) is indepen-
dent of z on Ω, and for every pair z₁, z₂ ∈ Ω the different eigenvalues
λ₁(z₁), …, λ_m(z₁) and λ₁(z₂), …, λ_m(z₂) of A(z₁) and A(z₂), respective-
ly, can be enumerated so that the partial multiplicities of λ_j(z₁) [as an
eigenvalue of A(z₁)] coincide with the partial multiplicities of λ_j(z₂) [as an
eigenvalue of A(z₂)], for j = 1, …, m.
Using Theorem 16.1.2, we find that the family A(z) has fixed Jordan
structure if and only if, for every z₁, z₂ ∈ Ω, the lattices Inv(A(z₁)) and
Inv(A(z₂)) are isomorphic. Clearly, this property is necessary for the lattice
Inv(A(z)) to depend analytically on z ∈ Ω. The following result shows that
this property is also sufficient as long as Ω is simply connected.

Theorem 18.8.1
Let Ω be a simply connected domain in ℂ, and let A(z): ℂⁿ → ℂⁿ be an
analytic family of transformations on Ω. Then Inv(A(z)) depends analytically
on z ∈ Ω if and only if A(z) has fixed Jordan structure.

In particular, the condition of a fixed Jordan structure ensures the existence
of at least as many A(z)-invariant analytic families of subspaces as there are
A(z₀)-invariant subspaces.

Proof. We assume that A(z) is represented as a matrix-valued function
with respect to some basis in ℂⁿ that is independent of z on Ω. Fix a z₀ in Ω.
Let λ₁, …, λ_p be all the distinct eigenvalues of A(z₀), and let Γ_i be a circle
around λ_i chosen so small that Γ_i ∩ Γ_j = ∅ for i ≠ j. As the proof of Theorem
16.3.1 shows, there exists an ε > 0 with the property that if B: ℂⁿ → ℂⁿ is a
transformation with the same Jordan structure as A(z₀), and if
‖B − A(z₀)‖ < ε, then there is a unique eigenvalue μ_i(B) of B in each circle Γ_i
(1 ≤ i ≤ p), and, moreover, the partial multiplicities of μ_i(B) (as an eigen-
value of B) coincide with the partial multiplicities of λ_i (as an eigenvalue of
A(z₀)). Hence, for every z from some neighbourhood U₁ of z₀, there is a
unique eigenvalue [denoted by μ_i(z)] of A(z) in the circle Γ_i (1 ≤ i ≤ p), and
the partial multiplicities of μ_i(z) coincide with those of λ_i. Obviously,
μ_i(z₀) = λ_i.
Let us prove that μ_i(z) is analytic on U₁. Indeed, denoting by m_i the
algebraic multiplicity of λ_i [as an eigenvalue of A(z₀)], we have

which is an analytic function of z on U₁.
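A standard expression behind this step (a reconstruction, not the original display) is

\[
\mu_i(z) = \frac{1}{m_i}\cdot\frac{1}{2\pi i}\,\operatorname{tr}
\oint_{\Gamma_i} \lambda\,\bigl(\lambda I - A(z)\bigr)^{-1}\, d\lambda,
\]

since the contour integral equals the sum of the eigenvalues of A(z) inside Γ_i, counted with algebraic multiplicities, and this sum is m_i μ_i(z) here.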


We have proved that in a neighbourhood of each point z0 E ft the distinct
eigenvalues of A(z) are analytic functions of z. It follows that the eigen-
values of A(z) admit analytic continuation along any curve in ft. By the
monodromy theorem [see, e.g. Rudin (1974); this is where the simple
connectedness of fl is used] the distinct eigenvalues /u-i(z),. . . , ^p(z) of
A(z) are analytic functions on ft.
Now fix z₀ ∈ Ω and define the family of transformations B(z): ℂⁿ → ℂⁿ,
z ∈ Ω, by the requirement that B(z)x = [μ_j(z₀) − μ_j(z)]x for any x belonging
to the root subspace of A(z) corresponding to the eigenvalue μ_j(z). It is
easily seen that B(z) is analytic on Ω. Indeed, for every z₁ ∈ Ω let
Γ₁′, …, Γ_p′ be circles around μ₁(z₁), …, μ_p(z₁), respectively, so small that
μ_j(z₁) is the only eigenvalue of A(z₁) inside or on the circle Γ_j′ for
j = 1, …, p. There is a neighbourhood V of z₁ such that any A(z) with
z ∈ V has the unique eigenvalue μ_j(z) inside the circle Γ_j′, j = 1, …, p.
Then

which is analytic on V in view of the analyticity of A(z) and μ_j(z) for
j = 1, …, p. Put Ã(z) = A(z) + B(z). Obviously, the set of Ã(z)-invariant

subspaces coincides with the set of A(z)-invariant subspaces for all z ∈ Ω, so
it is sufficient to prove Theorem 18.8.1 for Ã(z) instead of A(z). From the
definition of Ã(z) it is clear that the eigenvalues of Ã(z) are
μ₁(z₀), …, μ_p(z₀), that is, they do not depend on z, and, moreover, the
partial multiplicities of μ_j(z₀) as eigenvalues of Ã(z) do not depend on z,
either. In other words, in Theorem 18.8.1 we may assume that A(z) is
similar to A(z₀) for all z ∈ Ω.
For j = 1, …, p, let m_j be the maximal partial multiplicity of μ_j(z₀) as an
eigenvalue of A(z₀) [and hence as an eigenvalue of A(z) for all z in Ω].
Note that since A(z) is similar to A(z₀) for all z ∈ Ω, by Proposition 18.1.2
there is an analytic basis in Ker(A(z) − μ_j(z₀)I)^m for m = 0, 1, 2, … (i.e.,
for each fixed j and m). By Theorem 18.3.3 there exists a basis
x₁^{(j)}(z), …, x_{k_j}^{(j)}(z) in Ker(A(z) − μ_j(z₀)I)^{m_j} modulo
Ker(A(z) − μ_j(z₀)I)^{m_j−1} that is analytic on Ω. It is easily seen that the vectors

are linearly independent for all z ∈ Ω and belong to
Ker(A(z) − μ_j(z₀)I)^{m_j−1}. Hence by Theorem 18.3.3 again there is a basis
x̃₁^{(j)}(z), …, x̃_{l_j}^{(j)}(z) in Ker(A(z) − μ_j(z₀)I)^{m_j−1} modulo

which is analytic on Ω. Next we find an analytic basis

modulo

and so on. Now define the n × n matrix T(z) formed by the columns

where j = 1, …, p. As the proof of the Jordan form of a matrix shows (see
Section 2.3), the columns of T(z) form a Jordan basis of A(z). In particular,

T(z) is invertible for all z ∈ Ω. Clearly, T(z) is analytic on Ω. As
T(z)⁻¹A(z)T(z) is a constant matrix (i.e., independent of z) and is in
Jordan form, the assertion of Theorem 18.8.1 follows. □

In the course of the proof of Theorem 18.8.1 we have also proved the
following result on analytic families of similar transformations.

Corollary 18.8.2
Let A(z): ℂⁿ → ℂⁿ be an analytic family of transformations on Ω, where Ω is
a simply connected domain. Assume that, for a fixed point z₀ ∈ Ω, A(z) is
similar to A(z₀) for all z ∈ Ω. Then there exists an invertible transformation
T(z): ℂⁿ → ℂⁿ that is analytic on Ω and such that T(z₀) = I and
T(z)⁻¹A(z)T(z) = A(z₀) for all z ∈ Ω.

The assumption that Ω is simply connected in Theorem 18.8.1 is neces-
sary, as the next example shows.

EXAMPLE 18.8.1. Let Ω = ℂ ∖ {0}, and let

Clearly, A(z) has fixed Jordan structure on Ω (the eigenvalues being the two
square roots of z). The nontrivial A(z)-invariant subspaces are

Clearly, there is no (single-valued) invertible 2 × 2 matrix function S(z) that
is analytic on ℂ ∖ {0} and satisfies the conditions of Theorem 18.8.1. □
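The displays are missing here as well; the family of Example 18.7.1 fits every statement made, so presumably (as an assumption)

\[
% assumption: same matrix as in Example 18.7.1
A(z) = \begin{pmatrix} 0 & 1 \\ z & 0 \end{pmatrix}, \qquad
\Omega = \mathbb{C}\setminus\{0\},
\qquad
\operatorname{Span}\left\{ \begin{pmatrix} 1 \\ \pm\sqrt{z} \end{pmatrix} \right\};
\]

these subspaces are single-valued only after a branch of √z is chosen, and no such branch exists on all of ℂ ∖ {0}.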

Note that in the proof of Theorem 18.8.1 the existence of an analytic
Jordan basis [formula (18.8.1)] of A(z) also follows from a general result on
analytic perturbations of matrices (see Section 19.2).

18.9 ANALYTIC DEPENDENCE ON A REAL VARIABLE

The results presented in Sections 18.1–18.8 include the case when the
families of transformations ℂⁿ → ℂᵐ and subspaces of ℂⁿ are analytic in a
real variable on an open interval (a, b) of the real axis. The definition of
analyticity is analogous to that in the complex case: representation as a
power series (this time with real coefficients) in a real neighbourhood of each
point t₀ ∈ (a, b). As the radius of convergence of this power series is
positive, it converges also in some complex neighbourhood of t₀. Con-
sequently, a family of transformations from ℂⁿ into ℂᵐ (or of subspaces of
ℂⁿ) that is analytic on (a, b) can be extended to a family of linear
transformations (or subspaces) that is analytic in some complex neighbour-
hood Ω of (a, b), and the results presented in Sections 18.1–18.8 do apply.
It is noteworthy that, in contrast to the complex variable case, the
orthogonal complement preserves analyticity, as follows.

Theorem 18.9.1
Let M(t) be a family of subspaces (of ℂⁿ) that is analytic in the real variable t
on (a, b). Then the orthogonal complement M(t)^⊥ is an analytic family of
subspaces on (a, b) as well.

Proof. Let t₀ ∈ (a, b). Then in some real neighbourhood U₁ of t₀ there
exists an analytic family of invertible transformations A(t): ℂⁿ → ℂⁿ such
that M(t) = A(t)M, t ∈ U₁, for a fixed subspace M ⊂ ℂⁿ. Assume (without
loss of generality) that M = Span{e₁, …, e_p} for some p, and write A(t) as
an n × n matrix, with entries that are analytic on (a, b), with respect to the
standard basis in ℂⁿ. Then M(t) = Im B(t) for t ∈ U₁, where B(t) is formed
by the first p columns of A(t). As A(t) is invertible, the columns of B(t) are
linearly independent. For notational simplicity, assume that the top p rows
of B(t₀) are linearly independent and hence form a nonsingular p × p
matrix. Then there is a real neighbourhood U₂ ⊂ U₁ of t₀ such that the top p
rows of B(t) form a nonsingular p × p matrix C(t) as well. So for t ∈ U₂, we
obtain

where D(t) is the (n − p) × p matrix formed by the bottom n − p rows of
B(t). Denoting X(t) = D(t)C(t)⁻¹, consider the p × p matrix function
S(t) = (I + X(t)*X(t))⁻¹ for t ∈ U₂. Note that I + X(t)*X(t) is positive definite and
thus invertible. Clearly, S(t) is positive definite and analytic on U₂. Let Γ be
a contour that lies in the open right half plane, is symmetric with respect
to the real axis, and contains all the eigenvalues of S(t₀) in its interior. Then
all eigenvalues of S(t), where t is taken from some neighbourhood U₃ ⊂ U₂
of t₀, will also be in the interior of Γ. For such a t the integral

where λ^{1/2} is the analytic branch of the square root that takes positive values
for positive λ, is well defined and Z(t)² = S(t) (see Section 2.10). Moreover,
because of the symmetry of Γ, the matrix Z(t) is positive definite for all
t ∈ U₃. Also, Z(t) is an analytic family of matrices on U₃. Now one sees
easily that, for t ∈ U₃,

is the orthogonal projector on M(t). Indeed, a straightforward computation
verifies that P(t)² = P(t) = P(t)*. So P(t) is an orthogonal projector.
Furthermore, it is clear that

and since rank P(t) is easily seen to be p, equality (rather than inclusion)
holds in (18.9.1). Consequently, M(t)^⊥ is the image of the analytic family of
projectors I − P(t), and thus M(t)^⊥ is analytic on U₃. As t₀ ∈ (a, b) was
arbitrary, the analyticity of M(t)^⊥ on (a, b) follows. □
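The missing display for P(t) can be reconstructed from the ingredients of the proof. Writing M(t) as the range of the block column (I; X(t)) (an assumed but natural normalization), a formula satisfying P(t)² = P(t) = P(t)* and Im P(t) = M(t) is

\[
% assumption: block-column normalization of M(t)
P(t) = \begin{pmatrix} I \\ X(t) \end{pmatrix} S(t)
\begin{pmatrix} I & X(t)^{*} \end{pmatrix}
= \begin{pmatrix} I \\ X(t) \end{pmatrix}
\bigl(I + X(t)^{*}X(t)\bigr)^{-1}
\begin{pmatrix} I & X(t)^{*} \end{pmatrix},
\]

and the inclusion (18.9.1) is then presumably Im P(t) ⊆ M(t). Idempotency follows from (I  X*)(I; X) = I + X*X, and self-adjointness from S(t)* = S(t).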

One can also consider families of real transformations from ℝⁿ into ℝᵐ,
as well as families of subspaces in the real vector space ℝⁿ, which are
analytic in a real variable t on (a, b). For such families of real linear
transformations and subspaces the results of Sections 18.1–18.8 hold as well.
However, in Theorem 18.7.2 the contour Γ should be symmetric with
respect to the real axis; and in the definition of fixed Jordan structure one
has to require, in addition, that the enumeration λ₁(z₁), …, λ_m(z₁) and
λ₁(z₂), …, λ_m(z₂) of distinct eigenvalues of A(z₁) and A(z₂), respective-
ly, is such that λ_i(z₁) = λ̄_j(z₁) holds if and only if λ_i(z₂) = λ̄_j(z₂).

EXERCISES

18.1 Let

be an analytic family of transformations written as a matrix in the
standard orthonormal bases in ℂ² and ℂ³.
(a) Are Im A(z) and Ker A(z) analytic families of subspaces?
(b) Find an analytic vector function y(z) such that y(z) ≠ 0 for all
z ∈ ℂ and Span{y(z)} = Ker A(z) for all z ∈ ℂ with the excep-
tion of a discrete set.
(c) Find linearly independent and analytic (in ℂ) vector functions
y₁(z), y₂(z) such that Span{y₁(z), y₂(z)} = Im A(z) for all z ∈
ℂ with the exception of a discrete set. [Hint: Use the Smith
form for the matrix polynomial A(z).]

18.2 Solve Exercise 18.1 for

18.3 Let P(z) be an analytic family of projectors. Show that Im P(z) is an
analytic family of subspaces.
18.4 Let

where, for j = 1, …, k, A_j(z) is an analytic family of transformations
on a domain Ω. Prove that the following statements are equivalent:
(a) Im A(z) and Ker A(z) are analytic families of subspaces.
(b) Im A_j(z) is an analytic family of subspaces, for j = 1, …, k.
(c) Ker A_j(z) is an analytic family of subspaces, for j = 1, …, k.
18.5 Let A(z): ℂⁿ → ℂⁿ be an analytic family of transformations on Ω
such that A(z)² = I for all z ∈ Ω. Prove that the families of subspaces
Im(A(z) − I) and Im(A(z) + I) are analytic on Ω.
18.6 Let A(z) be an analytic family of transformations on Ω such that
p(A(z)) = 0 for all z ∈ Ω, where p(λ) is a scalar polynomial of
degree m with distinct zeros λ₁, …, λ_m. Prove that the families of
subspaces Ker(λ_jI − A(z)), j = 1, …, m, are analytic on Ω.
18.7 Does the result of Exercise 18.6 hold if p(λ) has fewer than m distinct
zeros?
18.8 Given matrices A and B of sizes n × n and n × m, respectively, show
that Ker[λI + A, B] is an analytic family of subspaces if and only if
(A, B) is a full-range pair.
18.9 Given matrices C and A of sizes p × n and n × n, respectively, show
that Im […] is an analytic family of subspaces if
and only if (C, A) is a null kernel pair.
18.10 Given an analytic n × n matrix function A(z) on Ω that is upper
triangular for all z ∈ Ω, when is Ker A(z) analytic on Ω?
18.11 For the following analytic vector functions x₁(z), x₂(z), where z ∈ ℂ,
find analytic vector functions y₁(z), y₂(z) of z ∈ ℂ such that y₁(z)
and y₂(z) are linearly independent for every z ∈ ℂ and

Span{x₁(z), x₂(z)} = Span{y₁(z), y₂(z)}

for every z ∈ ℂ except for a discrete set:
(a) x₁(z) = ⟨z², 1 − z, 0⟩, x₂(z) = ⟨z³, 1 − z², z² − z⟩
(b) x₁(z) = ⟨1, −z, z⟩, x₂(z) = ⟨1, z², z² + z⟩
[Hint: Use the Smith form for the matrix polynomial [x₁(z), x₂(z)].]

18.12 Let x₁(z), …, x_k(z) be n-dimensional vector polynomials such that,
for at least one value z₀ ∈ ℂ, the vectors x₁(z₀), …, x_k(z₀) are
linearly independent. Prove that one can construct n-dimensional
vector polynomials y₁(z), …, y_k(z) such that y₁(z), …, y_k(z) are
linearly independent for all z ∈ ℂ and

for all z ∈ ℂ with the possible exception of a finite set, as follows.
Let

be the Smith form of the n × k matrix [x₁(z), …, x_k(z)]; then put
18.13 Complete the following linearly independent analytic families of
vectors in ℂ⁴ (depending on the complex variable z ∈ ℂ) to analytic
families of vectors that form a basis in ℂ⁴ for every z ∈ ℂ:

18.14 For the following analytic families M(z) of subspaces in ℂⁿ that
depend on z ∈ ℂ, find two subspaces N₁ and N₂ such that for every
z ∈ ℂ at least one of

holds:

18.15 For each n ≥ 2 give an example of an analytic family of transfor-
mations A(z): ℂⁿ → ℂⁿ defined on Ω that has no nontrivial A(z)-
invariant analytic family of subspaces on Ω.
18.16 Let A(z) be an analytic family of transformations defined on Ω such
that p(A(z)) = 0 for all z ∈ Ω, where p(λ) is a scalar polynomial of
degree m with m distinct zeros. Prove that there are at least 2^m
A(z)-invariant analytic families of subspaces on Ω.
Chapter Nineteen

Jordan Form of
Analytic Matrix Functions

In this chapter we study the behaviour of eigenvalues and eigenvectors of a
transformation that depends analytically on a parameter, in both the local
and global frameworks. It turns out that this behaviour is analytic except for
isolated singularities that are described in detail. The results obtained allow
us to solve (at least partially) the problem of analytic extendability of an
invariant subspace. In turn, the solution of this problem is used in Chapter
20 for the solution of various problems concerning divisors of monic matrix
polynomials, minimal factorization of rational matrix functions, and solu-
tions of matrix quadratic equations, all of which involve analytic dependence
on a parameter. Clearly, the material of this chapter relies on more
advanced complex analysis than does that of the preceding chapters. How-
ever, this is not a prerequisite for understanding the main results.

19.1 LOCAL BEHAVIOUR OF EIGENVALUES AND EIGENVECTORS

Let A(z): ℂⁿ → ℂⁿ be a family of transformations that is analytic on a
domain Ω. In this section we study the behaviour of eigenvalues and
eigenvectors as functions of z in a neighbourhood of a fixed point z₀ ∈ Ω.
First let us state the main result in this direction.

Theorem 19.1.1
Let μ₁, …, μ_k be all the distinct eigenvalues of A(z₀), that is, the distinct
zeros of the equation det(μI − A(z₀)) = 0, where k ≤ n, and let r_i (i =
1, …, k) be the multiplicity of μ_i as a zero of det(μI − A(z₀)) = 0 (so
r₁ + ⋯ + r_k = n). Then there is a neighbourhood 𝒰 of z₀ in Ω with the
following properties: (a) there exist positive integers m₁₁, …, m_{1s₁};

m₂₁, …, m_{2s₂}; …; m_{k1}, …, m_{ks_k} such that the n eigenvalues (not neces-
sarily distinct) of A(z) for z ∈ 𝒰 ∖ {z₀} are given by the fractional power
series:

where a_{aij} ∈ ℂ and, for σ = 1, …, m_{ij},

(b) the dimension γ_{ij} of Ker(A(z) − μ_{ijσ}(z)I), as well as the partial multi-
plicities m_{ij}^{(1)} ≥ ⋯ ≥ m_{ij}^{(γ_{ij})} (>0) of the eigenvalue μ_{ijσ}(z) of A(z), do not
depend on z (for z ∈ 𝒰 ∖ {z₀}) and do not depend on σ; (c) for each
i = 1, …, k and j = 1, …, s_i there exist vector-valued fractional power series
converging for z ∈ 𝒰:

where x_{ij}^{(q)} ∈ ℂⁿ, such that for each γ and each z ∈ 𝒰 ∖ {z₀} the vectors
x_{ij}^{(1)}(z), …, x_{ij}^{(m_{ij}^{(γ)})}(z) form a Jordan chain of A(z) corresponding to
μ_{ijσ}(z):

where by definition x_{ij}^{(0)}(z) = 0, and x_{ij}^{(1)}(z) ≠ 0. Moreover, for every
z ∈ 𝒰 ∖ {z₀} the vectors

form a basis in ℂⁿ.
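The fractional power series (19.1.3) is a Puiseux expansion; in the notation of the theorem it presumably reads

\[
% assumption: standard Puiseux form of (19.1.3)
\mu_{ij\sigma}(z) = \mu_i + \sum_{a=1}^{\infty} a_{aij}\,
\Bigl(\epsilon_\sigma\,(z - z_0)^{1/m_{ij}}\Bigr)^{a},
\qquad
\epsilon_\sigma = e^{2\pi i \sigma / m_{ij}}, \quad \sigma = 1, \ldots, m_{ij},
\]

for a fixed branch of (z − z₀)^{1/m_{ij}} on 𝒰 ∖ {z₀}.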


The full proof of Theorem 19.1.1 is too long to be presented here. We
refer the reader to the book of Baumgärtel (1985), and especially Section
IX.3 there, for a complete proof.
Let us make some remarks concerning this important theorem. First, in
the expansion (19.1.3), if m_{ij} > 1, then the greatest common divisor of all

positive integers a such that a_{aij} ≠ 0 is 1 [so the μ_{ijσ}(z), σ = 1, …, m_{ij}, have a
branch point at z₀ of branch multiplicity m_{ij} and not less]. If m_{ij} = 1, then
μ_{ijσ}(z) is analytic on a neighbourhood of z₀; it may even happen that
μ_{ijσ}(z) is the constant function μ_i (see Example 19.1.2). Second, the
theorem does not say anything explicit about the partial multiplicities
p_{i1} ≥ ⋯ ≥ p_{it_i} of the eigenvalue μ_i of A(z₀) (we know only that
Σ_{j=1}^{t_i} p_{ij} = r_i for i = 1, …, k). However, there is a connection between the partial
multiplicities m_{ij}^{(1)} ≥ ⋯ ≥ m_{ij}^{(γ_{ij})} of the eigenvalues μ_{ijσ}(z) of A(z)
(z ∈ 𝒰 ∖ {z₀}) and the partial multiplicities of the eigenvalue μ_i of A(z₀). This
connection is given by the following formula (see Theorem 15.10.2):

where p_{iq} is interpreted as zero for q > t_i, and similarly for m_{ij}^{(q)} when
q > γ_{ij}. As the total sum of partial multiplicities of eigenvalues near μ_i does
not change after a small perturbation of the transformation, we also have the
equality

Let us illustrate Theorem 19.1.1 with an example.

EXAMPLE 19.1.1. Let z₀ = 0 and

The only eigenvalue of A(0) is zero, with partial multiplicities 3 and 1. [The
easiest way to find the partial multiplicities of A(0) is to observe that
rank A(0) = 2 and A(0)² ≠ 0.] To find the eigenvalues of A(z), we have to
solve the equation det(μI − A(z)) = 0, which gives (in the notation of
Theorem 19.1.1)

(so we have k = 1, s₁ = 1, m₁₁ = 2). It is not difficult to see that the only
partial multiplicity of μ_{11σ}(z) is m₁₁^{(1)} = 2. The Jordan chain of A(z) corre-
sponding to μ_{11σ}(z) is

An important particular case of Theorem 19.1.1 appears when the
eigenvalues of A(z) are analytic in a neighbourhood of z₀, that is, all the
integers m_{ij} are equal to 1, as follows.

Corollary 19.1.2
Assume that all the eigenvalues of A(z) are analytic in a neighbourhood of z₀.
Then the distinct eigenvalues μ₁(z), …, μ_k(z) of A(z), z ≠ z₀, can be
enumerated so that they are analytic in a neighbourhood 𝒰₁ of z₀.
Further, assuming that the enumeration of the distinct eigenvalues of A(z) for
z ≠ z₀ is as above, there exist analytic n-dimensional vector functions

in a neighbourhood 𝒰₂ ⊂ 𝒰₁ of z₀ with the following properties: (a) for every
z ∈ 𝒰₂ ∖ {z₀} and for i = 1, …, k; j = 1, …, s_i, the vectors
y_{ij}^{(1)}(z), …, y_{ij}^{(γ_{ij})}(z) form a Jordan chain of A(z) corresponding to the
eigenvalue μ_i(z); (b) for every z ∈ 𝒰₂ ∖ {z₀} the vectors (19.1.4) form a
basis in ℂⁿ.

The following example illustrates this corollary.

EXAMPLE 19.1.2. Let

Obviously, the eigenvalues of A(z) are analytic (even constant). It is easy to
find analytic vector functions y_{ij}^{(q)}(z) as in Corollary 19.1.2: we have k = 1,
s₁ = 1, γ₁₁ = 2, and

Note that y₁₁^{(1)}(z), y₁₁^{(2)}(z) do not form a basis in ℂ² for z = 0; also,
y₁₁^{(1)}(0), y₁₁^{(2)}(0) do not form a Jordan chain of A(0). This shows that in (a) and
(b) in Corollary 19.1.2 one cannot, in general, replace 𝒰₂ ∖ {z₀} by 𝒰₂. □
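The displays are missing; one concrete family consistent with the description (constant eigenvalue, k = 1, s₁ = 1, γ₁₁ = 2, and failure of the chain at z = 0) is the assumed example

\[
% assumption: this matrix and chain are a reconstruction
A(z) = \begin{pmatrix} 0 & z \\ 0 & 0 \end{pmatrix},
\qquad
y_{11}^{(1)}(z) = \begin{pmatrix} z \\ 0 \end{pmatrix},
\quad
y_{11}^{(2)}(z) = \begin{pmatrix} 0 \\ 1 \end{pmatrix};
\]

then A(z)y₁₁^{(2)}(z) = y₁₁^{(1)}(z) for all z, so for z ≠ 0 these vectors form a Jordan chain for the eigenvalue 0, while y₁₁^{(1)}(0) = 0.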

19.2 GLOBAL BEHAVIOUR OF EIGENVALUES AND EIGENVECTORS

The result of Theorem 19.1.1 allows us to derive some global properties of
eigenvalues and eigenvectors of an analytic family of transformations
A(z): ℂⁿ → ℂⁿ defined on Ω. As before, Ω is a domain in the complex
plane.

For a transformation X: ℂⁿ → ℂⁿ we denote by ν(X) the number of
distinct eigenvalues of X. Obviously, 1 ≤ ν(X) ≤ n.

Theorem 19.2.1
Let A(z): ℂⁿ → ℂⁿ be an analytic family of transformations on Ω. Then for
all z ∈ Ω except a discrete set S₀ we have

ν(A(z)) = max_{w∈Ω} ν(A(w))

and for z₀ ∈ S₀ we have

ν(A(z₀)) < max_{w∈Ω} ν(A(w))

Proof. Theorem 19.1.1 shows that for every z₀ ∈ Ω there is a neigh-
bourhood 𝒰_{z₀} of z₀ such that ν(A(z)) is constant (equal to ν₀, say) for
z ∈ 𝒰_{z₀} ∖ {z₀} and ν(A(z₀)) ≤ ν₀. A priori, it appears that ν₀ may depend on
z₀. Let us show that actually ν₀ is independent of z₀. For ν = 1, …, n, let
𝒱_ν = ⋃ 𝒰_{z₀}, where the union is taken over all z₀ ∈ Ω such that ν(A(z)) = ν in
a deleted neighbourhood 𝒰_{z₀} ∖ {z₀} of z₀. Obviously, 𝒱₁, …, 𝒱_n are open
sets whose union is Ω, and it is easily seen that they are mutually disjoint.
This can happen only if all the 𝒱_ν are empty except for 𝒱_{ν₀}; therefore,
𝒱_{ν₀} = Ω. It is clear also that

Now if ν(A(z′)) < ν₀ for some z′ ∈ Ω, then by Theorem 19.1.1 we have
ν(A(z)) ≡ ν₀ in a deleted neighbourhood of z′. This shows that the set S₀ of
all z ∈ Ω for which ν(A(z)) < ν₀ is indeed discrete. □

The points from S₀ will be called the multiple points of the analytic family
of transformations A(z), because at these points the eigenvalues of A(z)
attain higher multiplicity than "usual."
Another way to prove Theorem 19.2.1 is by examining a suitable
resultant matrix. Let

for some scalar functions a_j(z) that are analytic on Ω, and consider the
(2n − 1) × (2n − 1) matrix whose entries are analytic functions on Ω:

This is the resultant matrix of two scalar polynomials in μ: det(μI − A(z))
and (∂/∂μ)(det(μI − A(z))). A well-known property of resultant matrices
[see, e.g., Gohberg and Heinig (1975)] states that 2n − 1 − rank R(z) is
equal to the number of common zeros of these two polynomials in μ
(counting multiplicities). In other words

or
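Writing p(μ, z) = det(μI − A(z)) = μⁿ + a_{n−1}(z)μ^{n−1} + ⋯ + a₀(z), the matrix R(z) is presumably the Sylvester resultant matrix of p and ∂p/∂μ, and the missing relations (19.2.1) then read

\[
% assumption: reconstruction of (19.2.1) from the resultant property quoted above
2n - 1 - \operatorname{rank} R(z)
= \deg \gcd\Bigl(p(\cdot, z),\ \tfrac{\partial p}{\partial \mu}(\cdot, z)\Bigr)
= n - \nu(A(z)),
\qquad\text{equivalently}\qquad
\nu(A(z)) = \operatorname{rank} R(z) - (n - 1).
\]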

Now let k (n ≤ k ≤ 2n − 1) be the largest size of a square submatrix of R(z)
whose determinant is not identically zero. Denoting by S₁(z), …, S_l(z) all
such submatrices of R(z), we obviously have rank R(z) = k if at least one of
det S₁(z), …, det S_l(z) is different from zero, and rank R(z) < k otherwise.
Comparing with (19.2.1), we obtain

if not all the numbers det S₁(z), …, det S_l(z) are zeros;

otherwise. Since the set of common zeros of det S₁(z), …, det S_l(z) is
discrete, Theorem 19.2.1 follows. □
Theorem 19.1.1 shows that the distinct eigenvalues of A(z),
μ₁(z), …, μ_ν(z) [where ν = max_{z∈Ω} ν(A(z))], are analytic on Ω ∖ S₀,
where S₀ is taken from Theorem 19.2.1, and have at most algebraic branch
points in S₀. [Some of the functions μ₁(z), …, μ_ν(z) may also be analytic at
certain points in S₀.] Denote by S₁ the subset of S₀ consisting of all the
points z₀ such that at least one of the functions μ_j(z), j = 1, …, ν, is not
analytic at z₀. As a subset of a discrete set, S₁ is itself discrete. The set S₁
will be called the first exceptional set of the analytic family of linear
transformations A(z), z ∈ Ω.
It may happen that S₁ ≠ S₀, as shown in the following example.

EXAMPLE 19.2.1. Let Ω = ℂ and

The eigenvalues of A(z) are ±z, so in this case S₀ = {0} but S₁ = ∅. □
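The matrix is missing; one family with eigenvalues ±z (an assumption consistent with the text) is

\[
% assumption: this matrix is a reconstruction
A(z) = \begin{pmatrix} 0 & z \\ z & 0 \end{pmatrix},
\]

whose eigenvalues ±z are entire functions that merely collide at z = 0, so 0 is a multiple point but not a branch point.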

Example 19.1.2 shows that in general, when z ∈ Ω ∖ S₁, one cannot
expect that there will be a Jordan basis of A(z) that depends analytically on
z. To achieve that, we must exclude from consideration a second exceptional
set, which is described now.

Theorem 19.2.2
Let A(z): ℂⁿ → ℂⁿ be an analytic family of transformations on Ω with the set
S₀ of multiple points, and let μ₁(z), …, μ_ν(z) be the distinct eigenvalues of
A(z), analytic on Ω ∖ S₀ and having at most branch points in S₀. Let
m_{j1}(z) ≥ ⋯ ≥ m_{jγ}(z), γ = γ(j, z), be the partial multiplicities of the eigen-
value μ_j(z) of A(z) for j = 1, …, ν, z ∉ S₀. Then there exists a discrete set S₂′
in Ω such that S₂′ ⊂ Ω ∖ S₀, and the number γ(j, z) of partial multiplicities and
the partial multiplicities m_{jk}(z) themselves, k = 1, …, γ(j, z), do not depend
on z in Ω ∖ (S₀ ∪ S₂′), for j = 1, …, ν.

Proof. The proof follows the pattern of the proof of Theorem 19.2.1. In
view of Theorem 19.1.1, for every z₀ ∈ Ω there is a neighbourhood 𝒰_{z₀} of
z₀ such that the number of distinct eigenvalues ν = ν(z₀), as well as the
numbers γ_j = γ_j(z₀) of partial multiplicities and the partial multiplicities
themselves m_{j1} ≥ ⋯ ≥ m_{jγ_j}, m_{jk} = m_{jk}(z₀), corresponding to the jth eigen-
value, are constant for z ∈ 𝒰_{z₀} ∖ {z₀}. It is assumed that the distinct
eigenvalues of A(z) for z ∈ 𝒰_{z₀} ∖ {z₀} are enumerated so that they are
analytic and γ₁ ≥ ⋯ ≥ γ_ν. Denote by Δ the (finite) set of all sequences of
type

where ν, γ_j, m_{jk} are positive integers with the properties that ν ≤ n;
γ₁ ≥ ⋯ ≥ γ_ν; m_{i1} ≥ ⋯ ≥ m_{iγ_i}, i = 1, …, ν; Σ_{i,j} m_{ij} = n. For any sequence
δ ∈ Δ as in (19.2.2) let 𝒱_δ = ⋃ 𝒰_{z₀}, where the union is taken over all z₀ ∈ Ω such
that ν = ν(z₀); γ_j = γ_j(z₀), j = 1, …, ν; m_{ij} = m_{ij}(z₀), j = 1, …, γ_i;
i = 1, …, ν. Obviously, 𝒱_δ is open and ⋃_{δ∈Δ} 𝒱_δ = Ω. Also, the sets 𝒱_δ, δ ∈ Δ,
are mutually disjoint. As Ω is connected, this means that all the 𝒱_δ, except for
one of them, are empty. So Theorem 19.2.2 follows. □

The set S₂ = S₂′ ∪ (S₀ ∖ S₁), where S₂′ is taken from Theorem 19.2.2 and
S₀ and S₁ are the set of multiple points and the first exceptional set of A(z),
respectively, is called the second exceptional set of A(z). Note that S₂ ∩ S₁ =

∅. The second exceptional set is characterized by the properties that the
distinct analytic eigenvalues of A(z) can be continued analytically into any
point z₀ ∈ S₂, but for every z₀ ∈ S₂, either ν(A(z₀)) < max_{z∈Ω} ν(A(z)), or
ν(A(z₀)) = max_{z∈Ω} ν(A(z)) and for at least one analytic eigenvalue μ_j(z) of
A(z) the partial multiplicities of μ_j(z₀) are different from the partial
multiplicities of μ_j(z), z ≠ z₀, in a neighbourhood of z₀.

EXAMPLE 19.2.2. Let

where p_j(z) and q_j(z) are not identically zero polynomials such that

for all z ∈ ℂ, where 1 ≤ k₁ < k₂ < ⋯ < k_{q−1} < k_q = n. We also assume that
the polynomials p_{k₁}(z), …, p_{k_q}(z) are all different. We have the set of
multiple points

the first exceptional set S₁ is empty, and the second exceptional set S₂ is the
union of S₀ and the set {z ∈ ℂ ∖ S₀ | q_i(z) = 0 for some k_p + 1 ≤ i ≤ k_{p+1} − 1
and some p}. □

Now we state the result on the existence of an analytic Jordan basis for an
analytic family of transformations.

Theorem 19.2.3
Let A(z): ℂⁿ → ℂⁿ be an analytic family of transformations on Ω with the first
exceptional set S₁ and the second exceptional set S₂. Let μ₁(z), …, μ_ν(z) be
the distinct eigenvalues of A(z) (apart from the multiple points), which are
analytic on Ω ∖ S₁ and have at most algebraic branch points in S₁. Then there
exist n-dimensional vector functions

j = 1, …, ν, where m_{j1} ≥ ⋯ ≥ m_{jγ_j} are positive integers, with the following
properties: (a) the functions (19.2.3) are analytic on Ω ∖ S₁ and have at most
612 Jordan Form of Analytic Matrix Functions

algebraic branch points in 5,; (b) for every z E f t ^ ( S j U S 2 ) the vectors


(19.2.3) form a basis in <p"; (c) for every z EII ^ (5, U S2) the vectors

form a Jordan chain of the transformation A(z) corresponding to the


eigenvalue ^(z), for k = 1, . . . , ?;; / = 1, . . . , v.

It is easily seen that if μ_j(z) has an algebraic branch point at z_0 ∈ S_1, then all eigenvectors

of A(z) corresponding to μ_j(z) also have an algebraic branch point at z_0. Indeed, let y(z) be some (say, the kth) coordinate of the eigenvector x^{(j)}(z) that is not identically zero. The equality (A(z) − μ_j(z)I)x^{(j)}(z) = 0 for z in Ω \ S_2 implies that

a_k(z)x^{(j)}(z) = μ_j(z)y(z)    (19.2.4)

where a_k(z) is the kth row of A(z). If x^{(j)}(z) were analytic at z_0, then (19.2.4) would imply that μ_j(z) is also analytic at z_0, a contradiction.
The proof of Theorem 19.2.3 is given in the next section.
In the particular case when A(z) is diagonable (i.e., similar to a diagonal matrix) for every z ∉ S_1 ∪ S_2, the conclusions of Theorem 19.2.3 can be strengthened, as follows.

Theorem 19.2.4
Let A(z) be as in Theorem 19.2.3, and assume that A(z) is diagonable for all z ∉ S_1 ∪ S_2. Then there exist n-dimensional vector functions

(19.2.5)

with the following properties: (a) the functions (19.2.5) are analytic on Ω \ S_1 and have at most algebraic branch points in S_1; (b) for every z ∈ Ω and every j = 1, …, ν the vectors x_1^{(j)}(z), …, x_{γ_j}^{(j)}(z) are linearly independent; (c) for every z ∈ Ω \ (S_1 ∪ S_2) the vectors x_1^{(j)}(z), …, x_{γ_j}^{(j)}(z) form a basis in Ker(μ_j(z)I − A(z)). In particular, the vectors (19.2.5) form a basis in ℂ^n for every z ∈ Ω \ (S_1 ∪ S_2).

The strengthening of Theorem 19.2.3 arises in statement (b), where the linear independence is asserted for all z ∈ Ω and not only for z ∈ Ω \ (S_1 ∪ S_2) as asserted in Theorem 19.2.3. The proof of Theorem 19.2.4 is obtained in the course of the proof of Theorem 19.2.3.
We illustrate Theorem 19.2.4 with a simple example.

EXAMPLE 19.2.3. Let

Here S_1 = ∅; S_2 = {0}. The eigenvectors x_1(z) and x_2(z) corresponding to the eigenvalues 0 and z^2 of A(z), respectively, are analytic and nonzero for all z ∈ ℂ (including the point z = 0), as ensured by Theorem 19.2.4. However, x_1(z) and x_2(z) are not linearly independent for z = 0. □
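The behaviour in this example is easy to reproduce numerically. The matrix below is an assumed illustrative choice (the displayed family above is not fully legible in this copy): A(z) = [[0, z], [0, z^2]] has eigenvalues 0 and z^2 with analytic eigenvectors (1, 0)^T and (1, z)^T, which degenerate at z = 0.

```python
import numpy as np

def A(z):
    # assumed illustrative family with eigenvalues 0 and z**2
    return np.array([[0.0, z], [0.0, z**2]], dtype=complex)

x1 = lambda z: np.array([1.0, 0.0], dtype=complex)  # eigenvector for eigenvalue 0
x2 = lambda z: np.array([1.0, z], dtype=complex)    # eigenvector for eigenvalue z**2

for z in [1.0, 0.1, 0.0]:
    r0 = np.linalg.norm(A(z) @ x1(z))                     # residual for eigenvalue 0
    r1 = np.linalg.norm(A(z) @ x2(z) - z**2 * x2(z))      # residual for eigenvalue z**2
    det = np.linalg.det(np.column_stack([x1(z), x2(z)]))  # equals z, vanishes at z = 0
    print(f"z={z}: residuals {r0:.1e}, {r1:.1e}; |det[x1 x2]| = {abs(det):.3f}")
```

Both eigenvectors stay analytic and nonzero through z = 0, yet det[x_1(z) x_2(z)] = z, so the basis property fails exactly on S_2 = {0}.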

19.3 PROOF OF THEOREM 19.2.3

We need some preparation for the proof of Theorem 19.2.3.
A family of transformations B(z): ℂ^n → ℂ^n is called branch analytic on Ω if B(z) is analytic on Ω except for a discrete set of algebraic (as opposed to logarithmic) branch points. The same definition applies to n-dimensional vector functions as well. The singular set of a family of transformations B(z): ℂ^n → ℂ^n that is branch analytic on Ω is, by definition, the set of all z_0 ∈ Ω such that

It is easily seen that the singular set is discrete and coincides with the set of all z_0 ∈ Ω with

We use the notation S(B) to designate the singular set of B(z).

Lemma 19.3.1
Let B(z): ℂ^n → ℂ^m be a branch analytic family of transformations on Ω. Then there exist m-dimensional branch analytic vector-valued functions y_1(z), …, y_r(z) on Ω, and n-dimensional branch analytic vector-valued functions x_1(z), …, x_{n−r}(z) on Ω, with the following properties: (a) each branch point of any function y_j(z), j = 1, …, r, or x_k(z), k = 1, …, n − r, is also a branch point of B(z); (b) y_1(z), …, y_r(z) are linearly independent for every z ∈ Ω; (c) x_1(z), …, x_{n−r}(z) are linearly independent for every z ∈ Ω; (d) Span{y_1(z), …, y_r(z)} = Im B(z) and Span{x_1(z), …, x_{n−r}(z)} = Ker B(z) for every z not belonging to S(B).

The proof of this lemma can be obtained by repeating the proofs of Lemma 18.2.2 and Theorem 18.2.1 with the following modification: in place of the Weierstrass and Mittag-Leffler theorems (Lemmas 18.2.3 and 18.2.4), one must use the branch analytic and branch meromorphic versions of these theorems. [In the context of Riemann surfaces, these versions can be found in Kra (1972).]

Lemma 19.3.2
Let B_1(z): ℂ^n → ℂ^m and B_2(z): ℂ^n → ℂ^m be branch analytic families of transformations on Ω such that

Ker B_1(z) ⊇ Ker B_2(z)

for every z ∈ Ω that does not belong to the union of the singular sets of B_1(z) and B_2(z). Then there exist branch analytic n-dimensional vector functions x_1(z), …, x_s(z), z ∈ Ω, with the following properties: (a) every branch point of any x_j(z), j = 1, …, s, is also a branch point of at least one of B_1(z) and B_2(z); (b) x_1(z), …, x_s(z) are linearly independent for every z ∈ Ω; (c) for every z ∈ Ω that does not belong to S(B_1) ∪ S(B_2) the vectors x_1(z), …, x_s(z) form a basis in Ker B_1(z) modulo Ker B_2(z).

An analogous result also holds in case Im B_1(z) ⊇ Im B_2(z) for all z ∈ Ω, with the possible exception of singular points of B_1(z) and B_2(z).

Proof. We regard B_1(z) and B_2(z) as m × n matrix functions, with respect to fixed bases in ℂ^n and ℂ^m. By Lemma 19.3.1, find linearly independent branch analytic vector-valued functions y_1(z), …, y_v(z) on Ω such that

Span{y_1(z), …, y_v(z)} = Ker B_2(z)    (19.3.1)

for all z ∈ Ω not belonging to the singular set of B_2(z). Fix z_0 ∈ Ω, and choose x_{v+1}, …, x_n in such a way that y_1(z_0), …, y_v(z_0), x_{v+1}, …, x_n form a basis in ℂ^n. Using the branch analytic version of Lemma 18.2.2 (cf. the paragraph following Lemma 19.3.1), find branch analytic vector functions y_{v+1}(z), …, y_n(z), z ∈ Ω, such that y_1(z), …, y_v(z), y_{v+1}(z), …, y_n(z) form a basis in ℂ^n for every z ∈ Ω. Replacing, if necessary, B_i(z) by B_i(z)S(z), i = 1, 2, where S(z) = [y_1(z) ⋯ y_n(z)] is an invertible n × n matrix function, we can assume that

where the B_i′(z) are branch analytic m × (n − v) matrix functions, and Ker B_2′(z) = 0 for all z ∈ Ω with the possible exception of a discrete set of points. By Lemma 19.3.1 again, find branch analytic linearly independent ℂ^{n−v}-valued functions x̃_1(z), …, x̃_s(z), z ∈ Ω, such that x̃_1(z), …, x̃_s(z) is a basis in Ker B_1′(z) for all z ∈ Ω except for the singular points of B_1′(z). Then the vector functions

satisfy the requirements of Lemma 19.3.2. □

Lemma 19.3.3
Let B_1(z) and B_2(z) be as in Lemma 19.3.2, and let x_1(z), …, x_t(z) be branch analytic n-dimensional vector functions with the following properties: (a) every branch point of any x_j(z), j = 1, …, t, is also a branch point of at least one of B_1(z) and B_2(z); (b) there exists a discrete set T ⊃ S(B_1) ∪ S(B_2) such that x_1(z), …, x_t(z) belong to Ker B_1(z) and are linearly independent modulo Ker B_2(z) for every z ∈ Ω \ T. Then there exist branch analytic n-dimensional vector functions x_{t+1}(z), …, x_s(z) such that every branch point of any x_j(z), j = t + 1, …, s, is a branch point of at least one of B_1(z) and B_2(z), and for every z ∈ Ω \ T the set x_1(z), …, x_t(z), x_{t+1}(z), …, x_s(z) forms a basis in Ker B_1(z) modulo Ker B_2(z).

The case t = 0 [when the set x_1(z), …, x_t(z) does not appear] is not excluded in Lemma 19.3.3.

Proof. Arguing as in the proof of Lemma 19.3.2, we can assume that Ker B_2(z) = 0 for every z ∉ S(B_2). Replacing T by T ∪ S(B_2), we can assume that S(B_2) = ∅.
Further, by the branch analytic version of Lemma 18.2.2, there exist branch analytic and linearly independent vector functions y_1(z), …, y_t(z) with

There exist branch analytic vector functions y_{t+1}(z), …, y_n(z) such that y_1(z), …, y_n(z) form a basis in ℂ^n for every z ∈ Ω (cf. the proof of Lemma 19.3.2). By replacing B_1(z) by B_1(z)[y_1(z) ⋯ y_n(z)], we can assume that

and the proof is reduced to the case t = 0. But then Lemma 19.3.1 is applicable. □

We are ready now to prove Theorem 19.2.3. The main idea is to mimic
the proof of the Jordan form for a transformation (Section 2.3) using
Lemma 19.3.2 when necessary.

Proof of Theorem 19.2.3. For a fixed j (j = 1, …, ν) let m_{j1} be the maximal positive integer p such that

for all z ∉ S_1 ∪ S_2. By Theorem 19.2.1 and the definition of S_2 the number m_{j1} is well defined. By Lemma 19.3.2, there exist branch analytic vector functions on Ω that are linearly independent for every z ∈ Ω, can have branch points only in S_1, and are such that

form a basis in Ker(μ_j(z)I − A(z))^{m_{j1}} modulo Ker(μ_j(z)I − A(z))^{m_{j1}−1}, for every z ∈ Ω that does not belong to S((μ_j(z)I − A(z))^{m_{j1}}) ∪ S((μ_j(z)I − A(z))^{m_{j1}−1}). As we have seen in the proof of the Jordan form, the vectors

are linearly independent modulo Ker(μ_j(z)I − A(z))^{m_{j1}−2} for every z ∉ S_1 ∪ S_2 (we assume here that m_{j1} ≥ 2). By Lemma 19.3.3, there exist branch analytic vector functions on Ω:

with branch points only in S_1 and such that for every z ∉ S_1 ∪ S_2 the vectors

form a basis in Ker(μ_j(z)I − A(z))^{m_{j1}−1} modulo Ker(μ_j(z)I − A(z))^{m_{j1}−2}. Continuing this process as in the proof of the Jordan form, we obtain the vector functions (19.2.3) with the desired properties. □

19.4 ANALYTIC EXTENDABILITY OF INVARIANT SUBSPACES

In this section we study the following problem: given an analytic family of transformations A(z) on Ω and an invariant subspace M_0 of A(z_0), when is there a family of subspaces M(z) that is analytic in some domain Ω′ ⊂ Ω with z_0 ∈ Ω′, and such that M(z_0) = M_0 and M(z) is A(z) invariant for all z ∈ Ω′? (As before, Ω is a domain in ℂ.) If this happens, we say that M_0 is extendable to an analytic A(z)-invariant family of subspaces on Ω′. The main result in this direction is given in the following theorem.

Theorem 19.4.1
Let A(z): ℂ^n → ℂ^n be an analytic family of transformations on Ω with the first and second exceptional sets S_1 and S_2, respectively. Then, provided z_0 ∈ Ω \ (S_2 ∪ S_1), every A(z_0)-invariant subspace M_0 is extendable to an analytic A(z)-invariant family of subspaces on Ω \ S_1.

Proof. For j = 1, …, ν, let

(19.4.1)

be n-dimensional vector functions as in Theorem 19.2.3. We consider A(z) and the vectors (19.4.1) as an n × n matrix function and n-dimensional vector functions, respectively, written in the standard orthonormal basis in ℂ^n. Let z_0 ∈ Ω \ (S_2 ∪ S_1), and let J be the Jordan form of A(z_0):

where

and J_k(μ) is the k × k Jordan block with eigenvalue μ. For z ∈ Ω \ S_1 let T(z) be the n × n matrix whose columns are the vectors (19.4.1) (in this order). Observe that T(z) is analytic on Ω \ S_1 with algebraic branch points in S_1, and T(z) is invertible for z ∈ Ω \ (S_2 ∪ S_1) [the function T(z) is analytic but not necessarily invertible at points in S_2]. Then we have A(z_0)T(z_0) = T(z_0)J. Given an A(z_0)-invariant subspace M_0 and any z ∈ Ω \ (S_1 ∪ S_2), define

Clearly, M(z) is analytic and A(z) invariant for z ∈ Ω \ (S_1 ∪ S_2), and also M(z_0) = M_0. We show that M(z) admits an analytic and A(z)-invariant continuation into the set S_2. Let f_1, …, f_k be a basis in M_0; then the vectors

form a basis in M(z) for every z ∈ Ω \ (S_1 ∪ S_2). Note that g_1(z), …, g_k(z) are analytic on Ω \ S_1. By Lemma 18.2.2 there exist n-dimensional vector functions h_1(z), …, h_k(z) that are analytic on Ω \ S_1, linearly independent for every z ∈ Ω \ S_1, and for which

Span{h_1(z), …, h_k(z)} = Span{g_1(z), …, g_k(z)}

whenever z ∉ S_1 ∪ S_2. Putting M(z) = Span{h_1(z), …, h_k(z)} for z ∈ S_2, we clearly obtain an analytic extension of the analytic family {M(z)}_{z ∈ Ω \ (S_1 ∪ S_2)} to the points in S_2. As for a fixed z_0 ∈ S_2 we have

it follows in view of Theorem 13.4.2 that M(z_0) is A(z_0) invariant. □

The proof of Theorem 19.4.1 shows that the analytic A(z)-invariant family of subspaces M(z) on Ω \ S_1 with M(z_0) = M_0 has at most algebraic branch points in S_1, in the following sense. For every z′ ∈ S_1, either M(z) can be analytically continued into z′ (i.e., there exists a subspace M′, which is necessarily A(z′) invariant, and for which the family of subspaces N(z), z ∈ (Ω \ S_1) ∪ {z′}, defined by N(z) = M(z) on Ω \ S_1, N(z′) = M′, is analytic on (Ω \ S_1) ∪ {z′}), or M(z) = S(z)M_0 in a neighbourhood of z′, where S(z) is an invertible family of transformations that is analytic on a deleted neighbourhood of z′ and has an algebraic branch point at z′.
Looking ahead to the applications of the next chapter, we introduce the notion of analytic extendability of chains of invariant subspaces. Let A(z): ℂ^n → ℂ^n be an analytic family of transformations on Ω, and let

Λ_0 = (M_{01} ⊂ ⋯ ⊂ M_{0r})

be a chain of A(z_0)-invariant subspaces. We say that Λ_0 is extendable to an analytic chain of A(z)-invariant subspaces on a set Ω′ ⊂ Ω containing z_0 if there exist analytic families of subspaces M_{01}(z), …, M_{0r}(z) on Ω′ such that M_{0j}(z_0) = M_{0j} for j = 1, …, r, M_{0j}(z) ⊂ M_{0k}(z) for j < k and z ∈ Ω′, and M_{0j}(z) is A(z) invariant for all z ∈ Ω′. Clearly, this is a generalization of the notion of extendability of a single invariant subspace dealt with in Theorem 19.4.1. The arguments used in the proof of Theorem 19.4.1 also prove the following result on analytic extendability of chains of invariant subspaces.

Theorem 19.4.2
Let A(z), S_1, and S_2 be as in Theorem 19.4.1. Then every chain of A(z_0)-invariant subspaces, where z_0 ∈ Ω \ (S_2 ∪ S_1), is extendable to an analytic chain of A(z)-invariant subspaces on Ω \ S_1. Moreover, the analytic families of subspaces that form this analytic chain have at most algebraic branch points at S_1 (in the sense explained after the proof of Theorem 19.4.1).

Chains consisting of spectral subspaces are important examples of chains of subspaces that are always analytically extendable. Recall that an A-invariant subspace M is called spectral if M is a sum of root subspaces of A.

Theorem 19.4.3
Let A(z) and S_1 be as in Theorem 19.4.1. Then every chain Λ_0 = (M_{01} ⊂ ⋯ ⊂ M_{0r}) of spectral subspaces of A(z_0), where z_0 ∈ Ω, is extendable to an analytic chain of A(z)-invariant subspaces on Ω \ (S_1 \ {z_0}) that has at most algebraic branch points at S_1 \ {z_0}.

Proof. For j = 1, …, r write M_{0j} = Im P_j(z_0), where

P_j(z_0) = (2πi)^{−1} ∮_{Γ_j} (λI − A(z_0))^{−1} dλ

is the Riesz projector of A(z_0) corresponding to a suitable simple rectifiable contour Γ_j. We can assume that Γ_k lies in the interior of Γ_j for j > k. Let 𝒰 ⊂ Ω be a neighbourhood of z_0 that is so small that A(z) has no eigenvalues on Γ_1 ∪ ⋯ ∪ Γ_r for z ∈ 𝒰. Clearly, for z ∈ 𝒰 we find that

where the M_j(z) = Im P_j(z) form an analytic chain of A(z)-invariant subspaces in 𝒰. Fix z̃ ∈ 𝒰 \ (S_1 ∪ S_2), and let M̃_j(z) be the analytic A(z)-invariant family of subspaces (cf. the proof of Theorem 19.4.1) to which M_j(z̃) is extendable. It is easily seen that M̃_j(z) = M_j(z) for z ∈ 𝒰 \ (S_1 ∪ S_2), so Λ_0 admits the desired extension. □
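The Riesz projector used in this proof can be approximated numerically by discretizing the contour integral. The following sketch is illustrative only (the test matrix and the circular contour are assumptions, with the circle chosen so that it separates the spectrum):

```python
import numpy as np

def riesz_projector(A, center, radius, n_nodes=400):
    # P = (2*pi*i)^{-1} * contour integral of (lambda*I - A)^{-1} d(lambda)
    # over the circle |lambda - center| = radius (trapezoidal rule)
    n = A.shape[0]
    P = np.zeros((n, n), dtype=complex)
    for t in 2 * np.pi * np.arange(n_nodes) / n_nodes:
        lam = center + radius * np.exp(1j * t)                        # contour point
        dlam = 1j * radius * np.exp(1j * t) * (2 * np.pi / n_nodes)   # d(lambda)
        P += np.linalg.solve(lam * np.eye(n) - A, np.eye(n)) * dlam
    return P / (2j * np.pi)

A = np.diag([0.0, 0.1, 3.0])   # two eigenvalues inside the unit circle, one outside
P = riesz_projector(A, center=0.0, radius=1.0)
print(np.round(P.real, 6))     # projector onto the sum of the root subspaces inside
```

For a chain of spectral subspaces one would use nested contours Γ_1, Γ_2, …, exactly as in the proof.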

To analyze the extendability of A(z_0)-invariant subspaces when z_0 ∈ S_2, we need the following notion. An invariant subspace M_0 of A(z_0), z_0 ∈ Ω, is called sequentially isolated (in Ω) if there is no sequence z_m ≠ z_0, m = 1, 2, …, of points in Ω tending to z_0 such that, for some A(z_m)-invariant subspaces M_m (m = 1, 2, …), we have lim_{m→∞} θ(M_m, M_0) = 0. Theorem 19.4.1 shows, in particular, that every A(z_0)-invariant subspace with z_0 ∈ Ω \ (S_1 ∪ S_2) is sequentially nonisolated. However, certain A(z_0)-invariant subspaces with z_0 ∈ S_2 may be sequentially isolated, as follows.

EXAMPLE 19.4.1. Let

$$A(z) = \begin{bmatrix} 0 & z \\ 0 & 0 \end{bmatrix}.$$

Here S_1 is empty, S_2 = {0}. Any A(0)-invariant subspace of the form Span{(γ, 1)^T}, where γ is a complex number, is sequentially isolated. On the other hand, the A(0)-invariant subspace Span{(1, 0)^T} is sequentially nonisolated. □

Clearly, a sequentially isolated A(z_0)-invariant subspace is not extendable to an analytic A(z)-invariant family of subspaces on a neighbourhood of z_0.
We conjecture that these are the only nonextendable invariant subspaces.

Conjecture 19.4.4
Let A(z), S_1, and S_2 be as in Theorem 19.4.1. Then every sequentially nonisolated A(z_0)-invariant subspace M_0, where z_0 ∈ S_2, is extendable to an analytic A(z)-invariant family of subspaces on Ω \ S_1 that has at most algebraic branch points in S_1 (in the same sense as in the remark following the proof of Theorem 19.4.1).

Theorem 19.4.3 verifies this conjecture in case M_0 is a spectral subspace.

19.5 ANALYTIC MATRIX FUNCTIONS OF A REAL VARIABLE

The results of Sections 19.1–19.4 hold also for n × n matrix functions A(t) that are analytic in the real variable t on an open interval Ω of the real line. Of particular interest is the case when all eigenvalues of A(t) are real, as follows.

Theorem 19.5.1
Let A(t) be an n × n matrix function that is analytic in the real variable t on Ω. Assume that, for all t ∈ Ω, all eigenvalues of A(t) are real. Then the eigenvalues of A(t) are also analytic functions of t on Ω.

Proof. Let t_0 ∈ Ω. By Theorem 19.1.1, all eigenvalues of A(t), for t in a neighbourhood of t_0, are given by fractional power series of the form

λ(t) = λ_0 + Σ_{j=1}^∞ c_j (t − t_0)^{j/α}

where the c_j are complex numbers. Let j_1 be the first index such that c_{j_1} ≠ 0. [If all the c_j are zeros, then λ(t) = λ_0 is obviously analytic at t_0.] Then

λ(t) = λ_0 + c_{j_1}(t − t_0)^{j_1/α} + o(|t − t_0|^{j_1/α})    (19.5.1)

Take t > t_0 and (t − t_0)^{1/α} positive. Since λ(t) and λ_0 are real, we find that c_{j_1} must be real. In (19.5.1) we now take t < t_0 and (t − t_0)^{1/α} = |t − t_0|^{1/α}(cos(π/α) + i sin(π/α)). We obtain a contradiction with the fact that c_{j_1} is real unless j_1 is a multiple of α. If j_2 > j_1 is the minimal integer with c_{j_2} ≠ 0, then

λ(t) = λ_0 + c_{j_1}(t − t_0)^{j_1/α} + c_{j_2}(t − t_0)^{j_2/α} + o(|t − t_0|^{j_2/α})

and the preceding argument shows that c_{j_2} is real and j_2 is a multiple of α. Continuing in this way, we conclude that λ(t) is analytic in a neighbourhood of t_0. As t_0 was arbitrary in Ω, the analyticity of λ(t) on Ω follows. □

Combining this result with Theorems 19.2.3 and 19.4.1, we have the
following corollary.

Corollary 19.5.2
Let A(t) be an analytic n × n matrix function of a real variable t on Ω, and assume that all eigenvalues of A(t) are real when t ∈ Ω. Let S_2 be the discrete set of points in Ω defined by the property that either

where ν(t) is the number of distinct eigenvalues of A(t), or

but for at least one analytic eigenvalue μ_j(t) of A(t) the partial multiplicities of μ_j(t_0) are different from the partial multiplicities of μ_j(t), t ≠ t_0, in a real neighbourhood of t_0. Then there exist analytic n-dimensional vector functions

(19.5.2)

on Ω such that for every t ∈ Ω \ S_2 the vectors (19.5.2) form a basis in ℂ^n and, for j = 1, …, r, x_{j1}(t), …, x_{jm_j}(t) is a Jordan chain of the transformation A(t). Moreover, when t_0 ∈ Ω \ S_2, every A(t_0)-invariant subspace M_0 is extendable to an analytic A(t)-invariant family of subspaces on Ω.

In particular, the conclusions of Corollary 19.5.2 hold for an analytic n × n matrix function A(t) of the real variable t ∈ Ω that is diagonable and all eigenvalues of which are real for every t ∈ Ω. These properties are satisfied, for example, if A(t) is an analytic matrix function on Ω that is hermitian for all t ∈ Ω.
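As a quick numerical illustration of the hermitian case (sketch only; the family below is an arbitrary choice), note that a standard eigensolver returns eigenvalues in nondecreasing order, so at a crossing the sorted branches have a kink while the analytic branches simply cross:

```python
import numpy as np

def B(t):
    # hermitian family, analytic in the real variable t, with a crossing at t = 0;
    # its analytic eigenvalue branches are t and -t
    return np.array([[t, 0.0], [0.0, -t]])

for t in [-0.2, -0.1, 0.0, 0.1, 0.2]:
    print(f"t={t:+.1f}: sorted eigenvalues {np.linalg.eigvalsh(B(t))}")
# the sorted output is (-|t|, |t|): the analytic branches t and -t are recovered
# only after swapping the two sorted values on one side of the crossing
```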

19.6 EXERCISES

19.1 Find the first and second exceptional sets for the following analytic
families of transformations:

19.2 In Exercise 19.1 (a) find a basis in ℂ^2 that is analytic on ℂ (with the possible exception of branch points) and consists of eigenvectors of A(z) (with the possible exception of a discrete set of values of z).
19.3 Describe the first and second exceptional sets for the following types of analytic families of transformations A(z): ℂ^n → ℂ^n on Ω:
(a) A(z) = diag[a_1(z), …, a_n(z)] is a diagonal matrix.
(b) A(z) is a circulant matrix (with respect to a fixed basis in ℂ^n) for every z ∈ Ω.
(c) A(z) is an upper triangular Toeplitz matrix for every z ∈ Ω.
(d) For every z ∈ Ω, all the entries of A(z), with the possible exception of the entries (i, j) with i = j or with i + j = n + 1, are zeros.
19.4 Show that the analytic matrix function of type α(z)I + β(z)A, where α(z) and β(z) are scalar analytic functions and A is a fixed n × n matrix, has all eigenvalues analytic.
19.5 Show that if A(z) = α(z)I + β(z)A is the function of Exercise 19.4 and β(z) is a polynomial of degree l, then the second exceptional set of A(z) contains not more than l points.
19.6 Prove that the number of exceptional points of a polynomial family of transformations Σ_{j=0}^k z^j A_j, z ∈ ℂ, is always finite. [Hint: Use the approach based on the resultant matrix (Section 19.2).]
19.7 Let A(z) be an analytic n × n matrix function defined on Ω whose values are circulant matrices. When is every A(z_0)-invariant subspace analytically extendable for every z_0 ∈ Ω?
19.8 Describe the analytically extendable A(z_0)-invariant subspaces, where A(z) is an analytic n × n matrix function on Ω with upper triangular Toeplitz values, and z_0 ∈ Ω.
19.9 Let A(z): ℂ^n → ℂ^n be an analytic family of transformations defined on Ω, and assume that A(z_0) is nonderogatory for some z_0 ∈ Ω. Prove that every A(z_0)-invariant subspace is sequentially nonisolated. (Hint: Use Theorem 15.2.3.)

19.10 Let A(z) be an analytic n × n matrix function of the real variable z ∈ Ω, where Ω is an open interval on the real line, such that A(z) is hermitian for every z ∈ Ω. Prove that there exist analytic families x_1(z), …, x_n(z) of n-dimensional vectors on Ω such that for every z_0 ∈ Ω the vectors x_1(z_0), …, x_n(z_0) form an orthonormal basis of eigenvectors of A(z_0). [Hint: Let λ_0(z) be an eigenvalue of A(z) that is analytic on Ω (one exists by Theorem 19.5.1). Choose an analytic vector function x_1(z) ∈ Ker(A(z) − λ_0(z)I) on Ω with ‖x_1(z)‖ = 1. Repeat this argument for the restriction of A(z) to Span{x_1(z)}^⊥ (recall that Span{x_1(z)}^⊥ is an analytic family of subspaces on Ω), and so on.]
19.11 Let A and B be hermitian n × n matrices, and assume that A has n distinct eigenvalues λ_1, λ_2, …, λ_n. Show that in the power series

representing the eigenvalue λ_k(z) of A + zB, and the corresponding power series representing the eigenvector f_k(z) of A + zB, for z sufficiently close to zero, we have

where the a_k are pure imaginary numbers. It is assumed that ‖f_k(z)‖ = 1 for real z sufficiently close to zero. [Hint: By Exercise 19.10, the eigenvalue λ_k(z) and the corresponding eigenvector f_k(z) are analytic functions of z. Show that the equality

(1)

holds. Find λ_k^{(1)} by taking the scalar product of (1) with f_k. By taking the scalar product of (1) with f_j (j ≠ k) it is found that

The condition ‖f_k(z)‖ = 1 gives (f_k^{(1)}, f_k) + (f_k, f_k^{(1)}) = 0.]


Chapter Twenty

Applications

This chapter contains applications of the results of the previous two chap-
ters. These applications are concerned with problems of factorizations of
monic matrix polynomials and rational matrix functions depending analyti-
cally on a parameter. The main problem is the analysis of analytic properties
of divisors. Solutions of a matrix quadratic equation with coefficients
depending analytically on a parameter are also analyzed.

20.1 FACTORIZATION OF MONIC MATRIX POLYNOMIALS

Consider a monic matrix polynomial L(λ) = Iλ^l + Σ_{j=0}^{l−1} A_j λ^j, where A_0, …, A_{l−1} are n × n matrices that depend analytically on the parameter z for z ∈ Ω, and Ω is a domain in the complex plane. We write A_j = A_j(z) and L(λ) = L(λ, z). In this section we study the behaviour of factorizations L(λ, z) = L_1(λ, z) ⋯ L_r(λ, z) of L(λ, z) as functions of z. Our attention is focused on the problem of analytic extension of factorizations from a given z_0 ∈ Ω.
Let

$$C(z) = \begin{bmatrix} 0 & I & 0 & \cdots & 0 \\ 0 & 0 & I & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & I \\ -A_0(z) & -A_1(z) & -A_2(z) & \cdots & -A_{l-1}(z) \end{bmatrix}$$

be the companion matrix of L(λ, z). Obviously, C(z) is an analytic nl × nl matrix function on Ω. The first (resp. second) exceptional set of C(z) is called the first (resp. second) exceptional set of L(λ, z). In other words (see Chapter 19), z_0 ∈ Ω belongs to the first exceptional set S_1 of L(λ, z) if and only if not all solutions of det L(λ, z) = 0 (as functions of z) are analytic at z_0. The point z_0 belongs to the second exceptional set S_2 of L(λ, z) if and only if all solutions of det L(λ, z) = 0 are analytic in a neighbourhood of z_0 and, denoting by λ_1(z), …, λ_r(z) all the different analytic functions in a neighbourhood of z_0 satisfying det L(λ_j(z), z) = 0, j = 1, …, r, we have either (a) λ_j(z_0) = λ_k(z_0) for some j ≠ k, or (b) all the numbers λ_1(z_0), …, λ_r(z_0) are different, but for at least one λ_j(z) the partial multiplicities of L(λ, z) at λ_j(z) are not the same when z = z_0 and when z ≠ z_0 (and z is sufficiently close to z_0).
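The companion matrix is straightforward to assemble numerically; the sketch below (with arbitrary test coefficients) checks that its eigenvalues are exactly the solutions of det L(λ, z) = 0 for the fixed z chosen.

```python
import numpy as np

def companion(coeffs):
    # companion matrix of L(lambda) = I*lambda**l + sum_j coeffs[j]*lambda**j,
    # where coeffs = [A_0, ..., A_{l-1}] are n x n blocks
    n, l = coeffs[0].shape[0], len(coeffs)
    C = np.zeros((n * l, n * l), dtype=complex)
    C[: n * (l - 1), n:] = np.eye(n * (l - 1))       # identity blocks above the diagonal
    for j, Aj in enumerate(coeffs):
        C[n * (l - 1):, n * j: n * (j + 1)] = -Aj    # last block row: -A_0, ..., -A_{l-1}
    return C

# test data: L(lambda, z) = I*lambda**2 + A_1(z)*lambda + A_0(z) at z = 2
z = 2.0
A0 = np.array([[z, 0.0], [0.0, 1.0]])
A1 = np.array([[0.0, z], [0.0, 0.0]])
C = companion([A0, A1])
print(np.sort_complex(np.linalg.eigvals(C)))  # the nl = 4 zeros of det L(., z)
```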
Now we state the main result on analytic extendability of factorizations of
L(λ, z).

Theorem 20.1.1
Let z_0 ∈ Ω \ (S_1 ∪ S_2) and

L(λ, z_0) = L_1(λ) ⋯ L_r(λ)    (20.1.1)

where L_j(λ), j = 1, …, r, are monic matrix polynomials and S_1 (resp. S_2) is the first (resp. second) exceptional set of L(λ, z). Then there exist monic matrix polynomials L_1(λ, z), …, L_r(λ, z) whose coefficients are analytic functions on Ω \ (S_1 ∪ S) (where S is some discrete subset of Ω \ {z_0}), having at most poles in S and at most algebraic branch points in S_1, and such that

L(λ, z) = L_1(λ, z) ⋯ L_r(λ, z),  L_j(λ, z_0) = L_j(λ), j = 1, …, r.

Note that the case when S_1 ∩ S ≠ ∅ is not excluded. This means that the coefficients A_{jk}(z) of L_j(λ, z) may have an algebraic branch point and a pole at the same point z′ simultaneously; that is, there is a power series representation of type

in a deleted neighbourhood of z′, where p and q are positive integers.

Proof. We use the description of factorizations of monic matrix polynomials in terms of invariant subspaces developed in Chapter 5. Let

(20.1.2)

be a standard triple for L(λ), and let

(20.1.3)

be the chain of C(z_0)-invariant subspaces corresponding to the factorization (20.1.1) [with respect to the triple (20.1.2)]. In particular, for j = 1, …, r − 1, the transformations

are invertible, where p_j is the sum of degrees of the matrix polynomials L_{r−j+1}(λ), …, L_r(λ). By Theorem 19.4.2 the chain (20.1.3) is extendable to a chain M_1(z) ⊂ ⋯ ⊂ M_{r−1}(z) of C(z)-invariant subspaces that is analytic in Ω \ S_1 and has at most algebraic branch points in S_1. Let S = S^{(1)} ∪ ⋯ ∪ S^{(r−1)}, where S^{(j)} is the discrete set of all z ∈ Ω for which the transformation

is not invertible. For z ∈ Ω \ S, let

be the factorization of L(λ, z) that corresponds to the chain M_1(z) ⊂ ⋯ ⊂ M_{r−1}(z) of C(z)-invariant subspaces [with respect to the triple (20.1.2)]. Formulas (5.6.3) and (5.6.5) show that the coefficients of L_j(λ, z) have all the desired properties. □

In the same way (using Theorem 19.4.3 in place of Theorem 19.4.2) one
proves the analytic extendability of spectral factorization, as follows.

Theorem 20.1.2
Let z_0 ∈ Ω and

where σ(L_j) ∩ σ(L_k) = ∅ for j ≠ k. Then there exist monic matrix polynomials L_1(λ, z), …, L_r(λ, z) with the same properties as in Theorem 20.1.1, whose coefficients are, in addition, analytic at z_0.

We say that a factorization

(20.1.4)

of monic matrix polynomials L_j(λ) = Iλ^{l_j} + Σ_{k=0}^{l_j−1} A_{jk}λ^k, j = 1, …, r, is sequentially nonisolated if there is a sequence of points {z_m}_{m=1}^∞ in Ω \ {z_0} such that lim_{m→∞} z_m = z_0 and a sequence of factorizations

where

with lim_{m→∞} A_{jk}^{(m)} = A_{jk} for k = 0, …, l_j − 1 and j = 1, …, r. Theorem 20.1.1 shows, in particular, that every factorization (20.1.4) with z_0 ∉ S_1 ∪ S_2 is sequentially nonisolated. Simple examples show that sequentially isolated factorizations do exist, for instance:

EXAMPLE 20.1.1. Let C(z) be any matrix depending analytically on z in a domain Ω with the property that for z = z_0 ∈ Ω, C(z_0) has a square root and for z ≠ z_0, z in a neighbourhood of z_0, C(z) has no square root. The prime example here is

$$C(z) = \begin{bmatrix} 0 & z \\ 0 & 0 \end{bmatrix}, \qquad z_0 = 0.$$

Then define L(λ, z) = Iλ^2 − C(z). It is easily seen that if L(λ, z) has a right divisor Iλ − A(z), then L(λ, z) = Iλ^2 − A(z)^2, and hence L(λ, z) has a monic right divisor of degree 1 if and only if C(z) has a square root. Thus, under the hypotheses stated, L(λ, z) has an isolated divisor at z_0. □

It is an open question whether every sequentially nonisolated factorization L(λ, z_0) = L_1(λ) ⋯ L_r(λ) of monic matrix polynomials with z_0 belonging to the second exceptional set S_2 of L(λ, z) is analytically extendable in the sense of Theorem 20.1.1. (It is clear that sequential nonisolatedness is a necessary condition for analytic extendability.) A proof of Conjecture 19.4.4 will answer this question in the affirmative.

20.2 RATIONAL MATRIX FUNCTIONS DEPENDING ANALYTICALLY ON A PARAMETER

In this section we study the realizations and exceptional points of rational matrix functions that depend analytically on a parameter. This will serve as a background for the study of analytic extendability of minimal factorizations of such functions, to be dealt with in the next section.
Let W(λ, z) = [w_{ij}(λ, z)]_{i,j=1}^n be a rational n × n matrix function that depends analytically on the parameter z for z ∈ Ω, where Ω is a domain in ℂ. That is, each entry w_{ij}(λ, z) is a function of type p_{ij}(λ, z)/q_{ij}(λ, z), where p_{ij}(λ, z) and q_{ij}(λ, z) are (scalar) polynomials in λ whose coefficients are analytic functions of z on Ω. We assume that:

(a) For each i and j and for all z ∈ Ω, the polynomial q_{ij}(λ, z) in λ is not identically zero, so the rational matrix function W(λ, z) is well defined for every z ∈ Ω.
(b) It is convenient to make a further assumption, namely, that for each pair of indices i, j (1 ≤ i, j ≤ n) there exists a z_0 ∈ Ω such that the leading coefficient of q_{ij}(λ, z) is nonzero at z = z_0 and the polynomials p_{ij}(λ, z_0) and q_{ij}(λ, z_0) are coprime, that is, have no common zeros. In particular, this assumption rules out the case when p_{ij}(λ, z) and q_{ij}(λ, z) have a nontrivial common divisor whose coefficients depend analytically on z for z ∈ Ω.
(c) Finally, we assume that for every z ∈ Ω the rational matrix function W(λ, z) (as a function of λ) is analytic at infinity and W(∞, z) = I.

Assumptions (a), (b), and (c) are maintained throughout this section.
It can happen that W(λ, z) has zeros and poles tending to infinity when z tends to a certain point z_0 ∈ Ω. This is illustrated in the next example.

EXAMPLE 20.2.1. Let

$$W(\lambda, z) = \frac{1 + \lambda z}{z + 1 + \lambda z}.$$

Obviously, W(λ, z) satisfies conditions (a), (b), and (c). Specifically, W(λ, z) depends analytically on z for z ∈ ℂ, W(∞, z) = 1 for all z ∈ ℂ, and the polynomials 1 + λz and z + 1 + λz have no common zeros for z = 1. However, W(λ, z) has a zero at λ = −z^{−1} and a pole at λ = −(z + 1)z^{−1}, and both tend to infinity as z → 0. □
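Assuming the formula for W(λ, z) reconstructed above, the escape of the zero and the pole as z → 0 can be tabulated directly (sketch only):

```python
zero = lambda z: -1.0 / z          # root of 1 + lambda*z
pole = lambda z: -(z + 1.0) / z    # root of z + 1 + lambda*z

for z in [1.0, 0.1, 0.01, 0.001]:
    print(f"z={z}: zero at {zero(z):.1f}, pole at {pole(z):.1f}")
# both run off to infinity as z -> 0, while W(., 0) = 1 has no zeros or poles at all
```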

A convenient criterion for boundedness of zeros and poles in a neighbourhood of each point in Ω can be given in terms of the entries of W(λ, z), as follows.

Proposition 20.2.1
The poles and zeros of W(λ, z) are bounded in a neighbourhood of each point in Ω if and only if, for each entry p_{ij}(λ, z)/q_{ij}(λ, z) of W(λ, z), the leading coefficient of the polynomial q_{ij}(λ, z) has no zeros in Ω (as an analytic function of z on Ω).
Proof. Assume that the leading coefficient of each q_{ij}(λ, z) has no zeros in Ω. Fix z_0 ∈ Ω. Write q_{ij}(λ, z) = Σ_{k=0}^{s} a_k(z)λ^k, where a_s(z) has no zeros in Ω (in general, s depends on i and j). First, the zeros of q_{ij}(λ, z) are bounded in a neighbourhood of z_0. Indeed, writing the polynomial in the monic form λ^s + Σ_{k=0}^{s−1}(a_k(z)/a_s(z))λ^k, its zeros are all in the disc |λ| ≤ 1 + max_{z∈𝒰, 0≤k≤s−1} |a_k(z)/a_s(z)|, where 𝒰 is a suitably chosen neighbourhood of z_0. As the poles of W(λ, z) must also be zeros of at least one of the polynomials q_{ij}(λ, z), i, j = 1, …, n, it follows that there exists an M > 0 such that the poles of W(λ, z) are all in the disc |λ| ≤ M for every z ∈ 𝒰. Arguing by contradiction, assume that the zeros of W(λ, z) are not bounded in any neighbourhood of z_0. So there exist sequences {z_m} and {λ_m} such that z_m → z_0, |λ_m| → ∞, and λ_m is a zero of W(λ, z_m). Then W(λ_m, z_m)x_m = 0 for some vector x_m of norm 1. [Here we use the fact that λ_m is not a pole of W(λ, z_m).] Passing to a subsequence, if necessary, we can assume that x_m → x_0 for some x_0 with ‖x_0‖ = 1. Using also the fact that the entries of W(λ, z) are analytic in each variable separately for |λ| > M and z ∈ 𝒰, it follows that W(λ, z) is continuous on the set {(λ, z) : |λ| > M, z ∈ 𝒰} (including λ = ∞). A quick way to verify this is by using a general result that says that if a function f(z_1, …, z_m) of complex variables z_1, …, z_m, defined on V_1 × ⋯ × V_m where each V_j is a domain in ℂ, is analytic in each variable separately (when all other variables are fixed), then f(z_1, …, z_m) is analytic (in particular, continuous) on V_1 × ⋯ × V_m. For the proof of this result see, for example, Bochner and Martin (1948). Now the continuity of W(λ, z) implies W(∞, z_0)x_0 = lim_{m→∞} W(λ_m, z_m)x_m = 0, a contradiction with the fact that W(∞, z_0) = I.
Conversely, let z_0 be a zero of the leading coefficient of some q_{ij}(λ, z). Then there is a zero λ(z) of the polynomial q_{ij}(λ, z) such that λ(z) tends to infinity as z tends to z_0. As λ(z) is a pole of w_{ij}(λ, z) provided λ(z) is not a zero of p_{ij}(λ, z), we have only to show that λ(z) is not a zero of p_{ij}(λ, z) for z ≠ z_0 sufficiently close to z_0.
To this end, use the existence of a point z_1 such that the leading coefficient of q_{ij}(λ, z_1) is nonzero and the polynomials p_{ij}(λ, z_1), q_{ij}(λ, z_1) are coprime. The coprimeness of p_{ij}(λ, z) and q_{ij}(λ, z) is equivalent to the invertibility of the resultant matrix R(z) of the pair, as long as the leading coefficient of q_{ij}(λ, z) is nonzero [e.g., see Uspensky (1978)]. So det R(z_1) ≠ 0, and since det R(z) is an analytic function of z on Ω, it follows that det R(z) ≠ 0 for all z ≠ z_0 sufficiently close to z_0. Hence, indeed, p_{ij}(λ(z), z) ≠ 0 for z ≠ z_0 in some neighbourhood of z_0. □
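The resultant test for coprimeness used in the proof amounts to checking that the Sylvester matrix of the two polynomials is invertible; a minimal sketch, with an arbitrary pair of scalar polynomials, follows.

```python
import numpy as np

def sylvester(p, q):
    # Sylvester resultant matrix of polynomials p, q given as coefficient
    # lists (highest degree first); det != 0 iff p and q are coprime
    m, n = len(p) - 1, len(q) - 1
    R = np.zeros((m + n, m + n))
    for i in range(n):                      # n shifted copies of p
        R[i, i: i + m + 1] = p
    for i in range(m):                      # m shifted copies of q
        R[n + i, i: i + n + 1] = q
    return R

# p = (x + 1)(x + 2) and q = x + 3 are coprime: nonzero determinant
print(np.linalg.det(sylvester([1, 3, 2], [1, 3])))   # 2.0
# p = (x + 1)(x + 2) and q = x + 2 share a zero: determinant 0
print(np.linalg.det(sylvester([1, 3, 2], [1, 2])))   # 0.0
```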

It turns out that the boundedness of the poles and zeros of W(λ, z) is precisely the condition needed for the existence of an analytic minimal realization, in the following sense.

Theorem 20.2.2
Let W(λ, z) be a rational n × n matrix function that depends analytically on the parameter z ∈ Ω and satisfies assumptions (a), (b), and (c). Let the zeros and poles of W(λ, z) be bounded in a neighbourhood of every point in Ω. Then there exist analytic matrix functions on Ω, A(z), B(z), and C(z), of sizes m × m, m × n, and n × m, respectively, such that

W(λ, z) = I + C(z)(λI − A(z))^{−1}B(z)    (20.2.1)

and for every z ∈ Ω, with the possible exception of a discrete set S, the realization (20.2.1) is minimal.
Conversely, if (20.2.1) holds for some matrix functions A(z), B(z), and C(z) of appropriate sizes that are analytic on Ω, then the zeros and poles of W(λ, z) are bounded in a neighbourhood of every point in Ω.

Proof. By Theorem 7.1.2, for every z ∈ Ω there exists a realization

for some matrices C_0(z), A_0(z), and B_0(z). Further, by Proposition 20.2.1, the leading coefficients of the denominators of the entries in W(λ, z) have no zeros in Ω. Using this fact, the proof of Theorem 7.1.2 shows that A_0(z), B_0(z), and C_0(z) can be chosen to be analytic matrix functions of z on Ω. Let p × p be the size of A_0(z). By Theorem 18.2.1 we can find families of subspaces of ℂ^p that are analytic on Ω and are such that, for every z ∈ Ω with the possible exception of a discrete set S_1, we have

and

For z ∈ S_1 we have

and

By Theorem 18.3.2, when z ∈ Ω we may write

where the summands are the ranges of analytic families of projectors on Ω. Using the same Theorem 18.2.1, we find an analytic family of subspaces ℒ(z) on Ω such that

for every z ∈ Ω, except possibly for a discrete set S_2. For each z ∈ S_2 we have

In view of Theorem 18.3.2 there exists an analytic family of subspaces such that

for all z ∈ Ω. Also, Lemma 19.3.2, with

ensures the existence of an analytic family of subspaces ℳ(z) on Ω such that

for all z ∈ Ω. Let

where the operator involved is the projector on ℳ(z). We regard A(z), B(z), and C(z) as matrices with respect to a fixed basis x_1(z), …, x_m(z) in ℳ(z) such that the x_j(z) are analytic functions on Ω (such a basis exists in view of Theorem 18.3.2). It is easily seen that A(z), B(z), and C(z) are analytic on Ω. The proof of Theorem 6.1.3, together with Theorem 6.1.5, shows that

W(λ, z) = I + C(z)(λI − A(z))^{−1}B(z)    (20.2.2)

for every z ∈ Ω \ (S_1 ∪ S_2), and that (20.2.2) is a minimal realization for W(λ, z) when z ∉ S_1 ∪ S_2. By continuity, equation (20.2.2) holds also for z ∈ S_1 ∪ S_2, and the first part of Theorem 20.2.2 is proved.
Assume now that (20.2.1) holds for some analytic matrix functions A(z), B(z), and C(z). It follows from Theorem 7.2.3 that every pole of W(λ, z) is an eigenvalue of A(z) and every zero of W(λ, z) is an eigenvalue of A(z)^× = A(z) − B(z)C(z) (although the converse need not be true). As the eigenvalues of A(z) and of A(z) − B(z)C(z) depend continuously on z, they are bounded in a neighbourhood of each point in Ω, and the converse statement of Theorem 20.2.2 follows. □
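The realization formula (20.2.1) is easy to evaluate numerically; the following sketch (with arbitrary test data at a fixed z) confirms W(∞, z) = I and reads off candidate poles and zeros from A(z) and A(z) − B(z)C(z).

```python
import numpy as np

def W(lam, A, B, C):
    # realization W(lambda) = I + C (lambda*I - A)^{-1} B
    m, n = A.shape[0], B.shape[1]
    return np.eye(n) + C @ np.linalg.solve(lam * np.eye(m) - A, B)

A = np.array([[0.0, 1.0], [0.0, 2.0]])   # arbitrary data playing the roles of A(z), B(z), C(z)
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])

print(W(1e8, A, B, C))                  # approximately I: analytic at infinity
print(np.linalg.eigvals(A))             # candidate poles of W: eigenvalues of A
print(np.linalg.eigvals(A - B @ C))     # candidate zeros: eigenvalues of A - BC
```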

As the proof of Theorem 20.2.2 shows, the converse statement of this theorem remains true if the matrix functions A(z), B(z), and C(z) satisfying (20.2.1) are merely assumed to be continuous on Ω.
The discrete set S from Theorem 20.2.2 consists of exactly those points z where the McMillan degree of W(λ, z) is less than m. This follows from Theorems 7.1.3 and 7.1.5. Note also that the McMillan degree of W(λ, z) is equal to m for every z ∈ Ω \ S.
From now on it will be assumed (in addition to the assumptions made at the beginning of this section) that the zeros and poles of W(λ, z) are bounded in a neighbourhood of each point in Ω. Let

W(λ, z) = I + C(z)(λI − A(z))^{−1}B(z)    (20.2.3)

be a minimal realization of W(λ, z) as in Theorem 20.2.2. Here S is the set of all z ∈ Ω such that the realization (20.2.3) is not minimal. Denote by S_1 and S_2 the first and second exceptional sets, respectively, of the analytic matrix function A(z), as defined in Section 19.2. Similarly, let S_1^× and S_2^× be the first and second exceptional sets, respectively, of A(z)^× = A(z) − B(z)C(z). The set S_1 ∪ S_1^× will be called the first exceptional set T_1 of W(λ, z). As the poles (resp. zeros) of W(λ, z), when z ∉ S, are exactly the eigenvalues of A(z) [resp. of A(z)^×] (see Section 7.2), it follows that the point z_0 belongs to the first exceptional set of W(λ, z) if and only if there is a pole or a zero λ(z), z ∈ 𝒰 \ {z_0}, where 𝒰 is a neighbourhood of z_0, such that z_0 is an algebraic branch point of λ(z). Note that it can happen that (20.2.3) is not a minimal realization for some z belonging to the first exceptional set of W(λ, z) (see Example 20.2.2). The set

will be called the second exceptional set T_2 of W(λ, z). Denoting by δ(z) the McMillan degree of W(λ, z), we obtain the following description of the points in the second exceptional set: z_0 ∈ T_2 if and only if all poles and zeros of W(λ, z) can be continued analytically (as functions of z) to z_0, and either

or

and for at least one zero (or pole) λ_0(z) that is analytic in a neighbourhood 𝒰 of z_0, the zero (or pole) multiplicities of W(λ, z_0) corresponding to λ_0(z_0) are different from the zero (or pole) multiplicities of W(λ, z) at λ_0(z), z ∈ 𝒰 \ {z_0}. Again, it can happen that T_2 intersects with the set of points where the realization (20.2.3) is not minimal. Clearly, both T_1 and T_2 are discrete sets. Note also that the set T_1 ∪ T_2 contains all the points z_0 for which δ(z_0) < m.

EXAMPLE 20.2.2. Let

be a scalar rational function depending analytically on z ∈ ℂ. Clearly, W(∞, z) = 1, and the zeros and poles of W(λ, z) are bounded in a neighbourhood of each point (cf. Proposition 20.2.1). In the notation introduced above we have S_1 = {0, 2}; S_2 = {1}; S = {0}. Further,

and a calculation shows that

So the eigenvalues of A(z)^× are given by the formula

It is easily seen that S_1^× = {z_1, z_2, z_3}, where z_1, z_2, z_3 are the zeros of the polynomial −2z^3 + 6z^2 − 4z + 1, and S_2^× is empty. The first exceptional set of W(λ, z) is T_1 = {0, 2, z_1, z_2, z_3}, whereas the second exceptional set of W(λ, z) consists of the single point {1}. □

20.3 MINIMAL FACTORIZATIONS OF RATIONAL MATRIX FUNCTIONS

Let W(λ, z) be a rational n × n matrix function depending analytically on the parameter z for z ∈ Ω, as in the preceding section. Let

W(λ, z_0) = W_1(λ) ⋯ W_r(λ)    (20.3.1)

be a minimal factorization of W(λ, z_0) for some z_0 ∈ Ω. Here W_1(λ), …, W_r(λ) are n × n rational matrix functions with value I at infinity. We study the problem of continuation of (20.3.1) to an analytic family of minimal factorizations. In case z_0 does not belong to the exceptional sets of W(λ, z), such a continuation is always possible, as the following theorem shows.

Theorem 20.3.1
Let W(λ, z) be a rational n × n matrix function that depends analytically on z for z ∈ Ω and such that W(∞, z) = I for z ∈ Ω. Assume that the denominator and numerator of each entry in W(λ, z) are coprime for some z_0 ∈ Ω that is not a zero of the leading coefficient of the denominator. Assume, in addition, that the zeros and poles of W(λ, z) are bounded in a neighbourhood of each point in Ω. Let

S = {z ∈ Ω : δ(z) < m}

where δ(z) is the McMillan degree of W(λ, z), and let T_1 and T_2 be the first and second exceptional sets of W(λ, z), respectively. Consider a minimal factorization (20.3.1) with z_0 ∈ Ω \ (T_1 ∪ T_2 ∪ S). Then there exist rational matrix functions W_1(λ, z), …, W_r(λ, z), the entries of which depend analytically on z in Ω (with the possible exception of algebraic branch points in T_1 and of a discrete set D ⊂ Ω of poles), having the following properties: (a) W_j(∞, z) = I for j = 1, …, r and every z ∈ Ω \ D; (b) the point z_0 does not belong to D, and W_j(λ, z_0) = W_j(λ) for j = 1, …, r; (c) W(λ, z) = W_1(λ, z) ⋯ W_r(λ, z) for every z ∈ Ω \ D. Moreover, this factorization is minimal for every z ∈ Ω \ (T_1 ∪ T_2 ∪ S ∪ D).

The set D of poles of W_j(λ, z) in Theorem 20.3.1 generally depends on the factorization (20.3.1), and not only on the original function W(λ, z). This is in contrast with the sets T_1 and T_2, which depend on W(λ, z) only.

Proof. Let A(z), B(z), and C(z) be as in Theorem 20.2.2, so that the realization (20.2.1) is minimal for all z ∈ Ω \ S. Using Theorem 7.5.1, let

(20.3.2)

be the direct sum decomposition corresponding to the minimal factorization (20.3.1), with respect to the minimal realization

Thus, for j = 1, …, r − 1 the subspaces

are A(z_0) invariant, whereas the subspaces

(20.3.3)

are A(z_0)^× invariant. [Here, as usual, A(z)^× = A(z) − B(z)C(z).] Now, by Theorem 19.4.2, there exist families of subspaces ℳ_j(z) for j = 1, …, r − 1, and 𝒩_j(z) for j = 2, …, r, which are analytic on Ω, except possibly for algebraic branch points in T_1, and have the following properties: the ℳ_j(z) are A(z) invariant, and the 𝒩_j(z) are A(z)^× invariant; ℳ_j(z_0) = ℳ_{j0}, j = 1, …, r − 1; 𝒩_j(z_0) = 𝒩_{j0}, j = 2, …, r.
Let m_j be the dimension of the jth summand in the direct sum decomposition (20.3.2), j = 1, …, r (so m_1 + ⋯ + m_r = m). It follows from the proof of Theorem 19.4.2 that

where, for each j, the vector functions x_1^{(j)}(z), …, x_{p_j}^{(j)}(z), as well as y_1^{(j)}(z), …, y_{q_j}^{(j)}(z), are linearly independent for every z ∈ Ω and analytic on Ω except possibly for algebraic branch points in T_1. Here p_j = m_1 + ⋯ + m_j is the dimension of ℳ_j(z) and q_j = m_j + m_{j+1} + ⋯ + m_r is the dimension of 𝒩_j(z).
Our next observation is that

(20.3.4)

where D_j is a discrete set in Ω. [Note that the sum in (20.3.4) is direct.] Indeed, by (20.3.2) and (20.3.3) we have

where

(20.3.5)

is a matrix function of size m × (p_j + q_{j+1}) = m × m. It remains to observe that det F_j(z) is not identically zero [because det F_j(z_0) ≠ 0] and that (20.3.4) holds with D_j being the set of zeros of det F_j(z).
Let D = D_1 ∪ ⋯ ∪ D_{r−1}. In particular, we have

(20.3.6)

Note also that z_0 does not belong to D.
Consider the subspaces ℒ_j(z) = ℳ_j(z) ∩ 𝒩_j(z) for j = 2, …, r − 1. First, it is clear that

(20.3.7)

where we put ℒ_1(z) = ℳ_1(z) and ℒ_r(z) = 𝒩_r(z). Indeed, it is sufficient to verify that

(20.3.8)

[By definition, ℳ_r(z) = ℂ^m.] The inclusion ⊂ in (20.3.8) is evident from the definition of ℒ_j(z). Further, for z ∈ Ω \ D, we have

in view of (20.3.4). Now, using (20.3.6), we have for z ∈ Ω \ D

so

and (20.3.8) follows.
By Theorem 7.5.1, for z ∈ Ω \ (D ∪ S), there exists a minimal factorization

(20.3.9)

which corresponds to the direct sum decomposition (20.3.7), with respect to the minimal realization

If we show that each projector π_j(z) on ℒ_j(z) along ℒ_1(z) + ⋯ + ℒ_{j−1}(z) + ℒ_{j+1}(z) + ⋯ + ℒ_r(z) is analytic in Ω, except possibly for algebraic branch points in T_1 and poles in D, then formula (7.5.5) shows that the W_j(λ, z) have all the properties required in Theorem 20.3.1. [Note that, by continuity, factorization (20.3.9) holds also for z ∈ S \ D, but it is not minimal at these points.]
To verify these properties of π_j(z), introduce for z ∈ Ω \ D the projector Q_j(z) on ℳ_j(z) along 𝒩_{j+1}(z), for j = 1, …, r − 1. Define also Q_0(z) = 0 and Q_r(z) = I. One checks easily that for j = 1, …, r

π_j(z) = (I − Q_{j−1}(z))Q_j(z)    (20.3.10)

[Indeed, both sides of (20.3.10) take the value 0 on vectors from 𝒩_{j+1}(z) and from ℒ_1(z), …, ℒ_{j−1}(z), and take the value x on each vector x from ℒ_j(z).] Therefore, (I − Q_{j−1}(z))Q_j(z) is a projector that coincides with π_j(z) for j = 1, …, r. But

where F_j(z) is given by (20.3.5); so Q_j(z) is analytic on Ω except possibly for algebraic branch points in T_1 and poles in D. Hence π_j(z) also enjoys these properties. □

Consider now an important case of analytic continuation of minimal factorizations that can also be achieved when z_0 ∈ T_1 ∪ T_2.

Theorem 20.3.2
Let W(λ, z) be as in Theorem 20.3.1, and let

be a minimal factorization of W(λ, z_0), z_0 ∈ Ω. [As usual, W_1(λ), …, W_r(λ) are rational matrix functions with value I at infinity.] Assume that W_j(λ) and W_k(λ) have no common zeros and no common poles when j ≠ k. Then there exist rational matrix functions W_j(λ, z), j = 1, …, r, with the properties described in Theorem 20.3.1 that, in addition, are analytic on a neighbourhood of z_0.

The proof is obtained in the same way as the proof of Theorem 20.3.1, by using Theorem 19.4.3 in place of Theorem 19.4.2.
To conclude this section we discuss minimal factorizations (20.3.1) that cannot be continued analytically (as in Theorem 20.3.1). We say that the minimal factorization (20.3.1) is sequentially nonisolated if there is a sequence of points {z_m}_{m=1}^∞ in Ω \ {z_0} such that z_m → z_0, and sequences of rational matrix functions W_{jm}(λ), j = 1, …, r, with value I at infinity, such that

is a minimal factorization of W(λ, z_m), m = 1, 2, …, and for j = 1, …, r

(20.3.11)

Equation (20.3.11) is understood in the sense that for each pair of indices k, l (1 ≤ k, l ≤ n) the (k, l) entry of W_{jm}(λ) has the form

where the coefficients are complex numbers (depending, of course, on j, k, and l) that converge, as m → ∞, to the corresponding coefficients of the (k, l) entry in W_j(λ).
Clearly, if an analytic continuation (as in Theorem 20.3.1) of the minimal factorization (20.3.1) exists, then this factorization is sequentially nonisolated. In particular, Theorem 20.3.1 shows that every minimal factorization (20.3.1) with z_0 ∉ T_1 ∪ T_2 ∪ S is sequentially nonisolated. Also, Theorem 20.3.2 shows that a minimal factorization (20.3.1) is sequentially nonisolated provided W_j(λ) and W_k(λ) have no common zeros and no common poles when j ≠ k.
It turns out that not every minimal factorization of W(λ, z_0) can be continued analytically; indeed, we exhibit next a sequentially isolated minimal factorization.

EXAMPLE 20.3.1. Let

and consider the minimal factorization of W(λ, 0):

(20.3.12)

We verify that this factorization is sequentially isolated. To this end we find all minimal factorizations of W(λ, z), where z ≠ 0. A minimal realization of W(λ, z) is easily found:

In the notation of Theorem 20.2.2, we have

Theorem 7.5.1 shows that all nontrivial minimal factorizations of W(λ, z) are given by the formulas

So the minimal factorization (20.3.12) is indeed sequentially isolated. □

20.4 MATRIX QUADRATIC EQUATIONS

Consider the matrix quadratic equation

XBX + XA − DX − C = 0    (20.4.1)

where A, B, C, D are known matrices of sizes n × n, n × m, m × n, m × m, respectively, and X is an m × n matrix to be found. We assume that A = A(z), B = B(z), C = C(z), and D = D(z) are analytic functions of z on Ω, where Ω is a domain in the complex plane. The analytic properties of the solutions X as functions of z are studied.
Let

$$T(z) = \begin{bmatrix} A(z) & B(z) \\ C(z) & D(z) \end{bmatrix}$$

be the corresponding (m + n) × (m + n) analytic matrix function, and let S_1 and S_2 be the first and second exceptional sets of T(z) as defined in Section 19.2. We have the following main result.

Theorem 20.4.1
For every z_0 ∈ Ω \ (S_1 ∪ S_2) and every solution X_0 of

X_0 B(z_0)X_0 + X_0 A(z_0) − D(z_0)X_0 − C(z_0) = 0    (20.4.2)

there exists an m × n matrix function X(z) that is analytic on Ω, except possibly for algebraic branch points in S_1 and a discrete set of poles in Ω, and such that X(z_0) = X_0 and

X(z)B(z)X(z) + X(z)A(z) − D(z)X(z) − C(z) = 0

for every z ∈ Ω that is not a pole of X(z). [The case when a point of S_1 is also a pole of X(z) is not excluded.]
Proof. By Proposition 17.8.1, the subspace

$$M_0 = \operatorname{Im}\begin{bmatrix} I \\ X_0 \end{bmatrix}$$

is T(z_0) invariant. By Theorem 19.4.1, there is a family of subspaces M(z) that is analytic on Ω except possibly for algebraic branch points in S_1, for which M(z_0) = M_0, and for which M(z) is T(z) invariant for all z ∈ Ω \ S_1. By Theorem 18.3.2 there exists an (m + n) × n analytic matrix function S(z) on Ω with linearly independent columns such that, for all z ∈ Ω \ S_1, M(z) = Im S(z).
Write

$$S(z) = \begin{bmatrix} S_1(z) \\ S_2(z) \end{bmatrix}$$

where S_1(z) is of size n × n and S_2(z) is of size m × n, and observe that det S_1(z) is not identically zero [we may normalize S(z) so that S_1(z_0) = I]. Now, by the same Proposition 17.8.1,

X(z) = S_2(z)S_1(z)^{−1}

is the desired solution of (20.4.2). □
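Numerically, the same recipe can be carried out by taking for the invariant subspace the span of n eigenvectors of T; the sketch below is illustrative (the helper solve_quadratic and its eigenvector selection rule are not from the text).

```python
import numpy as np

def solve_quadratic(A, B, C, D):
    # solve X B X + X A - D X - C = 0 via an invariant subspace of
    # T = [[A, B], [C, D]]: if Im [S1; S2] is T-invariant and S1 is
    # invertible, then X = S2 S1^{-1} is a solution
    n = A.shape[0]
    T = np.block([[A, B], [C, D]])
    vals, vecs = np.linalg.eig(T)
    S = vecs[:, np.argsort(vals.real)[:n]]   # span of n eigenvectors (one choice)
    S1, S2 = S[:n, :], S[n:, :]
    return S2 @ np.linalg.solve(S1, np.eye(n))

# scalar check: x*x - 4 = 0 (A = D = 0, B = 1, C = 4)
A = np.zeros((1, 1)); B = np.eye(1); C = 4 * np.eye(1); D = np.zeros((1, 1))
print(solve_quadratic(A, B, C, D))   # -2 (the other root comes from the other eigenvector)
```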

Consider an example.

EXAMPLE 20.4.1. Let C(z) be an n × n analytic matrix function on Ω with det C(z) ≢ 0, and assume that the eigenvalues of C(z) are analytic functions. [This will be the case if, for instance, C(z) has an upper triangular form.] Assume in addition that C(z) has n distinct eigenvalues for every z ∈ Ω. Consider the equation

X^2 = C(z).    (20.4.3)

Here

$$T(z) = \begin{bmatrix} 0 & I \\ C(z) & 0 \end{bmatrix}$$

and it is easily seen that det(λI − T(z)) = det(λ^2 I − C(z)). So λ_0 is an eigenvalue of T(z_0) if and only if λ_0^2 is an eigenvalue of C(z_0). It follows that the first exceptional set of T(z) is contained in the set S = {z ∈ Ω : det C(z) = 0}. As for every z ∈ Ω \ S the matrix T(z) has 2n distinct eigenvalues, it follows that the second exceptional set of T(z) is also contained in S. By Theorem 20.4.1, every solution X_0 of (20.4.3) with z_0 ∈ Ω \ S can be extended to a family of solutions X(z) of (20.4.3) that is meromorphic on Ω except possibly for algebraic branch points in S. □

In addition, let us indicate a case when an analytic extension of a solution of (20.4.1) is always possible.

Theorem 20.4.2
Let X_0 be a solution of (20.4.1) with z = z_0 ∈ Ω. Furthermore, assume that the T(z_0)-invariant subspace Im [I; X_0] is spectral. Then there exists an m × n matrix function X(z) with the properties described in Theorem 20.4.1 that, in addition, is analytic in a neighbourhood of z_0.
The proof of Theorem 20.4.2 is obtained in the same way as the proof of Theorem 20.4.1, but using Theorem 19.4.3 in place of Theorem 19.4.1.
In connection with Theorem 20.4.2, note the following fact. Assume that m = n. If X_1 and X_2 are solutions of (20.4.1) such that

σ(A(z_0) + B(z_0)X_1) ∩ σ(A(z_0) + B(z_0)X_2) = ∅

then both T(z_0)-invariant subspaces

M_1 = Im [I; X_1],  M_2 = Im [I; X_2]

are spectral. Indeed, the restriction of T(z_0) to M_i is similar to A(z_0) + B(z_0)X_i, so σ(T(z_0)|_{M_1}) ∩ σ(T(z_0)|_{M_2}) = ∅. In particular, M_1 ∩ M_2 = {0}. As dim M_1 = dim M_2 = n, it follows that M_1 ∔ M_2 = ℂ^{2n}. (Here we use the assumption that m = n.) Hence both M_1 and M_2 are spectral.
The following example shows that not every solution of equation (20.4.1) can be continued analytically as in Theorem 20.4.1. (Of course, it is then necessary that z_0 ∈ S_1 ∪ S_2.)

EXAMPLE 20.4.2. Consider the scalar equation

x^2 = z.    (20.4.4)

The solution x = 0 of (20.4.4) with z_0 = 0 cannot be continued analytically. □

20.5 EXERCISES
20.1 Let

Find the analytic continuation (as in Theorem 20.1.1) of the factorization

What are the poles of this analytic continuation?


20.2 Let L(λ, z) be a monic n × n matrix polynomial of degree l whose coefficients are analytic on Ω, and assume that for every z ∈ Ω, det L(λ, z) has nl distinct zeros. Prove that for every factorization L(λ, z_0) = L_1(λ)L_2(λ), where L_1(λ) and L_2(λ) are monic matrix polynomials, there exist monic matrix polynomials L_1(λ, z), L_2(λ, z) whose coefficients are analytic on Ω and such that L(λ, z) = L_1(λ, z)L_2(λ, z) with L_j(λ, z_0) = L_j(λ).

20.3 Show that if the polynomial L(λ, z) of Theorem 20.1.1 is scalar, then the analytic continuations of L_1(λ), …, L_r(λ) do not have poles (so S = ∅ in the notation of Theorem 20.1.1).
20.4 Let L(λ, z) be a monic matrix polynomial whose coefficients are circulant matrices analytic on Ω. Prove that the analytic continuation of every factorization L(λ, z_0) = L_1(λ) ⋯ L_r(λ) (as in Theorem 20.1.1) has no poles in Ω.
20.5 Prove that every factorization of a monic scalar polynomial with coefficients depending analytically on z ∈ Ω is sequentially nonisolated. (Hint: Use Exercise 19.9.)
20.6 Find the first and second exceptional sets for the following rational matrix functions depending analytically on a parameter z:

20.7 Let W(λ, z) be as in Exercise 20.6 (a). Find the analytic continuations (as in Theorem 20.3.1) of all minimal factorizations of the rational matrix function W(λ, z_0).
20.8 Let W(λ, z) be a rational matrix function that satisfies the hypotheses of Theorem 20.3.1. Assume that for some z_0 ∈ Ω, W(λ, z_0) has δ distinct zeros and δ distinct poles, where δ is the maximum of the McMillan degrees of W(λ, z) for z ∈ Ω. Prove that every minimal factorization

admits an analytic continuation into a neighbourhood of z_0; that is, there exist rational matrix functions W_1(λ, z), …, W_r(λ, z) that are analytic in z on a neighbourhood 𝒰 of z_0 such that

is a minimal factorization for every z ∈ 𝒰, and W_j(λ, z_0) = W_j(λ), j = 1, …, r.
20.9 Let

(1)

be a matrix equation, where A(z), B(z), C(z), and D(z) are analytic matrix functions (of appropriate sizes) on a domain Ω. Assume that all eigenvalues of the matrix

are distinct, for every z ∈ Ω. Prove that, given a solution X_0 of (1) with z = z_0 ∈ Ω, there exists an analytic matrix function X(z) on Ω with X(z_0) = X_0 such that X(z) is a solution of (1) for every z ∈ Ω.
20.10 We say that a solution X_0 of (1) with z = z_0 is sequentially nonisolated if there exist a sequence {z_m} such that z_m → z_0 as m → ∞, and a sequence {X_m}, m = 1, 2, …, such that

for m = 1, 2, …, which satisfies

Prove that if the matrix

is nonderogatory, then every solution of (1) with z = z_0 is sequentially nonisolated.
20.11 Give an example of a solution of (1) that is sequentially isolated.
Notes to Part 4

Chapter 18. This chapter is an introduction to the basic facts on analytic families of subspaces. The main result is Theorem 18.3.1, which connects the local and global properties of an analytic family of subspaces. This result (in a more general framework) appeared first in the theory of analytic fibre bundles [Grauert (1958), Allan (1967), Shubin (1979)]. Here we follow Gohberg and Leiterer (1972, 1973) in the proof of this theorem.
The result of Theorem 18.2.1 goes back to Shmuljan (1957) [see also Gohberg and Rodman (1981)]. The proof of Theorem 18.2.1 presented here is from the authors' book (1982). The results of Section 18.6 seem to be new. In the case of a function λI − A, where A is a bounded linear operator acting in an infinite dimensional Banach space, the result of Theorem 18.6.2 was proved in Saphar (1965).
Chapter 19. The starting point for the material in this chapter (Theorem
19.1.1) is taken from the book by Baumgärtel (1985). Theorem 19.5.1 was
proved in Porsching (1968). The analytic extendability problem for invariant
subspaces is probably treated here for the first time.
Chapter 20. We consider in this chapter some of the applications dealt
with in Chapters 5, 7, and 17, but in the new circumstances when the
matrices involved depend analytically on a complex parameter. All the
results (except those in Section 20.1) seem to be new. In Section 20.1 we
adapt and generalize the results developed in Chapter 5 of Gohberg,
Lancaster, and Rodman (1982). Example 20.1.1 is Example 20.5.4 of the
authors' book (1982).

Appendix

Equivalence of
Matrix Polynomials

To make this work more self-contained, we present in this appendix the basic facts about equivalence of matrix polynomials that are used in the main body of the book. Two concepts of equivalence are discussed. For the first of these, two matrix polynomials A(λ) and B(λ) are said to be equivalent if one is obtained from the other by premultiplication and postmultiplication with square matrix polynomials having constant nonzero determinant. Elementary divisors (or, alternatively, invariant polynomials) form the full set of invariants for this concept of equivalence, and the Smith form (which is diagonal) is the canonical form. This equivalence is studied in detail in Sections A.1–A.4.
The second concept of equivalence is the strict equivalence of linear matrix polynomials λA + B. This means that λA_1 + B_1 = P(λA_2 + B_2)Q for some invertible matrices P and Q. For strict equivalence the full set of invariants comprises minimal column indices, minimal row indices, elementary divisors, and elementary divisors at infinity. The Kronecker form (which is block diagonal) is the canonical form. A thorough treatment of strict equivalence is presented in Sections A.5–A.7. The canonical form for equivalence of matrix polynomials is a natural prerequisite for this presentation.

A.1 THE SMITH FORM: EXISTENCE

In this and subsequent sections we consider matrix polynomials A(λ) = Σ_{j=0}^{k} A_j λ^j, where the A_j are m × n matrices whose entries are complex numbers (so that we admit the case of rectangular matrices A_j). Of course, the sizes of all the A_j must be the same. Two m × n matrix polynomials A(λ) and B(λ) are said to be equivalent if

A(λ) = E(λ)B(λ)F(λ)    (A.1.1)

for some matrix polynomials E(λ) and F(λ) of sizes m × m and n × n, respectively, with constant nonzero determinants (i.e., independent of λ).
We use the symbol A(λ) ∼ B(λ) to mean that A(λ) and B(λ) are equivalent.
It is easy to see that ∼ is an equivalence relation, that is: (a) A(λ) ∼ A(λ) for every matrix polynomial A(λ); (b) A(λ) ∼ B(λ) implies B(λ) ∼ A(λ); and (c) A(λ) ∼ B(λ) and B(λ) ∼ C(λ) imply A(λ) ∼ C(λ). Indeed, if A(λ) = B(λ), then (A.1.1) holds with E(λ) = I_m, F(λ) = I_n. Further, assume that (A.1.1) holds for matrix polynomials A(λ) and B(λ). As det E(λ) = const ≠ 0, the formula for the inverse matrix in terms of cofactors implies that E(λ)^{−1} is a matrix polynomial as well, and since det E(λ)^{−1} = (det E(λ))^{−1}, it follows that det E(λ)^{−1} is also a nonzero constant. Similarly, F(λ)^{−1} is a matrix polynomial for which det F(λ)^{−1} is a nonzero constant. Now we have

B(λ) = E(λ)^{−1}A(λ)F(λ)^{−1},

which means that B(λ) ∼ A(λ). Finally, let us check (c). We have

A(λ) = E_1(λ)B(λ)F_1(λ),  B(λ) = E_2(λ)C(λ)F_2(λ),

where E_1(λ), F_1(λ), E_2(λ), F_2(λ) have constant nonzero determinants. Then A(λ) = E_1(λ)E_2(λ)C(λ)F_2(λ)F_1(λ), and the determinants det(E_1(λ)E_2(λ)) and det(F_2(λ)F_1(λ)) are constant and nonzero, so A(λ) ∼ C(λ).

The central result on equivalence of matrix polynomials is the Smith form, which describes the simplest matrix polynomial in each equivalence class, as follows.

Theorem A. 1.1
An m x n matrix polynomial is equivalent to a unique ra x « matrix
polynomial where

is a diagonal polynomial matrix with monic scalar polynomials such


that d is divisible by

In other words, for every matrix polynomial ^4(A) there exist matrix
polynomials with constant nonzero determinants such that
648 Appendix

has the form (A. 1.2), and this form is uniquely determined by The
matrix polynomial of (A. 1.2) is called the Smith form of and
plays an important role in the analysis of matrix polynomials. Note that
) from (A.1.3) are not unique in general. Note also that the
zeros on the main diagonal in are absent in case has full rank
for some [In particular, this happens if is an n x n matrix
polynomial with leading coefficient /.]

Proof of Theorem A. 1.1 (First Part). Here we prove the existence of a


of the form (A. 1.2) that is equivalent to a given . We use the
following elementary transformations of a matrix polynomial of size
m x n: (a) interchange two rows, (b) add to some row another row
multiplied by a scalar polynomial, and (c) multiply a row by a nonzero
complex number, together with the three corresponding operations on
columns.
Note that each of these transformations is equivalent to the multiplication
of by an invertible matrix as follows. Interchange of rows (columns) i
and j in is equivalent to multiplication on the left (right) by

Adding to the /th row of A( A) they'th row multiplied by the polynomial


is equivalent to multiplication on the left by
The Smith Form: Existence 649

the same operation for columns is equivalent to multiplication on the right


by the matrix

Finally, multiplication of the /th row (column) in A(\) by a number a 5^0 is


equivalent to the multiplication on the left (right) by

[Empty spaces in (A.1.4)-(A.1.7) are assumed to be zeros.] Matrices of the


form (A.1.4)-(A.1.7) are be called elementary. It is apparent that the
determinant of any elementary matrix is a nonzero constant. Consequently,
it is sufficient to prove that, by applying a sequence of elementary trans-
formations, every matrix polynomial can be reduced to a diagonal
form: diag[ . . . ,0], where are scalar
polynomials such that the quotients / = 1, 2, . . . , r - 1, are
also scalar polynomials. We prove this statement by induction on m and n.
For m — n = I it is evident.
Consider now the case m = 1 , n > 1 ; that is

If all are zeros, there is nothing to prove. Suppose that not all the
are zeros, and let a- be a polynomial of minimal degree among the
nonzero entries of We can suppose that ;0 = 1. [Otherwise, inter-
change columns in By elementary transformations it is possible to
650 Appendix

replace all the other entries in by zero. Indeed, let fl Divide


fl where is the remainder and
its degree is less than the degree of . Add to the y'th
column the first column multiplied by . Then r will appear in the
yth position of the new matrix. If then put in the firs
position, and if there is still a nonzero entry [different from , apply the
same argument again. Namely, divide this (say, the A:th) entry by and
add to the fcth column the first multiplied by minus the quotient of the
division, and so on. Since the degrees of the remainders decrease, after a
finite number of steps [not more than the degree of we find that all
the entries in our matrix, except the first, are zeros. This proves Theorem
A. 1.1 in the case m = 1, n > 1. The case m > 1, n = 1 is treated in a similar
way.
Assume now that m, n > 1, and assume that the theorem is proved for
matrices with m - 1 rows and n - 1 columns. We can suppose that the (1,1
entry of is nonzero and has the minimal degree among the nonzero
entries of [Indeed, if , we can reach this condition by
interchanging rows and/or columns in , Theorem A. 1.1 is
trivial.] With the help of the procedure described in the previous paragraph
[applied for the first row and the first column of by a finite number of
elementary transformations we reduce to the form

Suppose that for some /, and is not divisible by


(without remainder). Then add to the first row the /th row and apply the
above arguments again. We obtain a matrix polynomial of the form

where the degree of is less than the degree of . If there still


exists some entry 2) A ) that is not divisible by a j ( A ) , repeat the same
procedure once more, and so on. After a finite number of steps we obtain
the matrix
The Smith Form: Uniqueness 651

where every is divisible by Multiply the first row (or column)


by a nonzero constant to make the leading coefficient of the polynomial
equal to 1. Now define the (m - 1) x (n - 1) matrix polynomial

and apply the induction hypothesis for to complete the proof of


existence of a Smith form

A.2 THE SMITH FORM: UNIQUENESS

We need some preparations to prove the uniqueness of the Smith form


in Theorem A.1.1. Let A = be an m x « matrix with complex
entries. Choose k rows, 1 < / , < • • • < ik < m, and k columns, 1 < / j < • • • <
jk < n, in A, and consider the determinant det °f tne kxk
submatrix of A formed by these rows and columns. This determinant is
called a minor of A. Loosely speaking, we can say that this minor is of order
k and is composed of the rows i1,. . . , ik and columns / , , . . . , jk of A. It is
denoted by A( . We establish the important Binet-Cauchy for-
mula, which expresses the minors of a product of two matrices in terms of
the minors of each factor, as follows.

Theorem A. 2.1
Let A = BC, where B is a m x p matrix, and C is a p x n matrix. Then for
every k, \<k< min(m, n) and every minor A( .I of order k we
have

where the sum is taken over all sequences {«^}*=1 of integers satisfying
I =£ a t < a2 < • • • < ak < p. In particular, if k > p, then the sum on the
right hand side of (A. 2.1) is empty and the equation is interpreted as
652 Appendix

Note that for k = 1 formula (A.2.1) is just the rule of multiplication of


two matrices. On the other hand, if m=p — n and k = n, then (A.2.1)
gives the familiar multiplication formula for determinants: det(BC) =
det B • det C.

Proof. As the rank of A does not exceed p, we have


as long as k>p. So we can assume £</?. For simplicity of notation
assume also iq=jq = q, q = 1,. . . , k. Letting A =
C = [Cy]£'/*= i, we may write ^4( ) in the form

and using the linearity of the determinant as a function of each column, this
expression is easily seen to be equal to

where the sum is taken over all /c-tuples of integers (a l 5 . . . , ak) such that
l ^ a t < p . (Here we use the notation B\ I to denote
\a
ta 1*9=1 even when the the sequence =1 is not increasing, or when
it contains repetitions of numbers.) If not all aly a2,. . . , ak are different,
then clearly B\ I ^ O . Ignoring these summands in
(A.2.2), split the remaining terms into groups of &! terms each in such a way
that the summands in the same group differ only in the order of
indices a,, a 2 ,. . . , ak. We obtain:
The Smith Form: Uniqueness 653

where the internal summation is over all permutations of {1,2, . . . , & } .


Denoting by the sign of is 1 if TT is even and -1 if TT is odd), we
find that the right-hand side of (A.2.3) is

and the theorem is proved. D

Returning to matrix polynomials observe that the minors of a matrix


polynomial A (A) are (scalar) polynomials, so we can speak about their
greatest common divisors.

Theorem A. 2.2
Let A( A) be an m x n matrix polynomial. Let be the greatest common
divisor (with leading coefficient 1) of the minors ofA(\) of order k, if not all
of them are zeros, and let = 0 if all the minors of order k of are
zeros. Let p = l and = diag[ , . . . , 0] be a
Smith form of (which exists by the part of Theorem A.I.I already
proved). Then r is the maximal integer such that

Proof. Let us show that if -^i(A) and are equivalent matrix


polynomials, then the greatest common divisors of the
minors of order k of respectively, are equal. Indeed, we
have

for some matrix polynomials E( A) and F( A) with constant nonzero determi-


nants. Apply Theorem A. 2.1 twice to express a minor of of order k as
654 Appendix

a linear combination of minors of .<4 2 (A) °f tne same order. Therefore, it


follows that p is a divisor of . But the equation

implies thatp^ j(A) is a divisor of In the same


way one shows that the maximal integer r, such that coincides
with the maximal integer r2 such that
Now apply this observation for the matrix polynomials and It
follows that we have to prove Theorem A. 2. 2 only in the case that
itself is in the diagonal form A(A) = D(\). From the structure of it is
clear that

is the greatest common divisor of the minors of of order s. So


1,.. . ,r, and (A. 2.4) follows

Theorem A. 2. 2 immediately implies the uniqueness of the Smith form


(A. 1.2). Indeed, Theorem A. 2. 2 shows that the number r of not identi-
cally zero entries in the Smith form of A(\), as well as the entries
themselves, can be expressed explicitly in terms of
that is, r and are uniquely determined by

A. 3 INVARIANT POLYNOMIALS, ELEMENTARY DIVISORS, AND


PARTIAL l MULTIPLICITIESes

In this section we study various invariants appearing in the Smith form for
the matrix polynomials. Let A(\) be an m x n matrix polynomial with the
Smith form D(\). The diagonal elements are called
the invariant polynomials of . The number r of invariant polynomia
can be defined as

Indeed, since E(A) and F(A) from (A. 1.3) are invertible matrices for every
we have rank oNOn the other hand, its
clear that rank is not a zero of one of the invariant polyno-
mials, and rank otherwise. So (A.3.1) follows.
The set of invariant polynomials forms a complete invariant for equiva-
lence of matrix polynomials of the same size.

Theorem A.3.1
Matrix polynomials and ) of the same size are equivalent if and only
if the invariant polynomials of and B are the same.
Invariant Polynomials, Elementary Divisors, and Partial Multiplicities 655

Proof. Suppose the invariant polynomials of A(\) and B(A) are the
same. Then their Smith forms are equal:

and

where Since and


F 2 (A) are matrix polynomials with constant nonzero determinants, the same
is true for and, co n and aonsequently and So

Conversely, suppose , where det = const ^0,


del ) = const ^0. Let D(A) be the Smith form for

Then sdi also the Smith formfor

By the uniqueness of the Smith form for [more exactly, by the


uniqueness of the invariant polynomials of it follows that the
invariant polynomials of ) are the same as those of

We now take advantage of the fact that the polynomial entries of A (A)
and its Smith form D( A) are over <p to represent each invariant polynomial
d,(A) as a product of linear factors:

where are different complex numbers and a (1 , . . . , aik are


positive integers. The factors / = 1, . . . , kf, i = 1, . . . , r are
called the elementary divisors of
Some different elementary divisors may contain the same polynomial
(this happens, for example, in case d for some /);
the total number of elementary divisors of A(\) is thus
The degrees atj of the elementary divisors form an important characteris-
tic of the matrix polynomial Here we mention only the following
simple property of the elementary divisors, whose verification is left to the
reader.
656 Appendix

Proposition A.3.2
Let A(\) be an n x n matrix polynomial such that det 0. Then the
sum of degrees of its elementary divisors coincides
with the degree of det

Note that the knowledge of the elementary divisors of A (A) and the
number r of its invariant polynomials is sufficient to
construct d^ In this construction we use the fact that is
divisible by ^_,(A). Let p be all the different complex numbers
that appear in the elementary divisors, and let
(/ = 1, . . . ,p) be the elementary divisors containing the number A / ? and
ordered in the descending order of the degrees an > • • • > a ( k > 0. Clearly,
the number r of invariant polynomials must be greater than or equal to
max{&,, . . . , kp}. Under this condition, the invariant polynomials
are given by the formulas

where we put t.
The following property of the elementary divisors is used subsequently.

Proposition A.3.3
Let A(\) and B(\) be matrix polynomials, and let
a block-diagonal matrix polynomial. Then the set of elementary divisors of
C(A) is the union of the elementary divisors of A(\) and B(A).

Proof. Let D,(A) and D 2 (A) be the Smith forms of A(\) and B(\),
respectively. Then clearly

for some matrix polynomials £(A) and F(A) with constant nonzero deter-
minant. Let
the elementary divisors of /^(A) and D2(\), respectively, corresponding to
the same complex number A0. Arrange the set of exponents
« ! , . . . , ap, j3,,. . . , fiq, in a nonincreasing order:

where 0< y\ ^ • • • < yp+q- Using Theorem A.2.2 it is clear that in the Smith
form D = diag[ of diag[ the in-
variant polynomial d is divisible by but not by
Invariant Polynomials, Elementary Divisors, and Partial Multiplicities 657

is divisible by ' but not by (


and so on. It follows that the elementary divisors of

[and thus also those of corresponding to 0 , are just


and Proposition A. 3. 3 is proved.

In the rest of this section we assume that (as in Proposition A. 3. 2) the


matrix polynomial is square and that the determinant of is not
identically zero. In this case, complex numbers 0 such that det,
are called the eigenvalues of A(X). Clearly, the set of eigenvalues is finite [it
contains not more than degree (det A(\)) points], and is an eigenvalue of
A(\) if and only if there is an elementary divisor of A(\) of type (A - A0)°.
Let A0 be an eigenvalue of A (A), and let be all
the elementary divisors of A( A) that are divisible by The exponents
a,, . . . , a.p are called the partial multiplicities of A(\) corresponding to A0.
Recall that some of the numbers a,, . . . , ap may be equal; the number a;
appears in the list of partial multiplicities as many times as there are
elementary divisors . The partial multiplicities play an
important role in the following representation of matrix polynomials.

Theorem A.3.4
Let A(\) be an nx n matrix polynomial with det Then for every
admits the representation

where and are matrix polynomials invertible at and


KI ^ • • • < Kn are nonnegative integers, which coincide (after striking off
zeros) with the partial multiplicities of A(\) corresponding to

Proof. The existence of representation (A. 3. 2) follows easily from the


Smith form. Namely, let = diag[d be the Smith form
of A (A), and let

where det = const ^ 0, det = const ^0. Represent each in


the form
658 Appendix

where and K,>0. Since is divisible by we have


K. > K . _ J . Now (A. 3. 2) follows from (A. 3. 3), where

It remains to show that the K, coincide (after striking off zeros) with the
degrees of elementary divisors of A (A) corresponding to A0. To this end we
show that any factorization of A (A) of type (A. 3. 2) with KJ ^ • • • < Kn
implies that K; is the multiplicity of A0 as a zero of d^( A), j = 1, . . . , n, where
D(A) = diag[d,(A), . . . , dn(\)] is the Smith form of A(\). Indeed, let

where E( A) and F( A) are matrix polynomials with constant nonzero deter-


minants. Comparing with (A.
(A.3.2),
3. 2), write

diag

where are matrix polynomials in-


vertible at A0. Applying Theorem A. 2.1, we obtain

where . m is a minor of order /0 of


[resp. diag F(A)], and the sum in (A.3.5) is
taken over a certain set of triples (/, j, k). It follows from (A.3.5) and the
condition K, =£ • • • < *„, that A0 is a zero of the product
of multiplicity at least K, + K2 + ••• + *,-. Rewrite (A.3.4) in the form

and apply Theorem A. 2.1 again. Using the fact that and
(F A (A))~ ! are rational matrix functions that are defined and invertible at
0 , and that dt( A) is a divisor of we deduce that

where 4>, (A) is a rational function defined at is not a pole of


Equivalence of Linear Matrix Polynomials 659

O. (A)). It follows that is a zero of of multiplicity


exactly KJ + K2 + • • • + K,. , i = 1 , . . . , n. Hence KJ is exactly the multiplicity
of as a zero of ) for / = 1,. . . , n; that is, the nonzero numbers (if
any) among K,, . . . , Kn are the partial multiplicities of A(\) corresponding
to

As a consequence of Theorem A.3.4, note that there are nonzero K, in


the representation (A.3.2) if and only if is an eigenvalue of

A.4 EQUIVALENCE OF LINEAR MATRIX POLYNOMIALSIALS

We study here equivalence and the Smith form for matrix polynomials of
type /A — A, where A is an n x n matrix. It turns out that for such matrix
polynomials the notion of equivalence is closely related to similarity.

Theorem A.4.1
/A — A ~ /A - B if and only if A and B are similar.

To prove this theorem, we have to introduce division of matrix poly-


nomials.
We restrict ourselves to the case when the dividend is a general matrix
polynomial and the divisor is a matrix polynomial of type
/A + X, where A' is a constant n x n matrix. In this case the following
representation holds:

where Qr(\) is a matrix polynomial, which is called the right quotient, and
Rr is a constant matrix, which is called the right remainder, on division of

where Q,(\) is the left quotient, and the constant matrix R, is the left
remainder.
Let us check the existence of representation (A.4.1); (A.4.2) can be
checked in a similar way. If / = 0 [i.e., A( A) is constant], put = 0 and
R . So we can suppose / > 1. Write Comparing
the coefficients of powers of A on the right- and left-hand sides of (A.4.1),
we can rewrite this relation as follows:
660 Appendix

Clearly, these relations define r, sequentially.


It follows from this argument that the left and right quotient and
remainder are uniquely denned.

Proof of Theorem A.4.1. In one direction this result is immediate: if


A = SBS~l for some nonsingular S, then the equality /A — A = S(I\ —
B)S~l proves the equivalence of and . Conversely, suppose
- f l . Then for some matrix polynomials E(A) and F(A) with
constant nonzero determinant we have

Suppose that division of on the left by /A - A and of F( A) on the


right by /A - B yield

Substituting in the equation

we obtain

whence

Since the degree of the matrix polynomial on the right-hand side here is 1 , it
follows that 5( A) = T( A); otherwise, the degree of the matrix polynomial on
the left is at least 2. Hence

so that

It remains only to prove that E0 is nonsingular. To this end divide E(A)


on the left by IX- B:

Then, using (A.4.3) and


and (A.4.4), we have
Equivalence of Linear Matrix Polynomials 661

Hence the matrix polynomial in the square brackets is zero, and E0R0 - I. It
follows that EQ is nonsingular. D
The definitions of eigenvalues and partial multiplicities made in the
preceding section can be applied to an n x n matrix polynomial of the form
—A. On the other hand, as an n x n matrix (or as a transformation
represented by this matrix in the standard basis el,. . . , en), A has eigen-
values and partial multiplicities as defined in Sections 1.2 and 2.2. It is an
important fact that these notions for IX — A and for A coincide.

Theorem A.4.2
A complex number is an eigenvalue of IX- A if and only if it is an
eigenvalue of A. Moreover, the partial multiplicities of IX — A corresponding
to its eigenvalue A0 coincide with the partial multiplicities of A corresponding
to A 0 .
Proof. The first statement follows from the definitions: 0 is an eigen-
value of IX — A if and only if det(/A — A) = 0, which is exactly the definition
of an eigenvalue of A. For the proof of the second statement, we can
assume that A is in the Jordan form. Further, using Proposition A.3.3, we
reduce the proof to the case when A is a single Jordan block of size n x n:

The partial multiplicity of A is clearly n, corresponding to the eigenvalue A0.


To find the partial multiplicities of IX - A, observe that

has a nonzero minor of order n - 1 that is independent of A (namely, the


662 Appendix

minor formed by crossing out the first column and the last row in
As det Theorem A.2.2 implies that the Smith form of
/A - A is diag[l, 1, . . . , 1 So the only partial multiplicity of
/A — A is n, which corresponds to 0. D

We also need the following connection between the partial multiplicities


of a matrix A and submatrices of

Theorem A.4.3
Let A be an n x n matrix. Let a, ^ • • • > am be the partial multiplicities of an
eigenvalue 0 of A, and put a. = 0 for i = m + 1, . . . , n. Then
is the minimal multiplicity of as a zero of the determinant
{considered as a polynomial in A) of any p* p submatrix in IX— A.

Proof. By Theorems A. 4. 2 and A. 3. 4 we have the following repre-


sentation:

where are matrix polynomials invertible for A = A 0 . Now


the Binet-Cauchy formula (Theorem A. 2.1) implies that the multiplicity of
0 as a zero of the determinant of any p x p submatrix in A is at least
Rewriting (A.4.5) in the form

and using the Binet-Cauchy formula again, we find that

where are certain p x p submatrices in /A - A, and


<p,(A), / = 1, . . . , s are rational functions defined at is not a pole of
any <ft(^)]- It follows from equation (A.4.6) that at least one of the minors
has a zero at A0 with multiplicity exactly equal to

A.S STRICT EQUIVALENCE OF LINEAR MATRIX POLYNOMIALS:


REGULAR CASE

Let A + \B and Al + \Bl be two linear matrix polynomials of the same size
m x n. We say that A + \B and A j + Afi, are strictly equivalent if there exist
Strict Equivalence of Linear Matrix Polynomials 663

invertible matrices P and Q of sizes m x m and n x n, respectively, indepen-


dent of A, such that, for all we obtain

We denote strict equivalence by l + AZJj. It is easily seen that


strict equivalence is indeed an equivalence relation, that is, that the three
following properties hold: A + \B —A + XB for every polynomial A + \B.
, then als
and then
Obviously, strict equivalence of linear matrix polynomials implies their
equivalence. The converse is not true in general, as we see later in this
section.
In this and subsequent sections we find the invariants of strict equival-
ence, as well as the simplest representative (the canonical form) in each
class of strictly equivalent linear matrix polynomials. This section is devoted
to the regular case. That is, when A and B are square matrices and
det(y4 + \B) does not vanish identically. In particular, the polynomials
A + \B with squares matrices A and B and d e t B ^ O are regular. This
hypothesis is used in our first result.

Proposition A. 5.1
Two regular polynomials A + \B and A} + \B^ with det B 7^0, det Bl^0
are strictly equivalent if and only if they have the same invariant polynomials
(or, equivalently , the same elementary divisors).

The proof is easily obtained by combining Theorems A. 3.1 and A. 4.1.


However, the result of Proposition A. 5.1 is false, in general, if we omit the
conditions det B ^0, det Bl ^ 0 and require only that the polynomials are
regular.

EXAMPLE A. 5.1. Let

The polynomials

and

are obviously regular, and both have the Smith form , that is, the
same invariant polynomials. However, they cannot be strictly equivalent
because B and Bl have different ranks. were
664 Appendix

strictly equivalent, we would have B = PB^Q for some invertible P and Q,


and this would imply the equality of the ranks of B and 5, .) D

To extend the result of Proposition A. 5.1 to the class of all regular


polynomials A + \B, we must introduce the elementary divisors at infinity.
We say that A p is an elementary divisor at infinity of a regular polynomial
is an elementary divisor of \A + B. Clearly, there exist
elementary divisors at infinity of A + \B if and only if del B = 0.

Theorem A. 5.2
Two regular polynomials A + \B and A , + \B^ are strictly equivalent if and
only if the elementary divisors of A + \B and Al + \B^ are the same and
their elementary divisors at infinity are the same.

Proof. Assume that A + \B and Al + Afl, are strictly equivalent. Then


obviously A + \B and Al + \B} are equivalent, so by Theorem A. 3.1 they
have the same elementary divisors. Moreover, \A + B and A/4, + Bl are
equivalent as well, so, by the same Theorem A. 3.1, A + \B and .A, + \Bl
have the same elementary divisors at infinity.
To prove the second part of the theorem, we introduce homogeneous
linear matrix polynomials. Thus we consider the polynomial pA + \B where
/a, A E (p. Note that every minor m(\, IJL) of order r of pA + \B is a
polynomial of two complex variables /A and A that is homogeneous of order r
in the sense that

for every a, . For a fixed r, 1 < r ^ n, le be the greatest


common divisor in the set of homogeneous polynomials of all the nonzero
minors m^A, /A), . . . , ms(\, /A) of order r of tt/4 + \B. In other words,
pr(\, fji) is a homogeneous polynomial that divides each m,(A, /z), and if
q(\, /t) is another homogeneous polynomial with this property, then
divides p Clearly, divides p The poly-
nomials p t (A, /n), . . . , pn(\, M) are called the invariant polynomials of
As each minor is a homogeneous polyno-
mial in A and /A, it admits factorizations of the form

for some complex numbers o^ and aj. (In fact, the nonzero a j values are the
reciprocals of the nonzero a; values.) Using factorizations of this kind, it is
easily seen that l) are the invariant polynomials of
A + \B, whereas p^l, /A), . . . , p n (l, pt) are the invariant polynomials
of A + B.
Strict Equivalence of Linear Matrix Polynomials 665

Returning to the proof of Theorem A. 5. 2, assume that the elementary


divisors of A + \B and A{ + \B}, including those at infinity, are the same.
This means that the invariant polynomials of A + \B and Al + A/?j are the
same, and so are the invariant polynomials of p. A + B and pA^ + B\. Since
a homogeneous polynomial /?(A, /u,) of A and /ti is uniquely defined by
p(A, 1) and p(l, p), it follows from the discussion in the preceding para-
graph that the invariant polynomials of pA + \B and of p,Al + \B\ are the
same. Now we make a change of variables:
where xly2 - x2y} ^0. Then the invariant polynomials of and of
jiAl + \Bl are again the same, where A = y2A + x2B, B = y}A + x}B,
A i = y2Al + x2Bl, Bl = ylAl + xlBl. As the polynomials A + \B and ^4, +
\Bl are regular, we can choose xl and yl in such a way that det B 7^0 and
det /?! 7^0. Apply Proposition A. 5.1 to deduce that A + \B and Al + \Bl
are strictly equivalent: PAQ — Al, PBQ = Bl for some invertible matrices P
and Q. Since

where and similarly for A } and Blf we obtain PAQ = /4,,


PBQ = #], and the strict equivalence of A + \B and A} + \B} follows. O

Theorem A. 5. 2 allows us to obtain the canonical form for strict equival-


ence of regular linear matrix polynomials, as follows.

Theorem A. 5.3
Every regular, linear, matrix polynomial A + \B is strictly equivalent to a
linear polynomial of the form

where Jk(h) is the k x k Jordan block with eigenvalue A. The linear poly-
nomial (/i.5.1) is uniquely determined by A + \B. In fact, A*1, . . . , A p are
the elementary divisors at infinity ofA + \B, whereas / = 1, . . . , q
are the elementary divisors of

Proof. Let A^ + XB^ be the polynomial (A.5.1). Using Proposition


A. 3. 3, we see immediately that = 1, . . . , q are the elementary
divisors of A , . . . , p are its elementary divisors at
infinity. If the strict equivalence claimed by the theorem holds, it follows
from Theorem A. 5. 2 that (A.5.1) is uniquely determined by A + \B, and
that A + \B must have the specified elementary divisors.
666 Appendix

It remains to prove that there is a strict equivalence of the required form.


Let c e (p be such that det(,4 + cB) ^ 0. Write A + \B = (A + cB) + ( A -
c)B, multiply on the left by (A + cB)~\ and apply a similarity transfor-
mation reducing (A + cB)~1B to the Jordan form. We obtain

where 70 is a nilpotent Jordan matrix (i.e., J'Q = 0 for some /) and Jl is an


invertible Jordan matrix.
Multiply the first diagonal block on the right-hand side of (A.5.2) by
(/ - c/0)~ . It is easily verified that

and since J is also nilpotent, is similar to a


matrix of the form

Multiply the second diagonal block on the right-hand side of (A.5.2) by


7J"1 and reduce J\l to its Jordan form by similarity. We find that

for some complex numbers A,, . . . , A and some positive integers

A.6 THE REDUCTION THEOREM FOR SINGULAR POLYNOMIALS

Consider now the singular polynomial A + \B, where A and B are m x n


matrices. Singularity means that either m ^ n or m = n but det(-4 + \B) is
identically zero. Let r be the rank of A + \B, that is, the size of the largest
minors in A + \B that do not vanish identically. Then either r < m or r < n
holds (or both).
Assume r<n. Then the columns of the matrix polynomial are
linearly dependent, that is, the equation

where x is an unknown vector, has a nonzero solution.


Let us check first that there is a vector polynomial 0 for which
(A.6.1) is satisfied. For this purpose we can use the Smith form D(A) of
A + \B in place of A + AB itself (see Theorem A. 1.1). But because of the
The Reduction Theorem for Singular Polynomials 667

assumption r<n, the last column of D(\) is zero. Hence = 0 is


satisfied with x = (0,. . . ,0,1).
The following example is important in the sequel.

EXAMPLE A.6.1. Let

be an e x (e + 1) linear matrix polynomial (e = 1, 2, . . .). We claim that the


minimal degree of a nonzero vector polynomial solution x(A) of the
equation

is e. Indeed, rewrite this equation in the form


* = 0, where * ; (A) is the yth coordinate of
x( A). So

and the minimal degree for x( A) (which is equal to e) is obtained by taking


to be a nonzero constant. D

Among all not identically zero polynomial solutions *(A) of (A. 6.1), we
choose one of least degree e and write

The following reduction theorem holds.

Theorem A. 6.1
I f e i s the minimal degree of a nonzero polynomial solution of (A. 6. 1), and if
e > 0, then A + \B is strictly equivalent to a linear matrix polynomial of the
form

where
668 Appendix

is an e x (e + 1) matrix, and the equation

has no nonzero polynomial solutions of degree less than e.

It is convenient to state and prove a lemma to be used in the proof of


Theorem A.6.1. For an m x n matrix polynomial U + AK, let

be a matrix of size m(i + 2) x n(i + 1) for / = 0 , 1 , 2 , . . . .

Lemma A.6.2
Assume that the rank of U + XV is less than n. Then e is the minimal degree
of nonzero polynomial solutions y( A) of

if and only if

and

Proof. Let be a nonzero polynomial solution of (A.6.6)


of the least degree. Then

or equivalently
The Reduction Theorem for Singular Polynomials 669

Not all the vectors yj are zero, and so (A.6.7) follows. Conversely, if (A.6.7)
holds, we may reverse the argument and obtain a nonzero polynomial
solution of (A.6.6) of degree e. D

Proof of Theorem A.6.1. The proof is given in three steps. In the first
step we show that

for suitable matrices A, B, D, and F, then we show that A + \B satisfies the


conclusions of Theorem A.6.1, and finally we prove that

(a) Let (A.6.2) be a vector polynomial satisfying (A.6.1):

where xf ¥=• 0. This is equivalent to

We claim that the vectors

are linearly independent. Assume the contrary, and let Axh (h = l) be the
first vector in (A.6.9) that is linearly dependent on the preceding ones:

By (A.6.8) this equation can be rewritten as follows:

that is, Bxl_l =0, where


670 Appendix

Furthermore, again by (A.6.8), we have

MHERE

Continuing the process and introducing the vectors

we obtain the equations

From (A.6.10) it follows that

is a nonzero solution of (A.6.1) with degree not exceeding h ~ l< e, which


is impossible. [The fact that this solution is not identically zero follows
because jtQ = jt 0 ^0; for if x0 were zero, then \~1x(\) would be a poly-
nomial solution of (A.6.1) of degree less than e.] Thus the vectors (A.6.9)
are linearly independent.
But then the vectors x0,. . . , xe are linearly independent as well. Indeed,
= = 0, and by the linear independence of
and since *0 ^0 we find that also

Now write A + \B in a basis in <p" whose first e + 1 vectors are


xQ, *!,. . . , xf and in a basis in (pm whose first e vectors are Axl,. . . , Axe.
In view of equations (A.6.8), the polynomial A + \B in the new bases has
the form

for some D, F, A, and B.


In the second step we show that the equation (^4 + \B)x = 0 has no
nonzero polynomial solutions of degree less than e. Note that

is obtained from
The Reduction Theorem for Singular Polynomials 671

by a suitable permutation of rows and columns. By Lemma A.6.2 the rank


of (A.6.11) is equal to en; that is, the columns of (A.6.11) are linearly
independent. By the same lemma, taking into account Example A.6.1,
rank M ; that is, the square matrix
M € _,[L e ] is invertible. As the columns of (A.6.12) are linearly independent
as well, we find that the columns of are linearly independent,
that is, rank Using Lemma A.6.2 again, we
find that (A + \B)x = 0 has no solutions of degree less than e.
In the third step, replacing

for suitable matrices X and Y, we see that Theorem A. 6.1 will be complete-
ly proved if we can show that the matrices X and Y can be chosen so that
the matrix equation

holds.
We introduce a notation for the elements of D, F, X and also for the rows
of Y and the columns of A and B :

Then the matrix equation (A.6.13) can be replaced by a system of scalar


equations that expresses the equality of the elements of the fcth column on
the right- and left-hand sides of (A.6.13). For fc = l,2, . . . , / t - e - l , we
obtain
672 Appendix

The left-hand sides of these equations are linear polynomials in A. The free
term of each of the first e — 1 of these polynomials is equal to the coefficient
of A in the next polynomial. But then the right-hand sides must also satisfy
this condition. Therefore, for fc = l , 2 , . . . , n — e — 1, we obtain

If (A.6.15) holds, then the required elements of X can obviously be


determined from (A.6.14).
It now remains to show that the system of equations (A.6.15) for the
elements of Y always has a solution for arbitrary d ik
fc = l,2, . . . , / z - e - l ) . Rewrite (A.6.15) in the form

where

and use the left invertibility of Mf_2[A + \B] (ensured by Lemma A. 6. 2) to


verify that (A.6.15) has a solution
, where the subscript "L" denotes a left
inverse. Theorem A. 6.1 is now proved completely. D

A.7 MINIMAL INDICES AND STRICT EQUIVALENCE OF LINEAR


MATRIX POLYNOMIALS (GENERAL CASE)

We introduce the important notion of minimal indices for linear matrix


polynomials. Let A + \B be an arbitrary linear matrix polynomial of size
m x n. Then the k polynomial columns that are
solutions of the equation
Minimal Indices and Strict Equivalence of Linear Matrix Polynomials 673

are called linearly dependent if the rank of the polynomial matrix formed
from these columns is less than k\ In that
case there exist k polynomials not all identically
zero, such that

Indeed, let

be the Smith form of X(\), where [resp. ] is an n x n (resp.


k x k) matrix polynomial with constant nonzero determinant, and

with nonzero polynomials d, As the rank r of X(\) is less


than k, the last column of D(\) is zero. One verifies that (A.7.2) is satisfied
with 1). If polynomials /7,(A)
(not all zero) with the property (A.7.2) do not exist, then the rank of X is k
and we say that the solutions are linearly independent.
Among all the polynomial solutions of (A.7.1) we choose a nonzero
solution *,(A) of least degree €l. Among all polynomial solutions x( A) of the
same equation for which ^(A) and x( A) are linearly independent, we take a
solution x2(\) of least degree e2. Obviously, e 1 <6 2 . We continue the
process, choosing from the polynomial solutions x(\) for which
x ) are linearly independent a solution x3( A) of minimal degree
c3, and so on. Since the number of linearly independent solutions of (A.7.1)
is always at most n, the process must come to an end. We obtain a
fundamental series of solutions of (A.7.1)

having the degrees

Note that it may happen that some degrees e,, . . . , ey are zeros. [This is the
case when (A.7.1) admits constant nonzero solutions.] In general, a funda-
mental series of solutions is not uniquely determined (to within scalar
factors) by the pencil A + \B. However, note the following.
674 Appendix

Proposition A. 7.1
Two distinct fundamental series of solutions always have the same series of
degrees

Proof. In addition to (A. 7. 3), consider another fundamental series of


solutions . with the degrees eltc2,.... Suppose that in
(A.7.4)

and similarly, in the series

Obviously, For every vector ) (/ = 1, . . . , ra,) there exists a


polynomial such that

for some polynomials p . (Otherwise, *,, jc 1? . . . , xni would be linearly


independent and one could replace xn + , by *,, which is of smaller degree,
contrary to the definition of xn +l.) Rewrite (A.7.5) in the form

where
m j matrix polynomial. are linearly independent, there
is a nonzero minor /(A) of order m, of . So for every A G (p
that is not a zero of one of the polynomials
rank of the matrix on the left-hand side of (A.7.6) is ml. Hence (A.7.6)
implies ml^nl. Interchanging the roles of Jt,(A) and *,(A), we find the
opposite inequality , we have and we can
repeat the above argument with n2 and m 2 in place of n, and m 1?
respectively, and so on. D

The degrees p of polynomials in any fundamental series of


polynomial solutions of (A. 7.1) are called the minimal column indices of
A + \B, As Proposition A. 7.1 shows, the number p of the minimal column
indices and the indices themselves do not depend on the choice of the
fundamental series. If there are no nonzero solutions of (A. 7.1) (i.e., the
rank of A + XB is equal to n), we say that the number of minimal column
indices is zero, in this case no such indices are defined.
We define the minimal row indices of A + \B as the minimal column
indices of
Minimal Indices and Strict Equivalence of Linear Matrix Polynomials 675

EXAMPLE A.7.1. Let Le be as in Example A.6.1. The polynomial Lf has the


single minimal column index e, whereas the minimal row indices are absent.
Indeed, as in Example A.6.1, observe that every nonzero polynomial
solution x( A) =

has the form

and a solution x(\) of minimal degree e is obtained by taking !.


Hence the first minimal column index of Le is e. As (A.7.8) shows, every
other solution JC(A) of (A.7.7) has the form , where
is the first coordinate of *(A). So jt(A) and Jc t (A) are linearly dependent,
which means that there are no more minimal column indices.
As the rows of Lf are linearly independent for every A, the minimal row
indices are absent.
Similarly, we conclude that the transposed polynomial LI has the single
minimal row index e and no minimal column indices. D

The importance of minimal indices stems from their invariance under


strict equivalence, as follows.

Proposition A.7.2
then the minimal column indices of the polynomials
are the same, and the minimal row indices of these
polynomials are also the same.

The proof is immediate: if P for invertible mat-


rices P and Q, then the solutions of = 0 are obtained from the
solutions of by multiplication by
which preserves linear dependence and independence and also implies that
jf(A) and y(\) have the same degree.
We are now in a position to state and prove the main result concerning
strict equivalence of linear matrix polynomials in general. We denote by Lf
the e x (e + 1) linear polynomial

and Ljis its transpose [which is an (e + 1) x e linear polynomial]. Then O uxt ,


676 Appendix

will denote the zero u x v matrix. As before, /^(A 0 ) represents the k x k


Jordan block with eigenvalue A0.

Theorem A. 7.3
Every m x n linear matrix polynomial A + \B is strictly equivalent to a
unique linear matrix polynomial of type

Here £j < • • • < ep a«d 171 < • • • ^ 17^ are positive integers; kl , . . . , kr and
l{, . . . , ls are positive integers; \l, . . . , \s are complex numbers.

The uniqueness of the linear matrix polynomial of type (A.7.10) to which


A + \B is strictly equivalent means that the parameters w, v, p, q, r, s,
are uniquely determined by the
polynomial A + \B. It may happen that some of the numbers u, v, p, q, r,
and s are zeros. This means that the corresponding part is missing from
formula (A.7.10).

Proof of Theorem A. 7.3. Let be a basis in the linear


space of all constant solutions of the equation

that is, all solutions that are independent of A. Note that (A.7.11) is
equivalent to the simultaneous equations

Likewise, let be a basis in the linear space of all constant


solutions of

or, what is the same, the simultaneous equations

Write A + \B (understood for each A G <p as a transformation written in the


standard orthonormal bases in <p" and <p m ) as a matrix with respect to the
basis in <p" whose first v vectors are *,, . . . , xv and the basis in <pm whose
Minimal Indices and Strict Equivalence of Linear Matrix Polynomials 677

first u vectors are yl1 . . . , yu and the others are orthogonal to


Span{_Vj, . . . , yu}. Because Im A = (Ker A*)^~ C(Span{_x 1 , . . . , yu})*~ and
also Im B C (Spanl)^, . . . , y M }) x , it follows that, with respect to the indi-
cated bases, A + \B has the form Here Al + \Bl has
the property that neither has con-
stant nonzero solutions.
If the rank of A} + \Bl is less than the number of columns in Al + Afij,
apply the reduction theorem (Theorem A. 6.1) several times to show that

where is such that the equation = 0 has no nonzero


polynomial solutions x = x(\). From the property of in Theorem
A. 6.1 it is clear that p . It is also clear that the process of
consecutive applications of Theorem A. 6.1 must terminate for the simple
reason that the size of A is finite. The Smith form of
(Theorem A. 1.1) shows that the number of columns of the polynomial
A2 + \B2 coincides with its rank.
If it happens that the rank of A2 + \B2 is less than the number of its
rows, apply the above procedure to After taking adjoints, we
find that

where 0 g and the rank of A3 + \B3 coincides with the number


of columns and the number of rows of (A3 + \B3). In other words,
A 3 + \B3 is regular. It remains to apply Theorem A. 5. 3 in order to show
that the original polynomial A + \B is strictly equivalent to a polynomial of
type (A. 7. 10).
It remains to show that such a polynomial (A. 7. 10) is unique. Proposition
A. 7. 2 and Example A. 7.1 show that the minimal column indices of A + \B
are 0, . . . , 0, e, , . . . , ep (where 0 appears « times) and the minimal row
indices of A + \B are 0, . . . , 0, 17^ . . . , v}q (where 0 appears v times).
Hence the parameters u, v, p, q, =], and are uniquely deter-
mined by A + \B. Further, observe that Lf and Lr( have no elementary
divisors; that is, their Smith forms are [7e 0] and " , respectively. (This
follows from Theorem A.2.2 since both Le and LTt have an € x e minor that
is equal to 1.) Using Proposition A. 3. 3, we see that the elementary divisors
1
of (A. 7. 10) are ,. which must coincide with the
elementary divisors of A + \B because of the strict equivalence of A + \B
and (A. 7. 10) (Theorem A. 3.1). Hence the parameters s, {/,}/ =1 , and
{ A,}^=1 are also uniquely determined by A + \B. Applying this argument for
\A + B in place of A + \B, we see that r and are uniquely
determined by A + \B as well. D
678 Appendix

The matrix polynomial (A.7.10) is called the Kronecker canonical form of


A + \B. Here 0,. . . , 0, el,. . . , ep (u times 0) are the minimal column
indices of A + \B; 0,. . . , 0, T/J, . . . , t}q (v times 0) are the minimal row
indices of A + \B; A * 1 , . . . , A*r are the elementary divisors of A + \B at
infinity; and e the (finite) elementary divisors of
A + \B. We obtain the following corollary from Theorem A.7.3.

Corollary A.7.4
We have A + \B ~A{ + \Bt if and only if the polynomials A + \B and
Al + \Bl have the same minimal column indices, minimal row indices,
elementary divisors, and elementary divisors at infinity.

Thus Corollary A.7.4 describes the full set of invariants for strict equival-
ence of linear matrix polynomials.

A.8 NOTES TO THE APPENDIX

This appendix contains well-known results on matrix polynomials. Essential-


ly the entire material can be found in Chapters 6 and 12 of Gantmacher
(1959), for example. In our exposition of Sections A.5-A.7 we follow this
book. In the exposition of Sections A.1-A.4 we follow Gohberg, Lancaster,
and Rodman (1982).
List of Notations
and Conventions

inclusion between sets X and Y


(equality not excluded)
the field of real numbers
the space of all n-dimensional real
column vectors
the field of complex numbers
the space of all n-dimensional com-
plex column vectors
complex conjugate of complex num-
ber x
the real part of x
the imaginary part of x
the n-dimensional column vector

the standard scalar product in <p";


««„..., o, <&,,...,&„»

the norm of a vector

679
680 List of Notations and Conventions

e, = (0,. . . , 0,1,0,. . ., 0} (with 1 in the ith place) the ith unit


coordinate vector in <p"; its size
n will be clear from the context
"Linear transformation" often abbreviated to "transfor-
mation"—when convenient, a linear
transformation from <pm into <p" is
assumed to be given by an n x m
matrix with respect to the bases
e , , . ..,€„ in <p" and e t , . . . , e m in
<pm, consequently, when convenient,
an n x m matrix will be considered
as a linear transformation written in
the standard bases el,.. . , en and
*!»•••>*«
m x n matrix whose entry in the
(/, /) place is afi
unit matrix; identity linear trans-
formation (the size of / is under-
stood from the context)
the k x k unit matrix
the transpose of a matrix A
the adjoint of a transformation A;
the conjugate transpose of a matrix
A
complex conjugate in every entry of
a matrix A
left inverse of a matrix (or trans-
formation) A
right inverse of a matrix (or trans-
formation) A
one-sided inverse (left or right) of
A; generalized inverse of A
the trace of a matrix (or transfor-
mation) A
the norm of a transformation A
the restriction of a transformation A
to its invariant subspace M
the image of a transformation
A'.^-^f"
the kernel of a transformation A
the spectrum of a matrix (or trans-
formation) A
the root subspace of A correspond-
ing to its eigenvalue X
List of Notations and Conventions 681

the Jordan block of size k x k with


eigenvalue A
the block diagonal matrix with the
matrices A,,..., Ap along the main
diagonal; or, the direct sum of the
linear transformations Alt . . . , A

a block column matrix

the set of all A-invariant subspaces


the set of all p-dimensional A-
invariant subspaces
the set of all coinvariant subspaces
for A
the set of all semiivariant sub-
spaces for A
the set of all reducing invariant sub-
spaces for A
the set of all p-dimensional reducing
invariant subspaces for A
the set of all hyperinvariant sub-
spaces for A
the set of all real invariant subspaces
for a real transformation A
the set of all transformations (or
matrices) that commute with a
transformation (or matrix) A
the zero subspace
the orthogonal complement to a sub-
space M
direct sum of subspaces M and Jf
orthogonal sum of subspaces M and
Jf
the unit sphere in a subspace M
the distance between a point x G <p"
and a set Z C <p"
the distance between sets X and Y
the gap between Z£ and M
the minimal opening between J£ and
M
the spherical gap between 2£ and M
the minimal angle between sub-
spaces Z£ and M
682 References

tric space of all subspaces in


the set of all m-dimensional sub-
spaces of <p"
the subspace spanned by vectors
A T , , . . . , Xk

the algebra of all n x n matrices


the algebra of all transformations on
a linear space !£
the algebra of all upper triangular
Toeplitz matrices of size j x j
the lattice of all invariant subspaces
for an algebra V
the algebra of all transformations for
which every subspace from a lattice
A is invariant
the set of all n x n unitary matrices
the set of all n x n real orthogonal
matrices with determinant 1
the set of all real invertible nx n
matrices
hthe McMillan degree of a rational
matrix function W(A)
the singular set of an analytic family
of transformations A(z)
Kronecker index Kronecker
tj = 0 index:
if i¥> /;d

are positive integers:

the number of distinct elements in a


finite set K
end of a proof or an example
References

Alien, G, R., "Hoiomorphic vector-valued functions on a domain of holomorphy," J. London


Math. Soc. 42, 509-513 (1967),
Bart, H., I. Gohberg, and M. A, Kaashoek, "Stable factorization of monk matrix polynomials
and stable invariant stibspaces," Integral Equations and Operator Theory I, 496-517
0978).
Bart, H., L Gohberg, and M. A. Kaashoek, Minimal Factorization of Mark and Operator
Functions (Operator Theory: Advances and Applications, Vol. 1) Birkhauser, Basei, 1979,
Bart, H., I. Gohberg, M. A, Kaashoek, and P, Van Dooren, "Factorizations of transfer
functions," SIAM J. Control Optim. 18(6), 675-696 (1980).
Baumgartei, H. Analytic Perturbation Theory for Matrices and Operators (Operator Theo
Advances and Applications, Vol. IS) Birkhauser, Basel-Boston-Stuttgart, 1985.
den Boer, H., and G. Ph. A. Thijsse, "Semistability of sums of partial multiplicities under
additive perturbations," Integral Equations and Operator Theory 3, 23-42 (1980).
Bochner, S., and W, T. Martin, Several Complex Variables, Princeton University Press,
Princeton, NJ, 1948.
Brickman, L., and P. A. Fillmore, "The invariant subspace lattice of a linear transformaton,"
Carmd. J. Math, 19, 810-822 (1967).
Brockett, R,, Finite Dimensional Linear Systems, John Wiley & Sons, New York, 1970.
Bmnovsky, P., "A classification of linear controllable systems," Kybentetika (Praha) 3,
173-187 (1970).
Campbell, S., and J. Daughtry, "The stable solutions of quadratic matrix equations," Proc.
AMS 74, 19-23 (W79).
Choi, M.-D,, C, Laurie, and H. Radjavi, "On comnitttators and invariant subspaces," Linear
and Multilinear Algebra 9, 329-340 (1981).
Coddington, E, A., and N. Levinson, Theory of Ordinary Differential Equations, McGraw-
Hill, New York, 1955.
Conway, J. B., and P. R, Hataos, "Knite-dimensiona! points of continuity of Lat," Linear
Algebra Appl. 31, 93-102 (1980).
Djaferis, T. E., and S. K. Mitter, "Some generic invariant factor assignment results using
dynamic output feedback," Linear Algebra Appl. St, 103-131 (1983).
DonneHan, T., Lattice Theory, Pergamon Press, Oxford, 1968.
Douglas, R. G., and C. Pearcy, "On a topology for invariant subspaces," J. Functional Armly.
2, 323-341 (1968).
Fillmore, P. A., D. A. Herrero, and W. E. Longstaff, "The hyperinvariant subspaces lattice of
a linear transformation," Linear Algebra Appl. IT, 125-132 (1977).
Ganttnacher, F. R., The Theory of Matrices, Vote. I and II, Chelsea, New York, 1959;
Gochberg, L Z., and i. Loiterer, "Uher Algebren stetiger Operatorfunctionen," Stadia
Mathematka, Vol. LVI1, 1-26, 1976.
Gohberg, L, and S. Goldberg, Basic Operator Theory, Birkhauser, Basel, 1981.

683
684 References

Gohberg, I., and G. Heinig, "The resultant matrix and its generalizations, I. The resultant
operator for matrix polynomials," Acta Set. Math. (Szeged) 37, 41-61 (Russian) (1975).
Gohberg, I., and M. A. Kaashoek, "Unsolved problems in matrix and operator theory, II.
Partial multiplicities of a product," Integral Equations and Operator Theory 2, 116-120
(1979).
Gohberg, I., M. A. Kaashoek, and F. van Schagen, "Similarity of operator blocks and
canonical forms. I. General results, feedback equivalence and Kronecker indices," Integral
Equations and Operator Theory 3, 350-396 (1980).
Gohberg, I., M. A. Kaashoek, and F. van Schagen, "Similarity of operator blocks and
canonical forms. II. Infinite dimensional case and Wiener-Hopf factorization," in Topics
in Modern Operator Theory. Operator Theory: Advances and Applications, Vol. 2,
Birkhauser-Verlag, 1981, pp. 121-170.
Gohberg, I., M. A. Kaashoek, and F. van Schagen, "Rational matrix and operator functions
with prescribed singularities," Integral Equations and Operator Theory 5, 673-717 (1982).
Gohberg, I. C., and M. G. Krein, "The basic propositions on defect numbers, root numbers
and indices of linear operators," Uspehi Mat. Nauk 12, 43-118 (1957); translation, Russian
Math. Surveys 13, 185-264 (1960).
Gohberg, I., and N. Krupnik, Einfiihrung in die Theorie der eindimensionalen singuldren
Integraloperatoren, Birkhauser, Basel, 1979.
Gohberg, I., P. Lancaster, and L. Rodman, "Perturbation theory for divisors of operator
polynomials," SIAM J. Math. Anal. 10, 1161-1183 (1979).
Gohberg, I., P. Lancaster, and L. Rodman, Matrix Polynomials, Academic Press, New York,
1982.
Gohberg, I., P. Lancaster, and L. Rodman, "A sign characteristic for self-adjoint meromorphic
matrix functions," Applicable Analysis 16, 165-185 (1983a).
Gohberg, L, P. Lancaster, and L. Rodman, Matrices and Indefinite Scalar Products (Operator
Theory: Advances and Applications, Vol. 8) Birkhauser-Verlag, Basel, 1983b.
Gohberg, I., and Ju. Leiterer, "On holomorphic vector-functions of one variable, I. Functions
on a compact set," Matem. Issled. 7, 60-84 (Russian) (1972).
Gohberg, L, and Ju. Leiterer, "On holomorphic vector-functions of one variable, II. Functions
on domains," Matem. Issled. 8, 37-58 (Russian) (1973).
Gohberg, I. C. and A. S. Markus, "Two theorems on the gap between subspaces of a Banach
space," Uspehi Mat. Nauk 14, 135-140 (Russian) (1959).
Gohberg, L, and L. Rodman, "Analytic matrix functions with prescribed local data," /.
d1 Analyse Math. 40, 90-128 (1981).
Gohberg, I., and L. Rodman, "On distance between lattices of invariant subspaces of
matrices," Linear Algebra Appl. 76, 85-120 (1986).
Gohberg, I., and S. Rubinstein, "Stability of minimal fractional decompositions of rational
matrix functions," in Operator Theory: Advances and Applications, Vol. 18, Birkhauser,
Basel, 1986, pp. 249-270.
Golub, G. H., and C. F. van Loan, Matrix Computations, The Johns Hopkins University Press,
Baltimore, 1983.
Golub, G. H., and J. H. Wilkinson, "Ill-conditioned eigensystems and the computation of the
Jordan canonical form," SIAM Review 18, 578-619 (1976).
Grauert, H., "Analytische Faserungen iiber holomorph vollstandigen Raumen," Math. Ann.
135, 263-273 (1958).
Guralnick, R. M., "A note on pairs of matrices with rank one commutator," Linear and
Multilinear Algebra 8, 97-99 (1979).
Halmos, P. R., "Reflexive lattices of subspaces," J. London Math. Soc. 4, 257-263 (1971).
Halperin, L, and P. Rosenthal, "Burnside's theorem on algebras of matrices," Am. Math.
Monthly 87, 810 (1980).
Harrison, K. J., "Certain distributive lattices of subspaces are reflexive," J. London Math. Soc.
8, 51-56 (1974).
Hautus, M. L. J., "Controllability and observability conditions of linear autonomous systems,"
Ned. Akad. Wet. Proc., Ser. A, 12, 443-448 (1969).
References 685

Helton, J. W., and J. A. Ball, "The cascade decompositions of a given system vs the linear
fractional decompositions of its transfer function," Integral Equations and Operator Theory
5, 341-385 (1982).
Hoffman, K., and R. Kunze, Linear Algebra, Prentice-Hall of India, New Delhi, 1967.
Jacobson, N., Lectures in Abstract Algebra II: Linear Algebra, Van Nostrand, Princeton, NJ,
1953.
Johnson, R. E., "Distinguished rings of linear transformations," Trans. Am. Math. Soc. I l l ,
400-412 (1964).
Kaashoek, M. A., C. V. M. van der Mee, and L. Rodman, "Analytic operator functions with
compact spectrum, II. Spectral pairs and factorization," Integral Equations and Operator
Theory 5, 791-827 (1982).
Kailath, T., Linear Systems, Prentice-Hall, Englewood Cliffs, NJ, 1980.
Kalman, R. E., "Mathematical description of linear dynamical systems," SI AM J. Control 1,
152-192 (1963).
Kalman, R. E., "Kronecker invariants and feedback," Proceedings of Conference on Ordinary
Differential Equations, Math. Research Center, Naval Research Laboratory, Washington,
DC, 1971.
Kalman, R. E., P. L. Falb, and M. A. Arbib, Topics in Mathematical System Theory,
McGraw-Hill, New York, 1969.
Kato, T., Perturbation Theory for Linear Operators, 2nd ed., Springer-Verlag, Berlin, 1976.
Kelley, J. L., General Topology, van Nostrand, New York, 1955.
Kra, I., Automorphic Forms and Kleinian Groups, Benjamin, Reading, MA, 1972.
Krein, M. G., "Introduction to the geometry of indefinite ./-spaces and to the theory of
operators in these spaces," Am. Math. Soc. Translations (2) 93, 103-176 (1970).
Krein, M. G., M. A. Krasnoselskii, and D. P. Milman, "On the defect numbers of linear
operators in Banach space and on some geometric problems," Sbornik Trud. Inst. Mat.
Akad. Nauk Ukr. SSR 11, 97-112 (Russian) (1948).
Kurosh, A. G., Lectures in General Algebra, Pergamon Press, Oxford, 1965.
Laffey, T. J., "Simultaneous triangularization of matrices—low rank cases and the non-
derogatory case," Linear and Multilinear Algebra 6, 269-305 (1978).
Lancaster, P., Theory of Matrices, Academic Press, New York, 1969.
Lancaster, P., and M. Tismenetsky, The Theory of Matrices with Applications, Academic Press,
New York, 1985.
Lidskii, V. B., "Inequalities for eigenvalues and singular values," appendix in F. R. Gantmach-
er, The Theory of Matrices, Nauka, Moscow, 1966, pp. 535-559 (Russian).
Markus, A. S., and E. E. Parilis, "Change in the Jordan structure of a matrix under small
perturbations," Matem. Issled. 54, 98-109 (Russian) (1980).
Markushevich, A. I., Theory of Analytic Functions, Vols. I-III, Prentice-Hall, Englewood
Cliffs, NJ, 1965.
Marsden, J. E., Basic Complex Analysis, Freeman, San Francisco, 1973.
Ostrowski, A. M., Solution of Equations in Euclidean and Banach Spaces, Academic Press, New
York, 1973.
Porsching, T. A., "Analytic eigenvalues and eigenvectors," Duke Math. J. 35, 363-367 (1968).
Radjavi, H., and P. Rosenthal, Invariant Subspaces, Springer-Verlag, Berlin, 1973.
Ran, A. C. M., and L. Rodman, "Stability of neutral invariant subspaces in indefinite inner
products and stable symmetric factorizations," Integral Equations and Operator Theory 6,
536-571 (1983).
Rodman, L., and M. Schaps, "On the partial multiplicities of a product of two matrix
polynomials," Integral Equations and Operator Theory 2, 565-599 (1979).
Rosenbrock, H. H., State Space and Multivariable Theory, Nelson, London, 1970.
Rosenbrock, H. H., and C. E. Hayton, "The general problem of pole assignment," Intern. J.
Control 27, 837-852 (1978).
Rosenthal, E., "A remark on Burnside's theorem on matrix algebras," Linear Algebra Appl.
63, 175-177 (1984).
Rudin, W., Real and Complex Analysis, 2nd ed., Tata McGraw-Hill, New Delhi.
Ruhe, A., "Perturbation bounds for means of eigenvalues and invariant subspaces," Nordisk
Tidskrift fur Informations Behandlung (BIT) 10, 343-354 (1970a).
Ruhe, A., "An algorithm for numerical determination of the structure of a general matrix,"
Nordisk Tidskrift fur Informations Behandlung (BIT) 10, 196-216 (1970b).
Saphar, P., "Sur les applications linéaires dans un espace de Banach. II," Ann. Sci. École
Norm. Sup. 82, 205-240 (1965).
Sarason, D., "On spectral sets having connected complement," Acta Sci. Math. (Szeged) 26,
289-299 (1965).
Shayman, M. A., "On the variety of invariant subspaces of a finite-dimensional linear
operator," Trans. AMS 274, 721-747 (1982).
Shmuljan, Yu. L., "Finite dimensional operators depending analytically on a parameter,"
Ukrainian Math. J. 9(2), 195-204 (Russian) (1957).
Shubin, M. A., "On holomorphic families of subspaces of a Banach space," Integral Equations
and Operator Theory 2, 407-420 (translation from Russian) (1979).
Sigal, E. I., "Partial multiplicities of a product of operator functions," Matem. Issled. 8(3),
65-79 (Russian) (1973).
Soltan, V. P., "The Jordan form of matrices and its connection with lattice theory," Matem.
Issled. 8(27), 152-170 (Russian) (1973a).
Soltan, V. P., "On finite dimensional linear operators with the same invariant subspaces,"
Matem. Issled. 8(30), 80-100 (Russian) (1973b).
Soltan, V. P., "On finite dimensional linear operators in real space with the same invariant
subspaces," Matem. Issled. 9, 153-189 (Russian) (1974).
Soltan, V. P., "The structure of hyperinvariant subspaces of a finite dimensional operator," in
Nonselfadjoint Operators, Stiinca, Kishinev, 1976, pp. 192-203 (Russian).
Soltan, V. P., "The lattice of hyperinvariant subspaces for a real finite dimensional operator,"
Matem. Issled. 61, 148-154, Stiinca, Kishinev (Russian) (1981).
Thijsse, G. Ph. A., "Rules for the partial multiplicities of the product of holomorphic matrix
functions," Integral Equations and Operator Theory 3, 515-528 (1980).
Thijsse, G. Ph. A., Partial Multiplicities of Products of Holomorphic Matrix Functions,
Habilitationsschrift, Dortmund, 1984.
Thompson, R. C., "Author vs. referee: A case history for middle level mathematicians," Am.
Math. Monthly, 90(10), 661-668 (1983).
Thompson, R. C., "Some invariants of a product of integral matrices," in Proceedings of the
1984 Joint Summer Research Conference on Linear Algebra and its Role in Systems Theory,
1985.
Uspensky, J. V., Theory of Equations, McGraw-Hill, New York, 1978.
Van Dooren, P., "The generalized eigenstructure problem in linear system theory," IEEE
Trans. Aut. Contr. AC-26, 111-129 (1981).
Van Dooren, P., "Reducing subspaces: Definitions, properties and algorithms," in A. Ruhe and
B. Kågström, Eds., Matrix Pencils, Lecture Notes in Mathematics, Vol. 973, Springer, New
York, 1983, pp. 58-73.
Wells, R. O., Differential Analysis on Complex Manifolds, Springer-Verlag, New York, 1980.
Wonham, W. M., Linear Multivariable Control: A Geometric Approach, Springer-Verlag,
Berlin, 1979.
Author Index

Allan, G.R., 645
Arbib, M.A., 292
Ball, J.A., 292
Bart, H., 290, 292, 561, 562
Baumgärtel, H., 605, 645
Bochner, S., 629
den Boer, H., 562
Brickman, L., 562
Brockett, R., 292
Brunovsky, P., 292
Campbell, S., 561, 562
Choi, M.D., 384
Coddington, E.A., 262
Conway, J.B., 561
Daughtry, J., 561, 562
Djaferis, T.E., 292
Donnellan, T., 313
Douglas, R.G., 561
Falb, P.L., 292
Fillmore, P.A., 384, 562
Gantmacher, F.R., 115, 290, 384, 678
Gohberg, I., 290, 291, 292, 410, 561, 562, 580, 609, 645, 678
Goldberg, S., 580
Golub, G., 562
Grauert, H., 645
Guralnick, R.M., 384
Halmos, P., 384, 561
Halperin, I., 384
Harrison, K.J., 348
Hautus, M.L.J., 292
Hayton, C.E., 292
Heinig, G., 609
Helton, J.W., 292
Herrero, D.A., 384
Hoffman, K., 427
Jacobson, N., 384
Johnson, R.E., 348
Kaashoek, M.A., 290, 291, 292, 561, 562
Kailath, T., 291, 292
Kalman, R.E., 292
Kato, T., 561
Kelley, J.L., 592
Kra, I., 614
Krasnoselskii, M.A., 561
Krein, M.G., 290, 561
Krupnick, N., 561
Kunze, R., 427
Kurosh, A.G., 290
Laffey, T.J., 384
Lancaster, P., 122, 290, 291, 327, 384, 561, 562, 645, 678
Laurie, C., 384
Leiterer, Ju., 410, 561, 645
Levinson, N., 262
Lidskii, V.B., 136
Longstaff, W.E., 384
Markus, A.S., 561, 562
Markushevich, A.I., 570, 585
Marsden, J.E., 477
Martin, W.T., 629
Milman, D.P., 561
Mitter, S.K., 292
Ostrowski, A.M., 562
Parilis, E.E., 562
Pearcy, C., 561
Porsching, T.A., 645
Radjavi, H., 384
Ran, A.C.M., 561
Rodman, L., 136, 291, 561, 562, 645, 678
Rosenbrock, H., 292
Rosenthal, E., 384
Rosenthal, P., 384
Rubinstein, S., 292, 561, 562
Rudin, W., 597
Ruhe, A., 562
Saphar, P., 645
Sarason, D., 291
Schaps, M., 136, 291
Shayman, M.A., 434, 561
Shmuljan, Yu.L., 645
Shubin, M.A., 645
Sigal, E.I., 291
Soltan, V.P., 290, 380, 384
Thijsse, G.Ph.A., 136, 562
Thompson, R.C., 291
Tismenetsky, M., 122, 290, 327, 384, 561
Uspensky, J.V., 630
van der Mee, C.V.M., 561
van Dooren, P., 562
van Loan, C.F., 562
van Schagen, F., 292
Wells, R.O., 434
Wilkinson, J.H., 562
Wonham, W.M., 291, 292
Subject Index

Algebra, 339
  k-transitive, 344
  reductive, 351
  self-adjoint, 351
  see also Boolean algebra
Analytic family:
  of subspaces, 566
    A(z)-invariant, 594
    direct complement for, 590
    real, 600
  of transformations, 565, 599, 604
    analytic Jordan basis for, 611
    diagonable, 612
    eigenvalues of, 604, 609
    eigenvectors of, 605
    first exceptional set, 609, 624, 632
    image of, 569
    incomplete factorization of, 578
    kernel of, 569
    multiple points of, 608
    real, 600
    second exceptional set of, 610, 624, 633
    singular set of, 569
Angular subspace, 25
Angular transformation, 27, 398
Atom, 349
Baire category theorem, 592
Binet-Cauchy formula, 651
Block similarity, 193, 208, 383
Boolean algebra, 349
  atomic, 349
Branch analytic family, 613
  singular set of, 613
Brunovsky canonical form, 196, 359, 383
Burnside's theorem, 341
Cascade (of linear systems), 273
  minimal, 274
  simple, 270
Chain (of subspaces), 33
  almost invariant, 209
  analytic extendability of, 618
  complete, 35, 348, 449
  Lipschitz stable, 526
  maximal, 35
  stable, 464
Characteristic polynomial, 10
Circulant matrix, 43, 96, 256, 260
Coextension, 128
Coinvariant subspace, 105, 437, 490
  orthogonally, 108
Col, 147
Column indices, minimal, 674
Commutator, 303
Commuting matrices, 295, 371
Companion matrix, 146, 515
  second, 150
Completion, 128
Complexification, 366
Compression, 106
Connected components, 426, 442
Connected set, 423
  finitely, 584
  simply, 584
Connected subspaces, 405, 423, 437
Continuous families:
  of subspaces, 408, 445
  of transformations, 412
Controllable pair, 290
Controllable system, 267
Diagonable transformation, 109, 366
Difference equation, 180
Differential equation, 175
Dilation, 128
  of linear system, 263
Direct sum of subspaces, 20
Distance:
  between sets of subspaces, 465
  between subspaces, 397
  from point to set, 388
Disturbance decoupling, 275
Eigenvalue, 10, 146, 361, 604, 609, 657, 661
Eigenvector, 10, 361, 605
  generalized, 12, 13
Elementary divisors, 298, 655, 665
  at infinity, 664, 665
Elementary matrices, 694
Equivalent matrix polynomials, 646
  strictly, 195, 382, 662, 665
Extension, 121
Factorization:
  of matrix polynomials, 159, 160, 171, 554, 624
    analytic extendability, 625, 626
    isolated, 524, 554
    Lipschitz stable, 525
    sequentially nonisolated, 627
    stable, 520, 524, 554
  of rational matrix functions, 226, 554
    analytic continuation, 634
    isolated, 538, 539, 555
    Lipschitz stable, 539
    minimal, 226, 529, 634
    sequentially nonisolated, 638
    stable, 529, 537, 539, 554
Factor space, 29
Feedback, 275, 277, 279
Fractional power series, 605
Full range pair, 81, 197, 290, 468
Gap, 387, 417
  spherical, 393, 418
Generalized inverse, 24
  continuity of, 411, 413
Generators, 69, 100
  minimal, 69
Graph (of matrix), 545
Height:
  of eigenvalue, 86
  of transformation, 498, 513
Hyperinvariant subspace, 305-313, 374, 431, 490
Ideal, in algebra, 343
Image, 5, 406
Incomplete factorization, 578
Input (of linear system), 262
Invariant polynomials, 654, 664
Invariant subspace, 5, 359
  of algebra, 340
  a-stable, 513
  analytic extendability of, 616
  B-stable, 480
  common to different matrices, 301, 378
  cyclic, 69
  inaccessible, 431
  intersect v, 208
  irreducible, 65, 365
  isolated, 428, 442, 473
  Jordan, 54
  Lipschitz stable, 459, 473
  marked, 83
  maximal, 72
  minimal, 78
  mod v, 191
  orthogonal reducing, 111
  real, 359
  reducible, 65
  reducing, 109, 298, 432, 490
  sequentially isolated, 619
  spectral, 60, 365, 458, 618
  stable, 447
  supporting, 187
Jordan block, 6, 52
Jordan chain, 13, 361
Jordan form, 53
  real, 365
Jordan indices, 196
Jordan part (of Brunovsky form), 196
Jordan structure, 482
  derogatory, 497
  fixed, 596
Jordan structure sequence, 477, 483
  derogatory part, 512
Jordan subspace, 54
Kernel, 5, 406
Kronecker canonical form, 678
Kronecker indices, 196, 199
Kronecker part (of Brunovsky form), 196
Laplace transform, 265
Lattice, 31
  analytic dependence, 596
  distributive, 311, 348
  linear isomorphism, 484
  reflexive, 348
  self-dual, 311
Lattice homomorphism, 483
Lattice of invariant subspaces, 463, 470
  analytic dependence, 596
  Lipschitz stable, 464
    in metric, 467
  stable, 464
    in metric, 465
Lattice isomorphism, 463, 483, 596
Left inverse, 216
  continuity of, 414
Left quotient, 659
Left remainder, 659
Linear equation (in matrices), 548, 551
Linear fractional decomposition, 244, 274
  Lipschitz stable, 540
  minimal, 245, 274
Linear fractional transformation, 238
Linear isomorphism (of lattices), 484
Linearization, 144
Linear system, 262
  controllable, 267
  disturbance decoupled, 275
  minimal, 264
  observable, 266
  similar, 263
Linear transformation:
  diagonable, 109, 366
  normal, 39, 363
  self-adjoint, 363
  unitary, 363
Lipschitz continuous map, locally, 518
Lipschitz stability, 467
Lyapunov equation, see Linear equation
McMillan degree, 225, 245, 632
Matrix:
  block:
    circulant, 98
    tridiagonal, 210
  circulant, 96, 97, 314
  companion, 98, 100, 299, 314
  cyclic, 299
  diagonable, 90
  hermitian, 20
  nonderogatory, 299, 449, 465, 499
  normal, 100, 111, 117, 303
  orthogonal, 363, 405
  Toeplitz, 317
Matrix polynomial, 646
  monic, 144
  see also Factorization, of matrix polynomials
Metric, 387
Metric space:
  compact, 400
  complete, 401
  connected, 405
Minimal angle, 392, 419
Minimal opening, 396, 451
Minimal polynomial, 74
Minimal realization, 218, 219
Minimal system, 264
Minor, 651
Mittag-Leffler theorem, 571, 614
Monodromy theorem, 597
Multiplicity:
  algebraic, 53, 365
  geometric, 53, 365
  partial, 53, 365
Norm, 88, 415
Normed space, 415
Null function, 220
  associated, 222
  canonical, 220
  order of, 220
Null kernel pair, 75, 81, 209, 290
Null vector, 220
Observable pair, 290
Observable system, 266
Output (of linear system), 262
Output stabilization, 279
Partial multiplicities, 154, 219, 657, 661
  stability of, 475
Pole (of rational function), 219, 223
  geometric multiplicity of, 529
Projector, 20
  complementary, 22
  orthogonal, 21
Quadratic equation (in matrices), 27, 545, 637
  inaccessible solution of, 547
  isolated solution of, 547, 551, 556
  Lipschitz stable solution of, 552
  stable solution of, 551, 552, 556
  unilateral, 550
Rational matrix function, 212
  analytic dependence, 628
  analytic minimal realization of, 630
  exceptional sets of, 632, 633
  minimal realization of, 218
  partial multiplicities of, 219
  pole of, 219
  realization of, 212
  zero of, 219
  see also Factorization, of rational matrix functions
Reachable vector, 276
Realization, see Rational matrix function
Reducing subspaces, 245, 251
Reduction:
  of linear system, 263
  of realization, 215
Regular linear matrix polynomial, 663
Resolvent form, 147
Restriction of transformation, 121
Riccati equation, see Quadratic equation
Riesz projector, 64, 447, 452
Right inverse, 216
  continuity of, 414
Right quotient, 659
Right remainder, 659
Root subspace, 46, 363, 490
Rotation matrix, 54
Row indices, minimal, 674
Scalar product, 391
Schmidt-Ore theorem, 290
Self-adjoint transformation, 20
Semiinvariant subspace, 112, 438, 490
  orthogonally, 115
Sigal inequalities, 133
Similarity, 17
  of standard triple, 147
  of systems, 263
Simply connected set, 584
Smith canonical form, 647
  local, 218
  uniqueness of, 651
Spectral assignment, 203, 383
Spectral factorization, 187
Spectral shifting, 204
Spectral subspace, 60
Spectrum, 10
Standard pair, 183
Standard triple, 147
  similarity of, 147
State vector, 262
Subspace:
  [A B]-invariant, 190, 481
  -invariant, 192
  angular, 25
  coinvariant, 105, 437, 490
  complementary, 20
  controllable, 204
  irreducible, 65
  Jordan, 54
  orthogonally coinvariant, 108
  orthogonally semiinvariant, 115
  reducible, 65
  root, 46, 363, 490
  semiinvariant, 112, 438, 490
  spectral, 60
  see also Invariant subspace
Supporting k-tuple, 530
  stable, 530
Supporting quadruple, 249
Toeplitz matrix, 40, 317
  upper triangular, 297, 317
Trace, 427
Transfer function, 265
Transformation:
  adjoint, 18
  angular, 27, 398
  coextension of, 128
  diagonable, 90, 100
  dilation of, 128
  extension of, 121, 190, 208
  function of, 85
  induced, 30
  nonderogatory, 299, 449, 465, 499
  normal, 39, 303
  orthogonally unicellular, 117
  reduction of, 128
  self-adjoint, 20
  unicellular, 67
Triinvariant decomposition, 112, 253
  orthogonal, 115
  supporting, 156, 277
Unitary matrix, 37
Vandermonde, 72, 98
Weierstrass' theorem, 571, 614
Zero:
  geometric multiplicity of, 529
  of rational function, 219, 223
