You are on page 1of 12

LINEAR

ALGEBRA

Almost

AXD

Diagonal

ITS XPPLIC4TIONS

Matrices

with

Multiple

or Close Eigenvalues*

J. H. WILKISSOS
~Vational

Physical

Teddington,

Laboratory

Eragland

1. INTRODUCTIOX

In a number
latter

of algorithms

is reduced

to almost

for finding eigenvalues

by an iterative

diagonal

form.

of all the transforms


then in the nature

sequence

When A, has a multiple

(assuming

exact

of almost diagonal

of a matrix

of similarity

eigenvalue

computation).
matrices

with multiple

It turns out that such matrices

have special characteristics

interest

the convergence

for reducing

a matrix

2. THE HERMITIAN

eigenvalues.
which are of

of iterative

procedures

form.

CASE

We first consider
A be hermitian

to diagonal

this is true

We are interested

considerable

as regards

A,, the

transformations

hermitian

matrices

with eigenvalues

A, being precisely of multiplicity


but this will not affect

with multiple

eigenvalues.

Let

Ai, Ai, . . . , A,, Ar+l, ;l,+2, . . . , An the root


Y. (A may have other multiple eigenvalues

the argument.)

Let S be defined by the relation

and let

A=D+E,

(3)

* This work was done while the author was Visiting Professor in the Cornputer Science Department
of Stanford University.
It was supported
by the
Xational Science Foundation and the Office of Naval Research.
Lznear

Copyright

4lgcbm

1968 by American

asd

Its

Applications

Elsevier

Publishing

1, I-12

Company,

(1968)
Inc.

J. H. WILKIXSON

2
where D is the diagonal

of A.

Suppose

we have

ilE;lF = E < 6

(3)

where F denotes the Frobenius norm (C 2 letjj2)1/2. When E is small


A may be regarded as almost diagonal. By the Wielandt-Hoffman
theorem
[5] the li and aii may be ordered
2
Let us permute

so that

(APi-

ai;)

the rows and columns

ciated with the A, eigenvalues


n -

(4)

of A similarly

so that the aii asso-

are the first Y. Without

we can assume this was true originally


of the remaining

< &2.

Y eigenvalues

loss of generality

and with appropriate

inequality

numbering

(4) becomes

We write

A=GTH
i

(6)

where F is an Y x Y matrix.
If the eigenvalues

of H are I:+,,

. . . , An, then since the off-diagonal

elements of H are a subset of those of E, we have by the Wielandt-Hoffman


theorem

[5] with appropriate

numbering

of the Ai

(7)
Hence
I& -

& = I& -

aii + u,i -

GE+&<28

(8)

and

Linear

Algebra and Its Applications

1, l-12

ilil

(1968)

ALMOST

DIAGONAL

The matrix

H -

MATKICES

Ail is therefore

nonsingular,

i.e., it is of rank

Now since A has 1, as a r-fold root it, too, is of rank n that

this means

A -

Ai1 in the form

that F is especially

F -

If we premultiply

A -

alI =

A -

iliT by

its rank is unaltered

;i,I
G7

be true

r.

(10)

&I

the derived

Since H -

We partition

(11)

matrix

AlI)-Gr

I$
H -

GT
is also of rank n -

to G and H.

H -

- G(H - AlI)-

iill - G(H -

is already

I,I

Y.

and hence

F -

related

n -

Y. We shall show

(12)

AlI
of rank n -

Y this can

only if
F -

il,I -

G(H -

il,I)plGT = 8,

;i,I)pG?

= ,?,I + M

(13)

i.e.,
F = 1,I + G(H Now the elements

of G are a subset

of those

(say).

(14)

of E and hence

while
(H where R is unitary.
norm

and from

= R diag(A, -

&I)-

Hence from the unitary

A,)-iRH,
invariance

(16)
of the Frobenius

(9) and (15)

(17)

We see then
bounded

that

diagonal

elements

by ~~16 and its off-diagonal

of F differ
elements

from A, by quantities
are bounded

by e2/6.

When E < b this means that


never

the largest

found in F, the matrix

with the multiple root 1,.

This has important

with the classical

method

matrices.

matrix

is annihilated

stage such an off-diagonal

elements
This

tending
simple

the classical

observation

annihilated

strategy

the annihilated

3.

removes
remark

is used [9].
elements

In practice

element

shows that

is never associated

after

with two

root.

a difficulty
applies

in demonstrating
quadratically

to the serial

that

convergent

Jacobi

If at any stage the element

will not be associated

CLOSE

method
which is

errors.

the quantity

matrix

is made on a matrix

merely

In discussing

has very

the convergence

mini& -- 2; is of great importance


the roots

elements

IiOOTS

close roots would appear to be serious.


Suppose

with two diagonal

stage

root.

when a transformation

the transformed

rounding

hermitian
element in

then this ensures that from a certain

to the same multiple

PATHOLOGICALLY

roots,

in connection

is chosen to be one which is not small compared with the current

norm of off-diagonal
tending

element

method is always ultimately


A similar

[7, 10, 11, 15).


if a threshold

consequences

but the theorem

of A is

associated

the largest off-diagonal

to the same multiple

Jacobi

element

elements

[6, 10, 151 for diagonalizing

At each stage in the reduction

the current
certain

Jacobi

off-diagonal

with the diagonal

having multiple

close roots
of the

because

Jacobi

and the presence

of

method
of very

We now show that this is not so.

of A are

a,;

a,,a,

,...)

I.,, ,,...,

a,,,

(W

where

ii =
and the ei are very small.
close.

Define

a+

Ei,

i-

l,...,v,

The first Y roots are therefore

D and E as in (Z), but

(19)
pathologically

6 by the relation

(20)
and assume

that

Now A may

be expressed

in the form

A = RD,RH,
where

D, = diag(Q

and D,

can

(22)

be separated

D, = diag(2, A, . . ., 1, a,,,,

into

D, and D, where
(23)

. . . , An),

(24)

D, = diag(s,, F~, . . , E,, 0, . . , 0).


Hence
A =: K(L), + D,)RH = RD,RH
The matrix

+ RD,RH

= B + C

(say).

(25)

H has A as an r-fold root and to apply the result of the previous

section we require only a bound for the Frobenius


Since

elements.

/JElI,+
The Frobenius

IICj!F=

root

corresponding

is therefore

elements

Suppose

llE~I~_l- (~E~~)~~~==.z<~.

norm of the off-diagonal

the multiple

I -

elements

bounded

by ?/6

of A is bounded

for example

norm of its off-diagonal

C such a bound is given by

B = A -

a matrix

(W),

of B associated
and hence

by &Z/C?
+ (2

that

with
of the

E~~)~.

A has the roots

1 -

10-10,

1,

4,

and
IIEl,fl + 211(1()-10) E
The

have

off-diagonal

elements

a Frobenius

and therefore
element

they

of A.

the current

the close roots.


10M5 to 10-lo.

positions

than

the largest

method with a matrix


rotation

Jacobi

off-diagonal
method

or

having the root distribu-

will not be in a plane

associated

with

In fact, with the above example one sweep of the threshold

method

will reduce the norm of off-diagonal

Provided

level the presence


is beneficial

with the close roots will then

b\

at such a stage in the classical

serial Jacobi

serial Jacobi

of A associated
bounded

will all be far smaller

Hence

the threshold
tion above,

norm

10-5.

since

elements

of the close roots has no adverse influence.


it ensures

is concentrated

from

we do not wish to reduce the norm below this


that

the main

on fewer elements.)

weight

(In fact, it

in the off-diagonal

J. H. WILKINSOS

6
4. NONHERhlITIAX

The above
associated
result

MATRICES

proofs

may give the impression

that

with hermitian

In fact a closely related

specifically

matrices.

is true for any diagonalizable

Again

let A have

the roots

matrix

the result

having

an r-fold

il,, . . ., Al, Ar+l, . . ., I,.

d = D -t E

above
root.

Let

(D diagonal),

(26)

(28)

iiEl!,=E<6.
Suppose
[2]*

&I is of rank n -

there

the last

n) include

the Ai (i=r+l,...,

that A -

are at least

a root ;1, of multiplicity

i for which

following

from

(28).

This

are at least s of the a,, lying in the disk with center


At the moment

it appears

if so let us associate

possible

that

the first s qualifying

there

precisely

n -

and columns
the southeast

Y diagonal

&I

that

there

may be more than

s;

The aCi associated


since the disks with

Hence we have associated

with il, ,_i, . . . , ii,.

so that these n -

Now permute

Y diagonal

elements

rows
are in

We can assume that A was in this form originally.

As in the symmetric
A -

elements

of A similarly
corner.

from (27) and (28).

means

1, and radius e.

a,, with $.

with A, and L, (k, 1 > Y, L, # 2,) must be different


centers L, and 1, are disjoint,

s so

s; then by a theorem due to Fan and Hoffman

s indices

two inequalities

is

case, A -

A,1 is of rank +z -

Y and partitioning

in the form
G
H -

il,I

we have
I; -

l,Z -

G(H -

,?,I)-K

= 0,

(31)

* I would like to thank Dr. B. Levinger of Case Institute for drawing my attention
to this result while we were both enjoying
Division,

*Argonne Xatlonal

Laboratory.

the hospitality

of :\pplied Mathematics

ALMOST

DIAGONAL

provided

H -

MATRICES

A,1 is nonsingular.

Now

H -~ L,I ==diag(aii - Al) +- L = D, + L


where L is the matrix of off-diagonal
a subset

of those

of E.)
laiz -

elements of H.

(34

(These elements

are

Since

a,1 2

/Ai-

A,/ -

(aii -

A,(

(i=r+l,...,n),

> 2s
D, is nonsingular

(say),

(33)

and
H -

&I = D, jI f

D,-lL].

(34)

Now

(35)
and hence

z
Equation

(31) therefore

1
-.
i)

(36)

gives

F = A,1 + G(H -

;2,1)-1K

= R,I + M

(say),

where

(37)
Liwav

_zilgebra and

Its

Applications

1, l-

12 (1968)

J.

This now shows that precisely


;li (i=r+-

a disk of radius .s2/6 centered


ciated

n -

Y of the ajj were associated

n) and the remaining

l,...,

on 1,.

the level

r diagonal

elements

Again off-diagonal

with the multiple eigenvalue

well below
problem

of the largest

off-diagonal

of A seems not to be involved.

elements

elements

In this respect

in order to derive a similarity


is almost

have
2.

this

deceptive

PATHOLOGICALLY

In the hermitian

CLOSE

The deceptive

nature

and let X be a matrix


Then

lIEllm < E

X-lBX

problem

then,

= A such that A
quantity,

we shall,

if B is ill-conditioned
case the hypothesis

than

does not

feature.
ROOT5

IN

NONHERMITIAN

of the result

consider the effect of very close roots.


of A.

eigenvalue

transformation

E << d.

(though not as

it is the hypothesis

have to work to higher precision

if it is well-conditioned.

when

a result may be proved

with (\Ej Ian1ess than a prescribed

diagonal

in general,

Indeed

If B has an ill-conditioned

which is deceptive.

asso-

of the eigenvalzbe

weaker even when A is defective

far as ii, is concerned).

with the
are all in

are bounded by c2/6 and are therefore

The result is at first sight surprising since the condition


which is only marginally

H. WILKIXSOS

having

becomes

CASE

apparent

as soon as we

Assume now that A is nondefective

as its columns

n independent

eigenvectors

we have

A = X diag(&)X-l.
Using a similar notation

to that in Section

(33)

3 we have in the case of Y very

close roots

A = XD,X-l

+ XD,X-1

where B now has an r-fold root.


ijCjjF =

(39)

= B i_ C,

In the hermitian

case X is unitary

(40)

I!C!/ G IIXII lIDal l/X-1ll,


and we see that the condition
inevitably

involved.

for all permissible


not imply that

number

of a multiple

[l].

and

large.

the fact that the eigenvector

Its Applications

though,

root or of a set of very close roots does

is irrelevant.
Algebra

is

value of 1IX/j 1IX-ill

It should be emphasized,

j]X/ 1 //X-l\/ is necessarily

are well conditioned

of X with respect to inversion

It is clear that it is the minimum

X that is relevant

that the possession

Linear

and

1jD,jj,, but now all we can say is

1, I-

12

(1968)

Provided

the close roots

problem is ill conditioned

6.

ITERATIVE

The

REFINEMENT

above

procedures

results

OF

have

[12, 13, 151.

EIGENSYSTEM

important

for the refinement

consequences

of a computed

In such procedures

values and eigenvectors


and define

AN

in connection

eigensystem

one starts with a computed

,u and xi. Let X be the matrix

with

of a matrix
set of eigen-

having columns ,yi

K and S by the relatiorr


.4X ~ X diag(pu,) =- K,
X-i.lX

If the system
neither

diag(,u;) = X-1R

were exact

both

R nor S can be computed

rounding
with

errors

but

error.
S

If the computed
for calculating
small compared

(S -

.q).

is obtained

(13)

for 11.5~ .qi/ which is

(Note S IS computed
for S -

an almost

of

an .q is determined

.? is small, and with good procedures

a bound

for the norm only is determined

In practice

we have

system is accurate
1IS[I.

(42)

with the given X because

procedures

A .Y = diag(,u;) + . f

with

of (43) is therefore

exactly

Hence

R and X-lR

= S.

R and S would be null.

with well-designed

a low relative

(41)

S.)

diagonal

explicitly

The matrix

matrix

but a bound

sum on the right

which is exactly

similar

to A.
Now when A has a multiple
result

shows that

the off-diagonal

provided
elements

root corresponding

s is small

to a linear divisor our

(and hence

of 9 associated

S -

far smaller than the largest off-diagonal

elements

roots

find typically

of A is ill conditioned

then the bound for j/S mantissa


associated

binary

we shall

s/j,

of &2 and the associated


Hence after suitable
side of (43) will have

of 5.

will be approximately

computer).

with the multiple

The

5 is very small),

with the multiple

diagonal

elements

that
2-%

permutations

elements

if /IsI/, = E
(with a t-digit

of diag&)

roots will differ by quantities

off-diagonal

roots will be

When none of the

+ s

of the order

will be of order e2.

of rows and columns the right-hand

the form

Linear

.4lgebra

and Iis Applicalions 1, 1 -

12

(196X)

10

J. H. WILKINSON

and the bound for 1(s Premultiplication


then

modifies

S[ / WI11usually be of order at least as small as $.

of the first Y rows by ke and the first r columns by (l/k)&


the second

matrix

to the form

EEL WM

I I
;Ar

and because
multiple

of this Gerschgorins

roots

what
which

theorem

as for well-separated

Forgetting

(45)

EP

gives just as fine bounds

roots.

rounding errors for the moment

can be achieved

with

can be expressed

it is interesting

an approximate

matrix

to consider

X of eigenvectors

in the form
Z = X(1 + EE),

&?Zl[m= 1 and X is a matrix

where

for

(46)

of exact

normalized

eigenvectors.

We have
X-rAx

= (1 + &E)-X-r/fX(I
= (I -

FE + ,s2E2 -

EE)

. * a) diag(&)(I

= dia.&A,) + EF + terms

+ EE)

in .9, etc.,

(47)

where

fij = We see that


diagonal

the elements

elements

Notice

that

coincident,

the

A has eigenvalues

associated

similarity

Ai = A,.

eigenvalues
which,

off-diagonal

elements

transformations

smaller
are again

of Gerschgorins

gives bounds

The weakest

bounds

being
than

arise when the separations

When

whichever

Linear

Algebra

the system
and

Its

using
eigenis the

are themselves
merely

transformations.

the procedure

then provided

truly
E, (48)

appreciably

theorem

of order E. The bounds are then of order E and cannot be improved


by diagonal similarity

the off-

for the relevant

values which are of the order of e2 or of the separations,


larger.

Hence

are of order 3.

while not

which are appreciably

than E and a simple application

diagonal

(48)

ilieLj.

with multiple

have separations

shows that
smaller

fij are zero whenever

associated

when

Ajeij

for refining

an eigensystem

is not too ill conditioned

Af$&ations

1, l-

12 (1968)

is used iteratively,
the final eigensystem

ALMOST

DIAGOSAL

11

MATRICES

is correct

to working accuracy.

computed

system

of vectors

= (X-i

X-iEX-i

= diag(&) Equation

liE!jm < fi.2-JJX/I,.

. . .)AX(I

of a prescribed

are certainly
the quadratic

the bounds

bounded

and diagonal
consists

transformations

of multiple

with a minimum
consists

separation

order of fiz/S,.

into

large

using

three

we

(51)

Gerschgorins

theorem

in the following
groups.

The first

form.
group

of eigenvalues

p, and the third

group

in the first group having


the bound is of the

of the second group having


and a minimum

separations

separation

of

of order S,

the bound is of the order of s + (/P/S,). For a member


separation

of 6, the bound is of the order of /P/S,.


except

if

of

Writing

of 6, from all other eigenvalues

For a member

with

elements

max(Aij = fi,

For an eigenvalue

of the third group having a minimum


is quite

in E.

which is less than

216 to s from its close neighbors


from all others

terms

the second group consists

eigenvalues;
separation

(50)

accuracy

off-diagonal

can be expressed

be divided

of the remainder.

a minimum

The

for the eigenvalues

the eigenvalues

+ ...

by 2n. 2-~~X~~,~~X-1~~, maxjA,l,

and higher-order

attainable

+ X-1E)

on the attainable

precision.

2%. 2-1j[Xj/,I(X-11j,

Let

(4%

diag(&) + diag(&)X-lE

X-lE

(50) shows the real limitation

computation
ignore

of the form

we have

X-ijfX

X-lAX

we can assume that the final

a relation

where

X=_X+E

Hence

Generally

satisfies

the

bounds

are all appreciably

when s is of the order of magnitude

This result has been amply confirmed


being found, in general,

from all other eigenvalues

In general

unless )(XI jm/IX-l/ Jm

better

than

2-t maxIAil

of p.

in practice-multiple

eigenvalues

to the same high precision as well-separated

eigen-

values.
REFERESCES
1 F. L. Bauer.

Optimally

scaled matrices, Numer. Math. 8(1963),


7347.
Lower bounds for the rank and location of the eigenvalues of a matrix, Nat. Bureau of Standards A$@. Math. Series 39(1954), 117-130.

2 K. Fan and A. J. Hoffman,

3 S. Gerschgorin, tiber die Abgrenzung der Eigenwerte


Nauk. SSSR,
Ser. f&mat. 6(1931), 749-754.
Linear

Algebra

einer Matrix,

and Its Applications

Izv.

1, l-12

Akad.

(1968)

J. H. WILKINSON

12

4 P. Henrici, On the speed of convergence of cyclic and quasi-cyclic


for computing
6(1968),

eigenvalucs

6 C.

G.

Duke

and H. W. Wielandt,
Math.

J. Jacobi,

J.

20(1953),

real symmetric
11-18.
8 H. P. M. van

X.

of the spectrum

die in der Theorie

of a normal
der Sacular-

9(1966),

of the classical Jacobi


eigenvalues,

On the quadratic

Numer. Math.

4( 1957))

Verfahren,

with non-distinct

Kempken,

method,

The variation

On the convergence

matrices

9 D. il. Pope and C. Tompkins,


10

Math.

Gleichungen numerisch aufzulosen, Crelles J. 30(1846),

7 H. P. M. van Kempken,

Mach.

Jacobi methods

J. Sac. Industr. Appl.

37-38.

Uber ein leichtes

stBrungen vorkommenden
51-94.

Jacobi

matrices,

144-162.

5 A. J. Hoffman
matrix,

of Hermitian

convergence

method for

Numer. Math.

9(1966),

of the special

cyclic

19-22.

Maximizing

functions

of rotations,

J. Ass.

Camp.

459-466.

Schtinhage,

Zur Konvergenz

des Jacobi-Verfahrens,

Numer.

Math. 3(1961),

374380.
11 A. Schonhage,
Math.

12

Zur quadratischen

6(1964),

J. Varah,

Ph. D. Thesis,

13 J. H. Wilkinson,
4(1961),

Konvergenz

Stanford

Numer.

(1967).

Rigorous error bounds for computed

eigensystems,

Computev J.

230-241.

14 J. H. Wilkinson,
Numer. Math.

Xote on the quadratic


7tie Algebraic

16 J. H. Wilkinson,

The QR algorithm

eigenvalues,

Computer

Received September

Algebra

convergence

of the cyclic Jacobi process,

4( 1962)) 296-300.

15 J. H. Wilkinson,

Lilaear

des Jacobi-Verfahrens,

410-412.

14.

Eigenvalue

J. 8(1963),

Problem,

85-87.

1967

and Its Applications

Oxford

for real symmetric

1, l-12

(1968)

Univ. Press, 1965.

matrices with multiple

You might also like