
Lecture Notes 1: Matrix Algebra

Part D: Matrix Diagonalization

Peter J. Hammond

revised 2016 September 6th



Outline

Eigenvalues and Eigenvectors


Real Case
The Complex Case
Eigenvectors are Linearly Independent

Diagonalizing a General Matrix

Diagonalizing a Symmetric Matrix


A Symmetric Matrix has only Real Eigenvalues

Orthogonal Projections and Complements


A Trick Function for Generating Eigenvalues
The Spectral Theorem



Definitions in the Real Case

Consider any n × n matrix A.


The scalar λ ∈ R is an eigenvalue
just in case the equation Ax = λx has a non-zero solution.
In this case any non-zero solution x ∈ Rn \ {0} is an eigenvector,
and the pair (λ, x) is an eigenpair.
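For a concrete numerical check of these definitions, here is a minimal
Python/NumPy sketch (the 2 × 2 matrix is an arbitrary illustration, not
part of the notes):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
    eigenvalues, eigenvectors = np.linalg.eig(A)   # kth column is an eigenvector

    for k in range(len(eigenvalues)):
        lam, x = eigenvalues[k], eigenvectors[:, k]
        assert np.allclose(A @ x, lam * x)         # the defining equation Ax = λx
    print(eigenvalues)                             # approximately [3. 1.]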



The Eigenspace

Given any eigenvalue λ, let Eλ := {x ∈ Rn \ {0} | Ax = λx}
denote the associated set of eigenvectors.
Given any two eigenvectors x, y ∈ Eλ
and any two scalars α, β ∈ R, note that

A(αx + βy) = αAx + βAy = αλx + βλy = λ(αx + βy)

Hence the linear combination αx + βy,
unless it is 0, is also an eigenvector in Eλ.
It follows that the set Eλ ∪ {0} is a linear subspace of Rn
which we call the eigenspace associated with the eigenvalue λ.



Powers of a Matrix
Theorem
Suppose that (λ, x) is an eigenpair of the n × n matrix A.
Then A^m x = λ^m x for all m ∈ N.

Proof.
By definition, Ax = λx.
Premultiplying each side of this equation by the matrix A gives

A^2 x = A(Ax) = A(λx) = λ(Ax) = λ(λx) = λ^2 x

As the induction hypothesis,
suppose that A^{m−1} x = λ^{m−1} x for some m ∈ {2, 3, . . .}.
Premultiplying each side of this last equation by the matrix A gives

A^m x = A(A^{m−1} x) = A(λ^{m−1} x) = λ^{m−1} (Ax) = λ^{m−1} (λx) = λ^m x

This completes the proof by induction on m.
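A quick numerical sanity check of this theorem, again a Python/NumPy
sketch with an arbitrary illustrative matrix:

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
    lam, X = np.linalg.eig(A)
    x = X[:, 0]                                # an eigenvector for eigenvalue lam[0]
    m = 5
    lhs = np.linalg.matrix_power(A, m) @ x     # A^m x
    rhs = lam[0] ** m * x                      # λ^m x
    assert np.allclose(lhs, rhs)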


Characteristic Equation
The equation Ax = λx holds for x ≠ 0
if and only if x ≠ 0 solves (A − λI)x = 0.
This holds iff the matrix A − λI is singular,
which holds iff λ is a root
of the characteristic equation |A − λI| = 0,
or equivalently, a zero of the polynomial |A − λI| of degree n.
Suppose that the equation |A − λI| = 0 has k distinct roots λ1, λ2, . . . , λk
whose multiplicities are respectively m1, m2, . . . , mk.
This means that

    |A − λI| = (−1)^n (λ − λ1)^{m1} (λ − λ2)^{m2} · · · (λ − λk)^{mk}
             = (−1)^n ∏_{j=1}^{k} (λ − λj)^{mj}

The polynomial has degree m1 + m2 + . . . + mk, which equals n.
This implies that k ≤ n,
so there can be at most n distinct eigenvalues.
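One can inspect the characteristic polynomial numerically; the sketch
below relies on NumPy's np.poly, which returns the coefficients of
|λI − A| (illustrative matrix only):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
    coeffs = np.poly(A)          # coefficients of |λI − A|, highest power first
    print(coeffs)                # approximately [ 1. -4.  3.], i.e. λ^2 − 4λ + 3
    roots = np.roots(coeffs)     # the eigenvalues, counting multiplicities
    assert np.allclose(np.sort(roots), np.sort(np.linalg.eigvals(A)))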
Eigenvalues of a 2 × 2 matrix

Consider the 2 × 2 matrix

    A = ( a11  a12
          a21  a22 )

The characteristic equation for its eigenvalues is

    |A − λI| = | a11 − λ   a12     |
               | a21       a22 − λ |  =  0

Evaluating the determinant gives the equation

    0 = (a11 − λ)(a22 − λ) − a12 a21
      = λ^2 − (a11 + a22)λ + (a11 a22 − a12 a21)
      = λ^2 − (tr A)λ + |A| = (λ − λ1)(λ − λ2)

where the two roots λ1 and λ2 of the quadratic equation have:
• a sum λ1 + λ2 equal to the trace tr A of A
  (the sum of its diagonal elements);
• a product λ1 · λ2 equal to the determinant of A.

Let Λ denote the diagonal matrix diag(λ1, λ2)
whose diagonal elements are the eigenvalues.
Note that tr A = tr Λ and |A| = |Λ|.
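These trace and determinant identities are easy to verify numerically
(a sketch; the matrix is arbitrary):

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
    lam = np.linalg.eigvals(A)
    assert np.isclose(lam.sum(), np.trace(A))        # λ1 + λ2 = tr A
    assert np.isclose(lam.prod(), np.linalg.det(A))  # λ1 · λ2 = |A|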
The Case of a Diagonal Matrix, I

For the diagonal matrix D = diag(d1, d2, . . . , dn),
the characteristic equation |D − λI| = 0
takes the degenerate form ∏_{k=1}^{n} (dk − λ) = 0,
so the eigenvalues are the diagonal elements.
The ith component of the vector equation Dx = dk x
takes the form di xi = dk xi,
which has a non-trivial solution if and only if di = dk.
The kth vector ek = (δjk)_{j=1}^{n}
of the canonical orthonormal basis of Rn
always solves the equation Dx = dk x,
and so is an eigenvector associated with the eigenvalue dk.



The Case of a Diagonal Matrix, II

Apart from non-zero multiples of ek,
there are other eigenvectors associated with dk
only if a different element di of the diagonal also equals dk.
In fact, the eigenspace spanned by the eigenvectors
associated with each eigenvalue dk
equals the space spanned by the set {ei | di = dk}
of canonical basis vectors.



Example with No Real Eigenvalues, I
Recall that a 2-dimensional rotation matrix takes the form

    Rθ := ( cos θ   −sin θ
            sin θ    cos θ )

for θ ∈ R, which is the angle of rotation measured in radians.
The rotation Rθ transforms any vector x = (x1, x2) ∈ R2 to

    Rθ x = ( cos θ   −sin θ ) ( x1 )  =  ( x1 cos θ − x2 sin θ )
           ( sin θ    cos θ ) ( x2 )     ( x1 sin θ + x2 cos θ )

Introduce polar coordinates (r, η),
where x = (x1, x2) = r (cos η, sin η). Then

    Rθ x = r ( cos η cos θ − sin η sin θ )  =  r ( cos(η + θ) )
             ( cos η sin θ + sin η cos θ )       ( sin(η + θ) )

This makes it easy to verify that Rθ+2kπ = Rθ for all θ ∈ R
and k ∈ Z, and that Rθ Rη = Rη Rθ = Rθ+η for all θ, η ∈ R.
Example with No Real Eigenvalues, II

The characteristic equation |Rθ − λI| = 0 takes the form

    0 = | cos θ − λ   −sin θ     |
        | sin θ        cos θ − λ |  =  (cos θ − λ)^2 + sin^2 θ  =  1 − 2λ cos θ + λ^2

1. There is a degenerate case when cos θ = 1,
   which happens when θ = 2kπ for some k ∈ Z.
   Then Rθ reduces to the identity matrix I2.
2. Otherwise, the real matrix Rθ has no real eigenvalues.
   Instead, the two roots λ = cos θ ± i sin θ = e^{±iθ}
   of the characteristic equation are complex eigenvalues,
   with associated complex eigenvectors.
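A sketch confirming that the eigenvalues of Rθ are the complex pair
e^{±iθ} (assumes NumPy; the value θ = 0.7 is arbitrary):

    import numpy as np

    theta = 0.7
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    lam = np.linalg.eigvals(R)                       # a complex conjugate pair
    expected = np.array([np.exp(1j * theta), np.exp(-1j * theta)])
    assert np.allclose(np.sort_complex(lam), np.sort_complex(expected))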





Complex Eigenvalues

To consider complex eigenvalues properly,
we need to leave Rn and consider instead the linear space Cn
whose elements are n-vectors with complex coordinates.
That is, we consider a linear space whose field of scalars
is the plane C of complex numbers,
rather than the line R of real numbers.
Suppose A is any n × n matrix
whose elements may be real or complex.
The complex scalar λ ∈ C is an eigenvalue
just in case the equation Ax = λx has a non-zero solution,
in which case that solution x ∈ Cn \ {0} is an eigenvector.



Fundamental Theorem of Algebra
Theorem
Let P(λ) = λ^n + ∑_{k=0}^{n−1} pk λ^k
be a polynomial function of λ of degree n in the complex plane C.
Then there exists at least one root λ̂ ∈ C such that P(λ̂) = 0.

Corollary
The polynomial P(λ) can be factorized
as the product Pn(λ) ≡ ∏_{r=1}^{n} (λ − λr) of exactly n linear terms.

Proof.
The proof will be by induction on n.
When n = 1 one has P1(λ) = λ + p0, whose only root is λ = −p0.
Suppose the result is true when n = m − 1.
By the fundamental theorem of algebra,
there exists λ̂ ∈ C such that Pm(λ̂) = 0.
Polynomial division gives Pm(λ) ≡ Pm−1(λ)(λ − λ̂), etc.
Characteristic Roots as Eigenvalues

Theorem
Every n × n matrix A ∈ Cn×n with complex elements
has exactly n eigenvalues (real or complex)
corresponding to the roots, counting multiple roots,
of the characteristic equation |A − λI| = 0.

Proof.
The characteristic equation can be written in the form Pn(λ) = 0
where Pn(λ) ≡ |λI − A| is a polynomial of degree n.
By the fundamental theorem of algebra, together with its corollary,
the polynomial |λI − A| equals the product ∏_{r=1}^{n} (λ − λr)
of n linear terms.
For any of these roots λr the matrix A − λr I is singular,
so there exists x ≠ 0 such that (A − λr I)x = 0 or Ax = λr x,
implying that λr is an eigenvalue.





Proof by Induction, I
Theorem
Let {λk}_{k=1}^{m} = {λ1, λ2, . . . , λm}
be any collection of m ≤ n distinct eigenvalues.
Then any set {xk}_{k=1}^{m} of associated eigenvectors
must be linearly independent.

The proof will be by induction on m.
Because x1 ≠ 0, the set {x1} is linearly independent.
So the result is evidently true when m = 1.
As the induction hypothesis, suppose the result holds for m − 1.
Suppose that one solution of the equation Ax = λm x,
which may be zero, is the linear combination xm = ∑_{k=1}^{m−1} αk xk
of the preceding m − 1 eigenvectors. Hence

Ax = λm x = ∑_{k=1}^{m−1} αk λm xk



Proof by Induction, II

Next, the hypothesis that {(λk, xk)}_{k=1}^{m−1}
is a collection of eigenpairs implies that

Ax = ∑_{k=1}^{m−1} αk A xk = ∑_{k=1}^{m−1} αk λk xk

Subtracting this equation from the prior equation

Ax = λm x = ∑_{k=1}^{m−1} αk λm xk

gives

0 = ∑_{k=1}^{m−1} αk (λm − λk) xk



Proof by Induction, III

So we have

0 = ∑_{k=1}^{m−1} αk (λm − λk) xk

The induction hypothesis is that {xk}_{k=1}^{m−1} is linearly independent.
It follows that αk (λm − λk) = 0 for k = 1, . . . , m − 1.
Hence
either λm = λk for some k = 1, . . . , m − 1,
in which case the m eigenvalues are not distinct;
or else αk = 0 for k = 1, . . . , m − 1.
But then the hypothesis that xm = ∑_{k=1}^{m−1} αk xk
implies that xm = 0, so xm is not an eigenvector.
This completes the proof by induction.



Diagonalization
Definition
An n × n matrix A is said to be diagonalizable
just in case there exists an invertible diagonalizing matrix S
and a diagonal matrix Λ = diag(λ1 , λ2 , . . . , λn )
such that each of the following equivalent statements holds:

A = SΛS^{−1} ⇐⇒ AS = SΛ ⇐⇒ S^{−1}AS = Λ

Theorem
Given any n × n matrix A:
1. The columns of any matrix S that diagonalizes A
must consist of n linearly independent eigenvectors of A.
2. The matrix A is diagonalizable if and only if
it has a set of n linearly independent eigenvectors.
3. The matrix A and its diagonalization Λ = S^{−1}AS
have the same set of eigenvalues.
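The following Python/NumPy sketch illustrates these equivalences: the
columns of the matrix S returned by np.linalg.eig are eigenvectors,
and they diagonalize A (illustrative matrix with distinct eigenvalues
5 and 2, so diagonalizability is guaranteed):

    import numpy as np

    A = np.array([[4.0, 1.0],
                  [2.0, 3.0]])                   # eigenvalues 5 and 2
    lam, S = np.linalg.eig(A)                    # columns of S are eigenvectors
    Lambda = np.diag(lam)
    assert np.allclose(A @ S, S @ Lambda)                  # AS = SΛ
    assert np.allclose(np.linalg.inv(S) @ A @ S, Lambda)   # S^{−1}AS = Λ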
Proof of Part 1
Suppose that AS = SΛ where A = (aij)n×n, S = (sij)n×n,
and Λ = diag(λ1, λ2, . . . , λn).
Then for each i, k ∈ {1, 2, . . . , n},
equating the elements in row i and column k
of the equal matrices AS and SΛ implies that

∑_{j=1}^{n} aij sjk = ∑_{j=1}^{n} sij δjk λk = sik λk

It follows that A sk = λk sk
where sk = (sik)_{i=1}^{n} denotes the kth column of the matrix S.
Because S must be invertible:
• each column sk must be non-zero, so an eigenvector of A;
• the set of all these n columns must be linearly independent.



Proof of Part 2

By part 1, if the diagonalizing matrix S exists,
its columns must form a set of n linearly independent eigenvectors
for the matrix A.
Conversely, suppose that A does have a set {x1, x2, . . . , xn}
of n linearly independent eigenvectors,
with Axk = λk xk for k = 1, 2, . . . , n.
Now define S as the n × n matrix whose kth column
is the eigenvector xk, for each k = 1, 2, . . . , n.
Then it is easy to check that AS = SΛ
where Λ = diag(λ1, λ2, . . . , λn).



Proof of Part 3

Suppose that λ is an eigenvalue of A
with corresponding eigenvector x ≠ 0.
Then x solves Ax = λx = SΛS^{−1}x.
Premultiplying each side of the equation SΛS^{−1}x = λx by S^{−1},
it follows that y := S^{−1}x solves Λy = λy.
Moreover, because S^{−1} has the inverse S,
the equation S^{−1}x = 0 has only the trivial solution.
Hence x ≠ 0 implies that y ≠ 0,
implying that (λ, y) is an eigenpair of Λ.
Essentially the same argument shows that
if (λ, y) is an eigenpair of Λ, then (λ, Sy) is an eigenpair of A.



A Non-Diagonalizable 2 × 2 Matrix

Example
The non-symmetric matrix A = ( 0  1
                               0  0 ) cannot be diagonalized.
Its characteristic equation is

    0 = |A − λI| = | −λ    1 |
                   |  0   −λ |  =  λ^2

It follows that λ = 0 is the unique eigenvalue.
The eigenvalue equation is

    0 ( x1 )   ( 0  1 ) ( x1 )   ( x2 )
      ( x2 ) = ( 0  0 ) ( x2 ) = ( 0  )

or x2 = 0, whose only solutions take the form x1 (1, 0)⊤.
Thus, every eigenvector is a non-zero multiple
of the column vector (1, 0)⊤.
This makes it impossible to find
any set of two linearly independent eigenvectors.
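Numerically, the defect shows up as a singular S: both eigenvector
columns that NumPy returns are, up to rounding, multiples of (1, 0)⊤.
A sketch:

    import numpy as np

    A = np.array([[0.0, 1.0],
                  [0.0, 0.0]])
    lam, S = np.linalg.eig(A)
    print(lam)                          # [0. 0.]: the repeated eigenvalue
    # Both columns of S are multiples of (1, 0)^T up to rounding,
    # so S is singular and cannot serve as a diagonalizing matrix.
    print(np.linalg.matrix_rank(S))     # 1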



A Non-Diagonalizable n × n Matrix: Specification
The following n × n matrix also has a unique eigenvalue,
whose eigenspace is of dimension 1.

Example
Consider the non-symmetric n × n matrix A
whose elements in the first n − 1 rows satisfy aij = δi,j−1
for i = 1, 2, . . . , n − 1, but whose last row is 0⊤.
Such a matrix is upper triangular, and takes the special form

        ( 0  1  0  . . .  0 )
        ( 0  0  1  . . .  0 )
    A = ( .  .  .  . . .  . )  =  ( 0   In−1 )
        ( 0  0  0  . . .  1 )     ( 0   0⊤   )
        ( 0  0  0  . . .  0 )

in which the elements in the first n − 1 rows and last n − 1 columns
make up the identity matrix.
A Non-Diagonalizable n × n Matrix: Analysis
Because A − λI is also upper triangular,
its characteristic equation is 0 = |A − λI| = (−λ)^n.
This has λ = 0 as an n-fold repeated root.
So λ = 0 is the unique eigenvalue.
The eigenvalue equation Ax = λx with λ = 0
takes the form Ax = 0 or

0 = ∑_{j=1}^{n} δi,j−1 xj = xi+1   (i = 1, 2, . . . , n − 1)

with an extra nth equation of the form 0 = 0.
The only solutions take the form xj = 0 for j = 2, . . . , n,
with x1 arbitrary.
So all the eigenvectors of A
are non-zero multiples of e1 = (1, 0, . . . , 0)⊤,
implying that there is just one eigenspace, which has dimension 1.


Complex Conjugates and Adjoint Matrices
Recall that any complex number c ∈ C can be expressed as a + ib
with a ∈ R as the real part and b ∈ R as the imaginary part.
The complex conjugate of c is c̄ = a − ib.
Note that c c̄ = c̄ c = (a + ib)(a − ib) = a^2 + b^2 = |c|^2,
where |c| is the modulus of c.
Any m × n complex matrix C = (cij)m×n ∈ Cm×n
can be written as A + iB, where A and B are real m × n matrices.
The adjoint of the m × n complex matrix C = A + iB
is the n × m complex matrix C∗ := (A − iB)⊤ = A⊤ − iB⊤.
This is the transpose of the matrix A − iB whose elements are the
complex conjugates c̄jk of the corresponding elements of C.
That is, each element of C∗ is given by c∗jk = akj − bkj i.
In the case of a real matrix A whose imaginary part is 0,
its adjoint is simply the transpose A⊤.
Self-Adjoint and Symmetric Matrices

An n × n complex matrix C = A + iB is self-adjoint
just in case C∗ = C, which holds if and only if A⊤ − iB⊤ = A + iB,
and so if and only if:
• the real part A is symmetric;
• the imaginary part B is anti-symmetric
  in the sense that B⊤ = −B.
Of course, a real matrix is self-adjoint if and only if it is symmetric.

Theorem
Any eigenvalue of a self-adjoint complex matrix is a real scalar.
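A numerical illustration of the theorem (an arbitrary 2 × 2
self-adjoint matrix; assumes NumPy):

    import numpy as np

    C = np.array([[2.0 + 0.0j, 1.0 - 1.0j],
                  [1.0 + 1.0j, 3.0 + 0.0j]])
    assert np.allclose(C, C.conj().T)   # self-adjoint: C* = C
    lam = np.linalg.eigvals(C)
    assert np.allclose(lam.imag, 0.0)   # the eigenvalues (4 and 1) are real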



Proof that Eigenvalues are Real
Suppose that the scalar λ ∈ C and vector x ∈ Cn together satisfy
the eigenvalue equation Ax = λx for any A ∈ Cn×n.
Taking complex conjugates throughout, one has λ̄x∗ = x∗A∗.
By the associative law of complex matrix multiplication,
one has x∗Ax = x∗(Ax) = x∗(λx) = λ(x∗x),
as well as x∗A∗x = (x∗A∗)x = (λ̄x∗)x = λ̄(x∗x).
In case A is self-adjoint and so A∗ = A,
subtracting the second equation from the first
gives x∗Ax − x∗A∗x = x∗(A − A∗)x = 0 = (λ − λ̄)(x∗x).
But in case x is an eigenvector, one has x = (xi)_{i=1}^{n} ∈ Cn \ {0},
and so x∗x = ∑_{i=1}^{n} |xi|^2 > 0.
Because 0 = (λ − λ̄)(x∗x),
it follows that the eigenvalue λ satisfies λ − λ̄ = 0,
implying that λ is real.
Orthogonal Projections
Definition
An n × n matrix P is an orthogonal projection if P^2 = P
and u⊤v = 0 whenever Pv = 0 and u = Px for some x ∈ Rn.

Theorem
Suppose that the n × m matrix X has full rank m < n.
Let L ⊂ Rn be the linear subspace spanned
by the m linearly independent columns of X.
Define the n × n matrix P := X(X⊤X)^{−1}X⊤. Then:
1. The matrix P is a symmetric orthogonal projection onto L.
2. The matrix I − P is a symmetric orthogonal projection
   onto the orthogonal complement L⊥ of L.
3. For each vector y ∈ Rn, its orthogonal projection onto L
   is the unique vector v = Py
   that minimizes the distance ‖y − v‖ between y and L
   — i.e., ‖y − v‖ ≤ ‖y − u‖ for all u ∈ L.
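A sketch of parts 1 and 2 with a random full-rank X (assumes NumPy;
the dimensions n = 5, m = 2 are arbitrary):

    import numpy as np

    rng = np.random.default_rng(seed=0)
    n, m = 5, 2
    X = rng.standard_normal((n, m))        # full rank m < n with probability 1
    P = X @ np.linalg.inv(X.T @ X) @ X.T

    assert np.allclose(P, P.T)             # P is symmetric
    assert np.allclose(P @ P, P)           # P is idempotent: P^2 = P
    y = rng.standard_normal(n)
    assert np.allclose(X.T @ (y - P @ y), 0.0)   # residual y − Py is orthogonal to L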
Proof of Part 1

Because of the rules for the transposes of products and inverses,
the definition P := X(X⊤X)^{−1}X⊤ implies that P⊤ = P and also

P^2 = X(X⊤X)^{−1}X⊤X(X⊤X)^{−1}X⊤ = X(X⊤X)^{−1}X⊤ = P

Moreover, if Pv = 0 and u = Px for some x ∈ Rn, then

u⊤v = x⊤P⊤v = x⊤Pv = 0

Finally, for every y ∈ Rn, the vector Py equals Xb, where

b = (X⊤X)^{−1}X⊤y

Hence Py ∈ L.



Proof of Part 2
Evidently (I − P)⊤ = I − P⊤ = I − P, and

(I − P)^2 = I − 2P + P^2 = I − 2P + P = I − P

Hence I − P is a projection.
This projection is also orthogonal because if (I − P)v = 0
and u = (I − P)x for some x ∈ Rn, then

u⊤v = x⊤(I − P)⊤v = x⊤(I − P)v = 0

Next, suppose that v = Xb ∈ L and that y = (I − P)x
belongs to the range of (I − P). Then, because PX = X(X⊤X)^{−1}X⊤X = X,

y⊤v = x⊤(I − P)⊤Xb = x⊤Xb − x⊤PXb = x⊤Xb − x⊤Xb = 0

so y ∈ L⊥.



Proof of Part 3
For any vector v = Xb ∈ L and y ∈ Rn, one has

‖y − v‖^2 = (y − Xb)⊤(y − Xb) = y⊤y − 2y⊤Xb + b⊤X⊤Xb

Now define b̂ := (X⊤X)^{−1}X⊤y (which is the OLS estimator of b
in the linear regression equation y = Xb + e).
Also, define v̂ := Xb̂ = Py. Because P⊤P = P^2 = P,

‖y − v‖^2 = y⊤y − 2y⊤Xb + b⊤X⊤Xb
          = (b − b̂)⊤X⊤X(b − b̂) + y⊤y − b̂⊤X⊤Xb̂
          = ‖v − v̂‖^2 + y⊤y − y⊤P⊤Py = ‖v − v̂‖^2 + y⊤y − y⊤Py

On the other hand, given that v̂ = Py, one also has

‖y − v̂‖^2 = y⊤y − 2y⊤v̂ + v̂⊤v̂
          = y⊤y − 2y⊤Py + y⊤P⊤Py = y⊤y − y⊤Py

So ‖y − v‖^2 − ‖y − v̂‖^2 = ‖v − v̂‖^2 ≥ 0, with equality iff v = v̂.
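The OLS connection in this proof can be checked directly: the
projection Py coincides with the fitted values Xb̂ from least squares.
A sketch (np.linalg.lstsq computes the least-squares solution):

    import numpy as np

    rng = np.random.default_rng(seed=1)
    X = rng.standard_normal((6, 2))
    y = rng.standard_normal(6)
    P = X @ np.linalg.inv(X.T @ X) @ X.T
    b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # OLS estimate of b
    assert np.allclose(P @ y, X @ b_hat)            # fitted values Xb̂ equal Py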


A Trick Function for Generating Eigenvalues
For all x ≠ 0, define the function, homogeneous of degree zero,

    Rn \ {0} ∋ x ↦ f(x) := x⊤Ax / x⊤x = ∑_{i=1}^{n} ∑_{j=1}^{n} xi aij xj / ∑_{i=1}^{n} xi^2

which is left undefined at x = 0.
The partial derivative w.r.t. any component xh of the vector x is

    ∂f/∂xh = (2 / (x⊤x)^2) [ (∑_{j=1}^{n} ahj xj)(x⊤x) − (x⊤Ax) xh ]

At any stationary point x̂ ≠ 0 where ∂f/∂xh = 0 for all h,
one therefore has (x̂⊤x̂)Ax̂ = (x̂⊤Ax̂)x̂,
and so Ax̂ = λx̂ where λ = f(x̂).
That is, a stationary point x̂ ≠ 0 must be an eigenvector,
with the corresponding function value f(x̂)
as the associated eigenvalue.
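This "trick" quotient is usually called the Rayleigh quotient. The
sketch below samples f at many random points of a symmetric 2 × 2
example: all sampled values lie between the extreme eigenvalues, which
are the extrema of f (a small tolerance is added for rounding):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])             # symmetric

    def f(x):
        return (x @ A @ x) / (x @ x)       # the trick (Rayleigh) quotient

    rng = np.random.default_rng(seed=2)
    values = [f(rng.standard_normal(2)) for _ in range(10_000)]
    lam_min, lam_max = np.linalg.eigvalsh(A)   # ascending eigenvalues of symmetric A
    assert lam_min - 1e-9 <= min(values) and max(values) <= lam_max + 1e-9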
More Properties of the Trick Function

Using this trick function

    Rn \ {0} ∋ x ↦ f(x) := x⊤Ax / x⊤x

one can state and prove the following lemma.

Lemma
Every symmetric n × n matrix A:
1. has a maximum eigenvalue λ^* at an eigenvector x^*
   where f attains its maximum;
2. has a minimum eigenvalue λ_* at an eigenvector x_*
   where f attains its minimum;
3. satisfies A = λI if and only if λ^* = λ_* = λ.



Proof of Parts 1 and 2

The unit sphere S^{n−1} is a compact subset of Rn,
and the function f restricted to S^{n−1} is continuous.
By the extreme value theorem, f restricted to S^{n−1} must have:
• a maximum value λ^* attained at some point x^*;
• a minimum value λ_* attained at some point x_*.
Because f is homogeneous of degree zero,
these are the maximum and minimum values of f
over the whole domain Rn \ {0}.
In particular, f must be stationary at any maximum point x^*,
as well as at any minimum point x_*.
But stationary points must be eigenvectors.
This proves parts 1 and 2 of the lemma.





A Useful Lemma

Lemma
Let A be a symmetric n × n matrix.
Suppose that there are m < n eigenvectors {uk}_{k=1}^{m}
which form an orthonormal set of vectors,
as well as the columns of an n × m matrix U.
Then there is at least one more eigenvector x
that satisfies U⊤x = 0
— i.e., it is orthogonal to each of the m eigenvectors uk.



Constructive Proof I
For each eigenvector uk, let λk be the associated eigenvalue,
so that Auk = λk uk for k = 1, 2, . . . , m.
Then AU = UΛ where Λ := diag(λk)_{k=1}^{m}.
Also, because the eigenvectors {uk}_{k=1}^{m} form an orthonormal set,
one has U⊤U = Im. Hence U⊤AU = U⊤UΛ = Λ.
Also, transposing AU = UΛ and using A⊤ = A gives U⊤A = ΛU⊤.
Consider now the n × n matrix Â := (I − UU⊤)A(I − UU⊤), which
is symmetric because both A and UU⊤ are symmetric. Note that

Â = A − UU⊤A − AUU⊤ + UU⊤AUU⊤
  = A − UΛU⊤ − UΛU⊤ + UΛU⊤ = A − UΛU⊤

This matrix Â has at least one eigenvalue λ, which must be real,
and an associated eigenvector x ≠ 0, which together satisfy

Âx = (I − UU⊤)A(I − UU⊤)x = (A − UΛU⊤)x = λx

Premultiplying each side of the last equation by U⊤ shows that

λU⊤x = U⊤Ax − U⊤UΛU⊤x = ΛU⊤x − ΛU⊤x = 0
Constructive Proof II
This leaves two possible cases:
1. In the exceptional case when the only eigenvalue
   of the symmetric matrix Â is λ = 0,
   one has Â = 0 and so A = UΛU⊤.
   Then any vector x ≠ 0 satisfying U⊤x = 0
   must satisfy Ax = UΛU⊤x = 0,
   implying that x is an eigenvector of A.
2. Otherwise, in the generic case
   when Â has at least one eigenvalue λ ≠ 0,
   there is a corresponding eigenvector x ≠ 0 of Â
   that satisfies U⊤x = 0. But then this implies that

   Ax = (Â + UΛU⊤)x = Âx = λx

   Hence x is an eigenvector of A.
In both cases there is an eigenvector x of A satisfying U⊤x = 0.
Spectral Theorem

Theorem
Given any symmetric n × n matrix A:
1. its eigenvectors span the whole of Rn ;
2. there exists an orthogonal matrix P that diagonalizes A.



Proof of Spectral Theorem
The matrix A has at least one eigenvalue, which must be real.
The associated eigenvector x, normalized to satisfy x⊤x = 1,
forms an orthonormal set {u1}.
As the induction hypothesis,
suppose that there are m < n eigenvectors {uk}_{k=1}^{m}
which form an orthonormal set of vectors.
We have just proved that this hypothesis holds for m = 1.
The lemma shows that, if it holds for any m = 1, 2, . . . , n − 1,
then it holds for m + 1
(after the new eigenvector has been normalized).
The result follows by induction.
In particular, when m = n, there exists an orthonormal set
of n eigenvectors, which must then span the whole of Rn.
Also, by the previous result, taking P as an orthogonal matrix
whose columns are an orthonormal set of n eigenvectors
implies that P⊤AP = Λ, where the elements
of the diagonal matrix Λ constitute its set of eigenvalues.
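Numerically, np.linalg.eigh computes exactly this decomposition for
symmetric input: it returns a (numerically) orthogonal P with
P⊤AP = Λ. A closing sketch with an illustrative 3 × 3 symmetric
matrix:

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])             # symmetric
    lam, P = np.linalg.eigh(A)                  # eigh is for symmetric/Hermitian input
    assert np.allclose(P.T @ P, np.eye(3))      # columns of P are orthonormal
    assert np.allclose(P.T @ A @ P, np.diag(lam))   # P⊤AP = Λ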
