
Lecture Notes 1: Matrix Algebra

Part D: Matrix Diagonalization

Peter J. Hammond

revised 2016 September 6th



Outline

Eigenvalues and Eigenvectors


Real Case
The Complex Case
Eigenvectors are Linearly Independent

Diagonalizing a General Matrix

Diagonalizing a Symmetric Matrix


A Symmetric Matrix has only Real Eigenvalues

Orthogonal Projections and Complements


A Trick Function for Generating Eigenvalues
The Spectral Theorem



Definitions in the Real Case

Consider any n × n matrix A.


The scalar λ ∈ R is an eigenvalue
just in case the equation Ax = λx has a non-zero solution.
In this case any non-zero solution x ∈ Rn \ {0} is an eigenvector,
and the pair (λ, x) is an eigenpair.
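For a concrete numerical check of these definitions, here is a minimal
Python/NumPy sketch (the 2 × 2 matrix is an arbitrary illustration, not
part of the notes):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
    eigenvalues, eigenvectors = np.linalg.eig(A)   # kth column is an eigenvector

    for k in range(len(eigenvalues)):
        lam, x = eigenvalues[k], eigenvectors[:, k]
        assert np.allclose(A @ x, lam * x)         # the defining equation Ax = λx
    print(eigenvalues)                             # approximately [3. 1.]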



The Eigenspace

Given any eigenvalue λ, let Eλ := {x ∈ Rn \ {0} | Ax = λx}
denote the associated set of eigenvectors.
Given any two eigenvectors x, y ∈ Eλ
and any two scalars α, β ∈ R, note that

A(αx + βy) = αAx + βAy = αλx + βλy = λ(αx + βy)

Hence the linear combination αx + βy,
unless it is 0, is also an eigenvector in Eλ.
It follows that the set Eλ ∪ {0} is a linear subspace of Rn
which we call the eigenspace associated with the eigenvalue λ.



Powers of a Matrix
Theorem
Suppose that (λ, x) is an eigenpair of the n × n matrix A.
Then A^m x = λ^m x for all m ∈ N.

Proof.
By definition, Ax = λx.
Premultiplying each side of this equation by the matrix A gives

A^2 x = A(Ax) = A(λx) = λ(Ax) = λ(λx) = λ^2 x

As the induction hypothesis,
suppose that A^{m−1} x = λ^{m−1} x for some m ∈ {2, 3, . . .}.
Premultiplying each side of this last equation by the matrix A gives

A^m x = A(A^{m−1} x) = A(λ^{m−1} x) = λ^{m−1} (Ax) = λ^{m−1} (λx) = λ^m x

This completes the proof by induction on m.
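A quick numerical sanity check of this theorem, again a Python/NumPy
sketch with an arbitrary illustrative matrix:

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
    lam, X = np.linalg.eig(A)
    x = X[:, 0]                                # an eigenvector for eigenvalue lam[0]
    m = 5
    lhs = np.linalg.matrix_power(A, m) @ x     # A^m x
    rhs = lam[0] ** m * x                      # λ^m x
    assert np.allclose(lhs, rhs)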


Characteristic Equation
The equation Ax = λx holds for x ≠ 0
if and only if x ≠ 0 solves (A − λI)x = 0.
This holds iff the matrix A − λI is singular,
which holds iff λ is a root
of the characteristic equation |A − λI| = 0,
or equivalently, a zero of the polynomial |A − λI| of degree n.
Suppose that the equation |A − λI| = 0 has k distinct roots λ1, λ2, . . . , λk
whose multiplicities are respectively m1, m2, . . . , mk.
This means that

    |A − λI| = (−1)^n (λ − λ1)^{m1} (λ − λ2)^{m2} · · · (λ − λk)^{mk}
             = (−1)^n ∏_{j=1}^{k} (λ − λj)^{mj}

The polynomial has degree m1 + m2 + . . . + mk, which equals n.
This implies that k ≤ n,
so there can be at most n distinct eigenvalues.
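One can inspect the characteristic polynomial numerically; the sketch
below relies on NumPy's np.poly, which returns the coefficients of
|λI − A| (illustrative matrix only):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
    coeffs = np.poly(A)          # coefficients of |λI − A|, highest power first
    print(coeffs)                # approximately [ 1. -4.  3.], i.e. λ^2 − 4λ + 3
    roots = np.roots(coeffs)     # the eigenvalues, counting multiplicities
    assert np.allclose(np.sort(roots), np.sort(np.linalg.eigvals(A)))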
Eigenvalues of a 2 × 2 matrix

Consider the 2 × 2 matrix

    A = ( a11  a12
          a21  a22 )

The characteristic equation for its eigenvalues is

    |A − λI| = | a11 − λ   a12     |
               | a21       a22 − λ |  =  0

Evaluating the determinant gives the equation

    0 = (a11 − λ)(a22 − λ) − a12 a21
      = λ^2 − (a11 + a22)λ + (a11 a22 − a12 a21)
      = λ^2 − (tr A)λ + |A| = (λ − λ1)(λ − λ2)

where the two roots λ1 and λ2 of the quadratic equation have:
• a sum λ1 + λ2 equal to the trace tr A of A
  (the sum of its diagonal elements);
• a product λ1 · λ2 equal to the determinant of A.

Let Λ denote the diagonal matrix diag(λ1, λ2)
whose diagonal elements are the eigenvalues.
Note that tr A = tr Λ and |A| = |Λ|.
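These trace and determinant identities are easy to verify numerically
(a sketch; the matrix is arbitrary):

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
    lam = np.linalg.eigvals(A)
    assert np.isclose(lam.sum(), np.trace(A))        # λ1 + λ2 = tr A
    assert np.isclose(lam.prod(), np.linalg.det(A))  # λ1 · λ2 = |A|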
The Case of a Diagonal Matrix, I

For the diagonal matrix D = diag(d1, d2, . . . , dn),
the characteristic equation |D − λI| = 0
takes the degenerate form ∏_{k=1}^{n} (dk − λ) = 0,
so the eigenvalues are the diagonal elements.
The ith component of the vector equation Dx = dk x
takes the form di xi = dk xi,
which has a non-trivial solution if and only if di = dk.
The kth vector ek = (δjk)_{j=1}^{n}
of the canonical orthonormal basis of Rn
always solves the equation Dx = dk x,
and so is an eigenvector associated with the eigenvalue dk.



The Case of a Diagonal Matrix, II

Apart from non-zero multiples of ek,
there are other eigenvectors associated with dk
only if a different element di of the diagonal also equals dk.
In fact, the eigenspace spanned by the eigenvectors
associated with each eigenvalue dk
equals the space spanned by the set {ei | di = dk}
of canonical basis vectors.



Example with No Real Eigenvalues, I
Recall that a 2-dimensional rotation matrix takes the form

    Rθ := ( cos θ   −sin θ
            sin θ    cos θ )

for θ ∈ R, which is the angle of rotation measured in radians.
The rotation Rθ transforms any vector x = (x1, x2) ∈ R2 to

    Rθ x = ( cos θ   −sin θ ) ( x1 )  =  ( x1 cos θ − x2 sin θ )
           ( sin θ    cos θ ) ( x2 )     ( x1 sin θ + x2 cos θ )

Introduce polar coordinates (r, η),
where x = (x1, x2) = r (cos η, sin η). Then

    Rθ x = r ( cos η cos θ − sin η sin θ )  =  r ( cos(η + θ) )
             ( cos η sin θ + sin η cos θ )       ( sin(η + θ) )

This makes it easy to verify that Rθ+2kπ = Rθ for all θ ∈ R
and k ∈ Z, and that Rθ Rη = Rη Rθ = Rθ+η for all θ, η ∈ R.
Example with No Real Eigenvalues, II

The characteristic equation |Rθ − λI| = 0 takes the form

    0 = | cos θ − λ   −sin θ     |
        | sin θ        cos θ − λ |  =  (cos θ − λ)^2 + sin^2 θ  =  1 − 2λ cos θ + λ^2

1. There is a degenerate case when cos θ = 1,
   which happens when θ = 2kπ for some k ∈ Z.
   Then Rθ reduces to the identity matrix I2.
2. Otherwise, the real matrix Rθ has no real eigenvalues.
   Instead, the two roots λ = cos θ ± i sin θ = e^{±iθ}
   of the characteristic equation are complex eigenvalues,
   with associated complex eigenvectors.
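A sketch confirming that the eigenvalues of Rθ are the complex pair
e^{±iθ} (assumes NumPy; the value θ = 0.7 is arbitrary):

    import numpy as np

    theta = 0.7
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    lam = np.linalg.eigvals(R)                       # a complex conjugate pair
    expected = np.array([np.exp(1j * theta), np.exp(-1j * theta)])
    assert np.allclose(np.sort_complex(lam), np.sort_complex(expected))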





Complex Eigenvalues

To consider complex eigenvalues properly,
we need to leave Rn and consider instead the linear space Cn
whose elements are n-vectors with complex coordinates.
That is, we consider a linear space whose field of scalars
is the plane C of complex numbers,
rather than the line R of real numbers.
Suppose A is any n × n matrix
whose elements may be real or complex.
The complex scalar λ ∈ C is an eigenvalue
just in case the equation Ax = λx has a non-zero solution,
in which case that solution x ∈ Cn \ {0} is an eigenvector.



Fundamental Theorem of Algebra
Theorem
Let P(λ) = λ^n + ∑_{k=0}^{n−1} pk λ^k
be a polynomial function of λ of degree n in the complex plane C.
Then there exists at least one root λ̂ ∈ C such that P(λ̂) = 0.

Corollary
The polynomial P(λ) can be factorized
as the product Pn(λ) ≡ ∏_{r=1}^{n} (λ − λr) of exactly n linear terms.

Proof.
The proof will be by induction on n.
When n = 1 one has P1(λ) = λ + p0, whose only root is λ = −p0.
Suppose the result is true when n = m − 1.
By the fundamental theorem of algebra,
there exists λ̂ ∈ C such that Pm(λ̂) = 0.
Polynomial division gives Pm(λ) ≡ Pm−1(λ)(λ − λ̂), etc.
Characteristic Roots as Eigenvalues

Theorem
Every n × n matrix A ∈ Cn×n with complex elements
has exactly n eigenvalues (real or complex)
corresponding to the roots, counting multiple roots,
of the characteristic equation |A − λI| = 0.

Proof.
The characteristic equation can be written in the form Pn(λ) = 0
where Pn(λ) ≡ |λI − A| is a polynomial of degree n.
By the fundamental theorem of algebra, together with its corollary,
the polynomial |λI − A| equals the product ∏_{r=1}^{n} (λ − λr)
of n linear terms.
For any of these roots λr the matrix A − λr I is singular,
so there exists x ≠ 0 such that (A − λr I)x = 0 or Ax = λr x,
implying that λr is an eigenvalue.





Proof by Induction, I
Theorem
Let {λk}_{k=1}^{m} = {λ1, λ2, . . . , λm}
be any collection of m ≤ n distinct eigenvalues.
Then any set {xk}_{k=1}^{m} of associated eigenvectors
must be linearly independent.

The proof will be by induction on m.
Because x1 ≠ 0, the set {x1} is linearly independent.
So the result is evidently true when m = 1.
As the induction hypothesis, suppose the result holds for m − 1.
Suppose that one solution of the equation Ax = λm x,
which may be zero, is the linear combination xm = ∑_{k=1}^{m−1} αk xk
of the preceding m − 1 eigenvectors. Hence

Ax = λm x = ∑_{k=1}^{m−1} αk λm xk



Proof by Induction, II

Next, the hypothesis that {(λk, xk)}_{k=1}^{m−1}
is a collection of eigenpairs implies that

Ax = ∑_{k=1}^{m−1} αk A xk = ∑_{k=1}^{m−1} αk λk xk

Subtracting this equation from the prior equation

Ax = λm x = ∑_{k=1}^{m−1} αk λm xk

gives

0 = ∑_{k=1}^{m−1} αk (λm − λk) xk



Proof by Induction, III

So we have

0 = ∑_{k=1}^{m−1} αk (λm − λk) xk

The induction hypothesis is that {xk}_{k=1}^{m−1} is linearly independent.
It follows that αk (λm − λk) = 0 for k = 1, . . . , m − 1.
Hence
either λm = λk for some k = 1, . . . , m − 1,
in which case the m eigenvalues are not distinct;
or else αk = 0 for k = 1, . . . , m − 1.
But then the hypothesis that xm = ∑_{k=1}^{m−1} αk xk
implies that xm = 0, so xm is not an eigenvector.
This completes the proof by induction.



Diagonalization
Definition
An n × n matrix A is said to be diagonalizable
just in case there exists an invertible diagonalizing matrix S
and a diagonal matrix Λ = diag(λ1 , λ2 , . . . , λn )
such that each of the following equivalent statements holds:

A = SΛS^{−1} ⇐⇒ AS = SΛ ⇐⇒ S^{−1}AS = Λ

Theorem
Given any n × n matrix A:
1. The columns of any matrix S that diagonalizes A
must consist of n linearly independent eigenvectors of A.
2. The matrix A is diagonalizable if and only if
it has a set of n linearly independent eigenvectors.
3. The matrix A and its diagonalization Λ = S^{−1}AS
have the same set of eigenvalues.
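The following Python/NumPy sketch illustrates these equivalences: the
columns of the matrix S returned by np.linalg.eig are eigenvectors,
and they diagonalize A (illustrative matrix with distinct eigenvalues
5 and 2, so diagonalizability is guaranteed):

    import numpy as np

    A = np.array([[4.0, 1.0],
                  [2.0, 3.0]])                   # eigenvalues 5 and 2
    lam, S = np.linalg.eig(A)                    # columns of S are eigenvectors
    Lambda = np.diag(lam)
    assert np.allclose(A @ S, S @ Lambda)                  # AS = SΛ
    assert np.allclose(np.linalg.inv(S) @ A @ S, Lambda)   # S^{−1}AS = Λ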
Proof of Part 1
Suppose that AS = SΛ where A = (aij)n×n, S = (sij)n×n,
and Λ = diag(λ1, λ2, . . . , λn).
Then for each i, k ∈ {1, 2, . . . , n},
equating the elements in row i and column k
of the equal matrices AS and SΛ implies that

∑_{j=1}^{n} aij sjk = ∑_{j=1}^{n} sij δjk λk = sik λk

It follows that A sk = λk sk
where sk = (sik)_{i=1}^{n} denotes the kth column of the matrix S.
Because S must be invertible:
• each column sk must be non-zero, so an eigenvector of A;
• the set of all these n columns must be linearly independent.



Proof of Part 2

By part 1, if the diagonalizing matrix S exists,
its columns must form a set of n linearly independent eigenvectors
for the matrix A.
Conversely, suppose that A does have a set {x1, x2, . . . , xn}
of n linearly independent eigenvectors,
with Axk = λk xk for k = 1, 2, . . . , n.
Now define S as the n × n matrix whose kth column
is the eigenvector xk, for each k = 1, 2, . . . , n.
Then it is easy to check that AS = SΛ
where Λ = diag(λ1, λ2, . . . , λn).



Proof of Part 3

Suppose that λ is an eigenvalue of A
with corresponding eigenvector x ≠ 0.
Then x solves Ax = λx = SΛS^{−1}x.
Premultiplying each side of the equation SΛS^{−1}x = λx by S^{−1},
it follows that y := S^{−1}x solves Λy = λy.
Moreover, because S^{−1} has the inverse S,
the equation S^{−1}x = 0 has only the trivial solution.
Hence x ≠ 0 implies that y ≠ 0,
implying that (λ, y) is an eigenpair of Λ.
Essentially the same argument shows that
if (λ, y) is an eigenpair of Λ, then (λ, Sy) is an eigenpair of A.



A Non-Diagonalizable 2 × 2 Matrix

Example
The non-symmetric matrix A = ( 0  1
                               0  0 ) cannot be diagonalized.
Its characteristic equation is

    0 = |A − λI| = | −λ    1 |
                   |  0   −λ |  =  λ^2

It follows that λ = 0 is the unique eigenvalue.
The eigenvalue equation is

    0 ( x1 )   ( 0  1 ) ( x1 )   ( x2 )
      ( x2 ) = ( 0  0 ) ( x2 ) = ( 0  )

or x2 = 0, whose only solutions take the form x1 (1, 0)⊤.
Thus, every eigenvector is a non-zero multiple
of the column vector (1, 0)⊤.
This makes it impossible to find
any set of two linearly independent eigenvectors.
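Numerically, the defect shows up as a singular S: both eigenvector
columns that NumPy returns are, up to rounding, multiples of (1, 0)⊤.
A sketch:

    import numpy as np

    A = np.array([[0.0, 1.0],
                  [0.0, 0.0]])
    lam, S = np.linalg.eig(A)
    print(lam)                          # [0. 0.]: the repeated eigenvalue
    # Both columns of S are multiples of (1, 0)^T up to rounding,
    # so S is singular and cannot serve as a diagonalizing matrix.
    print(np.linalg.matrix_rank(S))     # 1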



A Non-Diagonalizable n × n Matrix: Specification
The following n × n matrix also has a unique eigenvalue,
whose eigenspace is of dimension 1.

Example
Consider the non-symmetric n × n matrix A
whose elements in the first n − 1 rows satisfy aij = δi,j−1
for i = 1, 2, . . . , n − 1, but whose last row is 0⊤.
Such a matrix is upper triangular, and takes the special form

        ( 0  1  0  . . .  0 )
        ( 0  0  1  . . .  0 )
    A = ( .  .  .  . . .  . )  =  ( 0   In−1 )
        ( 0  0  0  . . .  1 )     ( 0   0⊤   )
        ( 0  0  0  . . .  0 )

in which the elements in the first n − 1 rows and last n − 1 columns
make up the identity matrix.
A Non-Diagonalizable n × n Matrix: Analysis
Because A − λI is also upper triangular,
its characteristic equation is 0 = |A − λI| = (−λ)^n.
This has λ = 0 as an n-fold repeated root.
So λ = 0 is the unique eigenvalue.
The eigenvalue equation Ax = λx with λ = 0
takes the form Ax = 0 or

0 = ∑_{j=1}^{n} δi,j−1 xj = xi+1   (i = 1, 2, . . . , n − 1)

with an extra nth equation of the form 0 = 0.
The only solutions take the form xj = 0 for j = 2, . . . , n,
with x1 arbitrary.
So all the eigenvectors of A
are non-zero multiples of e1 = (1, 0, . . . , 0)⊤,
implying that there is just one eigenspace, which has dimension 1.


Complex Conjugates and Adjoint Matrices
Recall that any complex number c ∈ C can be expressed as a + ib
with a ∈ R as the real part and b ∈ R as the imaginary part.
The complex conjugate of c is c̄ = a − ib.
Note that c c̄ = c̄ c = (a + ib)(a − ib) = a^2 + b^2 = |c|^2,
where |c| is the modulus of c.
Any m × n complex matrix C = (cij)m×n ∈ Cm×n
can be written as A + iB, where A and B are real m × n matrices.
The adjoint of the m × n complex matrix C = A + iB
is the n × m complex matrix C∗ := (A − iB)⊤ = A⊤ − iB⊤.
This is the transpose of the matrix A − iB whose elements are the
complex conjugates c̄jk of the corresponding elements of C.
That is, each element of C∗ is given by c∗jk = akj − bkj i.
In the case of a real matrix A whose imaginary part is 0,
its adjoint is simply the transpose A⊤.
Self-Adjoint and Symmetric Matrices

An n × n complex matrix C = A + iB is self-adjoint
just in case C∗ = C, which holds if and only if A⊤ − iB⊤ = A + iB,
and so if and only if:
• the real part A is symmetric;
• the imaginary part B is anti-symmetric
  in the sense that B⊤ = −B.
Of course, a real matrix is self-adjoint if and only if it is symmetric.

Theorem
Any eigenvalue of a self-adjoint complex matrix is a real scalar.
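A numerical illustration of the theorem (an arbitrary 2 × 2
self-adjoint matrix; assumes NumPy):

    import numpy as np

    C = np.array([[2.0 + 0.0j, 1.0 - 1.0j],
                  [1.0 + 1.0j, 3.0 + 0.0j]])
    assert np.allclose(C, C.conj().T)   # self-adjoint: C* = C
    lam = np.linalg.eigvals(C)
    assert np.allclose(lam.imag, 0.0)   # the eigenvalues (4 and 1) are real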



Proof that Eigenvalues are Real
Suppose that the scalar λ ∈ C and vector x ∈ Cn together satisfy
the eigenvalue equation Ax = λx for any A ∈ Cn×n.
Taking complex conjugates throughout, one has λ̄x∗ = x∗A∗.
By the associative law of complex matrix multiplication,
one has x∗Ax = x∗(Ax) = x∗(λx) = λ(x∗x),
as well as x∗A∗x = (x∗A∗)x = (λ̄x∗)x = λ̄(x∗x).
In case A is self-adjoint and so A∗ = A,
subtracting the second equation from the first
gives x∗Ax − x∗A∗x = x∗(A − A∗)x = 0 = (λ − λ̄)(x∗x).
But in case x is an eigenvector, one has x = (xi)_{i=1}^{n} ∈ Cn \ {0},
and so x∗x = ∑_{i=1}^{n} |xi|^2 > 0.
Because 0 = (λ − λ̄)(x∗x),
it follows that the eigenvalue λ satisfies λ − λ̄ = 0,
implying that λ is real.
Orthogonal Projections
Definition
An n × n matrix P is an orthogonal projection if P^2 = P
and u⊤v = 0 whenever Pv = 0 and u = Px for some x ∈ Rn.

Theorem
Suppose that the n × m matrix X has full rank m < n.
Let L ⊂ Rn be the linear subspace spanned
by the m linearly independent columns of X.
Define the n × n matrix P := X(X⊤X)^{−1}X⊤. Then:
1. The matrix P is a symmetric orthogonal projection onto L.
2. The matrix I − P is a symmetric orthogonal projection
   onto the orthogonal complement L⊥ of L.
3. For each vector y ∈ Rn, its orthogonal projection onto L
   is the unique vector v = Py
   that minimizes the distance ‖y − v‖ between y and L
   — i.e., ‖y − v‖ ≤ ‖y − u‖ for all u ∈ L.
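A sketch of parts 1 and 2 with a random full-rank X (assumes NumPy;
the dimensions n = 5, m = 2 are arbitrary):

    import numpy as np

    rng = np.random.default_rng(seed=0)
    n, m = 5, 2
    X = rng.standard_normal((n, m))        # full rank m < n with probability 1
    P = X @ np.linalg.inv(X.T @ X) @ X.T

    assert np.allclose(P, P.T)             # P is symmetric
    assert np.allclose(P @ P, P)           # P is idempotent: P^2 = P
    y = rng.standard_normal(n)
    assert np.allclose(X.T @ (y - P @ y), 0.0)   # residual y − Py is orthogonal to L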
Proof of Part 1

Because of the rules for the transposes of products and inverses,
the definition P := X(X⊤X)^{−1}X⊤ implies that P⊤ = P and also

P^2 = X(X⊤X)^{−1}X⊤X(X⊤X)^{−1}X⊤ = X(X⊤X)^{−1}X⊤ = P

Moreover, if Pv = 0 and u = Px for some x ∈ Rn, then

u⊤v = x⊤P⊤v = x⊤Pv = 0

Finally, for every y ∈ Rn, the vector Py equals Xb, where

b = (X⊤X)^{−1}X⊤y

Hence Py ∈ L.



Proof of Part 2
Evidently (I − P)⊤ = I − P⊤ = I − P, and

(I − P)^2 = I − 2P + P^2 = I − 2P + P = I − P

Hence I − P is a projection.
This projection is also orthogonal because if (I − P)v = 0
and u = (I − P)x for some x ∈ Rn, then

u⊤v = x⊤(I − P)⊤v = x⊤(I − P)v = 0

Next, suppose that v = Xb ∈ L and that y = (I − P)x
belongs to the range of (I − P). Then, because PX = X(X⊤X)^{−1}X⊤X = X,

y⊤v = x⊤(I − P)⊤Xb = x⊤Xb − x⊤PXb = x⊤Xb − x⊤Xb = 0

so y ∈ L⊥.



Proof of Part 3
For any vector v = Xb ∈ L and y ∈ Rn, one has

‖y − v‖^2 = (y − Xb)⊤(y − Xb) = y⊤y − 2y⊤Xb + b⊤X⊤Xb

Now define b̂ := (X⊤X)^{−1}X⊤y (which is the OLS estimator of b
in the linear regression equation y = Xb + e).
Also, define v̂ := Xb̂ = Py. Because P⊤P = P^2 = P,

‖y − v‖^2 = y⊤y − 2y⊤Xb + b⊤X⊤Xb
          = (b − b̂)⊤X⊤X(b − b̂) + y⊤y − b̂⊤X⊤Xb̂
          = ‖v − v̂‖^2 + y⊤y − y⊤P⊤Py = ‖v − v̂‖^2 + y⊤y − y⊤Py

On the other hand, given that v̂ = Py, one also has

‖y − v̂‖^2 = y⊤y − 2y⊤v̂ + v̂⊤v̂
          = y⊤y − 2y⊤Py + y⊤P⊤Py = y⊤y − y⊤Py

So ‖y − v‖^2 − ‖y − v̂‖^2 = ‖v − v̂‖^2 ≥ 0, with equality iff v = v̂.
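The OLS connection in this proof can be checked directly: the
projection Py coincides with the fitted values Xb̂ from least squares.
A sketch (np.linalg.lstsq computes the least-squares solution):

    import numpy as np

    rng = np.random.default_rng(seed=1)
    X = rng.standard_normal((6, 2))
    y = rng.standard_normal(6)
    P = X @ np.linalg.inv(X.T @ X) @ X.T
    b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # OLS estimate of b
    assert np.allclose(P @ y, X @ b_hat)            # fitted values Xb̂ equal Py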


A Trick Function for Generating Eigenvalues
For all x ≠ 0, define the function, homogeneous of degree zero,

    Rn \ {0} ∋ x ↦ f(x) := x⊤Ax / x⊤x = ∑_{i=1}^{n} ∑_{j=1}^{n} xi aij xj / ∑_{i=1}^{n} xi^2

which is left undefined at x = 0.
The partial derivative w.r.t. any component xh of the vector x is

    ∂f/∂xh = (2 / (x⊤x)^2) [ (∑_{j=1}^{n} ahj xj)(x⊤x) − (x⊤Ax) xh ]

At any stationary point x̂ ≠ 0 where ∂f/∂xh = 0 for all h,
one therefore has (x̂⊤x̂)Ax̂ = (x̂⊤Ax̂)x̂,
and so Ax̂ = λx̂ where λ = f(x̂).
That is, a stationary point x̂ ≠ 0 must be an eigenvector,
with the corresponding function value f(x̂)
as the associated eigenvalue.
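This "trick" quotient is usually called the Rayleigh quotient. The
sketch below samples f at many random points of a symmetric 2 × 2
example: all sampled values lie between the extreme eigenvalues, which
are the extrema of f (a small tolerance is added for rounding):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])             # symmetric

    def f(x):
        return (x @ A @ x) / (x @ x)       # the trick (Rayleigh) quotient

    rng = np.random.default_rng(seed=2)
    values = [f(rng.standard_normal(2)) for _ in range(10_000)]
    lam_min, lam_max = np.linalg.eigvalsh(A)   # ascending eigenvalues of symmetric A
    assert lam_min - 1e-9 <= min(values) and max(values) <= lam_max + 1e-9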
More Properties of the Trick Function

Using this trick function

    Rn \ {0} ∋ x ↦ f(x) := x⊤Ax / x⊤x

one can state and prove the following lemma.

Lemma
Every symmetric n × n matrix A:
1. has a maximum eigenvalue λ^* at an eigenvector x^*
   where f attains its maximum;
2. has a minimum eigenvalue λ_* at an eigenvector x_*
   where f attains its minimum;
3. satisfies A = λI if and only if λ^* = λ_* = λ.



Proof of Parts 1 and 2

The unit sphere S^{n−1} is a compact subset of Rn,
and the function f restricted to S^{n−1} is continuous.
By the extreme value theorem, f restricted to S^{n−1} must have:
• a maximum value λ^* attained at some point x^*;
• a minimum value λ_* attained at some point x_*.
Because f is homogeneous of degree zero,
these are the maximum and minimum values of f
over the whole domain Rn \ {0}.
In particular, f must be stationary at any maximum point x^*,
as well as at any minimum point x_*.
But stationary points must be eigenvectors.
This proves parts 1 and 2 of the lemma.





A Useful Lemma

Lemma
Let A be a symmetric n × n matrix.
Suppose that there are m < n eigenvectors {uk}_{k=1}^{m}
which form an orthonormal set of vectors,
as well as the columns of an n × m matrix U.
Then there is at least one more eigenvector x
that satisfies U⊤x = 0
— i.e., it is orthogonal to each of the m eigenvectors uk.



Constructive Proof I
For each eigenvector uk, let λk be the associated eigenvalue,
so that Auk = λk uk for k = 1, 2, . . . , m.
Then AU = UΛ where Λ := diag(λk)_{k=1}^{m}.
Also, because the eigenvectors {uk}_{k=1}^{m} form an orthonormal set,
one has U⊤U = Im. Hence U⊤AU = U⊤UΛ = Λ.
Also, transposing AU = UΛ and using A⊤ = A gives U⊤A = ΛU⊤.
Consider now the n × n matrix Â := (I − UU⊤)A(I − UU⊤), which
is symmetric because both A and UU⊤ are symmetric. Note that

Â = A − UU⊤A − AUU⊤ + UU⊤AUU⊤
  = A − UΛU⊤ − UΛU⊤ + UΛU⊤ = A − UΛU⊤

This matrix Â has at least one eigenvalue λ, which must be real,
and an associated eigenvector x ≠ 0, which together satisfy

Âx = (I − UU⊤)A(I − UU⊤)x = (A − UΛU⊤)x = λx

Premultiplying each side of the last equation by U⊤ shows that

λU⊤x = U⊤Ax − U⊤UΛU⊤x = ΛU⊤x − ΛU⊤x = 0
Constructive Proof II
This leaves two possible cases:
1. In the exceptional case when the only eigenvalue
   of the symmetric matrix Â is λ = 0,
   one has Â = 0 and so A = UΛU⊤.
   Then any vector x ≠ 0 satisfying U⊤x = 0
   must satisfy Ax = UΛU⊤x = 0,
   implying that x is an eigenvector of A.
2. Otherwise, in the generic case
   when Â has at least one eigenvalue λ ≠ 0,
   there is a corresponding eigenvector x ≠ 0 of Â
   that satisfies U⊤x = 0. But then this implies that

   Ax = (Â + UΛU⊤)x = Âx = λx

   Hence x is an eigenvector of A.
In both cases there is an eigenvector x of A satisfying U⊤x = 0.
Spectral Theorem

Theorem
Given any symmetric n × n matrix A:
1. its eigenvectors span the whole of Rn ;
2. there exists an orthogonal matrix P that diagonalizes A.



Proof of Spectral Theorem
The matrix A has at least one eigenvalue, which must be real.
The associated eigenvector x, normalized to satisfy x⊤x = 1,
forms an orthonormal set {u1}.
As the induction hypothesis,
suppose that there are m < n eigenvectors {uk}_{k=1}^{m}
which form an orthonormal set of vectors.
We have just proved that this hypothesis holds for m = 1.
The lemma shows that, if it holds for any m = 1, 2, . . . , n − 1,
then it holds for m + 1
(after the new eigenvector has been normalized).
The result follows by induction.
In particular, when m = n, there exists an orthonormal set
of n eigenvectors, which must then span the whole of Rn.
Also, by the previous result, taking P as an orthogonal matrix
whose columns are an orthonormal set of n eigenvectors
implies that P⊤AP = Λ, where the elements
of the diagonal matrix Λ constitute its set of eigenvalues.
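Numerically, np.linalg.eigh computes exactly this decomposition for
symmetric input: it returns a (numerically) orthogonal P with
P⊤AP = Λ. A closing sketch with an illustrative 3 × 3 symmetric
matrix:

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])             # symmetric
    lam, P = np.linalg.eigh(A)                  # eigh is for symmetric/Hermitian input
    assert np.allclose(P.T @ P, np.eye(3))      # columns of P are orthonormal
    assert np.allclose(P.T @ A @ P, np.diag(lam))   # P⊤AP = Λ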
