
Chapter 6

Eigenvalues and Eigenvectors


6.1 Eigenvalues and the Characteristic Equation

Given an n × n matrix A, if

    Av = λv,  v ≠ 0,    (6.1.1)

where λ is a scalar and v is a non-zero vector, then λ is called an eigenvalue of A and v an eigenvector. It is important here that an eigenvector be a non-zero vector: for the zero vector, (6.1.1) holds for any λ, and so it is not very interesting. Let us look at some examples.
Example 15. Let

    A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}.

Then, we have:

    \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = 3 \begin{pmatrix} 1 \\ 1 \end{pmatrix},    (6.1.2)

    \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.    (6.1.3)

We thus see that 3 and 1 are eigenvalues of the matrix A, and that the corresponding eigenvectors are (1, 1)^T and (1, -1)^T. An important thing to note here is that the eigenvector corresponding to 3, say, is not unique: any non-zero scalar multiple of (1, 1)^T is also an eigenvector corresponding to the eigenvalue 3.
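These hand computations are easy to confirm numerically. A minimal check with NumPy (our own illustration, not part of the original text):

```python
import numpy as np

# The matrix of Example 15.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Check Av = 3v for v = (1, 1) and Av = v for v = (1, -1).
v1 = np.array([1.0, 1.0])
v2 = np.array([1.0, -1.0])
print(A @ v1)  # equals 3 * v1
print(A @ v2)  # equals v2

# np.linalg.eigvals recovers the same eigenvalues.
eigvals = np.linalg.eigvals(A)
print(sorted(eigvals.real))  # approximately [1.0, 3.0]
```

Any non-zero scalar multiple of these vectors passes the same check, reflecting the non-uniqueness noted above.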
Now consider:

    A = \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}.    (6.1.4)

In this case, the eigenvalues are 4 and 1, and the eigenvectors satisfy:

    A \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = 4 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad A \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}, \quad A \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}.    (6.1.5)

Note here that there are two linearly independent eigenvectors for the eigenvalue 1.
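A quick numerical check of the repeated eigenvalue (a sketch of ours, not part of the text; since A is symmetric, `np.linalg.eigh` applies):

```python
import numpy as np

# The 3x3 matrix from (6.1.4): eigenvalue 4 once, eigenvalue 1 twice.
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])

w, V = np.linalg.eigh(A)  # eigenvalues in ascending order
print(np.round(w, 6))     # eigenvalue 1 appears with multiplicity 2

# Two linearly independent eigenvectors for eigenvalue 1, as claimed:
for v in ([1, -1, 0], [1, 0, -1]):
    v = np.array(v, dtype=float)
    assert np.allclose(A @ v, v)
```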
Given a matrix A, how can we find the eigenvalues of A? We argue as follows. Equation (6.1.1) can be written as:

    Av - λIv = 0,    (6.1.6)

where I is the identity matrix. We thus have:

    (A - λI)v = 0.    (6.1.7)

Now, note that v is a non-zero vector. The above equation states that a non-zero vector is being sent to the origin. Therefore, the matrix A - λI has a non-trivial kernel, and this implies that A - λI is not invertible. Therefore,

    det(A - λI) = 0.    (6.1.8)

This gives us an equation for the eigenvalues. The above equation is called the characteristic equation, and det(A - λI) is called the characteristic polynomial. Let us state this as a theorem.
Theorem 14. Given an n × n matrix A, λ is an eigenvalue of A if and only if it satisfies the following characteristic equation:

    det(A - λI) = 0.    (6.1.9)

Once the eigenvalues are found, we may find the eigenvectors by solving the equation:

    (A - λI)v = 0,    (6.1.10)

viewing v as the unknown. This is just a homogeneous linear equation, which we know very well how to solve.
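The two-step recipe above (solve det(A - λI) = 0, then solve (A - λI)v = 0) can be sketched numerically. This is an illustration of ours, not how production eigensolvers work: `np.poly` returns the characteristic polynomial coefficients of a square matrix, `np.roots` finds its roots, and the kernel of A - λI can be read off from an SVD.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [-1.0, 4.0]])

# Step 1: characteristic polynomial and its roots (the eigenvalues).
coeffs = np.poly(A)      # coefficients of det(lambda*I - A)
lams = np.roots(coeffs)
print(np.round(coeffs, 6), np.round(np.sort(lams.real), 6))

# Step 2: for each eigenvalue, an eigenvector spans the kernel of
# A - lam*I; the right singular vector for the zero singular value
# of that matrix gives one.
for lam in lams:
    _, s, Vt = np.linalg.svd(A - lam * np.eye(2))
    v = Vt[-1]
    assert np.allclose(A @ v, lam * v)
```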
Example 16. Consider the matrix:

    A = \begin{pmatrix} 1 & 2 \\ -1 & 4 \end{pmatrix}.    (6.1.11)

MATH 2574H
Yoichiro Mori

The characteristic equation is given by:

    det \begin{pmatrix} 1-λ & 2 \\ -1 & 4-λ \end{pmatrix} = (1-λ)(4-λ) + 2 = λ^2 - 5λ + 6 = (λ - 2)(λ - 3) = 0.    (6.1.12)

Therefore, the eigenvalues of this matrix are 2 and 3. To obtain the eigenvector for λ = 2, we solve the equation:

    \begin{pmatrix} -1 & 2 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},    (6.1.13)

from which we see that the solution is given by:

    \begin{pmatrix} a \\ b \end{pmatrix} = c \begin{pmatrix} 2 \\ 1 \end{pmatrix}.    (6.1.14)

Thus, one eigenvector corresponding to λ = 2 is (2, 1)^T. In a similar fashion, we see that an eigenvector corresponding to λ = 3 is (1, 1)^T.
Consider the following example of a 3 × 3 matrix:

    A = \begin{pmatrix} 3 & 0 & 2 \\ 0 & 3 & 0 \\ 1 & 0 & 2 \end{pmatrix}.    (6.1.15)

The characteristic equation is:

    det \begin{pmatrix} 3-λ & 0 & 2 \\ 0 & 3-λ & 0 \\ 1 & 0 & 2-λ \end{pmatrix} = (3-λ)((3-λ)(2-λ) - 2) = 0.    (6.1.16)

Solving this equation, we see that the eigenvalues are λ = 1, 3, 4. An eigenvector for λ = 1 can be obtained by solving the equation:

    \begin{pmatrix} 2 & 0 & 2 \\ 0 & 2 & 0 \\ 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.    (6.1.17)

An eigenvector for λ = 1 is (1, 0, -1)^T. For λ = 3, an eigenvector is (0, 1, 0)^T, and for λ = 4, one eigenvector is (2, 0, 1)^T.
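A numerical check of this 3 × 3 example. The matrix entries here are as reconstructed above (chosen consistent with the stated eigenvalues 1, 3, 4), so this is a sketch under that assumption:

```python
import numpy as np

# 3x3 matrix with eigenvalues 1, 3, 4 and the eigenvectors listed above.
A = np.array([[3.0, 0.0, 2.0],
              [0.0, 3.0, 0.0],
              [1.0, 0.0, 2.0]])

print(sorted(np.linalg.eigvals(A).real))  # approximately [1.0, 3.0, 4.0]

for lam, v in [(1, [1, 0, -1]), (3, [0, 1, 0]), (4, [2, 0, 1])]:
    v = np.array(v, dtype=float)
    assert np.allclose(A @ v, lam * v)
```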
Let us examine the characteristic equation in greater detail. First of all, we have the following result.

Theorem 15. Given an n × n matrix A, the characteristic polynomial det(A - λI) is a degree-n polynomial in λ, and has the form:

    det(A - λI) = (-1)^n λ^n + ⋯.    (6.1.18)
To prove this statement, we first prove the following.

Proposition 19. Consider an n × n matrix B whose elements are polynomials in λ. Assume further that the polynomials are at most degree 1 in λ. Then, det(B) is a polynomial of degree at most n.

Proof. We proceed by induction on the size n of the matrix. For n = 1, this is clearly true. Suppose this is true for n - 1. The determinant of B can be written as:

    det B = b_{11} det(B_{11}) - b_{12} det(B_{12}) + ⋯ + (-1)^{n+1} b_{1n} det(B_{1n}),    (6.1.19)

where b_{ij} is the ij component of B and B_{ij} is the ij minor of B. Since the b_{1k}, k = 1, …, n, are at most degree 1 and the det(B_{1k}) are at most degree n - 1 by the induction hypothesis, each term in the above sum is at most degree n. This completes the proof.
We are now ready to prove Theorem 15.

Proof of Theorem 15. We use induction on the size n of the matrix. For n = 1, the statement is clear. Suppose the statement is true for n - 1. Let B = A - λI. We have:

    det B = b_{11} det(B_{11}) - b_{12} det(B_{12}) + ⋯ + (-1)^{n+1} b_{1n} det(B_{1n}),    (6.1.20)

where we used the same notation as in (6.1.19). The first term in the above sum is:

    b_{11} det(B_{11}) = (a_{11} - λ) det(A_{11} - λI),    (6.1.21)

where a_{ij} is the ij component of the matrix A and A_{ij} is the ij minor. By the induction hypothesis, det(A_{11} - λI) is a polynomial in λ of degree n - 1, and the leading order term is (-1)^{n-1} λ^{n-1}. Therefore, (6.1.21) is a polynomial in λ of degree n with leading order term (-1)^n λ^n. Consider the other terms in (6.1.20):

    (-1)^{k+1} b_{1k} det(B_{1k}) = (-1)^{k+1} a_{1k} det(B_{1k}),  k ≥ 2.    (6.1.22)

By Proposition 19, det(B_{1k}) is a polynomial in λ of degree at most n - 1. Since a_{1k} is just a scalar, (6.1.22) is a polynomial in λ of degree at most n - 1. Since the first term of (6.1.20) is degree n and the rest of the terms are at most degree n - 1 in λ, det B = det(A - λI) is a degree-n polynomial in λ.

Given an n × n matrix A, let p(λ) = det(A - λI) be its characteristic polynomial. It is a well-known fact (that we will not prove) that a degree-n polynomial equation has n roots in ℂ, counting multiplicities. That is to say, the characteristic polynomial can be written as:

    det(A - λI) = p(λ) = (-1)^n (λ - λ_1)(λ - λ_2) ⋯ (λ - λ_n),    (6.1.23)

where the λ_i ∈ ℂ are the eigenvalues of A. This implies that a matrix A can have at most n different eigenvalues. We state this as a theorem.

Theorem 16. An n × n matrix A has at least one eigenvalue in ℂ and at most n distinct eigenvalues in ℂ.
One complication here is that eigenvalues are not necessarily going to be real numbers but are in general complex.

Example 17. Consider the matrix:

    A = \begin{pmatrix} 2 & 1 \\ -1 & 2 \end{pmatrix}.    (6.1.24)

The characteristic equation and the eigenvalues for this matrix are:

    λ^2 - 4λ + 5 = 0,  λ = 2 ± i.    (6.1.25)

An eigenvector corresponding to 2 + i is (-i, 1)^T and an eigenvector corresponding to 2 - i is (i, 1)^T.

We also point out that it is very possible for an n × n matrix to have fewer than n distinct eigenvalues. We have already seen an example of this for the matrix in equation (6.1.4).
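NumPy handles the complex case transparently; a short check of Example 17 (our own illustration):

```python
import numpy as np

# Real matrix with complex conjugate eigenvalues 2 + i and 2 - i.
A = np.array([[2.0, 1.0],
              [-1.0, 2.0]])

w = np.linalg.eigvals(A)
print(np.sort_complex(w))  # approximately 2-1j and 2+1j

# (-i, 1) is an eigenvector for 2 + i:
v = np.array([-1j, 1.0])
assert np.allclose(A @ v, (2 + 1j) * v)
```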

6.2 Diagonalization

Suppose we have an n × n matrix A with n linearly independent eigenvectors v_1, …, v_n. Each eigenvector satisfies:

    A v_k = λ_k v_k.    (6.2.1)

We may form a matrix P = (v_1, …, v_n), so that the column vectors of P are the eigenvectors of A. Let

    D = \begin{pmatrix} λ_1 & 0 & ⋯ & 0 \\ 0 & λ_2 & ⋯ & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & ⋯ & λ_n \end{pmatrix}.    (6.2.2)

The matrix D is thus a diagonal matrix whose diagonal entries are the eigenvalues of A. Equation (6.2.1) can now be written as:

    AP = PD.    (6.2.3)

Since the column vectors of P are linearly independent, P has an inverse, and we have:

    P^{-1} A P = D.    (6.2.4)

This is called the diagonalization of a matrix. From the above calculation, we have the following result.

Theorem 17. An n × n matrix A can be diagonalized if and only if A has n linearly independent eigenvectors.

Proof. If A has n linearly independent eigenvectors, we saw above that A can be diagonalized. Conversely, if A can be diagonalized, there is an invertible matrix P such that P^{-1}AP is a diagonal matrix, and the column vectors of P are n linearly independent eigenvectors of A.

All examples of matrices we have seen so far in this chapter can be diagonalized.
Example 18. Consider the matrix:

    A = \begin{pmatrix} -1 & 3 \\ 2 & 0 \end{pmatrix}.    (6.2.5)

Its eigenvalues and eigenvectors are:

    A \begin{pmatrix} 1 \\ 1 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad A \begin{pmatrix} 3 \\ -2 \end{pmatrix} = -3 \begin{pmatrix} 3 \\ -2 \end{pmatrix}.    (6.2.6)

The two eigenvectors are linearly independent. Let

    P = \begin{pmatrix} 1 & 3 \\ 1 & -2 \end{pmatrix}.    (6.2.7)

We may use P to diagonalize A as follows:

    P^{-1} A P = D = \begin{pmatrix} 2 & 0 \\ 0 & -3 \end{pmatrix}.    (6.2.8)
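The diagonalization of Example 18 can be verified directly. The signs of the matrix entries here are as reconstructed in this example (chosen consistent with the eigenvalue computation), so treat this as a sketch under that assumption:

```python
import numpy as np

# Example 18: P^{-1} A P should come out as diag(2, -3).
A = np.array([[-1.0, 3.0],
              [2.0, 0.0]])
P = np.array([[1.0, 3.0],
              [1.0, -2.0]])

D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))  # diagonal matrix with 2 and -3 on the diagonal
```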

There are cases in which the matrix does not have n linearly independent
eigenvectors. In such a case, A cannot be diagonalized.

Example 19. Consider the matrix:

    A = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}.    (6.2.9)

The characteristic equation is (2 - λ)^2 = 0, and therefore, the only eigenvalue is 2. To find an eigenvector corresponding to 2, we must solve:

    \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.    (6.2.10)

An eigenvector is (1, 0)^T, and there are no other linearly independent eigenvectors.
There is a large class of matrices for which diagonalizability is guaranteed.

Proposition 20. Suppose an n × n matrix A has n distinct eigenvalues. Then, A has n linearly independent eigenvectors, each corresponding to a different eigenvalue. The matrix A is thus diagonalizable.

Proof. Let λ_1, …, λ_n be the n distinct eigenvalues of A. Take one eigenvector v_k for each eigenvalue λ_k, so that

    A v_k = λ_k v_k.    (6.2.11)

Consider the linear combination:

    c_1 v_1 + ⋯ + c_n v_n = 0.    (6.2.12)

We now show that all c_k must be equal to 0. This shows that the v_k are linearly independent, and thus A is diagonalizable. To conclude that c_1 is 0, multiply both sides of (6.2.12) from the left by the matrix:

    (A - λ_2 I)(A - λ_3 I) ⋯ (A - λ_n I).    (6.2.13)

Since (A - λ_k I)v_k = 0, all terms disappear except for the first term:

    c_1 (λ_1 - λ_2)(λ_1 - λ_3) ⋯ (λ_1 - λ_n) v_1 = 0.    (6.2.14)

Since all the λ_k are distinct and v_1 ≠ 0, we conclude that c_1 = 0. In much the same way, we can conclude that all the other c_k are also 0.

6.3 Matrix Powers

One of the reasons why eigenvalues, eigenvectors, and diagonalization are so useful is that they allow us to compute matrix powers (or more generally matrix functions) very easily. The idea is the following. Suppose we have successfully diagonalized a matrix A as follows:

    P^{-1} A P = D, \quad D = \begin{pmatrix} λ_1 & 0 & ⋯ & 0 \\ 0 & λ_2 & ⋯ & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & ⋯ & λ_n \end{pmatrix}.    (6.3.1)
We are interested in computing the matrix power A^k. Using the above,

    A = P D P^{-1}.    (6.3.2)

Therefore,

    A^k = P D P^{-1} · P D P^{-1} ⋯ P D P^{-1} = P D^k P^{-1},    (6.3.3)

since all the intervening products P^{-1} P = I. The beautiful thing here is that D^k is very easy to compute. It is just:

    D^k = \begin{pmatrix} λ_1^k & 0 & ⋯ & 0 \\ 0 & λ_2^k & ⋯ & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & ⋯ & λ_n^k \end{pmatrix}.    (6.3.4)
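The identity A^k = P D^k P^{-1} is easy to check numerically against repeated multiplication; a minimal sketch (our own, valid when A is diagonalizable):

```python
import numpy as np

# Compute A^k two ways: by repeated multiplication, and via the
# eigendecomposition A = P D P^{-1}.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
k = 7

w, P = np.linalg.eig(A)  # columns of P are eigenvectors, w the eigenvalues
Ak_eig = P @ np.diag(w**k) @ np.linalg.inv(P)
Ak_ref = np.linalg.matrix_power(A, k)

print(np.allclose(Ak_eig, Ak_ref))  # True
```

The eigendecomposition route costs one decomposition plus scalar powers, which is why it is the method of choice for large k or for general matrix functions.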

Example 20. Let us look at matrix powers of the matrix in Example 18. Using the same notation, we have:

    A^k = P D^k P^{-1} = \begin{pmatrix} 1 & 3 \\ 1 & -2 \end{pmatrix} \begin{pmatrix} 2^k & 0 \\ 0 & (-3)^k \end{pmatrix} \frac{1}{5} \begin{pmatrix} 2 & 3 \\ 1 & -1 \end{pmatrix}
        = \frac{1}{5} \begin{pmatrix} 2^{k+1} - (-3)^{k+1} & 3 · 2^k + (-3)^{k+1} \\ 2^{k+1} - 2(-3)^k & 3 · 2^k + 2(-3)^k \end{pmatrix}.    (6.3.5)

6.4 Traces and Determinants

We explore connections between eigenvalues and the determinant.

Proposition 21. Let A be an n × n matrix and λ_1, …, λ_n be the eigenvalues of A, counting multiplicities. Then,

    det(A) = λ_1 λ_2 ⋯ λ_n.    (6.4.1)

Proof. Consider the characteristic polynomial:

    det(A - λI) = (-1)^n (λ - λ_1)(λ - λ_2) ⋯ (λ - λ_n).    (6.4.2)

Set λ = 0 to obtain the desired result.


The determinant is thus just a product of the eigenvalues.

Another interesting quantity in this connection is the trace. The trace is the sum of the diagonal components of a matrix. For an n × n matrix A whose elements are given by a_{ij},

    Tr(A) = a_{11} + a_{22} + ⋯ + a_{nn}.    (6.4.3)

One interesting property of the trace is the following. Given two n × n matrices A and B,

    Tr(AB) = Tr(BA).    (6.4.4)

This can be seen by writing out both sides in terms of the components. Let a_{ij} and b_{ij} be the components of A and B respectively. Then,

    Tr(AB) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} b_{ji} = \sum_{j=1}^{n} \sum_{i=1}^{n} b_{ji} a_{ij} = Tr(BA).    (6.4.5)

This can be used to establish the following fact.

Proposition 22. The trace of an n × n matrix A is the sum of its eigenvalues, counting multiplicities.

Proof. We only prove this for matrices with n distinct eigenvalues (although the statement is true in general). In this case, A is diagonalizable, so that

    P^{-1} A P = D,    (6.4.6)

where D is a diagonal matrix with the n distinct eigenvalues of A along the diagonal. Take the trace on both sides. We have:

    Tr(P^{-1} A P) = Tr(D) = λ_1 + ⋯ + λ_n.    (6.4.7)

Note that:

    Tr(P^{-1} A P) = Tr(A P P^{-1}) = Tr(A),    (6.4.8)

where we used (6.4.4).
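Propositions 21 and 22, together with the cyclic property of the trace, can be spot-checked numerically on arbitrary matrices (a sketch of ours; the random matrices are our own choice):

```python
import numpy as np

# Random test matrices (fixed seed so the check is reproducible).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# Tr(AB) = Tr(BA), equation (6.4.4).
assert np.isclose(np.trace(A @ B), np.trace(B @ A))

# det(A) = product of eigenvalues; Tr(A) = sum of eigenvalues.
w = np.linalg.eigvals(A)
assert np.isclose(np.prod(w), np.linalg.det(A))
assert np.isclose(np.sum(w), np.trace(A))
print("trace and determinant identities hold")
```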
MATH 2574H

85

Yoichiro Mori

6.5 Classification of 2 × 2 matrices by similarity

Let A and B be two n × n matrices. If they are related by:

    P^{-1} A P = B,    (6.5.1)

we say that A and B are similar matrices. We have seen that, in many cases, we can find a matrix P so that B is a diagonal matrix. However, in some cases, this is not possible. Even if it is possible, it is sometimes the case that the eigenvalues are complex conjugates, in which case diagonalization will produce a complex matrix. Here, we try to see how far we can go with real matrices only. This will yield a classification of 2 × 2 matrices by similarity.

Let the eigenvalues of the 2 × 2 matrix A be λ_1 and λ_2. When λ_1 ≠ λ_2 and both are real, we know that A is diagonalizable with a real matrix P. In this case, we have:

    P^{-1} A P = \begin{pmatrix} λ_1 & 0 \\ 0 & λ_2 \end{pmatrix}.    (6.5.2)

The matrix A, from a geometric standpoint, is a transformation that stretches the plane in the eigenvector directions by factors of λ_1 and λ_2 respectively.
When λ_1 = λ_2, there are two situations. It may happen that the matrix A still has two linearly independent eigenvectors v_1 and v_2. In this case,

    P^{-1} A P = λ_1 I, \quad P = (v_1, v_2),    (6.5.3)

where I is the 2 × 2 identity matrix. This means that:

    A = λ_1 P I P^{-1} = λ_1 I.    (6.5.4)

Therefore, if λ_1 = λ_2 and A has two linearly independent eigenvectors, A is necessarily a constant multiple of the identity matrix.
Suppose λ_1 = λ_2 and that A has only one linearly independent eigenvector. Let this vector be v_1. We now seek a vector v_2 that satisfies:

    (A - λ_1 I) v_2 = v_1.    (6.5.5)

The claim is that there is such a vector v_2 and that v_1 and v_2 are linearly independent. To see that the above equation can be solved, we need to show that v_1 is in the image of (A - λ_1 I). This can be seen as follows. Any element w in the image of (A - λ_1 I) can be written as:

    w = (A - λ_1 I) v    (6.5.6)

for some v. By the Cayley-Hamilton theorem, we know that:

    (A - λ_1 I)^2 = O,    (6.5.7)

where O is the 2 × 2 zero matrix. Therefore,

    (A - λ_1 I) w = (A - λ_1 I)^2 v = 0.    (6.5.8)

This means that w is in the kernel of (A - λ_1 I). The image of A - λ_1 I is thus included in the kernel of A - λ_1 I. Since the kernel of A - λ_1 I is one-dimensional (A has only one linearly independent eigenvector), the image of A - λ_1 I is also one-dimensional by the rank-nullity theorem. This means that the kernel and image of A - λ_1 I must coincide. Since the eigenvector v_1 is in the kernel of A - λ_1 I, this means that (6.5.5) must have a solution. It is also clear that v_2 and v_1 must be linearly independent: if not, v_2 would be a constant multiple of v_1 and would thus be mapped to 0 by A - λ_1 I, not to v_1. We have, therefore:

    A v_1 = λ_1 v_1, \quad A v_2 = λ_1 v_2 + v_1.    (6.5.9)

From this, we have:

    P^{-1} A P = \begin{pmatrix} λ_1 & 1 \\ 0 & λ_1 \end{pmatrix}, \quad P = (v_1, v_2).    (6.5.10)

Let us consider matrix powers. Let

    B = \begin{pmatrix} λ_1 & 1 \\ 0 & λ_1 \end{pmatrix}.    (6.5.11)

We can see that B can be written as:

    B = λ_1 I + N, \quad N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.    (6.5.12)

Now, using the binomial theorem and noting that NI = IN, we see that

    B^k = (λ_1 I + N)^k = λ_1^k I^k + k λ_1^{k-1} I^{k-1} N + \frac{k(k-1)}{2} λ_1^{k-2} I^{k-2} N^2 + ⋯.    (6.5.13)

Since N^2 = 0, all terms except for the first two are just the zero matrix. Therefore,

    B^k = λ_1^k I + k λ_1^{k-1} N = \begin{pmatrix} λ_1^k & k λ_1^{k-1} \\ 0 & λ_1^k \end{pmatrix}.    (6.5.14)

Now that we know the matrix powers of B, we can compute the matrix power of A in (6.5.10) as:

    A^k = P B^k P^{-1}.    (6.5.15)

Example 21. Consider the matrix:

    A = \begin{pmatrix} 1 & 1 \\ -1 & 3 \end{pmatrix}.    (6.5.16)

The characteristic equation is:

    λ^2 - 4λ + 4 = (λ - 2)^2 = 0.    (6.5.17)

Therefore, λ = 2 is the only eigenvalue, and the eigenvector corresponding to this value is:

    v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.    (6.5.18)

We now must solve:

    (A - 2I) v_2 = v_1,    (6.5.19)

and one solution to this equation is:

    v_2 = \begin{pmatrix} -1 \\ 0 \end{pmatrix}.    (6.5.20)

Therefore, we have:

    P^{-1} A P = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, \quad P = \begin{pmatrix} 1 & -1 \\ 1 & 0 \end{pmatrix}.    (6.5.21)

The matrix powers of A can be computed as:

    A^k = P \begin{pmatrix} 2^k & k 2^{k-1} \\ 0 & 2^k \end{pmatrix} P^{-1} = \begin{pmatrix} 2^k - k 2^{k-1} & k 2^{k-1} \\ -k 2^{k-1} & 2^k + k 2^{k-1} \end{pmatrix}.    (6.5.22)
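The closed form (6.5.22) can be sanity-checked against repeated multiplication. Matrix entries here are as reconstructed in Example 21, so this is a sketch under that assumption:

```python
import numpy as np

# Example 21: a non-diagonalizable matrix with the single eigenvalue 2.
A = np.array([[1.0, 1.0],
              [-1.0, 3.0]])

def Ak_formula(k):
    """Closed form from (6.5.22): entries built from 2^k and k*2^(k-1)."""
    t = k * 2.0 ** (k - 1)
    return np.array([[2.0**k - t, t],
                     [-t, 2.0**k + t]])

for k in range(1, 8):
    assert np.allclose(Ak_formula(k), np.linalg.matrix_power(A, k))
print("closed form matches brute force for k = 1..7")
```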

Let us now turn to the case when the 2 × 2 matrix A has two complex conjugate eigenvalues. In this case, the two eigenvalues are:

    λ = a + ib, \quad \bar{λ} = a - ib,    (6.5.23)

where a and b are real numbers. The eigenvectors corresponding to λ and \bar{λ} can be written as:

    Av = λv, \quad A\bar{v} = \bar{λ}\bar{v}, \quad v = u + iw, \quad \bar{v} = u - iw,    (6.5.24)

where u and w are real vectors. We see from the above that
    Au = A · \frac{1}{2}(v + \bar{v}) = \frac{1}{2}(λv + \bar{λ}\bar{v}) = au - bw.    (6.5.25)

Likewise,

    Aw = A · \frac{1}{2i}(v - \bar{v}) = \frac{1}{2i}(λv - \bar{λ}\bar{v}) = bu + aw.    (6.5.26)

This shows that:

    AP = PB, \quad B = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}, \quad P = (w, u).    (6.5.27)

(You may have realized that we are letting P = (w, u) instead of P = (u, w). It is slightly more convenient to let P = (w, u).) In fact, the matrix P is invertible. This can be seen as follows: v and \bar{v} must be linearly independent (they correspond to different eigenvalues), and therefore u and w must be linearly independent. Therefore, any matrix with complex conjugate eigenvalues can be written as:

    P^{-1} A P = B, \quad B = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}.    (6.5.28)

Recall that we have seen the matrix B before. It is just a composition of magnification and rotation. Therefore, any matrix with complex conjugate eigenvalues can be seen as a linear transformation involving a magnification and a rotation.
Now, we may consider the matrix powers of A. Note that B can be written as:

    B = \sqrt{a^2 + b^2} \begin{pmatrix} \cos θ & -\sin θ \\ \sin θ & \cos θ \end{pmatrix}, \quad \cos θ = \frac{a}{\sqrt{a^2 + b^2}}, \quad \sin θ = \frac{b}{\sqrt{a^2 + b^2}}.    (6.5.29)

This is just the statement that B is a magnification composed with a rotation. Therefore,

    B^k = (\sqrt{a^2 + b^2})^k \begin{pmatrix} \cos kθ & -\sin kθ \\ \sin kθ & \cos kθ \end{pmatrix}.    (6.5.30)

Using this, we may compute A^k as:

    A^k = P B^k P^{-1}.    (6.5.31)

Example 22. Take the matrix:

    A = \begin{pmatrix} 2 & 1 \\ -2 & 0 \end{pmatrix}.    (6.5.32)

The eigenvalues of this matrix are:

    λ = 1 ± i.    (6.5.33)

The eigenvector corresponding to 1 + i is:

    v = \begin{pmatrix} 1 \\ -1 + i \end{pmatrix}.    (6.5.34)

Therefore, if we set:

    u = \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \quad w = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad P = (w, u),    (6.5.35)

we have:

    P^{-1} A P = B, \quad B = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}.    (6.5.36)

Now,

    B = \sqrt{2} \begin{pmatrix} \cos(π/4) & -\sin(π/4) \\ \sin(π/4) & \cos(π/4) \end{pmatrix}.    (6.5.37)

Thus,

    B^k = (\sqrt{2})^k \begin{pmatrix} \cos(kπ/4) & -\sin(kπ/4) \\ \sin(kπ/4) & \cos(kπ/4) \end{pmatrix}.    (6.5.38)

Using this fact, we may compute A^k as:

    A^k = P B^k P^{-1} = (\sqrt{2})^k \begin{pmatrix} \cos(kπ/4) + \sin(kπ/4) & \sin(kπ/4) \\ -2 \sin(kπ/4) & \cos(kπ/4) - \sin(kπ/4) \end{pmatrix}.    (6.5.39)
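The rotation-magnification closed form (6.5.39) can also be checked against brute-force powers. The entries of A are as reconstructed in Example 22, so again a sketch under that assumption:

```python
import numpy as np

# Example 22: real matrix with complex conjugate eigenvalues 1 +/- i.
A = np.array([[2.0, 1.0],
              [-2.0, 0.0]])

def Ak_formula(k):
    """Closed form from (6.5.39): scale sqrt(2)^k, rotation angle k*pi/4."""
    c, s = np.cos(k * np.pi / 4), np.sin(k * np.pi / 4)
    return np.sqrt(2.0)**k * np.array([[c + s, s],
                                       [-2.0 * s, c - s]])

for k in range(1, 9):
    assert np.allclose(Ak_formula(k), np.linalg.matrix_power(A, k))
print("closed form matches brute force for k = 1..8")
```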

6.6 Exercises

1. Find the eigenvalues and eigenvectors of the following matrices.

    \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}, \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \begin{pmatrix} 2 & 1 \\ 4 & 1 \end{pmatrix}, \begin{pmatrix} 2 & 1 \\ 1 & 4 \end{pmatrix}, \begin{pmatrix} 2 & 2 \\ 1 & 0 \end{pmatrix},

    \begin{pmatrix} 3 & 1 & 1 \\ 1 & 3 & 1 \\ 1 & 1 & 3 \end{pmatrix}, \begin{pmatrix} 1 & 1 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{pmatrix}, \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 2 & 1 & 0 & 1 \\ 1 & 2 & 1 & 0 \\ 0 & 1 & 2 & 1 \\ 1 & 0 & 1 & 2 \end{pmatrix}.    (6.6.1)
2. Diagonalize the 2 × 2 matrices in problem 1, if possible.

3. Find the matrix powers of the 2 × 2 matrices in problem 1.

4. Given a 2 × 2 matrix A, show that its characteristic equation can be written as:

    λ^2 - Tr(A) λ + det(A) = 0.    (6.6.2)

Use this to prove that for 2 × 2 matrices, the sum of the eigenvalues gives the trace and the product of the eigenvalues gives the determinant.

5. Show that, for 2 × 2 matrices A and B, AB and BA always have the same eigenvalues. Hint: use the result from the previous question.
6. Consider a 2 × 2 matrix of the form:

    A = \begin{pmatrix} a & b \\ b & c \end{pmatrix},    (6.6.3)

where a, b, c are real numbers and b ≠ 0. Show that the eigenvalues of this matrix are always real. Show further that the matrix always has two distinct eigenvalues and that the two eigenvectors are orthogonal to each other.

7. Consider two n × n matrices A and B that commute, that is to say, AB = BA. Show that if v is an eigenvector of A, then Bv is also an eigenvector of A, provided Bv ≠ 0.
8. Let λ be an eigenvalue of an n × n matrix A. Consider the set:

    all vectors v ∈ ℝ^n such that Av = λv.    (6.6.4)

Show that this set is a subspace. This subspace is called the eigenspace of λ.

9. An n × n matrix A satisfies:

    A^2 - 3A + 2I = O,    (6.6.5)

where O is the zero matrix. What are the possible eigenvalues of A? (Hint: let v be an eigenvector and apply the above expression to v.)

10. Find all 2 × 2 matrices that satisfy A^2 = O. Hint: first of all, what are its eigenvalues?
11. Consider the matrix:

    A = \begin{pmatrix} a & 1-a \\ b & 1-b \end{pmatrix},    (6.6.6)

where 0 < a < 1 and 0 < b < 1.

(a) Find the eigenvalues and eigenvectors of this matrix.

(b) Find the matrix power of A.

(c) Compute:

    \lim_{k → ∞} A^k.    (6.6.7)
