
The Cayley-Hamilton Theorem and Minimal Polynomials

Paul Skoufranis
March 4, 2012
The purpose of this document is to examine some advanced topics in the theory of linear maps and matrices. Specifically, we will examine when a polynomial expression evaluated at a matrix is the zero matrix. Thus we begin with the following definition. (For these notes, $\chi_T$ will denote the characteristic polynomial of $T$, and we will use $\chi_T(x) = \det(xI_V - T)$ instead of $\det(T - xI_V)$, as we should always do anyway.)
Notation) Let $F$ be a field and let $p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ be a polynomial with $a_n, a_{n-1}, \ldots, a_1, a_0 \in F$. Let $V$ be a vector space over $F$ and let $T : V \to V$ be a linear map. We define $p(T)$ to be the linear map
\[ p(T) = a_n T^n + a_{n-1} T^{n-1} + \cdots + a_1 T + a_0 I_V. \]
Similarly, if $A \in M_{n \times n}(F)$, we define $p(A)$ to be the $n \times n$ matrix
\[ p(A) = a_n A^n + a_{n-1} A^{n-1} + \cdots + a_1 A + a_0 I_n. \]
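Example) Let
\[ A = \begin{pmatrix} 4 & -3 \\ 2 & -1 \end{pmatrix}. \]
If $p(x) = x^2 + x + 1$, then
\[ p(A) = A^2 + A + I_2 = \begin{pmatrix} 10 & -9 \\ 6 & -5 \end{pmatrix} + \begin{pmatrix} 4 & -3 \\ 2 & -1 \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 15 & -12 \\ 8 & -5 \end{pmatrix}. \]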
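The definition of $p(A)$ can be checked numerically. The following Python sketch evaluates a polynomial at a square matrix by Horner's rule; the helper names `mat_mul` and `poly_at_matrix` are illustrative and not part of these notes, and the matrix of the example is entered with integer entries as [[4, -3], [2, -1]].

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, A):
    """Evaluate p(A) where coeffs = [a_0, a_1, ..., a_n] (lowest degree first)."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        # Horner step: result <- result * A + a * I
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += a
    return result


A = [[4, -3], [2, -1]]
print(poly_at_matrix([1, 1, 1], A))   # p(x) = x^2 + x + 1 gives [[15, -12], [8, -5]]
```

Horner's rule is used only to avoid computing each power of A separately; summing the terms $a_k A^k$ directly gives the same matrix.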
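Example) With $A$ as in the above example, notice that the characteristic polynomial of $A$ is
\[ \chi_A(x) = \det(xI_2 - A) = \det\begin{pmatrix} x - 4 & 3 \\ -2 & x + 1 \end{pmatrix} = (x - 4)(x + 1) + 6 = x^2 - 3x + 2 \]
and that
\[ \chi_A(A) = A^2 - 3A + 2I_2 = \begin{pmatrix} 10 & -9 \\ 6 & -5 \end{pmatrix} + \begin{pmatrix} -12 & 9 \\ -6 & 3 \end{pmatrix} + \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}. \]
The fact that $\chi_A(A) = 0$ is no coincidence and is known as the Cayley-Hamilton Theorem.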
The Cayley-Hamilton Theorem) Let $V$ be an $n$-dimensional vector space over a field $F$ and let $T : V \to V$ be a linear map. If $\chi_T$ is the characteristic polynomial of $T$, then $\chi_T(T) = 0_V$ (where $0_V$ is the zero linear operator on $V$). Similarly, if $A \in M_{n \times n}(F)$ and $\chi_A$ is the characteristic polynomial of $A$, then $\chi_A(A) = 0_n$.
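The statement can be tested numerically on small matrices. The sketch below is one possible way to do so in Python, assuming rational entries. It computes the coefficients of $\det(xI - A)$ with the Faddeev-LeVerrier recursion, a technique swapped in here for convenience rather than one used in these notes (it divides by 1, ..., n, so it assumes characteristic zero), and then evaluates that polynomial at A. All function names are hypothetical.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(A):
    """Coefficients [c_0, ..., c_n] of det(xI - A), via Faddeev-LeVerrier."""
    n = len(A)
    coeffs = [Fraction(0)] * n + [Fraction(1)]        # c_n = 1 (monic)
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        # M_k = A * M_{k-1} + c_{n-k+1} * I
        M = mat_mul(A, M)
        for i in range(n):
            M[i][i] += coeffs[n - k + 1]
        # c_{n-k} = -trace(A * M_k) / k
        AM = mat_mul(A, M)
        coeffs[n - k] = -sum(AM[i][i] for i in range(n)) / k
    return coeffs


def poly_at_matrix(coeffs, A):
    """Evaluate sum_k coeffs[k] * A^k by Horner's rule."""
    n = len(A)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += c
    return result


def is_zero(M):
    return all(entry == 0 for row in M for entry in row)


A = [[4, -3], [2, -1]]
coeffs = char_poly(A)
print([int(c) for c in coeffs])             # [2, -3, 1], i.e. x^2 - 3x + 2
print(is_zero(poly_at_matrix(coeffs, A)))   # True
```

Exact `Fraction` arithmetic is used so that the final comparison with the zero matrix is not affected by rounding.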
Remarks) The most common incorrect proof of the Cayley-Hamilton Theorem is the following: We know that $\chi_T(x) = \det(xI_V - T)$. Therefore, if we substitute $T$ for $x$, $\chi_T(T) = \det(T - T) = \det(0_V) = 0$.
The problem with the above argument is that we cannot simply substitute $T$ for $x$. Taking the determinant of $xI_V - T$ gives a polynomial in $x$ into which we can then substitute $T$. However, if we substitute $T$ for $x$ before taking the determinant, we are not getting a polynomial expression in $T$; that is, $\chi_T(T)$ is a linear map whereas $\det(0_V)$ is a scalar.
We will not go through the full proof of the Cayley-Hamilton Theorem in this class. However, we will
prove the result when T is diagonalizable.
Proof of the Cayley-Hamilton Theorem when T is diagonalizable: Suppose $T$ is a diagonalizable linear map. Then, if $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $T$ (including multiplicities), there exists an ordered basis $\beta = \{v_1, v_2, \ldots, v_n\}$ such that $T(v_j) = \lambda_j v_j$ for all $j \in \{1, 2, \ldots, n\}$. Then
\[ [T]_\beta = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n \end{pmatrix}. \]
Write $\chi_T(x) = x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$. Therefore
\begin{align*}
[\chi_T(T)]_\beta &= [T^n + a_{n-1} T^{n-1} + \cdots + a_1 T + a_0 I_V]_\beta \\
&= [T]_\beta^n + a_{n-1} [T]_\beta^{n-1} + \cdots + a_1 [T]_\beta + a_0 I_n \\
&= \begin{pmatrix} \lambda_1^n & 0 & \cdots & 0 \\ 0 & \lambda_2^n & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n^n \end{pmatrix}
 + \cdots + a_1 \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n \end{pmatrix}
 + a_0 \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & 1 \end{pmatrix} \\
&= \begin{pmatrix} \chi_T(\lambda_1) & 0 & \cdots & 0 \\ 0 & \chi_T(\lambda_2) & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \chi_T(\lambda_n) \end{pmatrix}.
\end{align*}
However, since each eigenvalue $\lambda_j$ is a root of $\chi_T$, $[\chi_T(T)]_\beta$ is the zero matrix and thus $\chi_T(T) = 0_V$ as desired.
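For a concrete instance of the diagonal computation above, consider the diagonal matrix $D$ with diagonal entries 2 and 5. This matrix is an illustrative choice and does not appear in the original notes.

```latex
\[
D = \begin{pmatrix} 2 & 0 \\ 0 & 5 \end{pmatrix}, \qquad
\chi_D(x) = (x - 2)(x - 5) = x^2 - 7x + 10,
\]
\[
\chi_D(D) = D^2 - 7D + 10 I_2
= \begin{pmatrix} 4 & 0 \\ 0 & 25 \end{pmatrix}
- \begin{pmatrix} 14 & 0 \\ 0 & 35 \end{pmatrix}
+ \begin{pmatrix} 10 & 0 \\ 0 & 10 \end{pmatrix}
= \begin{pmatrix} \chi_D(2) & 0 \\ 0 & \chi_D(5) \end{pmatrix}
= \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.
\]
```

Each diagonal entry is the characteristic polynomial evaluated at an eigenvalue, which is exactly why it vanishes.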
Note that the above proof also holds for diagonalizable matrices due to our knowledge of the connection between linear maps and matrices. With our knowledge of the Cayley-Hamilton Theorem, we turn to another question: Given a matrix $A$, can we find a non-zero polynomial $p$ of smallest degree for which $p(A) = 0$?
Example) Let
\[ A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}. \]
Then $\chi_A(x) = (x - 1)^2$ and it is clear that
\[ \chi_A(A) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}^2 = 0_2. \]
However, it is clear that if $p(x) = x - a$ for any $a \in F$, then $p(A) \neq 0_2$. Therefore $\chi_A$ is a non-zero polynomial such that $\chi_A(A) = 0_2$ yet $p(A) \neq 0_2$ for all non-zero polynomials $p$ with $\deg(p) < \deg(\chi_A)$.
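To spell out why no non-zero polynomial of degree less than 2 can annihilate this matrix, here is the short computation for a general polynomial $p(x) = cx + d$ of degree at most 1. The symbols $c$ and $d$ are introduced only for this sketch.

```latex
\[
p(A) = cA + dI_2
= c \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} + d \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} c + d & c \\ 0 & c + d \end{pmatrix}.
\]
```

If $p(A) = 0_2$, the upper-right entry forces $c = 0$ and then the diagonal forces $d = 0$, so $p$ is the zero polynomial.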
The above example leads us to the following result. From this point onward, we will only state our results for linear maps between finite dimensional vector spaces as the analogous results for matrices will hold due to the connection between linear maps and matrices.
Proposition) Let $V$ be a finite dimensional vector space over a field $F$ and let $T : V \to V$ be a non-zero linear map. Then there exists an $n \in \mathbb{N}$ and a unique monic polynomial $m_T$ of degree $n$ (monic meaning the coefficient of $x^n$ is 1) such that $m_T(T) = 0_V$ yet $q(T) \neq 0_V$ for all non-zero polynomials $q$ with degree less than $n$. The polynomial $m_T$ is called the minimal polynomial of $T$.
Proof: By the Cayley-Hamilton Theorem, we know that
\[ Z = \{ m \in \mathbb{N} \mid \text{there exists a polynomial } p \text{ of degree } m \text{ such that } p(T) = 0_V \} \]
is a non-empty set as $\chi_T(T) = 0_V$. Let $n$ be the smallest number in $Z$. Therefore there exists a polynomial $p$ of degree $n$ such that $p(T) = 0_V$. Since the leading coefficient $a_n$ of $p$ is non-zero, if we divide $p$ by $a_n$, we obtain a monic polynomial $m_T$ of degree $n$ such that $m_T(T) = 0_V$. Moreover, if $q$ is a non-zero polynomial with degree less than $n$, then $q(T) \neq 0_V$ or else $\deg(q)$ would be an element of $Z$, which contradicts the minimality of $n$.
To see that $m_T$ is unique, suppose $p$ is another monic polynomial of degree $n$ such that $p(T) = 0_V$. Consider the polynomial $q = m_T - p$. Since $m_T$ and $p$ were both monic polynomials of degree $n$, $q$ is a polynomial of degree strictly less than $n$. However $q(T) = m_T(T) - p(T) = 0_V - 0_V = 0_V$. Therefore, due to the minimality of $n$, we must have that $q = 0$ so $m_T = p$ as desired.
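The existence argument suggests a direct, if naive, way to compute a minimal polynomial of a matrix: look for the smallest $d$ such that $A^d$ is a linear combination of $I, A, \ldots, A^{d-1}$. The Python sketch below does this with exact rational arithmetic. It is only an illustration under the assumption of rational entries, and the function names are hypothetical.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def flatten(M):
    return [entry for row in M for entry in row]


def solve_combination(basis, target):
    """Return c with sum_k c[k] * basis[k] == target, or None if impossible.

    Assumes the vectors in basis are linearly independent.
    """
    d = len(basis)
    # One row per coordinate; the last column holds the target vector.
    rows = [[Fraction(b[i]) for b in basis] + [Fraction(target[i])]
            for i in range(len(target))]
    for col in range(d):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            return None  # only reachable if basis was not independent
        rows[col], rows[pivot] = rows[pivot], rows[col]
        lead = rows[col][col]
        rows[col] = [x / lead for x in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    # Remaining rows are zero on the left, so they must be zero on the right.
    if any(rows[r][d] != 0 for r in range(d, len(rows))):
        return None
    return [rows[k][d] for k in range(d)]


def minimal_polynomial(A):
    """Coefficients [c_0, ..., c_{d-1}, 1] of the monic minimal polynomial of A."""
    n = len(A)
    powers = [[[int(i == j) for j in range(n)] for i in range(n)]]  # starts with I
    while True:
        nxt = mat_mul(powers[-1], A)
        coeffs = solve_combination([flatten(P) for P in powers], flatten(nxt))
        if coeffs is not None:
            # A^d = sum_k c_k A^k, so m(x) = x^d - sum_k c_k x^k
            return [-c for c in coeffs] + [Fraction(1)]
        powers.append(nxt)


ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print([int(c) for c in minimal_polynomial(ones)])   # [0, -3, 1], i.e. x^2 - 3x
```

The loop is guaranteed to stop by degree n because of the Cayley-Hamilton Theorem, and the powers collected before it stops are linearly independent by the minimality of the degree.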
When trying to compute the minimal polynomial of a linear map, the following two results are extremely
useful.
Lemma) Let $V$ be an $n$-dimensional vector space over a field $F$ and let $T : V \to V$ be a linear map. If $\lambda$ is an eigenvalue of $T$, then $\lambda$ is a root of $m_T$.
Proof: Let $\lambda$ be an eigenvalue of $T$. Therefore, there exists a non-zero vector $v \in V$ such that $T(v) = \lambda v$. Suppose $\lambda$ is not a root of $m_T$. By using the Division Algorithm for Polynomials (see Theorem E.1 of the text), we can write $m_T(x) = q(x)(x - \lambda) + r(x)$ where $r$ is a non-zero polynomial with $\deg(r) < 1$ (that is, $r$ is a non-zero constant polynomial). Therefore
\[ \vec{0}_V = m_T(T)(v) = (q(T)(T - \lambda I_V) + r(T))(v) = q(T)(T(v) - \lambda v) + r(T)(v) = \vec{0}_V + r(T)(v) = r(T)(v). \]
However, since $r$ is a non-zero constant polynomial and $v \neq \vec{0}_V$, $r(T)(v) \neq \vec{0}_V$ so we have a contradiction. Hence $\lambda$ must be a root of $m_T$.
Lemma) Let $V$ be an $n$-dimensional vector space over a field $F$ and let $T : V \to V$ be a linear map. If $p$ is a non-zero polynomial such that $p(T) = 0_V$, then $m_T$ divides $p$ (that is, there exists another polynomial $q$ such that $p(x) = q(x) m_T(x)$). Therefore, by the Cayley-Hamilton Theorem, $m_T$ divides $\chi_T$.
Proof: Suppose $p$ is a non-zero polynomial such that $p(T) = 0_V$ yet $m_T$ does not divide $p$. By using the Division Algorithm for Polynomials (see Theorem E.1 of the text), we can write $p(x) = q(x) m_T(x) + r(x)$ where $r$ is a non-zero polynomial with $\deg(r) < \deg(m_T)$. However, this implies that $r(T) = p(T) - q(T) m_T(T) = 0_V - 0_V = 0_V$. Hence $r$ is a non-zero polynomial with $\deg(r) < \deg(m_T)$ such that $r(T) = 0_V$. This clearly contradicts the definition of the minimal polynomial of $T$ and thus completes the proof of the first claim.
Note $\chi_T(T) = 0_V$ by the Cayley-Hamilton Theorem. Hence $m_T$ divides $\chi_T$ by the above result.
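Both lemmas rest on the Division Algorithm for Polynomials. As a rough illustration, the following Python sketch performs polynomial long division on coefficient lists over the rationals; the function name `poly_divmod` and the coefficient-list convention (lowest degree first) are choices made for this example only.

```python
from fractions import Fraction


def poly_divmod(p, m):
    """Divide p by m, both given as coefficient lists, lowest degree first.

    Returns (q, r) with p = q*m + r and r of length len(m) - 1 at most.
    Assumes the leading coefficient m[-1] is non-zero.
    """
    r = [Fraction(c) for c in p]
    q = [Fraction(0)] * max(len(p) - len(m) + 1, 1)
    for shift in range(len(p) - len(m), -1, -1):
        # Cancel the current top coefficient of the running remainder.
        factor = r[shift + len(m) - 1] / m[-1]
        q[shift] = factor
        for k, c in enumerate(m):
            r[shift + k] -= factor * c
    return q, r[:len(m) - 1]


# x^3 - 3x^2 divided by x^2 - 3x: quotient x, remainder 0
q, r = poly_divmod([0, 0, -3, 1], [0, -3, 1])
print([int(c) for c in q], [int(c) for c in r])   # [0, 1] [0, 0]
```

A remainder whose coefficients are all zero means the divisor divides the dividend, which is the situation described in the lemma for $m_T$ and any polynomial annihilating $T$.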
Using the above results, we can simplify the computation of the minimal polynomial of a matrix.
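Example) Let $F$ be a field where $3 \neq 0$ and let
\[ A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}. \]
Then
\begin{align*}
\chi_A(\lambda) &= \det\begin{pmatrix} \lambda - 1 & -1 & -1 \\ -1 & \lambda - 1 & -1 \\ -1 & -1 & \lambda - 1 \end{pmatrix} \\
&= (\lambda - 1)^3 + (-1)(-1)(-1) + (-1)(-1)(-1) - (-1)(\lambda - 1)(-1) - (-1)(-1)(\lambda - 1) - (\lambda - 1)(-1)(-1) \\
&= \lambda^3 - 3\lambda^2 = \lambda^2(\lambda - 3).
\end{align*}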
Thus the distinct eigenvalues of $A$ are 0 and 3. Since $m_A$ divides $\chi_A$ and every eigenvalue of $A$ is a root of $m_A$, we must have that $m_A(x) = x(x - 3)$ or $m_A(x) = x^2(x - 3)$. To check which of these works, we start with the one of smaller degree:
\[ A(A - 3I_3) = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} -2 & 1 & 1 \\ 1 & -2 & 1 \\ 1 & 1 & -2 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \]
Hence the minimal polynomial of $A$ is $x(x - 3)$.
Using our proof of the Cayley-Hamilton Theorem in the diagonalizable case and the above lemmas, it is
easy to compute the minimal polynomial of a diagonalizable linear operator.
Theorem) Let $V$ be an $n$-dimensional vector space over a field $F$ and let $T : V \to V$ be a linear map. If $\{\lambda_1, \lambda_2, \ldots, \lambda_k\}$ are the distinct eigenvalues of $T$ and $T$ is diagonalizable, then the minimal polynomial of $T$ is $m_T(x) = (x - \lambda_1)(x - \lambda_2) \cdots (x - \lambda_k)$.
Proof: Since $T$ is diagonalizable, if $\mu_1, \mu_2, \ldots, \mu_n$ are the eigenvalues of $T$ (including multiplicities, so that each $\mu_j$ is one of $\lambda_1, \ldots, \lambda_k$), there exists an ordered basis $\beta = \{v_1, v_2, \ldots, v_n\}$ such that $T(v_j) = \mu_j v_j$ for all $j \in \{1, 2, \ldots, n\}$. Then
\[ [T]_\beta = \begin{pmatrix} \mu_1 & 0 & \cdots & 0 \\ 0 & \mu_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \mu_n \end{pmatrix}. \]
Let $p(x) = (x - \lambda_1)(x - \lambda_2) \cdots (x - \lambda_k)$ and write $p(x) = x^k + a_{k-1} x^{k-1} + \cdots + a_1 x + a_0$. Therefore
\begin{align*}
[p(T)]_\beta &= [T^k + a_{k-1} T^{k-1} + \cdots + a_1 T + a_0 I_V]_\beta \\
&= [T]_\beta^k + a_{k-1} [T]_\beta^{k-1} + \cdots + a_1 [T]_\beta + a_0 I_n \\
&= \begin{pmatrix} \mu_1^k & 0 & \cdots & 0 \\ 0 & \mu_2^k & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \mu_n^k \end{pmatrix}
 + \cdots + a_1 \begin{pmatrix} \mu_1 & 0 & \cdots & 0 \\ 0 & \mu_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \mu_n \end{pmatrix}
 + a_0 \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & 1 \end{pmatrix} \\
&= \begin{pmatrix} p(\mu_1) & 0 & \cdots & 0 \\ 0 & p(\mu_2) & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & p(\mu_n) \end{pmatrix}.
\end{align*}
However, since each eigenvalue $\mu_j$ is a root of $p$, $[p(T)]_\beta$ is the zero matrix and thus $p(T) = 0_V$ as desired.


If $m_T$ is the minimal polynomial of $T$, then $m_T$ divides $p$ by the above results. Moreover, $\lambda_j$ must be a root of $m_T$ for all $j \in \{1, \ldots, k\}$ by the above results, so each $x - \lambda_j$ is a factor of $m_T$; since the $\lambda_j$ are distinct, $p$ divides $m_T$. Since $m_T$ and $p$ are both monic polynomials, this implies that $p = m_T$ as desired.
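A small worked case may help separate the number of distinct eigenvalues from the dimension. Take the diagonal matrix $D$ with diagonal entries 2, 2, 5, an illustrative choice that does not appear in the original notes. Its distinct eigenvalues are 2 and 5, so the theorem predicts a minimal polynomial of degree 2 even though $\chi_D$ has degree 3.

```latex
\[
D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 5 \end{pmatrix}, \qquad
\chi_D(x) = (x - 2)^2 (x - 5), \qquad
p(x) = (x - 2)(x - 5),
\]
\[
p(D) = (D - 2I_3)(D - 5I_3)
= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 3 \end{pmatrix}
\begin{pmatrix} -3 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\]
```

Neither factor alone is the zero matrix, so no polynomial of degree 1 annihilates $D$ and $p$ is its minimal polynomial.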
It turns out that the converse of the above result is true. That is:
Theorem) Let $V$ be an $n$-dimensional vector space over a field $F$ and let $T : V \to V$ be a linear map. If $\{\lambda_1, \lambda_2, \ldots, \lambda_k\}$ are the distinct eigenvalues of $T$ and the minimal polynomial of $T$ is $m_T(x) = (x - \lambda_1)(x - \lambda_2) \cdots (x - \lambda_k)$, then $T$ is diagonalizable.
With the above theorem and our computation of the minimal polynomial, we see that the matrix
\[ A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} \]
must be diagonalizable (when $3 \neq 0$ in our field). The beauty of this approach is that we did not need to compute the eigenspaces of $A$ in order to determine whether or not $A$ is diagonalizable.
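Taken together, the two theorems give a test that can be sketched in a few lines of Python: multiply the matrices $A - \lambda_j I$ over the distinct eigenvalues and check whether the product is zero. The sketch assumes the caller supplies every eigenvalue of A exactly once and that the entries are integers or rationals; the function name `is_diagonalizable` is hypothetical.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_diagonalizable(A, distinct_eigenvalues):
    """Check whether the product of (A - lam*I) over the eigenvalues is zero.

    Assumes distinct_eigenvalues lists every eigenvalue of A exactly once.
    """
    n = len(A)
    product = [[int(i == j) for j in range(n)] for i in range(n)]
    for lam in distinct_eigenvalues:
        shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
                   for i in range(n)]
        product = mat_mul(product, shifted)
    return all(entry == 0 for row in product for entry in row)


ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
jordan = [[1, 1], [0, 1]]
print(is_diagonalizable(ones, [0, 3]))   # True
print(is_diagonalizable(jordan, [1]))    # False
```

The second matrix is the earlier 2 x 2 example whose minimal polynomial is $(x - 1)^2$, so the product over its single eigenvalue is non-zero and the test reports that it is not diagonalizable.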