
Graduate Linear Algebra

Jim L. Brown
June 19, 2014
Contents

1 Introduction
2 Vector spaces
  2.1 Getting started
  2.2 Bases and dimension
  2.3 Direct sums and quotient spaces
  2.4 Dual spaces
  2.5 Problems
3 Choosing coordinates
  3.1 Linear transformations and matrices
  3.2 Transpose of a matrix via the dual map
  3.3 Problems
4
  4.1 Invariant subspaces
  4.2 T-invariant complements
  4.3 Rational canonical form
  4.4 Jordan canonical form
  4.5 Multiple linear transformations
  4.6 Problems
5
  5.1 Basic definitions and facts
  5.2 Symmetric, skew-symmetric, Hermitian, and skew-Hermitian forms
  5.3 Section 4.3
  5.4 Problems
6
  6.1 Inner product spaces
  6.2 The spectral theorem
  6.3 Problems
7
  7.1 Extension of scalars
  7.2 Tensor products of vector spaces
  7.3 Alternating forms, exterior powers, and the determinant
A Groups, rings, and fields: a quick recap
Chapter 1
Introduction
These notes are the product of teaching Math 8530 several times at Clemson
University. This course serves as the first year breadth course for the Algebra
and Discrete Mathematics subfaculty and is the only algebra course many of
our graduate students will see. As such, this course was designed to be a serious
introduction to linear algebra at the graduate level, giving students a flavor of
what modern abstract algebra consists of. While undergraduate abstract algebra
is not a formal prerequisite, it certainly helps. To aid with this, an appendix
reviewing essential undergraduate concepts is included.
It is assumed students have had an undergraduate course in basic linear
algebra and are comfortable with concepts such as multiplying matrices, Gaussian
elimination, finding the null and column space of matrices, etc. In this course
we work with abstract vector spaces over an arbitrary field and linear
transformations. We do not limit ourselves to R^n and matrices, but often do rephrase
results in terms of matrices for the convenience of the reader. Once a student
has completed and mastered the material in these notes s/he should have no
trouble translating these results into the results typically presented in a first or
second semester undergraduate linear algebra or matrix analysis course.
While it certainly would be helpful to use modules at several places in these
notes, they are purposefully avoided so as not to stray too far from the original
intent of the course. Notes using modules may be forthcoming at some point in
the future.
Please report any errors you find in these notes to me so that they may be
corrected.
The motivation for typing these notes was provided by Kara Stasikelis. She
provided her typed class notes from the Summer 2013 offering of this course. These were
greatly appreciated and gave me the starting point from which I typed this
version.
Chapter 2
Vector spaces
In this chapter we give the basic definitions and facts having to do with vector
spaces that will be used throughout the rest of the course. Many of the results
in this chapter are covered in an undergraduate linear algebra class in terms of
matrices. We often convert the language back to that of matrices, but the focus
is on abstract vector spaces and linear transformations.
Throughout this chapter F is a field.
2.1 Getting started
We begin with the definition of a vector space.

Definition 2.1.1. Let V be a non-empty set with an operation

    V × V → V
    (v, w) ↦ v + w

referred to as addition and an operation

    F × V → V
    (c, v) ↦ cv

referred to as scalar multiplication so that V satisfies the following properties:
1. (V, +) is an abelian group;
2. c(v + w) = cv + cw for all c ∈ F, v, w ∈ V;
3. (c + d)v = cv + dv for all c, d ∈ F, v ∈ V;
4. (cd)v = c(dv) for all c, d ∈ F, v ∈ V;
5. 1_F · v = v for all v ∈ V.
We say V is a vector space (or an F-vector space). If we need to emphasize the
addition and scalar multiplication we write (V, +, ·) instead of just V. We call
elements of V vectors and elements of F scalars.
We now recall some familiar examples from undergraduate linear algebra.
The verification that these are actually vector spaces is left as an exercise.
Example 2.1.2. Set F^n = { t(a_1, a_2, ..., a_n) : a_i ∈ F }, the set of column
vectors of length n with entries in F. Then F^n is a vector space with addition
given by

    t(a_1, ..., a_n) + t(b_1, ..., b_n) = t(a_1 + b_1, ..., a_n + b_n)

and scalar multiplication given by

    c · t(a_1, ..., a_n) = t(ca_1, ..., ca_n).
Example 2.1.3. Let F and K be fields and suppose F ⊆ K. Then K is an
F-vector space. The typical example of this one sees in undergraduate linear
algebra is C as an R-vector space.
Example 2.1.4. Let m, n ∈ Z_{>0}. Set Mat_{m,n}(F) to be the m by n matrices
with entries in F. This forms an F-vector space with addition being matrix
addition and scalar multiplication given by multiplying each entry of the matrix
by the scalar. In the case m = n we write Mat_n(F) for Mat_{n,n}(F).
Example 2.1.5. Let n ∈ Z_{≥0}. Define

    P_n(F) = { f ∈ F[x] : deg(f) ≤ n }.

This is an F-vector space. We also have that

    F[x] = ⋃_{n ≥ 0} P_n(F)

is an F-vector space.
Example 2.1.6. Let U and V be subsets of R. Define C^0(U, V) to be the
set of continuous functions from U to V. This forms an R-vector space. Let
f, g ∈ C^0(U, V) and c ∈ R. Addition is defined point-wise, namely, (f + g)(x) =
f(x) + g(x) for all x ∈ U. Scalar multiplication is given point-wise as well:
(cf)(x) = cf(x) for all x ∈ U.
More generally, for k ∈ Z_{≥0} we let C^k(U, V) be the functions from U to V
so that f^{(j)} is continuous for all 0 ≤ j ≤ k, where f^{(j)} denotes the jth derivative
of f. This is an R-vector space. If U = V we write C^k(U) for C^k(U, U).
We set C^∞(U, V) to be the set of smooth functions, i.e., f ∈ C^∞(U, V) if
f ∈ C^k(U, V) for all k ≥ 0. This is an R-vector space. If U = V we write
C^∞(U) for C^∞(U, U).
Example 2.1.7. Set

    F^∞ = { (a_1, a_2, ...) : a_i ∈ F, a_i = 0 for all but finitely many i }.

This is an F-vector space. Set

    F^N = { (a_1, a_2, ...) : a_i ∈ F }.

This is an F-vector space as well.
Example 2.1.8. Consider the sphere

    S^{n−1} = { (x_1, x_2, ..., x_n) ∈ R^n : x_1^2 + x_2^2 + ⋯ + x_n^2 = 1 }.

Let p = (a_1, ..., a_n) be a point on the sphere. We can realize the sphere as the
w = 1 level surface of the function w = f(x_1, ..., x_n) = x_1^2 + ⋯ + x_n^2. The
gradient of f is ∇f = 2x_1 e_1 + 2x_2 e_2 + ⋯ + 2x_n e_n where e_1 = (1, 0, ..., 0),
e_2 = (0, 1, 0, ..., 0), ..., e_n = (0, ..., 0, 1). This gives that the tangent plane at the
point p is given by

    2a_1(x_1 − a_1) + 2a_2(x_2 − a_2) + ⋯ + 2a_n(x_n − a_n) = 0.

[Figure: the tangent plane to the sphere at p, pictured in the n = 2 case.]
Note the tangent plane is not a vector space because it does not contain a zero
vector, but if we shift it to the origin we have a vector space called the tangent
space of S^{n−1} at p:

    T_p(S^{n−1}) = { (x_1, ..., x_n) ∈ R^n : 2a_1 x_1 + ⋯ + 2a_n x_n = 0 }.

[Figure: the shifted tangent space for the example pictured above.]
Lemma 2.1.9. Let V be an F-vector space.
1. The zero element 0_V ∈ V is unique.
2. We have 0_F · v = 0_V for all v ∈ V.
3. We have (−1_F) · v = −v for all v ∈ V.

Proof. Suppose that 0 and 0′ both satisfy the conditions necessary to be 0_V, i.e.,

    0 + v = v = v + 0 for all v ∈ V,
    0′ + v = v = v + 0′ for all v ∈ V.

We apply the second identity with v = 0 to obtain 0 = 0 + 0′, and now use the
first with v = 0′ to obtain 0 + 0′ = 0′. Thus, we have that 0 = 0′.
Observe that 0_F · v = (0_F + 0_F) · v = 0_F · v + 0_F · v. So if we subtract 0_F · v
from both sides we get that 0_V = 0_F · v.
Finally, we have (−1_F) · v + v = (−1_F) · v + 1_F · v = (−1_F + 1_F) · v =
0_F · v = 0_V, i.e., (−1_F) · v + v = 0_V. So (−1_F) · v is the (unique) additive inverse
of v, i.e., (−1_F) · v = −v.
Note we will often drop the subscripts on 0 and 1 when they are clear from
context.
Denition 2.1.10. Let V be an F-vector space and let W be a nonempty
subset of V . If W is an F-vector space with the same operations as V we call
W a subspace (or an F-subspace) of V .
Example 2.1.11. We saw above that V = C is an R-vector space. Set W =
{x + 0 · i : x ∈ R}. Then W is an R-subspace of V. We have that V is also a
C-vector space. However, W is not a C-subspace of V as it is not closed under
scalar multiplication by i.
Example 2.1.12. Let V = R^2. In the graph below, W_1 is a subspace but W_2 is
not. It is easy to check that any line passing through the origin is a subspace,
but a line not passing through the origin cannot be a subspace because it does
not contain the zero element.
[Figure: two lines in R^2, W_1 through the origin and W_2 not.]
Example 2.1.13. Let n ∈ Z_{≥0}. We have P_j(F) is a subspace of P_n(F) for all
0 ≤ j ≤ n.

Example 2.1.14. The F-vector space F^∞ is a subspace of F^N.

Example 2.1.15. Let V = Mat_n(F). The set of diagonal matrices is a subspace
of V.
The following lemma gives an easy-to-check criterion for when one has a
subspace. The proof is left as an exercise.

Lemma 2.1.16. Let V be an F-vector space and W ⊆ V. Then W is a subspace
of V if
1. W is nonempty;
2. W is closed under addition;
3. W is closed under scalar multiplication.
As is customary in algebra, in order to study objects we first introduce the
subobjects (subspaces in our case) and then the appropriate maps between the
objects of interest. In our case these are linear transformations.
Definition 2.1.17. Let V and W be F-vector spaces and let T : V → W be a
map. We say T is a linear transformation (or F-linear) if
1. T(v_1 + v_2) = T(v_1) + T(v_2) for all v_1, v_2 ∈ V;
2. T(cv) = cT(v) for all c ∈ F, v ∈ V.
The collection of all F-linear maps from V to W is denoted Hom_F(V, W).
Example 2.1.18. Define id_V : V → V by id_V(v) = v for all v ∈ V. Then
id_V ∈ Hom_F(V, V). We refer to this as the identity transformation.
Example 2.1.19. Let T : C → C be defined by T(z) = \overline{z}, complex conjugation.
Since C is both a C- and an R-vector space, it is natural to ask if T is C-linear
or R-linear. Observe we have

    T(z + w) = \overline{z + w} = \overline{z} + \overline{w} = T(z) + T(w)

for all z, w ∈ C. However, if we let c ∈ C we have

    T(cv) = \overline{cv} = \overline{c} T(v).

Note that if T(v) ≠ 0, then T(cv) = cT(v) if and only if c = \overline{c}. Thus, T is not
C-linear but is R-linear.
Example 2.1.20. Let m, n ∈ Z_{≥1}. Let A ∈ Mat_{m,n}(F). Define T_A : F^n → F^m
by T_A(x) = Ax. This is an F-linear map.
Example 2.1.21. Set V = C^∞(R). In this example we give several linear maps
that arise in calculus.
Given any a ∈ R, define

    E_a : V → R
    f ↦ f(a).

We have E_a ∈ Hom_R(V, R) for every a ∈ R.
Define

    D : V → V
    f ↦ f′.

We have D ∈ Hom_R(V, V).
Let a ∈ R. Define

    I_a : V → V
    f ↦ ∫_a^x f(t) dt.

We have I_a ∈ Hom_R(V, V) for every a ∈ R.
Let a ∈ R. Define

    Ẽ_a : V → V
    f ↦ f(a),

where here we view f(a) as the constant function. We have Ẽ_a ∈ Hom_R(V, V).
We can use these linear maps to rephrase the fundamental theorems of
calculus as follows:
1. D ∘ I_a = id_V;
2. I_a ∘ D = id_V − Ẽ_a.
Exercise 2.1.22. Let V and W be F-vector spaces. Show that Hom_F(V, W)
is an F-vector space.

Lemma 2.1.23. Let T ∈ Hom_F(V, W). Then T(0_V) = 0_W.
Proof. This is proved using the properties of the additive identity element:

    T(0_V) = T(0_V + 0_V) = T(0_V) + T(0_V),

i.e., T(0_V) = T(0_V) + T(0_V). Now, subtract T(0_V) from both sides to obtain
0_W = T(0_V) as desired.
Exercise 2.1.24. Let U, V, W be F-vector spaces. Let S ∈ Hom_F(U, V) and
T ∈ Hom_F(V, W). Then T ∘ S ∈ Hom_F(U, W).
In the next chapter we will focus on matrices and their relation with linear
maps much more closely, but we have the following elementary result that we
can prove immediately.
Lemma 2.1.25. Let m, n ∈ Z_{≥1}.
1. Let A, B ∈ Mat_{m,n}(F). Then A = B if and only if T_A = T_B.
2. Every T ∈ Hom_F(F^n, F^m) is given by T_A for some A ∈ Mat_{m,n}(F).
Proof. If A = B then clearly we must have T_A = T_B from the definition.
Conversely, suppose T_A = T_B. Consider the standard vectors e_1 = t(1, 0, ..., 0),
e_2 = t(0, 1, 0, ..., 0), ..., e_n = t(0, ..., 0, 1). We have T_A(e_j) = T_B(e_j) for
j = 1, ..., n. However, it is easy to see that T_A(e_j) is the jth column of A.
Thus, A = B.
Let T ∈ Hom_F(F^n, F^m). Set

    A = ( T(e_1) | ⋯ | T(e_n) ),

the matrix whose jth column is T(e_j). It is now easy to check that T = T_A.
In general, we do not multiply vectors. However, if V = Mat_n(F), we can
multiply vectors! So V is a vector space, but also a ring. In this case, V
is an example of an F-algebra. Though we will not be concerned with algebras
in these notes, we give the definition here for the sake of completeness.

Definition 2.1.26. An F-algebra is a ring A with a multiplicative identity
together with a ring homomorphism f : F → A mapping 1_F to 1_A so that
f(F) is contained in the center of A, i.e., if a ∈ A and f(c) ∈ f(F), then
af(c) = f(c)a.
Exercise 2.1.27. Show that F[x] is an F-algebra.

The fact that we can multiply in Mat_n(F) is due to the fact that we can
compose linear maps. In fact, it is natural to define matrix multiplication to be
the matrix associated to the composition of the associated linear maps. This
definition explains the bizarre multiplication rule defined on matrices.

Definition 2.1.28. Let m, n and p be positive integers. Let A ∈ Mat_{m,n}(F),
B ∈ Mat_{n,p}(F). Then AB ∈ Mat_{m,p}(F) is the matrix corresponding to T_A ∘ T_B.

Exercise 2.1.29. Show this definition of matrix multiplication agrees with that
given in an undergraduate linear algebra class.
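A quick numeric sanity check of Definition 2.1.28 (with random integer matrices
of our own choosing): multiplying by AB agrees with applying T_B and then T_A.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(2, 3))   # T_A : F^3 -> F^2
B = rng.integers(-3, 4, size=(3, 4))   # T_B : F^4 -> F^3

v = rng.integers(-3, 4, size=4)
# T_{AB}(v) = (T_A ∘ T_B)(v)
assert np.array_equal((A @ B) @ v, A @ (B @ v))
```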
Definition 2.1.30. Let T ∈ Hom_F(V, W) be invertible, i.e., there exists T^{−1} ∈
Hom_F(W, V) so that T ∘ T^{−1} = id_W and T^{−1} ∘ T = id_V. We say T is an
isomorphism, we say V and W are isomorphic, and we write V ≅ W.
Exercise 2.1.31. Let T ∈ Hom_F(V, W). Show that T is an isomorphism if and
only if T is bijective.
Example 2.1.32. Let V = R^2, W = C. These are both R-vector spaces. Define

    T : R^2 → C
    (x, y) ↦ x + iy.

We have T ∈ Hom_R(V, W). It is easy to see this is an isomorphism with inverse
given by T^{−1}(x + iy) = (x, y). Thus, C ≅ R^2 as R-vector spaces.
Example 2.1.33. Let V = P_n(F) and W = F^{n+1}. Define a map T : V → W
by sending a_0 + a_1 x + ⋯ + a_n x^n to t(a_0, ..., a_n). It is elementary to check this is
an isomorphism.
Example 2.1.34. Let V = Mat_n(F) and W = F^{n^2}. Define T : V → W by
sending A = (a_{i,j}) to t(a_{1,1}, a_{1,2}, ..., a_{n,n}). This gives an isomorphism between
V and W.
Definition 2.1.35. Let T ∈ Hom_F(V, W). The kernel of T is given by

    ker T = { v ∈ V : T(v) = 0_W }.

The image of T is given by

    Im T = { w ∈ W : there exists v ∈ V with T(v) = w } = T(V).

One should note that in undergraduate linear algebra one often refers to
ker T as the null space and Im T as the column space when T = T_A for some
A ∈ Mat_{m,n}(F).
Lemma 2.1.36. Let T ∈ Hom_F(V, W). Then ker T is a subspace of V and
Im T is a subspace of W.

Proof. First, observe that 0_V ∈ ker T so ker T is nonempty. Now let v_1, v_2 ∈
ker T and c ∈ F. Then we have

    T(v_1 + v_2) = T(v_1) + T(v_2) = 0_W + 0_W = 0_W

and

    T(cv_1) = cT(v_1) = c · 0_W = 0_W.

Thus, ker T is a subspace of V.
Next we show that Im T is a subspace. We have Im T is nonempty because
T(0_V) = 0_W ∈ Im T. Let w_1, w_2 ∈ Im T and c ∈ F. There exist v_1, v_2 ∈ V
such that T(v_1) = w_1 and T(v_2) = w_2. Thus,

    w_1 + w_2 = T(v_1) + T(v_2) = T(v_1 + v_2)

and

    cw_1 = cT(v_1) = T(cv_1).

Thus, w_1 + w_2 and cw_1 are both in Im T, i.e., Im T is a subspace of W.
Example 2.1.37. Let m, n ∈ Z_{>0} with m > n. Define T : F^m → F^n by

    T(t(a_1, a_2, ..., a_m)) = t(a_1, a_2, ..., a_n).

The image of this map is F^n and the kernel is given by

    ker T = { t(0, ..., 0, a_{n+1}, ..., a_m) : a_i ∈ F }.

It is easy to see that ker T ≅ F^{m−n}.
Define S : F^n → F^m by

    S(t(a_1, a_2, ..., a_n)) = t(a_1, a_2, ..., a_n, 0, ..., 0),

where there are m − n zeroes. The image of this map is isomorphic to F^n
and the kernel is trivial.
Example 2.1.38. Define T : Q[x] → R by T(f(x)) = f(√3). This is a Q-linear
map. The kernel of this map consists of those polynomials f(x) ∈ Q[x]
satisfying f(√3) = 0, i.e., those polynomials f(x) for which (x^2 − 3) | f(x).
Thus, ker T = (x^2 − 3)Q[x]. The image of this map clearly contains Q(√3) =
{a + b√3 : a, b ∈ Q}. Let α ∈ Im(T). Then there exists f(x) ∈ Q[x] so that
f(√3) = α. Write f(x) = q(x)(x^2 − 3) + r(x) with deg r(x) < 2. Then

    α = f(√3) = q(√3)((√3)^2 − 3) + r(√3) = r(√3).

Since deg r(x) < 2, we have α = r(√3) = a + b√3 for some a, b ∈ Q. Thus,
Im(T) = Q(√3).
2.2 Bases and dimension

One of the most important features of vector spaces is the fact that they have
bases. This essentially means that one can always find a nice subset (in many
interesting cases a finite set) that will completely describe the space. Many of
the concepts of this section are likely familiar from undergraduate linear algebra.
Throughout this section V denotes an F-vector space unless indicated
otherwise.
Definition 2.2.1. Let B = {v_i} be a subset of V. We say v ∈ V is a linear
combination of B and write v ∈ span_F B if there exist c_i ∈ F such that
v = ∑ c_i v_i.

Definition 2.2.2. Let B = {v_i} be a subset of V. We say B is linearly
independent if whenever we write 0 = ∑ c_i v_i for some c_i ∈ F we must have
c_i = 0 for all i.

Definition 2.2.3. Let B = {v_i} be a subset of V. We say B is an F-basis of V
if
1. span_F B = V;
2. B is linearly independent.
The first goal of this section is to show every vector space has a basis. We will
need Zorn's lemma to prove this. We recall Zorn's lemma for the convenience
of the reader.

Theorem 2.2.4 (Zorn's Lemma). Let X be any partially ordered set with the
property that every chain in X has an upper bound. Then X contains at least
one maximal element.

Theorem 2.2.5. Let V be a vector space. Let 𝒜 ⊆ 𝒞 be subsets of V.
Furthermore, assume 𝒜 is linearly independent and 𝒞 spans V. Then there exists
a basis B of V satisfying 𝒜 ⊆ B ⊆ 𝒞. In particular, if we set 𝒜 = ∅ and 𝒞 = V,
then this says that V has a basis.
Proof. Let

    X = { B′ ⊆ V : 𝒜 ⊆ B′ ⊆ 𝒞 and B′ is linearly independent }.

We have X is a partially ordered set under inclusion and is nonempty because
𝒜 ∈ X. Given any chain in X, the union of its members again lies in X (a union
of a chain of linearly independent sets is linearly independent), so every chain
in X has an upper bound. This allows us to apply Zorn's Lemma to conclude X
has a maximal element, say B. If span_F B = V, we are done. Suppose
span_F B ≠ V. Then, since 𝒞 spans V, there exists v ∈ 𝒞 such that v ∉ span_F B.
However, this gives that B′ = B ∪ {v} is an element of X that properly contains
B, a contradiction. Thus, B is the desired basis.
We will make use of the following fact from undergraduate linear algebra.

Lemma 2.2.6 ([3, Theorem 1.1]). A homogeneous system of m linear equations
in n unknowns with m < n always has nontrivial solutions.

Corollary 2.2.7. Let B ⊆ V be such that span_F B = V and #B = m. Then
any set with more than m elements cannot be linearly independent.
Proof. Let 𝒞 = {w_1, ..., w_n} with n > m. Let B = {v_1, ..., v_m} be a spanning
set for V. For each i write

    w_i = ∑_{j=1}^m a_{ji} v_j.

Consider the system of equations

    ∑_{i=1}^n a_{ji} x_i = 0,    j = 1, ..., m.

The previous lemma gives a solution (c_1, ..., c_n) to this system with
(c_1, ..., c_n) ≠ (0, ..., 0). We have

    0 = ∑_{j=1}^m ( ∑_{i=1}^n a_{ji} c_i ) v_j = ∑_{i=1}^n c_i ( ∑_{j=1}^m a_{ji} v_j ) = ∑_{i=1}^n c_i w_i.

This shows 𝒞 is not linearly independent.
Theorem 2.2.8. Let V be an F-vector space. Any two bases of V have the
same number of elements.

Proof. Let B and 𝒞 be bases of V. Suppose that #B = m and #𝒞 = n. Since
B is a basis, it is a spanning set, and since 𝒞 is a basis it is linearly independent.
The previous result now gives n ≤ m. We now reverse the roles of B and 𝒞
to get the other inequality.
Now that we have shown any two bases of a vector space have the same
number of elements we can make the following definition.

Definition 2.2.9. Let V be a vector space. The number of elements in an
F-basis of V is called the dimension of V and written dim_F V.
The following theorem does not contain anything new, but it is useful to
summarize some of the important facts to this point.

Theorem 2.2.10. Let V be an F-vector space with dim_F V = n. Let 𝒞 ⊆ V
with #𝒞 = m.
1. If m > n, then 𝒞 is not linearly independent.
2. If m < n, then span_F 𝒞 ≠ V.
3. If m = n, then the following are equivalent:
   • 𝒞 is a basis;
   • 𝒞 is linearly independent;
   • 𝒞 spans V.
This theorem immediately gives the following corollary.

Corollary 2.2.11. Let W ⊆ V be a subspace. Then dim_F W ≤ dim_F V. If
dim_F V < ∞, then V = W if and only if dim_F V = dim_F W.
We now give several examples of bases of some familiar vector spaces. It is
left as an exercise to check these are actually bases.

Example 2.2.12. Let V = F^n. Set e_1 = t(1, 0, ..., 0), e_2 = t(0, 1, 0, ..., 0), ...,
e_n = t(0, ..., 0, 1). Then ℰ = {e_1, ..., e_n} is a basis for V and is often referred
to as the standard basis. We have dim_F F^n = n.
Example 2.2.13. Let V = C. From the previous example we have B = {1} is
a basis of V as a C-vector space and dim_C C = 1. We saw before that C is also
an R-vector space. In this case a basis is given by 𝒞 = {1, i} and dim_R C = 2.
Example 2.2.14. Let V = Mat_{m,n}(F). Let e_{i,j} be the matrix with 1 in the
(i, j)th position and zeros elsewhere. Then B = {e_{i,j}} is a basis for V over F
and dim_F Mat_{m,n}(F) = mn.
Example 2.2.15. Set V = sl_2(C) where

    sl_2(C) = { [ a b ; c d ] ∈ Mat_2(C) : a + d = 0 }.

One can check this is a C-vector space. Moreover, it is a proper subspace of
Mat_2(C) because [ 1 0 ; 0 1 ] is not in sl_2(C). Set

    B = { [ 0 1 ; 0 0 ], [ 0 0 ; 1 0 ], [ 1 0 ; 0 −1 ] }.

It is easy to see that B is linearly independent. We know that dim_C sl_2(C) < 4
because it is a proper subspace of Mat_2(C). This gives dim_C sl_2(C) = 3 and B
is a basis. The vector space sl_2(C) is actually the Lie algebra associated to the
Lie group SL_2(C). There is a very rich theory of Lie algebras and Lie groups,
but this is too far afield to delve into here. (Note the algebra multiplication on
sl_2(C) is not matrix multiplication; it is given by X · Y = [X, Y] = XY − YX.)
Example 2.2.16. Let f(x) ∈ F[x] be a polynomial of degree n. We can
use this polynomial to split F[x] into equivalence classes analogously to how
one creates the field F_p. For details of this construction, please see Example
A.0.25 found in Appendix A. This is an F-vector space under addition given
by [g(x)] + [h(x)] := [g(x) + h(x)] and scalar multiplication given by c[g(x)] :=
[cg(x)].
Note that given any g(x) ∈ F[x] we can use the Euclidean algorithm on F[x]
(polynomial long division) to find unique polynomials q(x) and r(x) satisfying
g(x) = f(x)q(x) + r(x) where r(x) = 0 or deg r < deg f. This shows that each
nonzero equivalence class has a representative r(x) with deg r < deg f. Thus,
we can write

    F[x]/(f(x)) = { [r(x)] : r(x) ∈ F[x] with r = 0 or deg r < deg f }.

From this one can see that a spanning set for F[x]/(f(x)) is given by
{[1], [x], ..., [x^{n−1}]}. This set is also seen to be linearly independent by observing
that if there exist a_0, ..., a_{n−1} ∈ F so that [a_0 + a_1 x + ⋯ + a_{n−1} x^{n−1}] = [0],
this means f(x) divides a_0 + a_1 x + ⋯ + a_{n−1} x^{n−1}. However, this is impossible
unless a_0 + a_1 x + ⋯ + a_{n−1} x^{n−1} = 0 because deg f = n. Thus
a_0 = a_1 = ⋯ = a_{n−1} = 0. This gives that F[x]/(f(x)) is an F-vector space of
dimension n.
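The division step above is easy to carry out by machine; the following sympy
sketch (with our own choices of f and g) computes the canonical representative
of [g(x)] in F[x]/(f(x)).

```python
import sympy as sp

x = sp.symbols('x')
f = sp.Poly(x**2 - 3, x, domain='QQ')           # the modulus f(x)
g = sp.Poly(x**5 + 2*x**2 + 1, x, domain='QQ')  # a representative of [g(x)]

q, r = sp.div(g, f)   # polynomial long division: g = f*q + r
assert g == f * q + r and sp.degree(r) < sp.degree(f)
print(r)              # the canonical representative, here 9*x + 7
```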
Exercise 2.2.17. Show that R[x]/(x^2 + 1) ≅ C as R-vector spaces.
The following lemma gives an alternate way to check if a set is a basis.

Lemma 2.2.18. Let V be an F-vector space and let 𝒞 = {v_i} be a subset of
V. We have 𝒞 is a basis of V if and only if every vector in V can be written
uniquely as a linear combination of elements in 𝒞.
Proof. First suppose 𝒞 is a basis. Let v ∈ V. Since span_F 𝒞 = V, we can
write any vector as a linear combination of elements in 𝒞. Suppose we can write
v = ∑ a_i v_i = ∑ b_i v_i for some a_i, b_i ∈ F. Subtracting these expressions gives
0 = ∑ (a_i − b_i) v_i. We now use that the v_i are linearly independent to conclude
a_i − b_i = 0, i.e., a_i = b_i for every i.
Now suppose every v ∈ V can be written uniquely as a linear combination
of the elements in 𝒞. This immediately gives span_F 𝒞 = V. It remains to show
𝒞 is linearly independent. Suppose there exist a_i ∈ F with ∑ a_i v_i = 0. We
also have ∑ 0 · v_i = 0, so the uniqueness gives a_i = 0 for every i, i.e., 𝒞 is
linearly independent.
The following proposition will be used repeatedly throughout the course. It
says that a linear transformation between vector spaces is completely determined
by what it does to a basis. One way we will often use this is to define a linear
transformation by only specifying what it does to a basis. This provides another
important application of the fact that vector spaces are completely determined
by their bases.

Proposition 2.2.19. Let V, W be vector spaces.
1. Let T ∈ Hom_F(V, W). Then T is determined by its values on a basis of
V.
2. Let B = {v_i} be a basis of V and 𝒞 = {w_i} be any collection of vectors
in W so that #B = #𝒞. There is a unique linear transformation T ∈
Hom_F(V, W) satisfying T(v_i) = w_i.
Proof. Let B = {v_i} be a basis of V. Given any v ∈ V there are elements
a_i ∈ F so that v = ∑ a_i v_i. We have

    T(v) = T(∑ a_i v_i) = ∑ T(a_i v_i) = ∑ a_i T(v_i).

Thus, if one knows the elements T(v_i), one knows T(v) for any v ∈ V. This
gives the first claim.
The second part follows immediately from the first. Set T(v_i) = w_i for each
i. For v = ∑ a_i v_i ∈ V, define T(v) by

    T(v) = ∑ a_i w_i.

It is now easy to check this is a linear map from V to W and is unique.
Example 2.2.20. Let V = W = R^2. It is easy to check that

    B = { v_1 = t(1, 1), v_2 = t(1, −1) }

is a basis of V. The previous result says that to define a map from V to W it is
enough to say where to send v_1 and v_2. For instance, let w_1 = t(2, 3) and
w_2 = t(5, 10) be vectors in W. Then we have a unique linear map T : R^2 → R^2
given by T(v_1) = w_1, T(v_2) = w_2. Note it is exactly this property that allows
us to represent a linear map T : F^n → F^m as a matrix.
Corollary 2.2.21. Let T ∈ Hom_F(V, W), B = {v_i} be a basis of V, and
𝒞 = { w_i = T(v_i) } ⊆ W. Then 𝒞 is a basis for W if and only if T is an
isomorphism.

Proof. Suppose 𝒞 is a basis for W. The previous result allows us to define
S : W → V such that S(w_i) = v_i. This is an inverse for T, so T is an
isomorphism.
Now suppose T is an isomorphism. We need to show that 𝒞 spans W and is
linearly independent. Let w ∈ W. Since T is an isomorphism there exists a
v ∈ V with T(v) = w. Using that B is a basis of V we can write v = ∑ a_i v_i
for some a_i ∈ F. Applying T to v we have

    w = T(v) = T(∑ a_i v_i) = ∑ a_i T(v_i) = ∑ a_i w_i.

Thus, w ∈ span_F 𝒞, and since w was arbitrary we have span_F 𝒞 = W. Suppose
there exist a_i ∈ F with ∑ a_i w_i = 0. We have

    T(0) = 0 = ∑ a_i w_i = ∑ a_i T(v_i) = T(∑ a_i v_i).

Applying that T is injective we have 0 = ∑ a_i v_i. However, B is a basis so it
must be the case that a_i = 0 for all i. This gives 𝒞 is linearly independent and
so a basis.
The following result ties together the dimension of the kernel of a linear
transformation, the dimension of the image, and the dimension of the domain
vector space. This is extremely useful, as one often knows (or can bound) two
of the three.

Theorem 2.2.22. Let V be a finite dimensional vector space and let T ∈
Hom_F(V, W). Then

    dim_F ker T + dim_F Im T = dim_F V.
Proof. Let dim_F ker T = k and dim_F V = n. Let 𝒜 = {v_1, ..., v_k} be a basis
of ker T and extend this to a basis B = {v_1, ..., v_n} of V. It is enough to show
that 𝒞 = {T(v_{k+1}), ..., T(v_n)} is a basis for Im T.
Let w ∈ Im T. There exists v ∈ V so that T(v) = w. Since B is a basis for
V there exist a_i such that v = ∑_{i=1}^n a_i v_i. This gives

    w = T(v) = T(∑_{i=1}^n a_i v_i) = ∑_{i=1}^n a_i T(v_i) = ∑_{i=k+1}^n a_i T(v_i)

where we have used T(v_i) = 0 for i = 1, ..., k because 𝒜 is a basis for ker T.
Thus, span_F 𝒞 = Im T. It remains to show 𝒞 is linearly independent. Suppose
there exist a_i ∈ F such that ∑_{i=k+1}^n a_i T(v_i) = 0. Note

    0 = ∑_{i=k+1}^n T(a_i v_i) = T(∑_{i=k+1}^n a_i v_i).

Thus, ∑_{i=k+1}^n a_i v_i ∈ ker T. However, 𝒜 spans ker T so there exist
a_1, ..., a_k in F such that

    ∑_{i=k+1}^n a_i v_i = ∑_{i=1}^k a_i v_i,

i.e.,

    ∑_{i=k+1}^n a_i v_i + ∑_{i=1}^k (−a_i) v_i = 0.

Since B = {v_1, ..., v_n} is a basis of V we must have a_1 = ⋯ = a_k = a_{k+1} =
⋯ = a_n = 0. In particular, a_{k+1} = ⋯ = a_n = 0 as desired.
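Here is a small numpy sketch checking the theorem numerically for T = T_A with
a random matrix A of our own choosing: the rank computes dim_F Im T, and
counting the (numerically) nonzero singular values recovers dim_F ker T.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.integers(-2, 3, size=(3, 5)).astype(float)  # T_A : R^5 -> R^3

rank = np.linalg.matrix_rank(A)                 # dim Im T_A
s = np.linalg.svd(A, compute_uv=False)
nullity = A.shape[1] - int(np.sum(s > 1e-10))   # dim ker T_A
assert rank + nullity == A.shape[1]             # rank-nullity
```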
This theorem allows us to prove the following important results.

Corollary 2.2.23. Let V, W be F-vector spaces with dim_F V = n. Let V_1 ⊆ V
be a subspace of dimension k and W_1 ⊆ W be a subspace of dimension n − k.
Then there exists T ∈ Hom_F(V, W) such that V_1 = ker T and W_1 = Im T.
Proof. Let B = {v_1, ..., v_k} be a basis of V_1. Extend this to a basis
{v_1, ..., v_k, ..., v_n} of V. Let 𝒞 = {w_{k+1}, ..., w_n} be a basis of W_1. Define T by
T(v_1) = ⋯ = T(v_k) = 0 and T(v_{k+1}) = w_{k+1}, ..., T(v_n) = w_n. This is the
required linear map.
The previous corollary says the only limitation on a subspace being the kernel
or image of a linear transformation is that the dimensions add up properly. One
should contrast this with the case of homomorphisms in group theory, for
example. There, in order to be a kernel a subgroup must satisfy the further
property of being a normal subgroup. This is another way in which vector spaces
are very nice to work with.
The following corollary follows immediately from Theorem 2.2.22. It makes
checking that a map between two vector spaces of the same dimension is an
isomorphism much easier, as one only needs to check the map is injective or
surjective, not both.
Corollary 2.2.24. Let T ∈ Hom_F(V, W) with dim_F V = dim_F W. Then the
following are equivalent:
1. T is an isomorphism;
2. T is surjective;
3. T is injective.
We can also rephrase this result in terms of matrices.

Corollary 2.2.25. Let A ∈ Mat_n(F). Then the following are equivalent:
1. A is invertible;
2. There exists B ∈ Mat_n(F) with BA = 1_n;
3. There exists B ∈ Mat_n(F) with AB = 1_n.
One should prove the previous two corollaries, as well as the following
corollary, as exercises.

Corollary 2.2.26. Let V, W be F-vector spaces and let dim_F V = m,
dim_F W = n.
1. If m < n and T ∈ Hom_F(V, W), then T is not surjective.
2. If m > n and T ∈ Hom_F(V, W), then T is not injective.
3. We have m = n if and only if V ≅ W. In particular, V ≅ F^m.
Note that while it is true that if dim_F V = dim_F W = n < ∞ then V ≅ W,
it is not the case that every linear map from V to W is an isomorphism. This
result only says there is some map T : V → W that is an isomorphism. Clearly
we can define a linear map T : V → W by T(v) = 0_W for all v ∈ V, and this is
not an isomorphism.
The previous corollary gives a very important fact: if V is an n-dimensional
F-vector space, then V ≅ F^n. This result gives that for any positive integer n,
all F-vector spaces of dimension n are isomorphic. One obtains this isomorphism
by choosing a basis. This is why in undergraduate linear algebra one often
focuses almost exclusively on the vector spaces F^n, with matrices as the linear
transformations.
The following example gives a nice application of what we have studied thus
far.

Example 2.2.27. Recall P_n(R) is the vector space of polynomials of degree
less than or equal to n and dim_R P_n(R) = n + 1. Set V = P_{n−1}(R). Let
a_1, ..., a_k ∈ R be distinct and pick m_1, ..., m_k ∈ Z_{≥0} such that
∑_{j=1}^k (m_j + 1) = n. Our goal is to show that given any real numbers
b_{1,0}, ..., b_{1,m_1}, ..., b_{k,m_k} there is a unique polynomial f ∈ P_{n−1}(R)
satisfying f^{(j)}(a_i) = b_{i,j}.
Define

    T : P_{n−1}(R) → R^n
    f(x) ↦ t(f(a_1), ..., f^{(m_1)}(a_1), ..., f(a_k), ..., f^{(m_k)}(a_k)).

If f ∈ ker T, then for each i = 1, ..., k we have f^{(j)}(a_i) = 0 for j = 0, ..., m_i.
Thus, for each i we have (x − a_i)^{m_i+1} | f(x). Since these polynomials are
relatively prime, their product divides f, and thus f is divisible by a polynomial
of degree n. Since f ∈ P_{n−1}(R) this implies f = 0. Hence ker T = 0.
Applying Theorem 2.2.22 we see Im T must have dimension n, i.e., Im T = R^n.
This gives the result.
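Since ker T = 0 forces the linear system for the coefficients of f to have a unique
solution, one can actually compute the interpolating polynomial. The sympy
sketch below does this for a small data set of our own choosing (the names
data, coeffs, etc. are ours, not from the notes).

```python
import sympy as sp

x = sp.symbols('x')
# a_i -> [b_{i,0}, ..., b_{i,m_i}]: prescribe f(1)=2, f'(1)=3, f(-1)=5
data = {1: [2, 3], -1: [5]}
n = sum(len(bs) for bs in data.values())        # here n = 3, so f in P_2

coeffs = sp.symbols(f'c0:{n}')
f = sum(c * x**k for k, c in enumerate(coeffs))

eqs = []
for a, bs in data.items():
    for j, b in enumerate(bs):
        eqs.append(sp.Eq(sp.diff(f, x, j).subs(x, a), b))

sol = sp.solve(eqs, coeffs, dict=True)[0]       # unique, by Example 2.2.27
print(sp.expand(f.subs(sol)))
```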
2.3 Direct sums and quotient spaces

In this section we cover two of the basic constructions in vector space theory,
namely the direct sum and the quotient space. The direct sum is a way to
split a vector space into subspaces that are independent. The quotient space
construction is a way to identify some vectors with 0. We begin with the direct
sum.
Definition 2.3.1. Let V be an F-vector space and V_1, ..., V_k be subspaces of
V. The sum of V_1, ..., V_k is defined by

    V_1 + ⋯ + V_k = { v_1 + ⋯ + v_k : v_i ∈ V_i }.

Definition 2.3.2. Let V_1, ..., V_k be subspaces of V. We say V_1, ..., V_k are
independent if whenever v_1 + ⋯ + v_k = 0 with v_i ∈ V_i, then v_i = 0 for all i.

Definition 2.3.3. Let V_1, ..., V_k be subspaces of V. We say V is the direct
sum of V_1, ..., V_k and write

    V = V_1 ⊕ ⋯ ⊕ V_k

if the following two conditions are satisfied:
1. V = V_1 + ⋯ + V_k;
2. V_1, ..., V_k are independent.
Example 2.3.4. Set V = F^2, V_1 = {(x, 0) : x ∈ F}, V_2 = {(0, y) : y ∈ F}.
Then we clearly have V_1 + V_2 = {(x, y) : x, y ∈ F} = V. Moreover,
if (x, 0) + (0, y) = (0, 0), then (x, y) = (0, 0) and so x = y = 0. This gives
(x, 0) = (0, 0) = (0, y), and so V_1 and V_2 are independent. Thus F^2 = V_1 ⊕ V_2.
Example 2.3.5. Let B = {v_1, ..., v_n} be a basis of V. Set V_i = F · v_i =
{a v_i : a ∈ F}. We have V_1, ..., V_n are clearly subspaces of V, and by the
definition of a basis we obtain V = V_1 ⊕ ⋯ ⊕ V_n.
Lemma 2.3.6. Let V be a vector space and V_1, ..., V_k subspaces of V. We have
V = V_1 ⊕ ⋯ ⊕ V_k if and only if each v ∈ V can be written uniquely in the form
v = v_1 + ⋯ + v_k with v_i ∈ V_i.

Proof. Suppose V = V_1 ⊕ ⋯ ⊕ V_k. Then certainly we have V = V_1 + ⋯ + V_k,
and so given any v ∈ V there are elements v_i ∈ V_i so that v = v_1 + ⋯ + v_k.
The only thing to show is that this expression is unique. Suppose v = v_1 + ⋯ +
v_k = w_1 + ⋯ + w_k with v_i, w_i ∈ V_i. Then 0 = (v_1 − w_1) + ⋯ + (v_k − w_k).
Since the V_i are independent we have v_i − w_i = 0, i.e., v_i = w_i for all i.
Suppose each v ∈ V can be written uniquely in the form v = v_1 + ⋯ + v_k
with v_i ∈ V_i. Immediately we get V = V_1 + ⋯ + V_k. Suppose 0 = v_1 + ⋯ + v_k
with v_i ∈ V_i. We have 0 = 0 + ⋯ + 0 as well, so by uniqueness we get v_i = 0
for all i. This gives V_1, ..., V_k are independent and so V = V_1 ⊕ ⋯ ⊕ V_k.
Exercise 2.3.7. Let V_1, ..., V_k be subspaces of V. For each i let B_i be a basis
of V_i. Set B = B_1 ∪ ⋯ ∪ B_k. Then
1. B spans V if and only if V = V_1 + ⋯ + V_k;
2. B is linearly independent if and only if V_1, ..., V_k are independent;
3. B is a basis if and only if V = V_1 ⊕ ⋯ ⊕ V_k.
Given a subspace U ⊆ V, it is natural to ask if there is a subspace W so
that V = U ⊕ W. This leads us to the following definition.

Definition 2.3.8. Let V be a vector space and U ⊆ V a subspace. We say
W ⊆ V is a complement of U in V if V = U ⊕ W.

Lemma 2.3.9. Let U ⊆ V be a subspace. Then U has a complement.

Proof. Let 𝒜 be a basis of U, and extend 𝒜 to a basis B of V. Set W =
span_F(B \ 𝒜). One checks immediately that V = U ⊕ W.
Exercise 2.3.10. Let U ⊆ V be a subspace. Is the complement of U in V unique?
If so, prove it. If not, give a counterexample.
We now turn our attention to quotient spaces. We have already seen an
example of a quotient space. Namely, recall Example 2.2.16. Consider V = F[x]
and W = (f(x)) := { g ∈ F[x] : f | g }; this is exactly the class [0]. One can check
that W is a subspace of V. In that example we defined a vector space
V/W = F[x]/(f(x)) and saw that the elements in W become 0 in the new space.
This construction is the one we generalize to form quotient spaces.
Let V be a vector space and W ⊆ V be a subspace. Define an equivalence
relation on V as follows: v_1 ∼_W v_2 if and only if v_1 − v_2 ∈ W. We write the
equivalence classes as

    [v_1] = { v_2 ∈ V : v_1 − v_2 ∈ W } = v_1 + W.

Set V/W = { v + W : v ∈ V }. Addition and scalar multiplication on V/W are
defined as follows. Let v_1, v_2 ∈ V and c ∈ F. Define

    (v_1 + W) + (v_2 + W) = (v_1 + v_2) + W;
    c(v_1 + W) = cv_1 + W.
Exercise 2.3.11. Show that V/W is an F-vector space.
We call V/W the quotient space of V by W.
Example 2.3.12. Let V = R^2 and W = {(x, 0) : x ∈ R}. Clearly we have
that W ⊆ V is a subspace. Let (x_0, y_0) ∈ V. To find (x_0, y_0) + W, we want all
(x, y) such that (x_0, y_0) − (x, y) ∈ W. However, it is clear that
(x_0 − x, y_0 − y) ∈ W if and only if y = y_0. Thus,

    (x_0, y_0) + W = { (x, y_0) : x ∈ R }.

[Figure: the cosets (0, 0) + W and (0, 1) + W, two horizontal lines in R^2.]
One immediately sees from this that (x, y) + W is not a subspace of V unless
(x, y) + W = (0, 0) + W.
Define

    φ : R → V/W
    y ↦ (0, y) + W.

It is straightforward to check this is an isomorphism, so V/W ≅ R.
Example 2.3.13. More generally, let m, n ∈ Z_{>0} with m > n. Consider
V = F^m and let W be the subspace of V spanned by e_1, ..., e_n, with e_1, ..., e_m
the standard basis. We can form the quotient space V/W. This space is
isomorphic to F^{m−n} with a basis given by {e_{n+1} + W, ..., e_m + W}.
Example 2.3.14. Let V = F[x] and let W = (f(x)) with deg f = n. We saw
before that the quotient space V/W = F[x]/(f(x)) has as a basis
{[1], [x], ..., [x^{n−1}]}. Define T : V/W → P_{n−1}(F) on this basis by
T([x^j]) = x^j. One can check this is an isomorphism; in particular,
F[x]/(x^n) ≅ P_{n−1}(F) as F-vector spaces.
Definition 2.3.15. Let W ⊆ V be a subspace. The canonical projection map
is given by

    π_W : V → V/W
    v ↦ v + W.

It is immediate that π_W ∈ Hom_F(V, V/W).
One important point to note is that when working with quotient spaces, if
one defines a map from V/W to another vector space, one must always check
the map is well-defined, as defining the map generally involves a choice of
representative for v + W. In other words, one must show that if v_1 + W = v_2 + W
then T(v_1 + W) = T(v_2 + W). Consider the following example.
Example 2.3.16. Let V = R^2 and W = {(x, 0) : x ∈ R}. We saw above that
the elements of the quotient space V/W are of the form (x, y) + W, and
(x_1, y_1) + W = (x_2, y_2) + W if and only if y_1 = y_2. Suppose we wish to define
a linear map T : V/W → R. We could try to define such a map by specifying
T((x, y) + W) = x. However, we know that (x, y) + W = (x + 1, y) + W, so
x = T((x, y) + W) = T((x + 1, y) + W) = x + 1. This doesn't make sense, so
our map is not well-defined. The correct map in this situation is to send
(x, y) + W to y, since y is fixed across the equivalence class.
The following result allows us to avoid checking that a map is well-defined
when it is induced from another linear map. One should look back at the
examples of quotient spaces above to see how this theorem applies.

Theorem 2.3.17. Let T ∈ Hom_F(V, W). Define

    T̄ : V/ker T → W
    v + ker T ↦ T(v).

Then T̄ ∈ Hom_F(V/ker T, W). Moreover, T̄ gives an isomorphism

    V/ker T ≅ Im T.
Proof. The first step is to show T̄ is well-defined. Suppose v_1 + ker T =
v_2 + ker T, i.e., v_1 − v_2 ∈ ker T. So there exists x ∈ ker T such that
v_1 − v_2 = x. We have

    T̄(v_1 + ker T) = T(v_1) = T(v_2 + x) = T(v_2) + T(x) = T(v_2) = T̄(v_2 + ker T).

Thus, T̄ is well-defined.
The next step is to show T̄ is linear. Let v_1 + ker T, v_2 + ker T ∈ V/ker T
and c ∈ F. We have

    T̄((v_1 + ker T) + (v_2 + ker T)) = T̄(v_1 + v_2 + ker T)
                                     = T(v_1 + v_2)
                                     = T(v_1) + T(v_2)
                                     = T̄(v_1 + ker T) + T̄(v_2 + ker T)

and

    T̄(c(v_1 + ker T)) = T̄(cv_1 + ker T) = T(cv_1) = cT(v_1) = cT̄(v_1 + ker T).

Thus, T̄ ∈ Hom_F(V/ker T, W).
It only remains to show that T̄ is a bijection onto Im T. Let w ∈ Im(T).
There exists v ∈ V so that T(v) = w. Thus, T̄(v + ker T) = T(v) = w, so T̄ is
surjective onto Im T. Now suppose v + ker T ∈ ker T̄. Then 0 = T̄(v + ker T) =
T(v). Thus, v ∈ ker T, which means v + ker T = 0 + ker T and so T̄ is injective.
Example 2.3.18. Let V = F[x]. Define a map T : V → P_2(F) by sending
f(x) = a_0 + a_1 x + ⋯ + a_n x^n to a_0 + a_1 x + a_2 x^2. This is a surjective linear
map. The kernel of this map is exactly (x^3) = { g(x) : x^3 | g(x) }. Thus, the
previous result gives an isomorphism F[x]/(x^3) ≅ P_2(F) as F-vector spaces.
Example 2.3.19. Let V = Mat_2(F). Define a map T : Mat_2(F) → F^2 by

    T([ a b ; c d ]) = t(b, c).

One can check this is a surjective linear map. The kernel of this map is given by

    ker T = { [ a 0 ; 0 d ] : a, d ∈ F }.

Thus, Mat_2(F)/ker T is isomorphic to F^2 as F-vector spaces.
Theorem 2.3.20. Let W ⊆ V be a subspace. Let B_W = {v_i} be a basis for W
and extend it to a basis B for V. Set B_U = B \ B_W = {z_i}. Let U = span_F B_U,
i.e., U is a complement of W in V. Then the linear map

    p : U → V/W
    z_i ↦ z_i + W

is an isomorphism. Thus, B̄_U = { z_i + W : z_i ∈ B_U } is a basis of V/W.
Proof. Note that p is linear because we defined it on a basis. It only remains to
show p is an isomorphism. We will do this by showing B̄_U = { z_i + W : z_i ∈ B_U }
is a basis for V/W.
Let v + W ∈ V/W. Since v ∈ V, there exist a_i, b_j ∈ F such that
v = ∑ a_i v_i + ∑ b_j z_j. We have ∑ a_i v_i ∈ W, so v − ∑ b_j z_j ∈ W. Thus

    v + W = ∑ b_j z_j + W = ∑ b_j (z_j + W).

This shows that B̄_U spans V/W.
Suppose there exist b_j ∈ F such that ∑ b_j (z_j + W) = 0 + W, i.e.,
∑ b_j z_j ∈ W. So there exist a_i ∈ F such that ∑ b_j z_j = ∑ a_i v_i. Thus

    ∑ a_i v_i − ∑ b_j z_j = 0.

Since {v_i} ∪ {z_j} is a basis of V we obtain a_i = 0 = b_j for all i, j.
This gives that B̄_U is linearly independent and so completes the proof.
2.4 Dual spaces

We conclude this chapter by discussing dual spaces. Throughout this section V
is an F-vector space.

Definition 2.4.1. The dual space of V, denoted V^∨, is given by
V^∨ = Hom_F(V, F). Elements of the dual space are called linear functionals.
Theorem 2.4.2. The vector space V is isomorphic to a subspace of V^∨. If
dim_F V < ∞, then V ≅ V^∨.
Proof. Let B = {v_i} be a basis of V. For each v_i, define an element v_i^∨ ∈ V^∨
by setting

    v_i^∨(v_j) = 1 if i = j, and v_i^∨(v_j) = 0 otherwise.

One sees immediately that v_i^∨ ∈ V^∨ as it is defined on a basis. Define T ∈
Hom_F(V, V^∨) by T(v_i) = v_i^∨. We claim that {v_i^∨} is a linearly independent
set in V^∨. Suppose there exist a_i ∈ F such that ∑ a_i v_i^∨ = 0, where here 0
indicates the zero map. For any v ∈ V this gives ∑ a_i v_i^∨(v) = 0. In particular,

    0 = ∑ a_i v_i^∨(v_j) = a_j v_j^∨(v_j) = a_j.

Since a_j = 0 for all j, we have {v_i^∨} is linearly independent as claimed. Let
W^∨ be given by W^∨ = span_F{v_i^∨}. Then we have V ≅ W^∨, which gives
the first statement of the theorem.
Assume dim_F V < ∞ so we can write B = {v_1, ..., v_n}. Given φ ∈ V^∨,
define a_j = φ(v_j). Set v = ∑ a_i v_i ∈ V. Define S : V^∨ → V by φ ↦ v. This
map defines an inverse to the map T given above, and thus V ≅ V^∨.
Note it is not always the case that V is isomorphic to its dual. In fact, if V
is infinite dimensional it is never the case that V is isomorphic to its dual. In
functional analysis or topology this problem is often overcome by requiring the
linear functionals to be continuous, i.e., the dual space consists of continuous
linear maps from V to F. In that case one would refer to our dual as the
"algebraic dual." However, we will only consider the algebraic setting here, so
we don't bother with saying "algebraic dual."
Example 2.4.3. Let V be an infinite dimensional vector space over a field F and
denote its dimension by β. Note that β is not finite by assumption, but we need
to work with different cardinalities here, so we must keep track of this. The
cardinality of V as a set is given by β · #F = max{β, #F}. Moreover, we have V
is naturally isomorphic to the set of functions from a set of cardinality β to F
with finite support. We denote this space by F^(β).
The dual space of V is the set of all functions from a set of cardinality β to
F, i.e., V^∨ ≅ F^β. If we set β^∨ = dim_F V^∨, we wish to show β^∨ > β. As
above, the cardinality of V^∨ as a set is max{β^∨, #F}.
Let 𝒜 = {v_i} be a countable linearly independent subset of V and extend it
to a basis of V. For each nonzero c ∈ F define f_c : V → F by f_c(v_i) = c^i for
v_i ∈ 𝒜 and 0 for the other elements in the basis. One can show that {f_c} is
linearly independent, so β^∨ ≥ #F. Thus, we have #V^∨ = max{β^∨, #F} = β^∨.
However, we also have #V^∨ = #F^β. Since β < #F^β because #F ≥ 2, we have
β^∨ = #F^β > β as desired.
One should note that in the finite dimensional case when we have V ≅ V^∨,
the isomorphism depends upon the choice of a basis. This means that while V
is isomorphic to its dual, the isomorphism is non-canonical. In particular, there
is no preferred isomorphism between V and its dual. Studying the possible
isomorphisms that arise between V and its dual is interesting in its own right.
We will return to this problem in Chapter 6.
Definition 2.4.4. Let V be a finite dimensional F-vector space and let B =
{v_1, ..., v_n} be a basis of V. The dual basis of V^∨ with respect to B is given by
B^∨ = {v_1^∨, ..., v_n^∨}.
If V is finite dimensional then we have V ≅ V^∨ ≅ (V^∨)^∨, i.e., V ≅ (V^∨)^∨.
The major difference is that while there is no canonical isomorphism between
V and V^∨, there is a canonical isomorphism V ≅ (V^∨)^∨! Note that in the
proof below we construct the map from V to (V^∨)^∨ without choosing any basis.
This is what makes the map canonical. We do use a basis in proving injectivity,
but the map does not depend on this basis, so it does not matter.
Proposition 2.4.5. There is a canonical injective linear map from V to (V^∨)^∨.
If dim_F V < ∞, then this map is an isomorphism.
Proof. Let v ∈ V. Define eval_v : V^∨ → F by sending φ ∈ Hom_F(V, F) = V^∨
to eval_v(φ) = φ(v). One must check that eval_v is a linear map. To see this,
let φ, ψ ∈ V^∨ and c ∈ F. We have

    eval_v(cφ + ψ) = (cφ + ψ)(v) = cφ(v) + ψ(v) = c eval_v(φ) + eval_v(ψ).

Thus, for each v ∈ V we obtain a map eval_v ∈ Hom_F(V^∨, F). This allows us
to define a well-defined map

    Φ : V → Hom_F(V^∨, F) = (V^∨)^∨
    v ↦ eval_v : φ ↦ φ(v).

We claim that Φ is a linear map. Since Φ maps into a space of maps, we
check equality by checking the maps agree on each element, i.e., for v, w ∈ V
and c ∈ F we want to show for each φ ∈ V^∨ that Φ(cv + w)(φ) =
cΦ(v)(φ) + Φ(w)(φ). Observe we have

    Φ(cv + w)(φ) = eval_{cv+w}(φ) = φ(cv + w) = cφ(v) + φ(w) = cΦ(v)(φ) + Φ(w)(φ).

Thus, we have Φ ∈ Hom_F(V, (V^∨)^∨).
It remains to show that Φ is injective. Let v ∈ V, v ≠ 0. Let B be a
basis containing v. This is possible because we can start with the set {v} and
complete it to a basis. Note v^∨ ∈ V^∨ and eval_v(v^∨) = v^∨(v) = 1. Moreover,
for any w ∈ B with w ≠ v we have eval_w(v^∨) = v^∨(w) = 0. Thus, we have
Φ(v) = eval_v is not the 0 map, so ker Φ = 0, i.e., Φ is injective.
If dim_F V < ∞, we have Φ is an isomorphism because dim_F V = dim_F V^∨
since they are isomorphic, so dim_F V = dim_F V^∨ = dim_F (V^∨)^∨.
Let T ∈ Hom_F(V, W). We obtain a natural map T^∨ ∈ Hom_F(W^∨, V^∨) as
follows. Let φ ∈ W^∨, i.e., φ : W → F. To obtain a map T^∨(φ) : V → F, we
simply compose the maps, namely, T^∨(φ) = φ ∘ T. It is easy to check this is a
linear map. In particular, we have the following result.

Proposition 2.4.6. Let T ∈ Hom_F(V, W). The map T^∨ defined by T^∨(φ) =
φ ∘ T is a linear map from W^∨ to V^∨.

We will see in the next chapter how the dual map gives the proper definition
of a transpose.
2.5 Problems

For these problems F is assumed to be a field.

1. Define

    sl_n(Q) = { X = (x_{i,j}) ∈ Mat_n(Q) : Tr(X) = ∑_{i=1}^n x_{i,i} = 0 }.

Show that sl_n(Q) is a Q-vector space.

2. Consider the vector space F^3. Determine, and justify your answer, whether
each of the following is a subspace of F^3:
(a) W_1 = { t(x_1, x_2, x_3) : x_1 + 2x_2 + 3x_3 = 0 }
(b) W_2 = { t(x_1, x_2, x_3) : x_1 x_2 x_3 = 0 }
(c) W_3 = { t(x_1, x_2, x_3) : x_1 = 5x_3 }.

3. Let V be an F-vector space.
(a) Prove that an arbitrary intersection of subspaces of V is again a subspace
of V.
(b) Prove that the union of two subspaces of V is a subspace of V if and only
if one of the subspaces is contained in the other.

4. Let T ∈ Hom_F(F, F). Prove there exists α ∈ F so that T(v) = αv for every
v ∈ F.

5. Let U, V, and W be F-vector spaces. Let S ∈ Hom_F(U, V) and T ∈
Hom_F(V, W). Prove that T ∘ S ∈ Hom_F(U, W).

6. Let V be an F-vector space. Prove that if {v_1, ..., v_n} is linearly
independent in V, then so is the set {v_1 − v_2, v_2 − v_3, ..., v_{n−1} − v_n, v_n}.

7. Let V be the subspace of R^5 defined by

    V = { (x_1, x_2, ..., x_5) ∈ R^5 : x_1 = 4x_4, x_2 = 5x_5 }.

Find a basis for V.

8. Prove that there does not exist a T ∈ Hom_F(F^5, F^2) so that

    ker(T) = { (x_1, x_2, ..., x_5) ∈ F^5 : x_1 = x_2 and x_3 = x_4 = x_5 }.

9. Let V be a finite dimensional vector space and T ∈ Hom_F(V, V) with
T^2 = T.
(a) Prove that Im(T) ∩ ker(T) = 0.
(b) Prove that V = Im(T) ⊕ ker(T).
(c) Let V = F^n. Prove that there is a basis of V such that the matrix of T
with respect to this basis is a diagonal matrix whose entries are all 0 or 1.

10. Let T ∈ Hom_F(V, F). Prove that if v ∈ V is not in ker(T), then

    V = ker(T) ⊕ { cv : c ∈ F }.

11. Let V_1, V_2 be subspaces of the finite dimensional vector space V. Prove

    dim_F(V_1 + V_2) = dim_F(V_1) + dim_F(V_2) − dim_F(V_1 ∩ V_2).

12. Suppose that V and W are both 5-dimensional R-subspaces of R^9. Prove
that V ∩ W ≠ {0}.

13. Let p be a prime and V a dimension n vector space over F_p. Show there
are

    (p^n − 1)(p^n − p)(p^n − p^2) ⋯ (p^n − p^{n−1})

distinct bases of V.

14. Let V be an F-vector space of dimension n. Let T ∈ Hom_F(V, V) so that
T^2 = 0. Prove that the image of T is contained in the kernel of T and hence
the dimension of the image of T is at most n/2.

15. Let W be a subspace of a vector space V. Let T ∈ Hom_F(V, V) so that
T(W) ⊆ W. Show that T induces a linear transformation T̄ ∈
Hom_F(V/W, V/W). Prove that T is nonsingular (i.e., injective) on V if and
only if T restricted to W and T̄ on V/W are both nonsingular.
Chapter 3
Choosing coordinates
This chapter will make the connection between the more abstract version of
vector space and linear transformation given in Chapter 2 and the material
given in an undergraduate linear algebra class. Throughout this chapter all
vector spaces are assumed to be finite dimensional.
3.1 Linear transformations and matrices
Let B = {v_1, ..., v_n} be a basis for V. This choice of basis gives an isomorphism
between V and F^n. Namely, for v ∈ V, if we write v = ∑ a_i v_i for some a_i ∈ F,
we have an isomorphism T_B : V → F^n given by v ↦ t(a_1, ..., a_n). When we
identify V with F^n via this map, we will write [v]_B = t(a_1, ..., a_n). We refer to
this as choosing coordinates on V.
Example 3.1.1. Let V = sl_2(C). Recall a basis for this vector space is

    B = { v_1 = [ 0 1 ; 0 0 ], v_2 = [ 0 0 ; 1 0 ], v_3 = [ 1 0 ; 0 −1 ] }.

Let [ a b ; c −a ] be an element of V. Observe we have

    [ a b ; c −a ] = b v_1 + c v_2 + a v_3.

Thus,

    [ [ a b ; c −a ] ]_B = t(b, c, a).
Example 3.1.2. Let V = P_2(R). Recall a basis for V is given by B = {1, x, x^2}.
Let f(x) = a + bx + cx^2. Then

    [f]_B = t(a, b, c).
Example 3.1.3. Let V = P_2(R). One can easily check that 𝒞 = {1, (x − 1), (x − 1)^2}
is a basis for V. Let f(x) = a + bx + cx^2. We can write f in terms of 𝒞 as

    f(x) = (a + b + c) + (b + 2c)(x − 1) + c(x − 1)^2.

Thus, we have

    [f]_𝒞 = t(a + b + c, b + 2c, c).
Let T ∈ Hom_F(V, W). Let B = {v_1, ..., v_n} be a basis for V and 𝒞 =
{w_1, ..., w_m} be a basis for W. Recall that we have W ≅ F^m via the map
Q(w) = [w]_𝒞 and V ≅ F^n via the map P(v) = [v]_B. Furthermore, recall that
any linear transformation from F^n to F^m is given by a matrix A ∈ Mat_{m,n}(F).
Thus, we have the following diagram:

    V  --T-->  W
    |P         |Q
    v          v
    F^n --A--> F^m

Thus, we have a unique matrix A ∈ Mat_{m,n}(F) given by A = Q ∘ T ∘ P^{−1}.
Write A = [T]_B^𝒞, i.e., A is the matrix that gives the map T when one chooses
B as the coordinates on V and 𝒞 as the coordinates on W. In particular, [T]_B^𝒞
is the unique matrix that satisfies [T]_B^𝒞 [v]_B = [T(v)]_𝒞.
One can easily compute [T]_B^𝒞. Since 𝒞 is a basis for W, there are scalars
a_{ij} ∈ F so that

    T(v_j) = ∑_{i=1}^m a_{ij} w_i.

Observe that

    [T(v_j)]_𝒞 = t(a_{1j}, ..., a_{mj}).

We also have [v_j]_B = e_j, so [T]_B^𝒞 [v_j]_B is exactly the jth column of [T]_B^𝒞.
Thus, the matrix [T]_B^𝒞 is given by

    [T]_B^𝒞 = (a_{ij}) = ( [T(v_1)]_𝒞 | ⋯ | [T(v_n)]_𝒞 ).
Example 3.1.4. Let V = P_3(R). Define T ∈ Hom_R(V, V) by T(f(x)) = f′(x).
Let B = {1, x, x^2, x^3}. For f(x) = a + bx + cx^2 + dx^3, we have T(f(x)) =
b + 2cx + 3dx^2. In particular, T(1) = 0, T(x) = 1, T(x^2) = 2x, and
T(x^3) = 3x^2. The matrix for T with respect to B is given by

    [T]_B^B = [ 0 1 0 0 ]
              [ 0 0 2 0 ]
              [ 0 0 0 3 ]
              [ 0 0 0 0 ].
Example 3.1.5. Let V = sl_2(C) and W = C^4. We pick the standard basis

    B = { v_1 = [ 0 1 ; 0 0 ], v_2 = [ 0 0 ; 1 0 ], v_3 = [ 1 0 ; 0 −1 ] }

for sl_2(C). Let

    𝒞 = { w_1 = t(1, 0, 1, 0), w_2 = t(0, i, 0, 0), w_3 = t(0, 0, 2, 0), w_4 = t(0, 0, 0, 1) }.

It is easy to check that 𝒞 is a basis for W. Define T ∈ Hom_C(V, W) by

    T(v_1) = t(2, 0, 0, 0),  T(v_2) = t(0, −3, 0, 1),  T(v_3) = t(0, 0, 0, 1).

From this it is easy to check that

    T(v_1) = 2w_1 − w_3
    T(v_2) = 3i w_2 + w_4
    T(v_3) = w_4.

Thus, the matrix for T with respect to B and 𝒞 is

    [T]_B^𝒞 = [  2  0  0 ]
              [  0 3i  0 ]
              [ −1  0  0 ]
              [  0  1  1 ].
Exercise 3.1.6. Let 𝒜 be a basis of U, B a basis of V, and 𝒞 a basis of W. If
S ∈ Hom_F(U, V) and T ∈ Hom_F(V, W), show that

    [T ∘ S]_𝒜^𝒞 = [T]_B^𝒞 [S]_𝒜^B.

Let T ∈ Hom_F(V, V) and B a basis of V. To save notation we will write [T]_B
for the matrix [T]_B^B.
As one learns in undergraduate linear algebra, it can often be the case that
one has information about V given in terms of a basis B, but it would be more
useful if it were given in terms of a basis B′. We now recall how to change from
B to B′. We can recover this by specializing the situation we just studied. Let
B = {v_1, ..., v_n} and B′ = {v′_1, ..., v′_n}. Define T : V → F^n by T(v) = [v]_B
and S : V → F^n by S(v) = [v]_{B′}. We have the following diagram:

    V  --id-->  V
    |T          |S
    v           v
    F^n --A--> F^n

Applying our previous results we see [v]_{B′} = (S ∘ id ∘ T^{−1})([v]_B) =
(S ∘ T^{−1})([v]_B). The change of basis matrix is [id]_B^{B′}.
Exercise 3.1.7. Let B = {v_1, ..., v_n}. Prove the change of basis matrix
[id]_B^{B′} is given by ( [v_1]_{B′} | ⋯ | [v_n]_{B′} ).
Example 3.1.8. Let V = Q^2. It is elementary to check that
B = { t(1, −1), t(1, 1) } and B′ = { t(2, 3), t(5, 7) } both give bases of V. To
compute the change of basis matrix from B to B′, we expand the elements of B
in terms of the basis B′. For example, we want to find a, b ∈ Q so that

    t(1, −1) = a t(2, 3) + b t(5, 7).

This leads to the system of linear equations

    1 = 2a + 5b
    −1 = 3a + 7b.

One solves these by expressing them as the augmented matrix

    [ 2 5 |  1 ]
    [ 3 7 | −1 ]

and using Gaussian elimination to reduce this matrix to

    [ 1 0 | −12 ]
    [ 0 1 |   5 ].

Thus, a = −12 and b = 5. One now performs the same operation on the vector
t(1, 1) to obtain

    t(1, 1) = −2 t(2, 3) + 1 t(5, 7).

Thus, the change of basis matrix is given by

    [id]_B^{B′} = [ −12 −2 ]
                  [   5  1 ].
Example 3.1.9. Consider V = P_2(F). Let B = {1, x, x^2} and B′ =
{1, x − 2, (x − 2)^2} be bases of V. We calculate the change of basis matrix.
We have

    1 = 1 · 1,
    x = 1 · (x − 2) + 2 · 1,
    x^2 = 1 · (x − 2)^2 + 4 · (x − 2) + 4 · 1.

Thus, the change of basis matrix is given by

    A = [ 1 2 4 ]
        [ 0 1 4 ]
        [ 0 0 1 ].
Exercise 3.1.10. Let V = P_3(F). Define

    x^{(i)} = x(x − 1)(x − 2) ⋯ (x − i + 1).

In particular, x^{(0)} = 1, x^{(1)} = x, x^{(2)} = x(x − 1), and x^{(3)} = x(x − 1)(x − 2).
Set B = {1, x, x^{(2)}, x^{(3)}} and B′ = {1, x, x^2, x^3}.
1. Show that B is a basis for V.
2. Find the change of basis matrix from B to B′.
This gives the language that allows one to translate the results we prove in
these notes from the language of vector spaces and linear transformations to the
language of F^n and matrices. Many of the familiar results from undergraduate
linear algebra will be proven in the problems at the end of the chapter.
3.2 Transpose of a matrix via the dual map
Recall at the end of the last chapter we introduced the dual map. Namely, given a linear map $T \in \mathrm{Hom}_F(V, W)$, we defined a map $T^{\vee} : W^{\vee} \to V^{\vee}$ given by $T^{\vee}(\varphi) = \varphi \circ T$. We also remarked at that point that this map could be used to properly define a transpose. This section deals with this construction.
Given a basis $\mathcal{B} = \{v_1, \dots, v_n\}$ of $V$ and a basis $\mathcal{C} = \{w_1, \dots, w_m\}$ of $W$, we have dual bases $\mathcal{B}^{\vee}$ of $V^{\vee}$ and $\mathcal{C}^{\vee}$ of $W^{\vee}$. The previous section gave an associated matrix to any linear transformation, so we have a matrix $[T^{\vee}]_{\mathcal{C}^{\vee}}^{\mathcal{B}^{\vee}} \in \mathrm{Mat}_{n,m}(F)$.
Definition 3.2.1. Let $T \in \mathrm{Hom}_F(V, W)$, $\mathcal{B}$ a basis of $V$, $\mathcal{C}$ a basis of $W$, and set $A = [T]_{\mathcal{B}}^{\mathcal{C}}$. The transpose of $A$, denoted ${}^t A$, is given by
$${}^t A = [T^{\vee}]_{\mathcal{C}^{\vee}}^{\mathcal{B}^{\vee}}.$$
For this definition to be of interest we need to show it agrees with the definition of a transpose given in undergraduate linear algebra.
Lemma 3.2.2. Let $A = (a_{ij}) \in \mathrm{Mat}_{m,n}(F)$. Then ${}^t A = (b_{ij}) \in \mathrm{Mat}_{n,m}(F)$ with $b_{ij} = a_{ji}$.
Proof. Let $\mathcal{E}_n = \{e_1, \dots, e_n\}$ be the standard basis of $F^n$ and $\mathcal{F}_m = \{f_1, \dots, f_m\}$ the standard basis for $F^m$. Let $\mathcal{E}_n^{\vee}$ and $\mathcal{F}_m^{\vee}$ be the dual bases. Let $T$ be the linear map associated to $A$, i.e., $[T]_{\mathcal{E}_n}^{\mathcal{F}_m} = A$. In particular, we have
$$T(e_i) = \sum_{k=1}^{m} a_{ki} f_k.$$
We also have that $[T^{\vee}]_{\mathcal{F}_m^{\vee}}^{\mathcal{E}_n^{\vee}}$ is a matrix $B = (b_{ij}) \in \mathrm{Mat}_{n,m}(F)$ where the entries of $B$ are given by
$$T^{\vee}(f_j^{\vee}) = \sum_{k=1}^{n} b_{kj} e_k^{\vee}.$$
If we apply $f_j^{\vee}$ to the first sum we see
$$f_j^{\vee}(T(e_i)) = \sum_{k=1}^{m} a_{ki} f_j^{\vee}(f_k) = a_{ji}.$$
If we evaluate the second sum at $e_i$ we have
$$T^{\vee}(f_j^{\vee})(e_i) = \sum_{k=1}^{n} b_{kj} e_k^{\vee}(e_i) = b_{ij}.$$
We now use the definition of the map $T^{\vee}$ to see that $f_j^{\vee}(T(e_i)) = T^{\vee}(f_j^{\vee})(e_i)$, and so $a_{ji} = b_{ij}$, as desired.
Exercise 3.2.3. Let $A_1, A_2 \in \mathrm{Mat}_{m,n}(F)$. Use the definition given above to show that ${}^t(A_1 + A_2) = {}^t A_1 + {}^t A_2$.
Exercise 3.2.4. Let $A \in \mathrm{Mat}_{m,n}(F)$ and $c \in F$. Use the definition given above to show that ${}^t(cA) = c\, {}^t A$.
We can use our definition of the transpose to give a very simple proof of the following fact.
Lemma 3.2.5. Let $A \in \mathrm{Mat}_{m,n}(F)$ and $B \in \mathrm{Mat}_{p,m}(F)$. Then ${}^t(BA) = {}^t A\, {}^t B$.
Proof. Write $\mathcal{E}_m$ for the standard basis of $F^m$, and likewise for $\mathcal{E}_n$ and $\mathcal{E}_p$. Let $S$ be the multiplication by $A$ map and $T$ the multiplication by $B$ map, so that $A = [S]_{\mathcal{E}_n}^{\mathcal{E}_m}$ and $B = [T]_{\mathcal{E}_m}^{\mathcal{E}_p}$. We also have $BA = [T \circ S]_{\mathcal{E}_n}^{\mathcal{E}_p}$. We now have
$${}^t(BA) = [(T \circ S)^{\vee}]_{\mathcal{E}_p^{\vee}}^{\mathcal{E}_n^{\vee}} = [S^{\vee} \circ T^{\vee}]_{\mathcal{E}_p^{\vee}}^{\mathcal{E}_n^{\vee}} = [S^{\vee}]_{\mathcal{E}_m^{\vee}}^{\mathcal{E}_n^{\vee}}\, [T^{\vee}]_{\mathcal{E}_p^{\vee}}^{\mathcal{E}_m^{\vee}} = {}^t A\, {}^t B,$$
as claimed.
We also get the transpose of the inverse of a matrix very easily.
Lemma 3.2.6. Let $A \in \mathrm{GL}_n(F)$. Then ${}^t(A^{-1}) = ({}^t A)^{-1}$.
Proof. The strategy of proof is to show that ${}^t(A^{-1})$ satisfies the conditions of being an inverse of ${}^t A$, i.e., ${}^t A\, {}^t(A^{-1}) = 1_n = {}^t(A^{-1})\, {}^t A$. We then use that inverses in a group are unique, so it must be that ${}^t(A^{-1}) = ({}^t A)^{-1}$.
Let $\mathcal{E}_n$ be the standard basis of $F^n$ and let $T$ be the multiplication by $A$ map, i.e., $A = [T]_{\mathcal{E}_n}^{\mathcal{E}_n}$. By assumption $A$ is invertible, so $T^{-1}$ exists and $A^{-1} = [T^{-1}]_{\mathcal{E}_n}^{\mathcal{E}_n}$. Write $I$ for the identity map and continue to denote the identity matrix by $1_n$. Observe we have
$$1_n = [I^{\vee}]_{\mathcal{E}_n^{\vee}}^{\mathcal{E}_n^{\vee}} = [(T^{-1} \circ T)^{\vee}]_{\mathcal{E}_n^{\vee}}^{\mathcal{E}_n^{\vee}} = [T^{\vee} \circ (T^{-1})^{\vee}]_{\mathcal{E}_n^{\vee}}^{\mathcal{E}_n^{\vee}} = [T^{\vee}]_{\mathcal{E}_n^{\vee}}^{\mathcal{E}_n^{\vee}}\, [(T^{-1})^{\vee}]_{\mathcal{E}_n^{\vee}}^{\mathcal{E}_n^{\vee}} = {}^t A\, {}^t(A^{-1}).$$
Similarly one has $1_n = {}^t(A^{-1})\, {}^t A$. As noted above, the uniqueness of inverses in $\mathrm{GL}_n(F)$ completes the proof.
3.3 Problems
For all of these problems $V$ is a finite dimensional $F$-vector space.
1. Let $V = P_n(F)$. Let $\mathcal{B} = \{1, x, \dots, x^n\}$ be a basis of $P_n(F)$. Let $\alpha \in F$ and set $\mathcal{C} = \{1, x - \alpha, \dots, (x-\alpha)^{n-1}, (x-\alpha)^n\}$. Define a linear transformation $T \in \mathrm{Hom}_F(V, V)$ by defining $T(x^j) = (x-\alpha)^j$. Determine the matrix of this linear transformation. Use this to conclude that $\mathcal{C}$ is also a basis of $P_n(F)$.
2. Let $V = P_5(\mathbb{Q})$ and let $\mathcal{B} = \{1, x, \dots, x^5\}$. Prove that the following are elements of $V^{\vee}$ and express them as linear combinations of the dual basis:
(a) $\varphi : V \to \mathbb{Q}$ defined by $\varphi(p(x)) = \int_0^1 t^2 p(t)\, dt$.
(b) $\psi : V \to \mathbb{Q}$ defined by $\psi(p(x)) = p'(5)$ where $p'(x)$ denotes the derivative of $p(x)$.
3. Let $V$ be a vector space over $F$ and let $T \in \mathrm{Hom}_F(V, V)$. A nonzero element $v \in V$ satisfying $T(v) = \lambda v$ for some $\lambda \in F$ is called an eigenvector of $T$ with eigenvalue $\lambda$.
(a) Prove that for any fixed $\lambda \in F$ the collection of eigenvectors of $T$ with eigenvalue $\lambda$, together with $0$, forms a subspace of $V$.
(b) Prove that if $V$ has a basis $\mathcal{B}$ consisting of eigenvectors for $T$, then $[T]_{\mathcal{B}}^{\mathcal{B}}$ is a diagonal matrix with the eigenvalues of $T$ as diagonal entries.
4. Let $A, B \in \mathrm{Mat}_n(F)$. We say $A$ and $B$ are similar if there exists $T \in \mathrm{Hom}_F(V, V)$ for some $n$-dimensional $F$-vector space $V$ so that $A = [T]_{\mathcal{B}}$ and $B = [T]_{\mathcal{C}}$ for bases $\mathcal{B}$ and $\mathcal{C}$ of $V$.
(a) Show that if $A$ and $B$ are similar, there exists $P \in \mathrm{GL}_n(F)$ so that $A = PBP^{-1}$. Conversely, if there exists $P \in \mathrm{GL}_n(F)$ so that $A = PBP^{-1}$, show that $A$ is similar to $B$.
(b) Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be defined so that $T = T_A$ where
$$A = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 1 & 3 \\ 1 & 2 & 4 \end{pmatrix}.$$
Let
$$\mathcal{B} = \left\{ \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix}, \begin{pmatrix} 1 \\ 3 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \right\}$$
be a basis of $\mathbb{R}^3$. First, calculate $[T]_{\mathcal{B}}$ directly. Then find $P$ so that $[T]_{\mathcal{B}} = PAP^{-1}$.
5. Let $T \in \mathrm{Hom}_{\mathbb{R}}(\mathbb{R}^4, \mathbb{R}^4)$ be the linear transformation given by the matrix
$$A = \begin{pmatrix} 1 & 1 & 0 & 3 \\ 1 & 2 & 1 & 1 \\ 1 & 1 & 0 & 3 \\ 1 & 2 & 1 & 1 \end{pmatrix}$$
with respect to the standard basis $\mathcal{E}_4$. Determine a basis for the image and kernel of $T$.
6. Let $T \in \mathrm{Hom}_F(P_7(F), P_7(F))$ be defined by $T(f) = f'$, where $f'$ denotes the usual derivative of a polynomial $f \in P_7(F)$. For each of the fields below, determine a basis for the image and kernel of $T$:
(a) $F = \mathbb{R}$
(b) $F = \mathbb{F}_3$.
7. Let $V$ and $W$ be $F$-vector spaces of dimensions $n$ and $m$ respectively. Let $A \in \mathrm{Mat}_{m,n}(F)$ be a matrix representing a linear transformation $T$ from $V$ to $W$ with respect to bases $\mathcal{B}_1$ for $V$ and $\mathcal{C}_1$ for $W$. Suppose $B$ is the matrix for $T$ with respect to the bases $\mathcal{B}_2$ for $V$ and $\mathcal{C}_2$ for $W$. Let $I$ denote the identity map on $V$ and $\tilde{I}$ denote the identity map on $W$. Set $P = [I]_{\mathcal{B}_2}^{\mathcal{B}_1}$ and $Q = [\tilde{I}]_{\mathcal{C}_2}^{\mathcal{C}_1}$. Prove that $Q^{-1} = [\tilde{I}]_{\mathcal{C}_1}^{\mathcal{C}_2}$ and that $Q^{-1} A P = B$.
The next problems recall Gaussian elimination. First we recall the set-up from undergraduate linear algebra.
Consider a system of equations
$$a_{11}x_1 + \cdots + a_{1n}x_n = c_1$$
$$a_{21}x_1 + \cdots + a_{2n}x_n = c_2$$
$$\vdots$$
$$a_{m1}x_1 + \cdots + a_{mn}x_n = c_m$$
for unknowns $x_1, \dots, x_n$ and scalars $a_{ij}, c_i$. We have a coefficient matrix $A = (a_{ij}) \in \mathrm{Mat}_{m,n}(F)$, and an augmented matrix $(A \mid C) \in \mathrm{Mat}_{m,n+1}(F)$ where we add the column vector given by the $c_i$'s on the right side. Note the solutions to the equations above are not altered if we perform the following operations:
1. interchange any two equations
2. add a multiple of one equation to another
3. multiply any equation by a nonzero element of $F$.
In terms of the matrix these correspond to the elementary row operations given by
1. interchange any two rows
2. add a multiple of one row to another
3. multiply any row by a nonzero element of $F$.
A matrix $A$ that can be transformed into a matrix $B$ by a series of elementary row operations is said to be row reduced to $B$.
8. Describe the elementary row operations in terms of problem 1. In particular, explain what each one does on a basis. You can do this separately for each elementary operation.
We say $A \sim B$ if $A$ can be row reduced to $B$.
9. Prove that $\sim$ is an equivalence relation. Prove that if $A \sim B$ then the row rank of $A$ is the same as the row rank of $B$.
An $m$ by $n$ matrix is said to be in reduced row echelon form if
1. the first nonzero entry $a_{i,j_i}$ in row $i$ is 1 and all other entries in the corresponding $j_i$-th column are 0, and
2. $j_1 < j_2 < \cdots < j_r$ where $r$ is the number of nonzero rows.
An augmented matrix $(A \mid C)$ is said to be in reduced row echelon form if the coefficient matrix $A$ is in reduced row echelon form. For example, the following matrix is in reduced row echelon form:
$$A = \begin{pmatrix} 1 & 0 & 5 & 7 & 0 & 3 \\ 0 & 1 & 1 & 1 & 0 & 4 \\ 0 & 0 & 0 & 0 & 1 & 6 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$
The first nonzero entry in any given row of a reduced row echelon matrix is referred to as a pivotal element. The columns containing pivotal elements are referred to as pivotal columns.
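Computer algebra systems implement exactly this normal form. As an illustration (Python with sympy assumed, as in the earlier sketches), rref returns the reduced row echelon form together with the pivotal columns, and a matrix already in reduced row echelon form, such as the example above, is left unchanged:

from sympy import Matrix

A = Matrix([[1, 0, 5, 7, 0, 3],
            [0, 1, 1, 1, 0, 4],
            [0, 0, 0, 0, 1, 6],
            [0, 0, 0, 0, 0, 0]])
R, pivots = A.rref()
print(R == A)    # True: the matrix is already in reduced row echelon form
print(pivots)    # (0, 1, 4): the pivotal columns, indexed from 0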
10. Prove by induction that any augmented matrix can be put in reduced row echelon form by a series of elementary row operations.
11. Let $A$ and $B$ be two matrices in reduced row echelon form. Prove that if $A$ and $B$ are row equivalent, then $A = B$.
12. Prove that the row rank of a matrix in reduced row echelon form is the number of nonzero rows.
13. Find the reduced row echelon form of the matrix
$$A = \begin{pmatrix} 1 & 1 & 4 & 8 & 0 & 1 \\ 1 & 2 & 3 & 9 & 0 & 5 \\ 0 & 2 & 2 & 2 & 1 & 14 \\ 1 & 4 & 1 & 11 & 0 & 13 \end{pmatrix}.$$
14. Use what you have done above to find solutions of the system of equations
$$x - 2y + z = 5$$
$$x - 4y + 6z = 10$$
$$4x - 11y + 11z = 12.$$
15. Let $V$ be an $n$-dimensional $F$-vector space with basis $\mathcal{E} = \{e_1, \dots, e_n\}$ and let $W$ be an $m$-dimensional $F$-vector space with basis $\mathcal{F} = \{f_1, \dots, f_m\}$. Let $T \in \mathrm{Hom}_F(V, W)$ with $[T]_{\mathcal{E}}^{\mathcal{F}} = (a_{ij}) = A$. Let $A'$ be the reduced row echelon form of $A$.
(a) Prove that the image $T(V)$ has dimension $r$, where $r$ is the number of nonzero rows of $A'$, and that a basis for $T(V)$ is given by the vectors $T(e_{j_i})$, i.e., the columns of $A$ corresponding to the pivotal columns of $A'$ give the coordinates of a basis for the image of $T$.
The elements in the kernel of $T$ are the vectors in $V$ whose coordinates $(x_1, \dots, x_n)$ with respect to $\mathcal{E}$ satisfy the equation
$$A \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = 0,$$
and the solutions $x_1, \dots, x_n$ to this system of linear equations are determined by the matrix $A'$.
(b) Prove that $T$ is injective if and only if $A'$ has $n$ nonzero rows.
(c) By (a) the kernel of $T$ is nontrivial if and only if $A'$ has nonpivotal columns. Show that each of the variables $x_1, \dots, x_n$ above corresponding to the nonpivotal columns of $A'$ can be prescribed arbitrarily, and the values of the remaining variables are then uniquely determined to give an element $x_1 e_1 + \cdots + x_n e_n$ in the kernel of $T$. In particular, show that the coordinates of a basis for the kernel are obtained by successively setting one nonpivotal variable equal to 1 and all other nonpivotal variables to 0 and solving for the remaining pivotal variables. Conclude that the kernel of $T$ has dimension $n - r$ where $r$ is the rank of $A$.
(d) Give a basis for the image and kernel of $T$ if the matrix associated to $T$ with respect to the standard bases is
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 4 \\ 2 & 4 & 0 & 0 & 2 \\ 1 & 2 & 0 & 1 & 2 \\ 1 & 2 & 0 & 0 & 1 \end{pmatrix}.$$
Chapter 4
Structure Theorems for Linear Transformations
Let $T \in \mathrm{Hom}_F(V, V)$ and let $\mathcal{B}$ be a basis for $V$, where $\dim_F V < \infty$; then we have a matrix $[T]_{\mathcal{B}}$. Our goal is to pick this basis so that $[T]_{\mathcal{B}}$ is as nice as possible. Throughout this chapter we will always take $V$ to be a finite dimensional vector space.
Though we do not formally introduce determinants until Chapter 7, we use determinants throughout this chapter. Here one should just treat the determinant as it was used in undergraduate linear algebra, namely, in terms of the cofactor expansion.
4.1 Invariant subspaces
In this section we will define some of the basic objects needed throughout the remainder of the chapter. However, we begin with an example.
Example 4.1.1. Let $V$ be a 2-dimensional vector space with basis $\{v_1, v_2\}$. Let $T \in \mathrm{Hom}_F(V, V)$ be such that $T(v_1) = v_1 + 3v_2$ and $T(v_2) = 2v_1 + 4v_2$. Thus,
$$[T]_{\mathcal{B}} = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}.$$
It is natural to ask if there is a basis $\mathcal{C}$ so that $[T]_{\mathcal{C}}$ is a diagonal matrix. We will see that if $F = \mathbb{Q}$, there is no such basis $\mathcal{C}$. However, if $\mathbb{Q}(\sqrt{33}) \subseteq F$, then there is a basis $\mathcal{C}$ so that
$$[T]_{\mathcal{C}} = \begin{pmatrix} \frac{5 + \sqrt{33}}{2} & 0 \\ 0 & \frac{5 - \sqrt{33}}{2} \end{pmatrix}.$$
We can also consider the question over finite fields. For instance, if $F = \mathbb{F}_3$ the matrix cannot be diagonalized, though there is a basis $\mathcal{C}$ so that
$$[T]_{\mathcal{C}} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix};$$
if $F = \mathbb{F}_5$ the matrix cannot be diagonalized either. The results of this chapter will make answering such questions routine.
Let $V$ be an $F$-vector space with $\dim_F V = n$ and $T \in \mathrm{Hom}_F(V, V)$. Let $f(x) \in F[x]$. We will use throughout this chapter the notation $f(T)$. If $f(x) = a_m x^m + \cdots + a_1 x + a_0$, we will view $f(T)$ as the linear map $a_m T^m + \cdots + a_1 T + a_0$, where $T^m = T \circ T \circ \cdots \circ T$ with $m$ copies of $T$.
Theorem 4.1.2. Let $v \in V$ with $v \neq 0$. There is a unique nonzero monic polynomial $m_{T,v}(x) \in F[x]$ of lowest degree so that $m_{T,v}(T)(v) = 0$. Moreover, $\deg m_{T,v}(x) \leq \dim_F V$.
Proof. Consider the set $\{v, T(v), \dots, T^n(v)\}$. There are $n+1$ vectors, so they must be linearly dependent, i.e., there exist $a_0, \dots, a_n \in F$, not all zero, such that $a_n T^n(v) + \cdots + a_1 T(v) + a_0 v = 0$. Set $p(x) = a_n x^n + \cdots + a_1 x + a_0$. We have that $p(T)(v) = 0$.
Consider the subset $I_v \subseteq F[x]$ given by $I_v = \{f(x) : f(T)(v) = 0\}$. Since $p(x) \in I_v$, the set $I_v$ contains a nonzero element. Pick $\tilde{f}(x) \in I_v$ nonzero of minimal degree and write $\tilde{f}(x) = a_m x^m + \cdots + a_1 x + a_0$ with $a_m \neq 0$, $m \leq n$. Set $f(x) = \frac{1}{a_m} \tilde{f}(x) \in I_v$. This is a monic polynomial of minimal degree in $I_v$.
It remains to show that $f(x)$ is unique. Suppose $g(x) \in I_v$ with $\deg g = m$ and $g$ monic. Write $f(x) = q(x)g(x) + r(x)$ for $q(x), r(x) \in F[x]$ with $r(x) = 0$ or $\deg r(x) < m$. Rewriting this gives $r(x) = f(x) - q(x)g(x)$. Observe $r(T)(v) = f(T)(v) - q(T)g(T)(v) = 0$, and so $r(x) \in I_v$. However, this contradicts $f$ having minimal degree in $I_v$ unless $r(x) = 0$. Thus, $f(x) = q(x)g(x)$. Now observe that since $\deg f = m = \deg g$, we must have $\deg q = 0$. This gives $q(x) \in F$. Now we use that $f$ and $g$ are monic to see $q = 1$, and so $f = g$. This gives the result.
One should note the previous proof amounted to showing that $F[x]$ is a principal ideal domain. Since we are not assuming that level of abstract algebra, the proof is necessary. We will use this type of argument repeatedly, so it is important to understand it at this point.
Definition 4.1.3. The polynomial $m_{T,v}(x)$ is referred to as the T-annihilator of $v$.
Corollary 4.1.4. Let $v \in V$ and $T \in \mathrm{Hom}_F(V, V)$. If $f(x) \in F[x]$ satisfies $f(T)(v) = 0$, then $m_{T,v}(x) \mid f(x)$.
Proof. We showed this in the course of proving the previous theorem, but we repeat the argument here, as this fact will be extremely important in what follows. Let $f(x) \in F[x]$ be so that $f(T)(v) = 0$. Using the division algorithm we can write
$$f(x) = q(x) m_{T,v}(x) + r(x)$$
with $q, r \in F[x]$ and $r(x) = 0$ or $\deg r < \deg m_{T,v}$. We have
$$0 = f(T)(v) = q(T) m_{T,v}(T)(v) + r(T)(v) = r(T)(v).$$
This contradicts the minimality of the degree of $m_{T,v}(x)$ unless $r(x) = 0$. Thus, we have the result.
Example 4.1.5. Let $V = \mathbb{R}^n$ and let $\mathcal{E}_n = \{e_1, \dots, e_n\}$ be the standard basis of $V$. Define $T : V \to V$ by $T(e_j) = e_{j-1}$ for $2 \leq j \leq n$ and $T(e_1) = 0$. We calculate $m_{T,e_j}(x)$ for each $j$.
Observe that if we set $f_1(x) = x$, then $f_1(T)(e_1) = T(e_1) = 0$. Moreover, if $g(x) = a$ is a nonzero constant, then $g(T)(e_1) \neq 0$. Thus, using the above corollary, we have that $m_{T,e_1}(x)$ is a monic polynomial of degree at least one that divides $f_1(x) = x$, i.e., $m_{T,e_1}(x) = x$.
Next we calculate $m_{T,e_2}(x)$. Observe that if we set $f_2(x) = x^2$, then $f_2(T)(e_2) = T^2(e_2) = T(e_1) = 0$. Thus, $m_{T,e_2} \mid f_2$. This gives that $m_{T,e_2}(x) = 1$, $x$, or $x^2$, since it must be monic. Since $T(e_2) = e_1 \neq 0$, it must be that $m_{T,e_2}(x) = x^2$. Similarly, one obtains that $m_{T,e_j}(x) = x^j$.
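The computation in Example 4.1.5 can be automated: $m_{T,v}(x)$ is found by listing $v, T(v), T^2(v), \dots$ until the first linear dependence appears. A sketch in Python with sympy; the helper annihilator is my own, written for illustration:

from sympy import Matrix, symbols, zeros, eye, expand

x = symbols('x')
n = 4
A = zeros(n)
for j in range(1, n):          # T(e_j) = e_{j-1}, T(e_1) = 0:
    A[j - 1, j] = 1            # ones on the superdiagonal

def annihilator(A, v, x):
    """Least monic p with p(A)v = 0, via the first dependence among v, Av, ..."""
    vecs = [v]
    for d in range(1, A.shape[0] + 1):
        vecs.append(A * vecs[-1])
        null = Matrix.hstack(*vecs).nullspace()
        if null:
            c = null[0] / null[0][d]           # normalize: monic relation
            return expand(sum(c[j] * x**j for j in range(d + 1)))

for j in range(n):
    print('e_%d:' % (j + 1), annihilator(A, eye(n).col(j), x))
# e_1: x,  e_2: x**2,  e_3: x**3,  e_4: x**4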
Example 4.1.6. We now return to Example 4.1.1 and adopt the notation used there. We find $m_{T,v_1}(x)$ and leave the calculation of $m_{T,v_2}(x)$ as an exercise. We have $T(v_1) = v_1 + 3v_2$ and $T^2(v_1) = T(v_1) + 3T(v_2) = v_1 + 3v_2 + 3(2v_1 + 4v_2) = 7v_1 + 15v_2$. Since $V$ has dimension 2, we know there exist $b, c \in F$ such that $T^2(v_1) + bT(v_1) + cv_1 = 0$. Finding $b$ and $c$ amounts to solving a system of linear equations. We obtain $b = -5$, $c = -2$. So $f_1(x) = x^2 - 5x - 2$ satisfies $f_1(T)(v_1) = 0$. Whether this is $m_{T,v_1}(x)$ depends on the field $F$ and whether we can factor over $F$. For example, over $\mathbb{Q}$ this is an irreducible polynomial, and so it must be that $m_{T,v_1}(x) = x^2 - 5x - 2$ over $\mathbb{Q}$. In fact, we will later see this means that $m_{T,v_1}(x) = x^2 - 5x - 2$ over any field $F$ that contains $\mathbb{Q}$. One other thing to observe is that the roots of this polynomial are $\frac{5}{2} \pm \frac{\sqrt{33}}{2}$.
Exercise 4.1.7. Redo the previous example with $F = \mathbb{F}_3$ this time.
Though the annihilating polynomials are useful, they only tell us about the linear map element by element. What we would really like is a polynomial that tells us about the overall linear map. The minimal polynomial provides just such a polynomial.
Theorem 4.1.8. Let $\dim_F V = n$. There is a unique monic polynomial $m_T(x) \in F[x]$ of lowest degree so that $m_T(T)(v) = 0$ for all $v \in V$. Furthermore, $\deg m_T(x) \leq n^2$.
Proof. Let $\mathcal{B} = \{v_1, \dots, v_n\}$ be a basis for $V$. Let $m_{T,v_i}(x)$ be the annihilating polynomial for each $i$. Set $m_T(x) = \mathrm{lcm}_i\, m_{T,v_i}(x)$. Note that $m_T(x)$ is monic and $m_T(T)(v_i) = 0$ for each $1 \leq i \leq n$. From this it is easy to show that $m_T(T)(v) = 0$ for all $v \in V$. Since each $m_{T,v_i}(x)$ has degree at most $n$ and there are $n$ polynomials, we must have $\deg m_T(x) \leq \sum_{i=1}^{n} n = n^2$.
It remains to show $m_T(x)$ is the unique monic polynomial of minimal degree with this property. Suppose there exists $r(x) \in F[x]$ with $r(T)(v) = 0$ for all $v \in V$ but $m_{T,v_j}(x) \nmid r(x)$ for some $j$. This is a contradiction, because if $m_{T,v_j}(x) \nmid r(x)$, then $r(T)(v_j) \neq 0$. Thus every such $r(x)$ is divisible by each $m_{T,v_j}(x)$, so by the definition of least common multiple we must have $m_T(x) \mid r(x)$, and so $m_T(x)$ is the unique monic polynomial of minimal degree satisfying $m_T(T)(v) = 0$ for all $v \in V$.
We note the following corollary, which was shown in the process of proving the last result.
Corollary 4.1.9. Let $T \in \mathrm{Hom}_F(V, V)$ and suppose $f(x) \in F[x]$ with $f(T)(v) = 0$ for all $v \in V$. Then $m_T(x) \mid f(x)$.
The bound on the degree of $m_T(x)$ given above is far from optimal. In fact, we will see shortly that $\deg m_T(x) \leq n$.
Definition 4.1.10. The polynomial $m_T(x)$ is called the minimal polynomial of $T$.
Example 4.1.11. We once again return to Example 4.1.1. Let $F = \mathbb{Q}$. We saw $m_{T,v_1}(x) = x^2 - 5x - 2$, and one saw in the exercise that $m_{T,v_2}(x) = x^2 - 5x - 2$. Thus,
$$m_T(x) = \mathrm{lcm}(x^2 - 5x - 2,\ x^2 - 5x - 2) = x^2 - 5x - 2.$$
Example 4.1.12. Let $V = \mathbb{Q}^3$ and let $\mathcal{E}_3 = \{e_1, e_2, e_3\}$ be the standard basis. Let $T$ be given by the matrix
$$\begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & -1 \end{pmatrix}.$$
One can calculate from this that $m_{T,e_1}(x) = x - 1$, $m_{T,e_2}(x) = (x-1)^2$, and $m_{T,e_3}(x) = (x-1)^2(x+1)$. Thus,
$$m_T(x) = \mathrm{lcm}\left((x-1),\ (x-1)^2,\ (x-1)^2(x+1)\right) = (x-1)^2(x+1).$$
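Example 4.1.12 is easy to verify by machine: compute each $m_{T,e_i}(x)$ as in the previous sketch and take the least common multiple. Again in Python with sympy, and again purely illustrative:

from sympy import Matrix, symbols, eye, expand, factor, lcm

x = symbols('x')
A = Matrix([[1, 2, 3],
            [0, 1, 4],
            [0, 0, -1]])

def annihilator(A, v, x):
    """Least monic p with p(A)v = 0, via the first dependence among v, Av, ..."""
    vecs = [v]
    for d in range(1, A.shape[0] + 1):
        vecs.append(A * vecs[-1])
        null = Matrix.hstack(*vecs).nullspace()
        if null:
            c = null[0] / null[0][d]
            return expand(sum(c[j] * x**j for j in range(d + 1)))

ps = [annihilator(A, eye(3).col(j), x) for j in range(3)]
print([factor(p) for p in ps])   # [x - 1, (x - 1)**2, (x - 1)**2*(x + 1)]
print(factor(lcm(ps)))           # m_T(x) = (x - 1)**2*(x + 1)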
The following result is the first step in showing that one can realize the minimal polynomial of any linear map as the annihilator of an element of the vector space.
Lemma 4.1.13. Let $T \in \mathrm{Hom}_F(V, V)$. Let $v_1, \dots, v_k \in V$ and set $p_i(x) = m_{T,v_i}(x)$. Suppose the $p_i(x)$ are pairwise relatively prime. Set $v = v_1 + \cdots + v_k$. Then $m_{T,v}(x) = p_1(x) \cdots p_k(x)$.
Proof. We prove the case $k = 2$; the general case follows by induction and is left as an exercise. Let $p_1(x)$ and $p_2(x)$ be as in the statement of the lemma. The fact that they are relatively prime gives polynomials $q_1(x), q_2(x) \in F[x]$ such that $p_1(x)q_1(x) + p_2(x)q_2(x) = 1$. Thus, we have
$$v = 1 \cdot v = (p_1(T)q_1(T) + p_2(T)q_2(T))v = q_1(T)p_1(T)(v_1) + q_1(T)p_1(T)(v_2) + q_2(T)p_2(T)(v_1) + q_2(T)p_2(T)(v_2) = q_1(T)p_1(T)(v_2) + q_2(T)p_2(T)(v_1),$$
where we have used $p_1(T)(v_1) = 0$ and $p_2(T)(v_2) = 0$. Set $w_1 = q_2(T)p_2(T)(v_1)$ and $w_2 = q_1(T)p_1(T)(v_2)$, so that $v = w_1 + w_2$.
Observe that
$$p_1(T)(w_1) = p_1(T)q_2(T)p_2(T)(v_1) = q_2(T)p_2(T)p_1(T)(v_1) = 0.$$
Thus, $w_1 \in \ker p_1(T)$. Similarly, $w_2 \in \ker p_2(T)$.
Let $r(x) \in F[x]$ be such that $r(T)(v) = 0$. Recall that $v = w_1 + w_2$. Since $w_2 \in \ker p_2(T)$ we have
$$p_2(T)(v) = p_2(T)(w_1 + w_2) = p_2(T)(w_1).$$
Thus,
$$0 = p_2(T)q_2(T)r(T)(v) = r(T)q_2(T)p_2(T)(v) = r(T)q_2(T)p_2(T)(w_1).$$
Moreover, we have
$$0 = r(T)p_1(T)q_1(T)(w_1)$$
since $w_1 \in \ker p_1(T)$. Thus,
$$0 = r(T)\left(p_1(T)q_1(T) + p_2(T)q_2(T)\right)(w_1).$$
Using that $p_1(x)q_1(x) + p_2(x)q_2(x) = 1$, we obtain $0 = r(T)(w_1)$. Thus, we have $0 = r(T)(w_1) = r(T)(p_2(T)q_2(T)(v_1))$. Since $p_1(x)$ is the annihilating polynomial of $v_1$, we obtain $p_1(x) \mid r(x)p_2(x)q_2(x)$. However, since $p_1(x)q_1(x) + p_2(x)q_2(x) = 1$, we have that $p_1(x)$ is relatively prime to $p_2(x)q_2(x)$, so $p_1(x) \mid r(x)$. A similar argument shows $p_2(x) \mid r(x)$. Since $p_1(x), p_2(x)$ are relatively prime, $p_1(x)p_2(x) \mid r(x)$. Observe that
$$p_1(T)p_2(T)(v) = p_1(T)p_2(T)(v_1) + p_1(T)p_2(T)(v_2) = 0.$$
Thus $p_1(x)p_2(x)$ is a monic polynomial killing $v$, and for any $r(x) \in F[x]$ so that $r(T)(v) = 0$ we have $p_1(x)p_2(x) \mid r(x)$. This is exactly what it means for $m_{T,v}(x) = p_1(x)p_2(x)$.
Theorem 4.1.14. Let $T \in \mathrm{Hom}_F(V, V)$. There exists $v \in V$ such that $m_{T,v}(x) = m_T(x)$.
Proof. Let $\{v_1, \dots, v_n\}$ be a basis of $V$, so that $m_T(x) = \mathrm{lcm}_j\, m_{T,v_j}(x)$. Factor $m_T(x)$ into irreducible factors $p_1(x)^{e_1} \cdots p_k(x)^{e_k}$ with $e_i \geq 1$ and the $p_i(x)$ pairwise relatively prime. For each $i \in \{1, \dots, k\}$, since the lcm of the $m_{T,v_j}(x)$ is divisible by $p_i(x)^{e_i}$, there exist $j_i \in \{1, \dots, n\}$ and $q_i(x) \in F[x]$ with $p_i(x) \nmid q_i(x)$ such that $m_{T,v_{j_i}}(x) = p_i(x)^{e_i} q_i(x)$. Set $u_i = q_i(T)(v_{j_i})$. Then $m_{T,u_i}(x) = p_i(x)^{e_i}$. Thus, if we set $v = \sum_i u_i$, then Lemma 4.1.13 gives
$$m_{T,v}(x) = p_1(x)^{e_1} \cdots p_k(x)^{e_k} = m_T(x).$$
This result gives us the desired bound on the degree of $m_T(x)$.
Corollary 4.1.15. For $T \in \mathrm{Hom}_F(V, V)$ we have $\deg m_T(x) \leq n$.
Proof. Since there exists $v \in V$ so that $m_T(x) = m_{T,v}(x)$, and we know $\deg m_{T,v}(x) \leq n$, this gives the result.
Definition 4.1.16. Let $A \in \mathrm{Mat}_n(F)$. We define the characteristic polynomial of $A$ as $c_A(x) = \det(x 1_n - A) \in F[x]$.
One must be very careful with what is meant here. As we have seen above, we will often be interested in evaluating a polynomial at a linear map or a matrix. Without a more rigorous treatment, we have to be careful what we mean by this. For example, the Cayley–Hamilton theorem (see Theorem 4.1.30) says that $c_A(A) = 0$. At first glance this seems trivial, but that is only the case if you misinterpret what is meant by $c_A(A)$. What this actually means is: form the polynomial $c_A(x)$ and then replace the $x$'s with $A$'s. One easy way to see the difference is that $c_A(B)$ is a matrix for any matrix $B$, but if you evaluated $B 1_n - A$ and then took the determinant, this would give a scalar. A more rigorous treatment of the characteristic polynomial is given in the appendix. For a first reading of the material the more rigorous treatment can certainly be skipped.
Recall that we say matrices $A, B \in \mathrm{Mat}_n(F)$ are similar if there exists $P \in \mathrm{GL}_n(F)$ so that $A = PBP^{-1}$.
Lemma 4.1.17. Let $A, B \in \mathrm{Mat}_n(F)$ be similar matrices. Then $c_A(x) = c_B(x)$.
Proof. Let $P \in \mathrm{GL}_n(F)$ be such that $A = PBP^{-1}$. Then we have
$$c_A(x) = \det(x 1_n - A) = \det(x 1_n - PBP^{-1}) = \det(P (x 1_n) P^{-1} - PBP^{-1}) = \det(P(x 1_n - B)P^{-1}) = \det P \det(x 1_n - B) \det P^{-1} = \det(x 1_n - B) = c_B(x).$$
We can use this result to define the characteristic polynomial of a linear map as well. Given $T \in \mathrm{Hom}_F(V, V)$, we define $c_T(x) = c_{[T]_{\mathcal{B}}}(x)$ for any basis $\mathcal{B}$. Note this is well-defined because choosing a different basis gives a similar matrix, which does not affect the characteristic polynomial. If $\dim_F V = n$, then $\deg c_T(x) = n$ and $c_T(x)$ is monic.
Definition 4.1.18. Let $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 \in F[x]$. The companion matrix of $f$ is given by:
$$C(f(x)) = \begin{pmatrix} -a_{n-1} & 1 & 0 & \cdots & 0 \\ -a_{n-2} & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ -a_1 & 0 & 0 & \cdots & 1 \\ -a_0 & 0 & 0 & \cdots & 0 \end{pmatrix}$$
The companion matrix will be extremely important when we study rational canonical form.
Lemma 4.1.19. Let $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$. Set $A = C(f(x))$. Then $c_A(x) = f(x)$.
Proof. Observe we have
$$x 1_n - A = \begin{pmatrix} x + a_{n-1} & -1 & 0 & \cdots & 0 \\ a_{n-2} & x & -1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ a_1 & 0 & \cdots & x & -1 \\ a_0 & 0 & \cdots & 0 & x \end{pmatrix}.$$
We prove the result by induction on $n$. First, suppose $n = 1$. Then we have $f(x) = x + a_0$ and $A = (-a_0)$. So $x 1_n - A = (x + a_0)$, and thus $c_A(x) = \det(x + a_0) = x + a_0 = f(x)$, as claimed. Now suppose the result is true for all degrees at most $k - 1$. We show the result is true for $n = k$. Expanding $\det(x 1_k - A)$ along the bottom row $(a_0, 0, \dots, 0, x)$, we have
$$c_A(x) = \det(x 1_k - A) = (-1)^{k+1} a_0 \det \begin{pmatrix} -1 & 0 & \cdots & 0 \\ x & -1 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & x & -1 \end{pmatrix} + (-1)^{2k} x \det \begin{pmatrix} x + a_{k-1} & -1 & \cdots & 0 \\ a_{k-2} & x & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_1 & 0 & \cdots & x \end{pmatrix}.$$
The first minor is lower triangular of size $k-1$ with $-1$'s on the diagonal, so its determinant is $(-1)^{k-1}$. The second minor is $x 1_{k-1} - C(g(x))$ for $g(x) = x^{k-1} + a_{k-1}x^{k-2} + \cdots + a_1$, so by the induction hypothesis its determinant is $g(x)$. Thus
$$c_A(x) = (-1)^{k+1}(-1)^{k-1} a_0 + x \left( x^{k-1} + a_{k-1}x^{k-2} + \cdots + a_1 \right) = a_0 + x \left( x^{k-1} + a_{k-1}x^{k-2} + \cdots + a_1 \right) = f(x).$$
Thus, we have the result by induction.
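The identity $c_{C(f)}(x) = f(x)$ is easy to test numerically. Here is a sketch building the companion matrix in the convention of Definition 4.1.18; Python with sympy is assumed, and the function name is mine:

from sympy import Matrix, Poly, symbols, zeros, expand

x = symbols('x')

def companion(f, x):
    """C(f) as in Definition 4.1.18 for monic f: first column holds
    -a_{n-1}, ..., -a_0, with ones on the superdiagonal."""
    cs = Poly(f, x).all_coeffs()       # [1, a_{n-1}, ..., a_0]
    n = len(cs) - 1
    C = zeros(n)
    for i in range(n):
        C[i, 0] = -cs[i + 1]
        if i < n - 1:
            C[i, i + 1] = 1
    return C

f = x**3 + 2*x**2 - x + 5
C = companion(f, x)
print(C.charpoly(x).as_expr() == expand(f))   # True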
The next theorem gives us our first serious result towards our structure theorems.
Theorem 4.1.20. Let $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 \in F[x]$. Let $A = C(f(x))$. Set $V = F^n$ and $T = T_A$. Let $\mathcal{E}_n = \{e_1, \dots, e_n\}$ be the standard basis of $F^n$. Then the subspace $W \subseteq V$ given by
$$W = \{g(T)(e_n) : g(x) \in F[x]\}$$
is all of $V$. Moreover, $m_T(x) = m_{T,e_n}(x) = f(x)$.
Proof. Observe we have $T(e_n) = e_{n-1}$, $T^2(e_n) = e_{n-2}$, and in general $T^k(e_n) = e_{n-k}$ for $k \leq n-1$. We have that $W$ contains $\mathrm{span}_F(T^{n-1}(e_n), \dots, T(e_n), e_n) = \mathrm{span}_F(e_1, \dots, e_n) = V$. Since $W \subseteq V$, this implies that $W = V$.
Note, we have $n$ elements in $\{T^{n-1}(e_n), \dots, T(e_n), e_n\}$ and they span $V$, so they are linearly independent. So any polynomial $p(x)$ such that $p(T)(e_n) = 0$ must have degree at least $n$. We have
$$T^n(e_n) = T(e_1) = -a_{n-1}e_1 - a_{n-2}e_2 - \cdots - a_0 e_n = -a_{n-1}T^{n-1}(e_n) - \cdots - a_1 T(e_n) - a_0 e_n.$$
Thus, $T^n(e_n) + a_{n-1}T^{n-1}(e_n) + \cdots + a_1 T(e_n) + a_0 e_n = 0$. This gives $f(T)(e_n) = 0$. Since $f$ has degree $n$ and is monic, we must have that $f(x) = m_{T,e_n}(x)$. We know $m_{T,e_n}(x) \mid m_T(x)$ and $\deg m_T(x) \leq n$, so $m_T(x) = m_{T,e_n}(x) = f(x)$.
We now switch our attention to subspaces of $V$ that are well-behaved with respect to $T$, namely, ones that are preserved by $T$.
Definition 4.1.21. A subspace $W \subseteq V$ that satisfies $T(W) \subseteq W$ is referred to as a T-invariant subspace or a T-stable subspace.
As is the custom, we rephrase the definition in terms of matrices.
Theorem 4.1.22. Let $V$ be an $n$-dimensional $F$-vector space and $W \subseteq V$ a $k$-dimensional subspace. Let $\mathcal{B}_W = \{v_1, \dots, v_k\}$ be a basis of $W$ and extend it to a basis $\mathcal{B} = \{v_1, \dots, v_n\}$ of $V$. Let $T \in \mathrm{Hom}_F(V, V)$. Then $W$ is T-invariant if and only if $[T]_{\mathcal{B}}$ is block upper triangular,
$$[T]_{\mathcal{B}} = \begin{pmatrix} A & B \\ 0 & D \end{pmatrix},$$
where $A \in \mathrm{Mat}_k(F)$ is given by $A = [T|_W]_{\mathcal{B}_W}$.
Note that if $W$ is a T-invariant subspace, then $T|_W \in \mathrm{Hom}_F(W, W)$. Moreover, we have a natural map $\overline{T} : V/W \to V/W$ given by $\overline{T}(v + W) = T(v) + W$. This is well-defined precisely because $W$ is T-invariant, which one should check as an exercise. We also have the following simple relations between the annihilating polynomials of $T$ and $\overline{T}$, as well as between the minimal polynomials. These will be very useful in the induction proofs that follow.
Lemma 4.1.23. Let $W \subseteq V$ be T-invariant and let $\overline{T}$ be the induced linear map on $V/W$. Let $v \in V$. Then $m_{\overline{T},[v]}(x) \mid m_{T,v}(x)$, where $[v] = v + W$.
Proof. Observe we have
$$m_{T,v}(\overline{T})([v]) = m_{T,v}(T)(v) + W = 0 + W.$$
Thus, $m_{\overline{T},[v]}(x) \mid m_{T,v}(x)$.
The following result can either be proved by the same methods as used in the previous lemma, or one can use the previous result and the definition of the minimal polynomial.
Corollary 4.1.24. Let $W \subseteq V$ be T-invariant and let $\overline{T}$ be the induced linear map on $V/W$. Then $m_{\overline{T}}(x) \mid m_T(x)$.
Definition 4.1.25. Let $T \in \mathrm{Hom}_F(V, V)$ and let $\mathcal{A} = \{v_1, \dots, v_k\}$ be a set of vectors in $V$. The T-span of $\mathcal{A}$ is the subspace
$$W = \left\{ \sum_{i=1}^{k} p_i(T)(v_i) : p_i(x) \in F[x] \right\}.$$
We say $\mathcal{A}$ T-generates $W$.
Exercise 4.1.26. Check that $W$ given in the previous definition is a T-invariant subspace of $V$. Moreover, show it is the smallest (with respect to inclusion) T-invariant subspace of $V$ that contains $\mathcal{A}$.
The following lemma, while elementary, will be used repeatedly in this chapter.
Lemma 4.1.27. Let $T \in \mathrm{Hom}_F(V, V)$. Let $w \in V$ and let $W$ be the subspace of $V$ that is T-generated by $w$. Then
$$\dim_F W = \deg m_{T,w}(x).$$
Proof. Let $\deg m_{T,w}(x) = k$. Then $\{w, T(w), \dots, T^{k-1}(w)\}$ spans $W$, and it is linearly independent, since a nontrivial dependence would give a nonzero polynomial of degree less than $k$ annihilating $w$, contradicting $\deg m_{T,w}(x) = k$. This gives the result.
Up to this point we have dealt with two separate polynomials associated to a linear map: the characteristic and minimal polynomials. It is natural to ask if there is a relation between the two. In fact, there is a very nice relation, given by the following theorem.
Theorem 4.1.28. Let $T \in \mathrm{Hom}_F(V, V)$. Then
1. $m_T(x) \mid c_T(x)$;
2. every irreducible factor of $c_T(x)$ is a factor of $m_T(x)$.
Proof. We proceed by induction on $\dim_F V = n$. If $n = 1$ the result is trivially true, so our base case is done. Let $\deg m_T(x) = k \leq n$. Let $v \in V$ be such that $m_T(x) = m_{T,v}(x)$. Let $W_1$ be the T-span of $v$, so by Lemma 4.1.27 we have $\dim_F W_1 = k$. Set $v_k = v$ and $v_{k-i} = T^i(v)$ for $i = 0, \dots, k-1$. Then $\mathcal{B}_1 = \{v_1, \dots, v_k\}$ is a basis for $W_1$ and $[T|_{W_1}]_{\mathcal{B}_1} = C(m_T(x))$, the companion matrix.
If $k = n$, then $W_1 = V$, so $[T]_{\mathcal{B}_1} = C(m_T(x))$ has characteristic polynomial $m_T(x)$, i.e., $c_T(x) = m_T(x)$ and we are done.
Now suppose $k < n$. Let $V_2$ be a complement of $W_1$ in $V$ and $\mathcal{B}_2$ a basis of $V_2$. Set $\mathcal{B} = \mathcal{B}_1 \cup \mathcal{B}_2$. We have $[T]_{\mathcal{B}} = \begin{pmatrix} A & B \\ 0 & D \end{pmatrix}$ with $A$ the companion matrix of $m_T(x)$. This allows us to observe that
$$c_T(x) = \det(x 1_n - [T]_{\mathcal{B}}) = \det \begin{pmatrix} x 1_k - A & -B \\ 0 & x 1_{n-k} - D \end{pmatrix} = \det(x 1_k - A) \det(x 1_{n-k} - D) = m_T(x) \det(x 1_{n-k} - D).$$
Thus we see $m_T(x) \mid c_T(x)$. This gives the first claim of the theorem.
It remains to show $c_T(x)$ and $m_T(x)$ have the same irreducible factors. Note that since $m_T(x) \mid c_T(x)$ we certainly have that every irreducible factor of $m_T(x)$ is an irreducible factor of $c_T(x)$. If $m_T(x)$ has degree $n$, then $m_T(x) = c_T(x)$ and the result is trivial. Assume $\deg m_T(x) < n$. Consider $\overline{T} : V/W_1 \to V/W_1$. Let $\pi_{W_1} : V \to V/W_1$ be the natural projection and set $\overline{\mathcal{B}} = \pi_{W_1}(\mathcal{B}_2)$, a basis of $V/W_1$. We have $[\overline{T}]_{\overline{\mathcal{B}}} = D$. Since $\dim_F V/W_1 < \dim_F V$, we can use induction to conclude $c_{\overline{T}}(x)$ and $m_{\overline{T}}(x)$ have the same irreducible factors. Let $p(x)$ be an irreducible factor of $c_T(x)$. As above, write $c_T(x) = m_T(x) \det(x 1_{n-k} - D)$, which we have seen is equal to $m_T(x) c_{\overline{T}}(x)$. Since $p(x)$ is irreducible, it divides $m_T(x)$ or $c_{\overline{T}}(x)$. If $p(x) \mid m_T(x)$, we are done. If not, $p(x) \mid c_{\overline{T}}(x)$, and so $p(x)$ is an irreducible factor of $c_{\overline{T}}(x)$. However, $m_{\overline{T}}(x)$ and $c_{\overline{T}}(x)$ have the same irreducible factors, so $p(x)$ is an irreducible factor of $m_{\overline{T}}(x)$. We now use that $m_{\overline{T}}(x) \mid m_T(x)$ to conclude $p(x) \mid m_T(x)$.
Corollary 4.1.29. Let $\dim_F V = n$ and $T \in \mathrm{Hom}_F(V, V)$. Then $V$ is T-generated by a single element if and only if $m_T(x) = c_T(x)$.
Proof. Let $w \in V$ and let $W$ be the subspace of $V$ that is T-generated by $w$. We know that $\dim_F W = \deg m_{T,w}(x)$. Thus, if there is an element $w$ that T-generates $V$, we have $\deg m_{T,w}(x) = n$, and since $m_{T,w}(x) \mid m_T(x)$ for every $w$, we have $\deg m_T(x) = n$, and so it must be equal to $c_T(x)$. Conversely, if $m_T(x) = c_T(x)$, then we use the fact that there is an element $w \in V$ so that $m_{T,w}(x) = m_T(x)$. Thus, $\deg m_{T,w}(x) = n$, and so $\dim_F W = n$, i.e., $W = V$.
These results trivially give the Cayley–Hamilton theorem.
Theorem 4.1.30 (Cayley–Hamilton Theorem).
1. Let $T \in \mathrm{Hom}_F(V, V)$ with $\dim_F V < \infty$. Then $c_T(T) = 0$.
2. Let $A \in \mathrm{Mat}_n(F)$ and $c_A(x)$ the characteristic polynomial. Then $c_A(A) = 0$.
As mentioned above, the term $c_A(A)$ is a matrix, and the content of the theorem above is that this is the zero matrix.
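A quick numerical illustration of the theorem, evaluating $c_A(A)$ coefficient by coefficient (Python with sympy, as before; purely illustrative):

from sympy import Matrix, symbols, zeros

x = symbols('x')
A = Matrix([[1, 2], [3, 4]])            # the matrix of Example 4.1.1
coeffs = A.charpoly(x).all_coeffs()     # [1, -5, -2]: c_A(x) = x^2 - 5x - 2
n = len(coeffs) - 1
cA_of_A = sum((c * A**(n - i) for i, c in enumerate(coeffs)), zeros(2))
print(cA_of_A)                          # the 2x2 zero matrix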
4.2 T-invariant complements
We now turn our focus to invariant subspaces and the question of when a T-invariant subspace has a T-invariant complement. Recall that given any subspace $W \subseteq V$, we always have a subspace $W' \subseteq V$ so that $V = W \oplus W'$. In particular, if $W$ is T-invariant we have a complement $W'$. However, it is not necessarily the case that $W'$ will also be T-invariant. As we saw in the last section, this essentially comes down to whether there is a basis $\mathcal{B}$ so that the matrix for $T$ is block diagonal.
Definition 4.2.1. Let $W_1, \dots, W_k$ be subspaces of $V$ such that $V = W_1 \oplus \cdots \oplus W_k$. Let $T \in \mathrm{Hom}_F(V, V)$. We say $V = W_1 \oplus \cdots \oplus W_k$ is a T-invariant direct sum if each $W_i$ is T-invariant. If $V = W \oplus W'$ is a T-invariant direct sum, we say $W'$ is a T-invariant complement of $W$.
We now give two fundamental examples of this. Not only are they useful for understanding the definition, they will be useful in understanding the arguments to follow.
Example 4.2.2. Let $T \in \mathrm{Hom}_F(V, V)$ and assume $m_T(x) = c_T(x)$. For convenience write $f(x) = m_T(x)$. Suppose we can factor $f(x) = g(x)h(x)$ with $\gcd(g, h) = 1$. Let $v_0 \in V$ be so that $m_{T,v_0}(x) = m_T(x)$, i.e., $V$ is T-generated by $v_0$.
Set $W_1 = h(T)(V)$ and $W_2 = g(T)(V)$. We claim that $V = W_1 \oplus W_2$ is a T-invariant direct sum. Note it is clear that $W_1$ and $W_2$ are both T-invariant, so we only need to show that $V = W_1 \oplus W_2$ to see it is a T-invariant direct sum.
First, we show that $W_1 \cap W_2 = 0$. Let $w_1 \in W_1$. Then $w_1 = h(T)(v_1)$ for some $v_1 \in V$. Since $V$ is T-generated by $v_0$, we can write $v_1 = a(T)(v_0)$ for some $a(x) \in F[x]$. Thus
$$g(T)(w_1) = g(T)h(T)(v_1) = g(T)h(T)a(T)(v_0) = a(T)f(T)(v_0) = 0.$$
Thus, $m_{T,w_1} \mid g(x)$. Similarly, if $w_2 \in W_2$, then $m_{T,w_2} \mid h(x)$. If $w \in W_1 \cap W_2$, then $m_{T,w}(x) \mid g(x)$ and $m_{T,w}(x) \mid h(x)$. Thus $m_{T,w}(x) = 1$. This implies we must have $w = 0$, as desired.
It remains to show that $V = W_1 + W_2$. Since $\gcd(g, h) = 1$, there exist $s, t \in F[x]$ such that $1 = g(x)s(x) + h(x)t(x)$. Thus,
$$v_0 = (g(T)s(T) + h(T)t(T))v_0 = g(T)s(T)v_0 + h(T)t(T)v_0 = w_2 + w_1,$$
where
$$w_1 = h(T)t(T)v_0 = h(T)(t(T)v_0) \in h(T)(V) = W_1$$
and
$$w_2 = g(T)s(T)v_0 = g(T)(s(T)v_0) \in g(T)(V) = W_2.$$
Thus, $v_0 \in W_1 + W_2$. Let $v \in V$. There exists $b(x) \in F[x]$ such that $v = b(T)(v_0)$. This gives
$$v = b(T)v_0 = b(T)(w_1 + w_2) = b(T)(w_1) + b(T)(w_2) \in W_1 + W_2.$$
Thus $V = W_1 \oplus W_2$ as T-invariant spaces.
In summary, if $m_T(x) = c_T(x)$ and we can write $m_T(x) = g(x)h(x)$ with $\gcd(g, h) = 1$, then there exist T-invariant subspaces $W_1$ and $W_2$ so that $V = W_1 \oplus W_2$. Let $\mathcal{B}_i$ be a basis for $W_i$. Then we have
$$[T]_{\mathcal{B}_1 \cup \mathcal{B}_2} = \begin{pmatrix} [T]_{\mathcal{B}_1} & 0 \\ 0 & [T]_{\mathcal{B}_2} \end{pmatrix}.$$
Example 4.2.3. As in the previous example, assume $m_T(x) = c_T(x)$ and again write $f(x) = m_T(x)$. However, in this case we assume $f(x) = g(x)h(x)$ with $\gcd(g, h) \neq 1$. For example, $f(x)$ could be a power of an irreducible polynomial. Let $v_0 \in V$ be such that $m_T(x) = m_{T,v_0}(x)$, i.e., $V$ is T-generated by $v_0$. Set $W_1 = h(T)(V)$. Clearly we have that $W_1$ is T-invariant. We will now show $W_1$ does not have a T-invariant complement. Suppose $V = W_1 \oplus W_2$ with $W_2$ a T-invariant subspace. Write $T_1 = T|_{W_1}$ and $T_2 = T|_{W_2}$.
We claim that $m_{T_1}(x) = g(x)$. Let $w_1 \in W_1$ and write $w_1 = h(T)(v_1)$ for some $v_1 \in V$. Write $v_1 = a(T)(v_0)$ for some $a(x) \in F[x]$, which is possible since $v_0$ T-generates $V$. We have
$$g(T)(w_1) = g(T)h(T)(v_1) = g(T)h(T)a(T)(v_0) = a(T)f(T)(v_0) = 0.$$
Thus, we must have $m_{T_1}(x) \mid g(x)$. Set $w_1' = h(T)(v_0)$ and $k(x) = m_{T,w_1'}(x)$. Then we have
$$0 = k(T)(w_1') = k(T)h(T)(v_0).$$
Thus, $m_{T,v_0}(x) = g(x)h(x) \mid k(x)h(x)$. This gives $g(x) \mid k(x) = m_{T,w_1'}(x)$. However, $m_{T,w_1'}(x) \mid m_{T_1}(x)$, and so $g(x) \mid m_{T_1}(x)$. This gives $g(x) = m_{T_1}(x)$, as claimed.
Our second claim is that $m_{T_2}(x) \mid h(x)$. Let $w_2 \in W_2$. Then $h(T)(w_2) \in W_1$ because $h(T)(V) = W_1$. We also have $h(T)(w_2) \in W_2$ because $W_2$ is T-invariant by assumption. However, $W_1 \cap W_2 = 0$, so it must be the case that $h(T)(w_2) = 0$. Hence $m_{T_2}(x) \mid h(x)$, as claimed.
As we are assuming $V = W_1 + W_2$, we can write $v_0 = w_1 + w_2$ for some $w_i \in W_i$. Set $b(x) = \mathrm{lcm}(g(x), h(x))$. We have $b(T)(v_0) = b(T)(w_1) + b(T)(w_2) = 0$, since $m_{T_1}(x) = g(x) \mid b(x)$ and $m_{T_2}(x) \mid h(x) \mid b(x)$. Thus, $b(x)$ is divisible by $f(x) = m_{T,v_0}(x)$. However, since $\gcd(g, h) \neq 1$, we have that $b(x)$ is a proper factor of $f(x)$ that vanishes on $v_0$, and hence on $V$. This contradicts $m_{T,v_0}(x) = f(x) = c_T(x)$. Thus, $W_1$ does not have a T-invariant complement.
Now that we have these examples at our disposal, we return to the general situation.
Theorem 4.2.4. Let $T \in \mathrm{Hom}_F(V, V)$. Let $m_T(x) = p_1(x) \cdots p_k(x)$ be a factorization into pairwise relatively prime polynomials. Set $W_i = \ker(p_i(T))$ for $i = 1, \dots, k$. Then $W_i$ is T-invariant for each $i$ and $V = W_1 \oplus \cdots \oplus W_k$ is a T-invariant direct sum decomposition of $V$.
Proof. Let $w_i \in W_i$. We have
$$p_i(T)(T(w_i)) = T(p_i(T)(w_i)) = T(0) = 0.$$
Thus, $T(w_i) \in \ker(p_i(T)) = W_i$. Thus, each $W_i$ is T-invariant, so it only remains to show $V$ is the direct sum of the $W_i$.
For each $i$ set $q_i(x) = m_T(x)/p_i(x)$. The collection $\{q_1, \dots, q_k\}$ is relatively prime (not in pairs, just overall), so there are polynomials $r_1, \dots, r_k \in F[x]$ such that
$$1 = q_1 r_1 + \cdots + q_k r_k.$$
Let $v \in V$. We have
$$v = 1 \cdot v = r_1(T)q_1(T)v + \cdots + r_k(T)q_k(T)v = w_1 + \cdots + w_k,$$
where we set $w_i = r_i(T)q_i(T)(v)$. We claim that $w_i \in W_i$. Observe
$$p_i(T)(w_i) = p_i(T)q_i(T)r_i(T)(v) = r_i(T)m_T(T)(v) = 0.$$
Thus, $w_i \in W_i$ as claimed. This shows we have $V = W_1 + \cdots + W_k$.
Suppose there exist $w_i \in W_i$ so that $0 = w_1 + \cdots + w_k$. We need to show this implies $w_i = 0$ for each $i$. We have
$$0 = q_1(T)(0) = q_1(T)(w_1 + \cdots + w_k) = q_1(T)(w_1) + \cdots + q_1(T)(w_k) = q_1(T)(w_1),$$
where we have used $p_i \mid q_1$ for all $i \neq 1$, so $q_1(T)(w_i) = 0$ for $i \neq 1$. Thus, we have $q_1(T)(w_1) = 0$, and by definition $p_1(T)(w_1) = 0$. Since $p_1$ and $q_1$ are relatively prime, there exist $r, s \in F[x]$ so that
$$1 = r(x)p_1(x) + s(x)q_1(x).$$
Thus,
$$w_1 = (r(T)p_1(T) + s(T)q_1(T))w_1 = 0.$$
The same argument gives $w_i = 0$ for all $i$. Thus, $V = W_1 \oplus \cdots \oplus W_k$ with each $W_i$ T-invariant.
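Concretely, for the matrix of Example 4.1.12 the relatively prime factors $(x-1)^2$ and $(x+1)$ of $m_T(x)$ split $\mathbb{Q}^3$ into kernels, which we can compute directly (Python with sympy; illustrative):

from sympy import Matrix, eye

A = Matrix([[1, 2, 3],
            [0, 1, 4],
            [0, 0, -1]])
# m_T(x) = (x-1)^2 (x+1), with relatively prime factors
# p_1(x) = (x-1)^2 and p_2(x) = x+1, so Theorem 4.2.4 gives:
W1 = ((A - eye(3))**2).nullspace()      # ker p_1(T), dimension 2
W2 = (A + eye(3)).nullspace()           # ker p_2(T), dimension 1
print(len(W1), len(W2))                 # 2 1, so V = W_1 (+) W_2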
This result allows us to determine exactly how all T-invariant subspaces of $V$ arise from the minimal polynomial of $T$.
Corollary 4.2.5. Let $T \in \mathrm{Hom}_F(V, V)$ and write $m_T(x) = p_1(x)^{e_1} \cdots p_k(x)^{e_k}$ for the irreducible factorization of $m_T(x)$. Let $W_i = \ker(p_i(T)^{e_i})$. For $i = 1, \dots, k$ let $U_i$ be a T-invariant subspace of $W_i$. (Note we allow $U_i$ to be zero.) Then $U = U_1 \oplus \cdots \oplus U_k$ is a T-invariant subspace of $V$. Moreover, all T-invariant subspaces of $V$ arise in this manner.
Proof. The previous result shows $V = W_1 \oplus \cdots \oplus W_k$ with the $W_i$ being T-invariant, so it is clear such a $U$ is a T-invariant subspace.
Let $U$ be any T-invariant subspace. Let $\pi_i : U \to W_i$ be the natural projection map and set $U_i = \pi_i(U)$. We need to show $U = U_1 \oplus \cdots \oplus U_k$. It is enough to show $U_i \subseteq U$ for each $i$, because clearly the $U_i$'s span $U$ and have trivial intersection since the $W_i$ have trivial intersection. Let $u_i \in U_i$. By the definition of $U_i$ there exists $u \in U$ such that $\pi_i(u) = u_i$. Since $V = W_1 \oplus \cdots \oplus W_k$, we can write $u = u_1 + \cdots + u_k$ with $u_j \in W_j$. However, the definition of $\pi_i$ gives that $u_j \in U_j$ for each $j$. Set $q_i(x) = m_T(x)/p_i(x)^{e_i}$. We have $\gcd(q_i(x), p_i(x)^{e_i}) = 1$, so there are polynomials $r_i(x), s_i(x) \in F[x]$ such that $1 = r_i(x)q_i(x) + s_i(x)p_i(x)^{e_i}$. Note that for $j \neq i$ we have $q_i(T)(u_j) = 0$, and $p_i(T)^{e_i}(u_i) = 0$ because $u_i \in U_i \subseteq W_i = \ker(p_i(T)^{e_i})$. Thus,
$$u_i = (1 - s_i(T)p_i(T)^{e_i})(u_i) = r_i(T)q_i(T)(u_i) = r_i(T)q_i(T)(u).$$
Since $U$ is T-invariant, we have $u_i = r_i(T)q_i(T)(u) \in U$. Thus, $U_i \subseteq U$.
This result reinforces the importance of the minimal polynomial in understanding the structure of $T$. We obtain T-invariant subspaces by factoring the minimal polynomial and using the irreducible factors.
The following lemma is essential to prove the theorem that will allow us to give our first significant result on the structure of a linear map: the rational canonical form. The proof of this lemma is a bit of work, though, and can be skipped without losing any of the main ideas.
Lemma 4.2.6. Let $T \in \mathrm{Hom}_F(V, V)$. Let $w_1 \in V$ be such that $m_{T,w_1}(x) = m_T(x)$ and let $W_1$ be the T-invariant subspace of $V$ that is T-generated by $w_1$. Suppose $W_1 \subsetneq V$ is a proper subspace and there is a vector $v_2 \in V$ so that $V$ is T-generated by $\{w_1, v_2\}$. Then there is a vector $w_2 \in V$ such that if $W_2$ is the subspace T-generated by $w_2$, then $V = W_1 \oplus W_2$.
Proof. Let $v_2$ be as in the statement of the lemma. Let $V_2$ be the subspace of $V$ that is T-generated by $v_2$. We certainly have $V = W_1 + V_2$, but in general we will not have $W_1 \cap V_2 = 0$. We will obtain the direct sum by changing $v_2$ into a different element so that there is no overlap.
Let $\dim_F V = n$ and $\dim_F W_1 = k$. Then $\mathcal{B}_1 = \{T^{k-1}(w_1), \dots, T(w_1), w_1\}$ is a basis for $W_1$. Set $u_i = T^{k-i}(w_1)$ to ease the notation. We have that $V$ is spanned by $\mathcal{B}_1 \cup \{v_2', T(v_2'), \dots\}$, where $v_2' = v_2 + w$ for any $w \in W_1$. Note that for any such $v_2'$, the set $\{w_1, v_2'\}$ also T-generates $V$. The point now is to choose $w$ carefully so that if we set $w_2 = v_2'$ we will have the result. Set $\mathcal{B}_2 = \{v_2', T(v_2'), \dots, T^{n-k-1}(v_2')\}$. The first claim is that $\mathcal{B}_1 \cup \mathcal{B}_2$ is a basis for $V$. We know that $\mathcal{B}_1$ is linearly independent, so we can add elements to it to form a basis. We add $v_2'$, $T(v_2')$, etc., until we obtain a set that is no longer linearly independent. Certainly we cannot go past $T^{n-k-1}(v_2')$, because then we would have more than $n$ vectors. To see we can go all the way to $T^{n-k-1}(v_2')$, observe that if $T^j(v_2')$ is a linear combination of $\mathcal{B}_1$ and $v_2', \dots, T^{j-1}(v_2')$, then the latter set, consisting of $k + j$ vectors, spans $V$, so $j \geq n - k$. (Note one must use that $W_1$ is T-invariant to conclude this. Make sure you understand this point.) However, we know $j \leq n-k$, and so $j = n-k$ as desired. Thus, $\mathcal{B}' = \mathcal{B}_1 \cup \mathcal{B}_2$ is a basis for $V$. Set $u_{k+1}' = T^{n-k-1}(v_2'), \dots, u_n' = v_2'$.
We now consider $T^{n-k}(u_n')$. We can uniquely write
$$T^{n-k}(u_n') = \sum_{i=1}^{k} b_i u_i + \sum_{i=k+1}^{n} b_i u_i'$$
for some $b_i \in F$. Set
$$p(x) = x^{n-k} - b_{k+1}x^{n-k-1} - \cdots - b_{n-1}x - b_n.$$
Set $u = p(T)(v_2')$ and observe
$$u = p(T)(v_2') = p(T)(u_n') = \sum_{i=1}^{k} b_i u_i \in W_1.$$
We now break into two cases.
Case 1: Suppose $u = 0$.
In this case we have $\sum_{i=1}^{k} b_i u_i = 0$, and so $b_i = 0$ for $i = 1, \dots, k$ since the $u_i$ are linearly independent. Set $V_2' = \mathrm{span}_F \mathcal{B}_2$. We have
$$T^{n-k}(v_2') = T^{n-k}(u_n') = \sum_{i=k+1}^{n} b_i u_i' \in \mathrm{span}_F \mathcal{B}_2.$$
Thus, we have $T^j(v_2') \in \mathrm{span}_F \mathcal{B}_2$ for all $j$, so $V_2'$ is a T-invariant subspace of $V$. By construction we have $W_1 \cap V_2' = 0$, so we have that $V = W_1 \oplus V_2'$ is a T-invariant direct sum decomposition of $V$ (take $w_2 = v_2'$) and we are done in this case.
Case 2: Suppose $u \neq 0$.
The goal here is to reduce this case to the previous case. We must adjust $v_2'$ for this to work. We now set $V_2'$ to be the space T-generated by $v_2'$. We claim that $b_{2k-n+1}, \dots, b_k$ are all zero, so $u = \sum_{i=1}^{2k-n} b_i u_i$. Suppose there exists $b_m$ with $2k-n+1 \leq m \leq k$ so that $b_m \neq 0$, and let $m$ be the largest such index. Since $T$ acts by shifting the $u_i$, we have
$$T^{m-1}(u) = b_m u_1$$
$$T^{m-2}(u) = b_m u_2 + b_{m-1} u_1$$
$$\vdots$$
Thus, we have that
$$\{T^{m-1}p(T)(v_2'), T^{m-2}p(T)(v_2'), \dots, p(T)(v_2'), T^{n-k-1}(v_2'), T^{n-k-2}(v_2'), \dots, v_2'\}$$
is a linearly independent subset of $V_2'$. Thus, $\dim_F V_2' \geq m + n - k \geq k + 1$. This gives that the degree of $m_{T,v_2'}(x)$ is at least $k+1$. However, since $m_{T,v_2'}(x)$ must divide $m_T(x)$, and $m_T(x) = m_{T,w_1}(x)$, which has degree $k$, this is a contradiction. Thus, $b_{2k-n+1} = \cdots = b_k = 0$ as claimed.
Set
$$w = -\sum_{i=1}^{2k-n} b_i u_{i+n-k}.$$
Note that the $u_{i+n-k}$ range from $u_{n-k+1}$ to $u_k$, so $w \in W_1$. Set $w_2 = v_2' + w$. Define $\mathcal{B}_1 = \{u_1, \dots, u_k\}$ as above, but now set $\mathcal{B}_2 = \{u_{k+1}, \dots, u_n\} = \{T^{n-k-1}(w_2), \dots, w_2\}$. Set $\mathcal{B} = \mathcal{B}_1 \cup \mathcal{B}_2$. We have
$$T^{n-k}(u_n) = T^{n-k}(v_2' + w) = T^{n-k}(v_2') + T^{n-k}(w) = \sum_{i=1}^{2k-n} b_i u_i + T^{n-k}\left( -\sum_{i=1}^{2k-n} b_i u_{i+n-k} \right) = \sum_{i=1}^{2k-n} b_i u_i + \sum_{i=1}^{2k-n} (-b_i u_i) = 0.$$
Thus, we are back in the previous case, so we have the result.
This lemma now allows us to prove the following theorem, which will be essential to developing the rational and Jordan canonical forms.
Theorem 4.2.7. Let $T \in \mathrm{Hom}_F(V, V)$ and let $w_1 \in V$ be such that $m_{T,w_1}(x) = m_T(x)$. Let $W_1$ be the subspace T-generated by $w_1$. Then $W_1$ has a T-invariant complement $W_2$.
Proof. If $W_1 = V$, we can set $W_2 = 0$ and we are done. Suppose that $W_1$ is properly contained in $V$. Consider the collection of T-invariant subspaces of $V$ that intersect $W_1$ trivially. This is a nonempty set because $\{0\}$ is a subspace that intersects $W_1$ trivially. Partially ordering this collection by inclusion, we apply Zorn's lemma to choose a maximal subspace $W_2$ that is T-invariant and intersects $W_1$ trivially. We now show that $V = W_1 \oplus W_2$.
Suppose that $W_1 \oplus W_2$ is properly contained in $V$. Let $v_2 \in V$ be so that $v_2 \notin W_1 \oplus W_2$. Let $V_2$ be the subspace of $V$ that is T-generated by $v_2$. Set $U_2 = W_2 + V_2$. If $W_1 \cap U_2 = 0$, we have a contradiction to $W_2$ being maximal. Thus, we must have $W_1 \cap U_2 \neq 0$. Set $V' = W_1 + U_2$. We have that $V'$ is a T-invariant subspace of $V$. Set $T' = T|_{V'}$. Since $W_2$ is a T-invariant space, it is a $T'$-invariant subspace of $V'$. Consider
$$\overline{T'} : V'/W_2 \to V'/W_2.$$
Set $X = V'/W_2$ and $S = \overline{T'}$. Let $\pi_{W_2} : V' \to X$ be the natural projection map and set $\overline{w}_1 = \pi_{W_2}(w_1)$ and $\overline{v}_2 = \pi_{W_2}(v_2)$. Set $Y_1 = \pi_{W_2}(W_1) \subseteq X$ and $Z_2 = \pi_{W_2}(U_2) \subseteq X$. Clearly we have that $Y_1$ and $Z_2$ are S-invariant subspaces of $X$. The space $Y_1$ is S-generated by $\overline{w}_1$ and $Z_2$ is S-generated by $\overline{v}_2$, so $X$ is S-generated by $\{\overline{w}_1, \overline{v}_2\}$. Finally, since $W_1 \cap W_2 = 0$, we have that $\pi_{W_2}|_{W_1}$ is injective.
We have that $m_{T'}(x) \mid m_T(x)$. We also have $m_S(x) \mid m_{T'}(x)$. We assumed $m_{T,w_1}(x) = m_T(x)$, and since $\pi_{W_2} : W_1 \to Y_1$ is injective, we must have $m_{S,\overline{w}_1}(x) = m_{T,w_1}(x)$. Since $w_1 \in V'$, $m_{T',w_1}(x) \mid m_{T'}(x)$. Finally, $m_{S,\overline{w}_1}(x) \mid m_S(x)$. Combining all of these gives
$$m_{S,\overline{w}_1}(x) \mid m_S(x) \mid m_{T'}(x) \mid m_T(x) = m_{T,w_1}(x) = m_{S,\overline{w}_1}(x),$$
which gives equality throughout.
We are now in a position to apply the previous lemma to $S$, $X$, $\overline{w}_1$, and $\overline{v}_2$. This gives a vector $\overline{w}_2$ so that $X = Y_1 \oplus Y_2$, where $Y_2$ is the subspace of $X$ that is S-generated by $\overline{w}_2$. Let $w_2'$ be any element of $V'$ with $\pi_{W_2}(w_2') = \overline{w}_2$, and set $V_2'$ to be the subspace of $V'$ that is $T'$-generated by $w_2'$ (equivalently, the subspace of $V$ that is T-generated by $w_2'$). This gives $\pi_{W_2}(V_2') = Y_2$.
We are finally in a position to finish the proof. Observe we have
$$V'/W_2 = X = Y_1 + Z_2 = Y_1 \oplus Y_2.$$
Set $U_2' = W_2 + V_2'$. Then
$$V' = W_1 + V_2' + W_2 = W_1 + (W_2 + V_2') = W_1 + U_2'.$$
We have $W_1 \cap U_2' = 0$. To see this, observe that if $x \in W_1 \cap U_2'$, then $\pi_{W_2}(x) \in \pi_{W_2}(W_1) \cap \pi_{W_2}(U_2') = Y_1 \cap Y_2 = 0$. But then $x \in W_1$ with $\pi_{W_2}(x) = 0$, and $\pi_{W_2}|_{W_1}$ is injective, so $x = 0$. Thus, we have $V' = W_1 \oplus U_2'$, and $U_2'$ properly contains $W_2$. This contradicts the maximality of $W_2$.
4.3 Rational canonical form
In this section we will give the rational canonical form for a linear transformation. The idea is to show that a nice basis exists so that the matrix of the linear transformation with respect to this basis is particularly simple. One important feature of the rational canonical form is that this result works over any field. This is in contrast to the Jordan canonical form, which will be presented in the next section.
Definition 4.3.1. Let $T \in \mathrm{Hom}_F(V, V)$. An ordered set $\mathcal{C} = \{w_1, \dots, w_k\}$ is a rational canonical T-generating set of $V$ if it satisfies
1. $V = W_1 \oplus \cdots \oplus W_k$, where $W_i$ is the subspace T-generated by $w_i$;
2. for all $i = 1, \dots, k-1$ we have $m_{T,w_{i+1}}(x) \mid m_{T,w_i}(x)$.
One should note that some textbooks will reverse the order of divisibility of the annihilating polynomials.
The first goal of this section is to show such a set exists. Showing such a set exists is straightforward given Theorem 4.2.7; the work now lies in showing uniqueness.
Theorem 4.3.2. Let $T \in \mathrm{Hom}_F(V, V)$. Then $V$ has a rational canonical T-generating set $\mathcal{C} = \{w_1, \dots, w_k\}$. Moreover, if $\mathcal{C}' = \{w_1', \dots, w_l'\}$ is any other rational canonical T-generating set, then $k = l$ and $m_{T,w_i}(x) = m_{T,w_i'}(x)$ for $i = 1, \dots, k$.
Proof. Let $\dim_F V = n$. We induct on $n$ to prove existence. Let $w_1 \in V$ be such that $m_T(x) = m_{T,w_1}(x)$ and let $W_1$ be the subspace T-generated by $w_1$. If $W_1 = V$ we are done, so assume not. Let $W'$ be a T-invariant complement of $W_1$, which we know exists by Theorem 4.2.7. Consider $T' = T|_{W'}$. Clearly we have $m_{T'}(x) \mid m_T(x)$. Moreover, $\dim_F W' < n$. By induction $W'$ has a rational canonical $T'$-generating set $\{w_2, \dots, w_k\}$. Since $m_{T,w_2}(x) = m_{T'}(x)$ divides $m_T(x) = m_{T,w_1}(x)$, the rational canonical T-generating set of $V$ is just $\{w_1, \dots, w_k\}$. This gives existence. It remains to prove uniqueness.
Let $\mathcal{C} = \{w_1, \dots, w_k\}$ and $\mathcal{C}' = \{w_1', \dots, w_l'\}$ be rational canonical T-generating sets with corresponding decompositions
$$V = W_1 \oplus \cdots \oplus W_k \quad \text{and} \quad V = W_1' \oplus \cdots \oplus W_l'.$$
Let $p_i(x) = m_{T,w_i}(x)$, $p_i'(x) = m_{T,w_i'}(x)$, $d_i = \deg p_i(x)$, and $d_i' = \deg p_i'(x)$. We proceed by induction. If $k = 1$, then $V = W_1 = W_1' \oplus \cdots \oplus W_l'$. We have $p_1(x) = m_{T,w_1}(x) = m_T(x) = m_{T,w_1'}(x) = p_1'(x)$, since in each decomposition the annihilator of the first generator is divisible by all the others and hence equals the minimal polynomial. This gives $d_1 = n$. However, we also have $\dim_F W_1' = \deg m_{T,w_1'}(x) = \deg p_1'(x) = n$. Thus $l = 1$ and we are done in this case.
Now suppose for some $m \geq 1$ we have $p_i'(x) = p_i(x)$ for all $1 \leq i \leq m$. If $V = W_1 \oplus \cdots \oplus W_m$, then $n = d_1 + \cdots + d_m = d_1' + \cdots + d_m'$. Thus $k = m = l$ and we are done. The same argument works if $V = W_1' \oplus \cdots \oplus W_m'$.
Suppose now that $V \neq W_1 \oplus \cdots \oplus W_m$, i.e., $k > m$. Consider $p_{m+1}(T)(V)$; this is T-invariant. By assumption we have
$$V = W_1 \oplus \cdots \oplus W_m \oplus W_{m+1} \oplus \cdots.$$
This gives
$$p_{m+1}(T)(V) = p_{m+1}(T)(W_1) \oplus \cdots \oplus p_{m+1}(T)(W_m) \oplus p_{m+1}(T)(W_{m+1}) \oplus \cdots.$$
Since $p_{m+1}(x) = m_{T,w_{m+1}}(x)$, we have $p_{m+1}(T)(w_{m+1}) = 0$. We now use that $W_{m+1}$ is T-generated by $w_{m+1}$ to conclude $p_{m+1}(T)(W_{m+1}) = 0$ as well. We also have that $p_{m+j}(x) \mid p_{m+1}(x)$ for all $j \geq 1$. This gives $p_{m+1}(T)(W_{m+j}) = 0$ for all $j \geq 1$. Thus,
$$p_{m+1}(T)(V) = p_{m+1}(T)(W_1) \oplus \cdots \oplus p_{m+1}(T)(W_m).$$
Since $p_{m+1}(x) \mid p_i(x)$ for $1 \leq i \leq m$, we get $\dim_F p_{m+1}(T)(W_i) = d_i - d_{m+1}$. (See the homework problems.) Thus,
$$\dim_F p_{m+1}(T)(V) = d := (d_1 - d_{m+1}) + \cdots + (d_m - d_{m+1}).$$
We now apply the same argument to $V = W_1' \oplus \cdots \oplus W_l'$ to obtain
$$p_{m+1}(T)(V) = \bigoplus_{j \geq 1} p_{m+1}(T)(W_j').$$
By our induction hypothesis this has the subspace $\bigoplus_{j=1}^{m} p_{m+1}(T)(W_j')$ of dimension $d$. Thus, this must be the entire space, since it has dimension equal to the dimension of $p_{m+1}(T)(V)$. Thus, $p_{m+1}(T)(W_j') = 0$ for $j \geq m+1$. The annihilator of $w_{m+1}'$ is $p_{m+1}'(x)$, so we must have $p_{m+1}'(x) \mid p_{m+1}(x)$. We now run the same argument with the roles of the two decompositions reversed to obtain $p_{m+1}(x) \mid p_{m+1}'(x)$. Thus, $p_{m+1}(x) = p_{m+1}'(x)$, and we are done by induction.
We now rephrase this in terms of matrices.
Definition 4.3.3. Let $M \in \mathrm{Mat}_n(F)$. We say $M$ is in rational canonical form if $M$ is a block diagonal matrix
$$M = \begin{pmatrix} C(p_1(x)) & & & \\ & C(p_2(x)) & & \\ & & \ddots & \\ & & & C(p_k(x)) \end{pmatrix}$$
where $C(p_i(x))$ denotes the companion matrix of $p_i(x)$ and $p_1(x), \dots, p_k(x)$ is a sequence of polynomials satisfying $p_{i+1}(x) \mid p_i(x)$ for $i = 1, \dots, k-1$.
Corollary 4.3.4.
1. Let $T \in \mathrm{Hom}_F(V, V)$. Then $V$ has a basis $\mathcal{B}$ such that $[T]_{\mathcal{B}} = M$ is in rational canonical form. Moreover, $M$ is unique.
2. Let $A \in \mathrm{Mat}_n(F)$. Then $A$ is similar to a unique matrix in rational canonical form.
Proof. Let $\mathcal{C} = \{w_1, \dots, w_k\}$ be a rational canonical T-generating set for $V$ with $p_i(x) = m_{T,w_i}(x)$. Set $d_i = \deg p_i(x)$. Then
$$\mathcal{B} = \{T^{d_1-1}(w_1), \dots, T(w_1), w_1, \dots, T^{d_k-1}(w_k), \dots, w_k\}$$
is the desired basis. To prove the second statement, just apply the first part to $T = T_A$.
Definition 4.3.5. Let $T$ have rational canonical form with diagonal blocks $C(p_1(x)), \dots, C(p_k(x))$ with $p_i(x)$ divisible by $p_{i+1}(x)$. We call the polynomials $p_i(x)$ the elementary divisors of $T$. These are sometimes also referred to as the invariant factors of $T$.
Remark 4.3.6. 1. The rational canonical form is a special case of the fundamental structure theorem for modules over a principal ideal domain. This is where the terminology comes from.
2. Some sources will use invariant factors so that $p_1(x) \mid p_2(x) \mid \cdots \mid p_k(x)$. This is clearly equivalent to our presentation upon a change of basis.
Corollary 4.3.7. Let $A \in \mathrm{Mat}_n(F)$.
1. $A$ is determined up to similarity by its sequence of elementary divisors $p_1(x), \dots, p_k(x)$.
2. The sequence of elementary divisors is determined recursively as follows:
(a) Set $p_1(x) = m_T(x)$.
(b) Let $w_1 \in V$ be so that $m_T(x) = m_{T,w_1}(x)$ and let $W_1$ be the subspace T-generated by $w_1$.
(c) Let $\overline{T} : V/W_1 \to V/W_1$ and set $p_2(x) = m_{\overline{T}}(x)$.
(d) Now repeat this process, where the starting space for the next step is $V/W_1$.
Proof. The proof of this is left as an exercise.
Corollary 4.3.8. Let $T \in \mathrm{Hom}_F(V, V)$ have elementary divisors $p_1, \dots, p_k$. Then we have
1. $p_1(x) = m_T(x)$;
2. $c_T(x) = p_1(x) \cdots p_k(x)$.
Proof. We have already seen the first statement. The second statement follows immediately from the definition of $c_T(x)$ and the fact that the characteristic polynomial of $C(p(x))$ is $p(x)$.
The important thing about rational canonical form, as opposed to Jordan canonical form, is that it is defined over any field. It does not depend on the field over which one considers the vector space to be defined, as the following result shows.
Corollary 4.3.9. Let $F \subseteq K$ be fields. Let $A, B \in \mathrm{Mat}_n(F) \subseteq \mathrm{Mat}_n(K)$.
1. The rational canonical form of $A$ is the same whether computed over $F$ or $K$. The minimal polynomial, characteristic polynomial, and invariant factors are the same whether considered over $F$ or $K$.
2. The matrices $A$ and $B$ are similar over $K$ if and only if they are similar over $F$. In particular, there exists some $P \in \mathrm{GL}_n(K)$ such that $B = PAP^{-1}$ if and only if there exists $Q \in \mathrm{GL}_n(F)$ such that $B = QAQ^{-1}$.
Proof. Write $M_F$ for the rational canonical form of $A$ computed over $F$. Note this also satisfies the conditions for being a rational canonical form over $K$, so uniqueness of the rational canonical form gives that $M_F$ is the rational canonical form for $A$ over $K$ as well. Thus, the invariant factors must agree. One now just uses Corollary 4.3.8 to obtain the statement about the minimal and characteristic polynomials.
Suppose that $A$ and $B$ are similar over $F$, i.e., there exists $Q \in \mathrm{GL}_n(F)$ such that $B = QAQ^{-1}$. Since $Q \in \mathrm{GL}_n(K)$ as well, this gives that $A$ and $B$ are similar over $K$ as well. Conversely, if $A$ and $B$ are similar over $K$, then $A$ and $B$ have the same rational canonical form over $K$. The first part of the corollary now tells us they have the same rational canonical form over $K$ and $F$, so they are similar over $F$ as well.
It is important to note that for the above result to be applied, both matrices must actually be defined over $F$! This does not say that two matrices defined over $K$ are similar over any subfield of $K$, since they may not even be defined over that subfield.
It is particularly easy to compute the rational canonical form for a matrix in $\mathrm{Mat}_2(F)$ or $\mathrm{Mat}_3(F)$, since the invariant factors are completely determined by the characteristic and minimal polynomials in this case. We illustrate this in the following example. However, we then work the same example with a general method that works for any size matrix. We do not prove this method works, as it relies on working with modules over principal ideal domains.
Example 4.3.10. Set
$$A = \begin{pmatrix} 2 & 2 & 14 \\ 0 & 3 & 7 \\ 0 & 0 & 2 \end{pmatrix} \in \mathrm{Mat}_3(\mathbb{Q}).$$
We have
$$c_A(x) = \det \begin{pmatrix} x-2 & -2 & -14 \\ 0 & x-3 & -7 \\ 0 & 0 & x-2 \end{pmatrix} = (x-2)^2(x-3).$$
Since we know $m_A(x)$ must have the same irreducible factors as $c_A(x)$, the only possibilities for $m_A(x)$ are $(x-2)(x-3)$ and $(x-2)^2(x-3)$. One easily checks that $(A - 2 \cdot 1_3)(A - 3 \cdot 1_3) = 0$, so $m_A(x) = (x-2)(x-3)$. Thus we have $p_1(x) = (x-2)(x-3) = x^2 - 5x + 6$. We now use that $c_A(x)$ is the product of the invariant factors to conclude that $p_2(x) = x - 2$. Thus, we have
$$C(p_1(x)) = \begin{pmatrix} 5 & 1 \\ -6 & 0 \end{pmatrix} \quad \text{and} \quad C(p_2(x)) = (2).$$
Thus, the rational canonical form for $A$ is
$$\begin{pmatrix} 5 & 1 & 0 \\ -6 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix}.$$
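For a $3 \times 3$ matrix the two invariant factors can be read off from the characteristic and minimal polynomials, exactly as in the example. A sketch of the computation (Python with sympy, as in the earlier sketches; purely illustrative):

from sympy import Matrix, symbols, eye, zeros, factor, div, diag

x = symbols('x')
A = Matrix([[2, 2, 14],
            [0, 3, 7],
            [0, 0, 2]])

cA = A.charpoly(x).as_expr()
print(factor(cA))                                    # (x - 2)**2*(x - 3)
# check the candidate minimal polynomial (x-2)(x-3):
print((A - 2*eye(3)) * (A - 3*eye(3)) == zeros(3))   # True
p1 = (x - 2) * (x - 3)
p2, r = div(cA, p1, x)                               # c_A = p_1 * p_2
print(factor(p2), r)                                 # x - 2, 0

# rational canonical form: block diagonal companion matrices
M = diag(Matrix([[5, 1], [-6, 0]]), Matrix([[2]]))
print((M.charpoly(x).as_expr() - cA).expand() == 0)  # True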
Example 4.3.11. We now compute the rational canonical form of $A = \begin{pmatrix} 2 & 2 & 14 \\ 0 & 3 & 7 \\ 0 & 0 & 2 \end{pmatrix} \in \mathrm{Mat}_3(\mathbb{Q})$ in a way that generalizes to matrices in $\mathrm{Mat}_n(F)$. Consider the matrix
$$x \cdot 1_3 - A = \begin{pmatrix} x-2 & -2 & -14 \\ 0 & x-3 & -7 \\ 0 & 0 & x-2 \end{pmatrix}.$$
We now apply elementary row and column operations over $F[x]$ to diagonalize this matrix. The fact that this matrix can always be diagonalized in this way is a fact from abstract algebra having to do with $F[x]$ being a principal ideal domain. We use standard notation for elementary row and column operations; for example, $R_1 - R_2 \to R_1$ means we replace row 1 by row 1 minus row 2.
$$x 1_3 - A \to \begin{pmatrix} x-2 & -x+1 & -7 \\ 0 & x-3 & -7 \\ 0 & 0 & x-2 \end{pmatrix} \quad (\text{via } R_1 - R_2 \to R_1)$$
$$\to \begin{pmatrix} -1 & -x+1 & -7 \\ x-3 & x-3 & -7 \\ 0 & 0 & x-2 \end{pmatrix} \quad (\text{via } C_1 + C_2 \to C_1)$$
$$\to \begin{pmatrix} -1 & -x+1 & -7 \\ 0 & -x^2+5x-6 & -7x+14 \\ 0 & 0 & x-2 \end{pmatrix} \quad (\text{via } (x-3)R_1 + R_2 \to R_2)$$
$$\to \begin{pmatrix} -1 & 0 & -7 \\ 0 & -x^2+5x-6 & -7x+14 \\ 0 & 0 & x-2 \end{pmatrix} \quad (\text{via } -(x-1)C_1 + C_2 \to C_2)$$
$$\to \begin{pmatrix} -1 & 0 & 0 \\ 0 & -x^2+5x-6 & -7x+14 \\ 0 & 0 & x-2 \end{pmatrix} \quad (\text{via } -7C_1 + C_3 \to C_3)$$
$$\to \begin{pmatrix} -1 & 0 & 0 \\ 0 & -x^2+5x-6 & 0 \\ 0 & 0 & x-2 \end{pmatrix} \quad (\text{via } R_2 + 7R_3 \to R_2).$$
Once the matrix has been diagonalized, the diagonal entries of degree 1 or greater, made monic, are the invariant factors. In this case we have $p_1(x) = x^2 - 5x + 6$ and $p_2(x) = x - 2$, as was found in the previous example. If one keeps track of the elementary row and column operations, one can also compute a matrix $P$ that converts $A$ to its rational canonical form as well.
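An equivalent way to extract the invariant factors without guessing row and column operations uses determinantal divisors: if $d_k(x)$ denotes the gcd of all $k \times k$ minors of $x 1_n - A$, then the diagonal entries of the diagonalized matrix are $d_k/d_{k-1}$, up to units. A sketch (Python with sympy; the helper is mine and is not from the notes):

from itertools import combinations
from sympy import Matrix, symbols, eye, gcd, cancel, factor

x = symbols('x')
A = Matrix([[2, 2, 14],
            [0, 3, 7],
            [0, 0, 2]])
M = x * eye(3) - A

def det_divisor(M, k):
    """gcd of all k-by-k minors of M."""
    n = M.shape[0]
    g = 0
    for rows in combinations(range(n), k):
        for cols in combinations(range(n), k):
            g = gcd(g, M[list(rows), list(cols)].det())
    return g

d = [1] + [det_divisor(M, k) for k in range(1, 4)]
facs = [cancel(d[k] / d[k - 1]) for k in range(1, 4)]
print([factor(f) for f in facs])   # [1, x - 2, (x - 2)*(x - 3)]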
65
4.3. RATIONAL CANONICAL FORM CHAPTER 4.
Example 4.3.12. In this example we compute a representative of each similarity class of matrices $A \in \mathrm{Mat}_3(\mathbb{Q})$ that satisfy $A^4 = 1_3$. Note that if $A^4 = 1_3$, then $m_A(x) \mid x^4 - 1$. We can factor $x^4 - 1$ into irreducibles over $\mathbb{Q}$ to obtain $x^4 - 1 = (x-1)(x+1)(x^2+1)$. Thus, $m_A(x)$ is a polynomial of degree at most 3 that divides $(x-1)(x+1)(x^2+1)$. Conversely, if $B \in \mathrm{Mat}_3(\mathbb{Q})$ satisfies $m_B(x) \mid x^4 - 1$, then $B^4 = 1_3$. We now list all possible minimal polynomials:
1. $x - 1$;
2. $x + 1$;
3. $x^2 + 1$;
4. $(x-1)(x+1)$;
5. $(x-1)(x^2+1)$;
6. $(x+1)(x^2+1)$.
Note we cannot have $m_A(x) = x^4 - 1$ because $\deg m_A(x) \le 3$. We can now list the possible invariant factors, keeping in mind the product must have degree 3 and they must divide each other. The possible invariant factors are
1. $x-1$, $x-1$, $x-1$;
2. $x+1$, $x+1$, $x+1$;
3. $(x-1)(x+1)$, $x-1$;
4. $(x-1)(x+1)$, $x+1$;
5. $(x-1)(x^2+1)$;
6. $(x+1)(x^2+1)$.
Note the first invariant factor cannot be $x^2 + 1$ because then we cannot obtain a $p_2(x)$ so that $p_2(x) \mid x^2 + 1$ and $p_1(x)p_2(x)$ has degree 3. Thus, the elements of $\mathrm{GL}_3(\mathbb{Q})$ of order dividing 4, up to similarity, are given by the following matrices (verified in the sketch after this list):
1. $1_3$;
2. $-1_3$;
3. $\begin{pmatrix} 0 & 1 & \\ 1 & 0 & \\ & & 1 \end{pmatrix}$;
4. $\begin{pmatrix} 0 & 1 & \\ 1 & 0 & \\ & & -1 \end{pmatrix}$;
5. $\begin{pmatrix} 1 & 1 & 0 \\ -1 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$;
6. $\begin{pmatrix} -1 & 1 & 0 \\ -1 & 0 & 1 \\ -1 & 0 & 0 \end{pmatrix}$.
Note in the above matrices we omit the 0s in some spots to help emphasize the blocks giving the rational canonical form.
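A quick machine check of this list (a SymPy sketch; the variable names are our own) verifies that each representative indeed has order dividing 4:

    from sympy import Matrix, eye

    # The six similarity-class representatives found above.
    reps = [
        eye(3),
        -eye(3),
        Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]]),
        Matrix([[0, 1, 0], [1, 0, 0], [0, 0, -1]]),
        Matrix([[1, 1, 0], [-1, 0, 1], [1, 0, 0]]),
        Matrix([[-1, 1, 0], [-1, 0, 1], [-1, 0, 0]]),
    ]

    for M in reps:
        assert M**4 == eye(3)   # each has order dividing 4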
In the previous example the possibilities were limited by how $m_A(x)$ factored over our field. For the next example we consider the same set-up, but working over a different field.
Example 4.3.13. Consider $A \in \mathrm{Mat}_3(\mathbb{Q}(i))$ satisfying $A^4 = 1_3$. In this example we classify all such matrices up to similarity. As above, we have for such an $A$ that $m_A(x) \mid x^4 - 1$, but unlike in the previous example we have $x^4 - 1 = (x-1)(x+1)(x-i)(x+i)$ when we factor it into irreducibles over $\mathbb{Q}(i)$. This vastly increases the possibilities for the minimal polynomials and hence the invariant factors. The possible minimal polynomials are given by
1. $x - 1$;
2. $x + 1$;
3. $x - i$;
4. $x + i$;
5. $x^2 - 1$;
6. $x^2 + 1$;
7. $(x-1)(x-i)$;
8. $(x-1)(x+i)$;
9. $(x+1)(x-i)$;
10. $(x+1)(x+i)$;
11. $(x-1)(x^2+1)$;
12. $(x+1)(x^2+1)$;
13. $(x-i)(x^2-1)$;
14. $(x+i)(x^2-1)$.
From this, we see the possible invariant factors are given by
1. $x-1$, $x-1$, $x-1$;
2. $x+1$, $x+1$, $x+1$;
3. $x-i$, $x-i$, $x-i$;
4. $x+i$, $x+i$, $x+i$;
5. $x^2-1$, $x-1$;
6. $x^2-1$, $x+1$;
7. $x^2+1$, $x-i$;
8. $x^2+1$, $x+i$;
9. $(x-1)(x-i)$, $x-1$;
10. $(x-1)(x-i)$, $x-i$;
11. $(x-1)(x+i)$, $x-1$;
12. $(x-1)(x+i)$, $x+i$;
13. $(x+1)(x-i)$, $x+1$;
14. $(x+1)(x-i)$, $x-i$;
15. $(x+1)(x+i)$, $x+1$;
16. $(x+1)(x+i)$, $x+i$;
17. $x^3 - x^2 + x - 1$;
18. $x^3 + x^2 + x + 1$;
19. $x^3 - ix^2 - x + i$;
20. $x^3 + ix^2 - x - i$.
This gives the following possible rational canonical forms:
1. $1_3$;
2. $-1_3$;
3. $i \cdot 1_3$;
4. $-i \cdot 1_3$;
5. $\begin{pmatrix} 0 & 1 & \\ 1 & 0 & \\ & & 1 \end{pmatrix}$;
6. $\begin{pmatrix} 0 & 1 & \\ 1 & 0 & \\ & & -1 \end{pmatrix}$;
7. $\begin{pmatrix} 0 & 1 & \\ -1 & 0 & \\ & & i \end{pmatrix}$;
8. $\begin{pmatrix} 0 & 1 & \\ -1 & 0 & \\ & & -i \end{pmatrix}$;
9. $\begin{pmatrix} 1+i & 1 & \\ -i & 0 & \\ & & 1 \end{pmatrix}$;
10. $\begin{pmatrix} 1+i & 1 & \\ -i & 0 & \\ & & i \end{pmatrix}$;
11. $\begin{pmatrix} 1-i & 1 & \\ i & 0 & \\ & & 1 \end{pmatrix}$;
12. $\begin{pmatrix} 1-i & 1 & \\ i & 0 & \\ & & -i \end{pmatrix}$;
13. $\begin{pmatrix} i-1 & 1 & \\ i & 0 & \\ & & -1 \end{pmatrix}$;
14. $\begin{pmatrix} i-1 & 1 & \\ i & 0 & \\ & & i \end{pmatrix}$;
15. $\begin{pmatrix} -1-i & 1 & \\ -i & 0 & \\ & & -1 \end{pmatrix}$;
16. $\begin{pmatrix} -1-i & 1 & \\ -i & 0 & \\ & & -i \end{pmatrix}$;
17. $\begin{pmatrix} 1 & 1 & 0 \\ -1 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$;
18. $\begin{pmatrix} -1 & 1 & 0 \\ -1 & 0 & 1 \\ -1 & 0 & 0 \end{pmatrix}$;
19. $\begin{pmatrix} i & 1 & 0 \\ 1 & 0 & 1 \\ -i & 0 & 0 \end{pmatrix}$;
20. $\begin{pmatrix} -i & 1 & 0 \\ 1 & 0 & 1 \\ i & 0 & 0 \end{pmatrix}$.
4.4 Jordan canonical form
In this section we use the results on the rational canonical form to deduce the Jordan canonical form. One nice aspect of the Jordan canonical form is that it is given in terms of the eigenvalues of the matrix. One serious drawback is that it is only defined over fields containing the splitting field of the minimal polynomial, i.e., the field must contain all the roots of the minimal polynomial. Most treatments of the Jordan canonical form work over an algebraically closed field, and one can certainly do this to be safe. However, for a particular matrix it is not necessary to move all the way to an algebraically closed field, so we do not do so.
Definition 4.4.1. Let $T \in \mathrm{Hom}_F(V, V)$ and $\lambda \in F$. If $\ker(T - \lambda I) \neq 0$, then we say $\lambda$ is an eigenvalue of $T$. Any nonzero vector in this kernel is called an eigenvector of $T$, or a $\lambda$-eigenvector of $T$ if we need to emphasize the eigenvalue. The space $E^1_\lambda = \ker(T - \lambda I)$ is called the eigenspace associated to $\lambda$. More generally, for $k \ge 1$ the $k$th eigenspace of $T$ is given by
\[
E^k_\lambda = \{v \in V : (T - \lambda I)^k v = 0\} = \ker((T - \lambda I)^k).
\]
We refer to a nonzero element of $E^k_\lambda$ as a generalized $\lambda$-eigenvector of $T$. Write
\[
E_\lambda = \bigcup_{k=1}^{\infty} E^k_\lambda.
\]
One should note that $E^1_\lambda \subseteq E^2_\lambda \subseteq \cdots \subseteq E^k_\lambda \subseteq \cdots$.
Example 4.4.2. Let $B = \{v_1, \ldots, v_n\}$ be a basis of $V$ and consider $T \in \mathrm{Hom}_F(V, V)$ with matrix given by
\[
A = [T]_B = \begin{pmatrix}
\lambda & 1 & & & \\
 & \lambda & 1 & & \\
 & & \ddots & \ddots & \\
 & & & \lambda & 1 \\
 & & & & \lambda
\end{pmatrix}.
\]
We have that each $v_k$ is a generalized eigenvector. Note that
\[
A - \lambda 1_n = \begin{pmatrix}
0 & 1 & & & \\
 & 0 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & 0 & 1 \\
 & & & & 0
\end{pmatrix}.
\]
Thus,
\begin{align*}
(T - \lambda I)(v_1) &= 0 \\
(T - \lambda I)(v_2) &= v_1 \\
&\;\;\vdots \\
(T - \lambda I)(v_n) &= v_{n-1}.
\end{align*}
This clearly gives $v_2, \ldots, v_n \notin E^1_\lambda$. Moreover, observe that we have $v_1, \ldots, v_{n-1} \in \mathrm{Im}(T - \lambda I)$, so $\dim_F \mathrm{Im}(T - \lambda I) \ge n - 1$. However, since $\dim_F V = n$ and $\dim_F \ker(T - \lambda I) \ge 1$, this gives $\dim_F \ker(T - \lambda I) = 1$ and $\dim_F \mathrm{Im}(T - \lambda I) = n - 1$. This allows us to conclude that $E^1_\lambda = \mathrm{span}_F\{v_1\}$. Next we consider $E^2_\lambda$. It is immediate that $v_1, v_2 \in E^2_\lambda$. Since $(A - \lambda 1_n)[v_k]_B = [v_{k-1}]_B$, we have $v_k \notin E^2_\lambda$ for $k > 2$. Thus, as above we have $E^2_\lambda = \mathrm{span}_F\{v_1, v_2\}$. More generally, the same argument gives $E^k_\lambda = \mathrm{span}_F\{v_1, \ldots, v_k\}$ for $k = 1, \ldots, n$ and $E_\lambda = E^n_\lambda = V$.
Exercise 4.4.3. Describe the generalized eigenspaces of the map given by
\[
A = \begin{pmatrix}
\lambda_1 & 1 & 0 & 0 \\
0 & \lambda_1 & 0 & 0 \\
0 & 0 & \lambda_2 & 0 \\
0 & 0 & 0 & \lambda_3
\end{pmatrix}
\]
where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are distinct elements of $F$.
Lemma 4.4.4. Let $T \in \mathrm{Hom}_F(V, V)$ and suppose $m_T(x) = c_T(x) = (x - \lambda)^n$ for some $\lambda \in F$. Then $V$ is $T$-generated by a single element $w_1$ and $V$ has a basis $\{v_1, \ldots, v_n\}$ where $v_n = w_1$ and $v_i = (T - \lambda I)(v_{i+1})$ for $i = 1, \ldots, n-1$.

Proof. Let $w_1 \in V$ be such that $m_{T,w_1}(x) = m_T(x) = c_T(x)$. Let $W_1$ be the subspace $T$-generated by $w_1$. Then $\dim_F W_1 = \deg m_{T,w_1}(x) = \deg c_T(x) = \dim_F V$, and so $W_1 = V$. Set $v_n = w_1$ and define $v_i = (T - \lambda I)^{n-i}(v_n)$. Observe we have
\begin{align*}
v_i &= (T - \lambda I)^{n-i}(v_n) \\
&= (T - \lambda I)(T - \lambda I)^{n-i-1}(v_n) \\
&= (T - \lambda I)(v_{i+1}).
\end{align*}
Thus, we only need to show that $B = \{v_1, \ldots, v_n\}$ is a basis for $V$. This set has $\dim_F V$ elements, so it is enough to check that $B$ is linearly independent. Suppose there are scalars $c_1, \ldots, c_n \in F$ such that
\[
c_1 v_1 + \cdots + c_n v_n = 0.
\]
This gives
\[
c_1 (T - \lambda I)^{n-1}(v_n) + \cdots + c_{n-1}(T - \lambda I)(v_n) + c_n v_n = 0.
\]
If we set $p(x) = c_1 (x - \lambda)^{n-1} + \cdots + c_{n-1}(x - \lambda) + c_n$, then $p(T)(v_n) = 0$, i.e., $p(T)(w_1) = 0$. This gives $m_{T,w_1}(x) \mid p(x)$. However, $m_{T,w_1}(x) = c_T(x)$ has degree $n$ and $\deg p(x) \le n - 1$, so it must be that $p(x) = 0$, i.e., $c_j = 0$ for all $j$.
Corollary 4.4.5. Let $T$ and $B$ be as in the previous lemma. Then
\[
[T]_B = \begin{pmatrix}
\lambda & 1 & & & \\
 & \lambda & 1 & & \\
 & & \ddots & \ddots & \\
 & & & \lambda & 1 \\
 & & & & \lambda
\end{pmatrix}.
\]
Proof. We have $(T - \lambda I)v_1 = 0$, so $Tv_1 = \lambda v_1$. For $i \ge 1$ we have $(T - \lambda I)v_{i+1} = v_i$, so $T(v_{i+1}) = v_i + \lambda v_{i+1}$. This gives the correct form for the matrix.
Definition 4.4.6. A basis $B$ as in Lemma 4.4.4 is called a Jordan basis for $V$. More generally, if $V = V_1 \oplus \cdots \oplus V_k$ is a $T$-invariant decomposition and each $V_i$ has a Jordan basis $B_i$, then we call $B = \bigcup B_i$ a Jordan basis for $V$.
Definition 4.4.7. 1. A matrix $A \in \mathrm{Mat}_k(F)$ of the form
\[
\begin{pmatrix}
\lambda & 1 & & & \\
 & \lambda & 1 & & \\
 & & \ddots & \ddots & \\
 & & & \lambda & 1 \\
 & & & & \lambda
\end{pmatrix}
\]
is called a $k \times k$ Jordan block associated to the eigenvalue $\lambda$.
2. A matrix $J$ is said to be in Jordan canonical form if it is a block diagonal matrix whose diagonal blocks $J_i$ are Jordan blocks.
Theorem 4.4.8. 1. Let $T \in \mathrm{Hom}_F(V, V)$. Suppose that
\[
c_T(x) = (x - \lambda_1)^{e_1} \cdots (x - \lambda_k)^{e_k}
\]
over $F$. Then $V$ has a basis $B$ such that $J = [T]_B$ is in Jordan canonical form. Moreover, $J$ is unique up to the order of the blocks.
2. Let $A \in \mathrm{Mat}_n(F)$. Suppose
\[
c_A(x) = (x - \lambda_1)^{e_1} \cdots (x - \lambda_k)^{e_k}
\]
over $F$. Then $A$ is similar to a matrix $J$ in Jordan canonical form. Moreover, $J$ is unique up to the order of the blocks.
Proof. Set $p_i(x) = (x - \lambda_i)^{e_i}$ and $V_i = E^{e_i}_{\lambda_i} = \ker(p_i(T))$. We can apply Theorem 4.2.4 to conclude that
\[
V = E^{e_1}_{\lambda_1} \oplus \cdots \oplus E^{e_k}_{\lambda_k}.
\]
Set $T_i = T|_{V_i}$. Then $c_{T_i}(x) = (x - \lambda_i)^{e_i}$. Choose a rational canonical $T_i$-generating set $C_i = \{w^i_1, \ldots, w^i_{m_i}\}$ with corresponding direct sum decomposition
\[
V_i = W^i_1 \oplus \cdots \oplus W^i_{m_i}
\]
where $W^i_j$ is the subspace $T_i$-generated by $w^i_j$. Note that each $W^i_j$ satisfies the hypotheses of Lemma 4.4.4, so we have a Jordan basis $B^i_j$ for $W^i_j$ with respect to $T_i$. The desired basis is
\[
B = \bigcup_{i=1}^{k} \bigcup_{j=1}^{m_i} B^i_j.
\]
The uniqueness follows from uniqueness of the rational canonical form and our construction.

The second claim follows easily from the first.
One should note that if $F$ is algebraically closed the characteristic polynomial always splits completely into linear factors. In particular, if $F$ contains all the roots of $c_T(x)$ then $T$ can be put in Jordan canonical form over $F$. If $c_T(x)$ does not split over $F$ we cannot put $T$ in Jordan canonical form over $F$. This makes the rational canonical form more useful in such situations.

Before giving an algorithm for computing the Jordan canonical form and some examples, we recover a few results from elementary linear algebra using our results on canonical forms.
Exercise 4.4.9. Let $T \in \mathrm{Hom}_F(V, V)$. Then $c_T(x)$ has a root in $F$ if and only if $T$ has an eigenvalue in $F$.
Corollary 4.4.10. Let $T \in \mathrm{Hom}_F(V, V)$. If $c_T(x)$ does not split into a product of linear factors, then $T$ is not diagonalizable. If $c_T(x)$ does split into a product of linear factors then the following are equivalent:
1. $T$ is diagonalizable;
2. $m_T(x)$ splits into a product of distinct linear factors;
3. For every eigenvalue $\lambda$, $E_\lambda = E^1_\lambda$;
4. For every eigenvalue $\lambda$ of $T$, if $c_T(x) = (x - \lambda)^{e_\lambda} p(x)$ with $p(\lambda) \neq 0$, then $e_\lambda = \dim_F E^1_\lambda$;
5. If we set $d_\lambda = \dim_F E^1_\lambda$, then $\sum_\lambda d_\lambda = \dim_F V$;
6. If $\lambda_1, \ldots, \lambda_m$ are the distinct eigenvalues of $T$, then $V = E^1_{\lambda_1} \oplus \cdots \oplus E^1_{\lambda_m}$.
Proof. Suppose $T$ is diagonalizable. There is a basis $B$ such that
\[
[T]_B = D = \begin{pmatrix} \mu_1 & & \\ & \ddots & \\ & & \mu_n \end{pmatrix}
\]
for some $\mu_1, \ldots, \mu_n \in F$. Then $c_T(x) = \det(xI - D) = (x - \mu_1) \cdots (x - \mu_n)$. Thus, $c_T(x)$ splits into linear factors. Equivalently, if $c_T(x)$ does not split into linear factors then $T$ is not diagonalizable.

Now suppose $c_T(x) = (x - \lambda_1)^{e_1} \cdots (x - \lambda_m)^{e_m}$ for distinct $\lambda_i \in F$ and $e_i \in \mathbb{Z}_{\ge 1}$. Note that since $c_T(x)$ splits into linear factors, the Jordan canonical form for $T$ exists over $F$.

Suppose $T$ is diagonalizable. We have from the proof of the existence of the Jordan canonical form of $T$ that $V = V_1 \oplus \cdots \oplus V_m$ where $V_i = \ker(T - \lambda_i I)^{e_i} = E^{e_i}_{\lambda_i}$. Each $V_i$ splits into a direct sum of subspaces coming from the rational canonical form of $T_i = T|_{V_i}$. In particular, we can write
\[
V_i = W^i_1 \oplus \cdots \oplus W^i_{m_i}
\]
where each $W^i_j$ is $T_i$-generated by an element $w^i_j$. Note that if $\dim_F W^i_j > 1$, then the rational canonical form of $T$ restricted to $W^i_j$ is larger than a 1 by 1 matrix, i.e., $T$ is not diagonalizable. Thus, each $W^i_j$ is 1-dimensional. Now we have that $W^i_j$ is generated by the single element $w^i_j$, that $m_{T_i, w^i_{j+1}}(x) \mid m_{T_i, w^i_j}(x)$ for each $j$, and that $m_{T_i}(x) = m_{T_i, w^i_1}(x)$. Thus, $m_{T_i}(x) = (x - \lambda_i)$. We have that $m_T(x)$ is the least common multiple of the $m_{T_i}(x)$, and so $m_T(x)$ splits into a product of distinct linear factors, i.e., we have 1) implies 2).

Now suppose that $m_T(x)$ splits into a product of distinct linear factors. We again consider the $W^i_j$. Each $m_{T_i, w^i_j}(x)$ is a power of $(x - \lambda_i)$ dividing $m_T(x)$, so each $W^i_j$ must have dimension 1. Thus $V_i = E^1_{\lambda_i}$, and so $E_{\lambda_i} = E^1_{\lambda_i}$. We have shown 2) implies 3).

Assume that for every eigenvalue $\lambda$ we have $E_\lambda = E^1_\lambda$. Since $V_i = E^{e_i}_{\lambda_i} \subseteq E_{\lambda_i} = E^1_{\lambda_i} \subseteq V_i$, we must have $\dim_F E^1_{\lambda_i} = \dim_F V_i = \deg c_{T_i}(x) = e_i$. We have shown 3) implies 4).

Suppose that we are given for each $\lambda$ that $d_\lambda = \dim_F E^1_\lambda = e_\lambda$. Then since $\deg c_T(x) = n$, we must have $d_{\lambda_1} + \cdots + d_{\lambda_m} = e_1 + \cdots + e_m = n$, i.e., 4) implies 5).

We now show 5) implies 6). Suppose $d_{\lambda_1} + \cdots + d_{\lambda_m} = n$. We know $E^1_{\lambda_1} \oplus \cdots \oplus E^1_{\lambda_m} \subseteq V$. However, since they have the same dimension they must be equal.

Finally, it only remains to show 6) implies 1). Suppose we have $V = E^1_{\lambda_1} \oplus \cdots \oplus E^1_{\lambda_m}$. This gives that each Jordan block has size 1, so in particular the Jordan canonical form is a diagonal matrix.
One does not need the existence of the Jordan canonical form to prove the above result, as there are elementary proofs. However, this proof has the added benefit of reinforcing the important points of the proof of the existence of the Jordan canonical form. One should note that $c_T(x)$ factoring into linear factors does not imply $T$ is diagonalizable (this is the entire point of the Jordan canonical form!), but if it splits into distinct linear factors then one does have that $T$ is diagonalizable since $m_T(x) \mid c_T(x)$.
We now give an algorithm for computing the Jordan canonical form. Our first worked example is particularly simple as the matrix is already in Jordan canonical form. However, it is nice to see how the algorithm works on a very simple example before giving a less trivial one.
Example 4.4.11. Consider the matrix
\[
A = \begin{pmatrix}
6 & 1 & 0 & & & & & \\
0 & 6 & 1 & & & & & \\
0 & 0 & 6 & & & & & \\
 & & & 6 & & & & \\
 & & & & 6 & & & \\
 & & & & & 7 & 1 & \\
 & & & & & 0 & 7 & \\
 & & & & & & & 7
\end{pmatrix}.
\]
One easily calculates that $c_A(x) = (x-6)^5(x-7)^3$. This immediately gives that the only nontrivial generalized eigenspaces are those associated to 6 and 7. Moreover, from the powers on the linear terms we know $E_6 = E^j_6$ for some $j = 1, \ldots, 5$ and $E_7 = E^i_7$ for some $i = 1, 2, 3$.
Consider $\lambda = 6$. We compute the dimension of $\ker(A - 6 \cdot 1_8)$. Note that
\[
A - 6 \cdot 1_8 = \begin{pmatrix}
0 & 1 & 0 & & & & & \\
0 & 0 & 1 & & & & & \\
0 & 0 & 0 & & & & & \\
 & & & 0 & & & & \\
 & & & & 0 & & & \\
 & & & & & 1 & 1 & \\
 & & & & & 0 & 1 & \\
 & & & & & & & 1
\end{pmatrix}.
\]
It is now immediate that $\dim_F E^1_6 = \dim_F \ker(A - 6 \cdot 1_8) = 3$, with basis $\{e_1, e_4, e_5\}$. The next step is to compute $E^2_6 = \ker(A - 6 \cdot 1_8)^2$. This has dimension 4 with basis $\{e_1, e_2, e_4, e_5\}$. Finally, $E^3_6 = \ker(A - 6 \cdot 1_8)^3$ has dimension 5 and is spanned by $\{e_1, e_2, e_3, e_4, e_5\}$. We represent this graphically as follows. We begin with a horizontal line with nodes the basis elements of $E^1_6$.
We then add a second row of nodes by adding basis elements that are in $E^2_6 \setminus E^1_6$. In this case it means we only add $e_2$. Note we can put it over any element of the first row since we are only looking for the sizes of the blocks here, so for convenience we put it over $e_1$. Finally, we build on that by adding a third row with basis elements that are in $E^3_6 \setminus E^2_6$. Since there is only one element, it must go over the $e_2$, as there can be no gaps in the vertical lines through the nodes we add. This gives us all the information we need for the Jordan blocks associated to the eigenvalue 6. The number of nodes on the first horizontal line tells us the number of Jordan blocks with 6 on the diagonal, and the height over each of these nodes tells the size of the block. So we have three blocks with 6s on the diagonal: one of size 3, and two of size 1.
We now move on to the eigenvalue 7. We compute that $E^1_7 = \ker(A - 7 \cdot 1_8)$ has dimension 2 with basis $\{e_6, e_8\}$. We have that $E^2_7$ has dimension 3 with basis $\{e_6, e_7, e_8\}$. We now position these on a graph as above, giving only the final result here. Thus, we have two blocks with 7s on the diagonal, one of size 2 and one of size 1. Putting this into matrix form gives us what we already knew, namely, that $A$ is already in Jordan canonical form.
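The kernel dimensions used above can be recomputed directly. The following SymPy sketch assembles $A$ from its visible blocks and prints $\dim E^k_\lambda = 8 - \mathrm{rank}((A - \lambda \cdot 1_8)^k)$; the output $3, 4, 5$ for $\lambda = 6$ and $2, 3$ for $\lambda = 7$ matches the graphs above:

    from sympy import Matrix, diag, eye

    # Assemble A from its blocks: J_3(6), two 1x1 blocks of 6,
    # J_2(7), and a 1x1 block of 7.
    J3_6 = Matrix([[6, 1, 0], [0, 6, 1], [0, 0, 6]])
    J2_7 = Matrix([[7, 1], [0, 7]])
    A = diag(J3_6, 6, 6, J2_7, 7)

    for lam, powers in [(6, range(1, 4)), (7, range(1, 3))]:
        for k in powers:
            N = (A - lam * eye(8))**k
            # dim ker = 8 - rank; these match the dimensions found above.
            print(lam, k, 8 - N.rank())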
Our next example is a bit more involved.
Example 4.4.12. Consider the matrix
\[
A = \begin{pmatrix}
3 & 3 & 0 & 0 & 0 & 1 & 0 & 2 \\
3 & 4 & 1 & 1 & 1 & 0 & 1 & 1 \\
0 & 6 & 3 & 0 & 0 & 2 & 0 & 4 \\
2 & 4 & 0 & 1 & 1 & 0 & 2 & 5 \\
3 & 2 & 1 & 1 & 2 & 0 & 1 & 2 \\
1 & 1 & 0 & 1 & 1 & 3 & 1 & 1 \\
5 & 10 & 1 & 3 & 2 & 1 & 6 & 10 \\
3 & 2 & 1 & 1 & 1 & 0 & 1 & 1
\end{pmatrix}.
\]
One calculates
\[
c_A(x) = (x-2)(x-3)^5(x^2 - 6x + 21) = (x-2)(x-3)^5\bigl(x - (3 + 2\sqrt{-3})\bigr)\bigl(x - (3 - 2\sqrt{-3})\bigr).
\]
Thus, $A$ does not have a Jordan canonical form over $\mathbb{Q}$, but does over $\mathbb{Q}(\sqrt{-3})$. We now compute the Jordan canonical form over $\mathbb{Q}(\sqrt{-3})$. Note that the eigenvalues $2$ and $3 \pm 2\sqrt{-3}$ are very easy to deal with. Since they only occur with multiplicity one, there can only be one Jordan block for each of these eigenvalues, each of size 1. Thus, the only work comes in determining the block structure for the eigenvalue 3.
We compute the Jordan block structure for the eigenvalue 3 just as in the previous example. The difference here is that it requires more work to compute the sizes of the eigenspaces, and we don't keep track of the bases of the eigenspaces since we are only looking for the canonical form and not the basis that realizes the form. One could keep track of the bases using elementary linear algebra at each step if one so desired. One computes, using elementary linear algebra, that $\dim_{\mathbb{Q}(\sqrt{-3})} E^1_3 = 2$. Let $\{v_1, v_2\}$ be a basis for this space. One then calculates $\dim_{\mathbb{Q}(\sqrt{-3})} E^2_3 = 4$, so we add vectors $v_3$ and $v_4$ to obtain a basis $\{v_1, v_2, v_3, v_4\}$ for $E^2_3$. We know $E_3$ has dimension 5, so it must be that $E_3 = E^3_3$ and so we have a basis $\{v_1, v_2, v_3, v_4, v_5\}$ for $E^3_3$. Graphically, we have for the first step:
Since we add two vectors going from $E^1_3$ to $E^2_3$, we add $v_3$ over $v_1$ and $v_4$ over $v_2$, obtaining:

Finally, we add the final vector over $v_3$ to obtain the final graph:
Thus, we have two Jordan blocks associated to the eigenvalue 3, one of size 3 and one of size 2. Thus, the Jordan canonical form of $A$ is given by
\[
\begin{pmatrix}
3 & 1 & 0 & & & & & \\
0 & 3 & 1 & & & & & \\
0 & 0 & 3 & & & & & \\
 & & & 3 & 1 & & & \\
 & & & 0 & 3 & & & \\
 & & & & & 2 & & \\
 & & & & & & 3 + 2\sqrt{-3} & \\
 & & & & & & & 3 - 2\sqrt{-3}
\end{pmatrix}.
\]
4.5 Multiple linear transformations
We have seen in the last two sections that in many instances one can choose a basis so that the matrix representing a linear transformation is in a very nice form. However, it may be the case that one is interested in more than one linear transformation at a time. In that case, the results just given aren't as useful: while a basis might put one linear transformation into a nice form, it does not necessarily put the other transformation in a nice form. We study this issue some in this section.
Lemma 4.5.1. Let $S, T \in \mathrm{Hom}_F(V, V)$ with $V$ a finite dimensional $F$-vector space. Let $p(x) = a_r x^r + \cdots + a_1 x + a_0 \in F[x]$ with $a_0 \neq 0$. Then $\dim_F(\ker p(ST)) = \dim_F(\ker p(TS))$.
Proof. Let $\{v_1, \ldots, v_k\}$ be a basis for $\ker(p(ST))$. We claim $\{T(v_1), \ldots, T(v_k)\}$ is linearly independent. Suppose
\[
c_1 T(v_1) + \cdots + c_k T(v_k) = 0.
\]
This gives $T(c_1 v_1 + \cdots + c_k v_k) = 0$, so $ST(c_1 v_1 + \cdots + c_k v_k) = 0$. Let $v = c_1 v_1 + \cdots + c_k v_k$. Then $ST(v) = 0$. Moreover, we also have $v \in \ker(p(ST))$ because $\{v_1, \ldots, v_k\}$ is a basis for $\ker p(ST)$. Thus
\begin{align*}
0 &= p(ST)v \\
&= a_r (ST)^r(v) + \cdots + a_1 ST(v) + a_0 v \\
&= a_0 v,
\end{align*}
where we have used $ST(v) = 0$. However, since $a_0 \neq 0$, this gives $v = 0$. Thus $0 = c_1 v_1 + \cdots + c_k v_k$. But $\{v_1, \ldots, v_k\}$ is linearly independent so we must have $c_i = 0$ for all $i$, as desired. Thus $\{T(v_1), \ldots, T(v_k)\}$ is linearly independent.

We now claim $T(v_i) \in \ker(p(TS))$ for each $i$. We have
\begin{align*}
(TS)^j T &= \underbrace{(TS) \cdots (TS)}_{j \text{ times}}\, T \\
&= T\, \underbrace{(ST) \cdots (ST)}_{j \text{ times}} \\
&= T(ST)^j.
\end{align*}
Thus,
\begin{align*}
p(TS)T(v_i) &= (a_r (TS)^r + \cdots + a_0) T(v_i) \\
&= T\bigl((a_r (ST)^r + \cdots + a_0)(v_i)\bigr) \\
&= T(p(ST)(v_i)) \\
&= T(0) = 0.
\end{align*}
Thus, $\{T(v_1), \ldots, T(v_k)\}$ is a linearly independent set in $\ker p(TS)$, so $\dim_F \ker(p(TS)) \ge k = \dim_F \ker(p(ST))$. One now uses the same argument with $S$ replacing $T$ to get the other direction of the inequality.
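The lemma is easy to test numerically. Here is a small SymPy sketch (the matrices are our own example; $p(x) = x^2 + x + 1$ has nonzero constant term, as the lemma requires):

    from sympy import Matrix, eye

    S = Matrix([[1, 2, 0], [0, 1, 1], [3, 0, 1]])
    T = Matrix([[0, 1, 0], [0, 0, 0], [1, 0, 2]])  # singular on purpose

    def p(M):
        return M**2 + M + eye(3)

    # dim ker = 3 - rank; the two kernels have the same dimension.
    print(3 - p(S*T).rank(), 3 - p(T*S).rank())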
Theorem 4.5.2. Let $T \in \mathrm{Hom}_F(V, V)$ and $S \in \mathrm{Hom}_F(V, V)$ with $F$ algebraically closed. Then $ST$ and $TS$ have the same nonzero eigenvalues, and for a common nonzero eigenvalue $\lambda$, they have the same Jordan block structure at $\lambda$.
Proof. Let $\lambda \neq 0$ be a nonzero eigenvalue of $ST$ and $v \in V$ a nonzero $\lambda$-eigenvector. Set $w = T(v)$. Then $ST(v) = \lambda v = S(w)$. We also have
\begin{align*}
TS(w) &= TS(T(v)) \\
&= T(ST(v)) \\
&= T(\lambda v) = \lambda w.
\end{align*}
We have that $T(v)$ is nonzero, since if it were 0 we would have $0 = S(0) = \lambda v$, which implies $\lambda = 0$ or $v = 0$, a contradiction. Thus $w \neq 0$ and so $\lambda$ is an eigenvalue of $TS$ as well. The same argument in the other direction shows every nonzero eigenvalue of $TS$ is a nonzero eigenvalue of $ST$. It remains to deal with the Jordan block structure.
Consider the polynomials $p_{j,\lambda}(x) = (x - \lambda)^j$ for $j \ge 1$. The Jordan block structure of $ST$ at $\lambda$ is given by the dimensions of $\ker(p_{j,\lambda}(ST))$. However, Lemma 4.5.1 gives $\dim \ker(p_{j,\lambda}(ST)) = \dim \ker(p_{j,\lambda}(TS))$ for each $j$ as long as the constant term of $p_{j,\lambda}(x)$ is nonzero. The constant term is $(-\lambda)^j$, so as long as $\lambda \neq 0$ this is satisfied and the Jordan block structures must be equal.
Over a general field we have the following result.
Corollary 4.5.3. Let $S, T \in \mathrm{Hom}_F(V, V)$. Then $c_{TS}(x) = c_{ST}(x)$.
Proof. We claim it is enough to prove the result for $F$ algebraically closed. This follows because the characteristic polynomial is the same regardless of which field one considers $S$ and $T$ to be defined over.

Let $\dim_F V = n$ and let $\lambda_1, \ldots, \lambda_k$ be the distinct nonzero eigenvalues of $ST$. Write
\[
c_{ST}(x) = x^{e_0}(x - \lambda_1)^{e_1} \cdots (x - \lambda_k)^{e_k}
\]
with $e_0 = n - (e_1 + \cdots + e_k)$. The previous theorem gives that $ST$ and $TS$ have the same Jordan block structure at $\lambda_1, \ldots, \lambda_k$. Thus,
\[
c_{TS}(x) = x^{f_0}(x - \lambda_1)^{e_1} \cdots (x - \lambda_k)^{e_k}
\]
where $f_0 = n - (e_1 + \cdots + e_k) = e_0$. Thus $c_{TS}(x) = c_{ST}(x)$.
The previous results do not give that the matrices associated to $ST$ and $TS$ are similar. Consider the following example.

Example 4.5.4. Let $A = \begin{pmatrix} 1 & 0 \\ -1 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$. Then $AB = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}$ has characteristic polynomial $c_{AB}(x) = x^2$ and $BA = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ has characteristic polynomial $c_{BA}(x) = x^2$. Note the rational canonical form of $BA$ is $BA$, but the rational canonical form of $AB$ is $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$. Thus, they cannot be similar.
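One can verify this example by machine. The sketch below recomputes both characteristic polynomials and compares ranks; similar matrices must have equal rank, but $AB$ has rank 1 while $BA$ has rank 0:

    from sympy import Matrix, symbols

    x = symbols('x')
    A = Matrix([[1, 0], [-1, 0]])
    B = Matrix([[1, 1], [1, 1]])

    print((A*B).charpoly(x).as_expr())  # x**2
    print((B*A).charpoly(x).as_expr())  # x**2
    print(B*A)                          # the zero matrix
    print((A*B).rank())                 # 1, so AB is not similar to BA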
Theorem 4.5.5. Let $T \in \mathrm{Hom}_F(V, V)$. Then the following are equivalent:
1. $V$ is $T$-generated by a single element, i.e., the rational canonical form of $T$ is a single block;
2. every linear transformation $S \in \mathrm{Hom}_F(V, V)$ that commutes with $T$ is a polynomial in $T$.
Proof. Suppose that $V$ is $T$-generated by a single element $v_0$. Let $S \in \mathrm{Hom}_F(V, V)$ commute with $T$. There exists $p_0(x) \in F[x]$ so that $S(v_0) = p_0(T)(v_0)$ since $v_0$ is a $T$-generator of $V$. Let $v \in V$ and write $v = g(T)(v_0)$ for some $g(x) \in F[x]$. We have
\begin{align*}
S(v) &= S(g(T)(v_0)) \\
&= g(T)S(v_0) \\
&= g(T)p_0(T)(v_0) \\
&= p_0(T)g(T)(v_0) \\
&= p_0(T)(v).
\end{align*}
Thus, $S = p_0(T)$ since they agree on every element of $V$.
Now suppose that $V$ is not $T$-generated by a single element. Let $\{v_1, \ldots, v_k\}$ be a rational canonical generating set for $T$. This gives $V = V_1 \oplus \cdots \oplus V_k$ with $V_i$ $T$-generated by $v_i$; in particular, the $V_i$ are $T$-invariant. Define $S : V \to V$ on this decomposition by
\[
S(v) = \begin{cases} 0 & \text{if } v \in V_1, \\ v & \text{if } v \in V_i \text{ for } i > 1. \end{cases}
\]
Since the direct sum decomposition is $T$-invariant we get that $S$ and $T$ commute. Suppose $S = p(T)$ for some $p(x) \in F[x]$. We have $0 = S(v_1) = p(T)(v_1)$. Thus, $p_1(x) \mid p(x)$, where we set $p_j(x) = m_{T,v_j}(x)$. However, $p_j(x) \mid p_1(x)$ for all $j \ge 1$ and so $p_j(x) \mid p(x)$ for all $j$. Thus, $S(v_j) = p(T)(v_j) = 0$ because $p_j(x) \mid p(x)$. This is a contradiction if $k > 1$, since then $S(v_j) = v_j \neq 0$ for $j > 1$.
We close this chapter with the following very nice result about when one can
diagonalize two linear transformations simultaneously.
Theorem 4.5.6. Let $S, T \in \mathrm{Hom}_F(V, V)$ with each diagonalizable. The following are equivalent:
1. There is a basis $B$ of $V$ such that $[S]_B$ and $[T]_B$ are diagonal, i.e., $B$ consists of common eigenvectors of $S$ and $T$;
2. $S$ and $T$ commute.
Proof. First suppose there is a basis $B = \{v_1, \ldots, v_n\}$ so that $[S]_B$ and $[T]_B$ are both diagonal. Then there exist $\lambda_i, \mu_i \in F$ such that $S(v_i) = \lambda_i v_i$ and $T(v_i) = \mu_i v_i$. To show $S$ and $T$ commute, it is enough to check that they commute on a basis. We have
\begin{align*}
ST(v_i) &= S(\mu_i v_i) \\
&= \mu_i S(v_i) \\
&= \mu_i \lambda_i v_i \\
&= \lambda_i \mu_i v_i \\
&= \lambda_i T(v_i) \\
&= T(\lambda_i v_i) = TS(v_i).
\end{align*}
Thus, $S$ and $T$ commute.
Now suppose that $S$ and $T$ commute. Since $T$ is diagonalizable, we can write $V = V_1 \oplus \cdots \oplus V_k$ with $V_i$ the $\mu_i$-eigenspace of $T$. For $v_i \in V_i$ we have
\[
TS(v_i) = ST(v_i) = \mu_i S(v_i).
\]
Thus, $S(v_i) \in V_i$ and so the $V_i$ are $S$-invariant. We claim each $V_i$ has a basis consisting of eigenvectors of $S$, i.e., $S_i = S|_{V_i}$ is diagonalizable. We know from the previous section that a linear transformation is diagonalizable if and only if its minimal polynomial splits into distinct linear factors. Thus, $m_S(x)$ splits into distinct linear factors by assumption. However, $m_{S_i}(x) \mid m_S(x)$, so $S_i$ is diagonalizable as well. Let $B_i$ be a basis of $V_i$ consisting of eigenvectors of $S$. The elements of $B_i$ are in $V_i$, so they are eigenvectors of $T$ automatically. Set $B = B_1 \cup \cdots \cup B_k$ and we are done.
One can extend this result to more than two linear transformations via
induction.
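As a concrete illustration of the theorem, the following SymPy sketch (the matrices are our own example) builds two commuting diagonalizable transformations from a shared eigenbasis $P$ and checks that the columns of $P$ are simultaneous eigenvectors:

    from sympy import Matrix, diag

    P = Matrix([[1, 1], [0, 1]])
    S = P * diag(2, 3) * P.inv()
    T = P * diag(5, 7) * P.inv()

    assert S*T == T*S          # they commute
    # The columns of P are simultaneous eigenvectors:
    v1, v2 = P.col(0), P.col(1)
    assert S*v1 == 2*v1 and T*v1 == 5*v1
    assert S*v2 == 3*v2 and T*v2 == 7*v2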
4.6 Problems
For all of these problems $V$ is a finite dimensional $F$-vector space.

1. Let $T \in \mathrm{Hom}_F(V, V)$. Prove that the intersection of any collection of $T$-invariant subspaces of $V$ is $T$-invariant.

2. Let $T \in \mathrm{Hom}_F(V, V)$ and $w \in V$. Let $m_{T,w}(x) \in F[x]$ be the annihilating polynomial of $w$.
(a) Show that if $m_{T,w}(x) = p(x)q(x)$, then $p(x) = m_{T,\,q(T)(w)}(x)$.
(b) Let $W$ be the subspace of $V$ that is $T$-generated by $w$. If $\deg m_{T,w}(x) = d$ and $\deg q(x) = e$, show that $\dim_F q(T)(W) = d - e$.

3. Let $A \in \mathrm{Mat}_4(\mathbb{Q})$ be defined by
\[
A = \begin{pmatrix} 1 & 2 & 4 & 4 \\ 2 & 1 & 4 & 8 \\ 1 & 0 & 1 & 2 \\ 0 & 1 & 2 & 3 \end{pmatrix}.
\]
First find the rational canonical form of $A$ by hand. Check your answer using SAGE.

4. Prove that two $3 \times 3$ matrices are similar if and only if they have the same characteristic and minimal polynomials. Give an explicit counterexample to this assertion for $4 \times 4$ matrices.

5. We say $A \in \mathrm{Mat}_2(F)$ has multiplicative order $n$ if $A^n = I$ and $A^m \neq I$ for any $0 < m < n$. Show that $x^5 - 1 = (x-1)(x^2 - 4x + 1)(x^2 + 5x + 1)$ in $\mathbb{F}_{19}[x]$. Use this to determine, up to similarity, all elements of $\mathrm{Mat}_2(\mathbb{F}_{19})$ of multiplicative order 5.

6. In a group $G$ we say two elements $a$ and $b$ are conjugate if there exists $g \in G$ so that $a = gbg^{-1}$. The conjugacy class of an element is the collection of all elements conjugate to it. Given a conjugacy class $C$, any element in $C$ is referred to as a representative for $C$. Determine representatives for all the conjugacy classes of $\mathrm{GL}_3(\mathbb{F}_2)$.

7. Prove that if $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of a matrix $A \in \mathrm{Mat}_n(F)$, then $\lambda_1^k, \ldots, \lambda_n^k$ are the eigenvalues of $A^k$ for any $k \ge 0$.
8. Prove that the matrices
\[
\begin{pmatrix} 2 & 0 & 0 & 0 \\ 4 & 1 & 4 & 0 \\ 2 & 1 & 3 & 0 \\ 2 & 4 & 9 & 1 \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} 5 & 0 & 4 & 7 \\ 3 & 8 & 15 & 13 \\ 2 & 4 & 7 & 7 \\ 1 & 2 & 5 & 1 \end{pmatrix}
\]
are similar.

9. Prove that the matrices
\[
\begin{pmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} 5 & 2 & 8 & 8 \\ 6 & 3 & 8 & 8 \\ 3 & 1 & 3 & 4 \\ 3 & 1 & 4 & 5 \end{pmatrix}
\]
both have characteristic polynomial $(x-3)(x+1)^3$. Determine the Jordan canonical form for each matrix and determine if they are similar.
10. Determine all possible Jordan canonical forms for a linear transformation with characteristic polynomial $(x-2)^3(x-3)^2$.

11. Prove that any matrix $A \in \mathrm{Mat}_n(\mathbb{C})$ satisfying $A^3 = A$ can be diagonalized. Is the same statement true over any field $F$? If so, prove it. If not, give a counterexample.

12. Determine the Jordan canonical form for a matrix $A \in \mathrm{Mat}_n(\mathbb{Q})$ with entries all equal to 1.
Chapter 5

Bilinear and sesquilinear forms

In this chapter we study bilinear and sesquilinear forms. In the special cases of symmetric, skew-symmetric, Hermitian, and skew-Hermitian forms we provide the standard classification theorems.

5.1 Basic definitions and facts

In this chapter we will be interested in maps $V \times V \to F$ that are linear in each variable separately. In particular, we will be interested in bilinear forms.
Definition 5.1.1. Let $V$ be an $F$-vector space. A function $\varphi : V \times V \to F$ is said to be a bilinear form if $\varphi$ is an $F$-linear map in each variable separately, i.e., for all $v_1, v_2, v \in V$ and $c \in F$ we have
1. $\varphi(cv_1 + v_2, v) = c\varphi(v_1, v) + \varphi(v_2, v)$;
2. $\varphi(v, cv_1 + v_2) = c\varphi(v, v_1) + \varphi(v, v_2)$.
We denote the collection of bilinear forms by $\mathrm{Hom}_F(V, V; F)$.
Exercise 5.1.2. Show that $\mathrm{Hom}_F(V, V; F)$ is an $F$-vector space.
We now give some elementary examples of bilinear forms. Checking that each of these satisfies the criteria to be a bilinear form is left as an exercise.
Example 5.1.3. The first example, and the one that should be kept in mind throughout this and the next chapter, is the familiar example of the dot product from multivariable calculus. Let $V = \mathbb{R}^n$ for some $n \in \mathbb{Z}_{\ge 1}$. Define $\varphi : V \times V \to \mathbb{R}$ by setting
\[
\varphi(v, w) = v \cdot w = {}^t v w = \sum_{i=1}^n a_i b_i
\]
where $v = {}^t(a_1, \ldots, a_n)$ and $w = {}^t(b_1, \ldots, b_n)$.
Example 5.1.4. Let $A \in \mathrm{Mat}_n(F)$. We have a bilinear form $\varphi_A$ on $V = F^n$ defined by
\[
\varphi_A(v, w) = {}^t v A w.
\]
Much as one saw that upon choosing a basis any linear map between finite dimensional vector spaces can be realized as a matrix, we will soon see that any bilinear form on a finite dimensional vector space can be represented as a matrix as in this example upon choosing a basis for $V$.
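As a quick illustration, here is a SymPy sketch evaluating $\varphi_A$ for a sample $A \in \mathrm{Mat}_3(\mathbb{Q})$ (the particular matrices and vectors are our own choices):

    from sympy import Matrix

    # The bilinear form phi_A(v, w) = v^t A w on Q^3.
    A = Matrix([[1, 2, 0], [0, 1, 0], [3, 0, 5]])
    v = Matrix([1, 0, 2])
    w = Matrix([0, 1, 1])

    phi = (v.T * A * w)[0, 0]   # a 1x1 matrix; extract the scalar
    print(phi)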
Example 5.1.5. Let $V = C^0([0,1], \mathbb{R})$ be the $\mathbb{R}$-vector space of continuous functions from $[0,1]$ to $\mathbb{R}$. Define $\varphi : V \times V \to \mathbb{R}$ by setting
\[
\varphi(f, g) = \int_0^1 f(x)g(x)\, dx.
\]
This gives a bilinear form. More generally, given any positive integer $n$, one can consider the vector space of paths $V = C^0([0,1], \mathbb{R}^n)$. Given $f, g \in V$, we have $f(x), g(x) \in \mathbb{R}^n$ for each $x \in [0,1]$. Thus, for each $x$ we have that $f(x) \cdot g(x)$ is well-defined from Example 5.1.3 above. We can define
\[
\varphi(f, g) = \int_0^1 f(x) \cdot g(x)\, dx.
\]
This defines a bilinear form on $V$.
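For the integral form, a computer algebra system can evaluate $\varphi(f, g)$ symbolically; here is a SymPy sketch with a sample pair of functions of our own choosing:

    from sympy import symbols, integrate, sin, pi

    # phi(f, g) = integral_0^1 f(x) g(x) dx on C^0([0,1], R),
    # evaluated for f(x) = x and g(x) = sin(pi*x).
    x = symbols('x')
    f = x
    g = sin(pi * x)
    print(integrate(f * g, (x, 0, 1)))   # 1/pi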
In the next section we will be interested in classifying certain nice bilinear forms. In particular, we will often restrict to the case that our bilinear form is non-degenerate.

Definition 5.1.6. Let $\varphi \in \mathrm{Hom}_F(V, V; F)$ be a bilinear form. We say that $\varphi$ is non-degenerate if given $w \in V$ so that $\varphi(v, w) = 0$ for every $v \in V$, one has $w = 0$.

One can define the kernel of $\varphi$ by setting
\[
\ker \varphi = \{w \in V : \varphi(v, w) = 0 \text{ for every } v \in V\}.
\]
This allows us to rephrase the condition that $\varphi$ is non-degenerate as saying that the kernel of $\varphi$ is trivial. Non-degenerate forms arise in a very natural way in a context we have already studied, namely, in terms of isomorphisms between a finite dimensional vector space and its dual. In particular, recall that when studying dual spaces we saw that for a finite dimensional $F$-vector space $V$ one has $V \cong V^*$, but that this isomorphism depends upon picking a basis and so is non-canonical. It turns out that non-degenerate bilinear forms are in bijection with the collection of such isomorphisms. Before proving this, we need some background.
Let $\varphi \in \mathrm{Hom}_F(V, V; F)$. If we fix $v \in V$ then we have a linear map $\varphi(\cdot, v) \in V^*$ given by $w \mapsto \varphi(w, v)$. In particular, we define $R_\varphi : V \to V^*$ by setting
\[
R_\varphi(v) = \varphi(\cdot, v).
\]
Note that for any $v \in V$, the map $R_\varphi(v) \in V^*$ is given by $R_\varphi(v)(w) = \varphi(w, v)$.
We claim that $R_\varphi$ is a linear map. We want to show that for any $a \in F$ and $v_1, v_2 \in V$ we have $R_\varphi(av_1 + v_2) = aR_\varphi(v_1) + R_\varphi(v_2)$. As this is an equality of maps, we must show these maps agree on elements. We have
\begin{align*}
R_\varphi(av_1 + v_2)(w) &= \varphi(w, av_1 + v_2) \\
&= a\varphi(w, v_1) + \varphi(w, v_2) \\
&= aR_\varphi(v_1)(w) + R_\varphi(v_2)(w),
\end{align*}
as desired. Thus, $R_\varphi \in \mathrm{Hom}_F(V, V^*)$.
Lemma 5.1.7. A bilinear form $\varphi$ on a finite dimensional vector space $V$ is non-degenerate if and only if $R_\varphi$ is an isomorphism.

Proof. First, note that since $\dim_F V = \dim_F V^*$, one has that $R_\varphi$ is an isomorphism if and only if $R_\varphi$ is injective.

Now suppose that $R_\varphi$ is injective. Let $w \in V$. If $\varphi(v, w) = 0$ for all $v \in V$, this implies that $R_\varphi(w)(v) = 0$ for every $v \in V$. However, this says that $R_\varphi(w)$ is the zero map. Since we are assuming $R_\varphi$ is injective, this gives $w = 0$. Thus, $\varphi$ is non-degenerate.

Now suppose there exists $w \in V$ with $w \neq 0$ so that $R_\varphi(w) = 0$, i.e., $R_\varphi$ is not injective. This translates to the statement that $0 = R_\varphi(w)(v) = \varphi(v, w)$ for all $v \in V$. This gives that $\varphi$ is not non-degenerate.

It is important for the above result that we are working with finite dimensional vector spaces. We saw before that if $V$ is infinite dimensional we will not have an isomorphism between $V$ and its dual.
Lemma 5.1.8. Let $V$ be a finite dimensional vector space. There is a bijection between isomorphisms $T : V \to V^*$ and non-degenerate bilinear forms $\varphi \in \mathrm{Hom}_F(V, V; F)$.

Proof. Let $\varphi$ be a non-degenerate bilinear form. Then we associate to $\varphi$ the required isomorphism by sending $\varphi$ to $R_\varphi$.

Now suppose we have an isomorphism $T : V \to V^*$. Define $\varphi \in \mathrm{Hom}_F(V, V; F)$ by
\[
\varphi(v, w) = T(w)(v).
\]
It is elementary to check that this is a bilinear form with $R_\varphi = T$, so $\varphi$ is non-degenerate. Moreover, since $R_\varphi = T$ it is immediate that these maps are inverse to each other and so provide the required bijection.
One should note here that in defining non-degenerate and $R_\varphi$ we made a fairly arbitrary choice: we could just as easily have defined non-degenerate to mean that if $\varphi(v, w) = 0$ for all $w$ then $v$ must be 0, and the map could have been defined to send $v$ to $\varphi(v, \cdot)$. We denote this map by $L_\varphi$, i.e., $L_\varphi : V \to V^*$ is defined by $L_\varphi(v) = \varphi(v, \cdot)$. In fact, these two choices go together. If we change the definition of non-degenerate, the appropriate map to use is $L_\varphi$ instead of $R_\varphi$. We will see later why we prefer to use $R_\varphi$, but it is important to note that up to keeping track of some duals there is really no difference between using $L_\varphi$ and $R_\varphi$. The following result gives the relation between these two maps.
Theorem 5.1.9. Let $V$ be a finite dimensional vector space over a field $F$. Let $\varphi \in \mathrm{Hom}_F(V, V; F)$. Then $L_\varphi$ and $R_\varphi$ are dual to each other, namely, given $L_\varphi : V \to V^*$, if we dualize this to obtain $L_\varphi^* : (V^*)^* \to V^*$, then upon identifying $(V^*)^*$ with $V$ via the canonical isomorphism we have $L_\varphi^* = R_\varphi$. Similarly one has $R_\varphi^* = L_\varphi$.
Proof. Recall that given $F$-vector spaces $V, W$ and $T \in \mathrm{Hom}_F(V, W)$, the dual map $T^*$ is defined by
\[
T^*(\psi)(v) = \psi(T(v))
\]
for $v \in V$ and $\psi \in W^*$. We also recall that the canonical isomorphism between $V$ and $(V^*)^*$ is given by sending $v$ to the evaluation map $\mathrm{eval}_v$. For $v, w \in V$ we have
\begin{align*}
L_\varphi^*(\mathrm{eval}_v)(w) &= \mathrm{eval}_v(L_\varphi(w)) \\
&= \mathrm{eval}_v(\varphi(w, \cdot)) \\
&= \varphi(w, v) \\
&= R_\varphi(v)(w).
\end{align*}
Thus, we have $L_\varphi^* = R_\varphi$ upon identifying $V$ and $(V^*)^*$.
It is well known from multivariable calculus that Example 5.1.3 is used to define the length of a vector in $\mathbb{R}^n$. However, if we make the same definition on $\mathbb{C}$ we do not recover the length of a complex number. In order to find the length of $z \in \mathbb{C}$, we want to consider $z\bar{z}$. This leads to the definition of a different type of form, a sesquilinear form, that keeps track of conjugation as well. Before defining these forms, we need to define conjugation for more general fields than $\mathbb{C}$.
Definition 5.1.10. Let $F$ be a field and $\mathrm{conj} : F \to F$ a map that satisfies:
1. $\mathrm{conj}(\mathrm{conj}(x)) = x$ for every $x \in F$;
2. $\mathrm{conj}(x + y) = \mathrm{conj}(x) + \mathrm{conj}(y)$ for every $x, y \in F$;
3. $\mathrm{conj}(xy) = \mathrm{conj}(x)\,\mathrm{conj}(y)$ for every $x, y \in F$.
We call such a map a conjugation map on $F$. We say $\mathrm{conj}$ is nontrivial if $\mathrm{conj}$ is not the identity map on $F$.
We give a couple of familiar examples of conjugation maps.

Example 5.1.11. Let $F = \mathbb{C}$. Then $\mathrm{conj}$ is the familiar conjugation map sending $x + iy$ to $x - iy$ for $x, y \in \mathbb{R}$.

Example 5.1.12. Let $D \in \mathbb{Z}$ and consider $F = \mathbb{Q}(\sqrt{D})$, where we recall
\[
\mathbb{Q}(\sqrt{D}) = \{a + b\sqrt{D} : a, b \in \mathbb{Q}\}.
\]
It is easy to check that the map sending $a + b\sqrt{D}$ to $a - b\sqrt{D}$ is a nontrivial conjugation map on $F$ if $D$ is not a perfect square.

To emphasize that one should think of conjugation as a generalization of complex conjugation, as well as to save writing, we will denote conjugation maps by $x \mapsto \bar{x}$.
Lemma 5.1.13. Let $F$ be a field with nontrivial conjugation and assume $\mathrm{char}(F) \neq 2$. Then:
1. Let $F_0 = \{z \in F : \bar{z} = z\}$. Then $F_0$ is a subfield of $F$.
2. There is a nonzero element $j \in F$ so that $\bar{j} = -j$.
3. Every element of $F$ can be written uniquely as $z = x + jy$ for some $x, y \in F_0$.

Proof. The fact that $F_0$ is a subfield of $F$ is left as an exercise. Let $a \in F$ be such that $\bar{a} \neq a$. Since the conjugation is nontrivial there is always such an $a$. Set $j = \frac{a - \bar{a}}{2}$. Then one has $\bar{j} = -j$. Given $z \in F$, we have that $\frac{z + \bar{z}}{2}$ and $\frac{z - \bar{z}}{2j}$ are both elements of $F_0$ and $z = \left(\frac{z + \bar{z}}{2}\right) + j\left(\frac{z - \bar{z}}{2j}\right)$.
Example 5.1.14. Returning to the above examples, in the case $F = \mathbb{C}$ we have $F_0 = \mathbb{R}$ and $j = \sqrt{-1}$. In the case $F = \mathbb{Q}(\sqrt{D})$ we have $F_0 = \mathbb{Q}$ and $j = \sqrt{D}$.
Given a field $F$ that admits a conjugation map and a vector space $V$ over $F$, we can also consider a conjugation map on $V$.

Definition 5.1.15. Let $V$ be an $F$-vector space where $F$ is a field with conjugation. A conjugation map on $V$ is a map $\mathrm{conj} : V \to V$ that satisfies
1. $\mathrm{conj}(\mathrm{conj}(v)) = v$ for every $v \in V$;
2. $\mathrm{conj}(v + w) = \mathrm{conj}(v) + \mathrm{conj}(w)$ for every $v, w \in V$;
3. $\mathrm{conj}(av) = \bar{a}\,\mathrm{conj}(v)$ for every $a \in F$, $v \in V$.
As in the case of conjugation on a field, we will often write $v \mapsto \bar{v}$ to denote a conjugation map.
The third property in the definition of conjugation for a vector space gives immediately that linear maps are not the correct maps to work with if one wants to take nontrivial conjugation into account. To remedy this, we define conjugate linear maps.

Definition 5.1.16. Let $F$ be a field with conjugation and let $V$ and $W$ be $F$-vector spaces. A map $T : V \to W$ is said to be conjugate linear if it satisfies
1. $T(v_1 + v_2) = T(v_1) + T(v_2)$ for every $v_1, v_2 \in V$;
2. $T(av) = \bar{a}T(v)$ for every $a \in F$, $v \in V$.
We say $T$ is a conjugate isomorphism if it is conjugate linear and bijective.
Of course, one might not wish to introduce a completely new definition to deal with conjugation. We can get around this by defining a new vector space associated to $V$ that takes the conjugation into account. Let $\bar{V}$ be equal to $V$ as a set, and let $\bar{V}$ have the same addition as $V$. However, we define scalar multiplication on $\bar{V}$ by setting
\begin{align*}
F \times \bar{V} &\to \bar{V} \\
(a, v) &\mapsto a \cdot v := \bar{a}v.
\end{align*}

Exercise 5.1.17. Check that $\bar{V}$ is an $F$-vector space.

We now consider maps in $\mathrm{Hom}_F(\bar{V}, W)$. Let $T \in \mathrm{Hom}_F(\bar{V}, W)$. Then we have
1. $T(v_1 + v_2) = T(v_1) + T(v_2)$ for every $v_1, v_2 \in V$;
2. $T(a \cdot v) = aT(v)$ for every $a \in F$, $v \in V$, i.e., $T(\bar{a}v) = aT(v)$ for every $a \in F$, $v \in V$.
The second equation translates to $T(av) = \bar{a}T(v)$, i.e., $T$ is a conjugate linear map. Thus, the set of conjugate linear maps forms an $F$-vector space; in particular, it is equal to $\mathrm{Hom}_F(\bar{V}, W)$.
Definition 5.1.18. A sesquilinear form $\varphi : V \times V \to F$ is a map that is linear in the first variable and conjugate linear in the second variable.

While bilinear forms are linear in each variable, a sesquilinear form is linear in the first variable and conjugate linear in the second variable. Note that "sesqui" means "one and a half," and this is what it refers to.

Exercise 5.1.19. Show that a sesquilinear form $\varphi : V \times V \to F$ is the same as a bilinear form from $V \times \bar{V}$ to $F$.

The above examples of bilinear forms are easily adjusted to give sesquilinear forms.
Example 5.1.20. Let $V = \mathbb{C}^n$. Define $\varphi : V \times V \to \mathbb{C}$ by
\[
\varphi(v, w) = {}^t v \bar{w} = \sum_{i=1}^n v_i \bar{w}_i
\]
where $v = {}^t(v_1, \ldots, v_n)$ and $w = {}^t(w_1, \ldots, w_n)$. Observe that $\varphi(v, v) = \|v\|^2$, and so for any $v \in \mathbb{C}^n$ we have $\varphi(v, v) \in \mathbb{R}_{\ge 0}$.
Example 5.1.21. Let $V = F^n$ where $F$ is a field with conjugation. Let $A \in \mathrm{Mat}_n(F)$ and define $\varphi : V \times V \to F$ by
\[
\varphi(v, w) = {}^t v A \bar{w}.
\]
This is a sesquilinear form.
Example 5.1.22. Let $V = C^0([0,1], \mathbb{C})$, i.e., $V$ is the set of paths in $\mathbb{C}$. Define $\varphi : V \times V \to \mathbb{C}$ to be the function
\[
\varphi(f, g) = \int_0^1 f(z)\overline{g(z)}\, dz.
\]
One can easily check that $\varphi$ is a sesquilinear form.
We define non-degenerate for sesquilinear forms exactly as for bilinear forms. In the case that the vector space is finite dimensional, given a sesquilinear form one can define $L_\varphi$ and $R_\varphi$ as above. In this case one wants $R_\varphi$, as this is a conjugate linear map whereas $L_\varphi$ is still just a linear map.

Exercise 5.1.23. Let $\varphi : V \times V \to F$ be a sesquilinear form. Use the result that one can view conjugate linear maps as elements of $\mathrm{Hom}_F(\bar{V}, W)$ to relate $L_\varphi$ and $R_\varphi$.

Exercise 5.1.24. Let $V$ be a finite dimensional vector space and let $\varphi$ be a sesquilinear form on $V$. Show that $\varphi$ is non-degenerate if and only if $R_\varphi$ is a conjugate isomorphism.

Exercise 5.1.25. Show there is a bijection between conjugate linear isomorphisms $T : V \to V^*$ and nondegenerate sesquilinear forms $\varphi : V \times V \to F$.
We now define the matrix associated to a bilinear (resp. sesquilinear) form.

Definition 5.1.26. Let $\varphi : V \times V \to F$ be a bilinear or sesquilinear form. Let $B = \{v_1, \ldots, v_n\}$ be a basis of $V$. Set $a_{ij} = \varphi(v_i, v_j)$. The matrix associated to $\varphi$ is given by
\[
[\varphi]_B = A = (a_{ij}) \in \mathrm{Mat}_n(F).
\]
Theorem 5.1.27. Let $\varphi$ be a bilinear or sesquilinear form on a finite dimensional vector space $V$. Let $B$ be a basis of $V$. Then for $v, w \in V$ we have
\[
\varphi(v, w) = {}^t[v]_B A [w]_B
\]
if $\varphi$ is bilinear and
\[
\varphi(v, w) = {}^t[v]_B A \overline{[w]_B}
\]
if $\varphi$ is sesquilinear.

Proof. This follows immediately upon calculating on a basis.
The definition of the matrix associated to a bilinear (resp. sesquilinear) form seems rather arbitrary. At this point the only justification is that the previous theorem shows this definition works how we expect it to. However, there is a more natural reason this is the correct definition to use. Let $\varphi$ be a bilinear or sesquilinear form. Let $B = \{v_1, \ldots, v_n\}$ be a basis of $V$ and $B^* = \{v_1^*, \ldots, v_n^*\}$ the dual basis of $V^*$. Observe we have
\[
R_\varphi(v_j) = \varphi(\cdot, v_j) \in V^*.
\]
As $R_\varphi$ is a linear map from $V$ to $V^*$, we can ask for $[R_\varphi]_B^{B^*}$. To calculate this, we need to expand $R_\varphi(v_j)$ in terms of $B^*$ for each $j$. Write
\[
R_\varphi(v_j) = c_1 v_1^* + \cdots + c_n v_n^*
\]
for $c_i \in F$. The goal is to find the $c_i$. Observe we have
\begin{align*}
a_{ij} &= \varphi(v_i, v_j) \\
&= R_\varphi(v_j)(v_i) \\
&= c_1 v_1^*(v_i) + \cdots + c_n v_n^*(v_i) \\
&= c_i.
\end{align*}
Thus, $a_{ij} = c_i$ and so
\[
R_\varphi(v_j) = a_{1j} v_1^* + \cdots + a_{nj} v_n^*,
\]
which gives
\[
[R_\varphi]_B^{B^*} = [\varphi]_B.
\]
This shows that the matrix of a bilinear or sesquilinear form is really just the matrix associated to the linear map $R_\varphi$.
Corollary 5.1.28. A bilinear or sesquilinear form $\varphi$ on a finite dimensional vector space $V$ is nondegenerate if and only if $[\varphi]_B$ is nonsingular for any basis $B$.

Proof. This follows immediately from the definition of nondegenerate and the relation between $[\varphi]_B$ and the linear map $R_\varphi$.
Just as when studying linear maps, we would like to know how a change of basis affects the matrix of $\varphi$.

Theorem 5.1.29. Let $V$ be a finite dimensional $F$-vector space and let $B$ and $C$ be bases of $V$. If $P$ is the change of basis matrix from $C$ to $B$, then if $\varphi$ is bilinear we have
\[
[\varphi]_C = {}^t P [\varphi]_B P
\]
and if $\varphi$ is sesquilinear we have
\[
[\varphi]_C = {}^t P [\varphi]_B \bar{P}.
\]
Proof. We consider the case that $\varphi$ is sesquilinear. The proof when $\varphi$ is bilinear can be obtained by taking the conjugation to be trivial in this proof.

We have from the definition that for every $v, w \in V$
\[
\varphi(v, w) = {}^t[v]_B [\varphi]_B \overline{[w]_B}
\]
and
\[
\varphi(v, w) = {}^t[v]_C [\varphi]_C \overline{[w]_C}.
\]
Moreover, since $P$ is the change of basis matrix from $C$ to $B$ we have $[v]_B = P[v]_C$ and $[w]_B = P[w]_C$. Combining all of this we obtain
\begin{align*}
{}^t[v]_C [\varphi]_C \overline{[w]_C} &= \varphi(v, w) \\
&= {}^t[v]_B [\varphi]_B \overline{[w]_B} \\
&= {}^t(P[v]_C)[\varphi]_B \overline{P[w]_C} \\
&= {}^t[v]_C \left({}^t P [\varphi]_B \bar{P}\right) \overline{[w]_C}.
\end{align*}
Since this is true for all $v, w \in V$, we must have
\[
[\varphi]_C = {}^t P [\varphi]_B \bar{P}
\]
as claimed.
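The change of basis formula is easy to sanity-check numerically. In the following SymPy sketch (bilinear case, with matrices of our own choosing) both sides of the defining property agree:

    from sympy import Matrix

    # Change of basis for a bilinear form: [phi]_C = P^t [phi]_B P.
    phiB = Matrix([[1, 2], [2, 0]])   # matrix of phi in the basis B
    P = Matrix([[1, 1], [0, 1]])      # change of basis matrix from C to B

    phiC = P.T * phiB * P
    # For v, w in C-coordinates, v^t phiC w must equal (Pv)^t phiB (Pw).
    v = Matrix([2, -1]); w = Matrix([3, 5])
    lhs = (v.T * phiC * w)[0, 0]
    rhs = ((P*v).T * phiB * (P*w))[0, 0]
    assert lhs == rhs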
It is very important to note the difference between this and changing bases for linear maps. We do not conjugate the matrix here, i.e., in Chapter 3 we had the equation $[T]_C = P^{-1}[T]_B P$. This means that one cannot in general use the results from Chapter 4 to find a nice basis for the matrix $[\varphi]_B$. Later we will discuss this further and determine some cases when one can use rational and Jordan canonical forms to obtain nice bases for forms as well.
Definition 5.1.30. Let $A, B \in \mathrm{Mat}_n(F)$. We say $A$ and $B$ are congruent (resp. conjugate congruent) if there exists $P \in \mathrm{GL}_n(F)$ so that $B = {}^t P A P$ (resp. $B = {}^t P A \bar{P}$).

Exercise 5.1.31. Show that congruence of matrices defines an equivalence relation.

The above results show that matrices $A$ and $B$ in $\mathrm{Mat}_n(F)$ represent the same bilinear (resp. sesquilinear) form if and only if $A$ and $B$ are congruent (resp. conjugate congruent).
5.2 Symmetric, skew-symmetric, Hermitian, and skew-Hermitian forms
In this section we specialize the forms we are looking at. This allows us to prove
some very nice structure theorems.
Definition 5.2.1. Let $V$ be a vector space over a field $F$.
1. A bilinear form $\varphi$ on $V$ is said to be symmetric if $\varphi(v, w) = \varphi(w, v)$ for all $v, w \in V$.
2. A bilinear form $\varphi$ on $V$ is said to be skew-symmetric if $\varphi(v, w) = -\varphi(w, v)$ for all $v, w \in V$ and $\varphi(v, v) = 0$ for all $v \in V$.
3. A sesquilinear form $\varphi$ on $V$ is said to be Hermitian if $\varphi(v, w) = \overline{\varphi(w, v)}$ for all $v, w \in V$.
4. A sesquilinear form $\varphi$ on $V$ is said to be skew-Hermitian if $\mathrm{char}(F) \neq 2$ and $\varphi(v, w) = -\overline{\varphi(w, v)}$ for all $v, w \in V$.

Exercise 5.2.2. Show that if $\mathrm{char}(F) \neq 2$ then $\varphi$ is skew-symmetric if and only if $\varphi(v, w) = -\varphi(w, v)$ for all $v, w \in V$.

For the rest of this section, all of the bilinear and sesquilinear forms will be one of the types given in the previous definition. When we wish to emphasize a form $\varphi$ on $V$, we write $(V, \varphi)$ for the vector space.
Lemma 5.2.3. Let $(V, \varphi)$ be a finite dimensional $F$-vector space. Let $B$ be a basis for $V$ and set $A = [\varphi]_B$. Then we have:
1. $\varphi$ is symmetric if and only if ${}^t A = A$.
2. $\varphi$ is skew-symmetric if and only if ${}^t A = -A$.
3. $\varphi$ is Hermitian if and only if ${}^t A = \bar{A}$.
4. $\varphi$ is skew-Hermitian if and only if ${}^t A = -\bar{A}$ and $\mathrm{char}(F) \neq 2$.
The proof of this lemma is left as an exercise. It amounts to converting the definition to matrix language.

Definition 5.2.4. Let $A \in \mathrm{Mat}_n(F)$.
1. We say $A$ is symmetric if ${}^t A = A$.
2. We say $A$ is skew-symmetric if ${}^t A = -A$.
3. We say $A$ is Hermitian if ${}^t A = \bar{A}$.
4. We say $A$ is skew-Hermitian if ${}^t A = -\bar{A}$.
One should keep in mind that the notion we are generalizing is that of the dot product on $\mathbb{R}^n$. As such, the correct notion of equivalence between spaces $(V, \varphi)$ and $(W, \psi)$ should preserve distance, i.e., it should be a generalization of the notion of isometry from Euclidean geometry.

Definition 5.2.5. Let $(V, \varphi)$ and $(W, \psi)$ be $F$-vector spaces. We say $T \in \mathrm{Hom}_F(V, W)$ is an isometry if $T$ is an isomorphism and satisfies
\[
\varphi(v_1, v_2) = \psi(T(v_1), T(v_2))
\]
for all $v_1, v_2 \in V$.
As is the custom, we rephrase this definition in terms of matrices.

Lemma 5.2.6. Let $T \in \mathrm{Hom}_F(V, W)$ with $(V, \varphi)$ and $(W, \psi)$ finite dimensional $F$-vector spaces of dimension $n$. Let $B$ be a basis for $V$ and $C$ a basis for $W$. Set $P = [T]_B^C$. Then $T$ is an isometry if and only if $P \in \mathrm{GL}_n(F)$ and
\[
{}^t P [\psi]_C P = [\varphi]_B
\]
if $\varphi$ and $\psi$ are bilinear, and
\[
{}^t P [\psi]_C \bar{P} = [\varphi]_B
\]
if $\varphi$ and $\psi$ are sesquilinear.

Proof. Exercise.
Definition 5.2.7. Let $\varphi$ be a form on $V$. The isometry group of $\varphi$ is given by
\[
\mathrm{Isom}(V, \varphi) = \{T \in \mathrm{Hom}_F(V, V) : T \text{ is an isometry}\}.
\]
When $V$ is clear from context we will write $\mathrm{Isom}(\varphi)$.

Exercise 5.2.8. Let $(V, \varphi)$ and $(W, \psi)$ be $F$-vector spaces. Show that $(V, \varphi)$ is isometric to $(W, \psi)$ if and only if the matrix of $\varphi$ is congruent to the matrix of $\psi$, if $\varphi$ and $\psi$ are bilinear. If they are sesquilinear, show the same statement with conjugate congruent.

Exercise 5.2.9. Show that $\mathrm{Isom}(V, \varphi)$ is a group under composition.

One should keep in mind that studying isometry groups of a space gives information back about the geometry of the space $(V, \varphi)$. We will see more detailed examples of this in the next chapter when we restrict ourselves to inner product spaces.

Depending on the type of form we give the isometry group special names.
1. Let $\varphi$ be a nondegenerate symmetric bilinear form. We write $O(\varphi)$ for the isometry group and refer to this as the orthogonal group associated to $\varphi$. We call isometries in this group orthogonal maps.
2. Let $\varphi$ be a nondegenerate Hermitian form. We write $U(\varphi)$ for the isometry group and refer to this as the unitary group associated to $\varphi$. We call isometries in this group unitary maps.
3. Let $\varphi$ be a skew-symmetric bilinear form. We write $\mathrm{Sp}(\varphi)$ for the isometry group and refer to this as the symplectic group associated to $\varphi$. We call isometries in this group symplectic maps.
Recall that when studying linear transformations $T \in \mathrm{Hom}_F(V, V)$ in Chapter 4, the first step was to break $V$ into a direct sum of $T$-invariant subspaces. We want to follow the same general pattern here, but first need the appropriate notion of breaking $V$ up into pieces for a form $\varphi$.

Definition 5.2.10. Let $\varphi$ be a form on $V$. We say $v, w \in V$ are orthogonal with respect to $\varphi$ if
\[
\varphi(v, w) = \varphi(w, v) = 0.
\]
We say subspaces $V_1, V_2 \subseteq V$ are orthogonal if
\[
\varphi(v_1, v_2) = \varphi(v_2, v_1) = 0
\]
for all $v_1 \in V_1$, $v_2 \in V_2$.
We once again emphasize that one should keep in mind the notion of the dot product on $\mathbb{R}^n$ when thinking of these definitions. One then sees that in this special case this notion of orthogonality is the same as the notion that was introduced in vector calculus.

Definition 5.2.11. Let $V_1, V_2 \subseteq V$ be orthogonal subspaces. We say $V$ is the orthogonal direct sum of $V_1$ and $V_2$, and write $V = V_1 \perp V_2$, if $V = V_1 \oplus V_2$.
Exercise 5.2.12. 1. Show that $V = V_1 \perp V_2$ if and only if $V = V_1 \oplus V_2$ and, given any $v, v' \in V$, when we write $v = v_1 + v_2$ and $v' = v_1' + v_2'$ for $v_1, v_1' \in V_1$, $v_2, v_2' \in V_2$, we have
\[
\varphi(v, v') = \varphi(v_1, v_1') + \varphi(v_2, v_2').
\]
2. Suppose $V = V_1 \perp V_2$ and $V$ is finite dimensional. Let $\varphi_1 = \varphi|_{V_1 \times V_1}$ and $\varphi_2 = \varphi|_{V_2 \times V_2}$. If we let $B_1$ be a basis for $V_1$, $B_2$ a basis for $V_2$, and $B = B_1 \cup B_2$, then show that
\[
[\varphi]_B = \begin{pmatrix} [\varphi_1]_{B_1} & 0 \\ 0 & [\varphi_2]_{B_2} \end{pmatrix}.
\]
The next result shows that even if one is given a degenerate form, one can split off the kernel of this form and be left with a nondegenerate form. This will allow us to focus our attention on nondegenerate forms when proving our classification theorems.

Lemma 5.2.13. Let $\varphi$ be a form on a finite dimensional $F$-vector space $V$. Then there is a subspace $V_1$ of $V$ so that $\varphi_1 = \varphi|_{V_1 \times V_1}$ is nondegenerate and $V = \ker(\varphi) \perp V_1$. Moreover, $(V_1, \varphi_1)$ is well-defined up to isometry.

Proof. The exercise above gives that $\ker(\varphi)$ is a subspace of $V$, so it certainly has a complement $V_1$, i.e., we can write $V = \ker(\varphi) \oplus V_1$ for some subspace $V_1 \subseteq V$. Set $\varphi_1 = \varphi|_{V_1 \times V_1}$. The definition of $\ker(\varphi)$ immediately gives that $V = \ker(\varphi) \perp V_1$.

The next step is to show that $\varphi_1$ is nondegenerate, i.e., that $R_{\varphi_1}$ is injective. (Note we use finite dimensionality here to conclude that $R_{\varphi_1}$ being injective is equivalent to it being an isomorphism.) Suppose there exists $v_1 \in V_1$ so that $R_{\varphi_1}(v_1) = 0$, i.e., $0 = R_{\varphi_1}(v_1)(w_1) = \varphi_1(w_1, v_1) = \varphi(w_1, v_1)$ for every $w_1 \in V_1$. Moreover, if $w \in \ker(\varphi)$, then $\varphi(w, v_1) = 0$. Thus, $\varphi(w, v_1) = 0$ for all $w \in V$, i.e., $v_1 \in \ker(\varphi)$. Since $V_1 \cap \ker(\varphi) = 0$, this forces $v_1 = 0$. Thus, $\varphi_1$ must be nondegenerate.

The last thing to show is that $V_1$ is well-defined up to isometry. Set $V' = V/\ker(\varphi)$. We define a form $\varphi'$ on $V'$ as follows. Let $\pi : V \to V'$ be the standard projection map. For $v', w' \in V'$, let $v, w \in V$ be such that $\pi(v) = v'$ and $\pi(w) = w'$. Set
\[
\varphi'(v', w') = \varphi(v, w).
\]
This is a well-defined form on $V'$ as we are quotienting out by the kernel. Thus, $\pi|_{V_1}$ is an isometry between $(V_1, \varphi_1)$ and $(V', \varphi')$. This gives the result.
Definition 5.2.14. Let $\varphi$ be a form on a finite dimensional vector space $V$. The rank of $\varphi$ is the dimension of $V/\ker(\varphi)$.

We saw in the proof of the previous lemma that given $\varphi$, there is a subspace $V_1 \subseteq V$ so that $V = \ker(\varphi) \perp V_1$. In general, given a subspace $W \subseteq V$, we can ask if there is a subspace $W'$ so that $V = W \perp W'$.
Definition 5.2.15. Let $W \subseteq V$ be a subspace. The orthogonal complement of $W$ is given by
\[
W^\perp = \{v \in V : \varphi(w, v) = 0 \text{ for all } w \in W\}.
\]

Exercise 5.2.16. Show that $W^\perp$ is a subspace of $V$.
Lemma 5.2.17. Let $W$ be a finite dimensional subspace of $(V, \varphi)$ and set $\psi = \varphi|_{W \times W}$. If $\psi$ is nondegenerate then $V = W \perp W^\perp$. If $V$ is finite dimensional and $\varphi$ is nondegenerate as well, then $\varphi|_{W^\perp \times W^\perp}$ is also nondegenerate.

Proof. By definition we have that $W$ and $W^\perp$ are orthogonal, so to show $V = W \perp W^\perp$ we need only show that $V = W \oplus W^\perp$.

Let $v_0 \in W \cap W^\perp$. Observe that $\psi(w, v_0) = \varphi(w, v_0) = 0$ for every $w \in W$ because $v_0 \in W^\perp$. Using that $\psi$ is nondegenerate we obtain $v_0 = 0$ and so $W \cap W^\perp = 0$.

Let $v_0 \in V$. Note that since $\psi$ is assumed to be nondegenerate and $W$ is finite dimensional, $R_\psi : W \to W^*$ is an isomorphism. Thus, for any $T \in W^*$ there is an element $w_T \in W$ so that $R_\psi(w_T) = T$. We now apply this fact to our situation. We have $R_\varphi(v_0)|_W \in W^*$, so there is a $w_0 \in W$ with
\[
R_\psi(w_0) = R_\varphi(v_0)|_W,
\]
i.e., for every $w \in W$ we have
\begin{align*}
\psi(w, w_0) &= R_\psi(w_0)(w) \\
&= R_\varphi(v_0)|_W(w) \\
&= \varphi(w, v_0).
\end{align*}
Thus, we have
\[
\varphi(w, v_0) = \psi(w, w_0) = \varphi(w, w_0)
\]
where we have used $\varphi|_{W \times W} = \psi$. Subtracting, we have that for every $w \in W$
\[
\varphi(w, v_0 - w_0) = 0,
\]
i.e., $v_0 - w_0 \in W^\perp$. This allows us to write
\[
v_0 = w_0 + (v_0 - w_0) \in W + W^\perp.
\]
Thus, $V = W \perp W^\perp$ as desired.

Now suppose that $\varphi$ is nondegenerate. Let $v_0 \in W^\perp$ be nonzero. Since $\varphi$ is nondegenerate there exists $v \in V$ with $\varphi(v, v_0) \neq 0$. Write $v = w_1 + w_2$ with $w_1 \in W$ and $w_2 \in W^\perp$. Then
\begin{align*}
0 \neq \varphi(v, v_0) &= \varphi(w_1 + w_2, v_0) \\
&= \varphi(w_1, v_0) + \varphi(w_2, v_0) \\
&= \varphi(w_2, v_0),
\end{align*}
where we have used that $\varphi(w_1, v_0) = 0$ because $w_1 \in W$ and $v_0 \in W^\perp$. Thus, $R_\varphi(v_0)(w_2) \neq 0$ and since $v_0$ was an arbitrary nonzero element, this gives that $R_{\varphi|_{W^\perp \times W^\perp}}$ is injective. Since we are assuming $V$ is finite dimensional, this is enough to conclude that $R_{\varphi|_{W^\perp \times W^\perp}}$ is an isomorphism, i.e., $\varphi|_{W^\perp \times W^\perp}$ is nondegenerate.
Lemma 5.2.18. Let $W$ be a subspace of $(V, \varphi)$ with $V$ finite dimensional. Assume $\varphi|_{W \times W}$ and $\varphi|_{W^\perp \times W^\perp}$ are both nondegenerate. Then $(W^\perp)^\perp = W$.

Proof. We can write $V = W \perp W^\perp$ and $V = W^\perp \perp (W^\perp)^\perp$ by using that $W$ and $W^\perp$ are both subspaces of $V$. This immediately gives that $\dim_F W = \dim_F (W^\perp)^\perp$. It is now easy to see using the definition that $W \subseteq (W^\perp)^\perp$. Since they have the same dimension, we must have equality.
We will later see how to prove the above theorem in the case that $W$ is finite dimensional without the need to assume that $V$ is finite dimensional as well. Our next step is to classify forms on finite dimensional vector spaces where possible. The following result is essential in the symmetric and Hermitian cases.

Lemma 5.2.19. Let $(V, \varphi)$ be an $F$-vector space and assume that $\varphi$ is nondegenerate. If $\mathrm{char}(F) \neq 2$, assume $\varphi$ is symmetric or Hermitian. If $\mathrm{char}(F) = 2$, assume $\varphi$ is Hermitian. Then there is a vector $v \in V$ with $\varphi(v, v) \neq 0$.
Proof. Let $v_1 \in V$ with $v_1 \neq 0$. If $\varphi(v_1, v_1) \neq 0$ we are done, so assume $\varphi(v_1, v_1) = 0$. The fact that $\varphi$ is nondegenerate gives a nonzero $v_2 \in V$ so that $\varphi(v_1, v_2) \neq 0$. Set $b = \varphi(v_1, v_2)$. If $\varphi(v_2, v_2) \neq 0$ we are done, so assume $\varphi(v_2, v_2) = 0$. Set $v_3 = tv_1 + v_2$ where $t \in F$. Then we have
\begin{align*}
\varphi(v_3, v_3) &= \varphi(tv_1 + v_2, tv_1 + v_2) \\
&= t\bar{t}\varphi(v_1, v_1) + t\varphi(v_1, v_2) + \bar{t}\varphi(v_2, v_1) + \varphi(v_2, v_2) \\
&= tb + \bar{t}\bar{b},
\end{align*}
where $\bar{t} = t$ and $\bar{b} = b$ if $\varphi$ is symmetric. Thus, if $\varphi$ is symmetric, $\varphi(v_3, v_3) = 2tb$, so choose any $t \neq 0$ and we are done. Now suppose that $\varphi$ is Hermitian, so $\varphi(v_3, v_3) = tb + \overline{tb}$. Set $t = 1/b$ so $tb = 1$. We claim $\bar{1} = 1$. To see this, observe $\bar{1} = \overline{1 \cdot 1} = \bar{1}\,\bar{1}$ and $\bar{1} \neq 0$, so $\bar{1} = 1$. Thus, $\varphi(v_3, v_3) = 2 \neq 0$ as long as $\mathrm{char}(F) \neq 2$. Now suppose that $\mathrm{char}(F) = 2$. In this case pick $a \in F$ so that $\bar{a} \neq a$. Such an $a$ exists because we are assuming the conjugation is nontrivial. Set $t = a/b$ so that $\varphi(v_3, v_3) = a + \bar{a} \neq 0$. Thus, we have the result in all cases.
Consider a basis $B = \{v_1, \ldots, v_n\}$ for $(V, \varphi)$ so that $V_i = \mathrm{span}_F\{v_i\}$ satisfies
\[
V = V_1 \perp \cdots \perp V_n.
\]
This gives
\[
[\varphi]_B = \begin{pmatrix} a_1 & & \\ & \ddots & \\ & & a_n \end{pmatrix}
\]
where $a_i = \varphi(v_i, v_i)$. This leads to the following definition.
Definition 5.2.20. Let $\varphi$ be a symmetric bilinear form or a Hermitian form on a finite dimensional vector space $V$. We say $\varphi$ is diagonalizable if there are 1-dimensional subspaces $V_1, \ldots, V_n$ so that $V = V_1 \perp \cdots \perp V_n$.
Let $a \in F$ with $a \neq 0$. We can define a bilinear form on $F$ by considering $a \in \mathrm{Mat}_1(F)$, i.e., we define $\varphi_a(x, y) = xay$. Here we get that $\varphi_a$ is symmetric for free. Similarly, if $F$ has nontrivial conjugation we can define a sesquilinear form $\varphi_a$ by setting $\varphi_a(x, y) = xa\bar{y}$. This is not Hermitian for free. In particular, we have
\begin{align*}
\overline{\varphi_a(x, y)} &= \bar{x}\bar{a}y \\
&= y\bar{a}\bar{x}.
\end{align*}
This is equal to $\varphi_a(y, x) = ya\bar{x}$ if and only if $a = \bar{a}$. Thus, we have that $\varphi_a$ is Hermitian if and only if $a = \bar{a}$. In either case, we write $[a]$ for the space $(F, \varphi_a)$. This set-up allows us to rephrase the definition of diagonalizable: $\varphi$ is diagonalizable if there exist $a_1, \ldots, a_n$ so that $\varphi$ is isometric to $[a_1] \perp \cdots \perp [a_n]$.
Theorem 5.2.21. Let $(V, \varphi)$ be a finite dimensional vector space over a field $F$. If $\mathrm{char}(F) \neq 2$ we assume $\varphi$ is symmetric or Hermitian. If $\mathrm{char}(F) = 2$, we require $\varphi$ to be Hermitian. Then $\varphi$ is diagonalizable.

Proof. We immediately reduce to the case that $\varphi$ is nondegenerate by recalling that there is a subspace $V_1$ so that $\varphi|_{V_1 \times V_1}$ is nondegenerate and $V = \ker(\varphi) \perp V_1$. Thus, if $\varphi|_{V_1 \times V_1}$ is diagonalizable, then certainly $\varphi$ is as well since the restriction to $\ker(\varphi)$ is just 0.

We now assume $\varphi$ is nondegenerate and induct on the dimension of $V$. The case $\dim_F V = 1$ is trivial. Assume the result is true for all vector spaces of dimension less than $n$ and let $\dim_F V = n$. We have a $v_1 \in V$ so that $\varphi(v_1, v_1) \neq 0$ by Lemma 5.2.19. Set $a_1 = \varphi(v_1, v_1)$ and let $V_1 = \mathrm{span}_F\{v_1\}$. Since $a_1 \neq 0$, the restriction $\varphi|_{V_1 \times V_1}$ is nondegenerate, so Lemma 5.2.17 gives $V = V_1 \perp V_1^\perp$ and, since $\varphi$ is nondegenerate, that $\varphi_1 := \varphi|_{V_1^\perp \times V_1^\perp}$ is nondegenerate. This gives $\dim_F V_1^\perp = n - 1$. We apply the induction hypothesis to $(V_1^\perp, \varphi_1)$ to obtain one dimensional subspaces $V_2, \ldots, V_n$ so that $V_1^\perp = V_2 \perp \cdots \perp V_n$. Thus, $V = V_1 \perp V_2 \perp \cdots \perp V_n$ and we are done.
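The induction in this proof can be carried out numerically by congruence operations. The following SymPy sketch is our own helper, not part of the text: it assumes characteristic 0, a symmetric input, and that a nonzero diagonal entry is available at each step (when every remaining diagonal entry vanishes, one would first apply the $v_3 = tv_1 + v_2$ trick from Lemma 5.2.19, which we omit here):

    from sympy import Matrix, eye

    def diagonalize_symmetric(A):
        # Returns (P, D) with D = P.T * A * P diagonal.
        n = A.shape[0]
        A = Matrix(A)
        P = eye(n)
        for i in range(n):
            pivots = [k for k in range(i, n) if A[k, k] != 0]
            if not pivots:
                break  # remaining block is zero or needs the omitted trick
            j = pivots[0]
            # Swap basis vectors i and j (a congruence operation).
            E = eye(n); E.col_swap(i, j)
            A = E.T * A * E; P = P * E
            # Subtract multiples of v_i to clear phi(v_i, v_k) for k > i.
            E = eye(n)
            for k in range(i + 1, n):
                E[i, k] = -A[i, k] / A[i, i]
            A = E.T * A * E; P = P * E
        return P, A

    M = Matrix([[0, 1], [1, 1]])
    P, D = diagonalize_symmetric(M)
    print(D)                  # diag(1, -1)
    print(P.T * M * P == D)   # True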
We can use this theorem to give our first classification theorem. The following theorem is usually stated over algebraically closed fields, but we state it a little more generally as algebraically closed is not required.

Theorem 5.2.22. Let $(V, \varphi)$ be an $n$-dimensional $F$-vector space with $\varphi$ a nondegenerate symmetric bilinear form. Suppose that $F$ satisfies $\mathrm{char}(F) \neq 2$ and is closed under square roots, namely, given any $a \in F$, the polynomial $f(x) = x^2 - a \in F[x]$ has a root in $F$. (In particular, if $F$ is algebraically closed this is certainly the case.) Then $\varphi$ is isometric to $[1] \perp \cdots \perp [1]$. In particular, all nondegenerate symmetric bilinear forms over such a field are isometric.
Proof. The previous theorem gives one dimensional subspaces $V_1, \ldots, V_n$ so that $V = V_1 \perp \cdots \perp V_n$. Let $\{v_i\}$ be a basis for $V_i$ and set $a_i = \phi(v_i, v_i)$. We choose $v_i$ so that $a_i \neq 0$. By our assumption on $F$ we have an element $b_i \in F$ so that $b_i$ is a root of $f(x) = x^2 - 1/a_i$, i.e., $b_i^2 = 1/a_i$. Set $\mathcal{B} = \{b_1 v_1, \ldots, b_n v_n\}$. Then we have
$$\phi(b_i v_i, b_i v_i) = b_i^2\,\phi(v_i, v_i) = b_i^2 a_i = 1.$$
This gives the result.
In the case that one has $\phi$ isometric to $[1] \perp \cdots \perp [1]$ with $n$ copies of $[1]$, we will write $\phi \cong n[1]$.
Recall that the isometry group of a symmetric bilinear form was denoted $O(\phi)$. Consider now the case that $(V, \phi)$ is defined over a field $F$ that is closed under square roots. Then we have just seen there is a basis $\mathcal{B}$ so that $[\phi]_{\mathcal{B}} = 1_n$ where $1_n$ is the $n \times n$ identity matrix. Thus, we have
$$O(\phi) = \{M \in \operatorname{Mat}_n(F) : {}^tM\,1_n\,M = 1_n\} = \{M \in \operatorname{Mat}_n(F) : {}^tM = M^{-1}\}.$$
We denote this group by $O_n(F)$, or just $O_n$ if $F$ is clear from context, and refer to matrices $M \in O_n(F)$ as orthogonal matrices.
Note that the proof of Theorem 5.2.22 does not work for a Hermitian form even if $F$ is algebraically closed because in order to scale the $v_i$ we would need $b_i$ so that $b_i\bar{b}_i = a_i$. However, this equation cannot be set up as a polynomial equation and so algebraically closed does not guarantee a solution to such an equation exists. To push our classifications further we need to restrict the fields of interest some. In general one needs an ordered field. However, to save the trouble of introducing ordered fields we will consider our fields $F$ to satisfy $\mathbb{Q} \subseteq F \subseteq \mathbb{C}$. In addition, if we wish to consider symmetric forms, we will require $F \subseteq \mathbb{R}$. If we wish to consider Hermitian forms, we only require $F$ has nontrivial conjugation. One important point to note here is that with our set-up, if $\phi$ is symmetric we certainly have $\phi(v, v) \in \mathbb{R}$ for all $v \in V$. However, if $\phi$ is Hermitian we have for any $v \in V$ that $\phi(v, v) \in \mathbb{C}$ satisfies $\phi(v, v) = \overline{\phi(v, v)}$, i.e., $\phi(v, v) \in \mathbb{R}$ in this case as well. This allows us to make the following definition.
Definition 5.2.23. Let $(V, \phi)$ be an $F$-vector space so that if $\phi$ is symmetric, $\mathbb{Q} \subseteq F \subseteq \mathbb{R}$, and if $\phi$ is Hermitian, $\mathbb{Q} \subseteq F \subseteq \mathbb{C}$. We say $\phi$ is positive definite if $\phi(v, v) > 0$ for all nonzero $v \in V$ and $\phi$ is negative definite if $\phi(v, v) < 0$ for all nonzero $v \in V$. We say $\phi$ is indefinite if there are vectors $v, w \in V$ so that $\phi(v, v) > 0$ and $\phi(w, w) < 0$.
With these definitions we give the next classification theorem.

Theorem 5.2.24 (Sylvester's Law of Inertia). Let $(V, \phi)$ be a finite dimensional $\mathbb{R}$-vector space with $\phi$ a nondegenerate symmetric bilinear form on $V$, or let $(V, \phi)$ be a $\mathbb{C}$-vector space with $\phi$ a nondegenerate Hermitian form on $V$. Then $\phi$ is isometric to $p[1] \perp q[-1]$ for well-defined integers $p, q$ with $p + q = \dim_F V$.
Proof. The proof that $\phi$ is isometric to $p[1] \perp q[-1]$ follows the same argument as used in the proof of Theorem 5.2.22. We begin with the case that $(V, \phi)$ is an $\mathbb{R}$-vector space and $\phi$ is a nondegenerate symmetric bilinear form. As in the proof of Theorem 5.2.22, we have one-dimensional subspaces $V_1, \ldots, V_n$ so that $V_i = \operatorname{span}_{\mathbb{R}}\{v_i\}$ and $V = V_1 \perp \cdots \perp V_n$. Write $a_i = \phi(v_i, v_i)$. Let $b_i = 1/\sqrt{|a_i|} \in \mathbb{R}$. Then $b_i^2 = 1/|a_i|$ and so
$$\phi(b_i v_i, b_i v_i) = b_i^2\,\phi(v_i, v_i) = \frac{a_i}{|a_i|} = \operatorname{sign}(a_i)$$
where $\operatorname{sign}(a_i) = 1$ if $a_i > 0$ and $\operatorname{sign}(a_i) = -1$ if $a_i < 0$. Thus, if we let $p$ be the number of positive $a_i$ and $q$ the number of negative $a_i$ we get $\phi \cong p[1] \perp q[-1]$.

Now suppose $(V, \phi)$ is a $\mathbb{C}$-vector space and $\phi$ is a nondegenerate Hermitian form. Given any $v \in V$ and $b \in \mathbb{C}$ we have $\phi(bv, bv) = b\bar{b}\,\phi(v, v) = |b|^2\phi(v, v)$. Proceeding as above, this shows that for each $j$ we can scale $\phi(v_j, v_j)$ by any positive real number. Since $\phi(v_j, v_j) \in \mathbb{R}$ for all $j$, we again have $\phi \cong p[1] \perp q[-1]$ where $p$ and $q$ are defined as above.
It remains to show that $p$ and $q$ are invariants of $\phi$ and do not depend on the choice of basis. Let $V^+$ be the largest subspace of $V$ so that the restriction of $\phi$ to $V^+$ is positive definite and $V^-$ the largest subspace of $V$ so that the restriction of $\phi$ to $V^-$ is negative definite. Set $p_0 = \dim_F V^+$ and $q_0 = \dim_F V^-$. Note that $p_0$ and $q_0$ are well-defined and do not depend on any choices. We now show $p = p_0$ and $q = q_0$.

Let $\mathcal{B} = \{v_1, \ldots, v_n\}$ be a basis of $V$ so that $[\phi]_{\mathcal{B}} = p[1] \perp q[-1]$. Set $\mathcal{B}^+ = \{v_1, \ldots, v_p\}$ and $\mathcal{B}^- = \{v_{p+1}, \ldots, v_n\}$. Let $W^+ = \operatorname{span}_F \mathcal{B}^+$ and $W^- = \operatorname{span}_F \mathcal{B}^-$. Then we have $\phi$ restricted to $W^+$ is positive definite so $p \leq p_0$, and similarly we obtain $q \leq q_0$. Note that this gives $n = p + q \leq p_0 + q_0$. Suppose $p \neq p_0$. Then $\dim_F V^+ + \dim_F V^- > n$, so $V^+ \cap V^- \neq 0$. This is easily seen to be a contradiction, so $p = p_0$ and the same argument gives $q = q_0$, completing the proof.
One important point to note is all that was really used in the above proof was that given any positive number $|a_i|$, one had $1/\sqrt{|a_i|}$ in $F$. Thus, we do not need to use $F = \mathbb{R}$ or $F = \mathbb{C}$ in the theorem. For example, if $(V, \phi)$ is an $F$-vector space with $\phi$ a symmetric bilinear form and $F \subseteq \mathbb{R}$, to apply the result to $(V, \phi)$ we only require that $\sqrt{|a|} \in F$ for every $a \in F$. If the field does not contain all its positive square roots things are more difficult, as we will see below.
Definition 5.2.25. Let $(V, \phi)$, $p$, and $q$ be as in the previous theorem. The signature of $\phi$ is $(p, q)$.
Corollary 5.2.26. A nondegenerate symmetric bilinear form on a finite dimensional $\mathbb{R}$-vector space or a nondegenerate Hermitian form on a finite dimensional $\mathbb{C}$-vector space is classified up to isometry by its signature. If $\phi$ is not required to be nondegenerate, it is classified by its signature and its rank.

Proof. This follows immediately from the previous theorem with the exception of the degenerate case. However, if we do not require $\phi$ to be nondegenerate, then write $V = \ker(\phi) \oplus V_1$ and apply Sylvester's law to $V_1$ and we obtain the result.

If we allow $\phi$ to be degenerate, the previous result gives $\phi \cong p[1] \perp q[-1] \perp r[0]$ for $r = n - p - q$.
We again briefly return to the isometry groups. Let $\phi$ be a nondegenerate symmetric bilinear form on a real vector space $V$ of signature $(p, q)$ so $\phi \cong p[1] \perp q[-1]$. Let $1_{p,q}$ be given by
$$1_{p,q} = \begin{pmatrix} 1_p & \\ & -1_q \end{pmatrix}.$$
Then the isometry group of $\phi$ is given by
$$O_{p,q}(\mathbb{R}) := O(\phi) = \{M \in \mathrm{GL}_{p+q}(\mathbb{R}) : {}^tM\,1_{p,q}\,M = 1_{p,q}\}.$$
In the case that $p = n = \dim_{\mathbb{R}} V$ and $q = 0$ we recover the case given above for a positive definite symmetric bilinear form.
Let $\phi$ be a nondegenerate Hermitian form on a complex vector space $V$ of signature $(p, q)$ so $\phi \cong p[1] \perp q[-1]$. The isometry group in this case is given by
$$U_{p,q}(\mathbb{C}) := U(\phi) = \{M \in \mathrm{GL}_{p+q}(\mathbb{C}) : {}^t\overline{M}\,1_{p,q}\,M = 1_{p,q}\}.$$
One will also often see this group denoted by $U(p, q)$ when $\mathbb{C}$ is clear from context. In the case $p = n = \dim_{\mathbb{C}} V$ and $q = 0$, we have
$$U_n(\mathbb{C}) := U_{n,0}(\mathbb{C}) = \{M \in \mathrm{GL}_n(\mathbb{C}) : {}^t\overline{M}\,1_n\,M = 1_n\}.$$
We now briefly address the situation of a nondegenerate symmetric bilinear or Hermitian form on a vector space over a field $\mathbb{Q} \subseteq F \subseteq \mathbb{C}$ that is not closed under taking square roots of positive numbers. (If $F$ is not a subset of $\mathbb{R}$, we realize the positive numbers in $F$ as $F \cap \mathbb{R}_{>0}$. Note by requiring $F \subseteq \mathbb{C}$ this makes sense.) Let $\phi$ be such a form. Theorem 5.2.21 gives elements $a_1, \ldots, a_n$ so that $\phi \cong [a_1] \perp \cdots \perp [a_n]$. We know that each $a_i$ can be scaled by any element $b_i^2$ for $b_i \in F$. This gives a nice diagonal representation for such forms. Let $F^{\times} = F - \{0\}$ and set $(F^{\times})^2 = \{b^2 : b \in F^{\times}\}$. The above discussion may lead one to believe that there is a bijection between nondegenerate symmetric bilinear forms on an $n$-dimensional $F$-vector space $V$ and $(F^{\times}/(F^{\times})^2)^n$. However, the following example shows this does not work even in the case $F = \mathbb{Q}$, namely, there is not a well-defined map from the collection of nondegenerate symmetric bilinear forms to $(F^{\times}/(F^{\times})^2)^n$. One can relate the study of such forms to the classification of quadratic forms, but one quickly sees it is a very difficult problem in this level of generality.
Example 5.2.27. Consider the symmetric bilinear form on $\mathbb{Q}^2$ given by
$$\phi(x, y) = {}^tx \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} y.$$
Let $P = \begin{pmatrix} 1 & -3 \\ 1 & 2 \end{pmatrix}$. Then we have
$$\begin{pmatrix} 5 & 0 \\ 0 & 30 \end{pmatrix} = {}^tP \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} P,$$
and so we have $\phi$ is isometric to $\begin{pmatrix} 5 & 0 \\ 0 & 30 \end{pmatrix}$, but the diagonal entries of this do not differ from 2 and 3 by squares.
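One can confirm the congruence in this example directly; the following quick check (ours, not from the text) uses NumPy with integer matrices.

```python
import numpy as np

# Example 5.2.27: tP A P = diag(5, 30), so the diagonal entries of an
# isometric diagonal form are not well-defined entry-by-entry mod squares.
A = np.array([[2, 0], [0, 3]])
P = np.array([[1, -3], [1, 2]])
print(P.T @ A @ P)   # [[ 5  0]
                     #  [ 0 30]]
```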
We next classify skew-symmetric forms.

Theorem 5.2.28. Let $(V, \phi)$ be a finite dimensional vector space over a field $F$ and let $\phi$ be a nondegenerate skew-symmetric form. Then $n = \dim_F V$ must be even and $\phi$ is isometric to $n/2$ copies of $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, i.e., there is a basis $\mathcal{B}$ so that
$$[\phi]_{\mathcal{B}} = \begin{pmatrix} 0 & 1 & & & \\ -1 & 0 & & & \\ & & \ddots & & \\ & & & 0 & 1 \\ & & & -1 & 0 \end{pmatrix}.$$
Proof. We use induction on $n = \dim_F V$. Let $n = 1$ and $\mathcal{B} = \{v_1\}$ be a basis for $V$. Since $\phi$ is skew-symmetric we have $\phi(v_1, v_1) = 0$, which gives $[\phi]_{\mathcal{B}} = [0]$. This contradicts $\phi$ being nondegenerate. Since the $n = 1$ case is reasonably trivial, we also show the $n = 2$ case as it is more illustrative of the general case. Let $v_1 \in V$ with $v_1 \neq 0$. Since $\phi$ is nondegenerate there is a $w \in V$ so that $\phi(w, v_1) \neq 0$. Since $\phi$ is skew-symmetric $w$ cannot be a multiple of $v_1$ and so $\dim \operatorname{span}_F(v_1, w) = 2$. Set $a = \phi(w, v_1)$ and $v_2 = -w/a$. Set $\mathcal{B}_1 = \{v_1, v_2\}$ and $V_1 = \operatorname{span}_F(v_1, v_2)$. Then we have
$$[\phi|_{V_1}]_{\mathcal{B}_1} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$

Now suppose the theorem is true for all vector spaces of dimension less than $n$. Let $v_1 \in V$ with $v_1 \neq 0$. As in the $n = 2$ case we construct a $v_2$ and $V_1 = \operatorname{span}_F(v_1, v_2)$ so that
$$[\phi|_{V_1}]_{\mathcal{B}_1} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$
This shows $\phi|_{V_1}$ is nondegenerate, so $V = V_1 \perp V_1^{\perp}$. We now apply the induction hypothesis to $V_1^{\perp}$. Suppose that $n$ is odd and $\phi$ is a nondegenerate skew-symmetric form on $V$. Then $\phi|_{V_1^{\perp}}$ is a nondegenerate skew-symmetric form on a vector space of dimension $n - 2$. Since $n - 2$ is odd, the induction hypothesis gives there are no nondegenerate skew-symmetric forms on a vector space of dimension $n - 2$ if it is odd. This contradiction shows there are no nondegenerate skew-symmetric forms on $V$ if $n$ is odd. Now suppose $n$ is even; then since $\dim_F V_1^{\perp} = n - 2$ is even we have a basis $\mathcal{B}_2 = \{v_3, \ldots, v_n\}$ of $V_1^{\perp}$ with
$$[\phi|_{V_1^{\perp}}]_{\mathcal{B}_2} = \begin{pmatrix} 0 & 1 & & & \\ -1 & 0 & & & \\ & & \ddots & & \\ & & & 0 & 1 \\ & & & -1 & 0 \end{pmatrix}$$
where there are $(n-2)/2$ copies of $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ on the diagonal. The basis $\mathcal{B} = \{v_1, v_2, v_3, \ldots, v_n\}$ now gives the result.
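The induction above is again constructive: pair each vector with a partner of pairing 1 and project the rest into the complement of the hyperbolic plane they span. Below is a rough Python sketch of this (our own illustration, with invented names) that extracts such a basis from a nondegenerate skew-symmetric Gram matrix over $\mathbb{Q}$; it will raise `StopIteration` if the input is degenerate.

```python
from fractions import Fraction

def symplectic_basis(A):
    """Return basis vectors (as columns) in which the skew-symmetric,
    nondegenerate form A becomes a block sum of [[0, 1], [-1, 0]],
    following the induction in Theorem 5.2.28."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]

    def phi(u, w):
        return sum(u[i] * A[i][j] * w[j] for i in range(n) for j in range(n))

    basis = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    pairs = []
    while basis:
        v1 = basis.pop(0)
        w = next(b for b in basis if phi(b, v1) != 0)   # nondegeneracy
        v2 = [x / -phi(w, v1) for x in w]               # phi(v1, v2) = 1
        basis.remove(w)
        pairs += [v1, v2]
        # project the rest into the phi-complement of span(v1, v2)
        basis = [[b[i] - phi(b, v2) * v1[i] + phi(b, v1) * v2[i]
                  for i in range(n)] for b in basis]
    return pairs

A = [[0, 1, 1, 1], [-1, 0, 1, 1], [-1, -1, 0, 1], [-1, -1, -1, 0]]
cols = symplectic_basis(A)
gram = [[sum(u[i] * Fraction(A[i][j]) * w[j]
             for i in range(4) for j in range(4)) for w in cols] for u in cols]
print(gram)   # block sum of [[0, 1], [-1, 0]]
```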
Exercise 5.2.29. One often sees the isometry group for a skew-symmetric bilinear form represented by a different matrix. Namely, let $(V, \phi)$ be an $F$-vector space of dimension $2n$ with $\phi$ a nondegenerate skew-symmetric bilinear form.

1. Show that there is a basis $\mathcal{B}_1$ so that
$$[\phi]_{\mathcal{B}_1} = \begin{pmatrix} 0_n & 1_n \\ -1_n & 0_n \end{pmatrix}.$$

2. Show that there is a basis $\mathcal{B}_2$ so that $[\phi]_{\mathcal{B}_2}$ is the anti-diagonal matrix whose $(i, 2n+1-i)$ entry is $1$ for $1 \leq i \leq n$ and $-1$ for $n+1 \leq i \leq 2n$, with all other entries $0$.
Using the previous exercise, we can write
$$\operatorname{Sp}_{2n}(F) := \operatorname{Sp}(\phi) = \left\{ M \in \mathrm{GL}_{2n}(F) : {}^tM \begin{pmatrix} 0_n & 1_n \\ -1_n & 0_n \end{pmatrix} M = \begin{pmatrix} 0_n & 1_n \\ -1_n & 0_n \end{pmatrix} \right\}.$$
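Membership in $\operatorname{Sp}_{2n}$ is a single matrix identity, so it is easy to test numerically. The following snippet is our own sketch (not from the text); it also illustrates one standard family of symplectic matrices.

```python
import numpy as np

def is_symplectic(M):
    """Check  tM J M = J  for the standard J defining Sp_2n."""
    n = M.shape[0] // 2
    J = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-np.eye(n), np.zeros((n, n))]])
    return np.allclose(M.T @ J @ M, J)

# block matrices diag(A, tA^{-1}) with A invertible lie in Sp_2n
A = np.array([[2.0, 1.0], [1.0, 1.0]])
M = np.block([[A, np.zeros((2, 2))],
              [np.zeros((2, 2)), np.linalg.inv(A).T]])
print(is_symplectic(M))   # True
```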
The last case for us to classify is the case of skew-Hermitian forms.

Theorem 5.2.30. Let $(V, \phi)$ be an $F$-vector space with $\phi$ a nondegenerate skew-Hermitian form. Then $\phi$ is isometric to $[a_1] \perp \cdots \perp [a_n]$ for some $a_i \in F$ with $a_i \neq 0$, $\bar{a}_i = -a_i$. (Note that $\bar{a}_i = -a_i$ is equivalent to $a_i = jb_i$ for some $b_i \in F_0$.)
Proof. Our first step is to show there is a $v \in V$ so that $\phi(v, v) \neq 0$. Let $v_1 \in V$ be any nonzero element. If $\phi(v_1, v_1) \neq 0$, set $v = v_1$. Otherwise, choose $v_2 \in V$ so that $\phi(v_2, v_1) \neq 0$. Such a $v_2$ exists because $\phi$ is nondegenerate. Set $a = \phi(v_1, v_2)$. If $\phi(v_2, v_2) \neq 0$, set $v = v_2$. Otherwise, for any $c \in F$ set $v_3 = v_1 + cv_2$. We have
$$\phi(v_3, v_3) = \phi(v_1, v_1) + \phi(v_1, cv_2) + \phi(cv_2, v_1) + \phi(cv_2, cv_2) = \bar{c}\,\phi(v_1, v_2) + c\,\phi(v_2, v_1) = a\bar{c} - \bar{a}c.$$
Set $c = -j/\bar{a}$. Then $a\bar{c} = j$ and $\bar{a}c = -j$, so we have
$$\phi(v_3, v_3) = j - (-j) = 2j \neq 0.$$
We now proceed by induction on the dimension of $V$. If $\dim_F V = 1$, we construct $v_3$ as above and we have $\phi$ is isometric to $[2j]$. Now assume the result is true for all vector spaces of dimension less than $n$. Let $v_1 \in V$ be a vector so that $\phi(v_1, v_1) \neq 0$ as above. Set $V_1 = \operatorname{span}_F\{v_1\}$. We have $\phi|_{V_1 \times V_1} \neq 0$, so $\phi|_{V_1 \times V_1}$ is nondegenerate and we have $V = V_1 \perp V_1^{\perp}$. We now apply the induction hypothesis to $V_1^{\perp}$. This gives the result.
Exercise 5.2.31. Let $(V, \phi)$ be a $\mathbb{C}$-vector space and $\phi$ a nondegenerate skew-Hermitian form. Then $\phi$ is isometric to $p[i] \perp q[-i]$ for some well-defined integers $p$ and $q$ with $p + q = \dim_{\mathbb{C}} V$.
5.3 The adjoint map

The last topic of this chapter is the adjoint map. Let $(V, \phi)$ and $(W, \psi)$ be $F$-vector spaces with $\phi$ and $\psi$ either both non-degenerate bilinear or both non-degenerate sesquilinear forms. (Throughout this section when given $\phi$ and $\psi$, we always assume they are non-degenerate and are both bilinear or both sesquilinear.) Let $T \in \operatorname{Hom}_F(V, W)$. The adjoint map is a linear map $T^* \in \operatorname{Hom}_F(W, V)$ that behaves nicely with respect to $\phi$ and $\psi$.

Definition 5.3.1. Let $T \in \operatorname{Hom}_F(V, W)$. The adjoint map is a map $T^*: W \to V$ satisfying
$$\phi(v, T^*(w)) = \psi(T(v), w)$$
for all $v \in V$, $w \in W$.
It is important to note here that many books will use $T^*$ to denote the dual map $T^{\vee}$. Be careful not to confuse the adjoint $T^*$ with the dual map!
The first step is to show that an adjoint map actually exists when $V$ and $W$ are finite dimensional.

Proposition 5.3.2. Let $(V, \phi)$ and $(W, \psi)$ be finite dimensional vector spaces. Then there is an adjoint map $T^* \in \operatorname{Hom}_F(W, V)$.
Proof. First we show there is a map $T^*$ that satisfies the equation. We then show it is unique and linear. Recall that given any $\ell \in W^{\vee}$, we have $T^{\vee}(\ell) \in V^{\vee}$ defined by $T^{\vee}(\ell)(v) = \ell(T(v))$ for $v \in V$. Fix $w \in W$ so we have $\psi(\cdot, w) = R_{\psi}(w) \in W^{\vee}$. Then $T^{\vee}(R_{\psi}(w)) \in V^{\vee}$ and is given by $(T^{\vee}(R_{\psi}(w)))(v) = R_{\psi}(w)(T(v)) = \psi(T(v), w)$. We now use that $\phi$ is non-degenerate, i.e., $R_{\phi}: V \to V^{\vee}$ is an isomorphism, to conclude that for each $w \in W$ there exists a unique element $z_w \in V$ so that $R_{\phi}(z_w) = T^{\vee}(R_{\psi}(w))$, i.e., for each $w \in W$ and $v \in V$ we have
$$\phi(v, z_w) = R_{\phi}(z_w)(v) = (T^{\vee}(R_{\psi}(w)))(v) = R_{\psi}(w)(T(v)) = \psi(T(v), w).$$
Thus, we have a map $T^*: W \to V$ defined by sending $w$ to $z_w$. Moreover, from the construction it is clear the map is unique.
One could easily show this map is linear directly, but we present a different proof that is conceptually more useful for later results. Suppose that $\phi$ and $\psi$ are bilinear. We just saw the defining formula for $T^*$ can be rewritten as
$$R_{\phi}(T^*(w))(v) = T^{\vee}(R_{\psi}(w))(v).$$
Thus, $R_{\phi} \circ T^* = T^{\vee} \circ R_{\psi}$. Since $\phi$ is assumed to be non-degenerate, $R_{\phi}$ is an isomorphism so $R_{\phi}^{-1}$ exists and we can write
$$T^* = R_{\phi}^{-1} \circ T^{\vee} \circ R_{\psi}.$$
Since each term on the right hand side is linear, so is $T^*$. This finishes the proof of the result in the case the forms are bilinear. We leave the proof of the linearity in the sesquilinear case as an exercise.
Note the above proof fails completely when $V$ and $W$ are not finite dimensional vector spaces. One no longer has that $R_{\phi}$ and $R_{\psi}$ are isomorphisms. In fact, in the infinite dimensional case it often happens that the adjoint map does not exist! One particularly important case where one knows adjoints exist in the infinite dimensional case is for bounded linear maps on Hilbert spaces. This follows from the Riesz Representation Theorem. We do not expand on this here as it takes us too far afield.
We now give a matrix representation of the adjoint map. This result will be extremely important in the next chapter when we give the Spectral Theorem.

Corollary 5.3.3. Let $\mathcal{B}$ be a basis of $(V, \phi)$ and $\mathcal{C}$ be a basis of $(W, \psi)$. Set $P = [\phi]_{\mathcal{B}}$ and $Q = [\psi]_{\mathcal{C}}$. Then
$$[T^*]^{\mathcal{B}}_{\mathcal{C}} = P^{-1}\,{}^t[T]^{\mathcal{C}}_{\mathcal{B}}\,Q$$
if $\phi$ and $\psi$ are bilinear and
$$[T^*]^{\mathcal{B}}_{\mathcal{C}} = P^{-1}\,{}^t\overline{[T]^{\mathcal{C}}_{\mathcal{B}}}\,Q$$
if $\phi$ and $\psi$ are sesquilinear.

Proof. This is an immediate consequence of the previous proposition.
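As a quick sanity check on the bilinear case, one can verify the defining identity $\phi(v, T^*(w)) = \psi(T(v), w)$ numerically. The snippet below is our own (the random symmetric matrices are merely generically nondegenerate choices for the Gram matrices $P$ and $Q$).

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 4
# symmetric Gram matrices for phi on V = R^n and psi on W = R^m
P = rng.normal(size=(n, n)); P = P + P.T + n * np.eye(n)
Q = rng.normal(size=(m, m)); Q = Q + Q.T + m * np.eye(m)
A = rng.normal(size=(m, n))             # matrix of T : V -> W
B = np.linalg.inv(P) @ A.T @ Q          # matrix of T* from Corollary 5.3.3

v, w = rng.normal(size=n), rng.normal(size=m)
# phi(v, T* w) = tv P (B w)  versus  psi(T v, w) = t(A v) Q w
print(np.allclose(v @ P @ (B @ w), (A @ v) @ Q @ w))   # True
```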
One special case of the corollary is when $V = W$ and $\phi = \psi$. In this case one has $P = Q$. We can take this further with the following definition and result. These concepts will be developed more fully in the next chapter.

Definition 5.3.4. Let $(V, \phi)$ be a vector space. A basis $\mathcal{B} = \{v_i\}$ is said to be orthogonal if $\phi(v_i, v_j) = 0$ for all $v_i, v_j \in \mathcal{B}$ with $i \neq j$. We say the basis is orthonormal if it is orthogonal and satisfies $\phi(v_i, v_i) = 1$ for all $v_i \in \mathcal{B}$.
In general there is no reason an arbitrary vector space should have an orthonormal basis. However, if one has an orthonormal basis one obtains some very nice properties. One immediate property is the following corollary.

Corollary 5.3.5. Let $(V, \phi)$ and $(W, \psi)$ be finite dimensional vector spaces with orthonormal bases $\mathcal{B}$ and $\mathcal{C}$ respectively. Let $T \in \operatorname{Hom}_F(V, W)$. Then
$$[T^*]^{\mathcal{B}}_{\mathcal{C}} = {}^t[T]^{\mathcal{C}}_{\mathcal{B}}$$
if $\phi$ and $\psi$ are bilinear and
$$[T^*]^{\mathcal{B}}_{\mathcal{C}} = {}^t\overline{[T]^{\mathcal{C}}_{\mathcal{B}}}$$
if $\phi$ and $\psi$ are sesquilinear.

Proof. This follows from Corollary 5.3.3 upon observing that $P$ and $Q$ are both the identity matrix due to the fact that $\mathcal{B}$ and $\mathcal{C}$ are orthonormal.
Suppose we have a finite dimensional vector space $(V, \phi)$ with $\phi$ a non-degenerate symmetric bilinear form. Let $\mathcal{B} = \{v_1, \ldots, v_n\}$ and $\mathcal{C} = \{w_1, \ldots, w_n\}$ be orthonormal bases of $V$. Let $T \in \operatorname{Hom}_F(V, V)$ be defined by $T(v_i) = w_i$. Observe we have
$$\phi(v_i, v_j) = \delta_{i,j} = \phi(w_i, w_j) = \phi(T(v_i), T(v_j))$$
where $\delta_{ij} = \begin{cases} 0 & i \neq j \\ 1 & i = j. \end{cases}$ We immediately obtain that $\phi(v, \tilde{v}) = \phi(T(v), T(\tilde{v}))$ for all $v, \tilde{v} \in V$. In particular, we have $\phi(v, \tilde{v}) = \phi(v, T^*(T(\tilde{v})))$ for all $v, \tilde{v} \in V$, i.e., $\phi(v, \tilde{v} - T^*(T(\tilde{v}))) = 0$ for all $v, \tilde{v} \in V$. Since $\phi$ is non-degenerate, this gives $T^* \circ T = \operatorname{id}_V$, i.e., if we set $P = [T]^{\mathcal{C}}_{\mathcal{B}}$, then ${}^tPP = 1_n$. Thus, if $P$ is a change of basis matrix between two orthonormal bases then $P$ is an orthogonal matrix. The same argument gives that if $\phi$ is a Hermitian form, then a change of basis matrix between two orthonormal bases is a unitary matrix.
Exercise 5.3.6. Let $\mathcal{B} = \{v_1, \ldots, v_n\}$ be an orthonormal basis of a finite dimensional vector space $(V, \phi)$ with $\phi$ a non-degenerate symmetric bilinear form. Let $T \in \operatorname{Hom}_F(V, V)$ and let $P = [T]_{\mathcal{B}}$. Show that if $P$ is an orthogonal matrix then $\mathcal{C} = \{T(v_1), \ldots, T(v_n)\}$ is an orthonormal basis of $V$. Prove the analogous result when $\phi$ is Hermitian as well.
Exercise 5.3.7. Let $(U, \phi)$, $(V, \psi)$, and $(W, \eta)$ be finite dimensional vector spaces. Prove the following elementary results about the adjoint map.

1. Let $S, T \in \operatorname{Hom}_F(V, W)$. Show that $(S + T)^* = S^* + T^*$.

2. $(cT)^* = \bar{c}\,T^*$ for every $c \in F$.

3. Let $T \in \operatorname{Hom}_F(U, V)$ and $S \in \operatorname{Hom}_F(V, W)$. Then $(S \circ T)^* = T^* \circ S^*$.

4. Let $p(x) = a_n x^n + \cdots + a_1 x + a_0 \in F[x]$ and $T \in \operatorname{Hom}_F(V, V)$. Then $(p(T))^* = \bar{p}(T^*)$ where $\bar{p}(x) = \bar{a}_n x^n + \cdots + \bar{a}_1 x + \bar{a}_0$.
Corollary 5.3.8. Let $(V, \phi)$ and $(W, \psi)$ be vector spaces with $\phi$ and $\psi$ both symmetric, skew-symmetric, Hermitian, or skew-Hermitian. Then $(T^*)^* = T$.

Proof. We prove the Hermitian case and leave the others as exercises as the proofs are analogous. Set $S = T^*$ so that
$$\phi(v, S(w)) = \psi(T(v), w)$$
for all $v \in V$, $w \in W$. We have
$$\phi(S(w), v) = \overline{\phi(v, S(w))} = \overline{\psi(T(v), w)} = \psi(w, T(v))$$
for all $v \in V$, $w \in W$. This shows we must have $T = S^*$ since the adjoint is unique and $T$ satisfies the properties of being the adjoint of $S$.
Note the above proof works for infinite dimensional spaces as well with the added assumption that the adjoint exists.
5.4 Problems

For these problems we assume $V$ is a finite dimensional vector space over a field $F$.

1. Let $\phi$ be a sesquilinear form on $V$. Relate $R_{\phi}$ and $L_{\phi}$.
2. Let $V_1$ and $V_2$ be subspaces of $V$. Show that $V = V_1 \perp V_2$ if

1. $V = V_1 \oplus V_2$, and

2. given any $v, v' \in V$, when we write $v = v_1 + v_2$ and $v' = v_1' + v_2'$ for $v_i, v_i' \in V_i$ we have
$$\phi(v, v') = \phi_1(v_1, v_1') + \phi_2(v_2, v_2')$$
where $\phi_i = \phi|_{V_i}$.
3. Let $\phi$ be a bilinear form on $V$ and assume $\operatorname{char}(F) \neq 2$. Prove that $\phi$ is skew-symmetric if and only if the diagonal function $V \to F$ given by $v \mapsto \phi(v, v)$ is additive.
4. Let $D$ be an integer so that $\sqrt{D} \notin \mathbb{Z}$. Let $V = \mathbb{Q}^2$.

(a) Show that $\phi_D(x, y) = {}^tx \begin{pmatrix} 1 & 0 \\ 0 & -D \end{pmatrix} y$ is a bilinear form on $V$.

(b) Given two such integers $D_1, D_2$, when is $\phi_{D_1}$ isometric to $\phi_{D_2}$? What if we consider the forms $\phi_{D_1}$ and $\phi_{D_2}$ on $\mathbb{R}^2$ instead of $\mathbb{Q}^2$?
5. Let $V = \mathbb{R}^2$. Set $\phi((x_1, y_1), (x_2, y_2)) = x_1 x_2$.

(a) Show this is a bilinear form. Give a matrix representing this form. Is this form nondegenerate?

(b) Let $W = \operatorname{span}_{\mathbb{R}}(e_1)$ where $e_1$ is the standard basis element. Show that $V = W \oplus W^{\perp}$.

(c) Calculate $(W^{\perp})^{\perp}$.
6. Let $V$ be a 3-dimensional $\mathbb{Q}$-vector space with basis $\mathcal{B} = \{v_1, v_2, v_3\}$. Define a symmetric bilinear form $\phi$ on $V$ by setting $\phi(v_1, v_1) = 0$, $\phi(v_1, v_2) = 2$, $\phi(v_1, v_3) = 2$, $\phi(v_2, v_2) = 2$, $\phi(v_2, v_3) = 2$, $\phi(v_3, v_3) = 3$.

(a) Give the matrix $[\phi]_{\mathcal{B}}$.

(b) If possible, find a basis $\mathcal{B}'$ so that $[\phi]_{\mathcal{B}'}$ is diagonal and give $[\phi]_{\mathcal{B}'}$. If it is not possible to diagonalize $\phi$, give reasons.

(c) Is the symmetric bilinear form given in (b) isometric to the symmetric bilinear form given by
$$\begin{pmatrix} 2 & 0 & 0 \\ 0 & 8 & 0 \\ 0 & 0 & 1 \end{pmatrix}?$$
7. (a) Let $V = \mathbb{F}_5^3$ and let $\mathcal{E}_3$ be the standard basis for $V$. Let $\phi: V \times V \to \mathbb{F}_5$ be the symmetric bilinear form given by
$$[\phi]_{\mathcal{E}_3} = \begin{pmatrix} 1 & 1 & 2 \\ 1 & 3 & 1 \\ 2 & 1 & 4 \end{pmatrix}.$$
Find an orthogonal basis for $V$ with respect to $\phi$. Can you make this an orthonormal basis?

(b) Give a finite collection of matrices so that every nondegenerate symmetric bilinear form on $V$ is isometric to exactly one of the forms listed. Give a short justification on why your list is complete.
8. Let $\phi$ be the standard inner product on $\mathbb{R}^n$ from multivariable calculus class. Define $T \in \operatorname{Hom}_{\mathbb{R}}(\mathbb{R}^n, \mathbb{R}^n)$ by
$$T(x_1, \ldots, x_n) = (0, x_1, \ldots, x_{n-1}).$$
Find a formula for the adjoint map $T^*$.
9. Let $T \in \operatorname{Hom}_F(V, W)$. Show that $T$ is injective if and only if $T^*$ is surjective.
Chapter 6

Inner product spaces and the spectral theorem

We now specialize some of the results of the previous chapter to situations where more can be said. We will require all our vector spaces $(V, \phi)$ to satisfy that $\phi$ is positive definite. In particular, we will restrict to the case that $\mathbb{Q} \subseteq F \subseteq \mathbb{R}$ in the case that $\phi$ is a symmetric bilinear form and $F \subseteq \mathbb{C}$ if $\phi$ is Hermitian. One can restrict to $F = \mathbb{R}$ or $F = \mathbb{C}$ in this chapter and not much will be lost unless one really feels the need to work over other fields.

In this chapter we will recover some of the things one sees in undergraduate linear algebra such as the Gram-Schmidt process. We will end with the Spectral Theorem, which tells when one can choose a nice basis of a linear map that is also nice with respect to the bilinear form on the vector space.
6.1 Inner product spaces

We begin by defining the spaces that we will work with in this chapter.

Definition 6.1.1. 1. Let $(V, \phi)$ be a vector space with $\phi$ a positive definite symmetric bilinear form. We say $\phi$ is an inner product on $V$ and say $(V, \phi)$ is an inner product space.

2. Let $(V, \phi)$ be a vector space with $\phi$ a positive definite Hermitian form. We say $\phi$ is an inner product on $V$ and say $(V, \phi)$ is an inner product space.
Some of the most familiar examples given in the last chapter turn out to be inner products.

Example 6.1.2. Let $A \in \operatorname{Mat}_n(\mathbb{R})$ with ${}^tA = A$ and $A$ positive definite, or $A \in \operatorname{Mat}_n(\mathbb{C})$ with ${}^t\overline{A} = A$ and $A$ positive definite. Then the form $\phi_A$ is an inner product. Note this example recovers the usual dot product on $\mathbb{R}^n$ by setting $A = 1_n$.
Example 6.1.3. Let $V = C^0([0,1], \mathbb{R})$ be the space of continuous functions from $[0,1]$ to $\mathbb{R}$. Define
$$\phi(f, g) = \int_0^1 f(x)g(x)\,dx.$$
We have seen before this is a symmetric bilinear form. Moreover, note that
$$\phi(f, f) = \int_0^1 f(x)^2\,dx > 0$$
for all $f \in V$ with $f \neq 0$. Thus, $\phi$ is an inner product.

Example 6.1.4. Let $V = C^0([0,1], \mathbb{C})$ be the vector space of continuous functions from $[0,1]$ to $\mathbb{C}$. Define
$$\phi(f, g) = \int_0^1 f(x)\overline{g(x)}\,dx.$$
As in the previous example, one has this is an inner product.
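These integral inner products are easy to explore numerically. The sketch below (our own; the quadrature scheme is just a convenient choice) approximates the Hermitian inner product of Example 6.1.4 and checks that $e^{2\pi i x}$ and $e^{4\pi i x}$ are orthogonal unit vectors.

```python
import numpy as np

def ip(f, g, N=100_000):
    """Midpoint-rule approximation to phi(f, g) = int_0^1 f(x) conj(g(x)) dx."""
    x = (np.arange(N) + 0.5) / N
    return np.mean(f(x) * np.conj(g(x)))

f = lambda x: np.exp(2j * np.pi * x)
g = lambda x: np.exp(4j * np.pi * x)
print(np.round(ip(f, f), 6))   # phi(f, f) = 1: f has norm 1
print(np.round(ip(f, g), 6))   # phi(f, g) = 0: f and g are orthogonal
```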
Exercise 6.1.5. Let $(V, \phi)$ be an inner product space. Show that for any subspace $W \subseteq V$ one has $\phi|_W$ is nondegenerate. Conversely, if $\phi$ is a form on $V$ so that $\phi|_W$ is nondegenerate for each subspace $W \subseteq V$, then either $\phi$ or $-\phi$ is an inner product on $V$.
One of the nicest things about an inner product is that it allows us to define a norm on the vector space, i.e., we can define a notion of distance on $V$.

Definition 6.1.6. Let $(V, \phi)$ be an inner product space. The norm of a vector $v \in V$, denoted $\|v\|$, is defined by
$$\|v\| = \sqrt{\phi(v, v)}.$$

The previous definition recovers the definition of the length of a vector given in multivariable calculus by considering $V = \mathbb{R}^n$ and the usual dot product. The following lemma gives some of the basic properties of norms. These are all familiar properties from calculus class for the norm of a vector.

Lemma 6.1.7. Let $(V, \phi)$ be an inner product space.

1. We have $\|cv\| = |c|\,\|v\|$ for all $c \in F$, $v \in V$.

2. We have $\|v\| \geq 0$ and $\|v\| = 0$ if and only if $v = 0$.

3. (Cauchy-Schwarz inequality) We have $|\phi(v, w)| \leq \|v\|\,\|w\|$. We have equality if and only if $\{v, w\}$ is linearly dependent.

4. (Triangle inequality) We have $\|v + w\| \leq \|v\| + \|w\|$ for all $v, w \in V$. This is an equality if and only if $w = 0$ or $v = cw$ for $c \in F_{\geq 0}$ where $F_{\geq 0} = F \cap \mathbb{R}_{\geq 0}$.
Proof. The proofs of the first two statements are left as exercises. We now prove the Cauchy-Schwarz inequality. If $\{v, w\}$ is linearly dependent, then up to reordering either $w = 0$ or $w = cv$ for some $c \in F$. In either case it is clear we have equality. Now assume $\{v, w\}$ is linearly independent, i.e., for any $c \in F$ we have $u = v - cw \neq 0$. Observe we have
$$0 < \|u\|^2 = \phi(u, u) = \|v\|^2 - \bar{c}\,\phi(v, w) - c\,\phi(w, v) + |c|^2\|w\|^2$$
for any $c \in F$. Set $c = \frac{\phi(v, w)}{\phi(w, w)}$. This gives
$$0 < \|v\|^2 - \frac{\overline{\phi(v, w)}\,\phi(v, w)}{\|w\|^2} - \frac{\phi(v, w)\,\overline{\phi(v, w)}}{\|w\|^2} + \left|\frac{\phi(v, w)}{\phi(w, w)}\right|^2 \|w\|^2,$$
i.e., $0 < \|v\|^2 - \frac{|\phi(v, w)|^2}{\|w\|^2}$. Thus, $|\phi(v, w)| < \|v\|\,\|w\|$ as claimed. Moreover, note that we get a strict inequality here, so if $\{v, w\}$ is linearly independent we cannot have equality.

We now turn our attention to the triangle inequality. Observe for any $v, w \in V$ we have
$$\|v + w\|^2 = \phi(v + w, v + w) = \|v\|^2 + \phi(v, w) + \overline{\phi(v, w)} + \|w\|^2 = \|v\|^2 + 2\,\mathrm{Re}(\phi(v, w)) + \|w\|^2 \leq \|v\|^2 + 2|\phi(v, w)| + \|w\|^2 \leq \|v\|^2 + 2\|v\|\,\|w\| + \|w\|^2 = (\|v\| + \|w\|)^2.$$
This gives the triangle inequality. It remains to deal with the case when $\|v + w\|^2 = (\|v\| + \|w\|)^2$. If this is the case, then both inequalities above must be equalities. The second inequality is an equality if and only if $w = 0$ (note the first inequality is also an equality if $w = 0$) or if $w \neq 0$ and $v = cw$ for some $c \in F$. We have
$$\phi(v, w) + \phi(w, v) = \phi(cw, w) + \phi(w, cw) = (c + \bar{c})\|w\|^2.$$
Thus the first inequality is an equality if and only if $c \in F_{\geq 0}$.
We now give a few easy lemmas that deal with orthogonality of vectors.

Lemma 6.1.8. Let $\mathcal{B} = \{v_i\}$ be an orthogonal set of nonzero vectors in an inner product space $(V, \phi)$. If $v \in V$ can be written as $v = \sum_j c_j v_j$ for $c_j \in F$, then $c_j = \frac{\phi(v, v_j)}{\|v_j\|^2}$. If $\mathcal{B}$ is orthonormal, then $c_j = \phi(v, v_j)$.
Proof. Observe we have
$$\phi(v, v_j) = \phi\Big(\sum_i c_i v_i, v_j\Big) = \sum_i c_i\,\phi(v_i, v_j) = c_j\,\phi(v_j, v_j).$$
This gives $c_j = \frac{\phi(v, v_j)}{\phi(v_j, v_j)}$. If $\mathcal{B}$ happens to be orthonormal then $\phi(v_j, v_j) = 1$.
Corollary 6.1.9. Let $\mathcal{B} = \{v_i\}$ be a set of nonzero orthogonal vectors in an inner product space $(V, \phi)$. Then $\mathcal{B}$ is linearly independent.

Proof. Suppose there exist $c_j \in F$ so that $0 = \sum_j c_j v_j$. The previous result gives $c_j = \frac{\phi(0, v_j)}{\|v_j\|^2} = 0$. Thus, we have $\mathcal{B}$ is linearly independent.
Lemma 6.1.10. Let $\mathcal{B} = \{v_i\}$ be a set of nonzero orthogonal vectors in an inner product space $(V, \phi)$. Let $v \in V$ and suppose $v$ can be written as $v = \sum_j c_j v_j$. Then
$$\|v\|^2 = \sum_j |c_j|^2 \|v_j\|^2.$$
If $\mathcal{B}$ is orthonormal, then
$$\|v\|^2 = \sum_j |c_j|^2.$$
Proof. We have
$$\|v\|^2 = \phi(v, v) = \phi\Big(\sum_i c_i v_i, \sum_j c_j v_j\Big) = \sum_{i,j} c_i \bar{c}_j\,\phi(v_i, v_j) = \sum_j |c_j|^2\,\phi(v_j, v_j) = \sum_j |c_j|^2 \|v_j\|^2.$$
This gives the result for $\mathcal{B}$ orthogonal. The orthonormal case follows immediately upon observing $\|v_j\| = 1$ for all $j$.
Corollary 6.1.11. Let $\mathcal{B} = \{v_1, \ldots, v_n\}$ be a set of nonzero orthogonal vectors in $V$. Then for any vector $v \in V$ we have
$$\sum_{j=1}^n \frac{|\phi(v, v_j)|^2}{\|v_j\|^2} \leq \|v\|^2$$
with equality if and only if $v = \sum_{j=1}^n \frac{\phi(v, v_j)}{\|v_j\|^2}\,v_j$.
Proof. Set $w = \sum_{j=1}^n \frac{\phi(v, v_j)}{\|v_j\|^2}\,v_j$ and $u = v - w$. Note,
$$\phi(w, v_i) = \sum_{j=1}^n \frac{\phi(v, v_j)}{\|v_j\|^2}\,\phi(v_j, v_i) = \phi(v, v_i),$$
and so $\phi(u, v_i) = \phi(v, v_i) - \phi(w, v_i) = 0$ for $i = 1, \ldots, n$. Thus,
$$\|v\|^2 = \phi(v, v) = \phi(u + w, u + w) = \phi(u, u) + \phi(u, w) + \phi(w, u) + \phi(w, w) = \|u\|^2 + \|w\|^2$$
since $\phi(u, w) = 0$ as $\phi(u, v_i) = 0$ for all $i$. Hence,
$$\sum_{j=1}^n \frac{|\phi(v, v_j)|^2}{\|v_j\|^2} = \|w\|^2 \leq \|v\|^2,$$
as claimed. We obtain equality if and only if $u = 0$, i.e., $v$ is of the form $\sum_{j=1}^n \frac{\phi(v, v_j)}{\|v_j\|^2}\,v_j$.
Our next step is to prove the traditional Gram-Schmidt process from undergraduate linear algebra. Upon completing this, we give a brief outline of how one can do this in a more general setting.

Theorem 6.1.12 (Gram-Schmidt). Let $W$ be a finite dimensional subspace of $V$ with $\dim_F W = k$. Let $\mathcal{B} = \{v_1, \ldots, v_k\}$ be a basis of $W$. Then there is an orthogonal basis $\mathcal{C} = \{w_1, \ldots, w_k\}$ of $W$ so that $\operatorname{span}_F(v_1, \ldots, v_l) = \operatorname{span}_F(w_1, \ldots, w_l)$ for $l = 1, \ldots, k$. Moreover, if $F$ contains $\mathbb{R}$ we can choose $\mathcal{C}$ to be orthonormal.
Proof. The fact that an orthogonal basis exists follows from Theorem 5.2.21. If $\mathbb{R} \subseteq F$ we obtain an orthonormal basis via Sylvester's Law of Inertia and the fact that $\phi$ is positive definite. To construct the basis we work inductively. Set $x_1 = v_1$ and for $i > 1$ we set
$$x_i = v_i - \sum_{j < i} \frac{\phi(v_i, x_j)}{\|x_j\|^2}\,x_j.$$
One can easily check this gives an orthogonal basis that satisfies $\operatorname{span}_F(v_1, \ldots, v_l) = \operatorname{span}_F(x_1, \ldots, x_l)$ for $l = 1, \ldots, k$. If $\mathbb{R} \subseteq F$, we set $w_j = x_j/\|x_j\|$ and obtain the result.
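The recursion in the proof translates directly into code. The following is a minimal sketch (ours), assuming the standard inner product on $\mathbb{C}^n$ and a linearly independent input family:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent family, following the proof of
    Theorem 6.1.12 with phi the standard Hermitian inner product on C^n."""
    basis = []
    for v in vectors:
        x = v.astype(complex)
        for u in basis:
            x = x - (x @ np.conj(u)) * u   # subtract phi(x, u) u; ||u|| = 1
        basis.append(x / np.linalg.norm(x))
    return basis

B = [np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0])]
for w in gram_schmidt(B):
    print(np.round(w, 4))
# the spans are preserved step by step: span(v_1,...,v_l) = span(w_1,...,w_l)
```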
Let $(V, \phi)$ be a real vector space with $\phi$ symmetric. (We do not require that $\phi$ be positive definite here!) Let $\mathcal{B} = \{v_1, \ldots, v_n\}$ be a basis of $V$. Reorder the basis so $\ker \phi = \operatorname{span}_{\mathbb{R}}\{v_1, \ldots, v_r\}$, i.e., if $1 \leq i \leq r$, then $\phi(v_i, v_j) = 0$ for all $1 \leq j \leq n$. Set $\mathcal{B}_1 = \{v_1, \ldots, v_r\}$ and let $\mathcal{B}_2 = \{v_{r+1}, \ldots, v_n\}$. We know that $[\phi|_{\ker \phi}]_{\mathcal{B}_1} = 0_r$ where $0_r$ is the $r \times r$ matrix consisting of all zeros. We now want to study the basis $\mathcal{B}_2$. Reorder $\mathcal{B}_2$ so that any vectors $v_i$ with $\phi(v_i, v_i) = 0$ come first. Replace $v_{r+1}$ by $v_{r+1} - t_{r+2}v_{r+2} - \cdots - t_n v_n$ where $t_i \in \mathbb{R}$ is chosen so
$$\phi\Big(v_{r+1} - \sum_{j=r+2}^n t_j v_j,\ v_{r+1} - \sum_{j=r+2}^n t_j v_j\Big) \neq 0.$$
Note that we can always find $t_{r+2}, \ldots, t_n$ so that this occurs, for otherwise $v_{r+1} \in \ker \phi$. We apply this procedure for each vector $v_j$ in $\mathcal{B}_2$ that satisfies $\phi(v_j, v_j) = 0$. Once this is done, we have a set $\mathcal{B}_3$ so that any $v_j \in \mathcal{B}_3$ satisfies $\phi(v_j, v_j) \neq 0$. We now apply Gram-Schmidt to $\mathcal{B}_3$ to obtain a collection of orthogonal vectors $x_{r+1}, \ldots, x_n$. Set $w_i = x_i/\sqrt{|\phi(x_i, x_i)|}$ for $i = r+1, \ldots, n$. This gives the signature by counting the ones and negative ones from $\phi(w_i, w_i)$. Let $w_1, \ldots, w_p$ be the vectors with $\phi(w_i, w_i) = 1$ and $w_{p+1}, \ldots, w_{p+q}$ those with $\phi(w_{p+j}, w_{p+j}) = -1$. Set $\mathcal{C} = \{v_1, \ldots, v_r, w_1, \ldots, w_p, w_{p+1}, \ldots, w_{p+q}\}$. Then
$$[\phi]_{\mathcal{C}} = \begin{pmatrix} 0_r & & \\ & 1_p & \\ & & -1_q \end{pmatrix}.$$
Proposition 6.1.13. Let $(V, \phi)$ and $(W, \psi)$ be finite dimensional inner product spaces and let $T \in \operatorname{Hom}_F(V, W)$. Then we have:

1. $\ker T^* = (\operatorname{Im} T)^{\perp}$

2. $\ker T = (\operatorname{Im} T^*)^{\perp}$

3. $\operatorname{Im} T^* = (\ker T)^{\perp}$

4. $\operatorname{Im} T = (\ker T^*)^{\perp}$

Proof. 1) Let $w \in W$. We have $w \in \ker T^*$ if and only if $T^*(w) = 0$. Since $\phi$ is nondegenerate, $T^*(w) = 0$ if and only if $\phi(v, T^*(w)) = 0$ for all $v \in V$. However, this is equivalent to $\psi(T(v), w) = 0$ for all $v \in V$ by the definition of the adjoint map. Finally, this is equivalent to $w \in (\operatorname{Im} T)^{\perp}$.

2) This follows exactly as 1) upon interchanging $T$ and $T^*$.

3) Observe we have $V = \ker T \oplus (\ker T)^{\perp} = (\operatorname{Im} T^*)^{\perp} \oplus (\ker T)^{\perp}$ by 2). Moreover, we have $V = \operatorname{Im} T^* \oplus (\operatorname{Im} T^*)^{\perp}$. Combining these gives $\operatorname{Im} T^* = (\ker T)^{\perp}$.

4) This follows exactly as in 3) upon interchanging $T$ and $T^*$.
Lemma 6.1.14. Let $(V, \phi)$ be a finite dimensional inner product space and let $T \in \operatorname{Hom}_F(V, V)$. Then we have
$$\dim_F \ker T = \dim_F \ker T^*.$$

Proof. The previous proposition gives that $\operatorname{Im} T^* = (\ker T)^{\perp}$. Observe we have
$$\dim_F V = \dim_F \ker T + \dim_F (\ker T)^{\perp} = \dim_F \ker T + \dim_F \operatorname{Im} T^*.$$
On the other hand, we also have $\dim_F V = \dim_F \ker T^* + \dim_F \operatorname{Im} T^*$. Equating these and canceling $\dim_F \operatorname{Im} T^*$ gives the result.
We end this section with a couple of elementary results on the minimal and characteristic polynomials as well as the Jordan canonical form of $T^*$.

Corollary 6.1.15. Let $(V, \phi)$ be a finite dimensional inner product space over $F$ and let $T \in \operatorname{Hom}_F(V, V)$. Suppose $c_T(x)$ factors into linear terms over $F$, i.e., $T$ has a Jordan canonical form over $F$. Then $T^*$ has a Jordan canonical form over $F$ and the Jordan canonical form of $T^*$ is given by conjugating the diagonals of the Jordan canonical form of $T$.
Proof. Assuming the roots of $c_T(x)$ lie in $F$, the dimensions of the spaces $E^k_{\lambda}(T) = \ker(T - \lambda I)^k$ for $\lambda$ an eigenvalue of $T$ determine the Jordan canonical form of $T$. (Note we include the linear map $T$ in the notation to avoid confusion.) Recall from Exercise 5.3.7 that we have $((T - \lambda I)^k)^* = (T^* - \bar{\lambda} I)^k$. We now apply the previous lemma to obtain
$$\dim_F E^k_{\bar{\lambda}}(T^*) = \dim_F \ker((T - \lambda I)^k)^* = \dim_F \ker(T - \lambda I)^k = \dim_F E^k_{\lambda}(T).$$
Since the dimensions of these spaces are equal, we have the Jordan canonical form of $T^*$ exists and is obtained by conjugating the diagonals of the Jordan canonical form of $T$.
Corollary 6.1.16. Let $(V, \phi)$ be a finite dimensional inner product space. Let $T \in \operatorname{Hom}_F(V, V)$. Then

1. $m_{T^*}(x) = \overline{m_T}(x)$;

2. $c_{T^*}(x) = \overline{c_T}(x)$.

Proof. Observe that Exercise 5.3.7 gives that $m_{T^*}(x) = \overline{m_T}(x)$ immediately. For the second result it is enough to work over $\mathbb{C}$, and then one can apply the previous result.
6.2 The spectral theorem

In this final section we give one of the more important results of the course. The spectral theorem tells us when we can choose a nice basis for a linear map that is also nice with respect to the inner product, i.e., is orthonormal. Unfortunately this theorem does not apply to all linear maps, so we begin by defining the linear maps we will be considering. Throughout this section we fix $(V, \phi)$ to be an inner product space.

Definition 6.2.1. Let $T \in \operatorname{Hom}_F(V, V)$. We say $T$ is normal if $T^*$ exists and $T \circ T^* = T^* \circ T$. We say $T$ is self-adjoint if $T^*$ exists and $T = T^*$.
Lemma 6.2.2. Let $T \in \operatorname{Hom}_F(V, V)$. Then $\phi(T(v), T(w)) = \phi(v, w)$ for all $v, w \in V$ if and only if $\|T(v)\| = \|v\|$ for all $v \in V$. Furthermore, if $\|T(v)\| = \|v\|$ for all $v \in V$, then $T$ is an injection. If $V$ is finite dimensional this gives an isomorphism.

Proof. Suppose $\phi(T(v), T(w)) = \phi(v, w)$ for all $v, w \in V$. This gives
$$\|T(v)\|^2 = \phi(T(v), T(v)) = \phi(v, v) = \|v\|^2,$$
i.e., $\|T(v)\| = \|v\|$ for all $v \in V$.

Conversely, suppose $\|T(v)\| = \|v\|$ for all $v \in V$. Then we have the following identities:

1. If $\phi$ is a symmetric bilinear form, then
$$\phi(v, w) = \frac{1}{4}\|v + w\|^2 - \frac{1}{4}\|v - w\|^2.$$

2. If $\phi$ is a Hermitian form, then
$$\phi(v, w) = \frac{1}{4}\|v + w\|^2 + \frac{i}{4}\|v + iw\|^2 - \frac{1}{4}\|v - w\|^2 - \frac{i}{4}\|v - iw\|^2.$$

These two identities are enough to give the result. For example, if $\phi$ is a symmetric bilinear form we have
$$\phi(T(v), T(w)) = \frac{1}{4}\|T(v) + T(w)\|^2 - \frac{1}{4}\|T(v) - T(w)\|^2 = \frac{1}{4}\|T(v + w)\|^2 - \frac{1}{4}\|T(v - w)\|^2 = \frac{1}{4}\|v + w\|^2 - \frac{1}{4}\|v - w\|^2 = \phi(v, w)$$
for all $v, w \in V$.
Corollary 6.2.3. Let $T \in \operatorname{Hom}_F(V, V)$ be an isometry. Then $T$ has an adjoint and $T^* = T^{-1}$.

Proof. Note that by definition if $T$ is an isometry then $T$ is an isomorphism, which implies $T^{-1}$ exists. Now we calculate
$$\phi(v, T^{-1}(w)) = \phi(T(v), T(T^{-1}(w))) = \phi(T(v), w)$$
for all $v, w \in V$. This fact, combined with the uniqueness of the adjoint map, gives $T^* = T^{-1}$.
Corollary 6.2.4. Let $T \in \operatorname{Hom}_F(V, V)$.

1. If $T$ is self-adjoint, then $T$ is normal.

2. If $T$ is an isometry, then $T$ is normal.

Proof. If $T$ is self-adjoint, the only thing we need to check for $T$ to be normal is that $T$ commutes with its adjoint. However, self-adjoint gives $T = T^*$ so this is obvious. If $T$ is an isometry we have just seen that $T^*$ exists and is equal to $T^{-1}$. Again, the fact that $T$ commutes with its adjoint is clear.
Recall that for a nondegenerate symmetric bilinear form the isometry group is denoted $O(\phi)$ and elements of this group are said to be orthogonal. If $\phi$ is a nondegenerate Hermitian form we denote the isometry group by $U(\phi)$ and elements of this group are said to be unitary. We say a matrix $P \in \mathrm{GL}_n(F)$ is orthogonal if ${}^tP = P^{-1}$ and we say $P$ is unitary if ${}^t\overline{P} = P^{-1}$. In the case $F = \mathbb{R}$ or $F = \mathbb{C}$, the statement $P$ is orthogonal is equivalent to $P \in O_n(\mathbb{R})$ and $P$ being unitary is equivalent to $P \in U_n(\mathbb{C})$.
Corollary 6.2.5. Let $T \in \operatorname{Hom}_F(V, V)$. Let $\mathcal{C}$ be an orthonormal basis and $M = [T]_{\mathcal{C}}$. Then

1. If $(V, \phi)$ is a real vector space, then

(a) if $T$ is self-adjoint, then $M$ is symmetric (${}^tM = M$);

(b) if $T$ is orthogonal, then $M$ is orthogonal.

2. If $(V, \phi)$ is a complex vector space, then

(a) if $T$ is self-adjoint, then $M$ is Hermitian (${}^t\overline{M} = M$);

(b) if $T$ is unitary, then $M$ is unitary.

Proof. Recall that for $\mathcal{C}$ an orthonormal basis we have
$$[T^*]_{\mathcal{C}} = {}^t[T]_{\mathcal{C}}$$
via Corollary 5.3.5. If $T$ is self-adjoint, this gives
$$M = [T]_{\mathcal{C}} = [T^*]_{\mathcal{C}} = {}^tM,$$
i.e., $M$ is symmetric. The statement that $T$ being self-adjoint in the complex case implies $M$ is Hermitian follows along the exact same lines. The statements in regards to orthogonal and unitary were given before when discussing isometry groups in the previous chapter.
One should also note the reverse direction in the above result. For example, if $M$ is a symmetric matrix, then the associated map $T_M$ is certainly self-adjoint.
The following example is the fundamental example to keep in mind when working with normal and self-adjoint linear maps.

Example 6.2.6. Let $\lambda_1, \ldots, \lambda_m \in F$. Let $W_1, \ldots, W_m$ be nonzero subspaces of $V$ such that $V = W_1 \perp \cdots \perp W_m$. Define $T \in \operatorname{Hom}_F(V, V)$ as follows. Let $\mathcal{B}_i = \{w_i^1, \ldots, w_i^{k_i}\}$ be a basis for $W_i$. Set $T(w_i^j) = \lambda_i w_i^j$ for $j = 1, \ldots, k_i$. This gives $W_i$ is the $\lambda_i$-eigenspace for $T$, i.e., $T(v) = \lambda_i v$ for all $v \in W_i$.

For any $v \in V$ we can write $v = v_1 + \cdots + v_m$ for some $v_i \in W_i$. We have
$$T(v) = \lambda_1 v_1 + \cdots + \lambda_m v_m$$
and
$$T^*(v) = \bar{\lambda}_1 v_1 + \cdots + \bar{\lambda}_m v_m.$$
Thus
$$(T^* \circ T)(v) = |\lambda_1|^2 v_1 + \cdots + |\lambda_m|^2 v_m = (T \circ T^*)(v).$$
This gives that $T$ is a normal map. Moreover, we see immediately that $T$ is self-adjoint if and only if $\lambda_i = \bar{\lambda}_i$ for all $i$.
In the following proposition we collect some of the results we've already seen, as well as give some new ones that will be of use.

Proposition 6.2.7. Let $T \in \operatorname{Hom}_F(V, V)$ be normal. Then $T^*$ is normal. Furthermore,

1. We have $p(T)$ is normal for all $p(x) \in F[x]$. If $T$ is self-adjoint, then $p(T)$ is self-adjoint.

2. We have $\|T(v)\| = \|T^*(v)\|$ for all $v$ and $\ker T = \ker T^*$.

3. We have $\ker T = (\operatorname{Im} T)^{\perp}$ and $\ker T^* = (\operatorname{Im} T^*)^{\perp}$.

4. If $T^2(v) = 0$, then $T(v) = 0$.

5. If $v \in V$ is an eigenvector of $T$ with eigenvalue $\lambda$, then $v$ is an eigenvector of $T^*$ with eigenvalue $\bar{\lambda}$.

6. Eigenspaces of distinct eigenvalues are orthogonal.
Proof. We have already seen that for $\phi$ symmetric or Hermitian, if $T$ has an adjoint then so does $T^*$ and $(T^*)^* = T$. Since $T$ is assumed to be normal it has an adjoint and
$$T^* \circ (T^*)^* = T^* \circ T = T \circ T^* = (T^*)^* \circ T^*.$$
Thus, $T^*$ is normal.

We have already seen 1) and 5).

For 2) we have
$$\|T(v)\|^2 = \phi(T(v), T(v)) = \phi(v, T^*T(v)) = \phi(v, TT^*(v)) = \phi(v, (T^*)^*T^*(v)) = \phi(T^*(v), T^*(v)) = \|T^*(v)\|^2.$$
The statement about kernels now follows immediately.

We now prove 3). Note we have just seen that $v \in \ker T$ if and only if $v \in \ker T^*$. We have $v \in \ker T = \ker T^*$ is equivalent to $\phi(T^*(v), w) = 0$ for every $w \in V$. However, since $\phi(T^*(v), w) = \phi(v, T(w))$, this gives that $v \in \ker T$ if and only if $\phi(v, T(w)) = 0$ for every $w \in V$, i.e., if and only if $v \in (\operatorname{Im} T)^{\perp}$. One gets the other equality by switching $T$ and $T^*$.

Assume $T^2(v) = 0$ and let $w = T(v)$. Note that $T(w) = T^2(v) = 0$, so $w \in \ker T$. However, $w \in \operatorname{Im} T$ as well, so we have $w \in \ker T \cap \operatorname{Im} T = 0$ by 3). This gives part 4).

We now prove 6). Let $v_1$ be an eigenvector of $T$ with eigenvalue $\lambda_1$ and $v_2$ with eigenvalue $\lambda_2$, and assume $\lambda_1 \neq \lambda_2$. Set $S = T - \lambda_1 I$, so $S(v_1) = 0$. We have
$$0 = \phi(S(v_1), v_2) = \phi(v_1, S^*(v_2)) = \phi(v_1, (T^* - \bar{\lambda}_1 I)(v_2)) = \phi(v_1, (\bar{\lambda}_2 - \bar{\lambda}_1)v_2) = (\lambda_2 - \lambda_1)\,\phi(v_1, v_2).$$
However, by assumption we have $\lambda_2 - \lambda_1 \neq 0$ so it must be that $\phi(v_1, v_2) = 0$, i.e., $v_1$ and $v_2$ are orthogonal.
Exercise 6.2.8. Let $(V, \phi)$ be finite dimensional and $T \in \operatorname{Hom}_F(V, V)$ be normal. Show $\operatorname{Im} T = \operatorname{Im} T^*$.
Proposition 6.2.9. Let $(V, \phi)$ be a finite dimensional inner product space and let $T \in \operatorname{Hom}_F(V, V)$ be normal. The minimal polynomial of $T$ is a product of distinct irreducible factors. If $V$ is a $\mathbb{C}$-vector space, or $V$ is an $\mathbb{R}$-vector space and $T$ is self-adjoint, then every factor is linear.

Proof. Let $p(x)$ be an irreducible factor of $m_T(x)$. Suppose $p^2 \mid m_T$. Thus, there is a vector $v$ so that $p^2(T)(v) = 0$, but $p(T)(v) \neq 0$. Let $S = p(T)$. Then $S$ is normal and $S^2(v) = 0$, but $S(v) \neq 0$. This is a contradiction to 4) in the previous proposition, so $p(x)$ exactly divides $m_T(x)$.

For the second statement, if $V$ is a $\mathbb{C}$-vector space there is nothing to prove. Assume $V$ is an $\mathbb{R}$-vector space and $T$ is self-adjoint. We can factor $m_T(x)$ into linear and quadratic factors as this is true for any polynomial over $\mathbb{R}$. Let $p(x) = x^2 + bx + c$ be irreducible over $\mathbb{R}$ and assume $p(x) \mid m_T(x)$. Note that $p(x)$ irreducible over $\mathbb{R}$ gives $b^2 - 4c < 0$. Let $v \in V$ with $v \neq 0$ such that $p(T)(v) = 0$. Write $p(x) = (x + \frac{b}{2})^2 + d^2$ with $d = \sqrt{\frac{4c - b^2}{4}} \in \mathbb{R}$. Set $S = T + \frac{b}{2}I$. Then $(S^2 + d^2 I)(v) = 0$, i.e., $S^2 v = -d^2 v$. Since we are assuming $T$ is self-adjoint we obtain $S$ is self-adjoint because $\frac{b}{2} \in \mathbb{R}$. Thus, we have
$$0 < \phi(S(v), S(v)) = \phi(S^*S(v), v) = \phi(S^2(v), v) = -d^2\,\phi(v, v).$$
This is clearly a contradiction, so no irreducible quadratic can divide $m_T(x)$, i.e., every irreducible factor of $m_T(x)$ is linear.
Theorem 6.2.10 (Spectral Theorem). 1. Let $(V, \phi)$ be a finite dimensional $\mathbb{C}$-vector space and let $T \in \operatorname{Hom}_F(V, V)$ be normal. Then $V$ has an orthonormal basis that is an eigenbasis for $T$.

2. Let $V$ be a finite dimensional $\mathbb{R}$-vector space and let $T \in \operatorname{Hom}_F(V, V)$ be self-adjoint. Then there is an orthonormal basis of $V$ that is an eigenbasis for $T$.

Proof. Use the previous result to split $m_T(x)$ into distinct linear factors and let $\lambda_1, \ldots, \lambda_k$ be the roots of $m_T(x)$. Let $E_{\lambda_i}$ be the eigenspace associated to $\lambda_i$. We have $V = E_{\lambda_1} \oplus \cdots \oplus E_{\lambda_k}$. But we saw eigenspaces of distinct eigenvalues are orthogonal. So $V = E_{\lambda_1} \perp \cdots \perp E_{\lambda_k}$. Let $\mathcal{C}_i$ be an orthonormal basis of $E_{\lambda_i}$. Then $\mathcal{C} = \bigcup_{i=1}^k \mathcal{C}_i$ is the desired basis.
Corollary 6.2.11. 1. Let $A$ be a Hermitian matrix. Then there is a unitary matrix $P$ and a diagonal matrix $D$ such that $A = PDP^{-1} = PD\,{}^t\overline{P}$.

2. Let $A$ be a real symmetric matrix. Then there is a real orthogonal matrix $P$ and real diagonal matrix $D$ such that $A = PDP^{-1} = PD\,{}^tP$.
Proof. Let $A \in \operatorname{Mat}_n(\mathbb{R})$ be a symmetric matrix. Let $V = \mathbb{R}^n$ and let $\mathcal{E}_n$ be the standard basis of $V$. We have that $A = [T_A]_{\mathcal{E}_n}$. The fact that $A$ is symmetric immediately gives $T_A$ is a self-adjoint map. Thus, the spectral theorem implies there is an orthonormal basis $\mathcal{B}$ of $V$ that is an eigenbasis for $T_A$, i.e., $[T_A]_{\mathcal{B}} = D$ is a diagonal matrix. Let $Q$ be the change of basis matrix from $\mathcal{B}$ to $\mathcal{E}_n$. Then $Q$ is an orthogonal matrix and $QAQ^{-1} = D$. (To see why it is orthogonal, see the discussion preceding Exercise 5.3.6.) Thus, if we set $P = Q^{-1}$ we have the result in the real symmetric case. The Hermitian case follows along the same lines.
Example 6.2.12. Consider the matrix $A = \begin{pmatrix} 2 & 1+i \\ 1-i & 3 \end{pmatrix}$. We want to find a unitary matrix $P$ and a diagonal matrix $D$ so that $A = PDP^{-1}$. Observe we have $c_A(x) = x^2 - 5x + 4 = (x - 4)(x - 1)$, so the eigenvalues of $A$ are $\lambda_1 = 1$ and $\lambda_2 = 4$. Thus, we must have $D = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}$. This gives that $\mathbb{C}^2 = \ker(A - 1_2) \perp \ker(A - 4 \cdot 1_2)$. One computes that $\ker(A - 1_2) = \operatorname{span}_{\mathbb{C}}\left\{\begin{pmatrix} -1-i \\ 1 \end{pmatrix}\right\}$ and $\ker(A - 4 \cdot 1_2) = \operatorname{span}_{\mathbb{C}}\left\{\begin{pmatrix} 1+i \\ 2 \end{pmatrix}\right\}$. We now apply the Gram-Schmidt procedure to obtain an orthonormal basis given by $w_1 = \begin{pmatrix} \frac{-1-i}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} \end{pmatrix}$ and $w_2 = \begin{pmatrix} \frac{1+i}{\sqrt{6}} \\ \frac{2}{\sqrt{6}} \end{pmatrix}$. Thus, we have
$$P = \begin{pmatrix} \frac{-1-i}{\sqrt{3}} & \frac{1+i}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & \frac{2}{\sqrt{6}} \end{pmatrix}.$$
One immediately checks that $A = PDP^{-1}$ as desired.
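For readers who want to check the arithmetic, NumPy reproduces this decomposition directly (the following verification is ours; `np.linalg.eigh` is the standard routine for Hermitian matrices and happens to order the eigenvalues increasingly, matching $D$ here).

```python
import numpy as np

A = np.array([[2, 1 + 1j], [1 - 1j, 3]])
D = np.diag([1, 4])
P = np.array([[(-1 - 1j) / np.sqrt(3), (1 + 1j) / np.sqrt(6)],
              [1 / np.sqrt(3),         2 / np.sqrt(6)]])

print(np.allclose(P.conj().T @ P, np.eye(2)))   # P is unitary
print(np.allclose(P @ D @ P.conj().T, A))       # A = P D P^{-1}

evals, evecs = np.linalg.eigh(A)                # same decomposition
print(evals)                                    # [1. 4.]
```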
We saw in the previous section a way to use Gram-Schmidt to compute the signature. The following corollary allows us to compute the signature by counting eigenvalues.

Corollary 6.2.13. Let $(V, \phi)$ be either a real vector space with $\phi$ a nondegenerate symmetric bilinear form or a complex vector space with $\phi$ a nondegenerate Hermitian form. Set $n = \dim_F V$. Let $\mathcal{B}$ be a basis for $V$ and set $A = [\phi]_{\mathcal{B}}$. Then

1. $A$ has $n$ real eigenvalues (counting multiplicity);

2. the signature of $\phi$ is $(r, s)$ where $r$ is the number of positive eigenvalues and $s$ is the number of negative eigenvalues of $A$.
Proof. The first statement is left as a homework problem.

Let $\phi$ be a nondegenerate symmetric bilinear form. Then if we take any basis $\mathcal{B}$ of $V$, we have $A = [\phi]_{\mathcal{B}}$ is a symmetric matrix. We apply Corollary 6.2.11 to see that there exists an orthonormal basis $\mathcal{C} = \{v_1, \ldots, v_n\}$ so that $D = [\phi]_{\mathcal{C}}$ is a diagonal matrix, i.e., there is an orthogonal matrix $P$ so that $A = PDP^{-1}$. The diagonal entries of $D$ are precisely the eigenvalues of $A$. Upon reordering $\mathcal{C}$ we can assume that the first $r$ diagonal entries of $D$ are positive and the remaining $s = n - r$ are negative. Let $W_1 = \operatorname{span}_{\mathbb{R}}\{v_1, \ldots, v_r\}$ and $W_2 = \operatorname{span}_{\mathbb{R}}\{v_{r+1}, \ldots, v_n\}$. We have $V = W_1 \perp W_2$. Observe that $\phi|_{W_1}$ is positive definite and $\phi|_{W_2}$ is negative definite. Thus, the signature of $\phi$ is exactly $(\dim_{\mathbb{R}} W_1, \dim_{\mathbb{R}} W_2) = (r, s)$, as claimed.
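The corollary turns signature computation into an eigenvalue count, which is one line in practice. This sketch is our own (names invented); `np.linalg.eigvalsh` returns the real eigenvalues of a symmetric or Hermitian matrix.

```python
import numpy as np

def signature(A):
    """Signature (r, s) of a real symmetric (or complex Hermitian) Gram
    matrix A, via Corollary 6.2.13: count positive and negative eigenvalues."""
    evals = np.linalg.eigvalsh(A)
    return int(np.sum(evals > 0)), int(np.sum(evals < 0))

print(signature(np.diag([1.0, -1.0, -1.0, -1.0])))   # (1, 3)
print(signature(np.array([[0.0, 2.0], [2.0, 0.0]]))) # (1, 1)
```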
Recall that in Theorem ?? we saw that if $S$ and $T$ are diagonalizable, they are simultaneously diagonalizable if and only if $S$ and $T$ commute. In fact, it was noted this result can be extended to a countable collection of diagonalizable maps. This, along with the spectral theorem, immediately gives the following result.

Corollary 6.2.14. Let $\{T_i\}$ be a countable collection of normal or self-adjoint maps. Then $\{T_i\}$ is simultaneously diagonalizable if and only if the $T_i$ all commute.
We have the following elementary characterization of when a normal map whose minimal polynomial splits completely is an isometry.

Corollary 6.2.15. Let $T \in \operatorname{Hom}_F(V, V)$ be normal. Suppose $m_T(x)$ is a product of linear factors over $F$. Then $T$ is an isometry if and only if $|\lambda| = 1$ for every eigenvalue $\lambda$ of $T$.
Proof. Let $T$ be an isometry and let $v$ be an eigenvector with eigenvalue $\lambda$. Then
$$\phi(v, v) = \phi(T(v), T(v)) = \phi(\lambda v, \lambda v) = \lambda\bar{\lambda}\,\phi(v, v) = |\lambda|^2\phi(v, v).$$
Thus, $|\lambda| = 1$.

Now let $\mathcal{B} = \{v_1, \ldots, v_n\}$ be a basis of orthogonal eigenvectors with eigenvalues $\lambda_1, \ldots, \lambda_n$. Assume $|\lambda_i| = 1$ for all $i$. Pick any $v \in V$ and write $v = \sum a_i v_i$ for some $a_i \in F$. We have
$$\phi(T(v), T(v)) = \phi\Big(\sum a_i T(v_i), \sum a_j T(v_j)\Big) = \sum_{i,j} \phi(a_i \lambda_i v_i, a_j \lambda_j v_j) = \sum_{i,j} a_i \lambda_i \bar{a}_j \bar{\lambda}_j\,\phi(v_i, v_j) = \sum_j a_j \bar{a}_j |\lambda_j|^2\,\phi(v_j, v_j) = \sum_j a_j \bar{a}_j\,\phi(v_j, v_j) = \phi\Big(\sum_j a_j v_j, \sum_j a_j v_j\Big) = \phi(v, v).$$
Thus, $T$ is an isometry as desired.
We now have the very nice result that if we have a normal map then it is diagonalizable with respect to an orthonormal basis. Of course, not all maps are normal so it is natural to ask if anything can be said in the more general case. We close this chapter with a couple of results in that direction.

Theorem 6.2.16 (Schur's Theorem). Let $(V, \phi)$ be a finite dimensional inner product space and let $T \in \operatorname{Hom}_F(V, V)$. Then $V$ has an orthonormal basis $\mathcal{C}$ so that $[T]_{\mathcal{C}}$ is upper triangular if and only if the minimal polynomial $m_T(x)$ is a product of linear factors.
Proof. Suppose there is an orthonormal basis $\mathcal{C}$ such that $[T]_{\mathcal{C}}$ is upper triangular. Then $c_T(x) = \prod (x - \lambda_i)$ for $\lambda_i$ the diagonal entries. Since $m_T(x) \mid c_T(x)$ we immediately obtain $m_T(x)$ splits into a product of linear factors.

Suppose $m_T(x)$ factors as a product of linear factors. Let $W$ be a $T$-invariant subspace. We claim $W^{\perp}$ is a $T^*$-invariant subspace. To see this, observe that for any $w \in W$, $y \in W^{\perp}$ we have
$$0 = \phi(T(w), y) = \phi(w, T^*(y)).$$
Thus, $T^*(y) \in W^{\perp}$, i.e., $W^{\perp}$ is $T^*$-invariant as claimed. We now use induction on the dimension of $V$. If $\dim_F V = 1$ the result is trivial. Suppose the result is true for any vector space of dimension less than $n$. Let $\dim_F V = n$. Note that since $m_T(x)$ splits into linear factors, so does $m_{T^*}(x)$ because $m_{T^*}(x) = \overline{m_T}(x)$. Then $T^*$ has an eigenvector, say $v_n$. Scale this so $\|v_n\| = 1$. Let $W = \operatorname{span}_F\{v_n\}$. Using the fact that $W^{\perp}$ is $(T^*)^*$-invariant, $W^{\perp}$ is an $n-1$ dimensional subspace of $V$ that is a $(T^*)^*$-invariant subspace, i.e., a $T$-invariant subspace. Set $S = T|_{W^{\perp}}$. Now $m_S \mid m_T$ so $m_S$ splits into linear factors. Applying the induction hypothesis we get an orthonormal basis $\mathcal{C}_1 = \{v_1, \ldots, v_{n-1}\}$ such that $[S]_{\mathcal{C}_1}$ is upper triangular. Thus, setting $\mathcal{C} = \mathcal{C}_1 \cup \{v_n\}$, we have $[T]_{\mathcal{C}}$ is upper triangular as desired.
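Over $\mathbb{C}$ the minimal polynomial always splits, so every complex matrix admits a Schur form. In matrix terms this is $A = QTQ^*$ with $Q$ unitary and $T$ upper triangular, which SciPy computes directly; the short check below is our own illustration of the theorem.

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[1.0, 2.0], [3.0, 4.0]])
T, Q = schur(A, output='complex')               # A = Q T Q*, T upper triangular
print(np.allclose(Q @ T @ Q.conj().T, A))       # True
print(np.allclose(Q.conj().T @ Q, np.eye(2)))   # columns of Q are orthonormal
print(np.round(np.diag(T), 4))                  # eigenvalues on the diagonal
```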
In fact, it turns out we can characterize whether a map is normal by whether it can be diagonalized.

Theorem 6.2.17. Let $(V, \phi)$ be a finite dimensional inner product space and $T \in \operatorname{Hom}_F(V, V)$. Let $\mathcal{C}$ be any orthonormal basis of $V$ with $[T]_{\mathcal{C}}$ upper triangular. Then $T$ is normal if and only if $[T]_{\mathcal{C}}$ is diagonal.
Proof. Certainly if $[T]_{\mathcal{C}}$ is diagonal then $T$ is normal. So suppose $T$ is normal and set $E = [T]_{\mathcal{C}}$. Let $\mathcal{C}_1$ be a basis so that $[T]_{\mathcal{C}_1} = D$ is diagonal. Such a basis exists by the Spectral Theorem. Let $P$ be the change of basis matrix from $\mathcal{C}_1$ to $\mathcal{C}$, so $E = PDP^{-1}$. Since $\mathcal{C}$ and $\mathcal{C}_1$ are both orthonormal, we have $P$ is orthogonal in the real case and unitary in the Hermitian case, i.e., if $F = \mathbb{R}$, then ${}^tP = P^{-1}$ and if $F = \mathbb{C}$, then ${}^t\overline{P} = P^{-1}$. Using this, we have if $F = \mathbb{R}$ then
$${}^tE = {}^t(PDP^{-1}) = {}^t(PD\,{}^tP) = P\,{}^tD\,{}^tP = PD\,{}^tP = PDP^{-1} = E.$$
Since $E$ is assumed to be upper triangular, the only way it can equal its transpose is if it is actually diagonal. The same argument works in the Hermitian case as well.
6.3 Problems

For these problems $V$ and $W$ are finite dimensional $F$-vector spaces.

1. Prove the two identities used in the proof of Lemma 6.2.2. Complete the proof in the case that $\phi$ is a Hermitian form.

2. Prove the first statement in Corollary 6.2.13.

3. Let $V = \operatorname{Mat}_2(\mathbb{R})$.

(a) Show that the map $\phi: V \times V \to \mathbb{R}$ given by $(A, B) \mapsto \operatorname{Tr}(AB)$ is a bilinear form.

(b) Determine the signature of $\phi$.

(c) Determine the signature of $\phi$ on the subspace $\mathfrak{sl}_2(\mathbb{R})$.

4. Let $\phi$ be a nondegenerate bilinear form on $V$.

(a) Given any $f \in V^{\vee}$, show there is a unique vector $v \in V$ so that
$$f(w) = \phi(w, v)$$
for every $w \in V$.

(b) Find a polynomial $q \in P_2(\mathbb{R})$ so that
$$p(1/2) = \int_0^1 p(t)q(t)\,dt$$
for every $p \in P_2(\mathbb{R})$.

5. Let $V = P_2(\mathbb{R})$ and let $\phi(f, g) = \int_{-1}^1 f(x)g(x)\,dx$. Find an orthonormal basis of $V$.

6. (a) Show that if $V$ is a finite dimensional vector space over $F$ with $F$ not of characteristic 2 and $\phi$ is a symmetric bilinear form (not necessarily nondegenerate!) then $V$ has an orthogonal basis with respect to $\phi$.

(b) Let $V = \mathbb{Q}^3$. Let $\phi$ be the bilinear form represented by
$$\begin{pmatrix} 1 & 2 & 1 \\ 2 & 1 & 2 \\ 1 & 2 & 0 \end{pmatrix}$$
in the standard basis. Find an orthogonal basis of $V$ with respect to $\phi$.

7. (a) Let $\phi$ be a bilinear form on a real vector space $V$. Show there is a symmetric form $\phi_1$ and a skew-symmetric form $\phi_2$ so that $\phi = \phi_1 + \phi_2$.

(b) Let $A \in \operatorname{Mat}_n(\mathbb{R})$. Show that there is a unique symmetric matrix $B$ and a unique skew-symmetric matrix $C$ so that $A = B + C$.
128
Chapter 7
Tensor products, exterior
algebras, and the
determinant
In this chapter we will introduce tensor products and exterior algebras and use
the exterior algebra to provide a coordinate free denition of the determinant.
The rst time one sees a determinant it seems very unnatural. Viewing the
determinant from the perspective of exterior algebras makes it a very natural
object to consider. It takes consider machinery to work up to the denition, but
once it has been developed the familiar properties of the determinant essentially
fall out for free. The benet of this is that the machinery that one builds up
is useful in many other contexts as well. Unfortunately, by restricting ourselves
to tensor products over vector spaces we miss many of the more interesting and
nontrivial properties one obtains by studying tensor products of modules.
7.1 Extension of scalars

In this section we discuss a particular example of tensor products that can be useful in many situations. Namely, given an $F$-vector space $V$ and a field $K$ that contains $F$, can we form a $K$-vector space that contains a copy of $V$? Before we can do this, we need to address a basic question about forming vector spaces with a given basis.

We have seen early on that given a vector space $V$ over a field $F$, we can always find a basis for $V$. Before we can define tensor products, we need to address the very natural question of whether, given a set $X$, we can find a vector space that has $X$ as a basis. In fact, one can construct such a vector space and moreover the vector space we construct satisfies a universal property.
Theorem 7.1.1. Let $F$ be a field and $X$ a set. There is an $F$-vector space $\operatorname{VS}_F(X)$ that has $X$ as a basis. Moreover, $\operatorname{VS}_F(X)$ satisfies the following universal property. If $W$ is any $F$-vector space and $t: X \to W$ is any map of sets, there is a unique $T \in \operatorname{Hom}_F(\operatorname{VS}_F(X), W)$ so that $T(x) = t(x)$ for all $x \in X$, i.e., the following diagram commutes:
$$\begin{array}{ccc} X & \xrightarrow{\ \text{incl.}\ } & \operatorname{VS}_F(X) \\ & \searrow_{\,t} & \downarrow{\scriptstyle T} \\ & & W \end{array}$$
Proof. This result should not be surprising. In fact, the universal property essentially amounts to the fact that a linear map is determined by its image on a basis.

If $X = \emptyset$, set $\operatorname{VS}_F(X) = 0$ and we are done. If $X \neq \emptyset$, let
$$\operatorname{VS}_F(X) = \Big\{\sum_{i \in I} a_i x_i : a_i \in F,\ a_i = 0 \text{ for all but finitely many } i\Big\}$$
where $I$ is an indexing set of the same cardinality as $X$. We define
$$\sum_{i \in I} a_i x_i + \sum_{i \in I} b_i x_i = \sum_{i \in I} (a_i + b_i)x_i$$
and
$$c\Big(\sum_{i \in I} a_i x_i\Big) = \sum_{i \in I} ca_i x_i.$$
It is easy to see these both lie in $\operatorname{VS}_F(X)$ and that $\operatorname{VS}_F(X)$ is a vector space under this addition and scalar multiplication. By definition we have $X$ spans $\operatorname{VS}_F(X)$ and is linearly independent by construction, so $X$ is a basis for $\operatorname{VS}_F(X)$.

It remains to show the universal property. Let $t: X \to W$. We define $T: \operatorname{VS}_F(X) \to W$ by
$$T\Big(\sum_{i \in I} a_i x_i\Big) = \sum_{i \in I} a_i\,t(x_i).$$
This gives a well-defined linear map because $X$ is a basis. It is unique because any linear map that agrees with $t$ on $X$ must also agree with $T$ on $\operatorname{VS}_F(X)$. This gives the result.
It is important to note in the above result that we treat $X$ as strictly a set. It may be the case that $X$ is itself a vector space, but when we form $\operatorname{VS}_F(X)$ we ignore any structure that $X$ may have.

Example 7.1.2. Let $F = \mathbb{R}$ and $X = \mathbb{R}$. We know that as an $\mathbb{R}$-vector space $\mathbb{R}$ is 1-dimensional with $\{1\}$ as a basis. However, if we consider $\operatorname{VS}_{\mathbb{R}}(\mathbb{R})$ as an $\mathbb{R}$-vector space, this is infinite dimensional with every element of $\mathbb{R}$ as a basis element. For example, an element of $\operatorname{VS}_{\mathbb{R}}(\mathbb{R})$ is $2 \cdot 3 + 4 \cdot \pi$ where $3$ and $\pi$ are considered as elements of $X$. This sum does not simplify in $\operatorname{VS}_{\mathbb{R}}(\mathbb{R})$.
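A finitely supported map $X \to F$ is exactly how one would store an element of $\operatorname{VS}_F(X)$ on a computer. The following minimal Python sketch (our own; the names are invented) represents such elements as dictionaries and implements the universal map of Theorem 7.1.1:

```python
from collections import defaultdict
import math

def vs_element(*terms):
    """An element of VS_F(X), stored as a finitely supported map X -> F.
    terms are (coefficient, x) pairs; x may be any hashable object."""
    v = defaultdict(float)
    for a, x in terms:
        v[x] += a
    return dict(v)

def universal_map(t):
    """Extend a set map t : X -> W to the linear map T of Theorem 7.1.1."""
    return lambda v: sum(a * t(x) for x, a in v.items())

# As in Example 7.1.2: in VS_R(R) the element 2*[3] + 4*[pi] does not
# simplify, but the map T attached to t = id sends it to 2*3 + 4*pi.
v = vs_element((2.0, 3.0), (4.0, math.pi))
T = universal_map(lambda x: x)
print(v)      # {3.0: 2.0, 3.141592653589793: 4.0}
print(T(v))   # 18.566...
```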
We now turn our attention to the main topic of this section, extension of scalars. Let $V$ be an $F$-vector space and let $K/F$ be an extension of fields, i.e., $F \subseteq K$ and $K$ is a field. We can naturally consider $K$ as an $F$-vector space as well. When studying problems over the vector space $V$, say looking for the Jordan canonical form of a linear transformation, the field $F$ may not be big enough. In this case we would like to change the scalars of $V$ so that we are working over a bigger field. We can construct such a vector space by forming the tensor product of $V$ with $K$. Let $X = \{(a, v) : a \in K, v \in V\}$. As above, we consider this as just a set and forget the vector space structure on it. We form the $K$-vector space $\operatorname{VS}_K(X)$; elements in this space are given by $\sum c_i (a_i, v_i)$ where $c_i \in K$. This is a $K$-vector space, but it does not take into account the $F$-vector space structure of $K$ or $V$ at all. This means it is much too large to be useful for anything. We will cut this space down to something useful by taking the quotient by an appropriate subspace. We define a subspace $\operatorname{Rel}_K(X)$ of $\operatorname{VS}_K(X)$ by setting $\operatorname{Rel}_K(X)$ to be the $K$-span of the elements

1. $(a_1 + a_2, v) - (a_1, v) - (a_2, v)$ for all $a_1, a_2 \in K$, $v \in V$;

2. $(a, v_1 + v_2) - (a, v_1) - (a, v_2)$ for all $a \in K$, $v_1, v_2 \in V$;

3. $(ca, v) - (a, cv)$ for all $c \in F$, $a \in K$ and $v \in V$;

4. $a_1(a_2, v) - (a_1 a_2, v)$ for all $a_1, a_2 \in K$, $v \in V$.
Note the "Rel" stands for relations as we are using this quotient to identify some relations we are requiring be satisfied. We now consider the quotient space $K \otimes_F V = \operatorname{VS}_K(X)/\operatorname{Rel}_K(X)$. The fact that $K \otimes_F V$ is a vector space follows immediately because we have constructed it as the quotient of two vector spaces. Note the $F$ subscript indicates the original field that $K$ and $V$ are vector spaces over. Given an element $(a, v) \in \operatorname{VS}_K(X)$, we denote the equivalence class $(a, v) + \operatorname{Rel}_K(X)$ by $a \otimes v$. Observe from the construction of $K \otimes_F V$ that we have

1. $(a_1 + a_2) \otimes v = a_1 \otimes v + a_2 \otimes v$ for all $a_1, a_2 \in K$, $v \in V$;

2. $a \otimes (v_1 + v_2) = a \otimes v_1 + a \otimes v_2$ for all $a \in K$, $v_1, v_2 \in V$;

3. $ca \otimes v = a \otimes cv$ for all $c \in F$, $a \in K$ and $v \in V$;

4. $a_1(a_2 \otimes v) = (a_1 a_2) \otimes v$ for all $a_1, a_2 \in K$, $v \in V$.
It is very important to note that a typical element in $K \otimes_F V$ is of the form $\sum c_i\, a_i \otimes v_i$ where $c_i$ is 0 for all but finitely many $i$. Note we can combine the $c_i$ and $a_i$ so in this case a typical element is really just $\sum a_i \otimes v_i$ where $a_i = 0$ for all but finitely many $i$. It is a common mistake when working with tensor products to check things for elements $a \otimes v$ without remembering this is not a typical element! One other important point is that since $K \otimes_F V$ is a quotient, the elements are all equivalence classes. So one must be very careful to check things are well-defined when working with tensor products!

We have that $0 \otimes v = c \otimes 0 = 0 \otimes 0 = 0_{K \otimes_F V}$ for all $v \in V$, $c \in K$. First, note that since $0 \in K$ is also in $F$, we have $0 \otimes v = 0 \otimes 0v = 0 \otimes 0$ and $c \otimes 0 = 0c \otimes 0 = 0 \otimes 0$. Now we use uniqueness of the additive identity in a vector space and the fact that $0 \otimes 0$ is clearly an additive identity in $K \otimes_F V$.
Example 7.1.3. Let K = C, F = R and let V be an F-vector space. We consider some elements in $C \otimes_R V$ and how they simplify. We have
$$i((2 + i) \otimes v) + 6 \otimes v = (-1 + 2i) \otimes v + 6 \otimes v = (5 + 2i) \otimes v.$$
The reason we were able to combine the terms is that the term coming from V, namely v, was the same in both. Similarly, we have
$$2 \otimes v_1 + 2 \otimes v_2 = 2 \otimes (v_1 + v_2).$$
One other thing we can use to simplify is to bring a scalar that is in F = R across the tensor. So we have
$$2 \otimes v_1 + 7 \otimes v_2 = 1 \otimes 2v_1 + 1 \otimes 7v_2 = 1 \otimes (2v_1 + 7v_2).$$
However, if the first terms in the tensors lie in K = C and are unequal, and the second terms are not equal, the sum cannot be combined into one term. For instance, we have
$$(2 + i) \otimes v_1 + 3i \otimes v_2 = 2 \otimes v_1 + i \otimes v_1 + 3i \otimes v_2 = 1 \otimes 2v_1 + i \otimes v_1 + i \otimes 3v_2.$$
The only way the first two terms could be combined is if $v_1 = 0$, and for the second two one needs $v_1 = -3v_2$. This means that to combine this into one term the vectors must all be 0.
Since the original goal was to extend V to be a vector space over K, it is important to check that there is actually a subspace of $K \otimes_F V$ that is isomorphic to V as an F-vector space.

Proposition 7.1.4. Let K/F be a field extension and V an F-vector space. Then $K \otimes_F V$ contains an F-subspace isomorphic to V as an F-vector space.

Proof. Let $B = \{v_i\}$ be an F-basis of V. Define $T \in \mathrm{Hom}_F(V, K \otimes_F V)$ by setting $T(v_i) = 1 \otimes v_i$. Let W be the image of T, i.e., W is the F-span of $\{1 \otimes v_i\}$. Note that this is not a K-subspace of $K \otimes_F V$. The only thing needed to see it is actually an F-subspace is to notice that $c(1 \otimes v) = c \otimes v = 1 \otimes cv$ for $c \in F$, so W is closed under scalar multiplication by F, and $1 \otimes v_1 + 1 \otimes v_2 = 1 \otimes (v_1 + v_2)$, so it is closed under addition. It is now clear that T is a surjective linear map onto W. It only remains to see that it is injective. Suppose $T(v) = 0$ for some $v \in V$. Then we have $1 \otimes v = 0 \otimes 0$. This is equivalent to $(1, v) - (0, 0) \in \mathrm{Rel}_K(X)$. However, from the definition of $\mathrm{Rel}_K(X)$ it is clear this is only the case if $v = 0$. Thus, T is injective and so we have the result.
The K-vector space $K \otimes_F V$ is referred to as the extension of scalars of V by K. Thus, we have constructed a K-vector space that contains a copy of V as an F-subspace. The following universal property shows that we have done as well as possible, i.e., the K-vector space $K \otimes_F V$ is the smallest K-vector space that contains V as an F-subspace.
Theorem 7.1.5. Let K/F be an extension of fields, V an F-vector space, and $\iota : V \to K \otimes_F V$ given by $\iota(v) = 1 \otimes v$. Let W be a K-vector space and $t \in \mathrm{Hom}_F(V, W)$. There is a unique $T \in \mathrm{Hom}_K(K \otimes_F V, W)$ so that $t = T \circ \iota$, i.e., the following diagram commutes:
$$V \xrightarrow{\;\iota\;} K \otimes_F V \xrightarrow{\;T\;} W, \qquad t = T \circ \iota.$$
Conversely, if $T \in \mathrm{Hom}_K(K \otimes_F V, W)$ then $T \circ \iota \in \mathrm{Hom}_F(V, W)$.
Proof. Let $t \in \mathrm{Hom}_F(V, W)$. Consider the K-vector space $\mathrm{VS}_K(K \times V)$. Since W is a K-vector space, we have a map
$$K \times V \to W, \qquad (c, v) \mapsto c\,t(v).$$
This extends to a map $\widetilde{T} : \mathrm{VS}_K(K \times V) \to W$ by Theorem 7.1.1. Moreover, $\widetilde{T} \in \mathrm{Hom}_K(\mathrm{VS}_K(K \times V), W)$. It is now easy to check that $\widetilde{T}$ is 0 when restricted to $\mathrm{Rel}_K(K \times V)$. Thus, we obtain $T : K \otimes_F V \to W$ given by $T(a \otimes v) = a\,t(v)$. Observe we have
$$cT(a \otimes v) = c(a\,t(v)) = (ca)\,t(v) = T(c(a \otimes v))$$
for all $a, c \in K$ and $v \in V$. It is also easy to see that T is additive, and so $T \in \mathrm{Hom}_K(K \otimes_F V, W)$. This gives the existence of T and that the diagram commutes.
We have that $K \otimes_F V$ is given by the K-span of elements of the form $1 \otimes v$, so any K-linear map on $K \otimes_F V$ is determined by its image on these elements. Since $T(1 \otimes v) = t(v)$, we get T is uniquely determined by t. This gives the uniqueness statement.

It is now an easy exercise to check that for any $T \in \mathrm{Hom}_K(K \otimes_F V, W)$, one has $T \circ \iota \in \mathrm{Hom}_F(V, W)$.
Example 7.1.6. Let K/F be an extension of fields. In this example we show that $K \otimes_F F \cong K$ as K-vector spaces. We have a natural inclusion map $i : F \to K$. We write $\iota : F \to K \otimes_F F$ as above. From the previous result we obtain a unique K-linear map $T : K \otimes_F F \to K$ so that the following diagram commutes:
$$F \xrightarrow{\;\iota\;} K \otimes_F F \xrightarrow{\;T\;} K, \qquad i = T \circ \iota.$$
Thus, we see $T(1 \otimes x) = x$. Moreover, since T is K-linear this completely determines T, because for $\sum a_i \otimes x_i \in K \otimes_F F$ we have
$$T\Big(\sum a_i \otimes x_i\Big) = \sum T(a_i \otimes x_i) = \sum T(a_i(1 \otimes x_i)) = \sum a_i T(1 \otimes x_i).$$
Define $S : K \to K \otimes_F F$ by $S(y) = y \otimes 1$. We clearly have $S \in \mathrm{Hom}_K(K, K \otimes_F F)$ and we have
$$S \circ T(y \otimes 1) = S(y) = y \otimes 1$$
and
$$T \circ S(y) = T(y \otimes 1) = y.$$
Thus, we have $T^{-1} = S$ and so $K \otimes_F F \cong K$ as K-vector spaces.
Example 7.1.7. More generally, let K/F be an extension of fields and let V be an n-dimensional F-vector space. We claim that $K \otimes_F V \cong K^n$ as K-vector spaces. We begin by using the universal property to obtain a K-linear map from $K \otimes_F V$ to $K^n$. Let $B = \{v_1, \dots, v_n\}$ be a basis of V and define an F-linear map $t : V \to K^n$ by $t(v_i) = e_i$, where $e_i$ is the standard basis element of $K^n$. Using the universal property we obtain a K-linear map $T : K \otimes_F V \to K^n$ given by $T(1 \otimes v_i) = e_i$ for $i = 1, \dots, n$. Define a linear map $S : K^n \to K \otimes_F V$ by $S(e_i) = 1 \otimes v_i$. We know that to define a linear map we only need to specify where it sends a basis, so this gives a well-defined K-linear map. Moreover, it is easy to see that S and T are inverse maps. Thus, $K \otimes_F V \cong K^n$. Moreover, since S is an isomorphism and $\{e_1, \dots, e_n\}$ is a basis for $K^n$, we see $\{1 \otimes v_1, \dots, 1 \otimes v_n\}$ is a basis of $K \otimes_F V$.
7.2 Tensor products of vector spaces
We now turn our attention to tensor products. One motivation for defining tensor products is that they give a way to form a new vector space in which one has products of elements from the original vector spaces. We began with a fairly simple case to help motivate the more general definition. Now that we have seen a particular example of tensor products in the previous section, namely extension of scalars, we turn to the more general situation of forming the tensor product of two F-vector spaces V and W. Since the set-up is very similar to what was done in the previous section, many of the details will be left to the reader.
Let V and W be F-vector spaces. (If we allow them to be K-vector spaces as well, we will recover what was done above.) Consider $X = V \times W = \{(v, w) : v \in V, w \in W\}$ and let $\mathrm{VS}_F(V \times W)$ be the associated F-vector space as above. Let $\mathrm{Rel}_F(V \times W)$ be the subspace of $\mathrm{VS}_F(V \times W)$ given by the F-span of

1. $(v_1 + v_2, w) - (v_1, w) - (v_2, w)$ for all $v_1, v_2 \in V$, $w \in W$;
2. $(v, w_1 + w_2) - (v, w_1) - (v, w_2)$ for all $v \in V$, $w_1, w_2 \in W$;
3. $(cv, w) - (v, cw)$ for all $c \in F$, $v \in V$, $w \in W$;
4. $c(v, w) - (cv, w)$ for all $c \in F$, $v \in V$, $w \in W$.

Then, as above, we define $V \otimes_F W = \mathrm{VS}_F(V \times W)/\mathrm{Rel}_F(V \times W)$. As before, we denote the equivalence class containing $(v, w)$ by $v \otimes w$.
We have that $V \otimes_F W$ is an F-vector space with elements of the form $\sum_i c_i (v_i \otimes w_i)$. Note we can represent any such element by combining $c_i$ with the $v_i \otimes w_i$, so elements can be represented in the form $\sum v_i \otimes w_i$ for $v_i \in V$, $w_i \in W$ with only finitely many terms being nonzero. Elements of the form $v \otimes w$ are called pure tensors.
We would like a universal property similar to the one we gave above for $K \otimes_F V$. However, this requires a new type of map. We defined bilinear maps $\mathrm{Hom}_F(V, V; F)$ in Chapter 5. We now extend that definition.

Definition 7.2.1. Let V, W, and U be F-vector spaces. We say a map $t : V \times W \to U$ is an F-bilinear map (or just bilinear map if F is clear from context), and write $t \in \mathrm{Hom}_F(V, W; U)$, if it satisfies

1. $t(cv_1 + v_2, w) = c\,t(v_1, w) + t(v_2, w)$ for all $c \in F$, $v_i \in V$, $w \in W$;
2. $t(v, cw_1 + w_2) = c\,t(v, w_1) + t(v, w_2)$ for all $c \in F$, $v \in V$, $w_i \in W$.

The point of bilinear maps as opposed to linear maps is that they treat V and W as vector spaces, but they do not use the fact that we can define a vector space structure on $V \times W$. They keep V and W separate in terms of the algebraic structure; they are linear in each variable separately. This allows us to give the appropriate universal property for $V \otimes_F W$.
Theorem 7.2.2. Let U, V, and W be F-vector spaces. Define a map $\iota : V \times W \to V \otimes_F W$ by $\iota(v, w) = v \otimes w$. Then

1. $\iota \in \mathrm{Hom}_F(V, W; V \otimes_F W)$, i.e., $\iota$ is F-bilinear;
2. if $T \in \mathrm{Hom}_F(V \otimes_F W, U)$, then $T \circ \iota \in \mathrm{Hom}_F(V, W; U)$;
3. if $t \in \mathrm{Hom}_F(V, W; U)$, then there is a unique $T \in \mathrm{Hom}_F(V \otimes_F W, U)$ so that $t = T \circ \iota$.

Equivalently, we can write the correspondence by saying we have a bijection between $\mathrm{Hom}_F(V, W; U)$ and $\mathrm{Hom}_F(V \otimes_F W, U)$ so that the following diagram commutes:
$$V \times W \xrightarrow{\;\iota\;} V \otimes_F W \xrightarrow{\;T\;} U, \qquad t = T \circ \iota.$$
Proof. 1. This part follows immediately from the definition of $\iota$ and the properties of tensors.

2. We will show that $t = T \circ \iota$ is linear in the first variable; linearity in the second variable is the same argument. Let $v_1, v_2 \in V$, $w \in W$, and $c \in F$. We have
$$\begin{aligned}
t(cv_1 + v_2, w) &= T \circ \iota(cv_1 + v_2, w) \\
&= T((cv_1 + v_2) \otimes w) \\
&= T(cv_1 \otimes w + v_2 \otimes w) \\
&= cT(v_1 \otimes w) + T(v_2 \otimes w) \\
&= c\,t(v_1, w) + t(v_2, w).
\end{aligned}$$
Thus, t is linear in the first variable.

3. Let $t \in \mathrm{Hom}_F(V, W; U)$. We have that t vanishes on the elements of $\mathrm{Rel}_F(V \times W)$ by the properties of being a bilinear map. Thus, we obtain a well-defined linear map $T : V \otimes_F W \to U$ so that $T(v \otimes w) = t(v, w)$. Thus, we have a unique map $T \in \mathrm{Hom}_F(V \otimes_F W, U)$ satisfying $T \circ \iota(v, w) = T(v \otimes w) = t(v, w)$, as claimed.
We now illustrate how this universal property can be used to prove basic properties about tensor products. It is extremely powerful because it allows one to define a bilinear map on $V \times W$ and obtain a linear map on $V \otimes_F W$. The reason this is so nice is that to define a map directly on $V \otimes_F W$ one must also check the map is well-defined, which can be very tedious.
Corollary 7.2.3. Let V and W be F-vector spaces with bases $B = \{v_1, \dots, v_m\}$ and $C = \{w_1, \dots, w_n\}$ respectively. Then $\{v_i \otimes w_j\}_{1 \le i \le m,\, 1 \le j \le n}$ is a basis for $V \otimes_F W$. In particular, $\dim_F(V \otimes_F W) = \dim_F V \cdot \dim_F W$.
Proof. We show this result by showing that $V \otimes_F W \cong \mathrm{Mat}_{m,n}(F) \cong F^{mn}$. Define a map $t : V \times W \to \mathrm{Mat}_{m,n}(F) \cong F^{mn}$ by $t((v_i, w_j)) = e_{i,j}$. First, observe this is enough to define a bilinear map. Given $v \in V$ and $w \in W$, write $v = \sum_{i=1}^m a_i v_i$ and $w = \sum_{j=1}^n b_j w_j$. Then
$$t(v, w) = t\Big(\sum_{i=1}^m a_i v_i, \sum_{j=1}^n b_j w_j\Big) = \sum_{i=1}^m a_i\, t\Big(v_i, \sum_{j=1}^n b_j w_j\Big) = \sum_{i=1}^m \sum_{j=1}^n a_i b_j\, t(v_i, w_j).$$
Thus, just as it was enough to define a linear map on a basis, it is enough to specify the values of a bilinear map on the elements $(v_i, w_j)$ for $v_i \in B$ and $w_j \in C$. We now apply the universal property to obtain an F-linear map $T : V \otimes_F W \to \mathrm{Mat}_{m,n}(F)$ that satisfies $T(v_i \otimes w_j) = e_{i,j}$. We can define an F-linear map $S : \mathrm{Mat}_{m,n}(F) \to V \otimes_F W$ by $S(e_{i,j}) = v_i \otimes w_j$. It is clear that S and T are inverse maps, so we obtain $V \otimes_F W \cong \mathrm{Mat}_{m,n}(F) \cong F^{mn}$. Moreover, since $\{e_{i,j}\}$ forms a basis for $\mathrm{Mat}_{m,n}(F)$ and S is an isomorphism, we have that $\{v_i \otimes w_j\}$ is a basis of $V \otimes_F W$.
Example 7.2.4. The vector space $C \otimes_R C \cong R^4$. A basis for $C \otimes_R C$ is given by $\{1 \otimes 1,\ 1 \otimes i,\ i \otimes 1,\ i \otimes i\}$.
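In coordinates, the identification of Corollary 7.2.3 is exactly the Kronecker product: if v has coordinates $(a_1, \dots, a_m)$ and w has coordinates $(b_1, \dots, b_n)$, then $v \otimes w$ has the mn coordinates $a_i b_j$ in the basis $\{v_i \otimes w_j\}$. Here is a brief numpy sketch of ours (the basis ordering $v_1 \otimes w_1, v_1 \otimes w_2, \dots$ is our assumption):

```python
import numpy as np

# Coordinates of v in a basis {v_1, v_2, v_3} of V and of w in a basis
# {w_1, w_2} of W, so m = 3, n = 2 and dim(V tensor W) = 6.
v = np.array([1.0, 2.0, 3.0])
w = np.array([5.0, 7.0])

# Coordinates of the pure tensor v (tensor) w in the basis {v_i (tensor) w_j}.
vw = np.kron(v, w)
print(vw)  # [ 5.  7. 10. 14. 15. 21.]

# Bilinearity of (v, w) -> v (tensor) w in the first argument:
v2 = np.array([1.0, 0.0, -1.0])
assert np.allclose(np.kron(v + v2, w), np.kron(v, w) + np.kron(v2, w))
```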
The following two results follow immediately from this corollary.
Corollary 7.2.5. Let U, V and W be F-vector spaces. We have

1. $V \otimes_F W \cong W \otimes_F V$;
2. $(U \otimes_F V) \otimes_F W \cong U \otimes_F (V \otimes_F W)$.
Given F-vector spaces U, V and W, the next theorem also follows immediately from Corollary 7.2.3. However, we give a direct proof. The benefits of this are that it shows the spirit of how such things would be proven for modules, and it also allows us to easily see what the isomorphism is and show it is unique.
Theorem 7.2.6. Let U, V and W be F-vector spaces. There is a unique isomorphism
$$(U \oplus V) \otimes_F W \xrightarrow{\;\sim\;} (U \otimes_F W) \oplus (V \otimes_F W), \qquad (u, v) \otimes w \mapsto (u \otimes w, v \otimes w).$$

Proof. We will once again make use of the universal property to define the appropriate maps. Define
$$(U \oplus V) \times W \to (U \otimes_F W) \oplus (V \otimes_F W), \qquad ((u, v), w) \mapsto (u \otimes w, v \otimes w).$$
It is easy to see this map is bilinear, so the universal property gives a unique linear map
$$T : (U \oplus V) \otimes_F W \to (U \otimes_F W) \oplus (V \otimes_F W), \qquad (u, v) \otimes w \mapsto (u \otimes w, v \otimes w).$$
It now remains to define an inverse map. We begin by defining maps
$$U \times W \to (U \oplus V) \otimes_F W, \qquad (u, w) \mapsto (u, 0) \otimes w$$
and
$$V \times W \to (U \oplus V) \otimes_F W, \qquad (v, w) \mapsto (0, v) \otimes w.$$
The universal property applied to each of these maps gives
$$S_1 : U \otimes_F W \to (U \oplus V) \otimes_F W, \qquad u \otimes w \mapsto (u, 0) \otimes w$$
and
$$S_2 : V \otimes_F W \to (U \oplus V) \otimes_F W, \qquad v \otimes w \mapsto (0, v) \otimes w.$$
Combining these gives a linear map
$$S : (U \otimes_F W) \oplus (V \otimes_F W) \to (U \oplus V) \otimes_F W, \qquad (u \otimes w_1, v \otimes w_2) \mapsto (u, 0) \otimes w_1 + (0, v) \otimes w_2.$$
It now only remains to check that these are inverse maps. Since these are linear maps, it is enough to check this on pure tensors. We have
$$S \circ T((u, v) \otimes w) = S(u \otimes w, v \otimes w) = (u, 0) \otimes w + (0, v) \otimes w = (u, v) \otimes w$$
and
$$\begin{aligned}
T \circ S(u \otimes w_1, v \otimes w_2) &= T((u, 0) \otimes w_1 + (0, v) \otimes w_2) \\
&= T((u, 0) \otimes w_1) + T((0, v) \otimes w_2) \\
&= (u \otimes w_1, 0 \otimes w_1) + (0 \otimes w_2, v \otimes w_2) \\
&= (u \otimes w_1, 0) + (0, v \otimes w_2) \\
&= (u \otimes w_1, v \otimes w_2).
\end{aligned}$$
Thus, we have the desired isomorphism.
Before we can introduce the exterior product of a vector space, we need to consider multilinear maps and tensor products of finitely many vector spaces. In particular, we saw above that given F-vector spaces U, V and W, we have $U \otimes_F (V \otimes_F W) \cong (U \otimes_F V) \otimes_F W$ as F-vector spaces. Thus, it makes sense to just write $U \otimes_F V \otimes_F W$. By induction, given F-vector spaces $V_1, \dots, V_n$, it makes sense to write $V_1 \otimes_F V_2 \otimes_F \cdots \otimes_F V_n$. We will be particularly interested in the case when $V_i = V$ for all i. In this case we write $V^{\otimes n}$ for $V \otimes_F \cdots \otimes_F V$. If there is any chance of confusion we write $V^{\otimes_F n}$. We now define multilinear maps; these are the appropriate maps to consider in this context.
Definition 7.2.7. Let $V_1, \dots, V_n$ and W be F-vector spaces. A map
$$t : V_1 \times \cdots \times V_n \to W$$
is said to be multilinear if t is linear in each variable separately. We denote the set of multilinear maps by $\mathrm{Hom}_F(V_1, \dots, V_n; W)$.

Exercise 7.2.8. Show that $\mathrm{Hom}_F(V_1, \dots, V_n; W)$ is an F-vector space.
For tensor products of several vector spaces we have a universal property as well; bilinear maps are just replaced by multilinear maps.

Theorem 7.2.9. Let $V_1, \dots, V_n$ be F-vector spaces. Define
$$\iota : V_1 \times \cdots \times V_n \to V_1 \otimes_F \cdots \otimes_F V_n, \qquad (v_1, \dots, v_n) \mapsto v_1 \otimes \cdots \otimes v_n.$$
Then we have

1. For every $T \in \mathrm{Hom}_F(V_1 \otimes_F \cdots \otimes_F V_n, W)$ the map $T \circ \iota \in \mathrm{Hom}_F(V_1, \dots, V_n; W)$.
2. For every $t \in \mathrm{Hom}_F(V_1, \dots, V_n; W)$ there is a unique $T \in \mathrm{Hom}_F(V_1 \otimes_F \cdots \otimes_F V_n, W)$ so that $t = T \circ \iota$.

The proof of this theorem is left as an exercise as it follows from the same type of arguments used in the case n = 2. Note the theorem can be restated by saying there is a bijection between $\mathrm{Hom}_F(V_1, \dots, V_n; W)$ and $\mathrm{Hom}_F(V_1 \otimes_F \cdots \otimes_F V_n, W)$ so that the following diagram commutes:
$$V_1 \times \cdots \times V_n \xrightarrow{\;\iota\;} V_1 \otimes_F \cdots \otimes_F V_n \xrightarrow{\;T\;} W, \qquad t = T \circ \iota.$$
Corollary 7.2.10. Let $V_1, \dots, V_k$ be F-vector spaces of dimensions $n_1, \dots, n_k$. Let $B_i = \{v^i_1, \dots, v^i_{n_i}\}$ be a basis of $V_i$. Then $\{v^1_{i_1} \otimes \cdots \otimes v^k_{i_k}\}$ is a basis for $V_1 \otimes_F \cdots \otimes_F V_k$. In particular,
$$\dim_F(V_1 \otimes_F \cdots \otimes_F V_k) = \prod_{j=1}^k \dim_F V_j.$$

Proof. See homework problems.
7.3 Alternating forms, exterior powers, and the determinant

In this section we will define exterior powers of a vector space and see how studying these gives a very natural definition of the determinant.

Let V be a finite dimensional F-vector space and let k be a positive integer. We begin by defining the k-th exterior power of V.
Definition 7.3.1. Let $\mathcal{A}^k(V)$ be the subspace of $V^{\otimes k}$ spanned by the elements $v_1 \otimes \cdots \otimes v_k$ where $v_i = v_j$ for some $i \neq j$. The k-th exterior power of V is the quotient space
$$\textstyle\bigwedge^k(V) = V^{\otimes k}/\mathcal{A}^k(V).$$
We denote elements of $\bigwedge^k(V)$ by $v_1 \wedge \cdots \wedge v_k = v_1 \otimes \cdots \otimes v_k + \mathcal{A}^k(V)$. Note that we have $v_1 \wedge \cdots \wedge v_k = 0$ if $v_i = v_j$ for any $i \neq j$. Thus, for $v, w \in V$ we have
$$0 = (v + w) \wedge (v + w) = v \wedge v + v \wedge w + w \wedge v + w \wedge w = v \wedge w + w \wedge v.$$
This gives that in $\bigwedge^2(V)$ we have $v \wedge w = -w \wedge v$. This can be used for higher degree exterior powers as well. Namely, we have
$$v_1 \wedge \cdots \wedge v_i \wedge v_{i+1} \wedge \cdots \wedge v_k = -v_1 \wedge \cdots \wedge v_{i+1} \wedge v_i \wedge \cdots \wedge v_k.$$
This will be very useful when proving the universal property for exterior powers. Before we can state that, we need to define the appropriate maps.
Definition 7.3.2. Let V and W be F-vector spaces and let $t \in \mathrm{Hom}_F(V, \dots, V; W)$. We say t is alternating if $t(v_1, \dots, v_k) = 0$ whenever $v_i = v_{i+1}$ for some $1 \le i \le k - 1$. We denote the set of alternating maps by $\mathrm{Alt}^k_F(V; W)$. We set $\mathrm{Alt}^0_F(V; W) = F$.

Exercise 7.3.3.
1. Show that $\mathrm{Alt}^k_F(V; W)$ is an F-subspace of $\mathrm{Hom}_F(V, \dots, V; W)$.
2. Show that $\mathrm{Alt}^1_F(V; F) = \mathrm{Hom}_F(V, F) = V^*$.
3. If $\dim_F V = n$, show that $\mathrm{Alt}^k_F(V; F) = 0$ for all $k > n$.
Theorem 7.3.4. Let V and W be F-vector spaces and k a positive integer. Define a map $\iota : V \times \cdots \times V \to \bigwedge^k(V)$ by $\iota(v_1, \dots, v_k) = v_1 \wedge \cdots \wedge v_k$. Then

1. $\iota \in \mathrm{Alt}^k_F(V; \bigwedge^k(V))$, i.e., $\iota$ is alternating;
2. if $T \in \mathrm{Hom}_F(\bigwedge^k(V), W)$, then $T \circ \iota \in \mathrm{Alt}^k_F(V \times \cdots \times V; W)$;
3. if $t \in \mathrm{Alt}^k_F(V \times \cdots \times V; W)$, then there is a unique $T \in \mathrm{Hom}_F(\bigwedge^k(V), W)$ so that $t = T \circ \iota$.

Equivalently, we can write the correspondence by saying we have a bijection between $\mathrm{Alt}^k_F(V \times \cdots \times V; W)$ and $\mathrm{Hom}_F(\bigwedge^k(V), W)$ so that the following diagram commutes:
$$V \times \cdots \times V \xrightarrow{\;\iota\;} \textstyle\bigwedge^k(V) \xrightarrow{\;T\;} W, \qquad t = T \circ \iota.$$
Proof. This essentially follows from Theorem 7.2.9. To see $\iota$ is alternating, observe that it is the composition of the multilinear map used in the universal property of the tensor product with the projection onto the quotient. This gives that $\iota$ is multilinear. It vanishes on elements of the form $(v_1, \dots, v_k)$ with $v_i = v_j$ for $i \neq j$ by the definition of the exterior power. This gives the first statement.

We now show $t = T \circ \iota$ is alternating. Let $a \in F$ and $v_1, \dots, v_k, v'_1 \in V$. We have
$$\begin{aligned}
t(av_1 + v'_1, v_2, \dots, v_k) &= T((av_1 + v'_1) \wedge v_2 \wedge \cdots \wedge v_k) \\
&= T(av_1 \wedge v_2 \wedge \cdots \wedge v_k + v'_1 \wedge v_2 \wedge \cdots \wedge v_k) \\
&= aT(v_1 \wedge v_2 \wedge \cdots \wedge v_k) + T(v'_1 \wedge v_2 \wedge \cdots \wedge v_k) \\
&= a\,t(v_1, v_2, \dots, v_k) + t(v'_1, v_2, \dots, v_k).
\end{aligned}$$
This shows t is multilinear. To see it is also alternating, one just uses that $\iota$ is alternating.

Suppose now we have $t \in \mathrm{Alt}^k_F(V \times \cdots \times V; W)$. Since the alternating forms are a subset of the multilinear forms, we obtain a linear map $S : V^{\otimes k} \to W$ so that $S(v_1 \otimes \cdots \otimes v_k) = t(v_1, \dots, v_k)$. As t is alternating, S vanishes on $\mathcal{A}^k(V)$, so S descends to the quotient $\bigwedge^k(V)$, i.e., we obtain a linear map $T : \bigwedge^k(V) \to W$ with $T(v_1 \wedge \cdots \wedge v_k) = t(v_1, \dots, v_k)$. It is now easy to check this map is unique because it agrees with t, and that the diagram above commutes.
Example 7.3.5. Let V be an F-vector space with $\dim_F V = 1$. Thus, given any nonzero $v \in V$, the set $\{v\}$ forms a basis for V. Given any positive integer k, we can write elements of $\bigwedge^k(V)$ as finite sums of elements of the form $a_1 v \wedge \cdots \wedge a_k v$. However, we know $a_1 v \wedge \cdots \wedge a_k v = (a_1 \cdots a_k)\, v \wedge \cdots \wedge v$ and since $v \wedge v = 0$, we have $a_1 v \wedge \cdots \wedge a_k v = 0$ for $k \ge 2$. Thus, we have $\bigwedge^0(V) = F$, $\bigwedge^1(V) \cong V$, and $\bigwedge^k(V) = 0$ for $k \ge 2$.
Example 7.3.6. Let V be a 2-dimensional F-vector space and let $B = \{v_1, v_2\}$ be a basis. Let k be a positive integer. Elements of $\bigwedge^k(V)$ consist of finite sums of terms of the form $(a_1 v_1 + b_1 v_2) \wedge \cdots \wedge (a_k v_1 + b_k v_2)$ for $a_i, b_i \in F$. If $k \ge 3$, such a term is 0 because when expanded each summand contains a factor $v_i \wedge v_j \wedge v_l$, and since the dimension is 2 one of these factors must be repeated, giving 0. If $k = 2$, a typical term can be written in the form
$$(av_1 + bv_2) \wedge (cv_1 + dv_2) = ad\, v_1 \wedge v_2 + bc\, v_2 \wedge v_1 = (ad - bc)\, v_1 \wedge v_2.$$
Thus, in this case we have $\bigwedge^0(V) = F$, $\bigwedge^1(V) \cong V$, and $\bigwedge^2(V) = F(v_1 \wedge v_2) \cong F$.
Theorem 7.3.7. Let V be an n-dimensional F-vector space. Let $B = \{v_1, \dots, v_n\}$ be a basis of V. For $k \le n$ the vectors $\{v_{i_1} \wedge \cdots \wedge v_{i_k} : 1 \le i_1 < \cdots < i_k \le n\}$ form a basis of $\bigwedge^k(V)$. For $k > n$ we have $\bigwedge^k(V) = 0$. In particular,
$$\dim_F \textstyle\bigwedge^k(V) = \binom{n}{k}.$$

Proof. This follows easily from the fact that $\{v_{i_1} \otimes \cdots \otimes v_{i_k} : 1 \le i_j \le n\}$ is a basis for $V^{\otimes k}$. Since $\bigwedge^k(V)$ is the quotient of $V^{\otimes k}$ by $\mathcal{A}^k(V)$, to find a basis it only amounts to finding a basis of $\mathcal{A}^k(V)$. However, we clearly have that any element $v_{i_1} \otimes \cdots \otimes v_{i_k}$ with $i_{j_1} = i_{j_2}$ for some $j_1 \neq j_2$ lies in $\mathcal{A}^k(V)$. Moreover, we know the elements $v_{i_1} \wedge \cdots \wedge v_{i_k}$ can be reordered by introducing a negative sign. This gives the result.
Exercise 7.3.8. Prove the above theorem directly from Theorem 7.3.4.
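As a quick illustration of the dimension count (a small sketch of ours, assuming Python 3.8+ for math.comb):

```python
from math import comb

# dim of the k-th exterior power of an n-dimensional space is C(n, k);
# note it is 1 for k = n and 0 for every k > n.
n = 4
print([comb(n, k) for k in range(n + 2)])  # [1, 4, 6, 4, 1, 0]
```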
As was observed in the previous examples for particular cases, this theorem shows in general that for an n-dimensional F-vector space V one has that $\bigwedge^n(V)$ is of dimension 1 over F and has as a basis $\{v_1 \wedge \cdots \wedge v_n\}$ for $v_1, \dots, v_n$ a basis of V. This is the key point in defining the determinant of a linear map.
Let $T \in \mathrm{Hom}_F(V, W)$ with V and W finite dimensional vector spaces. The map T induces a map
$$T^{\otimes k} : V^{\otimes k} \to W^{\otimes k}, \qquad v_1 \otimes \cdots \otimes v_k \mapsto T(v_1) \otimes \cdots \otimes T(v_k).$$
It is easy to see that the generators of $\mathcal{A}^k(V)$ are sent to generators of $\mathcal{A}^k(W)$, so this descends to a map
$$\textstyle\bigwedge^k(T) : \bigwedge^k(V) \to \bigwedge^k(W), \qquad v_1 \wedge \cdots \wedge v_k \mapsto T(v_1) \wedge \cdots \wedge T(v_k).$$
We now restrict to the case that V = W and $\dim_F V = n$. Since $\bigwedge^n(V)$ is 1-dimensional over F, we have that $\bigwedge^n(T)$ is just multiplication by a constant. This leads to the definition of the determinant of a map.

Definition 7.3.9. Let V be an F-vector space with $\dim_F V = n$. Let $T \in \mathrm{Hom}_F(V, V)$. We define the determinant of T, denoted $\det(T)$, to be the constant so that $\bigwedge^n(T)(v) = (\det(T))v$ for all $v \in \bigwedge^n(V)$. Given a matrix $A \in \mathrm{Mat}_n(F)$, we define the determinant of A to be the determinant of the associated linear map $T_A$.
One should note that it is clear from this definition of the determinant that there is no dependence on the choice of a basis of V, as none was used in defining the determinant.
Lemma 7.3.10. Let $S, T \in \mathrm{Hom}_F(V, V)$. Then $\det(T \circ S) = \det(T)\det(S)$.

Proof. Let $v_1 \wedge \cdots \wedge v_n \in \bigwedge^n(V)$ be any nonzero element (so a basis). We have
$$\begin{aligned}
\det(T \circ S)\, v_1 \wedge \cdots \wedge v_n &= \textstyle\bigwedge^n(T \circ S)(v_1 \wedge \cdots \wedge v_n) \\
&= T \circ S(v_1) \wedge \cdots \wedge T \circ S(v_n) \\
&= T(S(v_1)) \wedge \cdots \wedge T(S(v_n)) \\
&= \textstyle\bigwedge^n(T)(S(v_1) \wedge \cdots \wedge S(v_n)) \\
&= \textstyle\bigwedge^n(T)\big(\textstyle\bigwedge^n(S)(v_1 \wedge \cdots \wedge v_n)\big) \\
&= \textstyle\bigwedge^n(T)(\det(S)\, v_1 \wedge \cdots \wedge v_n) \\
&= \det(S) \textstyle\bigwedge^n(T)(v_1 \wedge \cdots \wedge v_n) \\
&= \det(S)\det(T)\, v_1 \wedge \cdots \wedge v_n.
\end{aligned}$$
This gives the result.
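A quick numerical sanity check of the lemma (a sketch of ours, assuming numpy; floating point, so the comparison is up to rounding):

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.standard_normal((4, 4))
T = rng.standard_normal((4, 4))

# det(T S) = det(T) det(S), as in Lemma 7.3.10.
assert np.isclose(np.linalg.det(T @ S), np.linalg.det(T) * np.linalg.det(S))
```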
Of course, for this to be useful we want to show this is the same determinant that was defined in undergraduate linear algebra. Before showing this in general we check the case where V has dimension 2.
Example 7.3.11. Let $V = F^2$. Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and let $T_A$ be the associated linear map. Thus, we have
$$\begin{aligned}
\textstyle\bigwedge^2(T_A)(e_1 \wedge e_2) &= T_A(e_1) \wedge T_A(e_2) \\
&= (ae_1 + ce_2) \wedge (be_1 + de_2) \\
&= ab\, e_1 \wedge e_1 + ad\, e_1 \wedge e_2 + cb\, e_2 \wedge e_1 + cd\, e_2 \wedge e_2 \\
&= ad\, e_1 \wedge e_2 + bc\, e_2 \wedge e_1 \\
&= (ad - bc)\, e_1 \wedge e_2.
\end{aligned}$$
Thus, we see $\det(A) = ad - bc$, as one expects from undergraduate linear algebra.
Exercise 7.3.12. Let $A \in \mathrm{Mat}_3(F)$. Show that the definition of $\det(A)$ given here matches the definition from undergraduate linear algebra.
We now want to prove in general that given a matrix $A \in \mathrm{Mat}_n(F)$, the value $\det(A)$ is the same as the value from undergraduate linear algebra. Note that we can view A as an element of $F^n \times \cdots \times F^n$ with each column of A a vector in $F^n$. Thus, we can view
$$\det : F^n \times \cdots \times F^n \to F.$$
Theorem 7.3.13. The determinant function is in $\mathrm{Alt}^n(F^n; F)$ and satisfies $\det(1_n) = 1$.
Proof. We begin by checking linearity in the first variable; the other variables follow from the same argument. Write
$$w_1 = a_{11} e_1 + \cdots + a_{n1} e_n, \quad \dots, \quad w_n = a_{1n} e_1 + \cdots + a_{nn} e_n$$
and $w = b_{11} e_1 + \cdots + b_{n1} e_n$ for $e_1, \dots, e_n$ the standard basis of $F^n$. We want to show that for any $c \in F$ we have
$$\det(w_1 + cw, w_2, \dots, w_n) = \det(w_1, w_2, \dots, w_n) + c \det(w, w_2, \dots, w_n).$$
To do this, we need to translate this into a statement about linear maps. Define $T_1 : F^n \to F^n$ by $T_1(e_i) = w_i$, so
$$[T_1]_{\mathcal{E}_n} = A_1 = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix};$$
$T_2 : F^n \to F^n$ by $T_2(e_1) = w$ and $T_2(e_j) = w_j$ for $2 \le j \le n$, so
$$[T_2]_{\mathcal{E}_n} = A_2 = \begin{pmatrix} b_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ b_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix};$$
and $T_3 : F^n \to F^n$ by $T_3(e_1) = w_1 + cw$ and $T_3(e_j) = w_j$ for $2 \le j \le n$, so
$$[T_3]_{\mathcal{E}_n} = A_3 = \begin{pmatrix} a_{11} + cb_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{n1} + cb_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}.$$
Now observe we have
$$\det(w_1, w_2, \dots, w_n) = \det(T_1), \quad \det(w, w_2, \dots, w_n) = \det(T_2), \quad \det(w_1 + cw, w_2, \dots, w_n) = \det(T_3).$$
Thus, we just need to show that $\det(T_3) = \det(T_1) + c \det(T_2)$. We have
$$\begin{aligned}
\det(T_3)\, e_1 \wedge \cdots \wedge e_n &= \textstyle\bigwedge^n(T_3)(e_1 \wedge e_2 \wedge \cdots \wedge e_n) \\
&= T_3(e_1) \wedge T_3(e_2) \wedge \cdots \wedge T_3(e_n) \\
&= (w_1 + cw) \wedge w_2 \wedge \cdots \wedge w_n \\
&= w_1 \wedge \cdots \wedge w_n + c\, w \wedge w_2 \wedge \cdots \wedge w_n \\
&= T_1(e_1) \wedge \cdots \wedge T_1(e_n) + c\, T_2(e_1) \wedge \cdots \wedge T_2(e_n) \\
&= \textstyle\bigwedge^n(T_1)(e_1 \wedge \cdots \wedge e_n) + c \textstyle\bigwedge^n(T_2)(e_1 \wedge \cdots \wedge e_n) \\
&= \det(T_1)(e_1 \wedge \cdots \wedge e_n) + c \det(T_2)(e_1 \wedge \cdots \wedge e_n) \\
&= (\det(T_1) + c \det(T_2))(e_1 \wedge \cdots \wedge e_n).
\end{aligned}$$
This gives $\det \in \mathrm{Hom}_F(F^n, \dots, F^n; F)$. Next we show that det is alternating. Suppose that $w_i = w_j$ for some $i \neq j$. Then we have
$$\textstyle\bigwedge^n(T_1)(e_1 \wedge \cdots \wedge e_n) = T_1(e_1) \wedge \cdots \wedge T_1(e_n) = w_1 \wedge \cdots \wedge w_n = 0.$$
Thus,
$$\det(w_1, \dots, w_n) = \det(T_1) = 0.$$
This gives $\det \in \mathrm{Alt}^n(F^n; F)$. It is easy to see that $\det(1_n) = 1$.
Not only is det in $\mathrm{Alt}^n(F^n; F)$ satisfying $\det(1_n) = 1$, it is the only such map. This takes a little work to prove, but it is the key to seeing this is actually the same map as the undergraduate version. This will follow immediately from observing that the undergraduate definition certainly takes $1_n$ to 1 and is in $\mathrm{Alt}^n(F^n; F)$. To see the undergraduate definition is in $\mathrm{Alt}^n(F^n; F)$, one only needs to recall that the determinant is preserved under elementary column operations. We need a few results before proving the uniqueness.
Let n be a positive integer and let $S_n$ denote the collection of bijections from $\{1, \dots, n\}$ to itself. It is easy to see this is a group under composition of functions. Given a tuple $(x_1, \dots, x_n)$ and $\sigma \in S_n$, we define
$$\sigma(x_1, \dots, x_n) = (x_{\sigma(1)}, \dots, x_{\sigma(n)}).$$
Note this tuple could be a tuple of anything: variables, vectors, numbers, etc. Define
$$\Delta = \prod_{1 \le i < j \le n} (x_i - x_j).$$
For example, if n = 3 we have
$$\Delta = (x_1 - x_2)(x_1 - x_3)(x_2 - x_3).$$
We have
$$\sigma\Delta = \prod_{1 \le i < j \le n} (x_{\sigma(i)} - x_{\sigma(j)}).$$
Since $\sigma$ just permutes the values $1, \dots, n$, we have $\sigma\Delta = \pm\Delta$, where the sign depends on $\sigma$. We define $\mathrm{sign}(\sigma) \in \{\pm 1\}$ by
$$\sigma\Delta = \mathrm{sign}(\sigma)\Delta.$$
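This definition can be computed directly: applying $\sigma$ to the points $1, \dots, n$ flips the sign of one factor of $\Delta$ for each pair $i < j$ with $\sigma(i) > \sigma(j)$ (an inversion), so $\mathrm{sign}(\sigma) = (-1)^{\#\text{inversions}}$. Here is a hypothetical helper of ours (0-indexed, as is natural in Python):

```python
from itertools import combinations

def sign(sigma):
    """Sign of a permutation of {0, ..., n-1}, given as a tuple or list.

    Each pair i < j with sigma[i] > sigma[j] (an "inversion") flips the
    sign of one factor (x_i - x_j) of Delta, so sign = (-1)^(#inversions).
    """
    inversions = sum(1 for i, j in combinations(range(len(sigma)), 2)
                     if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

print(sign((0, 1, 2)))  # 1, the identity
print(sign((1, 0, 2)))  # -1, a single transposition
print(sign((1, 2, 0)))  # 1, a 3-cycle (two transpositions)
```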
Lemma 7.3.14. Let $t \in \mathrm{Alt}^k(V; F)$. Then
$$t(v_1, \dots, v_i, v_{i+1}, \dots, v_k) = -t(v_1, \dots, v_{i-1}, v_{i+1}, v_i, v_{i+2}, \dots, v_k).$$
Proof. Define
$$\varphi(x, y) = t(v_1, \dots, v_{i-1}, x, y, v_{i+2}, \dots, v_k)$$
for fixed $v_1, \dots, v_{i-1}, v_{i+2}, \dots, v_k$. It is enough to show that $\varphi(x, y) = -\varphi(y, x)$. Since $t \in \mathrm{Alt}^k(V; F)$ we have
$$\varphi(x + y, x + y) = 0.$$
Expanding this and noting that $\varphi(x, x) = \varphi(y, y) = 0$, we obtain
$$\varphi(x, y) = -\varphi(y, x).$$
We leave the proof of the following lemma to the homework problems.

Lemma 7.3.15. Let $t \in \mathrm{Alt}^k(V; F)$.

1. For each $\sigma \in S_k$,
$$t(v_{\sigma(1)}, \dots, v_{\sigma(k)}) = \mathrm{sign}(\sigma)\, t(v_1, \dots, v_k).$$
2. If $v_i = v_j$ for any $i \neq j$, then $t(v_1, \dots, v_k) = 0$.
3. If $v_i$ is replaced by $v_i + cv_j$ for any $i \neq j$, $c \in F$, then the value of t is unchanged.
The following proposition is the key step needed to show that det is unique.

Proposition 7.3.16. Let $t \in \mathrm{Alt}^n(V; F)$. Assume for some $v_1, \dots, v_n \in V$, $w_1, \dots, w_n \in V$, and $a_{ij} \in F$ we have
$$w_1 = a_{11} v_1 + \cdots + a_{n1} v_n, \quad \dots, \quad w_n = a_{1n} v_1 + \cdots + a_{nn} v_n.$$
Then
$$t(w_1, \dots, w_n) = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma)\, a_{\sigma(1)1} \cdots a_{\sigma(n)n}\, t(v_1, \dots, v_n).$$
Proof. We expand $t(w_1, \dots, w_n)$ using multilinearity. Note that any of the terms with equal entries vanish since t is alternating, so we are left with terms of the form
$$a_{i_1 1} \cdots a_{i_n n}\, t(v_{i_1}, \dots, v_{i_n})$$
where the $i_1, \dots, i_n$ run over $1, \dots, n$ and are all distinct. Such tuples are in bijective correspondence with $S_n$, so we can write each such term as
$$a_{\sigma(1)1} \cdots a_{\sigma(n)n}\, t(v_{\sigma(1)}, \dots, v_{\sigma(n)})$$
for a unique $\sigma \in S_n$. By Lemma 7.3.15 we have $t(v_{\sigma(1)}, \dots, v_{\sigma(n)}) = \mathrm{sign}(\sigma)\, t(v_1, \dots, v_n)$, and summing over $\sigma$ gives the result.
Corollary 7.3.17. The determinant function defined above is the unique map in $\mathrm{Alt}^n(F^n; F)$ that satisfies $\det(1_n) = 1$.

Proof. We have already seen that det satisfies the conditions; it only remains to show it is the unique function that does so. Let $\mathcal{E}_n = \{e_1, \dots, e_n\}$ be the standard basis of $F^n$. As above, we can identify elements of $\mathrm{Mat}_n(F)$ with $F^n \times \cdots \times F^n$ via column vectors. Thus, the identity matrix is identified with $(e_1, \dots, e_n)$. Let $A \in \mathrm{Mat}_n(F)$ with column vectors $v_1, \dots, v_n$, i.e.,
$$v_1 = a_{11} e_1 + \cdots + a_{n1} e_n, \quad \dots, \quad v_n = a_{1n} e_1 + \cdots + a_{nn} e_n.$$
We now apply the previous proposition to see
$$\det(A) = \det(v_1, \dots, v_n) = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma)\, a_{\sigma(1)1} \cdots a_{\sigma(n)n}\, \det(e_1, \dots, e_n) = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma)\, a_{\sigma(1)1} \cdots a_{\sigma(n)n}.$$
Moreover, given any $t \in \mathrm{Alt}^n(F^n; F)$ with $t(1_n) = 1$ we see
$$t(v_1, \dots, v_n) = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma)\, a_{\sigma(1)1} \cdots a_{\sigma(n)n}.$$
Thus, we have the result.
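The closing formula $\det(A) = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma)\, a_{\sigma(1)1} \cdots a_{\sigma(n)n}$ can be implemented verbatim; here is a sketch of ours (exact integer arithmetic, with numpy used only to cross-check; the sign helper counts inversions as before):

```python
from itertools import combinations, permutations
import numpy as np

def sign(sigma):
    inversions = sum(1 for i, j in combinations(range(len(sigma)), 2)
                     if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

def det_leibniz(A):
    """det(A) = sum over sigma in S_n of sign(sigma) * prod_i A[sigma(i)][i]."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = sign(sigma)
        for i in range(n):
            term *= A[sigma[i]][i]
        total += term
    return total

A = [[2, 0, 1],
     [1, 3, 0],
     [0, 1, 4]]
print(det_leibniz(A))                           # 25
print(np.linalg.det(np.array(A, dtype=float)))  # 25.0 up to rounding
```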
Appendix A

Groups, rings, and fields: a quick recap

It is expected that students reading these notes have some experience with abstract algebra. For instance, an undergraduate course in abstract algebra is more than sufficient. A good source for those beginning in abstract algebra is [2] and a more advanced resource is [1]. We do not attempt to give a comprehensive treatment here; we present only the basic definitions and some examples to refresh the reader's memory.
Definition A.0.1. Let G be a nonempty set and $\star : G \times G \to G$ a binary operation. We call $(G, \star)$ a group if it satisfies the following properties:

- for all $g_1, g_2, g_3 \in G$ we have $(g_1 \star g_2) \star g_3 = g_1 \star (g_2 \star g_3)$;
- there exists an element $e_G \in G$, referred to as the identity element of G, so that $g \star e_G = g = e_G \star g$ for all $g \in G$;
- for each $g \in G$ there exists an element $g^{-1}$, referred to as the inverse of g, so that $g \star g^{-1} = e_G = g^{-1} \star g$.

We say a group is abelian if for all $g_1, g_2 \in G$ we have $g_1 \star g_2 = g_2 \star g_1$.

Often the operation will be clear from context, so we will write our groups as G instead of $(G, \star)$.
Exercise A.0.2. Let G be a group. Show that inverses are unique, i.e., if given $g \in G$ there exists $h \in G$ so that $g \star h = e_G = h \star g$, then $h = g^{-1}$.
Definition A.0.3. Let $(G, \star)$ be a group and $H \subseteq G$. We say H is a subgroup of $(G, \star)$ if $(H, \star)$ is a group. One generally denotes that H is a subgroup of G by $H \le G$.
Example A.0.4. Let G = C and $\star = +$. It is easy to check that this gives an abelian group. In fact, it is easy to see that $Z \le Q \le R \le C$. Note the operation is addition for all of these.
Example A.0.5. Let G = Z and let $\star$ be multiplication. This does not give a group, as $2 \in Z$ but the multiplicative inverse of 2, namely 1/2, is not in Z.

Example A.0.6. Let $G = R^\times = \{x \in R : x \neq 0\}$ with $\star$ being multiplication. This gives an abelian group.
Example A.0.7. Let X be a set of n elements. Let $S_n$ denote the set of bijections from X to X, i.e., the permutations of X. This forms a group under composition of functions.
Example A.0.8. Let $\mathrm{Mat}_n(R)$ denote the n by n matrices with entries in R. This forms a group under matrix addition, but not under matrix multiplication, as the 0 matrix, the matrix with all 0s as entries, does not have a multiplicative inverse.

Example A.0.9. Let $\mathrm{GL}_n(R)$ be the subset of $\mathrm{Mat}_n(R)$ consisting of invertible matrices. This gives a group under matrix multiplication. Let $\mathrm{SL}_n(R)$ be the subset of $\mathrm{GL}_n(R)$ consisting of matrices with determinant 1. This is a subgroup of $\mathrm{GL}_n(R)$.
Example A.0.10. Let n be a positive integer and consider the set $H = nZ = \{nm : m \in Z\}$. Note that $H \le (Z, +)$. We can form a quotient group in this case as follows. We define an equivalence relation on Z by setting $a \sim b$ if $n \mid (a - b)$. One should check this is in fact an equivalence relation. This partitions Z into equivalence classes. For example, given $a \in Z$, the equivalence class containing a is given by $[a]_n = \{b \in Z : n \mid (a - b)\}$. We write $a \equiv b \pmod{n}$ if a and b lie in the same equivalence class, i.e., if $[a]_n = [b]_n$. Note that for each $a \in Z$, there is an element $r \in \{0, 1, \dots, n - 1\}$ so that $[a]_n = [r]_n$. This follows immediately from the division algorithm: write $a = nq + r$ with $0 \le r \le n - 1$. Write
$$Z/nZ = \{[0]_n, [1]_n, \dots, [n-1]_n\}.$$
We define addition on this set by setting $[a]_n + [b]_n = [a + b]_n$. One can easily check this is well-defined and makes Z/nZ into an abelian group with n elements. We drop the subscript n on the equivalence classes when it is clear from context.

Consider the case that n = 4. We have $Z/4Z = \{[0], [1], [2], [3]\}$. An addition table is given here:
+ [0] [1] [2] [3]
[0] [0] [1] [2] [3]
[1] [1] [2] [3] [0]
[2] [2] [3] [0] [1]
[3] [3] [0] [1] [2]
Example A.0.11. Let $n \in Z$ and consider the subset of Z/nZ consisting of those [a] so that $\gcd(a, n) = 1$. We denote this set by $(Z/nZ)^\times$. Recall that if $\gcd(a, n) = 1$ then there exist $x, y \in Z$ so that $ax + ny = 1$. This means that $n \mid (ax - 1)$, i.e., $ax \equiv 1 \pmod{n}$. If $[a], [b] \in (Z/nZ)^\times$, then $[ab] \in (Z/nZ)^\times$ as well. One can now check that if we set $[a] \star [b] = [ab]$, then this makes $(Z/nZ)^\times$ into an abelian group. Note that $(Z/nZ)^\times$ is not a subgroup of Z/nZ even though it is a subset; they do not have the same operations.
Exercise A.0.12. (a) How many elements are in $(Z/nZ)^\times$?

(b) Give multiplication tables for $(Z/4Z)^\times$ and $(Z/5Z)^\times$. Do you notice anything interesting about these tables?
As is always the case, the relevant maps to study are the ones that preserve
the structure being studied. For groups these are group homomorphisms.
Definition A.0.13. Let $(G, \star_G)$ and $(H, \star_H)$ be groups. A map $\phi : G \to H$ is a group homomorphism if it satisfies $\phi(g_1 \star_G g_2) = \phi(g_1) \star_H \phi(g_2)$ for all $g_1, g_2 \in G$. We define the kernel of the map $\phi$ by
$$\ker \phi = \{g \in G : \phi(g) = e_H\}.$$
Exercise A.0.14. (a) Show that $\ker \phi \le G$.

(b) Show that if $\phi : G \to H$ is a group homomorphism, then $\phi$ is injective if and only if $\ker \phi = \{e_G\}$.
Example A.0.15. Let $n \in Z$ be a positive integer and define $\phi : Z \to Z/nZ$ by $\phi(a) = [a]_n$. This is easily seen to be a surjective group homomorphism. Moreover, one has that $\ker \phi = nZ$.
Example A.0.16. Let $\phi : R \to C^\times$ be defined by $\phi(x) = e^{2\pi i x}$. Given $x, y \in R$ we have $\phi(x + y) = e^{2\pi i (x+y)} = e^{2\pi i x} e^{2\pi i y} = \phi(x)\phi(y)$. Thus, $\phi$ is a group homomorphism. The image of this group homomorphism is $S^1 = \{z \in C : |z| = 1\}$, i.e., the unit circle. The kernel is given by Z. For those that remember the first isomorphism theorem, this gives $R/Z \cong S^1$.
Example A.0.17. The determinant map gives a group homomorphism from $\mathrm{GL}_n(R)$ to $R^\times$ with kernel $\mathrm{SL}_n(R)$.

As one may have observed, the first examples of groups given were familiar objects such as Z and R with only one operation considered. It is natural to consider both operations for such objects, which brings us to the definition of a ring.
Definition A.0.18. A ring is a nonempty set R with two binary operations, + and $\cdot$, satisfying the following properties:

- R is an abelian group under the operation + (the additive identity is denoted $0_R$);
- there is an element $1_R$ that satisfies $r \cdot 1_R = r = 1_R \cdot r$ for all $r \in R$;
- for all $r_1, r_2, r_3 \in R$ one has $(r_1 \cdot r_2) \cdot r_3 = r_1 \cdot (r_2 \cdot r_3)$;
- for all $r_1, r_2, r_3 \in R$ one has $(r_1 + r_2) \cdot r_3 = r_1 \cdot r_3 + r_2 \cdot r_3$;
- for all $r_1, r_2, r_3 \in R$ one has $r_1 \cdot (r_2 + r_3) = r_1 \cdot r_2 + r_1 \cdot r_3$.

We say R is commutative if $r_1 \cdot r_2 = r_2 \cdot r_1$ for all $r_1, r_2 \in R$.
It is often the case that one writes $r_1 \cdot r_2$ as $r_1 r_2$ to ease notation. We will follow that convention here when there is no fear of confusion.

Note that what we have defined as a ring is often referred to as a ring with identity. This is because in some cases one would like to consider rings without requiring that there be an element $1_R$. We will not be interested in such rings, so in these notes ring always means ring with identity.
Example A.0.19. Many of the examples given above are also rings. The groups Z, Q, R, and C are all commutative rings when considered with the operations + and $\cdot$.

Example A.0.20. Let R be a ring. The group $\mathrm{Mat}_n(R)$ is a ring under the operations of matrix addition and multiplication. However, if $n \ge 2$ this is not a commutative ring.

Example A.0.21. Let R be a ring and let R[x] denote the set of polynomials with coefficients in R. The set R[x] is a ring under polynomial addition and multiplication.
Given a ring R, a subring of R is a nonempty subset S that is a ring under the same operations. However, it turns out that the concept of ideals is more important than the notion of subrings.

Definition A.0.22. Let R be a ring and $I \subseteq R$ be a subring. We say I is an ideal if $ra \in I$ and $ar \in I$ for all $r \in R$ and $a \in I$.

Note this is a stronger condition than being a subring. For example, Z is a subring of R but it is not an ideal.
Example A.0.23. Consider the ring Z. For any integer n we have that nZ is an ideal in Z. In fact, every ideal can be written in this form. Let I be an ideal in Z. If I = Z then clearly we have I = 1Z. Suppose I is a proper subset of Z and $I \neq \{0\}$. Let $I^+$ be the subset of I consisting of positive elements. Since $I \neq \{0\}$ this is a nonempty set, because if $m \in I$, either $m \in I^+$ or $-m \in I^+$. Since $I^+$ is a set of positive integers, it has a minimal element; call this minimal element n. We claim I = nZ. Let $m \in I$. Then we can write $m = nq + r$ for some $0 \le r \le n - 1$. Since $m \in I$ and $n \in I$, we have $r = m - nq \in I$. However, this contradicts n being the least positive integer in I unless r = 0. Thus, $m \in nZ$ and so I = nZ. We call an ideal generated by a single element a principal ideal. A ring where all the ideals are principal is called a principal ideal domain. Thus, Z is a principal ideal domain.

Exercise A.0.24. Let F be a field. Mimic the proof that Z is a principal ideal domain to show F[x] is a principal ideal domain.
Example A.0.25. Consider the polynomial ring R[x] with R a ring. Let $f(x) \in R[x]$ be any nonzero polynomial. We define an equivalence relation on R[x] much as we did on Z when defining Z/nZ, namely, we say $g(x) \sim h(x)$ if $f(x) \mid (g(x) - h(x))$. We write $g(x) \equiv h(x) \pmod{f(x)}$ if $g(x) \sim h(x)$. This equivalence relation partitions R[x] into equivalence classes. Given $g(x) \in R[x]$, we let $[g(x)]_{f(x)}$ denote the equivalence class containing g(x), i.e., $[g(x)]_{f(x)} = \{h(x) \in R[x] : h(x) \equiv g(x) \pmod{f(x)}\}$. We drop the subscript f(x) when it is clear from context. We denote the set of equivalence classes by $R[x]/(f(x)) = \{[g(x)] : g(x) \in R[x]\}$. Note that R[x]/(f(x)) need not be a finite set. In this case one can use the division algorithm to see that one can always choose a representative r(x) for the equivalence class with $\deg r(x) < \deg f(x)$. We define addition and multiplication on R[x]/(f(x)) by $[g(x)] + [h(x)] = [g(x) + h(x)]$ and $[g(x)] \cdot [h(x)] = [g(x)h(x)]$. This makes R[x]/(f(x)) into a ring. This ring is commutative if R is commutative.
Exercise A.0.26. Consider the ring $R[x]/(x^2 + 1)$. Show that one has
$$R[x]/(x^2 + 1) = \{[ax + b] : a, b \in R\}.$$
What is $[x]^2$ equal to in this ring? What familiar ring does this look like?

Exercise A.0.27. Let R be a ring and $I \subseteq R$ an ideal. Define a relation on R by setting $a \sim b$ if $a - b \in I$. Show this is an equivalence relation. For $a \in R$, let $a + I$ denote the equivalence class containing a, i.e., $a + I = \{a + i : i \in I\}$. (Note this is also sometimes denoted [a].) Let R/I denote the set of these equivalence classes. Define $(a + I) + (b + I) = (a + b) + I$ and $(a + I)(b + I) = ab + I$. Show this makes R/I into a ring. Reconcile this with the notation Z/nZ and R[x]/(f(x)) used above.
While rings are extremely important if one wants to consider the more general results of linear algebra that are phrased in terms of modules, for the main results in these notes we stick with vector spaces. Thus, we are mainly interested in fields.

Definition A.0.28. A commutative ring R is said to be a field if every nonzero element in R has a multiplicative inverse.

Exercise A.0.29. Show the only ideals in a field F are $\{0\}$ and F.

Example A.0.30. The rings Q, R and C are all fields, but Z is not a field.
Example A.0.31. Let n be a positive integer and recall the group Z/nZ with the operation given by +. Given $[a], [b] \in Z/nZ$, we define $[a] \cdot [b] = [ab]$. One can easily check this makes Z/nZ into a commutative ring. Now consider the case when n = p is a prime. We still clearly have that Z/pZ is a commutative ring, but in this case we have more. Let $[a] \in Z/pZ$ with $[a] \neq [0]$, i.e., $p \nmid a$. Since p is a prime this means that $\gcd(a, p) = 1$, so there exist $x, y \in Z$ so that $ax + py = 1$, i.e., $ax \equiv 1 \pmod{p}$. This means that $[a][x] = [1]$, i.e., [a] has a multiplicative inverse. Thus, Z/pZ is a field. (Note this does not work for general n; look back at your multiplication table for Z/4Z.) When we wish to think of Z/pZ as a field we denote it by $\mathbb{F}_p$.
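The inverse-finding step in this example is effectively the extended Euclidean algorithm; here is a sketch of ours in Python (the built-in pow(a, -1, p) of Python 3.8+ computes the same thing):

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y = g."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def inverse_mod(a, p):
    """Multiplicative inverse of [a] in Z/pZ, assuming gcd(a, p) = 1."""
    g, x, _ = extended_gcd(a % p, p)
    if g != 1:
        raise ValueError("[a] is not invertible mod p")
    return x % p

print(inverse_mod(3, 7))  # 5, since 3 * 5 = 15 = 2 * 7 + 1
print(pow(3, -1, 7))      # 5 as well (built-in, Python 3.8+)
```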
Exercise A.0.32. Let F be a field and $f(x) \in F[x]$ an irreducible polynomial. Show that F[x]/(f(x)) is a field. (Hint: mimic the proof that Z/pZ is a field.)

Exercise A.0.33. Consider the ring $C[x]/(x^2 + 1)$. Is this a field? Prove your answer.
Definition A.0.34. Let R and S be rings. A map $\phi : R \to S$ is a ring homomorphism if it satisfies

- $\phi(r_1 +_R r_2) = \phi(r_1) +_S \phi(r_2)$ for all $r_1, r_2 \in R$;
- $\phi(r_1 \cdot_R r_2) = \phi(r_1) \cdot_S \phi(r_2)$ for all $r_1, r_2 \in R$.

We define the kernel of $\phi$ by
$$\ker \phi = \{r \in R : \phi(r) = 0_S\}.$$
Exercise A.0.35. Let $\phi : R \to S$ be a ring homomorphism. Show $\ker \phi$ is an ideal of R.

Exercise A.0.36. Let F be a field and R a ring. Let $\phi : F \to R$ be a ring homomorphism and assume $\phi$ is not the zero map, i.e., there exists $x \in F$ so that $\phi(x) \neq 0_R$. Prove that $\phi$ is an injective map.

Example A.0.37. Let $S \subseteq R$ be a subring. Then the inclusion map $S \to R$ is a ring homomorphism with trivial kernel.

Example A.0.38. Let R be a ring and $I \subseteq R$ an ideal. The natural projection map
$$R \to R/I, \qquad a \mapsto a + I$$
is a surjective ring homomorphism with kernel equal to I.

Example A.0.39. Define $\phi : R[x] \to C$ by $f(x) \mapsto f(i)$. This is a ring homomorphism with kernel $(x^2 + 1)$.

Exercise A.0.40. Define a map $\phi : Z/6Z \to Z/6Z$ by $\phi([a]) = [4a]$. What is the kernel of this map? What is the image of this map?
Bibliography

[1] D. Dummit and R. Foote. Abstract Algebra. John Wiley and Sons, Inc., 3rd edition, 2004.

[2] T. Hungerford. Abstract Algebra: An Introduction, volume 3. Cengage Learning, 2012.

[3] R. Larson. Elementary Linear Algebra. Houghton Mifflin, Boston, 6th edition, 2009.