Nigel Boston
University of Wisconsin - Madison
THE PROOF OF
FERMAT'S LAST
THEOREM
Spring 2003
INTRODUCTION.
This book will describe the recent proof of Fermat's Last Theorem by Andrew Wiles,
aided by Richard Taylor, for graduate students and faculty with a reasonably broad
background in algebra. It is hard to give precise prerequisites, but a first course
in graduate algebra, covering basic groups, rings, and fields, together with a
passing acquaintance with number rings and varieties should suffice. Algebraic
number theory (or arithmetical geometry, as the subject is more commonly called
these days) has the habit of taking last year's major result and making it
background taken for granted in this year's work. Peeling back the layers can lead
to a maze of results stretching back over the decades.
I attended Wiles' three groundbreaking lectures, in June 1993, at the Isaac Newton
Institute in Cambridge, UK. After returning to the US, I attempted to give a seminar
on the proof to interested students and faculty at the University of Illinois,
Urbana-Champaign. Endeavoring to be complete required several lectures early on
regarding the existence of a model over $\mathbf{Q}$ for the modular curve $X_0(N)$
with good reduction at primes not dividing $N$. This work hinged on earlier work of
Zariski from the 1950s. The audience, keen to learn new material, did not appreciate
lingering over such details and dwindled rapidly in numbers.
Since then, I have taught the proof in two courses at UIUC, a two-week summer
workshop at UIUC (with the help of Chris Skinner of the University of Michigan), and
most recently a course in spring 2003 at the University of Wisconsin - Madison. To
avoid getting bogged down as in the above seminar, it is necessary to assume some
background. In these cases, references will be provided so that the interested
students can fill in details for themselves. The aim of this work is to convey the
strong and simple line of logic on which the proof rests. It is certainly well
within the ability of most graduate students to appreciate the way the building
blocks of the proof go together to give the result, even though those blocks may
themselves be hard to penetrate. If anything, this book should serve as an
inspiration for students to see why the tools of modern arithmetical geometry are
valuable and to seek to learn more about them.
An interested reader wanting a simple overview of the proof should consult Gouvêa
[13], Ribet [25], Rubin and Silverberg [26], or my article [1]. A much more detailed
overview of the proof is the one given by Darmon, Diamond, and Taylor [6], and the
Boston conference volume [5] contains much useful elaboration on ideas used in the
proof. The Séminaire Bourbaki article by Oesterlé and Serre [22] is also very
enlightening. Of course, one should not overlook the original proof itself [38], [34].
CONTENTS.
Introduction.
Contents.
Chapter 1: History and overview.
Chapter 2: Profinite groups, complete local rings.
Chapter 3: Infinite Galois groups, internal structure.
Chapter 4: Galois representations from elliptic curves, modular forms, group schemes.
Chapter 5: Invariants of Galois representations, semistable representations.
Chapter 6: Deformations of Galois representations.
Chapter 7: Introduction to Galois cohomology.
Chapter 8: Criteria for ring isomorphisms.
Chapter 9: The universal modular lift.
Chapter 10: The minimal case.
Chapter 11: The general case.
Chapter 12: Putting it together, the final trick.
1
History and Overview
It is well-known that there are many solutions in integers to $x^2 + y^2 = z^2$, for
instance (3, 4, 5), (5, 12, 13). The Babylonians were aware of the solution
(4961, 6480, 8161) as early as around 1500 B.C. Around 1637, Pierre de Fermat wrote
a note in the margin of his copy of Diophantus' Arithmetica stating that
$x^n + y^n = z^n$ has no solutions in positive integers if $n > 2$. We will denote
this statement for $n$ by $(\mathrm{FLT})_n$. He claimed to have a remarkable proof.
There is some doubt about this for various reasons. First, this remark was published
without his consent, in fact by his son after his death. Second, in his later
correspondence, Fermat discusses the cases $n = 3, 4$ with no reference to this
purported proof. It seems likely then that this was an off-the-cuff comment that
Fermat simply omitted to erase. Of course $(\mathrm{FLT})_n$ implies
$(\mathrm{FLT})_{\alpha n}$, for any positive integer $\alpha$, and so it suffices
to prove $(\mathrm{FLT})_4$ and $(\mathrm{FLT})_\ell$ for each prime $\ell > 2$.
4
). QED
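Lemma 1.4 Suppose $\lambda \nmid xy$, $\lambda \mid z$. Then $\lambda^2 \mid z$.
Proof: Consider again $\pm 1 \pm 1 \equiv u z^3 \pmod{\lambda^4}$. If the left side
is 0, then $\lambda^4 \mid z^3$, so $\lambda^2 \mid z$. If the left side is $\pm 2$,
then $\lambda \mid 2$, contradicting $|A/(\lambda)| = 3$. QED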
1.2 Proof of $(\mathrm{FLT})_3$
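Lemma 1.5 Suppose that $\lambda \nmid xy$, $\lambda^k \,\|\, z$, $k \ge 2$. Then there
exists a solution with $\lambda \nmid xy$, $\lambda^{k-1} \,\|\, z$.
Proof: In this case, the gcd of any 2 factors on the left is $\lambda$. Hence we can
assume that
$$(1)\quad x + y = u_1 \lambda^t \alpha^3$$
$$(2)\quad x + \zeta y = u_2 \lambda \beta^3$$
$$(3)\quad x + \zeta^2 y = u_3 \lambda \gamma^3,$$
where $u_1$, $u_2$, and $u_3$ are units, $t = 3k - 2$, and
$\lambda \nmid \alpha, \beta, \gamma$. Forming $(1) + \zeta(2) + \zeta^2(3)$ yields
(setting $x_1 = \beta$, $y_1 = \gamma$, and $z_1 = \lambda^{k-1}\alpha$)
$$x_1^3 + \epsilon_1 y_1^3 = \epsilon_2 z_1^3$$
with $\epsilon_1$, $\epsilon_2$ units. Reducing mod $\lambda^2$, we get
$$\pm 1 \pm \epsilon_1 \equiv 0,$$
which implies that $\epsilon_1 = \pm 1$. Replacing $y_1$ by $-y_1$ if necessary, we get
$$x_1^3 + y_1^3 = \epsilon_2 z_1^3.$$
QED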
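Finally, to prove the theorem, if $\lambda \nmid xyz$, we use lemma 1.2. If
$\lambda \nmid xy$ but $\lambda \mid z$, we use lemmas 1.3 and 1.4. If
$\lambda \mid x$ then $\lambda \nmid yz$, hence mod $\lambda^3$
$$0 \equiv \pm 1 \pm u,$$
which implies that $u \equiv \pm 1 \pmod{\lambda^3}$, and hence $u = \pm 1$.
Rearranging yields
$$(\pm z)^3 + (-y)^3 = x^3,$$
a case which has already been treated. QED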
1.3 Further Efforts at Proof
Peter Dirichlet and Adrien Legendre proved $(\mathrm{FLT})_5$ around 1825, and
Gabriel Lamé proved $(\mathrm{FLT})_7$ around 1839. If we set $\zeta = e^{2\pi i/\ell}$
($\ell$ prime), and
$$\mathbf{Z}[\zeta] = \{a_0 + a_1\zeta + \ldots + a_{\ell-2}\zeta^{\ell-2} : a_i \in \mathbf{Z}\},$$
then there are cases when $\mathbf{Z}[\zeta]$ is not a UFD and the factorization
method used above fails. (In fact, $\mathbf{Z}[\zeta]$ is a UFD if and only if
$\ell \le 19$.)
It turns out that the method can be resuscitated under weaker conditions. In 1844
Ernst Kummer began studying the ideal class group of $\mathbf{Q}(\zeta)$, which is a
finite group that measures how far $\mathbf{Z}[\zeta]$ is from being a UFD [33].
Between 1847 and 1853, he published some masterful papers, which established almost
the best possible result along these lines and were only really bettered by the
recent approach detailed below, which began over 100 years later. In these papers,
Kummer defined regular primes and proved the following theorem, where
$h(\mathbf{Q}(\zeta))$ denotes the order of the ideal class group.
Definition 1.6 Call a prime $\ell$ regular if $\ell \nmid h(\mathbf{Q}(\zeta))$
(where $\zeta = e^{2\pi i/\ell}$). Otherwise, $\ell$ is called irregular.
Remark 1.7 The first irregular prime is 37 and there are infinitely many irregular
primes. It is not known if there are infinitely many regular primes, but
conjecturally this is so.
Theorem 1.8 (Kummer) (i) $(\mathrm{FLT})_\ell$ holds if $\ell$ is regular.
(ii) $\ell$ is regular if and only if $\ell$ does not divide the numerator of $B_i$
for any even $2 \le i \le \ell - 3$.
Here $B_n$ are the Bernoulli numbers defined by
$$\frac{x}{e^x - 1} = \sum_{n \ge 0} (B_n/n!)\, x^n.$$
For instance, the fact that $B_{12} = -\frac{691}{2730}$ shows that 691 is irregular.
We shall see the number 691 appearing in many different places.
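Kummer's criterion is easy to test numerically. The following is a minimal sketch, not part of the original argument: it computes Bernoulli numbers exactly from the standard recurrence $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ ($m \ge 1$) and lists the irregular primes below 100. The function names are our own illustrative choices.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] for the convention x/(e^x - 1) = sum B_n x^n / n!."""
    B = [Fraction(0)] * (n_max + 1)
    B[0] = Fraction(1)
    for m in range(1, n_max + 1):
        # Recurrence: sum_{j=0}^{m} C(m+1, j) B_j = 0, solved for B_m.
        B[m] = -sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1)
    return B


def is_prime(n):
    """Trial division; adequate for the small range used here."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def is_regular(ell, B):
    """Kummer's criterion for an odd prime ell: ell divides no numerator
    of B_i for even i with 2 <= i <= ell - 3 (B must reach index ell - 3)."""
    return all(B[i].numerator % ell != 0 for i in range(2, ell - 2, 2))


B = bernoulli_numbers(100)
irregular = [p for p in range(3, 100) if is_prime(p) and not is_regular(p, B)]

print(B[12])       # the value -691/2730 quoted in the text
print(irregular)   # irregular primes below 100
```

Exact rational arithmetic (`Fraction`) is used because the criterion concerns numerators in lowest terms, which floating point cannot represent reliably.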
Here the study of FLT is divided into two cases. The first case involves showing
that there is no solution with $\ell \nmid xyz$. The idea is to factor
$x^\ell + y^\ell = z^\ell$ as
$$(x + y)(x + \zeta y) \cdots (x + \zeta^{\ell-1} y) = z^\ell,$$
where $\zeta = e^{2\pi i/\ell}$. The ideals generated by the factors on the left side
are pairwise relatively prime by the assumption that $\ell \nmid xyz$ (since
$\lambda := 1 - \zeta$ has norm $\ell$ - compare the proof of $(\mathrm{FLT})_3$),
whence the ideal generated by each factor is the $\ell$th power of an ideal. The
regularity assumption then shows that each of these ideals is itself principal. We
also use that for any unit $u$ in $\mathbf{Z}[\zeta]$, $\zeta^s u$ is real for some
$s \in \mathbf{Z}$. See [33] or [16] for more details.
The second case involves showing that there is no solution to FLT for
$\ell \mid xyz$.
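In 1823, Sophie Germain found a simple proof that if $\ell$ is a prime with
$2\ell + 1$ a prime then the first case of $(\mathrm{FLT})_\ell$ holds. Arthur
Wieferich proved in 1909 that if $\ell$ is a prime with
$2^{\ell-1} \not\equiv 1 \pmod{\ell^2}$ then the first case of $(\mathrm{FLT})_\ell$
holds. Examples of $\ell$ that fail this are rare - the only known examples are 1093
and 3511. Moreover, similar criteria are known if
$p^{\ell-1} \not\equiv 1 \pmod{\ell^2}$ and $p$ is any prime $\le 89$ [15]. This
allows one to prove the first case of $(\mathrm{FLT})_\ell$ for many $\ell$.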
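Both conditions can be checked by direct computation. The sketch below (our own illustration, with hypothetical helper names) lists the odd primes $\ell < 50$ for which $2\ell + 1$ is also prime, and searches below 4000 for primes with $2^{\ell-1} \equiv 1 \pmod{\ell^2}$; the search recovers the two exceptional primes 1093 and 3511 mentioned above.

```python
def is_prime(n):
    """Trial division; adequate for the small range used here."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


# Odd primes ell < 50 with 2*ell + 1 prime (Sophie Germain's hypothesis).
germain = [ell for ell in range(3, 50) if is_prime(ell) and is_prime(2 * ell + 1)]

# Primes ell < 4000 with 2^(ell-1) congruent to 1 modulo ell^2.
# Three-argument pow performs modular exponentiation without huge integers.
exceptional = [
    ell for ell in range(3, 4000)
    if is_prime(ell) and pow(2, ell - 1, ell * ell) == 1
]

print(germain)
print(exceptional)
```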
Before Andrew Wiles, (FLT)
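$$\Delta = q \prod_{n=1}^{\infty} (1 - q^n)^{24} = \sum_{n=1}^{\infty} \tau(n) q^n.$$
Then
$$\tau(n) \equiv \sigma_{11}(n) \pmod{691},$$
where
$$\sigma_k(n) = \sum_{d \mid n} d^k.$$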
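The congruence modulo 691 can be verified for small $n$ by expanding the product as a truncated power series. This is a sketch under our own naming (`tau_values`, `sigma`), intended only as a numerical sanity check and not as part of any proof.

```python
def tau_values(n_max):
    """Coefficients tau(1), ..., tau(n_max) of q * prod_{n>=1} (1 - q^n)^24."""
    size = n_max  # we need prod (1 - q^n)^24 up to degree n_max - 1
    poly = [0] * size
    poly[0] = 1
    for n in range(1, size):
        for _ in range(24):
            # Multiply in place by (1 - q^n); descending k keeps old values intact.
            for k in range(size - 1, n - 1, -1):
                poly[k] -= poly[k - n]
    # The extra factor q shifts indices: tau(m) is the coefficient of q^(m-1).
    return {m: poly[m - 1] for m in range(1, n_max + 1)}


def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)


tau = tau_values(30)
congruence_holds = all((tau[n] - sigma(11, n)) % 691 == 0 for n in tau)

print([tau[n] for n in range(1, 6)])
print(congruence_holds)
```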
$\Delta$ is a modular form; this means that, if we set $q = e^{2\pi i z}$, $\Delta$
satisfies (among other conditions)
$$\Delta\left(\frac{az+b}{cz+d}\right) = (cz+d)^k \Delta(z)$$
for all $z$ in the upper half-plane $\mathrm{Im}(z) > 0$ and all
$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$ with, in this case,
(weight) $k = 12$ and $\Gamma = \mathrm{SL}_2(\mathbf{Z})$ (in general, we define a
level $N$ by having $\Gamma$ defined as the group of matrices in
$\mathrm{SL}_2(\mathbf{Z})$ such that $N \mid c$; here $N = 1$). For instance,
setting $a, b, d = 1$, $c = 0$, $\Delta(z + 1) = \Delta(z)$, and this is why $\Delta$
can be written as a Fourier series in $q = e^{2\pi i z}$.
1.4 Modern Methods of Proof
Due to work of André Weil in the 1940s and John Tate in the 1950s, the study of
elliptic curves, that is curves of the form $y^2 = g(x)$, where $g$ is a cubic with
distinct roots, led to the study of Galois representations, i.e. continuous
homomorphisms $\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{GL}_2(R)$, where
$R$ is a complete local ring such as the finite field $\mathbf{F}_\ell$. In
particular, given an elliptic curve $E$ defined over $\mathbf{Q}$ (meaning the
coefficients of $g$ are in $\mathbf{Q}$), and any rational prime $\ell$, there exist
associated Galois representations
$\rho_{\ell,E} : \mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{GL}_2(\mathbf{Z}_\ell)$
and (by reduction mod $\ell$)
$\bar{\rho}_{\ell,E} : \mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{GL}_2(\mathbf{F}_\ell)$.
These encode much information about the curve.
A conjecture of Jean-Pierre Serre associates to a certain kind of modular form $f$
(cuspidal eigenforms) and to a rational prime $\ell$ a Galois representation,
$\rho_{\ell,f}$. All known congruences for $\tau$ follow from a systematic study of
the representations associated to $\Delta$. This conjecture was proved by Pierre
Deligne [7] (but note that he really only wrote the details for $\Delta$ - extensive
notes of Brian Conrad http://www.math.lsa.umich.edu/~bdconrad/bc.ps can be used to
fill in details here) in 1969 for weights $k > 2$. For $k = 2$ it follows from
earlier work of Martin Eichler and Goro Shimura [31]. For $k = 1$ it was later
established by Deligne and Serre [8].
These representations $\rho_{\ell,f}$ share many similarities with the
representations $\rho_{\ell,E}$. Formalizing this, a conjecture of Yutaka Taniyama
of 1955, later put on a solid footing by Shimura, would attach a modular form of
this kind to each elliptic curve over $\mathbf{Q}$. Thus, we have the following picture
Repns from elliptic curves
[
Repns from certain modular forms Admissible Galois representations
In 1985, Gerhard Frey presented a link with FLT. If we assume that $a$, $b$, $c$ are
positive integers with $a^\ell + b^\ell = c^\ell$, and consider the elliptic curve
$y^2 = x(x - a^\ell)(x + b^\ell)$ (called a Frey curve), this curve is unlikely to
be modular, in the sense that $\rho_{\ell,E}$ turns out to have properties that a
representation associated to a modular form should not.
The Shimura-Taniyama conjecture, however, states that any given elliptic curve is
modular. That is, given $E$, defined over $\mathbf{Q}$, we consider its L-function
$L(E, s) = \sum a_n/n^s$. This conjecture states that $\sum a_n q^n$ is a modular
form. Equivalently, every $\rho_{\ell,E}$ is a $\rho_{\ell,f}$ for some modular
form $f$.
In 1986, Kenneth Ribet (building on ideas of Barry Mazur) showed that these Frey
curves are definitely not modular. His strategy was to show that if the Frey curve
is associated to a modular form, then it is associated to one of weight 2 and level
2. No cuspidal eigenforms of this kind exist, giving the desired contradiction.
Ribet's approach (completed by Fred Diamond and others) establishes in fact that the
weak conjecture below implies the strong conjecture (the implication being the
so-called $\epsilon$-conjecture). The strong conjecture would imply many results -
unfortunately, no way of tackling this is known.
Serre's weak conjecture [30] says that all Galois representations
$\rho : \mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{GL}_2(k)$ with $k$ a
finite field, and such that $\det(\rho(c)) = -1$, where $c$ denotes a complex
conjugation (this condition is the definition of $\rho$ being odd), come from
modular forms.
Serre's strong conjecture [30] states that $\rho$ comes from a modular form of a
particular type $(k, N, \epsilon)$ with $k$, $N$ positive integers (the weight and
level, met earlier) and $\epsilon : (\mathbf{Z}/N\mathbf{Z})$
$\pi_{ii} = \mathrm{Id}$, and that $\pi_{jk} \pi_{ij} = \pi_{ik}$.
Example: Index the normal subgroups of finite index by $I$ as above. Setting
$G_i = G/N_i$, and $\pi_{ij} : G_i \to G_j$ to be the natural quotient map whenever
$i \ge j$, we get an inverse system of groups.
2.1 Profinite Groups
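We now form a new category, whose objects are pairs $(H, \{\phi_i : i \in I\})$,
where $H$ is a group and each $\phi_i : H \to G_i$ is a group homomorphism, with the
property that the triangle formed by $\phi_i$, $\phi_j$ and the map
$\pi_{ij} : G_i \to G_j$ of the inverse system commutes, that is,
$$\pi_{ij} \circ \phi_i = \phi_j$$
whenever $i \ge j$. Given two elements $(H, \phi_i)$ and $(J, \psi_i)$, we define a
morphism between them to be a group homomorphism $\theta : H \to J$ such that the
triangle formed by $\theta$, $\phi_i$ and $\psi_i$ commutes, that is,
$$\psi_i \circ \theta = \phi_i$$
for all $i \in I$.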
Example: Continuing our earlier example, $(G, \phi_i)$ is an object of the new
category, where $\phi_i : G \to G/N_i$ is the natural quotient map.
Definition 2.3 $\varprojlim_{i \in I} G_i$ is the terminal object in the new
category, called the inverse limit of the $G_i$. That is, $\varprojlim G_i$ is the
unique object $(X, \chi_i)$ such that given any object $(H, \phi_i)$ there is a
unique morphism
$$(H, \phi_i) \to (X, \chi_i).$$
The existence of a terminal object in this category will be proved below, after the
next example.
Example: Continuing our earlier example, the group above, $X$, is the profinite
completion $\hat{G}$ of $G$. Since $\hat{G}$ is terminal, there is a unique group
homomorphism $G \to \hat{G}$. If this is an isomorphism then we say that $G$ is
profinite (or complete). For instance, it will be shown below that Gal(
[Diagram: maps from $G$ to the quotients $G/N_i$ and $G/N_j$.]
$\hat{G}$ is an isomorphism, so that $\hat{G}$ is profinite/complete.
We return to the general case and we now need to prove the existence of
$\varprojlim_{i \in I} G_i$. To do this, let
$$C = \prod_{i \in I} G_i,$$
and $\pi_i : C \to G_i$ be the $i$th projection. Let
$X = \{c \in C \mid \pi_{ij}(\pi_i(c)) = \pi_j(c) \ \forall i \ge j\}$. We claim that
$$\varprojlim_{i \in I} G_i = (X, \pi_i|_X).$$
Proof: (i) $(X, \pi_i|_X)$ is an object in the new category, since $X$ is a group
(check!) and
$$\pi_{ij} \circ \pi_i|_X = \pi_j|_X$$
for all $i \ge j$ (by construction).
(ii) Given any $(H, \phi_i)$ in the new category, define
$\phi(h) = (\phi_i(h))_{i \in I}$, and check that this is a group homomorphism
$\phi : H \to X$ such that
$$\pi_i|_X \circ \phi = \phi_i$$
for all $i \in I$, and that $\phi$ is forced to be the unique such map. QED
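Example: Let $G = \mathbf{Z}$ and let us describe $\hat{G} = \hat{\mathbf{Z}}$. The
finite quotients of $G$ are $G_i = \mathbf{Z}/i$, and $i \ge j$ means that
$j \mid i$. Hence
$$\hat{\mathbf{Z}} = \{(a_1, a_2, a_3, \ldots) \mid a_i \in \mathbf{Z}/i \text{ and } a_i \equiv a_j \pmod{j} \text{ whenever } j \mid i\}.$$
Then for $a \in \mathbf{Z}$, the map $a \mapsto (a, a, a, \ldots) \in \hat{\mathbf{Z}}$
is a homomorphism of $\mathbf{Z}$ into $\hat{\mathbf{Z}}$.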
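To make the compatibility condition concrete, here is a small sketch (our own illustration; the function names are hypothetical) that truncates an element of $\hat{\mathbf{Z}}$ to its first few coordinates, computes the image of an integer, and checks the condition $a_i \equiv a_j \pmod{j}$ whenever $j \mid i$.

```python
def image_in_Zhat(a, depth):
    """First `depth` coordinates (a mod 1, a mod 2, ..., a mod depth) of an integer a."""
    return [a % i for i in range(1, depth + 1)]


def is_compatible(coords):
    """Check a_i = a_j (mod j) whenever j divides i, for a truncated sequence."""
    n = len(coords)
    return all(
        (coords[i - 1] - coords[j - 1]) % j == 0
        for i in range(1, n + 1)
        for j in range(1, i + 1)
        if i % j == 0
    )


coords = image_in_Zhat(-1, 12)
print(coords)                  # coordinates of -1 in Z/1, ..., Z/12
print(is_compatible(coords))   # the image of an integer is always compatible

# Perturbing a single coordinate generally destroys compatibility.
bad = coords.copy()
bad[5] = (bad[5] + 1) % 6      # change the coordinate in Z/6
print(is_compatible(bad))
```

A finite truncation can only illustrate the condition; an actual element of $\hat{\mathbf{Z}}$ is an infinite compatible sequence, and most such sequences do not come from an integer.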
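Now consider $\bar{\mathbf{F}}_p = \bigcup_n \mathbf{F}_{p^n}$. Then, if $m \mid n$,
the restriction maps
$$\phi_n : \mathrm{Gal}(\bar{\mathbf{F}}_p/\mathbf{F}_p) \to \mathrm{Gal}(\mathbf{F}_{p^n}/\mathbf{F}_p) \cong \mathbf{Z}/n, \qquad \phi_m : \mathrm{Gal}(\bar{\mathbf{F}}_p/\mathbf{F}_p) \to \mathrm{Gal}(\mathbf{F}_{p^m}/\mathbf{F}_p) \cong \mathbf{Z}/m$$
satisfy $\pi_{nm} \circ \phi_n = \phi_m$, where
$\pi_{nm} : \mathrm{Gal}(\mathbf{F}_{p^n}/\mathbf{F}_p) \to \mathrm{Gal}(\mathbf{F}_{p^m}/\mathbf{F}_p)$
is again restriction.
Note that $\mathrm{Gal}(\mathbf{F}_{p^n}/\mathbf{F}_p)$ is generated by the Frobenius
automorphism $\mathrm{Fr} : x \mapsto x^p$. We see that
$(\mathrm{Gal}(\bar{\mathbf{F}}_p/\mathbf{F}_p), \phi_n)$ is an object in the new
category corresponding to the inverse system. Thus there is a map
$\mathrm{Gal}(\bar{\mathbf{F}}_p/\mathbf{F}_p) \to \hat{\mathbf{Z}}$.
We claim that this map is an isomorphism, so that
$\mathrm{Gal}(\bar{\mathbf{F}}_p/\mathbf{F}_p)$ is profinite. This follows from our
next result.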
Theorem 2.4 Let $L/K$ be a (possibly infinite) separable, algebraic Galois
extension. Then $\mathrm{Gal}(L/K) \cong \varprojlim \mathrm{Gal}(L_i/K)$, where the
limit runs over all finite Galois subextensions $L_i/K$.
Proof: We have restriction maps
$$\phi_i : \mathrm{Gal}(L/K) \to \mathrm{Gal}(L_i/K), \qquad \phi_j : \mathrm{Gal}(L/K) \to \mathrm{Gal}(L_j/K),$$
compatible with the restriction $\mathrm{Gal}(L_i/K) \to \mathrm{Gal}(L_j/K)$,
whenever $L_j \subseteq L_i$, i.e. $i \ge j$. We use the projection maps to form an
inverse system, so, as before, $(\mathrm{Gal}(L/K), \phi_i)$ is an object of the new
category and we get a group homomorphism
$$\phi : \mathrm{Gal}(L/K) \to \varprojlim \mathrm{Gal}(L_i/K).$$
We claim that $\phi$ is an isomorphism.
(i) Suppose $1 \ne g \in \mathrm{Gal}(L/K)$. Then there is some $x \in L$ such that
$g(x) \ne x$. Let $L_i$ be the Galois (normal) closure of $K(x)$. This is a finite
Galois extension of $K$, and $1 \ne g|_{L_i} = \phi_i(g)$, which yields that
$1 \ne \phi(g)$. Hence $\phi$ is injective.
(ii) Take $(g_i) \in \varprojlim \mathrm{Gal}(L_i/K)$ - this means that
$L_j \subseteq L_i \Rightarrow g_i|_{L_j} = g_j$. Then define
$g \in \mathrm{Gal}(L/K)$ by $g(x) = g_i(x)$ whenever $x \in L_i$. This is a
well-defined field automorphism and $\phi(g) = (g_i)$. Thus $\phi$ is surjective. QED
For the rest of this section, we assume that the groups $G_i$ are all finite (as,
for example, in our running example). Endow the finite $G_i$ in our inverse system
with the discrete topology. $G_i$ is certainly a totally disconnected Hausdorff
space. Since these properties are preserved under taking products and subspaces,
$\varprojlim G_i \subseteq \prod G_i$ is Hausdorff and totally disconnected as well.
Furthermore $\prod G_i$ is compact by Tychonoff's theorem.
Exercise: If $f, g : A \to B$ are continuous ($A$, $B$ topological spaces) with $A$,
$B$ Hausdorff, then $\{x \mid f(x) = g(x)\}$ is closed.
Deduce that
$$\varprojlim G_i = \bigcap_{i \ge j} \left\{ c \in \prod G_i : \pi_{ij}(\pi_i(c)) = \pi_j(c) \right\}$$
is closed in $\prod G_i$, therefore is compact. In summary, $\varprojlim G_i$ is a
compact, Hausdorff, totally disconnected topological space.
Exercise: The natural inclusion $\mathbf{Z} \to \hat{\mathbf{Z}}$ maps $\mathbf{Z}$
onto a dense subgroup. In fact, for any group $G$, its image in $\hat{G}$ is dense,
but the kernel of $G \to \hat{G}$ need not be trivial. The kernel is trivial if and
only if $G$ is residually finite (meaning that the intersection of all its subgroups
of finite index is trivial).
If we denote by Fr the element of $\mathrm{Gal}(\bar{\mathbf{F}}_q/\mathbf{F}_q)$
given by $\mathrm{Fr}(x) = x^q$, i.e. the Frobenius automorphism, then Fr does not
generate the Galois group, but the group which it does generate is dense (by the
last exercise), and so we say that $\mathrm{Gal}(\bar{\mathbf{F}}_q/\mathbf{F}_q)$ is
topologically finitely generated by one element Fr (and so is procyclic).
2.2 Complete Local Rings
We now carry out the same procedure with rings rather than groups and so define
certain completions of them. Let $R$ be a commutative ring with identity 1, $I$ any
ideal of $R$. For $i \ge j$ we have a natural quotient map
$$\pi_{ij} : R/I^i \to R/I^j.$$
These rings and maps form an inverse system (now of rings). Proceeding as in the
previous section, we can form a new category. Then the same proof gives that there
is a unique terminal object, $R_I = \varprojlim_i R/I^i$, which is now a ring,
together with a unique ring homomorphism $R \to R_I$, such that the natural maps
$R \to R/I^i$ factor as $R \to R_I \to R/I^i$, compatibly with the maps $\pi_{ij}$.
Note that $R_I$ depends on the ideal chosen. It is called the $I$-adic completion of
$R$ (do not confuse it with the localization of $R$ at $I$). We call $R$
($I$-adically) complete if the map $R \to R_I$ is an isomorphism. Then $R_I$ is
complete. If $I$ is a maximal ideal $\mathfrak{m}$, then we check that $R_I$ is
local, i.e. has a unique maximal ideal, namely $\mathfrak{m}$
$\mathbf{Z}_p$, the product being over all rational primes.
Exercise: Show that the ideals of Z
p
are precisely 0 and p
i
Z
k
Id
k
where the vertical maps are the natural projections. This is equivalent to requiring
that $\phi(\mathfrak{m}_R) \subseteq \mathfrak{m}_S$, where $\mathfrak{m}_R$
(respectively $\mathfrak{m}_S$) is the maximal ideal of $R$ (respectively $S$).
As an example, if $k = \mathbf{F}_\ell$, then $\mathbf{Z}_\ell$ is an object of
$\mathcal{C}$. By a theorem of Cohen [2], the objects of $\mathcal{C}$ are of the form
$$W(k)[[T_1, \ldots, T_m]]/(\text{ideal}),$$
where $W(k)$ is a ring called the ring of infinite Witt vectors over $k$ (see
chapter 3 for an explicit description of it). Thus $W(k)$ is the initial object of
$\mathcal{C}$. In the case of $\mathbf{F}_p$, $W(\mathbf{F}_p) = \mathbf{Z}_p$.
Exercise: If $R$ is a ring that is $I$-adically complete, then
$\mathrm{GL}_n(R) \cong \varprojlim_i \mathrm{GL}_n(R/I^i)$ (the maps
$\mathrm{GL}_n(R/I^i) \to \mathrm{GL}_n(R/I^j)$ being the natural ones).
Note that the topology on $R$ induces the product topology on $M_n(R)$ and thence
the subspace topology on $\mathrm{GL}_n(R)$.
The Big Picture. We shall seek to use continuous group homomorphisms (Galois
representations)
$$\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{GL}_n(R),$$
where $R$ is in some $\mathcal{C}$, to parametrize the homomorphisms that elliptic
curves and modular forms naturally produce. In this chapter we have constructed
these groups and rings and explained their topologies. Next, we study the internal
structure of both sides, notably certain important subgroups of the left side. This
will give us the means to characterize Galois representations in terms of their
effect on these subgroups.
3
Infinite Galois Groups: Internal Structure
We begin with a short investigation of $\mathbf{Z}_p$. A good reference for this
chapter is [28].
We first check that $p^n\mathbf{Z}_p$ is the kernel of the map
$\mathbf{Z}_p \to \mathbf{Z}/p^n$. Hence, we have
$$\mathbf{Z}_p \supset p\mathbf{Z}_p \supset p^2\mathbf{Z}_p \supset \cdots$$
If $x \in p^n\mathbf{Z}_p - p^{n+1}\mathbf{Z}_p$, then we say that the valuation of
$x$ is $v(x) = n$. Set $v(0) = \infty$.
Exercise: $x$ is a unit in $\mathbf{Z}_p$ if and only if $v(x) = 0$.
Corollary 3.1 Every $x \in \mathbf{Z}_p - \{0\}$ can be uniquely written as
$p^{v(x)} u$ where $u$ is a unit.
In fact
$(*)$ : (1) $v(xy) = v(x) + v(y)$, (2) $v(x + y) \ge \min(v(x), v(y))$.
We define a metric on $\mathbf{Z}_p$ as follows: set $d(x, y) = c^{v(x-y)}$ for a
fixed $0 < c < 1$, $x \ne y \in \mathbf{Z}_p$ ($d(x, x) = 0$). We have something
stronger than the triangle inequality, namely
$d(x, z) \le \max(d(x, y), d(y, z))$ for all $x, y, z \in \mathbf{Z}_p$. This has
unusual consequences such as that every triangle is isosceles and every point in an
open unit disc is its center.
The metric and profinite topologies then agree, since $p^n\mathbf{Z}_p$ is a base of
open neighborhoods of 0 characterized by the property
$$v(x) \ge n \iff d(0, x) \le c^n.$$
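The properties of the valuation and the strong triangle inequality can be checked on ordinary integers, which sit inside $\mathbf{Z}_p$. The sketch below is our own illustration: the choices $p = 5$, $c = 1/2$ and the sample range are arbitrary assumptions.

```python
from itertools import product


def v(x, p):
    """p-adic valuation of an integer x, with v(0) = infinity."""
    if x == 0:
        return float("inf")
    n = 0
    while x % p == 0:
        x //= p
        n += 1
    return n


def d(x, y, p, c=0.5):
    """The metric d(x, y) = c^v(x - y), with d(x, x) = 0."""
    return 0.0 if x == y else c ** v(x - y, p)


p = 5
sample = range(-12, 13)

# Property (1): v(xy) = v(x) + v(y); property (2): v(x + y) >= min(v(x), v(y)).
multiplicative = all(v(x * y, p) == v(x, p) + v(y, p)
                     for x, y in product(sample, repeat=2))
ultrametric_v = all(v(x + y, p) >= min(v(x, p), v(y, p))
                    for x, y in product(sample, repeat=2))

# Strong triangle inequality for the metric.
strong_triangle = all(d(x, z, p) <= max(d(x, y, p), d(y, z, p))
                      for x, y, z in product(sample, repeat=3))

# "Every triangle is isosceles": the two largest side lengths coincide.
sides = sorted([d(0, 5, p), d(5, 7, p), d(0, 7, p)])
print(multiplicative, ultrametric_v, strong_triangle, sides)
```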
By $(*)$(1), $\mathbf{Z}_p$ is an integral domain. Its quotient field is called
$\mathbf{Q}_p$, the field of $p$-adic numbers. We have the following diagram of
inclusions.
Q
p
Q
Q
p
?
Z
p
?
Q
p
/Q
p
) Gal(
Q/Q).
We can check that this is a continuous group homomorphism (defined up to conjugation
only). Denote $\mathrm{Gal}(\bar{K}/K)$ by $G_K$.
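Definition 3.2 Given a continuous group homomorphism
$\rho : G_{\mathbf{Q}} \to \mathrm{GL}_n(R)$, composition with the homomorphism
$G_{\mathbf{Q}_p} \to G_{\mathbf{Q}}$ above yields a continuous group homomorphism
$\rho_p : G_{\mathbf{Q}_p} \to \mathrm{GL}_n(R)$. The collection of homomorphisms
$\rho_p$, one for each rational prime $p$, is called the local data attached to $\rho$.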
The point is that $G_{\mathbf{Q}_p}$ is much better understood than $G_{\mathbf{Q}}$;
in fact even presentations of $G_{\mathbf{Q}_p}$ are known, at least for $p \ne 2$
[18]. We next need some of the structure of $G_{\mathbf{Q}_p}$ and obtain this by
investigating finite Galois extensions $K$ of $\mathbf{Q}_p$ and how
$\mathrm{Gal}(K/\mathbf{Q}_p)$ acts.
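Now,
$$\mathbf{Z}_p - \{0\} = \{p^n u \mid n \ge 0, \ u \text{ is a unit}\},$$
whence
$$\mathbf{Q}_p - \{0\} = \{p^n u \mid n \in \mathbf{Z}, \ u \text{ is a unit in } \mathbf{Z}_p\}.$$
We can thus extend $v : \mathbf{Z}_p - \{0\} \to \mathbf{N}$ to a map
$v : \mathbf{Q}_p^\times \to \mathbf{Z}$ which is a homomorphism of groups.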
Definition 3.3 A discrete valuation $w$ on a field $K$ is a surjective homomorphism
$w : K^\times \to \mathbf{Z}$ such that
$$w(x + y) \ge \min(w(x), w(y))$$
for all $x, y \in K$ (we take $w(0) = \infty$).
Example: The map $v : \mathbf{Q}_p^\times \to \mathbf{Z}$ is a discrete valuation,
since $(*)$ extends to $\mathbf{Q}_p$.
Exercise: If $K$ has a discrete valuation then
$$A = \{x \in K \mid w(x) \ge 0\}$$
is a ring, and
$$\mathfrak{m} = \{x \in K \mid w(x) > 0\}$$
is its unique maximal ideal. Choose $\pi$ so that $w(\pi) = 1$. Every element $x \in K$
$\pi^2 + \ldots \mid c_i \in S\}$, where $S \subset A$ is chosen to contain exactly
one element of each coset of $A/\mathfrak{m}$. The connection with our approach is by
mapping the typical element to
$$(c_0 \ (\mathrm{mod}\ \pi), \ c_0 + c_1\pi \ (\mathrm{mod}\ \pi^2), \ c_0 + c_1\pi + c_2\pi^2 \ (\mathrm{mod}\ \pi^3), \ \ldots) \in \varprojlim A/\mathfrak{m}^i = A.$$
The description extends to $K$ by having
$K = \{\sum_{i=-N}^{\infty} c_i \pi^i\}$, i.e. Laurent series in $\pi$, coefficients
in $S$.
Corollary 3.4 The ideal $\mathfrak{m}$ is equal to the principal ideal $(\pi)$, and
all ideals of $A$ are of the form $(\pi^n)$, so $A$ is a PID.
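Let $K/\mathbf{Q}_p$ be a finite Galois extension, and define a norm
$N : K^\times \to \mathbf{Q}_p^\times$ by
$$x \mapsto \prod_{\sigma \in G} \sigma(x)$$
where $G = \mathrm{Gal}(K/\mathbf{Q}_p)$. The composition of homomorphisms
$$K^\times \xrightarrow{N} \mathbf{Q}_p^\times \xrightarrow{v} \mathbf{Z}$$
is nonzero with some image $f\mathbf{Z}$. We then define
$w : K^\times \to \mathbf{Z}$ by $w = \frac{1}{f}\, v \circ N$. Then $w$ is a discrete
valuation on $K$. $f$ is called the residue degree of $K$, and we say that $w$
extends $v$ with ramification index $e$ if $w|_{\mathbf{Q}_p} = ev$. For any
$x \in \mathbf{Q}_p$, we have
$$ev(x) = w(x) = \frac{1}{f} v(N(x)) = \frac{1}{f} v(x^n) = \frac{n}{f} v(x),$$
so that $ef = n = [K : \mathbf{Q}_p]$.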
Proposition 3.5 The discrete valuation $w$ is the unique discrete valuation on $K$
which extends $v$.
Proof: A generalization of the proof that any two norms on a finite dimensional
vector space over $\mathbf{C}$ are equivalent ([4], [29] Chap. II). QED
Exercise: Let $A$ be the valuation ring of $K$, and $\mathfrak{m}$ the maximal ideal
of $A$. Prove that the order of $A/\mathfrak{m}$ is $p^f$.
We have a collection of embeddings as follows:
A
K
Z
p
?
Q
p
?
$$\tau\sigma\tau^{-1}(x) - x = \tau(\sigma\tau^{-1}(x) - \tau^{-1}(x)) = \tau(\sigma(\tau^{-1}(x)) - \tau^{-1}(x))$$
shows that $G_i \trianglelefteq G$.
We have that
$$G = G_{-1} \supseteq G_0 \supseteq G_1 \supseteq \cdots,$$
where we call $G_0$ the inertia subgroup of $G$, and $G_1$ the wild inertia subgroup
of $G$.
Exercise: Let $\pi$ be a uniformizer of $K$. Show that in fact
$$G_i = \{\sigma \in G \mid w(\sigma(\pi) - \pi) \ge i + 1\}.$$
These normal subgroups determine a filtration of $G$, and we now study the factor
groups in this filtration.
Theorem 3.7 (a) The quotient $G/G_0$ is canonically isomorphic to
$\mathrm{Gal}(k/\mathbf{F}_p)$, with $k = A/\mathfrak{m}$, hence it is cyclic of
order $f$.
(b) Let $U_0$ be the group of units of $A$. Then $U_i = 1 + (\pi^i)$ ($i \ge 1$) is
a subgroup of $U_0$. For all $\sigma \in G$, the map $\sigma \mapsto \sigma(\pi)/\pi$
induces an injective group homomorphism $G_i/G_{i+1} \to U_i/U_{i+1}$.
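Proof: (a) Let $\sigma \in G$. Then $\sigma$ acts on $A$, and sends $\mathfrak{m}$ to
$\mathfrak{m}$. Hence it acts on $A/\mathfrak{m} = k$. This defines a map
$\phi : G \to \mathrm{Gal}(k/\mathbf{F}_p)$ by sending $\sigma$ to the map
$x + \mathfrak{m} \mapsto \sigma(x) + \mathfrak{m}$. We now examine the kernel of
this map.
$$\ker \phi = \{\sigma \in G \mid \sigma(x) - x \in \mathfrak{m} \text{ for all } x \in A\} = \{\sigma \in G \mid w(\sigma(x) - x) \ge 1 \text{ for all } x \in A\} = G_0.$$
This shows that $\phi$ induces an injective homomorphism from $G/G_0$ to
$\mathrm{Gal}(k/\mathbf{F}_p)$.
As for surjectivity, choose $a \in A$ such that the image $\bar{a}$ of $a$ in $k$
has $k = \mathbf{F}_p(\bar{a})$. Let
$$p(x) = \prod_{\sigma \in G} (x - \sigma(a)).$$
Then $p(x)$ is a monic polynomial with coefficients in $A$, and one root is $a$. Then
$$\bar{p}(x) = \prod_{\sigma \in G} (x - \overline{\sigma(a)}) \in k[x]$$
yields that all conjugates of $\bar{a}$ are of the form $\overline{\sigma(a)}$. For
$\tau \in \mathrm{Gal}(k/\mathbf{F}_p)$, $\tau(\bar{a})$ is such a conjugate, whence
it is equal to some $\overline{\sigma(a)}$. Then the image of $\sigma$ is $\tau$.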
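(b) $\sigma \in G_i \iff w(\sigma(\pi) - \pi) \ge i + 1 \iff \sigma(\pi)/\pi \in U_i$.
This map is independent of choice of uniformizer. Suppose $\pi' = \pi u$ is another
uniformizer, where $u$ is a unit. Then
$$\frac{\sigma(\pi')}{\pi'} = \frac{\sigma(\pi)}{\pi} \cdot \frac{\sigma(u)}{u},$$
but for $\sigma \in G_i$, we have that
$i + 1 \le w(\sigma(u) - u) = w(\sigma(u)/u - 1) + w(u)$, so that
$\sigma(u)/u \in U_{i+1}$, i.e. $\sigma(\pi')/\pi'$ differs from $\sigma(\pi)/\pi$
by an element of $U_{i+1}$.
The map $\theta_i$ is a homomorphism, since
$$\frac{\sigma\tau(\pi)}{\pi} = \frac{\sigma(\pi)}{\pi} \cdot \frac{\tau(\pi)}{\pi} \cdot \frac{\sigma(u)}{u},$$
where $u = \tau(\pi)/\pi$, and, as above, $\sigma(u)/u \in U_{i+1}$.
Finally the map $\theta_i$ is injective. To see this assume that
$\sigma(\pi)/\pi \in U_{i+1}$. Then $\sigma(\pi) = \pi(1 + y)$ with
$y \in (\pi^{i+1})$. Hence $\sigma(\pi) - \pi = \pi y$ has valuation at least
$i + 2$, that is, $\sigma \in G_{i+1}$. QED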
Proposition 3.8 We have that U
0
/U
1
is canonically isomor-
phic to k
, and is
surjective. Its kernel is $\{x \mid x \equiv 1 \pmod{\mathfrak{m}}\} = U_1$, whence
$U_0/U_1 \cong k^\times$.
The map $1 + x \mapsto x \pmod{\pi^{i+1}}$ takes $U_i$ to $(\pi^i)/(\pi^{i+1})$ and
is surjective with kernel $U_{i+1}$. Moreover, since $\pi$ acts trivially on
$(\pi^i)/(\pi^{i+1})$, this is an $(A/\mathfrak{m})$-module (i.e. $k$-vector space).
Its dimension is 1, since otherwise there would be an ideal of $A$ strictly between
$(\pi^i)$ and $(\pi^{i+1})$. QED
We note in particular that $G$ is solvable, since the factors in its filtration are
all abelian by the above. Thus, $G_{\mathbf{Q}_p}$ is prosolvable. Specifically, we
have the following inclusions:
G
G
0
?
cyclic
G
1
?
1
?
pgroup
Gal(L/Q
p
)
_
[
G
0
= lim
G
(L)
0
_
[
3.1 Infinite extensions
G
1
= lim
G
(L)
1
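A second inverse system consists of the groups
$\mathrm{Gal}(L/\mathbf{Q}_p)/G_0^{(L)}$. Homomorphisms for $M \subseteq L$ are
obtained from the restriction map and the fact that the image under restriction of
$G_0^{(L)}$ is equal to $G_0^{(M)}$:
$$\mathrm{Gal}(L/\mathbf{Q}_p)/G_0^{(L)} \to \mathrm{Gal}(M/\mathbf{Q}_p)/G_0^{(M)},$$
which corresponds, under the isomorphisms
$\mathrm{Gal}(L/\mathbf{Q}_p)/G_0^{(L)} \cong \mathrm{Gal}(k_L/\mathbf{F}_p)$ and
$\mathrm{Gal}(M/\mathbf{Q}_p)/G_0^{(M)} \cong \mathrm{Gal}(k_M/\mathbf{F}_p)$, to the
map $\mathrm{Gal}(k_L/\mathbf{F}_p) \to \mathrm{Gal}(k_M/\mathbf{F}_p)$.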
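We use this to show that
$$G_{\mathbf{Q}_p}/G_0 \cong \mathrm{Gal}(\bar{\mathbf{F}}_p/\mathbf{F}_p) \cong \hat{\mathbf{Z}} \cong \prod_q \mathbf{Z}_q.$$
We shall see later that
$$G_0/G_1 \cong \prod_{q \ne p} \mathbf{Z}_q.$$
The following will come in useful: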
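Theorem 3.10 If $H$ is a closed subgroup of $G_{\mathbf{Q}_p}$, then
$H = \mathrm{Gal}(\bar{\mathbf{Q}}_p/L)$, where $L$ is the fixed field of $H$.
Proof: Let $H$ be a closed subgroup of $G_{\mathbf{Q}_p}$, and let $L$ be its fixed
field. Then for any finite Galois extension $M/L$, we obtain a commutative diagram
in which the inclusion $H \to \mathrm{Gal}(\bar{\mathbf{Q}}_p/L)$ is compatible with
the restriction maps from $H$ and from $\mathrm{Gal}(\bar{\mathbf{Q}}_p/L)$ to
$\mathrm{Gal}(M/L)$. The map $H \to \mathrm{Gal}(M/L)$ is surjective by finite Galois
theory, since the fixed field of $H$ is $L$.
Since $\mathrm{Gal}(\bar{\mathbf{Q}}_p/L) = \varprojlim \mathrm{Gal}(M/L)$, it follows
that $H$ is dense in $\mathrm{Gal}(\bar{\mathbf{Q}}_p/L)$. Since $H$ is closed,
$H = \mathrm{Gal}(\bar{\mathbf{Q}}_p/L)$. QED
In fact infinite Galois theory provides an order-reversing bijection between
intermediate fields and the closed subgroups of the Galois group.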
3.2 Structure of $G_{\mathbf{Q}_p}/G_0$ and $G_{\mathbf{Q}_p}/G_1$
We now have normal subgroups $G_1 \subseteq G_0 \subseteq G_{\mathbf{Q}_p}$. Suppose
that $\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(R)$ is one of the naturally occurring
continuous homomorphisms in which we are interested, e.g. associated to an elliptic
curve or modular form. We get continuous homomorphisms
$\rho_p : G_{\mathbf{Q}_p} \to \mathrm{GL}_2(R)$, such that $\rho_p(G_0) = \{1\}$ for
all but finitely many $p$ (in which case we say that $\rho$ is unramified at $p$).
For a $p$ at which $\rho$ is unramified, $\rho_p$ induces a homomorphism
$G_{\mathbf{Q}_p}/G_0 \to \mathrm{GL}_2(R)$. Often it will be the case that
$\rho_p(G_1) = \{1\}$ for a ramified $p$ (in which case we say that $\rho$ is tamely
ramified at $p$). In this case $\rho_p$ induces a homomorphism
$G_{\mathbf{Q}_p}/G_1 \to \mathrm{GL}_2(R)$. Thus, we are interested in the
structure of the two groups $G_{\mathbf{Q}_p}/G_0$ and $G_{\mathbf{Q}_p}/G_1$.
Here is a rough outline of how we establish this. In the finite extension case
considered already, $G/G_0$ is isomorphic to $\mathrm{Gal}(k/\mathbf{F}_p)$ (and so
is cyclic), and $G_0/G_1$ embeds in $k^\times$ (and so is cyclic of order prime to
$p$). In the limit,
$G_{\mathbf{Q}_p}/G_0 = \varprojlim G/G_0 = \varprojlim \mathrm{Gal}(k/\mathbf{F}_p) = \mathrm{Gal}(\bar{\mathbf{F}}_p/\mathbf{F}_p)$
(and so is procyclic), and $G_0/G_1$ embeds in $\varprojlim k^\times$. In the limit,
$G_{\mathbf{Q}_p}/G_1$ is prometacyclic, and the extension of groups
$$1 \to G_0/G_1 \to G_{\mathbf{Q}_p}/G_1 \to G_{\mathbf{Q}_p}/G_0 \to 1$$
is a semidirect product since $G_{\mathbf{Q}_p}/G_0$ is free, with the action given
by that of $\mathrm{Gal}(\bar{\mathbf{F}}_p/\mathbf{F}_p)$ on $\varprojlim k^\times$.
The Frobenius element topologically generates
$\mathrm{Gal}(\bar{\mathbf{F}}_p/\mathbf{F}_p)$ and acts by mapping elements to their
$p$th power.
Theorem 3.11 Let $G_0$, $G_1$ denote the 0th and 1st ramification subgroups of
$G_{\mathbf{Q}_p}$. Then
$G_{\mathbf{Q}_p}/G_0 \cong \varprojlim \mathrm{Gal}(k/\mathbf{F}_p) \cong \hat{\mathbf{Z}}$
and $G_0/G_1 \cong \varprojlim k^\times \cong \prod_{q \ne p} \mathbf{Z}_q$, where the
maps in the inverse system are norm maps. Moreover, $G_{\mathbf{Q}_p}/G_1$ is
(topologically) generated by two elements $x$, $y$ where $y$ generates $G_0/G_1$,
$x$ maps onto the Frobenius element in
$G_{\mathbf{Q}_p}/G_0 = \mathrm{Gal}(\bar{\mathbf{F}}_p/\mathbf{F}_p)$, and
$x^{-1}yx = y^p$.
For any finite extension $L/\mathbf{Q}_p$, define
$G^{(L)} = \mathrm{Gal}(L/\mathbf{Q}_p)$, let $A_L$ be the valuation ring of $L$, and
let $k_L$ be the residue field of $L$.
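Let $M \subseteq L$ be finite Galois extensions of $\mathbf{Q}_p$. We obtain the
following diagram: $G_{\mathbf{Q}_p}$ maps by restriction to
$G^{(L)}/G_0^{(L)} \cong \mathrm{Gal}(k_L/\mathbf{F}_p)$ and to
$G^{(M)}/G_0^{(M)} \cong \mathrm{Gal}(k_M/\mathbf{F}_p)$, compatibly with the natural
maps $G^{(L)}/G_0^{(L)} \to G^{(M)}/G_0^{(M)}$ and
$\mathrm{Gal}(k_L/\mathbf{F}_p) \to \mathrm{Gal}(k_M/\mathbf{F}_p)$.
Note that $G_0$ lies in the kernel of each $\phi_L$. From this diagram we get maps
$\phi_L : G_{\mathbf{Q}_p}/G_0 \to \mathrm{Gal}(k_L/\mathbf{F}_p)$. We note the
following facts.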
1) Given $k$, there is only one such map, $\phi_k$. The reason here is that if
$k_L = k_M = k$, then there is a unique map
$$\mathrm{Gal}(LM/\mathbf{Q}_p) \to \mathrm{Gal}(k_{LM}/\mathbf{F}_p) \to \mathrm{Gal}(k/\mathbf{F}_p)$$
since $\mathrm{Gal}(k_{LM}/\mathbf{F}_p)$ is cyclic.
Definition 3.12 The valuation ring of the fixed field of $\phi_k$ will be denoted
$W(k)$, the ring of infinite Witt vectors of $k$. An alternative explicit
description is given in [17]. Note that $W(\mathbf{F}_p)$ is simply $\mathbf{Z}_p$.
2) Given $k$, there is some $L$ with $k_L = k$. (We may take
$L = \mathbf{Q}_p(\zeta)$, where $\zeta$ is a primitive $(|k| - 1)$th root of 1.)
We consider the inverse system consisting of groups $\mathrm{Gal}(k/\mathbf{F}_p)$,
where $k/\mathbf{F}_p$ runs through finite Galois extensions, together with the usual
restriction maps. Then $(G_{\mathbf{Q}_p}/G_0, \phi_k)$ is an object of the new
category associated to this inverse system. We obtain a homomorphism
$$G_{\mathbf{Q}_p}/G_0 \to \varprojlim \mathrm{Gal}(k/\mathbf{F}_p) \cong \hat{\mathbf{Z}},$$
and this is an isomorphism because of (1) and (2) above.
Next we study the structure of $G_0/G_1$. We have the following diagram: $G_0$ maps
by restriction to $G_0^{(L)}/G_1^{(L)}$ and to $G_0^{(M)}/G_1^{(M)}$, which embed in
$k_L^\times$ and $k_M^\times$ respectively.
Note that $G_1$ lies in the kernel of each of these maps from $G_0$. It is a simple
exercise to show that the norm map $k_L^\times \to k_M^\times$ makes the bottom
square commutative.
Theorem 3.13 There is a canonical isomorphism $G_0/G_1 \cong \varprojlim k^\times$.
Proof: Let $L/\mathbf{Q}_p$ be a finite Galois extension, and recall that
$G_0^{(L)}/G_1^{(L)} \hookrightarrow k_L^\times$ naturally. Then
$$G_0 \to G_0^{(L)} \to k_L^\times$$
factors through $G_0/G_1$. In this way we get maps $G_0/G_1 \to k_L^\times$ for each
$L$, and finally a map $G_0/G_1 \to \varprojlim k^\times$, where the inverse limit is
taken over all finite Galois extensions $k/\mathbf{F}_p$, and for
$k_1 \subseteq k_2$, the map $k_2^\times \to k_1^\times$ is the norm map. The fact
that $G_0/G_1 \to \varprojlim k^\times$.
Let $\mu_n$ be the group of $n$th roots of 1 in $\bar{\mathbf{F}}_p$, where
$(p, n) = 1$. For $m \mid n$, we have a group homomorphism $\mu_n \to \mu_m$ given by
$x \mapsto x^{n/m}$, which forms an inverse system. Let $\mu = \varprojlim \mu_n$.
Lemma 3.14 We have a canonical isomorphism $\varprojlim k^\times \cong \mu$.
Proof: Since the groups k
n
lim
. To obtain a map in
the other direction, note that the numbers p
f
1 are conal
in the set of integers prime to p. In fact, if d is such an integer
integer, there is some f 1 with p
f
1 (mod d), e.g. f =
(d). QED
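The cofinality step in the proof of Lemma 3.14 can be checked numerically. The following is a minimal sketch, not part of the original notes; the helper names `euler_phi` and `mult_order` are hypothetical. It verifies that $p^{\varphi(d)} \equiv 1 \pmod d$ for a few $d$ prime to $p$, and that the least such exponent divides $\varphi(d)$.

```python
from math import gcd

def euler_phi(d):
    """Count the integers in 1..d that are coprime to d."""
    return sum(1 for a in range(1, d + 1) if gcd(a, d) == 1)

def mult_order(p, d):
    """Smallest f >= 1 with p**f == 1 (mod d); requires gcd(p, d) == 1."""
    if gcd(p, d) != 1:
        raise ValueError("p must be coprime to d")
    f, power = 1, p % d
    while power != 1 % d:
        power = (power * p) % d
        f += 1
    return f

p = 5
for d in (2, 3, 7, 12, 31):
    f = mult_order(p, d)
    # f = phi(d) always works (Euler); the least exponent divides phi(d)
    assert pow(p, euler_phi(d), d) == 1 % d
    assert euler_phi(d) % f == 0
    print(d, f, euler_phi(d))
```

So every $d$ prime to $p$ divides some $p^f - 1$, which is what cofinality needs.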
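$\mu$ is noncanonically isomorphic to $\prod_{q \neq p} \mathbb{Z}_q$, since $\mu_n$ is noncanonically isomorphic to $\mathbb{Z}/n\mathbb{Z}$, and
$$\varprojlim \mathbb{Z}/n\mathbb{Z} \cong \prod_{q \neq p} \mathbb{Z}_q,$$
since $\mathbb{Z}/n\mathbb{Z} \cong \prod_i \mathbb{Z}/p_i^{r_i}\mathbb{Z}$ for $n = \prod_i p_i^{r_i}$.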
We now examine the map
$$G_0^{(L)}/G_1^{(L)} \to k_L^\times.$$
One can check that it is $G^{(L)}$-equivariant (with the conjugation action on the left, and the natural action on the right). Now $G_0^{(L)}$ acts trivially on the left ($G_0^{(L)}/G_1^{(L)}$ is abelian), and also trivially on the right by definition. Hence we end up with a $G^{(L)}/G_0^{(L)}$ ($\cong \mathrm{Gal}(k_L/\mathbb{F}_p)$)-equivariant map. The canonical isomorphism $G_0/G_1 \to \varprojlim k^\times$ is $G_{\mathbb{Q}_p}/G_0$-equivariant.
The upshot is that $G_{\mathbb{Q}_p}/G_1$ is topologically generated by 2 elements $x$ and $y$, where $x$ generates a copy of $\hat{\mathbb{Z}}$, $y$ generates a copy of $\prod_{q \neq p} \mathbb{Z}_q$, with one relation $x^{-1} y x = y^p$. This is seen from the short exact sequence of groups
$$1 \to G_0/G_1 \to G_{\mathbb{Q}_p}/G_1 \to G_{\mathbb{Q}_p}/G_0 \to 1,$$
in which $G_0/G_1 \cong \prod_{q \neq p} \mathbb{Z}_q$ and $G_{\mathbb{Q}_p}/G_0 \cong \hat{\mathbb{Z}}$ are both procyclic (i.e. topologically generated by one element). Since $\hat{\mathbb{Z}}$ is free, the sequence is split, i.e. defines a semidirect product with the action given as above.
Next, let $\rho: G_{\mathbb{Q}} \to GL_n(k)$ be given, where $k$ is a finite field of characteristic $\ell$. The image of $\rho$ is finite, say $\mathrm{Gal}(K/\mathbb{Q})$, where $K$ is a number field. Letting the finite set of rational primes ramified in $K/\mathbb{Q}$ be $S$, we see that $\rho$ is unramified at every $p \notin S$, i.e. $\rho_p(G_0) = 1$ and so $\rho_p$ factors through $G_{\mathbb{Q}_p}/G_0$ for all $p \notin S$. Consider next
$$\rho_\ell: G_{\mathbb{Q}_\ell} \to GL_n(k).$$
Call $\rho_\ell$ semisimple if $V := k^n$, viewed as a $k[G_{\mathbb{Q}_\ell}]$-module, is semisimple, i.e. a direct sum of irreducible modules.
Theorem 3.15 If $\rho_\ell$ is semisimple, then $\rho_\ell(G_1) = 1$, and so $\rho_\ell$ factors through $G_{\mathbb{Q}_\ell}/G_1$.
Proof: Assume $V$ is irreducible, i.e. has no proper $k[G_{\mathbb{Q}_\ell}]$-submodules. Let $V' = \{v \in V \mid g(v) = v \text{ for all } g \in \rho_\ell(G_1)\}$. Since $G_1$ is a pro-$\ell$ group, $\rho_\ell(G_1)$ is a finite $\ell$-group and so its orbits on $V$ are of length 1 or a power of $\ell$. The orbits of length 1 comprise $V'$, so that $|V'| \equiv |V| \pmod \ell \equiv 0 \pmod \ell$. In particular, $V' \neq 0$. Since $G_1$ is normal in $G_{\mathbb{Q}_\ell}$, $V'$ is stable under $G_{\mathbb{Q}_\ell}$, implying by irreducibility of $V$ that $V' = V$. Thus $\rho_\ell(G_1)$ acts trivially on the whole of $V$. For a semisimple $V$, we apply the above to each summand of $V$. QED
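The orbit-counting step of this proof (a finite $\ell$-group acting on a vector space over a field of characteristic $\ell$ fixes a number of vectors divisible by $\ell$, hence a nonzero vector) can be seen in a small case. The sketch below is an illustration only, under the assumption that the $\ell$-group is the group of matrices $\begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}$ acting on $\mathbb{F}_\ell^2$; the function name `fixed_vectors` is hypothetical.

```python
from itertools import product

def fixed_vectors(ell):
    """Vectors of F_ell^2 fixed by every matrix [[1, t], [0, 1]], t in F_ell."""
    fixed = []
    for x, y in product(range(ell), repeat=2):
        # [[1, t], [0, 1]] sends (x, y) to (x + t*y, y)
        if all(((x + t * y) % ell, y) == (x, y) for t in range(ell)):
            fixed.append((x, y))
    return fixed

for ell in (3, 5, 7):
    fixed = fixed_vectors(ell)
    # |V'| is congruent to |V| = ell^2 modulo ell, so it is divisible by ell
    assert len(fixed) % ell == 0
    # hence V' contains more than the zero vector
    assert len(fixed) > 1
    print(ell, len(fixed))
```

Here the fixed space is the line $y = 0$; it is not all of $V$ because this $V$ is not irreducible for any group normalizing the unipotent subgroup, which is exactly the situation the irreducibility hypothesis rules out.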
The Big Picture. We have defined subgroups $G_{\mathbb{Q}_p}$ (one for each prime $p$) of $G_{\mathbb{Q}}$ with much simpler structure than $G_{\mathbb{Q}}$ itself. This will allow us to describe representations $\rho: G_{\mathbb{Q}} \to GL_n(R)$ in terms of their restrictions $\rho_p$ to these subgroups. Each $\rho_p$ can be described in turn by its effect on ramification subgroups of $G_{\mathbb{Q}_p}$, ultimately enabling us to define useful numerical invariants associated to $\rho$ (see chapter 5). First, however, we need some natural sources of Galois representations and these will be provided by elliptic curves, modular forms, and more generally group schemes.
4
Galois Representations from Elliptic Curves,
Modular Forms, and Group Schemes
Having introduced Galois representations, we next describe natural sources for them that will lead to the link with Fermat's Last Theorem.
4.1 Elliptic curves
An elliptic curve over $\mathbb{Q}$ is given by an equation $y^2 = f(x)$, where $f \in \mathbb{Z}[x]$ is a cubic polynomial with no repeated roots in $\bar{\mathbb{Q}}$. A good reference for this theory is [32].
Example: $y^2 = x(x - 1)(x + 1)$
If $K$ is a field, set $E(K) := \{(x, y) \in K \times K \mid y^2 = f(x)\} \cup \{\infty\}$.
We begin by studying $E(\mathbb{C})$.
Let $\Lambda$ be a lattice inside $\mathbb{C}$. Define the Weierstrass $\wp$-function by
$$\wp(z; \Lambda) = \frac{1}{z^2} + \sum_{0 \neq \omega \in \Lambda} \left( \frac{1}{(z - \omega)^2} - \frac{1}{\omega^2} \right)$$
and the $2k$th Eisenstein series by
$$G_{2k}(\Lambda) = \sum_{0 \neq \omega \in \Lambda} \omega^{-2k}.$$
These arise as the coefficients in the Laurent series of $\wp$ about $z = 0$, namely:
$$\wp(z) = \frac{1}{z^2} + 3G_4 z^2 + 5G_6 z^4 + 7G_8 z^6 + \ldots,$$
which is established by rearranging the infinite sum, allowed by the following.
Proposition 4.1 The function $\wp$ is absolutely and locally uniformly convergent on $\mathbb{C} - \Lambda$. It has poles exactly at the points of $\Lambda$ and all the residues are 0. $G_{2k}(\Lambda)$ is absolutely convergent if $k > 1$.
Definition 4.2 A function $f$ is called elliptic with respect to $\Lambda$ (or doubly periodic) if $f(z) = f(z + \omega)$ for all $\omega \in \Lambda$. Note that to check ellipticity it suffices to show this for $\omega = \omega_1, \omega_2$, the fundamental periods of $\Lambda$. An elliptic function can be regarded as a function on $\mathbb{C}/\Lambda$.
Proposition 4.3 $\wp$ is an even function and elliptic with respect to $\Lambda$.
Proof: Clearly, $\wp$ is even, i.e. $\wp(z) = \wp(-z)$. By local uniform convergence, we can differentiate term-by-term to get $\wp'(z) = -2 \sum_{\omega \in \Lambda} \frac{1}{(z - \omega)^3}$, and so $\wp'$ is elliptic with respect to $\Lambda$. Integrating, for each $\omega \in \Lambda$, $\wp(z + \omega) = \wp(z) + C(\omega)$, where $C(\omega)$ does not depend on $z$. Setting $z = -\omega_i/2$ and $\omega = \omega_i$ ($i = 1$ or $2$) gives
$$\wp(\omega_i/2) = \wp(-\omega_i/2) + C(\omega_i) = \wp(\omega_i/2) + C(\omega_i),$$
using the evenness of $\wp$. Thus $C(\omega_i) = 0$ ($i = 1, 2$), so $\wp$ is elliptic. QED
Let $g_2 = 60 G_4(\Lambda)$ and $g_3 = 140 G_6(\Lambda)$.
Proposition 4.4 $(\wp'(z))^2 = 4\wp(z)^3 - g_2 \wp(z) - g_3$ ($z \notin \Lambda$). $(*)$
Proof: Let $f(z) = (\wp'(z))^2 - (a\wp(z)^3 - b\wp(z)^2 - c\wp(z) - d)$. Considering its explicit Laurent expansion around $z = 0$, since $f$ is even, there are terms in $z^{-6}, z^{-4}, z^{-2}, z^0$ whose coefficients are linear in $a, b, c, d$. Choosing $a = 4$, $b = 0$, $c = g_2$, $d = g_3$ makes these coefficients zero, and so $f$ is holomorphic at $z = 0$ and even vanishes there. We already knew that $f$ is holomorphic at points not in $\Lambda$. Since $f$ is elliptic, it is bounded. By Liouville's theorem, bounded holomorphic functions are constant. $f(0) = 0$ implies this constant is 0. QED
Theorem 4.5 The discriminant $\Delta(\Lambda) := g_2^3 - 27 g_3^2 \neq 0$, and so $4x^3 - g_2 x - g_3$ has distinct roots, whence $E_\Lambda$ defined by $y^2 = 4x^3 - g_2 x - g_3$ is an elliptic curve (over $\mathbb{Q}$ if $g_2, g_3 \in \mathbb{Q}$).
Proof: Setting $\omega_3 = \omega_1 + \omega_2$, $\wp'(\omega_i/2) = -\wp'(-\omega_i/2) = -\wp'(\omega_i/2)$ since $\wp'$ is odd and elliptic. Hence $\wp'(\omega_i/2) = 0$. Thus $4\wp(\omega_i/2)^3 - g_2 \wp(\omega_i/2) - g_3 = 0$. We just need to show that these three roots of $4x^3 - g_2 x - g_3 = 0$ are distinct.
Consider $\wp(z) - \wp(\omega_i/2)$. This has exactly one pole, of order 2, and so by the next lemma has either two zeros of order 1 (which cannot be the case since this function is even and elliptic) or 1 zero of order 2, namely $\omega_i/2$. Hence $\wp(\omega_i/2) - \wp(\omega_j/2) \neq 0$ if $i \neq j$. QED
Lemma 4.6 If $f$ is elliptic and $v_w(f)$ is the order of vanishing of $f$ at $w$, then $\sum_{w \in \mathbb{C}/\Lambda} v_w(f) = 0$. Moreover, the sum of all the zeros minus the poles is $0 \pmod \Lambda$.
Proof: By Cauchy's residue theorem,
$$\sum_{w \in \mathbb{C}/\Lambda} v_w(f) = \frac{1}{2\pi i} \oint \frac{f'}{f}\, dz,$$
where the integral is over the boundary of the fundamental parallelogram with vertices $0, \omega_1, \omega_2, \omega_3$ (if a pole or zero happens to land on the boundary, then translate the whole parallelogram to avoid it). By ellipticity, the contributions from parallel sides cancel, so the integral is 0. The last statement is proved similarly, using $\frac{1}{2\pi i} \oint z \frac{f'}{f}\, dz$. QED
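Theorem 4.7 There is a bijection (in fact a homeomorphism of Riemann surfaces) $\mathbb{C}/\Lambda \to E_\Lambda(\mathbb{C})$ given by $z \mapsto (\wp(z), \wp'(z))$ ($z \notin \Lambda$), $z \mapsto \infty$ ($z \in \Lambda$).
Proof: Ellipticity of $\wp$ and $\wp'$ implies that the map is well-defined and $(*)$ shows that the image is in $E_\Lambda(\mathbb{C})$. [...] Later, we shall see that every elliptic curve over $\mathbb{C}$ is of the form $E_\Lambda$.
Theorem 4.8 The group law on $E_\Lambda$ [...] $\cong \mathbb{R}/\mathbb{Z} \times \mathbb{R}/\mathbb{Z}$, $E[n] \cong \mathbb{Z}/n \times \mathbb{Z}/n$. Moreover, there are polynomials $f_n \in \mathbb{Q}[x]$ such that $E[n] = \{(x, y) \mid f_n(x) = 0\} \cup \{\infty\}$.
Proof: The elements of $\mathbb{R}/\mathbb{Z}$ of order dividing $n$ form a cyclic group of order $n$. The $f_n$ come from iterating the rational function that describes addition of two points on the curve. QED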
Example: Continuing the case $n = 2$, we see that $f_2 = f$.
By the last lemma, $E[n] \subseteq E(\bar{\mathbb{Q}})$ [...] $T_\ell(E) = \varprojlim E[\ell^n]$, where the maps in the inverse system are $E[\ell^n] \to E[\ell^m]$ for $n > m$ defined by $P \mapsto \ell^{n-m} P$.
Example: Continuing the case $n = 2$, note that $GL_2(\mathbb{Z}/2) \cong S_3$, the symmetric group on 3 letters. The action of $G_{\mathbb{Q}}$ on $E[2] = \{\infty, (\alpha_1, 0), (\alpha_2, 0), (\alpha_3, 0)\}$ amounts to permuting the roots $\alpha_i$ of $f$, and so the image of $\rho_{E,2}$ is $\mathrm{Gal}(K/\mathbb{Q}) \leq S_3$ where $K$ is the splitting field of $f$.
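The identification $GL_2(\mathbb{Z}/2) \cong S_3$ used in this example can be confirmed by brute force: the group has 6 elements and acts faithfully on the three nonzero vectors of $(\mathbb{Z}/2)^2$, which play the role of the three points $(\alpha_i, 0)$. This is a small sketch for illustration; the names `gl2_mod2`, `NONZERO` and `permutation_of` are hypothetical.

```python
from itertools import product

def gl2_mod2():
    """All invertible 2x2 matrices over Z/2, as tuples (a, b, c, d)."""
    return [m for m in product(range(2), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % 2 == 1]

# The three nonzero vectors of (Z/2)^2, i.e. the points of order 2.
NONZERO = [(1, 0), (0, 1), (1, 1)]

def permutation_of(m):
    """Permutation of the three nonzero vectors induced by the matrix m."""
    a, b, c, d = m
    images = [((a * x + b * y) % 2, (c * x + d * y) % 2) for x, y in NONZERO]
    return tuple(NONZERO.index(v) for v in images)

# Distinct matrices give distinct permutations, so the action is faithful.
perms = {permutation_of(m) for m in gl2_mod2()}
print(len(gl2_mod2()), len(perms))
```

Six matrices giving six distinct permutations of three objects is exactly the statement that the map to $S_3$ is an isomorphism.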
Given a typical $E$ (meaning one with no complex multiplications), $\rho_{E,\ell}$ is surjective for all but finitely many primes $\ell$. In fact it is surjective for all $\ell$ for a set of elliptic curves of density 1, for example if $E$ is the curve $y^2 + y = x^3 - x$ (elliptic - you can complete the square).
These $\ell$-adic Galois representations encode much information about elliptic curves. For example, for a fixed $\ell$, $\rho_{E,\ell}$ and [...]
(the units of $A$) is representable since $A^\times \cong \hom_{R\text{-alg}}(R[X, Y]/(XY - 1), A)$ (under an $R$-algebra homomorphism, $X$ has to map to something invertible in $A$ and then $Y$ maps to its inverse).
(iii) More generally, the functor $GL_n$ is representable since $GL_n(A) = \hom_{R\text{-alg}}(R[T_{11}, T_{12}, \ldots, T_{nn}, Y]/(\det((T_{ij}))Y - 1), A)$. Note $GL_1 = \mathbb{G}_m$.
(iv) The functor $\mu_n$ defined by $\mu_n(A) := \{x \in A \mid x^n = 1\}$ is representable since $\mu_n(A) = \hom_{R\text{-alg}}(R[T]/(T^n - 1), A)$. $\mu_n$ is a subgroup scheme of $\mathbb{G}_m$.
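The following example shows how certain group schemes can give rise to Galois representations. It will be generalized below.
Definition 4.12 Let $K$ be a field, $\ell \neq \mathrm{char}\, K$. Then $G_K$ acts on $\mu_{\ell^n}(\bar K) \cong \mathbb{Z}/\ell^n$. This yields a representation $\chi_{\ell^n}: G_K \to GL_1(\mathbb{Z}/\ell^n) = (\mathbb{Z}/\ell^n)^\times$ [...] $: G_K \to GL_1(\mathbb{Z}_\ell) = \mathbb{Z}_\ell^\times$ [...] $:= \varprojlim \mu_{\ell^n}$, then [...].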
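Exercise: Let $K = \mathbb{Q}$. Show that the $\ell$-adic cyclotomic character is unramified at all primes $p \neq \ell$, i.e. $\chi_p: G_{\mathbb{Q}_p} \to \mathbb{Z}_\ell^\times$ [...] $\mathrm{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p) \cong G_{\mathbb{Q}_p}/G_0 \to \mathbb{Z}_\ell^\times$.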
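If $A \cong \mathbb{Z}_\ell^n \times T$ ($T$ torsion) as a $\mathbb{Z}_\ell$-module, then $A \otimes_R \mathbb{Q}_\ell \cong \mathbb{Q}_\ell^n$ as a $\mathbb{Q}_\ell$-module and $A \otimes_R \mathbb{F}_\ell \cong \mathbb{F}_\ell^n \times T/\ell T$ as an $\mathbb{F}_\ell$-module.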
Let $R \to S$ be a ring homomorphism. Given a functor $F$ on $R$-algebras, we get a functor $F'$ on $S$-algebras, since via composition with this homomorphism every $S$-algebra $S \to A$ is an $R$-algebra. If $\mathcal{R}$ is an $R$-algebra, then $\hom_{S\text{-alg}}(\mathcal{R} \otimes_R S, A) \cong \hom_{R\text{-alg}}(\mathcal{R}, A)$ (adjoint associativity - follows quickly from universal description of tensor above), so if $F$ is representable (by $R$-algebra $\mathcal{R}$), then $F'$ is representable (by $S$-algebra $\mathcal{R} \otimes_R S$). This process is called base change.
A homomorphism of affine group schemes over $R$ is a natural map $G \to H$. This gives us for each $R$-algebra $A$ a group homomorphism $G(A) \to H(A)$ such that whenever $A \to B$ is a homomorphism of $R$-algebras, the following diagram commutes:
$$\begin{array}{ccc} G(A) & \to & H(A) \\ \downarrow & & \downarrow \\ G(B) & \to & H(B) \end{array}$$
Here the vertical maps come from $G$ and $H$ being functors.
Exercise: For each $R$-algebra $A$, define $F(A) = \ker(G(A) \to H(A))$. Show that $F$ is a group scheme over $R$.
For example, the determinant map gives a homomorphism from $GL_n$ to $\mathbb{G}_m$ with kernel $SL_n$, and the map $x \mapsto x^n$ a homomorphism from $\mathbb{G}_m$ to $\mathbb{G}_m$ with kernel $\mu_n$. It requires a lot more work to give the cokernel a functorial description.
We now have the objects and morphisms of the category of affine group schemes over a fixed ring $R$. An elliptic curve $E: y^2 = f(x)$ over $\mathbb{Q}$ gives for each $\mathbb{Q}$-algebra $A$ a group $E(A)$ and for each $\mathbb{Q}$-algebra map $A \to B$ a group homomorphism $E(A) \to E(B)$. Is it an affine group scheme? Actually not - the obvious try is $\mathcal{R} = \mathbb{Q}[x, y]/(y^2 - f(x))$, in which case $\hom_{\mathbb{Q}\text{-alg}}(\mathcal{R}, A)$ yields the points $(x, y)$ with coordinates in $A$ satisfying $y^2 = f(x)$. This, however, misses the point at $\infty$. In other words, it shows that $E(A) - \{\infty\}$ defines an affine scheme.
The answer is to define group schemes in general, obtained by patching together affine schemes called charts. For instance, $E(A) - \{\infty\}$ provides such a chart. We say more about non-affine group schemes later.
It turns out that the $n$-division points are nicer, since they define affine group schemes.
Definition 4.13 An $R$-algebra $A$ is called finite if it is finitely generated as an $R$-module. (Note that this is stronger than being finitely generated as an $R$-algebra - consider e.g. $R[T]$.) Let $G$ be an affine group scheme over $R$. We call $G$ finite over $R$ if its representing ring $\mathcal{R}$ is a finite $R$-algebra.
In fact, any group scheme finite over $R$ is affine. In particular, if $E: y^2 = f(x)$ is an elliptic curve over $\mathbb{Q}$ and $\ell$ a prime such that $f(x) \pmod \ell$ has distinct roots, then $E[n]$ is a finite (of rank $n^2$) affine group scheme over $\mathbb{Q}_p$. (It can be shown directly to be affine by finding a polynomial $f$ whose zero set $V((f))$ - see below - does not meet $E[n]$, whence $E[n]$ lies in an affine chart $U_f$. See Conrad's article in [5] for details. A nice explicit description of $E[n]$ can be found in [9].) This will be discussed in the section on reduction of elliptic curves in the next chapter.
Next, we make explicit the kind of group schemes that yield Galois representations useful to us.
Definition 4.14 Let $G$ be a finite group scheme over a field $K$, e.g. represented by $K$-algebra $\mathcal{R}$. We call $G$ étale if $\mathcal{R}$ is an étale (or separable) $K$-algebra, i.e. $\mathcal{R} \otimes_K \bar K \cong \bar K \times \ldots \times \bar K$.
Example: Let $G = \mu_\ell$ [...] $- 1)$. Then $\mathcal{R} \otimes_K \bar K = \bar K[T]/(T^\ell - 1) \cong \prod_i \bar K[T]/(T - \zeta^i) \cong \bar K \times \ldots \times \bar K$, and so $G$ is étale.
Note that the étale situation here is that in which we obtained useful Galois representations earlier, namely the cyclotomic characters. This is clarified as follows (more details can be found in [37]).
Theorem 4.15 The category of étale group schemes over $K$ is equivalent to the category of finite groups with $G_K$ acting continuously.
Proof: Given an étale group scheme $G$, represented say by $\mathcal{R}$, an étale $K$-algebra, consider the group $G(\bar K) = \hom_{K\text{-alg}}(\mathcal{R}, \bar K)$. Since $\mathcal{R}$ is étale, $|\hom_{K\text{-alg}}(\mathcal{R}, \bar K)| = \dim_K \mathcal{R}$, so $G(\bar K)$ is a finite group. Given a $K$-algebra homomorphism $\mathcal{R} \to \bar K$ and $\sigma \in G_K$, the action is given by composing to get $\mathcal{R} \to \bar K \xrightarrow{\sigma} \bar K$. The images of $\mathcal{R}$ all lie in some finite Galois extension of $K$ and so the action is continuous.
Conversely, given a finite group $H$ with continuous $G_K$-action, consider first the case of transitive action, say $H = G_K h$. By continuity, choose a finite extension $L$ of $K$ such that the action of $G_K$ factors through $\mathrm{Gal}(L/K)$. Let $S$ be the subgroup fixing $h$ and $\mathcal{R} \subseteq L$ its fixed field. By Galois theory, all maps $\mathcal{R} \to \bar K$ map to $L$ and are conjugate, yielding a $G_K$-isomorphism $H \to \hom_{K\text{-alg}}(\mathcal{R}, \bar K)$ by sending $h$ to one of them. If the action is intransitive, we obtain for each orbit a ring $\mathcal{R}_i$ and then $\prod \mathcal{R}_i$ works.
QED
Definition 4.16 An $R$-algebra $A$ is called flat if tensoring with $A$ is an exact functor, i.e. if for any short exact sequence $0 \to M \to N \to L \to 0$ of $R$-modules, $0 \to M \otimes_R A \to N \otimes_R A \to L \otimes_R A \to 0$ is also exact. A finite group scheme over $\mathrm{Spec}\,R$ is called flat if its representing ring $A$ is a flat $R$-algebra.
If $R = \mathbb{Z}$[...] $GL_n(\mathbb{Z}/\ell^m)$. The above theorem associates to an étale group scheme $G$ over $\mathbb{Q}$[...]
[Diagram: a triangle of maps ending at $\mathrm{Spec}\,A$ and $\mathrm{Spec}\,R$.]
i.e. a morphism of affine schemes over $\mathrm{Spec}\,R$. The category of affine schemes over $\mathrm{Spec}\,R$ is hereby anti-equivalent to the category of $R$-algebras.
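Exercise: Let $\mathfrak{p} \in \mathrm{Spec}\,R$ and let $\kappa(\mathfrak{p}) = \mathrm{Frac}(R/\mathfrak{p})$. Show that $\kappa(\mathfrak{p})$ is an $R$-algebra, so that $\mathrm{Spec}\,\kappa(\mathfrak{p})$ embeds in $\mathrm{Spec}\,R$. Let $A$ be an $R$-algebra, so that $\mathrm{Spec}\,A$ is a cover of $\mathrm{Spec}\,R$. Show that its fibre over $\mathfrak{p}$ can be identified with $\mathrm{Spec}(A \otimes_R \kappa(\mathfrak{p}))$.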
A scheme then is a ringed space admitting a covering by open sets that are affine schemes. Morphisms of schemes are defined locally, i.e. $f: S' \to S$ is a morphism if there is a covering of $S$ by open, affine subsets $\mathrm{Spec}\,R_i$ such that $f^{-1}(\mathrm{Spec}\,R_i)$ is an affine scheme $\mathrm{Spec}\,R'_i$ and the restriction map $\mathrm{Spec}\,R'_i \to \mathrm{Spec}\,R_i$ is a morphism of affine schemes. We say that $S'$ is a scheme over $S$. In Grothendieck's approach, this relative notion is important rather than absolute questions about a scheme.
Questions about $f$ turn into questions about the ring maps $f_i: R_i \to R'_i$. In particular, we say that $f$ has property $(*)$ (for example, is finite or flat), if there is a covering of $S$ such that each of the ring maps $f_i$ has this property.
If $S$ is a scheme, then a group scheme over $S$ is a representable functor $F$ from the category of schemes over $S$ to the category of groups, i.e. there exists some scheme $\mathcal{S}$ over $S$ such that $F(X) = \hom_{S\text{-schemes}}(X, \mathcal{S})$. For example, an elliptic curve over $\mathbb{Q}$ is a (non-affine) group scheme over $\mathrm{Spec}\,\mathbb{Q}$.
4.4 Modular forms
Another source of Galois representations (in fact, which turns out to produce all that we are interested in) is modular forms. For this section, [28], [19], and [11] are recommended. Note that elliptic curves will ultimately correspond to modular forms of weight 2 and trivial Nebentypus, and so those forms will be highlighted.
Fix positive integers $k$, $N$ and a homomorphism $\varepsilon: (\mathbb{Z}/N)^\times \to$ [...]
$$\Gamma_0(N) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z}) \;\middle|\; c \equiv 0 \pmod N \right\}$$
and
$$\Gamma_1(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N) \;\middle|\; d \equiv 1 \pmod N \right\}.$$
Definition 4.17 A modular function of weight $k$, level $N$, and Nebentypus $\varepsilon$, is
(i) a meromorphic function on $H$,
(ii) satisfies $f|[\gamma]_k = \varepsilon(d) f$ for all $\gamma \in \Gamma_0(N)$,
(iii) is meromorphic at all cusps.
If the form is holomorphic on $H$ and at the cusps, then it is called a modular form. If it vanishes at the cusps, then it is called a cusp form. We now explain (iii).
Note that by (ii) $f(z + 1) = f(z)$. Thus, $f$ has a Fourier expansion in terms of $q = e^{2\pi i z}$, say $f = \sum a_n q^n$.
We say that $f$ is meromorphic (respectively, holomorphic, vanishes) at $\infty$ if $a_n = 0$ for all $n <$ some $n_0$ (respectively $n < 0$, $n \leq 0$). We say that $f$ is meromorphic (respectively, holomorphic, vanishes) at the cusps if $f|[\gamma]_k$ is meromorphic (respectively, holomorphic, vanishes) at $\infty$ for all $\gamma \in SL_2(\mathbb{Z})$.
For example, if $N = 1$, then $\Gamma_0(N) = \Gamma_1(N) = SL_2(\mathbb{Z})$. If $f$ is a modular form of this level, then its Nebentypus must be trivial. Moreover, $f|[\gamma]_k = f$ for all $\gamma \in SL_2(\mathbb{Z})$ and so the cusp condition only need be checked at $\infty$. In general, one has finitely many conditions to check, taking $\gamma$ running through the finitely many cosets of $\Gamma_0(N)$ in $SL_2(\mathbb{Z})$.
The set of modular forms (respectively cusp forms) of weight $k$, level $N$, and Nebentypus $\varepsilon$ will be denoted $M_k(N, \varepsilon)$ (respectively $S_k(N, \varepsilon)$). As will be shown later, these are finite-dimensional $\mathbb{C}$-vector spaces.
Exercise: Show that if $f, f'$ are modular functions of weights $k, k'$ respectively and level 1, then $ff'$, $f/f'$ are modular functions of weights $k + k'$, $k - k'$ respectively and level 1. Show that for $\lambda \in \mathbb{C}$, $\lambda f$ and, if $k = k'$, then $f + f'$ are modular functions of weight $k$, level 1.
Example: Let $\Lambda_\tau$ be the lattice in $\mathbb{C}$ generated by fundamental periods $1, \tau$, where $\tau \in H$. $G_{2k}(\tau) = G_{2k}(\Lambda_\tau) = \sum_{(m,n) \neq (0,0)} \frac{1}{(m\tau + n)^{2k}}$ (the Eisenstein series) is a modular form of weight $2k$ and level 1. (ii) follows since $G_{2k}(\lambda\Lambda) = \lambda^{-2k} G_{2k}(\Lambda)$ and (iii) follows since uniform convergence allows passage to limit term by term, the $m \neq 0$ terms giving 0, the $m = 0$ terms giving $\sum_{n \neq 0} n^{-2k} = 2\zeta(2k)$.
For this same $\Lambda_\tau$, $\Delta = g_2^3 - 27 g_3^2$ is therefore a modular form of weight 12 and level 1. Using the known values of $\zeta(4)$, $\zeta(6)$, we get its constant coefficient 0, and so it is a cusp form. In fact, looking deeper shows that $\Delta = (2\pi)^{12} q \prod_{n=1}^{\infty} (1 - q^n)^{24}$ [28]. Henceforth, we shall normalize $\Delta$ so that its first coefficient is 1.
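Both claims about $\Delta$ can be checked by direct computation. The sketch below is an illustration, not taken from the notes: it uses the standard values $\zeta(4) = \pi^4/90$ and $\zeta(6) = \pi^6/945$ to confirm that the constant term of $g_2^3 - 27 g_3^2$ vanishes (tracking only the rational multiples of powers of $\pi$), and it expands the normalized product $q \prod (1 - q^n)^{24}$. The function name `delta_coefficients` is hypothetical.

```python
from fractions import Fraction

# zeta(4) = pi^4/90 and zeta(6) = pi^6/945; keep only the rational factors.
zeta4, zeta6 = Fraction(1, 90), Fraction(1, 945)
g2_const = 60 * 2 * zeta4     # coefficient of pi^4 in the constant term of g_2
g3_const = 140 * 2 * zeta6    # coefficient of pi^6 in the constant term of g_3
delta_const = g2_const ** 3 - 27 * g3_const ** 2   # coefficient of pi^12

def delta_coefficients(n_terms):
    """Coefficients of q, q^2, ..., q^n_terms in q * prod_{n>=1} (1 - q^n)^24."""
    series = [0] * n_terms        # series[i] = coefficient of q^i in the product
    series[0] = 1
    for n in range(1, n_terms):
        for _ in range(24):       # multiply by (1 - q^n), 24 times
            for i in range(n_terms - 1, n - 1, -1):
                series[i] -= series[i - n]
    return series                 # after the shift by q, series[i] belongs to q^(i+1)

print(delta_const)
print(delta_coefficients(6))
```

The first coefficient of the product is 1, which is the normalization adopted in the text.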
Another important example is $j(\tau) := 1728 g_2^3/\Delta$, which is a modular function of weight 0 and level 1, and thus defines a map $j: SL_2(\mathbb{Z}) \backslash H \to \mathbb{C}$. Its Fourier expansion $\frac{1}{q} + 744 + 196884q + \ldots$ has fascinating connections with the Monster finite simple group.
The best way to think of modular forms is in terms of associated Riemann surfaces, called modular curves.
4.4.1 Riemann surfaces
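A surface is a topological space $S$ which is Hausdorff and connected such that there is an open cover $\{U_\alpha \mid \alpha \in A\}$ and homeomorphisms $\phi_\alpha$ of $U_\alpha$ to open sets $V_\alpha \subseteq \mathbb{C}$. Then $(U_\alpha, \phi_\alpha)$ is called a chart and the set of charts an atlas. If $U_\alpha \cap U_\beta \neq \emptyset$, then transition function $t_{\alpha\beta}$ [...] $(U_\alpha \cap U_\beta)$ [...] $(U_\alpha \cap U_\beta)$. Call $S$ a Riemann surface if all the $t_{\alpha\beta}$ [...] $= \mathbb{C}$, [...]$(z) = z$ and $U$[...] $= \mathbb{C} \cup \{\infty\} - \{0\}$, [...]$(z) = 1/z$ ($z \in \mathbb{C}$), [...] a compact Riemann surface, identifiable with the sphere.
(ii) If $\Lambda$ is a lattice in $\mathbb{C}$, then $\mathbb{C}/\Lambda$ is a compact Riemann surface.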
We now introduce another useful class of compact Riemann surfaces:
Definition 4.18 Let $H^* =$ [...] $X_i(N)$. These are compact Riemann surfaces and called modular curves. See [19] p.311 or [11] p.76 for more details.
Riemann surfaces form a category where the morphisms are analytic maps defined thus.
Definition 4.19 Call a continuous map $f: R \to S$ of Riemann surfaces analytic if the maps [...] is an elliptic curve), $j$ has a simple pole at $\infty$ and no others. Thus, invoking the exercise above, $j$ is 1-to-1. QED
Corollary 4.21 Every elliptic curve over $\mathbb{C}$ is of the form $E_\Lambda$, for some lattice $\Lambda$.
Proof: Given $y^2 = f(x)$ with $f \in \mathbb{C}[x]$ a cubic with distinct roots, we can, by change of variables, get $y^2 = 4x^3 - Ax - B$ for some $A, B \in \mathbb{C}$. The claim is that there exists $\Lambda$ such that $g_2(\Lambda) = A$, $g_3(\Lambda) = B$. From the definition of $j$ above, we can find $\tau$ such that $\frac{g_3^2}{g_2^3}$ takes any value other than $\frac{1}{27}$. Pick $\tau$ such that this equals $\frac{B^2}{A^3}$ ($\neq \frac{1}{27}$ since $f$ has distinct roots). Choose $\lambda$ such that $g_2(\tau) = \lambda^4 A$. Then $g_3(\tau)^2 = \lambda^{12} B^2$, so $g_3(\tau) = \pm\lambda^6 B$. If we have the negative sign, then replace $\lambda$ by $i\lambda$. Noting that Eisenstein series satisfy by definition $g_2(\lambda L) = \lambda^{-4} g_2(L)$, $g_3(\lambda L) = \lambda^{-6} g_3(L)$, we are done if we take $\Lambda = \lambda L$ where $L$ has basis $1, \tau$. QED
Note that by the last theorem, $j$ identifies $X_0(1)$ with $\mathbb{C} \cup \{\infty\}$. In fact:
Theorem 4.22 The modular functions of weight 0 and level 1 are precisely the rational functions of $j$, i.e. the function field $K(X_0(1)) = \mathbb{C}(j)$.
Proof: An earlier exercise showed that rational functions in $j$ are modular functions of that weight and level. Conversely, if $f$ is such a function, say with poles $\tau_i$, counted with multiplicity, then $g = f \prod_i (j(\tau) - j(\tau_i))$ is a modular function of weight 0 and level 1 with no poles in $H$. If $g$ has a pole of order $n$ at $\infty$, then there exists $c$ such that $g - c j^n$ has a pole of order $n - 1$ at $\infty$ (and no others). By induction, $g$ minus some polynomial in $j$ has no pole in $H$ [...]
$$\Delta_N := \left\{ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \text{ with } ad = N,\ d > 0,\ 0 \leq b < d,\ \gcd(a, b, d) = 1 \right\}$$
(of which there are $\psi(N) := N \prod_{p \mid N} (1 + \frac{1}{p})$ such matrices), the modular polynomial of order $N$ is $\Phi_N(x) = \prod_{i=1}^{\psi(N)} (x - j \circ \alpha_i)$. One root is $j_N := j \circ \alpha$, where $\alpha = \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}$ (i.e. $j_N(z) = j(Nz)$).
In fact, $\psi(N) = [SL_2(\mathbb{Z}) : \Gamma_0(N)]$ and so is the degree of the cover $X_0(N) \to X_0(1)$. The corresponding extension of function fields $K(X_0(1)) \subseteq K(X_0(N))$ is then also of degree $\psi(N)$. $K(X_0(N))$ turns out to be $K(X_0(1))(j_N)$:
Theorem 4.23 See [19], p. 336 on. $\Phi_N(x)$ has coefficients in $\mathbb{Z}[j]$ and is irreducible over $\mathbb{C}(j)$ (and so is the minimal polynomial of $j_N$ over $\mathbb{C}(j)$). The function field $K(X_0(N)) = \mathbb{C}(j, j_N)$.
This enables us to define $X_0(N)$, a priori a curve over $\mathbb{C}$, over $\mathbb{Q}$. This means that it can be given by equations over $\mathbb{Q}$. If $N > 3$, then one can further define a scheme over $\mathbb{Z}[1/N]$, so that base change via $\mathbb{Z}[1/N] \to \mathbb{Q}$ yields this curve (in other words, we have a model for $X_0(N)$ over $\mathbb{Q}$ with good reduction at primes not dividing $N$).
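Proof: First, we note that $j, j_N$ are indeed in $K(X_0(N))$, i.e. satisfy $f(\gamma z) = f(z)$ for all $\gamma \in \Gamma_0(N)$. This clearly holds for $j$ since $j$ is a modular function of weight 0 on all of $SL_2(\mathbb{Z})$. Since $j_N(\gamma z) = j \circ \alpha \circ \gamma(z) = j \circ (\alpha\gamma\alpha^{-1}) \circ \alpha(z)$ and $\alpha\gamma\alpha^{-1} = \begin{pmatrix} a & Nb \\ N^{-1}c & d \end{pmatrix} \in SL_2(\mathbb{Z})$, $j_N(\gamma z) = j \circ \alpha(z) = j_N(z)$. The condition at the cusps is easily checked.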
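Next, we note that $j \circ \alpha_i$ ($1 \leq i \leq \psi(N)$) are distinct functions on $H$. If $\alpha_i = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$, then $j \circ \alpha_i(z) = j(\frac{az + b}{d}) = \frac{1}{q_d^a \zeta_d^b} + \ldots$, where $q_d = e^{2\pi i z/d}$, $\zeta_d = e^{2\pi i/d}$. If $j \circ \alpha_i = j \circ \alpha_{i'}$, take the quotient and let $\mathrm{Im}\, z \gg 0$ to get $q_d^a \zeta_d^b = q_{d'}^{a'} \zeta_{d'}^{b'}$. So $a/d = a'/d'$, but since $ad = N = a'd'$ and all are positive, $a = a'$, $d = d'$, whence $b = b'$ too.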
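Next, we show the properties of $\Phi_N$. If $\gamma \in SL_2(\mathbb{Z})$, then we check that $\alpha_i \gamma = \beta \alpha_k$ for some $k$ and $\beta \in SL_2(\mathbb{Z})$. Thus $j \circ \alpha_i \circ \gamma = j \circ \beta \circ \alpha_k = j \circ \alpha_k$, whence $\gamma$ permutes the roots of $\Phi_N$, and so its coefficients are invariant under $SL_2(\mathbb{Z})$, hence in $\mathbb{C}(j)$ (meromorphic since polynomials in the $j \circ \alpha_i$). In fact, one easily computes that $SL_2(\mathbb{Z})$ acts transitively on the roots of $\Phi_N$, whence $\Phi_N$ is irreducible over $\mathbb{C}(j)$. To show its coefficients lie in $\mathbb{Z}[j]$, we see that they lie in $\mathbb{Z}[\zeta_N]$ since $d \mid N$. The automorphism $\zeta_N \mapsto \zeta_N^r$ (any $r$ coprime to $N$) permutes the $j \circ \alpha_i$, and so the coefficients lie in $\mathbb{Q} \cap \mathbb{Z}[\zeta_N] = \mathbb{Z}$.
QED
These polynomials $\Phi_N$ tend to have huge coefficients, but at least they define $X_0(N)$ as a curve over $\mathbb{Q}$.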
If $X$ is a compact Riemann surface of genus $g$ and $W = \Omega_{hol}(X)$ its holomorphic differentials, then $W$ and so $V = \hom(W, \mathbb{C})$ are $g$-dimensional $\mathbb{C}$-vector spaces.
Exercise: Show that there is an isomorphism of $\mathbb{C}$-vector spaces given by $\Omega_{hol}(X_0(N)) \to S_2(N)$, $f(z)dz \mapsto f(z)$. (Hint: if $f(z)$ is such a cusp form and $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N)$, then $f(\gamma z)\,d(\gamma z) = (cz + d)^2 f(z) (cz + d)^{-2}\, dz = f(z)\,dz$. The holomorphicity of $f(z)dz$ corresponds to $f$ being a cusp form since $2\pi i\, dz = dq/q$.)
This shows that the dimension of $S_2(N)$ is the genus of $X_0(N)$ (in particular finite), computable by Riemann-Roch (explicitly given in [19]). Likewise, the dimension of $S_2(N, \varepsilon)$ is bounded by the genus of $X_1(N)$, and so is finite. There are similar interpretations of $S_k(N)$ for higher $k$.
Let $C_1, \ldots, C_{2g}$ denote the usual $2g$ cycles on the $g$-handled $X$, generating the free abelian group $H_1(X, \mathbb{Z})$. Let $\Lambda = \mathrm{Im}(H_1(X, \mathbb{Z}) \to \hom(W, \mathbb{C}))$ where the map is $C \mapsto (\omega \mapsto \int_C \omega)$, a discrete subgroup of rank $2g$ called the period lattice of $X$. The Jacobian of $X$, $\mathrm{Jac}(X)$, is the $g$-dimensional complex torus $\mathbb{C}^g/\Lambda$, isomorphic as a group to $(\mathbb{R}/\mathbb{Z})^{2g}$. This is an example of an abelian variety over $\mathbb{C}$.
The Abel map is $X \to \mathrm{Jac}(X)$, $x \mapsto (\int_{x_0}^{x} \omega_j)_j$ (for some fixed $x_0$ whose choice doesn't matter since the image is defined up to a period in $\Lambda$). In the case $g = 1$, this is bijective.
Let $\mathrm{Div}(X)$ be the free abelian group on the points of $X$ and $\mathrm{Div}^0(X)$ its subgroup consisting of the elements whose coefficients sum to 0. The Abel map extends to a group homomorphism $\mathrm{Div}(X) \to \mathrm{Jac}(X)$, whose restriction to $\mathrm{Div}^0(X)$ is surjective with kernel the so-called principal divisors $\mathcal{P}(X)$.
Since $\mathrm{Div}^0(X_0(N))/\mathcal{P}(X_0(N))$ makes sense over $\mathbb{Q}$, this defines $J_0(N) := \mathrm{Jac}(X_0(N))$ over $\mathbb{Q}$ (in fact it is a coarse moduli scheme over $\mathbb{Z}[1/N]$ if $N > 3$). Galois action on its division points then produces Galois representations $G_{\mathbb{Q}} \to GL_{2g}(\mathbb{Z}/n)$. From this we obtain 2-dimensional Galois representations associated to certain modular forms, defined below.
Definition 4.24 If $f$ is a cusp form of weight $k$, level $N$, and Nebentypus $\varepsilon$, define the $m$th Hecke operator by
$$T_m f = m^{(k/2) - 1} \sum_j f|[\alpha_j]_k,$$
where if $N = 1$, the $\alpha_j$ run through $\Delta_n$, and for general $N$ through the matrices in $\Delta_n$ with $\gcd(a, N) = 1$.
Exercise: If $f$ has $q$-expansion at $\infty$, $\sum_{n=0}^{\infty} a_n q^n$, then $T_m f$ has $q$-expansion $\sum_{n=0}^{\infty} b_n q^n$, where $b_n = \sum_{d \mid (m,n)} \varepsilon(d) d^{k-1} a_{mn/d^2}$ (taking $\varepsilon(d) = 0$ if $(d, N) \neq 1$).
Show that $T_m f$ is again a cusp form of weight $k$, level $N$, and Nebentypus $\varepsilon$ such that $T_m$ is a linear operator on $S_{k,\varepsilon}(N)$. Furthermore, check the Hecke operators commute.
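The coefficient formula in this exercise can be tried out on the weight 12, level 1 cusp form $\Delta = q \prod (1 - q^n)^{24}$, where the Nebentypus is trivial. The sketch below is an illustration under those assumptions (the names `tau_list` and `hecke_coefficient` are hypothetical): it computes $b_n$ from the formula for $m = 2$ and $m = 3$ and checks that $b_n = a_m a_n$, i.e. that $T_m \Delta = a_m \Delta$ on the coefficients computed.

```python
from math import gcd

def tau_list(n_terms):
    """tau[n] for 1 <= n <= n_terms, where Delta = sum tau[n] q^n; tau[0] is 0."""
    series = [0] * n_terms        # coefficients of prod (1 - q^n)^24
    series[0] = 1
    for n in range(1, n_terms):
        for _ in range(24):       # multiply by (1 - q^n), 24 times
            for i in range(n_terms - 1, n - 1, -1):
                series[i] -= series[i - n]
    return [0] + series           # shift by q

def hecke_coefficient(a, m, n, k):
    """b_n = sum over d | gcd(m, n) of d^(k-1) * a[m*n/d^2] (level 1, trivial character)."""
    g = gcd(m, n)
    return sum(d ** (k - 1) * a[m * n // (d * d)]
               for d in range(1, g + 1) if g % d == 0)

tau = tau_list(40)
for m in (2, 3):
    for n in range(1, 11):
        # Delta is an eigenform: the n-th coefficient of T_m(Delta) is tau(m) * tau(n)
        assert hecke_coefficient(tau, m, n, 12) == tau[m] * tau[n]
print(tau[1:7])
```

For instance, with $m = n = 2$ the formula gives $a_4 + 2^{11} a_1 = -1472 + 2048 = 576 = (-24)^2$.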
Definition 4.25 If $T_m f = \lambda_m f$ (some $\lambda_m \in \mathbb{C}$) for all $m$, then $f$ is called a cuspidal eigenform. Actually, $a_m = \lambda_m a_1$ and we shall normalize eigenforms so that $a_1 = 1$ and so $\lambda_m = a_m$. The commutative ring the Hecke operators generate, $T$, is called the Hecke algebra.
For example, since $S_{12,1}(1)$ has dimension 1, basis $\Delta$, $\Delta$ is a cuspidal eigenform. A very useful corollary to the definition follows:
Corollary 4.26 If $f = \sum a_n q^n$ is a cuspidal eigenform, the map $T_m \mapsto a_m$ extends to a ring homomorphism (eigencharacter) $\theta: T \to \mathbb{C}$.
Consider the (injective) map $S_k(N) \to \mathbb{C}[[q]]$ that sends a cusp form to its Fourier expansion at $\infty$. Then the inverse image of $\mathbb{Z}[[q]]$ is $S_k(N; \mathbb{Z})$ and $S_k(N; A) = S_k(N; \mathbb{Z}) \otimes_{\mathbb{Z}} A$. Note that using the explicit coefficients of $T_m$, $T$ acts on $S_k(N; \mathbb{Z})$. The $q$-expansion principle says that $S_k(N; \mathbb{C}) = S_k(N)$, i.e. $S_k(N)$ has a basis in $S_k(N; \mathbb{Z})$, and so $T$ embeds in $\mathrm{End}\, S_k(N; \mathbb{Z})$. This has various consequences:
Theorem 4.27 $T$ is a finite free $\mathbb{Z}$-algebra. If $f$ is a cuspidal eigenform, then there exists an algebraic number ring $O_f$ containing all coefficients of $f$ and the image of $\varepsilon$. (Note that the image of the eigencharacter above lies in $O_f$.)
If $A$ is a ring, we let $S_k(N; A)$ denote the cusp forms of level $N$ and weight $k$ defined over $A$. Note that
$$S_k(N; A) = \hom_{A\text{-alg}}(T, A). \quad (*)$$
This follows for $A = \mathbb{Z}$ by mapping $S_k(N; \mathbb{Z}) \to \hom(T, \mathbb{Z})$ by $f \mapsto (t \mapsto a_1(tf))$ and then noting that this map is injective and that $T$ is free of rank that of $S_k(N; \mathbb{Z})$. The general case follows by tensoring with $A$.
Note that the Hecke operators $T_m$ also act on $\mathrm{Div}^0(X_0(N))$ (and so $J_0(N)$) by extending linearly $T_m[z] = \sum [\alpha_i z]$, where $[z]$ is the orbit of $z \in H$ and $\alpha_i$ runs (again) through all matrices $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ with $ad = n$, $d > 0$, $\gcd(a, N) = 1$, $0 \leq b < d$. Let $I_p$ denote the image of inertia under $G_{\mathbb{Q}_p} \to G_{\mathbb{Q}}$, $D_p$ the image of $G_{\mathbb{Q}_p}$, and $\mathrm{Fr}_p$ the element of $D_p/I_p$ mapping to the Frobenius map in $G_{\mathbb{F}_p}$. For a prime ideal $\lambda$ of $O_f$, denote the fraction field of $(O_f)_\lambda$ by $K_\lambda$.
Theorem 4.28 Let $f = \sum a_n q^n$ be a cuspidal eigenform of weight $k$, level $N$, and Nebentypus $\varepsilon$. For each prime $\ell$, there exists a unique semisimple continuous homomorphism $\rho: G_{\mathbb{Q}} \to GL_2(K_\lambda$ [...] $\cong (\mathbb{Z}/\ell^n)^{2g}$ denote the kernel of multiplication by $\ell^n$ on $J_0(N)$ and $T_\ell(J_0(N)) = \varprojlim J_0(N)[\ell^n] \cong \mathbb{Z}_\ell^{2g}$ [...]-module of rank $2g$ on which $G_{\mathbb{Q}}$ acts (the argument for elliptic curves carries over to $g$-dimensional tori), where $g$ is the genus of $X_0(N)$. The action of $T_m$ on $J_0(N)$ given above carries over to $T_\ell(J_0(N))$. Then $W := T_\ell(J_0(N)) \otimes_{\mathbb{Z}_\ell} \mathbb{Q}_\ell$ is a $T \otimes_{\mathbb{Z}} \mathbb{Q}_\ell$-module, in fact free of rank 2. (The proof of this comes from the Hodge decomposition $S_2(N) \oplus \overline{S_2(N)} = H^1(X, \mathbb{C})$, where $\overline{S_2(N)}$ gives the anti-holomorphic differentials, and the fact that if $A$ is a field of characteristic 0, then $S_k(N; A)$ is free of rank 1 over $T \otimes_{\mathbb{Z}} A$ by $(*)$ above.) This yields a representation $G_{\mathbb{Q}} \to GL_2(T \otimes_{\mathbb{Z}} \mathbb{Q}_\ell)$. Note that the particular level $N$ eigenform has not been used yet.
The eigencharacter $T \to O_f$ now induces a homomorphism $T \otimes_{\mathbb{Z}} \mathbb{Q}_\ell \to O_f \otimes_{\mathbb{Z}} \mathbb{Q}_\ell$. Since $O_f$ is an order in a number field, $O_f \otimes_{\mathbb{Z}} \mathbb{Q}_\ell$ [...].
(Hint: $\mathbb{Z}[x]/(f(x)) \otimes_{\mathbb{Z}} \mathbb{Q}_\ell \cong \mathbb{Q}_\ell[x]/(f(x))$.)
For cases with $k > 2$, Deligne [7] used higher symmetric powers of the Tate module. For $k = 1$, Deligne and Serre [8] built the representation from congruent eigenforms of higher weight.
Exercise: Let $\rho: G_{\mathbb{Q}} \to GL_n(K)$ be a continuous homomorphism where $K$ is a local field with valuation ring $O$. Considering $K^n$ thus as a $G_{\mathbb{Q}}$-module, show that there is a stable lattice $O^n$ under this action.
Letting $O$ be the valuation ring of $K_\lambda$ [...] $GL_2(O) \to GL_2(F)$ produced above.
The Big Picture. We have three sources of naturally occurring Galois representations, namely elliptic curves, modular forms, and étale group schemes. The most general is the last, and these will be used next to define the general class of semistable representations we shall focus on. We shall see that this class contains the representations coming from an elliptic curve associated by Frey to a putative counterexample to Fermat's Last Theorem. The first thing is to create links between the various kinds of representations defined.
5
Invariants of Galois representations,
semistable representations
Let $F$ be a finite field of characteristic $\ell > 2$ and $V$ a 2-dimensional $F$-vector space on which $G_{\mathbb{Q}}$ acts continuously. Thus we have a representation $\bar\rho: G_{\mathbb{Q}} \to GL_2(F)$. By composing with the homomorphism $G_{\mathbb{Q}_\ell} \to G_{\mathbb{Q}}$, $V$ has a $G_{\mathbb{Q}_\ell}$-action and so defines an étale group scheme over $\mathbb{Q}_\ell$.
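Definition 5.1 Call $\bar\rho$ good at $\ell$ if this group scheme comes via base change from a finite flat group scheme over $\mathbb{Z}_\ell$ and also $\det \bar\rho|_{I_\ell} = \chi|_{I_\ell}$. Call $\bar\rho$ ordinary at $\ell$ if $\bar\rho|_{I_\ell}$ can be written $\begin{pmatrix} \chi|_{I_\ell} & * \\ 0 & 1 \end{pmatrix}$. Call $\bar\rho$ semistable at $\ell$ if it is good or ordinary at $\ell$.
Let prime $p \neq \ell$. Call $\bar\rho$ good at $p$ if $\bar\rho$ is unramified at $p$, i.e. $\bar\rho(I_p) = 1$. Call $\bar\rho$ semistable at $p$ if $\bar\rho(I_p)$ is unipotent (i.e. all its eigenvalues are 1). Call $\bar\rho$ semistable if it is semistable at all primes.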
5.1 Serre Invariants
Given $\bar\rho$ as above, we want to define positive integers $N(\bar\rho)$, $k(\bar\rho)$, and a group homomorphism $\varepsilon(\bar\rho): (\mathbb{Z}/N(\bar\rho))^\times \to$ [...]. Suppose $\bar\rho$ is odd, i.e. $\det \bar\rho(c) = -1$, where $c$ is complex conjugation, and that $\bar\rho$ is absolutely irreducible, i.e. $V \otimes_F \bar F$ has no $G_{\mathbb{Q}}$-invariant proper subspace. We shall consider the following two conjectures of Serre [30]:
Conjecture 5.2 (Strong conjecture) There exists a cuspidal eigenform $f$ of weight $k(\bar\rho)$, level $N(\bar\rho)$, and Nebentypus $\varepsilon(\bar\rho)$ whose associated residual representation is $\bar\rho$.
Conjecture 5.3 (Weak conjecture) There exists a cuspidal eigenform whose associated residual representation is $\bar\rho$.
Work of Kenneth Ribet and others [24],[10] established that the weak conjecture implies the strong conjecture. $N(\bar\rho)$ comes from the local representations $\bar\rho_p$ for $p \neq \ell$ and $k(\bar\rho)$ from $\bar\rho_\ell$. We begin with the Artin conductor $N(\bar\rho)$.
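Consider $\bar\rho_p : G_{\mathbb{Q}_p} \to GL_2(F)$ ($p \neq \ell$). Let $\mathcal{G}_i$ denote the (finite) image of the $i$th ramification subgroup (using upper numbering, in fact, [29]; note that $\mathcal{G}_0 = G_0$ and $\mathcal{G}_1 = G_1$). Let $V_i$ denote the subspace of $V$ fixed by $\mathcal{G}_i$, and set
$$n(p, \bar\rho) = \sum_{i=0}^{\infty} \frac{\dim(V/V_i)}{|\mathcal{G}_0/\mathcal{G}_i|}.$$
It is a deep theorem [29] that $n(p, \bar\rho)$ is a non-negative integer. More easily, note that
(i) $n(p, \bar\rho) = 0 \iff V = V_0 \iff \bar\rho_p$ unramified ($\iff \bar\rho(\mathcal{G}_0) = \{1\}$);
(ii) $n(p, \bar\rho) = \dim(V/V_0) \iff V = V_1 \iff \bar\rho_p$ tamely ramified ($\iff \bar\rho(\mathcal{G}_1) = \{1\}$).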
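The formula for $n(p, \bar\rho)$ can be evaluated mechanically once the fixed subspaces and the orders of the ramification images are known. The following is a minimal sketch, not taken from the text: the function name `conductor_exponent` and its list-based inputs are hypothetical, and the sample data are illustrative rather than attached to a particular representation.

```python
from fractions import Fraction

def conductor_exponent(dim_v, fixed_dims, group_orders):
    """Evaluate n(p, rho) = sum_i dim(V / V_i) / |G_0 / G_i|.

    dim_v           -- dimension of V
    fixed_dims[i]   -- dimension of V_i, the subspace fixed by the i-th group
    group_orders[i] -- order of the image of the i-th ramification group
    Terms with trivial group contribute 0, so the lists may stop there.
    """
    total = Fraction(0)
    for fixed, order in zip(fixed_dims, group_orders):
        # dim(V/V_i) divided by the index |G_0/G_i| = |G_0| / |G_i|
        total += Fraction(dim_v - fixed) * Fraction(order, group_orders[0])
    return total

# (i) unramified: inertia acts trivially, so V = V_0 and the sum is 0
unramified = conductor_exponent(2, [2], [1])

# (ii) tamely ramified with V_0 a line: only the i = 0 term survives
tame = conductor_exponent(2, [1, 2], [3, 1])
```

Here `unramified` and `tame` reproduce cases (i) and (ii): the first is 0 and the second equals $\dim(V/V_0) = 1$.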
Definition 5.4 The Artin conductor of $\bar\rho$ is defined by
$$N(\bar\rho) = \prod_{p \neq \ell} p^{n(p, \bar\rho)},$$
well-defined by (i) above, since only finitely many primes ramify under $\bar\rho$.
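The character $\det\bar\rho : G_{\mathbb{Q}} \to F^\times$ factors through $(\mathbb{Z}/\ell N(\bar\rho))^\times \cong (\mathbb{Z}/\ell)^\times \times (\mathbb{Z}/N(\bar\rho))^\times$; restrict it to the second factor. Lifting this to $\bar{\mathbb{Z}} \subset \mathbb{C}$ gives $\epsilon : (\mathbb{Z}/N(\bar\rho))^\times \to \mathbb{C}^\times$. This defines $\epsilon(\bar\rho)$.
We would like to have $\det\bar\rho = \epsilon(\bar\rho)\chi^{k(\bar\rho)-1}$, and this determines $k(\bar\rho) \pmod{\ell - 1}$. By the result at the end of chapter 3, wild inertia at $\ell$ acts trivially on the semisimplification of $\bar\rho_\ell$. Thus the restriction of this semisimplification to $I_\ell$, viewed as a map $I_\ell \to GL_2(\bar F)$, has cyclic image of order prime to $\ell$, and so is diagonalizable. Let $\phi, \phi' : I_\ell \to \bar F^\times$ be the two characters this produces. The exact description of $k(\bar\rho)$ is quite complicated but just depends on what $\phi, \phi'$ are in terms of fundamental characters (see chapter 3) [30].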
Theorem 5.5 $\bar\rho$ is semistable if and only if $N(\bar\rho)$ is squarefree, $\epsilon(\bar\rho)$ is trivial, and $k(\bar\rho) = 2$ or $\ell + 1$.
Proof: The form of $\epsilon(\bar\rho)$ and of $k(\bar\rho) \pmod{\ell - 1}$ corresponds to $\det\bar\rho$ being exactly the cyclotomic character. The exact form of $k(\bar\rho)$ requires Serre's prescription above. As for $N(\bar\rho)$ being squarefree, $n(p, \bar\rho) = 1$ if and only if $V_0$ is of dimension 1 and $V_1 = V$, so if and only if the representation is ordinary at $p$. QED
5.2 Fontaine-Mazur Conjecture
The notions of good and semistable extend to representations to rings other than fields. Suppose $R$ is a complete, Noetherian local ring with residue field finite of characteristic $\ell$. Then $R$ is a $\mathbb{Z}_\ell$-algebra. Let $\rho : G_{\mathbb{Q}} \to GL_2(R)$ be a continuous homomorphism, giving a $G_{\mathbb{Q}}$-action on $V := R^2$.
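Definition 5.6 If $|R| < \infty$, then call $\rho$ good at $\ell$ if the corresponding étale group scheme over $\mathbb{Q}_\ell$ [...] and $\det\rho|_{I_\ell} = \chi|_{I_\ell}$ [...], such that $I_\ell$ acts trivially on $V^0$ and as $\chi$ on $V^{-1}$ so that
$$\rho|_{I_\ell} = \begin{pmatrix} \chi|_{I_\ell} & * \\ 0 & 1 \end{pmatrix}.$$
Call $\rho$ semistable at $\ell$ if $\rho$ is good or ordinary at $\ell$ and $\det\rho|_{I_\ell} = \chi|_{I_\ell}$.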
The following is a special case of the major conjecture of Fontaine and Mazur [12] that says that every potentially semistable $\ell$-adic Galois representation arises from the Galois action on some subquotient of an $\ell$-adic cohomology group of a variety over $\mathbb{Q}$.
Conjecture 5.7 (Fontaine-Mazur) If $O$ is the valuation ring of a finite extension $K$ of $\mathbb{Q}_\ell$, and $\rho : G_{\mathbb{Q}} \to GL_2(O)$ a continuous, absolutely irreducible homomorphism, unramified at all but finitely many primes, and semistable at $\ell$, then there exists a cuspidal eigenform $f$ whose associated Galois representation is $\rho$.
Wiles' work amounts to establishing some special, nontrivial cases of this conjecture.
5.3 Reduction of elliptic curves
Our goals now are to introduce Frey's elliptic curves associated to a putative counterexample to Fermat's Last Theorem, to define semistable elliptic curves and show that Frey's curves are semistable, and to show that the Galois representations attached to semistable elliptic curves are semistable. First, we need a little on reduction of elliptic curves.
A Weierstrass model for an elliptic curve $E$ over $\mathbb{Q}$ is an equation of the form
$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6.$$
By completing square and cube it can be written $y^2 = x^3 - 27c_4x - 54c_6$, where $c_4 = b_2^2 - 24b_4$, $c_6 = -b_2^3 + 36b_2b_4 - 216b_6$ and $b_2 = a_1^2 + 4a_2$, $b_4 = 2a_4 + a_1a_3$, $b_6 = a_3^2 + 4a_6$. This curve has discriminant $\Delta = (c_4^3 - c_6^2)/1728$.
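The quantities $b_2, b_4, b_6, c_4, c_6, \Delta$ are easy to compute from the coefficients $a_i$. The following sketch does so with exact rational arithmetic; the function name `weierstrass_invariants` is our own choice for illustration and is not part of any library.

```python
from fractions import Fraction

def weierstrass_invariants(a1, a2, a3, a4, a6):
    """Return b2, b4, b6, c4, c6 and the discriminant of
    y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    a1, a2, a3, a4, a6 = (Fraction(a) for a in (a1, a2, a3, a4, a6))
    b2 = a1**2 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3**2 + 4 * a6
    c4 = b2**2 - 24 * b4
    c6 = -b2**3 + 36 * b2 * b4 - 216 * b6
    disc = (c4**3 - c6**2) / 1728
    return {"b2": b2, "b4": b4, "b6": b6, "c4": c4, "c6": c6, "disc": disc}

# Two models y^2 = x^3 - 1 and y^2 = x^3 - 64 of the same curve over Q
small = weierstrass_invariants(0, 0, 0, 0, -1)
large = weierstrass_invariants(0, 0, 0, 0, -64)
```

For these two models the discriminants are $-432$ and $-432 \cdot 2^{12}$, so the same curve can carry quite different discriminants depending on the chosen equation.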
Definition 5.8 We say that $E$ has good reduction over $\mathbb{Q}_p$ (or at $p$) if there is a Weierstrass equation for $E$ with $a_i \in \mathbb{Z}_p$ that (when the coefficients are reduced modulo $p$) defines an elliptic curve over $\mathbb{F}_p$ (which happens if and only if the corresponding $\Delta \notin p\mathbb{Z}_p$).
We have to be careful - for example, $y^2 = x^3 - 1$ and $y^2 = x^3 - 64$ define the same elliptic curve over $\mathbb{Q}$ but with the reductions mod 2 being nonsingular and having a cusp. Another example is the Frey elliptic curve with equation $y^2 = x(x - A)(x + B)$, where $A$, $B$, $C := -A - B$ are nonzero relatively prime integers. We shall focus particularly on the case $A = a^\ell$, $B = b^\ell$, $C = c^\ell$, with $\ell \geq 5$ a prime and $\gcd(a, b, c) = 1$ (i.e. a counterexample to (FLT)$_\ell$ [...] situation, however, we may assume, by relabeling, that $A \equiv -1 \pmod 4$ and $B \equiv 0 \pmod{32}$ and then changing variables by $x \mapsto 4x$, $y \mapsto 8y + 4x$ yields equation
$$y^2 + xy = x^3 + \frac{B - A - 1}{4}x^2 - \frac{AB}{16}x.$$
For this equation, $\Delta = 2^{-8}(ABC)^2$. Moreover, $c_4 = A^2 + AB + B^2$, which is checked to be relatively prime to $abc$, i.e. $v_p(c_4) = 0$ for all primes $p$ of bad reduction. This is what is called a minimal Weierstrass model for the curve, in that $|\Delta|$ is now as small as possible, say $\Delta_{min}$.
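The stated values of $\Delta$ and $c_4$ for the transformed Frey equation can be checked numerically. The sketch below assumes the sign convention $C = -A - B$ and uses sample integers $A \equiv -1 \pmod 4$, $B \equiv 0 \pmod{32}$ that are not $\ell$-th powers; the function name `frey_invariants` is hypothetical.

```python
from fractions import Fraction

def frey_invariants(A, B):
    """c4 and discriminant of y^2 + xy = x^3 + ((B-A-1)/4) x^2 - (AB/16) x."""
    a1, a2, a3 = Fraction(1), Fraction(B - A - 1, 4), Fraction(0)
    a4, a6 = Fraction(-A * B, 16), Fraction(0)
    b2 = a1**2 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3**2 + 4 * a6
    c4 = b2**2 - 24 * b4
    c6 = -b2**3 + 36 * b2 * b4 - 216 * b6
    disc = (c4**3 - c6**2) / 1728
    return c4, disc

A, B = 3, 32            # illustrative values only
C = -A - B              # assumed convention, so that A + B + C = 0
c4, disc = frey_invariants(A, B)
expected_disc = Fraction((A * B * C) ** 2, 2**8)
expected_c4 = A * A + A * B + B * B
```

With these sample values both comparisons hold: `c4 == expected_c4` and `disc == expected_disc`.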
Exercise: Considering how changes of variable of the form $x \mapsto u^2x$, $y \mapsto u^3y$ affect $\Delta$, $c_4$, $c_6$, show that a Weierstrass model is minimal if $v_p(\Delta) < 12$ or $v_p(c_4) < 4$ or $v_p(c_6) < 6$ for every prime $p$. (Here $v_p$ denotes $p$-adic valuation.) Show that changes of variable do not change the $j$-invariant of the equation. Deduce that two elliptic curves are isomorphic over an algebraically closed field if and only if they have the same $j$-invariant.
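As a starting point for the exercise, here is how the invariants scale. We assume the standard normalization $j = c_4^3/\Delta$, and write the change of variable as $x = u^2x'$, $y = u^3y'$; dividing the Weierstrass equation by $u^6$ gives new coefficients $a_i' = a_i/u^i$, whence

```latex
\begin{aligned}
b_2' &= u^{-2} b_2, \quad b_4' = u^{-4} b_4, \quad b_6' = u^{-6} b_6, \\
c_4' &= u^{-4} c_4, \quad c_6' = u^{-6} c_6, \\
\Delta' &= \frac{c_4'^3 - c_6'^2}{1728} = u^{-12}\,\Delta, \\
j' &= \frac{c_4'^3}{\Delta'} = \frac{u^{-12} c_4^3}{u^{-12} \Delta} = j .
\end{aligned}
```

In particular, an integral change of variable can lower $v_p(\Delta)$ only in steps of 12, $v_p(c_4)$ in steps of 4, and $v_p(c_6)$ in steps of 6.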
So, for our example, $v_p(c_4) = 0$ gives that the model is minimal. Thus, if $p \mid abc$, the Frey curve necessarily has bad reduction at $p$. In fact, its reduction mod $p$ has a node.
Definition 5.9 Suppose $p$ is a prime of bad reduction for $E$. $E$ mod $p$ has a unique singular point. If that point is a node (respectively cusp), then we say $E$ has multiplicative (respectively additive) reduction at $p$. A node (respectively cusp) means that two (respectively three) of the roots of the cubic are equal. Say $E$ has semistable reduction at $p$ if it has good or multiplicative reduction at $p$. Call $E$ semistable if its reduction at all primes is semistable.
Multiplicative reduction occurs at $p$ if and only if $v_p(c_4) = 0$, $v_p(\Delta) > 0$. (We see this directly for the Frey curves for odd primes $p$ since if $p$ divides $a$, it does not divide $b$ and so only two roots of $x(x - a^\ell)(x + b^\ell$ [...] given by $x \mapsto e^{2\pi i x}$. Setting $q = e^{2\pi i \tau}$, this yields a group isomorphism $\mathbb{C}/\Lambda_\tau \to \mathbb{C}^\times/q^{\mathbb{Z}}$, where $q^{\mathbb{Z}}$ is the infinite cyclic group with generator $q$. We thus have a multiplicative representation of $E_\tau(\mathbb{C})$.
Explicitly, in terms of $q$, setting $\frac{1}{(2\pi i)^2}x = X + \frac{1}{12}$ and $\frac{1}{(2\pi i)^3}y = 2Y + X$, the equation for $E_\tau$ becomes:
$$y^2 + xy = x^3 + a_4(q)x + a_6(q), \quad a_4(q) = -5s_3(q), \quad a_6(q) = -(5s_3(q) + 7s_5(q))/12,$$
and $s_k(q) = \sum_{n=1}^{\infty} \frac{n^k q^n}{1 - q^n}$. Note that $a_4(q), a_6(q) \in \mathbb{Z}[[q]]$, so this defines an elliptic curve over $\mathbb{Z}[[q]]$ since the discriminant of this curve is $q\prod_{n=1}^{\infty}(1 - q^n)^{24}$ and we get its $j$-invariant as $\frac{1}{q} + 744 + 196884q + \ldots$ $(*)$, as usual. Tate observed that much of this carries over from $\mathbb{C}$ to $p$-adic local fields:
Theorem 5.10 (Tate) Let $K/\mathbb{Q}_p$ be a finite extension with valuation $v$ and absolute value $|u| = c^{v(u)}$ for some $0 < c < 1$. Suppose $q \in K$ [...] $/q^{\mathbb{Z}} \cong E_q(L)$.
Since $|q| < 1$, the reduction of the curve is $y^2 + xy = x^3$, which has bad multiplicative reduction (in fact split - meaning that the tangent lines at the node are defined over the residue field).
Conversely, if $E$ has multiplicative reduction at $p$, then its $j$-invariant has $v_p(j) < 0$ and the series $q = j^{-1} + 744j^{-2} + 750420j^{-3} + \ldots$, inverting $(*)$, converges to give $q$ such that $v_p(q) = -v_p(j) > 0$, so $|q| < 1$. The corresponding Tate elliptic curve $E_q$ is isomorphic over $\bar{\mathbb{Q}}_p$ (in fact over a quadratic extension of $\mathbb{Q}_p$) to $E$ since they have the same $j$-invariant. The division points of $E_q$ are particularly simple to study, whence the action of $G_{\mathbb{Q}_p}$ on them is easily given.
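The coefficients 744 and 750420 of the inverse series can be recovered from the expansion $j = q^{-1} + 744 + 196884q + \ldots$ by formal power series reversion. The following is a small sketch using exact integer arithmetic; the helper names `mul`, `inverse_unit`, and `revert` are our own, and only the three $j$-coefficients quoted in the text are used, so only the first three terms of the inverse are produced.

```python
def mul(a, b, n):
    """Multiply two power series (coefficient lists) modulo x^n."""
    out = [0] * n
    for i, ai in enumerate(a[:n]):
        for k, bk in enumerate(b[: n - i]):
            out[i + k] += ai * bk
    return out

def inverse_unit(a, n):
    """Inverse of a power series with constant term 1, modulo x^n."""
    inv = [1] + [0] * (n - 1)
    for k in range(1, n):
        inv[k] = -sum(a[i] * inv[k - i] for i in range(1, k + 1) if i < len(a))
    return inv

def revert(t, n):
    """Given t = q + t[2] q^2 + ... return q as a series in t, modulo t^n."""
    q = [0, 1] + [0] * (n - 2)
    for _ in range(n):                      # each pass fixes one more coefficient
        new = [0, 1] + [0] * (n - 2)        # start from the variable t itself
        power = q[:]
        for k in range(2, n):
            power = mul(power, q, n)        # power = q^k as a series in t
            for i in range(n):
                new[i] -= t[k] * power[i]   # q = t - sum_{k>=2} t[k] q^k
        q = new
    return q

# q * j(q) = 1 + 744 q + 196884 q^2 + ...   (coefficients quoted in the text)
J_TIMES_Q = [1, 744, 196884]

def q_in_terms_of_one_over_j():
    t = [0] + inverse_unit(J_TIMES_Q, 3)    # t = 1/j as a series in q
    return revert(t, 4)[1:]                 # coefficients of t, t^2, t^3
```

Calling `q_in_terms_of_one_over_j()` returns the coefficients of $j^{-1}, j^{-2}, j^{-3}$ in the series for $q$.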
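Theorem 5.11 Let $E$ have multiplicative reduction at $p$. Then there exists $q \in \mathbb{Q}_p$ such that $E(\bar{\mathbb{Q}}_p) \cong (\bar{\mathbb{Q}}_p^\times/q^{\mathbb{Z}})(\delta)$ as $G_{\mathbb{Q}_p}$-modules, i.e. there is an isomorphism $\phi$ with $\sigma(\phi(x)) = \phi(\sigma(x^{\delta(\sigma)}))$ for all $\sigma \in G_{\mathbb{Q}_p}$, where $\delta : G_{\mathbb{Q}_p} \to \{\pm 1\}$ is trivial if the reduction is split, and is the unique unramified quadratic character otherwise. Thus,
$$\rho_{E,\ell}|_{G_{\mathbb{Q}_p}} \cong \delta \otimes \begin{pmatrix} \chi & * \\ 0 & 1 \end{pmatrix}.$$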
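Proof: $E_q[\ell^n] \subseteq E_q(\bar{\mathbb{Q}}_p)$ and if $x \in \bar{\mathbb{Q}}_p^\times/q^{\mathbb{Z}}$, $x^{\ell^n}$ is trivial if and only if $x^{\ell^n} = q^m$ for some $m \in \mathbb{Z}$, so if and only if $x = \zeta_{\ell^n}^a (q^{1/\ell^n})^b$ for some $a, b \in \mathbb{Z}/\ell^n$. Consider the map
$$E_q[\ell^n] \to \langle q \rangle / \langle q^{\ell^n} \rangle \cong \mathbb{Z}/\ell^n, \quad x \mapsto x^{\ell^n},$$
a homomorphism with kernel $\mu_{\ell^n}$. $G_{\mathbb{Q}_p}$ acts via the cyclotomic character on the kernel and trivially on the image since $q \in \mathbb{Q}_p$. QED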
Theorem 5.12 If $E$ is a semistable elliptic curve over $\mathbb{Q}$, then $\rho_{E,\ell} : G_{\mathbb{Q}} \to GL_2(\mathbb{Z}_\ell)$ is semistable.
Proof: First, the Weil pairing gives that $\wedge^2 T_\ell(E) \cong \varprojlim \mu_{\ell^n}$, as Galois modules, and so $\det\rho_{E,\ell}$ is the cyclotomic character.
Second, if $E$ has good reduction at $\ell$, then for each $n$, $E[\ell^n]$ comes from a finite flat group scheme over $\mathbb{Z}_\ell$ [...] -algebra $\mathcal{R}$ is free, since tensoring with $\mathbb{Q}_\ell$ and $\mathbb{F}_\ell$ is trivial).
Third, if $E$ has good reduction at $p \neq \ell$, then by Néron-Ogg-Shafarevich below $E[\ell^n]$ is unramified at $p$, so $\rho_{E,\ell}$ is good at $p$. If $E$ has bad, so multiplicative, reduction at $p \neq \ell$, the previous theorem again applies. QED
Theorem 5.13 (Néron-Ogg-Shafarevich) Suppose $p \nmid m$. Then $E$ has good reduction at $p$ if and only if the $G_{\mathbb{Q}}$-action on $E[m]$ is unramified at $p$.
Proof: We show the forwards direction, which is the one we need. Let $K$ be a finite extension of $\mathbb{Q}_p$ such that $E[m] \subseteq E(K)$. Let $F$ be the residue field of $K$. One checks that the kernel of reduction $E(K) \to E(F)$ has no points of order prime to $p$. Thus the restriction $E[m] \to E(F)$ is injective. Let $P \in E[m]$ with reduction $\tilde P$, and let $\sigma \in I_p$. Then the reduction of $\sigma(P) - P$ is $\sigma(\tilde P) - \tilde P = 0$, and so by injectivity $\sigma(P) = P$. Thus, $I_p$ acts trivially on $E[m]$. QED
This result is likewise true for abelian varieties, in particular $J_0(N)$, and thus shows that the representations at the end of chapter 4 are unramified at primes not dividing $N$. Moreover, in both cases, the action of $\mathrm{Fr}_p$ on the $m$-division points is given by its action on their reductions.
Theorem 5.14 Suppose $E$, an elliptic curve over $\mathbb{Q}$, has multiplicative reduction at $p > 2$, which may or may not be $\ell$. Then $\bar\rho_{E,\ell}$ is good at $p$ if and only if $\ell \mid v_p(\Delta_{min})$.
Proof: Consider the $\ell$-division field of $E_q$, namely $K = \mathbb{Q}_p(\zeta_\ell, q^{1/\ell})$. If $p \neq \ell$, note that $\ell \mid v_p(\Delta_{min}) \iff \ell \mid v_p(q) \iff K/\mathbb{Q}_p$ unramified $\iff \bar\rho_{E,\ell}$ unramified at $p$. For $p = \ell$, we need the argument in Edixhoven's article in [ ]. QED
For the Frey curves related to (FLT)$_\ell$, $\Delta_{min} = 2^{-8}(abc)^{2\ell}$ and so $\ell$ divides $v_p(\Delta_{min})$ for all odd primes $p$. By the last criterion $\bar\rho_{E,\ell}$ is good at every prime except possibly 2. Being good at $\ell$ gives $k(\bar\rho_{E,\ell}) = 2$, whereas being good at the other odd primes and semistable at 2 gives $N(\bar\rho_{E,\ell}) = 2$. Since its determinant is just the cyclotomic character with no twist, $\epsilon(\bar\rho_{E,\ell})$ is trivial. Thus, if $\bar\rho_{E,\ell}$ were associated to a cuspidal eigenform, then it would come from $S_2(2) = 0$, a contradiction.
Moreover, Mazur's work [ ] shows that $\bar\rho_{E,\ell}$ is absolutely irreducible and so a counterexample to (FLT)$_\ell$ would violate Serre's strong conjecture. The idea for this is that if, say, $\bar\rho_{E,\ell}$ were reducible, then $E[\ell]$ would have a subgroup or quotient of order $\ell$ on which $G_{\mathbb{Q}}$ acts trivially. Such a thing corresponds to a rational point on $X_0(\ell)$, but Mazur found all of them and there are no noncuspidal ones for large $\ell$.
Ribet [24], whose work inspired Wiles to begin his long journey to the proof, took the approach of trying to prove that Serre's weak conjecture implies his strong conjecture. This then shows that a counterexample to (FLT)$_\ell$ would violate $\bar\rho_{E,\ell}$ being associated to any kind of cuspidal eigenform, i.e. the Shimura-Taniyama conjecture. Ribet, Diamond, and others established the full weak implies strong conclusion later [ ], but Ribet at first did enough for our purposes.
Theorem 5.15 (Ribet) Suppose $\bar\rho : G_{\mathbb{Q}} \to GL_2(F)$ is an odd, continuous, absolutely irreducible representation over a finite field $F$ of characteristic $\ell > 2$. Suppose $\bar\rho$ is associated to some cuspidal eigenform of level $N$, weight 2, and trivial Nebentypus. If $\bar\rho$ is good at $p \,\|\, N$, then $\bar\rho$ is associated to some cuspidal eigenform of level $N/p$ whenever $p \not\equiv 1 \pmod \ell$ or $\gcd(\ell, N) = 1$.
A description of Ribet's approach will be given in an appendix.
Corollary 5.16 The Shimura-Taniyama conjecture, that every Galois representation attached to an elliptic curve is attached to a modular form, implies Fermat's Last Theorem.
Suppose we have a counterexample to (FLT)$_\ell$, $\ell \geq 5$. If the Shimura-Taniyama conjecture holds, then the associated Frey curve is associated to some cuspidal eigenform of squarefree level $N$. It is a theorem of Mazur that says that the corresponding $\ell$-division representation is absolutely irreducible. By repeated use of Ribet's theorem, this representation is associated to some cuspidal eigenform of level 2 (weight 2 and trivial Nebentypus), but there are no such.
The Big Picture. The proof of Fermat's Last Theorem is hereby reduced to proving that all semistable elliptic curves are modular. This will be achieved by showing that certain families of semistable Galois representations are all associated to modular forms. We therefore need some way of parametrizing such families and this will be provided by universal (semistable/modular) Galois representations.
6
Deformation theory of Galois representations
A good description of this material can be found in [14]. The original source is [20].
Fix a profinite group $G$, a finite field $F$, and a continuous representation $\bar\rho : G \to GL_n(F)$. Let $C_F$ denote the category whose objects are complete, Noetherian local (commutative) rings $R$ with residue field $R/m_R = F$ and whose morphisms are ring homomorphisms $\phi : R \to S$ such that $\phi(m_R) \subseteq m_S$ and such that the induced map $R/m_R \to S/m_S$ is the identity map on $F$. Cohen's theorem [2] says that every ring in $C_F$ is a quotient ring of $W(F)[[T_1, \ldots, T_m]]$ for some $m$ and so is a $W(F)$-algebra.
Definition 6.1 A lift of $\bar\rho$ to a ring $R$ in $C_F$ is a continuous homomorphism $\rho : G \to GL_n(R)$ which modulo $m_R$ produces $\bar\rho$. Two such lifts $\rho_1$ and $\rho_2$ are called strictly equivalent if there exists $M \in \Gamma_n(R) := \ker(GL_n(R) \to GL_n(F))$ such that $\rho_2 = M^{-1}\rho_1 M$. A deformation of $\bar\rho$ is a strict equivalence class of lifts.
Let $E(R) = \{$deformations of $\bar\rho$ to $R\}$. A morphism $R \to S$ induces a map of sets $E(R) \to E(S)$, making $E$ a functor $C_F \to$ Sets. We establish conditions under which $E$ is representable, so that there is one representation parametrizing all lifts of $\bar\rho$.
Say that $G$ satisfies $(\dagger)$ if the maximal elementary $\ell$-abelian quotient of every open subgroup of $G$ is finite. Examples include:
(1) $G_{\mathbb{Q}_p}$ for any $p$;
(2) let $S$ be a finite set of primes of $\mathbb{Q}$ and $G_{\mathbb{Q},S}$ denote the Galois group of the maximal extension of $\mathbb{Q}$ unramified outside $S$, i.e. the quotient of $G_{\mathbb{Q}}$ by the normal subgroup generated by the inertia subgroups $I_p$ for $p \notin S$.
These satisfy $(\dagger)$ by class field theory. Note that by Néron-Ogg-Shafarevich a Galois representation associated to an elliptic curve or a modular form factors through $G_{\mathbb{Q},S}$ for some $S$.
Theorem 6.2 (Mazur) Suppose that $\bar\rho$ is absolutely irreducible and that $G$ satisfies $(\dagger)$. Then the associated functor $E$ is representable.
That means that there is a ring $\mathcal{R}(\bar\rho)$ in $C_F$ and a continuous homomorphism $\xi : G \to GL_n(\mathcal{R}(\bar\rho))$ such that all lifts of $\bar\rho$ to $R$ in $C_F$ arise, up to strict equivalence, via a unique morphism $\mathcal{R}(\bar\rho) \to R$. The representing ring $\mathcal{R}(\bar\rho)$ is called the universal deformation ring of $\bar\rho$ and (the strict equivalence class of) $\xi$ the universal deformation of $\bar\rho$. For example, if $n = 2$, $\bar\rho$ is odd, $F = \mathbb{Z}/\ell$, and $G = G_{\mathbb{Q},S}$ with $\ell \in S$, then $\mathcal{R}(\bar\rho)$ is typically $\mathbb{Z}_\ell[[T_1, T_2, T_3]]$.
One proof of Mazur's theorem follows Schlessinger's criteria [27], which were developed to study local singularities. We let $C_F^0$ denote the subcategory of $C_F$ consisting of Artinian rings. Note that if $R$ is in $C_F$, then every $R/m_R^i$ is in $C_F^0$. The dual numbers are $F[\epsilon] = F[T]/(T^2)$, with $\epsilon$ the image of $T$. A morphism $R \to S$ is called small if it is surjective with kernel a principal ideal whose product with $m_R$ is 0, for example the map $\pi : F[\epsilon] \to F$ given by $a + b\epsilon \mapsto a$.
Definition 6.3 Suppose $E : C_F \to$ Sets is a functor satisfying $|E(F)| = 1$. Let $R_1 \to R_0$ and $R_2 \to R_0$ be morphisms in $C_F^0$. Consider the natural map
$$(*) \quad E(R_1 \times_{R_0} R_2) \to E(R_1) \times_{E(R_0)} E(R_2),$$
which exists since $E(R_1) \times_{E(R_0)} E(R_2)$ is the terminal object in the category of sets $S$ fitting into a commutative square
$$\begin{array}{ccc} S & \to & E(R_2) \\ \downarrow & & \downarrow \\ E(R_1) & \to & E(R_0) \end{array}$$
Schlessinger's criteria are as follows:
H1. $R_2 \to R_0$ small implies $(*)$ surjective.
H2. If $R_0 = F$, $R_2 = F[\epsilon]$, and $R_2 \to R_0$ is the map $\pi$ above, then $(*)$ is bijective.
Note: If H2 holds, then $t_E := E(F[\epsilon])$ has an $F$-vector space structure (the tangent space of $E$).
H3. $t_E$ is a finite-dimensional $F$-vector space.
H4. If $R_1 = R_2$ and $R_i \to R_0$ ($i = 1, 2$) is the same small map, then $(*)$ is bijective.
Theorem 6.4 (Schlessinger) H1, H2, H3, H4 hold if and only if $E$ is representable.
Proof (of Mazur's theorem): To prove Mazur's result, we need to show that his functor $E$ satisfies H1, H2, H3, H4. Let $E_i$ denote the set of continuous homomorphisms $G \to GL_n(R_i)$ lifting $\bar\rho$. Let $K_i = \Gamma_n(R_i)$. Then $E(R_i) = E_i/K_i$. Let $R_3 = R_1 \times_{R_0} R_2$. We are interested in the map
$$(*) \quad E_3/K_3 \to E_1/K_1 \times_{E_0/K_0} E_2/K_2,$$
when $R_2 \to R_0$ is small.
To show $(*)$ is surjective then, we take $\rho_1 \in E_1$, $\rho_2 \in E_2$, yielding the same element of $E_0/K_0$, i.e. $\rho_1 \equiv M^{-1}\rho_2 M$ over $R_0$ for some $M \in K_0$. Since $R_2 \to R_0$ is surjective, so is $K_2 \to K_0$. If $N \in K_2$ maps to $M$, then $(\rho_1, N^{-1}\rho_2 N)$ gives the desired element of $E_3$.
Exercise: $(*)$ is injective if $C_{K_2}(\rho_2(G)) \to C_{K_0}(\rho_0(G))$ is surjective.
If $\bar\rho$ is absolutely irreducible, then both these groups consist of scalar matrices and so this holds, ensuring H4 holds. This actually follows from Nakayama's Lemma, since e.g. $R_2[G] \to \mathrm{End}_{R_2}(R_2^n)$ is surjective after tensoring with $F$ thanks to the absolute irreducibility of $\bar\rho$. If $R_0 = F$, then $K_0 = \{1\}$ and so the condition holds, whence H2 follows.
As for H3, note that $\Gamma_n(F[\epsilon])$ is isomorphic to the direct product of $n^2$ copies of $F^+$, so is an elementary abelian $\ell$-group. Thus any lift $\rho : G \to GL_n(F[\epsilon])$ of $\bar\rho$ factors through $G/H$, where $\ker(\bar\rho)/H$ is the maximal elementary abelian $\ell$-quotient of $\ker(\bar\rho)$. By $(\dagger)$, $G/H$ is finite and so there are finitely many lifts to $F[\epsilon]$, proving H3. QED
We shall be interested in families of representations satisfying some further condition, such as semistability. For this we need Ramakrishna's refinement of Mazur's result.
Let $X$ be a property of $W(F)[G]$-modules of finite cardinality which is closed under isomorphism, direct sums, taking submodules, and quotienting. Fix $\bar\rho : G \to GL_n(F)$ such that $F^n$ considered as a $W(F)[G]$-module via $\bar\rho$ satisfies $X$. For $R \in C_F^0$ (which must then have finite cardinality), let $E_X(R)$ denote the set of deformations in $E(R)$ satisfying $X$.
Theorem 6.5 (Ramakrishna) $E_X$ is a functor on $C_F^0$. Moreover, if $E$ satisfies H1, H2, H3, H4, then so does $E_X$ (in which case both functors are representable, where $E_X$ is extended to the category $C_F$ by $E_X(R) = \varprojlim E_X(R/m_R^i)$).
Proof: Let $R$, $S$ be objects in $C_F^0$ and $\phi : R \to S$ a morphism. To show $E_X$ a functor, we need to show that if $\rho : G \to GL_n(R)$ has $X$, then the composition map $G \to GL_n(S)$ also has $X$. This follows since if $B = R^n$ and $D = S^n$ both with the given $G$-action, then $\phi$ induces $B \to D$ making $D$ a finitely generated (it's finite!) $B$-module, say a quotient of $B^m$ - since having $X$ is closed under direct product and quotient, $B$ and so $B^m$ and so $D$ all have $X$.
The next thing to note is that H1 for $E_X$ implies H2, H3, H4 too. This follows since restrictions of injective maps to subsets are still injective. This gives injectivity in H2 and H4 with H1 giving surjectivity. As for H3, since $E_X(F[\epsilon]) \subseteq E(F[\epsilon])$, the tangent space of $E_X$ is also finite-dimensional.
To prove H1 for $E_X$, set $R_3 = R_1 \times_{R_0} R_2$. Let $\rho_1 \times_{\rho_0} \rho_2 \in E_X(R_1) \times_{E_X(R_0)} E_X(R_2)$. By H1 for $E$, we get $\rho \in E(R_3)$ mapping to this element. We just need to show that $\rho$ has $X$. Well, $R_3 \hookrightarrow R_1 \times R_2$ induces $R_3^n \hookrightarrow R_1^n \times R_2^n$, making $R_3^n$ a submodule of a direct product of $W(F)[G]$-modules with $X$, whence $R_3^n$ with this $G$-action has $X$. QED
Theorem 6.6 Suppose that $E$ and $E_X$ satisfy the hypotheses of the previous theorem. Let $\mathcal{R}$ and $\mathcal{R}_X$ be the respective deformation rings. Then there is a natural surjection $\mathcal{R} \to \mathcal{R}_X$.
Proof: Let $\xi : G \to GL_n(\mathcal{R})$ and $\xi_X : G \to GL_n(\mathcal{R}_X)$ denote the universal deformations. Since $\xi_X$ is a lift of $\bar\rho$ and $\xi$ parametrizes all such, there is a (unique) morphism $\phi : \mathcal{R} \to \mathcal{R}_X$ which after composition with $\xi$ yields $\xi_X$. Let the image of $\phi$ be $S$. We thereby get a representation $\rho : G \to GL_n(S)$, which is of type $X$. By universality of $\mathcal{R}_X$ we get a unique morphism $\mathcal{R}_X \to S$ producing $\rho$. The composition $\mathcal{R}_X \to S \to \mathcal{R}_X$, by universality again, has to be the identity map, so $S = \mathcal{R}_X$. QED
We shall show that semistability defines such a property $X$. Namely, fix a finite field $F$ of characteristic $\ell > 2$ and a continuous homomorphism $\bar\rho : G_{\mathbb{Q}} \to GL_2(F)$ which is absolutely irreducible and semistable. (Note that semistability includes that $\det\bar\rho$ be the cyclotomic character and this makes $\bar\rho$ odd, so we need not make this an extra condition.) Fix a finite set $\Sigma$ of rational primes. Let $R$ be in $C_F^0$. If $\rho : G_{\mathbb{Q}} \to GL_2(R)$ is a lift of $\bar\rho$, we say that $\rho$ is of type $\Sigma$ if
(1) $\det\rho$ is the cyclotomic character;
(2) $\rho$ is semistable at $\ell$;
(3) if $\ell \notin \Sigma$ and $\bar\rho$ is good at $\ell$, then $\rho$ is good at $\ell$;
if $p \notin \Sigma \cup \{\ell\}$ and $\bar\rho$ is unramified at $p$, then $\rho$ is unramified at $p$;
if $p \notin \Sigma \cup \{\ell\}$ and $\bar\rho$ is ramified (so ordinary) at $p$, then $\rho$ is ordinary at $p$.
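Note: (1) If $E$ is a semistable elliptic curve over $\mathbb{Q}$ and $\bar\rho_{E,\ell}$ is absolutely irreducible, then $\rho_{E,\ell}$ is a lift of type $\Sigma$ if $\Sigma$ contains all the primes of bad reduction for $E$.
(2) If $\rho$ is of type $\Sigma$ and $\Sigma \subseteq \Sigma'$, then $\rho$ is of type $\Sigma'$.
(3) If $\rho$ is of type $\Sigma$, then $\rho$ is unramified outside $\{p : p \mid N(\bar\rho)\} \cup \Sigma \cup \{\ell\}$.
(4) A lift $\rho$ of $\bar\rho$, unramified outside $\Sigma$, with $\det(\rho)$ the cyclotomic character, semistable at $\ell$, is of type $\Sigma$.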
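Theorem 6.7 Given a continuous absolutely irreducible homomorphism $\bar\rho : G_{\mathbb{Q}} \to GL_2(F)$ and given $\Sigma$, there exists an object $\mathcal{R}_\Sigma$ in $C_F$ and a continuous homomorphism $\xi_\Sigma : G_{\mathbb{Q}} \to GL_2(\mathcal{R}_\Sigma$ [...] with some [...], a unique morphism from $\mathcal{R}_\Sigma$ to $R$.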
The proof proceeds by dealing with each of the conditions required for type $\Sigma$ in turn. First, we deal with the group scheme condition. To apply Ramakrishna's criterion we consider the étale group scheme over $\mathbb{Q}_\ell$ corresponding to $G_{\mathbb{Q}_\ell} \to GL_n(R)$ with $R$ in $C_F^0$ and say that $\rho$ has property $X$ if this group scheme extends to a finite flat group scheme over $\mathbb{Z}_\ell$. We just need to show that $X$ is preserved under direct sums, sub, and quotient.
Definition 6.8 If $G$ is an affine group scheme over $\mathbb{Z}_\ell$, say represented by $A$, let $G_{gen}$ be its fibre (or base change) over $\mathbb{Q}_\ell$, so represented by $A \otimes_{\mathbb{Z}_\ell} \mathbb{Q}_\ell$. Let $H_{gen}$ be a closed subgroup scheme of $G_{gen}$, say represented by $(A \otimes_{\mathbb{Z}_\ell} \mathbb{Q}_\ell)/J$ where $J$ is a Hopf ideal (ideals that ensure the quotient is still a Hopf algebra). Let $I = \phi^{-1}(J)$ under $\phi : A \to A \otimes_{\mathbb{Z}_\ell} \mathbb{Q}_\ell$. Then the group scheme $H$ represented by $A/I$, a subgroup scheme of $G$, is called the schematic closure of $H_{gen}$.
Lemma 6.9 $A/I$ is torsion-free.
Proof: $\phi$ induces an injection $A/I \to (A \otimes_{\mathbb{Z}_\ell} \mathbb{Q}_\ell)/J$, which is torsion-free since it is a $\mathbb{Q}_\ell$-algebra. QED
It follows that $H$ is a finite flat group scheme over $\mathbb{Z}_\ell$. This is why goodness is preserved under sub. As for quotient group schemes, if $H$ and $G$ are group schemes over $R$ with $H(A)$ a normal subgroup of $G(A)$ for all $R$-algebras $A$, then typically $F(A) = G(A)/H(A)$ does not give a representable functor $F$ on $R$-algebras. Raynaud showed how to fix this. Let $A' = \{a \in A \mid \mu(a) \equiv 1 \otimes a \pmod{I \otimes A}\}$, where $\mu$ denotes the comultiplication of $A$. Then $A'$ is considered as representing $G/H$. Finally, as regards direct sums, if $G$, $H$ are finite flat group schemes over $\mathbb{Z}_\ell$ represented by $R$, $S$ respectively, let $F(A) = G(A) \times H(A)$ for every $\mathbb{Z}_\ell$-algebra $A$. Then check that $F$ is an affine group scheme represented by $R \otimes_{\mathbb{Z}_\ell} S$, which is a finite flat $\mathbb{Z}_\ell$-algebra.
The second kind of condition is of the following nature. Let $G$ be a profinite group and $I$ a closed subgroup. Call $\rho : G \to GL_2(R)$ $I$-ordinary if the fixed points of $R^2$ under $I$ form a free direct summand of rank 1.
Theorem 6.10 If $\bar\rho$ is $I$-ordinary and absolutely irreducible and $G$ satisfies $(\dagger)$, then it has a universal $I$-ordinary lift.
Proof: Again, we need only prove H1. Let $E_I(R)$ denote the set of $I$-ordinary deformations of $\bar\rho$ to $R$. Let $R_3 = R_1 \times_{R_0} R_2$ and consider $\rho_1 \times_{\rho_0} \rho_2 \in E_I(R_1) \times_{E_I(R_0)} E_I(R_2)$. Then since Mazur's functor $E$ is representable, we get a $\rho_3 \in E(R_3)$ mapping to $\rho_1 \times_{\rho_0} \rho_2$. It remains to show that $\rho_3$ is $I$-ordinary. QED
The third kind of condition is just that $\rho$ have determinant the cyclotomic character. Without imposing this condition, we so far have a universal deformation $\xi : G_{\mathbb{Q}} \to GL_2(\mathcal{R})$ satisfying everything else contained in type $\Sigma$. In particular, for $p \notin S$, $\xi$ is unramified at $p$. Thus, $\xi(\mathrm{Fr}_p)$ (for $p \notin S$) are defined. Let $\det\xi(\mathrm{Fr}_p) = r_p \in \mathcal{R}$ and let $I$ be the ideal of $\mathcal{R}$ generated by $r_p - p$ for $p \notin S$. The representation
$$\xi' = \xi \pmod I : G_{\mathbb{Q}} \to GL_2(\mathcal{R}/I)$$
now has $\det\xi'(\mathrm{Fr}_p) = r_p = p = \chi(\mathrm{Fr}_p)$ for $p \notin S$. By Chebotarev's density theorem, the $\mathrm{Fr}_p$ ($p \notin S$) are dense in $G_{\mathbb{Q},S}$ and so $\det\xi' = \chi$.
Incidentally, imposing all these conditions has actually reduced us to a ring $\mathcal{R}/I$ which will turn out to be finite over $\mathbb{Z}_\ell$.
Finally, we need another description of the tangent space of our functors. Suppose $E$ is a representable functor on $C_F$, represented by $\mathcal{R}$. Set $t_E = E(F[\epsilon])$, the tangent space of $E$, a finite-dimensional $F$-vector space. This dimension will turn out to be a useful invariant of both $E$ and $\mathcal{R}$, calculated via Galois cohomology.
Note that $t_E = \hom(\mathcal{R}, F[\epsilon])$. Under every such morphism, $m_{\mathcal{R}}$ maps to $\{b\epsilon \mid b \in F\}$ with kernel containing $m_{\mathcal{R}}^2 + \ell\mathcal{R}$, such that $t_E$ is the dual space of $m_{\mathcal{R}}/(m_{\mathcal{R}}^2 + \ell\mathcal{R})$. For example, if $\mathcal{R} = W(F)[[T_1, \ldots, T_r]]$, then $m_{\mathcal{R}} = (T_1, \ldots, T_r, \ell)$ and so $\dim_F t_E = \dim_F(m_{\mathcal{R}}/(m_{\mathcal{R}}^2 + \ell\mathcal{R})) = r$. Furthermore, if $I$ is an ideal of $\mathcal{R}$ such that $I \subseteq m_{\mathcal{R}}^2 + \ell\mathcal{R}$, then the tangent space of $\mathcal{R}/I$ is also $r$-dimensional.
The Big Picture. We have seen that given a semistable $\bar\rho$ and a finite set of primes $\Sigma$, there is a universal lift $\xi_\Sigma : G_{\mathbb{Q}} \to GL_2(\mathcal{R}_\Sigma$ [...] $= H^1(G, \mathrm{Ad}(\bar\rho))$.
Note that this gives an alternative way of seeing that $E(F[\epsilon])$ is an $F$-vector space. If we impose some property $X$ and look at the deformations of $\bar\rho$ that have $X$, i.e. $E_X(F[\epsilon])$, this corresponds to some subgroup of $H^1(G, \mathrm{Ad}(\bar\rho))$ which we denote $H^1_X(G, \mathrm{Ad}(\bar\rho))$. In particular, if we start with a semistable, absolutely irreducible representation $\bar\rho : G_{\mathbb{Q}} \to GL_2(F)$ ($F$ of odd characteristic $\ell$) and consider lifts of type $\Sigma$, then the subgroup will be denoted $H^1_\Sigma(G_{\mathbb{Q}}, \mathrm{Ad}(\bar\rho))$.
We want to identify this set by considering what sort of 1-cocycles correspond to the properties involved in being of type $\Sigma$.
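First, we impose the condition that the lift $\rho$ should have determinant the cyclotomic character $\chi$, which is the same as the determinant of $\bar\rho$. Then $\det(1 + \epsilon a(g)) = 1$ for all $g \in G_{\mathbb{Q}}$. Since
$$1 + \epsilon a(g) = \begin{pmatrix} 1 + \epsilon a_{11} & \epsilon a_{12} \\ \epsilon a_{21} & 1 + \epsilon a_{22} \end{pmatrix},$$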
7. Introduction to Galois cohomology xciii
and $\epsilon^2 = 0$, this gives $1 + \epsilon(a_{11} + a_{22}) = 1$, implying $\mathrm{trace}(a(g)) = 0$. The $G_{\mathbb{Q}}$-submodule of $\mathrm{Ad}(\bar\rho)$ consisting of trace 0 matrices will be denoted $\mathrm{Ad}^0(\bar\rho)$. We are therefore only interested in subgroups of $H^1(G_{\mathbb{Q}}, \mathrm{Ad}^0(\bar\rho))$.
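The identity $\det(1 + \epsilon a) = 1 + \epsilon\,\mathrm{trace}(a)$ over the dual numbers can be confirmed by direct computation. The sketch below models an element $x_0 + x_1\epsilon$ of $\mathbb{F}_\ell[\epsilon]$ as a pair `(x0, x1)`; this representation and the function names are our own illustrative choices.

```python
def dual_mul(x, y, ell):
    """(x0 + x1*eps)(y0 + y1*eps) in F_ell[eps], using eps^2 = 0."""
    return ((x[0] * y[0]) % ell, (x[0] * y[1] + x[1] * y[0]) % ell)

def det_one_plus_eps(a, ell):
    """det(1 + eps*a) for a 2x2 integer matrix a, computed in F_ell[eps]."""
    m11, m12 = (1, a[0][0] % ell), (0, a[0][1] % ell)
    m21, m22 = (0, a[1][0] % ell), (1, a[1][1] % ell)
    p = dual_mul(m11, m22, ell)   # (1 + eps*a11)(1 + eps*a22)
    q = dual_mul(m12, m21, ell)   # (eps*a12)(eps*a21), which vanishes
    return ((p[0] - q[0]) % ell, (p[1] - q[1]) % ell)

# The eps-coefficient of the determinant is the trace of a, reduced mod ell
example = det_one_plus_eps([[2, 3], [4, 1]], 5)
```

The pair returned is `(1, trace(a) mod ell)`, so the determinant equals 1 exactly when the trace vanishes in $\mathbb{F}_\ell$.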
Second, if $H$ is a subgroup of $G$ and $M$ a $G$-module, then we can restrict 1-cocycles and 1-coboundaries from $G$ to $H$, thus producing a restriction map
$$\mathrm{res} : H^1(G, M) \to H^1(H, M).$$
More generally, any homomorphism $H \to G$ will produce such a restriction map. The local behavior corresponding to type $\Sigma$ will be captured by the restrictions from $G_{\mathbb{Q}}$ to $G_{\mathbb{Q}_p}$ and/or $I_p$. For example, suppose a lift $\rho$ corresponds to an element of the kernel of $H^1(G_{\mathbb{Q}}, \mathrm{Ad}^0(\bar\rho)) \to H^1(I_p, \mathrm{Ad}^0(\bar\rho))$. Then $\rho(g) = \bar\rho(g)$ for all $g \in I_p$, so that $\rho$ is unramified at $p$ if and only if $\bar\rho$ is unramified at $p$. Being in the kernel in fact ensures that the ramification at $p$ of $\rho$ is no worse than that of $\bar\rho$.
Definition 7.2 Let $M = \mathrm{Ad}^0(\bar\rho)$. By local conditions we mean a collection $\mathcal{L} = \{L_p\}$, where for each prime $p$ (including infinity) we are given a subgroup $L_p \leq H^1(G_{\mathbb{Q}_p}, M)$ such that for all but finitely many $p$, $L_p = \ker(H^1(G_{\mathbb{Q}_p}, M) \to H^1(I_p, M))$. (These are called the unramified classes and will be denoted $H^1_{ur}(G_{\mathbb{Q}_p}, M)$.) The corresponding Selmer group will be
$$H^1_{\mathcal{L}}(G_{\mathbb{Q}}, M) = \{c \in H^1(G_{\mathbb{Q}}, M) : \mathrm{res}_p(c) \in L_p \text{ for all } p\},$$
where $\mathrm{res}_p : H^1(G_{\mathbb{Q}}, M) \to H^1(G_{\mathbb{Q}_p}, M)$ is restriction.
Note that by $\mathbb{Q}_\infty$ [...] $\cong \mathrm{Gal}(\mathbb{C}/\mathbb{R})$ of order 2, identified as the subgroup of order 2 of $G_{\mathbb{Q}}$ generated by complex conjugation. Since $F$ has odd characteristic, $H^1(G_{\mathbb{Q}_\infty}$ [...] has to be 0.
Example: Suppose $G$ has order 2 and $M$ odd order. Show that $H^1(G, M) = 0$.
Proof: Let $G = \{1, c\}$. A 1-cocycle is a map $f : G \to M$ such that $f(gh) = f(g) + gf(h)$ for $g, h \in G$. Setting $g = 1$ gives $f(1) = 0$. Setting $g = h = c$ gives $0 = f(c) + cf(c)$. So $f(c) \in M^- := \{m \in M \mid cm = -m\}$. Conversely, if $m \in M^-$, defining $f$ by $f(1) = 0$, $f(c) = m$ gives a 1-cocycle.
If $f$ is a 1-coboundary, then $f(g) = gx - x$ for some $x \in M$. In particular, $f(c) = cx - x$. Consider the map $\phi : M \to M^-$ given by $\phi(x) = cx - x$. Let $M^+ = \{m \in M \mid cm = m\}$. Since $|M|$ is odd, $M = M^+ \oplus M^-$ ($m = \frac{m + cm}{2} + \frac{m - cm}{2}$). Then $|\mathrm{Im}(\phi)| = |M|/|\mathrm{Ker}(\phi)| = |M|/|M^+| = |M^-|$
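The counting argument above can be tested by brute force on small modules. The following sketch takes $M = \mathbb{Z}/n$ with $c$ acting as multiplication by an integer $u$ with $u^2 \equiv 1 \pmod n$; this particular choice of module and the function name `h1_order` are assumptions made for illustration.

```python
def h1_order(n, u):
    """|H^1(G, M)| for G = {1, c}, M = Z/n, with c acting as multiplication by u."""
    assert (u * u) % n == 1 % n, "c must act as an involution"
    # A 1-cocycle is determined by m = f(c), which must satisfy m + c*m = 0
    cocycles = {m for m in range(n) if (m + u * m) % n == 0}
    # 1-coboundaries have f(c) = c*x - x for some x in M
    coboundaries = {(u * x - x) % n for x in range(n)}
    return len(cocycles) // len(coboundaries)

odd_cases = [h1_order(15, u) for u in (1, 4, 11, 14)]   # |M| odd
even_case = h1_order(4, 1)                               # |M| even, for contrast
```

For odd $n$ every choice of involution gives order 1, in line with the example, while the even case `h1_order(4, 1)` gives 2, so the hypothesis that $|M|$ is odd cannot be dropped.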
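[...] $(G_{\mathbb{Q}}, \mathrm{Ad}(\bar\rho)) = H^1_{\mathcal{L}}(G_{\mathbb{Q}}, M)$, where $M = \mathrm{Ad}^0(\bar\rho)$ and $\mathcal{L}$ is given by:
$L_\infty = 0$,
$L_p = H^1_{ur}(G_{\mathbb{Q}_p}, M)$ if $p \notin \Sigma \cup \{\ell\}$,
$\phantom{L_p} = H^1(G_{\mathbb{Q}_p}, M)$ if $p \in \Sigma$, $p \neq \ell$,
$\phantom{L_p} = H^1_f(G_{\mathbb{Q}_p}, M)$ if $p = \ell \notin \Sigma$,
$\phantom{L_p} = H^1_{ss}(G_{\mathbb{Q}_p}, M)$ if $p = \ell \in \Sigma$.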
Here, $H^1_f$ and $H^1_{ss}$ are the flat and semistable cohomology groups respectively, to be defined below.
This theorem is simply a restatement of what it means for a lift to be of type $\Sigma$ - all we do is translate the conditions across to cohomology. Our main aim will be to compute the size of this Selmer group, giving the tangential dimension of $\mathcal{R}_\Sigma$. First, since the property of being good or ordinary is defined in terms of the $G_{\mathbb{Q}_\ell}$-module $F^2$, we need another interpretation of $H^1$.
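Definition 7.4 Given $\bar\rho : G \to GL_n(F)$, consider $V = F^n$, a $G$-module thanks to $\bar\rho$, and consider the set of extensions $E$ of $V$ by $V$, consisting of short exact sequences
$$0 \to V \xrightarrow{\alpha} E \xrightarrow{\beta} V \to 0$$
of $F[G]$-modules. Call two such extensions equivalent if there is an isomorphism $i : E_1 \to E_2$ making the following diagram commute
$$\begin{array}{ccccccccc} 0 & \to & V & \to & E_1 & \to & V & \to & 0 \\ & & \downarrow 1_V & & \downarrow i & & \downarrow 1_V & & \\ 0 & \to & V & \to & E_2 & \to & V & \to & 0 \end{array}$$
Let $\mathrm{Ext}^1_{F[G]}(V, V)$ denote the set of equivalence classes.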
Theorem 7.5 There is a bijection between $H^1(G, \mathrm{Ad}(\bar\rho))$ and $\mathrm{Ext}^1_{F[G]}(V, V)$.
Proof: Write $\alpha : V \to E$ and $\beta : E \to V$ for the maps of the extension. Pick $\phi : V \to E$ such that $\beta(\phi(m)) = m$ for all $m \in V$. Given $g \in G$, define $T_g : V \to V$ by
$$m \mapsto \alpha^{-1}(g\phi(g^{-1}m) - \phi(m)).$$
$T_g$ can be considered as an $n$ by $n$ matrix. One checks that $T_{gh} = T_g + gT_h$, where $gT_h$ means the matrix $T_h$ conjugated by $\bar\rho(g)$. One also checks that equivalent extensions correspond to 1-cocycles that differ by a 1-coboundary. QED
Recall that an $F[G_{\mathbb{Q}_\ell}$ [...] such that $V \cong H(\bar{\mathbb{Q}}_\ell)$ as $F[G_{\mathbb{Q}_\ell}$ [...] acts trivially on $V^0$ and via the cyclotomic character on $V^{-1}$. $V$ is called semistable if it is good or ordinary (or both).
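Definition 7.6 Identify $H^1(G_{\mathbb{Q}_\ell}$ [...] $(V, V)$. Let $H^1_{ss}(G_{\mathbb{Q}_\ell}$ [...] $, \mathrm{Ad}(\bar\rho))$ to be $H^1_{ss}$. If $\bar\rho$ is good at $\ell$, take $H^1_f(G_{\mathbb{Q}_\ell}, \mathrm{Ad}(\bar\rho))$ to consist of those extensions of $V$ by $V$ that are good. $H^1_{ss}(G_{\mathbb{Q}_\ell}, \mathrm{Ad}^0(\bar\rho))$ and $H^1_f(G_{\mathbb{Q}_\ell}, \mathrm{Ad}^0(\bar\rho))$ are defined by intersecting the above subgroups with $H^1(G_{\mathbb{Q}_\ell}, \mathrm{Ad}^0(\bar\rho))$.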
Theorem 7.7 $H^1_{\mathcal{L}}(G_{\mathbb{Q}}, \mathrm{Ad}^0(\bar\rho))$ is a finite group.
Note: we actually already know this theorem since its dimension over $F$ is the tangential dimension of $\mathcal{R}_\Sigma$, but it is good preparation for finding the exact order of the Selmer group.
Proof: Let $M = \mathrm{Ad}^0(\bar\rho)$. Let $S$ be a finite set of primes containing all the bad ones, i.e. $\infty$, $\ell$, the primes $p$ such that $I_p$ acts nontrivially on $M$, and such that $L_p \neq H^1_{ur}$. By definition of Selmer group, there is an exact sequence
$$0 \to H^1_{\mathcal{L}}(G_{\mathbb{Q}}, M) \to H^1(G_{\mathbb{Q},S}, M) \to \bigoplus_{p \in S} H^1(G_{\mathbb{Q}_p}, M)/L_p.$$
Let $H = \ker(G_{\mathbb{Q},S} \to \mathrm{Aut}(M))$, the kernel of the action of $G_{\mathbb{Q},S}$ on $M$. Since $M$ is finite, $H$ is an open, finite index, normal subgroup of $G_{\mathbb{Q},S}$. By an above exercise, since $H$ acts trivially on $M$, $H^1(H, M) = \mathrm{Hom}(H, M)$. This is a finite group since the Hermite-Minkowski theorem says that there are only finitely many extensions of degree $\ell$ of the fixed field of $H$ (a finite extension of $\mathbb{Q}$) unramified outside the primes above $S$. The inflation-restriction exact sequence (see below) says that
$$0 \to H^1(G_{\mathbb{Q},S}/H, M^H) \to H^1(G_{\mathbb{Q},S}, M) \to H^1(H, M)$$
is exact, where $M^H := \{m \in M \mid gm = m \text{ for all } g \in H\}$. Since $G_{\mathbb{Q},S}/H$ and $M^H$ are finite, so is $H^1(G_{\mathbb{Q},S}/H, M^H)$ and so is $H^1(H, M)$ as noted above. Thus, so is $H^1(G_{\mathbb{Q},S}, M)$ and so is its subgroup $H^1_{\mathcal{L}}(G_{\mathbb{Q}}, M)$. QED
The next theorem identifies the kernel of the restriction maps introduced above.
Theorem 7.8 (Inflation-Restriction) If $M$ is a $G$-module and $H$ is a normal subgroup of $G$, then $M^H$ is a $G/H$-module and there is an exact sequence:
$$0 \to H^1(G/H, M^H) \xrightarrow{\ \mathrm{inf}\ } H^1(G, M) \xrightarrow{\ \mathrm{res}\ } H^1(H, M).$$
Corollary 7.9 $H^1_{ur}(G_{\mathbb{Q}_p}, M) = H^1(G_{\mathbb{Q}_p}/I_p, M^{I_p})$
The advantage of working with cohomology groups is that they fit into various exact sequences and perfect pairings, allowing them to be computed in terms of simpler objects. One such simpler object is $H^0(G, M)$, which is defined to be $M^G$. An example of this sort of simplification is:
Theorem 7.10 For finite $M$, $|H^1_{ur}(G_{\mathbb{Q}_p}, M)| = |H^0(G_{\mathbb{Q}_p}, M)| < \infty$.
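Proof: Consider
$$0 \to M^{G_{\mathbb{Q}_p}} \to M^{I_p} \xrightarrow{\ \mathrm{Fr}_p - 1\ } M^{I_p} \to M^{I_p}/((\mathrm{Fr}_p - 1)M^{I_p}) \to 0$$
and by the last exercise,
$$H^1(G_{\mathbb{Q}_p}/I_p, M^{I_p}) = H^1(\langle \mathrm{Fr}_p \rangle, M^{I_p}) = M^{I_p}/((\mathrm{Fr}_p - 1)M^{I_p}).$$
Since $M^{I_p}$ is finite, the kernel and cokernel of $\mathrm{Fr}_p - 1$ on it have the same order, which is the claim.
QED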
One can define 2-cocycles and 2-coboundaries to be certain maps $f : G \times G \to M$. Namely, a 2-cocycle $f$ is a map that satisfies
$$gf(h, k) - f(gh, k) + f(g, hk) - f(g, h) = 0,$$
whereas a 2-coboundary is an $f$ of the form
$$f(g, h) = gF(h) - F(gh) + F(g)$$
for some $F : G \to M$. Then take $H^2(G, M)$ = 2-cocycles/2-coboundaries, and in fact we can likewise define $r$-cocycles and $r$-coboundaries.
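The quotient $H^2$ only makes sense if every 2-coboundary is a 2-cocycle. The following is a minimal brute-force sketch of that fact, not part of the original notes. The group $G = \mathbb{Z}/3$, the module $M = \mathbb{Z}/7$, and the action by powers of 2 are illustrative assumptions chosen only because they are small.

```python
from itertools import product

N = 3   # G = Z/3, written additively (illustrative choice)
P = 7   # M = Z/7 (illustrative choice)

def act(g, m):
    """Action of g in G on m in M: multiply by 2**g (2 has order 3 mod 7)."""
    return (pow(2, g, P) * m) % P

def mul(g, h):
    """Group law of G."""
    return (g + h) % N

def coboundary(F):
    """f(g, h) = g F(h) - F(gh) + F(g), for F given as a tuple indexed by G."""
    return {(g, h): (act(g, F[h]) - F[mul(g, h)] + F[g]) % P
            for g, h in product(range(N), repeat=2)}

def is_2_cocycle(f):
    """Check g f(h,k) - f(gh,k) + f(g,hk) - f(g,h) = 0 for all g, h, k."""
    return all(
        (act(g, f[(h, k)]) - f[(mul(g, h), k)]
         + f[(g, mul(h, k))] - f[(g, h)]) % P == 0
        for g, h, k in product(range(N), repeat=3))

# Every function F : G -> M gives a coboundary; each should be a cocycle.
all_ok = all(is_2_cocycle(coboundary(F)) for F in product(range(P), repeat=N))
print(all_ok)
```

The same loop with a randomly chosen $f$ will usually fail the test, which is a quick way to see that the cocycle condition is a genuine restriction.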
Theorem 7.11 Suppose $0 \to A \to B \to C \to 0$ is a short exact sequence of $G$-modules. Then there is a long exact sequence
$$\begin{aligned}
0 &\to H^0(G, A) \to H^0(G, B) \to H^0(G, C) \\
&\to H^1(G, A) \to H^1(G, B) \to H^1(G, C) \\
&\to H^2(G, A) \to H^2(G, B) \to H^2(G, C) \to \cdots
\end{aligned}$$
This sequence continues with groups $H^i(G, A)$ defined for all integers $i \ge 0$.
Before giving Wiles' big result on sizes of Selmer groups, we need some preparatory results. Note that if $A$ and $B$ are $G$-modules, then $\hom(A, B)$ has a $G$-action given by $(g(f))(a) = g(f(g^{-1}a))$ where $g \in G$, $f \in \hom(A, B)$, $a \in A$.
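Theorem 7.12 (Local Tate duality) Suppose $M$ is a $G_{\mathbb{Q}_p}$-module of finite cardinality, $n$. Set $M^* = \hom(M, \mu_n(\bar{\mathbb{Q}}_p))$, where $\mu_n(\bar{\mathbb{Q}}_p)$ is the group of $n$th roots of 1 in $\bar{\mathbb{Q}}_p$ given its natural $G_{\mathbb{Q}_p}$-action. There is a nondegenerate pairing for $i = 0, 1, 2$
$$H^i(G_{\mathbb{Q}_p}, M) \times H^{2-i}(G_{\mathbb{Q}_p}, M^*) \to H^2(G_{\mathbb{Q}_p}, \mu_n) \to \mathbb{Q}/\mathbb{Z}.$$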
If p does not divide the order of M, then under the pairing
H
1
ur
(G
Q
p
, M) and H
1
ur
(G
Q
p
, M
= L
p
, where $L_p^* \subset H^1(G_{\mathbb{Q}_p}, M^*)$ is the annihilator of $L_p$ under Tate's pairing. By the last remark, these are also local conditions.
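$$\frac{|H^1_{\mathcal{L}}(G_{\mathbb{Q}}, M)|}{|H^1_{\mathcal{L}^*}(G_{\mathbb{Q}}, M^*)|} = \frac{|H^0(G_{\mathbb{Q}}, M)|}{|H^0(G_{\mathbb{Q}}, M^*)|} \prod_{p \le \infty} \frac{|L_p|}{|H^0(G_{\mathbb{Q}_p}, M)|}$$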
Note that since for all but finitely many primes $L_p = H^1_{ur}(G_{\mathbb{Q}_p}, M)$, which has the same order as $H^0(G_{\mathbb{Q}_p}, M)$, the right-hand side is a finite product.
Proof: Here's the idea. The Poitou-Tate 9-term sequence (see e.g. Milne's book) and the definition of the Selmer group yield an exact sequence ($S$ being the set of bad primes introduced earlier and $c$ complex conjugation):
0 H
0
(G
Q,S
, M) (
pS
H
0
(G
Q
p
, M))/((1+c)M) H
2
(G
Q,S
, M
H
1
L
(G
Q
, M)
pS
L
p
H
1
(G
Q,S
, M
H
1
L
(G
Q
, M
0
Here A
)[
=
[H
0
(G
Q,S
, M)[[H
2
(G
Q,S
, M
)[[(1 +c)M[
pS
[L
p
[
[H
1
(G
Q,S
, M
)[
pS
[H
0
(G
Q
p
, M)[
.
Since we are happiest computing orders of $H^0$'s (or if necessary $H^1$'s), we need some way of removing the $H^2$ term here, which is provided by the global Euler characteristic formula:
$$\frac{|H^1(G_{\mathbb{Q},S}, M^*)|}{|H^0(G_{\mathbb{Q},S}, M^*)|\,|H^2(G_{\mathbb{Q},S}, M^*)|} = \frac{|M^*|}{|H^0(G_{\mathbb{Q}_\infty}, M^*)|} = |(1+c)M|.$$
QED
Note that this last module is what was called $M^+$ in a previous example. There is also a local Euler characteristic formula:
Theorem 7.14 If $M$ is a finite $G_{\mathbb{Q}_p}$-module, then
$$\frac{|H^1(G_{\mathbb{Q}_p}, M)|}{|H^0(G_{\mathbb{Q}_p}, M)|\,|H^2(G_{\mathbb{Q}_p}, M)|} = p^{v_p(|M|)}.$$
This will be needed in studying how H
1
varies as we vary
. Suppose, for instance, we add a prime into , yielding
/
.
Then 1
/
maps onto 1
)[.
Proof: If $\mathcal{L}$ is replaced by $\mathcal{L}'$, consider what happens to the terms on the right-hand side of Wiles' formula. They remain the same except for the term for $p = q$, which changes from 1 to $|H^1(G_{\mathbb{Q}_q}, M)|/|H^0(G_{\mathbb{Q}_q}, M)| = |H^2(G_{\mathbb{Q}_q}, M)|$, using the above Euler characteristic formula. By the local Tate duality pairing, $|H^2(G_{\mathbb{Q}_q}, M)| = |H^0(G_{\mathbb{Q}_q}, M^*)|$.
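We must also consider the effect on $\mathcal{L}^*$. Let $\mathcal{L}^{*\prime}$ denote the new dual local conditions. Since $L^{*\prime}_q = 0$, the conditions defining $H^1_{\mathcal{L}^{*\prime}}$ are more restrictive than those defining $H^1_{\mathcal{L}^*}$. Thus,
$$|H^1_{\mathcal{L}^{*\prime}}(G_{\mathbb{Q}}, M^*)| \le |H^1_{\mathcal{L}^*}(G_{\mathbb{Q}}, M^*)|.$$
Putting this together,
$$\frac{|H^1_{\mathcal{L}'}(G_{\mathbb{Q}}, M)|}{|H^1_{\mathcal{L}}(G_{\mathbb{Q}}, M)|} = \frac{|H^1_{\mathcal{L}^{*\prime}}(G_{\mathbb{Q}}, M^*)|}{|H^1_{\mathcal{L}^*}(G_{\mathbb{Q}}, M^*)|}\,|H^0(G_{\mathbb{Q}_q}, M^*)| \le |H^0(G_{\mathbb{Q}_q}, M^*)|.$$
QED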
The remaining matter is to compute the individual terms on the right-hand side of Wiles' formula. Let us start with $p = \infty$. $H^0(G_{\mathbb{Q}_\infty}, M) = M^{G_{\mathbb{Q}_\infty}}$. We are assuming $\bar\rho$ is odd so can take the image of complex conjugation to be $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, and so with $M = \mathrm{Ad}^0(\bar\rho)$, we seek those trace 0 matrices fixed under conjugation by that matrix, easily computed to be
_
_
a 0
0 a
_
_
[a
F. [L
)[
(by local Tate duality).
Thus, the only hard part lies in computing the $p = \ell$ contribution to the formula. (Note that the global terms, e.g. $|H^0(G_{\mathbb{Q}}, M)|$ will typically be 1, since we assume that $\bar\rho$ is absolutely irreducible, whence the centralizer of its image consists of scalar matrices, but the only such of trace 0 is the zero matrix.)
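Both fixed-point computations above can be checked by brute force in a small case. The sketch below is an illustration under stated assumptions, not the general argument: it takes $F = \mathbb{Z}/3$ and pretends the image of $\bar\rho$ is all of $GL_2(\mathbb{Z}/3)$, then lists the trace 0 matrices fixed under conjugation by $\mathrm{diag}(1, -1)$ and by the whole group.

```python
from itertools import product

P = 3  # illustrative choice of residue field F = Z/3

def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) % P
                       for j in range(2)) for i in range(2))

def det(A):
    return (A[0][0] * A[1][1] - A[0][1] * A[1][0]) % P

def inverse(A):
    """Inverse of an invertible 2x2 matrix over Z/P via the adjugate."""
    d_inv = pow(det(A), -1, P)
    return ((d_inv * A[1][1] % P, -d_inv * A[0][1] % P),
            (-d_inv * A[1][0] % P, d_inv * A[0][0] % P))

ALL = [((a, b), (c, d)) for a, b, c, d in product(range(P), repeat=4)]
GL2 = [A for A in ALL if det(A) != 0]
TRACE0 = [A for A in ALL if (A[0][0] + A[1][1]) % P == 0]

def fixed_points(group_elements):
    """Trace 0 matrices X with g X g^{-1} = X for every listed g."""
    return [X for X in TRACE0
            if all(matmul(matmul(g, X), inverse(g)) == X
                   for g in group_elements)]

conj = ((1, 0), (0, P - 1))          # diag(1, -1): complex conjugation
fixed_by_conj = fixed_points([conj])  # the p = infinity term
fixed_by_all = fixed_points(GL2)      # the global H^0 term
print(fixed_by_conj)
print(fixed_by_all)
```

In this toy case the first list consists of the diagonal matrices $\mathrm{diag}(a, -a)$ and the second contains only the zero matrix, in line with the two claims in the text.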
In the case of H
1
f
, the theory of Fontaine and Lafaille is used,
whereby they work with a category equivalent to that of -
nite at W(F)[G
Q
[
[H
0
(G
Q
, Ad
0
( ))[
= [F[.
As for H
1
ss
, if we let W Ad
0
( ) = M be
_
_
0
0 0
_
_
, then
this is a G
Q
). Since
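$$\begin{pmatrix} r & s \\ 0 & t \end{pmatrix} \begin{pmatrix} 0 & a \\ 0 & 0 \end{pmatrix} \begin{pmatrix} r & s \\ 0 & t \end{pmatrix}^{-1} = \begin{pmatrix} 0 & ra/t \\ 0 & 0 \end{pmatrix},$$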
the G
Q
:= ker(H
1
(G
Q
, M) H
1
(I
, M/W)).
Consider now
H
1
(G
Q
, M)
0
H
1
ur
(G
Q
, M/W)
H
1
(G
Q
, M/W)
H
1
(I
, M/W)
Then L
= ker(res ).
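The short exact sequence of $G_{\mathbb{Q}_\ell}$-modules
$$0 \to W \to M \to M/W \to 0$$
yields a long exact sequence of cohomology groups, in which $\phi : H^1(G_{\mathbb{Q}_\ell}, M) \to H^1(G_{\mathbb{Q}_\ell}, M/W)$ denotes the induced map:
$$\begin{aligned}
0 &\to H^0(G_{\mathbb{Q}_\ell}, W) \to H^0(G_{\mathbb{Q}_\ell}, M) \to H^0(G_{\mathbb{Q}_\ell}, M/W) \\
&\to H^1(G_{\mathbb{Q}_\ell}, W) \to H^1(G_{\mathbb{Q}_\ell}, M) \to \mathrm{Im}(\phi) \to 0.
\end{aligned}$$
Since the alternating product of the orders is 1,
$$|\mathrm{Im}(\phi)| = \frac{|H^0(G_{\mathbb{Q}_\ell}, W)|\,|H^0(G_{\mathbb{Q}_\ell}, M/W)|\,|H^1(G_{\mathbb{Q}_\ell}, M)|}{|H^0(G_{\mathbb{Q}_\ell}, M)|\,|H^1(G_{\mathbb{Q}_\ell}, W)|}.$$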
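Next note that, with $\phi : H^1(G_{\mathbb{Q}_\ell}, M) \to H^1(G_{\mathbb{Q}_\ell}, M/W)$ the induced map and $\mathrm{res}$ the restriction to $I_\ell$,
$$|\mathrm{Im}(\mathrm{res} \circ \phi)| \ge |\mathrm{Im}(\phi)|/|H^1_{ur}(G_{\mathbb{Q}_\ell}, M/W)| = |\mathrm{Im}(\phi)|/|H^0(G_{\mathbb{Q}_\ell}, M/W)|.$$
Putting this together,
$$|L_\ell| = \frac{|H^1(G_{\mathbb{Q}_\ell}, M)|}{|\mathrm{Im}(\mathrm{res} \circ \phi)|} \le \frac{|H^1(G_{\mathbb{Q}_\ell}, M)|\,|H^0(G_{\mathbb{Q}_\ell}, M/W)|}{|\mathrm{Im}(\phi)|} = \frac{|H^0(G_{\mathbb{Q}_\ell}, M)|\,|H^1(G_{\mathbb{Q}_\ell}, W)|}{|H^0(G_{\mathbb{Q}_\ell}, W)|}.$$
Thus,
$$\frac{|L_\ell|}{|H^0(G_{\mathbb{Q}_\ell}, M)|} \le \frac{|H^1(G_{\mathbb{Q}_\ell}, W)|}{|H^0(G_{\mathbb{Q}_\ell}, W)|} = |H^2(G_{\mathbb{Q}_\ell}, W)|\,|F| = |H^0(G_{\mathbb{Q}_\ell}, W^*)|\,|F|$$
(using respectively the local Euler characteristic, noting that $|W| = |F|$, and local Tate duality).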
To compute [H
0
(G
Q
, W
)[, let c
= (
1
(Fr
)/
2
(Fr
)) 1,
which will turn out to be very important later! Let W
be xed by G
Q
, r F. If
=
1
(Fr
)/
2
(Fr
F) by R/(c
, W
)[ = [F/(c
F)[
and so
[L
[
[H
0
(G
Q
, M)[
[F[[F/(c
F)[.
The Big Picture. We obtained a formula, involving the or-
ders of various Galois cohomology groups, for the tangential
dimension r of 1
is a quotient
of W(F)[[T
1
, . . . , T
r
]]). We also computed how this dimension
changes under the operation of adding (or removing) a prime
to/from . We next intend to show that there is a universal
lift of type parametrizing lifts associated to modular forms,
G
Q
GL
2
(T
this
produces is an isomorphism of local rings. First, however, let us
obtain a criterion for establishing such an isomorphism, a nu-
merical criterion involving on the one hand r and on the other
a certain invariant of T
.
8
Criteria for ring isomorphisms
Given the kind of : G
Q
GL
2
(F) in which we are in-
terested (i.e. absolutely irreducible and semistable) and a -
nite set of rational primes, we have obtained a universal lift
: G
Q
GL
2
(1
) of type . By uni-
versality, we get a morphism
: 1
is an isomorphism.
By kind of , we also wish to include that it is associated to
at least one cuspidal eigenform f. This means that f gives us
a lift of to characteristic zero, say
f
: G
Q
GL
2
(O), where
O is the valuation ring of a nite extension of Q
and is in (
F
.
We shall see later that there exists f such that
f
is of type
, so of type for any . The universality of 1
and T
then
yields a commutative triangle of morphisms:
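$$\begin{array}{ccc}
\mathcal{R}_\Sigma & \longrightarrow & \mathbb{T}_\Sigma \\
& \searrow \quad \swarrow & \\
& \mathcal{O} &
\end{array}$$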
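Let $\mathcal{C}_{\mathcal{O}}$ denote the subcategory of $\mathcal{C}_F$ consisting of local $\mathcal{O}$-algebras (as noted earlier, every ring $A$ in $\mathcal{C}_F$ is of the form $W(F)[[T_1, ..., T_r]]/I$ and so is a local $W(F)$-algebra - we are asking that the map $W(F) \to A$ factors through $\mathcal{O}$, or equivalently that $A$ is a quotient of some $\mathcal{O}[[T_1, ..., T_r]]$). Inspired by our intended application, we shall consider the category $\mathcal{C}^*_{\mathcal{O}}$, whose objects are pairs $(A, \pi_A)$, where $A$ is in $\mathcal{C}_{\mathcal{O}}$ and $\pi_A : A \to \mathcal{O}$ is a surjective morphism, and whose morphisms are morphisms $\phi : A \to B$ such that
$$\begin{array}{ccc}
A & \xrightarrow{\ \phi\ } & B \\
& {\scriptstyle \pi_A}\searrow \quad \swarrow{\scriptstyle \pi_B} & \\
& \mathcal{O} &
\end{array}$$
commutes.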
Definition 8.1 Associated to an object $(A, \pi_A)$ of $\mathcal{C}^*_{\mathcal{O}}$ are two invariants, namely the cotangent space
$$\Phi_A = (\ker \pi_A)/(\ker \pi_A)^2$$
(a finitely generated $\mathcal{O}$-module) and the congruence ideal
$$\eta_A = \pi_A(\mathrm{Ann}_A \ker \pi_A)$$
(an ideal of $\mathcal{O}$).
Call a ring $A$ in $\mathcal{C}_{\mathcal{O}}$ a complete intersection ring if $A$ is free of finite rank as an $\mathcal{O}$-module and is expressible as $\mathcal{O}[[T_1, ..., T_r]]/(f_1, ..., f_r)$.
Example: Let $f$ be the cuspidal eigenform associated to the elliptic curve denoted 57B in Cremona's book. This has level 57, weight 2, trivial Nebentypus, and all its coefficients in $\mathbb{Z}$. It therefore has an associated 3-adic representation $\rho_f : G_{\mathbb{Q}} \to GL_2(\mathbb{Z}_3)$. Let $\bar\rho : G_{\mathbb{Q}} \to GL_2(\mathbb{Z}/3)$ be the mod 3 representation it produces. One computes that $\bar\rho$ is surjective as follows.
The image of $\mathrm{Fr}_2$ must have trace $a_2(f) = 1$ and determinant 2. The only such elements of $GL_2(\mathbb{Z}/3)$ have order 8. Next, the image of the local representation at 19 is given by Tate's theory of elliptic curves since 19 is a prime of multiplicative reduction and we see that 3 divides its order. Thus 24 divides the order of the image of $\bar\rho$. One easily sees that the only subgroup of $GL_2(\mathbb{Z}/3)$ of order 24 is $SL_2(\mathbb{Z}/3)$ but the image of $\mathrm{Fr}_2$ has nontrivial determinant. Thus the image is the whole of $GL_2(\mathbb{Z}/3)$ of order 48.
It follows that $\bar\rho$ is absolutely irreducible. Since 57B is a semistable curve, $\bar\rho$ is semistable. Our theory will apply to this example.
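The group-theoretic facts used in this example are small enough to check by enumeration. The sketch below is an illustrative check, not part of the original notes: it lists $GL_2(\mathbb{Z}/3)$, counts $SL_2(\mathbb{Z}/3)$, and computes the orders of all elements with trace 1 and determinant 2. It does not verify the claim about subgroups of order 24.

```python
from itertools import product

P = 3
I2 = ((1, 0), (0, 1))

def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) % P
                       for j in range(2)) for i in range(2))

def det(A):
    return (A[0][0] * A[1][1] - A[0][1] * A[1][0]) % P

def trace(A):
    return (A[0][0] + A[1][1]) % P

def order(A):
    """Multiplicative order of an invertible matrix A."""
    X, n = A, 1
    while X != I2:
        X, n = matmul(X, A), n + 1
    return n

GL2 = [((a, b), (c, d)) for a, b, c, d in product(range(P), repeat=4)
       if det(((a, b), (c, d))) != 0]
SL2 = [A for A in GL2 if det(A) == 1]

# Candidates for the image of Frobenius at 2: trace 1, determinant 2.
frob_candidates = [A for A in GL2 if trace(A) == 1 and det(A) == 2]
orders = {order(A) for A in frob_candidates}
print(len(GL2), len(SL2), orders)
```

Since the candidates have determinant 2, none of them lies in $SL_2(\mathbb{Z}/3)$, which is the step that rules out an image of order 24.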
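The theory of the next chapter allows us to compute that
$$\mathbb{T}_\emptyset = \{(x, y) \in \mathbb{Z}_3^2 \mid x \equiv y \pmod 3\} \cong \mathbb{Z}_3[[T]]/(T(T - 3)).$$
Thus, $\Phi_{\mathbb{T}_\emptyset} \cong \mathbb{Z}/3$, $\eta_{\mathbb{T}_\emptyset} = (3)$, and so $|\Phi_{\mathbb{T}_\emptyset}| = |\mathbb{Z}_3/\eta_{\mathbb{T}_\emptyset}|$.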
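The two invariants in this example can be worked out by hand. The following is a sketch under the assumption that the augmentation $\pi$ of $A = \mathbb{Z}_3[[T]]/(T(T-3))$ is the map sending $T \mapsto 0$ (the other choice, $T \mapsto 3$, gives the same answers by symmetry). As a $\mathbb{Z}_3$-module, $A$ is free on $1, T$, and $T^2 = 3T$ in $A$.

$$\begin{aligned}
\ker \pi &= (T) = \mathbb{Z}_3 T, \qquad (\ker \pi)^2 = (T^2) = (3T) = 3\mathbb{Z}_3 T, \\
\Phi_A &= (T)/(3T) \cong \mathbb{Z}/3, \\
(a + bT)\,T &= (a + 3b)\,T = 0 \iff a = -3b, \quad\text{so}\quad \mathrm{Ann}_A \ker \pi = \mathbb{Z}_3\,(T - 3), \\
\eta_A &= \pi\big(\mathbb{Z}_3 (T - 3)\big) = 3\mathbb{Z}_3, \qquad |\mathbb{Z}_3/\eta_A| = 3 = |\Phi_A|.
\end{aligned}$$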
Amazingly, these sorts of properties turn out to hold in general, and even more strikingly there is a simple numerical criterion that is sufficient to establish the kind of isomorphism we covet.
Theorem 8.2 (Wiles, improved by Lenstra) Let $\phi : R \to T$ be a surjective morphism in $\mathcal{C}^*_{\mathcal{O}}$. Assume that $T$ is free of finite rank as an $\mathcal{O}$-module and that $\eta_T \neq (0)$ (so $|\mathcal{O}/\eta_T|$ is finite). Then the following are equivalent:
(a) $|\Phi_R| \le |\mathcal{O}/\eta_T|$,
(b) $|\Phi_R| = |\mathcal{O}/\eta_T|$,
(c) the map $\phi$ is an isomorphism of complete intersection rings.
The theorem is established via the following lemma:
Lemma 8.3 Suppose that $A$ and $B$ are in $\mathcal{C}^*_{\mathcal{O}}$ and that there is a surjection $\phi : A \to B$.
(1) $\phi$ induces a surjection $\tilde\phi : \Phi_A \to \Phi_B$ and so $|\Phi_A| \ge |\Phi_B|$. Moreover, $\eta_A \subseteq \eta_B$.
(2) $|\Phi_A| \ge |\mathcal{O}/\eta_A|$.
(3) Suppose $B$ is a complete intersection ring. If $\tilde\phi$ is an isomorphism and $\Phi_A$ is finite, then $\phi$ is an isomorphism.
(4) Suppose $A$ is a complete intersection ring. If $\eta_A = \eta_B \neq (0)$ and $A$ and $B$ are free, finite rank $\mathcal{O}$-modules, then $\phi$ is an isomorphism.
(5) Suppose $A$ is free and of finite rank as an $\mathcal{O}$-module. Then there exists a complete intersection ring $\tilde A$ mapping onto $A$ such that the induced map $\Phi_{\tilde A} \to \Phi_A$ is an isomorphism.
Proof: We show how the lemma implies that (a) implies (c). That (c) implies (b) will come out of the proof of the lemma later. That (b) implies (a) is trivial.
$$|\mathcal{O}/\eta_T| \ge |\Phi_R| \ge |\Phi_T| \ge |\mathcal{O}/\eta_T| \qquad (*),$$
by what we are given, together with (1) and (2). Thus $|\mathcal{O}/\eta_T| = |\Phi_T|$, whence
$$|\mathcal{O}/\eta_T| = |\Phi_T| = |\Phi_{\tilde T}| \ge |\mathcal{O}/\eta_{\tilde T}|,$$
by (5) and (2). But since by (1) $\eta_{\tilde T} \subseteq \eta_T$, it must be that $\eta_{\tilde T} = \eta_T$, which by (4) implies that the map $\tilde T \to T$ is an isomorphism and so $T$ is a complete intersection ring.
Another consequence of $(*)$ is that $|\Phi_R| = |\Phi_T|$, and so by (3), $\phi : R \to T$ is an isomorphism. QED
Now we prove the lemma. For this we need first to know Nakayama's lemma and what Fitting ideals are.
Lemma 8.4 (Nakayama's Lemma) Let $A$ be a local ring and $M$ a finitely generated $A$-module. If $I$ is a proper ideal of $A$ such that $IM = M$, then $M = 0$.
Proof: Suppose $M \neq 0$ and let $m_1, ..., m_n$ be a minimal generating set for $M$. Since $m_n \in M = IM$, we can write it $r_1 m_1 + ... + r_n m_n$ with $r_i \in I$. Then $(1 - r_n)m_n = r_1 m_1 + ... + r_{n-1} m_{n-1}$. Since $I$ is proper, $r_n$ lies in the maximal ideal of $A$. We cannot have both $r_n$ and $1 - r_n$ in the maximal ideal of $A$ and so $1 - r_n$ is a unit, but then $m_n$ is in the submodule of $M$ generated by $m_1, ..., m_{n-1}$, a contradiction. QED
Corollary 8.5 Suppose $\phi : M \to N$ is a homomorphism of finitely generated $A$-modules and $I$ an ideal such that $M/IM \to N/IN$ is surjective. Then $\phi$ is surjective.
Proof: Apply Nakayama's lemma to the cokernel. QED
Note that $M/IM = M \otimes_A A/I$. In our context we shall apply the corollary by establishing that a map of $A$-modules is surjective after tensoring with $\mathcal{O}$ or $F$ (which via $\pi_A$ are quotients of $A$).
Definition 8.6 Let $R$ be in $\mathcal{C}_{\mathcal{O}}$ and $M$ be a finitely generated $R$-module. This means that there is an exact sequence
$$0 \to M' \to R^n \to M \to 0$$
of $R$-modules (i.e. a presentation of $M$).
The Fitting ideal of $M$, $\mathrm{Fitt}_R(M)$, is defined to be the ideal of $R$ generated by $\det(v_1, ..., v_n)$ as the $v_i \in R^n$ run through $M'$.
Exercise: (i) Show that this ideal is independent of the choice of presentation. [Hint: from two presentations $R^m \to M \to 0$, $R^n \to M \to 0$, form a presentation $R^{m+n} \to M \to 0$.]
(ii) Suppose $R = \mathcal{O}$ and so without loss of generality
$$M = \mathcal{O}^r \oplus (\mathcal{O}/\lambda^{n_1}) \oplus \cdots \oplus (\mathcal{O}/\lambda^{n_k}),$$
where $\lambda$ is a uniformizer of $\mathcal{O}$. Show that $\mathrm{Fitt}_{\mathcal{O}}(M) = (0)$ if $r > 0$ and $= (\lambda^{n_1 + \cdots + n_k})$ if $r = 0$.
Deduce that $|M| = |\mathcal{O}/\mathrm{Fitt}_{\mathcal{O}}(M)|$.
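The last part of the exercise says that the order of a finite module equals the index of its Fitting ideal. Here is a small numerical sketch of that statement, under the simplifying assumption that $\mathcal{O}$ is replaced by $\mathbb{Z}$ and $M$ is presented by a $2 \times 2$ integer relation matrix $C$ with nonzero determinant, so the Fitting ideal is $(\det C)$. The function names are hypothetical helpers, not from the notes.

```python
def det2(C):
    return C[0][0] * C[1][1] - C[0][1] * C[1][0]

def quotient_order(C):
    """Order of Z^2 / (lattice spanned by the rows of C), det(C) != 0.

    n = |det C| kills the quotient, so it equals (Z/n)^2 modulo the
    subgroup generated by the rows of C reduced mod n.
    """
    n = abs(det2(C))
    rows = [tuple(x % n for x in r) for r in C]
    seen = {(0, 0)}
    frontier = [(0, 0)]
    while frontier:                      # closure of {0} under adding rows
        v = frontier.pop()
        for r in rows:
            w = ((v[0] + r[0]) % n, (v[1] + r[1]) % n)
            if w not in seen:
                seen.add(w)
                frontier.append(w)
    return n * n // len(seen)

diagonal = [[3, 0], [0, 9]]    # M = Z/3 + Z/9, Fitting ideal (27)
skew = [[3, 3], [6, 15]]       # a non-diagonal presentation, det = 27
print(quotient_order(diagonal), abs(det2(diagonal)))
print(quotient_order(skew), abs(det2(skew)))
```

In both cases the counted order agrees with $|\det C|$, which is the $\mathbb{Z}$-analogue of $|M| = |\mathcal{O}/\mathrm{Fitt}_{\mathcal{O}}(M)|$.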
Proof: (1) The composition $\ker(\pi_A) \to \ker(\pi_B) \to \Phi_B$ factors through $\Phi_A$. The mere existence of a map $\mathrm{Ann}_A \ker(\pi_A) \to \mathrm{Ann}_B \ker(\pi_B)$ implies that $\eta_A \subseteq \eta_B$.
(2) We note that $\mathrm{Fitt}_R(M) \subseteq \mathrm{Ann}_R(M)$. This follows because if we take a presentation
$$0 \to M' \to R^n \to M \to 0$$
and let $x_1, ..., x_n \in M$ be the images of the standard generators of $R^n$ and let $d = \det(c_{ij})$ where $(c_{ij}) = (v_1, \ldots, v_n)$ with each $v_i \in M'$, then $\sum_j c_{ij} x_j = 0$ and so $dI_n = (d_{ij})(c_{ij})$ (where $(d_{ij})$ is the adjoint-transpose of $(c_{ij})$) multiplies the vector with entries $x_i$ to 0, whence $dx_i = 0$ for $1 \le i \le n$, and so since the $x_i$ generate $M$, $d \in \mathrm{Ann}_R(M)$.
Secondly, note that $\pi_A(\mathrm{Fitt}_A(M)) = \mathrm{Fitt}_{\mathcal{O}}(M \otimes_A \mathcal{O})$ (where $\pi_A$ makes $\mathcal{O}$ into an $A$-algebra), since as an $\mathcal{O}$-module $M \otimes_A \mathcal{O} = M/(\ker(\pi_A)M)$ is defined by the same relations as those defining $M$ as an $A$-module.
In particular, $\ker(\pi_A) \otimes_A \mathcal{O} = \Phi_A$, and so
$$\mathrm{Fitt}_{\mathcal{O}}(\Phi_A) = \pi_A(\mathrm{Fitt}_A(\ker \pi_A)) \subseteq \pi_A(\mathrm{Ann}_A \ker \pi_A) = \eta_A,$$
whence $|\Phi_A| = |\mathcal{O}/\mathrm{Fitt}_{\mathcal{O}}(\Phi_A)| \ge |\mathcal{O}/\eta_A|$, the first equality coming out of the exercise above.
[Note that the (a) $\Leftrightarrow$ (b) part of the main theorem follows since if $R$ surjects onto $T$, then always $|\Phi_R| \ge |\Phi_T| \ge |\mathcal{O}/\eta_T|$.]
(3) Consider U = O[[T
1
, ..., T
r
]] as in (
O
by taking
U
to be the
map sending each T
i
0. Since B is a complete intersection
ring, there exists a local homomorphism
B
: U B with
kernel (f
1
, ..., f
r
). Letting b
i
=
B
(T
i
), then b
i
ker(
B
). Since
:
A
B
is an isomorphism, there exists a
i
ker(
A
)
mapping to b
i
and the images of a
i
in
A
generate
A
.
Now dene
A
: U A by sending T
i
a
i
. This gives a
surjection
A
:
U
A
and so by Nakayama
A
is surjective.
We next establish that ker(
B
) ker(
A
) (and so are actually
equal).
Since
U
= O
r
and
A
is nite, ker(
A
) has r generators,
say g
1
, . . . , g
r
. Pick g
1
, . . . , g
r
ker(
A
) mapping to g
1
, . . . , g
r
.
Since ker(
A
) ker(
B
), (g
1
, ..., g
r
) = (f
1
, ..., f
r
)M for some
M M
r
(U). Let
M = M (mod (T
1
, ..., T
r
)), i.e. the ma-
trix of constant terms. Then ( g
1
, . . . , g
r
) = (
f
1
, . . . , g
r
)
M. Since
these generate the same rank r submodule of nite index in
U
, det(
M) O
1
B
is well-dened. It is an inverse
to and so is an isomorphism.
(4) We first claim that $\ker(\pi_A) \cap \mathrm{Ann}_A \ker(\pi_A) = 0$. This is proven as follows. Suppose $x \in \eta_A$, $x \neq 0$. Say $x = \pi_A(x')$, where $x' \in \mathrm{Ann}_A \ker(\pi_A)$. If $a \in \ker(\pi_A) \cap \mathrm{Ann}_A \ker(\pi_A)$, then since $x - x' \in \ker(\pi_A)$, $0 = a(x - x') = ax$ (since $a \in \ker(\pi_A)$ and $x' \in \mathrm{Ann}_A \ker(\pi_A)$). Thus $a$ is $\mathcal{O}$-torsion, so $a = 0$.
Thus the restriction $\pi_A : \mathrm{Ann}_A \ker(\pi_A) \to \eta_A$ is injective. The definition of $\eta_A$ makes it surjective. Since $\eta_A = \eta_B$, this isomorphism translates into the restriction of $\phi : \mathrm{Ann}_A \ker(\pi_A) \to \mathrm{Ann}_B \ker(\pi_B)$ being an isomorphism.
So we have an exact sequence $0 \to \ker(\phi) \oplus \mathrm{Ann}_A \ker(\pi_A) \to A$, with the cokernel
$$A/(\ker(\phi) \oplus \mathrm{Ann}_A \ker(\pi_A)) \cong B/(\phi(\mathrm{Ann}_A \ker(\pi_A))) \cong B/(\mathrm{Ann}_B \ker(\pi_B)) \hookrightarrow \mathrm{End}_{\mathcal{O}} \ker(\pi_B).$$
Thus the cokernel is torsion-free, giving us splitting over $\mathcal{O}$. If we now define $A^* = \hom_{\mathcal{O}}(A, \mathcal{O})$, then one can see explicitly that $A^* \cong A$ as $A$-modules (this says that complete intersection rings are Gorenstein). We therefore get a dual exact sequence:
$$A \to (\ker(\phi))^* \oplus (\mathrm{Ann}_A \ker(\pi_A))^* \to 0.$$
Tensoring with $F$ (the residue field of $A$) gives $1 = \dim_F(A \otimes_A F)$ and $(\mathrm{Ann}_A \ker(\pi_A))^* \otimes_A F \neq 0$ (since $\eta_A \neq 0$). Thus $(\ker(\phi))^* \otimes_A F = 0$. By Nakayama's lemma and dualizing, $\ker(\phi) = 0$, and we are done.
(5) As seen in (3), we can write $A$ as a quotient of $U = \mathcal{O}[[T_1, ..., T_r]]$, where the $T_i$ map to elements of $\ker(\pi_A)$. The idea is to choose $\bar f_1, ..., \bar f_r$ generating the kernel of the induced $\Phi_U (= \mathcal{O}^r) \to \Phi_A$ and then to lift these to $U$ and set $\tilde A = U/(f_1, ..., f_r)$, a complete intersection ring if we ensure that $\tilde A$ is finitely generated.
Let $a_1, ..., a_r$ be $\mathcal{O}$-module generators of $\ker(\pi_A)$. Let $V = \mathcal{O}[T_1, ..., T_r]$. Define $\psi : V \to A$ by $T_j \mapsto a_j$. Then $\psi$ is surjective. Pick $f_1, ..., f_r \in \ker(\psi)$ and let $m$ be the maximal degree occurring. Since $a_i^2 \in \ker(\pi_A)$, $a_i^2 = h_i(a_1, ..., a_r)$ for some linear polynomial $h_i$. Replace $f_i$ by $f_i + T_i^m h_i - T_i^{m+2}$. Then $V/(f_1, ..., f_r)$ is finitely generated as an $\mathcal{O}$-module. Complete at $(\lambda, T_1, ..., T_r)$ to get $\tilde A = U/(f_1, ..., f_r)$, a finitely generated $\mathcal{O}$-module. Note that $\tilde A \to A$ induces an isomorphism on the cotangent spaces since the linear terms of the $f_i$ generate the kernel of the induced map $\Phi_U \to \Phi_A$.
QED
There is one problem in applying the Wiles-Lenstra criterion
to our set-up, namely that 1
and T
W(F)
O and T = T
W(F)
O, and checking
the following exercise.
Exercise:
(a) 1
T
with the aim of proving [
R
[ [O/
T
[.
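Lemma 8.7 Let $E$ be the fraction field of $\mathcal{O}$. There is a natural bijection
$$\mathrm{Hom}_{\mathcal{O}}(\Phi_R, E/\mathcal{O}) \to H^1_\Sigma(G_{\mathbb{Q}}, \mathrm{Ad}^0(\rho_f) \otimes E/\mathcal{O}).$$
Thus, $|\Phi_R| = |H^1_\Sigma(G_{\mathbb{Q}}, \mathrm{Ad}^0(\rho_f) \otimes E/\mathcal{O})|$.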
Proof: Recall how at the start of chapter 7 we showed how
given : G GL
n
(F) a lift : G GL
n
(F[]) yielded
a 1-cocycle G ker(GL
n
(F[]) GL
n
(F)) by considering
the dierence between and the trivial lift of . Likewise,
here we have a trivial lift of
f
: G
Q
GL
2
(O) to R/
2
,
where = ker(R O), since R (and so R/
2
) is an O-
algebra. Using this, each lift of
f
to R/
2
gives a 1-cocycle
G
Q
ker(GL
2
(R/
2
) GL
2
(O)) = M
2
(
R
). Strictly equiv-
alent lifts dier by a 1-coboundary and the process can be re-
versed.
Next, note that f Hom(
R
, O/
n
) together with this de-
nes a class in H
1
(G
Q
, Ad
0
(
f
) O/
n
), since we are only
considering lifts of type . Taking the union (direct limit)
over all n yields the lemma. Another way of seeing this is to
consider lifts of
f
to O[]/(
n
,
2
). Every such lift yields a
map R O[]/(
n
,
2
). This restricts to a homomorphism
(O/
n
) such that
2
maps to 0, and so this factors
through
R
. QED
We can now use the Wiles-Lenstra criterion to reduce proving
that 1
/
= p, R
/
= 1
W(F)
O, T
/
= T
W(F)
O. We have
the following commutative diagram, reecting the facts that
the larger rings parametrize larger sets:
R
/
T
/
R
T
Theorem 8.8 There exists $c_p \in \mathcal{O}$ satisfying
$$|\Phi_{R'}|/|\Phi_R| \le |\mathcal{O}/c_p|, \qquad \eta_{T'} \subseteq c_p\,\eta_T.$$
We shall specify $c_p$ shortly. The amazing thing is that the same $c_p$ arises on either side. This is what makes our reduction work.
Corollary 8.9 Suppose $R \to T$ is an isomorphism. Then so is $R' \to T'$. Applying this one prime at a time, we see that if $\mathcal{R}_\emptyset \to \mathbb{T}_\emptyset$ is an isomorphism, then $\mathcal{R}_\Sigma \to \mathbb{T}_\Sigma$ is an isomorphism for every finite set $\Sigma$ of primes.
Proof: By the theorem, $|\Phi_{R'}| \le |\Phi_R|\,|\mathcal{O}/c_p|$, which $= |\mathcal{O}/\eta_T|\,|\mathcal{O}/c_p|$ by Wiles-Lenstra since $R \to T$ is an isomorphism. Applying the last theorem again, this $\le |\mathcal{O}/\eta_{T'}|$. Applying Wiles-Lenstra again gives that $R' \to T'$ is an isomorphism. QED
The big question is what $c_p$ works. If $p \neq \ell$, then following theorem 7.15 we have that
$$|\Phi_{R'}|/|\Phi_R| \le |H^0(G_{\mathbb{Q}_p}, M^*)|,$$
where $M = \mathrm{Ad}^0(\rho_f) \otimes E/\mathcal{O}$. This $H^0$ is just the fixed points of
M
)[ = [O/c
p
[ where
$c_p = (p - 1)(a_p^2 - (p + 1)^2)$ if $p$ is unramified in $\bar\rho$ (and so $\rho_f$), and $= p^2 - 1$ otherwise.
Proof: In the case that $p$ is unramified, $I_p$ acts trivially and so the action of $G_{\mathbb{Q}_p}$ factors through $\langle \mathrm{Fr}_p \rangle$. Thus we need the order of the submodule of $M^*$ fixed under $\mathrm{Fr}_p$. The first thing to note is that the representation $\mathrm{Ad}^0(\rho_f) = \mathrm{Sym}^2 \otimes \det^{-1}$ and so we just need the fixed points of $\mathrm{Sym}^2$. The eigenvalues of $\mathrm{Fr}_p$ are $\alpha_p^2, \alpha_p\beta_p, \beta_p^2$ as given below and so of $1 - \mathrm{Fr}_p$ are $1 - \alpha_p^2, 1 - \alpha_p\beta_p, 1 - \beta_p^2$. Thus the determinant of the action is $(1 - \alpha_p^2)(1 - \alpha_p\beta_p)(1 - \beta_p^2) = c_p$ above.
The ramified at $p$ case is similar. QED
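The identity between the product over eigenvalues and the closed form for $c_p$ follows from $\alpha_p + \beta_p = a_p$ and $\alpha_p\beta_p = p$, and can be sanity-checked numerically. The sketch below uses a few sample pairs $(p, a_p)$ that are illustrative integers within the bound $|a_p| \le 2\sqrt{p}$; they are assumptions for the check, not coefficients of any particular eigenform.

```python
import cmath

def c_p(p, a_p):
    """Closed form (p - 1)(a_p^2 - (p + 1)^2)."""
    return (p - 1) * (a_p ** 2 - (p + 1) ** 2)

def eigenvalue_product(p, a_p):
    """(1 - alpha^2)(1 - alpha*beta)(1 - beta^2) for the roots of X^2 - a_p X + p."""
    disc = cmath.sqrt(a_p ** 2 - 4 * p)
    alpha = (a_p + disc) / 2
    beta = (a_p - disc) / 2
    return (1 - alpha ** 2) * (1 - alpha * beta) * (1 - beta ** 2)

samples = [(2, 1), (5, -2), (7, 4), (11, 0)]   # hypothetical (p, a_p) pairs
for p, a in samples:
    print(p, a, c_p(p, a), eigenvalue_product(p, a))
```

For each sample the two values agree up to floating-point error, and $c_p$ is nonzero because $|a_p| < p + 1$.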
Remarks: Note that $c_p \neq 0$. This follows from the Petersson-Ramanujan bound $|a_p| \le 2\sqrt{p}$.
This will be our choice of $c_p$ for $p \neq \ell$. To give some perspective of how Wiles' desired inequality actually fits into the big picture of number theory, relating orders of Selmer groups to special values of $L$-functions, we next define the $L$-function of the symmetric square of $\rho_f$.
Definition 8.11 Suppose $f = \sum a_n q^n$ is a cuspidal eigenform of weight 2 and trivial Nebentypus. Let $\alpha_p, \beta_p$ be the eigenvalues of Frobenius, so they satisfy $\alpha_p + \beta_p = a_p(f)$ and $\alpha_p\beta_p = p$. The $p$th Euler factor of $L(s, \rho_f)$ is
$$\frac{1}{(1 - \alpha_p p^{-s})(1 - \beta_p p^{-s})} = \frac{1}{1 - a_p p^{-s} + p\,p^{-2s}}.$$
The $p$th Euler factor of $L(s, \mathrm{Sym}^2(\rho_f))$ is
$$\frac{1}{(1 - \alpha_p^2 p^{-s})(1 - \alpha_p\beta_p p^{-s})(1 - \beta_p^2 p^{-s})}.$$
The amazing fact is that up to a power of $p$, $c_p$ is the $p$th Euler factor of $L(2, \mathrm{Sym}^2(\rho_f))$. In this way we can restate the desired inequality in terms of the order of a Selmer group being bounded by a special value of an $L$-function, and we have a case of the Bloch-Kato conjecture, a vast generalization of the Birch and Swinnerton-Dyer conjecture. Unfortunately, this observation led Wiles off on a wild goose chase for several years, seeking to prove this case of Bloch-Kato by generalizing ideas of Flach to construct geometric Euler systems, similar to those that had been successful in proving cases of the Birch and Swinnerton-Dyer conjecture earlier. We shall return to this observation once
we have introduced T
.
Lemma 8.12 If p = , then if is not both good and ordinary,
then [H
1
ss
[/[H
1
f
[ = 1 and so we set c
= a
2
( +1)
2
. Up to units, this is the
same as
2
1, where
X+ (there
is one such by the ordinariness of at .
Proof: The factor in Wiles' formula for the order of the Selmer group goes up by $|H^1_{ss}|/|H^1_f|$ in going from $\Sigma$ to $\Sigma'$. If $\bar\rho$ is not good, then in definition 7.6 $H^1_f$ was taken to be $H^1_{ss}$. If $\bar\rho$ is not ordinary, then its lifts are not either and so a lift is semistable if and only if it is good.
In the good, ordinary case, we already calculated that, if $M = \mathrm{Ad}^0(\rho_f) \otimes \mathcal{O}/\lambda^n$, then $|H^1_f(G_{\mathbb{Q}_\ell}, M)| = |H^0(G_{\mathbb{Q}_\ell}, M)|\,|\mathcal{O}/\lambda^n|$, whereas $|H^1_{ss}(G_{\mathbb{Q}_\ell}, M)| \le |H^0(G_{\mathbb{Q}_\ell}, M)|\,|\mathcal{O}/\lambda^n|\,|\mathcal{O}/(\lambda^n, c_\ell)|$, where
= (
1
/
2
)(Fr
) 1. Since
1
=
1
2
and
2
(Fr
) is the unit
root of X
2
a
X +, we get c
=
2
1. QED
Once we describe the construction of T
T
, we shall be able to show the other half - that
T
c
p
T
.
This then reduces us to having to prove the isomorphism in the
case of = . For this we prove the following. In applications,
R
n
and T
n
will be rings obtained by using sets containing
primes of a specic form, for which we can control the structure
of 1
and T
O
with T nite and free over O. For any
r 0 regard R and T as O[[S
1
, ..., S
r
]]-modules by letting the
S
i
act trivially. Let
n
(T) denote (1 +T)
p
n
1.
Suppose there exists an integer r 0 such that for every n 1
there is a commutative diagram of surjective homomorphisms
of local O[[S
1
, ..., S
r
]]-algebras
R
n
T
n
T
that has the following properties
(i) the induced maps R
n
/(S
1
, ..., S
r
)R
n
R and T
n
/(S
1
, ..., S
r
)T
n
T are isomorphisms;
(ii) the O-algebras R
n
can be generated by r elements;
(iii) the rings T
n
/(
n
(S
1
), ...,
n
(S
r
))T
n
are nite free O[[S
1
, ..., S
r
]]/(
n
(S
1
), ...,
n
(S
r
))-
algebras.
Then : R T is an isomorphism of complete intersection
rings.
9
The universal modular lift
Fix a continuous homomorphism : G
Q
GL
2
(F), where F is
a nite eld of characteristic 3. We assume as usual that is
absolutely irreducible, semistable (and so in particular odd),
and modular (i.e. there is a cuspidal eigenform f =
a
n
q
n
such that tr (Fr
p
) = a
p
for all but nitely many primes p).
Letting K = Q(
.
Since the traces of Frobenius land in T
/
m
, by Chebotarevs den-
sity theorem (that the Frobenius elements generate G
Q,S
for
any nite S) the trace of any element lies in T
/
m
. This to-
gether with the absolute irreducibility of
m
gives, by a result
of Carayol, that there is a representation into GL
2
(T
/
m
) with
the same trace as our given representation into GL
2
(T
Z
Q
).
QED
Now let
N =
N( ) be the product of the primes at which
fails to be good. Since is semistable, N( ) is squarefree and
so
N = N( ) if is good at , and N( ) otherwise.
Theorem 9.2 There exists a unique ring homomorphism a :
T
/
(
N) F such that a(T
p
) = tr (Fr
p
) for all p ,[
N. Let
m = kera, a maximal ideal of T
/
(
N). Then is isomorphic to
m
: G
Q
GL
2
(T
/
(
N)/m) GL
2
(F).
Proof: This is simply a restatement of the work of Ribet and others showing that if $\bar\rho$ is modular (as we have assumed), then it is associated to a cuspidal eigenform $f$ of weight, level, and Nebentypus predicted by Serre's invariants, in this case weight 2, level
N, and trivial Nebentypus. Now dene a as the usual
eigencharacter, T(f) = a(T)f for T T
/
(
N). QED
Next we introduce . Let N
= N
( ) =
p
n
p
, where n
p
is
the exponent of p in
N if p , , = 2 if p , p ,= , and = 1
if p , p = . In particular, N
=
N. Since
N[N
, there is a
map r : T
/
(N
) T
/
(
N). Set m
= r
1
(m), a maximal ideal
of T
/
(N
).
Theorem 9.3 Let be a lift of to a ring A in (
F
. The fol-
lowing are equivalent:
(a) is unramied outside N
.
(b) There exists a ring homomorphism : T
/
(N
)
m
A
such that is isomorphic to the representation obtained by com-
position of
m
with .
When these hold, (a) determines uniquely, extends con-
tinuously, and is a lift of of type .
Proof: (b) implies (a): Composition of the completion map
T
/
(N
) T
/
(N
)
m
).
(a) implies (b): Since T
p
maps to tr (Fr
p
) (p not in and
not dividing N( )) either way around, the following diagram
commutes:
T
/
(N
T
/
(
N)
a
F
It follows that (m
. Namely, J
0
(N
. Our form of N
then implies
that the reduction at is semistable and that the reduction at
any prime , is the same as that for . The nal piece needed
is an analogue of Tate's elliptic curves for semistable abelian varieties (due to Grothendieck, LNM 288) that translates these reduction properties over to the form of the associated local Galois representations, just as we did in chapter 5. QED
Definition 9.4 Call a lift of $\bar\rho$ satisfying the equivalent conditions of the previous theorem a modular lift of $\bar\rho$ of type $\Sigma$.
Now we are in a position to produce a universal such lift.
Let F
0
be the subeld of F generated by tr (g)[g G
Q
,
which is by Chebotarev the same as the subeld generated by
the traces of Frobenius. Thus T
/
(N
)/m
= F
0
. Set T
=
T
/
(N
)
m
W(F
0
)
W(F).
Then T
is an object in (
F
and the map T
/
(N
)
m
composed with
m
: G
Q
GL
2
(T
/
(N
)
m
) yields a lift
mod
:
G
Q
GL
2
(T
A such that
mod
sending
mod
to
mod
. This is surjective
since T
: 1
sending
the universal lift of type to
mod
is surjective.
(3) Since T
/
(N
is
reduced and a free W(F)-module of nite rank.
We can even give an explicit construction of T
. Given ,
consider the set of newforms f such that
f
: G
Q
GL
2
(O
f
)
is equivalent to a lift of of type . This is nonempty since by
Ribet there is such a cuspidal eigenform giving a lift of type
, and so necessarily of type . Inside
O
f
consider for each
prime p not in and not dividing N( ) the element (a
p
(f)).
Then T
= (x, y) Z
2
3
[x y (mod 3)
= Z
3
[[T]]/(T(T 3)).
In particular, this is a complete intersection ring and
T
=
3Z
3
.
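If, in a general situation, there are exactly two newforms $f$ and $g$ (of allowable weight and level) producing the same $\bar\rho : G_{\mathbb{Q}} \to GL_2(\mathbb{Z}/\ell)$ and they are congruent $\pmod{\ell^n}$ but not $\pmod{\ell^{n+1}}$, then $\mathbb{T}_\Sigma = \mathbb{Z}_\ell[[T]]/(T(T - \ell^n))$ yielding $\eta_{\mathbb{T}_\Sigma} = \ell^n \mathbb{Z}_\ell$.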
In general,
T
. The
rst one we needed in reducing to the minimal case. Let T = T
and T
/
= T
, where
/
= p.
Theorem 9.6 With the choice of $c_p \in \mathcal{O}$ from the previous chapter, $\eta_{T'} = c_p\,\eta_T$.
Proof: Given
mod
: G
Q
GL
2
(T
), let M
O, which
we shall denote by (x, y) < x, y >
. Let
= ker(T
O)
and L
= M
annihilated by
),
a free T
= O-module of rank 2.
Claim: If x, y is a basis for L
, then
T
=< x, y >
.
To compare
T
and
T
, we need to construct a homomorphism
: M
= N
p
2
. Let X = X
0
(N
), X
/
= X
0
(N
),
and J, J be the respective Jacobians. The map on the upper
half-plane sending (, p, p
2
) induces degeneracy maps
X
/
X
3
. By Albanese functoriality, this yields a map J
/
J
3
.
(In the cases p ,= , p[N( ) or p = , good and ordinary, we
do the same with just (, p). For p = , otherwise, we
use just . )
This then induces a homomorphism T
(J
/
)
Z
O (T
(J)
Z
O)
3
. By picking the right map (M
)
3
M
, namely (1, p
1
T
p
, p
1
),
we induce a map : M
. Let
/
: M
be the
adjoint of , i.e. satisfy
< x, y >
=<
/
x, y >
Wiles computes
/
by looking at the eect on cuspforms. He
nds that
/
= p
2
(p 1)(T
2
p
(p +1)
2
), which up to a unit
is c
p
. In every other case, the same thing holds.
Now, assuming that if x, y is a basis of M
= (<
/
(x),
/
(y) >
) = (< x,
/
(y) >)
= c
p
(< x, y >
) = c
p
QED
10
The minimal case
Our aim then is rst to show that the homomorphism 1
1
j
qa
ij
(mod m)I. Our
assumption that
i
,=
j
yields that a
ij
mI for all i ,= j.
Thus, I = mI, so by Nakayama I = (0). Hence, a
ij
= 0 for
i ,= j, i.e. (x) is diagonal.
By the equation above again,
i
is of nite order dividing
q 1. Since
i
1 (mod m), these orders are powers of .
QED
Letting
q
be the largest quotient of (Z/q)
of order a power
of , this is naturally a quotient of I
q
via the cyclotomic char-
acter. The last result says that each
i
factors through
q
.
We shall be considering sets Q of primes q satisfying the con-
ditions above. Letting
Q
=
qQ
, choosing one of the
i
for
each q Q denes a homomorphism
Q
A
. In particular,
if we take A to be the universal type Q deformation ring 1
Q
or
the universal modular type Q deformation ring T
Q
, then this
makes them W(F)[
Q
]-algebras. Letting I
Q
be the augmenta-
tion ideal of W(F)[
Q
], our above lemma nicely relates 1
Q
and
1
.
Theorem 10.2 A lift of of type Q to A in (
F
is unramied
at all q Q if and only if the kernel of the representing map
1
Q
A contains I
Q
1
Q
.
Thus the natural map 1
Q
1
induces an isomorphism
1
Q
/I
Q
1
Q
1
.
Proof: First, note that in the above lemma,
2
=
1
1
since the
determinant of lands in both
1
(A) and in Z
(G
Q
, Ad
0
( )
res
q
rQ
H
1
(G
Q
q
, Ad
0
( )
)()
is an isomorphism, then 1
Q
is generated by r elements.
Proof: 1
Q
is a quotient of O[[T
1
, ..., T
d
]] where d is the dimension
of H
1
Q
(G
Q
, Ad
0
( )). So we need an upper bound (of r) on this
dimension.
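By Wiles' formula, looking at the $q$th factor, $q \in Q$, gives:
$$\frac{|L_q|}{|H^0(G_{\mathbb{Q}_q}, \mathrm{Ad}^0(\bar\rho))|} = \frac{|H^1(G_{\mathbb{Q}_q}, \mathrm{Ad}^0(\bar\rho))|}{|H^0(G_{\mathbb{Q}_q}, \mathrm{Ad}^0(\bar\rho))|}$$
which $= |H^2(G_{\mathbb{Q}_q}, \mathrm{Ad}^0(\bar\rho))|$ since the Euler characteristic is 1,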
so by Tate duality = [H
0
(G
Q
q
, Ad
0
( )
).
Since $R_Q / I_Q R_Q = R_\emptyset$, $R_Q$ is generated by the same number of elements as … $(G_{\mathbb{Q}}, \mathrm{Ad}^0(\bar\rho)^*)$.
We just need to show this is $\leq r$, but this follows from our hypothesis and the fact that $\dim H^1(G_{\mathbb{Q}_q}, \mathrm{Ad}^0(\bar\rho)^*) = 1$. QED
Moving towards the hypotheses of theorem 8.13, we want for each $n \in \mathbb{Z}$ and $\psi \in H^1_Q(G_{\mathbb{Q}}, \mathrm{Ad}^0(\bar\rho)^*)$ a prime $q$ depending on $\psi$ such that
(1) $q \equiv 1 \pmod{\ell^n}$
(2) $q$ special
(3) $\mathrm{res}_q(\psi) \neq 0$ (to ensure $(*)$ holds).
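Recall Chebotarev's theorem. This states that if $K/\mathbb{Q}$ is a finite Galois extension and $g \in \mathrm{Gal}(K/\mathbb{Q})$, then there exist infinitely many primes $q$ unramified in $K/\mathbb{Q}$ such that $\mathrm{Fr}_q = g$ (at least up to conjugation). The above conditions then translate into finding $\sigma \in G_{\mathbb{Q}}$ satisfying
(1) $\sigma \in G_{\mathbb{Q}(\zeta_{\ell^n})} = \ker\bigl(G_{\mathbb{Q}} \to \mathrm{Gal}(\mathbb{Q}(\zeta_{\ell^n})/\mathbb{Q}) \cong (\mathbb{Z}/\ell^n)^\times\bigr)$
(2) $\mathrm{Ad}^0(\bar\rho)(\sigma)$ has an eigenvalue $\neq 1$
(3) $\psi(\sigma) \notin (\sigma - 1)\mathrm{Ad}^0(\bar\rho)^*$
where $\mathrm{Ad}^0$ is the homomorphism $GL_2(F) \to \mathrm{Aut}(M_2^0(F)) \cong GL_3(F)$ given by conjugation in $M_2(F)$.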
Exercise: Show that if $\bar\rho(\sigma)$ has eigenvalues $\lambda$ and $\mu$, then $\mathrm{Ad}^0(\bar\rho)(\sigma)$ has eigenvalues $1, \lambda/\mu, \mu/\lambda$. Deduce that the eigenvalues of $\bar\rho(\sigma)$ are distinct if and only if $\mathrm{Ad}^0(\bar\rho)(\sigma)$ has an eigenvalue other than 1. The equivalence of both (2)s above follows if we ensure $F$ is large enough to contain all eigenvalues $\lambda, \mu$.
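A quick numerical check of the exercise in the diagonal case may help. The sketch below conjugates the standard basis $H, E_{12}, E_{21}$ of the trace-zero matrices by $\mathrm{diag}(\lambda, \mu)$ over a small prime field and reads off the scalars. The prime 7 and the sample eigenvalues are assumptions made for the example; this is not a proof, only a sanity check.

```python
P = 7  # hypothetical small prime; entries are integers mod P


def mat_mul(x, y, p=P):
    """Multiply two 2x2 matrices with entries reduced mod p."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) % p for j in range(2)]
            for i in range(2)]


def conjugate_by_diag(lam, mu, x, p=P):
    """Return g x g^{-1} for g = diag(lam, mu), with lam and mu nonzero mod p."""
    g = [[lam, 0], [0, mu]]
    g_inv = [[pow(lam, -1, p), 0], [0, pow(mu, -1, p)]]
    return mat_mul(mat_mul(g, x, p), g_inv, p)


def ad0_eigenvalues(lam, mu, p=P):
    """Scalars by which conjugation by diag(lam, mu) acts on H, E12, E21."""
    basis = {
        "H": [[1, 0], [0, p - 1]],   # diag(1, -1)
        "E12": [[0, 1], [0, 0]],
        "E21": [[0, 0], [1, 0]],
    }
    scalars = []
    for x in basis.values():
        y = conjugate_by_diag(lam, mu, x, p)
        # Each basis vector is an eigenvector; find c with y = c * x.
        c = next(c for c in range(p)
                 if all(y[i][j] == (c * x[i][j]) % p
                        for i in range(2) for j in range(2)))
        scalars.append(c)
    return scalars
```

With $\lambda = 2$, $\mu = 3$ in $\mathbb{F}_7$ the scalars are $1, \lambda/\mu = 3, \mu/\lambda = 5$, while $\lambda = \mu$ gives $1, 1, 1$, matching the statement of the exercise.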
Since the scalar matrices act trivially by conjugation, $\mathrm{Ad}^0$ factors through $PGL_2(F)$ and so the image of $\mathrm{Ad}^0(\bar\rho)$ restricted to $G_{\mathbb{Q}(\zeta_{\ell^n})}$ can be considered as a finite subgroup of $PGL_2(F)$ and so is one of $PSL_2(\mathbb{F}_{\ell^n})$, $PGL_2(\mathbb{F}_{\ell^n})$, $D_n$, $A_4$, $S_4$, $A_5$. The idea is to show that in each case
$$H^1(\mathrm{Gal}(K/\mathbb{Q}), \mathrm{Ad}^0(\bar\rho)^*) = 0$$
either by direct calculation (Cline-Parshall-Scott) or by using that $\mathrm{Gal}(K/\mathbb{Q})$ has order prime to $\ell$ (Schur-Zassenhaus).
11
Putting it together - the final trick
Theorem 11.1 Fermat's Last Theorem holds, i.e. if $a, b, c, n$ are integers such that $a^n + b^n + c^n = 0$ and $n \geq 3$, then $abc = 0$.
More generally, the argument below establishes:
Theorem 11.2 Every semistable elliptic curve over $\mathbb{Q}$ is modular.
Let $E$ be the corresponding Frey elliptic curve. Consider its associated 3-division representation $\bar\rho : G_{\mathbb{Q}} \to GL_2(\mathbb{F}_3)$. There are two possibilities - either $\bar\rho$ is irreducible or reducible. Suppose first it is irreducible.
Lemma 11.3 In this case, $\bar\rho$ restricted to $G_{\mathbb{Q}(\sqrt{-3})}$ is absolutely irreducible.
Proof: Let $H$ be the image of $\bar\rho$ in $PGL_2(\mathbb{F}_3) \cong S_4$. Suppose the lemma is false. Then $H \neq S_4$, else $\bar\rho$ would be surjective by the exercise that follows, and then the image of $G_{\mathbb{Q}(\sqrt{-3})}$ would be $SL_2(\mathbb{F}_3)$, which is absolutely irreducible.
We consider the subgroups of $S_4$ in turn, first noting that $H \leq A_4 \cong PSL_2(\mathbb{F}_3)$ is impossible since then $\det(\bar\rho)$ would be trivial. Also, $H$ cannot be a subgroup of any $S_3$ since then the image of $\bar\rho$ is in a Borel subgroup and so $\bar\rho$ is reducible, contradicting our hypothesis.
This leaves the only possibilities that $H$ is dihedral of order 8 or a subgroup of index 2 of it. Since $E$ is semistable at every prime $p \neq 3$, $\bar\rho(I_p)$ has order 1 or 3. Since $3 \nmid |H|$, $\bar\rho(I_p)$ has order 1, i.e. $\bar\rho$ can only be ramified at the prime 3. Thus the abelianization $H/H'$ is the Galois group of an abelian extension of $\mathbb{Q}$ of degree 4 ramified only at 3. By Kronecker-Weber, this has to be a subextension of some $\mathbb{Q}(\zeta_{3^r})$, but this has degree over $\mathbb{Q}$ not divisible by 4. QED
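The last step of the proof rests on the fact that $[\mathbb{Q}(\zeta_{3^r}) : \mathbb{Q}] = \varphi(3^r) = 2 \cdot 3^{r-1}$, which is always congruent to 2 modulo 4. A brute-force sketch confirming this for small $r$ follows; the range of $r$ is an arbitrary choice for the example.

```python
from math import gcd


def cyclotomic_degree(m: int) -> int:
    """Degree of Q(zeta_m) over Q, i.e. Euler's phi(m), by direct counting."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


# Degrees of Q(zeta_{3^r}) over Q for r = 1, ..., 6.
degrees = {r: cyclotomic_degree(3 ** r) for r in range(1, 7)}

# None of these degrees is divisible by 4, so no such field can
# contain a subextension of degree 4 over Q.
none_divisible_by_four = all(d % 4 != 0 for d in degrees.values())
```

Since the degree of a subextension divides the degree of the field, this rules out an abelian extension of degree 4 ramified only at 3.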
The hypotheses of the general case are now satisfied; namely we have a representation $\bar\rho : G_{\mathbb{Q}} \to GL_2(\mathbb{F}_\ell)$ ($\ell = 3$ in fact) such that
(i) $\bar\rho$ restricted to $G_K$, $K = \mathbb{Q}\bigl(\sqrt{(-1)^{(\ell-1)/2}\,\ell}\bigr)$, is absolutely irreducible;
(ii) $\bar\rho$ is semistable (follows from theory of Tate curves);
(iii) $\bar\rho$ is modular (follows from Langlands-Tunnell).
Let $\Sigma$ denote the set of primes that divide $abc$. Then $\rho_3 : G_{\mathbb{Q}} \to GL_2(\mathbb{Z}_3)$ is a lift of $\bar\rho$ of type $\Sigma$. We showed in chapter 11 that $R_\Sigma \to \mathbb{Z}_3$ producing $\rho_3$ actually factors through $T_\Sigma$, whence $\rho_3$ is modular, i.e. there exists a cuspidal eigenform $f$ such that $\mathrm{tr}\,\rho_3(\mathrm{Fr}_p) = a_p(f)$ for all but finitely many primes $p$.
Next consider $\rho_n : G_{\mathbb{Q}} \to GL_2(\mathbb{Z}_n)$, where $n$ is the Fermat exponent. Since $\mathrm{tr}\,\rho_n(\mathrm{Fr}_p) = \mathrm{tr}\,\rho_3(\mathrm{Fr}_p) = a_p(f)$ for all but finitely many primes $p$, it follows that $\rho_n$ is modular (note Faltings' theorem is not needed) and so $\bar\rho_n$ is modular too.
We showed, however, that $\bar\rho_n$ is good at all primes $\neq 2$ and semistable at 2, whence $N(\bar\rho_n) = 2$. By Ribet, $\bar\rho_n$ is then associated to a cuspidal eigenform of level 2 and weight 2 and trivial Nebentypus, but there are none such. This contradiction concludes the proof except for the matter of dealing with the possibility of $\bar\rho$ being reducible.
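The absence of such forms can be checked by hand, using the standard fact that the space of weight 2 cusp forms on $\Gamma_0(N)$ has dimension equal to the genus of $X_0(N)$. The worked computation below uses the usual genus formula; the symbols $\mu$, $\nu_2$, $\nu_3$, $\nu_\infty$ (index, numbers of elliptic points of orders 2 and 3, number of cusps) are notation assumed for this example.

```latex
g(X_0(N)) = 1 + \frac{\mu}{12} - \frac{\nu_2}{4} - \frac{\nu_3}{3} - \frac{\nu_\infty}{2}.
% For N = 2:  \mu = 3,\ \nu_2 = 1,\ \nu_3 = 0,\ \nu_\infty = 2, so
g(X_0(2)) = 1 + \frac{3}{12} - \frac{1}{4} - \frac{0}{3} - \frac{2}{2} = 0,
\qquad\text{hence}\qquad \dim S_2(\Gamma_0(2)) = 0.
```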
This possibility also troubled Wiles. Unable to handle it, he had the rest of his work typed up in the spring of 1993 (with the amusing typo "Fermat's lost theorem"), when he spotted how an idea of Mazur would fix the problem. This is called the 3-5 switch. If Wiles had not spotted this, then Elkies' alternative approach [?] of twisting the curves would have finished off FLT.
Theorem 11.4 Suppose $\bar\rho_3 : G_{\mathbb{Q}} \to GL_2(\mathbb{F}_3)$ is reducible. Then $\bar\rho_5 : G_{\mathbb{Q}} \to GL_2(\mathbb{F}_5)$ is irreducible.
Proof: Suppose both are reducible. Then $E(\bar{\mathbb{Q}})$ has a $G_{\mathbb{Q}}$-stable subgroup of order 15. Under the 1-1 correspondence between $Y_0(N)(K)$ and isomorphism classes of elliptic curves over a field $K$ together with a cyclic subgroup of order $N$ defined over $K$, this elliptic curve corresponds to a point of $Y_0(15)(\mathbb{Q})$. Now, $X_0(15)$ has genus 1 and so is an elliptic curve (in fact 15A in Cremona's tables). Its rational points form a group of order 8. Of these 4 are cusps so not in $Y_0(15)(\mathbb{Q})$, whereas the other 4 correspond to elliptic curves of conductor 50, which are therefore not semistable. QED
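Two of the numerical facts used in this proof, that $X_0(15)$ has genus 1 and that it has 4 cusps, can be recomputed from the standard genus formula for $X_0(N)$. The sketch below is one way to do so: it counts elliptic points as roots of $x^2 + 1$ and $x^2 + x + 1$ modulo $N$, and cusps as $\sum_{d \mid N} \varphi(\gcd(d, N/d))$. All function names are assumptions made for the example, and it does not verify the count of 8 rational points or the conductor 50.

```python
from fractions import Fraction
from math import gcd


def prime_divisors(n):
    """Distinct prime divisors of n, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def euler_phi(n):
    """Euler's phi function by direct counting."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def index_gamma0(n):
    """Index of Gamma_0(n) in SL_2(Z): n times the product of (1 + 1/p)."""
    mu = n
    for p in prime_divisors(n):
        mu = mu // p * (p + 1)
    return mu


def count_roots(poly, n):
    """Number of residues x mod n with poly(x) = 0 mod n."""
    return sum(1 for x in range(n) if poly(x) % n == 0)


def cusp_count(n):
    """Number of cusps of X_0(n)."""
    return sum(euler_phi(gcd(d, n // d)) for d in range(1, n + 1) if n % d == 0)


def genus_x0(n):
    """Genus of X_0(n) from the standard formula."""
    nu2 = count_roots(lambda x: x * x + 1, n)      # elliptic points of order 2
    nu3 = count_roots(lambda x: x * x + x + 1, n)  # elliptic points of order 3
    g = (1 + Fraction(index_gamma0(n), 12) - Fraction(nu2, 4)
         - Fraction(nu3, 3) - Fraction(cusp_count(n), 2))
    return int(g)
```

Calling `genus_x0(15)` and `cusp_count(15)` reproduces the values 1 and 4 quoted in the proof.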
A similar calculation to the first lemma above shows that $\bar\rho_5$ restricted to $G_{\mathbb{Q}(\sqrt{5})}$ is absolutely irreducible. $\bar\rho_5$ is also semistable since $E$ is. The one thing missing is that we do not know that it is modular. Since $GL_2(\mathbb{F}_5)$ is nonsolvable, the Langlands-Tunnell approach fails and we take an alternative route.
Theorem 11.5 (3-5 switch) There exists a semistable elliptic curve $A$ over $\mathbb{Q}$ such that $A[5] \cong E[5]$ as $G_{\mathbb{Q}}$-modules and $A[3]$ is irreducible as a $G_{\mathbb{Q}}$-module.
Once this is proven, then by the theorem proven so far, $A$ is modular, so its $\bar\rho_5$ is modular. But this is the same $\bar\rho_5$ as for $E$. Thus the missing piece is filled in.
Proof: The idea is to consider the collection of all elliptic curves $A$ such that $A[5] \cong E[5]$ as $G_{\mathbb{Q}}$-modules by an isomorphism that makes the diagram of Weil pairings commute:
$$\begin{array}{ccc}
A[5] \times A[5] & \searrow & \\
\big\downarrow & & \mu_5 \\
E[5] \times E[5] & \nearrow &
\end{array}$$
and note that this is in 1-1 correspondence with a curve $Y'$ which is a twist of $Y(5)$, so of genus 0. Then we use Hilbert's irreducibility theorem. QED
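The genus 0 claim can be checked from the standard genus formula for the full-level curve $X(N)$ with $N \geq 3$. In the worked computation below, $\mu_N$ denotes the index of the image of $\Gamma(N)$ in $PSL_2(\mathbb{Z})$, a symbol assumed for this example; a twist has the same genus because the genus does not change under extension of the base field.

```latex
\mu_N = \frac{N^3}{2} \prod_{p \mid N} \left(1 - \frac{1}{p^2}\right),
\qquad
g(X(N)) = 1 + \frac{\mu_N\,(N - 6)}{12\,N}.
% For N = 5:
\mu_5 = \frac{125}{2} \cdot \frac{24}{25} = 60,
\qquad
g(X(5)) = 1 + \frac{60 \cdot (5 - 6)}{12 \cdot 5} = 1 - 1 = 0.
```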
References
[1] N. Boston. A Taylor-made plug for Wiles' proof. College Mathematics Journal, 26:100–105, 1995.
[2] N. Bourbaki. Commutative algebra. Chapters 1–7. Springer-Verlag, Berlin, 1998.
[3] J. Buhler, R. Crandall, R. Ernvall, and T. Metsänkylä. Irregular primes and cyclotomic invariants to four million. Math. Comp., 61, no. 203:151–153, 1993.
[4] J. Cassels and A. Fröhlich. Algebraic number theory. Proceedings of the instructional conference held at the University of Sussex, Brighton, September 1–17, 1965. Harcourt Brace Jovanovich, 1986.
[5] G. Cornell, J. Silverman, and G. Stevens. Modular forms and Fermat's last theorem. Springer-Verlag, 1997.
[6] H. Darmon, F. Diamond, and R. Taylor. Fermat's last theorem. Elliptic curves, modular forms and Fermat's last theorem (Hong Kong, 1993):2–140, 1997.
[7] P. Deligne. Formes modulaires et représentations ℓ-adiques. Séminaire Bourbaki, pages 139–172, 1971.
[8] P. Deligne and J.-P. Serre. Formes modulaires de poids 1. Annales de l