
Chapter 1.6, Page 37
Problem 2:
(a) Prove that x is in the Cantor set iff x has a ternary expansion that
uses only 0s and 2s.
(b) The Cantor-Lebesgue function is defined on the Cantor set by writing
x's ternary expansion in 0s and 2s, switching 2s to 1s, and
re-interpreting as a binary expansion. Show that this is well-defined
and continuous, F(0) = 0, and F(1) = 1.
(c) Prove that F is surjective onto [0, 1].
(d) Show how to extend F to a continuous function on [0, 1].
Solution.
(a) The nth iteration of the Cantor set removes the open segment(s)
consisting of all numbers that must have a 1 in the nth place of the
ternary expansion. Thus, the numbers remaining after n iterations will
have an expansion with only 0s and 2s in the first n places. So the
numbers remaining at the end are precisely those with an expansion using
only 0s and 2s in all places. (Note: Some numbers have a non-unique
ternary representation, namely those that have a representation that
terminates. For these, we use whichever representation avoids the digit
1, if there is one; if it consists of all 0s and 2s, the number is in
the Cantor set. This works because we remove an open interval each
time, and numbers with terminating representations are the endpoints
of one of the intervals removed.)
(b) First, we show that this is well-defined. The only possible problem is
that some numbers have more than one ternary representation. However,
such numbers can have only one representation that consists of
all 0s and 2s. This is because the only problems arise when one
representation terminates and another doesn't. Now if a representation
terminates, it must end in a 2 if it contains all 0s and 2s. But then
the other representation ends with 12222... and therefore contains a 1.
Next we show F is continuous on the Cantor set; given $\epsilon > 0$, choose
k such that $\frac{1}{2^k} < \epsilon$. Then if we let $\delta = \frac{1}{3^k}$,
any numbers within $\delta$ will agree in their first k places, which means
that the first k places of their images will also agree, so that their
images are within $\frac{1}{2^k} < \epsilon$ of each other.
The equalities F(0) = 0 and F(1) = 1 are obvious; for the latter,
$1 = 0.2222\ldots$ in ternary, so $F(1) = 0.1111\ldots = 1$ in binary.
(c) Let $x \in [0, 1]$. Choose any binary expansion of x, replace the 1s with
2s, and re-interpret as a ternary expansion. By part (a), this will
produce a member of the Cantor set whose image is x. (Note: There
may be more than one preimage of x, e.g. $F(\frac{1}{3}) = F(\frac{2}{3}) = \frac{1}{2}$.)
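(d) First, note that F is nondecreasing on the Cantor set C. Now let
$$G(x) = \sup\{F(y) : y \le x,\ y \in C\}.$$
Note that G(x) = F(x) for $x \in C$ because F is nondecreasing. G is
continuous at points not in C, because $C^c$ is open, so if $z \in C^c$, there
is a neighborhood of z on which G is constant. To show that G is
continuous on C, let $x \in C$, let $\epsilon > 0$, and use the continuity of F
(part b) to choose $\delta > 0$ such that $|G(x) - G(z)| < \epsilon$ for $z \in C$,
$|x - z| < \delta$. Choose $z_1 \in C \cap (x - \delta, x)$ and $z_2 \in C \cap (x, x + \delta)$
(if C has no points in one of these intervals, then G is constant on that
side of x near x and there is nothing to prove there), and let
$\delta' < \min(x - z_1, z_2 - x)$. Then for $|y - x| < \delta'$, if $y \in C$ we
automatically have $|F(y) - F(x)| < \epsilon$. If $y \notin C$ but y < x,
$G(x) \ge G(y) \ge G(z_1) > G(x) - \epsilon$; similarly, if $y \notin C$
but y > x, $G(x) \le G(y) \le G(z_2) < G(x) + \epsilon$.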
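
The digit-switching rule in parts (b) and (c) can be checked on finite expansions. The following is only a sketch under stated assumptions: the function names are hypothetical, and a finite digit list stands in for an expansion that is zero from some point on.

```python
from fractions import Fraction

def ternary_value(ternary_digits):
    """Value of 0.d1 d2 ... dn read in base 3."""
    return sum(Fraction(d, 3 ** (i + 1)) for i, d in enumerate(ternary_digits))

def cantor_function_from_digits(ternary_digits):
    """F(x) for x given by a finite ternary expansion using only 0s and 2s."""
    if any(d not in (0, 2) for d in ternary_digits):
        raise ValueError("expected an expansion using only the digits 0 and 2")
    # Switch each 2 to a 1 and read the result as a binary expansion.
    return sum(Fraction(d // 2, 2 ** (i + 1)) for i, d in enumerate(ternary_digits))
```

For example, the digit list [2] represents 2/3 and is sent to 1/2, while [0, 2, 2, ..., 2] approximates 1/3 = 0.0222... and is sent to values approaching 1/2, in line with the note in part (c).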

Problem 3: Suppose that instead of removing the middle third of the segment
at each step, we remove the middle $\xi$, where $0 < \xi < 1$.
(a) Prove that the complement of $C_\xi$ is the union of open intervals with
total length 1.
(b) Prove directly that $m_*(C_\xi) = 0$.
Solution.
(a) At the nth step (starting at n = 0), we remove $2^n$ segments, each of
length $\xi \left(\frac{1-\xi}{2}\right)^n$. The total length of these segments is
$$\sum_{n=0}^{\infty} 2^n \xi \left(\frac{1-\xi}{2}\right)^n = \xi \sum_{n=0}^{\infty} (1-\xi)^n = \xi \cdot \frac{1}{1 - (1-\xi)} = 1.$$
(b) If $C_n$ is the set remaining after n iterations, then $C_n$ is a union of $2^n$
segments of length $\left(\frac{1-\xi}{2}\right)^n$. So
$$m(C_n) = (1-\xi)^n.$$
Note that $m(C_n) \to 0$. Since each $C_n$ is a covering of $C_\xi$ by almost
disjoint cubes, the infimum of the measures of such coverings is 0.
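
As a sanity check on the geometric series in part (a), the partial sums can be computed exactly. This is a small sketch, with the function name and the use of exact fractions being my own choices; it should reproduce $1 - (1-\xi)^N$ after N steps.

```python
from fractions import Fraction

def removed_length(xi, steps):
    """Total length removed in steps n = 0, ..., steps - 1 when the middle
    fraction xi of every remaining segment is removed at each step."""
    total = Fraction(0)
    segment = Fraction(1)                # length of one remaining segment
    for n in range(steps):
        total += 2 ** n * xi * segment   # 2^n segments, each loses xi * segment
        segment = segment * (1 - xi) / 2 # each segment splits into two pieces
    return total
```

With xi = 1/3 and three steps this gives 19/27 = 1 - (2/3)^3, matching $m(C_n) = (1-\xi)^n$ from part (b).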

Problem 4: Construct a closed set $\hat{C}$ so that at the kth stage of the
construction one removes $2^{k-1}$ centrally situated open intervals each of
length $\ell_k$, with
$$\ell_1 + 2\ell_2 + \cdots + 2^{k-1}\ell_k < 1.$$
(a) If the $\ell_j$ are chosen small enough, then $\sum_{k=1}^{\infty} 2^{k-1}\ell_k < 1$. In this case,
show that $m(\hat{C}) > 0$, and in fact,
$$m(\hat{C}) = 1 - \sum_{k=1}^{\infty} 2^{k-1}\ell_k.$$
(b) Show that if $x \in \hat{C}$, then there exists a sequence $x_n$ such that $x_n \notin \hat{C}$,
yet $x_n \to x$ and $x_n \in I_n$, where $I_n$ is a sub-interval in the complement
of $\hat{C}$ with $|I_n| \to 0$.
(c) Prove as a consequence that $\hat{C}$ is perfect, and contains no open interval.
(d) Show also that $\hat{C}$ is uncountable.
Solution.
(a) Let $C_k$ denote the set remaining after k iterations of this process, with
$C_0$ being the unit segment. Then
$$m([0, 1] \setminus C_k) = \sum_{j=1}^{k} 2^{j-1}\ell_j$$
since $[0, 1] \setminus C_k$ is a union of disjoint segments with this total length.
Then
$$m(C_k) = 1 - \sum_{j=1}^{k} 2^{j-1}\ell_j.$$
Now $C_k \searrow \hat{C}$, so by Corollary 3.3,
$$m(\hat{C}) = \lim_{k \to \infty} m(C_k) = 1 - \sum_{k=1}^{\infty} 2^{k-1}\ell_k.$$
(b) For k = 0, 1, 2, ..., let $J_k$ be the interval of $C_k$ which contains x. Let
$I_n$ be the interval in $\hat{C}^c$ which is concentric with $J_{n-1}$. (Thus, at the nth
step of the iteration, the interval $I_n$ is used to bisect the interval $J_{n-1}$.)
Let $x_n$ be the center of $I_n$. Then $x_n \in \hat{C}^c$. Moreover, $|x_n - x| \le |J_{n-1}|$
since $J_{n-1}$ contains both $x_n$ and x. Since the maximum length of the
intervals in $C_n$ tends to 0, this implies $x_n \to x$. Finally, $x_n \in I_n \subset \hat{C}^c$,
and $I_n \subset J_{n-1} \Rightarrow |I_n| \to 0$.
(c) Clearly $\hat{C}$ is closed since it is the intersection of the closed sets $C_n$. To
prove it contains no isolated points, we use the same construction from
the previous part. Let $x \in \hat{C}$. This time, let $x_n$ be an endpoint of $I_n$,
rather than the center. (We can actually take either endpoint, but for
specificity, we'll take the one nearer to x.) Because $I_n$ is constructed
as an open interval, its endpoints lie in $C_n$. Moreover, successive
iterations will not delete these endpoints because the kth iteration
only deletes points from the interior of $C_{k-1}$. So $x_n \in \hat{C}$. We also
have $|x_n - x| \le |J_n|$ as before, so that $x_n \to x$, and $x_n \ne x$ for all
but at most one n since the intervals $I_n$ have no endpoints in common.
Hence x is not an isolated point. This proves that $\hat{C}$ is perfect.
It contains no open interval because, by part (b), every point of $\hat{C}$
is a limit of points of $\hat{C}^c$.
(d) We will construct an injection from the set of infinite 0-1 sequences
into $\hat{C}$. To do this, we number the sub-intervals of $C_k$ in order from
left to right. For example, $C_2$ contains four intervals, which we denote
$I_{00}$, $I_{01}$, $I_{10}$, and $I_{11}$. Now, given a sequence $a = a_1, a_2, \ldots$ of 0s and
1s, let $I^a_n$ denote the interval in $C_n$ whose subscript matches the first
n terms of a. (For instance, if a = 0, 1, 0, 0, ... then $I^a_4 = I_{0100} \subset C_4$.)
Finally, let
$$x \in \bigcap_{n=1}^{\infty} I^a_n.$$
This intersection is nonempty because $I^a_{n+1} \subset I^a_n$, and the intersection
of nested closed intervals is nonempty. On the other hand, it contains
only one point, since the length of the intervals tends to 0. Thus, we
have constructed a unique point in $\hat{C}$ corresponding to the sequence a.
Distinct sequences give distinct points, because two sequences that first
differ in the nth term select disjoint intervals of $C_n$.
Since there is an injection from the uncountable set of 0-1 sequences
into $\hat{C}$, $\hat{C}$ is also uncountable.
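
To make the formula in part (a) concrete, here is a short sketch that evaluates $m(C_k)$ for a given list of lengths. The choice $\ell_k = 4^{-k}$ is only an illustrative assumption (it is not fixed by the problem); for it the limit is 1/2.

```python
from fractions import Fraction

def remaining_measure(lengths):
    """m(C_k) = 1 - sum_{j=1}^k 2^(j-1) * l_j for lengths = [l_1, ..., l_k]."""
    # enumerate starts at 0, so the index plays the role of j - 1.
    return 1 - sum(2 ** j * l for j, l in enumerate(lengths))

# Illustrative assumption: l_k = 4^(-k), so that sum 2^(k-1) l_k = 1/2.
lengths = [Fraction(1, 4 ** k) for k in range(1, 11)]
measure_after_10_steps = remaining_measure(lengths)
```

Here the partial measures are 1/2 + 2^{-(k+1)}, which decrease to 1/2 > 0 as k grows.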

Problem 5: Suppose E is a given set, and $O_n$ is the open set
$$O_n = \{x : d(x, E) < \tfrac{1}{n}\}.$$
Show that
(a) If E is compact, then $m(E) = \lim_{n \to \infty} m(O_n)$.
(b) However, the conclusion in (a) may be false for E closed and
unbounded, or for E open and bounded.
Proof. (a) First note that for any set E,
$$\bar{E} = \bigcap_{n=1}^{\infty} O_n$$
since the closure of E consists of precisely those points whose distance
to E is 0. Now if E is compact, it equals its closure, so
$$E = \bigcap_{n=1}^{\infty} O_n.$$
Note also that $O_{n+1} \subset O_n$, so that $O_n \searrow E$. Now since E is bounded,
it is a subset of the ball $B_N(0)$ for some N. Then $O_1 \subset B_{N+1}(0)$,
so that $m(O_1) < \infty$. Thus, by part (ii) of Corollary 3.3,
$$m(E) = \lim_{n \to \infty} m(O_n).$$
(b) Suppose $E = \mathbb{Z} \subset \mathbb{R}$. Then $m(O_n) = \infty$ for all n, since $O_n$ is a
collection of infinitely many intervals of length $\frac{2}{n}$. However, $m(\mathbb{Z}) = 0$.
This shows that a closed unbounded set may not work. To construct a
bounded open counterexample, we need an open set whose boundary
has positive measure. To accomplish this, we use one of the Cantor-like
sets $\hat{C}$ from Problem 4, with the $\ell_j$ chosen such that $m(\hat{C}) > 0$.
Let $E = [0, 1] \setminus \hat{C}$. Then E is clearly open and bounded. The boundary
of E is precisely $\hat{C}$, since $\hat{C}$ contains no interval and hence has empty
interior. (This shows that the boundary of E contains $\hat{C}$; conversely,
it cannot contain any points of E because E is open, so it is exactly
equal to $\hat{C}$.) Hence $\bar{E} = E \cup \partial E = [0, 1]$. Now
$$O_n = \left\{x \in \mathbb{R} : d(x, E) < \tfrac{1}{n}\right\} = \left\{x \in \mathbb{R} : d(x, \bar{E}) < \tfrac{1}{n}\right\} = \left(-\tfrac{1}{n},\ 1 + \tfrac{1}{n}\right).$$
Clearly $m(O_n) \to 1$, but $m(E) = 1 - m(\hat{C}) < 1$.

Problem 6: Using translations and dilations, prove the following: Let B be
a ball in $\mathbb{R}^d$ of radius r. Then $m(B) = v_d r^d$, where $v_d = m(B_1)$ and $B_1$ is
the unit ball $\{x \in \mathbb{R}^d : |x| < 1\}$.
Solution. Let $\epsilon > 0$. Choose a covering $\{Q_j\}$ of $B_1$ by cubes with total
volume less than $m(B_1) + \frac{\epsilon}{r^d}$; such a covering must exist because $m(B_1)$ is
the infimum of the volumes of such cubical coverings. When we apply the
homothety $x \mapsto rx$ to $\mathbb{R}^d$, each $Q_j$ is mapped to a cube $Q'_j$ whose side
length is r times the side length of $Q_j$. Now $\{Q'_j\}$ is a cubical covering of
$B_r$ with total volume less than $r^d m(B_1) + \epsilon$. This is true for any $\epsilon > 0$, so we
must have $m(B_r) \le r^d m(B_1)$. Conversely, if $\{R_j\}$ is a cubical covering of
$B_r$ whose total volume is less than $m(B_r) + \epsilon$, we can apply the homothety
$x \mapsto \frac{1}{r}x$ to get a cubical covering $\{R'_j\}$ of $B_1$ with total volume less than
$\frac{1}{r^d}(m(B_r) + \epsilon)$. This shows that $m(B_1) \le \frac{1}{r^d}(m(B_r) + \epsilon)$ for every
$\epsilon > 0$, hence $r^d m(B_1) \le m(B_r)$. Together, these
inequalities show that $m(B_r) = r^d m(B_1)$. By translation invariance, the
same holds for a ball of radius r centered at any point.
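Problem 7: If $\delta = (\delta_1, \ldots, \delta_d)$ is a d-tuple of positive numbers with $\delta_i > 0$,
and $E \subset \mathbb{R}^d$, we define $\delta E$ by
$$\delta E = \{(\delta_1 x_1, \ldots, \delta_d x_d) : (x_1, \ldots, x_d) \in E\}.$$
Prove that $\delta E$ is measurable whenever E is measurable, and
$$m(\delta E) = \delta_1 \cdots \delta_d\, m(E).$$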
Solution. First we note that for an open set U, $\delta U$ is also open. We could
see this from the fact that $x \mapsto \delta x$ is an invertible linear transformation, and
therefore a homeomorphism. More directly, if $p \in U$, let $B_r(p)$ be a
neighborhood of p which is contained in U; then if we define
$\bar{\delta} = \min(\delta_1, \ldots, \delta_d)$, we will have $B_{\bar{\delta} r}(\delta p) \subset \delta U$.
Next, we note that for any set S, $m_*(\delta S) = \delta_1 \cdots \delta_d\, m_*(S)$. The proof of
this is almost exactly the same as Problem 6: the dilation $x \mapsto \delta x$ and
its inverse map rectangular coverings of S to rectangular coverings of $\delta S$
and vice versa; but since the exterior measure of a rectangle is just its volume
(Page 12, Example 4), the infimum of the volume of rectangular coverings
is the same as the infimum over cubical coverings. Hence a rectangular
covering within $\epsilon$ of the infimum for one set is mapped to a rectangular
covering within $\frac{\epsilon}{\delta_1 \cdots \delta_d}$ for the other.
As a more detailed version of the preceding argument, suppose $\{Q_j\}$ is a
cubical covering of S with $\sum |Q_j| < m_*(S) + \epsilon$. Then $\{\delta Q_j\}$ is a rectangular
covering of $\delta S$ with $\sum |\delta Q_j| < \delta_1 \cdots \delta_d\, m_*(S) + \delta_1 \cdots \delta_d\, \epsilon$. Now for each
rectangle $\delta Q_j$ we can find a cubical covering $\{Q'_{jk}\}$ with $\sum_k |Q'_{jk}| < |\delta Q_j| + \frac{\epsilon}{2^j}$.
Then $\bigcup_{j,k} Q'_{jk}$ is a cubical covering of $\delta S$ with
$\sum_{j,k} |Q'_{jk}| < \delta_1 \cdots \delta_d\, m_*(S) + (1 + \delta_1 \cdots \delta_d)\epsilon$.
This implies that $m_*(\delta S) \le \delta_1 \cdots \delta_d\, m_*(S)$. To get the
reverse inequality we note that another $\delta$-type transformation goes the other
direction, i.e. $S = \delta'(\delta S)$ where $\delta' = (1/\delta_1, \ldots, 1/\delta_d)$.
Now let $U \supset E$ be an open set with $m_*(U \setminus E) < \frac{\epsilon}{\delta_1 \cdots \delta_d}$. Then $\delta U \supset \delta E$ is
an open set. Moreover, $\delta(U \setminus E) = \delta U \setminus \delta E$, so
$m_*(\delta U \setminus \delta E) = \delta_1 \cdots \delta_d\, m_*(U \setminus E) < \epsilon$.
Hence $\delta E$ is also measurable. (Alternatively, we could prove that
$\delta E$ is measurable by appealing to Problem 8.)
Problem 8: Suppose L is a linear transformation of $\mathbb{R}^d$. Show that if E is a
measurable subset of $\mathbb{R}^d$, then so is L(E), by proceeding as follows:
(a) Note that if E is compact, so is L(E). Hence if E is an $F_\sigma$ set, so is
L(E).
(b) Because L automatically satisfies the inequality
$$|L(x) - L(x')| \le M|x - x'|$$
for some M, we can see that L maps any cube of side length $\ell$ into a
cube of side length $c_d M \ell$, with $c_d = 2\sqrt{d}$. Now if m(E) = 0, there is
a collection of cubes $\{Q_j\}$ such that $E \subset \bigcup_j Q_j$, and $\sum_j m(Q_j) < \epsilon$.
Thus $m_*(L(E)) \le c'\epsilon$, and hence m(L(E)) = 0. Finally, use Corollary 3.5.
(Problem 4 of the next chapter shows that $m(L(E)) = |\det L|\, m(E)$.)
Solution.
(a) Since linear transformations on finite-dimensional spaces are always
continuous, they map compact sets to compact sets. Hence, if E is
compact, so is L(E). Moreover, because $\mathbb{R}^d$ is $\sigma$-compact, any closed
set is the countable union of compact sets. So if
$$E = \bigcup_{n=1}^{\infty} F_n$$
where $F_n$ is closed, then for each n we have
$$F_n = \bigcup_{j=1}^{\infty} K_{nj}$$
where $K_{nj}$ is compact; then
$$E = \bigcup_{j,n} K_{nj}$$
is a countable union of compact sets. Then
$$L(E) = \bigcup_{j,n} L(K_{nj})$$
is too, since $L(K_{nj})$ is compact. But compact sets are closed, so this
shows that L(E) is $F_\sigma$.
(b) Let x be a corner of a cube Q of side length $\ell$. Then every point
x' in the cube is a distance of at most $\sqrt{d}\,\ell$ away from x, since this
is the distance to the diagonally opposite corner. Now
$|x - x'| \le \sqrt{d}\,\ell \Rightarrow |L(x) - L(x')| \le \sqrt{d}\,M\ell$. Now if Q' is the cube of side length
$2\sqrt{d}\,M\ell$ centered at L(x), the points on the exterior of the cube are all
more than $\sqrt{d}\,M\ell$ away from L(x). Hence $L(Q) \subset Q'$, and the volume of Q' is
$(2\sqrt{d}\,M)^d$ times the volume of Q. Since a set of measure 0 has
a cubical covering with volume less than $\epsilon$, its image under L has a
cubical covering with volume less than $(2\sqrt{d}\,M)^d \epsilon$. This implies that L
maps sets of measure 0 to sets of measure 0.
Finally, let E be any measurable set. By Corollary 3.5, $E = C \cup N$
where C is an $F_\sigma$ set and N has measure 0. We have just shown
that L(C) is also $F_\sigma$ and L(N) also has measure 0. Hence
$L(E) = L(C) \cup L(N)$ is measurable.

Problem 9: Give an example of an open set O with the following property:
the boundary of the closure of O has positive Lebesgue measure.
Solution. We will use one of the Cantor-like sets from Problem 4; let $\hat{C}$
be such a set with $m(\hat{C}) > 0$. We will construct an open set whose
closure has boundary $\hat{C}$. Let us number the intervals involved in the Cantor
iteration as follows: If $C_n$ is the set remaining after n iterations (with
$C_0 = [0, 1]$), we number the $2^n$ intervals in $C_n$ in binary order, but with 2s
instead of 1s. For example, $C_2 = I_{00} \cup I_{02} \cup I_{20} \cup I_{22}$. The intervals in the
complement of $\hat{C}$, denoted by subscripted Js, are named according to the
intervals they bisected, by changing the last digit to a 1. For instance, in
$C_1$, the interval $J_1$ is taken away to create the two intervals $I_0$ and $I_2$. In
the next iteration, $I_0$ is bisected by $J_{01}$ to create $I_{00}$ and $I_{02}$, while $I_2$ is
bisected by $J_{21}$ to create $I_{20}$ and $I_{22}$, etc.
Having named the intervals, let $G = J_1 \cup J_{001} \cup J_{021} \cup J_{201} \cup J_{221} \cup \cdots$
be the union of the intervals in $\hat{C}^c$ which are removed during odd steps of
the iteration, and $G' = [0, 1] \setminus (G \cup \hat{C})$ be the union of the other intervals,
i.e. the ones removed during even steps of the iteration. I claim that the
closure of G is $G \cup \hat{C}$. Clearly this is a closed set (its complement in [0, 1]
is the open set G') containing G, so we need only show that every point in
$\hat{C}$ is a limit of points in G. To do this, we first note that with the intervals
numbered as above, an interval $I_{abc\ldots}$ whose subscript is k digits long has
length less than $\frac{1}{2^k}$. This is so because each iteration bisects all the existing
Is. In addition, an interval $J_{abc\ldots}$ with a k-digit subscript has length less
than $\frac{1}{2^{k-1}}$ because it is a subinterval of an I-interval with a (k - 1)-digit
subscript. Now let $x \in \hat{C}$. Then $x \in \bigcap_n C_n$, so for each n we can find an
interval $I^{(n)}$ containing x which has an n-digit subscript. Let $J^{(n)}$ be the
J-interval with an n-digit subscript, whose first n - 1 digits match those of
$I^{(n)}$. Then $I^{(n)}$ and $J^{(n)}$ are adjacent subintervals of the same interval of
$C_{n-1}$. Since they both have length at most $\frac{1}{2^{n-1}}$, the distance between a point
in one and a point in the other is at most $\frac{1}{2^{n-2}}$. Thus, if we let $y_n$ be a
sequence such that $y_n \in J^{(n)}$, then $y_n \to x$. Now let $y'_n$ be the subsequence
taken for odd n, so that $y'_n \in G$. Then we have constructed a sequence of
points in G which converge to $x \in \hat{C}$.
We have shown that $\bar{G} = G \cup \hat{C}$. It only remains to show that
$\partial(G \cup \hat{C}) = \hat{C}$. Clearly $\partial(G \cup \hat{C}) \subset \hat{C}$ since G is open and is
therefore contained in the interior of $G \cup \hat{C}$. Now let $x \in \hat{C}$. By the same
construction as above, we can choose a sequence $y_n \in J^{(n)}$ which converges
to x. If we now take the subsequence $y'_n$ over even n, then $y'_n \in G'$ and
$y'_n \to x$. This proves that $x \in \partial(G \cup \hat{C})$. Hence we have shown that G is an
open set whose closure has boundary $\hat{C}$, which has positive measure.
Problem 11: Let A be the subset of [0, 1] which consists of all numbers which
do not have the digit 4 appearing in their decimal expansion. Find m(A).
Proof. A has measure 0, for the same reason as the Cantor set. We can
construct A as an intersection of Cantor-like iterates. The first iterate
is the unit interval; the second has a subinterval of length 1/10 deleted,
with segments of lengths 4/10 and 5/10 remaining. (The deleted interval
corresponds to all numbers with a 4 in the first decimal place.) The next
has 9 subintervals of length 1/100 deleted, corresponding to numbers with
a non-4 in the first decimal place and a 4 in the second. Continuing, we get
closed sets $C_n$ of length $(9/10)^n$, with $A = \bigcap_n C_n$. Clearly A is measurable
since each $C_n$ is; since $m(C_n) \to 0$, m(A) = 0.
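
The length $(9/10)^n$ can be confirmed by brute force for small n. The sketch below (function name and parameters are hypothetical) counts the n-digit blocks that avoid the digit 4; each such block corresponds to an interval of length $10^{-n}$.

```python
from fractions import Fraction
from itertools import product

def measure_without_digit(n, banned=4, base=10):
    """Total length of the level-n intervals whose first n digits avoid `banned`."""
    count = sum(1 for digits in product(range(base), repeat=n)
                if banned not in digits)
    return Fraction(count, base ** n)
```

For n = 1, 2, 3 this gives 9/10, 81/100 and 729/1000, which tend to 0 as the proof requires.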
Problem 13:
(a) Show that a closed set is $G_\delta$ and an open set $F_\sigma$.
(b) Give an example of an $F_\sigma$ which is not $G_\delta$.
(c) Give an example of a Borel set which is neither $G_\delta$ nor $F_\sigma$.
Proof.
(a) Let U be open. As is well known, U is the union of the open rational
balls that it contains. However, it is also the union of the closed
rational balls that it contains. To prove this, let $x \in U$ and r > 0 such
that $B_r(x) \subset U$. Choose a rational lattice point q with $|x - q| < \frac{r}{3}$,
and a rational d with $\frac{r}{3} < d < \frac{r}{2}$. Then $\bar{B}_d(q) \subset B_r(x) \subset U$ and
$x \in \bar{B}_d(q)$, so any $x \in U$ is contained in a closed rational ball within
U. Thus, U is a union of closed rational balls, of which there are only
countably many, so U is $F_\sigma$. For a closed set F, write the complement
$\mathbb{R}^d \setminus F$ as a union of closed rational balls $\bar{B}_n$; then $F = \bigcap_n \bar{B}_n^c$ is a
countable intersection of the open sets $\bar{B}_n^c$, so F is $G_\delta$.
(b) The rational numbers are $F_\sigma$ since they are countable and single points
are closed. However, the Baire category theorem implies that they
are not $G_\delta$. (Suppose they are, and let $U_n$ be open dense sets with
$\mathbb{Q} = \bigcap U_n$. Define $V_n = U_n \setminus \{r_n\}$, where $r_n$ is the nth rational in some
enumeration. Note that the $V_n$ are also open and dense, but their
intersection is the empty set, a contradiction.)
(c) Let $A = (\mathbb{Q} \cap (0, 1)) \cup ((\mathbb{R} \setminus \mathbb{Q}) \cap [2, 3])$ consist of the rationals in (0, 1)
together with the irrationals in [2, 3]. Suppose A is $F_\sigma$, say $A = \bigcup F_n$
where $F_n$ is closed. Then
$$(\mathbb{R} \setminus \mathbb{Q}) \cap [2, 3] = A \cap [2, 3] = \left(\bigcup F_n\right) \cap [2, 3] = \bigcup (F_n \cap [2, 3])$$
is also $F_\sigma$ since the intersection of the two closed sets $F_n$ and [2, 3] is
closed. But then
$$\mathbb{Q} \cap (2, 3) = \bigcap (F_n^c \cap (2, 3))$$
is $G_\delta$ because $F_n^c \cap (2, 3)$ is the intersection of two open sets, and
therefore open, for each n. But then if $\{r_n\}$ is an enumeration of the
rationals in (2, 3), $(F_n^c \cap (2, 3)) \setminus \{r_n\}$ is also open, and is dense in (2, 3).
Hence $\bigcap_n ((F_n^c \cap (2, 3)) \setminus \{r_n\})$ is dense in (2, 3) by the Baire Category
Theorem. But this set is empty, a contradiction. Hence A cannot be
$F_\sigma$.
Similarly, suppose A is $G_\delta$, say $A = \bigcap G_n$ where $G_n$ is open. Then
$$\mathbb{Q} \cap (0, 1) = A \cap (0, 1) = \left(\bigcap G_n\right) \cap (0, 1) = \bigcap (G_n \cap (0, 1))$$
is also $G_\delta$ since $G_n \cap (0, 1)$ is the intersection of two open sets and
therefore open. But then if $\{q_n\}$ is an enumeration of the rationals in
(0, 1), $(G_n \cap (0, 1)) \setminus \{q_n\}$ is open and is dense in (0, 1), so
$$\bigcap_n ((G_n \cap (0, 1)) \setminus \{q_n\})$$
must be dense in (0, 1). But this set is empty, a contradiction. Hence
A is not $G_\delta$.
Problem 16: Borel-Cantelli Lemma: Suppose $\{E_k\}$ is a countable family
of measurable subsets of $\mathbb{R}^d$ and that
$$\sum_{k=1}^{\infty} m(E_k) < \infty.$$
Let
$$E = \{x \in \mathbb{R}^d : x \in E_k \text{ for infinitely many } k\} = \limsup E_k.$$
Show that E is measurable.
Prove m(E) = 0.
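Solution.
Let
$$B_n = \bigcup_{k=n}^{\infty} E_k$$
be the set of x which are in some $E_k$ with $k \ge n$. Then x is in infinitely
many $E_k$ iff $x \in B_n$ for all n, so
$$E = \bigcap_{n=1}^{\infty} B_n = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} E_k.$$
This is a countable intersection of a countable union of measurable
sets, and hence is measurable.
Let $\epsilon > 0$. Since $\sum m(E_k)$ converges, $\exists N$ such that
$$\sum_{k=N}^{\infty} m(E_k) < \epsilon.$$
Then
$$m(B_N) = m\left(\bigcup_{k=N}^{\infty} E_k\right) \le \sum_{k=N}^{\infty} m(E_k) < \epsilon$$
by subadditivity. But $m(E) \le m(B_N)$ by monotonicity, so $m(E) < \epsilon$
for all $\epsilon$. Hence m(E) = 0.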

Problem 17: Let $\{f_n\}$ be a sequence of measurable functions on [0, 1] with
$|f_n(x)| < \infty$ for a.e. x. Show that there exists a sequence $c_n$ of positive
real numbers such that
$$\frac{f_n(x)}{c_n} \to 0 \quad \text{a.e. } x.$$
Solution. We are given that for each n,
$$m\left(\bigcap_{k=1}^{\infty} \left\{x : |f_n(x)| > \frac{k}{n}\right\}\right) = 0$$
since this set is precisely the set where $|f_n(x)| = \infty$. Since these sets are
nested and have finite measure, this implies
$$\lim_{k \to \infty} m\left(\left\{x : |f_n(x)| > \frac{k}{n}\right\}\right) = 0.$$
Hence, $\exists c_n$ such that
$$m\left(\left\{x : |f_n(x)| > \frac{c_n}{n}\right\}\right) < \frac{1}{2^n}.$$
Define
$$E_n = \left\{x : |f_n(x)| > \frac{c_n}{n}\right\}.$$
Then $m(E_n) < \frac{1}{2^n}$, so
$$m\left(\bigcap_{m=1}^{\infty} \bigcup_{j=m}^{\infty} E_j\right) = 0$$
by the Borel-Cantelli lemma. But the complement of this set consists of
precisely those points that are in finitely many $E_n$, i.e. those points for
which $\left|\frac{f_n(x)}{c_n}\right|$ is eventually at most $\frac{1}{n}$. Hence we have found a set of measure
0 such that $\frac{f_n(x)}{c_n} \to 0$ on the complement.
Problem 18: Prove the following assertion: Every measurable function is the
limit a.e. of a sequence of continuous functions.
Proof. Let $f : \mathbb{R} \to \mathbb{R}$ be measurable. (The problem didn't specify whether
f can have $\pm\infty$ as a value, but I'm assuming not.) Let $B_n = [-n, n]$. Then
by Lusin's Theorem, there exists a closed (hence compact) subset $E_n \subset B_n$
with $m(B_n \setminus E_n) < \frac{1}{2^n}$ and f continuous on $E_n$. Then by Tietze's Extension
Theorem, we can extend f to a continuous function $f_n$ on all of $\mathbb{R}$, where
$f_n = f$ on $E_n$. (Explicitly, such an extension could work as follows: Define
$f_n : \mathbb{R} \to \mathbb{R}$ by $f_n(x) = f(x)$ for $x \in E_n$; for $x \notin E_n$, since the complement
is open, x is in some maximal open interval $(a, b) \subset E_n^c$ with $a, b \in E_n$, or in
some unbounded open interval $(-\infty, a) \subset E_n^c$ or $(b, \infty) \subset E_n^c$. Let
$f_n(x) = f(a) + \frac{x - a}{b - a}(f(b) - f(a))$ in the first case, and $f_n(x) = f(a)$ or
$f_n(x) = f(b)$ respectively in the other two cases.)
I claim that $f_n \to f$ almost everywhere. Suppose x is a point at which
$f_n(x) \not\to f(x)$. Then $x \in B_n^c \cup (B_n \setminus E_n)$ for infinitely many n since otherwise
$f_n(x)$ is eventually equal to f(x). Now a given x can be in only finitely many
$B_n^c$, so it must be in infinitely many $B_n \setminus E_n$, i.e. $x \in \limsup (B_n \setminus E_n)$.
But $\limsup (B_n \setminus E_n)$ has measure 0 by the Borel-Cantelli Lemma. Hence
the set of x at which $f_n(x) \not\to f(x)$ is a subset of a set of measure 0, and
therefore has measure 0.
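
The explicit extension described in the parenthetical can be sketched for the simplest case, where the closed set is a finite sorted list of points. This is an illustration only: the function name is hypothetical, and a finite set stands in for the compact set $E_n$ from Lusin's Theorem.

```python
from bisect import bisect_left

def extend_linearly(points, values, x):
    """Extend f (given by values[i] = f(points[i]), points sorted increasing)
    to all real x: linear on each gap (a, b), constant on the unbounded ends."""
    if x <= points[0]:
        return values[0]               # the unbounded interval on the left
    if x >= points[-1]:
        return values[-1]              # the unbounded interval on the right
    i = bisect_left(points, x)         # first index with points[i] >= x
    if points[i] == x:
        return values[i]               # x lies in the closed set itself
    a, b = points[i - 1], points[i]
    fa, fb = values[i - 1], values[i]
    return fa + (x - a) / (b - a) * (fb - fa)
```

For instance, with points [0, 1, 3] and values [0, 2, -2], the extension takes the value 0 at x = 2, halfway between f(1) = 2 and f(3) = -2.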
Problem 20: Show that there exist closed sets A and B with m(A) = m(B) =
0, but m(A + B) > 0:
(a) In $\mathbb{R}$, let $A = \mathcal{C}$, $B = \mathcal{C}/2$. Note that $A + B \supset [0, 1]$.
(b) In $\mathbb{R}^2$, observe that if $A = I \times \{0\}$ and $B = \{0\} \times I$ (where I = [0, 1]),
then $A + B = I \times I$.
Solution.
(a) As noted, let $\mathcal{C}$ be the Cantor set, $A = \mathcal{C}$, and $B = \mathcal{C}/2$. Then A
consists of all numbers which have a ternary expansion using only 0s
and 2s, as shown on a previous homework set. This implies that B
consists of all numbers which have a ternary expansion using only 0s
and 1s. Now any number $x \in [0, 1]$ can be written as a + b where
$a \in A$ and $b \in B$ as follows: Pick any ternary expansion $0.x_1x_2\ldots$ for
x. Define
$$a_n = \begin{cases} 2 & (x_n = 2) \\ 0 & (\text{else}) \end{cases} \qquad b_n = \begin{cases} 1 & (x_n = 1) \\ 0 & (\text{else}) \end{cases}$$
Then $a = 0.a_1a_2\ldots \in A$ and $b = 0.b_1b_2\ldots \in B$, and a + b = x.
(b) Duly noted. (I feel like I should prove something, but I'm not sure
what there is to prove here.)
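
The digit split used in part (a) is easy to test on finite ternary expansions. The following sketch uses hypothetical helper names and exact fractions; it only illustrates the decomposition x = a + b, not the full statement about infinite expansions.

```python
from fractions import Fraction

def split_digits(ternary_digits):
    """Split the digits of x into a (only 0s and 2s) and b (only 0s and 1s)."""
    a = [2 if d == 2 else 0 for d in ternary_digits]
    b = [1 if d == 1 else 0 for d in ternary_digits]
    return a, b

def value(digits, base=3):
    """Value of 0.d1 d2 ... dn read in the given base."""
    return sum(Fraction(d, base ** (i + 1)) for i, d in enumerate(digits))
```

For x = 0.12012 in ternary, the split gives a = 0.02002 and b = 0.10010, and the values add up to x exactly.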

Problem 21: Prove that there is a continuous function that maps a Lebesgue
measurable set to a non-measurable set.
Proof. As shown in the last homework, there is a continuous function
$f : [0, 1] \to [0, 1]$ such that f(C) = [0, 1], where C is the Cantor set. Let
$V \subset [0, 1]$ be a Vitali set, which is non-measurable. Let $E = f^{-1}(V) \cap C$.
Then E is measurable since E is a subset of a set of measure 0. However,
f(E) = V which is not measurable.
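Problem 22: Let $\chi_{[0,1]}$ be the characteristic function of [0, 1]. Show that
there is no everywhere continuous function f on $\mathbb{R}$ such that
$$f(x) = \chi_{[0,1]}(x) \quad \text{almost everywhere}.$$
Proof. Suppose that such an f exists. Since f = 1 almost everywhere on
(0, 1), there are points $x_k \to 1$ with $f(x_k) = 1$, so f(1) = 1 by continuity.
By continuity again, $\exists \delta > 0$ such that $|x - 1| < \delta \Rightarrow |f(x) - 1| < \frac{1}{2}$. In particular,
$f(x) > \frac{1}{2} > 0$ for $x \in (1, 1 + \delta)$. Thus
$$\{x : f(x) \ne \chi_{[0,1]}(x)\} \supset (1, 1 + \delta) \Rightarrow m(\{x : f(x) \ne \chi_{[0,1]}(x)\}) > 0.$$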

Problem 23: Suppose f(x, y) is a function on $\mathbb{R}^2$ that is separately
continuous: for each fixed variable, f is continuous in the other variable. Prove
that f is measurable on $\mathbb{R}^2$.
Solution. For all $x \in \mathbb{R}$ and $n \in \mathbb{N}$, we define $D^n_x$ to be the largest nth-order
dyadic rational less than or equal to x, i.e. $D^n_x = \frac{k}{2^n}$ where $\frac{k}{2^n} \le x < \frac{k+1}{2^n}$.
Let $f_n(x, y) = f(D^n_x, y)$. I will show that $f_n$ is measurable and $f_n \to f$
everywhere.
First we show that $f_n$ is measurable.
$$\{(x, y) : f_n(x, y) > a\} = \bigcup_{k=-\infty}^{\infty} \left\{(x, y) : \frac{k}{2^n} \le x < \frac{k+1}{2^n},\ f\left(\frac{k}{2^n}, y\right) > a\right\} = \bigcup_{k=-\infty}^{\infty} \left[\frac{k}{2^n}, \frac{k+1}{2^n}\right) \times \left\{y : f\left(\frac{k}{2^n}, y\right) > a\right\}.$$
Now because f is continuous in y, $\{y : f(\frac{k}{2^n}, y) > a\}$ is open, so the product
$[\frac{k}{2^n}, \frac{k+1}{2^n}) \times \{y : f(\frac{k}{2^n}, y) > a\}$ is measurable. Hence $\{f_n > a\}$ is a countable
union of measurable sets, and thus measurable.
Next, we show that $f_n \to f$ everywhere. Let $\epsilon > 0$. For any x, y, because f
is continuous in x, there is some 1-dimensional neighborhood $(x - \delta, x + \delta)$ on
which $|f(x', y) - f(x, y)| < \epsilon$. Then for sufficiently large n,
$|D^n_x - x| < \delta \Rightarrow |f_n(x, y) - f(x, y)| = |f(D^n_x, y) - f(x, y)| < \epsilon$.
Hence $f_n(x, y)$ is within $\epsilon$ of
f(x, y) for sufficiently large n. Since f is the pointwise limit of measurable
functions $f_n$, it is measurable.
Problem 27: Suppose $E_1$ and $E_2$ are a pair of compact sets in $\mathbb{R}^d$ with
$E_1 \subset E_2$, and let $a = m(E_1)$ and $b = m(E_2)$. Prove that for any c with
a < c < b, there is a compact set E with $E_1 \subset E \subset E_2$ and m(E) = c.
Solution. Since $E_1$ is measurable, there is an open set $U \supset E_1$ with
$m(U \setminus E_1) < b - c$. Then $E_2 \cap U^c$ is compact (since it's the intersection of a
compact set and a closed set) and has measure at least $m(E_2) - m(U) >
b - (a + b - c) = c - a$. If we can find a compact subset $K \subset E_2 \cap U^c$ with
m(K) = c - a, then $K \cup E_1$ will be a compact subset of $E_2$ with measure
(c - a) + a = c (since K and $E_1$ are disjoint). Hence we have reduced the
problem to the following: Given a compact set $F \subset \mathbb{R}^d$ with $m(F) = \mu$,
and given $\xi$ with $0 < \xi < \mu$, find a compact subset $F' \subset F$ with $m(F') = \xi$.
This can be solved as follows: Let $f(y) = m(F \cap \bar{B}_y(0))$, where $\bar{B}_y(0)$ is the
closed ball of radius y. Then f(0) = 0 whereas $f(y) = \mu$ for sufficiently large y
(because F is bounded). Moreover, f is continuous: Given y and $\epsilon > 0$, the
continuity of $m(\bar{B}_y(0))$ in y allows us to find $\delta$ such that
$|y' - y| < \delta \Rightarrow |m(\bar{B}_{y'}(0)) - m(\bar{B}_y(0))| < \epsilon$. Then
$|f(y') - f(y)| = m(F \cap (\bar{B}_{y'}(0) \triangle \bar{B}_y(0))) < \epsilon$, because this is the measure
of a subset of the symmetric difference $\bar{B}_{y'}(0) \triangle \bar{B}_y(0)$ which has measure
$|m(\bar{B}_{y'}(0)) - m(\bar{B}_y(0))| < \epsilon$. Hence f is continuous, so by the Intermediate
Value Theorem there is a value $y_0$ such that $f(y_0) = \xi$. Then the compact
set $F \cap \bar{B}_{y_0}(0)$ has measure $\xi$ as desired.
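
The Intermediate Value Theorem step can be imitated numerically in one dimension. The sketch below is an illustration under assumptions of my own: F is a finite union of disjoint closed intervals in $\mathbb{R}$, the function names are hypothetical, and bisection replaces the abstract existence argument.

```python
def measure_in_ball(intervals, y):
    """m(F intersect [-y, y]) for F the union of the disjoint intervals (a, b)."""
    return sum(max(0.0, min(b, y) - max(a, -y)) for a, b in intervals)

def radius_with_measure(intervals, target, tol=1e-12):
    """Bisect on y until m(F intersect [-y, y]) is (approximately) target."""
    lo = 0.0
    hi = max(max(abs(a), abs(b)) for a, b in intervals)  # F lies in [-hi, hi]
    # f(lo) = 0 <= target <= f(hi) = m(F), and f is continuous and nondecreasing.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if measure_in_ball(intervals, mid) < target:
            lo = mid
        else:
            hi = mid
    return hi
```

For F = [-3, -2] together with [1, 2] and a target of 1.5, the function f(y) equals y - 1 on [1, 3], so the radius found is close to 2.5.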
Problem 28: Let $E \subset \mathbb{R}$ with $m_*(E) > 0$. Let $0 < \alpha < 1$. Then there exists
an interval $I \subset \mathbb{R}$ such that $m_*(I \cap E) \ge \alpha\, m(I)$.
Proof. For any $\epsilon$, we can find a cubical covering $\{Q_j\}$ of E with
$\sum |Q_j| < m_*(E) + \epsilon$. Then, by expanding each cube to an open cube of size
$\frac{\epsilon}{2^j}$ more, we can construct an open cubical covering $\{I_j\}$ with
$\sum |I_j| < m_*(E) + 2\epsilon$.
(We name these $I_j$ because 1-dimensional cubes are in fact intervals.) Then
$$E \subset \bigcup_j I_j \Rightarrow E = \bigcup_j (E \cap I_j) \Rightarrow m_*(E) \le \sum_j m_*(E \cap I_j).$$
Now we apply something like the Pigeonhole Principle: Suppose
$m_*(E \cap I_j) < \alpha\, m(I_j)$ for all j. Then
$$m_*(E) \le \sum_j m_*(E \cap I_j) < \alpha \sum_j m(I_j) < \alpha(m_*(E) + 2\epsilon).$$
But if $\epsilon$ is chosen small enough, this is a contradiction; explicitly, we can
choose $\epsilon < \frac{(1-\alpha)\, m_*(E)}{2\alpha}$. Thus there must be some interval $I_j$ for which
$m_*(E \cap I_j) \ge \alpha\, m(I_j)$.
Problem 29: Suppose E is a measurable subset of $\mathbb{R}$ with m(E) > 0. Prove
that the difference set of E
$$\{z \in \mathbb{R} : z = x - y \text{ for some } x, y \in E\}$$
contains an open interval centered at the origin.
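Solution. By Problem 28, there exists an interval I such that
$m(E \cap I) \ge \frac{3}{4} m(I)$. (We can replace $m_*$ by m here because E is measurable.) Let
d = m(I) and $0 < |\delta| < \frac{d}{4}$. Then the translated set $I + \delta$ intersects I in an
interval I' of length $d - |\delta| > \frac{3}{4}d$. Now $m(E \cap I) \ge \frac{3}{4}d$ so
$m((E + \delta) \cap (I + \delta)) \ge \frac{3}{4}d$ by the translation invariance of Lebesgue measure. Now
$$\tfrac{3}{4}d \le m(E \cap I) = m(E \cap I') + m(E \cap (I \setminus I')) \le m(E \cap I') + m(I \setminus I') = m(E \cap I') + |\delta|$$
so $m(E \cap I') \ge \frac{3d}{4} - |\delta| > \frac{d}{2}$. Similarly, $m((E + \delta) \cap I') > \frac{d}{2}$. Now if $E + \delta$ and
E were disjoint, this would imply $m(I') \ge m(E \cap I') + m((E + \delta) \cap I') > d$.
But $m(I') = d - |\delta|$, so E and $E + \delta$ must have nonempty intersection. Let
$x \in E \cap (E + \delta)$. Then $\exists e_1, e_2 \in E$ such that $e_1 = x = e_2 + \delta \Rightarrow e_1 - e_2 = \delta$.
Hence $\delta \in E - E$ for all $\delta \in (-\frac{d}{4}, \frac{d}{4})$.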
Problem 30: If E and F are measurable, and m(E) > 0, m(F) > 0, prove
that
$$E + F = \{x + y : x \in E,\ y \in F\}$$
contains an interval.
Solution. We follow the preceding proof almost exactly. By the lemma
(Problem 28), applied to E and to $-F = \{-y : y \in F\}$, there exist intervals
$I_1$ and $I_2$ such that $m(E \cap I_1) \ge \frac{3}{4} m(I_1)$ and $m((-F) \cap I_2) \ge \frac{3}{4} m(I_2)$.
WLOG assume $m(I_2) \le m(I_1)$. Then $\exists t_0$ such that $I_2 + t_0 \subset I_1$.
Let $d = m(I_2)$. Then for $|\delta| < \frac{d}{4}$, $I_2 + t_0 + \delta$ intersects $I_1$ in
an interval I' of length at least $d - |\delta| > \frac{3d}{4}$. By the same argument as
in Problem 29, this implies that E and $-F + t_0 + \delta$ must have nonempty
intersection; let x be a point of this intersection. Then $\exists e \in E, f \in F$ such
that $e = x = -f + t_0 + \delta \Rightarrow e + f = t_0 + \delta$. Hence E + F contains the
interval $(t_0 - \frac{d}{4}, t_0 + \frac{d}{4})$.
Problem 34: Given two Cantor sets $\mathcal{C}_1, \mathcal{C}_2 \subset [0, 1]$, there exists a continuous
bijection $f : [0, 1] \to [0, 1]$ such that $f(\mathcal{C}_1) = \mathcal{C}_2$.
Proof. Any Cantor set can be put in bijective correspondence with the set
of 0-1 sequences as follows: Given $x \in C$, where $C = C_1 \cap C_2 \cap \cdots$ is a
Cantor set, define $x_1 = 0$ if x is in the left of the two intervals in $C_1$ (call
this left interval $I_0$), and $x_1 = 1$ if x is in the right interval $I_1$. Then define
$x_2 = 0$ if x is in the left subinterval (either $I_{00}$ or $I_{10}$) in $C_2$, and $x_2 = 1$ if x
is in the right subinterval. Continuing in this fashion, we obtain a bijection
from C to the set of 0-1 sequences. Note that this bijection is increasing in
the sense that if y > x for $x, y \in C$, then $y_n > x_n$ at the first point n in the
sequence at which $x_n$ and $y_n$ differ.
Now we can create an increasing bijection f from $\mathcal{C}_1$ to $\mathcal{C}_2$ by mapping from
$\mathcal{C}_1$ to 0-1 sequences, and from there to $\mathcal{C}_2$. This function will be continuous
on $\mathcal{C}_1$ because if $x, y \in \mathcal{C}_1$ are close, their corresponding sequences $\{x_n\}$
and $\{y_n\}$ will agree in their first N terms; then f(x) and f(y) will agree in
their first N terms as well, which means they're in the same subinterval of
the Nth iterate of $\mathcal{C}_2$, which has length at most $\frac{1}{2^N}$. Hence f(x) and f(y)
can be made arbitrarily close if x and y are sufficiently close. Then since
$\mathcal{C}_1$ is compact, we can extend f to a continuous bijection on all of [0, 1] in a
piecewise linear fashion, because $\mathcal{C}_1^c$ is a disjoint union of open intervals on
which f can be made piecewise linear. This construction will also preserve
the bijectivity of f. Hence we have a continuous bijection $f : [0, 1] \to [0, 1]$
with $f(\mathcal{C}_1) = \mathcal{C}_2$.
Problem 35: Give an example of a measurable function f and a continuous
function $\Phi$ so that $f \circ \Phi$ is non-measurable. Use the construction in the
hint to show that there exists a Lebesgue measurable set that is not a Borel
set.
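Solution. We will make use of Problem 34. Let $\mathcal{C}_1$ and $\mathcal{C}_2$ be two Cantor
subsets of [0, 1] such that $m(\mathcal{C}_1) > 0$ and $m(\mathcal{C}_2) = 0$, and let
$\Phi : [0, 1] \to [0, 1]$ be the continuous bijection from Problem 34 with
$\Phi(\mathcal{C}_1) = \mathcal{C}_2$. Let $N \subset \mathcal{C}_1$ such
that N is non-measurable. (Here we use the fact that every set $E \subset \mathbb{R}$
of positive measure has a non-measurable subset. This is easy to prove
by mimicking the Vitali construction but restricting it to E: Define the
equivalence relation on E by $x \sim y$ if $x - y \in \mathbb{Q}$, and let $N \subset E$
contain one member from each equivalence class. Then N and its countably
many translates cover E, so N cannot have either measure 0 or nonzero
measure.) Define $f = \chi_{\Phi(N)}$, which is measurable because $\Phi(N) \subset \mathcal{C}_2$ has
measure 0. Then $f \circ \Phi = \chi_N$ is non-measurable, since
$\{x : \chi_N(x) > \frac{1}{2}\} = N$.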
To show that there is a Lebesgue-measurable set which is not Borel
measurable, we use the fact that for a continuous function $\Phi$, $\Phi^{-1}$ of a
Borel set is Borel. Indeed, for any function $\Phi$ and any sets $X_\alpha$,
$\Phi^{-1}(\bigcup X_\alpha) = \bigcup \Phi^{-1}(X_\alpha)$ and $\Phi^{-1}(\bigcap X_\alpha) = \bigcap \Phi^{-1}(X_\alpha)$, and likewise
$\Phi^{-1}(X^c) = (\Phi^{-1}(X))^c$. Hence the collection of sets X for which
$\Phi^{-1}(X)$ is Borel is a $\sigma$-algebra. It contains the open sets, because
$\Phi^{-1}(G)$ is open for continuous $\Phi$ and open G, so it contains every
Borel set. Now
consider the set $\Phi(N)$ from our construction above. Since $\Phi(N)$ is a subset
of the set $\mathcal{C}_2$ which has measure 0, $\Phi(N)$ is measurable by the completeness
of Lebesgue measure. However, $\Phi$ is a bijection so $\Phi^{-1}(\Phi(N)) = N$ which
is not Borel, so $\Phi(N)$ cannot be Borel.
Chapter 1.7, Page 46
Problem 2: Any open set $U$ can be written as the union of closed cubes, so
that $U = \bigcup Q_j$ with the following properties:
(i) The $Q_j$ have disjoint interiors.
(ii) $d(Q_j, U^c) \approx$ side length of $Q_j$. This means that there are positive
constants $c$ and $C$ so that $c \le d(Q_j, U^c)/\ell(Q_j) \le C$, where $\ell(Q_j)$
denotes the side length of $Q_j$.
Proof. Let $U \subset \mathbb{R}^d$ be an open set. Let $\mathcal{F}_0 = \{Q_j\}$ be the partition of $U$
into dyadic cubes as described in the book (pp. 7-8). I claim that $d(Q_j, U^c)/\ell(Q_j)$
is bounded above for this partition (i.e. does not take on arbitrarily large
values). This is so because if $d(Q_j, U^c) > \sqrt{d}\,\ell(Q_j)$, then $Q_j$ is a subset of
a larger dyadic cube lying within $U$. (If we take the dyadic cube one size
larger than $Q_j$ and containing $Q_j$, then $Q_j$ is at most $\sqrt{d}\,\ell(Q_j)$ away from
any point in this larger dyadic cube.) But the construction of $\mathcal{F}_0$ is done
in such a way that every dyadic cube is maximal in $U$, i.e. is not contained
in a larger dyadic cube lying within $U$. Hence $d(Q_j, U^c)/\ell(Q_j) \le \sqrt{d}$ for all $Q_j$.
Now we define an iterative refining procedure on the partition $\mathcal{F}_0$. In this
procedure, every cube $Q$ in the partition $\mathcal{F}_n$ for which $d(Q, U^c)/\ell(Q) < \sqrt{d}/4$ is
replaced by its $2^d$ dyadic sub-cubes. If $R$ is one of these sub-cubes, then
\[ \frac{d(R, U^c)}{\ell(R)} \ge \frac{d(Q, U^c)}{\tfrac{1}{2}\ell(Q)} = 2\,\frac{d(Q, U^c)}{\ell(Q)} \]
and
\[ \frac{d(R, U^c)}{\ell(R)} \le \frac{d(Q, U^c) + \sqrt{d}\,\ell(R)}{\ell(R)} = 2\,\frac{d(Q, U^c)}{\ell(Q)} + \sqrt{d}. \]
Thus, if $\rho(Q) = d(Q, U^c)/\ell(Q)$, then the sub-cubes have the property
\[ 2\rho(Q) \le \rho(R) \le 2\rho(Q) + \sqrt{d}. \]
Now let
\[ \mathcal{F} = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} \mathcal{F}_k \]
be the partition consisting of those cubes that are eventually in all the $\mathcal{F}_n$.
I claim that $\sqrt{d}/4 \le d(Q, U^c)/\ell(Q) \le 2\sqrt{d}$ for any cube $Q \in \mathcal{F}$. Consider what
happens as our refinement process iterates. If a given cube has too small
a distance-to-side ratio, its sub-cubes will have this ratio at least doubled
in the next iteration. Hence, after enough iterations its sub-cubes will all
have their distance-to-side ratio in the desired interval $[\sqrt{d}/4, 2\sqrt{d}]$. Once
they are in this interval, they are not sub-divided any further. On the
other hand, none can ever achieve a ratio too large to be in this interval,
since a cube is only subdivided if its ratio is less than $\sqrt{d}/4$, and the next
iteration can then make it at most $2(\sqrt{d}/4) + \sqrt{d} = 3\sqrt{d}/2 < 2\sqrt{d}$. This
shows that $\mathcal{F}$ has all its cubes in the desired interval. Consider also that
the cubes of $\mathcal{F}$ must have disjoint interiors and cover all of $U$: The only cubes with
overlapping interiors are those from distinct steps in our iteration scheme,
so taking the intersection to get $\mathcal{F}$ will weed out any such overlaps. Also,
for a given $x \in U$, if we consider the sequence of cubes containing $x$ in
our various partitions, this sequence will shrink for a finite number of steps
and then stay constant once the distance-to-side ratio reaches a desirable
number. Hence $x$ is contained in some cube that is eventually in all the $\mathcal{F}_n$,
so $x$ is covered by $\mathcal{F}$.
Problem 3: Find an example of a measurable subset $C$ of $[0, 1]$ such that
$m(C) = 0$, yet the difference set of $C$ contains a non-trivial interval centered
at the origin.
Solution. Let $C$ be the Cantor middle-thirds set. Note that the Cantor
dust described in the hint consists precisely of those points $(x, y)$ for which
both $x$ and $y$ are in $C$. Note also that if a line of slope 1 passes through
any cube in any iteration of the Cantor dust, it must pass through one
of the sub-cubes of that cube in the next iteration. To see this, consider
WLOG the original cube. For a line $y = x + a$, if $\tfrac{1}{3} \le a \le 1$ then the line
passes through the upper left cube; if $-\tfrac{1}{3} \le a \le \tfrac{1}{3}$ then it passes through
the lower left cube; and if $-1 \le a \le -\tfrac{1}{3}$ it passes through the lower right
cube. Thus, if $C_n$ is the $n$th iteration of the Cantor dust and $L$ is a line
of the form $y = x + a$ with $-1 \le a \le 1$, then $L \cap C_n$ is nonempty and
compact. (It is nonempty by the preceding remark, and compact because
it is the intersection of the compact sets $C_n$ and $L \cap ([0, 1] \times [0, 1])$.) Thus,
the infinite intersection of these nested compact sets is nonempty, i.e. $L$
intersects the Cantor dust at some point $(x, y)$. Then since $x, y \in C$, we
have $y - x = a$. So $a$ is in the difference set of the Cantor set for any
$a \in [-1, 1]$.
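As a finite sanity check of the claim that the difference set fills $[-1, 1]$, the following sketch (not part of the original solution) works with the left endpoints of the stage-$n$ Cantor intervals, which all lie in $C$. The function names and the choice to scale by $3^n$ so that everything is an integer are assumptions made for illustration. At stage $n$ the differences should hit every even multiple of $3^{-n}$ between $-(1 - 3^{-n})$ and $1 - 3^{-n}$, a grid that becomes dense in $[-1, 1]$ as $n$ grows.

```python
from itertools import product


def cantor_left_endpoints(n):
    """Left endpoints of the 2**n stage-n Cantor intervals, scaled by 3**n.

    Each endpoint has an n-digit ternary expansion using only 0s and 2s,
    so after scaling it is an integer.
    """
    return sorted(
        sum(d * 3 ** (n - 1 - i) for i, d in enumerate(digits))
        for digits in product((0, 2), repeat=n)
    )


def endpoint_differences(n):
    """All differences x - y of stage-n left endpoints (still scaled by 3**n)."""
    pts = cantor_left_endpoints(n)
    return {x - y for x in pts for y in pts}


n = 4
diffs = endpoint_differences(n)
# Expected: every even integer between -(3**n - 1) and 3**n - 1.
expected = set(range(-(3 ** n - 1), 3 ** n, 2))
print(diffs == expected)
```

The check is only about a finite grid; the compactness argument in the solution is what upgrades it to every real $a \in [-1, 1]$.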
Problem 4: Complete the following outline to prove that a bounded function
on an interval $[a, b]$ is Riemann integrable if and only if its set of
discontinuities has measure zero. This argument is given in detail in the appendix
to Book I.
Let $f$ be a bounded function on a compact interval $J$, and let $I(c, r)$
denote the open interval centered at $c$ of radius $r > 0$. Let $\operatorname{osc}(f, c, r) = \sup |f(x) - f(y)|$,
where the supremum is taken over all $x, y \in J \cap I(c, r)$,
and define the oscillation of $f$ at $c$ by $\operatorname{osc}(f, c) = \lim_{r \to 0} \operatorname{osc}(f, c, r)$. Clearly,
$f$ is continuous at $c \in J$ if and only if $\operatorname{osc}(f, c) = 0$.
Prove the following assertions:
(a) For every $\epsilon > 0$, the set of points $c \in J$ such that $\operatorname{osc}(f, c) \ge \epsilon$ is
compact.
(b) If the set of discontinuities of $f$ has measure 0, then $f$ is Riemann
integrable.
(c) Conversely, if $f$ is Riemann integrable on $J$, then its set of
discontinuities has measure 0.
Solution.
(a) Clearly this set is bounded, so we need only show that it's closed.
Suppose $c_1, c_2, \dots$ is a sequence of such points with $c_n \to c$. We wish
to show that $\operatorname{osc}(f, c) \ge \epsilon$ as well. Let $r > 0$. Since $c_n \to c$, there is
some $c_N$ with $|c_N - c| < \tfrac{r}{2}$. Then $I(c_N, \tfrac{r}{2}) \subset I(c, r)$, so
$\operatorname{osc}(f, c, r) \ge \operatorname{osc}(f, c_N, \tfrac{r}{2}) \ge \operatorname{osc}(f, c_N) \ge \epsilon$. Since this is true for all $r$, we
must have $\operatorname{osc}(f, c) \ge \epsilon$.
(b) Let $\epsilon > 0$. Let
\[ A_\epsilon = \{c \in J : \operatorname{osc}(f, c) \ge \epsilon\}. \]
Since $A_\epsilon$ is a subset of the discontinuity set of $f$, it has measure 0.
Hence it can be covered by an open set $U$ with $m(U) < \epsilon$. Since
every open subset of $\mathbb{R}$ is a countable disjoint union of intervals, we
can write $U = \bigcup I_n$ with $\sum |I_n| < \epsilon$. Now because $A_\epsilon$ is compact, there
is a finite subcover; by re-ordering the intervals we may write
\[ A_\epsilon \subset \bigcup_{n=1}^{N} I_n. \]
Now on the compact set $J' = J \setminus \bigcup_{n=1}^{N} I_n$, $\operatorname{osc}(f, c) < \epsilon$ for all $c$. For
each $x \in J'$, this means we can find $r_x$ such that $\operatorname{osc}(f, x, r_x) < \epsilon$.
Then $J'$ is covered by the open intervals $U_x = (x - r_x, x + r_x)$. Let $\delta$
be the Lebesgue number of this covering, so that any subinterval of
$J'$ with length at most $\delta$ must be contained in one of the $U_x$. Now
consider a partition $P$ of $J$ with mesh size less than $\delta$. The total length of
all subintervals which intersect $\bigcup I_n$ is at most $\epsilon + 2N\delta$, since enlarging
each $I_n$ by $\delta$ on either side will cover all such intervals. On each of these subintervals,
$\sup f - \inf f \le 2M$ where $|f| \le M$ on $J$. Hence the contribution these
intervals make to the difference $U(P, f) - L(P, f)$ is at most $2M\epsilon + 4MN\delta$.
The other subintervals are contained in $J'$ and by construction of $\delta$,
each is contained within some $U_x$, so $\sup f - \inf f \le \epsilon$ on each of them.
Hence the total contribution they make to $U(P, f) - L(P, f)$ is at most
$\epsilon\, m(J)$. Thus, we have
\[ U(P, f) - L(P, f) \le (2M + m(J))\epsilon + 4MN\delta. \]
By also requiring $\delta \le \epsilon/(4MN)$ (with $N$ fixed once $\epsilon$ is chosen), we have thus
shown that the difference between upper and lower sums can be made
smaller than a constant times $\epsilon$. Hence $f$ is Riemann integrable.
(c) Suppose $f$ is Riemann integrable, and let $\epsilon > 0$. Let $n \in \mathbb{N}$. Then
there is a partition $P$ of $J$ with $U(P, f) - L(P, f) < \tfrac{\epsilon}{n}$. Now if the
interior of any subinterval $I_k$ of this partition intersects $A_{1/n}$ at some
$x$, then $\sup f - \inf f \ge \tfrac{1}{n}$ on $I_k$ because $\operatorname{osc}(f, x, r) \ge \tfrac{1}{n}$ for all $r$, and
$(x - r, x + r) \subset I_k$ for sufficiently small $r$. So the total length of the
subintervals whose interiors intersect $A_{1/n}$ is at most $\epsilon$, since otherwise
they would make a contribution of more than $\epsilon/n$ to $U(P, f) - L(P, f)$.
Hence we have covered $A_{1/n}$, apart from the finitely many partition points, by a
collection of intervals of total length at most $\epsilon$, which implies $m(A_{1/n}) \le \epsilon$. Now if $A$ is the set of points
at which $f$ is discontinuous, then
\[ A = \bigcup_{n=1}^{\infty} A_{1/n}. \]
Since $A_{1/n} \subset A_{1/(n+1)}$, the continuity of measure implies that
\[ m(A) = \lim_{n \to \infty} m(A_{1/n}) \le \epsilon. \]
Since $m(A) \le \epsilon$ for all $\epsilon$, $m(A) = 0$.

Problem 6: The fact that the axiom of choice and the well-ordering principle
are equivalent is a consequence of the following considerations.
One begins by defining a partial ordering on a set $E$ to be a binary relation
$\le$ on the set $E$ that satisfies:
(i) $x \le x$ for all $x \in E$.
(ii) If $x \le y$ and $y \le x$, then $x = y$.
(iii) If $x \le y$ and $y \le z$, then $x \le z$.
If in addition $x \le y$ or $y \le x$ whenever $x, y \in E$, then $\le$ is a linear ordering
of $E$.
The axiom of choice and the well-ordering principle are then logically
equivalent to the Hausdorff maximal principle: Every non-empty partially ordered
set has a (non-empty) maximal linearly ordered subset. In other words, if
$E$ is partially ordered by $\le$, then $E$ contains a non-empty subset $F$ which
is linearly ordered by $\le$ and such that if $F$ is contained in a set $G$ also
linearly ordered by $\le$, then $F = G$.
An application of the Hausdorff maximal principle to the collection of all
well-orderings of subsets of $E$ implies the well-ordering principle for $E$.
However, the proof that the axiom of choice implies the Hausdorff maximal
principle is more complicated.
Solution. I don't like this problem for two reasons: (1) It's not very clearly
stated what exactly I'm supposed to do. My best guess is that I'm supposed
to use the axiom of choice to prove the Hausdorff maximal principle, but
the problem never really comes out and says that. (2) What the crap. This
is supposed to be an analysis course, not a set theory course. We haven't
defined any of the basic concepts of set theory, yet I'm supposed to come
up with a complicated set theory proof. Since I've never had a course in
set theory, I really don't feel like guessing my way to something that might
be a proof. So, here's the proof from the appendix to Rudin's Real and
Complex Analysis:
For a collection of sets $\mathcal{F}$ and a sub-collection $\Phi \subset \mathcal{F}$, call $\Phi$ a subchain of
$\mathcal{F}$ if $\Phi$ is totally ordered by set inclusion.
Lemma 1. Suppose $\mathcal{F}$ is a nonempty collection of subsets of a set $X$ such
that the union of every subchain of $\mathcal{F}$ belongs to $\mathcal{F}$. Suppose $g$ is a function
which associates to each $A \in \mathcal{F}$ a set $g(A) \in \mathcal{F}$ such that $A \subset g(A)$ and
$g(A) \setminus A$ consists of at most one element. Then there exists an $A \in \mathcal{F}$ for
which $g(A) = A$.
Proof. Let $A_0 \in \mathcal{F}$. Call a subcollection $\mathcal{F}' \subset \mathcal{F}$ a tower if $A_0 \in \mathcal{F}'$, the
union of every subchain of $\mathcal{F}'$ is in $\mathcal{F}'$, and $g(A) \in \mathcal{F}'$ for every $A \in \mathcal{F}'$.
Then there exists at least one tower because the collection of all $A \in \mathcal{F}$
such that $A_0 \subset A$ is a tower. Let $\mathcal{F}_0$ be the intersection of all towers,
which is also a tower. Let $\Gamma$ be the collection of all $C \in \mathcal{F}_0$ such that every
$A \in \mathcal{F}_0$ satisfies either $A \subset C$ or $C \subset A$. For each $C \in \Gamma$, let $\Phi(C)$ be the
collection of $A \in \mathcal{F}_0$ such that $A \subset C$ or $g(C) \subset A$. We will prove that $\Gamma$ is
a tower. The first two properties are obvious, both for $\Gamma$ and for each $\Phi(C)$. Now let $C \in \Gamma$, and suppose
$A \in \Phi(C)$. If $A$ is a proper subset of $C$, then $C$ cannot be a proper subset
of $g(A)$ because then $g(A) \setminus A$ would have at least two elements. Hence
$g(A) \subset C$. If $A = C$, then $g(A) = g(C)$. The third possibility for $A$ is that
$g(C) \subset A$. But since $A \subset g(A)$ this implies $g(C) \subset g(A)$. Thus, we have
shown that $g(A) \in \Phi(C)$ for any $A \in \Phi(C)$, so $\Phi(C)$ is a tower. By the
minimality of $\mathcal{F}_0$, we must have $\Phi(C) = \mathcal{F}_0$ for every $C \in \Gamma$. This means
that $g(C) \in \Gamma$ for all $C \in \Gamma$, so $\Gamma$ is also a tower; by minimality again,
$\Gamma = \mathcal{F}_0$. This shows that $\mathcal{F}_0$ is totally ordered.
Now let $A$ be the union of all sets in $\mathcal{F}_0$. Then $A \in \mathcal{F}_0$ by the second tower
property, and $g(A) \in \mathcal{F}_0$ by the third. But $A$ is the largest member of $\mathcal{F}_0$
and $A \subset g(A)$, so $A = g(A)$.
Now let $\mathcal{F}$ be the collection of all totally ordered subsets of a partially
ordered set $E$. Since every single-element subset of $E$ is totally ordered, $\mathcal{F}$
is not empty. Note that the union of any chain of totally ordered sets is
totally ordered. Now let $f$ be a choice function for $E$. If $A \in \mathcal{F}$, let $A^*$ be
the set of all $x$ in the complement of $A$ such that $A \cup \{x\} \in \mathcal{F}$. If $A^* \ne \emptyset$,
let
\[ g(A) = A \cup \{f(A^*)\}. \]
If $A^* = \emptyset$, let $g(A) = A$.
By the lemma, $A^* = \emptyset$ for at least one $A \in \mathcal{F}$, and any such $A$ is a maximal
element of $\mathcal{F}$.
Problem 7: Consider the curve $\Gamma = \{y = f(x)\}$ in $\mathbb{R}^2$, $0 \le x \le 1$. Assume
that $f$ is twice continuously differentiable in $0 \le x \le 1$. Then show that
$m(\Gamma + \Gamma) > 0$ if and only if $\Gamma + \Gamma$ contains an open set, if and only if $f$ is
not linear.
Solution. We are asked to show the equivalence of the conditions (i) $m(\Gamma + \Gamma) > 0$,
(ii) $\Gamma + \Gamma$ contains an open set, and (iii) $f$ is not linear. We will
show that (ii) implies (i), which implies (iii), which implies (ii).
First, we should note that $\Gamma + \Gamma$ is measurable. The problem doesn't ask for
this, but it's worth pointing out. Consider $G : [0, 1] \times [0, 1] \to \mathbb{R}^2$ defined
by $G(x, y) = (x + y, f(x) + f(y))$. Then $\Gamma + \Gamma$ is just the range of $G$. Since
$G$ is continuous and $[0, 1] \times [0, 1]$ is compact, $\Gamma + \Gamma$ is compact and
therefore measurable.
The easiest of our three implications is (ii) implies (i). Suppose $\Gamma + \Gamma$
contains an open set. Non-empty open sets have positive measure, so $\Gamma + \Gamma$ is a
measurable set with a subset of positive measure, so it has positive measure.
Now suppose $m(\Gamma + \Gamma) > 0$. We wish to show that $f$ is not linear. Suppose
instead that $f$ is linear, say $f(x) = ax + b$. Then for any $x, x'$,
$(x + x', f(x) + f(x')) = (x + x', a(x + x') + 2b)$, so $\Gamma + \Gamma$ is a subset of the line $y = ax + 2b$,
which has measure 0. Thus, if $\Gamma + \Gamma$ has positive measure, $f$ must not be
linear.
The third implication is the least trivial. Suppose $f$ is not linear. Then
there are points $x_0, y_0 \in [0, 1]$ with $f'(x_0) \ne f'(y_0)$. Then the Jacobian determinant
\[ \det DG = \begin{vmatrix} 1 & 1 \\ f'(x) & f'(y) \end{vmatrix} = f'(y) - f'(x) \]
is nonzero at the point $(x_0, y_0) \in [0, 1] \times [0, 1]$, where $G(x, y) = (x + y, f(x) + f(y))$
as above. WLOG we may assume $(x_0, y_0) \in (0, 1) \times (0, 1)$, since a
nonlinear function on $[0, 1]$ with continuous derivative cannot have
constant derivative everywhere on $(0, 1)$. Then the Inverse Function Theorem
guarantees that there is an open neighborhood of $(x_0, y_0)$ on which $G$ is a
diffeomorphism; since diffeomorphisms are homeomorphisms, this implies
that the image of $G$ contains an open set.
Chapter 2.5, Page 89
Exercise 1: Given a collection of sets $F_1, \dots, F_n$, construct another collection
$F_1^*, \dots, F_N^*$, with $N = 2^n - 1$, so that $\bigcup_{k=1}^{n} F_k = \bigcup_{j=1}^{N} F_j^*$; the collection
$\{F_j^*\}$ is disjoint; and $F_k = \bigcup_{F_j^* \subset F_k} F_j^*$ for every $k$.
Solution. For $j = 1, \dots, N$, create $F_j^*$ as follows: First write $j$ as an $n$-digit
binary number $j_1 j_2 \dots j_n$. Then for $k = 1, \dots, n$, let
\[ G_k = \begin{cases} F_k & j_k = 1 \\ F_k^c & j_k = 0. \end{cases} \]
Finally, let
\[ F_j^* = \bigcap_{k=1}^{n} G_k. \]
For example, $2 = 000 \dots 10$ in binary, so
\[ F_2^* = F_1^c \cap F_2^c \cap \dots \cap F_{n-2}^c \cap F_{n-1} \cap F_n^c. \]
Note that the $F_j^*$ are pairwise disjoint because if $j \ne j'$, then they differ
in some binary digit, say $j_\ell \ne j'_\ell$. Suppose WLOG that $j_\ell = 1$ and $j'_\ell = 0$.
Then $F_j^* \subset F_\ell$ whereas $F_{j'}^* \subset F_\ell^c$, so they are disjoint.
Also,
\[ F_k = \bigcup_{F_j^* \subset F_k} F_j^*. \]
To see this, note that the RHS is clearly a subset of the LHS since it is a
union of subsets. Conversely, suppose $x \in F_k$. Define $x_1, \dots, x_n$ by $x_i = 1$ if
$x \in F_i$ and 0 otherwise. Then if $m$ has the binary digits $m_1 = x_1, \dots, m_n = x_n$,
$x \in F_m^*$ by definition of $F_m^*$. Since $F_m^* \subset F_k$, the result follows.
This implies
\[ \bigcup_{i=1}^{n} F_i \subset \bigcup_{j=1}^{N} F_j^*. \]
But each $j \ge 1$ has at least one binary digit equal to 1, so each $F_j^*$ is contained
in some $F_i$, and the reverse inclusion holds as well.
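The binary-digit construction can be tried on small finite sets. The sketch below is only an illustration: the function name `disjoint_refinement` and the sample sets are made up here, and complements are taken relative to the union of the $F_k$, which is harmless because every $j \ge 1$ has at least one digit equal to 1.

```python
def disjoint_refinement(sets):
    """Return {j: F*_j} for j = 1 .. 2**n - 1 using the binary-digit rule.

    Digit j_k (k = 1..n, most significant first) selects F_k when it is 1
    and the complement of F_k when it is 0.
    """
    n = len(sets)
    universe = set().union(*sets)
    pieces = {}
    for j in range(1, 2 ** n):
        piece = set(universe)
        for k in range(n):
            bit = (j >> (n - 1 - k)) & 1
            piece &= sets[k] if bit else (universe - sets[k])
        pieces[j] = piece
    return pieces


F = [{1, 2, 3}, {3, 4}, {1, 4, 5}]
pieces = disjoint_refinement(F)
for j, piece in sorted(pieces.items()):
    print(format(j, "03b"), sorted(piece))
```

Some of the $F_j^*$ may be empty (here the piece for `010` and the piece for `111`); that does not affect any of the three required properties.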
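Exercise 2: In analogy to Proposition 2.5, prove that if $f$ is integrable on $\mathbb{R}^d$
and $\delta > 0$, then $f(\delta x)$ converges to $f(x)$ in the $L^1$ norm as $\delta \to 1$.
Solution. Let $\epsilon > 0$. Since $C_c(\mathbb{R}^d)$ is dense in $L^1(\mathbb{R}^d)$, we can choose
$g \in C_c(\mathbb{R}^d)$ such that $\|f - g\|_1 < \tfrac{\epsilon}{3}$. We can also choose $\eta$ such that
$|\delta - 1| < \eta \Rightarrow \tfrac{1}{\delta^d} < \tfrac{3}{2}$. Now let $K = \operatorname{supp}(g)$, and choose $M$ such that
$K \subset \{|x| \le M\}$. Let $B = \{|x| \le M + 1\}$, which is also compact. Because
$g$ is uniformly continuous on $B$, $\exists\, \delta' > 0$ such that $\delta' < 1$ and for $x, y \in B$,
$|x - y| < \delta' \Rightarrow |g(x) - g(y)| < \tfrac{\epsilon}{6 m(B)}$. Now suppose we choose $\delta$ such that
$|\delta - 1| < \min\bigl(\eta, \tfrac{\delta'}{M+1}\bigr)$. Then
\[ \|f(\delta x) - f(x)\|_1 = \int |f(\delta x) - f(x)| \le \int |f(\delta x) - g(\delta x)| + \int |g(\delta x) - g(x)| + \int |g(x) - f(x)| \]
by the triangle inequality. The third integral is just $\|f - g\|_1 < \tfrac{\epsilon}{3}$. By the
dilation property of the integral, the first integral is
\[ \frac{1}{\delta^d} \|f - g\|_1 < \frac{3}{2} \cdot \frac{\epsilon}{3} = \frac{\epsilon}{2}. \]
Now for the second integral. I claim that the integrand is 0 outside $B$ and
at most $\tfrac{\epsilon}{6 m(B)}$ inside. If $x \notin B$, then $|x| > M + 1$ so
\[ |\delta x| = |\delta||x| \ge \Bigl(1 - \frac{\delta'}{M + 1}\Bigr)|x| > \Bigl(1 - \frac{\delta'}{M + 1}\Bigr)(M + 1) > M, \]
which implies that $g(x) = g(\delta x) = 0$. Now suppose $x \in B$. If $g(x) \ne 0$,
then $x \in K$ and
\[ |\delta x| \le \Bigl(1 + \frac{\delta'}{M + 1}\Bigr)|x| \le \Bigl(1 + \frac{\delta'}{M + 1}\Bigr) M < M + 1, \]
so $\delta x \in B$ and $|\delta x - x| < \delta' \Rightarrow |g(\delta x) - g(x)| < \tfrac{\epsilon}{6 m(B)}$. If $g(\delta x) \ne 0$,
then $\delta x \in K \subset B$ and again $|g(\delta x) - g(x)| < \tfrac{\epsilon}{6 m(B)}$ because of uniform
continuity. Hence the integrand is 0 outside $B$ and is at most $\tfrac{\epsilon}{6 m(B)}$ inside
$B$, so its integral is at most $\tfrac{\epsilon}{6 m(B)} \cdot m(B) = \tfrac{\epsilon}{6}$. Putting the pieces together,
we have $\|f(\delta x) - f(x)\|_1 < \epsilon$.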
Exercise 4: Suppose $f$ is integrable on $[0, b]$, and
\[ g(x) = \int_x^b \frac{f(t)}{t}\,dt \quad \text{for } 0 < x \le b. \]
Prove that $g$ is integrable on $[0, b]$ and
\[ \int_0^b g(x)\,dx = \int_0^b f(t)\,dt. \]
Solution. We may assume WLOG that $f(t) \ge 0$, since otherwise we can
analyze $f^+$ and $f^-$ separately. Now let
\[ h(x, t) = \frac{f(t)}{t}\,\chi_{\{0 < x \le t \le b\}}. \]
Then $h \ge 0$ and $h$ is clearly measurable since it is a quotient of measurable
functions times another measurable function. By Tonelli's theorem,
\[ \int_{\mathbb{R}} h(x, t)\,dt = \int_x^b \frac{f(t)}{t}\,\chi_{\{0 < x \le b\}}\,dt \]
is a measurable function of $x$. Note that this is just equal to $g(x)$ for
$0 < x \le b$ and 0 elsewhere. Hence $g$ is measurable (in general, $g : (0, b] \to \mathbb{R}$
is measurable iff $g \chi_{(0, b]} : \mathbb{R} \to \mathbb{R}$ is). Moreover, Tonelli's theorem also tells
us that
\[ \int_0^b g(x)\,dx = \int_0^b \Bigl( \int_x^b \frac{f(t)}{t}\,dt \Bigr) dx = \int_{\mathbb{R} \times \mathbb{R}} h(x, t) = \int_0^b \Bigl( \int_0^t h(x, t)\,dx \Bigr) dt \]
\[ = \int_0^b \Bigl( \int_0^t \frac{f(t)}{t}\,dx \Bigr) dt = \int_0^b t\,\frac{f(t)}{t}\,dt = \int_0^b f(t)\,dt. \]
Note that the fact that this integral is finite implies that $g$ is integrable and
not just measurable.
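Exercise 5: Suppose $F$ is a closed set in $\mathbb{R}$, whose complement has finite
measure, and let $\delta(x)$ denote the distance from $x$ to $F$, that is,
\[ \delta(x) = d(x, F) = \inf\{|x - y| : y \in F\}. \]
Consider
\[ I(x) = \int_{\mathbb{R}} \frac{\delta(y)}{|x - y|^2}\,dy. \]
(a) Prove that $\delta$ is continuous by showing that it satisfies the Lipschitz
condition
\[ |\delta(x) - \delta(y)| \le |x - y|. \]
(b) Show that $I(x) = \infty$ for each $x \notin F$.
(c) Show that $I(x) < \infty$ for a.e. $x \in F$. This may be surprising in view of
the fact that the Lipschitz condition cancels only one power of $|x - y|$
in the integrand of $I$.
Solution.
(a) Let $\epsilon > 0$. Choose $z \in F$ such that $|y - z| < \delta(y) + \epsilon$. Then
\[ \delta(x) \le |x - z| \le |x - y| + |y - z| < |x - y| + \delta(y) + \epsilon \;\Rightarrow\; \delta(x) - \delta(y) < |x - y| + \epsilon. \]
Interchanging the roles of $x$ and $y$, we also have
\[ \delta(y) - \delta(x) < |x - y| + \epsilon. \]
Hence $|\delta(x) - \delta(y)| < |x - y| + \epsilon$ for any $\epsilon > 0$. This implies
$|\delta(x) - \delta(y)| \le |x - y|$.
(b) Suppose $x \notin F$. Because $F$ is closed, this implies $\delta(x) > 0$, since
otherwise there would be a sequence of points in $F$ converging to $x$.
Let $\lambda = \delta(x)$. By the Lipschitz condition from part (a),
$|x - y| < \tfrac{\lambda}{2} \Rightarrow |\delta(y) - \lambda| < \tfrac{\lambda}{2} \Rightarrow \delta(y) \ge \tfrac{\lambda}{2}$. Hence
\[ I(x) = \int_{\mathbb{R}} \frac{\delta(y)}{|x - y|^2}\,dy \ge \int_{x - \lambda/2}^{x + \lambda/2} \frac{\delta(y)}{|x - y|^2}\,dy \ge \int_{x - \lambda/2}^{x + \lambda/2} \frac{\lambda/2}{|x - y|^2}\,dy = \frac{\lambda}{2} \int_{-\lambda/2}^{\lambda/2} \frac{1}{y^2}\,dy = \infty. \]
(c) First, consider that for $y \notin F$,
\[ \int_F \frac{1}{|x - y|^2}\,dx \le 2 \int_{\delta(y)}^{\infty} \frac{1}{x^2}\,dx = \frac{2}{\delta(y)} \]
since $F \subset \{x : |x - y| \ge \delta(y)\}$. Now since the integrand of $I$ is nonnegative, we have by
Tonelli's theorem
\[ \int_F I(x)\,dx = \int_{\mathbb{R} \times \mathbb{R}} \frac{\delta(y)}{|x - y|^2}\,\chi_F(x) = \int_{\mathbb{R}} \Bigl( \int_{\mathbb{R}} \frac{\delta(y)}{|x - y|^2}\,\chi_F(x)\,dx \Bigr) dy \]
\[ = \int_{F^c} \delta(y) \Bigl( \int_F \frac{1}{|x - y|^2}\,dx \Bigr) dy \le \int_{F^c} \delta(y)\,\frac{2}{\delta(y)}\,dy = 2 m(F^c) < \infty. \]
(The outer integral reduces to $F^c$ because $\delta(y) = 0$ for $y \in F$.) Since
$\int_F I(x)\,dx < \infty$, we must have $I(x) < \infty$ for almost all $x \in F$.
(This is actually not all that shocking, since $I(x)$ is clearly less than $\infty$
for an interior point. Of course, there are closed sets whose boundaries
have positive measure, but those are the nasty guys.)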

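Exercise 6: Integrability of $f$ on $\mathbb{R}$ does not necessarily imply the
convergence of $f(x)$ to 0 as $x \to \infty$.
(a) There exists a positive continuous function $f$ on $\mathbb{R}$ such that $f$ is
integrable on $\mathbb{R}$, yet $\limsup_{x \to \infty} f(x) = \infty$.
(b) However, if we assume that $f$ is uniformly continuous on $\mathbb{R}$ and
integrable, then $\lim_{|x| \to \infty} f(x) = 0$.
Solution.
(a) Let
\[ f(x) = \begin{cases} 2^{3n+4}\, d\bigl(x, [n, n + 2^{-(2n+1)}]^c\bigr) & n \le x \le n + 2^{-(2n+1)},\ n \in \mathbb{Z},\ n \ge 0 \\ 0 & \text{else.} \end{cases} \]
The graph of $f$ consists of a series of triangular spikes with height
$2^{n+2}$ and base $2^{-(2n+1)}$. The $n$th such spike has area $2^{-n}$, so
$\int |f| = \sum_{n=0}^{\infty} 2^{-n} = 2$. But, because the spikes get arbitrarily high,
$\limsup f(x) = \infty$.
(b) Suppose $f$ is uniformly continuous on $\mathbb{R}$ but $f(x) \not\to 0$ as $|x| \to \infty$. Then there is an
$\epsilon > 0$ such that $|f(x)| \ge \epsilon$ for $x$ of arbitrarily large absolute value; WLOG this
happens for arbitrarily large positive $x$. Select $\delta > 0$
such that $|x - y| < \delta \Rightarrow |f(x) - f(y)| < \tfrac{\epsilon}{2}$, and also require $\delta < \tfrac{1}{2}$.
Since $f(x) \not\to 0$, $\exists\, x_1 > 0$ such that $|f(x_1)| \ge \epsilon$. Then $|f(y)| \ge \tfrac{\epsilon}{2}$ for
$y \in (x_1 - \delta, x_1 + \delta)$. Now since $f(x) \not\to 0$, $\exists\, x_2 > x_1 + 1$ with $|f(x_2)| \ge \epsilon$.
Then $|f| \ge \tfrac{\epsilon}{2}$ on $(x_2 - \delta, x_2 + \delta)$. Continuing in this manner, we obtain
infinitely many intervals of length $2\delta$ on which $|f| \ge \tfrac{\epsilon}{2}$. These intervals
are disjoint because of our requirements that $|x_{n+1} - x_n| > 1$ and $\delta < \tfrac{1}{2}$.
Hence, by Tchebychev's inequality,
\[ \int_{\mathbb{R}} |f(x)|\,dx \ge \frac{\epsilon}{2}\, m\bigl(\{x : |f(x)| \ge \tfrac{\epsilon}{2}\}\bigr) = \infty, \]
which contradicts the integrability of $f$.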

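Exercise 8: If $f$ is integrable on $\mathbb{R}$, show that
\[ F(x) = \int_{-\infty}^{x} f(t)\,dt \]
is uniformly continuous.
Solution. Let $\epsilon > 0$. By the absolute continuity of the integral (Prop 1.12b),
$\exists\, \delta > 0$ such that $m(E) < \delta \Rightarrow \int_E |f| < \epsilon$. Then (assuming WLOG $x > y$),
\[ |x - y| < \delta \;\Rightarrow\; |F(x) - F(y)| = \Bigl| \int_y^x f(t)\,dt \Bigr| \le \int_y^x |f(t)|\,dt < \epsilon \]
because $m([y, x]) = |x - y| < \delta$.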
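Exercise 10: Suppose $f \ge 0$, and let $E_{2^k} = \{x : f(x) > 2^k\}$ and $F_k = \{x : 2^k < f(x) \le 2^{k+1}\}$.
If $f$ is finite almost everywhere, then
\[ \bigcup_{k=-\infty}^{\infty} F_k = \{f(x) > 0\} \]
and the sets $F_k$ are disjoint.
Prove that $f$ is integrable if and only if
\[ \sum_{k=-\infty}^{\infty} 2^k m(F_k) < \infty, \quad \text{if and only if} \quad \sum_{k=-\infty}^{\infty} 2^k m(E_{2^k}) < \infty. \]
Use this result to verify the following assertions. Let
\[ f(x) = \begin{cases} |x|^{-a} & \text{if } |x| \le 1 \\ 0 & \text{otherwise} \end{cases} \]
and
\[ g(x) = \begin{cases} |x|^{-b} & \text{if } |x| > 1 \\ 0 & \text{otherwise.} \end{cases} \]
Then $f$ is integrable on $\mathbb{R}^d$ if and only if $a < d$; also $g$ is integrable on $\mathbb{R}^d$
if and only if $b > d$.
Solution. Let
\[ g(x) = \sum_{k=-\infty}^{\infty} 2^k \chi_{F_k}(x), \qquad h(x) = \sum_{k=-\infty}^{\infty} 2^{k+1} \chi_{F_k}(x). \]
Then $g(x) \le f(x) \le h(x)$ by definition of $F_k$. Then
\[ \int f(x)\,dx < \infty \;\Rightarrow\; \int g(x)\,dx = \sum_{k=-\infty}^{\infty} 2^k m(F_k) < \infty \]
whereas
\[ \sum_{k=-\infty}^{\infty} 2^k m(F_k) < \infty \;\Rightarrow\; \int f(x)\,dx \le \int h(x)\,dx = \sum_{k=-\infty}^{\infty} 2^{k+1} m(F_k) = 2 \sum_{k=-\infty}^{\infty} 2^k m(F_k) < \infty. \]
Now, writing $E_k$ for $E_{2^k}$, let
\[ \varphi(x) = \sum_{k=-\infty}^{\infty} 2^k \chi_{E_k}(x). \]
Then $f(x) \le \varphi(x) \le 2f(x)$ because if $2^k < f(x) \le 2^{k+1}$, then
$\varphi(x) = \sum_{j=-\infty}^{k} 2^j = \dots + \tfrac{1}{2} + 1 + 2 + 4 + \dots + 2^k = 2^{k+1}$. Hence
\[ \int f(x)\,dx < \infty \;\Leftrightarrow\; \int \varphi(x)\,dx = \sum_{k=-\infty}^{\infty} 2^k m(E_k) < \infty. \]
Now for the function $f$ given (with $a > 0$; for $a \le 0$ the function is bounded with bounded support, hence integrable),
\[ E_k = \{f(x) > 2^k\} = \begin{cases} \{|x| \le 1\} & k \le 0 \\ \{|x| < 2^{-k/a}\} & k \ge 1 \end{cases} \]
so, treating these sets as cubes (using balls instead only changes the dimensional constant, which does not affect convergence),
\[ m(E_k) = \begin{cases} 2^d & k \le 0 \\ 2^d\, 2^{-kd/a} & k \ge 1. \end{cases} \]
So $f$ is integrable iff
\[ \sum_{k=-\infty}^{\infty} 2^k m(E_k) = \sum_{k=-\infty}^{0} 2^k 2^d + \sum_{k=1}^{\infty} 2^k 2^d 2^{-kd/a} = 2^{d+1} + 2^d \sum_{k=1}^{\infty} 2^{(1 - d/a)k} < \infty. \]
This infinite sum will converge iff the constant $1 - \tfrac{d}{a}$ is negative, i.e. iff
$a < d$.
For the function $g$ given (with $b > 0$), let us redefine $g(x) = 1$ for $|x| \le 1$; clearly this
does not affect the integrability of $g$. Now $E_k$ is empty for $k > 0$, so we
need only consider non-positive values of $k$.
\[ g(x) > 2^k \;\Leftrightarrow\; |x| < 2^{-k/b} \]
so $E_k$ is a cube of volume $2^d\, 2^{-kd/b}$. Hence $g$ is integrable iff
\[ \sum_{k=-\infty}^{0} 2^k 2^d 2^{-kd/b} = 2^d \sum_{k=-\infty}^{0} 2^{(1 - d/b)k} \]
converges. This will happen iff $1 - d/b > 0 \Leftrightarrow b > d$.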
Exercise 11: Prove that if $f$ is integrable on $\mathbb{R}^d$, and $\int_E f(x)\,dx \ge 0$ for every
measurable $E$, then $f(x) \ge 0$ a.e. $x$. As a result, if $\int_E f(x)\,dx = 0$ for every
measurable $E$, then $f(x) = 0$ a.e.
Proof. Suppose it is not true that $f(x) \ge 0$ a.e., so $m(\{f(x) < 0\})$ is positive.
Now
\[ \{x : f(x) < 0\} = \bigcup_{n=1}^{\infty} \Bigl\{x : f(x) < -\frac{1}{n}\Bigr\} \;\Rightarrow\; m(\{x : f(x) < 0\}) \le \sum_{n=1}^{\infty} m\Bigl(\Bigl\{x : f(x) < -\frac{1}{n}\Bigr\}\Bigr) \]
by countable subadditivity. Hence at least one of the sets
\[ E_n = \Bigl\{x : f(x) < -\frac{1}{n}\Bigr\} \]
has positive measure. But then
\[ \int_{E_n} f(x)\,dx \le \int_{E_n} -\frac{1}{n}\,dx = -\frac{1}{n}\, m(E_n) < 0. \]
By contraposition, if $\int_E f(x)\,dx \ge 0$ for every measurable set $E$, then $f(x) \ge 0$ a.e.
Now if $\int_E f(x)\,dx = 0$ for every measurable $E$, then $\int_E f(x)\,dx \ge 0$ and
$\int_E -f(x)\,dx \ge 0$, which means $f \ge 0$ a.e. and $-f \ge 0$ a.e. Hence $f = 0$
a.e.
Exercise 12: Show that there are $f \in L^1(\mathbb{R}^d)$ and a sequence $\{f_n\}$ with
$f_n \in L^1(\mathbb{R}^d)$ such that
\[ \|f - f_n\|_1 \to 0, \]
but $f_n(x) \to f(x)$ for no $x$.
Solution. To assist in constructing such a sequence, we first construct a
sequence of measurable sets $E_n \subset \mathbb{R}^d$ with the property that $m(E_n) \to 0$
but every $x \in \mathbb{R}^d$ is in infinitely many $E_n$. We proceed as follows: Choose
integers $N_1, N_2, \dots$ such that
\[ 1 + \frac{1}{2} + \dots + \frac{1}{N_1} > 10 \]
\[ \frac{1}{N_1 + 1} + \frac{1}{N_1 + 2} + \dots + \frac{1}{N_2} > 100 \]
\[ \frac{1}{N_2 + 1} + \frac{1}{N_2 + 2} + \dots + \frac{1}{N_3} > 1000 \]
etc. This is possible because of the divergence of the harmonic series.
For convenience, we also define $N_0 = 0$. Next, for each $k = 0, 1, 2, \dots$, let
$B_{N_k + 1}$ be the cube of volume $\tfrac{1}{N_k + 1}$ centered at the origin. Then, for a
given $k$, define $B_j$ for $N_k + 1 < j \le N_{k+1}$ to be the cube centered at the
origin with $|B_j| = |B_{j-1}| + \tfrac{1}{j}$. Finally, we define $E_{N_k + 1} = B_{N_k + 1}$, and for
$N_k + 1 < j \le N_{k+1}$, $E_j = B_j \setminus B_{j-1}$. I claim that the sets $E_n$ have the
desired properties. First, note that $m(E_n) = \tfrac{1}{n}$. This is obvious for $n = N_k + 1$;
for $N_k + 1 < j \le N_{k+1}$ it is easy to see inductively that $|B_j| = \tfrac{1}{N_k + 1} + \dots + \tfrac{1}{j}$,
and since they are nested sets, $|B_j \setminus B_{j-1}| = \tfrac{1}{j}$. Thus $m(E_n) \to 0$. However,
for each $k$,
\[ \bigcup_{j = N_k + 1}^{N_{k+1}} E_j \]
is a cube centered at the origin with a volume greater than $10^{k+1}$. For any
given $x$, these cubes will eventually contain $x$, i.e. there is some $K$ such
that
\[ k > K \;\Rightarrow\; x \in \bigcup_{j = N_k + 1}^{N_{k+1}} E_j. \]
Hence every $x$ is in infinitely many $E_j$ as desired.
Having constructed these sets, we simply let $f_n(x) = \chi_{E_n}(x)$ and $f(x) = 0$.
Then $\int f_n(x)\,dx = \tfrac{1}{n}$ so $\|f_n - f\|_1 = \int |f_n - f| = \int f_n \to 0$, i.e. $f_n \to f$ in $L^1$.
However, for any given $x$ there are infinitely many $n$ such that $f_n(x) = 1$,
so $f_n(x) \not\to f(x)$ for any $x$.
Exercise 13: Give an example of two measurable sets $A$ and $B$ such that
$A + B$ is not measurable.
Solution. As suggested in the hint, let $N \subset \mathbb{R}$ be a non-measurable set.
Let $A = \{0\} \times [0, 1]$ and $B = N \times \{0\}$. Then $A$ and $B$ are both measurable
subsets of $\mathbb{R}^2$ because they are subsets of lines, which have measure 0. Now
$A + B = N \times [0, 1]$. By Proposition 3.4, if $N \times [0, 1]$ were measurable, since
$[0, 1]$ has positive measure, this would imply that $N$ was measurable too, a
contradiction. Hence $A + B$ is not measurable.
Exercise 15: Consider the function defined over $\mathbb{R}$ by
\[ f(x) = \begin{cases} x^{-1/2} & \text{if } 0 < x < 1, \\ 0 & \text{otherwise.} \end{cases} \]
For a fixed enumeration $\{r_n\}$ of the rationals $\mathbb{Q}$, let
\[ F(x) = \sum_{n=1}^{\infty} 2^{-n} f(x - r_n). \]
Prove that $F$ is integrable, hence the series defining $F$ converges for almost
every $x \in \mathbb{R}$. However, observe that this series is unbounded on every
interval, and in fact, any function $\tilde{F}$ that agrees with $F$ a.e. is unbounded
in any interval.
Solution. First we compute the integral of $f$; the improper Riemann
integral is
\[ \int_0^1 \frac{1}{\sqrt{x}}\,dx = 2\sqrt{x}\,\Big|_0^1 = 2, \]
but we only proved that the Lebesgue and Riemann integrals are equal for
the proper Riemann integral. Of course it's true for improper integrals as
well; here, since $(0, 1] = \bigcup_{n=1}^{\infty} (\tfrac{1}{n+1}, \tfrac{1}{n}]$, we have by countable additivity that
\[ \int_{\mathbb{R}} f\,dx = \sum_{n=1}^{\infty} \int_{(\frac{1}{n+1}, \frac{1}{n}]} f\,dx = \lim_{N \to \infty} \sum_{n=1}^{N} \int_{(\frac{1}{n+1}, \frac{1}{n}]} f\,dx = \lim_{N \to \infty} \int_{\frac{1}{N+1}}^{1} f\,dx \]
\[ = \lim_{a \to 0} \int_a^1 f\,dx \ \text{(since this limit exists)} = 2. \]
By translation invariance, the integral of $f(x - r_n)$ is also 2. Now since $f$
is nonnegative everywhere, the partial sums are monotonically increasing, so
by the Monotone Convergence Theorem
\[ \int F\,dx = \sum_{n=1}^{\infty} \int 2^{-n} f(x - r_n)\,dx = \sum_{n=1}^{\infty} 2^{1-n} = 2. \]
Since this integral is finite, $F$ is integrable. This implies that $F$ is
finite-valued for almost all $x \in \mathbb{R}$.
Now let $\tilde{F}$ be any function that agrees with $F$ almost everywhere, and $I$
any interval on the real line. Let $r_N$ be some rational number contained in
the interior of $I$. Then for any $M \ge 1$, $f(x - r_N) > M$ on the interval $(r_N, r_N + \tfrac{1}{M^2})$,
which intersects $I$ in an interval $I_M$ of positive measure. Since $F(x) \ge 2^{-N} f(x - r_N)$,
we have $F > 2^{-N} M$ on $I_M$, and since $\tilde{F}$ agrees
with $F$ almost everywhere, it must also be greater than $2^{-N} M$ at almost all
points of this interval $I_M \subset I$. Since $N$ is fixed and $M$ is arbitrary, $\tilde{F}$ exceeds any finite value on
$I$.
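Exercise 17: Suppose $f$ is defined on $\mathbb{R}^2$ as follows: $f(x, y) = a_n$ if $n \le x < n + 1$
and $n \le y < n + 1$, $n \ge 0$; $f(x, y) = -a_n$ if $n \le x < n + 1$ and
$n + 1 \le y < n + 2$, $n \ge 0$; $f(x, y) = 0$ elsewhere. Here $a_n = \sum_{k \le n} b_k$, with
$\{b_k\}$ a positive sequence such that $\sum_{k=0}^{\infty} b_k = s < \infty$.
(a) Verify that each slice $f^y$ and $f_x$ is integrable. Also for all $x$, $\int f_x(y)\,dy = 0$,
and hence $\int \bigl( \int f(x, y)\,dy \bigr) dx = 0$.
(b) However, $\int f^y(x)\,dx = a_0$ if $0 \le y < 1$, and $\int f^y(x)\,dx = a_n - a_{n-1}$
if $n \le y < n + 1$ with $n \ge 1$. Hence $y \mapsto \int f^y(x)\,dx$ is integrable on
$(0, \infty)$ and
\[ \int \Bigl( \int f(x, y)\,dx \Bigr) dy = s. \]
(c) Note that $\int_{\mathbb{R} \times \mathbb{R}} |f(x, y)|\,dx\,dy = \infty$.
Solution.
(a) Since $f$ is constant on boxes and 0 elsewhere, the horizontal and
vertical slices are constant on intervals and 0 elsewhere, and therefore
integrable. More precisely,
\[ f^y(x) = \begin{cases} -a_{\lfloor y \rfloor - 1} & \lfloor y \rfloor - 1 \le x < \lfloor y \rfloor \\ a_{\lfloor y \rfloor} & \lfloor y \rfloor \le x < \lfloor y \rfloor + 1 \\ 0 & \text{else} \end{cases} \]
for $y \ge 1$,
\[ f^y(x) = \begin{cases} a_0 & 0 \le x < 1 \\ 0 & \text{else} \end{cases} \]
for $0 \le y < 1$, and
\[ f_x(y) = \begin{cases} a_{\lfloor x \rfloor} & \lfloor x \rfloor \le y < \lfloor x \rfloor + 1 \\ -a_{\lfloor x \rfloor} & \lfloor x \rfloor + 1 \le y < \lfloor x \rfloor + 2 \\ 0 & \text{else} \end{cases} \]
for $x \ge 0$, where $\lfloor x \rfloor$ is the greatest integer less than or equal to $x$. (For $x < 0$
the function $f_x(y)$ is identically 0, and for $y < 0$ the function $f^y(x)$
is identically 0.) Clearly $\int f_x(y)\,dy = 0$ for all $x$, since $f_x(y)$ is equal
to $a_{\lfloor x \rfloor}$ on an interval of length 1 and $-a_{\lfloor x \rfloor}$ on an interval of length 1
and 0 elsewhere. Hence $\int \int f(x, y)\,dy\,dx = 0$.
(b) Since all the integrals are of constants on intervals of length 1, it
immediately follows from the formulas in part (a) that $\int f^y(x)\,dx$ is $a_0 = b_0$
for $0 \le y < 1$ and $a_n - a_{n-1} = b_n$ for $n \le y < n + 1$. Then
\[ \int_{\mathbb{R}} \Bigl( \int_{\mathbb{R}} f^y(x)\,dx \Bigr) dy = \sum_{n=0}^{\infty} \int_n^{n+1} \Bigl( \int_{\mathbb{R}} f^y(x)\,dx \Bigr) dy = \sum_{n=0}^{\infty} b_n = s. \]
(c) Since $|f(x, y)|$ is nonnegative, we may use Tonelli's theorem, so
\[ \int_{\mathbb{R} \times \mathbb{R}} |f(x, y)| = \int \Bigl( \int |f_x(y)|\,dy \Bigr) dx = \sum_{n=0}^{\infty} \int_n^{n+1} \Bigl( \int |f_x(y)|\,dy \Bigr) dx = \sum_{n=0}^{\infty} 2a_n = \infty \]
since $a_n \ge a_0 > 0$, so the terms in the sum are bounded away from 0.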
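Because $f$ is constant on unit squares, the two iterated integrals reduce to row and column sums over a grid, which can be checked exactly on a truncation. The sketch below assumes the sample sequence $b_k = 2^{-k}$ (so $s = 2$) and a cutoff `K`; both choices, and the function names, are illustrative only.

```python
from fractions import Fraction


def cell_value(i, j, a):
    """Value of f on the unit square [i, i+1) x [j, j+1) (x-index i, y-index j)."""
    if j == i:
        return a[i]
    if j == i + 1:
        return -a[i]
    return Fraction(0)


def iterated_sums(K):
    """Slice integrals on the truncated grid, with the assumed choice b_k = 2**-k."""
    b = [Fraction(1, 2 ** k) for k in range(K + 1)]
    a = [sum(b[: n + 1]) for n in range(K + 1)]
    # Integrate in y first: for x in strip i, cells j = i and j = i + 1 cancel.
    inner_dy = [sum(cell_value(i, j, a) for j in range(K + 2)) for i in range(K + 1)]
    # Integrate in x first: for y in strip j, cells i = j and i = j - 1 leave b_j.
    inner_dx = [sum(cell_value(i, j, a) for i in range(K + 1)) for j in range(K + 1)]
    # Integral of |f| over the truncated grid: each x-strip contributes 2 * a_i.
    abs_total = sum(2 * a_i for a_i in a)
    return b, inner_dy, inner_dx, abs_total


b, inner_dy, inner_dx, abs_total = iterated_sums(12)
print(sum(inner_dy), float(sum(inner_dx)), float(abs_total))
```

The $dy$-first sums vanish identically, the $dx$-first sums add up to the partial sums of $\sum b_n$, and the integral of $|f|$ grows without bound as `K` increases, matching parts (a), (b), and (c).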

Exercise 18: Let $f$ be a measurable finite-valued function on $[0, 1]$, and
suppose that $|f(x) - f(y)|$ is integrable on $[0, 1] \times [0, 1]$. Show that $f(x)$ is
integrable on $[0, 1]$.
Solution. Let $g(x, y) = |f(x) - f(y)|$. By Fubini's Theorem, since $g$ is
integrable on $[0, 1] \times [0, 1]$, $g^y(x)$ is an integrable function of $x$ for almost
all $y \in [0, 1]$. Choose any such $y$. Then since $|f(x)| \le |f(y)| + |f(x) - f(y)|$
and $f(y)$ is finite,
\[ \int_0^1 |f(x)|\,dx \le |f(y)| + \int_0^1 |f(x) - f(y)|\,dx < \infty. \]

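Exercise 19: Suppose $f$ is integrable on $\mathbb{R}^d$. For each $\alpha > 0$, let $E_\alpha = \{x : |f(x)| > \alpha\}$.
Prove that
\[ \int_{\mathbb{R}^d} |f(x)|\,dx = \int_0^{\infty} m(E_\alpha)\,d\alpha. \]
Solution. By Tonelli's Theorem,
\[ \int_0^{\infty} m(E_\alpha)\,d\alpha = \int_0^{\infty} \Bigl( \int_{\mathbb{R}^d} \chi_{\{|f(x)| > \alpha\}}\,dx \Bigr) d\alpha = \int_{\mathbb{R}^d} \Bigl( \int_0^{\infty} \chi_{\{|f(x)| > \alpha\}}\,d\alpha \Bigr) dx = \int_{\mathbb{R}^d} |f(x)|\,dx, \]
since for each fixed $x$ the inner integral is $\int_0^{|f(x)|} d\alpha = |f(x)|$.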
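For a simple function the identity can be verified by exact arithmetic, since $\alpha \mapsto m(E_\alpha)$ is then a step function. The sketch below is an illustration under that assumption: `values` and `measures` describe a hypothetical simple function taking the value `values[i]` on a set of measure `measures[i]`, and the function names are made up here.

```python
from fractions import Fraction


def integral_abs(values, measures):
    """Integral of |f| for a simple function given by (value, measure) pairs."""
    return sum(abs(v) * m for v, m in zip(values, measures))


def layer_cake(values, measures):
    """Integrate alpha -> m({|f| > alpha}) exactly over (0, infinity)."""
    levels = sorted({abs(v) for v in values} | {Fraction(0)})
    total = Fraction(0)
    for lo, hi in zip(levels, levels[1:]):
        # For alpha in [lo, hi), {|f| > alpha} consists of the pieces with |v| >= hi.
        dist = sum(m for v, m in zip(values, measures) if abs(v) >= hi)
        total += (hi - lo) * dist
    return total


values = [Fraction(3), Fraction(-1, 2), Fraction(0), Fraction(2)]
measures = [Fraction(1, 4), Fraction(2), Fraction(5), Fraction(1, 3)]
print(integral_abs(values, measures), layer_cake(values, measures))
```

Both computations give the same rational number; the general statement follows from Tonelli's theorem as in the solution.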

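Exercise 22: Prove that if $f \in L^1(\mathbb{R}^d)$ and
\[ \hat{f}(\xi) = \int_{\mathbb{R}^d} f(x) e^{-2\pi i x \cdot \xi}\,dx, \]
then $\hat{f}(\xi) \to 0$ as $|\xi| \to \infty$. (This is the Riemann-Lebesgue lemma.)
Solution. By translation invariance,
\[ \hat{f}(\xi) = \int_{\mathbb{R}^d} f(x) e^{-2\pi i x \cdot \xi}\,dx = \int_{\mathbb{R}^d} f\Bigl(x - \frac{\xi}{2|\xi|^2}\Bigr) e^{-2\pi i (x - \frac{\xi}{2|\xi|^2}) \cdot \xi}\,dx \]
\[ = \int_{\mathbb{R}^d} f\Bigl(x - \frac{\xi}{2|\xi|^2}\Bigr) e^{-2\pi i x \cdot \xi}\, e^{2\pi i \frac{\xi \cdot \xi}{2|\xi|^2}}\,dx = -\int_{\mathbb{R}^d} f\Bigl(x - \frac{\xi}{2|\xi|^2}\Bigr) e^{-2\pi i x \cdot \xi}\,dx, \]
since $e^{\pi i} = -1$. So, multiplying by $\tfrac{1}{2}$ and adding half the original expression,
\[ \hat{f}(\xi) = \frac{1}{2} \Bigl( \int_{\mathbb{R}^d} f(x) e^{-2\pi i x \cdot \xi}\,dx - \int_{\mathbb{R}^d} f\Bigl(x - \frac{\xi}{2|\xi|^2}\Bigr) e^{-2\pi i x \cdot \xi}\,dx \Bigr) = \frac{1}{2} \int_{\mathbb{R}^d} \Bigl( f(x) - f\Bigl(x - \frac{\xi}{2|\xi|^2}\Bigr) \Bigr) e^{-2\pi i x \cdot \xi}\,dx \]
and
\[ |\hat{f}(\xi)| = \frac{1}{2} \Bigl| \int_{\mathbb{R}^d} \Bigl( f(x) - f\Bigl(x - \frac{\xi}{2|\xi|^2}\Bigr) \Bigr) e^{-2\pi i x \cdot \xi}\,dx \Bigr| \le \frac{1}{2} \int_{\mathbb{R}^d} \Bigl| \Bigl( f(x) - f\Bigl(x - \frac{\xi}{2|\xi|^2}\Bigr) \Bigr) e^{-2\pi i x \cdot \xi} \Bigr|\,dx \]
\[ = \frac{1}{2} \int_{\mathbb{R}^d} \Bigl| f(x) - f\Bigl(x - \frac{\xi}{2|\xi|^2}\Bigr) \Bigr|\,dx = \frac{1}{2} \Bigl\| f(x) - f\Bigl(x - \frac{\xi}{2|\xi|^2}\Bigr) \Bigr\|_1. \]
As $|\xi| \to \infty$, $\bigl| \tfrac{\xi}{2|\xi|^2} \bigr| \to 0$, so $\bigl\| f(x) - f\bigl(x - \tfrac{\xi}{2|\xi|^2}\bigr) \bigr\|_1 \to 0$ by the $L^1$-continuity
of translation (Proposition 2.5).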
Exercise 23: As an application of the Fourier transform, show that there
does not exist a function $I \in L^1(\mathbb{R}^d)$ such that
\[ f * I = f \quad \text{for all } f \in L^1(\mathbb{R}^d). \]
Solution. Suppose such an $I$ exists. Then for every $f \in L^1$, $\hat{f}(\xi) \hat{I}(\xi) = \hat{f}(\xi)$
for all $\xi$. This implies that $\hat{I}(\xi) = 1$ for all $\xi$. But this contradicts the
Riemann-Lebesgue Lemma (Exercise 22). QED.
Note: To be totally complete, we should show that for any $\xi$ there is a
function $g \in L^1$ such that $\hat{g}(\xi) \ne 0$. Otherwise, the equation $\hat{f} \hat{I} = \hat{f}$
wouldn't necessarily imply $\hat{I} = 1$ everywhere. But it is easy to show that
such a $g$ exists for any $\xi$; for example, $g$ could be equal to $e^{2\pi i x \cdot \xi}$ on some
compact set of positive measure and 0 outside.
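Exercise 24: Consider the convolution
\[ (f * g)(x) = \int_{\mathbb{R}^d} f(x - y) g(y)\,dy. \]
(a) Show that $f * g$ is uniformly continuous when $f$ is integrable and $g$
bounded.
(b) If in addition $g$ is integrable, prove that $(f * g)(x) \to 0$ as $|x| \to \infty$.
Solution.
(a) Since $g$ is bounded, $\exists\, M$ with $|g| < M$ everywhere. Then
\[ |f * g(x) - f * g(x')| = \Bigl| \int_{\mathbb{R}^d} f(x - y) g(y)\,dy - \int_{\mathbb{R}^d} f(x' - y) g(y)\,dy \Bigr| = \Bigl| \int_{\mathbb{R}^d} \bigl( f(x - y) - f(x' - y) \bigr) g(y)\,dy \Bigr| \]
\[ \le \int_{\mathbb{R}^d} |f(x - y) - f(x' - y)|\,|g(y)|\,dy \le M \int_{\mathbb{R}^d} |f(x - y) - f(x' - y)|\,dy \]
\[ = M \int_{\mathbb{R}^d} |f(y) - f(y + (x' - x))|\,dy = M \|f(y) - f(y + (x - x'))\|_1. \]
In the penultimate step we have used translation invariance and the
fact that $\int f(y)\,dy = \int f(-y)\,dy$ provided both integrals are taken over
all of $\mathbb{R}^d$. Now by the $L^1$-continuity of translation, $\exists\, \delta > 0$ such that
$|x - x'| < \delta \Rightarrow \|f(y) - f(y + (x - x'))\|_1 < \tfrac{\epsilon}{M}$. This in turn implies
$|f * g(x) - f * g(x')| < \epsilon$, so $f * g$ is uniformly continuous.
(b) Let $\epsilon > 0$. Since $C_c(\mathbb{R}^d)$ is dense in $L^1(\mathbb{R}^d)$, we may choose $\tilde{f}$ such
that $\operatorname{supp}(\tilde{f}) = K$ is compact, $\tilde{f}$ is continuous, and $\|f - \tilde{f}\|_1 < \tfrac{\epsilon}{2M}$,
where $M$ is a bound for $|g|$ as in part (a). Continuous functions with
compact support are bounded, so choose $N$ such that $|\tilde{f}| < N$. Now
\[ |f * g(x)| = \Bigl| \int_{\mathbb{R}^d} f(x - y) g(y)\,dy \Bigr| = \Bigl| \int_{\mathbb{R}^d} \bigl( \tilde{f}(x - y) + (f(x - y) - \tilde{f}(x - y)) \bigr) g(y)\,dy \Bigr| \]
\[ \le \int_{\mathbb{R}^d} |\tilde{f}(x - y)|\,|g(y)|\,dy + \int_{\mathbb{R}^d} |f(x - y) - \tilde{f}(x - y)|\,|g(y)|\,dy. \]
Call the first integral $I_1$ and the second $I_2$. Since $|g| < M$, $I_2 \le M \|f - \tilde{f}\|_1 < \tfrac{\epsilon}{2}$.
Now since $g$ is integrable, there must exist a compact
$F$ such that $\int_{F^c} |g| < \tfrac{\epsilon}{2N}$. Choose $R_K$ and $R_F$ with $K \subset \{|y| \le R_K\}$ and
$F \subset \{|y| \le R_F\}$. Then if $|x| > R_K + R_F$,
\[ I_1 = \int_{\mathbb{R}^d} |\tilde{f}(x - y)|\,|g(y)|\,dy = \int_{x - y \in K} |\tilde{f}(x - y)|\,|g(y)|\,dy \quad \text{(since } \tilde{f} = 0 \text{ on } K^c\text{)} \]
\[ \le N \int_{x - y \in K} |g(y)|\,dy \le N \int_{F^c} |g(y)|\,dy < \frac{\epsilon}{2} \]
since $\{y : x - y \in K\} \subset F^c$. Thus, for sufficiently large $|x|$,
$|f * g(x)| \le I_1 + I_2 < \epsilon$.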

Chapter 2.6, Page 95


Problem 1: If $f$ is integrable on $[0, 2\pi]$, then $\int_0^{2\pi} f(x)e^{-inx}\,dx \to 0$ as $|n| \to \infty$. Show as a consequence that if $E$ is a measurable subset of $[0, 2\pi]$, then
\[ \int_E \cos^2(nx + u_n)\,dx \to \frac{m(E)}{2}, \quad \text{as } n \to \infty \]
for any sequence $\{u_n\}$.
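Solution. First, note that $\int_0^{2\pi} f(x)\cos(nx)\,dx \to 0$ and $\int_0^{2\pi} f(x)\sin(nx)\,dx \to 0$, since for real-valued $f$ these are, up to sign, the real and imaginary parts of $\int_0^{2\pi} f(x)e^{-inx}\,dx$. In particular, if we let $f(x) = \chi_E(x)$ for some measurable $E \subset [0, 2\pi]$, then for any $\epsilon > 0$ there exists $N$ such that $\left|\int_0^{2\pi} \chi_E(x)\sin(nx)\,dx\right|$ and $\left|\int_0^{2\pi} \chi_E(x)\cos(nx)\,dx\right|$ are both less than $\frac{\epsilon}{2}$ provided $|n| > N$. Then for any sequence $\{u_n\}$,
\[
\begin{aligned}
\left| \int_E \cos(2nx + 2u_n)\,dx \right|
&= \left| \int_E \bigl(\cos(2nx)\cos(2u_n) - \sin(2nx)\sin(2u_n)\bigr)\,dx \right| \\
&= \left| \cos(2u_n) \int_0^{2\pi} \chi_E(x)\cos(2nx)\,dx - \sin(2u_n) \int_0^{2\pi} \chi_E(x)\sin(2nx)\,dx \right| \\
&\le |\cos(2u_n)| \left| \int_0^{2\pi} \chi_E(x)\cos(2nx)\,dx \right| + |\sin(2u_n)| \left| \int_0^{2\pi} \chi_E(x)\sin(2nx)\,dx \right| \\
&< 1 \cdot \frac{\epsilon}{2} + 1 \cdot \frac{\epsilon}{2} = \epsilon
\end{aligned}
\]
for $|n| > N$. Hence $\int_E \cos(2nx + 2u_n)\,dx \to 0$ as $|n| \to \infty$. Now
\[
\begin{aligned}
\int_E \cos^2(nx + u_n)\,dx &= \int_E \frac{1}{2}\bigl(1 + \cos(2(nx + u_n))\bigr)\,dx \\
&= \frac{m(E)}{2} + \frac{1}{2}\int_0^{2\pi} \chi_E(x)\cos(2nx + 2u_n)\,dx
\end{aligned}
\]
and we have shown that the second term tends to 0 as $|n| \to \infty$.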
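The limit can be sanity-checked numerically. The sketch below is only an illustration under assumed choices that are not in the original: the set $E = [0.3, 1.7]$, the phase sequence $u_n = n^2$, and a midpoint-rule helper named `integral_cos2`. For an interval the exact integral differs from $m(E)/2$ by at most $1/(2n)$, which is what the printed errors should reflect.

```python
import math

def integral_cos2(a, b, n, u, steps=20000):
    """Midpoint-rule approximation of the integral of cos^2(n x + u) over [a, b]."""
    h = (b - a) / steps
    return h * sum(math.cos(n * (a + (i + 0.5) * h) + u) ** 2 for i in range(steps))

# Assumed test case: E = [0.3, 1.7] inside [0, 2*pi], arbitrary phases u_n = n^2.
a, b = 0.3, 1.7
half_measure = (b - a) / 2
frequencies = (1, 10, 100, 1000)
errors = [abs(integral_cos2(a, b, n, n * n) - half_measure) for n in frequencies]
for n, err in zip(frequencies, errors):
    print(n, err)
```

The choice of $u_n$ is irrelevant to the decay, which is the point of the problem: the bound on the error does not depend on the phase.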
Problem 2: Prove the Cantor-Lebesgue theorem: if
\[ \sum_{n=0}^{\infty} A_n(x) = \sum_{n=0}^{\infty} (a_n \cos nx + b_n \sin nx) \]
converges for $x$ in a set of positive measure (or in particular for all $x$), then $a_n \to 0$ and $b_n \to 0$ as $n \to \infty$.
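Solution. We can rewrite $A_n(x) = c_n \cos(nx + d_n)$ where $c_n = \sqrt{a_n^2 + b_n^2}$ and $d_n$ is some phase angle (when $c_n \neq 0$ it is determined by $\cos d_n = a_n/c_n$ and $\sin d_n = -b_n/c_n$). If $\sum A_n(x)$ converges on some $E$ with $m(E) > 0$, then $A_n(x) \to 0$ on $E$. Since the set of convergence is $2\pi$-periodic, we may assume $E \subset [0, 2\pi]$. By Egorov's theorem, this implies $A_n \to 0$ uniformly on some $E' \subset E$ with $m(E') > 0$. Then
\[
\begin{aligned}
& c_n \cos(nx + d_n) \to 0 \text{ uniformly on } E' \\
\Rightarrow\; & c_n^2 \cos^2(nx + d_n) \to 0 \text{ uniformly on } E' \\
\Rightarrow\; & \int_{E'} c_n^2 \cos^2(nx + d_n)\,dx \to 0.
\end{aligned}
\]
But $\int_{E'} \cos^2(nx + d_n)\,dx \to \frac{m(E')}{2}$ by the previous problem, so $c_n^2 \to 0$, which implies $c_n \to 0$, which implies $a_n \to 0$ and $b_n \to 0$.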
Problem 3: A sequence $\{f_k\}$ of measurable functions on $\mathbb{R}^d$ is Cauchy in measure if for every $\epsilon > 0$,
\[ m(\{x : |f_k(x) - f_\ell(x)| > \epsilon\}) \to 0 \quad \text{as } k, \ell \to \infty. \]
We say that $\{f_k\}$ converges in measure to a (measurable) function $f$ if for every $\epsilon > 0$,
\[ m(\{x : |f_k(x) - f(x)| > \epsilon\}) \to 0 \quad \text{as } k \to \infty. \]
This notion coincides with the "convergence in probability" of probability theory.
Prove that if a sequence $\{f_k\}$ of integrable functions converges to $f$ in $L^1$, then $\{f_k\}$ converges to $f$ in measure. Is the converse true?
Solution. Suppose $f_n, f \in L^1$ and $f_n \to f$ in $L^1$. By Chebyshev's inequality,
\[ m(\{x : |f_n(x) - f(x)| > \epsilon\}) \le \frac{\|f_n - f\|_1}{\epsilon} \]
and since the RHS tends to 0 as $n \to \infty$, the LHS does as well, so $f_n \to f$ in measure. However, the converse does not hold. Consider the case $f(x) = 0$, $f_n(x) = n\chi_{[0, 1/n]}$. Then $m(\{x : f(x) \neq f_n(x)\}) = \frac{1}{n}$, so for any $\epsilon$, $m(\{x : |f(x) - f_n(x)| > \epsilon\}) \le \frac{1}{n} \to 0$. Hence $f_n \to f$ in measure. However, $\|f_n - f\|_1 = 1$ for all $n$, so $f_n \not\to f$ in $L^1$.
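The counterexample is simple enough to tabulate exactly. The following sketch uses exact rational arithmetic; the helper names `measure_exceeding` and `l1_distance` are made up for illustration and simply encode the two closed-form quantities from the solution (measure of the exceptional set, and the $L^1$ distance to the zero function).

```python
from fractions import Fraction

def measure_exceeding(n, eps):
    """Lebesgue measure of {x : |f_n(x)| > eps} for f_n = n * indicator([0, 1/n])."""
    # f_n takes only the values 0 and n, so the set is [0, 1/n] when n > eps, else empty.
    return Fraction(1, n) if n > eps else Fraction(0)

def l1_distance(n):
    """L^1 norm of f_n - 0: height n times width 1/n."""
    return n * Fraction(1, n)

for n in (1, 10, 100, 1000):
    print(n, measure_exceeding(n, Fraction(1, 2)), l1_distance(n))
```

The first column of values shrinks like $1/n$ while the second stays at 1, which is exactly the gap between convergence in measure and convergence in $L^1$.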
Problem 4: We have already seen (in Exercise 8, Chapter 1) that if $E$ is a measurable set in $\mathbb{R}^d$, and $L$ is a linear transformation of $\mathbb{R}^d$ to $\mathbb{R}^d$, then $L(E)$ is also measurable, and if $E$ has measure 0, then so has $L(E)$. The quantitative statement is
\[ m(L(E)) = |\det(L)|\,m(E). \]
As a special case, note that the Lebesgue measure is invariant under rotations. (For this special case see also Exercise 26 in the next chapter.)
The above identity can be proved using Fubini's theorem as follows.
(a) Consider first the case $d = 2$, and $L$ a "strictly" upper triangular transformation $x' = x + ay$, $y' = y$. Then
\[ \chi_{L(E)}(x, y) = \chi_E(L^{-1}(x, y)) = \chi_E(x - ay, y). \]
Hence
\[
\begin{aligned}
m(L(E)) &= \int_{\mathbb{R}} \left( \int_{\mathbb{R}} \chi_E(x - ay, y)\,dx \right) dy \\
&= \int_{\mathbb{R}} \left( \int_{\mathbb{R}} \chi_E(x, y)\,dx \right) dy \\
&= m(E),
\end{aligned}
\]
by the translation-invariance of the measure.
(b) Similarly $m(L(E)) = m(E)$ if $L$ is strictly lower triangular. In general, one can write $L = L_1 \Delta L_2$, where $L_j$ are strictly (upper and lower) triangular and $\Delta$ is diagonal. Thus $m(L(E)) = |\det L|\,m(E)$, if one uses Exercise 7 in Chapter 1.
Solution.
(a) I'm not quite sure what to comment on here, since the problem statement pretty much did all the work for me. I guess I should point out that the use of Tonelli's theorem to turn the double integral into an iterated integral is justified because $\chi_E$ is nonnegative; note that this proves that the result holds even if $m(E) = \infty$.
(b) The fact that lower triangular transformations work the same way is obvious. Now supposing $L = L_1 \Delta L_2$, we have
\[
\begin{aligned}
m(L(E)) &= m(L_1(\Delta(L_2(E)))) = m(\Delta(L_2(E))) \\
&= |\det(\Delta)|\,m(L_2(E)) = |\det(\Delta)|\,m(E) = |\det(L)|\,m(E)
\end{aligned}
\]
since $\det(L) = \det(L_1)\det(\Delta)\det(L_2) = \det(\Delta)$ and $m(\Delta(E)) = |\det(\Delta)|\,m(E)$ by Exercise 7 of Chapter 1. Thus, every linear transformation that has an LU decomposition works as we want it to. However, I think the problem is flawed because not every matrix has an LU decomposition (in fact, not even every invertible matrix does). In particular, in the $2 \times 2$ case a matrix of the form $L_1 \Delta L_2$ will look like either
\[
\begin{pmatrix} 1 & c \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ e & 1 \end{pmatrix}
=
\begin{pmatrix} d_1 + c d_2 e & c d_2 \\ d_2 e & d_2 \end{pmatrix}
\]
or
\[
\begin{pmatrix} 1 & 0 \\ e & 1 \end{pmatrix}
\begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}
\begin{pmatrix} 1 & c \\ 0 & 1 \end{pmatrix}
=
\begin{pmatrix} d_1 & c d_1 \\ e d_1 & e d_1 c + d_2 \end{pmatrix}.
\]
But a matrix of the form
\[ \begin{pmatrix} 0 & a \\ b & 0 \end{pmatrix} \]
with $a, b \neq 0$ cannot be put in either form, because in the first case we would need $d_2 = 0$ in order to make the lower right entry 0, and then the upper right and lower left entries could not be nonzero; similarly, in the second case, we would need $d_1 = 0$, which would make $a, b \neq 0$ impossible. Hence, there are some matrices that cannot be factored in the way this problem indicates; for the ones that can, though, we know that they expand measures by a factor of $|\det|$.
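The two matrix products above are easy to get wrong by hand, so here is a small check. It is a sketch with assumed sample values ($c = 2$, $d_1 = 3$, $d_2 = 5$, $e = 7$) and made-up helper names; it multiplies the factors in both orders and compares with the closed forms, and confirms that the determinant is $d_1 d_2$ in each case.

```python
def matmul2(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def upper_diag_lower(c, d1, d2, e):
    """(upper triangular) * (diagonal) * (lower triangular)."""
    upper, diag, lower = [[1, c], [0, 1]], [[d1, 0], [0, d2]], [[1, 0], [e, 1]]
    return matmul2(matmul2(upper, diag), lower)

def lower_diag_upper(c, d1, d2, e):
    """(lower triangular) * (diagonal) * (upper triangular)."""
    lower, diag, upper = [[1, 0], [e, 1]], [[d1, 0], [0, d2]], [[1, c], [0, 1]]
    return matmul2(matmul2(lower, diag), upper)

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

c, d1, d2, e = 2, 3, 5, 7   # arbitrary sample values, not from the text
print(upper_diag_lower(c, d1, d2, e), det2(upper_diag_lower(c, d1, d2, e)))
print(lower_diag_upper(c, d1, d2, e), det2(lower_diag_upper(c, d1, d2, e)))
```

In the first product the lower right entry is $d_2$, and in the second the upper left entry is $d_1$, which is what forces the contradiction for the anti-diagonal matrix.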

Chapter 3.5, Page 145
Exercise 1: Suppose $\varphi$ is an integrable function on $\mathbb{R}^d$ with $\int_{\mathbb{R}^d} \varphi(x)\,dx = 1$. Set $K_\delta(x) = \delta^{-d}\varphi(x/\delta)$, $\delta > 0$.
(a) Prove that $\{K_\delta\}_{\delta > 0}$ is a family of good kernels.
(b) Assume in addition that $\varphi$ is bounded and supported in a bounded set. Verify that $\{K_\delta\}_{\delta > 0}$ is an approximation to the identity.
(c) Show that Theorem 2.3 holds for good kernels as well.
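Solution.
(a) By the dilation properties of the integral, we have immediately that $\int_{\mathbb{R}^d} K_\delta(x)\,dx = \int_{\mathbb{R}^d} \varphi(x)\,dx = 1$ and $\int_{\mathbb{R}^d} |K_\delta(x)|\,dx = \int_{\mathbb{R}^d} |\varphi(x)|\,dx = \|\varphi\|_1 < \infty$. This proves the first two properties of good kernels. For the last, we recall that for $\varphi \in L^1$, for every $\epsilon > 0$ there exists a compact set $F_\epsilon$ such that $\int_{F_\epsilon^c} |\varphi| < \epsilon$. Now compact subsets of $\mathbb{R}^d$ are bounded, so $F_\epsilon \subset B_{r_\epsilon}(0)$ for some radius $r_\epsilon$. Now if $h > 0$ is any fixed number, for $\delta < \frac{h}{r_\epsilon}$ the change of variables $x \mapsto \delta x$ gives
\[ \int_{|x| > h} |K_\delta(x)|\,dx = \int_{|x| > h/\delta} |\varphi(x)|\,dx \le \int_{|x| > r_\epsilon} |\varphi(x)|\,dx \le \int_{F_\epsilon^c} |\varphi| < \epsilon. \]
Thus, for any $h > 0$, $\int_{|x| > h} |K_\delta(x)|\,dx \to 0$ as $\delta \to 0$. Hence $\{K_\delta\}$ is a family of good kernels.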
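(b) Suppose $|\varphi| \le M$ everywhere and $\varphi(x) = 0$ for $|x| \ge B$. Let $A = MB^{d+1}$. Then for any $\delta > 0$,
\[ \frac{1}{\delta}|K_\delta(x)| = \delta^{-(d+1)}\left|\varphi\!\left(\frac{x}{\delta}\right)\right| \le M\delta^{-(d+1)} = \frac{A}{(B\delta)^{d+1}} \le \frac{A}{|x|^{d+1}} \]
for $\left|\frac{x}{\delta}\right| \le B$; for $\left|\frac{x}{\delta}\right| > B$ we have $K_\delta(x) = 0$. Hence $|K_\delta(x)| \le A\delta/|x|^{d+1}$ for all $x$ and $\delta$. The remaining size condition $|K_\delta(x)| \le A\delta^{-d}$ holds with the constant $M$, so after enlarging $A$ to $\max(A, M)$ both bounds hold and $\{K_\delta\}$ is an approximation to the identity.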


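(c) Suppose $\{K_\delta\}$ is any family of good kernels. Then
\[
\begin{aligned}
\|f * K_\delta - f\|_1 &= \int_{\mathbb{R}^d} |f * K_\delta(x) - f(x)|\,dx \\
&= \int_{\mathbb{R}^d} \left| \int_{\mathbb{R}^d} f(x-y)K_\delta(y)\,dy - f(x) \right| dx \\
&= \int_{\mathbb{R}^d} \left| \int_{\mathbb{R}^d} \bigl(f(x-y) - f(x)\bigr)K_\delta(y)\,dy \right| dx \\
&\le \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |f(x-y) - f(x)|\,|K_\delta(y)|\,dy\,dx \\
&= \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |f(x-y) - f(x)|\,|K_\delta(y)|\,dx\,dy \\
&= \int_{|y| \le \eta} \int_{\mathbb{R}^d} |f(x-y) - f(x)|\,|K_\delta(y)|\,dx\,dy + \int_{|y| > \eta} \int_{\mathbb{R}^d} |f(x-y) - f(x)|\,|K_\delta(y)|\,dx\,dy
\end{aligned}
\]
for any $\eta > 0$. (The third line uses $\int K_\delta = 1$.) Let us call the first integral $I_1$ and the second $I_2$. Then
\[
\begin{aligned}
I_1 &= \int_{|y| \le \eta} |K_\delta(y)| \left( \int_{\mathbb{R}^d} |f(x-y) - f(x)|\,dx \right) dy \\
&= \int_{|y| \le \eta} |K_\delta(y)|\,\|f(x-y) - f(x)\|_1\,dy.
\end{aligned}
\]
Now by the $L^1$-continuity of translation, $\|f(x-y) - f(x)\|_1 \to 0$ as $y \to 0$. Thus, if $\eta$ is sufficiently small, this norm will be at most, say, $\frac{\epsilon}{2A}$ for all $|y| \le \eta$. Then
\[ I_1 \le \int_{|y| \le \eta} |K_\delta(y)|\,\frac{\epsilon}{2A}\,dy \le \frac{\epsilon}{2} \]
since $\int_{\mathbb{R}^d} |K_\delta(y)|\,dy \le A$ for all $\delta$. Thus, by choosing $\eta$ sufficiently small, we may make $I_1$ as small as we like, independent of $\delta$. On the other hand,
\[
\begin{aligned}
I_2 &= \int_{|y| > \eta} |K_\delta(y)| \int_{\mathbb{R}^d} |f(x-y) - f(x)|\,dx\,dy \\
&\le \int_{|y| > \eta} |K_\delta(y)| \int_{\mathbb{R}^d} \bigl(|f(x-y)| + |f(x)|\bigr)\,dx\,dy \\
&= \int_{|y| > \eta} |K_\delta(y)|\,2\|f\|_1\,dy \\
&= 2\|f\|_1 \int_{|y| > \eta} |K_\delta(y)|\,dy \to 0
\end{aligned}
\]
as $\delta \to 0$. Putting the two halves together, we see that choosing a sufficiently small $\eta$ and then letting $\delta \to 0$ makes $\|f * K_\delta - f\|_1 \to 0$.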

Exercise 3: Suppose 0 is a point of (Lebesgue) density of the set $E \subset \mathbb{R}$. Show that for each of the individual conditions below there is an infinite sequence of points $x_n \in E$, with $x_n \neq 0$, and $x_n \to 0$ as $n \to \infty$.
(a) The sequence also satisfies $-x_n \in E$ for all $n$.
(b) In addition, $2x_n$ belongs to $E$ for all $n$.
Generalize.
Solution.
(a) Since 0 is a point of density of $E$, there exists $r_0 > 0$ such that $m(E \cap B_r(0)) > \frac{2}{3}m(B_r(0)) = \frac{4}{3}r$ for $r \le r_0$. By symmetry, $m((-E) \cap B_{r_0}(0)) = m(E \cap B_{r_0}(0)) \ge \frac{4}{3}r_0$. Now $E$ and $-E$ both intersect $B_{r_0}(0)$, which has measure $2r_0$, in sets of measure at least $\frac{4}{3}r_0$. Therefore they intersect each other in a set of measure at least $\frac{2}{3}r_0$. The same argument applies with any $r \le r_0$ in place of $r_0$, so $E \cap (-E) \cap B_r(0)$ has positive measure for every such $r$. In particular, for each $n$ we can pick $x_n \neq 0$ in $E \cap (-E) \cap B_{r_0/n}(0)$. This sequence satisfies $x_n \in E$, $-x_n \in E$, and $x_n \to 0$.
(b) Since 0 is a point of density of $E$, there exists $r_0 > 0$ such that $m(E \cap B_r(0)) > \frac{2}{3}m(B_r(0)) = \frac{4}{3}r$ for $r \le r_0$. Let $E_0 = E \cap B_{r_0}(0)$. Then $m(\frac{1}{2}E_0) = \frac{1}{2}m(E_0) \ge \frac{2}{3}r_0$, as we showed in a previous homework about the effect of dilation on Lebesgue measure. Now $\frac{1}{2}E_0 = (\frac{1}{2}E) \cap B_{r_0/2}(0)$ has measure at least $\frac{2}{3}r_0$, and we also know $m(E \cap B_{r_0/2}(0)) \ge \frac{2}{3}m(B_{r_0/2}) = \frac{2}{3}r_0$ since $\frac{r_0}{2} < r_0$. So $E$ and $\frac{1}{2}E$ both intersect $B_{r_0/2}$, which has measure $r_0$, in sets of measure at least $\frac{2}{3}r_0$. Therefore they intersect each other in a set of measure at least $\frac{1}{3}r_0$. Again the argument works for every $r \le r_0$, so $E \cap (\frac{1}{2}E) \cap B_{r/2}(0)$ has positive measure for all such $r$, and we can pick $x_n \neq 0$ in $E \cap (\frac{1}{2}E) \cap B_{r_0/(2n)}(0)$. Then $x_n \in E$, $2x_n \in E$, and $x_n \to 0$.
Clearly the above process generalizes to produce a sequence $x_n$ with $x_n \in E$ and $cx_n \in E$ for any $c \neq 0$.
Exercise 5: Consider the function on $\mathbb{R}$ defined by
\[
f(x) = \begin{cases} \dfrac{1}{|x|(\log 1/|x|)^2} & \text{if } |x| \le 1/2, \\ 0 & \text{otherwise.} \end{cases}
\]
(a) Verify that $f$ is integrable.
(b) Establish the inequality
\[ f^*(x) \ge \frac{c}{|x|(\log 1/|x|)} \quad \text{for some } c > 0 \text{ and all } |x| \le 1/2, \]
to conclude that the maximal function $f^*$ is not locally integrable.


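Solution.
(a) We have
\[ \int_{\mathbb{R}} |f(x)|\,dx = 2\int_0^{1/2} \frac{1}{x(\log 1/x)^2}\,dx = 2\left[\frac{1}{\log \frac{1}{x}}\right]_0^{1/2} = \frac{2}{\log 2}. \]
(b) For $0 < |x| \le \frac{1}{2}$, if $B$ is the ball of radius $|x|$ centered at $x$ (that is, $B = (0, 2x)$ when $x > 0$), then
\[
\begin{aligned}
\frac{1}{m(B)} \int_B |f(y)|\,dy &= \frac{1}{2|x|} \int_0^{2|x|} |f(y)|\,dy \\
&\ge \frac{1}{2|x|} \int_0^{|x|} \frac{1}{y(\log 1/y)^2}\,dy \\
&= \frac{1}{2|x|} \cdot \frac{1}{\log(1/|x|)}.
\end{aligned}
\]
Since $f^*(x)$ is the supremum of such averages over all balls containing $x$, it is at least equal to the average over $B$, so $f^*(x) \ge \frac{1}{2|x|\log(1/|x|)}$. This function is not locally integrable, because if we integrate it in any neighborhood $(-\delta, \delta)$ of 0 we get
\[ \int_{-\delta}^{\delta} \frac{1}{2|x|\log(1/|x|)}\,dx = \left[-\log\left(\log\frac{1}{x}\right)\right]_0^{\delta} = \infty. \]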

Exercise 6: In one dimension there is a version of the basic inequality (1) for the maximal function in the form of an identity. We define the "one-sided" maximal function
\[ f_+^*(x) = \sup_{h > 0} \frac{1}{h} \int_x^{x+h} |f(y)|\,dy. \]
If $E_\alpha^+ = \{x \in \mathbb{R} : f_+^*(x) > \alpha\}$, then
\[ m(E_\alpha^+) = \frac{1}{\alpha} \int_{E_\alpha^+} |f(y)|\,dy. \]
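Solution. First, we note that
\[
\begin{aligned}
x \in E_\alpha^+ &\Leftrightarrow \exists h > 0 \text{ s.t. } \frac{1}{h}\int_x^{x+h} |f(y)|\,dy > \alpha \\
&\Leftrightarrow \int_x^{x+h} |f(y)|\,dy > \alpha h \\
&\Leftrightarrow \int_0^{x+h} |f(y)|\,dy - \int_0^x |f(y)|\,dy - \alpha(x+h) + \alpha x > 0 \\
&\Leftrightarrow \int_0^{x+h} |f(y)|\,dy - \alpha(x+h) > \int_0^x |f(y)|\,dy - \alpha x.
\end{aligned}
\]
Thus, if we define $F(x) = \int_0^x |f(y)|\,dy - \alpha x$, the set $E_\alpha^+$ is precisely the set $\{x : \exists h > 0 \text{ s.t. } F(x+h) > F(x)\}$. Note also that $F$ is continuous by the absolute continuity of the integral (we are assuming that $f \in L^1$, naturally). By the Rising Sun Lemma,
\[ E_\alpha^+ = \bigcup_{j=1}^{\infty} (a_j, b_j) \]
where the intervals $(a_j, b_j)$ are disjoint and
\[
\begin{aligned}
F(a_j) = F(b_j) &\Rightarrow \int_0^{a_j} |f(y)|\,dy - \alpha a_j = \int_0^{b_j} |f(y)|\,dy - \alpha b_j \\
&\Rightarrow \int_{a_j}^{b_j} |f(y)|\,dy = \alpha(b_j - a_j).
\end{aligned}
\]
Then
\[ \int_{E_\alpha^+} |f(y)|\,dy = \sum_{j=1}^{\infty} \int_{a_j}^{b_j} |f(y)|\,dy = \sum_{j=1}^{\infty} \alpha(b_j - a_j) = \alpha\,m(E_\alpha^+) \]
as desired.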
Exercise 8: Suppose $A$ is a Lebesgue measurable set in $\mathbb{R}$ with $m(A) > 0$. Does there exist a sequence $\{s_n\}$ such that the complement of $\bigcup_{n=1}^{\infty} (A + s_n)$ in $\mathbb{R}$ has measure zero?
Solution. Yes. Let $x$ be any point of density of $A$, and let $s_n = q_n - x$ where $\{q_n\}$ is an enumeration of the rationals. Let $E = \bigcup (A + s_n)$. To show that $m(E^c) = 0$, it is sufficient to show that $m(E^c \cap [n, n+1]) = 0$ for all $n \in \mathbb{Z}$; since $E$ is invariant under rational translations, it is sufficient to show that $E^c \cap [0, 1]$ has measure zero.
For $m = 1, 2, \ldots$ let $N_m$ be an integer such that $m(A \cap B_{1/N_m}(x)) \ge (1 - \frac{1}{m})\,m(B_{1/N_m}(x))$. Such an $N_m$ must exist because $x$ is a point of density of $A$. Then by the construction of $E$, $m(E \cap B_{1/N_m}(q)) \ge (1 - \frac{1}{m})\,m(B_{1/N_m}(q))$ for any $q \in \mathbb{Q}$. Now the open balls
\[ U_j^m = B_{1/N_m}\!\left(\frac{j}{2N_m}\right), \quad j = 1, 2, \ldots, 2N_m \]
cover $[0, 1]$, so
\[ E^c \cap [0, 1] \subset \bigcup_{j=1}^{2N_m} \left(E^c \cap U_j^m\right) \]
and
\[ m(E^c \cap [0, 1]) \le \sum_{j=1}^{2N_m} m\left(E^c \cap U_j^m\right) \le \sum_{j=1}^{2N_m} \frac{1}{m}\,m(U_j^m) = (2N_m)\,\frac{1}{m}\,\frac{2}{N_m} = \frac{4}{m}. \]
Since $m(E^c \cap [0, 1]) \le \frac{4}{m}$ for all $m$, $m(E^c \cap [0, 1]) = 0$. Hence $m(E^c) = 0$.
Note: It is not sufficient to construct a set whose Lebesgue points are dense in $\mathbb{R}$. Consider an open dense set of measure at most $\epsilon$ (e.g. put an interval of length $\frac{\epsilon}{2^k}$ around the $k$th rational). Then every point of the set is a Lebesgue point since the set is open, yet its complement has positive measure.
Exercise 9: Let $F$ be a closed subset in $\mathbb{R}$, and $\delta(x)$ the distance function from $x$ to $F$, that is
\[ \delta(x) = \inf\{|x - y| : y \in F\}. \]
Clearly, $\delta(x + y) \le |y|$ whenever $x \in F$. Prove the more refined estimate
\[ \delta(x + y) = o(|y|) \quad \text{for a.e. } x \in F, \]
that is, $\delta(x + y)/|y| \to 0$ for a.e. $x \in F$.
Solution. We note that $\delta$ is a function of bounded variation on any interval; in fact, in general we have $V_a^b(\delta) \le |b - a|$ because $|\delta(y) - \delta(z)| \le |y - z|$ for all $y, z \in \mathbb{R}$. Since $\delta$ has bounded variation, it is differentiable almost everywhere; in particular, it is differentiable for a.e. $x \in F$. But $\delta$ is a nonnegative function that is 0 for all $x \in F$, so it has a local minimum at every point of $F$; if it is differentiable at $x \in F$, its derivative is zero. By the definition of derivative, this implies $\delta(x + y)/|y| \to 0$.
Exercise 15: Suppose $F$ is of bounded variation and continuous. Prove that $F = F_1 - F_2$, where both $F_1$ and $F_2$ are monotonic and continuous.
Solution. Every bounded-variation function is a difference of increasing functions, so write $F = G_1 - G_2$ where $G_1$ and $G_2$ are increasing. As shown in Lemmas 3.12-13, an increasing function is a continuous increasing function plus a jump function. Hence $G_1 = F_1 + J_1$ where $F_1$ is continuous and increasing, and $J_1$ is a jump function; similarly, $G_2 = F_2 + J_2$. Then $F = (F_1 - F_2) + (J_1 - J_2)$. But $J_1 - J_2$ is a jump function, and jump functions are continuous only if they're constant. Since $F$ and $F_1 - F_2$ are continuous, so is $J_1 - J_2$, which implies that $J_1 - J_2$ is constant; WLOG, $J_1 - J_2 = 0$. (Otherwise we could redefine $F_1' = F_1 + (J_1 - J_2)$ and $F_1'$ would also be continuous and increasing.) Hence $F = F_1 - F_2$.
Exercise 17: Prove that if $\{K_\delta\}_{\delta > 0}$ is a family of approximations to the identity, then
\[ \sup_{\delta > 0} |(f * K_\delta)(x)| \le c f^*(x) \]
for some constant $c > 0$ and all integrable $f$.
Exercise 18: Verify the agreement between the two definitions given for the Cantor-Lebesgue function in Exercise 2, Chapter 1 and Section 3.1 of this chapter.
Solution. This is such a lame problem. It's so clear that they're the same. Probably the easiest way to see that is to think of the Cantor-Lebesgue function as the following process:
• Given $x$, let $y$ be the greatest member of the Cantor set such that $y \le x$. (We know such a $y$ exists because the Cantor set is closed.)
• Write the ternary expansion of $y$.
• Change all the 2's to 1's and re-interpret as a binary expansion. The value obtained is $F(x)$.
It's pretty clear that both the definitions of the Cantor-Lebesgue function given in the text do exactly this.
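The three-step process above can be turned into a short digit-by-digit routine. This is a sketch under one assumption worth stating: instead of finding the greatest Cantor point $y \le x$ explicitly, it reads the ternary digits of $x$ and stops at the first digit 1, which yields the same value because the expansion of $y$ continues from that position as $0222\ldots$, and that tail becomes $0111\ldots = 1000\ldots$ in binary. The function name `cantor_function` and the truncation parameter `digits` are illustrative choices.

```python
from fractions import Fraction

def cantor_function(x, digits=40):
    """Approximate the Cantor-Lebesgue function at x in [0, 1] using `digits` ternary digits."""
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)
    value = Fraction(0)
    weight = Fraction(1, 2)          # value of the current binary place
    for _ in range(digits):
        x *= 3
        d = int(x)                   # next ternary digit of x (0, 1, or 2)
        x -= d
        if d == 1:
            # First digit 1: F is constant on the removed interval containing x.
            value += weight
            break
        value += weight * (d // 2)   # ternary digit 2 becomes binary digit 1
        weight /= 2
    return value

for sample in (Fraction(1, 3), Fraction(1, 2), Fraction(1, 4), Fraction(2, 3)):
    print(sample, float(cantor_function(sample)))
```

Using `Fraction` keeps the ternary digits exact, so the only approximation is the truncation after `digits` places.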
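Exercise 19: Show that if $f : \mathbb{R} \to \mathbb{R}$ is absolutely continuous, then
(a) $f$ maps sets of measure zero to sets of measure zero.
(b) $f$ maps measurable sets to measurable sets.
Solution.
(a) Suppose $E \subset \mathbb{R}$ has measure zero. Let $\epsilon > 0$. By absolute continuity, there exists $\delta > 0$ such that $\sum |b_j - a_j| < \delta \Rightarrow \sum |f(b_j) - f(a_j)| < \epsilon$ for disjoint intervals $(a_j, b_j)$. (If $f$ is only absolutely continuous on each bounded interval, apply the argument to $E \cap [-k, k]$ for each $k$ and take the countable union.) Since $m(E) = 0$, there is an open set $U \supset E$ with $m(U) < \delta$. Every open subset of $\mathbb{R}$ is a countable disjoint union of open intervals, so
\[ U = \bigcup_{j=1}^{\infty} (a_j, b_j) \quad \text{with} \quad \sum_{j=1}^{\infty} (b_j - a_j) < \delta. \]
For each $j = 1, 2, \ldots$, let $m_j, M_j \in [a_j, b_j]$ be values of $x$ with
\[ f(m_j) = \min_{x \in [a_j, b_j]} f(x) \quad \text{and} \quad f(M_j) = \max_{x \in [a_j, b_j]} f(x). \]
Such $m_j$ and $M_j$ must exist because $f$ is continuous and $[a_j, b_j]$ is compact. Then
\[ f(U) \subset \bigcup_{j=1}^{\infty} [f(m_j), f(M_j)]. \]
But $|M_j - m_j| \le |b_j - a_j|$ so
\[ \sum_{j=1}^{\infty} |M_j - m_j| < \delta \Rightarrow \sum_{j=1}^{\infty} |f(M_j) - f(m_j)| \le \epsilon. \]
Hence $f(E)$ is a subset of a set of measure at most $\epsilon$. This is true for all $\epsilon$, so $f(E)$ has measure zero.
(b) Let $E \subset \mathbb{R}$ be measurable. Then $E = F \cup N$ where $F$ is $F_\sigma$ and $N$ has measure zero. Since closed subsets of $\mathbb{R}$ are $\sigma$-compact, $F$ is $\sigma$-compact. But then $f(F)$ is also $\sigma$-compact since $f$ is continuous. Then $f(E) = f(F) \cup f(N)$ is a union of an $F_\sigma$ set and a set of measure zero. Hence $f(E)$ is measurable.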
Exercise 20: This exercise deals with functions $F$ that are absolutely continuous on $[a, b]$ and are increasing. Let $A = F(a)$ and $B = F(b)$.
(a) There exists such an $F$ that is in addition strictly increasing, but such that $F'(x) = 0$ on a set of positive measure.
(b) The $F$ in (a) can be chosen so that there is a measurable subset $E \subset [A, B]$, $m(E) = 0$, so that $F^{-1}(E)$ is not measurable.
(c) Prove, however, that for any increasing absolutely continuous $F$, and $E$ a measurable subset of $[A, B]$, the set $F^{-1}(E) \cap \{F'(x) > 0\}$ is measurable.
Solution.
(a) Let
\[ F(x) = \int_a^x \delta_C(t)\,dt \]
where $C \subset [a, b]$ is a Cantor set of positive measure and $\delta_C(x)$ is the distance from $x$ to $C$. Note that $\delta_C(x) \ge 0$ with equality iff $x \in C$. Since $\delta_C$ is continuous, this integral is well-defined, even in the Riemann sense. Moreover, $F$ is absolutely continuous by the absolute continuity of integration of $L^1$ functions. Because $\delta_C$ is continuous, the fundamental theorem of calculus gives $F'(x) = \delta_C(x)$ for every $x$, so $F'(x) = 0$ on all of $C$, hence on a set of positive measure. However, $F$ is strictly increasing: Suppose $a \le x < y \le b$. Since $C$ contains no interval and is closed, some point, and therefore some interval, between $x$ and $y$ belongs to $C^c$. The integral of $\delta_C$ over this interval will be positive, so $F(y) > F(x)$.
(b) The same function from part (a) does the trick. Since $F$ is continuous and strictly increasing, it maps disjoint open intervals to disjoint open intervals. Let $U = [a, b] \setminus C$. Since $U$ is open, we can write
\[ U = \bigcup_{j=1}^{\infty} (a_j, b_j) \]
where the intervals $(a_j, b_j)$ are disjoint. Then
\[ F(U) = \bigcup_{j=1}^{\infty} (F(a_j), F(b_j)) \]
and
\[ m(F(U)) = \sum_{j=1}^{\infty} \bigl(F(b_j) - F(a_j)\bigr). \]
But
\[ B - A = F(b) - F(a) = \int_a^b \delta_C(x)\,dx = \int_U \delta_C(x)\,dx = \sum_{j=1}^{\infty} \bigl(F(b_j) - F(a_j)\bigr) \]
since $\delta_C = 0$ on $C$ so $\int_C \delta_C(x)\,dx = 0$. Thus $m(F(U)) = m(F([a, b]))$, so that $m(F(C)) = 0$. This implies that $m(F(S)) = 0$ for any subset $S \subset C$. But since $C$ has positive measure, it has a non-measurable subset $S$. Then if $E = F(S)$, $m(E) = 0$ so $E$ is measurable, but $F^{-1}(E) = S$ is not measurable.
(c)

Exercise 22: Suppose that $F$ and $G$ are absolutely continuous on $[a, b]$. Show that their product $FG$ is also absolutely continuous. This has the following consequences.
(a) Whenever $F$ and $G$ are absolutely continuous in $[a, b]$,
\[ \int_a^b F'(x)G(x)\,dx = -\int_a^b F(x)G'(x)\,dx + [F(x)G(x)]_a^b. \]
(b) Let $F$ be absolutely continuous in $[-\pi, \pi]$ with $F(\pi) = F(-\pi)$. Show that if
\[ a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} F(x)e^{-inx}\,dx, \]
such that $F(x) \sim \sum a_n e^{inx}$, then
\[ F'(x) \sim \sum i n a_n e^{inx}. \]
(c) What happens if $F(-\pi) \neq F(\pi)$?
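Proof. Since $F$ and $G$ are absolutely continuous, they are continuous and therefore bounded on the compact interval $[a, b]$. Suppose $|F|, |G| \le M$ on this interval. Now given $\epsilon > 0$, we can choose $\delta > 0$ such that $\sum |b_j - a_j| < \delta \Rightarrow \sum |F(b_j) - F(a_j)| < \frac{\epsilon}{2M}$ and $\sum |G(b_j) - G(a_j)| < \frac{\epsilon}{2M}$. Then
\[
\begin{aligned}
&\sum |F(b_j)G(b_j) - F(a_j)G(a_j)| \\
&\quad= \sum \frac{1}{2} \bigl| (F(b_j) - F(a_j))(G(b_j) + G(a_j)) + (F(b_j) + F(a_j))(G(b_j) - G(a_j)) \bigr| \\
&\quad\le \frac{1}{2} \left( \sum |F(b_j) - F(a_j)|\,|G(b_j) + G(a_j)| + \sum |F(b_j) + F(a_j)|\,|G(b_j) - G(a_j)| \right) \\
&\quad\le \frac{1}{2} \left( \sum (2M)|F(b_j) - F(a_j)| + \sum (2M)|G(b_j) - G(a_j)| \right) \\
&\quad< \frac{1}{2} \left( 2M\,\frac{\epsilon}{2M} + 2M\,\frac{\epsilon}{2M} \right) = \epsilon.
\end{aligned}
\]
This proves that $FG$ is absolutely continuous on $[a, b]$. We now turn to the consequences of this: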
(a) Since $FG$ is absolutely continuous, it's differentiable almost everywhere. By elementary calculus, $(FG)' = F'G + FG'$ at any point where all three derivatives exist, which is almost everywhere. Integrating both sides and subtracting $\int FG'$ yields
\[ \int_a^b F'(x)G(x)\,dx = -\int_a^b F(x)G'(x)\,dx + \int_a^b (FG)'(x)\,dx. \]
Since $FG$ is absolutely continuous, this implies
\[ \int_a^b F'(x)G(x)\,dx = -\int_a^b F(x)G'(x)\,dx + [F(x)G(x)]_a^b. \]
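(b) It would be nice if the problem would actually define this for us, but I'm assuming that the $\sim$ here means "is represented by" as opposed to any kind of statement about whether the function actually converges to its Fourier series or not. Then suppose $b_n$ are the Fourier coefficients of $F'$, so by definition
\[ b_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} F'(x)e^{-inx}\,dx. \]
Using part (a), we have
\[ b_n = -\frac{1}{2\pi} \int_{-\pi}^{\pi} F(x)(-in\,e^{-inx})\,dx + \frac{1}{2\pi}\left[F(x)e^{-inx}\right]_{-\pi}^{\pi} = in\,\frac{1}{2\pi} \int_{-\pi}^{\pi} F(x)e^{-inx}\,dx = i n a_n. \]
The boundary term vanishes because $e^{-in\pi} = e^{in\pi} = (-1)^n$ and $F(\pi) = F(-\pi)$.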
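(c) Then all bets are off. As one example, consider $F(x) = x$, which is clearly absolutely continuous on $[-\pi, \pi]$. Then
\[ a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} x e^{-inx}\,dx = \frac{1}{2\pi} \left[ -\frac{x e^{-inx}}{in} + \frac{e^{-inx}}{n^2} \right]_{-\pi}^{\pi} = \frac{i}{n}(-1)^n \]
for $n \neq 0$, and $a_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} x\,dx = 0$. However, $F'(x) = 1$, which has Fourier coefficients $b_0 = 1$ and $b_n = 0$ for $n \neq 0$, whereas $i n a_n = (-1)^{n+1} \neq 0$.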

Exercise 25: The following shows the necessity of allowing for general exceptional sets of measure zero in the differentiation Theorems 1.4, 3.4, and 3.11. Let $E$ be any set of measure zero in $\mathbb{R}^d$. Show that:
(a) There exists a non-negative integrable $f$ in $\mathbb{R}^d$, such that
\[ \liminf_{\substack{m(B) \to 0 \\ x \in B}} \frac{1}{m(B)} \int_B f(y)\,dy = \infty \quad \text{for each } x \in E. \]
(b) When $d = 1$ this may be restated as follows. There is an increasing absolutely continuous function $F$ such that
\[ D_+(F)(x) = D_-(F)(x) = \infty, \quad \text{for each } x \in E. \]


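Solution.
(a) Since $E$ has measure zero, there exist open sets $O_n$ with $E \subset O_n$ for all $n$ and $m(O_n) < \frac{1}{2^n}$. Let $f = \sum_{n=1}^{\infty} \chi_{O_n}$. Then $f \in L^1$ since
\[ \int_{\mathbb{R}^d} f = \sum_{n=1}^{\infty} \int_{\mathbb{R}^d} \chi_{O_n} = \sum_{n=1}^{\infty} m(O_n) \le \sum_{n=1}^{\infty} \frac{1}{2^n} = 1. \]
Now let $x \in E$. Since $O_n$ is open, there exist open balls $B_n \subset O_n$ with $x \in B_n$. Then for any ball $B \ni x$,
\[
\begin{aligned}
\int_B f(y)\,dy &= \sum_{n=1}^{\infty} m(O_n \cap B) \ge \sum_{n=1}^{\infty} m(B_n \cap B) \\
\Rightarrow \frac{1}{m(B)} \int_B f(y)\,dy &\ge \sum_{n=1}^{\infty} \frac{m(B_n \cap B)}{m(B)}.
\end{aligned}
\]
For any $N$, there exists $\delta > 0$ such that $m(B) < \delta$ and $x \in B$ implies $B \subset B_j$ for all $j = 1, \ldots, N$. (This is true because $B_1 \cap \cdots \cap B_N$ is an open set containing $x$ and hence contains an open ball around $x$.) Then
\[ \frac{1}{m(B)} \int_B f(y)\,dy \ge \sum_{n=1}^{\infty} \frac{m(B_n \cap B)}{m(B)} \ge N \]
for sufficiently small $B$. This proves that
\[ \liminf_{\substack{m(B) \to 0 \\ x \in B}} \frac{1}{m(B)} \int_B f(y)\,dy = \infty. \]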
(b) Let $f$ be as in part (a), and
\[ F(x) = \int_0^x f(y)\,dy. \]
Then $F$ is absolutely continuous because it is the integral of an $L^1$ function; it is increasing because $f$ is nonnegative. Now
\[ D_+(F)(x) = \liminf_{\substack{h \to 0 \\ h > 0}} \frac{F(x+h) - F(x)}{h} = \liminf_{\substack{h \to 0 \\ h > 0}} \frac{1}{h} \int_x^{x+h} f(y)\,dy \]
and
\[ D_-(F)(x) = \liminf_{\substack{h \to 0 \\ h < 0}} \frac{F(x+h) - F(x)}{h} = \liminf_{\substack{h \to 0 \\ h < 0}} \frac{1}{h} \int_x^{x+h} f(y)\,dy. \]
The conclusion in (a) implies that both of these are infinite, since one can consider integrals over the balls $[x, x+h)$ and $(x-h, x]$. (Technically, I suppose we should work with open balls, but one can look at e.g. $(x - \epsilon, x + h)$ for sufficiently small $\epsilon$.)

Exercise 30: A bounded function $F$ is said to be of bounded variation on $\mathbb{R}$ if $F$ is of bounded variation on any finite sub-interval $[a, b]$ and $\sup_{a,b} T_F(a, b) < \infty$. Prove that such an $F$ enjoys the following two properties:
(a) $\int_{\mathbb{R}} |F(x+h) - F(x)|\,dx \le A|h|$, for some constant $A$ and all $h \in \mathbb{R}$.
(b) $\left|\int_{\mathbb{R}} F(x)\varphi'(x)\,dx\right| \le A$, where $\varphi$ ranges over all $C^1$ functions of bounded support with $\sup_{x \in \mathbb{R}} |\varphi(x)| \le 1$.
For the converse, and analogues in $\mathbb{R}^d$, see Problem 6* below.
Solution.
(a) First, note that it is sufficient to treat the case where $F$ is a bounded increasing function. This is so because in general we can let $F = F_1 - F_2$ where $F_1$ and $F_2$ are bounded increasing functions; if they both satisfy the given condition, with constants $A_1$ and $A_2$, then $|F(x+h) - F(x)| \le |F_1(x+h) - F_1(x)| + |F_2(x+h) - F_2(x)|$ for all $x$, so
\[ \int |F(x+h) - F(x)|\,dx \le (A_1 + A_2)|h|. \]
(In case someone asks why $F$ must be a difference of bounded increasing functions, we could re-do the proof that was used on finite intervals, using the positive and negative variations of $F$. This avoids the problem of trying to extend from the bounded case and worrying about whether such an extension is unique.)
Suppose now that $F$ is bounded and increasing. Since $F$ is increasing, $|F(x+h) - F(x)| = F(x+h) - F(x)$ for $h > 0$. (By translation invariance, it is sufficient to treat the case of positive $h$, since $\int |F(x-h) - F(x)|\,dx = \int \bigl(F(x) - F(x-h)\bigr)\,dx = \int \bigl(F(x+h) - F(x)\bigr)\,dx$.) Then on any interval $[a, a+h]$,
\[ F(x+h) - F(x) \le F(a+2h) - F(a) \Rightarrow \int_a^{a+h} \bigl(F(x+h) - F(x)\bigr)\,dx \le h\bigl(F(a+2h) - F(a)\bigr). \]
In particular,
\[ \int_{nh}^{(n+1)h} \bigl(F(x+h) - F(x)\bigr)\,dx \le h\bigl(F((n+2)h) - F(nh)\bigr) \]
for any $n \in \mathbb{Z}$. Then for any $N$,
\[
\begin{aligned}
\int_{-Nh}^{Nh} \bigl(F(x+h) - F(x)\bigr)\,dx &\le h \sum_{n=-N}^{N-1} \bigl(F((n+2)h) - F(nh)\bigr) \\
&= h\Bigl(F((N+1)h) + F(Nh) - F((-N+1)h) - F(-Nh)\Bigr) \\
&\le 2h\bigl(F(+\infty) - F(-\infty)\bigr)
\end{aligned}
\]
since the sum telescopes. Since $F(+\infty) - F(-\infty)$ is a finite constant, letting $N \to \infty$ proves the result.
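As a numerical illustration of the bound in part (a), the sketch below takes the bounded increasing function $F = \arctan$ (an assumed example, not from the text) and approximates $\int |F(x+h) - F(x)|\,dx$ over a large window with a midpoint rule. The helper name `translation_integral`, the window half-width, and the step count are all illustrative choices.

```python
import math

def translation_integral(F, h, half_width=1000.0, steps=200000):
    """Midpoint-rule approximation of the integral of |F(x+h) - F(x)| over [-half_width, half_width]."""
    dx = 2 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * dx
        total += abs(F(x + h) - F(x))
    return total * dx

h = 0.5
value = translation_integral(math.atan, h)
bound = 2 * h * (math.pi / 2 - (-math.pi / 2))   # 2h (F(+inf) - F(-inf)) from the solution
print(value, bound)
```

The computed value sits near $\pi h$, about half of the bound $2h(F(+\infty) - F(-\infty))$, which suggests the factor 2 produced by the telescoping estimate is not sharp for this example.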
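(b) By some algebraic legwork,
\[ F(x+h)\varphi(x+h) - F(x)\varphi(x) = F(x)\bigl[\varphi(x+h) - \varphi(x)\bigr] + \varphi(x+h)\bigl[F(x+h) - F(x)\bigr]. \]
When we integrate both sides over $\mathbb{R}$, the left-hand side integrates to zero by translation invariance ($F\varphi$ is integrable because $F$ is bounded and $\varphi$ has bounded support). Hence
\[ \int_{\mathbb{R}} F(x)\bigl[\varphi(x+h) - \varphi(x)\bigr]\,dx = -\int_{\mathbb{R}} \varphi(x+h)\bigl[F(x+h) - F(x)\bigr]\,dx. \]
By part (a), we have
\[
\begin{aligned}
\left| \int_{\mathbb{R}} F(x)\bigl[\varphi(x+h) - \varphi(x)\bigr]\,dx \right| &= \left| \int_{\mathbb{R}} \varphi(x+h)\bigl[F(x+h) - F(x)\bigr]\,dx \right| \\
&\le \int_{\mathbb{R}} |\varphi(x+h)|\,\bigl|F(x+h) - F(x)\bigr|\,dx \\
&\le \int_{\mathbb{R}} \bigl|F(x+h) - F(x)\bigr|\,dx \\
&\le A|h|.
\end{aligned}
\]
Hence
\[ \left| \int_{\mathbb{R}} F(x)\,\frac{\varphi(x+h) - \varphi(x)}{h}\,dx \right| \le A. \]
Now $\varphi$ is supported on some compact set $K$, so $\varphi'$ is a continuous function which is 0 outside $K$. Hence $|\varphi'|$ has a maximum $M$. By the Mean Value Theorem, any difference quotient of $\varphi$ is at most $M$ in absolute value. Then if $L$ is a bound for $|F|$, for $|h| \le 1$ the integrand $F(x)\frac{\varphi(x+h) - \varphi(x)}{h}$ is dominated by $ML\chi_{K'}$, where $K' = \{x : \operatorname{dist}(x, K) \le 1\}$ is compact. Hence we can use the Dominated Convergence Theorem as $h \to 0$ to obtain
\[ \left| \int_{\mathbb{R}} F(x)\varphi'(x)\,dx \right| \le A. \]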
Exercise 31: Let $F$ be the Cantor-Lebesgue function described in Section 3.1. Consider the curve that is the graph of $F$, that is, the curve given by $x(t) = t$ and $y(t) = F(t)$ with $0 \le t \le 1$. Prove that the length $L(\bar{x})$ of the segment $0 \le t \le \bar{x}$ of the curve is given by $L(\bar{x}) = \bar{x} + F(\bar{x})$. Hence the total length of the curve is 2.
Solution. It is true for any increasing function with $F(0) = 0$ that $L(\bar{x}) \le \bar{x} + F(\bar{x})$, because for any partition $0 = t_0 < t_1 < \cdots < t_n = \bar{x}$,
\[ \sum_{j=1}^{n} \sqrt{(t_j - t_{j-1})^2 + (F(t_j) - F(t_{j-1}))^2} \le \sum_{j=1}^{n} \bigl[(t_j - t_{j-1}) + (F(t_j) - F(t_{j-1}))\bigr] = \bar{x} + F(\bar{x}). \]
We wish to show that this upper bound is in fact the least upper bound when $F$ is the Cantor-Lebesgue function. Consider the iterates $F_n(x)$ of which this function is the limit. The interval $[0, 1]$ can be divided into $2^{n+1} - 1$ intervals on which $F_n(x)$ alternately increases and stays constant; suppose we label them $I_1, C_1, I_2, C_2, \ldots, C_{2^n - 1}, I_{2^n}$. The intervals $C_j$ have varying lengths, since they correspond to intervals that are deleted from the Cantor set at varying stages of the iteration; however, the $I_j$ all have length $\frac{1}{3^n}$ since they correspond to the intervals remaining in the $n$th iteration of the Cantor set. Hence the sum of the lengths of the $I_j$ is $\left(\frac{2}{3}\right)^n$, while the sum of the lengths of the $C_j$ is $1 - \left(\frac{2}{3}\right)^n$.
Now let $\bar{x} \in [0, 1]$, and consider the partition $P_n$ consisting of $\bar{x}$ together with all points less than or equal to $\bar{x}$ which are an endpoint of one of the $C_j$ or $I_j$. Thus we have $0 = t_0 < t_1 < \cdots < t_m = \bar{x}$ where $F_n$ is increasing on $[t_0, t_1]$, constant on $[t_1, t_2]$, increasing on $[t_2, t_3]$, etc. Note also that $F(t_j) = F_n(t_j)$ since all the $t_j$ are endpoints of the $C_k$ intervals, which remain fixed in all successive iterations. Then
\[
\begin{aligned}
\sum_{j=1}^{m} \sqrt{(t_j - t_{j-1})^2 + (F(t_j) - F(t_{j-1}))^2}
&= \sum_{\substack{j=1 \\ j \text{ odd}}}^{m} \sqrt{(t_j - t_{j-1})^2 + (F(t_j) - F(t_{j-1}))^2} + \sum_{\substack{j=1 \\ j \text{ even}}}^{m} \sqrt{(t_j - t_{j-1})^2 + (F(t_j) - F(t_{j-1}))^2} \\
&\ge \sum_{\substack{j=1 \\ j \text{ odd}}}^{m} (F(t_j) - F(t_{j-1})) + \sum_{\substack{j=1 \\ j \text{ even}}}^{m} (t_j - t_{j-1}) \\
&= F(\bar{x}) + \sum_k |C_k \cap [0, \bar{x}]| \\
&= F(\bar{x}) + \bar{x} - \sum_k |I_k \cap [0, \bar{x}]| \\
&\ge F(\bar{x}) + \bar{x} - \sum_k |I_k| \\
&= F(\bar{x}) + \bar{x} - \left(\frac{2}{3}\right)^n.
\end{aligned}
\]
Letting $n \to \infty$, this approaches $\bar{x} + F(\bar{x})$, which proves $L(\bar{x}) \ge \bar{x} + F(\bar{x})$. Since we already know $L(\bar{x}) \le \bar{x} + F(\bar{x})$, we have $L(\bar{x}) = \bar{x} + F(\bar{x})$ as desired.
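The lower bound can be watched converging. The sketch below evaluates the Cantor-Lebesgue function at the triadic points $k/3^n$ from their ternary digits and sums the polygonal length over that partition for $\bar{x} = 1$. The function names are illustrative, and using every triadic point (a refinement of the partition $P_n$ in the solution) is an assumption made for simplicity; on the constant pieces it changes nothing.

```python
import math

def cantor_at_triadic(k, n):
    """Cantor-Lebesgue function at x = k / 3**n, read off the n ternary digits of k."""
    if k == 3 ** n:
        return 1.0
    digits = []
    for _ in range(n):
        digits.append(k % 3)
        k //= 3
    digits.reverse()                      # most significant ternary digit first
    value, weight = 0.0, 0.5
    for d in digits:
        if d == 1:
            return value + weight         # constant on the removed interval from here on
        value += weight * (d // 2)        # ternary 2 becomes binary 1
        weight /= 2
    return value

def polygonal_length(n):
    """Length of the polygon through the graph points at x = k / 3**n, k = 0..3**n."""
    pts = [(k / 3 ** n, cantor_at_triadic(k, n)) for k in range(3 ** n + 1)]
    return sum(math.hypot(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

for n in range(1, 8):
    print(n, polygonal_length(n))
```

For this partition the length works out to $1 - (2/3)^n + \sqrt{1 + (4/9)^n}$, which increases to 2 and stays above the bound $2 - (2/3)^n$ derived in the solution.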
Exercise 32: Let $f : \mathbb{R} \to \mathbb{R}$. Prove that $f$ satisfies the Lipschitz condition
\[ |f(x) - f(y)| \le M|x - y| \]
for some $M$ and all $x, y \in \mathbb{R}$, if and only if $f$ satisfies the following two properties:
(i) $f$ is absolutely continuous.
(ii) $|f'(x)| \le M$ for a.e. $x$.
Solution. Suppose $f$ is Lipschitz. Then for any $\epsilon > 0$, if we let $\delta = \frac{\epsilon}{M}$, then $\sum |b_j - a_j| < \delta \Rightarrow \sum |f(b_j) - f(a_j)| < \epsilon$. Hence $f$ is absolutely continuous. This implies $f$ is differentiable a.e.; if $x$ is a point for which $f'(x)$ exists, the Lipschitz condition implies $\left|\frac{f(x+h) - f(x)}{h}\right| \le M$ for all $h$. Taking the limit as $h \to 0$ implies $|f'(x)| \le M$.
Conversely, suppose $f$ is absolutely continuous and $|f'| \le M$ a.e. Since absolutely continuous functions are the integrals of their derivatives,
\[ |f(y) - f(x)| = \left| \int_x^y f'(t)\,dt \right| \le \int_{\min(x,y)}^{\max(x,y)} |f'(t)|\,dt \le \int_{\min(x,y)}^{\max(x,y)} M\,dt = M|x - y| \]
so $f$ is Lipschitz.
Chapter 3.6, Page 152
Problem 4: A real-valued function $\varphi$ defined on an interval $(a, b)$ is convex if the region lying above its graph $\{(x, y) \in \mathbb{R}^2 : y > \varphi(x), a < x < b\}$ is a convex set. Equivalently, $\varphi$ is convex if
\[ \varphi(\theta x_1 + (1-\theta)x_2) \le \theta\varphi(x_1) + (1-\theta)\varphi(x_2) \]
for every $x_1, x_2 \in (a, b)$ and $0 \le \theta \le 1$. One can also observe as a consequence that we have the following inequality of the slopes:
\[ \frac{\varphi(x+h) - \varphi(x)}{h} \le \frac{\varphi(y) - \varphi(x)}{y - x} \le \frac{\varphi(y) - \varphi(y-h)}{h}, \tag{1} \]
whenever $x < y$, $h > 0$, and $x + h < y$. The following can then be proved.
(a) $\varphi$ is continuous on $(a, b)$.
(b) $\varphi$ satisfies a Lipschitz condition of order 1 in any proper closed sub-interval $[a', b']$ of $(a, b)$. Hence $\varphi$ is absolutely continuous in each sub-interval.
(c) $\varphi'$ exists at all but an at most denumerable number of points, and $\varphi' = D^+\varphi$ is an increasing function with
\[ \varphi(y) - \varphi(x) = \int_x^y \varphi'(t)\,dt. \]
(d) Conversely, if $\psi$ is any increasing function on $(a, b)$, then $\varphi(x) = \int_c^x \psi(t)\,dt$ is a convex function in $(a, b)$ for $c \in (a, b)$.
Solution.
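(a) Suppose to the contrary that $\varphi$ is discontinuous at some $x \in (a, b)$. This means there exists $\epsilon > 0$ and an infinite sequence $x_n \to x$ with $x_n \in (a, b)$ and $|\varphi(x_n) - \varphi(x)| > \epsilon$ for all $n$. Since the sequence $\{x_n\}$ is infinite, it must have infinitely many points in one of the following categories:
(i) $\varphi(x_n) > \varphi(x) + \epsilon$
(ii) $\varphi(x_n) < \varphi(x) - \epsilon$
We will treat each case separately and obtain a contradiction.
(i) Assume WLOG that the entire sequence $\{x_n\}$ is in this category (otherwise, take a subsequence that is). We may also assume $x_n$ converges to $x$ monotonically, since otherwise we can again take a subsequence. Let $L(\lambda) = (1-\lambda)\varphi(x) + \lambda\varphi(x_1)$ for $0 \le \lambda \le 1$. This is a continuous function of $\lambda$ with $L(0) = \varphi(x)$, so there exists $\delta > 0$ such that $|\lambda| < \delta \Rightarrow L(\lambda) < \varphi(x) + \epsilon$. Now let $\lambda_n = \frac{x_n - x}{x_1 - x}$. Then $0 \le \lambda_n \le 1$ since $x_n \to x$ monotonically. Note also that $x_n = (1-\lambda_n)x + \lambda_n x_1$. Now $\lambda_n \to 0$ as $n \to \infty$, so $\lambda_n < \delta$ for $n$ sufficiently large. But this implies
\[ \varphi(x_n) = \varphi((1-\lambda_n)x + \lambda_n x_1) \le L(\lambda_n) < \varphi(x) + \epsilon \]
for sufficiently large $n$, a contradiction.
(ii) Again, we assume WLOG that $x_n \to x$ monotonically. Let $y \in (a, b)$ such that $x$ is between $y$ and $x_n$ for all $n$. Then $\lambda_n = \frac{x - y}{x_n - y}$ has the properties that $0 \le \lambda_n \le 1$ and $\lambda_n \to 1$ as $n \to \infty$. Now $x = \lambda_n x_n + (1-\lambda_n)y$, so
\[ \varphi(x) \le \lambda_n\varphi(x_n) + (1-\lambda_n)\varphi(y) \]
for all $n$. Since $\varphi(x_n) < \varphi(x) - \epsilon$, the right-hand side is less than $\lambda_n(\varphi(x) - \epsilon) + (1-\lambda_n)\varphi(y)$, which tends to $\varphi(x) - \epsilon$ because $\lambda_n \to 1$. Letting $n \to \infty$ gives $\varphi(x) \le \varphi(x) - \epsilon$, a contradiction.
Hence $\varphi$ is continuous.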
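(b) First, I prove an inequality of slopes that I like better than the one given. I claim that for $s < t < u$ with $s, t, u \in (a, b)$,
\[ \frac{\varphi(t) - \varphi(s)}{t - s} \le \frac{\varphi(u) - \varphi(s)}{u - s} \le \frac{\varphi(u) - \varphi(t)}{u - t}. \tag{2} \]
This follows straightforwardly from the convexity condition:
\[
\begin{aligned}
t = \frac{u-t}{u-s}\,s + \frac{t-s}{u-s}\,u
&\Rightarrow \varphi(t) \le \frac{u-t}{u-s}\,\varphi(s) + \frac{t-s}{u-s}\,\varphi(u) \\
&\Rightarrow (u-s)\varphi(t) \le (u-t)\varphi(s) + (t-s)\varphi(u) \qquad (3) \\
&\Rightarrow (u-s)\varphi(t) - (u-s)\varphi(s) \le (s-t)\varphi(s) + (t-s)\varphi(u) \\
&\Rightarrow \frac{\varphi(t) - \varphi(s)}{t - s} \le \frac{\varphi(u) - \varphi(s)}{u - s}.
\end{aligned}
\]
Taking a different route from inequality (3) leads to
\[
\begin{aligned}
&(u-t)\varphi(u) - (u-t)\varphi(s) \le (u-s)\varphi(u) - (u-s)\varphi(t) \\
&\Rightarrow \frac{\varphi(u) - \varphi(s)}{u - s} \le \frac{\varphi(u) - \varphi(t)}{u - t}
\end{aligned}
\]
as desired. Given this inequality of slopes, we can easily prove that $\varphi$ is Lipschitz on $[a', b']$. Choose $h > 0$ such that $[a' - h, b' + h] \subset (a, b)$. Then for $x, y \in [a', b']$, suppose WLOG that $x < y$; since $a' - h < a' \le x < y \le b' < b' + h$, the slope inequality yields
\[ \frac{\varphi(a') - \varphi(a'-h)}{h} \le \frac{\varphi(y) - \varphi(a'-h)}{y - a' + h} \le \frac{\varphi(y) - \varphi(x)}{y - x} \le \frac{\varphi(b'+h) - \varphi(x)}{b' + h - x} \le \frac{\varphi(b'+h) - \varphi(b')}{h}. \]
The leftmost and rightmost terms above are constants, which we may call $m$ and $M$; we thus have $m(y - x) \le \varphi(y) - \varphi(x) \le M(y - x)$, so $|\varphi(y) - \varphi(x)| \le \max(|m|, |M|)\,|y - x|$, whence $\varphi$ is Lipschitz on $[a', b']$.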
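Inequality (2) is easy to spot-check on sample functions. The following sketch tests every triple $s < t < u$ from a small grid; the helper names, the grid, and the floating-point tolerance `tol` are assumptions made for the illustration. Two convex functions and one non-convex function are tried for contrast.

```python
import math

def slope(phi, p, q):
    """Slope of the chord of phi between p and q."""
    return (phi(q) - phi(p)) / (q - p)

def check_three_slopes(phi, points, tol=1e-12):
    """Return True if inequality (2) holds for every triple s < t < u drawn from points."""
    pts = sorted(points)
    for i, s in enumerate(pts):
        for j in range(i + 1, len(pts)):
            for k in range(j + 1, len(pts)):
                t, u = pts[j], pts[k]
                left, middle, right = slope(phi, s, t), slope(phi, s, u), slope(phi, t, u)
                if not (left <= middle + tol and middle <= right + tol):
                    return False
    return True

grid = [-2 + 0.25 * i for i in range(17)]          # sample points from -2 to 2
print(check_three_slopes(math.exp, grid))          # exp is convex
print(check_three_slopes(lambda x: x * x, grid))   # x^2 is convex
print(check_three_slopes(math.sin, grid))          # sin is not convex on this range
```

A passing check on a finite grid is of course only evidence, not a proof of convexity, but a single failing triple (as with the sine) does disprove it.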
(c) Since $\varphi$ is Lipschitz on any closed subinterval $[x, y] \subset (a, b)$, $\varphi$ is absolutely continuous on $[x, y]$ by Exercise 32 above; hence $\varphi(y) - \varphi(x) = \int_x^y \varphi'(t)\,dt$.
Now inequality (2) implies that $\frac{\varphi(x+h) - \varphi(x)}{h}$ is an increasing function of $h$ at any $x \in (a, b)$. This implies that $D^+\varphi = D_+\varphi$ and $D^-\varphi = D_-\varphi$, that $|D^+\varphi|, |D^-\varphi| < \infty$, and that $D^+\varphi \ge D^-\varphi$. The inequality (1) tells us that $x < y \Rightarrow D^+\varphi(x) \le D^-\varphi(y)$. This in turn implies that $D^+\varphi$ and $D^-\varphi$ are increasing. To show that $D^+\varphi = D^-\varphi$ except at countably many points, let $\{x_\alpha\}$ be those points in $(a, b)$ for which this is not true, and define $j_\alpha > 0$ by $j_\alpha = D^+(\varphi)(x_\alpha) - D^-(\varphi)(x_\alpha)$. Then on any subinterval $[a', b']$, if $a' < x_1 < \cdots < x_n < b'$, we have
\[
\begin{aligned}
\sum_{k=1}^{n} \Bigl( D^+(\varphi)(x_k) - D^-(\varphi)(x_k) \Bigr)
&\le \sum_{k=1}^{n} \Bigl( D^+(\varphi)(x_k) - D^-(\varphi)(x_k) \Bigr) + \sum_{k=1}^{n+1} \Bigl( D^-(\varphi)(x_k) - D^+(\varphi)(x_{k-1}) \Bigr) \\
&= D^-(\varphi)(b') - D^+(\varphi)(a'),
\end{aligned}
\]
where we use the convention $x_0 = a'$ and $x_{n+1} = b'$; the added terms are nonnegative because $x_{k-1} < x_k$. This implies that
\[ \sum_{x_\alpha \in (a', b')} \Bigl( D^+(\varphi)(x_\alpha) - D^-(\varphi)(x_\alpha) \Bigr) \]
is finite, because all finite sub-sums are bounded by the finite constant $D^-(\varphi)(b') - D^+(\varphi)(a')$. So this sum can contain only countably many nonzero terms, which means only countably many points in $[a', b']$ can have $D^-(\varphi)(x) \neq D^+(\varphi)(x)$. Since $(a, b)$ is a countable union of closed subintervals (e.g. $[a + \frac{1}{n}, b - \frac{1}{n}]$), it can contain only countably many points for which $D^+\varphi \neq D^-\varphi$. Everywhere else, the derivative exists.
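(d) Because $\psi$ is increasing, for $x_1 < x_2$ and $0 \le \lambda \le 1$,
\begin{align*}
\varphi\bigl(\lambda x_1 + (1-\lambda)x_2\bigr)
&= \int_c^{\lambda x_1+(1-\lambda)x_2} \psi(t)\,dt \\
&= \int_c^{x_2} \psi(t)\,dt - \int_{\lambda x_1+(1-\lambda)x_2}^{x_2} \psi(t)\,dt \\
&= \int_c^{x_2} \psi(t)\,dt - \lambda\int_{\lambda x_1+(1-\lambda)x_2}^{x_2} \psi(t)\,dt - (1-\lambda)\int_{\lambda x_1+(1-\lambda)x_2}^{x_2} \psi(t)\,dt \\
&\le \int_c^{x_2} \psi(t)\,dt - \lambda\int_{\lambda x_1+(1-\lambda)x_2}^{x_2} \psi(t)\,dt - \lambda(1-\lambda)(x_2-x_1)\,\psi\bigl(\lambda x_1+(1-\lambda)x_2\bigr) \\
&\le \int_c^{x_2} \psi(t)\,dt - \lambda\int_{\lambda x_1+(1-\lambda)x_2}^{x_2} \psi(t)\,dt - \lambda\int_{x_1}^{\lambda x_1+(1-\lambda)x_2} \psi(t)\,dt \\
&= \int_c^{x_2} \psi(t)\,dt - \lambda\int_{x_1}^{x_2} \psi(t)\,dt \\
&= \lambda\int_c^{x_1} \psi(t)\,dt + (1-\lambda)\int_c^{x_2} \psi(t)\,dt \\
&= \lambda\varphi(x_1) + (1-\lambda)\varphi(x_2).
\end{align*}
So $\varphi$ is convex.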

Chapter 4.7, Page 193


Exercise 4: Prove from the definition that $\ell^2(\mathbb{Z})$ is complete and separable.
Solution. The proof that $\ell^2$ is complete is exactly the same as the proof that $L^2$ is complete, from pp. 159-160 of the textbook. Let $\{a^{(m)}\}_{m=1}^\infty$ be a Cauchy sequence in $\ell^2(\mathbb{Z})$. For each $k \ge 1$ we can choose $n_k$ such that $m, n \ge n_k \Rightarrow \|a^{(m)} - a^{(n)}\| < \frac{1}{2^k}$ and $n_k < n_{k+1}$. Then the subsequence $\{a^{(n_k)}\}$ has the property that $\|a^{(n_{k+1})} - a^{(n_k)}\| \le \frac{1}{2^k}$. Define sequences $a = \{a_j\}$ and $b = \{b_j\}$ by
\[
a_j = a_j^{(n_1)} + \sum_{k=1}^\infty \bigl( a_j^{(n_{k+1})} - a_j^{(n_k)} \bigr)
\quad\text{and}\quad
b_j = |a_j^{(n_1)}| + \sum_{k=1}^\infty \bigl| a_j^{(n_{k+1})} - a_j^{(n_k)} \bigr|
\]
and the partial sums
\[
S_j^{(a,K)} = a_j^{(n_1)} + \sum_{k=1}^K \bigl( a_j^{(n_{k+1})} - a_j^{(n_k)} \bigr)
\quad\text{and}\quad
S_j^{(b,K)} = |a_j^{(n_1)}| + \sum_{k=1}^K \bigl| a_j^{(n_{k+1})} - a_j^{(n_k)} \bigr|.
\]
Then
\[
\|S^{(b,K)}\| \le \|a^{(n_1)}\| + \sum_{k=1}^K \frac{1}{2^k}
\]
by the triangle inequality; letting $K \to \infty$, $\|b\|$ converges by the monotone convergence theorem (for sums, but hey, sums are just integrals with discrete measures), so $\|a\|$ converges since it converges absolutely. Of course, this implies that $b_j$ and hence $a_j$ converges for each $j$; since the partial sums are $S^{(a,K)} = a^{(n_{K+1})}$ by construction, $a_j^{(n_{K+1})} \to a_j$ for all $j$. Moreover $|a_j - S_j^{(a,K)}|^2 \le (2b_j)^2$ with $b \in \ell^2$, so dominated convergence gives $\|a^{(n_{K+1})} - a\| \to 0$. Now given $\epsilon > 0$, choose $N$ such that $\|a^{(n)} - a^{(m)}\| < \frac{\epsilon}{2}$ for $n, m > N$, and let $n_K > N$ such that $\|a^{(n_K)} - a\| < \frac{\epsilon}{2}$. Then
\[
m > N \Rightarrow \|a^{(m)} - a\| \le \|a^{(m)} - a^{(n_K)}\| + \|a^{(n_K)} - a\| < \epsilon.
\]
Hence $a^{(m)} \to a$.
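To prove that $\ell^2$ is separable, consider the subset $T$ consisting of all rational sequences which are 0 except at finitely many values. (For complex sequences, "rational" means rational real and imaginary parts.) This is countable because
\begin{align*}
T &= \{\text{rational sequences of finite length}\} \\
&= \bigcup_{N=1}^\infty \{\text{sequences } \{a_n\} \text{ with } a_n \in \mathbb{Q} \text{ and } a_n = 0 \text{ for } |n| > N\} \\
&= \bigcup_{N=1}^\infty \mathbb{Q}^{2N+1}
\end{align*}
is a countable union of countable sets. Now let $\{b_n\} \in \ell^2$ be any square summable sequence. Given $\epsilon > 0$, there exists $N$ such that $\sum_{|n|>N} |b_n|^2 < \frac{\epsilon^2}{2}$ since the infinite sum converges. Then for each $j = -N, \dots, N$, we can choose a rational number $q_j$ with $|q_j - b_j|^2 < \frac{\epsilon^2}{2^{2+N+j}}$. If we also define $q_j = 0$ for $|j| > N$, then $q \in T$ and
\[
\|q - b\|^2 = \sum_{j=-\infty}^{\infty} |q_j - b_j|^2
= \sum_{|j|>N} |b_j - 0|^2 + \sum_{j=-N}^{N} |q_j - b_j|^2
< \frac{\epsilon^2}{2} + \sum_{s=0}^{2N} \frac{\epsilon^2}{2^{2+s}} < \epsilon^2.
\]
This shows that $T$ is dense.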
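The density argument is constructive, so it can be mirrored in a few lines of code. The sketch below is only an illustration under stated assumptions: the sample sequence $b_n = \sqrt{2}/(1+|n|)$, the explicit tail bound $4/(1+N)$, the use of dyadic rationals, and the names `b` and `rational_approximation` are all choices made here, not part of the solution.

```python
import math
from fractions import Fraction

def b(n):
    """Assumed example: b_n = sqrt(2) / (1 + |n|), square summable over Z."""
    return math.sqrt(2) / (1 + abs(n))

def rational_approximation(eps):
    """Return (N, q) with q_j rational for |j| <= N and q_j = 0 otherwise."""
    # Tail bound: sum_{|n|>N} b_n^2 = 4 * sum_{n>N} (1+n)^-2 <= 4 / (1+N).
    N = math.ceil(8 / eps**2)  # guarantees 4/(1+N) < eps^2 / 2
    q = {}
    for j in range(-N, N + 1):
        # The proof asks for |q_j - b_j|^2 < eps^2 / 2^(2+N+j).
        tol = eps / 2 ** ((2 + N + j) / 2)
        bits = math.ceil(-math.log2(tol)) + 1
        # Rounding to a multiple of 2^-bits has error at most 2^-(bits+1) < tol.
        q[j] = Fraction(round(b(j) * 2**bits), 2**bits)
    return N, q

eps = 0.5
N, q = rational_approximation(eps)
head = sum((float(q[j]) - b(j)) ** 2 for j in range(-N, N + 1))
tail_bound = 4 / (1 + N)
dist_sq_bound = head + tail_bound  # upper bound for ||q - b||^2
```

The quantity `dist_sq_bound` plays the role of the final estimate in the proof: a head term controlled by the geometric tolerances plus a tail term controlled by the choice of $N$.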
Exercise 5: Establish the following relations between $L^2(\mathbb{R}^d)$ and $L^1(\mathbb{R}^d)$:
(a) Neither the inclusion $L^2(\mathbb{R}^d) \subset L^1(\mathbb{R}^d)$ nor the inclusion $L^1(\mathbb{R}^d) \subset L^2(\mathbb{R}^d)$ is valid.
(b) Note, however, that if $f$ is supported on a set $E$ of finite measure and if $f \in L^2(\mathbb{R}^d)$, applying the Cauchy-Schwarz inequality to $f\chi_E$ gives $f \in L^1(\mathbb{R}^d)$, and $\|f\|_1 \le m(E)^{1/2}\|f\|_2$.
(c) If $f$ is bounded ($|f(x)| \le M$), and $f \in L^1(\mathbb{R}^d)$, then $f \in L^2(\mathbb{R}^d)$ with $\|f\|_2 \le M^{1/2}\|f\|_1^{1/2}$.
Solution.
(a) Let $f(x) = \chi_{|x| \le 1}\frac{1}{|x|^{d/2}}$ and $g(x) = \chi_{|x| \ge 1}\frac{1}{|x|^d}$. Then Exercise 10 of Chapter 2 shows that $f$ and $g^2$ are integrable, but $f^2$ and $g$ are not.
(b) Applying the Cauchy-Schwarz inequality to the inner product of $f\chi_E$ and $\chi_E$,
\[
\|f\|_1 = \|f\chi_E\|_1 = \int |f\chi_E| \le \left( \int |f|^2 \right)^{1/2} \left( \int \chi_E^2 \right)^{1/2} = m(E)^{1/2}\|f\|_2.
\]
(c) Since $|f| \le M$,
\[
|f|^2 \le M|f| \Rightarrow \|f\|_2^2 = \int |f|^2 \le M\int |f| = M\|f\|_1 \Rightarrow \|f\|_2 \le M^{1/2}\|f\|_1^{1/2}.
\]

Exercise 6: Prove that the following are dense subspaces of $L^2(\mathbb{R}^d)$:
(a) The simple functions.
(b) The continuous functions of compact support.
Solution. (The argument works in $L^p$ for any $1 \le p < \infty$; take $p = 2$.)
(a) It is sufficient to treat the case of nonnegative $f$, since every complex $L^2$ function is a linear combination of nonnegative $L^2$ functions. We know there exists a sequence of simple functions $s_n \nearrow f$ with $0 \le s_n \le f$. Then $|f - s_n|^p \le |f|^p$ so by the Dominated Convergence Theorem, $\int |f - s_n|^p \to 0$. Hence $s_n \to f$ in $L^p$. Therefore the simple functions are dense.
(b) Let $s \in L^p(\mathbb{R}^d)$ be a simple function. By part (a), it is sufficient to find $g \in C_C(\mathbb{R}^d)$ with $\|g - s\|_p < \epsilon$. If $s = 0$, $s \in C_C(\mathbb{R}^d)$ and we're done. Otherwise, since $s$ is simple, $0 < \|s\|_\infty < \infty$; since it is in $L^p$, it must be supported on a set $E$ of finite measure. Now Lusin's theorem enables us to construct $g \in C_C(\mathbb{R}^d)$ with $m(\{g \ne s\}) < \left( \frac{\epsilon}{2\|s\|_\infty} \right)^p$ and $\sup|g| \le \sup|s| < \infty$. (To do this, construct $g$ from Lusin's theorem; then if $g \ge \|s\|_\infty$ on the closed set $F$, change $g$ to $\|s\|_\infty$ on $F$. See Rudin, Real and Complex Analysis pp. 55-56.) Then $|g - s| \le 2\|s\|_\infty$ and is nonzero on a set of measure less than $\left( \frac{\epsilon}{2\|s\|_\infty} \right)^p$, so
\[
\int |g - s|^p < 2^p\|s\|_\infty^p \left( \frac{\epsilon}{2\|s\|_\infty} \right)^p = \epsilon^p \Rightarrow \|g - s\|_p < \epsilon.
\]

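Exercise 7: Suppose $\{\varphi_k\}_{k=1}^\infty$ is an orthonormal basis for $L^2(\mathbb{R}^d)$. Prove that the collection $\{\varphi_{k,j}\}_{1 \le k,j < \infty}$ with $\varphi_{k,j}(x, y) = \varphi_k(x)\varphi_j(y)$ is an orthonormal basis of $L^2(\mathbb{R}^d \times \mathbb{R}^d)$.
Solution. First, note that $\varphi_{k,j}$ is indeed in $L^2(\mathbb{R}^d \times \mathbb{R}^d)$, since
\[
\int_{\mathbb{R}^{2d}} |\varphi_{k,j}|^2 = \int_{\mathbb{R}^{2d}} |\varphi_k(x)|^2|\varphi_j(y)|^2\,dx\,dy = \left( \int_{\mathbb{R}^d} |\varphi_k(x)|^2\,dx \right)\left( \int_{\mathbb{R}^d} |\varphi_j(y)|^2\,dy \right) = 1
\]
by Fubini's Theorem. Also,
\begin{align*}
\langle \varphi_{j,k}, \varphi_{\ell,m} \rangle
&= \int_{\mathbb{R}^{2d}} \varphi_{j,k}\overline{\varphi_{\ell,m}}
= \int_{\mathbb{R}^{2d}} \varphi_j(x)\varphi_k(y)\overline{\varphi_\ell(x)\varphi_m(y)}\,dx\,dy \\
&= \left( \int_{\mathbb{R}^d} \varphi_j(x)\overline{\varphi_\ell(x)}\,dx \right)\left( \int_{\mathbb{R}^d} \varphi_k(y)\overline{\varphi_m(y)}\,dy \right)
= \delta_j^\ell\,\delta_k^m
\end{align*}
so $\{\varphi_{j,k}\}$ is an orthonormal set. To show that linear combinations of $\varphi_{j,k}$ are dense in $L^2(\mathbb{R}^{2d})$, one approach would be to blow this problem out of the water with the Stone-Weierstrass theorem. We know $C_C(\mathbb{R}^{2d})$ is dense in $L^2(\mathbb{R}^{2d})$, and the Stone-Weierstrass theorem tells us that linear combinations of separable continuous functions (i.e. functions of the form $\sum_{i=1}^m f_i(x)g_i(y)$) are dense in $C_C(\mathbb{R}^{2d})$. It is easy to verify that functions of this form can be approximated by the $\varphi_{j,k}$, so we're done.
Alternatively, we can follow the approach given in the hint. Let $f \in L^2(\mathbb{R}^{2d})$ and suppose that $\langle f, \varphi_{j,k} \rangle = 0$ for all $j$ and $k$. By Fubini's Theorem,
\begin{align*}
0 &= \int_{\mathbb{R}^{2d}} f(x, y)\overline{\varphi_{j,k}(x, y)}
= \int_{\mathbb{R}^d} \left( \int_{\mathbb{R}^d} f(x, y)\overline{\varphi_{j,k}(x, y)}\,dy \right) dx \\
&= \int_{\mathbb{R}^d} \left( \int_{\mathbb{R}^d} f(x, y)\overline{\varphi_k(y)}\,dy \right)\overline{\varphi_j(x)}\,dx.
\end{align*}
Hence, if we define
\[
f_k(x) = \int_{\mathbb{R}^d} f(x, y)\overline{\varphi_k(y)}\,dy,
\]
we see that
\[
\int_{\mathbb{R}^d} f_k(x)\overline{\varphi_j(x)}\,dx = 0
\]
for all $j$. Because $\{\varphi_j\}$ is an orthonormal basis, this implies that $f_k(x) = 0$ for a.e. $x$, for all $k$. Because $\{\varphi_k\}$ is an orthonormal basis, this in turn implies that $f(x, y) = 0$ almost everywhere. Since $f \perp \varphi_{j,k}$ for all $j, k$ implies $f = 0$, $\{\varphi_{j,k}\}$ is an orthonormal basis.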
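Exercise 9: Let $H_1 = L^2([-\pi, \pi])$ be the Hilbert space of functions $F(e^{i\theta})$ on the unit circle with inner product $(F, G) = \frac{1}{2\pi}\int_{-\pi}^{\pi} F(e^{i\theta})\overline{G(e^{i\theta})}\,d\theta$. Let $H_2$ be the space $L^2(\mathbb{R})$. Using the mapping
\[
x \mapsto \frac{i - x}{i + x}
\]
of $\mathbb{R}$ to the unit circle, show that:
(a) The correspondence $U : F \to f$, with
\[
f(x) = \frac{1}{\pi^{1/2}(i + x)}F\left( \frac{i - x}{i + x} \right)
\]
gives a unitary mapping of $H_1$ to $H_2$.
(b) As a result,
\[
\left\{ \frac{1}{\pi^{1/2}}\left( \frac{i - x}{i + x} \right)^n \frac{1}{i + x} \right\}_{n=-\infty}^{\infty}
\]
is an orthonormal basis of $L^2(\mathbb{R})$.
Solution.
(a) If we define $\theta = 2\tan^{-1}(x)$, then $x = \tan\left( \frac{\theta}{2} \right)$, $\frac{i-x}{i+x} = e^{i\theta}$, $1 + x^2 = \sec^2\left( \frac{\theta}{2} \right)$, and $dx = \frac{1}{2}\sec^2\left( \frac{\theta}{2} \right)d\theta$. (Brings back memories of high school calculus, don't it?) Then
\begin{align*}
\int_{\mathbb{R}} |f(x)|^2\,dx
&= \int_{\mathbb{R}} \frac{1}{\pi}\frac{1}{|i + x|^2}\left| F\left( \frac{i - x}{i + x} \right) \right|^2 dx \\
&= \int_{\mathbb{R}} \frac{1}{\pi}\frac{1}{x^2 + 1}\left| F\left( \frac{i - x}{i + x} \right) \right|^2 dx \\
&= \int_{-\pi}^{\pi} \frac{1}{\pi}\frac{1}{\sec^2(\theta/2)}|F(e^{i\theta})|^2\,\frac{1}{2}\sec^2\left( \frac{\theta}{2} \right)d\theta \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} |F(e^{i\theta})|^2\,d\theta
\end{align*}
so $\|f\|_{H_2} = \|F\|_{H_1}$. The same change of variables shows that $F(e^{i\theta}) = \pi^{1/2}\bigl(i + \tan(\theta/2)\bigr)f\bigl(\tan(\theta/2)\bigr)$ inverts $U$, so $U$ is also surjective. So $U$ is unitary.
(b) By the Riesz-Fischer theorem, $\{e^{in\theta}\}$ is an orthonormal basis for $L^2(\mathbb{T})$. Because $U$ is unitary,
\[
\{U(e^{in\theta})\} = \left\{ \frac{1}{\sqrt{\pi}}\left( \frac{i - x}{i + x} \right)^n \frac{1}{i + x} \right\}
\]
is an orthonormal basis for $L^2(\mathbb{R})$.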

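Exercise 10: Let $S$ denote a subspace of a Hilbert space $H$. Prove that $(S^\perp)^\perp$ is the smallest closed subspace of $H$ that contains $S$.
Solution. Let
\[
\bar{S} = \bigcap_{\substack{V \supset S \text{ subspace} \\ V \text{ closed}}} V.
\]
Then $\bar{S}$ is a closed subspace, because the intersection of closed sets is closed and the intersection of subspaces is a subspace. It is obviously the smallest closed subspace containing $S$. We want to show that $\bar{S} = (S^\perp)^\perp$. Clearly $\bar{S} \subset (S^\perp)^\perp$ since the latter is a closed subspace containing $S$. To establish the reverse inclusion, we first show that $(\bar{S})^\perp = S^\perp$. Clearly $(\bar{S})^\perp \subset S^\perp$ since $S \subset \bar{S}$. To show the opposite inclusion, let $x \in S^\perp$. Then for any $w \in \bar{S}$, there is a sequence $w_n \in S$ with $w_n \to w$. Because inner products are continuous, $0 = \langle w_n, x \rangle \to \langle w, x \rangle$ so $x \perp w$ for any $w \in \bar{S}$. This proves $S^\perp \subset (\bar{S})^\perp$. Since they're equal, we can use Proposition 4.2 to write
\[
H = \bar{S} \oplus S^\perp.
\]
Now let $x \in (S^\perp)^\perp$. Then we can write $x = v + w$ where $v \in \bar{S}$ and $w \in S^\perp$. Then
\[
\langle x, w \rangle = \langle v, w \rangle + \langle w, w \rangle = \langle w, w \rangle.
\]
But $\langle x, w \rangle = 0$ because $w \in S^\perp$ and $x \in (S^\perp)^\perp$. Hence $w = 0$ and $x \in \bar{S}$.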
Exercise 11: Let $P$ be the orthogonal projection associated with a closed subspace $S$ in a Hilbert space $H$, that is,
\[
P(f) = f \text{ if } f \in S \quad\text{and}\quad P(f) = 0 \text{ if } f \in S^\perp.
\]
(a) Show that $P^2 = P$ and $P^* = P$.
(b) Conversely, if $P$ is any bounded operator satisfying $P^2 = P$ and $P^* = P$, prove that $P$ is the orthogonal projection for some closed subspace of $H$.
(c) Using $P$, prove that if $S$ is a closed subspace of a separable Hilbert space, then $S$ is also a separable Hilbert space.
Solution.
(a) Let $x \in H$ and write $x = x_S + x_{S^\perp}$. Then
\[
P^2(x) = P(P(x)) = P(x_S) = x_S = P(x)
\]
so $P^2 = P$. Moreover, if $y = y_S + y_{S^\perp}$ is any other vector in $H$, then
\[
\langle Px, y \rangle = \langle x_S, y_S + y_{S^\perp} \rangle = \langle x_S, y_S \rangle = \langle x_S + x_{S^\perp}, y_S \rangle = \langle x, Py \rangle
\]
so $P = P^*$.
(b) Let $S = \operatorname{im}(P)$, which is a subspace of $H$. Note that $Pv = v$ for every $v \in S$, since $v = Pu$ gives $Pv = P^2u = Pu = v$. To show $S$ is closed, suppose $x_n \in S$ and $x_n \to x$. Then because $P$ is bounded, it's continuous, so $Px_n \to Px$. But $Px_n = x_n$, so $x_n \to Px$ which implies $Px = x$. Hence $x \in S$, so $S$ is closed. Also, if $w \in S^\perp$, then for all $v \in S$,
\[
\langle v, Pw \rangle = \langle Pv, w \rangle = \langle v, w \rangle = 0
\]
so $Pw \in S^\perp$. But $Pw \in S$, so $Pw = 0$. Now using Proposition 4.2, if $y$ is any vector in $H$, then
\[
y = y_S + y_{S^\perp}.
\]
Then by linearity, $P(y) = P(y_S) + P(y_{S^\perp}) = y_S + 0 = y_S$. So $P$ does the same thing to $y$ as orthogonal projection onto $S$, for any $y \in H$.
(c) Let $H$ be a separable Hilbert space, $\{\varphi_n\}$ a countable dense set, and $S$ a closed subspace. I claim that $\{P_S\varphi_n\}$ is dense in $S$. For any $x \in S$, we can find a sequence $\varphi_k \to x$. Then $P_S(\varphi_k - x) = P_S\varphi_k - x$ since $P_Sx = x$. Since projections do not increase length,
\[
\|P_S\varphi_k - x\| \le \|\varphi_k - x\| \Rightarrow (P_S\varphi_k) \to x.
\]
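For a concrete finite-dimensional instance of part (a), one can check the two identities on a projection matrix. This is a minimal sketch under assumptions made here: the space is $\mathbb{R}^3$ (so the adjoint is the transpose), the subspace is the line spanned by $v = (1, 2, 2)$, and the helper names are hypothetical.

```python
from fractions import Fraction

def projection_onto_line(v):
    """Matrix of the orthogonal projection of R^n onto span{v}: v v^T / (v . v)."""
    norm_sq = sum(Fraction(c) * c for c in v)
    return [[Fraction(a) * b / norm_sq for b in v] for a in v]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def apply(A, x):
    return [sum(a * c for a, c in zip(row, x)) for row in A]

P = projection_onto_line([1, 2, 2])
idempotent = matmul(P, P) == P                       # P^2 = P
self_adjoint = transpose(P) == P                     # P^* = P (real case)
fixes_v = apply(P, [1, 2, 2]) == [1, 2, 2]           # P f = f on S
kills_orthogonal = apply(P, [2, -1, 0]) == [0, 0, 0]  # P f = 0 on S-perp
```

Exact rational arithmetic is used so that the equalities can be tested without floating-point tolerance.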

Exercise 12: Let $E$ be a measurable subset of $\mathbb{R}^d$, and suppose $S$ is the subspace of $L^2(\mathbb{R}^d)$ of functions that vanish for a.e. $x \notin E$. Show that the orthogonal projection $P$ on $S$ is given by $P(f) = \chi_E f$, where $\chi_E$ is the characteristic function of $E$.
Solution. Define a linear operator $T : L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d)$ by $T(f) = \chi_E f$. Then $T^2(f) = \chi_E^2 f = \chi_E f = T(f)$. Moreover,
\[
\langle Tf, g \rangle = \int_{\mathbb{R}^d} \chi_E f\,\overline{g} = \int_{\mathbb{R}^d} f\,\overline{\chi_E g} = \langle f, Tg \rangle
\]
so $T^* = T$. We also note that $T$ is bounded since $|Tf(x)| \le |f(x)|$ for all $x$, so $\|Tf\| \le \|f\|$. By problem 11(b), $T$ is the orthogonal projection onto its image. But $\operatorname{im}(T)$ is precisely those functions which are 0 a.e. on $E^c$. Hence $T$ is the desired projection.
Exercise 13: Suppose $P_1$ and $P_2$ are a pair of orthogonal projections on $S_1$ and $S_2$, respectively. Then $P_1P_2$ is an orthogonal projection if and only if $P_1$ and $P_2$ commute, that is, $P_1P_2 = P_2P_1$. In this case, $P_1P_2$ projects onto $S_1 \cap S_2$.
Solution. Suppose $P_1P_2$ is an orthogonal projection. Then
\[
P_2P_1 = P_2^*P_1^* = (P_1P_2)^* = P_1P_2
\]
so they commute. On the other hand, suppose they commute; then
\[
(P_1P_2)^2 = P_1^2P_2^2 = P_1P_2
\]
and
\[
(P_1P_2)^* = P_2^*P_1^* = P_2P_1 = P_1P_2
\]
so $P_1P_2$ is an orthogonal projection. The image of $P_1P_2$ is a subspace of $S_1$ because $P_1P_2v = P_1(P_2v) \in S_1$ for any $v$; similarly, it's a subspace of $S_2$ because $P_1P_2v = P_2(P_1v) \in S_2$. Hence the image of $P_1P_2$ is a subspace of $S_1 \cap S_2$. But every vector in $S_1 \cap S_2$ is fixed by both $P_1$ and $P_2$, and hence by $P_1P_2$. Thus, the image of $P_1P_2$ is precisely $S_1 \cap S_2$.
Exercise 14: Suppose $H$ and $H'$ are two completions of a pre-Hilbert space $H_0$. Show that there is a unitary mapping from $H$ to $H'$ that is the identity on $H_0$.
Solution. Define $U : H \to H'$ as follows: Given $x \in H$, choose a sequence $f_n \to x$ with $f_n \in H_0$. Define $U(x) = \lim f_n$ in $H'$. (This limit exists because $\{f_n\}$ is Cauchy.) To show this is well-defined, suppose $g_n \in H_0$ is another sequence converging to $x$. Then $f_n - g_n \to 0$ in both spaces (because $f_n - g_n \in H_0$ for all $n$), so they have the same limit in $H'$ as well. This shows that $U$ is well-defined. Clearly it is the identity on $H_0$. It is linear by the linearity of limits, and the same construction with the roles of $H$ and $H'$ swapped gives its inverse. To show that it's unitary, we need only use the continuity of the norm: Suppose $f_n \to x$ in $H$ and $f_n \to U(x)$ in $H'$. Then $\|x\| = \lim\|f_n\| = \|U(x)\|$. Hence $U$ is unitary.
Exercise 15: Let $T$ be any linear transformation from $H_1$ to $H_2$. If we suppose that $H_1$ is finite-dimensional, then $T$ is automatically bounded.
Solution. Let $e_1, \dots, e_n$ be an orthonormal basis for $H_1$. (This is a basis in the usual linear algebra sense, i.e. every vector is a finite linear combination of the basis vectors.) Let $m_i = \|T(e_i)\|$. Then if $x \in H_1$ is a unit vector, we can write $x = \sum_{i=1}^n c_ie_i$ with $\sum_{i=1}^n |c_i|^2 = 1$. By the Triangle Inequality,
\[
\|T(x)\| = \left\| T\left( \sum_{i=1}^n c_ie_i \right) \right\| = \left\| \sum_{i=1}^n c_iT(e_i) \right\| \le \sum_{i=1}^n |c_i|\,\|T(e_i)\| \le nM
\]
where $M = \max(m_i)$. (In fact, this bound can be improved to $M\sqrt{n}$, because $\sum |c_i| \le \sqrt{n}$ by the Cauchy-Schwarz inequality.)
Exercise 18: Let $H$ denote a Hilbert space, and $L(H)$ the vector space of all bounded linear operators on $H$. Given $T \in L(H)$, we define the operator norm
\[
\|T\| = \inf\{B : \|Tv\| \le B\|v\|, \text{ for all } v \in H\}.
\]
(a) Show that $\|T_1 + T_2\| \le \|T_1\| + \|T_2\|$ whenever $T_1, T_2 \in L(H)$.
(b) Prove that
\[
d(T_1, T_2) = \|T_1 - T_2\|
\]
defines a metric on $L(H)$.
(c) Show that $L(H)$ is complete in the metric $d$.
Solution.
(a) This is easier if we use another expression for $\|T\|$, such as
\[
\|T\| = \sup_{x \in H,\, x \ne 0} \frac{\|Tx\|}{\|x\|}.
\]
If we call this supremum $S$, then we have $\|Tx\| \le S\|x\|$ for all $x$; moreover, for any $B < S$, there exists $x$ with $\|Tx\|/\|x\| > B \Rightarrow \|Tx\| > B\|x\|$. Hence $S$ is the infimum (in fact, the minimum) of such bounds, so this definition really is equivalent. Then
\[
\|T_1 + T_2\| = \sup \frac{\|(T_1 + T_2)x\|}{\|x\|} \le \sup\left( \frac{\|T_1x\|}{\|x\|} + \frac{\|T_2x\|}{\|x\|} \right) \le \sup \frac{\|T_1x\|}{\|x\|} + \sup \frac{\|T_2x\|}{\|x\|} = \|T_1\| + \|T_2\|.
\]
(b) Since $\|Tx\| = \|{-Tx}\|$ for all $x$, $\|T\| = \|{-T}\|$. Then
\[
d(T_1, T_2) = \|T_1 - T_2\| = \|T_2 - T_1\| = d(T_2, T_1).
\]
Clearly $d(T_1, T_1) = 0$; conversely, if $T_1 \ne T_2$ then there exists some $x$ with $\|(T_1 - T_2)x\| > 0$, so $\|T_1 - T_2\| > 0$. Finally, using part (a),
\[
d(T_1, T_3) = \|T_1 - T_3\| = \|(T_1 - T_2) + (T_2 - T_3)\| \le \|T_1 - T_2\| + \|T_2 - T_3\| = d(T_1, T_2) + d(T_2, T_3).
\]
Hence $d$ is a metric.
(c) Let $\{T_n\}$ be a Cauchy sequence in $L(H)$. Define $T(x) = \lim T_n(x)$ for all $x \in H$. This limit exists because $\|(T_m - T_n)x\| \le \|T_m - T_n\|\,\|x\|$ so $\{T_nx\}$ is a Cauchy sequence in $H$. $T$ is linear by the linearity of limits. Since $T_nx \to Tx$, the continuity of the norm implies $\|T_nx\| \to \|Tx\|$ for all $x$. Hence $\|Tx\| \le (\lim\|T_n\|)\|x\|$, where the limit exists and is finite because $\bigl| \|T_m\| - \|T_n\| \bigr| \le \|T_m - T_n\|$ by the triangle inequality, so $\{\|T_n\|\}$ is a Cauchy sequence of real numbers. So $T$ is bounded. Finally, given $\epsilon > 0$ choose $N$ with $\|T_m - T_n\| < \epsilon$ for $m, n \ge N$; letting $m \to \infty$ in $\|(T_m - T_n)x\| \le \epsilon\|x\|$ gives $\|(T - T_n)x\| \le \epsilon\|x\|$, so $\|T - T_n\| \le \epsilon$ for $n \ge N$, i.e. $T_n \to T$ in the metric $d$.

Exercise 19: If $T$ is a bounded linear operator on a Hilbert space, prove that
\[
\|TT^*\| = \|T^*T\| = \|T\|^2 = \|T^*\|^2.
\]
Solution. We already know $\|T\|^2 = \|T^*\|^2$ from Proposition 5.4. Now
\begin{align*}
\|T^*T\| &= \sup_{\|f\|=\|g\|=1} |\langle T^*Tf, g \rangle|
= \sup_{\|f\|=\|g\|=1} |\langle Tf, Tg \rangle| \\
&\le \sup_{\|f\|=\|g\|=1} \|Tf\|\,\|Tg\|
= \sup_{\|f\|=1} \|Tf\| \sup_{\|g\|=1} \|Tg\|
= \|T\|^2.
\end{align*}
To show that equality is achieved, choose a sequence $f_n$ with $\|f_n\| = 1$ and $\|Tf_n\| \to \|T\|$. Then $\langle Tf_n, Tf_n \rangle \to \|T\|^2$, so
\[
\|T^*T\| = \sup_{\|f\|=\|g\|=1} |\langle Tf, Tg \rangle| \ge \sup_{\|f\|=1} \langle Tf, Tf \rangle \ge \|T\|^2.
\]
Hence $\|T^*T\| = \|T\|^2$. Finally, replacing $T$ with $T^*$ yields $\|TT^*\| = \|T^*\|^2 = \|T\|^2$ and we are done.
Exercise 20: Suppose $H$ is an infinite-dimensional Hilbert space. We have seen an example of a sequence $\{f_n\}$ in $H$ with $\|f_n\| = 1$ for all $n$, but for which no subsequence of $\{f_n\}$ converges in $H$. However, show that for any sequence $\{f_n\}$ in $H$ with $\|f_n\| = 1$ for all $n$, there exist $f \in H$ and a subsequence $\{f_{n_k}\}$ such that for all $g \in H$, one has
\[
\lim_{k \to \infty} (f_{n_k}, g) = (f, g).
\]
One says that $\{f_{n_k}\}$ converges weakly to $f$.
Solution. The proof is similar to the Arzela-Ascoli theorem, and as with that proof, the main hang-up is notation. Let $\{e_n\}$ be an orthonormal basis for $H$. (We're assuming here that $H$ is separable, of course, which the book includes in its definition of Hilbert space.) Then we can define $f$ by defining its inner product with each $e_n$. Now $\langle f_n, e_1 \rangle$ is a sequence of complex numbers in the closed unit disc (by the Cauchy-Schwarz inequality, since these are both unit vectors). Since the closed unit disc is compact, there is a subsequence $\{f_{n_j}\}$ for which $\langle f_{n_j}, e_1 \rangle$ converges to some limit $\beta_1$. Then $\langle f_{n_j}, e_2 \rangle$ is a sequence in the closed unit disc, so along some further subsequence $\{f_{n_{j_r}}\}$ it converges to a limit $\beta_2$. Continuing in this fashion, we obtain sequences $S_0, S_1, S_2, \dots$ where $S_{m+1}$ is a subsequence of $S_m$, $S_0 = \{f_n\}$, and $\langle v_n^{(m)}, e_j \rangle \to \beta_j$ as $n \to \infty$ for $j = 1, \dots, m$, where $v_n^{(m)}$ is the $n$th term of $S_m$. Define a sequence $S$ whose $k$th term is $v_k = v_k^{(k)}$. Then for any $S_m$, the tail of $S$ is a subsequence of the tail of $S_m$. Hence $\langle v_k, e_j \rangle \to \beta_j$ as $k \to \infty$ for all $j$. By Bessel's inequality $\sum_{j \le m} |\langle v_k, e_j \rangle|^2 \le 1$ for every $k$ and $m$, so $\sum_j |\beta_j|^2 \le 1$ and we may define $f \in H$ by $\langle f, e_j \rangle = \beta_j$. Relabeling, $v_k = f_{n_k}$ is a subsequence with $\langle f_{n_k}, e_j \rangle \to \langle f, e_j \rangle$ for all basis elements $e_j$, hence for all finite linear combinations of them; since $\|f_{n_k}\| \le 1$ and $\|f\| \le 1$, approximating $g$ by such combinations gives $\langle f_{n_k}, g \rangle \to \langle f, g \rangle$ for all $g \in H$.
Exercise 17: Fatou's theorem can be generalized by allowing a point to approach the boundary in larger regions, as follows.
For each $0 < s < 1$ and point $z$ on the unit circle, consider the region $\Gamma_s(z)$ defined as the smallest closed convex set that contains $z$ and the closed disc $D_s(0)$. In other words, $\Gamma_s(z)$ consists of all lines joining $z$ with points in $D_s(0)$. Near the point $z$, the region $\Gamma_s(z)$ looks like a triangle. See Figure 2.
We say that a function $F$ defined in the open unit disc has a non-tangential limit at a point $z$ on the circle, if for every $0 < s < 1$, the limit
\[
\lim_{\substack{w \to z \\ w \in \Gamma_s(z)}} F(w)
\]
exists.
Prove that if $F$ is holomorphic and bounded on the open unit disc, then $F$ has a non-tangential limit for almost every point on the unit circle.
Solution. Since $F$ is holomorphic, we have $F(z) = \sum_{n=0}^\infty a_nz^n$ for $|z| < 1$. As shown on page 174 in the proof of Fatou's theorem, $\sum |a_n|^2 < \infty$ so there is an $L^2(\mathbb{T})$ function $F(e^{i\theta})$ whose Fourier coefficients are $a_n$. Note also that $F(e^{i\theta})$ is bounded (almost everywhere) since, by Fatou's theorem, it is the a.e. radial limit of $F(z)$, so $|F(z)| \le M \Rightarrow |F(e^{i\theta})| \le M$.
We next prove a lemma about the Poisson kernel.
Lemma 2. For each $s \in (0, 1)$ there exists a constant $k_s$ such that
\[
P_r(\theta - \varphi) \le k_sP_r(\varphi)
\]
for all $(r, \theta)$ such that $re^{i\theta} \in \Gamma_s$.
Proof. By elementary arithmetic,
\[
P_r(\theta - \varphi) = \frac{1 - r^2}{|e^{i\varphi} - re^{i\theta}|^2}
\quad\text{and}\quad
P_r(\varphi) = \frac{1 - r^2}{|e^{i\varphi} - r|^2}.
\]
(This alternate formula can be found in any complex analysis book.) Our task is thus reduced to proving
\[
|e^{i\varphi} - r| \le c_s|e^{i\varphi} - re^{i\theta}|
\]
for some constant $c_s$ (then $k_s = c_s^2$ works). By the triangle inequality,
\[
|e^{i\varphi} - r| \le |e^{i\varphi} - re^{i\theta}| + r|e^{i\theta} - 1| = |e^{i\varphi} - re^{i\theta}| + 2r\sin\left( \frac{|\theta|}{2} \right).
\]
Thus, our task is reduced to proving that
\[
\frac{2r\sin\left( \frac{|\theta|}{2} \right)}{|e^{i\varphi} - re^{i\theta}|}
\]
is bounded on $\Gamma_s$. But $|e^{i\varphi} - re^{i\theta}| \ge 1 - r$ by the triangle inequality, so it is sufficient to prove that
\[
\frac{2r\sin\left( \frac{|\theta|}{2} \right)}{1 - r}
\]
is bounded on $\Gamma_s$. Now for each $r$, the maximum value of $|\theta|$ (which will maximize this quotient) is, as indicated in the diagram, one for which $2r\sin\left( \frac{|\theta|}{2} \right)$ occurs in a triangle with $1 - r$ and $\sqrt{1 - s^2} - \sqrt{r^2 - s^2}$. Thus, it is at most equal to the sum of them (actually quite a bit less). So we finally need only to prove that
\[
\frac{\sqrt{1 - s^2} - \sqrt{r^2 - s^2}}{1 - r}
\]
is bounded as $r \to 1$. But this follows from the fact that it has negative derivative, since
\[
\frac{d}{dr}\left( \sqrt{1 - s^2} - \sqrt{r^2 - s^2} \right) = \frac{-r}{\sqrt{r^2 - s^2}} < -1 = \frac{d}{dr}(1 - r).
\]
This completes the proof of the lemma. (I know, there are probably much shorter proofs, but it's late at night...)
Having established this lemma, the rest of the problem becomes trivial. By the Poisson integral formula,
\begin{align*}
|F(re^{i\theta})| &= \left| \frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(\theta - \varphi)F(e^{i\varphi})\,d\varphi \right| \\
&\le \frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(\theta - \varphi)|F(e^{i\varphi})|\,d\varphi \\
&\le k_s\,\frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(\varphi)|F(e^{i\varphi})|\,d\varphi
\end{align*}
and by Math 245A (specifically, the fact that $P_r$ is an approximate identity) this last integral tends to zero. (Recall that we assumed $F(1) = 0$.)
Exercise 21: There are several senses in which a sequence of bounded operators $\{T_n\}$ can converge to a bounded operator $T$ (in a Hilbert space $H$). First, there is convergence in the norm, that is, $\|T_n - T\| \to 0$ as $n \to \infty$. Next, there is a weaker convergence, which happens to be called strong convergence, that requires that $T_nf \to Tf$ as $n \to \infty$ for every $f \in H$. Finally, there is weak convergence (see also Exercise 20) that requires $(T_nf, g) \to (Tf, g)$ for every pair of vectors $f, g \in H$.
(a) Show by examples that weak convergence does not imply strong convergence, nor does strong convergence imply convergence in the norm.
(b) Show that for any bounded operator $T$ there is a sequence $\{T_n\}$ of bounded operators of finite rank so that $T_n \to T$ strongly as $n \to \infty$.
Solution. (a) Let $H = \ell^2(\mathbb{N})$. Let $T$ be the zero operator and $T_n = R^n$ where $R$ is the right shift operator; thus
\[
T_n(a_1, a_2, \dots) = (\underbrace{0, \dots, 0}_{n}, a_1, a_2, \dots).
\]
Then for any fixed $f = (a_1, a_2, \dots)$ and $g = (b_1, b_2, \dots)$,
\[
\langle T_nf, g \rangle = \sum_{k=1}^\infty a_k\overline{b_{n+k}} = \langle f, L^ng \rangle
\]
where $L$ is the left-shift operator defined by $Lg = (b_2, b_3, \dots)$. By the Cauchy-Schwarz inequality,
\[
|\langle T_nf, g \rangle| \le \|f\|\,\|L^ng\|.
\]
But $\|L^ng\| \to 0$ because $\sum |b_k|^2$ converges and the tails of a convergent series tend to zero. Hence $\langle T_nf, g \rangle \to 0$ for all $f, g \in H$. Thus, $T_n \to T$ weakly. However, $T_n$ does not converge to $T$ strongly; note that $\|Rf\| = \|f\|$ for any $f$, so $\|T_nf\| = \|R^nf\| = \|f\|$ by induction. Hence $T_nf \not\to Tf = 0$ for any $f \ne 0$.
To show that strong convergence does not imply convergence in norm, let $T_n = L^n$ in the same space. Then for any $f$, $\|T_nf\| = \|L^nf\| \to 0$ because the tails of a convergent series tend to zero. Hence $T_n \to T$ strongly. However, $\|T_n\| = 1$ because a unit vector $(0, \dots, 0, 1, 0, \dots)$ with more than $n$ initial zeros is mapped to another unit vector. Hence $\|T_n - T\| = \|T_n\| \not\to 0$.
(b) (We assume $H$ is separable.) Let $\{e_i\}$ be an orthonormal basis for $H$. We can write
\[
Te_i = \sum_{j=1}^\infty c_{ij}e_j
\]
for each $i = 1, 2, \dots$. Define
\[
T_ne_i = \sum_{j=1}^n c_{ij}e_j
\]
and extend linearly from the basis to the rest of the space (actually, extend linearly to finite linear combinations of the basis, and then take limits to get the rest of the space...). In other words, $T_n = P_nT$, where $P_n$ is the orthogonal projection onto the span of $e_1, \dots, e_n$. Clearly each $T_n$ is of finite rank since its range is spanned by $e_1, \dots, e_n$. Now let $f = \sum_{i=1}^\infty a_ie_i$. Then
\[
Tf = \sum_{i=1}^\infty \sum_{j=1}^\infty a_ic_{ij}e_j
\]
whereas
\[
T_nf = \sum_{i=1}^\infty \sum_{j=1}^n a_ic_{ij}e_j
\]
which is just the $n$th partial sum (in $j$) of the expansion of $Tf$ and hence converges to $Tf$. (This is where we use the fact that $T$ is a bounded operator, which is what lets us pass from the basis vectors to arbitrary $f$.) Hence $T_nf \to Tf$ in norm for all $f \in H$, i.e. $T_n \to T$ strongly.
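The shift examples in part (a) are easy to play with on finitely supported sequences. The following is a rough sketch, not part of the solution: sequences are modelled as Python lists with implicit trailing zeros, only real entries are used, and the function names `right_shift`, `left_shift`, `inner` and `norm` are hypothetical helpers.

```python
import math

def right_shift(f, n):
    """R^n applied to a finitely supported sequence f = (a_1, ..., a_m)."""
    return [0.0] * n + list(f)

def left_shift(f, n):
    """L^n applied to a finitely supported sequence."""
    return list(f[n:])

def inner(f, g):
    """Real inner product; entries beyond the shorter list are treated as zero."""
    return sum(a * b for a, b in zip(f, g))

def norm(f):
    return math.sqrt(inner(f, f))

f = [1 / 2**k for k in range(10)]
g = [1 / (k + 1) for k in range(10)]

# <R^n f, g> dies out (weak convergence to 0) ...
weak = [inner(right_shift(f, n), g) for n in range(12)]
# ... while ||R^n f|| stays equal to ||f|| (no strong convergence).
norms_R = [norm(right_shift(f, n)) for n in range(12)]
# ||L^n f|| -> 0 for each fixed f (strong convergence to 0) ...
norms_L = [norm(left_shift(f, n)) for n in range(12)]
# ... but L^n still maps some unit vector to a unit vector, so ||L^n|| = 1.
unit_image_norm = norm(left_shift([0.0] * 5 + [1.0], 5))
```

Because the sample vectors have finite support, the weak pairings and the left-shift norms become exactly zero after finitely many steps; for general elements of $\ell^2$ they only tend to zero.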

Exercise 22: An operator $T$ is an isometry if $\|Tf\| = \|f\|$ for all $f \in H$.
(a) Show that if $T$ is an isometry, then $(Tf, Tg) = (f, g)$ for every $f, g \in H$. Prove as a result that $T^*T = I$.
(b) If $T$ is an isometry and $T$ is surjective, then $T$ is unitary and $TT^* = I$.
(c) Give an example of an isometry that is not unitary.
(d) Show that if $T^*T$ is unitary then $T$ is an isometry.
Solution. (a) By the polarization identity,
\begin{align*}
\langle Tf, Tg \rangle
&= \frac{\|Tf + Tg\|^2 - \|Tf - Tg\|^2 + i\|Tf + iTg\|^2 - i\|Tf - iTg\|^2}{4} \\
&= \frac{\|T(f + g)\|^2 - \|T(f - g)\|^2 + i\|T(f + ig)\|^2 - i\|T(f - ig)\|^2}{4} \\
&= \frac{\|f + g\|^2 - \|f - g\|^2 + i\|f + ig\|^2 - i\|f - ig\|^2}{4} \\
&= \langle f, g \rangle.
\end{align*}
This in turn implies
\[
\langle f, T^*Tg \rangle = \langle f, Ig \rangle
\]
for all $f, g$, so that $T^*T = I$.
(b) $T$ preserves norms because it's an isometry; it's injective because norm-preserving linear maps are always injective (since the kernel cannot contain anything nonzero). Since it's surjective as well, it's a norm-preserving linear bijection, which is by definition a unitary map. We know $T^*T = I$ from part (a). Since $T$ is bijective, it has a linear 2-sided inverse, and the equation $T^*T = I$ shows that $T^*$ is this inverse. Hence $TT^* = I$.
(c) The right-shift operator on $\ell^2(\mathbb{N})$ is isometric, since
\[
\|(0, a_1, a_2, \dots)\|^2 = \sum_{j=1}^\infty |a_j|^2 = \|(a_1, a_2, \dots)\|^2.
\]
However, this operator is not surjective, so it's not unitary.
(d) First,
\[
\|Tf\|^2 = \langle Tf, Tf \rangle = \langle f, T^*Tf \rangle \le \|f\|\,\|T^*Tf\| = \|f\|^2
\]
by the Cauchy-Schwarz inequality; hence $\|Tf\| \le \|f\|$. Then,
\[
\|f\|^2 = \|T^*Tf\|^2 = \langle T^*Tf, T^*Tf \rangle = \langle Tf, TT^*Tf \rangle \le \|Tf\|\,\|T(T^*Tf)\| \le \|Tf\|\,\|T^*Tf\| = \|Tf\|\,\|f\|
\]
where we have applied the previous inequality with $T^*Tf$ in place of $f$. Dividing by $\|f\|$ yields $\|f\| \le \|Tf\|$. Putting the inequalities together, $\|Tf\| = \|f\|$. Hence $T$ is an isometry.

Exercise 23: Suppose $\{T_k\}$ is a collection of bounded operators on a Hilbert space $H$, with $\|T_k\| \le 1$ for all $k$. Suppose also that
\[
T_kT_j^* = T_k^*T_j = 0 \quad\text{for all } k \ne j.
\]
Let $S_N = \sum_{k=-N}^{N} T_k$. Show that $S_N(f)$ converges as $N \to \infty$, for every $f \in H$. If $T(f)$ denotes the limit, prove that $\|T\| \le 1$.
Exercise 24: Let $\{e_k\}_{k=1}^\infty$ denote an orthonormal set in a Hilbert space $H$. If $\{c_k\}_{k=1}^\infty$ is a sequence of positive real numbers such that $\sum c_k^2 < \infty$, then the set
\[
A = \left\{ \sum_{k=1}^\infty a_ke_k : |a_k| \le c_k \right\}
\]
is compact in $H$.
Solution. This is a standard diagonalization argument. Let $\Sigma$ be a sequence of points in $A$. Consider the first components of the points in $\Sigma$, i.e. their components with respect to $e_1$. This is a sequence of complex numbers in the compact ball $\{z : |z| \le c_1\}$, so some subsequence converges to a complex number in this ball. If we take the corresponding subsequence of $\Sigma$, we obtain a subsequence $S_1$ whose first components converge to some number $b_1$ with $|b_1| \le c_1$. Now consider the second components of the points in $S_1$; they form a sequence of complex numbers in the compact ball $\{z : |z| \le c_2\}$ and hence some subsequence converges to a complex number $b_2$ in this ball. Taking the corresponding subsequence $S_2$ of $S_1$, we have a sequence whose first and second components converge. Continuing, we may inductively define sequences $S_n$ for all $n$ such that $S_{n+1}$ is a subsequence of $S_n$, and the first $n$ components of $S_n$ converge. Finally, we define a sequence $S$ whose $n$th term is the $n$th term of $S_n$. This is a subsequence of $\Sigma$, and for any $n$ it is eventually a subsequence of $S_n$ (i.e. the tail of $S$ is a subsequence of the tail of $S_n$). Hence it converges in every component. But this implies convergence in $H$ to the point $\sum b_ke_k$, which lies in $A$ because $|b_k| \le c_k$ (the sizes of the tails are uniformly bounded by the tails of $\sum c_k^2$), so we are done.
Exercise 25: Suppose $T$ is a bounded operator that is diagonal with respect to a basis $\{\varphi_k\}$, with $T\varphi_k = \lambda_k\varphi_k$. Then $T$ is compact if and only if $\lambda_k \to 0$.
Solution. (From lecture) Suppose $\lambda_k \to 0$. Let $T_n$ be the $n$th truncation, i.e. the operator that results when $\lambda_k$ is replaced with 0 for $k > n$. Then $T - T_n$ is also a diagonal operator, with
\[
\|T - T_n\| = \sup_{k > n} |\lambda_k| \to 0.
\]
Since $T$ can be uniformly approximated by operators of finite rank, it is compact. Conversely, suppose that $\lambda_k \not\to 0$, i.e. $\limsup |\lambda_n| > 0$, so that there is some subsequence $\lambda_{n_j}$ with $|\lambda_{n_j}| > \frac{\epsilon}{2}$ for some real number $\epsilon > 0$. Then $T\varphi_{n_j} = \lambda_{n_j}\varphi_{n_j}$ and by orthonormality,
\[
\|T\varphi_{n_j} - T\varphi_{n_k}\| = \sqrt{|\lambda_{n_j}|^2 + |\lambda_{n_k}|^2} > \frac{\epsilon}{2}.
\]
Since all the points of the sequence $\{T\varphi_{n_j}\}$ are uniformly bounded away from each other, it can have no convergent subsequence. These points all lie in $T(B)$, where $B$ is the closed unit ball, so the closure of $T(B)$ is not compact.
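Exercise 26: Suppose $w$ is a measurable function on $\mathbb{R}^d$ with $0 < w(x) < \infty$ for a.e. $x$, and $K$ is a measurable function on $\mathbb{R}^{2d}$ that satisfies:
(i) $\int_{\mathbb{R}^d} |K(x, y)|w(y)\,dy \le Aw(x)$ for almost every $x \in \mathbb{R}^d$, and
(ii) $\int_{\mathbb{R}^d} |K(x, y)|w(x)\,dx \le Aw(y)$ for almost every $y \in \mathbb{R}^d$.
Prove that the integral operator defined by
\[
Tf(x) = \int_{\mathbb{R}^d} K(x, y)f(y)\,dy, \quad x \in \mathbb{R}^d
\]
is bounded on $L^2(\mathbb{R}^d)$ with $\|T\| \le A$. Note as a special case that if $\int |K(x, y)|\,dy \le A$ for all $x$, and $\int |K(x, y)|\,dx \le A$ for all $y$, then $\|T\| \le A$.
Solution. First, note that
\begin{align*}
\left( \int |K(x, y)|\,|f(y)|\,dy \right)^2
&= \left( \int \left[ \sqrt{|K(x, y)|}\sqrt{w(y)} \right]\left[ \sqrt{|K(x, y)|}\,|f(y)|\sqrt{w(y)^{-1}} \right] dy \right)^2 \\
&\le \left( \int |K(x, y)|w(y)\,dy \right)\left( \int |K(x, y)|\,|f(y)|^2w(y)^{-1}\,dy \right) \\
&\le Aw(x)\int |K(x, y)|\,|f(y)|^2w(y)^{-1}\,dy \quad\text{(a.e.)}
\end{align*}
by the Cauchy-Schwarz inequality. Thus, for $f \in L^2$,
\begin{align*}
\|Tf\|^2 &= \int \left| \int K(x, y)f(y)\,dy \right|^2 dx \\
&\le \int \left( \int |K(x, y)|\,|f(y)|\,dy \right)^2 dx \\
&\le \int Aw(x)\int |K(x, y)|\,|f(y)|^2w(y)^{-1}\,dy\,dx \\
&= A\int |f(y)|^2w(y)^{-1}\int |K(x, y)|w(x)\,dx\,dy \quad\text{(Tonelli)} \\
&\le A\int |f(y)|^2w(y)^{-1}Aw(y)\,dy \\
&= A^2\int |f(y)|^2\,dy \\
&= A^2\|f\|^2.
\end{align*}
Hence $\|T\| \le A$.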
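A discrete analogue of the special case $w = 1$ is easy to test: if every row and column of a matrix has absolute sum at most $A$, then $\|Kf\| \le A\|f\|$. The sketch below is an illustration only; the matrix $K_{ij} = (1 + |i - j|)^{-2}$, the test vectors, and the helper names are assumptions made here, and the integrals of the exercise are replaced by finite sums.

```python
import math

def schur_bound(K):
    """Largest absolute row sum and column sum of a square matrix K."""
    n = len(K)
    row = max(sum(abs(K[i][j]) for j in range(n)) for i in range(n))
    col = max(sum(abs(K[i][j]) for i in range(n)) for j in range(n))
    return max(row, col)

def apply(K, f):
    return [sum(k * x for k, x in zip(row, f)) for row in K]

def norm(f):
    return math.sqrt(sum(x * x for x in f))

n = 20
K = [[1.0 / (1 + abs(i - j)) ** 2 for j in range(n)] for i in range(n)]
A = schur_bound(K)

vectors = [
    [1.0] * n,
    [(-1.0) ** i for i in range(n)],
    [math.sin(i + 1) for i in range(n)],
]
# Each ratio ||K f|| / ||f|| should not exceed the Schur constant A.
ratios = [norm(apply(K, f)) / norm(f) for f in vectors]
```

A general weight $w$ would replace the plain row and column sums by the weighted sums in hypotheses (i) and (ii).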
Exercise 27: Prove that the operator
\[
Tf(x) = \frac{1}{\pi}\int_0^\infty \frac{f(y)}{x + y}\,dy
\]
is bounded on $L^2(0, \infty)$ with norm $\|T\| \le 1$.
Exercise 28: Suppose $H = L^2(B)$, where $B$ is the unit ball in $\mathbb{R}^d$. Let $K(x, y)$ be a measurable function on $B \times B$ that satisfies $|K(x, y)| \le A|x - y|^{-d+\alpha}$ for some $\alpha > 0$, whenever $x, y \in B$. Define
\[
Tf(x) = \int_B K(x, y)f(y)\,dy.
\]
(a) Prove that $T$ is a bounded operator on $H$.
(b) Prove that $T$ is compact.
(c) Note that $T$ is a Hilbert-Schmidt operator if and only if $\alpha > d/2$.
Solution.
(a) Let
\[
C = \int_{z \in \mathbb{R}^d : |z| \le 2} \frac{dz}{|z|^{d-\alpha}}
\]
which converges because the exponent is less than $d$. Then
\[
\int_B |K(x, y)|\,dy \le \int_B \frac{A\,dy}{|x - y|^{d-\alpha}} \le A\int_{|z| \le 2} \frac{dz}{|z|^{d-\alpha}} = AC
\]
so by problem 26 with $w = 1$, we have $T$ bounded with $\|T\| \le AC$.
(b) As suggested, let
\[
K_n(x, y) = \begin{cases} K(x, y) & |x - y| \ge \frac{1}{n} \\ 0 & \text{else} \end{cases}
\]
and
\[
T_nf(x) = \int K_n(x, y)f(y)\,dy.
\]
Then $T_n$ is Hilbert-Schmidt (and therefore compact) since clearly $K_n \in L^2(B \times B)$ ($K_n$ is, after all, bounded with compact support). Moreover,
\[
\int_B |K_n(x, y) - K(x, y)|\,dy \le \int_{|x - y| \le 1/n} A|x - y|^{-d+\alpha}\,dy = AC_n
\]
where we define
\[
C_n = \int_{z \in \mathbb{R}^d : |z| \le 1/n} \frac{dz}{|z|^{d-\alpha}}.
\]
Since $\frac{1}{|z|^{d-\alpha}}$ is integrable on $\{|z| \le 2\}$, the absolute continuity of the integral implies $C_n \to 0$. By problem 26 again with $w = 1$, this implies $\|T - T_n\| \to 0$. Since $T_n$ is compact, this implies that $T$ is compact.
(c) This should actually say "$T$ is guaranteed to be Hilbert-Schmidt if and only if..." since $K$ could be a lot less than the bound given. Anyhoo,
\begin{align*}
T \text{ necessarily Hilbert-Schmidt}
&\Leftrightarrow A|x - y|^{-d+\alpha} \in L^2(B \times B) \\
&\Leftrightarrow \int_B\int_B A^2|x - y|^{-2d+2\alpha} < \infty \\
&\Leftrightarrow -2d + 2\alpha > -d \\
&\Leftrightarrow \alpha > \frac{d}{2}.
\end{align*}

Exercise 29: Let $T$ be a compact operator on a Hilbert space $H$ and assume $\lambda \ne 0$.
(a) Show that the range of $\lambda I - T$ defined by
\[
\{g \in H : g = (\lambda I - T)f \text{ for some } f \in H\}
\]
is closed.
(b) Show by example that this may fail when $\lambda = 0$.
(c) Show that the range of $\lambda I - T$ is all of $H$ if and only if the null space of $\bar{\lambda}I - T^*$ is trivial.
Exercise 30: Let $H = L^2([-\pi, \pi])$ with $[-\pi, \pi]$ identified as the unit circle. Fix a bounded sequence $\{\lambda_n\}_{n=-\infty}^{\infty}$ of complex numbers, and define an operator $Tf$ by
\[
Tf(x) \sim \sum_{n=-\infty}^{\infty} \lambda_na_ne^{inx} \quad\text{whenever}\quad f(x) \sim \sum_{n=-\infty}^{\infty} a_ne^{inx}.
\]
Such an operator is called a Fourier multiplication operator, and the sequence $\{\lambda_n\}$ is called the multiplier sequence.
(a) Show that $T$ is a bounded operator on $H$ and $\|T\| = \sup |\lambda_n|$.
(b) Verify that $T$ commutes with translations, that is, if we define $(\tau_hf)(x) = f(x - h)$ then $T \circ \tau_h = \tau_h \circ T$ for every $h \in \mathbb{R}$.
(c) Conversely, prove that if $T$ is any bounded operator on $H$ that commutes with translations, then $T$ is a Fourier multiplier operator.
Exercise 34: Let $K$ be a Hilbert-Schmidt kernel which is real and symmetric. Then, as we saw, the operator $T$ whose kernel is $K$ is compact and symmetric. Let $\{\varphi_k(x)\}$ be the eigenvectors (with eigenvalues $\lambda_k$) that diagonalize $T$. Then
(a) $\sum |\lambda_k|^2 < \infty$.
(b) $K(x, y) \sim \sum \lambda_k\varphi_k(x)\varphi_k(y)$ is the expansion of $K$ in the basis $\{\varphi_k(x)\varphi_j(y)\}$.
(c) Suppose $T$ is a compact operator which is symmetric. Then $T$ is of Hilbert-Schmidt type if and only if $\sum |\lambda_n|^2 < \infty$, where $\{\lambda_n\}$ are the eigenvalues of $T$ counted according to their multiplicities.
Exercise 35: Let $H$ be a Hilbert space. Prove the following variants of the spectral theorem.
(a) If $T_1$ and $T_2$ are two linear symmetric and compact operators on $H$ that commute, show that they can be diagonalized simultaneously. In other words, there exists an orthonormal basis for $H$ which consists of eigenvectors for both $T_1$ and $T_2$.
(b) A linear operator on $H$ is normal if $TT^* = T^*T$. Prove that if $T$ is normal and compact, then $T$ can be diagonalized.
(c) If $U$ is unitary, and $U = \lambda I - T$ where $T$ is compact, then $U$ can be diagonalized.
Solution.
(a) We can pretty much copy the proof verbatim with "eigenvector" replaced by "common eigenvector". Let $S$ be the closure of the subspace of $H$ spanned by all common eigenvectors of $T_1$ and $T_2$. We want to show $S = H$. Suppose not; then $H = S \oplus S^\perp$ with $S^\perp$ nonzero. If we can show $S^\perp$ contains a common eigenvector of $T_1$ and $T_2$, we have a contradiction. Note that $T_1S \subset S$, which in turn implies $T_1S^\perp \subset S^\perp$ since
\[
g \in S^\perp \Rightarrow \langle T_1g, f \rangle = \langle g, T_1f \rangle = 0
\]
for all $f \in S$. Similarly, $T_2S^\perp \subset S^\perp$. Now by the theorem for one operator, $T_1$ must have an eigenvector in $S^\perp$ with some eigenvalue $\lambda$. Let $E_\lambda$ be the eigenspace of $\lambda$ (as a subspace of $S^\perp$). Then for any $x \in E_\lambda$,
\[
T_1(T_2x) = T_2(T_1x) = T_2(\lambda x) = \lambda(T_2x)
\]
so $T_2x \in E_\lambda$ as well. Since $T_2$ fixes $E_\lambda$, it has at least one eigenvector in $E_\lambda$. This eigenvector is a common eigenvector of $T_1$ and $T_2$, providing us with our contradiction.
(b) This follows from part (a). Write
\[
T = \frac{T + T^*}{2} + i\,\frac{T - T^*}{2i}.
\]
By a trivial calculation, both $\frac{T + T^*}{2}$ and $\frac{T - T^*}{2i}$ are self-adjoint, and they are compact because $T^*$ is compact whenever $T$ is. Moreover, since $T$ is normal,
\[
(T + T^*)(T - T^*) = T^2 + T^*T - TT^* - (T^*)^2 = T^2 - (T^*)^2 = (T - T^*)(T + T^*)
\]
so they commute as well. Hence, there exists an ONB of common eigenvectors of $\frac{T + T^*}{2}$ and $\frac{T - T^*}{2i}$. Any such common eigenvector is an eigenvector of $T$, since
\[
\frac{T + T^*}{2}x = \lambda x \text{ and } \frac{T - T^*}{2i}x = \lambda'x \Rightarrow Tx = (\lambda + i\lambda')x.
\]
(c)
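As a sanity check on the decomposition used in part (b), here is a minimal sketch in Python. The $2\times 2$ matrix `T` and all helper names are illustrative assumptions, not part of the original solution; the matrix is built as $\lambda_1 P_1 + \lambda_2 P_2$ with orthogonal projections onto $(1,1)$ and $(1,-1)$, so it is normal by construction.

```python
# Hypothetical helpers for 2x2 complex matrices stored as nested lists.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(X):
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def combine(X, Y, a, b):
    """Return a*X + b*Y entrywise."""
    return [[a * X[i][j] + b * Y[i][j] for j in range(2)] for i in range(2)]

def apply(X, v):
    return [sum(X[i][k] * v[k] for k in range(2)) for i in range(2)]

def close(X, Y, tol=1e-12):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

# Normal matrix with eigenvalues 1+2i on (1,1) and 3-i on (1,-1).
T = [[2 + 0.5j, -1 + 1.5j], [-1 + 1.5j, 2 + 0.5j]]
Ts = adjoint(T)

A = combine(T, Ts, 0.5, 0.5)          # (T + T*)/2
B = combine(T, Ts, 1 / 2j, -1 / 2j)   # (T - T*)/(2i)

print(close(A, adjoint(A)), close(B, adjoint(B)))   # both self-adjoint
print(close(matmul(A, B), matmul(B, A)))            # they commute
print(apply(T, [1, 1]))                             # (1+2i) * (1, 1)
```

Here `A` has eigenvalue 1 and `B` has eigenvalue 2 on the vector $(1,1)$, and `T` has eigenvalue $1 + 2i$ there, matching $Tx = (\lambda + i\lambda')x$.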

Chapter 4.8, Page 202


Problem 1: Let H be an infinite-dimensional Hilbert space. There exists a
linear functional $\ell$ defined on H that is not bounded.
Solution. It is a well-known fact from linear algebra that every vector space
has a basis. This can be proved using Zorn's lemma: linearly independent
sets are partially ordered by inclusion, and every chain has an upper bound
by union, so there exists a maximal linearly independent set, which is by
definition a basis. Applying this to our Hilbert space, we obtain an
(algebraic) basis, i.e. one for which every vector is a finite linear
combination of basis elements. Let $\{e_n\}$ be a countable subset of our
algebraic basis. Define $\ell(e_n) = n\|e_n\|$ and $\ell(f) = 0$ for $f$ in
our basis but $f \neq e_n$ for any $n$. We can then extend $\ell$ to the
whole space in a well-defined manner, but clearly $\ell$ is not bounded since
$|\ell(e_n)| = n\|e_n\|$.
Problem 2: The following is an example of a non-separable Hilbert space.
We consider the collection of exponentials $\{e^{i\lambda x}\}$ on
$\mathbb{R}$, where $\lambda$ ranges over the real numbers. Let $H_0$ denote
the space of finite linear combinations of these exponentials. For
$f, g \in H_0$, we define the inner product as
$$(f, g) = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} f(x)\overline{g(x)}\,dx.$$
(a) Show that this limit exists, and
$$(f, g) = \sum_{k=1}^{N} a_{\lambda_k}\overline{b_{\lambda_k}}$$
if $f(x) = \sum_{k=1}^{N} a_{\lambda_k} e^{i\lambda_k x}$ and
$g(x) = \sum_{k=1}^{N} b_{\lambda_k} e^{i\lambda_k x}$.
(b) With this inner product $H_0$ is a pre-Hilbert space. Notice that
$\|f\| \le \sup_x |f(x)|$, if $f \in H_0$, where $\|f\|$ denotes the norm
$\langle f, f\rangle^{1/2}$. Let H be the completion of $H_0$. Then H is not
separable because $e^{i\lambda x}$ and $e^{i\lambda' x}$ are orthonormal if
$\lambda \neq \lambda'$. A continuous function $F$ defined on $\mathbb{R}$
is called almost periodic if it is the uniform limit (on $\mathbb{R}$) of
elements in $H_0$. Such functions can be identified with (certain) elements
in the completion H: We have $H_0 \subset AP \subset H$, where $AP$ denotes
the almost periodic functions.
(c) A continuous function $F$ is in $AP$ if for every $\varepsilon > 0$ we
can find a length $L = L_\varepsilon$ such that any interval
$I \subset \mathbb{R}$ of length $L$ contains an "almost period" $\tau$
satisfying
$$\sup_x |F(x+\tau) - F(x)| < \varepsilon.$$
(d) An equivalent characterization is that $F$ is in $AP$ if and only if
every sequence $F(x + h_n)$ of translates of $F$ contains a subsequence that
converges uniformly.
Problem 7: Show that the identity operator on $L^2(\mathbb{R}^d)$ cannot be
given as an (absolutely) convergent integral operator. More precisely, if
$K(x, y)$ is a measurable function on $\mathbb{R}^d \times \mathbb{R}^d$
with the property that for each $f \in L^2(\mathbb{R}^d)$, the integral
$T(f)(x) = \int_{\mathbb{R}^d} K(x, y)f(y)\,dy$ converges for almost every
$x$, then $T(f) \neq f$ for some $f$.
Solution. Suppose such a $K$ exists. Let $B_1$ and $B_2$ be disjoint balls in
$\mathbb{R}^d$. We will show that $K = 0$ a.e. in $B_1 \times B_2$. Suppose
not; then there is a rectangle $E_1 \times E_2$ with $E_1 \subset B_1$ and
$E_2 \subset B_2$ sets of positive measure, on which $K(x, y) > 0$ or
$K(x, y) < 0$; WLOG, $K(x, y) > 0$. (If $K$ is allowed complex values, we can
change this condition to $\mathrm{Re}(K) > 0$.) Let $f = \chi_{E_2}$. Then
for almost all $x \in E_1$,
$$0 = f(x) = Tf(x) = \int_{E_2} K(x, y)\,dy.$$
But the integral of a positive function over a set of positive measure is
nonzero, so we have a contradiction. Thus, $K(x, y) = 0$ a.e. in
$B_1 \times B_2$. Now if we let
$\Delta = \{(x, x) : x \in \mathbb{R}^d\} \subset \mathbb{R}^{2d}$ be the
diagonal, we can cover $\mathbb{R}^{2d} \setminus \Delta$ with countably many
product sets $B_1 \times B_2$ of disjoint balls. (One way to see this is
topological: products of balls form a basis for the product topology on
$\mathbb{R}^d \times \mathbb{R}^d$, and $\Delta$ is closed, so its
complement is open.) This implies $K(x, y) = 0$ a.e. on
$\mathbb{R}^{2d} \setminus \Delta$, and therefore a.e. on $\mathbb{R}^{2d}$.
But then $Tf = 0$ for all $f$, so any nonzero $f$ will have $Tf \neq f$.
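Problem 8: Suppose $\{T_k\}$ is a collection of bounded operators on a
Hilbert space H. Assume that
$$\|T_k T_j^*\| \le a_{k-j}^2 \quad\text{and}\quad \|T_k^* T_j\| \le a_{k-j}^2,$$
for positive constants $\{a_n\}$ with the property that
$\sum_{n=-\infty}^{\infty} a_n = A < \infty$. Then $S_N(f)$ converges as
$N \to \infty$, for every $f \in H$, with $S_N = \sum_{-N}^{N} T_k$.
Moreover, $T = \lim_{N\to\infty} S_N$ satisfies $\|T\| \le A$.
Solution. For any integers $N$ and $n$,
$$\|S_N\|^{2^n} = \|(S_N^* S_N)^{2^{n-1}}\|
= \Bigl\| \sum_{j_1=-N}^{N}\sum_{k_1=-N}^{N}\cdots
\sum_{j_{2^{n-1}}=-N}^{N}\sum_{k_{2^{n-1}}=-N}^{N}
T_{j_1}^* T_{k_1}\cdots T_{j_{2^{n-1}}}^* T_{k_{2^{n-1}}} \Bigr\|$$
$$\le \sum_{j_1=-N}^{N}\sum_{k_1=-N}^{N}\cdots
\sum_{j_{2^{n-1}}=-N}^{N}\sum_{k_{2^{n-1}}=-N}^{N}
\| T_{j_1}^* T_{k_1}\cdots T_{j_{2^{n-1}}}^* T_{k_{2^{n-1}}} \|.$$
Now since $\|T_m\| \le a_0$ for any $m$,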

Problem 9: A discussion of a class of regular Sturm-Liouville operators
follows. Other special examples are given in the problems below.
Suppose $[a, b]$ is a bounded interval, and $L$ is defined on functions $f$
that are twice continuously differentiable in $[a, b]$ (we write
$f \in C^2([a, b])$) by
$$L(f)(x) = \frac{d^2 f}{dx^2} - q(x)f(x).$$
Here the function $q$ is continuous and real-valued on $[a, b]$, and we
assume for simplicity that $q$ is non-negative. We say that
$\varphi \in C^2([a, b])$ is an eigenfunction of $L$ with eigenvalue $\mu$ if
$L(\varphi) = \mu\varphi$, under the assumption that $\varphi$ satisfies the
boundary conditions $\varphi(a) = \varphi(b) = 0$. Then one can show:
(a) The eigenvalues $\mu$ are strictly negative, and the eigenspace
corresponding to each eigenvalue is one-dimensional.
(b) Eigenvectors corresponding to distinct eigenvalues are orthogonal in
$L^2([a, b])$.
(c) Let $K(x, y)$ be the "Green's kernel" defined as follows. Choose
$\varphi_-(x)$ to be a solution of $L(\varphi_-) = 0$, with
$\varphi_-(a) = 0$ but $\varphi_-'(a) \neq 0$. Similarly, choose
$\varphi_+(x)$ to be a solution of $L(\varphi_+) = 0$ with
$\varphi_+(b) = 0$ but $\varphi_+'(b) \neq 0$. Let
$w = \varphi_+'(x)\varphi_-(x) - \varphi_-'(x)\varphi_+(x)$ be the
"Wronskian" of these solutions, and note that $w$ is a non-zero constant.
Set
$$K(x, y) = \begin{cases}
\dfrac{\varphi_-(x)\varphi_+(y)}{w} & a \le x \le y \le b,\\[1ex]
\dfrac{\varphi_+(x)\varphi_-(y)}{w} & a \le y \le x \le b.
\end{cases}$$
Then the operator $T$ defined by
$$T(f)(x) = \int_a^b K(x, y)f(y)\,dy$$
is a Hilbert-Schmidt operator, and hence compact. It is also symmetric.
Moreover, whenever $f$ is continuous on $[a, b]$, $Tf$ is of class
$C^2([a, b])$ and
$$L(Tf) = f.$$
(d) As a result, each eigenvector of $T$ (with eigenvalue $\lambda$) is an
eigenvector of $L$ (with eigenvalue $\mu = 1/\lambda$). Hence Theorem 6.2
proves the completeness of the orthonormal set arising from normalizing the
eigenvectors of $L$.
Solution.
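(a) Let $\varphi$ be an eigenfunction of $L$ with eigenvalue $\mu$. Then
$$\varphi'' = (q + \mu)\varphi \ \Rightarrow\ \varphi\varphi'' = (q + \mu)\varphi^2.$$
Integrating by parts from $a$ to $b$, we have
$$\varphi\varphi'\big|_a^b - \int (\varphi')^2 = \int (q + \mu)\varphi^2.$$
Since $\varphi(a) = \varphi(b) = 0$, this reduces to
$$-\int (\varphi')^2 = \int (q + \mu)\varphi^2.$$
Now if $\varphi$ is not almost everywhere zero, the LHS is strictly negative.
But the integrand on the RHS is everywhere nonnegative unless $\mu$ is
strictly negative. Hence the eigenvalues of $L$ are all strictly negative.
Now suppose $\mu$ is an eigenvalue of $L$ with eigenfunctions $\varphi_1$
and $\varphi_2$. Then
$$(\mu + q)\varphi_1\varphi_2 = \varphi_1\varphi_2'' = \varphi_2\varphi_1''$$
$$\Rightarrow\ \varphi_1'\varphi_2' + \varphi_1\varphi_2'' = \varphi_1'\varphi_2' + \varphi_1''\varphi_2$$
$$\Rightarrow\ (\varphi_1\varphi_2')' = (\varphi_1'\varphi_2)'$$
$$\Rightarrow\ \varphi_1\varphi_2' = \varphi_1'\varphi_2 + C.$$
Plugging in $a$, we see that $C = 0$ because
$\varphi_1(a) = \varphi_2(a) = 0$. Hence
$\varphi_1\varphi_2' - \varphi_1'\varphi_2 = 0$. Since the Wronskian of
these solutions is zero, they are linearly dependent. So the eigenspace can
only be one-dimensional.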
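(b) Suppose $\varphi_1$ and $\varphi_2$ are eigenfunctions with eigenvalues
$\mu_1$ and $\mu_2$ respectively. Then
$$\mu_1 \int \varphi_1\varphi_2 = \int (\varphi_1'' - q\varphi_1)\varphi_2$$
$$= \int \varphi_1''\varphi_2 - \int q\varphi_1\varphi_2$$
$$= \varphi_1'\varphi_2\big|_a^b - \int \varphi_1'\varphi_2' - \int q\varphi_1\varphi_2$$
$$= -\int \varphi_1'\varphi_2' - \int q\varphi_1\varphi_2$$
$$= \varphi_1\varphi_2'\big|_a^b - \int \varphi_1'\varphi_2' - \int q\varphi_1\varphi_2$$
$$= \int \varphi_1\varphi_2'' - \int q\varphi_1\varphi_2$$
$$= \int \varphi_1(\varphi_2'' - q\varphi_2)$$
$$= \mu_2 \int \varphi_1\varphi_2.$$
(Both boundary terms vanish because of the boundary conditions.) If
$\mu_1 \neq \mu_2$, this implies $\int \varphi_1\varphi_2 = 0$.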
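(c) The Wronskian is a constant because
$$w' = \varphi_+''\varphi_- - \varphi_-''\varphi_+ = q\varphi_+\varphi_- - q\varphi_-\varphi_+ = 0;$$
it is nonzero because plugging in at $a$ yields $-\varphi_-'(a)\varphi_+(a)$.
We already know $\varphi_-'(a) \neq 0$, and $\varphi_+(a)$ cannot be zero
because then $\varphi_+$ would be an eigenfunction with eigenvalue 0,
contradicting part (a). To show $T$ is Hilbert-Schmidt, consider
$\iint |K|^2$. We will treat this on the region
$R = \{a \le y \le x \le b\}$; the other half is symmetric. Then
$$\iint_R |K(x, y)|^2\,dx\,dy = \iint_R \frac{|\varphi_+(x)\varphi_-(y)|^2}{w^2}\,dx\,dy.$$
Now $w$ is a nonzero constant, as we saw above; $\varphi_+$ and $\varphi_-$
are both continuous on a compact set and hence bounded. Thus, the integrand
is bounded, and the region of integration is compact, so the integral is
finite. So $T$ is Hilbert-Schmidt. The symmetry of $T$ is immediately evident
from its definition. Now suppose $f \in C([a, b])$. Then
$Tf \in C^2([a, b])$: $K$ is $C^2$ on each of the closed triangles
$\{y \le x\}$ and $\{x \le y\}$ (though only continuous across the diagonal),
so after splitting the integral at $y = x$ each piece can be differentiated
directly. Explicitly,
$$Tf(x) = \int_a^b K(x, y)f(y)\,dy
= \frac{1}{w}\int_a^x \varphi_+(x)\varphi_-(y)f(y)\,dy
+ \frac{1}{w}\int_x^b \varphi_-(x)\varphi_+(y)f(y)\,dy$$
$$= \frac{\varphi_+(x)}{w}\int_a^x \varphi_-(y)f(y)\,dy
+ \frac{\varphi_-(x)}{w}\int_x^b \varphi_+(y)f(y)\,dy,$$
so
$$(Tf)'(x) = \frac{\varphi_+'(x)}{w}\int_a^x \varphi_-(y)f(y)\,dy
+ \frac{\varphi_+(x)}{w}\varphi_-(x)f(x)
+ \frac{\varphi_-'(x)}{w}\int_x^b \varphi_+(y)f(y)\,dy
- \frac{\varphi_-(x)}{w}\varphi_+(x)f(x)$$
$$= \frac{\varphi_+'(x)}{w}\int_a^x \varphi_-(y)f(y)\,dy
+ \frac{\varphi_-'(x)}{w}\int_x^b \varphi_+(y)f(y)\,dy$$
and
$$(Tf)''(x) = \frac{\varphi_+''(x)}{w}\int_a^x \varphi_-(y)f(y)\,dy
+ \frac{\varphi_+'(x)}{w}\varphi_-(x)f(x)
+ \frac{\varphi_-''(x)}{w}\int_x^b \varphi_+(y)f(y)\,dy
- \frac{\varphi_-'(x)}{w}\varphi_+(x)f(x)$$
$$= f(x)\frac{w}{w}
+ \frac{q(x)\varphi_+(x)}{w}\int_a^x \varphi_-(y)f(y)\,dy
+ \frac{q(x)\varphi_-(x)}{w}\int_x^b \varphi_+(y)f(y)\,dy$$
$$= f(x) + q(x)\int_a^b K(x, y)f(y)\,dy$$
$$= f(x) + q(x)(Tf)(x)$$
so $L(Tf) = (Tf)'' - q(Tf) = f$.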
(d) This is more just an observation in the problem statement than some-
thing for me to do.
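The identity $L(Tf) = f$ from part (c) can be checked numerically in the simplest case. The sketch below is an illustration under assumptions that are not in the original: $q \equiv 0$ on $[0, 1]$, so that $\varphi_-(x) = x$, $\varphi_+(x) = x - 1$ and $w = 1$; the function names are hypothetical. With $q = 0$ the claim reduces to $(Tf)'' = f$ with $Tf(0) = Tf(1) = 0$.

```python
import math

def phi_minus(x):   # solves phi'' = 0 with phi(0) = 0
    return x

def phi_plus(x):    # solves phi'' = 0 with phi(1) = 0
    return x - 1.0

W = 1.0  # Wronskian phi_plus' * phi_minus - phi_minus' * phi_plus = x - (x - 1)

def K(x, y):
    """Green's kernel for q = 0 on [0, 1]."""
    if x <= y:
        return phi_minus(x) * phi_plus(y) / W
    return phi_plus(x) * phi_minus(y) / W

def apply_T(f, x, n=1000):
    """Trapezoid rule for (Tf)(x), split at y = x where K has a kink."""
    def trap(lo, hi):
        if hi <= lo:
            return 0.0
        h = (hi - lo) / n
        s = 0.5 * (K(x, lo) * f(lo) + K(x, hi) * f(hi))
        s += sum(K(x, lo + j * h) * f(lo + j * h) for j in range(1, n))
        return s * h
    return trap(0.0, x) + trap(x, 1.0)

def second_difference(f, x, h=0.05):
    """Central second difference of Tf at x, approximating (Tf)''(x)."""
    return (apply_T(f, x + h) - 2 * apply_T(f, x) + apply_T(f, x - h)) / h ** 2

one = lambda y: 1.0
print(apply_T(one, 0.3), 0.3 * (0.3 - 1) / 2)   # Tf(x) = x(x-1)/2 when f = 1
print(second_difference(one, 0.3))              # close to f(0.3) = 1
```

For $f(y) = \sin(\pi y)$ the same routine should reproduce $Tf(x) = -\sin(\pi x)/\pi^2$ up to quadrature error.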

Chapter 5.5, Page 253


Exercise 1: Suppose $f \in L^2(\mathbb{R}^d)$ and $k \in L^1(\mathbb{R}^d)$.
(a) Show that $(f * k)(x) = \int f(x - y)k(y)\,dy$ converges for a.e. $x$.
(b) Prove that $\|f * k\|_2 \le \|f\|_2\|k\|_1$.
(c) Establish $\widehat{(f * k)}(\xi) = \hat{k}(\xi)\hat{f}(\xi)$ for a.e.
$\xi$.
(d) The operator $Tf = f * k$ is a Fourier multiplication operator with
multiplier $m(\xi) = \hat{k}(\xi)$.
Solution.
(a) This will follow from part (b) (applied to $|f|$ and $|k|$) because an
$L^2$ function must be finite almost everywhere (which will prove a.e.
convergence of $\int |f(x - y)||k(y)|\,dy$), and absolutely convergent
integrals are convergent.
(b) Just so I have it for future reference, why don't I prove the $L^p$
version of this. Suppose $f \in L^p$ with $1 < p < \infty$ and $k \in L^1$.
Let $q$ be the conjugate exponent of $p$. Then
$$\|f * k\|_p^p = \int \left|\int f(x - y)k(y)\,dy\right|^p dx$$
$$\le \int \left(\int |f(x - y)||k(y)|^{1/p}|k(y)|^{1/q}\,dy\right)^p dx$$
$$\le \int \left(\|f(x - y)k(y)^{1/p}\|_p\,\|k(y)^{1/q}\|_q\right)^p dx$$
$$= \int \left(\int |f(x - y)|^p|k(y)|\,dy\right)\left(\int |k(y)|\,dy\right)^{p/q} dx$$
$$= \|k\|_1^{p/q}\,\bigl\||f|^p * |k|\bigr\|_1$$
$$\le \|k\|_1^{p/q}\,\|k\|_1\,\bigl\||f|^p\bigr\|_1$$
$$= \|k\|_1^{p}\,\|f\|_p^p.$$
(The norms in the third line are taken in the variable $y$.) Here we have
used Hölder's inequality on $|f||k|^{1/p} \in L^p$ and $|k|^{1/q} \in L^q$,
as well as the bound for the $L^1$ norm of a convolution of $L^1$ functions.
In the case $p = q = 2$, Hölder's inequality reduces to Cauchy-Schwarz.
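(c) If $f \in L^1 \cap L^2$, we already know this from our theory of Fourier
transforms on $L^1$. Otherwise, since $L^1 \cap L^2$ is dense in $L^2$, we
may take a sequence $f_n \in L^1 \cap L^2$ with $f_n \to f$ in $L^2$. By
Plancherel and part (b),
$$\|\widehat{f * k} - \widehat{f_n * k}\|_2 = \|(f - f_n) * k\|_2
\le \|f - f_n\|_2\,\|k\|_1 \to 0,$$
so $\widehat{f_n * k} \to \widehat{f * k}$ in $L^2$. Since $\hat{k}$ is
bounded by $\|k\|_1$, we also have $\hat{k}\hat{f_n} \to \hat{k}\hat{f}$ in
$L^2$. Hence, with all limits taken in $L^2$,
$$\widehat{f * k} = \lim \widehat{f_n * k} = \lim \hat{f_n}\,\hat{k}
= \hat{k}\,\lim \hat{f_n} = \hat{k}\,\hat{f},$$
which gives equality for a.e. $\xi$.
(d) This is just the definition of a Fourier multiplication operator applied
to part (c).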
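The inequality $\|f * k\|_p \le \|f\|_p\|k\|_1$ from part (b) has an exact analogue for finitely supported sequences (counting measure on $\mathbb{Z}$), which makes it easy to test. The following sketch is only an illustration under that discrete assumption; the helper names and the random data are hypothetical.

```python
import random

def convolve(f, k):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(f) + len(k) - 1)
    for i, fi in enumerate(f):
        for j, kj in enumerate(k):
            out[i + j] += fi * kj
    return out

def norm_p(v, p):
    """The l^p norm of a finite sequence."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

random.seed(0)
f = [random.uniform(-1, 1) for _ in range(50)]
k = [random.uniform(-1, 1) for _ in range(20)]

for p in (2, 3):
    lhs = norm_p(convolve(f, k), p)
    rhs = norm_p(f, p) * norm_p(k, 1)
    print(p, lhs <= rhs)
```

A passing check for random data is of course not a proof; it only guards against sign or exponent slips when reading the chain of inequalities above.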

Exercise 3: Let $F(z)$ be a bounded holomorphic function in the half-plane.
Show in two ways that $\lim_{y\to 0} F(x + iy)$ exists for a.e. $x$.
(a) By using the fact that $F(z)/(z + i)$ is in $H^2(\mathbb{R}^2_+)$.
(b) By noting that $G(z) = F\left(i\frac{1-z}{1+z}\right)$ is a bounded
holomorphic function in the unit disc, and using Exercise 17 in the previous
chapter.
Solution.
(a) Since $|F(z)| \le M$ for some $M$,
$$\left|\frac{F(x + iy)}{x + i(y + 1)}\right| \le \frac{M}{\sqrt{x^2 + 1}}$$
so
$$\int_{-\infty}^{\infty} \left|\frac{F(x + iy)}{x + i(y + 1)}\right|^2 dx
\le \int_{-\infty}^{\infty} \frac{M^2}{x^2 + 1}\,dx = M^2\pi.$$
Hence $\frac{F(z)}{z+i} \in H^2(\mathbb{R}^2_+)$. This implies that
$$\lim_{y\to 0} \frac{F(x + iy)}{x + i(1 + y)}$$
exists a.e., which in turn implies that $\lim F(x + iy)$ exists a.e.
(b) I assume that I can take for granted that $z \mapsto i\frac{1-z}{1+z}$
is a conformal mapping of the unit disc onto the upper half plane, since we
did this on a previous homework. Then define
$G(w) = F\left(i\frac{1-w}{1+w}\right)$ which is a bounded holomorphic
function on $D$. It now suffices to show that $w$ approaches the unit circle
non-tangentially as $y = \mathrm{Im}(z) \to 0$, where $w$ is now given by the
inverse mapping
$$w = \frac{i - z}{i + z} = \frac{-x + (1 - y)i}{x + (1 + y)i}.$$
Then
$$|w|^2 = \frac{x^2 + (1 - y)^2}{x^2 + (1 + y)^2}$$
by straightforward arithmetic. Now if $w$ were approaching the unit circle
in a tangential manner, we would have
$\frac{d|w|^2}{dy}\big|_{y=0} = 0$. However,
$$\frac{d|w|^2}{dy} = \frac{4(y^2 - x^2 - 1)}{(x^2 + (1 + y)^2)^2}$$
which is nonzero at $y = 0$.
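The derivative of $|w|^2$ in part (b) is easy to get wrong by a sign, so here is a small finite-difference check. It assumes the inverse Cayley map $w = (i - z)/(i + z)$ with $z = x + iy$, which is what inverting $z = i\frac{1-w}{1+w}$ gives; the function names are hypothetical.

```python
def w_of(x, y):
    """Inverse of z = i(1 - w)/(1 + w), evaluated at z = x + iy."""
    z = complex(x, y)
    return (1j - z) / (1j + z)

def mod_sq(x, y):
    return abs(w_of(x, y)) ** 2

def d_mod_sq_dy(x, y):
    """Closed form 4(y^2 - x^2 - 1) / (x^2 + (1 + y)^2)^2."""
    return 4 * (y * y - x * x - 1) / (x * x + (1 + y) ** 2) ** 2

def central_difference(x, y, h=1e-6):
    return (mod_sq(x, y + h) - mod_sq(x, y - h)) / (2 * h)

for x in (-2.0, 0.0, 0.7):
    for y in (0.0, 0.5):
        print(x, y, d_mod_sq_dy(x, y), central_difference(x, y))
```

At $y = 0$ the closed form reduces to $-4/(x^2 + 1)$, which is never zero, consistent with the non-tangential approach claimed above.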

Exercise 4: Consider $F(z) = e^{i/z}/(z + i)$ in the upper half-plane. Note
that $F(x + iy) \in L^2(\mathbb{R})$, for each $y > 0$ and $y = 0$. Observe
also that $F(z) \to 0$ as $|z| \to \infty$. However,
$F \notin H^2(\mathbb{R}^2_+)$. Why?
Solution. For any fixed $y > 0$,
$$\left|\frac{e^{i/(x+iy)}}{x + i(1 + y)}\right| \le \frac{e^{1/y}}{\sqrt{x^2 + 1}}$$
which is square-integrable, so $F(x + iy) \in L^2(\mathbb{R})$. For $y = 0$,
$$\left|\frac{e^{i/x}}{x + i}\right|^2 = \frac{1}{|x + i|^2} = \frac{1}{x^2 + 1}$$
which is again integrable. Also, as $|z| \to \infty$, the numerator
approaches 1 in magnitude while the denominator becomes infinite. However,
$F \notin H^2$, as is suggested by the fact that our bound includes an
$e^{1/y}$ term, which blows up. The problem, of course, is that $F$ is not
bounded in the upper half plane; it has an essential singularity at 0, and
Picard's theorem tells us that it takes on every complex value (except
possibly one) in every neighborhood of the origin.
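To see the blow-up concretely, one can estimate $\int |F(x + iy)|^2\,dx$ for decreasing $y$; membership in $H^2(\mathbb{R}^2_+)$ would require these integrals to stay bounded in $y > 0$. The sketch below is a rough numerical illustration, not part of the original solution. It uses $|e^{i/z}| = e^{y/(x^2+y^2)}$, a midpoint rule on a truncated interval (both the cutoff `X` and the step count are arbitrary assumptions), and the elementary lower bound obtained by restricting to $|x| \le y$.

```python
import math

def slice_norm_sq(y, X=50.0, n=200000):
    """Midpoint-rule estimate of the integral of |F(x+iy)|^2 over [-X, X]."""
    h = 2 * X / n
    total = 0.0
    for j in range(n):
        x = -X + (j + 0.5) * h
        total += math.exp(2 * y / (x * x + y * y)) / (x * x + (1 + y) ** 2)
    return total * h

def lower_bound(y):
    """On |x| <= y the integrand is at least exp(1/y) / (y^2 + (1+y)^2)."""
    return 2 * y * math.exp(1 / y) / (y * y + (1 + y) ** 2)

for y in (1.0, 0.5, 0.2, 0.1):
    print(y, slice_norm_sq(y), lower_bound(y))
```

The lower bound $2y\,e^{1/y}/(y^2 + (1+y)^2)$ already tends to infinity as $y \to 0$, so $\sup_{y>0} \int |F(x+iy)|^2\,dx = \infty$.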
Exercise 6: Suppose $\Omega$ is an open set in $\mathbb{C} = \mathbb{R}^2$,
and let H be the subspace of $L^2(\Omega)$ consisting of holomorphic
functions on $\Omega$. Show that H is a closed subspace of $L^2(\Omega)$, and
hence is a Hilbert space with inner product
$$(f, g) = \int_\Omega f(z)\overline{g(z)}\,dx\,dy, \quad\text{where } z = x + iy.$$
Solution. For any $f \in H$, $f^2$ is holomorphic as well, since the square
of an analytic function is analytic. Now by the mean value property, for any
$z \in \Omega$ and $r \le d(z, \Omega^c)$,
$$f(z)^2 = \frac{1}{\pi r^2}\iint_{|z-\zeta|\le r} f(\zeta)^2\,dA,$$
whence
$$|f(z)|^2 \le \frac{1}{\pi r^2}\iint_{|z-\zeta|\le r} |f(\zeta)|^2\,dA
\le \frac{1}{\pi r^2}\|f\|^2.$$
Now on any compact $K \subset \Omega$, there is a minimum value $r_0 > 0$ of
$d(z, \Omega^c)$ for $z \in K$. (This is because the distance between a
compact set and a disjoint closed set always attains a nonzero minimum.)
Then we have $|f(z)| \le \frac{1}{\sqrt{\pi}\,r_0}\|f\|$ for all $z \in K$
and $f \in H$. So if $f_n$ is a Cauchy sequence in H, then
$\|f_m - f_n\| \to 0$, whence $|f_m - f_n| \to 0$ uniformly on $K$ as well.
Thus, $f_n$ converges uniformly on any compact subset of $\Omega$. Now it is
a theorem in complex analysis that the uniform limit of analytic functions
is analytic; this may be proved, for example, by using the ML estimate to
show that the integral of the limit around any contour is zero, and then
applying Morera's theorem. (See e.g. Gamelin p. 136.) This theorem works on
any domain, e.g. the interior of any compact disc contained in $\Omega$.
This allows us to prove that the limit of $f_n$ is analytic at each point in
$\Omega$, so it's analytic on $\Omega$.
Exercise 7: Following up on the previous exercise, prove:
(a) If $\{\varphi_n\}_{n=0}^{\infty}$ is an orthonormal basis of H, then
$$\sum_{n=0}^{\infty} |\varphi_n(z)|^2 \le \frac{c^2}{d(z, \Omega^c)^2}
\quad\text{for } z \in \Omega.$$
(b) The sum
$$B(z, w) = \sum_{n=0}^{\infty} \varphi_n(z)\overline{\varphi_n(w)}$$
converges absolutely for $(z, w) \in \Omega \times \Omega$, and is
independent of the choice of the orthonormal basis $\{\varphi_n\}$ of H.
(c) To prove (b) it is useful to characterize the function $B(z, w)$, called
the Bergman kernel, by the following property. Let $T$ be the linear
transformation on $L^2(\Omega)$ defined by
$$Tf(z) = \int_\Omega B(z, w)f(w)\,du\,dv, \quad w = u + iv.$$
Then $T$ is the orthogonal projection of $L^2(\Omega)$ to H.
(d) Suppose that $\Omega$ is the unit disc. Then $f \in H$ exactly when
$f(z) = \sum_{n=0}^{\infty} a_n z^n$, with
$$\sum_{n=0}^{\infty} |a_n|^2 (n + 1)^{-1} < \infty.$$
Also, the sequence
$\left\{ z^n \frac{(n+1)^{1/2}}{\pi^{1/2}} \right\}_{n=0}^{\infty}$ is an
orthonormal basis of H. Moreover, in this case
$$B(z, w) = \frac{1}{\pi(1 - z\overline{w})^2}.$$
Solution.
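(a) First we prove a lemma, whose relevance was so kindly pointed out by
Prof. Garnett:
Lemma 3. Let $\{b_n\}_{n=0}^{\infty}$ be a sequence of complex numbers. Then
$$\Bigl(\sum_{n=0}^{\infty} |b_n|^2\Bigr)^{1/2}
= \sup_{\sum |a_n|^2 \le 1} \Bigl|\sum_{n=0}^{\infty} a_n b_n\Bigr|.$$
Proof. If $\sum |b_n|^2 < \infty$ this follows from the Cauchy-Schwarz
inequality applied to $\ell^2(\mathbb{N})$, where equality is achieved when
$\{a_n\}$ is the unit vector in the same direction (actually, the conjugate)
as $\{b_n\}$. Now suppose $\sum |b_n|^2 = \infty$. Then for any $N$, define
the truncated sequence $\tilde{b}^{(N)} = \{\tilde{b}^{(N)}_n\}$ by
$$\tilde{b}^{(N)}_n = \begin{cases} b_n & n \le N \\ 0 & \text{else.} \end{cases}$$
Then $\tilde{b}^{(N)} \in \ell^2$, so if
$a^{(N)} = \{a^{(N)}_n\}_{n=0}^{\infty}$ is the unit vector in the conjugate
direction of $\tilde{b}^{(N)}$, we have
$$\Bigl|\sum_{n=0}^{\infty} a^{(N)}_n b_n\Bigr|
= \Bigl|\sum_{n=0}^{\infty} a^{(N)}_n \tilde{b}^{(N)}_n\Bigr|
= \|\tilde{b}^{(N)}\|.$$
Since this goes to infinity as $N \to \infty$, we have
$$\sup_{\sum |a_n|^2 \le 1} \Bigl|\sum a_n b_n\Bigr| = \infty
= \Bigl(\sum_{n=0}^{\infty} |b_n|^2\Bigr)^{1/2}. \qquad\square$$
Returning to the problem at hand, for any sequence $\{a_n\}$ with
$\sum |a_n|^2 \le 1$,
$$g(z) = \sum_{n=0}^{\infty} a_n\varphi_n(z)$$
is a vector of norm at most 1 in H. Applying Exercise 6, we have at any
fixed $z$ that
$$\Bigl|\sum_{n=0}^{\infty} a_n\varphi_n(z)\Bigr| = |g(z)|
\le \frac{1}{\sqrt{\pi}\,d(z, \Omega^c)}\|g\|
\le \frac{1}{\sqrt{\pi}\,d(z, \Omega^c)}.$$
Applying the lemma with $b_n = \varphi_n(z)$, we have
$$\sum_{n=0}^{\infty} |\varphi_n(z)|^2
= \Bigl(\sup_{\sum |a_n|^2 \le 1} \Bigl|\sum_{n=0}^{\infty} a_n\varphi_n(z)\Bigr|\Bigr)^2
\le \frac{1}{\pi\,d(z, \Omega^c)^2}.$$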
(b) The absolute convergence of this sum follows from part (a) and the
Cauchy-Schwarz inequality: For fixed values of $z$ and $w$,
$\{|\varphi_n(z)|\}$ and $\{|\varphi_n(w)|\}$ are vectors in $\ell^2$, so by
the Cauchy-Schwarz inequality,
$$\sum |\varphi_n(z)\overline{\varphi_n(w)}|
\le \Bigl(\sum |\varphi_n(z)|^2\Bigr)^{1/2}\Bigl(\sum |\varphi_n(w)|^2\Bigr)^{1/2} < \infty.$$
To prove that the sum is independent of the choice of basis, we use part
(c). Because integration against this sum is projection onto H, and there is
only one projection map, any two such sums must be equal almost everywhere.
I'm not 100% sure how to go about showing they are in fact equal everywhere.
Certainly $B(z, w)$ is analytic in $z$ and anti-analytic in $w$ (with $w$
fixed it is in H as a function of $z$, and with $z$ fixed its conjugate is in
H as a function of $w$). However, I don't know anything about functions of
several complex variables; is a function that's analytic in each variable
separately necessarily analytic? Or, more to the point, continuous? Assuming
so, continuity plus equality almost everywhere implies equality. Of course,
one could say that since H is being viewed as a subspace of $L^2$, a.e.
equality is all we need for the functions to be the same point in the
Hilbert space.
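(c) Since $\{\varphi_n\}$ is an ONB for the closed subspace H, we can extend
it to a basis for all of $L^2$ by complementing it with another set
$\{\psi_k\}$ of orthonormal vectors. For $z \in \Omega$, define
$B_z(w) = \overline{B(z, w)}$. Then
$$Tf(z) = \int_\Omega \overline{B_z(w)}f(w)\,dw = \langle f, B_z\rangle.$$
We can write $f$ in our ONB as
$f(w) = \sum_{n=0}^{\infty} a_n\varphi_n(w) + \sum_{k=0}^{\infty} b_k\psi_k(w)$
whence
$$Tf(z) = \langle f, B_z\rangle$$
$$= \Bigl\langle \sum_{n=0}^{\infty} a_n\varphi_n + \sum_{k=0}^{\infty} b_k\psi_k,\
\sum_{j=0}^{\infty} \overline{\varphi_j(z)}\varphi_j \Bigr\rangle$$
$$= \sum_{j,n} a_n\varphi_j(z)\langle\varphi_n, \varphi_j\rangle
+ \sum_{k,j} b_k\varphi_j(z)\langle\psi_k, \varphi_j\rangle$$
$$= \sum_n a_n\varphi_n(z).$$
This is the formula for projection onto a closed subspace: $T$ erases all
the components in the orthogonal complement.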
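(d) The set $\{\varphi_n\} = \left\{ z^n\sqrt{\frac{n+1}{\pi}} \right\}$ is
orthonormal since
$$\langle\varphi_n, \varphi_n\rangle = \frac{n+1}{\pi}\int_D |z|^{2n}\,dA
= \frac{n+1}{\pi}\int_{r=0}^{1}\int_{\theta=0}^{2\pi} r^{2n}\,r\,dr\,d\theta
= 2(n+1)\,\frac{r^{2n+2}}{2n+2}\Big|_0^1 = 1$$
and
$$\langle\varphi_n, \varphi_m\rangle = \frac{\sqrt{(m+1)(n+1)}}{\pi}
\int_{r=0}^{1}\int_{\theta=0}^{2\pi} r^{n+m}e^{i(n-m)\theta}\,r\,dr\,d\theta = 0$$
for $m \neq n$. Now since every analytic function has a power series
expansion, any analytic function can be written as $\sum b_n\varphi_n$. This
proves that $\{\varphi_n\}$ is a basis for H, and also gives us the condition
for an analytic function to be in $L^2$:
$\sum |b_n|^2 < \infty \Leftrightarrow \sum \frac{|a_n|^2}{n+1} < \infty$,
since $b_n = a_n\sqrt{\frac{\pi}{n+1}}$.
To obtain an expression for $B(z, w)$, we first note that for any complex
number $\zeta$ with $|\zeta| < 1$,
$$\frac{1}{(1 - \zeta)^2} = \sum_{n=0}^{\infty} (n + 1)\zeta^n.$$
This may be obtained by differentiating the series
$\frac{1}{1-\zeta} = \sum \zeta^n$ termwise, or by squaring it and
collecting like terms. Both are justified by the uniform absolute
convergence of this series on compact subdisks of the unit disk. Then
$$B(z, w) = \sum_{n=0}^{\infty} \sqrt{\frac{n+1}{\pi}}\,z^n\,\sqrt{\frac{n+1}{\pi}}\,\overline{w}^n
= \frac{1}{\pi}\sum_{n=0}^{\infty} (n + 1)(z\overline{w})^n
= \frac{1}{\pi(1 - z\overline{w})^2}.$$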
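Exercise 8: Continuing with Exercise 6, suppose $\Omega$ is the upper
half-plane $\mathbb{R}^2_+$. Then every $f \in H$ has the representation
$$f(z) = \sqrt{4\pi}\int_0^{\infty} \hat{f}_0(\xi)e^{2\pi i\xi z}\,d\xi,
\quad z \in \mathbb{R}^2_+,$$
where $\int_0^{\infty} |\hat{f}_0(\xi)|^2\,\frac{d\xi}{\xi} < \infty$.
Moreover, the mapping $\hat{f}_0 \mapsto f$ given by this formula is a
unitary mapping from $L^2((0, \infty), \frac{d\xi}{\xi})$ to H.
Solution. Following the proof of Theorem 2.1 on page 214, we define
$\hat{f}_y(\xi)$ to be the Fourier transform of the $L^2$ function
$f(x + iy)$. (We know $f(x + iy)$ is an $L^2$ function of $x$ for almost all
$y$ since
$$\|f\|_2^2 = \int\left(\int |f(x + iy)|^2\,dx\right)dy$$
so that $\int |f(x + iy)|^2\,dx$ is an integrable function of $y$, and
therefore finite almost everywhere.) Then we can show that
$\hat{f}_y(\xi)e^{2\pi y\xi}$ is independent of $y$ using exactly the same
proof in the book. (Our proof of the boundedness of $f$ on closed
half-planes changes slightly: we now have
$$|f(\zeta)|^2 \le \frac{1}{\pi\delta^2}\int_{|z|<\delta} |f(\zeta + z)|^2\,dx\,dy
\le \frac{1}{\pi\delta^2}\|f\|_2^2.$$
Other than that the proof requires no modification.) Having established
this, we can then define $\hat{f}_0(\xi)$ to be the function that equals
$\hat{f}_y(\xi)e^{2\pi y\xi}$ for almost all $y$. The Plancherel formula then
gives us
$$\int_{-\infty}^{\infty} |f(x + iy)|^2\,dx
= \int_{-\infty}^{\infty} |\hat{f}_0(\xi)|^2 e^{-4\pi y\xi}\,d\xi.$$
This tells us that $\hat{f}_0(\xi) = 0$ for a.a. $\xi < 0$ (since otherwise
the integral over $y$ of $e^{-4\pi y\xi}$ would be infinite for $\xi < 0$),
and also gives us the relation
$$\|f\|_2^2 = \int_0^{\infty}\int_{-\infty}^{\infty} |f(x + iy)|^2\,dx\,dy$$
$$= \int_0^{\infty}\int_0^{\infty} |\hat{f}_0(\xi)|^2 e^{-4\pi y\xi}\,d\xi\,dy$$
$$\overset{\text{Tonelli}}{=} \int_0^{\infty} |\hat{f}_0(\xi)|^2\int_0^{\infty} e^{-4\pi y\xi}\,dy\,d\xi$$
$$= \int_0^{\infty} |\hat{f}_0(\xi)|^2\,\frac{1}{4\pi\xi}\,d\xi.$$
This tells us that
$\|f\| = \left\|\frac{1}{\sqrt{4\pi}}\hat{f}_0\right\|_{L^2((0,\infty),\,d\xi/\xi)}$.
We also have by Fourier inversion that
$f(z) = \sqrt{4\pi}\int_0^{\infty} \frac{1}{\sqrt{4\pi}}\hat{f}_0(\xi)e^{2\pi i\xi z}\,d\xi$.
If we replace $\hat{f}_0$ by $\frac{1}{\sqrt{4\pi}}\hat{f}_0$, we will have a
unitary map $\hat{f}_0 \mapsto f$, and
$$f(z) = \sqrt{4\pi}\int_0^{\infty} \hat{f}_0(\xi)e^{2\pi i\xi z}\,d\xi.$$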

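Exercise 9: Let $H$ be the Hilbert transform. Verify that
(a) $H^* = -H$, $H^2 = -I$, and $H$ is unitary.
(b) If $\tau_h$ denotes the translation operator,
$\tau_h(f)(x) = f(x - h)$, then $H$ commutes with $\tau_h$,
$\tau_h H = H\tau_h$.
(c) If $\delta_a$ denotes the dilation operator,
$\delta_a(f)(x) = f(ax)$ with $a > 0$, then $H$ commutes with $\delta_a$,
$\delta_a H = H\delta_a$.
Solution.
(a) Since the projection $P$ and the identity $I$ are both self-adjoint,
$2P - I$ is self-adjoint, so $H = -i(2P - I)$ is skew-adjoint.
(b) Since $H$ is a linear combination of $I$ and $P$, it suffices to verify
that both of these commute with $\tau_h$. For $I$ this is trivial. For $P$,
writing $\chi$ for the characteristic function of $(0, \infty)$, we have
$$\widehat{P(\tau_h f)}(\xi) = \chi(\xi)\widehat{\tau_h f}(\xi)
= \chi(\xi)e^{-2\pi ih\xi}\hat{f}(\xi)$$
and
$$\widehat{\tau_h P(f)}(\xi) = e^{-2\pi ih\xi}\widehat{Pf}(\xi)
= e^{-2\pi ih\xi}\chi(\xi)\hat{f}(\xi).$$
Since the Fourier transform on $L^2$ is invertible, this implies
$\tau_h Pf = P\tau_h f$, so $P$ commutes with $\tau_h$.
(c) Again, it suffices to verify that $P$ commutes with dilations.
$$\widehat{\delta_a P(f)}(\xi) = a^{-1}\widehat{Pf}(\xi/a)
= a^{-1}\chi(\xi/a)\hat{f}(\xi/a) = a^{-1}\chi(\xi)\hat{f}(\xi/a)$$
where $\chi(\xi/a) = \chi(\xi)$ because $a > 0$. Similarly,
$$\widehat{P\delta_a f}(\xi) = \chi(\xi)\widehat{\delta_a f}(\xi)
= \chi(\xi)a^{-1}\hat{f}(\xi/a).$$
Hence $P$ commutes with dilations.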

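Exercise 15: Suppose $f \in L^2(\mathbb{R}^d)$. Prove that there exists
$g \in L^2(\mathbb{R}^d)$ such that
$$\left(\frac{\partial}{\partial x}\right)^{\alpha} f(x) = g(x)$$
in the weak sense, if and only if
$$(2\pi i\xi)^{\alpha}\hat{f}(\xi) = \hat{g}(\xi) \in L^2(\mathbb{R}^d).$$
Solution. (Help from Kenny Maples.) Let
$L = \left(\frac{\partial}{\partial x}\right)^{\alpha}$. Then
$L^* = (-1)^{|\alpha|}\left(\frac{\partial}{\partial x}\right)^{\alpha}$.
Note in particular that
$$\widehat{L^*\psi}(\xi)
= (-1)^{|\alpha|}\,\widehat{\left(\tfrac{\partial}{\partial x}\right)^{\alpha}\psi}(\xi)
= (-1)^{|\alpha|}(2\pi i\xi)^{\alpha}\hat{\psi}(\xi)
= (-2\pi i\xi)^{\alpha}\hat{\psi}(\xi).$$
Now suppose $\hat{g} = \hat{f}(\xi)(2\pi i\xi)^{\alpha} \in L^2$. Define
$g \in L^2$ as the inverse Fourier transform of $\hat{g}$. Using
Plancherel's identity, for any $\psi \in C_0^{\infty}$ we have
$$\langle g, \psi\rangle = \langle\hat{g}, \hat{\psi}\rangle$$
$$= \int \hat{g}(\xi)\overline{\hat{\psi}(\xi)}\,d\xi$$
$$= \int \hat{f}(\xi)(2\pi i\xi)^{\alpha}\overline{\hat{\psi}(\xi)}\,d\xi$$
$$= \int \hat{f}(\xi)\overline{(-2\pi i\xi)^{\alpha}\hat{\psi}(\xi)}\,d\xi$$
$$= \langle\hat{f}, \widehat{L^*\psi}\rangle$$
$$= \langle f, L^*\psi\rangle.$$
Hence $g = Lf$ weakly.
Conversely, suppose there exists $g \in L^2$ such that $g = Lf$ weakly. Using
Plancherel again,
$$\int \hat{g}(\xi)\overline{\hat{\psi}(\xi)}\,d\xi = \langle\hat{g}, \hat{\psi}\rangle$$
$$= \langle g, \psi\rangle$$
$$= \langle f, L^*\psi\rangle$$
$$= \langle\hat{f}, \widehat{L^*\psi}\rangle$$
$$= \int \hat{f}(\xi)(2\pi i\xi)^{\alpha}\overline{\hat{\psi}(\xi)}\,d\xi.$$
Since this is true for all $\psi \in C_0^{\infty}$, we must have
$\hat{g}(\xi) = \hat{f}(\xi)(2\pi i\xi)^{\alpha}$ a.e. Since $g \in L^2$,
$\hat{g} \in L^2$ by Plancherel, so
$\hat{f}(\xi)(2\pi i\xi)^{\alpha} = \hat{g}(\xi) \in L^2$.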
Chapter 5.6, Page 259
Problem 6: This problem provides an example of the contrast between
analysis on $L^1(\mathbb{R}^d)$ and $L^2(\mathbb{R}^d)$.
Recall that if $f$ is locally integrable on $\mathbb{R}^d$, the maximal
function $f^*$ is defined by
$$f^*(x) = \sup_{x\in B} \frac{1}{m(B)}\int_B |f(y)|\,dy,$$
where the supremum is taken over all balls containing the point $x$.
Complete the following outline to prove that there exists a constant $C$
so that
$$\|f^*\|_2 \le C\|f\|_2.$$
In other words, the map that takes $f$ to $f^*$ (although not linear) is
bounded on $L^2(\mathbb{R}^d)$. This differs notably from the situation in
$L^1(\mathbb{R}^d)$, as we observed in Chapter 3.
(a) For each $\alpha > 0$, prove that if $f \in L^2(\mathbb{R}^d)$, then
$$m(\{x : f^*(x) > \alpha\}) \le \frac{2A}{\alpha}\int_{|f|>\alpha/2} |f(x)|\,dx.$$
Here, $A = 3^d$ will do.
(b) Show that
$$\int_{\mathbb{R}^d} |f^*(x)|^2\,dx = 2\int_0^{\infty} \alpha\,m(E_\alpha)\,d\alpha,$$
where $E_\alpha = \{x : f^*(x) > \alpha\}$.
(c) Prove that $\|f^*\|_2 \le C\|f\|_2$.
Solution.
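(a) Let $G_\alpha = \{x : |f(x)| > \frac{\alpha}{2}\}$. Then
$1 \le \frac{2}{\alpha}|f|$ on $G_\alpha$, so
$$\int_{G_\alpha} |f(y)|\,dy \le \int_{G_\alpha} \frac{2}{\alpha}|f(y)|^2\,dy
\le \frac{2}{\alpha}\|f\|_2^2 < \infty.$$
Now let $E_\alpha = \{x : f^*(x) > \alpha\}$. For any $x \in E_\alpha$, there
exists a ball $B_x$ with $x \in B_x$ and
$$m(B_x) < \frac{1}{\alpha}\int_{B_x} |f(y)|\,dy
\le \frac{1}{\alpha}\left(\frac{\alpha}{2}m(B_x) + \int_{G_\alpha\cap B_x} |f(y)|\,dy\right)$$
$$\Rightarrow\ m(B_x) < \frac{2}{\alpha}\int_{G_\alpha\cap B_x} |f(y)|\,dy.$$
Here we have broken up the integral into the integral over the portion of
$B_x$ where $|f| \le \frac{\alpha}{2}$ and the region where
$|f| > \frac{\alpha}{2}$ and bounded each portion. Now let $K$ be any compact
subset of $E_\alpha$; then $K$ is covered by finitely many balls
$B_{x_1}, \dots, B_{x_N}$. By the Covering Lemma, there exists a disjoint
subcollection $B_{x_{n_1}}, \dots, B_{x_{n_M}}$ such that
$$m\Bigl(\bigcup_{i=1}^{N} B_{x_i}\Bigr) \le 3^d\sum_{j=1}^{M} m(B_{x_{n_j}}).$$
Then
$$m(K) \le 3^d\sum_{j=1}^{M} m(B_{x_{n_j}})
\le \frac{2\cdot 3^d}{\alpha}\sum_{j=1}^{M}\int_{G_\alpha\cap B_{x_{n_j}}} |f(y)|\,dy
\le \frac{2\cdot 3^d}{\alpha}\int_{G_\alpha} |f(y)|\,dy.$$
By the regularity of Lebesgue measure,
$$m(E_\alpha) = \sup_{K\subset E_\alpha \text{ cpct}} m(K)
\ \Rightarrow\ m(E_\alpha) \le \frac{2\cdot 3^d}{\alpha}\int_{G_\alpha} |f(y)|\,dy.$$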
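(b) Using Tonelli's theorem,
$$\int_{\mathbb{R}^d} |f^*(x)|^2\,dx
= \int_{\mathbb{R}^d}\int_0^{\infty} \chi_{\{|f^*(x)|^2 > y\}}\,dy\,dx
= \int_0^{\infty} m(\{x : |f^*(x)| > \sqrt{y}\})\,dy.$$
Substituting $\alpha = \sqrt{y}$, $dy = 2\alpha\,d\alpha$, this equals
$$2\int_0^{\infty} \alpha\,m(\{|f^*(x)| > \alpha\})\,d\alpha.$$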


(c)
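The identity in part (b) holds for any nonnegative measurable function, not just $f^*$, so it can be illustrated on a simple function $g = \sum_i v_i\chi_{A_i}$ with disjoint sets $A_i$. The sketch below uses made-up values and measures; the names are hypothetical and nothing here is specific to the maximal function.

```python
def distribution(values, measures, alpha):
    """m({g > alpha}) for a simple function with the given values and set measures."""
    return sum(m for v, m in zip(values, measures) if v > alpha)

def layer_cake(values, measures, n=6000):
    """Midpoint-rule estimate of 2 * integral_0^max alpha * m({g > alpha}) d(alpha)."""
    top = max(values)
    h = top / n
    total = 0.0
    for j in range(n):
        alpha = (j + 0.5) * h
        total += alpha * distribution(values, measures, alpha)
    return 2 * total * h

values = [0.5, 2.0, 3.0]      # the values v_i of g
measures = [1.0, 0.25, 2.0]   # the measures m(A_i)

l2_norm_sq = sum(v * v * m for v, m in zip(values, measures))
print(l2_norm_sq, layer_cake(values, measures))
```

Both numbers should agree with $\sum_i v_i^2\,m(A_i)$ up to quadrature error, which is small here because the jumps of the distribution function fall on grid points.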

Chapter 6.7, Page 312


Exercise 3: Consider the exterior Lebesgue measure $m_*$ introduced in
Chapter 1. Prove that a set $E \subset \mathbb{R}^d$ is Carathéodory
measurable if and only if $E$ is Lebesgue measurable in the sense of
Chapter 1.
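Exercise 4: Let $r$ be a rotation of $\mathbb{R}^d$. Using the fact that the
mapping $x \mapsto r(x)$ preserves Lebesgue measure (see Problem 4 in
Chapter 2 and Exercise 26 in Chapter 3), show that it induces a
measure-preserving map of the sphere $S^{d-1}$ with its measure $d\sigma$.
Solution. Let $E \subset S^{d-1}$. By definition,
$\sigma(E) = d\cdot m(\tilde{E})$ where $\tilde{E}$ is the union of all radii
with endpoints in $E$. Then if $r$ is a rotation of $\mathbb{R}^d$,
$\sigma(rE) = d\cdot m(\widetilde{rE})$ by definition. But
$\widetilde{rE} = r\tilde{E}$ since
$$x \in \widetilde{rE} \Leftrightarrow x = \alpha\theta \text{ for some } 0 < \alpha \le 1,\ \theta \in rE$$
$$\Leftrightarrow x = \alpha\,r(\theta'),\ 0 < \alpha \le 1,\ \theta' \in E$$
$$\Leftrightarrow x = r(\alpha\theta') \text{ (since rotations are linear)}$$
$$\Leftrightarrow x \in r(\tilde{E}).$$
Thus, $\sigma(rE) = d\cdot m(r\tilde{E}) = d\cdot m(\tilde{E}) = \sigma(E)$,
so $r$ preserves measures on the sphere.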
Exercise 5: Use the polar coordinate formula to prove the following:
(a) $\int_{\mathbb{R}^d} e^{-\pi|x|^2}\,dx = 1$, when $d = 2$. Deduce from
this that the same identity holds for all $d$.
(b) $\left(\int_0^{\infty} e^{-\pi r^2}r^{d-1}\,dr\right)\sigma(S^{d-1}) = 1$,
and as a result, $\sigma(S^{d-1}) = 2\pi^{d/2}/\Gamma(d/2)$.
(c) If $B$ is the unit ball,
$v_d = m(B) = \pi^{d/2}/\Gamma(d/2 + 1)$, since this quantity equals
$\left(\int_0^1 r^{d-1}\,dr\right)\sigma(S^{d-1})$.
Solution.
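(a) For $d = 2$, we have by polar coordinates
$$\int_{\mathbb{R}^2} e^{-\pi|x|^2}\,dx
= \int_{S^1}\int_0^{\infty} e^{-\pi r^2}r\,dr\,d\sigma
= 2\pi\int_0^{\infty} e^{-\pi r^2}r\,dr
= -e^{-\pi r^2}\Big|_0^{\infty} = 1.$$
Note that for general $d$,
$$\int_{\mathbb{R}^d} e^{-\pi|x|^2}\,dx
= \int_{\mathbb{R}^d} e^{-\pi(x_1^2 + \dots + x_d^2)}\,dx
= \int\dots\int e^{-\pi x_1^2}\cdots e^{-\pi x_d^2}\,dx_1\dots dx_d
= \left(\int_{\mathbb{R}} e^{-\pi x^2}\,dx\right)^d$$
by Tonelli's Theorem. Since we have calculated that this equals 1 for
$d = 2$, it follows that $\int_{\mathbb{R}} e^{-\pi x^2}\,dx = 1$, whence
the integral over $\mathbb{R}^d$ is 1 for all $d$.
(b) Using integration by parts, for $d \ge 3$,
$$\int_0^{\infty} e^{-\pi r^2}r^{d-1}\,dr
= -\frac{e^{-\pi r^2}}{2\pi}r^{d-2}\Big|_0^{\infty}
+ \int_0^{\infty} \frac{e^{-\pi r^2}}{2\pi}(d - 2)r^{d-3}\,dr.$$
Since $\sigma(S^{d-1})$ is just the reciprocal of this first integral (which
follows immediately from applying the polar coordinates formula to the
result in part (a)), it follows that
$\sigma(S^{d-1}) = \frac{2\pi}{d-2}\sigma(S^{d-3})$. We now prove the formula
$\sigma(S^{d-1}) = 2\pi^{d/2}/\Gamma(d/2)$ by induction. The base cases are
$\sigma(S^1) = 2\pi = 2\pi^{2/2}/\Gamma(2/2)$ and
$\sigma(S^2) = 4\pi = 2\pi^{3/2}/(\sqrt{\pi}/2) = 2\pi^{3/2}/\Gamma(3/2)$
since $\Gamma(3/2) = \frac{1}{2}\Gamma(1/2) = \frac{1}{2}\sqrt{\pi}$. Now if
we let $a_d = 2\pi^{d/2}/\Gamma(d/2)$, then
$$a_d = \frac{2\pi^{d/2}}{\frac{d-2}{2}\,\Gamma\!\left(\frac{d-2}{2}\right)}
= \frac{2\pi}{d-2}\,a_{d-2}.$$
Since $a_d$ and $\sigma(S^{d-1})$ satisfy the same recurrence and initial
conditions, they are equal for all $d$.
(c) By polar coordinates,
$$m(B) = \sigma(S^{d-1})\int_0^1 r^{d-1}\,dr
= \frac{2\pi^{d/2}}{\Gamma(d/2)}\cdot\frac{1}{d}
= \frac{\pi^{d/2}}{\frac{d}{2}\Gamma(d/2)}
= \frac{\pi^{d/2}}{\Gamma(d/2 + 1)}.$$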
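The formulas in parts (b) and (c) can be spot-checked against the familiar low-dimensional values using `math.gamma` from the Python standard library. The function names and the quadrature parameters below are illustrative assumptions.

```python
import math

def sphere_area(d):
    """sigma(S^{d-1}) = 2 pi^{d/2} / Gamma(d/2)."""
    return 2 * math.pi ** (d / 2) / math.gamma(d / 2)

def ball_volume(d):
    """v_d = pi^{d/2} / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def radial_integral(d, R=6.0, n=20000):
    """Midpoint-rule estimate of the integral of exp(-pi r^2) r^(d-1) over [0, R]."""
    h = R / n
    return h * sum(
        math.exp(-math.pi * ((j + 0.5) * h) ** 2) * ((j + 0.5) * h) ** (d - 1)
        for j in range(n)
    )

print(sphere_area(2), 2 * math.pi)        # circumference of the unit circle
print(sphere_area(3), 4 * math.pi)        # area of the unit sphere
print(ball_volume(3), 4 * math.pi / 3)    # volume of the unit ball
print(radial_integral(4) * sphere_area(4))  # should be close to 1
```

The last line checks the identity in (b) directly: the radial integral times $\sigma(S^{d-1})$ should equal 1 in every dimension.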

Exercise 8: The fact that the Lebesgue measure is uniquely characterized by
its translation invariance can be made precise by the following assertion:
If $\mu$ is a Borel measure on $\mathbb{R}^d$ that is translation-invariant,
and is finite on compact sets, then $\mu$ is a multiple of Lebesgue measure
$m$. Prove this theorem by proceeding as follows.
(a) Suppose $Q_a$ denotes a translate of the cube
$\{x : 0 < x_j \le a,\ j = 1, \dots, d\}$ of side length $a$. If we let
$\mu(Q_1) = c$, then $\mu(Q_{1/n}) = cn^{-d}$ for each integer $n$.
(b) As a result $\mu$ is absolutely continuous with respect to $m$, and
there is a locally integrable function $f$ such that
$$\mu(E) = \int_E f\,dx.$$
(c) By the differentiation theorem (Corollary 1.7 in Chapter 3) it follows
that $f(x) = c$ a.e., and hence $\mu = cm$.
Solution.
(a) Because $Q_1$ is a disjoint union of $n^d$ translates of the cube
$Q_{1/n}$,
$$\mu(Q_1) = n^d\mu(Q_{1/n}) \ \Rightarrow\ \mu(Q_{1/n}) = n^{-d}\mu(Q_1) = cn^{-d}.$$
(b) Let $E$ be Borel measurable with $m(E) = 0$. Then for any
$\varepsilon > 0$ there is an open set $U \supset E$ with
$m(U \setminus E) < \varepsilon \Rightarrow m(U) < \varepsilon$. We can write
$U$ as a countable disjoint union of cubes $Q_j$ whose side lengths are of
the form $1/n$, for example by decomposing $U$ into dyadic rational cubes as
described on pp. 7-8. Then $\mu(Q_j) = cm(Q_j)$ by part (a), so
$\mu(U) = \sum \mu(Q_j) = \sum cm(Q_j) = c\sum m(Q_j) = cm(U) < c\varepsilon$.
This can be done for any $\varepsilon$, so $\mu(E) = 0$. Thus, $\mu$ is
absolutely continuous wrt $m$, so there exists a locally integrable Borel
measurable function $f$ such that $\mu(E) = \int_E f\,dm$.
(c) Let $x$ be a Lebesgue point of $f$. Let $Q_n$ be a sequence of dyadic
rational cubes containing $x$. (Just to be clear, these are half-open dyadic
rational cubes, i.e. ones of the form
$\frac{m_i}{2^n} \le x_i < \frac{m_i+1}{2^n}$ for $i = 1, \dots, d$.) Then
$Q_n$ shrinks regularly to $x$ because the ratio of the measure of a cube to
that of the circumscribing ball is constant. For each cube we have
$$\frac{1}{m(Q_n)}\int_{Q_n} f\,dm = \frac{1}{m(Q_n)}\mu(Q_n)
= \frac{1}{m(Q_n)}cm(Q_n) = c,$$
so $f(x) = c$ at every Lebesgue point $x$ of $f$ (hence a.e.).

Exercise 9: Let $C([a, b])$ denote the vector space of continuous functions
on the closed and bounded interval $[a, b]$. Suppose we are given a Borel
measure $\mu$ on this interval, with $\mu([a, b]) < \infty$. Then
$$f \mapsto \ell(f) = \int_a^b f(x)\,d\mu(x)$$
is a linear functional on $C([a, b])$, with $\ell$ positive in the sense that
$\ell(f) \ge 0$ if $f \ge 0$.
Prove that, conversely, for any linear functional $\ell$ on $C([a, b])$ that
is positive in the above sense, there is a unique finite Borel measure $\mu$
so that $\ell(f) = \int_a^b f\,d\mu$ for $f \in C([a, b])$.
Solution. Define the notation $f \succ u$ to mean $0 \le f \le 1$ and
$f = 1$ on $[a, u]$. Let
$$F(u) = \inf_{f \succ u} \ell(f).$$
Then $F$ is increasing on $[a, b]$, because for $u' > u$,
$f \succ u' \Rightarrow f \succ u$ so $F(u')$ is the infimum of a smaller
class of functions. To show $F$ is right continuous, it suffices to show
that for every $f \succ u$ and $\varepsilon > 0$ there exists a $u' > u$ and
$f' \succ u'$ with $\ell(f') < \ell(f) + \varepsilon$. Let $f \succ u$ and
let $C = \ell(f)$. By continuity, the function
$(1 + \frac{\varepsilon}{C})f$ is greater than 1 in some neighborhood
$u < x < u + \delta$. Let $f' \succ u + \frac{\delta}{2}$ and $f'(y) = 0$
for $y \ge u + \delta$. (Such an $f'$ can be constructed, for example, as
piecewise linear, say $f' = 1$ on $[a, u + \delta/2]$,
$f'(u + 3\delta/4) = 0$, and $f'$ is linear between $u + \delta/2$ and
$u + 3\delta/4$.) Then $f'(x) \le (1 + \frac{\varepsilon}{C})f(x)$
everywhere, so
$\ell(f') \le (1 + \frac{\varepsilon}{C})\ell(f) = C + \varepsilon$. This
proves that $F$ is right continuous. By Theorem 3.5, there exists a unique
Borel measure $\mu$ on $[a, b]$ such that $\mu((a', b']) = F(b') - F(a')$
for all $a \le a' < b' \le b$.
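Now we need to show $\ell(f) = \int_a^b f\,d\mu$ for all $f \in C([a, b])$.
Let $L(f) = \int_a^b f\,d\mu$. Then it suffices to show
$\ell(f) \le L(f)$ since this will imply
$-\ell(f) = \ell(-f) \le L(-f) = -L(f) \Rightarrow \ell(f) \ge L(f)$. Let
$\varepsilon > 0$. Because continuous functions can be uniformly
approximated by step functions, we may choose a step function
$f_\varepsilon \ge f$ with $L(f_\varepsilon) < L(f) + \varepsilon$. Write
$$f_\varepsilon = \sum c_k\chi_{(a_k, b_k]}.$$
WLOG we may assume that the intervals $(a_k, b_k]$ are disjoint. Choose
$a_k' > a_k$ and let $f_\varepsilon' = \sum c_k\chi_{(a_k', b_k]}$. Now by
the definition of $F$ there exist continuous $g_k$ and $h_k$ with
$g_k \succ b_k$, $h_k \succ a_k'$, and
$\ell(h_k) - F(a_k') < \ell(g_k) - F(b_k) < \frac{\varepsilon}{2^k}$. WLOG we
may also assume $f < c_k(1 - h_k)$ on $(a_k, a_k')$ since otherwise we may
take
$$c_k(1 - h_k') = \begin{cases}
\max(f, c_k(1 - h_k)) & a_k < x < a_k' \\
c_k & a_k' \le x < b_k
\end{cases}$$
and the function $h_k'$ defined by these relations will also be continuous,
satisfy $h_k' \succ a_k'$, and have
$h_k' < h_k \Rightarrow \ell(h_k') < \ell(h_k)$. Now let
$$\tilde{f}_\varepsilon = \sum c_k(g_k - h_k).$$
Then we have $\tilde{f}_\varepsilon \ge f$ by the above remarks concerning
$h_k'$. Note also that
$$\ell(\tilde{f}_\varepsilon) = \sum c_k(\ell(g_k) - \ell(h_k))
< \sum c_k(F(b_k) - F(a_k')) + \varepsilon = L(f_\varepsilon') + \varepsilon.$$
Since we also have the relations $\tilde{f}_\varepsilon \ge f$ and
$f_\varepsilon \ge f_\varepsilon'$, and both $\ell$ and $L$ are positive,
$$\ell(f) < \ell(\tilde{f}_\varepsilon) < L(f_\varepsilon') + \varepsilon
< L(f_\varepsilon) + \varepsilon < L(f) + 2\varepsilon.$$
This is true for all $\varepsilon$, so $\ell(f) \le L(f)$.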
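Exercise 10: Suppose $\nu$, $\nu_1$, $\nu_2$ are signed measures on
$(X, \mathcal{M})$ and $\mu$ a (positive) measure on $\mathcal{M}$. Using the
symbols $\perp$ and $\ll$ defined in Section 4.2, prove:
(a) If $\nu_1 \perp \mu$ and $\nu_2 \perp \mu$, then
$\nu_1 + \nu_2 \perp \mu$.
(b) If $\nu_1 \ll \mu$ and $\nu_2 \ll \mu$, then $\nu_1 + \nu_2 \ll \mu$.
(c) $\nu_1 \perp \nu_2$ implies $|\nu_1| \perp |\nu_2|$.
(d) $\nu \ll |\nu|$.
(e) If $\nu \perp \mu$ and $\nu \ll \mu$, then $\nu = 0$.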
Solution.
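(a) Let disjoint $A_1$ and $B_1$ be chosen such that
$\nu_1(E) = \nu_1(A_1 \cap E)$ and $\mu(E) = \mu(B_1 \cap E)$ for all
measurable $E$. Similarly, choose $A_2$ and $B_2$ disjoint with
$\nu_2(E) = \nu_2(A_2 \cap E)$ and $\mu(E) = \mu(B_2 \cap E)$. Let
$A = A_1 \cup A_2$ and $B = B_1 \cap B_2$. Note that $A$ and $B$ are disjoint
because $A_1 \cap B \subset A_1 \cap B_1 = \emptyset$ and similarly for
$A_2$. Then for any measurable $E$,
$\mu(E) = \mu(E \cap B_1) = \mu(E \cap B) + \mu(E \cap (B_1 \setminus B_2))$.
But $\mu(B_1 \setminus B_2) = \mu((B_1 \setminus B_2) \cap B_2) = 0$, so
$\mu(E) = \mu(E \cap B)$. Similarly,
$\nu_1(E) = \nu_1(E \cap A_1) = \nu_1(E \cap A) - \nu_1(E \cap (A \setminus A_1))$,
but
$\nu_1(E \cap (A \setminus A_1)) = \nu_1((E \cap (A \setminus A_1)) \cap A_1) = 0$,
so $\nu_1(E) = \nu_1(E \cap A)$ and by the same token,
$\nu_2(E) = \nu_2(E \cap A)$, so
$(\nu_1 + \nu_2)(E) = (\nu_1 + \nu_2)(E \cap A)$. Thus, $\nu_1 + \nu_2$ and
$\mu$ are supported on the disjoint sets $A$ and $B$.
(b) $\mu(E) = 0 \Rightarrow \nu_1(E) = \nu_2(E) = 0 \Rightarrow (\nu_1 + \nu_2)(E) = 0$.
(c) Choose disjoint $A$ and $B$ such that $\nu_1(E) = \nu_1(E \cap A)$ and
$\nu_2(E) = \nu_2(E \cap B)$ for all measurable $E$. Then
$$|\nu_1|(E) = \sup\sum_j |\nu_1(E_j)| = \sup\sum_j |\nu_1(E_j \cap A)| = |\nu_1|(E \cap A)$$
and similarly $|\nu_2|(E) = |\nu_2|(E \cap B)$. Hence $|\nu_1|$ and
$|\nu_2|$ are supported on the disjoint sets $A$ and $B$.
(d) $|\nu|(E) = 0 \Rightarrow \sup\sum |\nu(E_j)| = 0 \Rightarrow \nu(E_j) = 0$
for all measurable subsets $E_j \subset E \Rightarrow \nu(E) = 0$.
(e) Let disjoint $A$ and $B$ be chosen with $\nu(E) = \nu(E \cap A)$ and
$\mu(E) = \mu(E \cap B)$. Then for any measurable $E$,
$\mu(E \cap A) = \mu((E \cap A) \cap B) = 0$ because $A$ and $B$ are
disjoint. Then $\nu(E) = \nu(E \cap A) = 0$ because $\mu(E \cap A) = 0$ and
$\nu \ll \mu$.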

Exercise 11: Suppose that $F$ is an increasing normalized function on
$\mathbb{R}$, and let $F = F_A + F_C + F_J$ be the decomposition of $F$ in
Exercise 24 of Chapter 3; here $F_A$ is absolutely continuous, $F_C$ is
continuous with $F_C' = 0$ a.e., and $F_J$ is a pure jump function. Let
$\mu = \mu_A + \mu_C + \mu_J$ with $\mu$, $\mu_A$, $\mu_C$, and $\mu_J$ the
Borel measures associated to $F$, $F_A$, $F_C$, and $F_J$ respectively.
Verify that:
(i) $\mu_A$ is absolutely continuous with respect to Lebesgue measure and
$\mu_A(E) = \int_E F'(x)\,dx$ for every Lebesgue measurable set $E$.
(ii) As a result, if $F$ is absolutely continuous, then
$\int f\,d\mu = \int f\,dF = \int f(x)F'(x)\,dx$ whenever $f$ and $fF'$ are
integrable.
(iii) $\mu_C + \mu_J$ and Lebesgue measure are mutually singular.
Solution.
(i) By definition,
\[ \mu_A(E) = \inf_{E \subset \bigcup (a_j, b_j]} \sum_j \big( F_A(b_j) - F_A(a_j) \big) = \inf_{E \subset \bigcup (a_j, b_j]} \sum_j \int_{a_j}^{b_j} F'(x)\,dx \ge \inf_{E \subset \bigcup (a_j, b_j]} \int_{\bigcup (a_j, b_j]} F'(x)\,dx \ge \int_E F'(x)\,dx. \]
To prove the reverse inequality, let $\epsilon > 0$ and use the absolute continuity of the integral to find a $\delta > 0$ such that $m(E) < \delta \Rightarrow \int_E F'(x)\,dx < \epsilon$. (In case the assumption that $F' \in L^1$ is a problem, we can treat the intersection of $E$ with each interval $[n, n+1)$ separately.) Now since $E$ is Lebesgue measurable, there is an open set $U \supset E$ such that $m(U \setminus E) < \delta$. Let $\tilde{U}$ be constructed by writing $U$ as a disjoint union of open intervals $(a_j, b_j)$ and replacing each with $(a_j, b_j]$. Then $m(\tilde{U} \setminus E) = m(U \setminus E)$ and $\int_{\tilde{U} \setminus E} F'(x)\,dx = \int_{U \setminus E} F'(x)\,dx$ because $\tilde{U} \setminus E$ is $U \setminus E$ plus countably many points. Thus
\[ \int_{\tilde{U}} F'(x)\,dx = \int_{\tilde{U} \setminus E} F'(x)\,dx + \int_E F'(x)\,dx < \epsilon + \int_E F'(x)\,dx. \]
But $\tilde{U}$ is one of the sets over which the infimum is taken in the definition of $\mu_A(E)$, so $\mu_A(E) \le \epsilon + \int_E F'(x)\,dx$. This is true for any $\epsilon$, so $\mu_A(E) = \int_E F'(x)\,dx$.
(ii) The equation $\int_E f\,d\mu = \int_E f F'(x)\,dx$ follows immediately from (i) in the case where $f$ is a characteristic function. By the linearity of the integral, it holds for $f$ a simple function as well. The result for non-negative $f$ follows from the Monotone Convergence Theorem: choose simple $f_n \nearrow f$. Then
\[ \int_E f\,d\mu = \int_E (\lim f_n)\,d\mu = \lim \int_E f_n\,d\mu = \lim \int_E f_n(x) F'(x)\,dx = \int_E (\lim f_n)(x) F'(x)\,dx = \int_E f(x) F'(x)\,dx. \]
Finally, linearity allows us to extend to functions in $L^1(\mu)$. Note that $f \in L^1(\mu)$ iff $fF' \in L^1(dx)$ because both $\int |f|\,d\mu$ and $\int |fF'|\,dx$ are defined as suprema of integrals of simple functions, and $\int |g|\,d\mu = \int |gF'|\,dx$ for $g$ simple, so the suprema are taken over the same sets. The condition that $f$ be integrable is, so far as I can tell, superfluous, unless it means $\mu$-integrable, in which case it's redundant.
(iii) By Exercise 10a, it is sufficient to prove that $\mu_C$ and $\mu_J$ are both singular wrt Lebesgue measure. Write $F_J(x) = \sum_{k=1}^{\infty} c_k \chi_{[x_k, \infty)}$. Let $A = \{x_k\}$, which is countable and therefore has Lebesgue measure zero, so that Lebesgue measure is supported on $A^c$. Now $A^c$ is open, so it is covered by countably many intervals $(a_j, b_j]$. (We can write $A^c$ as a countable disjoint union of open intervals, and any open interval is a countable union of half-closed intervals.) Thus, any subset $E \subset A^c$ can be covered by countably many intervals $(a_j, b_j]$ with $F_J(a_j) = F_J(b_j)$, so $\mu_J(E) = 0$. Thus, $\mu_J \perp m$.
The proof that $\mu_C \perp m$ is a bit trickier. (Help from Paul Smith on this part.) We will use the following lemma, taken from page 35 of Folland:
Lemma 4. If $\mu_F$ is the Borel measure corresponding to the increasing, right-continuous function $F$, then for any $\mu$-measurable set $E$,
\[ \mu(E) = \inf_{E \subset \bigcup (a_j, b_j)} \sum_j \mu((a_j, b_j)). \]
In words, this lemma says that it is equivalent to use coverings by open intervals instead of half-open intervals. This is nice because it enables us to use theorems about open covers. The straightforward but unenlightening proof is in Folland for anyone who cares.
Let $A = \{x \in \mathbb{R} : F_C'(x) = 0\}^c$. Then $m(A) = 0$ by hypothesis, so Lebesgue measure is supported on $A^c$. We wish to show that $\mu_C(A^c) = 0$, so that $\mu_C$ is supported on $A$. It is sufficient to show that $\mu_C(A^c \cap [0, 1]) = 0$, since $A^c = \bigcup_n (A^c \cap [n, n+1])$ and replacing $F_C(x)$ by $F_C(x - n)$ shifts $A^c \cap [n, n+1]$ to $A^c \cap [0, 1]$. Thus, let $B = A^c \cap [0, 1]$. Let $\epsilon > 0$. Now for any $x \in B$, $F_C'(x) = 0$, so there exists $h = h_x > 0$ such that $|F_C(y) - F_C(x)| < \epsilon |y - x|$ for $0 < |y - x| \le h$. By the triangle inequality, this implies $F_C(x + h) - F_C(x - h) < 2h\epsilon$. Thus, if we let $I_x = (x - h, x + h)$, we have
\[ \mu(I_x) \le \mu((x - h, x + h]) = F_C(x + h) - F_C(x - h) < 2h\epsilon = \epsilon\, m(I_x). \]
The intervals $I_x$ are an open cover of $B$, so we can take a countable subcover $\{I_n\}$. (Here we use the fact that every subset of $\mathbb{R}$ is a Lindelöf space. The proof-by-jargon of this fact is that every subspace of a separable metric space is separable, and a metric space is separable iff it is Lindelöf. The direct proof is that every open subset of $\mathbb{R}$ is a countable union of rational intervals, so for each rational interval we can take a member of our open cover (if there is any) which contains that interval, and the resulting countable subcollection will still cover our set.) Next, we shrink the intervals $I_n$ to make $\sum m(I_n) < 3$.
We do this inductively: for a given $I_n$, if $I_n \subset I_j$ for some $j < n$, we discard $I_n$ entirely; similarly, if $I_j \subset I_n$ for some $j < n$ then we discard $I_j$. Once this is done, $I_n \cap \big(\bigcup_{j<n} I_j\big)$ is an open subset of $I_n$, hence a disjoint union of open intervals; but none of these intervals can have both its endpoints within $I_n$ because this would imply $I_j \subset I_n$ for some $j < n$. Hence if $I_n = (a, b)$, then $I_n \cap \big(\bigcup_{j<n} I_j\big) = (a, \alpha) \cup (\beta, b)$ for some $a \le \alpha \le \beta \le b$. By replacing $a$ with $\max(a, \alpha - \frac{1}{2^{n+1}})$ and $b$ with $\min(b, \beta + \frac{1}{2^{n+1}})$, we form a new interval $I_n' \subset I_n$ with the property that $m\big(I_n' \cap \big(\bigcup_{j<n} I_j\big)\big) < \frac{1}{2^n}$. However, $\bigcup_{j=1}^{n} I_j' = \bigcup_{j=1}^{n} I_j$ because we only delete parts of an interval that are covered by other intervals.
We would also like $I_n'$ to still have the property that $\mu(I_n') < \epsilon\, m(I_n')$. Unfortunately, this is only guaranteed as long as $I_n'$ contains the central point $x_n$ from the interval $I_n$. So far I have not been able to close this hole in the proof.
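Overlooking this problem, we have found an open cover $\{I_n'\}$ of $B$ with the properties that $\mu(I_n') < \epsilon\, m(I_n')$ for each $n$, and $m\big(I_n' \cap \big(\bigcup_{j<n} I_j'\big)\big) < \frac{1}{2^n}$. We may additionally assume that $I_n' \subset [-\frac{1}{2}, \frac{3}{2}]$ for all $n$ since we are only interested in covering $[0, 1]$. This will then imply that $\sum m(I_n') < 3$, because if we write $I_n' = A_n \cup B_n$ where $A_n = I_n' \setminus \big(\bigcup_{j<n} I_j'\big)$ and $B_n = I_n' \cap \big(\bigcup_{j<n} I_j'\big)$, then the $A_n$ are disjoint and have union equal to $\bigcup I_n'$, and
\[ \sum m(I_n') = \sum \big( m(A_n) + m(B_n) \big) = m\Big(\bigcup A_n\Big) + \sum m(B_n) < 2 + \sum \frac{1}{2^n} = 3, \]
where $m\big(\bigcup A_n\big) \le 2$ because $\bigcup A_n \subset [-\frac{1}{2}, \frac{3}{2}]$. This then implies
\[ \mu(B) \le \mu\Big(\bigcup I_n'\Big) \le \sum \mu(I_n') \le \epsilon \sum m(I_n') < 3\epsilon. \]
This is true for any $\epsilon$, so $\mu(B) = 0$.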

Exercise 14: Suppose $(X_j, \mathcal{M}_j, \mu_j)$, $1 \le j \le k$, is a finite collection of measure spaces. Show that parallel with the case $k = 2$ considered in Section 3 one can construct a product measure $\mu_1 \times \cdots \times \mu_k$ on $X = X_1 \times \cdots \times X_k$. In fact, for any set $E \subset X$ such that $E = E_1 \times \cdots \times E_k$, with $E_j \in \mathcal{M}_j$ for all $j$, define $\mu_0(E) = \prod_{j=1}^{k} \mu_j(E_j)$. Verify that $\mu_0$ extends to a premeasure on the algebra $\mathcal{A}$ of finite disjoint unions of such sets, and then apply Theorem 1.5.
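Solution. First a hand should at least be waved at the fact that $\mathcal{A}$ is an algebra. It is closed under complements because
\[ (E_1 \times \cdots \times E_k)^c = (E_1^c \times X_2 \times \cdots \times X_k) \cup (E_1 \times E_2^c \times X_3 \times \cdots \times X_k) \cup \cdots \cup (E_1 \times E_2 \times \cdots \times E_{k-1} \times E_k^c). \]
This is a "stopping time" argument: we divide the complement into $k$ sets based on which is the first of the $E_j$ that a point fails to be in. The intersection of two unions of disjoint measurable rectangles is another union of disjoint measurable rectangles, so we only need to check unions. This follows from
\[ (E_1 \times \cdots \times E_k) \cup (F_1 \times \cdots \times F_k) = (E_1 \times \cdots \times E_k) \cup \big((F_1 \setminus E_1) \times F_2 \times \cdots \times F_k\big) \cup \big((F_1 \cap E_1) \times (F_2 \setminus E_2) \times F_3 \times \cdots \times F_k\big) \]
\[ \cup\ \big((F_1 \cap E_1) \times (F_2 \cap E_2) \times (F_3 \setminus E_3) \times F_4 \times \cdots \times F_k\big) \cup \cdots \cup \big((F_1 \cap E_1) \times \cdots \times (F_{k-1} \cap E_{k-1}) \times (F_k \setminus E_k)\big). \]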
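The decomposition of a union of two rectangles into disjoint rectangles can be sanity-checked on finite sets. The following is only a sketch under assumptions: the helper names `rect` and `union_pieces` and the example sets `E`, `F` are hypothetical and chosen for illustration, and a finite check is of course no substitute for the argument above.

```python
from itertools import product

def rect(sides):
    """Cartesian product of finite sets, returned as a set of tuples."""
    return set(product(*sides))

def union_pieces(E, F):
    """Split (E_1 x ... x E_k) U (F_1 x ... x F_k) into pairwise disjoint rectangles.

    Piece 0 is the E-rectangle. Piece i (1-based) uses F_j & E_j for j < i,
    F_i - E_i in slot i, and F_j for j > i, exactly as in the displayed formula.
    """
    k = len(E)
    pieces = [list(E)]
    for i in range(k):
        sides = (
            [F[j] & E[j] for j in range(i)]
            + [F[i] - E[i]]
            + [F[j] for j in range(i + 1, k)]
        )
        pieces.append(sides)
    return pieces

# Hypothetical example data: three factors, each a small finite set.
E = [{0, 1}, {1, 2}, {0, 3}]
F = [{1, 2}, {2, 3}, {3, 4}]

pieces = [rect(p) for p in union_pieces(E, F)]
total = set().union(*pieces)

# The pieces cover the union, and their sizes add up, so they are pairwise disjoint.
print(total == rect(E) | rect(F))
print(sum(len(p) for p in pieces) == len(total))
```

Each point of the $F$-rectangle outside the $E$-rectangle lands in the piece indexed by the first coordinate at which it leaves $E_j$, which is why the pieces cannot overlap.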
Finally, to show that the extension from rectangles to a premeasure on the algebra $\mathcal{A}$ generated by them is well-defined, let $E_1 \times \cdots \times E_k$ be a measurable rectangle, and suppose $E_1 \times \cdots \times E_k = \bigcup_{j} E_1^j \times \cdots \times E_k^j$ where the union is disjoint. This immediately implies
\[ \chi_{E_1}(x_1) \cdots \chi_{E_k}(x_k) = \sum_{j=1}^{\infty} \chi_{E_1^j}(x_1) \cdots \chi_{E_k^j}(x_k) \]
for all $(x_1, \dots, x_k) \in X_1 \times \cdots \times X_k$. We can integrate both sides with respect to $x_1$, using the monotone convergence theorem to move the integral inside the sum on the RHS, to obtain
\[ \mu_1(E_1)\, \chi_{E_2}(x_2) \cdots \chi_{E_k}(x_k) = \sum_{j=1}^{\infty} \mu_1(E_1^j)\, \chi_{E_2^j}(x_2) \cdots \chi_{E_k^j}(x_k). \]
We then integrate each side wrt $x_2$, etc. After doing this $k$ times we obtain
\[ \mu_1(E_1) \cdots \mu_k(E_k) = \sum_{j=1}^{\infty} \mu_1(E_1^j) \cdots \mu_k(E_k^j), \]
that is,
\[ \mu_0(E_1 \times \cdots \times E_k) = \sum_{j=1}^{\infty} \mu_0(E_1^j \times \cdots \times E_k^j) \]
as desired.
Since $\mu_0$ is a premeasure on $\mathcal{A}$, it extends to a measure on the $\sigma$-algebra generated by $\mathcal{A}$ by Theorem 1.5.
Exercise 15: The product theory extends to infinitely many factors, under the requisite assumptions. We consider measure spaces $(X_j, \mathcal{M}_j, \mu_j)$ with $\mu_j(X_j) = 1$ for all but finitely many $j$. Define a cylinder set $E$ as
\[ \{x = (x_j) : x_j \in E_j,\ E_j \in \mathcal{M}_j,\ E_j = X_j \text{ for all but finitely many } j\}. \]
For such a set define $\mu_0(E) = \prod_{j=1}^{\infty} \mu_j(E_j)$. If $\mathcal{A}$ is the algebra generated by the cylinder sets, $\mu_0$ extends to a premeasure on $\mathcal{A}$, and we can apply Theorem 1.5 again.
Solution. First, note that finite disjoint unions of cylinder sets form an algebra, which is therefore the algebra $\mathcal{A}$. To see this, we can just apply Exercise 14 because of the condition that only finitely many indices in the cylinder have $E_j \ne X_j$. For example, to see how unions work, let $\prod E_j$ be a cylinder set (where $E_j = X_j$ for all but finitely many $j$) and $\prod F_j$ another cylinder set. Then there are finitely many $j$ for which either $E_j$ or $F_j$ is not $X_j$; we may apply the decomposition from Exercise 14 to these components while leaving the others untouched, and hence obtain a decomposition of $\big(\prod E_j\big) \cup \big(\prod F_j\big)$ into finitely many disjoint cylinder sets. Similar comments apply to intersections and complements.
To verify that $\mu_0$ extends to a premeasure on $\mathcal{A}$, let $\prod E_j$ be a cylinder set, and suppose
\[ \prod_{j=1}^{\infty} E_j = \bigcup_{k=1}^{\infty} \prod_{j=1}^{\infty} E_j^k \]
where the union is disjoint and all but finitely many $E_j^k$ are equal to $X_j$ for any fixed $k$. The characteristic-function version of this statement is
\[ \prod_{j=1}^{\infty} \chi_{E_j}(x_j) = \sum_{k=1}^{\infty} \prod_{j=1}^{\infty} \chi_{E_j^k}(x_j). \]
Integrating both sides with respect to $x_1$ and using the monotone convergence theorem to move the integral inside the sum on the right,
\[ \mu_1(E_1) \prod_{j=2}^{\infty} \chi_{E_j}(x_j) = \sum_{k=1}^{\infty} \mu_1(E_1^k) \prod_{j=2}^{\infty} \chi_{E_j^k}(x_j). \]
Repeating the process $\ell$ times, we have
\[ \mu_1(E_1) \cdots \mu_\ell(E_\ell) \prod_{j=\ell+1}^{\infty} \chi_{E_j}(x_j) = \sum_{k=1}^{\infty} \mu_1(E_1^k) \cdots \mu_\ell(E_\ell^k) \prod_{j=\ell+1}^{\infty} \chi_{E_j^k}(x_j). \]
As $\ell \to \infty$, the LHS approaches $\mu_0\big(\prod E_j\big)$; in fact, it will equal it after a finite number of steps because all but finitely many $E_j$ equal $X_j$ and $\mu_j(X_j) = 1$ for all but finitely many $X_j$. For the RHS, we apply monotone convergence again (this time in $\ell$) to see that it approaches
\[ \sum_{k=1}^{\infty} \prod_{j=1}^{\infty} \mu_j(E_j^k) = \sum_{k=1}^{\infty} \mu_0\Big( \prod_{j=1}^{\infty} E_j^k \Big) \]
as desired. Hence $\mu_0$ extends to a premeasure on $\mathcal{A}$, and therefore to a measure on the sigma-algebra generated by $\mathcal{A}$ by Theorem 1.5.
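Exercise 16: Consider the $d$-dimensional torus $\mathbb{T}^d = \mathbb{R}^d / \mathbb{Z}^d$. Identify $\mathbb{T}^d$ as $\mathbb{T}^1 \times \cdots \times \mathbb{T}^1$ and let $\mu$ be the product measure on $\mathbb{T}^d$ given by $\mu = \mu_1 \times \cdots \times \mu_d$, where $\mu_j$ is Lebesgue measure on $X_j$ identified with the circle $\mathbb{T}$. That is, if we represent each point in $X_j$ uniquely as $x_j$ with $0 < x_j \le 1$, then the measure $\mu_j$ is the induced Lebesgue measure on $\mathbb{R}$ restricted to $(0, 1]$.
(a) Check that the completion $\bar{\mu}$ is Lebesgue measure induced on the cube $Q = \{x : 0 < x_j \le 1,\ j = 1, \dots, d\}$.
(b) For each function $f$ on $Q$ let $\tilde{f}$ be its extension to $\mathbb{R}^d$ which is periodic, that is, $\tilde{f}(x + z) = \tilde{f}(x)$ for every $z \in \mathbb{Z}^d$. Then $f$ is measurable on $\mathbb{T}^d$ if and only if $\tilde{f}$ is measurable on $\mathbb{R}^d$, and $f$ is continuous on $\mathbb{T}^d$ if and only if $\tilde{f}$ is continuous on $\mathbb{R}^d$.
(c) Suppose $f$ and $g$ are integrable on $\mathbb{T}^d$. Show that the integral defining $(f * g)(x) = \int_{\mathbb{T}^d} f(x - y) g(y)\,dy$ is finite for a.e. $x$, that $f * g$ is integrable over $\mathbb{T}^d$, and that $f * g = g * f$.
(d) For any integrable function $f$ on $\mathbb{T}^d$, write
\[ f \sim \sum_{n \in \mathbb{Z}^d} a_n e^{2\pi i n \cdot x} \]
to mean that $a_n = \int_{\mathbb{T}^d} f(x) e^{-2\pi i n \cdot x}\,dx$. Prove that if $g$ is also integrable, and $g \sim \sum_{n \in \mathbb{Z}^d} b_n e^{2\pi i n \cdot x}$, then
\[ f * g \sim \sum_{n \in \mathbb{Z}^d} a_n b_n e^{2\pi i n \cdot x}. \]
(e) Verify that $\{e^{2\pi i n \cdot x}\}_{n \in \mathbb{Z}^d}$ is an orthonormal basis for $L^2(\mathbb{T}^d)$. As a result $\|f\|_{L^2(\mathbb{T}^d)}^2 = \sum_{n \in \mathbb{Z}^d} |a_n|^2$.
(f) Let $f$ be any continuous periodic function on $\mathbb{T}^d$. Then $f$ can be uniformly approximated by finite linear combinations of the exponentials $\{e^{2\pi i n \cdot x}\}_{n \in \mathbb{Z}^d}$.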
Solution.
(a) This follows from the translation invariance of $\mu$. To show that the product of translation invariant measures is translation invariant, let $E = E_1 \times \cdots \times E_d$ be a measurable rectangle in $\mathbb{T}^d$, and $x = (x_1, \dots, x_d) \in \mathbb{T}^d$. Then
\[ \mu(E + x) = \mu\big((E_1 + x_1) \times \cdots \times (E_d + x_d)\big) = \mu_1(E_1 + x_1) \cdots \mu_d(E_d + x_d) = \mu_1(E_1) \cdots \mu_d(E_d) = \mu(E), \]
so $\mu$ is translation invariant on measurable rectangles. This implies that the outer measure $\mu_*$ generated by coverings by measurable rectangles is also translation invariant. But $\mu$ is just the restriction of $\mu_*$ to the sigma-algebra of Carathéodory-measurable sets, so it is translation invariant as well. This implies that $\mu$ is a multiple of Lebesgue measure; since $\mu(\mathbb{T}^d) = m(Q) = 1$, we must have $m = \mu$ (modulo the correspondence between $Q$ and $\mathbb{T}^d$).
(b) This is blindingly obvious, but
$f$ mble (resp. cts) $\Leftrightarrow$ $f^{-1}(U)$ mble (resp. open) for open $U$
$\Leftrightarrow$ $\tilde{f}^{-1}(U)$ mble (resp. open) in $\mathbb{R}^d$
$\Leftrightarrow$ $\tilde{f}$ mble (resp. cts).
Here we use the fact that $\tilde{f}^{-1}(U)$ is a "lattice" consisting of translates of $f^{-1}(U)$ by $\mathbb{Z}^d$; such a set is open or measurable iff $f^{-1}(U)$ is.
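(c) This is a simple application of Tonelli's Theorem, exactly analogous to the case in $L^1(\mathbb{R}^d)$:
\[ \int_{\mathbb{T}^d} |f * g(x)|\,dx = \int_{\mathbb{T}^d} \Big| \int_{\mathbb{T}^d} f(x - y) g(y)\,dy \Big|\,dx \le \int_{\mathbb{T}^d} \int_{\mathbb{T}^d} |f(x - y)|\,|g(y)|\,dy\,dx \]
\[ = \int_{\mathbb{T}^d} \int_{\mathbb{T}^d} |f(x - y)|\,|g(y)|\,dx\,dy = \|f\|_{L^1(\mathbb{T}^d)} \|g\|_{L^1(\mathbb{T}^d)}, \]
so $f * g$ is integrable on $\mathbb{T}^d$. This in turn implies that it is finite a.e. Finally, the change of variables $u = x - y$ shows that
\[ f * g(x) = \int f(x - y) g(y)\,dy = \int f(u) g(x - u)\,du = g * f(x). \]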
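(d) Once again, there is absolutely nothing different from the one-variable case. Since $f * g$ is integrable by our above remarks, and $|e^{-2\pi i n \cdot x}| = 1$, $f * g(x) e^{-2\pi i n \cdot x}$ is also integrable, so by Fubini's theorem
\[ \int_{\mathbb{T}^d} f * g(x) e^{-2\pi i n \cdot x}\,dx = \int_{\mathbb{T}^d} e^{-2\pi i n \cdot x} \int_{\mathbb{T}^d} f(x - y) g(y)\,dy\,dx \]
\[ = \int_{\mathbb{T}^d} g(y) \int_{\mathbb{T}^d} f(x - y) e^{-2\pi i n \cdot x}\,dx\,dy \]
\[ = \int_{\mathbb{T}^d} g(y) e^{-2\pi i n \cdot y} \int_{\mathbb{T}^d} f(x - y) e^{-2\pi i n \cdot (x - y)}\,dx\,dy \]
\[ = \int_{\mathbb{T}^d} g(y) e^{-2\pi i n \cdot y} a_n\,dy = a_n b_n. \]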
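The identity in (d) has an exact discrete counterpart on a finite cyclic grid, which gives a quick numerical sanity check. This is only an illustrative sketch under assumptions: the torus is replaced by $N$ equally spaced points in dimension $d = 1$, integrals become averages, and the names `coeff` and `convolve` and the sample data are hypothetical.

```python
import cmath

def coeff(f, n):
    """Grid stand-in for a_n = integral of f(x) e^{-2 pi i n x} dx, as an average over N points."""
    N = len(f)
    return sum(f[j] * cmath.exp(-2j * cmath.pi * n * j / N) for j in range(N)) / N

def convolve(f, g):
    """Normalized circular convolution, the grid analogue of (f*g)(x) = integral f(x-y) g(y) dy."""
    N = len(f)
    return [sum(f[(j - k) % N] * g[k] for k in range(N)) / N for j in range(N)]

# Arbitrary sample values on an 8-point grid (illustrative only).
f = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, 1.5, 2.5]
g = [0.0, 1.0, -2.0, 4.0, 0.5, 0.5, -1.0, 2.0]

h = convolve(f, g)
# Largest discrepancy between the n-th coefficient of f*g and the product a_n * b_n.
max_err = max(abs(coeff(h, n) - coeff(f, n) * coeff(g, n)) for n in range(len(f)))
print(max_err)
```

The discrete computation mirrors the Fubini argument line by line: the shift $j \mapsto j - k$ plays the role of the change of variables $x \mapsto x - y$.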
(e) The orthonormality of this system is evident, since
\[ \int_{\mathbb{T}^d} e^{-2\pi i n \cdot x} e^{2\pi i m \cdot x}\,dx = \int_{\mathbb{T}^d} e^{2\pi i (m - n) \cdot x}\,dx = \prod_{j=1}^{d} \int_{\mathbb{T}} e^{2\pi i (m_j - n_j) x_j}\,dx_j = \prod_{j=1}^{d} \delta_{n_j}^{m_j} = \delta_n^m, \]
where $\delta$ is the Kronecker delta function. To show completeness, we use the fact that an orthonormal system $\{e_n\}$ in a Hilbert space is complete iff $\langle f, e_n \rangle = 0$ for all $n$ $\Rightarrow$ $f = 0$. Suppose
\[ 0 = \langle f, e^{2\pi i n \cdot x} \rangle = \int_{\mathbb{T}^d} f(x) e^{-2\pi i n \cdot x}\,dx = \int_{\mathbb{T}} e^{-2\pi i n_1 x_1} \int_{\mathbb{T}} e^{-2\pi i n_2 x_2} \cdots \int_{\mathbb{T}} e^{-2\pi i n_d x_d} f(x_1, \dots, x_d)\,dx_d \cdots dx_1 \]
for all $n_1, \dots, n_d$, where the use of Fubini's theorem is justified by the integrability of $f$ and the fact that $|e^{-2\pi i n_k x_k}| = 1$. Let
\[ F_1(x_1) = \int_{\mathbb{T}} e^{-2\pi i n_2 x_2} \int_{\mathbb{T}} e^{-2\pi i n_3 x_3} \cdots \int_{\mathbb{T}} e^{-2\pi i n_d x_d} f(x_1, \dots, x_d)\,dx_d \cdots dx_2. \]
Then $\int_{\mathbb{T}} e^{-2\pi i n_1 x_1} F_1(x_1)\,dx_1 = 0$ for all $n_1$, so $F_1(x_1) = 0$ a.e. by the completeness of exponentials in the 1-dimensional case. Now let
\[ F_2(x_1, x_2) = \int_{\mathbb{T}} e^{-2\pi i n_3 x_3} \cdots \int_{\mathbb{T}} e^{-2\pi i n_d x_d} f(x_1, \dots, x_d)\,dx_d \cdots dx_3. \]
For any fixed value of $x_1$, $F_2(x_1, x_2)$ is a function of $x_2$ with the property that $\int_{\mathbb{T}} e^{-2\pi i n_2 x_2} F_2(x_1, x_2)\,dx_2 = F_1(x_1) = 0$ a.e., so we must have $F_2(x_1, x_2) = 0$ for a.e. $x_2$. Continuing inductively, we see that $f(x_1, \dots, x_d) = 0$ for a.e. $x_1, \dots, x_d$.
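(f) This problem is begging for the Stone-Weierstrass theorem, but since we haven't covered that in class, I'll reluctantly do the convolution stuff. Let $g_\delta(x) = \frac{1}{\delta^d} \chi_{[0, \delta]^d}$. Then $\int g_\delta(x)\,dx = 1$, so
\[ |f(x) - f * g_\delta(x)| = \Big| f(x) \int g_\delta(y)\,dy - \int f(x - y) g_\delta(y)\,dy \Big| \le \int |f(x) - f(x - y)|\,|g_\delta(y)|\,dy. \]
By the uniform continuity of $f$, given $\epsilon > 0$ there exists $\delta_0 > 0$ such that $|x - y| < \delta_0 \Rightarrow |f(x) - f(y)| < \epsilon$. For $\delta < \delta_0$,
\[ |f(x) - f * g_\delta(x)| \le \epsilon \int |g_\delta(y)|\,dy = \epsilon. \]
Thus, $f * g_\delta(x) \to f(x)$ uniformly as $\delta \to 0$. However, if $a_n$ are the Fourier coefficients of $f$ and $b_n^\delta$ those of $g_\delta$, then $\sum |a_n|^2 < \infty$ and $\sum |b_n^\delta|^2 < \infty$ because $f$ and $g_\delta$ are in $L^2$ so their Fourier transforms are as well. Then by the Cauchy-Schwarz inequality, $\sum |a_n b_n^\delta| < \infty$. This implies that $\widehat{f * g_\delta} \in L^1$, and since $f * g_\delta \in L^1$, Fourier inversion holds and we have $f * g_\delta(x) = \sum a_n b_n^\delta e^{2\pi i n \cdot x}$ a.e. (in fact, everywhere, since both sides are continuous). Now we can choose $\delta$ such that $|f - f * g_\delta| < \frac{\epsilon}{2}$ everywhere. For this $\delta$, since the tails of the convergent series $\sum |a_n b_n^\delta|$ go to zero, we can choose some truncation $\sum_{|n| \le N} |a_n b_n^\delta|$ such that $\sum_{|n| > N} |a_n b_n^\delta| < \frac{\epsilon}{2}$. Then for any $x$,
\[ \Big| f(x) - \sum_{|n| \le N} a_n b_n^\delta e^{2\pi i n \cdot x} \Big| \le |f(x) - f * g_\delta(x)| + \Big| f * g_\delta(x) - \sum_{|n| \le N} a_n b_n^\delta e^{2\pi i n \cdot x} \Big| \]
\[ = |f(x) - f * g_\delta(x)| + \Big| \sum_{|n| > N} a_n b_n^\delta e^{2\pi i n \cdot x} \Big| \le |f(x) - f * g_\delta(x)| + \sum_{|n| > N} |a_n b_n^\delta| \le \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \]
Thus, $f$ can be uniformly approximated by trigonometric polynomials.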

Exercise 17: By reducing to the case $d = 1$, show that each "rotation" $x \mapsto x + \alpha$ of the torus $\mathbb{T}^d = \mathbb{R}^d / \mathbb{Z}^d$ is measure preserving, for any $\alpha \in \mathbb{R}^d$.
Solution. Write $\tau(x) = x + \alpha$. We first suppose that $E \subset \mathbb{T}^d$ is a measurable rectangle $E = E_1 \times \cdots \times E_d$ where $E_k \subset \mathbb{T}$ for $k = 1, \dots, d$. Then
\[ m(\tau^{-1}(E)) = m(E - \alpha) = m\big((E_1 - \alpha_1) \times \cdots \times (E_d - \alpha_d)\big) = m(E_1) \cdots m(E_d) = m(E). \]
Hence $\tau$ is measure-preserving on measurable rectangles. But since the measure of any set is computed in terms of its coverings by measurable rectangles, $\tau$ is measure-preserving on all measurable sets. (We actually use here the fact that $\tau^{-1}$ is measure-preserving as well; if $\{R_k\}$ is a covering of $E$ by measurable rectangles, then $\{\tau^{-1}(R_k)\}$ is a covering of $\tau^{-1}(E)$ with the same measure; conversely, if $\{R_k'\}$ is a covering of $\tau^{-1}(E)$ by measurable rectangles, then $\{\tau(R_k')\}$ is a covering of $E$ with the same measure.)
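Exercise 18: Suppose $\tau$ is a measure-preserving transformation on a measure space $(X, \mu)$ with $\mu(X) = 1$. Recall that a measurable set $E$ is invariant if $\tau^{-1}(E)$ and $E$ differ by a set of measure zero. A sharper notion is to require that $\tau^{-1}(E)$ equal $E$. Prove that if $E$ is any invariant set, there is a set $E'$ so that $E' = \tau^{-1}(E')$, and $E$ and $E'$ differ by a set of measure zero.
Solution. Let
\[ E' = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} \tau^{-k}(E). \]
Then
\[ E' \setminus E = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} \big( \tau^{-k}(E) \setminus E \big) \]
and
\[ E \setminus E' = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} \big( E \setminus \tau^{-k}(E) \big). \]
But $E \setminus \tau^{-k}(E)$ and $\tau^{-k}(E) \setminus E$ both have measure zero (this follows from $\mu(E \,\triangle\, \tau^{-1}(E)) = 0$ by an easy induction), and countable unions and intersections of null sets are null, so $\mu(E \,\triangle\, E') = 0$. Moreover,
\[ \tau^{-1}(E') = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} \tau^{-k-1}(E) = \bigcap_{n=2}^{\infty} \bigcup_{k=n}^{\infty} \tau^{-k}(E) = E' \]
because the sets inside the intersection are nested, so we get the same set whether we start at $n = 1$ or $n = 2$.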
Exercise 19: Let $\tau$ be a measure-preserving transformation on $(X, \mu)$ with $\mu(X) = 1$. Then $\tau$ is ergodic if and only if whenever $\nu$ is absolutely continuous with respect to $\mu$ and $\nu$ is invariant (that is, $\nu(\tau^{-1}(E)) = \nu(E)$ for all measurable sets $E$), then $\nu = c\mu$, with $c$ a constant.
Solution. We use the fact that $\tau$ is ergodic iff the only functions with $f \circ \tau = f$ a.e. are constant a.e., as well as the fact that
\[ \int_E f\,d\mu = \int_{\tau^{-1}(E)} f \circ \tau\,d\mu \]
for measure-preserving maps $\tau$. Let $\tau$ be ergodic and let $\nu \ll \mu$ be an invariant measure. By the Radon-Nikodym theorem, $d\nu = h\,d\mu$ for some function $h \in L^1(\mu)$. Then for any measurable $E$,
\[ \nu(E) = \int_E h\,d\mu = \int_{\tau^{-1}(E)} h \circ \tau\,d\mu. \]
By the invariance of $\nu$, this equals
\[ \nu(\tau^{-1}(E)) = \int_{\tau^{-1}(E)} h\,d\mu. \]
Since this is true for any measurable $E$, $h \circ \tau = h$ a.e. But since $\tau$ is ergodic, this implies $h$ is constant a.e., so $\nu = c\mu$ for some constant $c$.
Conversely, suppose every invariant absolutely continuous measure is a constant times $\mu$. Let $f \in L^1$ be any function with the property that $f \circ \tau = f$ a.e. Define an absolutely continuous measure $\nu$ by $d\nu = f\,d\mu$. Then the above calculation (run in reverse) shows that $\nu$ is invariant, so $\nu = c\mu$. But this implies that $f\,d\mu = d\nu = c\,d\mu$, so $f$ is constant a.e. Hence $\tau$ is ergodic.
Exercise 20: Suppose $\tau$ is a measure-preserving transformation on $(X, \mu)$. If
\[ \mu(\tau^{-n}(E) \cap F) \to \mu(E)\mu(F) \]
as $n \to \infty$ for all measurable sets $E$ and $F$, then $(T^n f, g) \to (f, 1)(1, g)$ whenever $f, g \in L^2(X)$ with $(Tf)(x) = f(\tau(x))$. Thus $\tau$ is mixing.
Solution. Suppose $\mu(\tau^{-n}(E) \cap F) \to \mu(E)\mu(F)$ for measurable $E$ and $F$. This means that $(T^n f, g) \to (f, 1)(1, g)$ if $f$ and $g$ are characteristic functions: say $f = \chi_E$ and $g = \chi_F$, then
\[ (T^n f, g) = \int \chi_E(\tau^n(x)) \chi_F(x)\,d\mu = \int \chi_{\tau^{-n}(E)} \chi_F\,d\mu = \mu(\tau^{-n}(E) \cap F) \to \mu(E)\mu(F) = (f, 1)(1, g). \]
By linearity in $f$ and conjugate-linearity in $g$, this implies $(T^n f, g) \to (f, 1)(1, g)$ if $f$ and $g$ are measurable simple functions. Now let $f, g \in L^2$ and let $f_m \to f$ and $g_m \to g$ be sequences of measurable simple functions. For each $m$ we have
\[ (T^n f, g) = (T^n f, g - g_m) + (T^n (f - f_m), g_m) + (T^n f_m, g_m). \]
Since $T$ is an isometry, $|(T^n f, g - g_m)| \le \|T^n f\|\,\|g - g_m\| = \|f\|\,\|g - g_m\| \to 0$ as $m \to \infty$. Similarly, $(T^n (f - f_m), g_m) \to 0$ uniformly in $n$ as $m \to \infty$, because the norms $\|g_m\|$ are bounded. Finally, $(T^n f_m, g_m) \to (f_m, 1)(1, g_m)$ as $n \to \infty$, and $(f_m, 1)(1, g_m) \to (f, 1)(1, g)$ as $m \to \infty$. (To be more precise about this business of taking limits in two different variables, choose $f_m$ and $g_m$ with $\|f_m - f\|, \|g_m - g\| < \epsilon$. Then since $T$ is an isometry,
\[ (T^n f, g) = (T^n f_m, g_m) + h(n) \]
where $|h(n)| \le C\epsilon$ for all $n$, with $C$ depending only on $\|f\|$ and $\|g\|$. Letting $n \to \infty$, we see that $(T^n f, g)$ is eventually within $2C\epsilon$ of $(f_m, 1)(1, g_m)$, which in turn is within $C'\epsilon$ of $(f, 1)(1, g)$ for some constant $C'$. This is true for all $\epsilon$, so $(T^n f, g) \to (f, 1)(1, g)$.)
Exercise 21: Let $\mathbb{T}^d$ be the torus, and $\tau : x \mapsto x + \alpha$ the mapping arising in Exercise 17. Then $\tau$ is ergodic if and only if $\alpha = (\alpha_1, \dots, \alpha_d)$ with $\alpha_1, \dots, \alpha_d$, and $1$ linearly independent over the rationals. To do this show that:
(a)
\[ \frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x)) \to \int_{\mathbb{T}^d} f(x)\,dx \]
as $m \to \infty$, for each $x \in \mathbb{T}^d$, whenever $f$ is continuous and periodic and $\alpha$ satisfies the hypothesis.
(b) Prove as a result that in this case $\tau$ is uniquely ergodic.
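Solution.
(a) Suppose first that $\alpha_1, \dots, \alpha_d$ and $1$ are dependent over $\mathbb{Q}$, say
\[ a_1 \alpha_1 + \cdots + a_d \alpha_d = \frac{p}{q}, \]
where the $a_i$ are integers, not all zero (clear denominators if necessary). Let
\[ E = \Big\{ (x_1, \dots, x_d) : 0 < \{ q(a_1 x_1 + \cdots + a_d x_d) \} < \frac{1}{2} \Big\}, \]
where $\{z\} = z - \lfloor z \rfloor$ denotes the fractional part of $z$. Then $m(E) = \frac{1}{2}$ but $E = \tau^{-1}(E)$, so $\tau$ is not ergodic.
On the other hand, suppose $\alpha_1, \dots, \alpha_d$ and $1$ are independent over $\mathbb{Q}$. Let $f(x) = e^{2\pi i n \cdot x}$ be any complex exponential. If $n = 0$, then
\[ \frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x)) = 1 = \int_{\mathbb{T}^d} f(x)\,dx. \]
If $n \ne 0$, then $n \cdot \alpha$ is not an integer by independence, and
\[ \frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x)) = \frac{1}{m} e^{2\pi i n \cdot x} \sum_{k=0}^{m-1} e^{2\pi i k n \cdot \alpha} = \frac{e^{2\pi i n \cdot x}}{m} \cdot \frac{1 - e^{2\pi i m n \cdot \alpha}}{1 - e^{2\pi i n \cdot \alpha}}. \]
Since $|1 - e^{2\pi i m n \cdot \alpha}| \le 2$, this goes to zero as $m \to \infty$, so
\[ \frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x)) \to 0 = \int_{\mathbb{T}^d} f(x)\,dx. \]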
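The geometric-sum formula for the averages of a single exponential is easy to check numerically. The following is a sketch under assumptions: the choice $d = 2$, $\alpha = (\sqrt{2}, \sqrt{3})$, the point $x$, the frequency $n$, and the function names are all illustrative and not taken from the text.

```python
import cmath
import math

def birkhoff_average(n, x, alpha, m):
    """(1/m) * sum_{k<m} f(tau^k x) for f(x) = exp(2 pi i n.x) and tau(x) = x + alpha."""
    total = 0j
    for k in range(m):
        phase = sum(nj * (xj + k * aj) for nj, xj, aj in zip(n, x, alpha))
        total += cmath.exp(2j * math.pi * phase)
    return total / m

def closed_form(n, x, alpha, m):
    """Geometric-sum formula from the text; requires n.alpha to be a non-integer."""
    n_dot_x = sum(nj * xj for nj, xj in zip(n, x))
    n_dot_a = sum(nj * aj for nj, aj in zip(n, alpha))
    ratio = cmath.exp(2j * math.pi * n_dot_a)
    return cmath.exp(2j * math.pi * n_dot_x) * (1 - ratio ** m) / (m * (1 - ratio))

# Illustrative data: 1, sqrt(2), sqrt(3) are linearly independent over Q.
alpha = (math.sqrt(2), math.sqrt(3))
x = (0.1, 0.7)
n = (2, -1)

avg = birkhoff_average(n, x, alpha, 5000)
print(abs(avg - closed_form(n, x, alpha, 5000)))  # agreement up to rounding
print(abs(avg))                                   # small, consistent with the 2/(m|1 - ratio|) bound
```

The average decays like $1/m$ here because the denominator $1 - e^{2\pi i n \cdot \alpha}$ stays fixed while the numerator is bounded by 2; a rationally dependent $\alpha$ can make that denominator vanish for some $n \ne 0$.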
Finally, since complex exponentials are uniformly dense in the continuous periodic functions by Exercise 16f, the above limit holds for any continuous periodic function: let $f$ be a continuous periodic function and $P_\epsilon$ a finite linear combination of complex exponentials with $|f - P_\epsilon| < \epsilon$ everywhere. Choose $n$ sufficiently large that $\big| A_m P_\epsilon - \int P_\epsilon\,dx \big| < \epsilon$ for all $m > n$, where $A_m g(x) = \frac{1}{m} \sum_{k=0}^{m-1} g(\tau^k(x))$. Then for $m > n$,
\[ \Big| A_m f(x) - \int f(x)\,dx \Big| \le |A_m f(x) - A_m P_\epsilon(x)| + \Big| A_m P_\epsilon(x) - \int P_\epsilon(x)\,dx \Big| + \Big| \int (P_\epsilon(x) - f(x))\,dx \Big| < \epsilon + \epsilon + \epsilon = 3\epsilon. \]
(b) Unique ergodicity follows by the same logic as the 1-variable case. Let $\nu$ be any invariant measure; then part (a) plus the Mean Ergodic Theorem shows that $P_\nu(f) = \int f\,dx$, where $P_\nu$ is the projection in $L^2(\nu)$ onto the subspace of invariant functions. This implies that the image of $P_\nu$ is just the constant functions. But we know that the $L^2(\nu)$ projection of $f$ onto the constants is $\int f\,d\nu$, so we must have $\int f\,dx = \int f\,d\nu$ for continuous $f$. Since characteristic functions of open rectangles can be $L^2$-approximated by continuous functions, this implies that $m(R) = \nu(R)$ for any open rectangle $R$. But this implies that $m$ and $\nu$ agree on the Borel sets. Hence $\tau$ is uniquely ergodic, with $m$ as its only invariant measure.

Exercise 26: There is an $L^2$ version of the maximal ergodic theorem. Suppose $\tau$ is a measure-preserving transformation on $(X, \mu)$. Here we do not assume that $\mu(X) < \infty$. Then
\[ f^*(x) = \sup_{1 \le m < \infty} \frac{1}{m} \sum_{k=0}^{m-1} |f(\tau^k(x))| \]
satisfies
\[ \|f^*\|_{L^2(X)} \le c \|f\|_{L^2(X)}, \quad \text{whenever } f \in L^2(X). \]
The proof is the same as outlined in Problem 6, Chapter 5 for the maximal function on $\mathbb{R}^d$. With this, extend the pointwise ergodic theorem to the case where $\mu(X) = \infty$, as follows:
(a) Show that $\lim_{m \to \infty} \frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x))$ converges for a.e. $x$ to $P(f)(x)$ for every $f \in L^2(X)$, because this holds for a dense subspace of $L^2(X)$.
(b) Prove that the conclusion holds for every $f \in L^1(X)$, because it holds for the dense subspace $L^1(X) \cap L^2(X)$.
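Solution.
(a) We use the subspaces $S = \{f \in L^2 : f \circ \tau = f\}$ and $S_1 = \{g - Tg : g \in L^2\}$ from the proof of the mean ergodic theorem. As shown there, $L^2(X) = S \oplus \overline{S_1}$. Given $f \in L^2$, let $\epsilon > 0$ and write $f = f_0 + f_1 + f_2$ where $f_0 \in S$, $f_1 + f_2 \in \overline{S_1}$, $f_1 \in S_1$, and $\|f_2\| < \epsilon$. Since $f_1 \in S_1$, $f_1 = g - Tg$ for some $g \in L^2$. Let $h = f_0 + f_1$. Then $A_m f_0 = f_0 = Pf_0$ for all $m$, and
\[ A_m f_1 = \frac{1}{m} \sum_{k=0}^{m-1} T^k (g - Tg) = \frac{1}{m} (g - T^m g). \]
Clearly $\frac{1}{m} g(x) \to 0$ for all $x$ as $m \to \infty$. Moreover, as shown on page 301, $\frac{1}{m} T^m g(x) \to 0$ for almost all $x$; one can see this from the fact that, by the monotone convergence theorem,
\[ \int_X \sum_{m=1}^{\infty} \frac{1}{m^2} |T^m(g)(x)|^2 = \sum_{m=1}^{\infty} \frac{1}{m^2} \int_X |T^m(g)(x)|^2 = \sum_{m=1}^{\infty} \frac{1}{m^2} \|T^m g\|^2 = \|g\|^2 \sum_{m=1}^{\infty} \frac{1}{m^2} < \infty. \]
Since $\sum \frac{1}{m^2} |T^m(g)(x)|^2$ is integrable, it is finite almost everywhere, which means the terms in the series tend to zero for almost all $x$. The upshot is that $A_m f_1(x) \to 0 = Pf_1(x)$ for a.a. $x$, so that $A_m h(x) \to Ph(x)$ a.e. Finally, let
\[ E_\alpha = \Big\{ x \in X : \limsup_{m \to \infty} |A_m f(x) - Pf(x)| > \alpha \Big\}. \]
Since $A_m f - Pf = A_m h - Ph + A_m(f - h) - P(f - h)$, we have $E_\alpha \subset N \cup F_\alpha \cup G_\alpha$, where $N = \{x : A_m h(x) \not\to Ph(x)\}$,
\[ F_\alpha = \Big\{ x \in X : (f - h)^*(x) > \frac{\alpha}{2} \Big\}, \]
and
\[ G_\alpha = \Big\{ x \in X : |P(f - h)(x)| > \frac{\alpha}{2} \Big\}. \]
Now $\mu(N) = 0$ as we have already established, and by the $L^2$ maximal theorem, $\mu(F_\alpha) \le \frac{\|(f - h)^*\|_2^2}{(\alpha/2)^2} < \frac{4c^2\epsilon^2}{\alpha^2}$; similarly, since $\|Py\| \le \|y\|$ for $y \in L^2$, $\mu(G_\alpha) \le \frac{\|f - h\|_2^2}{(\alpha/2)^2} < \frac{4\epsilon^2}{\alpha^2}$. Since $\epsilon$ is arbitrary and $E_\alpha$ does not depend on $\epsilon$, $\mu(E_\alpha) = 0$ for all $\alpha > 0$, so $A_m f(x) \to Pf(x)$ a.e.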
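(b) Let $f \in L^1$. For any $\epsilon > 0$, choose $g \in L^2 \cap L^1$ with $\|f - g\|_1 < \epsilon$. Then
\[ A_m f(x) - Pf(x) = A_m g(x) - Pg(x) + A_m(f - g)(x) - P(f - g)(x). \]
Let
\[ E_\alpha = \{ x \in X : \limsup |A_m f(x) - Pf(x)| > 2\alpha \}. \]
Then $E_\alpha \subset N \cup F_\alpha \cup G_\alpha$ where $N = \{x : A_m g(x) - Pg(x) \not\to 0\}$,
\[ F_\alpha = \{ x \in X : \limsup |A_m(f - g)(x)| > \alpha \}, \]
and
\[ G_\alpha = \{ x \in X : |P(f - g)(x)| > \alpha \}. \]
Now $\mu(N) = 0$ by part (a), and by inequality (24) on page 297,
\[ \mu(F_\alpha) \le \frac{A}{\alpha} \|f - g\|_1 < \frac{A\epsilon}{\alpha}. \]
Also $\|P(f - g)\|_1 \le \|f - g\|_1 < \epsilon$, so by Chebyshev's inequality,
\[ \mu(G_\alpha) \le \frac{1}{\alpha} \|f - g\|_1 < \frac{\epsilon}{\alpha}. \]
Since $\epsilon$ is arbitrary and $E_\alpha$ does not depend on $\epsilon$, this implies $\mu(E_\alpha) = 0$ for all $\alpha > 0$. Hence $A_m f(x) \to Pf(x)$ for a.a. $x$.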

Exercise 27: We saw that if $\|f_n\|_{L^2} \le 1$, then $\frac{f_n(x)}{n} \to 0$ as $n \to \infty$ for a.e. $x$. However, show that the analogue where one replaces the $L^2$-norm by the $L^1$-norm fails, by constructing a sequence $\{f_n\}$, $f_n \in L^1(X)$, $\|f_n\|_{L^1} \le 1$, but with $\limsup \frac{f_n(x)}{n} = \infty$ for a.e. $x$.
Solution. This is yet another example of why Stein & Shakarchi sucks. The problem doesn't say anything about conditions on $X$. In fact, the hint seems to assume that $X = [0, 1]$. I will assume that $X$ is $\sigma$-finite. I will also assume that the measure $\mu$ has the property that for any measurable set $E$ and any real number $\alpha$ with $0 \le \alpha \le \mu(E)$, there is a subset $S \subset E$ with $\mu(S) = \alpha$. (This property holds, for example, in the case of Lebesgue measure, or any measure which is absolutely continuous with respect to Lebesgue measure. In fact, we don't need quite this stringent a requirement: it isn't necessary that every subset of $X$ have this nice property, but only that we can find a nested sequence of subsets $X_n$ whose measures we can control this way, where $X = \bigcup X_n$ and $\mu(X_n) < \infty$.)
Given these assumptions, let $X = \bigcup X_n$ where $\mu(X_n) < \infty$. We construct a sequence $E_n$ of measurable subsets with the properties that $\mu(E_n) \le \frac{1}{n \log n}$ and that every $x \in X$ is in infinitely many $E_n$. To do this, we will construct countably many finite sequences and then string them all together. The first sequence $E_2, \dots, E_{N_1}$ will have the property that $X_1 = \bigcup_{j=2}^{N_1} E_j$, and $\mu(E_j) \le \frac{1}{j \log j}$. To do this, let $E_2$ be any subset of $X_1$ with measure $\frac{1}{2 \log 2}$, unless $\mu(X_1) \le \frac{1}{2 \log 2}$, in which case $E_2 = X_1$. Let $E_3$ be any subset of $X_1 \setminus E_2$ with measure $\frac{1}{3 \log 3}$, unless $\mu(X_1 \setminus E_2) \le \frac{1}{3 \log 3}$, in which case $E_3 = X_1 \setminus E_2$. Let $E_4$ be a subset of $X_1 \setminus (E_2 \cup E_3)$ with measure $\frac{1}{4 \log 4}$, or $X_1 \setminus (E_2 \cup E_3)$ if this has measure at most $\frac{1}{4 \log 4}$. This process will terminate in finitely many steps because $\sum \frac{1}{n \log n}$ diverges and $\mu(X_1)$ is finite.
We then construct a second finite sequence of sets $E_{N_1 + 1}, \dots, E_{N_2}$ whose union is $X_1 \cup X_2$, a third finite sequence whose union is $X_1 \cup X_2 \cup X_3$, etc. Let $E_n$ be the concatenation of all these finite sequences. Then every point in $X$ is in infinitely many $E_n$, and $\mu(E_n) \le \frac{1}{n \log n}$.
Now let $f_n = n \log n\, \chi_{E_n}$. Then $\|f_n\|_1 = n \log n\, \mu(E_n) \le 1$. However, $\frac{f_n(x)}{n} = \log n\, \chi_{E_n}(x)$, and since $x$ is in infinitely many $E_n$, $\limsup \frac{f_n(x)}{n} = \infty$ for all $x$.
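The construction can be watched in action in the simplest case $X = [0, 1)$ with Lebesgue measure. The sketch below is an assumption-laden simplification rather than the construction above verbatim: it takes every $E_n$ to be an arc of length exactly $\frac{1}{n \log n}$, laid end to end around the circle, and the function name `hit_indices`, the test point, and the cutoff are hypothetical.

```python
import math

def hit_indices(x, n_max):
    """Indices n in [2, n_max] with x in E_n, where the E_n are consecutive
    half-open arcs of length 1/(n log n) laid end to end around [0, 1)."""
    hits = []
    start = 0.0
    for n in range(2, n_max + 1):
        end = start + 1.0 / (n * math.log(n))
        # [start, end) contains x + k for some integer k exactly when this holds.
        if math.ceil(end - x) > math.ceil(start - x):
            hits.append(n)
        start = end
    return hits

hits = hit_indices(0.5, 200_000)
# At each hit, f_n(x)/n = log n, so these values grow without bound along the hits.
ratios = [math.log(n) for n in hits]
print(hits)
print(ratios)
```

Because $\sum \frac{1}{n \log n}$ grows only like $\log \log n$, the hits are extremely sparse, yet they never stop, which is all the $\limsup$ needs.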
Exercise 28: We know by the Borel-Cantelli lemma that if $\{E_n\}$ is a collection of measurable sets in a measure space $(X, \mu)$ and $\sum_{n=1}^{\infty} \mu(E_n) < \infty$, then $E = \limsup \{E_n\}$ has measure zero.
In the opposite direction, if $\tau$ is a mixing measure-preserving transformation on $X$ with $\mu(X) = 1$, then whenever $\sum_{n=1}^{\infty} \mu(E_n) = \infty$, there are integers $m = m_n$ so that if $E_n' = \tau^{-m_n}(E_n)$, then $\limsup(E_n') = X$ except for a set of measure 0.
Solution. Let $F_n = E_n^c$. Since $\sum \mu(E_n) = \infty$, a theorem on infinite products (e.g. Corollary 5.6 on page 166 of Conway, Functions of One Complex Variable) says that $\prod \mu(F_n) = 0$. Let $F_1' = F_1$. Because $\tau$ is mixing, there exists $m_2$ such that
\[ \mu(\tau^{-m_2}(F_2) \cap F_1) < \mu(F_1)\mu(F_2) + \frac{1}{2^2}, \]
and let $F_2' = \tau^{-m_2}(F_2)$. Next, choose $m_3$ such that
\[ \mu(\tau^{-m_3}(F_3) \cap (F_1' \cap F_2')) < \mu(F_1' \cap F_2')\mu(F_3) + \frac{1}{2^3} \]
and
\[ \mu(\tau^{-m_3}(F_3) \cap F_2') < \mu(F_2')\mu(F_3) + \frac{1}{2^3}, \]
and let $F_3' = \tau^{-m_3}(F_3)$. Note that this implies
\[ \mu(F_1' \cap F_2' \cap F_3') < \mu(F_3) \Big( \mu(F_1)\mu(F_2) + \frac{1}{2^2} \Big) + \frac{1}{2^3} = \mu(F_1)\mu(F_2)\mu(F_3) + \frac{1}{2^2}\mu(F_3) + \frac{1}{2^3} \]
and
\[ \mu(F_2' \cap F_3') < \mu(F_2)\mu(F_3) + \frac{1}{2^3}. \]
Continuing, we choose $m_k$ such that
\[ \mu\Big( \tau^{-m_k}(F_k) \cap \bigcap_{j=\ell}^{k-1} F_j' \Big) < \mu(F_k)\, \mu\Big( \bigcap_{j=\ell}^{k-1} F_j' \Big) + \frac{1}{2^k} \]
for all $\ell = 1, \dots, k - 1$; by induction this is less than
\[ \prod_{j=\ell}^{k} \mu(F_j) + \sum_{j=\ell+1}^{k} \frac{1}{2^j} \prod_{i=j+1}^{k} \mu(F_i), \]
where the latter product is taken to be 1 if empty (i.e. when $j = k$). I will prove that this tends to 0 as $k \to \infty$, for any fixed $\ell$. As noted before, $\prod_{j=\ell}^{\infty} \mu(F_j) = 0$ (any tail of the sum diverges, so any tail of the product tends to zero), so the first term tends to zero. Similarly, $\prod_{j=i}^{\infty} \mu(F_j) = 0$ for any fixed $i$. Thus, if we split the sum into a sum from $\ell + 1$ to $M$ and a sum from $M$ to $k$, the finitely many terms from $\ell + 1$ to $M$ can all be made arbitrarily small because they each approach zero as $k \to \infty$. On the other hand, each term $\frac{1}{2^j} \prod_{i=j+1}^{k} \mu(F_i)$ is at most $\frac{1}{2^j}$, so their sum is less than $\sum_{j=M}^{\infty} \frac{1}{2^j}$, which can be made arbitrarily small by an appropriate choice of $M$. This proves that
\[ \mu\Big( \bigcap_{j=\ell}^{\infty} F_j' \Big) = 0 \]
for all $\ell$. Then
\[ \mu\Big( \bigcup_{\ell=1}^{\infty} \bigcap_{j=\ell}^{\infty} F_j' \Big) = 0. \]
But the complement of this set is just $\limsup E_j'$ where $E_j' = \tau^{-m_j}(E_j)$. Thus, $\limsup E_j'$ is almost all of $X$.
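The infinite-product fact that drives this argument (if $\sum \mu(E_n) = \infty$ then $\prod \mu(F_n) = \prod (1 - \mu(E_n)) = 0$) can be illustrated with two explicit sequences. This is a hedged numerical sketch only: the sequences $p(n) = \frac{1}{n+1}$ and $p(n) = \frac{1}{(n+1)^2}$ and the function name `tail_product` are illustrative choices, not part of the exercise.

```python
def tail_product(p, first, last):
    """prod_{n=first}^{last} (1 - p(n)), standing in for prod mu(F_n) with mu(E_n) = p(n)."""
    out = 1.0
    for n in range(first, last + 1):
        out *= 1.0 - p(n)
    return out

def divergent(n):
    """mu(E_n) = 1/(n+1): the sum diverges."""
    return 1.0 / (n + 1)

def convergent(n):
    """mu(E_n) = 1/(n+1)^2: the sum converges."""
    return 1.0 / (n + 1) ** 2

d = tail_product(divergent, 1, 10_000)   # telescopes to 1/10001, heading to 0
c = tail_product(convergent, 1, 10_000)  # equals 10002/20002, staying above 1/2
print(d, c)
```

The contrast shows why divergence of $\sum \mu(E_n)$ is exactly the hypothesis needed: with a convergent sum the product of the complements' measures stays bounded away from zero.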
Chapter 6.8, Page 319
Problem 1: Suppose $\Phi$ is a $C^1$ bijection of an open set $\mathcal{O}$ in $\mathbb{R}^d$ with another open set $\mathcal{O}'$ in $\mathbb{R}^d$.
(a) If $E$ is a measurable subset of $\mathcal{O}$, then $\Phi(E)$ is also measurable.
(b) $m(\Phi(E)) = \int_E |\det \Phi'(x)|\,dx$, where $\Phi'$ is the Jacobian of $\Phi$.
(c) $\int_{\mathcal{O}'} f(y)\,dy = \int_{\mathcal{O}} f(\Phi(x)) |\det \Phi'(x)|\,dx$ whenever $f$ is integrable on $\mathcal{O}'$.
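Before the proof, formula (b) can be checked on a familiar example. The sketch below assumes the polar-coordinate map $\Phi(r, t) = (r\cos t, r\sin t)$ on $\mathcal{O} = (1, 2) \times (0, \pi/2)$, whose image is an open quarter annulus and whose Jacobian determinant is $r$; the map, the region, and the function name are illustrative choices, not taken from the problem.

```python
import math

def jacobian_integral(steps):
    """Midpoint Riemann sum of |det Phi'(r, t)| = r over (1, 2) x (0, pi/2),
    for the polar map Phi(r, t) = (r cos t, r sin t)."""
    dr = 1.0 / steps
    dt = (math.pi / 2) / steps
    total = 0.0
    for i in range(steps):
        r = 1.0 + (i + 0.5) * dr          # midpoint of the i-th r-cell
        for _ in range(steps):            # the integrand does not depend on t
            total += r * dr * dt
    return total

# Area of the image: a quarter of the annulus with radii 1 and 2.
quarter_annulus_area = math.pi * (2**2 - 1**2) / 4
approx = jacobian_integral(200)
print(approx, quarter_annulus_area)
```

Both numbers equal $3\pi/4$ up to rounding, since the midpoint rule is exact for an integrand that is linear in $r$.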
Solution.
(a) If $K$ is compact, then $\Phi(K)$ is also compact by continuity. Now if $A$ is $F_\sigma$, then $A$ is $\sigma$-compact (since every closed set in $\mathbb{R}^d$ is a countable union of compact sets), so $\Phi(A)$ is also $\sigma$-compact and hence $F_\sigma$. Thus, $\Phi$ maps $F_\sigma$ sets to $F_\sigma$ sets. Since measurable sets are precisely those that differ from $F_\sigma$ sets by a set of measure zero, it suffices to show that $\Phi$ maps sets of measure zero to sets of measure zero. (The following argument is adapted from Rudin p. 153.) Let $E \subset \mathcal{O}$ have measure zero. For each integer $n$, define
\[ F_n = \{ x \in E : |\Phi'(x)| < n \}. \]
Then for any $x \in F_n$, the definition of derivative implies that $\frac{|\Phi(y) - \Phi(x)|}{|y - x|} < n$ for $|x - y| < \delta$, for some $\delta > 0$. Define $F_{n,p} \subset F_n$ to be the set of $x$ for which $\delta = \frac{1}{p}$ works. Now $m(F_{n,p}) = 0$ because it's a subset of $E$. I claim that we can cover $F_{n,p}$ by balls of radius less than $\frac{1}{p}$ with centers in $F_{n,p}$ and total measure at most $\epsilon$. To do this, we can first cover $F_{n,p}$ by an open set of arbitrarily small measure. This open set can be decomposed into cubes of arbitrarily small diameter, as shown in Chapter 1. If the diameter is sufficiently small (less than a constant times $\frac{1}{p}$), we can cover whichever of these cubes intersect $F_{n,p}$ with a ball centered at a point of $F_{n,p}$ and radius less than $\frac{1}{p}$; the other cubes we discard. The total measure of the resulting balls is at most a constant times the measure of the open set. (This constant comes from finding the maximal possible ratio of volumes of the ball covering one of these small-diameter cubes to the cube, which comes when the center of the ball is at one corner of the cube and the radius of the ball is $\sqrt{2}$ times the cube's diameter.) This proves the claim. Now if we cover $F_{n,p}$ by such balls $B_j$, centered at $x_j$ and with radius $r_j < \frac{1}{p}$, then for $x \in B_j$ we have $|\Phi(x) - \Phi(x_j)| \le n|x - x_j| \le n r_j$. This implies that
\[ \Phi(F_{n,p}) \subset \bigcup_j B_{n r_j}(\Phi(x_j)) \Rightarrow m(\Phi(F_{n,p})) \le \sum_j m(B_{n r_j}(\Phi(x_j))) = n^d \sum_j m(B_{r_j}(x_j)) < n^d \epsilon. \]
Since $\epsilon$ was arbitrary, this shows that $m(\Phi(F_{n,p})) = 0$. Since $E = \bigcup_{n,p} F_{n,p}$, this implies $m(\Phi(E)) = 0$.
(b) I will prove this in the case where $E$ is a closed rectangle contained in $O$, and then show that this implies the general case.
Thus, assume $E = R \subset O$ for a compact rectangle $R$. By uniform continuity of each of the $n$ components of $\varphi'$ on the compact set $R$, given any $\epsilon > 0$ there exists $\delta > 0$ such that $|x - y| < \delta$, for $x, y \in R$, implies $|(\varphi'(x) - \varphi'(y))_j| < \epsilon$ for $j = 1, \dots, n$. Divide $R$ into cubes $Q_k$ of diameter less than $\delta$. Let $a_k$ be the center of $Q_k$. Then for $x \in Q_k$, the mean value theorem implies $\varphi(x)_j - \varphi(a_k)_j = \varphi'(c)_j (x - a_k)$ for some $c$ on the line segment connecting $a_k$ to $x$. Then $c \in Q_k$, so $|\varphi'(c) - \varphi'(a_k)| < \epsilon$. Thus, $|\varphi(x)_j - \varphi(a_k)_j - \varphi'(a_k)_j (x - a_k)_j| < \epsilon |x - a_k|_j$. This then implies $\|\varphi(x) - \varphi(a_k) - \varphi'(a_k)(x - a_k)\| < \epsilon \|x - a_k\|$, since a vector that is larger in each component has larger norm. This local statement translates into the global statement
\[ \varphi(a_k) + (1-\epsilon)\varphi'(a_k)(Q_k - a_k) \subset \varphi(Q_k) \subset \varphi(a_k) + (1+\epsilon)\varphi'(a_k)(Q_k - a_k), \]
which may be verified by checking it for each $x \in Q_k$, since $\varphi(x)$ must lie in the space between the sets $\varphi(a_k) + (1-\epsilon)\varphi'(a_k)(Q_k - a_k)$ and $\varphi(a_k) + (1+\epsilon)\varphi'(a_k)(Q_k - a_k)$. Now the cubes $Q_k$ are almost disjoint and $\varphi$ is a bijection, which means their images $\varphi(Q_k)$ are also almost disjoint, since $\varphi$ maps sets of measure zero to sets of measure zero by part (a). Thus, $m(\varphi(R)) = m\left(\varphi\left(\bigcup_k Q_k\right)\right) = \sum_k m(\varphi(Q_k))$ and
\[ \sum_k m\big(\varphi(a_k) + (1-\epsilon)\varphi'(a_k)(Q_k - a_k)\big) \le \sum_k m(\varphi(Q_k)) \le \sum_k m\big(\varphi(a_k) + (1+\epsilon)\varphi'(a_k)(Q_k - a_k)\big). \]
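Now by Exercise 4 of Chapter 2,
\[ \sum_k m\big(\varphi(a_k) + (1-\epsilon)\varphi'(a_k)(Q_k - a_k)\big) = (1-\epsilon)^n \sum_k |\det \varphi'(a_k)|\, m(Q_k), \]
since $\varphi'(a_k)$ is linear and scaling by the factor $1-\epsilon$ in $\mathbb{R}^n$ multiplies measure by $(1-\epsilon)^n$; the same holds with $1+\epsilon$ in place of $1-\epsilon$. Thus,
\[ (1-\epsilon)^n \sum_k |\det \varphi'(a_k)|\, m(Q_k) \le \sum_k m(\varphi(Q_k)) \le (1+\epsilon)^n \sum_k |\det \varphi'(a_k)|\, m(Q_k). \]
But $\sum_k |\det \varphi'(a_k)|\, m(Q_k)$ is a Riemann sum for $\int_R |\det \varphi'(x)|\,dx$, and Riemann integration works because $\varphi'$ is continuous on the compact set $R$. Thus,
\[ (1-\epsilon)^n \int_R |\det \varphi'(x)|\,dx \le m(\varphi(R)) \le (1+\epsilon)^n \int_R |\det \varphi'(x)|\,dx \]
for all $\epsilon$, so $m(\varphi(R)) = \int_R |\det \varphi'(x)|\,dx$.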
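Now let $E$ be any measurable subset of $O$. Since every open set is a countable union of almost disjoint closed cubes, the rectangle case together with part (a) gives $m(\varphi(U)) = \int_U |\det \varphi'|$ for every open $U \subset O$. Let $U_n \subset O$ be open sets containing $E$ with $m(U_n \setminus E) < \frac{1}{n}$; replacing $U_n$ by $U_1 \cap \dots \cap U_n$, we may take them to be nested. Then $\bigcap U_n = E \cup N$ where $m(N) = 0$. Then
\[ \varphi(E \cup N) = \varphi\left(\bigcap U_n\right) = \bigcap \varphi(U_n). \]
(In general it is only true that $\varphi(\bigcap U_n) \subset \bigcap \varphi(U_n)$, but here we have equality because $\varphi$ is bijective.) Since $m(\varphi(N)) = 0$,
\[ m(\varphi(E)) \le m(\varphi(E \cup N)) \le m(\varphi(E)) + m(\varphi(N)) \;\Rightarrow\; m(\varphi(E \cup N)) = m(\varphi(E)). \]
Now the $\varphi(U_n)$ are nested sets of finite measure, so
\[ m\left(\bigcap \varphi(U_n)\right) = \lim m(\varphi(U_n)) = \lim \int_{U_n} |\det \varphi'| = \int_{\bigcap U_n} |\det \varphi'| = \int_{E \cup N} |\det \varphi'| = \int_E |\det \varphi'|. \]
Thus, we have (finally) that $m(\varphi(E)) = \int_E |\det \varphi'|$.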
(c) We proved in part (b) that $\int_{O'} g(y)\,dy = \int_O g(\varphi(x))\,|\det \varphi'(x)|\,dx$ holds if $g$ is a characteristic function $\chi_{\varphi(E)}$ for measurable $E \subset O$. By linearity, this extends to a measurable simple function $\sum c_j \chi_{\varphi(E_j)}$. Now for $f$ nonnegative, take a sequence $f_n \nearrow f$ of simple functions; then by the Monotone Convergence Theorem,
\[
\begin{aligned}
\int_{O'} f(y)\,dy &= \int_{O'} \lim f_n(y)\,dy \\
&= \lim \int_{O'} f_n(y)\,dy \\
&= \lim \int_O f_n(\varphi(x))\,|\det \varphi'(x)|\,dx \\
&= \int_O \lim f_n(\varphi(x))\,|\det \varphi'(x)|\,dx \\
&= \int_O f(\varphi(x))\,|\det \varphi'(x)|\,dx.
\end{aligned}
\]
Finally, we can extend to complex integrable $f$ by linearity, since $f$ is a linear combination of four nonnegative integrable functions.

Problem 2: Show as a consequence of the previous problem: the measure $d\mu = \frac{dx\,dy}{y^2}$ in the upper half-plane is preserved by any fractional linear transformation $z \mapsto \frac{az+b}{cz+d}$, where $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ belongs to $SL_2(\mathbb{R})$.
Solution. We first note that such a transformation does in fact map the upper half-plane to itself: if $z = x + iy$, then
\[ \frac{az+b}{cz+d} = \frac{a(x+iy)+b}{c(x+iy)+d} = \frac{ac(x^2+y^2) + (ad+bc)x + bd}{(cx+d)^2 + (cy)^2} + i\,\frac{(ad-bc)y}{(cx+d)^2 + (cy)^2}, \]
and since $y > 0$ and $ad - bc = 1$ this has positive imaginary part. Now if we write the map $z \mapsto \frac{az+b}{cz+d}$ in terms of its components as $(x, y) \mapsto (x', y')$, then using the ugly formulas for $x'$ and $y'$ from the above expression, we can compute the even uglier partial derivatives and the Jacobian. However, we can shortcut that by using the fact that the Jacobian determinant is always the squared norm of the complex derivative. In case this needs proof, suppose $z \mapsto f(z)$ is a complex differentiable function. Then if $f'(z_0) = \alpha + i\beta$, the linear map $z \mapsto f'(z_0)(z - z_0)$ can be rewritten as
\[ (x+iy) \mapsto (\alpha+i\beta)\big((x-x_0) + i(y-y_0)\big) = \big(\alpha(x-x_0) - \beta(y-y_0)\big) + i\big(\beta(x-x_0) + \alpha(y-y_0)\big), \]
or
\[ \begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} \alpha & -\beta \\ \beta & \alpha \end{pmatrix} \begin{pmatrix} x - x_0 \\ y - y_0 \end{pmatrix}, \]
which has Jacobian $\alpha^2 + \beta^2 = |f'(z_0)|^2$. Now in our case,
\[ f'(z) = \frac{ad - bc}{(cz+d)^2} = \frac{1}{(cz+d)^2}. \]
If we let $\varphi$ denote the mapping $z \mapsto z'$ (and equivalently $(x, y) \mapsto (x', y')$), then
\[
\begin{aligned}
\mu(\varphi(E)) &= \int_{\varphi(E)} \frac{1}{(y')^2}\,dx'\,dy' \\
&= \int_E \frac{1}{(y')^2}\,|\det \varphi'|\,dx\,dy \\
&= \int_E \frac{\big((cx+d)^2 + (cy)^2\big)^2}{y^2}\,\frac{1}{|(c(x+iy)+d)^2|^2}\,dx\,dy \\
&= \int_E \frac{\big((cx+d)^2 + (cy)^2\big)^2}{y^2 \big((cx+d)^2 + (cy)^2\big)^2}\,dx\,dy \\
&= \int_E \frac{1}{y^2}\,dx\,dy \\
&= \mu(E).
\end{aligned}
\]
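The two facts this solution relies on, that the imaginary part of the image is $y / |cz+d|^2$ and that the Jacobian determinant is $1/|cz+d|^4$, can be spot-checked numerically. The following is only an illustrative sketch: the helper names `mobius` and `check`, the sample matrix, and the sample point are assumptions chosen for the example, and the Jacobian is approximated by central differences rather than computed exactly.

```python
# Illustrative sketch only: names and sample values are assumptions, not from the text.

def mobius(z, a, b, c, d):
    """Fractional linear transformation z -> (az + b) / (cz + d)."""
    return (a * z + b) / (c * z + d)

def check(z, a, b, c, d, h=1e-6):
    w = mobius(z, a, b, c, d)
    # Claimed imaginary part: y / |cz + d|^2 (uses ad - bc = 1).
    im_formula = z.imag / abs(c * z + d) ** 2
    # Partial derivatives of (x', y') with respect to x and y, by central differences.
    dx = (mobius(z + h, a, b, c, d) - mobius(z - h, a, b, c, d)) / (2 * h)
    dy = (mobius(z + 1j * h, a, b, c, d) - mobius(z - 1j * h, a, b, c, d)) / (2 * h)
    # Determinant of the 2x2 real Jacobian matrix.
    jac = dx.real * dy.imag - dy.real * dx.imag
    jac_formula = 1 / abs(c * z + d) ** 4
    # Density 1/y'^2 times the Jacobian should reproduce 1/y^2.
    density = jac / w.imag ** 2
    return w.imag, im_formula, jac, jac_formula, density, 1 / z.imag ** 2

a, b, c, d = 2.0, 3.0, 1.0, 2.0  # ad - bc = 4 - 3 = 1
results = check(0.7 + 1.3j, a, b, c, d)
```

A single sample point proves nothing, of course; the check is only meant to catch sign or exponent slips in the formulas above.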

Problem 3: Let $S$ be a hypersurface in $\mathbb{R}^d = \mathbb{R}^{d-1} \times \mathbb{R}$, given by
\[ S = \{(x, y) \in \mathbb{R}^{d-1} \times \mathbb{R} : y = F(x)\}, \]
with $F$ a $C^1$ function defined on an open set $\Omega \subset \mathbb{R}^{d-1}$. For each subset $E \subset \Omega$ we write $\hat{E}$ for the corresponding subset of $S$ given by $\hat{E} = \{(x, F(x)) : x \in E\}$. We note that the Borel sets of $S$ can be defined in terms of the metric on $S$ (which is the restriction of the Euclidean metric on $\mathbb{R}^d$). Thus if $E$ is a Borel set in $\Omega$, then $\hat{E}$ is a Borel subset of $S$.
(a) Let $\mu$ be the Borel measure on $S$ given by
\[ \mu(\hat{E}) = \int_E \sqrt{1 + |\nabla F|^2}\,dx. \]
If $B$ is a ball in $\Omega$, let $\hat{B}^\delta = \{(x, y) \in \mathbb{R}^d : d((x, y), \hat{B}) < \delta\}$. Show that
\[ \mu(\hat{B}) = \lim_{\delta \to 0} \frac{1}{2\delta}\, m\big((\hat{B})^\delta\big), \]
where $m$ denotes the $d$-dimensional Lebesgue measure. This result is analogous to Theorem 4.4 in Chapter 3.
(b) One may apply (a) to the case when $S$ is the (upper) half of the unit sphere in $\mathbb{R}^d$, given by $y = F(x)$, $F(x) = (1 - |x|^2)^{1/2}$, $|x| < 1$, $x \in \mathbb{R}^{d-1}$. Show that in this case $d\mu = d\sigma$, the measure on the sphere arising in the polar coordinate formula in Section 3.2.
(c) The above conclusion allows one to write an explicit formula for $d\sigma$ in terms of spherical coordinates. Take, for example, the case $d = 3$, and write $y = \cos\theta$, $x = (x_1, x_2) = (\sin\theta\cos\phi, \sin\theta\sin\phi)$ with $0 \le \theta < \frac{\pi}{2}$, $0 \le \phi < 2\pi$. Then according to (a) and (b) the element of area $d\sigma$ equals $(1 - |x|^2)^{-1/2}\,dx$. Use the change of variable theorem in Problem 1 to deduce that in this case $d\sigma = \sin\theta\,d\theta\,d\phi$. This may be generalized to $d$ dimensions, $d \ge 2$, to obtain the formulas in Section 2.4 of the appendix in Book I.
Solution.
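(a) We proceed in the steps outlined in Prof. Garnett's hint:

Since $B$ is compact and $\Omega^c$ is closed, the distance between them is greater than zero, so we can choose $\delta < d(B, \Omega^c)$. Let $V_\delta = \{x : d(x, B) < \delta\}$. For each $x \in V_\delta$ define $I_\delta(x) = \{y \in \mathbb{R} : (x, y) \in \hat{B}^\delta\}$ and $h(x, \delta) = m(I_\delta(x))$.

Note that $\hat{B}^\delta \subset V_\delta \times \mathbb{R}$, since for $(x, y) \in \mathbb{R}^{d-1} \times \mathbb{R}$, $d((x, y), \hat{B}) \ge d(x, B)$. By Tonelli's theorem,
\[ m(\hat{B}^\delta) = \int_{(x,y) \in V_\delta \times \mathbb{R}} \chi_{I_\delta(x)}(y) = \int_{V_\delta} \int_{\mathbb{R}} \chi_{I_\delta(x)}(y)\,dy\,dx = \int_{V_\delta} h(x, \delta)\,dx. \]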
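For $x \in B$, let $M = |\nabla F|$ evaluated at $x$. Let $v$ be the unit vector in the direction of $\nabla F$. Then $\nabla F \cdot v = M$ at $x$. By the continuity of $\nabla F$, for any $\epsilon > 0$ we may choose $\delta_0 > 0$ such that $|\nabla F| < M + \epsilon$ and $\nabla F \cdot v > M - \epsilon$ at all points within $\delta_0$ of $x$. Suppose $\delta < \delta_0$. To lighten notation, translate in $y$ so that $F(x) = 0$, and write $I_{x,\delta}$ for $I_\delta(x)$. Then for $y \in \mathbb{R}$, if $F(x) \le y \le \delta\sqrt{1 + (M-\epsilon)^2}$, consider $F(x + tv)$ for $0 \le t \le \frac{\delta(M-\epsilon)}{\sqrt{1 + (M-\epsilon)^2}}$. By the construction of $\delta_0$, $F(x) + (M-\epsilon)t < F(x + tv) < F(x) + (M+\epsilon)t$. Hence, because $\frac{y(M-\epsilon)}{1 + (M-\epsilon)^2} \le \frac{\delta(M-\epsilon)}{\sqrt{1 + (M-\epsilon)^2}} < \delta$,
\[ F\left(x + \frac{y(M-\epsilon)}{1 + (M-\epsilon)^2}\,v\right) > F(x) + \frac{y(M-\epsilon)^2}{1 + (M-\epsilon)^2}. \]
By the intermediate value theorem, there is some $t_0 < \frac{y(M-\epsilon)}{1 + (M-\epsilon)^2}$ at which $F(x + t_0 v) = F(x) + \frac{y(M-\epsilon)^2}{1 + (M-\epsilon)^2}$. Let $x_0 = x + t_0 v$. Then $(x_0, F(x_0)) \in \hat{B}$; moreover, the distance squared from $(x, y)$ to $(x_0, F(x_0))$ is
\[
\begin{aligned}
t_0^2 + \left(y - \frac{y(M-\epsilon)^2}{1 + (M-\epsilon)^2}\right)^2 &= t_0^2 + \left(\frac{y}{1 + (M-\epsilon)^2}\right)^2 \\
&\le \left(\frac{y(M-\epsilon)}{1 + (M-\epsilon)^2}\right)^2 + \left(\frac{y}{1 + (M-\epsilon)^2}\right)^2 \\
&= \frac{y^2}{1 + (M-\epsilon)^2} \le \delta^2,
\end{aligned}
\]
so $y \in I_{x,\delta}$. On the other hand, if $y > F(x) + \delta\sqrt{1 + (M+\epsilon)^2}$, suppose $y \in I_{x,\delta}$, so there is some $x'$ with $\mathrm{dist}((x, y), (x', F(x'))) < \delta$. Clearly this implies $|x' - x| < \delta$; suppose $|x' - x| = t$. Then because $|\nabla F| < M + \epsilon$ between $x$ and $x'$, $F(x') < F(x) + (M+\epsilon)t$. Then the distance squared from $(x, y)$ to $(x', F(x'))$ is
\[ t^2 + (y - F(x'))^2 \ge t^2 + (y - (M+\epsilon)t)^2 = (1 + (M+\epsilon)^2)t^2 - 2y(M+\epsilon)t + y^2. \]
This is a quadratic polynomial in $t$; the minimum occurs when the derivative is zero, i.e. when
\[ 2t(1 + (M+\epsilon)^2) = 2y(M+\epsilon) \;\Rightarrow\; t = \frac{y(M+\epsilon)}{1 + (M+\epsilon)^2}. \]
At this point, the value of the quadratic is
\[ (1 + (M+\epsilon)^2)\left(\frac{y(M+\epsilon)}{1 + (M+\epsilon)^2}\right)^2 - 2y(M+\epsilon)\,\frac{y(M+\epsilon)}{1 + (M+\epsilon)^2} + y^2 = \frac{y^2}{1 + (M+\epsilon)^2} > \delta^2, \]
so in fact there is no point in $\hat{B}$ within $\delta$ of $(x, y)$, a contradiction. Thus, $y \notin I_{x,\delta}$. By symmetry, the same results hold for $y < F(x)$, so
\[ \left[-\delta\sqrt{1 + (M-\epsilon)^2},\ \delta\sqrt{1 + (M-\epsilon)^2}\right] \subset I_{x,\delta} \subset \left[-\delta\sqrt{1 + (M+\epsilon)^2},\ \delta\sqrt{1 + (M+\epsilon)^2}\right]. \]
Thus,
\[ \sqrt{1 + (M-\epsilon)^2} \le \frac{h(x, \delta)}{2\delta} \le \sqrt{1 + (M+\epsilon)^2} \]
for $\delta < \delta_0$. Since $\epsilon$ was arbitrary, this proves that
\[ \lim_{\delta \to 0} \frac{h(x, \delta)}{2\delta} = \sqrt{1 + |\nabla F|^2}. \]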
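Because $\nabla F$ is continuous on the compact set $B$, $|\nabla F|$ attains a maximum $M$ on $B$. For any $x \in B$, if $|y - F(x)| > (M+1)\delta$ then $y \notin I_{x,\delta}$, because any point $(x', F(x'))$ within $\delta$ of $(x, y)$ would have to have $|x' - x| < \delta$, and then the bound on $\nabla F$ implies $|F(x') - F(x)| < M\delta \Rightarrow |y - F(x')| > \delta$. Thus, $\frac{h(x, \delta)}{2\delta} < M + 1$ for all $\delta$.

Since we are interested in the limit of small $\delta$, we can restrict our attention to $\delta$ below some cutoff value, say $a$, where $a < d(B, \Omega^c)$. Then $\int_{V_\delta} h(x, \delta) = \int_{V_a} h(x, \delta)$ because $h(x, \delta) = 0$ for $x \in V_a \setminus V_\delta$. This enables us to take all the integrals over the same region. Moreover, since $V_a$ has finite measure, the constant $M + 1$ is integrable over $V_a$. So by the dominated convergence theorem,
\[
\begin{aligned}
\lim_{\delta \to 0} \frac{1}{2\delta}\, m(\hat{B}^\delta) &= \lim_{\delta \to 0} \int_{V_a} \frac{h(x, \delta)}{2\delta}\,dx \\
&= \int_{V_a} \lim_{\delta \to 0} \frac{h(x, \delta)}{2\delta}\,dx \\
&= \int_B \sqrt{1 + |\nabla F|^2}\,dx \\
&= \mu(\hat{B}).
\end{aligned}
\]
(The region of integration becomes $B$ in the limit because $h(x, \delta)$ is eventually 0 for $x$ outside $B$.)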
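(b) For two points $\alpha, \beta \in S^{d-1}$, let $a(\alpha, \beta)$ denote the angular distance between $\alpha$ and $\beta$, i.e. the angle between the radius vectors to $\alpha$ and $\beta$. Similarly, for a point $\alpha \in S^{d-1}$ and a set $E \subset S^{d-1}$, define $a(\alpha, E) = \inf_{\beta \in E} a(\alpha, \beta)$. Now given a ball $B \subset \mathbb{R}^{d-1}$ and corresponding set $\hat{B} \subset S^{d-1}$, define $B'_\delta = \{p \in S^{d-1} : a(p, \hat{B}) < \arcsin(\delta)\}$. I claim that
\[ \hat{B} \times [1-\delta, 1+\delta] \subset (\hat{B})^\delta \subset B'_\delta \times [1-\delta, 1+\delta]. \]
Here the product is in spherical coordinates, of course. The first inclusion is obvious because if $(\alpha, r) \in \hat{B} \times [1-\delta, 1+\delta]$, then $(\alpha, 1) \in \hat{B}$ and is a distance $|1 - r|$ away. For the second inclusion, let $(\alpha, r)$ be any point in $\mathbb{R}^d$. If $\alpha \notin B'_\delta$, let $(\beta, 1)$ be any point in $\hat{B}$. The distance from $(\beta, 1)$ to the line through the origin and $(\alpha, r)$ is $\sin(a(\alpha, \beta))$ by elementary trigonometry. By hypothesis this is at least $\delta$, so the distance from $(\beta, 1)$ to $(\alpha, r)$ is at least $\delta$. On the other hand, if $r \notin [1-\delta, 1+\delta]$, then no point in $S^{d-1}$ is within $\delta$ of $(\alpha, r)$. This proves the second inclusion. Now by the spherical coordinates formulas derived in Section 3,
\[ m(\hat{B} \times [1-\delta, 1+\delta]) = \sigma(\hat{B}) \int_{1-\delta}^{1+\delta} r^{d-1}\,dr = \frac{(1+\delta)^d - (1-\delta)^d}{d}\,\sigma(\hat{B}), \]
so
\[ \frac{m(\hat{B} \times [1-\delta, 1+\delta])}{2\delta} = \frac{1}{d}\,\frac{(1+\delta)^d - (1-\delta)^d}{2\delta}\,\sigma(\hat{B}). \]
Since $\frac{(1+\delta)^d - (1-\delta)^d}{2\delta}$ is a difference quotient for the function $f(x) = x^d$ at $x = 1$, it approaches $\frac{d}{dx} x^d \big|_{x=1} = d$ as $\delta \to 0$, so $\frac{1}{2\delta}\, m(\hat{B} \times [1-\delta, 1+\delta]) \to \sigma(\hat{B})$. Similarly,
\[ \frac{1}{2\delta}\, m(B'_\delta \times [1-\delta, 1+\delta]) = \frac{1}{d}\,\frac{(1+\delta)^d - (1-\delta)^d}{2\delta}\,\sigma(B'_\delta). \]
As $\delta \to 0$, this approaches $\lim_{\delta \to 0} \sigma(B'_\delta)$ (provided the latter exists, of course). Now since the $B'_\delta$ are nested and $\bigcap_\delta B'_\delta = \overline{\hat{B}}$, $\lim \sigma(B'_\delta) = \sigma(\overline{\hat{B}}) = \sigma(\hat{B})$ by the continuity of measures. (It hardly bears pointing out here that the $\sigma(B'_\delta)$ are all finite.) By the Sandwich Theorem,
\[ \mu(\hat{B}) = \lim_{\delta \to 0} \frac{1}{2\delta}\, m(\hat{B}^\delta) = \sigma(\hat{B}). \]
Thus, $\mu = \sigma$ on balls, but since these generate the Borel sets, $\mu = \sigma$.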
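(c) By the change of variable theorem,
\[ dx = \begin{vmatrix} \frac{\partial x_1}{\partial\theta} & \frac{\partial x_1}{\partial\phi} \\ \frac{\partial x_2}{\partial\theta} & \frac{\partial x_2}{\partial\phi} \end{vmatrix}\,d\theta\,d\phi = \begin{vmatrix} \cos\theta\cos\phi & -\sin\theta\sin\phi \\ \cos\theta\sin\phi & \sin\theta\cos\phi \end{vmatrix}\,d\theta\,d\phi = \sin\theta\cos\theta\,d\theta\,d\phi. \]
Note also that
\[ \nabla F = \left(\frac{\partial F}{\partial x_1}, \frac{\partial F}{\partial x_2}\right) = \left(\frac{-x_1}{\sqrt{1 - x_1^2 - x_2^2}}, \frac{-x_2}{\sqrt{1 - x_1^2 - x_2^2}}\right), \]
so
\[ 1 + |\nabla F|^2 = 1 + \frac{x_1^2}{1 - x_1^2 - x_2^2} + \frac{x_2^2}{1 - x_1^2 - x_2^2} = \frac{1}{1 - x_1^2 - x_2^2} = \frac{1}{1 - \sin^2\theta} = \frac{1}{\cos^2\theta}. \]
Then
\[ d\sigma = \sqrt{1 + |\nabla F|^2}\,dx = \frac{1}{\cos\theta}\,\sin\theta\cos\theta\,d\theta\,d\phi = \sin\theta\,d\theta\,d\phi. \]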
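The three-dimensional computation above can be sanity-checked numerically. This is a rough sketch under stated assumptions: the function names `x_of`, `jacobian_det`, and `area_element` are made up for the illustration, and the partial derivatives are approximated by central differences instead of being taken symbolically.

```python
import math

# Illustrative sketch only: helper names are assumptions, not from the text.

def x_of(theta, phi):
    """Projection (x1, x2) of the point on the upper unit hemisphere."""
    return (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi))

def jacobian_det(theta, phi, h=1e-6):
    """Central-difference approximation of det d(x1, x2) / d(theta, phi)."""
    tp, tm = x_of(theta + h, phi), x_of(theta - h, phi)
    pp, pm = x_of(theta, phi + h), x_of(theta, phi - h)
    d_theta = ((tp[0] - tm[0]) / (2 * h), (tp[1] - tm[1]) / (2 * h))
    d_phi = ((pp[0] - pm[0]) / (2 * h), (pp[1] - pm[1]) / (2 * h))
    return d_theta[0] * d_phi[1] - d_phi[0] * d_theta[1]

def area_element(theta, phi):
    """sqrt(1 + |grad F|^2) times the Jacobian, with F(x) = sqrt(1 - |x|^2)."""
    x1, x2 = x_of(theta, phi)
    grad_factor = 1 / math.sqrt(1 - x1 * x1 - x2 * x2)
    return grad_factor * jacobian_det(theta, phi)

theta0, phi0 = 0.6, 1.1          # arbitrary sample point with 0 < theta < pi/2
jac_value = jacobian_det(theta0, phi0)
area_value = area_element(theta0, phi0)
```

At the sample point, `jac_value` should be close to $\sin\theta\cos\theta$ and `area_value` close to $\sin\theta$, matching the two displayed formulas.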
In $n$ dimensions, one may use for spherical coordinates the $n$ angles $\theta_1, \dots, \theta_n$ with
\[
\begin{aligned}
y &= \cos\theta_1 \\
x_1 &= \sin\theta_1\cos\theta_2 \\
x_2 &= \sin\theta_1\sin\theta_2\cos\theta_3 \\
&\;\;\vdots \\
x_{n-1} &= \sin\theta_1 \cdots \sin\theta_{n-1}\cos\theta_n \\
x_n &= \sin\theta_1 \cdots \sin\theta_n.
\end{aligned}
\]
Then the Jacobian $\frac{\partial(x_1, \dots, x_n)}{\partial(\theta_1, \dots, \theta_n)}$ is
\[
\begin{vmatrix}
c_1 c_2 & -s_1 s_2 & 0 & 0 & \cdots & 0 \\
c_1 s_2 c_3 & s_1 c_2 c_3 & -s_1 s_2 s_3 & 0 & \cdots & 0 \\
c_1 s_2 s_3 c_4 & s_1 c_2 s_3 c_4 & s_1 s_2 c_3 c_4 & -s_1 s_2 s_3 s_4 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
c_1 s_2 \cdots s_n & s_1 c_2 s_3 \cdots s_n & s_1 s_2 c_3 s_4 \cdots s_n & s_1 s_2 s_3 c_4 s_5 \cdots s_n & \cdots & s_1 \cdots s_{n-1} c_n
\end{vmatrix}
\]
where $c_i = \cos\theta_i$ and $s_i = \sin\theta_i$. Call this $J_n(\theta_1, \dots, \theta_n)$. Then we can compute these inductively: a cofactor expansion along the first row yields
\[ J_n(\theta_1, \dots, \theta_n) = c_1 c_2 s_1^{n-1} J_{n-1}(\theta_2, \theta_3, \dots, \theta_n) + s_1 s_2 s_2^{n-1} J_{n-1}(\theta_1, \theta_3, \dots, \theta_n). \]
Using as our initial case $J_2(\theta_1, \theta_2) = c_1 s_1$ as computed above, one has by an easy induction that
\[ J_n(\theta_1, \dots, \theta_n) = c_1 s_1^{n-1} s_2^{n-2} s_3^{n-3} \cdots s_{n-1}. \]
Since $1 + |\nabla F|^2 = \frac{1}{\cos^2\theta_1}$ as before, this yields the area element
\[ d\sigma = \sin^{n-1}\theta_1 \sin^{n-2}\theta_2 \cdots \sin\theta_{n-1}\,d\theta_1 \cdots d\theta_n. \]

Note: If the point of this exercise was to calculate the area element on the unit sphere, it seems that a more direct way is to use the change of variables formula to compute the volume element in spherical coordinates ($r^2 \sin\theta\,dr\,d\theta\,d\phi$ in three dimensions), and since the measure of a set $E$ on the unit sphere is defined to be the measure of the corresponding conical segment, we can compute it as
\[ \sigma(E) = 3 \int_{[0,1] \times E'} r^2 \sin\theta\,dr\,d\theta\,d\phi = \int_{E'} \sin\theta\,d\theta\,d\phi, \]
where $E'$ is the region in the $\theta\phi$ plane corresponding to the spherical region $E$. This implies $d\sigma = \sin\theta\,d\theta\,d\phi$, with similar formulas in higher dimensions.
Problem 6: Consider an automorphism $A$ of $\mathbb{T}^d = \mathbb{R}^d / \mathbb{Z}^d$, that is, $A$ is a linear isomorphism of $\mathbb{R}^d$ that preserves the lattice $\mathbb{Z}^d$. Note that $A$ can be written as a $d \times d$ matrix whose entries are integers, with $\det A = \pm 1$. Define the mapping $\tau : \mathbb{T}^d \to \mathbb{T}^d$ by $\tau(x) = A(x)$.
(a) Observe that $\tau$ is a measure-preserving isomorphism of $\mathbb{T}^d$.
(b) Show that $\tau$ is ergodic (in fact, mixing) if and only if $A$ has no eigenvalues of the form $e^{2\pi i p/q}$, where $p$ and $q$ are integers.
(c) Note that $\tau$ is never uniquely ergodic. (Hint.)
Solution.
(a) Duly noted.
OK, I guess I'm supposed to prove it. :-) Since $A$ is a linear isomorphism, $A^{-1}$ is as well. Let $E \subset \mathbb{T}^d$ be measurable, and let $\tilde{E}$ be the corresponding subset of the unit cube in $\mathbb{R}^d$. Then
\[ \mu(A^{-1}E) = m(A^{-1}(\tilde{E})) = |\det A^{-1}|\, m(\tilde{E}) = m(\tilde{E}) = \mu(E), \]
where $\mu$ is the measure on $\mathbb{T}^d$ induced by the Lebesgue measure $m$ on $\mathbb{R}^d$.
(b) Suppose that $A$ has an eigenvalue of the form $e^{2\pi i p/q}$; then $A^T$ does as well, since it has the same characteristic polynomial. Then $(A^T)^q$ has 1 as an eigenvalue. Since $(A^T)^q - I$ has all rational entries, there is a (nonzero) eigenvector in $\mathbb{Q}^d$, and hence, by scaling, in $\mathbb{Z}^d$. Let $n \in \mathbb{Z}^d$ with $(A^T)^q n = n$. Consider the function $f(x) = e^{2\pi i n \cdot x}$ for $x \in \mathbb{T}^d$. Then since $n \cdot (Ax) = (A^T n) \cdot x$, $T^k f(x) = e^{2\pi i ((A^T)^k n) \cdot x}$. The averages of this function are
\[ A_m f(x) = \frac{1}{m} \sum_{k=0}^{m-1} e^{2\pi i ((A^T)^k n) \cdot x}. \]
But $T^{k+q} f = T^k f$ for all $k$, so $A_{jq} f(x) = A_q f(x)$ for any positive integer $j$. Since $A_q f(x)$ is not zero (it is a linear combination of exponentials with distinct frequencies, since we may assume WLOG that $q$ is as small as possible), the averages do not converge a.e. to $\int_{\mathbb{T}^d} f(x)\,dx = 0$. Hence $\tau$ cannot be ergodic.
On the other hand, suppose $A$ has no eigenvalue of the form $e^{2\pi i p/q}$. Let $f(x) = e^{2\pi i n \cdot x}$ and $g(x) = e^{2\pi i m \cdot x}$ for any $m, n \in \mathbb{Z}^d$. Then
\[ \langle T^k f, g \rangle = \int_{\mathbb{T}^d} e^{2\pi i ((A^T)^k n - m) \cdot x}\,dx. \]
If $m = n = 0$ the integrand is 1 and the integral is $1 = \langle f, 1 \rangle \langle 1, g \rangle$ for all $k$. If $m$ and $n$ are not both zero, then $(A^T)^k n - m$ is eventually nonzero; if not, there would be distinct values $k_1 < k_2$ at which it were zero, but then $(A^T)^{k_1} n = (A^T)^{k_2} n$, so $(A^T)^{k_1} n$ is an eigenvector of $(A^T)^{k_2 - k_1}$ with eigenvalue 1. (Since $A$ is invertible, this eigenvector is nonzero.) This then implies that $A$ has an eigenvalue which is a $(k_2 - k_1)$th root of unity, a contradiction. Thus, $(A^T)^k n - m$ is eventually nonzero, so $\langle T^k f, g \rangle$ is eventually equal to $0 = \langle f, 1 \rangle \langle 1, g \rangle$. Since such exponentials span a dense subspace of $L^2(\mathbb{T}^d)$, $\tau$ is mixing.
As a side note, the fact that (1 is an eigenvalue of $A^k$) implies (some $k$th root of 1 is an eigenvalue of $A$) is not completely trivial. The converse is trivial, of course, but this direction is not if $A$ is not diagonalizable (at least not for any reason that I've found). It follows, however, from the Jordan canonical form. If $J_m(\lambda)$ is a Jordan block of size $m$ with $\lambda$ on the diagonal, then $J_m(\lambda)^k$ is triangular with $\lambda^k$ on the diagonal, so it has $\lambda^k$ as an $m$-fold eigenvalue. This implies that the algebraic multiplicity of $\lambda^k$ in $A^k$ is at least the algebraic multiplicity of $\lambda$ in $A$; since $A$ and $A^k$ have the same dimension, $A^k$ can have no other eigenvalues besides $k$th powers of eigenvalues of $A$ (since the sum of the multiplicities of the $\lambda_i^k$ is already equal to the dimension of $A^k$).
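The role of the condition $(A^T)^q n = n$ in part (b) can be illustrated with two small integer matrices. The sketch below is only an illustration: the helper names `apply_transpose` and `first_return`, the two example matrices, and the search bound `max_k` are assumptions for the example, and a bounded search can show that a return exists but cannot by itself prove that none exists.

```python
# Illustrative sketch only: helper names and example matrices are assumptions.

def apply_transpose(A, n):
    """Return A^T n for an integer matrix A (list of rows) and integer vector n."""
    d = len(A)
    return [sum(A[i][j] * n[i] for i in range(d)) for j in range(d)]

def first_return(A, n, max_k=50):
    """Smallest k in 1..max_k with (A^T)^k n == n, or None if there is none."""
    v = list(n)
    for k in range(1, max_k + 1):
        v = apply_transpose(A, v)
        if v == list(n):
            return k
    return None

rotation = [[0, -1], [1, 0]]  # eigenvalues +i and -i, which are 4th roots of unity
cat_map = [[2, 1], [1, 1]]    # eigenvalues (3 +- sqrt(5)) / 2, not roots of unity

rotation_return = first_return(rotation, [1, 0])
cat_return = first_return(cat_map, [1, 0])
```

For the rotation, the frequency vector returns after 4 steps, so the averages of $e^{2\pi i n \cdot x}$ are periodic and $\tau$ is not ergodic; for the second matrix, the orbit of the frequency vector keeps growing and no return is found within the search bound, consistent with the mixing case.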
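(c) If $\nu$ is the Dirac measure, i.e. $\nu(E) = 1$ if $0 \in E$ and 0 otherwise, then $\nu(\tau^{-1}(E)) = \nu(E)$ because the linearity of $\tau$ guarantees $0 \in \tau^{-1}(E) \Leftarrow 0 \in E$, and the invertibility of $\tau$ guarantees $0 \in E \Leftarrow 0 \in \tau^{-1}(E)$. So $\nu$ is an invariant probability measure different from Lebesgue measure, and $\tau$ is not uniquely ergodic.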

Problem 8: Let $X = [0, 1)$, $\tau(x) = \langle 1/x \rangle$, $x \ne 0$, $\tau(0) = 0$. Here $\langle x \rangle$ denotes the fractional part of $x$. With the measure $d\mu = \frac{1}{\log 2}\,\frac{dx}{1+x}$, we have of course $\mu(X) = 1$.
Show that $\tau$ is a measure-preserving transformation.
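Solution. Let $(a, b) \subset [0, 1)$. Then
\[ \left\langle \frac{1}{x} \right\rangle \in (a, b) \;\Leftrightarrow\; \frac{1}{x} \in \bigcup_{n=1}^\infty (n + a, n + b) \;\Leftrightarrow\; x \in \bigcup_{n=1}^\infty \left(\frac{1}{n+b}, \frac{1}{n+a}\right), \]
so
\[
\begin{aligned}
\mu(\tau^{-1}((a, b))) &= \sum_{n=1}^\infty \mu\left(\left(\frac{1}{n+b}, \frac{1}{n+a}\right)\right) \\
&= \sum_{n=1}^\infty \frac{1}{\log 2} \int_{1/(n+b)}^{1/(n+a)} \frac{dx}{1+x} \\
&= \frac{1}{\log 2} \sum_{n=1}^\infty \log\left(\frac{1 + \frac{1}{n+a}}{1 + \frac{1}{n+b}}\right) \\
&= \frac{1}{\log 2} \log \prod_{n=1}^\infty \frac{(n+a+1)(n+b)}{(n+a)(n+b+1)}.
\end{aligned}
\]
But
\[ \prod_{n=1}^\infty \frac{(n+a+1)(n+b)}{(n+a)(n+b+1)} = \frac{1+b}{1+a} \]
because the product telescopes; all terms cancel except $1 + b$ on the top and $1 + a$ on the bottom. Hence
\[ \mu(\tau^{-1}((a, b))) = \frac{1}{\log 2} \log\left(\frac{1+b}{1+a}\right) = \mu((a, b)). \]
Since $\tau$ is measure-preserving on intervals and these generate the Borel sets, it is measure-preserving.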
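The telescoping identity can be checked numerically by summing the measures of the preimage intervals. This is a small sketch, with the function names `mu` and `mu_preimage`, the sample interval, and the truncation level `terms` all chosen for the illustration. The partial sum over $n \le N$ differs from the limit by $\frac{1}{\log 2}\log\frac{N+a+1}{N+b+1}$, which is of order $(b-a)/N$, so agreement is only expected to a few digits.

```python
import math

# Illustrative sketch only: function names and sample values are assumptions.

def mu(a, b):
    """Gauss measure of the interval (a, b), for 0 <= a < b <= 1."""
    return math.log((1 + b) / (1 + a)) / math.log(2)

def mu_preimage(a, b, terms=100000):
    """Partial sum of mu over the intervals (1/(n+b), 1/(n+a)), n = 1..terms."""
    return sum(mu(1 / (n + b), 1 / (n + a)) for n in range(1, terms + 1))

a0, b0 = 0.25, 0.75              # arbitrary sample interval
direct = mu(a0, b0)              # mu((a, b))
summed = mu_preimage(a0, b0)     # approximates mu(tau^{-1}((a, b)))
```

Raising `terms` shrinks the gap between `summed` and `direct` roughly like $1/N$, as the telescoping formula predicts.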
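Note: By following the hint and telescoping a sum rather than a product, it is possible to prove $\tau$ is measure-preserving for all Borel sets directly rather than proving it for intervals and then passing to all Borel sets. Let $E \subset [0, 1)$ be Borel, let $E + k$ denote the translates of $E$, and $1/(E + k) = \{1/(x + k) : x \in E\}$. Since
\[ \frac{1}{1+x} = \sum_{k=1}^\infty \left(\frac{1}{k+x} - \frac{1}{k+1+x}\right) = \sum_{k=1}^\infty \frac{1}{(k+x)(k+1+x)}, \]
it follows that
\[
\begin{aligned}
\mu(E) &= \frac{1}{\log 2} \int_E \frac{1}{1+x}\,dx \\
&= \frac{1}{\log 2} \int_E \sum_{k=1}^\infty \frac{1}{(x+k)(x+k+1)}\,dx \\
&= \frac{1}{\log 2} \sum_{k=1}^\infty \int_E \frac{1}{(x+k)(x+k+1)}\,dx \\
&= \frac{1}{\log 2} \sum_{k=1}^\infty \int_{E+k} \frac{1}{x(x+1)}\,dx,
\end{aligned}
\]
where we have used the monotone convergence theorem to interchange the sum and the integral. Now if we make the change of variable $x = \frac{1}{y}$, then it turns out that $\frac{dx}{x(x+1)} = \frac{dy}{1+y}$ (up to a sign that is absorbed by the reversal of orientation), so this equals
\[ \frac{1}{\log 2} \sum_{k=1}^\infty \int_{1/(E+k)} \frac{1}{y+1}\,dy = \sum_{k=1}^\infty \mu\left(\frac{1}{E+k}\right) = \mu\left(\bigcup_{k=1}^\infty \frac{1}{E+k}\right) = \mu(\tau^{-1}(E)). \]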
Chapter 7.5, Page 380
Exercise 2: Suppose $E_1$ and $E_2$ are two compact subsets of $\mathbb{R}^d$ such that $E_1 \cap E_2$ contains at most one point. Show directly from the definition of the exterior measure that if $0 < \alpha \le d$, and $E = E_1 \cup E_2$, then
\[ m^*_\alpha(E) = m^*_\alpha(E_1) + m^*_\alpha(E_2). \]
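Solution. If $E_1 \cap E_2 = \emptyset$ then $d(E_1, E_2) > 0$ because both are compact, so for $\delta < d(E_1, E_2)$, every $\delta$-cover of $E_1 \cup E_2$ (after discarding sets that meet neither $E_1$ nor $E_2$) is a disjoint union of a $\delta$-cover of $E_1$ and a $\delta$-cover of $E_2$. This implies
\[ H^\delta_\alpha(E) = H^\delta_\alpha(E_1) + H^\delta_\alpha(E_2), \]
and taking the limit as $\delta \to 0$ yields $m^*_\alpha(E) = m^*_\alpha(E_1) + m^*_\alpha(E_2)$.
Now suppose $E_1 \cap E_2 = \{z\}$. Fix $\epsilon > 0$ and let $F_1^\epsilon = E_1 \setminus B_\epsilon(z)$ and $F_2^\epsilon = E_2 \setminus B_\epsilon(z)$. These are disjoint compact sets, so $d(F_1^\epsilon, F_2^\epsilon) > 0$; take $\delta < \min(\epsilon, d(F_1^\epsilon, F_2^\epsilon))$. For any $\delta$-cover $\{F_j\}$ of $E$, let $\{A_i\}$ be the collection of those $F_j$ which intersect $F_1^\epsilon$, and $\{B_k\}$ the collection of those that intersect $F_2^\epsilon$. Note that these collections are disjoint because $d(F_1^\epsilon, F_2^\epsilon) > \delta$. Then $\{A_i\} \cup \{B_\epsilon(z)\}$ is a $2\epsilon$-cover for $E_1$, and $\{B_k\} \cup \{B_\epsilon(z)\}$ is a $2\epsilon$-cover for $E_2$. Taking the infimum over all $\delta$-covers of $E$, we get
\[ H^{2\epsilon}_\alpha(E_1) + H^{2\epsilon}_\alpha(E_2) \le H^\delta_\alpha(E) + 2(2\epsilon)^\alpha \le m^*_\alpha(E) + 2(2\epsilon)^\alpha. \]
Taking limits as $\epsilon \to 0$, we have $m^*_\alpha(E) \ge m^*_\alpha(E_1) + m^*_\alpha(E_2)$. The reverse inequality always holds, of course, because $m^*_\alpha$ is an outer measure.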
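Exercise 3: Prove that if $f : [0, 1] \to \mathbb{R}$ satisfies a Lipschitz condition of exponent $\gamma > 1$, then $f$ is a constant.
Solution. Suppose $|f(x) - f(y)| \le M|x - y|^\gamma$ with $\gamma > 1$. Let $0 \le x < y \le 1$ and let $h = y - x$. Then for any positive integer $n$,
\[
\begin{aligned}
|f(x) - f(y)| &= \left| \sum_{j=0}^{n-1} f\left(x + \frac{j+1}{n}h\right) - f\left(x + \frac{j}{n}h\right) \right| \\
&\le \sum_{j=0}^{n-1} \left| f\left(x + \frac{j+1}{n}h\right) - f\left(x + \frac{j}{n}h\right) \right| \\
&\le \sum_{j=0}^{n-1} M\left(\frac{h}{n}\right)^\gamma \\
&= M h^\gamma n^{1-\gamma}.
\end{aligned}
\]
This is true for all $n$, and the bound approaches 0 as $n \to \infty$, so $f(x) = f(y)$.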
Exercise 4: Suppose $f : [0, 1] \to [0, 1] \times [0, 1]$ is surjective and satisfies a Lipschitz condition
\[ |f(x) - f(y)| \le C|x - y|^\gamma. \]
Prove that $\gamma \le 1/2$ directly, without using Theorem 2.2.
Solution. Suppose $\gamma > \frac{1}{2}$. By constructing a lattice of spacing $\frac{1}{n}$, we can find $(n+1)^2$ points in $[0, 1]^2$ with the property that any two are at least $\frac{1}{n}$ apart. Call these points $y_k$. For each $y_k$, let $x_k \in [0, 1]$ be any point in the preimage. Then for any $k \ne j$,
\[ |x_j - x_k| \ge \left(\frac{|y_j - y_k|}{C}\right)^{1/\gamma} \ge \frac{1}{(Cn)^{1/\gamma}}. \]
But for $n$ sufficiently large, $(Cn)^{1/\gamma} < n^2$ (because $1/\gamma < 2$), so the points $x_k$ must all be farther than $\frac{1}{n^2}$ from each other. This is manifestly impossible: there are $(n+1)^2 > n^2$ points $x_k$ in $[0, 1]$, so by the Pigeonhole Principle some interval $\left[\frac{j}{n^2}, \frac{j+1}{n^2}\right]$ for $j = 0, \dots, n^2 - 1$ must contain two of the points, and these are at most $\frac{1}{n^2}$ apart.
Exercise 5: Let $f(x) = x^k$ be defined on $\mathbb{R}$, where $k$ is a positive integer, and let $E$ be a Borel subset of $\mathbb{R}$.
(a) Show that if $m^*_\alpha(E) = 0$ for some $\alpha$, then $m^*_\alpha(f(E)) = 0$.
(b) Prove that $\dim(E) = \dim f(E)$.
Solution.
(a) Since
\[ f(E) = f\left(\bigcup_{n=-\infty}^\infty E \cap [n, n+1]\right) = \bigcup_{n=-\infty}^\infty f(E \cap [n, n+1]), \]
it is sufficient to show $m^*_\alpha(f(E \cap [n, n+1])) = 0$. But $f$ is Lipschitz on $[n, n+1]$ because
\[ |f(x) - f(y)| = |x^k - y^k| = |x - y|\,|x^{k-1} + x^{k-2}y + \dots + xy^{k-2} + y^{k-1}| \]
and the second factor is continuous on the compact set $[n, n+1]^2$, hence bounded. By Lemma 2.2, this implies $m^*_\alpha(f(E \cap [n, n+1])) = 0$.
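(b) Let $\alpha > \dim E$. Then $m^*_\alpha(E) = 0 \Rightarrow m^*_\alpha(f(E)) = 0$, so $\alpha \ge \dim f(E)$. Hence $\dim f(E) \le \dim E$. To show the reverse, it suffices to show that $m^*_\alpha(f(E)) = 0$ implies $m^*_\alpha(E) = 0$, since then we can apply exactly the same logic with $E$ and $f(E)$ interchanged. Let
\[ g(x) = \begin{cases} (-x)^{1/k} & x < 0 \text{ and } k \text{ even} \\ x^{1/k} & \text{else.} \end{cases} \]
Note that $g(f(x)) = \pm x$, so $E \subset g(f(E)) \cup -g(f(E))$. Thus, it will suffice to prove that $g$ is $\sigma$-Lipschitz, that is, Lipschitz on each set of a countable decomposition of $\mathbb{R}$. Since
\[ \mathbb{R} = \bigcup_{n=1}^\infty [-n-1, -n] \cup \bigcup_{n=1}^\infty [n, n+1] \cup \bigcup_{n=1}^\infty \left[-\frac{1}{n}, -\frac{1}{n+1}\right] \cup \bigcup_{n=1}^\infty \left[\frac{1}{n+1}, \frac{1}{n}\right] \cup \{0\}, \]
and $g$ is Lipschitz on each of these compact sets (it is $C^1$ on all but the last), the result follows. (If we let $K_n$ denote all the sets in the above decomposition, then $g$ is Lipschitz on $K_n$, so $m^*_\alpha(g(f(E) \cap K_n)) = 0$ by Lemma 2.2, and by countable subadditivity $m^*_\alpha(g(f(E))) = 0$. Similarly $m^*_\alpha(-g(f(E))) = 0$, so $m^*_\alpha(E) = 0$.)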

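Exercise 6: Let $\{E_k\}$ be a sequence of Borel sets in $\mathbb{R}^d$. Show that if $\dim E_k \le \alpha$ for some $\alpha$ and all $k$, then
\[ \dim \bigcup_k E_k \le \alpha. \]
Solution. Suppose to the contrary that $\dim \bigcup E_k > \alpha$. Choose $\alpha'$ with $\alpha < \alpha' < \dim \bigcup E_k$. Then $m^*_{\alpha'}(\bigcup E_k) = \infty$ because $\alpha' < \dim \bigcup E_k$. But $m^*_{\alpha'}(E_k) = 0$ for each $k$ because $\alpha' > \dim E_k$, which implies $m^*_{\alpha'}(\bigcup E_k) \le \sum m^*_{\alpha'}(E_k) = 0$ by countable subadditivity. This is a contradiction, so $\dim \bigcup E_k \le \alpha$.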
