Bruce Driver-Undergraduate Analysis Tools

Bruce K.
Driver
Undergraduate Analysis Tools
April 1, 2013 File:unanal.tex
Contents
Part I Numbers
1 Natural, integer, and rational Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1 Limits in . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 The Problem with . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Peanos arithmetic (Highly Optional) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2 Fields. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1 Basic Properties of Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Ordered Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3 Real Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.1 Extended real numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2 Limsups and Liminfs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.3 Partitioning the Real Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.4 The Decimal Representation of a Real Number . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.5 Summary of Key Facts about Real Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.6 (Optional) Proofs of Theorem 3.6 and Theorem 3.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.7 Supremum and Inniums of sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4 Complex Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.1 A Matrix Perspective (Optional) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5 Set Operations, Functions, and Counting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.1 Set Operations and Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.1.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.2 Cardinality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.3 Finite Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.4 Countable and Uncountable Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4 Contents
5.4.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Part II Normed and Metric Spaces
6 Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.1 Normed Spaces [Linear Algebra Meets Analysis] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.1.1 Review of Vector Spaces and Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.1.2 Normed Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.2 Sequences in Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
6.3 General Limits and Continuity in Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
6.4 Density and Separability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
6.5 Test 2: Review Topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
7 Series and Sums in Banach Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
8 More Sums and Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
8.1 Rearrangements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
8.2 Double Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
8.3 Iterated Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
8.4 Double Sums and Continuity of Sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
8.5 Sums of positive functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
8.6 Sums of complex functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
8.7 Iterated sums and the Fubini and Tonelli Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
8.8
p
spaces, Minkowski and Holder Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
8.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
8.9.1 Limit Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
8.9.2 Monotone and Dominated Convergence Theorem Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
8.9.3
p
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
9 Topological Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
9.1 Closed and Open Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
9.2 Continuity Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
9.3 Metric spaces as topological spaces (Not required Reading!) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
9.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
9.5 Sequential Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
9.6 Open Cover Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
9.7 Equivalence of Sequential and Open Cover Compactness in Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
9.8 Connectedness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
9.8.1 Connectedness Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Part III Calculus
Page: 4 job: unanal macro: svmonob.cls date/time: 1-Apr-2013/8:22
Contents 5
10 Dierential Calculus in One Real Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
10.1 Basics of the Derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
10.2 First Derivative Test and the Mean Value Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
10.3 Limsup and Liminf revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
10.4 Dierentiating past innite sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
10.5 Taylors Theorem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
10.6 Inverse Function Theorem I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
10.7 Lhopitals Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
11 Simple Integration Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
11.1 Banach Spaces of Bounded Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
11.2 Algebras of Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
11.3 Finitely additive measure on /. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
11.4 Simple Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
11.5 Simple Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
11.6 Product Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
12 Extending the Integral by Uniform Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
12.1 The Fundamental Theorem of Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
12.2 Taylors Theorem revisited. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
12.3 Interchanging Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
12.4 Stirlings Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
12.4.1 A primitive Stirling type approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
12.4.2 Bone Yard. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Part IV A Little Functional Analysis
13 Weierstrass Approximation Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
13.1 Classical Weierstrass Approximation Theorem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
13.2 Stone-Weierstrass Theorem for Compact X . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
13.3 Stone-Weierstrass Theorem for Locally Compact X . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
14 Inner Product Spaces and Orthonormal Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
15 Fourier Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
15.1 Fourier Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
15.2 Fourier Series Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
15.3 Dirichlet, Fejer and Kernels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
15.4 The Dirichlet Problems on D and the Poisson Kernel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
16 Arzela-Ascoli Theorem (Compactness in C (X)) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
16.0.1 Arzela-Ascoli Theorem Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
6 Contents
Part V Appendices
A Appendix: Notation and Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
B Appendix: More Set Theoretic Properties (highly optional) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
B.1 Appendix: Zorns Lemma and the Hausdor Maximal Principle (optional) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
B.2 Cardinality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
B.3 Formalities of Counting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
C Appendix: Aspects of General Topological Spaces. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
C.1 Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
C.2 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
C.2.1 General Topological Space Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
C.2.2 Metric Spaces as Topological Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
C.2.3 Compactness Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
C.3 Connectedness in General Topological Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
C.4 Completions of Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
C.5 Supplementary Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
C.5.1 Riemannian Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
D Appendix: Limsup and liminf of functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
E Appendix: Countably Additive Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
E.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
E.2 Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
E.2.1 A Density Result* . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
E.3 Construction of Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
E.4 Radon Measures on 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
E.4.1 Lebesgue Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
F Appendix: Math 140A Topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
F.1 Summary of Key Facts about Real Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
F.2 Test 2 Review Topics: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
F.3 After Test 2 Review Topics: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
Part I
Numbers
1
Natural, integer, and rational Numbers
Notation 1.1 Let N =1, 2, . . . denote the natural numbers, N
0
= 0N,
Z := 0, 1, 2, . . . = n : n N
0
be the integers, and

:=
_
m
n
: m Z and n N
_
be the rationale numbers.
I am going to assume that the reader is familiar with all the standard arith-
metic operations (addition, multiplication, inverses, etc.) on N
0
, Z, and . How-
ever let us review the important induction axiom of the natural numbers.
Induction Axiom If S N is a subset such that 1 S and n + 1 S
whenever n S, then S = N.
This axiom takes on two other useful forms which we describe in the next
Propositions.
Proposition 1.2 (Strong form of Induction). Suppose S N is a subset
such that 1 S and n + 1 S whenever 1, 2, . . . , n S, then S = N.
Proof. Let T := n N : 1, 2, . . . , n S . Then 1 T and if n T then
n + 1 T by assumption. Therefore by the induction axiom, T = N so that
1, 2, . . . , n S in for all n N. This suces to show S = N.
Proposition 1.3 (Well ordering principle). Suppose S N is a non-empty
subset, then there exists a smallest element m of S.
Proof. Let S be a subset of N for which there is no smallest element, m S.
Let
T = n N : n < s for all s S .
If 1 / T, then 1 S and 1 would be a smallest element of S. Hence we must
have 1 T. Now suppose that n T so that n < s for all s S. If n + 1 / T
then there exists s S such that n < s n+1 which would force s = n+1 S.
But we would then have n + 1 is the minimal element of S which is assumed
not to exist. So we have shown if n T then n + 1 T. So by the induction
axiom of N it follows that T = N and therefore n / S for all n N, i.e. S = .
Remark 1.4. Let us further observe that the well ordering principle implies the
induction axiom. Indeed, suppose that S N is a subset such that 1 S and
n + 1 S whenever n S. For sake of contradiction suppose that S ,= N so
that T := N S is not empty. By the well ordering principle there T has a
unique minimal element m and in particular T m, m+ 1, . . . . This implies
that 1, 2, . . . , m1 S and then by assumption that 1, 2, . . . , m S. But
this then implies T m+ 1, . . . and therefore m / T which violates m being
the minimal element of T. We have arrived at the desired contradiction and
therefore conclude that S = N.
Remark 1.5. Recall that, for q , we dene
1
[q[ =
_
q if q 0
q if q 0.
Recall that, for all a, b ,
[a +b[ [a[ +[b[ , [ab[ = [a[ [b[ , and
1
a
=
1
[a[
when a ,= 0.
It is also often useful to keep in mind that the following statements are equiv-
alent for a, b with b 0;
1. [a[ b,
2. b a b, and
3. a b, i.e. a b and a b.
Lemma 1.6. If a, b , then
[[b[ [a[[ [b a[ . (1.1)
Proof. Since both sides of Eq. (1.1) are symmetric in a and b, we may
assume that [b[ [a[ so that [[b[ [a[[ = [b[ [a[ . Since
[b[ = [b a +a[ [b a[ +[a[ ,
it follows that
[[b[ [a[[ = [b[ [a[ [b a[ .
4 1 Natural, integer, and rational Numbers
The proof of the previous lemma illustrates one of the key techniques of
adding 0 to an expression. In this case we added 0 in the form of a + a
to b. The next remark records a couple of other very important tricks in this
subject. Taking to heart the following remarks will greatly aid the student in
real analysis.
Remark 1.7 (Some basic philosophies of real analysis). Let a, b, be num-
bers (i.e. in or later real numbers). We will often prove;
1. a b by showing that a b + for all > 0. (See the next theorem.)
2. a = b by proving a b and b a or
3. prove a = b by showing [b a[ for all > 0.
Theorem 1.8. The rationale
2
numbers have the following properties;
1. For any p there exists N N such that p < N.
2. For any with > 0 there exists an N N such that 0 <
1
N
< .
3. If a, b and a b + for all > 0, then a b.
Proof. 1. If p 0 we may take N = 1. So suppose that p =
m
n
with
m, n N. In this case let N = m.
2. Write =
m
n
with m, n N and then take N = 2n.
3. If a b is false happens i a > b which is equivalent to a b > 0. If we
now let :=
ab
2
> 0, then
a = b + (b a) > b +
which would violate the assumption that a b + for all > 0.
1
Absolute values will be discussed in more generality in Section 2.2 below.
2
We will see that the real numbers have these same properties as well.
1.1 Limits in
In this course we will often use the abbreviations, i.o. and a.a. which stand for
innitely often and almost always respectively. For example a
n
b
n
a.a. n
means there exists an N N such that a
n
b
n
for all n N while a
n
b
n
i.o.
n means for all N N there exists a n N such that a
n
b
n
. So for example,
1/n 1/100 for a.a. n while and (1)
n
0 i.o. By the way, it should be clear
that if something happens for a.a. n then it also happens i.o. n.
Denition 1.9. A sequence a
n
n=1
converges to 0 if for all > 0
in there exists N N such that [a
n
[ for all n N. Alternatively put, for
all > 0 we have [a
n
[ for a.a. n. This may also be stated as for all M N,
[a
n
[
1
M
for a.a. n.
n
n=1
converges to a if [a a
n
[ 0
as n , i.e. if for all N N, [a a
n
[
1
N
for a.a. n. As usual if a
n
n=1
converges to a we will write a
n
a as n or a = lim
n
a
n
.
Proposition 1.11. If a
n
n=1
converges to a , then lim
n
[a
n
[ =
[a[ .
Proof. From Lemma 1.6 we have,
[[a[ [a
n
[[ [a a
n
[ .
Thus if > 0 is given, by denition of a
n
a there exists N N such that
[a a
n
[ < for all n N. From the previously displayed equation, it follows
that [[a[ [a
n
[[ < for all n N and hence we may conclude that lim
n
[a
n
[
exists and is equal to [a[ .
Lemma 1.12 (Convergent sequences are bounded). If a
n
n=1
con-
verges to a , then there exists M such that [a
n
[ M for all n N.
Proof. Taking = 1 in the denition of a = lim
n
a
n
implies there exists
N N such that [a
n
a[ 1 for all n N. Therefore,
[a
n
[ = [a
n
a +a[ [a
n
a[ +[a[ 1 +[a[ for n N.
We may now take M := max
__
[a
n
[
N
n=1
_
1 +[a[
_
.
Theorem 1.13. If a
n
n=1
converges to a 0 , then
lim
n
1
a
n
=
1
a
. (1.2)
It is possible that a
n
= 0 for small n so that
1
an
is not dened but for large n
this can not happen and therefore it makes sense to talk about the limit which
only depends on the tail of the sequences.
1.2 The Problem with 5
Proof. Since a ,= 0 we know that [a[ > 0. Hence, there exists M := M
]a]
N
such that [a
n
a[ <
]a]
2
for all n M. Therefore for n M
[a[ = [a a
n
+a
n
[ [a a
n
[ +[a
n
[ <
[a[
2
+[a
n
[
from which it follows
3
that [a
n
[ >
]a]
2
for all n M. If > 0 is given arbitrarily,
we may choose N M such that [a a
n
[ < for all n M. Then for n N
we have,
1
a
n
1
a
a a
n
a
n
a
=
[a a
n
[
[a
n
[ [a[
<

]a]
2
[a[
=
2
[a[
2
.
As > 0 is arbitrary it follows that
2
]a]
2
> 0 is arbitrarily small as well (replace
by [a[
2
/2 if you feel it is necessary), and hence we may conclude that Eq.
(1.2) holds.
Variation on the method. In order to make these arguments more rout-
ing, it is often a good idea to write a
n
= a +
n
where
n
:= a
n
a is the error
between a
n
and a. By assumption, lim
n
n
= 0 and so for any > 0 given
there exists N () N such that [
n
[ . With this notation we have,
1
a
n
1
a
1
a +
n
1
a
n
a +
n
1
a

[a[ ([[a[ [)
.
So if we assume that [a[ /2 we nd that
1
a
n
1
a
2
[a[
2
for all n N () . (1.3)
Taking = () = min
_
[a[ /2, [a[
2
/2
_
in Eq. (1.3) shows for n N ( ())
that
1
an
1
a
. Since > 0 is arbitrary we may conclude that lim

n
1
an
=
1
a
.
End of Lecture 1, 9/28/2012
n
n=1
is Cauchy if [a
n
a
m
[ 0 as
m, n . More precisely we require for each > 0 in that [a
m
a
n
[
for a.a. pairs (m, n) , i.e. there should exists N N such that [a
m
a
n
[ for
all m, n N.
3
The idea is very simple here. If an is near a and a = 0 then an must stay away
from zero. You should draw the picture to go along with the proof.
Exercise 1.1. Show that all convergent sequences a
n
n=1
are Cauchy.
Exercise 1.2. Show all Cauchy sequences a
n
n=1
are bounded i.e. there
exists M N such that
[a
n
[ M for all n N.
Exercise 1.3. Suppose a
n
n=1
and b
n
n=1
are Cauchy sequences in . Show
a
n
+b
n
n=1
and a
n
b
n
n=1
are Cauchy.
Exercise 1.4. Assume that a
n
n=1
and b
n
n=1
are convergent sequences in
. Show a
n
+b
n
n=1
and a
n
b
n
n=1
are convergent in and
lim
n
(a
n
+b
n
) = lim
n
a
n
+ lim
n
b
n
and
lim
n
(a
n
b
n
) = lim
n
a
n
lim
n
b
n
.
Exercise 1.5. Assume that a
n
n=1
and b
n
n=1
are convergent sequences in
such that a
n
b
n
for all n. Show A := lim
n
a
n
lim
n
b
n
=: B.
Exercise 1.6 (Sandwich Theorem). Assume that a
n
n=1
and b
n
n=1
are
convergent sequences in such that lim
n
a
n
= lim
n
b
n
. If x
n
n=1
is
another sequence in which satises a
n
x
n
b
n
for all n, then
lim
n
x
n
= a := lim
n
a
n
= lim
n
b
n
.
Please note that that main part of the problem is to show that lim
n
x
n
exists in . Hint: start by showing; if a x b then [x[ max ([a[ , [b[) .
Denition 1.15 (Subsequence). We say a sequence, y
k
k=1
is a subse-
quence of another sequence, x
n
n=1
, provided there exists a strictly increas-
ing function, N k n
k
N such that y
k
= x
nk
for all k N. [Example,
n
k
= k
2
+ 3, and y
k
:= x
k
2
+3
k=1
would be a subsequence of x
n
n=1
.]
Exercise 1.7. Suppose that x
n
n=1
is a Cauchy sequence in (or 1) which
has a convergent subsequence, y
k
= x
nk
k=1
in (or 1). Show that lim
n
x
n
exists and is equal to lim
k
y
k
.
1.2 The Problem with
The problem with is that it is full of holes. To be more precise, is not
complete, i.e. not all Cauchy sequences are convergent. In fact, according
to Corollary 5.31 below, most Cauchy sequences of rational numbers do not
converge to a rational number. Let us demonstrate some examples pointing out
this aw. We rst pause to recall how to sum geometric series.
Lemma 1.16 (Geometric Series). Let , m, n Z with n m, and
S :=
m
k=n
k
. Then
S =
_
mn + 1 if = 1
m+1
n
1
if ,= 1
.
Moreover if 0 < 1, then
m
k=n
k
=
n
1
mn+1
1

n
1
. (1.4)
Proof. When = 1,
S =
m
k=n
1
k
= mn + 1.
If ,= 1, then
S S =
m+1
n
.
Solving for S gives
S =
m
k=n
k
=

m+1
n
1
if ,= 1. (1.5)
Example 1.17. Let S
n
:=
n
k=0
1
k!
for all n N. For n > m in N we have,
0 S
n
S
m
=
n
k=m+1
1
k!
=
nm
j=1
1
(m+j)!
=
1
(m+ 1)!
+ +
1
n!
1
m!
_
1
m+ 1
+
_
1
m+ 1
_
2
+ +
_
1
m+ 1
_
nm
_
1
m!
1
m+ 1
1
1
1
m+1
=
1
m m!
, (1.6)
wherein we have used Eq. (1.4) for the last inequality. From this inequality it
follows that S
n
n=0
is a Cauchy sequence and we also have,
1
(m+ 1)!
S
n
S
m

1
m m!
for all n > m. (1.7)
Suppose that e := lim
n
S
n
were to exist in . Then letting n in Eq.
(1.7) would show,
0 <
1
(m+ 1)!
e S
m

1
m m!
.
Multiplying this inequality by m! then implies,
0 < m!e m!S
m

1
m
.
However for m suciently large m!e N (as e is assumed to be rational) and
m!S
m
is always in N and therefore k := m!e m!S
m
N. But there is no
element k N such that 0 < k <
1
m
and hence we must conclude lim
n
S
n
can not exist in . Moral: the number e =
n=0
1
n!
= lim
n
_
1 +
1
n
_
n
that
you learned about in calculus is not in !
Plotting the partial sums
n
k=0
1
k!
(black curve) and
_
1 +
1
n
_
n
(red curve)
which are both converging to e.
Example 1.18 (Square roots need not exist). The square root,
2, of 2 does not
exist in . Indeed, if
2 =
m
n
where m and n have no common factors (in
particular no common factors of 2 so that either m or n is odd), then
m
2
n
2
= 2 = m
2
= 2n
2
.
This shows that m
2
is even which would then imply that m = 2k is even (since
oddodd=odd). However this implies 4k
2
= 2n
2
from which it follows that
n
2
= 2k
2
is even and hence n is even. But this contradicts the assumption that
m and n had no common factors (of 2).
1.3 Peanos arithmetic (Highly Optional) 7
Exercise 1.8. Use the following outline to construct another Cauchy sequence
q
n
n=1
which is not convergent in .
1. Recall that there is no element q such that q
2
= 2. To each n N let
m
n
N be chosen so that
m
2
n
n
2
< 2 <
(m
n
+ 1)
2
n
2
(1.8)
and let q
n
:=
mn
n
.
2. Verify that q
2
n
2 as n and that q
n
n=1
is a Cauchy sequence in .
3. Show q
n
n=1
does not have a limit in .
Example 1.19. It is also a fact that / where
= 2
_

0
1
1 +x
2
dx = 2 lim
N
_
N
0
1
1 +x
2
dx
= 2 lim
N
N
2
k=1
1
1 +
_
k
N
_
2

1
N
= lim
N
N
2
k=1
2N
N
2
+k
2
.
The point is that the basic operations from calculus tend to produce real
numbers which are not rational even though we start with only rational num-
bers.
1.3 Peanos arithmetic (Highly Optional)
This section is for those who want to understand N at a more fundamental
level. Here we start with Peanos rather minimalist axioms for N and show how
they lead to all the standard properties you are used to using for N. Here are
the axioms;
non-empty N
0
is a non-empty set which contains a distinguished element, 0.
We let N := N
0
0 and call these the natural numbers.
Successor Function There is an injective
4
function, s : N
0
N and we let
1 := s (0) N.
4
Injective is the same as one to one. Thus we are assuming that if s (n) = s (m) then
n = m.
Induction hypothesis If S N
0
is a set such that 0 S and s (n) S whenever
n S, then S = N
0
.
Assuming these axioms one may develop all of the properties or N
0
that
you are accustomed to seeing. I will develop the basic properties of addition,
multiplication, and the ordering on N
0
in this section. For more on this point
and then the further construction of Z and from N
0
, the reader is referred
to the notes; Numbers by M. Taylor. You may also consult E. Landaus
book [?] for a very detailed (but perhaps too long winded) exposition of these
topics.
Lemma 1.20. The map s : N
0
N is a bijection.
Proof. Let S := s (N
0
) 0 N
0
. Then 0 S and s (0) s (N
0
) S.
Moreover if x N S then s (x) s (N
0
) S so that x S = s (x) S
and hence S = N
0
and therefore s (N
0
) = N.
Theorem 1.21 (Addition). There exists a function p : N
0
N
0
N
0
such
that p (x, 0) = x for all x N
0
and p (x, s (y)) = s (p (x, y)) for all x, y N
0
.
Moreover, we may construct p so that p (s (x) , y) = p (x, s (y)) for all x, y N
0
.
This function p satises the following properties;
1. p (x, 0) = x = p (0, x) for all x N
0
,
2. p (x, 1) = p (1, x) = s (x) for all x N
0
,
3. p (x, y) = p (y, x) for all x, y N
0
,
4. p (x, p (y, z)) = p (p (x, y) , z) for all x, y, z N
0
.
Proof. We will construct p inductively. Let
S := x N : p
x
: N
0
N
0
p
x
(0) = x and p
x
(s (y)) = s (p
x
(y)) y N
0
.
Taking p
0
(y) = y shows 0 S. Moreover if x S we dene
p
s(x)
(y) := s (p
x
(y)) for all y N
0
.
We then have p
s(x)
(0) = s (p
x
(0)) = s (x) and
p
s(x)
(s (y)) := s (p
x
(s (y))) = s s (p
x
(y)) = s
_
p
s(x)
(y)
_
which shows s (x) S. Thus we may conclude S = N
0
and we may now dene
p (x, y) := p
x
(y) for all x, y N
0
. By construction this function satises,
p (s (x) , y) = s (p (x, y)) = p (x, s (y)) .
We now verify the properties in items 1. 4.
1. By construction p (x, 0) = x for all x N
0
. Let S = x N : p (0, x) = x ,
then 0 S and if x S we have p (0, s (x)) = s (p (0, x)) = s (x) so that
s (x) S. Therefore S = N
0
and the rst item holds.
2. p (x, 1) = p (x, s (0)) = s (p (x, 0)) = s (x) and p (1, x) = p (s (0) , x) =
s (p (0, x)) = s (x) so that item 2. is proved.
3. Let S = x N
0
: p (x, ) = p (, x) . Then by items 1 and 2. it follows that
0, 1 S. Moreover if x S, then for all y N
0
we nd,
p (s (x) , y) = s (p (x, y)) = s (p (y, x)) = p (y, s (x))
which shows s (x) S. Therefore S = N
0
and item 3. is proved.
4. Let
S := x N
0
: p (x, p (y, z)) = p (p (x, y) , z) y, z N
0
.
Then 0 S and if x S we nd,
p (s (x) , p (y, z)) = s (p (x, p (y, z))) = s (p (p (x, y) , z))
= p (s (p (x, y)) , z) = p (p (s (x) , y) , z)
which shows that s (x) S and therefore S = N
0
Notation 1.22 We now write x + y for p (x, y) and refer to the symmetric
binary operator, +, as addition.
To summarize we have now shown addition satises for all x, y, z N
0
;
1. x + 0 = 0 +x = x,
2. s (x) = x + 1 = 1 +x,
3. x +y = y +x,
4. (x +y) +z = x + (y +z) .
5. The induction hypothesis may now be written as; if S N
0
is a subset such
that 0 S and n + 1 S whenever n S, then S = N
0
.
Proposition 1.23 (Additive Cancellation). If x, y, z N
0
and x+z = y+z,
then x = y.
Proof. Let S be those z N
0
for which the statement x+z = y +z implies
x = y holds. It is clear that 0 S. Moreover if z S and x + (z + 1) =
y + (z + 1) then (x + 1) + z = (y + 1) + z and so by the inductive hypothesis
s (x) = x + 1 = y + 1 = s (y) . Recall that s is one to one by assumption
and therefore we may conclude x = y and we have shown s (z) S. Therefore
S = N
0
and the proposition is proved.
Denition 1.24. Given x, y N
0
, we say x < y i y = x +n for some n N
and x y i y = x +n for some n N
0
. We further let
R
x
:= x +n : n N
0
so that y x i y R
x
.
Proposition 1.25. If x, y N
0
and x y and y x then x = y. Moreover if
x y then either x < y or x = y.
Proof. By assumption there exists m, n N
0
such that y = x + m and
x = y + n and therefore y = y + (m+n) . Hence by cancellation it follows
that m + n = 0. If n ,= 0 then n = s (x) for some x N
0
and we have
m + n = m + s (x) = s (m+x) N which would imply m + n ,= 0. Thus we
conclude that m = 0 = n and therefore x = y.
If x y and x ,= y then y = x +n for some n N
0
with n ,= 0, i.e. x < y.
Proposition 1.26. If x, y N
0
then precisely one of the following three choices
must hold, 1) x < y, 2) x = y, 3 y < x.
Proof. Suppose that x y does not hold, i.e. y / R
x
. We wish to show that
y < x, i.e. that x = y +n for some n N. We do this by induction on y. That
is let S be the the set of y N
0
such that the statement y / R
x
implies y < x
holds. If y = 0 / R
x
implies n := x ,= 0 so that x = y +n, i.e. y = 0 < x. This
shows 0 S. Now suppose that y S and that y+1 / R
x
= x +m : m N
0
.
It follows that y + 1 ,= x + m + 1 for all m N
0
and hence that y ,= x + m
for all m N
0
, i.e. y / R
x
. So by induction y < x and therefore x = y + k for
some k N. Since k N we know there exists k
t
N
0
such that k = k
t
and it
follows that x = y + 1 + k
t
, i.e. y + 1 x. Since y + 1 / R
x
we may conclude
that in fact y + 1 < x and therefore y + 1 S. So by induction S = N
0
and
we have shown if x < y does not hold i y x. Combining this statement with
the Proposition 1.25 completes the proof.
We have now set up a satisfactory addition operations and ordering on N
0
.
Our next goal is to dene multiplication on N
0
.
Theorem 1.27. There exists a function M : N
0
N
0
N
0
such that M (x, 0) =
0 for all x N
0
and M (x, y + 1) = M (x, y) +x for all x, y N
0
. This function
M satises the following properties;
1. M (x, 0) = 0 = M (0, x) for all x N
0
,
2. M (x, 1) = M (1, x) = x for all x N
0
,
3. M (x, y) = M (y, x) for all x, y N
0
,
4. M (x, y +z) = M (x, y) +M (x, z) for all x, y, z N
0
.
5. M (x, M (y, z)) = M (M (x, y) , z) for all x, y, z N
0
.
1.3 Peanos arithmetic (Highly Optional) 9
Proof. Let S denote those x N
0
such that there exists a function M
x
:
N
0
N
0
satisfying M
x
(0) = 0 and M
x
(y + 1) = M
x
(y) + x for all y N
0
.
Taking M
0
(y) := 0 shows 0 S. Moreover if x S we dene M
x+1
(y) :=
M
x
(y) +y. Then M
x+1
(0) = 0 and
M
x+1
(y + 1) = M
x
(y + 1) +y + 1 = M
x
(y) +x +y + 1
while
M
x+1
(y) + (x + 1) = M
x
(y) +y +x + 1 = M
x+1
(y + 1) .
This shows that x +1 S and so by induction S = N
0
and we may now dene
M (x, y) := M
x
(y) for all x, y N
0
. We now prove the properties of M stated
above.
1. By construction M (x, 0) = 0 for all x. Let S := x N
0
: M (0, x) = 0 .
Then 0 S and if x S we have
M (0, x + 1) = M (0, x) + 0 = 0 + 0 = 0
which shows x + 1 S. Therefore by induction S = N
0
and M (0, x) = 0
for all x N
0
.
2. M (x, 1) = M (x, 0 + 1) = M (x, 0) + x = 0 + x = x for all x N
0
. Let
S := x N
0
: M (1, x) = x . Then 0 S and if x S we have
M (1, x + 1) = M (1, x) + 1 = x + 1
which shows x + 1 S. Therefore S = N
0
and M (1, x) = x for all x N
0
.
3. Let S := x N
0
: M (x, ) = M (, x) . Then by items 1. and 2. we know
that 0, 1 S. Now suppose that x S, then by construction,
M (x + 1, y) = M
x+1
(y) = M (x, y) +y
while
M (y, x + 1) = M (y, x) +y.
The last two displayed equations along with the induction hypothesis shows
x + 1 S and therefore S = N
0
4. Let S denotes those x S such that M (x, y +z) = M (x, y) +M (x, z) for
all y, z N
0
. Then 0, 1 S and if x S we have,
M (x + 1, y +z) = M (x, y +z) +y +z
= M (x, y) +M (x, z) +y +z
= M (x, y) +y +M (x, z) +z
= M (x + 1, y) +M (x + 1, z)
which shows x + 1 S. Therefore S = N
0
and we have proved item 4.
5. Let
S := x N
0
: M (x, M (y, z)) = M (M (x, y) , z) y, z N
0
.
Then 0 S and if x S we nd,
M (x + 1, M (y, z)) = M (x, M (y, z)) +M (y, z)
while
M (M (x + 1, y) , z) = M (M (x, y) +y, z) = M (M (x, y) , z) +M (y, z) .
The last two equations along with the induction hypothesis shows x+1 S
and therefore S = N
0
Notation 1.28 We now write x y for M (x, y) and refer to the symmetric
binary operator, , as multiplication..
To summarize Theorem 1.27 we have shown multiplication satises for all
x, y, z N
0
;
1. x 0 = 0 = 0 x,
2. x 1 = x = 1 x,
3. x y = y x,
4. x (y +z) = x y +x z,
5. (x y) z = x (y z) .
Proposition 1.29 (Multiplicative Cancellation). If x, y N
0
and z N
such that x z = y z, then x = y.
Proof. If x ,= y, say x < y, then y = x +n for some n N and therefore
y z = (x +n) z = x z +n z.
Hence if x z = y z, then by additive cancellation we must have n z = 0. As
n, x N we may write n = n/ +1 and z = z
t
+1 with n
t
, z
t
N
0
and therefore,
0 = n z = (n
t
+ 1) (z
t
+ 1) = n
t
z
t
+n
t
+z
t
+ 1 ,= 0
which is a contradiction.
Remark 1.30 (Base 10 counting). The typical method of counting is to use base
10 enumeration of N
0
. The rules are;
0 = 0, 1 = 1, 2 := 1 + 1, 3 := 2 + 1, 4 := 3 + 1, 5 := 4 + 1
6 := 5 + 1, 7 := 6 + 1, 8 := 7 + 1, 9 := 8 + 1, and 10 := 9 + 1.
Once these element of N
0
have been dened, then given a
0
, . . . , a
n
0, 1, . . . , 9
with a
n
,= 0, we let
a
n
a
n1
. . . a
0
:=
n
k=0
a
k
10
k
.
For example, 35 = 3 10 + 5 = 34 + 1, etc.
As mentioned above one can formalize Z and using N
0
constructed above.
I will omit the details here and refer the reader to the references already men-
tioned.
2
Fields
The basic question we want to eventually address is; What are the real
numbers? Our answer is going to be; the real numbers is the essentially unique
complete ordered eld, see Theorem 3.3 below. In order to make sense of this
answer we need to explain the terms, complete, ordered, and eld. We
will start with the notion of a eld which loosely stated means something that
can reasonably be interpreted a numbers.
Denition 2.1 (Fields, i.e. numbers). A eld is a non-empty set F
equipped with two operations called addition and multiplication, and denoted
by + and , respectively, such that the following axioms hold;( subtraction and
division are dened implicitly in terms of the inverse operations of addition and
multiplication:)
1. Closure of F under addition and multiplication. For all a, b in F, both a+b
and a b are in F (or more formally, + and are binary operations on F).
2. Associativity of addition and multiplication. For all a, b, and c in F,
the following equalities hold: a+(b+c) = (a+b) +c and a (b c) = (a b) c.
3. Commutativity of addition and multiplication. For all a and b in F,
the following equalities hold: a +b = b +a and a b = b a.
4. Additive and multiplicative identity. There exists an element of F,
called the additive identity element and denoted by 0 = 0
F
, such that for all
a in F, a + 0 = a. Likewise, there is an element, called the multiplicative
identity element and denoted by 1 = 1
F
, such that for all a in F, a 1 = a.
It is assumed that 0
F
,= 1
F
.
5. Additive and multiplicative inverses. For every a in F, there exists an
element a in F, such that a+(a) = 0. Similarly, for any a F other than
0, there exists an element a
1
in F, such that a a
1
= 1. (The elements
a + (b) and a b
1
are also denoted a b and a/b, respectively.) In other
words, subtraction and division operations exist.
6. Distributivity of multiplication over addition. For all a, b and c in F,
the following equality holds: a (b +c) = (a b) + (a c).
(Note that all but the last axiom are exactly the axioms for a commuta-
tive group, while the last axiom is a compatibility condition between the two
operations.)
2.1 Basic Properties of Fields
Here are some sample properties about elds. For more information about Fields
see 5-8 of Rudin.
Lemma 2.2. Let F be a eld, then;
1. There is only one additive and multiplicative inverses.
2. If x, y, z F with x ,= 0 and xy = xz then y = z.
3. 0 x = 0 for all x F.
4. If x, y F such that xy = 0 then x = 0 or y = 0.
5. (x) y = (xy) .
6. (x) = x for all x F.
7. (x) (y) = xy or all x, y F.
Proof. We take each item in turn.
1. Suppose that x + y = 0 = x + y
t
, then adding x to both sides of this
equation shows y = y
t
. Taking y = x then shows y = x = y
t
, i.e.
additive inverses are unique. Similarly if x ,= 0 and xy = 1 then multiplying
this equation by x
1
shows y = x
1
and so there is only one multiplicative
inverse.
2. If xy = xz then multiplying this equation by x
1
shows y = z.
3.
0 x +x = 0 x + 1 x = (0 + 1) x = 1 x = x.
Adding x to both side of this equation using associativity and commuta-
tivity of addition then implies 0 x = 0.
4. If x F 0 and y F such that xy = 0, then
0 = x
1
0 = x
1
(xy) =
_
x
1
x
_
y = 1y = y.
5. (x) y +xy = (x +x) y = 0 y = 0 = (x) y = (xy) .
6. Since (x) +x = 0 we have (x) = x.
7. (x) (y) = (x (y)) = ((xy)) = xy by 6.
Example 2.3. Here are a few examples of Fields;
12 2 Fields
1. F
2
= 0, 1 with 0 +0 = 0 = 1 +1, and 0 +1 = 1 +0 = 0 and 0 1 = 1 0 =
0 0 = 0 and 1 1 = 1. In this case 1 = 1, 1
1
= 1 and 0 = 0.
2. the rational numbers with the usual addition and multiplication of
fractions.
_
m
n
_
1
=
n
m
if m ,= 0 and
m
n
=
m
n
.
3. F = (t) where
(t) =
_
p (t)
q (t)
: p (t) and q (t) are polynomials over q (t) ,= 0
_
.
Again the multiplication and addition are as usual.
Example 2.4. Z is not a eld. For example, 2 has no multiplicative inverse in Z.
The inverse to 2, 2
1
, should be
1
2
but this is not in Z.
Denition 2.5. We say a map : Z F is a (ring) homomorphism i (1) =
1
F
, (0) = 0
F
, and for all x, y Z;
(x +y) = (x) +(y) and (xy) = (x) (y) .
[The assumption that (0) = 0
F
is redundant since (0) = (0 + 0) = (0) +
(0) and therefore (0) = 0
F
.]
Lemma 2.6. For every eld F there a unique (ring) homomorphism, : Z F.
In fact, (n) = n1
F
for all n Z where 0 1
F
= 0
F
,
n1
F
:=
n times
..
1
F
+ + 1
F
if n N and
(n) 1
F
:= (n1
F
) if n N.
[The map need not be injective as is seen by taking F = F
2
.]
Proof. Let us rst work on N
0
Z. We must dene (0) = 0 and (1) = 1
and then inductively by (n + 1) = (n) +(1) = (n) + 1
F
so that
(n) =
n times
..
1
F
+ + 1
F
.
We now write n1
F
for (n) with the convention that 01
F
= 0
F
. For n N we
must set (n) = (n) = (n1
F
) . Thus we have (n) = n1
F
for all n Z.
We now must show is a homomorphism.
Additive homomorphism: First suppose that m, n N
0
and let
S := m N
0
: (m+n) = (m) +(n) for all n N
0
.
One easily sees that 0 S and that 1 S by construction. Moreover if m S,
then
((m+ 1) +n) = (m+n + 1) = (m) +(n + 1)
= (m) +(n) + 1
F
= (m) + 1
F
+(n) = (m+ 1) +(n)
which shows m + 1 S. Therefore by induction, S = N
0
and (m+n) =
(m) +(n) for all m, n N
0
.
If m N
0
we have (m) = (m) by construction. If n > m N
0
, then
(n + (m)) +(m) = (n m) +(m) = (n)
so that
(n + (m)) = (n) + ((m)) = (n) +(m) .
If n < m N
0
, then
(n + (m)) = (mn) = [(m) (n)] = (n) +(m)
and if m, n N
0
, then
(n + (m)) = ((n +m)) = (n +m)
= [(n) +(m)] = (n) (m)
= (n) +(m) .
Putting all of this together shows is an additive homomorphism.
Multiplicative homomorphism: First suppose that m, n N
0
and let
S := m N
0
: (mn) = (m) (n) for all n N
0
.
It is easily seen that 0, 1 S. Moreover if m S and n N
0
, then
((m+ 1) n) = (mn +n) = (mn) +(n)
= (m) (n) +(n) = ((m) + 1
F
) (n)
= (m+ 1) (n) ,
which shows m + 1 S. Therefore by induction, S = N
0
and (mn) =
(m) (n) for all m, n N
0
.
If m, n N
0
, then
((m) n) = (mn) = (mn) = [(m) (n)] = [(m)] (n) = (m) (n)
and
((m) (n)) = (mn) = (mn) = ((m)) ((n)) = (m) (n)
which completes the verication that is a multiplicative homomorphism.
2.2 Ordered Fields 13
2.2 Ordered Fields
Denition 2.7 (Ordered Field). We say F is an ordered eld if there ex-
ists, P F, called the positive elements, such that
Ord 1. F is the disjoint union of P, 0 , and P, i.e. if x F then precisely
one of following happens; x P, x = 0, or x P.
Ord 2. P +P P and P P P.
Lemma 2.8. Let (F, P) be an ordered eld, then;
1. For all x F 0 , x
2
P. In particular 1 = 1
2
P.
2. If x P and y P then xy P.
3. If x P then x
1
P.
Proof. If x P then x
2
P P P while if x P then x P and
x
2
= (x)
2
P. For item 3. we have x x
1
= 1
Example 2.9. The eld F =0, 1 is not ordered. The only possible choice for P
is P = 1 which does not work since 1 + 1 = 0 / P.
Example 2.10. Take F = and P =
_
m
n
: m, n > 0
_
. This is in fact the unique
choice we can make for P in this case. Indeed suppose that P is any order on
. By Lemma 2.8, we know 1 P and then by induction it follows that N P.
Then again by Lemma 2.8 we must have m n
1
P for all m, n .
Example 2.11. Take F = (t) and
P =
_
p (t)
q (t)
F :
p (t)
q (t)
> 0 for t > 0 large
_
,
i.e.
p(t)
q(t)
P i the highest order coecients of p (t) and q (t) have the same
sign. For example
t
2
25t+7
t
4
10
26
P while
t
2
+25t+7
t
4
10
26
P.
Notice that t > n for all n N and
1
t
<
1
n
for all n N. This is kind
of strange and explains why you have to prove the obvious in this course!!!
Moral: obvious statements are often false.
Notation 2.12 (Max and Min) We will often use the following notation in
the sequel. If a, b are elements of an ordered eld, let
a b := min (a, b) =
_
a if a b
b if b a
and
a b := max (a, b) =
_
b if a b
a if b a
.
More generally if a
i
n
i=1
F we let
a
1
a
n
:= min (a
1
, . . . , a
n
) and
a
1
a
n
:= max (a
1
, . . . , a
n
)
be the smallest and largest element in the nite list (a
1
, . . . , a
n
) .
Denition 2.13. Suppose that F and G are elds. A map, : F G is a
(eld) homomorphism i (1
F
) = 1
G
, (0
F
) = 0
G
, and for all x, y F;
(x +y) = (x) +(y) and (xy) = (x) (y) .
Lemma 2.14 ( embeds into an ordered eld). For every ordered eld
(F, P) , there a unique eld homomorphism, : F. In fact,
_
m
n
_
=
m
n
1
F
:= m1
F
(n1
F
)
1
(2.1)
where n1
F
:=
n times
..
1
F
+ + 1
F
and (n) 1
F
:= (n1
F
) for n N and 0 1
F
= 0
F
.
Moreover;
1. (x) P whenever x > 0,
2. and is injective. Thus we may identify with () and consider as a
sub-eld of F.
[In particular, ordered elds must be elds with an innite number of ele-
ments in it.]
Proof. From Lemma 2.6 we know there is a unique ring homomorphism,
: Z F, given by (m) = m 1
F
. So for m Z and n N we must have
_
m
n
_
n1
F
=
_
m
n
_
(n) =
_
m
n
n
_
= (m) = m1
F
which forces us to dene as in Eq. (2.1). Notice that is easy to verify by
induction that n1
F
= (n) P for all n N and in particular n1
F
,= 0 for
n N. In particular if x = m/n > 0 then
_
m
n
_
= m1
F
(n1
F
)
1
P by
Lemma 2.8. We must still check that is well dened homomorphism.
Well dened. Suppose that k N, we must show
(km) 1
F
((kn) 1
F
)
1
= m1
F
(n1
F
)
1
.
By cross multiplying, this will happen i
(km) 1
F
(n1
F
) = ((kn) 1
F
) m1
F
14 2 Fields
which is the case as : Z F is a ring homomorphism.
Homomorphism property. We have
_
m
n
+
p
n
_
=
_
m+p
n
_
= (m+p) (n)
1
= [(m) +(p)] (n)
1
= (m) (n)
1
+(p) (n)
1
=
_
m
n
_
+
_
p
n
_
and
_
m
n
_
_
q
p
_
= (m) (n)
1
(q) (p)
1
= (m) (q) [(n) (p)]
1
= (mq) [(np)]
1
=
_
mq
np
_
.
Injectivity. If 0 =
_
m
n
_
then
0 = (m) (n)
1
which implies (m) = 0 which happens i m = 0, i.e. m/n = 0.
Notation 2.15 If (F, P) is an ordered eld we write x > y i x y P. We
also write x y i x > y or x = y.
Notice that if x, y F then either x y = 0 (i.e. x = y), or x y P (i.e.
x > y), or xy P (i.e. y x P and y > x). Also in this notation we have
P = x F : x > 0 ,
P = x F : x < 0 (i.e. 0 > x) .
Lemma 2.16. Suppose that x < y and y < z and a > 0. Then x < z and
ax < ay.
Proof. By assumption y x P and z y P, therefore z x = (y x) +
(z y) P, i.e. z > x. Moreover, a P and (y x) P implies
P a (y x) = ay ax.
That is ay > ax.
Exercise 2.1. Let (F, P) be an ordered eld and x, y F with y > x. Show;
1. y +a > x +a for all a F,
2. x > y,
3. if we further suppose x > 0, show
1
x
>
1
y
.
Denition 2.17. Given x F, we say that y F is a square root of x if y
2
= x.
[From Lemma 2.8, it follows that if x F has a square root then x 0.]
Lemma 2.18. Suppose x, y F with x
2
= y
2
, then either x = y or x = y. In
particular, there are at most 2 square roots of any number x 0 in F.
Proof. Observe that
(x y) (x +y) = (x y) x + (x y) y
= x
2
yx +xy y
2
= x
2
y
2
= 0.
Thus it follows that either x y = 0 or x +y = 0, i.e. x = y or x = y.
Denition 2.19. If x > 0 admits a square root we let

x be the unique positive
root. We also dene
0 = 0.
Lemma 2.20. Suppose that 0 < x < y, i.e. x, y x P, then x
2
< y
2
.
Proof. By Lemma 2.16, we know x x < x y and x y < y y and therefore
x
2
< y
2
.
Corollary 2.21. If 0 x < y and

x and

y exists, then 0

x <

y.
Proof. If

x =

y then x = (
x)
2
=
_
y
_
2
= y which is impossible.
Similarly if
x >

y then
x =
_
x
_
2
> (
y)
2
= y
which is again false.
Alternatively: starting with y
2
x
2
= (y x) (y +x) and then replacing
y and x by

y and

x respectively (assuming they exist) shows,
y x =
_
x
_ _
y +
x
_
=

y
x = (y x)
_
y +
x
_
1

y
x P if (y x) P. More importantly this

shows

y depends continuously in on y.
Denition 2.22. The absolute value, [x[ , of x in ordered eld F is dened by
[x[ =
_
x if x 0
x if x < 0
Alternatively we may dene
[x[ =
x
2
.
2.2 Ordered Fields 15
Proposition 2.23. For all x, y F, then
1. [x[ 0
2. [xy[ = [x[ [y[
3. [x +y[ [x[ +[y[ .
Proof. 1. holds by denition since x > 0 if x < 0.
2. As [x[ [y[ 0 and ([x[ [y[)
2
= [x[
2
[y[
2
= x
2
y
2
= (xy)
2
, we have
[x[ [y[ =
_
(xy)
2
= [xy[ .
3. It suces to show [x +y[
2
([x[ +[y[)
2
. However,
[x +y[
2
= (x +y)
2
= x
2
+y
2
+ 2xy
x
2
+y
2
+ 2 [xy[ (xy [xy[)
= [x[
2
+[y[
2
+ 2 [x[ [y[
= ([x[ +[y[)
2
.
Denition 2.24. Let (F, P) be an ordered eld and S be a subset of F.
1. We say that S F is bounded from above (below) if there exists x F
such that x s (x s) for all s S. Any such x is called an upper (lower)
bound of S.
2. If S is bounded from above (below), we say that y F is a least upper
bound (greatest lower bound) for S if y is an upper (lower) bound for
S and y x (y x) for any other upper (lower) bound, x, of S.
Notice that least upper bounds and greatest lower bounds are unique if they
exists. We will write and
y = l.u.b. (S) = sup (S)
if y is the least upper bound for S and
y = g.l.b. (S) = inf (S)
if y is the greatest lower bound for S.
Example 2.25. Let F = , then;
1. max (a, b) and min (a, b) are least upper respectively greatest lower bounds
respectively for S = a, b . More generally, if S = a
1
, . . . , a
n
, then
sup (S) = a
1
a
n
:= max (a
1
, . . . , a
n
) and
inf (S) = a
1
a
n
:= min (a
1
, . . . , a
n
) .
2. S = N is not bounded from above while inf (S) = 1.
3. S =
_
1
1
n
: n N
_
is bounded from above and 1 = sup (S) while inf (S) =
1
2
.
4. Let
S = 1, 1.4, 1.41, 1.414, 1.4142, 1.41421, 1.414213, 1.4142135, . . .
where I am getting these digits from the decimal expansion of
2;
2

= 1.41421356237309504880168872420969807856967187537694807317667973799...
In this case S is bounded above by 2, or 1.42, or 1.415, etc. Nevertheless
2 = sup (S) does not exists in .

Example 2.26. Now let F = (t) be the eld of rational functions described in
Example 2.11, then; S = N is bounded from above. For example t is an upper
bound but there is not least upper bound. For example
1
m
t is also an upper
bound for S.
Denition 2.27 (Dedekind Cuts). A subset is called a cut (see [?, p.
17]) if;
1. is a proper subset of , i.e. ,= and ,= ,
2. if p and q and q < p, then q ,
3. if p , then there exists r with r > p.
Example 2.28. To each a , let
a
:= q : q < a . Then
a
is a cut and
a is the least upper bound of
a
in .
Example 2.29. Let S
n
n=0
be any bounded sequence such that S
n
S
n+1
for all n. Then
:=
n=0
Sn
= q : q < S
n
a.a. n
is a cut as the reader should verify. Let us further suppose that lim
n
S
n
does
not exist in . [For example from Example 1.17 we may take S
n
:=
n
k=0
1
k!

.] If m is an upper bound for , then m S
n
for all n since if m < S
n
for some n then q :=
1
2
(m+S
n
) with q > m. Since lim
n
S
n
,= m as
m there must exists > 0 such that
mS
n
= [mS
n
[ i.o. n.
As m S
n
is decreasing we may conclude that m S
n
for all n, i.e.
S
n
m for all n. From this it now follows that m is an upper bound
for which is strictly smaller that m. So there can be no least upper bound.
3
Real Numbers
As we saw in Section 1.2, is full of holes and calculus tends to produce
answers which live in these holes. So it is imperative that we ll the holes. Doing
so will lead to the real numbers provided we ll in the holes without adding too
much extra ller along the way. One good answer to the question, What are
the real numbers?, is contained in the statement of Theorem 3.3.
Denition 3.1. An order preserving eld isomorphism between two or-
dered elds, (F
1
, P
1
) and (F
2
, P
2
) , is a bijection, f : F
1
F
2
such that
f (0) = 0, f (1) = 1, f (P
1
) = P
2
, and
f (x +y) = f (x) +f (y) and f (xy) = f (x) f (y) for all x, y F
1
.
Denition 3.2. An ordered eld (F, P) is has the least upper bound prop-
erty (or is complete) if every non-empty subset, S F, which is bounded from
above possesses a least upper bound in F. [As we have seen in examples above,
does not have the least upper bound property.]
Theorem 3.3 (The real numbers). Up to order preserving eld isomorphism
(see Denition 3.1), there is exactly one complete ordered eld. It is this eld
that we refer to as the real numbers and denote by 1.
Denition 3.4. We say two Cauchy sequences a
n
n=1
and b
n
n=1
of ratio-
nal numbers are equivalent and write a
n
n=1
b
n
n=1
i
lim
n
[a
n
b
n
[ = 0.
We then dene := [a
n
n=1
] to be the equivalence class of the Cauchy sequence
a
n
n=1
and refer to the collection of these equivalence classes as the real
numbers. The set of real numbers will be denoted by 1.
Notation 3.5 Let i : 1 be dened by i (a) := [(a, a, a, . . . )] , i.e. i (a) is
the equivalence class of the constant sequence a.
Notice that if i (a) = i (b) i a = lim
n
a = lim
n
b = b. Thus the map,
i : 1 is injective and we will often simply identify a with i (a) and in this
way consider as a subset of 1.
Theorem 3.6. Let 1 be as in Theorem 3.3. For := [a
n
n=1
] and :=
[
n
n=1
] in 1 we dene
+ = [a
n
+b
n
n=1
] and = [a
n
b
n
n=1
] .
1. With these denitions, 1, satises the axioms of a eld.
2. Moreover, we can make this into an ordered eld by setting P :=
1 : > 0 where we say > 0 i there exists an N N such that
a
n
>
1
N
for a.a. n.
1
3. The ordered eld (1, P) is complete, i.e. has the least upper bound property.
The proof of Theorem 3.6 and Theorem 3.3 will be relegated to Section
3.6 at the end of this chapter. For an alternative existence proof of 1 using
Dedekind cuts as the elements of 1 is covered in Rudin [?, pages 17-21.]. One
may also construct the Real numbers using decimal expansions, see T. Gowers
notes on real numbers as decimals. We will prove the uniqueness assertion of
Theorem 3.3 in Section at the end of this section. From now on we are going
to take Theorem 3.3 for granted and derive from this the familiarproperties
of the real numbers.
Observe that , (t) , 1(t) are not complete and hence are not the real
numbers, 1. For example N (t) (or N 1(t) ) is bounded by t say but
has no least upper bound. However, we do know that 1 by Lemma 2.14.
We will soon see that is dense in 1. We now pause to discuss some of the
basic properties of 1.
Theorem 3.7. Suppose that 1 is a complete ordered eld which we assume we
have already embedded into 1 as in Lemma 2.14. Then;
1. For all x 0 there exists n N such that n x.
2. For all > 0 in 1 there exists n N such that 0 <
1
n
.
3. If 0 satises 1/n for all n N then = 0.
4. If a, b 1 and a b +
1
n
for all n N or a b + for all > 0, then
a b.
1. If n < x for all n N, then N is bounded from above and so a := sup (N)
exists in 1 by the completeness axiom. As a is the least upper bound for N
there must be an n N such that n > a1. However this implies n+1 > a
which violates a be an upper bound for N.
1
Roughly speaking here, you should think of = limnan and so > 0 should
happen i >
1
N
for some N N which then implies an
1
N
for a.a. n.
18 3 Real Numbers
2. If > 0 in 1 and
1
n
> for all n N, then n <
1
for all n N which is

impossible by item 1.
3. If there exists > 0 such that
1
n
for all n then n 1/ for all n which
is again impossible by item 1.
4. It suces to prove the rst assertion. We may assume a b for otherwise
we are done. If a b +
1
n
for all n, then 0 a b
1
n
for all n N and
hence a = b and in particular a b.
Proposition 3.8. If 1 is a complete ordered eld, then every subset S 1
which is bounded from below has a greatest lower bound, glb (S) = inf (S) . In
fact,
inf (S) = sup (S) .
Proof. We let m := sup (S) . Then we have s m for all s S, i.e.
s m for all s S so that m is a lower bound for S. Moreover if > 0 is given
there exists s
S such that s
m , i.e. s
m + . This shows that

any lower bound, k of S must satisfy, k m+ for all > 0, i.e. k m. This
shows that m is the greatest lower bound for S.
Let me sketch one way to construct 1 based on Cauchy sequences of rational
numbers.
Denition 3.9. A sequence q
n
n=1
1 converges to 0 1 if for all > 0
in 1 there exists N N such that [q
n
[ for all n N. Alternatively put, for
all M N we have [q
n
[
1
M
for a.a. n.
n
n=1
1 converges to q 1 if [q q
n
[ 0
as n , i.e. if for all N N, [q q
n
[
1
N
for a.a. n. As usual if q
n
n=1
converges to q we will write q
n
q as n or q = lim
n
q
n
.
n
n=1
1 is Cauchy if [q
n
q
m
[ 0 as
m, n . More precisely we require for each > 0 in 1 that [q
m
q
n
[ for
a.a. pairs (m, n) , i.e. there should exists N N such that [q
m
q
n
[ for all
m, n N.
The next few results are analogous to what you have already shown in the
case 1 is replaced by . As the proofs are essentially identical to those of
Theorem 1.13 and Exercise 1.1 1.6.
Proposition 3.12. If a
n
n=1
1 is a convergent sequence then it is Cauchy.
2
If a
n
n=1
is Cauchy sequence then it is bounded.
2
We are going to show shortly that the converse is true as well!
Theorem 3.13 (Basic Limit Results). Suppose that a
n
and b
n
are se-
quences of real numbers such that A := lim
n
a
n
and B := lim
n
b
n
exists
in 1. Then;
1. lim
n
[a
n
[ = [A[ .
2. If A ,= 0 then lim
n
1
an
=
1
A
.
3. lim
n
(a
n
+b
n
) = A+B.
4. lim
n
(a
n
b
n
) = A B.
5. If a
n
b
n
for all n, then A B.
6. If x
n
1 is another sequence such that a
n
x
n
b
n
and A = B, then
lim
n
x
n
= A = B.
Theorem 3.14. If S 1 is a non-empty set which is bounded from above, then
there exists x
n
n=1
S such that x
n
sup S as n , i.e. x
n
x
n+1
for
all n and lim
n
x
n
= sup S.
Proof. Let M := sup S. For each n N, there exists y
n
S such that
M y
n
M
1
n
. We now let x
n
:= max (y
1
, . . . , y
n
) in which case M
x
n
M
1
n
and x
n
is increasing. By the Sandwich theorem it follows that
lim
n
x
n
= M.
Theorem 3.15. If x
n
n=1
1 is bounded from above and x
n
is non-
decreasing, then lim
n
x
n
= sup
n
x
n
. Similarly if x
n
n=1
1 is bounded
from below and x
n
is non-increasing, then lim
n
x
n
= inf
n
x
n
.
Proof. Let M := sup
n
x
n
, then for all > 0, there exists N
N such that
M x
N
M . As x
n
is non-decreasing it follows that M x
n
M
for all n N
, i.e.
[M x
n
[ for all n N
.
3 Real Numbers 19
As > 0 was arbitrary, we may conclude that lim
n
x
n
= M. If x
n
is decreas-
ing instead, then x
n
and we have lim
n
x
n
= sup
n
(x
n
) = inf
n
x
n
.
Theorem 3.16. Suppose that 1 is a complete ordered eld which we assume
we have already embedded into 1 as in Lemma 2.14. Then;
1. For all m 1, if
m
:= y : y < m ,
then sup
m
= m.
2. If a, b 1 with a < b, then there exists q such that a < q < b.
3. If is a cut and m := sup , then =
m
.
1. Proof 1. Let
m
:= y : y < m and M := sup
m
1. Then M m.
If M ,= m then M < m. To see this last case is not possible := mM > 0
and choose n N such that 0 <
1
n
. Then choose y such that
M
1
2n
< y < M.
From this it follows that
M < y +
1
2n
< M +
1
2n
< m
which shows y +
1
2n

m
is greater than M violating the assumption that
M is an upper bound for
m
.
Proof 2. [Here is a slight rewriting of the above argument.] Choose y
n

m
such that y
m
M as m . Choose n N so that m M >
1
n
. Then
y
m
+
1
n
M +
1
n
< m as m . So for large m, y
m
+
1
n
< m while
y
m
+
1
n
> M, i.e. y
m
+
1
n

m
yet y
m
+
1
n
> M. This violates the assumption
that M is an upper bound for
m
.
2. By item 1. and Theorem 3.14 we can choose q
b
to be as close to b as
we choose and in particular q can be chosen to be in
b
with q > a.
3. You are asked to prove this in Exercise 3.1 below.
Exercise 3.1. Suppose that is a cut as in Denition 2.27. Show is
bounded from above. Then let m := sup and show that =
m
, where
m
m
:= y : y < m .
Also verify that
m
is a cut for all m 1. [In this way we see that we may
identify 1 with the cuts of . This should motivate Dedekinds construction of
the real numbers as described in Rudin.]
Proposition 3.17 (Rationals are dense in the reals). For all b 1, there
exists q
n
such that q
n
b. Similarly there exists p
n
such that p
n
b.
Proof. Given b we know that b = sup
b
by Theorem 3.16. Then by
Theorem 3.14 there exists q
n

b
such that q
n
b as n . The second
assertion can be proved in much the same way as the rst. Alternatively, let
q
n
such that q
n
b and set p
n
:= q
n
. Then p
n
b.
Denition 3.18. The real numbers which are not rational are called irra-
tional,
3
so the irrational numbers are 1 .
Example 3.19 (Eulers number). Let S
n
:=
n
k=0
1
k!
for all n N
0
. We dene
Eulers number to be,
e := lim
n
S
n
= sup S
n
: n N
0
1.
From Example 1.17, we have seen that e 1 .
Theorem 3.20 (n
th
roots). Let n N and x > 0 in 1, then there exists
a unique y 1
+
such that y
n
= x. We of course denote y by x
1/n
for
n
x.
The function x x
1/n
is increasing.[See Rudin for more properties of x
1/n
and
x
m/n
where m Z and n N.]
Proof. Uniqueness. First of t > s 0 then t
n
> s
n
0 as can be proved
by induction.
4
Thus if x, y 0 and x
n
= y
n
then x = y for otherwise x > y or
3
For what it is worth, as dictionary denition of irrational is not consistent with
or using reason. Lets try to use irrational numbers in a rational way!
4
The statement holds for n = 1 by assumption and if t
n
> s
n
, then t
n+1
> ts
n
>
s
n+1
. For the last equality we used t > s implies ts
n
> s s
n
.
20 3 Real Numbers
y > x in which case x
n
> y
n
or y
n
> x
n
respectively. This shows that there is
at most one n
th
root if it exists. I also claim that x
1/n
< y
1/n
if x < y. If not
then x
1/n
y
1/n
and this would then imply x =
_
x
1/n
_
n
_
y
1/n
_
n
= y which
contradicts x < y.
Existence. Let := t 1
+
: t
n
x . If t =
x
1+x
(0, 1) , then t
n
t x
so that t and ,= . If t = 1 + x, then t
n
= (1 +x)
n
1 + nx > x and
therefore is bounded from above. Hence we may dene y := sup . We will
now show that y
n
= x.
By Theorem 3.14, there exists t
k
such that t
k
y as k . By
denition of , t
n
k
x for all k. Passing to the limit as k in this inequality
implies y
n
= lim
k
t
n
k
x.
If y
n
< x then (using the Binomial theorem) and properties of limits,
_
y +
1
m
_
n
= y
n
+
n
k=1
_
n
k
_
y
nk
_
1
m
_
k
y
n
< x as m .
Hence for suciently large m we will have
_
y +
1
m
_
n
< x. But this shows that
y +
1
m
which violates y being an upper bound for . Therefore we conclude
that y
n
= x.
End of Lecture 6, 10/10/2012.
3.1 Extended real numbers
Notation 3.21 The extended real numbers is the set

1 := 1 , i.e. it
is 1 with two new points called and . We use the following conventions,
a = if a

1 with a > 0, a = if a

1 with a < 0,
+ a = for any a 1, + = and = while the
following expressions are not dened;
, +, /, 0 , and 0.
A sequence a
n

1 is said to converge to () if for all M 1 there exists
m N such that a
n
M (a
n
M) for all n m. In these case we write
lim
n
a
n
= or a
n
as n .
For any subset

1, let sup and inf denote the least upper bound and
greatest lower bound of respectively. The convention being that sup = if
or is not bounded from above and inf = if or is not
bounded from below. We will also use the conventions that sup = and
inf = +. The next theorem is a fairly simple but often useful result about
computing least upper bounds.
Theorem 3.22 (Sup Sup Theorem). Suppose that is a subset of 1 such
that =
I
where
and I is some index set. Then

sup = sup
I
sup
.
The convention here is that the supremum of a set which is not bounded from
above is and the sup = .
Proof. Let M := sup and M
:= sup
for all I. As
we
have M
M for all I and therefore sup

I
M
M. Conversely, if
, then M
for some I and therefore M
. From this it
follows that sup
I
M
and as is arbitrary we may conclude that

M = sup sup
I
M
.
The next corollary records a typical way the Sup Sup theorem is used.
Corollary 3.23. Suppose that X and Y are sets and S : X Y 1 is a
function. Then
sup
xX
sup
yY
S (x, y) = sup
(x,y)XY
S (x, y) = sup
yY
sup
xX
S (x, y) .
In particular, if S
m,n
1 for all m, n N, then
sup
m
sup
n
S
m,n
= sup
(m,n)
S
m,n
= sup
n
sup
m
S
m,n
.
Proof. Let := S (x, y) : (x, y) X Y , and for x X let
x
:=
S (x, y) : y Y . Then =
xX
x
and therefore,
sup
(x,y)XY
S (x, y) = sup = sup
xX
sup
x
= sup
xX
sup
yY
S (x, y) .
The same reasoning also shows,
sup
(x,y)XY
S (x, y) = sup
yY
sup
xX
S (x, y) .
The next Lemma records some basic limit theorems involving the extended
real numbers.
Lemma 3.24. Suppose a
n
n=1
and b
n
n=1
are convergent sequences
5
in

1,
then:
1. If a
n
b
n
for a.a. n then lim
n
a
n
lim
n
b
n
.
2. If c 1, lim
n
(ca
n
) = c lim
n
a
n
.
5
The only sequences that do not converge in

1 are those which oscillate too much.
3.2 Limsups and Liminfs 21
3. If a
n
+b
n
n=1
is convergent and
lim
n
(a
n
+b
n
) = lim
n
a
n
+ lim
n
b
n
(3.1)
provided the right side is not of the form .
4. a
n
b
n
n=1
is convergent and
lim
n
(a
n
b
n
) = lim
n
a
n
lim
n
b
n
(3.2)
provided the right hand side is not of the for 0 of 0 () .
Before going to the proof consider the simple example where a
n
= n and
b
n
= n +c with > 0 and c 1. Then
6
lim(a
n
+b
n
) =
_
_
_
if < 1
c if = 1
if > 1
while
lim
n
a
n
+ lim
n
b
n
= .
This shows that the requirement that the right side of Eq. (3.1) is not of form
is necessary in Lemma 3.24. Similarly by considering the examples
a
n
= n and b
n
= n
with > 0 shows the necessity for assuming right hand

side of Eq. (3.2) is not of the form 0.
Proof. The proofs of items 1. and 2. are left to the reader.
Proof of Eq. (3.1). Let a := lim
n
a
n
and b = lim
n
b
n
.
Case 1. Suppose b = in which case we must assume a > . In this
case, for every M > 0, there exists N such that b
n
M and a
n
a 1 for all
n N and this implies
a
n
+b
n
M +a 1 for all n N.
Since M is arbitrary it follows that a
n
+ b
n
as n . The cases where
b = or a = are handled similarly.
Case 2. If a, b 1, then for every > 0 there exists N N such that
[a a
n
[ and [b b
n
[ for all n N.
Therefore,
[a +b (a
n
+b
n
)[ = [a a
n
+b b
n
[ [a a
n
[ +[b b
n
[ 2
6
This example shows that if you formally arrive at an expression like , then
you should work harder to decide what it really means!
for all n N. Since n is arbitrary, it follows that lim
n
(a
n
+b
n
) = a +b.
Proof of Eq. (3.2). It will be left to the reader to prove the case where lima
n
and limb
n
exist in 1. I will only consider the case where a = lim
n
a
n
,= 0
and lim
n
b
n
= here. Let us also suppose that a > 0 (the case a < 0 is
handled similarly) and let := min
_
a
2
, 1
_
. Given any M < , there exists
N N such that a
n
and b
n
M for all n N and for this choice of N,
a
n
b
n
M for all n N. Since > 0 is xed and M is arbitrary it follows
that lim
n
(a
n
b
n
) = as desired.
Exercise 3.2. Show lim
n
n
= and lim
n
1
n
= 0 whenever > 1.
Exercise 3.3. Suppose > 1 and k N, show there is a constant c = c (, k) >
0 such that
n
cn
k
for all n N. [In words,
n
grows in n faster than any
polynomial in n.]
Lemma 3.25. Suppose that a
n
n=1
1 and lim
n
a
n
= A

1. Then
every subsequence, b
k
:= a
nk
k=1
, also converges to A.
Exercise 3.4. Prove Lemma 3.25.
3.2 Limsups and Liminfs
Notation 3.26 Suppose that x
n
n=1

1 is a sequence of numbers. Then
lim inf
n
x
n
= lim
n
infx
k
: k n and (3.3)
lim sup
n
x
n
= lim
n
supx
k
: k n. (3.4)
We will also write lim for liminf and lim for limsup .
Remark 3.27. Notice that if a
k
:= infx
k
: k n and b
k
:= supx
k
: k
n, then a
k
is an increasing sequence while b
k
is a decreasing sequence.
Therefore the limits in Eq. (3.3) and Eq. (3.4) always exist in

1 (see Theorem
3.15) and
lim inf
n
x
n
= sup
n
infx
k
: k n and
lim sup
n
x
n
= inf
n
supx
k
: k n.
Owing to the following exercise, one may reduce properties of the liminf to
those of the limsup .
Exercise 3.5. Show liminf
n
(a
n
) = limsup
n
a
n
.
22 3 Real Numbers
Exercise 3.6. Let a
n
n=1
be the sequence given by,
(1, 2, 3, 1, 2, 3, 1, 2, 3, . . . ) .
Find limsup
n
a
n
and liminf
n
a
n
.
Exercise 3.7. If a
n
n=1
and b
n
n=1
are two sequences such that a
n
b
n
for
a.a. n, then
lim sup
n
a
n
lim sup
n
b
n
and liminf
n
a
n
liminf
n
b
n
. (3.5)
The following proposition contains some basic properties of liminfs and lim-
sups.
Proposition 3.28. Let a
n
n=1
and b
n
n=1
be two sequences of real numbers.
Then
1. liminf
n
a
n
limsup
n
a
n
.
2. lim
n
a
n
exists in

1 i
lim inf
n
a
n
= lim sup
n
a
n

1.
3.
lim sup
n
(a
n
+b
n
) lim sup
n
a
n
+ lim sup
n
b
n
(3.6)
whenever the right side of this equation is not of the form .
4. If a
n
0 and b
n
0 for all n N, then
lim sup
n
(a
n
b
n
) lim sup
n
a
n
lim sup
n
b
n
, (3.7)
provided the right hand side of (3.7) is not of the form 0 or 0.
Proof. Items 1. and 2. will be proved here leaving the remaining items as
an exercise to the reader. For item 1. we have
infa
k
: k n supa
k
: k n n,
and therefore by the Sandwich theorem, liminf
n
a
n
limsup
n
a
n
.
2. (=) Let A := liminf
n
a
n
= limsup
n
a
n

1. Since
inf
kn
a
k
a
n
sup
kn
a
k
,
if A 1 then it follows by the sandwich theorem that lim
n
a
n
= A. If
A = , then for all M N we have M inf
kn
a
k
for a.a. n. Therefore
a
k
M for a.a. k and we have shown lim
k
a
k
= . If A = then for all
M N we have sup
kn
a
k
M for a.a. n. Therefore a
k
M for a.a. k and
we have shown lim
k
a
k
= .
( = ) Conversely, suppose that lim
n
a
n
= A

1 exists. If A 1, then
for every > 0 there exists N() N such that [Aa
n
[ for all n N(),
i.e.
A a
n
A+ for all n N().
From this we learn that
A inf
kn
a
k
sup
kn
a
k
A+ for a.a. n
and so passing to the limit as n implies
A lim inf
n
a
n
lim sup
n
a
n
A+.
Since > 0 is arbitrary, it follows that
A lim inf
n
a
n
lim sup
n
a
n
A,
i.e. that A = liminf
n
a
n
= limsup
n
a
n
.
If A = , then for all M > 0 there exists N = N(M) such that a
n
M
for all n N. This show that liminf
n
a
n
M and since M is arbitrary it
follows that
lim inf
n
a
n
lim sup
n
a
n
.
The proof for the case A = is analogous to the A = case.
Exercise 3.8. Show that
limsup
n
(a
n
+b
n
) limsup
n
a
n
+ limsup
n
b
n
(3.8)
provided that the right side of Eq. (3.8) is well dened, i.e. no or +
type expressions. (It is OK to have + = or = , etc.)
Exercise 3.9. Suppose that a
n
0 and b
n
0 for all n N. Show
limsup
n
(a
n
b
n
) limsup
n
a
n
limsup
n
b
n
, (3.9)
n
n=1
and b
n
n=1
are two non-negative se-
quences and assume A = lim
n
a
n
exists in (0, ) . Show
limsup
n
a
n
b
n
= A limsup
n
b
n
.
3.2 Limsups and Liminfs 23
Denition 3.29. A sequence, a
n
n=1
, of positive numbers is said to have
sub-geometric growth i for all > 1 there exists c = c () < such that
a
n
c
n
for a.a. n N. (3.10)
Lemma 3.30. Suppose that a
n
n=1
is a sequence of non-negative numbers.
1. If a
n
n=1
has sub-geometric growth, then limsup
n
a
1/n
n 1.
2. If a
n
> 0 for a.a. n and
_
a
1
n
_
n=1
has sub-geometric growth (i.e. for all
> 1 there exists = () < such that a
n

n
for a.a. n), then
liminf
n
a
1/n
n 1.
3. In particular if a
n
> 0 for a.a. n and both a
n
n=1
and
_
a
1
n
_
n=1
have
sub-geometric growth, then lim
n
a
1/n
n = 1.
Proof. 1. For any > 1 choose (1, ) and let c = c () . Since
lim
n
c
_
_
n
= 0 < 1, it follows that a
n
c
n
= c
_
_
n
n

n
for
a.a. n and therefore
a
1/n
n
for a.a. n = limsup
n
a
1/n
n
.
As > 1 is arbitrary we may conclude that limsup
n
a
1/n
n 1.
2. This may be proved similarly to item 1. or it can be reduced to item 1 as
we show here. By item 1. applies to
_
a
1
n
_
n
,
limsup
n
1
a
1/n
n
= limsup
n
a
1/n
n
1.
The proof is completed because limsup
n
1
a
1/n
n
=
1
liminfna
1/n
n
as the reader
should prove.
3. From items 1. and 2. we learn that
1 liminf
n
a
1/n
n
limsup
n
a
1/n
n
1
from which the result follows.
Example 3.31. If c
1
n
p
a
n
cn
p
for some p N and c < , then
lim
n
a
1/n
n = 1. Indeed using Exercise 3.3 we may easily verify the hypothesis
of Lemma 3.30.
Exercise 3.11. Suppose that p (t) is a non-zero polynomial of t 1 with (pos-
sibly) complex coecients. Show
lim
n
[p (n)[
1/n
= 1.
Exercise 3.12. If a
n
0, then lim
n
a
n
= 0 i limsup
n
a
n
= 0.
Proposition 3.32. Suppose that a
n
n=1
is a sequence of real numbers and let
B := y 1 : a
n
y for i.o. n .
Then sup B = limsup
n
a
n
with the convention that sup B = if B = .
Proof. If a
n
n=1
is not bounded from above, then B is not bounded from
above and sup B = = limsup
n
a
n
. If B = so that sup B = , then for
all y 1 we must have a
n
< y for a.a. n. This then implies limsup
n
a
n
y
for all y 1 from which we conclude that limsup
n
a
n
= . So let us now
assume that B ,= and a
n
n=1
is bounded in which case B is bounded from
above. Let us set := sup B 1 and a
:= limsup
n
a
n
.
If y > , then a
n
< y for a.a. n from which it follows that a
:=
limsup
n
a
n
y. We may now let y in order to see that a
.
7
Now
suppose that y < , then a
n
y for a.a. n and hence a
= limsup
n
a
n
y.
Letting y then shows a
. Thus we have shown a
= .
Theorem 3.33. There is a subsequence a
nk
k=1
of a
n
n=1
such that
lim
k
a
nk
= limsup
n
a
n
. Similarly, there is a subsequence a
nk
k=1
of
a
n
n=1
such that lim
k
a
nk
= liminf
n
a
n
. Moreover, every convergent
subsequence, b
k
:= a
nk
k=1
of a
n
n=1
satises,
lim inf
n
a
n
lim
k
b
k
lim sup
n
a
n
.
Proof. Let me prove the last assertion rst. Suppose that b
k
:= a
nk
is some
convergent subsequence of a
n
n=1
. We then have,
inf
nnk
a
n
b
k
sup
nnk
a
n
for all k N.
Passing to the limit in this equation then implies,
liminf
n
a = lim
k
inf
nnk
a
n
lim
k
b
k
lim
k
sup
nnk
a
n
= limsup
n
a
n
.
We have used, inf
nnk
a
n
k=1
and
_
sup
nnk
a
n
_
k=1
are subsequence of the
convergent sequences of inf
nk
a
n
k=1
and
_
sup
nk
a
n
_
k=1
respectively and
therefore converge to the same limits respectively, see Lemma 3.25.
Now let us prove the rst assertions. I will cover the limsup case here as
the liminf case is similar or can be deduced from the limsup case with the aid
of Exercise 3.5. Let A := limsup
n
a
n
. We will need to consider three case,
A 1, A = , and A = .
7
This can be done more formally by choosing a sequence {yk}
k=1
such that yk
so that a
yk. Now let k to conclude a
limkyk = .
24 3 Real Numbers
i) A 1, then by Proposition 3.32, for all k N we have A
1
k
a
n
for innitely many n. In particular we can choose n
1
< n
2
< n
3
< . . .
inductively so that A
1
k
a
nk
for all k. Since
A
1
k
a
nk
sup
mnk
a
m
and the limit as k of both extremes of this inequality are A, it follow
from the sandwich inequality that lim
k
a
nk
= A.
ii) If A = limsup
n
a
n
= , then sup
kn
a
k
= for all n N which
implies for all M < that a
k
M i.o. k. Working similarly to case i) we
can choose n
1
< n
2
< n
3
< . . . so that a
nk
k for all k and therefore
lim
k
a
nk
= .
iii) Finally suppose that A = limsup a
n
= so that for all M N there
exists N N such that sup
kn
a
k
M for all n N, i.e. a
n
M for
all n N. In this case it follows that in fact lim
n
a
n
= and we do
not have to even choose as subsequence.
Corollary 3.34 (BolzanoWeierstrass Property / Compactness). Every
bounded sequence of real numbers, a
n
n=1
, has a convergent in 1 subsequence,
a
nk
k=1
. If we drop the bounded assumption then we may only assert that there
is a subsequence which is convergent in

1.
Proof. Let M < such that [a
n
[ M for all n N, i.e. M a
n
M
for all n. We may then conclude from Exercise 3.7 that,
M limsup
n
a
n
M.
It now follows from Theorem 3.33 that there exists a subsequence, a
nk
k=1
,
of a
n
n=1
such that
lim
k
a
nk
= limsup
n
a
n
[M, M] 1.
Theorem 3.35 (1 is Cauchy complete). If a
n
n=1
1 is a Cauchy se-
quence, then lim
n
a
n
exists in 1 and in fact,
lim
n
a
n
= limsup
n
a
n
= liminf
n
a
n
.
Proof. We will give two proofs of this important theorem. Each proof uses
the fact that a
n
n=1
1 is Cauchy implies a
n
n=1
is bounded. This is proved
exactly in the same way as the solution to Exercise 1.2.
First proof. By Corollary 3.34, there is a subsequence, a
nk
k=1
, such that
lim
k
a
nk
= L 1. As in the proof of Exercise 1.7, it follows that lim
n
a
n
exists and is equal to L.
Second proof. Let a := liminf
n
a
n
and b := limsup
n
a
n
. It suces
to show a = b. As we always know that a b it will suce to show b a.
Given > 0, there exists N N such that
[a
m
a
n
[ for all m, n N.
In particular, for m, n k N we have a
m
a
n
+ and hence
b sup
mk
a
m
a
n
+ for all n k.
From this inequality we may further conclude,
b inf
nk
a
n
+ a +.
As > 0 is arbitrary, we have indeed shown b a.
n
n=1
is a sequence of real numbers and let
A := y 1 : a
n
y for a.a. n .
Then sup A = liminf
n
a
n
with the convention that sup A = if A = .
n
n=1
is a sequence of real numbers. Show
limsup
n
a
n
= a
1 i for all > 0,

a
n
a
+ for a.a. n. and

a
a
n
i.o. n.
Similarly, show liminf
n
a
n
= a
1 i for all > 0,

a
n
a
+ i.o.. n. and
a
a
n
for a.a. n.
Notice that this exercise gives another proof of item 2. of Proposition 3.28 in
the case all limits are real valued.
Exercise 3.15 (Cauchy Complete = L.U.B. Property). Suppose that
1 denotes any ordered eld which is Cauchy complete. Show 1 has the least
upper bound property and therefore is the eld of real numbers.
3.4 The Decimal Representation of a Real Number 25
3.3 Partitioning the Real Numbers
Notation 3.36 (Intervals) For a, b 1 with a < b we dene,
(a, b) := x 1 : a < x < b ,
[a, b] := x 1 : a x b ,
(a, b] := x 1 : a < x b , and
[a, b) := x 1 : a x < b .
We also also a = in the intervals, (a, b) and (a, b] and allows b = + in
the intervals (a, b) and [a, b).
Notation 3.37 (Pairwise disjoint unions) If X is a set and A
X for
I, we write X =
I
A
to mean; X =
I
A
and A
for all
,= .
Exercise 3.16. Suppose that a, b, c, d 1 such that a < b c < d. Show
(a, b] (c, d] = and [a, b) [c, d) = .
Lemma 3.38 (Well Ordering II). Suppose that S is a non-empty subset of Z
which is bounded from below, then inf (S) S, i.e. S has a (unique) minimizer.
Proof. As S is bounded from below, there exists k Z such that k s
for all s S. Therefore

S := s k + 1 : s S N and hence by the Well
ordering principle, min
_
S
_
:= m N exists. That is m sk +1 for all s S
and there exists s
0
S such that m = s
0
k + 1. These last statements are
equivalent to saying,
s
0
= m+k 1 s for all s S,
which is to say s
0
= min (S) .
Proposition 3.39. Suppose that S
n
n=
1 such that S
n
< S
n+1
for all
n Z, lim
n
S
n
= and lim
n
S
n
= . Then
nZ
(S
n1
, S
n
] = 1 =
nZ
[S
n
, S
n+1
). (3.11)
Proof. The fact that (S
n
, S
n+1
] (S
m
, S
m+1
] = follows from Exercise
3.16. For x 1, let
n
0
:= min (n Z : x S
n
)
which exists since n Z : x S
n
is non-empty as S
n
as n and is
bounded from below since S
n
as n . It then follows that x S
n0
while x _ S
n01
, i.e. S
n01
< x S
n0
and we have shown x (S
n01
, S
n0
]
which completes the proof of the rst equality in Eq. (3.11). The proof of the
second equality is similar and so will be omitted.
Proposition 3.40. Suppose that < a < b < and S
n
N
n=0
[a, b] such
that a = S
0
< S
1
< < S
N1
< S
N
= b, then
[a, b) =
N
n=1
[S
n1
, S
n
).
This result also holds if N = provided we now assume S
n
< S
n+1
for all n,
a = S
0
, and S
n
b as n .
Proof. This proof is very similar to the proof of Proposition 3.39 and so
will be omitted.
3.4 The Decimal Representation of a Real Number
Lemma 3.41 (Geometric Series). Let 1 or , m, n Z and
S :=
m
k=n
k
. Then
S =
_
mn + 1 if = 1
m+1
n
1
if ,= 1
.
Proof. When = 1,
S =
m
k=n
1
k
= mn + 1.
If ,= 1, then
S S =
m+1
n
.
Solving for S gives
S =
m
k=n
k
=

m+1
n
1
if ,= 1. (3.12)
Taking = 10
1
in Eq. (3.12) implies
m
k=n
10
k
=
10
(m+1)
10
n
10
1
1
=
1
10
n1
1 10
(mn+1)
9
and in particular, for all M n,
lim
m
m
k=n
10
k
=
1
9 10
n1

M
k=n
10
k
.
26 3 Real Numbers
Denition 3.42 (Decimal Numbers). Let | denote those sequences
0, 1, 2, . . . , 9
Z
with the following properties:
1. there exists N N such that
n
= 0 for all n N and
2.
n
,= 0 for some n Z.
A decimal number is then an expression of the form
N+1
. . .
0
.
1
3
. . . .
For example
52 +
2

= 53.41421356237309504880168872420969807856967187537694807 . . .
To every decimal number | is the sequence a
n
= a
n
() dened for n N
by
a
n
:=
n
k=
k
10
k
. (a nite sum).
Since for m > n,
[a
m
a
n
[ =
k=n+1
k
10
k
9
m
k=n+1
10
k
9
1
9 10
n
=
1
10
n
,
it follows that
[a
m
a
n
[
1
10
min(m,n)
0 as m, n
which shows a
n
n=1
is a Cauchy sequence. Thus to every decimal number we
may associate the real number
a () := lim
n
a
n
.
Theorem 3.43. If x 0 is a real number, there exists | such that x =
a () , i.e. all real numbers can be represented in decimal form.
Proof. If x = 0, we can take
n
= 0 for all n so that 0 = a () . So
suppose that x > 0 and let p := min (n N : x < n) . Set m = p 1, then
m x < m+ 1. We then dene
k
for k 0 so that m =
N
. . .
0
. We now
construct
k
for k 1. For k = 1 we write
[m, m+ 1) =
9
l=0
[m+
l
10
, m+
l + 1
10
)
and then choose
1
= l if x [m+
l
10
, m+
l+1
10
). We then construct
2
using,
[m+

1
10
, m+

1
+ 1
10
)
9
l=0
[m+

1
10
+
l
100
, m+

1
10
+
l + 1
100
)
and set
2
= l for x [m +
1
10
+
l
100
, m +
1
10
+
l+1
100
). Continuing this way
inductively we construct
k
k=1
such that
x [m+
k
j=1
j
10
j
, m+
k1
j=1
j
10
j
+

k
+ 1
10
k
).
It is now easy to see that x = a () .
Remark 3.44. The representation of x 0 as a decimal number may not be
unique. For example,
0.999 =
k=1
9
10
k
=
9
10

1
1
1
10
= 1.000.
[Or note that
1 0
n-times
.
..
9 9 = 0
n1 times
.
..
0 0 1 = 10
n
0 as n .]
On the other hand if we agree to not allow a tail of repeated 9s as an element
of |, then the representation would be unique.
3.5 Summary of Key Facts about Real Numbers
1. The real numbers, 1, is the unique (up to order preserving eld isomor-
phism) ordered eld with the least upper bound property or equivalently
which is Cauchy complete.
2. Informally the real numbers are the rational numbers with the (irrational)
hole lled in.
3. Monotone bounded sequence always converge in 1.
4. A sequence converges in 1 i it is Cauchy.
5. Cauchy sequences are bounded.
6. N is unbounded from above in 1.
7. For all > 0 in 1 there exists n N such that
1
n
< .
8. and 1 is dense in 1. In particular, between any two real numbers
a < b, there are innitely many rational and irrational numbers.
9. Decimal numbers map (almost 1-1) into the real numbers by taking the
limit of the truncated decimal number.
10. If a, b, 1, then
3.6 (Optional) Proofs of Theorem 3.6 and Theorem 3.3 27
a) a b by showing that a b + for all > 0.
b) a = b by proving a b and b a or
c) a = b by showing [b a[ for all > 0.
11. A number of standard limit theorems hold, see Theorem 3.13.
12. Unlike limits, limsup and liminf always exist. Moreover we have;
liminf
n
a
n
limsup
n
a
n
with equality i lim
n
a
n
exists in
which case
liminf
n
a
n
= lim
n
a
n
= limsup
n
a
n
.
We may allow the values of in these statements.
13. If b
k
:= a
nk
k=1
is a convergent subsequence of a
n
, then
liminf
n
a
n
lim
k
b
k
limsup
n
a
n
and we may choose b
k
so that lim
k
b
k
= limsup
n
a
n
or
lim
k
b
k
= liminf
n
a
n
.
14. Bounded sequences of real numbers always have convergence subsequences.
15. If S 1 and A := sup (S) , then there exists a
n
n=1
S such that
a
n
a
n+1
for all n and lim
n
a
n
= sup (S) .
16. If S 1 and A := inf (S) , then there exists a
n
n=1
S such that
a
n+1
a
n
for all n and lim
n
a
n
= inf (S) .
3.6 (Optional) Proofs of Theorem 3.6 and Theorem 3.3
In this section, we assume that 1 is as describe in Theorem 3.6. The next
exercise is relatively straight forward.
Exercise 3.17. Prove the following properties of 1.
1. Show addition and multiplication in Theorem 3.6 are well dened.
2. Show (1, +, ) satises the axioms of a eld. Hint: for constructing mul-
tiplicative inverses, make use of Proposition 3.45 below to conclude if
:= [a
n
n=1
] 1 and a ,= 0 = i (0) , then there exists N N such
that [a
n
[
1
N
for a.a. n. By redening the rst few terms of a
n
if necessary,
you may assume that [a
n
[
1
N
for all n and then take
1
=
_
_
a
1
n
_
n=1
_
.
3. Show i : 1 is injective homomorphism of elds.
To nish the proof of Theorem 3.6 we must show that P is an ordering on 1
with the least upper bound property. This will be carried out in the remainder
of this section.
Proposition 3.45. Suppose that := [a
n
n=1
] and := [b
n
n=1
] are real
numbers. Then precisely one of the following three cases can happen;
1. lim
n
(a
n
b
n
) = 0, i.e. = ,
2. there exists =
1
N
> 0 such that a
n
b
n
+ for a.a. n in which case > ,
or
3. there exists =
1
N
> 0 such that b
n
a
n
+ for a.a. n in which case > .
Proof. If case 1. does not hold then there exists > 0 such that [a
n
b
n
[
for innitely many n. There are now two possibilities (which will turn out to
me mutually exclusive;
i) a
n
b
n
i.o. n,
ii) b
n
a
n
i.o. n.
Since a
n
and b
n
are Cauchy sequences, there exists N N such that
[a
n
a
m
[ /3 and [b
n
b
m
[ /3 for all m, n N.
If case i) holds, we may choose an m N such that a
m
b
m
and so for
n N we nd,
a
m
b
m
= a
m
a
n
+a
n
b
n
+b
n
b
m
[a
m
a
n
[ +a
n
b
n
+[b
n
b
m
[
= /3 +a
n
b
n
+/3
from which it follows that a
n
b
n
:= /3 for all n N and we are in case
2. Similarly if case ii) holds then we are in fact in case 3. of the proposition.
Corollary 3.46. Suppose that := [a
n
n=1
] and := [b
n
n=1
] are real
numbers, then i for all N N,
a
n
b
n

1
N
for a.a. n. (3.13)
Alternatively put, i for all N N,
b
n
a
n
+
1
N
for a.a. n.
Proof. If = , then lim
n
(a
n
b
n
) = 0 and therefore Eq. (3.13) holds.
If > , then in fact a
n
b
n
> 0 > 1/N for a.a. n.
Conversely, if < , then there exists > 0 such that b
n
a
n
+ for a.a.
n. Thus if Eq. (3.13) were to also hold we could conclude for each N N that
a
n
b
n

1
N
a
n
+
1
N
for a.a. n.
This leads to a contradiction as soon as we choose N so large as to make
1/N < . Thus if Eq. (3.13) holds we must have .
28 3 Real Numbers
Proposition 3.47. Suppose that 1, a
n
n=1
be a Cauchy sequence in ,
and := [a
n
n=1
] . If i (a
k
) for all k then . Similarly if i (a
k
)
for all k then .
Proof. Let = [
n
n=1
] and suppose that i (a
n
) for all n. For sake of
contradiction, suppose that > , i.e. there exists an N N such that
n

a
n
+
1
N
for a.a. n. The assumption that i (a
k
) implies that
n
a
k
+
1
3N
for a.a. n. Because a
k
is Cauchy, we may conclude there exists M N such
that
n
a
k
+
1
2N
for all n, k M.
By making M even larger if necessary, we may assume that
n
a
n
+
1
N
for
all n M as well. From these two inequalities with k = n M we learn
a
n
+
1
N

n
a
n
+
1
2N
=
1
2N

1
N
and we have reached the desired contradiction. The fact that i (a
k
) for all k
implies is proved similarly. Alternatively if i (a
k
) then i (a
k
)
which implies , i.e. .
With these results in hand, let us now show that 1 as dened in Theorem
3.6 has the least upper bound property.
Proof of the least upper bound property. So suppose that 1 is a
non empty set which is bounded from above. For each m N, let k
m
Z be
the smallest integer such that i (a
m
) := i
_
km
2
m
_
is an upper bound for . Since,
for all n m, a
m
2
m
a
n
a
m
, we may conclude that
[a
n
a
m
[ 2
min(n,m)
0 as n, m .
This shows a
n
n=1
is Cauchy and hence we dened an element :=
[a
n
n=1
] 1. We now will show = sup .
If , then i (a
n
) for all n and so by Proposition 3.47 we conclude
that , i.e. is an upper bound for . Now suppose that is another
upper bound for . As i (a
n
2
n
) is not an upper bound for there exists
such that
i
_
a
n
2
n
_
< .
So by another application of Proposition 3.47 we learn that
= [a
n
n=1
] =
__
a
n
2
n
_
n=1
.
This shows that is in fact the least upper bound for .
Theorem 3.48 (Real numbers are unique). Suppose that F and G are two
complete ordered elds. Then there is a unique order preserving isomorphism,
: F G.
(Sketch). Suppose that : F G is an order preserving homomorphism.
The usual arguments show that any homomorphism, : F G must satisfy
(q1
F
) = q1
G
. We know that q 1
F
: q and q 1
G
: q are dense
copies of inside of F and G respectively. Now for general a F choose
q
n
, p
n
that q
n
1
F
a and p
n
1
F
a. Since is order preserving we must have
q
n
1
G
= (q
n
1
F
) is increasing and p
n
1
G
= (p
n
1
F
) is decreasing. Moreover,
since p
n
q
n
0 we must have lim
n
(q
n
1
F
) = lim
n
(p
n
1
F
) . Since
(q
n
1
F
) (a) (p
n
1
F
) for all n it then follows that (a) = lim
n
q
n
1
G
=
lim
n
p
n
1
G
and we have shown is uniquely determined.
For the converse, if q
n
we know that
[q
n
1
F
q
m
1
F
[ = [q
n
q
m
[ 1
F
and
[q
n
1
G
q
m
1
G
[ = [q
n
q
m
[ 1
G
.
Thus if q
n
1
F
n=1
is convergent in F i q
n
1
G
n=1
is convergent in G. Thus
for any a F we choose q
n
such that q
n
1
F
a and then dene (a) :=
lim
n
q
n
1
G
. One now checks that this formula is well dened (independent
of the choice of q
n
such that q
n
1
F
a) and denes an order preserving
isomorphism. For example, if a b we may choose q
n
and p
n

such that q
n
1
F
a and p
n
1
F
b. Then q
n
1
G
p
n
1
G
for all n and letting n
shows,
(a) = lim
n
q
n
1
G
lim
n
p
n
1
G
= (b) .
The other properties of are proved similarly.
3.7 Supremum and Inniums of sets
Denition 3.49. Given a set A X, let
1
A
(x) =
_
1 if x A
0 if x / A
be the indicator function of A.
Lemma 3.50 (Properties of inf and sup). We have:
1. (
n
A
n
)
c
=
n
A
c
n
,
2. A
n
i.o.
c
= A
c
n
a.a. ,
3. limsup
n
A
n
= x X :
n=1
1
An
(x) = ,
4. liminf
n
A
n
=
_
x X :
n=1
1
A
c
n
(x) <
_
,
5. sup
kn
1
Ak
(x) = 1
knAk
= 1
sup
kn
Ak
,
6. inf
kn
1
Ak
(x) = 1
knAk
= 1
infkn Ak
,
7. 1
limsup
n
An
= limsup
n
1
An
, and
8. 1
liminfnAn
= liminf
n
1
An
.
Proof. These results follow fairly directly from the denitions and so the
proof is left to the reader. (The reader should denitely provide a proof for
herself.)
4
Complex Numbers
Denition 4.1 (Complex Numbers). Let C = 1
2
equipped with multiplica-
tion rule
(a, b)(c, d) (ac bd, bc +ad) (4.1)
and the usual rule for vector addition. As is standard we will write 0 = (0, 0) ,
1 = (1, 0) and i = (0, 1) so that every element z of C may be written as z =
x1 +yi which in the future will be written simply as z = x +iy. If z = x +iy,
let Re z = x and Imz = y.
Writing z = a + ib and w = c + id, the multiplication rule in Eq. (4.1)
becomes
(a +ib)(c +id) (ac bd) +i(bc +ad) (4.2)
and in particular 1
2
= 1 and i
2
= 1.
Proposition 4.2. The complex numbers C with the above multiplication rule
satises the usual denitions of a eld see Denition 2.1. For example z
1
z
2
=
z
2
z
1
, and z (w
1
+w
2
) = zw
1
+zw
2
, etc. Moreover if z = a +ib ,= 0, then z has
a multiplicative inverse given by
z
1
=
a
a
2
+b
2
i
b
a
2
+b
2
. (4.3)
Moreover C contains 1 as sub-eld under the identication
1 a a1 + 0i = (a, 0) C.
Proof. Suppose z = a+ib ,= 0, we wish to nd w = c +id such that zw = 1
and this happens by Eq. (4.2) i
ac bd = 1 and (4.4)
bc +ad = 0. (4.5)
Solving these equations as follows
a(4.4) +b (4.5) =
_
a
2
+b
2
_
c = a = Re w = c =
a
a
2
+b
2
b(4.4) +a (4.5) =
_
a
2
+b
2
_
d = b = Imw = d =
b
a
2
+b
2
,
gives implies the result in Eq. (4.3).
Probably the most painful thing to check directly is the associative law,
namely that [z
1
z
2
] z
3
= z
1
[z
2
z
3
] for all z
1
, z
2
, z
3
C. This is equivalent to
showing for all a, b, u, v, x, y 1 that
[(a +ib) (u +iv)] (x +iy) = (a +ib) [(u +iv) (x +iy)] .
We do this by working out both sides as follows;
LHS = [(au bv) +i (av +bu)] (x +iy)
= (au bv) x (av +bu) y +i [(av +bu) x + (au bv) y] ;
RHS = (a +ib) [(ux vy) +i (uy +vx)]
= a (ux vy) b (uy +vx) +i [b (ux vy) +a (uy +vx)] .
The reader should now easily see that both of these expressions are in fact
equal. The remaining axioms of a eld are checked similarly.
Test 1 took place of lecture 10, 10/22/2012.
Notation 4.3 We will write 1/z for z
1
and w/z to mean z
1
w.
Notation 4.4 (Conjugation and Modulous) If z = a +ib with a, b 1 let
z = a ib and
[z[
2
z z = a
2
+b
2
.
Notice that
Re z =
1
2
(z + z) and Imz =
1
2i
(z z) . (4.6)
Proposition 4.5. Complex conjugation and the modulus operators satisfy;
1. z = z,
2. zw = z w and z + w = z +w.
3. [ z[ = [z[
32 4 Complex Numbers
4. [zw[ = [z[ [w[ and in particular [z
n
[ = [z[
n
for all n N.
5. [Re z[ [z[ and [Imz[ [z[
6. [z +w[ [z[ +[w[ .
7. z = 0 i [z[ = 0.
8. If z ,= 0 then
z
1
:=
z
[z[
2
(also written as
1
z
) is the inverse of z.
9.
z
1
= [z[
1
and more generally [z
n
[ = [z[
n
for all n Z.
Proof. 1. and 3. are geometrically obvious as well as easily veried.
2. Say z = a +ib and w = c +id, then z w is the same as zw with b replaced
by b and d replaced by d, and looking at Eq. (4.2) we see that
z w = (ac bd) i(bc +ad) = zw.
4. [zw[
2
= zw z w = z zw w = [z[
2
[w[
2
as real numbers and hence [zw[ =
[z[ [w[ .
5. Geometrically obvious or also follows from
[z[ =
_
[Re z[
2
+[Imz[
2
.
6. This is the triangle inequality which may be understood geometrically or
by the computation
[z +w[
2
= (z +w) (z +w) = [z[
2
+[w[
2
+w z + wz
= [z[
2
+[w[
2
+w z +w z
= [z[
2
+[w[
2
+ 2 Re (w z) [z[
2
+[w[
2
+ 2 [z[ [w[
= ([z[ +[w[)
2
.
7. Obvious.
8. Follows from Eq. (4.3). Alternatively if = + i0 > 0 is a real number
then
1
=
1
+i0 as is easily veried since 1 is a sub-eld of C. Thus since
zz = [z[
2
we nd
1
[z[
2
zz =
1
[z[
2
[z[
2
= 1 = z
1
=
1
[z[
2
z =
Re z
[z[
2
i
Imz
[z[
2
.
9.
z
1
z
]z]
2
1
]z]
2
[ z[ =
1
]z]
.
Corollary 4.6. If w, z C, then
[[z[ [w[[ [z w[ .
Proof. Just copy the proof of Lemma 1.6.
Lemma 4.7. For complex numbers u, v, w, z C with v ,= 0 ,= z, we have
1
u
1
v
=
1
uv
, i.e. u
1
v
1
= (uv)
1
u
v
w
z
=
uw
vz
and
u
v
+
w
z
=
uz +vw
vz
.
[These statements hold in any eld.]
4.1 A Matrix Perspective (Optional) 33
Proof. For the rst item, it suces to check that
(uv)
_
u
1
v
1
_
= u
1
uvv
1
= 1 1 = 1.
The rest follow using
u
v
w
z
= uv
1
wz
1
= uwv
1
z
1
= uw(vz)
1
=
uw
vz
.
u
v
+
w
z
=
z
z
u
v
+
v
v
w
z
=
zu
zv
+
vw
vz
= (vz)
1
(zu +vw) =
uz +vw
vz
.
Denition 4.8. A sequence z
n
n=1
C is Cauchy if [z
n
z
m
[ 0 as
m, n and is convergent to z C if [z z
n
[ 0 as n . As usual if
z
n
n=1
converges to z we will write z
n
z as n or z = lim
n
z
n
.
Theorem 4.9. The complex numbers are complete, i.e. all Cauchy sequences
are convergent.
Proof. This follows from the completeness of real numbers and the easily
proved observations that if z
n
= a
n
+ib
n
C, then
1. z
n
n=1
C is Cauchy i a
n
n=1
1 and b
n
n=1
1 are Cauchy and
2. z
n
z = a +ib as n i a
n
a and b
n
b as n .
The complex numbers satisfy all the same limit theorems as the real num-
bers.
Theorem 4.10. If w
n
n=1
and z
n
n=1
are convergent sequences of complex
numbers, then
1. w
n
n=1
and z
n
n=1
are Cauchy sequences.
2. lim
n
(w
n
+z
n
) = lim
n
w
n
+ lim
n
z
n
.
3. lim
n
(w
n
z
n
) = lim
n
w
n
lim
n
z
n
.
4. if we further assume that lim
n
z
n
,= 0, then
lim
n
_
w
n
z
n
_
=
lim
n
w
n
lim
n
z
n
.
Lemma 4.11 (BolzanoWeierstrass property). Every bounded sequence,
z
n
n=1
C, has a convergent subsequence.
Proof. By assumption there exists M < such that [z
n
[ M for all n N.
Writing z
n
= a
n
+ ib
n
with a
n
, b
n
1 we may conclude that [a
n
[ , [b
n
[ M.
According to Corollary 3.34 there exists an increasing function N k n
k
N
such that lim
k
a
nk
=: A exists. Similarly, we can apply Corollary 3.34 again
to nd an an increasing function N l k
l
N such that lim
l
b
nk
l
=: B
exists. We now let w
l
:= z
nk
l
for l N. Then w
l
l=1
is a subsequence of
z
n
n=1
which is convergent to A+iB C. Indeed,
[w
l
(A+iB)[ =
a
nk
l
A+i
_
b
nk
l
B
_
a
nk
l
A
b
nk
l
B
0 as l .
Notation 4.12 (Euclidean Spaces) Let C
n
:= a := (a
1
, . . . , a
n
) : a
i
C
and 1
n
:= a := (a
1
, . . . , a
n
) : a
i
1 C
n
. For a, b C
n
we let a =
( a
1
, . . . , a
n
) ,
a b := a
1
b
1
+ +a
n
b
n
=
n
i=1
a
i
b
i
, and
|a| = |a|
2
=
a a =
_
n
i=1
[a
i
[
2
. (4.7)
Exercise 4.1. If a, b C
n
and C, then a b = b a,
(a) b = a (b) = (a b) and |a| = [[ |a| = [[ | a| .
If we further assume that c C
n
, then (a +b) c = a c +b c.
4.1 A Matrix Perspective (Optional)
Here is a way to understand some of the basic properties of C using your
knowledge of linear algebra. Let M
z
: C C denote multiplication by z = a+ib.
We now identify C with 1
2
by
C c +id

=
_
c
d
_
1
2
.
Using this identication, the product formula
zw = (ac bd) +i (bc +ad) ,
becomes
M
z
w =
_
ac bd
bc +ad
_
=
_
a b
b a
__
c
d
_
so that
M
z
=
_
a b
b a
_
= aI +bJ
where
J :=
_
0 1
1 0
_
.
We now have the following simple observations;
1. J
2
= I and J
tr
= J,
2. M
z
M
w
= M
w
M
z
because J and I commute,
3. we have
M
z
M
w
= (aI +bJ) (cI +dJ) = (ac bd) I + (ad +bc) J = M
zw
,
4. the associativity of complex multiplication follows from the associativity
properties of matrix multiplication,
5. M
tr
z
= aI bJ = M
z
and in particular
6. M
zw
= (M
z
M
w
)
tr
= M
tr
w
M
tr
z
= M
w
M
z
= M
w z
,
7. M
tr
z
M
z
= M
zz
= M
]z]
2 = det (M
z
) ,
8. [wz[ = det (M
wz
) = det (M
w
M
z
) = det (M
w
) det (M
z
) = [w[ [z[ ,
9. M
z
is invertible i det (M
z
) ,= 0 which happens i [z[
2
,= 0 and in this case
we know from basic linear algebra that
M
1
z
=
1
a
2
+b
2
_
a b
b a
_
=
1
[z[
2
M
tr
z
= M 1
|z|
2
z
,
10. With this notation we have M
z
M
w
= M
zw
and since I and J commute it
follows that zw = wz. Moreover, since matrix multiplication is associative
so is complex multiplication. Also notice that M
z
is invertible i det M
z
=
a
2
+b
2
= [z[
2
,= 0 in which case
M
1
z
=
1
[z[
2
_
a b
b a
_
= M
z/]z]
2
as we have already seen above.
5
Set Operations, Functions, and Counting
Let N denote the positive integers, N
0
:= N0 be the non-negative inte-
gers and Z = N
0
(N) the positive and negative integers including 0, the
rational numbers, 1 the real numbers, and C the complex numbers. We will
also use F to stand for either of the elds 1 or C.
5.1 Set Operations and Functions
Notation 5.1 Given two sets X and Y, let Y
X
denote the collection of all
functions f : X Y. If X = N, we will say that f Y
N
is a sequence
with values in Y and often write f
n
for f (n) and express f as f
n
n=1
. If
X = 1, 2, . . . , N, we will write Y
N
in place of Y
1,2,...,N]
and denote f Y
N
by f = (f
1
, f
2
, . . . , f
N
) where f
n
= f(n).
Notation 5.2 More generally if X
: A is a collection of non-empty sets,

let X
A
=
A
X
and
: X
A
X
be the canonical projection map dened

by
(x) = x
. If If X
= X for some xed space X, then we will write
A
X
as X
A
rather than X
A
.
Recall that an element x X
A
is a choice function, i.e. an assignment
x
:= x() X
for each A. The axiom of choice states that X

A
,=
provided that X
,= for each A.
Notation 5.3 Given a set X, let 2
X
denote the power set of X the collection
of all subsets of X including the empty set.
The reason for writing the power set of X as 2
X
is that if we think of 2
meaning 0, 1 , then an element of a 2
X
= 0, 1
X
is completely determined
by the set
A := x X : a (x) = 1 X.
In this way elements in 0, 1
X
are in one to one correspondence with subsets
of X.
For A 2
X
let
A
c
:= X A = x X : x / A
and more generally if A, B X let
B A := x B : x / A = A B
c
.
We also dene the symmetric dierence of A and B by
AB := (B A) (A B) .
As usual if A
I
is an indexed collection of subsets of X we dene the union
and the intersection of this collection by
I
A
:= x X : I x A
and
I
A
:= x X : x A
I .
Example 5.4. Let A, B, and C be subsets of X. Then
A (B C) = [A B] [A C] .
Indeed, x A(B C) x A and x B C x A and x B or
x A and x C x A B or x A C x [A B] [A C] .
Notation 5.5 We will also write
I
A
for
I
A
in the case that

A
I
are pairwise disjoint, i.e. A
= if ,= .
Notice that is closely related to and is closely related to . For example
let A
n
n=1
be a sequence of subsets from X and dene
A
n
i.o. := x X : #n : x A
n
= and
A
n
a.a. := x X : x A
n
for all n suciently large.
(One should read A
n
i.o. as A
n
innitely often and A
n
a.a. as A
n
almost
always.) Then x A
n
i.o. i
N N n N x A
n
and this may be expressed as
A
n
i.o. =
N=1

nN
A
n
.
Similarly, x A
n
a.a. i
36 5 Set Operations, Functions, and Counting
N N n N, x A
n
which may be written as
A
n
a.a. =
N=1

nN
A
n
.
We end this section with some notation which will be used frequently in the
sequel.
Denition 5.6. If f : X Y is a function and B Y, then
f
1
(B) := x X : f (x) B .
If A X we also write,
f (A) := f (x) : x A Y.
Example 5.7. If f : X Y is a function and B Y, then f
1
(B
c
) =
_
f
1
(B)
c
or to be more precise,
f
1
(Y B) = X f
1
(B) .
To prove this notice that
x f
1
(B
c
) f (x) B
c
f (x) / B x / f
1
(B) x
_
f
1
(B)
c
.
On the other hand, if A X it is not necessarily true that f (A
c
) = [f (A)]
c
.
For example, suppose that f : 1, 2 1, 2 is the dened by f (1) = f (2) = 1
and A = 1 . Then f (A) = f (A
c
) = 1 where [f (A)]
c
= 1
c
= 2 .
Notation 5.8 If f : X Y is a function and c 2
Y
let
f
1
c := f
1
(c) := f
1
(E)[E c.
If ( 2
X
, let
f
( := A 2
Y
[f
1
(A) (.
Denition 5.9. Let c 2
X
be a collection of sets, A X, i
A
: A X be the
inclusion map (i
A
(x) = x for all x A) and
c
A
= i
1
A
(c) = A E : E c .
5.1.1 Exercises
Let f : X Y be a function and A
i
iI
be an indexed family of subsets of Y,
verify the following assertions.
Exercise 5.1. (
iI
A
i
)
c
=
iI
A
c
i
.
Exercise 5.2. Suppose that B Y, show that B (
iI
A
i
) =
iI
(B A
i
).
Exercise 5.3. f
1
(
iI
A
i
) =
iI
f
1
(A
i
).
Exercise 5.4. f
1
(
iI
A
i
) =
iI
f
1
(A
i
).
Exercise 5.5. Find a counterexample which shows that f(C D) = f(C)
f(D) need not hold.
5.2 Cardinality
In this section, X and Y be sets.
Denition 5.10 (Cardinality). We say card (X) card (Y ) if there exists an
injective map, f : X Y and card (Y ) card (X) if there exists a surjective
map g : Y X. We say card (X) = card (Y ) (also denoted as X Y ) if there
exists a bijections, f : X Y.
5.3 Finite Sets 37
Proposition 5.11. If X and Y are sets, then card (X) card (Y ) i
card (Y ) card (X) .
Proof. If f : X Y is an injective map, dene g : Y X by g[
f(X)
= f
1
and g[
Y \f(X)
= x
0
X chosen arbitrarily. Then g : Y X is surjective.
If g : Y X is a surjective map, then Y
x
:= g
1
(x) ,= for all x X
and so by the axiom of choice there exists f
xX
Y
x
. Thus f : X Y such
that f (x) Y
x
for all x. As the Y
x
xX
are pairwise disjoint, it follows that
f is injective.
Theorem 5.12 (Schroder-Bernstein Theorem). If X and Y are sets then
either card (X) card (Y ) or card (Y ) card (X) . Moreover, if card (X)
card (Y ) and card (Y ) card (X) , then card (X) = card (Y ) . [Stated more
explicitly; if there exists injective maps f : X Y and g : Y X, then there
exists a bijective map, h : X Y.]
Proof. These results are proved in the appendices. For the rst assertion
see B.8 and for the second see Theorem B.11.
Exercise 5.6. If X = X
1
X
2
with X
1
X
2
= , Y = Y
1
Y
2
with Y
1
Y
2
= ,
and X
i
Y
i
for i = 1, 2, then X Y. This exercise generalizes to an arbitrary
number of factors.
5.3 Finite Sets
Notation 5.13 (Integer Intervals) For n N we let
J
n
:= 1, 2, . . . , n := k N : k n .
Denition 5.14. We say a non-empty set, X, is nite if card (X) = card (J
n
)
for some n N. We will also write #(X) = n
1
to indicate that card (X) =
card (J
n
) . [It is shown in Theorem 5.17 below that #(X) is well dened, i.e.
it is not possible for card (X) = card (J
n
) and card (X) = card (J
m
) unless
m = n.]
Lemma 5.15. Suppose n N and k J
n+1
, then card (J
n+1
k) =
card (J
n
) .
Proof. Let f : J
n
J
n+1
k be dened by
f (x) =
_
x if x < k
x + 1 if x k
1
You should read #(X) = n, as X is a set with n elements.
Then f is the desired bijection.
2
Alternatively. If n = 1, then J
2
= 1, 2 and either J
2
k = J
1
or
J
2
k = 2 , either way card (J
2
k) = card (J
1
) . Now suppose that result
holds for a given n N and k J
n+2
. If k = n + 2 we have J
n+2
k = J
n+1
so card (J
n+2
k) = card (J
n+1
) while if k J
n+1
J
n+2
, then J
n+2
k =
(J
n+1
k) n + 2 J
n
n + 2 J
n
n + 1 = J
n+1
.
Lemma 5.16. If m, n N with n > m, then every map, f : J
n
J
m
, is not
injective.
Proof. If f : J
n
J
m
were injective, then f[
Jm+1
: J
m+1
J
m
would
be injective as well. Therefore it suces to show there is no injective map,
f : J
m+1
J
m
for all m N. We prove this last assertion by induction on m.
The case m = 1 is trivial as J
1
= 1 so the only function, f : J
2
J
1
is the
function, f (1) = 1 = f (2) which is not injective.
Now suppose m 1 and there were an injective map, f : J
m+2
J
m+1
.
Letting k := f (m+ 2) we would have, f[
Jm+1
: J
m+1
J
m+1
k J
m
, which
would produce and injective map from J
m+1
to J
m
. However this contradicts
the induction hypothesis and thus completes the proof.
Theorem 5.17. If m, n N, then card (J
m
) card (J
n
) i m n. Moreover,
card (J
n
) = card (J
m
) i m = n and hence card (J
m
) < card (J
n
) i m < n.
Proof. As J
m
J
n
if m n and J
m
= J
n
if m = n, it is only the
forward implications that have any real content. If card (J
m
) card (J
n
) , there
exists an injective map, g : J
m
J
n
. According to Lemma 5.16 this can only
happen if m n. If card (J
n
) = card (J
m
) , then card (J
n
) card (J
m
) and
card (J
m
) card (J
n
) and hence m n and n m, i.e. m = n.
Proposition 5.18. If X is a nite set with #(X) = n and S is a non-empty
subset of X, then S is a nite set and #(S) = k n. Moreover if #(S) = n,
then S = X.
Proof. It suces to assume that X = J
n
and S J
n
. We now give two
proofs of the result.
Proof 1. Let S
1
= S and f (1) := min S 1. If S
2
:= S
1
f (1) is not
empty, let f (2) := min S
2
2. We then continue this construction inductively.
So if f (k) = min S
k
k has been constructed, then we dene S
k+1
:= S
k

f (k) . If S
k+1
,= we dene f (k + 1) := min S
k+1
k + 1. As f (k) k
2
The inverse map, g : Jn+1 \ {k} Jn, to f is given by
g (y) =
_
y if y < k
y 1 if y > k.
for all k that f is dened, this process has to stop after at most n steps. Say
it stops at k so that S
k+1
= . Then f : J
k
S is a bijection and therefore S
is nite and #(S) = k n. Moreover, the only way that k = n is if f (k) = k
at each step of the construction so that f : J
n
S is the identity map in this
case, i.e. S = J
n
.
Proof 2. We prove this by induction on n. When n = 1 the only no-empty
subset of S of J
1
is J
1
itself. Thus #(S) = 1 and S = J
1
. Now suppose that the
result hold for some n N and let S J
n+1
. If n +1 / S, then S J
n
and by
the induction hypothesis we know #(S) = k n < n+1. So now suppose that
n + 1 S and let S
t
:= S n + 1 J
n
. Then by the induction hypothesis,
S
t
is a nite set and #(S
t
) = k n, i.e. there exists a bijection, f
t
: J
k
S
t
and S
t
= J
n
is k = n. Therefore f : J
k+1
S given by f = f
t
on J
k
and
f (k + 1) = n+1 is a bijections from J
k+1
to S. Therefore #(S) = k+1 n+1
with equality i S
t
= J
n
which happens i S = J
n+1
.
Proposition 5.19. If f : J
n
J
n
is a map, then the following are equivalent,
1. f is injective,
2. f is surjective,
3. f is bijective.
Proof. If n = 1, the only map f : J
1
J
1
is f (1) = 1. So in this case
there is nothing to prove. So now suppose the proposition holds for level n and
f : J
n+1
J
n+1
is a given map.
If f : J
n+1
J
n+1
is an injective map and f (J
n+1
) is a proper subset of
J
n+1
, then card (J
n+1
) < card (f (J
n+1
)) = card (J
n+1
) which is absurd. Thus
f is injective implies f is surjective.
Conversely suppose that f : J
n+1
J
n+1
is surjective. Let g : J
n+1
J
n+1
be a right inverse, i.e. f g = id, which is necessarily injective, see the proof
of Proposition 5.11. By the pervious paragraph we know that g is necessarily
surjective and therefore f = g
1
is a bijection.
Theorem 5.20. A subset S N is nite i S is bounded. Moreover if #(S) =
n N then the sup (S) n with equality i S = J
n
.
Proof. If S is bounded then S J
n
for some n N and hence S is a nite
set by Proposition 5.18. Also observe that if #(S) = n = sup (S) , then S J
n
and #(S) = n = #(J
n
) . Thus it follows from Proposition 5.18 that S = J
n
.
Conversely suppose that S N is a nite set and let n = #(S) . We will
now complete the proof by induction. If n = 1 we have S J
1
and therefore
S = k for some k N. In particular sup S = k 1 with equality i S = J
1
.
Suppose the truth of the statement for some n N and let S N be a set
with #(S) = n + 1. If we choose a point, k S, we have by Lemma 5.15 that
#(S k) = n. Hence by the induction hypothesis, sup (S k) n with
equality i S k = J
n
. If sup (S k) > n then sup (S) sup (S k)
n + 1 as desired. If sup (S k) = n then S k = J
n
therefore S k > n.
Hence it follows that sup (S) = k n + 1.
Corollary 5.21. Suppose S is a non-empty subset of N. Then S is an un-
bounded subset of N i card (J
n
) card (S) for all n N.
Proof. If S is bounded we know card (S) = card (J
k
) for some k N which
would violate the hypothesis that card (J
n
) card (S) for all n N. Conversely
if card (S) card (J
n
) for some n N, then there exists and injective map,
f : S J
n
. Therefore card (S) = card (f (S)) = card (J
k
) for some k n. So
S is nite and hence bounded in N by Theorem 5.20.
Exercise 5.7. Suppose that m, n N, show J
m+n
= J
m
(m+J
n
) and
(m+J
n
) J
m
= . Use this to conclude if X is a disjoint union of two non-
empty nite sets, X
1
and X
2
, then #(X) = #(X
1
) + #(X
2
) .
Exercise 5.8. Suppose that m, n N, show J
m
J
n
J
mn
. Use this to con-
clude if X and Y are two non-empty sets, then #(X Y ) = #(X) #(Y ) .
5.4 Countable and Uncountable Sets
Denition 5.22 (Countability). A set X is said to be countable if X =
or if there exists a surjective map, f : N X. Otherwise X is said to be
uncountable.
Remark 5.23. From Proposition 5.11, it follows that X is countable i there
exists an injective map, g : X N. This may be succinctly stated as; X is
countable i card (X) card (N) . From a practical point of view as set X is
countable i the elements of X may be arranged into a linear list,
X = x
1
, x
2
, x
3
, . . . .
Example 5.24. The integers, Z, are countable. In fact N Z, for example dene
f : N Z by
(f (1) , f (2) , f (3) , f (4) , f (5) , f (6) , f (7) , . . . ) := (0, 1, 1, 2, 2, 3, 3, . . . ) .
Remark 5.25 (Countability in a nutshell). If f : N X is surjective, then
g (x) := min f
1
(x) denes an injective map, g : X N. If g : X N is
injective, then f (n) := g
1
(n) for n g (X) =: S and f (n) = x
0
X for
n / S denes a surjective map, f : N X. Moreover, if S is a subset of N we
may list its elements in increasing order so that either
5.4 Countable and Uncountable Sets 39
S = n
1
< n
2
< < n
k
for some k N or
S = n
1
< n
2
< < n
k
< . . . .
In the rst case card (X) = card (J
k
) while in the second card (S) = card (N) .
[Dene f (j) := n
j
to set up the bijections between J
k
or N and S.]
The above arguments demonstrate that the following statements are equiv-
alent;
1. X is countable, i.e. there exists a surjective map f : N X.
2. card (X) card (N) , i.e. there exists an injective map, g : X N.
3. There exists S N such that card (X) = card (S) . Furthermore card (S) =
card (J
k
) for some k N i S is bounded and card (S) = card (N) i S is
unbounded.
4. Either card (X) = card (N) for card (X) = card (J
k
) for some k N.
Formal proofs of these observations are given above and below.
Lemma 5.26. If S N is an unbounded set, then card (S) = card (N) .
Proof. The main idea is that any subset, S N, may be given as an nite
or innite list written in increasing order, i.e.
S = n
1
, n
2
, n
3
, . . . with n
1
< n
2
< n
3
< . . . .
If the list is nite, say S = n
1
, . . . , n
k
, then n
k
is an upper bound for S. So S
will be unbounded i only if the list is innite in which case f : N S dened
by f (k) = n
k
denes a bijection.
Formal proof. Dene f : N S via, let
S
1
:= S and f (1) := min S
1
,
S
2
:= S
1
f (1) and f (2) := min S
2
,
S
3
:= S
2
f (2) and f (3) := min S
3
.
.
.
In more detail, let T denote those n N such that there exists f : J
n
S
and S
k
S
n
k=1
satisfying, S
1
= S, f (k) = min S
k
and S
k+1
= S
k
f (k)
for 1 k < n. If n T, we may dene S
n+1
:= S
n
f (n) and f (n + 1) :=
min S
n+1
in order to show n + 1 T. Thus T = N and we have constructed an
injective map, f : N S. Moreover
kN
S
k
N J
n
for all n and therefore
kN
S
k
= . Thus it follows that f is a bijection.
The following theorem summarizes most of what we need to know about
counting and countability.
Theorem 5.27. The following properties hold;
1. NN is countable and in fact NN N, i.e. there exists a bijective map,
h, from N to N N.
2. If X and Y are countable, then X Y is countable.
3. If X
n
nN
are countable sets then X :=
n=1
X
n
is a countable set.
4. If X is countable, then either there exists n N such that X J
n
or
X N.
5. If S N and S J
n
for some n N then S is bounded.
6. If X is a set and card J
n
card X for all n N then card N card X.
7. If A X is a subset of a countable set X then A is countable.
1. Put the elements of N N into an array of the form
_
_
_
_
_
(1, 1) (1, 2) (1, 3) . . .
(2, 1) (2, 2) (2, 3) . . .
(3, 1) (3, 2) (3, 3) . . .
.
.
.
.
.
.
.
.
.
.
.
.
_
_
_
_
_
and then count these elements by counting the sets (i, j) : i +j = k
one at a time. For example let h(1) = (1, 1) , h(2) = (2, 1), h(3) = (1, 2),
h(4) = (3, 1), h(5) = (2, 2), h(6) = (1, 3) and so on. In other words we put
N N into the following list form,
N N =(1, 1) , (2, 1) , (1, 2) , (3, 1) , (2, 2) , (1, 3) , (4, 1) , . . . , (1, 4) , . . . .
2. If f : N X and g : N Y are surjective functions, then the function
(f g) h : N X Y is surjective where (f g) (m, n) := (f (m), g(n))
for all (m, n) N N.
3. By assumption there exists surjective maps, f
n
: N X
n
, for each n N.
Let h(n) := (a (n) , b (n)) be the bijection constructed for item 1. Then
f : N X dened by f (n) := f
a(n)
(b (n)) is a surjective map.
4. To see this let f : N X be a surjective map and let g (x) := min f
1
(x)
for all x X. Then g : X N is an injective map. Let S := g (X) , then
g : X S N is a bijection. So it remains to show S N or S J
n
for some n N. If S is unbounded, then S N as we have already seen.
So it suces to consider the case where S is bounded. If S is bounded by
1 then S = 1 = J
1
and we are done. Now assume the result is true if
S is bounded by n N and now suppose that S is bounded by n + 1. If
n + 1 / S, then S is bounded by n and so by induction, S J
k
for some
k n < n + 1. If n + 1 S, then from above, S n + 1 J
k
for some
k n, i.e. there exists a bijection, f : J
k
S n + 1 . We then extend
f to J
k+1
by setting f (k + 1) := n + 1 which shows J
k+1
S.
5. We again prove this by induction on n. If n = 1, then S = m for some
m N which is bounded. So suppose for some n N, every subset S N
with S J
n
is bounded. Now suppose that S N with S J
n+1
. Then
f (J
n
) J
n
and hence f (J
n
) is bounded in N. Then max f (J
n
)f (n + 1)
is an upper bound for S. This completes the inductive argument.
6. For each n N there exists and injection, f
n
: J
n
X. By replacing X
by X
0
:=
nN
f
n
(J
n
) we may assume that X =
nN
f
n
(J
n
) . Thus there
exists a surjective map, f : N X by item 3. Let g : X N be dened
by g (x) := min f
1
(x) for all x X and let S := g (X) . To nish the
proof we need only show that S is unbounded. If S were bounded, then
we would nd k N such that J
k
S X. However this is impossible
since card J
n
card X = card J
k
would imply n k even though n can be
chosen arbitrarily in N.
7. If g : X N is an injective map then g[
A
: A N is an injective map and
therefore A is countable.
Lemma 5.28. If X is a countable set which contains Y X with Y N, then
X N.
Proof. By assumption there is an injective map, g : X N and a bijective
map, f : N Y. It then follows that g f : N N is injective from which it
follows that g (X) is unbounded. [Indeed, (g f) (J
n
) g (X) for all n implies
card (J
n
) card (g (X)) for all n which implies g (X) is unbounded by Corollary
5.21.] Therefore X g (X) N by Lemma 5.26.
Corollary 5.29. We have card () = card (N) and in fact, for any a < b in 1,
card ((a, b)) = card (N) .
Proof. First o is a countable since may be expressed as a countable
union of countable sets;
=
_
mN
_
n
m
: n Z
_
.
From this it follows that (a, b) is countable for all a < b in 1. As these sets
are not nite, they must have the cardinality of N.
Theorem 5.30 (Uncountabilty results). If X is an innite set and Y is
a set with at least two elements, then Y
X
is uncountable. In particular 2
X
is
uncountable for any innite set X.
Proof. Let us begin by showing 2
N
= 0, 1
N
is uncountable. For sake
of contradiction suppose f : N 0, 1
N
is a surjection and write f (n) as
(f
1
(n) , f
2
(n) , f
3
(n) , . . . ) . Now dene a 0, 1
N
by a
n
:= 1 f
n
(n). By
construction f
n
(n) ,= a
n
for all n and so a / f (N) . This contradicts the
assumption that f is surjective and shows 2
N
is uncountable. For the general
case, since Y
X
0
Y
X
for any subset Y
0
Y, if Y
X
0
is uncountable then so
is Y
X
. In this way we may assume Y
0
is a two point set which may as well
be Y
0
= 0, 1 . Moreover, since X is an innite set we may nd an injective
map x : N X and use this to set up an injection, i : 2
N
2
X
by setting
i (A) := x
n
: n N X for all A N. If 2
X
were countable we could
nd a surjective map f : 2
X
N in which case f i : 2
N
N would be
surjective as well. However this is impossible since we have already seen that
2
N
is uncountable.
Corollary 5.31. The set (0, 1) := a 1 : 0 < a < 1 is uncountable while
(0, 1) is countable. More generally, for any a card (N) .
Proof. From Section 3.4, the set 0, 1, 2 . . . , 8
N
can be mapped injec-
tively into (0, 1) and therefore it follows from Theorem 5.30 that (0, 1) is
uncountable. For each m N, let A
m
:=
_
n
m
: n N with n < m
_
. Since
(0, 1) =
m=1
X
m
and #(X
m
) < for all m, another application of The-
orem 5.27 shows (0, 1) is countable.
The fact that these results hold for any other nite interval follows from the
fact that f : (0, 1) (a, b) dened by f (t) := a +t (b a) is a bijection.
Denition 5.32. We say a non-empty set X is innite if X is not a nite
set.
Example 5.33. Any unbounded subset, S N, is an innite set according to
Theorem 5.20.
Theorem 5.34. Let X be a non-empty set. The following are equivalent;
1. X is an innite set,
2. card (J
n
) card (X) for all n N,
3. card (N) card (X) ,
4. card (X x) = card (X) for some (or all) x X.
Proof. 1. = 2. Suppose that X is an innite set. We show by induction
that card (J
n
) card (X) for all n N. Since X is not empty, there exists
x X and we may dene f : J
1
X by f (1) = x in order to learn card (J
1
)
card (X) . Suppose we have shown card (J
n
) card (X) for some n N, i.e.
there exists and injective map, f : J
n
X. If f (J
n
) = X it would follows
that card (X) = card (J
n
) and would violated the assumption that X is not a
nite set. Thus there exists x X f (J
n
) and we may dene f
t
: J
n+1
X
5.4 Countable and Uncountable Sets 41
by f
t
[
Jn
= f and f
t
(n + 1) = x. Then f
t
: J
n+1
X is injective and hence
card (J
n+1
) card (X) .
2. 3. This is the content of Theorem B.19.
3. = 4. Let x
1
X and f : N X be an injective map such that
f (1) = x
1
. We now dene a bijections, : X X x
1
by
(x) =
_
x if x / f (N)
f (i + 1) if x = f (i) f (N) .
4. = 1. We will prove the contrapositive. If X is a nite and x X, we
have seen that card (X x) < card (X) , namely #(X x) = #(X) 1.
The next two theorems summarizes the properties of cardinalities that have
been proven above.
Theorem 5.35 (Cardinality/Counting Summary I). Given a non-empty
set X, then one and only one of the following statements holds;
1. There exists a unique n N such that card X = card J
n
.
2. card X = card N,
3. card X > card N.
Cases 2. or 3. hold i card J
n
card X for all n N which happens i
card N card X.
If X satises case 1. we say X is a nite set. If X satises case 2 we say
X is a countably innite set and if X satises case 3. we say X is an
uncountably innite set.
Theorem 5.36 (Cardinality/Counting Summary II). Let X and Y be sets
and S be a subset of N.
1. If S N is an unbounded set, then card (S) = card (N) .
2. If S N is a bounded set then card (S) = card (J
n
) for some n N.
3. If X
k
k=1
are subsets of X such that card (X
k
) card (N) , then
card (
k=1
X
k
) card (N) .
4. If X and Y are sets such that card (X) card (N) and card (Y ) card (N) ,
then card (X Y ) card (N) .
5. card () = card (N) = card (Z) .
6. For any a 
card (N) .
5.4.1 Exercises
n
is countable for all n N.
Exercise 5.10. Let [t] be the set of polynomial functions, p, such that p has
rationale coecients. That is p [t] i there exists n N
0
and a
k
for
0 k n such that
p (t) =
n
k=0
a
k
t
k
for all t 1. (5.1)
Show [t] is a countable set.
Denition 5.37 (Algebraic Numbers). A real number, x 1, is called al-
gebraic number, if there is a non-zero polynomial p [t] such that p (x) = 0.
[That is to say, x 1 is algebraic if it is the root of a non-zero polynomial with
coecients from .]
Note that for all q , p (t) := t q satises p (q) = 0. Hence all rational
numbers are algebraic. But there are many more algebraic numbers, for example
y
1/n
is algebraic for all y 0 and n N.
Exercise 5.11. Show that the set of algebraic numbers is countable. [Hint:
any polynomial of degree n has at most n real roots.] In particular, most
irrational numbers are not algebraic numbers, i.e. there is still an uncountable
number of non-algebraic numbers..
Part II
Normed and Metric Spaces
6
Metric Spaces
Denition 6.1. A function d : X X [0, ) is called a metric if
1. (Symmetry) d(x, y) = d(y, x) for all x, y X,
2. (Non-degenerate) d(x, y) = 0 if and only if x = y X, and
3. (Triangle inequality) d(x, z) d(x, y) +d(y, z) for all x, y, z X.
Example 6.2. Here are a few immediate examples of metric spaces;
1. Let X be any set and then dene,
d (x, y) =
_
0 if x = y
1 if x ,= y
.
2. Let X = 1 with d (x, y) := [y x[ . Notice that
d (x, z) = [x y +y z[
[x y[ +[y z[ = d(x, y) +d(y, z).
3. Let X be any subset of C and dene d (w, z) := [z w[ .
In general our typical example of a metric space will often be a generalization
of the last example above, see Example 6.12.
6.1 Normed Spaces [Linear Algebra Meets Analysis]
6.1.1 Review of Vector Spaces and Subspaces
Denition 6.3 (Vector Space). A vector space is a non-empty set Z of
objects, called vectors, equipped with an addition operation + and scalar (=1
or maybe C) multiplication satisfying all of the properties above: i.e. For all
u, v, w Z and a, b 1;
1. Associativity of addition: u + (v +w) = (u +v) +w.
2. Commutativity of addition: v +w = w +v.
3. Identity element of addition: There is a element 0 Z such that
0 +v = v for all v.
4. Inverse elements of addition: v + v = 0 for all v Z. (In fact
v = (1) v.)
5. Distributivity of scalar multiplication with vector addition: a
(v +w) = a v +a w.
6. Distributivity of scalar multiplication with respect to eld addi-
tion (a +b) v = a v +b v.
7. Compatibility of scalar multiplication with the multiplication on
1 a (b v) = (ab) v.
8. Identity element of scalar multiplication 1 v = v for all v Z.
Example 6.4. Here are two fundamental examples of vector spaces.
1. 1 with usual vector addition and scalar multiplication is vector space over
1.
2. C
n
with usual vector addition and scalar multiplication is vector space over
C.
Notation 6.5 If T and X are sets, let X
T
denote the collection of functions,
f : T X.
Example 6.6 (The Main Umbrella Example). Let T be a non-empty set and let
Z := 1
T
. For f, g Z and 1 we dene f +g and f by
(f +g) (t) = f (t) +g (t) (addition in 1) for all t T
( f) (t) = f (t) (multiplication in 1) for all t T.
It can now be checked that Z is a vector space so that functions have now
become vectors! Essentially all other examples of vector spaces we give will be
related to an example of this form. The same observations show C
T
is a complex
vector space.
Example 6.7. For example; 1
3
= 1
1,2,3]
and more generally 1
n
= 1
Jn
where
J
n
:= 1, 2, . . . , n . In this setting we usually specify x 1
Jn
by listing its
values (x(1) , . . . , x(n)) . To abbreviate notation a bit more we will usually
write x(i) as x
i
so that (x(1) , . . . , x(n)) becomes (x
1
, . . . , x
n
) .
46 6 Metric Spaces
Example 6.8. The vector space of 2 2 matrices;
M
22
= A : A is a 2 2 matrix = A : (1, 1) , (1, 2) , (2, 1) , (2, 2) 1 .
This can be generalized.
Denition 6.9 (Subspace). Let Z be a vector space. A non-empty subset,
H Z, is a subspace of Z if H is closed under addition and scalar multipli-
cation. Note, if H is a subspace and v H, then 0 = 0 v H.
The vector space 1
T
and C
T
are typically the largest vector spaces we
will consider in this course.
Example 6.10. Here are three common subspaces of 1
R
;
1. H = f Z : f is continuous .
2. H = f Z : f is continuously dierentiable.
3. H = f Z : f is dierentiable at .
6.1.2 Normed Spaces
Denition 6.11. A norm on a vector space Z is a function || : Z [0, )
such that
1. (Homogeneity) |f| = [[ |f| for all F and f Z.
2. (Triangle inequality) |f +g| |f| +|g| for all f, g Z.
3. (Positive denite) |f| = 0 implies f = 0.
A pair (Z, ||) where Z is a vector space and || is a norm on Z is called a
normed vector space or normed space for short.
Example 6.12. If (Z, ||) is a normed space, then d (x, y) := |x y| is a metric
on Z and restricts to a metric on any subset of Z.
Example 6.13 (Normed Spaces). The following are normed spaces;
1. Z = 1 with |x| = [x[ .
2. Z = C with |z| = [z[ .
3. Z = C
n
with
|z|
1
:=
n
i=1
[z
i
[ for z = (z
1
, . . . , z
n
) Z.
The triangle inequality is easily veried here since,
|z +w|
1
=
n
i=1
[z
i
+w
i
[
i=1
[[z
i
[ +[w
i
[]
=
n
i=1
[z
i
[ +
n
i=1
[w
i
[ = |z|
1
+|w|
1
.
4. Let X be a set and for any function f : X C, let
|f|
u
:= sup
xX
[f (x)[ .
Then Z := f : X C : |f|
u
< is a vector space and ||
u
is a norm
on Z.
Exercise 6.1. Verify the last item of Example 6.13. That is let X be a set and
for any function f : X C, let
|f|
u
:= sup
xX
[f (x)[ .
Show Z := f : X C : |f|
u
< is a vector space and ||
u
is a norm on Z.
Our next goal is to show that ||
2
dened in Eq. (4.7) denes a norm on 1
n
and C
n
. We will begin by proving the important Cauchy-Schwarz inequality.
Lemma 6.14. If x, y 0 and > 0, then
xy
1
2
_
x
2
+
1
y
2
_
(6.1)
with equality when = y/x in the case x > 0.
Proof. Since
0
_
x
y
_
2
= x
2
+
1
y
2
2xy,
with equality i

x =
y
, i.e. i = y/x, we see that

xy
1
2
_
x
2
+
1
y
2
_
with equality i = y/x.
6.1 Normed Spaces [Linear Algebra Meets Analysis] 47
Denition 6.15. The two norm, ||
2
on C
n
is the function dened by
|z|
2
=
_
n
i=1
[z
i
[
2
for all z C
n
.
Theorem 6.16 (Cauchy-Schwarz Inequality). For a, b C
n
, [a b[ |a|
2
|b|
2
.
Proof. The inequality holds true if a = 0 so we may now assume a ,= 0.
Using Lemma 6.14 with x = [a
i
[ and y = [b
i
[ we nd for any > 0 that
[a b[
i=1
a
i
b
i
i=1
[a
i
b
i
[ =
n
i=1
[a
i
[ [b
i
[
i=1
1
2
_
[a
i
[
2
+
1
[b
i
[
2
_
=
1
2
_
|a|
2
2
+
1
|b|
2
_
.
Taking = |b|
2
/ |a|
2
then completes the proof.
Theorem 6.17 (Triangle Inequality). The function, ||
2
, on C
n
is a norm
and in particular for a, b C
n
,
|a +b|
2
|a|
2
+|b|
2
.
Proof. The main point is to prove the triangle inequality. The proof is as
follows;
|a +b|
2
2
= (a +b)
_
a +b
_
= (a +b)
_
a +
b
_
= (a +b) a + (a +b)
b = |a|
2
2
+b a +a
b +|b|
2
2
= |a|
2
2
+|b|
2
2
+ 2 Re
_
a
b
_
|a|
2
2
+|b|
2
2
+ 2
Re
_
a
b
_
|a|
2
2
+|b|
2
2
+ 2
_
a
b
_
|a|
2
2
+|b|
2
2
+ 2 |a|
2

_
_
b
_
_
2
(Theorem 6.16)
= |a|
2
2
+|b|
2
2
+ 2 |a|
2
|b|
2
= (|a|
2
+|b|
2
)
2
.
The remaining properties of a norm are easily checked. For example, if C
and a C
n
, then
|a|
2
=
_
n
i=1
[a
i
[
2
=
_
n
i=1
[[
2
[a
i
[
2
=
_
[[
2
n
i=1
[a
i
[
2
=
_
[[
2
_
n
i=1
[a
i
[
2
= [[ |a|
2
.
Fact 6.18 For 0 < p < , let ||
p
: C
n
[0, ) be dened by
|z|
p
=
p
_
n
i=1
[z
i
[
p
for all z C
n
.
Then ||
p
is a norm on C
n
for all 1 p < but is not a norm (the triangle
inequality fails) for 0 < p < 1.
Denition 6.19 (Balls). Let (X, d) be a metric space. For x X and r 0
let
B
x
(r) := y X : d (x, y) < r and
C
x
(r) := y X : d (x, y) r .
We refer to B
x
(r) and C
x
(r) as the open and closed ball respectively about
x with radius r.
Example 6.20. Let us consider ||
p
to 1
2
. Figure 6.1 shows the boundary of the
balls of radius 1 centered at 0 for some of the dierent p norms.
Fig. 6.1. Balls in 1
2
corresponding to p
_
1
2
, 1, 2, 5, 20
_
. The case p =
1
2
is not
convex like the other balls which is an indication that the triangle inequality fails.
Example 6.21. The next gures explain how to understand balls in C ([0, 1] , 1)
equipped with the uniform norm.
48 6 Metric Spaces
Fig. 6.2. The ball in C ([0, 1] , 1) of radius 1/4 centered at f (x) = sin
_
x
2
_
are all
the continuous functions whose graphs lie between the green envelope.
Exercise 6.2 (Weighted 2 norms). Suppose that
i
(0, ) for 1 i n
and for a, b C
n
let
a b :=
n
i=1
a
i
b
i
i
and |a| :=
a a =
_
n
i=1
[a
i
[
2
i
.
Show, for all a, b C
n
that [a b[ |a| |b| and that || is a norm on C
n
.
[Hint: reduce to the case where
i
= 1 for all i.]
Fig. 6.3. Here are some unit balls in 1
2
relative to some weighted 2 norms. Each
of these balls is the interior of an ellipse. It would be possible to construct norms
where these ellipses are rotated as well.
For the next two exercise you will be using some concepts from calculus
which we will developed in detail next quarter. For now, I assume you know
what the Riemann integral is for continuous functions on [0, 1] with values in
1. Let Z denote the continuous functions
1
on [0, 1] with values in 1. The only
properties that you need to know about the Riemann integral are;
1. The integral is linear, namely for all f, g Z and 1,
_
1
0
(f(t) +g (t))dt =
_
1
0
f (t) dt +
_
1
0
g (t) dt.
2. If f, g Z and f (t) g (t) for all t [0, 1] , then
_
1
0
f (t) dt
_
1
0
g (t) dt.
3. For all f Z,
_
1
0
f (t) dt
_
1
0
[f (t)[ dt.
In fact this item follows from items 1. and 2. Indeed, since f (t) [f (t)[
for all t, we nd
_
1
0
f (t) dt =
_
1
0
f (t) dt
_
1
0
[f (t)[ dt
_
1
0
f (t) dt
_
1
0
[f (t)[ dt.
1
The notion of continuity will be formally developed shortly.
6.2 Sequences in Metric Spaces 49
Exercise 6.3. Let Z denote the continuous function on [0, 1] with values in 1
and for f Z let
|f|
1
:=
_
1
0
[f (t)[ dt.
Show ||
1
satises,
1. (Homogeneity) |f| = [[ |f| for all 1 and f Z.
Remark 6.22 (An interpretation of |f|
1
.). If we interpret f (t) as the speed of
a particle on the real line at time t, then |f|
1
represents the total distance
(including retracing of its path) the particle travels over the time interval [0, 1] .
Exercise 6.4. Let Z denote the continuous function on [0, 1] with values in 1
and for f Z let
|f|
2
:=
_
1
0
[f (t)[
2
dt.
Show;
1. for f, g Z that
_
1
0
f (t) g (t) dt
|f|
2
|g|
2
,
2. Homogeneity) |f|
2
= [[ |f|
2
for all 1 and f Z, and
3. (Triangle inequality) |f +g|
2
|f|
2
+|g|
2
for all f, g Z.
Remark 6.23 (An interpretation of |f|
2
.). Let us suppose that f (t) is the
voltage across a 1 Ohm resistor. By Ohms law the current through this re-
sistor is f (t) /1 = f (t) and the power dissipated by the resistor at time t is
(VoltageCurrent) is f (t)
2
. The work done over the time interval, [0, 1] is then
_
1
0
Power (t) dt =
_
1
0
f
2
(t) dt = |f|
2
2
.
On the other hand if we had a constant voltage of |f|
2
across the resistor
over the time interval [0, 1] , the work done over this period would again be
|f|
2
2
. Thus |f|
2
is often referred to as the RMS voltage (root mean squared
voltage) and represents the equivalent DC (Direct Current, i.e. constant) voltage
necessary to produce the same amount of work over the time interval [0, 1] .
Exercise 6.5. Let || be a norm on 1
n
such that |a| |b| whenever 0
a
i
b
i
for 1 i n.
2
Further suppose that (X
i
, d
i
) for i = 1, . . . , n is a nite
2
For example we could take to be
u
,
1
, or
2
on 1
n
. Not all norms
satisfy this assumption though. For example, take (x, y) = |x y| + |y| on 1
2
,
then (x, y) is decreasing in x when 0 x < y.
collection of metric spaces and for x = (x
1
, x
2
, . . . , x
n
) and y = (y
1
, . . . , y
n
) in
X :=
n
i=1
X
i
, let
d(x, y) = |(d
1
(x
1
, y
1
) , d
2
(x
2
, y
2
) , . . . , d
n
(x
n
, y
n
))| .
Show (X, d) is a metric space.
End of Lecture 13, 10/28/2012. [We started Section 5.1 above as well this
day.]
6.2 Sequences in Metric Spaces
Denition 6.24. A sequence x
n
n=1
in a metric space (X, d) is said to be
convergent if there exists a point x X such that lim
n
d(x, x
n
) = 0. In
this case we write lim
n
x
n
= x or x
n
x as n .
Exercise 6.6. Show that x in Denition 6.24 is necessarily unique.
Denition 6.25 (Cauchy sequences). A sequence x
n
n=1
in a metric space
(X, d) is Cauchy provided that lim
m,n
d(x
n
, x
m
) = 0, i.e. for all > 0 there
exists N () N such that
d(x
n
, x
m
) when m, n N () .
Exercise 6.7. Show that convergent sequences in metric spaces are always
Cauchy sequences. The converse is not always true. For example, let X =
be the set of rational numbers and d(x, y) = [x y[. Choose a sequence
x
n
n=1
which converges to
2 1, then x
n
n=1
is (, d) Cauchy
but not (, d) convergent. Of course the sequence is convergent in 1.
Exercise 6.8. If x
n
n=1
is a Cauchy sequence in a metric space (X, d),
lim
n
d (x
n
, y) exists in 1 for all y X. In particular, d (x
n
, y)
n=1
is a
bounded sequence in 1 for all y X.
Denition 6.26. A metric space (X, d) (or normed space (X, ||)) is com-
plete if all Cauchy sequences are convergent sequences. A complete normed
space is called a Banach space.
Lemma 6.27. Let X be a non-empty set and
|f|
u
:= sup
xX
[f (x)[ for all f C
X
.
Then the subspace, Z :=
_
f f C
X
: |f|
u
<
_
is a Banach space, i.e.
(Z, ||
u
) is a complete normed space.
50 6 Metric Spaces
Proof. Let f
n
n=1
Z be a Cauchy sequence. Since for any x X, we
have
[f
n
(x) f
m
(x)[ |f
n
f
m
|
u
(6.2)
which shows that f
n
(x)
n=1
C is a Cauchy sequence of complex numbers.
Because C is complete, f (x) := lim
n
f
n
(x) exists for all x X. Passing to
the limit n in Eq. (6.2) implies
[f (x) f
m
(x)[ lim inf
n
|f
n
f
m
|
u
and taking the supremum over x X of this inequality implies
|f f
m
|
u
lim inf
n
|f
n
f
m
|
u
0 as m
showing f
m
f in Z.
Denition 6.28. We say that two norms, ||
a
and ||
b
, on a vector space X
are equivalent if there are constants C
1
, C
2
(0, ) such that
|x|
a
C
1
|x|
b
and |x|
b
C
2
|x|
a
for all x X.
Similarly two metrics, d
a
and d
b
on a set X are said to be equivalent if there
are constants C
1
, C
2
(0, ) such that
d
a
(x, y) C
1
d
b
(x, y) and d
b
(x, y) C
2
d
a
(x, y) for all x, y X.
Exercise 6.9. Show that two norms, ||
a
and ||
b
on a vector space X are
equivalent i the corresponding metrics, d
a
(x, y) := |y x|
a
and d
b
(x, y) :=
|y x|
b
, on X are equivalent metrics.
Corollary 6.29. If d
a
and d
b
are two equivalent metrics on a set X then (X, d
a
)
is a complete metric space i (X, d
b
) is a complete metric space.
Proof. Suppose that (X, d
b
) is complete. If x
n
n=1
is d
a
Cauchy implies
d
b
(x
n
, x
m
) C
2
d
a
(x
n
, x
m
) 0 as m, n
which shows that x
n
n=1
is d
b
Cauchy. As (X, d
b
) is complete, there exists
x X such that d
b
(x, x
n
) 0 as n . Since
d
a
(x, x
n
) C
1
d
b
(x, x
n
) 0 as n
we see that x
n
x in the d
a
metric as well. This shows (X, d
a
) is complete.
The reverse implication is proved the same way.
Exercise 6.10 (Equivalence of 3 norms on C
n
). Let ||
1
, ||
u
, and ||
2
be the three norms on C
n
given above. Show for all z C
n
that
|z|
u
|z|
1
n|z|
u
,
|z|
1

n|z|
2
, (Hint: use Cauchy Schwarz.)
|z|
2

_
|z|
u
|z|
1
|z|
1
and
|z|
2

n|z|
u
.
It follows from these inequalities that ||
1
, ||
u
, and ||
2
are equivalent norms
on C
n
.
Theorem 6.30 (Completeness of C
n
). Let n N and || denote any one of
the norms, ||
1
, ||
2
, or ||
u
on C
n
. Then (C
n
, ||) is complete.
Proof. By Exercise 6.10 all of these norms are equivalent to ||
u
and hence
it suces to show that ||
u
is a complete norm on C
n
. This is a special case of
Lemma 6.27 with X = 1, 2, . . . , n .
Exercise 6.11. Let X be a set and (Y, ) be a complete metric space. Suppose
that f
n
: X Y are functions such that
m,n
:= sup
xX
d (f
n
(x) , f
m
(x)) 0 as m, n .
Show there exists a (unique) functions, f : X Y such that
lim
n
sup
xX
d (f
n
(x) , f (x)) = 0.
Hint: mimic the proof of Lemma 6.27.
Exercise 6.12. Let Z denote the continuous functions on [0, 1] with values in
1 and as above let
|f|
1
:=
_
1
0
[f (t)[ dt, |f|
2
:=
_
1
0
[f (t)[
2
dt, and |f|
u
= sup
0t1
[f (t)[ .
Show for all f Z that;
|f|
1
|f|
2
and |f|
2
|f|
u
.
[Hint: for the rst inequality use Cauchy Schwarz.] Also show there is no
constant C < such that
|f|
u
C |f|
2
for all f Z. (6.3)
[Hint: consider the sequence, f
n
(t) = t
n
.]
6.3 General Limits and Continuity in Metric Spaces 51
Example 6.31. Let Z = C ([0, 1] , 1) be the vector space of continuous functions
on [0, 1] with values in 1 and for f Z let
|f|
1
:=
_
1
0
[f (t)[ dt.
Let us show that (Z, ||
1
) is not complete. To this end let
g (t) :=
_
2t if t 1/2
1 if 1/2 t 1
and then set f
n
(t) = g (t)
n
or all t [0, 1] . Then
Fig. 6.4. Here is a plot of f1, f2, f4, f8, f32, and the pointwise limit function h.
lim
n
f
n
(t) = h(t) =
_
0 if t <
1
2
1 if t
1
2
which is discontinuous at 1/2. Let us now observe that |h|
1
=
1
2
and
|f
n
h|
1
=
_ 1
2
0
(2t)
n
dt =
1
2
1
(n + 1)
so that f
n
h in ||
1
. If there were some f Z so that |f f
n
|
1
0 we
would have to have |f h|
1
= 0. If := [f (t
0
) h(t
0
)[ for some t
0
,=
1
2
, by
continuity we would have [f (t) h(t)[ /2 for t near t
0
from which it would
follow that |f h|
1
> 0. Therefore we must have f (t) = h(t) for all t ,= t
0
.
Since
lim
t
1
2
f (t) = lim
t
1
2
h(t) = 1 and
lim
t
1
2
f (t) = lim
t
1
2
h(t) = 0
there is not choice for f (1/2) for which f would be continuous at
1
2
. Hence we
have shown that f
n
can not converge to any element, f Z.
In fact, (Z, ||
1
) is full of uncountably many holes and h is the location
of just one of these holes. In the third quarter of this course we will ll these
holes.
6.3 General Limits and Continuity in Metric Spaces
Suppose now that (X, ) and (Y, d) are two metric spaces and f : X Y is a
function..
Denition 6.32 (Limits of functions). If x
0
X and f : X x
0
Y
is a function, then we say lim
xx0
f (x) = y
0
Y i for all > 0 there is a
= (, x
0
) > 0 such that
d(f (x) , y
0
) provided that 0 < (x, x
0
) (, x
0
) . (6.4)
[In generally when we write lim
xx0
f (x) we do not need to assume that f (x
0
)
is dened.]
Theorem 6.33 (Computing Limits Using Sequences). If x
0
X and
f : X x
0
Y is a function as above, then lim
xx0
f (x) = y
0
Y
i lim
n
f (x
n
) = y
0
for all sequences x
n
n=1
X x
0
such that
lim
n
x
n
= x
0
.
52 6 Metric Spaces
Proof. Suppose that lim
xx0
f (x) = y
0
Y and x
n
n=1
X x
0
with
lim
n
x
n
= x
0
. Then all > 0 there is a > 0 such that Eq. (6.4) holds.
Since x
n
x
0
as n it follows that 0 < (x
n
, x
0
) for a.a. n and
therefore d (f (x
n
) , y
0
) for a.a. n. As > 0 was arbitrary it follows that
lim
n
d (f (x
n
) , y
0
) = 0, i.e. that lim
n
f (x
n
) = y
0
.
Conversely if lim
xx0
f (x) ,= y
0
, then there exists > 0 such that for
any =
1
n
> 0 there exists x
n
X x
0
such that d(f(x
n
), y
0
) while
(x, x
n
) <
1
n
. We then have lim
n
x
n
= x while lim
n
f (x
n
) ,= y
0
.
Denition 6.34 (Continuity). A function f : X Y is continuous at
x X if for all > 0 there is a > 0 such that
d(f (x) , f(x
t
)) < provided that (x, x
t
) < . (6.5)
The function f is said to be continuous if f is continuous at all points x X.
We will write C (X, Y ) for the collection of continuous functions from X to Y.
Denition 6.35 (Seqeuential Continuity). A function f : X Y is con-
tinuous at x X if
lim
n
f (x
n
) = f (x) for all x
n
n=1
X with lim
n
x
n
= x.
We say f is sequentially continuous on X if it is continuous at all points in X.
Corollary 6.36. Continuity and sequential continuity are the same notions.
Proof. This follows rather directly from Theorem 6.33.
Example 6.37. The functions f : C 0 C dened by f (z) = 1/z is contin-
uous. Indeed, if z
n
n=1
C 0 and lim
n
z
n
= z C 0 , then
lim
n
f (z
n
) = lim
n
1
z
n
=
1
z
= f (z) .
Example 6.38. Let f : 1 1 be the function dened by
f (x) =
_
1 if x
0 if x / .
The function f is discontinuous at all points in 1. For example, if x
0
we
may choose x
n
1 such that lim
n
x
n
= x
0
while
lim
n
f (x
n
) = lim
n
0 = 0 ,= 1 = f (x
0
) .
Similarly if if x
0
1 we may choose x
n
such that lim
n
x
n
= x
0
while
lim
n
f (x
n
) = lim
n
1 = 1 ,= 0 = f (x
0
) .
Exercise 6.13. Consider N as a metric space with d (m, n) := [mn[ and
suppose that (Y, d) is a metric space. Show that every function, f : N Y is
continuous.
Exercise 6.14. Suppose that (X, d) is a metric space and f, g : X C are two
continuous functions on X. Show;
1. f +g is continuous,
2. f g is continuous,
3. f/g is continuous provided g (x) ,= 0 for all x X.
Exercise 6.15. Show the following functions from C to C are continuous.
1. f (z) = c for all z C where c C is a constant.
2. f (z) = [z[ .
3. f (z) = z and f (z) = z.
4. f (z) = Re z and f (z) = Imz.
5. f (z) =
N
m,n=0
a
m,n
z
m
z
n
where a
m,n
C.
Exercise 6.16. Suppose now that (X, ), (Y, d), and (Z, ) are three metric
spaces and f : X Y and g : Y Z. Let x X and y = f (x) Y, show
g f : X Z is continuous at x if f is continuous at x and g is continuous
at y. Recall that (g f) (x) := g (f (x)) for all x X. In particular this implies
that if f is continuous on X and g is continuous on Y then f g is continuous
on X.
Example 6.39. If f : X C is a continuous function then [f[ is continuous and
F :=
N
m,n=0
a
mn
f
m

f
n
is continuous.
Denition 6.40 (One sided limits). Suppose (Y, d) is a metric space, <
a 0 = () > 0 d (f (x) , y
0
) if 0 < xx
0

and
lim
xx0
f (x) = y
0
> 0 = () > 0 d (f (x) , y
0
) if 0 < x
0
x .
Theorem 6.41 (One sided limit criteria). Suppose that (Y, d) is a metric
space, (a, b) 1, f : (a, b) Y is a function, and x
0
(a, b) . Then the
following are equivalent;
1. lim
xx0
f (x) = y
0
2. lim
n
f (x
n
) = y
0
for all x
n
n=1
(x
0
, b) such that lim
n
x
n
= x
0
.
3. lim
n
f (x
n
) = y
0
for all x
n
n=1
(x
0
, b) such that x
n
x
0
, i.e. x
n+1

x
n
and x
0
< x
n
for all n and lim
n
x
n
= x
0
.
We also have the following equivalent statements;
a. lim
xx0
f (x) = y
0
b. lim
n
f (x
n
) = y
0
for all x
n
n=1
(a, x
0
) such that lim
n
x
n
= x
0
.
c. lim
n
f (x
n
) = y
0
for all x
n
n=1
(a, x
0
) such that x
n
x
0
, i.e. x
n+1

x
n
and x
n
< x
0
for all n and lim
n
x
n
= x
0
.
Moreover, lim
xx0
f (x) = y
0
i lim
xx0
f (x) = y
0
= lim
xx0
f (x) .
Proof. (1. = 2.) If lim
xx0
f (x) = y
0
then for all > 0 there exits =
() > 0 such that d (f (x) , y
0
) if 0 < xx
0
. Hence if x
n
n=1
(x
0
, b)
such that x
n
x
0
then 0 < x
n
x
0
for a.a. n and hence d (f (x
n
) , y
0
)
for a.a. n. Since > 0 was arbitrary, it follows that lim
n
d (f (x
n
) , y
0
) = 0,
i.e. lim
n
f (x
n
) = y
0
. It is trivial that (2. = 3.)
(3. = 1.) If lim
xx0
f (x) ,= y
0
there exists > 0 such that for all =
1
n
>
0 there exists x
t
n
(x
0
, b) such that 0 x
t
n
x
0

1
n
while d (f (x
t
n
) , y
0
) .
Then take x
n
= min (x
t
1
, . . . , x
t
n
) . Then x
n
x while d (f (x
n
) , y
0
) , i.e.,
lim
n
f (x
n
) ,= y
0
. The equivalent of statements a.c. are proved similarly
and so the proofs will be omitted.
For the last statement it is clear that lim
xx0
f (x) = y
0
implies
lim
xx0
f (x) = y
0
= lim
xx0
f (x) . For the converse assertion, suppose that
> 0 is given, then choose
() > 0 such that

d (f (x) , y
0
) when 0 < x x
0

+
() or 0 < x
0
x
() .
If we take () := min (
+
() ,
()) , then we will have

d (f (x) , y
0
) when 0 < [x x
0
[ ()
and since > 0 was arbitrary, it follows that lim
xx0
f (x) = y
0
.
Corollary 6.42 (A monotone continuity criteria). Suppose that (Y, d) is
a metric space, X = (a, b) 1, and f : X Y is a function. Then f is con-
tinuous at x X i lim
n
f (x
n
) = f (x) whenever x
n
n=1
X converges
monotonically to x as n . In other words, it is sucient to check sequential
continuity along sequences which are either increasing or decreasing.
Proof. This is a direct consequence of Theorem 6.41.
Exercise 6.17 (Continuity of x
1/m
). Show for each m N that the function
f (x) := x
1/m
is continuous on [0, ).
Exercise 6.18 (Dierentiability of x
1/m
). Show for each m N that the
function f (x) := x
1/m
is dierentiable on (0, ) and that
d
dx
x
1/m
:= lim
yx
y
1/m
x
1/m
y x
=
1
m
x
1/m1
.
Exercise 6.19 (Intermediate value theorem). Suppose that < a <
b < and f : [a, b] 1 is a continuous function such that f (a) f (b) . Show
for any y [f (a) , f (b)] , there exists a c [a, b] such that f (c) = y.
3
Hint:
Let S := t [a, b] : f (t) y and let c := sup (S) .
Exercise 6.20 (Inverse Function Theorem I). Let f : [a, b] [c, d] be a
strictly increasing (i.e. f (x
1
) < f (x
2
) whenever x
1
< x
2
) continuous function
such that f (a) = c and f (b) = d. Then f is bijective and the inverse function,
g := f
1
: [c, d] [a, b] , is strictly increasing and is continuous.
Notations 6.43 Let (X, ) and (Y, d) be metric spaces and f : X Y be a
function.
3
The same result holds for y [f (b) , f (a)] if f (b) f (a) just replace f by f
in this case.
54 6 Metric Spaces
1. We say f is uniformly continuous, i for all > 0 there exists =
() > 0 such that
x, x
t
X with (x, x
t
) = d (f (x) , f (x
t
)) .
2. A function, f : X Y, is said to be Lipschitz if there is a constant C <
such that
d (f (x) , f (x
t
)) C (x, x
t
) for all x, x
t
X.
Recall that a function f : X Y is continuous at x
0
X if for all > 0
there exists = (, x
0
) > 0 such that
x X with (x, x
0
) = d (f (x) , f (x
0
)) .
Thus we see that a function is uniformly provided we can take (, x
0
) > 0 to
be independent of x
0
. If f is Lipschitz and > 0, we may take := /C in
order to see that if
(x, x
t
) = d (f (x) , f (x
t
)) C (x, x
t
) C =
which shows f is uniformly continuous.
Example 6.44. Any function, f : 1 1 which is everywhere dierentiable is
Lipschitz i K := sup
tR
[f
t
(t)[ < . Indeed if
[f (y) f (x)[ K[y x[ for all x, y 1
then
[f
t
(x)[ = lim
yx
[f (y) f (x)[
[y x[
K for all x 1.
Conversely, if K := sup
tR
[f
t
(t)[ < , then by the mean value Theorem 10.16,
for all y > x there exists c (x, y) such that
f (y) f (x)
y x
= [f
t
(c)[ K.
It turns out that every metric spaces with an innite number of elements
comes equipped with a large collection of Lipschitz functions.
Lemma 6.45 (Distance to a Set). For any non empty subset A X, let
d
A
(x) := infd(x, a)[a A,
then
[d
A
(x) d
A
(y)[ d(x, y) x, y X. (6.6)
In particular, d
A
: X [0, ) is continuous.
Proof. Let a A and x, y X, then
d
A
(x) d(x, a) d(x, y) +d(y, a).
Take the inmum over a in the above equation shows that
d
A
(x) d(x, y) +d
A
(y) x, y X.
Therefore, d
A
(x) d
A
(y) d(x, y) and by interchanging x and y we also have
that d
A
(y) d
A
(x) d(x, y) which implies Eq. (6.6).
Corollary 6.46. The function d satises,
[d(x, y) d(x
t
, y
t
)[ d(y, y
t
) +d(x, x
t
).
Therefore d : X X [0, ) is continuous in the sense that d(x, y) is close
to d(x
t
, y
t
) if x is close to x
t
and y is close to y
t
. In particular, if x
n
x and
y
n
y then
lim
n
d (x
n
, y
n
) = d (x, y) = d
_
lim
n
x
n
, lim
n
y
n
_
.
Proof. First Proof. By Lemma 6.45 for single point sets and the triangle
inequality for the absolute value of real numbers,
[d(x, y) d(x
t
, y
t
)[ [d(x, y) d(x, y
t
)[ +[d(x, y
t
) d(x
t
, y
t
)[
d(y, y
t
) +d(x, x
t
).
Second Proof. By the triangle inequality,
d (x, y) d (x, x
t
) +d (x
t
, y) d (x, x
t
) +d (x
t
, y
t
) +d (y
t
, y)
d (x, y) d (x
t
, y
t
) d (x, x
t
) +d (y
t
, y) .
Interchanging x with x
t
and y with y
t
in this inequality shows,
d (x
t
, y
t
) d (x, y) d (x, x
t
) +d (y
t
, y)
and the result follows from the last two inequalities.
Exercise 6.21 (Continuity of integration). Let Z = C ([0, 1] , 1) be the
continuous functions from [0, 1] to 1 and ||
u
be the uniform norm, |f|
u
:=
sup
0t1
[f (t)[ . Dene K : Z Z by
K (f) (x) :=
_
x
0
f (t) dt for all x [0, 1] .
Show that K is a Lipschitz function. In more detail, show
|K (f) K (g)|
u
|f g|
u
for all f, g Z.
In this problem please take for granted the standard properties of the integral
including
1. The function x K (f) (x) is indeed continuous (in fact dierentiable by
the fundamental theorem of calculus).
2. K : Z Z is a linear transformation.
3. If f (t) g (t) for all t [0, 1] , then
_
x
0
f (t) dt
_
x
0
g (t) dt for all x X.
4. From 3. it follows that
_
x
0
f (t) dt
_
x
0
[f (t)[ dt.
Exercise 6.22 (Discontinuity of dierentiation). Let Z be the polynomial
functions in C ([0, 1] , 1) , i.e. functions of the form p (t) =
n
k=0
a
k
t
k
with
a
k
1. As above we let |p|
u
:= sup
0t1
[p (t)[ . Dene D : Z Z by
D(p) = p
t
, i.e. if p (t) =
n
k=0
a
k
t
k
then
D(p) (t) =
n
k=1
ka
k
t
k1
.
1. Show D is discontinuous at 0 where 0 represents the zero polynomial.
2. Show D is discontinuous at all points p Z.
Exercise 6.23 (Continuity of integration II). Let Z = C ([0, 1] , 1) and K
be as in Exercise 6.21. Further let
|f|
2
:=
_
1
0
[f (t)[
2
dt.
Show
|K (f) K (g)|
2

1
2
|f g|
2
for all f, g Z.
Denition 6.47 (Pointwise Convergence). Let (X, d) and (Y, ) be metric
spaces and f
n
: X Y be functions for each n N. We say that f
n
converges
pointwise to f : X Y provided lim
n
f
n
(x) = f (x) for all x X, i.e.
provided
lim
n
d (f (x) , f
n
(x)) = 0 for each x X.
Denition 6.48 (Uniform Convergence). Let (X, d) and (Y, ) be metric
spaces and f
n
: X Y be functions for each n N. We say that f
n
converges
uniformly to f : X Y provided
n
:= sup
xX
d (f (x) , f
n
(x)) 0 as n .
Fig. 6.5. Here is a sequence of continuous functions converging pointwise to a dis-
continuous function. This convergence is not uniform.
Hopefully it is clear to the reader the uniform convergence implies pointwise
convergence. The next theorem is a basic fact about uniform convergence which
does not hold in general for pointwise convergence. The proof of this theorem
is a bit subtle but well worth mastering as the method will arise over and over
again.
Theorem 6.49 (Uniform Convergence Preserves Continuity). Suppose
that f
n
n=1
are continuous functions from X to Y and f
n
converges uniformly
to f : X Y. If f
n
is continuous at x X for all n then f is continuous at
x as well. In particular if f
n
is continuous on X for all n then f is continuous
on X as well.
Proof. We will give three proofs of this important theorem. In these proofs
we will let
n
:= sup
xX
d (f (x) , f
n
(x)) .
First Proof. Suppose that f were discontinuous at some point x
0
X.
Then there would exists > 0 and x
k
X x
0
such that lim
k
x
k
= x
0
while (f (x
k
) , f (x
0
)) for all > 0. Let n N and set g := f
n
, then
(f (x
k
) , f (x
0
)) (f (x
k
) , g (x
k
)) + (g (x
k
) , g (x
0
)) + (g (x
0
) , f (x
0
))

n
+ (g (x
k
) , g (x
0
)) +
n
= 2
n
+ (g (x
k
) , g (x
0
)) .
Letting k in this inequality implies, 2
n
and then letting n
implies = 0 and we have reached the desired contradiction, see Figure 6.6.
56 6 Metric Spaces
Fig. 6.6. Showing that f can not have any discontinuities.
Second Proof. We must show lim
k
f (x
k
) = f (x) whenever x
k
k=1

X is a convergent sequence such that x := lim
k
x
k
X. So assume we are given
such a sequence x
k
k=1
. Then for any n N we have,
(f (x) , f (x
k
)) (f (x) , f
n
(x)) + (f
n
(x) , f (x
k
))
(f (x) , f
n
(x)) + (f
n
(x) , f
n
(x
k
)) + (f
n
(x
k
) , f (x
k
))

n
+ (f
n
(x) , f
n
(x
k
)) +
n
.
Therefore,
limsup
k
(f (x) , f (x
k
)) limsup
k
(f
n
(x) , f
n
(x
k
)) + 2
n
= 2
n
,
wherein we have used the continuity of f
n
for the last equality. Thus we have
shown
limsup
k
(f (x) , f (x
k
)) 2
n
which upon passing to the limit as n shows limsup
k
(f (x) , f (x
k
)) =
0. This suces to show lim
k
f (x
k
) = f (x) .
Third Proof. Let x X and > 0 be given. Choose n N so that
n
and let g := f
n
. Since g is continuous there exists > 0 such that
(g (x) , g (x
t
)) when d (x, x
t
) . So if d (x, x
t
) , then
(f (x) , f (x
t
)) (f (x) , g (x)) + (g (x) , f (x
t
))
(f (x) , g (x)) + (g (x) , g (x
t
)) + (g (x
t
) , f (x
t
))

n
+ (g (x) , g (x
t
)) +
n
+ + = 3.
As > 0 and x X were arbitrary, we have shown f is continuous on X.
Example 6.50 (Non-uniform convergence). For an example of nonuniform con-
vergence, suppose that g (x) = max
_
1 4x
2
, 0
_
and f
n
(x) := g (x 3n) for all
n, see Figure 6.7 Notice that for each x 1
+
, f
n
(x) = 0 for a.a. n and therefore
Fig. 6.7. Here is a plot of {fn}
3
n=0
.
lim
n
f
n
(x) = 0. On the other hand, |f
n
|
u
= 1 for all n so f
n
converges to
0 pointwise but not uniformly in x. Nevertheless the limiting function is still
continuous. This is not always the case as you will see in the next exercise.
Exercise 6.24. Let f
n
: [0, 1] 1 be dened by f
n
(x) = x
n
for x [0, 1] .
Show f (x) := lim
n
f
n
(x) exists for all x [0, 1] and nd f explicitly. Show
that f
n
does not converge to f uniformly.
6.4 Density and Separability
Denition 6.51. Let (X, d) be a metric space. We say A X is dense in X
if for all x X, there exists x
n
n=1
A such that x = lim
n
x
n
. [In words,
all points in X are limit points of sequences in A.] A metric space is said to be
separable if it contains a countable dense subset, A.
Example 6.52. The spaces 1
n
and C
n
with their Euclidean metrics are sepa-
rable. Indeed we can take D =
n
and D = (+i)
n
respectively for the
6.5 Test 2: Review Topics 57
countable dense subsets. For example given k N and x = (x
1
, . . . , x
n
) 1
n
,
we may choose q
k
=
_
q
k
1
, . . . , q
k
n
_
D such that
x
i
q
k
i

1
k
for 1 i n.
We then have,
d
_
x, q
k
_
=
_
n
i=1
x
i
q
k
i
_
n
i=1
_
1
k
_
2
=
n
1
k
0 as k .
The C
n
case now follows from this case as C
n
is really 1
2n
in disguise.
Example 6.53. Let Y := 1 which we equip with the usual metric, d (y, y
t
) =
[y y
t
[ for all y, y
t
Y. I now claim that Y is separable. We can no longer use
as the countable dense subset of Y since is not contained in Y ! On the other
hand, for each q we may choose y
n
(q) Y such that lim
n
y
n
(q) = q.
Then if y Y and =
1
k
> 0 is given, we may choose q such that
[y q[
1
2k
. We then take a
k
:= y
n
(q) for some large n so that [a
k
q[
1
2k
.
It then follows that [y a
k
[
1
k
0 as k and this shows that A :=
qQ
y
n
(q) : n N is a dense subset of Y. Moreover A is countable, why?
Remark 6.54. An equivalent way to say that A X is dense is to say d
A
0,
i.e. d
A
(x) = 0 for all x X. Indeed if x X, you should show that d
A
(x) = 0
i there exists a
n
A such that d (x, a
n
) 0 as n .
Exercise 6.25. Suppose that (X, d) is a separable metric space and Y is a non-
empty subset of X which is also a metric space by restricting d to Y. Show (Y, d)
is separable. [Hint: suppose that A X be a countable dense subset of X. For
each a A choose y
n
(a)
n=1
Y so that d
Y
(a) d (a, y
n
(a)) d
Y
(a) +
1
n
.
Now show A
Y
:=
aA
y
n
(a) : n N is a countable dense subset of Y.]
Exercise 6.26. Let n N. Show any non-empty subset Y C
n
equipped with
the metric,
d (x, y) = |y x| for all x, y Y
is separable, where || is either ||
u
, ||
1
, or ||
2
.
Exercise 6.27. For x, y 1, let
d (x, y) :=
_
0 if x = y
1 if x ,= y
.
Then (1, d) is a non-separable metric space. (This metric space is also com-
plete.)
Exercise 6.28. Suppose (X, ) and (Y, d) are metric spaces and A is a dense
subset of X.
1. Show that if F : X Y and G : X Y are two continuous functions such
that F = G on A then F = G on X.
2. Now suppose that (Y, d) is complete and f : A Y is a function which
is uniformly continuous (Notation 6.43). Recall this means for every > 0
there exists a > 0 such that
d (f (a) , f (b)) for all a, b A with (a, b) . (6.7)
Show there is a unique continuous function F : X Y such that F = f on
A.
Hint: Dene F (x) = lim
n
f (x
n
) where x
n
n=1
A is chosen to
converge x X. You must show the limit exists and is independent of the
choice of sequence x
n
n=1
A which converges for x.
3. Let X = 1 = Y and A = X, nd a function f : 1 which is
continuous on but does not extend to a continuous function on 1.
6.5 Test 2: Review Topics
1. Understand the basic properties of complex numbers.
2. Coutability. Key facts are that countable union of countable sets is count-
able and the nite product of countable sets is countable.
3. Denitions of metric and normed spaces and their basic properties which
in the end of the day typically follow from the triangle inequality.
4. Be aware of dierent norms, ||
u
, ||
1
, and ||
2
.
5. Understand the notion of; limits of sequences, Cauchy sequences, com-
pleteness, limits and continuity of functions.
6. Know what is meant by pointwise and uniform convergence. You should
be able to compute pointwise limits and know how to test if the limit is
uniform or not. A key theorem is the uniform limit of continuous functions
is still continuous.
7
Series and Sums in Banach Spaces
Denition 7.1. Suppose (X, ||) is a normed space and x
n
n=1
is a sequence
in X. Then we say
n=1
x
n
converges in X i s := lim
N
N
n=1
x
n
exists
in X otherwise we say
n=1
x
n
diverges. We often let S
N
:=
N
n=1
x
n
and
refer to S
N
N=1
X as the sequence of partial sums.
If X = 1 and x
n
0, then
n=1
x
n
diverges i lim
N
N
n=1
x
n
= and
so we will write
n=1
x
n
= in this case to indicate that
n=1
x
n
diverges
to innity.
Theorem 7.2 (Comparison Theorem). Suppose that a
n
n=1
and b
n
n=1
are sequences in [0, ). If a
n
b
n
for all n, then
n=1
a
n

n=1
b
n
where we allow for these sums to be innite. Moreover if a
n
b
n
for a.a. n then
n=1
b
n
< implies
n=1
a
n
< and if
n=1
a
n
= then
n=1
b
n
= .
Proof. Let A
k
:=
k
n=1
a
n
and B
k
:=
k
n=1
b
n
. Then a simple induction
argument shows that A
k
B
k
for all k and therefore
n=1
a
n
= lim
k
A
k
lim
k
B
k
=
n=1
b
n
by the sandwich lemma.
Theorem 7.3 (Telescoping Series / Fundamental Theorem of Summa-
tion). Let f (n)
n=1
X be a sequence, then
N
n=1
[f (n + 1) f (n)] = f (N + 1) f (1) for all N N (7.1)
and
n=1
[f (n + 1) f (n)] is convergent in X i lim
N
f (N) exists in X
in which case,
n=1
(f (n + 1) f (n)) = lim
N
f (N) f (1) .
We also have,
N
n=M
[f (n + 1) f (n)] = f (N + 1) f (M) for N M. (7.2)
Proof. When N = 3 we have,
3
n=1
(f (n + 1) f (n)) = (f (2) f (1)) + (f (3) f (2)) + (f (4) f (3))
= f (4) f (1) .
In general, Eqs. (7.1) and (7.2) are easily veried by a simple induction argu-
ment. The rest of the theorem is now evident.
Example 7.4 (Geometric Series). Suppose that f (n) =
n
where C. Then
f (n + 1) f (n) =
n+1
n
=
n
( 1)
and we nd,
( 1)
N
n=1
n
=
N+1
.
If ,= 1 it follows that
N
n=1
n
=

N+1
1
and if [[ < 1, it follows that
N+1
= [[
N+1
0 as N and therefore
n=1
n
=

1
.
Theorem 7.5 (Integral test). Let f : (0, ) 1 be a C
1
function such
that f
t
(x) 0 and f
t
(x) is decreasing in x (i.e. f
tt
(x) 0 if it exists) and
f () := lim
x
f (x) , see Figure 7.1. (As f is an increasing function, f ()
exists in (, ].) Then
60 7 Series and Sums in Banach Spaces
[f (N + 1) f (M)]
N
n=M
f
t
(n) [f (N + 1) f (M)] +f
t
(M) f
t
(N + 1)
(7.3)
[f (N + 1) f (M)] +f
t
(M)
for all M N. Letting N in these inequalities also gives,
f () f (M)
n=M
f
t
(n) f () f (M) +f
t
(M) (7.4)
and in particular,
n=1
f
t
(n) < i f () < .
Fig. 7.1. The items plotted here are ln x and 1/
x both of which satisfy the

hypothesis of the propostion.
Proof. By the mean value Theorem 10.16, there exists c
n
(n 1, n) such
that f (n) f (n 1) = f
t
(c
n
) . Since f
t
is decreasing, it follows that
f
t
(n) f
t
(c
n
) f
t
(n 1) for all n,
or equivalently
1
that
f
t
(n) f (n) f (n 1) f
t
(n 1) for all n.
1
This is a standard inequality involving convex functions. The reader should draw
the picture and note that f (n) f (n 1) is the slope of the cord joining
(n 1, f (n 1)) to (n, f (n)) .
Summing these inequalities on n, using
N+1
n=M+1
[f (n) f (n 1)] = f (N + 1) f (M) ,
shows,
N+1
n=M+1
f
t
(n) f (N + 1) f (M)
N+1
n=M+1
f
t
(n 1) =
N
n=M
f
t
(n) .
This inequality is equivalent to Eq. (7.3) because it gives,
[f (N + 1) f (M)]
N
n=M
f
t
(n) =
N+1
n=M+1
f
t
(n) +f
t
(M) f
t
(N + 1)
[f (N + 1) f (M)] +f
t
(M) f
t
(N + 1) .
Corollary 7.6 (p series). Let p 1, then
n=1
1
n
p
=
_
if p 1
< if p > 1
.
Proof. If p 0, then lim
n
1
n
p
,= 0 and hence the series diverges and
we may assume that p > 0. For p > 1, let f (x) = x
(p1)
so that f
t
(x) =
(p 1) x
p
which is decreasing in x and therefore by Theorem 7.5,
f () f (M)
n=M
p 1
n
p
f () f (M) +f
t
(M) =
1
M
p1
+
p 1
M
p
<
and so
1
(p 1) M
p1

n=M
1
n
p

1
(p 1) M
p1
+
1
M
p
.
In particular for M = 1,
1
p 1

n=1
1
n
p

1
p 1
+ 1.
Since
1
p 1

n=1
1
n
p

n=1
1
n
7 Series and Sums in Banach Spaces 61
we may let p 1 in order to show,

n=1
1
n
. This complete the proof as
for any p < 1 we have ,

n=1
1
n

n=1
1
n
p
.
Exercise 7.1. Take f (x) = ln x in Theorem 7.5 in order to directly conclude
that
n=1
1
n
= .
Exercise 7.2. Let 0 p < . Prove,
n=2
1
n(ln n)
p
=
_
if 0 p 1
< if p > 1
.
by applying Theorem 7.5 with the following functions;
1. for 0 p < 1 take f (x) = (ln x)
1p
,
2. for p = 1 take f (x) = ln ln x, and
3. for p > 1 take f (x) = (ln x)
1p
.
Theorem 7.7. Let (X, ||) be a Banach space and x
k
k=1
X be a sequence.
Then;
1.
k=1
x
k
converges i
_
_
_
_
_
n
k=m
x
k
_
_
_
_
_
0 as n, m with n m.
2. If
k=1
x
k
converges then lim
k
x
k
= 0 or alternatively if lim
k
x
k
,= 0
then
k=1
x
k
diverges.
3. If
k=1
x
k
converges then lim
N
k=N
x
k
= 0, i.e. the N - tail,
k=N
x
k
, of a
convergent series,
k=1
x
k
, go to zero as N .
Remark: the only place we use X is complete is in the implication (=)
in item 1. All of the remaining statements hold for any normed space X.
Proof. Let S
n
:=
n
k=1
x
k
so that
k=1
x
k
converges i lim
n
S
n
exists
i S
n
n=1
is Cauchy since X is a Banach space which gives item 1. since
S
n
S
m1
=
n
k=m
x
k
. For the second item apply the rst with n = m+1. For
the third item let S :=
k=1
x
k
, then lim
N
S
N
= S and so by very denition,
k=N
x
k
= S S
N+1
0 as N .
Exercise 7.3. Let (X, d) be a metric space. Suppose that x
n
n=1
X is a
sequence and set
n
:= d(x
n
, x
n+1
). Show that for m > n that
d(x
n
, x
m
)
m1
k=n
k=n
k
.
Conclude from this that if
k=1
k
=
n=1
d(x
n
, x
n+1
) <
then x
n
n=1
is Cauchy. Moreover, show that if x
n
n=1
is a convergent se-
quence and x = lim
n
x
n
then
d(x, x
n
)
k=n
k
.
Theorem 7.8 (Absolute Convergence Implies Convergence). Let (X, |
|) be a Banach space x
k
k=1
X be a sequence. Then
k=1
|x
k
| < implies
k=1
x
k
is convergent. [We say
k=1
x
k
is absolutely convergent if
k=1
|x
k
| <
.]
Exercise 7.4. Prove Theorem 7.8. Namely if (X, | |) is a Banach space and
x
k
k=1
X is a sequence, then
k=1
|x
k
| < implies
k=1
x
k
is convergent.
Proposition 7.9 (Alternating Series Test). If a
k
k=1
[0, ) is a non-
increasing sequence (i.e. a
k
a
k+1
for all k) such that lim
k
a
k
= 0, then
s :=
k=1
(1)
k+1
a
k
is convergent. Moreover, for all n N
s
n
k=1
(1)
k+1
a
k
k=n+1
(1)
k+1
a
k
a
n+1
. (7.5)
0 S1 S2 S3 S4 S5 S6 S
a1
a2
a3
a4
a5
a6
.
.
.
Fig. 7.2. The picture proof of the alternating series test.
Proof. The proof is essentially contained in Figure 7.2. Let S
n
:=
n
k=1
(1)
k+1
a
k
. For n N,
S
2n+1
= S
2n1
(a
2n
a
2n+1
) S
2n1
and
S
2(n+1)
= S
2n
+ (a
2n+1
a
2n+2
) S
2n
,
wherein we have used the fact that a
k
is decreasing to conclude the terms
in parenthesis above are non-negative. From these inequalities we learn that
S
2n
n=1
is an increasing sequence while S
2n+1
n=1
is a decreasing sequence
and hence
s
ev
:= lim
n
S
2n
exists in (, ] and
s
odd
:= lim
n
S
2n1
exists in (, ].
Since S
2n+1
S
2n
= a
2n+1
0 as n it follows that s
ev
= s
odd
1. It is
now easy to conclude (You prove!) that
S =
k=1
(1)
k+1
a
k
= lim
n
S
n
= s
ev
= s
odd
1.
In summary, S
2n
S while S
2n+1
S as n and in particular, for all n N,
S
2n+1
a
2n
= S
2n
S S
2n+1
= S
2n
+a
2n+1
.
These inequalities then implies
0 S S
2n
=
k=2n+1
(1)
k+1
a
k
a
2n+1
and
0 S
2n+1
S =
k=2n
(1)
k+1
a
k
a
2n
.
which imply Eq. (7.5).
Alternatively, here is another way to think about the error estimates;
0 a
1
a
2
= S
2
S S
1
a
1
so that 0 S a
1
. Applying these results to
k=n+1
(1)
k+1
a
k
shows,
s
n
k=1
(1)
k+1
a
k
k=n+1
(1)
k+1
a
k
a
n+1
.
Example 7.10. By the alternating series test,
n=1
(1)
n
1
n
p
is convergent for all p > 0. On the other hand, by Corollary 7.6, the series is
absolutely convergent i p > 1.
Theorem 7.11 (Uniform convergence and the Weierstrass M test).
Suppose that (X, ||) is a Banach space, (Y, d) is a metric space, for each n N,
f
n
: Y X is a function, and there exists M
n
n=1
[0, ) satisfying;
sup
yY
|f
n
(y)|
X
M
n
n N and
n=1
M
n
< .
Then S
N
(y) :=
N
n=1
f
n
(y) converges absolutely and uniformly to S (y) =
n=1
f
n
(y) . Moreover, if we further assume that f
n
n=1
C (Y, X) (i.e. f
n
is continuous for all n), then the function S : Y X is also continuous.
Proof. For any y Y,
n=1
|f
n
(y)|
X

n=1
M
n
<
and therefore S (y) =
n=1
f
n
(y) converges absolutely. Moreover we have,
|S (y) S
N
(y)| = lim
M
|S
M
(y) S
N
(y)| = lim
M
_
_
_
_
_
M
n=N+1
f
n
(y)
_
_
_
_
_
liminf
M
M
n=N+1
|f
n
(y)| =
n=N+1
|f
n
(y)|
n=N+1
M
n
.
As the last member of this inequality does not depend on y we have,
sup
yY
|S (y) S
N
(y)|
n=N+1
M
n
0 as N
because tails of convergent series vanish, Theorem 7.7. The continuity of S now
follows form the continuity of S
N
, the uniform convergence just proved, and
Theorem 6.48.
Theorem 7.12 (Root test). Suppose that a
n
n=1
is a sequence in C and let
:= limsup
n
[a
n
[
1/n
. Then
1. If < 1 then
n=1
[a
n
[ < and
n=1
a
n
is absolutely convergent.
2. If > 1, then limsup
n
[a
n
[ = and
n=1
a
n
diverges.
3. If = 1, the test fails, i.e. you must work harder!]
1. If < 1, let (, 1) , then [a
n
[
1/n
for a.a. n which implies that
[a
n
[
n
for a.a. n and so the result follows by the comparison Theorem
7.2 as
n=1
n
=

1
< .
2. If > 1 and (1, ) , then [a
n
[
1/n
i.o. n and hence [a
n
[
n
i.o. n.
As
n
as n it follows that limsup
n
[a
n
[ = .
3. From Corollary 7.6 we know that
n=1
1
n
= while
n=1
1
n
2
< .
However from Lemma 3.30,
lim
n
_
1
n
_
1/n
= 1 = lim
n
_
1
n
2
_
1/n
which shows the test has failed. In fact, if 0 < p < and k N such that
k p, then
1
n
k

1
n
p
1 which implies
_
1
n
k
_
1/n
_
1
n
p
_
1/n
(1)
1/n
= 1.
By Lemma 3.30 we know lim
n
_
1
n
k
_
1/n
= 1 and therefore by the sand-
wich lemma, lim
n
_
1
n
p
_
1/n
= 1.
Exercise 7.5. For every p N, show
n=0
(n)
n/p
z
n
is convergent i z = 0.
Theorem 7.13 (Ratio test). Suppose that a
n
n=1
is a sequence in C such
that a
n
,= 0 for a.a. n. Then
1. If := limsup
n
an+1
an
< 1 then
n=1
[a
n
[ < and
n=1
a
n
is abso-
lutely convergent.
2. If
an+1
an
1 for a.a. n then lim

n
[a
n
[ > 0 and
n=1
a
n
diverges.
3. If liminf
n
an+1
an
> 1 then lim

n
[a
n
[ = and
n=1
a
n
diverges.
4. If limsup
n
an+1
an
= 1, the test fails, i.e. you must work harder!]

1. If < 1, let (, 1) , then
an+1
an
for a.a. n, i.e. there exists N N

such that [a
n+1
[ [a
n
[ for all n N. A simple induction argument shows,
[a
n
[ [a
N
[
nN
=
N
[a
N
[
n
for n N.
The result follows by the comparison Theorem 7.2 and the fact that
n=N
N
[a
N
[
n
=
[a
N
[
1
< .
2. Suppose there exists N N such that
an+1
an
1 for all n N.
This inequality says that [a
n
[ is non-decreasing for large n and therefore
lim
n
[a
n
[ [a
m
[ for any m N. We may now choose m N such that
[a
m
[ , = 0.
3. If liminf
n
an+1
an
> > 1, then
an+1
an
for a.a. n. Working as above

there exists N N such that [a
N
[ , = 0 and [a
n
[ [a
N
[
nN
for all n N.
From this it follows that lim
n
[a
n
[ = .
4. From Corollary 7.6 we know that
n=1
1
n
p
 1. However
lim
n
_
_
_
1
n+1
_
p
_
1
n
_
p
_
_
= lim
n
_
n
n + 1
_
p
=
_
lim
n
_
n
n + 1
__
p
= 1
p
= 1
which shows the test has failed and does not resolve when
n=1
1
n
p
< .
In what follows let (Z, ||) be a complex Banach space. For example, Z = C
and |z| = [z[ is an important special case.
Remark 7.14. The root and ratio tests extend to series in Banach spaces
(a
n
n=1
Z) with the only changes being that we replace [a
n
[ by |a
n
| and
an+1
an
by
|an+1|
|an|
wherever these occur. The point is that we can apply the root
and ratio test to the sequence of real numbers |a
n
|
n=1
in order to make
conclusion about the absolute convergence or divergence of
n=1
a
n
.
Denition 7.15. Given z
0
C and a
n
n=0
Z, the series of the form
n=0
a
n
(z z
0
)
n
(7.6)
is called a power series. If z
0
= 0 we call it a Maclaurin series, i.e. a series
of the form
n=0
a
n
z
n
. (7.7)
The radius of convergence of either of these series is dened to be R :=
1

[0, ] where
:= limsup
n
|a
n
|
1/n
[0, ] .
By denition,
1
0
:= in this formula.
The next theorem shows that R is the critical radius governing the conver-
gence of Eqs. (7.6) and (7.7).
Proposition 7.16. If R is the radius of convergence of a power series in Eq.
(7.6) then:
1. If [z z
0
[ < R, the series converges.
2. If [z z
0
[ > R, the series diverges.
3. If [z z
0
[ = R, the series may or may not converge.
We may also characterize R as
R := sup
_
[z z
0
[ : z C and
n=0
a
n
(z z
0
)
n
converges
_
. (7.8)
Proof. Let
(z) := limsup
n
|a
n
(z z
0
)
n
|
1/n
= [z z
0
[ limsup
n
|a|
1/n
=
[z z
0
[
R
.
By the root test, Theorem 7.12, we know that power series in Eq. (7.6) converges
if (z) < 1 and diverges if (z) > 1 and the test fails if (z) = 1.
Corollary 7.17. If = lim
n
|an+1|
|an|
exists, the radius (R) of convergence
of a power series in Eq. (7.6) is R =
1
= lim
n
|an|
|an+1|
.
Proof. If = lim
n
|an+1|
|an|
, then
(z) := lim
n
_
_
_a
n+1
(z z
0
)
n+1
_
_
_
|a
n
(z z
0
)
n
|
= [z z
0
[ lim
n
|a
n+1
|
|a
n
|
= [z z
0
[ .
The the Ratio test in Theorem 7.13 now implies
n=0
a
n
(z z
0
)
n
converges
if [z z
0
[ <
1
( (z) < 1) and diverges if [z z

0
[ >
1
( (z) > 1) , i.e. R =

1
from Eq. (7.8).

Theorem 7.18. If the radius of convergence (R) of a power series
n=0
a
n
(z z
0
)
n
is positive (R > 0) , then the functions
S : D(z
0
, R) := z C : [z z
0
< R[ Z
dened by S (z) :=
n=0
a
n
(z z
0
)
n
is continuous and the series is uniformly
convergent on D(z
0
, ) for all < R.
Proof. For any < R let M
n
:= |a
n
|
n
. Then
n=0
M
n
< by the root
test or the denition of R. So if z D(z
0
, ) we nd,
|a
n
(z z
0
)
n
| |a
n
|
n
= M
n
.
It follows from the Weierstrass M test, Theorem 7.11, that the series is
uniformly convergent and S is continuous on D(z
0
, ) for all < R. Since
D(z
0
, R) =
0<<R
D(z
0
, ) , we may conclude that S is continuous on
D(z
0
, R) .
Exercise 7.6. Show that each of the following power series have an innite
radius of convergence and hence dene continuous functions from C to C.
1. exp (z) :=
n=0
z
n
n!
,
2. sin (z) :=
n=0
(1)
2n+1 z
2n+1
(2n+1)!
, and
3. cos (z) :=
n=0
(1)
2n z
2n
(2n)!
.
Proposition 7.19. Suppose that (X, ||) is a normed space and a
n
n=1
,
b
n
n=1
X such that A :=
n=1
a
n
and B :=
n=1
b
n
exist in X. Then
for all F (F = 1 or C if X is a real or complex vector space respectively),
we have the series,
n=1
(a
n
+b
n
) , is convergent and
n=1
(a
n
+b
n
) = A+B.
The easy proof of this proposition is left to the reader. Our next goal is to
consider the multiplication of series. In this case we will need a multiplication
on X itself. We start with the case where X = C or X = 1. Let a
n
n=0
,
b
n
n=0
C and let us consider the power series,
A(z) :=
n=0
a
n
z
n
and B(z) :=
n=0
b
n
z
n
.
Working formally,
A(z) B(z) =
_

n=0
a
n
z
n
__

m=0
b
m
z
m
_
=
m,n=0
a
n
b
m
z
n
z
m
=
k=0
m+n=k
a
n
b
m
z
n
z
m
=
k=0
_

m+n=k
a
n
b
m
_
z
k
.
Thus we expect A(z) B(z) to be represented by a power series
C (z) =
k=0
c
k
z
k
where c
k
:=
m+n=k
a
n
b
m
=
k
m=0
a
km
b
m
.
Theorem 7.20 (Multiplication of Series). Suppose that a
n
n=0
,
b
n
n=0
C and
n=0
a
n
and
n=0
b
n
are absolutely convergent series. Let
c
k
k=0
be dened by
c
k
:=
k
m=0
a
km
b
m
. (7.9)
Then
k=0
c
k
converges absolutely and
k=0
c
k
=
_

n=0
a
n
_
n=0
b
n
_
. (7.10)
Proof. Let J
N
= 0, 1, 2, . . . , N ,
N
:=
_
(m, n) J
2
N
= J
N
J
N
: m+n > N
_
,
and
R
N
=
(m,n)N
[a
n
b
m
[ .
I now claim that lim
N
R
N
= 0. To see this let A :=
n=0
[a
n
[ , B :=
m=0
[b
m
[ and observe m + n > N implies either m > N/2 or n > N/2, see
Figure 7.3. Hence we nd,
Fig. 7.3. The red region represents the indices contributing to
N
k=0
ck while the
N N square represents the indices contributing to
N
m,n=0
ambn. The blue and
green indices include all of the indices in the square minus those contributing to
N
k=0
ck.
R
N

N
2
<mN
N
n=0
[a
n
b
m
[ +
N
2
<nN
N
m=0
[a
n
b
m
[
N
2
<mN
[b
m
[
N
n=0
[a
n
[ +
N
2
<nN
[a
n
[
N
m=0
[b
m
[
m>
N
2
[b
m
[ A+
n>
N
2
[a
n
[ B 0 as N
because the tails of convergent series tend to zero.
We now have the identity,
N
n=0
a
n

N
m=0
b
m
=
(m,n)JNJN
a
n
b
m
=
m+nN
a
n
b
m
+r
N
=
N
k=0
c
k
+r
N
,
where r
N
:=
(m,n)N
a
n
b
m
.Since [r
N
[ R
N
0, it follows that
k=0
c
k
= lim
N
N
k=0
c
k
= lim
N
_
N
n=0
a
n

N
m=0
b
m
r
N
_
= lim
N
N
n=0
a
n
lim
N
N
m=0
b
m
=
_

n=0
a
n
_
n=0
b
n
_
.
Lastly observe that
[c
k
[
k
m=0
[a
km
[ [b
m
[ =: c
k
and so applying what we have just proved to [a
n
[
n=0
and [b
n
[
n=0
shows
k=0
[c
k
[
k=0
c
k
=
_

n=0
[a
n
[
_
n=0
[b
n
[
_
< .
Here is the short summary of the argument;
m,n=0
a
n
b
m

N
k=0
c
k
(m,n)N
a
n
b
m
(m,n)N
[a
n
b
m
[
N
2
<mN, 0nN
[a
n
b
m
[ +
N
2
<nN, 0mN
[a
n
b
m
[
m>
N
2
[b
m
[ A+
n>
N
2
[a
n
[ B 0 as N .
Corollary 7.21. Let a
n
n=0
, b
n
n=0
C and c
k
:=
m+n=k
a
m
b
n
. If
R := min
_
1
limsup
n
[a
n
[
1/n
,
1
limsup
n
[b
n
[
1/n
_
> 0, (7.11)
then
k=0
c
k
z
k
is convergent for [z[ < R and
k=0
c
k
z
k
=
_

n=0
a
n
z
n
__

m=0
b
m
z
m
_
.
Proof. For [z[ < R, the root test shows
n=0
[a
n
z
n
[ < and
m=0
[b
m
z
m
[ < .
The result now follows by applying Theorem 7.20 with a
n
[a
n
z
n
[ and b
n

[b
n
z
n
[ .
Example 7.22. For [z[ < 1 we have
(1 z)
n=0
z
n
= 1.
This shows that power series,
k=0
c
k
z
k
, in Corollary 7.21 may in fact have
radius of convergence larger than R in Eq. (7.11).
Theorem 7.23. The function exp : C C satises the following properties;
1. exp (0) = 1,
2. exp (w) exp (z) = exp (w +z) for all w, z C,
3. exp (w) is never 0 and [exp (w)]
1
= exp (w) for all w C.
Proof. The rst assertion is clear and item 3. easily follows from item 2.
with z = w. We now prove item 2. by applying Theorem 7.20 with a
n
:=
1
n!
z
n
and b
n
:=
1
n!
w
n
. In this case,
c
k
=
k
m=0
1
m!
z
m
1
(k m)!
w
km
=
1
k!
k
m=0
_
k
m
_
z
m
w
km
=
1
k!
(z +w)
k
wherein we have used the Binomial theorem for the last equality, see Exercise
7.10. It now follows from Theorem 7.20 that
exp (w) exp (z) =
n=0
z
n
n!

n=0
w
n
n!
=
k=0
1
k!
(z +w)
k
= exp (w +z) .
Denition 7.24 (Eulers Number). Eulers number is dened by
e := exp (1) =
m=0
1
m
m!
=
m=0
1
m!
.
If m N, then
e = exp (1) = exp
_
1
m
m
_
=
_
exp
_
1
m
__
m
.
As exp
_
1
m
_
> 0 we may conclude that exp
_
1
m
_
= e
1/m
. Now if n N
0
we nd,
e
n
m
=
_
e
1/m
_
n
=
_
exp
_
1
m
__
n
= exp
_
_
_
_
n times
..
1
m
+ +
1
m
_
_
_
_
= exp
_
n
m
_
.
As
e
n
m
=
1
e
n
m
=
1
exp
_
n
m
_ = exp
_
n
m
_
we have in fact shown that e
q
= exp (q) for all q . Owing to this identity,
we will often write e
z
for exp (z) .
With the denitions in Exercise 7.6, we have
exp (iz) =
n=0
(iz)
n
n!
=
n=0
(iz)
2n
(2n)!
+
n=0
(iz)
2n+1
(2n + 1)!
=
n=0
(1)
n
z
2n
(2n)!
+i
n=0
(1)
n
(z)
2n+1
(2n + 1)!
= cos (z) +i sin (z) ,
i.e.
exp (iz) = cos (z) +i sin (z) for all z C. (7.12)
This is called Eulers formula. If z = 1, it states that
Re exp (i) = cos and Imexp (i) = sin .
Thus if , 1, we have the following addition formulas;
cos ( +) = Re exp (i ( +)) = Re [exp (i) exp (i)]
= Re [(cos +i sin ) (cos +i sin )]
= cos cos sin sin (7.13)
and
sin ( +) = Imexp (i ( +)) = Im[exp (i) exp (i)]
= Im[(cos +i sin ) (cos +i sin )]
= cos sin + sin cos . (7.14)
We also dene
sinh (z) =
e
z
e
z
2
=
n=0
z
2n+1
(2n + 1)!
and
cosh (z) =
e
z
+e
z
2
=
n=0
z
2n
(2n)!
.
The following identities are now easily veried;
cos (z) = cos (z) and cosh (z) = cosh (z)
sin (z) = sin (z) and sinh (z) = sinh (z) ,
e
z
= cosh (z) + sinh (z) ,
sin (z) =
e
iz
e
iz
2i
=
1
i
sinh (iz) , and
cos (z) =
e
iz
+e
iz
2
= cosh (iz) .
Exercise 7.7 (Addition Formulas). Prove the addition formulas in Eqs.
(7.13) and (7.14) extend to , C, i.e. show for all w, z C that
cos (w +z) = cos (w) cos (z) sin (w) sin (z) and
sin (w +z) = cos (w) sin (z) + cos (z) sin (w) .
Exercise 7.8. Suppose that p (t) and q (t) are non-zero polynomials of t 1
with (possibly) complex coecients. Further assume that q (n) ,= 0 for all n
N
0
. Show for any sequence a
n
n=0
C that the power series;
n=0
p (n)
q (n)
a
n
z
n
and
n=0
a
n
z
n
have the same radius of convergence.
Theorem 7.25 (Hilbert Schmidt norm). Let Z = C
JNJN
denote the space
of N N complex matrices, A = (A
ij
)
N
i,j=1
with A
ij
C. We let
|A|
2
:=
_
N
i,j=1
[A
ij
[
2
which is called the Hilbert Schmidt norm on Z. This norm satises,
1. |I|
2
=
N where I is the N N identity matrix.

2. |AB|
2
|A|
2
|B|
2
for all A, B Z.
3. |A
n
|
2
|A|
n
2
for all n N.
4. If A
n
A and B
n
B then A
n
B
n
AB and A
n
+ B
n
A + B as
n . Thus matrix multiplication and addition are continuous operations
on (Z, ||
2
) .
5. If (X, d) is a metric space and f : X Z and g : X Z are continuous
functions then so is f g (order matters) and f +g.
6. The functions f (A) = A
n
is continuous on Z for all n N
0
, where by
convention A
0
= I.
7. (Z, ||
2
) is a Banach space.
Proof. Item 1. is clear. From the denition of matrix multiplication and the
Cauchy Schwarz inequality we nd
(AB)
ij
2
=
k=1
A
ik
B
kj
k=1
[A
ik
[
2
k=1
[B
kj
[
2
.
Therefore,
|AB|
2
2
=
n
i,j=1
(AB)
ij
i,j=1
_
N
k=1
[A
ik
[
2
k=1
[B
kj
[
2
_
=
N
i,k=1
[A
ik
[
2
j,k=1
[B
kj
[
2
= |A|
2
2
|B|
2
2
which proves item 2. Item 3. follows by an easy induction argument.
Item 4. Let A
n
:= A
n
A and B
n
:= B
n
B so that A
n
= A+A
n
and
B
n
= B +B
n
where |A
n
|
2
0 and |B
n
|
2
0 as n . Then
|A
n
B
n
AB|
2
= |(A+A
n
) (B +B
n
) AB|
2
= |A
n
B +AB
n
+A
n
B
n
|
2
|A
n
B|
2
+|AB
n
|
2
+|A
n
B
n
|
2
|A
n
|
2
|B|
2
+|A|
2
|B
n
|
2
+|A
n
|
2
|B
n
|
2
0 as n .
Similarly,
|A
n
+B
n
(A+B)|
2
= |A
n
+B
n
|
2
0 as n .
Item 5. Suppose that x
n
n=1
X and x
n
x as n , then using item 4.,
lim
n
(f g) (x
n
) = lim
n
[f (x
n
) g (x
n
)] = f (x) g (x) = (f g) (x)
and
lim
n
(f +g) (x
n
) = lim
n
[f (x
n
) +g (x
n
)] = f (x) +g (x) = (f +g) (x) .
This shows f g and f +g are continuous.
Item 6. follows from item 5 with X = Z along with an induction argument.
Item 7. follows from the fact that (Z, ||
2
) is C
N
2
with the 2 norm in a
slight disguise.
Denition 7.26 (Matrix Functions). If f (z) =
n=0
c
n
z
n
for some c
n
C
and A C
JNJN
, then we dene
f (A) :=
n=0
c
n
A
n
provided the sum is convergent.
For example,
sin (A) :=
n=0
(1)
2n+1
(2n + 1)!
A
2n+1
.
Theorem 7.27. Suppose that c
n
n=1
is a sequence in C and
R :=
_
limsup
n
[c
n
[
1/n
_
1
. If A C
JNJN
with |A| < R, then
f (A) :=
n=0
c
n
A
n
is convergent and f : B
0
(R) := A : |A| < R C
JNJN
is continuous.
Moreover, for any < R, the sum converges absolutely and uniformly on B
0
() .
Proof. Let (|A| , R) and let M
n
:= [c
n
[
n
. Then limsup
n
M
1/n
n =
/R < 1 and therefore by the Root test,
n=0
M
n
< . Since, for n 1,
|c
n
A
n
| = [c
n
[ |A
n
| [c
n
[ |A|
n
M
n
,
the results are now again a consequence of the Weierstrass M test in Theorem
7.11.
Theorem 7.28 (Matrix Exponentials and etc.). The series for e
A
, sin (A) ,
cos (A) , sinh (A) , and cosh (A) are all absolutely convergent dene continuous
functions from C
JNJN
to C
JNJN
in the Hilbert Schmidt norm.
Proof. Since the corresponding numerical series all of innite radius of
convergence the result follows from Theorem 7.27.
Exercise 7.9 (Inverting perturbations of the identity). For |A|
2
< 1 in
C
JNJN
, I A is invertible and
(I A)
1
=
n=0
A
n
where the sum is absolutely convergent. Moreover the function A (I A)
1
is continuous on the ball, B := A : |A|
2
< 1 and
_
_
_(I A)
1
_
_
_
2
n=0
|A
n
|
2

|A|
2
1 |A|
2
+
N.
Exercise 7.10 (Binomial Theorem). Suppose that A, B C
JNJN
are
commuting matrices, i.e. AB = BA. For any n N, show by induction that
n
k=0
_
n
k
_
A
k
B
nk
= (A+B)
n
. (7.15)
Exercise 7.11 (Matrix Addition Formula). If A, B C
JNJN
are com-
muting matrices, show exp (A+B) = exp (A) exp (B) . Hint: mimic the proof
of Theorem 7.23.
Example 7.29 (exp (A+B) ,= exp (A) exp (B)). Let 1 and
A =
_
0
0 0
_
and B =
_
0 0
0
_
.
Since A
2
=
_
0 0
0 0
_
= B
2
it follows that
e
A
= I +A =
_
1
0 1
_
and e
B
= I +B =
_
1 0
1
_
and in particular,
e
A
e
B
=
_
1
0 1
_ _
1 0
1
_
=
_
1
2
1
_
.
Let us now compute e
A+B
. First observe that
A+B = J where J :=
_
0 1
1 0
_
.
Since
J
2
= I where I =
_
1 0
0 1
_
,
it follows that J
2n
= (I)
n
= (1)
n
I and J
2n+1
= J
2n
J = (1)
n
J for all
n N
0
. Therefore,
e
A+B
= e
J
=
n=0
n
n!
J
n
=
n=0
1
(2n)!
2n
J
2n
+
n=0
1
(2n + 1)!
2n+1
J
2n+1
=
_

n=0
1
(2n)!
2n
(1)
n
_
I +
_

n=0
1
(2n + 1)!
2n+1
(1)
n
_
J
= cos I + sin J =
_
cos sin
sin cos
_
.
From these formula it is evident (using facts that you know about sin and
cos ) that e
A
e
B
,= e
A+B
unless = 0. There is no contradiction to Exercise
7.11 since
[A, B] =
_
0
0 0
_ _
0 0
0
_
_
0 0
0
_ _
0
0 0
_
=
_
2
0
0
2
_
=
2
_
1 0
0 1
_
which is zero i = 0.
Exercise 7.12 (Bounded Operators). Let (X, ||) be a normed space and
T : X X be a linear transformation and suppose T is continuous at
0 X. Show;
1. There exists C < such that |T (x)| C |x| for all x X. Hint: for
sake of contradiction suppose that no such C < exists. Then construct
a sequence y
n
n=1
X such that lim
n
y
n
= 0 while |T (y
n
)| = 1 for
all n.
2. Show T is continuous on all of X.
8
More Sums and Sequences
Warning: this chapter is a bit rough and will be edited later when it be-
comes needed.
8.1 Rearrangements
[The stu about sums of positive numbers could go much earlier in fact.]
Denition 8.1. If is a countable set and a : [0, ] is a function, let
x
a (x) := sup
0f
x0
a (x) .
Notice that if 0 a (x) b (x) , then
x
a (x)

x
b (x) . Moreover
if B , then
xB
a (x) =
x
a (x) 1
B
(x) .
Theorem 8.2 (Monotone Convergence Theorem for Sums). If is a
countable set and a
n
: [0, ] is an increasing sequence of functions, then
lim
n
x
a
n
(x) :=
x
lim
n
a
n
(x) .
Proof. This is an easy consequence of the sup sup theorem. Indeed,
x
a
n
(x) is an increasing sequence of numbers, therefore,
lim
n
x
a
n
(x) = sup
n
x
a
n
(x) = sup
n
sup
0f
x0
a
n
(x)
= sup
0f
sup
n
x0
a
n
(x) = sup
0f
lim
n
x0
a
n
(x)
= sup
0f
x0
lim
n
a
n
(x) =
x
lim
n
a
n
(x) .
Corollary 8.3. Again suppose that is a countable set and a : [0, ] is
a function. If
n
and
n
, then
a (x) = lim
n
n
a (x) .
Proof. Apply Theorem 8.2 with a
n
(x) := a (x) 1
n
(x) .
Corollary 8.4. If = A B with A B = , then
x
a (x) =
xA
a (x) +
xB
a (x) .
Proof. Choose
n

f
such that
n
. Then
n
A A and
n
B B
and hence,
x
a (x) = lim
n
xn
a (x)
= lim
n
_

xnA
a (x) +
xnB
a (x)
_
= lim
n
xnA
a (x) + lim
n
xnB
a (x)
=
xA
a (x) +
xB
a (x) .
Corollary 8.5. If is a countable set, a : [0, ] is a function, and
=
n=1
n
, then
a (x) =
n=1
n
a (x)
Proof. We have
n=1
n
a (x) = lim
N
N
n=1
n
a (x) = lim
N
N
n=1
n
a (x) =
x
a (x) .
72 8 More Sums and Sequences
Corollary 8.6. If x
n
n=1
is any enumeration of then
x
a (x) =
n=1
a (x
n
) .
Proof. =
n=1
x
n
and therefore
x
a (x) =
n=1
xxn]
a (x) =
n=1
a (x
n
) .
Corollary 8.7 (Tonelli Theorem for Sums). Suppose that is a countable
set and a
n
: [0, ] is a sequence of functions, then
n=1
x
a
n
(x) =
n=1
a
n
(x) .
Proof. This is a simple consequence of MCT,
n=1
x
a
n
(x) = lim
N
N
n=1
x
a
n
(x) = lim
N
x
N
n=1
a
n
(x)
=
x
lim
N
N
n=1
a
n
(x) =
n=1
a
n
(x) .
Lemma 8.8 (Fatous Lemma). If is a countable set and a
n
: [0, ]
is a sequence of functions, then
x
liminf
n
a
n
(x) liminf
n
x
a
n
(x) .
Proof. By denition and MCT,
x
liminf
n
a
n
(x) =
x
lim
n
inf
kn
a
k
(x)
= lim
n
x
inf
kn
a
k
(x)
= liminf
n
x
inf
kn
a
k
(x)
liminf
n
x
a
n
(x) .
Theorem 8.9 (Dominated Convergence Theorem for Sums I). If
is a countable set and a
n
: [0, ) is a sequence of functions such that
x
sup
n
a
n
(x) < and lim
n
a
n
(x) = 0 for all x , then
lim
n
x
a
n
(x) := 0.
Proof. For all > 0 there

f
such that
x/
sup
n
a
n
(x) .
Therefore,
x
a
n
(x) =
x
a
n
(x) +
x/
a
n
(x)
x
a
n
(x) +
and therefore,
limsup
n
x
a
n
(x) limsup
n
x
a
n
(x) + = lim
n
x
a
n
(x) + = .
Since > 0 is arbitrary, it follows that
limsup
n
x
a
n
(x) = 0.
Alternative Proof. Let (x) := sup
n
a
n
(x) , then (x) a
n
(x) 0 for
all x . Therefore by Fatous Lemma 8.8,
x
liminf
n
[(x) a
n
(x)] liminf
n
x
[(x) a
n
(x)]
=
x
(x) + liminf
n
_
x
a
n
(x)
_
=
x
(x) limsup
n
_
x
a
n
(x)
_
while similarly,
x
liminf
n
[(x) a
n
(x)] =
x
_
(x) + liminf
n
[a
n
(x)]
_
=
x
_
(x) limsup
n
a
n
(x)
_
=
x
(x) .
This then implies that
8.1 Rearrangements 73
x
(x)
x
(x) limsup
n
_
x
a
n
(x)
_
and hence that
limsup
n
_
x
a
n
(x)
_
0
as before.
Now suppose that (Z, ||) is a Banach space and a : Z is a function
such that
x
|a (x)| < .
Theorem 8.10. Let
1
(, Z) be the space of functions a : Z such that
|a|
1
:=
x
|a (x)| < .
Then;
1.
_
1
(, Z) , ||
1
_
is a Banach space.
2. The set D = a : Z : #(x : a (x) ,= 0 < ) is dense subspace
of
1
(, Z) .
3. For a D we may dene
x
a (x) =
x:a(x),=0
a (x) Z.
Since _
_
_
_
_
x
a (x)
_
_
_
_
_
x
|a (x)| = |a|
1
,
this linear operator has a unique continuous extension to a linear operator
:
1
(, Z) Z.
1. Let a
n
n=1
be a Cauchy sequence in
1
(, Z) . Since
|a
n
(x) a
m
(x)| |a
n
a
m
|
1
0 as m, n ,
it follows that lim
n
a
n
(x) =: a (x) exists for all x . Moreover, using
Fatous Lemma 8.8,
|a a
n
|
1
=
x
|a (x) a
n
(x)|
=
x
liminf
m
|a
m
(x) a
n
(x)|
liminf
m
x
|a
m
(x) a
n
(x)|
= liminf
m
|a
m
a
n
|
1
0 as n .
2. Choose
n

f
such that
n
and set a
n
(x) = a (x) 1
n
(x) for all
n N and x . Then
lim
n
|a a
n
|
1
= 0 by DCT.
3. This is a consequence of the BLT theorem.
Theorem 8.11 (DCT II). Suppose that a
n
n=1

1
(, Z) and A(x) :=
sup
n
|a
n
(x)| satisfy;
1. a (x) := lim
n
a
n
(x) existing in Z for all x , [Conict of notation]and
2.
x
A(x) < ,
then
lim
n
|a a
n
|
1
= 0 and lim
n
x
a
n
(x) =
x
a (x) .
Proof. It suces to prove the rst assertion since the sum operation is
continuous relative to the ||
1
norm on
1
(, Z) . Since
|a (x)| = lim
n
|a
n
(x)| A(x)
it follows that
|a (x) a
n
(x)| |a (x)| +|a
n
(x)| 2A(x) .
Since
lim
n
|a (x) a
n
(x)| =
_
_
_a (x) lim
n
a
n
(x)
_
_
_ = |a (x) a (x)| = 0 for all x ,
we may use the dominated convergence theorem for sums (Theorem 8.9) to
conclude,
lim
n
|a a
n
|
1
= lim
n
x
|a (x) a
n
(x)| =
x
lim
n
|a (x) a
n
(x)| = 0.
Corollary 8.12. Suppose (Z, ||) and a
1
(, Z) . If
n
, then
lim
n
xn
a (x) =
x
a (x) .
Proof. Let a
n
(x) := 1
n
(x) a (x) for all n. Then |a
n
(x)| |a (x)| and
x
|a (x)| < . Since lim
n
a
n
(x) = a (x) for all x , it follows by
DCT that
lim
n
xn
a (x) = lim
n
x
a
n
(x) =
x
lim
n
a
n
(x) =
x
a (x) .
Corollary 8.13. Suppose (Z, ||) is a Banach space, is a countable set, and
a
1
(, Z) . If =
n=1
n
, then
x
a (x) =
n=1
xn
a (x) .
[We permit
n
= for some n in this result with the convention that
x
a (x) = 0.]
Proof. We have
n=1
n
a (x) = lim
N
N
n=1
n
a (x) = lim
N
N
n=1
n
a (x) =
x
a (x) .
because
N
n=1
n
as N . [Bruce: we need to prove the nite additivity
here, namely that if = A B, then
x
a (x) =
xA
a (x) +
xB
a (x) .
But this is simple, since a = 1
A
a + 1
B
a and
x
1
A
a =
xA
a
where this last equality follow by choosing
n

f
such that
n
so that
x
1
A
a = lim
n
xn
1
A
a = lim
n
xnA
a =
xA
a.
Corollary 8.14. If
n=1
x
|a
n
(x)| < , then
n=1
x
a
n
(x) =
n=1
a
n
(x) Z.
Proof. First observe that
_
_
_
_
_
N
n=1
a
n
(x)
_
_
_
_
_
n=1
|a
n
(x)|
n=1
|a
n
(x)|
1
(, Z)
and lim
N
N
n=1
a
n
(x) =
n=1
a
n
(x) . Therefore by DCT,
n=1
x
a
n
(x) = lim
N
N
n=1
x
a
n
(x) = lim
N
x
N
n=1
a
n
(x)
=
x
lim
N
N
n=1
a
n
(x) =
n=1
a
n
(x) .
Better proof. By assumption,
n=1
|a
n
|
1
< =
n=1
a
n
converges in
1
(, Z) .
Therefore,
n=1
a
n
_
=
n=1
a
n
.
Exercise 8.1 (Dierentiating past an innite sum). Suppose that
n=0
sup
x
[f
t
n
(x)[ < and
n=0
f
n
(0) exists.
Then S (x) :=
n=0
f
n
(x) exists and
S
t
(x) =
n=0
f
t
n
(x) . (8.1)
Exercise 8.2 (Dierentiating past a limit). Suppose lim
n
f
n
(x) =
f (x) and f
t
n
g uniformly. Show f
t
(x) = g (x) , i.e. we have in this case
that
d
dx
lim
n
f
n
(x) = lim
n
d
dx
f
n
(x) .
Exercise 8.3. Suppose that
f (z) :=
n=0
a
n
z
n
has radius of convergence R. Show f
t
(z) =
n=0
na
n
z
n
for all [z[ < R and the
radius of convergence of f
t
is still R.
8.2 Double Sequences 75
Theorem 8.15 (Changing the center of a power series). Suppose that
A(z) =
n=0
an
n!
z
n
has radius of convergence R > 0. For [z[ < R and [h[ <
R [z[ we have
A(z +h) =
m=0
1
m!
a
m
(z) h
m
where
a
m
(z) :=
n=m
a
n
(n m)!
z
nm
= A
(m)
(z) .
Proof. Working formally for the moment,
A(z +h) =
n=0
a
n
n!
(z +h)
n
=
n=0
a
n
n!
n
m=0
_
n
m
_
z
nm
h
m
=
0mn<
a
n
m! (n m)!
z
nm
h
m
=
m=0
1
m!
_

n=m
a
n
(n m)!
z
nm
_
h
m
which gives the result provided the interchange of order of sums was permissible.
To check this we consider,
0mn<
a
n
m! (n m)!
z
nm
h
m
0mn<
[a
n
[
m! (n m)!
[z[
nm
[h[
m
=
n=0
[a
n
[
n!
([z[ +[h[)
n
<
as we know for all 0 < R that
n=0
]an]
n!

n
< . Hence we may apply
Fubinis theorem for sums in order to conclude the result.
8.2 Double Sequences
In this chapter we will consider doubly indexed sequences, S
m,n
m,n=1
, of
complex numbers.
1
To be more precise S
m,n
m,n=1
is simply a function from
N
2
to C. In this chapter we are interested in the following limits;
lim
m
lim
n
S
m,n
, lim
n
lim
m
S
m,n
, lim
mn
S
m,n
, and lim
mn
S
m,n
,
where the last two limits are dened as follows.
1
Much of what we will say holds for sequences taking values in complete metric
spaces to be covered later.
Denition 8.16. Suppose that S
m,n
m,n=1
is a sequence of complex numbers
(or more generally elements of a metric space). We say lim
mn
S
m,n
= L
i for all > 0 there exists N N such that
[L S
m,n
[ for all m n N.
We say lim
mn
S
m,n
= L exists i for all > 0 there exists N N such
that
[L S
m,n
[ for all m n N.
Clearly lim
mn
S
m,n
= L implies lim
mn
S
m,n
= L but the converse
is not true. We are mostly interested in nding sucient conditions in order for
iterated limits to be equal, i.e. for
lim
m
lim
n
S
m,n
= lim
n
lim
m
S
m,n
.
Example 8.17 (Switching Limits is Dangerous I). If S
m,n
=
1
1+
m
n
, then
lim
m
S
m,n
= 0 (but not uniformly in n) so that lim
n
lim
m
S
m,n
= 0
while lim
n
S
m,n
= 1 and lim
m
lim
n
S
m,n
= 1. In order to visualize
better what is going on here let us make the change of variables, x =
1
m
and
y =
1
n
, i.e. let
S (x, y) = S
m,n
=
1
1 +
y
x
=
x
x +y
whose plot appears in Figure 8.17.
A plot of S
m,n
=
1
1+
m
n
in tems of the variables
1
m
and
1
n
.
With this change of variables, m x 0 and n y 0
and m = and n = correspond to x = 0 and y = 0 respectively. It is now
quite clearly that lim
x0
A(x, y) = 0 for all y > 0 and lim
y0
A(x, y) = 1 for
all x > 1.
Example 8.18 (Switching Limits is Dangerous II). Let
S
m,n
:=
(1)
m+1
1 +
m
n
=
cos (m)
1 +
m
n
.
Then lim
m
S
m,n
= 0 (but not uniformly in n) so that
lim
n
lim
m
S
m,n
= 0. We also have lim
n
S
m,n
= (1)
m
and
lim
m
lim
n
S
m,n
= lim
m
(1)
m
does not exists. In order to visualize
better what is going on here let us again make the change of variables, x =
1
m
and y =
1
n
, i.e. let
S (x, y) = S
m,n
=
x
x +y
cos
_
x
_
A plot of S
m,n
=
cos(m)
1+
m
n
1
m
and
1
n
.
Example 8.19 (Switching Limits is Dangerous III). If
S
m,n
:=
(1)
n
m
+
(1)
m
n
=
cos (n)
m

cos (m)
n
and we let x =
1
m
and y =
1
n
. Then
S (x, y) = S
m,n
= xcos (/y) y cos (/x)
8.2 Double Sequences 77
A plot of
cos(n)
m

cos(m)
n
1
m
and
1
n
.
In this case, lim
mn
S
m,n
= 0 exists but lim
m
S
m,n
and lim
n
S
m,n
do not exist!
Remark 8.20. However, if lim
mn
S
m,n
exists then we do always have,
lim
mn
S
m,n
= lim
m
limsup
n
S
m,n
= lim
m
liminf
n
S
m,n
.
On the other hand if lim
mn
S
m,n
exists then we will have
lim
mn
S
m,n
= lim
m
lim
n
S
m,n
= lim
n
lim
m
S
m,n
.
One way to avoid these types of behaviors is to assume S
m,n
0 and is
increasing in each index.
Example 8.21 (Monotinicity is good). If
S
m,n
:=
1
1 +
1
n
+
1
m
2
and we let x =
1
m
and y =
1
n
. Then
S (x, y) = S
m,n
=
1
1 +y +x
2
A plot of
1
1+
1
n
+
1
m
2
1
m
and
1
n
.
This example is covered by Theorem 8.22 below.
Theorem 8.22 (Equality of monotone iterated limits). Suppose that
S
m,n
0 for all m, n N and S
m+1,n
S
m,n
and S
m,n+1
S
m,n
for all
m, n N. Then
L := lim
n
lim
m
S
m,n
= lim
m
lim
n
S
m,n
= sup
(m,n)
S
m,n
(8.2)
where all limits exist but may take on the value innity. Moreover
lim
mn
S
m,n
= L.
Proof. Because S
m,n
is increasing in each of its variables, we nd,
lim
n
lim
m
S
m,n
= sup
n
sup
m
S
m,n
and lim
m
lim
n
S
m,n
= sup
m
sup
n
S
m,n
and therefore Eq. (8.2) follows from Corollary 8.2 of the sup-sup Theorem. So
it only remains to show lim
mn
S
m,n
= L.
Cases 1. If L = , then for any N N, there exists (m
N
, n
N
) such that
S
mN,nN
N and since S
m,n
is increasing in each of its variables it follows that
S
m,n
N for all m m
N
and n n
N
and it follows that lim
mn
S
m,n
= .
Case 2. L < . In this case if > 0 is given, there exists (m
, n
) such that
S
m,n
L . Since S
m,n
is increasing in each of its variables it follows that
L S
m,n
L for all m m
and n n
and so [L S
m,n
[ for all m, n m
. So by denition, lim
mn
S
m,n
=
L.
Exercise 8.4. If f
n
(x)
n=1
is a sequence of increasing continuous functions
on 1 such that f
n
(x) f (x) < as n , then f (x) is continuous and
increasing as well.
Exercise 8.5. Show lim
mn
S
m,n
= L i for all > 0 there exists M, N N
such
[L S
m,n
[ for all m M and n N.
Lemma 8.23. Suppose that S
m,n
m,n=1
is Cauchy in the sense that for all
> 0 there exists M, N N such that
[S
m,n
S
m
,n
[ for all m, m
t
M and n, n
t
N. (8.3)
Then lim
mn
S
m,n
= L exists.
Proof. Let s
n
:= S
n,n
. Then the assumption shows that s
n
n=1
is a Cauchy
sequence and hence convergent. Let L := lim
n
s
n
. Now take m
t
= n
t
and
then let n
t
in Eq. (8.3) in order to learn,
[S
m,n
L[ for all m M and n N.
From this we conclude that lim
mn
S
m,n
= L.
Another way to ensure that the iterated limits are equal is to assume some
uniformity in one of the limits as in the next key theorem. [This theorem will
be used in one guise or another repeatedly throughout these notes.]
Theorem 8.24. Suppose that S
m,n
m,n=1
is a sequence of complex numbers
(or more generally elements of a complete metric space (X, )). Assume that
S
m,
:= lim
n
S
m,n
exists uniformly in m and
S
,n
:= lim
m
S
m,n
exists pointwise in n.
Then lim
m
lim
n
S
m,n
, lim
n
lim
m
S
m,n
, and lim
mn
S
m,n
all
exist and are equal, i.e.
L := lim
m
lim
n
S
m,n
= lim
n
lim
m
S
m,n
= lim
mn
S
m,n
.
[In words, the existence of limits in both variables along with uniformity in one
of the variables implies the iterated limits exists and agree.]
Proof. Let > 0 be given. Choose N N such that
sup
m
[S
m,
S
m,n
[ for all n N.
Now choose M N such that
[S
,N
S
m,N
[ for all m M.
Then for n N and m M we have,
[S
m,n
S
,N
[ [S
m,n
S
m,
[ +[S
m,
S
,N
[
[S
m,n
S
m,
[ +[S
m,
S
m,N
[ +[S
m,N
S
,N
[
+ + = 3.
Therefore it follows that
[S
m,n
S
m
,n
[ 6 for all m, m
t
M and n, n
t
N.
Hence S
m,n
m,n=1
is Cauchy and therefore by Lemma 8.23 we know that
L := lim
mn
S
m,n
exists, i.e. for all > 0 there exists M, N N such that
[S
m,n
L[ for all m M and n N.
Letting m above then shows,
[S
,n
L[ for all and n N
and therefore lim
n
S
,n
= L. Similarly one shows lim
m
S
m,
= L as
well.
Metric space proof. Let > 0 be given. Choose N N such that
sup
m
(S
m,
, S
m,n
) for all n N.
Now choose M N such that
(S
,N
, S
m,N
) for all m M.
Then for n N and m M we have,
(S
m,n
, S
,N
) (S
m,n
, S
m,
) + (S
m,
, S
,N
)
(S
m,n
, S
m,
) + (S
m,
, S
m,N
) + (S
m,N
, S
,N
)
+ + = 3.
Therefore it follows that
8.3 Iterated Limits 79
(S
m,n
, S
m
,n
) 6 for all m, m
t
M and n, n
t
N.
Hence S
m,n
m,n=1
is Cauchy and therefore by Lemma 8.23 we know that
L := lim
mn
S
m,n
exists, i.e. for all > 0 there exists M, N N such that
(S
m,n
, L) for all m M and n N.
Letting m above then shows,
(S
,n
, L) for all and n N
and therefore lim
n
S
,n
= L. Similarly one shows lim
m
S
m,
= L as
well.
Remark 8.25. The previous theorem is rather easy to understand intuitively as
the following picture indicates the strategy of the proof.
8.3 Iterated Limits
Theorem 8.26. Let a
m,n
m,n=1
be a sequence of complex numbers and as-
sume that
lim
m
a
m,n
= A
n
exists uniformly in n
lim
n
A
n
= L exists
then lim
mn
a
m,n
= L and in particular,
lim
m
lim
n
a
m,n
= lim
n
lim
m
a
m,n
.
Idea.. The rst assumption guarantees the rows of a
m,

= A
for large
m. The second assertion say that A
n

= L for large n. Thus we must have
a
m,n

= A
n

= L for large m and n.
Theorem 8.27. Let a
m,n
m,n=1
be a sequence of complex numbers and as-
sume that
lim
m
a
m,n
= a
n
exists uniformly in n
lim
n
a
m,n
= b
m
exists
lim
m
b
m
= B exists .
Then, lim
n
a
n
= B. In other words under the above conditions,
lim
m
lim
n
a
m,n
= lim
n
lim
m
a
m,n
.
Moreover lim
mn
a
m,n
= B as well.
Idea. As above we know that a
m,

= a
for m large. We are also given that

lim
n
a
m,n

= B for m large. Thus B

= lim
n
a
m,n

= lim
n
a
n
for m
large. Hence we may now apply the previous theorem.
[Details] We have
[B a
n
[ [B b
m
[ +[b
m
a
m,n
[ +[a
m,n
a
n
[
[B b
m
[ +[b
m
a
m,n
[ + sup
k
[a
m,k
a
k
[
and then taking limsup
n
of this inequality shows,
limsup
n
[B a
n
[ [B b
m
[ + sup
k
[a
m,k
a
k
[ 0 as m .
We now prove the second assertion. For this we have,
[B a
m,n
[ [B a
n
[ +[a
n
a
m,n
[
[B a
n
[ + sup
k
[a
k
a
m,k
[ .
Thus given > 0 there exists M N and N N such that
[B a
m,n
[ [B a
n
[ + sup
k
[a
k
a
m,k
[ +
for n N and m M. This proves the stronger statement that
lim
mn
a
m,n
= B, i.e. a
m,n
is near B as long as both m, n are suciently
large.
Exercise 8.6. Suppose that f
n
f uniformly and f
n
is continuous for all n,
then f is continuous. [Use sequential notion of continuity here.]
8.4 Double Sums and Continuity of Sums
Here are a couple of very useful consequences of these theorems.
Theorem 8.28 (Monotone convergence theorem for sums). Let
a
k,n
k,n=1
be a sequence of non-negative numbers, assume that a
k,n+1
a
k,n
for all k, n. Then
lim
n
k=1
a
k,n
=
k=1
lim
n
a
k,n
= lim
mn
m
k=1
a
k,n
.
Proof. Let S
m,n
:=
m
k=1
a
k,n
, then S
m,n
m,n=1
satises the hypothesis of
Theorem 8.22 and the conclusions now follows from that Theorem upon noting
that
lim
n
k=1
a
k,n
= lim
n
lim
m
S
m,n
and
k=1
lim
n
a
k,n
= lim
m
lim
n
S
m,n
.
Theorem 8.29 (Tonellis Theorem for sums). Let a
k,l
k,l=1
be any se-
quence of non-negative numbers, then
k=1
l=1
a
k,l
=
l=1
k=1
a
k,l
.
Proof. Apply Theorem 8.22 with S
m,n
:=
m
k=1
n
l=1
a
k,l
.
Theorem 8.30 (Domintated convergence theorem for sums). Let
a
k,n
k,n=1
be a sequence of complex numbers such that lim
n
a
k,n
= a
k
exists for all n and there exists a summable dominating sequence, M
k
, such
that [a
k,n
[ M
k
for all k, n N. Then
lim
n
k=1
a
kn
=
k=1
lim
n
a
kn
= lim
mn
m
k=1
a
k,n
. (8.4)
Proof. Let S
m,n
:=
m
k=1
a
k,n
, then S
m,n
m,n=1
satises the hypothesis
of Theorem 8.27. Indeed,
S
m,n

k=1
a
k,n
k=m+1
a
k,n
k=m+1
[a
k,n
[
k=m+1
M
k
and hence,
sup
n
S
m,n

k=1
a
k,n
k=m+1
M
k
0 as m .
Therefore S
m,n

k=1
a
k,n
uniformly in n as m . Moreover,
lim
n
S
m,n
=
m
k=1
lim
n
a
k,n
=
m
k=1
a
k

k=1
a
k
as m .
Thus the hypothesis of Theorem 8.27 are satised and so we may conclude,
lim
m
lim
n
S
m,n
= lim
n
lim
m
S
m,n
= lim
mn
S
m,n
=
k=1
a
k
which is exactly Eq. (8.4).
Theorem 8.31 (Fubinis Theorem for sums). Let a
k,l
k,l=1
be any se-
quence of complex numbers. If
k,l=1
[a
k,l
[ < , then
k=1
l=1
a
k,l
=
k,l=1
a
k,l
=
l=1
k=1
a
k,l
.
8.5 Sums of positive functions
In this and the next few sections, let X and Y be two sets. We will write
f
X
to denote that is a nite subset of X and write 2
X
f
for those
f
X.
Denition 8.32. Suppose that a : X [0, ] is a function and F X is a
subset, then
F
a =
xF
a (x) := sup
_
x
a (x) :
f
F
_
.
Remark 8.33. Suppose that X = N = 1, 2, 3, . . . and a : X [0, ], then
N
a =
n=1
a(n) := lim
N
N
n=1
a(n).
Indeed for all N,
N
n=1
a(n)

N
a, and thus passing to the limit we learn
that
n=1
a(n)
N
a.
8.5 Sums of positive functions 81
Conversely, if
f
N, then for all N large enough so that 1, 2, . . . , N,
we have
N
n=1
a(n) which upon passing to the limit implies that
n=1
a(n).
Taking the supremum over in the previous equation shows
N
a
n=1
a(n).
Remark 8.34. Suppose a : X [0, ] and
X
a < , then x X : a (x) > 0
is at most countable. To see this rst notice that for any > 0, the set x :
a (x) must be nite for otherwise
X
a = . Thus
x X : a (x) > 0 =
_
k=1
x : a (x) 1/k
which shows that x X : a (x) > 0 is a countable union of nite sets and thus
countable by Lemma ??.
Lemma 8.35. Suppose that a, b : X [0, ] are two functions, then
X
(a +b) =
X
a +
X
b and
X
a =
X
a
for all 0.
I will only prove the rst assertion, the second being easy. Let
f
X be a
nite set, then
(a +b) =
a +
X
a +
X
b
which after taking sups over shows that
X
(a +b)
X
a +
X
b.
Similarly, if ,
f
X, then
a +
a +
b =
(a +b)
X
(a +b).
Taking sups over and then shows that
X
a +
X
b
X
(a +b).
Lemma 8.36. Let X and Y be sets, R X Y and suppose that a : R

1
is a function. Let
x
R := y Y : (x, y) R and R
y
:= x X : (x, y) R .
Then
sup
(x,y)R
a(x, y) = sup
xX
sup
yxR
a(x, y) = sup
yY
sup
xRy
a(x, y) and
inf
(x,y)R
a(x, y) = inf
xX
inf
yxR
a(x, y) = inf
yY
inf
xRy
a(x, y).
(Recall the conventions: sup = and inf = +.)
Proof. Let M = sup
(x,y)R
a(x, y), N
x
:= sup
yxR
a(x, y). Then a(x, y)
M for all (x, y) R implies N
x
= sup
yxR
a(x, y) M and therefore that
sup
xX
sup
yxR
a(x, y) = sup
xX
N
x
M. (8.5)
Similarly for any (x, y) R,
a(x, y) N
x
sup
xX
N
x
= sup
xX
sup
yxR
a(x, y)
and therefore
M = sup
(x,y)R
a(x, y) sup
xX
sup
yxR
a(x, y) (8.6)
Equations (8.5) and (8.6) show that
sup
(x,y)R
a(x, y) = sup
xX
sup
yxR
a(x, y).
The assertions involving inmums are proved analogously or follow from what
we have just proved applied to the function a.
Theorem 8.37 (Monotone Convergence Theorem for Sums). Suppose
that f
n
: X [0, ] is an increasing sequence of functions and
f (x) := lim
n
f
n
(x) = sup
n
f
n
(x) .
Then
lim
n
X
f
n
=
X
f
Proof. We will give two proofs.
First proof. Let
2
X
f
:= A X : A
f
X.
Then
Fig. 8.1. The x and y slices of a set R X Y.
lim
n
X
f
n
= sup
n
X
f
n
= sup
n
sup
2
X
f
f
n
= sup
2
X
f
sup
n
f
n
= sup
2
X
f
lim
n
f
n
= sup
2
X
f
lim
n
f
n
= sup
2
X
f
f =
X
f.
Second Proof. Let S
n
=
X
f
n
and S =
X
f. Since f
n
f
m
f for all
n m, it follows that
S
n
S
m
S
which shows that lim
n
S
n
exists and is less that S, i.e.
A := lim
n
X
f
n

X
f. (8.7)
Noting that
f
n

X
f
n
= S
n
A for all
f
X and in particular,
f
n
A for all n and
f
X.
Letting n tend to innity in this equation shows that
f A for all
f
X
and then taking the sup over all
f
X gives
X
f A = lim
n
X
f
n
(8.8)
which combined with Eq. (8.7) proves the theorem.
Lemma 8.38 (Fatous Lemma for Sums). Suppose that f
n
: X [0, ] is
a sequence of functions, then
X
lim inf
n
f
n
lim inf
n
X
f
n
.
Proof. Dene g
k
:= inf
nk
f
n
so that g
k
liminf
n
f
n
as k . Since
g
k
f
n
for all n k,
X
g
k

X
f
n
for all n k
and therefore
X
g
k
lim inf
n
X
f
n
for all k.
We may now use the monotone convergence theorem to let k to nd
X
lim inf
n
f
n
=
X
lim
k
g
k
MCT
= lim
k
X
g
k
lim inf
n
X
f
n
.
Remark 8.39. If A =
X
a < , then for all > 0 there exists

f
X such
that
A
a A
for all
f
X containing
or equivalently,
(8.9)
for all
f
X containing
. Indeed, choose
so that
a A.
8.6 Sums of complex functions
Denition 8.40. Suppose that a : X C is a function, we say that
X
a =
xX
a (x)
exists and is equal to A C, if for all > 0 there is a nite subset
X
such that for all
f
X containing
we have
.
8.6 Sums of complex functions 83
The following lemma is left as an exercise to the reader.
Lemma 8.41. Suppose that a, b : X C are two functions such that
X
a
and
X
b exist, then
X
(a +b) exists for all C and
X
(a +b) =
X
a +
X
b.
Denition 8.42 (Summable). We call a function a : X C summable if
X
[a[ < .
Proposition 8.43. Let a : X C be a function, then
X
a exists i
X
[a[ <
, i.e. i a is summable. Moreover if a is summable, then
X
a
X
[a[ .
Proof. If
X
[a[ < , then
X
(Re a)
< and
X
(Ima)
<
and hence by Remark 8.39 these sums exists in the sense of Denition 8.40.
Therefore by Lemma 8.41,
X
a exists and
X
a =
X
(Re a)
+
X
(Re a)
+i
_
X
(Ima)
+
X
(Ima)
_
.
Conversely, if
X
[a[ = then, because [a[ [Re a[ +[Ima[ , we must have
X
[Re a[ = or
X
[Ima[ = .
Thus it suces to consider the case where a : X 1 is a real function. Write
a = a
+
a
where
a
+
(x) = max(a (x) , 0) and a
(x) = max(a (x) , 0). (8.10)

Then [a[ = a
+
+a
and
=
X
[a[ =
X
a
+
+
X
a
which shows that either
X
a
+
= or
X
a
= . Suppose, with out loss

of generality, that
X
a
+
= . Let X
t
:= x X : a (x) 0, then we
know that
X
a = which means there are nite subsets
n
X
t
X
such that
n
a n for all n. Thus if
f
X is any nite set, it follows that
lim
n
n
a = , and therefore
X
a can not exist as a number in 1.
Finally if a is summable, write
X
a = e
i
with 0 and 1, then
X
a
= = e
i
X
a =
X
e
i
a
=
X
Re
_
e
i
a
X
_
Re
_
e
i
a
_
+
Re
_
e
i
a
e
i
a
X
[a[ .
Alternatively, this may be proved by approximating
X
a by a nite sum and
then using the triangle inequality of [[ .
Remark 8.44. Suppose that X = N and a : N C is a sequence, then it is not
necessarily true that
n=1
a(n) =
nN
a(n). (8.11)
This is because
n=1
a(n) = lim
N
N
n=1
a(n)
depends on the ordering of the sequence a where as
nN
a(n) does not. For
example, take a(n) = (1)
n
/n then
nN
[a(n)[ = i.e.
nN
a(n) does not
exist while
n=1
a(n) does exist. On the other hand, if
nN
[a(n)[ =
n=1
[a(n)[ <
then Eq. (8.11) is valid.
Theorem 8.45 (Dominated Convergence Theorem for Sums). Sup-
pose that f
n
: X C is a sequence of functions on X such that f (x) =
lim
n
f
n
(x) C exists for all x X. Further assume there is a dominating
function g : X [0, ) such that
[f
n
(x) [ g (x) for all x X and n N (8.12)
and that g is summable. Then
lim
n
xX
f
n
(x) =
xX
f (x) . (8.13)
Proof. Notice that [f[ = lim[f
n
[ g so that f is summable. By considering
the real and imaginary parts of f separately, it suces to prove the theorem in
the case where f is real. By Fatous Lemma,
X
(g f) =
X
lim inf
n
(g f
n
) lim inf
n
X
(g f
n
)
=
X
g + lim inf
n
_
X
f
n
_
.
Since liminf
n
(a
n
) = limsup
n
a
n
, we have shown,
X
g
X
f
X
g +
_
liminf
n
X
f
n
limsup
n
X
f
n
and therefore
lim sup
n
X
f
n

X
f lim inf
n
X
f
n
.
This shows that lim
n
X
f
n
exists and is equal to
X
f.
Proof. (Second Proof.) Passing to the limit in Eq. (8.12) shows that [f[ g
and in particular that f is summable. Given > 0, let
f
X such that
X\
g .
Then for
f
X such that ,
f
n
(f f
n
)
[f f
n
[ =
[f f
n
[ +
\
[f f
n
[
[f f
n
[ + 2
\
g
[f f
n
[ + 2.
and hence that
f
n
[f f
n
[ + 2.
Since this last equation is true for all such
f
X, we learn that
X
f
X
f
n
[f f
n
[ + 2
which then implies that
lim sup
n
X
f
X
f
n
lim sup
n
[f f
n
[ + 2
= 2.
Because > 0 is arbitrary we conclude that
lim sup
n
X
f
X
f
n
= 0.
which is the same as Eq. (8.13).
Remark 8.46. Theorem 8.45 may easily be generalized as follows. Suppose
f
n
, g
n
, g are summable functions on X such that f
n
f and g
n
g pointwise,
[f
n
[ g
n
and
X
g
n

X
g as n . Then f is summable and Eq. (8.13)
still holds. For the proof we use Fatous Lemma to again conclude
X
(g f) =
X
lim inf
n
(g
n
f
n
) lim inf
n
X
(g
n
f
n
)
=
X
g + lim inf
n
_
X
f
n
_
and then proceed exactly as in the rst proof of Theorem 8.45.
8.7 Iterated sums and the Fubini and Tonelli Theorems
Let X and Y be two sets. The proof of the following lemma is left to the reader.
Lemma 8.47. Suppose that a : X C is function and F X is a subset such
that a (x) = 0 for all x / F. Then
F
a exists i
X
a exists and when the
sums exists,
X
a =
F
a.
Theorem 8.48 (Tonellis Theorem for Sums). Suppose that a : X Y
[0, ], then
XY
a =
Y
a =
X
a.
8.8
p
spaces, Minkowski and H older Inequalities 85
Proof. It suces to show, by symmetry, that
XY
a =
Y
a
Let
f
X Y. Then for any
f
X and
f
Y such that , we
have
a =
Y
a
Y
a,
i.e.
Y
a. Taking the sup over in this last equation shows
XY
a
Y
a.
For the reverse inequality, for each x X choose
x
n

f
Y such that
x
n
Y as
n and
yY
a(x, y) = lim
n
y
x
n
a(x, y).
If
f
X is a given nite subset of X, then
yY
a(x, y) = lim
n
yn
a(x, y) for all x
where
n
:=
x
x
n

f
Y. Hence
yY
a(x, y) =
x
lim
n
yn
a(x, y) = lim
n
yn
a(x, y)
= lim
n
(x,y)n
a(x, y)
XY
a.
Since is arbitrary, it follows that
xX
yY
a(x, y) = sup
f X
yY
a(x, y)
XY
a
which completes the proof.
Theorem 8.49 (Fubinis Theorem for Sums). Now suppose that a : X
Y C is a summable function, i.e. by Theorem 8.48 any one of the following
equivalent conditions hold:
1.
XY
[a[ < ,
2.
Y
[a[ < or
3.
X
[a[ < .
Then
XY
a =
Y
a =
X
a.
Proof. If a : X 1 is real valued the theorem follows by applying Theorem
8.48 to a
the positive and negative parts of a. The general result holds for
complex valued functions a by applying the real version just proved to the real
and imaginary parts of a.
8.8
p
spaces, Minkowski and H older Inequalities
In this chapter, let : X (0, ) be a given function. Let F denote either 1
or C. For p (0, ) and f : X F, let
|f|
p
:=
_
xX
[f (x)[
p
(x)
_
1/p
and for p = let
|f|
= sup [f (x)[ : x X .
Also, for p > 0, let
p
() = f : X F : |f|
p
< .
In the case where (x) = 1 for all x X we will simply write
p
(X) for
p
().
Denition 8.50. A norm on a vector space Z is a function || : Z [0, )
such that
1. (Homogeneity) |f| = [[ |f| for all F and f Z.
3. (Positive denite) |f| = 0 implies f = 0.
A function p : Z [0, ) satisfying properties 1. and 2. but not necessarily
3. above will be called a semi-norm on Z.
A pair (Z, ||) where Z is a vector space and || is a norm on Z is called a
normed vector space.
The rest of this section is devoted to the proof of the following theorem.
Theorem 8.51. For p [1, ], (
p
(), | |
p
) is a normed vector space.
Proof. The only diculty is the proof of the triangle inequality which is
the content of Minkowskis Inequality proved in Theorem 8.58 below.
Proposition 8.52. Let f : [0, ) [0, ) be a continuous strictly increasing
function such that f(0) = 0 (for simplicity) and lim
s
f(s) = . Let g = f
1
and for s, t 0 let
F(s) =
_
s
0
f(s
t
)ds
t
and G(t) =
_
t
0
g(t
t
)dt
t
.
Then for all s, t 0,
st F(s) +G(t)
and equality holds i t = f(s).
Proof. Let
A
s
:= (, ) : 0 f() for 0 s and
B
t
:= (, ) : 0 g() for 0 t
then as one sees from Figure 8.2, [0, s] [0, t] A
s
B
t
. (In the gure: s = 3,
t = 1, A
3
is the region under t = f(s) for 0 s 3 and B
1
is the region to the
left of the curve s = g(t) for 0 t 1.) Hence if m denotes the area of a region
in the plane, then
st = m([0, s] [0, t]) m(A
s
) +m(B
t
) = F(s) +G(t).
As it stands, this proof is a bit on the intuitive side. However, it will become
rigorous if one takes m to be Lebesgue measure on the plane which will be
introduced later.
We can also give a calculus proof of this theorem under the additional as-
sumption that f is C
1
. (This restricted version of the theorem is all we need in
this section.) To do this x t 0 and let
h(s) = st F(s) =
_
s
0
(t f())d.
If > g(t) = f
1
(t), then t f() < 0 and hence if s > g(t), we have
h(s) =
_
s
0
(t f())d =
_
g(t)
0
(t f())d +
_
s
g(t)
(t f())d
_
g(t)
0
(t f())d = h(g(t)).
Combining this with h(0) = 0 we see that h(s) takes its maximum at some
point s (0, g(t)] and hence at a point where 0 = h
t
(s) = t f(s). The only
solution to this equation is s = g(t) and we have thus shown
st F(s) = h(s)
_
g(t)
0
(t f())d = h(g(t))
with equality when s = g(t). To nish the proof we must show
_
g(t)
0
(t
f())d = G(t). This is veried by making the change of variables = g()
and then integrating by parts as follows:
_
g(t)
0
(t f())d =
_
t
0
(t f(g()))g
t
()d =
_
t
0
(t )g
t
()d
=
_
t
0
g()d = G(t).
Fig. 8.2. A picture proof of Proposition 8.52.
Denition 8.53. The conjugate exponent q [1, ] to p [1, ] is q :=
p
p1
with the conventions that q = if p = 1 and q = 1 if p = . Notice that q is
characterized by any of the following identities:
1
p
+
1
q
= 1, 1 +
q
p
= q, p
p
q
= 1 and q(p 1) = p. (8.14)
Lemma 8.54. Let p (1, ) and q :=
p
p1
(1, ) be the conjugate exponent.
Then
st
s
p
p
+
t
q
q
for all s, t 0 (8.15)
with equality if and only if t
q
= s
p
. (See Example ?? below for a generalization
of the inequality in Eq. (8.15).)
8.8
p
spaces, Minkowski and H older Inequalities 87
Proof. Let F(s) =
s
p
p
for p > 1. Then f(s) = s
p1
= t and g(t) = t
1
p1
=
t
q1
, wherein we have used q 1 = p/ (p 1) 1 = 1/ (p 1) . Therefore
G(t) = t
q
/q and hence by Proposition 8.52,
st
s
p
p
+
t
q
q
with equality i t = s
p1
, i.e. t
q
= s
q(p1)
= s
p
.
** For those who do not want to use Proposition 8.52, here is a direct
calculus proof. Fix t > 0 and let
h(s) := st
s
p
p
.
Then h(0) = 0, lim
s
h(s) = and h
t
(s) = t s
p1
which equals zero i
s = t
1
p1
. Since
h
_
t
1
p1
_
= t
1
p1
t
t
p
p1
p
= t
p
p1
t
p
p1
p
= t
q
_
1
1
p
_
=
t
q
q
,
it follows from the rst derivative test that
max h = max
_
h(0) , h
_
t
1
p1
__
= max
_
0,
t
q
q
_
=
t
q
q
.
So we have shown
st
s
p
p

t
q
q
with equality i t = s
p1
.
We also have the following closely related calculus result.
Lemma 8.55. Let a, b > 0 and p, q [1, ) such that
1
p
+
1
q
= 1 and dene
f () :=
a
p
p
+
b
q
q
for > 0.
Then
min
>0
f () = a
1
p
b
1
q
.
which in particular implies [let = 1 and replace a a
p
and b b
p
] that
ab
a
p
p
+
b
q
q
for all a, b 0.
Proof. Since f () if 0 or , it follows that f has a minimum
in (0, ) . We can nd this minimum using the rst derivative test. So
0
set
= f
t
() = a
p1
b
q1
=
b
a
=

p1
q1
=
p+q
.
Evaluating f at this gives the desired minimum;
f () =
q
_
a
p
p+q
+
b
q
_
=
q
_
a
p
b
a
+
b
q
_
=
q
b
=
_
b
a
_
q
p+q
b =
_
b
a
_
q
p+q
b
p+q
p+q
= a
q
p+q
b
p
p+q
= a
1
p
b
1
q
.
where we have used
q
q +p
=
q/ (qp)
1
p
+
1
q
=
1
p
and
p
q +p
=
1
q
.
Theorem 8.56 (Holders inequality). Let p, q [1, ] be conjugate expo-
nents. For all f, g : X F,
|fg|
1
|f|
p
|g|
q
. (8.16)
If p (1, ) and f and g are not identically zero, then equality holds in Eq.
(8.16) i
_
[f[
|f|
p
_
p
=
_
[g[
|g|
q
_
q
. (8.17)
Proof. The proof of Eq. (8.16) for p 1, is easy and will be left to
the reader. The cases where |f|
p
= 0 or or |g|
q
= 0 or are easily dealt
with and are also left to the reader. So we will assume that p (1, ) and
0 < |f|
p
, |g|
q
< . Letting s = [f (x)[ /|f|
p
and t = [g[/|g|
q
in Lemma 8.54
implies
[f (x) g (x)[
|f|
p
|g|
q
1
p
[f (x)[
p
|f|
p
p
+
1
q
[g (x)[
q
|g|
q
q
with equality i
[f (x)[
p
|f|
p
p
= s
p
= t
q
=
[g (x)[
q
|g|
q
q
. (8.18)
Multiplying this equation by (x) and then summing on x gives
|fg|
1
|f|
p
|g|
q
1
p
+
1
q
= 1
with equality i Eq. (8.18) holds for all x X, i.e. i Eq. (8.17) holds.
Denition 8.57. For a complex number C, let
sgn() =
_

]]
if ,= 0
0 if = 0.
For , C we will write sgn() sgn() if sgn() = sgn() or = 0.
Theorem 8.58 (Minkowskis Inequality). If 1 p and f, g
p
()
then
|f +g|
p
|f|
p
+|g|
p
. (8.19)
Moreover, assuming f and g are not identically zero, equality holds in Eq. (8.19)
i
sgn(f) sgn(g) when p = 1 and
f = cg for some c > 0 when p (1, ).
Proof. For p = 1,
|f +g|
1
=
X
[f +g[
X
([f[ +[g[) =
X
[f[ +
X
[g[
with equality i
[f[ +[g[ = [f +g[ sgn(f) sgn(g).
For p = ,
|f +g|
= sup
X
[f +g[ sup
X
([f[ +[g[)
sup
X
[f[ + sup
X
[g[ = |f|
+|g|
.
Now assume that p (1, ). Since
[f +g[
p
(2 max ([f[ , [g[))
p
= 2
p
max ([f[
p
, [g[
p
) 2
p
([f[
p
+[g[
p
)
it follows that
|f +g|
p
p
2
p
_
|f|
p
p
+|g|
p
p
_
< .
Eq. (8.19) is easily veried if |f + g|
p
= 0, so we may assume |f + g|
p
> 0.
Multiplying the inequality,
[f +g[
p
= [f +g[[f +g[
p1
[f[ [f +g[
p1
+[g[[f +g[
p1
(8.20)
by , then summing on x and applying Holders inequality on each term gives
X
[f +g[
p

X
[f[ [f +g[
p1
+
X
[g[ [f +g[
p1
(|f|
p
+|g|
p
)
_
_
_[f +g[
p1
_
_
_
q
. (8.21)
Since q(p 1) = p, as in Eq. (8.14),
|[f +g[
p1
|
q
q
=
X
([f +g[
p1
)
q
=
X
[f +g[
p
= |f +g|
p
p
. (8.22)
Combining Eqs. (8.21) and (8.22) shows
|f +g|
p
p
(|f|
p
+|g|
p
) |f +g|
p/q
p
(8.23)
and solving this equation for |f + g|
p
(making use of Eq. (8.14)) implies Eq.
(8.19). Now suppose that f and g are not identically zero and p (1, ) .
Equality holds in Eq. (8.19) i equality holds in Eq. (8.23) i equality holds in
Eq. (8.21) and Eq. (8.20). The latter happens i
sgn(f) sgn(g) and
_
[f[
|f|
p
_
p
=
[f +g[
p
|f +g|
p
p
=
_
[g[
|g|
p
_
p
. (8.24)
wherein we have used
_
[f +g[
p1
|[f +g[
p1
|
q
_
q
=
[f +g[
p
|f +g|
p
p
.
Finally Eq. (8.24) is equivalent to [f[ = c [g[ with c = (|f|
p
/|g|
p
) > 0 and this
equality along with sgn(f) sgn(g) implies f = cg.
8.9 Exercises
Exercise 8.7. Now suppose for each n N := 1, 2, . . . that f
n
: X 1 is a
function. Let
D := x X : lim
n
f
n
(x) = +
show that
D =
M=1

N=1

nN
x X : f
n
(x) M. (8.25)
Exercise 8.8. Let f
n
: X 1 be as in the last problem. Let
C := x X : lim
n
f
n
(x) exists in 1.
Find an expression for C similar to the expression for D in (8.25). (Hint: use
the Cauchy criteria for convergence.)
8.9 Exercises 89
8.9.1 Limit Problems
Exercise 8.9. Show liminf
n
(a
n
) = limsup
n
a
n
.
Exercise 8.10. Suppose that limsup
n
a
n
= M

1, show that there is a
subsequence a
nk
k=1
of a
n
n=1
such that lim
k
a
nk
= M.
limsup
n
(a
n
+b
n
) limsup
n
a
n
+ limsup
n
b
n
(8.26)
provided that the right side of Eq. (8.26) is well dened, i.e. no or
+ type expressions. (It is OK to have + = or = ,
etc.)
n
0 and b
n
0 for all n N. Show
limsup
n
(a
n
b
n
) limsup
n
a
n
limsup
n
b
n
, (8.27)
8.9.2 Monotone and Dominated Convergence Theorem Problems
Exercise 8.15. Let M < , show there are polynomials p
n
(t) and q
n
(t) for
n N such that
lim
n
sup
0tM
t q
n
(t)
= 0 (8.28)
and
lim
n
sup
]t]M
[[t[ p
n
(t)[ = 0 (8.29)
using the following outline.
1. Let f (x) =
1 x for [x[ 1 and use Taylors theorem with integral

remainder (see Eq. 12.19 of Appendix ??), or analytic function theory if
you know it, to show there are constants
2
c
n
> 0 for n N such that
1 x = 1
n=1
c
n
x
n
for all [x[ < 1. (8.30)
2
In fact cn :=
(2n3)!!
2
n
n!
, but this is not needed.
2. Let q
m
(x) := 1
m
n=1
c
n
x
n
. Use (8.30) to show
n=1
c
n
= 1 and conclude
from this that
lim
m
sup
]x]1
[
1 x q
m
(x) [ = 0. (8.31)
3. Conclude that q
n
(t) :=
M q
n
(1 t/M) and p
n
(t) := q
n
_
t
2
_
for n N
are polynomials verifying Eqs. (8.28) and (8.29) respectively.
Notation 8.59 For u
0
1
n
and > 0, let B
u0
() := x 1
n
: [x u
0
[ <
be the ball in 1
n
centered at u
0
with radius .
Exercise 8.16. Suppose U 1
n
is a set and u
0
U is a point such that
U (B
u0
() u
0
) ,= for all > 0. Let G : U u
0
C be a function
on U u
0
. Show that lim
uu0
G(u) exists and is equal to C,
3
i for all
sequences u
n
n=1
U u
0
which converge to u
0
(i.e. lim
n
u
n
= u
0
) we
have lim
n
G(u
n
) = .
Exercise 8.17. Suppose that Y is a set, U 1
n
is a set, and f : U Y C
is a function satisfying:
1. For each y Y, the function u U f(u, y) is continuous on U.
4
2. There is a summable function g : Y [0, ) such that
[f(u, y)[ g(y) for all y Y and u U.
Show that
F(u) :=
yY
f(u, y) (8.32)
is a continuous function for u U.
Exercise 8.18. Suppose that Y is a set, J = (a, b) 1 is an interval, and
f : J Y C is a function satisfying:
1. For each y Y, the function u f(u, y) is dierentiable on J,
2. There is a summable function g : Y [0, ) such that
u
f(u, y)
g(y) for all y Y and u J.

3
More explicitly, limuu0
G(u) = means for every every > 0 there exists a > 0
such that
|G(u) | < whenever u U (Bu0
() \ {u0}) .
4
To say g := f(, y) is continuous on U means that g : U C is continuous relative
to the metric on 1
n
restricted to U.
3. There is a u
0
J such that
yY
[f(u
0
, y)[ < .
Show:
a) for all u J that
yY
[f(u, y)[ < .
b) Let F(u) :=
yY
f(u, y), show F is dierentiable on J and that
F(u) =
yY
u
f(u, y).
(Hint: Use the mean value theorem.)
8.9.3
p
Exercises
Exercise 8.19. Generalize Proposition 8.52 as follows. Let a [, 0] and
f : 1 [a, ) [0, ) be a continuous strictly increasing function such that
lim
s
f(s) = , f(a) = 0 if a > or lim
s
f(s) = 0 if a = . Also let
g = f
1
, b = f(0) 0,
F(s) =
_
s
0
f(s
t
)ds
t
and G(t) =
_
t
0
g(t
t
)dt
t
.
Then for all s, t 0,
st F(s) +G(t b) F(s) +G(t)
and equality holds i t = f(s). In particular, taking f(s) = e
s
, prove Youngs
inequality stating
st e
s
+ (t 1) ln (t 1) (t 1) e
s
+t ln t t,
where s t := min (s, t) . Hint: Refer to Figures 8.3 and 8.4..
Exercise 8.20. Using dierential calculus, prove the following inequalities
1. For y > 0, let g (x) := xy e
x
for x 1. Use calculus to compute the
maximum of g (x) and use this prove Youngs inequality;
xy e
x
+y ln y y for x 1 and y > 0.
2. For p > 1 and y 0, let g (x) := xy x
p
/p for x 0. Again use calculus to
compute the maximum of g (x) and show that your result gives the following
inequality;
xy
x
p
p
+
y
q
q
for all x, y 0.
where q =
p
p1
, i.e.
1
q
= 1
1
p
.
Fig. 8.3. Comparing areas when t b goes the same way as in the text.
Fig. 8.4. When t b, notice that g(t) 0 but G(t) 0. Also notice that G(t) is no
longer needed to estimate st.
3. Suppose now that u : [0, ) [0, ) is a C
1
- function such that: u(0) = 0,
lim
x
u(x)
x
= , and u
t
(x) > 0 for all x > 0. Show
xy u(x) +v (y) for all x, y 0,
where v (y) = y (u
t
)
1
(y) u
_
(u
t
)
1
(y)
_
. Hint: consider the function,
g (x) := xy u(x) .
9
Topological Considerations
9.1 Closed and Open Sets
Let (X, d) be a metric space.
Denition 9.1. Let (X, d) be a metric space. The open ball B(x, ) X
centered at x X with radius > 0 is the set
B(x, ) := y X : d(x, y) < .
We will often also write B(x, ) as B
x
(). We also dene the closed ball cen-
tered at x X with radius > 0 as the set C
x
() := y X : d(x, y) .
Fig. 9.1. Balls in 1
2
corresponding to the 1 norm, 2 norm, 5 norm, and
1
2

norm.
Fig. 9.2. The ball in C ([0, 1] , 1) of radius 1/4 centered at f (x) = sin
_
x
2
_
are all
the continuous functions whose graphs lie between the green envelope.
92 9 Topological Considerations
Denition 9.2. A set E X is bounded if E B(x, R) for some x X and
R < . A set F X is closed i every convergent sequence x
n
n=1
which is
contained in F has its limit back in F. A set V X is open i V
c
is closed.
We will write F X to indicate F is a closed subset of X and V
o
X to
indicate the V is an open subset of X. We also let
d
denote the collection of
open subsets of X relative to the metric d.
Lemma 9.3. If f : X1 is a continuous function and k 1, then the follow-
ing sets are closed,
A := x X : f (x) k , B := x X : f (x) = k , and
C := x X : f (x) k .
Proof. The proof that A, B, and C are closed all go the same way so let
me just check that A is closed. To this end, suppose that x
n
n=1
is a sequence
in A such that x := lim
n
x
n
exists in X. Since x
n
A, f (x
n
) k and
therefore,
k lim
n
f (x
n
) = f (x)
wherein the last equality we have used the denition of f being continuous. By
denition of A it then follows that x A and so we have checked that A is
closed.
Example 9.4 (Closed Balls). Closed balls are closed. Indeed, we have seen
f (y) := d (x, y) is continuous and therefore
C
x
() := y X : d (x, y) = y X : f (y)
is a closed set. Notice that x = C
x
(0) is a closed set for all x X.
Example 9.5. The following subsets of C are closed;
1. z C : a Imz b for all a b in 1.
2. z C : a Re z b for all a b in 1.
3. z C : Imz = 0 and a Re z b for all a b in 1.
Example 9.6 (Open Balls). Open balls in metric spaces are open sets. Indeed
let f (y) := d (x, y) , then
B
x
()
c
:= X B
x
() = y X : d (x, y) = y X : f (y)
which is closed since f is continuous. Thus B
x
() is open.
Theorem 9.7. The closed subsets of (X, d) have the following properties;
1. X and are closed.
2. If C
I
is a collection of closed subsets of X, then
I
C
is closed in
X.
3. If A and B are closed sets then A B is closed.
Proof. 3. Let z
n
n=1
A B such that lim
n
z
n
=: z exists. Then
z
n
A i.o. or z
n
B i.o. For sake of deniteness say z
n
A i.o. in which case
we may choose a subsequence, w
k
:= z
nk
A for all k. Since lim
k
w
k
= z
and A is closed it follows that z A and hence z AB. Thus we have shown
A B is closed.
Exercise 9.1. Prove item 2. of Theorem 9.7. If C
I
is a collection of closed
subsets of X, then
I
C
is closed in X.
Exercise 9.2. Give an example of a collection of closed subsets, A
n
n=1
, of
C such that
n=1
A
n
is not closed.
Corollary 9.8. Let (X, d) be a metric space. Then the collection of open sub-
sets,
d
, of X satisfy;
1. X and are in
d
.
2.
d
is closed under taking arbitrary unions. i.e. if U
I
is a collection of
open sets then
I
U
is open..
3.
d
is closed under taking nite intersections, i.e. if U and V are open sets
then then U V is open as well.
Exercise 9.3. Prove Corollary 9.8.
Exercise 9.4. Let U be a subset of a metric space (X, d) . Show the following
are equivalent;
1. U is open,
2. for all z U there exists > 0 such that B
z
() U.
3. U can be written as a union of open balls.
Exercise 9.5 (Redundant now). Show that V X is open i for every
x V there is a > 0 such that B
x
() V. In particular show B
x
() is open
for all x X and > 0. Hint: by denition V is not open i V
c
is not closed.
Exercise 9.6. Let (X, d) be a metric space and x
1
, . . . , x
n
be a nite subset
of X. Show X x
1
, . . . , x
n
is an open subset.
Exercise 9.7. Let (X, d) be a complete metric space. Let A X be a subset
of X viewed as a metric space using d[
AA
. Show that (A, d[
AA
) is complete
i A is a closed subset of X.
9.1 Closed and Open Sets 93
Lemma 9.9. For any non empty subset A X, let d
A
(x) := infd(x, a)[a
A, then
[d
A
(x) d
A
(y)[ d(x, y) x, y X (9.1)
and in particular if x
n
x in X then d
A
(x
n
) d
A
(x) as n . Moreover
the set F
:= x X[d
A
(x) is closed in X.
Proof. Let a A and x, y X, then
d
A
(x) d(x, a) d(x, y) +d(y, a).
Take the inmum over a in the above equation shows that
d
A
(x) d(x, y) +d
A
(y) x, y X.
Therefore, d
A
(x) d
A
(y) d(x, y) and by interchanging x and y we also have
that d
A
(y) d
A
(x) d(x, y) which implies Eq. (9.1). If x
n
x X, then by
Eq. (9.1),
[d
A
(x) d
A
(x
n
)[ d(x, x
n
) 0 as n
so that lim
n
d
A
(x
n
) = d
A
(x) . Now suppose that x
n
n=1
F
and x
n
x
in X, then
d
A
(x) = lim
n
d
A
(x
n
)
since d
A
(x
n
) for all n. This shows that x F
and hence F
is closed.
Denition 9.10. A subset A X is a neighborhood of x if there exists an
open set V
o
X such that x V A. We will say that A X is an open
neighborhood of x if A is open and x A.
Example 9.11. Let x X and > 0, then C
x
() and B
x
()
c
are closed subsets
of X. For example if y
n
n=1
C
x
() and y
n
y X, then d (y
n
, x) for
all n and using Corollary 6.46 it follows d (y, x) , i.e. y C
x
() . A similar
proof shows B
x
()
c
is closed, see Exercise 9.5.
Lemma 9.12 (Approximating open sets from the inside by closed
sets). Let U X be an open set and F
:= x X[d
U
c (x) X be
as in Lemma 9.9. Then F
U as 0.
Proof. It is clear that d
U
c (x) = 0 for x U
c
so that F
U for each > 0

and hence
>0
F
U. Now suppose that x U

o
X. By Exercise 9.5 there
exists an > 0 such that B
x
() U, i.e. d(x, y) for all y U
c
. Hence
x F
and we have shown that U

>0
F
. Finally it is clear that F
whenever
t
.
Denition 9.13. Given a set A contained in a metric space X, let

A X be
the closure of A dened by
A := x X : x
n
A x = lim
n
x
n
.
That is to say

A contains all limit points of A. We say A is dense in X if
A = X, i.e. every element x X is a limit of a sequence of elements from A.

A metric space is said to be separable if it contains a countable dense subset,
D.
Lemma 9.14. For any A X, then
1. A =

A if A is closed.
2.

A = x : d
A
(x) = 0 and

A is closed.
3.

A = x X : A B
x
() ,= > 0 .
4. d
A
(x) > 0 for all x A
c
if A is closed.
Proof. 1. We always have A

A. If A is closed we can not leave A by
taking limits and hence

A A, i.e. A =

A if A is closed.
2. Let F := x : d
A
(x) = 0 which is a closed set since d
A
is continuous. If
x F (i.e. d
A
(x) = 0), there exists x
n
A such that d (x, x
n
) 1/n for all
n N. Thus lim
n
x
n
= x and so x

A. This shows F

A. Conversely if
x

A, there exists x
n
A such that lim
n
x
n
= x and so
d
A
(x) = d
A
_
lim
n
x
n
_
= lim
n
d
A
(x
n
) = lim
n
0 = 0
which shows x F.
2. 3. Since A B
x
() ,= happens i d
A
(x) < we see that A
B
x
() ,= > 0 i d
A
(x) < for all > 0, i.e. i d
A
(x) = 0.
4. If A is closed then A =

A = x X : d
A
(x) = 0 and therefore A
c
=
x X : d
A
(x) > 0 .
Exercise 9.8. If A is a non-empty subset of X, then d
A
= d
A
.
Exercise 9.9. Given A X, show

A is a closed set and in fact
A = F : A F X with F closed. (9.2)

That is to say

A is the smallest closed set containing A.
Denition 9.15. Let (X, d) be a metric space and A be a subset of X.
1. The closure of A is the smallest closed set

A containing A, i.e.
A := F : A F X .
(Because of Exercise 9.9 this is consistent with Denition 9.13 for the clo-
sure of a set in a metric space.)
2. The interior of A is the largest open set A
o
contained in A, i.e.
A
o
= V : V A .
3. A X is a neighborhood of a point x X if x A
o
.
4. The accumulation points of A is the set
acc(A) = x X : V [A x] ,= for all V
x
.
5. The boundary of A is the set bd(A) :=

A A
o
.
6. A is dense in X if

A = X and X is said to be separable if there exists a
countable dense subset of X.
Lemma 9.16. Let (X, d) be a metric space and A be a subset of X, then
A
o
= x X : B
x
(
x
) A for some
x
> 0 .
Proof. Let V := x X : B
x
() A for some =
x
> 0 . If B
x
() A
and y B
x
() and = d (x, y) , then B
y
() B
x
() A which shows
that y V, i.e. B
x
() V. Thus we may write
V = B
x
() : B
x
() A .
This shows that V is an open subset of A. Moreover if W is another open subset
of A, then
W = B
x
() : B
x
() W B
x
() : B
x
() A = V
so that V is the largest open subset contained in A. This completes the proof.
Exercise 9.10. Let (X, ||) be a normed space and d (x, y) := |y x| . Show;
1. C
x
()
o
= B
x
() ,
2. B
x
() = C
x
() ,
3. bd (C
x
()) = bd (C
x
()) = y X : |y x| = .
Example 9.17 (Words of Caution). Let (X, d) be a metric space. It is always
true that B
x
() C
x
() since C
x
() is a closed set containing B
x
(). However,
it is not always true that B
x
() = C
x
(). For example let X = 1, 2 and
d(1, 2) = 1, then B
1
(1) = 1 , B
1
(1) = 1 while C
1
(1) = X. For another
counterexample, take
X =
_
(x, y) 1
2
: x = 0 or x = 1
_
with the usually Euclidean metric coming from the plane. Then
B
(0,0)
(1) =
_
(0, y) 1
2
: [y[ < 1
_
,
B
(0,0)
(1) =
_
(0, y) 1
2
: [y[ 1
_
, while
C
(0,0)
(1) = B
(0,0)
(1) (1, 0) .
Exercise 9.11. Suppose that A and B are subsets of a metric space, show
A B =

A

B.
Exercise 9.12. Given an example showing that
n=1
A
n
need not be equal to
n=1
A
n
.
Remark 9.18. The relationships between the interior and the closure of a set
are:
(A
o
)
c
=
V
c
: V and V A =
C : C is closed C A
c
= A
c
and similarly, (

A)
c
= (A
c
)
o
.
Proposition 9.19. If A is a subset of a metric space, (X, d) , then bd (A) may
be computed using either;
1. bd (A) =

A A
c
,
2. bd (A) = x X : d
A
(x) = 0 = d
A
c (x) ,
3. bd (A) = x X : B
x
() A ,= , = B
x
() A
c
for all > 0 , or
4. bd (A) = x X : x
n
A and y
n
A
c
lim
n
x
n
= x = lim
n
y
n
.
Proof. From the denition of bd (A) and the relation, (

A)
c
= (A
c
)
o
, we
nd
bd (A) =

A A
o
=

A (A
o
)
c
=

A A
c
. (9.3)
Item 2. now follow from item 1. and the fact that

A = x X : d
A
(x) = 0 . I
leave it to the reader to check that 2. = 3. = 4. = 1.
Exercise 9.13. If D is a dense subset of a metric space (X, d) and E X is
a subset such that to every point x D there exists x
n
n=1
E with x =
lim
n
x
n
, then E is also a dense subset of X. If points in E well approximate
every point in D and the points in D well approximate the points in X, then
the points in E also well approximate all points in X.
Exercise 9.14. Suppose (X, d) is a metric space which contains an uncountable
subset X with the property that there exists > 0 such that d (a, b)
for all a, b with a ,= b. Show that (X, d) is not separable.
Exercise 9.15. Let Y = BC (1, C) be the Banach space of continuous bounded
complex valued functions on 1 equipped with the uniform norm, |f|
u
:=
sup
xR
[f (x)[ . Further let C
0
(1, C) denote those f C (1, C) such that vanish
at innity, i.e. lim
x
f (x) = 0. Also let C
c
(1, C) denote those f C (1, C)
with compact support, i.e. there exists N < such that f (x) = 0 if [x[ N.
Show C
0
(1, C) is a closed subspace of Y and that
C
c
(1, C) = C
0
(1, C) .
9.2 Continuity Revisited 95
9.2 Continuity Revisited
Suppose now that (X, ) and (Y, d) are two metric spaces and f : X Y is a
function.
Denition 9.20. A function f : X Y is continuous at x X if for all
> 0 there is a > 0 such that
d(f (x) , f(x
t
)) < provided that (x, x
t
) < . (9.4)
The function f is said to be continuous if f is continuous at all points x X.
The following lemma gives two other characterizations of continuity of a
function at a point.
Lemma 9.21 (Local Continuity Lemma). Suppose that (X, ) and (Y, d)
are two metric spaces and f : X Y is a function dened in a neighborhood
of a point x X. Then the following are equivalent:
1. f is continuous at x X.
2. For all neighborhoods A Y of f (x) , f
1
(A) is a neighborhood of x X.
3. For all sequences x
n
n=1
X such that x = lim
n
x
n
, f(x
n
) is con-
vergent in Y and
lim
n
f(x
n
) = f
_
lim
n
x
n
_
.
Proof. 1 = 2. If A Y is a neighborhood of f (x) , there exists > 0
such that B
f(x)
() A and because f is continuous there exists a > 0 such
that Eq. (9.4) holds. Therefore
B
x
() f
1
_
B
f(x)
()
_
f
1
(A)
showing f
1
(A) is a neighborhood of x.
2 = 3. Suppose that x
n
n=1
X and x = lim
n
x
n
. Then for any >
0, B
f(x)
() is a neighborhood of f (x) and so f
1
_
B
f(x)
()
_
is a neighborhood
of x which must contain B
x
() for some > 0. Because x
n
x, it follows that
x
n
B
x
() f
1
_
B
f(x)
()
_
for a.a. n and this implies f (x
n
) B
f(x)
() for
a.a. n, i.e. d(f (x) , f (x
n
)) < for a.a. n. Since > 0 is arbitrary it follows that
lim
n
f (x
n
) = f (x) .
3. = 1. We will show not 1. = not 3. If f is not continuous at x,
there exists an > 0 such that for all n N there exists a point x
n
X with
(x
n
, x) <
1
n
yet d (f (x
n
) , f (x)) . Hence x
n
x as n yet f (x
n
) does
not converge to f (x) .
Here is a global version of the previous lemma.
Lemma 9.22 (Global Continuity Lemma). Suppose that (X, ) and (Y, d)
are two metric spaces and f : X Y is a function dened on all of X. Then
the following are equivalent:
1. f is continuous.
2. f
1
(V )
for all V
d
, i.e. f
1
(V ) is open in X if V is open in Y.
3. f
1
(C) is closed in X if C is closed in Y.
4. For all convergent sequences x
n
X, f(x
n
) is convergent in Y and
lim
n
f(x
n
) = f
_
lim
n
x
n
_
.
Proof. Since f
1
(A
c
) =
_
f
1
(A)
c
, it is easily seen that 2. and 3. are
equivalent. So because of Lemma 9.21 it only remains to show 1. and 2. are
equivalent. If f is continuous and V Y is open, then for every x f
1
(V ) , V
is a neighborhood of f (x) and so f
1
(V ) is a neighborhood of x. Hence f
1
(V )
is a neighborhood of all of its points and from this and Exercise 9.5 it follows that
f
1
(V ) is open. Conversely, if x X and A Y is a neighborhood of f (x) then
there exists V
o
X such that f (x) V A. Hence x f
1
(V ) f
1
(A)
and by assumption f
1
(V ) is open showing f
1
(A) is a neighborhood of x.
Therefore f is continuous at x and since x X was arbitrary, f is continuous.
Exercise 9.16. Use Example 9.28 and Lemma 9.22 to recover the results of
Example 9.11.
The next result shows that there are lots of continuous functions on a metric
space (X, d) .
Lemma 9.23 (Urysohns Lemma for Metric Spaces). Let (X, d) be a met-
ric space and suppose that A and B are two disjoint closed subsets of X. Then
f (x) =
d
B
(x)
d
A
(x) +d
B
(x)
for x X (9.5)
denes a continuous function, f : X [0, 1], such that f (x) = 1 for x A and
f (x) = 0 if x B.
Proof. By Lemma 9.9, d
A
and d
B
are continuous functions on X. Since
A and B are closed, d
A
(x) > 0 if x / A and d
B
(x) > 0 if x / B. Since
AB = , d
A
(x) +d
B
(x) > 0 for all x and (d
A
+d
B
)
1
is continuous as well.
The remaining assertions about f are all easy to verify.
Sometimes Urysohns lemma will be use in the following form. Suppose
F V X with F being closed and V being open, then there exists f
C (X, [0, 1])) such that f = 1 on F while f = 0 on V
c
. This of course follows
from Lemma 9.23 by taking A = F and B = V
c
.
9.3 Metric spaces as topological spaces (Not required
Reading!)
Let (X, d) be a metric space and let =
d
denote the collection of open
subsets of X. (Recall V X is open i V
c
is closed i for all x V there
exists an =
x
> 0 such that B(x,
x
) V i V can be written as a (possibly
uncountable) union of open balls.) Although we will stick with metric spaces
in this chapter, it will be useful to introduce the denitions needed here in the
more general context of a general topological space, i.e. a space equipped
with a collection of open sets.
Denition 9.24 (Topological Space). Let X be a set. A topology on X is
a collection of subsets () of X with the following properties;
1. contains both the empty set () and X.
2. is closed under arbitrary unions.
3. is closed under nite intersections.
The elements V are called open subsets of X. A subset F X is said
to be closed if F
c
is open. I will write V
o
X to indicate that V X and
V and similarly F X will denote F X and F is closed. Given x X
we say that V X is an open neighborhood of x if V and x V. Let
x
= V : x V denote the collection of open neighborhoods of x.
Denition 9.25 (Continuity at a point in topological terms). Let
(X,
X
) and (Y,
Y
) be topological spaces. A function f : X Y is contin-
uous at a point x X if for every open neighborhood V of f (x) there is an
open neighborhood U of x such that U f
1
(V ). See Figure 9.3.
Y
V
f(x)
f
X
U
x
f
1
(V )
Fig. 9.3. Checking that a function is continuous at x X.
Denition 9.26 (Global continuity in topological terms). Let (X,
X
)
and (Y,
Y
) be topological spaces. A function f : X Y is continuous if
f
1
(
Y
) :=
_
f
1
(V ) : V
Y
_

X
.
We will also say that f is
X
/
Y
continuous or (
X
,
Y
) continuous. Let
C(X, Y ) denote the set of continuous functions from X to Y.
Exercise 9.17. Show f : X Y is continuous (Denition ??) i f is continu-
ous at all points x X.
Exercise 9.18. Show f : X Y is continuous i f
1
(C) is closed in X for all
closed subsets C of Y.
Denition 9.27. A map f : X Y between topological spaces is called a
homeomorphism provided that f is bijective, f is continuous and f
1
: Y
X is continuous. If there exists f : X Y which is a homeomorphism, we say
that X and Y are homeomorphic. (As topological spaces X and Y are essentially
the same.)
Example 9.28. The function d
A
dened in Lemma 9.9 is continuous for each
A X. In particular, if A = x , it follows that y X d(y, x) is continuous
for each x X.
9.4 Exercises
Exercise 9.19. Show that (X, d) is a complete metric space i every sequence
x
n
n=1
X such that
n=1
d(x
n
, x
n+1
) < is a convergent sequence in X.
You may nd it useful to prove the following statements in the course of the
proof.
1. If x
n
is Cauchy sequence, then there is a subsequence y
j
:= x
nj
such that
j=1
d(y
j+1
, y
j
) < .
2. If x
n
n=1
is Cauchy and there exists a subsequence y
j
:= x
nj
of x
n
such
that x = lim
j
y
j
exists, then lim
n
x
n
also exists and is equal to x.
Exercise 9.20. Suppose that f : [0, ) [0, ) is a C
2
function such
that f(0) = 0, f
t
> 0 and f
tt
0 and (X, ) is a metric space. Show that
d(x, y) = f((x, y)) is a metric on X. In particular show that
d(x, y) :=
(x, y)
1 +(x, y)
is a metric on X. (Hint: use calculus to verify that f(a + b) f(a) + f(b) for
all a, b [0, ).)
9.5 Sequential Compactness 97
Exercise 9.21. Let (X
n
, d
n
)
n=1
be a sequence of metric spaces, X :=
n=1
X
n
, and for x = (x(n))
n=1
and y = (y(n))
n=1
in X let
d(x, y) =
n=1
2
n
d
n
(x(n), y(n))
1 +d
n
(x(n), y(n))
. (9.6)
Show:
1. (X, d) is a metric space,
2. a sequence x
k
k=1
X converges to x X i x
k
(n) x(n) X
n
as
k for each n N and
3. X is complete if X
n
is complete for all n.
9.5 Sequential Compactness
Suppose that (X, d) and (Y, ) are metric spaces.
Denition 9.29. As subset K X is (sequentially) compact if every se-
quence z
n
n=1
K has a convergent subsequence, w
k
:= z
nk
k=1
such that
lim
k
w
k
K.
Example 9.30. Suppose that F X is an unbounded set, i.e. for all n N
there exists z
n
F such that d (x, z
n
) n. The sequence z
n
n=1
and all of
its subsequences are unbounded and therefore not Cauchy in X and hence not
convergent in X. This shows that sequentially compact sets must be bounded.
Example 9.31. Suppose that F X is not closed. Then there exists z
n
n=1

F such that z := lim
n
z
n
/ F. Moreover, although every subsequence of
z
n
n=1
is convergent, they all still converge to z / F. This shows that a
sequentially compact set must be closed.
Lemma 9.32 (BolzanoWeierstrass property for C
D
). Let D N. Every
bounded sequence, z (n)
n=1
C
D
, has a convergent subsequence.
Proof. By assumption there exists M < such that |z (n)| =
d (z (n) , 0) M for all n N. Writing z (n) = (z
1
(n) , . . . , z
D
(n)) C
D
. Since
[z
i
(n)[ |z (n)| it follows that z
i
(n)
n=1
is a bounded sequence in C. Hence
by the BolzanoWeierstrass property for C we replace z (n) by a subsequence
z (n
k
) such that lim
k
z (n
k
) = z
1
exists. We may now replace the original z
by this new subsequence and then nd a further subsequence z (n
k
) such that
lim
k
z
i
(n
k
) = z
i
exists for i = 1, 2. We may continue this way inductively
to nd a subsequence such that lim
k
z
i
(n
k
) = z
i
exists for all 1 i D. It
then follows that lim
k
|z z (n
k
)| = 0 as desires where z := (z
1
, . . . , z
D
) .
Theorem 9.33 (BolzanoWeierstrass / HeineBorel theorem). As sub-
set K C
D
is sequentially compact i it is closed and bounded.
Proof. In light of Examples 9.30 and 9.31 we are left to show that closed
and bounded sets are sequentially compact. So let K C
D
be a closed and
bounded set and z
n
n=1
be any sequence in K. According toe Lemma 9.32,
z
n
n=1
has a convergent subsequence, w
k
:= z
nk
k=1
. Since w
k
K for all
k and K is closed it necessarily follows that lim
k
w
k
K which shows K is
sequentially compact.
Example 9.34 (Warning!). It is not true that a closed and bounded subset of an
arbitrary metric space (X, d) is necessarily sequentially compact. For example
let Z denote the vector space of continuous functions on [0, 1] with values in 1
and for f Z let |f| = sup
t[0,1]
[f (t)[ . Then the set C := f
n
n=0
where
f
n
(t) = 2
(n+2)
_
_
_
0 if t [0, 2
(n+1)
] [2
n
, 1]
t 2
(n+1)
if 2
(n+1)
t 3 2
(n+2)
2
n
t if 3 2
(n+2)
t 2
n
.
[So f
n
(t) is a shark tooth over the interval
_
2
(n+1)
, 2
n
.] Notice that |f
n
| = 1
for all n so that C is bounded. Moreover |f
n
f
m
| = 1 for all m ,= n, therefore
there are no convergent subsequence of C. The reader should use this fact to
see that C is closed and bounded but not sequentially compact!
Fig. 9.4. Here are the plots of f0 and f1.
Exercise 9.22 (Extreme value theorem). Let K be sequentially com-
pact subset of X and f : K 1 be a continuous function. Show <
inf
xK
f (x) sup
xK
f (x) < and there exists a, b K such that
f(a) = inf
xK
f (x) and f(b) = sup
xK
f (x) . Hint: rst argue that there
exists z
n
n=1
K such that f (z
n
) sup
xK
f (x) as n .
Exercise 9.23 (Extreme value theorem II). Suppose that f : X 1 is
a continuous function on a metric space (X, d) . Further assume there exists a
sequentially compact subset K X and x
0
K such that f (x
0
) f (x) for
all x X K. Show there exists k K such that inf
xX
f (x) = f (k) .
Theorem 9.35 (Fundamental Theorem of Algebra). Suppose that p (z) =
n
k=0
a
k
z
k
is a polynomial on C with a
n
,= 0 and n > 0. Then there exists
z
0
C such that p (z
0
) = 0.
Proof. Since
lim
]z]
[p (z)[
[z[
n
= lim
]z]
n1
k=0
a
k
z
nk
+a
n
= [a
n
[ ,
if R > 0 is suciently large, then
[p (z)[ [a
n
[ [z[
n
/2 [a
n
[ R
n
/2 [a
0
[ = [p (0)[ for [z[ > R.
Applying Exercise 9.23 with K = z C : [z[ R and x
0
= 0 there exists
z
0
K C such that [p (z
0
)[ [p (z)[ for all z C. It now follows from
Lemma 9.36 below that p (z
0
) = 0.
Lemma 9.36. Suppose that p (z) =
n
k=0
a
k
z
k
is a polynomial on C with a
n
,=
0 and n > 0. If [p (z)[ has a minimum at z
0
C, then [p (z
0
)[ = 0.
Proof. For sake of contradiction, let us suppose that p (z
0
) ,= 0 and set
q (z) := p (z
0
+z) =
n
k=0
a
k
(z +z
0
)
k
=
n
k=0
k
z
k
.
Then 0 < [q (0)[ = [
0
[ [q (z)[ for all z C and
n
= a
n
,= 0. Let l 1 be
the rst index such that
l
,= 0 so that
q (z) =
0
+
l
z
l
+ +
n
z
n
=
0
_
1 +

l
0
z
l
+ +

n
0
z
n
_
Evaluating this at z = re
i
with chosen so that
l
0
e
il
=
l
0
= implies
for r > 0 small that
[
0
[
q
_
re
i
_
= [
0
[
1 r
l
+c
l+1
r
l+1
+ +c
n
r
n
[
0
[
_
1 r
l
+
c
l+1
r
l+1
+ +c
n
r
n
[
0
[
_
1 r
l
+Cr
l+1
= [
0
[
_
1 r
l
[ Cr]
where c
k
and C are certain appropriate constants. Choosing r (0, /C)
then gives the [
0
[ < [
0
[ which is absurd and we have reached the desired
contradiction.
Remark 9.37. The fundamental theorem of algebra does not hold for polynomi-
als in z and z or in x and y. For example consider the polynomial
p (z, z) = 1 +z z = 1 +x
2
+y
2
.
The point is that Lemma 9.36 does not hold for these more general classes of
polynomials.
Exercise 9.24 (Uniform Continuity). Let K be sequentially compact subset
of X and f : K C be a continuous function. Show that f is uniformly
continuous, i.e. if > 0 there exists > 0 such that [f(z) f(w)[ < if
w, z K with d (w, z) < . Hint: prove the contrapositive.
Exercise 9.25. If (X, d) is a metric space and K X is sequentially compact.
Show subset, C K, which is closed is sequentially compact as well.
Exercise 9.26. If K 1 is sequentially compact then sup (K) K, i.e.
sup (K) = max (K) .
Exercise 9.27. Let (X, d) and (Y, ) be metric spaces, K X be a sequen-
tially compact set, and f : K Y be a continuous function. Show f (K) is
sequentially compact in Y. In particular, for C K closed, we have f (C) is
closed and in fact sequentially compact in Y.
Exercise 9.28. Let f : [a, b] [c, d] be a strictly increasing continuous func-
tion such that f (a) = c and f (b) = d and g := f
1
: [c, d] [a, b] as in Exercise
6.20. Give one or better yet two alternative proofs that g is continuous based
on sequentially compactness arguments.
Denition 9.38. Let Z be a vector space. We say that two norms, [[ and || ,
on Z are equivalent if there exists constants , (0, ) such that
|f| [f[ and [f[ |f| for all f Z.
Theorem 9.39. Let Z be a nite dimensional vector space. Then any two
norms [[ and || on Z are equivalent. (This is typically not true for norms
on innite dimensional spaces.)
9.6 Open Cover Compactness 99
Proof. Let f
i
n
i=1
be a basis for Z and dene a new norm on Z by
_
_
_
_
_
n
i=1
a
i
f
i
_
_
_
_
_
2
:=
_
n
i=1
[a
i
[
2
for a
i
F.
By the triangle inequality for the norm [[ , we nd
i=1
a
i
f
i
i=1
[a
i
[ [f
i
[
_
n
i=1
[f
i
[
2
_
n
i=1
[a
i
[
2
M
_
_
_
_
_
n
i=1
a
i
f
i
_
_
_
_
_
2
where M =
_
n
i=1
[f
i
[
2
. Thus we have [f[ M |f|
2
for all f Z and this
inequality shows that [[ is continuous relative to ||
2
. Since the normed space
(Z, ||
2
) is homeomorphic and isomorphic to F
n
with the standard euclidean
norm, the closed bounded set, S := f Z : |f|
2
= 1 Z, is a sequentially
compact subset of Z relative to ||
2
. Therefore by Exercise 9.22 there exists
f
0
S such that
m = inf [f[ : f S = [f
0
[ > 0.
Hence given 0 ,= f Z, then
f
|f|
2
S so that
m
f
|f|
2
= [f[
1
|f|
2
or equivalently
|f|
2

1
m
[f[ .
This shows that [[ and ||
2
are equivalent norms. Similarly one shows that ||
and ||
2
are equivalent and hence so are [[ and || .
Corollary 9.40. If (Z, ||) is a nite dimensional normed space, then A Z
is sequentially compact i A is closed and bounded relative to the given norm,
|| .
Corollary 9.41. Every nite dimensional normed vector space (Z, ||) is com-
plete. In particular any nite dimensional subspace of a normed vector space is
automatically closed.
Proof. If f
n
n=1
Z is a Cauchy sequence, then f
n
n=1
is bounded and
hence has a convergent subsequence, g
k
= f
nk
, by Corollary 9.40. It is now
routine to show lim
n
f
n
= f := lim
k
g
k
.
9.6 Open Cover Compactness
Denition 9.42. Let A be a subset of a metric space, (X, d) . An open cover
of A is a collection of, |, of open subsets of X such that A
V /
V.
Denition 9.43. The subset A X is open cover compact
1
if every open
cover (Denition 9.42) of A has nite a sub-cover, i.e. if | is an open cover
of A there exists nite subcollection, |
0

f
| such that |
0
is still a cover of
A. (We will write A X to denote that A X and A is compact.) A subset
A X is precompact if

A is compact.
Remark 9.44. If (X, d) is a metric space, we will see below in Theorem 9.53 that
X is sequentially compact i it is open cover compact. We will not use this fact
in this section however. In the future though we will just refer to compact sets
with out the any extra adjectives.
Example 9.45. Suppose that A is an unbounded subset of X. Pick a
1
A and
then choose a
n
n=2
inductively so that d
a1,...,an]
(a
n+1
) 1 for all n. This
sequence then has the property that d (a
k
, a
l
) 1 for all k ,= l and from this it
follows that F := a
1
, a
2
, . . . is a closed set. We then dene an open cover of
A by taking,
| = F
c
, B
a1
(1/3) , B
a2
(1/3) , B
a3
(1/3) , . . . .
This cover has no nite subcover. Therefore A can not be open cover compact.
Alternatively. Given x X, the collection | := B
x
(n) : n N is an
open cover of X. So if K is an open cover compact subset of X, there must
exist n
1
< n
2
< < n
l
so that
K B
x
(n
1
) B
x
(n
2
) B
x
(n
l
) = B
x
(n
l
) .
This shows K is bounded.
Lemma 9.46. Suppose that K X is an open cover compact set, then K is
closed.
Proof. We will shows that K
c
is open. To this end suppose x K
c
. Then
let
k
:=
1
3
d (x, k) > 0 for all k K. It then follows that B
x
(
k
) B
k
(
k
) =
for all k K. As B
k
(
k
)
kK
is an open cover K, there exists
f
K such
that K
k
B
k
(
k
) . If we now let := min
k
k
> 0, then
B
x
() B
k
(
k
) B
x
(
k
) B
k
(
k
) = for all k
and therefore
B
x
() K B
x
() [
kK
B
k
(
k
)] = .
1
We will usually simply say A is compact in this case.
Theorem 9.47. The open cover compact subsets of 1
n
are the closed and
bounded sets.
Proof. Let us rst suppose that K = [M, M]
n
for some positive integer
M and for sake of contradiction let us suppose that | is an open cover of K
with not nite subcover. For any k N let
k
:= [M, M)
_
2
k
: Z
_
and for x
n
k
, let
C
k
x
:=
_
x
1
, x
1
+ 2
k

_
x
n
, x
n
+ 2
k
.
We now let (
k
=
_
C
k
x
: x
n
k
_
. The cubes have the following properties;
1. K =
CC
kC for all k and if C (
k
, then C =
FC
k+1F.
2. diam(C) =

n 2
k
for all C (
k
.
By item 1. there must be a C
1
(
1
such that | has no nite subcover of
C
1
. Similarly there exists C
2
(
2
such that C
2
C
1
such that | has no nite
subcover of C
2
. Continuing this way inductively we may construct C
k
(
k
such
that C
1
C
2
C
3
. . . and | has no nite subcover of C
k
for any k. Choose
a point z
k
C
k
for all k N. Because diam(C
k
) =

n 2
k
0 one learns
that z
k
k=1
is a Cauchy sequence and hence convergent. Let z = lim
k
z
k
which is in C
k
for all k as each of these cubes are closed sets. Since | is an open
cover of K, there exists V | such that z V. As V is open, there exists > 0
such that B
z
() V. On the other hand because diam(C
k
) 0 as k ,
it follows that C
k
B
z
() V for all k suciently large which violates the
condition that | has no nite subcover of C
k
for any k.
Now suppose that K is a general closed and bounded subset of 1
n
and |
is an open cover of K. Since K is bounded, K [M, M]
n
for some M N.
Since |
t
:= |K
c
is an open cover of [M, M]
n
it has a nite subcover, say
U
1
, . . . , U
l
K
c
. As K
c
K = we must have that U
1
, . . . , U
l
covers K
i.e. U
1
, . . . , U
l
| is a nite subcover of K.
Proposition 9.48. Suppose that K X is an open cover compact set and
F K is a closed subset. Then F is open cover compact. If K
i
n
i=1
is a nite
collections of open cover compact subsets of X then K =
n
i=1
K
i
is also an
open cover compact subset of X.
Proof. Let | be an open cover of F, then |F
c
is an open cover of
K. The cover |F
c
of K has a nite subcover which we denote by |
0
F
c
where |
0

f
|. Since F F
c
= , it follows that |
0
is the desired subcover
of F. For the second assertion suppose | is an open cover of K. Then |
covers each compact set K
i
and therefore there exists a nite subset |
i

f
|
for each i such that K
i
|
i
. Then |
0
:=
n
i=1
|
i
is a nite cover of K.
Fig. 9.5. Nested Sequence of cubes.
Exercise 9.29. Suppose f : X Y is continuous and K X is open cover
compact, then f(K) is an open cover compact subset of Y. Give an example of
continuous map, f : X Y, and an open cover compact subset K of Y such
that f
1
(K) is not open cover compact.
Exercise 9.30 (Extreme value theorem III). Let (X, d) be an open cover
compact metric space and f : X 1 be a continuous function. Show <
inf f sup f < and there exists a, b X such that f(a) = inf f and
f(b) = sup f. Hint: use Exercise 9.29 and Theorem 9.47.
Exercise 9.31 (Uniform Continuity). Let (X, d) be an open cover compact
metric space, (Y, ) be a metric space and f : X Y be a continuous function.
Show that f is uniformly continuous, i.e. if > 0 there exists > 0 such that
(f(y), f (x)) < if x, y X with d(x, y) < .
Exercise 9.32. Suppose f : X Y is continuous and K X is open cover
compact, then f(K) is an open cover compact subset of Y. Give an example of
continuous map, f : X Y, and an open cover compact subset K of Y such
that f
1
(K) is not open cover compact.
Exercise 9.33 (Dinis Theorem). Let X be an open cover compact metric
space and f
n
: X [0, ) be a sequence of continuous functions such that
f
n
(x) 0 as n for each x X. Show that in fact f
n
0 uniformly in x,
i.e. sup
xX
f
n
(x) 0 as n . Hint: Given > 0, consider the open sets
V
n
:= x X : f
n
(x) < .
Denition 9.49. A collection T of closed subsets of a metric space (X, d) has
the nite intersection property if T
0
,= for all T
0

f
T.
9.7 Equivalence of Sequential and Open Cover Compactness in Metric Spaces 101
The notion of open cover compactness may be expressed in terms of closed
sets as follows.
Proposition 9.50. A metric space X is open cover compact i every family of
closed sets T 2
X
having the nite intersection property satises
T ,= .
Proof. The basic point here is that complementation interchanges open
and closed sets and open covers go over to collections of closed sets with empty
intersection. Here are the details.
() Suppose that X is open cover compact and T 2
X
is a collection of
closed sets such that
T = . Let
| = T
c
:= C
c
: C T ,
then | is a cover of X and hence has a nite subcover, |
0
. Let T
0
= |
c
0

f
T,
then T
0
= so that T does not have the nite intersection property.
() If X is not open cover compact, there exists an open cover | of X with
no nite subcover. Let
T = |
c
:= U
c
: U | ,
then T is a collection of closed sets with the nite intersection property while
T = .
9.7 Equivalence of Sequential and Open Cover
Compactness in Metric Spaces
Denition 9.51. A metric space (X, d) is bounded ( > 0) if there exists a
nite cover of X by balls of radius . We further say (X, d) is totally bounded
if it is bounded for all > 0.
Remark 9.52 (Totally bounded means almost nite). An equivalent way to state
that (X, d) is bounded is to say there exists a nite subset =

f
X
such that d
(x) < for all x X. In other words, (X, d) is bounded i X

is a nite subset within an error.
Theorem 9.53. Let (X, d) be a metric space. The following are equivalent.
(a) X is open cover compact.
(b) X is sequentially compact.
(c) X is totally bounded and complete.
Proof. The proof will consist of showing that a b c a.
(a b) We will show that not b not a. Suppose there exists
x
n
n=1
X which has no convergent subsequence. In this case the set
S := x
n
X : n N must be an innite set as we have already seen nite
sets are sequentially compact. For every x X we must have
(x) := liminf
n
d (x
n
, x) > 0
since otherwise there would be a subsequence x
nk
k=1
such that
lim
k
d (x
nk
, x) = 0, i.e. lim
k
x
nk
= x. We now let V
x
:= B
x
_
1
2
(x)
_
and observe that x
n
can be in V
x
for only nitely many n otherwise we
would conclude that liminf
n
d (x
n
, x) (x) /2. From these observations,
| := V
x
: x X is an open cover of X with no nite subcover. Indeed, if

f
X, then we must still have x
n

x
V
x
for only nitely many n and in
particular
x
V
x
can not cover S.
(b c) Suppose x
n
n=1
X is a Cauchy sequence. By assumption there
exists a subsequence x
nk
k=1
which is convergent to some point x X. Since
x
n
n=1
is Cauchy it follows that x
n
x as n showing X is complete.
Now for sake of contradiction suppose that X is not totally bounded.
There there exists > 0 for which X is not bounded. In particular,
| := B
x
() : x X is an open cover of X with no nite subcover. We now
use this to construct a sequence x
n
n=1
X. Choose x
1
X at random, then
choose x
2
X B
x1
() , then x
3
X [B
x1
() B
x2
()] . . .
x
n
X
_
B
x1
() B
x2
() B
xn1
()
, . . . .
The process may be continued indenitely as | has no nite subcover. By
construction we have chosen x
n
n=1
such that d
x1,...,xn1]
(x
n
) for all n
and therefore d (x
k
, x
l
) for all k ,= l. Every subsequence will share this
property, i.e. not be Cauchy, and hence can not be convergent.
(c a) For sake of contradiction, assume there exists an open cover 1 =
V
A
of X with no nite subcover. Since X is totally bounded for each
n N there exists
n

f
X such that
X =
_
xn
B
x
(1/n)
_
xn
C
x
(1/n).
Choose x
1

1
such that no nite subset of 1 covers K
1
:= C
x1
(1). Since
K
1
=
x2
K
1
C
x
(1/2), there exists x
2

2
such that K
2
:= K
1
C
x2
(1/2)
can not be covered by a nite subset of 1, see Figure 9.6. Continuing this way
inductively, we construct sets K
n
= K
n1
C
xn
(1/n) with x
n

n
such that no
K
n
can be covered by a nite subset of 1. Now choose y
n
K
n
for each n. Since
K
n
n=1
is a decreasing sequence of closed sets such that diam(K
n
) 2/n, it
follows that y
n
is a Cauchy and hence convergent with
y = lim
n
y
n

m=1
K
m
.
Since 1 is a cover of X, there exists V 1 such that y V. Since K
n
y
and diam(K
n
) 0, it now follows that K
n
V for some n large. But this
violates the assertion that K
n
can not be covered by a nite subset of 1.
Fig. 9.6. Nested Sequence of cubes.
Corollary 9.54. The compact subsets of 1
n
are the closed and bounded sets.
Proof. If K is closed and bounded then K is complete (being the closed
subset of a complete space) and K is contained in [M, M]
n
for some positive
integer M. For > 0, let
= Z
n
[M, M]
n
:= x : x Z
n
and [x
i
[ M for i = 1, 2, . . . , n.
We will show, by choosing > 0 suciently small, that
K [M, M]
n

x
B(x, ) (9.7)
which shows that K is totally bounded. Hence by Theorem 9.53, K is compact.
Suppose that y [M, M]
n
, then there exists x
such that [y
i
x
i
[
for i = 1, 2, . . . , n. Hence
d
2
(x, y) =
n
i=1
(y
i
x
i
)
2
n
2
which shows that d(x, y)

n. Hence if choose < /
n we have shows that

d(x, y) < , i.e. Eq. (9.7) holds.
9.8 Connectedness
Denition 9.55. Let (X, d) be a metric space. Two subset A and B of X are
separated
2
if

AB = = A

B. A set E X is disconnected if E = AB
where A and B are two non-empty separated sets, otherwise E is said to be
connected.
Theorem 9.56 (The Connected Subsets of 1). The connected subsets of
1 are intervals.
Proof. We will break the proof into two parts. First we show if E is con-
nected then E is an interval. Then we show if E is disconnected then E is not
an interval.
1) Suppose that E 1 is a connected subset and that a, b E with
a < b. If there exists c (a, b) such that c / E, then A := (, c) E
and B := (c, ) E would be two non-empty separated subsets such that
E = A B. Hence (a, b) E. Let := inf(E) and := sup(E) and choose
n
,
n
E such that
n
<
n
and
n
and
n
as n . By what we
have just shown, (
n
,
n
) E for all n and hence (, ) =
n=1
(
n
,
n
) E.
From this it follows that E = (, ), [, ), (, ] or [, ], i.e. E is an interval.
2) Now suppose that E is a disconnected subset of 1. Then E = AB where
A and B are non-empty separated sets and let a A and b B. After relabelling
A and B if necessary we may assume that a < b. Let p = sup ([a, b] A) . Since
p

A[a, b] it follows that p / B and a p b. Since b B, p ,= b and hence
a p < b.
i) if p / A then p / E and a < p < b which shows E is not an interval.
ii) if p A, then p /

B and there exists
3
p
1
such that a p < p
1
< b and
p
1
/

B B. Again p
1
/ A (for otherwise p p
1
) and so p
1
/ E and hence
again E is not an interval.
Lemma 9.57. Suppose that f : X Y is a continuous function between two
metric spaces (X and Y ) and A and B are separated subsets of Y. Then :=
f
1
(A) and := f
1
(B) are separated subsets of X. In particular, if E X
is connected, then f (E) is connected in Y.
Proof. Since := f
1
(A) f
1
_
A
_
and f
1
_
A
_
is the closed being the
inverse image under a continuous function of the closed set

A, it follows that
f
1
_
A
_
. Thus if x , then f (x)

A B = and hence = .
Similarly one shows

= as well.
2
Notice that separated sets are disjoint. The sets A = (0, 1) and B = [1, ) are
disjoint but not separated.
3
This is because

B
c
is an open subset of 1.
9.8 Connectedness 103
If f (E) disconnected, there exists non-empty separated subsets, A and B
of f (E) , so that f (X) = A B. The sets := f
1
(A) and := f
1
(B) are
now non-empty separated subsets of X. It now follows that
0
:= E and
0
:= E are non-empty sets such that

0

0
= and
0

0

=
which would imply E is disconnected. Thus if E is connected we must have that
f (E) is connected.
Theorem 9.58 (Intermediate Value Theorem). Suppose that (X, d) is a
connected metric space and f : X 1 is a continuous map. Then f satises
the intermediate value property. Namely, for every pair x, y X such that
f (x) < f(y) and c (f (x) , f(y)), there exists z X such that f(z) = c.
Proof. By Lemma 9.57, f (X) is a connected subset of 1. So by Theorem
9.56, f (X) is a subinterval of 1 and this completes the proof.
Lemma 9.59. If E X is a connected set and E AB where A and B are
two separated sets, then E A or E B.
Proof. Let := E A and := E B, then

A B = and
similarly

= . Thus E = with and being separated sets. Since
E is connected we must have = or = , i.e. E B or E A.
Proposition 9.60. Suppose that F and G are connected subsets of X such that
F G ,= , then E = F G is connected in X as well.

Proof. Suppose that E = A B where A and B are separated sets. Then
from Lemma 9.59 we know that F A or F B and similarly G A or
G B. If both F and G are in the same set (say A), then E A E and B
must be empty. On the other hand if F A and G B, then , =

F G

AB
which would violate A and B being separated. Thus we have shown there does
not exists two non-empty separated sets A and B such that E = A B, i.e. E
is connected.
Denition 9.61. A subset E of a metric space X is path connected if to
every pair of points x
0
, x
1
E there exists C([0, 1], E), such that (0) =
x
0
and (1) = x
1
. We refer to as a path joining x
0
to x
1
.
Proposition 9.62. Every path connected subset, E, of a metric space X is
connected.
Exercise 9.34. Prove Proposition 9.62, i.e. if E X is path connected then
E is connected.. Hint: for sake of contradiction suppose that A and B are two
non-empty separated subsets of X such that E = A B and choose a path
connecting a point in A to a point in B.
Denition 9.63. A subset, C, of a vector space, X, is convex if for all a, b C
the path,
(t) = a +t (b a) = (1 t) a +tb for 0 t 1
is contained in C.
Example 9.64. Every convex subset, C, of a normed vector space (X, ||) is path
connected and hence connected.
Exercise 9.35. Suppose that (X, ||) is a normed space, x X, and R > 0.
Show the open and closed balls in X, B
x
(R) and C
x
(R) , are both convex sets
and hence path connected.
Denition 9.65. A metric space X is locally path connected if for each
x X, there is an open neighborhood V X of x which is path connected.
Proposition 9.66. Let X be a metric space.
1. If X is connected and locally path connected, then X is path connected.
2. If X is any connected open subset of 1
n
, then X is path connected.
Exercise 9.36. Prove item 1. of Proposition 9.66, i.e. if X is connected and
locally path connected, then X is path connected. Hint: x x
0
X and let
W denote the set of x X such that there exists C([0, 1], X) satisfying
(0) = x
0
and (1) = x. Then show W is both open and closed.
Exercise 9.37. Prove item 2. of Proposition 9.66, i.e. if X is any connected
open subset of 1
n
, then X is path connected.
9.8.1 Connectedness Problems
Exercise 9.38. In this exercise we will work inside the metric space with
d (x, y) := [x y[ for all x, y . Let a, b with a < b and let
J := [a, b] = x : a x b .
Show J is disconnected in .
Exercise 9.39. Suppose a < b and f : (a, b) 1 is a non-decreasing function.
Show if f satises the intermediate value property (see Theorem 9.58), then f
is continuous.
Exercise 9.40. Suppose < a < b and f : [a, b) 1 is a strictly
increasing continuous function. Using the intermediate value theorem, one sees
that f ([a, b)) is an interval and since f is strictly increasing it must of the
form [c, d) for some c 1 and d

1 with c < d. Show the inverse function
f
1
: [c, d) [a, b) is continuous and is strictly increasing. In particular if
n N, apply this result to f (x) = x
n
for x [0, ) to construct the positive
n
th
root of a real number. Compare with Exercise ??.
Exercise 9.41. Let
X :=
_
(x, y) 1
2
: y = sin(x
1
) with x ,= 0
_
(0, 0)
equipped with the relative topology induced from the standard topology on 1
2
.
Show X is connected but not path connected.
Remark 9.67 (Structure of open sets in 1). Let V 1 be an open set. For
x V, let a
x
:= inf a : (a, x] V and b
x
:= sup b : [x, b) V . Since V is
open, a
x
< x < b
x
and it is easily seen that J
x
:= (a
x
, b
x
) V. Moreover if
y V and J
x
J
y
,= , then J
x
= J
y
. The collection, J
x
: x V , is at most
countable since we may label each J J
x
: x V by choosing a rational
number r J. Letting J
n
: n < N, with N = allowed, be an enumeration
of J
x
: x V , we have V =
n<N
J
n
as desired.
Part III
Calculus
10
Dierential Calculus in One Real Variable
10.1 Basics of the Derivative
Denition 10.1. If f : J X is a function we say lim
tt0
f (t) = x X i
for all > 0 there exists > 0 such that
|f (t) x| whenever 0 < [t t
0
[ .
A function f : J X is said to be dierentiable at t
0
J i lim
tt0
f(t)f(t0)
tt0
exists in X. We denote the limit by

f (t
0
) or f
t
(t
0
) or by
df
dt
(t
0
) , i.e.
f (t
0
) = lim
tt0
f (t) f (t
0
)
t t
0
Example 10.2. If f (t) = at +b, then

f (t) = a for all t 1. Indeed,
f (t) f (t
0
)
t t
0
=
a (t t
0
)
t t
0
= a for all t ,= t
0
.
Remark 10.3. One may also dene the derivative of f by
f (t
0
) = lim
h0
f (t
0
+h) f (t
0
)
h
provided the limit exists. To see this is the case let (t) :=
f(t)f(t0)
tt0
L
.
Then lim
h0
(t
0
+h) = 0 i for all > 0 there exists > 0 such that
(t
0
+h) if 0 < [h[ . Letting t = t
0
+ h, we see this is equivalent to
(t) if 0 < [h[ = [t t
0
[ , i.e. i lim
tt0
(t) = 0.
Proposition 10.4. The function f (t) := 1/t = t
1
dened for t ,= 0 is dier-
entiable and

f (t) = 1/t
2
for all t ,= 0.
Proof. We have
f (t +h) f (t)
h
=
1
h
_
1
t +h

1
t
_
=
1
h
t (t +h)
(t +h) t
=
1
(t +h) t

1
t
2
as h 0.
Here we have used the fact that
1
t
is continuous for the last limit.
Notation 10.5 ( () , O() , o () Notation) To simplify proofs below, we will
often let (h) , O(h) , and o (h) denote generic X or 1 valued functions dened
in some deleted neighborhood of 0 1 (i.e. for h (, ) 0) such that
1. lim
h0
(h) = 0,
2. there exists C < such that |O(h)| C [h[ for h near 0 or equivalently,
limsup
h0
|O(h)|
]h]
< ,
3. lim
h0
o(h)
h
= 0, i.e. for all c > 0 there exists = (c) > 0 such that
|o (h)| c [h[ for 0 < [h[ .
We will often assume that each of these functions has been extended to h = 0
by setting there values at 0 to be 0.
Remark 10.6. Using the above notation we have, O(h) (h) = o (h) , O(h) +
O(h) = O(h) = O(h) +o (h) , O(h) + (h) = (h) = (h) + (h) , etc. etc. We
may also write, lim
tt0
f (t) = x as f (t
0
+h) = x+ (h) and f is dierentiable
at t
0
with derivative

f (t
0
) i
f (t
0
+h) = f (t
0
) +

f (t
0
) h +o (h) .
Lemma 10.7. If f is dierentiable at t
0
then it is continuous at t
0
.
Proof. By denition,
f (t
0
+h) = f (t
0
) +f
t
(t
0
) h +o (h) = f (t
0
) + (h)
and so lim
h0
f (t
0
+h) = f (t
0
) .
Theorem 10.8 (Linearity of
d
dt
). Suppose that f : (a, b) X and g : (a, b)
X are dierentiable at t
0
(a, b) and 1, then
d
dt
(f +g) (t
0
) = f
t
(t
0
) +g
t
(t
0
) .
Proof. We are given,
f (t
0
+h) = f (t
0
) +

f (t
0
) h +o (h) and
g (t
0
+h) = g (t
0
) + g (t
0
) h +o (h)
and therefore,
108 10 Dierential Calculus in One Real Variable
f (t
0
+h) +g (t
0
+h) = f (t
0
) +

f (t
0
) h +o (h) +g (t
0
) + g (t
0
) h +o (h)
= f (t
0
) +g (t
0
) +
_

f (t
0
) + g (t
0
)
_
h +o (h)
from which the result follows.
Theorem 10.9 (Product rule I). Suppose that f : (a, b) 1 and g : (a, b)
X are dierentiable at t
0
(a, b) , then
d
dt
(f g) (t
0
) = f
t
(t
0
) g (t
0
) +f (t
0
) g
t
(t
0
) . (10.1)
Proof. We have,
(f g) (t
0
+h) = f (t
0
+h) g (t
0
+h)
=
_
f (t
0
) +

f (t
0
) h +o (h)
_
(g (t
0
) + g (t
0
) h +o (h))
= f (t
0
) g (t
0
) +
_

f (t
0
) g (t
0
) +f (t
0
) g (t
0
)
_
h +o (h) (10.2)
wherein we have used h
2
= o (h) , ho (h) = o (h) , and o (h) o (h) = o (h) . From
Eq. (10.2) we deduce Eq. (10.1).
Exercise 10.1. Suppose that f : (a, b) 1
n
and g : (a, b) 1
n
are dieren-
tiable at t
0
(a, b) , show
d
dt
(f g) (t
0
) = f
t
(t
0
) g (t
0
) +f (t
0
) g
t
(t
0
)
where now denotes the usual dot product in 1
n
, i.e.
x y :=
n
i=1
x
i
y
i
for all x, y 1
n
. (10.3)
Exercise 10.2. Let f : (a, b) 1
n
and write f (t) = (f
1
(t) , . . . , f
n
(t)) where
f
k
: (a, b) 1 are functions for 1 k n. Show; if

f (t) exists at t (a, b) ,
then

f
k
(t) exists for all k and
f (t) =
_

f
1
(t) , . . . ,

f
n
(t)
_
.
Hint: f
k
(t) = e
k
f (t) where e
k
is the k
th
standard basis vector for 1
n
Exercise 10.3. Suppose that f : (a, b) 1
n
and write f (t) =
(f
1
(t) , . . . , f
n
(t)) where f
k
: (a, b) 1. Show; if

f
k
(t) exists at t (a, b) for
all k, then

f (t) exists and
f (t) =
_

f
1
(t) , . . . ,

f
n
(t)
_
.
Hint: f (t) =
n
k=1
f
k
(t) e
k
.
Lemma 10.10. We have the following estimates, o (O(h)) = o (h) and
O(o (h)) = o (h) .
Proof. Suppose that f (h) and g (h) are functions such that f (h) = o (h)
and g (h) = O(h) . These statements are equivalent to the existence of a con-
stant C < and a function (h) such that lim
h0
(h) = 0 and
[f (h)[ [ (h)[ [h[ and [g (h)[ C [h[ for h near 0.
With these estimates in hand we have
[f (g (h))[ [ (g (h))[ [g (h)[ C [ (g (h))[ [h[
and
[g (f (h))[ C [f (h)[ C [ (h)[ [h[ .
Since lim
h0
C [ (g (h))[ = 0 we see that f (g (h)) = o (h) and since
lim
h0
C [ (h)[ = 0 it follows that g (f (h)) = o (h) .
Theorem 10.11 (Chain Rule). Suppose that g : (c, d) X and f : (a, b)
(c, d) are given functions such that

f (t
0
) at t
0
(a, b) and g
t
(w
0
) exists at
z
0
:= f (t
0
) . Then
d
dt
g (f (t)) [
t=t0
= g
t
(z
0
) f
t
(t
0
) = g
t
(f (t
0
)) f
t
(t
0
) .
Proof. Let h(t) := g (f (t)) . Then by assumption,
f (t
0
+t) = f (t
0
) +f
t
(t
0
) t +o (t) and
g (z
0
+z) = g (z
0
) +g
t
(z
0
) z +o (z) .
Letting
z := f (t
0
+t) f (t
0
) = f
t
(t
0
) t +o (t)
which is small for t small, we nd,
h(t
0
+t) = g (f (t
0
+t)) = g (z
0
+z)
= g (z
0
) +g
t
(z
0
) z +o (z)
= h(t
0
) +g
t
(z
0
) [f
t
(t
0
) t +o (t)] +o (f
t
(t
0
) t +o (t))
= h(t
0
) +g
t
(z
0
) f
t
(t
0
) t +o (t) . (10.4)
wherein we have used Lemma 10.10 to conclude, o (f
t
(t
0
) t +o (t)) =
o (t) . The result follows Eq. (10.4) and the denition of the derivative.
Corollary 10.12 (Quotient Rule). Suppose that f : (a, b) 1 and g :
(a, b) X are dierentiable at t
0
(a, b) and f (t
0
) ,= 0, then
d
dt
_
g
f
_
(t
0
) =
f (t
0
) g
t
(t
0
) f
t
(t
0
) g (t
0
)
f
2
(t
0
)
.
10.2 First Derivative Test and the Mean Value Theorems 109
Proof. First of let (z) :=
1
z
, then
1
f
= (f) and so by the chain rule,
d
dt
1
f
(t
0
) =
t
(f (t
0
)) f
t
(t
0
) =
1
f
2
(t
0
)
f
t
(t
0
) .
Hence by the product rule,
d
dt
_
g
f
_
(t
0
) =
d
dt
_
1
f
g
_
(t
0
) =
d
dt
1
f
(t
0
) g (t
0
) +
1
f (t
0
)
g
t
(t
0
)
=
1
f
2
(t
0
)
f
t
(t
0
) g (t
0
) +
1
f (t
0
)
g
t
(t
0
)
=
f (t
0
) g
t
(t
0
) f
t
(t
0
) g (t
0
)
f
2
(t
0
)
.
10.2 First Derivative Test and the Mean Value Theorems
Denition 10.13. A function f : (a, b) 1 has as local minimum (maximum)
at t
0
(a, b) if there exists > 0 such that f (t) f (t
0
) (f (t) f (t
0
)) when
[t t
0
[ .
Theorem 10.14 (First Derivative Test). If f : (a, b) 1 has a local ex-
tremum (i.e. local minimum or maximum) at t
0
(a, b) and f is dierentiable
at t
0
, then f
t
(t
0
) = 0.
Proof. For sake of deniteness, suppose that t
0
is a local minimum for f.
Then for h > 0 small,
f (t
0
+h) f (t
0
)
h
0 and
f (t
0
h) f (t
0
)
h
0.
Therefore letting h 0 in these two inequalities leads to f
t
(t
0
) 0 and f
t
(t
0
)
0 which implies f
t
(t
0
) = 0.
Exercise 10.4 (Existence of Eigenvectors for Symmetric Matrices).
Let A be an n n symmetric real matrix. Recall this implies (and is equiv-
alent to) Ax y = x Ay for all x, y 1
n
where is the dot product on 1
n
as
in Eq. (10.3). In this problem you are going to show that A has an eigenvector
by proving the following statements.
1. Show [Ax y[ C |x| |y| where C :=
_
i,j
A
2
ij
the Hilbert-Schmidt
norm of A.
2. Show q (x) := Ax x is a continuous function on 1
n
.
Let S := x 1
n
: |x| = 1 be the closed and bounded and hence compact
unit sphere in 1
n
. By the extreme value theorem, := max
xS
q (x) exists.
Let x
0
S be chosen so that
= q (x
0
) = Ax
0
x
0
.
3. Explain why R(x) =
Axx
|x|
2
= q
_
x
|x|
_
is a continuous function on 1
n
0
and show x
0
is also a maximizer of R.
4. For any v 1
n
let f
v
(t) = R(x
0
+tv) which is dened for t such that
|x
0
+tv| , = 0 and in particular for t near zero. Compute

f
v
(t) for t near 0
and in particular show
f
v
(0) = 2 (Ax
0
x
0
) v
5. Noting that f
v
(t) has a maximum at t = 0, the rst derivative implies
f
v
(0) = 0. By making a judicious choice for v, show Ax
0
= x
0
, i.e. x
0
is
an eigenvector of A with eigenvalue .
Exercise 10.5 (Multiplication Operators). Let V := C ([0, 1] , 1) and A :
V V be the operation of multiplication by t, i.e. for f V let Af V be
dene by
(Af) (t) = tf (t) for all t [0, 1] .
Show; if there is a 1 and f V such that Af = f, then f 0, i.e. A has
no eigenvectors (or eigenfunctions if you prefer).
[Moral: The reader should compare this exercise with Exercise 10.4. Notice
that if we dene an inner product on V by
(f, g) =
_
1
0
f (t) g (t) dt
then (Af, g) = (f, Ag) , i.e. A is symmetric. Nevertheless, A has no eigenvectors.
This may be considered as another proof the that unit sphere in V is not
compact!]
Theorem 10.15 (Rolles Theorem). Suppose that f : [a, b] 1 is a con-
tinuous function such that f (a) = L = f (b) and f is dierentiable on (a, b) .
Then there exists t
0
(a, b) such that f
t
(t
0
) = 0.
Proof. If f is not constant then there exists t (a, b) such that either
f (t) > L or f (t) < L. For deniteness let assume the rst case. Then by the
extreme value theorem, there exists t
0
[a, b] such that f (t
0
) f (t) for all
t [a, b] . Moreover this maximizer can not be at a or b as max f ([a, b]) > L =
f (a) = f (b) . Therefore f has a maximum at some point t
0
(a, b) . It then
follows by the rst derivative test that f
t
(t
0
) = 0.
Theorem 10.16 (Mean Value Theorem). Suppose that f : [a, b] 1 is
a continuous function such that f is dierentiable on (a, b) . Then there exists
t
0
(a, b) such that
f (t
0
) =
f (b) f (a)
b a
.
Proof. Let
h(t) := f (t)
_
f (a) +
t a
b a
(f (b) f (a))
_
.
Then h(a) = 0 = h(b) and so by Rolles theorem, there exist t
0
(a, b) such
that
0 =

h(t
0
) =

f (t
0
)
1
b a
(f (b) f (a))
=

f (t
0
)
f (b) f (a)
b a
.
Alternatively, apply the generalized mean value Theorem 10.46 below with
g (t) = t.
Corollary 10.17. Suppose that f : (a, b) 1 is dierentiable on (a, b) .
1. If f
t
0 on (a, b) then f is non-decreasing on (a, b) .
2. If f
t
0 on (a, b) then f is non-increasing on (a, b) .
3. If f
t
= 0 on (a, b) , then f is constant.
Proof. For a < x
1
< x
2
< b there exists c = c (x
1
, x
2
) (x
1
, x
2
) such that
f (x
2
) f (x
1
) = f
t
(c (x
1
, x
2
)) (x
2
x
1
) .
The result easily follow from this identity.
Remark 10.18. The mean value Theorem 10.16 does not hold for vector valued
functions and in particular not for C valued functions. For example let f (t) =
(u(t) , v (t)) 1
2
be dierentiable function such that f (0) = f (1) = 0 1
2
.
We can arrange for u(t) to only be zero at
1
2
and v (t) to be zero only at 3/4.
In this case both u and v satisfy the mean value theorem with two dierent
times and there is no common time t
0
such that u(t
0
) = 0 = v (t
0
) . However,
the following mean value inequality does still hold.
Theorem 10.19 (Mean Value Inequality). Suppose that X = 1
n
and f :
[a, b] X = 1
n
is a continuous function which is dierentiable on (a, b) . Then
there exists c (a, b) such that
|f (b) f (a)|
_
_
_

f (c)
_
_
_ (b a) .
In particular, if f : [a, b] C
= 1
2
then there exists c (a, b) such that
[f (b) f (a)[

f (c)
(b a) .
[Although we do not prove it now, this result remains true when X is any normed
vector space however the proof would require the Hahn-Banach theorem. We
will give a proof of a closely related result later, see Proposition 12.12 and
Corollary 12.21 below.]
Proof. For v 1
n
, let f
v
(t) := v f (t) . Applying the mean value theorem
to f
v
shows there exists c (v) (a, b) such that
v [f (b) f (a)] = v f (b) v f (a) =
d
dt
v f (t) [
t=c(v)
(b a)
=
_
v

f (c (v))
_
(b a) (10.5)
wherein we have made use of Exercise 10.1. If f (b) = f (a) there is nothing to
prove so we will assume that f (b) ,= f (a) . In the latter case we take,
v :=
f (b) f (a)
|f (b) f (a)|
in Eq. (10.5) in order to learn,
|f (b) f (a)| = v [f (b) f (a)] =
_
v

f (c (v))
_
(b a)
|v|
_
_
_

f (c (v))
_
_
_ (b a) =
_
_
_

f (c (v))
_
_
_ (b a) .
Corollary 10.20. Suppose that f : [a, b] 1
n
is a continuous function such
that f
t
(t) = 0 for all t (a, b) , then f (t) = f (a) for all t [a, b] , i.e. f is
constant on [a, b] .
Proof. By the mean value inequality for all [a, b] we have
|f () f (a)| |f
t
(c)| ( a) = 0
for some c (a, ) . This shows that f () = f (a) for all [a, b] .
10.3 Limsup and Liminf revisited
For proofs to follow it will be convenient to introduce the notion of limsup and
liminf for function g : Y y
0
1, where Y is a metric space and y
0
Y.
[The reader will nd many more details regarding the material in this section
in the Appendix D which has been generously supplied by Ali Behzadan.]
10.3 Limsup and Liminf revisited 111
Notation 10.21 The deleted open ball at y
0
with radius > 0 is dened by
B
t
y0
() := B
y0
() y
0
= y Y : 0 < d (y, y
0
) < .
Denition 10.22. Given g : Y y
0
1 we let
limsup
yy0
g (y) := lim
0
sup
yB
y
0
()
g (y)
and
liminf
yy0
g (y) := lim
0
inf
yB
y
0
()
g (y) .
These limits always exist since sup
yB
y
0
()
g (y) decreases as decreases
while inf
yB
y
0
()
g (y) increases as decreases. Moreover
inf
yB
y
0
()
g (y) sup
yB
y
0
()
g (y) for all > 0,
and hence by passing to the limit as 0,
liminf
yy0
g (y) limsup
yy0
g (y) .
The following proposition is analogous to Proposition 3.28 and the reader may
nd a more general version in Proposition D.9.
Proposition 10.23. Keeping the notation introduced above, lim
yy0
g (y) ex-
ists in

1 i
liminf
yy0
g (y) = limsup
yy0
g (y)
in which case
lim
yy0
g (y) = liminf
yy0
g (y) = limsup
yy0
g (y) . (10.6)
Proof. (=) Suppose that L := liminf
yy0
g (y) = limsup
yy0
g (y) . If
L 1, then given > 0 there exists > 0 such that
inf
yB
y
0
()
g (y) L
and
sup
yB
y
0
()
g (y) L
.
In particular we will have
L inf
yB
y
0
()
g (y) sup
yB
y
0
()
g (y) L +
and so
L g (y) L + for all y B
t
y0
()
which, by denition, implies lim
yy0
g (y) = L.
Next suppose that
= L = liminf
yy0
g (y) = lim
0
inf
yB
y
0
()
g (y) .
Then for all 0 < M < there exists > 0 such that inf
yB
y
0
()
g (y) M, i.e.
g (y) M if 0 < d (y, y
0
) . By denition, this implies lim
yy0
g (y) = .
The case L = is similar and will be omitted.
( = ) Suppose that L := lim
yy0
g (y) exists in

1. If L 1 then for all
> 0 there exists > 0 such that
[g (y) L[ if 0 < d (y, y
0
) < ,
i.e.
L g (y) L + if y B
t
y0
() .
From this it follows that
L inf
yB
y
0
()
g (y) sup
yB
y
0
()
g (y) L +
and therefore letting 0 shows
L liminf
yy0
g (y) limsup
yy0
g (y) L +.
We may now let 0 to conclude that
L = liminf
yy0
g (y) limsup
yy0
g (y) = L
which suces to prove omitted (10.6) in this case.
Next suppose that L = . Then for every M (0, ) there exists = (M)
such that g (y) M if y B
t
y0
() and therefore,
liminf
yy0
g (y) = lim
0
inf
yB
y
0
()
g (y) M.
We may now let M to learn
lim
yy0
g (y) = liminf
yy0
g (y) limsup
yy0
g (y)
which again proves Eq. (10.6). The case where L = is proved similarly so
will be omitted.
Corollary 10.24. If g : Y y
0
[0, ) and limsup
yy0
g (y) = 0, then
lim
yy0
g (y) = 0.
Proof. This is a direct consequence of Proposition 10.23 and the observation
that
0 liminf
yy0
g (y) limsup
yy0
g (y) 0.
Lemma 10.25. If n N and g
k
: Y y
0
[0, ] are given functions for
1 k n, then
limsup
yy0
n
k=1
g
k
(y)
n
k=1
limsup
yy0
g
k
(y) . (10.7)
Proof. Since
n
k=1
g
k
(z)
n
k=1
sup
yB
y
0
()
g
k
(y) for all z B
t
y0
() ,
it follows that
sup
zB
y
0
()
n
k=1
g
k
(z)
n
k=1
sup
yB
y
0
()
g
k
(y) .
Passing the the limit as 0 in this inequality gives Eq. (10.7).
Notation 10.26 (One sided limits) Given a function, f : (a, b) 1 we let
limsup
xb
f (x) := lim
xb
sup
x<y<b
f (y) and liminf
xb
f (x) := lim
xb
inf
x<y<b
f (y) .
Similarly we let
limsup
xa
f (x) := lim
xa
sup
a<y<x
f (y) and liminf
xa
f (x) := lim
xa
inf
a<y<x
f (y) .
The proof of the following proposition is left to the reader as it is completely
analogous to the proof of Proposition 10.23.
Proposition 10.27. Keeping the notation introduced;
1. liminf
xb
f (x) limsup
xb
f (x) and lim
xa
inf
a<y<x
f (y)
limsup
xa
f (x) ,
2. liminf
xb
f (x) = limsup
xb
f (x) = L

1 i lim
xb
f (x) = L

1, and
3. liminf
xa
f (x) = limsup
xa
f (x) = L

1 i lim
xa
f (x) = L

1.
10.4 Dierentiating past innite sums
Theorem 10.28 (Weierstrass M test II). Suppose that (X, ||) is a Ba-
nach space, (Y, d) is a metric space, for each n N, f
n
: Y y
0
X is a
function, and there exists M
n
n=1
[0, ) with
n=1
M
n
< such that
sup
yY
|f
n
(y)|
X
M
n
n N.
Under the above assumptions, if lim
yy0
f
n
(y) exists in X for all n N, then
lim
yy0
n=1
f
n
(y) =
n=1
lim
yy0
f
n
(y) .
Proof. Let x
n
:= lim
yy0
f
n
(y) . Ass the norm on X is continuous it follows
that |x
n
| M
n
for all n and therefore
n=1
x
n
is absolutely convergent. To
nish the proof we must show lim
yy0
n=1
f
n
(y) = x. We begin with the
estimate;
_
_
_
_
_
x
n=1
f
n
(y)
_
_
_
_
_
=
_
_
_
_
_
n=1
[x
n
f
n
(y)]
_
_
_
_
_
=
_
_
_
_
_
N
n=1
[x
n
f
n
(y)] +
n=N+1
[x
n
f
n
(y)]
_
_
_
_
_
n=1
|[x
n
f
n
(y)]| +
n=N+1
[|x
n
| +|f
n
(y)|]
n=1
|[x
n
f
n
(y)]| + 2
n=N+1
M
n
. (10.8)
Taking the limsup
yy0
of this inequality and letting N shows
limsup
yy0
_
_
_
_
_
x
n=1
f
n
(y)
_
_
_
_
_
limsup
yy0
N
n=1
|[x
n
f
n
(y)]| + 2
n=N+1
M
n
n=1
limsup
yy0
|[x
n
f
n
(y)]| + 2
n=N+1
M
n
= 2
n=N+1
M
n
0 as N .
Alternative ending #1: for those that skipped Section 10.3 above, taking
supremum over y such that sup
0<d(y,y0)
in Eq. (10.8) gives
10.4 Dierentiating past innite sums 113
sup
0<d(y,y0)
_
_
_
_
_
x
n=1
f
n
(y)
_
_
_
_
_
n=1
sup
0<d(y,y0)
|[x
n
f
n
(y)]| + 2
n=N+1
M
n
.
Letting 0 (notice the functions involved are decreasing in and os the limits
exist) we learn,
lim
0
sup
0<d(y,y0)
_
_
_
_
_
x
n=1
f
n
(y)
_
_
_
_
_
n=1
lim
0
sup
0<d(y,y0)
|[x
n
f
n
(y)]| + 2
n=N+1
M
n
= 2
n=N+1
M
n
0 as N .
It is now easy to see that lim
0
sup
0<d(y,y0)
|x
n=1
f
n
(y)| = 0 implies
lim
yy0
|x
n=1
f
n
(y)| = 0.
Alternative ending #2: for those that dont want to use supremums. Let
> 0 be given and then choose N = N () large (but now xed) so that
2
n=N+1
M
n

2
.
Then choose = (, N ()) > 0 (so only depends on ) so small that
|[x
n
f
n
(y)]|

2N ()
when 0 < d (y, y
0
) .
Using these estimates in Eq. Eq. (10.8) shows, gives
_
_
_
_
_
x
n=1
f
n
(y)
_
_
_
_
_
n=1
2N
+ 2

2
= when 0 < d (y, y
0
) .
Thus by the denition of the limit we have shown lim
yy0
n=1
f
n
(y) = x.
Theorem 10.29 (Dierentiating past a sum). Suppose that f
n
: J :=
(a, b) X is a sequence of functions satisfying;
1. there exists c J such that F (c) :=
n=1
f
n
(c) is convergent in X.
2. For all n N, f
n
is dierentiable on J and

f
n
is continuous on J.
3. If M
n
:= sup
tJ
_
_
_

f
n
(t)
_
_
_
X
, then
n=1
M
n
< .
Then F (t) :=
n=1
f
n
(t) is convergent for t J, F is dierentiable on J,
F is continuous on J, and
F (t) =
n=1
f
n
(t) for all t J. (10.9)
Proof. For this proof we will need the mean value inequality (Theorem
10.19 or Proposition 12.12 below) for X valued functions. As of yet we have
only proved this when X = 1
n
but we will see later that it holds more generally.
By the mean value inequality (Theorem 10.19 or Proposition 12.12 below) for
any s, t J,
|f
n
(t) f
n
(s)| M
n
[t s[
and hence,
n=1
|f
n
(t) f
n
(s)| [t s[
n=1
M
n
<
from which it follows
n=1
[f
n
(t) f
n
(s)] is absolutely convergent. Therefore,
F (t) =
n=1
f
n
(t) =
n=1
(f
n
(c) + [f
n
(t) f
n
(c)])
is convergent as well. Furthermore the assumptions imply G(t) :=
n=1
f
n
(t)
is absolutely convergent and that G is a continuous function by the Weierstrass
M - test, Theorem 7.11. Since for all n N and h ,= 0 but small,
_
_
_
_
f
n
(t +h) f
n
(t)
h
_
_
_
_
M
n
,
it follows from Theorem 10.28 that
lim
h0
F (t +h) F (t)
h
= lim
h0
n=1
f
n
(t +h) f
n
(t)
h
=
n=1
lim
h0
f
n
(t +h) f
n
(t)
h
=
n=1
f
n
(t) = G(t)
which completes the proof of Eq. (10.9).
Alternative computation of

F. Let F (t) =
n=1
f
n
(t) as above. For
t J and > 0 small so that (t , t +) J, let
g
n
(h) =
_
fn(t+h)fn(t)
h
if h ,= 0
f
n
(t) if h = 0
.
By the denition of the derivative, lim
h0
g
n
(h) =

f
n
(t) = g
n
(0) which shows
g
n
is continuous at 0 and hence g
n
is continuous for h (, ) . By the mean
value inequality (Theorem 10.19 or Proposition 12.12 below), there exists c
between t and t +h such that
|g
n
(h)|
_
_
_

f
n
(c)
_
_
_ M
n
for h ,= 0.
This inequality holds at h = 0 as well. We may now apply the Weierstrass M -
test of Theorem 7.11 to conclude that
G(h) :=
n=1
g
n
(h) is continuous in h (, ) .
Since G(h) = [F (t +h) F (t)] /h for 0 < [h[ < it follows that
lim
h0
F (t +h) F (t)
h
= lim
h0
G(h) = G(0) =
n=1
g
n
(0) =
n=1
f
n
(t) .
This shows that F is dierentiable at t and

F (t) =
n=1
f
n
(t) .
Denition 10.30. We say f : (a, b) X is n times dierentiable on (a, b)
if the iterated derivatives of f up to order n exist. That is we assume f
t
exists
and then f
(2)
:= (f
t
)
t
exists on (a, b) , and f
(3)
=
_
f
(2)
_
t
exists on (a, b) , . . . ,
f
(n)
=
_
f
(n1)
_
t
exists on (a, b) .
Theorem 10.31 (Dierentiation of Power Series). Suppose
x
0
1 and a
n
n=0
is a sequence of complex numbers such that
R :=
_
limsup
n
[a
n
[
1/n
_
1
> 0. Then f (x) :=
n=0
a
n
(x x
0
)
n
de-
nes an innitely dierentiable function f : (x
0
R, x
0
+R) C. Moreover,
f
t
(x) =
n=0
na
n
(x x
0
)
n1
=
n=1
na
n
(x x
0
)
n1
(10.10)
and this series has the same radius of convergence as the series for f. [This
theorem holds more generally when a
n
n=0
is a sequence in a Banach space X
such that R :=
_
limsup
n
|a
n
|
1/n
X
_
1
> 0.]
Proof. We are going to apply Theorem 10.29. To this end suppose that
r (0, R) and that x satises [x x
0
[ < r. Letting f
n
(x) := a
n
(x x
0
)
n
we
have
[f
t
n
(x)[ =
na
n
(x x
0
)
n1
n[a
n
[ r
n1
=: M
n
=
n
r
[a
n
[ r
n1
where
limsup
n
M
1/n
n
= r limsup
n
_
n
r
_
1/n
[a
n
[
1/n
=
r
R
< 1
wherein we have used Lemma 3.30 to see that lim
n
_
n
r
_
1/n
= 1. So by the
root test it follows that
n=1
M
n
< and so we may apply Theorem 10.29 in
order to conclude Eq. (10.10) holds for [x x
0
[ < r. As r < R was arbitrary,
we may conclude that Eq. (10.10) holds for [x x
0
[ < R. The computation
above shows the series for f
t
(x) converges of [x x
0
[ < R. Thus these same
results apply to f
t
(x) and then f
tt
(x) , etc. etc. So the result now follows by a
straightforward induction argument.
Recall for a N that the binomial theorem states,
(1 +x)
a
=
a
k=0
_
a
k
_
x
k
(10.11)
where
_
a
0
_
:= 1 and
_
a
k
_
:=
a!
k! (a k)!
=
a (a 1) . . . (a k + 1)
k!
.
We are going to generalize this theorem to all a 1, see Corollary 10.33 and
Exercise 10.17 below for the general case.
Denition 10.32 (Generalized Binomial Coecients). For a 1 0
let
_
a
k
_
:=
_
a(a1)...(ak+1)
k!
if k N
1 if k = 0
,
i.e.
_
a
0
_
= 1,
_
a
1
_
=
a
1
,
_
a
2
_
=
a
1
a 1
2
,
_
a
3
_
=
a
1
a 1
2
a 2
3
, . . . .
_
a
k
_
=
k
l=1
a l + 1
l
=
a
1
a 1
2
a 2
3
. . .
a k + 1
k
. (10.12)
Notice that if a N, we will have
_
a
k
_
= 0 if k > a and for 0 k a, so
that Eq. (10.11) may be written as
(1 +x)
a
=
k=0
_
a
k
_
x
k
for x 1 and a N.
It is also often useful to observe that
_
a
k + 1
_
=
a k
k + 1
_
a
k
_
, (10.13)
and
10.4 Dierentiating past innite sums 115
_
a + 1
k
_
=
a + 1
1
a
2
a 1
3
. . .
a k + 2
k
=
a + 1
k
a
1
a 1
2
. . .
a k + 2
k 1
=
a + 1
k
_
a
k 1
_
. (10.14)
Letting a a 1 and k k + 1 in this last identity gives the identity,
_
a
k + 1
_
=
a
k + 1
_
a 1
k
_
. (10.15)
Corollary 10.33. For a N,
(1 +x)
a
=
k=0
_
a
k
_
x
k
for [x[ < 1. (10.16)
Proof. Let n = a N. We are going to arrive at this result by dierenti-
ating the geometric series,
1
1 x
=
k=0
x
k
for [x[ < 1
repeatedly. For example,
1
_
1
1 x
_
2
=
d
dx
1
1 x
=
k=1
kx
k1
=
k=0
(k + 1) x
k
and then similarly
2!
_
1
1 x
_
3
=
k=1
(k + 1) kx
k1
=
k=0
(k + 2) (k + 1) x
k
,
.
.
.
(n 1)!
_
1
1 x
_
n
=
k=0
(k +n 1) . . . (k + 2) (k + 1) x
k
.
So replacing x by x in the above expression implies
(1 +x)
a
=
k=0
(k +n 1) . . . (k + 2) (k + 1)
(n 1)!
(1)
k
x
k
for [x[ < 1.
To complete the proof we observe that the coecient of x
k
may be expressed
as,
(1)
k
(n +k 1)!
k! (n 1)!
=
(n +k 1) (n +k 2) . . . n
k!
(1)
k
=
(a k + 1) (a k + 2) . . . a
k!
=
k
l=1
a l + 1
l
=
_
a
k
_
.
Alternatively, we can prove Eq. (10.16) by induction;
a (1 +x)
a1
=
d
dx
(1 +x)
a
=
d
dx
k=0
_
a
k
_
x
k
=
k=1
_
a
k
_
kx
k1
=
k=0
_
a
k + 1
_
(k + 1) x
k
=
k=0
a
k + 1
_
a 1
k
_
(k + 1) x
k
= a
k=0
_
a 1
k
_
x
k
,
where we have made us of Eq. (10.15). This completes the induction step.
Exercise 10.6. For z C recall we have dened
e
z
:=
n=0
z
n
n!
.
The goal here is to give another proof of Proposition 7.23 using the following
outline. Show;
1.
d
dt
e
tz
= e
tz
with e
0z
= 1.
2. Use this results and the obvious product rule to show
d
dt
_
e
tz
e
tw
e
t(z+w)
_
= 0 for all z, w C and t 1. (10.17)
3. Use Eq. (10.17) to show e
z
e
w
e
(z+w)
= 1 for all z, w C.
4. Now, using item 3., show e
z
= (e
z
)
1
and then that e
z
e
w
= e
z+w
for all
z, w C.
Corollary 10.34. For x 1, the functions
e
x
:=
n=0
x
n
n!
, (10.18)
sin x :=
n=0
(1)
n
x
2n+1
(2n + 1)!
and (10.19)
cos x =
n=0
(1)
n
x
2n
(2n)!
(10.20)
are innitely dierentiable and they satisfy
d
dx
e
x
= e
x
with e
0
= 1
d
dx
sin x = cos x with sin (0) = 0
d
dx
cos x = sin x with cos (0) = 1.
Proof. Uses Theorem 10.31 to justify dierentiating the power series ex-
pansions term by term to arrive at the desired results. Alternatively, take
z = 1 in Exercise 10.6 in order to show
d
dx
e
x
= e
x
. Then take z = i in Exercise
10.6 to learn that
d
dx
e
ix
= ie
ix
. (10.21)
Recall from Eulers formula, Eq. (7.12), that e
ix
= cos x + i sin x. Then by
Exercise 10.2, Eq. (10.21) is equivalent to
d
dx
cos x +i
d
dx
sin x = i (cos x +i sin x) = sin x +i cos x.
Comparing the real and imaginary parts of this equation proves the derivative
formulas for sin x and cos x.
Exercise 10.7. Show;
1. cos
2
y + sin
2
y = 1 for all y 1. Hint: compute
d
dy
_
cos
2
y + sin
2
y
_
= 0.
2. Conclude that
e
iy
= 1 for all y 1.
3. Show [e
z
[ = e
Re z
for all z C.
Remark 10.35 (cos and sin are the functions you think they are). Please see
page 182-184 of Rudin where he shows;
1. There is a number called /2 which is the rst positive zero of cos () .
2. The functions e
i
, cos () , and sin () are 2 periodic. Here we say a
function, f : 1 C is 2 periodic i f ( + 2) = f () for all 1.
3. e
i
,= 1 for 0 < < 2 and hence [0, 2) e
i
C is one to one.
4. For all z C with [z[ = 1, there exists a unique [0, 2) such that
e
i
= z.
5. The arc length of the curve [0, ] e
i
is for all 0. [We have
to wait until we do integration theory in order to dene arc-length.] In
particular, 2 is the arc-length of the unit circle in C.
Exercise 10.8 (Theory behind separation of variables). Let a
n
n=
be a summable sequence of complex numbers, i.e.
n=
[a
n
[ < . For t 0
and x 1, dene
F(t, x) =
n=
a
n
e
tn
2
e
inx
=
n=
a
n
e
tn
2
+inx
,
where as usual e
ix
= cos (x) +i sin (x) , this is motivated by replacing x in Eq.
(10.18) by ix and comparing the result to Eqs. (10.19) and (10.20).
1. F(t, x) is continuous for (t, x) [0, ) 1. Hint: Weierstrass M - test,
Theorem 7.11.
2. F(t, x)/t, F(t, x)/x and
2
F(t, x)/x
2
exist for t > 0 and x 1.
Hint: Use Theorem 10.29 to compute these derivatives. In computing the
t derivative, you should let > 0 and apply Theorem 10.29 with t > and
then afterwards let 0.
3. F satises the heat equation, namely
F(t, x)/t =
2
F(t, x)/x
2
for t > 0 and x 1.
10.5 Taylors Theorem
Remark 10.36. If f (x) :=
n=0
a
n
x
n
converges of x (R, R) , then by in-
duction along with Theorem 10.31, one shows for [x[ < R that,
f
(k)
(x) =
n=k
a
n
n(n 1) (n 2) . . . (n k + 1) x
nk
and in particular that
f
(k)
(0) = k! a
k
.
Therefore we may express f as,
f (x) =
k=0
f
(k)
(0)
k!
x
k
.
10.5 Taylors Theorem 117
Exercise 10.9. Suppose f : 1 1 is a continuous function such that f is
dierentiable to all orders on 10 . Further suppose that lim
t0
f
(n)
(t) = L
n
exists in 1 for each n N. Show f is dierentiable at t = 0 to all orders and
that f
(n)
(0) = L
n
.
Exercise 10.10. [In this exercise we will take for granted that
d
dx
e
x
= e
x
. This
will be proved a little later in this chapter.] Let f : 1 1 be the function
dened by
f (t) :=
_
0 if t 0
e
1/t
if t > 0
.
Show f is dierentiable to all orders on 1 and that f
(n)
(0) = 0 for all n N.
Hints: rst show there exists a polynomial functions, p
n
(x) , such that
f
(n)
(t) = p
n
_
t
1
_
f (t) for t > 0. You may also use e
x
N
n=0
x
n
n!
when x 0
and therefore lim
x
x
n
e
x
= 0 for all n N.
Remark 10.37. Exercise 10.10 shows that not all innitely dierentiable func-
tions may be represented as a power series! Nevertheless we can try to approx-
imate a function f : (a, b) X near x
0
(a, b) by its Taylor Polynomial of
degree n dened by
p
n
(x) :=
n
k=0
f
(k)
(x
0
)
k!
(x x
0
)
k
. (10.22)
It is easy to check that p
n
is the unique degree n polynomial such that
p
(k)
n (x
0
) = f
(k)
(x
0
) for 0 k n with the convention that f
(0)
= f. Indeed if
p (x) =
n
k=0
p
k
(x x
0
)
k
where p
k
X,
then a simple computation shows p
(k)
(x
0
) = p
k
k!. So if we want p
(k)
(x
0
) =
f
(k)
(x
0
) for 0 k n we must require p
k
:= f
(k)
(x
0
) /k!. The question now
becomes what is the error between f and p
n
.
Theorem 10.38 (Taylors Theorem). Suppose that f : 1 1 is n+1 times
dierentiable on (a, b) and x
0
(a, b) . Let p
n
(x) be as in Eq. (10.22). Then
for all x (a, b) there exists a real number c between x
0
and x such that
f (x) = p
n
(x) +
f
(n+1)
(c)
(n + 1)!
(x x
0
)
n+1
(10.23)
Proof. Let me rst give the proof in the special case that n = 2. Given
x (a, b) with x ,= x
0
and let
1
M = M
x
so that
1
That is,
M :=
f (x) p2 (x)
(x x0)
3
.
f (x) = p
2
(x) +M (x x
0
)
3
.
Our goal is to show, for some c between x
0
and x, that
M =
f
(3)
(c)
3!
.
To prove this we let
h(t) := f (t) p
2
(t) M (t x
0
)
3
and observe that h(x) = 0 by construction of M and that h
(k)
(x
0
) = 0 for
0 k 2 and
h
3
(t) = f
3
(t) M 3!.
So to nish the proof it suces to nd a c between x
0
and x such that h
3
(c) = 0
as this will imply,
0 = f
3
(c) M 3! = M =
f
(3)
(c)
3!
To prove this last assertion we are repeatedly going to use the Mean value
theorem (or use Rolles theorem directly); 1) as h(x) h(x
0
) = 0 0 = 0,
there exists c
1
between x
0
and x so that h
t
(c
1
) = 0. 2) Similarly since h
t
(c
1
)
h
t
(x
0
) = 0 0 = 0, there exists c
2
between x
0
and c
1
such that h
tt
(c
2
) = 0. 3)
Since h
tt
(c
2
) h
tt
(x
0
) = 0 0 = 0, there exists c
3
between x
0
and c
2
such that
h
ttt
(c
3
) = 0. Thus we may take c = c
3
which is between x
0
and x.
For the general case, let x (a, b) with x ,= x
0
and dene M = M
x
so that
f (x) p
n
(x) = M (x x
0
)
n+1
(10.24)
and set
h(t) := f (t) p
n
(t) M (t x
0
)
n+1
.
By construction; h(x) = 0, h
(k)
(x
0
) = 0 for 0 k n, and
h
(n+1)
(t) = f
(n+1)
(t) M (n + 1)!.
For sake of deniteness, suppose that x > x
0
. Since h(x) h(x
0
) = 0 0 = 0,
then by the Mean value theorem (or use Rolles theorem directly) there exists c
1
between x
0
and x so that h
t
(c
1
) = 0. Similarly since h
t
(c
1
)h
t
(x
0
) = 00 = 0,
there exists c
2
between x
0
and c
1
such that h
tt
(c
2
) = 0. By repeating this
procedure we nd
x
0
< c
n+1
< c
n
< < c
1
< x
so that h
(k)
(c
k
) = 0 for 1 k n + 1. Setting c = c
n+1
, we then have
0 = h
(n+1)
(c) = f
(n+1)
(c) M (n + 1)!
which implies,
M =
f
(n+1)
(c)
(n + 1)!
.
Using this back in Eq. (10.24) gives Eq. (10.23).
Example 10.39 (Exponential Function revisited). Suppose f : 1 1, such that
f
t
(x) = f (x) for all x 1 and f (0) = 0. Then by Taylors theorem with = 0,
for all x 1 and n N
0
there exists a point c = c
n
(x) between 0 and x such
that
f (x) =
n
k=0
1
k!
x
k
+
x
n+1
(n + 1)!
f (c
n
(x))
wherein we have used f
(n)
= f for all n. Letting M
x
:= max [f (s)[ : [s[ [x[
we may conclude that
f (x)
n
k=0
1
k!
x
k
=
x
n+1
(n + 1)!
[f (c
n
(x))[
x
n+1
(n + 1)!
M
x
0 as n .
Thus there is at most one such function and if it exists it must be given by
f (x) =
k=0
1
k!
x
k
,
i.e. the function we have dened previously to be exp (x) = e
x
. We have seen in
Corollary 10.34 that exp (x) is dierentiable and the
d
dx
exp (x) = exp (x) with
exp (0) = 1 as required.
Exercise 10.11 (Taylors Theorem II). Suppose that f : 1 1
d
is n + 1
times dierentiable on (a, b) and x
0
(a, b) . Let
p
n
(x) :=
n
k=0
f
(k)
(x
0
)
k!
(x x
0
)
k
as in Eq. (10.22). Then for all x (a, b) there exists a c between x
0
and x such
that
|f (x) p
n
(x)|
_
_
_f
(n+1)
(c)
_
_
_
[x x
0
[
n+1
(n + 1)!
.
Hint: use the idea in the proof of Theorem 10.19 to reduce this to the case
where d = 1.
10.6 Inverse Function Theorem I
Theorem 10.40 (Converse to the Chain Rule). Let U and V be open
subset of 1 and suppose f : U V and g : V 1 are continuous functions
such that g is dierentiable on V and h := g f is dierentiable on U. Then f
is dierentiable at all point x U where g
t
(f (x)) ,= 0 and at such a point,
f
t
(x) = g
t
(f (x))
1
h
t
(x) . (10.25)
Proof. Suppose that x U and g
t
(f (x)) ,= 0. Let f = f (x +x) f (x)
and notice that f = (x) because f is continuous at x. Then on one hand,
h := g (f (x) +f) g (f (x)) = g
t
(f (x)) f +o (f)
while on the other hand,
h = h(x +x) h(x) = h
t
(x) x +o (x) .
Comparing these equations implies,
h
t
(x) x +o (x) = g
t
(f (x)) f +o (f)
or equivalently,
f = g
t
(f (x))
1
[h
t
(x) x +o (x) o (f)]
= g
t
(f (x))
1
h
t
(x) x +o (x) +o (f) . (10.26)
Since f is continuous, f 0 as x 0. Therefore there exists > 0 such
that [o (f)[
1
2
[f[ if [x[ . Using this in Eq. (10.26) implies, for some
C < and [x[ that
[f[ =
g
t
(f (x))
1
h
t
(x) x +o (x) +o (f)
g
t
(f (x))
1
h
t
(x) x
+o (x) +
1
2
[f[
C [x[ +
1
2
[f[ ,
i.e. [f[ 2C [x[ for [x[ . From this we may now conclude that o (f) =
o (x) and therefore Eq. (10.26) asserts,
f (x +x) f (x) = g
t
(f (x))
1
h
t
(x) x +o (x) .
i.e. f
t
(x) exists and is give by Eq. (10.25).
10.6 Inverse Function Theorem I 119
Theorem 10.41 (Inverse Function Theorem). Suppose that g : (a, b) 1
is a dierentiable function such that g
t
(x) > 0 (or g
t
(x) < 0) for all a <
x < b. Let = inf g (x) : a < x < b and := sup g (x) : a < x < b , then
g : (a, b) (, ) is a bijection and there exists a unique inverse function,
f : (, ) (a, b) such that g (f (y)) = y for all y (, ) and f (g (x)) = x
for all x (a, b) . Moreover the function f is dierentiable on (, ) and
f
t
(y) =
1
g
t
(f (y))
for all y (, ) .
Proof. I will assume that g
t
> 0, if not apply the results to g instead.
It follows by the mean value theorem that g (x) < g (x
t
) if x < x
t
. i.e. g is
strictly increasing. Since g is continuous and (a, b) is connected we know that
J := g (a, b) is connected and hence equal to an interval. If = sup J J,
then there would exists x (a, b) such that g (x) = and then for x
t
(x, b)
we would have g (x
t
) J while g (x
t
) > which violates the fact that is
an upper bound for J. Similarly we see that / J and therefore J = (, ) .
Hence we have shown g : (a, b) (, ) is a bijection and hence there exists a
unique inverse map, f : (, ) (a, b) and this map is also strictly increasing.
The continuity of f has already been addressed in Exercise 6.20 and Exercise
9.40. Here is yet another variant of the continuity proof. Suppose y (, ) ,
<
0
< y <
0
< and y
n
[
0
,
0
] is a sequence such that lim
n
y
n
=
y. Then x
n
:= f (y
n
) [f (
0
) , f (
0
)] and we wish to show lim
n
x
n
=
f (y) . For sake of contradiction suppose not, then there exists > 0 and as
subsequence, x
k
:= x
nk
such that [ x
k
f (y)[ . By passing to a subsequence
if necessary, using the sequential compactness of [f (
0
) , f (
0
)] , we may assume
that lim
k
x
k
= x
0
exists. Since g is continuous it follows that
g (x
0
) = lim
k
g ( x
k
) = lim
k
g (f (y
nk
)) = lim
k
y
nk
= y
which shows x
0
= f (y) . But this then leads to the contradiction;
0 < lim
k
[ x
k
f (y)[ = [x
0
f (y)[ = 0.
The dierentiability assertions now follows from Theorem 10.40.
Since the function g (x) = e
x
satises g
t
(x) > 0 for all x 1, lim
x
e
x
=
0, and lim
x
e
x
= there is a unique function, ln : (0, ) 1 such that
e
ln y
= y for all y > 0 and ln (e
x
) = x and x 1. We refer to this function as
the natural logarithm.
Theorem 10.42 (Natural logarithm). Let ln : (0, ) 1 be the inverse
function to exp (x) = e
x
. Then;
1. ln is an increasing dierentiable function such than ln 1 = 0 and
d
dy
ln y =
1
y
for all y > 0.
2. For a, b (0, ) , ln (ab) = ln a + ln b.
3. For a > 0, ln a
1
= ln a.
4. For any k Z and a > 0, ln a
k
= k ln a or equivalently a
k
= e
k ln a
.
5. For any q and a > 0, ln a
q
= q ln a or equivalently a
q
= e
q ln a
.
Proof. 1. The rst assertion follows directly from Theorem 10.41 noting
that
1 =
d
dy
y =
d
dy
e
ln y
= e
ln y
d
dy
ln y = y ln y
and e
0
= 1 so that ln 1 = 0.
2. If x = ln a and y = ln b, then a = e
x
and b = e
y
so that ab = e
x
e
y
= e
x+y
and hence ln (ab) = x +y = ln a + ln b.
3. Take b = a
1
in item 2. to conclude that 0 = ln 1 = ln
_
a a
1
_
=
ln a + ln a
1
, i.e. ln a
1
= ln a.
4. If k = 0 and a > 0 then a
0
= 1 so that ln a
0
= ln 1 = 0 = 0 ln a. For
k N we show using item 1. and induction that ln a
k
= k ln a. If k N then
ln a
k
= ln
_
a
]k]
_
1
= ln a
]k]
= [k[ ln a = k ln a.
5. If b =
1
n
, then e
1
n
ln a
> 0 and
_
e
1
n
ln a
_
n
= e
n
1
n
ln a
= e
ln a
= a.
As we know there is a unique positive n
th
root of a (we have already seen
this but it also follows from Theorem 10.41) and therefore, e
1
n
ln a
= a
1/n
. So if
q = m/n with m Z and n N we nd,
a
q
=
_
a
1/n
_
m
=
_
e
1
n
ln a
_
m
= e
m
n
ln a
= e
q ln a
.
Denition 10.43. For a > 0 and b 1 we dene a
b
:= e
b ln a
. [This agrees
with our previous denition of a
b
when b by item 5. of Theorem 10.42.]
Corollary 10.44. For a > 0 and b 1,
d
dx
a
x
= ln a a
x
for all x 1
and
d
dx
x
b
= bx
b1
for all x > 0.
Exercise 10.12. Prove Corollary 10.44, i.e. for a > 0 and b 1,
d
dx
a
x
= ln a a
x
for all x 1
and
d
dx
x
b
= bx
b1
for all x > 0.
Exercise 10.13. Suppose that f : 1 1 is a continuous function such that
f (0) ,= 0 and
f (x +y) = f (x) f (y) for all x, y 1. (10.27)
Show;
1. f (0) = 1 and f (x) > 0 for all x 1.
2. Under the further assumption that f is dierentiable, show f (x) = e
cx
for
some c 1.
3. Show the result in 2. holds even if we only assume that f is continuous.
Exercise 10.14. Let T = 2, 3, 5, 7, 11, . . . N be the prime numbers and for
N N let T
N
= T 1, 2, . . . , N . Show;
1. Fix N N and let T
N
= p
1
, . . . , p
k
. Then for each 1 n N
(by the prime factorization of n) there exists a unique vector a (n) =
(a
1
(n) , . . . , a
k
(n)) N
k
0
such that n = p
a1(n)
1
. . . p
ak(n)
k
and hence
N
n=1
1
n
=
N
n=1
1
p
a1(n)
1
. . .
1
p
ak(n)
k
. (10.28)
Use this observation to show,
N
n=1
1
n

k
j=1
N
l=0
1
p
l
j
j=1
1
1
1
pj
=
p1N
_
1
1
p
_
1
. (10.29)
2. Show (1 x)
1
e
2x
for all 0 x
1
2
.
3. Combining items 1. and 2. together to show,
N
n=1
1
n

p1N
e
2
1
p
= exp
_
_
2
p1N
1
p
_
_
.
Letting N then shows
p1
1
p
= which implies the primes are not
sparse in N and in particular implies #(T) = .
Example 10.45. In a similar fashion one may use Theorem 10.41 to prove the
usual dierentiation formulas for the inverse trigonometric functions;
d
dx
arctan (x) =
1
x
2
+ 1
,
d
dx
arcsin (x) =
1
1 x
2
,
d
dx
arccos (x) =
1
1 x
2
, and
d
dx
arcsec (x) =
1
[x[
_
(x
2
1)
(where they are dened).
1. For example,
d
dy
tan (y) =
d
dy
sin y
cos y
=
cos
2
y + sin
2
y
cos
2
y
=
1
cos
2
y
which is positive on (/2, /2) . Moreover we have lim
y/2
tan y =
and lim
y/2
tan y = . Thus by the inverse function theorem asserts
there exist a increasing dierentiable function, f : 1 (/2,
2
) such
that tan (f (x)) = x for all x 1. Dierentiating this equation then implies
1 =
1
cos
2
(f (x))
f
t
(x) , i.e. f
t
(x) = cos
2
(f (x)) .
Letting y = f (x) , we have
x
2
= tan
2
y =
sin
2
y
cos
2
y
= 1 cos
2
y = x
2
cos
2
y = cos
2
y =
1
1 +x
2
.
Therefore it follows that f
t
(x) =
1
1+x
2
. The functions f (x) is more com-
monly known as arctan (x) .
2. As another example, let us observe that
d
dx
sin (x) = cos (x) > 0 on
(/2, /2) and in this case lim
y/2
sin y = 1 and lim
y/2
sin y = 1.
Thus the inverse function theorem asserts there exists a dierentiable in-
creasing function, f : (1, 1) (/2, /2) such that sin (f (x)) = x. [Of
course f (x) = arcsin (x) .] Dierentiating this equation then implies
1 = cos (f (x)) f
t
(x) , i.e. f
t
(x) =
1
cos (f (x))
.
Letting y = f (x) , we have x = sin y and so cos y =
1 x
2
. We know
we must take the + sign here since cos y > 0 for y (/2, /2) . Thus we
have shown,
10.7 Lhopitals Rule 121
d
dx
arcsin (x) = f
t
(x) =
1
cos (f (x))
=
1
1 x
2
.
Exercise 10.15. Use Taylors Theorem 10.38 to show
ln (1 +x) =
n=1
(1)
n+1
n
x
n
for
1
2
< x < 1.
Exercise 10.16. Show the series
f (x) :=
n=1
(1)
n+1
n
x
n
converges for [x[ < 1 and that f (x) = ln (1 +x) for [x[ < 1. Hint: this problem
can be done independently from Exercise 10.15. Your goal is to show g (x) :=
f (x) ln (1 +x) = 0 for [x[ < 1.
Exercise 10.17 (Binomial Expansions). Let a 1 and f (x) = (1 +x)
a
.
By direct computation using induction one shows, f (0) = 1, f
t
(x) =
a (1 +x)
a1
,
f
tt
(x)
2!
=
a (a 1)
2
(1 +x)
a2
,
f
(3)
(x)
3!
=
a (a 1) (a 2)
3!
(1 +x)
a2
.
.
.
f
(n)
(x)
n!
=
a (a 1) . . . (a n + 1)
n!
(1 +x)
an
=
_
a
n
_
(1 +x)
an
.
So we expect, based on Taylors Theorem 10.38, that
(1 +x)
a
=
n=0
_
a
n
_
x
n
. (10.30)
for [x[ < R where R is the radius of convergence of the series above. Show;
1. R = 1. Hint: you might use the ratio test here.
2. Show (1 +x)
a
solves the dierential equation,
(1 +x)
d
dx
(1 +x)
a
= a (1 +x)
a
for x > 1. (10.31)
3. Show Eq. (10.30) is valid for [x[ < 1 as follows.
a) Show
g (x) :=
n=0
_
a
n
_
x
n
(10.32)
satises (1 +x) g
t
(x) = ag (x) for [x[ < 1, the same dierential equation
as (1 +x)
a
.
b) Next compute the derivative of h(x) := (1 +x)
a
g (x) and use to show
h(x) = 1 for [x[ < 1.
Exercise 10.18. Rudin problem 5.15 on page 115.
10.7 Lhopitals Rule
Theorem 10.46 (Cauchys Generalized Mean Value Theorem). Suppose
that f, g : [a, b] 1 are continuous functions which are dierentiable on (a, b) .
Then there exists t
0
(a, b) such that
[f (b) f (a)] g
t
(t
0
) = [g (b) g (a)] f
t
(t
0
) .
Proof. Let
h(t) := [f (b) f (a)] g (t) [g (b) g (a)] f (t) .
Then
h(b) = [f (b) f (a)] g (b) [g (b) g (a)] f (b) = g (a) f (b) f (a) g (b)
and
h(a) = [f (b) f (a)] g (a) [g (b) g (a)] f (a) = g (a) f (b) f (a) g (b)
so that h(a) = h(b) . Applying Rolles theorem shows there exists t
0
(a, b)
such that
0 = h
t
(t
0
) = [f (b) f (a)] g
t
(t
0
) [g (b) g (a)] f
t
(t
0
) .
Theorem 10.47 (Lhopitals Rule). Let c and L be in

1. The real valued
functions f and g are assumed to be dierentiable on an open interval J with
endpoint c, and additionally g
t
(x) ,= 0 and g (x) ,= 0 for x J. It is also
assumed that
lim
xc
f
t
(x)
g
t
(x)
= L,
where lim
xc
means lim
xc
if c is the right end point of J and lim
xc
if c is the
left endpoint of J. If either;
1) lim
xc
f (x) = lim
xc
g (x) = 0 or
2) lim
xc
[f (x) [ = lim
xc
[g (x) [ = ,
then
lim
xc
f (x)
g (x)
= L.
The following proof is taken fairly directly from Wikipedia.. For
deniteness we assume that J = (a, c) for some < a < c, we allow for
c = here. The case where c is the left endpoint of J is proved similarly. For
x J let
m(x) = inf
(x,c)
f
t
()
g
t
()

1 and M (x) = sup
(x,c)
f
t
()
g
t
()

1.
By Cauchys mean value Theorem 10.46, if a < x < y < c there exist (x, y)
such that
f (x) f (y)
g (x) g (y)
=
f
t
()
g
t
()
.
[Notice g (x) g (y) ,= 0 by the mean value theorem and the assumption that
g
t
() ,= 0 for J.] In particular we have
m(x)
f (x) f (y)
g (x) g (y)
M (x) for all a < x < y < c. (10.33)
Case 1. Suppose that lim
xc
f (x) = lim
xc
g (x) = 0. In this case we may let
y c in Eq. (10.33) in order to learn
m(x)
f (x)
g (x)
M (x) . (10.34)
Case 2. If lim
xc
[g (x) [ = , then
m(x)
f (y) f (x)
g (y) g (x)
=
f(y)
g(y)

f(x)
g(y)
1
g(x)
g(y)
M (x) . (10.35)
Since
lim
yc
f (x)
g (y)
= 0 = lim
yc
g (x)
g (y)
we may pass to the limit in Eq. (10.35) in order to nd,
m(x) liminf
yc
f (y)
g (y)
limsup
yc
f (y)
g (y)
M (x) . (10.36)
Since
lim
xc
m(x) = liminf
xc
f
t
(x)
g
t
(x)
= L = limsup
xc
f
t
(x)
g
t
(x)
= lim
xc
M (x)
we may pass to the limit in Eqs. (10.34) and (10.36) using the squeeze theorem
in order to learn L = lim
xc
f(x)
g(x)
in case 1) and
L liminf
yc
f (y)
g (y)
limsup
yc
f (y)
g (y)
L
in Case 2. The last equation implies lim
yc
f(y)
g(y)
exists and is equal to L.
Example 10.48. By Lhopitals Rule,
lim
x0
sin (x)
x
= lim
x0
d
dx
sin (x)
d
dx
x
= lim
x0
cos (x)
1
= 1
and by two applications of Lhopitals Rule,
lim
x0
1 cos (x)
x
2
= lim
x0
sin (x)
2x
= lim
x0
cos (x)
2
=
1
2
.
Technically one needs to work backwards in these string of equalities to justify
the use of Lhopitals. Let me point out you would be wrong to use Lhopitals
rule for the following limit; lim
x0
cos(x)
x
= . If we were to mistakenly use
Lhopitals rule in this /0 situation we would conclude mistakenly that
= lim
x0
cos (x)
x
= lim
x0
d
dx
cos (x)
d
dx
x
= lim
x0
sin (x)
1
= 0.
Exercise 10.19. Rudin problem 5.11 on page 115.
Exercise 10.20. Rudin problem 5.13 on page 115. In this problem assume that
x [0, 1] rather than [1, 1] and assume a ,= 0 and c > 0.
11
Simple Integration Theory
In this chapter we will let F be either 1 or C and X and Y will be Banach
spaces over F.
11.1 Banach Spaces of Bounded Functions
Notation 11.1 Let be a set and Y be a Banach space and for any function
f : Y let
|f|
u
:= sup
|f ()|
Y
.
Also let B(, Y ) denote those functions, f : Y such that |f|
u
< . We
refer to B(, Y ) as the space of Y valued bounded functions on .
Fact 11.2 The collection of all functions f : Y, denoted by Y
, is an F -
vector space where for g : Y and F we dene
(f +g) () := f () +g () for all .
We are going to take this fact for granted. To prove one needs to open a book
on linear algebra and verify that Y
satises all the axioms of a vector space.

Proposition 11.3. The space B(, Y ) is a subspace of Y
, ||
u
is a norm on
B(, Y ) , and (B(, Y ) , ||
u
) is complete, i.e. (B(, Y ) , ||
u
) is a Banach
space.
Exercise 11.1. Prove the rst part of Proposition 11.3. That is show B(, Y )
is a subspace of Y
, ||
u
is a norm on B(, Y ) . Hint: model your proof on
Exercise 6.1.
Exercise 11.2. Prove the second part of Proposition 11.3. That is show
(B(, Y ) , ||
u
) is complete, i.e. (B(, Y ) , ||
u
) is a Banach space. Hint:
model your proof on Lemma 6.27.
Notation 11.4 (Indicator Functions) The indicator function, 1
A
, of a sub-
set A is the function dened by,
1
A
() =
_
1 if A
0 if / A
.
Notation 11.5 Given a function, f : Y and y Y, let
f = y := f
1
(y) := : f () = y .
Example 11.6 (Simple functions). Given subsets A
k
and y
k
Y for 1
k n, the function,
f =
n
k=1
y
k
1
Ak
B(, Y ) .
Notice that if S =
_
z =
k
y
k
: 1, 2, . . . , n
_
where by convention z =
0 if = , then f () S and in particular f () is a nite set with no more
that 2
n
elements. Conversely if f : Y is a function such that f () is a
nite subset of Y, then
f =
yf()
y 1
f=y]
where
f = y := : f () = y = f
1
(y) .
A function f : Y such that f () is a nite set is called a simple function.
We are now going to take = [a, b] where < a < b < .
Denition 11.7 (Patititions). A partition of [a, b] is a nite subset, , of
[a, b] such that a, b . We will typically write as,
= a = t
0
< t
1
< < t
n
= b . (11.1)
The mesh size of is dened to be,
[[ := max
1in
[t
i
t
i1
[ .
Notation 11.8 Given a partition, , of [a, b] as in Eq. (11.1) and c =
(c
1
, . . . , c
n
) [a, b] such that t
i1
c
i
t
i
for 1 i n [we will say that
c interlaces ] let
f
,c
:= f (c
1
) 1
a]
+
n
i=1
f (c
i
) 1
(ti1,ti]
.
124 11 Simple Integration Theory
Fig. 11.1. The usual picture associated to the Riemann integral.
In other words, f
,c
(t) = c
1
if a = t
0
t t
1
and f
,c
(t) = c
i
if t
i1
< t t
i
with 2 i n. Notice that f
,c
: [a, b] Y is a simple function. See Figure
11.1.
Exercise 11.3 (Uniform Continuity). Show that every continuous function,
f : [a, b] Y is uniformly continuous, i.e. if > 0 there exists > 0 such that
|f(t) f(s)|
Y
if s, t [a, b] with [t s[ . Hint: see Exercise 9.24.
Proposition 11.9. If f : [a, b] Y is a continuous function then f
B([a, b] , Y ) . Let us further suppose that
m
m=1
is any collection of par-
titions of [a, b] such that lim
m
[
m
[ = 0 and that f
m
:= f
m
, c
m
where c
m
interlaces
m
, then lim
m
|f f
m
|
u
= 0. In particular, every continuous
function is the uniform limit of simple functions.
Exercise 11.4. Prove Proposition 11.9. Suppose that f : [a, b] Y is a con-
tinuous function.
1. Show f B([a, b] , Y ) .
2. Show for every > 0 there exists a > 0 such that for every partition
of [a, b] with [[ we have |f f
, c|
u
for any collections of times
c which interlace . In particular if
m
m=1
is a collection of partitions
of [a, b] such that lim
m
[
m
[ = 0 and that f
m
:= f
m
, c
m
where c
m
interlaces
m
, then lim
m
|f f
m
|
u
= 0. Hint: make use of Exercise
11.3.
11.2 Algebras of Sets
Notation 11.10 (Pairwise Disjoint Unions) If A
I
and A are subsets
of , we write A =
I
A
to mean that A =
I
A
with the added restric-

tion that A
= if ,= . We will also write A
B for A B with the

added condition that A B = . Similarly A
C = A B C with the
added requirement that A B = A C = B C = .
Denition 11.11. A set o 2
is said to be an semialgebra or elementary

class provided that
o
o is closed under nite intersections
if E o, then E
c
is a nite disjoint union of sets from o. (In particular
=
c
is a nite disjoint union of elements from o.)
Notation 11.12 A subset J 1 is said to be a half open interval it J =
(a, b] 1 for some a b . In other words a half open interval is
a set of the form (a, b] with a b < or of the form (a, ) with
a < . If J and K are two half open intervals we will write J < K i
s < t for all s J and t K. More concretely, (a, b] (c, d] 1 i b c.
Example 11.13. Let = 1, then
o :=
_
(s, t] 1 : s, t

1
_
= (s, t] : s [, ) and s < t < , 1
is an elementary class.
Example 11.14. Let < a b < , = (a, b], and
o := (s, t] : a s t b
Example 11.15. Suppose that is a nite set, then
o := :
Example 11.16. Suppose that = (a
1
, b
1
] (a
2
, b
2
] 1
2
. Then
o := (s
1
, t
1
] (s
2
, t
2
] : a
i
s
i
t
i
b
i
for i = 1, 2
is an elementary class as follows from from Example 11.14 and Exercise 11.5
below or may be veried directly. This example has an obvious generalization
to any dimension.
11.2 Algebras of Sets 125
Exercise 11.5. Let
1
and
2
be sets and /
1
2
1
and /
2
2
2
be ele-
mentary classes. Show the collection
o := AB : A /
1
and B /
2
2
12
is also a elementary class.
Proposition 11.17 (/ is an algebra of sets). Let o be a semi-algebra on
and let / = /(o)
1
denote the subsets of which may be written as a nite
disjoint union of elements from o. So A / i there E
i
o such that A =
n
i=1
E
i
. Then / is an algebra of subsets of , i.e.;
1. and belong to /,
2. if A, B /, then A B /,
3. if A / then A
c
= A /, and
4. if A, B / then A B /.
Proof. We check each assertion in turn.
1. By the properties of o, we know that , /.
2. (/ is closed under intersections.) Suppose that A =
n
i=1
E
i
/ and
B =
m
j=1
F
j
/ where E
i
, F
j
o. Then
A B =
_
n
i=1
E
i
_
_
_
m
j=1
F
j
_
_
=
i,j
E
i
F
j
where we are justied in writing
i,j
since,
[E
i
F
j
] [E
i
F
j
] = [E
i
E
i
] [F
j
F
j
] =
unless i = i
t
and j = j
t
. Therefore / is closed under intersections.
3. (/ is closed under complementation.) If A =
n
i=1
E
i
is in / with E
i
o,
then A
c
=
n
i=1
E
c
i
/ from item 2. and the property of o that implies
E
c
i
/ whenever E
i
o.
4. (/ is closed under unions.) If A, B /, then
A B = [[A B]
c
]
c
= [A
c
B
c
]
c
/
as A
c
, B
c
/ and so A
c
B
c
/ and therefore [A
c
B
c
]
c
/.
1
We will refer to A(S) as the algebra generated by S.
Example 11.18. Let = 1, then / consisting of nite disjoint unions of ele-
ments from
o :=
_
(a, b] 1 : a, b

1
_
= (a, b] : a [, ) and a < b < , 1
is an algebra For example,
A = (0, ] (2, 7] (11, ) /(o) .
Example 11.19 (Algebra of half open intervals). Let = (a, b]. The algebra of
half open intervals, /, on is the collection of subsets of (a, b] which may
be written as a union of pairwise disjoint half open subintervals of (a, b]. [The
general element of / may be written as A =
n
i=1
J
i
with J
1
< J
2
< < J
n
and J
i
(a, b] for all i.]
Lemma 11.20. Suppose that T is a nite collection of half open intervals which
are pairwise disjoint, i.e. if J, K T with J ,= K then J K = . If n = #(T) ,
there exists a unique listing of T as J
1
, . . . , J
n
with J
1
< J
2
< < J
n
.
Proof. The proof is by induction on n. If n = 1, T =J
1
and we are
done. Now suppose the statement is true when #(T) = n 1 for some n 2
and now suppose that #(T) = n. Choose J
1
= (a
1
, b
1
] T so that it has
the smallest possible left end point. By in the induction hypothesis we can list
T J
1
as J
2
, . . . , J
n
with J
2
< J
3
< . . . < J
n
. It now only remains to show
J
1
= (a
1
, b
1
] < J
2
= (a
2
, b
2
]. By construction we know that a
2
a
1
and by
assumption that J
1
J
2
= . These two conditions can only hold if a
2
b
1
, i.e.
J
1
< J
2
.
Denition 11.21. Let be a set. A collection, , of non empty subsets of
is a partition of if =
A
A. In other words, every point of x ,
belongs precisely to one and only one element, A, in . We say that is a
nite partition if we further assume that #() < .
Corollary 11.22. If is a nite partition of (a, b] consisting of half open
sub-intervals of (a, b]. Then there exists t
k
n
k=0
[a, b] such that =
(t
i1
, t
k
]
n
k=1
where n = #() and a = t
0
< t
1
< t
2
< < t
n
= b.
Proof. By Lemma 11.20 we know that = J
k
= (a
k
, b
k
]
n
k=1
with J
1
<
J
2
< < J
n
. With this ordering we must have a = a
1
and b = b
n
. If b
k
< a
k+1
for some k < n then , = (b
k
, a
k+1
) (a, b] 1 while on the other hand,
(b
k
, a
k+1
) (a, b] 1 =(b
k
, a
k+1
) [
J
J] = .
Thus we conclude that b
k
= a
k+1
for k = 1, 2, . . . , n 1 and therefore the
theorem holds with t
0
= a and t
k
= b
k
for 1 k n.
Exercise 11.6 (Structure of nite algebras). Suppose that is a set and
/ is a sub-algebra of 2
with only nitely many elements, i.e. #(/) < . For

each x let
A
x
= A / : x A /,
wherein we have used / is nite to insure A
x
/. Hence A
x
is the smallest set
in / which contains x, i.e. if A / and x A then A
x
A. Show;
1. If x, y with A
x
A
y
,= then A
x
= A
y
so either A
x
= A
y
or A
x
A
y
= .
Hint: if x / A
y
, then A
x
A
y
would be a set in / which is strictly smaller
than A
x
this would violate the denition of A
x
.
2. Let := A
1
, . . . , A
n
/ be the list of distinct elements
2
in
A
x
: x . Show that every A / may be written as A =
j
A
j
for some nite subset 1, 2, . . . , n . By convention, if = we dene
j
A
j
= 2
. In particular this shows =
n
i=1
A
i
and hence
is a partition of . [Conclusion, nite subalgebras of 2
are in one to one

correspondence with nite partitions of .]
3. Explain why there is no algebra / 2
such that #(/) = 6.

11.3 Finitely additive measure on A
Let Z be a vector space (typically we are going to take Z = 1).
Denition 11.23. An Z valued nitely additive measure on / is a func-
tion, : / Z such that;
1. () = 0 Z, and
2. (A B) = (A)+(B) whenever A, B / with AB = . [By induction
it then follows that
_
n
j=1
A
j
_
=
n
j=1
(A
j
) where A
j
/.]
Exercise 11.7. Let be a set, / be a sub-algebra of 2
, and : / [0, ) be
a nitely additive measure. Further suppose that A, B / and A
j
n
j=1
/.
Show;
1. ( is monotone) If A B, then (A) (B) and more precisely, (B)
(A) = (B A) .
2. Show
_
n
j=1
A
j
_
=
n
j=1
(A
j
) if A
j
A
i
= when i ,= j.
3. For any A, B /,
(A B) = (A) +(B) (A B) . (11.2)
4. ( is nitely subadditive) (
n
j=1
A
j
)
n
j=1
(A
j
).
2
We know this list is nite since A is a nite collection of subsets of .
Denition 11.24. If o 2
is a semi-algebra, we say a function : o Z

is additive on o if whenever n N and E =
n
i=1
E
i
with E
i
, E o, then
(E) =
n
i=1
(E
i
) .
Lemma 11.25. Let = (a, b], : [a, b] Z be a function, and o be the
collection of half open intervals in . Then : o Z dened by
((s, t]) = (t) (s) for all a s t b
is additive on o.
Proof. Suppose that J =
n
i=1
J
i
with J = (u, v] and J
i
:= (u
i
, v
i
] in o.
We must show
(J) =
n
i=1
(J
i
) . (11.3)
As the sum in Eq. (11.3) is independent of the ordering we may reorder the
terms so that J
1
< J
2
< < J
n
and then use Corollary 11.22 to deduce that
J
i
= (t
i1
, t
i
] where u = t
0
< t
1
< < t
n
= v. With this notation, we nd by
a telescoping series arguments (see Theorem 7.3) that
n
i=1
(J
i
) =
n
i=1
[(t
i
) (t
i1
)]
= (t
n
) (t
0
) = (v) (u) = (J) .
Theorem 11.26 (Construction of Finitely Additive Measures). Suppose
o 2
is a semi-algebra (see Denition 11.11) and / = /(o) is the algebra

generated by o, see Proposition 11.17. Then every additive function : o Z
such that () = 0 extends uniquely to an additive measure on /. This unique
extension will still be denoted by .
Proof. Since every element A / is of the form A =
i
E
i
for a nite
collection of E
i
o, it is clear that if extends to a measure on / then this
extension is unique and we must dene (A) by
(A) =
i
(E
i
). (11.4)
To prove existence, the main point is to show that (A) in Eq. (11.4) is well
dened; i.e. if we also have A =
j
F
j
with F
j
o, then we must show
11.4 Simple Functions 127
i
(E
i
) =
j
(F
j
). (11.5)
But E
i
=
j
(E
i
F
j
) and the additivity of on o implies (E
i
) =
j
(E
i
F
j
) and hence
i
(E
i
) =
j
(E
i
F
j
) =
i,j
(E
i
F
j
).
Similarly,
j
(F
j
) =
i,j
(E
i
F
j
)
which combined with the previous equation shows that Eq. (11.5) holds.
Finally if A =
i
E
i
and B =
m
j=1
F
j
with A B = , then A B =
(
i
E
i
)
m
j=1
F
j
_
so that
(A B) =
i
(E
i
) +
j
(F
j
) = (A) +(B) .
This shows that is a nitely additive measure on /.
Corollary 11.27. Let : [a, b] Z be any function, = (a, b], and / be the
algebra of half open intervals in . Then there exists a unique nitely additive
measure =
: / Z such that
((s, t]) = (t) (s) for all a s t b.
Proof. This corollary follows directly from Lemma 11.25 and Theorem
11.26.
Remark 11.28. Let = (a, b] and / be the algebra of half open intervals.
Given a nitely additive measure : / Z, let : [a, b] Z be dened by
(t) := ((a, t]) Z. Then if J = (s, t] (a, b], then (a, t] = (a, s] (s, t] and
so
(t) = ((a, t]) = ((a, s]) +((s, t]) = (s) +((s, t])
and hence
(J) = ((s, t]) = (t) (s) .
Hence if A =
n
i=1
(a
i
, b
i
] we nd
(A) =
n
i=1
((a
i
, b
i
]) =
n
i=1
[(b
i
) (a
i
)] (11.6)
which shows is uniquely determined by its cumulative distribution func-
tion, .
Example 11.29. Suppose that is a nite set and let o := : .
Then o is a semi-algebra and the associated algebra is / = 2
. Given any
function, : Z we may dene : o Z by () = 0 and () = () .
Then an application of Theorem 11.26 states that extends to a nitely additive
measure on / given by
(A) =
_
_
=
A
() =
A
() .
11.4 Simple Functions
In this section, let F be either 1 or C and let Y be a vector space over F. Further
suppose that is a set and / 2
is an algebra of sets.
Denition 11.30 (Simple functions). A function, f : Y is said to be
simple if f () Y is a nite set. [See Example 11.6 for the structure of simple
functions.] We further say that f is an / simple function if f = y /
for all y f () Y. Let S (/, Y ) denote the collection of Y valued simple
functions on .
Example 11.31. If y Y and A /, then f := 1
A
y (so f () = y if A
and 0 otherwise) is an / simple function. Indeed, assuming y ,= 0, we have
f () = 0, y and f
1
(y) = A / and f
1
(0) = A
c
/. If y = 0, then
f () = 0 and f
1
(0) = /.
Lemma 11.32. The set of simple functions, S (/, Y ) , is a vector subspace of
all functions from to Y. Moreover, f : Y is an / simple function i
it may be written as
f =
N
i=1
y
i
1
Ai
for some y
i
Y and A
i
/. (11.7)
One may always let y
1
, . . . , y
N
be a listing of f () and then take A
i
:=
f
1
(y
i
) / for each i in which case A
i
A
j
= for i ,= j.
Proof. Let us observe that 1
= 1 and 1
= 0 are in S (/, Y ) . If f, g
S (/, Y ) and c F 0 , then for any y Y,
f +cg = y =
_
a,bC:a+cb=y
(f = a g = b) /. (11.8)
This shows that S (/, Y ) is a vector subspace.
From Example 11.31 and the fact that S (/, Y ) is a linear subspace, any
function given in the form of Eq. (11.7) is an / simple function. Conversely
if f is an / simple function, let y
1
, . . . , y
N
be a listing of f () and set
A
i
:= f
1
(y
i
) . Then Eq. (11.7) holds with A
i
A
j
= if i ,= j.
Corollary 11.33. If f S (/, Y ) and F : Y Z is any function from Y to
another vector space Z, then F f S (/, Z) . In particular, if ||
Y
is a norm
on Y, then |f ()|
Y
S (/, 1) .
Proof. We may write f as in Eq. (11.7) with A
i
A
j
= for all i ,= j. We
then have,
F f =
N
i=1
F (y
i
) 1
Ai
which is in S (/, Z) by Lemma 11.32.
Lemma 11.34. The set of simple functions, S (/, Y ) , is a vector subspace of
all functions from to Y. Moreover, f S (/, F) and g S (/, Y ) then f g
S (/, Y ) . [In algebraic jargon, S (/, F) is an algebra of functions and S (/, Y )
is a module over S (/, F) .]
Proof. First proof. We may write f and g as,
f =
N
i=1
i
1
Ai
and g =
M
j=1
y
j
1
Bj
with
i
F, y
j
Y, and A
i
, B
j
/. It then follows that
f g =
N
i=1
M
j=1
i
y
j
1
Ai
1
Bj
=
N
i=1
M
j=1
i
y
j
1
AiBj
which is a linear combination of / indicator functions and hence is in S (/, Y ) .
Second proof. We rst observe that the f g can only take values in
ay : a f () and y g () which is a nite set. Therefore f g is a simple
function. If y Y 0 , then
f g = y =
az=y
(f = a g = z) / (11.9)
where the union is over those a f () 0 and z g () 0 such that
az = y. We also have
f g = 0 = f = 0 g = 0 /
and thus we have shown f g S (/, F) .
Denition 11.35. When = (a, b], we say that f : Y is a step function
if there is a partition, = a = t
0
< t
1
< < t
n
= b of such that f is
constant on (t
i1
, t
i
] for each i, i.e. there exists y
i
Y such that
f =
n
i=1
y
i
1
(ti1,ti]
. (11.10)
Lemma 11.36. Let = (a, b] and f : Y be a function, then;
1. f is an / simple function i f is a step function.
2. Suppose X is another vector space and F : Y X is any function then
F f S (/, X) for all f S (/, Y ) . In particular, |f|
Y
S (/, 1) when
f S (/, Y ) .
Proof. 1. All step functions are / simple functions by Lemma 11.32.
Conversely if f is an / simple function then
f =
yf()
y 1
f=y]
where f = y =
ny
k=1
J
y
k
with J
y
k
being half open intervals. It then fol-
lows that =
yf()
ny
k=1
J
y
k
and so that collection of half open intervals
yf()
J
y
k
: 1 k n
y
forms a partition of . Therefore there exits =
a = t
0
< t
1
< < t
N
= b such that (t
i1
, t
i
] is in
yf()
J
y
k
: 1 k n
y
for each i and therefore f is constant on (t

i1
, t
i
], i.e. f is a step function.
2. If f S (/, Y ) and F : Y X is any function, then
F f = F
_
_

yf()
y 1
f=y]
_
_
=
yf()
F (y) 1
f=y]
S (/, Y ) .
11.5 Simple Integration
Denition 11.37. A nitely additive measure space is a triple (, /, )
where is a set, / is a sub-algebra of 2
, and : / 1
+
is a nitely additive
measure.
In this section we will suppose that (, /, ) is a nitely additive measure
space and Y is a normed space.
Denition 11.38 (Simple Integral). For f S (/, Y ) the integral (written
as
_
fd or
_
f () d() is dened by
_
fd =
yf()
(f = y) y Y. (11.11)
where (f = y) is shorthand for (f = y) . It is often convenient to write the
last sum as
11.5 Simple Integration 129
yY
(f = y) y
with the understanding that this is really a nite sum since (f = y) ,= 0 hap-
pens only when y f () which is a nite set.
To simplify notation I will often write
_
f for
_
fd in this section.
Example 11.39. Suppose that A / and y Y, then
_
y1
A
d =
_
y1
A
d = (A) y = y
_
1
A
d. (11.12)
Proposition 11.40. The integration operator,
_
: S (/, Y ) Y, satises:
1. If f S (/, Y ) and F, then
_
f =
_
f. (11.13)
2. If f, g S (/, Y ) , then
_
(f +g) =
_
g +
_
f. (11.14)
Items 1. and 2. say that integration is a linear functional on S (/, Y ) .
3. If f =
N
j=1
y
j
1
Aj
for some y
j
Y and some A
j
/, then
_
f =
N
j=1
y
j
(A
j
) . (11.15)
4. For all f S (/, Y ) ,
_
_
_
_
_
f
_
_
_
_
Y
|f (t)| d(t) |f|
() . (11.16)
where
|f|
:= sup
|f ()|
Y
.
5. If X is another vector space and T : Y X is a linear operator, then
T
_
fd =
_
T (f) d.
Proof.
1. If F and ,= 0, then
_
f =
yY
y (f = y) =
yY
y (f = y/)
=
zY
z (f = z) =
_
f.
The case = 0 is trivial.
2. Writing f = v, g = w for f
1
(v) g
1
(w), then
_
(f +g) =
zY
z (f +g = z)
=
zY
z
_

v+w=z
f = v, g = w
_
=
zY
z
v+w=z
(f = v, g = w)
=
zY
v+w=z
(v +w) (f = v, g = w)
=
v,w
(v +w) (f = v, g = w) .
But
v,w
v(f = v, g = w) =
v
v
w
(f = v, g = w)
=
v
v(
w
f = v, g = w)
=
v
v(f = v) =
_
f
and similarly,
v,w
w(f = v, g = w) =
_
g.
Equation (11.14) is now a consequence of the last three displayed equations.
3. If f =
N
j=1
y
j
1
Aj
, then using the linearity of the integral and Example
11.39 we nd
_
f =
_
_
_
N
j=1
y
j
1
Aj
_
_
=
N
j=1
y
j
_
1
Aj
=
N
j=1
y
j
(A
j
) .
4. By the triangle inequality,
_
_
_
_
_
f
_
_
_
_
=
_
_
_
_
_
_
yf()
y(f = y)
_
_
_
_
_
_
yf()
|y| (f = y) =
_
|f| d,
wherein the last equality we have used Eq. (11.15) and the fact that [f[ =
yY
[y[ 1
f=y
. Since for y f () we have |y| |f|
, we may conclude
that
yf()
|y| (f = y)
yf()
|f|
(f = y)
= |f|
yf()
(f = y) = |f|
()
from which we see the second inequality in Eq. (11.16) holds.
5. Since T is linear,
T f =
yf()
T (y) 1
f=y]
and therefore,
_
T (f) d =
yf()
_
T (y) 1
f=y]
d =
yf()
T (y) (f = y)
= T
_
_

yf()
y(f = y)
_
_
= T
__
fd
_
= T
_
fd.
Proposition 11.41. If Y = F = 1, then
_
is positive, i.e.
_
f 0 for all
0 f S (/, 1) . More generally, if f, g S (/, 1) and f g, then
_
f
_
g.
Proof. If f 0 then
_
f =
a0
a(f = a) 0
and if f g, then g f 0 so that
_
g
_
f =
_
(g f) 0.
Notation 11.42 If A / and f S (/, Y ) , we let
_
A
fd :=
_
1
A
fd.
Notation 11.43 When := (a, b], : [a, b] 1 is an increasing function,
/ is the algebra of half open intervals on , and : / 1
+
is the unique
measure such that ((s, t]) = (t) (s) for all a s t b, we will denote
_
fd by
_
b
a
fd or
_
b
a
fd. Moreover, if f S (/, Y ) and , [a, b] we let
_

fd =
_
_
b
a
1
(,]
fd if
_
b
a
1
(,]
fd if
.
Proposition 11.44. Let us continue using the setup in Notation 11.43. If f
S (/, Y ) and , , [a, b] , then
_

fd =
_

fd +
_

fd.
Moreover,
_
_
_
_
_
_

fd
_
_
_
_
_
|f| d
.
Proof. There are a number of cases to consider. I will only check two of
them.
First suppose that . In this case we have
1
(,]
f = 1
(,]
f + 1
(,]
f
and therefore,
_

fd =
_
1
(,]
f =
_
1
(,]
f +
_
1
(,]
f
=
_

fd +
_

fd.
Secondly suppose that , then by the rst case,
_

fd =
_

fd +
_

fd
and so
_

fd =
_

fd
_

fd =
_

fd +
_

fd.
11.6 Product Measure 131
The remaining possibilities are veried by similar means.
For the last assertion, we may suppose that and the result follows
from Eq. (11.16) since,
_
_
_
_
_
_

fd
_
_
_
_
_
=
_
_
_
_
_
_
b
a
1
(,]
fd
_
_
_
_
_
_
b
a
_
_
1
(,]
f
_
_
d =
_
b
a
1
(,]
|f| d =
_

|f| d.
11.6 Product Measure
Suppose that (
i
, /
i
,
i
) for i = 1, 2 be two nitely additive measure spaces.
From Exercise 11.5 we know that o := AB : A /
1
and B /
2

2
12
is an elementary class. Then, by Proposition 11.17, the subalgebra of /
of 2
12
generated by o consists of those C
1
2
which may be written
as
C :=
i
A
i
B
i
for some A
i
/
1
and B
i
/
2
. (11.17)
Denition 11.45 (Product Algebra). We let /
1
/
2
be the subalgebra of
2
12
consisting of sets C
1

2
of the form in Eq. (11.17).
Theorem 11.46 (Product Measure). There is a unique nitely additive
measure,
1

2
, on / := /
1
/
2
,such that
1

2
(AB) =
1
(A)
2
(B) for all A /
1
and B /
2
.
Moreover
1

2
satises,
_
1
__
2
1
C
(x, y) d
2
(y)
_
d
1
(x) =
1
2
(C) =
_
2
__
1
1
C
(x, y) d
1
(x)
_
d
2
(y)
(11.18)
for all C /. Part of what is being asserted here is that y 1
C
(x, y) is an
/
2
simple function for each x
1
and x
_
2
1
C
(x, y) d
2
(y) is an /
1
simple function and similarly, x 1
C
(x, y) is an /
1
simple function for
each y
2
and x
_
1
1
C
(x, y) d
1
x is an /
2
simple function so that all
integrals make sense.
Proof. Let C / be given as in Eq. (11.17). Then we have
1
C
(x, y) =
i
1
Ai
(x) 1
Bi
(y) for all (x, y)
1

2
and hence for each xed x
1
the functions y 1
C
(x, y) is an /
2
simple
function and
_
2
1
C
(x, y) d
2
(y) =
i
1
Ai
(x)
_
2
1
Bi
(y) d
2
(y) =
i
1
Ai
(x)
2
(B
i
) .
From this equation we see that x
_
2
1
C
(x, y) d
2
(y) is an /
1
simple
function and that
_
1
__
2
1
C
(x, y) d
2
(y)
_
d
1
(x) =
2
(B
i
)
_
1
1
Ai
(x) d
1
(x)
=
1
(A
i
)
2
(B
i
) .
A similar computation shows
_
2
__
1
1
C
(x, y) d
1
(x)
_
d
2
(y) =
1
(A
i
)
2
(B
i
) .
At any rate we may dene
1

2
:= : / 1
+
by
(C) :=
_
1
__
2
1
C
(x, y) d
2
(y)
_
d
1
(x) .
If C
1
, C
2
/ are disjoint sets then
1
C1C2
= 1
C1+1C
2
and it follows that
(C
1
C
2
) =
_
1
__
2
[1
C1
(x, y) + 1
C2
(x, y)] d
2
(y)
_
d
1
(x)
=
_
1
__
2
1
C1
(x, y) d
2
(y) +
_
2
1
C2
(x, y) d
2
(y)
_
d
1
(x)
=
_
1
__
2
1
C1
(x, y) d
2
(y)
_
d
1
(x) +
_
1
__
2
1
C2
(x, y) d
2
(y)
_
d
1
(x)
= (C
1
) +(C
2
) .
Thus we have shown
1
2
is a nitely additive measure on / which satises
Eq. (11.18) and
1

2
(C) =
1
(A
i
)
2
(B
i
)
when C / is given as in Eq. (11.17). This last expression uniquely determines
1

2
.
Corollary 11.47 (Area measure). Let := (a, b]
2
1
2
. Then the collection
/ of subsets C of the form, C =
n
i=1
(x
i
, y
i
] (u
i
, v
i
], is a subalgebra of
2
, where a x
i
y
i
b and a u
i
vy
i
b. Moreover there is a unique
nitely additive measure m : / 1
+
such that
m(C) =
n
i=1
(y
i
x
i
) (v
i
u
i
)
where C is given as above, i.e. m is the total area of C.
Theorem 11.48 (Baby Fubini Theorem). Continuing the above notation.
If f :
1

2
1 is an / simple function, then
_
1
__
2
f (x, y) d
2
(y)
_
d
1
(x)
=
_
12
fd [
1

2
] =
_
2
__
1
f (x, y) d
1
(x)
_
d
2
(y) .
Exercise 11.8. Prove Theorem 11.48. That is show
_
1
__
2
f (x, y) d
2
(y)
_
d
1
(x)
=
_
12
fd [
1

2
] =
_
2
__
1
f (x, y) d
1
(x)
_
d
2
(y)
for all / simple functions, f :
1

2
1. Here (
i
, /
i
,
i
) are two
nitely additive measure spaces and
1

2
is the product measure on
(
1

2
, / = /
1
/
2
) . [Part of the problem here is to show all integrals
are well dened.]
12
Extending the Integral by Uniform Limits
We now suppose that Y is a Banach space, is a set, / is a subalgebra of
2
, and : / 1
+
is a nitely additive measure on /.
Theorem 12.1. Suppose that f : Y is a function such that there exists
f
n
S (/, Y ) with |f f
n
|
0 as n , then lim
n
_
f
n
d exists in
Y. Moreover if g
n
S (/, Y ) with |f g
n
|
0 as n , then
lim
n
_
g
n
d = lim
n
_
f
n
d.
We denote this common limit by
_
fd and call it the integral of f relative to

.
Proof. To show the limit exists we need only need to use the completeness
of Y along with the observation that
_
_
_
_
_
f
n
d
_
f
m
d
_
_
_
_
|f
n
f
m
|
() 0 as m, n .
Similarly,
_
_
_
_
_
f
n
d
_
g
m
d
_
_
_
_
|f
n
g
n
|
() 0 as m, n
since
|f
n
g
n
|
|f
n
f|
+|f g
n
|
0 + 0 = 0 as n .
In light of Theorem 12.1 we have extended the integral from

S (/, Y ) the
closure of S (/, Y ) in the collection of bounded functions from Y relative
to the ||
.
Exercise 12.1 (

S (/, Y ) is an

S (/, F) module). If f

S (/, F) and g, h
S (/, Y ) , then h + fg

S (/, Y ) and |g|

S (/, 1) . You should prove this
by showing; if f
n
S (/, F) and g
n
, h
n
S (/, Y ) are chosen so that f
n
f,
h
n
h, and g
n
g uniformly on then h
n
+f
n
g
n
h +fg and |g
n
()|
|g ()| uniformly on .
Exercise 12.2 (Closure of subspaces are subspaces). Suppose that
(B, ||) is a Banach space and S B is a linear subspace of B. Show that
closure,

S, of B is a subspace of B as well. [It now follows that
_
S, ||
_
is a
complete normed space, i.e. is a Banach space.]
Theorem 12.2 (Linearity of the Integral). The extended integral,
_
S (/, Y ) Y is still linear and satises the estimates,

_
_
_
_
_
fd
_
_
_
_
Y
|f ()|
Y
d() |f|
() . (12.1)
In particular if f
n

S (/, Y ) and f = lim
n
f
n
exists in

S (/, Y ) , i.e.
|f f
n
|
0 as n , then
lim
n
_
f
n
d =
_
fd =
_
_
lim
n
f
n
_
d. (12.2)
Proof. Let F and g, h

S (/, Y ) and choose g
n
, h
n
S (/, Y ) are
chosen so that h
n
h, and g
n
g uniformly on . Then
_
(h +g) d = lim
n
_
(h
n
+g
n
) d
= lim
n
__
h
n
d +
_
g
n
d
_
=
_
hd +
_
gd.
Similarly using |h
n
| |h| uniformly on we have,
_
_
_
_
_
hd
_
_
_
_
= lim
n
_
_
_
_
_
h
n
d
_
_
_
_
lim
n
_
|h
n
| d =
_
|h| d
and _
|h
n
| d |h
n
|
()
so that
lim
n
_
|h
n
| d lim
n
|h
n
|
() = |h|
() .
134 12 Extending the Integral by Uniform Limits
For the last assertion we need only observe that
_
_
_
_
_
fd
_
f
n
d
_
_
_
_
=
_
_
_
_
_
(f f
n
) d
_
_
_
_
|f f
n
|
() 0 as n .
Notation 12.3 If A / and f

S (/, Y ) we let
_
A
fd :=
_
1
A
fd.
Theorem 12.4 (Summary of Integration). Given a measure space,
(, /, ) , there is a unique linear map,
S (/, Y ) f
_
fd Y
such that f = y1
A
for some y Y and A /, then
_
fd = y(A) . Moreover
this integral satises;
1.
_
_
_
fd
_
_
Y

_
|f ()|
Y
d() |f|
() for all f

S (/, Y ) ,
2. If A, B / and A B = , then
_
AB
fd =
_
A
fd +
_
B
fd.
3. If X is another Banach space and T : Y X is a continuous linear map,
then
T
_
fd =
_
Tfd.
4. If Y = 1, f, g

S (/, 1) , and f () g () for all then
_
fd
_
gd.
Proof. Everything but items 2. 4. have already been proved. Item 2. is
easy since
1
AB
f = 1
A
f + 1
B
f
and so the result follows by linearity of the integral.
3. If f =
n
i=1
y
i
1
Ai
with y
i
Y and A
i
/, then Tf =
n
i=1
Ty
i
1
Ai
and
_
Tfd =
n
i=1
Ty
i
(A
i
) = T
_
n
i=1
y
i
(A
i
)
_
= T
_
fd.
If f

S (/, Y ) and f
n
S (/, Y ) with |f f
n
|
0, then
T
_
fd = T lim
n
_
f
n
d
= lim
n
T
_
f
n
d [T is continuous]
= lim
n
_
Tf
n
d.
Now from Exercise 7.12 there exists C < such that |Ty|
X
C |y|
Y
for all
y Y and therefore,
|Tf Tf
n
|
= sup
|Tf () Tf
n
()|
Y
C sup
|f () f
n
()| = C |f f
n
|
and hence Tf
n
Tf uniformly and hence Tf

S (/, X) and
T
_
fd = lim
n
_
Tf
n
d =
_
Tfd.
4. Let h() = f () g () 0 and choose h
n

S (/, 1) such that
|h
n
h|
0 as n . Hence given any > 0 there exists N N such

that h
n
for all n N and therefore,
_
hd = lim
n
_
h
n
d = lim
n
_
[h
n
+] d
_
d () .
As > 0 is arbitrary we may conclude that
_
fd
_
gd =
_
hd 0.
Exercise 12.3. Let f

S (/, 1) and y Y where Y is a Banach space. Show
_
yfd = y
_
fd.
Hint: this can be proved by from rst principles but it would be easier to use
Theorem 12.4 for a well chosen continuous linear transformation, T : F Y.
Exercise 12.4 (Interchanging sums with integrals). Suppose that
f
n
n=1

S (/, Y ) and
n=1
|f
n
|
< . Show
n=1
f
n
is in

S (/, Y ) and
_
n=1
f
n
_
d =
n=1
_
f
n
d
12 Extending the Integral by Uniform Limits 135
and moreover
_
_
_
_
_
_
n=1
f
n
_
d
_
_
_
_
_
n=1
_
|f
n
| d ()
n=1
|f
n
|
.
Exercise 12.5. Suppose that

S (/, 1) with 0.
1. Show (A) :=
_
A
d denes a nitely additive measure on /.
2. If f

S (/, Y ) , show,
_
fd =
_
fd. (12.3)
Hint: rst verify Eq. (12.3) when f = y 1
A
for some y Y and A /.
At this stage you should make use of Exercise 12.3. Then verify Eq. (12.3)
for f S (/, Y ) and then go on to the general case.
Let us now specialize to the case where := (a, b], : [a, b] 1 is an
increasing function, /is the algebra of half open intervals on , and : / 1
+
is the unique measure such that ((s, t]) = (t) (s) for all a s t b.
Theorem 12.5. Suppose that = (a, b] and let C ([a, b] , Y ) be the space of
continuous functions from [a, b] Y. Then C ([a, b] , Y )

S (/, Y ) . [More
precisely, if f C ([a, b] , Y ) then f[
(a,b]

S (/, Y ) .] Further suppose that f
C ([a, b] , Y ) and to each partition := a = t
0
< t
1
< < t
n
= b of [a, b]
we have chosen c
i
[t
i1
, t
i
] for 1 i n. Then
_
b
a
fd = lim
]]0
n
i=1
f(c
i
)((t
i
) (t
i1
)).
In particular if (t) = t, then we write dt for d or d and we have,
_
b
a
f (t) dt = lim
]]0
n
i=1
f(c
i
)(t
i
t
i1
).
Proof. From Exercise 11.4 we know that f may be written as a uniform
limit of step functions and hence is in

S (/, Y ) . Moreover that same exercise
shows; if > 0 there exists > 0 such that
_
_
f f
,c
_
_
u
if [[ .
Therefore if [[ , then
_
_
_
_
_
_
b
a
fd
_
b
a
f
,c
d
_
_
_
_
_
_
_
f f
,c
_
_
u
() ()
and this suces to show
lim
]]0
_
b
a
f
,c
d =
_
b
a
fd.
This completes the proof since,
f
,c
=
n
i=1
f (c
i
) 1
(ti1,ti]
on = (a, b]
and hence
_
b
a
f
,c
d =
n
i=1
f (c
i
) ((t
i1
, t
i
]) =
n
i=1
f(c
i
)((t
i
) (t
i1
)).
If f
n
o and f

o such that lim
n
|f f
n
|
= 0, then for a u <

v b, then 1
(u,v]
f
n
o and lim
n
_
_
1
(u,v]
f 1
(u,v]
f
n
_
_
= 0. This shows
1
(u,v]
f

o whenever f

o.
Notation 12.6 For u, v [a, b] and f

S (/, Y ) we let
_
v
u
fd :=
_ _
1
(u,v]
fd if u v
1
(v,u]
fd if u v
.
The next lemma explains there reason for the convention in Notation 12.6.
Lemma 12.7. For all u, v, w [a, b] and f

S (/, Y ) we have
_
w
u
fd =
_
v
u
fd +
_
w
v
fd. (12.4)
Proof. If u v w, then the formula in Eq. (12.4) is a consequence of the
linearity of the integral and the fact that
1
(u,w]
= 1
(u,v]
+ 1
(v,w]
.
All other cases may be reduced to this case. For example if w u v, then we
know
_
v
w
fd =
_
u
w
fd +
_
v
u
fd =
_
w
u
fd +
_
v
u
fd.
Solving this equation for
_
w
u
fd then gives
_
w
u
fd =
_
v
u
fd
_
v
w
fd =
_
v
u
fd +
_
w
v
fd.
Similar arguments work for the 4 other orderings of u, v, w and are left to the
reader. [That is you should verify Eq. (12.4) in the following cases; 1) u w v,
2) v u w, 3) v w u, and 4) w v u.
Lemma 12.8. Suppose that u, v [a, b] and f

S (/, Y ) , then
_
_
_
_
_
v
u
fd
_
_
_
_
_
v
u
|f| d
|f1
J
|
(J)
where
J =
_
(u, v] if u < v
(v, u] if v u.
Proof. By denition we have
_
_
_
_
_
v
u
fd
_
_
_
_
=
_
_
_
_
_
J
fd
_
_
_
_
=
_
_
_
_
_
_
(a,b]
1
J
fd
_
_
_
_
_
.
Therefore,
_
_
_
_
_
v
u
fd
_
_
_
_
_
(a,b]
|1
J
f| d = |1
J
f|
(J)
which combined with the identity,
_
v
u
|f| d
_
J
|f| d
=
_
(a,b]
|1
J
f| d =
_
J
|f| d,
proves the result.
The next lemma lists a number of the many familiar properties of the integral
and really is just a special case of Theorem 12.4.
Lemma 12.9. For f

o((a, b], X) and u, v, w [a, b], the integral satises:
1.
_
_
_
v
u
f(t) dt
_
_
X
(v u) sup |f(t)| : u t v .
2.
_
w
u
f(t) dt =
_
v
u
f(t) dt +
_
w
v
f(t) dt.
3. If Y is another Banach space and T : X Y be a continuous linear
operator,
1
then Tf

o((a, b], Y ) and
T
__
v
u
f(t) dt
_
=
_
v
u
Tf(t) dt.
4. The function t |f(t)|
X
is in

o((a, b], 1) and
_
_
_
_
_
_
b
a
f(t) dt
_
_
_
_
_
X
_
b
a
|f(t)|
X
dt.
1
We will let L(X, Y ) denote the continuous linear operators from X to Y.
5. If f, g

o((a, b], 1) and f g, then
_
b
a
f(t) dt
_
b
a
g(t) dt.
Analogous statements also hold if dt is replaced by d(t) where is any
nitely additive measure from /
(a,b]
1
+
. The only modication is in
item 1. where we would now have,
_
_
_
_
_
v
u
f(t) dt
_
_
_
_
X
((u, v]) sup |f(t)| : u t v .
Proof.
1.
_
_
_
v
u
f(t) dt
_
_
(v u) sup |f(t)| : u t v is easily seen to hold for

f o. The general statement then holds by passing to the limit.
2. The function G(t) :=
_
t
a
f()d is continuous on [a, b], indeed,
|G(t) G()| =
_
_
_
_
_
t
f () d
_
_
_
_
|f|
[t [ .
3. If T L(X, Y ) and f o((a, b], X) then Tf o((a, b], Y ) and because T
commutes with nite sums,
T
__
v
u
f(t)dt
_
=
_
v
u
Tf(t)dt.
For general f

o((a, b], X), there exists f
n
o((a, b], X) such that
|f f
n
|
0. Because T is bounded it follows that

|Tf Tf
n
|
|T| |f f
n
|
0.
which shows Tf

o((a, b], Y ). Moreover we have
T
__
v
u
f(t)dt
_
= T lim
n
_
v
u
f
n
(t)dt = lim
n
T
_
v
u
f
n
(t)dt
= lim
n
_
v
u
Tf
n
(t)dt =
_
v
u
Tf(t)dt.
4. The function t |f(t)|
X
is in o((a, b], 1) for f o((a, b], X) and by the
triangle inequality for nite sums,
_
_
_
_
_
_
b
a
f(t) dt
_
_
_
_
_
_
b
a
|f(t)| dt. (12.5)
12 Extending the Integral by Uniform Limits 137
For general f

o((a, b], X), choose f
n
o((a, b], X) such that
|f f
n
|
0 as above. Since
sup
t
[|f (t)| |f
n
(t)|[ sup
t
|f (t) f
n
(t)| = |f f
n
|
0 as n
it follows that t |f(t)|
X
is in

o((a, b], 1). A similar limiting argument
to that in item 4. shows that Eq. (12.5) holds for f

o((a, b], X).
5. It suces to show if f

o((a, b], 1) with f 0 then
0
_
b
a
f(t)dt.
This is certainly true if f o((a, b], 1). Again for general f

o((a, b], 1)
with f 0, choose f
n
o((a, b], 1) such that |f f
n
|
0. By replacing
f
n
with max (f
n
, 0) o((a, b], 1) we may assume f
n
0 for all n. So passing
to the limit in the inequality,
0
_
b
a
f
n
(t)dt
leads to the desired conclusion.
Exercise 12.6. Show f = (f
1
, . . . , f
n
)

o((a, b], 1
n
) i f
i

o((a, b], 1) for
i = 1, 2, . . . , n and
_
b
a
f(t)dt =
_
_
b
a
f
1
(t)dt, . . . ,
_
b
a
f
n
(t)dt
_
. (12.6)
Here 1
n
is to be equipped with the usual Euclidean norm. Hint: Use item 4.
of Lemma12.9 with judiciously chosen T : 1
n
1 for prove one direction.
For the opposite direction, use Exercise 12.1 and the observation that f (t) =
n
i=1
f
i
(t) e
i
where e
i
n
i=1
is the standard basis for 1
n
.
Exercise 12.7. Let 0 < a < T < and > 0 be given and let : [0, T] 1
+
be dened by
(x) = 1
xa
=
_
if x a
0 if x < a
.
Further let
be the unique nitely additive measure on /

(0,T]
such that
((s, t]) = (t) (s) for all 0 s t T. Show

_
(0,T]
fd =
_
(0,T]
fd
= f (a) for all f C ([0, T] , C) .

Exercise 12.8. Let 0 < a < T < and > 0 be given and let : [0, T] 1
+
be dened by
(x) = 1
x>a
=
_
if x > a
0 if x a
.
Further let
be the unique nitely additive measure on /

(0,T]
such that
((s, t]) = (t) (s) for all 0 s t T. Show

_
(0,T]
fd =
_
(0,T]
fd
= f (a) for all f C ([0, T] , C) .

[Thus it is possible that
_
(0,T]
fd =
_
(0,T]
fd for all f C ([0, T] , C)
even though ,= . This point may be addressed more in the third quarter.]
Remark 12.10. If , : [0, T] 1 are two increasing functions, it is easy to
verify that
+
=
and
_
T
0
fd ( +) =
_
T
0
fd +
_
T
0
fd
for all f

o
_
/
(0,T]
, Y
_
.
Corollary 12.11. If : [0, T] 1 is dene by
(x) :=
n
i=1
i
1
xai
.
where 0 < a
1
< a
2
< < a
n
< T and
i
> 0 for 1 i n, then
_
T
0
fd =
n
i=1
i
f (a
i
) for all f C ([0, T] , C) .
Moreover if (x) = (x) +x, then for all f C ([0, T] , C) ,
_
T
0
fd =
n
i=1
i
f (a
i
) +
_
T
0
f (x) dx.
This shows that our integral may be a combination of the usual Riemann integral
along with a nite sum.
12.1 The Fundamental Theorem of Calculus
Our next goal is to show that our Riemann integral interacts well with dierenti-
ation, namely the fundamental theorem of calculus holds. The next proposition
follows directly from Theorem 10.19. Since we have not proven this theorem
in full generality, I will supply a proof of the proposition which is all we really
need for our purposes.
Proposition 12.12 (A Crude Mean Value Inequality). Suppose that
< a < b < and f : [a, b] X is a continuous function such that
f(t) exists for all t (a, b). If M := sup

a<t 0 and (a, b) there exists (by denition of the
derivative) a
> 0 such that

_
_
_f(t) f()

f()(t )
_
_
_ [t [ if [t [
. (12.7)
This in equality along with the triangle inequality implies for [t [
with
t [a, b] that
|f(t) f()|
_
_
_

f()(t )
_
_
_ + [t [ (M +) [t [ . (12.8)
Now suppose that (a, b) and > 0 is given. (We will later let 0.) Let
S := t [, b] : |f(t) f()| > (M +) (t ) .
From Eq. (12.8) with = , we see that [, +
] S = . For sake of con-

tradiction suppose that S is not empty. In this case let c := inf S [ +
, ] .
Then for all t [, c) we have
|f(t) f()| (M +) (t ) . (12.9)
By continuity of f and the norm, the previous estimate remains true for t = c
as well. However, using Eq. (12.8) with = c and Eq. (12.9) at t = c, we nd
for 0 h
c
that
|f(c +h) f()| |f(c +h) f (c)| +|f (c) f()|
(M +) h + (M +) (c ) = (M +) (c +h ) .
This however then implies that c = inf S > c which is a contradiction and we
conclude that S = .
The previous argument has shown
|f(t) f()| (M +) (t ) for all t b.
We may now let 0 in order to conclude that
|f(t) f()| M(t ) for all a < t b.
By continuity this inequality also holds for = a. Finally if M = 0 we have
shown |f(t) f(a)| = 0 for all a t b, i.e. f (t) = f (a) for all t [a, b] .
Alternate Proof.
2
To simplify notation let us now write f ( [u, v]) :=
f (v) f (u) and [[u, v][ := v u for all a u < v b. Suppose for sake
of contradiction that there exits a M ( ) = M [[, ][ .
By continuity of f we may move and if necessary in order to further assume
a < < < b. We now choose M
t
> M so that
|f ( [, ])| > M
t
[[, ][ > M [[, ][ .
Let J
0
= [, ] and t =
1
2
( +) be the mid-point of J
0
. Since
M
t
( t) +M
t
(t ) = M
t
( )
< |f () f ()| |f () f (t)| +|f (t) f ()|
we must have either
M
t
[[t, ][ < |f ( [t, ])| or M
t
[[, t][ < |f ( [, t])| .
If the rst case holds we let J
1
= [, t] and if only the second case holds we
take J
1
= [t, ] . We may now work similarly and split J
1
in half an choose
one one of the two sub-intervals in J
2
so that |f (J
2
)| > M
t
[J
2
[ . Continuing
this way inductively we construct, J
i
J
0
so that J
0
J
1
J
2
. . . and
|f (J
i
)| > M
t
[J
i
[ for all i and diam(J
i
) = 2
i
( ) . As the sets J
i
are
closed it follows that
i
J
i
= for some point J
0
.
Let := M
t
M and use the denition of derivative of f at to nd a
> 0 such that
_
_
_f(t) f()

f()(t )
_
_
_ [t [ if [t [ .
From this estimate we learn that
|f (t) f ()|
__
_
_

f ()
_
_
_ +
_
[t [ = M
t
[t [ .
2
This proof is reminiscent of Moreras theorem for triangles from complex variables.
12.1 The Fundamental Theorem of Calculus 139
Choosing n large enough so that J
n
= [
n
,
n
] [ , +] , we nd,
M
t
[J
n
[ < |f (J
n
)| |f (
n
) f ()| +|f () f (
n
)|
M
t
(
n
+
n
) = M
t
[J
n
[
which leads to the contradiction that M
t
[J
n
[ < M
t
[J
n
[ .
Denition 12.13. Given a function f C([a, b], X) we say that F
C([a, b], X) is an anti-derivative of f if

F (t) = f (t) for all t (a, b) .
Lemma 12.14. For f C([a, b], X), the function
3
G(t) :=
_
t
a
f() d is Lips-
chitz continuous in t [a, b] and in fact,
|G(t) G(s)|
X

_
t
s
|f ()| d |f|
[t s[ for all s, t [a, b] .

Proof. Suppose that a s t b. Then
G(t) G(s) =
_
t
a
f() d
_
s
a
f() d =
_
t
s
f () d
and therefore,
|G(t) G(s)|
X

_
t
s
|f ()| d |f|
[t s[ .
The following elementary but important lemma and its relatives will fre-
quently be used throughout the rest of these notes.
Lemma 12.15. For all x X and a, b 1 we have,
_
b
a
xd = x (b a) . (12.10)
Proof. Proof if a b then Eq. (12.10) holds by denition of the integral on
simple functions. If b < a, then
_
b
a
xd =
_
a
b
xd = x (a b) = x (b a) .
Theorem 12.16 (Fundamental Theorem of Calculus). Suppose that f
C([a, b], X), Then
3
Note that G(a) =
_
(a,a]
fdm = 0.
1.
d
dt
_
t
a
f() d = f(t) for all t (a, b), i.e. G(t) :=
_
t
a
f() d is an anti-
derivative of f.
2. If F C([a, b], X) is an anti-derivative of f, then
_
b
a
f(t) dt = F(b) F(a).
Proof. 1. By Lemma 12.14 the function G(t) :=
_
t
a
f () d is continuous
for t [a, b] . Now suppose that t (a, b) and h 1 0 is small so that
t +h (a, b) . Let
J
h
:=
_
(t, t +h] if h > 0
(t +h, t] if h < 0.
Applying Lemma 12.15 with x = f (t) , a = t and b = t +h implies,
G(t +h) G(t) f (t) h =
_
t+h
a
f()d
_
t
a
f()d f(t)h
=
_
t+h
t
f()d f(t)h
=
_
t+h
t
[f() f (t)] d.
Taking norms of this equation along with Lemma 12.8 gives,
|G(t +h) G(t) f (t) h|
X
=
_
_
_
_
_
_
t+h
t
[f() f (t)] d
_
_
_
_
_
X
_
Jh
|f() f (t)|
X
d (h) [h[
where
(h) := max
Jh
|(f() f(t))|
X
max
[t]h],t+]h]]
|(f() f(t))|
X
.
The continuity of f at t implies lim
h0
(h) = 0 and so by the denition of the
derivative we have shown,

G(t) = f (t) exists for all t (a, b) .
2. For the second item, let h(t) := G(t)F (t) . Then h C ([a, b] , X) with
h(t) =

G(t)

F (t) = f (t) f (t) = 0 for all t (a, b) .
Hence by the mean value inequality in any one its guises (see Theorem 10.19 or
Proposition 12.12) we know that h(t) is constant. In particular we may conclude
that
G(b) F (b) = h(b) = h(a) = G(a) F (a) = F (a) ,
i.e.
_
b
a
f () d = G(b) = F (b) F (a) .
Example 12.17. For all a, b 1 and z C 0 ,
_
b
a
e
z
d =
1
z
e
z
[
=b
=a
=
1
z
_
e
zb
e
za
. (12.11)
and in particular,
_
b
0
e
z
d =
1
z
_
e
zb
1
.
Let us now write z = x +iy so that
e
z
= e
x
e
iy
= e
x
cos y +ie
x
sin y.
Then according to Exercise 12.6 and Eq. (12.11) we may conclude
_
b
0
e
z
d =
_
b
a
e
x
cos yd +i
_
b
a
e
x
sin yd.
On the other hand,
1
z
_
e
zb
1
=
1
x
2
+y
2
(x iy)
_
e
xb
cos yb +ie
xb
sin yb 1
_
.
So comparing real and imaginary parts of these two equations implies,
_
b
0
e
x
cos yd =
1
x
2
+y
2
_
e
xb
(xcos yb +y sin yb) x
and
_
b
0
e
x
sin yd =
1
x
2
+y
2
_
e
xb
(y cos yb +xsin yb) +y
.
Exercise 12.9 (Integrating Power Series). Suppose that f (t) :=
n=0
c
n
t
n
has radius of convergence R > 0. Show for any R < a < b < R
that
_
b
a
n=0
c
n
t
n
dt =
n=0
c
n
_
b
a
t
n
dt =
n=0
c
n
_
b
n+1
a
n+1
n + 1
_
Hint: use Exercise 12.4 above.
Example 12.18. By the fundamental theorem of calculus along with Exercise
12.9 we nd for [x[ < 1 that
ln (1 +x) =
_
x
0
1
1 +t
dt =
_
x
0
_

n=0
(1)
n
t
n
_
dt
=
n=0
(1)
n
_
x
0
t
n
dt =
n=0
(1)
n
x
n+1
n + 1
and
tan
1
(x) =
_
x
0
1
1 +t
2
dt =
_
x
0
_

n=0
(1)
n
t
2n
_
dt
=
n=0
(1)
n
_
x
0
t
2n
dt =
n=0
(1)
n
x
2n+1
2n + 1
.
Exercise 12.10. Suppose : (a , b +) 1 is a non-decreasing C
1
func-
tion. Show
_
b
a
fd =
_
b
a
f (t) (t) dt for all f

S (/, Y ) .
Hint: Let
be the unique measure on / such that
((s, t]) = (t) (s)

for all a s t b and rst show
(A) =
_
A
() d for all A /. Now
apply Exercise 12.5.
Remark 12.19. Suppose f : [a, b] Y is continuous in Exercise 12.10 and let
= a = t
0
< t
1
< < t
n
= b be a partition of [a, b] . By the mean value
theorem, there exists c
i
(t
i1
, t
i
) such that (t
i
)(t
i1
) = (c
i
) (t
i
t
i1
) .
Therefore,
n
i=1
f (c
i
) [(t
i
) (t
i1
)] =
n
i=1
f (c
i
) (c
i
) (t
i
t
i1
) . (12.12)
We may now use Theorem 12.5 twice to conclude that
lim
]]0
n
i=1
f (c
i
) [(t
i
) (t
i1
)] =
_
b
a
fd and
lim
]]0
n
i=1
f (c
i
) (c
i
) (t
i
t
i1
) =
_
b
a
f (t) (t) dt
which combined with Eq. (12.12) shows
_
b
a
fd =
_
b
a
f (t) (t) dt.
12.1 The Fundamental Theorem of Calculus 141
Remark 12.20. Suppose that : [a, b] 1 is an non-decreasing function such
that there a nite subset, (a, b) , such that is C
1
on (a, b) with
[
(a,b)\
extending to a continuous function on [a, b] . Then combining the ideas
in Remark 12.19 along with those in Corollary 12.11, for every f : [a, b] 1
which is continuous we have
_
b
a
fd =
_
b
a
f (t) (t) dt +
t
f (t) [(t+) (t)] , (12.13)
where (t+) = lim
t
() and (t) = lim
t
() . However, it should be
noted that there exists continuous strictly increasing functions
4
, : [a, b] 1
such that (t) = 0 almost everywhere! For such a function, , Eq. (12.13) is
not longer valid!
Corollary 12.21 (Another Mean Value Inequality). Suppose that f :
[a, b] X is a continuous function such that

f(t) exists for t (a, b) and
f extends to a continuous function on [a, b]. Then

|f(b) f(a)|
_
b
a
|

f(t)|dt (b a)
_
_
_

f
_
_
_
. (12.14)
Proof. By the fundamental theorem of calculus, f(b) f(a) =
_
b
a
f(t)dt
and then by Lemma 12.9,
|f(b) f(a)| =
_
_
_
_
_
_
b
a
f(t)dt
_
_
_
_
_
_
b
a
|

f(t)|dt
_
b
a
_
_
_

f
_
_
_
dt = (b a)
_
_
_

f
_
_
_
.
Corollary 12.22 (Change of Variable Formula). Suppose that f
C([a, b], X) and T : [c, d] (a, b) is a continuous function such that T (s)
is continuously dierentiable for s (c, d) and T
t
(s) extends to a continuous
function on [c, d]. Then
_
d
c
f (T (s)) T
t
(s) ds =
_
T(d)
T(c)
f (t) dt.
Proof. For t (a, b) dene F (t) :=
_
t
T(c)
f () d. Then F C
1
((a, b) , X)
and by the fundamental theorem of calculus and the chain rule,
4
Look up Cantor functions.
d
ds
F (T (s)) = F
t
(T (s)) T
t
(s) = f (T (s)) T
t
(s) .
Integrating this equation on s [c, d] and using the chain rule again gives
_
d
c
f (T (s)) T
t
(s) ds = F (T (d)) F (T (c)) =
_
T(d)
T(c)
f (t) dt.
Corollary 12.23 (Integration by Parts). Suppose that g : [a, b] X and
f : [a, b] F are continuous functions such that

f(t) and g (t) exists for t
(a, b) and

f and g extend to continuous function on [a, b]. Then
_
b
a
f (t) g (t) dt +
_
b
a
f (t) g (t) dt = f (t) g (t) [

b
a
.
Proof. By the product rule,
d
dt
(f (t) g (t)) = f (t) g (t) +

f (t) g (t) for all a t b.
Integrating this equation, making use of the fundamental theorem of calculus,
gives the desired result.
Exercise 12.11. Show, using Calculus, that for any x, > 0,
u
x
e
u
_
x
_
x
e
x
for all u 0.
In other words,
u
x
__
x
_
x
e
x
_
e
u
for all u 0.
Exercise 12.12 (The Gamma Function). Let 0 < a < b < and set
a,b
(x) :=
_
b
a
u
x1
e
u
du
and let
(x) := lim
b
lim
a0
a,b
(x)
when these limits exists. [This is an example of an improper integral and as
usual we will typically write, (x) =
_
0
u
x1
e
u
du.]
1. Show
a,b
(x + 1) = x
a,b
(x) u
x
e
u
[
b
a
. [Hint: integrate by parts.]
2. Show for x 0 that
lim
a0
a,b
(x + 1) =
_
b
0
u
x
e
u
du =:
b
(x) ,
and
lim
b
b
(x + 1) =: (x + 1) exists.
Hint: for the limit as b notice that b
b
(x + 1) is an increasing
function and hence lim
b
b
(x + 1) = sup
b<
b
(x + 1) exist in [0, ] .
Your goal is to show that the limit is nite and for this you may nd
Exercise 12.11 useful. This exercise implies, for example, that there exists
C = C (x) < such that u
x
e
u
Ce
u/2
for u 0.
3. Making use items 1. and 2. show for all x > 0 that (x) =
lim
b
lim
a0
a,b
(x) exists and satises (x + 1) = x (x) .
4. Show (1) = 1 and use this and item 3. to conclude that (n + 1) = n!
for all n N
0
.
[The function, : (0, ) 1
+
is called the Gamma function.]
Remark 12.24. Later we will show (1/2) =

from which it then follows that
_
n +
1
2
_
=
(2n 1)!!
2
n
where (1)!! := 1 and (2n 1)!! = (2n 1) (2n 3) . . . 3 1 for n N.

12.2 Taylors Theorem revisited
Proposition 12.25 (Taylors Theorem I). Suppose that h : (, 1 +) Y
is a C
N
dierentiable function. Then
h(1) =
N1
l=0
h
(l)
(0)
l!
+
_
1
0
(1 t)
N1
(N 1)!
h
(N)
(t)dt. (12.15)
Exercise 12.13. Prove Proposition 12.25. That is show by induction on N that
if h : (, 1 +) Y is a C
N
dierentiable function, then
h(1) =
N1
l=0
h
(l)
(0)
l!
+
_
1
0
(1 t)
N1
(N 1)!
h
(N)
(t)dt. (12.16)
Hints: the case N = 1 is the fundamental theorem of calculus. The induction
step uses integration by parts and the observation that
(1 t)
N1
(N 1)!
=
d
dt
(1 t)
N
N!
.
Theorem 12.26 (Taylors Theorem with Integral Remainder). Suppose
that f : (a, b) Y is C
N
dierentiable and x
0
(a, b) . Then for all x (a, b) ,
f (x) =
N1
l=0
f
(k)
(x
0
)
l!
(x x
0
)
k
+
(x x
0
)
N
N!
R
N
(x)
where R
N
(x) = R
N
(f, x
0
; x) is given by
R
N
(x) =
_
1
0
f
(N)
(x
0
+t (x x
0
)) N(1 t)
N1
dt
=
_
1
0
f
(N)
((1 t) x
0
+tx) N(1 t)
N1
dt.
[Observe that
_
1
0
N(1 t)
N1
dt = (1 t)
N
[
1
0
= 1.]
Proof. We apply Proposition 12.25 with h(t) := f (x
0
+t (x x
0
)) using,
h
(k)
(t) = f
(k)
(x
0
+t (x x
0
)) (x x
0
)
k
. Therefore,
f (x) =h(1) =
N1
l=0
h
(l)
(0)
l!
+
_
1
0
(1 t)
N1
(N 1)!
h
(N)
(t)dt
=
N1
l=0
f
(k)
(x
0
)
l!
(x x
0
)
k
+
(x x
0
)
N
N!
_
1
0
f
(N)
(x
0
+t (x x
0
)) N(1 t)
N1
dt.
Example 12.27. Let X = (1, 1) 1, 1 and f(x) = (1 x)
. The reader
should verify
f
(m)
(x) = (1)
m
( 1) . . . ( m+ 1)(1 x)
m
and therefore by Taylors theorem (Eq. (12.15) with x = 0 and y = x)
(1 x)
= 1 +
N1
m=1
1
m!
(1)
m
( 1) . . . ( m+ 1)x
m
+R
N
(x) (12.17)
where
R
N
(x) =
x
N
N!
_
1
0
(1)
N
( 1) . . . ( N + 1)(1 sx)
N
d
N
(s)
=
x
N
N!
(1)
N
( 1) . . . ( N + 1)
_
1
0
N(1 s)
N1
(1 sx)
N
ds.
12.3 Interchanging Limits 143
Now for x (1, 1) and N > ,
0
_
1
0
N(1 s)
N1
(1 sx)
N
ds
_
1
0
N(1 s)
N1
(1 s)
N
ds =
_
1
0
N(1 s)
1
ds =
N
and therefore,
[R
N
(x)[
[x[
N
(N 1)!
[( 1) . . . ( N + 1)[ =:
N
.
Since
lim sup
N
N+1
N
= [x[ lim sup
N
N
N
= [x[ < 1
and so by the Ratio test, [R
N
(x)[
N
0 (exponentially fast) as N .
Therefore by passing to the limit in Eq. (12.17) we have proved
(1 x)
= 1 +
m=1
(1)
m
m!
( 1) . . . ( m+ 1)x
m
(12.18)
which is valid for [x[ < 1 and 1. An important special cases is = 1
in which case, Eq. (12.18) becomes
1
1x
=
m=0
x
m
, the standard geometric
series formula. Another another useful special case is = 1/2 in which case Eq.
(12.18) becomes
1 x = 1 +
m=1
(1)
m
m!
1
2
(
1
2
1) . . . (
1
2
m+ 1)x
m
= 1
m=1
(2m3)!!
2
m
m!
x
m
for all [x[ < 1. (12.19)
Theorem 12.28 (Intermediate Value Theorem for Averages). Suppose
: [0, 1] 1
+
is continuous, (t) > 0 on (0, 1) , and
_
1
0
() d = 1. Then for
every f C ([0, 1] , 1) , show there exists c (0, 1) such that
_
1
0
() f () d = f (c) .
Proof. Let m := min
0t1
f (t) and M := max
0t1
f (t) . Then m
f (t) M for all t and therefore,
m =
_
1
0
() md L =
_
1
0
() f () d
_
1
0
() Md M.
It now follows by the intermediate value theorem that there exists a c [0, 1]
such that f (c) = L =
_
1
0
() f () d. The only thing left to do is to show that
we may choose c (0, 1) . If this is not the case we will have either f (t) > L for
all 0 < t < 1 or f (t) < L for all 0 < t < 1 again use the intermediate value
theorem. But in the former case we will have min1
3
t
2
3
f (t) = L + for some
> 0 and hence
L =
_
1
0
f (t) (t) dt =
_
1/3
0
f (t) (t) dt +
_
1
2/3
f (t) (t) dt +
_
2/3
1/3
f (t) (t) dt
_
1/3
0
L (t) dt +
_
1
2/3
L (t) dt +
_
2/3
1/3
(t) (L +) dt
= L +
1
3

_
2/3
1/3
(t) dt > L
which is a contradiction. A similar argument works if f (t) < L for all 0 < t < 1.
Corollary 12.29 (Real valued Taylors Theorem). If f : (a, b) 1 is C
N
dierentiable and x
0
, x (a, b) , then there exist c between x
0
and x such that
f (x) =
N1
l=0
f
(k)
(x
0
)
l!
(x x
0
)
k
+
f
(N)
(c)
N!
(x x
0
)
N
.
12.3 Interchanging Limits
Theorem 12.30 (Interchanging derivates with limits). Suppose that
< a < b < and f
n
: (a, b) Y is a sequence of continuously dif-
ferentiable functions such that;
1. y := lim
n
f
n
(c) exists in Y for some c (a, b) ,
2. g (t) := lim
n

f
n
(t) exits uniformly in t, i.e. g : (a, b) Y such that
lim
n
sup
t(a,b)
|g (t) f
n
(t)|
Y
= 0.
Then f (t) := lim
n
f
n
(t) exists for all t (a, b) , f
n
(t) f (t) uniformly
in t, and f is continuously dierentiable with

f (t) = g (t) = lim
n

f
n
(t) .
Proof. By the fundamental theorem of calculus, for all t (a, b) and n N,
f
n
(t) = f
n
(c) +
_
t
c
f
n
() d.
Using the continuity of the integral under uniform limits, we may let n
in the previous equation to learn, f (t) := lim
n
f
n
(t) exists and
f (t) = y +
_
t
c
g () d.
Since g is the uniform limit of continuous functions we know that g is continuos
as well. Therefore by the fundamental theorem of calculus,

f (t) exits and is
equal to g (t) for all t (a, b) . Since
|f (t) f
n
(t)| =
_
_
_
_
y f
n
(c) +
_
t
c
_
g ()

f
n
()
_
d
_
_
_
_
Y
|y f
n
(c)|
Y
+
_
t
c
_
_
_g ()

f
n
()
_
_
_ d
|y f
n
(c)|
Y
+
_
b
a
_
_
_g ()

f
n
()
_
_
_ d
|y f
n
(c)|
Y
+
_
_
_g

f
n
_
_
_
(b a) ,
we learn that
|f f
n
|
|y f
n
(c)|
Y
+
_
_
_g

f
n
_
_
_
(b a) 0 as n .
Theorem 12.31 (Interchanging limits with integrals). Suppose that
< a < b < , is a nitely additive measure on /
(a,b]
, and (K, ) be
a compact metric space for example a compact subset of 1
m
for some m. If
f : K [a, b] Y is a continuous function, then F : K Y dened by
F (x) :=
_
b
a
f (x, t) d(t)
is continuous on K.
Exercise 12.14. (Prove Theorem 12.31.) Suppose that < a < b < ,
is a nitely additive measure on /
(a,b]
, and (K, ) be a compact metric space
for example a compact subset of 1
m
for some m. If f : K [a, b] Y is a
continuous function, then F : K Y dened by
F (x) :=
_
b
a
f (x, t) d(t)
is continuous on K. Hint: rst point out that f is uniformly continuous on
K[a, b] and then make use of this fact to show lim
xx0
|F (x) F (x
0
)|
Y
= 0
for all x
0
K.
Theorem 12.32 (Interchanging derivates with integrals). Suppose is
a nitely additive measure on /
(a,b]
and f : (u, v) [a, b] Y is a continuous
function such that f
s
(s, t) :=

s
f (s, t) exists for s (u, v) and f
s
: (u, v)
[a, b] Y is a continuous. Then
d
ds
_
b
a
f (s, t) d(t) =
_
b
a
f
s
(s, t) d(t) for all s (u, v) .
Proof. Let s (u, v) and choose > 0 suciently small so that s+h (u, v)
when [h[ . Let F (s) :=
_
b
a
f (s, t) d(t) . Then for 0 < [h[
F (s +h) F (s)
h
=
_
b
a
f (s +h, t) f (s, t)
h
d(t)
=
_
b
a
g (h, t) d(t)
where
g (h, t) =
_
f(s+h,t)f(s,t)
h
if 0 < [h[
f
s
(s, t) if h = 0
.
The function g (h, t) may also be written as,
g (h, t) =
_
1
0
f
s
(s +rh, t) dr
as this formula agrees trivially for h = 0 and for h ,= 0 by the fundamen-
tal theorem of calculus. The function u(h, t, r) := f
s
(s +rh, t) is continu-
ous for on [, ] [a, b] [0, 1] and so by Theorem 12.31 it follows that
g (h, t) =
_
1
0
u(h, t, r) dr is continuous in (h, t) . Therefore by another appli-
cation of Theorem 12.31,
lim
h0
F (s +h) F (s)
h
= lim
h0
_
b
a
g (h, t) d(t)
=
_
b
a
g (0, t) d(t) =
_
b
a
f
s
(s, t) d(t) .
Example 12.33. Let 0 < b < and n N, then by repeated use of Theorem
12.32,
_
b
0
e
sx
x
n
dx =
_
b
0
_
d
ds
_
n
e
sx
dx
=
_
d
ds
_
n
_
b
0
e
sx
dx =
_
d
ds
_
n
1
s
_
1 e
sb
_
.
12.4 Stirlings Formula 145
A simple induction argument shows,
_
d
ds
_
n
1
s
_
1 e
sb
_
= n!s
(n+1)
+e
bs
p
n
_
s
1
_
for some polynomial function p
n
. Therefore we may conclude that
_

0
e
sx
x
n
dx := lim
b
_
b
0
e
sx
x
n
dx =
n!
s
n+1
and taking s = 1 we nd,
(n + 1) =
_

0
e
x
x
n
dx = n!
as you already showed in Exercise 12.12.
Exercise 12.15. Explain why the following identities hold for t ,= 0;
_

xsin (tx) dx =
d
dt
_

cos (tx) dx = 2
d
dt
sin t
t
(12.20)
and
_

x
2
cos (tx) dx =
d
dt
_

xsin (tx) dx = 2
d
2
dt
2
sin t
t
. (12.21)
Now use these formulas to show
_

xsin (nx) dx = 2
(1)
n+1
n
and
_

x
2
cos (nx) dx = 4
(1)
n
n
2
for all n Z 0.
12.4 Stirlings Formula
[Warning: methods in this section are beyond what we have covered so far.]
On occasion one is faced with estimating an integral of the form,
_
J
e
G(t)
dt,
where J = (a, b) 1 and G(t) is a C
1
function with a unique (for simplicity)
global minimum at some point t
0
J. The idea is that the majority contribu-
tion of the integral will often come from some neighborhood, (t
0
, t
0
+) ,
of t
0
. Moreover, it may happen that G(t) can be well approximated on this
neighborhood by its Taylor expansion to order 2;
G(t)

= G(t
0
) +
1
2
G(t
0
) (t t
0
)
2
.
Notice that the linear term is zero since t
0
is a minimum and therefore

G(t
0
) =
0. We will further assume that

G(t
0
) ,= 0 and hence

G(t
0
) > 0. Under these
hypothesis we will have,
_
J
e
G(t)
dt

= e
G(t0)
_
]tt0]<
exp
_
1
2
G(t
0
) (t t
0
)
2
_
dt.
Making the change of variables, s =
_
G(t
0
) (t t
0
) , in the above integral then
gives,
_
J
e
G(t)
dt

=
1
_
G(t
0
)
e
G(t0)
_
]s]<
G(t0)
e
1
2
s
2
ds
=
1
_
G(t
0
)
e
G(t0)
_
2
_

G(t0)
e
1
2
s
2
ds
_
=
1
_
G(t
0
)
e
G(t0)
_
_
2 O
_
_
1
_
G(t
0
)
e
1
2
G(t0)
2
_
_
_
_
.
If is suciently large, for example if
_
G(t
0
) = 3, then the error term is
about 0.0037 and we should be able to conclude that
_
J
e
G(t)
dt

=
G(t
0
)
e
G(t0)
. (12.22)
The proof of the next theorem (Stirlings formula for the Gamma function) will
illustrate these ideas and what one has to do to carry them out rigorously.
Theorem 12.34 (Stirlings formula). The Gamma function (see Exercise
12.12), satises Stirlings formula,
lim
x
(x + 1)
2e
x
x
x+1/2
= 1. (12.23)
In particular, if n N, we have
n! = (n + 1)
2e
n
n
n+1/2
where we write a
n
b
n
to mean, lim
n
an
bn
= 1. (See Example 12.37 below
for a slightly cruder but more elementary estimate of n!)
Proof. (The following proof is an elaboration of the proof found on page
236-237 in Krantzs Real Analysis and Foundations.) We begin with the formula
for (x + 1) ;
(x + 1) =
_

0
e
t
t
x
dt =
_

0
e
Gx(t)
dt, (12.24)
where
G
x
(t) := t xln t.
Then

G
x
(t) = 1 x/t,

G
x
(t) = x/t
2
, G
x
has a global minimum (since

G
x
> 0)
at t
0
= x where
G
x
(x) = x xln x and

G
x
(x) = 1/x.
So if Eq. (12.22) is valid in this case we should expect,
(x + 1)

=
2xe
(xx ln x)
=
2e
x
x
x+1/2
which would give Stirlings formula. The rest of the proof will be spent on
rigorously justifying the approximations involved.
Let us begin by making the change of variables s =
_
G(t
0
) (t t
0
) =
1
x
(t x) as suggested above. Then
G
x
(t) G
x
(x) = (t x) xln (t/x) =
xs xln
_
x +
xs
x
_
= x
_
s
x
ln
_
1 +
s
x
__
= s
2
q
_
s
x
_
where
q (u) :=
1
u
2
[u ln (1 +u)] for u > 1 with q (0) :=
1
2
.
Setting q (0) = 1/2 makes q a continuous and in fact smooth function on
(1, ) , see Figure 12.1. Using the power series expansion for ln (1 +u) we
nd,
q (u) =
1
2
+
k=3
(u)
k2
k
for [u[ < 1. (12.25)
Making the change of variables, t = x +
xs in the second integral in Eq.

(12.24) yields,
(x + 1) = e
(xx ln x)
x
_

x
e
q
_
s
x
_
s
2
ds = x
x+1/2
e
x
I (x) ,
where
I (x) =
_

x
e
q
_
s
x
_
s
2
ds =
_

1
s
x
e
q
_
s
x
_
s
2
ds. (12.26)
Fig. 12.1. Plot of q (u) .
From Eq. (12.25) it follows that lim
u0
q (u) = 1/2 and therefore,
_

lim
x
_
1
s
x
e
q
_
s
x
_
s
2
_
ds =
_

1
2
s
2
ds =
2. (12.27)
So if there exists a dominating function, F L
1
(1, m) , such that
1
s
x
e
q
_
s
x
_
s
2
F (s) for all s 1 and x 1,
we can apply the DCT to learn that lim
x
I (x) =
2 which will complete

the proof of Stirlings formula.
We now construct the desired function F. From Eq. (12.25) it follows that
q (u) 1/2 for 1 0 for u ,= 0 (u ln (1 +u) is
convex and has a minimum of 0 at u = 0) we may conclude that q (u) > 0 for
all u > 1 therefore by compactness (on [0, M]), min
1<uM
q (u) = (M) > 0
for all M (0, ) , see Remark 12.35 for more explicit estimates. Lastly, since
1
u
ln (1 +u) 0 as u , there exists M < (M = 3 would due) such that
1
u
ln (1 +u)
1
2
for u M and hence,
q (u) =
1
u
_
1
1
u
ln (1 +u)
_
1
2u
for u M.
So there exists > 0 and M < such that (for all x 1),
1
s
x
e
q
_
s
x
_
s
2
1
x<sM
e
s
2
+ 1
sM
e
xs/2
1
x<sM
e
s
2
+ 1
sM
e
s/2
e
s
2
+e
]s]/2
=: F (s) L
1
(1, ds) .
We will sometimes use the following variant of Eq. (12.23);
lim
x
(x)
_
2
x
_
x
e
_
x
= 1 (12.28)
To prove this let x go to x 1 in Eq. (12.23) in order to nd,
1 = lim
x
(x)
2e
x
e (x 1)
x1/2
= lim
x
(x)
_
2
x
_
x
e
_
x
_
x
x1
e
_
1
1
x
_
x
which gives Eq. (12.28) since
lim
x
_
x
x 1
e
_
1
1
x
_
x
= 1.
Remark 12.35 (Estimating q (u) by Taylors Theorem). Another way to estimate
q (u) is to use Taylors theorem with integral remainder. In general if h is C
2
function on [0, 1] , then by the fundamental theorem of calculus and integration

by parts,
h(1) h(0) =
_
1
0
h(t) dt =
_
1
0
h(t) d (1 t)
=
h(t) (1 t) [
1
0
+
_
1
0
h(t) (1 t) dt
=

h(0) +
1
2
_
1
0
h(t) d (t) (12.29)

where d (t) := 2 (1 t) dt which is a probability measure on [0, 1] . Applying
this to h(t) = F (a +t (b a)) for a C
2
function on an interval of points
between a and b in 1 then implies,
F (b) F (a) = (b a)

F (a) +
1
2
(b a)
2
_
1
0
F (a +t (b a)) d (t) . (12.30)

(Similar formulas hold to any order.) Applying this result with F (x) = x
ln (1 +x) , a = 0, and b = u (1, ) gives,
u ln (1 +u) =
1
2
u
2
_
1
0
1
(1 +tu)
2
d (t) ,
i.e.
q (u) =
1
2
_
1
0
1
(1 +tu)
2
d (t) .
From this expression for q (u) it now easily follows that
q (u)
1
2
_
1
0
1
(1 + 0)
2
d (t) =
1
2
if 1 < u 0
and
q (u)
1
2
_
1
0
1
(1 +u)
2
d (t) =
1
2 (1 +u)
2
.
So an explicit formula for (M) is (M) = (1 +M)
2
/2.
12.4.1 A primitive Stirling type approximation
Theorem 12.36. Suppose that f : (0, ) 1 is an increasing concave down
function (like f (x) = ln x) and let s
n
:=
n
k=1
f (k) , then
s
n

1
2
(f (n) +f (1))
_
n
1
f (x) dx
s
n

1
2
[f (n + 1) + 2f (1)] +
1
2
f (2)
s
n

1
2
[f (n) + 2f (1)] +
1
2
f (2) .
Proof. On the interval, [k 1, k] , we have that f (x) is larger than the
straight line segment joining (k 1, f (k 1)) and (k, f (k)) and thus
1
2
(f (k) +f (k 1))
_
k
k1
f (x) dx.
Summing this equation on k = 2, . . . , n shows,
s
n

1
2
(f (n) +f (1)) =
n
k=2
1
2
(f (k) +f (k 1))
k=2
_
k
k1
f (x) dx =
_
n
1
f (x) dx.
For the upper bound on the integral we observe that f (x) f (k)f
t
(k) (x k)
for all x and therefore,
_
k
k1
f (x) dx
_
k
k1
[f (k) f
t
(k) (x k)] dx = f (k)
1
2
f
t
(k) .
Summing this equation on k = 2, . . . , n then implies,
_
n
1
f (x) dx
n
k=2
f (k)
1
2
n
k=2
f
t
(k) .
Since f
tt
(x) 0, f
t
(x) is decreasing and therefore f
t
(x) f
t
(k 1) for x
[k 1, k] and integrating this equation over [k 1, k] gives
f (k) f (k 1) f
t
(k 1) .
Summing the result on k = 3, . . . , n + 1 then shows,
f (n + 1) f (2)
n
k=2
f
t
(k)
and thus ti follows that
_
n
1
f (x) dx
n
k=2
f (k)
1
2
(f (n + 1) f (2))
= s
n

1
2
[f (n + 1) + 2f (1)] +
1
2
f (2)
s
n

1
2
[f (n) + 2f (1)] +
1
2
f (2)
Example 12.37 (Approximating n!). Let us take f (n) = ln n and recall that
_
n
1
ln xdx = nln n n + 1.
Thus we may conclude that
s
n

1
2
ln n nln n n + 1 s
n

1
2
ln n +
1
2
ln 2.
Thus it follows that
_
n +
1
2
_
ln n n + 1 ln
2 s
n

_
n +
1
2
_
ln n n + 1.
Exponentiating this identity then implies,
e
2
e
n
n
n+1/2
n! e e
n
n
n+1/2
which compares well with Strirlings formula (Theorem 12.34) which states,
n!
2e
n
n
n+1/2
.
Observe that
e
= 1. 922 1
2

= 2. 506 e

= 2.718 3.
12.4.2 Bone Yard
Exercise 12.16. If f (x) = ln (1 +x) we know that
f
(n)
(x) = (1)
n+1
(n 1)! (1 +x)
n
.
Therefore the integral remainder term in Taylors theorem is given by
R
N
(x) =
1
N!
_
x
0
f
(N+1)
(y) (x y)
N
dy
=
1
N!
(1)
N
N!
_
x
0
(1 +x)
(N+1)
(x y)
N
dy
= (1)
N
_
x
0
_
x y
1 +y
_
N
1
1 +y
dy.
and hence
[R
N
(x)[
_
x
0
x y
1 +y
N
1
1 +y
dy.
So for 1 < x < 0 we have
_
x y
1 +y
_
N
=
_
[x[ [y[
1 [y[
_
N
and therefore,
[R
N
(x)[
_
x
0
_
[x[ [y[
1 [y[
_
N
1
1 +y
dy
0 as N
_
x
0
_
[x[ [y[
1 [y[
_
N
1
1 +y
dy
[x[
N
_
x
0
1
1 +y
dy
0 as N
wherein we have used, as f (t) :=
]x]t
1t
satises
f (t) =
(1 t) + ([x[ t)
(1 t)
2
=
[x[ 1
(1 t)
2
< 0
so f (t) on [0, [x[] has its maximum at t = 0 and so
[x[ [y[
1 [y[

[x[
1
= [x[
Exercise 12.17. For 1 < x 0, the best we can do with mean value theorem
type remainder is
[R
N
(x)[ =
1
N + 1
x
1 +c
N
N+1
1
N + 1
x
1 x
N+1
.
Theorem 12.38 (Taylors at order 1 and 2). If f (t) for 0 t 1 is twice
continuously dierentiable, then
f (1) = f (0) +
_
1
0
f (t) dt and
f (1) = f (0) +
f (0)
1!
+
1
1!
_
1
0
f (t) (1 t) dt.
Proof. The rst assertion follows by the fundamental theorem of calculus
f (1) f (0) =
_
1
0
f (t) dt.
For the second we integrate by parts as follows;
_
1
0
f (t) dt =
_
1
0
f (t) d (1 t) =

f (t) (1 t) [
1
0
+
_
1
0
f (t) (1 t) dt
and therefore
f (1) = f (0) +
_
1
0
f (t) dt = f (0) +
f (0)
1!
+
1
1!
_
1
0
f (t) (1 t) dt.
_
1
0
f (t) dt =
_
1
0
f (t) d (t 1) =

f (t) (t 1) [
t=1
t=0

_
t
0
f (t) (t 1) dt
=

f (0) +
_
t
0
f (t) (1 t) dt
=

f (0)
_
t
0
f (t) d
(1 t)
2
2
=

f (0)

f (t)
(1 t)
2
2
[
1
0
+
_
t
0
f
(3)
(t)
(1 t)
2
2
dt
=

f (0) +
f (t)
2!
+
_
t
0
f
(3)
(t)
(1 t)
2
2!
dt.
Exercise 12.18. If f : 1 C is a function which is dierentiable to all orders,
show for all N N
0
= 0, 1, 2, 3, . . . that
f (1) =
N
k=0
1
k!
f
(k)
(0) +
1
N!
_
1
0
f
(N+1)
(t) (1 t)
N
dt.
Hint: use integration by parts and induction with Theorem 12.38 providing
the case N = 0 and N = 1.
Exercise 12.19. If f : 1 1 is a function which is dierentiable to all orders,
show there exists c (0, 1) such that
1
N!
_
1
0
f
(N+1)
(t) (1 t)
N
dt =
f
(N+1)
(c)
(N + 1)!
.
Hint: apply Exercise ?? with (t) = (N + 1) (1 t)
N
.
Exercise 12.20. Recall the if we dene e
z
:= e
x
(cos y +i sin y) where z =
x +iy, then
d
dt
e
tz
= ze
tz
. Use Exercise 12.18 with f (t) = e
tz
to conclude;
e
z
= f (1) =
N
k=0
z
k
k!
+R
N
(z) (12.31)
where
R
N
(z) =
z
N+1
N!
_
1
0
e
tz
(1 t)
N
dt. (12.32)
Then show that lim
N
[R
N
(z)[ = 0 for all z C and use this to conclude
e
z
=
k=0
z
k
k!
for all z C.
Part IV
A Little Functional Analysis
13
Weierstrass Approximation Theorem
In this chapter, (X, d) , will be a metric space. Recall that if Z is a Banach
space then C(X, Z) and BC(X, Z) denotes the continuous and bounded con-
tinuous functions form X to Z respectively. Recall that BC (X, Z) is a Banach
spaces when equipped with the supremum norm,
|f|
u
:= sup
xX
|f (x)|
Z
.
(I will also often write |f|
for |f|
u
.) We are most interested in the case
where Z = 1 or C. The main goal of this chapter is to nd dense subsets of
C (X, 1) and C (X, C) when (X, d) is compact.
Denition 13.1 (Support). Let f : X Z be a function from a metric space
(X, d) to a vector space Z. The support of f is the closed subset, supp(f), of X
dened by
supp(f) := x X : f (x) ,= 0.
Example 13.2. For example if f : 1 1 is dened by f (x) = sin (x) 1
[0,4]
(x)
1, then
f ,= 0 = (0, 4) , 2, 3
and therefore supp(f) = [0, 4].
13.1 Classical Weierstrass Approximation Theorem
Are rst priority is to consider the metric space X = J = [a, b] for some
< a < b < . For the remainder of this section, Z will be used to denote
a Banach space.
Denition 13.3 (Convolution). For f, g C (1) with either f or g having
compact support, we dene the convolution of f and g by
f g (x) =
_
R
f(x y)g(y)dy =
_
R
f(y)g(x y)dy.
We will also use this denition when one of the functions, either f or g, takes
values in a Banach space Z.
Lemma 13.4 (Approximate sequences). Suppose that q
n
n=1
is a se-
quence non-negative continuous real valued functions on 1 with compact support
that satisfy
_
R
q
n
(x) dx = 1 and (13.1)
lim
n
_
]x]
q
n
(x) dx = 0 for all > 0. (13.2)
If f BC(1, Z), then
q
n
f (x) :=
_
R
q
n
(y)f(x y)dy
converges to f uniformly on compact subsets of 1.
Proof. Let x 1, then because of Eq. (13.1),
|q
n
f (x) f (x)| =
_
_
_
_
_
R
q
n
(y) (f(x y) f (x)) dy
_
_
_
_
_
R
q
n
(y) |f(x y) f (x)| dy.
Let M = sup |f (x)| : x 1 . Then for any > 0, using Eq. (13.1),
|q
n
f (x) f (x)|
_
]y]
q
n
(y) |f(x y) f (x)| dy
+
_
]y]>
q
n
(y) |f(x y) f (x)| dy
sup
]w]
|f(x +w) f (x)| + 2M
_
]y]>
q
n
(y)dy.
So if K is a compact subset of 1 (for example a large interval) we have
sup
(x)K
|q
n
f (x) f (x)|
sup
]w], xK
|f(x +w) f (x)| + 2M
_
|y|>
q
n
(y)dy
154 13 Weierstrass Approximation Theorem
and hence by Eq. (13.2),
lim sup
n
sup
xK
|q
n
f (x) f (x)|
sup
]w], xK
|f(x +w) f (x)| .
This nishes the proof since the right member of this equation tends to 0 as
0 by uniform continuity of f on compact subsets of 1.
Let q
n
: 1 [0, ) be dened by
q
n
(x) :=
1
c
n
(1 x
2
)
n
1
]x]1
where c
n
:=
_
1
1
(1 x
2
)
n
dx. (13.3)
Figure 13.1 displays the key features of the functions q
n
.
Fig. 13.1. A plot of q1, q50, and q100. The most peaked curve is q100 and the least is
q1. The total area under each of these curves is one.
Lemma 13.5. The sequence q
n
n=1
is an approximate sequence, i.e. they
satisfy Eqs. (13.1) and (13.2).
Proof. By construction, q
n
C
c
(1, [0, )) for each n and Eq. 13.1 holds.
Since
_
]x]
q
n
(x) dx =
2
_
1
(1 x
2
)
n
dx
2
_
0
(1 x
2
)
n
dx + 2
_
1
(1 x
2
)
n
dx
_
1
(1 x
2
)
n
dx
_
0
x
(1 x
2
)
n
dx
=
(1 x
2
)
n+1
[
1
(1 x
2
)
n+1
[
0
=
(1
2
)
n+1
1 (1
2
)
n+1
0 as n ,
the proof is complete.
Theorem 13.6 (Classical Weierstrass Approximation Theorem). Sup-
pose < a < b < , J = [a, b] and f C(J, Z). Then there exists polynomial
functions, p
n
: 1 Z such that p
n
f uniformly on J.
Proof. By replacing f by F where
F (t) := f (a +t (b a)) [f (a) +t (f (b) f (a))] for t [0, 1] ,
it suces to assume a = 0, b = 1 and f (0) = f (1) = 0. Furthermore we may
now extend f to a continuous function on all 1 by setting f 0 on 1 [0, 1] .
With q
n
dened as in Eq. (13.3), let f
n
(x) := (q
n
f) (x) and recall from
Lemma 13.4 that f
n
(x) f (x) as n with the convergence being uniform
in x [0, 1]. This completes the proof since f
n
is equal to a polynomial function
on [0, 1] . Indeed, there are polynomials, a
k
(y) , such that
(1 (x y)
2
)
n
=
2n
k=0
a
k
(y) x
k
,
and therefore, for x [0, 1] ,
f
n
(x) =
_
R
q
n
(x y)f(y)dy
=
1
c
n
_
[0,1]
f(y)
_
(1 (x y)
2
)
n
1
]xy]1
dy
=
1
c
n
_
[0,1]
f(y)(1 (x y)
2
)
n
dy
=
1
c
n
_
[0,1]
f(y)
2n
k=0
a
k
(y) x
k
dy =
2n
k=0
A
k
x
k
where
A
k
=
1
c
n
_
[0,1]
f (y) a
k
(y) dy.
13.2 Stone-Weierstrass Theorem for Compact X 155
Corollary 13.7. Any continuous function f : J Z may be approximated
uniformly by innitely dierentiable functions.
Exercise 13.1. Suppose < a < b < and g C ([a, b], 1) satises
g (t) 0 for all t J := [a, b] and
_
J
g (t) dt = 0. Show g 0.
Exercise 13.2. Let J = [a, b] where < a < b < . Show if f C (J, C)
satises _
J
f (t) t
n
dt = 0 for all n N
0
, (13.4)
then f 0. Hint: use Theorem 13.6 to conclude that
_
J
[f (t)[
2
dt = 0 and
then use Exercise 13.1.
Example 13.8 (Probability densities are determined by their moments). Let J =
[a, b] where < a < b < and suppose that
1
and
2
are two non-negative
continuous probability density functions on J having the same moments, then
1
=
2
. Indeed by assumption that
1
and
2
are probability density functions
with the same moments is exactly that
_
J
t
n
1
(t) dt =
_
J
t
n
2
(t) dt for all n N
0
.
This last equation is equivalent to
_
J
t
n
[
1
(t)
2
(t)] dt = 0 for all n N
0
and hence by Exercise 13.2 it follows that
1
(t) =
2
(t) for all t J.
13.2 Stone-Weierstrass Theorem for Compact X
We now wish to generalize Theorem 13.6 to general metric spaces and to gen-
eralize to collections other than polynomial functions.
Denition 13.9. Let X be a metric space and / C (X) = C(X, 1) or
C(X, C) be a collection of functions. Then
1. / is said to separate points if for all distinct points x, y X there exists
f / such that f(x) ,= f(y).
2. / is an algebra if / is a vector subspace of C (X) which is closed under
pointwise multiplication. (Note well: we do not assume 1 /.)
3. / C(X, 1) is called a lattice if f g := max(f, g) and f g = min(f, g)
/ for all f, g /.
4. / C(X, C) is closed under conjugation if

f / whenever f /.
Lemma 13.10. If / is a closed (relative to the supremum norm) sub-algebra
of BC(X, 1) then [f[ / for all f / and / is a lattice.
Proof. Let f / and let M = sup
xX
[f(x)[ . By Theorem 13.15, there are
polynomials p
n
(t) such that
lim
n
sup
]t]M
[[t[ p
n
(t)[ = 0.
By replacing p
n
by p
n
p
n
(0) if necessary we may assume that p
n
(0) = 0. Since
/ is an algebra, it follows that f
n
= p
n
(f) / and [f[ /, because [f[ is the
uniform limit of the f
n
s. Since
f g =
1
2
(f +g +[f g[) and
f g =
1
2
(f +g [f g[),
we have shown / is a lattice.
Lemma 13.11. Let / C(X, 1) be an algebra which separates points and
suppose x and y are distinct points of X. If there exists f, g / such that
f(x) ,= 0 and g(y) ,= 0, (13.5)
then
V := (f(x), f(y)) : f /= 1
2
. (13.6)
Proof. It is clear that V is a non-zero subspace of 1
2.
If dim(V ) = 1, then
V = span(a, b) for some (a, b) 1
2
which, necessarily by Eq. (13.5), satisfy
a ,= 0 ,= b. Since (a, b) = (f(x), f(y)) for some f / and f
2
/, it follows
that (a
2
, b
2
) = (f
2
(x), f
2
(y)) V as well. Since dimV = 1, (a, b) and (a
2
, b
2
)
are linearly dependent and therefore
0 = det
_
a b
a
2
b
2
_
= ab
2
a
2
b = ab(b a)
which implies that a = b. But this the implies that f(x) = f(y) for all f /,
violating the assumption that / separates points. Therefore we conclude that
dim(V ) = 2, i.e. V = 1
2
.
Theorem 13.12 (Stone-Weierstrass Theorem, Compact Case). Suppose
X is a compact metric space and / C(X, 1) is a closed subalgebra which
separates points. For x X let
/
x
:= f(x) : f / and
J
x
= f C(X, 1) : f(x) = 0.
Then either one of the following two cases hold.
1. / = C(X, 1) or
2. there exists a unique point x
0
X such that / = J
x0
.
Moreover, case 1. holds i /
x
= 1 for all x X and case 2. holds i there
exists a point x
0
X such that /
x0
= 0 .
Proof. If there exists x
0
such that /
x0
= 0 (x
0
is unique since / separates
points) then / J
x0
. If such an x
0
exists let ( = J
x0
and if /
x
= 1 for all x,
set ( = C(X, 1). Let f ( be given. By Lemma 13.11, for all x, y X such
that x ,= y, there exists g
xy
/ such that f = g
xy
on x, y.
1
When X is
compact the basic idea of the proof is contained in the following identity,
f(z) = inf
xX
sup
yX
g
xy
(z) for all z X. (13.7)
To prove this identity, let g
x
:= sup
yX
g
xy
and notice that g
x
f since
g
xy
(y) = f(y) for all y X. Moreover, g
x
(x) = f(x) for all x X since
g
xy
(x) = f(x) for all x. Therefore,
inf
xX
sup
yX
g
xy
= inf
xX
g
x
= f.
The rest of the proof is devoted to replacing the inf and the sup above by
min and max over nite sets at the expense of Eq. (13.7) becoming only an
approximate identity.
Claim. Given > 0 and x X there exists g
x
/ such that g
x
(x) = f(x)
and f < g
x
+ on X.
To prove the claim, let V
y
be an open neighborhood of y such that [f g
xy
[ <
on V
y
so in particular f < +g
xy
on V
y
. By compactness, there exists
f
X
such that X =
y
V
y
. Set
g
x
(z) = maxg
xy
(z) : y ,
then for any y , f < +g
xy
< +g
x
on V
y
and therefore f < +g
x
on X.
Moreover, by construction f(x) = g
x
(x), see Figure 13.2 below. This completes
the proof of the claim and we now can complete the proof of the theorem.
For each x X, let U
x
be a neighborhood of x such that [f g
x
[ < on
U
x
. Choose
f
X such that X =
x
U
x
and dene
g = ming
x
: x /.
Then f < g + on X and for x ,
1
If Ax0
= {0} and x = x0 or y = x0, then gxy exists merely by the fact that A
separates points.
Fig. 13.2. Constructing the funtions gx.
g g
x
< f + on U
x
.
Since X =
x
U
x
, we conclude
f < g + and g < f + on X,
i.e. [f g[ < on X. Since > 0 is arbitrary it follows that f

/ = /.
Corollary 13.13 (Complex Stone-Weierstrass Theorem, Compact
Case.). Let X be a compact metric space. Suppose / is a subalgebra of
C(X, C) which is closed in the uniform topology, separates points, and is closed
under complex conjugation. Then either / = C(X, C) or
/ = J
C
x0
:= f C(X, C) : f(x
0
) = 0
for some x
0
X.
Proof. Since
Re f =
f +

f
2
and Imf =
f

f
2i
,
Re f and Imf are both in /. Therefore
/
R
= Re f, Imf : f /
is a real sub-algebra of C(X, 1) which separates points. Therefore either /
R
=
C(X, 1) or /
R
= J
C
x0
C(X, 1) for some x
0
and hence / = C(X, C) or J
C
x0
respectively.
Here are some applications of this theorem.
13.3 Stone-Weierstrass Theorem for Locally Compact X 157
Example 13.14. Let f C([a, b]) be a positive function which is injective. Then
functions of the form
N
k=1
a
k
f
k
with a
k
C and N N are dense in C([a, b]).
For example if a = 1 and b = 2, then one may take f(x) = x
for any ,= 0,
or f(x) = e
x
, etc.
Exercise 13.3. Let (X, d) be a separable compact metric space. Show that
C (X) is also separable. Hint: Let E X be a countable dense set and then
consider the algebra, / C (X) , generated by d(x, )
xE
.
Theorem 13.15 (Weierstrass Approximation Theorem). Suppose that
K 1
d
is a compact subset and f C(K, C). Then there exists polynomi-
als p
n
on 1
d
such that p
n
f uniformly on K.
Remark 13.16. The mapping (x, y) 1
d
1
d
z = x + iy C
d
is an iso-
morphism of vector spaces. Letting z = x iy as usual, we have x =
z+ z
2
and
y =
z z
2i
. Therefore under this identication any polynomial p(x, y) on 1
d
1
d
may be written as a polynomial q in (z, z), namely
q(z, z) = p(
z + z
2
,
z z
2i
).
Conversely a polynomial q in (z, z) may be thought of as a polynomial p in
(x, y), namely p(x, y) = q(x +iy, x iy).
Corollary 13.17 (Complex Weierstrass Approximation Theorem).
Suppose that K C
d
is a compact set and f C(K, C). Then there exists
polynomials p
n
(z, z) for z C
d
such that sup
zK
[p
n
(z, z) f(z)[ 0 as
n .
Proof. This is an immediate consequence of Theorem 13.15 and Remark
13.16.
Example 13.18. Let K = S
1
= z C : [z[ = 1 and /be the set of polynomials
in (z, z) restricted to S
1
. Then / is dense in C(S
1
).
2
Since z = z
1
on S
1
,
we have shown polynomials in z and z
1
are dense in C(S
1
). This example
generalizes in an obvious way to K =
_
S
1
_
d
C
d
.
Exercise 13.4. Suppose f C (1, C) is a 2 periodic function (i.e.
f (x + 2) = f (x) for all x 1) and
_
2
0
f (x) e
inx
dx = 0 for all n Z,
2
Note that it is easy to extend f C(S
1
) to a function F C(C) by setting
F(z) = |z| f(
z
|z|
) for z = 0 and F(0) = 0. So this special case does not require the
Tietze extension theorem.
show again that f 0. Hint: Use Example 13.18 to show that any 2 periodic
continuous function g on 1 is the uniform limit of trigonometric polynomials
of the form
p (x) =
n
k=n
p
k
e
ikx
with p
k
C for all k.
Exercise 13.5. Given a continuous function f : 1 C which is 2 -
periodic and > 0. Show there exists a trigonometric polynomial, p() =
n
n=N
n
e
in
, such that [f() P()[ < for all 1. Hint: show that there
exists a unique function F C(S
1
) such that f() = F(e
i
) for all 1.
Remark 13.19. Exercise 13.5 generalizes to 2 periodic functions on 1
d
, i.e.
functions such that f(+2e
i
) = f() for all i = 1, 2, . . . , d where e
i
d
i=1
is the
standard basis for 1
d
. A trigonometric polynomial p() is a function of 1
d
of the form
p() =
n
e
in
where is a nite subset of Z
d
. The assertion is again that these trigonometric
polynomials are dense in the 2 periodic functions relative to the supremum
norm.
13.3 Stone-Weierstrass Theorem for Locally Compact X
Denition 13.20. If A and B are two subsets of a metric space (X, d) we let
d (A, B) := inf d (a, b) : a A and b B
be the distance between A and B.
Exercise 13.6. Let K be a compact subset and F be a closed subset of a metric
space, (X, d) . If K F = , show d (K, F) > 0.
Denition 13.21. A metric space (X, d) is locally compact if for all x X
there exists V A
x
such that

V is compact. [We actually only need to use here
the easier fact that
x

V
x
is a compact set being the nite union of compact
sets and that

V is a closed subset of
x

V
x
.]
Exercise 13.7. Let (X, d) be a locally compact metric space and K be a
compact subset of X. Show there exists a precompact
3
open set V X
such that K X. Use this result to show there exists an > 0 so that
V
(K) := x X : d
K
(x) < is precompact.
3
Recall that a subset V of a metric space is precompact if

V is compact.
Denition 13.22. If (X, d) is a metric space and Z is a normed space, let
C
c
(X, Z) denote those f C (X, Z) with compact support. Recall that the sup-
port of f is dened to be supp (f) := f
1
(0).
The next corollary is a minor variant of Urysohns Lemma 9.23.
Corollary 13.23 (Locally Comapct Urysohns Lemma). Let (X, d) be a
locally compact metric space and K be a compact subset of X and F be a closed
subset of X such that K F = . Then there exists f C
c
(X, [0, 1]) such that
f = 1 on K and f = 0 on F.
Proof. By Exercise 13.6, := d (K, F) > 0 and then by Exercise 13.7 there
exists 0 < < such that V
(K) is precompact. We then dene

f (x) =
d
V(K)
c (x)
d
K
(x) +d
V(K)
c (x)
for x X.
This function satises f = 1 on K and f = 0 on V
(K)
c
and hence f = 0 on
F V
(K) . Moreover we have f ,= 0 V
(K) and so supp (f) is a closed

subset of the compact set, V
(K), and hence is itself compact.

Denition 13.24. If (X, d) is a locally compact metric space we let C
0
(X, Z)
denote those f C (X, Z) satisfying K
:= x X : |f (x)| is compact
for all > 0. If this condition holds we say f vanishes at innity.
Theorem 13.25. C
0
(X, Z) is a closed subspace of BC (X, Z) and C
c
(X, Z) is
a dense subspace of C
0
(X, Z) .
Proof. Suppose that f
n
C
0
(X, Z) and f
n
converges uniformly to f
BC (X, Z) . Let > 0 be given and choose n so that |f f
n
|
< /2. Then

choose a compact set K
= x X : |f
n
(x)| /2 . It then follows that, for
all x X K
,
|f (x)| |f
n
(x)| +|f (x) f
n
(x)|
|f
n
(x)| +|f f
n
|
<

2
+

2
= .
Therefore x X : |f (x)| K
which shows x X : |f (x)| is

compact and hence f vanishes at innity, i.e. f C
0
(X, Z).
For the second assertion, let f C
0
(X, Z) and suppose that > 0 is
given. Let K
:= x X : |f (x)| and use Corollary 13.23 to nd

C
c
(X, [0, 1]) such that = 1 on K
. Then the function, f
:= f C
c
(X, Z)
and
|f (x) f
(x)| = |f (x)| [1 (x)[ =

_
0 if x K
< if x / K
.
Thus it follows that |f f
and since > 0 was arbitrary this suces

to show C
c
(X, Z) is dense in C
0
(X, Z) .
Theorem 13.26 (Stone-Weierstrass Theorem, Locally Compact
Case). Suppose (X, d) is a locally compact
4
metric space and / C
0
(X, 1) is
a closed subalgebra which separates points. For x X let
/
x
:= f(x) : f / and
J
x
= f C
0
(X, 1) : f(x) = 0.
Then either one of the following two cases hold.
1. / = C
0
(X, 1) or
2. there exists a unique point x
0
X such that / = J
x0
.
Moreover, case 1. holds i /
x
= 1 for all x X and case 2. holds i there
exists a point x
0
X such that /
x0
= 0 .
Proof. If there exists x
0
such that /
x0
= 0 (x
0
is unique since / separates
points) then / J
x0
. If such an x
0
exists let ( = J
x0
and if /
x
= 1 for all
x, set ( = C
0
(X, 1). Let f ( be given. By Lemma 13.11, for all x, y X
such that x ,= y, there exists g
xy
/ such that f = g
xy
on x, y.
5
When X is
compact the basic idea of the proof is contained in the following identity,
f(z) = inf
xX
sup
yX
g
xy
(z) for all z X. (13.8)
To prove this identity, let g
x
:= sup
yX
g
xy
and notice that g
x
f since
g
xy
(y) = f(y) for all y X. Moreover, g
x
(x) = f(x) for all x X since
g
xy
(x) = f(x) for all x. Therefore,
inf
xX
sup
yX
g
xy
= inf
xX
g
x
= f.
The rest of the proof is devoted to replacing the inf and the sup above by
min and max over nite sets at the expense of Eq. (13.8) becoming only an
approximate identity. We also have to modify Eq. (13.8) slightly to take care
of the non-compact case.
Claim. Given > 0 and x X there exists g
x
/ such that g
x
(x) = f(x) and
f < g
x
+ on X.
To prove this, let V
y
be an open neighborhood of y such that [f g
xy
[ <
on V
y
; in particular f < +g
xy
on V
y
. Also let g
x,
be any xed element in /
such that g
x,
(x) = f (x) and let
4
The reason we suppose that X is locally compact is so that we know that C0 (X, 1)
is a non-trivial subspace of C (X, 1) .
5
If Ax0
= {0} and x = x0 or y = x0, then gxy exists merely by the fact that A
separates points.
13.3 Stone-Weierstrass Theorem for Locally Compact X 159
K =
_
[f[

2
_
_
[g
x,
[

2
_
. (13.9)
Since K is compact, there exists
f
K such that K

y
V
y
. Dene
g
x
(z) = maxg
xy
(z) : y .
Since
f < +g
xy
< +g
x
on V
y
,
for any y , and
f <

2
< +g
x,
g
x
+ on K
c
,
f < + g
x
on X and by construction f(x) = g
x
(x), see Figure 13.3. This
completes the proof of the claim.
Fig. 13.3. Constructing the dominating approximates, gx for each x X.
To complete the proof of the theorem, let g
be a xed element of / such

that f < g
+ on X; for example let g
= g
x0
/ for some xed x
0
X.
For each x X, let U
x
be a neighborhood of x such that [f g
x
[ < on U
x
.
Choose

f
F :=
_
[f[

2
_
_
[g
[

2
_
such that F

x
U
x
( exists since F is compact) and dene
g = ming
x
: x /.
Then, for x F, g
x
< f + on U
x
and hence g < f + on
x
U
x
F. Likewise,
g g
< /2 < f + on F
c
.
Therefore we have now shown,
f < g + and g < f + on X,
i.e. [f g[ < on X. Since > 0 is arbitrary it follows that f

/ = / and so
/ = (.
Corollary 13.27 (Complex Stone-Weierstrass Theorem, Locally Com-
pact Case). Let X be a locally compact metric space. Suppose / is a subalge-
bra of C
0
(X, C) which is closed in the uniform topology, separates points, and
is closed under complex conjugation. Then either / = C
0
(X, C) or
/ = J
C
x0
:= f C
0
(X, C) : f(x
0
) = 0
for some x
0
X.
Proof. Since
Re f =
f +

f
2
and Imf =
f

f
2i
,
Re f and Imf are both in /. Therefore
/
R
= Re f, Imf : f /
is a real sub-algebra of C
0
(X, 1) which separates points. Therefore either /
R
=
C
0
(X, 1) or /
R
= J
x0
C
0
(X, 1) for some x
0
and hence / = C
0
(X, C) or J
C
x0
respectively.
Example 13.28. Let X = [0, ), > 0 be xed, / be the real algebra generated
by t e
t
. So the general element f / is of the form f(t) = p(e
t
), where
p(x) is a polynomial function in x with real coecients. Since / C
0
(X, 1)
separates points and e
t
/ is pointwise positive,

/ = C
0
(X, 1).
Exercise 13.8. Suppose that g C ([0, ), 1) is a function such that
|g|
1
:=
_

0
[g (x)[ dx := lim
M
_
M
0
[g (x)[ dx < .
Show;
1.
_
0
f (x) g (x) dx := lim
M
_
M
0
f (x) g (x) dx exists for all
f BC ([0, )) .
2. Also show
_

0
f (x) g (x) dx
|f|
u
|g|
1
for all f BC ([0, )) .
Thus it follows if f
n
BC ([0, )) converges uniformly to f BC ([0, ))
then
_

0
g (x) f (x) dx = lim
n
_

0
g (x) f
n
(x) dx. (13.10)
3. Observe (but do not turn in) that the map, BC ([0, )) f
_
0
f (x) g (x) dx 1 is linear.
Denition 13.29 (Laplace Transform). Suppose that g C ([0, ), 1) is a
function such that
_

0
[g (t)[ e
at
dt < for some a (0, ) .
Then for (a, ) we dene (/g) () :=
_
0
g (t) e
t
dt and refer to /g as
the Laplace transform of g.
Theorem 13.30 (Injectivity of the Laplace Transform). Continuing the
notation in Denition 13.29. If /g () = 0 for all > a then g (t) = 0 for all
t 0.
Proof. Let > 0 be xed. From Example 13.28,
/ =
_
p
_
e
t
_
: p is a polynomial w/o constant term
_
= span
_
e
nt
: n N
_
is dense in C
0
([0, )) .By assumption,
_

0
g (t) e
at
e
nt
dt =
_

0
g (t) e
(n+a)t
dt = 0
and so by the linearity of the integral it follows that
_

0
g (t) e
at
f (t) dt = 0 for all f /.
Then form the continuity properties of the integral in Eq. (13.10) and the fact
that

/ = C
0
([0, )) we may conclude that
_

0
g (t) e
at
f (t) dt = 0 for all f C
0
([0, )) .
Given M > 0 let
M
C
c
([0, ), [0, 1]) such that
M
= 1 on [0, M] and
M
= 0 on [M + 1, ). Then f :=
M
g C
c
([0, )) C
0
([0, )) and
therefore,
0 =
_

0
g (t) e
at
f (t) dt =
_
M+1
0
g
2
(t)
2
M
(t) e
at
dt
from which it follows that g
2
(t)
2
M
(t) e
at
= 0 for all t M + 1. But this
then shows g (t) = 0 for all t M. As M was arbitrary we may conclude that
g (t) = 0 for all t 0.
Exercise 13.9. Suppose that g C ([0, )) such that
_

0
[g (x)[ dx := lim
M
_
M
0
[g (x)[ dx <
and
_

0
g (x)
_
1
1 +x
_
n
dx := lim
M
_
M
0
g (x)
_
1
1 +x
_
n
dx = 0 for n N.
(13.11)
Show g (x) = 0 for all x 0.
14
Inner Product Spaces and Orthonormal Bases
Denition 14.1. Let H be a complex vector space. An inner product on H is
a function, [ : H H C, such that
1. af +bg[h = af[h +bg[h i.e. f f[h is linear.
2. f[g = g[f.
3. |f|
2
:= f[f 0 with equality |f|
2
= 0 i f = 0.
Notice that combining properties (1) and (2) that f h[f is conjugate
linear for xed h H, i.e.
h[af +bg = ah[f +
bh[g.
The following identity will be used frequently in the sequel without further
mention,
|f +g|
2
= f +g[f +g = |f|
2
+|g|
2
+f[g +g[f
= |f|
2
+|g|
2
+ 2Ref[g. (14.1)
Theorem 14.2 (Schwarz Inequality). Let (H, [) be an inner product
space, then for all f, g H
[f[g[ |f||g|
and equality holds i f and g are linearly dependent.
Proof. If g = 0, the result holds trivially. So assume that g ,= 0 and observe;
if f = g for some C, then f[g = |g|
2
and hence
[f[g[ = [[ |g|
2
= |f||g|.
Now suppose that f H is arbitrary, let h := f |g|
2
f[gg. (So h is the
orthogonal projection of f onto g, see Figure 14.1.) Then
0 |h|
2
=
_
_
_
_
f
f[g
|g|
2
g
_
_
_
_
2
= |f|
2
+
[f[g[
2
|g|
4
|g|
2
2Ref[
f[g
|g|
2
g
= |f|
2
[f[g[
2
|g|
2
from which it follows that 0 |g|
2
|f|
2
[f[g[
2
with equality i h = 0 or
equivalently i f = |g|
2
f[gg.
h = f
f|g
g
2
g
f|g
g
2
g
f
g
0
Fig. 14.1. The picture behind the proof of the Schwarz inequality.
Corollary 14.3. Let (H, [) be an inner product space and |f| = |f|
H
:=
_
f[f. Then the Hilbertian norm, | |, is a norm on H. Moreover [ is
continuous on H H, where H is viewed as the normed space (H, ||).
Proof. If f, g H, then, using Schwarzs inequality,
|f +g|
2
= |f|
2
+|g|
2
+ 2Ref[g
|f|
2
+|g|
2
+ 2|f||g| = (|f| +|g|)
2
.
Taking the square root of this inequality shows || satises the triangle inequal-
ity.
Checking that || satises the remaining axioms of a norm is now routine
and will be left to the reader. The continuity of the inner product follows from
Theorem 14.2 as in Exercise ??. For completeness, if f, g, f, g H, then
[f +f[g +g f[g[ = [f[g +f[g +f[g[
|f| |g| +|g| |f| +|g| |f|
with the latter expression clearly going to zero as f and g go to zero proving
the inner product is continuous.
Denition 14.4. An inner product space (H, [) is a Hilbert space if H is
complete in the ||
H
, i.e. (H, ||
H
) is a Banach space.
Unfortunately for the purposes of this course we our inner product spaces
will typically not be complete because we are using Riemann integration theory
rather than Lebesgue integration theory. Here is our one and only example of
a Hilbert space.
162 14 Inner Product Spaces and Orthonormal Bases
Example 14.5 (Space of square summable sequences). Let
2
denote those se-
quences a = (a
1
, a
2
, a
3
, . . . ) with a
i
C such that |a|
2
2
:=
n=1
[a
n
[
2
< .
We then dene a[b :=
n=1
a
n
b
n
for all a, b
2
. Applying the Cauchy-
Schwarz inequality for nite sequences we nd,
N
n=1
a
n
b
n
_
N
n=1
[a
n
[
2
_
N
n=1
[b
n
[
2
|a|
2
|b|
2
.
Letting N then shows
n=1
a
n
b
n
|a|
2
|b|
2
< for all a, b
2
.
Using this remark, one may show that
_
2
, [
_
is an inner product space.
The proof that
2
is complete follows the familiar completeness argument
which do now.
Let a (k)
k=1

2
be a Cauchy sequence, i.e.
|a (k) a (l)|
2
2
=
n=1
[a
n
(k) a
n
(l)[
2
0 as k, l .
This implies for each n N that a (k)
k=1
is a Cauchy sequence in C and
hence convergent. Let a
n
:= lim
k
a
n
(k) C for all n N. Then for any
N N,
N
n=1
[a
n
(k) a
n
[
2
= lim
l
N
n=1
[a
n
(k) a
n
(l)[
2
limsup
l
n=1
[a
n
(k) a
n
(l)[
2
= limsup
l
|a (k) a (l)|
2
2
.
Letting N in this inequality then shows,
|a (k) a|
2
2
limsup
l
|a (k) a (l)|
2
2
0 as k .
Denition 14.6. Let (H, [) be an inner product space, we say f, g H are
orthogonal and write f g i f[g = 0. More generally if A H is a set,
f H is orthogonal to A (write f A) i f[g = 0 for all g A. Let
A
= f H : f A be the set of vectors orthogonal to A. A subset S H

is an orthogonal set if f g for all distinct elements f, g S. If S further
satises, |f| = 1 for all f S, then S is said to be an orthonormal set.
Remark 14.7. For any subset, V, of H we have V V
0 . Indeed if f
V V
, then |f|
2
= f[f = 0.
Proposition 14.8. Let (H, [) be an inner product space then
1. (Parallelogram Law)
|f +g|
2
+|f g|
2
= 2|f|
2
+ 2|g|
2
(14.2)
for all f, g H.
2. (Pythagorean Theorem) If S
f
H is a nite orthogonal set, then
_
_
_
_
_
_
fS
f
_
_
_
_
_
_
2
=
fS
|f|
2
. (14.3)
3. If A H is a set, then A
is a closed linear subspace of H.

Proof. I will assume that H is a complex Hilbert space, the real case being
easier. Items 1. and 2. are proved by the following elementary computations;
|f +g|
2
+|f g|
2
= |f|
2
+|g|
2
+ 2Ref[g +|f|
2
+|g|
2
2Ref[g
= 2|f|
2
+ 2|g|
2
,
and
_
_
_
_
_
_
fS
f
_
_
_
_
_
_
2
=
fS
f[
gS
g =
f,gS
f[g
=
fS
f[f =
fS
|f|
2
.
Item 3. is a consequence of the continuity of [ and the fact that
A
=
fA
Nul([f)
where Nul([f) = g H : g[f = 0 a closed subspace of H.
Denition 14.9. If H is a nite orthonormal set, let P
: H H be the
linear operator dened by
P
f :=
f[ V := span .
We refer to P
to the orthogonal projection determined by .

14 Inner Product Spaces and Orthonormal Bases 163
Proposition 14.10 (Properties of P
). Let be a nite orthonormal subset

of H and V = span . For all f, g H the following properties hold;
1. P
f[P
g = P
f[g = f[P
g =
f[ [g ,
2. |P
f|
2
=
[f[[
2
3. (f P
f) V
4. |f|
2
= |f P
f|
2
+|P
f|
2
,
5. P
f = f for f V and P
f = 0 for f V
, and
6. P
2
= P
.
Moreover if
t
is another orthonormal basis for V then P
= P
.
Proof. These follow by standard linear algebra direct calculations. We sup-
ply the proof for the reader convenience.
1. For f, g H we have,
P
f[g =
_
f[ [g
_
=
f[ [g (14.4)
and similarly on shows f[P
g =
f[ [g . Taking f =
0
in
this equation shows
0
[P
g =
0
[g and using this result in Eq. (14.4)
shows,
f[ [g =
f[ [P
g = P
f[P
g
wherein we have used Eq. (14.4) again for the second equality.
2. Take f = g in item 1.
3. Taking g = in Eq. (14.4) shows P
f[ = f[ . This is equivalent
to [f P
f = 0 for all which is equivalent to (f P
f) V.
4. This follows from Pythagoreans theorem as (P
f) (f P
f) since
P
f V while (f P
f) V.
5. If f V
then f[ = 0 for all V and therefore P
f = 0. If
f V, then f P
f V V
and hence f P
f = 0, i.e. f = P
f.
6. This follows from item 5. since for f H, P
f V and therefore
P
(P
f) = P
f.
For the last assertion, observe that (P
f P
f) = (f P
f) (f P
f)
from which it follows that (P
f P
f) V V
= 0 , i.e. P
f = P
f.
Proposition 14.11 (GramSchmidt process). Every nite dimensional
subspace, V H, has an orthonormal basis .
Proof. Let v
1
, . . . , v
n
be a basis for V and set u
1
:= v
1
/ |v
1
| . Then dene
u
k
inductively by setting
v
k
= v
k
P
u1,...,uk1]
v
k
and then setting u
k
:= v
k
/ | v
k
| .
Denition 14.12. If V is a nite dimensional subspace of H we let P
V
= P
where is any orthonormal basis for V.

Corollary 14.13 (Best Approximation Theorem). If V is a nite dimen-
sional subspace of H, then
|f P
V
f| = inf
gV
|f g| for all f H.
In words, P
V
f V is the best approximation to f by an element in V.
Proof. If g V,
f g = (f P
V
f) + (P
V
f g)
where (f P
V
f) V
and (P
V
f g) V. Therefore by Pythagoreans theo-
rem,
|f g|
2
= |f P
V
f|
2
+|P
V
f g|
2
|f P
V
f|
2
.
Proposition 14.14 (Bessels Inequality). If T is an orthonormal subset of
H, then
T
[f[[
2
:= sup
f T
[f[[
2
|f|
2
for all f H (14.5)
and
T
[f[ [g[ := sup
f T
[f[ [g[ |f| |g| for all f, g H. (14.6)

In particular, the sum,
T
f[ [g , is absolutely convergent.
Proof. Let
f
T be any nite set. Then from Proposition 14.10,
[f[[
2
= |P
f|
2
= |f|
2
|f P
f|
2
|f|
2
.
This shows shows |f|
2
is an upper bound
_
[f[[
2
:
f
T
_
and proves
Eq. (14.5). From the Cauchy Schwarz inequality and Eq. (14.5),
164 14 Inner Product Spaces and Orthonormal Bases
[f[ [g[
[f[[
2
[[g[
2
|f| |g|
from which Eq. (14.6) follows.
Denition 14.15. If H is a countable (for simplicity) orthonormal set
and f H we write
f =
f[
i for every choice of
n

f
such that
n
, we have
lim
n
n
f[ = f in H, i.e. lim
n
P
n
f = f in H.
Theorem 14.16 (Orthonormal Basis Theorem). Let (H, [) be a not nec-
essarily complete inner product space and H be a countable (for simplicity)
orthonormal set. Then the following three conditions are equivalent:
1. f =
f[ for all f H.
2. There exists
n
such that
n
and lim
n
n
f[ = f for all
f H.
3. f, g =
f[ [g for all f, g H.
4. |f|
2
=
[ f[ [
2
for all f H.
From Proposition 14.14 the sums in items 2. and 3. are absolutely convergent
and so how we order the terms in the sum is not important.
Proof. Certainly 1. implies 2. If 2. holds then P
n
f f as n and
therefore,
f[g = lim
n
P
n
f[g = lim
n
_

n
f[
n

n
[g
_
= lim
n
n
f[
n

n
[g =
f[ [g
which is item 3. Item 3. implies 4. by taking g = f. So it only remains to prove
4. implies 1.
Suppose that
n

f
with
n
is given. Then
|f P
n
f|
2
= |f|
2
|P
n
f|
2
= |f|
2
n
[ f[ [
2
|f|
2
[ f[ [
2
= 0
which veries that 4. = 1.
Denition 14.17. A countable orthonormal subset H is an orthonor-
mal basis for H if any one of the equivalent conditions in Theorem 14.16 are
satised.
Denition 14.18. A subset T H is total if span (T) is a dense subspace of
H.
Corollary 14.19. If T = u
1
, u
2
, u
3
, . . . is a total subspace of H, then apply-
ing the GramSchmidt process to T, see the proof of Proposition 14.11, produces
an orthonormal basis for H. Note the span = span T.
Proof. Let us assume dimH = which is the case of interest to
us. The GramSchmidt orthogonalization process produces an orthonormal
set =
1
,
2
, . . . and as sequence N
n
N such that N
n
and
span
1
, . . . ,
n
= span u
1
, . . . , u
Nn
. It is possible that that N
n
> n be-
cause the set T is not assumed to be linearly independent. By assumption, for
any f H and > 0 there exists an N N and (a
1
, . . . , a
N
) C
N
such that
_
_
_
_
_
f
N
k=1
a
k
u
k
_
_
_
_
_
.
As span u
1
, . . . , u
N
span
1
, . . . ,
M
for any M N we may conclude
that
|f P
M
f|
_
_
_
_
_
f
N
k=1
a
k
u
k
_
_
_
_
_

where
M
:=
1
, . . . ,
M
. This shows that lim
M
P
M
f = f and hence
is an orthonormal basis for H.
Theorem 14.20 (Orthogonal Polynomials). Let < a < b < and
C ([a, b] , [0, )) such that has only nitely many zeros. Let H = C ([a, b] , C)
with
f[g =
_
b
a
f (x) g (x) (x) dx.
Then H is an inner product space and applying the Grahm Schmidt process
to the functions, T :=
_
1, x, x
2
, x
3
, . . .
_
, produces an orthonormal basis =
p
n
n=0
for H. Here p
n
(x) is a polynomial with deg p
n
= n for all n.
Proof. By the Weierstrass approximation Theorem 13.6 or by Corollary
13.13 we know that T is a total subset of C ([a, b] , C) when given the supre-
mum norm. That is for any f H, there exists polynomials, q
n
(x) , such that
|f q
n
|
= [f (x) q
n
(x)[ = 0. Since
14 Inner Product Spaces and Orthonormal Bases 165
|f|
2
H
=
_
b
a
[f (x)[
2
(x) dx ||
1
|f|
2
where ||
1
=
_
b
a
(x) dx, it follows that
|f q
n
|
2
H
||
1
|f q
n
|
2
0 as n .
This shows that T is total in H with the norm coming from the inner prod-
uct. Since the the set T is linearly independent, the Grahm Schmidt pro-
cess never produces any zeros. Moreover, it is easily seen by induction that
span p
0
, . . . , p
n
is the space of polynomials of degree n and hence p
n+1
must
be a polynomial of degree n + 1.
Example 14.21 (Jacobi Polynomials). Let a = 1, b = 1, and (x) =
(1 x)
(1 +x)
where , > 1. The associated polynomial are the Jacobi

polynomials, p
(,)
n (x) , see the Wiki Article. When or are less than 0 it
is not longer true that is continuous at the end points. In this case one has
to interpret the integrals as improper integrals. Never the less the condition
, > 1 insure that ||
1
=
_
1
1
(x) dx < which is what is really impor-
tant in the proof of Theorem 14.20.
Exercise 14.1 (Legendre Polynomials). When = = 0 (so that
1) in Example 14.21, the resulting polynomials are multiples of the Legendre
Polynomials. Apply Grahm Schmidt process to nd the rst four Legendre
polynomials, i.e. nd p
0
, p
1
, p
2
and p
3
.
Example 14.22 (Hermite polynomials). Let
H =
_
f C (1, C) :
_

[f (x)[
2
e
1
2
x
2
dx <
_
and for f, g H let
f[g =
_

f (x) g (x) e
1
2
x
2
dx.
It can be shown that T := 1, x, x
2
, x
3
. . . is total in H. Applying the Gram-
Schmidt procedure to T gives orthonormal basis, , for H of polynomial func-
tions on 1. The resulting polynomials are multiples of the so called Hermite
polynomials.
Remark 14.23 (An Interesting Phenomena). Let H = L
2
([1, 1], dm) and S :=
1, x
3
, x
6
, x
9
, . . . . Then again S is total in H by the same argument as in
Theorem 14.20. This is true even though S is a proper subset of T. Notice
that T is an algebraic basis for the polynomials on [1, 1] while S is not! The
following computations may help relieve some of the readers anxiety. Let f
L
2
([1, 1], dm), then, making the change of variables x = y
1/3
, shows that
_
1
1
[f (x)[
2
dx =
_
1
1
f(y
1/3
)
2
1
3
y
2/3
dy =
_
1
1
f(y
1/3
)
2
d(y) (14.7)
where d(y) =
1
3
y
2/3
dy. Since ([1, 1]) = m([1, 1]) = 2, is a nite mea-
sure on [1, 1] and hence T := 1, x, x
2
, x
3
. . . is total in L
2
([1, 1], d). In
particular for any > 0 there exists a polynomial p(y) such that
_
1
1
f(y
1/3
) p(y)
2
d(y) <
2
.
However, by Eq. (14.7) we have
2
>
_
1
1
f(y
1/3
) p(y)
2
d(y) =
_
1
1
f (x) p(x
3
)
2
dx.
Alternatively, if f C([1, 1]), then g(y) = f(y
1/3
) is back in C([1, 1]).
Therefore for any > 0, there exists a polynomial p(y) such that
> |g p|
= sup [g(y) p(y)[ : y [1, 1]

= sup
_
g(x
3
) p(x
3
)
: x [1, 1]
_
= sup
_
f (x) p(x
3
)
: x [1, 1]
_
.
This gives another proof the polynomials in x
3
are dense in C([1, 1]) and hence
in L
2
([1, 1]).
We will see another important example of Corollary 14.19 when we study
Fourier series in the next chapter.
15
Fourier Series
In this chapter we let H be the continuous functions on [, ] with periodic
boundary conditions. That is we let
H = C
p
([, ] , C) := f C ([, ] , C) : f () = f () .
We equip H with the inner product,
f[g :=
1
2
_

f () g () d.
Let C
p
(1, C) be the space of 2 periodic functions, i.e. those F C (1, C)
such that F (x + 2) = F (x) for all x 1. The reader is asked to check that
to each f C
p
([, ] , C) there is a unique function F C
p
(1, C) such
that f = F[
[,]
. In the future we will simply identify C
p
([, ] , C) with
C
p
(1, C) .
15.1 Fourier Basis
Lemma 15.1. The functions, :=
_
n
() := e
in
: n Z
_
form an orthonor-
mal subset of H.
Proof. This is simple computation,
n
[
m
=
1
2
_

e
in
e
im
d =
1
2
_

e
i(nm)
d
=
_
1 if m = n
1
2(mn)
e
i(nm)
[
if m ,= n
=
mn
.
Let S
1
:= z C : [z[ = 1 be the unit circle in C
Theorem 15.2 (Fourier Basis). The function :=
_
n
() := e
in
: n Z
_
form an orthonormal basis for H.
Proof. By the same method of proof of Theorem 14.20, it suces to show
that is total in C
p
([, ] , C) when equipped with the supremum norm. By
Proposition 15.5 below, every f C
p
([, ] , C) may be written at f () =
F
_
e
i
_
where F C
_
S
1
, C
_
. Let

:=
n
(z) := z
n
: n Z C
_
S
1
, C
_
.
Then / := span
C

is a complex algebra inside of C
_
S
1
, C
_
which is stable
under complex conjugation, contains the function 1, and separates points. [The
function
1
(z) = z already separates points.] Therefore by the Stone Weier-
strass Theorem (see Corollary 13.17), we know that / is dense in C
_
S
1
, C
_
equipped with the supremum norm. In particular if > 0 is given there exists
P (z) =
N
k=N
a
k
z
k
sup
zS
1
[F (z) P (z)[ .
Letting
p () := P
_
e
i
_
=
N
k=N
a
k
e
ik
=
N
k=N
a
k
k
()
(which we refer to as a trigonometric polynomial), we have
sup
[,]
[f () p ()[ = sup
[,]
F
_
e
i
_
P
_
e
i
_
= sup
zS
1
[F (z) P (z)[ .
This shows that is total in C
p
([, ] , C) with the sup-norm and hence with
the L
2
norm as well.
For the proof of Proposition 15.5 below we will need a couple of lemmas.
Lemma 15.3. If : [, ] S
1
C is dened by () = e
i
for [, ] ,
then [
(,)
: (, ) S
1
1 is a homeomorphism.
Proof. The map is continuous and [
(,)
: (, ) S
1
1 is a
bijection and hence we must show [
1
(,)
is continuous. To this end, suppose
that
n
(, ) and z
n
= e
in
S
1
1 and z
n
z = e
i
S
1
1 with
(, ) . We wish to show
n
as n . If not there exists an > 0
and a subsequence
k
:=
nk
such that [
k
[ for all k. By passing to a
further subsequence if necessary we may assume that lim
k
k
= exists in
[, ] . It then follows that
e
i
= lim
k
z
nk
= lim
k
e
ik
= e
i
.
168 15 Fourier Series
However this shows that = and therefore lim
k
[
k
[ = 0 which is
a contradiction.
You are asked to give another proof of this Exercises 15.1 and 15.2.
Lemma 15.4. If
n
(, ) and z
n
= e
in
S
1
1 and z
n
1, then
d (
n
, ) := [
n
[ [
n
+[ 0 as n .
Proof. If d (
n
, ) 0 as n there exists an > 0 and a subse-
quence
k
:=
nk
such that d (
k
, ) for all k. By passing to a further
subsequence if necessary we may assume that lim
k
k
= exists in [, ] .
It then follows that
1 = lim
k
z
nk
= lim
k
e
ik
= e
i
so that and we would conclude that
lim
k
d (
k
, ) = d (, ) = 0
Proposition 15.5. To every f C
p
([, ] , C) there exists a unique function
F C
_
S
1
, C
_
such that f () = F
_
e
i
_
.
Proof. Given f : [, ] C such that f () = f () , let F (z) :=
f
_
1
(z)
_
which is well dened even for z = 1 since
1
(1) = and
f () = f () . Then F is the unique function such that f () = F
_
e
i
_
for all
[, ] . Let us further suppose that f is continuous. Making use of Lemmas
15.3 and 15.4, if z
n
= e
in
z = e
i
in S
1
then either
n
as n if
(, ) or d (
n
, ) 0 if . In either case we have,
lim
n
F (z
n
) = lim
n
f (
n
) = f () = F (z)
which shows that F is continuous.
Exercise 15.1 (Stereographic Projection). Let z = x + iy S
1
1 .
Consulting Figure 15.1, nd the unique t 1 such that Re [1 +t (z + 1)] = 0
and for this t show
1 +t (z + 1) = ia where a =
y
1 +x
1.
Then show that : S
1
1 1 dene by (z) =
y
1+x
is a bijection with
1
(a) =
1
(a) =
_
1 +a
2
_
1
__
1 a
2
_
+i2a
.
It follows from these formula that is a homeomorphism.
ia
z
1
0
x
y
Fig. 15.1. Stereographic projection.
Exercise 15.2. Continuing the notation in Exercise 15.1, let
f () := ( ()) =
_
e
i
_
=
sin
1 + cos
.
Show f : (, ) 1 is a homeomorphism that is smooth with a smooth
inverse. Hence it follows that =
1
f : (, ) S
1
1 is a homeo-
morphism being the composition of two homeomorphisms.
Notation 15.6 For f C ([, ] , C) or more generally for f Riemann inte-
grable on [, ] , let
f (n) := f,
n
=
1
2
_

f () e
in
d.
We say

f (n) is the n
th
Fourier coecient of f. Also for N N
0
, let
P
N
f =
N
n=N
f (n)
n
be orthogonal projection onto span
n
: [n[ N .
Theorem 15.7. Suppose that f : [, ] C is Riemann integrable and there
exists f
n
C
p
([, ] , C) such that |f f
n
| 0, then
15.1 Fourier Basis 169
lim
N
|P
N
f f|
2
= 0, and
|f|
2
2
=
nZ
[f[
n
[
2
=
nZ
f (n)
2
.
Moreover if f, g and two such functions, then
f[g =
nZ
f[
n

n
[g =
nZ
f (n) g (n).
Proof. Using the triangle inequality,
|P
N
f f|
2
|P
N
f P
N
f
n
|
2
+|P
N
f
n
f
n
|
2
+|f
n
f|
2
|f f
n
|
2
+|P
N
f
n
f
n
|
2
+|f
n
f|
2
and therefore,
limsup
N
|P
N
f f|
2
2 |f
n
f|
2
0 as n .
It the follows that
f[g = lim
N
P
N
f[P
N
g = lim
N
]n]N
f[
n

n
[g =
nZ
f[
n

n
[g
where the sum is absolutely convergent.
Example 15.8. Let f () := 1
[0,]
() and f
n
() be the function which is one on
_
1
n
,
1
n
and goes linearly to zero on the intervals

_
0,
1
n
and
_

1
n
,
and
is zero on [, 0] . Then
|f f
n
|
2
2
= 2
_ 1
n
0
[1 f
n
()[
2
d 2
1
n
0 as n
and since f
n
C
p
([, ] , C) , Theorem 15.7 is applicable. The Fourier coe-
cients of f are
f (0) = f[
0
=
1
2
_

0
d =
1
2
and for n ,= 0,
f (n) = f[
n
=
1
2
_

0
e
in
d =
1
2in
e
in
[
0
=
1
2ni
_
1 e
in
_
=
1
ni
1
Zodd
(n) .
Thus we nd,
f ()

=
1
2
+
1
i
nNodd
1
n
_
e
in
e
in
_
=
1
2
+
2
nNodd
1
n
sin n = s
N
()
where
s
N
() :=
1
2
+
2
n=1
1
2n + 1
sin ((2n + 1) )
and we write

= to emphasize the that convergence takes plane in the
L
2
sense. See Figure 15.2 where the reader will likely be convinced that
lim
N
S
N
() ,= f () for 0, . It is only guaranteed that
lim
N
_

[f () S
N
()[
2
d = 0.
We also have from Theorem 15.7 that
Fig. 15.2. The plots of sN () for N = 1, N = 10, and N = 100. The red line is the
graph of s100 and the black is the graph of s1.
1
2
=
1
2
_

1
[0,]
()
2
d =
_
1
2
_
2
+
nZodd
1
ni
2
=
1
4
+
2
nNodd
1
n
2
n=0
_
1
2n + 1
_
2
=

2
8
.
15.2 Fourier Series Exercises
Exercise 15.3. Suppose that f C ([, ] , C) . Show there exists f
n

C
p
([, ] , C) such that |f f
n
|
2
0 as n .
Exercise 15.4. Show that if f C ([, ] , C) and

f (k) = 0 for all k then
f 0.
Exercise 15.5. Show
k=1
k
2
=
1
6
2
, by taking f (x) = x on [, ] and
computing |f|
2
2
directly and then in terms of the Fourier Coecients

f of f.
Exercise 15.6 (Smoothness implies decay). Suppose m N and f
C
m
p
(1) the 2 periodic functions on 1 such that f
(l)
exists and is con-
tinuous for 0 l m.
1. Use integration by parts to show
f
(l)
(k) = f
(l)
[
k
= (ik)
l

f(k) for all k Z.
Note: This equality implies
f(k)

1
k
l
_
_
_f
(l)
_
_
_
H
1
k
l
_
_
_f
(l)
_
_
_
for k ,= 0.
2. Show for f C
1
p
(1) that
kZ\0]
f (k)
kZ\0]
1
[k[
f
t
(k)
3
|f
t
|
H
< . (15.1)
Hint: use the CauchySchwarz inequality for sums, see Theorem 6.16 or
Theorem 14.2 applied with a[b =
n=1
a
n
b
n
for sequence a and b.
Exercise 15.7. Suppose f C
per
([, ]) is a function such that
nZ
f (n)
< and set

g (x) :=
kZ
f(k)e
ikx
(pointwise).
1. Show g C
per
(1).
2. Show g (x) = f (x) for all x. Hint: Show g(k) =

f(k) and apply Exercise
15.4.
Corollary 15.9 (Pointwise Convergence). If f C
1
p
(1) then
kZ

f (k) e
ik
converges pointwise to f () for all 1. Moreover the
sum is absolutely and uniformly convergent.
Proof. This result follows directly from Exercises 15.6 and 15.7.
Exercise 15.8 (Decay implies smoothness). Suppose s 1 and
c
k
C : k Z are coecients such that
kZ
[c
k
[
2
(1 +[k[
2
)
s
< .
Show if s >
1
2
+m, the function f dened by
f (x) =
kZ
c
k
e
ikx
is in C
m
per
(1). Hint: First show
kZ
[c
k
[ [k[
l
< for all l m.
[This is perhaps the simplest version of a Sobolev Imbedding Theorem.]
Exercise 15.9 (Poisson Summation Formula). Let F : 1 C be a con-
tinuous function such that there exists C < and s > 1 such that
[F (x)[ C(1 +[x[)
s
for all x 1.
Also dene the Fourier transform,

F : 1 C, of F by,
F() :=
1
2
_
R
F (x) e
ix
dx = lim
M
1
2
_
M
M
F (x) e
ix
dx.
We further assume that
kZ
F (k)
< . [This can be achieved by assuming

F is twice continuously dierentiable with the derivatives decaying fast enough
so that the resulting functions are integrable.]
1. Show
kZ
F(x + 2k) is absolutely convergent for all x 1 and the
function,
f (x) :=
kZ
F(x + 2k)
is in C
p
(1, C) .
2. Show

f (k) =

F (k) for all k Z.
15.3 Dirichlet, Fejer and Kernels 171
3. Prove the Poisson Summation Formula;
kZ
F(x + 2k) = f (x) =
kZ
F(k)e
ikx
for all x 1 (15.2)
which at x = 0 implies
kZ
F(2k) =
kZ
F(k). (15.3)
Exercise 15.10 (Heat Equation 1.). Let (t, x) [0, ) 1 u(t, x) be a
continuous function such that u(t, ) C
per
(1) for all t 0, u := u
t
, u
x
, and
u
xx
exists and are continuous when t > 0. Further assume that u satises the
heat equation u =
1
2
u
xx
. Let u(t, k) := u(t, )[
k
for k Z. Show for t > 0
and k Z that u(t, k) is dierentiable in t and
d
dt
u(t, k) = k
2
u(t, k)/2. Use
this result to show
u(t, x) =
kZ
e
t
2
k
2
f(k)e
ikx
(15.4)
where f (x) := u(0, x) and as above
f(k) = f[
k
=
_

f(y)e
iky
dy.
Notice from Eq. (15.4) that (t, x) u(t, x) is C
for t > 0.
Exercise 15.11 (Heat Equation 2.). Let q
t
(x) :=
1
2
kZ
e
t
2
k
2
e
ikx
. Show
that Eq. (15.4) may be rewritten as
u(t, x) =
_

q
t
(x y)f(y)dy
and
q
t
(x) =
kZ
p
t
(x +k2)
where p
t
(x) :=
1
2t
e
1
2t
x
2
. Also show u(t, x) may be written as
u(t, x) = p
t
f (x) :=
_
R
p
t
(x y)f(y)dy.
Hint: To show q
t
(x) =
kZ
p
t
(x +k2), use the Poisson summation formula
(Exercise 15.9) and the Gaussian integration identity,
p
t
() =
1
2
_
R
p
t
(x) e
ix
dx =
1
2
e
t
2
2
. (15.5)
Equation (15.5) will be discussed later in the notes.
Exercise 15.12 (Wave Equation). Let u C
2
(11) be such that u(t, )
C
per
(1) for all t 1. Further assume that u solves the wave equation, u
tt
= u
xx
.
Let f (x) := u(0, x) and g (x) = u(0, x). Show u(t, k) := u(t, ),
k
for k Z
is twice continuously dierentiable in t and
d
2
dt
2
u(t, k) = k
2
u(t, k). Use this
result to show
u(t, x) =
kZ
_
f(k) cos(kt) + g(k)

sin kt
k
_
e
ikx
(15.6)
with the sum converging absolutely. Also show that u(t, x) may be written as
u(t, x) =
1
2
[f(x +t) +f(x t)] +
1
2
_
t
t
g(x +)d. (15.7)
Hint: To show Eq. (15.6) implies (15.7) use
cos kt =
e
ikt
+e
ikt
2
,
sin kt =
e
ikt
e
ikt
2i
, and
e
ik(x+t)
e
ik(xt)
ik
=
_
t
t
e
ik(x+)
d.
15.3 Dirichlet, Fejer and Kernels
For a Riemann integrable function, f : [, ] C, the sum
f =
kZ
f[
k
k
=
f(k)
k
(15.8)
is guaranteed to converge relative to the Hilbertian norm on H. It turns out
that it need not converge pointwise even if f C
per
(1) . Nevertheless as we
saw in Corollary 15.9 if f C
1
p
(1, C), then the sum in Eq. (15.8) will converge
pointwise. We are going to discuss in pointwise convergence in a more detail
here. In the process we will give a direct and constructive proof of the result in
Exercise 13.5, see Theorem 15.11 below.
For n N we will be considering,
f
n
() =
]k]n
f(k)
k
() =
]k]n
1
2
_
_
[,]
f(x)e
ikx
dx
_
k
()
=
1
2
_
[,]
f(x)
]k]n
e
ik(x)
dx
=
1
2
_
[,]
f(x)D
n
( x)dx (15.9)
where
D
n
() :=
n
k=n
e
ik
is called the Dirichlet kernel. Letting = e
i/2
, we have
D
n
() =
n
k=n
2k
=

2(n+1)
2n
2
1
=

2n+1
(2n+1)

1
=
2i sin(n +
1
2
)
2i sin
1
2
=
sin(n +
1
2
)
sin
1
2
.
and therefore
D
n
() :=
n
k=n
e
ik
=
sin(n +
1
2
)
sin
1
2
, (15.10)
see Figure 15.3.
This is a plot D
1
and D
10
.
with the understanding that the right side of this equation is 2n + 1 whenever
2Z.
Theorem 15.10. Suppose f C
p
(1, C) (or just Riemann integrable) and f is
dierentiable at some [, ] , then lim
n
f
n
() = f () where f
n
is as
in Eq. (15.9).
Proof. Observe that
1
2
_
[,]
D
n
( x)dx =
1
2
_
[,]
]k]n
e
ik(x)
dx = 1
and therefore,
f
n
() f () =
1
2
_
[,]
[f(x) f ()] D
n
( x)dx
=
1
2
_
[,]
[f(x) f ( x)] D
n
(x)dx
=
1
2
_
[,]
_
f( x) f ()
sin
1
2
x
_
sin(n +
1
2
)x dx
=
1
2
_
[,]
g (x) sin(n +
1
2
)x dx (15.11)
where for x [, ] ,
g (x) =
_
f(x)f()
sin
1
2
x
if x ,= 0
2f
t
() if x = 0
.
The function g is continuous (Riemann integrable) on [, ] and
1
2
_
[,]
g (x) sin(n +
1
2
)x dx
=
1
4i
_
[,]
g (x) e
ix/2
e
ixn
dx
1
4i
_
[,]
g (x) e
ix/2
e
ixn
dx
=
1
2i
_
_
x g (x) e
ix/2
_
(n)
_
x g (x) e
ix/2
_
(n)
_
.
The last term goes to zero as n since, by Bessels equality,
nZ
_
x g (x) e
ix/2
_
(n)
2
=
1
2
_

[g (x)[
2
dx < .
Despite the Dirichlet kernel not being positive, it still satises the approx-
imate sequence property,
1
2
D
n

0
as n , when acting on C
1
periodic functions in . In order to improve the convergence properties it is

reasonable to try to replace f
n
: n N
0
by the sequence of averages,
F
N
() =
1
N + 1
N
n=0
f
n
() =
1
N + 1
N
n=0
1
2
_
[,]
f(x)
]k]n
e
ik(x)
dx
=
1
2
_
[,]
K
N
( x)f(x)dx
where
K
N
() :=
1
N + 1
N
n=0
]k]n
e
ik
(15.12)
is the Fejer kernel.
15.3 Dirichlet, Fejer and Kernels 173
Theorem 15.11. The Fejer kernel K
N
in Eq. (15.12) satises:
1.
K
N
() =
N
n=N
_
1
[n[
N + 1
_
e
in
(15.13)
=
1
N + 1
sin
2
_
N+1
2

_
sin
2
_
2
_ . (15.14)
2. K
N
() 0.
3.
1
2
_
K
N
()d = 1
4. sup
]]
K
N
() 0 as N for all > 0, see Figure 15.3.
5. For any continuous 2 periodic function f on 1, K
N
f() f()
uniformly in as N , where
K
N
f() =
1
2
_

K
N
( )f()d
=
N
n=N
_
1
[n[
N + 1
_

f (n) e
in
. (15.15)
Fig. 15.3. Plots of KN() for N = 2, 7 and 13.
Proof. 1. Equation (15.13) is a consequence of the identity,
N
n=0
]k]n
e
ik
=
]k]nN
e
ik
=
]k]N
(N + 1 [k[) e
ik
.
Moreover, letting = e
i/2
and using Eq. (??) shows
K
N
() =
1
N + 1
N
n=0
]k]n
2k
=
1
N + 1
N
n=0
2n+2
2n
2
1
=
1
(N + 1) (
1
)
N
n=0
_
2n+1
2n1
=
1
(N + 1) (
1
)
N
n=0
_
2n
2n
=
1
(N + 1) (
1
)
_
2N+2
1
2
1

1
2N2
1
2
1
_
=
1
(N + 1) (
1
)
2
_
2(N+1)
1 +
2(N+1)
1
_
=
1
(N + 1) (
1
)
2
_
(N+1)
(N+1)
_
2
=
1
N + 1
sin
2
((N + 1) /2)
sin
2
(/2)
.
Items 2. and 3. follow easily from Eqs. (15.14) and (15.13) respectively. Item
4. is a consequence of the elementary estimate;
sup
]]
K
N
()
1
N + 1
1
sin
2
_
2
_
and is clearly indicated in Figure 15.3. Item 5. now follows by the standard
approximate function arguments, namely,
[K
N
f() f ()[ =
1
2
K
N
( ) [f() f ()] d
1
2
_

K
N
() [f( ) f ()[ d
1
N + 1
1
sin
2
_
2
_ |f|
+
1
2
_
]]
K
N
() [f( ) f ()[ d
1
N + 1
1
sin
2
_
2
_ |f|
+ sup
]]
[f( ) f ()[ .
Therefore,
lim sup
N
|K
N
f f|
sup
sup
]]
[f( ) f ()[ 0 as 0.
15.4 The Dirichlet Problems and the Poisson Kernel
Let D := z C : [z[ < 1 be the open unit disk in C

= 1
2
, write z C as
z = x + iy or z = re
i
, and let =

2
x
2
+

2
y
2
be the Laplacian acting on
C
2
(D) .
Theorem 15.12 (Dirichlet problem for D). To every continuous function
g C (bd(D)) there exists a unique function u C(

D) C
2
(D) solving
u(z) = 0 for z D and u[
D
= g. (15.16)
Moreover for r < 1, u is given by,
u(re
i
) =
1
2
_

P
r
( )u(e
i
)d =: P
r
u(e
i
) (15.17)
=
1
2
Re
_

1 +re
i()
1 re
i()
u(e
i
)d (15.18)
where P
r
is the Poisson kernel dened by
P
r
() :=
1 r
2
1 2r cos +r
2
.
(The problem posed in Eq. (15.16) is called the Dirichlet problem for D.)
Proof. In this proof, we are going to be identifying S
1
= bd(D) :=
_
z

D : [z[ = 1
_
with [, ]/ ( ) by the map [, ] e
i
S
1
.
Also recall that the Laplacian may be expressed in polar coordinates as,
u = r
1
r
_
r
1
r
u
_
+
1
r
2
u,
where
(
r
u)
_
re
i
_
=

r
u
_
re
i
_
and (
u)
_
re
i
_
=

u
_
re
i
_
.
Uniqueness. Suppose u is a solution to Eq. (15.16) and let
g(k) :=
1
2
_

g(e
ik
)e
ik
d
and
u(r, k) :=
1
2
_

u(re
i
)e
ik
d (15.19)
be the Fourier coecients of g () and u
_
re
i
_
respectively. Then for
r (0, 1) ,
r
1
r
(r
r
u(r, k)) =
1
2
_

r
1
r
_
r
1
r
u
_
(re
i
)e
ik
d
=
1
2
_

1
r
2
u(re
i
)e
ik
d
=
1
r
2
1
2
_

u(re
i
)
2
e
ik
d
=
1
r
2
k
2
u(r, k)
or equivalently
r
r
(r
r
u(r, k)) = k
2
u(r, k). (15.20)
Recall the general solution to
r
r
(r
r
y(r)) = k
2
y(r) (15.21)
may be found by trying solutions of the form y(r) = r
which then implies
2
= k
2
or = k. From this one sees that u(r, k) solving Eq. (15.20) may
be written as u(r, k) = A
k
r
]k]
+ B
k
r
]k]
for some constants A
k
and B
k
when
k ,= 0. If k = 0, the solution to Eq. (15.21) is gotten by simple integration and
the result is u(r, 0) = A
0
+B
0
ln r. Since u(r, k) is bounded near the origin for
each k it must be that B
k
= 0 for all k Z. Hence we have shown there exists
A
k
C such that, for all r (0, 1),
A
k
r
]k]
= u(r, k) =
1
2
_

u(re
i
)e
ik
d. (15.22)
Since all terms of this equation are continuous for r [0, 1], Eq. (15.22) remains
valid for all r [0, 1] and in particular we have, at r = 1, that
A
k
=
1
2
_

u(e
i
)e
ik
d = g(k).
Hence if u is a solution to Eq. (15.16) then u must be given by
u(re
i
) =
kZ
g(k)r
]k]
e
ik
for r < 1. (15.23)
or equivalently,
u(z) =
kN0
g(k)z
k
+
kN
g(k) z
k
.
Notice that the theory of the Fourier series implies Eq. (15.23) is valid in the
L
2
(d) - sense. However more is true, since for r < 1, the series in Eq. (15.23)
is absolutely convergent and in fact denes a C
function (see Exercise ??

15.4 The Dirichlet Problems and the Poisson Kernel 175
or Corollary ??) which must agree with the continuous function, u
_
re
i
_
,
for almost every and hence for all . This completes the proof of uniqueness.
Existence. Given g C (bd(D)) , let u be dened as in Eq. (15.23). Then,
again by Exercise ?? or Corollary ??, u C
(D) . So to nish the proof it

suces to show lim
xy
u(x) = g (y) for all y bd(D). Inserting the formula
for g(k) into Eq. (15.23) gives
u(re
i
) =
1
2
_

P
r
( ) u(e
i
)d for all r < 1
where
P
r
() =
kZ
r
]k]
e
ik
=
k=0
r
k
e
ik
+
k=0
r
k
e
ik
1 =
= Re
_
2
1
1 re
i
1
_
= Re
_
1 +re
i
1 re
i
_
= Re
_
_
1 +re
i
_ _
1 re
i
_
[1 re
i
[
2
_
= Re
_
1 r
2
+ 2ir sin
1 2r cos +r
2
_
(15.24)
=
1 r
2
1 2r cos +r
2
.
The Poisson kernel again solves the usual approximate function proper-
ties (see Figure 2), namely:
1. P
r
() > 0 and
1
2
_

P
r
( ) d =
1
2
_

kZ
r
]k]
e
ik()
d
=
1
2
kZ
r
]k]
_

e
ik()
d = 1
and
2.
sup
]]
P
r
()
1 r
2
1 2r cos +r
2
0 as r 1.
A plot of P
r
() for r = 0.2, 0.5 and 0.7.
Therefore by the same argument used in the proof of Theorem 15.11,
lim
r1
sup
u
_
re
i
_
g
_
e
i
_
= lim
r1
sup
(P
r
g)
_
e
i
_
g
_
e
i
_
= 0
which certainly implies lim
xy
u(x) = g (y) for all y bd(D).
Remark 15.13 (Harmonic Conjugate). Writing z = re
i
, Eq. (15.18) may be
rewritten as
u(z) =
1
2
Re
_

1 +ze
i
1 ze
i
u(e
i
)d
which shows u = Re F where
F(z) :=
1
2
_

1 +ze
i
1 ze
i
u(e
i
)d.
Moreover it follows from Eq. (15.24) that
ImF(re
i
) =
1
Im
_

r sin( )
1 2r cos( ) +r
2
g(e
i
)d
=: (Q
r
u) (e
i
)
where
Q
r
() :=
r sin()
1 2r cos() +r
2
.
From these remarks it follows that v =: (Q
r
g) (e
i
) is the harmonic conjugate
of u and

P
r
= Q
r
. For more on this point see Section ?? below.
16
Arzela-Ascoli Theorem (Compactness in C (X))
In this section, let (X, d) be a metric space, 2
X
be the collection of
open sets, and for x X let
x
= V : x V be the collection of open
neighborhoods of x.
Denition 16.1. Suppose T C (X) , i.e. T is a collection of continuous
functions.
1. T is equicontinuous at x X i for all > 0 there exists U
x
such
that [f(y) f(x)[ < for all y U and f T.
2. T is equicontinuous if T is equicontinuous at all points x X.
3. T is pointwise bounded if sup[f(x)[ : f T < for all x X.
Theorem 16.2 (Ascoli-Arzela Theorem). Let (X, ) be a compact metric
space and T C (X) . Then T is precompact in C (X) i T is equicontinuous
and point-wise bounded.
Proof. () Let m N and use the equicontinuity of T to conclude for all
x X, there exists V
x

x
such that [f(y) f(x)[ <
1
m
if y V
x
and f T.
Since X is compact we may choose
m

f
X such that X =
xm
V
x
. Let
x
k
: k N be an enumeration of the set, :=
m=1
m
.
Because of the pointwise boundedness assumption, we know
f
m
(x
k
) : m N has a convergent subsequence for every k N. By the
Cantor diagonalization technique, we may nd a subsequence g
l
= f
ml
of
f
m
: m N such that lim
l
g
l
(x) exists for all x .
The proof will be nished by showing g
l
: l N is in fact a uniformly
convergent sequence. To prove this let y X and then choose x
m
such
that y V
x
, with V
x
as above. Then
[g
l
(y) g
k
(y)[ [g
l
(y) g
l
(x)[ +[g
l
(x) g
k
(x)[ +[g
k
(x) g
k
(y)[
2
m
+[g
l
(x) g
k
(x)[
2
m
+ sup
xm
[g
l
(x) g
k
(x)[ .
Therefore,
lim sup
l,k
sup
yX
[g
l
(y) g
k
(y)[
2
m
+ lim sup
l,k
sup
yX
sup
xm
[g
l
(x) g
k
(x)[
=
2
m
0 as m .
() (*The rest of this proof may safely be skipped.) Let me rst give the
argument under the added restriction that =
d
for some metric, d, and X.
Since ||
: C (X) [0, ) is a continuous function on C (X) it is bounded

on any compact subset T C (X) . This shows that sup |f|
: f T <
which clearly implies that T is pointwise bounded.
1
Suppose T were not
equicontinuous at some point x X, i.e. there exists > 0 such that for all
V
x
, sup
yV
sup
fJ
[f(y)f(x)[ > . Let V
n
= B
x
(1/n)
n=1
. By the assumption
that T is not equicontinuous at x, there exist f
n
T and x
n
V
n
such that
[f
n
(x) f
n
(x
n
)[ for all n. Since T is a compact metric space by passing to
a subsequence if necessary we may assume that f
n
converges uniformly to some
f T. Because x
n
x as n we learn that
[f
n
(x) f
n
(x
n
)[ [f
n
(x) f(x)[ +[f(x) f(x
n
)[ + [f(x
n
) f
n
(x
n
)[
2 |f
n
f|
+[f(x) f(x
n
)[ 0 as n
Exercise 16.1. Suppose k C
_
[0, 1]
2
, 1
_
and for f C ([0, 1] , 1) , let
Kf (x) :=
_
1
0
k (x, y) f (y) dy for all x [0, 1] . (16.1)
Show K is a compact operator on (C ([0, 1] , 1) , ||
) which by denition
means that Kf : |f|
1 is compact in C ([0, 1] , 1) .
Exercise 16.2 (Continuation of Exercise 16.1). Suppose k
C
_
[0, 1]
2
, 1
_
but now show that one can use the formula for K in Eq.
(16.1) in order to dene a linear operator from L
2
([0, 1] , m; 1) to C ([0, 1] , 1)
and that this operator is still compact, i.e. Kf : |f|
L
2 1 is compact in
C ([0, 1] , 1) . Conclude that K is also a compact operator when viewed as an
operator from L
2
([0, 1] , m; 1) to L
2
([0, 1] , m; 1) .
1
One could also prove that F is pointwise bounded by considering the continuous
evaluation maps ex : C(X) 1 given by ex(f) = f(x) for all x X.
178 16 Arzela-Ascoli Theorem (Compactness in C (X))
The following result is a corollary of Theorem 16.2.
Corollary 16.3 (Locally Compact Ascoli-Arzela Theorem). Let (X, d)
be a metric space with the Heine Borel property, i.e. closed and bounded sets
are compact. If f
m
C (X) is a pointwise bounded sequence of functions such
that f
m
[
K
is equicontinuous for any compact subset K X, then there exists
a subsequence m
n
m such that g
n
:= f
mn
n=1
C (X) is uniformly
convergent on all compact subsets of X.
Proof. Let x X and K
n
= C
x
(n)
n=1
, so thatK
n
is a nested sequence
of compact sets such that K
n
X as n . We may now apply Theorem 16.2
repeatedly to nd a nested family of subsequences
f
m
g
1
m
g
2
m
g
3
m
. . .
such that the sequence g
n
m
m=1
C (X) is uniformly convergent on K
n
. Using
Cantors trick, dene the subsequence h
n
of f
m
by h
n
:= g
n
n
. Then h
n
is uniformly convergent on K
l
for each l N. Now if K X is an arbitrary
compact set, there exists l < such that K K
o
l
K
l
and therefore h
n
is
uniformly convergent on K as well.
16.0.1 Arzela-Ascoli Theorem Problems
Exercise 16.3. Let (X, ) be a compact metric space and T := f
n
n=1

C (X) is a sequence of functions which are equicontinuous and pointwise conver-
gent. Show f (x) := lim
n
f
n
(x) is continuous and that lim
n
|f f
n
|
=
0, i.e. f
n
f uniformly as n .
Exercise 16.4. Let T (0, ) and T C([0, T], 1) be a family of functions
such that:
1.

f(t) exists for all t (0, T) and f T,
2. sup
fJ
[f(0)[ < , and
3. M := sup
fJ
sup
t(0,T)

f(t)
< .
Show T is precompact in the Banach space C([0, T]) equipped with the
norm |f|
= sup
t[0,T]
[f(t)[ .
Exercise 16.5 (Peanos Existence Theorem). Suppose Z : 11
d
1
d
is
a bounded continuous function. Then for each T <
2
there exists a solution
to the dierential equation
x(t) = Z(t, x(t)) for T < t < T with x(0) = x
0
. (16.2)
Do this by lling in the following outline for the proof.
2
Using Corollary 16.3, we may in fact allow T = .
1. Given > 0, show there exists a unique function x
C([, ) 1
d
)
such that x
(t) := x
0
for t 0 and
x
(t) = x
0
+
_
t
0
Z(, x
( ))d for all t 0. (16.3)

Here
_
t
0
Z(, x
( ))d =
__
t
0
Z
1
(, x
( ))d, . . . ,
_
t
0
Z
d
(, x
( ))d
_
where Z = (Z
1
, . . . , Z
d
) and the integrals are either the Lebesgue or the
Riemann integral since they are equal on continuous functions. Hint: For
t [0, ], it follows from Eq. (16.3) that
x
(t) = x
0
+
_
t
0
Z(, x
0
)d.
Now that x
(t) is known for t [, ] it can be found by integration for

t [, 2]. The process can be repeated.
2. Then use Exercise 16.4 to show there exists
k
k=1
(0, ) such that
lim
k
k
= 0 and x
k
converges to some x C([0, T]) with respect to the
sup-norm: |x|
= sup
t[0,T]
[x(t)[). Also show for this sequence that
lim
k
sup
kT
[x
k
(
k
) x()[ = 0.
3. Pass to the limit (with justication) in Eq. (16.3) with replaced by
k
to show x satises
x(t) = x
0
+
_
t
0
Z(, x())d t [0, T].
4. Conclude from this that x(t) exists for t (0, T) and that x solves Eq.
(16.2).
5. Apply what you have just proved to the ODE,
y(t) = Z(t, y(t)) for 0 t < T with y(0) = x
0
.
Then extend x(t) above to (T, T) by setting x(t) = y(t) if t (T, 0].
Show x so dened solves Eq. (16.2) for t (T, T).
Part V
Appendices
A
Appendix: Notation and Logic
The following abbreviations along with their negations are used throughout
these notes.
Symbol Meaning Negation
for all
there exits
, or space then
such that , or space
a.a. almost always i.o.
i.o. innitely often a.a.
= equals ,=
,= not equals =
less than or equal >
> greater than
Here are some examples.
1. a
n
= b
n
i.o. n #(n : a
n
= b
n
) = . The negation of
#(n : a
n
= b
n
) = is
#(n : a
n
= b
n
) < a
n
,= b
n
for a.a. n.
2. lim
n
a
n
= L is by denition the statement;
> 0 N N n N, [L a
n
[ .
This may also be written as
> 0, [L a
n
[ for a.a. n.
3. The negation of the previous statement is lim
n
a
n
,= L which translates
to
> 0 N N, n N [L a
n
[ > .
This last statement is also equivalent to;
> 0 [L a
n
[ > i.o. n.
It is sometimes useful to reformulate this last statement as; there exists > 0
and an increasing function N k n
k
N such that
[L a
nk
[ > for all k N.
B
Appendix: More Set Theoretic Properties (highly optional)
B.1 Appendix: Zorns Lemma and the Hausdor Maximal
Principle (optional)
Denition B.1. A partial order on X is a relation with following properties;
1. If x y and y z then x z.
2. If x y and y x then x = y.
3. x x for all x X.
Example B.2. Let Y be a set and X = 2
Y
. There are two natural partial orders
on X.
1. Ordered by inclusion, A B is A B and
2. Ordered by reverse inclusion, A B if B A.
Denition B.3. Let (X, ) be a partially ordered set we say X is linearly or
totally ordered if for all x, y X either x y or y x. The real numbers 1
with the usual order is a typical example.
Denition B.4. Let (X, ) be a partial ordered set. We say x X is a max-
imal element if for all y X such that y x implies y = x, i.e. there is no
element larger than x. An upper bound for a subset E of X is an element
x X such that x y for all y E.
Example B.5. Let
X =
_
a = 1 b = 1, 2 c = 3 d = 2, 4 e = 2
_
ordered by set inclusion. Then b and d are maximal elements despite that fact
that b _ d and d _ b. We also have;
1. If E = a, c, e, then E has no upper bound.
2. If E = a, e, then b is an upper bound.
3. If E = e, then b and d are upper bounds.
Theorem B.6. The following are equivalent.
1. The axiom of choice: to each collection, X
A
, of non-empty sets
there exists a choice function, x : A

A
X
such that x() X
for
all A, i.e.
A
X
,= .
2. The Hausdor Maximal Principle: Every partially ordered set has a
maximal (relative to the inclusion order) linearly ordered subset.
3. Zorns Lemma: If X is partially ordered set such that every linearly or-
dered subset of X has an upper bound, then X has a maximal element.
1
Proof. (2 3) Let X be a partially ordered subset as in 3 and let T =
E X : E is linearly ordered which we equip with the inclusion partial
ordering. By 2. there exist a maximal element E T. By assumption, the
linearly ordered set E has an upper bound x X. The element x is maximal,
for if y Y and y x, then E y is still an linearly ordered set containing
E. So by maximality of E, E = Ey , i.e. y E and therefore y x showing
which combined with y x implies that y = x.
2
(3 1) Let X
A
be a collection of non-empty sets, we must show
A
X
is not empty. Let ( denote the collection of functions g : D(g)
A
X
such that D(g) is a subset of A, and for all D(g), g() X
.
Notice that ( is not empty, for we may let
0
A and x
0
X
and then set

D(g) =
0
and g(
0
) = x
0
to construct an element of (. We now put a partial
order on ( as follows. We say that f g for f, g ( provided that D(f) D(g)
and f = g[
D(f)
. If ( is a linearly ordered set, let D(h) =
g
D(g) and for
D(g) let h() = g(). Then h ( is an upper bound for . So by Zorns
1
If X is a countable set we may prove Zorns Lemma by induction. Let {xn}
n=1
be an enumeration of X, and dene En X inductively as follows. For n = 1 let
E1 = {x1}, and if En have been chosen, let En+1 = En {xn+1} if xn+1 is an upper
bound for En otherwise let En+1 = En. The set E =
n=1
En is a linearly ordered
(you check) subset of X and hence by assumption E has an upper bound, x X. I
claim that his element is maximal, for if there exists y = xm X such that y x,
then xm would be an upper bound for Em1 and therefore y = xm Em E.
That is to say if y x, then y E and hence y x, so y = x. (Hence we may view
Zorns lemma as a jazzed up version of induction.)
2
Similarly one may show that 3 2. Let F = {E X : E is linearly ordered}
and order F by inclusion. If M F is linearly ordered, let E = M =
AM
A. If
x, y E then x A and y B for some A, B M. Now M is linearly ordered
by set inclusion so A B or B A i.e. x, y A or x, y B. Since A and B
are linearly order we must have either x y or y x, that is to say E is linearly
ordered. Hence by 3. there exists a maximal element E F which is the assertion
in 2.
184 B Appendix: More Set Theoretic Properties (highly optional)
Lemma there exists a maximal element h (. To nish the proof we need only
show that D(h) = A. If this were not the case, then let
0
A D(h) and
x
0
X
0
. We may now dene D(
h) = D(h)
0
and
h() =
_
h() if D(h)
x
0
if =
0
.
Then h

h while h ,=

h violating the fact that h was a maximal element.
(1 2) Let (X, ) be a partially ordered set. Let T be the collection of
linearly ordered subsets of X which we order by set inclusion. Given x
0
X,
x
0
T is linearly ordered set so that T , = . Fix an element P
0
T. If
P
0
is not maximal there exists P
1
T such that P
0
_ P
1
. In particular we
may choose x / P
0
such that P
0
x T. The idea now is to keep repeating
this process of adding points x X until we construct a maximal element P
of T. We now have to take care of some details. We may assume with out
loss of generality that

T = P T : P is not maximal is a non-empty set.
For P

T, let P
= x X : P x T . As the above argument shows,

P
,= for all P

T. Using the axiom of choice, there exists f

P
J
P
.
We now dene g : T T by
g(P) =
_
P if P is maximal
P f(x) if P is not maximal.
(B.1)
The proof is completed by Lemma B.7 below which shows that g must have a
xed point P T. This xed point is maximal by construction of g.
Lemma B.7. The function g : T T dened in Eq. (B.1) has a xed point.
3
Proof. The idea of the proof is as follows. Let P
0
T be chosen arbi-
trarily. Notice that =
_
g
(n)
(P
0
)
_
n=0
T is a linearly ordered set and it is
therefore easily veried that P
1
=
n=0
g
(n)
(P
0
) T. Similarly we may repeat
the process to construct P
2
=
n=0
g
(n)
(P
1
) T and P
3
=
n=0
g
(n)
(P
2
) T, etc.
etc. Then take P
n=0
P
n
and start again with P
0
replaced by P
. Then
keep going this way until eventually the sets stop increasing in size, in which
case we have found our xed point. The problem with this strategy is that we
may never win. (This is very reminiscent of constructing measurable sets and
the way out is to use measure theoretic like arguments.)
Let us now start the formal proof. Again let P
0
T and let T
1
= P
T : P
0
P. Notice that T
1
has the following properties:
3
Here is an easy proof if the elements of F happened to all be nite sets and there
existed a set P F with a maximal number of elements. In this case the condition
that P g(P) would imply that P = g(P), otherwise g(P) would have more
elements than P.
1. P
0
T
1
.
2. If T
1
is a totally ordered (by set inclusion) subset then T
1
.
3. If P T
1
then g(P) T
1
.
Let us call a general subset T
t
T satisfying these three conditions a tower
and let
T
0
= T
t
: T
t
is a tower .
Standard arguments show that T
0
is still a tower and clearly is the smallest
tower containing P
0
. (Morally speaking T
0
consists of all of the sets we were
trying to constructed in the idea section of the proof.) We now claim that
T
0
is a linearly ordered subset of T. To prove this let T
0
be the linearly
ordered set
= C T
0
: for all A T
0
either A C or C A .
Shortly we will show that T
0
is a tower and hence that T
0
= . That is
to say T
0
is linearly ordered. Assuming this for the moment let us nish the
proof.
Let P T
0
which is in T
0
by property 2 and is clearly the largest element
in T
0
. By 3. it now follows that P g(P) T
0
and by maximality of P, we
have g(P) = P, the desired xed point. So to nish the proof, we must show
that is a tower. First o it is clear that P
0
so in particular is not
empty. For each C let
C
:= A T
0
: either A C or g(C) A .
We will begin by showing that
C
T
0
is a tower and therefore that
C
= T
0
.
1. P
0

C
since P
0
C for all C T
0
. 2. If
C
T
0
is totally
ordered by set inclusion, then A
:= T
0
. We must show A

C
, that
is that A
C or C A
. Now if A C for all A , then A
C and
hence A

C
. On the other hand if there is some A such that g(C) A
then clearly g(C) A
and again A

C
. 3. Given A
C
we must show
g(A)
C
, i.e. that
g(A) C or g(C) g(A). (B.2)
There are three cases to consider: either A _ C, A = C, or g(C) A. In the
case A = C, g(C) = g(A) g(A) and if g(C) A then g(C) A g(A) and
Eq. (B.2) holds in either of these cases. So assume that A _ C. Since C ,
either g(A) C (in which case we are done) or C g(A). Hence we may
assume that
A _ C g(A).
Now if C were a proper subset of g(A) it would then follow that g(A) A would
consist of at least two points which contradicts the denition of g. Hence we
B.2 Cardinality 185
must have g(A) = C C and again Eq. (B.2) holds, so
C
is a tower. It is now
easy to show is a tower. It is again clear that P
0
and Property 2. may be
checked for in the same way as it was done for
C
above. For Property 3., if
C we may use
C
= T
0
to conclude for all A T
0
, either A C g(C)
or g(C) A, i.e. g(C) . Thus is a tower and we are done.
Here is an example of using Zorns lemma.
Proposition B.8. Suppose that X and Y are non-empty sets, then either there
exists an injective function, f : X Y, or an injective function g : Y X. In
other words, either card (X) card (Y ) or card (Y ) card (X) .
Proof. Let T be the collection of injective functions, u : T(u) Y where
T(u) is a non-empty subset of X. We say that u v for u, v T provided
T(u) T(v) and u = v[
1(u)
. One now checks that (T, ) is a partially ordered
set such that every linearly ordered subset of T has an upper bound. Therefore,
by an application of Zorns lemma, T has a maximal element, U.
If T(U) = X, we take f = U and we have constructed an injective map from
X to Y. If T(U) ,= X, then Ran (U) := U (T(U)) = Y. [Indeed, if not we could
nd and x X T(U) and y Y Ran (U) and then extend U to T(U) x
by setting U (x) = y. The extended U is still injective and hence would violate
the maximallity of U.] In this case we take g := U
1
: Y T(U) X.
B.2 Cardinality
In mathematics, the essence of counting a set and nding a result n, is that it
establishes a one to one correspondence (or bijection) of the set with the set
of numbers 1, 2, . . . , n. A fundamental fact, which can be proved by math-
ematical induction, is that no bijection can exist between 1, 2, . . . , n and
1, 2, . . . , m unless n = m; this fact (together with the fact that two bijec-
tions can be composed to give another bijection) ensures that counting the
same set in dierent ways can never result in dierent numbers (unless an error
is made). This is the fundamental mathematical theorem that gives counting its
purpose; however you count a (nite) set, the answer is the same. In a broader
context, the theorem is an example of a theorem in the mathematical eld of
(nite) combinatoricshence (nite) combinatorics is sometimes referred to as
the mathematics of counting.
Many sets that arise in mathematics do not allow a bijection to be estab-
lished with 1, 2, . . . , n for any natural number n; these are called innite sets,
while those sets for which such a bijection does exist (for some n) are called
nite sets. Innite sets cannot be counted in the usual sense; for one thing, the
mathematical theorems which underlie this usual sense for nite sets are false
for innite sets. Furthermore, dierent denitions of the concepts in terms of
which these theorems are stated, while equivalent for nite sets, are inequivalent
in the context of innite sets.
The notion of counting may be extended to them in the sense of establishing
(the existence of) a bijection with some well understood set. For instance, if a set
can be brought into bijection with the set of all natural numbers, then it is called
countably innite. This kind of counting diers in a fundamental way from
counting of nite sets, in that adding new elements to a set does not necessarily
increase its size, because the possibility of a bijection with the original set is not
excluded. For instance, the set of all integers (including negative numbers) can
be brought into bijection with the set of natural numbers, and even seemingly
much larger sets like that of all nite sequences of rational numbers are still
(only) countably innite. Nevertheless there are sets, such as the set of real
numbers, that can be shown to betoo large to admit a bijection with the
natural numbers, and these sets are called uncountable. Sets for which there
exists a bijection between them are said to have the same cardinality, and in
the most general sense counting a set can be taken to mean determining its
cardinality. Beyond the cardinalities given by each of the natural numbers,
there is an innite hierarchy of innite cardinalities, although only very few
such cardinalities occur in ordinary mathematics (that is, outside set theory
that explicitly studies possible cardinalities).
Counting, mostly of nite sets, has various applications in mathematics. One
important principle is that if two sets X and Y have the same nite number
of elements, and a function f : X Y is known to be injective, then it is also
surjective, and vice versa. A related fact is known as the pigeonhole principle,
which states that if two sets X and Y have nite numbers of elements n and
m with n > m, then any map f : X Y is not injective (so there exist two
distinct elements of X that f sends to the same element of Y ); this follows
from the former principle, since if f were injective, then so would its restriction
to a strict subset S of X with m elements, which restriction would then be
surjective, contradicting the fact that for x in X outside S, f(x) cannot be in
the image of the restriction. Similar counting arguments can prove the exis-
tence of certain objects without explicitly providing an example. In the case of
innite sets this can even apply in situations where it is impossible to give an
example; for instance there must exists real numbers that are not computable
numbers, because the latter set is only countably innite, but by denition a
non-computable number cannot be precisely specied.
The domain of enumerative combinatorics deals with computing the number
of elements of nite sets, without actually counting them; the latter usually
being impossible because innite families of nite sets are considered at once,
such as the set of permutations of 1, 2, . . . , n for any natural number n.
186 B Appendix: More Set Theoretic Properties (highly optional)
B.3 Formalities of Counting
Denition B.9. We say card (X) card (Y ) if there exists an injective map,
f : X Y and card (Y ) card (X) if there exists a surjective map g : Y X.
We say card (X) = card (Y ) if there exists a bijections, f : X Y.
Proposition B.10. We have card (X) card (Y ) i card (Y ) card (X) .
Proof. If f : X Y is an injective map, dene g : Y X by g[
f(X)
= f
1
and g[
Y \f(X)
= x
0
X chosen arbitrarily. Then g : Y X is surjective.
If g : Y X is a surjective map, then Y
x
:= g
1
(x) ,= for all x X
and so by the axiom of choice there exists f
xX
Y
x
. Thus f : X Y such
that f (x) Y
x
for all x. As the Y
x
xX
are pariwise disjoint, it follows that
f is injective.
Theorem B.11 (Schroder-Bernstein Theorem). If card (X) card (Y )
and card (Y ) card (X) , then card (X) = card (Y ) . Stated more explicitly; if
there exists injective maps f : X Y and g : Y X, then there exists a
bijective map, h : X Y.
Proof. Starting with an x X we may form the sequence of ancestors of
x, namely ancestor (x) := (x, y
1
, x
1
, y
2
, . . . ) where y
1
= g
1
(x) , x
1
= f
1
(y
1
) ,
y
2
= g
1
(x
1
) , . . . ., i.e.
x
g
1
y
1
f
1
x
1
g
1
y
2
f
1
. . .
We continue this process of inverse iterates as long as it is possible, i.e. we can
construct y
n+1
if x
n
g (Y ) and x
n+1
if y
n+1
f (X) . There are now three
possibilities;
1. ancestor (x) has innite length so the process never gets stuck in which case
we say x X
, read as start in X and end never get stuck.

2. ancestor (x) is nite and the last term in the sequence is in X, in which case
we say x X
X
(read as start in X and end in (get stuck in) X).
3. ancestor (x) is nite and the last term in the sequence is in Y, in which case
we say x X
Y
(read as start in X and end in (get stuck in) Y ).
In this way we partition X into three disjoint sets, X
, X
X
, and X
Y
. Sim-
ilarly we may partition Y into Y
, Y
Y
, and Y
X
. Let us now observe that,
1. f (X
) = Y
. Indeed if x X
then ancestor (f (x)) = (x, ancestor (x))

is an innite sequence, i.e. f (x) Y
. Moreover if y Y
, then
ancestor (y) = (y, ancestor (x)) where f (x) = y so that x X
and
y f (X
) . Thus we have shown f : X
is a bijection, i.e.
card (X
) = card (Y
) ,
2. f (X
X
) = Y
X
. Indeed if x X
X
then again ancestor (f (x)) =
(x, ancestor (x)) which ends in X so that f (x) Y
X
. Moreover if y Y
X
,
then ancestor (y) = (y, ancestor (x)) where f (x) = y so that ancestor (x)
ends in X, i.e. y f (X
X
) . Thus we have shown f : X
X
Y
X
is a bijection,
i.e. card (X
X
) = card (Y
X
) .
3. By the same argument as in item 2. it follow that g : Y
Y
X
Y
is a bijection,
i.e. card (X
Y
) = card (Y
Y
) .
The last three statement implies card (X) = card (Y ) . We may in fact dene
a bijection, h : X Y, by
h(x) =
_
f (x) if x X
X
X
g[
1
YY
(x) if x X
Y
.
Denition B.12. We say card (X) < card (Y ) if card (X) card (Y ) and
card (X) ,= card (Y ) , i.e. card (X) < card (Y ) if there exists an injective map,
f : X Y, but not bijective map exists. Similarly we say card (Y ) > card (X)
if card (Y ) card (X) and card (Y ) ,= card (X) , i.e. card (Y ) > card (X) if
there exists a surjective map g : Y X but no bijective map exists.
Proposition B.13. For any non-empty set X, card (X) < card
_
2
X
_
.
Proof. Dene f : X 2
X
by f (x) = x . Then f is an injective map and
hence card (X) card
_
2
X
_
. Now suppose that g : X 2
X
is any map. Let
X
0
= x X : x / g (x) X. I claim that X
0
/ g (X) .
Indeed suppose there exists x
0
X such that g (x
0
) = X
0
. If x
0
X
0
,
then x
0
/ g (x
0
) = X
0
which is impossible. Similarly if x
0
/ X
0
= g (x
0
) then
x
0
X
0
and again we have reached a contradiction. Thus we must conclude
that X
0
/ g (X) . Thus there are no surjective maps, g : X 2
X
so that
card (X) ,= card
_
2
X
_
.
Proposition B.14. If card (X) < card (Y ) and card (Y ) card (Z) , then
card (X) < card (Z) .
Proof. If there exists an injective map, f : Z X then composing this
with and injective map, g : X Y gives an injective map, g f : Z X and
there for card (Z) card (X) . But this would imply that card (X) = card (Z) .
Denition B.15. Let A
n
:= 1, 2, . . . , n for all n N and write n for
card (A
n
) .
Proposition B.16. We have card (A
m
) < card (A
n
) for all m < n. Moreover
if , = X _ A
n
then card (X) = card (A
k
) for some k < n.
B.3 Formalities of Counting 187
Proof. If f : A
1
A
2
, then either f (1) = 1 or f (1) = 2. In either case
f is injective but not bijective so that card (A
2
) < card (A
1
) . Let S
n
be the
statement that card (A
k
) < card (A
l
) for all 1 k < l n and for any proper
subset X A
n
we have card (X) = card (A
m
) for some m < n. Then we have
just shown that S
2
is true. So suppose that S
n
is now true. As f : A
k
A
l
dened by f (m) = m for all m A
k
is a injection when k < l we always
have card (A
k
) card (A
l
) . Now suppose that card (A
k
) = card (A
n+1
) for
some k n. Then there exists a bijection, f : A
n+1
A
k
. In this case f (A
n
)
is a proper subset of A
k
and therefore card (f (A
n
)) < card (A
k
) but on the
other hand card (f (A
n
)) = card (A
n
) card (A
k
) which is a contradiction.
So no such bijection can exists and we have shown card (A
k
) < card (A
n+1
)
for all k n. Finally suppose that X A
n+1
is proper subset. If X A
n
then card (X) = card (A
k
) for some k n by the induction hypothesis. On
the other hand if n + 1 X, let X
t
:= X n + 1 _ A
n
. Therefore by the
induction hypothesis card (X
t
) = card (A
k
) for some k < n. It is then clear that
card (X) = card (A
k+1
) where k +1 < n, indeed we map X := X
t
n + 1
A
k
k + 1 = A
k+1
.
Example B.17. card (A
n
k) = n 1 for k A
n
. Indeed, let f : A
n1

A
n
k be dened by
f (x) =
_
x if x < k
x + 1 if x k
.
Then f is the desired bijection. More generally if X Y and card (X) = m <
n = card (Y ) , then card (Y X) = nm and if X and Y are nite disjoint sets
then card (X Y ) = card (X) +card (Y ) . Similarly, card (X Y ) = card (X)
card (Y ) .
Proposition B.18. If f : A
n
A
n
is a map, then the following are equivalent,
1. f is injective,
2. f is surjective,
3. f is bijective.
Moreover card (Bijec (A
n
)) = n!.
Proof. If n = 1, the only map f : A
1
A
1
is f (1) = 1. So in this case
there is nothing to prove. So now suppose the proposition holds for level n and
f : A
n+1
A
n+1
is a given map.
If f : A
n+1
A
n+1
is an injective map and f (A
n+1
) is a proper subset
of A
n+1
, then card (A
n+1
) < card (f (A
n+1
)) = card (A
n+1
) which is absurd.
Thus f is injective implies f is surjective.
Conversely suppose that f : A
n+1
A
n+1
is surjective. Let g : A
n+1

A
n+1
be a right inverse, i.e. fg = id, which is necessarily injective, see the proof
of Proposition B.10. By the pervious paragraph we know that g is necessarily
surjective and therefore f = g
1
is a bijection.
It now only remains to prove card (Bijec (A
n
)) = n! which we again do
by induction. For n = 1 the result is clear. So suppose it holds at level n. If
f : A
n+1
A
n+1
is a bijection. Given 1 k n + 1 let
Bij
k
(A
n+1
) := f Bij (A
n+1
) : f (n + 1) = k .
For f Bij
k
(A
n+1
) , we have f : A
n
A
n+1
k

= A
n
is a bijection. Thus
Bij
k
(A
n+1
)

= Bij (A
n
) and
Bij (A
n+1
) =
n+1
k=1
Bij
k
(A
n+1
)
we have
card (Bij (A
n+1
)) =
n+1
k=1
card (Bij
k
(A
n+1
))
=
n+1
k=1
card (Bij (A
n
)) =
n+1
k=1
n!
= (n + 1) n! = (n + 1)!.
Theorem B.19. Suppose that X is a set. Then card (J
n
) card (X) for all
n N i card (N) card (X) .
Proof. Since card (J
n
) card (N) for all n N it suces to prove
card (J
n
) card (X) for all n N implies card (N) card (X) . The intuitive
idea is as follows.
Suppose we have constructed f
n
: J
n
X which is injective. If f
n
were
bijective we would have card (J
n
) = card (X) and in particular card (J
m
) >
card (J
n
) = card (X) for all m > n. Thus there exists x X f
n
(J
n
) and we
then dene f
n+1
: J
n+1
X so that f
n+1
(n + 1) = x and f
n+1
[
Jn
= f
n
. This
process continues indenitely and so we may construct injective maps f
n
: J
n

X such that f
m
= f
n
[
Jm
for all m n. We then dene f (m) := f
n
(m) where
n N is any integer such that n m. In this way we construct a function,
f : N X such that f[
Jn
= f
n
for all n. This function is easily seen to be
injective.
Formalities Version 1. Consider the collection of injective maps f :
T(f) N X, where T(f) is either J
n
for some n N or is N. We say
f g if T(f) T(g) and f = g[
1(f)
. It is easy to see that every linearly or-
dered collection of such maps has an upper bound and so by Zorns lemma (see
Theorem B.6), there exists a maximal element, f. If T(f) ,= N then T(f) = J
n
for some n. By the last paragraph we could extend f to injective map on J
n+1
violating the maximality of f. Thus T(f) = N and we have found an injective
map from N to X.
Formalities Version 2. (This argument will avoid the use of Zorns
Lemma.) By assumption, for each n N there exists an injective map,
f
n
: J
n
X. We now let Y :=
nN
f
n
(J
n
) X. We may construct a sur-
jective map (but not necessarily injective map) F : N Y. From this map
we then dene : Y N by (y) := min F
1
(y) so that : Y N
is now injective. Suppose for the sake of contradiction that (Y ) J
N
for
some N N, i.e. (Y ) is a bounded set. Then using our above arguments,
we we know that card ( (Y )) = card (J
k
) for some k N. On the other
hand, f
n
: J
n
Y being injective implies card ( (Y )) card (J
n
) for all
n N. As both of these statements can not be correct at the same time
we conclude that (Y ) is unbounded. We may now apply Lemma 5.26 in or-
der to see that card (Y ) = card ( (Y )) = card (N) . From this it follows that
card (N) card (X) .
Alternate Proof. By assumption, there exists and injective map, f
n
:
J
n
X for each n N. By replacing X by X
0
:=
nN
f
n
(J
n
) we may assume
that X =
nN
f
n
(J
n
) . As X is the countable union of nite sets it follows
that there exists a surjective map, f : N X by item 2 of Theorem 5.27. .
Let g : X N be dened by g (x) := min f
1
(x) for all x X and let
S := g (N) . To nish the proof we need only show that S is unbounded. If S
were bounded, then we would nd k N such that J
k
X. However this is
impossible since card J
n
card X = card J
k
would imply n k.
C
Appendix: Aspects of General Topological Spaces
C.1 Compactness
Denition C.1. The subset A of a topological space (X, ) is said to be com-
pact if every open cover (Denition 9.42) of A has nite a sub-cover, i.e. if |
is an open cover of A there exists |
0
| such that |
0
is a cover of A. (We
will write A X to denote that A X and A is compact.) A subset A X
is precompact if

A is compact.
Proposition C.2. Suppose that K X is a compact set and F K is a closed
subset. Then F is compact. If K
i
n
i=1
is a nite collections of compact subsets
of X then K =
n
i=1
K
i
is also a compact subset of X.
Proof. Let | be an open cover of F, then |F
c
is an open cover of
K. The cover |F
c
of K has a nite subcover which we denote by |
0
F
c
where |
0
|. Since F F
c
= , it follows that |
0
is the desired subcover
of F. For the second assertion suppose | is an open cover of K. Then |
covers each compact set K
i
and therefore there exists a nite subset |
i
|
for each i such that K
i
|
i
. Then |
0
:=
n
i=1
|
i
is a nite cover of K.
Exercise C.1 (Suggested by Michael Gurvich). Show by example that
the intersection of two compact sets need not be compact. (This pathology
disappears if one assumes the topology is Hausdor, see Denition ?? below.)
Exercise C.2. Suppose f : X Y is continuous and K X is compact, then
f(K) is a compact subset of Y. Give an example of continuous map, f : X Y,
and a compact subset K of Y such that f
1
(K) is not compact.
Exercise C.3 (Dinis Theorem). Let X be a compact topological space and
f
n
: X [0, ) be a sequence of continuous functions such that f
n
(x) 0
as n for each x X. Show that in fact f
n
0 uniformly in x, i.e.
sup
xX
f
n
(x) 0 as n . Hint: Given > 0, consider the open sets V
n
:=
x X : f
n
(x) < .
Denition C.3. A collection T of closed subsets of a topological space (X, )
has the nite intersection property if T
0
,= for all T
0
T.
The notion of compactness may be expressed in terms of closed sets as
follows.
Proposition C.4. A topological space X is compact i every family of closed
sets T 2
X
having the nite intersection property satises
T ,= .
Proof. The basic point here is that complementation interchanges open
and closed sets and open covers go over to collections of closed sets with empty
intersection. Here are the details.
() Suppose that X is compact and T 2
X
is a collection of closed sets
such that
T = . Let
| = T
c
:= C
c
: C T ,
then | is a cover of X and hence has a nite subcover, |
0
. Let T
0
= |
c
0
T,
then T
0
= so that T does not have the nite intersection property.
() If X is not compact, there exists an open cover | of X with no nite
subcover. Let
T = |
c
:= U
c
: U | ,
then T is a collection of closed sets with the nite intersection property while
T = .
Exercise C.4. Let (X, ) be a topological space. Show that A X is compact
i (A,
A
) is a compact topological space.
Let (X, d) be a metric space and for x X and > 0 let
B
t
x
() := B
x
() x
be the ball centered at x of radius > 0 with x deleted. Recall from Denition
9.15 that a point x X is an accumulation point of a subset E X if ,=
E V x for all open neighborhoods, V, of x. The proof of the following
elementary lemma is left to the reader.
Lemma C.5. Let E X be a subset of a metric space (X, d) . Then the fol-
lowing are equivalent:
1. x X is an accumulation point of E.
2. B
t
x
() E ,= for all > 0.
3. B
x
() E is an innite set for all > 0.
4. There exists x
n
n=1
E x with lim
n
x
n
= x.
190 C Appendix: Aspects of General Topological Spaces
Denition C.6. A metric space (X, d) is bounded ( > 0) if there exists
a nite cover of X by balls of radius and it is totally bounded if it is
bounded for all > 0.
Theorem C.7. Let (X, d) be a metric space. The following are equivalent.
(a) X is compact.
(b) Every innite subset of X has an accumulation point.
(c) Every sequence x
n
n=1
X has a convergent subsequence.
(d) X is totally bounded and complete.
Proof. The proof will consist of showing that a b c d a.
(a b) We will show that not b not a. Suppose there exists an innite
subset E X which has no accumulation points. Then for all x X there
exists
x
> 0 such that V
x
:= B
x
(
x
) satises (V
x
x) E = . Clearly
1 = V
x
xX
is a cover of X, yet 1 has no nite sub cover. Indeed, for each
x X, V
x
E x and hence if X,
x
V
x
can only contain a nite
number of points from E (namely E). Thus for any X, E
x
V
x
and in particular X ,=
x
V
x
. (See Figure C.1.)
e
e
x
x
Fig. C.1. The black dots represents an innite set, E, with not accumulation points.
For each x X \ E we choose x > 0 so that Bx (x) E = and for x E so that
Bx (x) E = {x} .
(b c) Let x
n
n=1
X be a sequence and E := x
n
: n N . If
#(E) < , then x
n
n=1
has a subsequence x
nk
k=1
which is constant and
hence convergent. On the other hand if #(E) = then by assumption E has
an accumulation point and hence by Lemma C.5, x
n
n=1
has a convergent
subsequence.
(c d) Suppose x
n
n=1
X is a Cauchy sequence. By assumption there
exists a subsequence x
nk
k=1
which is convergent to some point x X. Since
x
n
n=1
is Cauchy it follows that x
n
x as n showing X is complete. We
now show that X is totally bounded. Let > 0 be given and choose an arbitrary
point x
1
X. If possible choose x
2
X such that d(x
2
, x
1
) , then if possible
choose x
3
X such that d
x1,x2]
(x
3
) and continue inductively choosing
points x
j
n
j=1
X such that d
x1,...,xn1]
(x
n
) . (See Figure C.2.) This
process must terminate, for otherwise we would produce a sequence x
n
n=1

X which can have no convergent subsequences. Indeed, the x
n
have been chosen
so that d (x
n
, x
m
) > 0 for every m ,= n and hence no subsequence of
x
n
n=1
can be Cauchy.
x2
x1 x3
x4
x5
x6
x7
x8
Fig. C.2. Constructing a set with out an accumulation point.
(d a) For sake of contradiction, assume there exists an open cover 1 =
V
A
of X with no nite subcover. Since X is totally bounded for each
n N there exists
n
X such that
X =
_
xn
B
x
(1/n)
_
xn
C
x
(1/n).
Choose x
1

1
such that no nite subset of 1 covers K
1
:= C
x1
(1). Since
K
1
=
x2
K
1
C
x
(1/2), there exists x
2

2
such that K
2
:= K
1
C
x2
(1/2)
can not be covered by a nite subset of 1, see Figure C.3. Continuing this way
inductively, we construct sets K
n
= K
n1
C
xn
(1/n) with x
n

n
such that no
K
n
can be covered by a nite subset of 1. Now choose y
n
K
n
for each n. Since
K
n
n=1
is a decreasing sequence of closed sets such that diam(K
n
) 2/n, it
follows that y
n
is a Cauchy and hence convergent with
y = lim
n
y
n

m=1
K
m
.
Since 1 is a cover of X, there exists V 1 such that y V. Since K
n
y
and diam(K
n
) 0, it now follows that K
n
V for some n large. But this
violates the assertion that K
n
can not be covered by a nite subset of 1.
Corollary C.8. Any compact metric space (X, d) is second countable and hence
also separable by Exercise ??. (See Example ?? below for an example of a com-
pact topological space which is not separable.)
Proof. To each integer n, there exists
n
X such that X =
xn
B(x, 1/n). The collection of open balls,
C.1 Compactness 191
Fig. C.3. Nested Sequence of cubes.
1 :=
nN

xn
B(x, 1/n)
forms a countable basis for the metric topology on X. To check this, suppose
that x
0
X and > 0 are given and choose n N such that 1/n < /2
and x
n
such that d (x
0
, x) < 1/n. Then B(x, 1/n) B(x
0
, ) because for
y B(x, 1/n),
d (y, x
0
) d (y, x) +d (x, x
0
) < 2/n < .
Corollary C.9. The compact subsets of 1
n
are the closed and bounded sets.
Proof. This is a consequence of Theorem ?? and Theorem C.7. Here is
another proof. If K is closed and bounded then K is complete (being the closed
subset of a complete space) and K is contained in [M, M]
n
for some positive
integer M. For > 0, let
= Z
n
[M, M]
n
:= x : x Z
n
and [x
i
[ M for i = 1, 2, . . . , n.
We will show, by choosing > 0 suciently small, that
K [M, M]
n

x
B(x, ) (C.1)
which shows that K is totally bounded. Hence by Theorem C.7, K is compact.
Suppose that y [M, M]
n
, then there exists x
such that [y
i
x
i
[
for i = 1, 2, . . . , n. Hence
d
2
(x, y) =
n
i=1
(y
i
x
i
)
2
n
2
which shows that d(x, y)

n. Hence if choose < /
n we have shows that

d(x, y) < , i.e. Eq. (C.1) holds.
Example C.10. Let X =
p
(N) with p [1, ) and
p
(N) such that (k) 0
for all k N. The set
K := x X : [x(k)[ (k) for all k N
is compact. To prove this, let x
n
n=1
K be a sequence. By compactness of
closed bounded sets in C, for each k N there is a subsequence of x
n
(k)
n=1

C which is convergent. By Cantors diagonalization trick, we may choose a
subsequence y
n
n=1
of x
n
n=1
such that y(k) := lim
n
y
n
(k) exists for all
k N.
1
Since [y
n
(k)[ (k) for all n it follows that [y(k)[ (k), i.e. y K.
Finally
lim
n
|y y
n
|
p
p
= lim
n
k=1
[y(k) y
n
(k)[
p
=
k=1
lim
n
[y(k) y
n
(k)[
p
= 0
wherein we have used the Dominated convergence theorem. (Note
[y(k) y
n
(k)[
p
2
p
p
(k)
and
p
is summable.) Therefore y
n
y and we are done.
Alternatively, we can prove K is compact by showing that K is closed and
totally bounded. It is simple to show K is closed, for if x
n
n=1
K is a
convergent sequence in X, x := lim
n
x
n
, then
[x(k)[ lim
n
[x
n
(k)[ (k) k N.
This shows that x K and hence K is closed. To see that K is totally
bounded, let > 0 and choose N such that
_
k=N+1
[(k)[
p
_
1/p
< . Since
N
k=1
C
(k)
(0) C
N
is closed and bounded, it is compact. Therefore there
exists a nite subset
N
k=1
C
(k)
(0) such that
N
k=1
C
(k)
(0)
z
B
N
z
()
1
The argument is as follows. Let {n
1
j
}
j=1
be a subsequence of N ={n}
n=1
such that
limjx
n
1
j
(1) exists. Now choose a subsequence {n
2
j
}
j=1
of {n
1
j
}
j=1
such that
limjx
n
2
j
(2) exists and similarly {n
3
j
}
j=1
of {n
2
j
}
j=1
such that limjx
n
3
j
(3)
exists. Continue on this way inductively to get
{n}
n=1
{n
1
j
}
j=1
{n
2
j
}
j=1
{n
3
j
}
j=1
. . .
such that limjx
n
k
j
(k) exists for all k N. Let mj := n
j
j
so that eventually
{mj}
j=1
is a subsequence of {n
k
j
}
j=1
for all k. Therefore, we may take yj := xmj
.
where B
N
z
() is the open ball centered at z C
N
relative to the
p
(1, 2, 3, . . . , N) norm. For each z , let z X be dened by
z(k) = z(k) if k N and z(k) = 0 for k N + 1. I now claim that
K
z
B
z
(2) (C.2)
which, when veried, shows K is totally bounded. To verify Eq. (C.2), let x K
and write x = u + v where u(k) = x(k) for k N and u(k) = 0 for k < N.
Then by construction u B
z
() for some z and
|v|
p

_

k=N+1
[(k)[
p
_
1/p
< .
So we have
|x z|
p
= |u +v z|
p
|u z|
p
+|v|
p
< 2.
Exercise C.5 (Extreme value theorem). Let (X, ) be a compact topologi-
cal space and f : X 1 be a continuous function. Show < inf f sup f <
and there exists a, b X such that f(a) = inf f and f(b) = sup f. Hint: use
Exercise C.2 and Corollary C.9.
Exercise C.6 (Uniform Continuity). Let (X, d) be a compact metric space,
(Y, ) be a metric space and f : X Y be a continuous function. Show that f is
uniformly continuous, i.e. if > 0 there exists > 0 such that (f(y), f(x)) <
if x, y X with d(x, y) < . Hint: you could follow the argument in the proof
of Theorem ??.
Theorem C.11. Suppose that (X, ||) is a normed vector in which the unit
ball, V := B
0
(1) , is precompact. Then dimX < .
An alternate proof is given in Proposition C.13.. Since

V is compact,
we may choose X such that
V
x
_
x +
1
2
V
_
(C.3)
where, for any > 0,
V := x : x V = B
0
() .
Let Y := span(), then Eq. (C.3) implies,
V

V Y +
1
2
V.
Multiplying this equation by
1
2
then shows
1
2
V
1
2
Y +
1
4
V = Y +
1
4
V
and hence
V Y +
1
2
V Y +Y +
1
4
V = Y +
1
4
V.
Continuing this way inductively then shows that
V Y +
1
2
n
V for all n N. (C.4)
Indeed, if Eq. (C.4) holds, then
V Y +
1
2
V Y +
1
2
_
Y +
1
2
n
V
_
= Y +
1
2
n+1
V.
Hence if x V, there exists y
n
Y and z
n
B
0
(2
n
) such that y
n
+z
n
x.
Since lim
n
z
n
= 0, it follows that x = lim
n
y
n

Y . Since dimY
#() < , Corollary 9.41 implies Y =

Y and so we have shown that V Y.
Since for any x X,
1
2|x|
x V Y, we have x Y for all x X, i.e. X = Y.
Lemma C.12. Let H be a normed linear space and H
0
a closed proper subspace.
For any > 0, there exists x
0
H such that |x
0
| = 1 and |x x
0
| 1
whenever x H
0
.
Proof. Can assume < 1. Take any z
0
/ H
0
. Let d = inf
xH0
|x z
0
|. For
any > 0, there exists z H
0
, such that |z z
0
| d + . Take =
d
1
. Let
x
0
= (z z
0
)/|z z
0
|, where z is determined for this . Then |x
0
| = 1, and if
x H
0
,
|x x
0
| =
||z z
0
|x z +z
0
|
|z z
0
|

d
|z z
0
|

d
d +
= 1 .
[Here is the proof again at a higher level. Choose h H
0
such that d :=
dist (z
0
, H
0
)

= |h z
0
| and then take x
0
:= (h z
0
)/|h z
0
| as above. Then
dist (x
0
, H
0
) =
1
|h z
0
|
dist (h z
0
, H
0
)
=
1
|h z
0
|
dist (z
0
, H
0
)

= 1,
where we have used the easily veried fact that dist (ax +h, H) = [a[ dist (x, H)
for all a 1 and h H.]
C.2 Exercises 193
Proposition C.13. A locally compact Banach space is nite dimensional.
Proof. We prove that an innite dimensional Banach space is not locally
compact. We construct a sequence x
1
, x
2
, . . . , x
n
, . . . such that |x
n
| = 1, |x
i
x
j
| 1/2, i ,= j. Take x
1
to be any unit vector. Suppose vectors x
1
, . . . , x
n
are constructed. Let H
0
be the linear span of x
1
, . . . , x
n
. By Corollary 9.41,
H
0
is closed. By Lemma C.12, there exists x
n+1
such that |x
i
x
n+1
| 1/2,
i = 1, . . . , n. Now the sequence just constructed has no Cauchy subsequence.
Hence the closed unit ball is not compact. Similarly the closed ball of radius
r > 0 is also not compact.
Exercise C.7. Suppose (Y, ||
Y
) is a normed space and (X, ||
X
) is a nite
dimensional normed space. Show every linear transformation T : X Y is
necessarily bounded.
C.2 Exercises
C.2.1 General Topological Space Problems
Exercise C.8. Let V be an open subset of 1. Show V may be written as a
disjoint union of open intervals J
n
= (a
n
, b
n
), where a
n
, b
n
1 for
n = 1, 2, < N with N = possible.
Exercise C.9. Let (X, ) and (Y,
t
) be a topological spaces, f : X Y be a
function, | be an open cover of X and F
j
n
j=1
be a nite cover of X by closed
sets.
1. If A X is any set and f : X Y is (,
t
) continuous then f[
A
: A Y
is (
A
,
t
) continuous.
2. Show f : X Y is (,
t
) continuous i f[
U
: U Y is (
U
,
t
)
continuous for all U |.
3. Show f : X Y is (,
t
) continuous i f[
Fj
: F
j
Y is (
Fj
,
t
)
continuous for all j = 1, 2, . . . , n.
Exercise C.10. Suppose that X is a set, (Y
) : A is a family of topo-
logical spaces and f
: X Y
is a given function for all A. Assum-

ing that o
is a sub-base for the topology
for each A, show

o :=
A
f
1
(o
) is a sub-base for the topology := (f
: A).
C.2.2 Metric Spaces as Topological Spaces
Denition C.14. Two metrics d and on a set X are said to be equivalent
if there exists a constant c (0, ) such that c
1
d c.
Exercise C.11. Suppose that d and are two metrics on X.
1. Show
d
=
if d and are equivalent.

2. Show by example that it is possible for
d
=
even though d and are

inequivalent.
Exercise C.12. Let (X
i
, d
i
) for i = 1, . . . , n be a nite collection of metric
spaces and for 1 p and x = (x
1
, x
2
, . . . , x
n
) and y = (y
1
, . . . , y
n
) in
X :=
n
i=1
X
i
, let
p
(x, y) =
_
(
n
i=1
[d
i
(x
i
, y
i
)]
p
)
1/p
if p ,=
max
i
d
i
(x
i
, y
i
) if p =
.
1. Show (X,
p
) is a metric space for p [1, ]. Hint: Minkowskis inequality.
2. Show for any p, q [1, ], the metrics
p
and
q
are equivalent. Hint: This
can be done with explicit estimates or you could use Theorem 9.39 below.
Notation C.15 Let X be a set and p := p
n
n=0
be a family of semi-metrics
on X, i.e. p
n
: X X [0, ) are functions satisfying the assumptions of
metric except for the assertion that p
n
(x, y) = 0 implies x = y. Further assume
that p
n
(x, y) p
n+1
(x, y) for all n and if p
n
(x, y) = 0 for all n N then x = y.
Given n N and x X let
B
n
(x, ) := y X : p
n
(x, y) < .
We will write (p) form the smallest topology on X such that p
n
(x, ) : X
[0, ) is continuous for all n N and x X, i.e. (p) := (p
n
(x, ) : n N
and x X).
Exercise C.13. Using Notation C.15, show that collection of balls,
B := B
n
(x, ) : n N, x X and > 0 ,
forms a base for the topology (p). Hint: Use Exercise C.10 to show B is a
sub-base for the topology (p) and then use Exercise ?? to show B is in fact a
base for the topology (p).
Exercise C.14 (A minor variant of Exercise 9.21). Let p
n
be as in Nota-
tion C.15 and
d(x, y) :=
n=0
2
n
p
n
(x, y)
1 +p
n
(x, y)
.
Show d is a metric on X and
d
= (p). Conclude that a sequence x
k
k=1
X
converges to x X i
lim
k
p
n
(x
k
, x) = 0 for all n N.
Exercise C.15. Let (X
n
, d
n
)
n=1
be a sequence of metric spaces, X :=
n=1
X
n
, and for x = (x(n))
n=1
and y = (y(n))
n=1
in X let
d(x, y) =
n=1
2
n
d
n
(x(n), y(n))
1 +d
n
(x(n), y(n))
.
(See Exercise 9.21.) Moreover, let
n
: X X
n
be the projection maps, show
d
=
n=1
dn
:= (
n
: n N).
That is show the d metric topology is the same as the product topology on
X. Suggestions: 1) show
n
is
d
continuous for each n and 2) show for each
x X that d (x, ) is
n=1
dn
continuous. For the second assertion notice
that d (x, ) =
n=1
f
n
where f
n
= 2
n
_
dn(x(n),)
1+dn(x(n),)
_

n
.
C.2.3 Compactness Problems
Exercise C.16 (Tychonos Theorem for Compact Metric Spaces). Let
us continue the Notation used in Exercise 9.21. Further assume that the spaces
X
n
are compact for all n. Show (without using Theorem ?? below) that (X, d) is
compact. Hint: Either use Cantors method to show every sequence x
m
m=1

X has a convergent subsequence or alternatively show (X, d) is complete and
totally bounded. (Compare with Example C.10 and see Theorem ?? below for
the general version of this theorem.)
C.3 Connectedness in General Topological Spaces
(You may safely skip this section on rst reading.) As a warm up suppose that
(X, d) is a metric space and Y be a non-empty subset of X. If A Y, the closure
of A in the metric space (Y, d) is denote by

A
Y
and is dened by
A
Y
:=
_
y Y : y = lim
n
a
n
for some a
n
A
_
=

A Y.
Lemma C.16. A set A Y is closed in Y i A =

A
Y
=

AY i there exists
C X closed such that A = C Y. Similarly, A is open in Y i there exists V
open in X such that A = V Y.
Proof. The assertion that A Y is closed in Y i A =

A
Y
=

AY follows
directly from the denition. Hence if A is closed in Y, then A = CY where we
may take C =

A a closed subset of X. Conversely, if A = C Y, then

A C
and hence A

A
Y
=

A Y C Y = A which shows that A =

A
Y
and so A
is closed in Y.
If A Y is open, then Y A is closed and hence Y A = Y C for some
closed subset, C X. Since,
A = Y (Y A) = Y [Y C]
c
= Y [Y
c
C
c
] = Y C
c
we see that A = Y V where V := C
c
is open in X. Conversely if A = Y V,
then
Y A = Y [Y V ]
c
= Y [Y
c
V
c
] = Y V
c
which shows that Y A is closed in Y.
Proposition C.17. Let E be a subset of a metric space, (X, d) . Then E is
disconnected i E is the disjoint union of two non-empty relatively open subset
of E, i.e. there exists open subset U and V of X such that;
1. U E ,= , = V E while U V E = ,
2. and E = (U E) [V E] .
Proof. Suppose that E is a disconnected and write E = AB where where
A and B are two non-empty separated sets. Let U :=

B
c
and V :=

A
c
. Then
A U and B V. Moreover, E U = (A B)

B
c
= A

B = A ,= and
similarly E V = B ,= while U V E = A B = . Therefore E is the
disjoint union of two non-empty relatively open sets.
Conversely, suppose that E = A B where A and B are relatively open
non-empty disjoint open subsets of E. Then both A and B are relatively closed
in E and hence,
A =

A
E
:=
_
x E : a
n
A lim
n
a
n
= x
_
=

A E
and similarly B =

B
E
=

B E. Since
A =

A E =

A [A B] = A
_
A B
and A B = we conclude,
= A B =
_
A
_
A B
_
B =

A B.
Similarly we may show A

B = and hence E is written as the union of two
non-empty separated sets.
Denition C.18. (X, ) is disconnected if there exist non-empty open sets
U and V of X such that U V = and X = U V . We say U, V is a
disconnection of X. The topological space (X, ) is called connected if it
is not disconnected, i.e. if there is no disconnection of X. If A X we say
A is connected i (A,
A
) is connected where
A
is the relative topology on
A. Explicitly, A is disconnected in (X, ) i there exists U, V such that
U A ,= , U A ,= , A U V = and A U V.
C.3 Connectedness in General Topological Spaces 195
The reader should check that the following statement is an equivalent de-
nition of connectivity. A topological space (X, ) is connected i the only sets
A X which are both open and closed are the sets X and . This version of
the denition is often used in practice.
Remark C.19. Let A Y X. Then A is connected in X i A is connected in
Y .
Proof. Since
A
:= V A : V X = V A Y : V X = U A : U
o
Y ,
the relative topology on A inherited from X is the same as the relative topology
on A inherited from Y . Since connectivity is a statement about the relative
topologies on A, A is connected in X i A is connected in Y.
Theorem C.20 (The Connected Subsets of 1). The connected subsets of
1 are intervals.
Proof. Suppose that A 1 is a connected subset and that a, b A with
a < b. If there exists c (a, b) such that c / A, then U := (, c) A
and V := (c, ) A would form a disconnection of A. Hence (a, b) A. Let
:= inf(A) and := sup(A) and choose
n
,
n
A such that
n
<
n
and
n
and
n
as n . By what we have just shown, (
n
,
n
) A for all
n and hence (, ) =
n=1
(
n
,
n
) A. From this it follows that A = (, ),
[, ), (, ] or [, ], i.e. A is an interval.
Conversely suppose that A is a sub-interval of 1. For the sake of contra-
diction, suppose that U, V is a disconnection of A with a U, b V. After
relabelling U and V if necessary we may assume that a < b. Since A is an
interval [a, b] A. Let p = sup ([a, b] U) , then because U and V are open,
a p and p
can not be in V for otherwise p < sup ([a, b] U) . From this it follows that
p / U V and hence A ,= U V contradicting the assumption that U, V is a
disconnection of A.
Alternative proof of the converse. Because of Theorem C.22 below, it
suces to assume that A is an open interval. For the sake of contradiction,
suppose that U, V is a disconnection of A with a U, b V. After relabelling
U and V if necessary we may assume that a < b. Let J
a
= (, ) be the maximal
open interval in U which contains a. (See Exercise C.8 or Remark 9.67 for the
structure of open subsets of 1.) If U we could extend J
a
to the right and
still be in U violating the denition of . Moreover we can not have V
because in this case J
a
would not be in U. Therefore / U V = A while on
the other hand a < < b and so A as A is an interval and we have reached
the desired contradiction.
Here is a typical way these connectedness ideas are used.
Example C.21. Suppose that f : X Y is a continuous map between two
topological spaces, the space X is connected and the space Y is T
1
, i.e. y
is a closed set for all y Y as in Denition ?? below. Further assume f is
locally constant, i.e. for all x X there exists an open neighborhood V of x
in X such that f[
V
is constant. Then f is constant, i.e. f(X) = y
0
for some
y
0
Y. To prove this, let y
0
f(X) and let W := f
1
(y
0
). Since y
0
Y
is a closed set and since f is continuous W X is also closed. Since f is locally
constant, W is open as well and since X is connected it follows that W = X,
i.e. f(X) = y
0
.
As a concrete application of this result, suppose that X is a connected open
subset of 1
d
and f : X 1 is a C
1
function such that f 0. If x X and
> 0 such that B(x, ) X, we have, for any [v[ < and t [1, 1] , that
d
dt
f (x +tv) = f (x +tv) v = 0.
Therefore f (x +v) = f (x) for all [v[ < and this shows f is locally constant.
Hence, by what we have just proved, f is constant on X.
Theorem C.22 (Properties of Connected Sets). Let (X, ) be a topological
space.
1. If B X is a connected set and X is the disjoint union of two open sets U
and V, then either B U or B V.
2. If A X is connected,
a) then

A is connected.
b) More generally, if A is connected and B acc(A) or B bd (A) , then
AB is connected as well. (Recall that acc(A) the set of accumulation
points of A was dened in Denition 9.15 above. Moreover by Exercise
??, we know that acc(A) A = bd (A) A. What we are really showing
here is that for any B such that A B

A, then B is connected.)
3. If E
A
is a collection of connected sets such that E
,= for all
, A,
2
then Y :=
A
E
is connected as well.
4. Suppose A, B X are non-empty connected subsets of X such that

AB ,=
, then A B is connected in X.
5. Every point x X is contained in a unique maximal connected subset C
x
of
X and this subset is closed. The set C
x
is called the connected component
of x.
2
One may assume much less here. What we really need is for any , A there
exists {i}
n
i=0
in A such that 0 = , n = , and Ei
Ei+1
= for all 0 i < n.
Moreover if we make use of item 4. it suces to assume that
Ei
Ei+1
Ei

Ei+1
= for all 0 i < n.
Proof.
1. Since B is the disjoint union of the relatively open sets BU and BV, we
must have B U = B or B V = B for otherwise B U, B V would
be a disconnection of B.
2. a) Let Y =

A be equipped with the relative topology from X. Suppose
that U, V
o
Y form a disconnection of Y =

A. Then by 1. either A U
or A V. Say that A U. Since U is both open and closed in Y, it
follows that Y =

A U. Therefore V = and we have a contradiction to
the assumption that U, V is a disconnection of Y =

A. Hence we must
conclude that Y =

A is connected as well.
b) Now let Y = A B with B acc(A), then
A
Y
=

A Y = (A acc(A)) Y = A B.
Because A is connected in Y, by (2a) Y = A B =

A
Y
is also connected.
3. Let Y :=
A
E
. By Remark C.19, we know that E
is connected in Y
for each A. If U, V were a disconnection of Y, by item (1), either
E
U or E
V for all . Let = A : E
U then U =
and V =
A\
E
. (Notice that neither or A can be empty since U

and V are not empty.) Since
= U V =
_
,
c (E
) ,= .
we have reached a contradiction and hence no such disconnection exists.
4. (A good example to keep in mind here is X = 1, A = (0, 1) and B = [1, 2).)
For sake of contradiction suppose that U, V were a disconnection of Y =
AB. By item (1) either A U or A V, say A U in which case B V.
Since Y = AB we must have A = U and B = V and so we may conclude:
A and B are disjoint subsets of Y which are both open and closed. This
implies
A =

A
Y
=

A Y =

A (A B) = A
_
A B
_
and therefore
= A B =
_
A
_
A B
_
B =

A B ,=
which gives us the desired contradiction.
Alternative proof. Let A
t
:= A
_
B

A
so that A B = A
t
B. By
item 2b. we know A
t
is still connected and since A
t
B ,= we may now
apply item 3. to nish the proof.
5. Let ( denote the collection of connected subsets C X such that x C.
Then by item 3., the set C
x
:= ( is also a connected subset of X which
contains x and clearly this is the unique maximal connected set containing
x. Since

C
x
is also connected by item (2) and C
x
is maximal, C
x
=

C
x
, i.e.
C
x
is closed.
Proposition C.23 (Stability of Connectedness Under Products). Let
(X
) be connected topological spaces. Then the product space X

A
=
A
X
equipped with the product topology is connected.

Proof. Let us begin with the case of two factors, namely assume that X and
Y are connected topological spaces, then we will show that XY is connected
as well. Given x X, let f
x
: Y XY be the map f
x
(y) = (x, y) and notice
that f
x
is continuous since
X
f
x
(y) = x and
Y
f
x
(y) = y are continuous
maps. From this we conclude that x Y = f
x
(Y ) is connected by Lemma
9.57. A similar argument shows X y is connected for all y Y.
Let p = (x
0
, y
0
) X Y and C
p
denote the connected component of p.
Since x
0
Y is connected and p x
0
Y it follows that x
0
Y C
p
and hence C
p
is also the connected component (x
0
, y) for all y Y. Similarly,
X y C
(x0,y)
= C
p
is connected, and therefore X y C
p
. So we have
shown (x, y) C
p
for all x X and y Y, see Figure C.4. By induction the
theorem holds whenever A is a nite set, i.e. for products of a nite number of
connected spaces.
X
Y
y0
p
X Y
x0
X {y0}
X {y}
{x0} Y
y
Fig. C.4. This picture illustrates why the connected component of p in X Y must
contain all points of X Y.
For the general case, again choose a point p X
A
= X
A
and again let
C = C
p
be the connected component of p. Recall that C
p
is closed and therefore
C.5 Supplementary Remarks 197
if C
p
is a proper subset of X
A
, then X
A
C
p
is a non-empty open set. By the
denition of the product topology, this would imply that X
A
C
p
contains a
non-empty open set of the form
V :=
(V
) = V
X
A\
X
A
C
p
(C.5)
where A and V
for all .
On the other hand, let : X
X
A
by (y) = x where
x
=
_
y
if
p
if / .
If ,
(y) = y
(y) and if A then
(y) = p
so that in
every case
: X
is continuous and therefore is continuous. Since

X
is a product of a nite number of connected spaces and so is connected

and thus so is the continuous image, (X
) = X
A\
X
. Since
p (X
) we must have
X
A\
C
p
. (C.6)
Hence it follows from Eqs. (C.5) and (C.6) that
V
A\
=
_
X
A\
_
V C
p
[X
A
C
p
] =
which is a contradiction since V
A\
,= .
C.4 Completions of Metric Spaces
Exercise C.17 (Completions of Metric Spaces). Suppose that (X, d) is a
(not necessarily complete) metric space. Using the following outline show there
exists a complete metric space
_
X,

d
_
and an isometric map i : X

X such
that i (X) is dense in

X, see Denition 9.13.
1. Let ( denote the collection of Cauchy sequences a = a
n
n=1
X. Given
two element a, b ( show d
C
(a, b) := lim
n
d (a
n
, b
n
) exists, d
C
(a, b) 0
for all a, b ( and d
C
satises the triangle inequality,
d
C
(a, c) d
C
(a, b) +d
C
(b, c) for all a, b, c (.
Thus ((, d
C
) would be a metric space if it were true that d
C
(a, b) = 0 i
a = b. This however is false, for example if a
n
= b
n
for all n 100, then
d
C
(a, b) = 0 while a need not equal b.
2. Dene two elements a, b ( to be equivalent (write a b) when-
ever d
C
(a, b) = 0. Show is an equivalence relation on ( and that
d
C
(a
t
, b
t
) = d
C
(a, b) if a a
t
and b b
t
. (Hint: see Corollary 6.46.)
3. Given a ( let a := b ( : b a denote the equivalence class containing
a and let

X := a : a ( denote the collection of such equivalence classes.
Show that

d
_
a,
b
_
:= d
C
(a, b) is well dened on

X

X and verify
_
X,

d
_
is
a metric space.
4. For x X let i (x) = a where a is the constant sequence, a
n
= x for all n.
Verify that i : X

X is an isometric map and that i (X) is dense in

X.
5. Verify
_
X,

d
_
is complete. Hint: if a(m)
m=1
is a Cauchy sequence in

X
choose b
m
X such that

d (i (b
m
) , a(m)) 1/m. Then show a(m)

b
where b = b
m
m=1
.
C.5 Supplementary Remarks
C.5.1 Riemannian Metrics
This subsection is not completely self contained and may safely be skipped.
Lemma C.24. Suppose that X is a Riemannian (or sub-Riemannian) manifold
and d is the metric on X dened by
d(x, y) = inf () : (0) = x and (1) = y
where () is the length of the curve . We dene () = if is not piecewise
smooth.
Then
B
x
() = C
x
() and
bd(B
x
()) = y X : d(x, y) =
where the boundary operation, bd() is dened in Denition 9.15 below.
Proof. Let C := C
x
() B
x
() =:

B. We will show that C

B by showing
B
c
C
c
. Suppose that y

B
c
and choose > 0 such that B
y
()

B = . In
particular this implies that
B
y
() B
x
() = .
We will nish the proof by showing that d(x, y) + > and hence that
y C
c
. This will be accomplished by showing: if d(x, y) < + then B
y
()
B
x
() ,= . If d(x, y) < max(, ) then either x B
y
() or y B
x
(). In either
case B
y
() B
x
() ,= . Hence we may assume that max(, ) d(x, y) < +.
Let > 0 be a number such that
Fig. C.5. An almost length minimizing curve joining x to y.
max(, ) d(x, y) < < +
and choose a curve from x to y such that () < . Also choose 0 <
t
< such
that 0 <
t
< which can be done since < . Let k(t) = d(y, (t)) a
continuous function on [0, 1] and therefore k([0, 1]) 1 is a connected set which
contains 0 and d(x, y). Therefore there exists t
0
[0, 1] such that d(y, (t
0
)) =
k(t
0
) =
t
. Let z = (t
0
) B
y
() then
d(x, z) ([
[0,t0]
) = () ([
[t0,1]
) < d(z, y) =
t
<
and therefore z B
x
() B
x
() ,= .
Remark C.25. Suppose again that X is a Riemannian (or sub-Riemannian)
manifold and
d(x, y) = inf () : (0) = x and (1) = y .
Let be a curve from x to y and let = () d(x, y). Then for all 0 u <
v 1,
d(x, y) + = () = ([
[0,u]
) +([
[u,v]
) +([
[v,1]
)
d(x, (u)) +([
[u,v]
) +d((v), y)
and therefore, using the triangle inequality,
([
[u,v]
) d(x, y) + d(x, (u)) d((v), y)
d((u), (v)) +.
This leads to the following conclusions. If is within of a length minimizing
curve from x to y then [
[u,v]
is within of a length minimizing curve from (u)
to (v). In particular if is a length minimizing curve from x to y then [
[u,v]
is a length minimizing curve from (u) to (v).Appendix: Limsup and liminf
of functions
D
Appendix: Limsup and liminf of functions
[Thanks go to Ali Behzadan from providing this appendix.] Our goal in
this appendix is to introduce the notion of limit superior and limit inferior
for functions. The denitions and properties will be very similar to those that
were previously discussed for sequences. Here we will assume the domain is a
metric space and the destination set is the set of extended real numbers. One
may easiliy generalize the denitions to the case where domain and codomain
are topological spaces and of course the codomain must be an ordered set. By
introducing a proper topology on the set of extended real numbers one can
consider the limit superior and limit inferior of sequences as a special case of
what will be discussed here.
First lets review the denition of limit points in a metric space.
Denition D.1. Let (X, d) be a metric space and E X. A point p X is a
limit point of the set E if every open ball about p contains a point q ,= p such
that q E.
Exercise D.1. Prove that p is a limit point of E i there exists a sequence
x
n
n=1
E p such that lim
n
x
n
= p.
Example D.2. Consider 1(with the Euclidean metric d(x, y) = [x y[). Let
E = (2, 3] 5. p = 2 is a limit point of E because 2 = lim
n
(2 + 1/n).
p = 5 is not a limit point of E because the open ball of radius 1/2 centered at
5 does not intersect E 5.
Denition D.3. Let (X, d) be a metric space. Suppose E X and p is a
limit point of E. Let f be a real (or extended real) valued function on E, i.e.,
f : E1 (or f : E
1). Then
liminf
xp
f(x) = lim
0
inff(x) : x E B
p
() p,
limsup
xp
f(x) = lim
0
supf(x) : x E B
p
() p.
Remark D.4. Clearly if > 0 is such that B
p
() E and

f := f[
Bp()
then
liminf
xp
f(x) = liminf
xp
f(x) , limsup
xp
f(x) = limsup
xp
f(x).
So in fact if f
1
and f
2
agree on an open ball about p in X then
liminf
xp
f
1
(x) = liminf
xp
f
2
(x) , limsup
xp
f
1
(x) = limsup
xp
f
2
(x).
Remark D.5. Notice that if we set,
g() := inff(x) : x E B
p
() p,
h() := supf(x) : x E B
p
() p,
then as shrinks (as 0) g() increases and h()decreases ,i.e., if
1
>
2
then
g(
1
) g(
2
) and h(
1
) h(
2
). Therefore the limits in the denition of limsup
and liminf always exist in the set of extended real numbers and we have:
liminf
xp
f(x) = sup
>0
inff(x) : x E B
p
() p,
limsup
xp
f(x) = inf
>0
supf(x) : x E B
p
() p.
Notation D.6 S
p,
:= E B
p
() p.
Remark D.7. In the special case where X = 1 and E is an open interval in 1
containing p we have:
liminf
xp
f(x) = lim
0
inf
0<]xp]<
f(x),
limsup
xp
f(x) = lim
0
sup
0<]xp]<
f(x).
In the following two propositions we list some of the most important properties
of limit superior and limit inferior of functions. The reader may check that these
properties are completely analogous to those of sequences (Proposition 3.28).
Proposition D.8. Let (X, d) be a metric space. Suppose E X, f : E
1,
g : E
1 and p is a limit point of E.

1. liminf
xp
(f(x)) = limsup
xp
f(x).
2. liminf
xp
f(x) limsup
xp
f(x).
3. If f(x) g(x) on the intersection of an open ball about p with E (or on
S
p,
for some > 0) then
liminf
xp
f(x) liminf
xp
g(x) , limsup
xp
f(x) limsup
xp
g(x).
200 D Appendix: Limsup and liminf of functions
Proof.
1. From Proposition 3.8 one can easily conclude that if S is a nonempty subset
of the extended real numbers then inf(S) = sup(S). Thus
liminf
xp
(f(x)) = lim
0
inf
xSp,
(f(x)) = lim
0
( sup
xSp,
(f(x))
= lim
0
sup
xSp,
(f(x)) = limsup
xp
f(x).
2. We have
> 0 inf
xSp,
f(x) sup
xSp,
f(x).
Since the limit of both sides as 0 exists (in the extended real numbers)
we get
lim
0
inf
xSp,
f(x) lim
0
sup
xSp,
f(x),
therefore, liminf
xp
f(x) limsup
xp
f(x).
3. By assumption there exists > 0 such that f(x) g(x) on S
p,
. Thus if
0 < then f(x) sup
ySp,
g(y) for all x S
p,
. Therefore
sup
xSp,
f(x) sup
ySp,
g(y).
Similarly, if 0 < then inf
xSp,
f(x) g(y) for all y S
p,
and hence
inf
xSp,
f(x) inf
ySp,
g(y).
Passing to the limit (as 0) in each of these inequalities gives the desired
inequalities.
Proposition D.9. Let (X, d) be a metric space. Suppose E X, f : E1,
g : E1 and p is a limit point of E.
1. lim
xp
f(x) exists in

1 i liminf
xp
f(x) = limsup
xp
f(x)

1. In this
case lim
xp
f(x) = liminf
xp
f(x) = limsup
xp
f(x).
2. limsup
xp
(f(x) + g(x)) limsup
xp
f(x) + limsup
xp
g(x),provided the
right side of this equation is not of the form or +.
3. liminf
xp
f(x)+liminf
xp
g(x) liminf
xp
(f(x)+g(x)),provided the left
side of this equation is not of the form or +.
4. liminf
xp
f(x) + limsup
xp
g(x) limsup
xp
(f(x) + g(x)), provided the
left side of this equation is not of the form or +.
5. liminf
xp
(f(x) + g(x)) liminf
xp
f(x) + limsup
xp
g(x),provided the
6. lim
xp
f(x) exists in 1 then,
limsup
xp
(f(x) +g(x)) = lim
xp
f(x) + limsup
xp
g(x),
liminf
xp
(f(x) +g(x)) = lim
xp
f(x) + liminf
xp
g(x).
7. If f(x), g(x) 0 on the intersection of an open ball about p with E (or on
S
p,
for some > 0) then
limsup
xp
(f(x).g(x)) (limsup
xp
f(x))(limsup
xp
g(x)),
provided the right hand side of this equation is not of the form 0. or .0.
S
p,
for some > 0) and A = lim
xp
f(x) exists in (0, ) then
limsup
xp
(f(x).g(x)) = ( lim
xp
f(x))(limsup
xp
g(x)).
Proof.
1. () Let A := liminf
xp
f(x) = limsup
xp
f(x)

1. We may consider 3
cases:
a) A 1. Let x
n
be a sequence in E p such that x
n
p as n
. In what follows we will prove that lim
n
f(x
n
) = A and hence
by Theorem 6.33 we can conclude that lim
xp
f(x) = A. Let > 0
be given. By assumption there exists
1
> 0 be such that A
inf
xSp,
1
f(x) and there exists
2
> 0 such that sup
xSp,
2
f(x) A+ .
Let := min
1
,
2
. Since x
n
p and x
n
Ep we can conclude
that x
n
S
p,
a.a. n. Therefore,
> 0 > 0 A inf
xSp,
f(x) f(x
n
) sup
xSp,
f(x) A+ a.a.n
This precisely means that lim
n
f(x
n
) = A.
b) A = . Let M > 0 be given. Since lim
0
inf
xSp,
f(x) = we may
conclude that there exists > 0 such that inf
xSp,
f(x) > M provided
0 < . Therefore
M > 0 > 0 s.t. x S
p,
f(x) > M.
xp
f(x) = .
c) A = . Let M < 0 be given. Since lim
0
sup
xSp,
f(x) = we
may conclude that there exists > 0 such that sup
xSp,
f(x) < M
provided 0 < . Therefore
M < 0 > 0 s.t. x S
p,
f(x) < M.
xp
f(x) = .
D Appendix: Limsup and liminf of functions 201
() Conversely, suppose that A := lim
xp
f(x)

1. We may consider 3
cases:
a) A 1. Let > 0 be given. Since lim
xp
f(x) = A there exists > 0
such that if x S
p,
then A < f(x) < A + . So for all 0 <
we have
A inf
xSp,
f(x) sup
xSp,
f(x) A+.
Passing to the limit (as 0) we get
A liminf
xp
f(x) limsup
xp
f(x) A+.
Since > 0 is arbitrary, it follows that,
A liminf
xp
f(x) limsup
xp
f(x) A,
i.e. that A = liminf
xp
f(x) = limsup
xp
f(x).
b) A = . Let M > 0 be given. There exists > 0 such that f(x) > M
provided x S
p,
. So for all 0 < we have inf
xSp,
f(x) M.
Passing to the limit (as 0) we get liminf
xp
f(x) M. Since
M > 0 is arbitrary it follows that liminf
xp
f(x) = . Also since
in general liminf
xp
f(x) limsup
xp
f(x) we can conclude that
limsup
xp
f(x) = .
c) A = . Let M < 0 be given. There exists > 0 such that f(x) < M
provided x S
p,
. So for all 0 < we have sup
xSp,
f(x) M.
Passing to the limit (as 0) we get limsup
xp
f(x) M. Since
M < 0 is arbitrary it follows that limsup
xp
f(x) = . Also
since in general liminf
xp
f(x) limsup
xp
f(x) we can conclude that
liminf
xp
f(x) = .
2. Let > 0
x S
p,
f(x) +g(x) sup
ySp,
f(y) + sup
ySp,
g(y).
So
sup
xSp,
(f(x) +g(x)) sup
ySp,
f(y) + sup
ySp,
g(y).
Passing to the limit as 0 in this equation then implies
limsup
xp
(f(x) +g(x)) limsup
xp
f(x) + limsup
xp
g(x).
3. We have,
liminf
xp
f(x) + liminf
xp
g(x) = (limsup
xp
(f(x)) + limsup
xp
(g(x)))
(limsup
xp
((f(x) +g(x))))
= liminf
xp
(f(x) +g(x)).
4. We may consider 3 cases:
a) liminf
xp
f(x) 1. We have
limsup
xp
g(x) = limsup
xp
(g(x) +f(x) f(x))
limsup
xp
(f(x) +g(x)) + limsup
xp
(f(x))
= limsup
xp
(f(x) +g(x)) liminf
xp
(f(x)).
Therefore
liminf
xp
f(x) + limsup
xp
g(x) limsup
xp
(f(x) +g(x)).
b) liminf
xp
f(x) = . By assumption liminf
xp
f(x) +
limsup
xp
g(x) cannot be , so either limsup
xp
g(x) 1
or limsup
xp
g(x) = . In both cases liminf
xp
f(x) +
limsup
xp
g(x) = . So obviously the claimed inequality holds true.
c) liminf
xp
f(x) = . By assumption liminf
xp
f(x) +limsup
xp
g(x)
cannot be , so either limsup
xp
g(x) 1 or limsup
xp
g(x) =
. Also note that in general liminf
xp
f(x) limsup
xp
f(x), so we
can conclude that limsup
xp
f(x) = and hence lim
xp
f(x) = .
In what follows we will show that limsup
xp
(f(x) + g(x)) = and
hence the desired inequality obviously holds true. First lets consider
the case where A := limsup
xp
g(x) 1. Let M > 0 be given.
lim
xp
f(x) = =
1
> 0 x S
p,1
f(x) M +[A[,
A = limsup
xp
g(x) =
2
> 0 0 <
2
sup
xSp,
g(x) > A/2.
Therefore if we let := min
1
,
2
then
0 < x S
p,
f(x) +g(x) M +[A[ +A/2 M.
Consequently
0 < sup
xSp,
(f(x) +g(x)) M.
Passing to the limit (as 0) we get limsup
xp
(f(x) + g(x)) M.
Since M is arbitrary we can conclude that limsup
xp
(f(x)+g(x)) = .
Now lets consider the case where limsup
xp
g(x) = . Let M > 0 be
given.
lim
xp
f(x) = =
1
> 0 x S
p,1
f(x) > M/2,
limsup
xp
g(x) = =
2
> 0 0 <
2
sup
xSp,
g(x) > M/2.
Therefore if we let := min
1
,
2
then
0 < x S
p,
f(x) +g(x) M/2 +M/2 = M.
Consequently
0 < sup
xSp,
(f(x) +g(x)) M.
Passing to the limit as 0 we get limsup
xp
(f(x)+g(x)) M. Since
M is arbitrary we can conclude that limsup
xp
(f(x) +g(x)) = .
5. By using item 4. we can write
liminf
xp
(g(x) +f(x)) = limsup
xp
(g(x) f(x))
(limsup
xp
(f(x)) + liminf
xp
(g(x)))
= liminf
xp
f(x) + limsup
xp
g(x).
6. By item 2. and item 4. we have
liminf
xp
f(x)+limsup
xp
g(x) limsup
xp
(f(x)+g(x)) limsup
xp
f(x)+limsup
xp
g(x)
But by item 1. we know
liminf
xp
f(x) = limsup
xp
f(x) = lim
xp
f(x).
Therefore
lim
xp
f(x) + limsup
xp
g(x) limsup
xp
(f(x) +g(x)) lim
xp
f(x) + limsup
xp
g(x)
Consequently
limsup
xp
(f(x) +g(x)) = lim
xp
f(x) + limsup
xp
g(x).
Similarly by item 3. and item 5. we have
liminf
xp
f(x)+liminf
xp
g(x) liminf
xp
(f(x)+g(x)) limsup
xp
f(x)+liminf
xp
g(x)
and so
lim
xp
f(x) + liminf
xp
g(x) liminf
xp
(f(x) +g(x)) lim
xp
f(x) + liminf
xp
g(x)
Therefore
liminf
xp
(f(x) +g(x)) = lim
xp
f(x) + liminf
xp
g(x).
7. By assumption there exists > 0 such that f(x), g(x) 0 for all x S
p,
.
For all 0 < we have
x S
p,
f(x).g(x) ( sup
ySp,
f(y))( sup
ySp,
g(y)),
and therefore,
sup
xSp,
(f(x).g(x)) ( sup
ySp,
f(y))( sup
ySp,
g(y)).
Letting 0 in this inequality implies,
lim
0
sup
xSp,
(f(x).g(x)) lim
0
[( sup
ySp,
f(y))( sup
ySp,
g(y))]
= lim
0
sup
ySp,
f(y). lim
0
sup
ySp,
g(y)
= (limsup
xp
f(x))(limsup
xp
g(x)).
provided (limsup
xp
f(x))(limsup
xp
g(x)) is not of the form 0 or 0
8. Note that by item 1. liminf
xp
f(x) = limsup
xp
f(x) = lim
xp
f(x); so
considering the previous item we just need to show that
limsup
xp
(f(x).g(x)) A. limsup
xp
g(x).
Let (0, A). By assumption there exists
1
> 0 such that f(x), g(x) 0
for all x S
p,1
. Since lim
xp
f(x) = A there exists
2
> 0 such that
f(x) for all x S
p,
2
and hence f(x)g(x) g(x) for all x S
p,
where = minv
1
,
2
. Therefore by item 3. of Proposition D.8 we can
conclude that
limsup
xp
(f(x).g(x)) limsup
xp
(g(x)) = . limsup
xp
g(x).
Now we may consider two cases:
a) If limsup
xp
g(x) < we may let A in order to see that
limsup
xp
(f(x).g(x)) A. limsup
xp
g(x).
b) If limsup
xp
(g(x)) = it follows that
limsup
xp
(f(x).g(x)) = A. limsup
xp
g(x) = . limsup
xp
g(x) = .
Remark D.10. By item 1. of the previous proposition if limit exists, then limsup
and liminf are both equal to the limit. So whenever the limit exists we may
replace the limit by limsup and liminf. The important point is that even if the
limit does not exist, limsup and liminf always exist. Therefore if we want to
take limit but we do not know whether the limit exists we should take limsup
or liminf instead.
Exercise D.2. Prove that if f(x) 0, then limsup
xp
f(x) = 0 if and only
if lim
xp
f(x) = 0. Conclude that, even if f is not necessarily nonnegative,
lim
xp
f(x) = c i limsup
xp
[f(x) c[ = 0.
Exercise D.3. Prove the followings:
S
p,
for some > 0) then
liminf
xp
(f(x).g(x)) (liminf
xp
f(x))(liminf
xp
g(x)),
S
p,
for some > 0) and A = lim
xp
f(x) exists in (0, ) then
liminf
xp
(f(x).g(x)) = ( lim
xp
f(x))(liminf
xp
g(x)).
The claim of the following theorem is similar to that of Exercise 3.14.
Theorem D.11. Let (X, d) be a metric space. Suppose E X and p is a limit
point of E. Let f be a real (or extended real) valued function on E and suppose
A 1.
1. limsup
xp
f(x) A if and only if for every > 0 > 0 such that
f(x) A+ provided x S
p,
.
2. limsup
xp
f(x) A if and only if for every > 0 > 0 x
,
S
p,
such that f(x
,
) A .
3. limsup
xp
f(x) = A if and only if for every > 0 > 0 such that
f(x) A+ provided x S
p,
, > 0 x
,
S
p,
such that f(x
,
) A
.
Proof.
1. () Let > 0 be given. lim
0
sup
xSp,
f(x) A so there exists > 0 such
that for all 0 < , sup
xSp,
f(x) A+. So in particular f(x) A+
provided x S
p,
.
2. () Conversely, if > 0 then by assumption there exists > 0 such
that f(x) A + on S
p,
. Therefore, by item 3. of Proposition D.8,
limsup
xp
f(x) limsup
xp
(A + ) = lim
xp
(A + ) = A + . Since
> 0 is arbitrary we can conclude that limsup
xp
f(x) A.
3. ()Let > 0 be given. Since inf
>0
sup
xSp,
f(x) A we can conclude
that for all > 0, sup
xSp,
f(x) A . Therefore for all > 0 there exists
x
,
S
p,
such that f(x
,
) A (otherwise sup
xSp,
f(x) A).
4. ()Conversely, if > 0 then the assumption implies that for
all > 0, sup
xSp,
f(x) A . Therefore limsup
xp
f(x) =
inf
>0
sup
xSp,
f(x) A . Since > 0 is arbitrary we can con-
clude that limsup
xp
f(x) A.
5. This one is a direct consequence of the previous items.
Exercise D.4. Let (X, d) be a metric space. Suppose E X and p is a limit
point of E. Let f be a real (or extended real) valued function on E and suppose
A 1.
1. liminf
xp
f(x) A if and only if for every > 0 > 0 such that
f(x) A provided x S
p,
.
2. liminf
xp
f(x) A if and only if for every > 0 > 0 x
,
S
p,
such
that f(x
,
) A+ .
3. liminf
xp
f(x) = A if and only if for every > 0 > 0 such that
f(x) A provided x S
p,
, > 0 x
,
S
p,
such that f(x
,
) A+
.
Theorem D.12. Let (X, d) be a metric space. Suppose E X and p is a limit
point of E. Let f be a real (or extended real) valued function on E. Suppose
limsup
xp
f(x) = A

1.
1. Ifx
n
n=1
is a sequence in E p such that p = lim
n
x
n
, then
limsup
n
f(x
n
) A.
2. There exists a sequencex
n
n=1
in E p such that p = lim
n
x
n
and
A = lim
n
f(x
n
).
Proof.
1. We may consider 3 cases:
Case 1: A = +. Clearly limsup
n
f(x
n
) A = +.
Case 2: A = . In general liminf
xp
f(x) limsup
xp
f(x) so we
may conclude that liminf
xp
f(x) = limsup
xp
f(x) = and hence
lim
xp
f(x) = . Since x
n
E p and x
n
p, it follows that
lim
n
f(x
n
) = . Therefore limsup
n
f(x
n
) = A = .
Case 3. A 1. Let > 0 be given. By Theorem D.11 there exists > 0
such that f(x) A+ for all x S
p,
. Since x
n
E p and x
n
p,
we have x
n
S
p,
a.a.n. Therefore f(x
n
) A + a.a.n. Consequently
limsup
n
f(x
n
) limsup
n
(A + ) = A + . Since > 0 is arbitrary
we can conclude that limsup
n
f(x
n
) A.
2. By item ii. of Theorem D.11, for all n 1 (let = = 1/n), there exists
x
n
S
p,1/n
such that f(x
n
) A1/n. Therefore
x
n
S
p,1/n
= d(x
n
, p) < 1/n, x
n
Ep = x
n
p, x
n
Ep
Moreover,
f(x
n
) A1/n = liminf
n
f(x
n
) liminf
n
(A1/n) = lim
n
(A1/n) = A
But by item i. limsup
n
f(x
n
) A and so
A liminf
n
f(x
n
) limsup
n
f(x
n
) A
Consequently liminf
n
f(x
n
) = limsup
n
f(x
n
) = A which implies
lim
n
f(x
n
) = A.
Exercise D.5. Let (X, d) be a metric space. Suppose E X and p is a limit
point of E. Let f be a real (or extended real) valued function on E. Suppose
liminf
xp
f(x) = A

1.
1. Ifx
n
n=1
is a sequence in E p such that p = lim
n
x
n
, then
liminf
n
f(x
n
) A.
n
n=1
in E p such that p = lim
n
x
n
and
A = lim
n
f(x
n
).
Remark D.13. Considering the above theorem and exercise, one can give the
following interpretation of limsup
xp
f(x) and liminf
xp
f(x):
limsup
xp
f(x) is the largest limiting value of f(x
n
)
n=1
over all sequences
x
n
n=1
E p that converge to p.
liminf
xp
f(x) is the smallest limiting value of f(x
n
)
n=1
over all se-
quences x
n
n=1
E p that converge to p.
Denition D.14. Let X = 1 and E = (a, b)be an open interval in 1 containing
p (or let E = (a, b) p where p (a, b)). Let f be a real (or extended real)
valued function on E. We dene
limsup
xp
f(x) = lim
0
sup
0<xp<
f(x),
liminf
xp
f(x) = lim
0
inf
0<xp<
f(x),
limsup
xp
f(x) = lim
0
sup
0<px<
f(x),
liminf
xp
f(x) = lim
0
inf
0<px<
f(x).
(Note that the set E can be unbounded.)
Remark D.15. Note that clearly if we let f
1
:= f[
(a,p)
and f
2
:= f[
(p,b)
then
limsup
xp
f(x) = limsup
xp
f
2
(x) , liminf
xp
f(x) = liminf
xp
f
2
(x),
limsup
xp
f(x) = limsup
xp
f
1
(x) , liminf
xp
f(x) = liminf
xp
f
1
(x).
So in fact the properties that were previously mentioned for limsup and liminf
of functions (e.g. Prposition 10) will hold true for one sided limsup and liminf
as well (with obvious modications).
Also it will be helpful to note that lim
xp
f(x) = A

1 i lim
xp
f
2
(x) =
A

1. Similarly lim
xp
f(x) = A

1 i lim
xp
f
1
(x) = A

1.
Corollary D.16. An immediate consequence of the above remark, item 1. of
Proposition D.9, and Theorem 6.41 is that if f is a real valued function on the
open interval E = (a, b) which contains p (or on E = (a, b) p where p
(a, b)) then
1. lim
xp
f(x) = A

1 i limsup
xp
f(x) = liminf
xp
f(x) = A

1.
2. lim
xp
f(x) = A

1 i limsup
xp
f(x) = liminf
xp
f(x) = A

1.
3. lim
xp
f(x) = A

1 i limsup
xp
f(x) = liminf
xp
f(x) =
limsup
xp
f(x) = liminf
xp
f(x) = A

1.
Theorem D.17. Let X = 1 and E = (a, b)be an open interval in 1 containing
p (or let E = (a, b) p where p (a, b)). If f is a real valued function on E
then
limsup
xp
f(x) = maxlimsup
xp
f(x), limsup
xp
f(x).
Proof. We have
> 0 sup
0<xp<
f(x) sup
0<]xp]<
f(x) = limsup
xp
f(x) limsup
xp
f(x),
> 0 sup
0<px<
f(x) sup
0<]xp]<
f(x) = limsup
xp
f(x) limsup
xp
f(x).
Therefore, maxlimsup
xp
f(x), limsup
xp
f(x) limsup
xp
f(x). To prove
the reverse inequality we will proceed as follows:
By Theorem D.12 there exists a sequencex
n
n=1
in E p such that
p = lim
n
x
n
and lim
n
f(x
n
) = limsup
xp
f(x). Clearly at least one
of the sets A
1
= n : x
n
(a, p) or A
2
= n : x
n
(p, b) is in-
nite. If A
1
is innite then x
n
has a subsequence y
k
k=1
(a, p). So
in particular y
k
p and lim
k
f(y
k
) = limsup
xp
f(x). Since y
k
p
we can conclude that limsup
k
f(y
k
) limsup
xp
f(x) (by Theorem D.12
and Remark D.15). But limsup
k
f(y
k
) = lim
k
f(y
k
) = limsup
xp
f(x)
and hence limsup
xp
f(x) limsup
xp
f(x). Similarly, if A
2
is innite
then limsup
xp
f(x) limsup
xp
f(x). So in either case limsup
xp
f(x)
maxlimsup
xp
f(x), limsup
xp
f(x).
Exercise D.6. Let X = 1 and E = (a, b)be an open interval in 1 containing
p (or let E = (a, b) p where p (a, b)). Let f be a real valued function on
E. Prove that
liminf
xp
f(x) = minliminf
xp
f(x), liminf
xp
f(x).
Example D.18. Let f(x) =
_
0 if x
1 if x
c
and p 1. Since and
c
are both
dense in 1, we have
> 0 sup
0<xp<
f(x) = 1 = lim
0
sup
0<xp<
f(x) = 1 = limsup
xp
f(x) = 1,
> 0 sup
0<px<
f(x) = 1 = lim
0
sup
0<px<
f(x) = 1 = limsup
xp
f(x) = 1,
> 0 inf
0<xp<
f(x) = 0 = lim
0
inf
0<xp<
f(x) = 0 = liminf
xp
f(x) = 0,
> 0 inf
0<px<
f(x) = 0 = lim
0
inf
0<px<
f(x) = 0 = liminf
xp
f(x) = 0.
Also by Theorem D.17 and Exercise D.6, limsup
xp
f(x) = max1, 1 = 1 and
liminf
xp
f(x) = min0, 0 = 0. In particular, considering Corollary D.16, we
can conclude that lim
xp
f(x) does not exist.
Example D.19. Let
f(x) =
_
_
_
1/x if x 0
1 if x
c
0 if x = 0
.
Lets determine one sided limsup and liminf of f as x approches 0. First
note that for x
n
= 1/n (0, 2) we have lim
n
x
n
= 0 and
lim
n
f(x
n
) = . So by Theorem D.12 and Remark D.15 we can con-
clude that = lim
n
f(x
n
) = limsup
n
f(x
n
) limsup
x0
f(x) and
therefore limsup
x0
f(x) = . Similarly, for x
n
= 1/n (2, 0) we
have lim
n
x
n
= 0 and lim
n
f(x
n
) = . So by Exercise D.5 and Re-
mark D.15 we can conclude that - = lim
n
f(x
n
) = liminf
n
f(x
n
)
liminf
x0
f(x) and therefore liminf
x0
f(x) = . Also since
c
is dense in 1,
> 0 sup
0<0x<
f(x) = 1 = lim
0
sup
0<0x<
f(x) = 1 = limsup
x0
f(x) = 1,
0 < <
1
2
inf
0<x0<
f(x) = 1 = lim
0
inf
0<x0<
f(x) = 1 = liminf
x0
f(x) = 1.
Now, clearly limsup
x0
f(x) = max1, = and liminf
xp
f(x) =
min1, = .
Denition D.20. Let f be a real (or extended real) valued function on 1. We
dene
liminf
x
f(x) := lim
t
inf
xt
f(x) = sup
t>0
inf
xt
f(x),
limsup
x
f(x) := lim
t
sup
xt
f(x) = inf
t>0
sup
xt
f(x).
Previously mentioned properties for limsup and liminf of a function at a
nite point hold for limsup and liminf at innity as well (with obvious modi-
cations). For completeness here we will state those prpoperties again.
Proposition D.21. Let f be a real (or extended real) valued function on 1.
1. liminf
x
(f(x)) = limsup
x
f(x).
2. liminf
x
f(x) limsup
x
f(x)
3. If g : 1
1 is such that f(x) g(x) on [M, ) for some M > 0, then

liminf
x
f(x) liminf
x
g(x) , limsup
x
f(x) limsup
x
g(x).
(Proof is completely analogous to the proofs of the corresponding items of
Proposition D.8.)
Proposition D.22. Let f and g be a real valued function on 1, i.e., f : 11
and g : 11 .
1. lim
x
f(x) exists in

1 i liminf
x
f(x) = limsup
x
f(x)

1. In this
case lim
x
f(x) = liminf
x
f(x) = limsup
x
f(x).
2. limsup
x
(f(x) + g(x)) limsup
x
f(x) + limsup
x
g(x),provided
the right side of this equation is not of the form or +.
3. liminf
x
f(x) +liminf
x
g(x) liminf
x
(f(x) +g(x)),provided the
left side of this equation is not of the form or +.
4. liminf
x
f(x) + limsup
x
g(x) limsup
x
(f(x) + g(x)), provided
the left side of this equation is not of the form or +.
5. liminf
x
(f(x)+g(x)) liminf
x
f(x)+limsup
x
g(x),provided the
6. lim
x
f(x) exists in 1 then,
limsup
x
(f(x) +g(x)) = lim
x
f(x) + limsup
x
g(x),
liminf
x
(f(x) +g(x)) = lim
x
f(x) + liminf
x
g(x).
7. If f(x), g(x) 0 on [M, ) for some M > 0, then
limsup
x
(f(x).g(x)) (limsup
x
f(x))(limsup
x
g(x)),
8. If f(x), g(x) 0 on [M, ) for some M > 0 and A = lim
x
f(x) exists
in (0, ) then
limsup
x
(f(x).g(x)) = ( lim
x
f(x))(limsup
x
g(x)).
(Proof is completely analogous to the proofs of the corresponding items of
Proposition D.9.)
Exercise D.7. Prove that if f(x) 0, then limsup
x
f(x) = 0 if and only
if lim
x
f(x) = 0. Conclude that, even if f is not necessarily nonnegative,
lim
x
f(x) = c i limsup
x
[f(x) c[ = 0.
Exercise D.8. Prove the followings:
1. If f(x), g(x) 0 on [M, ) for some M > 0 then
liminf
x
(f(x).g(x)) (liminf
x
f(x))(liminf
x
g(x)),
2. If f(x), g(x) 0 on [M, ) for some M > 0 and A = lim
xp
f(x) exists in
(0, ) then
liminf
x
(f(x).g(x)) = ( lim
x
f(x))(liminf
x
g(x)).
Theorem D.23. Let f be a real (or extended real) valued function on 1 and
suppose A 1.
1. limsup
x
f(x) A if and only if for every > 0 M > 0 such that
f(x) A+ provided x M.
2. limsup
x
f(x) A if and only if for every > 0 M > 0 x
M,
M
such that f(x
M,
) A .
3. limsup
x
f(x) = A if and only if for every > 0 M > 0 such that
f(x) A + provided x M, M > 0 x
M,
M such that f(x
M,
)
A .
(Proof is completely analogous to the proof of Theorem D.11.)
Exercise D.9. Let f be a real (or extended real) valued function on 1 and
suppose A 1.
1. liminf
x
f(x) A if and only if for every > 0 M > 0 such that
f(x) A provided x M.
2. liminf
x
f(x) A if and only if for every > 0 M > 0 x
M,
M
such that f(x
M,
) A+ .
3. liminf
x
f(x) = A if and only if for every > 0 M > 0 such that
f(x) A provided x M,
M > 0 x
M,
M such that f(x
M,
) A+ .
Theorem D.24. Let f be a real (or extended real) valued function on 1. Sup-
pose limsup
x
f(x) = A

1.
1. Ifx
n
n=1
is a sequence of real numbers such that lim
n
x
n
= , then
limsup
n
f(x
n
) A.
n
n=1
of real numbers such that lim
n
x
n
=
and A = lim
n
f(x
n
).
(Proof is completely analogous to the proof of Theorem D.12.)
Exercise D.10. Let f be a real (or extended real) valued function on 1. Sup-
pose liminf
x
f(x) = A

1.
1. Ifx
n
n=1
is a sequence of real numbers such that lim
n
x
n
= , then
liminf
n
f(x
n
) A.
n
n=1
of real numbers such that lim
n
x
n
=
and A = lim
n
f(x
n
).
Example D.25. Let f(x) = sin x. We have
t sup
xt
f(x) = 1 = lim
t
sup
xt
f(x) = 1 = limsup
x
f(x) = 1,
t inf
xt
f(x) = 1 = lim
t
inf
xt
f(x) = 1 = liminf
x
f(x) = 1.
Example D.26. Let f(x) = xsin(x
2
).
For
_
x
n
=
_
2n +/2
_
n=1
we have lim
n
x
n
= and
lim
n
f(x
n
) = lim
n
_
2n +/2 sin(2n +/2) = lim
n
_
2n +/2 =
So by Theorem D.24 we can conclude that
= lim
n
f(x
n
) = limsup
n
f(x
n
) limsup
x
f(x)
and therefore limsup
x
f(x) = .
Similarly, for
_
x
n
=
_
2n + 3/2
_
n=1
we have lim
n
x
n
= and
lim
n
f(x
n
) = lim
n
_
2n +
3
2
sin
_
2n +
3
2
_
= lim
n
_
2n + 3/2 =
So by Exercise D.10 we can conclude that
= lim
n
f(x
n
) = liminf
n
f(x
n
) liminf
x
f(x)
and therefore liminf
x
f(x) =
E
Appendix: Countably Additive Measures
Let / 2
be an algebra and : / [0, ] be a nitely additive measure.

Recall that is a premeasure on / if is additive on /. If is a
premeasure on / and / is a algebra (Denition ??), we say that is a
measure on (, /) and that (, /) is a measurable space.
Denition E.1. Let (, B) be a measurable space. We say that P : B [0, 1] is
a probability measure on (, B) if P is a measure on B such that P () = 1.
In this case we say that (, B, P) a probability space.
E.1 Overview
The goal of this chapter is develop methods for proving the existence of proba-
bility measures with desirable properties. The main results of this chapter may
are summarized in the following theorem.
Theorem E.2. A nitely additive probability measure P on an algebra, /
2
, extends to additive measure on (/) i P is a premeasure on /. If the

extension exists it is unique.
Proof. The uniqueness assertion is proved Proposition E.15 below. The
existence assertion of the theorem in the content of Theorem E.27.
In order to use this theorem it is necessary to determine when a nitely ad-
ditive probability measure in is in fact a premeasure. The following Proposition
is sometimes useful in this regard.
Proposition E.3 (Equivalent premeasure conditions). Suppose that P is
a nitely additive probability measure on an algebra, / 2
. Then the following

are equivalent:
1. P is a premeasure on /, i.e. P is additive on /.
2. For all A
n
/ such that A
n
A /, P (A
n
) P (A) .
3. For all A
n
/ such that A
n
A /, P (A
n
) P (A) .
4. For all A
n
/ such that A
n
, P (A
n
) 1.
5. For all A
n
/ such that A
n
, P (A
n
) 0.
Proof. We will start by showing 1 2 3.
1. = 2. Suppose A
n
/ such that A
n
A /. Let A
t
n
:= A
n
A
n1
with A
0
:= . Then A
t
n
n=1
are disjoint, A
n
=
n
k=1
A
t
k
and A =
k=1
A
t
k
.
Therefore,
P (A) =
k=1
P (A
t
k
) = lim
n
n
k=1
P (A
t
k
) = lim
n
P (
n
k=1
A
t
k
) = lim
n
P (A
n
) .
2. = 1. If A
n
n=1
/ are disjoint and A :=
n=1
A
n
/, then
N
n=1
A
n
A. Therefore,
P (A) = lim
N
P
_
N
n=1
A
n
_
= lim
N
N
n=1
P (A
n
) =
n=1
P (A
n
) .
2. = 3. If A
n
/ such that A
n
A /, then A
c
n
A
c
and therefore,
lim
n
(1 P (A
n
)) = lim
n
P (A
c
n
) = P (A
c
) = 1 P (A) .
3. = 2. If A
n
/ such that A
n
A /, then A
c
n
A
c
and therefore we
again have,
lim
n
(1 P (A
n
)) = lim
n
P (A
c
n
) = P (A
c
) = 1 P (A) .
The same proof used for 2. 3. shows 4. 5 and it is clear that
3. = 5. To nish the proof we will show 5. = 2.
5. = 2. If A
n
/ such that A
n
A /, then A A
n
and therefore
lim
n
[P (A) P (A
n
)] = lim
n
P (A A
n
) = 0.
Remark E.4. Observe that the equivalence of items 1. and 2. in the above propo-
sition hold without the restriction that P () = 1 and in fact P () = may
be allowed for this equivalence.
Lemma E.5. If : / [0, ] is a premeasure, then is countably sub-
additive on /.
210 E Appendix: Countably Additive Measures
Proof. Suppose that A
n
/ with
n=1
A
n
/. Let A
1
:= A
1
and for
n 2, let A
t
n
:= A
n
(A
1
. . . A
n1
) /. Then
n=1
A
n
=
n=1
A
t
n
and
therefore by the countable additivity and monotonicity of we have,
(
n=1
A
n
) =
_

n=1
A
t
n
_
=
n=1
(A
t
n
)
n=1
(A
n
) .
Let us now specialize to the case where = 1 and / =
/((a, b] 1 : a b ) . In this case we will describe proba-
bility measures, P, on B
R
by their cumulative distribution functions.
Denition E.6. Given a probability measure, P on B
R
, the cumulative dis-
tribution function (CDF) of P is dened as the function, F = F
P
: 1 [0, 1]
given as
F (x) := P ((, x]) . (E.1)
Example E.7. Suppose that
P = p
1
+q
1
+r
with p, q, r > 0 and p +q +r = 1. In this case,

F (x) =
_
_
0 for x < 1
p for 1 x < 1
p +q for 1 x <
1 for x <
.
A plot of F (x) with p = .2, q = .3, and r = .5.
Lemma E.8. If F = F
P
: 1 [0, 1] is a distribution function for a probability
measure, P, on B
R
, then:
1. F is non-decreasing,
2. F is right continuous,
3. F () := lim
x
F (x) = 0, and F () := lim
x
F (x) = 1.
Proof. The monotonicity of P shows that F (x) in Eq. (E.1) is non-
decreasing. For b 1 let A
n
= (, b
n
] with b
n
b as n . The continuity
of P implies
F(b
n
) = P((, b
n
]) ((, b]) = F(b).
Since b
n
n=1
was an arbitrary sequence such that b
n
b, we have shown
F (b+) := lim
yb
F(y) = F(b). This show that F is right continuous. Similar
arguments show that F () = 1 and F () = 0.
It turns out that Lemma E.8 has the following important converse.
Theorem E.9. To each function F : 1 [0, 1] satisfying properties 1. 3.. in
Lemma E.8, there exists a unique probability measure, P
F
, on B
R
such that
P
F
((a, b]) = F (b) F (a) for all < a b < .
Proof. The uniqueness assertion is proved in Corollary E.17 below or see
Exercises E.2 and ?? below. The existence portion of the theorem is a special
case of Theorem E.33 below.
Example E.10 (Uniform Distribution). The function,
F (x) :=
_
_
_
0 for x 0
x for 0 x < 1
1 for 1 x <
,
is the distribution function for a measure, m on B
R
which is concentrated on
(0, 1]. The measure, m is called the uniform distribution or Lebesgue mea-
sure on (0, 1].
With this summary in hand, let us now start the formal development. We
begin with uniqueness statement in Theorem E.2.
E.2 Theorem
Recall that a collection, T 2
, is a class or system if it is closed

under nite intersections. We also need the notion of a system.
Denition E.11 ( system). A collection of sets, / 2
, is class or
system if
a. /
E.2 Theorem 211
Fig. E.1. The cumulative distribution function for the uniform distribution.
b. If A, B / and A B, then B A /. (Closed under proper dierences.)
c. If A
n
/ and A
n
A, then A /. (Closed under countable increasing
unions.)
Remark E.12. If / is a collection of subsets of which is both a class and
a system then / is a algebra. Indeed, since A
c
= A, we see that
any - system is closed under complementation. If / is also a system, it is
closed under intersections and therefore / is an algebra. Since / is also closed
under increasing unions, / is a algebra.
Lemma E.13 (Alternate Axioms for a System*). Suppose that /
2
is a collection of subsets . Then / is a class i satises the following

postulates:
1. /
2. A / implies A
c
/. (Closed under complementation.)
3. If A
n
n=1
/ are disjoint, then
n=1
A
n
/. (Closed under disjoint
unions.)
Proof. Suppose that / satises a. c. above. Clearly then postulates 1. and
2. hold. Suppose that A, B / such that A B = , then A B
c
and
A
c
B
c
= B
c
A /.
Taking complements of this result shows A B / as well. So by induction,
B
m
:=
m
n=1
A
n
/. Since B
m

n=1
A
n
it follows from postulate c. that
n=1
A
n
/.
Now suppose that / satises postulates 1. 3. above. Notice that /
and by postulate 3., / is closed under nite disjoint unions. Therefore if A, B
/ with A B, then B
c
/ and A B
c
= allows us to conclude that
A B
c
/. Taking complements of this result shows B A = A
c
B / as
well, i.e. postulate b. holds. If A
n
/ with A
n
A, then B
n
:= A
n
A
n1
/
for all n, where by convention A
0
= . Hence it follows by postulate 3 that
n=1
A
n
=
n=1
B
n
/.
Theorem E.14 (Dynkins Theorem). If / is a class which contains
a class, T, then (T) /.
Proof. We start by proving the following assertion; for any element C /,
the collection of sets,
/
C
:= D / : C D / ,
is a system. To prove this claim, observe that: a. /
C
, b. if A B with
A, B /
C
, then A C, B C / with A C B C and therefore,
(B A) C = [B C] A = [B C] [A C] /.
This shows that /
C
is closed under proper dierences. c. If A
n
/
C
with
A
n
A, then A
n
C / and A
n
C AC /, i.e. A /
C
. Hence we have
veried /
C
is still a system.
For the rest of the proof, we may assume without loss of generality that /
is the smallest class containing T if not just replace / by the intersection
of all classes containing T. Then for C T we know that /
C
/ is a
- class containing T and hence /
C
= /. Since C T was arbitrary, we have
shown, C D / for all C T and D /. We may now conclude that if
C /, then T /
C
/ and hence again /
C
= /. Since C / is arbitrary,
we have shown CD / for all C, D /, i.e. / is a system. So by Remark
E.12, / is a algebra. Since (T) is the smallest algebra containing T it
follows that (T) /.
As an immediate corollary, we have the following uniqueness result.
Proposition E.15. Suppose that T 2
is a system. If P and Q are two

probability
1
measures on (T) such that P = Q on T, then P = Q on (T) .
Proof. Let / := A (T) : P (A) = Q(A) . One easily shows / is a
class which contains T by assumption. Indeed, T /, if A, B / with
A B, then
P (B A) = P (B) P (A) = Q(B) Q(A) = Q(B A)
1
More generally, P and Q could be two measures such that P () = Q() < .
so that B A /, and if A
n
/ with A
n
A, then P (A) = lim
n
P (A
n
) =
lim
n
Q(A
n
) = Q(A) which shows A /. Therefore (T) / = (T) and
the proof is complete.
Example E.16. Let := a, b, c, d and let and be the probability measure
on 2
determined by, (x) =

1
4
for all x and (a) = (d) =
1
8
and
(b) = (c) = 3/8. In this example,
/ :=
_
A 2
: P (A) = Q(A)
_
is system which is not an algebra. Indeed, A = a, b and B = a, c are in
/ but A B / /.
Exercise E.1. Suppose that and are two measures (not assumed to be
nite) on a measure space, (, B) such that = on a system, T. Further
assume B = (T) and there exists
n
T such that; i) (
n
) = (
n
) <
for all n and ii)
n
as n . Show = on B.
Hint: Consider the measures,
n
(A) := (A
n
) and
n
(A) =
(A
n
) .
Corollary E.17. A probability measure, P, on (1, B
R
) is uniquely determined
by its cumulative distribution function,
F (x) := P ((, x]) .
Proof. This follows from Proposition E.15 wherein we use the fact that
T := (, x] : x 1 is a system such that B
R
= (T) .
Remark E.18. Corollary E.17 generalizes to 1
n
. Namely a probability measure,
P, on (1
n
, B
R
n) is uniquely determined by its CDF,
F (x) := P ((, x]) for all x 1
n
where now
(, x] := (, x
1
] (, x
2
] (, x
n
].
E.2.1 A Density Result*
Exercise E.2 (Density of / in (/)). Suppose that / 2
is an algebra,
B := (/) , and P is a probability measure on B. Let (A, B) := P (AB) .
The goal of this exercise is to use the theorem to show that / is dense in
B relative to the metric, . More precisely you are to show using the following
outline that for every B B there exists A / such that that P (AB) < .
1. Recall from Exercise ?? that (a, B) = P (AB) = E[1
A
1
B
[ .
2. Observe; if B = B
i
and A =
i
A
i
, then
B A =
i
[B
i
A]
i
(B
i
A
i
)
i
A
i
B
i
and
A B =
i
[A
i
B]
i
(A
i
B
i
)
i
A
i
B
i
so that
AB
i
(A
i
B
i
) .
3. We also have
(B
2
B
1
) (A
2
A
1
) = B
2
B
c
1
(A
2
A
1
)
c
= B
2
B
c
1
(A
2
A
c
1
)
c
= B
2
B
c
1
(A
c
2
A
1
)
= [B
2
B
c
1
A
c
2
] [B
2
B
c
1
A
1
]
(B
2
A
2
) (A
1
B
1
)
and similarly,
(A
2
A
1
) (B
2
B
1
) (A
2
B
2
) (B
1
A
1
)
so that
(A
2
A
1
) (B
2
B
1
) (B
2
A
2
) (A
1
B
1
) (A
2
B
2
) (B
1
A
1
)
= (A
1
B
1
) (A
2
B
2
) .
4. Observe that A
n
B and A
n
A, then
P (B A
n
) = P (B A
n
) +P (A
n
B)
P (B A) +P (A B) = P (AB) .
5. Let / be the collection of sets B B for which the assertion of the theorem
holds. Show / is a system which contains /.
E.3 Construction of Measures
Denition E.19. Given a collection of subsets, c, of , let c
denote the col-

lection of subsets of which are nite or countable unions of sets from c.
Similarly let c
denote the collection of subsets of which are nite or count-

able intersections of sets from c. We also write c
= (c
and c
= (c
,
etc.
Lemma E.20. Suppose that / 2
is an algebra. Then:
E.3 Construction of Measures 213
1. /
is closed under taking countable unions and nite intersections.

2. /
is closed under taking countable intersections and nite unions.

3. A
c
: A /
= /
and A
c
: A /
= /
.
Proof. By construction /
is closed under countable unions. Moreover if

A =
i=1
A
i
and B =
j=1
B
j
with A
i
, B
j
/, then
A B =
i,j=1
A
i
B
j
/
,
which shows that /
is also closed under nite intersections. Item 3. is straight

forward and item 2. follows from items 1. and 3.
Remark E.21. Let us recall from Proposition E.3 and Remark E.4 that a nitely
additive measure : / [0, ] is a premeasure on / i (A
n
) (A) for all
A
n
n=1
/ such that A
n
A /. Furthermore if () < , then is a
premeasure on / i (A
n
) 0 for all A
n
n=1
/ such that A
n
.
Proposition E.22. Given a premeasure, : / [0, ] , we extend to /
by dening
(B) := sup (A) : / A B . (E.2)
This function : /
[0, ] then satises;

1. (Monotonicity) If A, B /
with A B then (A) (B) .

2. (Continuity) If A
n
/ and A
n
B /
, then (A
n
) (B) as n .
3. (Strong Additivity) If A, B /
, then
(A B) +(A B) = (A) +(B) . (E.3)
4. (Sub-Additivity on /
) The function is sub-additive on /
, i.e. if
A
n
n=1
/
, then
(
n=1
A
n
)
n=1
(A
n
) . (E.4)
5. ( - Additivity on /
) The function is countably additive on /
.
Proof. 1. and 2. Monotonicity follows directly from Eq. (E.2) which then
implies (A
n
) (B) for all n. Therefore M := lim
n
(A
n
) (B) .
To prove the reverse inequality, let / A B. Then by the continuity of
on / and the fact that A
n
A B A = A, we have (A
n
A) (A) . As
(A
n
) (A
n
A) for all n it then follows that M := lim
n
(A
n
) (A) .
As A / with A B was arbitrary we may conclude,
(B) = sup (A) : / A B M.
3. Suppose that A, B /
and A
n
n=1
and B
n
n=1
are sequences in /
such that A
n
A and B
n
B as n . Then passing to the limit as n
in the identity,
(A
n
B
n
) +(A
n
B
n
) = (A
n
) +(B
n
)
proves Eq. (E.3). In particular, it follows that is nitely additive on /
.
4 and 5. Let A
n
n=1
be any sequence in /
and choose A
n,i
i=1
/
such that A
n,i
A
n
as i . Then we have,
N
n=1
A
n,N
_
n=1
(A
n,N
)
N
n=1
(A
n
)
n=1
(A
n
) . (E.5)
Since /
N
n=1
A
n,N

n=1
A
n
/
, we may let N in Eq. (E.5) to

conclude Eq. (E.4) holds. If we further assume that A
n
n=1
/
are pairwise
disjoint, by the nite additivity and monotonicity of on /
, we have
n=1
(A
n
) = lim
N
N
n=1
(A
n
) = lim
N
N
n=1
A
n
_
(
n=1
A
n
) .
This inequality along with Eq. (E.4) shows that is additive on /
.
Suppose is a nite premeasure on an algebra, / 2
, and A /
.
Since A, A
c
/
and = AA
c
, it follows that () = (A) +(A
c
) . From
this observation we may extend to a function on /
by dening
(A) := () (A
c
) for all A /
. (E.6)
Lemma E.23. Suppose is a nite premeasure on an algebra, / 2
, and
has been extended to /
as described in Proposition E.22 and Eq. (E.6)

above.
1. If A /
then (A) = inf (B) : A B / .

2. If A /
and A
n
/ such that A
n
A, then (A) = lim
n
(A
n
) .
3. is strongly additive when restricted to /
.
4. If A /
and C /
such that A C, then (C A) = (C) (A) .

Proof.
1. Since (B) = () (B
c
) and A B i B
c
A
c
, it follows that
inf (B) : A B / = inf () (B
c
) : / B
c
A
c
= () sup (B) : / B A
c
= () (A
c
) = (A) .
2. Similarly, since A
c
n
A
c
/
, by the denition of (A) and Proposition

E.22 it follows that
(A) = () (A
c
) = () lim
n
(A
c
n
)
= lim
n
[() (A
c
n
)] = lim
n
(A
n
) .
3. Suppose A, B /
and A
n
, B
n
/ such that A
n
A and B
n
B, then
A
n
B
n
A B and A
n
B
n
A B and therefore,
(A B) +(A B) = lim
n
[(A
n
B
n
) +(A
n
B
n
)]
= lim
n
[(A
n
) +(B
n
)] = (A) +(B) .
All we really need is the nite additivity of which can be proved as follows.
Suppose that A, B /
are disjoint, then AB = implies A

c
B
c
= .
So by the strong additivity of on /
it follows that
() +(A
c
B
c
) = (A
c
) +(B
c
)
(A B) = () (A
c
B
c
)
= () [(A
c
) +(B
c
) ()]
= (A) +(B) .
4. Since A
c
, C /
we may use the strong additivity of on /
to conclude,
(A
c
C) +(A
c
C) = (A
c
) +(C) .
Because = A
c
C, and (A
c
) = () (A) , the above equation may
be written as
() +(C A) = () (A) +(C)
which nishes the proof.
Notation E.24 (Inner and outer measures) Let : / [0, ) be a nite
premeasure extended to /
as above. The for any B let
(B) := sup (A) : /
A B and
(B) := inf (C) : B C /
.
We refer to
(B) and
(B) as the inner and outer content of B respec-

tively.
If B has the same inner and outer content it is reasonable to dene the
measure of B as this common value. As we will see in Theorem E.27 below, this
extension becomes a additive measure on a algebra of subsets of .
Denition E.25 (Measurable Sets). Suppose is a nite premeasure on an
algebra / 2
. We say that B is measurable if
(B) =
(B) . We
will denote the collection of measurable subsets of by B = B () and dene
: B [0, ()] by
(B) :=
(B) =
(B) for all B B. (E.7)

Remark E.26. Observe that
(B) =
(B) i for all > 0 there exists A /
and C /
such that A B C and

(C A) = (C) (A) < , (E.8)
wherein we have used Lemma E.23 for the rst equality. Moreover we will use
below that if B B and /
A B C /
, then
(A)
(B) = (B) =
(B) (C) . (E.9)

Theorem E.27 (Finite Premeasure Extension Theorem). Suppose is
a nite premeasure on an algebra / 2
and : B := B () [0, ()] be as

in Denition E.25. Then B is a algebra on which contains / and is a
additive measure on B. Moreover, is the unique measure on B such that
[
,
= .
Proof. 1. B is an algebra. It is clear that / B and that B is closed
under complementation see Eq. (E.8) and use the fact that A
c
C
c
= C A.
Now suppose that B
i
B for i = 1, 2 and > 0 is given. We may then
choose A
i
B
i
C
i
such that A
i
/
, C
i
/
, and (C
i
A
i
) < for
i = 1, 2. Then with A = A
1
A
2
, B = B
1
B
2
and C = C
1
C
2
, we have
/
A B C /
. Since
C A = (C
1
A) (C
2
A) (C
1
A
1
) (C
2
A
2
) ,
it follows from the sub-additivity of that with
(C A) (C
1
A
1
) +(C
2
A
2
) < 2.
Since > 0 was arbitrary, we have shown that B B which completes the proof
that B is an algebra.
2. B is a algebra. As we know B is an algebra, to show B is a algebra
it suces to show that B =
n=1
B
n
B whenever B
n
n=1
is a disjoint
sequence in B. To this end, let > 0 be given and choose A
i
B
i
C
i
such
E.3 Construction of Measures 215
that A
i
/
, C
i
/
, and (C
i
A
i
) < 2
i
for all i. Let C :=
i=1
C
i
/
and for n N let A

n
:=
n
i=1
A
i
/
. Since the A
i
i=1
are pairwise disjoint
we may use Lemma E.23 to show,
n
i=1
(C
i
) =
n
i=1
((A
i
) +(C
i
A
i
))
= (A
n
) +
n
i=1
(C
i
A
i
) () +
n
i=1
2
i
which on letting n implies
i=1
(C
i
) () + < . (E.10)
Using
C A
n
=
i=1
(C
i
A
n
) [
n
i=1
(C
i
A
i
)]
_
i=n+1
C
i
,
and the sub-additivity of on /
it follows that
(C A
n
)
n
i=1
(C
i
A
i
) +
i=n+1
(C
i
)
n
i=1
2
i
+
i=n+1
(C
i
)
+
i=n+1
(C
i
) as n ,
wherein we have used Eq. (E.10) in computing the limit. In summary, B =
i=1
B
i
, /
A
n
B C /
, C A
n
/
with (CA
n
) 2 for all n
suciently large. Since > 0 is arbitrary, it follows that B B.
3. is a measure. Continuing the notation in step 2, we have
i=1
(A
i
)
n
i=1
(A
i
) = (A
n
) (B) (C)
i=1
(C
i
) . (E.11)
On the other hand, since A
i
B
i
C
i
, it follows (see Eq. (E.9)) that (A
i
)
(B
i
) (C
i
) and therefore that
i=1
(A
i
)
i=1
(B
i
)
i=1
(C
i
) . (E.12)
Equations (E.11) and (E.12) show that (B) and
i=1
(B
i
) are both between
i=1
(A
i
) and
i=1
(C
i
) and so
(B)
i=1
(B
i
)
i=1
(C
i
)
i=1
(A
i
) =
i=1
(C
i
A
i
)
i=1
2
i
= .
Since > 0 is arbitrary, we have shown (B) =
i=1
(B
i
) , i.e. is a measure
on B.
Since we really had no choice as to how to extend , it is to be expected
that the extension is unique. You are asked to supply the details in Exercise
E.3 below.
Exercise E.3. Let , , /, and B := B () be as in Theorem E.27. Further
suppose that B
0
2
is a algebra such that / B

0
B and : B
0

[0, ()] is a additive measure on B
0
such that = on /. Show that
= on B
0
as well. (When B
0
= (/) this exercise is of course a consequence
of Proposition E.15. It is not necessary to use this information to complete the
exercise.)
Corollary E.28. Suppose that / 2
is an algebra and : B
0
:= (/)
[0, ()] is a additive nite measure. Then for every B (/) and > 0;
1. there exists /
A B C /
and > 0 such that (C A) < and

2. there exists A / such that (AB) < .
Exercise E.4. Prove corollary E.28 by considering where := [
,
. Hint:
you may nd Exercise ?? useful here.
Theorem E.29 ( - Finite Premeasure Extension Theorem). Suppose
that is a nite premeasure on an algebra /. Then
(B) := inf (C) : B C /
B (/) (E.13)
denes a measure on (/) and this measure is the unique extension of on /
to a measure on (/) . Recall that
(C) = sup (A) : / A C for all C /
.
Proof. Let
n
n=1
/ be chosen so that (
n
) < for all n and
n

as n and let
n
(A) :=
n
(A
n
) for all A /.
Each
n
is a premeasure (as is easily veried) on / and hence by Theorem E.27
each
n
has an extension,
n
, to a measure on (/) . Since the measure
n
are
increasing, := lim
n

n
is a measure which extends .
The proof will be completed by verifying that Eq. (E.13) holds. Let B
(/) , B
m
=
m
B and > 0 be given. By Theorem E.27, there exists
C
m
/
such that B
m
C
m

m
and (C
m
B
m
) =
m
(C
m
B
m
) < 2
n
.
Then C :=
m=1
C
m
/
and
(C B)
_

_
m=1
(C
m
B)
_
m=1
(C
m
B)
m=1
(C
m
B
m
) < .
Thus
(B) (C) = (B) + (C B) (B) +
which, since > 0 is arbitrary, shows satises Eq. (E.13). The uniqueness of
the extension is the subject of Exercise ??.
The following slight reformulation of Theorem E.29 can be useful.
Corollary E.30. Let / be an algebra of sets,
m
m=1
/ is a given sequence
of sets such that
m
as m . Let
/
f
:= A / : A
m
for some m N .
Notice that /
f
is a ring, i.e. closed under dierences, intersections and unions
and contains the empty set. Further suppose that : /
f
[0, ) is an additive
set function such that (A
n
) 0 for any sequence, A
n
/
f
such that A
n

as n . Then extends uniquely to a nite measure on /.
Proof. Existence. By assumption,
m
:= [
,m
: /
m
[0, ) is a
premeasure on (
m
, /
m
) and hence by Theorem E.29 extends to a measure
t
m
on (
m
, (/
m
) = B
m
) . Let
m
(B) :=
t
m
(B
m
) for all B B.
Then
m
m=1
is an increasing sequence of measure on (, B) and hence :=
lim
m

m
denes a measure on (, B) such that [
,f
= .
Uniqueness. If
1
and
2
are two such extensions, then
1
(
m
B) =
2
(
m
B) for all B / and therefore by Proposition E.15 or Exercise ?? we
know that
1
(
m
B) =
2
(
m
B) for all B B. We may now let m
to see that in fact
1
(B) =
2
(B) for all B B, i.e.
1
=
2
.
E.4 Radon Measures on 1
We say that a measure, , on (1, B
R
) is a Radon measure if ([a, b]) <
for all < a < b < . In this section we will give a characterization of all
Radon measures on 1. We rst need the following general result characterizing
premeasures on an algebra generated by a semi-algebra.
Proposition E.31. Suppose that o 2
is a semi-algebra, / = /(o) and

: / [0, ] is a nitely additive measure. Then is a premeasure on / i
is countably sub-additive on o.
Proof. Clearly if is a premeasure on / then is - additive and hence
sub-additive on o. Because of Proposition ??, to prove the converse it suces
to show that the sub-additivity of on o implies the sub-additivity of on /.
So suppose A =
n=1
A
n
/ with each A
n
/ . By Proposition ?? we
may write A =
k
j=1
E
j
and A
n
=
Nn
i=1
E
n,i
with E
j
, E
n,i
o. Intersecting
the identity, A =
n=1
A
n
, with E
j
implies
E
j
= A E
j
=
n=1
A
n
E
j
=
n=1
Nn
i=1
E
n,i
E
j
.
By the assumed sub-additivity of on o,
(E
j
)
n=1
Nn
i=1
(E
n,i
E
j
) .
Summing this equation on j and using the nite additivity of shows
(A) =
k
j=1
(E
j
)
k
j=1
n=1
Nn
i=1
(E
n,i
E
j
)
=
n=1
Nn
i=1
k
j=1
(E
n,i
E
j
) =
n=1
Nn
i=1
(E
n,i
) =
n=1
(A
n
) .
Suppose now that is a Radon measure on (1, B
R
) and F : 1 1 is chosen
so that
((a, b]) = F (b) F (a) for all < a b < . (E.14)
For example if (1) < we can take F (x) = ((, x]) while if (1) =
we might take
F (x) =
_
((0, x]) if x 0
((x, 0]) if x 0
.
The function F is uniquely determined modulo translation by a constant.
Lemma E.32. If is a Radon measure on (1, B
R
) and F : 1 1 is chosen
so that ((a, b]) = F (b) F (a) , then F is increasing and right continuous.
Proof. The function F is increasing by the monotonicity of . To see that
F is right continuous, let b 1 and choose a (, b) and any sequence
b
n
n=1
(b, ) such that b
n
b as n . Since ((a, b
1
]) < and
(a, b
n
] (a, b] as n , it follows that
F(b
n
) F(a) = ((a, b
n
]) ((a, b]) = F(b) F(a).
E.4 Radon Measures on 1 217
Since b
n
n=1
was an arbitrary sequence such that b
n
b, we have shown
lim
yb
F(y) = F(b).
The key result of this section is the converse to this lemma.
Theorem E.33. Suppose F : 1 1 is a right continuous increasing function.
Then there exists a unique Radon measure, =
F
, on (1, B
R
) such that Eq.
(E.14) holds.
Proof. Let o := (a, b] 1 : a b , and / = /(o) consists
of those sets, A 1 which may be written as nite disjoint unions of sets
from o as in Example 11.13. Recall that B
R
= (/) = (o) . Further dene
F () := lim
x
F (x) and let =
F
be the nitely additive measure
on (1, /) described in Proposition ?? and Remark ??. To nish the proof it
suces by Theorem E.29 to show that is a premeasure on / = /(o) where
o := (a, b] 1 : a b . So in light of Proposition E.31, to nish
the proof it suces to show is sub-additive on o, i.e. we must show
(J)
n=1
(J
n
). (E.15)
where J =
n=1
J
n
with J = (a, b] 1 and J
n
= (a
n
, b
n
] 1. Recall from
Proposition ?? that the nite additivity of implies
n=1
(J
n
) (J) . (E.16)
We begin with the special case where < a < b < . Our proof will be
by continuous induction. The strategy is to show a where
:=
_
[a, b] : (J (, b])
n=1
(J
n
(, b])
_
. (E.17)
As b J, there exists an k such that b J
k
and hence (a
k
, b
k
] = (a
k
, b] for this
k. It now easily follows that J
k
so that is not empty. To nish the proof
we are going to show a := inf and that a = a.
Let
m
such that
m
a, i.e.
(J (
m
, b])
n=1
(J
n
(
m
, b]). (E.18)
The right continuity of F implies (J
n
(, b]) is right continuous. So
by the dominated convergence theorem
2
for sums,
2
DCT applies as (Jn (m, b]) (Jn) and
n=1
(Jn) (J) < by Eq.
(E.16).
(J ( a, b]) = lim
m
(J (
m
, b])
lim
m
n=1
(J
n
(
m
, b])
=
n=1
lim
m
(J
n
(
m
, b]) =
n=1
(J
n
( a, b]),
i.e. a .
If a > a, then a J
l
= (a
l
, b
l
] for some l. Letting = a
l
< a, we have,
(J (, b]) = (J (, a]) +(J ( a, b])
(J
l
(, a]) +
n=1
(J
n
( a, b])
= (J
l
(, a]) +(J
l
( a, b]) +
n,=l
(J
n
( a, b])
= (J
l
(, b]) +
n,=l
(J
n
( a, b])
n=1
(J
n
(, b]).
This shows and < a which violates the denition of a. Thus we
must conclude that a = a.
The hard work is now done but we still have to check the cases where
a = or b = . For example, suppose that b = so that
J = (a, ) =
n=1
J
n
with J
n
= (a
n
, b
n
] 1. Then
I
M
:= (a, M] = J I
M
=
n=1
J
n
I
M
and so by what we have already proved,
F(M) F(a) = (I
M
)
n=1
(J
n
I
M
)
n=1
(J
n
).
Now let M in this last inequality to nd that
((a, )) = F() F(a)
n=1
(J
n
).
The other cases where a = and b 1 and a = and b = are handled
similarly.
Proof. Old Proof of Theorem E.33. Case 1. First suppose that < a a,

b
n
> b
n
in which case I := ( a, b] J,
J
n
:= (a
n
,
b
n
]

J
o
n
:= (a
n
,
b
n
) J
n
.
Since

I = [ a, b] is compact and

I J
n=1
J
o
n
there exists
3
N < such that
I

I
N
_
n=1
J
o
n

N
_
n=1
J
n
.
Hence by nite sub-additivity of ,
F(b) F( a) = (I)
N
n=1
(

J
n
)
n=1
(

J
n
).
Using the right continuity of F and letting a a in the above inequality,
(J) = ((a, b]) = F(b) F(a)
n=1
J
n
_
=
n=1
(J
n
) +
n=1
(

J
n
J
n
). (E.19)
Given > 0, we may use the right continuity of F to choose

b
n
so that
(

J
n
J
n
) = F(
b
n
) F(b
n
) 2
n
n N.
Using this in Eq. (E.19) shows
(J) = ((a, b])
n=1
(J
n
) +
which veries Eq. (E.15) since > 0 was arbitrary.
3
To see this, let c := sup
_
x b : [ a, x] is nitely covered by
_

J
o
n
_
n=1
_
. If c < b,
then c

J
o
m
for some m and there exists x

J
o
m
such that [ a, x] is nitely covered
by
_

J
o
n
_
n=1
, say by
_

J
o
n
_
N
n=1
. We would then have that
_

J
o
n
_
max(m,N)
n=1
nitely
covers [a, c
] for all c

J
o
m
. But this contradicts the denition of c.
Example E.34 (The construction of all probability measures on (1, B
R
)). Sup-
pose that = 1, o := (a, b] 1 : a b , and / = /(o) consists
of those sets, A 1 which may be written as nite disjoint unions of sets from
o as in Example 11.13. Recall that B
R
= (/) = (o) .
1. By Proposition ??, to every increasing function, F : 1 [0, 1] such that
F () := lim
x
F (x) = 0 and
F (+) := lim
x
F (x) = 1
there exists a nitely additive probability measure, P = P
F
on / such that
P ((a, b] 1) = F (b) F (a) for all a b .
2. By Lemma E.8 and Theorem E.33, the measure P is additive on / i
F is right continuous.
3. Now apply Theorem E.27 to learn that P extends to a probability measure
on B
R
i F is right continuous.
E.4.1 Lebesgue Measure
If F (x) = x for all x 1, we denote
F
by m and call m Lebesgue measure on
(1, B
R
) .
Theorem E.35. Lebesgue measure m is invariant under translations, i.e. for
B B
R
and x 1,
m(x +B) = m(B). (E.20)
Lebesgue measure, m, is the unique measure on B
R
such that m((0, 1]) = 1 and
Eq. (E.20) holds for B B
R
and x 1. Moreover, m has the scaling property
m(B) = [[ m(B) (E.21)
where 1, B B
R
and B := x : x B.
Proof. Let m
x
(B) := m(x+B), then one easily shows that m
x
is a measure
on B
R
such that m
x
((a, b]) = b a for all a < b. Therefore, m
x
= m by
the uniqueness assertion in Exercise ??. For the converse, suppose that m is
translation invariant and m((0, 1]) = 1. Given n N, we have
(0, 1] =
n
k=1
(
k 1
n
,
k
n
] =
n
k=1
_
k 1
n
+ (0,
1
n
]
_
.
Therefore,
1 = m((0, 1]) =
n
k=1
m
_
k 1
n
+ (0,
1
n
]
_
=
n
k=1
m((0,
1
n
]) = n m((0,
1
n
]).
That is to say
m((0,
1
n
]) = 1/n.
Similarly, m((0,
l
n
]) = l/n for all l, n N and therefore by the translation
invariance of m,
m((a, b]) = b a for all a, b with a < b.
Finally for a, b 1 such that a < b, choose a
n
, b
n
such that b
n
b and
a
n
a, then (a
n
, b
n
] (a, b] and thus
m((a, b]) = lim
n
m((a
n
, b
n
]) = lim
n
(b
n
a
n
) = b a,
i.e. m is Lebesgue measure. To prove Eq. (E.21) we may assume that ,= 0
since this case is trivial to prove. Now let m
(B) := [[
1
m(B). It is easily
checked that m
is again a measure on B
R
which satises
m
((a, b]) =
1
m((a, b]) =
1
(b a) = b a
if > 0 and
m
((a, b]) = [[
1
m([b, a)) = [[
1
(b a) = b a
if < 0. Hence m
= m.
Exercise E.5 (Riesz Markov Theorem for an Interval). Suppose that
X = [0, 1] and C (X)
and 0. Dene F(1) := (1) and for b [0, 1), let

F(b) = inf
>0
(
b,
) = lim
0
(
b,
) where
b,
(x) =
_
_
_
1 if x b
1 (x b)/ if b x b +
0 if x b +.
(E.22)
Show F is an increasing right continuous function on X. Let be the unique
measure on B
X
such that ([0, b]) = F (b) for all b X. Show also that
(f) =
_
1
0
fd for all f C (X) . (E.23)
This is a simple version of Theorem ??.
F
Appendix: Math 140A Topics
F.1 Summary of Key Facts about Real Numbers
1. The real numbers, 1, is the unique (up to order preserving eld isomor-
phism) ordered eld with the least upper bound property or equivalently
which is Cauchy complete.
2. Informally the real numbers are the rational numbers with the (irrational)
hole lled in.
3. Monotone bounded sequence always converge in 1.
4. A sequence converges in 1 i it is Cauchy.
5. Cauchy sequences are bounded.
6. N is unbounded from above in 1.
7. For all > 0 in 1 there exists n N such that
1
n
< .
8. and 1 is dense in 1. In particular, between any two real numbers
a < b, there are innitely many rational and irrational numbers.
9. Decimal numbers map (almost 1-1) into the real numbers by taking the
limit of the truncated decimal number.
10. If a, b, 1, then
a) a b by showing that a b + for all > 0.
b) a = b by proving a b and b a or
c) a = b by showing [b a[ for all > 0.
11. A number of standard limit theorems hold, see Theorem 3.13.
12. Unlike limits, limsup and liminf always exist. Moreover we have;
liminf
n
a
n
limsup
n
a
n
with equality i lim
n
a
n
exists in
which case
liminf
n
a
n
= lim
n
a
n
= limsup
n
a
n
.
We may allow the values of in these statements.
13. If b
k
:= a
nk
k=1
is a convergent subsequence of a
n
, then
liminf
n
a
n
lim
k
b
k
limsup
n
a
n
and we may choose b
k
so that lim
k
b
k
= limsup
n
a
n
or
lim
k
b
k
= liminf
n
a
n
.
14. Bounded sequences of real numbers always have convergence subsequences.
15. If S 1 and A := sup (S) , then there exists a
n
n=1
S such that
a
n
a
n+1
for all n and lim
n
a
n
= sup (S) .
16. If S 1 and A := inf (S) , then there exists a
n
n=1
S such that
a
n+1
a
n
for all n and lim
n
a
n
= inf (S) .
F.2 Test 2 Review Topics:
1. Understand the basic properties of complex numbers.
2. Coutability. Key facts are that countable union of countable sets is count-
able and the nite product of countable sets is countable.
3. Denitions of metric and normed spaces and their basic properties which
in the end of the day typically follow from the triangle inequality.
4. You should know that metrics and norms are continuous functions that
satisfy,
[|x| |y|[ |x y| and
[d (x, y) d (x
t
, y
t
)[ d (x, x
t
) +d (y, y
t
) .
5. Be aware of dierent norms, ||
u
, ||
1
, and ||
2
.
6. Understand the notion of; limits of sequences, Cauchy sequences, com-
pleteness, limits and continuity of functions.
7. Know what is meant by pointwise and uniform convergence. You should
be able to compute pointwise limits and know how to test if the limit is
uniform or not. A key theorem is the uniform limit of continuous functions
is still continuous.
F.3 After Test 2 Review Topics:
Let (X, ||) be a Banach space.
1. Know
n=1
x
n
= lim
N
N
n=1
x
n
if the limit exists.
2. Know
n=1
x
n
converges absolutely i
n=1
|x
n
| < and absolute con-
vergence implies convergence in a Banach space.
3. Telescoping series and geometric series.
4. Alternating series test.
5. Absolute convergence tests: 1) integral test, 2) root test, 3) ratio test often
combined with the 4) comparison test.
6. p series examples.
7. n
th
term test for divergence.
8. Cauchy criteria for convergence and the fact that tails of convergent series
tend to zero. i.e. tails of convergent series tend to zero.
9. Pointwise convergence of sums.
10. Uniform convergence of sums and the Weierstrass M test
11. Power series including radius of convergence notion.
12. The exponential function and its relatives, sin, sinh, cos, cosh .

Bruce Driver-Undergraduate Analysis Tools

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Bruce Driver-Undergraduate Analysis Tools

Uploaded by

Copyright:

Available Formats

Bruce K.

be the integers, and

. Since > 0 is arbitrary we may conclude that lim

x P if (y x) P. More importantly this

2 = sup (S) does not exists in .

for all n N which is

m + . This shows that

and I is some index set. Then

M for all I and therefore sup

for some I and therefore M

and as is arbitrary we may conclude that

with > 0 shows the necessity for assuming right hand

. Thus we have shown a

yk. Now let k to conclude a

1 i for all > 0,

+ for a.a. n. and

1 i for all > 0,

: A is a collection of non-empty sets,

be the canonical projection map dened

= X for some xed space X, then we will write

for each A. The axiom of choice states that X

in the case that

, i.e. i = y/x, we see that

() > 0 such that

()) , then we will have

x both of which satisfy the

1 for a.a. n then lim

> 1 then lim

= 1, the test fails, i.e. you must work harder!]

for a.a. n, i.e. there exists N N

> > 1, then

for a.a. n. Working as above

( (z) < 1) and diverges if [z z

( (z) > 1) , i.e. R =

from Eq. (7.8).

N where I is the N N identity matrix.

for m large. We are also given that

(x) = max(a (x) , 0). (8.10)

which shows that either

= . Suppose, with out loss

1 x for [x[ 1 and use Taylors theorem with integral

g(y) for all y Y and u J.

U for each > 0

U. Now suppose that x U

and we have shown that U

. Finally it is clear that F

A = X, i.e. every element x X is a limit of a sequence of elements from A.

A = F : A F X with F closed. (9.2)

(x) < for all x X. In other words, (X, d) is bounded i X

n we have shows that

F G ,= , then E = F G is connected in X as well.

satises all the axioms of a vector space.

with the added restric-

= if ,= . We will also write A

B for A B with the

is said to be an semialgebra or elementary

with only nitely many elements, i.e. #(/) < . For

. In particular this shows =

are in one to one

such that #(/) = 6.

is a semi-algebra, we say a function : o Z

is a semi-algebra (see Denition 11.11) and / = /(o) is the algebra

for each i and therefore f is constant on (t

|f (t)| d(t) |f|