
London School of Economics

MA203
Real Analysis
Lecture Notes

Written by Martin Anthony, Department of Mathematics, LSE

© Martin Anthony 2009.

Contents

1 Introduction
1.1 What is this course?
1.2 What will it achieve?
1.3 Who should take it?
1.4 Course Content
1.5 Lecturer
1.6 Teaching
1.7 Classes and Office Hours
1.8 Exercises
1.9 Books
1.10 Assessment

2 Series of real numbers
2.1 Introduction
2.2 Revision: sequences
2.2.1 Limits of sequences
2.2.2 Boundedness and monotonicity
2.2.3 The Algebra of Limits
2.2.4 Subsequences
2.3 Series
2.4 Convergence of series
2.5 Special series
2.6 Some useful tests for non-negative series
2.6.1 Comparison test
2.6.2 Ratio test
2.6.3 Root test
2.6.4 Integral test
2.7 Alternating series
2.8 Absolute convergence
2.8.1 Definition of absolute convergence
2.8.2 Tests for absolute convergence
2.9 Power series
2.10 Learning outcomes
2.11 Comments on selected activities
2.12 Exercises

3 Sequences, functions and limits in higher dimensions
3.1 Introduction
3.2 Sequences in $\mathbb{R}^m$
3.2.1 Distance in $\mathbb{R}^m$
3.2.2 Convergence in $\mathbb{R}^m$
3.2.3 Bolzano-Weierstrass theorem
3.3 Revision: limits and continuity of functions $f : \mathbb{R} \to \mathbb{R}$
3.3.1 Limits of functions $f : \mathbb{R} \to \mathbb{R}$
3.3.2 Continuity of functions $f : \mathbb{R} \to \mathbb{R}$
3.4 Limits and continuity of functions $f : \mathbb{R}^n \to \mathbb{R}^m$
3.4.1 Limits of functions $f : \mathbb{R}^n \to \mathbb{R}^m$
3.4.2 Two informative examples
3.4.3 Continuity of functions $f : \mathbb{R}^n \to \mathbb{R}^m$
3.4.4 Sequences and continuity
3.5 Learning outcomes
3.6 Comments on selected activities
3.7 Exercises

4 Differentiation
4.1 Introduction
4.2 Derivative of functions $f : \mathbb{R} \to \mathbb{R}$
4.2.1 Definition of the derivative
4.2.2 Differentiability and continuity
4.2.3 Maxima, Minima, and the derivative
4.2.4 Rolle’s Theorem and the Mean Value Theorem
4.3 Differentiation of functions $f : \mathbb{R}^n \to \mathbb{R}^m$
4.3.1 Partial and directional derivatives
4.3.2 The derivative of $f : \mathbb{R}^n \to \mathbb{R}^m$
4.4 Learning outcomes
4.5 Comments on selected activities
4.6 Exercises

5 Topology of $\mathbb{R}^m$
5.1 Introduction
5.2 Open and closed subsets of $\mathbb{R}$
5.2.1 Open sets of real numbers
5.2.2 Collections of sets
5.2.3 Properties of open sets
5.2.4 Closed sets of real numbers
5.3 Open and closed subsets of $\mathbb{R}^m$
5.3.1 Open balls
5.3.2 The definition of open set
5.3.3 Closed sets in $\mathbb{R}^m$
5.4 Continuity
5.4.1 Continuity and open balls
5.4.2 Continuity in terms of open sets
5.5 Compactness
5.5.1 Compact sets
5.5.2 Characterising compact subsets of $\mathbb{R}^m$
5.5.3 Continuous functions on compact sets
5.6 Learning outcomes
5.7 Comments on selected activities
5.8 Exercises

6 Metric spaces
6.1 Introduction
6.2 Metrics and Metric Spaces
6.2.1 Towards the idea of a metric space
6.2.2 Definition of a metric space
6.2.3 Important examples of metric spaces
6.2.4 Bounded subsets
6.2.5 Open balls
6.3 Open sets
6.3.1 The definition of open set
6.4 Continuity in Metric Spaces
6.4.1 The definition of continuity
6.4.2 Continuity in terms of open sets
6.5 Convergence and closed sets in metric spaces
6.5.1 Definition of convergence
6.6 Compactness in metric spaces
6.6.1 Definition of compactness
6.6.2 Closed-ness and boundedness
6.6.3 Continuous functions on compact sets
6.7 Learning outcomes
6.8 Comments on selected activities
6.9 Exercises

7 Uniform convergence
7.1 Introduction
7.2 Pointwise and uniform convergence
7.3 Uniform convergence as convergence in a metric space
7.4 Uniform convergence and continuity
7.5 Learning outcomes
7.6 Comments on selected activities
7.7 Exercises

Preface

These notes have been developed over several years. This current version is an edited version of a
Subject Guide produced for the University of London external programme. That guide was itself based
on MA203 lecture notes.

I am grateful to Malwina Luczak and Keith Martin for carefully reading a draft of the subject guide
and for suggesting ways in which to improve it.


Chapter 1
Introduction

Contents

1.1 What is this course?
1.2 What will it achieve?
1.3 Who should take it?
1.4 Course Content
1.5 Lecturer
1.6 Teaching
1.7 Classes and Office Hours
1.8 Exercises
1.9 Books
1.10 Assessment

1.1 What is this course?

This is a course in real analysis, designed for those who already know some real analysis (such as that
encountered in MA103 Introduction to Abstract Mathematics). The emphasis is on functions,
sequences and series in n-dimensional real space. The general concept of a metric space will also be
studied and, if time allows, we will look briefly at topological spaces.

1.2 What will it achieve?

After studying this course, you should be equipped with a knowledge of concepts (such as continuity
and compactness) which are central not only to further mathematical courses, but to applications of
mathematics in economics and other areas. For example, as we shall see, compactness is a very
important idea in optimisation. The course will also enable you to set the real analysis you previously
encountered in a larger context, to see that there is a ‘bigger picture’. More generally, a course of this
nature, with the emphasis on abstract reasoning and proof, will help you to think in an analytical way,
and be able to formulate mathematical arguments in a precise, logical manner.

1.3 Who should take it?

Most students taking this course will have already taken MA103 Introduction to Abstract
Mathematics or some other course based on formal definitions and proofs and, ideally, covering the
concept of limit: indeed, such a course is a formal pre-requisite. Students who have not covered the
notion of a limit may be able to take this course after carrying out some preliminary reading (Chapters
1 to 3 of Bryant’s ‘Yet Another Introduction to Analysis’, for example), but familiarity with proof
techniques really is essential. Some of you (for example, BSc Mathematics and Economics students)
will be required to take this course: others will simply be interested in learning more about real
analysis.


1.4 Course Content

We study the formal mathematical theory of:

series of real numbers;
series and sequences in n-dimensional real space $\mathbb{R}^n$;
limits, continuity and derivatives of functions mapping between $\mathbb{R}^m$ and $\mathbb{R}^n$;
closed and open sets;
compactness;
metric spaces;
uniform convergence of sequences of functions.

1.5 Lecturer

Professor Martin Anthony. Room B311, Columbia House.

Mathematics Department Office: Jackie Everid, B401, 020 7955 7732

Email: m.anthony@lse.ac.uk (This is the best way to contact me.)

1.6 Teaching

This is a half-unit course. Lectures will take place in the Michaelmas term, as follows.

Mondays, 11-12 in room U8 (Tower 1) and Fridays 11-12 in room D1 (the Hong Kong Theatre,
Clement House).

There will also be a revision session in the Summer term.

Classes start in Week 2, and run until the first week of Lent term (inclusive).

1.7 Classes and Office Hours

Classes for the course are taught by Dr Elizabeth Boardman and Dr Eleni Katirtzoglou. You are
encouraged to consult your class teacher during her office hour if you are having problems with the
work of the classes and think you would benefit from one-to-one advice. If you are unable to see your
class teacher, then you should see me. If you cannot attend my office hours, then I can see you at
other times, by arrangement. (If you need to make an appointment, it is easiest if you email me.) My
office hours are Mondays 3.30-4.30 and Fridays 9.30-10.30. (These are for Michaelmas term: Lent
term office hours will probably be different.) Office hours are a very valuable, and often under-used,
resource for students: please do talk with us if you are having difficulties.

1.8 Exercises

Exercises will be assigned on a weekly basis. It is very important that you attempt all the Exercises
suggested for handing in, and hand in work to your class teacher by the arranged time. Working
through examples is the best way of ensuring you understand key concepts and techniques. Work
handed in will be marked, graded, and returned to you. Answers to all the exercises will be made
available after the work has been discussed in class.

1.9 Books

There are many books that would be useful for this course, since Mathematical Analysis is a major
component of most university-level mathematics degree programmes. There is no single book that
corresponds exactly to this course, but there are many books that are useful for parts of it. There is
no requirement to buy a book.

The following books are recommended.

Bryant, Victor. Yet Another Introduction to Analysis. (Cambridge University Press: Cambridge, 1990)
[ISBN 052138835X]

Binmore, K.G. Mathematical Analysis: A Straightforward Approach. (Cambridge University Press:
Cambridge, 1982) [ISBN 0521288827]

Brannan, David. A First Course in Mathematical Analysis. (Cambridge University Press: Cambridge,
2006) [ISBN 0521684242]

Bartle, R.G. and D.R. Sherbert. Introduction to Real Analysis. (John Wiley and Sons: New York,
1999) Third edition. [ISBN 0471321486].

Bryant, Victor. Metric Spaces: Iteration and Application. (Cambridge University Press: Cambridge,
1985) [ISBN 0521318971]

Sutherland, W. A. Introduction to Metric and Topological Spaces. (Oxford University Press: Oxford,
1995) [ISBN 0198531613]

None of these books covers all of the topics in this course. ‘Yet Another Introduction to Analysis’ will
be useful for Chapters 2 and 4, and it will also be useful for revising the material you will need to
know from Introduction to Abstract Mathematics. The Binmore book will be useful for Chapters
2, 3 and 4. Brannan’s book will be useful for Chapters 2 and 4. The book by Bartle and Sherbert will
be useful for Chapters 2, 4 and 7, and will also be of some use for Chapters 5 and 6. Of more use for
Chapters 5 and 6 are the ‘Metric Spaces’ book of Bryant and the Sutherland book. Note, however,
that most of the Sutherland book covers more advanced topics than this course, and the Bryant Metric
Spaces book takes a slightly different approach from that taken here.

Many other books cover the topics of this course, and the library has a range of texts on real analysis
(under QA300).

1.10 Assessment

There will be a formal 2-hour examination in the Summer term. Selected past papers and solutions
will be available.

Chapter 2
Series of real numbers

Contents

2.1 Introduction
2.2 Revision: sequences
2.3 Series
2.4 Convergence of series
2.5 Special series
2.6 Some useful tests for non-negative series
2.7 Alternating series
2.8 Absolute convergence
2.9 Power series
2.10 Learning outcomes
2.11 Comments on selected activities
2.12 Exercises

Reading

Bryant, Victor. Yet Another Introduction to Analysis. Chapter 2.

Binmore, K.G. Mathematical Analysis: A Straightforward Approach. Chapter 6.

Brannan, David. A First Course in Mathematical Analysis. Chapter 3.

Bartle, R.G. and D.R. Sherbert. Introduction to Real Analysis. Chapters 3 and 9.

2.1 Introduction

The first main topic of the unit is series. This chapter looks at how one can formalise and deal
properly with infinite sums. A key question is whether an infinite sum exists (that is, whether a series
converges).

To understand series, we need to understand sequences. We start, therefore, by racing through some
of the results you should know already from Introduction to Abstract Mathematics about
sequences. (The discussion of this background material is therefore deliberately brief.)

2.2 Revision: sequences

Formally, a sequence is a function $f$ from $\mathbb{N}$ to $\mathbb{R}$. We call $f(n)$ the $n$th term of the sequence and we
often denote the sequence by $(f(n))_{n=1}^{\infty}$ or simply $(f(n))$. Informally, a sequence is an infinite list of
real numbers, one for each positive integer; for example,

\[ a_1, a_2, a_3, \ldots \]

We denote it $(a_n)_{n=1}^{\infty}$ or $(a_n)$ (or indeed $(a_r)$, $(a_i)$, etc.). Then we call $a_n$ the $n$th term of this
sequence.


A sequence may be defined by giving an explicit formula for the $n$th term. For example, the formula
$a_n = \frac{1}{n}$ defines the sequence whose value at the positive integer $n$ is $\frac{1}{n}$.

A sequence may also be defined inductively. For instance, we might have

\[ a_1 = 1, \qquad a_{n+1} = \frac{a_n}{2} + \frac{3}{2a_n} \quad (n \ge 1). \]
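The inductive definition above can be explored numerically. The following is only a minimal sketch in Python (the function name `inductive_sequence` is an illustrative choice, not part of the notes); it lists the first few terms using floating-point arithmetic.

```python
def inductive_sequence(num_terms):
    """Return the first num_terms terms of a_1 = 1, a_{n+1} = a_n/2 + 3/(2*a_n)."""
    terms = [1.0]  # a_1 = 1
    while len(terms) < num_terms:
        a = terms[-1]
        terms.append(a / 2 + 3 / (2 * a))  # the inductive step
    return terms

terms = inductive_sequence(6)
for n, a in enumerate(terms, start=1):
    print(n, a)
```

The printed terms appear to settle near $\sqrt{3}$. This is consistent with the observation that any positive limit $a$ would have to satisfy $a = a/2 + 3/(2a)$, that is, $a^2 = 3$, but a numerical experiment of this kind is not a proof of convergence.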

2.2.1 Limits of sequences

The formal definition of a limit is as follows.

Definition 2.1 (Finite limit of a sequence) The sequence $(x_n)$ is said to tend to the (finite) limit
$L$ if for all $\epsilon > 0$, there is an integer $N$ such that for all $n > N$ we have $|x_n - L| < \epsilon$. That is, $(x_n)$
tends to $L$ if
\[ \forall \epsilon > 0, \ \exists N \text{ such that } n > N \implies |x_n - L| < \epsilon. \]

We write
\[ x_n \to L \text{ as } n \to \infty \]
or
\[ \lim_{n \to \infty} x_n = L, \]

and say that $x_n$ tends to $L$ as $n$ tends to $\infty$.

Any sequence which tends to a finite limit is said to be convergent.
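
To make the roles of $\epsilon$ and $N$ in the definition concrete, here is a small sketch in Python for the particular sequence $x_n = 1/n$ with limit $L = 0$. The example sequence and the helper name `threshold_N` are assumptions made for illustration; the check over a finite range of $n$ is a spot-check, not a proof.

```python
import math

def threshold_N(epsilon):
    """Return an integer N such that |1/n - 0| < epsilon for every n > N."""
    # 1/n < epsilon exactly when n > 1/epsilon, so N = floor(1/epsilon) suffices.
    return math.floor(1 / epsilon)

for epsilon in (0.5, 0.1, 0.01):
    N = threshold_N(epsilon)
    # Spot-check the definition on finitely many n > N.
    ok = all(abs(1 / n - 0) < epsilon for n in range(N + 1, N + 1001))
    print(epsilon, N, ok)
```

Note that $N$ depends on $\epsilon$: the smaller the tolerance, the further along the sequence one has to go.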

The following result is easy to prove, but very useful.

Theorem 2.2 A sequence has no more than one limit.

2.2.2 Boundedness and monotonicity

Definition 2.3 (Bounded sequences) A sequence $(x_n)$ is bounded above/bounded/bounded below if the
set
\[ S = \{x_n : n \in \mathbb{N}\} \subseteq \mathbb{R} \]
is bounded above/bounded/bounded below.

Theorem 2.4 Any convergent sequence is bounded. That is,

\[ \text{convergent} \implies \text{bounded}. \]

Definition 2.5 (Monotonic sequences) A sequence $(a_n)$ is increasing (decreasing) if for all $n$,
$a_{n+1} \ge a_n$ ($a_{n+1} \le a_n$). A sequence is monotonic if it is either increasing or decreasing.

Theorem 2.6 An increasing (decreasing) sequence which is bounded above (below) is convergent.
That is,
\[ \text{bounded} + \text{monotonic} \implies \text{convergent}. \]


2.2.3 The Algebra of Limits

Theorem 2.7 (‘Algebra of limits’ results) Suppose $(a_n)$ and $(b_n)$ are convergent sequences with

\[ a_n \to a, \quad b_n \to b, \quad \text{as } n \to \infty. \]

Then as $n \to \infty$,

1. $C a_n \to C a$ for any real number $C$,
2. $|a_n| \to |a|$,
3. $a_n + b_n \to a + b$,
4. $a_n b_n \to ab$,
5. if $b_n \ne 0$ for all $n$ and if $b \ne 0$, then $\frac{1}{b_n} \to \frac{1}{b}$,
6. $a_n^k \to a^k$ for any positive integer $k$.

Also, note that if $a_n \ge 0$ for all $n$, then $a \ge 0$.

2.2.4 Subsequences

The formal definition of a subsequence is as follows.

Definition 2.8 (Subsequence) Let $(a_n)$ be a sequence and let $f$ be a strictly increasing function
from $\mathbb{N}$ to $\mathbb{N}$. The sequence $(a_{f(n)})$ is called a subsequence of the sequence $(a_n)$.

[Note that a function $f : \mathbb{N} \to \mathbb{N}$ is strictly increasing if $f(n+1) > f(n)$ for all positive integers $n$.]

Often we will use shorthand to denote a subsequence. For example, for a sequence

\[ a_1, a_2, a_3, a_4, \ldots \]

we may say that we wish to choose a subsequence

\[ a_{k_1}, a_{k_2}, a_{k_3}, a_{k_4}, \ldots . \]

We have written the increasing function $f$ explicitly in terms of its values, by saying exactly which
terms of the original sequence to take; that is, we take terms $k_1, k_2, k_3$ and so on. Note that this is
just for notation’s sake and in these cases the underlying function has not disappeared; in fact the
increasing function here is given by $f(i) = k_i$ for $i = 1, 2, 3, \ldots$.

Another notation, sometimes useful, is to let $A = \{f(n) : n \in \mathbb{N}\}$ and denote the subsequence $(a_{f(n)})$
by $(a_n)_{n \in A}$. For any infinite set $A$ of natural numbers, $(a_n)_{n \in A}$ is a subsequence of $(a_n)$. This
notation is particularly useful if we ever have to form a subsequence of a subsequence. For example, if
$B$ and $A$ are infinite subsets of $\mathbb{N}$, with $B \subseteq A$, then $(a_n)_{n \in B}$ is a subsequence of $(a_n)_{n \in A}$, which in
turn is a subsequence of $(a_n)$. This approach avoids the need for double and triple subscripts.

One result which is easy to show is:

Theorem 2.9 Let $(a_n)$ be a sequence which tends to a limit $L$. Then any subsequence also tends to
the limit $L$.

Another result, less easy to prove but useful, is the following.

Theorem 2.10 Every sequence has a monotonic subsequence.

We obtain from this as a corollary the following famous result.

Theorem 2.11 (Bolzano-Weierstrass Theorem) Every bounded real sequence has a convergent
subsequence.


(The proof is immediate. Let $(a_n)$ be a bounded sequence; by Theorem 2.10, it has a
monotonic subsequence $(a_{f(n)})$. This subsequence is then also bounded and we have seen that
bounded monotonic sequences are convergent.)

2.3 Series

The previous material is revision from Abstract Mathematics. Now we start on new material.

In this part of the unit, we will be concerned with how one can formalise the idea of summing an
infinite list of numbers
\[ a_1 + a_2 + a_3 + \cdots . \]

As you would expect, this will once again involve the notion of a limit. We begin with a basic
definition:

Definition 2.12 Let $(a_n)_{n=1}^{\infty}$ be a sequence. For each $n \ge 1$, let

\[ s_n = a_1 + a_2 + \cdots + a_n = \sum_{k=1}^{n} a_k. \]

The series with $n$th term $a_n$, denoted $\sum a_n$, is, formally, the sequence $(s_n)$. We call $a_n$ the $n$th term
of the series and $s_n$ is called the $n$th partial sum of the series.

Although we denote a series by the notation $\sum a_n$, some textbooks use the notation $\sum_{n=1}^{\infty} a_n$. We
shall reserve that notation for something different, as I will explain below.

Note that, as far as we are concerned, a series involves an infinite list of numbers. We do not discuss
‘finite’ series, since there are no convergence issues there.

Let’s consider an example.

Example $\sum (-1)^n$.

The $n$th term is $(-1)^n$ and the $n$th partial sum $s_n$ is $-1$ if $n$ is odd and $0$ if $n$ is even.
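
Partial sums are easy to tabulate by machine. As a minimal sketch (the helper name `partial_sums` is illustrative, not from the notes), the following Python snippet computes the first eight partial sums of the series in the example using `itertools.accumulate` from the standard library.

```python
from itertools import accumulate

def partial_sums(terms):
    """Return the partial sums s_1, s_2, ... of a finite list of terms a_1, a_2, ..."""
    return list(accumulate(terms))

terms = [(-1) ** n for n in range(1, 9)]  # a_1, ..., a_8 with a_n = (-1)^n
print(partial_sums(terms))  # prints [-1, 0, -1, 0, -1, 0, -1, 0]
```

The output agrees with the description above: $-1$ for odd $n$ and $0$ for even $n$.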

Learning activity 2.1

What is the $n$th partial sum of the series $\sum n$?

2.4 Convergence of series


Definition 2.13 Let $\sum a_n$ be a series. If the sequence $(s_n)$ of partial sums converges to $L$ (finite),
then we say that the series converges to $L$, or has sum $L$. If $(s_n)$ diverges, then we say that $\sum a_n$
diverges.

If a series $\sum a_n$ converges to $L$, we write $\sum_{n=1}^{\infty} a_n = L$. (Here is what we use the notation $\sum_{n=1}^{\infty} a_n$
for: we use it to mean the sum of the series, when the series converges.)

WARNING! Be clear in your mind that the convergence of a series $\sum a_n$ is about convergence of the
sequence of partial sums $(s_n)$. It is not about convergence of $(a_n)$.

Theorem 2.14 If $\sum a_n$ converges, then $\lim_{n \to \infty} a_n = 0$.


Proof. Because the series converges, there is some number $L$ such that $s_n \to L$ as $n \to \infty$. Now, if
$s_n \to L$, we also have $s_{n-1} \to L$. It follows that $s_n - s_{n-1} \to L - L = 0$. But what is $s_n - s_{n-1}$?
Well, it is precisely $a_n$. So we have $a_n \to 0$ as $n \to \infty$. $\Box$

Note that, here, we have used the symbol ‘$\Box$’ to denote the end of the proof. This is a convenient
way of indicating when a proof is over and the main text continues.

WARNING! You should be aware that the converse of this result is false: $a_n$ tending to 0 does not
necessarily mean that the series $\sum a_n$ converges. We shall see specific examples shortly of series in
which $a_n \to 0$ and yet the series diverges. Finding sufficient conditions for a series to converge is the
main aim in what follows, and it’s not easy. Life would be simple if it were the case that $\sum a_n$
converges if and only if $a_n \to 0$, but it isn’t so. What this means is that you should never find
yourself saying or writing “$a_n \to 0$ and therefore the series converges.” It is never possible to conclude
that a series converges just from the fact that $a_n \to 0$.

However, the result is useful, sometimes, for proving that a series does not converge, for it is
equivalent to the following result. (This is the contrapositive of Theorem 2.14.)

Theorem 2.15 If $a_n$ does not tend to 0 as $n \to \infty$, then the series $\sum a_n$ diverges.

Learning activity 2.2

Prove that the series $\sum (-1)^n$ diverges.

You should not think that if a series diverges, then it must be the case that $s_n \to \infty$. This will turn
out to be the case if all terms $a_n$ of the series are non-negative, but it is not true in general. For
example, $\sum (-1)^n$ diverges, but we do not have $s_n \to \infty$.

The following result establishes that, for a series with non-negative terms (a ‘non-negative series’),
either the series converges, or the partial sums tend to infinity.

Theorem 2.16 Suppose that $\sum a_n$ is a series in which $a_n \ge 0$ for all $n$. If the series diverges, then
the $n$th partial sum, $s_n$, is such that $s_n \to \infty$ as $n \to \infty$.

Proof. We start by observing that the partial sums of a non-negative series form an increasing
sequence. This is because $s_{n+1} = s_n + a_{n+1} \ge s_n$, since $a_{n+1} \ge 0$. If the sequence $(s_n)$ were bounded
above, then it would converge, because an increasing sequence that is bounded above converges. So
it must be the case that $(s_n)$ is not bounded above. So, for each $K$ there is $N$ such that $s_N > K$.
But then we have that, for all $n \ge N$, $s_n \ge s_N > K$. This shows that $s_n \to \infty$. $\Box$

Comment. You might well see that the proof of Theorem 2.16 shows us something else: namely, that
if the partial sums of a non-negative series form a bounded sequence, then the series converges. This
is because the partial sums are an increasing sequence. So, for non-negative series, the question of
convergence becomes one about whether the partial sums are bounded: explicitly, for a non-negative
series, the series converges if and only if the partial sums are bounded.
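
The comment above can be illustrated numerically. The sketch below uses the non-negative series $\sum 1/n^2$ as an example; this choice, and the bound 2 (which follows from the standard estimate $s_n \le 2 - 1/n$), are assumptions brought in for illustration and are not taken from the text at this point. The code checks, for the first 1000 partial sums only, that they are increasing and bounded above by 2.

```python
from itertools import accumulate

def partial_sums(terms):
    """Return the partial sums s_1, s_2, ... of a finite list of terms."""
    return list(accumulate(terms))

sums = partial_sums([1 / n**2 for n in range(1, 1001)])

# For a non-negative series each partial sum is at least the previous one.
is_increasing = all(later >= earlier for earlier, later in zip(sums, sums[1:]))
# Illustrative bound: every partial sum computed here is at most 2.
is_bounded_by_2 = all(s <= 2 for s in sums)

print(is_increasing, is_bounded_by_2, sums[-1])
```

A finite computation like this cannot establish boundedness of all partial sums; it only shows what the criterion asks us to prove.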

WARNING! The conclusion of Theorem 2.16 (and the boundedness criterion noted above) holds only for
series with non-negative terms. If that’s not clear, consider $\sum (-1)^n$. The partial sums of this series
are bounded (for they are all either $-1$ or $0$), but the series does not converge.


2.5 Special series

Here’s a classic example that you’ll be familiar with.

Theorem 2.17 (Geometric Series) Let $a, r \in \mathbb{R}$. Then

1. $\sum a r^{n-1}$ converges to $\frac{a}{1-r}$ if $|r| < 1$.
2. $\sum a r^{n-1}$ diverges if $|r| \ge 1$ (provided $a \ne 0$).

Proof. The $n$th partial sum of the geometric series is
\[ s_n = \frac{a(1 - r^n)}{1 - r} \]
if $r \ne 1$. If $|r| < 1$ then this converges to the limit $\frac{a}{1-r}$ because in that case $r^n \to 0$. If $|r| > 1$,
it does not converge because $r^n \to \infty$ when $r > 1$ and $r^n$ oscillates unboundedly when $r < -1$. The
only remaining cases are when $r = 1$ and $r = -1$. When $r = 1$, $s_n$ is simply $na$, which is unbounded
for $a \ne 0$ and therefore does not converge. When $r = -1$, the partial sums are alternately $a$ and $0$,
and this sequence of numbers does not converge. $\Box$
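
As a quick numerical check of the geometric series result, the following sketch compares partial sums, computed term by term, with the limit $a/(1-r)$. The values $a = 3$ and $r = 0.5$ and the function name `geometric_partial_sum` are arbitrary illustrative choices.

```python
def geometric_partial_sum(a, r, n):
    """Return s_n = a + a*r + ... + a*r**(n-1), computed term by term."""
    return sum(a * r ** (k - 1) for k in range(1, n + 1))

a, r = 3.0, 0.5
limit = a / (1 - r)  # the sum a/(1 - r), valid because |r| < 1
for n in (1, 5, 10, 20, 40):
    print(n, geometric_partial_sum(a, r, n), limit)

# With r = -1 the partial sums alternate between a and 0, so they do not converge.
print([geometric_partial_sum(2.0, -1.0, n) for n in range(1, 7)])
```

The first loop shows the partial sums approaching the limit as $n$ grows; the last line shows the alternating behaviour in the case $r = -1$.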

The following result is extremely useful. The series $\sum \frac{1}{n}$ is so special that it has a special name: the
harmonic series.

Theorem 2.18 The harmonic series $\sum 1/n$ diverges.

Proof. Let $s_n$ denote the $n$th partial sum. Then
\[
\begin{aligned}
s_{2n} - s_n &= (a_1 + a_2 + \cdots + a_{2n}) - (a_1 + a_2 + \cdots + a_n) \\
&= a_{n+1} + a_{n+2} + \cdots + a_{2n} \\
&= \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n} \\
&\ge n \cdot \frac{1}{2n} = \frac{1}{2}.
\end{aligned}
\]
Now, if the series converges, we should have $s_n \to L$ for some $L$. Then we’d also have $s_{2n} \to L$ and
hence $s_{2n} - s_n \to L - L = 0$. But this cannot be, because we’ve shown that $s_{2n} - s_n \ge 1/2$ for all $n$.
Hence the series diverges. $\Box$
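
The key inequality in this proof, $s_{2n} - s_n \ge 1/2$, can be spot-checked numerically. This is a minimal sketch (the name `harmonic_partial_sum` is illustrative) using floating-point arithmetic for a handful of values of $n$; it illustrates the inequality and does not replace the argument above.

```python
def harmonic_partial_sum(n):
    """Return s_n = 1 + 1/2 + ... + 1/n."""
    return sum(1 / k for k in range(1, n + 1))

for n in (1, 10, 100, 1000):
    gap = harmonic_partial_sum(2 * n) - harmonic_partial_sum(n)
    print(n, gap, gap >= 0.5)
```

Because the gap between $s_{2n}$ and $s_n$ never drops below $1/2$, the partial sums cannot settle down to a finite limit.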

The following more general result is going to be very useful to us. The second part of its proof is
difficult and you would not be expected to reproduce it in an examination.

Theorem 2.19 The series $\sum 1/n^s$ diverges if $s \le 1$ and converges if $s > 1$.

Proof. The first part of the proof, in which we suppose $s \le 1$, is similar to the proof of divergence of
the harmonic series. (It can alternatively be proved by using Theorem 2.16 together with
Theorem 2.18 and the fact that $1/n^s \ge 1/n$ if $s \le 1$.) If $s_n$ denotes the $n$th partial sum, then we have
\[
\begin{aligned}
s_{2n} - s_n &= (a_1 + a_2 + \cdots + a_{2n}) - (a_1 + a_2 + \cdots + a_n) \\
&= a_{n+1} + a_{n+2} + \cdots + a_{2n} \\
&= \frac{1}{(n+1)^s} + \frac{1}{(n+2)^s} + \cdots + \frac{1}{(2n)^s} \\
&\ge \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n} \\
&\ge n \cdot \frac{1}{2n} = \frac{1}{2}.
\end{aligned}
\]
As in the proof above, this shows that we cannot have $(s_n)$ converging and so the series diverges.

Now we prove that the series converges if s > 1.


Let $s_n$ denote the $n$th partial sum of the series. To prove that the series converges is, by definition, to
prove that the sequence $(s_n)$ converges. Since the terms of the series are positive, the sequence $(s_n)$
is increasing, so to establish convergence it is sufficient to show that it is bounded above. (Remember
that an increasing sequence that is bounded above must converge.)

Obviously $s_n \le s_{2^n - 1}$, although you might wonder why we make this observation. (You’ll see. . .).

Now,

\[
\begin{aligned}
s_{2^n - 1} &= 1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots + \frac{1}{(2^n - 1)^s} \\
&= 1 + \left(\frac{1}{2^s} + \frac{1}{3^s}\right) + \left(\frac{1}{4^s} + \frac{1}{5^s} + \frac{1}{6^s} + \frac{1}{7^s}\right) + \cdots + \underbrace{\left(\frac{1}{(2^{n-1})^s} + \frac{1}{(2^{n-1} + 1)^s} + \cdots + \frac{1}{(2^n - 1)^s}\right)}_{2^{n-1} \text{ terms}}.
\end{aligned}
\]
What we’ve done here is simply group the terms together. This does not change the value of the
expression. We have taken the first term, then the next 2 together, then the next 4, then the next 8,
and so on, until the last group, which is of size $2^{n-1}$. (Note that $1 + 2 + \cdots + 2^{n-1}$ does indeed
equal $2^n - 1$.) The reason for doing this is that we can now bound this expression by noting that in
each group, the largest term is the first in that group, so the value of each bracketed quantity is no
more than the number of terms inside the brackets multiplied by the first of the terms. So,

\[
\begin{aligned}
s_{2^n - 1} &\le 1 + 2 \cdot \frac{1}{2^s} + 4 \cdot \frac{1}{4^s} + 8 \cdot \frac{1}{8^s} + \cdots + 2^{n-1} \cdot \frac{1}{(2^{n-1})^s} \\
&= 1 + \frac{1}{2^{s-1}} + \frac{1}{4^{s-1}} + \frac{1}{8^{s-1}} + \cdots + \frac{1}{(2^{n-1})^{s-1}} \\
&= 1 + \frac{1}{2^{s-1}} + \frac{1}{(2^{s-1})^2} + \frac{1}{(2^{s-1})^3} + \cdots + \frac{1}{(2^{s-1})^{n-1}},
\end{aligned}
\]

where we have just used the fact that $(2^i)^{s-1} = (2^{s-1})^i$. Now, consider the geometric series
$\sum \frac{1}{(2^{s-1})^{n-1}}$. This has common ratio $1/2^{s-1}$ which is positive and less than 1, so the series
converges. In fact, its sum is

\[
L = \frac{1}{1 - (1/2^{s-1})}.
\]

But the most important fact is that its partial sums are bounded (all are less than $L$). The calculation
above shows that if $t_n$ is the $n$th partial sum of this geometric series, then $s_{2^n - 1} \le t_n$. It follows that
$s_{2^n - 1} \le L$ and hence, as required, we have shown that the partial sums $s_{2^n - 1}$ (and hence $s_n$) are
bounded. This finishes the proof.
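
The chain of bounds $s_{2^n - 1} \le t_n \le L$ can be observed numerically. The following sketch is illustrative only: the function names are hypothetical, the value s = 1.5 is an arbitrary sample, and floating-point arithmetic is used, so agreement is approximate.

```python
def p_series_partial_sum(s, m):
    """Return 1/1**s + 1/2**s + ... + 1/m**s."""
    return sum(1.0 / k**s for k in range(1, m + 1))


def geometric_partial_sum(s, n):
    """Return t_n = 1 + q + ... + q**(n-1), where q = 1/2**(s-1)."""
    q = 1.0 / 2 ** (s - 1)
    return sum(q**i for i in range(n))


def geometric_limit(s):
    """Return L = 1/(1 - 1/2**(s-1)); only meaningful for s > 1."""
    return 1.0 / (1.0 - 1.0 / 2 ** (s - 1))


if __name__ == "__main__":
    s = 1.5  # any sample exponent greater than 1
    for n in range(1, 15):
        m = 2**n - 1
        print(n, p_series_partial_sum(s, m), geometric_partial_sum(s, n), geometric_limit(s))
```

Each printed row should show the first number below the second, and the second below the third.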

Learning activity 2.3

Prove that if $s \le 1$ then $\sum 1/n^s$ diverges, by using Theorem 2.16 together with Theorem 2.18 and
the fact that $1/n^s \ge 1/n$ if $s \le 1$.

There also exist some ‘Algebra of Limits’ results which can be proved directly from the corresponding
results for sequences:
Theorem 2.20 Suppose $\sum a_n$ and $\sum b_n$ converge, and that $\sum_{n=1}^{\infty} a_n = L$ and $\sum_{n=1}^{\infty} b_n = M$.
Then, for any real number $c$, the series $\sum (a_n + b_n)$ and $\sum c a_n$ converge, and
$\sum_{n=1}^{\infty} (a_n + b_n) = L + M$ and $\sum_{n=1}^{\infty} c a_n = cL$.

But . . . note that the same does not hold for products. For example, if $a_n = (-1)^n / \sqrt{n}$, then, as we
will shortly see, $\sum a_n$ converges. However, $\sum (a_n \times a_n)$ diverges. This latter series is simply $\sum \frac{1}{n}$, the
harmonic series.


2.6 Some useful tests for non-negative series

A series is non-negative if all its terms are non-negative. (Later we look at series which have some
negative terms, but it’s easiest at the moment to stick to non-negative series.) The aim now is to
develop a range of tests for convergence.

2.6.1 Comparison test

First, we have the Comparison Test.

Theorem 2.21 (Comparison Test) Let $(a_n)$, $(b_n)$ be non-negative sequences such that $a_n \le b_n$ for
all $n$. Then

1. If $\sum b_n$ converges, then $\sum a_n$ does also, and $\sum_{n=1}^{\infty} a_n \le \sum_{n=1}^{\infty} b_n$.
2. If $\sum a_n$ diverges, then $\sum b_n$ diverges.
P
Proof. The key observation is that if $s_n$ and $t_n$ are, respectively, the $n$th partial sums of $\sum a_n$ and
$\sum b_n$, then $s_n \le t_n$. Suppose that $\sum b_n$ converges. This means precisely that the sequence $(t_n)$
converges (by the definition of convergence of a series). So the sequence $(t_n)$ is certainly bounded
above. Now, $(s_n)$ is an increasing sequence and since $s_n \le t_n$ for all $n$, $(s_n)$ is bounded above too.
So, as an increasing sequence which is bounded above, it converges. Furthermore,

\[
\sum_{n=1}^{\infty} a_n = \lim_{n \to \infty} s_n \le \lim_{n \to \infty} t_n = \sum_{n=1}^{\infty} b_n.
\]

Suppose, now, that $\sum a_n$ diverges. By Theorem 2.16, $s_n \to \infty$. So, $t_n \to \infty$ since $t_n \ge s_n$. Hence
$(t_n)$ diverges and (by definition) $\sum b_n$ diverges.

When using the Comparison Test, it’s important to use it in the right direction. Suppose, for example,
you want to use it to show that $\sum a_n$ converges. Then you need to find a series $\sum b_n$ that you know
converges and which satisfies $0 \le a_n \le b_n$ for all $n$. If you wanted to use it to show that a series
$\sum c_n$ diverges, you need a divergent non-negative series $\sum d_n$ with $c_n \ge d_n$.

The Comparison Test can be weakened slightly as follows. (Here, what we’ve done is replace ‘for all
n’ with ‘for all sufficiently large n’.)

Theorem 2.22 Let $(a_n)$, $(b_n)$ be non-negative sequences such that there is some $N$ such that
$a_n \le b_n$ for all $n \ge N$. Then

1. If $\sum b_n$ converges, then $\sum a_n$ does also.
2. If $\sum a_n$ diverges, then $\sum b_n$ diverges.

Proof. The key observation in the proof of the previous version of the Comparison Test was that,
using the same notation, $s_n \le t_n$. That is no longer necessarily true in this case. However, it is true
that there will be some constant $M$ such that $s_n \le t_n + M$ for all $n \ge N$. For,

\[
t_n - s_n = \sum_{i=1}^{n} (b_i - a_i) = \sum_{i=1}^{N-1} (b_i - a_i) + \sum_{i=N}^{n} (b_i - a_i).
\]

Now, let $M = \sum_{i=1}^{N-1} (a_i - b_i)$, which is a constant not depending on $n$. Then, noting that
$\sum_{i=N}^{n} (b_i - a_i) \ge 0$ because $b_i \ge a_i$ for $i \ge N$, we see that

\[
t_n - s_n \ge -M + 0 = -M,
\]

that is, $s_n \le t_n + M$. Now the proof is very similar to the one before.


Suppose that $\sum b_n$ converges. Then $(t_n)$ converges and so it is bounded above. The sequence $(s_n)$ is
increasing and, since $s_n \le t_n + M$ for all $n \ge N$, $(s_n)$ is bounded above. So, as an increasing
sequence which is bounded above, it converges. Suppose, now, that $\sum a_n$ diverges. By Theorem 2.16,
$s_n \to \infty$. So, $t_n \to \infty$ since $t_n \ge s_n - M$. Hence $(t_n)$ diverges and (by definition) $\sum b_n$ diverges.

Example Consider $\sum \frac{n^2 + 1}{n^5 + n + 1}$. The $n$th term here behaves like $1/n^3$, because the dominant term
on the numerator is $n^2$ and the dominant term in the denominator is $n^5$. But this needs to be made
precise. We can formally compare the series with $\sum 1/n^3$ by noting that

\[
\frac{n^2 + 1}{n^5 + n + 1} \le \frac{n^2 + n^2}{n^5} = \frac{2}{n^3}.
\]

The series $\sum 2/n^3$ converges because $\sum 1/n^3$ does, by Theorem 2.19. Hence, by the Comparison
Test, the given series converges also.

The following, more sophisticated, version of the Comparison Test, is more useful. We could call it
the ‘Limiting’ Comparison Test, but we’ll usually just call it the Comparison Test (since the previous
two versions of the Test can be thought of as special cases of this one.)

Theorem 2.23 (Comparison Test) Suppose that $(a_n)$, $(b_n)$ are positive and that $a_n / b_n \to L$,
where $L \ne 0$ (and $L$ is finite) as $n \to \infty$. Then $\sum a_n$ and $\sum b_n$ either both converge or both diverge:
that is, they have the same behaviour with respect to convergence.

Proof. Note that $L$ will be positive because $a_n, b_n > 0$ and $L \ne 0$. Because $a_n / b_n \to L$, there will
be some $N$ so that for all $n \ge N$,

\[
\left| \frac{a_n}{b_n} - L \right| < \frac{L}{2}.
\]

This is just taking $\epsilon = L/2 > 0$ in the definition of the limit of a sequence. So, for all $n \ge N$,

\[
\frac{L}{2} < \frac{a_n}{b_n} < \frac{3L}{2}.
\]

If $\sum b_n$ converges, then so does $\sum (3L/2) b_n$ and hence, by the fact that $a_n \le (3L/2) b_n$ for all
$n \ge N$, Theorem 2.22 shows that $\sum a_n$ converges also. On the other hand, if $\sum a_n$ converges, then
so too does $\sum (2/L) a_n$ and, since $b_n \le (2/L) a_n$ for $n \ge N$, Theorem 2.22 shows that $\sum b_n$
converges too. So, $\sum a_n$ converges if and only if $\sum b_n$ converges. In other words, either they both
converge or they both diverge.

Example Consider again $\sum \frac{n^2 + 1}{n^5 + n + 1}$. Using the limiting form of the Comparison Test to compare
the series with $\sum 1/n^3$, we simply observe that, since

\[
\frac{(n^2 + 1)/(n^5 + n + 1)}{1/n^3} = \frac{n^5 + n^3}{n^5 + n + 1} = \frac{1 + n^{-2}}{1 + n^{-4} + n^{-5}} \to 1 \ne 0,
\]

and since $\sum 1/n^3$ converges, then the given series converges too.
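
Both forms of the comparison in this example can be illustrated with exact arithmetic. The sketch below is an assumption-laden illustration (the names `a` and `b` are chosen here to mirror $a_n$ and $b_n = 1/n^3$); it checks the bound $a_n \le 2/n^3$ for a range of n and prints the ratio $a_n / b_n$, which should drift towards 1.

```python
from fractions import Fraction


def a(n):
    """Term of the given series: (n^2 + 1)/(n^5 + n + 1)."""
    return Fraction(n**2 + 1, n**5 + n + 1)


def b(n):
    """Term of the comparison series: 1/n^3."""
    return Fraction(1, n**3)


if __name__ == "__main__":
    # Direct comparison: a_n <= 2 b_n on a sample range of n.
    print(all(a(n) <= 2 * b(n) for n in range(1, 200)))
    # Limiting comparison: the ratio a_n / b_n for growing n.
    for n in (1, 10, 100, 1000):
        print(n, float(a(n) / b(n)))
```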

2.6.2 Ratio test

Another very useful test is the Ratio Test.


Theorem 2.24 (Ratio Test) Let $\sum a_n$ be a non-negative series such that

\[
L = \lim_{n \to \infty} \frac{a_{n+1}}{a_n} \quad (L = \infty \text{ allowed}).
\]

Then

1. $L < 1 \Rightarrow \sum a_n$ converges.
2. $L > 1 \Rightarrow \sum a_n$ diverges. (This includes the case $L = \infty$.)

Proof. We prove the first part. (The second part can be proved similarly: try it!) Suppose that
$L < 1$. Evidently, we may choose an $M$ such that $L < M < 1$. Hence there exists $N$ such that

\[
n \ge N \Rightarrow \frac{a_{n+1}}{a_n} < M.
\]

In particular, $a_{N+1} < M a_N$. From this we see that in general we have

\[
a_{N+n} < M^n a_N.
\]

Now since the geometric series $\sum M^n a_N$ converges (since $0 < M < 1$), we have by the Comparison
Test that $\sum a_{N+n}$ converges, and hence $\sum a_n$ converges.

WARNING! Note that the Ratio Test says nothing if $L = 1$: in this case, the test is useless. In fact,
consider the series $\sum 1/n$ and $\sum 1/n^2$. In both cases, $a_{n+1}/a_n \to 1$, yet the first series is divergent
and the second convergent. So the ratio test fails in the case $L = 1$ not because we can’t prove that
it works, but because the limit of the ratio really tells us nothing at all about convergence or
divergence if that limit is 1.
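
Example Consider $\sum \frac{n^7}{6^n}$. Letting $a_n = n^7 / 6^n$, we have

\[
\frac{a_{n+1}}{a_n} = \frac{(n+1)^7 / 6^{n+1}}{n^7 / 6^n} = \frac{1}{6} \, \frac{(n+1)^7}{n^7} = \frac{1}{6} \left(1 + \frac{1}{n}\right)^7 \to \frac{1}{6}.
\]

This limit is less than 1, so the series converges.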
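
The simplification of the ratio in this example can be verified exactly. This is a minimal sketch, assuming nothing beyond the formula $a_n = n^7/6^n$; the function names `term` and `ratio` are hypothetical.

```python
from fractions import Fraction


def term(n):
    """Return a_n = n^7 / 6^n as an exact fraction."""
    return Fraction(n**7, 6**n)


def ratio(n):
    """Return a_{n+1} / a_n."""
    return term(n + 1) / term(n)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        # Compare the raw ratio with the simplified form (1/6)(1 + 1/n)^7.
        simplified = Fraction(1, 6) * Fraction(n + 1, n) ** 7
        print(n, float(ratio(n)), ratio(n) == simplified)
```

The early ratios are far larger than 1 (the terms initially grow), which is a useful reminder that only the limiting behaviour matters for the Ratio Test.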

2.6.3 Root test

Also useful is the Root Test.


Theorem 2.25 (Root Test) Let $\sum a_n$ be a non-negative series, and suppose that $a_n^{1/n} \to L$ as
$n \to \infty$ (where we allow $L = \infty$). Then,

1. $L < 1 \Rightarrow \sum a_n$ converges.
2. $L > 1 \Rightarrow \sum a_n$ diverges (this includes the case $L = \infty$).

Example Consider again $\sum \frac{n^7}{6^n}$. Here,

\[
a_n^{1/n} = \left(\frac{n^7}{6^n}\right)^{1/n} = \frac{n^{7/n}}{6} = \frac{(n^{1/n})^7}{6}.
\]

Now, $n^{1/n} \to 1$, so $a_n^{1/n} \to 1/6$ as $n \to \infty$. By the Root Test, the series converges.

Again, note that the Root Test says nothing about the case L = 1.

2.6.4 Integral test

The following test draws on the interpretation of an integral as an area.

Theorem 2.26 (Integral Test) Let $g$ be a positive, decreasing, integrable (for example, continuous)
function on $[1, \infty)$, and let $G(n) = \int_1^n g(x)\,dx$. Then the series $\sum g(n)$ converges if and only if the
sequence $(G(n))$ converges. In other words, $\sum g(n)$ converges if and only if the improper integral
$\int_1^{\infty} g(x)\,dx$ exists.


In fact, the following slight generalisation is valid.

Theorem 2.27 (Integral Test) Suppose that $a \ge 1$ is a fixed number. Let $g$ be a positive,
decreasing function defined on $[a, \infty)$ and integrable on $[a, \infty)$, and let $G(n) = \int_a^n g(x)\,dx$. Then
the series $\sum g(n)$ converges if and only if the sequence $(G(n))$ converges. In other words, $\sum g(n)$
converges if and only if the improper integral $\int_a^{\infty} g(x)\,dx$ exists.

(This second version is useful when the integral exhibits improper behaviour near 1, as in the following
example.)

Example Consider $\sum 1/(n \log n)$. We know that $\sum 1/n$ diverges and that $\sum 1/n^2$ converges. This
series is ‘between’ these two. To see whether it converges, we can use the integral test. Let
$g(x) = 1/(x \log x)$. Then, taking $a = 2$ in the general version of the integral test, we have

\[
\int_2^n g(x)\,dx = \int_2^n \frac{1}{x \log x}\,dx = \int_{\log 2}^{\log n} \frac{1}{u}\,du,
\]

where we have made the substitution $u = \log x$. So

\[
G(n) = [\log u]_{\log 2}^{\log n} = \log \log n - \log \log 2.
\]

Since $G(n) \to \infty$ as $n \to \infty$, the series is divergent. (We use $a = 2$ rather than $a = 1$ because
$g(x)$ is not defined when $x = 1$, where $\log x = 0$.)
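
The idea behind the Integral Test is that, for a decreasing function, partial sums and integrals sandwich one another. The sketch below illustrates this for $g(x) = 1/(x \log x)$ with $a = 2$, using the closed form for $G(n)$ derived above. The function names are illustrative assumptions, and the sandwich inequality in the comments is the standard area comparison for a decreasing function rather than something stated explicitly in these notes.

```python
import math


def g(x):
    """Return g(x) = 1/(x log x), defined for x > 1."""
    return 1.0 / (x * math.log(x))


def G(n):
    """Return the integral of g from 2 to n: log log n - log log 2."""
    return math.log(math.log(n)) - math.log(math.log(2))


def partial_sum(lo, hi):
    """Return g(lo) + g(lo + 1) + ... + g(hi)."""
    return sum(g(k) for k in range(lo, hi + 1))


if __name__ == "__main__":
    # For decreasing g: g(3) + ... + g(n) <= G(n) <= g(2) + ... + g(n-1).
    for n in (10, 100, 1000, 10**5):
        print(n, partial_sum(3, n), G(n), partial_sum(2, n - 1))
```

The growth is extremely slow (of order log log n), so numerical experiments alone would not reveal the divergence convincingly; the integral computation is what settles it.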

2.7 Alternating series

A series is alternating if its terms are alternately positive and negative. Such a series takes the form
$\pm \sum (-1)^{n+1} c_n$, where $c_n \ge 0$.

Theorem 2.28 (Leibniz Alternating Series Test (‘LAST’)) Suppose that $\sum a_n = \sum (-1)^{n+1} c_n$
is an alternating series, where $c_n \ge 0$. Then, if $(c_n)$ is a decreasing sequence and $\lim_{n \to \infty} c_n = 0$, the
series $\sum a_n$ converges.

Corollary 2.29 $\sum \frac{(-1)^{n+1}}{n^s}$ converges for $s > 0$.

WARNING! This test says that if the sequence (cn ) is decreasing and tends to 0, then the series
converges. It says nothing at all if one of these two conditions fails to hold. This does not mean that
these two conditions are necessary for convergence of the alternating series: it just means that the
Leibniz test doesn’t work in those situations.

Example Let us use the Leibniz Alternating Series Test to prove that $\sum (-1)^n \frac{\sqrt{n}}{n+1}$ converges.

The series is alternating, and takes the form $\sum (-1)^n c_n$ where $c_n = \sqrt{n}/(n + 1) \ge 0$. We have

\[
c_n = \frac{1/\sqrt{n}}{1 + 1/n} \to 0.
\]

Also, $(c_n)$ is decreasing. There is more than one way to show this. First, we could note that

\[
\begin{aligned}
\frac{c_{n+1}}{c_n} &= \frac{\sqrt{n+1}/(n+2)}{\sqrt{n}/(n+1)} \\
&= \frac{(n+1)\sqrt{n+1}}{\sqrt{n}\,(n+2)} \\
&= \sqrt{\frac{(n+1)^2 (n+1)}{n (n+2)^2}} \\
&= \sqrt{\frac{n^3 + 3n^2 + 3n + 1}{n^3 + 4n^2 + 4n}},
\end{aligned}
\]

and this is at most 1 because $4n^2 + 4n \ge 3n^2 + 3n + 1$. Alternatively, we could note that if
$f(x) = \sqrt{x}/(x + 1)$, then

\[
f'(x) = \frac{(1/(2\sqrt{x}))(x + 1) - \sqrt{x}}{(x + 1)^2} = \frac{1 - x}{2\sqrt{x}\,(x + 1)^2} \le 0 \quad \text{for } x \ge 1,
\]

and this shows that $f$ is decreasing for $x \ge 1$ and hence that $(c_n)$ is decreasing.
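
The two hypotheses of the Leibniz test for this example, and the resulting behaviour of the partial sums, can be explored numerically. This is only a sketch under the assumption that $c_n = \sqrt{n}/(n+1)$ as in the example; the names `c` and `partial_sum` are hypothetical.

```python
import math


def c(n):
    """Return c_n = sqrt(n)/(n + 1)."""
    return math.sqrt(n) / (n + 1)


def partial_sum(n):
    """Return the nth partial sum of sum_{k>=1} (-1)^k c_k."""
    return sum((-1) ** k * c(k) for k in range(1, n + 1))


if __name__ == "__main__":
    # Hypothesis 1: (c_n) is decreasing (checked on a finite range only).
    print(all(c(n + 1) < c(n) for n in range(1, 2000)))
    # Hypothesis 2: c_n tends to 0 (illustrated, not proved).
    print([c(10**k) for k in range(1, 7)])
    # Consecutive partial sums differ by c_n, so they close in on a limit.
    for n in (10, 11, 1000, 1001):
        print(n, partial_sum(n))
```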

2.8 Absolute convergence

2.8.1 Definition of absolute convergence


Definition 2.30 Let $\sum a_n$ be a series (in which some of the terms may be negative). If $\sum |a_n|$
converges, we say that $\sum a_n$ converges absolutely. If $\sum a_n$ converges but $\sum |a_n|$ diverges, then $\sum a_n$
is said to converge conditionally.

Absolute convergence implies convergence.

Theorem 2.31 If a series is absolutely convergent, then it is convergent.


Proof. Suppose that the series $\sum a_n$ converges absolutely. This means that $\sum |a_n|$ is a convergent
series of non-negative terms. From the fact that $-|a_n| \le a_n \le |a_n|$, we deduce that
$a_n + |a_n| \ge -|a_n| + |a_n| = 0$ and $a_n + |a_n| \le |a_n| + |a_n| = 2|a_n|$. It follows that
$0 \le (a_n + |a_n|)/2 \le |a_n|$. In a similar way, it can be seen that $0 \le (|a_n| - a_n)/2 \le |a_n|$. So the series

\[
\sum \frac{a_n + |a_n|}{2}, \qquad \sum \frac{|a_n| - a_n}{2}
\]

are non-negative series which, by the comparison test (comparing with the convergent series $\sum |a_n|$)
are both convergent. The fact that $\sum a_n$ converges now follows from

\[
\sum a_n = \sum \left( \frac{a_n + |a_n|}{2} - \frac{|a_n| - a_n}{2} \right),
\]

and the fact that if $\sum c_n$ and $\sum d_n$ converge, then so also does $\sum (c_n - d_n)$. (See Theorem 2.20 for
this last observation.)

Note that if $\sum a_n$ is a convergent series with non-negative terms, then it is absolutely convergent.

From what we saw earlier in Theorem 2.17, the geometric series $\sum ar^{n-1}$ converges absolutely if
$|r| < 1$.

By Corollary 2.29 and the fact that $\sum 1/n$ diverges, the series $\sum (-1)^n / n$ converges conditionally.
(Corollary 2.29 shows it converges, but it does not converge absolutely because $\sum 1/n$ diverges.)

2.8.2 Tests for absolute convergence

The Comparison, Ratio, and Root Tests can be generalised as follows.

Theorem 2.32 (Comparison Test) Let $(a_n)$, $(b_n)$ be sequences such that $|a_n| \le |b_n|$ for all $n$. Then

1. If $\sum b_n$ converges absolutely, then $\sum a_n$ does also and $\sum_{n=1}^{\infty} |a_n| \le \sum_{n=1}^{\infty} |b_n|$.
2. If $\sum |a_n|$ diverges, then $\sum |b_n|$ diverges.

Theorem 2.33 (Ratio Test) Let $\sum a_n$ be a series such that

\[
L = \lim_{n \to \infty} \frac{|a_{n+1}|}{|a_n|} \quad (L = \infty \text{ allowed}).
\]

Then

1. $L < 1 \Rightarrow \sum a_n$ converges absolutely.
2. $L > 1 \Rightarrow \sum a_n$ diverges.

Note the slightly stronger conclusion than you might have expected in the case where $L > 1$. You
might think that this only establishes that $\sum |a_n|$ diverges, but in fact $\sum a_n$ diverges.

Theorem 2.34 (Root Test) Let $\sum a_n$ be a series, and suppose that $|a_n|^{1/n} \to L$ as $n \to \infty$ (where
we allow $L = \infty$). Then,

1. $L < 1 \Rightarrow \sum a_n$ converges absolutely.
2. $L > 1 \Rightarrow \sum a_n$ diverges (this includes the case $L = \infty$).

2.9 Power series

For our purposes, a power series is a series of the form $\sum a_n x^n$, where $x$ is a real variable. Perhaps
the most important example of a power series is $\sum x^n / n!$, used to define the exponential function. It
turns out that, for any real number $x$, this series converges, and we may define the exponential
function by

\[
\exp(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!}.
\]

There isn’t enough time to cover power series in very great detail, but we look at how our
convergence tests apply to power series.

Let’s take the exponential series first. It’s easy to show that this converges absolutely for all $x$. We
simply observe that, for $x \ne 0$,

\[
\frac{|x^{n+1} / (n+1)!|}{|x^n / n!|} = \frac{|x|}{n+1} \to 0,
\]

and so, by the Ratio Test, absolute convergence follows. (For $x = 0$ convergence is trivial.)
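
To see how quickly the exponential series settles down, one can compare its partial sums with a library implementation. The sketch below starts the sum at the n = 0 term (the constant term 1) and uses Python's `math.exp` as the reference value; the function name `exp_partial_sum` is an illustrative choice.

```python
import math


def exp_partial_sum(x, n_terms):
    """Return sum_{n=0}^{n_terms-1} x**n / n!, built term by term."""
    total = 0.0
    term = 1.0  # the n = 0 term, x**0 / 0!
    for n in range(n_terms):
        total += term
        term *= x / (n + 1)  # turns x**n/n! into x**(n+1)/(n+1)!
    return total


if __name__ == "__main__":
    for x in (1.0, -3.0, 5.0):
        for n_terms in (5, 10, 20, 40):
            print(x, n_terms, exp_partial_sum(x, n_terms), math.exp(x))
```

The update `term *= x / (n + 1)` is exactly the ratio $|x|/(n+1)$ from the Ratio Test calculation, which is why the terms eventually shrink for every fixed x.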

Here’s a less straightforward example.

Example Let’s determine exactly those values of $x$ for which the series $\sum x^n / n$ is convergent. Taking
$a_n = x^n / n$, the ratio $|a_{n+1}| / |a_n|$ is

\[
\frac{|x^{n+1} / (n+1)|}{|x^n / n|} = \frac{n}{n+1} |x| \to |x|.
\]

The ratio test therefore tells us that the series converges absolutely if $|x| < 1$, and that it diverges if
$|x| > 1$. But what if $|x| = 1$? Here, the ratio test is useless and we have to be more sophisticated.
Well, $|x| = 1$ corresponds to two cases: $x = 1$ and $x = -1$. We treat each separately. When $x = 1$,
the series is the harmonic series $\sum 1/n$, which we know diverges. When $x = -1$, we have the series
$\sum (-1)^n / n$. This is convergent, by the Leibniz Alternating Series Test. (You should check this!) So
we have now determined exactly the values of $x$ where the series converges: it converges for
$-1 \le x < 1$ and diverges for all other values of $x$.
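
The three regimes in this example can be seen in a small experiment. The sketch below is illustrative only. It relies on the standard fact (not proved in these notes) that $\sum_{n \ge 1} x^n / n = -\log(1 - x)$ for $-1 \le x < 1$, and the function name `partial_sum` is hypothetical.

```python
import math


def partial_sum(x, n_terms):
    """Return sum_{n=1}^{n_terms} x**n / n."""
    total = 0.0
    power = 1.0
    for n in range(1, n_terms + 1):
        power *= x  # power now holds x**n
        total += power / n
    return total


if __name__ == "__main__":
    # Inside the interval: fast convergence to -log(1 - x).
    print(partial_sum(0.5, 60), -math.log(1 - 0.5))
    # At x = -1: slow (conditional) convergence to -log 2.
    print(partial_sum(-1.0, 10**5), -math.log(2))
    # At x = 1: the harmonic series, whose partial sums keep growing.
    print(partial_sum(1.0, 10**3), partial_sum(1.0, 10**5))
```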

A general result about power series is as follows.

Theorem 2.35 For every sequence $(a_n)$, there is an $R$ such that the series $\sum a_n x^n$ converges
absolutely for all $x \in (-R, R)$, and diverges for all $x$ with $|x| > R$. (It is possible that $R = \infty$.)

In the case in which R is finite, what happens at ±R is not determined by this theorem, and has to
be considered separately. The name radius of convergence is given to R.


2.10 Learning outcomes

At the end of this chapter and the relevant reading, you should be able to:

state what’s meant by a series


state precisely the definition of convergence of a series
prove that some series converge by considering their partial sums
prove that if a series converges, then the nth term tends to 0; and use this to prove divergence
describe under what conditions a geometric series converges or diverges, and be able to prove
these results
state what is meant by the harmonic series and be able to prove it diverges
state that $\sum 1/n^s$ converges for $s > 1$ and that it diverges for $s \le 1$ (and be able to prove the divergence);
and be able to use this result. (You need not be able to prove it for $s > 1$.)
state and use the ‘algebra of limits’ results for series
state and use the Comparison Test (in all its forms), and be able to prove these results.
state and use the Ratio Test (but no need to be able to prove it)
state and use the Root Test (but no need to be able to prove it)
state and use the Integral Test (but no need to be able to prove it)
state and use the Leibniz Alternating Series Test (but no need to be able to prove it)
state what is meant by absolute convergence, and know, and be able to use, the fact that this
implies convergence (proof not necessary)
state what is meant by conditional convergence
determine divergence, conditional, and absolute convergence by using the tests already mentioned
state what is meant by a power series; and be able to determine the set of x for which a given
power series converges

2.11 Comments on selected activities

Learning activity 2.1 We have

\[
s_n = \sum_{k=1}^{n} k = 1 + 2 + 3 + \cdots + n.
\]

This, you might recognise, is the sum of an arithmetic progression, and so, using the formula for such
a sum, we obtain

\[
s_n = \frac{1}{2} n(n + 1).
\]

Learning activity 2.2 There are at least two ways we can do this. First, as we already noted, the
partial sum $s_n$ equals −1 if $n$ is odd and 0 if $n$ is even. So the sequence $(s_n)$ alternates between the
two values −1 and 0, and for this reason it does not converge. So the series diverges. Alternatively,
we could use Theorem 2.15: $a_n = (-1)^n$ and so $a_n$ does not tend to 0. Hence the series diverges.

Learning activity 2.3 Let $s_n$ be the $n$th partial sum of $\sum 1/n^s$. Because $s \le 1$, $1/n^s \ge 1/n$. So, if
$t_n$ is the $n$th partial sum of $\sum 1/n$, then $s_n \ge t_n$. But $\sum 1/n$ diverges and $1/n \ge 0$ for all $n$, so by
Theorem 2.16, $t_n \to \infty$. Since $s_n \ge t_n$, it follows that $s_n \to \infty$ also and hence the sequence $(s_n)$
diverges. But this means precisely that $\sum 1/n^s$ diverges.


2.12 Exercises

Exercise 2.1 Use the comparison test to prove that the series $\sum \frac{n^2 - n + 1}{n^3 + 1}$ diverges.

Exercise 2.2 Show that, for all $n \ge 1$,

\[
\frac{1}{\sqrt{n}} - \frac{1}{\sqrt{n+1}} \ge \frac{1}{2(n+1)\sqrt{n}}.
\]

Use this, and the comparison test, to show that $\sum n^{-3/2}$ converges.

Exercise 2.3 Let

\[
s_n = 1 + 2x + 3x^2 + \cdots + n x^{n-1}.
\]

Evaluate $s_n - x s_n$. Deduce that, for $-1 < x < 1$, the series $\sum n x^{n-1}$ converges, and that

\[
\sum_{n=1}^{\infty} n x^{n-1} = \frac{1}{(1 - x)^2}.
\]

Exercise 2.4 Suppose that $(a_n)$ is a non-negative sequence, and that $\sum a_n$ is convergent. Show
that, for any subsequence $(a_{n_k})$, the sum $\sum_{k=1}^{\infty} a_{n_k}$ converges. Give an example of a sequence $(a_n)$
and a subsequence $(a_{n_k})$ such that $\sum a_n$ converges but $\sum a_{n_k}$ does not.

Exercise 2.5 Prove that if $\sum a_n$ converges and $\sum b_n$ diverges, then $\sum (a_n + b_n)$ diverges.
[Be careful to ensure that your proof does not assume that $a_n$ and $b_n$ are non-negative.]

Exercise 2.6 Show that

\[
\sum \left( \frac{a}{n} + \frac{b}{n+1} - \frac{c}{n+2} \right)
\]

converges if and only if $c = a + b$.

Exercise 2.7 For each of the following series say whether the series converges or diverges. In each
case, give a brief reason or proof.
X 1  
X
1/n
X 1
n , 5/4
, cos ,
n n
X 1 X √n X 2n + (−1)n X n + (−1)n √n
√ , , , .
n 2n3 − 1 n2 − n + 1 (n + 1)4

Exercise 2.8 Use the root test to determine whether the series $\sum \left( \frac{n}{2n + 1} \right)^n$ converges.

Exercise 2.9 For each of the following series say whether the series converges or diverges. In each
case, give a brief reason or proof.
\[
\sum \frac{(n+1)^2}{n!}, \quad \sum \frac{2 \cdot 5 \cdot 8 \cdots (3n - 1)}{4 \cdot 8 \cdot 12 \cdots (4n)}, \quad \sum 6^n \frac{(n!)^2}{(2n)!}, \quad \sum 4^n \frac{(n!)^2}{(2n)!}.
\]

Exercise 2.10 Prove Theorem 2.25; i.e., verify the correctness of the Root Test. (Hint: try to follow
the proof for the Ratio Test).

Exercise 2.11 Discuss the convergence of the series $\sum \frac{1}{(n+1)(\log(n+1))^s}$ for all $s > 0$.


Exercise 2.12 Determine whether each of the following series converges, in each case justifying your
answer carefully.
X 2 X (−1)n n X (−1)n n2 X   X
1 sin n
3n 5−n , , , (−1) n
sin , .
n2 + 1 n2 + 1 n n3/2

Exercise 2.13 Determine whether each of the following series converges:


X (−1)n √
X
n n
, (−1) √ .
log(n + 1) n+1
X (n2 + 2)
Exercise 2.14 For which values of x does the series xn converge?
4n n
P
Exercise 2.15 Decide whether the series an is absolutely convergent, conditionally convergent or
divergent when an has the following forms:

sin n (−1)n n4 (−1)n n3 (−1)n n2 n sin nπ
, , , , (−1) .
n2 n4 + 1 n4 + 1 n4 + 1 n2
Exercise 2.16 Determine for which values of $x$ the series $\sum \frac{(n+1)^2 x^n}{n^3}$ converges.

Chapter 3
Sequences, functions and limits in higher
dimensions

Contents

3.1 Introduction
3.2 Sequences in $\mathbb{R}^m$
3.3 Revision: limits and continuity of functions f : R → R
3.4 Limits and continuity of functions $f : \mathbb{R}^n \to \mathbb{R}^m$
3.5 Learning outcomes
3.6 Comments on selected activities
3.7 Exercises

Reading

Binmore, K.G. Mathematical Analysis: A Straightforward Approach. Chapter 19.

Some of this chapter contains material that is revision from Introduction to Abstract
Mathematics. Coverage in the textbooks of the other material in this chapter is weak. Chapter 19 of
Binmore’s book is probably the best place to look.

3.1 Introduction

In this chapter we look at what it means for a sequence of vectors to converge. We then look at limits
and continuity of functions from $\mathbb{R}^n$ to $\mathbb{R}^m$, reminding ourselves en route of the relevant concepts for
functions from $\mathbb{R}$ to $\mathbb{R}$ that we met in Introduction to Abstract Mathematics.

3.2 Sequences in $\mathbb{R}^m$

3.2.1 Distance in $\mathbb{R}^m$

The Euclidean distance (or simply distance) between $x = (x_1, x_2, \ldots, x_m)$ and $y = (y_1, y_2, \ldots, y_m)$
in $\mathbb{R}^m$ is defined to be

\[
\|x - y\| = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2}.
\]

(The case in which $m = 1$ corresponds to the distance $|x - y|$ between two real numbers.)

There is a certain mathematical attraction in defining distances on rather more abstract or unusual
spaces, and this leads to the notion of a metric space. This is something we will touch on later in this
course. For the moment, we are primarily interested in Euclidean space $\mathbb{R}^m$, for some integer $m \ge 1$,
and we shall always use the Euclidean distance.


Note: The Euclidean distance between two vectors $x, y$ in $\mathbb{R}^m$ is simply the norm or length of $x - y$,
where the norm is the one arising from the usual inner product on $\mathbb{R}^m$. (See Linear Algebra.)
Consequently, the Euclidean distance has some nice properties. For example, by the triangle inequality
for norms, we have that for any $x, y, z \in \mathbb{R}^m$,

\[
\|y - z\| = \|(y - x) + (x - z)\| \le \|y - x\| + \|x - z\|.
\]

Having equipped ourselves with a notion of distance in $\mathbb{R}^m$, we can say what we mean by a bounded
subset. A bounded subset of $\mathbb{R}^m$ is one in which there is some fixed number bounding the distance
between any two points in the set. Formally:

Definition 3.1 (Bounded subset of $\mathbb{R}^m$) A subset $B$ of $\mathbb{R}^m$ is bounded if there is $K > 0$ such that
for all $x, y \in B$, $\|x - y\| \le K$.

Note that K is fixed: it does not depend on x, y. (The definition would be meaningless if that were
the case.) There are other, equivalent, ways to think about boundedness. For instance, we have the
following characterisation.

Theorem 3.2 A subset $B$ of $\mathbb{R}^m$ is bounded if and only if there is some $M$ such that $\|x\| \le M$ for
all $x \in B$.

Learning activity 3.1

Prove Theorem 3.2. (You will have to show that Definition 3.1 implies the property described in
Theorem 3.2, and also that the property described in that theorem implies the property of
Definition 3.1.)

3.2.2 Convergence in $\mathbb{R}^m$

A sequence $(x_n)$ in $\mathbb{R}^m$ is simply an ordered list $x_1, x_2, \ldots$ of elements of $\mathbb{R}^m$. (This is a
straightforward extension of the definition of a sequence of real numbers.) It’s fairly simple to extend
to $\mathbb{R}^m$ ideas about convergence of sequences. The following definition is a straightforward extension
of the definition for convergence for sequences of real numbers.

Definition 3.3 (Convergence of sequences in $\mathbb{R}^m$) Suppose that $(x_n)$ is a sequence of points of
$\mathbb{R}^m$. Then we say that the sequence has limit $x \in \mathbb{R}^m$ if for every $\epsilon > 0$ there is $N$ such that if
$n > N$ then $\|x_n - x\| < \epsilon$. Such a sequence is said to be convergent and to converge towards $x$.

Equivalently, $x_n \to x$ as $n \to \infty$ if

\[
\|x_n - x\| \to 0 \text{ as } n \to \infty.
\]

The following result says that a sequence converges to a point if and only if it converges in each
co-ordinate.

Theorem 3.4 Suppose $(x_n)$ is a sequence in $\mathbb{R}^m$ and let $x_n = (x_{1n}, x_{2n}, \ldots, x_{mn})$. Then
$x_n \to x = (x_1, \ldots, x_m)$ if and only if, for $i = 1, \ldots, m$, $x_{in} \to x_i$ as $n \to \infty$.

Proof. Suppose $x_n \to x$ and let $\epsilon > 0$ be given. Then there is $N$ such that for $n > N$,
$\|x_n - x\| < \epsilon$. But

\[
\|x_n - x\| = \sqrt{\sum_{j=1}^{m} (x_{jn} - x_j)^2} \ge |x_{in} - x_i|,
\]

for any $i$ between 1 and $m$, so for $n > N$, $|x_{in} - x_i| < \epsilon$ and hence $x_{in} \to x_i$. On the other hand, if,
for each $i$, $|x_{in} - x_i| < \alpha$, then

\[
\|x_n - x\| = \sqrt{\sum_{i=1}^{m} (x_{in} - x_i)^2} < \sqrt{m \alpha^2} = \sqrt{m}\,\alpha,
\]

so if we let $\alpha = \epsilon / \sqrt{m}$, we have:

\[
|x_{in} - x_i| < \epsilon / \sqrt{m} \quad (i = 1, 2, \ldots, m) \implies \|x_n - x\| < \epsilon.
\]

If $x_{in} \to x_i$ for each $i$, we may take $\epsilon / \sqrt{m}$ in the definition of limit (in place of $\epsilon$) to see that there is
some $N_i$ such that for $n > N_i$, $|x_{in} - x_i| < \epsilon / \sqrt{m}$. Let $N$ be the largest of $N_1, \ldots, N_m$. Then for
$n > N$, $|x_{in} - x_i| < \epsilon / \sqrt{m}$ for all $i$ and hence $\|x_n - x\| < \epsilon$. This shows that $x_n \to x$.

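Example Suppose $x_n = \begin{pmatrix} \frac{n}{4n+2} \\ \frac{n^2}{n^2 - 1} \end{pmatrix}$ (for $n \ge 2$). Then, as $n \to \infty$, $x_n \to x = \begin{pmatrix} 1/4 \\ 1 \end{pmatrix}$. To see this, we can simply
observe that $\frac{n}{4n + 2} \to \frac{1}{4}$ and $\frac{n^2}{n^2 - 1} \to 1$. Alternatively (though this is more difficult), we could
calculate $\|x_n - x\|$ and check that $\|x_n - x\| \to 0$ as $n \to \infty$.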
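
The 'more difficult' route in this example, computing $\|x_n - x\|$ directly, is easy to explore numerically. The sketch below is illustrative only: vectors are represented as plain Python tuples and the helper names `x` and `distance` are assumptions made for this illustration.

```python
import math


def x(n):
    """Return the vector x_n = (n/(4n+2), n^2/(n^2-1)), defined for n >= 2."""
    return (n / (4 * n + 2), n * n / (n * n - 1))


def distance(u, v):
    """Return the Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    limit = (0.25, 1.0)
    for n in (2, 10, 100, 1000, 10**6):
        # Each coordinate error is at most the full distance, as in Theorem 3.4.
        print(n, distance(x(n), limit))
```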

3.2.3 Bolzano-Weierstrass theorem

Definition 3.5 A sequence $(x_n)$ in $\mathbb{R}^m$ is bounded if the set $\{x_n : n \in \mathbb{N}\}$ is a bounded subset of $\mathbb{R}^m$.

Thus, a sequence $(x_n)$ is bounded if there is some number $M$ such that $\|x_n\| \le M$ for all $n$.

Many of the results for sequences of real numbers extend to sequences in $\mathbb{R}^m$ for $m > 1$. For
example, the Bolzano-Weierstrass theorem has the following generalisation.

Theorem 3.6 (Bolzano-Weierstrass) Any bounded sequence in $\mathbb{R}^m$ has a convergent subsequence.

3.3 Revision: limits and continuity of functions f : R → R

This section acts to remind us briefly of the important ideas of limit and continuity of functions from
R to R.

3.3.1 Limits of functions f : R → R

Definition 3.7 (Limit of a function at a point) Let $f : \mathbb{R} \to \mathbb{R}$ be a function. We say that $L$ is the
limit of $f(x)$ as $x$ tends to $a$, denoted by $\lim_{x \to a} f(x) = L$, if for each $\epsilon > 0$, there exists $\delta > 0$ such
that

\[
0 < |x - a| < \delta \implies |f(x) - L| < \epsilon.
\]

The definition states that if someone gives us any arbitrarily small $\epsilon$, then there is some
neighbourhood of $a$, $(a - \delta, a + \delta)$, such that any $x$ in this neighbourhood (other than possibly $a$
itself) will have $f(x)$ in the $\epsilon$-neighbourhood $(L - \epsilon, L + \epsilon)$ of $L$. (Note that what happens to the
function at $a$ is irrelevant.)

Let $f, g : \mathbb{R} \to \mathbb{R}$ be two functions and $c$ any real number. Then a new function $(f + g)$ is obtained by
defining for each $x$, $(f + g)(x) = f(x) + g(x)$. Similarly, we may define the functions
$|f|$, $(cf)$, $(f - g)$, $(fg)$ and $(f/g)$, provided $g(x) \ne 0$. (For example, $(fg)(x) = f(x)g(x)$.
This should not be confused with the composite function $f(g(x))$.)


Theorem 3.8 Let $f, g : \mathbb{R} \to \mathbb{R}$ be two functions and $c$ any real number. Suppose that
$\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$. Then

1. $\lim_{x \to a} (cf)(x) = cL$
2. $\lim_{x \to a} (|f|)(x) = |L|$
3. $\lim_{x \to a} (f + g)(x) = L + M$
4. $\lim_{x \to a} (f - g)(x) = L - M$
5. $\lim_{x \to a} (fg)(x) = LM$
6. $\lim_{x \to a} (f/g)(x) = L/M$ provided $M \ne 0$ and $g(x) \ne 0$ for each $x$ in some neighbourhood of $a$.

Definition 3.9 (One-sided limits) Let $f : \mathbb{R} \to \mathbb{R}$ be a function. We say that $L$ is the limit of $f(x)$
as $x$ approaches $a$ from the left, denoted by $\lim_{x \to a^-} f(x) = L$, if for each $\epsilon > 0$, there exists $\delta > 0$
such that

\[
0 < a - x < \delta \implies |f(x) - L| < \epsilon.
\]

A similar definition applies to limits from the right, denoted $\lim_{x \to a^+} f(x) = L$.

3.3.2 Continuity of functions f : R → R

Definition 3.10 (Continuity at a point) A function $f : \mathbb{R} \to \mathbb{R}$ is continuous at the point $a$ if
$\lim_{x \to a} f(x)$ exists and equals $f(a)$.

Definition 3.11 (Continuity on a set) Suppose $X \subseteq \mathbb{R}$. A function $f : \mathbb{R} \to \mathbb{R}$ is continuous on $X$
if for each $a \in X$, the limit of $f(x)$, as $x \to a$ and $x \in X$, exists and equals $f(a)$.

Here is a special case of this definition.

Definition 3.12 (Continuity on an interval) A function is continuous on the interval $[a, b]$ if it is
continuous at each point in $(a, b)$ and (i) $f(a) = \lim_{x \to a^+} f(x)$ and (ii) $f(b) = \lim_{x \to b^-} f(x)$.

So, to say that $f$ is continuous on $[a, b]$ means that $f$ is continuous at each point in $(a, b)$, and that it
is continuous on the left at $b$ and continuous on the right at $a$.

Definition 3.13 (Continuity) A function is continuous if it is continuous at each point $a$ where it is
defined.

A function is discontinuous at a if it is not continuous there.

It follows from the results on the algebra of limits that there are ‘heredity’ results for continuity.

Theorem 3.14 (Heredity results for continuity) Let $f, g : \mathbb{R} \to \mathbb{R}$ be functions that are continuous
at $a \in \mathbb{R}$ and $c$ be any real number. Then $|f|$, $(cf)$, $(f - g)$, $(f + g)$, $(fg)$ are all continuous at $a$, and
$(f/g)$ is continuous at $a$ provided $g(x) \ne 0$ for any $x$ in some neighbourhood of $a$.

As a corollary, any polynomial $p(x) = \sum_{i=0}^{k} a_i x^i$ is continuous.

Recall that if f, g are functions, then we may define the composite function f (g(x)). It turns out that
if g is continuous at a, and f is continuous at g(a), then the composite function f (g(x)) is
continuous at a.


3.4 Limits and continuity of functions f : R^n → R^m

We now turn our attention to functions defined on R^n.

3.4.1 Limits of functions f : R^n → R^m

Suppose that f : R → R. As mentioned above, we say that f(x) tends to L as x → a if and only if,
given any ε > 0, there is δ > 0 such that

0 < |x − a| < δ ⟹ |f(x) − L| < ε.

This appears to use the fact that one can form the difference between any two real numbers.
However, as above, we can interpret |x − a| as the distance between the real numbers x and a, in
which case the above condition can be restated as

0 < distance(x, a) < δ ⟹ distance(f(x), L) < ε.

It should be clear from this that the condition doesn’t really use any algebraic properties of R, only
‘distance’ properties. This definition and many of its consequences remain valid if we take as domain
and codomain R^n and R^m for any m, n ≥ 1.

Definition 3.15 Suppose f : R^n → R^m, and that a ∈ R^n and L ∈ R^m. We say that L is the limit of
f(x) as x tends to a if for each ε > 0, there exists δ > 0 such that

0 < ‖x − a‖ < δ ⟹ ‖f(x) − L‖ < ε,

and we write lim_{x→a} f(x) = L.

(Note that we use the same notation, ‖·‖, for the lengths of vectors in both R^n and R^m.)

The definition can be modified in the obvious way if the function f maps from some subset A of R^n:
we simply add the qualification that x ∈ A.

3.4.2 Two informative examples

We now give two examples to illustrate some important points about considering limits for functions
f : R^n → R^m. Two key observations are:

• The limit of f(x) as x → a exists and equals L only if, no matter how x tends to a, the value of
  the function approaches L.
• Consideration of particular approaches of x to a along particular ‘trajectories’ (such as lines) can
  be used to show that a limit does not exist, but it can never be used to show a limit does exist:
  that requires a more general argument that assumes nothing about how x approaches a.

Example Suppose f : R^2 \ {(0, 0)^T} → R is given by

\[ f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \frac{x_1 x_2}{\sqrt{x_1^2 + x_2^2}}. \]

Then f(x) → 0 as x → 0 = (0, 0)^T. To see this, we note that

\[ |f(x) - 0| = |f(x)| = \frac{|x_1||x_2|}{\sqrt{x_1^2 + x_2^2}} \le \frac{(1/2)(x_1^2 + x_2^2)}{\sqrt{x_1^2 + x_2^2}} = \frac{1}{2}\sqrt{x_1^2 + x_2^2} \to 0, \]

where we have used the fact that for any real numbers a and b, 2ab ≤ a^2 + b^2. (This follows from
(a − b)^2 ≥ 0.)
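The bound just derived can be sanity-checked numerically. The following is a minimal Python sketch, in which the sample radii and angles are arbitrary choices made for illustration and are not part of the notes. It evaluates f at points on small circles around the origin and checks that |f(x)| ≤ (1/2)‖x‖ at each of them. A check of this kind illustrates the inequality; it does not replace the proof.

```python
import math

def f(x1, x2):
    """f(x1, x2) = x1*x2 / sqrt(x1^2 + x2^2), defined away from the origin."""
    return x1 * x2 / math.hypot(x1, x2)

# Illustrative sample: points on circles of radius 10^-1, ..., 10^-7.
radii = [10.0 ** (-k) for k in range(1, 8)]
angles = [2 * math.pi * j / 12 for j in range(12)]

def bound_holds(r, theta):
    """Check |f(x)| <= (1/2)*||x|| at x = (r cos(theta), r sin(theta))."""
    x1, x2 = r * math.cos(theta), r * math.sin(theta)
    # Small relative slack to absorb floating-point rounding.
    return abs(f(x1, x2)) <= 0.5 * math.hypot(x1, x2) * (1 + 1e-12)

all_ok = all(bound_holds(r, t) for r in radii for t in angles)

# Largest value of |f| found on the smallest circle in the sample.
largest_near_origin = max(
    abs(f(radii[-1] * math.cos(t), radii[-1] * math.sin(t))) for t in angles
)
print(all_ok, largest_near_origin)
```

The values of |f| on the smallest circle are no larger than half its radius, which is consistent with f(x) → 0 as x → 0.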


When looking at limits for functions from R to R, we noticed that one can define left and right limits
and that these might be different. A counterpart to the idea of left and right limits for functions
f : Rn → Rm when n > 1 is the idea of the limit along a path.

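Example Suppose f : R^2 \ {(0, 0)^T} → R is given by

\[ f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \frac{x_2^2 - x_1^2}{x_1^2 + x_2^2} \]

and that g : R^2 \ {(0, 0)^T} → R is

\[ g\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \frac{x_1 x_2^2}{x_1^2 + x_2^4}. \]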

Let’s consider what happens to f(x) as x tends to 0 = (0, 0)^T along the line x_2 = αx_1; that is,
through x of the form (t, αt)^T. We have

\[ f\begin{pmatrix} t \\ \alpha t \end{pmatrix} = \frac{\alpha^2 t^2 - t^2}{t^2 + \alpha^2 t^2} = \frac{\alpha^2 - 1}{\alpha^2 + 1}. \]

So, f(x) approaches different values as x → 0 along different lines. In particular, f(x) does not have
a limit as x → 0.

The function g is quite different. If we again investigate what happens as x → 0 along the lines
x_2 = αx_1, we note that, for all α,

\[ g\begin{pmatrix} t \\ \alpha t \end{pmatrix} = \frac{\alpha^2 t^3}{t^2 + \alpha^4 t^4} \to 0 \text{ as } t \to 0. \]

So, here, the limit as x tends to 0 along all lines is the same. However, we cannot deduce from this
alone that g(x) has a limit as x → 0. For, consider what happens to g(x) as x → 0 along the
parabola given by x_1 = αx_2^2 (that is, through points (αt^2, t)^T). We have

\[ g\begin{pmatrix} \alpha t^2 \\ t \end{pmatrix} = \frac{\alpha t^4}{\alpha^2 t^4 + t^4} = \frac{\alpha}{\alpha^2 + 1}, \]

so the limit depends on α, and so g(x) has no limit as x → 0.
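The path-dependence in this example is easy to see numerically. The sketch below (plain Python; the parameter value t = 10^-4 and the particular values of α are illustrative choices, not taken from the notes) evaluates f and g at a point close to the origin on several lines x_2 = αx_1, and evaluates g on two parabolas x_1 = αx_2^2.

```python
def f(x1, x2):
    """f(x1, x2) = (x2^2 - x1^2) / (x1^2 + x2^2), for (x1, x2) != (0, 0)."""
    return (x2**2 - x1**2) / (x1**2 + x2**2)

def g(x1, x2):
    """g(x1, x2) = x1*x2^2 / (x1^2 + x2^4), for (x1, x2) != (0, 0)."""
    return x1 * x2**2 / (x1**2 + x2**4)

t = 1e-4  # a small parameter value; the choice is illustrative

# Along the lines x2 = alpha*x1, f takes the value (alpha^2 - 1)/(alpha^2 + 1).
f_on_lines = {alpha: f(t, alpha * t) for alpha in (0.0, 1.0, 2.0)}

# Along the same lines, g is close to 0 when t is small.
g_on_lines = {alpha: g(t, alpha * t) for alpha in (0.0, 1.0, 2.0)}

# Along the parabolas x1 = alpha*x2^2, g takes the value alpha/(alpha^2 + 1).
g_on_parabolas = {alpha: g(alpha * t**2, t) for alpha in (1.0, 2.0)}

print(f_on_lines)
print(g_on_lines)
print(g_on_parabolas)
```

The values of f differ from line to line, the values of g on lines are all small, and the values of g on the two parabolas stay near 1/2 and 2/5 however small t is taken.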

The two examples just given are very, very important and illustrate why the topic of limits for
functions f : Rn → Rm is quite hard when n > 1. We repeat the main lessons to be learned. There
are so many different ways in which x can approach a given a. The limit of f (x) as x → a exists and
equals L only if, no matter how x tends to a, the value of the function approaches L. Consideration
of particular approaches of x to a along particular ‘trajectories’ (such as lines) can be used to show
that the limit does not exist (because, for instance, the values along different trajectories tend to
different limits). However, to show that the limit of f as x tends to a exists, an argument needs to be
given that does not assume any particular way in which x tends towards a.

3.4.3 Continuity of functions f : R^n → R^m

Definition 3.16 (Continuity of f : R^n → R^m) Suppose that f : R^n → R^m and that a ∈ R^n. Then,
we say that f is continuous at a if lim_{x→a} f(x) exists and equals f(a). Equivalently, f is continuous
at a if given any ε > 0 there exists δ > 0 such that if ‖x − a‖ < δ then ‖f(x) − f(a)‖ < ε.

For a subset X of R^n, we say that f is continuous on X if for all a ∈ X, the limit of f(x), as x → a
with x ∈ X, exists and equals f(a).

If f is continuous on the whole of R^n then we simply say that f is continuous.

The following result is often useful.


Theorem 3.17 Suppose that f : R^n → R^m and that f_1, f_2, . . . , f_m : R^n → R are such that for all
x ∈ R^n,
f(x) = (f_1(x), f_2(x), . . . , f_m(x))^T.
Then f is continuous at a ∈ R^n if and only if f_1, f_2, . . . , f_m are continuous at a.

Note that if e_1, e_2, . . . , e_m are the standard basis vectors of R^m (so that e_i has a 1 in position i and
all other entries equal to 0), then the functions f_i referred to are given by
f_i(x) = ⟨f(x), e_i⟩ = e_i^T f(x),
where ⟨a, b⟩ denotes the usual inner product (scalar product) on R^m. We shall call the functions f_i
the component functions of f.

A useful observation is that all linear functions are continuous. Recall that a linear function
f : R^n → R^m is one with the property that for all x, y ∈ R^n and all α, β ∈ R,
f(αx + βy) = αf(x) + βf(y).
Any linear function can be represented in matrix form: that is, there is some m × n matrix M such
that f(x) = Mx for all x. In this case, for 1 ≤ i ≤ m,
f_i(x) = e_i^T f(x) = e_i^T M x.
If we let m_i denote M^T e_i, then we see that
f_i(x) = e_i^T M x = (M^T e_i)^T x = m_i^T x,
and so, for any a ∈ R^n,
|f_i(x) − f_i(a)| = |m_i^T x − m_i^T a| = |m_i^T (x − a)| ≤ ‖m_i‖ ‖x − a‖,
where the last step is the Cauchy-Schwarz inequality. As x → a, the right hand side tends to 0,
because ‖m_i‖ is just a fixed number. This shows that f_i is continuous, for each i, and it follows
(by Theorem 3.17) that f is continuous.

3.4.4 Sequences and continuity

It is possible to describe an alternative approach to the definition of continuity of functions from R^n
to R^m. We have the following theorem. (In Introduction to Abstract Mathematics, we met the
version of this that applies to functions f : R → R.)

Theorem 3.18 Suppose that f : R^n → R^m and that a ∈ R^n. Then f is continuous at a if and only
if for every sequence (x_n) converging to a we have f(x_n) → f(a). Therefore f is continuous (on the
whole of R^n) if for every convergent sequence (x_n) in R^n, we have lim f(x_n) = f(lim x_n).

Proof. Let (*) be the statement that for any sequence (x_n) such that lim_{n→∞} x_n = a, we have
lim_{n→∞} f(x_n) = f(a).

Suppose first that f is continuous at a. We prove that this implies (*). Let (x_n) be a sequence of
points of R^n converging to a. We want to show that f(x_n) → f(a) as n → ∞, that is,
∀ε > 0 ∃N ∀n ≥ N ‖f(x_n) − f(a)‖ < ε. (∗∗)
To prove this, let ε > 0. Choose, according to the definition of continuity, a δ > 0 so that for all x,
whenever ‖x − a‖ < δ, then ‖f(x) − f(a)‖ < ε. Since x_n → a as n → ∞, there is an N so that n ≥ N
implies ‖x_n − a‖ < δ, which in turn implies ‖f(x_n) − f(a)‖ < ε. This shows (**) as desired.

Conversely, assume that property (*) holds. In order to show continuity, we assume, to the contrary,
that the function is discontinuous at a. This means that there is an ε > 0 so that for all δ > 0 there is
an x with ‖x − a‖ < δ but ‖f(x) − f(a)‖ ≥ ε. In particular, for every natural number n, letting
δ = 1/n, there is a point x, call it x_n, with ‖x_n − a‖ < 1/n but ‖f(x_n) − f(a)‖ ≥ ε. But then
clearly x_n → a as n → ∞, but we do not have f(x_n) → f(a) as n → ∞, a contradiction to (*).
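The sequential characterisation is often the quickest way to show that a function is not continuous at a point: it is enough to exhibit one sequence x_n → a for which f(x_n) does not tend to f(a). The following Python sketch illustrates this with a hypothetical step function on R (the function `step` and the sequence x_n = 1/n are our own choices for illustration and do not appear in the notes).

```python
def step(x):
    """A hypothetical example function: 1 for x > 0, and 0 otherwise."""
    return 1.0 if x > 0 else 0.0

a = 0.0
xs = [1.0 / n for n in range(1, 10001)]   # x_n = 1/n, which tends to a = 0
values = [step(x) for x in xs]            # step(x_n) for each n

distance_to_a = abs(xs[-1] - a)           # small: x_n is close to a
gap = abs(values[-1] - step(a))           # not small: step(x_n) stays 1 away from step(a)
print(distance_to_a, gap)
```

Here x_n → 0 while step(x_n) = 1 for every n, so step(x_n) does not tend to step(0) = 0, and by Theorem 3.18 the function is not continuous at 0.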


3.5 Learning outcomes

At the end of this chapter and the relevant reading, you should be able to:

• state precisely what’s meant by a bounded set in R^m
• state the definition of convergence of a sequence in R^m, and be able to use it
• state, and be able to prove, the result stating that a sequence (x_n) in R^m converges to x if and
  only if, for each i between 1 and m, the ith entries of x_n converge to the ith entry of x
• state the Bolzano-Weierstrass theorem (but there is no need to be able to prove it)
• state the definitions of limit and continuity (including continuity on an interval, etc) of a function
  f : R → R
• state the definition of the limit of a function f : R^m → R^n
• state the definition of continuity of a function f : R^m → R^n
• prove that certain functions are continuous, or not continuous at certain points; and that certain
  limits do or do not exist
• state the result relating continuity of f : R^m → R^n to continuity of the component functions of f
• state, and be able to prove, the characterisation of continuity in terms of sequences

3.6 Comments on selected activities

Learning activity 3.1 This is an “if and only if” problem. Separate the two halves completely;
otherwise only confusion will ensue.

Draw pictures to see why this result is true; if you just launch into a calculation you won’t succeed.
So, if the “diameter” K of B is finite, why can there not be points of B arbitrarily far from the
origin, and how then can we put some specific upper bound M on the distance from the origin? For
the other way round, if all points of B are at distance at most M from the origin, how far can a pair
of points of B be from each other? Once you’ve figured out (in both directions) what to prove, the
main weapon at your disposal is the triangle inequality.

First suppose that B is bounded, i.e., there is some constant K such that ‖x − y‖ ≤ K for all
x, y ∈ B. If B = ∅ then the result is trivial. If B ≠ ∅, then fix some y ∈ B, and observe that, for all
x ∈ B, ‖x‖ = ‖x − 0‖ ≤ ‖x − y‖ + ‖y − 0‖ ≤ K + ‖y‖, by the triangle inequality. Therefore, taking
M = K + ‖y‖, we see that ‖x‖ ≤ M for all x ∈ B.

Now suppose that there is some M such that ‖x‖ ≤ M for all x ∈ B. Take any x, y in B, and
observe that ‖x − y‖ ≤ ‖x − 0‖ + ‖0 − y‖ = ‖x‖ + ‖y‖ ≤ M + M = 2M. So, setting K = 2M, we
see that ‖x − y‖ ≤ K for all x, y ∈ B, as required.

3.7 Exercises

Exercise 3.1 For n ∈ N, let the point x_n in R^3 be given by x_n = (1/n, 1/n, 2/n). Calculate ‖x_n‖
for each n, and hence show that x_n → 0 as n → ∞.

Exercise 3.2 Let f, g : R → R be two functions, and c a real number. Suppose that limx→c f (x) = A
and limx→c g(x) = B. Prove, directly from the definitions, that limx→c (f (x)g(x)) = AB.
Hint: f (x)g(x) − AB = f (x)(g(x) − B) + (f (x) − A)B.

Exercise 3.3 Show that lim_{x→0} f(x) = 0, when

\[ f\big((x_1, x_2)^T\big) = \frac{x_1^2 + x_2^2}{|x_1| + |x_2|}. \]


Exercise 3.4 Prove, directly from the definition, that the function f : R^2 → R defined by

\[ f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = x_1 x_2 \]

is continuous.

Exercise 3.5 Does lim_{x→0} f(x) exist, when

\[ f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \frac{x_1 x_2}{|x_1| + |x_2|}\,? \]

[You might make use of the fact that, for any u, v, u^2 + v^2 ≥ 2uv.]

Exercise 3.6 Suppose f : R^2 \ {0} → R is given by

\[ f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \frac{x_1^3 x_2 - 2x_1^2 x_2^2}{x_1^4 + x_2^4}. \]

Prove that lim_{x→0} f(x) does not exist.

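Exercise 3.7 Suppose g : R^2 → R is the function given by

\[ g\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{cases} \dfrac{x_1 x_2^4}{x_1^2 + x_2^8} & \text{if } (x_1, x_2)^T \ne 0 \\[1ex] 0 & \text{if } (x_1, x_2)^T = 0. \end{cases} \]

Prove that g is not continuous at 0.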

Exercise 3.8 Let f : R^2 → R^2 be

\[ f\begin{pmatrix} x \\ y \end{pmatrix} = \begin{cases} (x, 0)^T & \text{if } y \le 0 \\ (x, x)^T & \text{if } y > 0. \end{cases} \]

Determine the set of a ∈ R^2 at which f is continuous.

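Exercise 3.9 Suppose f : R^2 → R^2 is the function given by

\[ f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{cases} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} & \text{if } x_1 \ge x_2 \\[2ex] \begin{pmatrix} x_1 \\ 2x_2 - x_1 \end{pmatrix} & \text{if } x_1 < x_2. \end{cases} \]

Prove that f is continuous (on all of R^2).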


Chapter 4
Differentiation

Contents

4.1 Introduction
4.2 Derivative of functions f : R → R
4.3 Differentiation of functions f : R^n → R^m
4.4 Learning outcomes
4.5 Comments on selected activities
4.6 Exercises

Reading

Bryant, Victor. Yet Another Introduction to Analysis. Chapter 4.

Binmore, K.G. Mathematical Analysis: A Straightforward Approach. Chapters 10 and 19.

Brannan, David. A First Course in Mathematical Analysis. Chapter 6.

Bartle, R.G. and D.R. Sherbert. Introduction to Real Analysis. Chapter 6.

4.1 Introduction

You will know already how useful differentiation is. In this chapter, we look at why the familiar
techniques of calculus work, and we also see how the notion of derivative of a function f : R → R can
be generalised to functions f : Rn → Rm .

4.2 Derivative of functions f : R → R

4.2.1 Definition of the derivative

Let f be a real-valued function defined at all points of an interval (a, b). For each x ∈ (a, b), we
define the derivative of f at x, denoted by f′(x), to be

\[ \lim_{y \to x} \frac{f(y) - f(x)}{y - x} \]

if this limit exists (if not, then the function does not have a derivative at x).

As you will know from calculus courses, we can think of f′(x) as being the slope of the tangent to the
graph of f at x.

If f′(x) is defined, we say that f is differentiable at x. If f is differentiable at each x ∈ (a, b), then f
is said to be differentiable on (a, b).


We can also define the right and left derivatives of f at x. For example, the right derivative, denoted
by f′_r(x), is the limit

\[ \lim_{y \to x^+} \frac{f(y) - f(x)}{y - x} \]

if it exists. We define the left derivative f′_l(x) similarly.

Note that f′(x) exists if and only if f′_l(x) and f′_r(x) exist and are equal.

Example Consider f(x) = |x|.

First suppose that a > 0. Then for y sufficiently close to a we have that y > 0 and so |y| = y. Hence

\[ f'(a) = \lim_{y \to a} \frac{f(y) - f(a)}{y - a} = \lim_{y \to a} \frac{|y| - |a|}{y - a} = \lim_{y \to a} \frac{y - a}{y - a} = 1. \]

Similarly, if a < 0, then f′(a) = −1. But at a = 0 we have

\[ f'_l(0) = \lim_{y \to 0^-} \frac{|y| - |0|}{y - 0} = \lim_{y \to 0^-} \frac{(-y) - 0}{y} = -1, \]

whereas f′_r(0) = 1. Thus the derivative of f does not exist at 0.
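The two one-sided derivatives of |x| at 0 can be seen directly from the difference quotients. The following short Python sketch (the helper name `diff_quotient` and the step sizes are illustrative choices of ours) computes the quotient (f(a + h) − f(a))/h at a = 0 for positive and for negative h.

```python
def diff_quotient(f, a, h):
    """The difference quotient (f(a + h) - f(a)) / h for h != 0."""
    return (f(a + h) - f(a)) / h

step_sizes = [10.0 ** (-k) for k in range(1, 8)]

# h > 0 approximates the right derivative; h < 0 approximates the left derivative.
right_quotients = [diff_quotient(abs, 0.0, h) for h in step_sizes]
left_quotients = [diff_quotient(abs, 0.0, -h) for h in step_sizes]

print(right_quotients)
print(left_quotients)
```

Every quotient with h > 0 equals 1 and every quotient with h < 0 equals −1, so the two one-sided limits exist but differ, in agreement with the calculation above.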

4.2.2 Differentiability and continuity

Differentiability is a stronger version of continuity, in the sense that differentiability implies continuity.

Theorem 4.1 If f is differentiable at a point c, then f is continuous at c.

Proof. We are given that

\[ \lim_{y \to c} \frac{f(y) - f(c)}{y - c} = f'(c) \]

exists. But

\[ \lim_{y \to c} \big(f(y) - f(c)\big) = \lim_{y \to c} \frac{f(y) - f(c)}{y - c}\,(y - c) \]

and, by the algebra of limits, this is equal to f′(c) × 0 = 0. Thus lim_{y→c} f(y) = f(c) and so f is
continuous at c.

It should not be thought that the converse is true; that is, that continuity implies differentiability.
Indeed (see Brannan, David, A First Course in Mathematical Analysis, Chapter 6), there is a function
that is continuous at all real numbers, but not differentiable anywhere. The following results are
well-known to you.

Theorem 4.2 Let f, g be defined on (a, b) and differentiable at c ∈ (a, b). Then

• (f + g)′(c) = f′(c) + g′(c)
• (fg)′(c) = f′(c)g(c) + f(c)g′(c)
• if g(c) ≠ 0, then

\[ \left(\frac{f}{g}\right)'(c) = \frac{g(c)f'(c) - g'(c)f(c)}{g(c)^2}. \]

Learning activity 4.1

Try to prove the second of the results of Theorem 4.2 using the definition of derivative. (You will, of
course, recognise this as the product rule.)


Theorem 4.3 (Chain Rule) Let f be defined on (a, b) and suppose f′(c) exists. Let g be defined on
the range of f and be differentiable at f(c). Define the new function

K(x) = (g ◦ f)(x) = g(f(x))

for all x ∈ (a, b). Then K is differentiable at c and

K′(c) = g′(f(c)) f′(c).

Proof. One has to be just a little careful. Define

\[ G(y) = \begin{cases} \dfrac{g(y) - g(f(c))}{y - f(c)} & \text{if } y \ne f(c) \\[1ex] g'(f(c)) & \text{if } y = f(c). \end{cases} \]

Note that G(y) → g′(f(c)) as y → f(c). (The problem is that if y = f(c), then we cannot divide by
y − f(c).) Now, if f(x) ≠ f(c),

\[ \frac{g(f(x)) - g(f(c))}{x - c} = \frac{g(f(x)) - g(f(c))}{f(x) - f(c)} \cdot \frac{f(x) - f(c)}{x - c} = G(f(x)) \cdot \frac{f(x) - f(c)}{x - c}, \]

and the equality of the first and last expressions is also valid if f(x) = f(c) and x ≠ c, since both are
then equal to 0. As x → c, f(x) → f(c) (Theorem 4.1), so G(f(x)) → g′(f(c)). Also, of course,
(f(x) − f(c))/(x − c) → f′(c) as x → c. The result follows.
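As a quick numerical illustration of the chain rule, the sketch below compares a symmetric difference quotient for K = g ◦ f with the product g′(f(c)) f′(c). The inner function f(x) = x^2, the outer function g(y) = sin y, the point c = 0.7 and the step size are all hypothetical choices made for this illustration.

```python
import math

def f(x):
    return x * x            # hypothetical inner function, with f'(x) = 2x

def g(y):
    return math.sin(y)      # hypothetical outer function, with g'(y) = cos(y)

def K(x):
    return g(f(x))          # the composite K = g o f

def central_difference(func, c, h=1e-6):
    """Approximate func'(c) by the symmetric difference quotient."""
    return (func(c + h) - func(c - h)) / (2 * h)

c = 0.7
numeric = central_difference(K, c)
by_chain_rule = math.cos(f(c)) * (2 * c)   # g'(f(c)) * f'(c)
print(numeric, by_chain_rule)
```

The two numbers agree to several decimal places, as the theorem predicts; the small discrepancy comes from the finite step size and floating-point rounding.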

4.2.3 Maxima, Minima, and the derivative

You will know from calculus that the sign of the derivative provides information about how a function
behaves. For instance, if f′(a) > 0, then in some small neighbourhood of a, the f-values to the left
of a are smaller than f(a), and those to the right are larger.

Theorem 4.4 If f′(a) > 0, then there exists δ > 0 such that

f(a + h) > f(a) > f(a − h)

for all h ∈ (0, δ).

Proof. Since lim_{x→a} (f(x) − f(a))/(x − a) = f′(a) > 0, then (taking ε = f′(a) in the definition of
limit, and writing x = a + h) we can choose a δ > 0 such that

\[ 0 < |h| < \delta \implies \left| \frac{f(a + h) - f(a)}{h} - f'(a) \right| < f'(a). \]

In particular, if h ∈ (0, δ),

\[ -f'(a) < \frac{f(a + h) - f(a)}{h} - f'(a), \]

so that (f(a + h) − f(a))/h > 0. Since h > 0, this implies that f(a + h) > f(a). The same argument
with h ∈ (−δ, 0) gives the other half of the result.

Definition 4.5 Let f : R → R. We say that f has a local maximum at a point c ∈ R if there exists
δ > 0 such that f (c) ≥ f (x) for all x ∈ (c − δ, c + δ). We say that f has a local minimum at c if
there exists δ > 0 such that f (c) ≤ f (x) for all x ∈ (c − δ, c + δ).

Well, you know that to find local maxima or minima, you solve f′ = 0. But why? Can we give a
formal justification for this? Indeed we can, as the following theorem and its proof show.

Theorem 4.6 Let f : R → R. If f has a local maximum (or minimum) at c, and if f′(c) exists, then
f′(c) = 0.


Proof. Suppose that f has a local maximum at c. The proof in the case of a local minimum is
similar. Then there is δ > 0 such that for all x ∈ I = (c − δ, c + δ), f(x) ≤ f(c). So, for x < c and
x ∈ I,

\[ \frac{f(x) - f(c)}{x - c} \ge 0 \]

and for x > c and x ∈ I,

\[ \frac{f(x) - f(c)}{x - c} \le 0. \]

But f′(c) exists and so

\[ f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} = \lim_{x \to c^-} \frac{f(x) - f(c)}{x - c} \ge 0. \]

(This left limit is non-negative because f(x) − f(c) ≤ 0 and, for such x, x − c < 0.) But, also,

\[ f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} = \lim_{x \to c^+} \frac{f(x) - f(c)}{x - c} \le 0. \]

So f′(c) ≥ 0 and f′(c) ≤ 0. We must therefore have f′(c) = 0.

The theorem tells us that if f is differentiable on (a, b), then in order to examine all local maxima or
minima, we may restrict attention to the points where the derivative is zero. Of course, in general a
function may have a local maximum or minimum at a point where it is not differentiable. (For
example: f (x) = |x| has a local minimum at x = 0.)

4.2.4 Rolle’s Theorem and the Mean Value Theorem

In this section, we look at three extremely useful and important results. First, we have the following
(which we call the Extreme Value Theorem), which says that a continuous function will have a
maximum and a minimum on a closed and bounded interval.

Theorem 4.7 (Extreme Value Theorem) Suppose the real function f is continuous on the closed
bounded interval [a, b]. Then f is bounded on [a, b] and attains its bounds; that is, there are
c1 , c2 ∈ [a, b] such that

f (c1 ) = min{f (x) : x ∈ [a, b]}, f (c2 ) = max{f (x) : x ∈ [a, b]}.

Proof. Suppose first that f is unbounded above. For each n ∈ N, let x_n be a point in [a, b] such
that f(x_n) > n. The sequence (x_n) is bounded, so has a convergent subsequence (x_{n_k}), tending to
some limit c. Necessarily c ∈ [a, b]. Since f is continuous at c, f(x_{n_k}) → f(c) as k → ∞. But this
contradicts the construction of the sequence (x_n): since f(x_{n_k}) > n_k ≥ k for every k, the sequence
(f(x_{n_k})) is unbounded and so cannot converge.

So f is bounded above. Let M = sup{f(x) : x ∈ [a, b]}. For each n ∈ N, let x_n be a point in [a, b]
such that f(x_n) > M − 1/n. Again take a convergent subsequence (x_{n_k}) of (x_n), tending to some
limit c ∈ [a, b]. Arguing as before, f(x_{n_k}) → f(c); since M − 1/n_k < f(x_{n_k}) ≤ M for every k, we
see that f(c) = M.

The argument showing that f attains a minimum value over [a, b] is very similar.

Note that this result does not hold if, for instance, the domain of f is an open interval (a, b).

The next two results, Rolle’s Theorem and the Mean Value Theorem, concern the derivative. As we
shall see, they are very useful.

Theorem 4.8 (Rolle’s Theorem) Let f be continuous on [a, b] and differentiable on (a, b), and
suppose that f(a) = f(b). Then there exists c ∈ (a, b) such that f′(c) = 0.


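Proof. If f(x) = f(a) for all x ∈ [a, b] then we are done because, in this case, the definition of
derivative shows that f′(c) = 0 for all c ∈ (a, b). (Why?) Otherwise, f takes some value different
from f(a) = f(b), and we may suppose that there is some x ∈ [a, b] such that f(x) > f(a) = f(b).
(If instead f(x) < f(a), the same argument works with the minimum in place of the maximum.)
By the Extreme Value Theorem, there is a point c ∈ [a, b] such that f(c) = max{f(x) : x ∈ [a, b]}.
Since f(c) > f(a) = f(b), this point c lies in (a, b), so f has a local maximum at c and f′(c) exists.
We must therefore have f′(c) = 0, by Theorem 4.6.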

An immediate corollary of this is the (very important) Mean Value Theorem.

Theorem 4.9 (The Mean Value Theorem) Let f be continuous on [a, b] and differentiable on
(a, b). Then there exists c ∈ (a, b) such that

f(b) − f(a) = f′(c)(b − a).

Proof. Let the constant α be given by α = (f(b) − f(a))/(b − a). Then the function g defined by
g(x) = f(x) − αx is continuous on [a, b] and differentiable on (a, b) (because f is) and it satisfies
g(a) = g(b) (this is why we chose α as we did). By Rolle’s Theorem, there is c ∈ (a, b) with
g′(c) = 0. But g′(x) = f′(x) − α, so there is c ∈ (a, b) with f′(c) = α, as required.
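The Mean Value Theorem only asserts that a suitable c exists, but in a concrete case one can often locate it. The following Python sketch does this for the hypothetical example f(x) = x^3 on [0, 2], where the chord has slope α = 4. It uses bisection on f′(x) − α, which works here because f′ is continuous and increasing on [0, 2]; that is a feature of this example, not a hypothesis of the theorem.

```python
def f(x):
    return x ** 3           # hypothetical example function

def f_prime(x):
    return 3 * x ** 2       # its derivative

a, b = 0.0, 2.0
alpha = (f(b) - f(a)) / (b - a)   # slope of the chord from (a, f(a)) to (b, f(b))

# Bisection on f'(x) - alpha: f'(a) < alpha < f'(b) and f' is increasing on [a, b].
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if f_prime(mid) < alpha:
        lo = mid
    else:
        hi = mid
c = (lo + hi) / 2

print(c, f_prime(c), alpha)
```

The computed c is approximately 2/√3 ≈ 1.1547, which lies strictly between a and b and satisfies f′(c) = (f(b) − f(a))/(b − a) up to rounding.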

What is the point of the Mean Value Theorem? Well, there are a number of ways of thinking about
it. One useful interpretation is that it gives precise results about small changes in the value of a
function. We know that if b is close to a, then

\[ \frac{f(b) - f(a)}{b - a} \approx \lim_{b \to a} \frac{f(b) - f(a)}{b - a} = f'(a), \]

where ‘≈’ means ‘approximately equal to’. This implies the familiar fact that

f(b) ≈ f(a) + (b − a)f′(a).

Now, what the Mean Value Theorem tells us is that

f(b) = f(a) + (b − a)f′(c)

for some c between a and b. This is a precise statement: it is an equality, not an approximation.
Note that it involves f′(c) rather than f′(a). The precise value of c is not necessarily known, so it
could be argued that there is still some inherent uncertainty in the statement, but we at least know
that c is between a and b. Consider the following example.

Example Suppose that n is a positive integer and that we want to find a good approximation to
√(n + 1) − √n. If we let f(x) = √x (for x > 0) then f is differentiable on (0, ∞) and

√(n + 1) − √n = f(n + 1) − f(n).

Now, using the derivative in the standard way to obtain an approximation, we see that

f(n + 1) − f(n) ≈ (n + 1 − n)f′(n),

which becomes

\[ \sqrt{n + 1} - \sqrt{n} \approx \frac{1}{2\sqrt{n}}. \]

That’s nice, but how close is the approximation? What does ≈ actually mean?

Suppose instead we use the Mean Value Theorem (MVT). This tells us that, precisely,

f(n + 1) − f(n) = (n + 1 − n)f′(c),

where c is some number in (n, n + 1). That is,

\[ \sqrt{n + 1} - \sqrt{n} = \frac{1}{2\sqrt{c}}. \]


Now, we don’t know c precisely, but we do know that it lies between n and n + 1. This implies that

\[ \frac{1}{2\sqrt{n + 1}} < \frac{1}{2\sqrt{c}} < \frac{1}{2\sqrt{n}}. \]

So the MVT tells us that

\[ \frac{1}{2\sqrt{n + 1}} < \sqrt{n + 1} - \sqrt{n} < \frac{1}{2\sqrt{n}}. \]

This is much more useful: it shows not only that 1/(2√n) is an approximation to √(n + 1) − √n, but
it shows much more, giving a precise range of possible values, close to the approximation. In other
words, we now know something concrete about how precise the approximation is.
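The two-sided bound obtained from the MVT is easy to check for particular values of n. The following Python sketch (the helper name `mvt_bounds` and the sample values of n are illustrative choices of ours) computes the lower bound, the difference of square roots and the upper bound side by side.

```python
import math

def mvt_bounds(n):
    """Return (lower bound, sqrt(n+1) - sqrt(n), upper bound) for a positive integer n."""
    lower = 1 / (2 * math.sqrt(n + 1))
    upper = 1 / (2 * math.sqrt(n))
    return lower, math.sqrt(n + 1) - math.sqrt(n), upper

# A few sample values of n, chosen only for illustration.
results = {n: mvt_bounds(n) for n in (1, 10, 100, 1000, 10000)}

for n, (lower, diff, upper) in results.items():
    print(n, lower, diff, upper)
```

In each row the middle value lies strictly between the other two, and the gap between the bounds shrinks quickly as n grows, which shows how the MVT quantifies the quality of the approximation 1/(2√n).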

The Mean Value Theorem provides a useful tool for proving some of the familiar results about the
derivative.

Definition 4.10 Let f : I → R, where I is some interval. f is increasing (decreasing) on I if for each
x, y ∈ I with x < y, we have f (x) ≤ f (y) (resp. f (x) ≥ f (y)).

Similarly we define f to be strictly increasing if ≤ can be replaced by <.

Theorem 4.11 Let f : R → R be differentiable on (a, b).

1. f′(x) ≥ 0 for all x ∈ (a, b) ⟹ f is increasing on (a, b)
2. f′(x) = 0 for all x ∈ (a, b) ⟹ f is constant on (a, b)
3. f′(x) ≤ 0 for all x ∈ (a, b) ⟹ f is decreasing on (a, b).

This can be proved by contradiction, using the Mean Value Theorem: the results follow from the fact
that, for each pair x_1 < x_2 in (a, b), there is some c ∈ (x_1, x_2) such that
f(x_2) − f(x_1) = (x_2 − x_1)f′(c).

4.3 Differentiation of functions f : R^n → R^m

4.3.1 Partial and directional derivatives

Suppose that f : R^n → R. The partial derivative ∂f/∂x_i at a point a ∈ R^n is the instantaneous rate
of change of the function with respect to x_i, at a.

Formally,

\[ \frac{\partial f}{\partial x_i}(a) = \lim_{h \to 0} \frac{f(a_1, \dots, a_{i-1}, a_i + h, a_{i+1}, \dots, a_n) - f(a)}{h}, \]

if this limit exists.

We may think of the partial derivative ∂f/∂x_i as the rate of change in f as we move in the direction
of the vector e_i, because

\[ \frac{f(a_1, \dots, a_{i-1}, a_i + h, a_{i+1}, \dots, a_n) - f(a)}{h} = \frac{f(a + h e_i) - f(a)}{h}. \]

But we can move in many other directions, and such considerations lead to the notion of directional
derivative.

We define a direction (or direction vector) in Rn to be an n-vector of length 1.

Definition 4.12 (Directional derivative) The directional derivative of f in direction v, at the
point a, is defined to be the limit

\[ D_v f(a) = \lim_{h \to 0} \frac{f(a + hv) - f(a)}{h}, \]

if this limit exists.


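Example Suppose that f : R^2 → R is given by

\[ f\begin{pmatrix} x \\ y \end{pmatrix} = \begin{cases} \dfrac{x^2 y}{x^4 + y^2} & \text{if } (x, y)^T \ne 0 \\[1ex] 0 & \text{if } (x, y)^T = 0. \end{cases} \]

Let v = (u, v)^T be a direction vector (the vector v should not be confused with its second entry v),
and let us take a = 0. For t ≠ 0 we have

\[ \frac{f(0 + t\mathbf{v}) - f(0)}{t} = \frac{1}{t} \cdot \frac{(tu)^2 (tv)}{(tu)^4 + (tv)^2} = \frac{u^2 v}{t^2 u^4 + v^2}. \]

If the entry v ≠ 0, this tends to u^2/v as t → 0, while if v = 0, then it tends to 0. The limit therefore exists in
all cases: that is, f has directional derivatives in every possible direction v at 0.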
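The limiting behaviour of the difference quotient in this example can be checked numerically. In the following Python sketch, the direction (cos(π/3), sin(π/3)) and the value t = 10^-6 are illustrative choices of ours; the sketch compares the difference quotient with the limit u^2/v found above, and also evaluates it in a direction whose second entry is 0.

```python
import math

def f(x, y):
    """f(x, y) = x^2*y / (x^4 + y^2) away from the origin, and f(0, 0) = 0."""
    if x == 0 and y == 0:
        return 0.0
    return x**2 * y / (x**4 + y**2)

def quotient(u, v, t):
    """Difference quotient (f(0 + t*(u, v)) - f(0)) / t for t != 0."""
    return (f(t * u, t * v) - f(0.0, 0.0)) / t

# A direction with non-zero second entry: the quotient should be close to u^2 / v.
u, v = math.cos(math.pi / 3), math.sin(math.pi / 3)
approx = quotient(u, v, 1e-6)
limit = u**2 / v

# The direction (1, 0), whose second entry is 0: the quotient is exactly 0.
along_x_axis = quotient(1.0, 0.0, 1e-6)

print(approx, limit, along_x_axis)
```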

4.3.2 The derivative of f : R^n → R^m

It is useful to describe a function f : R^n → R^m in terms of its component functions f_i : R^n → R, by
writing

f(x) = (f_1(x), f_2(x), . . . , f_m(x))^T.

Let’s start informally. Suppose we want to approximate the change in f when x changes from a to
a + h where

h = (h_1, h_2, . . . , h_n)^T.

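Suppose also that all the partial derivatives exist. If the h_i are small enough, then

\[ \begin{aligned}
f_1(a + h) - f_1(a) &\approx \frac{\partial f_1}{\partial x_1}(a)\,h_1 + \cdots + \frac{\partial f_1}{\partial x_n}(a)\,h_n \\
f_2(a + h) - f_2(a) &\approx \frac{\partial f_2}{\partial x_1}(a)\,h_1 + \cdots + \frac{\partial f_2}{\partial x_n}(a)\,h_n \\
&\;\;\vdots \\
f_m(a + h) - f_m(a) &\approx \frac{\partial f_m}{\partial x_1}(a)\,h_1 + \cdots + \frac{\partial f_m}{\partial x_n}(a)\,h_n,
\end{aligned} \]

so we have that

\[ f(a + h) - f(a) = \begin{pmatrix} f_1(a + h) - f_1(a) \\ \vdots \\ f_m(a + h) - f_m(a) \end{pmatrix} \approx \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1}(a) & \cdots & \dfrac{\partial f_1}{\partial x_n}(a) \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_m}{\partial x_1}(a) & \cdots & \dfrac{\partial f_m}{\partial x_n}(a) \end{pmatrix} \begin{pmatrix} h_1 \\ h_2 \\ \vdots \\ h_n \end{pmatrix}. \]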

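This describes the linear approximation of f at a, and the matrix (or, equivalently, the linear mapping
it describes)

\[ Df(a) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1}(a) & \cdots & \dfrac{\partial f_1}{\partial x_n}(a) \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_m}{\partial x_1}(a) & \cdots & \dfrac{\partial f_m}{\partial x_n}(a) \end{pmatrix} \]

is known as the derivative (or the Jacobian derivative) of f at a.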
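To make the linear approximation concrete, the following Python sketch builds the matrix of partial derivatives numerically and compares Df(a)h with the actual change f(a + h) − f(a). Everything specific in it is an assumption made for illustration: the map f(x_1, x_2) = (x_1^2 x_2, x_1 + sin x_2), the point a = (1, 2), the increment h, and the use of central differences to approximate the partial derivatives.

```python
import math

def f(x):
    """A hypothetical map from R^2 to R^2, used only for illustration."""
    x1, x2 = x
    return [x1**2 * x2, x1 + math.sin(x2)]

def numerical_jacobian(func, a, h=1e-6):
    """Approximate the m x n matrix of partial derivatives by central differences."""
    m = len(func(a))
    n = len(a)
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        forward = list(a)
        backward = list(a)
        forward[j] += h
        backward[j] -= h
        f_plus, f_minus = func(forward), func(backward)
        for i in range(m):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return jac

a = [1.0, 2.0]
J = numerical_jacobian(f, a)

# Partial derivatives computed by hand for this particular f.
exact = [[2 * a[0] * a[1], a[0] ** 2], [1.0, math.cos(a[1])]]

# Compare the actual change f(a + h) - f(a) with the linear approximation J h.
h_vec = [1e-3, -2e-3]
change = [p - q for p, q in zip(f([a[0] + h_vec[0], a[1] + h_vec[1]]), f(a))]
linear = [sum(J[i][j] * h_vec[j] for j in range(2)) for i in range(2)]

print(J)
print(change, linear)
```

For this small h the actual change and the linear approximation agree to within a few parts in a million, the difference being of the order of ‖h‖^2.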

The ‘argument’ just given is not precise. We now take a more formal approach, in which we shall see
that some conditions on the partial derivatives, other than simply their existence, are required to make
the argument watertight. First, we start with the ‘proper’ formal definition of what is meant by the
derivative of a function f : Rn → Rm .


Definition 4.13 A function f : R^n → R^m is differentiable at a ∈ R^n if there exists a linear function
x ↦ Ax (where A is an m × n matrix) from R^n to R^m such that

\[ \frac{f(a + h) - f(a) - Ah}{\|h\|} \to 0 \]

as h → 0. We call the linear function (or, equivalently, the matrix A representing it) the derivative of
f at a and denote it by Df(a).

In the statement of the next theorem, by a neighbourhood of a ∈ R^n we mean a set of the form
{x ∈ R^n : ‖x − a‖ < ε}, for some ε > 0. (There are other, more general interpretations of
‘neighbourhood’, but this will do for now.)

Suppose that f : R^n → R. Then the gradient of f at a point a is defined to be the column vector

\[ \nabla f(a) = \begin{pmatrix} \dfrac{\partial f}{\partial x_1}(a) \\ \dfrac{\partial f}{\partial x_2}(a) \\ \vdots \\ \dfrac{\partial f}{\partial x_n}(a) \end{pmatrix}. \]

Theorem 4.14 Suppose that f : R^n → R^m, that f_1, f_2, . . . , f_m are the component functions, and
that a ∈ R^n. Then:

• f is differentiable at a ⟹ f is continuous at a.
• f is differentiable at a ⟺ f_i is differentiable at a (for i = 1, 2, . . . , m).
• If f is differentiable at a then

\[ Df(a) = \begin{pmatrix} (\nabla f_1(a))^T \\ \vdots \\ (\nabla f_m(a))^T \end{pmatrix} = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1}(a) & \cdots & \dfrac{\partial f_1}{\partial x_n}(a) \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_m}{\partial x_1}(a) & \cdots & \dfrac{\partial f_m}{\partial x_n}(a) \end{pmatrix}. \]

• If the partial derivatives ∂f_i/∂x_j (for i = 1, . . . , m and j = 1, . . . , n) all exist in a neighbourhood
  of a and are continuous at a, then f is differentiable at a.

In the special case m = 1, we have the following result.

Theorem 4.15 Suppose f : R^n → R is differentiable at a. Then

• all directional derivatives of f at a exist, and D_v f(a) = (∇f(a))^T v
• Df(a) = (∇f(a))^T.

The gradient has a useful interpretation. We have seen that the rate of change of f at a in the
direction v is the directional derivative

D_v f(a) = (∇f(a))^T v.

This may be expressed as the inner (or scalar) product ⟨∇f(a), v⟩, which equals

‖∇f(a)‖ ‖v‖ cos θ = ‖∇f(a)‖ cos θ,

where θ denotes the angle between the vectors ∇f(a) and v, and where we have used the fact that
‖v‖ = 1. This quantity is maximised when cos θ = 1, so it is maximised when the direction v is in the
same direction as ∇f(a). Suppose ∇f(a) ≠ 0. Since directions have length 1, this means that the
maximising v is

\[ v = \frac{\nabla f(a)}{\|\nabla f(a)\|}. \]


So the function f increases most rapidly in the direction of the gradient.
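The claim that the gradient direction maximises the directional derivative can be illustrated by brute force in two dimensions. The Python sketch below takes a hypothetical gradient vector (2, 3), which is for instance the gradient of f(x, y) = x^2 + 3y at the point (1, 2), scans 3600 equally spaced unit vectors v, and picks the one for which (∇f(a))^T v is largest. The function, the point and the grid of directions are illustrative assumptions.

```python
import math

grad = (2.0, 3.0)   # hypothetical gradient, e.g. of f(x, y) = x^2 + 3y at (1, 2)
grad_norm = math.hypot(*grad)

def directional_derivative(theta):
    """(grad f)^T v for the unit vector v = (cos(theta), sin(theta))."""
    return grad[0] * math.cos(theta) + grad[1] * math.sin(theta)

# Scan unit vectors at 0.1 degree spacing and keep the best one.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best_theta = max(angles, key=directional_derivative)
best_v = (math.cos(best_theta), math.sin(best_theta))

# The direction predicted by the argument above: the normalised gradient.
unit_grad = (grad[0] / grad_norm, grad[1] / grad_norm)

print(best_v, unit_grad)
print(directional_derivative(best_theta), grad_norm)
```

Up to the resolution of the grid, the best direction found coincides with ∇f(a)/‖∇f(a)‖, and the largest directional derivative is ‖∇f(a)‖.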

WARNING! Note that directional derivatives in all directions can exist, even if f is not differentiable.
In other words, the existence of all directional derivatives does not imply the existence of the
derivative. To see this, consider the following example.

Example Consider again the function f : R^2 → R given by

\[ f\begin{pmatrix} x \\ y \end{pmatrix} = \begin{cases} \dfrac{x^2 y}{x^4 + y^2} & \text{if } (x, y)^T \ne 0 \\[1ex] 0 & \text{if } (x, y)^T = 0. \end{cases} \]

We saw in an earlier example that f has directional derivatives in all directions at 0. However, f is
not differentiable at 0. In fact, it is not even continuous there, since, for example,
f(t, t^2) = t^4/(t^4 + t^4) = 1/2 for all t ≠ 0, so that f(t, t^2) ↛ 0 = f(0) as t → 0.
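The failure of continuity at 0 can also be observed numerically. The following Python sketch (the sample values of t are an illustrative choice) evaluates f at points (t, t^2) on the parabola and at points (t, 0) on the horizontal axis, both of which approach the origin as t → 0.

```python
def f(x, y):
    """f(x, y) = x^2*y / (x^4 + y^2) away from the origin, and f(0, 0) = 0."""
    if x == 0 and y == 0:
        return 0.0
    return x**2 * y / (x**4 + y**2)

ts = [10.0 ** (-k) for k in range(1, 6)]

on_parabola = [f(t, t * t) for t in ts]   # points (t, t^2) approaching 0
on_x_axis = [f(t, 0.0) for t in ts]       # points (t, 0) approaching 0

print(on_parabola)
print(on_x_axis)
```

The values on the parabola stay at 1/2 (up to rounding) while those on the axis are 0, so f(x) has no limit as x → 0 and in particular f is not continuous there.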

Learning activity 4.2

In the example just given, why does showing that f is not continuous at 0 establish that the
derivative does not exist there?

4.4 Learning outcomes

At the end of this chapter and the relevant reading, you should be able to:

• state the definition of the derivative, and left and right derivatives, and be able to use the
  definitions to calculate the derivative
• state, and be able to prove, that differentiability implies continuity
• state the product, quotient and chain rules (but the proofs are not needed)
• state the definition of local maxima and minima
• be able to prove that if c is a local maximum or minimum of f, and f is differentiable at c, then
  f′(c) = 0
• state that if f : R → R is continuous on [a, b] then f is bounded and attains its bounds (proof
  not needed)
• state, and be able to prove, the Extreme Value Theorem
• state, and be able to prove and use, Rolle’s Theorem
• state, and be able to prove and use, the Mean Value Theorem
• state, and be able to use, the formal definitions of partial derivatives
• state what is meant by directions and directional derivatives, and be able to calculate directional
  derivatives
• state, and be able to use, the precise definition of the derivative of a function f : R^m → R^n
• state what’s meant by the gradient of f : R^m → R and be able to calculate it
• state that differentiability of f : R^m → R^n implies continuity (no proof needed)
• state that the derivative exists if each component function is differentiable (no proof needed); and
  that if each partial derivative exists and is continuous, then f is differentiable (no proof needed)
• calculate derivatives of functions f : R^m → R^n
• state, and be able to use, the connection between directional derivative and gradient when f is
  differentiable


4.5 Comments on selected activities

Learning activity 4.1 The trick is to realise that

\[
\frac{f(x)g(x) - f(c)g(c)}{x - c}
= \left(\frac{f(x) - f(c)}{x - c}\right) g(x) + f(c) \left(\frac{g(x) - g(c)}{x - c}\right).
\]

Check this! Now, as x → c, (f(x) − f(c))/(x − c) → f′(c), (g(x) − g(c))/(x − c) → g′(c) and,
because g is continuous (since it is differentiable) at c, g(x) → g(c). So the limit of the right hand
side as x → c exists and is f′(c)g(c) + f(c)g′(c). The limit of the left hand side must, of course, be
the same. But, by definition of the derivative, the limit of the left hand side is (fg)′(c). Therefore
(fg)′(c) = f′(c)g(c) + f(c)g′(c).

Learning activity 4.2 If a function is differentiable at a point, then it is also continuous there. So if
it is not continuous, then it cannot be differentiable.

4.6 Exercises

Exercise 4.1 By considering (f(y) − f(x))/(y − x), prove that f(x) = x³ is differentiable, and that
f′(x) = 3x².

Exercise 4.2 Suppose f : R → R is differentiable on R and that f′(x) ≥ K for all x > N, where
K > 0 and N is some real number. Prove that f(x) → ∞ as x → ∞.

Suppose g : R → R is differentiable on R and that g′(x) → L as x → ∞, where L > 0. Prove that
g(x) → ∞ as x → ∞.

Exercise 4.3 Suppose that f : R → R is differentiable (on all of R) and that, for all x, |f′(x)| ≤ M.
Prove that for all x, y ∈ R,
|f(x) − f(y)| ≤ M |x − y|.

Exercise 4.4 Using the Mean Value Theorem, prove that, for all k ≥ 2,
\[ \frac{1}{k} < \log k - \log(k - 1) < \frac{1}{k - 1}. \]
[What function f (x) might you try applying the Mean Value Theorem to? What interval [a, b] is likely
to be relevant?]
Let sn = 1 + 1/2 + . . . + 1/n be the nth partial sum of the harmonic series Σ 1/n. Use the inequalities
above to show that
\[ s_n - 1 < \log n < s_{n-1} < s_n. \]
Now let bn = sn − log n. Prove that (bn ) is decreasing and bounded below. Hence show that there is
a constant γ, with 0 ≤ γ ≤ 1, such that sn = log n + γ + en , where en → 0 as n → ∞. [The
constant γ, known as Euler’s constant, is approximately 0.577.]

Use this result to determine
\[ \lim_{n \to \infty} \left( \frac{1}{n} + \frac{1}{n+1} + \cdots + \frac{1}{3n} \right). \]
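
The behaviour of the sequence bn = sn − log n in Exercise 4.4 can be explored numerically before attempting the proof. The following is only a sketch (the function names harmonic and b, and the sample values of n, are illustrative choices); it suggests, but does not prove, that (bn) decreases towards a value near 0.577.

```python
import math

def harmonic(n):
    """n-th partial sum s_n = 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(1.0 / k for k in range(1, n + 1))

def b(n):
    """b_n = s_n - log n."""
    return harmonic(n) - math.log(n)

# Sample a few widely spaced terms of the sequence (b_n).
values = [b(n) for n in (1, 10, 100, 1000, 10000)]
print(values)
```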

Exercise 4.5 Suppose that f : R → R is differentiable on R and that for all real numbers x,

f′(x) ≥ f(x) > 0.

Prove that
f (x) → ∞ as x → ∞.


Exercise 4.6 Suppose f : R → R is such that for all positive integers n and for all x ∈ R, the nth
derivative f^(n)(x) exists (that is, f is infinitely differentiable on R). Suppose further that for all x ∈ R,
f(x + 1) = f(x).
Use Rolle’s Theorem to prove that for every positive integer n, there is cn ∈ [0, 1) such that
f^(n)(cn) = 0.
[Hint: It’s enough to find a suitable point cn anywhere on the real line (why?). For n = 1 this is just
Rolle’s Theorem. Try it for n = 2, . . . .]

Exercise 4.7 Suppose a, b are real numbers with b > a. Apply Rolle’s Theorem to the function
f(x) = e^(−x) (x − a)(x − b) to prove that the equation
(x − a)(x − b) = (x − a) + (x − b)
has a solution between a and b.

Exercise 4.8 Let f : R² → R be defined by f((x, y)ᵀ) = x² − xy. Set a = (1, 1)ᵀ, and let v be a unit
vector in the direction (2, 1)ᵀ. Find the derivative Df(a), and hence the directional derivative
Dv f(a).

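Exercise 4.9 Find the derivatives of the following functions:

\[
f\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x^2 \sin y \\ x^4 + y \\ e^y \end{pmatrix},
\qquad
g\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} xz \\ xy + z \end{pmatrix}.
\]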

Exercise 4.10 Define f : R² → R as follows:

\[
f\begin{pmatrix} x \\ y \end{pmatrix} =
\begin{cases}
\dfrac{x y^2}{x^2 + y^4} & \text{if } (x, y)^T \neq 0, \\[1ex]
0 & \text{if } (x, y)^T = 0.
\end{cases}
\]

Use the definition of partial derivatives to show that ∂f/∂x and ∂f/∂y both exist at (0, 0)ᵀ. Show,
however, that f is not differentiable at (0, 0)ᵀ.

Exercise 4.11 Let A be an n × n matrix and suppose that f : Rⁿ → R is defined by f(x) = xᵀAx
for all x ∈ Rⁿ. Show that f(a + h) − f(a) − aᵀ(A + Aᵀ)h = hᵀAh. Hence prove that, for h ≠ 0,

\[
\frac{\left| f(a + h) - f(a) - a^T (A + A^T) h \right|}{\|h\|} \le \|Ah\|.
\]

[Hint: recall that, for vectors x, y ∈ Rⁿ, |⟨x, y⟩| ≤ ‖x‖ ‖y‖.]

Deduce that f is differentiable on Rⁿ and that
Df(x) = xᵀ(A + Aᵀ).

Exercise 4.12 Let a be a point in Rⁿ, and v be a vector in Rⁿ. The line segment [a, a + v] between
a and a + v is the set {a + tv : t ∈ [0, 1]}. Suppose that f : Rⁿ → R is differentiable. Define
g : R → R by g(t) = f(a + tv). Show that g is differentiable, with derivative g′(t) given by
(Df(a + tv))(v).
[It might help to write v = ku, where u is a unit vector, and make use of the notion of the directional
derivative in direction u.]

By applying the (1-dimensional) Mean Value Theorem to g on the interval [0, 1], prove that there is
some c ∈ [a, a + v] such that
f(a + v) − f(a) = Df(c)v.

Exercise 4.13 For u, v ∈ Rᵐ, let ⟨u, v⟩ be the inner product, equal to uᵀv (or, equally, vᵀu). Let
f, g : Rⁿ → Rᵐ be differentiable at a, and define ⟨f, g⟩ : Rⁿ → R to be the function given by
⟨f, g⟩(x) = ⟨f(x), g(x)⟩. Prove that ⟨f, g⟩ is differentiable at a and that
D⟨f, g⟩(a) = (f(a))ᵀ Dg(a) + (g(a))ᵀ Df(a).

Chapter 5
Topology of Rᵐ

Contents

5.1 Introduction
5.2 Open and closed subsets of R
5.3 Open and closed subsets of Rᵐ
5.4 Continuity
5.5 Compactness
5.6 Learning outcomes
5.7 Comments on selected activities
5.8 Exercises

Reading

Bartle, R.G. and D.R. Sherbert. Introduction to Real Analysis. Chapter 11.

Sutherland, W. A. Introduction to Metric and Topological Spaces. Chapters 2 and 5.

Neither of these readings is ideal in every way. The approaches taken by Bartle and Sherbert to the
concepts of closed set and compactness are different from those taken here, and the Sutherland book
is quite advanced, and of more use for the next chapter.

5.1 Introduction

In this and the next chapter, we explore some important theoretical concepts in analysis. Partly, the
aim is to enable us to generalise and place in a larger context some of the results that we have met
earlier. For instance, the Extreme Value Theorem for a function f : R → R tells us that a continuous
function will have a maximum and a minimum value on a closed interval. You might well ask what’s
so special about closed intervals in order for this to work; or what’s so special about continuous
functions? Will the Theorem also work for other types of domain rather than just closed intervals? To
answer this question, we need to begin to consider some ‘topological’ ideas. (Do not be afraid: at this
point, the word ‘topology’ is not meant to mean anything to you!)

5.2 Open and closed subsets of R

5.2.1 Open sets of real numbers

Very roughly speaking, a set U of real numbers is said to be an open set if around every point of U
there is some room to move in both directions (increasing and decreasing or, if you like, to the left
and right) without leaving the set U . The formal definition is as follows:

Definition 5.1 (Open set of real numbers) A set U ⊆ R is an open set (or is open) if for every
y ∈ U there is some ε = ε(y) > 0 such that (y − ε, y + ε) ⊆ U.


We write ε = ε(y) in this definition to emphasise that ε can, and generally will, depend on y: that is,
y is given and we then find a suitable ε.

Example The open interval U = (1, 2) is open. To see this, let y ∈ U. Then y is between 1 and 2.
Provided we take ε to be no more than the smaller of y − 1 and 2 − y, then (y − ε, y + ε) ⊆ U.
Convince yourself of this! A similar argument shows that any open interval (a, b) is open (which is a
relief, since we refer to it as an ‘open’ interval!).
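
As an informal check of this choice of radius, here is a small sketch (the function names epsilon_for and interval_inside, and the sample points, are illustrative assumptions, not part of the notes). It picks the radius as the smaller of y − 1 and 2 − y and tests sampled points of the resulting interval for membership of U = (1, 2).

```python
def epsilon_for(y, a=1.0, b=2.0):
    """Return a radius eps > 0 with (y - eps, y + eps) inside (a, b), for y in (a, b)."""
    if not (a < y < b):
        raise ValueError("y must lie in the open interval (a, b)")
    return min(y - a, b - y)

def interval_inside(y, eps, a=1.0, b=2.0, samples=1000):
    """Check that sampled points of (y - eps, y + eps) all lie strictly between a and b."""
    for k in range(1, samples):
        z = (y - eps) + 2 * eps * k / samples  # a point strictly inside the small interval
        if not (a < z < b):
            return False
    return True

results = {y: interval_inside(y, epsilon_for(y)) for y in (1.001, 1.5, 1.999)}
print(results)
```

Sampling can only fail to find a counterexample; the formal proof asked for in the learning activity below is what establishes the inclusion.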

Learning activity 5.1

Make sure you understand why, with the chosen value of ε, we have (y − ε, y + ε) ⊆ U. Write down a
formal proof.

Example The interval U = (1, 2] is not open, because if we take x = 2 then no matter how small ε > 0 is,
the interval (2 − ε, 2 + ε) contains numbers greater than 2 and hence does not lie entirely in
U = (1, 2]. (Note that, although there is an open interval in U around every other point of U, the
fact that this fails to hold for the single point 2 is enough to show that U is not open: to be open, we
would need the condition to hold for every point of U.)

WARNING! It should not be thought that all open sets are open intervals: although every open
interval is open, there are many other types of open set. For example, the set (1, 2) ∪ (3, 4) is open,
but it is not an open interval.

5.2.2 Collections of sets

Some results (such as Theorem 5.2 below) hold not just for finite collections of open sets, nor just
for countable collections, but for any collection whatsoever. There’s some useful notation we can use
when we want to work with collections, or families, of sets, and it’s worthwhile mentioning this at this
stage. The reason for introducing this notation is that it will make it much simpler to work with infinite
(rather than finite) collections of sets.

Suppose that S is some set and I is some nonempty (indexing) set such that for each i ∈ I, we have
a set Ai ⊆ S. Thus, {Ai : i ∈ I} is a collection, or family, of sets. The intersection and union of the
sets in the family are easily defined:
\[ \bigcap_{i \in I} A_i = \{x : x \in A_i \text{ for all } i \in I\} \]
and
\[ \bigcup_{i \in I} A_i = \{x : x \in A_i \text{ for at least one } i \in I\}. \]

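We have the De Morgan laws of complementation:
\[
S \setminus \bigcap_{i \in I} A_i = \bigcup_{i \in I} (S \setminus A_i), \qquad
S \setminus \bigcup_{i \in I} A_i = \bigcap_{i \in I} (S \setminus A_i).
\]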

What’s the point here? Well, we could imagine having a set Ai for each positive integer i. But we
could have even more sets: one for each i in some interval of real numbers, for example.
Example Suppose that I = (1, ∞) and that, for each i ∈ I, Ai = (1/i, 2]. Then ⋃_{i∈I} Ai = (0, 2].

Learning activity 5.2


Prove that, with Ai defined as in this example, ⋃_{i∈I} Ai = (0, 2].


5.2.3 Properties of open sets

The following result will be useful.

Theorem 5.2 The union of any collection of open sets is again an open set.

Proof. The Theorem says that the union of any collection of open sets is open. And it really means
any collection: not just a finite collection, not just a countably infinite collection . . . So how do we
prove this? We don’t know what kind of collection we’re dealing with. Well, the point is that any
collection can be written as {Ui : i ∈ I} for some index set I. (This is completely general, and it
covers all possibilities. Special cases are I = {1, 2, . . . , n} if there are n sets in the collection, I = N if
there are countably many sets, and so on.) We need to show that U = ⋃_{i∈I} Ui is open. So let us take
an arbitrary y ∈ U. We need to show that there is some ε > 0 such that (y − ε, y + ε) ⊆ U. Now, the
fact that y ∈ U means precisely that, for some i ∈ I, we have y ∈ Ui. (There may, of course, be more
than one such i.) Because Ui is open, there is some ε > 0 such that (y − ε, y + ε) ⊆ Ui. But Ui ⊆ U
(since U is the union of all the Ui) and hence (y − ε, y + ε) ⊆ U, as required.

WARNING! It is not true that the intersection of any collection of open sets is open. For instance,
suppose that Ui = (−1/i, 1/i) for i ∈ N. Then each set Ui is open. However, the intersection
⋂_{i=1}^∞ Ui is {0}, the set containing only the number 0, and this is not open.

Learning activity 5.3


Verify that ⋂_{i=1}^∞ Ui = {0} and explain why this is not an open set.

It is true, though, that the intersection of a finite collection of open sets is open.

Theorem 5.3 The intersection of a finite number of open sets is open.

Proof. We prove the result for the case of two sets, U1 and U2. (For any other finite number of sets
you can prove the result in a similar way, or we can use induction on the number of sets.) Let
U = U1 ∩ U2. Suppose that y ∈ U. Then, because y ∈ U1 and U1 is open, there is ε1 > 0 so that
(y − ε1, y + ε1) ⊆ U1. Equally, because y ∈ U2 and U2 is open, there is ε2 > 0 so that
(y − ε2, y + ε2) ⊆ U2. Let ε = min(ε1, ε2), the smaller of ε1 and ε2. Then we have

(y − ε, y + ε) ⊆ (y − ε1, y + ε1) ⊆ U1

and
(y − ε, y + ε) ⊆ (y − ε2, y + ε2) ⊆ U2.
So (y − ε, y + ε) ⊆ U1 ∩ U2 = U, and this is what we need.
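
The key step of the proof, taking the smaller of the two radii, can be illustrated on a concrete pair of open intervals. This is a sketch only: the intervals U1 = (0, 3) and U2 = (1, 5), the point y = 2.5 and the helper name radius_inside are all assumptions made for the example.

```python
def radius_inside(y, interval):
    """A radius eps > 0 with (y - eps, y + eps) inside the open interval (a, b)."""
    a, b = interval
    if not (a < y < b):
        raise ValueError("y must lie in the open interval")
    return min(y - a, b - y)

U1 = (0.0, 3.0)
U2 = (1.0, 5.0)
y = 2.5                       # a point of U1 ∩ U2 = (1, 3)

eps1 = radius_inside(y, U1)   # a radius that works for U1
eps2 = radius_inside(y, U2)   # a radius that works for U2
eps = min(eps1, eps2)         # works for both, hence for the intersection
print(eps1, eps2, eps)        # → 0.5 1.5 0.5
```

With infinitely many sets there need be no smallest radius, which is why the argument is restricted to finite collections.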

5.2.4 Closed sets of real numbers

We also have the notion of closed sets. But before defining what this means, we need to clear up one
source of potential confusion. By analogy with the use of the words ‘open’ and ‘closed’ in everyday
language, we might think that a given set of real numbers must be either open or closed, and that if it
is not open, it is closed. This, unfortunately, will not be the case: as we shall see, sets can be open
but not closed, closed but not open, both open and closed, or neither open nor closed!

Here is the formal definition of a closed set:


Definition 5.4 A set C ⊆ R is a closed set (or is closed) if whenever (xn ) is a convergent sequence
and xn ∈ C for all n, then the limit of the sequence, lim xn , is in C.

So a set C is closed if for any convergent sequence of members of C, the limit of the sequence is in
C. This is a tricky definition to work with, but as we shall see shortly, there is another way of
describing closed sets.

Example The interval C = [0, 1] is closed. To see this, suppose that (xn ) is any sequence in C,
converging to a limit L. Then for each n, xn ∈ C, so 0 ≤ xn ≤ 1. Now, it follows from this that
0 ≤ L ≤ 1 (prove this!), so L ∈ C, and hence C is closed.

Example The interval C = (0, 1] is not closed. Consider the sequence (xn ) where xn = 1/n. For all
n, xn ∈ C. The sequence converges to 0, but 0 is not in C. So C is not closed.
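
The second example can be mirrored in a few lines of code. This is a sketch under obvious limitations: only finitely many terms can be tested, the helper name in_C is hypothetical, and the limit 0 is supplied by hand rather than computed.

```python
def in_C(x):
    """Membership test for C = (0, 1]."""
    return 0.0 < x <= 1.0

# The first thousand terms of the sequence x_n = 1/n.
terms = [1.0 / n for n in range(1, 1001)]
all_terms_in_C = all(in_C(x) for x in terms)

limit = 0.0  # the limit of 1/n as n -> infinity, known from the theory of sequences
print(all_terms_in_C, in_C(limit))
```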

We mentioned that ‘closed’ is not the ‘opposite’ of ‘open’, but the following result linking open sets
and closed sets is very useful.

Theorem 5.5 A set C of real numbers is closed if and only if its complement R \ C is open.

Proof. Because this is an ‘if and only if’ result, there are two things to prove here: first, that if C is
closed then R \ C is open; secondly, that if R \ C is open then C is closed.

Suppose, first, that C is closed and consider its complement U = R \ C. We want to show U is open.
Suppose it isn’t. Then there is some y ∈ U such that for no ε > 0 do we have (y − ε, y + ε) ⊆ U. In
other words, for all ε > 0, the interval (y − ε, y + ε) does not lie entirely within U = R \ C and hence
must contain points of C. For any positive integer n, let’s take ε = 1/n. Then there is some
xn ∈ (y − 1/n, y + 1/n) such that xn ∈ C. Because |xn − y| < 1/n, we have that xn → y as
n → ∞. So here we have a sequence (xn) in C such that lim xn = y ∉ C. But this cannot happen
since C is closed. So what’s gone wrong? Well, we supposed that R \ C was not open, and this
supposition must therefore be wrong. So R \ C is open.

Next, suppose that R \ C is open. To prove that C is closed, we need to show that the limit of any
convergent sequence of points of C is in C. So suppose (xn) is a convergent sequence, with xn ∈ C
for all n, and set L = lim xn. We need to show L ∈ C. Suppose this isn’t so. Then L is in the open
set R \ C, so there is some ε > 0 such that (L − ε, L + ε) ⊆ R \ C. Now, because xn → L, there is
some N such that for n > N, |xn − L| < ε, that is xn ∈ (L − ε, L + ε). But then for n > N,
xn ∈ R \ C. This is a contradiction to the fact that xn ∈ C. So we have gone wrong in assuming
that L is not in C. Therefore it is in C, and C is closed.

This theorem is an extremely useful characterisation of closed sets. In fact, we could, if we had
wanted, have taken the definition of a closed set to be a set C whose complement R \ C is open, and
many texts (including that of Bartle and Sherbert) do this.

Example Consider again the set C = [0, 1]. We showed this was closed by using Definition 5.4. But
we can also see that it is closed by considering its complement. For,

R \ C = R \ [0, 1] = (−∞, 0) ∪ (1, ∞),

and this is open because it is the union of two open sets. So, since the complement of C is open, C is
closed.

Example The set R of all real numbers is both open and closed. The interval (0, 1] is neither open
nor closed.

WARNING! As mentioned, it is wrong to think that a set of real numbers must be either open or
closed, and that if it is not open, it is closed. This is not the case. Sets can be open but not closed,
closed but not open, both open and closed, or neither open nor closed. Theorem 5.5 describes a
relationship between closed and open: they are not ‘opposites’.


5.3 Open and closed subsets of Rᵐ

5.3.1 Open balls

For m > 1, the counterpart in Rᵐ to the open interval (y − ε, y + ε) in R is the open ball.

Definition 5.6 (Open Ball) For x ∈ Rᵐ and ε > 0, the open ball of radius ε around x is
Bε(x) = {y ∈ Rᵐ : ‖x − y‖ < ε}.
This is the set of those points y whose distance from x is less than ε.

Example When m = 1, Bε(x) is exactly the open interval (x − ε, x + ε).

Example In R² the open ball Bε(x) is the region enclosed by a circle of radius ε centred at x. Note
that the points on this circle do not lie in Bε(x).

5.3.2 The definition of open set

We have already investigated the notion of open sets of real numbers. All the ideas and results extend
to Rᵐ.

Definition 5.7 A subset U of Rᵐ is open if for any y ∈ U, there is ε = ε(y) > 0 such that
Bε(y) ⊆ U.

Informally, a set is open if, from any point of the set, we can move some positive distance in any
‘direction’ without going outside the set.

The following theorem shows us that any open ball is an open set, but there are other types of open
set. Just as for open sets in R, the union of any collection of open subsets of Rᵐ is again open.

Theorem 5.8 Any open ball is an open set.

Proof. Suppose that B = Bε(x) is an open ball. Let y ∈ B. We need to show there is η > 0 such
that Bη(y) ⊆ B. (We use η rather than ε because the symbol ε is already used in the description of
B.) Now, since y ∈ B, we have that ‖y − x‖ < ε, so the number η = ε − ‖y − x‖ is positive. We will
show that Bη(y) ⊆ B. So, suppose z ∈ Bη(y). Then ‖z − y‖ < η and hence, by the triangle
inequality,
‖z − x‖ ≤ ‖z − y‖ + ‖y − x‖ < η + ‖y − x‖ = ε − ‖y − x‖ + ‖y − x‖ = ε.
So, ‖z − x‖ < ε. This means z ∈ B. So we have established that Bη(y) ⊆ B. It now follows that B
is open.
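
The choice η = ε − ‖y − x‖ can be tested by random sampling in R². The following is a sketch, not a proof: the centre x, radius ε and point y are arbitrary illustrative values, the helper names are hypothetical, and the sampled radii are shrunk by a factor 0.999 so that floating-point rounding cannot affect the comparison.

```python
import math
import random

def norm(v):
    """Euclidean norm of a vector given as a tuple."""
    return math.sqrt(sum(c * c for c in v))

def sub(u, v):
    """Componentwise difference u - v."""
    return tuple(a - b for a, b in zip(u, v))

def inner_radius(x, eps, y):
    """eta = eps - ||y - x||, which is positive when y lies in the open ball B_eps(x)."""
    return eps - norm(sub(y, x))

random.seed(0)
x, eps = (1.0, -2.0), 0.5
y = (1.2, -1.7)                 # ||y - x|| = sqrt(0.13), which is less than 0.5
eta = inner_radius(x, eps, y)

# Sample points z of B_eta(y) and check that each one lies in B_eps(x).
ok = True
for _ in range(10000):
    r = 0.999 * eta * random.random()        # distance from y, kept just below eta
    theta = 2 * math.pi * random.random()    # direction of z - y
    z = (y[0] + r * math.cos(theta), y[1] + r * math.sin(theta))
    ok = ok and norm(sub(z, x)) < eps
print(eta, ok)
```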

5.3.3 Closed sets in Rᵐ

Just as for subsets of R, we can define the notion of a closed subset of Rm .

Definition 5.9 A set C ⊆ Rm is a closed set (or is closed) if whenever (xn ) is a convergent sequence
and xn ∈ C for all n, then the limit of the sequence, lim xn , is in C.

So a set C is closed if for any convergent sequence of members of C, the limit of the sequence is in C.

As for the case m = 1 investigated above, we have the following result, the proof of which is similar
to the one given earlier.


Theorem 5.10 A set C ⊆ Rm is closed if and only if its complement Rm \ C is open.

Learning activity 5.4

Prove Theorem 5.10.

5.4 Continuity

5.4.1 Continuity and open balls

The definition of continuity of a function f : Rⁿ → Rᵐ at a ∈ Rⁿ can be phrased in terms of open
balls.

Theorem 5.11 The function f : Rⁿ → Rᵐ is continuous at a ∈ Rⁿ if and only if given any open ball
Bε(f(a)), there exists δ > 0 such that

f(Bδ(a)) ⊆ Bε(f(a)).

Proof. Recall that f is continuous at a if given any ε > 0 there exists δ > 0 such that if ‖x − a‖ < δ
then ‖f(x) − f(a)‖ < ε. The condition ‖x − a‖ < δ is exactly the same as x ∈ Bδ(a) and the
condition ‖f(x) − f(a)‖ < ε is equivalent to f(x) ∈ Bε(f(a)). Therefore, f is continuous at a if and
only if given any ε > 0 there exists δ > 0 such that

x ∈ Bδ(a) =⇒ f(x) ∈ Bε(f(a)).

But this is the same as saying that f(Bδ(a)) ⊆ Bε(f(a)).

5.4.2 Continuity in terms of open sets

There is a simple characterisation of continuity (on the whole of Rⁿ) involving open sets. To state
this succinctly, we need a new notation. Suppose that f : Rⁿ → Rᵐ is a function, and that B ⊆ Rᵐ.
Then we denote by f⁻¹(B) the subset {x ∈ Rⁿ : f(x) ∈ B} of Rⁿ consisting of all points which f
maps into B. The use of the notation f⁻¹ should not be taken as meaning that the inverse function
f⁻¹ of f exists: the same symbol is used here, but it means something different. (In particular, f⁻¹
as defined here is not a mapping from Rᵐ to Rⁿ but is, instead, a mapping from subsets of Rᵐ to
subsets of Rⁿ.)
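
To make the notation concrete, here is a sketch that computes a preimage over a finite sample of the domain (the helper name preimage, the sample grid and the choice f(x) = x² are all illustrative assumptions). The function x ↦ x² has no inverse function on R, yet the preimage of a set under it is perfectly well defined.

```python
def preimage(f, domain_points, in_B):
    """Points x of a finite sample of the domain with f(x) in B.

    in_B is a membership test for the set B.
    """
    return [x for x in domain_points if in_B(f(x))]

def square(x):
    return x * x

sample = [k / 2 for k in range(-6, 7)]   # -3.0, -2.5, ..., 3.0

# B = (1, 4): the preimage under x -> x^2 is (-2, -1) ∪ (1, 2),
# and the only sample points falling in it are -1.5 and 1.5.
pts = preimage(square, sample, lambda y: 1 < y < 4)
print(pts)   # → [-1.5, 1.5]
```

Note that the preimage here has two separate pieces, one on each side of 0, which is exactly why f⁻¹ cannot be read as a function of points.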

Theorem 5.12 Suppose f : Rⁿ → Rᵐ. Then f is continuous if and only if for all open subsets U of
Rᵐ, f⁻¹(U) is an open subset of Rⁿ.

Proof. Because this is an ‘if and only if’ result, there are two things to prove.

First, suppose that f is continuous. Then we want to show that if U is open, then V = f⁻¹(U) is
open. To do this, we need to show that for each x ∈ V there is some η > 0 so that Bη(x) ⊆ V.
Consider f(x). Because x ∈ f⁻¹(U), we have f(x) ∈ U and because U is open, there is some ε > 0
such that Bε(f(x)) ⊆ U. By continuity of f, there is δ > 0 such that f(Bδ(x)) ⊆ Bε(f(x)). Thus,
for any z ∈ Bδ(x), we have f(z) ∈ Bε(f(x)) ⊆ U. In particular, therefore, this shows that anything in
Bδ(x) is mapped by f into U and hence Bδ(x) ⊆ f⁻¹(U) = V. So we may take η = δ.

Next, suppose that it is the case that f⁻¹(U) is open whenever U is, and let a ∈ Rⁿ. We want to
show that f is continuous at a. Let ε > 0. Now, U = Bε(f(a)) is open because it is an open ball. So it
follows that V = f⁻¹(Bε(f(a))) is open. We have f(a) ∈ U so a ∈ V. Because V is open, there is some
δ > 0 such that Bδ(a) ⊆ V = f⁻¹(Bε(f(a))). This means that f(Bδ(a)) ⊆ Bε(f(a)) and hence, by
Theorem 5.11, that f is continuous at a.

5.5 Compactness

5.5.1 Compact sets

The idea of a compact set is extremely important in analysis and its applications (especially to
optimisation). There are a number of ways of defining what we mean by a compact set. The
approach we take is through what is sometimes called ‘sequential compactness’.

Definition 5.13 (compact subset of Rᵐ) A subset C of Rᵐ is said to be compact if any sequence
(xn), where xn ∈ C for all n, has a subsequence converging to a point of C.

WARNING! Make sure you understand this definition. It is not the same as the definition of a closed
set, though at first glance you may think it similar. Recall that a set C is closed if, whenever a
sequence of members of C converges, then its limit is in C. This says nothing at all about sequences
that do not converge. On the other hand, the definition of compactness says something about any
sequence, and not just convergent ones. What it says, to re-iterate, is that C is compact if: when we
take any sequence whose members are in C then that sequence will have a subsequence which
converges and, furthermore the limit of this subsequence lies in C.
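
A simple illustration of the difference: the sequence xn = (−1)ⁿ lies in C = [−1, 1] and does not converge, so the definition of a closed set says nothing about it, but it does have a convergent subsequence with limit in C. The sketch below (with the hypothetical function name x) just lists some terms.

```python
def x(n):
    """x_n = (-1)^n: a sequence in C = [-1, 1] that does not converge."""
    return (-1.0) ** n

terms = [x(n) for n in range(1, 11)]
even_subsequence = [x(2 * k) for k in range(1, 11)]   # x_2, x_4, x_6, ...

print(terms)             # alternates between -1.0 and 1.0
print(even_subsequence)  # constant, so it converges to 1.0, a point of C
```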

5.5.2 Characterising compact subsets of Rᵐ

Recall the Bolzano-Weierstrass theorem:

Theorem 5.14 (Bolzano-Weierstrass Theorem) Every bounded real sequence has a convergent
subsequence.

Consider a closed bounded interval [a, b] of real numbers (where a < b). Suppose that (xn) is a
sequence of real numbers each belonging to [a, b]. This sequence is bounded, so the
Bolzano-Weierstrass theorem tells us that it has a convergent subsequence, with limit L, say. Since
each member of the subsequence is between a and b, so too is L. So we have established:

Theorem 5.15 Any closed bounded interval of real numbers is compact.

But what exactly are the compact subsets of R, and, more generally, Rm ? The following
characterisation, known as the Heine-Borel Theorem, is very useful.

Theorem 5.16 (Heine-Borel Theorem) A subset C of Rᵐ is compact if and only if it is both
closed and bounded.

Proof. We prove one half of the theorem, namely that if C is compact then it must be closed and
bounded. So, suppose C is compact. We want to show that it is closed. To do so, we need to prove
that whenever (xn) is a convergent sequence in C then x = lim xn ∈ C. By compactness, there is
some convergent subsequence of (xn) whose limit is in C. But, since (xn) converges to x, so too do
all of its subsequences. We conclude that x ∈ C. Now we want to show that C is bounded. Suppose
it is not. Then for each n ∈ N, there is some xn ∈ C with ‖xn‖ > n. We show that the sequence
(xn) has no convergent subsequence, contradicting the compactness of C. Suppose that (x_{n_k}) is a
subsequence and that x_{n_k} → x as k → ∞. Then there is some N so that for all k ≥ N,
‖x_{n_k} − x‖ < 1. It follows that for all k ≥ N,

‖x_{n_k}‖ = ‖x_{n_k} − x + x‖ ≤ ‖x_{n_k} − x‖ + ‖x‖ < 1 + ‖x‖.

But
‖x_{n_k}‖ > n_k ≥ k → ∞ as k → ∞,
so this is not possible. So we conclude that C is indeed bounded.

For example, the subset [1, 2] ∪ [3, 6] of R is compact, but [1, 2) is not.

WARNING! Some texts define compactness by saying that a set is compact if and only if it is closed
and bounded. This is a reasonable approach when dealing with Rm , but the definition of compactness
we have given is substantially more general. It can apply (as we shall see) to ‘metric spaces’ other
than Rm .

5.5.3 Continuous functions on compact sets

The following result is useful. Earlier we mentioned that a continuous real function on a closed
interval [a, b] of the real numbers is bounded on the interval and attains its maximum and minimum.
That result can be seen as a special case of the following one.

Theorem 5.17 Suppose that f : Rm → Rn is continuous and that C ⊆ Rm is compact. Then the
image of C under f , f (C) = {f (x) : x ∈ C} is a compact subset of Rn .

Proof. To show that f(C) is compact, take any sequence (yn) in f(C); we need to show that (yn)
has a subsequence converging to some element of f(C). For each n, since yn ∈ f(C), there is some
xn in C such that f(xn) = yn. Consider the sequence (xn) in C: since C is compact, there is some
subsequence (x_{n_k}) converging to a limit x ∈ C. We claim that f(x_{n_k}) → f(x) as k → ∞: to see this,
fix any ε > 0; by continuity of f there is some δ > 0 such that ‖y − x‖ < δ ⇒ ‖f(y) − f(x)‖ < ε, and
by definition of convergence there is some K such that k ≥ K ⇒ ‖x_{n_k} − x‖ < δ. Combining these
two facts gives us that, for any ε > 0, there is some K such that k ≥ K ⇒ ‖f(x_{n_k}) − f(x)‖ < ε, as
required. But now notice that the sequence (f(x_{n_k})) = (y_{n_k}) is a subsequence of the original
sequence (yn) converging to a limit f(x) ∈ f(C), which is what we need.

In particular, when n = 1 we have the following corollary.

Theorem 5.18 Suppose that C is a compact subset of Rm and that the function f : Rm → R is
continuous on C. Then f is bounded on C and it achieves its maximum and minimum. In other
words, the set {f (x) : x ∈ C} is bounded and has a maximum and a minimum: i.e., there are
x1 , x2 ∈ C such that f (x1 ) = max{f (x) : x ∈ C} and f (x2 ) = min{f (x) : x ∈ C}.

Note that the Extreme Value Theorem we met in Chapter 4 follows from this, by taking m = 1 and C
to be a closed and bounded interval [a, b].
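
Theorem 5.18 guarantees that the maximum and minimum exist; it does not say how to find them. As a rough illustration only, the sketch below searches a finite grid on the compact square C = [0, 1] × [0, 1] for an illustrative function f(x, y) = sin x + y² (the function, the grid size and the names are all assumptions). In general a grid search merely approximates the extreme values; for this particular f, which is increasing in each variable on C, the extremes happen to occur at grid points.

```python
import math

def f(x, y):
    """A continuous function on R^2, chosen only for illustration."""
    return math.sin(x) + y * y

def grid(n):
    """(n + 1)^2 evenly spaced points of the compact square C = [0, 1] x [0, 1]."""
    return [(i / n, j / n) for i in range(n + 1) for j in range(n + 1)]

points = grid(100)
x_max = max(points, key=lambda p: f(*p))   # grid point with the largest value of f
x_min = min(points, key=lambda p: f(*p))   # grid point with the smallest value of f
print(x_max, f(*x_max))
print(x_min, f(*x_min))
```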

5.6 Learning outcomes

At the end of this chapter and the relevant reading, you should be able to:

• state what is meant by open and closed subsets of Rᵐ
• prove that sets are open or closed (or not)
• state what is meant by open balls
• prove that open balls are open
• describe continuity in terms of open balls
• describe continuity in terms of open sets, and be able to prove this characterisation
• state what is meant by a compact subset of Rᵐ
• state the Heine-Borel theorem (that the compact subsets of Rᵐ are precisely the closed and
  bounded sets), and be able to prove part of it, namely that the compact subsets are closed and
  bounded
• state, and be able to prove, that the image, under a continuous function, of a compact set is
  compact

5.7 Comments on selected activities

Learning activity 5.1 Because ε is no more than the smaller of y − 1 and 2 − y, we have both that
ε ≤ y − 1 and ε ≤ 2 − y. So:
y − ε ≥ y − (y − 1) = 1
and
y + ε ≤ y + (2 − y) = 2.
Therefore, (y − ε, y + ε) ⊆ (1, 2) = U.

Learning activity 5.2 We need to show that ⋃_{i∈I} Ai = (0, 2]. Probably the easiest approach is to
prove that ⋃_{i∈I} Ai ⊆ (0, 2] and (0, 2] ⊆ ⋃_{i∈I} Ai. For any i ∈ I, Ai = (1/i, 2] ⊆ (0, 2], so we
certainly have ⋃_{i∈I} Ai ⊆ (0, 2]. The more difficult part is to show that (0, 2] ⊆ ⋃_{i∈I} Ai. To do
this, we need to establish that if y ∈ (0, 2] then there is some i ∈ I such that y ∈ Ai. So let y ∈ (0, 2].
If y > 1 then y belongs to all the Ai for i ∈ I = (1, ∞), since 1/i < 1 < y, so it certainly belongs to
their union. Suppose now that y ≤ 1. We can see that y will belong to Ai if and only if 1/i < y, which
means i > 1/y. So, if we take i = 2/y, then i ≥ 2, so i ∈ (1, ∞) = I, and 1/i = y/2 < y, so y ∈ Ai.
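
The harder inclusion amounts to producing, for each y ∈ (0, 2], an index i ∈ I with y ∈ Ai. One choice that works in every case is i = max(2, 2/y); the sketch below (with the hypothetical helper names witness_index and in_A) checks it on a few sample values of y.

```python
def witness_index(y):
    """For y in (0, 2], return some i in I = (1, infinity) with y in A_i = (1/i, 2]."""
    if not (0.0 < y <= 2.0):
        raise ValueError("y must lie in (0, 2]")
    return max(2.0, 2.0 / y)

def in_A(i, y):
    """Membership test for A_i = (1/i, 2]."""
    return 1.0 / i < y <= 2.0

for y in (2.0, 1.0, 0.5, 1e-6):
    i = witness_index(y)
    print(y, i, in_A(i, y))
```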

Learning activity 5.3 We have Ui = (−1/i, 1/i) for i ∈ N. Certainly, for all i ∈ N, 0 ∈ Ui, so
0 ∈ U = ⋂_{i=1}^∞ Ui. We show that no other number belongs to the intersection U. Suppose that x ≠ 0.
Then if i > 1/|x|, we have 1/i < |x|. So either x > 1/i or x < −1/i, and in either case this means
that x is not in Ui. So such an x cannot be in the intersection of the Ui. The set {0} is not open
because any interval of the form (0 − ε, 0 + ε) for ε > 0 must contain numbers other than 0 and is not
therefore contained in {0}.

Learning activity 5.4 The proof is almost identical to that of the m = 1 result, Theorem 5.5: we
simply replace open intervals by open balls.

Suppose, first, that C is closed and consider its complement U = Rᵐ \ C. We want to show U is
open. Suppose it isn’t. Then there is some y ∈ U such that for no ε > 0 do we have Bε(y) ⊆ U. In
other words, for all ε > 0, the open ball Bε(y) does not lie entirely within U = Rᵐ \ C and hence
must contain points of C. For any positive integer n, let’s take ε = 1/n. Then there is some
xn ∈ B1/n(y) such that xn ∈ C. Because ‖xn − y‖ < 1/n, we have that xn → y as n → ∞. So here
we have a sequence (xn) in C such that lim xn = y ∉ C. But this cannot happen since C is closed.
So what’s gone wrong? Well, we supposed that Rᵐ \ C was not open, and this supposition must
therefore be wrong. So Rᵐ \ C is open.

Next, suppose that Rᵐ \ C is open. To prove that C is closed, we need to show that the limit of any
convergent sequence of points of C is in C. So suppose (xn) is a convergent sequence, with xn ∈ C
for all n, and set L = lim xn. We need to show L ∈ C. Suppose this isn’t so. Then L is in the open
set Rᵐ \ C, so there is some ε > 0 such that Bε(L) ⊆ Rᵐ \ C. Now, because xn → L, there is some
N such that for n > N, ‖xn − L‖ < ε, that is xn ∈ Bε(L). But then for n > N, xn ∈ Rᵐ \ C. This
is a contradiction to the fact that xn ∈ C. So we have gone wrong in assuming that L is not in C.
Therefore it is in C, and C is closed.

5.8 Exercises

Exercise 5.1 Let z ∈ Rᵐ and ε > 0. Show that the ‘closed ball’ {x ∈ Rᵐ : ‖x − z‖ ≤ ε} is a closed
subset of Rᵐ.


Exercise 5.2 For each of the following sets, state whether it is open, closed, both or neither.
Justify your answers briefly. [It might be useful to start by sketching each set.]
A = {0, 1} (a subset of R);
B = {(x, y)ᵀ ∈ R² : x > y};
C = {(x, 0)ᵀ ∈ R² : 0 < x < 1}.

Exercise 5.3 Suppose that f : Rᵐ → R is a continuous function and that f(x*) > 0. Show that
there is an open ball B = Bδ(x*) such that f(x) > 0 for all x ∈ B.

Exercise 5.4 Suppose f : R → R is given by f(x) = x². Let A = [0, 1] and B = [−1, 1]. Determine
f(A), f⁻¹(B), f⁻¹(f(A)), and f(f⁻¹(B)).

Exercise 5.5 Prove, using the definition of a closed set, that the intersection of any collection of
closed subsets of Rn is closed.
Now prove the result using both of the following facts: (i) a set is closed if and only if its complement
is open, and (ii) the union of any collection of open sets is open.

Exercise 5.6 Let f : Rⁿ → Rᵐ and g : Rᵖ → Rⁿ be two functions. Recall that the composition
f ◦ g : Rᵖ → Rᵐ is defined by (f ◦ g)(x) = f(g(x)). Show that, for any subset S ⊆ Rᵐ,
(f ◦ g)⁻¹(S) = g⁻¹(f⁻¹(S)).
Hence show that, if f and g are continuous, then so is f ◦ g.

Exercise 5.7 By giving an example, show that it is not generally the case that for a continuous
function f : R → R, f (U ) is open whenever U is.

Exercise 5.8 Define the subset $S$ of $\mathbb{R}^2$ by $S = \{(x, y)^T : x > 0,\ y = \sin(1/x)\}$. Sketch the set $S$.
Show that $S$ is not closed. Show that $T = S \cup \{(0, y)^T : -1 \le y \le 1\}$ is closed.

Exercise 5.9 Prove, using the definition of a closed set, that the union of a finite collection of closed
subsets of Rn is closed. Now prove the result using both of the following facts: (i) a set is closed if
and only if its complement is open, and (ii) the intersection of a finite collection of open sets is open.

Exercise 5.10 Suppose that $\{U_i : i \in I\}$ is a family of subsets of $\mathbb{R}^m$, and consider a function
$f : \mathbb{R}^n \to \mathbb{R}^m$. Prove that
\[ f^{-1}\Big( \bigcup_{i \in I} U_i \Big) = \bigcup_{i \in I} f^{-1}(U_i). \]

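Exercise 5.11 Which of the following sets are compact? Explain your answers briefly. (You might
want to use the Heine-Borel theorem, which tells us that $C \subseteq \mathbb{R}^n$ is compact $\iff$ $C$ is closed and
bounded.)
\[ \{0\} \cup [1, 2], \]
\[ \left\{ \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} : x_1 + x_2 \le 1,\ x_1 \ge 0,\ x_2 \ge 0 \right\}, \]
\[ \left\{ \begin{pmatrix} 1/n \\ 1/n \end{pmatrix} : n \in \mathbb{N} \right\}, \]
\[ \left\{ \begin{pmatrix} x \\ y \end{pmatrix} : 1 \le x^2 y \le 2 \right\}. \]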

Exercise 5.12 Suppose that C ⊆ Rn is compact and that D ⊆ C is a closed subset of C. Use the
Heine-Borel theorem to prove that D is also compact. Now prove the same result directly from the
definitions of closed and compact.

Exercise 5.13 Suppose that B ⊆ R is not compact. Prove that there is a continuous function
f : B → R which is not bounded on B.
[Hint: Recall that if B is not compact, then it is not bounded or not closed. Consider separately the
cases (a) B not bounded, (b) B not closed.]

Chapter 6
Metric spaces

Contents

6.1 Introduction
6.2 Metrics and Metric Spaces
6.3 Open sets
6.4 Continuity in Metric Spaces
6.5 Convergence and closed sets in metric spaces
6.6 Compactness in metric spaces
6.7 Learning outcomes
6.8 Comments on selected activities
6.9 Exercises

Reading

Sutherland, W. A. Introduction to Metric and Topological Spaces. Chapters 2 and 5.

Bryant, Victor. Metric Spaces: Iteration and Application. Chapters 2, 3 and 5.

Neither of these readings is ideal in every way. The Sutherland book is quite advanced. The Bryant
book is useful, but its approach is different, in that it starts with closed sets and only considers open
sets much later on.

6.1 Introduction

In this chapter we unify and generalise some of the key concepts that we met earlier in this course.
The important notions of closed and open set, convergence, continuity and compactness are all set in
the larger context of metric spaces.

6.2 Metrics and Metric Spaces

6.2.1 Towards the idea of a metric space

In the last chapter, we built up our definitions and results from some fairly simple concepts. We
started by defining an open ball in Rn , and then we used that to define an open set. We can write our
definition of what it means for a sequence to converge to a limit in terms of open balls if we like, and
from that idea we defined the notion of a closed set, and a compact set in Rn . Our definition of
continuous function can also be re-written in terms of open balls.

So there are a lot of important concepts springing from the idea of an open ball, which is simply the
set of points at “distance” less than $\epsilon$ from a given point. So really all our key definitions (and most
of the results) depend only on the concept of distance.

If for instance we could define the “distance” between two functions, then we could follow exactly the
same process and get a (hopefully useful!) definition of what it means for a sequence of functions to
converge to another function, or for a set of functions to be compact. Or if we define the “distance”
between two matrices, we can do the same again.

Rather than do exactly the same thing over and over again, the normal mathematical procedure is to
define exactly what we mean by “distance”, and extend our definitions and (where possible) results to
cover any notion of distance that qualifies. So a metric space is going to be a set X, equipped with a
distance “function” (metric), satisfying certain properties. Our basic examples will include the sets R
and Rn , equipped with the Euclidean distance. Then we will give definitions, and prove theorems,
about metric spaces. Mostly these will be the same as we had in the previous chapter.

To see what we want from our “distance”, let's think about our motivating example of the Euclidean
distance
\[ d(x, y) = \|x - y\| = \Big( \sum_{i=1}^n |x_i - y_i|^2 \Big)^{1/2} \]
between two elements $x, y$ of $\mathbb{R}^n$. Here $d$ is a function from the set $\mathbb{R}^n \times \mathbb{R}^n$ of ordered pairs
of elements of $\mathbb{R}^n$ to the set of real numbers. In addition, the distance function $d$ satisfies a few
fairly simple rules that go along with what we expect of anything called a distance. First,
$d(x, y) \ge 0$, and equals 0 only when $x = y$. Also, the distance between $x$ and $y$ is the same as the
distance between $y$ and $x$; that is, $d(x, y) = d(y, x)$. Further, we have the triangle inequality: for
any $x, y, z \in \mathbb{R}^n$,

d(x, z) ≤ d(x, y) + d(y, z).

6.2.2 Definition of a metric space

Abstracting from the above example, we obtain the definition of a metric (or distance function) d on
an arbitrary set X:

Definition 6.1 A metric space M = (X, d) consists of a set X together with a function, called a
metric, d : X × X → R such that

d(x, y) ≥ 0 for all x, y ∈ X, and d(x, y) = 0 ⇐⇒ x = y,


d(x, y) = d(y, x) for all x, y ∈ X,
d(x, z) ≤ d(x, y) + d(y, z) for all x, y, z ∈ X.
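
The three axioms can be spot-checked numerically on a finite sample of points. The sketch below is only an illustration: the helper name `check_metric_axioms`, the sample points and the tolerance are all assumptions introduced here, and a finite check can refute a candidate metric but can never prove that it is one.

```python
import itertools
import math


def check_metric_axioms(d, points, tol=1e-12):
    """Return True if d satisfies the three axioms on every choice from `points`.

    Only finitely many points are tested, so a True result is evidence,
    not a proof, that d is a metric.
    """
    for x, y in itertools.product(points, repeat=2):
        # Non-negativity, and d(x, y) = 0 exactly when x = y.
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y):
            return False
        # Symmetry.
        if abs(d(x, y) - d(y, x)) > tol:
            return False
    for x, y, z in itertools.product(points, repeat=3):
        # Triangle inequality (with a small tolerance for rounding).
        if d(x, z) > d(x, y) + d(y, z) + tol:
            return False
    return True


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def squared_distance(x, y):
    # Not a metric: the triangle inequality fails for collinear points.
    return sum((a - b) ** 2 for a, b in zip(x, y))


sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.5, 1.5)]
print(check_metric_axioms(euclidean, sample))         # True
print(check_metric_axioms(squared_distance, sample))  # False
```

The squared distance fails because, for the points 0, 1 and 2 on a line, it gives 4 for the long side but only 1 + 1 for the two short ones.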

6.2.3 Important examples of metric spaces

Example Take $X = \mathbb{R}^n$ and, for $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$, let
\[ d(x, y) = \sqrt{\sum_{i=1}^n |x_i - y_i|^2}, \]
known as the usual metric or Euclidean metric on $\mathbb{R}^n$.

Example For any set $X$, we have a rather trivial metric known as the discrete metric $d_0$. This is
defined by
\[ d_0(x, y) = \begin{cases} 0 & \text{if } x = y; \\ 1 & \text{if } x \ne y. \end{cases} \]

This last example shows that for any set X, there is always at least one metric defined on X. On a
given set, it may be possible to define a number of different metrics, as the following example shows.

Example Returning again to $\mathbb{R}^n$, we have the discrete metric $d_0$ and, for any positive integer $p$, the
following function is a metric:
\[ d_p(x, y) = \Big( \sum_{i=1}^n |x_i - y_i|^p \Big)^{1/p}. \]


In particular, d2 is the usual metric. For example,

d1 (x, y) = |x1 − y1 | + |x2 − y2 | + . . . + |xn − yn |.

The metric d1 gives the distance one would have to travel between x and y, when only movement
parallel to the coordinate axes was possible. (As a ‘real’ interpretation, note that in the USA, where
many downtown areas have a rectangular grid of streets, d1 is often called the taxicab metric.)
Another metric on $\mathbb{R}^n$, which we shall denote $d_\infty$, is defined by
\[ d_\infty(x, y) = \max_{1 \le i \le n} |x_i - y_i|. \]
(Note that, for given $x$ and $y$, $d_p(x, y) \to d_\infty(x, y)$ as $p \to \infty$.)
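
The limit $d_p(x, y) \to d_\infty(x, y)$ can be observed numerically. The following is a minimal sketch; the function names `d_p` and `d_inf` and the two sample points are chosen here purely for illustration.

```python
def d_p(x, y, p):
    """The metric d_p on R^n for a real number p >= 1."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def d_inf(x, y):
    """The metric d_infinity: the largest coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))


# Coordinate differences are 2, 3 and 0.5, so d_infinity is 3.
x = (1.0, 4.0, -2.0)
y = (3.0, 1.0, -2.5)

for p in (1, 2, 4, 16, 64):
    print(p, d_p(x, y, p))
print("inf", d_inf(x, y))
```

The printed values decrease from $d_1(x, y) = 5.5$ towards $d_\infty(x, y) = 3$ as $p$ grows, because the largest coordinate difference dominates the sum of $p$-th powers.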

So far, we have defined metrics on the nice, familiar, Euclidean spaces Rn . But the notion of metric
space has far more to offer. Consider the following.

Example Let C[0, 1] be the set of all continuous functions f : [0, 1] → R. Recall that each such
function is bounded. As for Euclidean space, we can define a family of metrics on C[0, 1]. We define

d∞ (f, g) = sup {|f (x) − g(x)| : x ∈ [0, 1]}

and, for $p \ge 1$,
\[ d_p(f, g) = \left( \int_0^1 |f(x) - g(x)|^p \, dx \right)^{1/p}. \]

The most important examples are p = 1 and p = 2. The metric d∞ is often called the ‘sup metric’.
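
To get a feel for these metrics on $C[0, 1]$, one can approximate them numerically. The sketch below is an illustration under stated assumptions: the helper names `sup_metric` and `integral_metric`, the grid sizes, and the choice of a uniform grid and the midpoint rule are all made here and are not part of the notes. A grid maximum can only underestimate a supremum.

```python
def sup_metric(f, g, samples=10001):
    """Approximate d_inf(f, g) on [0, 1] by the maximum over a uniform grid."""
    return max(abs(f(i / (samples - 1)) - g(i / (samples - 1)))
               for i in range(samples))


def integral_metric(f, g, p=1, samples=10000):
    """Approximate d_p(f, g) on [0, 1] with the midpoint rule."""
    h = 1.0 / samples
    total = sum(abs(f((i + 0.5) * h) - g((i + 0.5) * h)) ** p
                for i in range(samples))
    return (total * h) ** (1.0 / p)


def f(x):
    return x


def g(x):
    return x * x


print(sup_metric(f, g))          # close to 1/4, attained at x = 1/2
print(integral_metric(f, g, 1))  # close to 1/6
print(integral_metric(f, g, 2))  # close to (1/30) ** 0.5
```

For $f(x) = x$ and $g(x) = x^2$ the exact values are $d_\infty(f, g) = 1/4$, $d_1(f, g) = 1/6$ and $d_2(f, g) = 1/\sqrt{30}$, so the same pair of functions is at a different distance in each metric.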

Example Let $A(m, n)$ be the set of $m \times n$ real matrices, and define
\[ d(X, Y) = \max_{i,j} |x_{i,j} - y_{i,j}|. \]
Then $d$ is a metric on $A(m, n)$.

Learning activity 6.1

Prove that this function d is a metric on A(m, n).

Example Again let $A(m, n)$ be the set of $m \times n$ real matrices. Now define the norm $\|M\|$ of a matrix
$M$ to be
\[ \|M\| = \max\{\|Mx\| : x \in \mathbb{R}^n,\ \|x\| = 1\}. \]
To see that this is well-defined, notice that the function $f_M : \mathbb{R}^n \to \mathbb{R}$ defined by $f_M(x) = \|Mx\|$ is
continuous, and the set $S = \{x \in \mathbb{R}^n : \|x\| = 1\}$ is closed and bounded, so compact. Hence the
function $f_M$ has a maximum value on $S$, which is by definition $\|M\|$. We then have, for any $M$ and
$x$, that $\|Mx\| \le \|M\|\|x\|$. Now we define a metric on $A(m, n)$ by setting $d(M, N) = \|M - N\|$, for
matrices $M, N \in A(m, n)$.
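
For $2 \times 2$ matrices the maximum over the unit circle can be estimated by brute force, which makes the definition concrete. This is a rough sketch only: the function names, the restriction to the $2 \times 2$ case and the sampling of the circle at equally spaced angles are assumptions made here, and the sampled maximum is in general a slight underestimate of $\|M\|$.

```python
import math


def operator_norm_2x2(M, samples=3600):
    """Estimate max{||Mx|| : ||x|| = 1} for a 2x2 matrix M (given as a list of rows)."""
    best = 0.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        x = (math.cos(t), math.sin(t))  # a unit vector
        Mx = (M[0][0] * x[0] + M[0][1] * x[1],
              M[1][0] * x[0] + M[1][1] * x[1])
        best = max(best, math.hypot(Mx[0], Mx[1]))
    return best


def matrix_distance(M, N):
    """d(M, N) = ||M - N|| for 2x2 matrices."""
    diff = [[M[i][j] - N[i][j] for j in range(2)] for i in range(2)]
    return operator_norm_2x2(diff)


stretch = [[3.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(operator_norm_2x2(stretch))          # about 3, attained at x = (1, 0)
print(matrix_distance(stretch, identity))  # about 2, the norm of diag(2, 0)
```

In both cases the maximising unit vector $(1, 0)$ is one of the sampled points, so the estimates agree with the exact norms up to rounding.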

6.2.4 Bounded subsets

Thinking of a metric space as having a distance defined on it, we can discuss boundedness, and so on,
in this more general context.

Definition 6.2 Suppose that (X, d) is a metric space and Y ⊆ X. Then Y is bounded if there is
K ∈ R such that for all x, y ∈ Y , d(x, y) ≤ K.

The following theorem is quite easy to see.

Theorem 6.3 If $Y_1, Y_2, \ldots, Y_n$ are bounded subsets of a metric space, then so too is $\bigcup_{i=1}^n Y_i$.

It is not necessarily true that an infinite union of bounded sets is bounded, however.

Learning activity 6.2

Give an example of an infinite family of bounded sets in some metric space, the union of which is not
bounded.

6.2.5 Open balls

Let $(X, d)$ be a metric space. For $x \in X$ and $\epsilon > 0$, the open ball of radius $\epsilon$ around $x$ is

\[ B_\epsilon(x) = \{y : d(x, y) < \epsilon\}. \]

(If the metric is not clear, we use the notation $B_\epsilon(x; d)$.)

Example In $\mathbb{R}$ with the usual (Euclidean) metric, $B_\epsilon(x) = (x - \epsilon, x + \epsilon)$.

Example In the space $(\mathbb{R}^2, d_2)$ ($\mathbb{R}^2$ with the usual metric), the open ball $B_\epsilon(x)$ is the region enclosed
by a circle of radius $\epsilon$ centred at $x$; note that the points on this circle do not lie in $B_\epsilon(x)$. On the
other hand, with respect to the metric $d_1(x, y) = |x_1 - y_1| + |x_2 - y_2|$, the open ball $B_\epsilon(x)$ is a
square, aligned diagonally, whose sides have length $\sqrt{2}\,\epsilon$. With respect to the metric
$d_\infty(x, y) = \max(|x_1 - y_1|, |x_2 - y_2|)$, the open ball is an axis-aligned square of side-length $2\epsilon$.

Example If $X$ is any set then the open balls in the discrete metric space $M = (X, d_0)$ are given by

\[ B_\epsilon(x) = \begin{cases} \{x\} & \text{if } \epsilon \le 1; \\ X & \text{if } \epsilon > 1. \end{cases} \]
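
The different shapes of the unit balls in $\mathbb{R}^2$ can be seen by testing a few points for membership. The sketch below is illustrative only; the function names and the three test points are chosen here and do not come from the notes.

```python
def d1(x, y):
    return abs(x[0] - y[0]) + abs(x[1] - y[1])


def d2(x, y):
    return ((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2) ** 0.5


def d_inf(x, y):
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))


def in_open_ball(point, centre, radius, metric):
    """True if `point` lies in the open ball of the given radius around `centre`."""
    return metric(centre, point) < radius


origin = (0.0, 0.0)
for point in [(0.5, 0.4), (0.6, 0.6), (0.8, 0.8)]:
    membership = [in_open_ball(point, origin, 1.0, m) for m in (d1, d2, d_inf)]
    print(point, membership)
```

The point $(0.5, 0.4)$ lies in all three unit balls, $(0.6, 0.6)$ lies in the disc and the axis-aligned square but not in the diagonal square, and $(0.8, 0.8)$ lies only in the axis-aligned square. This matches the picture of the $d_1$-ball sitting inside the $d_2$-ball, which sits inside the $d_\infty$-ball of the same radius.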

6.3 Open sets

6.3.1 The definition of open set

The open balls of a metric space have the following key property: given an open ball $B_\delta(x)$ in a
metric space and a point $y$ of $B_\delta(x)$, there is $\epsilon = \epsilon(y) > 0$ such that $B_\epsilon(y) \subseteq B_\delta(x)$. To see this, set
$\epsilon = \delta - d(x, y)$, and use the triangle inequality. In general, we call sets with this property open sets.

Definition 6.4 A subset $U$ of a metric space $M$ is open (in $M$) if, for any $y \in U$, there is $\epsilon > 0$
(depending on $y$) such that $B_\epsilon(y) \subseteq U$.

Informally, a set is open if from any point of the set, we can move some positive distance in any
‘direction’ without going outside the set.
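
The key property of open balls stated before Definition 6.4 rests on a single application of the triangle inequality. Spelled out, with $\epsilon = \delta - d(x, y)$ and $z$ an arbitrary point of $B_\epsilon(y)$ (the name $z$ is introduced here only for illustration):

```latex
\begin{align*}
d(x, z) &\le d(x, y) + d(y, z) \\
        &< d(x, y) + \epsilon \\
        &= d(x, y) + \bigl(\delta - d(x, y)\bigr) = \delta .
\end{align*}
```

So every $z \in B_\epsilon(y)$ lies in $B_\delta(x)$, that is, $B_\epsilon(y) \subseteq B_\delta(x)$. Note that $\epsilon > 0$ precisely because $y \in B_\delta(x)$ means $d(x, y) < \delta$.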

Example If d0 is the discrete metric on a set X, then every subset of X is an open set in (X, d0 ),
since every singleton subset is an open ball (e.g., B1 (x) = {x}).

Example A singleton subset {x} is not open in R with the usual metric, but is open if we use the
discrete metric. For clarity, when it is not perfectly clear what metric is being used, we should speak
of ‘d-open sets’. Usually there will be no confusion.

Example Consider R2 with the usual metric. Let

\[ U = \{(x_1, x_2) : a < x_1 < b,\ c < x_2 < d\} \]

be the interior of a rectangle. Then $U$ is an open set, but not an open ball.

Example Consider again $\mathbb{R}^2$, with the three different metrics $d_1$ (the taxicab metric), $d_2$ (the
Euclidean metric), and $d_\infty$. Let $U$ be a $d_1$-open subset of $\mathbb{R}^2$, and let $x$ be any point of $U$. Then
there is a $d_1$-open ball $A = B_\epsilon(x; d_1)$ around $x$ contained in $U$; so $A$ is the interior of a
diagonally-aligned square of side-length $\epsilon\sqrt{2}$. Then it is possible to fit the $d_2$-open ball
$B_{\epsilon/\sqrt{2}}(x; d_2)$ (a disc of radius $\epsilon/\sqrt{2}$) and the $d_\infty$-open ball $B_{\epsilon/2}(x; d_\infty)$ (an axis-aligned
square of side-length $\epsilon$) inside $A$, and so inside $U$. This tells us that $U$ is also $d_2$-open and $d_\infty$-open.

A similar argument shows that if U is d2 -open, then it is also d1 - and d∞ -open, and if U is d∞ -open,
then it is also d1 - and d2 -open. We say that the three metrics are equivalent metrics.
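
One way to organise the ‘similar argument’ is through a chain of inequalities between the three metrics on $\mathbb{R}^2$. The following is a sketch: the inequalities are standard, but packaging them as ball inclusions is only one of several possible routes to the equivalence.

```latex
\[
d_\infty(x, y) \;\le\; d_2(x, y) \;\le\; d_1(x, y) \;\le\; 2\, d_\infty(x, y)
\qquad \text{for all } x, y \in \mathbb{R}^2 ,
\]
\[
B_\epsilon(x; d_1) \;\subseteq\; B_\epsilon(x; d_2) \;\subseteq\; B_\epsilon(x; d_\infty) \;\subseteq\; B_{2\epsilon}(x; d_1) .
\]
```

Each inclusion follows from the corresponding inequality; for example, if $d_1(x, y) < \epsilon$ then $d_2(x, y) \le d_1(x, y) < \epsilon$. Consequently, a set that contains a ball around $x$ with respect to one of the three metrics contains a ball around $x$ with respect to each of the others.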

Results about open sets in general metric spaces can be proved in much the same way as their
counterparts applying to the case of Rn (with the usual metric). The following theorem provides an
example of this.

Theorem 6.5 Let $(X, d)$ be a metric space. Then

\[ U_1, U_2 \text{ open} \implies U_1 \cap U_2 \text{ open}; \]
\[ U_i \ (i \in I) \text{ open} \implies \bigcup_{i \in I} U_i \text{ open}. \]

6.4 Continuity in Metric Spaces

6.4.1 The definition of continuity

We mentioned earlier that with a concept of distance, we ought to be able to generalise the definition
of continuity to functions having as domain and codomain general metric spaces.

Definition 6.6 Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces, and let $f$ be a function from $X$ to $Y$.
Suppose that $a \in X$. Then, we say that $f$ is $(d_X, d_Y)$-continuous (or, simply, continuous) at $a$ if,
given any $\epsilon > 0$, there exists $\delta > 0$ such that

\[ d_X(x, a) < \delta \implies d_Y(f(x), f(a)) < \epsilon. \]

This can immediately be rephrased in terms of open balls.

Definition 6.7 The function $f : X \to Y$ is continuous at $a \in X$ if given any open ball $B_\epsilon(f(a); d_Y)$
around $f(a)$, there exists $\delta > 0$ such that

\[ f(B_\delta(a; d_X)) \subseteq B_\epsilon(f(a); d_Y); \]

that is,

\[ B_\delta(a; d_X) \subseteq f^{-1}(B_\epsilon(f(a); d_Y)). \]

If f is continuous on the whole of X, we shall simply say that f is continuous.

Example Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. Suppose that $d_X$ is the discrete metric $d_0$.
Then, for any $a \in X$ and any $\epsilon > 0$,

\[ d_0(x, a) < 1 \implies x = a \implies f(x) = f(a) \implies d_Y(f(x), f(a)) = 0 < \epsilon. \]

Thus (taking $\delta = 1$ in the definition of continuity) any function $f$ from $X$ to $Y$ is $(d_0, d_Y)$-continuous.

Suppose on the other hand that dY is the discrete metric, and again let f be a function from X to Y .
For f to be (dX , d0 )-continuous at a point a ∈ X, we need (taking  = 1) to find δ such that

dX (x, a) < δ =⇒ d0 (f (x), f (a)) < 1;


that is, we require δ such that


dX (x, a) < δ =⇒ f (x) = f (a).
Thus, if $f$ is to be $(d_X, d_0)$-continuous on $X$, $f$ must be ‘locally constant’ in the sense that, for every
$a \in X$, there is $\delta > 0$ such that the restriction $f|_{B_\delta(a)}$ is a constant function. (For most of our examples of metric
spaces $(X, d)$, this is only possible if $f$ is a constant function.)

6.4.2 Continuity in terms of open sets

Continuity can be characterised completely in terms of open sets. (We saw a special case of this in
the previous chapter, when the spaces are Rm and Rn with the usual metrics.)

Theorem 6.8 Let (X, dX ) and (Y, dY ) be metric spaces. Then a mapping f : X → Y is
(dX , dY )-continuous if and only if, for every dY -open subset U of Y , f −1 (U ) is a dX -open subset of
X.

It is not, in general, true that if $f : X \to Y$ is continuous and if $U \subseteq X$ is open, then $f(U) \subseteq Y$ is
open.

Learning activity 6.3

Give an example to show that it is possible for a function f : X → Y between metric spaces to be
continuous, for U to be an open subset of X, and yet for f (U ) not to be an open subset of Y .

6.5 Convergence and closed sets in metric spaces

6.5.1 Definition of convergence

It is an easy matter to extend the notion of convergence to general metric spaces. Recall the
definition of convergence of a real sequence: The sequence $(x_n)$ converges to $x \in \mathbb{R}$ if for all $\epsilon > 0$,
there is $N$ such that
\[ n \ge N \implies |x_n - x| < \epsilon. \]
Thus the following definition is completely natural.

Definition 6.9 Suppose that $(X, d)$ is a metric space and that $(x_n)$ is a sequence in $X$. We say that
$(x_n)$ converges to $x \in X$ if for any $\epsilon > 0$ there is $N$ such that

\[ n \ge N \implies d(x_n, x) < \epsilon. \]

Of course, $d(x_n, x) < \epsilon$ is equivalent to $x_n \in B_\epsilon(x)$.

Now that we have a definition of convergence, we can say precisely what we should mean by a closed
subset of a metric space.

Definition 6.10 Suppose (X, d) is a metric space. A subset C of X is closed if, whenever (xn ) is a
sequence of elements of C converging to a limit x, the limit x is also in C.

Example If d0 is the discrete metric on any set X, then a sequence (xn ) converges to x if and only if
there is an N such that n ≥ N =⇒ xn = x. (I.e., a sequence is convergent if and only if it is
eventually constant.) It follows that all subsets of X are closed in the discrete metric.

The following theorem generalises one from the previous chapter.


Theorem 6.11 Suppose (X, d) is a metric space. A set C ⊆ X is closed if and only if its
complement X \ C is open.

6.6 Compactness in metric spaces

6.6.1 Definition of compactness

The definition of (‘sequential’) compactness in a metric space is as follows.

Definition 6.12 Suppose that (X, d) is a metric space. A subset C of X is said to be (sequentially)
compact if any sequence (xn ), where xn ∈ C for all n, has a subsequence converging to a point of C.

If X itself is a compact subset, we simply say that the metric space (X, d) is compact.

Example Let X be any set and let d0 be the discrete metric on X. Then (X, d0 ) is compact if and
only if X is finite.

Learning activity 6.4

Prove the statement in this example.

6.6.2 Closed-ness and boundedness

Theorem 6.13 Any compact subset C of a metric space (X, d) is closed and bounded.

We saw earlier that, for $\mathbb{R}^m$ with the usual metric, the converse also holds: any closed and bounded
subset of $\mathbb{R}^m$ is in fact compact, so a subset of $\mathbb{R}^m$ is compact if and only if it is closed and
bounded. However, the next example shows that the converse is not true in a general metric space.

Example Let d0 be the discrete metric on an infinite set X. Then X itself is a closed and bounded
set in (X, d0 ), but it is not compact.

6.6.3 Continuous functions on compact sets

The following result is a more general form of a key theorem given in the previous chapter. It is often
phrased as ‘The continuous image of a compact set is compact.’

Theorem 6.14 Suppose that (X, dX ) and (Y, dY ) are metric spaces, and that C is a compact subset
of X. If f : X → Y is (dX , dY )-continuous, then f (C) is a compact subset of Y .

Proof. One point about this proof: we must not assume that the elements of X and Y are real
numbers, or elements of some Rn , or whatever. All we know is that they are elements of some metric
space, and that’s all we may use.

To show that $f(C)$ is compact, take any sequence $(y_n)$ in $f(C)$; we need to show that $(y_n)$ has a
subsequence converging to some element of $f(C)$. For each $n$, since $y_n \in f(C)$, there is some $x_n$ in
$C$ such that $f(x_n) = y_n$. Consider the sequence $(x_n)$ in $C$: since $C$ is compact, there is some
subsequence $(x_{n_k})$ converging to a limit $x \in C$. We claim that $f(x_{n_k}) \to f(x)$ as $k \to \infty$. To see
this, fix any $\epsilon > 0$. By continuity of $f$ there is some $\delta > 0$ such that
$d_X(y, x) < \delta \Rightarrow d_Y(f(y), f(x)) < \epsilon$, and by definition of convergence there is some $K$ such that
$k \ge K \Rightarrow d_X(x_{n_k}, x) < \delta$. Combining these two facts gives us that, for any $\epsilon > 0$, there is some $K$
such that $k \ge K \Rightarrow d_Y(f(x_{n_k}), f(x)) < \epsilon$, as required. But now notice that the sequence
$(f(x_{n_k})) = (y_{n_k})$ is a subsequence of the original sequence $(y_n)$ converging to a limit $f(x) \in f(C)$,
which is what we need.

So we have the following important corollary, which generalises earlier results on the maximisation and
minimisation of continuous functions.

Theorem 6.15 Suppose that $C$ is a compact subset of a metric space $(X, d)$ and that $f : X \to \mathbb{R}$ is
continuous (with respect to $d$ and the usual metric on $\mathbb{R}$). Then $f$ attains its bounds on $C$. That is,
there are $c, d \in C$ such that

\[ f(c) = \sup_{x \in C} f(x), \qquad f(d) = \inf_{x \in C} f(x). \]

6.7 Learning outcomes

At the end of this chapter and the relevant reading, you should be able to:

• state what is meant by a metric, and a metric space
• prove that a given function is indeed a metric
• describe the standard metrics: the usual metric on $\mathbb{R}^n$, the $d_p$ and $d_\infty$ metrics on $\mathbb{R}^n$, the
  discrete metric on any set, the $d_p$ and $d_\infty$ metrics on the set of functions bounded on a domain
• state, and be able to work with, the definition of continuity of a function between metric spaces,
  and its characterisation in terms of open balls and open sets
• state what is meant by a bounded subset of a metric space
• state what is meant by open balls and open and closed sets in metric spaces
• state what is meant by convergence in a metric space
• state what is meant by compactness
• state, and use, the fact that any compact subset of a metric space is closed and bounded (proof
  not necessary)
• prove that the continuous image of a compact set is compact

6.8 Comments on selected activities

Learning activity 6.1 We need to check that the three properties for a metric hold. Certainly,
$d(X, Y) \ge 0$, and $d(X, Y) = 0$ if $X = Y$. Moreover, if $d(X, Y) = 0$ then $\max_{i,j} |x_{ij} - y_{ij}| = 0$, so
$|x_{ij} - y_{ij}| = 0$ for all $i, j$; in other words, $x_{ij} = y_{ij}$, and $X = Y$. Next,

\[ d(X, Y) = \max_{i,j} |x_{ij} - y_{ij}| = \max_{i,j} |y_{ij} - x_{ij}| = d(Y, X). \]

Lastly, we need to verify the triangle inequality for $d$. Let $X, Y, Z \in A = A(m, n)$. We want to show
that $d(X, Y) \le d(X, Z) + d(Y, Z)$. Now, for each $i, j$, by the standard triangle inequality for real
numbers, $|x_{ij} - y_{ij}| \le |x_{ij} - z_{ij}| + |z_{ij} - y_{ij}|$. So, for all $i, j$,

\begin{align*}
|x_{ij} - y_{ij}| &\le |x_{ij} - z_{ij}| + |z_{ij} - y_{ij}| \\
                  &\le \max_{i,j} |x_{ij} - z_{ij}| + \max_{i,j} |z_{ij} - y_{ij}| \\
                  &= d(X, Z) + d(Z, Y),
\end{align*}

so, since this is true for all $i, j$,

\[ d(X, Y) = \max_{i,j} |x_{ij} - y_{ij}| \le d(X, Z) + d(Z, Y), \]

as required. It follows that $d$ is a metric on $A$ and $(A, d)$ is therefore a metric space.

Learning activity 6.2 Let's take the metric space to be $\mathbb{R}$ with the usual metric. For $n \in \mathbb{N}$, let
$U_n = (-n, n)$. Then $U_n$ is bounded, because for each $x, y \in U_n$, we have $d(x, y) < 2n$. However,
the union $\bigcup_{n=1}^\infty U_n$ is the whole of $\mathbb{R}$ and this is not a bounded set.

Learning activity 6.3 Consider f : R → R defined by f (x) = 0, for all x. With the usual metric on
R, this is continuous but, although (0, 1) is open, f ((0, 1)) = {0} is not open.

Learning activity 6.4 Suppose that X is finite. Then any sequence in X must take the same value,
say x ∈ X, infinitely often, and this constant subsequence is convergent to x because all its members
equal x. On the other hand, suppose X is infinite. Then there is a sequence with no repeated
members. Such a sequence has no convergent subsequence because a sequence converges in the
discrete metric if and only if all its terms are equal from some point onwards, and this sequence (and
hence all of its subsequences) has no two of its members equal. (You might want to look at the solution
to Exercise 6.9 for a fuller explanation of why a sequence in the discrete metric converges if and only
if it is ultimately constant.)

6.9 Exercises

Exercise 6.1 Is it possible that in a metric space M containing more than one point, the only open
subsets are M and ∅, the empty set?

Exercise 6.2 Let Z be the set of integers. Let p be a fixed prime number. Define

d:Z×Z→R

by $d(m, m) = 0$ and, for $m \ne n$, $d(m, n) = 1/r$ where $r$ is such that $m - n = p^{r-1} k$, where $r, k$ are integers and
$p$ does not divide $k$. Prove that $d$ is a metric on $\mathbb{Z}$.

Exercise 6.3 Prove that in any metric space (A, d), for any x, y, z ∈ A,

d(x, y) ≥ |d(x, z) − d(y, z)|.

Exercise 6.4 Suppose $f : \mathbb{R}^n \to \mathbb{R}^n$ is continuous. Prove that the function $g : \mathbb{R}^n \to \mathbb{R}$ defined by
$g(x) = \|f(x) - x\|$ is continuous. Suppose $C \subseteq \mathbb{R}^n$ is compact and that for every $x \in C$, $f(x) \ne x$.
Prove that there is $\epsilon > 0$ such that for all $x \in C$, $\|f(x) - x\| \ge \epsilon$.

Exercise 6.5 Suppose that, in a metric space (A, d), the sequence (an ) converges to a and (bn )
converges to b. Prove that (in R, with the usual metric) the sequence of real numbers (d(an , bn ))
converges to d(a, b). [You may find the result of Exercise 6.3 useful.]

Exercise 6.6 Suppose that $(A, d)$ is a metric space, $X$ is a set, and that $f : X \to A$ is an injective
(one-to-one) function. Define $d' : X \times X \to \mathbb{R}$ by $d'(x, y) = d(f(x), f(y))$. Prove that $d'$ is a metric
on $X$. Prove that $d'$ is not a metric if the function $f$ is not injective.

Exercise 6.7 Prove that a subset of a metric space is open if and only if it is a union of open balls.
[Note: the union may be the union of an infinite family of open sets.]

Exercise 6.8 Let M = (X, d) be a metric space, and let (xn ) be a sequence of points in X. Suppose
that xn → x and xn → y. Show that x = y.

Use this fact to show that, if C ⊆ X is compact in M , then C is closed.

Exercise 6.9 Suppose that (A, d) is a metric space. What does it mean to say that a sequence (xn )
of members of A converges to x ∈ A (with respect to the metric d)?

What is meant by the discrete metric d0 on A?


Prove that a sequence in A converges with respect to the discrete metric if and only if there is N ∈ N
such that xr = xs for all r, s ≥ N . (That is, if and only if all of its terms are eventually equal to each
other.)

Chapter 7
Uniform convergence

Contents

7.1 Introduction
7.2 Pointwise and uniform convergence
7.3 Uniform convergence as convergence in a metric space
7.4 Uniform convergence and continuity
7.5 Learning outcomes
7.6 Comments on selected activities
7.7 Exercises

Reading

This topic is not covered by many of the textbooks, but the following is worth reading.

Bartle, R.G. and D.R. Sherbert. Introduction to Real Analysis. Chapter 8.

7.1 Introduction

This short chapter concerns the matter of how we might think about convergence of a sequence of
functions (rather than of numbers or vectors).

7.2 Pointwise and uniform convergence

Suppose S ⊆ Rn and that for each n ∈ N, fn is a function from S to R. We may think of the
functions f1 , f2 , f3 , . . ., as forming a sequence of functions (fn ). Of course, this is very different from
a sequence of real numbers, but it is still possible to formulate some idea of ‘limit’ in this context.
There are two main ways in which we might do this. One is through the definition of pointwise
convergence and the other through uniform convergence. The following definitions describe these.

Definition 7.1 (Pointwise convergence) The sequence $(f_n)$ converges pointwise to the function $f$
on $S$ if and only if for each $x \in S$, $f_n(x) \to f(x)$ as $n \to \infty$; that is, given $x \in S$, and $\epsilon > 0$, there
exists $N = N(x, \epsilon)$ such that

\[ n > N(x, \epsilon) \implies |f_n(x) - f(x)| < \epsilon. \]

Definition 7.2 (Uniform convergence) The sequence $(f_n)$ converges uniformly to the function $f$
on $S$ if and only if given $\epsilon > 0$, there exists $N = N(\epsilon)$ such that

\[ n > N(\epsilon) \implies |f_n(x) - f(x)| < \epsilon, \text{ for all } x \in S. \]

The difference between these definitions is that, with uniform convergence, the $N$ depends only on $\epsilon$,
so that the same $N$ works for every $x \in S$, whereas in pointwise convergence, $x$ is given, and $N$ can
depend on $x$ as well as on $\epsilon$.


Uniform convergence is a stronger property than pointwise convergence.

Theorem 7.3 Suppose that (fn ) converges uniformly on S to f . Then (fn ) converges pointwise on
S to f .

Learning activity 7.1

Convince yourself of this.

The converse is false, as can be seen from the following example.

Example Suppose that we take $S = [0, 1]$ and $f_n(x) = x^n$. Then it is easy to see that $(f_n)$ converges
pointwise on $[0, 1]$ to the function

\[ f(x) = \begin{cases} 0 & \text{if } 0 \le x < 1 \\ 1 & \text{if } x = 1. \end{cases} \]

This is simply because if $0 \le x < 1$ then $x^n \to 0$ and, of course, if $x = 1$ then $x^n = 1^n = 1 \to 1$. But
the convergence is not uniform. To see this, note that if $0 < x < 1$, then $|f_n(x) - f(x)| = x^n$ and,
although this tends to 0, it does not do so at a rate which can be bounded independently of $x$. This is
because to have $x^n < \epsilon$ we require $n \log x < \log \epsilon$, so (remembering that $\log x$ is negative), we need
$n > \log \epsilon / \log x$. This bound depends on $x$ as well as $\epsilon$, and for fixed $\epsilon < 1$ it is unbounded as
$x \to 1$, so no single $N$ works for every $x$.
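
The dependence of the required $n$ on $x$ can be tabulated directly. The sketch below is an illustration; the helper name `smallest_n` and the choice $\epsilon = 0.01$ are assumptions made here. It computes the least $n$ with $x^n < \epsilon$ for several values of $x$ approaching 1.

```python
import math


def smallest_n(x, eps):
    """Smallest positive integer n with x**n < eps, for 0 < x < 1 and 0 < eps < 1."""
    n = max(1, math.ceil(math.log(eps) / math.log(x)))
    # Adjust for rounding in the logarithms and for the boundary case x**n == eps.
    while x ** n >= eps:
        n += 1
    while n > 1 and x ** (n - 1) < eps:
        n -= 1
    return n


eps = 0.01
for x in (0.5, 0.9, 0.99, 0.999):
    print(x, smallest_n(x, eps))
```

The required $n$ is 7 for $x = 0.5$ and grows rapidly as $x$ approaches 1, which is the numerical face of the statement that no single $N$ serves every $x \in [0, 1)$.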

Example Suppose again that $f_n(x) = x^n$ but that $S = [0, 1/2]$. Then $(f_n)$ converges uniformly to
the identically-0 function $f$ (given by $f(x) = 0$ for all $x$) on $[0, 1/2]$, because, for all $x \in [0, 1/2]$,

\[ |f_n(x) - f(x)| = x^n \le \left(\frac{1}{2}\right)^n. \]

Hence, given $\epsilon > 0$, we can choose $N$ so that $(1/2)^N < \epsilon$, so that, for any $n > N$ and any
$x \in [0, 1/2]$, $|f_n(x) - f(x)| < \epsilon$.

7.3 Uniform convergence as convergence in a metric space

For S a subset of Rn , let F = FS be the set of bounded functions from S to R, i.e., the set of
functions f : S → R for which there is some constant K with |f (x)| ≤ K for all x ∈ S.

For two functions f, g ∈ FS , define d(f, g) = supx∈S |f (x) − g(x)|. Note that, since f and g are
bounded, the set {|f (x) − g(x)| : x ∈ S} is bounded above – so the supremum d(f, g) does exist.
Then the pair MS = (FS , d) is a metric space: d is sometimes called the sup metric and is often
denoted by d∞ .

To say that a sequence $(f_n)$ of functions in $F_S$ converges to a limit $f$ in the metric space $M_S$ means
that, for all $\epsilon > 0$, there exists some $N = N(\epsilon)$ such that $n > N \implies d(f_n, f) < \epsilon$. Of course the last
inequality can be rewritten as $\sup_{x \in S} |f_n(x) - f(x)| < \epsilon$. This isn't exactly the same as the definition
we gave for uniform convergence, but nevertheless the following result is now fairly obvious.

Theorem 7.4 Let S be any subset of Rn , let (fn ) be a sequence of bounded functions from S to R,
and let f be a bounded function from S to R. Then (fn ) converges uniformly to f on S if and only if
fn → f in MS .

Proof. Suppose $f_n \to f$ in $M_S$. Take any $\epsilon > 0$. Then there is some $N = N(\epsilon)$ such that, for all
$n > N(\epsilon)$, $d(f_n, f) = \sup_{x \in S} |f_n(x) - f(x)| < \epsilon$. Thus, for $n > N(\epsilon)$, and any $x \in S$,
$|f_n(x) - f(x)| < \epsilon$. Thus $(f_n)$ converges uniformly to $f$.

Suppose $(f_n)$ converges uniformly to $f$ on $S$. Then for any $\epsilon > 0$ there is some $N = N(\epsilon)$ such that
$n > N(\epsilon),\ x \in S \implies |f_n(x) - f(x)| < \epsilon/2$. Thus, for $n > N(\epsilon)$, $\sup_{x \in S} |f_n(x) - f(x)| \le \epsilon/2 < \epsilon$.
Hence $f_n \to f$ in $M_S$.

So, this gives us a method to test whether (fn ) converges uniformly to f . We need only to evaluate
supx∈S |fn (x) − f (x)| = d(fn , f ) for each n, and see whether this tends to 0 as n → ∞.

Example Consider again the sequence $(f_n)$ of functions on $[0, 1]$ given by $f_n(x) = x^n$, and let $f$ be
the pointwise limit of $(f_n)$. (So $f(x) = 0$ for all $x < 1$, and $f(1) = 1$.) Then
$\sup_{x \in [0,1]} |f_n(x) - f(x)| = \sup_{x \in [0,1)} x^n = 1$ for each $n$, and therefore $(f_n)$ does not converge
uniformly to $f$ (as we saw before).
Example Suppose $f_n : [0, 1] \to \mathbb{R}$ is given by $f_n(x) = \dfrac{nx}{n + x}$. Then

\[ f_n(x) = \frac{x}{1 + x/n} \to x, \]

as $n \to \infty$, so the sequence $(f_n)$ converges pointwise on $[0, 1]$ to $f(x) = x$. To check whether the
convergence is uniform, we consider $\sup_{x \in [0,1]} |f_n(x) - x|$. Now,

\[ \sup_{x \in [0,1]} |f_n(x) - x| = \sup_{x \in [0,1]} \left| \frac{nx}{n + x} - x \right| = \sup_{x \in [0,1]} \left| \frac{-x^2}{n + x} \right| = \sup_{x \in [0,1]} \frac{x^2}{n + x}. \]

The derivative of $x^2/(n + x)$ is non-negative on $[0, 1]$, so it is increasing, and is hence maximised at
$x = 1$, so the supremum is $1/(n + 1)$. This does tend to 0 as $n \to \infty$, so the convergence is uniform.
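
The test "compute $d(f_n, f)$ and see whether it tends to 0" can be tried numerically. The sketch below is an illustration under stated assumptions: the helper name `sup_distance` and the grid size are chosen here, and a maximum over a finite grid is only a lower bound for the true supremum, so it can mislead when the supremum is approached but not attained (as for $x^n$ on $[0, 1)$).

```python
def sup_distance(f, g, lo, hi, samples=2001):
    """Approximate sup |f(x) - g(x)| over [lo, hi] by a maximum over a uniform grid."""
    step = (hi - lo) / (samples - 1)
    return max(abs(f(lo + i * step) - g(lo + i * step)) for i in range(samples))


def identity(x):
    return x


for n in (1, 10, 100, 1000):
    # Bind n as a default argument so each f_n keeps its own value of n.
    f_n = lambda x, n=n: n * x / (n + x)
    print(n, sup_distance(f_n, identity, 0.0, 1.0), 1 / (n + 1))
```

For each $n$ the two printed numbers agree to within rounding, since the supremum $1/(n + 1)$ is attained at the grid point $x = 1$, and both columns tend to 0 as $n$ grows.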

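Example Define $f_0 : \mathbb{R} \to \mathbb{R}$ by $f_0(x) = 0$ for all $x$. In the sup-metric space of bounded functions
on $\mathbb{R}$, the open ball $B_1(f_0)$ consists of all functions $f$ for which there is some $t < 1$ with
$|f(x)| \le t$ for all $x$.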

Example The set $G$ of functions $g : \mathbb{R} \to \mathbb{R}$ such that $|g(x)| < 1$ for all $x$ is (perhaps curiously) not
an open set in the sup-metric space. For instance, consider the function $g(x) = \frac{2}{\pi} \tan^{-1}(x)$: there is
no positive $\epsilon$ such that $B_\epsilon(g) \subseteq G$.

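Example Consider the set of all functions from $S = [0, 1]$ to $[0, 1]$. This is a subset of $F_S$. It is a
bounded set in the metric space $M_S$, and it is closed. However, it is not compact: consider again the
sequence $(f_n)$ defined by $f_n(x) = x^n$. Any subsequence that converged in $M_S$ would converge uniformly,
and hence pointwise, so its limit would have to be the pointwise limit $f$; but $d(f_n, f) = 1$ for every $n$,
so no subsequence converges.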

7.4 Uniform convergence and continuity

The following result is sometimes useful, as we shall see, because it will lead to a method by which we
can determine that some sequences of functions do not converge uniformly.

Theorem 7.5 Suppose $S \subseteq \mathbb{R}^n$ and that $f_n : S \to \mathbb{R}$ (for $n \in \mathbb{N}$). If each $f_n$ is continuous at $a \in S$,
and if $(f_n)$ converges uniformly on $S$ to the function $f$, then $f$ is continuous at $a$.
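
One standard route to this result is the "$\varepsilon/3$ argument". The following is only a sketch under the hypotheses of the theorem; the particular choices of $N$ and $\delta$ are ours.

```latex
Fix $\varepsilon > 0$. By uniform convergence, choose $N$ with
\[ \sup_{x \in S} |f_N(x) - f(x)| < \varepsilon/3 . \]
By continuity of $f_N$ at $a$, choose $\delta > 0$ with
\[ x \in S,\ \|x - a\| < \delta \implies |f_N(x) - f_N(a)| < \varepsilon/3 . \]
Then, for every such $x$,
\[ |f(x) - f(a)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(a)| + |f_N(a) - f(a)| < \varepsilon . \]
```

The first and third terms are controlled by uniform convergence (the same $N$ works at both $x$ and $a$), and the middle term by continuity of the single function $f_N$.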

As a consequence, if each $f_n$ is continuous on $S$, and $(f_n)$ converges uniformly to $f$, then $f$ is
continuous. We have already seen that this fails if we only assume that $(f_n)$ converges pointwise to $f$.

In terms of the sup-metric, this means that the set CS of continuous functions from S to R is a
closed subset in the metric space MS .

The theorem above results in the following, which sometimes gives an easy way to prove that
convergence is not uniform.

Corollary 7.6 Suppose (fn ) converges pointwise on S to f , and that each fn is continuous at a ∈ S.
If f is not continuous at a, then (fn ) does not converge uniformly on S to f .


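Example (As above) Suppose that we take $S = [0, 1]$ and $f_n(x) = x^n$. Then, as we have seen, $(f_n)$
converges pointwise on $[0, 1]$ to the function

$$f(x) = \begin{cases} 0 & \text{if } 0 \le x < 1 \\ 1 & \text{if } x = 1. \end{cases}$$

Each $f_n$ is continuous on $[0, 1]$, but $f$ is not continuous at $x = 1$, so by Corollary 7.6 the convergence is not uniform.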

7.5 Learning outcomes

At the end of this chapter and the relevant reading, you should be able to:

• state what is meant by the pointwise and uniform convergence (on a set) of a sequence of
  continuous functions, and demonstrate that you understand the difference between these
• state, and use, the fact that uniform convergence implies pointwise convergence
• demonstrate that you understand that uniform convergence is equivalent to convergence in a
  metric space of functions with respect to the sup metric
• state the fact that $(f_n)$ converges uniformly on $S$ to $f$ if and only if
  $\|f_n - f\|_S = \sup\{|f_n(x) - f(x)| : x \in S\}$ tends to 0 as $n \to \infty$; and be able to use this to prove
  uniform convergence, or that a sequence does not converge uniformly
• state, and use, the fact that if a sequence of continuous functions converges uniformly, then the
  limit function is also continuous (proof not necessary)

7.6 Comments on selected activities

Learning activity 7.1 Informally, the reason that uniform convergence implies pointwise convergence
is that the former condition is stronger: it requires the same $N(\varepsilon)$ to work for every $x$. Formally, all
one has to observe to prove this is that, if the convergence is uniform and if $N(\varepsilon)$ is as in
Definition 7.2, then, for each $x$, we can take $N(\varepsilon, x)$ to equal $N(\varepsilon)$ in Definition 7.1. Definition 7.1 is
then satisfied.

7.7 Exercises

Exercise 7.1 Let $f_n : [0, 1] \to \mathbb{R}$ be defined by $f_n(x) = n x^n (1 - x)$. Prove that $(f_n)$ converges
pointwise on $[0, 1]$ to the identically-0 function (the function $f$ such that $f(x) = 0$ for all $x$). Prove,
however, that $(f_n)$ does not converge uniformly to $f$.

Exercise 7.2 Let $f_n : \mathbb{R} \to \mathbb{R}$ be defined by $f_n(x) = x e^{-nx^2}$. Prove that $(f_n)$ converges uniformly on
$[0, 1]$ to the identically-0 function.

Exercise 7.3 Does the sequence $(f_n)$ of functions converge pointwise, when $f_n$ takes the following
form?
(i) $f_n(x) = \tan^{-1}(nx)$ $(= \arctan(nx))$,
(ii) $f_n(x) = x e^{-nx}$.
In each case, determine whether the sequence converges uniformly on $\mathbb{R}$.

Exercise 7.4 Consider the sequence of functions $(g_n)$, each defined on $(0, 1)$, given by

$$g_n(x) = \begin{cases} x & x \ge 1/n \\ 0 & x < 1/n. \end{cases}$$

Show that (gn ) converges uniformly on (0, 1) to the limit g, where g(x) = x for all x.


Let $G$ denote the set of all functions from $(0, 1)$ to $(0, 1)$, considered as a subset of the metric space
$M_{(0,1)}$ of bounded functions on $(0, 1)$ with the sup-metric. Deduce from the previous part of the
question that $G$ is not an open set in $M_{(0,1)}$.

Exercise 7.5 Suppose that for each $n \in \mathbb{N}$, $g_n : [0, 1] \to \mathbb{R}$ is the function defined by

$$g_n(x) = \begin{cases} nx & \text{if } 0 \le x \le 1/n \\ \dfrac{n}{n-1}(1 - x) & \text{if } 1/n \le x \le 1. \end{cases}$$

Find a function g such that (gn ) converges pointwise to g on [0, 1]. Prove that the convergence is
uniform on the interval [c, 1] for any c > 0, but that it is not uniform on [0, 1].

Exercise 7.6 Let C[0, 1] be the set of continuous functions from [0, 1] to R. (Recall that all such
functions are bounded and attain their bounds.) Consider the metric space M = (C[0, 1], d), where d
denotes the sup-metric.

Show that the set of functions in C[0, 1] whose image is contained in (0, 1) is an open set in C[0, 1].
