
E1: CALCULUS - lecture notes

Ştefan Balint
Eva Kaslik, Simona Epure, Simina Mariş, Aurelia Tomoioagă

Contents

I Introduction

1 The notions “set”, “element of a set”, “membership of an element in a set” are basic notions of mathematics

2 Symbols used in set theory

3 Operations with sets

4 Relations

5 Functions

6 Composite function. Inverse of a function.

7 Logic symbols

8 Converse theorem and contrary theorem

9 Necessity and sufficiency

II Single variable calculus

10 Topology in ℝ¹

11 Sequences

12 Convergence

13 Rules (for convergence of sequences)

14 Limit points of a sequence

15 Series of real numbers

16 Rules (for convergence of series)

17 Absolute convergent series

18 Limit of a function at a point

19 Rules for the limit of a function

20 One sided limits

21 Infinite limits

22 Limit points of a function at a point

23 Continuity

24 Rules for continuity

25 Properties of continuous functions

26 Sequence of functions. Set of convergence.

27 Continuity and uniform convergence

28 Equal continuous and equal bounded sequence of functions

29 Series of functions. Convergence and uniform convergence.

30 Convergence criteria for series of functions

31 Power Series

32 Arithmetics of power series

33 Differentiable functions

34 Rules of differentiability

35 Local extremum

36 Theorems concerning basic properties of differentiable functions

37 Higher-order derivatives and differentials

38 Taylor polynomials

39 Classification theorem for local extrema

40 The Riemann-Darboux integral

41 Properties of the Riemann-Darboux integral

42 Classes of Riemann-Darboux integrable functions

43 Mean value theorem

44 The fundamental theorem of calculus

45 Techniques to find primitives

46 Improper integrals

47 Fourier series

48 Different forms of Fourier series

III Functions of several variables

49 Topology in ℝⁿ

50 Limit of a function at a point

51 Continuity

52 Important properties of continuous functions

53 Differentiation

54 Basic properties of differentiable functions

55 Higher order partial differentiability

56 Taylor’s theorems

57 Classification theorem for local extrema

58 Conditional extrema

59 Jordan measurable subsets of ℝ²

60 The Riemann-Darboux integral of functions of two variables

61 Integrable functions

62 Properties of the Riemann-Darboux integral

63 Riemann-Darboux integral calculus when A is rectangular

64 Riemann-Darboux integral calculus when A is not a rectangle

65 Jordan measurable subsets of ℝⁿ

66 The Riemann-Darboux integral of a n variable function

67 Integrable functions of n variables

68 Properties of the Riemann-Darboux integral of n-variable functions

69 Riemann-Darboux integral calculus for n-variable functions when A is a hypercube

70 Elementary curves and elementary closed curves

71 Line integral of first type

72 Line integrals of second type

73 Transformation of double integrals into line integrals

74 Elementary Surfaces

75 Surface integrals of first type

76 Surface integrals of second type

77 Properties of surface integrals

78 Differentiation of an integral containing a parameter

In which way can a Calculus course be useful to a first
year computer science student?

This is a frequently asked question of first year students at the beginning of their Calculus
course.
It is difficult to give a full and convincing answer to this question at the very beginning
of the course, as we have to talk about the utility of some concepts and mathematical
instruments, that are unknown to those who ask, in solving practical problems which are
out of their reach at the moment.
However, the question cannot and must not be avoided. It is necessary to formulate a
partial answer showing the utility of this course in solving real problems that future
computer scientists could find interesting. We have to emphasize here that for mathematics
students, Calculus is a basic and very important part of their curriculum, and its utility
is usually not questioned outside the field of mathematics.
So let’s get back to giving a partial answer to computer science students. We would like
to point out that this course presents basic concepts and instruments used
for analyzing real or vector functions of one or more variables. To illustrate the utility of
some of these concepts and instruments, we will consider the following practical problem:
constructing a train schedule.
Constructing a train schedule for a railway network is a real and complex problem. It
is based on the knowledge of speed restrictions in the network, train stations, transport
material, options concerning the stops of some trains in certain stations, and a previous
computation that guarantees that in ideal conditions, the trains will not collide. Some
concepts of calculus prove to be useful in this computation. To guarantee that the trains
will not collide, it is necessary to know, at every moment, the position of every train and
to ensure that these positions do not coincide at any moment of time. Let’s consider
for example the Timişoara-Bucharest railway, which can be represented as a curve AB
like in the following figure, and a train that circulates on this railway in the time range
[t₀, t₀ + T], which will be represented by a point P.

[Figure: the railway represented as a curve from A to B.]

If in the considered time range there are more
trains circulating on this railway, we will have to describe the motion of each of them.
In order to describe the motion of a train represented by the point P, we can associate
to each moment of time t ∈ [t₀, t₀ + T] the length of the arc AP, where P is
the position on the curve AB where the train is at the moment t. Therefore, a function
f is obtained, which is defined for t ∈ [t₀, t₀ + T] and takes its values in the set [0, l]:
f : [t₀, t₀ + T] → [0, l]; l is the distance from A to B on the considered railway.

We must emphasize that the object that appeared in a natural way in this problem of
describing the position of a train on a railway, is a real function of one real variable, a
mathematical object that belongs to the field of interests of this course.
Our train has to arrive at given times at its stations and has some speed restrictions
along the way; hence, the function f could be quite complicated. However, there are some
characteristics of real motion that have to be translated mathematically as properties of
the function f. For example, the real motion is continuous, meaning that the train moves
from the position P₁ to the position P₂ gradually, passing through all the intermediate
positions and not by jumping. This means that the function f, even if complicated, must
have the following property: for any t₂ ∈ [t₀, t₀ + T], if t₁ tends to t₂ then f(t₁) tends to
f(t₂).
A function with the above property is said to be continuous on the interval [t₀, t₀ + T].
The concept of continuity is studied in this course, revealing several properties. Hence,
continuous functions that are studied in this course are useful, for example, for describing
the motion of a train on a railway.
If our train leaves station A at the moment t₀ and moves away from A continuously,
without stopping, until it reaches the first station S₁ at the moment t₁, then the function f which
describes the motion of the train has the following property: for any t′, t″ ∈ [t₀, t₁] with t′ < t″,
it follows that f(t′) < f(t″). In this course, such a function is said to be increasing. The
course presents several properties of monotonic functions. In the case of the considered
motion, this concept is useful for expressing moving away or approaching.
Due to speed restrictions and stops at the stations, the velocity of the train depends on
its position. More exactly, it depends on the moment of time t, as in the time range
[t₀, t₀ + T] the train may pass through the same place a couple of times. In order to find
the velocity of the train at the moment t₁, we consider the mean velocity
$$\frac{f(t) - f(t_1)}{t - t_1}$$
(distance over time) on a short time range between t and t₁; the limit of this mean velocity when t
tends to t₁ represents the velocity of the train at the moment t₁. In this course, this limit
is called the derivative of the function f at t₁ and is denoted by f′(t₁). If the train stays
in a station in the time range [t₁, t₂], then its velocity is zero: f′(t) = 0 for t ∈ [t₁, t₂].
If f′(t) > 0, then the train moves away from A, and if f′(t) < 0, then the train approaches A.
If the train moves with a constant velocity in the time range [t₁, t₂], then f′(t) = const
in the interval [t₁, t₂]. These observations show the utility of the concept of derivative for describing
mechanical motion.
Finally, we point out that starting from a velocity profile v(t) (which results from speed
restrictions and from previously assigned arrival and departure times) the function f(t)
which describes the motion can be recovered using the integral formula
$$f(t) = f(t_0) + \int_{t_0}^{t} v(\tau)\, d\tau$$
presented in this course.
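
As a rough numerical illustration of the two ideas above (velocity as the limit of a mean velocity, and position recovered from a velocity profile by integration), here is a minimal Python sketch. The velocity profile `v`, its units (km/h, hours) and the helper names `position` and `mean_velocity` are hypothetical choices made only for this example; the integral is approximated with the trapezoidal rule.

```python
def v(t: float) -> float:
    """Hypothetical velocity profile in km/h (t in hours), made up for illustration."""
    if t < 0.1:
        return 800.0 * t            # accelerate from 0 to 80 km/h
    if t < 0.9:
        return 80.0                 # cruise at constant velocity
    if t < 1.0:
        return 800.0 * (1.0 - t)    # brake to a stop
    return 0.0                      # standing in the station


def position(t: float, t0: float = 0.0, f0: float = 0.0, steps: int = 10_000) -> float:
    """Approximate f(t) = f(t0) + integral of v over [t0, t] by the trapezoidal rule."""
    h = (t - t0) / steps
    total = 0.5 * (v(t0) + v(t))
    for k in range(1, steps):
        total += v(t0 + k * h)
    return f0 + h * total


def mean_velocity(t: float, t1: float) -> float:
    """Difference quotient (f(t) - f(t1)) / (t - t1)."""
    return (position(t) - position(t1)) / (t - t1)


if __name__ == "__main__":
    print(position(1.0))                 # distance covered after one hour
    print(mean_velocity(0.5001, 0.5))    # close to the cruise velocity v(0.5)
```

With this made-up profile the train covers about 72 km, and the mean velocity over a short time range around t = 0.5 is close to v(0.5) = 80, as the definition of the derivative suggests.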


We hope that this extremely simple and partial reasoning manages to convince computer
science students that in this course they will study mathematical objects and results that
will be useful in their future careers.

The written course is presented in a standard form, similar to the course presented to
mathematics students. However, the spoken course is full of comments and examples that
are meant to illustrate the utility and applicability of the concepts and results in solving
real problems.
The authors

Part I

Introduction
1 The notions “set”, “element of a set”, “membership of an element in a set” are basic notions of mathematics

A strict mathematics course requires a precise definition of all the notions used to present
the material.
A definition should precisely describe a notion (A) using another notion (B), which is
assumed to be known, or in any event simpler than (A).
Notion (B) must also be strictly defined, and its definition will contain another notion
(C) simpler than (B), and so on.
For the construction of a mathematical theory with exact definitions of all the notions, it
is necessary to have a collection of very simple notions to which the rest can be reduced
and which are themselves not defined.
We will call such notions basic notions.
From the point of view of common sense, the basic notions of mathematics are so
self-evident that they do not require definitions. The meaning of basic notions can be described
by examples.
The notions: a set, an element of a set, membership of an element in a set, are basic
notions of mathematics.
We cannot obtain an exact definition of the above notions, but it is possible to clarify
their meaning, by examples.
Thus, let us consider the notion of a set. We may speak of the set of days in a year, points
in a plane, students in a lecture-room, and so on. In these cases, each day of a year, each
point in a plane, each student in a lecture-room is an element of the set.
When a concrete set is considered, an essential thing is to be able to decide, for any
element, whether or not it belongs to the set. Thus, for the set of days in a year, the 3rd of July,
20th of May, 29th of December are all elements of the set, while “Wednesday”, “Friday”,
“holiday”, “days in a year” are not. In the second example, only the points in the given
plane are elements of the set. If a point does not lie in the given plane, or an object
is not a point, then it is not an element of the set.
In order to define a concrete set it is necessary to describe clearly the elements belonging
to it. Any faulty description may lead to a logical contradiction.

2 Symbols used in set theory

If x is a member (an element) of a set A, then we write x ∈ A; otherwise we write x ∉ A
(∈ is called the membership symbol).
Two sets A and B that have precisely the same elements are said to be equal. Thus,
with respect to sets, the equality A = B means that the same set is denoted by different
letters, that is, A and B are two names for the same set.
The notation A = {x, y, z, ...} means that the set A consists of elements x, y, z, ... . In this
notation, duplicated elements are regarded as one element. For instance: {1, 2, 3, 4, 5} =
{1, 1, 1, 2, 2, 3, 4, 5}.
If a set A consists of all the elements x of a set B that possess a given property, then we
write A = {x ∈ B | . . . } where the property is written after the vertical line. For instance,
let a and b be two real numbers satisfying the condition a < b; then the set of points of the
closed interval [a, b], that is, the set of all real numbers x such that a ≤ x ≤ b, can be
written as:
[a, b] = {x ∈ ℝ¹ | a ≤ x ≤ b}
where ℝ¹ means the set of all real numbers.
If every element in a set A is also an element of a set B, then we say that A is a subset
of B and write A ⊂ B or B ⊃ A. The first relation reads “set A is contained in set B”,
and the second relation reads “B contains A”.
It is easy to prove that if A ⊂ B and B ⊂ A, then A = B.
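
For readers who like to experiment, the notation of this section can be mirrored with Python's built-in `set` type. This is only an illustrative sketch on small finite sets; the sets `A`, `B` and the helper `equal_by_inclusion` are hypothetical names chosen for the example.

```python
# Duplicated elements are regarded as one element.
A = {1, 2, 3, 4, 5}
same = A == {1, 1, 1, 2, 2, 3, 4, 5}

# Set-builder notation {x in B | 2 <= x <= 5} on a finite set B of integers.
B = set(range(-10, 11))
interval = {x for x in B if 2 <= x <= 5}

is_member = 3 in interval        # membership: 3 belongs to the set
is_subset = interval <= B        # inclusion: every element of interval is in B


def equal_by_inclusion(X: set, Y: set) -> bool:
    """X = Y exactly when X is a subset of Y and Y is a subset of X."""
    return X <= Y and Y <= X


if __name__ == "__main__":
    print(same, interval, is_member, is_subset)
```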

3 Operations with sets

Definition 3.1. For any two sets A and B the set of elements belonging to A or B or to
both sets is called the union of A and B, and is written A ∪ B.
Definition 3.2. For any two sets A and B the set of elements belonging to A and B at
the same time is called the intersection of A and B and is written A ∩ B.
Definition 3.3. For any two sets A and B the set of elements of B that are not elements
of A is called the difference of B and A, written B \ A. If the set A is a subset of B, then B \ A is
called the complement of A in B and is denoted by C_B A.
Comment 3.1.

- The notions of union and intersection of sets can be extended to three, four or any
number of sets. Namely, the union of n sets A, B, C, . . . is the set of those elements
which belong to at least one of these sets. The intersection of n sets A, B, C, . . .
is the set of those elements which belong simultaneously to each set.
- It is possible that two sets A and B have no elements in common. In such a case
A ∩ B contains no elements. Nevertheless, it is still convenient to view A ∩ B as a
set (containing no elements). It is called the empty (or null) set, and is denoted by
the symbol ∅.

For any set A we have A ⊃ A and A ⊃ ∅; thus A and ∅ are subsets of A; they are called
improper subsets, all other subsets being proper subsets.
Sometimes, the union of sets is called the sum of sets, and the intersection of sets the
product of sets.
Usually, the operations of union and intersection of sets are defined on the set of all subsets
of a given set S. These operations, for any A, B, C ⊂ S, satisfy the following properties:

• (A ∪ B) ∪ C = A ∪ (B ∪ C) associativity of union;

• (A ∩ B) ∩ C = A ∩ (B ∩ C) associativity of intersection;

• A ∪ B = B ∪ A commutativity of union;

• A ∩ B = B ∩ A commutativity of intersection;

• (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C) distributivity of intersection over union;

• (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C) distributivity of union over intersection;

• for A ⊂ S there is a unique B ⊂ S such that A ∪ B = S, A ∩ B = ∅ : this set is S \ A;

• the set S possesses the property A ∩ S = A for any A ⊂ S, the empty set ∅ possesses
the property: ∅ ∩ A = ∅ for any A.

There are identities, known as rules of De Morgan, which relate the operations of
complementation, taking unions, and taking intersections. These rules are expressed
by the formulas:
C_S (A ∪ B) = C_S A ∩ C_S B;
C_S (A ∩ B) = C_S A ∪ C_S B.

Definition 3.4. For any two sets A and B, the set of ordered pairs (a, b) with a ∈ A,
b ∈ B is called the Cartesian product of A and B and is denoted by A × B.

The cartesian product has the following properties:

A × (B ∪ C) = (A × B) ∪ (A × C);

A × (B ∩ C) = (A × B) ∩ (A × C)

for any sets A, B, C.
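
The identities of this section can be checked on small examples. The following Python sketch does so for one hypothetical choice of the sets `S`, `A`, `B`, `C`; it is a sanity check on particular sets, not a proof, and the helper names `complement` and `cartesian` are our own.

```python
from itertools import product

S = set(range(10))        # hypothetical "given set" S
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {0, 4, 6, 8}


def complement(X: set, universe: set = S) -> set:
    """C_S X: the elements of the universe that are not in X."""
    return universe - X


def cartesian(X: set, Y: set) -> set:
    """X x Y: the set of ordered pairs (x, y) with x in X and y in Y."""
    return set(product(X, Y))


union = A | B             # A ∪ B
intersection = A & B      # A ∩ B
difference = B - A        # B \ A

distributive = (A | B) & C == (A & C) | (B & C)
de_morgan_1 = complement(A | B) == complement(A) & complement(B)
de_morgan_2 = complement(A & B) == complement(A) | complement(B)
cart_union = cartesian(A, B | C) == cartesian(A, B) | cartesian(A, C)
cart_inter = cartesian(A, B & C) == cartesian(A, B) & cartesian(A, C)

if __name__ == "__main__":
    print(union, intersection, difference)
    print(distributive, de_morgan_1, de_morgan_2, cart_union, cart_inter)
```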

4 Relations

Definition 4.1. A binary relation in the set A is a subset R of the cartesian product
A × A : R ⊂ A × A.

Traditionally, the membership (x, y) ∈ R is denoted by xRy.
The set R = {(x, y) ∈ ℝ × ℝ : x² + y² ≤ 1} is a binary relation in the set of all real
numbers ℝ.

Definition 4.2. A binary relation R in the set A is called reflexive if for any x ∈ A we
have xRx.

The set R = {(x, y) ∈ ℝ × ℝ : x − y ≤ 0} is a reflexive binary relation in the set of all
real numbers ℝ.

Definition 4.3. A binary relation R in the set A is called symmetric if

xRy ⇒ yRx for any x, y ∈ A

The set R = {(x, y) ∈ ℝ × ℝ : x² + y² ≤ 1} is a symmetric binary relation in the set of
all real numbers ℝ.

Definition 4.4. A binary relation R in the set A is called antisymmetric if

xRy and yRx ⇒ x = y for any x, y ∈ A

The set R = {(x, y) ∈ ℝ × ℝ : x − y ≤ 0} is an antisymmetric binary relation in the set
of all real numbers ℝ.

Definition 4.5. A binary relation R in the set A is called transitive if

xRy and yRz ⇒ xRz for any x, y, z ∈ A.

The set R = {(x, y) ∈ ℝ × ℝ : x − y ≤ 0} is a transitive binary relation in the set of all
real numbers ℝ.

Definition 4.6. A binary relation R in the set A is total if for any x, y ∈ A, at least one
of the following statements is true: xRy, yRx.

The set R = {(x, y) ∈ ℝ × ℝ : x − y ≤ 0} is a total binary relation in the set of all real
numbers ℝ.

Definition 4.7. A binary relation R in the set A is partial if there exist x, y ∈ A such
that none of the following statements is true: xRy, yRx.

The set R = {(x, y) ∈ ℝ × ℝ : x² + y² ≤ 1} is a partial binary relation in the set of all
real numbers ℝ.

Definition 4.8. A binary relation R in the set A is a relation of partial order if it satisfies
the following properties: R is a partial relation; R is reflexive; R is antisymmetric; R is
transitive.

The inclusion of sets is a relation of partial order in the set of all parts of a given set S.

Definition 4.9. A binary relation R in the set A is a relation of total order if it satisfies
the following properties: R is a total relation ; R is reflexive; R is antisymmetric; R is
transitive.

The set R = {(x, y) ∈ ℝ × ℝ : x − y ≤ 0} is a relation of total order in the set of all real
numbers ℝ.

Definition 4.10. A set A, together with a relation of partial order R in A, is called a
partially ordered system and is denoted by (A, R).

The set of all parts of a given set S, together with the relation of inclusion is a partially
ordered system.

Definition 4.11. A set A, together with a relation of total order R in A, is called a totally
ordered system and is also denoted by (A, R).

The set of real numbers ℝ, together with the binary relation R = {(x, y) ∈ ℝ × ℝ :
x − y ≤ 0}, is a totally ordered system.

Definition 4.12. Let (A, R) be a partially ordered system and A′ a subset of A: A′ ⊂ A.
An element a ∈ A is an upper bound for the set A′ if a′Ra for any a′ ∈ A′. An
upper bound a* for A′ is said to be a least upper bound for A′ if a*Ra for any
upper bound a of A′. If it exists, a least upper bound of A′ is denoted by sup A′.

Definition 4.13. Let (A, R) be a partially ordered system and A′ a subset of A: A′ ⊂ A.
An element a ∈ A is a lower bound for the set A′ if aRa′ for any a′ ∈ A′. A
lower bound a* for A′ is said to be a greatest lower bound for A′ if aRa* for
any lower bound a of A′. If it exists, a greatest lower bound of A′ is denoted by inf A′.

Definition 4.14. Let (A, R) be a partially ordered system. An element a ∈ A is maximal
if for any a′ ∈ A with the property aRa′, one has a′Ra.

The family P(X) of all subsets of a set X affords an illustration of these concepts. The
inclusion relation R = ⊂ between the sets contained in X makes the pair (P(X), ⊂) a
partially ordered system. An upper bound for a subfamily $\mathcal{B} \subset P(X)$ is any set containing
$\bigcup_{B \in \mathcal{B}} B$, and $\bigcup_{B \in \mathcal{B}} B$ is the only least upper bound of $\mathcal{B}$.
Similarly, $\bigcap_{B \in \mathcal{B}} B$ is the only greatest lower bound of $\mathcal{B}$. The only maximal element of P(X)
is X.

Definition 4.15. A relation R in a set A is called an equivalence if it possesses the following
properties: R is reflexive, symmetric and transitive.

For example, the equality in the set of parts P(X) of a given set X is an equivalence.
The set R = {(x, y) ∈ ℤ × ℤ : x − y divisible by 5} is a relation of equivalence in the set
of integers ℤ.
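
On a finite set, the properties introduced in Definitions 4.2 to 4.6 can be tested by brute force. The sketch below does this in Python for two of the example relations, restricted (as an assumption made only so that they can be enumerated) to the integers from −6 to 6; the function names are hypothetical.

```python
from itertools import product


def is_reflexive(R: set, A) -> bool:
    return all((x, x) in R for x in A)


def is_symmetric(R: set) -> bool:
    return all((y, x) in R for (x, y) in R)


def is_antisymmetric(R: set) -> bool:
    return all(x == y for (x, y) in R if (y, x) in R)


def is_transitive(R: set) -> bool:
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)


def is_total(R: set, A) -> bool:
    return all((x, y) in R or (y, x) in R for x, y in product(A, repeat=2))


A = range(-6, 7)  # finite sample of integers, chosen only for illustration
leq = {(x, y) for x, y in product(A, repeat=2) if x - y <= 0}
mod5 = {(x, y) for x, y in product(A, repeat=2) if (x - y) % 5 == 0}

if __name__ == "__main__":
    # x - y <= 0: reflexive, antisymmetric, transitive and total (a total order).
    print(is_reflexive(leq, A), is_antisymmetric(leq), is_transitive(leq), is_total(leq, A))
    # x - y divisible by 5: reflexive, symmetric and transitive (an equivalence).
    print(is_reflexive(mod5, A), is_symmetric(mod5), is_transitive(mod5))
```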

Definition 4.16. A relation R between the elements of a set A and the elements of a set
B is a subset of the cartesian product A × B; R ⊂ A × B.

Traditionally, (x, y) ∈ R is denoted xRy.

Definition 4.17. A function (mapping) f of the set A into the set B, written f : A → B,
is a relation R between the elements of the sets A and B (R ⊂ A × B) which possesses the
following properties:

a) for every x ∈ A there exists y ∈ B such that x R y;

b) if (x, y1 ), (x, y2 ) ∈ R, then y1 = y2 .

Traditionally, a function f defined on the set A into the set B is denoted by f : A → B.
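
For finite sets, conditions a) and b) of Definition 4.17 can be checked mechanically. The following Python sketch is one possible way to do it; the name `is_function` and the representation of a relation as a set of pairs are assumptions made for this example.

```python
def is_function(R: set, A: set, B: set) -> bool:
    """Check whether the relation R, a subset of A x B, is a function from A into B."""
    if not all(x in A and y in B for (x, y) in R):
        return False                      # R is not a subset of A x B
    images = {}
    for x, y in R:
        images.setdefault(x, set()).add(y)
    defined_everywhere = all(x in images for x in A)              # condition a)
    single_valued = all(len(ys) == 1 for ys in images.values())   # condition b)
    return defined_everywhere and single_valued


if __name__ == "__main__":
    A, B = {1, 2, 3}, {"a", "b"}
    print(is_function({(1, "a"), (2, "a"), (3, "b")}, A, B))            # a function
    print(is_function({(1, "a"), (1, "b"), (2, "a"), (3, "b")}, A, B))  # violates b)
    print(is_function({(1, "a"), (2, "b")}, A, B))                      # violates a)
```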

5 Functions

The notion of function plays an important role in mathematics. It is not a basic notion,
since we have already seen that it can be defined in terms of sets. However, for those
starting mathematical analysis, it is easiest to consider a mapping (function) as a basic
notion, clarifying it by examples and describing it in a manner that satisfies common
sense.
If for every x ∈ A an element y ∈ B is chosen according to some rules, then we say that
there is a function (mapping) f of the set A into the set B, written f : A → B.
Thus, a function is defined uniquely by the rule which makes every x ∈ A correspond to
y ∈ B.
What does the above description of function lack for it to be a strict definition?
Firstly, we must explain what a rule is; secondly, what a correspondence is.
Intuitively it is clear what a rule and correspondence are. In simple cases, these notions
do not involve misunderstandings and are sufficient for a meaningful mathematical theory
to be constructed on their basis.
Let us note once again that the rule defining the element y ∈ B is applicable to every
x ∈ A. The element x ∈ A is called the argument of the function f , the element y ∈ B is
called the value of the function f corresponding to the element x ∈ A, y = f (x), and the
function itself is a rule which ”processes” every x ∈ A into y = f (x).
The set A is called the domain of the function, and the set of all the elements y ∈ B for
which there are x ∈ A such that y = f (x) is called the range of the function f.
We shall consider functions which associate every real number x ∈ A ⊂ ℝ¹ with a number
y = f(x) ∈ ℝ¹. For this kind of function, a rule can be given by an explicit algebraic
expression; for instance:
$$y = x^2 + 2x; \quad y = \frac{1 - x}{\sqrt{x + 2}}; \quad y = \sqrt[5]{1 + 7\sqrt{x}}.$$

The right-hand sides of the equalities contain the rule that ”processes” x into y. The rule
in the first expression is: each x should be squared and added to twice x. The rules in the
second and third expression can be formulated in a similar way.
The rule can also be given by the symbols exp, log_a, sin, cos, tan, cot and by combinations
of these symbols and algebraic operations. For instance,
$$y = \log_2 \sqrt{1 + \sin x}; \quad y = \frac{1}{(\tan x)^{1/2} - 2x}.$$
The right-hand sides of equalities define the rules for ”processing” x into y.
The rule can be given by another frequent method.
Let f₁ and f₂ be functions defined by expressions given above, and let a be a number. We
then set:
$$f(x) = \begin{cases} f_1(x) & \text{for } x < a \\ f_2(x) & \text{for } x \ge a \end{cases}$$

The above equality can be interpreted as a rule according to which every x has a
corresponding y. This rule can be formulated thus: if x is less than a, then the corresponding
y is computed by rule f₁; but if x is greater than or equal to a, then the corresponding y
is determined by rule f₂.
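
A rule of this kind translates directly into a conditional in a programming language. In the Python sketch below, the particular expressions chosen for `f1` and `f2` and the value `a = 0` are only illustrative assumptions.

```python
import math

a = 0.0  # hypothetical switching point


def f1(x: float) -> float:
    """Rule used for x < a: square x and add twice x."""
    return x**2 + 2 * x


def f2(x: float) -> float:
    """Rule used for x >= a; defined for x > -2, which covers all x >= a = 0."""
    return (1 - x) / math.sqrt(x + 2)


def f(x: float) -> float:
    """f(x) = f1(x) for x < a and f(x) = f2(x) for x >= a."""
    return f1(x) if x < a else f2(x)


if __name__ == "__main__":
    for x in (-1.0, 0.0, 2.0):
        print(x, f(x))
```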

6 Composite function. Inverse of a function.

Definition 6.1. Let f : X → U, g : U → Y be, respectively, mappings of the set X into
the set U and of the set U into the set Y. For every x ∈ X the element g(f(x)) belongs to
the set Y. The correspondence x ↦ g(f(x)) defines a mapping of the set X into the set Y
which is denoted by g ◦ f and called the composition of mappings.
If X, U, Y are sets of numbers, then the composition of the mappings (functions) g ◦ f is
called the superposition of the functions or a composite function.
Comment 6.1. The rule associating the element x ∈ X with the element g(f (x)) is
that the mapping f is applied first to x (as a result, the element f (x) ∈ U is obtained),
and then the mapping g is applied to the obtained element f (x) ∈ U ; finally we have
g(f (x)) ∈ Y.
For instance, the following are composite functions:

- if y = u², u = sin x, then y = (sin x)² = sin² x;

- if y = tan u, u = x², then y = tan(x²);

- if y = cos u, u = x/2, then y = cos(x/2).


Definition 6.2. A mapping (function) f : X → Y is said to be injective (an injection)
if for different values of the argument there are different values of the function.
Definition 6.3. A mapping (function) f : X → Y is said to be surjective (a surjection)
if every y ∈ Y is the image of some x ∈ X, that is, there is an x such that f (x) = y.

Definition 6.4. A mapping (function) f : X → Y is said to be bijective (a bijection) if
it is both injective and surjective.

Comment 6.2. 1. An injective mapping possesses the following property: equal values
of the function can only come from equal values of the argument, that is, f(x₁) = f(x₂)
implies x₁ = x₂. For instance, the number
functions y = 5x, y = eˣ, y = arctan x are injective.
2. Surjective functions are also called “onto mappings”.
For instance, the number function y = sin x is a surjective mapping of ℝ¹ onto the set
[−1, 1] but is not a surjective mapping of ℝ¹ onto all of ℝ¹ (there is no inverse image of the
point y = 2).
3. A bijective function is a one-to-one mapping f : X → Y. This means that every x ∈ X
has a corresponding y ∈ Y, y = f(x), with different x ∈ X having different corresponding
y ∈ Y, and every y ∈ Y having a corresponding x ∈ X such that y = f(x).

Definition 6.5. Let f : X → Y be a bijective mapping. Then for every y ∈ Y there
exists a unique x ∈ X such that f(x) = y. The correspondence y ↦ x defines a mapping
Y → X, which is called the inverse of f and is denoted by f⁻¹. For the number sets X
and Y the mapping f⁻¹ is called the inverse of the function f (or an inverse function).

Comment 6.3. 1. The rule in Definition 6.5 implies the following property of an
inverse mapping (inverse function):

f(f⁻¹(y)) = y for any y ∈ Y.

2. The functions (mappings) f and f⁻¹ are mutually inverse, that is, (f⁻¹)⁻¹ = f.
3. To find the inverse of a given number function y = f(x), we must express x in terms
of y. Thus, for y = 3x + 2 the inverse mapping is x = (y − 2)/3; for y = x³ it is x = ∛y; for
y = 10ˣ it is x = log₁₀ y.
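
Composition and inversion are easy to try out numerically. The Python sketch below builds a composite function in the sense of Definition 6.1 and checks the property f(f⁻¹(y)) = y for two of the examples above at a few sample points; the helper name `compose` and the sample values are our own choices.

```python
import math


def compose(g, f):
    """Return the composition g ∘ f, the mapping x -> g(f(x))."""
    return lambda x: g(f(x))


# y = u**2 with u = sin x gives y = (sin x)**2.
sin_squared = compose(lambda u: u**2, math.sin)


def f(x: float) -> float:
    return 3 * x + 2


def f_inv(y: float) -> float:
    return (y - 2) / 3


def g(x: float) -> float:
    return 10.0**x


def g_inv(y: float) -> float:
    return math.log10(y)       # defined for y > 0, the range of g


if __name__ == "__main__":
    print(sin_squared(1.0), math.sin(1.0) ** 2)
    for y in (0.5, 1.0, 7.0):
        print(y, f(f_inv(y)), g(g_inv(y)))
```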

7 Logic symbols

The expressions ”for any element” and ”there exists” are frequently used in mathematics.
They are designated in a special manner:

- the first is denoted by the symbol ∀ (the first letter of the word ”Any” inverted);

- the second by the symbol ∃ (the first letter of the word ”Exist” reflected).

We shall also use the symbol ⇒ to mean ”follows”. Thus, if A and B are two sentences,
then A ⇒ B means that B follows from A.
If A ⇒ B and B ⇒ A, then the sentences A and B are said to be equivalent, written
A ⇔ B (A is equivalent to B).
Using this notation, the injectivity of a mapping f : X → Y can be written in the form:

∀ x₁, x₂ ∈ X, x₁ ≠ x₂ ⇒ f(x₁) ≠ f(x₂)

and the surjectivity of the same mapping in the form:
∀ y ∈ Y, ∃ x ∈ X | f(x) = y
(the vertical line before f(x) = y is read “such that”).
The designation $A \overset{\mathrm{def}}{\Longleftrightarrow} B$ is used when we want to describe a notion A using a sentence
B. It is read “A is by definition B”. For instance, the notation
$$X \subset Y \overset{\mathrm{def}}{\Longleftrightarrow} \{(\forall x)(x \in X) \Rightarrow (x \in Y)\}$$
defines X as a subset of the set Y: the right-hand side of this notation is a sentence and it is
read: “any element x of X is also an element of the set Y”.
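
Over finite sets, the quantifiers ∀ and ∃ correspond to Python's built-in `all` and `any`, and A ⇒ B can be written as `(not A) or B`. The sketch below transcribes the three formulas of this section almost literally; the sets and functions used are hypothetical examples.

```python
def implies(a: bool, b: bool) -> bool:
    """A ⇒ B is false only when A is true and B is false."""
    return (not a) or b


def injective(f, X) -> bool:
    # ∀ x1, x2 ∈ X, x1 ≠ x2 ⇒ f(x1) ≠ f(x2)
    return all(implies(x1 != x2, f(x1) != f(x2)) for x1 in X for x2 in X)


def surjective(f, X, Y) -> bool:
    # ∀ y ∈ Y, ∃ x ∈ X | f(x) = y
    return all(any(f(x) == y for x in X) for y in Y)


def subset(X, Y) -> bool:
    # (∀ x)(x ∈ X) ⇒ (x ∈ Y), with x ranging over all elements in sight
    return all(implies(x in X, x in Y) for x in set(X) | set(Y))


if __name__ == "__main__":
    X = range(-3, 4)
    print(injective(lambda x: x * x, X))                 # squares repeat, e.g. for 1 and -1
    print(surjective(lambda x: x * x, X, {0, 1, 4, 9}))  # every value is attained
    print(injective(lambda x: 5 * x, X))
    print(subset({1, 2}, {1, 2, 3}), subset({1, 4}, {1, 2, 3}))
```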

8 Converse theorem and contrary theorem

Many mathematical statements (including theorems) have the following form: “if A, then
B”, or, which is the same, “B follows from A”, A ⇒ B, where A is the condition and
B is the conclusion of the theorem.
For any statement A ⇒ B we can construct a new statement by interchanging A and B,
namely, B ⇒ A, that is, “if B, then A”, “A follows from B”.
The theorem (statement) B ⇒ A is the converse of the theorem (statement) A ⇒ B.
It is obvious that the converse of a converse is the original theorem, therefore the two
theorems are said to be mutually converse.
If the direct theorem is true, its converse may be either true or false.
Example 8.1. The direct theorem (Pythagoras’ theorem) is: if a triangle is right-angled,
then the square of the hypotenuse is equal to the sum of the squares of the other two
sides.
The converse is: if the square of the longest side equals the sum of the squares of the two
shorter sides, then the triangle is right-angled.
In this case, both the direct theorem and the converse are true.
Example 8.2. The direct theorem is: if two angles are right angles, they are equal.
The converse is: if two angles are equal, then they are right angles.
Here the direct theorem is true, but the converse is false.

For any statement A we denote by $\overline{A}$ the proposition that A is false.

Example 8.3. If A denotes the statement “7 is an even number”, then $\overline{A}$ denotes the
statement “7 is not an even number”.
If A is the statement “It will rain tomorrow”, then $\overline{A}$ is the statement “It will not rain
tomorrow”.
If A is the statement “All bullets will hit the target”, then $\overline{A}$ is the statement “At least
one bullet will not hit the target”.

For the theorem “if A, then B”, the statement “if $\overline{A}$, then $\overline{B}$” is called the contrary
theorem. The contrary of a contrary theorem is the initial theorem.

Example 8.4. For the theorem “If the sum of two opposite angles in a quadrilateral is
equal to 180°, then a circle can be circumscribed about the quadrilateral” the contrary
theorem is “If the sum of two opposite angles in a quadrilateral is not equal to 180°, then
a circle cannot be circumscribed about the quadrilateral”.

In this case, both the direct theorem and its contrary are true.
The contrary theorem is equivalent to the converse. This means that the contrary theorem
is true if and only if the converse theorem is true.
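
The equivalence of the contrary and the converse can be verified by listing all truth values of A and B. The following Python sketch builds this small truth table; `implies` is a helper defined here for the example, and the negation of A is written `not a`.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Truth value of "if A, then B"."""
    return (not a) or b


rows = list(product([False, True], repeat=2))  # all truth values of (A, B)

# Contrary "if not A, then not B" versus converse "if B, then A".
contrary_equals_converse = all(
    implies(not a, not b) == implies(b, a) for a, b in rows
)

# The direct theorem "if A, then B" is not equivalent to its converse in general.
direct_equals_converse = all(
    implies(a, b) == implies(b, a) for a, b in rows
)

if __name__ == "__main__":
    for a, b in rows:
        print(a, b, implies(a, b), implies(b, a), implies(not a, not b))
    print(contrary_equals_converse, direct_equals_converse)
```

The two columns for the converse and the contrary agree in every row, while the column for the direct theorem differs from them when exactly one of A, B is true.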

9 Necessity and sufficiency

Let the statement ” if A, then B ” be true. In this case the condition A is said to be
sufficient for B, and the condition B to be necessary for A.
Let also the converse be true, that is, ” if B, then A”. In this case B is the sufficient
condition for A and the condition A is necessary for B.
Thus, the condition A is necessary and sufficient for B (and the condition B is necessary
and sufficient for A). In other words, conditions A and B are equivalent: A is true if and
only if B is true.

Example 9.1. Bézout’s theorem is: “If α is a root of a polynomial P(x), then the
polynomial P(x) is divisible by x − α without remainder”.
The converse is: “If a polynomial P(x) is divisible by x − α, then α is a root of
the polynomial P(x)”. We know that both Bézout’s theorem and its converse are true.
Therefore, the necessary and sufficient condition for the number α to be a root of a
polynomial P(x) is that “the polynomial P(x) is divisible by x − α”.
The following statement is also true: “for a polynomial P(x) to be divisible by x − α
without remainder it is necessary and sufficient that the number α be a root of the
polynomial P(x)”.
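
The statement of Example 9.1 can be observed on a concrete polynomial. The Python sketch below divides P(x) by x − α using Horner's scheme (a standard technique, not taken from these notes); the function name `divide_by_linear` and the sample polynomial P(x) = x³ − 6x² + 11x − 6 = (x − 1)(x − 2)(x − 3) are our own choices.

```python
def divide_by_linear(coeffs: list, alpha: float) -> tuple:
    """Divide P(x) by (x - alpha) with Horner's scheme.

    coeffs lists the coefficients of P from the highest degree down.
    Returns (quotient coefficients, remainder); the remainder equals P(alpha).
    """
    partial = []
    carry = 0
    for c in coeffs:
        carry = carry * alpha + c
        partial.append(carry)
    return partial[:-1], partial[-1]


if __name__ == "__main__":
    P = [1, -6, 11, -6]               # x^3 - 6x^2 + 11x - 6
    print(divide_by_linear(P, 2))     # 2 is a root: the remainder is 0
    print(divide_by_linear(P, 4))     # 4 is not a root: the remainder is P(4)
```

The remainder vanishes exactly when α is a root, which is the necessary and sufficient condition discussed above.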

Part II

Single variable calculus


10 Topology in R1

Definition 10.1. A neighborhood of the point x ∈ R1 is a set V ⊂ R1 which contains an


open interval (a, b) ⊂ R1 containing x; i.e x ∈ (a, b) ⊂ V.
For instance, any open interval containing x is a neighborhood of the point x.
Definition 10.2. Let A ⊂ R1 . A point x ∈ R1 is called an interior point of the set A
if there exists an open interval (a, b) such that x ∈ (a, b) ⊂ A.
For instance, a point x of the open interval (a, b) is an interior point of the set (a, b).
Definition 10.3. The interior of a set A ⊂ R1 is the set of all interior points of the set
A.
Usually, the interior of a set A is denoted by Å or Int(A).
For instance, if A is an open interval A = (a, b), then Å = (a, b) = A.
Definition 10.4. A set A ⊂ R1 is open, if A = Å.
For instance, any open interval is an open set.
A set A ⊂ R1 is open if and only if it contains a neighborhood of each of its points.
The union of any family of open sets is open.
The set of all real numbers R1 and the empty set ∅ are open.
The intersection of a finite number of open sets is open.
Definition 10.5. A set A ⊂ R1 is said to be closed if its complement is open.
The intersection of any family of closed sets is closed.
The union of a finite number of closed sets is closed.
The set of all real numbers R1 and the empty set are closed.
Any closed interval [a, b] is a closed set.
Definition 10.6. If A is a subset of R1 , then a point x ∈ R1 is a limit point, or a point
of accumulation, of A provided every neighborhood of x contains at least one point y ≠ x
with y ∈ A.
Definition 10.7. The closure $\overline{A}$ of a set A ⊂ R1 is the intersection of all closed sets
containing A. The set of points belonging to $\overline{A}$ and not to the interior Å of A is called
the boundary of A, usually denoted by ∂A.

The closure operation has the following properties:

a) $\overline{A \cup B} = \overline{A} \cup \overline{B}$;

b) $\overline{A} \supset A$;

c) $\overline{\overline{A}} = \overline{A}$;

d) $\overline{A} = A$ if and only if A is a closed set;

e) $x \in \overline{A}$ if and only if every neighborhood V (x) of x intersects A.

Definition 10.8. A set A ⊂ R1 is bounded if there exist m, M ∈ R1 such that m ≤ x ≤ M
for every x ∈ A.

Definition 10.9. A set A ⊂ R1 is compact if it is both bounded and closed.

For instance, any closed interval [a, b] is compact.

11 Sequences

Definition 11.1. A function whose domain is the set of positive integers
N = {1, 2, . . . , n, . . . } and whose values belong to the set R1 of real numbers is called a
sequence of real numbers.

Comment 11.1. The value of the function (defining a sequence of real numbers)
corresponding to argument 1 is denoted by a1 , that corresponding to the argument 2
by a2 , . . . , that corresponding to the argument n by an . Here, a1 is called the first term
of the sequence, a2 the second term, . . . , an the n-th term.
The sequence a1 , a2 , . . . , an , . . . is denoted by (an ).

In order to define a sequence the value of the first, second,. . . , and n-th terms of the
sequence must be indicated. In other words, a rule must be given for evaluating the n-th
term of the sequence, given its place in the sequence for n = 1, 2, . . . .

Example 11.1. Let $a_n = q^{n-1}$, $q \neq 0$; then $a_1 = 1,\ a_2 = q,\ a_3 = q^2, \dots, a_n = q^{n-1}, \dots$.

Example 11.2. Let $a_n = \frac{1}{n}$; then $a_1 = 1,\ a_2 = \frac{1}{2},\ a_3 = \frac{1}{3}, \dots, a_n = \frac{1}{n}, \dots$.

Example 11.3. Let $a_n = n^2$; then $a_1 = 1,\ a_2 = 4,\ a_3 = 9, \dots, a_n = n^2, \dots$.

Example 11.4. Let $a_n = (-1)^n$; then $a_1 = -1,\ a_2 = 1,\ a_3 = -1, \dots, a_n = (-1)^n, \dots$.
Thus:
\[ a_n = \begin{cases} -1 & \text{for } n \text{ odd} \\ \phantom{-}1 & \text{for } n \text{ even} \end{cases} \]

Example 11.5. Let $a_n = \frac{1 + (-1)^n}{2}$; then $a_1 = 0,\ a_2 = 1,\ a_3 = 0,\ a_4 = 1$. Thus:
\[ a_n = \begin{cases} 0 & \text{for } n \text{ odd} \\ 1 & \text{for } n \text{ even} \end{cases} \]

It may happen that as the number n increases, the terms $a_n$ of the sequence increase
too.

Definition 11.2. An increasing sequence $(a_n)$ is one in which $a_n \le a_{n+1}$ for all n ∈ N.

Definition 11.3. A decreasing sequence $(a_n)$ is one in which $a_n \ge a_{n+1}$ for all n ∈ N.

Definition 11.4. A sequence which is either increasing or decreasing is called a monotone
sequence.

Example 11.6. If q > 1, then the sequence $a_n = q^n$ is increasing and if 0 < q < 1, then
the sequence $a_n = q^n$ is decreasing. If q ∈ (0, +∞) and q ≠ 1, then the sequence $a_n = q^n$
is monotone.

Definition 11.5. A sequence (an ) is called bounded if there exists a number M such that
|an | ≤ M for all n.

For instance, if 0 < q < 1, then the sequence $a_n = q^n$ is bounded ($|a_n| < 1$). The sequence
$a_n = (-1)^n$ is also bounded ($|a_n| \le 1$).

Definition 11.6. A sequence $(a_n)$ is called unbounded if it is not bounded. In other words,
if for any M > 0 there exists $n_M$ such that $|a_{n_M}| > M$.
For instance, if q > 1, then the sequence $a_n = q^n$ is unbounded.

Definition 11.7. If $(a_n)$ is a sequence, then any sequence $(a_{n_k})$, where $(n_k) = n_1, n_2, \dots$
is a strictly increasing sequence of positive integers, is called a subsequence of the sequence
$(a_n)$.

Comment 11.2.

• any subsequence of an increasing sequence is increasing;

• any subsequence of a decreasing sequence is decreasing;

• any subsequence of a bounded sequence is bounded.

12 Convergence

It may happen that as the number n increases without bound, the terms an of the sequence
come arbitrarily close to a certain number L. In this case we arrive at an important mathematical
concept, that of the limit of a sequence.

Definition 12.1. A number L is said to be the limit of the sequence (an ) if for any number
ε > 0 there is a number N (dependent on ε) such that all the terms an of the sequence
with subscript n exceeding N satisfy the condition:

|an − L| < ε.

In this case we write

$\lim_{n\to\infty} a_n = L$ and read: "as n tends to infinity, the limit of $a_n$ equals L", or
$a_n \to L$ as $n \to \infty$ and read: "as n tends to infinity, $a_n$ tends to L".

If $a_n \to L$ as $n \to \infty$, then the sequence $(a_n)$ is said to be convergent to L.

Comment 12.1.

• If the sequence $(a_n)$ converges to L, then any subsequence $(a_{n_k})$ of the sequence $(a_n)$
converges to L.
Indeed: for any ε > 0 there exists N such that for n > N we have $|a_n - L| < \varepsilon$.
Hence for $n_k > N$ we have $|a_{n_k} - L| < \varepsilon$.
• Not every sequence has a limit. For instance, the sequence $a_n = (-1)^n$ has no
limit. That is because the subsequence $a_{2k} = (-1)^{2k} = 1$ converges to 1 and the
subsequence $a_{2k+1} = (-1)^{2k+1} = -1$ converges to −1.
• The limit of a sequence $(a_n)$, if it exists, is unique.
Assuming the contrary, that is, $(a_n)$ converges to $L_1$ and to $L_2$ with $L_1 \neq L_2$, we find $N_1$ and
$N_2$ such that $|a_n - L_1| < |L_1 - L_2|/2$ for $n > N_1$ and $|a_n - L_2| < |L_1 - L_2|/2$ for $n > N_2$.
Since for $n > \max\{N_1, N_2\}$ we have $|L_1 - L_2| \le |L_1 - a_n| + |L_2 - a_n| < |L_1 - L_2|$,
we obtain that $|L_1 - L_2| < |L_1 - L_2|$, which is absurd.
• If the sequence $(a_n)$ converges to L, then it is bounded. Indeed, considering ε = 1
and $N_1$ such that $|a_n - L| < 1$ for $n > N_1$ we have
$|a_n| = |a_n - L + L| \le |a_n - L| + |L| < 1 + |L|$
for $n > N_1$.
Therefore $|a_n| \le \max\{|a_1|, |a_2|, \dots, |a_{N_1}|, 1 + |L|\}$ for all n.
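Example 12.1. Let us show that $\lim_{n\to\infty} \frac{1}{\sqrt{n}} = 0$.
Indeed, let ε > 0. Consider the inequality
\[ \left| \frac{1}{\sqrt{n}} - 0 \right| < \varepsilon . \]
We have $\frac{1}{\sqrt{n}} < \varepsilon$, $\frac{1}{n} < \varepsilon^2$, that is $n > \frac{1}{\varepsilon^2}$.
We set $N = \left[\frac{1}{\varepsilon^2}\right] + 1$, where $\left[\frac{1}{\varepsilon^2}\right]$ is the integral part of the number $\frac{1}{\varepsilon^2}$. It is obvious that
if n > N, then $n > \frac{1}{\varepsilon^2}$ and the inequality $\left| \frac{1}{\sqrt{n}} - 0 \right| < \varepsilon$ will be fulfilled.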
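The choice of N in Example 12.1 can be checked numerically. The following is only a sketch: the function name `threshold` and the sampled range of indices are assumptions made for illustration, and a finite check of course does not replace the proof.

```python
import math

def threshold(eps: float) -> int:
    """N = [1/eps^2] + 1, the index used in Example 12.1 (name is illustrative)."""
    return math.floor(1.0 / eps**2) + 1

for eps in (0.5, 0.1, 0.01):
    N = threshold(eps)
    # Check the inequality 1/sqrt(n) < eps on a finite sample of indices n > N.
    ok = all(1.0 / math.sqrt(n) < eps for n in range(N + 1, N + 1000))
    print(f"eps={eps}: N={N}, inequality holds on sample: {ok}")
```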

Note that when proving the existence of a limit we calculated the number N for the given
ε in a formal, textbook manner. From now on, we shall compute limits using other, simpler
and more convenient rules.
In some cases the limit of a sequence (an ) is said to be infinity. The meaning of this
concept is the following:
Definition 12.2. The limit of the sequence $(a_n)$ is said to be +∞ if for any M > 0 there
is $N_M$ such that $a_n > M$ for $n > N_M$.

For instance, the limit of the sequence $a_n = n^2$ is +∞.

Definition 12.3. The limit of the sequence $(a_n)$ is said to be −∞ if for any M > 0 there
is $N_M$ such that $a_n < -M$ for $n > N_M$.

For instance, the limit of the sequence $a_n = -n^2$ is −∞.

13 Rules (for convergence of sequences)

Suppose that (an ) and (bn ) are convergent sequences with limits a and b respectively, then
the following rules apply:

Sum rule: (an + bn ) converges to a + b.

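Proof. Consider the inequality:

\[ |(a_n + b_n) - (a + b)| = |(a_n - a) + (b_n - b)| \le |a_n - a| + |b_n - b|. \]

Given ε > 0, let $\varepsilon' = \frac{1}{2}\varepsilon$. Then $\varepsilon' > 0$ and, since $\lim_{n\to\infty} a_n = a$ and $\lim_{n\to\infty} b_n = b$, there exist
natural numbers $N_1$ and $N_2$ such that $n > N_1 \Rightarrow |a_n - a| < \varepsilon'$ and $n > N_2 \Rightarrow |b_n - b| < \varepsilon'$.
Let N be the maximum of $N_1$ and $N_2$. Then $n > N \Rightarrow |a_n - a| + |b_n - b| < 2\varepsilon' = \varepsilon$,
and hence $|(a_n + b_n) - (a + b)| < \varepsilon$.
In other words $(a_n + b_n)$ converges to a + b.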

Product rule: (an · bn ) converges to a · b.

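Proof. Since $\lim_{n\to\infty} b_n = b$, the sequence $(b_n)$ is bounded, so there is M > 0 such that $|b_n| \le M$ for any n ∈ N. It follows that:

\[ |a_n b_n - a b| = |a_n b_n - a b_n + a b_n - a b| = |b_n (a_n - a) + a (b_n - b)| \]
\[ \le |b_n| \cdot |a_n - a| + |a| \cdot |b_n - b| \le M |a_n - a| + |a| \cdot |b_n - b|, \quad \text{for all } n \in \mathbb{N}. \]

Given ε > 0, let $\varepsilon_1 = \frac{\varepsilon}{2M}$ and $\varepsilon_2 = \frac{\varepsilon}{2(|a| + 1)}$.
Since $\lim_{n\to\infty} a_n = a$ and $\lim_{n\to\infty} b_n = b$, there exist $N_1$ and $N_2$ such that:

$n > N_1 \Rightarrow |a_n - a| < \varepsilon_1$

and

$n > N_2 \Rightarrow |b_n - b| < \varepsilon_2$.

Let $N_3$ be the maximum of $N_1$ and $N_2$. If $n > N_3$ then:
\[ |a_n b_n - a b| < M \varepsilon_1 + |a| \varepsilon_2 < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
In other words $\lim_{n\to\infty} a_n b_n = a b$.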

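Quotient rule: $(a_n / b_n)$ converges to a/b provided that $b_n \neq 0$ for each n and $b \neq 0$.

Proof. Firstly it is shown that if $\lim_{n\to\infty} b_n = b \neq 0$ and $b_n \neq 0$ for all n, then $\lim_{n\to\infty} \frac{1}{b_n} = \frac{1}{b}$.
It is clear that we have:
\[ \left| \frac{1}{b_n} - \frac{1}{b} \right| = \frac{|b_n - b|}{|b_n| \cdot |b|}. \]
Since $\lim_{n\to\infty} b_n = b$ there exists an integer $N_1$ such that $|b_n - b| < \frac{1}{2}|b|$ for all $n > N_1$, and hence $|b_n| > \frac{1}{2}|b|$ for all $n > N_1$. Let
M be the maximum of $\frac{2}{|b|}, \frac{1}{|b_1|}, \dots, \frac{1}{|b_{N_1}|}$. Then $\left| \frac{1}{b_n} \right| \le M$ for all n.
So, given any ε > 0, let $\varepsilon' = \frac{\varepsilon \cdot |b|}{M}$. Then $\varepsilon' > 0$ and there exists an integer $N_2$ such that
$|b_n - b| < \varepsilon'$ for all $n > N_2$. Hence, $\left| \frac{1}{b_n} - \frac{1}{b} \right| < \varepsilon' \cdot \frac{M}{|b|} = \varepsilon$ for all $n > N_3$, where $N_3$ is the maximum
of $N_1$ and $N_2$. In other words $\lim_{n\to\infty} \frac{1}{b_n} = \frac{1}{b}$. By the product rule, then, $\lim_{n\to\infty} \frac{a_n}{b_n} = \frac{a}{b}$.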

Scalar product rule: (k · an ) converges to k · a for every real number k.


The scalar product rule is a special case of the product rule.

Application 13.1. Evaluate
\[ \lim_{n\to\infty} \frac{n^2 + 2n + 3}{4n^2 + 5n + 6}. \]

Solution: The quotient rule cannot be applied directly since neither the numerator nor
the denominator of $\frac{n^2 + 2n + 3}{4n^2 + 5n + 6}$ converges to a finite limit.
However, if the numerator and denominator are divided by the dominant term $n^2$ the
following is obtained:
\[ a_n = \frac{1 + \frac{2}{n} + \frac{3}{n^2}}{4 + \frac{5}{n} + \frac{6}{n^2}}. \]
It is easy to prove that $\frac{1}{n} \to 0$ as $n \to \infty$ and that the constant sequence (k) has limit k. Hence
$\lim_{n\to\infty} a_n = \frac{1}{4}$, freely using the sum, product, scalar product and quotient rules.
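As a sanity check of Application 13.1, the terms can be evaluated exactly with rational arithmetic. This is a small sketch only; the helper name `a` is chosen here for illustration.

```python
from fractions import Fraction

def a(n: int) -> Fraction:
    """Exact value of (n^2 + 2n + 3) / (4n^2 + 5n + 6)."""
    return Fraction(n * n + 2 * n + 3, 4 * n * n + 5 * n + 6)

quarter = Fraction(1, 4)
for n in (1, 10, 100, 10_000):
    # The distance to 1/4 shrinks as n grows.
    print(n, float(a(n)), float(abs(a(n) - quarter)))
```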

Squeeze rule: Let (an ), (bn ), (cn ) be sequences satisfying an ≤ bn ≤ cn for all n ∈ N. If
(an ) and (cn ) both converge to the same limit L, then (bn ) also converges to L.

Proof. If $a_n \le b_n \le c_n$, then $a_n - L \le b_n - L \le c_n - L$. Hence
$|b_n - L| \le \max\{|a_n - L|, |c_n - L|\}$. Given ε > 0, there exist natural numbers $N_1$ and $N_2$ such that
$n > N_1 \Rightarrow |a_n - L| < \varepsilon$ and $n > N_2 \Rightarrow |c_n - L| < \varepsilon$. Let N be the maximum of $N_1$ and
$N_2$. Then for n > N it follows that $|b_n - L| < \varepsilon$. In other words $\lim_{n\to\infty} b_n = L$.

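Application 13.2. Show that $\lim_{n\to\infty} (-1)^n \cdot \frac{1}{n^2} = 0$.

Solution: Note that
\[ \left| (-1)^n \cdot \frac{1}{n^2} \right| \le \frac{1}{n^2}. \]
Now let $a_n = -\frac{1}{n^2}$, $b_n = (-1)^n \cdot \frac{1}{n^2}$, $c_n = \frac{1}{n^2}$, so that $a_n \le b_n \le c_n$ for all n.
Both $(a_n)$ and $(c_n)$ converge to 0. By the squeeze rule $(b_n)$ converges to 0.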

Principle of monotone sequences: A bounded monotone sequence is convergent.

Proof. The statement for a bounded increasing sequence is proved, the proof being similar
for a decreasing sequence.
Let $(a_n)$ be such that $a_1 \le a_2 \le \dots \le a_n \le \dots$ and $a_n \le M$ for all n ∈ N. Let
$M_0 = \sup\{a_n \mid n \in \mathbb{N}\}$ be the least upper bound of the set of numbers appearing in the
sequence. Given ε > 0, $M_0 - \varepsilon$ cannot be an upper bound for $\{a_n \mid n \in \mathbb{N}\}$. Hence, there
exists a value n = N such that $a_N > M_0 - \varepsilon$. Since the sequence is increasing, $a_n \ge a_N > M_0 - \varepsilon$
for n > N. Furthermore $a_n \le M_0$ by the definition of
$M_0$ and hence, for n > N, $|a_n - M_0| < \varepsilon$. This proves that $\lim_{n\to\infty} a_n = M_0$.

Application 13.3. A sequence $(a_n)$ is defined by $a_1 = 1$ and $a_{n+1} = \sqrt{a_n + 1}$ for n ≥ 1.
Show that $\lim_{n\to\infty} a_n = \frac{1 + \sqrt{5}}{2}$.

Solution: First, it is shown by induction on n that $(a_n)$ is an increasing sequence.
Since $a_1 = 1$ and $a_2 = \sqrt{2}$, it follows that $a_1 \le a_2$. Now
\[ a_{n+1} - a_n = \sqrt{a_n + 1} - \sqrt{a_{n-1} + 1} = \frac{a_n - a_{n-1}}{\sqrt{a_n + 1} + \sqrt{a_{n-1} + 1}} \]
and since $\sqrt{a_n + 1} + \sqrt{a_{n-1} + 1}$ is positive, if $a_{n-1} \le a_n$ then $a_n \le a_{n+1}$. So, by induction,
$(a_n)$ is an increasing sequence.
Now $a_n^2 - a_{n+1}^2 = a_n^2 - a_n - 1 = \left(a_n - \frac{1}{2}\right)^2 - \frac{5}{4}$ and, since $(a_n)$ is increasing, $\left(a_n - \frac{1}{2}\right)^2 - \frac{5}{4} \le 0$.
This quickly leads to $(a_n)$ being bounded above by $\frac{1}{2}(1 + \sqrt{5})$.
By the principle of monotone sequences $(a_n)$ is convergent. Hence, suppose that
$\lim_{n\to\infty} a_n = L$. Since $\lim_{n\to\infty} a_{n+1} = L$ we obtain $L = \sqrt{L + 1}$ and so $L^2 = L + 1$. The
quadratic equation $L^2 = L + 1$ has two roots, namely $\frac{1}{2}(1 \pm \sqrt{5})$. Since $a_n \ge 1$ for all
n ∈ N, the positive root is required. Hence $L = \frac{1}{2}(1 + \sqrt{5})$.
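The behaviour established in Application 13.3 can be observed by iterating the recurrence in floating point. The sketch below is illustrative only; the function name `iterate` and the number of terms (30) are arbitrary choices.

```python
import math

def iterate(n_terms: int) -> list:
    """First n_terms of the sequence a_1 = 1, a_{n+1} = sqrt(a_n + 1)."""
    terms = [1.0]
    for _ in range(n_terms - 1):
        terms.append(math.sqrt(terms[-1] + 1.0))
    return terms

terms = iterate(30)
limit = (1 + math.sqrt(5)) / 2  # the value (1 + sqrt 5)/2 found in the solution

# The computed terms never decrease and approach the limit.
print(all(x <= y for x, y in zip(terms, terms[1:])))
print(terms[-1], limit)
```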
Theorem 13.1 (Bolzano-Weierstrass theorem). Any bounded sequence (an ) of real num-
bers contains a convergent subsequence.

Proof. Let $S_N = \{a_n \mid n > N\}$. If every $S_N$ has a maximum element, then define a
subsequence of $(a_n)$ as follows: $b_1 = a_{n_1}$ is the maximum of $S_1$, $b_2 = a_{n_2}$ is the maximum
of $S_{n_1}$, $b_3 = a_{n_3}$ is the maximum of $S_{n_2}$ and so on. Therefore $(b_n)$ is a monotone
decreasing subsequence of $(a_n)$. Since $(a_n)$ is bounded, so is $(b_n)$. It follows that
$(b_n)$ is a convergent subsequence of $(a_n)$.
On the other hand if, for some M, $S_M$ does not have a maximum element, then for any $a_m$
with m > M there exists an $a_n$ following $a_m$ with $a_n > a_m$. Now let $c_1 = a_{M+1}$ and let $c_2$ be the
first term of $(a_n)$ following $c_1$ for which $c_2 > c_1$. Now let $c_3$ be the first term of $(a_n)$ following
$c_2$ for which $c_3 > c_2$ and so on. Therefore $(c_n)$ is a monotone increasing subsequence of
$(a_n)$. Since $(c_n)$ is bounded, it is convergent.

It is intuitively clear that if $a_n \to L$ as $n \to \infty$, then all the terms of the sequence with large
subscripts will differ very little, all of them being approximately equal to L.
More precisely, we have:

Theorem 13.2 (Cauchy’s criterion for the convergence of a sequence). A sequence (an )
has a limit if and only if for any ε > 0 there exists Nε such that all the terms of the
sequence with subscripts p, q > Nε satisfy |ap − aq | < ε.

Proof. Assume that the sequence $(a_n)$ has a limit L and let ε > 0 be a number.
Consider the number $\frac{\varepsilon}{2}$; by the definition of the limit there exists an integer N such that
$|a_n - L| < \frac{\varepsilon}{2}$ for all n > N. Hence $|a_p - L| < \frac{\varepsilon}{2}$, $|a_q - L| < \frac{\varepsilon}{2}$ for p, q > N and it
follows that $|a_p - a_q| \le |a_p - L| + |a_q - L| < \varepsilon$ for p, q > N.
Assume now that for any
ε > 0 there exists N such that $|a_p - a_q| < \varepsilon$ for p, q > N.
Considering ε = 1 and $N_1$ such that $|a_p - a_q| < 1$ for $p, q > N_1$ we have:

\[ |a_n| = |a_n - a_{N_1+1} + a_{N_1+1}| \le |a_n - a_{N_1+1}| + |a_{N_1+1}| \le 1 + |a_{N_1+1}|, \quad \text{for } n > N_1. \]

Therefore:
\[ |a_n| \le \max\{|a_1|, |a_2|, \dots, |a_{N_1}|, |a_{N_1+1}| + 1\} = M \]
for all n, that is, the sequence $(a_n)$ is bounded.
According to the Bolzano-Weierstrass theorem the sequence $(a_n)$ contains a convergent
subsequence $(a_{n_k})$. Let $L = \lim_{k\to\infty} a_{n_k}$ and let ε > 0 be a number. There exists $N_1$ such
that for $n_k > N_1$ we have $|a_{n_k} - L| < \frac{\varepsilon}{2}$, and $N_2$ such that $|a_p - a_q| < \frac{\varepsilon}{2}$ for $p, q > N_2$.
Considering $N_3 = \max\{N_1, N_2\}$ and $n > N_3$ we have:

\[ |a_n - L| \le |a_n - a_{n_k}| + |a_{n_k} - L| < \varepsilon \]

where $n_k > N_3$ and $n_k$ is fixed.

14 Limit points of a sequence

Definition 14.1. The set of limit points of the sequence $(a_n)$ is the collection of points x ∈ R1
for which there exists a subsequence $(a_{n_k})$ of the sequence $(a_n)$ such that $\lim_{k\to\infty} a_{n_k} = x$.
Usually the set of limit points of the sequence $(a_n)$ is denoted by $L(a_n)$.
A bounded sequence $(a_n)$ converges and $\lim_{n\to+\infty} a_n = L$ if and only if $L(a_n) = \{L\}$.

Definition 14.2. The limit superior of a sequence $(a_n)$ is $\sup L(a_n)$. The limit superior
of a sequence $(a_n)$ is usually denoted by $\limsup_{n\to\infty} a_n$ or by $\overline{\lim}_{n\to\infty} a_n$.

Definition 14.3. The limit inferior of a sequence $(a_n)$ is $\inf L(a_n)$. The limit inferior of
a sequence $(a_n)$ is usually denoted by $\liminf_{n\to\infty} a_n$ or by $\underline{\lim}_{n\to\infty} a_n$.

Example 14.1. If $a_n = (-1)^n$ then $L(a_n) = \{-1, 1\}$ and $\underline{\lim}_{n\to\infty} a_n = -1$, $\overline{\lim}_{n\to\infty} a_n = 1$.

A bounded sequence $(a_n)$ converges if and only if

\[ \underline{\lim}_{n\to\infty} a_n = \overline{\lim}_{n\to\infty} a_n = L. \]

15 Series of real numbers

If a sequence $(a_n)$ is given, the finite sum

\[ s_n = a_1 + a_2 + \dots + a_n \]

can be formed for each n ∈ N.

If the sequence $(s_n)$ converges to some limit s, then s can justifiably be called the sum of
the infinite series
\[ \sum_{n=1}^{\infty} a_n = a_1 + a_2 + \dots . \]

More precisely:

Definition 15.1. It is said that the symbol $\sum_{n=1}^{\infty} a_n$ is a convergent series, with sum s, if
the sequence $(s_n)$ of n-th partial sums converges to s.

If $(s_n)$ is a divergent sequence then, irrespective of its precise behavior, $\sum_{n=1}^{\infty} a_n$ is called a
divergent series.

Rather regrettably, $\sum_{n=1}^{\infty} a_n$ is still used to denote a divergent series even though it does
not possess a sum.

Example 15.1. Show that
\[ \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \dots \]
has sum 1.

Solution: The n-th partial sum of $\sum_{n=1}^{\infty} \frac{1}{2^n}$ is $s_n = \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \dots + \frac{1}{2^n} = 1 - \frac{1}{2^n}$. Since
$\lim_{n\to\infty} s_n = 1$ it can be deduced that $\sum_{n=1}^{\infty} \frac{1}{2^n}$ converges and has sum 1.

Example 15.2. Show that 1 + 2 + 3 + . . . is a divergent series.

Solution: The n-th partial sum is $s_n = 1 + 2 + \dots + n = \frac{1}{2} n (n + 1)$. Since $(s_n)$ is a divergent
sequence, $\sum_{n=1}^{\infty} n$ is divergent.

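Example 15.3. Show that
\[ \sum_{n=1}^{\infty} \frac{1}{n^2 + n} = 1. \]

Solution: Since
\[ \frac{1}{n^2 + n} = \frac{1}{n(n + 1)} = \frac{1}{n} - \frac{1}{n + 1}, \]
the n-th partial sum of $\sum_{n=1}^{\infty} \frac{1}{n^2 + n}$ can be written as
\[ s_n = \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \dots + \left(\frac{1}{n} - \frac{1}{n + 1}\right) = 1 - \frac{1}{n + 1}. \]
Now $\lim_{n\to\infty} s_n = 1$.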

Example 15.4. $\sum_{n=1}^{\infty} \frac{1}{2^n}$ is a special case of an important class of infinite series, namely
the geometric series $\sum_{n=0}^{\infty} a \cdot x^n$, where x is a real number.
Notice that the summation here begins at n = 0, and not at n = 1. For this series the
sum of the first n terms is

\[ s_n = a + a \cdot x + a \cdot x^2 + \dots + a \cdot x^{n-1} \]

so
\[ x \cdot s_n = a \cdot x + a \cdot x^2 + \dots + a \cdot x^n \]
and by subtraction
\[ s_n = \frac{a(1 - x^n)}{1 - x} \]
for $x \neq 1$. Therefore, $\lim_{n\to\infty} s_n = \frac{a}{1 - x}$ for |x| < 1.
Since, for $a \neq 0$, $(s_n)$ diverges for |x| ≥ 1, the following result is obtained:
Result: The geometric series
\[ \sum_{n=0}^{\infty} a \cdot x^n = a + a \cdot x + a \cdot x^2 + \dots, \quad a \neq 0 \]
converges if and only if |x| < 1. Moreover, its sum is then $\frac{a}{1 - x}$.
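The closed form for $s_n$ and the limit $a/(1-x)$ can be compared with a direct summation. This is a minimal sketch; the function names and the sample values a = 3, x = 0.5 are arbitrary choices for illustration.

```python
def partial_sum(a: float, x: float, n: int) -> float:
    """Sum of the first n terms a + a*x + ... + a*x^(n-1), added term by term."""
    return sum(a * x**k for k in range(n))

def closed_form(a: float, x: float, n: int) -> float:
    """a(1 - x^n)/(1 - x), valid for x != 1."""
    return a * (1 - x**n) / (1 - x)

a, x = 3.0, 0.5
for n in (1, 5, 20, 60):
    # The two columns agree; both approach a/(1 - x) as n grows.
    print(n, partial_sum(a, x, n), closed_form(a, x, n), a / (1 - x))
```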

Since the sum of a convergent series is defined to be the limit of the sequence of n-th
partial sums of the series in question the rules concerning the convergence of sequence
can be used to establish theorems concerning convergence of series.
The first result provides a useful test for the divergence of series.
The vanishing condition

If $\sum_{n=1}^{\infty} a_n$ is convergent, then $\lim_{n\to\infty} a_n = 0$.

Proof. Suppose that $(s_n)$ converges to some limit s. Hence, $(s_{n-1})$ also converges to s.
But $a_n = s_n - s_{n-1}$ and so $\lim_{n\to\infty} a_n = 0$.

Example 15.5. Consider $\sum_{n=1}^{\infty} \frac{n}{n + 1}$. Since $a_n = \frac{n}{n + 1}$ and $\lim_{n\to\infty} a_n = 1 \neq 0$, by the
vanishing condition $\sum_{n=1}^{\infty} \frac{n}{n + 1}$ does not converge. In other words $\sum_{n=1}^{\infty} \frac{n}{n + 1}$ is divergent.

It is important to note that the converse of the vanishing condition is false! In other
words there are divergent series whose terms nevertheless tend to zero.
Example 15.6. Consider $\sum_{n=1}^{\infty} (\sqrt{n} - \sqrt{n - 1})$. The n-th partial sum may be written as:
\[ s_n = (\sqrt{1} - \sqrt{0}) + (\sqrt{2} - \sqrt{1}) + \dots + (\sqrt{n} - \sqrt{n - 1}) = \sqrt{n}. \]
Clearly $(s_n)$ is a divergent sequence and so $\sum_{n=1}^{\infty} (\sqrt{n} - \sqrt{n - 1})$ is a divergent series. However
\[ a_n = \sqrt{n} - \sqrt{n - 1} = \frac{(\sqrt{n} - \sqrt{n - 1})(\sqrt{n} + \sqrt{n - 1})}{\sqrt{n} + \sqrt{n - 1}} = \frac{1}{\sqrt{n} + \sqrt{n - 1}} \to 0. \]
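The two facts of Example 15.6 (terms tending to zero, partial sums growing like $\sqrt{n}$) can be seen side by side numerically. A small sketch, with helper names chosen here for illustration:

```python
import math

def term(n: int) -> float:
    """a_n = sqrt(n) - sqrt(n - 1)."""
    return math.sqrt(n) - math.sqrt(n - 1)

def partial_sum(n: int) -> float:
    """s_n = a_1 + ... + a_n, added term by term."""
    return sum(term(k) for k in range(1, n + 1))

for n in (10, 1_000, 100_000):
    # The term shrinks while the partial sum keeps pace with sqrt(n).
    print(n, term(n), partial_sum(n), math.sqrt(n))
```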

Cauchy's criterion for the convergence of a series

A series $\sum_{n=1}^{\infty} a_n$ converges if and only if for any ε > 0 there exists N such that for n ≥ N
and p ≥ 1 the following inequality holds:
\[ |a_{n+1} + a_{n+2} + \dots + a_{n+p}| < \varepsilon. \]

Proof. Let $s_n$ be the n-th partial sum of the series:

\[ s_n = a_1 + a_2 + \dots + a_n. \]

The series $\sum_{n=1}^{\infty} a_n$ converges if and only if the sequence $(s_n)$ converges. The sequence $(s_n)$
converges if and only if for any ε > 0 there exists $N_\varepsilon$ such that for $q, r > N_\varepsilon$ the following
inequality holds:
\[ |s_q - s_r| < \varepsilon. \]
This is equivalent to the condition: for any ε > 0 there exists $N_\varepsilon$ such that for $n \ge N_\varepsilon$
and p ≥ 1 the following inequality holds:
\[ |a_{n+1} + a_{n+2} + \dots + a_{n+p}| < \varepsilon. \]

16 Rules (for convergence of series)

By considering the n-th partial sums of the appropriate series and the sum and the scalar
product rules for sequences the following elementary results can be easily proved.

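Sum rule: If $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are convergent series, then $\sum_{n=1}^{\infty} (a_n + b_n)$ is also convergent
and
\[ \sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n. \]

Scalar product rule: If $\sum_{n=1}^{\infty} a_n$ is convergent, then $\sum_{n=1}^{\infty} (k \cdot a_n)$ is convergent for any
k ∈ R1 and
\[ \sum_{n=1}^{\infty} (k \cdot a_n) = k \sum_{n=1}^{\infty} a_n. \]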

Rules will be established below which can be used to test whether a given series converges
or not.

Integral test: Let $f : \mathbb{R}^1_+ \to \mathbb{R}^1_+$ be a decreasing function and let $a_n = f(n)$ for each
n ∈ N. Let $j_n = \int_1^n f(x)\, dx$. The series $\sum_{n=1}^{\infty} a_n$ converges if and only if $(j_n)$ converges.
The proof of the statement is given in the section where the Riemann integral is rigorously
defined.
Application 16.1. Establish that the p series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges if and only if p > 1.

Solution: Consider the function $f_p : \mathbb{R}^1_+ \to \mathbb{R}^1_+$ given by $f_p(x) = \frac{1}{x^p}$. When p > 0 this is
a decreasing function of x and $a_n = \frac{1}{n^p} = f_p(n)$ is the n-th term of the p series $\sum_{n=1}^{\infty} \frac{1}{n^p}$.
For $p \neq 1$
\[ j_n = \int_1^n \frac{1}{x^p}\, dx = \left. \frac{x^{1-p}}{1 - p} \right|_1^n = \frac{1}{1 - p}\,(n^{1-p} - 1). \]
So, for p > 1, $\lim_{n\to\infty} j_n = \frac{1}{p - 1}$, and for p < 1 the sequence $(j_n)$ is divergent. For p = 1,
\[ j_n = \int_1^n \frac{1}{x}\, dx = \ln x \,\Big|_1^n = \ln(n) \]
and so $(j_n)$ again diverges.

When p ≤ 0, $\sum_{n=1}^{\infty} \frac{1}{n^p}$ diverges by the vanishing condition.

Example 16.1. The harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges and the series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges.

The p-series, together with geometric series, give a fund of known convergent and divergent
series.
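The link between the partial sums $s_n$ and the integrals $j_n$ can be observed numerically for p = 1 and p = 2. The sketch below relies on the standard bracketing $s_n - a_1 \le j_n \le s_{n-1}$ for a decreasing positive f, which is not stated in these notes and is used here as an assumption; the function names are illustrative.

```python
import math

def partial_sum(p: float, n: int) -> float:
    """s_n = 1/1^p + 1/2^p + ... + 1/n^p."""
    return sum(1.0 / k**p for k in range(1, n + 1))

def j(p: float, n: int) -> float:
    """j_n = integral of x^(-p) over [1, n], using the formulas of Application 16.1."""
    if p == 1:
        return math.log(n)
    return (n ** (1 - p) - 1) / (1 - p)

for p in (1, 2):
    for n in (10, 1_000, 100_000):
        # Expected ordering (assumed bracketing): s_n - 1 <= j_n <= s_(n-1).
        print(p, n, partial_sum(p, n) - 1, j(p, n), partial_sum(p, n - 1))
```

For p = 1 all three columns grow without bound, while for p = 2 they stay bounded, in line with Example 16.1.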

First comparison test: If $0 \le a_n \le b_n$ for all n ∈ N, then $\sum_{n=1}^{\infty} b_n$ convergent implies
$\sum_{n=1}^{\infty} a_n$ convergent.

Proof. Let $s_n = \sum_{k=1}^{n} a_k$ and $t_n = \sum_{k=1}^{n} b_k$. From the given conditions $0 \le s_n \le t_n$ for all
n ∈ N. If $\sum_{n=1}^{\infty} b_n$ converges, then $\lim_{n\to\infty} t_n = t$ and, since $(t_n)$ is an increasing sequence,
$t_n \le t$ for all n ∈ N.
Therefore, $s_n \le t$ for all n ∈ N and hence $(s_n)$ is a bounded and increasing sequence. It
follows that $(s_n)$ converges and hence $\sum_{n=1}^{\infty} a_n$ converges.

Example 16.2. The series $\sum_{n=1}^{\infty} \frac{1 + \cos n}{3^n + 2 \cdot n^3}$ is convergent.

Solution: Let $a_n = \frac{1 + \cos n}{3^n + 2 \cdot n^3}$. Then $a_n \ge 0$ since $\cos n \ge -1$. Also $a_n \le \frac{2}{3^n + 2 \cdot n^3}$
since $\cos n \le 1$. Therefore, since $3^n > 0$, $a_n < \frac{2}{2 n^3} = \frac{1}{n^3}$. Let $b_n = \frac{1}{n^3}$. Then $\sum_{n=1}^{\infty} b_n$
converges, being the p series with p = 3. Hence $\sum_{n=1}^{\infty} a_n$ also converges.


Second comparison test: Let $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ be positive term series such that
\[ \lim_{n\to\infty} \frac{a_n}{b_n} = L \neq 0. \]
Then, $\sum_{n=1}^{\infty} a_n$ converges if and only if $\sum_{n=1}^{\infty} b_n$ converges.

Proof. Suppose that $\sum_{n=1}^{\infty} b_n$ is convergent and let

\[ s_n = a_1 + a_2 + \dots + a_n; \quad t_n = b_1 + b_2 + \dots + b_n. \]
Since $\lim_{n\to\infty} \frac{a_n}{b_n} = L$, for ε = 1 there is an $N_1$ such that
\[ \left| \frac{a_n}{b_n} - L \right| < 1, \quad \text{for all } n > N_1. \]
Hence,
\[ \frac{a_n}{b_n} = \left| \frac{a_n}{b_n} \right| = \left| \frac{a_n}{b_n} - L + L \right| \le \left| \frac{a_n}{b_n} - L \right| + |L| < 1 + |L| = k, \quad n > N_1. \]
Now consider the positive term series $\sum_{n=1}^{\infty} \alpha_n$ and $\sum_{n=1}^{\infty} \beta_n$ where $\alpha_n = a_{N_1+n}$ and $\beta_n = k \cdot b_{N_1+n}$.
Hence $0 \le \alpha_n \le \beta_n$ for all n ∈ N. Since $\sum_{n=1}^{\infty} b_n$ converges, so does $\sum_{n=N_1+1}^{\infty} b_n$,
and hence $\sum_{n=1}^{\infty} \beta_n$ converges by the scalar product rule for series. By the first comparison
test, $\sum_{n=1}^{\infty} \alpha_n$ converges and, since the addition of a finite number of terms to a convergent
series produces another convergent series, $\sum_{n=1}^{\infty} a_n$ also converges. This proves that $\sum_{n=1}^{\infty} b_n$
convergent implies $\sum_{n=1}^{\infty} a_n$ convergent. The converse of this statement can be proved
by reversing the roles of $a_n$ and $b_n$ in the above argument and observing that $\frac{b_n}{a_n} \to \frac{1}{L}$.

Example 16.3. Show that the series $\sum_{n=1}^{\infty} \frac{2n}{n^2 - 5n + 8}$ is divergent.

Solution: Let $a_n = \frac{2n}{n^2 - 5n + 8}$ and $b_n = \frac{1}{n}$. Then $\frac{a_n}{b_n} = \frac{2 n^2}{n^2 - 5n + 8} \to 2 \neq 0$. Hence
$\sum_{n=1}^{\infty} a_n$ diverges by comparison with the divergent harmonic series.


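Ratio test: Let $\sum_{n=1}^{\infty} a_n$ be a series of positive terms and for each n ∈ N let $\alpha_n = \frac{a_{n+1}}{a_n}$.
Suppose that $(\alpha_n)$ converges to some limit L. If L > 1 then $\sum_{n=1}^{\infty} a_n$ diverges; if L < 1 then
$\sum_{n=1}^{\infty} a_n$ converges; and if L = 1 the test gives no information.

Proof. Suppose that L < 1 and let $\varepsilon = \frac{1}{2}(1 - L)$.
Now ε > 0 and L + ε = k < 1. Since $\lim_{n\to\infty} \alpha_n = L$ there is a value $N_\varepsilon$ such that
$\alpha_n = |\alpha_n - L + L| \le |\alpha_n - L| + L < \varepsilon + L = k < 1$ for all $n > N_\varepsilon$. Therefore $a_{n+1} \le k \cdot a_n$ for all $n > N_\varepsilon$.
Let $\beta_n = a_{N_\varepsilon + n}$. Then $\beta_{n+1} \le k \cdot \beta_n$ for all n ∈ N and so (by induction on n)

\[ \beta_{n+1} \le k^n \cdot \beta_1, \quad \text{for all } n \in \mathbb{N}. \]

Now $\sum_{n=0}^{\infty} k^n \cdot \beta_1$ is a convergent geometric series since k < 1. Hence, by the first comparison test, $\sum_{n=1}^{\infty} \beta_n$ converges and
$\sum_{n=1}^{\infty} a_n$ also converges.
Suppose now that L > 1 and let ε = L − 1. Now ε > 0 and since $\lim_{n\to\infty} \alpha_n = L$ there is a
value $N_\varepsilon$ such that $\alpha_n > L - \varepsilon = 1$ for all $n > N_\varepsilon$.
Hence $a_{n+1} > a_n$ for all $n > N_\varepsilon$ and so $a_n \ge a_{N_\varepsilon + 1}$ for all $n > N_\varepsilon$. Since $a_{N_\varepsilon + 1} > 0$, $(a_n)$ does
not tend to zero. By the vanishing condition $\sum_{n=1}^{\infty} a_n$ diverges.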


Example 16.4. Determine those values of x for which $\sum_{n=1}^{\infty} n \cdot (4x^2)^n$ is convergent.

Solution: Here $a_n = n \cdot (4x^2)^n$, and so
\[ \alpha_n = \frac{(n + 1) \cdot (4x^2)^{n+1}}{n \cdot (4x^2)^n} = 4x^2 \left(1 + \frac{1}{n}\right). \]
Now $\alpha_n \to 4x^2$. Hence $\sum_{n=1}^{\infty} a_n$ diverges if $4x^2 > 1$ (in other words $|x| > \frac{1}{2}$), $\sum_{n=1}^{\infty} a_n$ converges
if $|x| < \frac{1}{2}$, and no information is gained if $|x| = \frac{1}{2}$.
If $|x| = \frac{1}{2}$ then $\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} n$ diverges.
Therefore $\sum_{n=1}^{\infty} n \cdot (4x^2)^n$ converges if and only if $|x| < \frac{1}{2}$.
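The ratios $\alpha_n$ of Example 16.4 can be computed exactly for a few sample values of x. This is a sketch for illustration only; the helper names and the sample values of x (all nonzero, so that the terms are positive) are assumptions.

```python
from fractions import Fraction

def term(n: int, x: Fraction) -> Fraction:
    """a_n = n * (4 x^2)^n."""
    return n * (4 * x * x) ** n

def ratio(n: int, x: Fraction) -> Fraction:
    """alpha_n = a_(n+1) / a_n, computed from the terms themselves (x must be nonzero)."""
    return term(n + 1, x) / term(n, x)

for x in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    # The ratios approach 4x^2, which is compared with 1 in the ratio test.
    print(x, [float(ratio(n, x)) for n in (10, 100, 1000)], float(4 * x * x))
```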

The root test: Let $\sum_{n=1}^{\infty} a_n$ be a series of positive terms. If there exist N and k ∈ (0, 1)
such that $\sqrt[n]{a_n} \le k$ for n > N, then the series converges.
If $\sqrt[n]{a_n} \ge 1$ for an infinity of terms of the series, then the series diverges.

Proof. If there exist N and k ∈ (0, 1) such that $\sqrt[n]{a_n} \le k$ for n > N, then $a_n \le k^n$ for
n > N. Consequently, the series $\sum_{n=1}^{\infty} a_n$ can be compared with the geometric series $\sum_{n=1}^{\infty} k^n$,
which converges since k < 1. This proves the first case.
If $\sqrt[n]{a_n} \ge 1$ for an infinity of terms of the series, then $a_n \ge 1$ for an infinity of terms of
the series. Hence the vanishing condition is not satisfied and the series diverges.
Application 16.2. The series $\sum_{n=1}^{\infty} \frac{1}{n^n}$ converges. Using the root test we have
$\sqrt[n]{\frac{1}{n^n}} = \frac{1}{n} \le \frac{1}{2}$ for n ≥ 2.

Alternating series test (Leibnitz): If the sequence $(b_n)$ is a monotonic decreasing
sequence and $b_n \to 0$ as $n \to \infty$, then the alternating series $\sum_{n=1}^{\infty} (-1)^{n-1} \cdot b_n$ converges.

Proof. Let $s_m = \sum_{n=1}^{m} (-1)^{n-1} \cdot b_n$.
Then $s_{2m} = b_1 - (b_2 - b_3) - \dots - (b_{2m-2} - b_{2m-1}) - b_{2m} \le b_1$ and hence the sequence $(s_{2m})$
is bounded above.
Since $s_{2m} = (b_1 - b_2) + (b_3 - b_4) + \dots + (b_{2m-1} - b_{2m})$ and $(b_n)$ is decreasing, $(s_{2m})$ is
increasing. Consequently, $(s_{2m})$ converges; let $s = \lim_{m\to\infty} s_{2m}$.
Similarly, the sequence $(s_{2m+1})$ is a decreasing sequence which is bounded below by $b_1 - b_2$
and so $(s_{2m+1})$ converges to a limit $t = \lim_{m\to\infty} s_{2m+1}$.
Now $t - s = \lim_{m\to\infty} (s_{2m+1} - s_{2m}) = \lim_{m\to\infty} b_{2m+1} = 0$.
Finally, it is shown that $\lim_{n\to\infty} s_n = s$.
If ε > 0, there are integers $N_1$ and $N_2$ such that $|s_{2m} - s| < \varepsilon$ for all $m > N_1$ and
$|s_{2m+1} - s| < \varepsilon$ for all $m > N_2$. Let N be the maximum of $2N_1$ and $2N_2 + 1$. If n > N,
then either n = 2m (and $m > N_1$) or n = 2m + 1 (and $m > N_2$). In either case
$|s_n - s| < \varepsilon$ for all n > N. In other words, the sequence $(s_n)$ converges and hence the
series $\sum_{n=1}^{\infty} (-1)^{n-1} \cdot b_n$ is convergent.

Example 16.5. The series $\sum_{n=1}^{\infty} (-1)^{n-1} \cdot \frac{1}{n}$ is convergent since $\left(\frac{1}{n}\right)$ is a decreasing sequence
with limit zero.
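The structure of the proof (even partial sums increasing, odd partial sums decreasing, both squeezing a common limit) can be observed on the series of Example 16.5. The sketch below also uses the known fact, not proved in these notes, that the sum of this series is ln 2; the function name is illustrative.

```python
import math

def partial_sums(m: int) -> list:
    """[s_1, s_2, ..., s_m] for the series sum of (-1)^(n-1) / n."""
    s, out = 0.0, []
    for n in range(1, m + 1):
        s += (-1) ** (n - 1) / n
        out.append(s)
    return out

sums = partial_sums(2000)
even = sums[1::2]  # s_2, s_4, ... (increasing)
odd = sums[0::2]   # s_1, s_3, ... (decreasing)

# The limit lies between the last even and the last odd partial sum.
print(even[-1], math.log(2), odd[-1])
```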

17 Absolute convergent series



Definition 17.1. The series $\sum_{n=1}^{\infty} a_n$ is said to be absolutely convergent if the series of
absolute values $\sum_{n=1}^{\infty} |a_n|$ converges.

A convergent series which is not absolutely convergent is called conditionally convergent.

Absolute convergence implies convergence.

If the series $\sum_{n=1}^{\infty} |a_n|$ converges, then the series $\sum_{n=1}^{\infty} a_n$ converges too.

Proof. By the triangle inequality we have

\[ |s_n - s_m| = |a_{m+1} + \dots + a_n| \le |a_{m+1}| + \dots + |a_n| \]

where n > m.
Since $\sum_{n=1}^{\infty} |a_n|$ converges, for any ε > 0 there exists an integer N such that for n > m > N
we have $|a_{m+1}| + \dots + |a_n| < \varepsilon$ and hence $|s_n - s_m| < \varepsilon$. By the Cauchy criterion we obtain
that the series $\sum_{n=1}^{\infty} a_n$ converges.

Example 17.1. The series $\sum_{n=0}^{\infty} \frac{(-1)^n}{2^n}$ converges, since it is absolutely convergent; $\sum_{n=0}^{\infty} \frac{1}{2^n} = 2$.

The alternating harmonic series $\sum_{n=1}^{\infty} \frac{(-1)^n}{n}$ is conditionally convergent. It converges (by the
Leibnitz criterion) but it is not absolutely convergent since $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges.

Comment 17.1. Absolute convergence can be tested by the convergence tests given
above for series with positive terms.
Absolute convergence is important for the following reason: the sum of an absolutely
convergent series $\sum_{n=1}^{\infty} a_n$ does not depend on the order in which the terms $a_n$ are taken.
It can be shown that for any conditionally convergent series $\sum_{n=1}^{\infty} a_n$ and any real number
S, we can have $\sum_{n=1}^{\infty} a_n = S$ by rearranging the terms of $\sum_{n=1}^{\infty} a_n$. For example, rearranging
the terms of the alternating harmonic series we can have it sum to $10^6$ or to $-10^6$ or
within ε to the number of atoms in the universe.
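The rearrangement statement can be illustrated with a greedy procedure: add unused positive terms while the running sum is not above the target S, otherwise add unused negative terms. This is a common way of constructing such rearrangements; it is not spelled out in these notes, and the function name, the targets and the number of terms below are assumptions for illustration. The series is taken in the form 1 − 1/2 + 1/3 − 1/4 + . . .

```python
def rearranged_partial_sum(target: float, n_terms: int) -> float:
    """Partial sum of a greedy rearrangement of 1 - 1/2 + 1/3 - 1/4 + ..."""
    pos = 1  # denominator of the next unused positive term +1/pos (odd numbers)
    neg = 2  # denominator of the next unused negative term -1/neg (even numbers)
    s = 0.0
    for _ in range(n_terms):
        if s <= target:
            s += 1.0 / pos
            pos += 2
        else:
            s -= 1.0 / neg
            neg += 2
    return s

for target in (1.5, -0.3, 0.0):
    # Every term is used at most once, yet the partial sums settle near the target.
    print(target, rearranged_partial_sum(target, 100_000))
```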


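Cauchy product of series: If $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are absolutely convergent series and

\[ c_n = a_1 b_n + a_2 b_{n-1} + \dots + a_n b_1 \]

then $\sum_{n=1}^{\infty} c_n$ is absolutely convergent and

\[ \sum_{n=1}^{\infty} c_n = \left( \sum_{n=1}^{\infty} a_n \right) \cdot \left( \sum_{n=1}^{\infty} b_n \right). \]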


Proof. Suppose first that $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are positive term series and consider the array:

a1 b1 a1 b2 a1 b3 ...
a2 b1 a2 b2 a2 b3 ...
a3 b1 a3 b2 a3 b3 ...
... ... ... ...

If $w_n$ is the sum of the terms in the array that lie in the n × n square with $a_1 b_1$ at one
corner, then $w_n = s_n \cdot t_n$, where $s_n$ and $t_n$ are the n-th partial sums of $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$
respectively.
Hence $\lim_{n\to\infty} w_n = \left( \sum_{n=1}^{\infty} a_n \right) \cdot \left( \sum_{n=1}^{\infty} b_n \right)$.
Now $\sum_{n=1}^{\infty} c_n$ is the sum of the terms in the array summed "by diagonals" and so if $u_n$ is
the n-th partial sum of $\sum_{n=1}^{\infty} c_n$ then:
\[ w_{[n/2]} \le u_n \le w_n. \]
By the squeeze rule we have
\[ \lim_{n\to\infty} u_n = \left( \sum_{n=1}^{\infty} a_n \right) \cdot \left( \sum_{n=1}^{\infty} b_n \right) \]
as required.

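For the general case the above argument can be applied to the series $\sum_{n=1}^{\infty} |a_n|$, $\sum_{n=1}^{\infty} |b_n|$
and $\sum_{n=1}^{\infty} |c_n|$ to deduce that the series $\sum_{n=1}^{\infty} c_n$ is absolutely convergent. Writing
$a_n = a_n^+ - a_n^-$ and $b_n = b_n^+ - b_n^-$, where $a_n^+, a_n^-, b_n^+, b_n^-$ denote the positive and negative parts,
$\sum_{n=1}^{\infty} c_n$ is a linear combination of the Cauchy products of the positive term series
$\sum_{n=1}^{\infty} a_n^+$, $\sum_{n=1}^{\infty} a_n^-$, $\sum_{n=1}^{\infty} b_n^+$ and $\sum_{n=1}^{\infty} b_n^-$, and so we have:

\[ \sum_{n=1}^{\infty} c_n = \left( \sum_{n=1}^{\infty} a_n \right) \cdot \left( \sum_{n=1}^{\infty} b_n \right). \]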
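The Cauchy product formula can be checked on two geometric series whose sums are known. The following is a sketch under stated assumptions: the choice $a_n = 1/2^n$, $b_n = 1/3^n$ (n ≥ 1), the truncation at 60 terms and the function name are all made here for illustration.

```python
def cauchy_product(a: list, b: list) -> list:
    """c_n = a_1 b_n + a_2 b_(n-1) + ... + a_n b_1, with lists indexed from 0."""
    n = min(len(a), len(b))
    return [sum(a[k] * b[m - k] for k in range(m + 1)) for m in range(n)]

N = 60
a = [0.5 ** n for n in range(1, N + 1)]        # sum is close to 1
b = [(1 / 3) ** n for n in range(1, N + 1)]    # sum is close to 1/2
c = cauchy_product(a, b)

# The sum of the c_n agrees with the product of the two sums.
print(sum(c), sum(a) * sum(b))
```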

18 Limit of a function at a point

An important concept in calculus is the limit of a function at a point. It is used in the
study of continuity, derivatives, integrals, and other important topics in calculus.
Consider a function f : A → R1 , whose domain A is a subset of R1 . The behaviour of
the function f as x approaches a fixed real value "a" shall now be investigated. It shall
be assumed that f (x) is defined for all x close to "a" but not necessarily at "a". In other
words, it is assumed that the domain of f contains a set of the form (a − r, a) ∪ (a, a + r)
for some r > 0.

Definition 18.1. L is called the limit of f (x) as x approaches "a", if for any ε > 0, there
exists a number δ > 0 (depending on ε) such that |f (x) − L| < ε for all x ∈ A with x ≠ a and
|x − a| < δ.

We denote this by
$\lim_{x\to a} f(x) = L$ or $f(x) \to L$ for $x \to a$.
Comment 18.1. Definition 18.1 does not depend on the value of f at "a" (if this exists),
as the point "a" is excluded from consideration. If the value f (a) exists, it may violate the
inequality.
Given the function f and the value L, the inequality |f (x) − L| < ε means
L − ε < f (x) < L + ε
and therefore ε can be regarded as the prescribed accuracy of approximating L, i.e. how
close to L one wants to get.
The number δ is not uniquely determined by ε. You can always ”take a smaller δ” in the
sense that if, for a given ε we have:
0 < |x − a| < δ1 ⇒ |f (x) − L| < ε
then, for any 0 < δ < δ1 we have
0 < |x − a| < δ ⇒ |f (x) − L| < ε.
Example 18.1. Let us show that
\[ \lim_{x\to 2} \frac{x^2 - 4}{x - 2} = 4. \]

Let ε > 0 and consider the inequality
\[ \left| \frac{x^2 - 4}{x - 2} - 4 \right| < \varepsilon \quad \text{or} \quad 4 - \varepsilon < \frac{x^2 - 4}{x - 2} < 4 + \varepsilon. \]
For $x \neq 2$ this is equivalent to 4 − ε < x + 2 < 4 + ε, or 2 − ε < x < 2 + ε, showing that we can
take δ = ε.
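The choice δ = ε of Example 18.1 can be tested on a few sample points around 2. This is only a finite numerical check under assumptions made here (the function names and the sample points $2 \pm \delta k/10$, k = 1, ..., 9), not a substitute for the argument above.

```python
def f(x: float) -> float:
    """The function (x^2 - 4)/(x - 2), undefined at x = 2."""
    return (x * x - 4.0) / (x - 2.0)

def check(eps: float) -> bool:
    """Test |f(x) - 4| < eps on sample points with 0 < |x - 2| < delta = eps."""
    delta = eps
    samples = [2.0 + sign * delta * k / 10.0
               for k in range(1, 10) for sign in (-1.0, 1.0)]
    return all(abs(f(x) - 4.0) < eps for x in samples)

for eps in (0.5, 0.1, 1e-3):
    print(eps, check(eps))
```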

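Example 18.2. Let us show that at any point a > 0, the function $f(x) = \sqrt{x}$ has the
limit
\[ \lim_{x\to a} \sqrt{x} = \sqrt{a}. \]
Indeed, if $0 < \varepsilon < \sqrt{a}$, then the inequality
\[ |\sqrt{x} - \sqrt{a}| < \varepsilon \quad \text{or} \quad \sqrt{a} - \varepsilon < \sqrt{x} < \sqrt{a} + \varepsilon \]
becomes, by squaring,
\[ a - 2\sqrt{a} \cdot \varepsilon + \varepsilon^2 < x < a + 2\sqrt{a} \cdot \varepsilon + \varepsilon^2. \]
For the given "a" and ε we can take $\delta = 2\sqrt{a} \cdot \varepsilon - \varepsilon^2$, the distance from a to the nearer
endpoint of this interval. For $\varepsilon \ge \sqrt{a}$, a δ found for a smaller ε still works.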
Example 18.3. The limit $\lim_{x \to 0} \sin\frac{1}{x}$ does not exist.
Let f(x) = sin(1/x). Recall that arbitrarily close to a = 0 there exist points x with f(x) = 1
as well as points x with f(x) = −1.
Therefore for any L there is ε > 0 such that for any δ > 0 there exists x such that
$$x \neq 0, \quad |x - 0| < \delta \quad \text{and} \quad \left| \sin\frac{1}{x} - L \right| > \varepsilon.$$
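
The two families of points can be written down explicitly. The sketch below uses the sequences 1/(π/2 + 2πn) and 1/(3π/2 + 2πn), chosen here for illustration; both tend to 0, yet sin(1/x) equals 1 along the first and −1 along the second.

```python
import math

def f(x):
    return math.sin(1.0 / x)

# Two sequences tending to 0 from the right.
xs_plus = [1.0 / (math.pi / 2 + 2 * math.pi * n) for n in range(1, 6)]
xs_minus = [1.0 / (3 * math.pi / 2 + 2 * math.pi * n) for n in range(1, 6)]

vals_plus = [f(x) for x in xs_plus]      # values close to +1 (up to rounding)
vals_minus = [f(x) for x in xs_minus]    # values close to -1 (up to rounding)
```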

The limit of f as x approaches ”a” is unique.

Indeed: assume that $\lim_{x \to a} f(x) = L_1$ and $\lim_{x \to a} f(x) = L_2$ with $L_1 \neq L_2$. For $\varepsilon = \frac{|L_1 - L_2|}{2}$
there exists δ1 > 0 such that |f(x) − L1| < ε for 0 < |x − a| < δ1, and there exists δ2 > 0 such
that |f(x) − L2| < ε for 0 < |x − a| < δ2. Hence, for 0 < |x − a| < min{δ1, δ2} we have
$$|L_1 - L_2| \le |L_1 - f(x)| + |f(x) - L_2| < 2\varepsilon = |L_1 - L_2|,$$
which is absurd.

Heine’s criterion for the limit: The function $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ has a limit as x
approaches ”a” if and only if for any sequence $(x_n)$ with $x_n \in A$, $x_n \neq a$ and $x_n \to a$ as
n → ∞, the sequence $(f(x_n))$ converges.

Proof. Assume that L is the limit of f(x) as x approaches ”a” and consider a sequence
$(x_n)$ with $x_n \in A$, $x_n \neq a$ and $x_n \to a$ as n → ∞. For ε > 0 there exists δ > 0 such that
0 < |x − a| < δ ⇒ |f(x) − L| < ε. For this δ there exists N such that $|x_n - a| < \delta$ for
n > N. Hence $|f(x_n) - L| < \varepsilon$ for n > N. Therefore, the sequence $(f(x_n))$ converges to L.
Assume now that for any sequence $(x_n)$ with $x_n \in A$, $x_n \neq a$ and $x_n \to a$ as n → ∞ the
sequence $(f(x_n))$ converges. Firstly, we show that $\lim_{n \to \infty} f(x_n)$ does not depend on the sequence $(x_n)$.
Assume the contrary, i.e. there exist sequences $(x'_n)$, $(x''_n)$ with $x'_n, x''_n \in A$, $x'_n \neq a$, $x''_n \neq a$ and
$\lim_{n \to \infty} x'_n = \lim_{n \to \infty} x''_n = a$ for which $\lim_{n \to \infty} f(x'_n) = L' \neq L'' = \lim_{n \to \infty} f(x''_n)$. Consider the
sequence $(x_n)$ defined as
$$x_n = \begin{cases} x'_k & \text{for } n = 2k \\ x''_{k+1} & \text{for } n = 2k + 1 \end{cases}$$
and remark that $x_n \in A$, $x_n \neq a$ and $\lim_{n \to \infty} x_n = a$. Hence the sequence $(f(x_n))$ converges.
Let $L = \lim_{n \to \infty} f(x_n)$. Since $(f(x'_n))$ and $(f(x''_n))$ are subsequences of $(f(x_n))$, their limits L′ and L′′
must both be equal to L. It follows that L′ = L′′, which is absurd. Denote now by L the
common value of $\lim_{n \to \infty} f(x_n)$ and let us show that $\lim_{x \to a} f(x) = L$. Assume the contrary.
It follows that there exists ε0 > 0 such that for any n ∈ N there exists $x_n \in A$, $x_n \neq a$, such
that $|x_n - a| < \frac{1}{n}$ and $|f(x_n) - L| \ge \varepsilon_0$. Hence the sequence $(f(x_n))$ does not converge to
L even though $x_n \in A$, $x_n \neq a$ and $x_n \to a$ as n → ∞. That is absurd.

Cauchy-Bolzano’s criterion for the limit: The function $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ has
a limit as x approaches ”a” if and only if for any ε > 0 there exists δ > 0 such that
0 < |x′ − a| < δ and 0 < |x′′ − a| < δ ⇒ |f(x′) − f(x′′)| < ε.

Proof. Assume that $L = \lim_{x \to a} f(x)$ and consider ε > 0. There is δ > 0 such that
$$0 < |x - a| < \delta \Rightarrow |f(x) - L| < \frac{\varepsilon}{2}.$$
Hence, 0 < |x′ − a| < δ and 0 < |x′′ − a| < δ ⇒ |f(x′) − f(x′′)| ≤ |f(x′) − L| + |f(x′′) − L| < ε.
Assume now that for any ε > 0 there exists δ > 0 such that 0 < |x′ − a| < δ and
0 < |x′′ − a| < δ ⇒ |f(x′) − f(x′′)| < ε, and consider a sequence $(x_n)$ with $x_n \in A$, $x_n \neq a$ and
$x_n \to a$ as n → ∞. For this δ there is N such that $0 < |x_n - a| < \delta$ for n > N.
Hence, for n, m > N we have $|f(x_n) - f(x_m)| < \varepsilon$. This means that $(f(x_n))$ is a Cauchy
sequence, and therefore it converges.
Applying Heine’s criterion, the function f has a limit as x approaches ”a”.

19 Rules for the limit of a function


a) If k is a constant, then $\lim_{x \to a} k = k$.

b) If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$, then $\lim_{x \to a} (f(x) \pm g(x)) = L \pm M$.

c) If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$, then $\lim_{x \to a} f(x) \cdot g(x) = L \cdot M$.

d) If $\lim_{x \to a} f(x) = L$, $g(x) \neq 0$ and $\lim_{x \to a} g(x) = M \neq 0$, then $\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{L}{M}$.

Proof. We prove part b) (the other parts are proved similarly; the proof is rather technical
and can be skipped at first reading). For any ε > 0, there are positive δ1 and δ2 such that
$$0 < |x - a| < \delta_1 \Rightarrow |f(x) - L| < \frac{\varepsilon}{2},$$
$$0 < |x - a| < \delta_2 \Rightarrow |g(x) - M| < \frac{\varepsilon}{2}.$$
For 0 < |x − a| < min{δ1, δ2} we have

|f(x) ± g(x) − (L ± M)| ≤ |f(x) − L| + |g(x) − M| < ε.

Pinching rule: Suppose that the inequality f(x) ≤ g(x) ≤ h(x) holds for all x in some
interval around ”a”, except perhaps at x = a. If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} h(x) = L$, then also
$\lim_{x \to a} g(x) = L$.

Proof. Since f(x) ≤ g(x) ≤ h(x), we have f(x) − L ≤ g(x) − L ≤ h(x) − L. Hence
|g(x) − L| ≤ max{|f(x) − L|, |h(x) − L|}. For ε > 0 there exist δ1 > 0 and δ2 > 0 such
that
0 < |x − a| < δ1 ⇒ |f(x) − L| < ε,
0 < |x − a| < δ2 ⇒ |h(x) − L| < ε.
Hence, for 0 < |x − a| < min{δ1, δ2} we have

|g(x) − L| ≤ max{|f(x) − L|, |h(x) − L|} < ε.

Example 19.1. Show that $\lim_{\theta \to 0} \sin\theta = 0$ and $\lim_{\theta \to 0} \cos\theta = 1$.

Let θ be measured in radians and consider the angle θ as a central angle AOB in a circle with
centre O and a radius of 1.
The area of the circular sector OAB is $\frac{\theta}{2}$.
The area of the triangle OAB is $\frac{\sin\theta}{2}$.
Since the triangle is contained in the sector, and since both quantities only change sign with θ, we get
$$0 \le \frac{|\sin\theta|}{2} \le \frac{|\theta|}{2}.$$
This ”pinches” |sin θ| between 0 and |θ|, both of which approach 0 as θ → 0, proving the first
limit.
From the Pythagorean relation sin²θ + cos²θ = 1 we get moreover
$$\lim_{\theta \to 0} \cos^2\theta = \lim_{\theta \to 0} (1 - \sin^2\theta) = 1.$$
But $\lim_{\theta \to 0} \cos^2\theta = (\lim_{\theta \to 0} \cos\theta)^2$ and therefore $\lim_{\theta \to 0} \cos\theta = +1$ or −1. The negative sign is
eliminated, since cos θ is positive near θ = 0.
Example 19.2. We use the pinching rule to prove
$$\lim_{x \to 0} x \cdot \sin\frac{1}{x} = 0.$$
The function $x \cdot \sin\frac{1}{x}$ is bounded below by −|x| and above by |x|, so that
$$-|x| \le x \cdot \sin\frac{1}{x} \le |x|.$$
As x → 0, we also have |x| → 0, and therefore
$$0 \le \lim_{x \to 0} x \cdot \sin\frac{1}{x} \le 0,$$
proving the claimed limit.
We can similarly show that $\lim_{x \to 0} x^2 \cdot \sin\frac{1}{x} = 0$.
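
The pinching bounds can be observed numerically. The sketch below (sample points ±10⁻ᵏ chosen for illustration) checks −|x| ≤ x·sin(1/x) ≤ |x| and records how small the values become.

```python
import math

def g(x):
    return x * math.sin(1.0 / x)

# Sample points approaching 0 from both sides.
points = [s * 10.0 ** (-k) for k in range(1, 9) for s in (1.0, -1.0)]

squeezed = all(-abs(x) <= g(x) <= abs(x) for x in points)
sizes = [abs(g(x)) for x in points]    # each entry is at most |x|
```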
Example 19.3. We can show that $\lim_{x \to 0} e^x = 1$ by pinching the exponential function $e^x$ near
0 between the functions 1 + x and 1 + x + x², i.e.
$$1 + x < e^x < 1 + x + x^2 \quad \text{for } x \in (-\infty, 1),\ x \neq 0.$$
Using the above inequalities we can similarly calculate
$$\lim_{x \to 0} \frac{e^x - 1}{x} = 1.$$
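
A small numerical illustration of these bounds, under the assumption that double-precision arithmetic is accurate enough at the sampled points ±10⁻ᵏ, k = 1, …, 6. The quotient uses `math.expm1`, which computes eˣ − 1 without cancellation.

```python
import math

xs = [s * 10.0 ** (-k) for k in range(1, 7) for s in (1.0, -1.0)]

# The pinching inequalities at the sampled points (x != 0, x < 1).
bounds_hold = all(1 + x < math.exp(x) < 1 + x + x * x for x in xs)

# Difference quotients (e^x - 1)/x, which should approach 1.
quotients = [math.expm1(x) / x for x in xs]
```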

The substitution rule: Assume that $\lim_{x \to a} f(x) = L$, that f(x) ≠ L for all x ≠ a close to ”a”,
and that $\lim_{y \to L} g(y) = M$. Then
$\lim_{x \to a} g(f(x)) = M$.
(The condition f(x) ≠ L can be dropped when g is defined at L and g(L) = M.)

Proof. Let ε > 0 be given. Since g(y) → M as y → L there exists δ1 > 0 such that
0 < |y − L| < δ1 ⇒ |g(y) − M| < ε. Also, since $\lim_{x \to a} f(x) = L$ and f(x) ≠ L near ”a”, there exists δ2 > 0 such that
0 < |x − a| < δ2 ⇒ 0 < |f(x) − L| < δ1.
Therefore, with y = f(x), 0 < |x − a| < δ2 ⇒ 0 < |y − L| < δ1 ⇒ |g(y) − M| = |g(f(x)) − M| < ε,
proving that $\lim_{x \to a} g(f(x)) = M$.

20 One sided limits

The limit $\lim_{x \to a} f(x) = L$ in Definition 18.1 is a two-sided limit, since the variable x
approaches the point ”a” from both sides. We now analyze one-sided limits, where the
variable x approaches the point ”a” on one side. This is necessary if the function is defined
only on one side of the point in question, or if approaching the point from different sides
gives different limits.
We use the following terminology:
”x approaches ”a” from the right”, also ”x approaches ”a” from above”, denoted by
x → a+ or x ↘ a, means a < x < a + δ for δ > 0 sufficiently small.
”x approaches ”a” from the left”, also ”x approaches ”a” from below”, denoted by
x → a− or x ↗ a, means a − δ < x < a for δ > 0 sufficiently small.
Definition 20.1. (one-sided limits)

a) L is called the right limit of f at ”a”, or the limit of f(x) as x approaches ”a” from
the right (or from above), denoted by
$$\lim_{x \to a^+} f(x) = L \quad \text{or} \quad \lim_{x \searrow a} f(x) = L,$$
if for any ε > 0 there exists a number δ > 0 such that

a < x < a + δ ⇒ |f(x) − L| < ε;

b) L is called the left limit of f at ”a”, or the limit of f(x) as x approaches ”a” from
the left (or from below), denoted by
$$\lim_{x \to a^-} f(x) = L \quad \text{or} \quad \lim_{x \nearrow a} f(x) = L,$$
if for any ε > 0 there exists a number δ > 0 such that

a − δ < x < a ⇒ |f(x) − L| < ε.
Remark 20.1. If the left limit and the right limit exist and are equal,
$$\lim_{x \to a^-} f(x) = \lim_{x \to a^+} f(x) = L,$$
then the limit of f at ”a” exists and equals the same value L:
$$\lim_{x \to a} f(x) = L.$$

Example 20.1. The function f(x) = √x is defined only for x ≥ 0. As x approaches 0
from the right, the value of √x tends to 0: $\lim_{x \searrow 0} \sqrt{x} = 0$.

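Example 20.2. The function
$$\operatorname{sign}(x) = \frac{x}{|x|} = \begin{cases} 1 & \text{if } x > 0 \\ -1 & \text{if } x < 0 \end{cases}$$
does not have a limit at a = 0; however, the two one-sided limits exist:
$$\lim_{x \to 0^+} \operatorname{sign}(x) = 1 \quad \text{and} \quad \lim_{x \to 0^-} \operatorname{sign}(x) = -1.$$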

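Example 20.3. Step and staircase function. The step function is defined as
$$\operatorname{step}(x) = \begin{cases} 0 & \text{if } x < 0 \\ \frac{1}{2} & \text{if } x = 0 \\ 1 & \text{if } x > 0 \end{cases}$$
which, for x ≠ 0, can be expressed as
$$\operatorname{step}(x) = \frac{1}{2}(1 + \operatorname{sign}(x)).$$
The step function has one-sided limits at 0:
$$\lim_{x \to 0^+} \operatorname{step}(x) = 1 \quad \text{and} \quad \lim_{x \to 0^-} \operatorname{step}(x) = 0.$$
The translated step function step(x − a) has its step at the point ”a”, where it has the
two one-sided limits
$$\lim_{x \to a^+} \operatorname{step}(x - a) = 1 \quad \text{and} \quad \lim_{x \to a^-} \operatorname{step}(x - a) = 0.$$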

A staircase function is a function with several steps, for example,
$$S_m(x) = \sum_{n=0}^{m} \operatorname{step}(x - n).$$

At each step point, the staircase function has a left limit and a right limit which are
different, and also not equal to the value of the function $S_m$ at that point. At all other
points the left limits and the right limits coincide, and therefore the two-sided limits exist
at x ≠ n.
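
The one-sided behaviour of the step and staircase functions can be tabulated directly. In this sketch the helper `one_sided` is a hypothetical name; it only evaluates the function at the points a ± 10⁻ʲ, which suggests the one-sided limits without proving them.

```python
def step(x):
    if x < 0:
        return 0.0
    if x > 0:
        return 1.0
    return 0.5

def staircase(x, m):
    """S_m(x) = step(x - 0) + step(x - 1) + ... + step(x - m)."""
    return sum(step(x - n) for n in range(m + 1))

def one_sided(f, a, side, k=8):
    """Values of f at a + 10**-j (side='right') or a - 10**-j (side='left')."""
    sign = 1.0 if side == "right" else -1.0
    return [f(a + sign * 10.0 ** (-j)) for j in range(1, k + 1)]

left_step = one_sided(step, 0.0, "left")
right_step = one_sided(step, 0.0, "right")

S3 = lambda x: staircase(x, 3)
left_S3 = one_sided(S3, 2.0, "left")
right_S3 = one_sided(S3, 2.0, "right")
value_S3 = S3(2.0)
```

At the step point x = 2 of S₃ the sampled left values, the sampled right values and the value of the function are three different numbers.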
Definition 20.2. A function f : A ⊂ R1 → R1 is increasing if x1 , x2 ∈ A, x1 < x2 ⇒
f (x1 ) ≤ f (x2 ).
Definition 20.3. A function f : A ⊂ R1 → R1 is decreasing if x1 , x2 ∈ A, x1 < x2 ⇒
f (x1 ) ≥ f (x2 ).
Definition 20.4. A function f : A ⊂ R1 → R1 is monotone if it is increasing or
decreasing.

Monotone limits exist: If a function f : (a, b) → R is monotone, then the one-sided
limits $\lim_{x \nearrow x_0} f(x)$ and $\lim_{x \searrow x_0} f(x)$ exist for any x0 ∈ (a, b).

Proof. Consider x0 ∈ (a, b) and the set
$$S_{x_0} = \{ f(x) \mid x \in (a, b),\ x < x_0 \}.$$
If f is increasing, then the set $S_{x_0}$ is bounded above by f(x0), and if f is decreasing, then
the set $S_{x_0}$ is bounded below by f(x0).
If f is increasing, then the least upper bound of $S_{x_0}$, i.e. $\sup S_{x_0}$, is the left limit of f at
x0, and if f is decreasing, the greatest lower bound of $S_{x_0}$, i.e. $\inf S_{x_0}$, is the left limit of f
at x0.
In this way it was shown that for an increasing function f the left limit at x0 exists and
$$\lim_{x \to x_0^-} f(x) = \sup S_{x_0}.$$
For a decreasing function f the left limit at x0 exists and
$$\lim_{x \to x_0^-} f(x) = \inf S_{x_0}.$$
Considering the set $R_{x_0} = \{ f(x) \mid x \in (a, b),\ x > x_0 \}$ it can be proven, in a similar way, that if f
increases, then
$$\lim_{x \to x_0^+} f(x) = \inf R_{x_0}$$
and if f decreases, then
$$\lim_{x \to x_0^+} f(x) = \sup R_{x_0}.$$

21 Infinite limits

”Infinity” (±∞) is a mathematical symbol and not a number which is subject to arithmetic
operations.

Definition 21.1. (infinite limits)

The function f has the right limit +∞ at ”a”, denoted by $\lim_{x \to a^+} f(x) = +\infty$, if for any
M > 0 there is a δ > 0 such that f(x) > M whenever a < x < a + δ.
The function f has the right limit −∞ at ”a”, denoted by $\lim_{x \to a^+} f(x) = -\infty$, if for any
M > 0 there is a δ > 0 such that f(x) < −M whenever a < x < a + δ.

The left limits
$$\lim_{x \to a^-} f(x) = +\infty, \quad \lim_{x \to a^-} f(x) = -\infty$$
and the two-sided limits
$$\lim_{x \to a} f(x) = +\infty, \quad \lim_{x \to a} f(x) = -\infty$$
are defined analogously.

Example 21.1. Show that

a) $\lim_{x \to 0} \frac{1}{x^2} = +\infty$.

b) $\lim_{x \to 0^-} \frac{1}{x} = -\infty$ and $\lim_{x \to 0^+} \frac{1}{x} = +\infty$.

Example 21.2. A function can have, at a point, a finite one-sided limit as the point is
approached from one side, and an infinite one-sided limit as the point is approached from
the other side.
For example, the function
$$h(x) = \begin{cases} 0 & \text{if } x \le 0 \\ \frac{1}{x} & \text{if } x > 0 \end{cases}$$
has the one-sided limits
$$\lim_{x \to 0^-} h(x) = 0, \quad \lim_{x \to 0^+} h(x) = +\infty.$$

Limits as x → +∞ or as x → −∞ are called limits at infinity, not to be confused with
the infinite limits!

Definition 21.2. (limits at infinity) The number L is the limit of f(x) as x approaches
+∞, denoted by
$$\lim_{x \to +\infty} f(x) = L,$$
if for any ε > 0 there exists a number M > 0 such that

x > M ⇒ |f(x) − L| < ε.

The limit at −∞, $\lim_{x \to -\infty} f(x) = L$, is defined analogously.

Example 21.3. The function $f(x) = \frac{1 - x^2}{1 + x + x^2}$ has the following limits at infinity:
$$\lim_{x \to -\infty} f(x) = -1 \quad \text{and} \quad \lim_{x \to +\infty} f(x) = -1.$$

22 Limit points of a function at a point

Definition 22.1. (limit point at ”a”)

The number L is a limit point of f(x) at ”a” if there exists a sequence $(x_n)$ such that
$x_n \in A$, $x_n \neq a$, $\lim_{n \to +\infty} x_n = a$ and $\lim_{n \to +\infty} f(x_n) = L$.
Usually, the set of limit points of f(x) at ”a” is denoted by $L_a(f)$.

• $\inf L_a(f)$ is called the inferior limit of f at ”a” and it is denoted by $\underline{\lim}_{x \to a} f(x)$:
$$\underline{\lim}_{x \to a} f(x) \overset{\text{def}}{=} \inf L_a(f).$$

• $\sup L_a(f)$ is called the superior limit of f at ”a” and it is denoted by $\overline{\lim}_{x \to a} f(x)$:
$$\overline{\lim}_{x \to a} f(x) \overset{\text{def}}{=} \sup L_a(f).$$

The following statement holds: The number L is the limit of f(x) as x approaches
”a” if and only if
$$\underline{\lim}_{x \to a} f(x) = \overline{\lim}_{x \to a} f(x) = L.$$

Proof. Assume first that $\lim_{x \to a} f(x) = L$ and consider a sequence $(x_n)$ such that $x_n \in A$,
$x_n \neq a$ and $\lim_{n \to \infty} x_n = a$. For ε > 0 there exists δ > 0 such that

0 < |x − a| < δ ⇒ |f(x) − L| < ε.

Since $\lim_{n \to \infty} x_n = a$ there is N such that $|x_n - a| < \delta$ for n > N. Hence $|f(x_n) - L| < \varepsilon$ for
n > N.
Therefore, $\lim_{n \to \infty} f(x_n) = L$. We obtain in this way that $L_a(f) = \{L\}$ and consequently
$$\underline{\lim}_{x \to a} f(x) = \overline{\lim}_{x \to a} f(x) = L.$$

Assume now that $\underline{\lim}_{x \to a} f(x) = \overline{\lim}_{x \to a} f(x) = L$ and suppose that L is not the limit of f(x)
as x approaches ”a”. Then there exists ε0 > 0 such that for every n ∈ N there exists $x_n \in A$, $x_n \neq a$,
such that $|x_n - a| < \frac{1}{n}$ and $|f(x_n) - L| \ge \varepsilon_0$. On the other hand, the sequence $(f(x_n))$ is
bounded and has a subsequence $(f(x_{n_k}))$ which has a limit. It is clear that $\lim_{k \to \infty} f(x_{n_k})$
is different from L.
Hence $L_a(f)$ contains an element different from L, which contradicts $\inf L_a(f) = \sup L_a(f) = L$.
Example 22.1. If $f(x) = \sin\frac{1}{x}$ for $x \in \mathbb{R}^1 \setminus \{0\}$, then $L_0(f) = [-1, 1]$.
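
To see why every value in [−1, 1] is a limit point, one can build a suitable sequence explicitly. In this sketch (the function name `sequence_hitting` is hypothetical) the points xₙ = 1/(arcsin L + 2πn) tend to 0 and satisfy sin(1/xₙ) = L up to rounding.

```python
import math

def sequence_hitting(L, terms=5):
    """Points x_n -> 0 with sin(1/x_n) = L, for a target L in [-1, 1]."""
    base = math.asin(L)
    return [1.0 / (base + 2 * math.pi * n) for n in range(1, terms + 1)]

targets = [-1.0, -0.3, 0.0, 0.5, 1.0]
# Largest deviation |sin(1/x_n) - L| along each sequence.
errors = {L: max(abs(math.sin(1.0 / x) - L) for x in sequence_hitting(L))
          for L in targets}
```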

23 Continuity

Naively, a function f : A ⊂ R1 → R1 is continuous if its graph is a continuous curve. In
particular, if the domain of f contains a neighborhood of a fixed real number ”a”, then
the graph of f can be drawn through the point (a, f(a)) without removing the pen from
the paper. The desired behavior at (a, f(a)) can be arranged by insisting that, for all
values of x sufficiently close to ”a”, f(x) is close to f(a).
If f : A ⊂ R1 → R1 is a function whose domain contains a neighborhood of ”a”, then the
following definition holds.

Definition 23.1. The function f is continuous at ”a” if $\lim_{x \to a} f(x) = f(a)$.

Note that this definition demands three things:

- firstly that $\lim_{x \to a} f(x)$ exists,

- secondly that f(a) is defined,

- finally that the previous two values are equal.

The ε, δ definition of $\lim_{x \to a} f(x) = L$ can be easily adapted to give the following ε, δ definition
of continuity at ”a”.
Definition 23.2. The function f is continuous at ”a” if and only if for every ε > 0 there
exists δ > 0 such that:
|x − a| < δ ⇒ |f(x) − f(a)| < ε.
Example 23.1. Use the ε, δ definition of continuity to prove that f(x) = x² is continuous
at a = 0.

Solution: For any ε > 0, determine those x for which |f(x) − f(0)| < ε. Now
|f(x) − f(0)| = |x² − 0| = |x²| < ε provided |x| < √ε. So let δ = √ε. If |x − 0| < δ, then
|f(x) − f(0)| < ε. In other words, $\lim_{x \to 0} f(x) = 0$ and, since f(0) = 0, $\lim_{x \to 0} f(x) = f(0)$.
Hence, f is continuous at 0.
If a function f is continuous for all x in the range a < x < b, then it can be said that f
is continuous on the interval (a, b).
If f is continuous for all x in its domain, it can be simply said that f is continuous.
Definition 23.3. If $\lim_{x \to a^+} f(x)$ exists and equals f(a), then f is called right-continuous at
a.
Definition 23.4. If $\lim_{x \to a^-} f(x)$ exists and equals f(a), then f is called left-continuous at
a.

The ε, δ formulations of the last two definitions are not difficult to write down. Moreover,
the following result holds.
A function f is continuous at ”a” if and only if it is both left-continuous and right-continuous
at ”a”.
Note: If a function f is only defined on the closed interval [a, b] and it is claimed that ”f
is continuous on [a, b]”, what is meant is that f is continuous on (a, b), right-continuous
at ”a” and left-continuous at ”b”.

Heine’s criterion for continuity: The function f : A ⊂ R1 → R1 is continuous at
a ∈ A if and only if for any sequence $(x_n)$ with $x_n \in A$ and $x_n \to a$ as n → ∞, the sequence $(f(x_n))$
converges to f(a).

Proof. The result is obtained from Heine’s criterion for the limit.

Cauchy-Bolzano’s criterion for continuity: The function f : A ⊂ R1 → R1 is
continuous at a ∈ A if and only if for any ε > 0 there exists δ > 0 such that |x′ − a| < δ
and |x′′ − a| < δ ⇒ |f(x′) − f(x′′)| < ε.

Proof. The result is obtained from Cauchy-Bolzano’s criterion for the limit.

24 Rules for continuity

Sum rule: If f and g are continuous at ”a”, then f + g is continuous at ”a”.

Product rule: If f and g are continuous at ”a”, then f · g is continuous at ”a”.

Reciprocal rule: If f is continuous at ”a” and f(a) ≠ 0, then $\frac{1}{f}$ is continuous at ”a”.

Squeeze rule: Let f, g and h be such that

h(x) ≤ f(x) ≤ g(x)
for all x in some neighborhood of ”a” and such that h(a) = f(a) = g(a). If h and g are
continuous at ”a”, then so is f.
The proofs of the above rules are left as exercises.

Composite rule: Let f and g be continuous at ”a” and f (a), respectively. Then g ◦ f
is continuous at ”a”.

Proof. Let f (a) = b. Since g is continuous at b for every ε > 0 there exists a δ1 > 0 such
that
|t − b| < δ1 ⇒ |g(t) − g(b)| < ε.
Since f is continuous at ”a” for δ1 > 0, there exists a δ2 > 0 such that
|x − a| < δ2 ⇒ |f (x) − f (a)| < δ1 .
Now we deduce that
|x − a| < δ2 ⇒ |g(f (x)) − g(f (a))| < ε.
Hence, for any ε > 0, there exists a δ = δ2 > 0 such that
|x − a| < δ ⇒ |(g ◦ f )(x) − (g ◦ f )(a)| < ε.

Example 24.1. Given that the identity function x ↦ x, the constant function x ↦ k and
the trigonometric functions sine and cosine are all continuous, the following are proved
to be continuous functions:

a) $x \mapsto \frac{x^2 + 2x + 3}{x^2 + x + 1}$,

b) $x \mapsto x^3 \cdot \cos x^2$,

c) $x \mapsto \begin{cases} x \cdot \sin\frac{1}{x} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases}$

The extension by continuity: If f : A ⊂ R1 → R1 is not defined at the point ”a” but
$L = \lim_{x \to a} f(x)$ exists, then the function g : A ∪ {a} ⊂ R1 → R1, defined by g(x) = f(x) for
x ≠ a and g(a) = L, is continuous at ”a”.
The function g is called the extension by continuity of the function f.

Example 24.2. The function g : R1 → R1 defined by
$$g(x) = \begin{cases} \frac{\sin x}{x} & \text{for } x \neq 0 \\ 1 & \text{for } x = 0 \end{cases}$$
is the extension by continuity of the function f : R1 \ {0} → R1, defined by
$$f(x) = \frac{\sin x}{x}.$$

25 Properties of continuous functions

The results in this section show that the definition of continuity leads naturally to the
intuitive geometric interpretation which is used when sketching the graphs of continuous
functions.

The boundedness property: Let f be continuous on the interval [a, b]. Then

(1) f is bounded on [a, b];

(2) f attains its bounds somewhere on [a, b].

Comment 25.1. What is being said is that, for (1) there exist numbers m and M such
that
m ≤ f (x) ≤ M for all x ∈ [a, b].
For (2) if m and M are chosen to be the infimum and supremum, respectively, of the set
{f (x) | a ≤ x ≤ b}, then (2) claims that there are numbers c and d in [a, b] such that
m = f (c) and M = f (d).

Proof of boundedness property.

(1) Let B = {x | x ∈ [a, b] and f is bounded on [a, x]}. Clearly a ∈ B and B is bounded
above by b. By the completeness property of R1, B possesses a least upper bound. Let
c = sup B. Since f is right-continuous at ”a”, for ε = 1 there exists δ > 0 such that

a < x < a + δ ⇒ |f(x) − f(a)| < 1 ⇒ |f(x)| < 1 + |f(a)|.

Hence, f is bounded on $[a, a + \frac{\delta}{2}]$ and so $c \ge a + \frac{\delta}{2} > a$. We want to show that c = b.
Suppose that c < b. Since c > a, f is continuous at c. Then, for ε = 1 there exists a δ′ > 0
(small enough that (c − δ′, c + δ′) ⊂ [a, b]) such that
|x − c| < δ′ ⇒ |f(x)| ≤ 1 + |f(c)|.
In other words, f is bounded on (c − δ′, c + δ′). Since c = sup B, f is also bounded on [a, x]
for some x ∈ B with x > c − δ′. Therefore f is bounded on $[a, c + \frac{\delta'}{2}]$, i.e. $c + \frac{\delta'}{2} \in B$, and this contradicts
c being the supremum of B. Thus c = b. The same argument, using the left-continuity of f at b,
shows that b ∈ B, and so f is bounded on [a, b], as required.
(2) Since f is bounded on [a, b],

A = f([a, b]) = {f(x) | a ≤ x ≤ b}

is a set which is bounded both above and below. Let m = inf A and M = sup A. Suppose
that there is no x ∈ [a, b] such that f(x) = M and define $g(x) = \frac{1}{M - f(x)}$ for x ∈ [a, b].
Now g is continuous on [a, b] by the sum and reciprocal rules. By (1) g is bounded on
[a, b].
Let such a bound be K > 0. Then:
$$g(x) \le K \Rightarrow \frac{1}{M - f(x)} \le K \Rightarrow \frac{1}{K} \le M - f(x) \Rightarrow f(x) \le M - \frac{1}{K}.$$

This contradicts the fact that M is the least upper bound for f on [a, b], so the assumption
that f never takes the value M is false. Hence f attains its upper bound on [a, b]. A similar
argument shows that f attains its lower bound somewhere on [a, b].

The intermediate value property: Let f be continuous on [a, b] and suppose that
f (a) = α and f (b) = β. For every real number γ between α and β there exists a number
c, a < c < b with f (c) = γ.

Comment 25.2. Here it is being said that, if f takes the values α and β somewhere on
the interval [a, b], then f must take all possible values between α and β.

Proof of the intermediate value property. Suppose α < γ < β and let

S = {x | x ∈ [a, b] and f(x) < γ}.

The set S is non-empty, since it contains ”a”. Let c = sup S. It is clear that a < c < b.
If f(c) < γ, then, for ε = γ − f(c) > 0, there exists a δ > 0 such that

|x − c| < δ ⇒ |f(x) − f(c)| < ε.

In particular
$$\left| f\left(c + \frac{\delta}{2}\right) - f(c) \right| < \varepsilon$$
and so
$$f\left(c + \frac{\delta}{2}\right) - f(c) < \gamma - f(c).$$
But then $f\left(c + \frac{\delta}{2}\right) < \gamma$ and hence $c + \frac{\delta}{2} \in S$, which contradicts the fact that c is the
supremum of S. Hence, f(c) ≥ γ.
If f(c) > γ then, for ε = f(c) − γ > 0, there exists a δ > 0 such that

|x − c| < δ ⇒ |f(x) − f(c)| < ε.

Hence c − δ < x ≤ c ⇒ f(x) > γ and so x does not lie in S. In other words, sup S ≤ c − δ,
which in turn contradicts the definition of c.
It follows that f(c) = γ. Hence, γ is a value of f.
The intermediate value property has many applications and the following example
illustrates one of these.

Example 25.1. Any polynomial of odd degree has at least one real root.

Solution: Let $P(X) = a_0 + a_1 X + \dots + a_n X^n$ where n is odd and, without loss of
generality, let $a_n = 1$. We know that P is continuous. Define
$$r(x) = \frac{P(x)}{x^n} - 1, \quad x \neq 0.$$
Now
$$|r(x)| = \left| \frac{P(x)}{x^n} - 1 \right| = \left| \frac{a_{n-1}}{x} + \dots + \frac{a_1}{x^{n-1}} + \frac{a_0}{x^n} \right| \le \left| \frac{a_{n-1}}{x} \right| + \dots + \left| \frac{a_1}{x^{n-1}} \right| + \left| \frac{a_0}{x^n} \right|$$
and if M is the maximum of $|a_{n-1}|, \dots, |a_0|$ then (for M > 0; the case M = 0 is trivial, since then $P(x) = x^n$)
$$|r(x)| \le M \cdot \sum_{r=1}^{n} \frac{1}{|x|^r} < M \cdot \sum_{r=1}^{\infty} \frac{1}{|x|^r} = \frac{M \cdot \frac{1}{|x|}}{1 - \frac{1}{|x|}} = \frac{M}{|x| - 1}, \quad \text{for } |x| > 1.$$

Hence, |r(x)| < 1 for |x| ≥ 1 + M. In particular 1 + r(x) > 0 for |x| ≥ 1 + M. Hence,
$P(x) = x^n (1 + r(x))$ has the same sign as $x^n$ for |x| ≥ 1 + M. Since n is odd, there exist
α, β ∈ R1 with P(α) > 0 (choose α > 1 + M) and P(β) < 0 (choose β < −(1 + M)).
By the intermediate value property, P(γ) = 0 for some γ between β and α, and since P does not
vanish for |x| ≥ 1 + M, we have |γ| < 1 + M. Incidentally, this
shows that P has a zero in the interval (−(1 + M), 1 + M). In fact all the real zeros of
P lie in this interval.
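
The sign change on [−(1 + M), 1 + M] also gives a practical way to locate a root. The sketch below uses bisection, which is a standard technique and not part of the argument above; the function names and the sample polynomial x³ − 2x − 5 are assumptions chosen for illustration.

```python
def poly(coeffs, x):
    """Evaluate a0 + a1*x + ... + an*x**n by Horner's rule (coeffs = [a0, ..., an])."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def odd_root(coeffs, tol=1e-12):
    """Bisection for a real root of a monic polynomial of odd degree."""
    assert len(coeffs) % 2 == 0 and coeffs[-1] == 1    # odd degree, a_n = 1
    M = max(abs(c) for c in coeffs[:-1])
    lo, hi = -(1.0 + M), 1.0 + M                       # P(lo) < 0 < P(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poly(coeffs, mid) < 0:
            lo = mid        # keep the invariant P(lo) < 0
        else:
            hi = mid        # keep the invariant P(hi) >= 0
    return 0.5 * (lo + hi)

coeffs = [-5.0, -2.0, 0.0, 1.0]      # P(x) = x^3 - 2x - 5, so M = 5
root = odd_root(coeffs)
```

Each step halves an interval on whose endpoints P has opposite signs, so the intermediate value property guarantees that a root stays inside it.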

Theorem 25.1 (The interval theorem). Let f be continuous on I = [a, b]. Then, f (I) is
a closed bounded interval.

Comment 25.3. The claim here is that continuous functions map intervals onto intervals.
This means that the intuitive picture of a continuous function as one having a continuous
graph is sound.

Proof of the interval theorem. By the boundedness property there exist numbers c and d
in I such that f (c) = m0 and f (d) = M0 and m0 ≤ f (x) ≤ M0 , for all x ∈ I.
Suppose for simplicity, that c ≤ d. Apply the intermediate value property to f on the
subinterval [c, d] to deduce that f takes all possible values between f (c) = m0 and
f (d) = M0 . In other words f (I) = [m0 , M0 ].

Theorem 25.2 (The fixed point theorem). Let f : [a, b] → [a, b] be a continuous function.
Then there is at least one number c which is fixed by f. That is f (c) = c.

Comment 25.4. This result says that if proceeding continuously from (a, f (a)) to
(b, f (b)), then the line y = x must be crossed.

Proof of the fixed point theorem. Let g : [a, b] → R1 be defined by g(x) = f(x) − x. Since
the identity function x ↦ x is continuous on [a, b], the function g is continuous on [a, b].
If f(a) = a or f(b) = b, then there is nothing to prove. So it is assumed that f(a) ≠ a
and f(b) ≠ b. Since f maps into [a, b], f(a) > a and f(b) < b, i.e. g(a) > 0 and g(b) < 0.
The intermediate value property applied to g on the interval [a, b] implies that g(c) = 0
for some c, a < c < b. Hence f(c) = c.
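
The proof suggests a way to approximate a fixed point: track the sign of g(x) = f(x) − x. The sketch below applies bisection to g; the function name `fixed_point` and the example f = cos on [0, 1] (which maps [0, 1] into itself) are assumptions made for illustration.

```python
import math

def fixed_point(f, a, b, tol=1e-12):
    """Approximate c in [a, b] with f(c) = c for a continuous f : [a, b] -> [a, b]."""
    lo, hi = a, b
    if f(lo) == lo:
        return lo
    if f(hi) == hi:
        return hi
    # Now g(lo) = f(lo) - lo > 0 and g(hi) = f(hi) - hi < 0.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) - mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c = fixed_point(math.cos, 0.0, 1.0)
residual = abs(math.cos(c) - c)
```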
Theorem 25.3 (The continuity of the inverse function). Suppose that f : A → B is a
bijection where A and B are intervals. If f is continuous on A, then $f^{-1}$ is continuous
on B.

Proof. Consider the continuous bijection f : A → B where A and B are intervals. First
it is shown that f is either strictly increasing or strictly decreasing. If f is neither strictly
increasing nor strictly decreasing, then without loss of generality, there are numbers a1, a2 and
a3 such that a1 < a2 < a3 and f(a1) < f(a3) < f(a2). Apply the intermediate value
property to f on the interval [a1, a2] to deduce that f(c) = f(a3) for some c ∈ (a1, a2).
This contradicts the fact that f is a bijection. For the rest of the proof it is assumed that
f is strictly increasing, the proof for the strictly decreasing case being similar.
Hence $f^{-1}$ is strictly increasing. Let b ∈ B and $f^{-1}(b) = a$, so that f(a) = b. For every
ε > 0, f maps the interval I = [a − ε, a + ε] onto some interval f(I) = [m, M]. Since f
is strictly increasing, m < b < M, so let δ be the minimum of b − m and M − b. Clearly
δ > 0.
Now [b − δ, b + δ] is a subset of [m, M] = f(I) and so $f^{-1}$ maps [b − δ, b + δ] into
I = [a − ε, a + ε]. Thus given any ε > 0, there exists a δ > 0 such that
$$|y - b| < \delta \Rightarrow |f^{-1}(y) - f^{-1}(b)| < \varepsilon.$$

If f : A → B is a strictly monotone surjection, where A and B are intervals, then f and
$f^{-1}$ are continuous.
Let f : A ⊂ R1 → R1 be a continuous function.
Definition 25.1. The function f is called uniformly continuous on A if for any ε > 0
there exists δ > 0 such that
|x′ − x′′| < δ ⇒ |f(x′) − f(x′′)| < ε for any x′, x′′ ∈ A.
Theorem 25.4 (Theorem of the uniform continuity). If f : [a, b] ⊂ R1 → R1 is
continuous, then f is uniformly continuous.
Comment 25.5. This result says that a continuous function on a closed interval [a, b] is
uniformly continuous.

26 Sequence of functions. Set of convergence.

Definition 26.1. A sequence of real valued functions defined on A ⊂ R is a function
F : N → {f | f : A → R}. We write $F(n) = f_n$ and the sequence of functions is denoted
by $(f_n)$.

Let A ⊂ R1 and let $f_1, f_2, \dots, f_n, \dots$ be a sequence of real valued functions $f_n : A \to \mathbb{R}^1$. We
will denote this sequence by $(f_n)$.

Definition 26.2. An element a ∈ A is called a point of convergence of the sequence $(f_n)$
if the sequence $(f_n(a))$ converges.
The set of all points of convergence is called the set of convergence of the sequence $(f_n)$.

Let B ⊂ A be the set of convergence of the sequence $(f_n)$. For x ∈ B we denote by f(x) the
limit
$$f(x) = \lim_{n \to \infty} f_n(x).$$

We establish in this way a correspondence from B to R1, i.e. a function f : B ⊂ A ⊂
R1 → R1.
The function f defined above is called the limit function, on the set B, of the sequence
$(f_n)$. We will say that the sequence $(f_n)$ converges on B to f.

Definition 26.3. Let $(f_n)$ be a sequence of functions defined on A, i.e. $f_n : A \subset \mathbb{R}^1 \to \mathbb{R}^1$.
A function f : A → R is called the limit function of the sequence $(f_n)$ if for any x ∈ A and
ε > 0 there exists N(x, ε) such that for n > N(x, ε) we have

$|f_n(x) - f(x)| < \varepsilon$,

written: $f_n \xrightarrow[n \to \infty]{} f$ on A.

Comment 26.1. If in the above definition N depends only on ε and does not depend on
x, then we say that the sequence (fn ) converges uniformly to f on A.

Definition 26.4. The sequence $(f_n)$ is uniformly convergent on A to f if for any ε > 0
there exists N(ε) such that for n > N(ε) and x ∈ A we have

$|f_n(x) - f(x)| < \varepsilon$.

If the sequence $(f_n)$ is uniformly convergent to f we will write $f_n \xrightarrow[n \to \infty]{u} f$.

Example 26.1. A = [0, 1], $f_n(x) = x^n$, $f(x) = \begin{cases} 1 & \text{for } x = 1 \\ 0 & \text{for } x \in [0, 1) \end{cases}$; here $f_n \xrightarrow[n \to \infty]{} f$, but
$(f_n)$ does not converge uniformly to f.

Example 26.2. A = [0, 2π], $f_n(x) = \frac{\sin n x}{n}$, f(x) = 0; here $f_n \xrightarrow[n \to \infty]{u} f$.
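
The difference between the two examples can be made concrete. In this sketch (helper names are hypothetical) the point xₙ = (1/2)^(1/n) lies in [0, 1) and satisfies xₙⁿ = 1/2, so in Example 26.1 the gap |fₙ − f| never falls below 1/2 somewhere on A; in Example 26.2 the gap is at most 1/n everywhere, which is checked on a grid.

```python
import math

def half_point(n):
    """A point x_n in [0, 1) with x_n**n = 1/2, so |f_n(x_n) - f(x_n)| = 1/2."""
    return 0.5 ** (1.0 / n)

def sup_on_grid(n, points=2001):
    """Maximum of |sin(n x)/n - 0| over an equally spaced grid on [0, 2*pi]."""
    return max(abs(math.sin(n * (2 * math.pi * k / (points - 1))) / n)
               for k in range(points))

ns = (1, 10, 100, 1000)
gaps_26_1 = {n: half_point(n) ** n for n in ns}    # stays near 1/2 for every n
sups_26_2 = {n: sup_on_grid(n) for n in ns}        # bounded by 1/n
```

The second computation is an instance of the bound |fₙ(x) − f(x)| ≤ 1/n, which holds for every x because |sin| ≤ 1.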

Uniform convergence criteria: Let $(f_n)$ be a sequence of functions $f_n : A \subset \mathbb{R}^1 \to \mathbb{R}$.

• First criterion (Cauchy): The sequence $(f_n)$ converges uniformly to a function f
defined on A if and only if for any ε > 0 there exists N(ε) such that, for any n, m > N(ε)
and any x ∈ A we have:
$|f_n(x) - f_m(x)| < \varepsilon$.

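Proof. Assume first that $f_n \xrightarrow[n \to \infty]{u} f$. For ε > 0 there exists $N_\varepsilon$ such that for $p \ge N_\varepsilon$ we
have
$$|f_p(x) - f(x)| < \frac{\varepsilon}{2}, \quad \text{for any } x \in A.$$
We have
$$|f_n(x) - f_m(x)| \le |f_n(x) - f(x)| + |f(x) - f_m(x)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$$
for any $n, m > N_\varepsilon$ and any x ∈ A.
Assume now that for any ε > 0 there exists $N_\varepsilon$ such that for $n, m > N_\varepsilon$ and x ∈ A we
have
$|f_n(x) - f_m(x)| < \varepsilon$,
and let us show that there exists f : A ⊂ R1 → R1 such that $f_n \xrightarrow[n \to \infty]{u} f$.
By hypothesis, for every x ∈ A the sequence of real numbers $(f_n(x))$ is a Cauchy sequence, hence it converges. Let
$f(x) = \lim_{n \to \infty} f_n(x)$. We obtain in this way a function f : A ⊂ R1 → R1. The sequence
of functions $(f_n)$ converges at every point x ∈ A to f. Let now ε > 0 and $N_\varepsilon$ be such that for
$n, m > N_\varepsilon$ and x ∈ A we have
$$|f_n(x) - f_m(x)| < \frac{\varepsilon}{2}.$$
We fix $n_0 > N_\varepsilon$. Since $f_n \xrightarrow[n \to \infty]{} f$ pointwise, letting n → ∞ in $|f_n(x) - f_{n_0}(x)| < \frac{\varepsilon}{2}$ gives
$$|f(x) - f_{n_0}(x)| \le \frac{\varepsilon}{2} < \varepsilon \quad \text{for } x \in A.$$
Since $n_0 > N_\varepsilon$ was arbitrarily chosen, it follows that for any $n > N_\varepsilon$ and x ∈ A we have
$|f_n(x) - f(x)| < \varepsilon$,
i.e. $f_n \xrightarrow[n \to \infty]{u} f$.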

• Second criterion: Let $(f_n)$ be a sequence of functions defined on A, $f_n : A \subset \mathbb{R}^1 \to \mathbb{R}^1$,
and f : A ⊂ R1 → R1. If there exists a sequence $(a_n)$ of positive real numbers ($a_n > 0$)
which converges to 0 ($a_n \to 0$), such that $|f_n(x) - f(x)| \le a_n$ for any n ∈ N and any
x ∈ A, then $f_n \xrightarrow[n \to \infty]{u} f$.

Proof. Let ε > 0. Since $a_n \to 0$, there exists $N_\varepsilon$ such that for any $n \ge N_\varepsilon$ we have
$a_n < \varepsilon$. It follows that
$|f_n(x) - f(x)| < \varepsilon$
for $n \ge N_\varepsilon$ and x ∈ A, i.e. $f_n \xrightarrow[n \to \infty]{u} f$.

27 Continuity and uniform convergence

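The following statement shows that uniform convergence preserves continuity.
Proposition 27.1. Let $(f_n)$ be a sequence of functions $f_n : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ which converges
uniformly to f : A ⊂ R1 → R1, i.e. $f_n \xrightarrow[n \to \infty]{u} f$. If all the functions $f_n$ are continuous at a
point a ∈ A, then f is continuous at ”a”.

Proof. Let ε > 0. Since $f_n \xrightarrow[n \to \infty]{u} f$, there exists $N_\varepsilon$ such that for any $n \ge N_\varepsilon$ and
x ∈ A we have
$$|f_n(x) - f(x)| < \frac{\varepsilon}{3}.$$
In particular, for $n = N_\varepsilon$ we have $|f_{N_\varepsilon}(x) - f(x)| < \frac{\varepsilon}{3}$ for all x ∈ A, and hence
$$|f_{N_\varepsilon}(a) - f(a)| < \frac{\varepsilon}{3}.$$
Since $f_{N_\varepsilon}$ is continuous at ”a”, there exists $\delta_\varepsilon > 0$ such that $|x - a| < \delta_\varepsilon \Rightarrow |f_{N_\varepsilon}(x) - f_{N_\varepsilon}(a)| < \frac{\varepsilon}{3}$.
Therefore, for any x ∈ A with $|x - a| < \delta_\varepsilon$ we have
$$|f(x) - f(a)| \le |f(x) - f_{N_\varepsilon}(x)| + |f_{N_\varepsilon}(x) - f_{N_\varepsilon}(a)| + |f_{N_\varepsilon}(a) - f(a)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.$$
That means that f is continuous at ”a”.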

Corollary 27.1. The limit of a uniformly convergent sequence of continuous functions is
a continuous function.

28 Equal continuous and equal bounded sequence of functions

Let A ⊂ R1 and let $(f_n)$ be a sequence of functions defined on A, $f_n : A \to \mathbb{R}^1$.

The expression ”$(f_n)$ is a sequence of continuous functions” means: for any n ∈ N, x ∈
A and ε > 0 there exists δ = δ(n, x, ε) > 0 such that:

$|x' - x| < \delta \Rightarrow |f_n(x') - f_n(x)| < \varepsilon$.

If the functions $f_n$ are uniformly continuous on A, then δ does not depend on x. Hence:
”$(f_n)$ is a sequence of uniformly continuous functions on A” means:
for any n ∈ N and any ε > 0, there exists δ = δ(n, ε) > 0 such that:

$|x' - x''| < \delta \Rightarrow |f_n(x') - f_n(x'')| < \varepsilon$ for any x′, x′′ ∈ A.

It is possible that δ does not depend on n, but depends on x and ε. In this case the
sequence $(f_n)$ is a sequence of equal continuous (equicontinuous) functions. More precisely:
The sequence $(f_n)$ is a sequence of equal continuous functions on A if for any x ∈ A and
ε > 0 there exists δ = δ(x, ε) > 0 such that for any n:

$|x' - x| < \delta \Rightarrow |f_n(x') - f_n(x)| < \varepsilon$.

If δ is independent of x and n, then the functions of the sequence $(f_n)$ are uniformly
continuous and equal continuous too; they are equal uniformly continuous. More precisely:
$(f_n)$ is a sequence of equal uniformly continuous functions on A if for any ε > 0 there is
a δ = δ(ε) > 0 such that

$|x' - x''| < \delta(\varepsilon) \Rightarrow |f_n(x') - f_n(x'')| < \varepsilon$

for any n ∈ N and x′, x′′ ∈ A.

The sequence $(f_n)$ is a sequence of bounded functions on A if for any n ∈ N there is
M = M(n) > 0 such that
$|f_n(x)| < M$
for any x ∈ A.
If M is independent of n, then the functions of the sequence $(f_n)$ are equal bounded.
More precisely:
$(f_n)$ is a sequence of equal bounded functions on A if there is M > 0 such that

$|f_n(x)| < M$

for any n ∈ N and x ∈ A.

Theorem 28.1 (Arzela-Ascoli). Let $I = [a, b]$ be a closed interval and $(f_n)$ a sequence of functions $f_n : I \to \mathbb{R}^1$.
If $(f_n)$ is a sequence of equal continuous and equal bounded functions, then $(f_n)$ contains a subsequence $(f_{n_k})$ which is uniformly convergent on $I$.

Proof. The proof is rather technical and it will be omitted.

29 Series of functions. Convergence and uniform convergence.

Let $A \subset \mathbb{R}^1$ and let $(f_n)$ be a sequence of functions $f_n : A \to \mathbb{R}^1$.



Definition 29.1. It is said that the symbol $\sum_{n=1}^{\infty} f_n$ is a convergent series of functions at the point $a \in A$ if the numerical series $\sum_{n=1}^{\infty} f_n(a)$ is convergent.

The symbol $\sum_{n=1}^{\infty} f_n$ is a divergent series of functions at the point $a \in A$ if the numerical series $\sum_{n=1}^{\infty} f_n(a)$ diverges.

A point $a \in A$ is called a point of convergence of the series of functions $\sum_{n=1}^{\infty} f_n$ if the series converges at $a$.

The collection of all the points of convergence of the series is called the set of convergence of the series $\sum_{n=1}^{\infty} f_n$.

Let $B \subset A$ be the set of convergence of the series $\sum_{n=1}^{\infty} f_n$. For $x \in B$ we denote by $S(x)$ the sum
$$S(x) = \sum_{n=1}^{\infty} f_n(x).$$

We establish in this way a correspondence from $B$ to $\mathbb{R}^1$, i.e. a function

$$S : B \subset A \subset \mathbb{R}^1 \to \mathbb{R}^1.$$

The function $S$ defined above is called the sum function, on the set $B$, of the series $\sum_{n=1}^{\infty} f_n$.

We will say that the series $\sum_{n=1}^{\infty} f_n$ converges to $S$ on $B$, and we will write

$$S = \sum_{n=1}^{\infty} f_n, \quad \text{for } x \in B.$$


Definition 29.2. Let $\sum_{n=1}^{\infty} f_n$ be a series of functions defined on $A$, and $S$ a function defined on $B \subset A$. The series $\sum_{n=1}^{\infty} f_n$ converges to $S$ on $B$ if for any $x \in B$ and any $\varepsilon > 0$ there exists $N = N(x, \varepsilon) > 0$ such that for any $n > N$ we have

$$|f_1(x) + f_2(x) + \cdots + f_n(x) - S(x)| < \varepsilon.$$

If the number $N$ is independent of $x$, then the series is uniformly convergent on $B$ to $S$.

In this case we have the following:

Definition 29.3. The series $\sum_{n=1}^{\infty} f_n$ converges uniformly to $S$ on $B$ if for any $\varepsilon > 0$ there is $N = N(\varepsilon) > 0$ such that

$$|f_1(x) + f_2(x) + \cdots + f_n(x) - S(x)| < \varepsilon$$

for any $x \in B$ and $n > N(\varepsilon)$.



Definition 29.4. The series $\sum_{n=1}^{\infty} f_n$ converges absolutely on $B$ if the series $\sum_{n=1}^{\infty} |f_n|$ converges on $B$.

If the series $\sum_{n=1}^{\infty} f_n$ converges absolutely on $B$, then it converges on $B$.

Example 29.1.

a) Consider the sequence of functions defined by

$$f_n(x) = \frac{x^2}{(1 + x^2)^n}, \quad n \ge 0,$$

and the series $\sum_{n=0}^{\infty} f_n(x)$.
The set of convergence of this series is $\mathbb{R}^1$ and the sum of the series is
$$S(x) = \begin{cases} 1 + x^2 & \text{for } x \ne 0 \\ 0 & \text{for } x = 0. \end{cases}$$

The series is absolutely convergent on $\mathbb{R}^1$.

b) For $n \ge 1$ consider $f_n$ defined on $\mathbb{R}^1$ as

$$f_n(x) = \frac{\sin^n x}{n^2}$$

and the series $\sum_{n=1}^{\infty} f_n$.
The series is absolutely convergent on $\mathbb{R}^1$.
The series is uniformly convergent on $\mathbb{R}^1$.

c) For $n \ge 1$ consider $f_n(x) = \cos^n x$ and the series $\sum_{n=1}^{\infty} f_n$.
The set of convergence is $\mathbb{R}^1 \setminus \{k \cdot \pi\}_{k \in \mathbb{Z}}$.
The series is absolutely convergent on the set of convergence.

d) For $n \ge 1$ consider the functions $f_n(x) = \dfrac{e^{n \cdot |x|}}{n}$ and the series $\sum_{n=1}^{\infty} f_n$.
The set of convergence of the series is empty.
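
Part a) of the example can be explored numerically. The following is a minimal sketch, not part of the original notes; the function names `partial_sum` and `claimed_sum` and the chosen numbers of terms are illustrative assumptions. It compares partial sums of $\sum_{n \ge 0} x^2/(1+x^2)^n$ with the stated sum function $S$.

```python
def partial_sum(x, terms):
    """Sum of x^2 / (1 + x^2)^n for n = 0, ..., terms - 1."""
    return sum(x * x / (1.0 + x * x) ** n for n in range(terms))


def claimed_sum(x):
    """Sum function S stated in Example 29.1 a)."""
    return 1.0 + x * x if x != 0 else 0.0


if __name__ == "__main__":
    # Points closer to 0 need many more terms for the same accuracy.
    for x, terms in [(1.0, 60), (0.1, 5000), (0.0, 50)]:
        print(x, terms, partial_sum(x, terms), claimed_sum(x))
```

The need for far more terms near $x = 0$ is consistent with the fact that $S$ is discontinuous at $0$ although every $f_n$ is continuous, so the convergence cannot be uniform on any neighborhood of $0$.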

30 Convergence criteria for series of functions



Consider the series of functions $\sum_{n=1}^{\infty} f_n$ defined on $A$, i.e. $f_n : A \subset \mathbb{R}^1 \to \mathbb{R}^1$.

Definition 30.1. The series of functions $\sum_{n=k+1}^{\infty} f_n$ is called the remainder of order $k$ of the series $\sum_{n=1}^{\infty} f_n$.

1st Criterion: The series $\sum_{n=1}^{\infty} f_n$ converges if and only if the remainder of any order $k$ of the series converges.

Proof. Consider
$$S_k = f_1 + f_2 + \cdots + f_k$$
and
$$\sigma_p = f_{k+1} + f_{k+2} + \cdots + f_{k+p}$$
and remark that
$$S_{k+p} = S_k + \sigma_p.$$
Therefore, the sequence $(S_{k+p})$ converges as $p \to \infty$ if and only if the sequence $(\sigma_p)$ converges as $p \to \infty$.


2nd Criterion: The series $\sum_{n=1}^{\infty} f_n$ converges if and only if the sequence of the sums of remainders tends to 0.

Proof. Obvious.


3rd Criterion (Cauchy): The series $\sum_{n=1}^{\infty} f_n$ converges uniformly on $A$ if and only if for any $\varepsilon > 0$ there is $N = N(\varepsilon)$ such that for $n \ge N$ and $p \ge 1$ we have

$$|f_{n+1}(x) + f_{n+2}(x) + \cdots + f_{n+p}(x)| < \varepsilon$$

for any $x \in A$.

Proof. An immediate consequence of the Cauchy criterion for sequences.


4th Criterion: Let $\sum_{n=1}^{\infty} a_n$ be a convergent series of positive numbers.
If $|f_n(x)| \le a_n$ for $x \in A$ and $n \in \mathbb{N}$, then the series $\sum_{n=1}^{\infty} f_n$ is uniformly convergent on $A$.

Proof. Obvious.
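
The 4th criterion can be illustrated with the series $\sum \sin^n x / n^2$ from Example 29.1 b), using $a_n = 1/n^2$. The sketch below is an illustration under assumptions, not part of the original notes: the function names, the sample grid and the cut-off indices are hypothetical choices. It checks that the gap between two partial sums, taken over a grid of points, never exceeds the corresponding tail of $\sum a_n$, which does not depend on $x$.

```python
import math


def partial_sum(x, terms):
    """Partial sum of sin(x)^n / n^2 for n = 1, ..., terms."""
    return sum(math.sin(x) ** n / n ** 2 for n in range(1, terms + 1))


def tail_bound(start, stop):
    """Sum of a_n = 1/n^2 for n = start + 1, ..., stop."""
    return sum(1.0 / n ** 2 for n in range(start + 1, stop + 1))


def worst_gap(start, stop, grid):
    """Largest |S_stop(x) - S_start(x)| over the sample points in grid."""
    return max(abs(partial_sum(x, stop) - partial_sum(x, start)) for x in grid)


if __name__ == "__main__":
    grid = [0.1 * k for k in range(-40, 41)]
    print(worst_gap(10, 200, grid), "<=", tail_bound(10, 200))
```

A finite grid can only suggest uniformity; the actual proof is the inequality $|f_{n+1}(x) + \cdots + f_{n+p}(x)| \le a_{n+1} + \cdots + a_{n+p}$ combined with the 3rd criterion.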

31 Power Series

Definition 31.1. A series of functions of the form $\sum_{n=0}^{\infty} a_n \cdot x^n$ is called a power series.
Clearly, any power series converges when $x = 0$.

Theorem 31.1 (The set of convergence of power series. Abel-Cauchy-Hadamard theorem).

- The power series $\sum_{n=0}^{\infty} a_n \cdot x^n$ is absolutely convergent for $|x| < R$ ($R$ is called the radius of convergence), where $R$ is given by
$$R = \frac{1}{\omega} \ \text{ if } 0 < \omega \le +\infty, \qquad R = +\infty \ \text{ if } \omega = 0,$$
and $\omega = \limsup\limits_{n \to \infty} \sqrt[n]{|a_n|}$ (with the convention $\frac{1}{+\infty} = 0$).

- The series diverges for any $x$ with $|x| > R$.

- For any $r \in (0, R)$ the series is uniformly convergent on the closed interval $[-r, r]$.


Proof. Consider $x_0 \in \mathbb{R}^1$ and the series $\sum_{n=0}^{\infty} |a_n| \cdot |x_0|^n$.
Apply the root test to this series and obtain:
If $\limsup\limits_{n \to \infty} \sqrt[n]{|a_n|} \cdot |x_0| < 1$, then the series $\sum_{n=0}^{\infty} a_n \cdot x_0^n$ is absolutely convergent. In other words, the series $\sum_{n=0}^{\infty} a_n \cdot x_0^n$ is absolutely convergent for $|x_0| < R$, where $R = \frac{1}{\omega}$ if $0 < \omega \le +\infty$ and $R = +\infty$ if $\omega = 0$, with $\omega = \limsup\limits_{n \to +\infty} \sqrt[n]{|a_n|}$.
Applying the same test, it follows that the series diverges for any $x_0$ with $|x_0| > R$.
For $r \in (0, R)$ the series $\sum_{n=0}^{\infty} |a_n| \cdot r^n$ converges ($x = r$ is a point at which the series $\sum_{n=0}^{\infty} a_n \cdot x^n$ converges absolutely) and for $x \in [-r, r]$ we have

$$|a_n| \cdot |x|^n \le |a_n| \cdot r^n.$$

According to the 4th criterion of convergence, the series $\sum_{n=0}^{\infty} a_n \cdot x^n$ is uniformly convergent for $|x| \le r$.
Example 31.1.

a) The series $\sum_{n=0}^{\infty} x^n$ converges absolutely for $|x| < 1$ and diverges for $|x| > 1$. The radius of convergence is $R = 1$; the convergence set is $(-1, 1)$.

b) For the series $\sum_{n=1}^{\infty} \dfrac{x^n}{n}$ the radius of convergence is $R = 1$. The convergence set is $[-1, 1)$.

c) The set of convergence of the series $\sum_{n=1}^{\infty} (-1)^n \cdot \dfrac{x^n}{n}$ is $(-1, 1]$.

d) The series $\sum_{n=1}^{\infty} \dfrac{x^n}{n^{\alpha}}$ ($\alpha > 1$) is absolutely convergent on $[-1, 1]$.
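
The radius of convergence in these examples can be estimated numerically from the root formula of Theorem 31.1. The sketch below is illustrative only and not part of the original notes: the helper names and the indices $n$ are assumptions, and evaluating $\sqrt[n]{|a_n|}$ at a single large $n$ merely suggests the value of the limit.

```python
def root_estimate(coefficient, n):
    """Value of |a_n|^(1/n) for the coefficient function n -> a_n."""
    return abs(coefficient(n)) ** (1.0 / n)


def radius_estimate(coefficient, n):
    """Heuristic estimate of R = 1/omega from one large index n."""
    omega = root_estimate(coefficient, n)
    return float("inf") if omega == 0 else 1.0 / omega


if __name__ == "__main__":
    print(radius_estimate(lambda n: 1.0, 1000))       # series a): a_n = 1
    print(radius_estimate(lambda n: 1.0 / n, 10000))  # series b): a_n = 1/n
    print(radius_estimate(lambda n: 2.0 ** n, 500))   # a_n = 2^n, R = 1/2
```

The estimate says nothing about the endpoints $x = \pm R$, which is why the series a) to d) share $R = 1$ but have different convergence sets.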

Concerning the continuity of the sum of a power series we have:


• The sum $S$ of the power series $\sum_{n=0}^{\infty} a_n \cdot x^n$ is a continuous function on $(-R, R)$.

Proof. Let $x_0 \in (-R, R)$. There exists $r \in (0, R)$ such that $-R < -r < x_0 < r < R$.
Since on the closed interval $[-r, r]$ the series converges uniformly, and the terms of the series are continuous functions, the sum $S$ is continuous on $[-r, r]$. In particular, it is continuous at $x_0$.

• The sum $S$ of the power series $\sum_{n=0}^{\infty} a_n \cdot x^n$ is uniformly continuous on any compact interval contained in $(-R, R)$.

32 Arithmetics of power series



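Let $\sum_{n=0}^{\infty} a_n \cdot x^n$ and $\sum_{n=0}^{\infty} b_n \cdot x^n$ be power series with radii of convergence $R_1$ and $R_2$, respectively, where $0 \le R_1 \le R_2$. Then:

- the sum $\sum_{n=0}^{\infty} (a_n + b_n) \cdot x^n$

- the scalar product $\sum_{n=0}^{\infty} k \cdot a_n \cdot x^n$

- the Cauchy product $\sum_{n=0}^{\infty} c_n \cdot x^n$, where $c_n = \sum_{k=0}^{n} a_k \cdot b_{n-k}$

all have radius of convergence at least $R_1$.

Moreover, if $\sum_{n=0}^{\infty} a_n \cdot x^n$ has the sum $f(x)$ and $\sum_{n=0}^{\infty} b_n \cdot x^n$ has the sum $g(x)$, then for $|x| < R_1$:

$$\sum_{n=0}^{\infty} (a_n + b_n) \cdot x^n = f(x) + g(x)$$

$$\sum_{n=0}^{\infty} (k \cdot a_n) \cdot x^n = k \cdot f(x)$$

$$\sum_{n=0}^{\infty} c_n \cdot x^n = f(x) \cdot g(x).$$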

Proof. The claims concerning the sum and scalar product follow from the sum and scalar product rules for series. To establish the Cauchy product result, note that $\sum_{n=0}^{\infty} a_n \cdot x^n$ and $\sum_{n=0}^{\infty} b_n \cdot x^n$ are absolutely convergent for $|x| < R_1$. Since $c_n \cdot x^n = \sum_{k=0}^{n} (a_k \cdot x^k)(b_{n-k} \cdot x^{n-k})$, the series $\sum_{n=0}^{\infty} c_n \cdot x^n$ is absolutely convergent for $|x| < R_1$, and has the sum stated.

Much of the preceding discussion can be modified to apply to series of the form
$$\sum_{n=0}^{\infty} a_n \cdot (x - a)^n.$$
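
The Cauchy product coefficients $c_n = \sum_{k=0}^{n} a_k b_{n-k}$ are easy to compute for truncated coefficient lists. The sketch below is a minimal illustration, not part of the original notes; the function name `cauchy_product` and the list representation (entry $i$ holds the coefficient of $x^i$) are assumptions.

```python
def cauchy_product(a, b):
    """Coefficients c_n = sum_{k=0}^{n} a_k * b_{n-k} for n < min(len(a), len(b)).

    Only as many c_n are returned as can be computed exactly from the
    truncated inputs.
    """
    size = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(size)]


if __name__ == "__main__":
    geometric = [1] * 6                  # 1/(1 - x) = 1 + x + x^2 + ...
    print(cauchy_product(geometric, geometric))      # coefficients of 1/(1 - x)^2
    print(cauchy_product([1, 1, 0, 0], [1, -1, 1, -1]))  # (1 + x) * 1/(1 + x)
```

The first call reproduces the coefficients $n + 1$ of $1/(1-x)^2$, and the second returns the coefficients of the constant function $1$, in agreement with $\sum c_n x^n = f(x) \cdot g(x)$.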

33 Differentiable functions

Intuitively, a function $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ is differentiable at $c \in A$ if a tangent can be drawn to the curve at the point $P(c, f(c))$.

Figure 33.1:

The slope of the chord $PQ$ in Figure 33.1 is
$$\frac{f(x) - f(c)}{x - c}$$
and as $Q$ moves closer to $P$ it is required that the slope of $PQ$ approaches the slope of the tangent line at $P$.
This geometric idea motivates the following formal definition:
Definition 33.1. A function $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ is differentiable at $c \in A$ if
$$\lim_{x \to c} \frac{f(x) - f(c)}{x - c}$$
exists. The value of this limit is written $f'(c)$ and is called the derivative of $f$ at $c$.

An alternative form of the limit is obtained by setting $x = c + h$. Then
$$\lim_{x \to c} \frac{f(x) - f(c)}{x - c} = \lim_{h \to 0} \frac{f(c + h) - f(c)}{h}.$$

Example 33.1. The function $f(x) = x^2$ is differentiable for all $x$.

Solution: Consider $\dfrac{f(x) - f(c)}{x - c}$ for any $x \ne c$, where $c$ is fixed. Now
$$\lim_{x \to c} \frac{f(x) - f(c)}{x - c} = \lim_{x \to c} \frac{x^2 - c^2}{x - c} = \lim_{x \to c} (x + c) = 2c.$$
Hence, $f$ is differentiable at $c$ and $f'(c) = 2c$. Since $c$ was arbitrary, the derivative function $f'$ can be defined as
$$f'(x) = 2x.$$
Example 33.2. The function $f(x) = |x|$ is not differentiable at $c = 0$.

Solution: Consider
$$\frac{f(x) - f(0)}{x - 0} = \frac{|x|}{x} = \begin{cases} 1 & \text{for } x > 0 \\ -1 & \text{for } x < 0. \end{cases}$$
Hence
$$\lim_{x \to 0^+} \frac{f(x) - f(0)}{x - 0} = 1 \quad \text{and} \quad \lim_{x \to 0^-} \frac{f(x) - f(0)}{x - 0} = -1.$$
Since these right and left limits differ, $f$ is not differentiable at 0.
It is easy to show that $f$ is differentiable for all $x \ne 0$ and $f'(x) = 1$ for $x > 0$ and $f'(x) = -1$ for $x < 0$.
In general, points where $f$ is not differentiable can often be detected by examining the left and the right limits of $\dfrac{f(x) - f(c)}{x - c}$ as $x \to c$.
The left limit $\lim\limits_{x \to c^-} \dfrac{f(x) - f(c)}{x - c}$ is called the left derivative of $f$ at $c$ and is denoted by $f'_-(c)$.
Similarly, the right limit $\lim\limits_{x \to c^+} \dfrac{f(x) - f(c)}{x - c}$ is called the right derivative of $f$ at $c$ and is denoted by $f'_+(c)$.
Clearly, $f'(c)$ exists if and only if $f'_-(c)$ and $f'_+(c)$ both exist and are equal.
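
One-sided difference quotients give a quick numerical hint about the left and right derivatives. The sketch below is an illustration, not part of the original notes; the function names and the step size `h` are assumptions, and a finite step can only approximate the limits.

```python
def difference_quotient(f, c, h):
    """(f(c + h) - f(c)) / h for a nonzero step h."""
    return (f(c + h) - f(c)) / h


def one_sided_estimates(f, c, h=1e-6):
    """Return (left, right) difference quotients using steps -h and +h."""
    return difference_quotient(f, c, -h), difference_quotient(f, c, h)


if __name__ == "__main__":
    print(one_sided_estimates(abs, 0.0))               # the two values disagree
    print(one_sided_estimates(lambda x: x * x, 3.0))   # both close to 2 * 3
```

For $f(x) = |x|$ at $0$ the two estimates stay at $-1$ and $1$ for every step, matching $f'_-(0) \ne f'_+(0)$; for $f(x) = x^2$ they approach the common value $2c$.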
The next result establishes that only continuous functions can be differentiable.
Theorem 33.1. If $f$ is differentiable at $c$, then $f$ is continuous at $c$.

Proof. Define the function
$$F_c(x) = \begin{cases} \dfrac{f(x) - f(c)}{x - c} & \text{if } x \ne c \\ f'(c) & \text{if } x = c. \end{cases}$$
Since $f$ is differentiable at $c$, $\lim\limits_{x \to c} F_c(x) = F_c(c)$ and hence $F_c$ is continuous at $c$.
Now
$$f(x) = f(c) + F_c(x) \cdot (x - c) \quad \text{for all } x.$$
Since $F_c$ and the identity and constant functions are all continuous at $c$, $f$ is continuous at $c$.

Note that Example 33.2 shows that there are continuous functions which are not differentiable.
The following table gives certain elementary functions and their derivatives.

Function $f$                          Derivative $f'$
$f(x) = k$ (a constant)               $f'(x) = 0$
$f(x) = x^n$, $n \in \mathbb{N}$      $f'(x) = n \cdot x^{n-1}$
$f(x) = \sqrt{x}$                     $f'(x) = \dfrac{1}{2\sqrt{x}}$
$f(x) = \sin x$                       $f'(x) = \cos x$
$f(x) = \cos x$                       $f'(x) = -\sin x$
$f(x) = \tan x$                       $f'(x) = \dfrac{1}{\cos^2 x}$
$f(x) = \cot x$                       $f'(x) = -\dfrac{1}{\sin^2 x}$
$f(x) = e^x$                          $f'(x) = e^x$
$f(x) = \ln x$                        $f'(x) = \dfrac{1}{x}$

34 Rules of differentiability

Sum rule Let $f$ and $g$ be functions differentiable at $c$. Then their sum $f + g$ is differentiable at $c$ and
$$(f + g)'(c) = f'(c) + g'(c).$$

Product rule Let $f$ and $g$ be functions differentiable at $c$. Then their product $f \cdot g$ is differentiable at $c$ and
$$(f \cdot g)'(c) = f'(c) \cdot g(c) + f(c) \cdot g'(c).$$

Reciprocal rule Let $f$ be a function which is differentiable at $c$ with $f(c) \ne 0$. Then $\dfrac{1}{f}$ is differentiable at $c$ and
$$\left(\frac{1}{f}\right)'(c) = -\frac{f'(c)}{f^2(c)}.$$

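Proof of product rule. For $x \ne c$
$$\frac{(f \cdot g)(x) - (f \cdot g)(c)}{x - c} = \frac{f(x) \cdot g(x) - f(c) \cdot g(c)}{x - c} = \frac{f(x) \cdot g(x) - f(x) \cdot g(c) + f(x) \cdot g(c) - f(c) \cdot g(c)}{x - c} = \frac{f(x) - f(c)}{x - c} \cdot g(c) + f(x) \cdot \frac{g(x) - g(c)}{x - c}.$$
As $x \to c$,
$$\frac{f(x) - f(c)}{x - c} \to f'(c) \quad \text{and} \quad \frac{g(x) - g(c)}{x - c} \to g'(c),$$
and $f(x) \to f(c)$ because $f$ is continuous at $c$.

Therefore
$$\frac{(f \cdot g)(x) - (f \cdot g)(c)}{x - c} \xrightarrow[x \to c]{} f'(c) \cdot g(c) + f(c) \cdot g'(c).$$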

The product and the reciprocal rules can be combined as follows, to give the quotient rule.

Quotient rule If $f$ and $g$ are differentiable at $c$ and $g(c) \ne 0$, then $\dfrac{f}{g}$ is differentiable at $c$ and
$$\left(\frac{f}{g}\right)'(c) = \frac{f'(c) \cdot g(c) - f(c) \cdot g'(c)}{[g(c)]^2}.$$
Example 34.1. Use the above rules to prove that each of the following functions is differentiable at the points indicated:

i) $f(x) = x^2 + \sin x$, $x \in \mathbb{R}^1$;

ii) $f(x) = x^2 \cdot \sin x$, $x \in \mathbb{R}^1$;

iii) $f(x) = \tan x$, $x \in \mathbb{R}^1$ and $x \ne (2n + 1)\dfrac{\pi}{2}$, $n$ integer;

iv) $f(x) = x^n$, $n \in \mathbb{Z}$, $n \ne 0$;

v) $f(x) = \sec x$, $x \ne (2n + 1)\dfrac{\pi}{2}$;

vi) $f(x) = \csc x$, $x \ne n\pi$;

vii) $f(x) = \tan x$, $x \ne (2n + 1)\dfrac{\pi}{2}$;

viii) $f(x) = \cot x$, $x \ne n\pi$.

Squeeze rule Let $f$, $g$ and $h$ be three functions such that $g(x) \le f(x) \le h(x)$ for all $x$ in some neighborhood of $c$ and such that $g(c) = f(c) = h(c)$. If $g$ and $h$ are differentiable at $c$, then so is $f$ and $f'(c) = g'(c) = h'(c)$.

Proof. The given inequalities imply
$$\frac{g(x) - g(c)}{x - c} \le \frac{f(x) - f(c)}{x - c} \le \frac{h(x) - h(c)}{x - c}$$
for all $x > c$, and the inequality signs are reversed for $x < c$. The result follows by the squeeze rule for limits of functions provided that $g'(c) = h'(c)$ can be established. To this end let
$$G_c(x) = \begin{cases} \dfrac{g(x) - g(c)}{x - c} & \text{if } x \ne c \\ g'(c) & \text{if } x = c \end{cases}$$
and
$$H_c(x) = \begin{cases} \dfrac{h(x) - h(c)}{x - c} & \text{if } x \ne c \\ h'(c) & \text{if } x = c. \end{cases}$$
Since $g$ and $h$ are differentiable at $c$, $G_c$ and $H_c$ are continuous at $c$. Let $k(x) = G_c(x) - H_c(x)$. Thus $k$ is continuous at $c$. The earlier inequalities imply that if $x > c$, then $k(x) \le 0$ and if $x < c$, then $k(x) \ge 0$. Hence, $k(c) = 0$ and so $G_c(c) = H_c(c)$. In other words, $g'(c) = h'(c)$.

Example 34.2. The function
$$f(x) = \begin{cases} x^2 \sin \dfrac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$$
can be squeezed between $g(x) = -x^2$ and $h(x) = x^2$ at $x = 0$. Since $g$ and $h$ are differentiable at 0 with common derivative of value 0, the squeeze rule gives that $f$ is differentiable at $x = 0$.
By other rules, $f$ is also differentiable for $x \ne 0$. Moreover,
$$f'(x) = \begin{cases} 2x \sin \dfrac{1}{x} - \cos \dfrac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}$$

Note that $\lim\limits_{x \to 0} f'(x)$ does not exist, so that $f'(0)$ exists but $f'$ is not continuous at 0.
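
The failure of $\lim_{x \to 0} f'(x)$ to exist can be made visible by evaluating $f'$ along the points $x_k = 1/(k\pi)$, where $\sin(1/x_k) = 0$ and $\cos(1/x_k) = (-1)^k$. The sketch below is illustrative only and not part of the original notes; the function names and the number of sample points are assumptions.

```python
import math


def f_prime(x):
    """Derivative of x^2 * sin(1/x), extended by f'(0) = 0."""
    if x == 0:
        return 0.0
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)


def sample_points(count):
    """Points x_k = 1/(k * pi) for k = 1, ..., count, which tend to 0."""
    return [1.0 / (k * math.pi) for k in range(1, count + 1)]


if __name__ == "__main__":
    for k, x in enumerate(sample_points(6), start=1):
        print(k, x, f_prime(x))
```

The values alternate between values close to $1$ (odd $k$) and $-1$ (even $k$) while $x_k \to 0$, so $f'$ has no limit at $0$.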

Composite rule Let $f$ be differentiable at $c$, and $g$ be differentiable at $b = f(c)$. Then $g \circ f$ is differentiable at $c$ and
$$(g \circ f)'(c) = g'(f(c)) \cdot f'(c).$$

Proof. Let
$$F_c(x) = \begin{cases} \dfrac{f(x) - f(c)}{x - c} & \text{if } x \ne c \\ f'(c) & \text{if } x = c \end{cases}$$
and
$$G_b(y) = \begin{cases} \dfrac{g(y) - g(b)}{y - b} & \text{if } y \ne b \\ g'(b) & \text{if } y = b. \end{cases}$$
Then $F_c$ is continuous at $x = c$ and, for all $x$,
$$f(x) = f(c) + (x - c) \cdot F_c(x).$$
$G_b$ is continuous at $y = b$ and, for all $y$,
$$g(y) = g(b) + (y - b) \cdot G_b(y).$$
Now, with $y = f(x)$,
$$(g \circ f)(x) = g(f(x)) = g(y) = g(b) + (y - b) \cdot G_b(y) = g(f(c)) + (f(x) - f(c)) \cdot G_b(f(x)) = g(f(c)) + (x - c) \cdot F_c(x) \cdot G_b(f(x)).$$
So, for $x \ne c$,
$$\frac{(g \circ f)(x) - (g \circ f)(c)}{x - c} = F_c(x) \cdot G_b(f(x)).$$
The function on the right-hand side of the above equality is continuous at $x = c$. Hence
$$\lim_{x \to c} \frac{(g \circ f)(x) - (g \circ f)(c)}{x - c} = F_c(c) \cdot G_b(f(c)) = f'(c) \cdot g'(f(c)),$$
as required.

The composite rule is often called the chain rule and the formula given for the derivative of a composite is more suggestive in Leibnitz notation. Let $\Delta x = h$ and $\Delta y = f(x + h) - f(x)$. Then
$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}.$$

The Leibnitz notation for this limit is $\dfrac{dy}{dx}$. Write $y = g(u)$ where $u = f(x)$. Then $f'(x) = \dfrac{du}{dx}$ and $g'(f(x)) = \dfrac{dy}{du}$ and $(g \circ f)'(x) = \dfrac{dy}{dx}$. The chain rule can now be written as
$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}.$$
Example 34.3. Show that $h(x) = \sin x^2$ is differentiable.

Solution: Let $g(x) = \sin x$ and $f(x) = x^2$, then $h = g \circ f$. Since $f$ and $g$ are everywhere differentiable, the composite rule gives that $h$ is differentiable. Moreover,

$$h'(x) = g'(f(x)) \cdot f'(x) = 2x \cos x^2.$$
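
The value given by the composite rule for $h(x) = \sin x^2$, namely $g'(f(x)) \cdot f'(x) = \cos(x^2) \cdot 2x$, can be compared with a numerical derivative. The sketch below is a sanity check under assumptions, not part of the original notes: the central difference quotient, its step size and the function names are illustrative choices.

```python
import math


def h(x):
    """h(x) = sin(x^2), the composite of g(u) = sin(u) and f(x) = x^2."""
    return math.sin(x * x)


def h_prime(x):
    """Derivative from the composite rule: cos(x^2) * 2x."""
    return 2.0 * x * math.cos(x * x)


def central_difference(func, x, step=1e-5):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 1.0, 1.7):
        print(x, h_prime(x), central_difference(h, x))
```

The argument of the cosine is $x^2$, not $x$: at $x = 1.7$ the two candidates $2x \cos x^2$ and $2x \cos x$ differ by more than 2, and only the former matches the difference quotient.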

Inverse rule Suppose that $f : A \to B$ is a continuous bijection where $A$ and $B$ are intervals. If $f$ is differentiable at $a \in A$ and $f'(a) \ne 0$, then $f^{-1}$ is differentiable at $b = f(a)$ and
$$(f^{-1})'(b) = \frac{1}{f'(a)}.$$

Proof. For $a \in A$, let
$$F_a(x) = \begin{cases} \dfrac{f(x) - f(a)}{x - a} & \text{if } x \ne a \\ f'(a) & \text{if } x = a. \end{cases}$$
$F_a$ is continuous at $x = a$ and for all $x \in A$
$$f(x) = f(a) + (x - a) \cdot F_a(x).$$
Given $f(a) = b$ and letting $f(x) = y$ for $x \in A$ we have
$$F_a(x) = \frac{y - b}{x - a} \quad \text{for } x \ne a.$$
Consider
$$G_b(y) = \frac{x - a}{y - b} \quad \text{for } y \ne b,$$
so
$$G_b(y) = \frac{x - a}{f(x) - f(a)} = \frac{1}{F_a(x)} = \frac{1}{(F_a \circ f^{-1})(y)} \quad \text{for } y \ne b.$$
Since $f$ is bijective and continuous, $f^{-1}$ is continuous too. Also $f^{-1}(b) = a$ and $F_a$ is continuous at $x = a$.
Hence $F_a \circ f^{-1}$ is continuous at $y = b$ and
$$(F_a \circ f^{-1})(b) = F_a(f^{-1}(b)) = F_a(a) = f'(a) = (f' \circ f^{-1})(b) \ne 0.$$
So
$$\frac{x - a}{y - b} = G_b(y) = \frac{1}{(F_a \circ f^{-1})(y)} \to \frac{1}{(F_a \circ f^{-1})(b)} = \frac{1}{f'(a)} \quad \text{as } y \to b.$$
In other words
$$\lim_{y \to b} \frac{f^{-1}(y) - f^{-1}(b)}{y - b} = \left(\frac{1}{f' \circ f^{-1}}\right)(b) = \frac{1}{f'(a)}.$$

If the function $f : A \to f(A)$ is differentiable on the interval $A$ and $f'(a) \ne 0$ for any $a \in A$, then $f^{-1}$ is differentiable on $f(A)$ and $(f^{-1})'(f(a)) = \dfrac{1}{f'(a)}$.
Example 34.4. The function $f : (0, \infty) \to (0, \infty)$ given by $f(x) = x^2$ is a bijection. Its inverse function is given by $f^{-1}(x) = \sqrt{x}$. Now $f$ is differentiable for $x > 0$ and $f'(x) = 2x \ne 0$. Hence, by the inverse rule, $f^{-1}(x) = \sqrt{x}$ is differentiable and
$$(f^{-1})'(x) = \frac{1}{2\sqrt{x}}.$$

Example 34.5. The function $g : \left(-\dfrac{\pi}{2}, \dfrac{\pi}{2}\right) \to (-1, 1)$ given by $g(x) = \sin x$ is a bijection with inverse given by $g^{-1}(x) = \arcsin x$. Now $g$ is differentiable and

$$(g' \circ g^{-1})(x) = \cos(\arcsin x) \quad \text{for } |x| < 1.$$

Hence, by the inverse rule, $g^{-1}$ is differentiable and
$$(g^{-1})'(x) = \frac{1}{\sqrt{1 - x^2}} \quad \text{for } |x| < 1.$$

35 Local extremum

In this section we present a result which helps to locate the local maxima and minima of
a differentiable function.

Definition 35.1. A function $f$ has a local maximum value at $c$ if $c$ is contained in some open interval $I$ for which $f(x) \le f(c)$ for each $x \in I$. If $f(x) \ge f(c)$ for each $x \in I$, then $f$ has a local minimum value at $c$.

Theorem 35.1 (Local extremum theorem, Fermat). If $f$ is differentiable at $c$ and possesses a local maximum or a local minimum at $c$, then $f'(c) = 0$.

Proof. Consider the case of a local minimum at $x = c$. There is an open interval $I$ such that $f(x) - f(c) \ge 0$ for all $x \in I$. If $x > c$, then $\dfrac{f(x) - f(c)}{x - c} \ge 0$ and if $x < c$, then $\dfrac{f(x) - f(c)}{x - c} \le 0$. Thus $f'_+(c) \ge 0$ and $f'_-(c) \le 0$. But $f'(c)$ exists and so $f'_+(c) = f'_-(c)$.
Thus $f'(c) = 0$.

Note that although $f'$ must vanish at a local extremum, this is not sufficient for such a point. For example, consider the behavior of $f(x) = x^3$ at $x = 0$. Here $f'(0) = 0$, but 0 is neither a local maximum nor a local minimum.

Example 35.1. Locate the local maxima and local minima of the function

$$f(x) = x(x - 1)(x - 2).$$
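
No solution is given in the notes; the following sketch is one possible way to explore the example and is not the authors' solution. Expanding gives $f(x) = x^3 - 3x^2 + 2x$, so $f'(x) = 3x^2 - 6x + 2$, and by the local extremum theorem the only candidates are the roots $x = 1 \pm \frac{1}{\sqrt{3}}$. The helper names and the sampling radius below are assumptions, and comparing with sampled neighbours is only a heuristic, not a proof.

```python
import math


def f(x):
    return x * (x - 1.0) * (x - 2.0)


def stationary_points():
    """Roots of f'(x) = 3x^2 - 6x + 2, from the quadratic formula."""
    root = math.sqrt(6.0 ** 2 - 4.0 * 3.0 * 2.0)
    return (6.0 - root) / 6.0, (6.0 + root) / 6.0


def compare_with_neighbours(c, radius=1e-3, samples=20):
    """Return 'max', 'min' or 'neither' by sampling f on both sides of c."""
    offsets = [radius * k / samples for k in range(1, samples + 1)]
    values = [f(c + d) for d in offsets] + [f(c - d) for d in offsets]
    if all(v <= f(c) for v in values):
        return "max"
    if all(v >= f(c) for v in values):
        return "min"
    return "neither"


if __name__ == "__main__":
    for c in stationary_points():
        print(c, f(c), compare_with_neighbours(c))
```

The sampling suggests a local maximum at $1 - \frac{1}{\sqrt{3}}$ and a local minimum at $1 + \frac{1}{\sqrt{3}}$.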

36 Theorems concerning basic properties of differentiable functions

This section establishes some basic properties of differentiable functions.

Theorem 36.1 (Rolle’s theorem). Let $f$ be differentiable on $(a, b)$ and continuous on $[a, b]$. If $f(a) = f(b)$ then there exists $c \in (a, b)$ such that $f'(c) = 0$.

Proof. Since $f$ is continuous on $[a, b]$, it attains a maximum value $f(c_1)$ and a minimum value $f(c_2)$ on $[a, b]$ by the boundedness property.
If $f(c_1) = f(c_2)$, then $f$ is constant on $[a, b]$, hence $f'(x) = 0$ for all $x \in (a, b)$ and the result follows.
If $f(c_1) \ne f(c_2)$, then, since $f(a) = f(b)$, at least one of $c_1$ and $c_2$ is neither $a$ nor $b$. Hence $f$ has a local maximum or minimum (or both) inside the interval $(a, b)$.
By the local extremum theorem, $f'$ is zero at at least one point of $(a, b)$.
Theorem 36.2 (Mean value theorem, Lagrange). Let $f$ be differentiable on $(a, b)$ and continuous on $[a, b]$. Then there exists $c \in (a, b)$ such that
$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$

Proof. Let $g(x) = f(x) - \lambda x$, where $\lambda = \dfrac{f(b) - f(a)}{b - a}$. $g$ is differentiable on $(a, b)$ and continuous on $[a, b]$. The choice of $\lambda$ means that $g(a) = g(b)$. Applying Rolle’s theorem, there is $c \in (a, b)$ such that $g'(c) = 0$. Hence $f'(c) - \lambda = 0$, so $f'(c) = \dfrac{f(b) - f(a)}{b - a}$.
Theorem 36.3 (the increasing-decreasing theorem). If $f$ is differentiable on $(a, b)$ and continuous on $[a, b]$ then

(1) $f'(x) > 0$ for all $x \in (a, b)$ implies $f$ is strictly increasing on $[a, b]$;
(2) $f'(x) < 0$ for all $x \in (a, b)$ implies $f$ is strictly decreasing on $[a, b]$;
(3) $f'(x) = 0$ for all $x \in (a, b)$ implies $f$ is constant on $[a, b]$.

Proof. Let $x_1, x_2 \in [a, b]$ with $x_1 < x_2$. Since $f$ satisfies the hypotheses of the mean value theorem on the interval $[x_1, x_2]$ we have
$$\frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c)$$
for some $c$, $x_1 < c < x_2$. In case (1), $f'(c) > 0$ and so $f(x_2) > f(x_1)$. In other words, $f$ is strictly increasing on $[a, b]$.
The proofs of (2) and (3) are similar.
Comment 36.1. The increasing-decreasing theorem is useful for finding and classifying
local extrema and establishing inequalities between functions.
Example 36.1. Find and describe the local extrema of $f(x) = x^2 \cdot e^{-x}$.

Solution: $f$ is everywhere differentiable and

$$f'(x) = e^{-x} (2 - x) \cdot x.$$

Local extrema occur only when $f'(x) = 0$ and so $x = 0$ or $x = 2$. Since $e^{-x} > 0$, if $x < 0$ then $f'(x) < 0$, if $x \in (0, 2)$ then $f'(x) > 0$ and if $x > 2$ then $f'(x) < 0$. Thus $f$ is decreasing on $(-\infty, 0)$, increasing on $(0, 2)$ and decreasing again on $(2, +\infty)$. This means that $x = 0$ gives a local minimum and $x = 2$ gives a local maximum of $f$.

Example 36.2. Prove that $e^x \ge 1 + x$ for all $x$.

Solution: Let $f(x) = e^x - 1 - x$. $f$ is differentiable and $f'(x) = e^x - 1$. Hence $f'(x) > 0$ for $x > 0$ and so $f(x) > f(0) = 0$ for $x > 0$. Since $f'(x) < 0$ for $x < 0$ it follows that $f(x) > f(0) = 0$ for $x < 0$. Finally we obtain that $f(x) \ge 0$ for all $x \in \mathbb{R}^1$. That means $e^x \ge 1 + x$.
The next result is difficult to interpret geometrically but it will be needed to prove l’Hôspital’s rule. This is a rule which is well suited to the evaluation of limits of the form
$$\lim_{x \to x_0} \frac{f(x)}{g(x)}, \quad \text{where } f(x_0) = g(x_0) = 0.$$

Theorem 36.4 (Cauchy’s mean value theorem). Let $f$ and $g$ be differentiable on $(a, b)$ and continuous on $[a, b]$. Then there exists $c \in (a, b)$ such that
$$\frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)}$$
provided that $g'(x) \ne 0$ for all $x \in (a, b)$.

Proof. First note that $g(a) \ne g(b)$, otherwise Rolle’s theorem applied to $g$ on $[a, b]$ would mean that $g'$ vanished somewhere on $(a, b)$. Let $h(x) = f(x) - \lambda g(x)$ where
$$\lambda = \frac{f(b) - f(a)}{g(b) - g(a)}.$$
By the sum and product rules for continuity and differentiability and our choice of $\lambda$, $h$ satisfies all the hypotheses of Rolle’s theorem. Hence there is $c \in (a, b)$ such that $h'(c) = 0$. This gives $f'(c) = \lambda \cdot g'(c)$ and the result now follows.
Theorem 36.5 (l’Hôspital’s rule, version A). Let $f$ and $g$ satisfy the hypotheses of Cauchy’s mean value theorem and let $x_0$ satisfy $x_0 \in (a, b)$. If $f(x_0) = g(x_0) = 0$, then
$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \lim_{x \to x_0} \frac{f'(x)}{g'(x)}$$
provided that the latter limit exists.

Proof. Apply Cauchy’s mean value theorem to $f$ and $g$ on the interval $[x_0, x]$ where $x_0 < x \le b$. Hence there exists $c$, $x_0 < c < x$, such that
$$\frac{f'(c)}{g'(c)} = \frac{f(x) - f(x_0)}{g(x) - g(x_0)} = \frac{f(x)}{g(x)}.$$
Now, since $c \to x_0^+$ as $x \to x_0^+$,
$$\lim_{x \to x_0^+} \frac{f(x)}{g(x)} = \lim_{c \to x_0^+} \frac{f'(c)}{g'(c)} = \lim_{x \to x_0^+} \frac{f'(x)}{g'(x)}$$
provided that the latter exists.
A similar argument applied on the interval $[x, x_0]$ where $a \le x < x_0$ gives that
$$\lim_{x \to x_0^-} \frac{f(x)}{g(x)} = \lim_{x \to x_0^-} \frac{f'(x)}{g'(x)}$$
again provided that the latter exists.
The rule now follows.

Example 36.3. Show that
$$\lim_{x \to 0} \frac{\sin x}{x} = 1.$$

Solution: The functions $f(x) = \sin x$ and $g(x) = x$ satisfy the hypotheses of l’Hôspital’s rule. Moreover
$$\lim_{x \to 0} \frac{\sin x}{x} = \lim_{x \to 0} \frac{\cos x}{1} = 1.$$
Example 36.4. Show that
$$\lim_{x \to 0} (1 + x)^{\frac{1}{x}} = e.$$

Solution: Via the composite rule for limits of functions we have
$$\ln\left(\lim_{x \to 0} (1 + x)^{\frac{1}{x}}\right) = \lim_{x \to 0} \ln\left((1 + x)^{\frac{1}{x}}\right) = \lim_{x \to 0} \frac{\ln(1 + x)}{x}.$$
By l’Hôspital’s rule
$$\lim_{x \to 0} \frac{\ln(1 + x)}{x} = \lim_{x \to 0} \frac{1}{x + 1} = 1.$$

Hence
$$\lim_{x \to 0} (1 + x)^{\frac{1}{x}} = e.$$
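
The limit in Example 36.4 can be checked numerically by evaluating $(1 + x)^{1/x}$ at points approaching $0$ from both sides. The sketch below is an illustration, not part of the original notes; the function name `power_expression` and the sample points are assumptions, and floating-point evaluation becomes unreliable if $|x|$ is taken extremely small.

```python
import math


def power_expression(x):
    """Value of (1 + x)^(1/x) for x != 0 and x > -1."""
    return (1.0 + x) ** (1.0 / x)


if __name__ == "__main__":
    for x in (0.1, 0.01, 1e-4, 1e-6, -1e-6, -1e-4, -0.01):
        print(x, power_expression(x), power_expression(x) - math.e)
```

The differences from $e$ shrink roughly in proportion to $|x|$, consistent with the limit being $e$ from both sides.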

L’Hôspital’s rule can be used to evaluate many indeterminate limits once they have been expressed as the limit of a quotient of differentiable functions, provided of course that $\lim\limits_{x \to x_0} \dfrac{f'(x)}{g'(x)}$ can be evaluated.

Often this final limit is itself indeterminate (in other words $f'(x_0) = g'(x_0) = 0$) and it may be tempting to apply l’Hôspital’s rule again. But this requires that $f'$ and $g'$ are themselves differentiable.

37 Higher-order derivatives and differentials

Definition 37.1. If the derived function $f'$ of a given differentiable function $f$ is itself differentiable, it is said that $f$ is twice differentiable and $f''$ or $f^{(2)}$ denotes $(f')'$, its second derivative.
Definition 37.2. In general, $f$ is $n$ times differentiable if $f$ is $(n - 1)$ times differentiable and its $(n - 1)$-th derivative is differentiable.
The $n$-th derivative $(f^{(n-1)})'$ is denoted by $f^{(n)}$. If moreover $f^{(n)}$ is a continuous function, then $f$ is said to be $n$ times continuously differentiable.
Example 37.1.

1. If $f(x) = x^m$, $m \in \mathbb{N}$, then
$$f^{(n)}(x) = \begin{cases} \dfrac{m!}{(m - n)!} \cdot x^{m-n} & \text{for } n \le m \\ 0 & \text{for } n > m. \end{cases}$$

2. If $f(x) = \sin x$ then $f^{(n)}(x) = \sin\left(x + \dfrac{n\pi}{2}\right)$ for $n \ge 1$.

3. If $f(x) = \ln x$ then $f^{(n)}(x) = \dfrac{(-1)^{n-1} \cdot (n - 1)!}{x^n}$ for $n \ge 1$.

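The successive derivatives of a product of two functions are:

$$(f \cdot g)' = f' \cdot g + f \cdot g' \qquad (37.1)$$
$$(f \cdot g)^{(2)} = f^{(2)} \cdot g + 2 \cdot f' \cdot g' + f \cdot g^{(2)} \qquad (37.2)$$
$$(f \cdot g)^{(3)} = f^{(3)} \cdot g + 3 \cdot f^{(2)} \cdot g' + 3 \cdot f' \cdot g^{(2)} + f \cdot g^{(3)}. \qquad (37.3)$$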

Generally the following formula applies. This can be proved by induction on $n$.

Theorem 37.1 (Leibnitz formula). Let $f$ and $g$ be $n$ times continuously differentiable. Then $h = f \cdot g$ is also $n$ times continuously differentiable and:
$$h^{(n)} = \sum_{k=0}^{n} C_n^k \cdot f^{(n-k)} \cdot g^{(k)}.$$
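
For polynomials the Leibnitz formula can be verified exactly with integer arithmetic. The sketch below is an illustration, not part of the original notes; polynomials are represented as coefficient lists (entry $i$ is the coefficient of $x^i$), and all helper names are assumptions. Here $C_n^k$ is computed with `math.comb`.

```python
from math import comb


def derivative(p, order=1):
    """Differentiate the polynomial p (coefficients of x**i) `order` times."""
    for _ in range(order):
        p = [i * c for i, c in enumerate(p)][1:]
    return p if p else [0]


def multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def leibniz(f, g, n):
    """Right-hand side of the formula: sum over k of C(n, k) f^(n-k) g^(k)."""
    total = []
    for k in range(n + 1):
        term = multiply(derivative(f, n - k), derivative(g, k))
        term = [comb(n, k) * c for c in term]
        size = max(len(total), len(term))
        total = [
            (total[i] if i < len(total) else 0) + (term[i] if i < len(term) else 0)
            for i in range(size)
        ]
    return trim(total)


if __name__ == "__main__":
    f = [1, 2, 0, 3]   # 1 + 2x + 3x^3
    g = [0, 1, 4]      # x + 4x^2
    for n in range(4):
        print(n, trim(derivative(multiply(f, g), n)), leibniz(f, g, n))
```

Both columns agree for every $n$, including the orders at which the product has been differentiated down to zero.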

Theorem 37.2 (l’Hôspital’s rule, version B). Let $f$ and $g$ be $n$ times continuously differentiable on the interval $(a, b)$ and let $x_0$ satisfy $a < x_0 < b$. If

$$f^{(k)}(x_0) = g^{(k)}(x_0) = 0 \quad \text{for } 0 \le k \le n - 1$$

and
$$g^{(n)}(x_0) \ne 0$$
then
$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \frac{f^{(n)}(x_0)}{g^{(n)}(x_0)}.$$
Example 37.2. Prove that
$$\lim_{x \to 0} \frac{1 - \cos x}{x^2} = \frac{1}{2}$$
and
$$\lim_{x \to 0} \left(\frac{1}{x} - \cot x\right) = 0.$$

38 Taylor polynomials

Suppose that $\sum_{n=0}^{\infty} a_n x^n$ is a power series with radius of convergence $R > 0$. Let $f(x)$ be the sum of the series for $|x| < R$.
It can be proved that $f$ is differentiable and that
$$f'(x) = \sum_{n=1}^{\infty} a_n \cdot n \cdot x^{n-1} \quad \text{for } |x| < R.$$

Continually differentiating in this manner leads to
$$f^{(k)}(x) = \sum_{n=k}^{\infty} a_n \cdot n \cdot (n - 1) \cdot \ldots \cdot (n - k + 1) \cdot x^{n-k} \quad \text{for } |x| < R.$$

If $x = 0$, then
$$a_k = \frac{f^{(k)}(0)}{k!} \quad \text{for } k = 1, 2, \ldots$$

Thus the coefficient of $x^n$ in any power series is $\dfrac{f^{(n)}(0)}{n!}$, where $f(x)$ is the sum of the given power series. Hence
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} \cdot x^n \quad \text{for } |x| < R.$$

For small values of $x$, the sum $f(x)$ can be approximated by the polynomial
$$f(0) + \frac{f'(0)}{1!} x + \frac{f^{(2)}(0)}{2!} x^2 + \ldots + \frac{f^{(N)}(0)}{N!} x^N$$
for any value of $N$.
This section investigates how good an approximation this polynomial is when f (x) is not
necessarily the sum of a given power series.
Definition 38.1. Let $f$ be an $n$ times continuously differentiable function on a neighborhood of 0. The Taylor polynomial of degree $n$ for $f$ at 0 is defined by:
$$T_n f(x) = f(0) + \frac{f'(0)}{1!} x + \frac{f^{(2)}(0)}{2!} x^2 + \ldots + \frac{f^{(n)}(0)}{n!} x^n.$$
Example 38.1. Let $f(x) = e^x$. For $k = 0, 1, 2, \ldots$ we have $f^{(k)}(0) = 1$. Thus
$$T_0 f(x) = 1$$
$$T_1 f(x) = 1 + x$$
$$T_2 f(x) = 1 + x + \frac{1}{2} x^2$$
and so on.

The first result provides an estimate for the difference between f (b), the value of a given
function at x = b, and Tn f (b), the value of its Taylor polynomial of degree n at x = b.
Theorem 38.1 (The first remainder theorem). Let $f$ be $(n + 1)$ times continuously differentiable on an open interval containing the points 0 and $b$. Then the difference between $f$ and $T_n f$ at $x = b$ is given by
$$f(b) - T_n f(b) = \frac{b^{n+1}}{(n + 1)!} \cdot f^{(n+1)}(c)$$
for some $c$ between 0 and $b$.

Proof. For simplicity assume that $b > 0$. Let
$$h_n(x) = f(b) - \sum_{k=0}^{n} \frac{f^{(k)}(x)}{k!} (b - x)^k, \quad x \in [0, b].$$

Then $h_n(b) = 0$ and $h_n(0) = f(b) - T_n f(b)$. Let
$$g(x) = h_n(x) - \left(\frac{b - x}{b}\right)^{n+1} \cdot h_n(0), \quad x \in [0, b].$$
The function $g$ is continuous on $[0, b]$ and differentiable on $(0, b)$ and $g(0) = g(b) = 0$.
Hence by Rolle’s theorem $g'(c) = 0$ for some $c$ between 0 and $b$. Now
$$h_n'(x) = -\frac{f^{(n+1)}(x)}{n!} (b - x)^n$$
after a straightforward calculation. Thus
$$g'(x) = -\frac{f^{(n+1)}(x)}{n!} (b - x)^n + \frac{(n + 1)(b - x)^n}{b^{n+1}} \cdot h_n(0)$$
and so
$$0 = g'(c) = -\frac{f^{(n+1)}(c)}{n!} (b - c)^n + \frac{(n + 1)(b - c)^n}{b^{n+1}} \cdot h_n(0)$$
leading to
$$h_n(0) = \frac{b^{n+1}}{(n + 1)!} \cdot f^{(n+1)}(c).$$

Denote $f(b) - T_n f(b)$ by $R_n f(b)$ and call it the remainder term at $x = b$. Thus

$$f(b) = T_n f(b) + R_n f(b)$$

and so the error in approximating $f(b)$ by $T_n f(b)$ is given by the remainder term $R_n f(b)$.
Since $f^{(n+1)}$ is continuous on a closed interval containing 0 and $b$, it is bounded on that interval.
So there exists a number $M$ such that $|f^{(n+1)}(c)| \le M$ and so
$$|R_n f(b)| \le \left|\frac{b^{n+1}}{(n + 1)!}\right| \cdot M.$$
Thus, for a fixed n, the remainder term will be small for b close to zero. In other words
Taylor polynomials provide good approximations of the function near x = 0. The next
example illustrates this.
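Example 38.2. Let $f(x) = \sin x$. Then
$$T_7 f(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!}, \qquad R_7 f(x) = \frac{x^8}{8!} \sin c$$
for some $c$ between 0 and $x$ (here $f^{(8)}(x) = \sin x$). By the first remainder theorem, since $|\sin c| \le 1$,
$$|R_7 f(0.1)| \le \frac{0.1^8}{8!} = 2.48 \cdot 10^{-13}.$$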
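
The bound in this example can be compared with the actual error. The sketch below is an illustration, not part of the original notes; the function name `taylor_sin` is an assumption, and `math.sin` is used as the reference value.

```python
import math


def taylor_sin(x, degree):
    """Taylor polynomial of sin at 0, keeping the odd powers up to `degree`."""
    return sum(
        (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
        for k in range((degree + 1) // 2)
    )


def lagrange_bound(x, degree):
    """|x|^(degree + 1) / (degree + 1)!, using |f^(degree + 1)(c)| <= 1 for sin."""
    return abs(x) ** (degree + 1) / math.factorial(degree + 1)


if __name__ == "__main__":
    for x in (0.1, 0.5, 1.0):
        error = abs(math.sin(x) - taylor_sin(x, 7))
        print(x, error, lagrange_bound(x, 7))
```

The observed error stays below the bound, and is in fact smaller, because the coefficient of $x^8$ in the series of $\sin x$ vanishes.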

It can now be shown how Taylor polynomials can be used to generate power series expansions for functions $f$ which are infinitely differentiable on an open interval containing 0 and $x$. For such $x$ we have:
$$f(x) = T_n f(x) + R_n f(x).$$
Now
$$\lim_{n \to \infty} T_n f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} \cdot x^k \quad \text{for } |x| < R$$
where $R$ is the radius of convergence of the resulting power series.
If it can be shown that $\lim\limits_{n \to \infty} R_n f(x) = 0$ for $|x| < R' \le R$ for some $R'$, then
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} \cdot x^n \quad \text{for } |x| < R'.$$

This power series is called the Maclaurin series for $f(x)$.

Example 38.3. Derive the Maclaurin series for $f(x) = e^x$.

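Solution: Firstly
$$T_n f(x) = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \ldots + \frac{x^n}{n!}$$
and $R_n f(x) = \dfrac{x^{n+1}}{(n + 1)!} e^c$ for some $c$ between 0 and $x$. Now the series $\sum_{n=0}^{\infty} \dfrac{x^n}{n!}$ is absolutely convergent for all $x$ by the ratio test, so for any fixed real number $x$
$$\lim_{n \to \infty} T_n f(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!}.$$

By the vanishing condition, $\dfrac{x^n}{n!} \to 0$ as $n \to \infty$. Since $e^c \le e^{|x|}$, it follows that
$$|R_n f(x)| = \left|\frac{x^{n+1}}{(n + 1)!} e^c\right| \to 0 \quad \text{as } n \to \infty$$
for fixed $x$. Hence, for any $x$, $\lim\limits_{n \to \infty} R_n f(x) = 0$, and therefore
$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \ldots + \frac{x^n}{n!} + \ldots$$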

In a similar manner, power series can be generated for all the standard functions. In each case the sum of the power series is just $\lim\limits_{n \to \infty} T_n f(x)$ and the range of validity is precisely those $x$ for which:

a) the resulting power series converges and

b) the remainder Rn f (x) → 0 as n → ∞.

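In deriving the following list, the trickiest part is establishing b):

$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}, \quad x \in \mathbb{R}^1 \qquad (38.1)$$
$$\sin x = \sum_{n=0}^{\infty} \frac{(-1)^n \cdot x^{2n+1}}{(2n + 1)!}, \quad x \in \mathbb{R}^1 \qquad (38.2)$$
$$\cos x = \sum_{n=0}^{\infty} \frac{(-1)^n \cdot x^{2n}}{(2n)!}, \quad x \in \mathbb{R}^1 \qquad (38.3)$$
$$(1 + x)^t = 1 + \sum_{n=1}^{\infty} \frac{t \cdot (t - 1) \cdot \ldots \cdot (t - n + 1)}{n!} x^n, \quad t \notin \mathbb{N},\ x \in (-1, 1) \qquad (38.4)$$
$$\ln(1 + x) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1} \cdot x^n}{n}, \quad -1 < x \le 1. \qquad (38.5)$$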

The form of the remainder found in the first remainder theorem is called the Lagrange
form.
The first few terms of the Maclaurin series for a given function f provide a good
approximation of f (x) for x close to 0.
But what happens when approximations for x close to some other real number a are
required? Polynomials must be considered not in powers of x but in powers of (x − a).

Definition 38.2. Let $f$ be $n$ times continuously differentiable on an open interval containing a fixed real number $a$. Define the Taylor polynomial of degree $n$ for $f$ at $a$ by
$$T_{n,a} f(x) = f(a) + \frac{f'(a)}{1!} (x - a) + \frac{f^{(2)}(a)}{2!} (x - a)^2 + \ldots + \frac{f^{(n)}(a)}{n!} (x - a)^n.$$

The first remainder theorem can now be generalized.

Theorem 38.2 (Taylor’s theorem). Let $f$ be $(n + 1)$ times continuously differentiable on an open interval containing the points $a$ and $b$. Then the difference between $f$ and $T_{n,a} f$ at $b$ is given by
$$f(b) - T_{n,a} f(b) = \frac{(b - a)^{n+1}}{(n + 1)!} f^{(n+1)}(c)$$
for some $c$ between $a$ and $b$.

Proof. For each $t$ between $a$ and $b$
$$f(b) = f(t) + \frac{f'(t)}{1!} (b - t) + \ldots + \frac{f^{(n)}(t)}{n!} (b - t)^n + F(t)$$
where
$$F(t) = R_{n,t} f(b) = f(b) - T_{n,t} f(b).$$

Differentiating with respect to $t$ gives:
$$0 = f'(t) + \left(-f'(t) + \frac{f^{(2)}(t)}{1!} (b - t)\right) + \left(-\frac{f^{(2)}(t)}{1!} (b - t) + \frac{f^{(3)}(t)}{2!} (b - t)^2\right) + \ldots + \left(-\frac{f^{(n)}(t)}{(n - 1)!} (b - t)^{n-1} + \frac{f^{(n+1)}(t)}{n!} (b - t)^n\right) + F'(t).$$
Cancellation now gives that
$$F'(t) = -\frac{f^{(n+1)}(t)}{n!} (b - t)^n.$$
Apply Cauchy’s mean value theorem to the functions $F$ and $G$ on the interval with endpoints $a$ and $b$, where $G(t) = (b - t)^{n+1}$. Thus, there is a number $c$ between $a$ and $b$ such that
$$\frac{F(b) - F(a)}{G(b) - G(a)} = \frac{F'(c)}{G'(c)} = \frac{-\dfrac{f^{(n+1)}(c)}{n!} (b - c)^n}{-(n + 1)(b - c)^n}.$$
Hence, since $F(b) = 0$, $F(a) = f(b) - T_{n,a} f(b)$, $G(b) = 0$ and $G(a) = (b - a)^{n+1}$,
$$\frac{-(f(b) - T_{n,a} f(b))}{-(b - a)^{n+1}} = \frac{\dfrac{f^{(n+1)}(c)}{n!} (b - c)^n}{(n + 1)(b - c)^n}$$
or
$$f(b) - T_{n,a} f(b) = \frac{(b - a)^{n+1}}{(n + 1)!} f^{(n+1)}(c).$$

The error in approximating $f(b)$ by the polynomial $T_{n,a} f(b)$ is just the remainder term:
$$R_{n,a} f(b) = \frac{(b - a)^{n+1}}{(n + 1)!} f^{(n+1)}(c)$$
where $c$ lies between $a$ and $b$. The approximation is good for $b$ close to $a$.
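
As an illustration of the remainder term, consider $f(x) = \ln x$ expanded at a point $a > 0$. By Example 37.1, $f^{(k)}(a) = (-1)^{k-1} (k - 1)! / a^k$, so $T_{n,a} f(x) = \ln a + \sum_{k=1}^{n} \frac{(-1)^{k-1}}{k \, a^k} (x - a)^k$, and since $|f^{(n+1)}(c)| \le n! / \min(a, b)^{n+1}$ for $c$ between $a$ and $b$, the remainder satisfies $|R_{n,a} f(b)| \le \frac{|b - a|^{n+1}}{(n + 1) \min(a, b)^{n+1}}$. The sketch below is not part of the original notes; the function names and sample values are assumptions.

```python
import math


def taylor_ln(x, a, n):
    """Taylor polynomial of degree n for ln at the point a > 0."""
    return math.log(a) + sum(
        (-1) ** (k - 1) * (x - a) ** k / (k * a ** k) for k in range(1, n + 1)
    )


def remainder_bound(b, a, n):
    """Bound on |R_{n,a} f(b)| for f = ln, from the Lagrange form of the remainder."""
    smallest = min(a, b)
    return abs(b - a) ** (n + 1) / ((n + 1) * smallest ** (n + 1))


if __name__ == "__main__":
    a, n = 2.0, 5
    for b in (1.5, 1.9, 2.1, 2.5):
        error = abs(math.log(b) - taylor_ln(b, a, n))
        print(b, error, remainder_bound(b, a, n))
```

Both the error and the bound shrink quickly as $b$ approaches $a$, in line with the remark that the approximation is good for $b$ close to $a$.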
Just as before, power series can be generated in powers of (x − a), called Taylor series,
for suitable functions of x. The range of validity is again those x for which

a) the resulting power series converges, and


b) Rn,a f (x) → 0 as n → ∞.
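
The size of the remainder can be checked numerically. The following is a minimal sketch, assuming $f = \exp$, $a = 1$ and $b = 1.5$ (choices made here for illustration only; the helper names are hypothetical). Since every derivative of $\exp$ is $\exp$, the remainder is bounded by $\frac{|b-a|^{n+1}}{(n+1)!}\, e^{\max(a,b)}$.

```python
import math

def taylor_exp(x, a, n):
    """Taylor polynomial T_{n,a} of exp at x; all derivatives of exp at a equal e^a."""
    return sum(math.exp(a) * (x - a) ** k / math.factorial(k) for k in range(n + 1))

def remainder_bound(x, a, n):
    """Bound for |R_{n,a} exp(x)|: |x-a|^(n+1)/(n+1)! times the maximum of e^c
    for c between a and x."""
    return abs(x - a) ** (n + 1) / math.factorial(n + 1) * math.exp(max(a, x))

a, b = 1.0, 1.5  # illustrative expansion point and evaluation point
errors = [abs(math.exp(b) - taylor_exp(b, a, n)) for n in range(1, 7)]
bounds = [remainder_bound(b, a, n) for n in range(1, 7)]
for n, (err, bound) in enumerate(zip(errors, bounds), start=1):
    print(f"n={n}  error={err:.3e}  bound={bound:.3e}")
```

In this run the actual error stays below the bound for each $n$ and shrinks as $n$ grows, in line with condition b).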

39 Classification theorem for local extrema

A form of Taylor’s theorem much used in numerical analysis is derived below and used to
round off the investigation of local extrema.
From Taylor’s theorem, it follows that
\[ f(x) = T_{n,a}f(x) + R_{n,a}f(x) = f(a) + \frac{f'(a)}{1!}(x-a) + \frac{f^{(2)}(a)}{2!}(x-a)^2 + \ldots + \frac{f^{(n)}(a)}{n!}(x-a)^n + \frac{(x-a)^{n+1}}{(n+1)!}\, f^{(n+1)}(c) \]
for some $c$ between $a$ and $x$.
Let $x - a = h$. Then $c$ lies between $a$ and $a + h$. Thus $c = a + \theta \cdot h$ for some $\theta \in (0, 1)$.
Hence the following result holds:

\[ f(a+h) = f(a) + \frac{h}{1!} f'(a) + \ldots + \frac{h^n}{n!} f^{(n)}(a) + \frac{h^{n+1}}{(n+1)!}\, f^{(n+1)}(a + \theta \cdot h) \]

for some $\theta \in (0, 1)$.


This expression emphasizes that the value of f at a + h is determined by the values of f
and its derivatives at a with θ measuring the degree of indeterminacy.
When the extrema of a function $f$ which has a stationary point at $x = a$ (i.e. $f'(a) = 0$)
are investigated, the sign of $f(a+h) - f(a)$ must be determined for all small $h$.
The above expression relates $f(a+h) - f(a)$ to the derivatives of $f$ at $a$, thus enabling
the following to be proved.

Theorem 39.1 (Classification theorem for local extrema). If $f$ is $(n+1)$ times
continuously differentiable on a neighborhood of $a$ and $f^{(k)}(a) = 0$ for $k = 1, 2, \ldots, n$ (in
particular $f'(a) = 0$ and so $x = a$ is a stationary point of $f$) and $f^{(n+1)}(a) \neq 0$ then:

(1) $n + 1$ even and $f^{(n+1)}(a) > 0$ implies that $f$ has a local minimum at $x = a$.

(2) $n + 1$ even and $f^{(n+1)}(a) < 0$ implies that $f$ has a local maximum at $x = a$.

(3) $n + 1$ odd implies that $f$ has neither a local maximum nor a local minimum at $x = a$.

Proof. Since $f^{(k)}(a) = 0$ for $k = 1, 2, \ldots, n$

\[ f(a+h) - f(a) = \frac{h^{n+1}}{(n+1)!}\, f^{(n+1)}(a + \theta h) \]

where $0 < \theta < 1$. Since $f^{(n+1)}(a) \neq 0$ and $f^{(n+1)}$ is continuous, there is a $\delta > 0$ such that
$f^{(n+1)}(x) \neq 0$ for $|x - a| < \delta$. Thus for all $h$ satisfying $|h| < \delta$, $f^{(n+1)}(a + \theta h)$ has the
same sign as $f^{(n+1)}(a)$, so $f(a+h) - f(a)$ has the same sign as $h^{n+1} \cdot f^{(n+1)}(a)$ for all $h \neq 0$,
$|h| < \delta$.
(1) If $n + 1$ is even and $f^{(n+1)}(a) > 0$, then $f(a+h) - f(a) > 0$ for $0 < |h| < \delta$, i.e. on the
open interval $(a - \delta, a + \delta)$ with the point $a$ removed. Hence, $x = a$ gives a local minimum of $f$.
(2) If $n + 1$ is even and $f^{(n+1)}(a) < 0$, then $f(a+h) - f(a) < 0$ for $0 < |h| < \delta$.
Hence, $x = a$ gives a local maximum of $f$.
(3) If $n + 1$ is odd, the sign of $f(a+h) - f(a)$ changes with the sign of $h$. It is said that
$x = a$ gives a horizontal point of inflection.

Example 39.1. Determine the nature of the stationary points of

\[ f(x) = x^6 - 4x^4. \]
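
One way to carry out the classification is sketched below. This is an illustrative helper, not part of the notes: the function names are hypothetical, and the polynomial is stored as a coefficient list $[c_0, c_1, \ldots]$. Since $f'(x) = 6x^5 - 16x^3 = 2x^3(3x^2 - 8)$, the stationary points are $x = 0$ and $x = \pm\sqrt{8/3}$. The code differentiates until the first non-vanishing derivative is found and then applies Theorem 39.1.

```python
import math

def derivative(coeffs):
    """Derivative of the polynomial with coefficients [c0, c1, ...] of 1, x, x^2, ..."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def classify(coeffs, a, tol=1e-9):
    """Apply Theorem 39.1 at a stationary point a: find the order n+1 of the first
    derivative that does not vanish at a, then use its parity and sign."""
    d = derivative(coeffs)
    order = 1
    while d and abs(evaluate(d, a)) < tol:
        d = derivative(d)
        order += 1
    if not d:
        return "constant polynomial"
    if order % 2 == 1:
        return "horizontal point of inflection"
    return "local minimum" if evaluate(d, a) > 0 else "local maximum"

f = [0, 0, 0, 0, -4, 0, 1]  # x^6 - 4x^4
for point in (0.0, math.sqrt(8 / 3), -math.sqrt(8 / 3)):
    print(f"x = {point:+.4f}: {classify(f, point)}")
```

At $x = 0$ the first non-vanishing derivative is $f^{(4)}(0) = -96$, so $n + 1 = 4$ is even and the point is a local maximum; at $x = \pm\sqrt{8/3}$ one has $f^{(2)} > 0$, so both points are local minima.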

40 The Riemann-Darboux integral
It is intended to give one form of the definition of the Riemann integral $\int_a^b f(x)\,dx$. The
definition involves the areas of rectangles and applies to a wider class of functions
than continuous ones.

Definition 40.1. Let $[a, b]$ be a given finite interval. A partition $P$ of $[a, b]$ is a finite set
of points $\{x_0, x_1, \ldots, x_n\}$ satisfying

\[ a = x_0 < x_1 < \ldots < x_n = b. \]

Suppose now that $f$ is a function defined and bounded on $[a, b]$ (if $f$ were continuous on
$[a, b]$ this would certainly be the case). Then $f$ is bounded on each of the subintervals
$[x_{i-1}, x_i]$. Hence $f$ has a least upper bound $M_i$ and a greatest lower bound $m_i$ on $[x_{i-1}, x_i]$.

Definition 40.2. The upper Darboux sum of $f$ related to $P$ is defined by

\[ U_f(P) = \sum_{i=1}^{n} M_i (x_i - x_{i-1}) \]

where $M_i = \sup\{f(x) \mid x_{i-1} \le x \le x_i\}$.

The lower Darboux sum of $f$ related to $P$ is defined by

\[ L_f(P) = \sum_{i=1}^{n} m_i (x_i - x_{i-1}) \]

where $m_i = \inf\{f(x) \mid x_{i-1} \le x \le x_i\}$.

Now $f$ is bounded above and below on the whole of $[a, b]$. So there exist numbers $m$ and $M$
with
\[ m \le f(x) \le M \quad \text{for all } x \in [a, b]. \]
Thus for any partition $P$ of $[a, b]$:

\[ m(b-a) \le L_f(P) \le U_f(P) \le M(b-a). \]

Hence the set
\[ \mathcal{L}_f = \{L_f(P) \mid P \text{ is a partition of } [a, b]\} \]
is bounded above and the set
\[ \mathcal{U}_f = \{U_f(P) \mid P \text{ is a partition of } [a, b]\} \]
is bounded below.
So $L_f = \sup \mathcal{L}_f$ and $U_f = \inf \mathcal{U}_f$ exist.
The first result establishes the intuitively obvious fact that $L_f \le U_f$.

Proposition 40.1. If $f$ is defined and bounded on $[a, b]$, then $L_f \le U_f$.

Proof. Let $P$ be a partition of $[a, b]$ and $P'$ be the partition $P \cup \{y\}$ where $x_{i-1} < y < x_i$
for one particular $i$, $1 \le i \le n$. In other words, $P'$ is obtained by adding one more point
to $P$.
It is now shown that $L_f(P) \le L_f(P')$ and $U_f(P) \ge U_f(P')$. Let

\[ M_i' = \sup\{f(x) \mid x_{i-1} \le x \le y\} \quad \text{and} \quad M_i'' = \sup\{f(x) \mid y \le x \le x_i\}. \]

Clearly, $M_i' \le M_i$ and $M_i'' \le M_i$. Hence:

\[ U_f(P') = \sum_{j=1}^{i-1} M_j (x_j - x_{j-1}) + M_i'(y - x_{i-1}) + M_i''(x_i - y) + \sum_{j=i+1}^{n} M_j (x_j - x_{j-1}) \]
\[ \le \sum_{j=1}^{i-1} M_j (x_j - x_{j-1}) + M_i (y - x_{i-1}) + M_i (x_i - y) + \sum_{j=i+1}^{n} M_j (x_j - x_{j-1}) \]
\[ = \sum_{j=1}^{n} M_j (x_j - x_{j-1}) = U_f(P). \]

In a similar way, it can be shown that $L_f(P) \le L_f(P')$. It now follows that if
$P'' = P \cup \{y_1, y_2, \ldots, y_m\}$, where the $y_i$ are distinct numbers in $[a, b]$, then $L_f(P) \le L_f(P'')$
and $U_f(P) \ge U_f(P'')$.
Now suppose that $P_1$ and $P_2$ are two partitions of $[a, b]$ and let $P_3 = P_1 \cup P_2$. Thus,
$L_f(P_1) \le L_f(P_3)$ and $U_f(P_2) \ge U_f(P_3)$. Since $L_f(P_3) \le U_f(P_3)$ it can be deduced that
$L_f(P_1) \le U_f(P_2)$. In other words, the lower sum related to a given partition of $[a, b]$ does
not exceed the upper sum related to any partition of $[a, b]$. Hence every lower sum is a
lower bound for the set of upper sums. So $L_f(P) \le U_f$ for all possible partitions $P$. But
then $U_f$ is an upper bound for the set of lower sums. Thus $L_f \le U_f$.
Definition 40.3. A function $f$ defined and bounded on $[a, b]$ is Riemann-Darboux integrable
on $[a, b]$ if $L_f = U_f$. This common value is denoted by
\[ \int_a^b f(x)\,dx = L_f = U_f. \]

Example 40.1. Prove that $f(x) = x$ is Riemann-Darboux integrable on $[0, 1]$.

Solution: For $n \in \mathbb{N}$ let $P_n = \left\{0, \frac{1}{n}, \frac{2}{n}, \ldots, 1\right\}$. Hence $U_f(P_n) = \frac{n+1}{2n}$ and $L_f(P_n) = \frac{n-1}{2n}$. So
\[ \frac{n-1}{2n} \le L_f \le U_f \le \frac{n+1}{2n}. \]
Letting $n \to \infty$ it can be deduced that $L_f = U_f = \frac{1}{2}$.
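
The two sums in this solution can be reproduced exactly with rational arithmetic. The sketch below is illustrative only (the function name is hypothetical); it uses the fact that $f(x) = x$ is increasing, so on each subinterval the supremum is attained at the right endpoint and the infimum at the left endpoint.

```python
from fractions import Fraction

def darboux_sums_identity(n):
    """Upper and lower Darboux sums of f(x) = x on [0, 1] for the uniform
    partition P_n = {0, 1/n, ..., 1}. Because f is increasing, the sup on
    [x_{i-1}, x_i] is x_i and the inf is x_{i-1}."""
    points = [Fraction(i, n) for i in range(n + 1)]
    upper = sum(points[i] * (points[i] - points[i - 1]) for i in range(1, n + 1))
    lower = sum(points[i - 1] * (points[i] - points[i - 1]) for i in range(1, n + 1))
    return upper, lower

for n in (1, 4, 10, 100):
    upper, lower = darboux_sums_identity(n)
    print(f"n={n}: L={lower}, U={upper}, U-L={upper - lower}")
```

The gap $U_f(P_n) - L_f(P_n)$ equals $\frac{1}{n}$, which is why both bounds squeeze to $\frac{1}{2}$.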
Example 40.2. Show that the function
\[ f(x) = \begin{cases} 1 & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases} \]
is not Riemann-Darboux integrable on any interval $[a, b]$.

Solution: For any partition $P$ it follows that $L_f(P) = 0$ and $U_f(P) = b - a$, since any
interval of real numbers contains infinitely many rationals and irrationals. Hence $L_f = 0$
and $U_f = b - a$ and so $L_f \neq U_f$.

This definition of $\int_a^b f(x)\,dx$ is only one of the many ways of assigning areas to bounded
regions. There are others, notably the Lebesgue integral; all, however, give the same
"answer" for areas under the graphs of continuous functions. It will be proved that all
continuous functions are Riemann-Darboux integrable and a neat method of evaluating
the integral involved is derived.
Firstly, some elementary properties of the Riemann integral must be established, all of
which are essentially properties of areas.

41 Properties of the Riemann-Darboux integral

Proposition 41.1. If $f$ and $g$ are Riemann-Darboux integrable on $[a, b]$ then all the
integrals below exist and

(1) $\int_a^b (\alpha f(x) + \beta g(x))\,dx = \alpha \int_a^b f(x)\,dx + \beta \int_a^b g(x)\,dx$, for $\alpha, \beta \in \mathbb{R}^1$.

(2) $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$, for $a \le c \le b$.

(3) if $f(x) \le g(x)$ on $[a, b]$ then $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$.

(4) $\left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx$.

Property (1) is described as the linearity of the integral and (2) is called the additive
property.

Proof of (1). It is sufficient to prove that the following equalities hold:

\[ \int_a^b \alpha f(x)\,dx = \alpha \int_a^b f(x)\,dx \quad \text{and} \quad \int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx. \]

The equality $\int_a^b \alpha f(x)\,dx = \alpha \int_a^b f(x)\,dx$ is true for any $\alpha \ge 0$, by the equalities
$L_{\alpha f}(P) = \alpha L_f(P)$ and $U_{\alpha f}(P) = \alpha U_f(P)$, valid for any $\alpha \ge 0$ and any partition $P$ of $[a, b]$.
For $\alpha < 0$, the equality follows from the case $\alpha \ge 0$ together with $U_{-f}(P) = -L_f(P)$
(and likewise $L_{-f}(P) = -U_f(P)$) for any partition $P$ of $[a, b]$.
The equality $\int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$ is obtained by observing that
for any partition $P$ of $[a, b]$ the following hold:
\[ L_f(P) + L_g(P) \le L_{f+g}(P) \le U_{f+g}(P) \le U_f(P) + U_g(P) \]
from where:
\[ L_f + L_g \le L_{f+g} \le U_{f+g} \le U_f + U_g. \]
These inequalities together with
\[ L_f = U_f = \int_a^b f(x)\,dx \quad \text{and} \quad L_g = U_g = \int_a^b g(x)\,dx \]
prove that:
\[ L_{f+g} = U_{f+g} = \int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx. \]

Proof of (2). Let $P_1$ and $P_2$ be partitions of $[a, c]$ and $[c, b]$ respectively. Then $P = P_1 \cup P_2$
is a partition of $[a, b]$. Clearly, $L_f(P) = L_f(P_1) + L_f(P_2)$. Let
\[ L_1 = \sup\{L_f(P_1) \mid P_1 \text{ is a partition of } [a, c]\} \]
and
\[ L_2 = \sup\{L_f(P_2) \mid P_2 \text{ is a partition of } [c, b]\}. \]
Since $L_f(P) \le \int_a^b f(x)\,dx$ (by definition) we have
\[ L_f(P_1) + L_f(P_2) \le \int_a^b f(x)\,dx. \]
Hence $L_f(P_1) \le \int_a^b f(x)\,dx - L_f(P_2)$ and so $L_1 \le \int_a^b f(x)\,dx - L_f(P_2)$.
Hence $L_f(P_2) \le \int_a^b f(x)\,dx - L_1$ and so $L_2 \le \int_a^b f(x)\,dx - L_1$, or $L_1 + L_2 \le \int_a^b f(x)\,dx$.
Now consider upper sums and observe that $U_f(P) = U_f(P_1) + U_f(P_2)$. Let
\[ U_1 = \inf\{U_f(P_1) \mid P_1 \text{ is a partition of } [a, c]\} \]
and
\[ U_2 = \inf\{U_f(P_2) \mid P_2 \text{ is a partition of } [c, b]\}. \]
Since $U_f(P) \ge \int_a^b f(x)\,dx$ (by definition) we have
\[ U_f(P_1) + U_f(P_2) \ge \int_a^b f(x)\,dx. \]
Hence $U_f(P_1) \ge \int_a^b f(x)\,dx - U_f(P_2)$ and so $U_1 \ge \int_a^b f(x)\,dx - U_f(P_2)$.
Hence $U_f(P_2) \ge \int_a^b f(x)\,dx - U_1$ and so $U_2 \ge \int_a^b f(x)\,dx - U_1$, or $U_1 + U_2 \ge \int_a^b f(x)\,dx$.
Thus
\[ L_1 + L_2 \le \int_a^b f(x)\,dx \le U_1 + U_2. \]

Since $f$ is Riemann integrable on $[a, b]$, for any $\varepsilon > 0$ a partition $P$ of $[a, b]$ can be chosen such that
$U_f(P) - L_f(P) < \varepsilon$. Adding the point $c$ to $P$ if necessary (which does not increase
$U_f(P) - L_f(P)$), write $P = P_1 \cup P_2$ with $P_1$ a partition of $[a, c]$ and $P_2$ a partition of $[c, b]$. Then:

\[ U_f(P_1) - L_f(P_1) + U_f(P_2) - L_f(P_2) = [U_f(P_1) + U_f(P_2)] - [L_f(P_1) + L_f(P_2)] = U_f(P) - L_f(P) < \varepsilon. \]

Hence
\[ 0 \le U_f(P_1) - L_f(P_1) < \varepsilon \quad \text{and} \quad 0 \le U_f(P_2) - L_f(P_2) < \varepsilon. \]
Hence $L_1 = U_1$ and $L_2 = U_2$. In other words, $f$ is Riemann-Darboux integrable on both
$[a, c]$ and $[c, b]$. Hence the additive property is established.

Proof of (3). It is sufficient to prove that if $f(x) \ge 0$ on $[a, b]$ then:

\[ \int_a^b f(x)\,dx \ge 0. \]

The above inequality follows from the inequalities

\[ 0 \le L_f(P) \le U_f(P) \]

which are valid for any partition $P$ of $[a, b]$.

Proof of (4). We first prove that $|f|$ is Riemann-Darboux integrable on $[a, b]$. We consider
the functions $f^+, f^- : [a, b] \to \mathbb{R}$ defined by:
\[ f^+(x) = \begin{cases} f(x) & \text{if } f(x) \ge 0 \\ 0 & \text{if } f(x) \le 0 \end{cases} \]
and
\[ f^-(x) = \begin{cases} 0 & \text{if } f(x) \ge 0 \\ -f(x) & \text{if } f(x) \le 0 \end{cases} \]
and we remark that:

\[ f(x) = f^+(x) - f^-(x) \quad \text{and} \quad |f(x)| = f^+(x) + f^-(x). \]

We will show now that $f^+, f^- : [a, b] \to \mathbb{R}$ are Riemann-Darboux integrable on $[a, b]$. The
boundedness of $f^+$ and $f^-$ is obvious. Consider a partition $P$ of $[a, b]$ and denote:

\[ m_i^+ = \inf\{f^+(x) \mid x \in [x_{i-1}, x_i]\} \quad \text{and} \quad M_i^+ = \sup\{f^+(x) \mid x \in [x_{i-1}, x_i]\}. \]

Remark that:
\[ M_i^+ - m_i^+ \le M_i - m_i, \quad i = 1, 2, \ldots, n \]
where:

\[ m_i = \inf\{f(x) \mid x \in [x_{i-1}, x_i]\} \quad \text{and} \quad M_i = \sup\{f(x) \mid x \in [x_{i-1}, x_i]\}. \]

Hence, we obtain the inequalities

\[ 0 \le U_{f^+}(P) - L_{f^+}(P) \le U_f(P) - L_f(P) \]

for any partition $P$.

It follows that $f^+$ is Riemann-Darboux integrable on $[a, b]$.
In a similar way, we obtain that $f^-$ is Riemann-Darboux integrable on $[a, b]$.
Using (1) and the equality $|f| = f^+ + f^-$ we obtain that $|f|$ is Riemann-Darboux integrable
on $[a, b]$.
In order to obtain the inequality (4) we use:

\[ -|f(x)| \le f(x) \le |f(x)| \]

and we deduce that:

\[ -\int_a^b |f(x)|\,dx \le \int_a^b f(x)\,dx \le \int_a^b |f(x)|\,dx. \]

42 Classes of Riemann-Darboux integrable functions

Theorem 42.1. If $f$ is continuous on $[a, b]$, then $f$ is Riemann-Darboux integrable on
$[a, b]$.

Proof. If $\varepsilon_0 > 0$ then either

\[ |f(x) - f(a)| < \varepsilon_0 \quad \text{for all } x \in [a, b] \]

or else
\[ S = \{x \mid x \in [a, b] \text{ and } |f(x) - f(a)| = \varepsilon_0\} \]
is non-empty (by the intermediate value property) and $\inf S$ exists. A partition $P$ of $[a, b]$
is constructed as follows: if

\[ |f(x) - f(a)| < \varepsilon_0 \quad \text{for all } x \in [a, b] \]

let $x_0 = a$ and $x_1 = b$, otherwise let $x_0 = a$ and $x_1 = \inf S$. Hence $x = x_1$ is the
first element of $[a, b]$ with $|f(x) - f(a)| = \varepsilon_0$. If $x_1 < b$ let $x_2$ be the first element of
$[x_1, b]$ with $|f(x) - f(x_1)| = \varepsilon_0$, otherwise let $x_2 = b$. Define $x_3, x_4, \ldots$ and so on in a
similar manner. If this process continues indefinitely, a sequence $(x_n)$ has been produced
in which $|f(x_n) - f(x_{n-1})| = \varepsilon_0$ for all $n \in \mathbb{N}$. Now $(x_n)$ is an increasing sequence which
is bounded above and so $(x_n)$ tends to some limit $x$. Since $f$ is continuous, this implies
that $f(x_n) \to f(x)$. Hence $f(x_{n-1}) \to f(x)$ also. But this contradicts the condition
$|f(x_n) - f(x_{n-1})| = \varepsilon_0$. So there is an integer $N$ such that $P = \{x_0, x_1, \ldots, x_N\}$ is a
partition of $[a, b]$ for which $|f(x_i) - f(x_{i-1})| \le \varepsilon_0$, $i = 1, 2, \ldots, N$.
For this partition $P$, we have

\[ M_i - m_i < 2\varepsilon_0 \quad \text{for all } i = 1, 2, \ldots, N. \]

Hence
\[ U_f(P) - L_f(P) = \sum_{i=1}^{N} (M_i - m_i)(x_i - x_{i-1}) \le 2\varepsilon_0 (b-a). \]
Now for any $\varepsilon > 0$ consider $\varepsilon_0 = \dfrac{\varepsilon}{2(b-a)} > 0$ and deduce that there exists a partition $P$
with $U_f(P) - L_f(P) < \varepsilon$. Now $L_f \ge L_f(P) > U_f(P) - \varepsilon \ge U_f - \varepsilon$. Since $\varepsilon$ is arbitrary,
$L_f \ge U_f$. Since $L_f \le U_f$, by an earlier result, $L_f = U_f$. Thus, $f$ is Riemann-Darboux
integrable on $[a, b]$.

Definition 42.1. A function $f$ is called piecewise continuous on $[a, b]$ if there exists a
partition $P = \{x_0, x_1, \ldots, x_n\}$ of $[a, b]$ and continuous functions $f_i$ defined on $[x_{i-1}, x_i]$,
such that $f(x) = f_i(x)$ for $x \in (x_{i-1}, x_i)$, $i = 1, 2, \ldots, n$.

A partition $Q$ of $[a, b]$ can be chosen to contain the points $x_0, x_1, \ldots, x_n$. Then $Q =
P_1 \cup P_2 \cup \ldots \cup P_n$ where $P_i$ is a partition of $[x_{i-1}, x_i]$. Hence
\[ L_f(Q) = \sum_{i=1}^{n} L_{f_i}(P_i) \]
where $L_f(Q)$ is the lower sum of $f$ related to $Q$. For each $i$, $L_{f_i}(P_i)$ is the lower sum of
$f_i$ related to $P_i$. Thus $\sum_{i=1}^{n} L_{f_i}(P_i) \le L_f$, where $L_f$ is the supremum of all the lower sums
for $f$ on $[a, b]$. Since $f_i$ is continuous on $[x_{i-1}, x_i]$, $f_i$ is Riemann integrable on $[x_{i-1}, x_i]$.
Hence:
\[ \sum_{i=1}^{n} \left( \int_{x_{i-1}}^{x_i} f_i(x)\,dx \right) \le L_f. \]
In a similar fashion
\[ U_f \le \sum_{i=1}^{n} \left( \int_{x_{i-1}}^{x_i} f_i(x)\,dx \right) \]
where $U_f$ is the infimum of all upper sums of $f$ on $[a, b]$. Hence, using $L_f \le U_f$, we
obtain
\[ L_f = U_f. \]
In other words, a piecewise continuous function is Riemann-Darboux integrable and
\[ \int_a^b f(x)\,dx = \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} f_i(x)\,dx. \]

Example 42.1. The function given by $f(x) = x - [x]$ is piecewise continuous on $[0, 3]$.
Compute $\int_0^2 f(x)\,dx$.

43 Mean value theorem

Theorem 43.1 (the integral mean value theorem). If $f$ and $g$ are continuous on $[a, b]$
and $g(x) \ge 0$ for $x \in [a, b]$, then there exists $c$ between $a$ and $b$ such that
\[ \int_a^b f(x) \cdot g(x)\,dx = f(c) \int_a^b g(x)\,dx. \]

Proof. By the interval theorem applied to $f$ on $[a, b]$, $m \le f(x) \le M$ for all $x \in [a, b]$
where $m$ is the infimum and $M$ is the supremum of $f$ on $[a, b]$. Since $g(x) \ge 0$ we have

\[ m \cdot g(x) \le f(x) \cdot g(x) \le M \cdot g(x) \quad \text{for } x \in [a, b]. \]

Hence
\[ m \int_a^b g(x)\,dx \le \int_a^b f(x)\, g(x)\,dx \le M \int_a^b g(x)\,dx. \]
If $\int_a^b g(x)\,dx = 0$, these inequalities give $\int_a^b f(x)\, g(x)\,dx = 0$ and any $c \in [a, b]$ will do.
Otherwise
\[ k = \frac{\int_a^b f(x)\, g(x)\,dx}{\int_a^b g(x)\,dx} \in [m, M]. \]
By the intermediate value property, there exists $c \in [a, b]$ with $f(c) = k$. Hence
\[ \int_a^b f(x) \cdot g(x)\,dx = f(c) \int_a^b g(x)\,dx. \]

Corollary 43.1. If $f$ is continuous on $[a, b]$, then there exists $c \in [a, b]$ such that
\[ \int_a^b f(x)\,dx = f(c)(b-a). \]

Application 43.1 (Integral test for series). Let $f : \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous decreasing
function and let $a_n = f(n)$ for each $n \in \mathbb{N}$. Let $j_n = \int_1^n f(x)\,dx$. The series $\sum_{n=1}^{\infty} a_n$
converges if and only if $(j_n)$ converges.

Proof. Since $f(n+1) \le f(x) \le f(n)$ for all $x \in [n, n+1]$, $n \in \mathbb{N}$, we have

\[ \int_n^{n+1} f(n+1)\,dx \le \int_n^{n+1} f(x)\,dx \le \int_n^{n+1} f(n)\,dx. \]

Therefore
\[ f(n+1) \le \int_n^{n+1} f(x)\,dx \le f(n) \]
and so
\[ \sum_{k=2}^{n+1} f(k) = \sum_{k=1}^{n} f(k+1) \le \int_1^{n+1} f(x)\,dx \le \sum_{k=1}^{n} f(k). \]
Now let $a_n = f(n)$ and $j_n = \int_1^n f(x)\,dx$. Then
\[ \sum_{k=2}^{n} a_k \le j_n \le \sum_{k=1}^{n-1} a_k. \]
If $(j_n)$ converges, then the $n$-th partial sums of $\sum_{n=1}^{\infty} a_n$ are increasing and bounded above
and so $\sum_{n=1}^{\infty} a_n$ is a convergent series.
Conversely, if $\sum_{n=1}^{\infty} a_n$ is a convergent series, then $(j_n)$ is increasing and bounded above and
hence, it is a convergent sequence.
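
The double inequality used in the proof can be observed on a concrete case. The sketch below assumes $f(x) = 1/x^2$ (an illustrative choice, not taken from the notes), for which $j_n = 1 - \frac{1}{n}$ follows from the primitive $-1/x$; the function names are hypothetical.

```python
def partial_sum(first, last):
    """Sum of a_k = 1/k^2 for k = first, ..., last."""
    return sum(1.0 / (k * k) for k in range(first, last + 1))

def j(n):
    """j_n = integral of 1/x^2 over [1, n], evaluated with the primitive -1/x."""
    return 1.0 - 1.0 / n

checks = []
for n in (2, 10, 100, 1000):
    lower, upper = partial_sum(2, n), partial_sum(1, n - 1)
    checks.append(lower <= j(n) <= upper)
    print(f"n={n}: {lower:.6f} <= {j(n):.6f} <= {upper:.6f}")
```

Here $(j_n)$ is bounded by 1, so the partial sums of $\sum 1/k^2$ are bounded by $a_1 + 1 = 2$ and the series converges.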

44 The fundamental theorem of calculus


Theorem 44.1. If $f$ is Riemann-Darboux integrable on $[a, b]$ and $F(x) = \int_a^x f(t)\,dt$, then
$F$ is continuous on $[a, b]$. Furthermore, if $f$ is continuous on $[a, b]$, then $F$ is differentiable
on $[a, b]$ and $F' = f$.

Proof. Since $f$ is integrable, it is bounded on $[a, b]$. So there exists some number $M$ with
$|f(t)| \le M$ for all $t \in [a, b]$. For a fixed $c$, we have
\[ |F(x) - F(c)| = \left| \int_a^x f(t)\,dt - \int_a^c f(t)\,dt \right| = \left| \int_c^x f(t)\,dt \right| \le \left| \int_c^x |f(t)|\,dt \right| \le \left| \int_c^x M\,dt \right|. \]

Since the constant function $x \to M$ has $U(P) = M|x - c|$ for the trivial partition $P$ of
the interval with endpoints $x$ and $c$,

\[ |F(x) - F(c)| \le M|x - c|. \]

Now given $\varepsilon > 0$, choose $\delta = \dfrac{\varepsilon}{M}$. Hence
\[ |x - c| < \delta \Rightarrow |F(x) - F(c)| < \varepsilon. \]

In other words $F$ is continuous at $c$ and, since $c$ was arbitrary, $F$ is continuous on $[a, b]$.

Let $c \in [a, b]$ and consider $x > c$. Then
\[ \left| \frac{F(x) - F(c)}{x - c} - f(c) \right| = \left| \frac{\int_a^x f(t)\,dt - \int_a^c f(t)\,dt}{x - c} - f(c) \right| = \left| \frac{\int_c^x f(t)\,dt}{x - c} - f(c) \right| \]
\[ = \left| \frac{\int_c^x (f(t) - f(c))\,dt}{x - c} \right| \le \frac{\int_c^x |f(t) - f(c)|\,dt}{x - c} \]
where the passage to the second line uses $\int_c^x f(c)\,dt = f(c)(x - c)$ (since $f(c)$ is constant).

Given $\varepsilon > 0$, there exists $\delta > 0$ such that $|f(t) - f(c)| < \varepsilon$ for $|t - c| < \delta$. Hence, for
$0 < x - c < \delta$,
\[ \left| \frac{F(x) - F(c)}{x - c} - f(c) \right| \le \frac{\int_c^x |f(t) - f(c)|\,dt}{x - c} \le \frac{\int_c^x \varepsilon\,dt}{x - c} = \varepsilon. \]

In other words, $F'_+(c) = f(c)$. Similarly $F'_-(c) = f(c)$. Hence $F$ is differentiable and
$F' = f$.
Remark 44.1. Suppose that it is required to evaluate
\[ \int_{x_1}^{x_2} f(t)\,dt \]
where $x_1, x_2 \in [a, b]$ and $f$ is continuous on $[a, b]$. Using the additive property we have
\[ \int_{x_1}^{x_2} f(t)\,dt = \int_a^{x_2} f(t)\,dt - \int_a^{x_1} f(t)\,dt \]
and by the fundamental theorem
\[ \int_{x_1}^{x_2} f(t)\,dt = F(x_2) - F(x_1) \]
where $F(x) = \int_a^x f(t)\,dt$.

Remark 44.2. If $f$ is continuous on $[a, b]$ and $\Phi' = f$ on $[a, b]$, then there exists a constant
$c$ such that $\Phi(x) = F(x) + c$ for any $x \in [a, b]$, where $F(x) = \int_a^x f(t)\,dt$. That is because
$(\Phi - F)' = 0$ and, by the mean value theorem, $\Phi - F = c$.
Hence if $f$ is continuous on $[a, b]$ and $\Phi$ is continuously differentiable on $[a, b]$ such that
$\Phi' = f$, then for any $x_1, x_2 \in [a, b]$ we have:
\[ \int_{x_1}^{x_2} f(t)\,dt = \Phi(x_2) - \Phi(x_1). \]

Notice that this method of evaluating integrals $\int_a^b f(t)\,dt$ hinges on the ability to
determine $\Phi$ such that $\Phi' = f$ on $[a, b]$.

Definition 44.1. Any function $\Phi$ such that $\Phi' = f$ is called a primitive for $f$.

Unfortunately, most functions do not possess primitives expressible in terms of the
elementary functions alone. In such cases it is necessary to settle for numerical estimates.
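
As an illustration of such a numerical estimate, the sketch below approximates $\int_0^1 e^{-x^2}\,dx$, whose integrand is a standard example of a function without an elementary primitive. The composite Simpson rule used here is one common choice and is an assumption of this example, not a method prescribed by the notes. The reference value uses the error function from Python's standard library, via $\int_0^1 e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}\,\mathrm{erf}(1)$.

```python
import math

def simpson(f, a, b, n=100):
    """Composite Simpson rule on [a, b] with n subintervals (n is made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior nodes alternate between weight 4 (odd index) and 2 (even index).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

estimate = simpson(lambda x: math.exp(-x * x), 0.0, 1.0)
reference = math.sqrt(math.pi) / 2 * math.erf(1.0)
print(f"Simpson estimate: {estimate:.10f}")
print(f"Reference value:  {reference:.10f}")
```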

Example 44.1.

a) $\int_0^1 (x^3 + 2)\,dx = \left. \frac{1}{4}x^4 + 2x \right|_0^1 = \frac{9}{4}$.

b) $\int_{-1}^0 (x^2 - x)\,dx = \left. \frac{x^3}{3} - \frac{x^2}{2} \right|_{-1}^0 = \frac{5}{6}$.

c) Find the primitives for the following functions:

1. $f(x) = x^2 + 3x - 2$;
2. $f(x) = 1 + \cos 3x$;
3. $f(x) = e^x \cosh 2x$;
4. $f(x) = \dfrac{1}{\sqrt{9 - x^2}}$;
5. $f(x) = |x|$.

45 Techniques to find primitives

In the applications of integral calculus, it is necessary to find primitives (when primitives


exist and are expressible in terms of simple functions). Various techniques exist for
determining primitives and this short section looks at two of the most important:
integration by parts and change of variables.

Integration by parts

Proposition 45.1. If the functions $f$ and $g$ are continuously differentiable on $[a, b]$, then
\[ \int f(x) \cdot g'(x)\,dx = f(x) \cdot g(x) - \int f'(x) \cdot g(x)\,dx \]
where $\int f(x) g'(x)\,dx$ represents the set of primitives of $f g'$ and $\int f'(x) g(x)\,dx$ represents
the set of primitives of $f' g$.

Proof. The function $h = f \cdot g$ is differentiable and its derivative $h'$ is continuous on $[a, b]$.
By the product rule of differentiation, we have
\[ h'(x) = f'(x) \cdot g(x) + f(x) \cdot g'(x). \]
Let now $\varphi \in \int f(x) g'(x)\,dx$ and $\psi = \varphi - f g$. It is easy to see that $\psi' = \varphi' - f' g - f g' = -f' g$
and therefore $\psi \in -\int f'(x) g(x)\,dx$. We obtain that $\varphi = f g + \psi$ and
$\psi \in -\int f'(x) g(x)\,dx$. In other words, $\varphi \in f g - \int f'(x) g(x)\,dx$. It can be shown in
the same manner that for every $\psi \in -\int f'(x) g(x)\,dx$, the function $\varphi = f g + \psi$ belongs to
$\int f(x) g'(x)\,dx$.

The value of this formula lies in the hope that the primitive on the right-hand side is
easier to evaluate than the original one.
Corollary 45.1. If the functions $f$ and $g$ are continuously differentiable on $[a, b]$ then
\[ \int_a^b f(x) \cdot g'(x)\,dx = \left. f(x) \cdot g(x) \right|_a^b - \int_a^b f'(x) \cdot g(x)\,dx. \]

Example 45.1. Many so-called reduction formulae can be established by repeated
integration by parts. For example, let
\[ I_n = \int \cos^n x\,dx. \]
Then
\[ I_n = \cos^{n-1} x \cdot \sin x + (n-1)\, I_{n-2} - (n-1)\, I_n. \]
Hence:
\[ n\, I_n = \cos^{n-1} x \cdot \sin x + (n-1)\, I_{n-2}, \quad n \ge 2. \]
This formula, together with the fact that $I_0 = x$ and $I_1 = \sin x$, leads to the evaluation
of $\int \cos^n x\,dx$ for $n \in \mathbb{N}$.

Example 45.2. Show that
\[ \int_1^2 x^2 e^{2x}\,dx = \frac{1}{4} e^2 (5e^2 - 1). \]

Change of variables

Proposition 45.2. If the function $g : [\alpha, \beta] \to [a, b]$ is a continuously differentiable
bijection having the property $g(\alpha) = a$, $g(\beta) = b$ and $f : [a, b] \to \mathbb{R}^1$ is continuous, then
\[ \left( \int f(x)\,dx \right) \circ g = \int (f \circ g)(t) \cdot g'(t)\,dt \]
where $\int f(x)\,dx$ represents the set of primitives of $f$ and $\int (f \circ g)(t) \cdot g'(t)\,dt$ represents
the set of primitives of $(f \circ g) \cdot g'$.

Proof. Let $F \in \int f(x)\,dx$ and $G(t) = (F \circ g)(t)$. By the composite rule of differentiation

\[ G'(t) = F'(g(t)) \cdot g'(t) = f(g(t)) \cdot g'(t) = (f \circ g)(t) \cdot g'(t). \]

Hence
\[ G \in \int (f \circ g)(t) \cdot g'(t)\,dt. \]
Let now $G \in \int (f \circ g)(t) \cdot g'(t)\,dt$ and consider $g^{-1} : [a, b] \to [\alpha, \beta]$. We have
$(g^{-1})'(x) = \dfrac{1}{g'(t)}$ where $x = g(t)$, and the function $F = G \circ g^{-1}$ verifies

\[ F'(x) = G'(g^{-1}(x))\,(g^{-1})'(x) = (f \circ g)(g^{-1}(x))\,[(g^{-1})'(x)]^{-1}\,(g^{-1})'(x) = f(x). \]

Therefore, $F \in \int f(x)\,dx$ and $G \in \left( \int f(x)\,dx \right) \circ g$.

Example 45.3. Evaluate
\[ \int \frac{1}{x \ln x}\,dx. \]

Solution: Let $f(x) = \dfrac{1}{x \ln x}$ and $g(t) = e^t$. Then $g'(t) = e^t$ and so
\[ \int \frac{1}{x \ln x}\,dx = \int \frac{1}{e^t \cdot t}\, e^t\,dt = \int \frac{1}{t}\,dt = \ln t = \ln(\ln x). \]

Corollary 45.2. If the function $g : [\alpha, \beta] \to [a, b]$ is a continuously differentiable bijection
having the property $g(\alpha) = a$, $g(\beta) = b$ and $f : [a, b] \to \mathbb{R}^1$ is continuous, then
\[ \int_a^b f(x)\,dx = \int_\alpha^\beta (f \circ g)(t) \cdot g'(t)\,dt. \]

Example 45.4. Evaluate
\[ \int_1^2 t^2 \sqrt{t^3 - 1}\,dt. \]

Solution: Let $f(x) = \sqrt{x}$ and $g(t) = t^3 - 1$. Then $g'(t) = 3t^2$ and therefore
\[ \int_1^2 \sqrt{t^3 - 1} \cdot 3t^2\,dt = \int_{g(1)}^{g(2)} \sqrt{x}\,dx. \]
Hence
\[ \int_1^2 t^2 \sqrt{t^3 - 1}\,dt = \frac{1}{3} \int_0^7 \sqrt{x}\,dx = \left. \frac{1}{3} \cdot \frac{2}{3}\, x^{3/2} \right|_0^7 \approx 4.116. \]

The change of variables formula is often called integration by substitution where x = g(t)
gives the substitution to be used. It is extensively used in elementary calculus books,
where trial substitutions, which depend on the form of the integrand, are suggested.
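
The substitution in Example 45.4 can be sanity-checked numerically. The sketch below is illustrative only: it assumes a simple midpoint rule (chosen here because it never evaluates the integrand at the endpoints, where the square root has an unbounded derivative) and compares both sides of the change of variables with the closed form $\frac{2}{9} \cdot 7^{3/2}$.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Left-hand side: the original integral in the variable t.
left = midpoint(lambda t: t * t * math.sqrt(t ** 3 - 1.0), 1.0, 2.0)
# Right-hand side: the integral after substituting x = t^3 - 1.
right = midpoint(lambda x: math.sqrt(x), 0.0, 7.0) / 3.0
# Closed form obtained from the primitive (2/3) x^(3/2).
exact = (2.0 / 9.0) * 7.0 ** 1.5

print(f"left = {left:.6f}, right = {right:.6f}, exact = {exact:.6f}")
```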

Remark 45.1. If $f : [-a, a] \to \mathbb{R}$ is piecewise continuous and symmetric ($f(-x) = f(x)$)
then:
\[ \int_{-a}^{a} f(x)\,dx = 2 \int_0^a f(x)\,dx \]
and if $f$ is antisymmetric ($f(-x) = -f(x)$) then:
\[ \int_{-a}^{a} f(x)\,dx = 0. \]
In order to obtain the above equalities, the integral $\int_{-a}^{a} f(x)\,dx$ is written as:
\[ \int_{-a}^{a} f(x)\,dx = \int_{-a}^{0} f(x)\,dx + \int_0^a f(x)\,dx \]
and then, in the first integral, the substitution $x = -t$ is made.


Remark 45.2. If $f : \mathbb{R} \to \mathbb{R}$ is periodic of period $T$ and piecewise continuous, then for
any $a \in \mathbb{R}$, the following equality holds:
\[ \int_a^{a+T} f(x)\,dx = \int_0^T f(x)\,dx. \]
In order to prove this equality, we write:
\[ \int_a^{a+T} f(x)\,dx = \int_a^0 f(x)\,dx + \int_0^T f(x)\,dx + \int_T^{a+T} f(x)\,dx. \]
In the integral $\int_T^{a+T} f(x)\,dx$ the substitution $x = t + T$ is made, obtaining:
\[ \int_T^{a+T} f(x)\,dx = -\int_a^0 f(x)\,dx. \]

46 Improper integrals

The definition of the Riemann-Darboux integral applies only to bounded functions
defined on bounded intervals. This section relaxes these conditions and defines improper
integrals.
The integral of a function defined and bounded on an interval which is not bounded is
defined below. This is called an improper integral of the first kind.

Definition 46.1. Let $f$ be a function bounded on $[a, +\infty)$ and Riemann-Darboux integrable
on $[a, b]$ for every $b > a$. If $\lim_{b \to \infty} \int_a^b f(x)\,dx$ exists, it is said that $\int_a^{\infty} f(x)\,dx$ converges.
Otherwise $\int_a^{\infty} f(x)\,dx$ diverges.

A completely analogous definition holds for integrals of the form $\int_{-\infty}^{a} f(x)\,dx$.

Since it is necessary to preserve the additivity of the integral for improper integrals, the
following is defined:
\[ \int_{-\infty}^{+\infty} f(x)\,dx = \int_{-\infty}^{0} f(x)\,dx + \int_0^{+\infty} f(x)\,dx \]
provided both improper integrals on the right-hand side converge.
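Example 46.1.

a) The integral $\int_0^{+\infty} \dfrac{1}{1 + x^2}\,dx$ converges to $\dfrac{\pi}{2}$.

b) The integral $\int_1^{+\infty} \dfrac{1}{x^2}\,dx$ converges to 1.

c) The integral $\int_1^{+\infty} \dfrac{1}{\sqrt{x}}\,dx$ diverges.

d) The integral $\int_{-\infty}^{+\infty} \sin x\,dx$ diverges.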

The integral of a function over a bounded interval where the function is not bounded will
now be defined. This is called an improper integral of the second kind.
Definition 46.2. Let $f$ be a function defined on $(a, b]$ and Riemann-Darboux integrable
on $[a + \varepsilon, b]$, for $\varepsilon \in (0, b - a)$. If
\[ \lim_{\varepsilon \to 0^+} \int_{a+\varepsilon}^{b} f(x)\,dx \]
exists, then it is said that $\int_a^b f(x)\,dx$ converges.

Example 46.2.

a) The integral $\int_0^1 \dfrac{1}{\sqrt{x}}\,dx$ converges to 2.

b) The integral $\int_0^1 \dfrac{1}{\sqrt{1 - x^2}}\,dx$ converges to $\dfrac{\pi}{2}$.

c) The integral $\int_0^1 \dfrac{1}{x}\,dx$ diverges.

d) The integral $\int_0^{+\infty} \dfrac{1}{\sqrt{x}}\,dx$ diverges.
Theorem 46.1 (Comparison test for integrals). Let $f$ and $g$ be defined on $[a, +\infty)$ and
Riemann-Darboux integrable on $[a, b]$ for every $b > a$. Suppose that

a) $0 \le f(x) \le g(x)$ for all $x \ge a$;

b) $\int_a^{+\infty} g(x)\,dx$ converges.

Then $\int_a^{+\infty} f(x)\,dx$ converges too.

Proof. For every $b > a$
\[ 0 \le \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \]
Since $0 \le f(x) \le g(x)$, $\int_a^b g(x)\,dx$ increases to its limiting value as $b \to +\infty$. Hence
$\int_a^b f(x)\,dx$ is increasing in $b$ and bounded above. Therefore $\int_a^{+\infty} f(x)\,dx$ is a convergent
integral.

A comparison test for improper integrals of the second kind is easily formulated and
proved in a similar manner.

Example 46.3. The integral $\int_0^{+\infty} \dfrac{e^{-x}}{1 + x^2}\,dx$ converges.

Solution: Consider the functions $f(x) = \dfrac{e^{-x}}{1 + x^2}$ and $g(x) = \dfrac{1}{1 + x^2}$ and apply the
comparison test.
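
The mechanism behind the comparison test can be seen numerically on Example 46.3. The sketch below is an illustration under assumptions: it uses a composite Simpson rule (a choice made here, not prescribed by the notes) to approximate $\int_0^b f(x)\,dx$ for a few upper limits $b$, and compares each value with $\int_0^b g(x)\,dx = \arctan b \le \frac{\pi}{2}$.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def f(x):
    """Integrand of Example 46.3."""
    return math.exp(-x) / (1.0 + x * x)

upper_limits = [1, 2, 5, 10]
# 400 * b subintervals keeps the step size equal to 0.0025 for every b.
values = [simpson(f, 0.0, b, 400 * b) for b in upper_limits]
majorants = [math.atan(b) for b in upper_limits]  # integral of g over [0, b]

for b, value, bound in zip(upper_limits, values, majorants):
    print(f"b={b:2d}: integral of f = {value:.6f} <= integral of g = {bound:.6f}")
```

The computed values increase with $b$ and stay below $\frac{\pi}{2}$, which is exactly the "increasing and bounded above" situation used in the proof of Theorem 46.1.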

47 Fourier series

The idea that a function may be represented by its Taylor series has already been
discussed. We saw that in order to be able to write the Taylor series of a function $f$
at a point $x$, it needs to be infinitely differentiable. This is a severe restriction that most
functions do not satisfy. Even when Taylor’s theorem with remainder is employed, the
function still needs to be differentiable a finite number of times and this, like infinite
differentiability, certainly implies that the function must be continuous. Nevertheless,
many functions used to describe important physical phenomena are discontinuous and
cannot be represented by Taylor series. For example, the function used to describe
the voltage behavior in time in a circuit in which a switch is suddenly operated is
discontinuous, just like the functional behavior of the gas pressure across a shock front.
In principle, at least, Fourier series offer the possibility of representation of continuous and
piecewise continuous functions, because whereas for Taylor series expansion a function
needs to be differentiable, for Fourier series expansion it would appear that it only needs to
be integrable; the Fourier coefficients can be computed when $f(x)$ is piecewise continuous.
Definition 47.1. The Fourier series of a piecewise continuous function $f(x)$ defined on
the interval $[-\pi, \pi]$ is the series

\[ f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cdot \cos nx + b_n \cdot \sin nx) \]

in which the Fourier coefficients $a_n$, $b_n$ are given by

\[ a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cdot \cos nx\,dx \quad \text{for } n = 0, 1, 2, \ldots \]
\[ b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cdot \sin nx\,dx \quad \text{for } n = 1, 2, \ldots \]
Example 47.1.

a) Determine the Fourier series of the function $f(x) = \pi^2 - x^2$ for $x \in [-\pi, \pi]$.
Solution: $a_0 = \dfrac{4\pi^2}{3}$, $a_n = (-1)^{n+1} \cdot \dfrac{4}{n^2}$ for $n = 1, 2, \ldots$, $b_n = 0$ for all $n$.

b) Determine the Fourier series of the function $f(x) = |x|$ for $x \in [-\pi, \pi]$.
Solution: $a_0 = \pi$, $a_{2n} = 0$ and $a_{2n-1} = \dfrac{-4}{\pi (2n-1)^2}$ for $n = 1, 2, \ldots$, $b_n = 0$ for all $n$.

c) Determine the Fourier series of the function
\[ f(x) = \begin{cases} a & \text{for } -\pi \le x < 0 \\ b & \text{for } 0 \le x < \pi \end{cases} \]
Solution: $a_0 = a + b$, $a_n = 0$ for $n = 1, 2, \ldots$, $b_{2n} = 0$ for $n = 1, 2, \ldots$, and
$b_{2n+1} = \dfrac{2(b - a)}{(2n + 1)\pi}$ for $n = 0, 1, 2, \ldots$.
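
Fourier coefficients can also be approximated directly from the integrals in Definition 47.1, which gives a quick check of a hand computation. The sketch below is illustrative: it assumes a midpoint rule with an even number of sample intervals (so that the kink of $|x|$ at $0$ falls on a grid node), and the function name is hypothetical. For $f(x) = |x|$ one expects $a_0 \approx \pi$, $a_n \approx -\frac{4}{\pi n^2}$ for odd $n$, $a_n \approx 0$ for even $n \ge 2$, and $b_n \approx 0$.

```python
import math

def fourier_coefficients(f, n, samples=20000):
    """Midpoint-rule approximation of the Fourier coefficients a_n and b_n
    of f on [-pi, pi], following Definition 47.1."""
    h = 2.0 * math.pi / samples
    a_n = b_n = 0.0
    for i in range(samples):
        x = -math.pi + (i + 0.5) * h  # midpoint of the i-th subinterval
        a_n += f(x) * math.cos(n * x)
        b_n += f(x) * math.sin(n * x)
    return a_n * h / math.pi, b_n * h / math.pi

coefficients = [fourier_coefficients(abs, n) for n in range(4)]
for n, (a_n, b_n) in enumerate(coefficients):
    print(f"n={n}: a_n = {a_n:+.6f}, b_n = {b_n:+.6f}")
```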

Usually, until the convergence problem has been resolved, it is customary to denote the
relationship between f (x) and its Fourier series by the sign ∼ instead of an equality.
The main result of this section will be establishing a fundamental theorem on the
convergence of Fourier series of a piecewise continuous function f (x). However, as this
will require several subsidiary results, which are important in their own way, we now
establish them in the form of two lemmas.
Lemma 47.1 (Integral representation of $S_n(x)$). The $n$-th partial sum of the Fourier
series of the function $f(x)$ defined on the fundamental interval $[-\pi, \pi]$, and prolonged by
periodic extension outside it, may be represented in the form:
\[ S_n(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x - u) \cdot \frac{\sin\left(n + \frac{1}{2}\right) u}{2 \sin \frac{1}{2} u}\,du. \]

Proof. First, using the summation formula for a geometric progression it follows immediately
that
\[ \sum_{k=1}^{n} e^{ikx} = \frac{\exp\left[i\left(n + \frac{1}{2}\right)x\right] - \exp\left[\frac{1}{2} i x\right]}{2i \sin \frac{1}{2} x}. \]
Hence, equating the real parts of the two sides of this equation, we deduce that
\[ \frac{1}{2} + \sum_{k=1}^{n} \cos kx = \frac{\sin\left(n + \frac{1}{2}\right) x}{2 \sin \frac{1}{2} x}. \]

Integration of this expression over the intervals $[-\pi, 0]$ and $[0, \pi]$ shows that
\[ \int_{-\pi}^{0} \frac{\sin\left(n + \frac{1}{2}\right) u}{2 \sin \frac{1}{2} u}\,du = \int_0^{\pi} \frac{\sin\left(n + \frac{1}{2}\right) u}{2 \sin \frac{1}{2} u}\,du = \frac{1}{2}\pi \]
since the only contribution from the left-hand side comes from the constant term.
Now consider the $n$-th partial sum $S_n(x)$ of the Fourier series of $f(x)$:
\[ S_n(x) = \frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cdot \cos kx + b_k \cdot \sin kx). \]

By the definition of the Fourier coefficients $a_k$ and $b_k$ we may write

\[ S_n(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\,dt + \frac{1}{\pi} \sum_{k=1}^{n} \left[ \cos kx \int_{-\pi}^{\pi} f(t) \cos kt\,dt + \sin kx \int_{-\pi}^{\pi} f(t) \sin kt\,dt \right]. \]

Taking the functions $\cos kx$, $\sin kx$ under the integral signs, and employing the
trigonometric identity
\[ \cos k(x - t) = \cos kx \cdot \cos kt + \sin kx \cdot \sin kt \]
allows us to write
\[ S_n(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \left[ \frac{1}{2} + \sum_{k=1}^{n} \cos k(x - t) \right] dt. \]

Applying the identity

\[ \frac{1}{2} + \sum_{k=1}^{n} \cos k(x - t) = \frac{\sin\left(n + \frac{1}{2}\right)(x - t)}{2 \sin \frac{1}{2}(x - t)} \]

and writing $x - t = u$, this becomes

\[ S_n(x) = \frac{1}{\pi} \int_{x-\pi}^{x+\pi} f(x - u) \cdot \frac{\sin\left(n + \frac{1}{2}\right) u}{2 \sin \frac{1}{2} u}\,du. \]
The trigonometric factor in this integrand has period $2\pi$ so that if, for the purpose of
the study of its Fourier series, $f(x)$ itself is also regarded as periodic with period $2\pi$,
then the entire integrand is periodic with period $2\pi$. Consequently, a definite integral
of this function taken over any interval of length $2\pi$ will be the same, showing that we
may replace the limits $x - \pi$ and $x + \pi$ by $-\pi$ and $\pi$, respectively. This assumption of
the periodicity of the function $f(x)$ outside $[-\pi, \pi]$ in fact places no restriction on $f(x)$,
because the Fourier series can only represent $f(x)$ in the fundamental interval $[-\pi, \pi]$, so
that the manner in which $f(x)$ is defined outside it is immaterial.

Lemma 47.2. For a piecewise continuous function $f(x)$ defined on $[-\pi, \pi]$ the following
equalities hold:

a) $\lim_{n \to \infty} \int_{-\pi}^{\pi} f(x) \cos nx\,dx = 0$ and $\lim_{n \to \infty} \int_{-\pi}^{\pi} f(x) \sin nx\,dx = 0$, and

b) $\lim_{n \to \infty} \int_a^b f(x) \sin\left(n + \frac{1}{2}\right) x\,dx = 0$ if $-\pi \le a < b \le \pi$.

Proof. Consider the identity:


Z π Z π Z π Z π
2 2
[f (x) − Sn (x)] dx = [f (x)] dx − 2 f (x) Sn (x) dx + [Sn (x)]2 .
−π −π −π −π

From the definition of the Fourier coefficients, the orthogonality property of the trigono-
metric system i.e.
Z π
sin mx cos nx dx = 0 for all m, n;
−π
Z π ½
0 for m 6= n
sin mx sin nx dx = ;
−π π for m = n

Z π  0 for m 6= n
cos mx cos nx dx = π for m = n 6= 0 ;
−π 
2π for m = n = 0

and from the form of the n-th partial sum Sn (x), it follows that
Z π Z π " n
#
a2 X
0
[Sn (x)]2 dx = f (x) Sn (x) dx = π + (a2k + b2k ) .
−π −π 2 k=1

Combining the last two equalities we have:


Z Z n
" #
π π
a 2 X
[f (x) − Sn (x)]2 dx = [f (x)]2 dx − π 0 + (a2k + b2k ) .
−π −π 2 k=1

as the integrand of the left-hand side integral involves a square, it is either positive or
zero, so we may conclude
n Z
a20 X 2 1 π
+ (ak + b2k ) ≤ f 2 (x) dx.
2 k=1
π −π

This is known as the Bessel inequality and it is true for all n.


Since the right-hand side is finite by hypothesis and does not depend on n, the series of
squares of the Fourier coefficients is convergent.
This result implies that $a_n \to 0$ and $b_n \to 0$ for $n \to \infty$. In terms of the definition of
Fourier coefficients the limits $a_n \to 0$ and $b_n \to 0$ are seen to be equivalent to
\[ \lim_{n\to\infty} \int_{-\pi}^{\pi} f(x) \cos nx \, dx = 0 \quad \text{and} \quad \lim_{n\to\infty} \int_{-\pi}^{\pi} f(x) \sin nx \, dx = 0. \]

Observe that when
\[ \lim_{n\to\infty} \int_{-\pi}^{\pi} [f(x) - S_n(x)]^2 \, dx = 0, \]
then we have
\[ \frac{a_0^2}{2} + \sum_{k=1}^{\infty} (a_k^2 + b_k^2) = \frac{1}{\pi} \int_{-\pi}^{\pi} [f(x)]^2 \, dx. \]

This last result is known as Parseval’s equality.
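
As a numerical illustration of the Bessel inequality and of Parseval's equality, the following minimal sketch uses the test function f(x) = x on [−π, π] (an assumption chosen for this example, not taken from the notes). Its coefficients are $a_k = 0$ and $b_k = 2(-1)^{k+1}/k$, and $\frac{1}{\pi}\int_{-\pi}^{\pi} x^2\,dx = 2\pi^2/3$. The function name `bessel_partial_sum` is hypothetical.

```python
import math


def bessel_partial_sum(n):
    """Left-hand side a_0^2/2 + sum_{k=1}^n (a_k^2 + b_k^2) for f(x) = x."""
    # For f(x) = x on [-pi, pi]: a_k = 0 for all k and b_k = 2 * (-1)**(k + 1) / k.
    return sum((2.0 * (-1) ** (k + 1) / k) ** 2 for k in range(1, n + 1))


# Right-hand side: (1/pi) * integral of x^2 over [-pi, pi] = 2 * pi^2 / 3.
rhs = 2.0 * math.pi ** 2 / 3.0

for n in (1, 10, 100, 1000):
    # The partial sums stay below rhs (Bessel) and approach it (Parseval).
    print(n, bessel_partial_sum(n), rhs)
```

The partial sums increase with n and remain below the right-hand side, which is what the Bessel inequality asserts; the shrinking gap is consistent with Parseval's equality for this function.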


The convergence
\[ \int_{-\pi}^{\pi} [f(x) - S_n(x)]^2 \, dx \xrightarrow[n\to\infty]{} 0 \]

is usually known as the convergence in the mean.


We will show now that if f is a piecewise continuous function on [−π, π], then for a, b
such that −π ≤ a < b ≤ π we have
\[ \lim_{n\to\infty} \int_{a}^{b} f(x) \sin\left(n + \tfrac{1}{2}\right)x \, dx = 0. \]

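For that, we first remark that for any α < β

\[ \left| \int_{\alpha}^{\beta} \sin\left(n + \tfrac{1}{2}\right)x \, dx \right| = \left| \frac{\cos\left(n + \frac{1}{2}\right)\alpha - \cos\left(n + \frac{1}{2}\right)\beta}{n + \frac{1}{2}} \right| \le \frac{2}{n + \frac{1}{2}}. \]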

Now let $a = x_0 < x_1 < \dots < x_p = b$ be a partition of the closed interval [a, b] and consider the
corresponding decomposition of the integral:
\[ \int_{a}^{b} f(x) \sin\left(n + \tfrac{1}{2}\right)x \, dx = \sum_{i=0}^{p-1} \int_{x_i}^{x_{i+1}} f(x) \sin\left(n + \tfrac{1}{2}\right)x \, dx. \]
Consider $m_i = \inf\{f(x) \mid x \in [x_i, x_{i+1}]\}$ and represent $\int_{a}^{b} f(x) \sin\left(n + \tfrac{1}{2}\right)x \, dx$ in the
following form
\[ \int_{a}^{b} f(x) \sin\left(n + \tfrac{1}{2}\right)x \, dx = \sum_{i=0}^{p-1} \int_{x_i}^{x_{i+1}} [f(x) - m_i] \sin\left(n + \tfrac{1}{2}\right)x \, dx + \sum_{i=0}^{p-1} m_i \int_{x_i}^{x_{i+1}} \sin\left(n + \tfrac{1}{2}\right)x \, dx. \]

For ωi = Mi − mi , where Mi = sup{f (x) | x ∈ [xi , xi+1 ]} we have

f (x) − mi ≤ Mi − mi = ωi .

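Now, with $\Delta x_i = x_{i+1} - x_i$,
\[ \left| \int_{a}^{b} f(x) \sin\left(n + \tfrac{1}{2}\right)x \, dx \right| \le \sum_{i=0}^{p-1} \omega_i \cdot \Delta x_i + \frac{2}{n + 1/2} \sum_{i=0}^{p-1} |m_i|. \]
For ε > 0 we choose the partition such that $\sum_{i=0}^{p-1} \omega_i \cdot \Delta x_i < \frac{\varepsilon}{2}$. This is possible because the
piecewise continuous function f is integrable.
With this partition fixed, we have $\sum_{i=0}^{p-1} |m_i| \le pM$, where $M = \sup\{|f(x)| \mid x \in [a, b]\}$. Now we can
take $n > \frac{4}{\varepsilon} pM$, so that $\frac{2}{n + 1/2} \, pM < \frac{\varepsilon}{2}$, and we obtain that
for such values of n we have
\[ \left| \int_{a}^{b} f(x) \sin\left(n + \tfrac{1}{2}\right)x \, dx \right| < \varepsilon. \]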

Collecting the results together we arrive at Lemma 47.2.
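
Part b) of the lemma can be observed numerically. The sketch below is an illustration only: the step function `step`, the interval [0, 2] and the midpoint rule are assumptions made for this example, and `oscillatory_integral` is a hypothetical helper name.

```python
import math


def oscillatory_integral(f, a, b, n, samples=20000):
    """Midpoint-rule approximation of the integral of f(x) * sin((n + 1/2) x) over [a, b]."""
    h = (b - a) / samples
    total = 0.0
    for i in range(samples):
        x = a + (i + 0.5) * h
        total += f(x) * math.sin((n + 0.5) * x)
    return total * h


def step(x):
    # Hypothetical piecewise continuous test function with a jump at x = 1.
    return 1.0 if x < 1.0 else 3.0


for n in (1, 10, 100, 400):
    # The values shrink in magnitude roughly like 1 / (n + 1/2).
    print(n, oscillatory_integral(step, 0.0, 2.0, n))
```

For this particular step function the integral can be bounded by 8/(n + 1/2), in line with the estimate used in the proof.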

Now we are ready to prove the fundamental Fourier theorem on convergence.


Theorem 47.1 (Fourier theorem). Let f be a piecewise continuous function defined on
the interval [−π, π] and extended by periodicity outside it. If f (x) has finite left-hand and
right-hand side derivatives at its points of discontinuity, then:

a) when $x = x_0$ is a point of continuity of f , then
\[ \lim_{n\to\infty} S_n(x_0) = f(x_0). \]

b) when $x = x_0$ is a point of discontinuity of f , then
\[ \lim_{n\to\infty} S_n(x_0) = \frac{1}{2} \left[ f(x_0^+) + f(x_0^-) \right]. \]

Proof. Consider a function f (x) defined on [−π, π] and extended periodically
outside [−π, π]. Assume that f is piecewise continuous on [−π, π] and has a finite
discontinuity at $x_0$. Denote

\[ f(x_0^-) = \lim_{x \to x_0^-} f(x) \quad \text{and} \quad f(x_0^+) = \lim_{x \to x_0^+} f(x). \]

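Then from Lemma 47.1 we may write:

\[ S_n(x_0) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x_0 - u) \cdot \frac{\sin\left(n + \frac{1}{2}\right)u}{2 \sin \frac{1}{2}u} \, du. \]
We have also:
\[ \frac{1}{2} f(x_0^+) = \frac{1}{\pi} \int_{-\pi}^{0} f(x_0^+) \cdot \frac{\sin\left(n + \frac{1}{2}\right)u}{2 \sin \frac{1}{2}u} \, du \]
and
\[ \frac{1}{2} f(x_0^-) = \frac{1}{\pi} \int_{0}^{\pi} f(x_0^-) \cdot \frac{\sin\left(n + \frac{1}{2}\right)u}{2 \sin \frac{1}{2}u} \, du. \]
From the above results we deduce:
\[ S_n(x_0) - \frac{1}{2} \left[ f(x_0^+) + f(x_0^-) \right] = \frac{1}{\pi} \int_{-\pi}^{0} [f(x_0 - u) - f(x_0^+)] \cdot \frac{\sin\left(n + \frac{1}{2}\right)u}{2 \sin \frac{1}{2}u} \, du + \frac{1}{\pi} \int_{0}^{\pi} [f(x_0 - u) - f(x_0^-)] \cdot \frac{\sin\left(n + \frac{1}{2}\right)u}{2 \sin \frac{1}{2}u} \, du. \]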

The integrands on the right-hand side are well defined everywhere except, possibly, at u = 0,
where they require examination. The first integrand can be written in the form
\[ F_1(u) \cdot \sin\left(n + \tfrac{1}{2}\right)u \]
where
\[ F_1(u) = \frac{f(x_0 - u) - f(x_0^+)}{u} \cdot \frac{\frac{1}{2}u}{\sin \frac{1}{2}u}. \]
As u → 0 through negative values (the first integral is taken over [−π, 0]), the second factor
tends to 1 and, when the right-hand side derivative of f exists at $x = x_0$, the first factor
tends to $-f'(x_0^+)$. So, $F_1(0) = -f'(x_0^+)$ and the integrand is
well defined at u = 0. Similarly, if
\[ F_2(u) = \frac{f(x_0 - u) - f(x_0^-)}{u} \cdot \frac{\frac{1}{2}u}{\sin \frac{1}{2}u} \]
and if the left-hand derivative of f exists at $x = x_0$, then $F_2(0) = -f'(x_0^-)$ (the limit being
taken as u → 0 through positive values) and the second
integrand is also well defined at u = 0.
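We may thus write
\[ S_n(x_0) - \frac{1}{2} \left[ f(x_0^+) + f(x_0^-) \right] = \frac{1}{\pi} \int_{-\pi}^{0} F_1(u) \cdot \sin\left(n + \tfrac{1}{2}\right)u \, du + \frac{1}{\pi} \int_{0}^{\pi} F_2(u) \cdot \sin\left(n + \tfrac{1}{2}\right)u \, du. \]

Applying Lemma 47.2 we conclude that

\[ \lim_{n\to\infty} S_n(x_0) = \frac{1}{2} \left[ f(x_0^+) + f(x_0^-) \right]. \]
If f is continuous at $x_0$, then
\[ \lim_{n\to\infty} S_n(x_0) = f(x_0). \]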

We have thus proved one form of the Fourier theorem on convergence of Fourier series.
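
The behaviour described by the theorem can be checked numerically on a two-level step function equal to a on [−π, 0] and to b on (0, π]. In the sketch below the values a = 1 and b = 3 are assumptions chosen for illustration, and `step_partial_sum` is a hypothetical name; the coefficients used in the comments follow from the definition of the Fourier coefficients by direct integration.

```python
import math


def step_partial_sum(x, n, a=1.0, b=3.0):
    """n-th Fourier partial sum of f = a on [-pi, 0] and f = b on (0, pi]."""
    # Direct integration gives a_0 = a + b, a_k = 0 for k >= 1 and
    # b_k = (b - a) * (1 - (-1)**k) / (k * pi).
    total = (a + b) / 2.0
    for k in range(1, n + 1):
        b_k = (b - a) * (1 - (-1) ** k) / (k * math.pi)
        total += b_k * math.sin(k * x)
    return total


# At the discontinuity x0 = 0 every partial sum equals the mean of the one-sided limits.
print(step_partial_sum(0.0, 50))
# At the point of continuity x0 = pi/2 the partial sums approach f(pi/2) = b.
print(step_partial_sum(math.pi / 2, 2001))
```

At x = 0 all sine terms vanish, so the partial sum is (a + b)/2 for every n, which matches case b) of the theorem; at x = π/2 the values approach b, which matches case a).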

Example 47.2.

a) Deduce the Fourier series expansion of $f(x) = \pi^2 - x^2$ in the interval [−π, π].

b) Deduce the Fourier series expansion of the function f (x) = |x| in the interval [−π, π].

c) Deduce the Fourier series expansion of the function
\[ f(x) = \begin{cases} a & \text{if } x \in [-\pi, 0] \\ b & \text{if } x \in (0, \pi] \end{cases} \]
in the interval [−π, π].

48 Different forms of Fourier series

Theorem 48.1 (change of the origin of the fundamental interval). If f (x) is a piecewise
continuous function defined in the fundamental interval [−π, π] and by periodic extension
outside it, then for any α, the Fourier coefficients $a_n$, $b_n$ are given by
\[ a_n = \frac{1}{\pi} \int_{\alpha-\pi}^{\alpha+\pi} f(x) \cdot \cos nx \, dx \quad \text{for } n = 0, 1, 2, \dots \]
\[ b_n = \frac{1}{\pi} \int_{\alpha-\pi}^{\alpha+\pi} f(x) \cdot \sin nx \, dx \quad \text{for } n = 1, 2, \dots \]

The Fourier series of f (x) converges at every point of continuity and:

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx) \quad \text{for } x \in [\alpha - \pi, \alpha + \pi]. \]

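Theorem 48.2 (change of the interval length). The Fourier expansion of the piecewise
continuous function f (x) defined on [−L, L] is the series
\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos \frac{n\pi x}{L} + b_n \sin \frac{n\pi x}{L} \right) \]
with
\[ a_n = \frac{1}{L} \int_{-L}^{L} f(x) \cdot \cos \frac{n\pi x}{L} \, dx \quad \text{for } n = 0, 1, 2, \dots \]
and
\[ b_n = \frac{1}{L} \int_{-L}^{L} f(x) \cdot \sin \frac{n\pi x}{L} \, dx \quad \text{for } n = 1, 2, \dots \]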
Example 48.1.

a) Deduce the Fourier series expansion of the function
\[ f(x) = \begin{cases} -x & \text{for } -\frac{\pi}{2} \le x < 0 \\ x & \text{for } 0 \le x < \pi \\ 2\pi - x & \text{for } \pi \le x \le \frac{3\pi}{2}. \end{cases} \]

b) Deduce the Fourier series expansion of $f(x) = x^3$ for −1 ≤ x ≤ 1.
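
The coefficient formulas of Theorem 48.2 can be approximated numerically. The following is a minimal sketch, not a prescribed method: it assumes a midpoint rule for the integrals, uses the test function f(x) = x on [−2, 2] (chosen for illustration, so as not to solve the exercises above), and the name `fourier_coefficients` is hypothetical. For this test function direct integration gives $b_n = 2L(-1)^{n+1}/(n\pi)$ and $a_n = 0$.

```python
import math


def fourier_coefficients(f, L, n_max, samples=4000):
    """Approximate a_0..a_n_max and b_1..b_n_max of f on [-L, L] by the midpoint rule."""
    h = 2.0 * L / samples
    xs = [-L + (i + 0.5) * h for i in range(samples)]
    a = [h / L * sum(f(x) * math.cos(n * math.pi * x / L) for x in xs)
         for n in range(n_max + 1)]
    # b[0] is a placeholder so that b[n] matches the index n used in the text.
    b = [0.0] + [h / L * sum(f(x) * math.sin(n * math.pi * x / L) for x in xs)
                 for n in range(1, n_max + 1)]
    return a, b


a, b = fourier_coefficients(lambda x: x, 2.0, 3)
print(a)  # all close to 0, since f is odd
print(b)  # b[1] close to 4/pi, b[2] close to -2/pi
```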

When f (x) is an even function defined on the interval [−π, π], then f (−x) = f (x). Thus,
it follows directly that f (x) · cos nx is an even function, because cos nx is even, and
f (x) · sin nx is an odd function, because sin nx is odd.
Consider the Fourier coefficients $a_n$ of an even function f (x), that we choose to write in
the form
\[ a_n = \frac{1}{\pi} \int_{-\pi}^{0} f(x) \cdot \cos nx \, dx + \frac{1}{\pi} \int_{0}^{\pi} f(x) \cdot \cos nx \, dx. \]

Then, changing the variable in the first integral by writing u = −x, employing the even
nature of the integrand to replace f (−u) · cos n(−u) by f (u) · cos nu and changing the
sign of the integral by reversing the limits, we find
\[ a_n = \frac{2}{\pi} \int_{0}^{\pi} f(x) \cos nx \, dx \quad \text{for } n = 0, 1, 2, \dots \]
The same argument applied to the coefficients bn shows that
bn = 0 for n = 1, 2, . . .

Consequently, if f (x) is an even function on [−π, π], its Fourier series contains only cosine
functions and is of the form

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx \quad \text{for } x \in [-\pi, \pi]. \]

This is called the Fourier cosine expansion of the even function f (x) in [−π, π].
Example 48.2.

a) Deduce the Fourier series expansion of the even function $f(x) = x^2$ in [−π, π].
b) Deduce the Fourier series expansion of the even function f (x) = |x| in [−π, π].

When f (x) is an odd function defined on the interval [−π, π], then f (−x) = −f (x). A
similar argument as in the case of even functions leads us to
\[ a_n = 0 \quad \text{for } n = 0, 1, \dots \]
and
\[ b_n = \frac{2}{\pi} \int_{0}^{\pi} f(x) \sin nx \, dx \quad \text{for } n = 1, 2, \dots \]
from which it follows that the Fourier series of an odd function defined on [−π, π] contains
only sine functions and is of the form
\[ f(x) = \sum_{n=1}^{\infty} b_n \sin nx \quad \text{for } x \in [-\pi, \pi]. \]

This is called the Fourier sine expansion of the odd function f (x) in [−π, π].
These results can be usefully interpreted in terms of an arbitrary function f (x) which is
to be expanded in the half interval [0, π]. Defining a new function g(x) by the rule
\[ g(x) = \begin{cases} f(-x) & \text{for } -\pi \le x < 0 \\ f(x) & \text{for } 0 \le x \le \pi \end{cases} \]
we see that g(x) is an even function which is equal to f (x) in the required interval [0, π].
Thus, as a Fourier cosine expansion of g(x) only requires the knowledge of g(x) in the
half interval [0, π] in which g(x) = f (x), it follows that

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx \quad \text{for } x \in [0, \pi] \]
is the desired cosine expansion of f (x) in [0, π].
Alternatively, we may expand the same function f (x) in the half interval [0, π] in a Fourier
sine series as follows: define a new function h(x) by the rule
\[ h(x) = \begin{cases} -f(-x) & \text{for } -\pi \le x < 0 \\ f(x) & \text{for } 0 \le x \le \pi. \end{cases} \]

Then, h(x) is an odd function which is equal to f (x) in the required interval [0, π]. The
Fourier sine expansion of h(x) only requires the knowledge of h(x) in the half interval
[0, π] where h(x) = f (x). So

\[ f(x) = \sum_{n=1}^{\infty} b_n \sin nx \quad \text{for } 0 \le x \le \pi \]

provides the desired sine expansion of f (x) for x ∈ [0, π]. These expansions are often
called the half-range expansions of f (x).
We have proved the following theorem:

Theorem 48.3 (Fourier sine and cosine series). If f (x) is an arbitrary function defined
and piecewise continuous on [0, π], then it may either be expanded as a Fourier cosine
series
\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx, \quad 0 \le x \le \pi \]
in which
\[ a_n = \frac{2}{\pi} \int_{0}^{\pi} f(x) \cos nx \, dx \quad \text{for } n = 0, 1, 2, \dots \]
or as a Fourier sine series
\[ f(x) = \sum_{n=1}^{\infty} b_n \sin nx, \quad 0 \le x \le \pi \]
in which
\[ b_n = \frac{2}{\pi} \int_{0}^{\pi} f(x) \sin nx \, dx \quad \text{for } n = 1, 2, \dots \]
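
Both half-range expansions can be compared numerically. The sketch below is only an illustration under stated assumptions: the test function f(x) = x on [0, π], the midpoint rule for the integrals, the truncation at 100 terms, and the helper names `half_range_coefficients`, `cosine_series` and `sine_series` are all choices made for this example.

```python
import math


def half_range_coefficients(f, n_max, samples=2000):
    """Midpoint-rule approximations of the half-range coefficients of f on [0, pi]."""
    h = math.pi / samples
    xs = [(i + 0.5) * h for i in range(samples)]
    a = [2.0 / math.pi * h * sum(f(x) * math.cos(n * x) for x in xs)
         for n in range(n_max + 1)]
    # b[0] is a placeholder so that b[n] matches the index n used in the theorem.
    b = [0.0] + [2.0 / math.pi * h * sum(f(x) * math.sin(n * x) for x in xs)
                 for n in range(1, n_max + 1)]
    return a, b


def cosine_series(a, x):
    return a[0] / 2.0 + sum(a[n] * math.cos(n * x) for n in range(1, len(a)))


def sine_series(b, x):
    return sum(b[n] * math.sin(n * x) for n in range(1, len(b)))


a, b = half_range_coefficients(lambda x: x, 100)
# Both truncated series approximate f(1) = 1 at the interior point x = 1.
print(cosine_series(a, 1.0), sine_series(b, 1.0))
```

For this test function the cosine coefficients decay like $1/n^2$ and the sine coefficients like $1/n$, so the truncated cosine series is noticeably closer to f(1) than the truncated sine series.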

Part III

Functions of several variables


49 Topology in Rn

Definition 49.1. The set Rn is the collection of all the finite sequences x = (x1 , x2 , ..., xn )
of n real numbers:

Rn = {(x1 , x2 , ..., xn )|xi ∈ R1 , i = 1, 2, ..., n}

Definition 49.2. A real valued function of n variables associates to every finite sequence
of n real numbers of a set $A \subset \mathbb{R}^n$ a unique real number.

Formally, f : A ⊂ Rn → R1 is given by

(x1 , x2 , . . . , xn ) = x 7→ f (x) = f (x1 , x2 , . . . , xn )

where x = (x1 , x2 , . . . , xn ) is an element of A ⊂ Rn .

Example 49.1. The function $f : \mathbb{R}^2 \to \mathbb{R}^1$ given by $f(x_1, x_2) = x_1^2 + x_2^2$ is a real valued
function of two variables.

Definition 49.3. A vector valued function f of n variables associates to every finite
sequence $x = (x_1, x_2, \dots, x_n)$ of n real numbers of the set $A \subset \mathbb{R}^n$ a unique vector f (x)
from $\mathbb{R}^m$.

Formally, f : A ⊂ Rn → Rm is given by

x 7→ f (x1 , x2 , . . . , xn ) = (f1 (x1 , x2 , . . . , xn ), . . . , fm (x1 , x2 , . . . , xn ))

where x = (x1 , x2 , . . . , xn ) ∈ A and (f1 (x1 , x2 , . . . , xn ), . . . , fm (x1 , x2 , . . . , xn )) ∈ Rm .

Example 49.2. The function f : R3 → R2 given by

f (x1 , x2 , x3 ) = (x1 + x2 + x3 , x1 · x2 · x3 )

is a vector function of three variables. Here f1 (x1 , x2 , x3 ) = x1 +x2 +x3 and f2 (x1 , x2 , x3 ) =
x1 · x2 · x3 .

If $f(x_1, x_2, \dots, x_n) = (f_1(x_1, x_2, \dots, x_n), \dots, f_m(x_1, x_2, \dots, x_n))$, then $f_i$ are real valued
functions of n variables for $i = 1, \dots, m$ and are called the scalar components of the vector
function f .
When functions of one real variable were discussed, it was necessary to investigate real
numbers which were close to a fixed real number a. This led to an interest in the quantity
|x − a|. Analogues of length and distance in $\mathbb{R}^n$ can be obtained as follows:

$\mathbb{R}^n$ is organized as an n-dimensional vector space using the sum and the multiplication
by a scalar $k \in \mathbb{R}^1$ defined by:

(x1 , x2 , . . . , xn ) + (y1 , y2 , . . . , yn ) = (x1 + y1 , x2 + y2 , . . . , xn + yn )

k(x1 , x2 , . . . , xn ) = (kx1 , kx2 , . . . , kxn )

For $x \in \mathbb{R}^n$ the norm (or length) of x is defined by

\[ \|x\| = \sqrt{\sum_{i=1}^{n} x_i^2} = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}. \]

The distance between x and $a = (a_1, a_2, \dots, a_n)$ is taken to be $\|x - a\|$. So

\[ \|x - a\| = \sqrt{\sum_{i=1}^{n} (x_i - a_i)^2} = \sqrt{(x_1 - a_1)^2 + (x_2 - a_2)^2 + \dots + (x_n - a_n)^2}. \]
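
The norm and the distance translate directly into code. A minimal sketch, in which points of $\mathbb{R}^n$ are represented as Python sequences of floats (a representation chosen for this example) and the names `norm` and `distance` are hypothetical:

```python
import math


def norm(x):
    """Euclidean norm of x = (x_1, ..., x_n)."""
    return math.sqrt(sum(xi * xi for xi in x))


def distance(x, a):
    """Distance between x and a, i.e. the norm of x - a."""
    return norm([xi - ai for xi, ai in zip(x, a)])


print(norm((3.0, 4.0)))                               # length of (3, 4)
print(distance((1.0, 2.0, 3.0), (1.0, 0.0, 3.0)))     # the points differ only in x_2
```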

A neighborhood of $a \in \mathbb{R}^n$ is a set $V \subset \mathbb{R}^n$ which contains a hypersphere $S_r(a)$ centered
at a,
\[ S_r(a) = \{x \in \mathbb{R}^n \mid \|x - a\| < r\}, \quad r > 0, \quad S_r(a) \subset V. \]
Deleting a from a neighborhood V of a, a deleted neighborhood of a is obtained:

\[ V' = V \setminus \{a\}. \]

Example 49.3. For any $(a_1, a_2) \in \mathbb{R}^2$ a hypersphere $S_r(a)$ is a neighborhood of
$a = (a_1, a_2)$. For example, the set

\[ S_1(0, -1) = \{(x_1, x_2) \mid x_1^2 + (x_2 + 1)^2 < 1\} \]

is a neighborhood of a = (0, −1).

Now the limit of a sequence (xk ) of points of Rn can be defined. A sequence (xk ) of points
of Rn is a function whose domain is the set of natural numbers and whose values belong
to Rn . The value of the function corresponding to argument k is denoted by xk . The
sequence x1 , x2 , . . . , xk , . . . is denoted by (xk ).

Definition 49.4. A vector $x \in \mathbb{R}^n$ is said to be the limit of the sequence $(x_k)$ if for any
ε > 0 there exists N = N (ε) > 0 such that for any k > N we have $\|x_k - x\| < \varepsilon$. In this
case we write $\lim_{k\to\infty} x_k = x$.

The limit of the sequence $(x_k)$, if it exists, is unique. If a sequence $(x_k)$ converges to x,
then the sequence is bounded, i.e. there exists M > 0 such that $\|x_k\| < M$ for any $k \in \mathbb{N}$.
If a sequence $(x_k)$ converges to x, then any subsequence $(x_{k_l})$ of the sequence $(x_k)$ converges
to x.

A sequence $(x_k)$, $x_k = (x_{1k}, x_{2k}, \dots, x_{nk}) \in \mathbb{R}^n$, converges to $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$ if and
only if the sequence $(x_{ik})$ converges to $x_i$ for any $i = 1, 2, \dots, n$.
According to the Bolzano-Weierstrass theorem, any bounded sequence $(x_k)$ of points of $\mathbb{R}^n$
contains a convergent subsequence.
Cauchy's criterion for the convergence of a sequence $(x_k)$ of points $x_k \in \mathbb{R}^n$ states
that $(x_k)$ converges if and only if for any ε > 0 there exists $N_\varepsilon$ such that for $p, q > N_\varepsilon$ we
have $\|x_p - x_q\| < \varepsilon$.
Definition 49.5. Let $A \subset \mathbb{R}^n$. A point $x \in \mathbb{R}^n$ is called an interior point of the set A
if there exists a hypersphere $S_r(x)$ such that $S_r(x) \subset A$.

For instance, a point y of a hypersphere Sr (a) is an interior point of the hypersphere.


Definition 49.6. The interior set of A ⊂ Rn is the set of all interior points of the set A.
Usually the interior of a set A is denoted by Int(A).

For instance, if A is the hypersphere Sr (a), then Int(A) = Sr (a) = A.


Definition 49.7. A set A ⊂ Rn is said to be open if A = Int(A).

For instance, any hypersphere Sr (x) is an open set.


A set A ⊂ Rn is open if and only if it contains a neighborhood of each of its points.
The union of any family of open sets is open.
The sets Rn and ∅ are open.
The intersection of a finite number of open sets is open.
Definition 49.8. A set A ⊂ Rn is said to be closed if its complement is open.

The intersection of any family of closed sets is closed.


The union of a finite number of closed sets is closed.
The sets Rn and ∅ are closed.
A closed hypersphere $\overline{S}_r(a)$ defined as:

\[ \overline{S}_r(a) = \{x \in \mathbb{R}^n \mid \|x - a\| \le r\} \]

is closed.
Definition 49.9. A point a ∈ Rn is a limit point (or a point of accumulation) of the set
A ⊂ Rn provided every deleted neighborhood of a intersects A.
Definition 49.10. The closure $\overline{A}$ of a set $A \subset \mathbb{R}^n$ is the intersection of all closed sets
containing A.
The set of points in $\overline{A}$ and not in the interior Int(A) of A is called the boundary of A
and it is denoted by ∂A.

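The closure operation has the properties:

• $\overline{A \cup B} = \overline{A} \cup \overline{B}$;

• $\overline{A} \supset A$;

• $\overline{\overline{A}} = \overline{A}$;

• $\overline{A} = A$ ⇔ A is closed;

• $x \in \overline{A}$ if and only if every neighborhood V of x intersects A.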

Definition 49.11. A set A ⊂ Rn is bounded if there exists r > 0 such that A ⊂ Sr (0).

Definition 49.12. A set A ⊂ Rn is compact if it is both bounded and closed.

For instance, a closed hypersphere $\overline{S}_r(a)$ is a compact set.

Example 49.4. The set D defined by

D = {(x, y) | x + y ≤ 1 and x ≥ 0 and y ≥ 0}

is a compact subset of R2 .

Solution: D is bounded since it is contained in the hypersphere $S_2(0) = \{(x, y) \mid x^2 + y^2 < 4\}$.
If a is an element of the complement of D, then a lies at a distance r > 0 from at
least one of the lines x + y = 1, x = 0 or y = 0. Hence, the open hypersphere $S_r(a)$ lies
in one of the regions x + y > 1, x < 0 or y < 0. So $S_r(a)$ lies in the complement of D.
Thus, D is closed.

Remark 49.1. If $A \subset \mathbb{R}^n$ is a compact set, then every sequence $(x_k)$ with $x_k \in A$ contains
a subsequence $(x_{k_l})$ which converges to a point $x_0 \in A$.

Definition 49.13. A set $A \subset \mathbb{R}^n$ is connected if there are no open sets $G_1$, $G_2$ such that

\[ A \subset G_1 \cup G_2, \quad A \cap G_1 \neq \emptyset, \quad A \cap G_2 \neq \emptyset, \quad \text{and} \quad (A \cap G_1) \cap (A \cap G_2) = \emptyset. \]

50 Limit of a function at a point

Definition 50.1. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ be a real valued n variable function and a a point
of accumulation of A (i.e. every deleted neighborhood of a contains at least one point
$a' \in A$). The real number L is called the limit of f (x) as x tends to a if for any ε > 0,
there exists δ = δ(ε) > 0 such that

\[ 0 < \|x - a\| < \delta \Rightarrow |f(x) - L| < \varepsilon. \]

We write
\[ \lim_{x \to a} f(x) = L. \]

Just like in the case of functions of one variable, this definition is technically difficult to
implement except for the simplest functions. However, the obvious generalization of the
sum, product and quotient rules can be proved. Their use is illustrated in the following
example.
Example 50.1. Evaluate $\lim_{(x,y)\to(2,1)} f(x, y)$ where $f(x, y) = \dfrac{x^2 - y^2}{x^2 + y^2}$.

Solution: As x → 2 and y → 1, $x^2 - y^2 \to 3$ and $x^2 + y^2 \to 5$. Hence

\[ \frac{x^2 - y^2}{x^2 + y^2} \xrightarrow[(x,y)\to(2,1)]{} \frac{3}{5}. \]

Definition 50.2. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ be a vector valued function of n variables and a a
point of accumulation of A. $L \in \mathbb{R}^m$ is called the limit of f (x) as x tends to a, if for
any ε > 0, there exists δ > 0 such that

\[ 0 < \|x - a\| < \delta \Rightarrow \|f(x) - L\| < \varepsilon. \]

We write
\[ L = \lim_{x \to a} f(x). \]

If $f(x_1, \dots, x_n) = (f_1(x_1, \dots, x_n), \dots, f_m(x_1, \dots, x_n))$ and $L = (L_1, \dots, L_m)$, then the
following statement holds:

\[ \lim_{x \to a} f(x) = L \Leftrightarrow \lim_{x \to a} f_i(x) = L_i, \quad i = 1, \dots, m. \]

Example 50.2. Evaluate $\lim_{(x,y)\to(2,1)} f(x, y)$ where $f(x, y) = \left( \dfrac{xy}{x^2 + y^2}, \; x - y \right)$.
Theorem 50.1 (Heine's criterion for the limit). The function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ has a
limit as x approaches a if and only if for any sequence $(x_k)$, $x_k \in A$, $x_k \neq a$, and $x_k \to a$
as k → ∞, the sequence $(f(x_k))$ converges.

Proof. As for the one variable real valued functions.

Theorem 50.2 (Cauchy-Bolzano's criterion for the limit). The function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$
has a limit as x → a if and only if for any ε > 0 there exists δ > 0 such that

\[ 0 < \|x' - a\| < \delta \text{ and } 0 < \|x'' - a\| < \delta \Rightarrow \|f(x') - f(x'')\| < \varepsilon. \]

Proof. As for the one variable real valued functions.

51 Continuity

Definition 51.1. A real valued n variable function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is continuous at
$a \in A$ if $\lim_{x \to a} f(x) = f(a)$.

This definition requires three things: firstly that $\lim_{x \to a} f(x)$ exists, secondly that f (a) is
defined, and finally that the previous two values are equal.
In terms of ε, δ this is equivalent to the following:
Definition 51.2. A function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is continuous at $a \in A$ if for every ε > 0
there exists δ = δ(ε) > 0 such that
\[ \|x - a\| < \delta \Rightarrow |f(x) - f(a)| < \varepsilon. \]
Definition 51.3. A vector valued n-variable function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is continuous at
$a \in A$ if for every ε > 0 there exists δ = δ(ε) > 0 such that
\[ \|x - a\| < \delta \Rightarrow \|f(x) - f(a)\| < \varepsilon. \]
Example 51.1. Use the ε, δ condition of continuity to prove that the following functions
are continuous at the mentioned points:

a) $f(x, y) = x^2 + y^2$ at x = y = 0;
b) $f(x, y) = (x^2 - y^2, \; x \cdot y)$ at x = 1, y = 1;
c) $f(x, y, z) = x + y + z$ at x = y = z = 0;
d) $f(x, y, z) = (x^2 + y^2 + z^2, \; x + y + z)$ at x = y = z = 1.

The rules for continuous functions of one variable can be generalized to give corresponding
rules for functions of several variables. These are stated in the next two theorems.
Theorem 51.1. Let f and g be real valued functions of n variables defined in a
neighborhood of a. If f and g are continuous at a, then so are f + g, f · g, and, when
$f(a) \neq 0$, $\dfrac{1}{f}$.
Theorem 51.2. Let f : A ⊂ Rn → B ⊂ Rm be continuous at a ∈ A and g : B ⊂ Rm → Rp
be continuous at f (a) = b ∈ Rm . Then the composite function g◦f : A → Rp is continuous
at a.

Discontinuities of functions of more than one variable are often difficult to spot. In the
case of a function f of two variables, any discontinuities can be visualized geometrically
by appealing to the surface in R3 represented by the equation z = f (x, y). Some, but by
no means all, of the discontinuities of such a surface correspond to holes or tears.
Example 51.2. The function $f : \mathbb{R}^2 \to \mathbb{R}^1$ given by $f(x, y) = x^2 + y^2$ is continuous for
all (x, y). The surface given by z = f (x, y) is a parabolic bowl. To see this, notice that a
horizontal section for z = k ≥ 0 gives the circle $x^2 + y^2 = k$ and a vertical cross-section
for fixed x or y gives a parabola.
Example 51.3. Investigate the behavior of the function f given by
\[ f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{if } (x, y) \neq (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases} \]
as (x, y) approaches (0, 0).

Solution: f is discontinuous at (0, 0) and the discontinuity is much nastier than the
removable or jump (finite) discontinuities seen for functions of one variable. Recall that
when establishing the discontinuity of a function of one variable at some point, often the
right and left-hand limits existed, but were unequal. Clearly, if $\lim_{(x,y)\to(0,0)} f(x, y)$ exists,
its value must be independent of the way in which (x, y) approaches (0, 0). Consider
$\lim_{x\to 0} f(x, mx)$:
\[ \lim_{x\to 0} f(x, mx) = \lim_{x\to 0} \frac{m x^2}{x^2 + m^2 x^2} = \frac{m}{1 + m^2}. \]
But this quantity varies with m and so f cannot be continuous at (0, 0), no matter what
value is specified for f (0, 0).
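
The dependence on the direction of approach is easy to observe numerically. A minimal sketch (the sample slopes and the small value t = 1e-8 are arbitrary choices made for illustration):

```python
def f(x, y):
    """The function of Example 51.3."""
    if (x, y) == (0.0, 0.0):
        return 0.0
    return x * y / (x * x + y * y)


t = 1e-8  # a point very close to the origin along each line y = m * x
for m in (0.0, 1.0, 2.0, -1.0):
    # The value along y = m * x agrees with m / (1 + m^2) and changes with m.
    print(m, f(t, m * t), m / (1.0 + m * m))
```

Since the values along different lines stay apart however small t is chosen, no single number can serve as the limit at (0, 0).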
Example 51.4. Investigate the behavior of the function f given by

\[ f(x, y) = \begin{cases} \dfrac{x y^3}{x^2 + y^6} & \text{if } (x, y) \neq (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases} \]

as (x, y) approaches (0, 0).

Solution: Firstly
\[ \lim_{x\to 0} f(x, mx) = \lim_{x\to 0} \frac{m^3 x^4}{x^2 + m^6 x^6} = \lim_{x\to 0} \frac{m^3 x^2}{1 + m^6 x^4} = 0. \]

However this is not sufficient evidence to suppose that f (x, y) approaches 0 as (x, y)
approaches (0, 0). In fact
\[ \lim_{x\to 0} f(x, \sqrt[3]{x}) = \lim_{x\to 0} \frac{x^2}{x^2 + x^2} = \frac{1}{2}. \]
Hence f is discontinuous at (0, 0).
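
A numerical sketch of the same phenomenon follows. As an assumption made for convenience, the curve $y = \sqrt[3]{x}$ is parametrized as $(t^3, t)$, which describes the same set of points and avoids cube roots of negative numbers; the sample values of t and m are arbitrary.

```python
def f(x, y):
    """The function of Example 51.4."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y ** 3 / (x * x + y ** 6)


# Along straight lines y = m * x the values tend to 0 ...
for m in (1.0, 2.0):
    print("line", m, f(1e-3, m * 1e-3))

# ... but along the curve x = t^3, y = t they stay equal to 1/2.
for t in (1e-1, 1e-2, 1e-3):
    print("curve", t, f(t ** 3, t))
```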
Theorem 51.3. Let f : A ⊂ Rn → Rm , f (x) = (f1 (x), . . . , fm (x)) and a ∈ A. The
function f is continuous at a ∈ A if and only if the functions fi , i = 1, 2, . . . , m are
continuous at a.
Theorem 51.4 (Heine's criterion for continuity). The function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is
continuous at $a \in A$ if and only if for any sequence $(x_k)$, $x_k \in A$, $x_k \to a$ as k → ∞, the sequence
$(f(x_k))$ converges to f (a).
Theorem 51.5 (Cauchy-Bolzano's criterion for continuity). The function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$
is continuous at $a \in A$ if and only if for any ε > 0 there exists δ > 0 such that

\[ \|x' - a\| < \delta \text{ and } \|x'' - a\| < \delta \Rightarrow \|f(x') - f(x'')\| < \varepsilon. \]

Remark 51.1. If the function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is continuous at $a \in A$, then the
function $\|f\| : A \subset \mathbb{R}^n \to \mathbb{R}^1_+$ defined by $\|f\|(x) = \|f(x)\|$ is continuous at a.
Remark 51.2. Generalizations of some important theorems, proved for single variable
real valued continuous functions, require higher dimensional analogues of the topology
in $\mathbb{R}^1$.

52 Important properties of continuous functions

Theorem 52.1 (The boundedness property). If $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is continuous on the
compact set A, then

a) the set f (A) is bounded and

b) there exists $a \in A$ such that $\|f(a)\| = \sup\{\|f(x)\| \mid x \in A\}$.

Proof.
a) Assume that f (A) is unbounded. Then for every $k \in \mathbb{N}$ there exists $x_k \in A$ such that
$\|f(x_k)\| > k$. The sequence $(x_k)$ is bounded and therefore there exists a subsequence $(x_{k_l})$
of the sequence $(x_k)$ which converges towards a point $x_0 \in A$, $x_{k_l} \to x_0$ as $k_l \to \infty$. Hence, the
sequence $(f(x_{k_l}))$ converges to $f(x_0)$. Therefore, there exists N such that for $k_l > N$ we
have $\|f(x_{k_l})\| \le \|f(x_0)\| + 1$, which contradicts $\|f(x_{k_l})\| > k_l$.
b) Consider $R = \sup\{\|f(x)\| \mid x \in A\}$ and note that for every $k \in \mathbb{N}$ there exists $x_k \in A$ such that
\[ R - \frac{1}{k} < \|f(x_k)\| \le R. \]
For the sequence $(x_k)$ there exists a subsequence $(x_{k_l})$ such that $x_{k_l} \to x_0 \in A$ as $k_l \to \infty$. Therefore
$f(x_{k_l}) \to f(x_0)$ and $\|f(x_{k_l})\| \to \|f(x_0)\|$ as $k_l \to \infty$. From the inequality
\[ R - \frac{1}{k_l} < \|f(x_{k_l})\| \le R \]
it follows that $\|f(x_0)\| = R$.
Corollary 52.1. If $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is continuous on the compact set A, then:

a) there exist m, M such that
\[ m = \inf\{f(x) \mid x \in A\}, \quad M = \sup\{f(x) \mid x \in A\}; \]

b) there exist $c, d \in A$ such that f (c) = m and f (d) = M .

Definition 52.1. A function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is uniformly continuous (on A) if for
every ε > 0 there exists δ = δ(ε) > 0 such that for $x', x'' \in A$ we have
\[ \|x' - x''\| < \delta \Rightarrow |f(x') - f(x'')| < \varepsilon. \]
Theorem 52.2. A vector valued function f : A ⊂ Rn → Rm is uniformly continuous (on
A) if and only if its scalar components f1 , f2 , . . . , fm : A → R1 are uniformly continuous.
Theorem 52.3. If f : A ⊂ Rn → Rm is continuous on the compact set A, then f is
uniformly continuous on A.
Theorem 52.4. Let $A \subset \mathbb{R}^n$, $A' \subset \mathbb{R}^m$ and $f : A \to A'$ such that $f(A) = A'$. The
function f is continuous on A if and only if for every open set $G' \subset \mathbb{R}^m$, there exists an
open set $G \subset \mathbb{R}^n$ such that
\[ G \cap A = f^{-1}(G' \cap A'). \]

Corollary 52.2. The function f is continuous on A if and only if for every closed set
$F' \subset \mathbb{R}^m$, there exists a closed set $F \subset \mathbb{R}^n$ such that

\[ F \cap A = f^{-1}(F' \cap A'). \]

Theorem 52.5. If the set $A \subset \mathbb{R}^n$ is connected and the function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is
continuous, then the set f (A) is connected.

Corollary 52.3. If the set A ⊂ Rn is compact and connected and the function f : A ⊂
Rn → R1 is continuous, then f (A) is a closed interval.

53 Differentiation

This section defines what is meant by saying that a function of n variables is differentiable,
but to begin, let’s examine the concept of partial differentiability.

Definition 53.1. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ be a real valued function of n variables and a
an interior point of the set A. The function f is said to be partially differentiable with
respect to $x_i$ at a if

\[ \lim_{h\to 0} \frac{f(a_1, \dots, a_{i-1}, a_i + h, a_{i+1}, \dots, a_n) - f(a_1, \dots, a_i, \dots, a_n)}{h} \]
exists. The value of this limit is denoted by $\dfrac{\partial f}{\partial x_i}(a)$ and is called the partial derivative of
f with respect to $x_i$ at the point a.

Definition 53.2. If f is partially differentiable with respect to $x_i$ in a neighborhood of a,
then the function $x \mapsto \dfrac{\partial f}{\partial x_i}(x)$ defined on that neighborhood is called the partial derivative
of f with respect to $x_i$.

Remark 53.1. To calculate partial derivatives, one has to differentiate (in the normal
manner) with respect to xi keeping all the other variables fixed. Hence, the obvious rules
for partially differentiating sums, products and quotients can be used.

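Example 53.1. Calculate the partial derivatives of the function f given by

\[ f(x, y, z) = x^2 y + x \cdot \sin y + \frac{y}{z}. \]

Solution: The partial derivative with respect to x is

\[ \frac{\partial f}{\partial x} = 2xy + \sin y. \]
Similarly,
\[ \frac{\partial f}{\partial y} = x^2 + x \cos y + \frac{1}{z} \quad \text{and} \quad \frac{\partial f}{\partial z} = -\frac{y}{z^2}. \]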
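
These formulas can be cross-checked with difference quotients. The sketch below assumes central differences with step h = 1e-6 and the sample point (1, 2, 3); both are arbitrary choices for this illustration, and the helper name `partial` is hypothetical.

```python
import math


def f(x, y, z):
    """The function of Example 53.1."""
    return x * x * y + x * math.sin(y) + y / z


def partial(func, point, i, h=1e-6):
    """Central-difference approximation of the i-th partial derivative at point."""
    forward = list(point)
    backward = list(point)
    forward[i] += h   # only the i-th variable changes, the others stay fixed
    backward[i] -= h
    return (func(*forward) - func(*backward)) / (2.0 * h)


x, y, z = 1.0, 2.0, 3.0
exact = (2 * x * y + math.sin(y), x * x + x * math.cos(y) + 1 / z, -y / z ** 2)
for i in range(3):
    print(partial(f, (x, y, z), i), exact[i])
```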

Example 53.2. Calculate the partial derivatives of the function f given by
\[ f(x_1, \dots, x_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j, \quad (x_1, \dots, x_n) \in \mathbb{R}^n. \]

Solution:
\[ \frac{\partial f}{\partial x_k} = \sum_{j=1}^{n} (a_{kj} + a_{jk}) x_j. \]
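
A small sketch comparing this formula with difference quotients follows. The matrix `A`, the point `x` and the helper names are assumptions made for the illustration; the matrix is deliberately non-symmetric so that both $a_{kj}$ and $a_{jk}$ matter.

```python
def quadratic_form(A, x):
    """f(x) = sum_i sum_j a_ij * x_i * x_j."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def gradient_formula(A, x):
    """Partial derivatives from the formula sum_j (a_kj + a_jk) * x_j."""
    n = len(x)
    return [sum((A[k][j] + A[j][k]) * x[j] for j in range(n)) for k in range(n)]


def gradient_numeric(A, x, h=1e-6):
    """Central-difference approximation of the same partial derivatives."""
    grad = []
    for k in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[k] += h
        backward[k] -= h
        grad.append((quadratic_form(A, forward) - quadratic_form(A, backward)) / (2.0 * h))
    return grad


A = [[1.0, 2.0], [0.0, 3.0]]   # hypothetical non-symmetric coefficient matrix
x = [1.0, -1.0]
print(gradient_formula(A, x), gradient_numeric(A, x))
```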

Definition 53.3. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ be an n variable vector valued function (i.e.
$f(x_1, \dots, x_n) = (f_1(x_1, \dots, x_n), \dots, f_m(x_1, \dots, x_n))$) and a an interior point of the set A.
The function f is said to be partially differentiable with respect to $x_i$ at a if the scalar
components $f_1(x_1, \dots, x_n), \dots, f_m(x_1, \dots, x_n)$ are partially differentiable with respect to $x_i$
at a. The vector $\left( \dfrac{\partial f_1}{\partial x_i}(a), \dots, \dfrac{\partial f_m}{\partial x_i}(a) \right)$ is called the partial derivative of f with respect
to $x_i$ at a and it is denoted by $\dfrac{\partial f}{\partial x_i}(a)$:
\[ \frac{\partial f}{\partial x_i}(a) = \left( \frac{\partial f_1}{\partial x_i}(a), \dots, \frac{\partial f_m}{\partial x_i}(a) \right). \]

If $f = (f_1, \dots, f_m)$ is partially differentiable with respect to $x_i$ in a neighborhood of a, a
function $\dfrac{\partial f}{\partial x_i}$ defined on that neighborhood, called the partial derivative of f with respect
to $x_i$, is obtained:
\[ x \mapsto \frac{\partial f}{\partial x_i}(x) = \left( \frac{\partial f_1}{\partial x_i}(x), \dots, \frac{\partial f_m}{\partial x_i}(x) \right). \]
Example 53.3. Calculate the partial derivatives of the function f given by
f (x, y, z) = (x + y + z, xy + xz + yz, xyz).

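Solution:
\[ \frac{\partial f}{\partial x} = (1, \; y + z, \; yz); \quad \frac{\partial f}{\partial y} = (1, \; x + z, \; xz); \quad \frac{\partial f}{\partial z} = (1, \; x + y, \; xy). \]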
Definition 53.4. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ be a real valued function of n variables, a an
interior point of the set A and u a unit vector in $\mathbb{R}^n$ (i.e. $\|u\| = 1$). If the limit
\[ \lim_{t\to 0} \frac{f(a + t \cdot u) - f(a)}{t} \]
exists, it is called the directional derivative of f at the point a in the direction u and it is denoted by
\[ \nabla_u f(a) = \lim_{t\to 0} \frac{f(a + t \cdot u) - f(a)}{t}. \]
Remark 53.2. Let $e_i = (0, \dots, 0, 1, 0, \dots, 0)$, with 1 in the i-th position. The directional derivative of f at a in the direction $e_i$ is

\[ \nabla_{e_i} f(a) = \frac{\partial f}{\partial x_i}(a), \quad i = 1, \dots, n. \]
Hence, partial derivatives are special cases of directional derivatives.

Example 53.4. If $u = (u_x, u_y)$ and $f(x, y) = x \cdot y$, then
\[ \nabla_u f(x, y) = \frac{\partial f}{\partial x} \cdot u_x + \frac{\partial f}{\partial y} \cdot u_y = y \cdot u_x + x \cdot u_y. \]
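
The formula of Example 53.4 can be compared with a difference quotient taken along the direction u. In this sketch the point a = (2, 3), the unit vector u = (0.6, 0.8) and the symmetric quotient with t = 1e-6 are assumptions chosen for illustration; `directional_derivative` is a hypothetical helper name.

```python
def f(x, y):
    """The function of Example 53.4."""
    return x * y


def directional_derivative(func, a, u, t=1e-6):
    """Symmetric difference quotient approximating the limit that defines the directional derivative."""
    forward = [ai + t * ui for ai, ui in zip(a, u)]
    backward = [ai - t * ui for ai, ui in zip(a, u)]
    return (func(*forward) - func(*backward)) / (2.0 * t)


a = (2.0, 3.0)
u = (0.6, 0.8)                       # unit vector: 0.36 + 0.64 = 1
formula = a[1] * u[0] + a[0] * u[1]  # y * u_x + x * u_y at the point a
print(directional_derivative(f, a, u), formula)
```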
Definition 53.5. Let $f = (f_1, \dots, f_m)$ be a vector valued n variables function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$,
a an interior point of A and u a unit vector in $\mathbb{R}^n$ (i.e. $\|u\| = 1$).
If the limit
\[ \lim_{t\to 0} \frac{f(a + t \cdot u) - f(a)}{t} \]
exists, it is called the directional derivative of f at the point a and it is denoted by $\nabla_u f(a)$.
It is easy to see that
\[ \nabla_u f(a) = (\nabla_u f_1(a), \dots, \nabla_u f_m(a)). \]
Remark 53.3. The directional derivative $\nabla_u f(a)$ exists if the directional derivatives
$\nabla_u f_i(a)$, $i = 1, \dots, m$, exist.
Example 53.5. The directional derivative of the function f (x, y, z) = (xy +xz +yz, xyz)
at the point (x, y, z) is

∇u f (x, y, z) = ((y + z)ux + (x + z)uy + (x + y)uz , yz · ux + xz · uy + xy · uz ).

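Theorem 53.1. Let f be a real valued function of n variables, $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$, and a
an interior point of A. If the partial derivatives $\dfrac{\partial f}{\partial x_i}$, $i = 1, \dots, n$, exist in a neighborhood of a
and they are continuous at a, then the following equality holds:
\[ \lim_{h\to 0} \frac{1}{\|h\|} \left[ f(a + h) - f(a) - \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(a) \cdot h_i \right] = 0. \]

Proof. Consider the vectors $v_j = (a_1, a_2, \dots, a_j, a_{j+1} + h_{j+1}, \dots, a_n + h_n)$ for $j = 0, \dots, n-1$
and $v_n = a$ and represent f (a + h) − f (a) in the form:
\[ \begin{aligned} f(a + h) - f(a) &= \sum_{j=0}^{n-1} [f(v_j) - f(v_{j+1})] = \sum_{j=0}^{n-1} \frac{\partial f}{\partial x_{j+1}}(v_{j+1} + \theta_{j+1} \cdot h_{j+1} \cdot e_{j+1}) \, h_{j+1} \\ &= \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(v_i + \theta_i \cdot h_i \cdot e_i) \cdot h_i \quad \text{with } 0 \le \theta_i \le 1 \end{aligned} \]

where $e_i = (0, \dots, 0, 1, 0, \dots, 0)$.

Hence:
\[ \frac{1}{\|h\|} \left[ f(a + h) - f(a) - \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(a) \cdot h_i \right] = \frac{1}{\|h\|} \sum_{i=1}^{n} \left[ \frac{\partial f}{\partial x_i}(v_i + \theta_i \cdot h_i \cdot e_i) - \frac{\partial f}{\partial x_i}(a) \right] \cdot h_i. \]

Now remark that $\|v_i + \theta_i \cdot h_i \cdot e_i - a\| \le \|h\|$, $i = 1, \dots, n$, and therefore, for ε > 0, there is
δ > 0 such that $\|h\| < \delta$ ⇒
\[ \left| \frac{\partial f}{\partial x_i}(v_i + \theta_i \cdot h_i \cdot e_i) - \frac{\partial f}{\partial x_i}(a) \right| < \frac{\varepsilon}{n}, \quad i = 1, \dots, n. \]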

So,
\[ \|h\| < \delta \Rightarrow \frac{1}{\|h\|} \left| f(a + h) - f(a) - \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(a) \cdot h_i \right| < \varepsilon. \]

The above theorem shows that in a small neighborhood of a the function f can be
approximated by the polynomial of first degree $f(a) + \sum_{i=1}^{n} \dfrac{\partial f}{\partial x_i}(a) \cdot h_i$.
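
The quality of this first-degree approximation can be observed on a simple case. The sketch below takes $f(x, y) = x^2 + y^2$ and a = (1, 2) as assumptions for illustration; for this function the partial derivatives are 2x and 2y, and the quotient appearing in the theorem equals $\|h\|$ exactly. The name `remainder_ratio` is hypothetical.

```python
import math


def f(x, y):
    return x * x + y * y


def remainder_ratio(a, h):
    """(f(a + h) - f(a) - first-degree part) / ||h|| for f(x, y) = x^2 + y^2."""
    first_degree = 2.0 * a[0] * h[0] + 2.0 * a[1] * h[1]
    increment = f(a[0] + h[0], a[1] + h[1]) - f(a[0], a[1])
    return (increment - first_degree) / math.hypot(h[0], h[1])


a = (1.0, 2.0)
for s in (1e-1, 1e-2, 1e-3):
    # The ratio shrinks together with ||h||, as the theorem states.
    print(s, remainder_ratio(a, (s, s)))
```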

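Definition 53.6. A real valued n variables function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is said to be
differentiable at a if it is partially differentiable at a with respect to every variable $x_i$ and
\[ \lim_{h\to 0} \frac{1}{\|h\|} \left[ f(a + h) - f(a) - \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(a) \cdot h_i \right] = 0. \]

The function $d_a f : \mathbb{R}^n \to \mathbb{R}^1$ defined by

\[ d_a f(h) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(a) \cdot h_i, \quad \forall h \in \mathbb{R}^n \]

is called the Fréchet derivative of f at a.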

Remark 53.4. The Fréchet derivative $d_a f : \mathbb{R}^n \to \mathbb{R}^1$ of a function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ at
a is a linear function on $\mathbb{R}^n$. It is a polynomial of first degree in $h_1, h_2, \dots, h_n$.

Remark 53.5. For $\|h\| = 1$, we have $d_a f(h) = \nabla_h f(a)$.

Example 53.6. Show that the following functions are differentiable and compute their
Fréchet derivatives.

a) $f(x, y) = x^2 + y^2$, $d_{(x,y)} f(h_x, h_y) = 2x \cdot h_x + 2y \cdot h_y$;

b) $f(x, y) = x \cdot y$, $d_{(x,y)} f(h_x, h_y) = y \cdot h_x + x \cdot h_y$;

c) $f(x, y, z) = x \cdot y + x \cdot z + y \cdot z$, $d_{(x,y,z)} f(h_x, h_y, h_z) = (y + z) \cdot h_x + (x + z) \cdot h_y + (x + y) \cdot h_z$.

Remark 53.6. If the real valued function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is differentiable at $a \in A$, then
f is continuous at a.

Definition 53.7. Let $f = (f_1, \dots, f_m)$ be a vector valued function of n variables,
$f : A \subset \mathbb{R}^n \to \mathbb{R}^m$, and a an interior point of A. The function f is said to be differentiable
at a if every scalar component $f_j$, $j = 1, \dots, m$, of f is differentiable at a.
The function $d_a f : \mathbb{R}^n \to \mathbb{R}^m$ defined by
\[ d_a f(h) = \sum_{j=1}^{m} \left( \sum_{i=1}^{n} \frac{\partial f_j}{\partial x_i}(a) \cdot h_i \right) \cdot e_j \]
is called the Fréchet derivative of f at a, where $e_j = (0, \dots, 0, 1, 0, \dots, 0) \in \mathbb{R}^m$.

The Fréchet derivative is a set of m polynomials of first degree in $h_1, h_2, \dots, h_n$.

Example 53.7. Show that $f(x_1, x_2, x_3) = (x_1 x_2 x_3, \; x_1^2 + x_2^2 + x_3^2)$ is differentiable at any
point and compute its Fréchet derivative.

Solution: $d_x f(h) = (x_2 x_3 h_1 + x_1 x_3 h_2 + x_1 x_2 h_3, \; 2x_1 h_1 + 2x_2 h_2 + 2x_3 h_3)$

Definition 53.8. The matrix of the linear function $d_a f$ is called the Jacobi matrix of f
at a. This is an m × n matrix and is given by
\[ J_a(f) = \left( \frac{\partial f_i}{\partial x_j}(a) \right)_{m \times n}. \]

We have $d_a f(h) = J_a(f) \cdot h$.
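
As a sketch of how the Jacobi matrix can be checked in practice, the code below uses the function of Example 53.7. The central differences, the sample point (1, 2, 3) and the helper names `jacobian_exact`, `jacobian_numeric` and `apply_matrix` are assumptions made for this illustration.

```python
def f(x1, x2, x3):
    """The function of Example 53.7."""
    return (x1 * x2 * x3, x1 * x1 + x2 * x2 + x3 * x3)


def jacobian_exact(x1, x2, x3):
    """Rows hold the partial derivatives of f_1 and f_2 (a 2 x 3 matrix)."""
    return [[x2 * x3, x1 * x3, x1 * x2],
            [2 * x1, 2 * x2, 2 * x3]]


def jacobian_numeric(func, point, h=1e-6):
    """Central-difference approximation of the Jacobi matrix at point."""
    m, n = len(func(*point)), len(point)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        forward = list(point)
        backward = list(point)
        forward[j] += h
        backward[j] -= h
        ff, fb = func(*forward), func(*backward)
        for i in range(m):
            J[i][j] = (ff[i] - fb[i]) / (2.0 * h)
    return J


def apply_matrix(J, h):
    """Matrix-vector product J * h, i.e. the value of the Fréchet derivative at h."""
    return [sum(row[j] * h[j] for j in range(len(h))) for row in J]


point = (1.0, 2.0, 3.0)
print(jacobian_exact(*point))
print(jacobian_numeric(f, point))
print(apply_matrix(jacobian_exact(*point), [1.0, 0.0, 0.0]))  # first column of the matrix
```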

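Example 53.8. Consider the vector valued function of n variables $f : \mathbb{R}^n \to \mathbb{R}^m$ defined
by
\[ f(x_1, \dots, x_n) = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij} x_j \right) \cdot e_i. \]

Show that f is Fréchet differentiable at any point $x \in \mathbb{R}^n$ and the following relations hold:
\[ d_x f(h) = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij} h_j \right) \cdot e_i \]
\[ J_x(f) = \left( \frac{\partial f_i}{\partial x_j}(x) \right)_{m \times n} = (a_{ij})_{m \times n}. \]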

Remark 53.7. If the vector valued function of n variables f : A ⊂ Rn → Rm is


differentiable at a ∈ A, then f is continuous at a.

Theorem 53.2. Let f : A ⊂ Rn → B ⊂ Rm and g : B ⊂ Rm → Rp . If f is differentiable


at a ∈ Int(A) and g is differentiable at f (a) = b ∈ Int(B), then h = g ◦ f is differentiable
at a and da h = db g ◦ da f.

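Proof. f differentiable at a implies
\[
f(x) - f(a) = d_a f(x - a) + \varepsilon_1(x) \cdot \|x - a\|
\]
with $\varepsilon_1(x) \to 0$ as x → a.
g differentiable at b = f(a) implies
\[
g(y) - g(b) = d_b g(y - b) + \varepsilon_2(y) \cdot \|y - b\|
\]
with $\varepsilon_2(y) \to 0$ as y → b.
Hence
\[
\begin{aligned}
h(x) - h(a) &= g(f(x)) - g(f(a)) = d_b g(f(x) - f(a)) + \varepsilon_2(f(x)) \cdot \|f(x) - f(a)\| \\
&= d_b g\big(d_a f(x - a) + \varepsilon_1(x) \cdot \|x - a\|\big) + \varepsilon_2(f(x)) \cdot \big\| d_a f(x - a) + \varepsilon_1(x) \cdot \|x - a\| \big\| \\
&= (d_b g \circ d_a f)(x - a) + \|x - a\| \cdot d_b g(\varepsilon_1(x)) + \big\| d_a f(x - a) + \varepsilon_1(x) \cdot \|x - a\| \big\| \cdot \varepsilon_2(f(x)).
\end{aligned}
\]
Denote, for x ≠ a,
\[
\varepsilon_3(x) = \frac{h(x) - h(a) - (d_b g \circ d_a f)(x - a)}{\|x - a\|}
= d_b g(\varepsilon_1(x)) + \frac{\big\| d_a f(x - a) + \varepsilon_1(x) \cdot \|x - a\| \big\|}{\|x - a\|} \cdot \varepsilon_2(f(x)).
\]
Hence
\[
\|\varepsilon_3(x)\| \le \|d_b g\| \cdot \|\varepsilon_1(x)\| + (\|d_a f\| + \|\varepsilon_1(x)\|) \cdot \|\varepsilon_2(f(x))\|
\]
and, since f is continuous at a (so that f(x) → b as x → a), $\varepsilon_3(x) \to 0$ as x → a.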
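Remark 53.8. The Jacobi matrix of h at a is the product of the Jacobi matrix of g at
b and the Jacobi matrix of f at a:
\[
\frac{\partial h_i}{\partial x_j}(a) = \sum_{k=1}^{m} \frac{\partial g_i}{\partial y_k}(b) \cdot \frac{\partial f_k}{\partial x_j}(a), \qquad i = \overline{1, p},\ j = \overline{1, n}.
\]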

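Example 53.9. Consider $f : \mathbb{R}^2 \to \mathbb{R}^2$ defined by $f(x_1, x_2) = (x_1 + x_2,\ x_1 \cdot x_2)$ and
$g : \mathbb{R}^2 \to \mathbb{R}^2$ defined by $g(\rho, \theta) = (\rho \cos\theta,\ \rho \sin\theta)$. Find $h(\rho, \theta) = (f \circ g)(\rho, \theta)$ and $\frac{\partial h}{\partial \rho}$, $\frac{\partial h}{\partial \theta}$.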
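
For Example 53.9, the matrix product of Remark 53.8 can be compared with a direct numerical differentiation of h = f ∘ g. This is only a sketch: the helper names (`jac_f`, `jac_g`, `matmul`, `numeric_jacobian`) and the sample point are illustrative, and the Jacobi matrices of f and g are the ones obtained by differentiating the formulas in the example by hand.

```python
import math


def g(rho, theta):
    return (rho * math.cos(theta), rho * math.sin(theta))


def f(x1, x2):
    return (x1 + x2, x1 * x2)


def h(rho, theta):
    return f(*g(rho, theta))


def jac_f(x1, x2):
    # Rows: components of f; columns: derivatives with respect to x1, x2
    return [[1.0, 1.0], [x2, x1]]


def jac_g(rho, theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -rho * s], [s, rho * c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def numeric_jacobian(func, point, eps=1e-6):
    """Central-difference Jacobi matrix of func at point (rows = components)."""
    n = len(point)
    columns = []
    for j in range(n):
        up, down = list(point), list(point)
        up[j] += eps
        down[j] -= eps
        f_up, f_down = func(*up), func(*down)
        columns.append([(u - d) / (2 * eps) for u, d in zip(f_up, f_down)])
    return [[columns[j][i] for j in range(n)] for i in range(len(columns[0]))]


rho, theta = 2.0, 0.7                                     # illustrative point
chain = matmul(jac_f(*g(rho, theta)), jac_g(rho, theta))  # J_b(f) * J_a(g)
direct = numeric_jacobian(h, (rho, theta))                # J_a(h), estimated
print(chain)
print(direct)
```

The first column of either matrix gives ∂h/∂ρ and the second gives ∂h/∂θ.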
Corollary 53.1. Let f : A ⊂ Rn → B ⊂ Rn be a bijection where A, B are open subsets
of Rn .
If f is differentiable at a ∈ A and $f^{-1}$ is differentiable at b = f(a), then $d_a f$ is a bijection
of $\mathbb{R}^n$ onto $\mathbb{R}^n$ and
\[
(d_a f)^{-1} = d_{f(a)} f^{-1}.
\]
The above statement follows from the equality $f^{-1} \circ f = i_A$ (the identity map of A) using the rule for differentiating
composite functions.

54 Basic properties of differentiable functions

Mean value theorem (Lagrange). Let $x, h \in \mathbb{R}^n$. Denote by [x, x + h] the set defined
by:
\[
[x, x + h] = \{x + th \in \mathbb{R}^n \mid 0 \le t \le 1\}
\]
This set is called the closed segment which joins x and x + h.
Consider A ⊂ Rn , A an open subset, f : A → R1 and x, x + h ∈ A.
Theorem 54.1. If the following conditions hold:

a) [x, x + h] ⊂ A

b) f is differentiable on [x, x + h]

then there exists $t_0 \in (0, 1)$ such that:
\[
f(x + h) - f(x) = d_{x + t_0 h} f(h) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(x + t_0 h) \cdot h_i
\]

Proof. Consider the function ϕ(t) = f(x + th) for t ∈ [0, 1]. The function ϕ is differentiable
on [0, 1] and $\varphi'(t) = d_{x+th} f(h)$. Since f(x + h) − f(x) = ϕ(1) − ϕ(0), applying the mean
value theorem for the function ϕ on [0, 1], we obtain that there exists $t_0 \in (0, 1)$ such
that:
\[
\varphi(1) - \varphi(0) = \varphi'(t_0) = d_{x + t_0 h} f(h) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(x + t_0 h) \cdot h_i
\]

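Remark 54.1. For a function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ with m ≥ 2, the above theorem is
false in general. To illustrate this, consider the function $f : [0, 2\pi] \subset \mathbb{R} \to \mathbb{R}^2$,
f(t) = (cos t, sin t): here f(2π) − f(0) = (0, 0), while f′(t) = (− sin t, cos t) has norm 1 for every t,
so no $t_0$ satisfies $f(2\pi) - f(0) = 2\pi \cdot f'(t_0)$.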

In the case f : A ⊂ Rn → Rm (m ≥ 2) the mean value theorem is the following:

Theorem 54.2. If the following conditions hold:

1) [x, x + h] ⊂ A

2) f is differentiable on [x, x + h]

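3) $\|d_{x+th} f\| \le M$ for all t ∈ [0, 1]

then $\|f(x + h) - f(x)\| \le M \cdot \|h\|$.

Proof. Consider again the function ϕ(t) = f(x + th) for t ∈ [0, 1], with scalar components $\varphi_1, \dots, \varphi_m$, and
\[
\psi(t) = \sum_{i=1}^{m} (\varphi_i(1) - \varphi_i(0)) \cdot \varphi_i(t) \quad \text{for } t \in [0, 1].
\]
By the mean value theorem for functions of one variable, there exists $t_0 \in (0, 1)$ such that
$\psi(1) - \psi(0) = \psi'(t_0)$. Hence
\[
\begin{aligned}
\|\varphi(1) - \varphi(0)\|^2 &= \sum_{i=1}^{m} [\varphi_i(1) - \varphi_i(0)]^2 = \sum_{i=1}^{m} [\varphi_i(1) - \varphi_i(0)] \cdot \varphi_i'(t_0) \\
&\le \left[ \sum_{i=1}^{m} [\varphi_i(1) - \varphi_i(0)]^2 \right]^{1/2} \cdot \left[ \sum_{i=1}^{m} [\varphi_i'(t_0)]^2 \right]^{1/2} \le M \cdot \|h\| \cdot \|\varphi(1) - \varphi(0)\|,
\end{aligned}
\]
where the first inequality is the Cauchy-Schwarz inequality and the second uses $\|\varphi'(t_0)\| = \|d_{x + t_0 h} f(h)\| \le M \cdot \|h\|$. Dividing by $\|\varphi(1) - \varphi(0)\|$ (the case ϕ(1) = ϕ(0) being trivial) we obtain
\[
\|\varphi(1) - \varphi(0)\| \le M \cdot \|h\|
\]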

Local extremum

Definition 54.1. A function f : A ⊂ Rn → R1 has a local maximum value at c ∈ A if


there exists some open neighborhood V ⊂ A of c for which f (x) ≤ f (c), for any x ∈ V .
If f(x) ≥ f(c) for any x ∈ V, then f has a local minimum value at c.

Theorem 54.3 (Fermat). If f is differentiable at c and possesses a local maximum or a
local minimum at c, then $\frac{\partial f}{\partial x_i}(c) = 0$ for $i = \overline{1, n}$.

Proof. Consider $h \in \mathbb{R}^n$ and, for $t \in \mathbb{R}^1$ sufficiently close to 0, the function ϕ(t) = f(c + th).
If the function f possesses a local maximum or a local minimum at c, then ϕ
possesses a local maximum or a local minimum at t = 0. Since the derivative of ϕ at a
local extremum is equal to 0, it follows that
\[
\varphi'(0) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(c) \cdot h_i = 0 \qquad \forall h \in \mathbb{R}^n
\]
Therefore
\[
\frac{\partial f}{\partial x_i}(c) = 0 \quad \text{for } i = \overline{1, n}
\]

Note that although $\frac{\partial f}{\partial x_i}(c) = 0$ at a local extremum c, this is not a sufficient condition for
such a point.

Example 54.1. If f(x, y) = xy, then $\frac{\partial f}{\partial x} = y$ and $\frac{\partial f}{\partial y} = x$ and hence (0, 0) is
the only possible local extremum of f. However, for any δ > 0, $f(\delta, \delta) = \delta^2 > 0$ and
$f(-\delta, \delta) = -\delta^2 < 0$. Hence, f takes both positive and negative values in any neighborhood
of (0, 0). Thus (0, 0) is neither a local maximum nor a local minimum.

Definition 54.2. A point c is a stationary point of f if $\frac{\partial f}{\partial x_i}(c) = 0$ for $i = \overline{1, n}$.

Just like in the case of functions of one variable, stationary points for functions of several
variables can be classified with the aid of Taylor approximations.

Definition 54.3. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ be a differentiable function on the open set A.
If the partial derivatives $A \ni x \mapsto \frac{\partial f_i}{\partial x_j}(x)$ are continuous, $i = \overline{1, m}$, $j = \overline{1, n}$, then f is said
to be continuously differentiable.

Theorem 54.4 (of local inversion). If the function $f : A \subset \mathbb{R}^n \to \mathbb{R}^n$ is continuously
differentiable on the open set A and the Jacobi matrix of f at a point a ∈ A, $\left(\frac{\partial f_i}{\partial x_j}(a)\right)_{n \times n}$, is not singular
(i.e. $\det\left(\frac{\partial f_i}{\partial x_j}(a)\right) \ne 0$), then there exist an open neighborhood U of a and an open
neighborhood V of f(a) = b such that f : U → V is bijective. Moreover, the inverse
$f^{-1} : V \to U$ is differentiable at b = f(a) and the following equality holds:
\[
d_b f^{-1} = (d_a f)^{-1}
\]

Proof. The proof of this theorem is rather technical and will be skipped.

Example 54.2. Show that if ρ ≠ 0, then the function $f(\rho, \theta) = (\rho \cos\theta,\ \rho \sin\theta)$ is locally
invertible.

Example 54.3. Show that if ρ ≠ 0 and sin θ ≠ 0, then the function $f(\rho, \theta, \varphi) = (\rho \sin\theta \cos\varphi,\ \rho \sin\theta \sin\varphi,\ \rho \cos\theta)$
is locally invertible.
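
The hypothesis of Theorem 54.4 for these two maps can be explored numerically. The sketch below estimates the Jacobi matrices by central differences and evaluates their determinants; the helper names and sample points are illustrative only. The values should be close to ρ for the polar map and to ρ² sin θ for the spherical map, so the determinant vanishes exactly where these expressions do.

```python
import math


def polar(rho, theta):
    return (rho * math.cos(theta), rho * math.sin(theta))


def spherical(rho, theta, phi):
    return (rho * math.sin(theta) * math.cos(phi),
            rho * math.sin(theta) * math.sin(phi),
            rho * math.cos(theta))


def numeric_jacobian(func, point, eps=1e-6):
    """Central-difference Jacobi matrix of func at point (rows = components)."""
    n = len(point)
    columns = []
    for j in range(n):
        up, down = list(point), list(point)
        up[j] += eps
        down[j] -= eps
        f_up, f_down = func(*up), func(*down)
        columns.append([(u - d) / (2 * eps) for u, d in zip(f_up, f_down)])
    return [[columns[j][i] for j in range(n)] for i in range(len(columns[0]))]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


det_polar = det(numeric_jacobian(polar, (2.0, 0.7)))
det_spherical = det(numeric_jacobian(spherical, (2.0, 0.7, 1.1)))
det_on_axis = det(numeric_jacobian(spherical, (2.0, 0.0, 1.1)))  # theta = 0
print(det_polar, det_spherical, det_on_axis)
```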

Implicit functions
Consider two open subsets $A \subset \mathbb{R}^n$ and $B \subset \mathbb{R}^m$ and a function $f : A \times B \to \mathbb{R}^m$.
Denote by $f_1, f_2, \dots, f_m$ the scalar components of f, i.e. $f = (f_1, f_2, \dots, f_m)$, and consider
the system of equations:
\[
(\ast) \quad
\begin{cases}
f_1(x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_m) = 0 \\
f_2(x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_m) = 0 \\
\qquad \cdots \\
f_m(x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_m) = 0
\end{cases}
\]
Definition 54.4. If there exists a function $\varphi : A' \subset A \to B$, $\varphi = (\varphi_1, \varphi_2, \dots, \varphi_m)$, such
that the following equalities hold:
\[
(\ast\ast) \quad
\begin{cases}
f_1(x_1, x_2, \dots, x_n, \varphi_1(x_1, x_2, \dots, x_n), \dots, \varphi_m(x_1, x_2, \dots, x_n)) = 0 \\
f_2(x_1, x_2, \dots, x_n, \varphi_1(x_1, x_2, \dots, x_n), \dots, \varphi_m(x_1, x_2, \dots, x_n)) = 0 \\
\qquad \cdots \\
f_m(x_1, x_2, \dots, x_n, \varphi_1(x_1, x_2, \dots, x_n), \dots, \varphi_m(x_1, x_2, \dots, x_n)) = 0
\end{cases}
\]
for any $(x_1, x_2, \dots, x_n) \in A'$, then the function $\varphi = (\varphi_1, \varphi_2, \dots, \varphi_m)$ is said to be defined
implicitly by the system of equations (∗).

It is clear that the system of equations (∗) can be written briefly as
\[
f(x, y) = 0
\]
where $x = (x_1, x_2, \dots, x_n)$ and $y = (y_1, y_2, \dots, y_m)$.
Theorem 54.5 (Implicit function theorem). If the function f has the following properties:

1) there exists a ∈ A and b ∈ B such that f (a, b) = 0


2) f is continuously differentiable on A × B
3) the Fréchet derivative db fa : Rm → Rm is bijective

where fa : B → Rm is defined by fa (y) = f (a, y), then there exists an open neighborhood
U of a and an open neighborhood V of b and a function ϕ : U → V having the following
properties:

i) ϕ(a) = b
ii) f(x, ϕ(x)) = 0 for all x ∈ U
iii) ϕ is continuously differentiable on U

and the following equality holds:
\[
d_x \varphi = -(d_y f_x)^{-1} \circ (d_x f_y)
\]
where $f_x(y) = f(x, y)$, $f_y(x) = f(x, y)$ and y = ϕ(x).

Note that the function ϕ defined implicitly cannot always be written as an explicit formula.
Example 54.4. a) Find the function defined implicitly by the equation $x^2 + y^2 = 1$.
b) Show that the equation $y^5 + y - x = 0$ defines implicitly a function y = y(x).
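
Part b) of Example 54.4 is a case where no explicit formula is available, but y(x) can still be computed. The following sketch uses Newton's method in y (a technique chosen here for illustration, not taken from the notes) and compares a finite-difference slope with the value $1/(5y^4 + 1)$ that the formula of Theorem 54.5 gives for $F(x, y) = y^5 + y - x$. The function name `implicit_y` and the tolerances are assumptions of this sketch.

```python
def implicit_y(x, tol=1e-14, max_iter=100):
    """Solve y**5 + y - x = 0 for y by Newton's method, starting from y = 0."""
    y = 0.0
    for _ in range(max_iter):
        # dF/dy = 5*y**4 + 1 is always >= 1, so the division is safe
        step = (y ** 5 + y - x) / (5 * y ** 4 + 1)
        y -= step
        if abs(step) < tol:
            break
    return y


x0 = 2.0
y0 = implicit_y(x0)                       # y = 1 solves 1 + 1 - 2 = 0
eps = 1e-5
slope_numeric = (implicit_y(x0 + eps) - implicit_y(x0 - eps)) / (2 * eps)
slope_formula = 1.0 / (5 * y0 ** 4 + 1)   # -(dF/dy)^(-1) * dF/dx with dF/dx = -1
print(y0, slope_numeric, slope_formula)
```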

55 Higher order partial differentiability

Let f : A ⊂ Rn → Rm be a function partially differentiable with respect to every variable


xj , j = 1, n on the open set A.
Definition 55.1. If the partial derivatives $x \mapsto \frac{\partial f_i}{\partial x_j}$ are partially differentiable at a ∈ A
with respect to every variable $x_k$, it is said that f is twice partially differentiable at a with
respect to every variable.
The partial derivative with respect to the variable $x_k$ of the partial derivative $\frac{\partial f_i}{\partial x_j}$ will be
denoted by $\frac{\partial^2 f_i}{\partial x_k \partial x_j}(a)$, i.e.
\[
\frac{\partial}{\partial x_k}\left( \frac{\partial f_i}{\partial x_j} \right)(a) = \frac{\partial^2 f_i}{\partial x_k \partial x_j}(a),
\]
and will be called a partial derivative of second order of f.

For a scalar component $f_i$ there exist $n^2$ partial derivatives of second order.


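Example 55.1. Consider $f(x, y) = x^2 + y^2 + e^x \cos y$. The first order partial derivatives
of f exist at every point $(x, y) \in \mathbb{R}^2$ and are given by
\[
\frac{\partial f}{\partial x} = 2x + e^x \cos y, \qquad \frac{\partial f}{\partial y} = 2y - e^x \sin y
\]
The first-order partial derivatives of f themselves are partially differentiable at any $(x, y) \in \mathbb{R}^2$
with respect to x and y and we have:
\[
\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} = 2 + e^x \cos y, \qquad
\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y \partial x} = -e^x \sin y
\]
\[
\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x \partial y} = -e^x \sin y, \qquad
\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2} = 2 - e^x \cos y
\]
These partial derivatives are the second-order partial derivatives of f.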
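
The two mixed derivatives in Example 55.1 can be compared numerically. In this sketch the first-order partials are the closed forms from the example, each is differentiated once more by a central difference, and both results are compared with $-e^x \sin y$. The helper names and the sample point are illustrative.

```python
import math


def f_x(x, y):
    # df/dx from Example 55.1
    return 2 * x + math.exp(x) * math.cos(y)


def f_y(x, y):
    # df/dy from Example 55.1
    return 2 * y - math.exp(x) * math.sin(y)


def d_dx(func, x, y, eps=1e-6):
    return (func(x + eps, y) - func(x - eps, y)) / (2 * eps)


def d_dy(func, x, y, eps=1e-6):
    return (func(x, y + eps) - func(x, y - eps)) / (2 * eps)


x0, y0 = 0.3, 1.2                 # illustrative point
f_yx = d_dy(f_x, x0, y0)          # d/dy (df/dx)
f_xy = d_dx(f_y, x0, y0)          # d/dx (df/dy)
expected = -math.exp(x0) * math.sin(y0)
print(f_yx, f_xy, expected)
```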
Definition 55.2. In general, f is said to be k-times partially differentiable at a ∈ A with respect
to every variable if f is (k − 1)-times partially differentiable with respect to every variable
on an open neighborhood of a and every (k − 1)-th order partial derivative $\frac{\partial^{k-1} f_i}{\partial x_{j_{k-1}} \cdots \partial x_{j_1}}$ is
partially differentiable with respect to every variable $x_{j_k}$ at a.
We denote
\[
\frac{\partial}{\partial x_{j_k}}\left( \frac{\partial^{k-1} f_i}{\partial x_{j_{k-1}} \cdots \partial x_{j_1}} \right)(a) = \frac{\partial^k f_i}{\partial x_{j_k} \partial x_{j_{k-1}} \cdots \partial x_{j_1}}(a)
\]
and we call it a k-th order partial derivative of f.
Example 55.2. Find the k-th order partial derivatives of the function $f(x, y) = x^2 + y^2 + e^x \cos y$.
Definition 55.3. If the partial derivatives of first order $x \mapsto \frac{\partial f_i}{\partial x_j}$ are differentiable at a
point a ∈ A, it is said that f is twice differentiable at a.
Definition 55.4. The second Fréchet derivative of f at a is denoted by $d_a^2 f$ and defined
as a function $d_a^2 f : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m$ by the formula
\[
d_a^2 f(u)(v) = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} \sum_{k=1}^{n} \frac{\partial^2 f_i}{\partial x_j \partial x_k}(a) \cdot u_j \cdot v_k \right) e_i
\]
for $u, v \in \mathbb{R}^n$, where $e_i = (0, \dots, 0, 1, 0, \dots, 0) \in \mathbb{R}^m$, $i = \overline{1, m}$.

The second Fréchet derivative $d_a^2 f$ is a system of m bilinear forms in $u_1, u_2, \dots, u_n$;
$v_1, v_2, \dots, v_n$.
The second Fréchet derivative of f at a satisfies
\[
\lim_{u \to 0} \frac{1}{\|u\|} \left\| d_{a+u} f(v) - d_a f(v) - d_a^2 f(u)(v) \right\| = 0
\]
for every $v \in \mathbb{R}^n$. In other words, the polynomial $d_{a+u} f(v)$ can be approximated by the
polynomial $[d_a f + d_a^2 f(u)](v)$.

Theorem 55.1 (mixed derivative theorem of Schwarz). If the function f is twice


differentiable at a, then the following relations hold:

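\[
\frac{\partial^2 f_i}{\partial x_j \partial x_k}(a) = \frac{\partial^2 f_i}{\partial x_k \partial x_j}(a), \qquad i = \overline{1, m},\ j, k = \overline{1, n}
\]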

Proof. The proof of this theorem is rather technical and will be skipped.

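Example 55.3. Consider f(x, y, z) = (xy + xz + yz, xyz) and verify that:
\[
\frac{\partial^2 f}{\partial x \partial y} = \frac{\partial^2 f}{\partial y \partial x}, \qquad
\frac{\partial^2 f}{\partial x \partial z} = \frac{\partial^2 f}{\partial z \partial x}, \qquad
\frac{\partial^2 f}{\partial y \partial z} = \frac{\partial^2 f}{\partial z \partial y}
\]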

Theorem 55.2 (Criterion for second-order differentiability). If the partial derivatives of
second order $\frac{\partial^2 f_i}{\partial x_j \partial x_k}$ exist in a neighborhood of a and they are continuous at a, then f is
twice differentiable at a.

Definition 55.5. If the partial derivatives of order k − 1, $\frac{\partial^{k-1} f_i}{\partial x_{j_{k-1}} \cdots \partial x_{j_1}}$, are differentiable at
a ∈ A, it is said that f is k-times differentiable at a.
The Fréchet derivative of order k of f at a is defined as the function $d_a^k f : \mathbb{R}^n \times \cdots \times \mathbb{R}^n \to \mathbb{R}^m$ given by
\[
d_a^k f(u^1)(u^2) \cdots (u^k) = \sum_{i=1}^{m} \left( \sum_{j_1=1}^{n} \sum_{j_2=1}^{n} \cdots \sum_{j_k=1}^{n} \frac{\partial^k f_i}{\partial x_{j_k} \cdots \partial x_{j_1}}(a) \cdot u^1_{j_1} u^2_{j_2} \cdots u^k_{j_k} \right) e_i
\]

The Fréchet derivative of order k of f at a satisfies:
\[
\lim_{\|u^k\| \to 0} \frac{1}{\|u^k\|} \left\| d_{a+u^k}^{k-1} f(u^1)(u^2) \cdots (u^{k-1}) - d_a^{k-1} f(u^1)(u^2) \cdots (u^{k-1}) - d_a^k f(u^1)(u^2) \cdots (u^k) \right\| = 0
\]

If the function is k-times differentiable at a, then the following relations hold:
\[
\frac{\partial^k f_i}{\partial x_{j_1} \partial x_{j_2} \cdots \partial x_{j_k}}(a) = \frac{\partial^k f_i}{\partial x_{j_{\sigma(1)}} \partial x_{j_{\sigma(2)}} \cdots \partial x_{j_{\sigma(k)}}}(a)
\]
for every permutation σ of {1, 2, ..., k}.

Theorem 55.3 (Criterion for k-times differentiability). If the partial derivatives of k-th
order exist in a neighborhood of a and they are continuous at a, then f is k-times
differentiable at a.

56 Taylor’s theorems

Theorem 56.1 (Taylor's formula with integral remainder). If the partial derivatives of
order m + 1 of the function $f : A \subset \mathbb{R}^n \to \mathbb{R}^p$ are continuous and the closed segment
[x, x + h] is included in the open set A, then:
\[
\begin{aligned}
f(x + h) = {} & f(x) + \frac{1}{1!} d_x f(h) + \frac{1}{2!} d_x^2 f(h)(h) + \cdots + \frac{1}{m!} d_x^m f \underbrace{(h) \cdots (h)}_{m} \\
& + \frac{1}{m!} \int_0^1 (1 - t)^m \cdot d_{x+th}^{m+1} f \underbrace{(h) \cdots (h)}_{m+1} \, dt
\end{aligned}
\]

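Proof. Consider the function g(t) = f(x + th) for t ∈ [0, 1]. For g the following
relations hold:
\[
\frac{d^k g}{dt^k} = g^{(k)}(t) = d_{x+th}^k f \underbrace{(h) \cdots (h)}_{k}, \qquad k = \overline{1, m + 1}
\]
On the other hand,
\[
\frac{d}{dt}\left[ g(t) + \frac{1 - t}{1!} g'(t) + \cdots + \frac{(1 - t)^m}{m!} g^{(m)}(t) \right] = \frac{(1 - t)^m}{m!} g^{(m+1)}(t)
\]
and integrating from 0 to 1 gives
\[
g(1) - \left[ g(0) + \frac{1}{1!} g'(0) + \cdots + \frac{1}{m!} g^{(m)}(0) \right] = \frac{1}{m!} \int_0^1 (1 - t)^m \cdot g^{(m+1)}(t) \, dt
\]
Hence
\[
\begin{aligned}
f(x + h) = {} & f(x) + \frac{1}{1!} d_x f(h) + \frac{1}{2!} d_x^2 f(h)(h) + \cdots + \frac{1}{m!} d_x^m f \underbrace{(h) \cdots (h)}_{m} \\
& + \frac{1}{m!} \int_0^1 (1 - t)^m \cdot d_{x+th}^{m+1} f \underbrace{(h) \cdots (h)}_{m+1} \, dt
\end{aligned}
\]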

Theorem 56.2 (Taylor's formula with the Lagrange remainder). If the function $f : A \subset \mathbb{R}^n \to \mathbb{R}^p$
is m + 1 times differentiable on A and $\|d_y^{m+1} f\| \le M$ for every y in the closed segment
[x, x + h], which is included in A, then:
\[
\left\| f(x + h) - f(x) - \frac{1}{1!} d_x f(h) - \cdots - \frac{1}{m!} d_x^m f \underbrace{(h) \cdots (h)}_{m} \right\| \le M \cdot \frac{\|h\|^{m+1}}{(m + 1)!}
\]

Proof. The proof of this theorem is rather technical and it will be skipped.
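Theorem 56.3 (Taylor's formula with $O(\|h\|^m)$ remainder). If the function f is (m − 1)-times
differentiable on A and m-times differentiable at x ∈ A, then:
\[
\left\| f(x + h) - f(x) - \frac{1}{1!} d_x f(h) - \cdots - \frac{1}{m!} d_x^m f \underbrace{(h) \cdots (h)}_{m} \right\| = O(\|h\|^m)
\]
where $O(\|h\|^m)$ denotes a quantity such that $O(\|h\|^m) / \|h\|^m \to 0$ as h → 0 (often written $o(\|h\|^m)$).

Proof. By induction.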

57 Classification theorem for local extrema

Theorem 57.1. If $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ has continuous partial derivatives of second order
on the open set A and $d_a f = 0$ for an a ∈ A, then:

i) if f has a local minimum at a, then $d_a^2 f(h)(h) \ge 0$ for every $h \in \mathbb{R}^n$

ii) if f has a local maximum at a, then $d_a^2 f(h)(h) \le 0$ for every $h \in \mathbb{R}^n$

Proof. Consider the Taylor formula
\[
f(a + h) = f(a) + \frac{1}{1!} d_a f(h) + \frac{1}{2!} d_a^2 f(h)(h) + O(\|h\|^2)
\]
and hence, since $d_a f = 0$,
\[
f(a + h) - f(a) = \frac{1}{2!} d_a^2 f(h)(h) + O(\|h\|^2)
\]
i) If a is a local minimum for f, then there exists r > 0 such that for $\|h\| < r$ we have
\[
f(a + h) - f(a) = \frac{1}{2!} d_a^2 f(h)(h) + O(\|h\|^2) \ge 0
\]
Let $h \in \mathbb{R}^n$, h ≠ 0 and $t \in \mathbb{R}^1$, t ≠ 0, such that $|t| < \frac{r}{\|h\|}$.
We have
\[
\frac{1}{2!} d_a^2 f(th)(th) + O(\|th\|^2) \ge 0
\]
\[
t^2 \left[ \frac{1}{2!} d_a^2 f(h)(h) + \|h\|^2 \cdot \frac{O(\|th\|^2)}{\|th\|^2} \right] \ge 0
\]
\[
\frac{1}{2!} d_a^2 f(h)(h) + \|h\|^2 \cdot \frac{O(\|th\|^2)}{\|th\|^2} \ge 0
\]
Letting t → 0, the term $\frac{O(\|th\|^2)}{\|th\|^2}$ tends to 0, and it follows that $d_a^2 f(h)(h) \ge 0$.
ii) The second statement is proved similarly.

Theorem 57.2 (Sufficient condition for local extrema). Assume that $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$
has continuous partial derivatives of second order on A and $d_a f = 0$ for a ∈ A.

i) If $d_a^2 f(h)(h) \ge 0$ for every $h \in \mathbb{R}^n$ and $\det\left(\frac{\partial^2 f}{\partial x_i \partial x_j}(a)\right) \ne 0$, then f has a local minimum at a;

ii) If $d_a^2 f(h)(h) \le 0$ for every $h \in \mathbb{R}^n$ and $\det\left(\frac{\partial^2 f}{\partial x_i \partial x_j}(a)\right) \ne 0$, then f has a local maximum at a.

Proof. The proof is rather technical and will be skipped.


Example 57.1. Find and classify the stationary points of
$f(x, y) = x^4 - y^4 - 2(x^2 - y^2)$
Example 57.2. Determine the values of k for which
$f(x, y) = k(e^y - 1) \sin x - \cos x \cos 2y + 1$
possesses a minimum at (0, 0).
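
One possible way to organise the computation for Example 57.1 is sketched below. The gradient and the Hessian are obtained by differentiating f by hand; the function names and the two-variable test based on the sign of the Hessian determinant and of $\partial^2 f / \partial x^2$ are choices made for this sketch (for n = 2 this test is a standard way of checking the hypotheses of Theorems 57.1 and 57.2).

```python
from itertools import product


def grad(x, y):
    # Gradient of f(x, y) = x**4 - y**4 - 2*(x**2 - y**2)
    return (4 * x ** 3 - 4 * x, -4 * y ** 3 + 4 * y)


def hessian(x, y):
    # Second-order partial derivatives; the mixed derivative vanishes
    return ((12 * x * x - 4, 0.0), (0.0, 4 - 12 * y * y))


def classify(x, y):
    (fxx, fxy), (_, fyy) = hessian(x, y)
    d = fxx * fyy - fxy * fxy
    if d > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    if d < 0:
        return "saddle point"   # d_a^2 f takes both signs, so no extremum
    return "inconclusive"


# grad = 0 forces x in {-1, 0, 1} and y in {-1, 0, 1}
stationary = list(product((-1.0, 0.0, 1.0), repeat=2))
for point in stationary:
    print(point, classify(*point))
```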

58 Conditional extrema

Let us consider a function f : A ⊂ Rn → R1 , A open set and Γ ⊂ A, defined by:

Γ = {x ∈ A | gi (x) = 0 i = 1, p}

where gi : A → R1 , p < n .
It will be assumed that f and gi , i = 1, p have continuous first order partial derivatives
on A.

Definition 58.1. If the restriction $f|_\Gamma$ has an extremum at a ∈ Γ, then we call this
extremum conditional (subject to the equations $g_i(x) = 0$, $i = \overline{1, p}$).

Theorem 58.1. If $d_a g_1, \dots, d_a g_p$ are linearly independent and f has a conditional
extremum at a, then there exist p constants $\lambda_1, \dots, \lambda_p$ such that
\[
d_a f = \sum_{i=1}^{p} \lambda_i \, d_a g_i
\]
or
\[
\frac{\partial f}{\partial x_k}(a) = \sum_{i=1}^{p} \lambda_i \frac{\partial g_i}{\partial x_k}(a), \qquad k = \overline{1, n}
\]

Proof. The proof is rather technical and it will be skipped.

Example 58.1. Find the conditional extrema of the following functions:

a) $f(x, y) = x^3$ if $x^2 + 6xy + y^2 = 1$

b) $f(x, y) = xy$ if $2x + 3y = 1$

c) $f(x, y, z) = x^2 + y^2 + z^2$ if $x + y + z = 1$
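
For parts b) and c) the conditions of Theorem 58.1, together with the constraint, form a linear system in the unknowns and the multiplier λ. The sketch below solves these systems exactly over the rationals; the solver `solve_linear` is a small illustrative Gauss-Jordan routine written for this example and assumes the system has a unique solution.

```python
from fractions import Fraction


def solve_linear(rows):
    """Solve an augmented system [A | b] by Gauss-Jordan elimination (exact arithmetic)."""
    m = [[Fraction(v) for v in row] for row in rows]
    n = len(m)
    for col in range(n):
        # Pick a row with a nonzero entry in this column (fails if A is singular)
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[-1] for row in m]


# b) f = xy, g = 2x + 3y - 1: unknowns (x, y, lambda)
#    y = 2*lambda, x = 3*lambda, 2x + 3y = 1
sol_b = solve_linear([[0, 1, -2, 0],
                      [1, 0, -3, 0],
                      [2, 3, 0, 1]])

# c) f = x^2 + y^2 + z^2, g = x + y + z - 1: unknowns (x, y, z, lambda)
#    2x = lambda, 2y = lambda, 2z = lambda, x + y + z = 1
sol_c = solve_linear([[2, 0, 0, -1, 0],
                      [0, 2, 0, -1, 0],
                      [0, 0, 2, -1, 0],
                      [1, 1, 1, 0, 1]])
print(sol_b, sol_c)
```

Theorem 58.1 only gives a necessary condition, so whether each candidate point is a conditional maximum or minimum still has to be decided separately.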

59 Jordan measurable subsets of R2

Consider the set of one-dimensional bounded intervals of the form (a, b), [a, b), (a, b], [a, b],
where a, b ∈ R. The cartesian product $\Delta = I_1 \times I_2$ of two intervals of this type will be
called a rectangle in $\mathbb{R}^2$. The area of such a rectangle ∆ is defined by
\[
\mathrm{area}(\Delta) = \mathrm{length}(I_1) \cdot \mathrm{length}(I_2)
\]
Consider the set $\mathcal{P}$ of all finite unions of rectangles ∆, i.e. $P \in \mathcal{P}$ if and only if there
exist $\Delta_1, \Delta_2, \dots, \Delta_n$ such that
\[
P = \bigcup_{i=1}^{n} \Delta_i
\]

Proposition 59.1. If $P_1, P_2 \in \mathcal{P}$, then $P_1 \cup P_2 \in \mathcal{P}$ and $P_1 \setminus P_2 \in \mathcal{P}$.

Proof. Direct verification.

Proposition 59.2. For any $P \in \mathcal{P}$ there exist $\Delta_1, \Delta_2, \dots, \Delta_n$ such that $P = \bigcup_{i=1}^{n} \Delta_i$ and
$\Delta_i \cap \Delta_j = \emptyset$ if i ≠ j.

Proof. The proof is rather technical and will be skipped.

Definition 59.1. For a $P \in \mathcal{P}$ the area is defined by
\[
\mathrm{area}(P) = \sum_{i=1}^{n} \mathrm{area}(\Delta_i)
\]
where $P = \bigcup_{i=1}^{n} \Delta_i$ and $\Delta_1, \Delta_2, \dots, \Delta_n$ are disjoint.

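Proposition 59.3. The area defined in this way for $P \in \mathcal{P}$ satisfies:

1. area(P) ≥ 0 for $P \in \mathcal{P}$

2. if $P_1, P_2 \in \mathcal{P}$ and $P_1 \cap P_2 = \emptyset$, then $\mathrm{area}(P_1 \cup P_2) = \mathrm{area}(P_1) + \mathrm{area}(P_2)$

3. it is independent of the decomposition of P into a finite union of disjoint rectangles.

Definition 59.2. For a bounded set $A \subset \mathbb{R}^2$ we define
\[
\mathrm{area}_i(A) = \sup_{P \subset A,\ P \in \mathcal{P}} \mathrm{area}(P), \qquad
\mathrm{area}_e(A) = \inf_{P \supset A,\ P \in \mathcal{P}} \mathrm{area}(P)
\]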

Definition 59.3. A bounded set A ⊂ R2 is said to be Jordan measurable if

areai (A) = areae (A)

Definition 59.4. If the bounded set A ⊂ R2 is Jordan measurable, then the area of A is
defined as
area(A) = areai (A) = areae (A)
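
The inner and outer areas can be approximated on a computer for a concrete set. The sketch below takes the closed unit disc and a grid of n × n squares covering [−1, 1]²: squares lying inside the disc give a lower estimate of the inner area, and squares meeting the disc give an upper estimate of the outer area. The function name `jordan_bounds` and the grid sizes are illustrative, and the grid unions are only particular elements of the family of finite unions of rectangles, not the actual supremum or infimum.

```python
import math


def jordan_bounds(n):
    """Areas of the grid unions inside / covering the unit disc, squares of side 2/n."""
    side = 2.0 / n
    inner_count = outer_count = 0
    for i in range(n):
        x0, x1 = -1.0 + i * side, -1.0 + (i + 1) * side
        near_x = 0.0 if x0 <= 0.0 <= x1 else min(abs(x0), abs(x1))
        far_x = max(abs(x0), abs(x1))
        for j in range(n):
            y0, y1 = -1.0 + j * side, -1.0 + (j + 1) * side
            near_y = 0.0 if y0 <= 0.0 <= y1 else min(abs(y0), abs(y1))
            far_y = max(abs(y0), abs(y1))
            if far_x ** 2 + far_y ** 2 <= 1.0:     # farthest corner inside the disc
                inner_count += 1
            if near_x ** 2 + near_y ** 2 <= 1.0:   # nearest point inside the disc
                outer_count += 1
    return inner_count * side ** 2, outer_count * side ** 2


coarse = jordan_bounds(100)
fine = jordan_bounds(400)
print(coarse, fine, math.pi)
```

As the grid is refined, the two estimates squeeze the value π, in line with the disc being Jordan measurable.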

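Proposition 59.4. A bounded set $A \subset \mathbb{R}^2$ is Jordan measurable if and only if for any
ε > 0 there exist $P_\varepsilon, Q_\varepsilon \in \mathcal{P}$ such that $P_\varepsilon \subset A \subset Q_\varepsilon$ and $\mathrm{area}(Q_\varepsilon) - \mathrm{area}(P_\varepsilon) < \varepsilon$.

Proposition 59.5. A bounded set $A \subset \mathbb{R}^2$ is Jordan measurable if and only if there exist
two sequences $(P_n)$, $(Q_n)$, with $P_n, Q_n \in \mathcal{P}$ and $P_n \subset A \subset Q_n$, such that
\[
\lim_{n \to \infty} \mathrm{area}(P_n) = \lim_{n \to \infty} \mathrm{area}(Q_n)
\]
In this case we have
\[
\mathrm{area}(A) = \lim_{n \to \infty} \mathrm{area}(P_n) = \lim_{n \to \infty} \mathrm{area}(Q_n)
\]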

Proposition 59.6. A bounded set A ⊂ R2 is Jordan measurable if and only if the area
of its boundary is equal to zero.

Proposition 59.7. If A1 and A2 are Jordan measurable sets, then A1 ∪ A2 and A1 \ A2
are Jordan measurable and if A1 ∩ A2 = ∅, then area(A1 ∪ A2 ) = area(A1 ) + area(A2 ).
Proposition 59.8. Let M ⊂ R2 be a bounded set. If for any ε > 0 there exist two Jordan
measurable sets A and B such that A ⊂ M ⊂ B and area(B) − area(A) < ε, then the set
M is Jordan measurable.
Proposition 59.9. If there exist two sequences $(A_n)$, $(B_n)$ of Jordan measurable sets
such that $A_n \subset M \subset B_n$ and
\[
\lim_{n \to \infty} \mathrm{area}(A_n) = \lim_{n \to \infty} \mathrm{area}(B_n)
\]
then M is Jordan measurable and
\[
\mathrm{area}(M) = \lim_{n \to \infty} \mathrm{area}(A_n) = \lim_{n \to \infty} \mathrm{area}(B_n)
\]

Proof. The proof of the above statements is rather technical and it will be skipped.

60 The Riemann-Darboux integral of functions of two variables

Let A be a given bounded and Jordan measurable subset of R2 .


Definition 60.1. A partition P of A is a finite set of subsets $A_i$, $i = \overline{1, n}$, of A satisfying:
$\bigcup_{i=1}^{n} A_i = A$, every $A_i$ is Jordan measurable, and $A_i \cap A_j = \emptyset$ if i ≠ j.
The diameter of the set $A_i$ is the number $d(A_i)$ defined by
\[
d(A_i) = \sup_{(x', y'),\, (x'', y'') \in A_i} \sqrt{(x' - x'')^2 + (y' - y'')^2}
\]
The norm of the partition P is the number
\[
\nu(P) = \max\{d(A_1), d(A_2), \dots, d(A_n)\}
\]
Suppose now that f is a function defined and bounded on A, $f : A \to \mathbb{R}^1$. Then f is
bounded on each part $A_i$. Hence f has a least upper bound $M_i$ and a greatest lower
bound $m_i$ on $A_i$.
Definition 60.2. The upper Darboux sum of f related to P is defined by
\[
U_f(P) = \sum_{i=1}^{n} M_i \cdot \mathrm{area}(A_i)
\]
where $M_i = \sup\{f(x, y) \mid (x, y) \in A_i\}$.

The lower Darboux sum of f related to P is defined by
\[
L_f(P) = \sum_{i=1}^{n} m_i \cdot \mathrm{area}(A_i)
\]
where $m_i = \inf\{f(x, y) \mid (x, y) \in A_i\}$.

Definition 60.3. The Riemann sum of f related to P is defined by
\[
\sigma_f(P) = \sum_{i=1}^{n} f(\xi_i, \eta_i) \cdot \mathrm{area}(A_i)
\]
where $(\xi_i, \eta_i) \in A_i$.
Remark 60.1. It is obvious that we have

Lf (P ) ≤ σf (P ) ≤ Uf (P )

Now f is bounded above and below on A. So there exist numbers m and M with
m ≤ f (x, y) ≤ M for all (x, y) ∈ A.
Thus for any partition P of A we have
\[
m \cdot \mathrm{area}(A) = \sum_{i=1}^{n} m \cdot \mathrm{area}(A_i) \le L_f(P) \le U_f(P) \le \sum_{i=1}^{n} M \cdot \mathrm{area}(A_i) = M \cdot \mathrm{area}(A)
\]
Hence the set
\[
\mathcal{L}_f = \{L_f(P) \mid P \text{ is a partition of } A\}
\]
is bounded above and the set
\[
\mathcal{U}_f = \{U_f(P) \mid P \text{ is a partition of } A\}
\]
is bounded below.
So $L_f = \sup_P L_f(P)$ and $U_f = \inf_P U_f(P)$ exist.
The first result establishes the intuitively obvious fact that for a bounded function $L_f \le U_f$.
Theorem 60.1. If f is defined and bounded on A, then Lf ≤ Uf .

Proof. Let $P = \{A_1, \dots, A_n\}$ be a partition of A and P′ the partition $P' = (P \setminus \{A_i\}) \cup \{A_i', A_i''\}$, where $A_i' \cup A_i'' = A_i$ and $A_i' \cap A_i'' = \emptyset$,
for one particular i, 1 ≤ i ≤ n. In other words, P′ is obtained by decomposing $A_i$ into two
Jordan measurable subsets.
It is now shown that $L_f(P) \le L_f(P')$ and $U_f(P) \ge U_f(P')$.
Let $M_i' = \sup\{f(x, y) \mid (x, y) \in A_i'\}$ and $M_i'' = \sup\{f(x, y) \mid (x, y) \in A_i''\}$.
Clearly $M_i' \le M_i$ and $M_i'' \le M_i$. Hence
\[
\begin{aligned}
U_f(P') &= \sum_{j=1}^{i-1} M_j \cdot \mathrm{area}(A_j) + M_i' \cdot \mathrm{area}(A_i') + M_i'' \cdot \mathrm{area}(A_i'') + \sum_{j=i+1}^{n} M_j \cdot \mathrm{area}(A_j) \\
&\le \sum_{j=1}^{i-1} M_j \cdot \mathrm{area}(A_j) + M_i \cdot \mathrm{area}(A_i') + M_i \cdot \mathrm{area}(A_i'') + \sum_{j=i+1}^{n} M_j \cdot \mathrm{area}(A_j) \\
&= \sum_{j=1}^{n} M_j \cdot \mathrm{area}(A_j) = U_f(P)
\end{aligned}
\]
In a similar fashion it can be shown that $L_f(P) \le L_f(P')$.

It now follows that if P″ is obtained from P by decomposing several of its parts $A_{i_1}, \dots, A_{i_m}$ in this way, then $L_f(P) \le L_f(P'')$ and
$U_f(P) \ge U_f(P'')$.

Now suppose that $P_1$ and $P_2$ are two partitions of A, $P_1 = \{A_1, \dots, A_m\}$ and $P_2 = \{B_1, \dots, B_n\}$, and let $P_3$ be the partition $P_3 = \{A_i \cap B_j \mid i = \overline{1, m},\ j = \overline{1, n}\}$.
Thus Lf (P1 ) ≤ Lf (P3 ) and Uf (P2 ) ≥ Uf (P3 ). Since Lf (P3 ) ≤ Uf (P3 ) it can be deduced
that Lf (P1 ) ≤ Uf (P2 ).
In other words the lower sum related to a given partition of A does not exceed the upper
sum related to any partition of A.
Hence every lower sum is a lower bound for the set of upper sums. So Lf (P ) ≤ Uf for all
possible partitions P . But then Uf is an upper bound for the set of lower sums.
Thus Lf ≤ Uf .
Definition 60.4. A function f defined and bounded on A is Riemann-Darboux integrable
on A if $L_f = U_f$. This common value is denoted by
\[
\iint_A f(x, y) \, dx \, dy
\]
and it is called the double integral of f.


Theorem 60.2. The function f defined on A and bounded on A is Riemann-Darboux
integrable on A if and only if for any ε > 0 there exists P such that Uf (P ) − Lf (P ) < ε.

Proof. It follows from Theorem 60.1


Theorem 60.3. The function f defined on A and bounded on A is Riemann-Darboux integrable
on A if and only if there exists a number I (= $L_f$ = $U_f$) such that for any ε > 0 there
exists δ > 0 such that for every partition P with ν(P) < δ and every Riemann sum $\sigma_f(P)$ we have $|\sigma_f(P) - I| < \varepsilon$.

Proof. It follows from Remark 60.1 and Theorem 60.2.
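
The criterion of Theorem 60.2 can be watched at work on a simple case. The sketch below computes the lower and upper Darboux sums of f(x, y) = xy on A = [0, 1] × [0, 1] for the uniform n × n grid, using exact rational arithmetic. It relies on an assumption specific to this example: f is nondecreasing in each variable on A, so on every cell the infimum and the supremum are taken at the lower-left and upper-right corners. The function name `darboux_sums` is illustrative.

```python
from fractions import Fraction


def darboux_sums(n):
    """Lower and upper Darboux sums of f(x, y) = x*y on [0,1]^2, uniform n-by-n grid."""
    cell_area = Fraction(1, n * n)
    lower = upper = Fraction(0)
    for i in range(n):
        for j in range(n):
            x0, x1 = Fraction(i, n), Fraction(i + 1, n)
            y0, y1 = Fraction(j, n), Fraction(j + 1, n)
            lower += x0 * y0 * cell_area   # inf of x*y on the cell
            upper += x1 * y1 * cell_area   # sup of x*y on the cell
    return lower, upper


for n in (10, 50):
    low, up = darboux_sums(n)
    print(n, low, up, up - low)
```

In this example the difference between the upper and the lower sum equals 1/n, so it can be made smaller than any ε > 0, and both sums enclose the value 1/4.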


Remark 60.2. The constant function f(x, y) = 1 is Riemann-Darboux integrable on A and
\[
\iint_A f(x, y) \, dx \, dy = \mathrm{area}(A)
\]
Remark 60.3. This definition of the integral $\iint_A f(x, y) \, dx \, dy$ is only one of the many
ways to define the integral of a function of two variables. There are others, notably the Lebesgue
integral; all, however, give the same value in the case of continuous functions.

61 Integrable functions

Theorem 61.1. If f is continuous on A and A is Jordan measurable, then f is Riemann-Darboux
integrable on A.

Proof. Since A is compact (A is bounded and is assumed here to be closed), the function f is uniformly continuous. For ε > 0 there exists
δ > 0 such that
\[
[(x' - x'')^2 + (y' - y'')^2]^{1/2} < \delta \ \Rightarrow\ |f(x', y') - f(x'', y'')| < \frac{\varepsilon}{\mathrm{area}(A)}
\]
Choose a partition P such that
\[
(x', y'), (x'', y'') \in A_i \ \Rightarrow\ \sqrt{(x' - x'')^2 + (y' - y'')^2} < \delta
\]
for $i = \overline{1, n}$. Hence
\[
U_f(P) - L_f(P) = \sum_{i=1}^{n} (M_i - m_i) \, \mathrm{area}(A_i) \le \frac{\varepsilon}{\mathrm{area}(A)} \cdot \sum_{i=1}^{n} \mathrm{area}(A_i) = \varepsilon
\]
Since ε > 0 is arbitrary, f is Riemann-Darboux integrable on A by Theorem 60.2.

Definition 61.1. A function f is called piecewise-continuous on A if there exists a


partition P = {A1 , · · · , An } of A and continuous functions fi , i = 1, n defined on Ai
such that f (x) = fi (x) for x ∈ Int(Ai ).

Theorem 61.2. A piecewise-continuous function is Riemann-Darboux integrable and
\[
\iint_A f(x, y) \, dx \, dy = \sum_{i=1}^{n} \iint_{A_i} f_i(x, y) \, dx \, dy
\]

Proof. The proof is rather technical and it will be skipped.

62 Properties of the Riemann-Darboux integral

Theorem 62.1. If f and g are Riemann-Darboux integrable on A, then all the integrals
below exist and the following relations hold:

(1) $\iint_A [\alpha f(x, y) + \beta g(x, y)] \, dx \, dy = \alpha \iint_A f(x, y) \, dx \, dy + \beta \iint_A g(x, y) \, dx \, dy$, $\alpha, \beta \in \mathbb{R}^1$

(2) $\iint_A f(x, y) \, dx \, dy = \iint_{A_1} f(x, y) \, dx \, dy + \iint_{A_2} f(x, y) \, dx \, dy$, where $A_1 \cup A_2 = A$ and $A_1 \cap A_2 = \emptyset$

(3) if f(x, y) ≤ g(x, y) on A, then $\iint_A f(x, y) \, dx \, dy \le \iint_A g(x, y) \, dx \, dy$

(4) $\left| \iint_A f(x, y) \, dx \, dy \right| \le \iint_A |f(x, y)| \, dx \, dy$

Property (1) is called the linearity of the integral and (2) is called the additive property.

Proof. The proof follows directly from the definition of the Riemann-Darboux integral.

Theorem 62.2 (The mean value theorem). Let $f : A \to \mathbb{R}^1$ be integrable on A and
satisfying
\[
m \le f(x, y) \le M \quad \text{for } (x, y) \in A
\]
Then
\[
m \cdot \mathrm{area}(A) \le \iint_A f(x, y) \, dx \, dy \le M \cdot \mathrm{area}(A)
\]

Proof. Use property (3) from Theorem 62.1.

63 Riemann-Darboux integral calculus when A is rectangular

We intend to show that under certain conditions the computation of the integral of a function of two
variables reduces to the iterated computation of integrals of functions of one variable.
Assume that A is given by A = [a, b] × [c, d] and f : A → R1 .

Theorem 63.1. If the function f is integrable on A and if for every x ∈ [a, b] (x fixed)
the function $f_x(y) = f(x, y)$ is integrable on [c, d], i.e. the integral
\[
I(x) = \int_c^d f_x(y) \, dy = \int_c^d f(x, y) \, dy
\]
exists, then the iterated integral
\[
\int_a^b dx \int_c^d f(x, y) \, dy
\]
exists too and the following equality holds:
\[
\iint_A f(x, y) \, dx \, dy = \int_a^b dx \int_c^d f(x, y) \, dy
\]

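Proof. Consider the partitions $P_x = \{a = x_0 < x_1 < \cdots < x_i < \cdots < x_n = b\}$ of [a, b] and
$P_y = \{c = y_0 < y_1 < \cdots < y_j < \cdots < y_m = d\}$ of [c, d]. Then $P = \{A_{ij}\}_{i = \overline{0, n-1},\ j = \overline{0, m-1}}$ is
a partition of A, $A_{ij} = [x_i, x_{i+1}) \times [y_j, y_{j+1})$ (up to the edges x = b and y = d, which have area zero).
Denote by
\[
m_{ij} = \inf\{f(x, y) \mid (x, y) \in A_{ij}\} \quad \text{and} \quad M_{ij} = \sup\{f(x, y) \mid (x, y) \in A_{ij}\}
\]
For $(x, y) \in A_{ij}$ we have
\[
m_{ij} \le f(x, y) \le M_{ij}, \qquad i = \overline{0, n-1},\ j = \overline{0, m-1}
\]
Hence
\[
m_{ij} \cdot [y_{j+1} - y_j] \le \int_{y_j}^{y_{j+1}} f(x, y) \, dy \le M_{ij} \cdot [y_{j+1} - y_j]
\]
for $i = \overline{0, n-1}$, $j = \overline{0, m-1}$ and $x \in [x_i, x_{i+1}]$.

Therefore
\[
\begin{aligned}
m_{ij} \cdot [y_{j+1} - y_j] &\le \inf_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \ \Big|\ x \in [x_i, x_{i+1}] \right\} \le \int_{y_j}^{y_{j+1}} f(x, y) \, dy \\
&\le \sup_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \ \Big|\ x \in [x_i, x_{i+1}] \right\} \le M_{ij} \cdot [y_{j+1} - y_j]
\end{aligned}
\]
and hence
\[
\begin{aligned}
m_{ij} [y_{j+1} - y_j][x_{i+1} - x_i] &\le [x_{i+1} - x_i] \cdot \inf_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \ \Big|\ x \in [x_i, x_{i+1}] \right\} \\
&\le [x_{i+1} - x_i] \cdot \int_{y_j}^{y_{j+1}} f(x, y) \, dy \\
&\le [x_{i+1} - x_i] \cdot \sup_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \ \Big|\ x \in [x_i, x_{i+1}] \right\} \\
&\le M_{ij} [y_{j+1} - y_j][x_{i+1} - x_i]
\end{aligned}
\]
Summing over i and j we obtain, for any choice of points $\xi_i \in [x_i, x_{i+1}]$,
\[
L_f(P) \le L_{I(x)}(P_x) \le \sum_{i=0}^{n-1} [x_{i+1} - x_i] \cdot \int_c^d f(\xi_i, y) \, dy \le U_{I(x)}(P_x) \le U_f(P)
\]
where
\[
L_f(P) = \sum_{i=0}^{n-1} \sum_{j=0}^{m-1} m_{ij} [x_{i+1} - x_i][y_{j+1} - y_j]
\]
\[
L_{I(x)}(P_x) = \sum_{j=0}^{m-1} \sum_{i=0}^{n-1} [x_{i+1} - x_i] \cdot \inf_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \ \Big|\ x \in [x_i, x_{i+1}] \right\}
\]
\[
U_{I(x)}(P_x) = \sum_{j=0}^{m-1} \sum_{i=0}^{n-1} [x_{i+1} - x_i] \cdot \sup_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \ \Big|\ x \in [x_i, x_{i+1}] \right\}
\]
\[
U_f(P) = \sum_{j=0}^{m-1} \sum_{i=0}^{n-1} M_{ij} [x_{i+1} - x_i][y_{j+1} - y_j]
\]
Since
\[
\sup L_f(P) = \inf U_f(P) = \iint_A f(x, y) \, dx \, dy
\]
we have
\[
\sup L_{I(x)}(P_x) = \inf U_{I(x)}(P_x) = \int_a^b dx \int_c^d f(x, y) \, dy
\]
and
\[
\iint_A f(x, y) \, dx \, dy = \int_a^b dx \int_c^d f(x, y) \, dy
\]
Interchanging the roles of x and y (under the analogous assumption that $\int_a^b f(x, y) \, dx$ exists for every y ∈ [c, d]) we obtain also
\[
\iint_A f(x, y) \, dx \, dy = \int_c^d dy \int_a^b f(x, y) \, dx
\]
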

Example 63.1. Consider A = [1, 3] × [1, 2] and $f(x, y) = \frac{1}{(x + y)^2}$. Evaluate $\iint_A f(x, y) \, dx \, dy$.

Example 63.2. Evaluate $\iint_A f(x, y) \, dx \, dy$ in the following cases:

a) A = [1, 3] × [2, 5] and $f(x, y) = 5x^2 y - 2y^3$

b) A = [0, 1] × [0, 1] and $f(x, y) = \frac{x^2}{1 + y^2}$

c) A = [0, 1] × [0, 1] and $f(x, y) = \frac{y}{(1 + x^2 + y^2)^{3/2}}$
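
The reduction to iterated one-dimensional integrals can be imitated numerically. The sketch below approximates the integral of Example 63.1 by nesting a midpoint rule (a quadrature rule chosen here for simplicity, not taken from the notes) and compares it with the value ln(6/5) obtained by carrying out the two integrations by hand. The helper names `midpoint` and `iterated` and the number of subintervals are illustrative.

```python
import math


def midpoint(func, lo, hi, n=400):
    """Composite midpoint rule for the integral of func over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(func(lo + (k + 0.5) * step) for k in range(n))


def iterated(f, a, b, c, d, n=400):
    """Approximate int_a^b dx int_c^d f(x, y) dy by nesting the midpoint rule."""
    def inner(x):
        return midpoint(lambda y: f(x, y), c, d, n)   # I(x) = int_c^d f(x, y) dy
    return midpoint(inner, a, b, n)


value = iterated(lambda x, y: 1.0 / (x + y) ** 2, 1.0, 3.0, 1.0, 2.0)
exact = math.log(6.0 / 5.0)
print(value, exact)
```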

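Theorem 63.2. If A = [a, b] × [c, d] and $f : A \to \mathbb{R}^1$ is continuous, then the function
\[
F(x, y) = \iint_{[a, x] \times [c, y]} f(u, v) \, du \, dv = \int_a^x \int_c^y f(u, v) \, du \, dv
\]
has continuous partial derivatives of first order:
\[
\frac{\partial F}{\partial x} = \int_c^y f(x, v) \, dv, \qquad \frac{\partial F}{\partial y} = \int_a^x f(u, y) \, du
\]
The second order partial derivative $\frac{\partial^2 F}{\partial x \partial y}$ exists and
\[
\frac{\partial^2 F}{\partial x \partial y} = f(x, y) \quad \text{for } (x, y) \in A
\]

Proof. Represent F(x, y) in the following form
\[
F(x, y) = \int_a^x \int_c^y f(u, v) \, du \, dv = \int_a^x du \int_c^y f(u, v) \, dv = \int_c^y dv \int_a^x f(u, v) \, du
\]
and differentiate with respect to the upper limits, using the fundamental theorem of calculus.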

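Theorem 63.3. If there exists $\Phi : A \to \mathbb{R}^1$ such that
\[
\frac{\partial^2 \Phi}{\partial x \partial y} = f(x, y)
\]
then
\[
\iint_A f(x, y) \, dx \, dy = \Phi(a, c) - \Phi(b, c) + \Phi(b, d) - \Phi(a, d)
\]

Proof. Consider the partitions $P_x = \{a = x_0 < x_1 < \cdots < x_i < \cdots < x_m = b\}$ and
$P_y = \{c = y_0 < y_1 < \cdots < y_j < \cdots < y_n = d\}$ of [a, b] and [c, d] respectively. Now let
$P = \{A_{ij}\}_{i = \overline{0, m-1},\ j = \overline{0, n-1}}$ be a partition of A where $A_{ij} = [x_i, x_{i+1}] \times [y_j, y_{j+1}]$ and apply
twice the mean value theorem for the expression
\[
\Phi(x_{i+1}, y_{j+1}) - \Phi(x_{i+1}, y_j) - \Phi(x_i, y_{j+1}) + \Phi(x_i, y_j)
\]
obtaining
\[
\begin{aligned}
\Phi(x_{i+1}, y_{j+1}) - \Phi(x_{i+1}, y_j) - \Phi(x_i, y_{j+1}) + \Phi(x_i, y_j)
&= \frac{\partial^2 \Phi}{\partial x \partial y}(\xi_{ij}, \eta_{ij})(x_{i+1} - x_i)(y_{j+1} - y_j) \\
&= f(\xi_{ij}, \eta_{ij})(x_{i+1} - x_i)(y_{j+1} - y_j)
\end{aligned}
\]
where $x_i \le \xi_{ij} \le x_{i+1}$ and $y_j \le \eta_{ij} \le y_{j+1}$.
Hence
\[
\sum_{i=0}^{m-1} \sum_{j=0}^{n-1} f(\xi_{ij}, \eta_{ij})(x_{i+1} - x_i)(y_{j+1} - y_j) = \Phi(b, d) - \Phi(b, c) - \Phi(a, d) + \Phi(a, c)
\]
The left-hand side is a Riemann sum of f related to P, so letting ν(P) → 0 gives the equality
\[
\iint_A f(x, y) \, dx \, dy = \Phi(a, c) - \Phi(b, c) + \Phi(b, d) - \Phi(a, d)
\]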

64 Riemann-Darboux integral calculus when A is not a rectangle

Let A be the set defined by
\[
A = \{(x, y) \mid x \in [a, b] \text{ and } y \in [g(x), h(x)]\}
\]
where g, h are continuous functions satisfying g(x) ≤ h(x) for every x ∈ [a, b].

Theorem 64.1. If the function $f : A \to \mathbb{R}^1$ is integrable on A and for every x ∈ [a, b]
the integral
\[
I(x) = \int_{g(x)}^{h(x)} f(x, y) \, dy
\]
exists, then the iterated integral
\[
\int_a^b dx \int_{g(x)}^{h(x)} f(x, y) \, dy
\]
exists too and the following equality holds:
\[
\iint_A f(x, y) \, dx \, dy = \int_a^b dx \int_{g(x)}^{h(x)} f(x, y) \, dy
\]

Proof. This case can be reduced to the case when A is a rectangle.


Example 64.1. Compute $\iint_A y^2 \sqrt{R^2 - x^2} \, dx \, dy$ where $A = \{(x, y) \mid x^2 + y^2 \le R^2\}$.

Example 64.2. Compute $\iint_A (x^2 + y) \, dx \, dy$ where $A = \{(x, y) \mid y^2 - x \le 0 \text{ and } x^2 - y \le 0\}$.
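
For Example 64.2 the set A can be described as x ∈ [0, 1], y ∈ [x², √x], so Theorem 64.1 applies with g(x) = x² and h(x) = √x. The sketch below evaluates the iterated integral with a nested midpoint rule (an illustrative quadrature choice) and compares it with 33/140, the value obtained by integrating by hand. The helper names are hypothetical.

```python
import math


def midpoint(func, lo, hi, n=400):
    """Composite midpoint rule for the integral of func over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(func(lo + (k + 0.5) * step) for k in range(n))


def iterated_region(f, a, b, g, h, n=400):
    """Approximate int_a^b dx int_{g(x)}^{h(x)} f(x, y) dy."""
    def inner(x):
        return midpoint(lambda y: f(x, y), g(x), h(x), n)
    return midpoint(inner, a, b, n)


value = iterated_region(lambda x, y: x * x + y,
                        0.0, 1.0,
                        lambda x: x * x,   # lower boundary y = x^2
                        math.sqrt)         # upper boundary y = sqrt(x)
exact = 33.0 / 140.0
print(value, exact)
```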

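Let A, B be Jordan measurable sets.

Theorem 64.2. If T : B → A is a bijection such that T and $T^{-1}$ have continuous partial
derivatives, then
\[
\mathrm{area}(A) = \iint_A dx \, dy = \iint_B \left| \det \begin{pmatrix} \frac{\partial x}{\partial \xi} & \frac{\partial x}{\partial \eta} \\ \frac{\partial y}{\partial \xi} & \frac{\partial y}{\partial \eta} \end{pmatrix} \right| d\xi \, d\eta
\]
where T(ξ, η) = (x(ξ, η), y(ξ, η)) for every (ξ, η) ∈ B.

Proof. Consider a partition $P_B = \{B_1, B_2, \dots, B_n\}$ of B and the corresponding partition
$P_A = \{A_1, A_2, \dots, A_n\}$ of A with $A_i = T(B_i)$. If $\nu(P_B)$ is small, then
\[
\mathrm{area}(A_i) \approx \left| \det \begin{pmatrix} \frac{\partial x}{\partial \xi} & \frac{\partial x}{\partial \eta} \\ \frac{\partial y}{\partial \xi} & \frac{\partial y}{\partial \eta} \end{pmatrix} \right| \cdot \mathrm{area}(B_i)
\]
with the partial derivatives evaluated at a point of $B_i$. Hence
\[
\sum_{i=1}^{n} \mathrm{area}(A_i) \approx \sum_{i=1}^{n} \left| \det \begin{pmatrix} \frac{\partial x}{\partial \xi} & \frac{\partial x}{\partial \eta} \\ \frac{\partial y}{\partial \xi} & \frac{\partial y}{\partial \eta} \end{pmatrix} \right| \cdot \mathrm{area}(B_i)
\]
Considering a sequence of partitions $P_B^n$ with $\nu(P_B^n) \to 0$ as n → ∞, we obtain the stated
result.
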
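Theorem 64.3. If $A, B \subset \mathbb{R}^2$ are Jordan measurable sets, T : B → A is a bijection
such that T and $T^{-1}$ have continuous partial derivatives and $f : A \to \mathbb{R}^1$ is an integrable
function, then the following equality holds:
\[
\iint_A f(x, y) \, dx \, dy = \iint_B f(x(\xi, \eta), y(\xi, \eta)) \left| \det \begin{pmatrix} \frac{\partial x}{\partial \xi} & \frac{\partial x}{\partial \eta} \\ \frac{\partial y}{\partial \xi} & \frac{\partial y}{\partial \eta} \end{pmatrix} \right| d\xi \, d\eta
\]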

Proof. Similarly as before.


Example 64.3. Evaluate $\iint_A y^2 \sqrt{R^2 - x^2} \, dx \, dy$ where $A = \{(x, y) \mid x^2 + y^2 \le R^2\}$ by an
appropriate change of variables.
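
One natural candidate is the polar change of variables x = ρ cos θ, y = ρ sin θ, for which the absolute value of the Jacobian determinant is ρ. The sketch below, which fixes R = 1 for illustration, integrates the transformed function over B = [0, R] × [0, 2π] with a two-dimensional midpoint rule and compares the result with $32R^5/45$, the value obtained by integrating in Cartesian coordinates by hand. The helper names and grid size are illustrative, and the notes do not say which change of variables is intended.

```python
import math


def midpoint_2d(func, a, b, c, d, n=400):
    """Midpoint rule on an n-by-n uniform grid over [a, b] x [c, d]."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * hx
        for j in range(n):
            v = c + (j + 0.5) * hy
            total += func(u, v)
    return total * hx * hy


R = 1.0   # illustrative radius


def transformed(rho, theta):
    x, y = rho * math.cos(theta), rho * math.sin(theta)
    # f(x(rho, theta), y(rho, theta)) times |det Jacobian| = rho
    return y * y * math.sqrt(R * R - x * x) * rho


value = midpoint_2d(transformed, 0.0, R, 0.0, 2.0 * math.pi)
exact = 32.0 / 45.0 * R ** 5
print(value, exact)
```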

65 Jordan measurable subsets of Rn

Let be the set of one dimensional bounded intervals of the form (a, b), [a, b), (a, b], [a, b],
where a, b ∈ R. The cartesian product ∆ = I1 × · · · × In , where Ii are intervals of this
type, is called hypercube in Rn .
The volume of such a hypercube ∆ is defined by

vol(∆) = length(I1 )length(I2 ) · · · length(In )

Consider the set P of the finite unions of hypercubes ∆, i. e. P ∈ P if and only if exist
∆1 , ∆2 , · · · , ∆k such that
[k
P = ∆l
l=1

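It is easy to verify that the following statement holds:

\[
P_1, P_2 \in \mathcal{P} \Rightarrow P_1 \cup P_2 \in \mathcal{P} \text{ and } P_1 \setminus P_2 \in \mathcal{P}
\]

Proposition 65.1. For any $P \in \mathcal{P}$ there exist $\Delta_1, \Delta_2, \ldots, \Delta_k$ such that $P = \bigcup_{l=1}^{k} \Delta_l$ and $\Delta_p \cap \Delta_q = \emptyset$ if $p \ne q$.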

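Definition 65.1. For $P \in \mathcal{P}$ the volume is defined as
\[
\operatorname{vol}(P) = \sum_{l=1}^{k} \operatorname{vol}(\Delta_l)
\]
where $P = \bigcup_{l=1}^{k} \Delta_l$ and $\Delta_1, \Delta_2, \ldots, \Delta_k$ are given by Proposition 65.1.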

Proposition 65.2. The volume defined above for P ∈ P satisfies:

1. vol(P ) ≥ 0 for P ∈ P

2. P1 , P2 ∈ P, P1 ∩ P2 = ∅ ⇒ vol(P1 ∪ P2 ) = vol(P1 ) + vol(P2 )

3. vol(P) is independent of the decomposition of P.

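Definition 65.2. For $A \subset \mathbb{R}^n$, $A$ bounded, we define
\[
\operatorname{vol}_i(A) = \sup_{P \subset A,\ P \in \mathcal{P}} \operatorname{vol}(P), \qquad \operatorname{vol}_e(A) = \inf_{P \supset A,\ P \in \mathcal{P}} \operatorname{vol}(P)
\]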

Definition 65.3. A bounded set A ⊂ Rn is said to be Jordan measurable if

voli (A) = vole (A)

Definition 65.4. If the bounded set A ⊂ Rn is Jordan measurable, then the volume of A
is defined as
vol(A) = voli (A) = vole (A)

Proposition 65.3. A bounded set A ⊂ Rn is Jordan measurable if and only if for any
ε > 0 there exist Pε , Qε ∈ P such that Pε ⊂ A ⊂ Qε and vol(Qε ) − vol(Pε ) < ε.
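Proposition 65.3 can be illustrated on the closed unit disk in $\mathbb{R}^2$. The sketch below is an illustration under stated assumptions, not a construction from the notes: it cuts $[-1, 1]^2$ into $n^2$ closed square cells, takes $P_\varepsilon$ to be the union of cells contained in the disk and $Q_\varepsilon$ the union of cells meeting it, and returns their areas. The function name is hypothetical.

```python
import math

def inner_outer_area(n):
    """Inner and outer grid approximations of the closed unit disk.

    The square [-1, 1]^2 is cut into n*n closed cells of side h = 2/n.
    Returns (area of the cells contained in the disk,
             area of the cells that meet the disk)."""
    h = 2.0 / n
    inner = outer = 0
    for i in range(n):
        x0, x1 = -1.0 + i * h, -1.0 + (i + 1) * h
        # smallest and largest |x| over the cell
        nx = 0.0 if x0 <= 0.0 <= x1 else min(abs(x0), abs(x1))
        fx = max(abs(x0), abs(x1))
        for j in range(n):
            y0, y1 = -1.0 + j * h, -1.0 + (j + 1) * h
            ny = 0.0 if y0 <= 0.0 <= y1 else min(abs(y0), abs(y1))
            fy = max(abs(y0), abs(y1))
            if fx * fx + fy * fy <= 1.0:
                inner += 1   # farthest corner is in the disk: whole cell inside
            if nx * nx + ny * ny <= 1.0:
                outer += 1   # nearest point is in the disk: cell meets the disk
    return inner * h * h, outer * h * h
```

Calling `inner_outer_area(n)` for increasing `n` brackets $\pi$ between the two returned values, and the gap between them shrinks roughly in proportion to the cell side, which is the behaviour the proposition requires.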

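Proposition 65.4. A bounded set $A \subset \mathbb{R}^n$ is Jordan measurable if and only if there exist two sequences $(P_k)$, $(Q_k)$, with $P_k, Q_k \in \mathcal{P}$ and $P_k \subset A \subset Q_k$, such that
\[
\lim_{k \to \infty} \operatorname{vol}(P_k) = \lim_{k \to \infty} \operatorname{vol}(Q_k)
\]
In this case we have
\[
\operatorname{vol}(A) = \lim_{k \to \infty} \operatorname{vol}(P_k) = \lim_{k \to \infty} \operatorname{vol}(Q_k)
\]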

Proposition 65.5. A bounded set A ⊂ Rn is Jordan measurable if and only if the volume
of its boundary is equal to zero.

Proposition 65.6. If A1 and A2 are Jordan measurable sets, then A1 ∪ A2 and A1 \ A2 are Jordan measurable and if A1 ∩ A2 = ∅, then vol(A1 ∪ A2) = vol(A1) + vol(A2).

Proposition 65.7. Let M ⊂ Rn be a bounded set. If for any ε > 0 there exist two Jordan measurable sets A and B such that A ⊂ M ⊂ B and vol(B) − vol(A) < ε, then the set M is Jordan measurable.

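Proposition 65.8. If there exist two sequences $(A_k)$ and $(B_k)$ of Jordan measurable sets such that $A_k \subset M \subset B_k$ and
\[
\lim_{k \to \infty} \operatorname{vol}(A_k) = \lim_{k \to \infty} \operatorname{vol}(B_k)
\]
then $M$ is Jordan measurable and
\[
\operatorname{vol}(M) = \lim_{k \to \infty} \operatorname{vol}(A_k) = \lim_{k \to \infty} \operatorname{vol}(B_k)
\]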

The proofs of the above statements are rather technical and are omitted.

66 The Riemann-Darboux integral of an n-variable function

Let A be a bounded and Jordan measurable subset of Rn .

Definition 66.1. A partition of $A$ is a finite set of subsets $A_l$, $l = 1, \ldots, k$, of $A$ having the following properties:

1) every $A_l$ is Jordan measurable

2) $\bigcup_{l=1}^{k} A_l = A$

3) if $p \ne q$ then $A_p \cap A_q = \emptyset$

The diameter of the set $A_l$ is the number
\[
d(A_l) = \sup_{x, y \in A_l} \|x - y\|
\]

The norm of the partition $P$ is the number
\[
\nu(P) = \max\{d(A_1), \ldots, d(A_k)\}
\]

Suppose now that $f$ is a real valued function of $n$ variables defined and bounded on $A$, $f : A \to \mathbb{R}^1$.
Then $f$ is bounded on each part $A_l$, $l = 1, \ldots, k$. Hence $f$ has a least upper bound $M_l$ and a greatest lower bound $m_l$ on $A_l$ ($l = 1, \ldots, k$).

Definition 66.2. The upper sum of $f$ related to $P$ is defined by
\[
U_f(P) = \sum_{l=1}^{k} M_l \cdot \operatorname{vol}(A_l)
\]
where $M_l = \sup\{f(x) \mid x \in A_l\}$.

The lower sum of $f$ related to $P$ is defined by
\[
L_f(P) = \sum_{l=1}^{k} m_l \cdot \operatorname{vol}(A_l)
\]
where $m_l = \inf\{f(x) \mid x \in A_l\}$.

The Riemann sum of $f$ related to $P$ is defined by
\[
\sigma_f(P) = \sum_{l=1}^{k} f(\xi_l) \cdot \operatorname{vol}(A_l)
\]
where $\xi_l \in A_l$.

Remark 66.1. It is clear that the following inequalities hold:
\[
L_f(P) \le \sigma_f(P) \le U_f(P)
\]
Now $f$ is bounded above and below on $A$. So there exist numbers $m$ and $M$ such that
\[
m \le f(x) \le M \quad \text{for } x \in A
\]
Thus for any partition $P$ of $A$ we have
\[
m \cdot \operatorname{vol}(A) = \sum_{l=1}^{k} m \cdot \operatorname{vol}(A_l) \le L_f(P) \le \sigma_f(P) \le U_f(P) \le \sum_{l=1}^{k} M \cdot \operatorname{vol}(A_l) = M \cdot \operatorname{vol}(A)
\]
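The three sums can be computed explicitly in a simple case. The following sketch is an illustration only; the choice $f(x, y) = x + y$ on $A = [0, 1] \times [0, 1]$, the uniform partition into $n^2$ squares, and the function name are assumptions made for the example. Because this $f$ increases in each variable, its infimum and supremum on a cell sit at opposite corners, so no search is needed.

```python
def darboux_sums(n):
    """Lower, Riemann (midpoint tags) and upper sums of f(x, y) = x + y
    for the partition of A = [0, 1] x [0, 1] into n*n equal squares."""
    f = lambda x, y: x + y
    h = 1.0 / n
    lower = riemann = upper = 0.0
    for i in range(n):
        for j in range(n):
            x0, y0 = i * h, j * h
            # f increases in x and in y, so inf and sup over the cell
            # are attained at the lower-left and upper-right corners
            lower += f(x0, y0) * h * h
            upper += f(x0 + h, y0 + h) * h * h
            riemann += f(x0 + h / 2, y0 + h / 2) * h * h
    return lower, riemann, upper
```

For this particular $f$ the lower and upper sums differ by $2/n$, so they squeeze the Riemann sum and the integral (equal to 1) as the partition is refined.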

Hence, the set
\[
\mathcal{L}_f = \{L_f(P) \mid P \text{ is a partition of } A\}
\]
is bounded above and the set
\[
\mathcal{U}_f = \{U_f(P) \mid P \text{ is a partition of } A\}
\]
is bounded below.
So $L_f = \sup \mathcal{L}_f$ and $U_f = \inf \mathcal{U}_f$ exist.
The first result establishes the intuitively obvious fact that for a bounded function $L_f \le U_f$.

Theorem 66.1. If f is defined and bounded on A, then Lf ≤ Uf .

Proof. The same as in two dimensions.

Definition 66.3. A function $f$ defined and bounded on $A$ is Riemann-Darboux integrable on $A$ if $L_f = U_f$.
This common value is denoted by
\[
\int_A f(x)\, dx \quad \text{or} \quad \int \cdots \int_A f(x_1, \ldots, x_n)\, dx_1 \cdots dx_n
\]
and is called the Riemann-Darboux integral of $f$.

Theorem 66.2. The function $f$ defined and bounded on $A$ is Riemann-Darboux integrable on $A$ if and only if for any $\varepsilon > 0$ there exists a partition $P$ of $A$ such that
\[
U_f(P) - L_f(P) < \varepsilon
\]

Proof. The same as in two dimensions.

Remark 66.2. The constant function $f(x) = 1$ for $x \in A$ is Riemann integrable on $A$ and
\[
\int \cdots \int_A 1\, dx_1 \cdots dx_n = \operatorname{vol}(A)
\]

67 Integrable functions of n variables

Theorem 67.1. If f is continuous on A and A is Jordan measurable, then f is Riemann-Darboux integrable on A.

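Proof. Since $A$ is compact the function $f$ is uniformly continuous. For $\varepsilon > 0$ there exists $\delta > 0$ such that
\[
\|x' - x''\| < \delta \Rightarrow |f(x') - f(x'')| < \frac{\varepsilon}{\operatorname{vol}(A)}
\]
Choose $P$ such that $x', x'' \in A_i \Rightarrow \|x' - x''\| < \delta$ for $i = 1, \ldots, n$.
Hence
\[
U_f(P) - L_f(P) = \sum_{i=1}^{n} (M_i - m_i) \cdot \operatorname{vol}(A_i) < \frac{\varepsilon}{\operatorname{vol}(A)} \sum_{i=1}^{n} \operatorname{vol}(A_i) = \varepsilon
\]
Hence, by Theorem 66.2, $f$ is Riemann integrable on $A$.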

Definition 67.1. A function $f$ is called piecewise-continuous on $A$ if there exists a partition $P = \{A_1, \ldots, A_k\}$ of $A$ and continuous functions $f_i$, $i = 1, \ldots, k$, defined on $A_i$ such that $f(x) = f_i(x)$ for $x \in \operatorname{Int}(A_i)$.

Theorem 67.2. A piecewise-continuous function is Riemann integrable and
\[
\int_A f(x)\, dx = \sum_{i=1}^{k} \int_{A_i} f_i(x)\, dx
\]

Proof. The proof is technical and is omitted.

68 Properties of the Riemann-Darboux integral of n-variable functions

Theorem 68.1. If $f$ and $g$ are Riemann-Darboux integrable on $A \subset \mathbb{R}^n$ then all the integrals below exist and the relations hold:

(1) $\int_A [\alpha f(x) + \beta g(x)]\, dx = \alpha \int_A f(x)\, dx + \beta \int_A g(x)\, dx$, $\alpha, \beta \in \mathbb{R}^1$

(2) $\int_A f(x)\, dx = \int_{A_1} f(x)\, dx + \int_{A_2} f(x)\, dx$ where $A_1 \cup A_2 = A$ and $A_1 \cap A_2 = \emptyset$

(3) if $f(x) \le g(x)$ on $A$, then $\int_A f(x)\, dx \le \int_A g(x)\, dx$

(4) $\left| \int_A f(x)\, dx \right| \le \int_A |f(x)|\, dx$

Property (1) is called the linearity of the integral and (2) is called the additive property.

Proof. It is proved using the definition of the Riemann-Darboux integral.

Theorem 68.2 (The mean value theorem). Let $f : A \to \mathbb{R}^1$ be integrable on $A$ and satisfying
\[
m \le f(x) \le M \quad \text{for } x \in A
\]
Then
\[
m \cdot \operatorname{vol}(A) \le \int_A f(x)\, dx \le M \cdot \operatorname{vol}(A)
\]

Proof. Use property (3) from Theorem 68.1.

69 Riemann-Darboux integral calculus for n-variable functions when A is a hypercube

We intend to show that under some conditions the computation of the integral of a function of $n$ variables reduces to the iterated computation of integrals of functions of one variable.
Assume that $A$ is a hypercube $A = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n]$ and $f : A \to \mathbb{R}^1$.

Theorem 69.1. If the function $f$ is integrable on $A$ and if for every fixed $x_1 \in [a_1, b_1]$ the function $f_{x_1}(x_2, \ldots, x_n) = f(x_1, x_2, \ldots, x_n)$ is integrable on $A_1 = [a_2, b_2] \times \cdots \times [a_n, b_n]$, i.e. the integral
\[
I(x_1) = \int \cdots \int_{A_1} f_{x_1}(x_2, \ldots, x_n)\, dx_2 \cdots dx_n = \int_{a_2}^{b_2} \cdots \int_{a_n}^{b_n} f(x_1, x_2, \ldots, x_n)\, dx_2 \cdots dx_n
\]
exists, then the iterated integral
\[
\int_{a_1}^{b_1} I(x_1)\, dx_1 = \int_{a_1}^{b_1} dx_1 \int_{a_2}^{b_2} \cdots \int_{a_n}^{b_n} f(x_1, x_2, \ldots, x_n)\, dx_2 \cdots dx_n
\]
exists too and the following equality holds:
\[
\int_A f(x)\, dx = \int_{a_1}^{b_1} I(x_1)\, dx_1 = \int_{a_1}^{b_1} dx_1 \int_{a_2}^{b_2} \cdots \int_{a_n}^{b_n} f(x_1, x_2, \ldots, x_n)\, dx_2 \cdots dx_n
\]

Proof. Similar to the case of functions of two variables.


Example 69.1. Evaluate $\iiint_A \frac{dx\, dy\, dz}{(1 + x + y + z)^3}$ for the set $A$ bounded by the planes: $x = 0$, $y = 0$, $z = 0$ and $x + y + z = 1$.
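The reduction to one-variable integrals can be sketched numerically for the integral of Example 69.1. This is an illustration under stated assumptions, not the method of the notes: the tetrahedron is described by $0 \le x \le 1$, $0 \le y \le 1 - x$, $0 \le z \le 1 - x - y$ (so the limits of the inner integrals depend on the outer variables), each one-variable integral uses the midpoint rule, and the helper names are hypothetical. Carrying out the three one-variable integrations by hand gives $\frac{1}{2}\left(\ln 2 - \frac{5}{8}\right)$, which serves as the reference value.

```python
import math

def midpoint(g, a, b, n):
    """Midpoint-rule approximation of the one-variable integral of g on [a, b]."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def tetrahedron_integral(n=40):
    """Iterated integral of 1/(1+x+y+z)^3 over x, y, z >= 0, x + y + z <= 1."""
    f = lambda x, y, z: 1.0 / (1.0 + x + y + z) ** 3
    # innermost integral in z, for fixed (x, y)
    inner = lambda x, y: midpoint(lambda z: f(x, y, z), 0.0, 1.0 - x - y, n)
    # middle integral in y, for fixed x
    middle = lambda x: midpoint(lambda y: inner(x, y), 0.0, 1.0 - x, n)
    # outer integral in x
    return midpoint(middle, 0.0, 1.0, n)

approx = tetrahedron_integral()
reference = 0.5 * (math.log(2.0) - 0.625)
```

Each nesting level is an ordinary one-variable integral, which is exactly the structure of the iterated integral in Theorem 69.1.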

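Example 69.2. Evaluate $\iiint_A z\, dx\, dy\, dz$ for the set $A$ defined by $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1$.

Example 69.3. Evaluate $\iiint_A \left( \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \right) dx\, dy\, dz$ for the set $A$ defined by $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1$.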

Remark 69.1. The above theorem reduces successively the evaluation of the integral to the evaluation of integrals for functions of one variable.
\[
\int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \ldots, x_n)\, dx_1\, dx_2 \cdots dx_n = \int_{a_1}^{b_1} dx_1 \int_{a_2}^{b_2} dx_2 \cdots \int_{a_n}^{b_n} f(x_1, \ldots, x_n)\, dx_n
\]

Remark 69.2. Theorem 69.1 is also valid for more complex sets A, as in the 2-dimensional case.

Theorem 69.2. If $A, B \subset \mathbb{R}^n$ are Jordan measurable and $T : B \to A$ is a bijection, $T$ and $T^{-1}$ having continuous partial derivatives, then
\[
\operatorname{vol}(A) = \int_A 1\, dx = \int \cdots \int_A dx_1 \cdots dx_n = \int \cdots \int_B \left| \frac{D(x_1, \ldots, x_n)}{D(\xi_1, \ldots, \xi_n)} \right| d\xi_1 \cdots d\xi_n
\]
where $T(\xi_1, \ldots, \xi_n) = (x_1(\xi_1, \ldots, \xi_n), \ldots, x_n(\xi_1, \ldots, \xi_n))$.

Proof. As in the 2-dimensional case.

Theorem 69.3. If $A, B \subset \mathbb{R}^n$ are Jordan measurable sets, $T : B \to A$ is a bijection, $T$ and $T^{-1}$ having continuous partial derivatives, and $f : A \to \mathbb{R}^1$ is an integrable function, then the following equality holds:
\[
\int_A f(x)\, dx = \int_B f(x(\xi)) \left| \frac{D(x_1, \ldots, x_n)}{D(\xi_1, \ldots, \xi_n)} \right| d\xi
\]
where $\left| \frac{D(x_1, \ldots, x_n)}{D(\xi_1, \ldots, \xi_n)} \right|$ is the absolute value of the determinant of the Jacobi matrix of $T$.

Proof. As in two dimensions.

Example 69.4. Compute the volume of the set $A \subset \mathbb{R}^3$ defined by $x^2 + y^2 + z^2 \le R^2$.

Example 69.5. Compute the integral $\iiint_A \frac{xyz}{x^2 + y^2}\, dx\, dy\, dz$ where $A$ is bounded above by the surface $(x^2 + y^2 + z^2)^2 = a^2 xy$ and below by the surface $z = 0$.

70 Elementary curves and elementary closed curves

The way of defining a line integral is quite similar to the familiar way of defining a definite
integral known from calculus. In order to do this, we must introduce the concepts of curve
and arc length. We will present these concepts in a particular framework which can be
extended in a natural way.
Definition 70.1. An elementary curve (elementary arc) is a set of points $C \subset \mathbb{R}^3$ for which there exists a closed interval $[a, b] \subset \mathbb{R}$ and a function $\varphi : [a, b] \to C$ having the following properties:

a) $\varphi$ is bijective;
b) $\varphi$ is of class $C^1$ and $\varphi'(t) \ne 0$, $\forall t \in [a, b]$.

The points $A = \varphi(a)$ and $B = \varphi(b)$ are called the end points of the curve. The function $\varphi$ is called a parametric representation of the curve and the vector $\varphi'(t)$ is tangent to the curve at the point $\varphi(t)$.

Figure 70.1:

Definition 70.2. An elementary closed curve is a set of points $C \subset \mathbb{R}^3$ for which there exists a closed interval $[a, b] \subset \mathbb{R}$ and a function $\varphi : [a, b] \to C$ with the following properties:

a) $\varphi$ is bijective from $[a, b)$ to $C$ and $\varphi(a) = \varphi(b)$;

b) $\varphi$ is of class $C^1$ and $\varphi'(t) \ne 0$, $\forall t \in [a, b]$.

The function $\varphi$ is called a parametric representation of the curve and the vector $\varphi'(t)$ is tangent to the curve at the point $\varphi(t)$.

Example 70.1. If $x^0 = (x^0_1, x^0_2, x^0_3)$ and $h = (h_1, h_2, h_3)$, the closed segment $[x^0, x^0 + h]$ which joins the points $x^0$, $x^0 + h$ is an elementary curve. In this case, we can take $[a, b] = [0, 1]$ and $\varphi : [a, b] \to C$ is given by $\varphi(t) = (x^0_1 + th_1, x^0_2 + th_2, x^0_3 + th_3)$.
Example 70.2. The circle $C$ defined by $x_1^2 + x_2^2 = 1$ and $x_3 = 0$ is an elementary closed curve. In this case we can take $[a, b] = [0, 2\pi]$ and $\varphi : [a, b] \to C$ is given by $\varphi(t) = (\cos t, \sin t, 0)$.

Figure 70.2:

Remark 70.1. Any elementary or elementary closed curve possesses an infinity of parametric representations.
Remark 70.2. The end points $A$, $B$ of an elementary curve are independent of the parametric representation of the curve. This means that for every parametric representation $\psi : [c, d] \to C$ of the curve we have $\{\psi(c), \psi(d)\} = \{\varphi(a), \varphi(b)\} = \{A, B\}$.

Using a parametric representation of an elementary curve or of an elementary closed curve we can define the curve length.
Definition 70.3. The length of the elementary curve (elementary closed curve) $C$ is given by:
\[
l = \int_a^b \|\varphi'(t)\|\, dt = \int_a^b \sqrt{\dot\varphi_1^2(t) + \dot\varphi_2^2(t) + \dot\varphi_3^2(t)}\, dt
\]
where $\varphi(t) = (\varphi_1(t), \varphi_2(t), \varphi_3(t))$ is a parametric representation of $C$ and $\dot\varphi_i(t) = \frac{d\varphi_i}{dt}(t)$, $i = 1, 2, 3$.
Remark 70.3. The curve length is independent of the parametric representation of the curve $C$. In other words, if $\psi : [c, d] \to C$ is a second parametric representation of $C$, then
\[
\int_c^d \|\psi'(\tau)\|\, d\tau = \int_a^b \|\varphi'(t)\|\, dt
\]

Example 70.3. The length of the closed segment $[x^0, x^0 + h]$ represented by $\varphi(t) = x^0 + th$, $t \in [0, 1]$ is
\[
l = \int_0^1 \sqrt{h_1^2 + h_2^2 + h_3^2}\, dt = \|h\|
\]
and the length of the circle $C = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 = 1,\ x_3 = 0\}$ represented by $\varphi(t) = (\cos t, \sin t, 0)$, $t \in [0, 2\pi]$ is
\[
l = \int_0^{2\pi} \sqrt{\sin^2 t + \cos^2 t}\, dt = 2\pi
\]

Example 70.4 (Shows that the curve length is independent of the parametric representation). For the circle $C = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 = 1,\ x_3 = 0\}$ the parametric representation $\varphi : [0, \pi] \to C$, $\varphi(t) = (\cos 2t, \sin 2t, 0)$ is chosen. Using this representation we have
\[
l = \int_0^{\pi} 2 \sqrt{\sin^2 2t + \cos^2 2t}\, dt = 2\pi
\]
This is the same value as the one obtained in the case of the parametric representation $\varphi(t) = (\cos t, \sin t, 0)$, $t \in [0, 2\pi]$.
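The independence of the length from the parametric representation can also be observed numerically. The sketch below is an illustration, not part of the notes: it approximates $\int_a^b \|\varphi'(t)\|\, dt$ by the midpoint rule, with the derivative $\varphi'$ supplied by hand, and evaluates it for the two representations of the circle used above. The function name is hypothetical.

```python
import math

def curve_length(phi_dot, a, b, n=1000):
    """Midpoint-rule approximation of the integral of ||phi'(t)|| over [a, b].
    phi_dot(t) must return the derivative vector as a tuple of 3 numbers."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        d = phi_dot(a + (k + 0.5) * h)
        total += math.sqrt(sum(c * c for c in d)) * h
    return total

# phi(t) = (cos t, sin t, 0) on [0, 2*pi]
len_1 = curve_length(lambda t: (-math.sin(t), math.cos(t), 0.0),
                     0.0, 2.0 * math.pi)
# phi(t) = (cos 2t, sin 2t, 0) on [0, pi]
len_2 = curve_length(lambda t: (-2.0 * math.sin(2.0 * t), 2.0 * math.cos(2.0 * t), 0.0),
                     0.0, math.pi)
```

Both `len_1` and `len_2` approximate $2\pi$: the second representation runs twice as fast over an interval half as long.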
Remark 70.4. Since $\varphi'(t) \ne 0$, $\forall t \in [a, b]$, an elementary curve $C$ has only two end points. In other words, if $C$ is an elementary curve then there exist two points $A, B \in C$ such that for any parametric representation $\psi : [c, d] \to C$ of the curve $C$ we have $\{\psi(c), \psi(d)\} = \{A, B\}$.
When $\psi(c) = A$ and $\psi(d) = B$, then if $\tau$ moves from $c$ to $d$, $\psi(\tau)$ moves from $A$ to $B$.
When $\psi(c) = B$ and $\psi(d) = A$, then if $\tau$ moves from $c$ to $d$, $\psi(\tau)$ moves from $B$ to $A$.
The two ways to cover the curve $C$, from $A$ to $B$ or from $B$ to $A$, are called orientations of $C$ and the above presented facts show that on an elementary curve there are two orientations. Moreover, the covering given by an arbitrary representation of $C$ is one of the above mentioned orientations.
In other words, the set of representations is divided in two classes: for all the representations which belong to one of the classes we have one orientation (say from $A$ to $B$) and for the other class the opposite orientation (from $B$ to $A$).

Figure 70.3:
Consider now an elementary curve $C$ with the parametric representation $\varphi : [a, b] \to C$ for which the orientation of the curve is from $A$ to $B$ ($\varphi(a) = A$, $\varphi(b) = B$).

If in the formula
\[
l = \int_a^b \|\varphi'(\tau)\|\, d\tau
\]
we replace the fixed upper limit $b$ with a variable upper limit $t$, the integral becomes
\[
s_A(t) = \int_a^t \|\varphi'(\tau)\|\, d\tau
\]

The value $s_A(t)$ represents the length of the arc $AA' \subset C$, where $A' = \varphi(t)$.
The function $s_A$ is defined on the closed interval $[a, b]$ and it is a bijection from $[a, b]$ to $[0, l]$, with $s_A(a) = 0$ and $s_A(b) = l$. Moreover, $s_A$ and $s_A^{-1}$ are continuously differentiable.
The function $s_A$ can be used in order to define a new parametric representation of the curve $C$, namely: $\tilde{x}_A : [0, l] \to C$, $\tilde{x}_A = \varphi \circ s_A^{-1}$. In this representation of $C$, $s \in [0, l]$ serves as a parameter and $\tilde{x}_A(0) = A$, $\tilde{x}_A(l) = B$. When $s$ moves from $0$ to $l$, then $\tilde{x}_A(s)$ moves from $A$ to $B$. The parametric representation $\tilde{x}_A$ is canonical when the orientation of $C$ is from $A$ to $B$, i.e. the parameter $s$ is the length of the arc $A\tilde{x}_A(s)$ and there is no other representation with this property.
Consider now for the same elementary curve $C$ with the end points $A$ and $B$ a parametric representation $\psi : [c, d] \to C$ for which the orientation is from $B$ to $A$ ($\psi(c) = B$ and $\psi(d) = A$). If in the formula
\[
l = \int_c^d \|\psi'(\tau)\|\, d\tau
\]
we replace the fixed upper limit $d$ with a variable upper limit $t$, the integral becomes
\[
s_B(t) = \int_c^t \|\psi'(\tau)\|\, d\tau
\]
The value $s_B(t)$ represents the length of the arc $BB' \subset C$, where $B' = \psi(t)$.

The function $s_B : [c, d] \to [0, l]$ is a bijection and $s_B(c) = 0$, $s_B(d) = l$. The function $s_B^{-1}$ can be used in order to define a new parametric representation $\tilde{x}_B = \psi \circ s_B^{-1}$ of $C$, such that $\tilde{x}_B(0) = B$, $\tilde{x}_B(l) = A$. If $s$ moves from $0$ to $l$, then $\tilde{x}_B(s)$ moves from $B$ to $A$. The representation $\tilde{x}_B$ is canonical if the orientation of $C$ is from $B$ to $A$, i.e. the parameter $s$ is the length of the arc $B\tilde{x}_B(s)$ and there is no other representation with this property.

Example 70.5. In the case of the closed segment $[x^0, x^0 + h]$ which joins the points $A = x^0$ and $B = x^0 + h$, if the parametric representation $\varphi(t) = x^0 + th$, $t \in [0, 1]$ is chosen, then $\varphi(t)$ moves from $A$ to $B$ when $t$ moves from $0$ to $1$. If the parametric representation $\psi(\tau) = x^0 + (2 - \tau)h$, $\tau \in [1, 2]$ is chosen, then $\psi(\tau)$ moves from $B = x^0 + h$ to $A = x^0$ when $\tau$ moves from $1$ to $2$.

Using the representation $\varphi$, the arc length $AA'$ is given by
\[
s_A(t) = \int_0^t \sqrt{h_1^2 + h_2^2 + h_3^2}\, d\tau = t \cdot \|h\|
\]
with $s_A(0) = 0$ and $s_A(1) = \|h\|$.

The function $s_A^{-1} : [0, \|h\|] \to [0, 1]$ is $s_A^{-1}(s) = \frac{s}{\|h\|}$, and the canonical parametric representation $\tilde{x}_A : [0, \|h\|] \to C$ is given by
\[
\tilde{x}_A(s) = (\varphi \circ s_A^{-1})(s) = x^0 + \frac{1}{\|h\|} \cdot s \cdot h.
\]

Using the representation $\psi$, the arc length $BB'$ is given by
\[
s_B(t) = \int_1^t \|\psi'(\tau)\|\, d\tau = \int_1^t \sqrt{h_1^2 + h_2^2 + h_3^2}\, d\tau = (t - 1) \cdot \|h\|
\]
with $s_B(1) = 0$ and $s_B(2) = \|h\|$.

The function $s_B^{-1} : [0, \|h\|] \to [1, 2]$ is given by $s_B^{-1}(s) = 1 + \frac{s}{\|h\|}$, and the canonical parametric representation $\tilde{x}_B : [0, \|h\|] \to C$ is
\[
\tilde{x}_B(s) = (\psi \circ s_B^{-1})(s) = x^0 + \left(2 - 1 - \frac{s}{\|h\|}\right) \cdot h = x^0 + \left(1 - \frac{s}{\|h\|}\right) \cdot h
\]
The representation $\tilde{x}_A(s) = x^0 + \frac{1}{\|h\|} \cdot s \cdot h$ is canonical in the orientation from $A$ to $B$, and the representation $\tilde{x}_B(s) = x^0 + \left(1 - \frac{s}{\|h\|}\right) \cdot h$ is canonical in the orientation from $B$ to $A$.
In both cases, the parameter $s$ moves from $0$ to $l$.
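The two canonical representations of the segment can be written down directly. The sketch below is an illustration with hypothetical names (`canonical_segment`, `x_A`, `x_B`); it only encodes the formulas $x^0 + \frac{s}{\|h\|} h$ and $x^0 + \left(1 - \frac{s}{\|h\|}\right) h$ and lets one check that the parameter is the arc length and that the two orientations are related by $s \mapsto l - s$.

```python
import math

def canonical_segment(x0, h):
    """Return (l, x_A, x_B): the length of the segment [x0, x0 + h] and its two
    arc-length representations, oriented from A = x0 to B = x0 + h and back."""
    l = math.sqrt(sum(c * c for c in h))

    def x_A(s):
        # starts at A for s = 0 and reaches B for s = l
        return tuple(p + (s / l) * c for p, c in zip(x0, h))

    def x_B(s):
        # starts at B for s = 0 and reaches A for s = l
        return tuple(p + (1.0 - s / l) * c for p, c in zip(x0, h))

    return l, x_A, x_B
```

For instance, with `x0 = (1, 0, 0)` and `h = (1, 2, 2)` the length is 3, and the Euclidean distance between `x_A(s1)` and `x_A(s2)` equals `abs(s1 - s2)`, as expected from an arc-length parameter.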

Now consider an elementary closed curve C. In this case, there are no two end points A and B, and we cannot speak about the orientation from A to B or from B to A. In the following, we will show how to proceed in order to introduce two orientations in the case of an elementary closed curve C.
For the elementary closed curve $C$ let us consider a parametric representation $\varphi : [a, b] \to C$ and the point $A = \varphi(a) \in C$.
Since $\varphi'(t) \ne 0$, if $t$ moves from $a$ to $b$ then $\varphi(t)$ describes the curve, moving in a unique way. This movement of $\varphi(t)$ is one orientation of the curve. The opposite movement on $C$ is the opposite orientation. If $\psi : [c, d] \to C$ is an arbitrary representation of $C$ then when $\tau$ increases on $[c, d]$, $\psi(\tau)$ moves on $C$ according to one of the orientations:
if $\frac{d}{d\tau}\left(\varphi^{-1} \circ \psi\right)(\tau) = \frac{dt}{d\tau} > 0$ then $\varphi(t)$ and $\psi(\tau)$ move in the same way when $t$ moves from $a$ to $b$ and $\tau$ moves from $c$ to $d$;
if $\frac{d}{d\tau}\left(\varphi^{-1} \circ \psi\right)(\tau) = \frac{dt}{d\tau} < 0$ then $\varphi(t)$ and $\psi(\tau)$ move in opposite senses as $t$ moves from $a$ to $b$ and $\tau$ moves from $c$ to $d$.
As in the case of an elementary curve $C$ we can consider the function $s_A : [a, b] \to [0, l]$ defined by
\[
s_A(t) = \int_a^t \|\varphi'(\tau)\|\, d\tau
\]
If $A' = \varphi(t)$ then $s_A(t)$ is the length of the arc $AA'$ described by $\varphi(\tau)$ when $\tau$ moves from $a$ to $t$.
The function $s_A : [a, b] \to [0, l]$ is a bijection and can be used in order to define a new representation of $C$: $\tilde{x}_A : [0, l] \to C$, $\tilde{x}_A = \varphi \circ s_A^{-1}$. In this representation, $s \in [0, l]$ is the parameter and $\tilde{x}_A(0) = \tilde{x}_A(l) = A$. When $s$ moves from $0$ to $l$ then $\tilde{x}_A(s)$ moves on $C$ and describes the curve $C$ moving in the same sense as $\varphi(t)$. The function $\tilde{\tilde{x}}_A : [0, l] \to C$ defined by $\tilde{\tilde{x}}_A(s) = \tilde{x}_A(l - s)$ is the representation of $C$ which corresponds to the opposite orientation.
For instance, in the case of the circle:
\[
C = \{(x_1, x_2, x_3) \mid x_1^2 + x_2^2 = 1,\ x_3 = 0\}
\]

Figure 70.4:

considering the parametric representation $\varphi(t) = (\cos t, \sin t, 0)$ with $t \in [0, 2\pi]$ we have: $\varphi(0) = (1, 0, 0) = A$, $s_A(t) = \int_0^t d\tau = t$; $\tilde{x}_A(s) = (\cos s, \sin s, 0)$, $s \in [0, 2\pi]$; $\tilde{x}_A(0) = (1, 0, 0) = A$; $\tilde{x}_A(s)$ moves on the circle as in Fig. 70.4, and $\tilde{\tilde{x}}_A(s) = (\cos(2\pi - s), \sin(2\pi - s), 0)$ moves on the circle as in Fig. 70.5:

Figure 70.5:

It follows that an elementary curve and an elementary closed curve can be represented
as:
X(s) = (X1 (s), X2 (s), X3 (s)), 0 ≤ s ≤ l
where l is the curve length and s ∈ [0, l] is the arc length X(0)X(s) ⊂ C.
For an elementary curve C, there exist two representations of this kind corresponding to
the two orientations of C. For an elementary closed curve C if we fix a point A on C,
then we also have two representations of this kind corresponding to the two orientations
of C.

71 Line integral of first type

Consider now an elementary curve $C$ and one of its parametric representations
\[
x(s) = (x_1(s), x_2(s), x_3(s)), \quad 0 \le s \le l
\]
as a function of the arc length $s$. In order to make a choice assume that $x(s)$ moves from $A$ to $B$ when $s$ moves from $0$ to $l$.
Let now $f(x_1, x_2, x_3)$ be a given function which is defined (at least) at each point of $C$ and is a continuous function of $s$, i.e. $s \mapsto f(x_1(s), x_2(s), x_3(s))$ is continuous.
We subdivide $C$ into $n$ portions in an arbitrary manner:

Figure 71.1:

Let $P_0 (= A), P_1, P_2, \ldots, P_{n-1}, P_n (= B)$ be the end points of these portions and let
\[
s_0 (= 0) < s_1 < s_2 < \cdots < s_n (= l)
\]
be the lengths of the arcs $AP_i$; $s_i = \operatorname{length}(AP_i)$. Then we choose an arbitrary point on each portion, say, a point $Q_1$ between $P_0$ and $P_1$, a point $Q_2$ between $P_1$ and $P_2$, etc. Taking the values of $f$ at these points $Q_1, Q_2, \ldots, Q_n$ we form the sum
\[
I_n = \sum_{m=1}^{n} f(Q_m)(s_m - s_{m-1})
\]

We do this for $n = 2, 3, \ldots$ in a completely independent manner, so that the greatest $\Delta s_m = s_m - s_{m-1}$ approaches zero as $n$ approaches infinity. We obtain a sequence of real numbers $I_2, I_3, \ldots$. The limit of this sequence is called the line integral of first type of $f$ along $C$ from $A$ to $B$ and is denoted by
\[
\int_C f\, ds
\]

The curve $C$ is called the path of integration. Since, by assumption, $f$ is continuous and $C$ is smooth, that limit exists and is independent of the orientation and of the choice of subdivisions and points $Q_m$. In fact, the position of a point $P$ on $C$ is determined by the corresponding value of the arc length $s$; since $A$ and $B$ correspond to $s = 0$ and $s = l$ respectively, we have
\[
\int_C f\, ds = \int_0^l f(x_1(s), x_2(s), x_3(s))\, ds
\]

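It is easy to see that if $C$ is represented by the continuously differentiable vector function $\varphi : [a, b] \to \mathbb{R}^3$, $\varphi = \varphi(t)$, then
\[
\int_C f\, ds = \int_a^b f(\varphi_1(t), \varphi_2(t), \varphi_3(t)) \cdot \sqrt{\dot\varphi_1^2(t) + \dot\varphi_2^2(t) + \dot\varphi_3^2(t)}\, dt
\]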

Hence the line integral of first type is equal to the definite integral and familiar properties
of ordinary definite integrals are equally valid for line integrals.
Proposition 71.1. a) $\int_C k \cdot f\, ds = k \int_C f\, ds$, $k = \text{constant}$

b) $\int_C (f + g)\, ds = \int_C f\, ds + \int_C g\, ds$

c) $\int_C f\, ds = \int_{C_1} f\, ds + \int_{C_2} f\, ds$, where the path $C$ is subdivided into two disjoint arcs $C_1$ and $C_2$

Figure 71.2:

Remark 71.1. If $C$ is an elementary closed curve, the line integral of first type is defined similarly.
For a line integral over a closed path $C$, the symbol $\oint_C$ (instead of $\int_C$) is sometimes used in the literature.

Example 71.1. Evaluate the following line integrals:

1) $\int_C xy\, ds$, where $C$ is the segment which joins the points $A(0, 0)$ and $B(1, 1)$;

2) $\oint_C (x + y)\, ds$, where $C$ is the closed curve given by the parametric representation $x = \cos t$, $y = \sin t$, $t \in [0, 2\pi]$.
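The formula $\int_C f\, ds = \int_a^b f(\varphi(t))\, \|\dot\varphi(t)\|\, dt$ lends itself to a direct numerical check on the two integrals of Example 71.1. The sketch below is an illustration with hypothetical function names; it works with plane curves (two coordinates), supplies the derivative $\dot\varphi$ by hand, and uses the midpoint rule. The assumed parametrization of the segment is $\varphi(t) = (t, t)$, $t \in [0, 1]$.

```python
import math

def line_integral_first_type(f, phi, phi_dot, a, b, n=2000):
    """Approximate the integral of f ds over C = phi([a, b]) by the midpoint
    rule applied to f(phi(t)) * ||phi'(t)||."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        speed = math.sqrt(sum(c * c for c in phi_dot(t)))
        total += f(*phi(t)) * speed * h
    return total

# 1) f = x*y on the segment from (0, 0) to (1, 1): phi(t) = (t, t), t in [0, 1]
i1 = line_integral_first_type(lambda x, y: x * y,
                              lambda t: (t, t),
                              lambda t: (1.0, 1.0),
                              0.0, 1.0)

# 2) f = x + y on the unit circle: phi(t) = (cos t, sin t), t in [0, 2*pi]
i2 = line_integral_first_type(lambda x, y: x + y,
                              lambda t: (math.cos(t), math.sin(t)),
                              lambda t: (-math.sin(t), math.cos(t)),
                              0.0, 2.0 * math.pi)
```

By hand, the first integral is $\int_0^1 t^2 \sqrt{2}\, dt = \sqrt{2}/3$ and the second is $\int_0^{2\pi} (\cos t + \sin t)\, dt = 0$; `i1` and `i2` approximate these values.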

72 Line integrals of second type

In many applications the integrands appearing in the line integrals of first type are of the form:
\[
f \cdot \frac{dx_1}{ds} \quad \text{or} \quad g \cdot \frac{dx_2}{ds} \quad \text{or} \quad h \cdot \frac{dx_3}{ds}
\]
where $\frac{dx_1}{ds}$, $\frac{dx_2}{ds}$, $\frac{dx_3}{ds}$ are the derivatives of the functions occurring in the parametric representation of the path of integration.
The integrals $\int_C f \cdot \frac{dx_1}{ds}\, ds$, $\int_C g \cdot \frac{dx_2}{ds}\, ds$, $\int_C h \cdot \frac{dx_3}{ds}\, ds$ are called line integrals of second type.
Their values depend on the orientation of $C$; changing the orientation of $C$ multiplies the integrals by $-1$.

We simply denote these integrals by:
\[
\int_C f \cdot \frac{dx_1}{ds}\, ds = \int_C f\, dx_1, \qquad
\int_C g \cdot \frac{dx_2}{ds}\, ds = \int_C g\, dx_2, \qquad
\int_C h \cdot \frac{dx_3}{ds}\, ds = \int_C h\, dx_3
\]

In terms of the considered parametric representations, these line integrals of second type are equal to the following Riemann integrals:
\[
\int_C f\, dx_1 = \int_0^l f(x_1(s), x_2(s), x_3(s))\, \frac{dx_1}{ds}\, ds = \int_0^l f(x_1(s), x_2(s), x_3(s)) \cdot \cos\alpha(s)\, ds
\]
\[
\int_C g\, dx_2 = \int_0^l g(x_1(s), x_2(s), x_3(s))\, \frac{dx_2}{ds}\, ds = \int_0^l g(x_1(s), x_2(s), x_3(s)) \cdot \cos\beta(s)\, ds
\]
\[
\int_C h\, dx_3 = \int_0^l h(x_1(s), x_2(s), x_3(s))\, \frac{dx_3}{ds}\, ds = \int_0^l h(x_1(s), x_2(s), x_3(s)) \cdot \cos\gamma(s)\, ds
\]
where $\alpha(s)$, $\beta(s)$, $\gamma(s)$ are the angles between the tangent to the curve and the coordinate axes $Ox_1$, $Ox_2$, $Ox_3$, respectively.
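In terms of an arbitrary parametric representation $\varphi : [a, b] \to C$ which corresponds to the same orientation, we have $ds = \|\dot\varphi(t)\|\, dt$ with $\|\dot\varphi(t)\| = \sqrt{\dot\varphi_1^2(t) + \dot\varphi_2^2(t) + \dot\varphi_3^2(t)}$, and $\cos\alpha = \dot\varphi_1(t) / \|\dot\varphi(t)\|$ (similarly for $\beta$ and $\gamma$). Hence these line integrals of second type are equal to the following Riemann integrals:
\[
\int_C f\, dx_1 = \int_a^b f(\varphi(t)) \cdot \frac{\dot\varphi_1(t)}{\|\dot\varphi(t)\|} \cdot \|\dot\varphi(t)\|\, dt = \int_a^b f(\varphi(t))\, \dot\varphi_1(t)\, dt
\]
\[
\int_C g\, dx_2 = \int_a^b g(\varphi(t)) \cdot \frac{\dot\varphi_2(t)}{\|\dot\varphi(t)\|} \cdot \|\dot\varphi(t)\|\, dt = \int_a^b g(\varphi(t))\, \dot\varphi_2(t)\, dt
\]
\[
\int_C h\, dx_3 = \int_a^b h(\varphi(t)) \cdot \frac{\dot\varphi_3(t)}{\|\dot\varphi(t)\|} \cdot \|\dot\varphi(t)\|\, dt = \int_a^b h(\varphi(t))\, \dot\varphi_3(t)\, dt
\]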

All these integrals depend on the orientation of the curve C. If the orientation changes,
the value of the integral changes its sign.
For the sums of these types of integrals along the same path $C$ we adopt the simplified notation
\[
\int_C f\, dx_1 + g\, dx_2 + h\, dx_3
\]

which is equal to the Riemann integral
\[
\int_0^l \left[ f(x_1(s), x_2(s), x_3(s)) \cdot \cos\alpha(s) + g(x_1(s), x_2(s), x_3(s)) \cdot \cos\beta(s) + h(x_1(s), x_2(s), x_3(s)) \cdot \cos\gamma(s) \right] ds
\]

Example 72.1. Evaluate the line integral
\[
I = \int_C \left[ x_1^2 x_2\, dx_1 + (x_1 - x_3)\, dx_2 + x_1 x_2 x_3\, dx_3 \right]
\]
where $C$ is the arc of the parabola $x_2 = x_1^2$ in the plane $x_3 = 2$ from $A(0, 0, 2)$ to $B(1, 1, 2)$.

Example 72.2. Evaluate the above line integral where $C$ is the segment of the straight line $x_2 = x_1$, $x_3 = 2$ from $A(0, 0, 2)$ to $B(1, 1, 2)$.
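The two examples above can be explored numerically. The sketch below is an illustration, not the notes' solution: it uses the fact that $\cos\alpha\, ds = \dot\varphi_1(t)\, dt$ (and similarly for the other components), so that $\int_C f\, dx_1 + g\, dx_2 + h\, dx_3 = \int_a^b \left[ f \dot\varphi_1 + g \dot\varphi_2 + h \dot\varphi_3 \right] dt$ along $\varphi$. The function name and the parametrizations $\varphi(t) = (t, t^2, 2)$ and $\varphi(t) = (t, t, 2)$, $t \in [0, 1]$, are assumptions made for the example.

```python
def line_integral_second_type(F, phi, phi_dot, a, b, n=2000):
    """Approximate the integral of f dx1 + g dx2 + h dx3 along phi([a, b]),
    where F(x1, x2, x3) returns the triple (f, g, h)."""
    step = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * step
        comps = F(*phi(t))
        # dot product of (f, g, h) with the derivative phi'(t)
        total += sum(c * d for c, d in zip(comps, phi_dot(t))) * step
    return total

F = lambda x1, x2, x3: (x1 * x1 * x2, x1 - x3, x1 * x2 * x3)

# parabola x2 = x1^2, x3 = 2, from A(0, 0, 2) to B(1, 1, 2)
parabola = line_integral_second_type(
    F, lambda t: (t, t * t, 2.0), lambda t: (1.0, 2.0 * t, 0.0), 0.0, 1.0)

# straight segment x2 = x1, x3 = 2, same end points
segment = line_integral_second_type(
    F, lambda t: (t, t, 2.0), lambda t: (1.0, 1.0, 0.0), 0.0, 1.0)

# the same segment traversed from B to A
reversed_segment = line_integral_second_type(
    F, lambda t: (1.0 - t, 1.0 - t, 2.0), lambda t: (-1.0, -1.0, 0.0), 0.0, 1.0)
```

By hand, the parabola gives $\int_0^1 (t^4 + 2t^2 - 4t)\, dt = -17/15$ and the segment gives $\int_0^1 (t^3 + t - 2)\, dt = -5/4$, so the value depends on the path; reversing the orientation of the segment changes the sign.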

73 Transformation of double integrals into line integrals

Double integrals over a plane region may be transformed into line integrals over the boundary of the region and conversely. This transformation is of practical as well as theoretical interest and can be done by means of the following basic theorem.

Theorem 73.1 (Green's theorem in the plane). Let $R$ be a closed bounded region in the $x, y$-plane whose boundary $C$ consists of finitely many elementary closed curves. Let $f(x, y)$ and $g(x, y)$ be functions which are continuous and have continuous partial derivatives of first order everywhere in some domain containing $R$. Then the following equality holds:
\[
\iint_R \left( \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \right) dx\, dy = \oint_C f\, dx + g\, dy = \oint_C \left[ f \cos\alpha + g \cos\beta \right] ds
\]
the integration being taken along the entire boundary $C$ of $R$ such that $R$ is on the left as one moves on $C$.

Figure 73.1:

Proof. We first prove the theorem for a special region $R$ which can be represented in both of the forms:
\[
R = \{(x, y) \mid a \le x \le b,\ u(x) \le y \le v(x)\}
\]
and
\[
R = \{(x, y) \mid c \le y \le d,\ p(y) \le x \le q(y)\}
\]

Figure 73.2:

Figure 73.3:

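In this case we have
\[
\begin{aligned}
\iint_R \frac{\partial f}{\partial y}\, dx\, dy &= \int_a^b \left( \int_{u(x)}^{v(x)} \frac{\partial f}{\partial y}\, dy \right) dx = \int_a^b \left[ f(x, v(x)) - f(x, u(x)) \right] dx \\
&= -\int_a^b f(x, u(x))\, dx - \int_b^a f(x, v(x))\, dx = -\int_{C^*} f(x, y)\, dx - \int_{C^{**}} f(x, y)\, dx \\
&= -\int_C f(x, y)\, dx
\end{aligned}
\]
since $y = u(x)$ represents the oriented curve $C^*$ and $y = v(x)$ represents the oriented curve $C^{**}$.
If portions of $C$ are segments parallel to the $y$-axis such as $\tilde{C}$ and $\tilde{\tilde{C}}$,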

Figure 73.4:

then the result is the same as before, because the integrals over these portions are zero and may be added to the integrals over $C^*$ and $C^{**}$ to obtain the integral over the whole boundary $C$.
Similarly we obtain
\[
\iint_R \frac{\partial g}{\partial x}\, dx\, dy = \int_c^d \left( \int_{p(y)}^{q(y)} \frac{\partial g}{\partial x}\, dx \right) dy = \int_C g(x, y)\, dy
\]
Therefore
\[
\iint_R \left[ \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \right] dx\, dy = \int_C f\, dx + g\, dy
\]
and the theorem is proved for special regions.
We now prove the theorem for a region R which itself is not a special region but can be
subdivided into finitely many special regions. In this case we apply the theorem to each
subregion and then add the results; the left-hand members add up to the integral over R
while the right-hand members add up to the line integral over C plus integrals over the
curves introduced for subdividing R. Each of the latter integrals occurs twice, taken once
in each direction. Hence these two integrals cancel each other, and we are left with the
line integral over C.
Example 73.1. Using Green's theorem, evaluate the following integrals:

a) $\int_C y\, dx + 2x\, dy$, where $C$ is the boundary of the square $0 \le x \le 1$, $0 \le y \le 1$ (counterclockwise);

b) $\int_C y^3\, dx + (x^3 + 3y^2 x)\, dy$, where $C$ is the boundary of the region between $y = x^2$ and $y = x$, where $0 \le x \le 1$ (counterclockwise);

c) $\int_C 2xy\, dx + (e^x + x^2)\, dy$, where $C$ is the boundary of the triangle with vertices $(0, 0)$, $(1, 0)$, $(1, 1)$ (clockwise);

d) $\int_C -xy^2\, dx + x^2 y\, dy$, where $C$ is the boundary of the region in the first quadrant bounded by $y = 1 - x^2$ (counterclockwise).
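Green's theorem can be checked numerically on item a), where $f = y$, $g = 2x$ and $\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} = 2 - 1 = 1$. The sketch below is an illustration with a hypothetical function name; it traverses the four edges of the unit square counterclockwise with hand-written parametrizations and compares the line integral with the double integral, both by the midpoint rule.

```python
def green_check(n=400):
    """Return (line integral, double integral) for f = y, g = 2x on the unit
    square, the boundary being traversed counterclockwise."""
    f = lambda x, y: y
    g = lambda x, y: 2.0 * x
    h = 1.0 / n
    mids = [(k + 0.5) * h for k in range(n)]
    line = 0.0
    for t in mids:
        line += f(t, 0.0) * h          # bottom edge: (t, 0),     dx = dt,  dy = 0
        line += g(1.0, t) * h          # right edge:  (1, t),     dx = 0,   dy = dt
        line -= f(1.0 - t, 1.0) * h    # top edge:    (1 - t, 1), dx = -dt, dy = 0
        line -= g(0.0, 1.0 - t) * h    # left edge:   (0, 1 - t), dx = 0,   dy = -dt
    # dg/dx - df/dy = 2 - 1 = 1 at every point of the square
    double = sum((2.0 - 1.0) * h * h for _ in mids for _ in mids)
    return line, double
```

Both returned values approximate 1, the area of the square, in agreement with the theorem.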

Now we will present a second theorem of Green which concerns the transformation of a double integral of the Laplacian of a function into a line integral of its normal derivative.
Let $w(x, y)$ be a function which has continuous second order partial derivatives in a domain $D$ of the $x, y$-plane.

Definition 73.1. The Laplacian of $w$ is by definition the function $\Delta w : D \to \mathbb{R}^1$ defined by
\[
\Delta w = \frac{\partial^2 w}{\partial x^2} + \frac{\partial^2 w}{\partial y^2}
\]

Assume now that D contains a region R (R ⊂ D) of the type indicated in Green’s theorem.

Theorem 73.2. The following equality holds:
\[
\iint_R \Delta w\, dx\, dy = \oint_C \frac{\partial w}{\partial n}\, ds = \oint_C \nabla_n w\, ds = \oint_C \operatorname{grad} w \cdot n\, ds
\]
where $n$ is the outward unit normal vector to $C$.

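Proof. Consider $f = -\frac{\partial w}{\partial y}$ and $g = \frac{\partial w}{\partial x}$ and remark that $\Delta w = \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}$. Applying Green's theorem in the plane, we obtain
\[
\iint_R \Delta w\, dx\, dy = \oint_C -\frac{\partial w}{\partial y}\, dx + \frac{\partial w}{\partial x}\, dy = \oint_C \left( -\frac{\partial w}{\partial y} \cdot \frac{dx}{ds} + \frac{\partial w}{\partial x} \cdot \frac{dy}{ds} \right) ds = \oint_C \operatorname{grad} w \cdot n\, ds
\]
The integrand of the last integral may be written as the dot product of the vectors
\[
\operatorname{grad} w = \left( \frac{\partial w}{\partial x}, \frac{\partial w}{\partial y} \right) \quad \text{and} \quad n = \left( \frac{dy}{ds}, -\frac{dx}{ds} \right)
\]
that is
\[
n \cdot \operatorname{grad} w = \frac{\partial w}{\partial x} \cdot \frac{dy}{ds} - \frac{\partial w}{\partial y} \cdot \frac{dx}{ds}
\]
The vector $n$ is the outward unit normal vector to $C$, because the vector $\tau = \left( \frac{dx}{ds}, \frac{dy}{ds} \right)$ is the unit tangent vector to $C$ and $\tau \cdot n = 0$.
The dot product $n \cdot \operatorname{grad} w$ is the directional derivative $\frac{\partial w}{\partial n} = \nabla_n w$.
Therefore we have
\[
\iint_R \Delta w\, dx\, dy = \oint_C \frac{\partial w}{\partial n}\, ds = \oint_C \nabla_n w\, ds
\]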

Let v(x, y) be a vector valued function v(x, y) = (f(x, y), g(x, y)) which has continuous
first order partial derivatives in a domain D of the x, y-plane.

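Definition 73.2. The divergence of $v$ is by definition the real valued function $\operatorname{div} v : D \to \mathbb{R}^1$ defined by
\[
\operatorname{div} v = \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}
\]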

Theorem 73.3. If $D$ contains a region $R$ ($R \subset D$) of the type indicated in Green's theorem, then the following equality holds:
\[
\iint_R \operatorname{div} v\, dx\, dy = \oint_C v \cdot n\, ds
\]
where $n$ is the outward unit normal vector to $C$.

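Proof. Applying Theorem 73.1 (with $-g$ in place of $f$ and $f$ in place of $g$) we obtain
\[
\iint_R \operatorname{div} v\, dx\, dy = \iint_R \left( \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} \right) dx\, dy = \oint_C -g\, dx + f\, dy = \oint_C v \cdot n\, ds
\]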

Example 73.2. Verify this formula when $v = (x, y)$ and $C$ is the circle $x^2 + y^2 = 1$.

74 Elementary Surfaces

We shall consider surface integrals. These considerations will require knowledge of some
basic facts about surfaces, which we shall now explain and illustrate by simple examples.
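Definition 74.1. An elementary surface is a set of points $S \subset \mathbb{R}^3$ for which there exists a bounded, open and connected set $D \subset \mathbb{R}^2$ and a function $\varphi : D \to S$ having the following properties:

a) $\varphi$ is bijective;
b) $\varphi$ is of class $C^1$ and the vector
\[
\frac{\partial\varphi}{\partial u} \times \frac{\partial\varphi}{\partial v} = N_\varphi = \left( \frac{\partial\varphi_2}{\partial u} \cdot \frac{\partial\varphi_3}{\partial v} - \frac{\partial\varphi_3}{\partial u} \cdot \frac{\partial\varphi_2}{\partial v},\ \frac{\partial\varphi_3}{\partial u} \cdot \frac{\partial\varphi_1}{\partial v} - \frac{\partial\varphi_1}{\partial u} \cdot \frac{\partial\varphi_3}{\partial v},\ \frac{\partial\varphi_1}{\partial u} \cdot \frac{\partial\varphi_2}{\partial v} - \frac{\partial\varphi_2}{\partial u} \cdot \frac{\partial\varphi_1}{\partial v} \right)
\]
is different from $0$ for any $(u, v) \in D$.

The function $\varphi$ is called a parametric representation of $S$. The vectors $\frac{\partial\varphi}{\partial u}$, $\frac{\partial\varphi}{\partial v}$ are tangent to the surface $S$ at the point $\varphi(u, v)$.
The vector $N_\varphi$ is called the normal vector to the surface $S$ at the point $\varphi(u, v)$ and the vector $n_\varphi = \frac{N_\varphi}{\|N_\varphi\|}$ is the unit normal vector to the surface $S$ at the point $\varphi(u, v)$.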
Example 74.1. The set $S = \{(x_1, x_2, 1) \in \mathbb{R}^3 \mid 0 < x_1 < 1,\ 0 < x_2 < 1\}$ is an elementary surface. A bounded, open and connected set $D \subset \mathbb{R}^2$ and a function $\varphi : D \to S$ having the properties a) and b) are:
\[
D = \{(u, v) \in \mathbb{R}^2 \mid 0 < u < 1,\ 0 < v < 1\} \quad \text{and} \quad \varphi(u, v) = (u, v, 1)
\]
A normal vector to $S$ at the point $\varphi(u, v)$ is $N_\varphi = (0, 0, 1)$ which is actually a unit normal vector. The vectors $\frac{\partial\varphi}{\partial u} = (1, 0, 0)$ and $\frac{\partial\varphi}{\partial v} = (0, 1, 0)$ are tangent to $S$ at the point $\varphi(u, v)$.

Example 74.2. The set $S = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 = 1,\ x_1 > 0,\ x_2 > 0,\ 0 < x_3 < 1\}$ is an elementary surface. A bounded, open and connected set $D \subset \mathbb{R}^2$ and a function $\varphi : D \to S$ having the properties a) and b) are:
\[
D = \{(u, v) \in \mathbb{R}^2 \mid 0 < u < \tfrac{\pi}{2},\ 0 < v < 1\} \quad \text{and} \quad \varphi(u, v) = (\cos u, \sin u, v)
\]
A normal vector to $S$ at the point $\varphi(u, v)$ is $N_\varphi = (\cos u, \sin u, 0)$ which is actually a unit normal vector. The vectors $\frac{\partial\varphi}{\partial u} = (-\sin u, \cos u, 0)$ and $\frac{\partial\varphi}{\partial v} = (0, 0, 1)$ are tangent to $S$ at the point $\varphi(u, v)$.

Example 74.3. The set $S = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 + x_3^2 = 1,\ x_1 > 0,\ x_2 > 0,\ x_3 > 0\}$ is an elementary surface. A bounded, open and connected set $D \subset \mathbb{R}^2$ and a function $\varphi : D \to S$ having the properties a) and b) are:
\[
D = \{(u, v) \in \mathbb{R}^2 \mid 0 < u < \tfrac{\pi}{2},\ 0 < v < \tfrac{\pi}{2}\} \quad \text{and} \quad \varphi(u, v) = (\cos u \cdot \sin v, \sin u \cdot \sin v, \cos v)
\]
The vector $N_\varphi = -\sin v \cdot (\cos u \cdot \sin v, \sin u \cdot \sin v, \cos v)$ is a normal vector to $S$ and $n_\varphi = -(\cos u \cdot \sin v, \sin u \cdot \sin v, \cos v)$ is a unit normal vector to $S$. The vectors $\frac{\partial\varphi}{\partial u} = (-\sin u \sin v, \cos u \sin v, 0)$ and $\frac{\partial\varphi}{\partial v} = (\cos u \cos v, \sin u \cos v, -\sin v)$ are tangent to $S$ at the point $\varphi(u, v)$.
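The normal vector of Example 74.3 can be cross-checked numerically. The sketch below is an illustration with hypothetical helper names: it approximates $\frac{\partial\varphi}{\partial u}$ and $\frac{\partial\varphi}{\partial v}$ by central differences (an assumption of this sketch, the notes differentiate exactly) and forms their cross product, which can then be compared with $-\sin v \cdot \varphi(u, v)$.

```python
import math

def cross(a, b):
    """Cross product of two vectors of R^3 given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normal(phi, u, v, eps=1e-6):
    """N_phi = d(phi)/du x d(phi)/dv, partial derivatives by central differences."""
    du = tuple((p - m) / (2.0 * eps)
               for p, m in zip(phi(u + eps, v), phi(u - eps, v)))
    dv = tuple((p - m) / (2.0 * eps)
               for p, m in zip(phi(u, v + eps), phi(u, v - eps)))
    return cross(du, dv)

# parametric representation of the sphere patch from Example 74.3
sphere = lambda u, v: (math.cos(u) * math.sin(v),
                       math.sin(u) * math.sin(v),
                       math.cos(v))
```

For example, `normal(sphere, 0.7, 0.9)` agrees, up to the finite-difference error, with `-math.sin(0.9)` times `sphere(0.7, 0.9)`. Replacing the representation by `lambda u, v: sphere(-u, v)` and evaluating at `(-0.7, 0.9)` gives the opposite vector at the same point of the surface.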

Remark 74.1. An elementary surface possesses infinitely many parametric representations.

The direction of $N_\varphi$ is independent of the parametric representation, but the orientation of the normal vector $N_\varphi$ depends on the parametric representation of $S$. If instead of the parametric representation $x = \varphi(u, v)$, $(u, v) \in D$, we consider the parametric representation $x = \psi(u', v')$, where $\psi(u', v') = \varphi(-u', v')$, $(u', v') \in D'$, and $T : D \to D'$ is defined by $T(u, v) = (-u, v)$, then the orientation of the normal vector changes: $N_\varphi = -N_\psi$.
In general, if $x = \varphi(u, v)$, $(u, v) \in D$, and $x = \psi(u', v')$, $(u', v') \in D'$, are two parametric representations of $S$, and $T = \psi^{-1} \circ \varphi$, $T : D \to D'$, $T(u, v) = (u', v')$, then:

- if the determinant of the Jacobi matrix of $T$ is positive, then the orientation of the normal vector to $S$ does not change;

- if the determinant of the Jacobi matrix of $T$ is negative, then the orientation of the normal vector to $S$ changes.
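The sign rule can be illustrated on the cylinder patch of Example 74.2 with the change of parameters $T(u, v) = (-u, v)$ from the remark above. This is a minimal sketch: the normal is again assumed to be $\frac{\partial\varphi}{\partial u} \times \frac{\partial\varphi}{\partial v}$, derivatives are approximated by central differences, and the helper names are illustrative.

```python
import math

def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def phi(u, v):
    """Cylinder patch of Example 74.2."""
    return (math.cos(u), math.sin(u), v)

def psi(u1, v1):
    """Second representation psi(u', v') = phi(-u', v')."""
    return phi(-u1, v1)

def normal(f, u, v, h=1e-6):
    """Assumed definition N = df/du x df/dv, with central differences."""
    fu = tuple((p - m) / (2 * h) for p, m in zip(f(u + h, v), f(u - h, v)))
    fv = tuple((p - m) / (2 * h) for p, m in zip(f(u, v + h), f(u, v - h)))
    return cross(fu, fv)

u0, v0 = 0.5, 0.3
N_phi = normal(phi, u0, v0)
N_psi = normal(psi, -u0, v0)   # T(u0, v0) = (-u0, v0) gives the same point of S

# Jacobi matrix of T(u, v) = (-u, v) is diag(-1, 1); its determinant is -1
jacobian_det_T = (-1.0) * 1.0 - 0.0 * 0.0
```

Here `jacobian_det_T` is negative and `N_psi` is the opposite of `N_phi`, as the second case of the rule predicts.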

Remark 74.2. If an elementary surface $S$ is given by the parametric representation $\varphi : D \subset \mathbb{R}^2 \to S$, $x = \varphi(u, v)$, and $C$ is an elementary curve on $S$ ($C \subset S$), then $C' = \varphi^{-1}(C)$ is an elementary curve in $D$. If $u = g(t)$ and $v = h(t)$, $t \in [a, b]$, is a parametric representation of $C'$, then a parametric representation of $C$ is obtained as $x = \varphi(g(t), h(t))$, $t \in [a, b]$.

Example 74.4. Consider the elementary surface
$$S = \{(x_1, x_2, 1) \in \mathbb{R}^3 \mid 0 < x_1 < 1,\ 0 < x_2 < 1\}$$
with the parametric representation $\varphi(u, v) = (u, v, 1)$, $(u, v) \in (0, 1) \times (0, 1)$, and the elementary curve $C$ on $S$ ($C \subset S$) defined by
$$C = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1 = x_2,\ x_3 = 1,\ x_1 \in [\tfrac{1}{3}, \tfrac{2}{3}]\}$$
The curve $C' = \varphi^{-1}(C)$, $C' \subset D$, has the parametric representation $u = t$, $v = t$, $t \in [\tfrac{1}{3}, \tfrac{2}{3}]$. The parametric representation $x = \varphi(g(t), h(t))$ of $C$ in this case is given by $x(t) = (t, t, 1)$.
Example 74.5. Consider the elementary surface
$$S = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 = 1,\ x_1 > 0,\ x_2 > 0,\ 0 < x_3 < 1\}$$
with the parametric representation $\varphi(u, v) = (\cos u, \sin u, v)$, $(u, v) \in D = \{(u, v) \in \mathbb{R}^2 \mid 0 < u < \tfrac{\pi}{2},\ 0 < v < 1\}$, and the elementary curve $C$ on $S$ defined by
$$C = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 = 1,\ x_3 = \tfrac{1}{2},\ \tfrac{1}{2} \le x_1 \le \tfrac{\sqrt{3}}{2},\ \tfrac{1}{2} \le x_2 \le \tfrac{\sqrt{3}}{2}\}$$
The curve $C' = \varphi^{-1}(C)$, $C' \subset D$, has the parametric representation $u = t$, $v = \tfrac{1}{2}$, $t \in [\tfrac{\pi}{6}, \tfrac{\pi}{3}]$. The parametric representation $x = \varphi(g(t), h(t))$ of the curve $C$ in this case is $x(t) = (\cos t, \sin t, \tfrac{1}{2})$.
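Remark 74.3. Using the parametric representation $x = \varphi(u(t), v(t))$ of the elementary curve $C \subset S$, we obtain that the tangent vector to $C$ at $\varphi(u(t), v(t))$ is given by:
$$\frac{dx}{dt} = \left(\frac{\partial\varphi_1}{\partial u}\frac{du}{dt} + \frac{\partial\varphi_1}{\partial v}\frac{dv}{dt},\ \frac{\partial\varphi_2}{\partial u}\frac{du}{dt} + \frac{\partial\varphi_2}{\partial v}\frac{dv}{dt},\ \frac{\partial\varphi_3}{\partial u}\frac{du}{dt} + \frac{\partial\varphi_3}{\partial v}\frac{dv}{dt}\right)$$
The vectors:
$$\frac{\partial\varphi}{\partial u} = \left(\frac{\partial\varphi_1}{\partial u}, \frac{\partial\varphi_2}{\partial u}, \frac{\partial\varphi_3}{\partial u}\right) \quad\text{and}\quad \frac{\partial\varphi}{\partial v} = \left(\frac{\partial\varphi_1}{\partial v}, \frac{\partial\varphi_2}{\partial v}, \frac{\partial\varphi_3}{\partial v}\right)$$
are tangent vectors to $S$ at $\varphi(u, v)$ and $\dfrac{dx}{dt} = \dfrac{\partial\varphi}{\partial u}\dfrac{du}{dt} + \dfrac{\partial\varphi}{\partial v}\dfrac{dv}{dt}$.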

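If the elementary surface $S$ is given by the parametric representation $x = \varphi(u, v)$, $(u, v) \in D$, and the elementary curve $C$ on $S$ ($C \subset S$) is given by the parametric representation $u = u(t)$, $v = v(t)$, $t \in [a, b]$, then the length $l$ of the curve $C$ is given by:
$$l = \int_a^b \sqrt{E\,\dot{u}^2 + 2F\,\dot{u}\,\dot{v} + G\,\dot{v}^2}\ dt$$
where:
$$E = \left\|\frac{\partial\varphi}{\partial u}(u(t), v(t))\right\|^2, \qquad F = \frac{\partial\varphi}{\partial u}(u(t), v(t)) \cdot \frac{\partial\varphi}{\partial v}(u(t), v(t)), \qquad G = \left\|\frac{\partial\varphi}{\partial v}(u(t), v(t))\right\|^2,$$
$$\dot{u} = \frac{du}{dt}, \qquad \dot{v} = \frac{dv}{dt}.$$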
The expression $E\,\dot{u}^2 + 2F\,\dot{u}\,\dot{v} + G\,\dot{v}^2$ is called the first fundamental form of $S$. It is of basic importance because it enables us to measure lengths, angles between curves, and areas on the corresponding surface. In fact, we have already seen how to compute the length of a curve. Now we consider the measurement of the angle between the curves

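$$C_1 : x = \varphi(g(t), h(t)) \qquad C_2 : x = \varphi(p(t), q(t))$$

Let $P \in S$, $P = \varphi(g(t_0), h(t_0)) = \varphi(p(t_0), q(t_0))$, be a point of intersection of the two curves.
The tangent vector at the point $P$ to the curve $C_1$ is $T_1$ and the tangent vector at the point $P$ to the curve $C_2$ is $T_2$:
$$T_1 = \left(\frac{\partial\varphi_1}{\partial u}\dot{g} + \frac{\partial\varphi_1}{\partial v}\dot{h},\ \frac{\partial\varphi_2}{\partial u}\dot{g} + \frac{\partial\varphi_2}{\partial v}\dot{h},\ \frac{\partial\varphi_3}{\partial u}\dot{g} + \frac{\partial\varphi_3}{\partial v}\dot{h}\right)$$
$$T_2 = \left(\frac{\partial\varphi_1}{\partial u}\dot{p} + \frac{\partial\varphi_1}{\partial v}\dot{q},\ \frac{\partial\varphi_2}{\partial u}\dot{p} + \frac{\partial\varphi_2}{\partial v}\dot{q},\ \frac{\partial\varphi_3}{\partial u}\dot{p} + \frac{\partial\varphi_3}{\partial v}\dot{q}\right)$$
The angle at the point of intersection $P$ between $C_1$ and $C_2$ is defined as the angle $\gamma$ between $T_1$ and $T_2$ at $P$, and
$$\cos\gamma = \frac{T_1 \cdot T_2}{\|T_1\| \cdot \|T_2\|}$$
Since:
$$T_1 \cdot T_2 = E\,\dot{g}\,\dot{p} + F\,(\dot{g}\,\dot{q} + \dot{h}\,\dot{p}) + G\,\dot{h}\,\dot{q}$$
$$\|T_1\| = \sqrt{E\,\dot{g}^2 + 2F\,\dot{g}\,\dot{h} + G\,\dot{h}^2}$$
$$\|T_2\| = \sqrt{E\,\dot{p}^2 + 2F\,\dot{p}\,\dot{q} + G\,\dot{q}^2}$$
we obtain that the angle between two intersecting curves on a surface can be expressed in terms of $E$, $F$, $G$ and the derivatives of the functions representing the curves, evaluated at the point of intersection.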
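The angle formula translates directly into a short computation. The sketch below is illustrative: the function name `angle_between_curves` and the two sample configurations are assumptions, and the values of $E$, $F$, $G$ are entered by hand from Examples 74.1 and 74.3 (for the sphere octant, $E = \sin^2 v$, $F = 0$, $G = 1$).

```python
import math

def angle_between_curves(E, F, G, dg, dh, dp, dq):
    """Angle gamma between two curves on a surface at a common point.

    (dg, dh) and (dp, dq) are the derivatives of the parameter functions of
    C1 and C2 at the intersection; E, F, G are evaluated at the same point.
    """
    inner = E * dg * dp + F * (dg * dq + dh * dp) + G * dh * dq
    norm1 = math.sqrt(E * dg ** 2 + 2 * F * dg * dh + G * dh ** 2)
    norm2 = math.sqrt(E * dp ** 2 + 2 * F * dp * dq + G * dq ** 2)
    return math.acos(inner / (norm1 * norm2))

# Flat patch of Example 74.1 (E = G = 1, F = 0):
# C1: u = t, v = t and C2: u = t, v = 1/2 meet at t0 = 1/2.
gamma_plane = angle_between_curves(1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0)

# Sphere octant of Example 74.3 at v = pi/6, same parameter directions.
v0 = math.pi / 6
gamma_sphere = angle_between_curves(math.sin(v0) ** 2, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0)
```

On the flat patch the result is $\pi/4$, as expected for the diagonal and a horizontal line. On the sphere the same parameter directions give $\cos\gamma = 1/\sqrt{5}$, which shows that the angle depends on $E$, $F$, $G$ and not only on the curves in the parameter domain.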
We now show how to compute areas on a surface $S$.

The area $A'$ of a part $S'$ of the elementary surface $S$, represented by $x = \varphi(u, v)$, $(u, v) \in D$, is given by:
$$A' = \iint_{D'} \sqrt{EG - F^2}\ du\, dv$$
where $D'$ is the part of $D$ corresponding to $S'$.


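This formula can be made plausible by noting that:
$$\Delta A = \sqrt{EG - F^2} \cdot \Delta u \cdot \Delta v$$
is the area of a small parallelogram whose sides are the vectors
$$\frac{\partial x}{\partial u} \cdot \Delta u \quad\text{and}\quad \frac{\partial x}{\partial v} \cdot \Delta v$$
From the definition of the vector product, and from the identity $\|a \times b\|^2 = \|a\|^2\,\|b\|^2 - (a \cdot b)^2$, it follows that:
$$\Delta A = \left\|\frac{\partial x}{\partial u}\,\Delta u \times \frac{\partial x}{\partial v}\,\Delta v\right\| = \left\|\frac{\partial x}{\partial u} \times \frac{\partial x}{\partial v}\right\| \cdot \Delta u \cdot \Delta v = \sqrt{EG - F^2} \cdot \Delta u \cdot \Delta v$$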
The integral is obtained by subdividing $S'$ into parts $S_1, S_2, \dots, S_n$, approximating the area of each part $S_k$ by the area of a parallelogram in the tangent plane to $S$ at a point of $S_k$, and forming the sum of all the approximating areas. This is done for each $n = 1, 2, \dots$ so that the dimensions of the largest $S_k$ approach zero as $n \to \infty$. The limit of these sums is the integral.
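As a check of the area formula, the sketch below approximates the area of the sphere octant of Example 74.3, which should be one eighth of the area of the unit sphere, that is $\pi/2$. The midpoint rule on a uniform grid and the central-difference derivatives are implementation choices made here for illustration.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def phi(u, v):
    """Sphere octant of Example 74.3."""
    return (math.cos(u) * math.sin(v), math.sin(u) * math.sin(v), math.cos(v))

def efg(f, u, v, h=1e-6):
    """Coefficients E, F, G at (u, v), using central differences."""
    fu = [(p - m) / (2 * h) for p, m in zip(f(u + h, v), f(u - h, v))]
    fv = [(p - m) / (2 * h) for p, m in zip(f(u, v + h), f(u, v - h))]
    return dot(fu, fu), dot(fu, fv), dot(fv, fv)

def area(f, u_lo, u_hi, v_lo, v_hi, n=200):
    """Midpoint rule for the double integral of sqrt(EG - F^2) over a rectangle."""
    du, dv = (u_hi - u_lo) / n, (v_hi - v_lo) / n
    total = 0.0
    for i in range(n):
        u = u_lo + (i + 0.5) * du
        for j in range(n):
            v = v_lo + (j + 0.5) * dv
            E, F, G = efg(f, u, v)
            # max(..., 0.0) guards against tiny negative values from rounding
            total += math.sqrt(max(E * G - F * F, 0.0)) * du * dv
    return total

octant_area = area(phi, 0.0, math.pi / 2, 0.0, math.pi / 2)
```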

In various applications, surface integrals occur for which the concept of orientation of a surface is essential.
In the case of an elementary surface, there are two possible orientations of the unit normal vector $n$ (see Fig. 74.1), and we can associate with each of them one orientation of the elementary surface $S$ (as for elementary curves, where there are two ways to traverse the curve). The set of representations of the elementary surface $S$ is decomposed (according to these orientations) into two disjoint classes. For all representations belonging to one of these classes we have one of the two orientations of $n$ (of $S$), and for all representations from the other class we have the opposite orientation of $n$ (of $S$).

Figure 74.1:

If a smooth surface $S$ is orientable, then we may orient $S$ by choosing one of the two possible directions of the unit normal vector $n$.
If the boundary of the elementary surface $S$ is a simple closed curve $C$, then we may associate with each of the two possible orientations of $S$ an orientation of $C$, as shown in Fig. 74.2.

Figure 74.2:

The rule is: looking at the curve $C$ from the tip of the unit normal vector $n$, the sense on the curve is always counterclockwise.
Using this idea we may extend the concept of orientation to surfaces which can be decomposed into elementary surfaces, as follows: A surface $S$ which can be decomposed into elementary surfaces is said to be orientable if we can orient each elementary piece of $S$ in such a manner that along each curve $C^*$ which is a common boundary of two pieces $S_1$ and $S_2$ the positive direction of $C^*$ related to $S_1$ is opposite to the positive direction of $C^*$ related to $S_2$.

Figure 74.3:

However, this may not hold in the large: there are non-orientable surfaces. An example of such a surface is the Möbius strip. A model of a Möbius strip can be made by taking a long rectangular piece of paper and sticking the shorter sides together so that the two points $A$ and the two points $B$ coincide (see Fig. 74.4).

Figure 74.4: Möbius strip

Definition 74.2. A surface $S$ is orientable if a chosen normal orientation given at an arbitrary point $P_0 \in S$ can be continued in a unique and continuous way to the entire surface $S$.

Hence, the surface $S$ is orientable if there does not exist a closed curve $C \subset S$ passing through $P_0$ such that the chosen normal orientation reverses by moving continuously along the curve $C$ from $P_0$ and back to $P_0$.

75 Surface integrals of first type

Surface integrals occur in many applications, for example in connection with the center of gravity of a curved lamina or the potential due to charges distributed on surfaces.
Let $S$ be an elementary surface of finite area and let $f$ be a real-valued function which is defined and continuous on $S$. We subdivide $S$ into $n$ parts $S_1, S_2, \dots, S_n$ of areas $A_1, A_2, \dots, A_n$. In each part $S_k$ we choose an arbitrary point $P_k$ and form the sum:
$$I_n = \sum_{k=1}^{n} f(P_k) \cdot A_k$$
We do this for each $n = 1, 2, \dots$ in an arbitrary manner, but so that the largest part $S_k$ shrinks to a point as $n$ approaches infinity. The sequence $I_1, I_2, \dots, I_n, \dots$ has a limit which is independent of the choice of subdivisions and points $P_k$. This limit is called the surface integral of first type of $f$ over $S$ and is denoted by
$$\iint_S f\, dS$$

To evaluate the surface integral, we may reduce it to a double integral as follows: If $S$ is represented in parametric form by a vector function $x = x(u, v)$, then
$$\iint_S f\, dS = \iint_R f(x(u, v)) \cdot \sqrt{EG - F^2}\ du\, dv$$
where $R$ is the region corresponding to $S$ in the $u, v$ plane.


The value of the surface integral of first type does not depend on the parametric
representation of the surface.
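If the elementary surface $S$ is represented in the form $x_3 = g(x_1, x_2)$, then
$$\iint_S f\, dS = \iint_R f(x_1, x_2, g(x_1, x_2)) \cdot \sqrt{1 + \left(\frac{\partial g}{\partial x_1}\right)^2 + \left(\frac{\partial g}{\partial x_2}\right)^2}\ dx_1\, dx_2$$
where $R$ is the orthogonal projection of $S$ onto the $x_1, x_2$ plane.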
Example 75.1. Evaluate $\iint_S \mu D^2\, dS$, where $S$ is defined by $x_1^2 + x_2^2 + x_3^2 = a^2$ and $D^2 = x_1^2 + x_2^2$.
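A numerical sketch for Example 75.1, under the assumption that $\mu$ is a constant (taken as $\mu = 1$ below; the text does not specify it) and with the illustrative radius $a = 2$. It uses the parametric representation $x = a(\cos u \sin v, \sin u \sin v, \cos v)$, $0 < u < 2\pi$, $0 < v < \pi$, which covers the sphere up to a set of zero area and for which $\sqrt{EG - F^2} = a^2 \sin v$. For constant $\mu$ the integral evaluates to $\frac{8\pi}{3}\mu a^4$, since $D^2 = a^2 \sin^2 v$ and $\int_0^\pi \sin^3 v\, dv = \frac{4}{3}$.

```python
import math

def sphere_integral(f, a, n=200):
    """Midpoint rule for the first-type surface integral of f over the
    sphere of radius a, using sqrt(EG - F^2) = a^2 sin(v)."""
    du, dv = 2 * math.pi / n, math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            x1 = a * math.cos(u) * math.sin(v)
            x2 = a * math.sin(u) * math.sin(v)
            x3 = a * math.cos(v)
            total += f(x1, x2, x3) * a * a * math.sin(v) * du * dv
    return total

a = 2.0     # illustrative radius
mu = 1.0    # assumed constant density (not specified in the example)
value = sphere_integral(lambda x1, x2, x3: mu * (x1 * x1 + x2 * x2), a)
closed_form = 8 * math.pi * mu * a ** 4 / 3
```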

76 Surface integrals of second type

Let $S$ be an elementary surface. We orient $S$ by choosing a unit normal vector $n$. Denoting the angles between $n$ and the positive $x_1$, $x_2$ and $x_3$ axes by $\alpha_1$, $\alpha_2$ and $\alpha_3$ respectively, we have
$$n = (\cos\alpha_1, \cos\alpha_2, \cos\alpha_3)$$
Let $u_1$, $u_2$ and $u_3$ be given functions which are defined and continuous at every point of $S$. The surface integrals to be considered are usually written in the form:
$$\iint_S u_1\, dx_2\, dx_3, \qquad \iint_S u_2\, dx_3\, dx_1, \qquad \iint_S u_3\, dx_1\, dx_2$$
and by definition this means
$$\iint_S u_1\, dx_2\, dx_3 = \iint_S u_1 \cos\alpha_1\, dS$$
$$\iint_S u_2\, dx_3\, dx_1 = \iint_S u_2 \cos\alpha_2\, dS$$
$$\iint_S u_3\, dx_1\, dx_2 = \iint_S u_3 \cos\alpha_3\, dS$$
It is clear that the value of such an integral depends on the choice of $n$, that is, on the orientation of $S$. The transition to the opposite orientation corresponds to the multiplication of the integral by $-1$, because then the components $\cos\alpha_1$, $\cos\alpha_2$, $\cos\alpha_3$ of $n$ are multiplied by $-1$.
The sum of the above three integrals may be written in a simple form by using vector notation.
In fact, we introduce the vector function
$$u(x_1, x_2, x_3) = (u_1(x_1, x_2, x_3), u_2(x_1, x_2, x_3), u_3(x_1, x_2, x_3))$$
and we obtain
$$\iint_S u_1\, dx_2\, dx_3 + u_2\, dx_3\, dx_1 + u_3\, dx_1\, dx_2 = \iint_S (u_1 \cos\alpha_1 + u_2 \cos\alpha_2 + u_3 \cos\alpha_3)\, dS = \iint_S u \cdot n\, dS$$

To evaluate the above integrals we may reduce them to double integrals over a plane region.
If $S$ can be represented as
$$x_3 = h(x_1, x_2)$$
and is oriented such that $n$ points upward (so that $\alpha_3$ is acute), then
$$\iint_S u_3\, dx_1\, dx_2 = \iint_R u_3(x_1, x_2, h(x_1, x_2))\, dx_1\, dx_2$$
where $R$ is the orthogonal projection of $S$ onto the $x_1, x_2$ plane.
If $n$ points downward (so that $\alpha_3$ is obtuse), then we have
$$\iint_S u_3\, dx_1\, dx_2 = -\iint_R u_3(x_1, x_2, h(x_1, x_2))\, dx_1\, dx_2$$

For the other integrals the situation is quite similar.


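If $S$ is represented in parametric form
$$x = x(u, v)$$
then the unit normal vector is
$$\text{(a)}\quad n = +\frac{\frac{\partial x}{\partial u} \times \frac{\partial x}{\partial v}}{\left\|\frac{\partial x}{\partial u} \times \frac{\partial x}{\partial v}\right\|} \qquad\text{or}\qquad \text{(b)}\quad n = -\frac{\frac{\partial x}{\partial u} \times \frac{\partial x}{\partial v}}{\left\|\frac{\partial x}{\partial u} \times \frac{\partial x}{\partial v}\right\|}$$
and
$$\iint_S u_3\, dx_1\, dx_2 = \pm \iint_R u_3(x(u, v)) \cdot \frac{D(x_1, x_2)}{D(u, v)}\ du\, dv$$
with $+$ if $n$ is as in (a) and $-$ if $n$ is as in (b). Here $R$ is the region corresponding to $S$ in the $u, v$ plane.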
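The sign convention can be illustrated on the upper unit hemisphere with $u_3 = x_3$, a hypothetical example chosen here. With $x(u, v) = (\cos u \sin v, \sin u \sin v, \cos v)$, $0 < u < 2\pi$, $0 < v < \pi/2$, a hand computation gives $\frac{D(x_1, x_2)}{D(u, v)} = -\sin v \cos v$, and $\frac{\partial x}{\partial u} \times \frac{\partial x}{\partial v}$ points toward the origin, so the upward normal corresponds to case (b). The explicit formula with $x_3 = \sqrt{1 - x_1^2 - x_2^2}$ gives $\frac{2\pi}{3}$ for the upward orientation.

```python
import math

def x(u, v):
    """Upper unit hemisphere, u in (0, 2*pi), v in (0, pi/2)."""
    return (math.cos(u) * math.sin(v), math.sin(u) * math.sin(v), math.cos(v))

def jacobian_x1_x2(u, v):
    """D(x1, x2)/D(u, v) for the parametrization above, computed by hand."""
    return -math.sin(v) * math.cos(v)

def second_type_integral(u3, sign, n=200):
    """sign * (midpoint-rule integral of u3(x(u, v)) * D(x1, x2)/D(u, v) du dv)."""
    du, dv = 2 * math.pi / n, (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            total += u3(*x(u, v)) * jacobian_x1_x2(u, v) * du * dv
    return sign * total

# Case (b): n = -(dx/du x dx/dv)/||...|| points upward on this hemisphere
upward = second_type_integral(lambda x1, x2, x3: x3, sign=-1)
# Case (a): n = +(dx/du x dx/dv)/||...|| points downward
downward = second_type_integral(lambda x1, x2, x3: x3, sign=+1)
```

The two values differ only in sign, in agreement with the remark that reversing the orientation multiplies the integral by $-1$.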

77 Properties of surface integrals

Let $A$ be a closed bounded region in space whose boundary $S$ is a union of elementary surfaces and is orientable.
Theorem 77.1 (divergence theorem of Gauss). If the vector function $u(x_1, x_2, x_3)$ has continuous first order partial derivatives in some domain containing $A$, then
$$\iiint_A \operatorname{div} u\ dV = \iint_S u \cdot n\ dS$$
where $n$ is the outer unit normal vector of $S$.

Proof. The proof is technical and will be skipped.
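Although the proof is skipped, the statement is easy to test numerically. The sketch below takes $A$ to be the unit cube $[0, 1]^3$ and the hypothetical field $u = (x_1 x_2,\ x_2^2,\ x_3 x_1)$, both chosen here for illustration. Then $\operatorname{div} u = x_1 + 3x_2$, and both sides of the formula equal $2$.

```python
def u(x1, x2, x3):
    """Hypothetical vector field used for the check."""
    return (x1 * x2, x2 * x2, x3 * x1)

def div_u(x1, x2, x3, h=1e-5):
    """Divergence of u by central differences."""
    return ((u(x1 + h, x2, x3)[0] - u(x1 - h, x2, x3)[0])
            + (u(x1, x2 + h, x3)[1] - u(x1, x2 - h, x3)[1])
            + (u(x1, x2, x3 + h)[2] - u(x1, x2, x3 - h)[2])) / (2 * h)

def volume_integral(n=20):
    """Midpoint rule for the triple integral of div u over the unit cube."""
    s = 1.0 / n
    mids = [(k + 0.5) * s for k in range(n)]
    return sum(div_u(a, b, c) for a in mids for b in mids for c in mids) * s ** 3

def outward_flux(n=20):
    """Midpoint rule for the integral of u . n over the six faces of the cube."""
    s = 1.0 / n
    mids = [(k + 0.5) * s for k in range(n)]
    total = 0.0
    for a in mids:
        for b in mids:
            total += u(1.0, a, b)[0] - u(0.0, a, b)[0]   # faces x1 = 1 and x1 = 0
            total += u(a, 1.0, b)[1] - u(a, 0.0, b)[1]   # faces x2 = 1 and x2 = 0
            total += u(a, b, 1.0)[2] - u(a, b, 0.0)[2]   # faces x3 = 1 and x3 = 0
    return total * s * s

lhs = volume_integral()
rhs = outward_flux()
```

On each face the outer normal is $\pm e_i$, so $u \cdot n$ reduces to plus or minus one component of $u$; this is why the flux is accumulated as a difference of components on opposite faces.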


Corollary 77.1. If $u = \operatorname{grad} f$, then
$$\iiint_A \Delta f\ dV = \iint_S \frac{\partial f}{\partial n}\ dS$$
and for $\Delta f = 0$ we have
$$\iint_S \frac{\partial f}{\partial n}\ dS = 0$$

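Corollary 77.2. If $u = f \cdot \operatorname{grad} g$, then
$$\operatorname{div} u = f \cdot \Delta g + \operatorname{grad} f \cdot \operatorname{grad} g$$
$$u \cdot n = f \cdot (n \cdot \operatorname{grad} g) = f \cdot \frac{\partial g}{\partial n}$$
and
$$\iiint_A (f \cdot \Delta g + \operatorname{grad} f \cdot \operatorname{grad} g)\ dV = \iint_S f \cdot \frac{\partial g}{\partial n}\ dS$$
This equality is the first Green's formula.
By interchanging $f$ and $g$ and subtracting the resulting equality from the first one, we obtain
$$\iiint_A (f \cdot \Delta g - g \cdot \Delta f)\ dV = \iint_S \left(f \cdot \frac{\partial g}{\partial n} - g \cdot \frac{\partial f}{\partial n}\right) dS$$
This equality is the second Green's formula.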

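Let $S$ be an elementary surface in space and let $C$ be the boundary of $S$, an elementary closed curve.
Theorem 77.2 (Stokes's theorem). If the vector valued function $v$ has continuous first order partial derivatives in a domain in space which contains $S$, then
$$\iint_S (\operatorname{curl} v) \cdot n\ dS = \int_C v \cdot t\ ds$$
Here $\operatorname{curl} v = \left(\dfrac{\partial v_3}{\partial x_2} - \dfrac{\partial v_2}{\partial x_3},\ \dfrac{\partial v_1}{\partial x_3} - \dfrac{\partial v_3}{\partial x_1},\ \dfrac{\partial v_2}{\partial x_1} - \dfrac{\partial v_1}{\partial x_2}\right)$, $n$ is the unit normal vector of $S$ and $t$ is the unit tangent vector of $C$.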
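A numerical sketch of Stokes's theorem on the upper unit hemisphere with outward (upward) normal, whose boundary is the unit circle in the plane $x_3 = 0$, traversed counterclockwise when seen from above. The field $v = (x_3 - x_2,\ x_1,\ x_1 x_2)$ is a hypothetical choice; applying the curl formula above by hand gives $\operatorname{curl} v = (x_1,\ 1 - x_2,\ 2)$. Both sides should equal $2\pi$.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def field(x1, x2, x3):
    """Hypothetical vector field v = (x3 - x2, x1, x1*x2)."""
    return (x3 - x2, x1, x1 * x2)

def curl_field(x1, x2, x3):
    """curl v for the field above, computed by hand."""
    return (x1, 1.0 - x2, 2.0)

def surface_side(n=200):
    """Midpoint rule for the integral of (curl v) . n over the hemisphere."""
    du, dw = 2 * math.pi / n, (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            w = (j + 0.5) * dw
            # On the unit sphere the point x is also the outward unit normal,
            # and the area element is sin(w) du dw.
            x = (math.cos(u) * math.sin(w), math.sin(u) * math.sin(w), math.cos(w))
            total += dot(curl_field(*x), x) * math.sin(w) * du * dw
    return total

def curve_side(n=2000):
    """Midpoint rule for the integral of v . t along the unit circle."""
    ds = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * ds
        x = (math.cos(s), math.sin(s), 0.0)
        t = (-math.sin(s), math.cos(s), 0.0)   # unit tangent, counterclockwise
        total += dot(field(*x), t) * ds
    return total

lhs = surface_side()
rhs = curve_side()
```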

78 Differentiation of an integral containing a parameter

It can sometimes happen that an integrand, in addition to being a function of $x$, also depends on a parameter $t$. Furthermore, the domain of integration may also depend on the parameter $t$, so that the value of the integral itself depends on $t$. In this section we study the problem of differentiating such an integral with respect to $t$.
First we consider the one-dimensional case, i.e.
$$I(t) = \int_{\varphi(t)}^{\psi(t)} f(x, t)\, dx$$

Theorem 78.1. If the functions $\varphi(t)$, $\psi(t)$ are differentiable with respect to $t$ in some interval $c \le t \le d$ and the function $f(x, t)$ is continuous with respect to $x$ over the interval $\varphi(t) \le x \le \psi(t)$ and continuously differentiable with respect to $t$, then the function $I(t)$ is differentiable and
$$\frac{d}{dt} \int_{\varphi(t)}^{\psi(t)} f(x, t)\, dx = \psi'(t) \cdot f(\psi(t), t) - \varphi'(t) \cdot f(\varphi(t), t) + \int_{\varphi(t)}^{\psi(t)} \frac{\partial f}{\partial t}(x, t)\, dx$$

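Proof. By the mean value theorem, for $t, t + h \in [c, d]$ we have
$$\varphi(t + h) = \varphi(t) + h \cdot \varphi'(\xi) \quad\text{with } \xi \in (t, t + h)$$
$$\psi(t + h) = \psi(t) + h \cdot \psi'(\eta) \quad\text{with } \eta \in (t, t + h)$$
$$f(x, t + h) = f(x, t) + h \cdot \frac{\partial f}{\partial t}(x, \zeta) \quad\text{with } \zeta \in (t, t + h)$$
Now we have
$$I(t + h) = \int_{\varphi(t + h)}^{\psi(t + h)} f(x, t + h)\, dx$$
$$= \int_{\varphi(t) + h \varphi'(\xi)}^{\varphi(t)} f(x, t + h)\, dx + \int_{\varphi(t)}^{\psi(t)} f(x, t + h)\, dx + \int_{\psi(t)}^{\psi(t) + h \psi'(\eta)} f(x, t + h)\, dx$$
$$= -h \cdot \varphi'(\xi) \cdot f(x', t + h) + h \cdot \psi'(\eta) \cdot f(x'', t + h) + \int_{\varphi(t)}^{\psi(t)} f(x, t + h)\, dx$$
where $x'$ lies between $\varphi(t)$ and $\varphi(t) + h \cdot \varphi'(\xi)$, and $x''$ lies between $\psi(t)$ and $\psi(t) + h \cdot \psi'(\eta)$.
Next, forming the difference $I(t + h) - I(t)$ and combining the integrals we obtain
$$I(t + h) - I(t) = h \cdot \psi'(\eta) \cdot f(x'', t + h) - h \cdot \varphi'(\xi) \cdot f(x', t + h) + h \int_{\varphi(t)}^{\psi(t)} \frac{\partial f}{\partial t}(x, \zeta)\, dx$$
Finally, forming the difference quotient $\dfrac{I(t + h) - I(t)}{h}$ and taking the limit as $h \to 0$, it follows that $\xi$, $\eta$ and $\zeta$ all tend to $t$, while $x' \to \varphi(t)$ and $x'' \to \psi(t)$. Hence
$$\frac{dI}{dt} = \psi'(t) \cdot f(\psi(t), t) - \varphi'(t) \cdot f(\varphi(t), t) + \int_{\varphi(t)}^{\psi(t)} \frac{\partial f}{\partial t}(x, t)\, dx$$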

Corollary 78.1. If $f(x, t)$ is continuous with respect to $x$ over the interval $[a, b]$ and continuously differentiable with respect to $t$, then
$$\frac{d}{dt} \int_a^b f(x, t)\, dx = \int_a^b \frac{\partial f}{\partial t}(x, t)\, dx$$
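The formula of Theorem 78.1 can be compared with a direct finite-difference derivative of $I(t)$. In the sketch below the data $f(x, t) = \sin(xt)$, $\varphi(t) = t$, $\psi(t) = t^2$ and the point $t_0 = 1.5$ are hypothetical choices, and all integrals are approximated by the midpoint rule.

```python
import math

def integrate(g, a, b, n=2000):
    """Midpoint rule for the integral of g from a to b."""
    step = (b - a) / n
    return sum(g(a + (k + 0.5) * step) for k in range(n)) * step

def f(x, t):
    return math.sin(x * t)

def df_dt(x, t):
    """Partial derivative of f with respect to t."""
    return x * math.cos(x * t)

def lower(t):            # phi(t) = t, with derivative 1
    return t

def upper(t):            # psi(t) = t^2, with derivative 2t
    return t * t

def I(t):
    return integrate(lambda x: f(x, t), lower(t), upper(t))

def dI_leibniz(t):
    """Right-hand side of the formula in Theorem 78.1."""
    boundary = 2.0 * t * f(upper(t), t) - 1.0 * f(lower(t), t)
    interior = integrate(lambda x: df_dt(x, t), lower(t), upper(t))
    return boundary + interior

t0, h = 1.5, 1e-4
dI_numeric = (I(t0 + h) - I(t0 - h)) / (2 * h)   # central difference of I
dI_formula = dI_leibniz(t0)
```

With constant limits the boundary terms vanish and only the interior integral remains, which is the statement of Corollary 78.1.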

Now we consider the 3-D case.

Let $R \subset \mathbb{R}^3$ be a domain and $(a, b) \subset \mathbb{R}^1$ an open interval. We consider a continuously differentiable function $x : (a, b) \times R \to \mathbb{R}^3$ and a bounded set $\omega_0 \subset R$ having a smooth boundary surface $S_0$. For $t$ arbitrary, but fixed, denote by $\omega(t)$ the set
$$\omega(t) = \{x(t, \xi) \mid \xi \in \omega_0\}$$
and assume that the Jacobian of the function $x_t : \omega_0 \to \omega(t)$ defined by
$$x_t(\xi) = x(t, \xi)$$
is different from zero:
$$\frac{D(x_1, x_2, x_3)}{D(\xi_1, \xi_2, \xi_3)} \ne 0.$$
Now consider a function $F : \mathbb{R}^3 \times (a, b) \to \mathbb{R}^1$ having continuous first order partial derivatives, and the integral
$$I(t) = \iiint_{\omega(t)} F(x, t)\, dx_1\, dx_2\, dx_3$$

Theorem 78.2. The function $I(t)$ is continuously differentiable and
$$\frac{dI}{dt} = \iiint_{\omega(t)} \frac{\partial F}{\partial t}(x, t)\, dx_1\, dx_2\, dx_3 + \iint_{S(t)} F \cdot (v \cdot n)\ dS$$
where $S(t)$ is the boundary of $\omega(t)$, $v = \dfrac{\partial x}{\partial t}$ and $n$ is the outer unit normal vector of the surface $S(t)$.

Proof. The proof is rather technical and is omitted.
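A simple radially symmetric check of this formula, with all data chosen here for illustration: $\omega_0$ is the unit ball, $x(t, \xi) = (1 + t)\,\xi$, so that $\omega(t)$ is the ball of radius $1 + t$, and $F(x, t) = t\,\|x\|^2$. On $S(t)$ we have $v = \xi = x / (1 + t)$ and $n = x / \|x\|$, hence $v \cdot n = 1$. Because everything depends on $x$ only through $\rho = \|x\|$, volume integrals reduce to $\int_0^r g(\rho)\, 4\pi\rho^2\, d\rho$.

```python
import math

def radial_integral(g, r, n=2000):
    """Integral over the ball of radius r of a radial function g(|x|):
    midpoint rule for int_0^r g(rho) * 4*pi*rho^2 d(rho)."""
    step = r / n
    total = 0.0
    for k in range(n):
        rho = (k + 0.5) * step
        total += g(rho) * 4 * math.pi * rho * rho * step
    return total

def radius(t):
    """omega(t) is the ball of radius 1 + t (assumed example)."""
    return 1.0 + t

def F(rho, t):
    """F(x, t) = t * |x|^2, written as a function of rho = |x|."""
    return t * rho * rho

def dF_dt(rho, t):
    return rho * rho

def I(t):
    return radial_integral(lambda rho: F(rho, t), radius(t))

t0, h = 0.5, 1e-4
lhs = (I(t0 + h) - I(t0 - h)) / (2 * h)          # dI/dt by central difference

r0 = radius(t0)
volume_term = radial_integral(lambda rho: dF_dt(rho, t0), r0)
# F is constant on S(t0) and v . n = 1 there, so the surface integral is
# F * (v . n) * area of the sphere of radius r0.
surface_term = F(r0, t0) * 1.0 * 4 * math.pi * r0 * r0
rhs = volume_term + surface_term
```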

References
[1] R. Haggarty, Fundamentals of Mathematical Analysis; Addison-Wesley, 1989, Oxford

[2] A. B. Israel, R. Gilbert, Computer-Supported Calculus; Springer Wien New York,


2001, RISC Johannes Kepler University, Linz, Austria

[3] C. Lanczos, Applied Analysis; Sir Isaac Pitman, 1967, London

[4] F. Ayres, J. Cault, Differential and Integral Calculus in Simetric Units; Mc.Grow-
Hill, 1988

[5] A. Jeffrey, Mathematics for engineers ad scientists; Van Nostrand, 1961

[6] E. Kreiszig, Advanced engineering mathematics; Wiley & Sons, 1967

[7] O. V. Manturov, N. M. Matveev, A course of higher mathematics; Mir, 1989
