Ştefan Balint
Eva Kaslik, Simona Epure, Simina Mariş, Aurelia Tomoioagă
Contents
I Introduction 9
4 Relations 11
5 Functions 14
7 Logic symbols 16
10 Topology in R1 19
11 Sequences 20
12 Convergence 21
21 Infinite limits 43
23 Continuity 45
31 Power Series 58
32 Arithmetics of power series 60
33 Differentiable functions 61
34 Rules of differentiability 63
35 Local extremum 68
38 Taylor polynomials 72
46 Improper integrals 92
47 Fourier series 94
49 Topology in Rn 104
51 Continuity 108
53 Differentiation 112
71 Line integral of first type 148
In which way can a Calculus course be useful to a first
year computer science student?
This is a frequently asked question of first year students at the beginning of their Calculus
course.
It is difficult to give a full and convincing answer to this question at the very beginning
of the course, as we have to talk about the utility of some concepts and mathematical
instruments, that are unknown to those who ask, in solving practical problems which are
out of their reach at the moment.
However, the question cannot and must not be avoided. It is necessary to formulate a
partial answer showing the utility of this course in solving real problems, that future com-
puter scientists could find interesting. We have to emphasize here that for mathematics
students, Calculus is a basic and very important part of their curriculum, and its utility
is usually not questioned outside the field of mathematics.
So let’s get back to giving a partial answer to computer science students. We would like
to point out that in this course, basic concepts and instruments will be presented, used
for analyzing real or vector functions of one or more variables. To illustrate the utility of
some of these concepts and instruments, we will consider the following practical problem:
constructing a train schedule.
Constructing a train schedule for a railway network is a real and complex problem. It
is based on the knowledge of speed restrictions in the network, train stations, transport
material, options concerning the stops of some trains in certain stations, and a previous
computation that guarantees that in ideal conditions, the trains will not collide. Some
concepts of calculus prove to be useful in this computation. To guarantee that the trains
will not collide, it is necessary to know, at every moment, the position of every train and
to assure that these positions do not coincide at a certain moment of time. Let's consider
for example the Timişoara-Bucharest railway, which can be represented as an arc of curve
$\overset{\frown}{AB}$, as in the figure below; a train that circulates on this railway in the time range
$[t_0, t_0 + T]$ will be represented by a point $P$. If in the considered time range there are more
trains circulating on this railway, we will have to describe the motion of each of them.
In order to describe the motion of a train represented by the point $P$, we can associate
to each moment of time $t \in [t_0, t_0 + T]$ the length of the arc $\overset{\frown}{AP}$, where $P$ is
the position on the arc $\overset{\frown}{AB}$ where the train is at the moment $t$. Therefore, a function
$f$ is obtained, which is defined for $t \in [t_0, t_0 + T]$ and takes its values in the set $[0, l]$:
$f : [t_0, t_0 + T] \to [0, l]$; $l$ is the distance from $A$ to $B$ on the considered railway.
We must emphasize that the object that appeared in a natural way in this problem of
describing the position of a train on a railway, is a real function of one real variable, a
mathematical object that belongs to the field of interests of this course.
Our train has to arrive at given times to its stations and has some speed restrictions
along the way, hence, the function f could be quite complicated. However, there are some
characteristics of real motion that have to be translated mathematically as properties of
the function f . For example, the real motion is continuous, meaning that the train moves
from the position P1 to the position P2 gradually, passing through all the intermediate
positions and not by jumping. This means that the function $f$, even if complicated, must
have the following property: for any $t_2 \in [t_0, t_0 + T]$, if $t_1$ tends to $t_2$ then $f(t_1)$ tends to
$f(t_2)$.
A function with the above property is said to be continuous on the interval [t0 , t0 + T ].
The concept of continuity is studied in this course, where several of its properties are
revealed. Hence, the continuous functions studied in this course are useful, for example,
for describing the motion of a train on a railway.
If our train leaves at the moment $t_0$ from station $A$ and moves off continuously from $A$
without stopping until the moment $t_1$ at the first station $S_1$, then the function $f$ which
describes the motion of the train has the following property: for any $t', t'' \in [t_0, t_1]$ with $t' < t''$,
it results that $f(t') < f(t'')$. In this course, such a function is said to be increasing. The
course presents several properties of monotone functions. In the case of the considered
motion, this concept is useful for expressing moving off or approaching.
Due to speed restrictions and stops at the stations, the velocity of the train depends on
its position. More exactly, it depends on the moment of time t, as in the time range
$[t_0, t_0 + T]$, the train may pass through the same place a couple of times. In order to find
the velocity of the train at the moment $t_1$, we consider the mean velocity
$$\frac{f(t) - f(t_1)}{t - t_1}$$
(distance over time) on a short time range between $t$ and $t_1$; the limit of this mean velocity when $t$
tends to $t_1$ represents the velocity of the train at the moment $t_1$. In this course, this limit
is called the derivative of the function $f$ at $t_1$ and is denoted by $f'(t_1)$. If the train stays
in a station in the time range $[t_1, t_2]$, then its velocity is zero: $f'(t) = 0$ for $t \in [t_1, t_2]$.
If $f'(t) > 0$, then the train moves off $A$, and if $f'(t) < 0$ then the train approaches $A$.
If the train moves with a constant velocity in the time range $[t_1, t_2]$, then $f'(t) = \mathrm{const}$
in the interval $[t_1, t_2]$. This shows the utility of the concept of derivative for describing
mechanical motion.
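The limit of mean velocities can also be explored numerically. The following Python sketch uses a made-up position function `f` (an assumption for illustration, not part of the course): as `t` tends to `t1`, the mean velocity tends to the derivative.

```python
# A numerical sketch: estimating the velocity f'(t1) as the limit of the mean
# velocities (f(t) - f(t1)) / (t - t1). The position function f is hypothetical.

def f(t):
    """Hypothetical position (km) of a train at time t (hours)."""
    return 60.0 * t + 10.0 * t ** 2

def mean_velocity(f, t, t1):
    """Mean velocity over the time range between t and t1."""
    return (f(t) - f(t1)) / (t - t1)

t1 = 2.0
# As t tends to t1, the mean velocity tends to the derivative f'(t1).
for h in (1.0, 0.1, 0.001):
    print(mean_velocity(f, t1 + h, t1))
# For this f, the exact derivative is f'(t) = 60 + 20*t, so f'(2) = 100.
```

The printed values approach 100 as the time range shrinks, mirroring the limit process described above.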
Finally, we point out that starting from a velocity profile v(t) (which results from speed
restrictions and previously assigning the arrival and departure times) the function f (t)
which describes the motion can be recovered using the integral formula:
$$f(t) = f(t_0) + \int_{t_0}^{t} v(\tau)\, d\tau.$$
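The integral formula can be approximated on a computer by a Riemann sum. The velocity profile `v` below is a hypothetical example (a fast stretch followed by a speed-restricted one), chosen only to illustrate the recovery of position from velocity.

```python
# Recovering the position f(t) = f(t0) + integral of v from t0 to t,
# approximated by a left Riemann sum. The velocity profile v is hypothetical.

def v(t):
    """Hypothetical velocity profile (km/h): 80 km/h, then 40 km/h after t = 1."""
    return 80.0 if t < 1.0 else 40.0

def position(t, t0=0.0, f0=0.0, steps=20000):
    """Approximate f(t) = f(t0) + integral_{t0}^{t} v(tau) dtau."""
    dt = (t - t0) / steps
    total = 0.0
    for k in range(steps):
        total += v(t0 + k * dt) * dt   # left Riemann sum
    return f0 + total

print(position(2.0))  # about 80*1 + 40*1 = 120 km
```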
The written course is presented in a standard form, similar to the course presented to
mathematics students. However, the spoken course is full of comments and examples that
are meant to illustrate the utility and applicability of the concepts and results in solving
real problems.
The authors
Part I
Introduction
1 The notions "set", "element of a set", "membership of an element in a set" are basic notions of mathematics
A strict mathematics course requires a precise definition of all the notions used to present
the material.
A definition should precisely describe a notion (A) using another notion (B), which is
assumed to be known, or in any event simpler than (A).
Notion (B) must also be strictly defined, and its definition will contain another notion
(C), simpler than (B), and so on.
For the construction of a mathematical theory with exact definitions of all the notions, it
is necessary to have a collection of very simple notions to which the rest can be reduced
and which are themselves not defined.
We will call such notions basic notions.
From the point of view of common sense, the basic notions of mathematics are so
self-evident that they do not require definitions. The meaning of basic notions can be
described by examples.
The notions: a set, an element of a set, membership of an element in a set, are basic
notions of mathematics.
We cannot obtain an exact definition of the above notions, but it is possible to clarify
their meaning, by examples.
Thus, let us consider the notion of a set. We may speak of the set of days in a year, points
in a plane, students in a lecture-room, and so on. In these cases, each day of a year, each
point in a plane, each student in a lecture-room is an element of the set.
When a concrete set is considered, an essential thing is to be able to affirm for any
element if it belongs or not to the set. Thus, for the set of days in a year, the 3rd of July,
the 20th of May, the 29th of December are all elements of the set, while "Wednesday", "Friday",
"holiday", "days in a year" are not. In the second example, only the points in the given
plane are elements of the set. If the point does not lie in the given plane, or the element
is not a point, then it is not an element of the set.
In order to define a concrete set it is necessary to describe clearly the elements belonging
to it. Any faulty description may lead to a logical contradiction.
2 Symbols used in set theory
Definition 3.1. For any two sets A and B the set of elements belonging to A or B or to
both sets is called the union of A and B, and is written A ∪ B.
Definition 3.2. For any two sets A and B the set of elements belonging to A and B at
the same time is called the intersection of A and B and is written A ∩ B.
Definition 3.3. For any two sets A and B, the set of elements of B that are not elements
of A is called the difference of B and A, written $B \setminus A$. If the set A is a subset of B, then $B \setminus A$ is
called the complement of A in B and is denoted by $C_B A$.
Comment 3.1.
- The notions of union and intersection of sets can be extended to three, four or any
number of sets. Namely, the union of n sets A, B, C, . . . is the set of those elements
which belong to at least one of these sets. The intersection of n sets A, B, C, . . .
is the set of those elements which belong simultaneously to each set.
- It is possible that two sets A and B have no elements in common. In such a case
A ∩ B contains no elements. Nevertheless, it is still convenient to view A ∩ B as a
set (containing no elements). It is called the empty (or null) set, and is denoted by
the symbol ∅.
For any set A we have A ⊃ A and A ⊃ ∅; thus A and ∅ are subsets of A; they are called
improper subsets, all other subsets being proper subsets.
Sometimes, the union of sets is called the sum of sets, and the intersection of sets the
product of sets.
Usually, the operations of union and intersection of sets are defined on the set of all subsets
of a given set S. These operations, for any A, B, C ⊂ S, satisfy the following properties:
• (A ∪ B) ∪ C = A ∪ (B ∪ C) associativity of union;
• (A ∩ B) ∩ C = A ∩ (B ∩ C) associativity of intersection;
• A ∪ B = B ∪ A commutativity of union;
• A ∩ B = B ∩ A commutativity of intersection;
• the set S possesses the property A ∩ S = A for any A ⊂ S, the empty set ∅ possesses
the property: ∅ ∩ A = ∅ for any A.
There are identities, known as rules of De Morgan, which relate the operations of
complementation, taking unions, and taking intersections. These rules are expressed
by the formulas:
$$C_S(A \cup B) = C_S A \cap C_S B; \qquad C_S(A \cap B) = C_S A \cup C_S B.$$
Definition 3.4. For any two sets A and B, the set of ordered pairs $(a, b)$ with $a \in A$,
$b \in B$ is called the cartesian product of A and B, and it is denoted $A \times B$.
The cartesian product satisfies:
$$A \times (B \cup C) = (A \times B) \cup (A \times C); \qquad A \times (B \cap C) = (A \times B) \cap (A \times C)$$
for any sets A, B, C.
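The set identities above can be checked directly with Python's built-in sets. The concrete sets `S`, `A`, `B`, `C` below are arbitrary examples chosen for illustration.

```python
# A quick check of the identities above using Python's built-in sets.
from itertools import product

S = set(range(10))
A, B, C = {1, 2, 3}, {3, 4, 5}, {5, 6}

def complement(X):
    """C_S X, the complement of X in S."""
    return S - X

# De Morgan's rules
assert complement(A | B) == complement(A) & complement(B)
assert complement(A & B) == complement(A) | complement(B)

# The cartesian product distributes over union and intersection
def cart(X, Y):
    return set(product(X, Y))

assert cart(A, B | C) == cart(A, B) | cart(A, C)
assert cart(A, B & C) == cart(A, B) & cart(A, C)
print("all identities verified")
```

Of course, checking the identities on one example does not prove them; the proofs follow from the definitions above.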
4 Relations
Definition 4.1. A binary relation in the set A is a subset R of the cartesian product
A × A : R ⊂ A × A.
Traditionally, the membership (x, y) ∈ R is denoted by xRy.
The set $R = \{(x, y) \in \mathbb{R} \times \mathbb{R} : x^2 + y^2 \le 1\}$ is a binary relation in the set of all real
numbers $\mathbb{R}$.
Definition 4.2. A binary relation R in the set A is called reflexive if for any x ∈ A we
have xRx.
Definition 4.6. A binary relation R in the set A is total if for any x, y ∈ A, at least one
of the following statements is true: xRy, yRx.
The set $R = \{(x, y) \in \mathbb{R} \times \mathbb{R} : x - y \le 0\}$ is a total binary relation in the set of all real
numbers $\mathbb{R}$.
Definition 4.7. A binary relation R in the set A is partial if there exist $x, y \in A$ such
that none of the following statements is true: $xRy$, $yRx$.
Definition 4.8. A binary relation R in the set A is a relation of partial order if it satisfies
the following properties: R is a partial relation; R is reflexive; R is antisymmetric; R is
transitive.
The inclusion of sets is a relation of partial order in the set of all subsets of a given set S.
Definition 4.9. A binary relation R in the set A is a relation of total order if it satisfies
the following properties: R is a total relation ; R is reflexive; R is antisymmetric; R is
transitive.
The set $R = \{(x, y) \in \mathbb{R} \times \mathbb{R} : x - y \le 0\}$ is a relation of total order in the set of all real
numbers $\mathbb{R}$.
The set of all subsets of a given set S, together with the relation of inclusion, is a partially
ordered system.
Definition 4.11. A set A together with a relation of total order R in A is called totally
ordered system and it is also denoted by (A, R).
The set of real numbers $\mathbb{R}$, together with the binary relation $R = \{(x, y) \in \mathbb{R} \times \mathbb{R} : x - y \le 0\}$, is a totally ordered system.
The family $\mathcal{P}(X)$ of all subsets of a set X affords an illustration of these concepts. The
inclusion relation $\subset$ between the sets contained in X makes the pair $(\mathcal{P}(X), \subset)$ a
partially ordered system. An upper bound for a subfamily $\mathcal{B} \subset \mathcal{P}(X)$ is any set containing
$\bigcup_{B \in \mathcal{B}} B$, and $\bigcup_{B \in \mathcal{B}} B$ is the only least upper bound of $\mathcal{B}$.
Similarly, $\bigcap_{B \in \mathcal{B}} B$ is the only greatest lower bound of $\mathcal{B}$. The only maximal element of $\mathcal{P}(X)$
is X.
For example, the equality in the set of subsets $\mathcal{P}(X)$ of a given set X is an equivalence relation.
The set $R = \{(x, y) \in \mathbb{Z} \times \mathbb{Z} : x - y \text{ divisible by } 5\}$ is a relation of equivalence in the set
of integers $\mathbb{Z}$.
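This equivalence relation can be made concrete in code: on any finite slice of the integers, "x − y divisible by 5" partitions the numbers into five classes. The slice below is an arbitrary example.

```python
# Illustration: the relation x R y iff (x - y) divisible by 5 partitions the
# integers into five equivalence classes (shown here on a finite slice of Z).

def related(x, y):
    return (x - y) % 5 == 0

integers = range(-10, 11)
classes = {}
for x in integers:
    classes.setdefault(x % 5, set()).add(x)   # group by residue mod 5

assert related(7, 2) and related(-3, 2) and not related(4, 2)
assert len(classes) == 5                      # exactly five classes
# any two members of the same class are related
assert all(related(x, y) for c in classes.values() for x in c for y in c)
print(sorted(classes[2]))  # the class of 2: [-8, -3, 2, 7]
```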
Definition 4.16. A relation R between the elements of a set A and the elements of a set
B is a subset of the cartesian product A × B; R ⊂ A × B.
Definition 4.17. A function (mapping) f of the set A into the set B, written $f : A \to B$,
is a relation R between the elements of the sets A and B ($R \subset A \times B$) which possesses the
following properties: for every $x \in A$ there exists $y \in B$ such that $(x, y) \in R$; and if $(x, y_1) \in R$ and $(x, y_2) \in R$, then $y_1 = y_2$.
5 Functions
The notion of function plays an important role in mathematics. It is not a basic notion,
since we have already seen that it can be defined in terms of sets. However for those
starting mathematical analysis, it is easiest to consider mapping (function) as a basic
notion clarifying it by examples and describing it in a manner that satisfies common
sense.
If for every $x \in A$ an element $y \in B$ is chosen according to some rule, then we say that
there is a function (mapping) f of the set A into the set B, written $f : A \to B$.
Thus, a function is defined uniquely by the rule which makes every x ∈ A correspond to
y ∈ B.
What does the above description of function lack for it to be a strict definition?
Firstly, we must explain what a rule is; secondly, what a correspondence is.
Intuitively it is clear what a rule and correspondence are. In simple cases, these notions
do not involve misunderstandings and are sufficient for a meaningful mathematical theory
to be constructed on their basis.
Let us note once again that the rule defining the element y ∈ B is applicable to every
x ∈ A. The element x ∈ A is called the argument of the function f , the element y ∈ B is
called the value of the function f corresponding to the element x ∈ A, y = f (x), and the
function itself is a rule which ”processes” every x ∈ A into y = f (x).
The set A is called the domain of the function, and the set of all the elements y ∈ B for
which there are x ∈ A such that y = f (x) is called the range of the function f.
We shall consider functions which associate every real number $x \in A \subset \mathbb{R}^1$ with a number
$y = f(x) \in \mathbb{R}$. For functions of this kind, a rule can be given by an explicit algebraic
expression; for instance:
$$y = x^2 + 2x; \qquad y = \sqrt{\frac{1-x}{x+2}}; \qquad y = 1 + \sqrt[5]{7x}.$$
The right-hand sides of the equalities contain the rule that ”processes” x into y. The rule
in the first expression is: each x should be squared and added to twice x. The rules in the
second and third expression can be formulated in a similar way.
The rule can also be given by the symbols $\exp$, $\log_a$, $\sin$, $\cos$, $\tan$, $\cot$, and combinations
of these symbols and algebraic operations. For instance,
$$y = \log_2 \sqrt{1 + \sin x}; \qquad y = \frac{1}{(\tan x)^{1/2} - 2x}.$$
The right-hand sides of equalities define the rules for ”processing” x into y.
The rule can be given by another frequent method.
Let $f_1$ and $f_2$ be functions defined by expressions such as those given above, and let $a$ be a number. We
then set:
$$f(x) = \begin{cases} f_1(x) & \text{for } x < a \\ f_2(x) & \text{for } x \ge a \end{cases}$$
The above equality can be interpreted as a rule according to which every x has a corresponding
y. This rule can be formulated thus: if x is less than a, then the corresponding
y is computed by rule $f_1$; but if x is greater than or equal to a, then the corresponding y
is determined by rule $f_2$.
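A piecewise rule of this kind translates directly into code. The concrete choices of $f_1$, $f_2$ and $a$ below are hypothetical, picked so that the two rules agree at $a$.

```python
# The piecewise rule above in code; f1, f2 and a are made-up examples.

a = 1.0
f1 = lambda x: x ** 2          # rule used for x < a
f2 = lambda x: 2 * x - 1       # rule used for x >= a

def f(x):
    """Apply rule f1 below a, rule f2 from a upward."""
    return f1(x) if x < a else f2(x)

print(f(0.5), f(1.0), f(3.0))  # → 0.25 1.0 5.0
```

With these choices the two rules meet at $a = 1$ (both give 1), so this particular $f$ happens to be continuous there; other choices of $f_1$, $f_2$ would produce a jump at $a$.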
Definition 6.4. A mapping (function) f : X → Y is said to be bijective (a bijection) if
it is both injective and surjective.
Comment 6.2. 1. An injective mapping possesses the following property: different values
of the function correspond to different values of the argument. For instance, the number
functions $y = 5x$; $y = e^x$; $y = \arctan x$ are injective.
2. Surjective functions are also called ”onto mappings.”
For instance, the number function $y = \sin x$ is a surjective mapping of $\mathbb{R}^1$ onto the set
$[-1, 1]$, but is not a surjective mapping of $\mathbb{R}^1$ onto all of $\mathbb{R}^1$ (there is no inverse image of the
point $y = 2$).
3. A bijective function is a one-to-one mapping f : X → Y. This means that every x ∈ X
has a corresponding y ∈ Y, y = f (x), with different x ∈ X having different corresponding
y ∈ Y, and every y ∈ Y having a corresponding x ∈ X (such that y = f (x), different x
corresponding to different y ∈ Y ).
Comment 6.3. 1. The rule in Definition 6.5 implies the following property of an
inverse mapping (inverse function): $f^{-1}(f(x)) = x$ for every $x \in X$ and $f(f^{-1}(y)) = y$ for every $y \in Y$.
2. The functions (mappings) f and f −1 are mutually inverse, that is, (f −1 )−1 = f.
3. To find the inverse of a given number function $y = f(x)$, we must express x in terms
of y. Thus, for $y = 3x + 2$ the inverse mapping is $x = \dfrac{y-2}{3}$; for $y = x^3$ it is $x = \sqrt[3]{y}$; for
$y = 10^x$ it is $x = \log y$.
7 Logic symbols
The expressions ”for any element” and ”there exists” are frequently used in mathematics.
They are designated in a special manner:
- the first is denoted by the symbol ∀ (the first letter of the word ”Any” inverted);
- the second by the symbol ∃ (the first letter of the word ”Exist” reflected).
We shall also use the symbol ⇒ to mean ”follows”. Thus, if A and B are two sentences,
then A ⇒ B means that B follows from A.
If $A \Rightarrow B$ and $B \Rightarrow A$, then the sentences A and B are said to be equivalent, written
$A \Leftrightarrow B$ (A is equivalent to B).
Using this notation, the injectivity of a mapping f : X → Y can be written in the form:
$$\forall\, x_1, x_2 \in X,\ x_1 \neq x_2 \Rightarrow f(x_1) \neq f(x_2)$$
and the surjectivity of the same mapping in the form:
∀ y ∈ Y, ∃ x ∈ X | f (x) = y
the vertical line before f (x) = y is read ”such that”.
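For finite sets, the two quantified formulas above can be checked by brute force, which is a useful exercise for a computer science student. The small sets and mappings below are hypothetical examples.

```python
# Brute-force checks of injectivity and surjectivity on finite sets.
# The sets X, Y and the mappings f, g are made-up examples.

X = {1, 2, 3}
Y = {'a', 'b', 'c'}
f = {1: 'a', 2: 'b', 3: 'c'}   # a mapping given by a lookup table

def is_injective(f):
    # for all x1, x2 in X: x1 != x2 implies f(x1) != f(x2)
    return all(f[x1] != f[x2] for x1 in f for x2 in f if x1 != x2)

def is_surjective(f, Y):
    # for all y in Y there exists x in X such that f(x) = y
    return all(any(f[x] == y for x in f) for y in Y)

assert is_injective(f) and is_surjective(f, Y)
g = {1: 'a', 2: 'a', 3: 'b'}   # neither injective nor surjective onto Y
assert not is_injective(g) and not is_surjective(g, Y)
print("checks passed")
```

Note how the nested `all`/`any` calls mirror the $\forall$ and $\exists$ quantifiers in the formulas.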
The designation $A \stackrel{\text{def}}{\Longleftrightarrow} B$ is used when we want to describe a notion A using a sentence
B. It is read "A is by definition B". For instance, the notation
$$X \subset Y \stackrel{\text{def}}{\Longleftrightarrow} \{(\forall\, x)(x \in X) \Rightarrow (x \in Y)\}$$
defines X as a subset of set Y : the right-hand side of this notation is a sentence and it is
read: ”any element x of X is also an element of the set Y ”.
Many mathematical statements (including theorems) have the following form: ”if A, then
B ”, or, which is the same, ”B follows from A ”, A ⇒ B, where A is the condition, and
B is the conclusion of the theorem.
For any statement A ⇒ B we can construct a new statement by interchanging A and B,
namely, write B ⇒ A, that is ”if B, then A ”, ”A follows from B ”.
The theorem (statement) B ⇒ A is the converse of the theorem (statement) A ⇒ B.
It is obvious that the converse of a converse is the original theorem, therefore the two
theorems are said to be mutually converse.
If the direct theorem is true, its converse may be either true or false.
Example 8.1. The direct theorem (Pythagoras’ theorem) is: if a triangle is right-angled,
then the square of the hypotenuse is equal to the sum of the squares of the other two
sides.
The converse is: if the square of the biggest side equals the sum of the squares of the two
smaller sides, then the triangle is right-angled.
In this case, both the direct theorem and the converse are true.
Example 8.2. The direct theorem is: if two angles are right angles, they are equal.
The converse is: if two angles are equal, then they are right angles.
Here the direct theorem is true, but the converse is false.
For the theorem "if A, then B", the statement "if not A, then not B" is called the contrary
theorem. The contrary of a contrary theorem is the initial theorem.
Example 8.4. For the theorem ” If the sum of two opposite angles in a quadrilateral is
equal to 180◦ , then a circle can be circumscribed about the quadrilateral ” the contrary
theorem is ” If the sum of two opposite angles in a quadrilateral is not equal to 180◦ , then
a circle cannot be circumscribed about the quadrilateral ”.
In this case, both the direct theorem and its contrary are true.
The contrary theorem is equivalent to the converse. This means that the contrary theorem
is true if and only if the converse theorem is true.
Let the statement ” if A, then B ” be true. In this case the condition A is said to be
sufficient for B, and the condition B to be necessary for A.
Let also the converse be true, that is, ” if B, then A”. In this case B is the sufficient
condition for A and the condition A is necessary for B.
Thus, the condition A is necessary and sufficient for B (and the condition B is necessary
and sufficient for A). In other words, conditions A and B are equivalent: A occurs if and
only if B is true.
Example 9.1. Bézout’s theorem is: ”If α is a root of a polynomial P (x), then the
polynomial P (x) is divisible by x − α without remainder ”.
The converse is: "If a polynomial P(x) is divisible by $x - \alpha$, then $\alpha$ is a root of
the polynomial P(x)". We know that both Bézout's theorem and its converse are true.
Therefore, the necessary and sufficient condition for the number $\alpha$ to be a root of a
polynomial P(x) is that "the polynomial P(x) is divisible by $x - \alpha$".
The following statement is also true: ” for a polynomial P (x) to be divisible by x − α
without remainder it is necessary and sufficient that the number α be a root of the
polynomial P (x)”.
Part II
a) $\overline{A \cup B} = \overline{A} \cup \overline{B}$;
b) $\overline{A} \supset A$;
c) $\overline{\overline{A}} = \overline{A}$;
e) $x \in \overline{A}$ if and only if every neighborhood $V(x)$ of $x$ intersects $A$.
11 Sequences
Comment 11.1. The value of the function (defining a sequence of real numbers)
corresponding to the argument 1 is denoted by $a_1$, that corresponding to the argument 2
by $a_2$, ..., that corresponding to the argument n by $a_n$. Here, $a_1$ is called the first term
of the sequence, $a_2$ the second term, ..., $a_n$ the n-th term.
The sequence a1 , a2 , . . . , an , . . . is denoted by (an ).
In order to define a sequence, the values of the first, second, ..., n-th terms of the
sequence must be indicated. In other words, a rule must be given for evaluating the n-th
term of the sequence from its place in the sequence, for $n = 1, 2, \ldots$.
Example 11.5. Let $a_n = \dfrac{1 + (-1)^n}{2}$; then $a_1 = 0$, $a_2 = 1$, $a_3 = 0$, $a_4 = 1$. Thus:
$$a_n = \begin{cases} 0 & \text{for } n \text{ odd} \\ 1 & \text{for } n \text{ even} \end{cases}$$
It may happen that as the number n increases, the terms $a_n$ of the sequence increase
too.
Definition 11.2. An increasing sequence (an ) is one in which an ≤ an+1 for all n ∈ N.
Definition 11.3. A decreasing sequence (an ) is one in which an ≥ an+1 for all n ∈ N.
Definition 11.4. A sequence which is either increasing or decreasing is called a monotone
sequence.
Example 11.6. If $q > 1$, then the sequence $a_n = q^n$ is increasing, and if $0 < q < 1$, then
the sequence $a_n = q^n$ is decreasing. If $q \in (0, +\infty)$ and $q \neq 1$, then the sequence $a_n = q^n$
is monotone.
Definition 11.5. A sequence (an ) is called bounded if there exists a number M such that
|an | ≤ M for all n.
For instance, if 0 < q < 1, then the sequence an = q n is bounded (|an | < 1). The sequence
an = (−1)n is also bounded (|an | ≤ 1).
Definition 11.6. A sequence $(a_n)$ is called unbounded if it is not bounded. In other words,
if for any $M > 0$ there exists $n_M$ such that $|a_{n_M}| > M$.
For instance, if q > 1, then the sequence an = q n is unbounded.
Definition 11.7. If (an ) is a sequence, then any sequence (ank ), where (nk ) = n1 , n2 , . . .
is a strictly increasing sequence of positive integers, is called a subsequence of the sequence
(an ).
Comment 11.2.
12 Convergence
It may happen that as the number n increases without bound, the terms $a_n$ of the sequence
approach closely a certain number L. In this case we arrive at an important mathematical
concept, that of the limit of a sequence.
Definition 12.1. A number L is said to be the limit of the sequence (an ) if for any number
ε > 0 there is a number N (dependent on ε) such that all the terms an of the sequence
with subscript n exceeding N satisfy the condition:
|an − L| < ε.
If $a_n \xrightarrow[n \to \infty]{} L$, then the sequence $(a_n)$ is said to be convergent to L.
Comment 12.1.
• If the sequence (an ) converges to L, then any subsequence (ank ) of the sequence (an )
converges to L.
Indeed: for any ε > 0 there exists N such that for n > N we have |an − L| < ε.
Hence for nk > N we have |ank − L| < ε.
• Not every sequence has a limit. For instance, the sequence an = (−1)n has no
limit. That is because the subsequence a2k = (−1)2k = 1 converges to 1 and the
subsequence a2k+1 = (−1)2k+1 = −1 converges to −1.
• The limit of a sequence $(a_n)$, if it exists, is unique.
Assuming the contrary, that is, $(a_n)$ converges to $L_1$ and to $L_2$ with $L_1 \neq L_2$, we find $N_1$ and
$N_2$ such that $|a_n - L_1| < |L_1 - L_2|/2$ for $n > N_1$ and $|a_n - L_2| < |L_1 - L_2|/2$ for $n > N_2$.
Since for $n > \max\{N_1, N_2\}$ we have $|L_1 - L_2| \le |L_1 - a_n| + |L_2 - a_n| < |L_1 - L_2|$,
we obtain $|L_1 - L_2| < |L_1 - L_2|$, which is absurd.
• If the sequence $(a_n)$ converges to L, then it is bounded. Indeed, considering $\varepsilon = 1$
and $N_1$ such that $|a_n - L| < 1$ for $n > N_1$, we have
$$|a_n| = |a_n - L + L| \le |a_n - L| + |L| < 1 + |L|$$
for $n > N_1$.
Therefore $|a_n| \le \max\{|a_1|, |a_2|, \ldots, |a_{N_1}|, 1 + |L|\}$.
Example 12.1. Let us show that $\lim\limits_{n \to \infty} \dfrac{1}{\sqrt{n}} = 0$.
Indeed, let $\varepsilon > 0$. Considering the inequality
$$\left| \frac{1}{\sqrt{n}} - 0 \right| < \varepsilon,$$
we have $\dfrac{1}{\sqrt{n}} < \varepsilon$, $\dfrac{1}{n} < \varepsilon^2$, that is, $n > \dfrac{1}{\varepsilon^2}$.
We set $N = \left[\dfrac{1}{\varepsilon^2}\right] + 1$, where $\left[\dfrac{1}{\varepsilon^2}\right]$ is the integral part of the number $\dfrac{1}{\varepsilon^2}$. It is obvious that
if $n > N$, then $n > \dfrac{1}{\varepsilon^2}$ and the inequality $\left| \dfrac{1}{\sqrt{n}} - 0 \right| < \varepsilon$ is fulfilled.
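The epsilon-N computation of Example 12.1 can be carried out numerically: for a given $\varepsilon$ we compute $N = [1/\varepsilon^2] + 1$ and check the defining inequality on a range of subscripts beyond $N$.

```python
# The N of Example 12.1, computed for a concrete epsilon, with a brute-force
# check of the defining inequality on the terms that follow.
import math

def N_for(eps):
    # N = [1/eps^2] + 1, with [.] denoting the integral part
    return math.floor(1 / eps ** 2) + 1

eps = 0.01
N = N_for(eps)
# every term with subscript n > N satisfies |1/sqrt(n) - 0| < eps
assert all(abs(1 / math.sqrt(n) - 0) < eps for n in range(N + 1, N + 1000))
print(N)
```

(A finite check of course only illustrates the definition; the proof above covers all $n > N$.)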
Note that when proving the existence of a limit, we calculated the number N for the given
$\varepsilon$ in a formal, textbook manner. From now on, we shall compute limits using other,
simpler and more convenient rules.
In some cases the limit of a sequence (an ) is said to be infinity. The meaning of this
concept is the following:
Definition 12.2. The limit of the sequence (an ) is said to be +∞ if for any M > 0 there
is NM such that an > M for n > NM .
13 Rules (for convergence of sequences)
Suppose that $(a_n)$ and $(b_n)$ are convergent sequences with limits a and b respectively; then
the following rules apply:
Product rule: $(a_n \cdot b_n)$ converges to $a \cdot b$.
Proof. Since $\lim\limits_{n \to \infty} b_n = b$, there is $M > 0$ such that $|b_n| \le M$ for any $n \in \mathbb{N}$. It follows that:
$$|a_n \cdot b_n - a \cdot b| \le |a_n - a| \cdot |b_n| + |a| \cdot |b_n - b| \le M \cdot |a_n - a| + |a| \cdot |b_n - b|.$$
Given $\varepsilon > 0$, set $\varepsilon_1 = \dfrac{\varepsilon}{2M}$ and $\varepsilon_2 = \dfrac{\varepsilon}{2(|a| + 1)}$; there exist $N_1$ and $N_2$ such that
$$n > N_1 \Rightarrow |a_n - a| < \varepsilon_1$$
and
$$n > N_2 \Rightarrow |b_n - b| < \varepsilon_2.$$
Let $N_3$ be the maximum of $N_1$ and $N_2$; we conclude that if $n > N_3$ then:
$$|a_n \cdot b_n - a \cdot b| < \varepsilon.$$
In other words, $\lim\limits_{n \to \infty} a_n \cdot b_n = a \cdot b$.
Quotient rule: $(a_n / b_n)$ converges to $a/b$, provided that $b_n \neq 0$ for each n and $b \neq 0$.
Proof. Firstly it is shown that if $\lim\limits_{n \to \infty} b_n = b \neq 0$ and $b_n \neq 0$ for all n, then $\lim\limits_{n \to \infty} \dfrac{1}{b_n} = \dfrac{1}{b}$.
It is clear that we have:
$$\left| \frac{1}{b_n} - \frac{1}{b} \right| = \frac{|b_n - b|}{|b_n| \cdot |b|}.$$
Since $\lim\limits_{n \to \infty} b_n = b$, there exists an integer $N_1$ such that $|b_n - b| < \dfrac{1}{2}|b|$ for all $n > N_1$. Let
M be the maximum of $\dfrac{2}{|b|}, \dfrac{1}{|b_1|}, \ldots, \dfrac{1}{|b_{N_1}|}$. Then $\left| \dfrac{1}{b_n} \right| \le M$ for all n.
So, given any $\varepsilon > 0$, let $\varepsilon' = \dfrac{\varepsilon \cdot |b|}{M}$. Then $\varepsilon' > 0$ and there exists an integer $N_2$ such that
$|b_n - b| < \varepsilon'$ for all $n > N_2$. Hence, $\left| \dfrac{1}{b_n} - \dfrac{1}{b} \right| < \varepsilon$ for all $n > N_3$, where $N_3$ is the maximum
of $N_1$ and $N_2$. In other words, $\lim\limits_{n \to \infty} \dfrac{1}{b_n} = \dfrac{1}{b}$. By the product rule, then, $\lim\limits_{n \to \infty} \dfrac{a_n}{b_n} = \dfrac{a}{b}$.
Application 13.1. Find $\lim\limits_{n \to \infty} \dfrac{n^2 + 2n + 3}{4n^2 + 5n + 6}$.
Solution: The quotient rule cannot be applied directly, since neither the numerator nor
the denominator of $\dfrac{n^2 + 2n + 3}{4n^2 + 5n + 6}$ converges to a finite limit.
However, if the numerator and denominator are divided by the dominant term $n^2$, the
following is obtained:
$$a_n = \frac{1 + \dfrac{2}{n} + \dfrac{3}{n^2}}{4 + \dfrac{5}{n} + \dfrac{6}{n^2}}.$$
It is easy to prove that $\dfrac{1}{n} \xrightarrow[n \to \infty]{} 0$ and that the constant sequence $(k)$ has limit k. Hence
$\lim\limits_{n \to \infty} a_n = \dfrac{1}{4}$, freely using the sum, product, scalar product and quotient rules.
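A numerical look (an illustration only, not a proof) confirms the dominant-term argument: the terms of $a_n = (n^2 + 2n + 3)/(4n^2 + 5n + 6)$ settle toward $1/4$.

```python
# Numerical illustration: the terms of a_n approach 1/4 as n grows.

def a(n):
    return (n ** 2 + 2 * n + 3) / (4 * n ** 2 + 5 * n + 6)

for n in (10, 1000, 100000):
    print(n, a(n))   # values creep toward 0.25

assert abs(a(10 ** 6) - 0.25) < 1e-5
```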
Squeeze rule: Let $(a_n)$, $(b_n)$, $(c_n)$ be sequences satisfying $a_n \le b_n \le c_n$ for all $n \in \mathbb{N}$. If
$(a_n)$ and $(c_n)$ both converge to the same limit L, then $(b_n)$ also converges to L.
Application 13.2. Show that $\lim\limits_{n \to \infty} (-1)^n \cdot \dfrac{1}{n^2} = 0$.
Let $a_n = -\dfrac{1}{n^2}$, $b_n = (-1)^n \cdot \dfrac{1}{n^2}$, $c_n = \dfrac{1}{n^2}$.
Both $(a_n)$ and $(c_n)$ converge to 0. By the squeeze rule, $(b_n)$ converges to 0.
Principle of monotone sequences: A bounded monotone sequence is convergent.
Proof. The statement for a bounded increasing sequence is proved, the proof being similar
for a decreasing sequence.
Let $(a_n)$ be such that $a_1 \le a_2 \le \ldots \le a_n \le \ldots$ and $a_n \le M$ for all $n \in \mathbb{N}$. Let
$M_0 = \sup\{a_n \mid n \in \mathbb{N}\}$ be the least upper bound of the set of numbers appearing in the
sequence. Given $\varepsilon > 0$, $M_0 - \varepsilon$ cannot be an upper bound for $\{a_n \mid n \in \mathbb{N}\}$. Hence, there
exists a value $n = N$ such that $a_N > M_0 - \varepsilon$. Furthermore, $a_n \le M_0$ by the definition of
$M_0$ and, the sequence being increasing, $a_n \ge a_N > M_0 - \varepsilon$ for $n > N$; hence, for $n > N$, $|a_n - M_0| < \varepsilon$. This proves that $\lim\limits_{n \to \infty} a_n = M_0$.
Application 13.3. A sequence $(a_n)$ is defined by $a_1 = 1$ and $a_{n+1} = \sqrt{a_n + 1}$ for $n \ge 1$.
Show that $\lim\limits_{n \to \infty} a_n = \dfrac{1 + \sqrt{5}}{2}$.
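Iterating the recursion of Application 13.3 on a computer suggests numerically what the principle of monotone sequences proves: the terms settle on the golden ratio $(1 + \sqrt{5})/2$.

```python
# Iterating a_1 = 1, a_{n+1} = sqrt(a_n + 1): the terms approach the
# golden ratio (1 + sqrt(5))/2, the positive root of x^2 = x + 1.
import math

a = 1.0
for _ in range(50):
    a = math.sqrt(a + 1)

golden = (1 + math.sqrt(5)) / 2
print(a, golden)
assert abs(a - golden) < 1e-12
```

The limit must satisfy $L = \sqrt{L + 1}$, i.e. $L^2 = L + 1$, whose positive root is exactly $(1 + \sqrt{5})/2$.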
Theorem 13.1 (Bolzano-Weierstrass). Every bounded sequence $(a_n)$ contains a convergent subsequence.
Proof. Let $S_N = \{a_n \mid n > N\}$. If every $S_N$ has a maximum element, then define a
subsequence of $(a_n)$ as follows: $b_1 = a_{n_1}$ is the maximum of $S_1$, $b_2 = a_{n_2}$ is the maximum
of $S_{n_1}$, $b_3 = a_{n_3}$ is the maximum of $S_{n_2}$, and so on. Therefore $(b_n)$ is a monotone
decreasing subsequence of $(a_n)$. Since $(a_n)$ is bounded, so is $(b_n)$. It follows that
$(b_n)$ is a convergent subsequence of $(a_n)$.
On the other hand, if for some M the set $S_M$ does not have a maximum element, then for any $a_m$
with $m > M$ there exists an $a_n$ following $a_m$ with $a_n > a_m$. Let $c_1 = a_{M+1}$ and let $c_2$ be the
first term of $(a_n)$ following $c_1$ for which $c_2 > c_1$. Let $c_3$ be the first term of $(a_n)$ following
$c_2$ for which $c_3 > c_2$, and so on. Therefore $(c_n)$ is a monotone increasing subsequence of
$(a_n)$. Since $(c_n)$ is bounded, it is convergent.
It is intuitively clear that if $a_n \xrightarrow[n \to \infty]{} L$, then all the terms of the sequence with large
subscripts will differ very little from one another, all of them being approximately equal to L.
More precisely, we have:
Theorem 13.2 (Cauchy’s criterion for the convergence of a sequence). A sequence (an )
has a limit if and only if for any ε > 0 there exists Nε such that all the terms of the
sequence with subscripts p, q > Nε satisfy |ap − aq | < ε.
Proof. Let us assume that the sequence $(a_n)$ has a limit L, and let $\varepsilon > 0$ be a number.
Consider the number $\dfrac{\varepsilon}{2}$; by the definition of the limit there exists an integer N such that
$|a_n - L| < \dfrac{\varepsilon}{2}$ for all $n > N$. Hence $|a_p - L| < \dfrac{\varepsilon}{2}$, $|a_q - L| < \dfrac{\varepsilon}{2}$ for $p, q > N$, and it
follows that $|a_p - a_q| \le |a_p - L| + |a_q - L| < \varepsilon$ for $p, q > N$.
Let us assume now that for any $\varepsilon > 0$ there exists N such that $|a_p - a_q| < \varepsilon$ for $p, q > N$.
Considering $\varepsilon = 1$ and $N_1$ such that $|a_p - a_q| < 1$ for $p, q > N_1$, we have:
$$|a_n| = |a_n - a_{N_1 + 1} + a_{N_1 + 1}| \le |a_n - a_{N_1 + 1}| + |a_{N_1 + 1}| \le 1 + |a_{N_1 + 1}|, \quad \text{for } n > N_1.$$
Therefore:
$$|a_n| \le \max\{|a_1|, |a_2|, \ldots, |a_{N_1}|, |a_{N_1 + 1}| + 1\} = M.$$
According to the Bolzano-Weierstrass theorem, the sequence $(a_n)$ contains a convergent
subsequence $(a_{n_k})$. Let $L = \lim\limits_{n_k \to \infty} a_{n_k}$ and let $\varepsilon > 0$ be a number. There exists $N_1$ such
that for $n_k > N_1$ we have $|a_{n_k} - L| < \dfrac{\varepsilon}{2}$, and $N_2$ such that $|a_p - a_q| < \dfrac{\varepsilon}{2}$ for $p, q > N_2$.
Considering $N_3 = \max\{N_1, N_2\}$ and $n > N_3$, we have, for any $n_k > N_3$:
$$|a_n - L| \le |a_n - a_{n_k}| + |a_{n_k} - L| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
Hence $\lim\limits_{n \to \infty} a_n = L$.
Definition 14.1. The set of limit points of a sequence $(a_n)$ is the collection of points $x \in \mathbb{R}^1$
for which there exists a subsequence $(a_{n_k})$ of the sequence $(a_n)$ such that $\lim\limits_{n_k \to \infty} a_{n_k} = x$.
Usually, the set of limit points of a sequence $(a_n)$ is denoted by $L(a_n)$.
The sequence $(a_n)$ converges and $\lim\limits_{n \to +\infty} a_n = L$ if and only if $L(a_n) = \{L\}$.
Definition 14.2. The limit superior of a sequence (an ) is sup L(an ). The limit superior
of a sequence (an ) usually is denoted by lim sup an or by lim an .
n→∞ n→∞
Definition 14.3. The limit inferior of a sequence (an ) is inf L(an ). The limit inferior of
a sequence (an ) usually is denoted by lim inf an or lim an .
n→∞ n→∞
Example 14.1. If an = (−1)n then L(an ) = {−1, 1} and lim an = −1, lim an = 1.
n→∞ n→∞
lim an = lim an = L.
n→∞ n→∞
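Example 14.1 uses a_n = (−1)^n, with limit points −1 and 1. The sketch below uses the nearby variant a_n = (−1)^n·(1 + 1/n) (an assumption of this illustration, not from the text), so that the two limit points are approached along the even and odd subsequences rather than hit exactly.

```python
# a_n = (-1)^n * (1 + 1/n): the even-index subsequence tends to +1 and
# the odd-index one to -1, so lim sup a_n = 1 and lim inf a_n = -1.

def a(n):
    return (-1) ** n * (1 + 1 / n)

N = 10**6
evens = [a(n) for n in range(N, N + 100, 2)]      # even n: values near +1
odds = [a(n) for n in range(N + 1, N + 101, 2)]   # odd n: values near -1
print(max(evens), min(odds))  # approximate lim sup and lim inf
```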
15 Series of real numbers
Given a sequence (a_n) of real numbers, consider the sequence of n-th partial sums
s_n = a_1 + a_2 + · · · + a_n.
More precisely:
Definition 15.1. It is said that the symbol ∑_{n=1}^{∞} a_n is a convergent series, with sum s, if the sequence (s_n) of n-th partial sums converges to s.
If (s_n) is a divergent sequence then, irrespective of its precise behavior, ∑_{n=1}^{∞} a_n is called a divergent series.
Rather regrettably, the symbol ∑_{n=1}^{∞} a_n is still used to denote a divergent series even though it does not possess a sum.
Solution: The n-th partial sum of ∑_{n=1}^{∞} 1/2^n is s_n = 1/2 + 1/2² + · · · + 1/2^n = 1 − 1/2^n. Since lim_{n→∞} s_n = 1, it can be deduced that ∑_{n=1}^{∞} 1/2^n converges and has sum 1.
Solution: The n-th partial sum is s_n = 1 + 2 + · · · + n = n(n + 1)/2. Since (s_n) is a divergent sequence, ∑_{n=1}^{∞} n is divergent.
Solution: Since
1/(n² + n) = 1/(n(n + 1)) = 1/n − 1/(n + 1),
the n-th partial sum of ∑_{n=1}^{∞} 1/(n² + n) can be written as
s_n = (1 − 1/2) + (1/2 − 1/3) + · · · + (1/n − 1/(n + 1)) = 1 − 1/(n + 1).
Now lim_{n→∞} s_n = 1.
Example 15.4. ∑_{n=1}^{∞} 1/2^n is a special case of an important class of infinite series, namely the geometric series ∑_{n=0}^{∞} a·x^n, where x is a real number.
Notice that the summation here begins at n = 0, not at n = 1. For this series the sum of the first n terms is
s_n = a + a·x + a·x² + · · · + a·x^{n−1}
so
x·s_n = a·x + a·x² + · · · + a·x^n
and by subtraction
s_n = a(1 − x^n)/(1 − x)
for x ≠ 1. Therefore, lim_{n→∞} s_n = a/(1 − x) for |x| < 1.
Since (s_n) diverges for |x| ≥ 1, the following result is obtained:
Result: The geometric series
∑_{n=0}^{∞} a·x^n = a + a·x + a·x² + . . . , a ≠ 0,
converges if and only if |x| < 1. Moreover, its sum is then a/(1 − x).
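The Result above is easy to check numerically; the values a = 3 and x = 0.5 below are arbitrary sample choices.

```python
# Partial sums of the geometric series sum a*x^n (n >= 0) against the
# closed form a / (1 - x).

a, x = 3.0, 0.5
s = 0.0
for n in range(60):        # 60 terms is plenty for |x| = 0.5
    s += a * x**n
print(s, a / (1 - x))      # both ~ 6.0
```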
Since the sum of a convergent series is defined to be the limit of the sequence of n-th partial sums of the series in question, the rules concerning the convergence of sequences can be used to establish theorems concerning the convergence of series.
The first result provides a useful test for the divergence of series.
The vanishing condition: If ∑_{n=1}^{∞} a_n is convergent, then lim_{n→∞} a_n = 0.
Proof. Suppose that (s_n) converges to some limit s. Then (s_{n−1}) also converges to s. But a_n = s_n − s_{n−1}, and so lim_{n→∞} a_n = s − s = 0.
Example 15.5. Consider ∑_{n=1}^{∞} n/(n + 1). Since a_n = n/(n + 1) and lim_{n→∞} a_n = 1 ≠ 0, by the vanishing condition ∑_{n=1}^{∞} n/(n + 1) does not converge. In other words, ∑_{n=1}^{∞} n/(n + 1) is divergent.
It is important to note that the converse of the vanishing condition is false! In other words, there are divergent series whose terms nevertheless tend to zero.
Example 15.6. Consider ∑_{n=1}^{∞} (√n − √(n−1)). The n-th partial sum may be written as:
s_n = (√1 − √0) + (√2 − √1) + · · · + (√n − √(n−1)) = √n.
Clearly (s_n) is a divergent sequence, and so ∑_{n=1}^{∞} (√n − √(n−1)) is a divergent series. However,
a_n = √n − √(n−1) = (√n − √(n−1))(√n + √(n−1))/(√n + √(n−1)) = 1/(√n + √(n−1)) → 0.
By considering the n-th partial sums of the appropriate series, together with the sum and scalar product rules for sequences, the following elementary results can easily be proved.
Sum rule: If ∑_{n=1}^{∞} a_n and ∑_{n=1}^{∞} b_n are convergent series, then ∑_{n=1}^{∞} (a_n + b_n) is also convergent and
∑_{n=1}^{∞} (a_n + b_n) = ∑_{n=1}^{∞} a_n + ∑_{n=1}^{∞} b_n.
Scalar product rule: If ∑_{n=1}^{∞} a_n is convergent, then ∑_{n=1}^{∞} (k·a_n) is convergent for any k ∈ R^1 and
∑_{n=1}^{∞} (k·a_n) = k·∑_{n=1}^{∞} a_n.
Rules will be established below which can be used to test whether a given series converges or not.
Integral test: Let f : R^1_+ → R^1_+ be a decreasing function and let a_n = f(n) for each n ∈ N. Let j_n = ∫_1^n f(x) dx. The series ∑_{n=1}^{∞} a_n converges if and only if the sequence (j_n) converges.
The proof of this statement is given in the section where the Riemann integral is rigorously defined.
Application 16.1. Establish that the p-series ∑_{n=1}^{∞} 1/n^p converges if and only if p > 1.
Solution: Consider the function f_p : R^1_+ → R^1_+ given by f_p(x) = 1/x^p. When p > 0 this is a decreasing function of x, and a_n = 1/n^p = f_p(n) is the n-th term of the p-series ∑_{n=1}^{∞} 1/n^p.
For p ≠ 1,
j_n = ∫_1^n dx/x^p = x^{1−p}/(1 − p) |_1^n = (n^{1−p} − 1)/(1 − p).
So, for p > 1, lim_{n→∞} j_n = 1/(p − 1), and for p < 1 the sequence (j_n) is divergent. For p = 1,
j_n = ∫_1^n dx/x = ln x |_1^n = ln(n),
and (j_n) is again divergent. By the integral test, the p-series therefore converges exactly when p > 1.
The p-series, together with the geometric series, give a fund of known convergent and divergent series.
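A numerical companion to Application 16.1 (the cutoff 10^5 is a sample choice): partial sums of the p-series stay bounded for p = 2 but grow like ln n for p = 1, matching the behaviour of j_n computed above.

```python
import math

# Partial sums of the p-series for p = 2 (convergent) and p = 1
# (divergent, growing like ln n).

def p_series_partial(p, n):
    return sum(1.0 / k**p for k in range(1, n + 1))

print(p_series_partial(2, 10**5))                   # ~ pi^2/6 = 1.6449...
print(p_series_partial(1, 10**5), math.log(10**5))  # both grow like ln n
```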
First comparison test: If 0 ≤ a_n ≤ b_n for all n ∈ N, then ∑_{n=1}^{∞} b_n convergent implies ∑_{n=1}^{∞} a_n convergent.
Proof. Let s_n = ∑_{k=1}^{n} a_k and t_n = ∑_{k=1}^{n} b_k. From the given conditions, 0 ≤ s_n ≤ t_n for all n ∈ N. If ∑_{n=1}^{∞} b_n converges, then lim_{n→∞} t_n = t and, since (t_n) is an increasing sequence, t_n ≤ t for all n ∈ N.
Therefore s_n ≤ t for all n ∈ N, and hence (s_n) is a bounded and increasing sequence. It follows that (s_n) converges, and hence ∑_{n=1}^{∞} a_n converges.
Example 16.2. The series ∑_{n=1}^{∞} (1 + cos n)/(3^n + 2·n³) is convergent.
Solution: Let a_n = (1 + cos n)/(3^n + 2·n³). Then a_n ≥ 0, since cos n ≥ −1. Also a_n ≤ 2/(3^n + 2·n³), since cos n ≤ 1. Therefore, since 3^n > 0, a_n < 2/(2·n³) = 1/n³. Let b_n = 1/n³. Then ∑_{n=1}^{∞} b_n converges, being the p-series with p = 3. Hence ∑_{n=1}^{∞} a_n also converges.
Second comparison test: Let ∑_{n=1}^{∞} a_n and ∑_{n=1}^{∞} b_n be positive term series such that
lim_{n→∞} a_n/b_n = L ≠ 0.
Then ∑_{n=1}^{∞} a_n converges if and only if ∑_{n=1}^{∞} b_n converges.
Proof. Suppose that ∑_{n=1}^{∞} b_n is convergent and let
s_n = a_1 + a_2 + · · · + a_n;  t_n = b_1 + b_2 + · · · + b_n.
Since lim_{n→∞} a_n/b_n = L, for ε = 1 there is an N_1 such that
|a_n/b_n − L| < 1, for all n > N_1.
Hence,
a_n/b_n = |a_n/b_n| = |a_n/b_n − L + L| ≤ |a_n/b_n − L| + |L| < 1 + |L| = k, for n > N_1.
Now consider the positive term series ∑_{n=1}^{∞} α_n and ∑_{n=1}^{∞} β_n, where α_n = a_{N_1+n} and β_n = k·b_{N_1+n}.
Hence 0 ≤ α_n ≤ β_n for all n ∈ N. Since ∑_{n=1}^{∞} b_n converges, so does ∑_{n=N_1+1}^{∞} b_n, and hence ∑_{n=1}^{∞} β_n converges by the scalar product rule for series. By the first comparison test, ∑_{n=1}^{∞} α_n converges and, since the addition of a finite number of terms to a convergent series produces another convergent series, ∑_{n=1}^{∞} a_n also converges. This proves that ∑_{n=1}^{∞} b_n convergent implies ∑_{n=1}^{∞} a_n convergent. The converse of this statement can be proved by reversing the roles of a_n and b_n in the above argument and observing that b_n/a_n → 1/L.
Example 16.3. Show that the series ∑_{n=1}^{∞} 2n/(n² − 5n + 8) is divergent.
Solution: Let a_n = 2n/(n² − 5n + 8) and b_n = 1/n. Then a_n/b_n = 2n²/(n² − 5n + 8) → 2 ≠ 0. Hence ∑_{n=1}^{∞} a_n diverges by comparison with the divergent harmonic series.
Ratio test: Let ∑_{n=1}^{∞} a_n be a series of positive terms and for each n ∈ N let α_n = a_{n+1}/a_n. Suppose that (α_n) converges to some limit L. If L > 1 then ∑_{n=1}^{∞} a_n diverges; if L < 1 then ∑_{n=1}^{∞} a_n converges; and if L = 1 the test gives no information.
Proof. Suppose that L < 1 and let ε = (1 − L)/2.
Now ε > 0 and L + ε = k < 1. Since lim_{n→∞} α_n = L, there is a value N_ε such that α_n = |α_n − L + L| ≤ ε + L = k < 1 for all n > N_ε. Therefore a_{n+1} ≤ k·a_n for all n > N_ε.
Let β_n = a_{N_ε+n}. Then β_{n+1} ≤ k·β_n for all n ∈ N and so (by induction on n) β_{n+1} ≤ k^n·β_1. Comparison with the convergent geometric series ∑_{n=1}^{∞} k^n·β_1 shows that ∑_{n=1}^{∞} β_n converges, and hence ∑_{n=1}^{∞} a_n converges.
Suppose now that L > 1 and let ε = L − 1. Now ε > 0 and, since lim_{n→∞} α_n = L, there is a value N_ε such that α_n > L − ε = 1 for all n > N_ε.
Hence a_{n+1} > a_n for all n > N_ε, and so a_n > a_{N_ε} for all n > N_ε. Since a_{N_ε} ≠ 0, (a_n) does not tend to zero. By the vanishing condition, ∑_{n=1}^{∞} a_n diverges.
Example 16.4. Determine those values of x for which ∑_{n=1}^{∞} n·(4x²)^n is convergent.
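The text leaves Example 16.4 as an exercise; the sketch below only illustrates numerically what the ratio test predicts: a_{n+1}/a_n = ((n+1)/n)·4x² → 4x², so the series converges exactly when 4x² < 1, i.e. |x| < 1/2. The value x = 0.4 is a sample choice.

```python
# Ratio test for a_n = n * (4x^2)^n at a sample point x = 0.4,
# where 4x^2 = 0.64 < 1, so the series converges.

def a(n, x):
    return n * (4 * x * x) ** n

x = 0.4
ratios = [a(n + 1, x) / a(n, x) for n in range(1, 50)]
print(ratios[-1])                       # close to the limit 4x^2 = 0.64

s = sum(a(n, x) for n in range(1, 200))
print(s)                                # partial sums settle down
```

For |x| ≥ 1/2 the terms do not tend to zero, so the vanishing condition already gives divergence there.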
The root test: Let ∑_{n=1}^{∞} a_n be a series of positive terms. If there exist N and k ∈ (0, 1) such that a_n^{1/n} ≤ k for n > N, then the series converges.
If a_n^{1/n} ≥ 1 for infinitely many terms of the series, then the series diverges.
Proof. If there exist N and k ∈ (0, 1) such that a_n^{1/n} ≤ k for n > N, then a_n ≤ k^n for n > N. Consequently, the series ∑_{n=1}^{∞} a_n can be compared with the geometric series ∑_{n=1}^{∞} k^n, which converges since k < 1. This proves the first case.
If a_n^{1/n} ≥ 1 for infinitely many terms of the series, then a_n ≥ 1 for infinitely many terms of the series. Hence the vanishing condition is not satisfied and the series diverges.
Application 16.2. The series ∑_{n=1}^{∞} 1/n^n converges. Using the root test, we have (1/n^n)^{1/n} = 1/n ≤ 1/2 for n ≥ 2.
Proof. Let s_m = ∑_{n=1}^{m} (−1)^{n−1}·b_n be the m-th partial sum.
Then s_{2m} = b_1 − (b_2 − b_3) − · · · − (b_{2m−2} − b_{2m−1}) − b_{2m} ≤ b_1, and hence the sequence (s_{2m}) is bounded above.
Since s_{2m} = (b_1 − b_2) + (b_3 − b_4) + · · · + (b_{2m−1} − b_{2m}) and (b_n) is decreasing, (s_{2m}) is increasing. Consequently (s_{2m}) converges; let s = lim_{m→∞} s_{2m}.
Similarly, the sequence (s_{2m+1}) is a decreasing sequence which is bounded below by b_1 − b_2, and so (s_{2m+1}) converges to a limit t = lim_{m→∞} s_{2m+1}.
Now t − s = lim_{m→∞} (s_{2m+1} − s_{2m}) = lim_{m→∞} b_{2m+1} = 0.
Finally, it is shown that lim_{n→∞} s_n = s.
If ε > 0, there are integers N_1 and N_2 such that |s_{2m} − s| < ε for all m > N_1 and |s_{2m+1} − s| < ε for all m > N_2. Let N be the maximum of 2N_1 and 2N_2 + 1. If n > N, then either n = 2m (and m > N_1) or n = 2m + 1 (and m > N_2). In either case |s_n − s| < ε for all n > N. In other words, the sequence (s_n) converges and hence the series ∑_{n=1}^{∞} (−1)^{n−1}·b_n is convergent.
Example 16.5. The series ∑_{n=1}^{∞} (−1)^{n−1}·(1/n) is convergent, since (1/n) is a decreasing sequence with limit zero.
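Example 16.5 numerically (the value ln 2 of the sum is quoted here as a known fact; it is not proved in the text above): the partial sums oscillate but close in on ln 2.

```python
import math

# Partial sums of the alternating harmonic series sum (-1)^(n-1)/n.
s, n_max = 0.0, 10**5
for n in range(1, n_max + 1):
    s += (-1) ** (n - 1) / n
print(s, math.log(2))  # agree to about 1/(2*n_max)
```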
Since ∑_{n=1}^{∞} |a_n| converges, for any ε > 0 there exists an integer N such that for n > m > N we have |a_{m+1}| + · · · + |a_n| < ε, and hence |s_n − s_m| < ε for n > m > N. By the Cauchy criterion we obtain that the series ∑_{n=1}^{∞} a_n converges.
Example 17.1. The series ∑_{n=0}^{∞} (−1)^n/2^n converges, since it is absolutely convergent: ∑_{n=0}^{∞} 1/2^n = 2.
The alternating harmonic series ∑_{n=1}^{∞} (−1)^n/n is conditionally convergent. It converges (by the Leibniz criterion) but it is not absolutely convergent, since ∑_{n=1}^{∞} 1/n diverges.
Comment 17.1. Absolute convergence can be tested by the convergence tests given above for series with positive terms.
Absolute convergence is important for the following reason: the sum of an absolutely convergent series ∑_{n=1}^{∞} a_n does not depend on the order in which the terms a_n are taken.
It can be shown that for any conditionally convergent series ∑_{n=1}^{∞} a_n and any real number S, we can obtain ∑_{n=1}^{∞} a_n = S by rearranging the terms of ∑_{n=1}^{∞} a_n. For example, rearranging the terms of the alternating harmonic series, we can make it sum to 10^6, or to −10^6, or to within ε of the number of atoms in the universe.
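The rearrangement phenomenon can be demonstrated with a greedy algorithm: add positive terms of the alternating harmonic series while the running sum is below the target, negative terms otherwise. The target 1.5 and the step count are arbitrary choices for this illustration.

```python
# Greedy rearrangement of the alternating harmonic series steered
# toward the target value 1.5.

target = 1.5
pos = (1.0 / n for n in range(1, 10**7, 2))    # +1, +1/3, +1/5, ...
neg = (-1.0 / n for n in range(2, 10**7, 2))   # -1/2, -1/4, -1/6, ...

s, used = 0.0, 0
while used < 10**5:
    s += next(pos) if s <= target else next(neg)
    used += 1
print(s)  # hovers near 1.5
```

The running sum never strays from the target by more than the size of the last term added, and those terms tend to zero, so the rearranged series converges to 1.5.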
Cauchy product of series: If ∑_{n=1}^{∞} a_n and ∑_{n=1}^{∞} b_n are absolutely convergent series and
c_n = a_1·b_n + a_2·b_{n−1} + · · · + a_n·b_1,
then ∑_{n=1}^{∞} c_n is absolutely convergent and
∑_{n=1}^{∞} c_n = (∑_{n=1}^{∞} a_n)·(∑_{n=1}^{∞} b_n).
Proof. Suppose first that ∑_{n=1}^{∞} a_n and ∑_{n=1}^{∞} b_n are positive term series and consider the array:
a_1b_1  a_1b_2  a_1b_3  . . .
a_2b_1  a_2b_2  a_2b_3  . . .
a_3b_1  a_3b_2  a_3b_3  . . .
. . .
If w_n is the sum of the terms in the array that lie in the n × n square with a_1b_1 at one corner, then w_n = s_n·t_n, where s_n and t_n are the n-th partial sums of ∑_{n=1}^{∞} a_n and ∑_{n=1}^{∞} b_n respectively.
Hence lim_{n→∞} w_n = (∑_{n=1}^{∞} a_n)·(∑_{n=1}^{∞} b_n).
Now ∑_{n=1}^{∞} c_n is the sum of the terms in the array summed "by diagonals", and so if u_n is the n-th partial sum of ∑_{n=1}^{∞} c_n, then:
w_{[n/2]} ≤ u_n ≤ w_n.
By the squeeze rule we have
lim_{n→∞} u_n = (∑_{n=1}^{∞} a_n)·(∑_{n=1}^{∞} b_n),
as required.
For the general case, the above argument can be applied to the series ∑_{n=1}^{∞} |a_n|, ∑_{n=1}^{∞} |b_n| and ∑_{n=1}^{∞} |c_n| to deduce that the series ∑_{n=1}^{∞} c_n is absolutely convergent. As ∑_{n=1}^{∞} c_n is a linear combination of the series ∑_{n=1}^{∞} a_n^+, ∑_{n=1}^{∞} a_n^−, ∑_{n=1}^{∞} b_n^+ and ∑_{n=1}^{∞} b_n^−, we have:
∑_{n=1}^{∞} c_n = (∑_{n=1}^{∞} a_n)·(∑_{n=1}^{∞} b_n).
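A numerical check of the Cauchy product on two absolutely convergent series (the geometric series ∑ 1/2^n and ∑ 1/3^n, n ≥ 1, are sample choices; their sums are 1 and 1/2, so the product series should sum to 1/2).

```python
# Cauchy product c_n = a_1 b_n + a_2 b_{n-1} + ... + a_n b_1 of
# a_n = 1/2^n and b_n = 1/3^n (both starting at n = 1).

a = [0.0] + [1.0 / 2**n for n in range(1, 60)]   # a[1..59]; a[0] unused
b = [0.0] + [1.0 / 3**n for n in range(1, 60)]

c = [sum(a[k] * b[n + 1 - k] for k in range(1, n + 1))
     for n in range(1, 59)]
print(sum(c))   # ~ 1 * (1/2) = 0.5
```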
Definition 18.1. L is called the limit of f(x) as x approaches "a" if, for any ε > 0, there exists a number δ > 0 (depending on ε) such that |f(x) − L| < ε for all x ∈ A with x ≠ a and |x − a| < δ.
We denote this by
lim_{x→a} f(x) = L  or  f(x) → L for x → a.
Comment 18.1. Definition 18.1 does not depend on the value of f at "a" (if it exists), as the point "a" is excluded from consideration. If the value f(a) exists, it may well violate the inequality |f(x) − L| < ε.
Given the function f and the value L, the inequality |f(x) − L| < ε means
L − ε < f(x) < L + ε,
and therefore ε can be regarded as the prescribed accuracy of approximating L, i.e. how close to L one wants to get.
The number δ is not uniquely determined by ε. One can always "take a smaller δ", in the sense that if, for a given ε, we have
0 < |x − a| < δ_1 ⇒ |f(x) − L| < ε,
then, for any 0 < δ < δ_1, we have
0 < |x − a| < δ ⇒ |f(x) − L| < ε.
Example 18.1. Let us show that
lim_{x→2} (x² − 4)/(x − 2) = 4.
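For this example f(x) = x + 2 for x ≠ 2, so |f(x) − 4| = |x − 2| and δ = ε works in Definition 18.1. A quick numerical confirmation at a few sample points:

```python
# epsilon-delta check for lim_{x->2} (x^2 - 4)/(x - 2) = 4.
def f(x):
    return (x * x - 4) / (x - 2)

eps = 1e-3
delta = eps  # valid here since |f(x) - 4| = |x - 2| exactly
for x in (2 - 0.9 * delta, 2 + 0.5 * delta, 2 + 0.99 * delta):
    assert abs(f(x) - 4) < eps
print("delta = epsilon verifies the limit at the sampled points")
```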
Heine's criterion for the limit: The function f : A ⊂ R^1 → R^1 has a limit as x approaches "a" if and only if for any sequence (x_n), x_n ∈ A, x_n ≠ a, with x_n → a as n → ∞, the sequence (f(x_n)) converges.
Proof. Assume that L is the limit of f(x) as x approaches "a" and consider a sequence (x_n), x_n ∈ A, x_n ≠ a, x_n → a as n → ∞. For ε > 0 there exists δ > 0 such that 0 < |x − a| < δ ⇒ |f(x) − L| < ε. For this δ > 0 there exists N such that |x_n − a| < δ for n > N. Hence |f(x_n) − L| < ε for n > N. Therefore the sequence (f(x_n)) converges.
Assume now that for any sequence (x_n), x_n ∈ A, x_n ≠ a, x_n → a as n → ∞, the sequence (f(x_n)) converges. Firstly, we show that lim_{n→∞} f(x_n) is independent of (x_n).
For that, assume the contrary, i.e. there exist (x'_n), (x''_n) with x'_n, x''_n ∈ A, x'_n ≠ a, x''_n ≠ a and lim_{n→∞} x'_n = lim_{n→∞} x''_n = a, for which lim_{n→∞} f(x'_n) = L' ≠ L'' = lim_{n→∞} f(x''_n). Consider the sequence (x_n) defined by
x_n = x'_k for n = 2k  and  x_n = x''_{k+1} for n = 2k + 1,
and remark that x_n ∈ A, x_n ≠ a and lim_{n→∞} x_n = a. Hence the sequence (f(x_n)) converges. Let L = lim_{n→∞} f(x_n) and remark that lim_{n→∞} f(x'_n) = L' and lim_{n→∞} f(x''_n) = L'', being limits along subsequences of (x_n), must both equal L, i.e. L' = L and L'' = L. It follows that L' = L'', which is absurd.
Denote now by L the common value of lim_{n→∞} f(x_n) and show that lim_{x→a} f(x) = L. For that, assume the contrary. It follows that there exists ε_0 > 0 such that for any n ∈ N there exists x_n ∈ A, x_n ≠ a, such that |x_n − a| < 1/n and |f(x_n) − L| ≥ ε_0. Hence the sequence (f(x_n)) does not converge to L, even though x_n ∈ A, x_n ≠ a and x_n → a as n → ∞. That is absurd.
Proof. Assume that L = lim_{x→a} f(x) and consider ε > 0. There is δ > 0 such that
0 < |x − a| < δ ⇒ |f(x) − L| < ε/2.
Hence 0 < |x' − a| < δ and 0 < |x'' − a| < δ ⇒ |f(x') − f(x'')| ≤ |f(x') − L| + |f(x'') − L| < ε.
Assume now that for any ε > 0 there exists δ > 0 such that 0 < |x' − a| < δ and 0 < |x'' − a| < δ ⇒ |f(x') − f(x'')| < ε, and consider a sequence (x_n), x_n ∈ A, x_n ≠ a, x_n → a as n → ∞. For δ > 0 there is N such that |x_n − a| < δ for n > N.
Hence, for n, m > N we have |f(x_n) − f(x_m)| < ε. This means that the sequence (f(x_n)) converges.
Applying Heine's criterion, the function f has a limit as x approaches "a".
b) If lim_{x→a} f(x) = L and lim_{x→a} g(x) = M, then lim_{x→a} (f(x) ± g(x)) = L ± M.
d) If lim_{x→a} f(x) = L, g(x) ≠ 0 and lim_{x→a} g(x) = M ≠ 0, then lim_{x→a} f(x)/g(x) = L/M.
Proof. We prove part b) (the other parts are proved similarly; the proofs are rather technical and can be skipped at first reading). For any ε > 0, there are positive δ_1 and δ_2 such that
0 < |x − a| < δ_1 ⇒ |f(x) − L| < ε/2,
0 < |x − a| < δ_2 ⇒ |g(x) − M| < ε/2.
For 0 < |x − a| < min{δ_1, δ_2} we have
|(f(x) ± g(x)) − (L ± M)| ≤ |f(x) − L| + |g(x) − M| < ε.
Pinching rule: Suppose that the inequality f(x) ≤ g(x) ≤ h(x) holds for all x in some interval around "a", except perhaps at x = a. If lim_{x→a} f(x) = L and lim_{x→a} h(x) = L, then also lim_{x→a} g(x) = L.
Proof. Since f(x) ≤ g(x) ≤ h(x), we have f(x) − L ≤ g(x) − L ≤ h(x) − L. Hence |g(x) − L| ≤ max{|f(x) − L|, |h(x) − L|}. For ε > 0 there exist δ_1 > 0 and δ_2 > 0 such that
0 < |x − a| < δ_1 ⇒ |f(x) − L| < ε,
0 < |x − a| < δ_2 ⇒ |h(x) − L| < ε.
Hence, for 0 < |x − a| < min{δ_1, δ_2}, we have |g(x) − L| < ε.
The area of the circular sector OAB is θ/2.
The area of the triangle OAB is (sin θ)/2.
Therefore
0 ≤ |sin θ|/2 ≤ |θ|/2.
"Pinching" sin θ between 0 and θ, both of which approach 0 as θ → 0, proves the first limit, lim_{θ→0} sin θ = 0.
From the Pythagorean relation sin²θ + cos²θ = 1 we get moreover
lim_{θ→0} cos²θ = lim_{θ→0} (1 − sin²θ) = 1.
But lim_{θ→0} cos²θ = (lim_{θ→0} cos θ)², and therefore lim_{θ→0} cos θ = +1 or −1. The negative sign is eliminated, since cos θ is positive near θ = 0.
Example 19.2. We use the pinching rule to prove
lim_{x→0} x·sin(1/x) = 0.
The function x·sin(1/x) is bounded below by −|x| and above by |x|, so that
−|x| ≤ x·sin(1/x) ≤ |x|.
As x → 0 we also have −|x| → 0 and |x| → 0, and therefore
lim_{x→0} x·sin(1/x) = 0,
proving the claimed limit.
We can similarly show that lim_{x→0} x²·sin(1/x) = 0.
Example 19.3. We can show that lim_{x→0} e^x = 1 by pinching the exponential function e^x near 0 between the functions 1 + x and 1 + x + x², i.e.
1 + x < e^x < 1 + x + x² for x ∈ (−∞, 1), x ≠ 0.
Using the above inequalities we can similarly calculate
lim_{x→0} (e^x − 1)/x = 1.
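Both pinching examples can be sanity-checked numerically (the sample points below are arbitrary choices):

```python
import math

# x*sin(1/x) -> 0 and (e^x - 1)/x -> 1 as x -> 0.
for x in (0.1, 0.01, 0.001, 1e-6):
    assert abs(x * math.sin(1 / x)) <= abs(x)   # the pinching bound itself
    print(x * math.sin(1 / x), (math.exp(x) - 1) / x)
# first column shrinks toward 0, second column tends to 1
```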
The substitution rule: Assume that lim_{x→a} f(x) = L, with f(x) ≠ L for all x near (but different from) a, and lim_{y→L} g(y) = M. Then lim_{x→a} g(f(x)) = M.
Proof. Let ε > 0 be given. Since g(y) → M as y → L, there exists δ_1 > 0 such that 0 < |y − L| < δ_1 ⇒ |g(y) − M| < ε. Also, since lim_{x→a} f(x) = L and f(x) ≠ L near a, there exists δ_2 > 0 such that
0 < |x − a| < δ_2 ⇒ 0 < |f(x) − L| < δ_1.
Therefore, writing y = f(x): 0 < |x − a| < δ_2 ⇒ 0 < |y − L| < δ_1 ⇒ |g(y) − M| = |g(f(x)) − M| < ε, proving that lim_{x→a} g(f(x)) = M. (The hypothesis f(x) ≠ L near a is what guarantees 0 < |y − L|, which the limit of g requires.)
20 One sided limits
The limit lim_{x→a} f(x) = L in Definition 18.1 is a two-sided limit, since the variable x approaches the point "a" from both sides. We now analyze one-sided limits, where the variable x approaches the point "a" from one side only. This is necessary if the function is defined only on one side of the point in question, or if approaching the point from different sides gives different limits.
We use the following terminology:
"x approaches "a" from the right", also "x approaches "a" from above", denoted by x → a^+ or x ↘ a, means a < x < a + δ for δ > 0 sufficiently small.
"x approaches "a" from the left", also "x approaches "a" from below", denoted by x → a^− or x ↗ a, means a − δ < x < a for δ > 0 sufficiently small.
Definition 20.1. (one-sided limits)
a) L is called the right limit of f at "a", or the limit of f(x) as x approaches "a" from the right (or from above), denoted by
lim_{x→a^+} f(x) = L  or  lim_{x↘a} f(x) = L.
b) L is called the left limit of f at "a", or the limit of f(x) as x approaches "a" from the left (or from below), denoted by
lim_{x→a^−} f(x) = L  or  lim_{x↗a} f(x) = L.
If the left limit and the right limit of f at "a" both exist and are equal to the same value L, then the limit of f at "a" exists and equals that value L:
lim_{x→a} f(x) = L.
Example 20.1. The function f(x) = √x is defined only for x ≥ 0. As x approaches 0 from the right, the value of √x tends to 0: lim_{x↘0} √x = 0.
Example 20.3. Step and staircase function. The step function is defined as:
step(x) = 0 if x < 0;  1/2 if x = 0;  1 if x > 0.
The translated step function step(x − a) has its step at the point "a", where it has the two one-sided limits
lim_{x↗a} step(x − a) = 0  and  lim_{x↘a} step(x − a) = 1.
At each step point, the staircase function has a left limit and a right limit which are different, and also not equal to the value of the function S_m at that point. At all other points the left limits and the right limits coincide, and therefore the two-sided limits exist at x ≠ n.
Definition 20.2. A function f : A ⊂ R1 → R1 is increasing if x1 , x2 ∈ A, x1 < x2 ⇒
f (x1 ) ≤ f (x2 ).
Definition 20.3. A function f : A ⊂ R1 → R1 is decreasing if x1 , x2 ∈ A, x1 < x2 ⇒
f (x1 ) ≥ f (x2 ).
Definition 20.4. A function f : A ⊂ R1 → R1 is monotone if it is an increasing or a decreasing function.
Consider the set S_{x_0} = {f(x) | x ∈ A, x < x_0}. If f is increasing, then S_{x_0} is bounded above by f(x_0), and if f is decreasing, then S_{x_0} is bounded below by f(x_0).
If f is increasing, then the least upper bound of S_{x_0}, i.e. sup S_{x_0}, is the left limit of f at x_0, and if f is decreasing, the greatest lower bound of S_{x_0}, i.e. inf S_{x_0}, is the left limit of f at x_0.
In this way it was shown that for an increasing function f the left limit at x_0 exists and equals sup S_{x_0}. Considering the set R_{x_0} = {f(x) | x > x_0}, it can be proven in a similar way that if f increases, then
lim_{x→x_0^+} f(x) = inf R_{x_0}.
21 Infinite limits
"Infinity" (±∞) is a mathematical symbol, not a number which is subject to arithmetic operations.
a) lim_{x→0} 1/x² = +∞.
b) lim_{x→0^−} 1/x = −∞ and lim_{x→0^+} 1/x = +∞.
Example 21.2. A function can have, at a point, a finite one-sided limit as the point is approached from one side, and an infinite one-sided limit as the point is approached from the other side.
For example, the function
h(x) = 0 if x ≤ 0;  1/x if x > 0
satisfies
lim_{x→0^−} h(x) = 0,  lim_{x→0^+} h(x) = +∞.
Definition 21.2. (limits at infinity) The number L is the limit of f(x) as x approaches +∞, denoted by
lim_{x→+∞} f(x) = L,
if for any ε > 0 there exists a number M (depending on ε) such that |f(x) − L| < ε for all x > M.
Example 21.3. The function f(x) = (1 − x²)/(1 + x + x²) has the following limits at infinity:
lim_{x→−∞} f(x) = −1  and  lim_{x→+∞} f(x) = −1.
• inf L_a(f) is called the inferior limit of f at "a" and is denoted by lim inf_{x→a} f(x):
lim inf_{x→a} f(x) := inf L_a(f).
• sup L_a(f) is called the superior limit of f at "a" and is denoted by lim sup_{x→a} f(x):
lim sup_{x→a} f(x) := sup L_a(f).
The following statement holds: The number L is the limit of f(x) as x approaches "a" if and only if
lim inf_{x→a} f(x) = lim sup_{x→a} f(x) = L.
Proof. Assume first that lim_{x→a} f(x) = L and consider a sequence (x_n) such that x_n ∈ A, x_n ≠ a and lim_{n→∞} x_n = a. For ε > 0 there exists δ > 0 such that 0 < |x − a| < δ ⇒ |f(x) − L| < ε.
Since lim_{n→∞} x_n = a, there is N such that |x_n − a| < δ for n > N. Hence |f(x_n) − L| < ε for n > N.
Therefore lim_{n→∞} f(x_n) = L. We obtain in this way that L_a(f) = {L}, and consequently lim inf_{x→a} f(x) = lim sup_{x→a} f(x) = L.
Assume now that lim inf_{x→a} f(x) = lim sup_{x→a} f(x) = L and suppose that L is not the limit of f(x) as x approaches "a". Then there exists ε_0 > 0 such that for every n ∈ N there exists x_n with |x_n − a| < 1/n and |f(x_n) − L| > ε_0. On the other hand, the sequence (f(x_n)) is bounded and has a subsequence (f(x_{n_k})) which has a limit. It is clear that lim_{k→∞} f(x_{n_k}) is different from L.
Hence L_a(f) contains at least two elements, contradicting L_a(f) = {L}.
Example 22.1. If f(x) = sin(1/x) for x ∈ R^1 \ {0}, then L_0(f) = [−1, 1].
23 Continuity
To check that a function f is continuous at "a", one verifies: first, that f(a) is defined; then, that lim_{x→a} f(x) exists; and finally that the previous two values are equal.
The ε, δ definition of lim_{x→a} f(x) = L can be easily adapted to give the following ε, δ definition of continuity at "a".
Definition 23.2. The function f is continuous at ”a” if and only if for every ε > 0, there
exists δ > 0 such that:
|x − a| < δ ⇒ |f (x) − f (a)| < ε.
Example 23.1. Use the ε, δ definition of continuity to prove that f(x) = x² is continuous at a = 0.
Solution: For any ε > 0, determine those x for which |f(x) − f(0)| < ε. Now |f(x) − f(0)| = |x² − 0| = |x²| < ε provided |x| < √ε. So let δ = √ε. If |x − 0| < δ, then |f(x) − f(0)| < ε. In other words, lim_{x→0} f(x) = 0 and, since f(0) = 0, lim_{x→0} f(x) = f(0).
Hence, f is continuous at 0.
If a function f is continuous for all x in the range a < x < b, then it can be said that f
is continuous on the interval (a, b).
If f is continuous for all x in its domain, it can be simply said that f is continuous.
Definition 23.3. If lim_{x→a^+} f(x) exists and equals f(a), then f is called right-continuous at a.
Definition 23.4. If lim_{x→a^−} f(x) exists and equals f(a), then f is called left-continuous at a.
The ε, δ formulations of the last two definitions are not difficult to write down. Moreover, the following result holds:
A function f is continuous at "a" if and only if it is both left-continuous and right-continuous at "a".
Note: If a function f is only defined on the closed interval [a, b] and it is claimed that "f is continuous on [a, b]", what is meant is that f is continuous on (a, b), right-continuous at "a" and left-continuous at "b".
Proof. The result is obtained from Heine's criterion for the limit.
Proof. The result is obtained from the Cauchy-Bolzano criterion for the limit.
24 Rules for continuity
Reciprocal rule: If f is continuous at "a" and f(a) ≠ 0, then 1/f is continuous at "a".
Composite rule: Let f and g be continuous at "a" and at f(a), respectively. Then g ∘ f is continuous at "a".
Proof. Let f(a) = b. Since g is continuous at b, for every ε > 0 there exists a δ_1 > 0 such that
|t − b| < δ_1 ⇒ |g(t) − g(b)| < ε.
Since f is continuous at "a", for this δ_1 > 0 there exists a δ_2 > 0 such that
|x − a| < δ_2 ⇒ |f(x) − f(a)| < δ_1.
Now we deduce that
|x − a| < δ_2 ⇒ |g(f(x)) − g(f(a))| < ε.
Hence, for any ε > 0, there exists a δ = δ_2 > 0 such that
|x − a| < δ ⇒ |(g ∘ f)(x) − (g ∘ f)(a)| < ε.
Example 24.1. Given that the identity function x ↦ x, the constant function x ↦ k and the trigonometric functions sine and cosine are all continuous, the following are proved to be continuous functions:
a) x ↦ (x² + 2x + 3)/(x² + x + 1),
b) x ↦ x³·cos x²,
c) x ↦ x·sin(1/x) for x ≠ 0, with the value 0 for x = 0.
The extension by continuity: If f : A ⊂ R^1 → R^1 is not defined at the point "a" but L = lim_{x→a} f(x) exists, then the function g : A ∪ {a} ⊂ R^1 → R^1, defined by g(x) = f(x) for x ≠ a and g(a) = L, is continuous at "a".
The function g is called the extension by continuity of the function f.
The results in this section show that the definition of continuity leads naturally to the
intuitive geometric interpretation which is used when sketching the graphs of continuous
functions.
The boundedness property: Let f be continuous on the interval [a, b]. Then (1) f is bounded on [a, b], and (2) f attains its bounds on [a, b].
Comment 25.1. What is being said is that, for (1), there exist numbers m and M such that
m ≤ f(x) ≤ M for all x ∈ [a, b].
For (2), if m and M are chosen to be the infimum and supremum, respectively, of the set {f(x) | a ≤ x ≤ b}, then (2) claims that there are numbers c and d in [a, b] such that m = f(c) and M = f(d).
Hence f is bounded on [a, a + δ] and so c ≥ a + δ/2 > a. We want to show that c = b.
Suppose that c < b. Since c > a, f is continuous at c. Then, for ε = 1 there exists a δ′ > 0 such that
|x − c| < δ′ ⇒ |f(x)| ≤ 1 + |f(c)|.
In other words, f is bounded on [c, c + δ′]. But then c + δ′/2 ∈ B, and this contradicts c being the supremum of B. Thus c = b and so f is bounded on [a, b], as required.
(2) Since f is bounded on [a, b], the set A = {f(x) | a ≤ x ≤ b} is bounded both above and below. Let m = inf A and M = sup A. Suppose that there is no x ∈ [a, b] such that f(x) = M, and define g(x) = 1/(M − f(x)) for x ∈ [a, b].
Now g is continuous on [a, b] by the sum and reciprocal rules. By (1), g is bounded on [a, b].
Let such a bound be K > 0. Then:
g(x) ≤ K ⇒ 1/(M − f(x)) ≤ K ⇒ 1/K ≤ M − f(x) ⇒ f(x) ≤ M − 1/K.
This contradicts the fact that M is the least upper bound for f on [a, b], so the assumption that f never takes the value M is false. Hence f attains its upper bound on [a, b]. A similar argument shows that f attains its lower bound somewhere on [a, b].
The intermediate value property: Let f be continuous on [a, b] and suppose that f(a) = α and f(b) = β. For every real number γ between α and β there exists a number c, a < c < b, with f(c) = γ.
Comment 25.2. Here it is being said that, if f takes the values α and β somewhere on the interval [a, b], then f must take all possible values between α and β.
Proof of the intermediate value property. Suppose α < γ < β and let
S = {x ∈ [a, b] | f(x) < γ}.
The set S is non-empty, since it contains "a". Let c = sup S. It is clear that a < c < b.
If f(c) < γ, then, for ε = γ − f(c) > 0, there exists a δ > 0 such that
|x − c| < δ ⇒ |f(x) − f(c)| < ε.
In particular,
|f(c + δ/2) − f(c)| < ε
and so
f(c + δ/2) − f(c) < γ − f(c).
But then f(c + δ/2) < γ and hence c + δ/2 ∈ S, which contradicts the fact that c is the supremum of S. Hence f(c) ≥ γ.
If f(c) > γ then, for ε = f(c) − γ > 0, there exists a δ > 0 such that
|x − c| < δ ⇒ |f(x) − f(c)| < ε.
Hence c − δ < x ≤ c ⇒ f(x) > γ, and so such x do not lie in S. In other words, sup S ≤ c − δ, which in turn contradicts the definition of c.
It follows that f(c) = γ. Hence γ is a value of f.
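The intermediate value property is also the basis of a practical root-finding algorithm, bisection: repeatedly halve an interval on which f changes sign. A sketch (the polynomial x³ − 2x − 5 is a sample choice):

```python
# Bisection: if f is continuous on [a, b] and f(a), f(b) have opposite
# signs, halving the interval repeatedly traps a point c with f(c) = 0
# (the case gamma = 0 of the intermediate value property).

def bisect(f, a, b, tol=1e-12):
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "need a sign change on [a, b]"
    while b - a > tol:
        m = (a + b) / 2
        if fa * f(m) <= 0:
            b = m            # sign change kept in [a, m]
        else:
            a, fa = m, f(m)  # sign change kept in [m, b]
    return (a + b) / 2

root = bisect(lambda x: x**3 - 2 * x - 5, 2, 3)
print(root)  # ~ 2.0945514815
```

Each iteration halves the interval, so about 40 iterations suffice for 12 digits starting from an interval of length 1.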
The intermediate value property has many applications and the following example
illustrates one of these.
Example 25.1. Any polynomial of odd degree has at least one real root.
Solution: We may assume the polynomial is monic, P(x) = x^n + a_{n−1}x^{n−1} + · · · + a_1x + a_0, with n odd. Let M = |a_{n−1}| + · · · + |a_1| + |a_0| and set
r(x) = P(x)/x^n − 1, x ≠ 0.
Now
|r(x)| = |P(x)/x^n − 1| = |a_{n−1}/x + · · · + a_1/x^{n−1} + a_0/x^n| ≤ |a_{n−1}/x| + · · · + |a_1/x^{n−1}| + |a_0/x^n|.
Hence |r(x)| < 1 for |x| > 1 + M. In particular, 1 + r(x) > 0 for |x| > 1 + M. Hence P(x) = x^n(1 + r(x)) has the same sign as x^n for |x| > 1 + M. Since n is odd, there exist α, β ∈ R^1 with P(α) > 0 (choose α > 1 + M) and P(β) < 0 (choose β < −(1 + M)).
By the intermediate value property, P(γ) = 0 for some γ with |γ| < 1 + M. Incidentally, this shows that P has a zero in the interval (−(1 + M), 1 + M). In fact all the real zeros of P lie in this interval.
Theorem 25.1 (The interval theorem). Let f be continuous on I = [a, b]. Then f(I) is a closed bounded interval.
Comment 25.3. The claim here is that continuous functions map intervals onto intervals. This means that the intuitive picture of a continuous function as one having an unbroken graph is sound.
Proof of the interval theorem. By the boundedness property there exist numbers c and d in I such that f(c) = m_0, f(d) = M_0 and m_0 ≤ f(x) ≤ M_0 for all x ∈ I.
Suppose, for simplicity, that c ≤ d. Apply the intermediate value property to f on the subinterval [c, d] to deduce that f takes all possible values between f(c) = m_0 and f(d) = M_0. In other words, f(I) = [m_0, M_0].
Theorem 25.2 (The fixed point theorem). Let f : [a, b] → [a, b] be a continuous function. Then there is at least one number c which is fixed by f, that is, f(c) = c.
Comment 25.4. This result says that the graph of f, in proceeding continuously from (a, f(a)) to (b, f(b)), must cross the line y = x.
Proof of the fixed point theorem. Let g : [a, b] → R^1 be defined by g(x) = f(x) − x. Since the identity function x ↦ x is continuous on [a, b], the function g is continuous on [a, b].
If f(a) = a or f(b) = b, then there is nothing to prove. So it is assumed that f(a) ≠ a and f(b) ≠ b. Since f maps [a, b] into [a, b], g(a) > 0 and g(b) < 0.
The intermediate value property applied to g on the interval [a, b] implies that g(c) = 0 for some c, a < c < b. Hence f(c) = c.
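The proof suggests a computation: apply bisection to g(x) = f(x) − x to locate a fixed point. The choice f = cos on [0, 1] is a sample; cos maps [0, 1] into [cos 1, 1] ⊂ [0, 1], so the theorem applies.

```python
import math

# Locate a fixed point of a continuous f : [a, b] -> [a, b] by
# bisection on g(x) = f(x) - x, as in the proof above.

def fixed_point(f, a, b, tol=1e-12):
    g = lambda x: f(x) - x
    # invariant: g(a) >= 0 and g(b) <= 0, since f maps [a, b] into itself
    while b - a > tol:
        m = (a + b) / 2
        if g(m) > 0:
            a = m
        else:
            b = m
    return (a + b) / 2

c = fixed_point(math.cos, 0.0, 1.0)
print(c, math.cos(c))  # both ~ 0.7390851
```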
Theorem 25.3 (The continuity of the inverse function). Suppose that f : A → B is a
bijection where A and B are intervals. If f is continuous on A, then f −1 is continuous
on B.
Proof. Consider the continuous bijection f : A → B where A and B are intervals. First
it is shown that f is either strictly increasing or strictly decreasing. If f is neither strictly
increasing nor decreasing, then without loss of generality, there are numbers a1 , a2 and
a3 such that a1 < a2 < a3 and f (a1 ) < f (a3 ) < f (a2 ). Apply the intermediate value
theorem to f on the interval [a1 , a2 ] to deduce that f (c) = f (a3 ) for some c ∈ (a1 , a2 ).
This contradicts the fact that f is a bijection. For the rest of the proof it is assumed that
f is strictly increasing, the proof for the strictly decreasing case being similar.
Hence f −1 is strictly increasing. Let b ∈ B and f −1 (b) = a, so that f (a) = b. For every
ε > 0, f maps the interval I = [a − ε, a + ε] onto some interval f (I) = [m, M ]. Since f
is strictly increasing, m < b < M, so let δ be the minimum of b − m and M − b. Clearly
δ > 0.
Now [b − δ, b + δ] is a subset of [m, M ] = f (I) and so f −1 maps [b − δ, b + δ] into
I = [a − ε, a + ε]. Thus given any ε > 0, there exists a δ > 0 such that
|y − b| < δ ⇒ |f −1 (y) − f −1 (b)| < ε.
Let A ⊂ R^1 and let f_1, f_2, . . . , f_n, . . . be a sequence of real valued functions f_n : A → R^1. We will denote this sequence by (f_n).
Let B ⊂ A be the set of convergence of the sequence (f_n). For x ∈ B we denote by f(x) the limit
f(x) = lim_{n→∞} f_n(x).
If B = A, we say that (f_n) converges pointwise to f on A, written: f_n → f on A.
n→∞
Comment 26.1. If in the above definition N depends only on ε and does not depend on
x, then we say that the sequence (fn ) converges uniformly to f on A.
Definition 26.4. The sequence (fn) is uniformly convergent on A to f if for any ε > 0,
there exists N(ε) such that for n > N(ε) and x ∈ A we have
|fn(x) − f(x)| < ε.
If the sequence (fn) is uniformly convergent to f we will write fn →ᵘ f (n → ∞).
Example 26.1. A = [0, 1], fn(x) = x^n,
f(x) = 1 for x = 1 and f(x) = 0 for x ∈ [0, 1);
then fn → f (n → ∞) but not fn →ᵘ f.
Example 26.2. A = [0, 2π], fn(x) = (sin nx)/n, f(x) = 0, and fn →ᵘ f (n → ∞).
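The contrast between Examples 26.1 and 26.2 can be checked numerically by estimating sup_{x∈A} |fn(x) − f(x)| on a finite grid. This is an illustrative sketch only; the grid size and the sampled values of n are arbitrary choices, not part of the examples themselves.

```python
import math

def sup_error(f_n, f, xs):
    # Approximate sup |f_n(x) - f(x)| over a finite grid xs.
    return max(abs(f_n(x) - f(x)) for x in xs)

# Example 26.1: f_n(x) = x^n on [0, 1); pointwise limit f = 0 there.
xs = [i / 1000 for i in range(1000)]  # grid on [0, 1)
err_pointwise = [sup_error(lambda x, n=n: x ** n, lambda x: 0.0, xs)
                 for n in (10, 100, 1000)]
# The grid supremum does not shrink toward 0: convergence is not uniform.

# Example 26.2: f_n(x) = sin(n x)/n on [0, 2*pi]; limit f = 0.
ys = [2 * math.pi * i / 1000 for i in range(1001)]
err_uniform = [sup_error(lambda x, n=n: math.sin(n * x) / n, lambda x: 0.0, ys)
               for n in (10, 100, 1000)]
# Here sup |f_n - f| <= 1/n, which tends to 0: convergence is uniform.
```

The bound 1/n in the second case does not depend on x, which is exactly what uniform convergence requires.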
Proof. Assume first that fn →ᵘ f (n → ∞). For ε > 0 there exists Nε such that for p ≥ Nε we
have
|fp(x) − f(x)| < ε/2 for any x ∈ A.
Hence, for any n, m > Nε and any x ∈ A,
|fn(x) − fm(x)| ≤ |fn(x) − f(x)| + |f(x) − fm(x)| < ε/2 + ε/2 = ε.
Assume now that for any ε > 0 there exists Nε such that for n, m > Nε and x ∈ A we
have
|fn(x) − fm(x)| < ε,
and show that there exists f : A ⊂ R1 → R1 such that fn →ᵘ f (n → ∞).
By hypothesis, for each x ∈ A the sequence of real numbers (fn(x)) is a Cauchy sequence, hence it converges. Let
f(x) = lim_{n→∞} fn(x). We obtain in this way a function f : A ⊂ R1 → R1, and the sequence
of functions (fn) converges at every point x ∈ A to f. Now let ε > 0 and choose Nε such that for
n, m > Nε and x ∈ A we have |fn(x) − fm(x)| < ε.
Fix n > Nε and x ∈ A; letting m → ∞ in this inequality gives |fn(x) − f(x)| ≤ ε.
Since n > Nε and x ∈ A were arbitrary, it follows that fn →ᵘ f (n → ∞).
Proof. Let ε > 0. Since an → 0, there exists Nε such that for any n ≥ Nε we have
an < ε. It follows that
|fn(x) − f(x)| ≤ an < ε
for n ≥ Nε and x ∈ A, i.e. fn →ᵘ f (n → ∞).
The following statement shows that uniform convergence preserves continuity.
Proposition 27.1. Let (fn) be a sequence of functions fn : A ⊂ R1 → R1 which converges
uniformly to f : A ⊂ R1 → R1 (fn →ᵘ f, n → ∞). If all the functions fn are continuous at a
point a ∈ A, then f is continuous at a.
Proof. Let ε > 0. Since fn →ᵘ f (n → ∞), there exists Nε such that for any n ≥ Nε and
x ∈ A we have
|fn(x) − f(x)| < ε/3.
In particular, we have
|fNε(a) − f(a)| < ε/3.
Since fNε is continuous at a, there exists δε > 0 such that |x − a| < δε ⇒ |fNε(x) −
fNε(a)| < ε/3. Therefore, for any x ∈ A with |x − a| < δε we have
|f(x) − f(a)| ≤ |f(x) − fNε(x)| + |fNε(x) − fNε(a)| + |fNε(a) − f(a)| < ε/3 + ε/3 + ε/3 = ε.
That means that f is continuous at a.
If the functions fn are uniformly continuous on A, then δ does not depend on x. Hence:
"(fn) is a sequence of uniformly continuous functions on A" means:
for any n ∈ N and any ε > 0, there exists δ = δ(n, ε) > 0 such that
|x′ − x′′| < δ ⇒ |fn(x′) − fn(x′′)| < ε for any x′, x′′ ∈ A.
It is possible that δ does not depend on n, but only on x and ε. In this case the
sequence (fn) is a sequence of equicontinuous functions. More precisely:
The sequence (fn) is equicontinuous on A if for any x ∈ A and
ε > 0 there exists δ = δ(x, ε) > 0 such that for any n ∈ N
|x′ − x| < δ ⇒ |fn(x′) − fn(x)| < ε for any x′ ∈ A.
If δ is independent of both x and n, then the functions of the sequence (fn) are both uniformly
continuous and equicontinuous; they are uniformly equicontinuous. More precisely:
(fn) is a sequence of uniformly equicontinuous functions on A if for any ε > 0 there is
a δ = δ(ε) > 0 such that
|x′ − x′′| < δ ⇒ |fn(x′) − fn(x′′)| < ε
for any n ∈ N and x′, x′′ ∈ A.
Theorem 28.1 (Arzelà–Ascoli). Let I = [a, b] be a closed interval and (fn) a sequence of
functions fn : I → R1.
If (fn) is a sequence of equicontinuous and uniformly bounded functions, then (fn) contains
a subsequence (fnk) which is uniformly convergent on I.
A point a ∈ A is called a point of convergence of the series of functions Σ_{n=1}^∞ fn if the series
converges at a.
The collection of all the points of convergence of the series is called the set of convergence
of the series Σ_{n=1}^∞ fn.
Let B ⊂ A be the set of convergence of the series Σ_{n=1}^∞ fn. For x ∈ B we denote by S(x)
the sum
S(x) = Σ_{n=1}^∞ fn(x),   S : B ⊂ A ⊂ R1 → R1.
The function S defined above is called the sum function, on the set B, of the series Σ_{n=1}^∞ fn.
We will say that the series Σ_{n=1}^∞ fn converges to S on B, and we will write
S = Σ_{n=1}^∞ fn, for x ∈ B.
Definition 29.2. Let Σ_{n=1}^∞ fn be a series of functions defined on A, and S a function
defined on B ⊂ A. The series Σ_{n=1}^∞ fn converges to S on B if for any x ∈ B and any ε > 0
there exists N = N(x, ε) > 0 such that for any n > N we have
|f1(x) + f2(x) + · · · + fn(x) − S(x)| < ε.
If the series Σ_{n=1}^∞ fn converges absolutely on B, then it converges on B.
Example 29.1. a) For n ≥ 0 consider
fn(x) = x² / (1 + x²)^n
and the series Σ_{n=0}^∞ fn(x).
The set of convergence of this series is R1 and the sum of the series is
S(x) = 1 + x² for x ≠ 0, and S(x) = 0 for x = 0.
b) For n ≥ 1 consider
fn(x) = (sin^n x) / n²
and the series Σ_{n=1}^∞ fn.
The series is absolutely convergent on R1.
The series is uniformly convergent on R1.
c) For n ≥ 1 consider fn(x) = cos^n x and the series Σ_{n=1}^∞ fn.
The set of convergence is R1 \ {k·π}_{k∈Z}.
The series is absolutely convergent on the set of convergence.
d) For n ≥ 1 consider the functions fn(x) = e^{n·|x|}/n and the series Σ_{n=1}^∞ fn.
The set of convergence of the series is empty.
Definition 30.1. The series of functions Σ_{n=k+1}^∞ fn is called the remainder of order k of
the series Σ_{n=1}^∞ fn.
1st Criterion: The series Σ_{n=1}^∞ fn converges if and only if the remainder of any order k
of the series converges.
Proof. Consider
Sk = f1 + f2 + · · · + fk
and
σp = fk+1 + fk+2 + · · · + fk+p
and remark that
Sk+p = Sk + σp.
Therefore, the sequence (Sk+p) converges as p → ∞ if and only if the sequence (σp)
converges as p → ∞.
2nd Criterion: The series Σ_{n=1}^∞ fn converges if and only if the sequence of the sums of its
remainders tends to 0.
Proof. Obvious.
3rd Criterion (Cauchy): The series Σ_{n=1}^∞ fn converges uniformly on A if and only if for
any ε > 0 there is N = N(ε) such that for n ≥ N and p ≥ 1 we have
|fn+1(x) + fn+2(x) + · · · + fn+p(x)| < ε for x ∈ A.
4th Criterion: Let Σ_{n=1}^∞ an be a convergent series of positive numbers.
If |fn(x)| ≤ an for x ∈ A and n ∈ N, then the series Σ_{n=1}^∞ fn is uniformly convergent on A.
Proof. Obvious.
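The 4th Criterion can be illustrated numerically on the series Σ sin^n x/n² from Example 29.1 b): |fn(x)| ≤ an = 1/n², so the distance between two partial sums is bounded by a tail of Σ 1/n², uniformly in x. A sketch (the grid and the truncation indices N, M are arbitrary choices):

```python
import math

def partial_sum(x, N):
    # S_N(x) = sum_{n=1}^{N} sin(x)^n / n^2
    return sum(math.sin(x) ** n / n ** 2 for n in range(1, N + 1))

N, M = 50, 2000
xs = [math.pi * i / 500 for i in range(1001)]  # grid on [0, 2*pi]
# Uniform bound from the 4th Criterion: |S_M(x) - S_N(x)| <= sum of a_n over the tail.
sup_err = max(abs(partial_sum(x, M) - partial_sum(x, N)) for x in xs)
tail = sum(1 / n ** 2 for n in range(N + 1, M + 1))
# sup_err <= tail, independently of x: the convergence is uniform.
```

The point of the criterion is that the bound `tail` involves only the majorant series, not x.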
31 Power Series
Definition 31.1. A series of functions of the form Σ_{n=0}^∞ an·x^n is called a power series.
Clearly, any power series converges when x = 0.
Theorem 31.1 (The set of convergence of a power series; Abel–Cauchy–Hadamard
theorem).
The power series Σ_{n=0}^∞ an·x^n is absolutely convergent for |x| < R (R is called the radius of
convergence), where R is given by
R = 1/ω if 0 < ω ≤ +∞,
R = +∞ if ω = 0,
and ω = lim_{n→∞} |an|^{1/n}.
Proof. Consider x0 ∈ R1 and the series Σ_{n=0}^∞ |an|·|x0|^n.
Apply the root test to this series and obtain:
If lim_{n→∞} |an|^{1/n}·|x0| < 1, then the series Σ_{n=0}^∞ an·x0^n is absolutely convergent. In other words,
the series Σ_{n=0}^∞ an·x0^n is absolutely convergent for |x0| < R, where R = 1/ω if 0 < ω ≤ +∞,
R = +∞ if ω = 0, and ω = lim_{n→∞} |an|^{1/n}.
Applying the same test, it follows that the series diverges for any x0 with |x0| > R.
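The Hadamard formula can be tried numerically: ω is approximated by |an|^{1/n} for one large n. A sketch, using the illustrative choice an = 1/n (for which R = 1) and an arbitrary cut-off index:

```python
def radius_estimate(a, n):
    # Root-test estimate: omega ~ |a_n|^(1/n), so R ~ 1/omega.
    omega = abs(a(n)) ** (1.0 / n)
    return 1.0 / omega

# For the series sum x^n / n we have a_n = 1/n and R = 1:
r = radius_estimate(lambda n: 1.0 / n, 10000)
# (1/n)^(1/n) -> 1 as n -> infinity, so the estimate approaches R = 1.
```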
For r ∈ (0, R) the series Σ_{n=0}^∞ |an|·r^n converges (x = r is a point at which the series
Σ_{n=0}^∞ an·x^n converges absolutely) and for x ∈ [−r, r] we have
|an·x^n| ≤ |an|·r^n,
so by the 4th Criterion the series converges uniformly on [−r, r].
a) The series Σ_{n=0}^∞ x^n converges absolutely for |x| < 1 and diverges for |x| > 1. The
radius of convergence is R = 1; the convergence set is (−1, 1).
b) For the series Σ_{n=1}^∞ x^n/n the radius of convergence R is R = 1. The convergence set is
[−1, 1).
c) The set of convergence of the series Σ_{n=1}^∞ (−1)^n·x^n/n is (−1, 1].
d) The series Σ_{n=1}^∞ x^n/n^α (α > 1) is absolutely convergent on [−1, 1].
Proof. Let x0 ∈ (−R, R). There exists r ∈ (0, R) such that −R < −r < x0 < r < R.
Since on the closed interval [−r, r] the series converges uniformly, and the terms of the
series are continuous functions, the sum S is continuous on [−r, r]. In particular, it is
continuous at x0 .
• The sum S of the power series Σ_{n=0}^∞ an·x^n is uniformly continuous on any compact
interval contained in (−R, R).
Let Σ_{n=0}^∞ an·x^n and Σ_{n=0}^∞ bn·x^n be power series with sums f(x) and g(x) for |x| < R1.
From them one can form:
- the sum Σ_{n=0}^∞ (an + bn)·x^n;
- the scalar product Σ_{n=0}^∞ k·an·x^n;
- the Cauchy product Σ_{n=0}^∞ cn·x^n, where cn = Σ_{k=0}^n ak·bn−k.
For |x| < R1 we have:
Σ_{n=0}^∞ (an + bn)·x^n = f(x) + g(x)
Σ_{n=0}^∞ (k·an)·x^n = k·f(x)
Σ_{n=0}^∞ cn·x^n = f(x)·g(x).
Proof. These claims concerning the sum and scalar product follow from the sum and scalar
product rules for series. To establish the Cauchy product result, note that Σ_{n=0}^∞ an·x^n and
Σ_{n=0}^∞ bn·x^n are absolutely convergent for |x| < R1. Since cn·x^n = Σ_{k=0}^n (ak·x^k)·(bn−k·x^{n−k}),
the series Σ_{n=0}^∞ cn·x^n is absolutely convergent for |x| < R1, and has the sum stated.
Much of the preceding discussion can be modified to apply to series of the form
Σ_{n=0}^∞ an·(x − a)^n.
33 Differentiable functions
Figure 33.1:
Example 33.1. The function f(x) = x² is differentiable for all x.
Solution: Consider (f(x) − f(c))/(x − c) for any x ≠ c, where c is fixed. Now
lim_{x→c} (f(x) − f(c))/(x − c) = lim_{x→c} (x² − c²)/(x − c) = lim_{x→c} (x + c) = 2c.
Hence, f is differentiable at c and f′(c) = 2c. Since c was arbitrary, the derivative function
f′ can be defined as
f′(x) = 2x.
Example 33.2. The function f(x) = |x| is not differentiable at c = 0.
Solution: Consider
(f(x) − f(0))/(x − 0) = |x|/x = 1 for x > 0, and −1 for x < 0.
Hence
lim_{x→0+} (f(x) − f(0))/(x − 0) = 1 and lim_{x→0−} (f(x) − f(0))/(x − 0) = −1.
Since these right and left limits differ, f is not differentiable at 0.
It is easy to show that f is differentiable for all x ≠ 0, with f′(x) = 1 for x > 0 and
f′(x) = −1 for x < 0.
In general, points where f is not differentiable can often be detected by examining the
left and right limits of (f(x) − f(c))/(x − c) as x → c.
The left limit lim_{x→c−} (f(x) − f(c))/(x − c) is called the left derivative of f at c and is denoted by
f′₋(c).
Similarly, the right limit lim_{x→c+} (f(x) − f(c))/(x − c) is called the right derivative of f at c and is
denoted by f′₊(c).
Clearly, f′(c) exists if and only if f′₋(c) and f′₊(c) both exist and are equal.
The next result establishes that only continuous functions can be differentiable.
Theorem 33.1. If f is differentiable at c, then f is continuous at c.
Note that Example 33.2 shows that there are continuous functions which are not
differentiable.
The following table gives certain elementary functions and their derivatives.
Function f                      Derivative f′
f(x) = k (a constant)           f′(x) = 0
f(x) = x^n, n ∈ N               f′(x) = n·x^{n−1}
f(x) = √x                       f′(x) = 1/(2√x)
f(x) = sin x                    f′(x) = cos x
f(x) = cos x                    f′(x) = −sin x
f(x) = tan x                    f′(x) = 1/cos² x
f(x) = cot x                    f′(x) = −1/sin² x
f(x) = e^x                      f′(x) = e^x
f(x) = ln x                     f′(x) = 1/x
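The table can be spot-checked with a symmetric difference quotient. This numerical sketch uses only the standard library; the test point x = 0.5 and the step h are arbitrary choices.

```python
import math

def central_diff(f, x, h=1e-6):
    # Symmetric difference quotient approximating f'(x).
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.5
checks = [
    (math.sin, math.cos(x)),              # (sin x)'  = cos x
    (math.exp, math.exp(x)),              # (e^x)'    = e^x
    (math.sqrt, 1 / (2 * math.sqrt(x))),  # (sqrt x)' = 1/(2*sqrt(x))
    (math.log, 1 / x),                    # (ln x)'   = 1/x
]
max_gap = max(abs(central_diff(f, x) - d) for f, d in checks)
# max_gap is tiny: the table rows match the numerical derivatives.
```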
34 Rules of differentiability
Proof of product rule. For x ≠ c
((f·g)(x) − (f·g)(c))/(x − c) = f(x)·(g(x) − g(c))/(x − c) + g(c)·(f(x) − f(c))/(x − c).
Therefore
((f·g)(x) − (f·g)(c))/(x − c) → f′(c)·g(c) + f(c)·g′(c) as x → c.
The product and the reciprocal rules can be combined as follows, to give the quotient
rule.
Quotient rule. If f and g are differentiable at c and g(c) ≠ 0, then f/g is differentiable
at c and
(f/g)′(c) = (f′(c)·g(c) − f(c)·g′(c)) / [g(c)]².
Example 34.1. Use the above rules to prove that each of the following functions is
differentiable at the points indicated:
i) f (x) = x2 + sin x, x ∈ R1 ;
Squeeze rule Let f , g and h be three functions such that g(x) ≤ f (x) ≤ h(x) for all x
in some neighborhood of c and such that g(c) = f (c) = h(c). If g and h are differentiable
at c, then so is f and f 0 (c) = g 0 (c) = h0 (c).
Proof. Let
Gc(x) = (g(x) − g(c))/(x − c) if x ≠ c, and Gc(c) = g′(c),
and
Hc(x) = (h(x) − h(c))/(x − c) if x ≠ c, and Hc(c) = h′(c).
Since g and h are differentiable at c, Gc and Hc are continuous at c. Let k(x) =
Gc (x) − Hc (x). Thus k is continuous at c. The earlier inequalities imply that if x > c,
then k(x) ≤ 0 and if x < c, then k(x) ≥ 0. Hence, k(c) = 0 and so Gc (c) = Hc (c). In
other words, g 0 (c) = h0 (c).
Example 34.2. The function f given by f(x) = x²·sin(1/x) for x ≠ 0 and f(0) = 0
can be squeezed between g(x) = −x² and h(x) = x² at x = 0. Since g and h are
differentiable at 0 with common derivative value 0, the squeeze rule gives that f is
differentiable at x = 0.
By other rules, f is also differentiable for x ≠ 0. Moreover,
f′(x) = 2x·sin(1/x) − cos(1/x) if x ≠ 0, and f′(0) = 0.
Note that lim_{x→0} f′(x) does not exist, so that f′(0) exists but f′ is not continuous at 0.
Proof. Let
Fc(x) = (f(x) − f(c))/(x − c) if x ≠ c, and Fc(c) = f′(c),
and, with b = f(c),
Gb(y) = (g(y) − g(b))/(y − b) if y ≠ b, and Gb(b) = g′(b).
Then Fc is continuous at x = c and, for all x,
f(x) − f(c) = Fc(x)·(x − c).
Now, for all y,
g(y) − g(b) = Gb(y)·(y − b),
so, taking y = f(x),
(g∘f)(x) − (g∘f)(c) = Gb(f(x))·(f(x) − f(c)) = Fc(x)·Gb(f(x))·(x − c).
So, for x ≠ c,
((g∘f)(x) − (g∘f)(c))/(x − c) = Fc(x)·Gb(f(x)).
The function on the right-hand side of the above equality is continuous at x = c. Hence
lim_{x→c} ((g∘f)(x) − (g∘f)(c))/(x − c) = Fc(c)·Gb(f(c)) = f′(c)·g′(f(c)),
as required.
The composite rule is often called the chain rule, and the formula given for the derivative of
a composite is more suggestive in Leibniz notation. Let ∆x = h and ∆y = f(x + h) − f(x).
Then
f′(x) = lim_{h→0} (f(x + h) − f(x))/h = lim_{∆x→0} ∆y/∆x.
The Leibniz notation for this limit is dy/dx. Write y = g(u) where u = f(x). Then
f′(x) = du/dx, g′(f(x)) = dy/du and (g∘f)′(x) = dy/dx. The chain rule can now be written
as
dy/dx = (dy/du)·(du/dx).
Example 34.3. Show that h(x) = sin x² is differentiable.
Solution: Let g(x) = sin x and f(x) = x²; then h = g∘f. Since f and g are everywhere
differentiable, the composite rule gives that h is differentiable. Moreover,
h′(x) = g′(f(x))·f′(x) = 2x·cos x².
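A quick numerical check of the chain-rule formula for h(x) = sin x² (the evaluation point and step are arbitrary choices):

```python
import math

def h(x):
    return math.sin(x * x)          # h = g o f with g = sin, f(x) = x^2

def h_prime(x):
    return 2 * x * math.cos(x * x)  # chain rule: g'(f(x)) * f'(x)

x, eps = 1.3, 1e-6
numeric = (h(x + eps) - h(x - eps)) / (2 * eps)  # symmetric difference quotient
gap = abs(numeric - h_prime(x))
# gap is tiny: the chain-rule formula matches the difference quotient.
```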
Inverse rule. Suppose that f : A → B is a continuous bijection where A and B are
intervals. If f is differentiable at a ∈ A and f′(a) ≠ 0, then f⁻¹ is differentiable at
b = f(a) and
(f⁻¹)′(b) = 1/f′(a).
Proof. Let y = f(x), b = f(a) and
Fa(x) = (f(x) − f(a))/(x − a) for x ≠ a, with Fa(a) = f′(a).
Consider
Gb(y) = (x − a)/(y − b) for y ≠ b,
so
Gb(y) = (x − a)/(f(x) − f(a)) = 1/Fa(x) = 1/(Fa∘f⁻¹)(y) for y ≠ b.
Since f is bijective and continuous, f⁻¹ is continuous too. Also f⁻¹(b) = a and Fa is
continuous at x = a.
Hence Fa∘f⁻¹ is continuous at y = b and
(Fa∘f⁻¹)(y) → (Fa∘f⁻¹)(b) = f′(a) as y → b.
So
(x − a)/(y − b) = Gb(y) = 1/(Fa∘f⁻¹)(y) → 1/(Fa∘f⁻¹)(b) = 1/f′(a) as y → b.
In other words
lim_{y→b} (f⁻¹(y) − f⁻¹(b))/(y − b) = 1/(f′∘f⁻¹)(b) = 1/f′(a).
If the function f : A → f(A) is differentiable on the interval A and f′(a) ≠ 0 for any
a ∈ A, then f⁻¹ is differentiable on f(A) and (f⁻¹)′(f(a)) = 1/f′(a).
Example 34.4. The function f : (0, ∞) → (0, ∞) given by f(x) = x² is a bijection.
Its inverse function is given by f⁻¹(x) = √x. Now f is differentiable for x > 0 and
f′(x) = 2x ≠ 0. Hence, by the inverse rule, f⁻¹(x) = √x is differentiable and
(f⁻¹)′(x) = 1/(2√x).
Example 34.5. The function g : (−π/2, π/2) → (−1, 1) given by g(x) = sin x is a bijection
with inverse given by g⁻¹(x) = arcsin x. Now g is differentiable and g′(x) = cos x ≠ 0 on
(−π/2, π/2), so by the inverse rule arcsin is differentiable on (−1, 1) and
(arcsin)′(x) = 1/cos(arcsin x) = 1/√(1 − x²).
35 Local extremum
In this section we present a result which helps to locate the local maxima and minima of
a differentiable function.
Proof. Consider the case of a local minimum at x = c. There is an open interval I containing c
such that f(x) − f(c) ≥ 0 for all x ∈ I. If x > c, then (f(x) − f(c))/(x − c) ≥ 0 and if x < c, then
(f(x) − f(c))/(x − c) ≤ 0. Thus f′₊(c) ≥ 0 and f′₋(c) ≤ 0. But f′(c) exists and so f′₊(c) = f′₋(c).
Thus f′(c) = 0.
Note that although f 0 must vanish at a local extremum this is not sufficient for such a
point. For example, consider the behavior of f (x) = x3 at x = 0. Here f 0 (0) = 0, but 0
is neither a local maximum, nor a local minimum.
Example 35.1. Locate the local maximum and local minimum of the function
f(x) = x·(x − 1)·(x − 2).
Proof. Since f is continuous on [a, b], it attains a maximum value f(c1) and a minimum
value f(c2) on [a, b] by the boundedness property.
If f(c1) = f(c2), then f is constant for all x ∈ [a, b], hence f′(x) = 0 for all x ∈ [a, b] and
the result follows.
If f(c1) ≠ f(c2), then at least one of c1 and c2 is not a or b. Hence f has a local maximum
or minimum (or both) inside the interval [a, b].
By the local extremum theorem f′ is zero at at least one point inside [a, b].
Theorem 36.2 (Mean value theorem, Lagrange). Let f be differentiable on (a, b) and
continuous on [a, b]. Then there exists c ∈ (a, b) such that
f′(c) = (f(b) − f(a))/(b − a).
Proof. Let g(x) = f(x) − λx, where λ = (f(b) − f(a))/(b − a). Then g is differentiable on (a, b) and
continuous on [a, b]. The choice of λ means that g(a) = g(b). Applying Rolle's theorem,
there is c ∈ (a, b) such that g′(c) = 0. Hence f′(c) − λ = 0, so f′(c) = (f(b) − f(a))/(b − a).
Theorem 36.3 (the increasing-decreasing theorem). If f is differentiable on (a, b) and
continuous on [a, b] then
(1) f′(x) > 0 for all x ∈ (a, b) implies f is strictly increasing on [a, b];
(2) f′(x) < 0 for all x ∈ (a, b) implies f is strictly decreasing on [a, b];
(3) f′(x) = 0 for all x ∈ (a, b) implies f is constant on [a, b].
Proof. Let x1, x2 ∈ [a, b] with x1 < x2. Since f satisfies the hypotheses of the mean value
theorem on the interval [x1, x2] we have
(f(x2) − f(x1))/(x2 − x1) = f′(c)
for some c, x1 < c < x2. In case (1), f′(c) > 0 and so f(x2) > f(x1). In other words, f is strictly
increasing on [a, b].
The proof of (2) and (3) is similar.
Comment 36.1. The increasing-decreasing theorem is useful for finding and classifying
local extrema and establishing inequalities between functions.
Example 36.1. Find and describe the local extrema of f(x) = x²·e⁻ˣ.
f′(x) = e⁻ˣ·(2 − x)·x.
Local extrema occur only when f′(x) = 0, so x = 0 or x = 2. Since e⁻ˣ > 0: if x < 0
then f′(x) < 0; if x ∈ (0, 2) then f′(x) > 0; and if x > 2 then f′(x) < 0. Thus f is
decreasing on (−∞, 0), increasing on (0, 2) and decreasing again on (2, +∞). This means
that x = 0 gives a local minimum and x = 2 gives a local maximum of f.
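The sign analysis in Example 36.1 can be confirmed numerically (the probe points and step are arbitrary choices):

```python
import math

def f(x):
    return x * x * math.exp(-x)        # f(x) = x^2 * e^(-x)

def f_prime(x):
    return math.exp(-x) * (2 - x) * x  # f'(x) from Example 36.1

# One probe point in each of the three monotonicity intervals:
signs_ok = f_prime(-1.0) < 0 and f_prime(1.0) > 0 and f_prime(3.0) < 0
# Local behaviour near the critical points:
local_min = all(f(0.0) < f(0.0 + s) for s in (-0.1, 0.1))
local_max = all(f(2.0) > f(2.0 + s) for s in (-0.1, 0.1))
```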
Example 36.2. Prove that ex ≥ 1 + x for all x.
Theorem 36.4 (Cauchy's mean value theorem). Let f and g be differentiable on (a, b)
and continuous on [a, b]. Then there exists c ∈ (a, b) such that
f′(c)/g′(c) = (f(b) − f(a))/(g(b) − g(a)),
provided that g′(x) ≠ 0 for all x ∈ (a, b).
Proof. First note that g(a) ≠ g(b); otherwise Rolle's theorem applied to g on [a, b]
would mean that g′ vanished somewhere on (a, b). Let h(x) = f(x) − λ·g(x) where
λ = (f(b) − f(a))/(g(b) − g(a)).
By the sum and product rules for continuity and differentiability and our choice of λ,
h satisfies all the hypotheses of Rolle's theorem. Hence there is c ∈ (a, b) such that
h′(c) = 0. This gives f′(c) = λ·g′(c) and the result now follows.
Theorem 36.5 (l'Hôpital's rule, version A). Let f and g satisfy the hypotheses of
Cauchy's mean value theorem and let x0 satisfy x0 ∈ (a, b). If f(x0) = g(x0) = 0,
then
lim_{x→x0} f(x)/g(x) = lim_{x→x0} f′(x)/g′(x).
Proof. Apply Cauchy's mean value theorem to f and g on the interval [x0, x] where
x0 < x ≤ b. Hence there exists c, x0 < c < x, such that
f′(c)/g′(c) = (f(x) − f(x0))/(g(x) − g(x0)) = f(x)/g(x).
Now
lim_{x→x0+} f(x)/g(x) = lim_{c→x0+} f′(c)/g′(c) = lim_{x→x0+} f′(x)/g′(x),
provided that the latter exists.
A similar argument applied on the interval [x, x0], where a ≤ x < x0, gives that:
lim_{x→x0−} f(x)/g(x) = lim_{x→x0−} f′(x)/g′(x),
again provided that the latter exists.
The rule now follows.
Example 36.3. Show that
lim_{x→0} (sin x)/x = 1.
Solution: The functions f(x) = sin x and g(x) = x satisfy the hypotheses of l'Hôpital's
rule. Moreover
lim_{x→0} (sin x)/x = lim_{x→0} (cos x)/1 = 1.
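Numerically, the quotient in Example 36.3 approaches the limit 1 from below as x shrinks (a quick sketch; the sample points are arbitrary):

```python
import math

# sin(x)/x for shrinking x approaches 1, matching l'Hopital's computation.
vals = [math.sin(x) / x for x in (0.1, 0.01, 0.001)]
gap = abs(vals[-1] - 1.0)
```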
Example 36.4. Show that
lim_{x→0} (1 + x)^{1/x} = e.
Solution: Write (1 + x)^{1/x} = e^{ln(1+x)/x}. By l'Hôpital's rule,
lim_{x→0} ln(1 + x)/x = lim_{x→0} (1/(1 + x))/1 = 1.
Hence
lim_{x→0} (1 + x)^{1/x} = e.
L'Hôpital's rule can be used to evaluate many indeterminate limits once they have been
expressed as the limit of a quotient of differentiable functions, provided of course that
lim_{x→x0} f′(x)/g′(x) can be evaluated.
Often this final limit is itself indeterminate (in other words f′(x0) = g′(x0) = 0) and it
may be tempting to apply l'Hôpital's rule again. But this requires that f′ and g′ are
themselves differentiable.
1. If f(x) = x^m, m ∈ N, then
f^(n)(x) = m!/(m − n)!·x^{m−n} for n ≤ m, and f^(n)(x) = 0 for n > m.
2. If f(x) = sin x then f^(n)(x) = sin(x + nπ/2) for n ≥ 1.
3. If f(x) = ln x then f^(n)(x) = (−1)^{n−1}·(n − 1)!/x^n for n ≥ 1.
(f·g)′ = f′·g + f·g′   (37.1)
(f·g)^(2) = f^(2)·g + 2·f′·g′ + f·g^(2)   (37.2)
(f·g)^(3) = f^(3)·g + 3·f^(2)·g′ + 3·f′·g^(2) + f·g^(3).   (37.3)
Theorem 37.2 (l'Hôpital's rule, version B). Let f and g be n times continuously
differentiable on the interval (a, b) and let x0 satisfy a < x0 < b. If
f(x0) = g(x0) = · · · = f^(n−1)(x0) = g^(n−1)(x0) = 0
and
g^(n)(x0) ≠ 0,
then
lim_{x→x0} f(x)/g(x) = f^(n)(x0)/g^(n)(x0).
Example 37.2. Prove that
lim_{x→0} (1 − cos x)/x² = 1/2
and
lim_{x→0} (1/x − cot x) = 0.
38 Taylor polynomials
Suppose that Σ_{n=0}^∞ an·x^n is a power series with radius of convergence R > 0. Let f(x) be
the sum of the series for |x| < R.
It can be proved that f is differentiable and that
f′(x) = Σ_{n=1}^∞ an·n·x^{n−1} for |x| < R.
Continually differentiating in this manner leads to
f^(k)(x) = Σ_{n=k}^∞ an·n·(n − 1)·. . .·(n − k + 1)·x^{n−k} for |x| < R.
If x = 0, then
ak = f^(k)(0)/k! for k = 1, 2, . . .
Thus the coefficient of x^n in any power series is f^(n)(0)/n!, where f(x) is the sum of the given
power series. Hence
f(x) = Σ_{n=0}^∞ f^(n)(0)/n!·x^n for |x| < R.
For small values of x, the sum f(x) can be approximated by the Taylor polynomials Tn f;
for example, for f(x) = e^x:
T0 f(x) = 1
T1 f(x) = 1 + x
T2 f(x) = 1 + x + x²/2
and so on.
The first result provides an estimate for the difference between f (b), the value of a given
function at x = b, and Tn f (b), the value of its Taylor polynomial of degree n at x = b.
Theorem 38.1 (The first remainder theorem). Let f be (n + 1) times continuously
differentiable on an open interval containing the points 0 and b. Then the difference
between f and Tn f at x = b is given by
f(b) − Tn f(b) = b^{n+1}/(n + 1)!·f^(n+1)(c)
for some c between 0 and b.
Proof. For simplicity assume that b > 0. Let
hn(x) = f(b) − Σ_{k=0}^n f^(k)(x)/k!·(b − x)^k, x ∈ [0, b].
Denote f(b) − Tn f(b) by Rn f(b) and call it the remainder term at x = b. Thus
f(b) = Tn f(b) + Rn f(b),
and so the error in approximating f(b) by Tn f(b) is given by the remainder term Rn f(b).
Since f (n+1) is continuous on a closed interval containing 0 and b, it is bounded on that
interval.
So there exists a number M such that |f^(n+1)(c)| ≤ M and so
|Rn f(b)| ≤ |b^{n+1}/(n + 1)!|·M.
Thus, for a fixed n, the remainder term will be small for b close to zero. In other words
Taylor polynomials provide good approximations of the function near x = 0. The next
example illustrates this.
Example 38.2. Let f(x) = sin x. Then
T7 f(x) = x − x³/3! + x⁵/5! − x⁷/7!,   R7 f(x) = x⁸/8!·f^(8)(c) = x⁸/8!·sin c
for some c between 0 and x. By the first remainder theorem
|R7 f(0.1)| ≤ 0.1⁸/8! ≈ 2.48·10⁻¹³.
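The estimate in Example 38.2 can be verified directly; a sketch comparing T7 f(0.1) with sin(0.1):

```python
import math

def t7_sin(x):
    # T7 f(x) = x - x^3/3! + x^5/5! - x^7/7!
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(4))

x = 0.1
err = abs(math.sin(x) - t7_sin(x))
bound = x ** 8 / math.factorial(8)  # bound from the first remainder theorem
# The actual error sits below the remainder bound ~ 2.48e-13.
```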
It can now be shown how Taylor polynomials can be used to generate power series
expansions for functions f which are infinitely differentiable on an open interval containing
0 and x. For such x we have:
f(x) = Tn f(x) + Rn f(x).
Now
lim_{n→∞} Tn f(x) = Σ_{k=0}^∞ f^(k)(0)/k!·x^k for |x| < R,
where R is the radius of convergence of the resulting power series.
If it can be shown that lim_{n→∞} Rn f(x) = 0 for |x| < R0 < R for some R0, then
f(x) = Σ_{n=0}^∞ f^(n)(0)/n!·x^n for |x| < R0.
Solution (for f(x) = e^x): Firstly
Tn f(x) = 1 + x/1! + x²/2! + · · · + x^n/n!
and Rn f(x) = x^{n+1}/(n + 1)!·e^c for some c between 0 and x. Now the series Σ_{n=0}^∞ x^n/n! is absolutely
convergent for all x by the ratio test, so for any fixed real number x
lim_{n→∞} Tn f(x) = Σ_{n=0}^∞ x^n/n!.
By the vanishing condition, x^n/n! → 0 as n → ∞. Thus
|Rn f(x)| = |x^{n+1}/(n + 1)!·e^c| → 0 as n → ∞.
In a similar manner, power series can be generated for all the standard functions. In each
case the sum of the power series is just lim_{n→∞} Tn f(x) and the range of validity is precisely
those x for which
lim_{n→∞} Rn f(x) = 0.
In deriving the following list, the trickiest part is establishing b):
e^x = Σ_{n=0}^∞ x^n/n!,  x ∈ R1   (38.1)
sin x = Σ_{n=0}^∞ (−1)^n·x^{2n+1}/(2n + 1)!,  x ∈ R1   (38.2)
cos x = Σ_{n=0}^∞ (−1)^n·x^{2n}/(2n)!,  x ∈ R1   (38.3)
(1 + x)^t = 1 + Σ_{n=1}^∞ t·(t − 1)·. . .·(t − n + 1)/n!·x^n,  t ∉ N, x ∈ (−1, 1)   (38.4)
ln(1 + x) = Σ_{n=1}^∞ (−1)^{n−1}·x^n/n,  −1 < x ≤ 1.   (38.5)
The form of the remainder found in the first remainder theorem is called the Lagrange
form.
The first few terms of the Maclaurin series for a given function f provide a good
approximation of f(x) close to 0.
But what happens when approximations for x close to some other real number a are
required? Polynomials must be considered not in powers of x but in powers of (x − a).
Differentiating with respect to t gives:
0 = f′(t) + (−f′(t) + f^(2)(t)/1!·(b − t)) + (−f^(2)(t)/1!·(b − t) + f^(3)(t)/2!·(b − t)²) + . . .
  + (−f^(n)(t)/(n − 1)!·(b − t)^{n−1} + f^(n+1)(t)/n!·(b − t)^n) + F′(t).
Cancelation now gives that
F′(t) = −f^(n+1)(t)/n!·(b − t)^n.
Apply Cauchy's mean value theorem to the functions F and G on the interval with
endpoints a and b, where G(t) = (b − t)^{n+1}. Thus, there is a number c between a and b
such that
(F(b) − F(a))/(G(b) − G(a)) = F′(c)/G′(c) = (−f^(n+1)(c)/n!·(b − c)^n)/(−(n + 1)·(b − c)^n).
Hence
(−(f(b) − Tn,a f(b)))/(−(b − a)^{n+1}) = (f^(n+1)(c)/n!·(b − c)^n)/((n + 1)·(b − c)^n)
or
f(b) − Tn,a f(b) = (b − a)^{n+1}/(n + 1)!·f^(n+1)(c).
The error in approximating f(b) by the polynomial Tn,a f(b) is just the remainder term:
Rn,a f(b) = (b − a)^{n+1}/(n + 1)!·f^(n+1)(c)
where c lies between a and b. The approximation is good for b close to a.
Just as before, power series can be generated in powers of (x − a), called Taylor series,
for suitable functions of x. The range of validity is again those x for which
lim_{n→∞} Rn,a f(x) = 0.
A form of Taylor’s theorem much used in numerical analysis is derived below and used to
round off the investigation of local extrema.
From Taylor's theorem, it follows that
f(x) = Tn,a f(x) + Rn,a f(x)
 = f(a) + f′(a)/1!·(x − a) + f^(2)(a)/2!·(x − a)² + · · · + f^(n)(a)/n!·(x − a)^n + (x − a)^{n+1}/(n + 1)!·f^(n+1)(c)
for some c between a and x.
Let x − a = h. Then c lies between a and a + h. Thus c = a + θ · h for some θ ∈ (0, 1).
Hence the following result holds:
f(a + h) = f(a) + h/1!·f′(a) + · · · + h^n/n!·f^(n)(a) + h^{n+1}/(n + 1)!·f^(n+1)(a + θ·h)
Suppose now that f′(a) = f^(2)(a) = · · · = f^(n)(a) = 0 and f^(n+1)(a) ≠ 0. Then:
(1) n + 1 even and f^(n+1)(a) > 0 implies that f has a local minimum at x = a.
(2) n + 1 even and f^(n+1)(a) < 0 implies that f has a local maximum at x = a.
(3) n + 1 odd implies that f has neither a local maximum nor a local minimum at x = a.
Proof. When f′(a) = f^(2)(a) = · · · = f^(n)(a) = 0, the formula above reduces to
f(a + h) − f(a) = h^{n+1}/(n + 1)!·f^(n+1)(a + θh)
where 0 < θ < 1. Since f^(n+1)(a) ≠ 0 and f^(n+1) is continuous, there is a δ > 0 such that
f^(n+1)(x) ≠ 0 for |x − a| < δ. Thus for all h satisfying |h| < δ, f^(n+1)(a + θh) has the
same sign as f^(n+1)(a), so f(a + h) − f(a) has the same sign as h^{n+1}·f^(n+1)(a) for all h,
|h| < δ.
(1) If n + 1 is even and f (n+1) (a) > 0, then f (a + h) − f (a) > 0 on the open interval
(a − δ, a + δ). Hence, x = a gives a local minimum of f .
(2) If n + 1 is even and f (n+1) (a) < 0, then f (a + h) − f (a) < 0 on the open interval
(a − δ, a + δ). Hence, x = a gives a local maximum of f .
(3) If n + 1 is odd, the sign of f (a + h) − f (a) changes with the sign of h. It is said that
x = a gives a horizontal point of inflection.
f(x) = x^6 − 4x^4.
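For a polynomial such as f(x) = x^6 − 4x^4, the first non-vanishing derivative at 0 can be read off from the coefficients, since f^(k)(0) = k!·ck. A sketch of the classification rule above (the coefficient-list representation is a convention of this snippet, not of the text):

```python
from math import factorial

# f(x) = x^6 - 4x^4; index k holds the coefficient of x^k.
coeffs = [0, 0, 0, 0, -4, 0, 1]

# First k >= 1 with a non-zero coefficient gives f^(k)(0) = k! * c_k.
k, deriv = next((k, factorial(k) * c)
                for k, c in enumerate(coeffs) if k >= 1 and c != 0)
# Here n + 1 = k = 4 is even and f^(4)(0) = -96 < 0: local maximum at x = 0.
kind = ("local min" if deriv > 0 else "local max") if k % 2 == 0 else "inflection"
```

This matches direct inspection: near 0 the term −4x^4 dominates, so f(x) < 0 = f(0).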
40 The Riemann-Darboux integral
It is intended to give one form of the definition of the Riemann integral ∫_a^b f(x) dx. The
definition involves the areas of rectangles and applies to a wider class of functions
than continuous ones.
Definition 40.1. Let [a, b] be a given finite interval. A partition P of [a, b] is a finite set
of points {x0, x1, . . . , xn} satisfying
a = x0 < x1 < · · · < xn = b.
Suppose now that f is a function defined and bounded on [a, b] (if f were continuous on
[a, b] this would certainly be the case). Then f is bounded on each of the subintervals
[xi−1, xi]. Hence f has a least upper bound Mi, and a greatest lower bound mi, on [xi−1, xi].
The lower and upper sums of f for the partition P are
Lf(P) = Σ_{i=1}^n mi·(xi − xi−1) and Uf(P) = Σ_{i=1}^n Mi·(xi − xi−1).
Now f is bounded above and below on the whole of [a, b]. So there exist numbers m and M
with
m ≤ f(x) ≤ M for all x ∈ [a, b].
Thus for any partition of [a, b]:
m(b − a) ≤ Lf(P) ≤ Uf(P) ≤ M(b − a),
so the set of lower sums is bounded above and the set of upper sums is bounded below.
So Lf = sup Lf(P) and Uf = inf Uf(P) exist.
Proof. Let P be a partition of [a, b] and P 0 be the partition P ∪ {y} where xi−1 < y < xi
for one particular i, 1 ≤ i ≤ n. In other words, P 0 is obtained by adding one more point
to P .
It is now shown that Lf(P) ≤ Lf(P′) and Uf(P) ≥ Uf(P′).
Solution (for f(x) = x on [0, 1]): For n ∈ N let Pn = {0, 1/n, 2/n, . . . , 1}. Hence
Uf(Pn) = (n + 1)/(2n) and Lf(Pn) = (n − 1)/(2n). So
(n − 1)/(2n) ≤ Lf ≤ Uf ≤ (n + 1)/(2n).
Letting n → ∞ it can be deduced that Lf = Uf = 1/2.
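The sums Uf(Pn) and Lf(Pn) above can be reproduced directly; a sketch for the increasing function f(x) = x, where the infimum and supremum on each subinterval sit at its endpoints:

```python
def darboux_sums(f, pts):
    # Lower/upper Darboux sums for an increasing f: the infimum is attained at
    # the left endpoint and the supremum at the right endpoint of each piece.
    L = sum(f(pts[i - 1]) * (pts[i] - pts[i - 1]) for i in range(1, len(pts)))
    U = sum(f(pts[i]) * (pts[i] - pts[i - 1]) for i in range(1, len(pts)))
    return L, U

n = 1000
Pn = [i / n for i in range(n + 1)]
L, U = darboux_sums(lambda x: x, Pn)
# L = (n-1)/(2n) and U = (n+1)/(2n); both tend to 1/2 as n grows.
```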
Example 40.2. Show that the function
f(x) = 1 if x is rational, and f(x) = 0 if x is irrational,
is not Riemann-Darboux integrable on [a, b].
Solution: For any partition P it follows that Lf(P) = 0 and Uf(P) = b − a, since any
interval of real numbers contains infinitely many rationals and irrationals. Hence Lf = 0
and Uf = b − a, and so Lf ≠ Uf.
This definition of ∫_a^b f(x) dx is only one of the many ways of assigning areas to bounded
regions. There are others, notably the Lebesgue integral; all, however, give the same
"answer" for areas under the graphs of continuous functions. It will be proved that all
continuous functions are Riemann-Darboux integrable, and a neat method of evaluating
the integrals involved will be derived.
Firstly, some elementary properties of the Riemann integral must be established - all of
which are essentially properties of areas.
Proposition 41.1. If f and g are Riemann-Darboux integrable on [a, b], then all the
integrals below exist and:
(1) ∫_a^b (α·f(x) + β·g(x)) dx = α·∫_a^b f(x) dx + β·∫_a^b g(x) dx, α, β ∈ R1.
(2) ∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx, a ≤ c ≤ b.
(3) if f(x) ≤ g(x) on [a, b], then ∫_a^b f(x) dx ≤ ∫_a^b g(x) dx.
(4) |∫_a^b f(x) dx| ≤ ∫_a^b |f(x)| dx.
Property (1) is described as the linearity of the integral and (2) is called the additive
property.
Proof of (1). It suffices to prove that
∫_a^b α·f(x) dx = α·∫_a^b f(x) dx and ∫_a^b (f(x) + g(x)) dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx.
The equality ∫_a^b α·f(x) dx = α·∫_a^b f(x) dx is true for any α ≥ 0, provided by the equalities
Lαf(P) = α·Lf(P) and Uαf(P) = α·Uf(P) for any α ≥ 0 and any partition P of [a, b].
For negative α the equality holds, provided by U−f(P) = −Lf(P), for any
partition P of [a, b].
The equality ∫_a^b (f(x) + g(x)) dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx is obtained by observing that
for any partition P of [a, b] the following hold:
Lf(P) + Lg(P) ≤ Lf+g(P) ≤ Uf+g(P) ≤ Uf(P) + Ug(P),
from where:
Lf + Lg ≤ Lf+g ≤ Uf+g ≤ Uf + Ug.
These inequalities together with:
Lf = Uf = ∫_a^b f(x) dx and Lg = Ug = ∫_a^b g(x) dx
prove that:
Lf+g = Uf+g = ∫_a^b (f(x) + g(x)) dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx.
Proof of (2). Let P1 and P2 be partitions of [a, c] and [c, b] respectively. Then P = P1 ∪ P2
is a partition of [a, b]. Clearly, Lf(P) = Lf(P1) + Lf(P2). Let
L1 = sup{Lf(P1) | P1 is a partition of [a, c]}
and
L2 = sup{Lf(P2) | P2 is a partition of [c, b]}.
Since Lf(P) ≤ ∫_a^b f(x) dx (by definition) we have
Lf(P1) + Lf(P2) ≤ ∫_a^b f(x) dx.
Hence Lf(P1) ≤ ∫_a^b f(x) dx − Lf(P2) and so L1 ≤ ∫_a^b f(x) dx − Lf(P2).
Hence Lf(P2) ≤ ∫_a^b f(x) dx − L1 and so L2 ≤ ∫_a^b f(x) dx − L1, or L1 + L2 ≤ ∫_a^b f(x) dx.
Now consider upper sums and observe that Uf(P) = Uf(P1) + Uf(P2). Let
U1 = inf{Uf(P1) | P1 is a partition of [a, c]}
and
U2 = inf{Uf(P2) | P2 is a partition of [c, b]}.
Since Uf(P) ≥ ∫_a^b f(x) dx (by definition) we have
Uf(P1) + Uf(P2) ≥ ∫_a^b f(x) dx.
Hence Uf(P1) ≥ ∫_a^b f(x) dx − Uf(P2) and so U1 ≥ ∫_a^b f(x) dx − Uf(P2).
Hence Uf(P2) ≥ ∫_a^b f(x) dx − U1 and so U2 ≥ ∫_a^b f(x) dx − U1, or U1 + U2 ≥ ∫_a^b f(x) dx.
Thus
L1 + L2 ≤ ∫_a^b f(x) dx ≤ U1 + U2.
Since f is Riemann integrable on [a, b], for any ε > 0, P can be chosen such that
Uf(P) − Lf(P) < ε. Then:
Uf(P1) − Lf(P1) + Uf(P2) − Lf(P2) = [Uf(P1) + Uf(P2)] − [Lf(P1) + Lf(P2)] =
= Uf(P) − Lf(P) < ε.
Hence
0 ≤ Uf(P1) − Lf(P1) < ε and 0 ≤ Uf(P2) − Lf(P2) < ε.
Hence L1 = U1 and L2 = U2. In other words, f is Riemann-Darboux integrable on both
[a, c] and [c, b]. Hence the additive property is established.
Proof of (3). It suffices to show that if f(x) ≥ 0 on [a, b], then
∫_a^b f(x) dx ≥ 0;
this follows since for every partition P
0 ≤ Lf(P) ≤ Uf(P).
Applying this to g − f gives (3).
Proof of (4). We first prove that |f| is Riemann-Darboux integrable on [a, b]. We consider
the functions f⁺, f⁻ : [a, b] → R defined by:
f⁺(x) = f(x) if f(x) ≥ 0, and f⁺(x) = 0 if f(x) ≤ 0,
and
f⁻(x) = 0 if f(x) ≥ 0, and f⁻(x) = −f(x) if f(x) ≤ 0,
and we remark that:
f = f⁺ − f⁻ and |f| = f⁺ + f⁻.
We will show now that f⁺, f⁻ : [a, b] → R are Riemann-Darboux integrable on [a, b]. The
boundedness of f⁺ and f⁻ is obvious. Consider a partition P of [a, b] and denote:
mi⁺ = inf{f⁺(x) | x ∈ [xi−1, xi]} and Mi⁺ = sup{f⁺(x) | x ∈ [xi−1, xi]}.
Remark that:
Mi⁺ − mi⁺ ≤ Mi − mi, i = 1, 2, . . . , n,
83
where:
0 ≤ Uf + (P ) − Lf + (P ) ≤ Uf (P ) − Lf (P )
or else
$$S = \{x \in [a,b] \mid |f(x) - f(a)| = \varepsilon_0\}$$
is non-empty and, by the intermediate value property, $\inf S$ exists. A partition $P$ of $[a,b]$ is constructed as follows: take $a = x_0 < x_1 < \dots < x_N = b$, a partition of $[a,b]$ for which $|f(x_i) - f(x_{i-1})| = \varepsilon_0$, $i = 1, 2, \dots, N$. For this partition $P$, we have $M_i - m_i \le 2\varepsilon_0$ for every $i$. Hence
$$U_f(P) - L_f(P) = \sum_{i=1}^{N} (M_i - m_i)(x_i - x_{i-1}) \le 2\varepsilon_0 (b-a).$$
Now for any $\varepsilon > 0$ consider $\varepsilon_0 = \dfrac{\varepsilon}{2(b-a)} > 0$ and deduce that there exists a partition $P$ with $U_f(P) - L_f(P) < \varepsilon$. Now $L_f \ge L_f(P) > U_f(P) - \varepsilon \ge U_f - \varepsilon$. Since $\varepsilon$ is arbitrary, $L_f \ge U_f$. Since $L_f \le U_f$ by an earlier result, $L_f = U_f$. Thus, $f$ is Riemann-Darboux integrable on $[a,b]$.
where $L_f(Q)$ is the lower sum of $f$ related to $Q$. For each $i$, $L_{f_i}(P_i)$ is the lower sum of $f_i$ related to $P_i$. Thus
$$\sum_{i=1}^{n} L_{f_i}(P_i) \le L_f,$$
where $L_f$ is the supremum of all the lower sums for $f$ on $[a,b]$. Since $f_i$ is continuous on $[x_{i-1}, x_i]$, $f_i$ is Riemann integrable on $[x_{i-1}, x_i]$. Hence:
$$\sum_{i=1}^{n} \left( \int_{x_{i-1}}^{x_i} f_i(x)\,dx \right) \le L_f.$$
In a similar fashion
$$U_f \le \sum_{i=1}^{n} \left( \int_{x_{i-1}}^{x_i} f_i(x)\,dx \right),$$
where $U_f$ is the infimum of all upper sums of $f$ on $[a,b]$. Hence, using $L_f \le U_f$, we obtain
$$L_f = U_f.$$
In other words, a piecewise continuous function is Riemann-Darboux integrable and
$$\int_a^b f(x)\,dx = \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} f_i(x)\,dx.$$
Example 42.1. The function given by $f(x) = x - [x]$ is piecewise continuous on $[0,3]$. Compute $\int_0^2 f(x)\,dx$.
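Example 42.1 can be checked numerically: on each interval $[k, k+1)$ the function $x - [x]$ coincides with $x - k$, whose integral is $\frac12$, so $\int_0^2 f(x)\,dx = 1$. A small Python sketch (the helper `riemann` is our own):

```python
import math

def riemann(f, a, b, n=20000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

frac = lambda x: x - math.floor(x)   # f(x) = x - [x]

# on each [k, k+1) the function is x - k, contributing 1/2 to the integral
approx = riemann(frac, 0.0, 2.0)
assert abs(approx - 1.0) < 1e-3
```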
43 Mean value theorem
Theorem 43.1 (the integral mean value theorem). If $f$ and $g$ are continuous on $[a,b]$ and $g(x) \ge 0$ for $x \in [a,b]$, then there exists $c$ between $a$ and $b$ such that
$$\int_a^b f(x) \cdot g(x)\,dx = f(c) \int_a^b g(x)\,dx.$$
Proof. By the extreme value theorem applied to $f$ on $[a,b]$, $m \le f(x) \le M$ for all $x \in [a,b]$, where $m$ is the infimum and $M$ is the supremum of $f$ on $[a,b]$. Since $g(x) \ge 0$ we have
$$m \cdot g(x) \le f(x) \cdot g(x) \le M \cdot g(x).$$
Hence
$$m \int_a^b g(x)\,dx \le \int_a^b f(x)\,g(x)\,dx \le M \int_a^b g(x)\,dx$$
and, assuming $\int_a^b g(x)\,dx > 0$ (otherwise both sides of the conclusion vanish and any $c$ works),
$$k = \frac{\displaystyle\int_a^b f(x)\,g(x)\,dx}{\displaystyle\int_a^b g(x)\,dx} \in [m, M].$$
By the intermediate value property, there exists $c \in [a,b]$ with $f(c) = k$. Hence
$$\int_a^b f(x) \cdot g(x)\,dx = f(c) \int_a^b g(x)\,dx.$$
Corollary 43.1. If $f$ is continuous on $[a,b]$, then there exists $c \in [a,b]$ such that
$$\int_a^b f(x)\,dx = f(c)(b-a).$$
Therefore
$$f(n+1) \le \int_n^{n+1} f(x)\,dx \le f(n)$$
and so
$$\sum_{k=2}^{n+1} f(k) = \sum_{k=1}^{n} f(k+1) \le \int_1^{n+1} f(x)\,dx \le \sum_{k=1}^{n} f(k).$$
Now let $a_n = f(n)$ and $j_n = \int_1^n f(x)\,dx$. Then
$$\sum_{k=2}^{n} a_k \le j_n \le \sum_{k=1}^{n-1} a_k.$$
If $(j_n)$ converges, then the $n$-th partial sums of $\sum_{n=1}^{\infty} a_n$ are increasing and bounded above, and so $\sum_{n=1}^{\infty} a_n$ is a convergent series. Conversely, if $\sum_{n=1}^{\infty} a_n$ is a convergent series, then $(j_n)$ is increasing and bounded above and hence it is a convergent sequence.
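This is the integral test for series. As a concrete check, take $f(x) = 1/x^2$ (positive and decreasing on $[1, +\infty)$), for which $j_n = \int_1^n x^{-2}\,dx = 1 - 1/n$ is bounded. A Python sketch with our own helper names:

```python
def j(n):
    """j_n = integral of 1/x^2 over [1, n], in closed form: 1 - 1/n."""
    return 1.0 - 1.0 / n

def partial_sum(n):
    """n-th partial sum of the series sum 1/k^2."""
    return sum(1.0 / k**2 for k in range(1, n + 1))

n = 1000
# the sandwich proved above: sum_{k=2}^{n} a_k <= j_n <= sum_{k=1}^{n-1} a_k
assert partial_sum(n) - 1.0 <= j(n) <= partial_sum(n - 1)
```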
Proof. Since $f$ is integrable, it is bounded on $[a,b]$. So there exists some number $M$ with $|f(t)| \le M$ for all $t \in [a,b]$. For a fixed $c$, we have
$$|F(x) - F(c)| = \left| \int_a^x f(t)\,dt - \int_a^c f(t)\,dt \right| = \left| \int_c^x f(t)\,dt \right| \le \left| \int_c^x |f(t)|\,dt \right| \le M\,|x-c|.$$
In other words, $F$ is continuous at $c$ and, since $c$ was arbitrary, $F$ is continuous on $[a,b]$.
Let $c \in [a,b]$ and consider $x > c$. Then
$$\left|\frac{F(x)-F(c)}{x-c} - f(c)\right| = \left|\frac{\int_a^x f(t)\,dt - \int_a^c f(t)\,dt}{x-c} - f(c)\right| = \left|\frac{\int_c^x f(t)\,dt}{x-c} - f(c)\right| = \left|\frac{\int_c^x (f(t)-f(c))\,dt}{x-c}\right| \le \frac{\int_c^x |f(t)-f(c)|\,dt}{x-c},$$
since $f(c)$ is constant. Given $\varepsilon > 0$, there exists $\delta > 0$ such that $|f(t)-f(c)| < \varepsilon$ for $|t-c| < \delta$. Hence, for $0 < x-c < \delta$,
$$\left|\frac{F(x)-F(c)}{x-c} - f(c)\right| \le \frac{\int_c^x |f(t)-f(c)|\,dt}{x-c} \le \frac{\int_c^x \varepsilon\,dt}{x-c} < \varepsilon.$$
In other words, $F'_+(c) = f(c)$. Similarly $F'_-(c) = f(c)$. Hence $F$ is differentiable and $F' = f$.
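The conclusion $F' = f$ can be observed numerically: approximate $F(x) = \int_0^x \cos t\,dt$ by Riemann sums and compare a difference quotient of $F$ with $\cos$. A sketch (the helper names are our own, and the step sizes are arbitrary choices):

```python
import math

def F(x, n=20000):
    """F(x) = integral of cos from 0 to x, via a midpoint Riemann sum."""
    h = x / n
    return sum(math.cos((i + 0.5) * h) for i in range(n)) * h

c, eps = 0.7, 1e-4
quotient = (F(c + eps) - F(c)) / eps   # difference quotient of F at c
# by the theorem, this should be close to f(c) = cos(c)
assert abs(quotient - math.cos(c)) < 1e-3
```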
Remark 44.1. Suppose that it is required to evaluate
$$\int_{x_1}^{x_2} f(t)\,dt$$
where $x_1, x_2 \in [a,b]$ and $f$ is continuous on $[a,b]$. Using the additive property we have
$$\int_{x_1}^{x_2} f(t)\,dt = \int_a^{x_2} f(t)\,dt - \int_a^{x_1} f(t)\,dt.$$
Remark 44.2. If $f$ is continuous on $[a,b]$ and $\Phi' = f$ on $[a,b]$, then there exists a constant $c$ such that $\Phi(x) = F(x) + c$ for any $x \in [a,b]$, where $F(x) = \int_a^x f(t)\,dt$. That is because $(\Phi - F)' = 0$ and, by the mean value theorem, $\Phi - F = c$.
Hence if $f$ is continuous on $[a,b]$ and $\Phi$ is continuously differentiable on $[a,b]$ such that $\Phi' = f$, then for any $x_1, x_2 \in [a,b]$ we have:
$$\int_{x_1}^{x_2} f(t)\,dt = \Phi(x_2) - \Phi(x_1).$$
Notice that this method of evaluating integrals $\int_a^b f(t)\,dt$ hinges on the ability to determine $\Phi$ such that $\Phi' = f$ on $[a,b]$.
Example 44.1.
a) $\displaystyle\int_0^1 (x^3 + 2)\,dx = \frac{x^4}{4} + 2x \,\Big|_0^1 = \frac{9}{4}$.
b) $\displaystyle\int_{-1}^0 (x^2 - x)\,dx = \frac{x^3}{3} - \frac{x^2}{2} \,\Big|_{-1}^0 = \frac{5}{6}$.
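Both evaluations can be double-checked against Riemann sums (the helper `riemann` is our own):

```python
def riemann(f, a, b, n=10000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# a) integral of x^3 + 2 over [0, 1] equals 9/4
assert abs(riemann(lambda x: x**3 + 2, 0, 1) - 9/4) < 1e-6
# b) integral of x^2 - x over [-1, 0] equals 5/6
assert abs(riemann(lambda x: x**2 - x, -1, 0) - 5/6) < 1e-6
```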
Exercise: determine a primitive for each of the following functions:
1. $f(x) = x^2 + 3x - 2$;
2. $f(x) = 1 + \cos 3x$;
3. $f(x) = e^x \cosh 2x$;
4. $f(x) = \dfrac{1}{\sqrt{9 - x^2}}$;
5. $f(x) = |x|$.
Integration by parts
Proposition 45.1. If the functions $f$ and $g$ are continuously differentiable on $[a,b]$, then
$$\int f(x) \cdot g'(x)\,dx = f(x) \cdot g(x) - \int f'(x) \cdot g(x)\,dx,$$
where $\int f(x) g'(x)\,dx$ represents the set of primitives of $f g'$ and $\int f'(x) g(x)\,dx$ represents the set of primitives of $f' g$.
Proof. The function $h = f \cdot g$ is differentiable and its derivative $h'$ is continuous on $[a,b]$. By the product rule of differentiation, we have
$$h'(x) = f'(x) \cdot g(x) + f(x) \cdot g'(x).$$
Let now $\varphi \in \int f(x) g'(x)\,dx$ and $\psi = \varphi - fg$. It is easy to see that $\psi' = \varphi' - f'g - fg' = -f'g$ and therefore $\psi \in -\int f'(x) g(x)\,dx$. We obtain that $\varphi = fg + \psi$ with $\psi \in -\int f'(x) g(x)\,dx$; in other words, $\varphi \in fg - \int f'(x) g(x)\,dx$. It can be shown in the same manner that for every $\psi \in -\int f'(x) g(x)\,dx$, the function $\varphi = fg + \psi$ belongs to $\int f(x) g'(x)\,dx$.
The value of this formula lies in the hope that the primitive on the right-hand side is easier to evaluate than the original one.
Corollary 45.1. If the functions $f$ and $g$ are continuously differentiable on $[a,b]$, then
$$\int_a^b f(x) \cdot g'(x)\,dx = f(x) \cdot g(x) \,\Big|_a^b - \int_a^b f'(x) \cdot g(x)\,dx.$$
For example, let $I_n = \int \cos^n x\,dx$. Integrating by parts with $f(x) = \cos^{n-1} x$ and $g'(x) = \cos x$ gives
$$I_n = \cos^{n-1} x \cdot \sin x + (n-1) I_{n-2} - (n-1) I_n.$$
Hence:
$$n\,I_n = \cos^{n-1} x \cdot \sin x + (n-1) I_{n-2}, \qquad n \ge 2.$$
This formula, together with the fact that $I_0 = x$ and $I_1 = \sin x$, leads to the evaluation of $\int \cos^n x\,dx$ for $n \in \mathbb{N}$.
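The same recurrence holds for the definite integrals $J_n(X) = \int_0^X \cos^n t\,dt$, since the boundary term vanishes at $0$. This can be verified numerically (the helper `J` and the sample point $X$ are our own choices):

```python
import math

def J(n, X, steps=20000):
    """J_n(X) = integral of cos^n over [0, X], via a midpoint Riemann sum."""
    h = X / steps
    return sum(math.cos((i + 0.5) * h) ** n for i in range(steps)) * h

X = 1.2
for n in range(2, 7):
    lhs = n * J(n, X)
    rhs = math.cos(X) ** (n - 1) * math.sin(X) + (n - 1) * J(n - 2, X)
    assert abs(lhs - rhs) < 1e-6    # n*J_n = cos^{n-1}(X) sin(X) + (n-1)*J_{n-2}
```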
Change of variables
Proof. Let $F \in \int f(x)\,dx$ and $G(t) = (F \circ g)(t)$. By the composite rule of differentiation, $G'(t) = F'(g(t)) \cdot g'(t) = f(g(t)) \cdot g'(t)$. Hence
$$G \in \int (f \circ g)(t) \cdot g'(t)\,dt.$$
Let now $G \in \int (f \circ g)(t) \cdot g'(t)\,dt$ and consider $g^{-1} : [a,b] \to [\alpha, \beta]$. We have $(g^{-1})'(x) = \dfrac{1}{g'(t)}$, where $x = g(t)$, and the function $F = G \circ g^{-1}$ verifies $F'(x) = G'(t) \cdot \dfrac{1}{g'(t)} = f(x)$, so $F \in \int f(x)\,dx$.
Example: evaluate $\int \dfrac{1}{x \ln x}\,dx$.
Solution: Let $f(x) = \dfrac{1}{x \ln x}$ and $g(t) = e^t$. Then $g'(t) = e^t$ and so
$$\int \frac{1}{x \ln x}\,dx = \int \frac{1}{e^t \cdot t}\,e^t\,dt = \int \frac{1}{t}\,dt = \ln t = \ln(\ln x).$$
Example: evaluate $\int_1^2 t^2 \sqrt{t^3 - 1}\,dt$.
Solution: Let $f(x) = \sqrt{x}$ and $g(t) = t^3 - 1$. Then $g'(t) = 3t^2$ and therefore
$$\int_1^2 \sqrt{t^3 - 1} \cdot 3t^2\,dt = \int_{g(1)}^{g(2)} \sqrt{x}\,dx.$$
Hence
$$\int_1^2 t^2 \sqrt{t^3 - 1}\,dt = \frac{1}{3} \int_0^7 \sqrt{x}\,dx = \frac{1}{3} \cdot \frac{2}{3}\, x^{\frac32} \,\Big|_0^7 \approx 4.116.$$
The change of variables formula is often called integration by substitution, where $x = g(t)$ gives the substitution to be used. It is extensively used in elementary calculus books, where trial substitutions, which depend on the form of the integrand, are suggested.
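The substitution example above can be confirmed by computing the left-hand side directly with a Riemann sum (the helper `riemann` is our own):

```python
def riemann(f, a, b, n=20000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# left-hand side: integral of t^2 * sqrt(t^3 - 1) over [1, 2]
lhs = riemann(lambda t: t**2 * (t**3 - 1) ** 0.5, 1, 2)
# after x = t^3 - 1: (1/3) * integral of sqrt(x) over [0, 7] = (2/9) * 7^(3/2)
rhs = (2 / 9) * 7 ** 1.5
assert abs(lhs - rhs) < 1e-3       # both are approximately 4.116
```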
Remark 45.1. If $f : [-a, a] \to \mathbb{R}$ is piecewise continuous and even ($f(-x) = f(x)$), then:
$$\int_{-a}^{a} f(x)\,dx = 2 \int_0^a f(x)\,dx.$$
In order to obtain the above equality, the integral $\int_{-a}^a f(x)\,dx$ is written as:
$$\int_{-a}^a f(x)\,dx = \int_{-a}^0 f(x)\,dx + \int_0^a f(x)\,dx$$
and the substitution $x = -u$ is applied to the first integral. Similarly, if $f$ is periodic with period $T$, then
$$\int_T^{a+T} f(x)\,dx = -\int_a^0 f(x)\,dx = \int_0^a f(x)\,dx.$$
46 Improper integrals
Definition 46.1. Let $f$ be a function bounded on $[a, +\infty)$ and Riemann-Darboux integrable on $[a,b]$ for every $b > a$. If $\lim_{b \to \infty} \int_a^b f(x)\,dx$ exists, it is said that $\int_a^{+\infty} f(x)\,dx$ converges. A completely analogous definition holds for integrals of the form $\int_{-\infty}^a f(x)\,dx$.
Since it is necessary to preserve the additivity of the integral for improper integrals, the following is defined:
$$\int_{-\infty}^{+\infty} f(x)\,dx = \int_{-\infty}^0 f(x)\,dx + \int_0^{+\infty} f(x)\,dx,$$
provided both improper integrals on the right-hand side converge.
Example 46.1.
a) The integral $\int_0^{+\infty} \dfrac{1}{1+x^2}\,dx$ converges to $\dfrac{\pi}{2}$.
b) The integral $\int_1^{+\infty} \dfrac{1}{x^2}\,dx$ converges to $1$.
c) The integral $\int_1^{+\infty} \dfrac{1}{\sqrt{x}}\,dx$ diverges.
d) The integral $\int_{-\infty}^{+\infty} \sin x\,dx$ diverges.
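Convergence in example a) can be watched numerically: $\int_0^b \frac{dx}{1+x^2} = \arctan b$, which approaches $\pi/2$ as $b$ grows. A sketch (the helper `riemann` is our own):

```python
import math

def riemann(f, a, b, n=200000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: 1.0 / (1.0 + x * x)
# the truncated integral equals arctan(b) ...
for b in (10.0, 100.0):
    assert abs(riemann(f, 0.0, b) - math.atan(b)) < 1e-4
# ... and tends to pi/2 as b -> +infinity
assert abs(riemann(f, 0.0, 100.0) - math.pi / 2) < 0.02
```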
The integral of a function over a bounded interval on which the function is not bounded will now be defined. This is called an improper integral of the second kind.
Definition 46.2. Let $f$ be a function defined on $(a,b]$ and Riemann-Darboux integrable on $[a+\varepsilon, b]$ for every $\varepsilon \in (0, b-a)$. If
$$\lim_{\varepsilon \to 0^+} \int_{a+\varepsilon}^b f(x)\,dx$$
exists, then it is said that $\int_a^b f(x)\,dx$ converges.
Example 46.2.
a) The integral $\int_0^1 \dfrac{1}{\sqrt{x}}\,dx$ converges to $2$.
b) The integral $\int_0^1 \dfrac{1}{\sqrt{1-x^2}}\,dx$ converges to $\dfrac{\pi}{2}$.
c) The integral $\int_0^1 \dfrac{1}{x}\,dx$ diverges.
Note also that the integral $\int_0^{+\infty} \dfrac{1}{\sqrt{x}}\,dx$ diverges: it is improper both at $0$ and at $+\infty$, and the divergence comes from the behavior at $+\infty$.
Theorem 46.1 (Comparison test for integrals). Let $f$ and $g$ be defined on $[a, +\infty)$ and Riemann-Darboux integrable on $[a,b]$ for every $b > a$. Suppose that $0 \le f(x) \le g(x)$ for all $x \ge a$ and that $\int_a^{+\infty} g(x)\,dx$ converges. Then $\int_a^{+\infty} f(x)\,dx$ converges too.
Proof. Now
$$0 \le \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx.$$
Since $0 \le f(x) \le g(x)$, the function $b \mapsto \int_a^b g(x)\,dx$ increases to its limiting value as $b \to +\infty$. Hence $b \mapsto \int_a^b f(x)\,dx$ is increasing and bounded above. Therefore $\int_a^{+\infty} f(x)\,dx$ is a convergent integral.
A comparison test for improper integrals of the second kind is easily formulated and proved in a similar manner.
Example 46.3. The integral $\int_0^{+\infty} \dfrac{e^{-x}}{1+x^2}\,dx$ converges.
Solution: Consider the functions $f(x) = \dfrac{e^{-x}}{1+x^2}$ and $g(x) = \dfrac{1}{1+x^2}$ and apply the comparison test.
47 Fourier series
The idea that a function may be represented by its Taylor series has already been discussed. We saw that in order to be able to write the Taylor series of a function $f$ at a point $x$, it needs to be infinitely differentiable. This is a severe restriction that most functions do not satisfy. Even when Taylor's theorem with remainder is employed, the function still needs to be differentiable a finite number of times and this, like infinite differentiability, certainly implies that the function must be continuous. Nevertheless, many functions used to describe important physical phenomena are discontinuous and cannot be represented by Taylor series. For example, the function describing the voltage behavior in time in a circuit in which a switch is suddenly operated is discontinuous, just like the behavior of the gas pressure across a shock front.
In principle, at least, Fourier series offer the possibility of representation of continuous and
piecewise continuous functions, because whereas for Taylor series expansion, a function
needs to be differentiable, for Fourier series expansion it would appear that it only needs to
be integrable; the Fourier coefficients can be computed when f (x) is piecewise continuous.
Definition 47.1. The Fourier series of a piecewise continuous function $f(x)$ defined on the interval $[-\pi, \pi]$ is the series
$$f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx),$$
where the Fourier coefficients are
$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx\,dx \quad (n = 0, 1, 2, \dots) \qquad\text{and}\qquad b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx\,dx \quad (n = 1, 2, \dots).$$
Example 47.1.
a) Determine the Fourier series of the function $f(x) = \pi^2 - x^2$ for $x \in [-\pi, \pi]$.
Solution: $a_0 = \dfrac{4\pi^2}{3}$, $a_n = (-1)^{n+1} \cdot \dfrac{4}{n^2}$ for $n = 1, 2, \dots$, $b_n = 0$ for all $n$.
b) Determine the Fourier series of the function $f(x) = |x|$ for $x \in [-\pi, \pi]$.
Solution: $a_0 = \pi$, $a_{2n} = 0$, $a_{2n+1} = \dfrac{-4}{\pi(2n+1)^2}$ for $n = 0, 1, 2, \dots$, $b_n = 0$ for all $n$.
c) Determine the Fourier series of the function
$$f(x) = \begin{cases} a & \text{for } -\pi \le x < 0 \\ b & \text{for } 0 \le x < \pi \end{cases}$$
Solution: $a_0 = a + b$, $a_n = 0$ for $n = 1, 2, \dots$, $b_{2n} = 0$, $b_{2n+1} = \dfrac{2(b-a)}{(2n+1)\pi}$ for $n = 0, 1, 2, \dots$.
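The coefficients in example b) can be checked by numerical integration of the defining formula $a_n = \frac1\pi \int_{-\pi}^{\pi} f(x)\cos nx\,dx$ (the helper `fourier_a` is our own):

```python
import math

def fourier_a(f, n, steps=20000):
    """a_n = (1/pi) * integral over [-pi, pi] of f(x) cos(nx), by midpoint sums."""
    h = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        x = -math.pi + (i + 0.5) * h
        total += f(x) * math.cos(n * x)
    return total * h / math.pi

# for f(x) = |x|: a_0 = pi, even-order a_n vanish, a_n = -4/(pi n^2) for odd n
assert abs(fourier_a(abs, 0) - math.pi) < 1e-4
assert abs(fourier_a(abs, 2)) < 1e-4
assert abs(fourier_a(abs, 1) + 4 / math.pi) < 1e-4
assert abs(fourier_a(abs, 3) + 4 / (9 * math.pi)) < 1e-4
```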
Usually, until the convergence problem has been resolved, it is customary to denote the
relationship between f (x) and its Fourier series by the sign ∼ instead of an equality.
The main result of this section will be establishing a fundamental theorem on the
convergence of Fourier series of a piecewise continuous function f (x). However, as this
will require several subsidiary results, which are important in their own way, we now
establish them in the form of two lemmas.
Lemma 47.1 (Integral representation of $S_n(x)$). The $n$-th partial sum of the Fourier series of the function $f(x)$, defined on the fundamental interval $[-\pi, \pi]$ and prolonged by periodic extension outside it, may be represented in the form:
$$S_n(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x-u) \cdot \frac{\sin\left(n+\frac12\right)u}{2\sin\frac12 u}\,du.$$
Proof. First, using the summation formula for a geometric progression, it follows immediately that
$$\sum_{k=1}^{n} e^{ikx} = \frac{\exp\left[i\left(n+\frac12\right)x\right] - \exp\left[\frac12 ix\right]}{2i\sin\frac12 x}.$$
Hence, equating the real parts of the two sides of this equation, we deduce that
$$\frac12 + \sum_{k=1}^{n} \cos kx = \frac{\sin\left(n+\frac12\right)x}{2\sin\frac12 x}.$$
Integration of this expression over the intervals $[-\pi, 0]$ and $[0, \pi]$ shows that
$$\int_{-\pi}^{0} \frac{\sin\left(n+\frac12\right)u}{2\sin\frac12 u}\,du = \int_{0}^{\pi} \frac{\sin\left(n+\frac12\right)u}{2\sin\frac12 u}\,du = \frac{\pi}{2},$$
since the only contribution on the left-hand side comes from the constant term.
Now consider the $n$-th partial sum $S_n(x)$ of the Fourier series of $f(x)$:
$$S_n(x) = \frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx).$$
Taking the functions $\cos kx$, $\sin kx$ under the integral signs, and employing the trigonometric identity
$$\cos k(x-t) = \cos kx \cdot \cos kt + \sin kx \cdot \sin kt,$$
allows us to write
$$S_n(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \left[\frac12 + \sum_{k=1}^{n} \cos k(x-t)\right] dt.$$
Lemma 47.2. For a piecewise continuous function $f(x)$ defined on $[-\pi, \pi]$ the following equalities hold:
a) $\lim_{n\to\infty} \int_{-\pi}^{\pi} f(x)\cos nx\,dx = 0$ and $\lim_{n\to\infty} \int_{-\pi}^{\pi} f(x)\sin nx\,dx = 0$, and
b) $\lim_{n\to\infty} \int_a^b f(x)\sin\left(n+\frac12\right)x\,dx = 0$ if $-\pi \le a < b \le \pi$.
From the definition of the Fourier coefficients, the orthogonality property of the trigonometric system, i.e.
$$\int_{-\pi}^{\pi} \sin mx \cos nx\,dx = 0 \ \text{ for all } m, n; \qquad
\int_{-\pi}^{\pi} \sin mx \sin nx\,dx = \begin{cases} 0 & \text{for } m \ne n \\ \pi & \text{for } m = n \end{cases};$$
$$\int_{-\pi}^{\pi} \cos mx \cos nx\,dx = \begin{cases} 0 & \text{for } m \ne n \\ \pi & \text{for } m = n \ne 0 \\ 2\pi & \text{for } m = n = 0 \end{cases};$$
and from the form of the $n$-th partial sum $S_n(x)$, it follows that
$$\int_{-\pi}^{\pi} [S_n(x)]^2\,dx = \int_{-\pi}^{\pi} f(x)\,S_n(x)\,dx = \pi\left[\frac{a_0^2}{2} + \sum_{k=1}^{n} (a_k^2 + b_k^2)\right].$$
Since the integrand of $\int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx$ is a square, this integral is either positive or zero, so we may conclude
$$\frac{a_0^2}{2} + \sum_{k=1}^{n} (a_k^2 + b_k^2) \le \frac{1}{\pi} \int_{-\pi}^{\pi} f^2(x)\,dx.$$
Observe that when
$$\lim_{n\to\infty} \int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx = 0,$$
then we have
$$\frac{a_0^2}{2} + \sum_{k=1}^{\infty} (a_k^2 + b_k^2) = \frac{1}{\pi} \int_{-\pi}^{\pi} [f(x)]^2\,dx.$$
Now let $a = x_0 < x_1 < \dots < x_p = b$ be a partition of the closed interval $[a,b]$ and consider the corresponding decomposition of the integral:
$$\int_a^b f(x)\sin\left(n+\frac12\right)x\,dx = \sum_{i=0}^{p-1} \int_{x_i}^{x_{i+1}} f(x)\sin\left(n+\frac12\right)x\,dx.$$
Consider $m_i = \inf\{f(x) \mid x \in [x_i, x_{i+1}]\}$ and represent $\int_a^b f(x)\sin\left(n+\frac12\right)x\,dx$ in the following form:
$$\int_a^b f(x)\sin\left(n+\frac12\right)x\,dx = \sum_{i=0}^{p-1} \int_{x_i}^{x_{i+1}} [f(x) - m_i]\sin\left(n+\frac12\right)x\,dx + \sum_{i=0}^{p-1} m_i \int_{x_i}^{x_{i+1}} \sin\left(n+\frac12\right)x\,dx.$$
In the first sum,
$$f(x) - m_i \le M_i - m_i = \omega_i.$$
For $\varepsilon > 0$ we choose the partition such that $\sum_{i=0}^{p-1} \omega_i \cdot \Delta x_i < \frac{\varepsilon}{2}$. This is possible because the piecewise continuous function $f$ is integrable. Now we can take $n > \frac{4}{\varepsilon} M (b-a)$, where $M = \sup\{f(x) \mid x \in [a,b]\}$, and we obtain that for such values of $n$ we have
$$\left| \int_a^b f(x)\sin\left(n+\frac12\right)x\,dx \right| < \varepsilon.$$
Proof. Consider a function $f(x)$ defined on $[-\pi, \pi]$ and extended periodically outside $[-\pi, \pi]$. Assume that $f$ is piecewise continuous on $[-\pi, \pi]$ and has a finite discontinuity at $x_0$. Denote
$$f(x_0^-) = \lim_{x \to x_0^-} f(x) \quad\text{and}\quad f(x_0^+) = \lim_{x \to x_0^+} f(x).$$
The integrands on the right-hand side are well defined everywhere except, maybe, at $u = 0$, where they require examination. The first integrand can be written in the form
$$F_1(u) \cdot \sin\left(n+\frac12\right)u,$$
where
$$F_1(u) = \frac{f(x_0 - u) - f(x_0^+)}{u} \cdot \frac{\frac12 u}{\sin\frac12 u}.$$
As $u \to 0$, the second factor tends to $1$ and, when the right-hand side derivative of $f$ exists at $x = x_0$, the first factor tends to $-f'(x_0^+)$. So $F_1(0) = -f'(x_0^+)$ and the integrand is well defined at $u = 0$. Similarly, if
$$F_2(u) = \frac{f(x_0 - u) - f(x_0^-)}{u} \cdot \frac{\frac12 u}{\sin\frac12 u}.$$
We have thus proved one form of the Fourier theorem on convergence of Fourier series.
Example 47.2.
a) Deduce the Fourier series expansion of f (x) = π 2 − x2 in the interval [−π, π].
b) Deduce the Fourier series expansion of the function f (x) = |x| in the interval [−π, π].
48 Different forms of Fourier series
Theorem 48.1 (change of the origin of the fundamental interval). If $f(x)$ is a piecewise continuous function defined in the fundamental interval $[-\pi, \pi]$ and by periodic extension outside it, then for any $\alpha$, the Fourier coefficients $a_n$, $b_n$ are given by
$$a_n = \frac{1}{\pi} \int_{\alpha-\pi}^{\alpha+\pi} f(x)\cos nx\,dx \quad\text{for } n = 0, 1, 2, \dots$$
$$b_n = \frac{1}{\pi} \int_{\alpha-\pi}^{\alpha+\pi} f(x)\sin nx\,dx \quad\text{for } n = 1, 2, \dots$$
Theorem 48.2 (change of the interval length). The Fourier expansion of the piecewise continuous function $f(x)$ defined on $[-L, L]$ is the series
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left(a_n \cos\frac{n\pi x}{L} + b_n \sin\frac{n\pi x}{L}\right)$$
with
$$a_n = \frac{1}{L} \int_{-L}^{L} f(x)\cos\frac{n\pi x}{L}\,dx \quad\text{for } n = 0, 1, 2, \dots$$
and
$$b_n = \frac{1}{L} \int_{-L}^{L} f(x)\sin\frac{n\pi x}{L}\,dx \quad\text{for } n = 1, 2, \dots$$
Example 48.1.
When f (x) is an even function defined on the interval [−π, π], then f (−x) = f (x). Thus,
it follows directly that f (x) · cos nx is an even function, because cos nx is even, and
f (x) · sin nx is an odd function, because sin nx is odd.
Consider the Fourier coefficients $a_n$ of an even function $f(x)$, which we choose to write in the form
$$a_n = \frac{1}{\pi} \int_{-\pi}^{0} f(x)\cos nx\,dx + \frac{1}{\pi} \int_{0}^{\pi} f(x)\cos nx\,dx.$$
Then, changing the variable in the first integral by writing $u = -x$, employing the even nature of the integrand to replace $f(-u)\cos n(-u)$ by $f(u)\cos nu$, and changing the sign of the integral by reversing the limits, we find
$$a_n = \frac{2}{\pi} \int_0^{\pi} f(x)\cos nx\,dx \quad\text{for } n = 0, 1, 2, \dots$$
The same argument applied to the coefficients bn shows that
bn = 0 for n = 1, 2, . . .
Consequently, if $f(x)$ is an even function on $[-\pi, \pi]$, its Fourier series contains only cosine functions and is of the form
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx \quad\text{for } x \in [-\pi, \pi].$$
This is called the Fourier cosine expansion of the even function f (x) in [−π, π].
Example 48.2.
a) Deduce the Fourier series expansion of the even function f (x) = x2 in [−π, π].
b) Deduce the Fourier series expansion of the even function f (x) = |x| in [−π, π].
When $f(x)$ is an odd function defined on the interval $[-\pi, \pi]$, then $f(-x) = -f(x)$. A similar argument as in the case of even functions leads us to
$$a_n = 0 \quad\text{for } n = 0, 1, \dots$$
and
$$b_n = \frac{2}{\pi} \int_0^{\pi} f(x)\sin nx\,dx \quad\text{for } n = 1, 2, \dots,$$
from which it follows that the Fourier series of an odd function defined on $[-\pi, \pi]$ contains only sine functions and is of the form
$$f(x) = \sum_{n=1}^{\infty} b_n \sin nx \quad\text{for } x \in [-\pi, \pi].$$
This is called the Fourier sine expansion of the odd function $f(x)$ in $[-\pi, \pi]$.
These results can be usefully interpreted in terms of an arbitrary function $f(x)$ which is to be expanded in the half interval $[0, \pi]$. Defining a new function $g(x)$ by the rule
$$g(x) = \begin{cases} f(-x) & \text{for } -\pi \le x < 0 \\ f(x) & \text{for } 0 \le x \le \pi \end{cases}$$
we see that $g(x)$ is an even function which is equal to $f(x)$ in the required interval $[0, \pi]$. Thus, as a Fourier cosine expansion of $g(x)$ only requires the knowledge of $g(x)$ in the half interval $[0, \pi]$, in which $g(x) = f(x)$, it follows that
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx \quad\text{for } x \in [0, \pi]$$
is the desired cosine expansion of $f(x)$ in $[0, \pi]$.
Alternatively, we may expand the same function $f(x)$ in the half interval $[0, \pi]$ in a Fourier sine series as follows: define a new function $h(x)$ by the rule
$$h(x) = \begin{cases} -f(-x) & \text{for } -\pi \le x < 0 \\ f(x) & \text{for } 0 \le x \le \pi. \end{cases}$$
Then $h(x)$ is an odd function which is equal to $f(x)$ in the required interval $[0, \pi]$. The Fourier sine expansion of $h(x)$ only requires the knowledge of $h(x)$ in the half interval $[0, \pi]$, where $h(x) = f(x)$. So
$$f(x) = \sum_{n=1}^{\infty} b_n \sin nx \quad\text{for } 0 \le x \le \pi$$
provides the desired sine expansion of $f(x)$ for $x \in [0, \pi]$. These expansions are often called the half-range expansions of $f(x)$.
We have proved the following theorem:
Theorem 48.3 (Fourier sine and cosine series). If $f(x)$ is an arbitrary function defined and piecewise continuous on $[0, \pi]$, then it may either be expanded as a Fourier cosine series
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx, \qquad 0 \le x \le \pi,$$
in which
$$a_n = \frac{2}{\pi} \int_0^{\pi} f(x)\cos nx\,dx \quad\text{for } n = 0, 1, 2, \dots,$$
or as a Fourier sine series
$$f(x) = \sum_{n=1}^{\infty} b_n \sin nx, \qquad 0 \le x \le \pi,$$
in which
$$b_n = \frac{2}{\pi} \int_0^{\pi} f(x)\sin nx\,dx \quad\text{for } n = 1, 2, \dots$$
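As an illustration, take $f(x) = x$ on $[0, \pi]$; its half-range sine coefficients are known to be $b_n = 2(-1)^{n+1}/n$, and the sine series converges to $f$ inside $(0, \pi)$. A numerical sketch (the helper names are our own):

```python
import math

def b(n, f=lambda x: x, steps=20000):
    """Half-range sine coefficient: (2/pi) * integral over [0, pi] of f(x) sin(nx)."""
    h = math.pi / steps
    return (2 / math.pi) * sum(f((i + 0.5) * h) * math.sin(n * (i + 0.5) * h)
                               for i in range(steps)) * h

# computed coefficients match the closed form 2 * (-1)^{n+1} / n
for n in range(1, 5):
    assert abs(b(n) - 2 * (-1) ** (n + 1) / n) < 1e-4

# the sine series converges to f(x) = x at an interior point, e.g. x = 1
x = 1.0
s = sum(2 * (-1) ** (n + 1) / n * math.sin(n * x) for n in range(1, 20000))
assert abs(s - x) < 1e-2
```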
Part III
Definition 49.1. The set $\mathbb{R}^n$ is the collection of all the finite sequences $x = (x_1, x_2, \dots, x_n)$ of $n$ real numbers.
Definition 49.2. A real valued function of $n$ variables associates to every finite sequence of $n$ real numbers of a set $A \subset \mathbb{R}^n$ a unique real number. Formally, $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is given by $y = f(x_1, x_2, \dots, x_n)$.
Example 49.1. The function $f : \mathbb{R}^2 \to \mathbb{R}^1$ given by $f(x_1, x_2) = x_1^2 + x_2^2$ is a real valued function of two variables.
A vector valued function of $n$ variables is, formally, $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$, given by $f(x) = (f_1(x), \dots, f_m(x))$. For instance,
$$f(x_1, x_2, x_3) = (x_1 + x_2 + x_3,\; x_1 \cdot x_2 \cdot x_3)$$
is a vector function of three variables. Here $f_1(x_1, x_2, x_3) = x_1 + x_2 + x_3$ and $f_2(x_1, x_2, x_3) = x_1 \cdot x_2 \cdot x_3$.
$\mathbb{R}^n$ is organized as an $n$-dimensional vector space using the componentwise sum and the scalar product defined by
$$x + y = (x_1 + y_1, \dots, x_n + y_n), \qquad \langle x, y \rangle = \sum_{i=1}^{n} x_i y_i,$$
with the associated norm $\|x\| = \sqrt{\langle x, x \rangle}$. For a neighborhood $V$ of a point $a$, the corresponding deleted neighborhood is $V' = V \setminus \{a\}$.
Now the limit of a sequence (xk ) of points of Rn can be defined. A sequence (xk ) of points
of Rn is a function whose domain is the set of natural numbers and whose values belong
to Rn . The value of the function corresponding to argument k is denoted by xk . The
sequence x1 , x2 , . . . , xk , . . . is denoted by (xk ).
Definition 49.4. A vector $x \in \mathbb{R}^n$ is said to be the limit of the sequence $(x_k)$ if for any $\varepsilon > 0$ there exists $N = N(\varepsilon) > 0$ such that for any $k > N$ we have $\|x_k - x\| < \varepsilon$. In this case we write $\lim_{k\to\infty} x_k = x$.
The limit of the sequence $(x_k)$, if it exists, is unique. If a sequence $(x_k)$ converges to $x$, then the sequence is bounded, i.e. there exists $M > 0$ such that $\|x_k\| < M$ for any $k \in \mathbb{N}$. If a sequence $(x_k)$ converges to $x$, then any subsequence $(x_{k_l})$ of the sequence $(x_k)$ converges to $x$.
A sequence $(x_k)$, $x_k = (x_{1k}, x_{2k}, \dots, x_{nk}) \in \mathbb{R}^n$, converges to $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$ if and only if the sequence $(x_{ik})$ converges to $x_i$ for every $i = 1, 2, \dots, n$.
According to the Bolzano–Weierstrass theorem, any bounded sequence $(x_k)$ of points of $\mathbb{R}^n$ contains a convergent subsequence. Cauchy's criterion for the convergence of a sequence $(x_k)$ of points $x_k \in \mathbb{R}^n$ states that $(x_k)$ converges if and only if for any $\varepsilon > 0$ there exists $N_\varepsilon$ such that for $p, q > N_\varepsilon$ we have $\|x_p - x_q\| < \varepsilon$.
Definition 49.5. Let $A \subset \mathbb{R}^n$. A point $x \in \mathbb{R}^n$ is called an interior point of the set $A$ if there exists a hypersphere $S_r(x)$ such that $S_r(x) \subset A$.
For example, the closed hypersphere
$$\bar{S}_r(a) = \{x \in \mathbb{R}^n \mid \|x - a\| \le r\}$$
is closed.
Definition 49.9. A point a ∈ Rn is a limit point (or a point of accumulation) of the set
A ⊂ Rn provided every deleted neighborhood of a intersects A.
Definition 49.10. The closure A of a set A ⊂ Rn is the intersection of all closed sets
containing A.
The set of points in A and not in the interior Int(A) of A is called the boundary of A
and it is denoted by ∂A.
The closure operation has the properties:
• A ∪ B = A ∪ B;
• A ⊃ A;
• A = A;
• A=A ⇔ A is closed;
Definition 49.11. A set $A \subset \mathbb{R}^n$ is bounded if there exists $r > 0$ such that $A \subset S_r(0)$.
Definition 49.12. A set $A \subset \mathbb{R}^n$ is compact if it is closed and bounded. For example, the closed disk $\{(x,y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 1\}$ is a compact subset of $\mathbb{R}^2$.
Remark 49.1. If A ⊂ Rn is a compact set, then every sequence (xk ) with xk ∈ A contains
a subsequence (xkl ) which converges to a point x0 ∈ A.
Definition 49.13. A set A ⊂ Rn is connected if there are no open sets G1 , G2 such that
A ⊂ G1 ∪ G2 , A ∩ G1 6= ∅, A ∩ G2 6= ∅, and (A ∩ G1 ) ∩ (A ∩ G2 ) = ∅.
We write $\lim_{x \to a} f(x) = L$.
Just like in the case of functions of one variable, this definition is technically difficult to
implement except for the simplest functions. However, the obvious generalization of the
sum, product and quotient rules can be proved. Their use is illustrated in the following
example.
Example 50.1. Evaluate $\lim_{(x,y)\to(2,1)} f(x,y)$ where $f(x,y) = \dfrac{x^2 - y^2}{x^2 + y^2}$.
Solution: by the quotient rule,
$$\frac{x^2 - y^2}{x^2 + y^2} \longrightarrow \frac{3}{5} \quad\text{as } (x,y) \to (2,1).$$
We write $L = \lim_{x\to a} f(x)$. By the Cauchy–Bolzano criterion, this limit exists if and only if for any $\varepsilon > 0$ there exists $\delta > 0$ such that
$$0 < \|x' - a\| < \delta \ \text{ and } \ 0 < \|x'' - a\| < \delta \;\Rightarrow\; \|f(x') - f(x'')\| < \varepsilon.$$
51 Continuity
This definition requires three things: firstly that $\lim_{x\to a} f(x)$ exists, secondly that $f(a)$ is defined, and finally that these two values are equal.
In terms of ε, δ this is equivalent to the following:
Definition 51.2. A function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is continuous at $a \in A$ if for every $\varepsilon > 0$ there exists $\delta = \delta(\varepsilon) > 0$ such that
$$\|x - a\| < \delta \Rightarrow |f(x) - f(a)| < \varepsilon.$$
Definition 51.3. A vector valued $n$-variable function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is continuous at $a \in A$ if for every $\varepsilon > 0$ there exists $\delta = \delta(\varepsilon) > 0$ such that
$$\|x - a\| < \delta \Rightarrow \|f(x) - f(a)\| < \varepsilon.$$
Example 51.1. Use the ε, δ condition of continuity to prove that the following functions
are continuous at the mentioned points:
a) f (x, y) = x2 + y 2 at x = y = 0;
b) f (x, y) = (x2 − y 2 , x · y) at x = 1, y = 1;
c) f (x, y, z) = x + y + z at x = y = z = 0;
d) f (x, y, z) = (x2 + y 2 + z 2 , x + y + z) at x = y = z = 1.
The rules for continuous functions of one variable can be generalized to give corresponding
rules for functions of several variables. These are stated in the next two theorems.
Theorem 51.1. Let $f$ and $g$ be real valued functions of $n$ variables defined in a neighborhood of $a$. If $f$ and $g$ are continuous at $a$, then so are $f+g$, $f \cdot g$, and, when $f(x) \ne 0$, $\dfrac{1}{f}$.
Theorem 51.2. Let f : A ⊂ Rn → B ⊂ Rm be continuous at a ∈ A and g : B ⊂ Rm → Rp
be continuous at f (a) = b ∈ Rm . Then the composite function g◦f : A → Rp is continuous
at a.
Discontinuities of functions of more than one variable are often difficult to spot. In the
case of a function f of two variables, any discontinuities can be visualized geometrically
by appealing to the surface in R3 represented by the equation z = f (x, y). Some, but by
no means all, of the discontinuities of such a surface correspond to holes or tears.
Example 51.2. The function f : R2 → R1 given by f (x, y) = x2 + y 2 is continuous for
all (x, y). The surface given by z = f (x, y) is a parabolic bowl. To see this, notice that a
horizontal section for z = k ≥ 0 gives the circle x2 + y 2 = k and a vertical cross-section
for fixed x or y gives a parabola.
Example 51.3. Investigate the behavior of the function $f$ given by
$$f(x,y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{if } (x,y) \ne (0,0) \\ 0 & \text{if } (x,y) = (0,0) \end{cases}$$
as $(x,y)$ approaches $(0,0)$.
Solution: $f$ is discontinuous at $(0,0)$, and the discontinuity is much nastier than the removable and jump discontinuities seen for functions of one variable. Recall that when establishing the discontinuity of a function of one variable at some point, often the right and left-hand limits existed but were unequal. Clearly, if $\lim_{(x,y)\to(0,0)} f(x,y)$ exists, its value must be independent of the way in which $(x,y)$ approaches $(0,0)$. Consider $\lim_{x\to 0} f(x, mx)$:
$$\lim_{x\to 0} f(x, mx) = \lim_{x\to 0} \frac{mx^2}{x^2 + m^2x^2} = \frac{m}{1+m^2}.$$
But this quantity varies with $m$, and so $f$ cannot be continuous at $(0,0)$, no matter what value is specified for $f(0,0)$.
Example 51.4. Investigate the behavior of the function $f$ given by
$$f(x,y) = \begin{cases} \dfrac{xy^3}{x^2 + y^6} & \text{if } (x,y) \ne (0,0) \\ 0 & \text{if } (x,y) = (0,0). \end{cases}$$
Solution: Firstly
$$\lim_{x\to 0} f(x, mx) = \lim_{x\to 0} \frac{m^3 x^4}{x^2 + m^6 x^6} = \lim_{x\to 0} \frac{m^3 x^2}{1 + m^6 x^4} = 0.$$
However, this is not sufficient evidence to suppose that $f(x,y)$ approaches $0$ as $(x,y)$ approaches $(0,0)$. In fact
$$\lim_{x\to 0} f(x, \sqrt[3]{x}) = \lim_{x\to 0} \frac{x^2}{x^2 + x^2} = \frac12.$$
Hence $f$ is discontinuous at $(0,0)$.
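Both phenomena are easy to observe numerically: along every straight line the values shrink to $0$, while along the curve $x = y^3$ they stay at $\frac12$. A small Python sketch:

```python
def f(x, y):
    """f(x, y) = x*y^3 / (x^2 + y^6), away from the origin."""
    return x * y**3 / (x**2 + y**6)

# along every straight line y = m*x the values tend to 0 ...
for m in (1.0, 2.0, -3.0):
    assert abs(f(1e-6, m * 1e-6)) < 1e-3

# ... but along the curve x = y^3 the value is identically 1/2
for y in (0.1, 0.01, 0.001):
    assert abs(f(y**3, y) - 0.5) < 1e-9
```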
Theorem 51.3. Let f : A ⊂ Rn → Rm , f (x) = (f1 (x), . . . , fm (x)) and a ∈ A. The
function f is continuous at a ∈ A if and only if the functions fi , i = 1, 2, . . . , m are
continuous at a.
Theorem 51.4 (Heine’s criterion for continuity). The function f : A ⊂ Rn → Rm is
continuous at a ∈ A if and only if for any sequence (xk ), xk ∈ A, xk −−−→ a the sequence
k→∞
(f (xk )) converges to f (a).
Theorem 51.5 (Cauchy-Bolzano's criterion for continuity). The function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is continuous at $a \in A$ if and only if for any $\varepsilon > 0$ there exists $\delta > 0$ such that
$$x', x'' \in A,\ \|x' - a\| < \delta,\ \|x'' - a\| < \delta \;\Rightarrow\; \|f(x') - f(x'')\| < \varepsilon.$$
52 Important properties of continuous functions
Proof.
a) Assume that $f(A)$ is unbounded. Then for every $k \in \mathbb{N}$ there exists $x_k \in A$ such that $\|f(x_k)\| > k$. The sequence $(x_k)$ is bounded and therefore there exists a subsequence $(x_{k_l})$ of the sequence $(x_k)$ which converges towards a point $x_0 \in A$: $x_{k_l} \to x_0$. Hence the sequence $(f(x_{k_l}))$ converges to $f(x_0)$. Therefore there exists $N$ such that for $k_l > N$ we have $\|f(x_{k_l})\| \le \|f(x_0)\| + 1$, which contradicts $\|f(x_{k_l})\| > k_l$.
b) Consider $R = \sup \|f(A)\|$ and note that for every $k \in \mathbb{N}$ there exists $x_k \in A$ such that
$$R - \frac{1}{k} < \|f(x_k)\| \le R.$$
For the sequence $(x_k)$ there exists a subsequence $(x_{k_l})$ such that $x_{k_l} \to x_0 \in A$. Therefore $f(x_{k_l}) \to f(x_0)$ and $\|f(x_{k_l})\| \to \|f(x_0)\|$. From the inequality
$$R - \frac{1}{k_l} < \|f(x_{k_l})\| \le R$$
it follows that $\|f(x_0)\| = R$.
Corollary 52.1. If $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is continuous on the compact set $A$, then $f$ is bounded and attains its infimum and supremum on $A$.
Corollary 52.2. The function $f$ is continuous on $A$ if and only if for every closed set $F' \subset \mathbb{R}^m$ there exists a closed set $F \subset \mathbb{R}^n$ such that
$$f^{-1}(F') = F \cap A.$$
Corollary 52.3. If the set A ⊂ Rn is compact and connected and the function f : A ⊂
Rn → R1 is continuous, then f (A) is a closed interval.
53 Differentiation
This section defines what is meant by saying that a function of n variables is differentiable,
but to begin, let’s examine the concept of partial differentiability.
Remark 53.1. To calculate partial derivatives, one has to differentiate (in the normal
manner) with respect to xi keeping all the other variables fixed. Hence, the obvious rules
for partially differentiating sums, products and quotients can be used.
Example 53.2. Calculate the partial derivatives of the function $f$ given by
$$f(x_1, \dots, x_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}\, x_i x_j, \qquad (x_1, \dots, x_n) \in \mathbb{R}^n.$$
Solution:
$$\frac{\partial f}{\partial x_k} = \sum_{j=1}^{n} (a_{kj} + a_{jk})\, x_j.$$
Example 53.3. Calculate the partial derivatives of $f(x,y,z) = (x+y+z,\; xy+xz+yz,\; xyz)$.
Solution:
$$\frac{\partial f}{\partial x} = (1,\, y+z,\, yz); \quad \frac{\partial f}{\partial y} = (1,\, x+z,\, xz); \quad \frac{\partial f}{\partial z} = (1,\, x+y,\, xy).$$
Definition 53.4. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ be a real valued function of $n$ variables, $a$ an interior point of the set $A$ and $u$ a unit vector in $\mathbb{R}^n$ (i.e. $\|u\| = 1$). If the limit
$$\lim_{t\to 0} \frac{f(a + t \cdot u) - f(a)}{t}$$
exists, it is called the directional derivative of $f$ at the point $a$ and it is denoted by
$$\nabla_u f(a) = \lim_{t\to 0} \frac{f(a + t \cdot u) - f(a)}{t}.$$
Remark 53.2. Let $e_i = (0, \dots, 0, 1, 0, \dots, 0)$, with the $1$ in position $i$. The directional derivative of $f$ at $a$ in the direction $e_i$ is
$$\nabla_{e_i} f(a) = \frac{\partial f}{\partial x_i}(a), \qquad i = 1, \dots, n.$$
Hence, partial derivatives are special cases of directional derivatives.
Example 53.4. If $u = (u_x, u_y)$ and $f(x,y) = x \cdot y$, then
$$\nabla_u f(x,y) = \frac{\partial f}{\partial x} \cdot u_x + \frac{\partial f}{\partial y} \cdot u_y = y \cdot u_x + x \cdot u_y.$$
Definition 53.5. Let $f = (f_1, \dots, f_m)$ be a vector valued function of $n$ variables, $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$; $a$ an interior point of $A$ and $u$ a unit vector in $\mathbb{R}^n$ (i.e. $\|u\| = 1$). If the limit
$$\lim_{t\to 0} \frac{f(a + t \cdot u) - f(a)}{t}$$
exists, it is called the directional derivative of $f$ at the point $a$ and it is denoted by $\nabla_u f(a)$. It is easy to see that
$$\nabla_u f(a) = (\nabla_u f_1(a), \dots, \nabla_u f_m(a)).$$
Remark 53.3. The directional derivative $\nabla_u f(a)$ exists if the directional derivatives $\nabla_u f_i(a)$, $i = 1, \dots, m$, exist.
Example 53.5. The directional derivative of the function $f(x,y,z) = (xy + xz + yz,\; xyz)$ at the point $(x,y,z)$ along $u = (u_x, u_y, u_z)$ is
$$\nabla_u f(x,y,z) = \big((y+z)u_x + (x+z)u_y + (x+y)u_z,\;\; yz\,u_x + xz\,u_y + xy\,u_z\big).$$
Now remark that $\|v_i + \theta_i h_i e_i - a\| \le \|h\|$, $i = 1, \dots, n$, and therefore, for $\varepsilon > 0$, there is $\delta > 0$ such that $\|h\| < \delta$ implies
$$\left| \frac{\partial f}{\partial x_i}(v_i + \theta_i h_i e_i) - \frac{\partial f}{\partial x_i}(a) \right| < \frac{\varepsilon}{n}, \qquad i = 1, \dots, n.$$
So,
$$\|h\| < \delta \Rightarrow \frac{1}{\|h\|} \left| f(a+h) - f(a) - \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(a) \cdot h_i \right| < \varepsilon.$$
The above theorem shows that in a small neighborhood of $a$ the function $f$ can be approximated by the polynomial of first degree $f(a) + \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(a) \cdot h_i$.
Example 53.6. Show that the following functions are differentiable and compute their
Fréchet derivatives.
The Fréchet derivative is a set of m polynomials of first degree in h1 , h2 , ..., hn .
Example 53.7. Show that f (x1 , x2 , x3 ) = (x1 x2 x3 , x21 + x22 + x23 ) is differentiable at any
point and compute its Fréchet derivative.
Definition 53.8. The matrix of the linear function $d_a f$ is called the Jacobi matrix of f at a. This is an $m \times n$ matrix given by
\[ J_a(f) = \left( \frac{\partial f_i}{\partial x_j}(a) \right)_{m \times n}. \]
We have $d_a f(h) = J_a(f) \cdot h$.
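As an illustration (a sketch, not part of the original text), the Jacobi matrix of the function from Example 53.7 can be approximated column by column with central differences and compared with the analytic partial derivatives:

```python
def f(x):
    x1, x2, x3 = x
    return [x1 * x2 * x3, x1**2 + x2**2 + x3**2]   # the function of Example 53.7

def jacobian(F, a, eps=1e-6):
    # numerical m x n Jacobi matrix: column j holds the partials w.r.t. x_j
    m, n = len(F(a)), len(a)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        ap = list(a); ap[j] += eps
        am = list(a); am[j] -= eps
        Fp, Fm = F(ap), F(am)
        for i in range(m):
            J[i][j] = (Fp[i] - Fm[i]) / (2 * eps)
    return J

a = [1.0, 2.0, 3.0]
J = jacobian(f, a)
# analytic Jacobi matrix: row 1 = (x2*x3, x1*x3, x1*x2), row 2 = (2x1, 2x2, 2x3)
print(J)
```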
Show that f is Fréchet differentiable at any point $x \in \mathbb{R}^n$ and the following relations hold:
\[ d_x f(h) = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij} h_j \right) \cdot e_i \]
\[ J_a(f) = \left( \frac{\partial f_i}{\partial x_j}(a) \right)_{m \times n} = (a_{ij})_{m \times n}. \]
f differentiable at a implies
\[ f(x) - f(a) = d_a f(x - a) + \varepsilon_1(x) \cdot \|x - a\| \]
with $\varepsilon_1(x) \to 0$ as $x \to a$.
g differentiable at $b = f(a)$ implies
\[ g(y) - g(b) = d_b g(y - b) + \varepsilon_2(y) \cdot \|y - b\| \]
with $\varepsilon_2(y) \to 0$ as $y \to b$.
Hence
\begin{align*}
h(x) - h(a) &= g(f(x)) - g(f(a)) = d_b g(f(x) - f(a)) + \varepsilon_2(f(x)) \cdot \|f(x) - f(a)\| \\
&= d_b g(d_a f(x - a) + \varepsilon_1(x) \cdot \|x - a\|) + \varepsilon_2(f(x)) \cdot \|d_a f(x - a) + \varepsilon_1(x) \cdot \|x - a\|\| \\
&= d_b g \circ d_a f(x - a) + \|x - a\| \, d_b g(\varepsilon_1(x)) + \|d_a f(x - a) + \varepsilon_1(x) \cdot \|x - a\|\| \cdot \varepsilon_2(f(x)).
\end{align*}
Denote
\begin{align*}
\varepsilon_3(x) &= \frac{h(x) - h(a) - d_b g \circ d_a f(x - a)}{\|x - a\|} \\
&= d_b g(\varepsilon_1(x)) + \frac{\|d_a f(x - a) + \varepsilon_1(x) \cdot \|x - a\|\|}{\|x - a\|} \cdot \varepsilon_2(f(x)).
\end{align*}
Hence
\[ \|\varepsilon_3(x)\| \leq \|d_b g\| \cdot \|\varepsilon_1(x)\| + (\|d_a f\| + \|\varepsilon_1(x)\|) \cdot \|\varepsilon_2(f(x))\| \]
and $\varepsilon_3(x) \to 0$ as $x \to a$.
Remark 53.8. The Jacobi matrix of h at a is the product of the Jacobi matrix of g at b and the Jacobi matrix of f at a:
\[ \frac{\partial h_i}{\partial x_j}(a) = \sum_{k=1}^{m} \frac{\partial g_i}{\partial y_k}(b) \cdot \frac{\partial f_k}{\partial x_j}(a), \qquad i = \overline{1, p}, \; j = \overline{1, n}. \]
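Remark 53.8 can be verified numerically. The sketch below (with two arbitrarily chosen maps $f, g : \mathbb{R}^2 \to \mathbb{R}^2$; the names are illustrative) compares the Jacobi matrix of $h = g \circ f$ with the product of the Jacobi matrices of g and f:

```python
def f(x):
    return [x[0] * x[1], x[0] + x[1]]

def g(y):
    return [y[0]**2, y[0] * y[1]]

def jacobian(F, a, eps=1e-6):
    # numerical Jacobi matrix via central differences
    m, n = len(F(a)), len(a)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        ap = list(a); ap[j] += eps
        am = list(a); am[j] -= eps
        Fp, Fm = F(ap), F(am)
        for i in range(m):
            J[i][j] = (Fp[i] - Fm[i]) / (2 * eps)
    return J

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

a = [1.0, 2.0]
Jh = jacobian(lambda x: g(f(x)), a)                # Jacobi matrix of h = g o f
Jprod = matmul(jacobian(g, f(a)), jacobian(f, a))  # J_b(g) * J_a(f), b = f(a)
print(Jh, Jprod)
```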
a) $[x, x + h] \subset A$
b) f is differentiable on $[x, x + h]$
Proof. Consider the function $\varphi(t) = f(x + th)$ for $t \in [0, 1]$. The function $\varphi$ is differentiable on $[0, 1]$ and $\varphi'(t) = d_{x+th} f(h)$. Since $f(x + h) - f(x) = \varphi(1) - \varphi(0)$, applying the mean value theorem for the function $\varphi$ on $[0, 1]$, we obtain that there exists $t_0 \in (0, 1)$ such that:
\[ \varphi(1) - \varphi(0) = \varphi'(t_0) = d_{x + t_0 h} f(h) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(x + t_0 h) \cdot h_i \]
1) $[x, x + h] \subset A$
2) f is differentiable on $[x, x + h]$
3) $\|d_{x+th} f\| \leq M$, $\forall t \in [0, 1]$
Proof. Consider again the function $\varphi(t) = f(x + th)$ for $t \in [0, 1]$ and
\[ \psi(t) = \sum_{i=1}^{m} (\varphi_i(1) - \varphi_i(0)) \cdot \varphi_i(t) \quad \text{for } t \in [0, 1]. \]
For $\psi$ there exists $t_0 \in [0, 1]$ such that $\psi(1) - \psi(0) = \psi'(t_0)$. Hence
\begin{align*}
\|\varphi(1) - \varphi(0)\|^2 &= \sum_{i=1}^{m} [\varphi_i(1) - \varphi_i(0)]^2 = \sum_{i=1}^{m} [\varphi_i(1) - \varphi_i(0)] \cdot \varphi_i'(t_0) \leq \\
&\leq \left[ \sum_{i=1}^{m} [\varphi_i(1) - \varphi_i(0)]^2 \right]^{\frac{1}{2}} \cdot \left[ \sum_{i=1}^{m} [\varphi_i'(t_0)]^2 \right]^{\frac{1}{2}} \leq M \cdot \|h\| \cdot \|\varphi(1) - \varphi(0)\|
\end{align*}
and
\[ \|\varphi(1) - \varphi(0)\| \leq M \cdot \|h\| \]
Local extremum
Proof. Consider $h \in \mathbb{R}^n$ and, for $t \in \mathbb{R}^1$ sufficiently close to 0, the function $\varphi(t) = f(c + th)$. If the function f possesses a local maximum or a local minimum at c, then $\varphi$ possesses a local maximum or a local minimum at $t = 0$. Since the derivative of $\varphi$ at a local extremum is equal to 0, it follows that
\[ \varphi'(0) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(c) \cdot h_i = 0 \qquad \forall h \in \mathbb{R}^n \]
Therefore
\[ \frac{\partial f}{\partial x_i}(c) = 0 \quad \text{for } i = \overline{1, n} \]
Note that although $\frac{\partial f}{\partial x_i}(c) = 0$ at a local extremum c, this is not a sufficient condition for such a point.
Just like in the case of functions of one variable, stationary points for functions of several
variables can be classified with the aid of Taylor approximations.
\[ d_b f^{-1} = (d_a f)^{-1} \]
Proof. The proof of this theorem is rather technical and will be skipped.
Example 54.2. Show that if $\rho \neq 0$, then the function $f(\rho, \theta) = (\rho \cos \theta, \rho \sin \theta)$ is locally invertible.
Example 54.3. Show that if $\rho \neq 0$, then the function $f(\rho, \theta, \varphi) = (\rho \sin \theta \cos \varphi, \rho \sin \theta \sin \varphi, \rho \cos \theta)$ is locally invertible.
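For Example 54.2, local invertibility follows once the Jacobi matrix is invertible. A quick numerical sketch (the test point is an arbitrary choice) shows that the Jacobian determinant of the polar map equals $\rho$, which is nonzero for $\rho \neq 0$:

```python
import math

def f(p):
    rho, theta = p
    return [rho * math.cos(theta), rho * math.sin(theta)]

def jacobian_det(F, a, eps=1e-6):
    # determinant of the 2x2 Jacobi matrix, approximated by central differences
    cols = []
    for j in range(2):
        ap = list(a); ap[j] += eps
        am = list(a); am[j] -= eps
        Fp, Fm = F(ap), F(am)
        cols.append([(Fp[i] - Fm[i]) / (2 * eps) for i in range(2)])
    return cols[0][0] * cols[1][1] - cols[0][1] * cols[1][0]

det = jacobian_det(f, [2.0, 0.7])
print(det)  # the analytic determinant is rho = 2
```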
Implicit functions
Consider two open subsets $A \subset \mathbb{R}^n$ and $B \subset \mathbb{R}^m$ and let $f : A \times B \to \mathbb{R}^m$ be a function.
Denote by f1 , f2 , ... , fm the scalar components of f , i.e. f = (f1 , f2 , ..., fm ) and consider
the system of equations:
f1 (x1 , x2 , ..., xn , y1 , y2 , ..., ym ) = 0
f2 (x1 , x2 , ..., xn , y1 , y2 , ..., ym ) = 0
(∗)
··· ··· ··· ··· ··· ···
fm (x1 , x2 , ..., xn , y1 , y2 , ..., ym ) = 0
Definition 54.4. If there exists a function ϕ : A0 ⊂ A → B, ϕ = (ϕ1 , ϕ2 , ..., ϕm ) such
that the following equalities hold:
f1 (x1 , x2 , ..., xn , ϕ1 (x1 , x2 , ..., xn ), ..., ϕm (x1 , x2 , ..., xn )) = 0
f2 (x1 , x2 , ..., xn , ϕ1 (x1 , x2 , ..., xn ), ..., ϕm (x1 , x2 , ..., xn )) = 0
(∗∗)
··· ··· ··· ··· ··· ··· ··· ··· ··· ···
fm (x1 , x2 , ..., xn , ϕ1 (x1 , x2 , ..., xn ), ..., ϕm (x1 , x2 , ..., xn )) = 0
for any (x1 , x2 , ..., xn ) ∈ A0 , then the function ϕ = (ϕ1 , ϕ2 , ..., ϕm ) is said to be defined
implicitly by the system of equations (∗).
where fa : B → Rm is defined by fa (y) = f (a, y), then there exists an open neighborhood
U of a and an open neighborhood V of b and a function ϕ : U → V having the following
properties:
i) ϕ(a) = b
ii) f (x, ϕ(x)) = 0 ∀x ∈ U
iii) ϕ is continuously differentiable on U
Note that the function ϕ defined implicitly cannot be always written as an explicit formula.
Example 54.4. a) Find the function defined implicitly by the equation $x^2 + y^2 = 1$.
b) Show that the equation $y^5 + y - x = 0$ defines implicitly a function $y = y(x)$.
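Part b) can also be explored numerically. The function $g(y) = y^5 + y - x$ is strictly increasing in y, so for each x the equation has a unique root y(x); a bisection sketch (illustrative, not from the text) recovers it, and differentiating $f(x, y(x)) = 0$ implicitly gives $y'(x) = 1/(5y^4 + 1)$:

```python
def y_of_x(x, lo=-10.0, hi=10.0, iters=200):
    # unique root of g(y) = y^5 + y - x, found by bisection (g is increasing)
    g = lambda y: y**5 + y - x
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

y0 = y_of_x(2.0)                  # y(2) = 1, since 1^5 + 1 = 2
eps = 1e-5
dydx = (y_of_x(2.0 + eps) - y_of_x(2.0 - eps)) / (2 * eps)
print(y0, dydx)                   # dydx should be close to 1/(5*y0^4 + 1) = 1/6
```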
55 Higher order partial differentiability
The second Fréchet derivative $d_a^2 f$ is a system of m bilinear forms in $u_1, u_2, \ldots, u_n$; $v_1, v_2, \ldots, v_n$.
The second Fréchet derivative of f at a satisfies
\[ \lim_{u \to 0} \frac{1}{\|u\|} \, \|d_{a+u} f(v) - d_a f(v) - d_a^2 f(u)(v)\| = 0 \]
for every $v \in \mathbb{R}^n$. In other words, the polynomial $d_{a+u} f(v)$ can be approximated by the polynomial $[d_a f + d_a^2 f(u)](v)$.
\[ \frac{\partial^2 f_i}{\partial x_j \partial x_k}(a) = \frac{\partial^2 f_i}{\partial x_k \partial x_j}(a) \qquad i = \overline{1, m}, \; j, k = \overline{1, n} \]
Proof. The proof of this theorem is rather technical and will be skipped.
Example 55.3. Consider $f(x, y, z) = (xy + xz + yz, xyz)$ and verify that:
\[ \frac{\partial^2 f}{\partial x \partial y} = \frac{\partial^2 f}{\partial y \partial x} \qquad \frac{\partial^2 f}{\partial x \partial z} = \frac{\partial^2 f}{\partial z \partial x} \qquad \frac{\partial^2 f}{\partial y \partial z} = \frac{\partial^2 f}{\partial z \partial y} \]
\[ \frac{\partial^k f_i}{\partial x_{j_1} \partial x_{j_2} \cdots \partial x_{j_k}}(a) = \frac{\partial^k f_i}{\partial x_{\sigma(j_1)} \partial x_{\sigma(j_2)} \cdots \partial x_{\sigma(j_k)}}(a) \]
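The symmetry of mixed partial derivatives in Example 55.3 can be checked with a four-point finite-difference stencil (a sketch; the evaluation point is arbitrary), here for the first component $f_1(x, y, z) = xy + xz + yz$:

```python
def f1(x, y, z):
    return x * y + x * z + y * z   # first component of f in Example 55.3

def mixed(f, i, j, p, eps=1e-4):
    # four-point central approximation of d^2 f / (dx_i dx_j) at the point p
    def shifted(q, k, d):
        r = list(q); r[k] += d; return r
    return (f(*shifted(shifted(p, i, eps), j, eps))
            - f(*shifted(shifted(p, i, eps), j, -eps))
            - f(*shifted(shifted(p, i, -eps), j, eps))
            + f(*shifted(shifted(p, i, -eps), j, -eps))) / (4 * eps * eps)

p = [1.0, 2.0, 3.0]
fxy = mixed(f1, 0, 1, p)   # d^2 f1 / dx dy
fyx = mixed(f1, 1, 0, p)   # d^2 f1 / dy dx
print(fxy, fyx)            # both are 1 for this f1
```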
56 Taylor's theorems
Theorem 56.1 (Taylor's formula with integral remainder). If the partial derivatives of order m + 1 of the function $f : A \subset \mathbb{R}^n \to \mathbb{R}^p$ are continuous and the closed segment $[x, x + h]$ is included in the open set A, then:
\[ f(x + h) = f(x) + \frac{1}{1!} d_x f(h) + \frac{1}{2!} d_x^2 f(h)(h) + \cdots + \frac{1}{m!} d_x^m f\underbrace{(h) \cdots (h)}_{m} + \frac{1}{m!} \int_0^1 (1 - t)^m \cdot d_{x+th}^{m+1} f\underbrace{(h) \cdots (h)}_{m+1} \, dt \]
Proof. The function $g(t) = f(x + th)$ is considered for $t \in [0, 1]$. For g the following relations hold
\[ \frac{d^k g}{dt^k} = g^{(k)}(t) = d_{x+th}^k f\underbrace{(h) \cdots (h)}_{k} \qquad k = \overline{1, m+1} \]
On the other hand
\[ \frac{d}{dt} \left[ g(t) + \frac{1-t}{1!} g'(t) + \cdots + \frac{(1-t)^m}{m!} g^{(m)}(t) \right] = \frac{(1-t)^m}{m!} g^{(m+1)}(t) \]
and so
\[ g(1) - \left[ g(0) + \frac{1}{1!} g'(0) + \cdots + \frac{1}{m!} g^{(m)}(0) \right] = \frac{1}{m!} \int_0^1 (1-t)^m \cdot g^{(m+1)}(t) \, dt \]
Hence
\[ f(x + h) = f(x) + \frac{1}{1!} d_x f(h) + \frac{1}{2!} d_x^2 f(h)(h) + \cdots + \frac{1}{m!} d_x^m f\underbrace{(h) \cdots (h)}_{m} + \frac{1}{m!} \int_0^1 (1 - t)^m \cdot d_{x+th}^{m+1} f\underbrace{(h) \cdots (h)}_{m+1} \, dt \]
Theorem 56.2 (Taylor's formula with the Lagrange remainder). If the function $f : A \subset \mathbb{R}^n \to \mathbb{R}^p$ is m + 1 times differentiable on A and $\|d_y^{m+1} f\| \leq M$ on the closed segment $[x, x + h]$ which is included in A, then:
\[ \left\| f(x+h) - f(x) - \frac{1}{1!} d_x f(h) - \cdots - \frac{1}{m!} d_x^m f\underbrace{(h) \cdots (h)}_{m} \right\| \leq M \cdot \frac{\|h\|^{m+1}}{(m+1)!} \]
Proof. The proof of this theorem is rather technical and it will be skipped.
Theorem 56.3 (Taylor's formula with $O(\|h\|^m)$ remainder). If the function f is (m−1)-times differentiable on A and m-times differentiable at $x \in A$, then:
\[ \left\| f(x+h) - f(x) - \frac{1}{1!} d_x f(h) - \cdots - \frac{1}{m!} d_x^m f\underbrace{(h) \cdots (h)}_{m} \right\| = O(\|h\|^m) \]
Proof. By induction.
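These theorems predict how fast the Taylor polynomial error shrinks. A numerical sketch (the function $e^x \sin y$ and the direction of h are arbitrary choices): halving h should divide the error of the second-order polynomial by about $2^3 = 8$, consistent with a remainder of cubic order in $\|h\|$:

```python
import math

def f(x, y):
    return math.exp(x) * math.sin(y)

def T2(h1, h2):
    # second-order Taylor polynomial of f at (0, 0): f(0,0) = 0,
    # d f(h) = h2, (1/2!) d^2 f(h)(h) = h1*h2 (since f_xy = 1, f_xx = f_yy = 0)
    return h2 + h1 * h2

def err(t):
    h1, h2 = 0.3 * t, 0.4 * t
    return abs(f(h1, h2) - T2(h1, h2))

ratio = err(1e-2) / err(5e-3)
print(ratio)  # close to 8, i.e. the remainder behaves like ||h||^3
```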
57 Classification theorem for local extrema
58 Conditional extrema
\[ \Gamma = \{ x \in A \mid g_i(x) = 0, \; i = \overline{1, p} \} \]
where $g_i : A \to \mathbb{R}^1$, $p < n$.
It will be assumed that f and gi , i = 1, p have continuous first order partial derivatives
on A.
or
\[ \frac{\partial f}{\partial x_k}(a) = \sum_{i=1}^{p} \lambda_i \frac{\partial g_i}{\partial x_k}(a) \qquad k = \overline{1, n} \]
a) $f(x, y) = x^3$ if $x^2 + 6xy + y^2 = 1$
b) $f(x, y) = xy$ if $2x + 3y = 1$
c) $f(x, y, z) = x^2 + y^2 + z^2$ if $x + y + z = 1$
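For part c) the Lagrange condition above can be checked directly: the candidate point is $a = (1/3, 1/3, 1/3)$, where $\nabla f = (2/3, 2/3, 2/3)$ and $\nabla g = (1, 1, 1)$, so the condition holds with $\lambda = 2/3$. The sketch below (gradients computed numerically, only to mirror the formula) confirms this:

```python
def grad(F, a, eps=1e-6):
    # numerical gradient by central differences
    g = []
    for j in range(len(a)):
        ap = list(a); ap[j] += eps
        am = list(a); am[j] -= eps
        g.append((F(ap) - F(am)) / (2 * eps))
    return g

f = lambda x: x[0]**2 + x[1]**2 + x[2]**2   # objective of part c)
g = lambda x: x[0] + x[1] + x[2] - 1        # constraint x + y + z = 1
a = [1 / 3, 1 / 3, 1 / 3]                   # candidate extremum point
gf, gg = grad(f, a), grad(g, a)
lam = gf[0] / gg[0]                         # lambda from the first component
print(gf, gg, lam)                          # grad f = lambda * grad g, lambda = 2/3
```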
Consider the set of one dimensional bounded intervals of the form (a, b), [a, b), (a, b], [a, b], where $a, b \in \mathbb{R}$. The cartesian product $\Delta = I_1 \times I_2$ of two intervals of this type will be called a rectangle in $\mathbb{R}^2$. The area of such a rectangle $\Delta$ is defined by $\operatorname{area}(\Delta) = l(I_1) \cdot l(I_2)$, where $l(I)$ denotes the length of the interval I.
Consider the set $\mathcal{P}$ of all finite unions of rectangles $\Delta$, i.e. $P \in \mathcal{P}$ if and only if there exist $\Delta_1, \Delta_2, \ldots, \Delta_n$ such that
\[ P = \bigcup_{i=1}^{n} \Delta_i \]
Proof. Direct verification.
Proposition 59.2. For any $P \in \mathcal{P}$ there exist $\Delta_1, \Delta_2, \ldots, \Delta_n$ such that $P = \bigcup_{i=1}^{n} \Delta_i$ and $\Delta_i \cap \Delta_j = \emptyset$ if $i \neq j$.
where $P = \bigcup_{i=1}^{n} \Delta_i$ and $\Delta_1, \Delta_2, \ldots, \Delta_n$ are disjoint.
Definition 59.4. If the bounded set $A \subset \mathbb{R}^2$ is Jordan measurable, then the area of A is defined as
\[ \operatorname{area}(A) = \operatorname{area}_i(A) = \operatorname{area}_e(A) \]
Proposition 59.4. A bounded set A ⊂ R2 is Jordan measurable if and only if for any
ε > 0 there exist Pε , Qε ∈ P such that Pε ⊂ A ⊂ Qε and area(Qε ) − area(Pε ) < ε.
Proposition 59.5. A bounded set A ⊂ R2 is Jordan measurable if and only if there exist
two sequences (Pn ), (Qn ), Pn , Qn ∈ P and Pn ⊂ A ⊂ Qn such that
Proposition 59.6. A bounded set A ⊂ R2 is Jordan measurable if and only if the area
of its boundary is equal to zero.
Proposition 59.7. If A1 and A2 are Jordan measurable sets, then A1 ∪ A2 and A1 \ A2
are Jordan measurable and if A1 ∩ A2 = ∅, then area(A1 ∪ A2 ) = area(A1 ) + area(A2 ).
Proposition 59.8. Let M ⊂ R2 be a bounded set. If for any ε > 0 there exist two Jordan
measurable sets A and B such that A ⊂ M ⊂ B and area(B) − area(A) < ε, then the set
M is Jordan measurable.
Proposition 59.9. If there exist two sequences $(A_n)$, $(B_n)$ of Jordan measurable sets such that $A_n \subset M \subset B_n$ and
\[ \lim_{n \to \infty} \operatorname{area}(A_n) = \lim_{n \to \infty} \operatorname{area}(B_n) \]
Proof. The proof of the above statements is rather technical and it will be skipped.
Definition 60.3. The Riemann sum of f related to P is defined by
\[ \sigma_f(P) = \sum_{i=1}^{n} f(\xi_i, \eta_i) \cdot \operatorname{area}(A_i) \]
where $(\xi_i, \eta_i) \in A_i$.
Remark 60.1. It is obvious that we have
\[ L_f(P) \leq \sigma_f(P) \leq U_f(P) \]
Now f is bounded above and below on A. So there exist numbers m and M with $m \leq f(x, y) \leq M$ for all $(x, y) \in A$.
Thus for any partition P of A we have
\[ m \cdot \operatorname{area}(A) = \sum_{i=1}^{n} m \cdot \operatorname{area}(A_i) \leq L_f(P) \leq U_f(P) \leq \sum_{i=1}^{n} M \cdot \operatorname{area}(A_i) = M \cdot \operatorname{area}(A) \]
Hence the set $\{ U_f(P) \mid P \text{ is a partition of } A \}$ is bounded below.
So $L_f = \sup_P L_f(P)$ and $U_f = \inf_P U_f(P)$ exist.
The first result establishes the intuitively obvious fact that for a bounded function $L_f \leq U_f$.
Theorem 60.1. If f is defined and bounded on A, then $L_f \leq U_f$.
Proof. Let P be a partition of A and P′ the partition $P' = P \cup \{A_i', A_i''\}$ where $A_i' \cup A_i'' = A_i$ for one particular i, $1 \leq i \leq n$. In other words, P′ is obtained by decomposing $A_i$ into two measurable subsets.
It is now shown that $L_f(P) \leq L_f(P')$ and $U_f(P) \geq U_f(P')$.
Let $M_i' = \sup\{f(x, y) \mid (x, y) \in A_i'\}$ and $M_i'' = \sup\{f(x, y) \mid (x, y) \in A_i''\}$.
Clearly $M_i' \leq M_i$ and $M_i'' \leq M_i$. Hence
\begin{align*}
U_f(P') &= \sum_{j=1}^{i-1} M_j \cdot \operatorname{area}(A_j) + M_i' \cdot \operatorname{area}(A_i') + M_i'' \cdot \operatorname{area}(A_i'') + \sum_{j=i+1}^{n} M_j \cdot \operatorname{area}(A_j) \leq \\
&\leq \sum_{j=1}^{i-1} M_j \cdot \operatorname{area}(A_j) + M_i \cdot \operatorname{area}(A_i') + M_i \cdot \operatorname{area}(A_i'') + \sum_{j=i+1}^{n} M_j \cdot \operatorname{area}(A_j) = \\
&= \sum_{j=1}^{n} M_j \cdot \operatorname{area}(A_j) = U_f(P)
\end{align*}
Now suppose that $P_1$ and $P_2$ are two partitions of A, $P_1 = \{A_1, \ldots, A_m\}$ and $P_2 = \{B_1, \ldots, B_n\}$, and let $P_3$ be the partition $P_3 = \{A_i \cap B_j \mid i = \overline{1, m}, \; j = \overline{1, n}\}$.
Thus Lf (P1 ) ≤ Lf (P3 ) and Uf (P2 ) ≥ Uf (P3 ). Since Lf (P3 ) ≤ Uf (P3 ) it can be deduced
that Lf (P1 ) ≤ Uf (P2 ).
In other words the lower sum related to a given partition of A does not exceed the upper
sum related to any partition of A.
Hence every lower sum is a lower bound for the set of upper sums. So Lf (P ) ≤ Uf for all
possible partitions P . But then Uf is an upper bound for the set of lower sums.
Thus Lf ≤ Uf .
Definition 60.4. A function f defined and bounded on A is Riemann-Darboux integrable on A if $L_f = U_f$. This common value is denoted by
\[ \iint_A f(x, y) \, dx \, dy \]
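The definitions above translate directly into code. A sketch (the function x + y on the unit square is an arbitrary choice) computes lower and upper Darboux sums on an n × n grid; both bracket the common value, here $\iint (x + y) \, dx \, dy = 1$:

```python
def darboux_sums(f, n):
    # lower and upper Darboux sums of f over [0,1] x [0,1] on an n x n grid;
    # f(x, y) = x + y increases in each variable, so inf/sup sit at the corners
    h = 1.0 / n
    L = U = 0.0
    for i in range(n):
        for j in range(n):
            L += f(i * h, j * h) * h * h               # infimum corner
            U += f((i + 1) * h, (j + 1) * h) * h * h   # supremum corner
    return L, U

L, U = darboux_sums(lambda x, y: x + y, 200)
print(L, U)  # L <= 1 <= U, and U - L = 2/n shrinks as the partition refines
```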
61 Integrable functions
Proof. Since A is compact the function f is uniformly continuous. For $\varepsilon > 0$ there exists $\delta > 0$ such that
\[ [(x' - x'')^2 + (y' - y'')^2]^{\frac{1}{2}} < \delta \Rightarrow |f(x', y') - f(x'', y'')| < \frac{\varepsilon}{\operatorname{area}(A)} \]
Choose P such that
\[ (x', y'), (x'', y'') \in A_i \Rightarrow \sqrt{(x' - x'')^2 + (y' - y'')^2} < \delta \]
for $i = \overline{1, n}$. Hence
\[ U_f(P) - L_f(P) = \sum_{i=1}^{n} (M_i - m_i) \operatorname{area}(A_i) < \frac{\varepsilon}{\operatorname{area}(A)} \sum_{i=1}^{n} \operatorname{area}(A_i) = \varepsilon \]
Theorem 62.1. If f and g are Riemann-Darboux integrable on A, then all the integrals below exist and the following relations hold:
(1) $\displaystyle \iint_A [\alpha f(x, y) + \beta g(x, y)] \, dx \, dy = \alpha \iint_A f(x, y) \, dx \, dy + \beta \iint_A g(x, y) \, dx \, dy$, $\alpha, \beta \in \mathbb{R}^1$
(2) $\displaystyle \iint_A f(x, y) \, dx \, dy = \iint_{A_1} f(x, y) \, dx \, dy + \iint_{A_2} f(x, y) \, dx \, dy$ where $A_1 \cup A_2 = A$ and $A_1 \cap A_2 = \emptyset$
(3) if $f(x, y) \leq g(x, y)$ on A, then $\displaystyle \iint_A f(x, y) \, dx \, dy \leq \iint_A g(x, y) \, dx \, dy$
(4) $\displaystyle \left| \iint_A f(x, y) \, dx \, dy \right| \leq \iint_A |f(x, y)| \, dx \, dy$
Property (1) is called the linearity of the integral and (2) is called the additive property.
Theorem 62.2 (The mean value theorem). Let $f : A \to \mathbb{R}^1$ be integrable on A and satisfying
\[ m \leq f(x, y) \leq M \quad \text{for } (x, y) \in A \]
Then
\[ m \cdot \operatorname{area}(A) \leq \iint_A f(x, y) \, dx \, dy \leq M \cdot \operatorname{area}(A) \]
We intend to show that, under some conditions, the computation of the integral of a function of two variables reduces to the iterated computation of integrals of functions of one variable.
Assume that A is given by A = [a, b] × [c, d] and f : A → R1 .
Theorem 63.1. If the function f is integrable on A and if for every $x \in [a, b]$ (x fixed) the function $f_x(y) = f(x, y)$ is integrable on [c, d], i.e. the integral
\[ I(x) = \int_c^d f_x(y) \, dy = \int_c^d f(x, y) \, dy \]
exists, then the iterated integral
\[ \int_a^b dx \int_c^d f(x, y) \, dy \]
exists and
\[ \iint_A f(x, y) \, dx \, dy = \int_a^b dx \int_c^d f(x, y) \, dy \]
Proof. Consider the partitions $P_x = \{a = x_0 < x_1 < \cdots < x_i < \cdots < x_n = b\}$ of [a, b] and $P_y = \{c = y_0 < y_1 < \cdots < y_j < \cdots < y_m = d\}$ of [c, d]. Hence $P = \{A_{ij}\}_{i=\overline{0,n-1}, j=\overline{0,m-1}}$ is a partition of A, $A_{ij} = [x_i, x_{i+1}) \times [y_j, y_{j+1})$.
Denote by
\[ m_{ij} = \inf\{f(x, y) \mid (x, y) \in A_{ij}\} \quad \text{and} \quad M_{ij} = \sup\{f(x, y) \mid (x, y) \in A_{ij}\} \]
Hence
\[ m_{ij} \cdot [y_{j+1} - y_j] \leq \int_{y_j}^{y_{j+1}} f(x, y) \, dy \leq M_{ij} \cdot [y_{j+1} - y_j] \]
and hence
\begin{align*}
m_{ij} [y_{j+1} - y_j][x_{i+1} - x_i] &\leq [x_{i+1} - x_i] \cdot \inf_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \;\Big|\; x \in [x_i, x_{i+1}] \right\} \leq \\
&\leq [x_{i+1} - x_i] \cdot \int_{y_j}^{y_{j+1}} f(x, y) \, dy \leq \\
&\leq [x_{i+1} - x_i] \cdot \sup_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \;\Big|\; x \in [x_i, x_{i+1}] \right\} \leq \\
&\leq M_{ij} [y_{j+1} - y_j][x_{i+1} - x_i]
\end{align*}
Summing over i and j we obtain
\[ L_f(P) \leq L_{I(x)}(P_x) \leq U_{I(x)}(P_x) \leq U_f(P) \]
where
\[ L_f(P) = \sum_{i=0}^{n-1} \sum_{j=0}^{m-1} m_{ij} [x_{i+1} - x_i][y_{j+1} - y_j] \]
\[ L_{I(x)}(P_x) = \sum_{j=0}^{m-1} \sum_{i=0}^{n-1} [x_{i+1} - x_i] \cdot \inf_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \;\Big|\; x \in [x_i, x_{i+1}] \right\} \]
\[ U_{I(x)}(P_x) = \sum_{j=0}^{m-1} \sum_{i=0}^{n-1} [x_{i+1} - x_i] \cdot \sup_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \;\Big|\; x \in [x_i, x_{i+1}] \right\} \]
\[ U_f(P) = \sum_{j=0}^{m-1} \sum_{i=0}^{n-1} M_{ij} [x_{i+1} - x_i][y_{j+1} - y_j] \]
Since
\[ \sup L_f(P) = \inf U_f(P) = \iint_A f(x, y) \, dx \, dy \]
we have
\[ \sup L_{I(x)}(P_x) = \inf U_{I(x)}(P_x) = \int_a^b dx \int_c^d f(x, y) \, dy \]
and
\[ \iint_A f(x, y) \, dx \, dy = \int_a^b dx \int_c^d f(x, y) \, dy \]
Similarly, if for every $y \in [c, d]$ the function $f_y(x) = f(x, y)$ is integrable on [a, b], then
\[ \iint_A f(x, y) \, dx \, dy = \int_c^d dy \int_a^b f(x, y) \, dx \]
Example 63.1. Consider $A = [1, 3] \times [1, 2]$ and $f(x, y) = \frac{1}{(x+y)^2}$. Evaluate $\displaystyle \iint_A f(x, y) \, dx \, dy$.
Example 63.2. Evaluate $\displaystyle \iint_A f(x, y) \, dx \, dy$ in the following cases:
c) $A = [0, 1] \times [0, 1]$ and $f(x, y) = \dfrac{y}{(1 + x^2 + y^2)^{\frac{3}{2}}}$
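For Example 63.1 the iterated integral can be evaluated by hand: $\int_1^2 (x+y)^{-2} \, dy = \frac{1}{x+1} - \frac{1}{x+2}$, and integrating in x over [1, 3] gives $\ln \frac{6}{5}$. A midpoint-rule sketch of the iterated computation (grid sizes are arbitrary choices) agrees:

```python
import math

def inner(x, c=1.0, d=2.0, m=400):
    # I(x) = integral over [c, d] of dy / (x + y)^2, midpoint rule
    h = (d - c) / m
    return sum(h / (x + c + (j + 0.5) * h) ** 2 for j in range(m))

def iterated(a=1.0, b=3.0, n=400):
    # integral over [a, b] of I(x) dx, midpoint rule
    h = (b - a) / n
    return sum(h * inner(a + (i + 0.5) * h) for i in range(n))

val = iterated()
print(val, math.log(6 / 5))  # exact value ln(6/5) ≈ 0.18232
```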
The second order partial derivative $\frac{\partial^2 F}{\partial x \partial y}$ exists and
\[ \frac{\partial^2 F}{\partial x \partial y} = f(x, y) \quad \text{for } (x, y) \in A \]
Proof. Represent F(x, y) in the following form
\[ F(x, y) = \int_a^x \int_c^y f(u, v) \, du \, dv = \int_a^x du \int_c^y f(u, v) \, dv = \int_c^y dv \int_a^x f(u, v) \, du \]
Proof. Consider the partitions $P_x = \{a = x_0 < x_1 < \cdots < x_i < \cdots < x_m = b\}$ and $P_y = \{c = y_0 < y_1 < \cdots < y_j < \cdots < y_n = d\}$ of [a, b] and [c, d] respectively. Now let $P = \{A_{ij}\}_{i=\overline{0,m-1}, j=\overline{0,n-1}}$ be a partition of A where $A_{ij} = [x_i, x_{i+1}] \times [y_j, y_{j+1}]$ and apply twice the mean value theorem for the expression
\[ \Phi(x_{i+1}, y_{j+1}) - \Phi(x_{i+1}, y_j) - \Phi(x_i, y_{j+1}) + \Phi(x_i, y_j) \]
obtaining
\[ \Phi(x_{i+1}, y_{j+1}) - \Phi(x_{i+1}, y_j) - \Phi(x_i, y_{j+1}) + \Phi(x_i, y_j) = \frac{\partial^2 \Phi}{\partial x \partial y}(\xi_{ij}, \eta_{ij})(x_{i+1} - x_i)(y_{j+1} - y_j) = f(\xi_{ij}, \eta_{ij})(x_{i+1} - x_i)(y_{j+1} - y_j) \]
where $x_i \leq \xi_{ij} \leq x_{i+1}$ and $y_j \leq \eta_{ij} \leq y_{j+1}$.
Hence
\[ \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} f(\xi_{ij}, \eta_{ij})(x_{i+1} - x_i)(y_{j+1} - y_j) = \Phi(b, d) - \Phi(b, c) - \Phi(a, d) + \Phi(a, c) \]
Theorem 64.1. If the function $f : A \to \mathbb{R}^1$ is integrable on A and for every $x \in [a, b]$ the integral
\[ I(x) = \int_{g(x)}^{h(x)} f(x, y) \, dy \]
exists, then the iterated integral
\[ \int_a^b dx \int_{g(x)}^{h(x)} f(x, y) \, dy \]
exists and
\[ \iint_A f(x, y) \, dx \, dy = \int_a^b dx \int_{g(x)}^{h(x)} f(x, y) \, dy \]
Hence
\[ \sum_{i=1}^{n} \operatorname{area}(A_i) = \sum_{i=1}^{n} \left| \det \begin{pmatrix} \frac{\partial x}{\partial \xi} & \frac{\partial x}{\partial \eta} \\[2pt] \frac{\partial y}{\partial \xi} & \frac{\partial y}{\partial \eta} \end{pmatrix} \right| \cdot \operatorname{area}(B_i) \]
Considering a sequence of partitions $P_B^n$ with $\nu(P_B^n) \to 0$ as $n \to \infty$, we obtain the stated result.
Theorem 64.3. If $A, B \subset \mathbb{R}^2$ are Jordan measurable sets, $T : B \to A$ is a bijection such that T and $T^{-1}$ have continuous partial derivatives and $f : A \to \mathbb{R}^1$ is an integrable function, then the following equality holds:
\[ \iint_A f(x, y) \, dx \, dy = \iint_B f(x(\xi, \eta), y(\xi, \eta)) \left| \det \begin{pmatrix} \frac{\partial x}{\partial \xi} & \frac{\partial x}{\partial \eta} \\[2pt] \frac{\partial y}{\partial \xi} & \frac{\partial y}{\partial \eta} \end{pmatrix} \right| \, d\xi \, d\eta \]
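A numerical sketch of this change of variables (the integrand is an arbitrary choice): computing $\iint_A (x^2 + y^2) \, dx \, dy$ over the unit disk in polar coordinates, where the Jacobian determinant is $\rho$, gives $\pi/2$:

```python
import math

def disk_integral_polar(n_r=400):
    # ∬_A (x^2 + y^2) dx dy over the unit disk, via x = r cos t, y = r sin t:
    # the integrand becomes r^2 and the Jacobian determinant contributes r,
    # so the value is 2*pi times the integral of r^3 over [0, 1] (midpoint rule)
    dr = 1.0 / n_r
    return 2 * math.pi * sum(((i + 0.5) * dr) ** 3 * dr for i in range(n_r))

val = disk_integral_polar()
print(val, math.pi / 2)
```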
Consider the set of one dimensional bounded intervals of the form (a, b), [a, b), (a, b], [a, b], where $a, b \in \mathbb{R}$. The cartesian product $\Delta = I_1 \times \cdots \times I_n$, where $I_i$ are intervals of this type, is called a hypercube in $\mathbb{R}^n$.
The volume of such a hypercube $\Delta$ is defined by $\operatorname{vol}(\Delta) = l(I_1) \cdots l(I_n)$, where $l(I_i)$ denotes the length of the interval $I_i$.
Consider the set $\mathcal{P}$ of the finite unions of hypercubes $\Delta$, i.e. $P \in \mathcal{P}$ if and only if there exist $\Delta_1, \Delta_2, \ldots, \Delta_k$ such that
\[ P = \bigcup_{l=1}^{k} \Delta_l \]
\[ P_1, P_2 \in \mathcal{P} \Rightarrow P_1 \cup P_2 \in \mathcal{P} \text{ and } P_1 \setminus P_2 \in \mathcal{P} \]
Proposition 65.1. For any $P \in \mathcal{P}$ there exist $\Delta_1, \Delta_2, \ldots, \Delta_k$ such that $P = \bigcup_{l=1}^{k} \Delta_l$ and $\Delta_p \cap \Delta_q = \emptyset$ if $p \neq q$.
where $P = \bigcup_{l=1}^{k} \Delta_l$ and $\Delta_1, \Delta_2, \ldots, \Delta_k$ are given by Proposition 65.1.
1. $\operatorname{vol}(P) \geq 0$ for $P \in \mathcal{P}$
2. $P_1, P_2 \in \mathcal{P}$, $P_1 \cap P_2 = \emptyset \Rightarrow \operatorname{vol}(P_1 \cup P_2) = \operatorname{vol}(P_1) + \operatorname{vol}(P_2)$
Definition 65.4. If the bounded set $A \subset \mathbb{R}^n$ is Jordan measurable, then the volume of A is defined as
\[ \operatorname{vol}(A) = \operatorname{vol}_i(A) = \operatorname{vol}_e(A) \]
Proposition 65.3. A bounded set A ⊂ Rn is Jordan measurable if and only if for any
ε > 0 there exist Pε , Qε ∈ P such that Pε ⊂ A ⊂ Qε and vol(Qε ) − vol(Pε ) < ε.
Proposition 65.4. A bounded set A ⊂ Rn is Jordan measurable if and only if for any
ε > 0 there exist two sequences (Pk ), (Qk ), Pk , Qk ∈ P and Pk ⊂ A ⊂ Qk such that
Proposition 65.5. A bounded set A ⊂ Rn is Jordan measurable if and only if the volume
of its boundary is equal to zero.
Proposition 65.7. Let $M \subset \mathbb{R}^n$ be a bounded set. If for any $\varepsilon > 0$ there exist two Jordan measurable sets A and B such that $A \subset M \subset B$ and $\operatorname{vol}(B) - \operatorname{vol}(A) < \varepsilon$, then the set M is Jordan measurable.
Proposition 65.8. If there exist two sequences (Ak ) and (Bk ) of Jordan measurable sets
such that Ak ⊂ M ⊂ Bk and
The proof of the above statements is rather technical and it will be omitted.
66 The Riemann-Darboux integral of a n variable
function
3) if $p \neq q$ then $A_p \cap A_q = \emptyset$
\[ d(A_l) = \sup_{x, y \in A_l} \|x - y\| \]
Suppose now that f is a real-valued function of n variables defined and bounded on A, $f : A \to \mathbb{R}^1$.
Then f is bounded on each part $A_l$, $l = \overline{1, k}$. Hence f has a least upper bound $M_l$ and a greatest lower bound $m_l$ on $A_l$ ($l = \overline{1, k}$).
where ξl ∈ Al .
138
Remark 66.1. It is clear that the following inequalities hold:
Lf (P ) ≤ σf (P ) ≤ Uf (P )
Now f is bounded above and below on A. So there exist numbers m and M such that
m ≤ f (x) ≤ M for x ∈ A
Hence the set $\{ U_f(P) \mid P \text{ is a partition of } A \}$ is bounded below.
So $L_f = \sup_P L_f(P)$ and $U_f = \inf_P U_f(P)$ exist.
The first result establishes the intuitively obvious fact that for a bounded function $L_f \leq U_f$.
Uf (P ) − Lf (P ) < ε
67 Integrable functions of n variables
Proof. Since A is compact the function f is uniformly continuous. For $\varepsilon > 0$ there exists $\delta > 0$ such that
\[ \|x' - x''\| < \delta \Rightarrow |f(x') - f(x'')| < \frac{\varepsilon}{\operatorname{vol}(A)} \]
Choose P such that $x', x'' \in A_i \Rightarrow \|x' - x''\| < \delta$ for $i = \overline{1, n}$.
Hence
\[ U_f(P) - L_f(P) = \sum_{i=1}^{n} (M_i - m_i) \cdot \operatorname{vol}(A_i) < \frac{\varepsilon}{\operatorname{vol}(A)} \sum_{i=1}^{n} \operatorname{vol}(A_i) = \varepsilon \]
Property (1) is called the linearity of the integral and (2) is called the additive property.
We intend to show that, under some conditions, the computation of the integral of a function of n variables reduces to the iterated computation of integrals of functions of one variable.
Assume that A is a hypercube A = [a1 , b1 ] × [a2 , b2 ] × · · · × [an , bn ] and f : A → R1 .
Theorem 69.1. If the function f is integrable on A and if for every fixed $x_1 \in [a_1, b_1]$ the function $f_{x_1}(x_2, \ldots, x_n) = f(x_1, x_2, \ldots, x_n)$ is integrable on $A_1 = [a_2, b_2] \times \cdots \times [a_n, b_n]$, i.e. the integral
\[ I(x_1) = \int \cdots \int_{A_1} f_{x_1}(x_2, \ldots, x_n) \, dx_2 \cdots dx_n = \int_{a_2}^{b_2} \cdots \int_{a_n}^{b_n} f(x_1, x_2, \ldots, x_n) \, dx_2 \cdots dx_n \]
exists, then
\[ \int \cdots \int_A f(x_1, \ldots, x_n) \, dx_1 \cdots dx_n = \int_{a_1}^{b_1} dx_1 \, I(x_1) \]
Example 69.2. Evaluate $\displaystyle \iiint_A z \, dx \, dy \, dz$ for the set A defined by $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \leq 1$.
Example 69.3. Evaluate $\displaystyle \iiint_A \left( \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \right) dx \, dy \, dz$ for the set A defined by $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \leq 1$.
Remark 69.1. The above theorem reduces successively the evaluation of the integral to
the evaluation of integrals for functions of one variable.
Remark 69.2. Theorem 69.1 is also valid for more complex sets A, as in the 2-dimensional case.
70 Elementary curves and elementary closed curves
The way of defining a line integral is quite similar to the familiar way of defining a definite
integral known from calculus. In order to do this, we must introduce the concepts of curve
and arc length. We will present these concepts in a particular framework which can be
extended in a natural way.
Definition 70.1. An elementary curve (elementary arc) is a set of points C ⊂ R3 for
which there exists a closed interval [a, b] ⊂ R and a function ϕ : [a, b] → C having the
following properties:
a) ϕ is bijective;
b) ϕ is of class $C^1$ and $\varphi'(t) \neq 0$, $\forall t \in [a, b]$.
The points A = ϕ(a) and B = ϕ(b) are called the end points of the curve. The function
ϕ is called a parametric representation of the curve and the vector ϕ0 (t) is tangent to the
curve at the point ϕ(t).
Figure 70.1:
Definition 70.2. An elementary closed curve is a set of points C ⊂ R3 for which there
exists a closed interval [a, b] ⊂ R and a function ϕ : [a, b] → C with the following
properties:
The function ϕ is called a parametric representation of the curve and the vector ϕ0 (t) is
tangent to the curve at the point ϕ(t).
Example 70.1. If $x_0 = (x_{01}, x_{02}, x_{03})$ and $h = (h_1, h_2, h_3)$, the closed segment $[x_0, x_0 + h]$ which joins the points $x_0$, $x_0 + h$ is an elementary curve. In this case, we can take $[a, b] = [0, 1]$ and $\varphi : [a, b] \to C$ is given by $\varphi(t) = (x_{01} + t h_1, x_{02} + t h_2, x_{03} + t h_3)$.
Example 70.2. The circle C defined by $x_1^2 + x_2^2 = 1$ and $x_3 = 0$ is an elementary closed curve. In this case we can take $[a, b] = [0, 2\pi]$ and $\varphi : [a, b] \to C$ is given by $\varphi(t) = (\cos t, \sin t, 0)$.
Figure 70.2:
Example 70.3. The length of the closed segment $[x_0, x_0 + h]$ represented by $\varphi(t) = x_0 + th$, $t \in [0, 1]$ is
\[ l = \int_0^1 \sqrt{h_1^2 + h_2^2 + h_3^2} \, dt = \|h\| \]
Example 70.4 (Shows that the curve length is independent of the parametric representation). For the circle $C = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 = 1, \; x_3 = 0\}$ the parametric representation $\varphi : [0, \pi] \to C$, $\varphi(t) = (\cos 2t, \sin 2t, 0)$ is chosen. Using this representation we have
\[ l = \int_0^{\pi} 2 \sqrt{\sin^2 2t + \cos^2 2t} \, dt = 2\pi \]
This is the same value as the one obtained in the case of the parametric representation $\varphi(t) = (\cos t, \sin t, 0)$, $t \in [0, 2\pi]$.
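Both computations can be replayed numerically (a sketch; the midpoint rule and step count are arbitrary choices): the two parametric representations of the circle give the same length $2\pi$:

```python
import math

def curve_length(phi_dot, a, b, n=1000):
    # l = integral over [a, b] of ||phi'(t)|| dt, midpoint rule
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += math.sqrt(sum(c * c for c in phi_dot(t))) * h
    return total

# phi(t) = (cos t, sin t, 0) on [0, 2*pi]
l1 = curve_length(lambda t: (-math.sin(t), math.cos(t), 0.0), 0.0, 2 * math.pi)
# phi(t) = (cos 2t, sin 2t, 0) on [0, pi]
l2 = curve_length(lambda t: (-2 * math.sin(2 * t), 2 * math.cos(2 * t), 0.0),
                  0.0, math.pi)
print(l1, l2)  # both equal 2*pi
```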
Remark 70.4. Since $\varphi'(t) \neq 0$, $\forall t \in [a, b]$, an elementary curve C has only two end points. In other words, if C is an elementary curve then there exist two points $A, B \in C$
such that for any parametric representation ψ : [c, d] → C of the curve C we have
{ψ(c), ψ(d)} = {A, B}.
When ψ(c) = A and ψ(d) = B, then if τ moves from c to d, then ψ(τ ) moves from A to
B.
When ψ(c) = B and ψ(d) = A, then if τ moves from c to d, then ψ(τ) moves from B to
A.
The two ways to cover the curve C, from A to B or from B to A, are called orientations
of C and the above presented facts show that on an elementary curve there are two ori-
entations. Moreover, the covering given by an arbitrary representation of C is one of the
above mentioned orientations.
In other words, the set of representations is divided in two classes: for all the representa-
tions which belong to one of the classes we have one orientation (say from A to B) and
for the other class the opposite orientation (from B to A).
Figure 70.3:
Consider now an elementary curve C with the parametric representation $\varphi : [a, b] \to C$ for which the orientation of the curve is from A to B ($\varphi(a) = A$, $\varphi(b) = B$).
If in the formula
\[ l = \int_a^b \|\varphi'(\tau)\| \, d\tau \]
we replace the fixed upper limit b with a variable upper limit t, the integral becomes
\[ s_A(t) = \int_a^t \|\varphi'(\tau)\| \, d\tau \]
The value $s_A(t)$ represents the arc length $AA' \subset C$, where $A' = \varphi(t)$.
The function $s_A$ is defined on the closed interval [a, b] and it is a bijection from [a, b] to [0, l], $s_A(a) = 0$ and $s_A(b) = l$. Moreover, $s_A$ and $s_A^{-1}$ are continuously differentiable.
The function $s_A$ can be used in order to define a new parametric representation of the curve C, namely $\tilde{x}_A : [0, l] \to C$, $\tilde{x}_A = \varphi \circ s_A^{-1}$. In this representation of C, $s \in [0, l]$ serves as a parameter and $\tilde{x}_A(0) = A$, $\tilde{x}_A(l) = B$. When s moves from 0 to l, then $\tilde{x}_A(s)$ moves from A to B. The parametric representation $\tilde{x}_A$ is canonic when the orientation of C is from A to B, i.e. the parameter s is the arc length $A\tilde{x}_A(s)$, and there is no other representation with this property.
Consider now, for the same elementary curve C with the end points A and B, a parametric representation $\psi : [c, d] \to C$ for which the orientation is from B to A ($\psi(c) = B$ and $\psi(d) = A$). If in the formula
\[ l = \int_c^d \|\psi'(\tau)\| \, d\tau \]
we replace the fixed upper limit d with a variable upper limit t, the integral becomes
\[ s_B(t) = \int_c^t \|\psi'(\tau)\| \, d\tau \]
Example 70.5. In the case of the closed segment $[x_0, x_0 + h]$ which joins the points $A = x_0$
and B = x0 + h, if the parametric representation ϕ(t) = x0 + th, t ∈ [0, 1] is chosen, then
ϕ(t) moves from A to B when t moves from 0 to 1. If the parametric representation
ψ(τ ) = x0 + (2 − τ )h, τ ∈ [1, 2] is chosen, then ψ(τ ) moves from B = x0 + h to A = x0
when τ moves from 1 to 2.
Using the representation ψ, the arc length $BB'$ is given by
\[ s_B(t) = \int_1^t \|\psi'(\tau)\| \, d\tau = \int_1^t \sqrt{h_1^2 + h_2^2 + h_3^2} \, d\tau = (t - 1) \cdot \|h\| \]
Now consider an elementary closed curve C. In this case, there aren’t two end points A
and B, and we cannot speak about the orientation from A to B or from B to A. In the
followings, we will show how to proceed in order to introduce two orientations in the case
of an elementary closed curve C.
For the elementary closed curve C let’s consider a parametric representation ϕ : [a, b] → C
and the point A = ϕ(a) ∈ C.
Since $\varphi'(t) \neq 0$, if t moves from a to b then ϕ(t) describes the curve, moving in a unique
way. This movement of ϕ(t) is one orientation of the curve. The opposite movement on
C is the opposite orientation. If ψ : [c, d] → C is an arbitrary representation of C then
when t increases on [c, d], ψ(t) moves on C according to one of the orientations:
if $\dfrac{d}{d\tau}\left(\varphi^{-1} \circ \psi\right)(\tau) = \dfrac{dt}{d\tau} > 0$, then ϕ(t) and ψ(τ) move in the same sense when t moves from a to b and τ moves from c to d;
if $\dfrac{d}{d\tau}\left(\varphi^{-1} \circ \psi\right)(\tau) = \dfrac{dt}{d\tau} < 0$, then ϕ(t) and ψ(τ) move in opposite senses as t moves from a to b and τ moves from c to d.
As in the case of an elementary curve C we can consider the function $s_A : [a, b] \to [0, l]$ defined by
\[ s_A(t) = \int_a^t \|\varphi'(\tau)\| \, d\tau \]
If $A' = \varphi(t)$ then $s_A(t)$ is the arc length $AA'$ described by ϕ(τ) when τ moves from a to t.
The function $s_A : [a, b] \to [0, l]$ is a bijection and can be used in order to define a new representation of C: $\tilde{x}_A : [0, l] \to C$, $\tilde{x}_A = \varphi \circ s_A^{-1}$. In this representation, $s \in [0, l]$ is the parameter and $\tilde{x}_A(0) = \tilde{x}_A(l) = A$. When s moves from 0 to l then $\tilde{x}_A(s)$ moves on C and describes the curve C moving in the same sense as ϕ(t). The function $\tilde{\tilde{x}}_A : [0, l] \to C$ defined by $\tilde{\tilde{x}}_A(s) = \tilde{x}_A(l - s)$ is the representation of C which corresponds to the opposite orientation.
For instance, in the case of the circle:
\[ C = \{(x_1, x_2, x_3) \mid x_1^2 + x_2^2 = 1, \; x_3 = 0\} \]
Figure 70.4:
Figure 70.5:
It follows that an elementary curve and an elementary closed curve can be represented
as:
X(s) = (X1 (s), X2 (s), X3 (s)), 0 ≤ s ≤ l
where l is the curve length and s ∈ [0, l] is the arc length X(0)X(s) ⊂ C.
For an elementary curve C, there exist two representations of this kind corresponding to
the two orientations of C. For an elementary closed curve C if we fix a point A on C,
then we also have two representations of this kind corresponding to the two orientations
of C.
in function of the arc length s. In order to make a choice, assume that x(s) moves from A to B when s moves from 0 to l.
Let now f (x1 , x2 , x3 ) be a given function which is defined (at least) at each point of C
and is continuous function of s, i.e. s 7→ f (x1 (s), x2 (s), x3 (s)) is continuous.
We subdivide C into n portions in an arbitrary manner:
Figure 71.1:
Let $P_0 (= A), P_1, P_2, \ldots, P_{n-1}, P_n (= B)$ be the end points of these portions and let $s_i = \operatorname{length}(AP_i)$ be the lengths of the arcs $AP_i$. Then we choose an arbitrary point on each portion, say, a point $Q_1$ between $P_0$ and $P_1$, a point $Q_2$ between $P_1$ and $P_2$, etc. Taking the values of f at these points $Q_1, Q_2, \ldots, Q_n$ we form the sum
\[ I_n = \sum_{m=1}^{n} f(Q_m)(s_m - s_{m-1}) \]
If the lengths of the subdivision arcs tend to zero as $n \to \infty$, the sums $I_n$ tend to a limit, which is called the line integral of first type of f along C and is denoted by $\int_C f \, ds$.
The curve C is called the path of integration. Since, by assumption, f is continuous and C is smooth, that limit exists and is independent of the orientation and of the choice of subdivisions and points $Q_m$. In fact, the position of a point P on C is determined by the corresponding value of the arc length s; since A and B correspond to s = 0 and s = l respectively, we have
\[ \int_C f \, ds = \int_0^l f(x_1(s), x_2(s), x_3(s)) \, ds \]
In terms of a parametric representation $\varphi = (\varphi_1, \varphi_2, \varphi_3)$,
\[ \int_C f \, ds = \int_a^b f(\varphi_1(t), \varphi_2(t), \varphi_3(t)) \cdot \sqrt{\dot{\varphi}_1^2(t) + \dot{\varphi}_2^2(t) + \dot{\varphi}_3^2(t)} \, dt \]
Hence the line integral of first type is equal to a definite integral, and the familiar properties of ordinary definite integrals are equally valid for line integrals.
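The defining formula is easy to implement. A sketch (the integrand $x_1^2$ over the unit circle is an arbitrary choice) evaluates $\int_C f \, ds$ through the parametric form; the exact value is $\int_0^{2\pi} \cos^2 t \, dt = \pi$:

```python
import math

def line_integral_first_type(f, phi, phi_dot, a, b, n=2000):
    # ∫_C f ds = ∫_a^b f(phi(t)) * ||phi'(t)|| dt, midpoint rule
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        speed = math.sqrt(sum(c * c for c in phi_dot(t)))
        total += f(*phi(t)) * speed * h
    return total

phi = lambda t: (math.cos(t), math.sin(t), 0.0)      # unit circle, x3 = 0
phi_dot = lambda t: (-math.sin(t), math.cos(t), 0.0)
val = line_integral_first_type(lambda x1, x2, x3: x1 ** 2, phi, phi_dot,
                               0.0, 2 * math.pi)
print(val, math.pi)
```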
Proposition 71.1. a) $\displaystyle \int_C k \cdot f \, ds = k \int_C f \, ds$, k = constant
b) $\displaystyle \int_C (f + g) \, ds = \int_C f \, ds + \int_C g \, ds$
c) $\displaystyle \int_C f \, ds = \int_{C_1} f \, ds + \int_{C_2} f \, ds$, where the path C is subdivided into two disjoint arcs $C_1$ and $C_2$
Figure 71.2:
Remark 71.1. If C is an elementary closed curve, the line integral of first type is defined
similarly.
For a line integral over a closed path C, the symbol $\oint_C$ (instead of $\int_C$) is sometimes used in the literature.
In many applications the integrands appearing in the line integrals of first type are of the form:
\[ f \cdot \frac{dx_1}{ds} \quad \text{or} \quad g \cdot \frac{dx_2}{ds} \quad \text{or} \quad h \cdot \frac{dx_3}{ds} \]
where $\frac{dx_1}{ds}, \frac{dx_2}{ds}, \frac{dx_3}{ds}$ are the derivatives of the functions occurring in the parametric representation of the path of integration.
The integrals $\displaystyle \int_C f \cdot \frac{dx_1}{ds} \, ds$, $\displaystyle \int_C g \cdot \frac{dx_2}{ds} \, ds$, $\displaystyle \int_C h \cdot \frac{dx_3}{ds} \, ds$ are called line integrals of second type.
Their values depend on the orientation of C; changing the orientation of C, the integrals are multiplied by −1.
We simply denote these integrals by:
\[ \int_C f \cdot \frac{dx_1}{ds} \, ds = \int_C f \, dx_1 \qquad \int_C g \cdot \frac{dx_2}{ds} \, ds = \int_C g \, dx_2 \qquad \int_C h \cdot \frac{dx_3}{ds} \, ds = \int_C h \, dx_3 \]
In terms of the considered parametric representations, these line integrals of second type are equal to the following Riemann integrals:
\[ \int_C f \, dx_1 = \int_0^l f(x_1(s), x_2(s), x_3(s)) \frac{dx_1}{ds} \, ds = \int_0^l f(x_1(s), x_2(s), x_3(s)) \cdot \cos \alpha(s) \, ds \]
\[ \int_C g \, dx_2 = \int_0^l g(x_1(s), x_2(s), x_3(s)) \frac{dx_2}{ds} \, ds = \int_0^l g(x_1(s), x_2(s), x_3(s)) \cdot \cos \beta(s) \, ds \]
\[ \int_C h \, dx_3 = \int_0^l h(x_1(s), x_2(s), x_3(s)) \frac{dx_3}{ds} \, ds = \int_0^l h(x_1(s), x_2(s), x_3(s)) \cdot \cos \gamma(s) \, ds \]
where α(s), β(s), γ(s) are the angles between the tangent to the curve and the coordinate axes $Ox_1$, $Ox_2$, $Ox_3$, respectively.
In terms of an arbitrary parametric representation $\varphi : [a, b] \to C$ which corresponds to the same orientation, these line integrals of second type are equal to the following Riemann integrals:
\[ \int_C f \, dx_1 = \int_a^b f(\varphi(t)) \cdot \dot{\varphi}_1(t) \, dt \qquad \int_C g \, dx_2 = \int_a^b g(\varphi(t)) \cdot \dot{\varphi}_2(t) \, dt \qquad \int_C h \, dx_3 = \int_a^b h(\varphi(t)) \cdot \dot{\varphi}_3(t) \, dt \]
(here $\frac{dx_1}{ds} = \dot{\varphi}_1(t) / \sqrt{\dot{\varphi}_1^2(t) + \dot{\varphi}_2^2(t) + \dot{\varphi}_3^2(t)}$ and $ds = \sqrt{\dot{\varphi}_1^2(t) + \dot{\varphi}_2^2(t) + \dot{\varphi}_3^2(t)} \, dt$, so the square roots cancel).
All these integrals depend on the orientation of the curve C. If the orientation changes,
the value of the integral changes its sign.
For the sums of these types of integrals along the same path C we adopt the simplified notation
\[ \int_C f \, dx_1 + g \, dx_2 + h \, dx_3 \]
which is equal to the Riemann integral
\[ \int_0^l [f(x_1(s), x_2(s), x_3(s)) \cos \alpha(s) + g(x_1(s), x_2(s), x_3(s)) \cos \beta(s) + h(x_1(s), x_2(s), x_3(s)) \cos \gamma(s)] \, ds \]
where C is the arc of the parabola $x_2 = x_1^2$ in the plane $x_3 = 2$ from A(0, 0, 2) to B(1, 1, 2).
Example 72.2. Evaluate the above line integral where C is the segment of the straight line $x_2 = x_1$, $x_3 = 2$ from A(0, 0, 2) to B(1, 1, 2).
Double integrals over a plane region may be transformed into line integrals over the boundary of the region and conversely. This transformation is of practical as well as theoretical interest and can be done by means of the following basic theorem.
Theorem 73.1 (Green's theorem in the plane). Let R be a closed bounded region in the x, y plane whose boundary C consists of finitely many elementary closed curves. Let f(x, y) and g(x, y) be functions which are continuous and have continuous partial derivatives of first order everywhere in some domain containing R. Then the following equality holds:
\[ \iint_R \left( \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \right) dx \, dy = \oint_C f \, dx + g \, dy = \oint_C [f \cos \alpha + g \cos \beta] \, ds \]
the integration being taken along the entire boundary C of R such that R is on the left as one moves on C.
Figure 73.1:
Proof. We first prove the theorem for a special region R which can be represented in both of the forms:
the forms:
R = {(x, y) | a ≤ x ≤ b , u(x) ≤ y ≤ v(x)}
and
R = {(x, y) | c ≤ y ≤ d , p(y) ≤ x ≤ q(y)}
Figure 73.2:
Figure 73.3:
$$\iint_R \frac{\partial f}{\partial y}\,dx\,dy = \int_a^b \Big[\int_{u(x)}^{v(x)} \frac{\partial f}{\partial y}\,dy\Big]\,dx = \int_a^b \big[f(x, v(x)) - f(x, u(x))\big]\,dx$$
$$= -\int_a^b f(x, u(x))\,dx - \int_b^a f(x, v(x))\,dx = -\int_{C^*} f(x, y)\,dx - \int_{C^{**}} f(x, y)\,dx = -\int_C f(x, y)\,dx$$
since y = u(x) represents the oriented curve C ∗ and y = v(x) represents the oriented
curve C ∗∗ .
If portions of C are segments parallel to the y-axis, such as $\tilde{C}$ and $\tilde{\tilde{C}}$,
Figure 73.4:
then the result is the same as before, because the integrals over these portions are zero
and may be added to the integrals over C ∗ and C ∗∗ to obtain the integral over the whole
boundary C.
Similarly we obtain
$$\iint_R \frac{\partial g}{\partial x}\,dx\,dy = \int_c^d \Big[\int_{p(y)}^{q(y)} \frac{\partial g}{\partial x}\,dx\Big]\,dy = \int_C g(x, y)\,dy$$
Therefore
$$\iint_R \Big[\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big]\,dx\,dy = \int_C f\,dx + g\,dy$$
and the theorem is proved for special regions.
We now prove the theorem for a region R which itself is not a special region but can be
subdivided into finitely many special regions. In this case we apply the theorem to each
subregion and then add the results; the left-hand members add up to the integral over R
while the right-hand members add up to the line integral over C plus integrals over the
curves introduced for subdividing R. Each of the latter integrals occurs twice, taken once in each direction. Hence these two integrals cancel each other, and we are left with the line integral over C.
Example 73.1. Using Green's theorem, evaluate the following integrals:

a) $\int_C y\,dx + 2x\,dy$, where C is the boundary of the square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 (counterclockwise);

b) $\int_C y^3\,dx + (x^3 + 3y^2x)\,dy$, where C is the boundary of the region between y = x² and y = x, where 0 ≤ x ≤ 1 (counterclockwise);

c) $\int_C 2xy\,dx + (e^x + x^2)\,dy$, where C is the boundary of the triangle with vertices (0, 0), (1, 0), (1, 1) (clockwise);

d) $\int_C -xy^2\,dx + x^2y\,dy$, where C is the boundary of the region in the first quadrant bounded by y = 1 − x² (counterclockwise).
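Integral a) can also serve as a numerical sanity check of the theorem. The sketch below (Python; the discretization sizes are arbitrary choices) traverses the boundary of the unit square edge by edge and compares the line integral with the double integral of ∂g/∂x − ∂f/∂y = 2 − 1 = 1:

```python
def line_part(f, g, r, rdot, a, b, n=20000):
    """Midpoint-rule approximation of ∫ f dx + g dy along r(t), t in [a, b]."""
    h = (b - a) / n
    s = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        x, y = r(t)
        dx, dy = rdot(t)
        s += (f(x, y) * dx + g(x, y) * dy) * h
    return s

f = lambda x, y: y        # example a): ∮ y dx + 2x dy
g = lambda x, y: 2 * x

# boundary of the unit square, counterclockwise (four edges)
edges = [
    (lambda t: (t, 0.0),       lambda t: (1.0, 0.0)),    # bottom
    (lambda t: (1.0, t),       lambda t: (0.0, 1.0)),    # right
    (lambda t: (1.0 - t, 1.0), lambda t: (-1.0, 0.0)),   # top
    (lambda t: (0.0, 1.0 - t), lambda t: (0.0, -1.0)),   # left
]
boundary = sum(line_part(f, g, r, rd, 0.0, 1.0) for r, rd in edges)

# double integral of (∂g/∂x − ∂f/∂y) = 2 − 1 = 1 over the square (midpoint rule)
n = 200
h = 1.0 / n
double = sum((2 - 1) * h * h for i in range(n) for j in range(n))

print(boundary, double)   # both close to 1
```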
Now we will present a second theorem of Green, which concerns the transformation of a double integral of the Laplacian of a function into a line integral of its normal derivative.
Let w(x, y) be a function which has continuous second order partial derivatives in a domain
D of the x, y-plane.
Assume now that D contains a region R (R ⊂ D) of the type indicated in Green's theorem. Then the following equality holds:
$$\iint_R \Delta w\,dx\,dy = \oint_C \frac{\partial w}{\partial n}\,ds$$
Proof. Consider $f = -\frac{\partial w}{\partial y}$ and $g = \frac{\partial w}{\partial x}$ and remark that $\Delta w = \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}$. Applying Green's theorem in the plane, we obtain
$$\iint_R \Delta w\,dx\,dy = \oint_C -\frac{\partial w}{\partial y}\,dx + \frac{\partial w}{\partial x}\,dy = \oint_C \Big(-\frac{\partial w}{\partial y}\cdot\frac{dx}{ds} + \frac{\partial w}{\partial x}\cdot\frac{dy}{ds}\Big)\,ds = \oint_C \operatorname{grad} w\cdot n\,ds$$
The integrand of the last integral may be written as the dot product of the vectors
$$\operatorname{grad} w = \Big(\frac{\partial w}{\partial x}, \frac{\partial w}{\partial y}\Big) \quad\text{and}\quad n = \Big(\frac{dy}{ds}, -\frac{dx}{ds}\Big)$$
that is
$$n\cdot\operatorname{grad} w = \frac{\partial w}{\partial x}\cdot\frac{dy}{ds} - \frac{\partial w}{\partial y}\cdot\frac{dx}{ds}$$
The vector n is the outward unit normal vector to C, because the vector $\tau = \big(\frac{dx}{ds}, \frac{dy}{ds}\big)$ is the unit tangent vector to C and $\tau\cdot n = 0$.
The dot product $n\cdot\operatorname{grad} w$ is the directional derivative $\frac{\partial w}{\partial n} = \nabla_n w$. Therefore we have
$$\iint_R \Delta w\,dx\,dy = \oint_C \frac{\partial w}{\partial n}\,ds = \oint_C \nabla_n w\,ds$$
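As a quick illustration of this identity, take w(x, y) = x² + y² (an illustrative choice) on the unit disk: ∆w = 4, so the left side is 4π, while ∂w/∂n = ∇w · n = 2(x² + y²) = 2 on the unit circle, so the right side is 2 · 2π = 4π as well. A minimal numerical sketch:

```python
from math import cos, sin, pi

# w(x, y) = x^2 + y^2, so Δw = 4; R is the unit disk, C the unit circle.
# Left side: ∬_R Δw dx dy ≈ 4 * (area of the disk) via a midpoint grid.
n = 1000
h = 2.0 / n
left = 0.0
for i in range(n):
    for j in range(n):
        x = -1 + (i + 0.5) * h
        y = -1 + (j + 0.5) * h
        if x * x + y * y <= 1.0:
            left += 4.0 * h * h

# Right side: ∮_C ∂w/∂n ds with n = (cos t, sin t) on the circle,
# grad w = (2x, 2y); on the unit circle ds = dt.
m = 100000
ht = 2 * pi / m
right = 0.0
for k in range(m):
    t = (k + 0.5) * ht
    x, y = cos(t), sin(t)
    right += (2 * x * x + 2 * y * y) * ht   # grad w · n = 2 on C

print(left, right)   # both close to 4π ≈ 12.566
```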
Let v(x, y) be a vector valued function v(x, y) = (f(x, y), g(x, y)) which has continuous first order partial derivatives in a domain D of the x, y-plane.

Definition 73.2. The divergence of v is by definition the real valued function div v : D → R1 defined by
$$\operatorname{div} v = \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}$$
Theorem 73.3. If D contains a region R (R ⊂ D) of the type indicated in Green's theorem, then the following equality holds:
$$\iint_R \operatorname{div} v\,dx\,dy = \oint_C v\cdot n\,ds$$
Example 73.2. Verify this formula when v = (x, y) and C is the circle x2 + y 2 = 1.
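A numerical sketch of this verification (the grid sizes are arbitrary choices): div v = 2, so the double integral is twice the area of the disk, 2π, while v · n = x² + y² = 1 on the unit circle, so the flux equals the circumference 2π:

```python
from math import pi, cos, sin

# Example 73.2: v = (x, y) on the unit disk, C the unit circle.
# div v = ∂x/∂x + ∂y/∂y = 2, so ∬_R div v dx dy = 2 * area = 2π.
# Double integral in polar coordinates: ∫_0^{2π} ∫_0^1 2 r dr dθ.
nr, nt = 400, 400
dr, dt = 1.0 / nr, 2 * pi / nt
double = sum(2.0 * (i + 0.5) * dr * dr * dt
             for i in range(nr) for j in range(nt))

# ∮_C v · n ds: on the unit circle n = (cos t, sin t) = v, so v · n = 1
# and the integral is the circumference 2π.
m = 200000
ht = 2 * pi / m
flux = 0.0
for k in range(m):
    t = (k + 0.5) * ht
    flux += (cos(t) * cos(t) + sin(t) * sin(t)) * ht

print(double, flux)   # both close to 2π ≈ 6.283
```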
74 Elementary Surfaces
We shall consider surface integrals. These considerations require knowledge of some basic facts about surfaces, which we shall now explain and illustrate by simple examples.
Definition 74.1. An elementary surface is a set of points S ⊂ R3 for which there exists a
bounded, open and connected set D ⊂ R2 and a function ϕ : D → S having the following
properties:
a) ϕ is bijective;
b) ϕ is of class C¹ and the vector
$$N = \frac{\partial\varphi}{\partial u}\times\frac{\partial\varphi}{\partial v} = \Big(\frac{\partial\varphi_2}{\partial u}\cdot\frac{\partial\varphi_3}{\partial v} - \frac{\partial\varphi_3}{\partial u}\cdot\frac{\partial\varphi_2}{\partial v},\ \frac{\partial\varphi_3}{\partial u}\cdot\frac{\partial\varphi_1}{\partial v} - \frac{\partial\varphi_1}{\partial u}\cdot\frac{\partial\varphi_3}{\partial v},\ \frac{\partial\varphi_1}{\partial u}\cdot\frac{\partial\varphi_2}{\partial v} - \frac{\partial\varphi_2}{\partial u}\cdot\frac{\partial\varphi_1}{\partial v}\Big)$$
is different from 0 for any (u, v) ∈ D.
The function ϕ is called a parametric representation of S. The vectors $\frac{\partial\varphi}{\partial u}$, $\frac{\partial\varphi}{\partial v}$ are tangent to the surface S at the point ϕ(u, v). The vector $N_\varphi$ is called the normal vector to the surface S at the point ϕ(u, v), and the vector $n_\varphi = \frac{N_\varphi}{\|N_\varphi\|}$ is the unit normal vector to the surface S at the point ϕ(u, v).
Example 74.1. The set S = {(x1 , x2 , 1) ∈ R3 | 0 < x1 < 1 , 0 < x2 < 1} is an elementary
surface. A bounded, open and connected set D ⊂ R2 and a function ϕ : D → S having
the properties a) and b) are:
D = {(u, v) ∈ R2 | 0 < u < 1 , 0 < v < 1} and ϕ(u, v) = (u, v, 1)
A normal vector to S at the point ϕ(u, v) is $N_\varphi = (0, 0, 1)$, which is actually a unit normal vector. The vectors $\frac{\partial\varphi}{\partial u} = (1, 0, 0)$ and $\frac{\partial\varphi}{\partial v} = (0, 1, 0)$ are tangent to S at the point ϕ(u, v).
Example 74.2. The set S = {(x1, x2, x3) ∈ R³ | x1² + x2² = 1, x1 > 0, x2 > 0, 0 < x3 < 1} is an elementary surface. A bounded, open and connected set D ⊂ R² and a function ϕ : D → S having the properties a) and b) are:
$$D = \{(u, v) \in R^2 \mid 0 < u < \tfrac{\pi}{2},\ 0 < v < 1\} \quad\text{and}\quad \varphi(u, v) = (\cos u, \sin u, v)$$
A normal vector to S at the point ϕ(u, v) is $N_\varphi = (\cos u, \sin u, 0)$, which is actually a unit normal vector. The vectors $\frac{\partial\varphi}{\partial u} = (-\sin u, \cos u, 0)$ and $\frac{\partial\varphi}{\partial v} = (0, 0, 1)$ are tangent to S at the point ϕ(u, v).
Example 74.3. The set S = {(x1, x2, x3) ∈ R³ | x1² + x2² + x3² = 1, x1 > 0, x2 > 0, x3 > 0} is an elementary surface. A bounded, open and connected set D ⊂ R² and a function ϕ : D → S having the properties a) and b) are:
$$D = \{(u, v) \in R^2 \mid 0 < u < \tfrac{\pi}{2},\ 0 < v < \tfrac{\pi}{2}\} \quad\text{and}\quad \varphi(u, v) = (\cos u\sin v, \sin u\sin v, \cos v)$$
The vector $N_\varphi = -\sin v\cdot(\cos u\sin v, \sin u\sin v, \cos v)$ is a normal vector to S and $n_\varphi = -(\cos u\sin v, \sin u\sin v, \cos v)$ is a unit normal vector to S. The vectors $\frac{\partial\varphi}{\partial u} = (-\sin u\sin v, \cos u\sin v, 0)$ and $\frac{\partial\varphi}{\partial v} = (\cos u\cos v, \sin u\cos v, -\sin v)$ are tangent to S at the point ϕ(u, v).
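The componentwise formula for N in Definition 74.1 is just the cross product of the two tangent vectors, and it can be checked against the closed form of Example 74.3 at a sample point (the point (π/5, π/3) below is an arbitrary choice):

```python
from math import sin, cos, pi

def cross(a, b):
    """Cross product, matching the component formula for N in Definition 74.1."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

phi   = lambda u, v: (cos(u) * sin(v), sin(u) * sin(v), cos(v))
phi_u = lambda u, v: (-sin(u) * sin(v), cos(u) * sin(v), 0.0)
phi_v = lambda u, v: (cos(u) * cos(v), sin(u) * cos(v), -sin(v))

u, v = pi / 5, pi / 3
N = cross(phi_u(u, v), phi_v(u, v))
expected = tuple(-sin(v) * c for c in phi(u, v))  # Example 74.3's normal vector

print(N)
print(expected)   # the two vectors agree componentwise
```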
Example 74.4. In the case of the elementary surface from Example 74.1, with the parametric representation ϕ(u, v) = (u, v, 1), (u, v) ∈ (0, 1) × (0, 1), consider the elementary curve C on S (C ⊂ S) defined by
$$C = \{(x_1, x_2, x_3) \in R^3 \mid x_1 = x_2,\ x_3 = 1,\ x_1 \in [\tfrac{1}{3}, \tfrac{2}{3}]\}$$
The curve $C' = \varphi^{-1}(C)$, $C' \subset D$, has the parametric representation $u = t$, $v = t$, $t \in [\tfrac{1}{3}, \tfrac{2}{3}]$. The parametric representation $x = \varphi(g(t), h(t))$ of C in this case is given by $x(t) = (t, t, 1)$.
Example 74.5. In the case of the elementary surface from Example 74.2, with the parametric representation ϕ(u, v) = (cos u, sin u, v), (u, v) ∈ D = {(u, v) ∈ R² | 0 < u < π/2, 0 < v < 1}, the elementary curve C on S is defined by
$$C = \{(x_1, x_2, x_3) \in R^3 \mid x_1^2 + x_2^2 = 1,\ x_3 = \tfrac{1}{2},\ \tfrac{1}{2} < x_1 < \tfrac{\sqrt{3}}{2},\ \tfrac{1}{2} < x_2 < \tfrac{\sqrt{3}}{2}\}$$
The curve $C' = \varphi^{-1}(C)$, $C' \subset D$, has the parametric representation $u = t$, $v = \tfrac{1}{2}$, $t \in [\tfrac{\pi}{6}, \tfrac{\pi}{3}]$. The parametric representation $x = \varphi(g(t), h(t))$ of the curve C in this case is $x(t) = \big(\cos t, \sin t, \tfrac{1}{2}\big)$.
2
Remark 74.3. Using the formula of the parametric representation of the elementary curve C ⊂ S, x = ϕ(u(t), v(t)), we obtain that the tangent vector to C at ϕ(u(t), v(t)) is given by:
$$\frac{dx}{dt} = \Big(\frac{\partial\varphi_1}{\partial u}\cdot\frac{du}{dt} + \frac{\partial\varphi_1}{\partial v}\cdot\frac{dv}{dt},\ \frac{\partial\varphi_2}{\partial u}\cdot\frac{du}{dt} + \frac{\partial\varphi_2}{\partial v}\cdot\frac{dv}{dt},\ \frac{\partial\varphi_3}{\partial u}\cdot\frac{du}{dt} + \frac{\partial\varphi_3}{\partial v}\cdot\frac{dv}{dt}\Big)$$
The vectors
$$\frac{\partial\varphi}{\partial u} = \Big(\frac{\partial\varphi_1}{\partial u}, \frac{\partial\varphi_2}{\partial u}, \frac{\partial\varphi_3}{\partial u}\Big) \quad\text{and}\quad \frac{\partial\varphi}{\partial v} = \Big(\frac{\partial\varphi_1}{\partial v}, \frac{\partial\varphi_2}{\partial v}, \frac{\partial\varphi_3}{\partial v}\Big)$$
are tangent vectors to S at ϕ(u, v), and $\dfrac{dx}{dt} = \dfrac{\partial\varphi}{\partial u}\cdot\dfrac{du}{dt} + \dfrac{\partial\varphi}{\partial v}\cdot\dfrac{dv}{dt}$.
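This chain rule can be checked numerically. The sketch below uses the cylinder of Example 74.2 with an illustrative parameter curve u(t) = t, v(t) = t² (not from the text), comparing the chain-rule derivative with a central finite difference of x(t) = ϕ(u(t), v(t)):

```python
from math import sin, cos

# Remark 74.3's chain rule  dx/dt = φ_u · du/dt + φ_v · dv/dt
# on the cylinder φ(u, v) = (cos u, sin u, v).
phi   = lambda u, v: (cos(u), sin(u), v)
phi_u = lambda u, v: (-sin(u), cos(u), 0.0)
phi_v = lambda u, v: (0.0, 0.0, 1.0)

def x(t):                       # x(t) = φ(u(t), v(t)) with u = t, v = t^2
    return phi(t, t * t)

t, h = 0.7, 1e-6
fd = tuple((a - b) / (2 * h) for a, b in zip(x(t + h), x(t - h)))  # central difference
cr = tuple(pu * 1.0 + pv * (2 * t)                                 # chain rule: u̇ = 1, v̇ = 2t
           for pu, pv in zip(phi_u(t, t * t), phi_v(t, t * t)))

print(fd)
print(cr)   # the two derivatives agree closely
```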
In particular,
$$\Big\|\frac{dx}{dt}\Big\|^2 = E\cdot\dot u^2 + 2F\cdot\dot u\cdot\dot v + G\cdot\dot v^2$$
where:
$$E = \Big\|\frac{\partial\varphi}{\partial u}(u(t), v(t))\Big\|^2 \qquad F = \frac{\partial\varphi}{\partial u}(u(t), v(t))\cdot\frac{\partial\varphi}{\partial v}(u(t), v(t))$$
$$G = \Big\|\frac{\partial\varphi}{\partial v}(u(t), v(t))\Big\|^2 \qquad \dot u = \frac{du}{dt} \qquad \dot v = \frac{dv}{dt}$$
The expression $E\cdot\dot u^2 + 2F\cdot\dot u\cdot\dot v + G\cdot\dot v^2$ is called the first fundamental form of S. It is of basic importance because it enables us to measure lengths, angles between curves, and areas on the corresponding surface. In fact, we have already seen how to compute the length of a curve; the angle between two curves on the surface can be measured in the same way, in terms of the first fundamental form.
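For instance, lengths: the sketch below computes E, F, G for the sphere patch of Example 74.3 (E = sin²v, F = 0, G = 1) and integrates √(E u̇² + 2F u̇ v̇ + G v̇²) along an illustrative curve u(t) = v(t) = t, comparing with the direct length ∫ ‖dx/dt‖ dt:

```python
from math import sin, cos, sqrt

# First fundamental form of φ(u, v) = (cos u sin v, sin u sin v, cos v):
E = lambda u, v: sin(v) ** 2
F = lambda u, v: 0.0
G = lambda u, v: 1.0

# Length of the (illustrative) curve u(t) = t, v(t) = t, t in [0.2, 1.2],
# via  L = ∫ sqrt(E u̇² + 2F u̇ v̇ + G v̇²) dt  with u̇ = v̇ = 1.
n = 100000
a, b = 0.2, 1.2
h = (b - a) / n
L_fund = sum(sqrt(E(t, t) + 2 * F(t, t) + G(t, t)) * h
             for k in range(n) for t in [a + (k + 0.5) * h])

# Direct check: integrate ||dx/dt|| for x(t) = φ(t, t).
def speed(t):
    dx = -sin(t) * sin(t) + cos(t) * cos(t)   # d/dt [cos t sin t] = cos 2t
    dy =  cos(t) * sin(t) + sin(t) * cos(t)   # d/dt [sin t sin t] = sin 2t
    dz = -sin(t)
    return sqrt(dx * dx + dy * dy + dz * dz)

L_direct = sum(speed(t) * h for k in range(n) for t in [a + (k + 0.5) * h])

print(L_fund, L_direct)   # the two lengths agree
```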
In various applications, surface integrals occur for which the concept of orientation of a
surface is essential.
In the case of an elementary surface, there exist two orientations for the unit normal vector n̄ (see Fig. 74.1), and we can associate to each of them one orientation of the elementary surface S (as for elementary curves, using the two ways to traverse the curve). The set of parametric representations of the elementary surface S is decomposed (according to these orientations) into two disjoint classes. For all representations belonging to one of these classes, we obtain one of the two orientations of n̄ (of S), and for all representations from the other class we obtain the opposite orientation of n̄ (of S).
Figure 74.1:
If a smooth surface S is orientable, then we may orient S by choosing one of the two possible directions of the unit normal vector n.
If the boundary of the elementary surface S is a simple closed curve C, then we may associate with each of the two possible orientations of S an orientation of C, as shown in Fig. 74.2.
Figure 74.2:
The rule is: looking at the curve C from the tip of the unit normal vector n, the sense on the curve is always counterclockwise.
Using this idea we may extend the concept of orientation to surfaces which can be decomposed into elementary surfaces, as follows: a surface S which can be decomposed into elementary surfaces is said to be orientable if we can orient each elementary piece of S in such a manner that along each curve C* which is a common boundary of two pieces S1 and S2, the positive direction of C* relative to S1 is opposite to the positive direction of C* relative to S2.
Figure 74.3:
However, this may not hold in the large: there are non-orientable surfaces. An example of such a surface is the Möbius strip. A model of a Möbius strip can be made by taking a long rectangular piece of paper, giving it a half-twist, and sticking the shorter sides together so that the two points A and the two points B coincide (see Fig. 74.4).
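The reversal of the normal on the Möbius strip can be observed numerically. The sketch below uses a standard parametrization of the strip (not taken from the text) and transports the unit normal once around the center circle, computing the tangent vectors by central finite differences:

```python
from math import sin, cos, pi, sqrt

# Standard Möbius strip parametrization:
# φ(u, v) = ((1 + v cos(u/2)) cos u, (1 + v cos(u/2)) sin u, v sin(u/2))
def phi(u, v):
    r = 1 + v * cos(u / 2)
    return (r * cos(u), r * sin(u), v * sin(u / 2))

def unit_normal(u, v, h=1e-6):
    """Unit normal n = (φ_u × φ_v) / ||φ_u × φ_v|| via central differences."""
    pu = [(a - b) / (2 * h) for a, b in zip(phi(u + h, v), phi(u - h, v))]
    pv = [(a - b) / (2 * h) for a, b in zip(phi(u, v + h), phi(u, v - h))]
    N = (pu[1] * pv[2] - pu[2] * pv[1],
         pu[2] * pv[0] - pu[0] * pv[2],
         pu[0] * pv[1] - pu[1] * pv[0])
    norm = sqrt(sum(c * c for c in N))
    return tuple(c / norm for c in N)

start = unit_normal(0.0, 0.0)
end = unit_normal(2 * pi, 0.0)   # same point of the strip, after one full loop

print(start)   # approximately (0, 0, -1)
print(end)     # approximately (0, 0, +1): the normal has reversed
```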
In general, a surface S is orientable if there does not exist a closed curve C ⊂ S passing through a point P0 such that the chosen normal orientation is reversed by moving continuously along the curve C from P0 back to P0.
Surface integrals occur in many applications, for example in connection with the center of gravity of a curved lamina, or the potential due to charges distributed on surfaces.
Let S be an elementary surface of finite area and let f be a real valued function which is defined and continuous on S. We subdivide S into n parts S1, S2, ..., Sn of areas A1, A2, ..., An. In each part Sk we choose an arbitrary point Pk and form the sum:
$$I_n = \sum_{k=1}^n f(P_k)\cdot A_k$$
This we do for each n = 1, 2, ... in an arbitrary manner, but so that the largest part Sk tends to a point as n approaches infinity. The sequence I1, I2, ..., In, ... has a limit which is independent of the choice of subdivisions and points Pk. This limit is called the surface integral of first type of f over S and is denoted by
$$\iint_S f\,dS$$
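Such sums are easy to realize numerically. The sketch below approximates ∬_S f dS for the cylinder patch of Example 74.2 with the illustrative integrand f(x) = x3, using the standard evaluation over the parameter domain (each piece Sk is the image of a small parameter rectangle, of area approximately ‖N‖ Δu Δv; here ‖N‖ = 1, as computed in that example):

```python
from math import pi, sin, cos

# Riemann-sum approximation of the surface integral of first type
# ∬_S f dS ≈ Σ f(P_k) A_k, with S the cylinder patch of Example 74.2,
# φ(u, v) = (cos u, sin u, v) on D = (0, π/2) × (0, 1).
f = lambda x1, x2, x3: x3          # illustrative integrand

nu, nv = 400, 400
du, dv = (pi / 2) / nu, 1.0 / nv
total = 0.0
for i in range(nu):
    u = (i + 0.5) * du
    for j in range(nv):
        v = (j + 0.5) * dv
        total += f(cos(u), sin(u), v) * 1.0 * du * dv   # A_k ≈ ||N|| Δu Δv, ||N|| = 1

print(total)   # close to ∫_0^{π/2} ∫_0^1 v dv du = π/4 ≈ 0.7854
```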
$$\iint_S u_3\,dx_1\,dx_2 = \iint_S u_3\cos\alpha_3\,dS$$
It is clear that the value of such an integral depends on the choice of n, that is, on
the orientation of S. The transition to the opposite orientation corresponds to the
multiplication of the integral by −1, because then the components cos α1 , cos α2 , cos α3 of
n are multiplied by −1.
The sum of the above three integrals may be written in a simple form by using vector
notation.
In fact we introduce the vector function
u(x1 , x2 , x3 ) = (u1 (x1 , x2 , x3 ), u2 (x1 , x2 , x3 ), u3 (x1 , x2 , x3 ))
and we obtain
$$\iint_S u_1\,dx_2\,dx_3 + u_2\,dx_3\,dx_1 + u_3\,dx_1\,dx_2 = \iint_S (u_1\cos\alpha_1 + u_2\cos\alpha_2 + u_3\cos\alpha_3)\,dS = \iint_S u\cdot n\,dS$$
To evaluate the above integrals we may reduce them to double integrals over a plane region.
If S can be represented as
$$x_3 = h(x_1, x_2)$$
and is oriented such that n points upward (then α3 is acute), then
$$\iint_S u_3\,dx_1\,dx_2 = \iint_R u_3(x_1, x_2, h(x_1, x_2))\,dx_1\,dx_2$$
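A numerical sketch of this reduction, for an illustrative graph surface x3 = h(x1, x2) = x1² + x2² over the unit square (oriented with n upward) and u3(x) = x3, so that the exact value is ∬_R (x1² + x2²) dx1 dx2 = 2/3:

```python
# Reduction of the oriented surface integral ∬_S u3 dx1 dx2 to a double
# integral over R = [0,1] x [0,1], for the illustrative graph surface
# x3 = x1^2 + x2^2 and u3(x1, x2, x3) = x3 (midpoint rule on a grid).
h_fun = lambda x1, x2: x1 ** 2 + x2 ** 2
u3 = lambda x1, x2, x3: x3

n = 500
step = 1.0 / n
total = 0.0
for i in range(n):
    x1 = (i + 0.5) * step
    for j in range(n):
        x2 = (j + 0.5) * step
        total += u3(x1, x2, h_fun(x1, x2)) * step * step

print(total)   # close to 2/3
```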
77 Properties of surface integrals
78 Differentiation of an integral containing a parameter
Consider an integral whose integrand and limits of integration depend on a parameter t:
$$I(t) = \int_{\varphi(t)}^{\psi(t)} f(x, t)\,dx$$
Theorem 78.1. If the functions ϕ(t), ψ(t) are differentiable functions with respect to t
in some interval c ≤ t ≤ d and the function f (x, t) is continuous with respect to x over
the interval ϕ(t) ≤ x ≤ ψ(t) and continuously differentiable with respect to t, then the
function I(t) is differentiable and
$$\frac{d}{dt}\int_{\varphi(t)}^{\psi(t)} f(x, t)\,dx = \psi'(t)\cdot f(\psi(t), t) - \varphi'(t)\cdot f(\varphi(t), t) + \int_{\varphi(t)}^{\psi(t)} \frac{\partial f}{\partial t}(x, t)\,dx$$
Proof. From the mean value theorem for derivatives, for t + h ∈ [c, d] we have
$$\varphi(t + h) = \varphi(t) + h\cdot\varphi'(\xi) \quad\text{with } \xi \in (t, t + h)$$
$$\psi(t + h) = \psi(t) + h\cdot\psi'(\eta) \quad\text{with } \eta \in (t, t + h)$$
$$f(x, t + h) = f(x, t) + h\cdot\frac{\partial f}{\partial t}(x, \zeta) \quad\text{with } \zeta \in (t, t + h)$$
Now we have, splitting the integral at ϕ(t) and ψ(t) and applying the mean value theorem for integrals to the two end pieces,
$$I(t + h) = \int_{\varphi(t+h)}^{\psi(t+h)} f(x, t + h)\,dx = -h\cdot\varphi'(\xi)\cdot f(x', t + h) + h\cdot\psi'(\eta)\cdot f(x'', t + h) + \int_{\varphi(t)}^{\psi(t)} f(x, t + h)\,dx$$
where
$$\varphi(t) \le x' \le \varphi(t) + h\cdot\varphi'(\xi) \qquad \psi(t) \le x'' \le \psi(t) + h\cdot\psi'(\eta)$$
Next, forming the difference I(t + h) − I(t) and combining the integrals, we obtain
$$I(t + h) - I(t) = h\cdot\psi'(\eta)\cdot f(x'', t + h) - h\cdot\varphi'(\xi)\cdot f(x', t + h) + h\int_{\varphi(t)}^{\psi(t)} \frac{\partial f}{\partial t}(x, \zeta)\,dx$$
Finally, forming the difference quotient $\frac{I(t+h) - I(t)}{h}$ and taking the limit as h → 0, it follows that ξ, η, ζ all tend to t. Hence
$$\frac{dI}{dt} = \psi'(t)\cdot f(\psi(t), t) - \varphi'(t)\cdot f(\varphi(t), t) + \int_{\varphi(t)}^{\psi(t)} \frac{\partial f}{\partial t}(x, t)\,dx$$
Corollary 78.1. If f(x, t) is continuous with respect to x over the interval [a, b] and continuously differentiable with respect to t, then
$$\frac{d}{dt}\int_a^b f(x, t)\,dx = \int_a^b \frac{\partial f}{\partial t}(x, t)\,dx$$
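The formula of Theorem 78.1 can be checked against a finite-difference derivative; the integrand f(x, t) = sin(tx) and the limits ϕ(t) = t, ψ(t) = t² in the sketch below are illustrative choices:

```python
from math import sin, cos

def integral(fn, a, b, n=20000):
    """Midpoint rule for ∫_a^b fn(x) dx."""
    h = (b - a) / n
    return sum(fn(a + (k + 0.5) * h) for k in range(n)) * h

# I(t) = ∫_{φ(t)}^{ψ(t)} f(x, t) dx with (illustrative) f(x, t) = sin(t x),
# φ(t) = t, ψ(t) = t^2.
f = lambda x, t: sin(t * x)
I = lambda t: integral(lambda x: f(x, t), t, t * t)

t = 1.3
# Theorem 78.1: I'(t) = ψ'(t) f(ψ(t), t) − φ'(t) f(φ(t), t) + ∫ ∂f/∂t dx,
# with ψ'(t) = 2t, φ'(t) = 1 and ∂f/∂t = x cos(t x).
leibniz = (2 * t) * sin(t * t ** 2) - 1.0 * sin(t * t) \
          + integral(lambda x: x * cos(t * x), t, t * t)

h = 1e-5
fd = (I(t + h) - I(t - h)) / (2 * h)   # central-difference check

print(leibniz, fd)   # the two values agree
```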
Let ω0 ⊂ R³, let x(t, ξ) describe the motion of the points ξ ∈ ω0, and consider the moving region
$$\omega(t) = \{x(t, \xi) \mid \xi \in \omega_0\}$$
and assume that the Jacobian $\frac{D(x_1, x_2, x_3)}{D(\xi_1, \xi_2, \xi_3)}$ of the function $x_t : \omega_0 \to \omega(t)$ defined by $x_t(\xi) = x(t, \xi)$ is positive. Then, for a continuously differentiable function f(t, x), the transport formula holds:
$$\frac{d}{dt}\iiint_{\omega(t)} f(t, x)\,dx = \iiint_{\omega(t)} \frac{\partial f}{\partial t}(t, x)\,dx + \iint_{S(t)} f\,v\cdot n\,dS$$
where S(t) is the boundary of ω(t), $v = \frac{\partial x}{\partial t}$, and n is the outward unit normal vector of the surface S(t).