D Richards
April 30, 2008
Contents
1 Preliminary Analysis
1.1 Introduction
1.2 Notation and preliminary remarks
1.2.1 The Order notation
1.3 Functions of a real variable
1.3.1 Introduction
1.3.2 Continuity and Limits
1.3.3 Monotonic functions and inverse functions
1.3.4 The derivative
1.3.5 Mean Value Theorems
1.3.6 Partial Derivatives
1.3.7 Implicit functions
1.3.8 Taylor series for one variable
1.3.9 Taylor series for several variables
1.3.10 L'Hospital's rule
1.3.11 Integration
1.4 Miscellaneous exercises
1.5 Solutions for chapter 1
Preliminary Analysis
1.1 Introduction
This course is about two related mathematical concepts which are of use in many areas
of applied mathematics, are of immense importance in formulating the laws of theoretical
physics and also produce important, interesting and some unsolved mathematical
problems. These are the functional and variational principles: the theory of these
entities is named The Calculus of Variations.
A functional is a generalisation of a function of one or more real variables. A real
function of a single real variable maps an interval of the real line to real numbers: for
instance, the function $(1 + x^2)^{-1}$ maps the whole real line to the interval (0, 1]; the
function ln x maps the positive real axis to the whole real line. Similarly a real function
of n real variables maps a domain of $\mathbb{R}^n$ into the real numbers.
A functional maps a given class of functions to real numbers. A simple example of
a functional is
$$S[y] = \int_0^1 dx\, \sqrt{1 + y'(x)^2}, \quad y(0) = 0, \quad y(1) = 1, \tag{1.1}$$
which associates a real number with any real function y(x) which satisfies the boundary
conditions and for which the integral exists. We use the square bracket notation^1 S[y]
to emphasise the fact that the functional depends upon the choice of function used to
evaluate the integral. In chapter 2 we shall see that a wide variety of problems can be
described in terms of functionals. Notice that the boundary conditions, y(0) = 0 and
y(1) = 1 in this example, are often part of the definition of the functional.
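To make the idea of a functional concrete, the following Python sketch evaluates S[y] of equation 1.1 numerically for two trial functions, y = x and y = x^2, both of which satisfy the boundary conditions. It is an illustration only: the function name `arc_length_functional`, the use of the midpoint rule and the choice of trial functions are our own assumptions, not part of the course material.

```python
import math

def arc_length_functional(dy, n=10_000):
    """Approximate S[y] = integral from 0 to 1 of sqrt(1 + y'(x)^2) dx.

    dy is the derivative y'(x) of the trial function; the integral is
    approximated by the midpoint rule with n equal sub-intervals.
    """
    h = 1.0 / n
    return sum(math.sqrt(1.0 + dy((i + 0.5) * h) ** 2) for i in range(n)) * h

# Two trial functions, both satisfying y(0) = 0 and y(1) = 1.
s_line = arc_length_functional(lambda x: 1.0)          # y = x,   y' = 1
s_parabola = arc_length_functional(lambda x: 2.0 * x)  # y = x^2, y' = 2x

print("S[y = x]   =", s_line)
print("S[y = x^2] =", s_parabola)
```

Each trial function produces a single real number, and the straight line gives the smaller value, as expected since this particular functional is the length of the curve joining (0, 0) to (1, 1).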
Real functions of n real variables can have various properties; for instance they
can be continuous, they may be differentiable or they may have stationary points and
local and global maxima and minima: functionals share many of these properties. In
1 In this course we use conventions common in applied mathematics and theoretical physics. A
function of a real variable x will usually be represented by symbols such as f(x) or just f, often
with no distinction made between the function and its value; it is often clearer
to use context to provide meaning, rather than precise definitions, which initially can hinder clarity.
Similarly, we use the older convention, S[y], for a functional, to emphasise that y is itself a function;
this distinction is not made in modern mathematics. For an introductory course we feel that the older
convention, used in most texts, is clearer and more helpful.
the fact that most important functions of one variable describe physical phenomena
and often arise as solutions of ordinary differential equations. Therefore it is usual to
restrict attention to functions that are differentiable or, more usually, differentiable a
number of times.
The most useful generalisation of differentiability to functions defined on sets other
than R requires some care. It is not too hard in the case of functions of several (real)
variables but we shall have to generalise differentiation and integration to functionals,
not just to functions of several real variables.
Our presentation conceals very significant intellectual achievements made at the
end of the nineteenth century and during the first half of the twentieth century. During
the nineteenth century, although much work was done on particular equations, there
was little systematic theory. This changed when the idea of infinite dimensional vector
spaces began to emerge. Between 1900 and 1906, fundamental papers appeared by
Fredholm^3, Hilbert^4 and Fréchet^5. Fréchet's thesis gave for the first time definitions of
limit and continuity that were applicable in very general sets. Previously, the concepts
had been restricted to special objects such as points, curves, surfaces or functions. By
introducing the concept of distance in more general sets he paved the way for rapid
advances in the theory of partial differential equations. These ideas, together with the
theory of Lebesgue integration introduced in 1902 by Lebesgue in his doctoral thesis^6,
led to the modern theory of functional analysis. This is now the usual framework of
the theoretical study of partial differential equations. These ideas are also required for an
elucidation of some of the difficulties in the Calculus of Variations. However, in this
introductory course, we concentrate on basic techniques of solving practical problems,
because we think this is the best way to motivate and encourage further study.
This preliminary chapter, which is assessed, is about real analysis and introduces
many of the ideas needed for our treatment of the Calculus of Variations. It is possible
that you are already familiar with the mathematics described in this chapter, in which
case you could start the course with chapter 2. You should ensure, however, that you
have a good working knowledge of differentiation, both ordinary and partial, Taylor
series of one and several variables and differentiation under the integral sign, all of
which are necessary for the development of the theory. In addition familiarity with the
theory of linear differential equations with both initial and boundary value problems is
assumed.
Very many exercises are set, in the belief that mathematical ideas cannot be understood
without attempting to solve problems at various levels of difficulty and that
one learns most by making one's own mistakes, which is time consuming. You should
not attempt all these exercises at a first reading, but they provide practice of essential
mathematical techniques and in the use of a variety of ideas, so you should do as many
as time permits; thinking about a problem, then looking up the solution is usually of
little value until you have attempted your own solution. The exercises at the end of
this chapter are examples of the type of problem that commonly occurs in applications:
they are provided for extra practice if time permits and it is not necessary for you to
attempt them.
3 I. Fredholm, On a new method for the solution of Dirichlet's problem, reprinted in Oeuvres
231-359.
valued. We shall also write without further comment $f(x) = (f_1(x), f_2(x), \ldots, f_m(x))$,
so that the $f_i$ are the m component functions, $f_i : \mathbb{R}^n \to \mathbb{R}$, of f.
On the real line the distance between two points x and y is naturally defined by
$|x - y|$. A point x is in the open interval (a, b) if a < x < b, and is in the closed interval
[a, b] if $a \le x \le b$. By convention, the intervals $(-\infty, a)$, $(b, \infty)$ and $(-\infty, \infty) = \mathbb{R}$ are
also open intervals. Here, $(-\infty, a)$ means the set of all real numbers strictly less than
a. The symbol $\infty$ for infinity is not a number, and its use here is conventional. In
the language and notation of set theory, we can write $(-\infty, a) = \{x \in \mathbb{R} : x < a\}$, with
similar definitions for the other two types of open interval. One reason for considering
open sets is that the natural domain of definition of some important functions is an
open set. For example, the function ln x as a function of one real variable is defined for
$x \in (0, \infty)$.
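The space of points $\mathbb{R}^n$ is an example of a linear space. Here the term linear has
the normal meaning that for every x, y in $\mathbb{R}^n$, and for every real $\lambda$, $x + y$ and $\lambda x$ are
in $\mathbb{R}^n$. Explicitly,
$$(x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)$$
and
$$\lambda (x_1, x_2, \ldots, x_n) = (\lambda x_1, \lambda x_2, \ldots, \lambda x_n).$$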
Functions $f : \mathbb{R}^n \to \mathbb{R}^m$ may also be added and multiplied by real numbers. Therefore
a function of this type may be regarded as a vector in the vector space of functions
though this space is not finite dimensional like $\mathbb{R}^n$.
In the space $\mathbb{R}^n$ the distance |x| of a point x from the origin is defined by the natural
generalisation of Pythagoras' theorem, $|x| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$. The distance
between two vectors x and y is then defined by
$$|x - y| = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2}. \tag{1.2}$$
This is a direct generalisation of the distance along a line, to which it collapses when
n = 1.
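This distance has the three basic properties
(a) $|x| \ge 0$, and $|x| = 0$ if and only if $x = 0$;
(b) $|x - y| = |y - x|$;
(c) $|x - y| + |y - z| \ge |x - z|$ (the triangle inequality).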
In the more abstract spaces, such as the function spaces we need later, a similar concept
of a distance between elements is needed. This is named the norm and is a map from
two elements of the space to the positive real numbers and which satisfies the above
three rules. In function spaces there is no natural choice of the distance function and
we shall see in chapter 3 that this flexibility can be important.
For functions of several variables, that is, for functions defined on sets of points in
Rn , the direct generalization of open interval is an open ball.
Definition 1.1
The open ball $B_r(a)$ of radius r and centre $a \in \mathbb{R}^n$ is the set of points
$$B_r(a) = \{x \in \mathbb{R}^n : |x - a| < r\}.$$
Thus the ball of radius 1 and centre (0, 0) in $\mathbb{R}^2$ is the interior of the unit circle, not
including the points on the circle itself. And in $\mathbb{R}$, the ball of radius 1 and centre 0
is the open interval $(-1, 1)$. However, for $\mathbb{R}^2$ and for $\mathbb{R}^n$ for n > 2, open balls are not
quite general enough. For example, the open square
$$\{(x, y) \in \mathbb{R}^2 : |x| < 1, |y| < 1\}$$
is not a ball, but in many ways is similar. (You may know for example that it may be
mapped continuously to an open ball.) It turns out that the most convenient concept
is that of open set^7, which we can now define.
Definition 1.2
Open sets. A set U in $\mathbb{R}^n$ is said to be open if for every $x \in U$ there is an open ball
$B_r(a)$ wholly contained within U which contains x.
In other words, every point in an open set lies in an open ball contained in the set.
Any open ball is in many ways like the whole of the space $\mathbb{R}^n$: it has no isolated or
missing points. Also, every open set is a union of open balls (obviously). Open sets
are very convenient and important in the theory of functions, but we cannot study the
reasons here. A full treatment of open sets can be found in books on topology^8. Open
balls are not the only type of open sets and it is not hard to show that the open square,
$\{(x, y) \in \mathbb{R}^2 : |x| < 1, |y| < 1\}$, is in fact an open set, according to the definition we gave;
and in a similar way it can be shown that the set $\{(x, y) \in \mathbb{R}^2 : (x/a)^2 + (y/b)^2 < 1\}$,
which is the interior of an ellipse, is an open set.
Exercise 1.1
Show that the open square is an open set by constructing explicitly for each (x, y)
in the open square $\{(x, y) \in \mathbb{R}^2 : |x| < 1, |y| < 1\}$ a ball containing (x, y) and
lying in the square.
The conditional clause 'as $x \to 0$' is often omitted when it is clear from the context.
More generally, this order notation can be used to compare the size of functions, f(x)
and g(x): we say that f(x) is of the order of g(x) as $x \to y$ if there is a non-zero
constant C such that |f(x)| < C|g(x)| for all x in an interval around y; more succinctly,
f(x) = O(g(x)) as $x \to y$.
When used in the form f(x) = O(g(x)) as $x \to \infty$, this notation means that
|f(x)| < C|g(x)| for all x > X, where X and C are positive numbers independent
of x.
7 As with many other concepts in analysis, formulating clearly the concepts, in this case an open
Press.
This notation is particularly useful when truncating power series: thus, the series
for sin x up to $O(x^3)$ is written
$$\sin x = x - \frac{x^3}{3!} + O(x^5),$$
meaning that the remainder is smaller than $C|x|^5$, as $x \to 0$, for some C. Note that in
this course the phrase 'up to $O(x^3)$' means that the $x^3$ term is included. The following
exercises provide practice in using the O-notation and exercise 1.2 proves an important
result.
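As a numerical illustration of the truncated sine series, the short Python sketch below computes the ratio of the remainder to $|x|^5$ for a few small values of x. The function name `remainder_ratio` and the sample points are our own choices; the sketch only illustrates that the ratio stays bounded (it approaches 1/5!), which is what the statement "the remainder is $O(x^5)$" asserts.

```python
import math

def remainder_ratio(x):
    """Return |sin x - (x - x^3/3!)| / |x|^5 for x != 0."""
    truncated = x - x**3 / 6.0
    return abs(math.sin(x) - truncated) / abs(x) ** 5

# The ratio stays bounded as x decreases, so the remainder is O(x^5).
for x in (0.5, 0.2, 0.1):
    print(x, remainder_ratio(x))

print("1/5! =", 1.0 / math.factorial(5))
```

Very small values of x are avoided deliberately, because the subtraction in the numerator then loses accuracy in floating-point arithmetic.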
Exercise 1.2
Show that if $f(x) = O(x^2)$ as $x \to 0$ then also $f(x) = O(x)$.
Exercise 1.3
Use the binomial expansion to find the order of the following expressions as $x \to 0$.
(a) $x\sqrt{1 + x^2}$, (b) $\dfrac{x}{1 + x}$, (c) $\dfrac{x^{3/2}}{1 - e^{-x}}$.
Exercise 1.4
Use the binomial expansion to find the order of the following expressions as $x \to \infty$.
(a) $\dfrac{x}{x - 1}$, (b) $\sqrt{4x^2 + x} - 2x$, (c) $(x + b)^a - x^a$, $a > 0$.
Exercise 1.5
(a) If $f_1 = x$ and $f_2 = y$ show that $f_1 = O(f)$ and $f_2 = O(f)$ where
$f(x, y) = (x^2 + y^2)^{1/2}$.
(b) Show that the polynomial $\phi(x, y) = ax^2 + bxy + cy^2$ vanishes to at least the
same order as the polynomial $f(x, y) = x^2 + y^2$ at (0, 0). What conditions are
needed for $\phi$ to vanish faster than f as $\sqrt{x^2 + y^2} \to 0$?
Figure 1.1 Figure showing examples of a continuous
function, $f_1(x)$, and a discontinuous function $f_2(x)$.
A function f(x) is continuous at a point x = a if f(a) exists and if, given any arbitrarily
small positive number, $\epsilon$, we can find a neighbourhood of x = a such that in it
$|f(x) - f(a)| < \epsilon$. We can express this in terms of limits and since a point a on the real line
can be approached only from the left or the right a function is continuous at a point
x = a if it approaches the same value, independent of the direction. Formally we have
Definition 1.4
Continuity: a function, f, is continuous at x = a if f(a) is defined and
$$\lim_{x \to a} f(x) = f(a).$$
For a function of one variable, this is equivalent to saying that f(x) is continuous at
x = a if f(a) is defined and the left and right-hand limits
$$\lim_{x \to a-} f(x) \quad\text{and}\quad \lim_{x \to a+} f(x),$$
9 A Course of Modern Analysis by E T Whittaker and G N Watson (Cambridge University Press).
10 Principles of Mathematical Analysis by W Rudin (McGraw-Hill).
11 Introductory Real Analysis by A N Kolmogorov and S V Fomin (Dover).
Quite elementary functions exist for which neither limit exists: these are also discontinuous,
and said to have a discontinuity of the second kind at x = a, see Rudin
(1976, page 94). An example of a function with such a discontinuity at x = 0 is
$$f(x) = \begin{cases} \sin(1/x), & x \ne 0, \\ 0, & x = 0. \end{cases}$$
We shall have no need to consider this type of discontinuity in this course, but simple
discontinuities will arise.
A function that behaves as
$$|f(x + \epsilon) - f(x)| = O(\epsilon) \quad\text{as}\quad \epsilon \to 0$$
is continuous, though the converse is not true, a counter example being $f(x) = \sqrt{|x|}$ at
x = 0.
Most functions that occur in the sciences are either continuous or piecewise continuous,
which means that the function is continuous except at a discrete set of points. The
Heaviside function and the related sgn function are examples of commonly occurring
piecewise continuous functions that are discontinuous. They are defined by
$$H(x) = \begin{cases} 1, & x > 0, \\ 0, & x < 0, \end{cases} \quad\text{and}\quad \mathrm{sgn}(x) = \begin{cases} 1, & x > 0, \\ -1, & x < 0, \end{cases} \qquad \mathrm{sgn}(x) = -1 + 2H(x). \tag{1.5}$$
These functions are discontinuous at x = 0, where they are not normally defined. In
some texts these functions are defined at x = 0; for instance H(0) may be defined to
have the value 0, 1/2 or 1.
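The relation between the two functions in equation 1.5 is easily checked. In the following Python sketch the names `heaviside` and `sgn` are our own, and the decision to raise an error at x = 0 is simply one way of reflecting the convention that neither function is defined there; other conventions, as noted above, assign a value to H(0).

```python
def heaviside(x):
    """Heaviside function H(x); deliberately left undefined at x = 0."""
    if x == 0:
        raise ValueError("H(0) is not defined in this convention")
    return 1 if x > 0 else 0

def sgn(x):
    """Sign function sgn(x); deliberately left undefined at x = 0."""
    if x == 0:
        raise ValueError("sgn(0) is not defined in this convention")
    return 1 if x > 0 else -1

# Check sgn(x) = -1 + 2 H(x) on either side of the discontinuity.
for x in (-2.5, -1e-9, 1e-9, 3.0):
    print(x, sgn(x), -1 + 2 * heaviside(x))
```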
If $\lim_{x \to c} f(x) = A$ and $\lim_{x \to c} g(x) = B$, then it can be shown that the following
(obvious) rules are adhered to:
(a) $\lim_{x \to c} \left( f(x) + g(x) \right) = A + B$;
(b) $\lim_{x \to c} \left( f(x) g(x) \right) = AB$;
(c) $\lim_{x \to c} \dfrac{f(x)}{g(x)} = \dfrac{A}{B}$, if $B \ne 0$;
(d) if $\lim_{x \to B} f(x) = f_B$ then $\lim_{x \to c} f(g(x)) = f_B$.
The value of a limit is normally found by a combination of suitable re-arrangements
and expansions. An example of an expansion is
$$\lim_{x \to 0} \frac{\sinh ax}{x} = \lim_{x \to 0} \frac{ax + \frac{1}{3!}(ax)^3 + O(x^5)}{x} = \lim_{x \to 0} \left( a + O(x^2) \right) = a.$$
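The expansion above predicts that $\sinh(ax)/x$ differs from a by a term of order $x^2$. A brief numerical check, with the function name `sinh_ratio` and the value a = 3 chosen purely for illustration, is sketched below.

```python
import math

def sinh_ratio(a, x):
    """Return sinh(a x) / x for x != 0."""
    return math.sinh(a * x) / x

a = 3.0
# The difference from a should shrink roughly like x^2 (about a^3 x^2 / 6).
for x in (1e-1, 1e-2, 1e-3):
    value = sinh_ratio(a, x)
    print(x, value, value - a)
```

Reducing x by a factor of 10 reduces the difference by a factor of about 100, consistent with the $O(x^2)$ term.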
Exercise 1.7
Find the limits of the following functions as $x \to 0$ and $w \to \infty$.
(a) $\dfrac{\sin ax}{x}$, (b) $\dfrac{\tan ax}{x}$, (c) $\dfrac{\sin ax}{\sin bx}$, (d) $\dfrac{3x + 4}{4x + 2}$, (e) $\left( 1 + \dfrac{z}{w} \right)^w$.
For functions of two or more variables, the definition of continuity is essentially the
same as for a function of one variable. A function f(x) is continuous at x = a if f(a)
is defined and
$$\lim_{x \to a} f(x) = f(a). \tag{1.6}$$
Alternatively, given any $\epsilon > 0$ there is a $\delta > 0$ such that whenever $|x - a| < \delta$,
$|f(x) - f(a)| < \epsilon$.
It should be noted that if f(x, y) is continuous in each variable, it is not necessarily
continuous in both variables. For instance, consider the function
$$f(x, y) = \begin{cases} \dfrac{(x + y)^2}{x^2 + y^2}, & x^2 + y^2 \ne 0, \\ 1, & x = y = 0, \end{cases}$$
Figure 1.2 Illustration showing the chord PQ and the tangent
line at P.
The gradient of the chord PQ is $\tan\theta$ where $\theta$ is the angle between PQ and the x-axis,
and is given by the formula
$$\tan\theta = \frac{f(a + h) - f(a)}{h}.$$
If the graph in the vicinity of x = a is represented by a smooth line, then it is intuitively
obvious that the chord PQ becomes closer to the tangent at P as $h \to 0$; and in the
limit h = 0 the chord becomes the tangent. Hence the gradient of the tangent is given
by the limit
$$\lim_{h \to 0} \frac{f(a + h) - f(a)}{h}.$$
This limit, provided it exists, is named the derivative of f(x) at x = a and is commonly
denoted either by $f'(a)$ or $\dfrac{df}{dx}$. Thus we have the formal definition:
Definition 1.6
The derivative: A function f(x), defined on an open interval U of the real line, is
differentiable for $x \in U$ and has the derivative $f'(x)$ if
$$f'(x) = \frac{df}{dx} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}, \tag{1.7}$$
exists.
If the derivative exists at every point in the open interval U the function f (x) is said
to be differentiable in U : in this case it may be proved that f (x) is also continuous.
However, a function that is continuous at a need not be differentiable at a: indeed,
it is possible to construct functions that are continuous everywhere but differentiable
nowhere; such functions are encountered in the mathematical description of Brownian
motion.
Combining the definition of $f'(x)$ and the definition 1.3 of the order notation shows
that a differentiable function satisfies
The formal definition, equation 1.7, of the derivative can be used to derive all its useful
properties, but the physical interpretation, illustrated in figure 1.2, provides a more
useful way to generalise it to functions of several variables.
The tangent line to the graph y = f(x) at the point a, which we shall consider to
be fixed for the moment, has slope $f'(a)$ and passes through f(a). These two facts
determine the derivative completely. The equation of the tangent line can be written
in parametric form as $p(h) = f(a) + f'(a)h$. Conversely, given a point a, and the
equation of the tangent line at that point, the derivative, in the classical sense of the
definition 1.6, is simply the slope, $f'(a)$, of this line. So the information that the
derivative of f at a is $f'(a)$ is equivalent to the information that the tangent line at
a has equation $p(h) = f(a) + f'(a)h$. Although the classical derivative, equation 1.7,
is usually taken to be the fundamental concept, the equivalent concept of the tangent
line at a point could be considered equally fundamental, perhaps more so, since a
tangent is a more intuitive idea than the numerical value of its slope. This is the key
to successfully defining the derivative of functions of more than one variable.
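The sense in which the tangent line $p(h) = f(a) + f'(a)h$ approximates the function can be seen numerically. In the sketch below the helper `tangent_line` and the choice $f(x) = e^x$ at a = 0 are assumptions made for illustration only; the point is that the error divided by h tends to zero with h.

```python
import math

def tangent_line(f, df, a):
    """Return the function p(h) = f(a) + f'(a) h for the given f and f'."""
    fa, slope = f(a), df(a)
    return lambda h: fa + slope * h

# Example: f(x) = exp(x) at a = 0, so that p(h) = 1 + h.
p = tangent_line(math.exp, math.exp, 0.0)

for h in (0.1, 0.01, 0.001):
    error = math.exp(h) - p(h)
    # error / h decreases with h: the tangent line is accurate to first order.
    print(h, error, error / h)
```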
From the definition 1.6 the following useful results follow. If f(x) and g(x) are
differentiable on the same open interval and $\alpha$ and $\beta$ are constants then
(a) $\dfrac{d}{dx}\left( \alpha f(x) + \beta g(x) \right) = \alpha f'(x) + \beta g'(x)$,
(b) $\dfrac{d}{dx}\left( f(x) g(x) \right) = f'(x) g(x) + f(x) g'(x)$, (The product rule)
(c) $\dfrac{d}{dx}\left( \dfrac{f(x)}{g(x)} \right) = \dfrac{f'(x) g(x) - f(x) g'(x)}{g(x)^2}$, $g(x) \ne 0$. (The quotient rule)
We leave the proof of these results to the reader, but note that the derivative of 1/g(x)
follows almost trivially from the definition 1.6, exercise 1.14, so that the third expression
is a simple consequence of the second.
The other important result is the chain rule concerning the derivative of composite
functions. Suppose that f(x) and g(x) are two differentiable functions and a third is
formed by the composition,
F(x) = f(g(x)), sometimes written as $F = f \circ g$,
which we assume to exist. Then the derivative of F(x) can be shown, as in exercise 1.18,
to be given by
$$\frac{dF}{dx} = \frac{df}{dg}\,\frac{dg}{dx} \quad\text{or}\quad F'(x) = f'(g)\,g'(x). \tag{1.9}$$
This formula is named the chain rule. Note how the prime-notation is used: it denotes
the derivative of the function with respect to the argument shown, not necessarily the
original independent variable, x. Thus $f'(g)$ or $f'(g(x))$ does not mean the derivative
of F(x); it means the derivative $f'(x)$ with x replaced by g or g(x).
A simple example should make this clear: suppose f(x) = sin x and g(x) = 1/x,
x > 0, so F(x) = sin(1/x). The chain rule gives
$$\frac{dF}{dx} = \frac{d}{dg}(\sin g)\,\frac{d}{dx}\left( \frac{1}{x} \right) = -\cos g\,\frac{1}{x^2} = -\frac{1}{x^2} \cos\left( \frac{1}{x} \right).$$
The derivatives of simple functions, polynomials and trigonometric functions for instance,
can be deduced from first principles using the definition 1.6: the three rules, given above,
and the chain rule can then be used to find the derivative of any function described with
finite combinations of these simple functions. A few exercises will make this process
clear.
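The chain-rule result for F(x) = sin(1/x) can be compared with a direct numerical estimate of the derivative. The following Python sketch is illustrative only: the names `F`, `dF_chain` and `central_difference`, the step size and the sample points are our own choices.

```python
import math

def F(x):
    """Composite function F(x) = sin(1/x), x > 0."""
    return math.sin(1.0 / x)

def dF_chain(x):
    """Derivative from the chain rule: F'(x) = -cos(1/x) / x^2."""
    return -math.cos(1.0 / x) / x**2

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

for x in (0.5, 1.0, 2.0):
    print(x, dF_chain(x), central_difference(F, x))
```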
Exercise 1.10
Find the derivative of the following functions
(a) $\sqrt{(a - x)(b + x)}$, (b) $\sqrt{a\sin^2 x + b\cos^2 x}$, (c) $\cos(x^3)\cos x$, (d) $x^x$.
Exercise 1.11
If $y = \sin x$ for $-\pi/2 \le x \le \pi/2$ show that $\dfrac{dx}{dy} = \dfrac{1}{\sqrt{1 - y^2}}$.
Exercise 1.12
(a) If y = f(x) has the inverse x = g(y), show that $f'(x)g'(y) = 1$, that is
$$\frac{dx}{dy} = \left( \frac{dy}{dx} \right)^{-1}.$$
(b) Express $\dfrac{d^2 x}{dy^2}$ in terms of $\dfrac{dy}{dx}$ and $\dfrac{d^2 y}{dx^2}$.
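$$f, \quad \frac{df}{dx}, \quad \frac{d^2 f}{dx^2}, \quad \frac{d^3 f}{dx^3}, \quad \cdots, \quad \frac{d^{n-1} f}{dx^{n-1}}, \quad \frac{d^n f}{dx^n}, \quad \cdots,$$
where each member of the sequence is the derivative of the preceding member,
$$\frac{d^p f}{dx^p} = \frac{d}{dx}\left( \frac{d^{p-1} f}{dx^{p-1}} \right), \quad p = 2, 3, \cdots.$$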
The prime notation becomes rather clumsy after the second or third derivative, so the
most common alternative is
$$\frac{d^p f}{dx^p} = f^{(p)}(x), \quad p \ge 2,$$
with the conventions $f^{(1)}(x) = f'(x)$ and $f^{(0)}(x) = f(x)$. Care is needed to distinguish
between the pth derivative, $f^{(p)}(x)$, and the pth power, denoted by $f(x)^p$ and sometimes
$f^p(x)$: the latter notation should be avoided if there is any danger of confusion.
Functions for which the nth derivative is continuous are said to be n-differentiable
and to belong to class $C^n$: the notation $C^n(U)$ means the first n derivatives are continuous
on the interval U: the notation $C^n(a, b)$ or $C^n[a, b]$, with obvious meaning, may also
be used. The term smooth function describes functions belonging to $C^\infty$, that is functions,
such as sin x, having all derivatives; we shall, however, use the term sufficiently
smooth for functions that are sufficiently differentiable for all subsequent analysis to
work, when more detail is deemed unimportant.
In the following exercises some important, but standard, results are derived.
Exercise 1.13
If f(x) is an even (odd) function, show that $f'(x)$ is an odd (even) function.
Exercise 1.14
Show, from first principles using the limit 1.7, that $\dfrac{d}{dx}\left( \dfrac{1}{f(x)} \right) = -\dfrac{f'(x)}{f(x)^2}$, and
that the product rule is true.
Exercise 1.15
Leibniz's rule
If h(x) = f(x)g(x) show that
$$h^{(n)}(x) = \sum_{k=0}^{n} \binom{n}{k} f^{(n-k)}(x)\, g^{(k)}(x),$$
where the binomial coefficients are given by $\dbinom{n}{k} = \dfrac{n!}{k!\,(n - k)!}$.
Exercise 1.16
Show that $\dfrac{d}{dx} \ln(f(x)) = \dfrac{f'(x)}{f(x)}$ and hence that if
$$p(x) = f_1(x) f_2(x) \cdots f_n(x) \quad\text{then}\quad \frac{p'}{p} = \frac{f_1'}{f_1} + \frac{f_2'}{f_2} + \cdots + \frac{f_n'}{f_n},$$
provided $p(x) \ne 0$. Note that this gives an easier method of differentiating products
of three or more factors than repeated use of the product rule.
Exercise 1.17
If the elements of a determinant D(x) are differentiable functions of x,
$$D(x) = \begin{vmatrix} f(x) & g(x) \\ \phi(x) & \psi(x) \end{vmatrix}$$
show that
$$D'(x) = \begin{vmatrix} f'(x) & g'(x) \\ \phi(x) & \psi(x) \end{vmatrix} + \begin{vmatrix} f(x) & g(x) \\ \phi'(x) & \psi'(x) \end{vmatrix}.$$
Extend this result to third-order determinants.
Figure 1.3 Diagram illustrating Cauchy's form
of the mean value theorem.
From this figure it seems plausible that the tangent to the curve must be parallel to
the chord AB at least once. That is
$$f'(x) = \frac{f(b) - f(a)}{b - a} \quad\text{for some x in the interval } a < x < b. \tag{1.10}$$
Alternatively, on setting b = a + h, this may be written in the form
$$f(a + h) = f(a) + h f'(a + \theta h), \tag{1.11}$$
where $\theta$ is a number in the interval $0 < \theta < 1$, and is normally unknown. This relation
is used frequently throughout the course. Note that equation 1.11 shows that between
zeros of a continuous function there is at least one point at which the derivative is zero.
Equation 1.10 can be proved and is enshrined in the following theorem
Theorem 1.1
The Mean Value Theorem (Cauchy's form). If f(x) and g(x) are real and differentiable
for $a \le x \le b$, then there is a point u inside the interval at which
$$\left( f(b) - f(a) \right) g'(u) = \left( g(b) - g(a) \right) f'(u), \quad a < u < b. \tag{1.12}$$
A similar idea may be applied to integrals. In figure 1.4 is shown a typical continuous
function, f(x), which attains its smallest and largest values, S and L respectively, on
the interval $a \le x \le b$.
12 A smooth curve is one along which its tangent changes direction continuously, without abrupt
changes.
Figure 1.4 Diagram showing the upper and
lower bounds of f(x) used to bound the integral.
It is clear that the area under the curve is greater than $(b - a)S$ and less than $(b - a)L$,
that is
$$(b - a) S \le \int_a^b dx\, f(x) \le (b - a) L.$$
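These bounds are easily checked for a specific function. In the Python sketch below the integrand $f(x) = 2 + \sin x$ on the interval [0, 3], the helper `midpoint_integral` and the midpoint rule itself are our own choices for illustration; on this interval the smallest value is S = 2 (at x = 0) and the largest is L = 3 (at $x = \pi/2$).

```python
import math

def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def f(x):
    return 2.0 + math.sin(x)

a, b = 0.0, 3.0
S, L = 2.0, 3.0  # smallest and largest values of f on [a, b]

integral = midpoint_integral(f, a, b)
lower, upper = (b - a) * S, (b - a) * L
print(lower, integral, upper)
```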
Exercise 1.18
The chain rule
In this exercise the Mean Value Theorem is used to derive the chain rule, equation 1.9,
for the derivative of F(x) = f(g(x)).
Use the mean value theorem to show that
$$F(x + h) - F(x) = f\left( g(x) + h g'(x + \theta h) \right) - f(g(x))$$
and that
$$f\left( g(x) + h g'(x + \theta h) \right) = f(g(x)) + h g'(x + \theta h)\, f'(g + h\phi g')$$
Exercise 1.19
Use the integral form of the mean value theorem, equation 1.13, to evaluate the
limits,
(a) $\lim_{x \to 0} \dfrac{1}{x} \displaystyle\int_0^x dt\, \sqrt{4 + 3t^3}$, (b) $\lim_{x \to 1} \dfrac{1}{(x - 1)^3} \displaystyle\int_1^x dt\, \ln\left( 3t - 3t^2 + t^3 \right)$.
We use the conventional notation, $\partial f/\partial x$, to denote the partial derivative with respect
to x, which is formed by fixing y and using the rules of ordinary calculus for the derivative
with respect to x. The suffix notation, $f_x(x, y)$, is used to denote the same function:
here the suffix x shows the variable being differentiated, and it has the advantage that
when necessary it can be used in the form $f_x(a, b)$ to indicate that the partial derivative
$f_x$ is being evaluated at the point (a, b).
In practice the evaluation of partial derivatives is exactly the same as ordinary
derivatives and the same rules apply. Thus if $f(x, y) = x e^y \ln(2x + 3y)$ then the partial
derivatives with respect to x and y are, respectively,
$$\frac{\partial f}{\partial x} = e^y \ln(2x + 3y) + \frac{2x e^y}{2x + 3y} \quad\text{and}\quad \frac{\partial f}{\partial y} = x e^y \ln(2x + 3y) + \frac{3x e^y}{2x + 3y}.$$
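The two partial derivatives of $f(x, y) = x e^y \ln(2x + 3y)$ can be checked against difference quotients in which one variable is held fixed. The Python sketch below is illustrative; the function names, the evaluation point (1, 0.5) and the step size are assumptions of ours.

```python
import math

def f(x, y):
    return x * math.exp(y) * math.log(2 * x + 3 * y)

def f_x(x, y):
    """Partial derivative with respect to x, with y held fixed."""
    return math.exp(y) * math.log(2 * x + 3 * y) + 2 * x * math.exp(y) / (2 * x + 3 * y)

def f_y(x, y):
    """Partial derivative with respect to y, with x held fixed."""
    return x * math.exp(y) * math.log(2 * x + 3 * y) + 3 * x * math.exp(y) / (2 * x + 3 * y)

x0, y0, h = 1.0, 0.5, 1e-6
# Central differences: vary one argument and keep the other fixed.
num_fx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
num_fy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)

print(f_x(x0, y0), num_fx)
print(f_y(x0, y0), num_fy)
```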
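Exercise 1.20
(a) If $u = x^2 \sin(\ln y)$ compute $u_x$ and $u_y$.
(b) If $r^2 = x^2 + y^2$ show that $\dfrac{\partial r}{\partial x} = \dfrac{x}{r}$ and $\dfrac{\partial r}{\partial y} = \dfrac{y}{r}$.
The partial derivatives are also functions of x and y, so may be differentiated again.
Thus we have
$$\frac{\partial}{\partial x}\left( \frac{\partial f}{\partial x} \right) = \frac{\partial^2 f}{\partial x^2} = f_{xx}(x, y) \quad\text{and}\quad \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial y} \right) = \frac{\partial^2 f}{\partial y^2} = f_{yy}(x, y). \tag{1.15}$$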
Using the suffix notation the mixed derivative rule is fxy = fyx . A sufficient condi-
tion for this to hold is that both fxy and fyx are continuous functions of (x, y), see
equation 1.6 (page 18).
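Similarly, differentiating p times with respect to x and q times with respect to y, in
any order, gives the same nth order derivative,
$$\frac{\partial^n f}{\partial x^p\, \partial y^q} \quad\text{where } n = p + q,$$
provided all the nth derivatives are continuous.
Exercise 1.21
If $\Phi(x, y) = \exp(-x^2/y)$ show that $\Phi$ satisfies the equations
$$\frac{\partial \Phi}{\partial x} = -\frac{2x}{y}\,\Phi \quad\text{and}\quad \frac{\partial^2 \Phi}{\partial x^2} = 4\frac{\partial \Phi}{\partial y} - \frac{2}{y}\,\Phi.$$
Exercise 1.22
Show that $u = x^2 \sin(\ln y)$ satisfies the equation $2y^2 \dfrac{\partial^2 u}{\partial y^2} + 2y \dfrac{\partial u}{\partial y} + x \dfrac{\partial u}{\partial x} = 0$.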
so $F'(t)$ is the rate of change of f(x) along C. Normally, we write f(t) rather than
use a different symbol F(t), and the left-hand side of the above equation is written $\dfrac{df}{dt}$.
This derivative is named the total derivative of f. The proof of this when n = 2 and
$x'$ and $y'$ do not vanish near (x, y) is sketched below; the generalisation to larger n is
straightforward. If F(t) = f(x(t), y(t)) then
$$F(t + \delta) = f\left( x(t + \delta), y(t + \delta) \right) = f(x + \delta x', y + \delta y'),$$
where we have used the mean value theorem, equation 1.11, so that $x'$ and $y'$ are
evaluated at points between t and $t + \delta$. Write the right-hand side
in the form
$$f(x + \delta x', y + \delta y') = \left[ f(x + \delta x', y + \delta y') - f(x, y + \delta y') \right] + \left[ f(x, y + \delta y') - f(x, y) \right] + f(t)$$
so that
$$\frac{F(t + \delta) - F(t)}{\delta} = \frac{f(x + \delta x', y + \delta y') - f(x, y + \delta y')}{\delta x'}\, x' + \frac{f(x, y + \delta y') - f(x, y)}{\delta y'}\, y'.$$
Thus, on taking the limit as $\delta \to 0$ we have
$$\frac{dF}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}.$$
This result remains true if either or both $x' = 0$ or $y' = 0$, but then more care is needed
with the proof.
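The total derivative formula for n = 2 can be tested on a simple example. In the sketch below the function $f(x, y) = x^2 y$ and the curve $x = \cos t$, $y = \sin t$ are chosen by us purely for illustration, as are the names `F` and `total_derivative`; the formula is compared with a difference quotient of F(t) = f(x(t), y(t)).

```python
import math

def f(x, y):
    return x**2 * y

def F(t):
    """F(t) = f(x(t), y(t)) along the curve x = cos t, y = sin t."""
    return f(math.cos(t), math.sin(t))

def total_derivative(t):
    """dF/dt = f_x dx/dt + f_y dy/dt evaluated on the curve."""
    x, y = math.cos(t), math.sin(t)
    dx_dt, dy_dt = -math.sin(t), math.cos(t)
    f_x, f_y = 2 * x * y, x**2
    return f_x * dx_dt + f_y * dy_dt

t0, h = 0.7, 1e-6
numerical = (F(t0 + h) - F(t0 - h)) / (2 * h)
print(total_derivative(t0), numerical)
```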
Equation 1.20 is used in chapter 3 to derive one of the most important results in
the course: if the dependence of x upon t is linear and F(t) has the form
$$F(t) = f(x + th) = f(x_1 + th_1, x_2 + th_2, \ldots, x_n + th_n),$$
where the vector h is constant and the variable $x_k$ has been replaced by $x_k + th_k$, for
all k. Since $\dfrac{d}{dt}(x_k + th_k) = h_k$, equation 1.20 becomes
$$\frac{dF}{dt} = \sum_{k=1}^{n} \frac{\partial f}{\partial x_k}\, h_k. \tag{1.21}$$
This result will also be used in section 1.3.9 to derive the Taylor series for several
variables.
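Equation 1.21 can be illustrated with a function of three variables. In the following Python sketch the function $f(x) = x_1^2 + x_1 x_2 x_3$, the vectors x and h and all of the names are hypothetical choices made for the purpose of the check; the partial derivatives are evaluated at the shifted point x + th.

```python
def f(x):
    x1, x2, x3 = x
    return x1**2 + x1 * x2 * x3

def grad_f(x):
    """Partial derivatives of f with respect to x1, x2 and x3."""
    x1, x2, x3 = x
    return (2 * x1 + x2 * x3, x1 * x3, x1 * x2)

def F(t, x, h):
    """F(t) = f(x + t h) for a constant vector h."""
    return f([xk + t * hk for xk, hk in zip(x, h)])

def dF_dt(t, x, h):
    """Right-hand side of equation 1.21, evaluated at the point x + t h."""
    shifted = [xk + t * hk for xk, hk in zip(x, h)]
    return sum(g * hk for g, hk in zip(grad_f(shifted), h))

x, h, t0, step = (1.0, 2.0, -1.0), (0.5, -1.0, 2.0), 0.3, 1e-5
numerical = (F(t0 + step, x, h) - F(t0 - step, x, h)) / (2 * step)
print(dF_dt(t0, x, h), numerical)
```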
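A variant of equation 1.19, which frequently occurs in the Calculus of Variations, is
the case where f(x) depends explicitly upon the variable t, so this equation becomes
$$\frac{dF}{dt} = \frac{\partial f}{\partial t} + \sum_{k=1}^{n} \frac{\partial f}{\partial x_k}\frac{dx_k}{dt}. \tag{1.22}$$
so
$$F(t) = f\left( t, e^t, e^{2t} \right) = e^t \sin\left( t e^{2t} \right).$$
The right-hand sides of equations 1.20 and 1.22 depend upon both x and t, but
because x depends upon t often these expressions are written in terms of t only. In the
Calculus of Variations this is usually not helpful because the dependence upon both x and
t, separately, is important: for instance we often require expressions like
$$\frac{d}{dt}\left( \frac{\partial F}{\partial x_1} \right) \quad\text{and}\quad \frac{\partial}{\partial x_1}\left( \frac{dF}{dt} \right).$$
The second of these expressions requires some clarification because dF/dt contains the
derivatives $x'_k$. Thus
$$\frac{\partial}{\partial x_1}\left( \frac{dF}{dt} \right) = \frac{\partial}{\partial x_1}\left( \frac{\partial f}{\partial t} + \sum_{k=1}^{n} \frac{\partial f}{\partial x_k}\frac{dx_k}{dt} \right).$$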
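Exercise 1.23
If $f(t, x, y) = xy - ty^2$ and $x = t^2$, $y = t^3$ show that
$$\frac{df}{dt} = -y^2 + y\frac{dx}{dt} + (x - 2ty)\frac{dy}{dt} = t^4(5 - 7t^2),$$
and that
$$\frac{\partial}{\partial y}\left( \frac{df}{dt} \right) = \frac{dx}{dt} - 2y - 2t\frac{dy}{dt} = 2t\left( 1 - 4t^2 \right),$$
$$\frac{d}{dt}\left( \frac{\partial f}{\partial y} \right) = \frac{d}{dt}(x - 2ty) = \frac{dx}{dt} - 2y - 2t\frac{dy}{dt} = 2t\left( 1 - 4t^2 \right).$$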
Exercise 1.24
If $F = \sqrt{1 + x_1 x_2}$, and $x_1$ and $x_2$ are functions of t, show by direct calculation
of each expression that
$$\frac{\partial}{\partial x_1}\left( \frac{dF}{dt} \right) = \frac{d}{dt}\left( \frac{\partial F}{\partial x_1} \right) = \frac{x_2'}{2\sqrt{1 + x_1 x_2}} - \frac{x_2 (x_1' x_2 + x_1 x_2')}{4(1 + x_1 x_2)^{3/2}}.$$
Exercise 1.25
Euler's formula for homogeneous functions
(a) A function f(x, y) is said to be homogeneous with degree p in x and y if it has
the property $f(\lambda x, \lambda y) = \lambda^p f(x, y)$, for any constant $\lambda$ and real number p. For
such a function prove Euler's formula:
$$p f(x, y) = x f_x(x, y) + y f_y(x, y).$$
Hint: use the total derivative formula 1.20 and differentiate with respect to $\lambda$.
(b) Find the equivalent result for homogeneous functions of n variables that satisfy
$f(\lambda x) = \lambda^p f(x)$.
(c) Show that if $f(x_1, x_2, \cdots, x_n)$ is a homogeneous function of degree p, then
each of the partial derivatives, $\partial f/\partial x_k$, $k = 1, 2, \cdots, n$, is a homogeneous function
of degree $p - 1$.
Figure 1.5 Diagram showing a typical curve defined
by an equation of the form f(x, y) = 0.
For some values of x the equation f(x, y) = 0 can be solved to yield one or more real
values of y, which will give one or more functions of x. For instance the equation
$x^2 + y^2 - 1 = 0$ defines a circle in the plane and for each x in |x| < 1 there are two
values of y, giving the two functions $y(x) = \pm\sqrt{1 - x^2}$. A more complicated example
is the equation $x - y + \sin(xy) = 0$, which cannot be rearranged to express one variable
in terms of the other.
Consider the smooth curve sketched in figure 1.5. On a segment in which the curve
is not parallel to the y-axis the equation f (x, y) = 0 defines a function y(x). Such a
function is said to be defined implicitly. The same equation will also define x(y), that
is x as a function of y, provided the segment does not contain a point where the curve
is parallel to the x-axis. This result, inferred from the picture, is a simple example of
the implicit function theorem stated below.
Implicitly defined functions are important because they occur frequently as solutions
of differential equations, see exercise 1.29, but there are few, if any, general rules that
help understand them. It is, however, possible to obtain relatively simple expressions
for the first derivatives, $y'(x)$ and $x'(y)$.
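We assume that $y(x)$ exists and is differentiable, as seems reasonable from figure 1.5, so $F(x) = f(x, y(x))$ is a function of $x$ only and we may use the chain rule 1.22 to differentiate with respect to $x$. This gives
\[ \frac{dF}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}. \]
On the curve defined by $f(x, y) = 0$, $F'(x) = 0$ and hence
\[ \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} = 0 \quad\text{or}\quad \frac{dy}{dx} = -\frac{f_x}{f_y}. \tag{1.23} \]
Similarly, if $x(y)$ exists and is differentiable, the same analysis with $y$ as the independent variable gives
\[ \frac{\partial f}{\partial x}\frac{dx}{dy} + \frac{\partial f}{\partial y} = 0 \quad\text{or}\quad \frac{dx}{dy} = -\frac{f_y}{f_x}. \tag{1.24} \]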
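As a numerical illustration of equation 1.23, the following minimal sketch estimates $-f_x/f_y$ by central differences for the circle $x^2 + y^2 - 1 = 0$ and compares it with the derivative of the explicit branch $y = \sqrt{1 - x^2}$. The function names, the step size `eps` and the sample point are illustrative assumptions, not part of the text.

```python
import math

def implicit_slope(f, x, y, eps=1e-6):
    """Estimate dy/dx = -f_x / f_y (equation 1.23) using central differences."""
    f_x = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    f_y = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
    if f_y == 0.0:
        raise ZeroDivisionError("f_y vanishes: the curve is parallel to the y-axis here")
    return -f_x / f_y

def circle(x, y):
    """Left-hand side of x^2 + y^2 - 1 = 0."""
    return x * x + y * y - 1.0

x0 = 0.6
y0 = math.sqrt(1.0 - x0 * x0)                    # point on the upper branch
slope_implicit = implicit_slope(circle, x0, y0)  # from equation 1.23
slope_explicit = -x0 / math.sqrt(1.0 - x0 * x0)  # derivative of sqrt(1 - x^2)
print(slope_implicit, slope_explicit)
```

The two slopes should agree to within the finite-difference error; the sketch fails deliberately where $f_y = 0$, which is where the curve is parallel to the $y$-axis.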
This result is encapsulated in the Implicit Function Theorem which gives sufficient
conditions for an equation of the form f (x, y) = 0 to have a solution y(x) satisfying
f (x, y(x)) = 0. A restricted version of it is given here.
Theorem 1.3
Implicit Function Theorem: Suppose that $f : U \to \mathbb{R}$ is a function with continuous partial derivatives defined in an open set $U \subseteq \mathbb{R}^2$. If there is a point $(a, b) \in U$ for which $f(a, b) = 0$ and $f_y(a, b) \ne 0$, then there are open intervals $I = (x_1, x_2)$ and $J = (y_1, y_2)$ such that $(a, b)$ lies in the rectangle $I \times J$ and for every $x \in I$, $f(x, y) = 0$ determines exactly one value $y(x) \in J$ for which $f(x, y(x)) = 0$. The function $y : I \to J$ is continuous and differentiable, with the derivative given by equation 1.23.
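The theorem guarantees that $y(x)$ exists even when no formula for it can be written down. A rough sketch of how such a value might be computed for the earlier example $x - y + \sin(xy) = 0$ is given below: a Newton iteration in $y$ at fixed $x$, which relies on $f_y \ne 0$. The starting guess, tolerance and iteration limit are assumptions chosen for illustration.

```python
import math

def f(x, y):
    """Left-hand side of x - y + sin(xy) = 0."""
    return x - y + math.sin(x * y)

def f_y(x, y):
    """Partial derivative of f with respect to y."""
    return -1.0 + x * math.cos(x * y)

def solve_y(x, y_start, tol=1e-12, max_iter=50):
    """Newton iteration in y at fixed x; needs f_y != 0 near the root."""
    y = y_start
    for _ in range(max_iter):
        step = f(x, y) / f_y(x, y)
        y -= step
        if abs(step) < tol:
            return y
    raise RuntimeError("Newton iteration did not converge")

y_half = solve_y(0.5, y_start=0.5)   # value of the implicit function at x = 0.5
print(y_half, f(0.5, y_half))
```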
Exercise 1.26
In the case $f(x, y) = y - g(x)$ show that equations 1.23 and 1.24 lead to the relation
\[ \frac{dx}{dy} = \left(\frac{dy}{dx}\right)^{-1}. \]
Exercise 1.27
If $\ln(x^2 + y^2) = 2\tan^{-1}(y/x)$ find $y'(x)$.
Exercise 1.28
If $x - y + \sin(xy) = 0$ determine the values of $y'(0)$ and $y''(0)$.
Exercise 1.29
Show that the differential equation
\[ \frac{dy}{dx} = \frac{y - a^2 x}{y + x}, \quad y(1) = A > 0, \]
Hint: the equation may be put in separable form by defining a new dependent variable $v = y/x$.
The implicit function theorem can be generalised to deal with the set of functions
\[ f_k(\mathbf{x}, t) = 0, \quad k = 1, 2, \dots, n, \tag{1.25} \]
is not zero. Furthermore all the functions $g_k(t)$ have continuous first derivatives. The determinant $J$ is named the Jacobian determinant or, more usually, the Jacobian. It is often helpful to use either of the following notations for the Jacobian,
\[ J = \frac{\partial \mathbf{f}}{\partial \mathbf{x}} \quad\text{or}\quad J = \frac{\partial(f_1, f_2, \dots, f_n)}{\partial(x_1, x_2, \dots, x_n)}. \tag{1.27} \]
Exercise 1.30
Show that the equations $x = r\cos\theta$, $y = r\sin\theta$ can be inverted to give functions $r(x, y)$ and $\theta(x, y)$ in every open set of the plane that does not include the origin.
\[ R_{n+1} = \frac{(x - a)^{n+1}}{(n + 1)!} f^{(n+1)}(a + \theta h) \quad\text{for some } 0 < \theta < 1 \text{ and } h = x - a. \tag{1.29} \]
If all derivatives of $f(x)$ are continuous for $x_1 \le x \le x_2$, and if the remainder term $R_n \to 0$ as $n \to \infty$ in a suitable manner we may take the limit to obtain the infinite series
\[ f(x) = \sum_{k=0}^{\infty} \frac{(x - a)^k}{k!} f^{(k)}(a). \tag{1.30} \]
The infinite series 1.30 is known as Taylor's series, and the point $x = a$ the point of expansion. A similar series exists when $x$ takes complex values.
Care is needed when taking the limit of 1.28 as $n \to \infty$, because there are cases when the infinite series on the right-hand side of equation 1.30 does not equal $f(x)$. If, however, the Taylor series converges to $f(x)$ at $x = \xi$ then for any $x$ closer to $a$ than $\xi$, that is $|x - a| < |\xi - a|$, the series converges to $f(x)$. This caveat is necessary because of the strange example $g(x) = \exp(-1/x^2)$ for which all derivatives are continuous and are zero at $x = 0$; for this function the Taylor series about $x = 0$ can be shown to exist, but for all $x$ it converges to zero rather than $g(x)$. This means that for any well behaved function, $f(x)$ say, with a Taylor series that converges to $f(x)$ a different function, $f(x) + g(x)$ can be formed whose Taylor series converges, but to $f(x)$ not $f(x) + g(x)$. This strange behaviour is not uncommon in functions arising from physical problems; however, it is ignored in this course and we shall assume that the Taylor series derived from a function converges to it in some interval.
The series 1.30 was first published by Brook Taylor (1685-1731) in 1715: the result obtained by putting $a = 0$ was discovered by Stirling (1692-1770) in 1717 but first published by Maclaurin (1698-1746) in 1742. With $a = 0$ this series is therefore often known as Maclaurin's series.
In practice, of course, it is usually impossible to sum the infinite series 1.30, so it is
necessary to truncate it at some convenient point and this requires knowledge of how,
or indeed whether, the series converges to the required value. Truncation gives rise to
the Taylor polynomials, with the order-n polynomial given by
\[ f(x) = \sum_{k=0}^{n} \frac{(x - a)^k}{k!} f^{(k)}(a). \tag{1.31} \]
The series 1.30 is an infinite series of the functions $(x - a)^n f^{(n)}(a)/n!$ and summing these requires care. A proper understanding of this process requires careful definitions of convergence which may be found in any text book on analysis. For our purposes, however, it is sufficient to note that in most cases there is a real number, $r_c$, named the radius of convergence, such that if $|x - a| < r_c$ the infinite series is well mannered and behaves rather like a finite sum: the value of $r_c$ can be infinite, in which case the series converges for all $x$.
If the Taylor series of $f(x)$ and $g(x)$ have radii of convergence $r_f$ and $r_g$ respectively, then the Taylor series of $\alpha f(x) + \beta g(x)$, for constants $\alpha$ and $\beta$, and of $f(x)^a g(x)^b$, for positive constants $a$ and $b$, exist and have the radius of convergence $\min(r_f, r_g)$. The Taylor series of the compositions $f(g(x))$ and $g(f(x))$ may also exist, but their radii of convergence depend upon the behaviour of $g$ and $f$ respectively. Also Taylor series may be integrated and differentiated to give the Taylor series of the integral and derivative of the original function, and with the same radius of convergence.
Formally, the nth Taylor polynomial of a function is formed from its first $n$ derivatives at the point of expansion. In practice, however, the calculation of high-order derivatives is very awkward and it is often easier to proceed by other means, which rely upon ingenuity. A simple example is the Taylor series of $\ln(1 + \tanh x)$, to fourth order; this is most easily obtained using the known Taylor expansions of $\ln(1 + z)$ and $\tanh x$,
\[ \ln(1 + z) = z - \frac{z^2}{2} + \frac{z^3}{3} - \frac{z^4}{4} + O(z^5) \quad\text{and}\quad \tanh x = x - \frac{x^3}{3} + \frac{2x^5}{15} + O(x^7), \]
and then put $z = \tanh x$ retaining only the appropriate order of the series expansion. Thus
\[ \ln(1 + \tanh x) = \left[x - \frac{x^3}{3} + O(x^5)\right] - \frac{x^2}{2}\left(1 - \frac{x^2}{3}\right)^2 + \frac{x^3}{3} - \frac{x^4}{4} + O(x^5) = x - \frac{x^2}{2} + \frac{x^4}{12} + O(x^5). \]
This method is far easier than computing the four required derivatives of the original
function.
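The substitution of one truncated series into another can be mechanised. The sketch below does this with exact rational arithmetic for $\ln(1 + \tanh x)$ to fourth order; the helper names `mul` and `compose` are hypothetical, and the routine assumes the inner series has no constant term.

```python
from fractions import Fraction

def mul(p, q, order):
    """Product of two coefficient lists, discarding powers above x**order."""
    r = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:
                r[i + j] += a * b
    return r

def compose(outer, inner, order):
    """Coefficients of outer(inner(x)) up to x**order; requires inner[0] == 0."""
    result = [Fraction(0)] * (order + 1)
    power = [Fraction(1)] + [Fraction(0)] * order   # inner**0
    for c in outer:
        for k in range(order + 1):
            result[k] += c * power[k]
        power = mul(power, inner, order)            # next power of inner
    return result

ORDER = 4
# ln(1 + z) = z - z^2/2 + z^3/3 - z^4/4 + ...
log1p_coeffs = [Fraction(0), Fraction(1), Fraction(-1, 2), Fraction(1, 3), Fraction(-1, 4)]
# tanh x = x - x^3/3 + ... (terms beyond x^4 are not needed)
tanh_coeffs = [Fraction(0), Fraction(1), Fraction(0), Fraction(-1, 3), Fraction(0)]

series = compose(log1p_coeffs, tanh_coeffs, ORDER)
print(series)   # coefficients of 1, x, x^2, x^3, x^4
```

The coefficients printed are those of $x - x^2/2 + x^4/12$, in agreement with the hand calculation.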
For $|x - a| > r_c$ the infinite sum 1.30 does not exist. It follows that knowledge of $r_c$ is important. It can be shown that, in most cases of practical interest, its value is given by either of the limits
\[ r_c = \lim_{n\to\infty}\left|\frac{a_n}{a_{n+1}}\right| \quad\text{or}\quad r_c = \lim_{n\to\infty}|a_n|^{-1/n} \quad\text{where } a_k = \frac{f^{(k)}(a)}{k!}. \tag{1.32} \]
Usually the first expression is most useful. Typically, we have, for large $n$,
\[ \left|\frac{n!}{f^{(n)}(a)}\right|^{1/n} = r_c\bigl(1 + O(1/n)\bigr) \quad\text{so that}\quad \left|\frac{n!}{f^{(n)}(a)}\right| = A r_c^n\bigl(1 + O(1/n)\bigr) \]
for some constant $A$. Then the nth term of the series behaves as $((x - a)/r_c)^n$, and decreases rapidly with increasing $n$ provided $|x - a| < r_c$ and $n$ is sufficiently large.
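To see the two limits in equation 1.32 at work, the following sketch evaluates both estimates for the series of $\ln(1 + x)$ about $x = 0$, whose coefficients are $a_n = (-1)^{n-1}/n$ and whose radius of convergence is 1. The choice of example and of $n = 1000$ is an assumption made for illustration.

```python
def ratio_estimate(coeff, n):
    """|a_n / a_(n+1)|, the first expression in equation 1.32."""
    return abs(coeff(n) / coeff(n + 1))

def root_estimate(coeff, n):
    """|a_n|**(-1/n), the second expression in equation 1.32."""
    return abs(coeff(n)) ** (-1.0 / n)

def log_coeff(n):
    """Taylor coefficient of x**n in ln(1 + x), n >= 1."""
    return (-1) ** (n - 1) / n

n = 1000
ratio = ratio_estimate(log_coeff, n)
root = root_estimate(log_coeff, n)
print(ratio, root)
```

In this example both estimates approach 1, with the ratio form the closer of the two at the same $n$.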
Superficially, the Taylor series appears to be a useful representation and a good approximation. In general this is not true unless $|x - a|$ is small; for practical applications far more efficient approximations exist, that is, they achieve the same accuracy for far less work. The basic problem is that the Taylor expansion uses knowledge of the function at one point only, and the larger $|x - a|$ the more terms are required for a given accuracy. More sensible approximations, on a given interval, take into account information from the whole interval: we describe some approximations of this type in chapter 12.
The first practical problem is that the remainder term, equation 1.29, depends upon $\theta$, the value of which is unknown. Hence $R_n$ cannot be computed; also, it is normally difficult to estimate.
In order to understand how these series converge we need to consider the magnitude of the nth term in the Taylor series: this type of analysis is important for any numerical evaluation of power series. The nth term is a product of $(x - a)^n/n!$ and $f^{(n)}(a)$. Using Stirling's approximation,
\[ n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^n\bigl(1 + O(1/n)\bigr) \tag{1.33} \]
we can approximate the first part of this product by
\[ \left|\frac{(x - a)^n}{n!}\right| \simeq \frac{1}{\sqrt{2\pi n}}\left(\frac{e|x - a|}{n}\right)^n = g_n. \tag{1.34} \]
The expression $g_n$ decreases very rapidly with increasing $n$, provided $n$ is large enough. Hence the term $|x - a|^n/n!$ may be made as small as we please. But for practical applications this is not sufficient; in figure 1.6 we plot a graph of the values of $\log(g_n)$, that is the logarithm to the base 10, for $x - a = 10$.
Figure 1.6 Graph showing the value of $\log(g_n)$, equation 1.34, for $x - a = 10$. For clarity we have joined the points with a continuous line.
In this example the maximum of $g_n$ is at $n = 10$ and has a value of about 2800, before it starts to decrease. It is fairly simple to show that $g_n$ has a maximum at $n \simeq |x - a|$ and here its value is $\max(g_n) \simeq \exp(|x - a|)/\sqrt{2\pi|x - a|}$.
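The size of these terms is easy to tabulate. The sketch below compares the exact value of $|x - a|^n/n!$ with the Stirling estimate $g_n$ of equation 1.34 for $|x - a| = 10$; the function names are illustrative assumptions.

```python
import math

def stirling_term(n, d):
    """g_n of equation 1.34: Stirling estimate of d**n / n!, with d = |x - a|."""
    return (math.e * d / n) ** n / math.sqrt(2.0 * math.pi * n)

def exact_term(n, d):
    """Exact value of d**n / n!."""
    return d ** n / math.factorial(n)

d = 10.0
table = [(n, exact_term(n, d), stirling_term(n, d)) for n in range(1, 21)]
largest_exact = max(exact for _, exact, _ in table)
ratio_at_10 = stirling_term(10, d) / exact_term(10, d)

for n, exact, approx in table:
    print(n, round(exact, 2), round(approx, 2))
```

The exact terms peak near $n = 10$ at roughly 2756, and the Stirling estimate overshoots the exact value by a little under one percent there, consistent with the $O(1/n)$ correction in equation 1.33.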
The value of $f^{(n)}(a)$ is also difficult to estimate, but it usually increases rapidly with $n$. Bizarrely, in many cases of interest, this behaviour depends upon the behaviour of $f(z)$, where $z$ is a complex variable. An understanding of this requires a study of Complex Variable Theory, which is beyond the scope of this chapter. Instead we illustrate the behaviour of Taylor polynomials with a simple example.
First consider the Taylor series of $\sin x$, about $x = 0$,
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots + (-1)^{n-1}\frac{x^{2n-1}}{(2n - 1)!} + \cdots, \tag{1.35} \]
Figure 1.7 Graph comparing the Taylor polynomials, of order $n$, for the sine function with the exact function, the dashed line. The curves shown are for $n = 1$, 4, 8 and 15.
These graphs show that for large $x$ it is necessary to include many terms in the series to obtain an accurate representation of $\sin x$. The reason is simply that for fixed, large $x$, $x^{2n-1}/(2n - 1)!$ is very large when $2n - 1 \simeq x$, as shown in figure 1.6. Because the terms of this series alternate in sign the large terms in the early part of the series partially cancel and cause problems when approximating a function that is $O(1)$: it is worth noting that as a consequence, with a computer having finite accuracy there is a value of $x$ beyond which the Taylor series for $\sin x$ gives incorrect values, despite the fact that formally it converges for all $x$.
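This loss of accuracy can be observed directly. The sketch below sums series 1.35 term by term in ordinary double-precision arithmetic and compares the result with the library sine; the number of terms and the sample points $x = 1$ and $x = 50$ are assumptions chosen for illustration.

```python
import math

def taylor_sin(x, terms=200):
    """Sum the first `terms` terms of series 1.35 in double precision."""
    term = x
    total = x
    for k in range(1, terms):
        # Each term is the previous one times -x^2 / ((2k)(2k + 1)).
        term *= -x * x / ((2 * k) * (2 * k + 1))
        total += term
    return total

error_small = abs(taylor_sin(1.0) - math.sin(1.0))
error_large = abs(taylor_sin(50.0) - math.sin(50.0))
print(error_small, error_large)
```

For $x = 1$ the sum agrees with the library value to rounding error. For $x = 50$ the largest terms are of order $10^{20}$, so rounding errors swamp a result that should lie between $-1$ and 1, even though the series converges formally.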
Exercise 1.31
Exponential and Trigonometric functions
If $f(x) = \exp(ix)$ show that $f^{(n)}(x) = i^n\exp(ix)$ and hence that its Taylor series is
\[ e^{ix} = \sum_{k=0}^{\infty}\frac{(ix)^k}{k!}. \]
Show that the radius of convergence of this series is infinite. Deduce that
\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots + \frac{(-1)^n x^{2n}}{(2n)!} + \cdots, \]
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots + \frac{(-1)^n x^{2n+1}}{(2n + 1)!} + \cdots. \]
Exercise 1.32
Binomial expansion
Show that the Taylor series of $(1 + x)^a$ is
\[ (1 + x)^a = 1 + ax + \frac{1}{2}a(a - 1)x^2 + \cdots + \frac{a(a - 1)(a - 2)\cdots(a - k + 1)}{k!}x^k + \cdots. \]
Exercise 1.33
If $f(x) = \tan x$ find the first three derivatives to show that $\tan x = x + \frac{1}{3}x^3 + O(x^5)$.
Exercise 1.34
The natural logarithm
(a) Show that $\dfrac{1}{1 + t} = 1 - t + t^2 + \cdots + (-1)^n t^n + \cdots$ and use the definition of the natural logarithm, $\ln(1 + x) = \displaystyle\int_0^x dt\,\frac{1}{1 + t}$, to show that
\[ \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{(-1)^{n-1}x^n}{n} + \cdots. \]
Exercise 1.35
The inverse tangent function
Use the definition $\tan^{-1}x = \displaystyle\int_0^x dt\,\frac{1}{1 + t^2}$ to show that for $|x| < 1$,
\[ \tan^{-1}x = \sum_{k=0}^{\infty}\frac{(-1)^k x^{2k+1}}{2k + 1}. \]
Exercise 1.36
Show that $\ln(1 + \sinh x) = x - \dfrac{x^2}{2} + \dfrac{x^3}{2} - \dfrac{5x^4}{12} + O(x^5)$.
Exercise 1.37
Obtain the first five terms of the Taylor series of the function that satisfies the equation
\[ (1 + x)\frac{dy}{dx} = 1 + xy + y^2, \quad y(0) = 0. \]
Hint: use Leibniz's rule given in exercise 1.15 (page 23) to differentiate the equation $n$ times.
\[ F(t) = F(0) + tF'(0) + \frac{t^2}{2!}F''(0) + \cdots + \frac{t^n}{n!}F^{(n)}(0) + R_{n+1}, \tag{1.36} \]
which we assume to exist for $|t| \le 1$. Now we need only express the derivatives $F^{(n)}(0)$ in terms of the partial derivatives of $f(\mathbf{x})$. Equation 1.21 (page 28) gives
\[ F'(0) = \sum_{k=1}^{m} f_{x_k}(\mathbf{a})\,h_k. \]
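where $R_2$ is the remainder term which is second-order in $\mathbf{h}$ and is given below. Here we have introduced the notation $\partial f/\partial\mathbf{x}$ for the vector function,
\[ \frac{\partial f}{\partial\mathbf{x}} = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_m}\right) \quad\text{with the scalar product}\quad \mathbf{h}\cdot\frac{\partial f}{\partial\mathbf{x}} = \sum_{k=1}^{m} h_k\frac{\partial f}{\partial x_k}. \]
For the second derivative we use equation 1.21 (page 28) again,
\[ F''(t) = \sum_{k=1}^{m} h_k\frac{d}{dt}f_{x_k}(\mathbf{a} + t\mathbf{h}) = \sum_{k=1}^{m} h_k\left(\sum_{i=1}^{m} h_i f_{x_k x_i}(\mathbf{a} + t\mathbf{h})\right). \]
where the second relation comprises fewer terms because the mixed derivative rule has been used. This gives the second-order Taylor series,
\[ f(\mathbf{a} + \mathbf{h}) = f(\mathbf{a}) + \sum_{k=1}^{m} h_k f_{x_k}(\mathbf{a}) + \frac{1}{2!}\sum_{k=1}^{m} h_k\left(\sum_{i=1}^{m} h_i f_{x_k x_i}(\mathbf{a})\right) + R_3, \tag{1.39} \]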
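The higher-order terms are derived in exactly the same manner, but the algebra quickly becomes cumbersome. It helps, however, to use the linear differential operator $\mathbf{h}\cdot\partial/\partial\mathbf{a}$ to write the derivatives of $F(t)$ at $t = 0$ in the more convenient form,
\[ F'(0) = \left(\mathbf{h}\cdot\frac{\partial}{\partial\mathbf{a}}\right)f(\mathbf{a}), \quad F''(0) = \left(\mathbf{h}\cdot\frac{\partial}{\partial\mathbf{a}}\right)^2 f(\mathbf{a}) \quad\text{and}\quad F^{(n)}(0) = \left(\mathbf{h}\cdot\frac{\partial}{\partial\mathbf{a}}\right)^n f(\mathbf{a}). \tag{1.40} \]
Then we can write Taylor series in the form
\[ f(\mathbf{a} + \mathbf{h}) = f(\mathbf{a}) + \sum_{s=1}^{n}\frac{1}{s!}\left(\mathbf{h}\cdot\frac{\partial}{\partial\mathbf{a}}\right)^s f(\mathbf{a}) + R_{n+1} \tag{1.41} \]
where the remainder term is
\[ R_{n+1} = \frac{1}{(n + 1)!}F^{(n+1)}(\theta) \quad\text{for some } 0 < \theta < 1. \]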
Because the high order derivatives are so cumbersome and for the practical reasons discussed in section 1.3.8, in particular figure 1.7 (page 36), Taylor series for many variables are rarely used beyond the second-order term. This term, however, is important for the classification of stationary points, considered in chapter 7.
For functions of two variables, $(x, y)$, the Taylor series is
\begin{align*}
f(a + h, b + k) = {}& f(a, b) + hf_x + kf_y + \frac{1}{2}\left(h^2 f_{xx} + 2hkf_{xy} + k^2 f_{yy}\right) \\
&+ \frac{1}{6}\left(h^3 f_{xxx} + 3h^2 kf_{xxy} + 3hk^2 f_{xyy} + k^3 f_{yyy}\right) + \cdots \\
&+ \sum_{r=0}^{s}\frac{h^{s-r}k^r}{(s - r)!\,r!}\frac{\partial^s f}{\partial x^{s-r}\partial y^r} + \cdots + R_{n+1}, \tag{1.42}
\end{align*}
where all derivatives are evaluated at $(a, b)$. In this case the sth term is relatively easy to obtain by expanding the differential operator $(h\,\partial/\partial x + k\,\partial/\partial y)^s$ using the binomial expansion (which works because the mixed derivative rule means that the two operators $\partial/\partial x$ and $\partial/\partial y$ commute).
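A minimal numerical sketch of the second-order formula 1.39 follows. The function $f(x, y) = e^x\cos y$, its derivatives at the origin and the displacement $\mathbf{h}$ are assumptions chosen for illustration; the routine itself works for any number of variables.

```python
import math

def taylor2(f0, grad, hess, h):
    """Second-order Taylor estimate of f(a + h): equation 1.39 without R_3."""
    m = len(h)
    first = sum(h[k] * grad[k] for k in range(m))
    second = sum(h[k] * h[i] * hess[k][i] for k in range(m) for i in range(m))
    return f0 + first + 0.5 * second

# Illustrative function f(x, y) = exp(x) cos(y), expanded about (a, b) = (0, 0).
f0 = 1.0
grad = [1.0, 0.0]                   # (f_x, f_y) at the origin
hess = [[1.0, 0.0], [0.0, -1.0]]    # [[f_xx, f_xy], [f_yx, f_yy]] at the origin

h = [0.01, 0.01]
approx = taylor2(f0, grad, hess, h)
exact = math.exp(h[0]) * math.cos(h[1])
print(approx, exact, approx - exact)
```

The difference between the two values is of third order in the displacement, as expected from the remainder $R_3$.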
Exercise 1.38
Find the Taylor expansions about $x = y = 0$, up to and including the second-order terms, of the functions
(a) $f(x, y) = \sin x\sin y$, (b) $f(x, y) = \sin\left(x + e^y - 1\right)$.
Exercise 1.39
Show that the third-order Taylor series for a function, $f(x, y, z)$, of three variables is
\begin{align*}
f(a + h, b + k, c + l) = {}& f(a, b, c) + hf_x + kf_y + lf_z \\
&+ \frac{1}{2!}\left(h^2 f_{xx} + k^2 f_{yy} + l^2 f_{zz} + 2hkf_{xy} + 2klf_{yz} + 2lhf_{zx}\right) \\
&+ \frac{1}{3!}\Bigl(h^3 f_{xxx} + k^3 f_{yyy} + l^3 f_{zzz} + 6hklf_{xyz} + 3hk^2 f_{xyy} + 3hl^2 f_{xzz} \\
&\qquad + 3kh^2 f_{yxx} + 3kl^2 f_{yzz} + 3lh^2 f_{zxx} + 3lk^2 f_{zyy}\Bigr).
\end{align*}
\[ R(x) = \frac{f(x)}{g(x)} \tag{1.43} \]
the value of $R(x)$ is normally computed by dividing the value of $f(x)$ by the value of $g(x)$: this works provided $g(x)$ is not zero at the point in question, $x = a$ say. If $g(x)$ and $f(x)$ are simultaneously zero at $x = a$, the value of $R(a)$ may be redefined as a limit. For instance if
\[ R(x) = \frac{\sin x}{x} \tag{1.44} \]
then the value of $R(0)$ is not defined, though $R(x) \to 1$ as $x \to 0$. Here we show how this limit may be computed using L'Hospital's rule (see footnote 13) and its extensions, discovered by the French mathematician G F A Marquis de l'Hospital (1661-1704).
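The limiting behaviour of the ratio 1.44 can be checked numerically. The sketch below evaluates $\sin x/x$ at a few small values of $x$ and compares it with the ratio of derivatives $\cos x/1$ at $x = 0$; the sample points are assumptions chosen for illustration.

```python
import math

def R(x):
    """The ratio sin(x)/x of equation 1.44, undefined at x = 0."""
    return math.sin(x) / x

samples = [(x, R(x)) for x in (1e-1, 1e-2, 1e-3, 1e-4)]
derivative_ratio = math.cos(0.0) / 1.0   # f'(0)/g'(0) with f = sin x, g = x

for x, value in samples:
    print(x, value)
print(derivative_ratio)
```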
Suppose that at $x = a$, $f(a) = g(a) = 0$ and that each function has a Taylor series about $x = a$, with finite radii of convergence: thus near $x = a$ we have for small, non-zero $|\epsilon|$,
Hence, on taking the limit $\epsilon \to 0$, we obtain the result given by the following theorem.
Theorem 1.5
L'Hospital's rule. Suppose that $f(x)$ and $g(x)$ are real and differentiable for $a < x < b$. If
then
\[ \lim_{x\to a}\frac{f(x)}{g(x)} = \lim_{x\to a}\frac{f'(x)}{g'(x)}, \tag{1.45} \]
More generally if $f^{(k)}(a) = g^{(k)}(a) = 0$, $k = 0, 1, \dots, n - 1$ and $g^{(n)}(a) \ne 0$ then
13 Here we use the spelling of the French national bibliography, as used by LHospital. Some modern
Exercise 1.40
Find the values of the following limits:
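Exercise 1.41
(a) If $f(a) = g(a) = 0$ and $\displaystyle\lim_{x\to a}\frac{f'(x)}{g'(x)} = \infty$ show that $\displaystyle\lim_{x\to a}\frac{f(x)}{g(x)} = \infty$.
(b) If both $f(x)$ and $g(x)$ are positive in a neighbourhood of $x = a$, tend to infinity as $x \to a$ and $\displaystyle\lim_{x\to a}\frac{f'(x)}{g'(x)} = A$ show that $\displaystyle\lim_{x\to a}\frac{f(x)}{g(x)} = A$.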
1.3.11 Integration
The study of integration arose from the need to compute areas and volumes. The
theory of integration was developed independently from the theory of differentiation
and the Fundamental Theorem of Calculus, described in note P I on page 42, relates
these processes. It should be noted, however, that Newton knew of the relation between
gradients and areas and exploited it in his development of the subject.
In this section we provide a very brief outline of the simple theory of integration
and discuss some of the methods used to evaluate integrals. This section is included
for reference purposes; however, although the theory of integration is not central to
the main topic of this course, you should be familiar with its contents. The important
idea, needed in chapter 3, is that of differentiating with respect to a parameter, or
differentiating under the integral sign described in equation 1.52 (page 45).
In this discussion of integration we use an intuitive notion of area and refer the
reader to suitable texts, Apostol (1963), Rudin (1976) or Whittaker and Watson (1965)
for instance, for a rigorous treatment.
If $f(x)$ is a real, continuous function on the interval $a \le x \le b$, it is intuitively clear that the area between the graph and the x-axis can be approximated by the sum of the areas of a set of rectangles as shown by the dashed lines in figure 1.8.
Figure 1.8 Diagram showing how the area under the curve $y = f(x)$ may be approximated by a set of rectangles. The intervals $x_k - x_{k-1}$ need not be the same length.
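The rectangle construction of figure 1.8 can be sketched in a few lines. In the sketch below the height of each rectangle is taken at the right-hand end of its interval, which is an assumption (the figure does not fix this choice), and the integrand $x^2$ on $[0, 1]$ is an illustrative example with exact area $1/3$.

```python
def riemann_sum(f, points):
    """Sum of rectangle areas f(x_k) (x_k - x_{k-1}) over a partition
    a = x_0 < x_1 < ... < x_n = b; the intervals need not be equal."""
    return sum(f(right) * (right - left) for left, right in zip(points, points[1:]))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n equal intervals."""
    return [a + (b - a) * k / n for k in range(n + 1)]

def square(x):
    return x * x

area_coarse = riemann_sum(square, uniform_partition(0.0, 1.0, 10))
area_fine = riemann_sum(square, uniform_partition(0.0, 1.0, 1000))
print(area_coarse, area_fine)
```

As the partition is refined the sum approaches $1/3$.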
In this context the function f (x) is named the integrand, and b and a the upper and
lower integration limits, or just limits. It can be shown that the integral exists for
bounded, piecewise continuous functions and also some unbounded functions.
From this definition the following elementary properties can be derived.
P:I: If $F(x)$ is a differentiable function and $F'(x) = f(x)$ then $F(x) = F(a) + \displaystyle\int_a^x dt\,f(t)$. This is the Fundamental theorem of Calculus and is important because it provides one of the most useful tools for evaluating integrals.
P:II: $\displaystyle\int_a^b dx\,f(x) = -\int_b^a dx\,f(x)$.
P:III: $\displaystyle\int_a^b dx\,f(x) = \int_a^c dx\,f(x) + \int_c^b dx\,f(x)$ provided all integrals exist. Note, it is not necessary that $c$ lies in the interval $(a, b)$.
P:IV: $\displaystyle\int_a^b dx\,\bigl(\alpha f(x) + \beta g(x)\bigr) = \alpha\int_a^b dx\,f(x) + \beta\int_a^b dx\,g(x)$, where $\alpha$ and $\beta$ are real or complex numbers.
P:V: $\displaystyle\left|\int_a^b dx\,f(x)\right| \le \int_a^b dx\,|f(x)|$. This is the analogue of the finite sum inequality $\displaystyle\left|\sum_{k=1}^{n} a_k\right| \le \sum_{k=1}^{n}|a_k|$, where $a_k$, $k = 1, 2, \dots, n$, are a set of complex numbers or functions.
with equality if and only if $g(x) = cf(x)$ for some real constant $c$. This inequality is sometimes named the Cauchy inequality and sometimes the Schwarz inequality. It is the analogue of the finite sum inequality
\[ \left(\sum_{k=1}^{n} a_k b_k\right)^2 \le \left(\sum_{k=1}^{n} a_k^2\right)\left(\sum_{k=1}^{n} b_k^2\right) \]
with equality if and only if $b_k = ca_k$ for all $k$ and some real constant $c$.
P:VII: The Hölder inequality: if $\dfrac{1}{p} + \dfrac{1}{q} = 1$, $p > 1$ and $q > 1$ then
\[ \left|\int_a^b dx\,f(x)g(x)\right| \le \left(\int_a^b dx\,|f(x)|^p\right)^{1/p}\left(\int_a^b dx\,|g(x)|^q\right)^{1/q}, \]
is valid for complex functions $f(x)$ and $g(x)$ with equality if and only if $|f(x)|^p|g(x)|^{-q}$ and $\arg(fg)$ are independent of $x$. It is the analogue of the finite sum inequality
\[ \sum_{k=1}^{n}|a_k b_k| \le \left(\sum_{k=1}^{n}|a_k|^p\right)^{1/p}\left(\sum_{k=1}^{n}|b_k|^q\right)^{1/q}, \quad \frac{1}{p} + \frac{1}{q} = 1, \]
with equality if and only if $|a_n|^p|b_n|^{-q}$ and $\arg(a_n b_n)$ are independent of $n$ (or $a_k = 0$ for all $k$ or $b_k = 0$ for all $k$). If all $a_k$ and $b_k$ are positive and $p = q = 2$ these inequalities reduce to the Cauchy-Schwarz inequalities.
P:VIII: The Minkowski inequality for any $p > 1$ and real functions $f(x)$ and $g(x)$ is
\[ \left(\int_a^b dx\,\bigl|f(x) + g(x)\bigr|^p\right)^{1/p} \le \left(\int_a^b dx\,|f(x)|^p\right)^{1/p} + \left(\int_a^b dx\,|g(x)|^p\right)^{1/p} \]
with equality if and only if $g(x) = cf(x)$, with $c$ a non-negative constant. It is the analogue of the finite sum inequality valid for $a_k, b_k > 0$, for all $k$, and $p > 1$
\[ \left(\sum_{k=1}^{n}(a_k + b_k)^p\right)^{1/p} \le \left(\sum_{k=1}^{n} a_k^p\right)^{1/p} + \left(\sum_{k=1}^{n} b_k^p\right)^{1/p}, \]
with equality if and only if $b_k = ca_k$ for all $k$ and $c$ a non-negative constant.
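The finite sum forms of the Hölder and Minkowski inequalities are easy to test numerically. The sketch below does so for one pair of positive sequences; the sequences and the exponent $p = 3$ are arbitrary assumptions chosen for illustration.

```python
def p_norm(values, p):
    """(sum |v|**p)**(1/p) for a finite sequence."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
p = 3.0
q = p / (p - 1.0)   # conjugate exponent, so that 1/p + 1/q = 1

# Hoelder: sum |a_k b_k| <= ||a||_p ||b||_q
holder_lhs = sum(abs(x * y) for x, y in zip(a, b))
holder_rhs = p_norm(a, p) * p_norm(b, q)

# Minkowski: ||a + b||_p <= ||a||_p + ||b||_p
minkowski_lhs = p_norm([x + y for x, y in zip(a, b)], p)
minkowski_rhs = p_norm(a, p) + p_norm(b, p)

print(holder_lhs, holder_rhs)
print(minkowski_lhs, minkowski_rhs)
```

Neither inequality is an equality here because $b_k$ is not a constant multiple of $a_k$.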
Sometimes it is convenient to ignore the integration limits, here $a$ and $b$, and write $\int dx\,f(x)$: this is named the indefinite integral; its value is undefined to within an additive constant. However, it is almost always possible to express problems in terms of definite integrals, that is, those with limits.
The theory of integration is concerned with understanding the nature of the inte-
gration process and with extending these simple ideas to deal with wider classes of
functions. The sciences are largely concerned with evaluating integrals, that is convert-
ing integrals to numbers or functions that can be understood: most of the techniques
available for this activity were developed in the nineteenth century or before, and we
describe them later in this section.
There are two important extensions to the integral defined above. If either or both
a and b tend to infinity we define an infinite integral as a limit of integrals: thus if
$b \to \infty$ we have
\[ \int_a^{\infty} dx\,f(x) = \lim_{b\to\infty}\left(\int_a^b dx\,f(x)\right), \tag{1.48} \]
provided the limit exists. As a general rule, provided |f (x)| tends to infinity slower
than |x a| , > 1, the integral exists, which is why, in the previous example, we
needed < 2; note that if $f(x) = O(\ln(x - a))$, as $x \to a$, it is integrable. For functions
unbounded at an interior point the natural extension to P III is used.
The evaluation of integrals of any complexity in closed form is normally difficult, or
impossible, but there are a few tools that help. The main technique is to use the Funda-
mental theorem of Calculus in reverse and simply involves recognising those F (x) whose
derivative is the integrand: this requires practice and ingenuity. The main purpose of
the other tools is to convert integrals into recognisable types. The first is integration
by parts, derived from the product rule for differentiation:
\[ \int_a^b dx\,u\frac{dv}{dx} = \Bigl[uv\Bigr]_a^b - \int_a^b dx\,\frac{du}{dx}v. \tag{1.50} \]
The second method is to change variables:
\[ \int_a^b dx\,f(x) = \int_A^B dt\,\frac{dx}{dt}f(g(t)) = \int_A^B dt\,g'(t)f(g(t)), \tag{1.51} \]
where $x = g(t)$, $g(A) = a$, $g(B) = b$, and $g(t)$ is monotonic for $A < t < B$. In these circumstances the Leibniz notation is helpfully transparent because $\frac{dx}{dt}$ can be treated like a fraction, making the equation easier to remember. The geometric significance of this formula is simply that the small element of length $\delta x$, at $x$, becomes the element of length $\delta x = g'(t)\delta t$, where $x = g(t)$, under the variable change.
The third method involves differentiation with respect to a parameter. Consider a function $f(x, u)$ of two variables, which is integrated with respect to $x$, then
\[ \frac{d}{du}\int_{a(u)}^{b(u)} dx\,f(x, u) = f(b, u)\frac{db}{du} - f(a, u)\frac{da}{du} + \int_{a(u)}^{b(u)} dx\,\frac{\partial f}{\partial u}, \tag{1.52} \]
provided $a(u)$ and $b(u)$ are differentiable and $f_u(x, u)$ is a continuous function of both variables; the derivation of this formula is considered in exercise 1.50. If neither limit depends upon $u$ the first two terms on the right-hand side vanish. A simple example shows how this method can work. Consider the integral
\[ I(u) = \int_0^{\infty} dx\,e^{-xu}, \quad u > 0. \]
Differentiating $n$ times with respect to $u$ under the integral sign gives
\[ \frac{d^n I}{du^n} = (-1)^n\int_0^{\infty} dx\,x^n e^{-xu}. \]
But the original integral is trivially integrated to $I(u) = 1/u$, so differentiation gives
\[ \int_0^{\infty} dx\,x^n e^{-xu} = \frac{n!}{u^{n+1}}. \]
This result may also be found by repeated integration by parts but the above method
involves less algebra.
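The closed form $n!/u^{n+1}$ can be checked against a direct numerical quadrature. The sketch below uses a composite Simpson rule on a truncated range; the rule, the cut-off `upper`, the number of steps and the values $n = 3$, $u = 2$ are all assumptions chosen for illustration.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def moment(n, u, upper=40.0, steps=20000):
    """Approximate the integral of x**n exp(-x u) over [0, upper];
    the tail beyond `upper` is neglected."""
    return simpson(lambda x: x ** n * math.exp(-x * u), 0.0, upper, steps)

numeric = moment(3, 2.0)
closed_form = math.factorial(3) / 2.0 ** 4   # n!/u^(n+1) with n = 3, u = 2
print(numeric, closed_form)
```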
The application of these methods usually requires some skill, some trial and error
and much patience. Please do not spend too long on the following problems.
Exercise 1.42
(a) If $f(x)$ is an odd function, $f(-x) = -f(x)$, show that $\displaystyle\int_{-a}^{a} dx\,f(x) = 0$.
(b) If $f(x)$ is an even function, $f(-x) = f(x)$, show that $\displaystyle\int_{-a}^{a} dx\,f(x) = 2\int_0^a dx\,f(x)$.
Exercise 1.43
Show that, if $\lambda > 0$, the value of the integral $I(\lambda) = \displaystyle\int_0^{\infty} dx\,\frac{\sin\lambda x}{x}$ is independent of $\lambda$. How are the values of $I(\lambda)$ and $I(-\lambda)$ related?
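Exercise 1.44
Use integration by parts to evaluate the following indefinite integrals.
(a) $\displaystyle\int dx\,\ln x$, (b) $\displaystyle\int dx\,\frac{x}{\cos^2 x}$, (c) $\displaystyle\int dx\,x\ln x$, (d) $\displaystyle\int dx\,x\sin x$.
Exercise 1.45
Evaluate the following integrals
(a) $\displaystyle\int_0^{\pi/4} dx\,\sin x\ln(\cos x)$, (b) $\displaystyle\int_0^{\pi/4} dx\,x\tan^2 x$, (c) $\displaystyle\int_0^1 dx\,x^2\sin^{-1}x$.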
Exercise 1.46
If $I_n = \displaystyle\int_0^x dt\,t^n e^{at}$, $n \ge 0$, use integration by parts to show that $aI_n = x^n e^{ax} - nI_{n-1}$ and deduce that
\[ I_n = n!\,e^{ax}\sum_{k=0}^{n}\frac{(-1)^{n-k}}{a^{n-k+1}\,k!}x^k - \frac{(-1)^n n!}{a^{n+1}}. \]
Exercise 1.47
(a) Using the substitution $u = a - x$, show that $\displaystyle\int_0^a dx\,f(x) = \int_0^a dx\,f(a - x)$.
Exercise 1.48
Use the substitution $t = \tan(x/2)$ to prove that if $a > |b| > 0$
\[ \int_0^{\pi} dx\,\frac{1}{a + b\cos x} = \frac{\pi}{\sqrt{a^2 - b^2}}. \]
Exercise 1.49
Prove that $y(t) = \dfrac{1}{\omega}\displaystyle\int_a^t dx\,f(x)\sin\omega(t - x)$ is the solution of the differential equation
\[ \frac{d^2 y}{dt^2} + \omega^2 y = f(t), \quad y(a) = 0, \quad y'(a) = 0. \]
Exercise 1.51
Assuming that both integrals exist, show that
\[ \int_{-\infty}^{\infty} dx\,f\left(x - \frac{1}{x}\right) = \int_{-\infty}^{\infty} dx\,f(x). \]
Exercise 1.52
Find the limits as $X \to \infty$ of the following integrals
\[ \int_2^X dx\,\frac{1}{x\ln x} \quad\text{and}\quad \int_2^X dx\,\frac{1}{x(\ln x)^2}. \]
Hint: note that if $f(x) = \ln(\ln x)$ then $f'(x) = (x\ln x)^{-1}$.
Exercise 1.53
Determine the values of the real constants $a > 0$ and $b > 0$ for which the following limit exists
\[ \lim_{X\to\infty}\int_2^X dx\,\frac{1}{x^a(\ln x)^b}. \]
Limits
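Exercise 1.54
Find, using first principles, the following limits
(a) $\displaystyle\lim_{x\to 1}\frac{x^a - 1}{x - 1}$, (b) $\displaystyle\lim_{x\to 0}\frac{\sqrt{1 + x} - 1}{1 - \sqrt{1 - x}}$, (c) $\displaystyle\lim_{x\to a}\frac{x^{1/3} - a^{1/3}}{x^{1/2} - a^{1/2}}$,
(d) $\displaystyle\lim_{x\to\pi/2}(\pi - 2x)\tan x$, (e) $\displaystyle\lim_{x\to 0+}x^{1/x}$, (f) $\displaystyle\lim_{x\to 0}\left(\frac{1 + x}{1 - x}\right)^{1/x}$.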
Inverse functions
Exercise 1.55
Show that the inverse functions of $y = \cosh x$, $y = \sinh x$ and $y = \tanh x$, for $x > 0$ are, respectively
\[ x = \ln\left(y + \sqrt{y^2 - 1}\right), \quad x = \ln\left(y + \sqrt{y^2 + 1}\right) \quad\text{and}\quad x = \frac{1}{2}\ln\left(\frac{1 + y}{1 - y}\right). \]
Exercise 1.56
The function $y = \sin x$ may be defined to be the solution of the differential equation
\[ \frac{d^2 y}{dx^2} + y = 0, \quad y(0) = 0, \quad y'(0) = 1. \]
Show that the inverse function $x(y)$ satisfies the differential equation
\[ \frac{d^2 x}{dy^2} = y\left(\frac{dx}{dy}\right)^3 \quad\text{which gives}\quad x(y) = \sin^{-1}y = \int_0^y du\,\frac{1}{\sqrt{1 - u^2}}. \]
Derivatives
Exercise 1.57
Find the derivative of $y(x)$ where
(a) $y = f(x)^{g(x)}$, (b) $y = \sqrt{\dfrac{p + x}{p - x}}\sqrt{\dfrac{q + x}{q - x}}$, (c) $y^n = x + \sqrt{1 + x^2}$.
Exercise 1.58
If $y = \sin(a\sin^{-1}x)$ show that $(1 - x^2)y'' - xy' + a^2 y = 0$.
1.4. MISCELLANEOUS EXERCISES 49
Exercise 1.59
If $y(x)$ satisfies the equation $(1 - x^2)\dfrac{d^2 y}{dx^2} - 2x\dfrac{dy}{dx} + \lambda y = 0$, where $\lambda$ is a constant and $|x| \le 1$, show that changing the independent variable, $x$, to $\theta$ where $x = \cos\theta$ changes this to
\[ \frac{d^2 y}{d\theta^2} + \cot\theta\frac{dy}{d\theta} + \lambda y = 0. \]
Exercise 1.60
The Schwarzian derivative of a function $f(x)$ is defined to be
\[ Sf(x) = \frac{f'''(x)}{f'(x)} - \frac{3}{2}\left(\frac{f''(x)}{f'(x)}\right)^2 = -2\sqrt{f'(x)}\,\frac{d^2}{dx^2}\left(\frac{1}{\sqrt{f'(x)}}\right). \]
Show that if $f(x)$ and $g(x)$ both have negative Schwarzian derivatives, $Sf(x) < 0$ and $Sg(x) < 0$, then the Schwarzian derivative of the composite function $h(x) = f(g(x))$ also satisfies $Sh(x) < 0$.
Note: the Schwarzian derivative is important in the study of the fixed points of maps.
Partial derivatives
Exercise 1.61
If $z = f(x + ay) + g(x - ay) - \dfrac{x}{2a^2}\cos(x + ay)$ where $f(u)$ and $g(u)$ are arbitrary functions of a single variable and $a$ is a constant, prove that
\[ a^2\frac{\partial^2 z}{\partial x^2} - \frac{\partial^2 z}{\partial y^2} = \sin(x + ay). \]
Exercise 1.62
If $f(x, y, z) = \exp(ax + by + cz)/xyz$, where $a$, $b$ and $c$ are constants, find the partial derivatives $f_x$, $f_y$ and $f_z$, and solve the equations $f_x = 0$, $f_y = 0$ and $f_z = 0$ for $(x, y, z)$.
Exercise 1.63
The equation $f(u^2 - x^2, u^2 - y^2, u^2 - z^2) = 0$ defines $u$ as a function of $x$, $y$ and $z$. Show that $\dfrac{1}{x}\dfrac{\partial u}{\partial x} + \dfrac{1}{y}\dfrac{\partial u}{\partial y} + \dfrac{1}{z}\dfrac{\partial u}{\partial z} = \dfrac{1}{u}$.
Implicit functions
Exercise 1.64
Show that the function $f(x, y) = x^2 + y^2 - 1$ satisfies the conditions of the Implicit Function Theorem for most values of $(x, y)$, and that the function $y(x)$ obtained from the theorem has derivative $y'(x) = -x/y$.
The equation $f(x, y) = 0$ can be solved explicitly to give the equations $y = \pm\sqrt{1 - x^2}$. Verify that the derivatives of both these functions are the same as that obtained from the Implicit Function Theorem.
Exercise 1.65
Prove that the equation $x\cos xy = 0$ has a unique solution, $y(x)$, near the point $(1, \frac{\pi}{2})$, and find its first and second derivatives.
Exercise 1.66
The folium of Descartes has equation $f(x, y) = x^3 + y^3 - 3axy = 0$. Show that at all points on the curve where $y^2 \ne ax$, the implicit function $y(x)$ has derivative
\[ \frac{dy}{dx} = -\frac{x^2 - ay}{y^2 - ax}. \]
Taylor series
Exercise 1.67
By sketching the graphs of $y = \tan x$ and $y = 1/x$ for $x > 0$ show that the equation $x\tan x = 1$ has an infinite number of positive roots. By putting $x = n\pi + z$, where $n$ is a positive integer, show that this equation becomes $(n\pi + z)\tan z = 1$ and use a first-order Taylor expansion of this to show that the root nearest $n\pi$ is given approximately by $x_n = n\pi + \dfrac{1}{n\pi}$.
Exercise 1.68
Determine the constants $a$ and $b$ such that $(1 + a\cos 2x + b\cos 4x)/x^4$ is finite at the origin.
Exercise 1.69
Find the Taylor series, to 4th order, of the following functions:
(a) $\ln\cosh x$, (b) $\ln(1 + \sin x)$, (c) $e^{\sin x}$, (d) $\sin^2 x$.
Exercise 1.71
Use the functions $f_1(x) = \ln(1 + x) - x$ and $f_2(x) = f_1(x) + x^2/2$ and the Mean Value Theorem to show that, for $x > 0$,
\[ x - \frac{1}{2}x^2 < \ln(1 + x) < x. \]
L'Hospital's rule
Exercise 1.72
Show that $\displaystyle\lim_{x\to 1}\frac{\sin\ln x}{x^5 - 7x^3 + 6} = -\frac{1}{16}$.
Exercise 1.73
Determine the limits $\displaystyle\lim_{x\to 0}(\cos x)^{1/\tan^2 x}$ and $\displaystyle\lim_{x\to 0}\frac{a\sin bx - b\sin ax}{x^3}$.
Integrals
Exercise 1.74
Using differentiation under the integral sign show that
\[ \int_0^{\infty} dx\,\frac{\tan^{-1}(ax)}{x(1 + x^2)} = \frac{\pi}{2}\ln(1 + a). \]
Exercise 1.75
Prove that, if $|a| < 1$
\[ \int_0^{\pi/2} dx\,\frac{\ln(1 + \cos a\cos x)}{\cos x} = \frac{1}{8}\left(\pi^2 - 4a^2\right). \]
Exercise 1.76
If $f(x) = (\sin x)/x$, show that $\displaystyle\int_0^{\pi/2} dx\,f(x)f(\pi/2 - x) = \frac{2}{\pi}\int_0^{\pi} dx\,f(x)$.
Exercise 1.77
Use the integral definition
\[ \tan^{-1}x = \int_0^x dt\,\frac{1}{1 + t^2} \quad\text{to show that for } x > 0 \quad \tan^{-1}(1/x) = \int_x^{\infty} dt\,\frac{1}{1 + t^2} \]
and deduce that $\tan^{-1}x + \tan^{-1}(1/x) = \pi/2$.
Exercise 1.78
Determine the values of $x$ that make $g'(x) = 0$ if $g(x) = \displaystyle\int_x^{2x} dt\,f(t)$ and
(a) $f(t) = e^t$, and (b) $f(t) = (\sin t)/t$.
Exercise 1.79
If $f(x)$ is integrable for $a \le x \le a + h$ show that
\[ \lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n} f\left(a + \frac{kh}{n}\right) = \frac{1}{h}\int_a^{a+h} dx\,f(x). \]
Exercise 1.80
If the functions $f(x)$ and $g(x)$ are differentiable find expressions for the first derivative of the functions
\[ F(u) = \int_0^u dx\,\frac{f(x)}{\sqrt{u^2 - x^2}} \quad\text{and}\quad G(u) = \int_0^u dx\,\frac{g(x)}{(u - x)^a} \quad\text{where } 0 < a < 1. \]
This is a fairly difficult problem. The formula 1.52 does not work because the integrands are singular, yet by substituting simple functions for $f(x)$ and $g(x)$, for instance 1, $x$ and $x^2$, we see that there are cases for which the functions $F(u)$ and $G(u)$ are differentiable. Thus we expect an equivalent to formula 1.52 to exist.
1.5. SOLUTIONS FOR CHAPTER 1 53
(a) $f = \sin 2\theta$, which is independent of $r$, so the value of the function in the neighbourhood of the origin depends upon the direction of approach, that is $\theta$, so $f$ is not defined at the origin and is not continuous.
(b) $f = 1/\cos 2\theta$; the same remark as in part (a) applies and $f$ is not continuous.
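Hence
\[ \theta = \frac{1}{3}\sin^{-1}\left(\frac{y}{2a^3}\right) \quad\text{and}\quad x(y) = 2a\sin\left(\frac{1}{3}\sin^{-1}\left(\frac{y}{2a^3}\right)\right), \quad |y| < 2a^3. \]
(c) For $x > a$, $y(x)$ is strictly decreasing and for $x > 2a$, $y < -2a^3$. Set $x = 2a\cosh\phi$ and the equation becomes
giving
\[ x(y) = 2a\cosh\left(\frac{1}{3}\cosh^{-1}\left(-\frac{y}{2a^3}\right)\right), \quad y < -2a^3. \]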
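\[ \ln y = \frac{1}{2}\ln(a - x) + \frac{1}{2}\ln(b + x) \quad\text{giving}\quad \frac{1}{y}\frac{dy}{dx} = \frac{1}{2(b + x)} - \frac{1}{2(a - x)} = \frac{a - b - 2x}{2(b + x)(a - x)} \]
(b) Define
\[ y^2 = a\sin^2 x + b\cos^2 x \quad\text{to give}\quad 2y\frac{dy}{dx} = 2(a - b)\sin x\cos x \quad\text{or}\quad \frac{dy}{dx} = \frac{(a - b)\sin 2x}{2\sqrt{a\sin^2 x + b\cos^2 x}} \]
which can also be expressed in the form
\[ \frac{dy}{dx} = \frac{(a - b)\sin 2x}{\sqrt{2(a + b) + 2(b - a)\cos 2x}}. \]
\[ \frac{d}{dx}\left(\cos x^3\cos x\right) = -3x^2\sin x^3\cos x - \cos x^3\sin x. \]
(d) If $y = x^x = e^{x\ln x}$, putting $u = x\ln x$ the chain rule gives $\dfrac{dy}{dx} = e^u\dfrac{du}{dx} = (1 + \ln x)x^x$.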
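\[ 1 = \frac{d}{dy}f(g(y)) = \frac{df}{dg}\frac{dg}{dy} = f'(g)g'(y). \]
Since $\dfrac{dy}{dx} = f'(x)$ and $\dfrac{dx}{dy} = g'(y)$, the result follows.
(b) Differentiate again with respect to $y$
\[ \frac{d^2 x}{dy^2} = \frac{d}{dy}\left(\frac{dy}{dx}\right)^{-1} = \frac{dx}{dy}\frac{d}{dx}\left(\frac{dy}{dx}\right)^{-1} = -\frac{dx}{dy}\left(\frac{dy}{dx}\right)^{-2}\frac{d^2 y}{dx^2} = -\left(\frac{dy}{dx}\right)^{-3}\frac{d^2 y}{dx^2}. \]
so that
\[ \lim_{h\to 0}\frac{1}{h}\left(\frac{1}{f(x + h)} - \frac{1}{f(x)}\right) = -\lim_{h\to 0}\frac{1}{h}\,\frac{f(x + h) - f(x)}{f(x + h)f(x)} = -\frac{f'(x)}{f(x)^2}. \]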
\[ h' = f'g + fg', \qquad h'' = (f''g + f'g') + (f'g' + fg'') = f''g + 2f'g' + fg''. \]
The expression for $h^{(3)}$ follows similarly. Since $\binom{2}{0} = \binom{2}{2} = 1$ and $\binom{2}{1} = 2$ the general result quoted is therefore true for $n = 1$ and 2. Suppose it to be true for $n$; a further differentiation gives
\begin{align*}
h^{(n+1)} &= \sum_{k=0}^{n}\binom{n}{k}\left[f^{(n-k+1)}g^{(k)} + f^{(n-k)}g^{(k+1)}\right] \\
&= \sum_{k=0}^{n}\binom{n}{k}f^{(n-k+1)}g^{(k)} + \sum_{s=1}^{n+1}\binom{n}{s-1}f^{(n+1-s)}g^{(s)} \quad\text{(with $s = k+1$ in second sum)} \\
&= \binom{n}{0}f^{(n+1)}g^{(0)} + \binom{n}{n}f^{(0)}g^{(n+1)} + \sum_{k=1}^{n}\left[\binom{n}{k} + \binom{n}{k-1}\right]f^{(n-k+1)}g^{(k)}.
\end{align*}
But, for all $m$, $\binom{m}{0} = \binom{m}{m} = 1$ and
\begin{align*}
\binom{n}{k} + \binom{n}{k-1} &= \frac{n!}{k!\,(n-k)!} + \frac{n!}{(k-1)!\,(n+1-k)!} = \frac{n!}{k!\,(n-k)!} + \frac{(n+1)!}{k!\,(n+1-k)!}\,\frac{k}{n+1} \\
&= \frac{(n+1)!}{k!\,(n+1-k)!}\,\frac{n+1-k}{n+1} + \frac{(n+1)!}{k!\,(n+1-k)!}\,\frac{k}{n+1} = \binom{n+1}{k}.
\end{align*}
Thus, if the formula is true for $n$, it is true for $n + 1$: it is true for $n = 2$ and hence is true for all $n$.
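The two ingredients of this induction can be spot-checked numerically. The sketch below is only an illustration, not part of the proof: the function names are hypothetical, and the test functions $f = e^{ax}$, $g = e^{bx}$ are an assumed convenient choice, for which the $n$th derivative of the product at $x = 0$ is $(a+b)^n$.

```python
from math import comb


def pascal_rule_holds(n_max):
    """Return True if C(n,k) + C(n,k-1) == C(n+1,k) for all 1 <= k <= n <= n_max."""
    return all(
        comb(n, k) + comb(n, k - 1) == comb(n + 1, k)
        for n in range(1, n_max + 1)
        for k in range(1, n + 1)
    )


def leibniz_sum(a, b, n):
    """Leibniz sum for the n-th derivative of exp(a*x)*exp(b*x) at x = 0.

    The k-th derivative of exp(c*x) at x = 0 is c**k, so each term of the
    sum is C(n,k) * a**(n-k) * b**k.
    """
    return sum(comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1))


if __name__ == "__main__":
    print(pascal_rule_holds(30))
    # Direct differentiation of exp((a+b)*x) gives (a+b)**n at x = 0.
    print(leibniz_sum(2.0, 3.0, 5), (2.0 + 3.0) ** 5)
```

A finite check of this kind cannot replace the induction, but it is a quick guard against slips in the binomial algebra.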
which is valid provided none of the $f_k(x)$ are zero, that is $p(x) \ne 0$.
so that
F (x + h) F (x)
= f 0 g(x) + hg 0 g 0 (x + h).
h
This gives the required result on taking the limit $h \to 0$.
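Solution for Exercise 1.19
(a) $\dfrac{1}{x}\displaystyle\int_0^x dt\,\sqrt{4 + 3t^3} = \sqrt{4 + 3(\theta x)^3}$ for $0 < \theta < 1$. Hence the limit is 2.
(b)
\[ \frac{1}{(x-1)^3}\int_1^x dt\,\ln\left(3t - 3t^2 + t^3\right) = \frac{1}{z^3}\int_0^z ds\,\ln(1 + s^3) \quad\text{where } z = x - 1 \text{ and } s = t - 1. \]
The Mean Value theorem gives the second integral as $z^{-2}\ln(1 + (\theta z)^3)$, $0 < \theta < 1$, and this is zero in the limit $z \to 0$.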
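Solution for Exercise 1.24
If $F = \sqrt{1 + x_1x_2}$ then the chain rule gives
\[ \frac{dF}{dt} = \frac{\partial F}{\partial x_1}x_1' + \frac{\partial F}{\partial x_2}x_2' = \frac{x_1x_2' + x_1'x_2}{2\sqrt{1 + x_1x_2}}. \]
Alternatively, set $u = x_1x_2$, so $\dfrac{dF}{dt} = \dfrac{1}{2\sqrt{1+u}}\dfrac{du}{dt}$, which is a simpler method of deriving the same result.
Differentiate this expression with respect to $x_1$, using the product rule,
\begin{align*}
\frac{\partial}{\partial x_1}\frac{dF}{dt} &= (x_1x_2' + x_1'x_2)\frac{\partial}{\partial x_1}\left(\frac{1}{2\sqrt{1+x_1x_2}}\right) + \frac{1}{2\sqrt{1+x_1x_2}}\frac{\partial}{\partial x_1}(x_1x_2' + x_1'x_2) \\
&= -\frac{1}{4}(x_1x_2' + x_1'x_2)\frac{x_2}{(1+x_1x_2)^{3/2}} + \frac{x_2'}{2\sqrt{1+x_1x_2}}.
\end{align*}
Also $\dfrac{\partial F}{\partial x_1} = \dfrac{x_2}{2\sqrt{1+x_1x_2}}$, and the chain rule gives
\[ \frac{d}{dt}\frac{\partial F}{\partial x_1} = \frac{x_2'}{2\sqrt{1+x_1x_2}} - \frac{x_2}{4(1+x_1x_2)^{3/2}}\frac{d}{dt}(x_1x_2), \]
as before.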
and set = 1.
p fxk (x) = f (x) = f (x) = fxk (x)
xk (xk )
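\[ f_x = \frac{2x}{x^2+y^2} + \frac{2}{(1 + y^2/x^2)}\,\frac{y}{x^2} = \frac{2(x+y)}{x^2+y^2}, \]
\[ f_y = \frac{2y}{x^2+y^2} - \frac{2}{x}\,\frac{1}{(1 + y^2/x^2)} = \frac{2(y-x)}{x^2+y^2}. \]
Hence $\dfrac{dy}{dx} = -\dfrac{f_x}{f_y} = \dfrac{x+y}{x-y}$.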
(x cos u 1) y 00 (x) + (cos u xu0 sin u) y 0 (x) + y 0 (x) cos u yu0 sin u = 0.
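\[ x\frac{dv}{dx} = -\frac{a^2+v^2}{v+1} \quad\text{or}\quad \int dv\,\frac{v+1}{v^2+a^2} = -\int\frac{dx}{x}. \]
\[ \frac{1}{2}\ln\left(a^2x^2 + y^2\right) + \frac{1}{a}\tan^{-1}\left(\frac{y}{ax}\right) = B \]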
But $e^{ix} = \cos x + i\sin x$, so equating real and imaginary parts gives the quoted series.
$f^{(k)}(x) = a(a-1)(a-2)\cdots(a-k+1)(1+x)^{a-k}$ for all $k$ provided $a$ is not an integer.
(b) The series for $(1+t)^{-1}$ is valid for $|t| < 1$, so for $|x| < 1$ the integral and sum may be interchanged.
(c) Put $x \to -x$ and subtract this from the original series.
Since y(0) = 0 and y (1) (0) = 1 (from the original equation) these equations give
y (2) (0) = 1, y (3) (0) = 6, y (4) (0) = 27 and y (5) (0) = 186 and hence
1 9 31
y = x x2 + x3 x4 + x5 + O(x6 ).
2 8 20
An alternative method is to assume the expansion $y = x + \sum_{k=2}^{5} a_kx^k$, which automatically satisfies the conditions $y(0) = 0$ and $y'(0) = 1$, to substitute this into the differential equation, collect the powers $x^k$, $k = 2, 3, \cdots, 5$, and equate their coefficients to zero to obtain equations for the constants $a_k$.
(b)
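(c) $\displaystyle\lim_{x\to 0}\frac{3^x - 3^{-x}}{2^x - 2^{-x}} = \lim_{x\to 0}\frac{e^{x\ln 3} - e^{-x\ln 3}}{e^{x\ln 2} - e^{-x\ln 2}} = \lim_{x\to 0}\frac{\ln 3\,e^{x\ln 3} + \ln 3\,e^{-x\ln 3}}{\ln 2\,e^{x\ln 2} + \ln 2\,e^{-x\ln 2}} = \frac{\ln 3}{\ln 2}.$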
(b) Put $F(x) = 1/f(x)$ and $G(x) = 1/g(x)$ so $F(a) = G(a) = 0$, and
\[ \lim_{x\to a}\frac{f(x)}{g(x)} = \lim_{x\to a}\frac{G(x)}{F(x)} = \lim_{x\to a}\frac{g'(x)}{f'(x)}\,\frac{f(x)^2}{g(x)^2} = \lim_{x\to a}\frac{g'(x)}{f'(x)}\left(\lim_{x\to a}\frac{f(x)}{g(x)}\right)^2. \]
Hence, provided all limits exist, $\displaystyle\lim_{x\to a}\frac{f(x)}{g(x)} = \lim_{x\to a}\frac{f'(x)}{g'(x)}$.
(b) Split the integral in the same manner as in part (a), but since $f(-u) = f(u)$ the two integrals are equal.
(b) $\displaystyle\int dx\,\frac{x}{\cos^2 x} = x\tan x - \int dx\,\tan x$ but $\displaystyle\int dx\,\tan x = \int dx\,\frac{\sin x}{\cos x} = -\ln|\cos x|$.
Hence $\displaystyle\int dx\,\frac{x}{\cos^2 x} = x\tan x + \ln|\cos x|$.
(c) $\displaystyle\int dx\,x\ln x = \frac{1}{2}x^2\ln x - \frac{1}{2}\int dx\,x^2\,\frac{1}{x} = \frac{1}{2}x^2\ln x - \frac{1}{4}x^2$.
(d) $\displaystyle\int dx\,x\sin x = -x\cos x + \int dx\,\cos x = \sin x - x\cos x$.
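(b)
\[ \int_0^{\pi/4} dx\,x\tan^2 x = \int_0^{\pi/4} dx\,x\left(\frac{1}{\cos^2 x} - 1\right) = \left[x\tan x + \ln\cos x - \frac{1}{2}x^2\right]_0^{\pi/4} = \frac{\pi}{4} - \frac{1}{2}\ln 2 - \frac{\pi^2}{32}. \]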
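(c)
\[ \int_0^1 dx\,x^2\sin^{-1}x = \left[\frac{1}{3}x^3\sin^{-1}x\right]_0^1 - \frac{1}{3}\int_0^1 dx\,\frac{x^3}{\sqrt{1-x^2}}. \]
But on putting $x = \sin\theta$ and using the identity $\sin 3\theta = 3\sin\theta - 4\sin^3\theta$,
\[ \int_0^1 dx\,\frac{x^3}{\sqrt{1-x^2}} = \int_0^{\pi/2} d\theta\,\sin^3\theta = \frac{1}{4}\int_0^{\pi/2} d\theta\,(3\sin\theta - \sin 3\theta) = \frac{2}{3} \]
and hence $\displaystyle\int_0^1 dx\,x^2\sin^{-1}x = \frac{\pi}{6} - \frac{2}{9}$.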
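\[ aI_1 = xe^{ax} - I_0, \quad aI_2 = x^2e^{ax} - 2I_1, \quad aI_3 = x^3e^{ax} - 3I_2, \quad \cdots, \quad aI_n = x^ne^{ax} - nI_{n-1}. \]
Multiply the $k$th equation by $A_k$ and add all the equations to obtain
\[ a\sum_{k=1}^{n} A_kI_k = e^{ax}\sum_{k=1}^{n} A_kx^k - \sum_{k=1}^{n} kA_kI_{k-1}. \]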
Now choose the $A_k$ such that $A_n = 1/a$ and, for $k = 1, 2, \cdots, n-1$, the $I_k$ cancel, that is
\[ aA_k = -(k+1)A_{k+1}, \quad k = 1, 2, \cdots, n-1, \qquad A_n = \frac{1}{a}. \]
The solution of these equations is $A_k = \dfrac{n!\,(-1)^{n-k}}{a^{n-k+1}\,k!}$, which gives the quoted expression.
Solution for Exercise 1.47
(a) $\displaystyle\int_0^a dx\,f(x) = -\int_a^0 du\,f(a-u) = \int_0^a dx\,f(a-x)$.
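\[ \frac{\partial F}{\partial a} = -\int_0^\pi dx\,\frac{1}{(a + b\cos x)^2} = -\frac{\pi a}{(a^2-b^2)^{3/2}}, \]
\[ \frac{\partial^2F}{\partial a^2} = 2\int_0^\pi dx\,\frac{1}{(a + b\cos x)^3} = \frac{\pi(2a^2+b^2)}{(a^2-b^2)^{5/2}}, \]
\[ \frac{\partial F}{\partial b} = -\int_0^\pi dx\,\frac{\cos x}{(a + b\cos x)^2} = \frac{\pi b}{(a^2-b^2)^{3/2}}. \]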
\[ G = C + \pi\int da\,\frac{1}{\sqrt{a^2-b^2}} = C + \pi\ln\left(a + \sqrt{a^2-b^2}\right), \]
From the first of these equations we see that $y'(a) = 0$, so the initial conditions are satisfied. The second equation gives $y''(a) = f(a)$, which is consistent with the original differential equation.
we have
\[ \frac{F(u+h) - F(u)}{h} = \frac{1}{h}\int_{a(u)}^{a(u+h)} dx\,f(x) = \frac{a(u+h) - a(u)}{h}\,f(\xi), \quad\text{where } \xi \in [a(u), a(u+h)], \]
the last result being obtained from the integral form of the Mean Value Theorem. Taking the limit $h \to 0$ gives $F'(u) = a'(u)f(a(u))$. The same result can be derived using the Fundamental theorem of Calculus and the chain rule.
(b) We have
\[ \frac{F(u+h) - F(u)}{h} = \int_a^b dx\,\frac{f(x, u+h) - f(x, u)}{h}. \]
Assuming that the limit $h \to 0$ exists we obtain $F'(u) = \displaystyle\int_a^b dx\,\frac{\partial f}{\partial u}$.
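(d) Put $x = \pi/2 - \delta$, $\delta > 0$, to give $(\pi - 2x)\tan x = \dfrac{2\delta}{\tan\delta} = 2 + O(\delta^2)$.
(e) Put $y = x^{1/x}$, so $\ln y = (1/x)\ln x$ and $\displaystyle\lim_{x\to 0}\ln y = -\infty$ and $\displaystyle\lim_{x\to 0}x^{1/x} = 0$.
(f) We have
\[ \lim_{x\to 0}\left(\frac{1+x}{1-x}\right)^{1/x} = \lim_{x\to 0}\exp\left[\frac{1}{x}\ln\left(\frac{1+x}{1-x}\right)\right] = \lim_{x\to 0}\exp\left[\frac{1}{x}\left(2x + O(x^3)\right)\right] = e^2. \]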
In the final example we have $y = \dfrac{z^2-1}{z^2+1}$ or $z = e^x = \sqrt{\dfrac{1+y}{1-y}}$. The positive root gives the required solution so
\[ x = \frac{1}{2}\ln\left(\frac{1+y}{1-y}\right). \]
\[ \frac{d^2y}{dx^2} = \frac{d}{dy}\left(\frac{1}{x'(y)}\right)\frac{dy}{dx} = -\frac{x''(y)}{x'(y)^3} \]
Integration gives $1/z^2 = -y^2 + c$, but $x'(0) = 1/y'(0) = 1$ and $y(0) = 0$, so $c = 1$ and
\[ \frac{dx}{dy} = \frac{1}{\sqrt{1-y^2}}, \quad x(0) = 0, \]
where the negative square root is ignored because $x'(0) = 1$. A further integration gives
\[ x(y) = \int_0^y du\,\frac{1}{\sqrt{1-u^2}}. \]
The Taylor expansion of the integrand is
\[ \frac{1}{\sqrt{1-u^2}} = 1 + \frac{1}{2}u^2 + \frac{3}{8}u^4 + O(u^6) \]
so integration gives
\[ \sin^{-1}y = y + \frac{1}{6}y^3 + \frac{3}{40}y^5 + O(y^7). \]
More generally, we have $\dfrac{1}{\sqrt{1-u^2}} = \displaystyle\sum_{k=0}^{\infty}\frac{(2k)!}{k!^2}\,\frac{u^{2k}}{2^{2k}}$, $|u| < 1$, so
\[ \sin^{-1}y = y\sum_{k=0}^{\infty}\frac{(2k)!}{k!^2}\,\frac{y^{2k}}{2^{2k}(2k+1)}, \quad |y| < 1. \]
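The general series for $\sin^{-1}y$ is easy to test against a library implementation. The following is a minimal sketch; the function name `arcsin_series` and the default number of terms are illustrative choices, not part of the text.

```python
import math


def arcsin_series(y, terms=30):
    """Partial sum of y * sum_k (2k)! y^(2k) / (k!^2 2^(2k) (2k+1)), valid for |y| < 1."""
    total = 0.0
    for k in range(terms):
        coeff = math.factorial(2 * k) / (
            math.factorial(k) ** 2 * 2 ** (2 * k) * (2 * k + 1)
        )
        total += coeff * y ** (2 * k + 1)
    return total


if __name__ == "__main__":
    for y in (0.1, 0.5, 0.9):
        print(y, arcsin_series(y), math.asin(y))
```

The first three coefficients generated by the loop are 1, 1/6 and 3/40, in agreement with the truncated expansion; convergence slows as $|y|$ approaches 1, so more terms are needed there.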
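(b) Since
\[ \ln y = \frac{1}{2}\ln(p+x) - \frac{1}{2}\ln(p-x) + \frac{1}{2}\ln(q+x) - \frac{1}{2}\ln(q-x) \]
we have
\[ \frac{y'}{y} = \frac{1}{2}\left(\frac{1}{p+x} + \frac{1}{p-x}\right) + \frac{1}{2}\left(\frac{1}{q+x} + \frac{1}{q-x}\right) = \frac{p}{p^2-x^2} + \frac{q}{q^2-x^2} \]
and
\[ \frac{dy}{dx} = \left(\frac{p}{p^2-x^2} + \frac{q}{q^2-x^2}\right)\sqrt{\frac{p+x}{p-x}}\sqrt{\frac{q+x}{q-x}}. \]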
(c) We have
\[ ny^{n-1}\frac{dy}{dx} = 1 + \frac{x}{\sqrt{1+x^2}} = \frac{y^n}{\sqrt{1+x^2}} \quad\text{therefore}\quad \frac{dy}{dx} = \frac{y}{n\sqrt{1+x^2}}. \]
and
\[ \frac{d^2y}{dx^2} = -\frac{a^2\sin u}{1-x^2} + \frac{ax\cos u}{(1-x^2)^{3/2}} = -\frac{a^2y}{1-x^2} + \frac{x}{1-x^2}\frac{dy}{dx}, \]
which gives the required result.
p
which gives the required result since x/ x2 + y 2 = cot .
so that
\[ Sh(x) = \frac{g'''(x)f'(g) + 3g''(x)g'(x)f''(g) + g'(x)^3f'''(g)}{g'(x)f'(g)} - \frac{3}{2}\left(\frac{g''(x)}{g'(x)} + \frac{g'(x)f''(g)}{f'(g)}\right)^2. \]
On multiplying this out we see that $Sh(x) = Sg(x) + g'(x)^2\,Sf(g) < 0$ since $Sg(x) < 0$ and $Sf(g) < 0$.
Figure 1.9 Graphs of y = 1/x and y = tan x.
For the $n$th root, put $x = n\pi + z$, and since $\sin x = (-1)^n\sin z$ and $\cos x = (-1)^n\cos z$ the equation becomes
\[ (n\pi + z)\tan z = 1 \quad\text{with $z$ small.} \]
Put $\epsilon = 1/(n\pi)$ so the equation becomes $(1 + \epsilon z)\tan z = \epsilon$ and we require the Taylor expansion of $z(\epsilon)$ about $\epsilon = 0$. Putting $\epsilon = 0$ we see that $z(0) = 0$. Differentiation gives
\[ (\epsilon z' + z)\tan z + \frac{1 + \epsilon z}{\cos^2 z}\,z' = 1 \quad\text{giving}\quad z'(0) = 1, \]
and hence $x = n\pi + \dfrac{1}{n\pi}$.
Further differentiation of the same equation allows, in principle, the calculation of $z^{(n)}(0)$ for $n > 2$; however, such calculations are extremely tedious and error prone. A far easier method is now outlined.
First, rewrite the equation for $z$ in the form
\[ \tan z = \frac{\epsilon}{1 + \epsilon z} \]
and observe that this equation defines a function $z(\epsilon)$, with $z(0) = 0$, that is an odd function of $\epsilon$; to see this note that $-z(-\epsilon)$ satisfies the same equation. Also, for small $|z|$ we see that to $O(\epsilon)$ the equation becomes $z = \epsilon + O(\epsilon^2)$. The power series for $z(\epsilon)$ is thus
\[ z = \epsilon + z_3\epsilon^3 + z_5\epsilon^5 + O(\epsilon^7), \]
where $z_3$ and $z_5$ are coefficients to be found. Substitute this series into the left-hand side of the equation and use the known series for $\tan z$ to obtain
\begin{align*}
\tan z &= \epsilon + z_3\epsilon^3 + z_5\epsilon^5 + \cdots + \frac{\epsilon^3}{3}\left(1 + z_3\epsilon^2 + \cdots\right)^3 + \frac{2}{15}\epsilon^5 + \cdots \\
&= \epsilon + \left(z_3 + \frac{1}{3}\right)\epsilon^3 + \left(z_5 + z_3 + \frac{2}{15}\right)\epsilon^5 + \cdots.
\end{align*}
Equating the coefficients of the powers of $\epsilon$ on each side of the equation gives $z_3 = -4/3$ and $z_5 = 53/15$ and hence
\[ x = n\pi + \frac{1}{n\pi} - \frac{4}{3(n\pi)^3} + \frac{53}{15(n\pi)^5} + \cdots. \]
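An expansion of this kind can be compared with a direct numerical solution. The sketch below assumes the equation $x\tan x = 1$ (the intersections of $y = 1/x$ and $y = \tan x$) with the root written as $x = n\pi + z$, $n \ge 1$; the helper names and the use of bisection are illustrative choices only.

```python
import math


def exact_root(n, iterations=200):
    """Bisection for the root z in (0, 1.5) of (n*pi + z) * tan(z) = 1, returned as x = n*pi + z."""
    lo, hi = 0.0, 1.5
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # The left-hand side increases with z on this interval.
        if (n * math.pi + mid) * math.tan(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return n * math.pi + 0.5 * (lo + hi)


def asymptotic_root(n):
    """Series x = n*pi + eps - (4/3) eps^3 + (53/15) eps^5 with eps = 1/(n*pi)."""
    eps = 1.0 / (n * math.pi)
    return n * math.pi + eps - 4.0 * eps**3 / 3.0 + 53.0 * eps**5 / 15.0


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        print(n, exact_root(n), asymptotic_root(n))
```

The difference between the two values should shrink roughly like $(n\pi)^{-7}$ as $n$ grows, which is the size of the first neglected term.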
1 2 1 3
= 2 sinh2 u 2 sinh2 u + 2 sinh2 u + O(u8 ).
2 3
2
u2
u 2 2
Now use sinh u = u 1 + + and sinh u = u 1 + + in this expansion,
6 3
to give
u4 x2 x4
ln(cosh x) = 2u2 1 + + 2u4 + = + O(x6 ).
12 2 12
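(b) Similarly
\[ \ln(1 + \sin x) = \sin x - \frac{1}{2}\sin^2 x + \frac{1}{3}\sin^3 x - \frac{1}{4}\sin^4 x + O(x^5). \]
\[ \sin x = x\left(1 - \frac{x^2}{6} + \cdots\right) \]
giving
\begin{align*}
\ln(1 + \sin x) &= x\left(1 - \frac{x^2}{6}\right) - \frac{x^2}{2}\left(1 - \frac{x^2}{3}\right) + \frac{x^3}{3} - \frac{x^4}{4} + O(x^5), \\
&= x - \frac{x^2}{2} + \frac{x^3}{6} - \frac{x^4}{12} + O(x^5).
\end{align*}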
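(c) Similarly
\begin{align*}
\exp(\sin x) &= 1 + \sin x + \frac{\sin^2 x}{2} + \frac{\sin^3 x}{6} + \frac{\sin^4 x}{24} + O(x^5), \\
&= 1 + x\left(1 - \frac{x^2}{6}\right) + \frac{x^2}{2}\left(1 - \frac{x^2}{3}\right) + \frac{x^3}{6} + \frac{x^4}{24} + O(x^5), \\
&= 1 + x + \frac{x^2}{2} - \frac{x^4}{8} + O(x^5).
\end{align*}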
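Truncated expansions such as these can be checked by confirming that the remainder is small for small $x$. The sketch below compares the fourth-order polynomials for $\ln(1+\sin x)$ and $\exp(\sin x)$ with library values; the function names are hypothetical.

```python
import math


def log_one_plus_sin_series(x):
    """Fourth-order expansion x - x^2/2 + x^3/6 - x^4/12."""
    return x - x**2 / 2 + x**3 / 6 - x**4 / 12


def exp_sin_series(x):
    """Fourth-order expansion 1 + x + x^2/2 - x^4/8."""
    return 1 + x + x**2 / 2 - x**4 / 8


def truncation_errors(x):
    """Absolute errors of the two truncated series at x."""
    return (
        abs(math.log(1 + math.sin(x)) - log_one_plus_sin_series(x)),
        abs(math.exp(math.sin(x)) - exp_sin_series(x)),
    )


if __name__ == "__main__":
    for x in (0.1, 0.05, 0.01):
        print(x, truncation_errors(x))
```

Halving $x$ should reduce each error by a factor of about 32, consistent with a remainder of order $x^5$.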
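Hence
\[ S = \frac{1}{iy}\left(e^{iy} - 1\right) = \frac{\sin y}{y} + \frac{2i}{y}\sin^2\frac{y}{2} \]
and hence
\[ \lim_{n\to\infty}\frac{1}{n}\left[\sin\frac{y}{n} + \sin\frac{2y}{n} + \cdots + \sin y\right] = \frac{2}{y}\sin^2\frac{y}{2}. \]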
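The limit of the averaged sine sum can be observed numerically by taking $n$ large. This is a minimal sketch with hypothetical function names; the sum is a right-hand Riemann sum, so the discrepancy is expected to decay like $1/n$.

```python
import math


def sine_average(y, n):
    """(1/n) * [sin(y/n) + sin(2y/n) + ... + sin(y)]."""
    return sum(math.sin(k * y / n) for k in range(1, n + 1)) / n


def limit_value(y):
    """Closed form (2/y) sin^2(y/2) of the limit as n tends to infinity."""
    return 2.0 * math.sin(y / 2.0) ** 2 / y


if __name__ == "__main__":
    y = 1.3
    for n in (10, 1000, 100000):
        print(n, sine_average(y, n), limit_value(y))
```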
(d) If $P_n = \dfrac{1}{n}\Big[(n+1)(n+2)\cdots(2n)\Big]^{1/n}$ then
\[ \ln P_n = -\ln n + \frac{1}{n}\sum_{k=1}^{n}\ln(k+n) = \frac{1}{n}\sum_{k=1}^{n}\ln(1 + k/n). \]
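In the first case the simplest method is to remove the singularity at $x = u$ using the standard change of variable $x = u\sin\theta$ to give
\[ F(u) = \int_0^{\pi/2} d\theta\,f(u\sin\theta). \]
Differentiating under the integral sign and returning to the variable $x$ gives
\[ F'(u) = \frac{1}{u}\int_0^u dx\,\frac{xf'(x)}{\sqrt{u^2-x^2}}. \]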
In the second case we use another, more general trick. Consider the integral
u
g(x)
Z
G (u) = dx , 0,
0 (u x)a
2.1 Introduction
In this chapter we consider the particular variational principle defining the shortest
distance between two points in a plane. It is well known that this shortest path is the straight line; however, it is almost always easiest to understand a new idea by applying it to a simple, familiar problem, so here we introduce the essential ideas of the Calculus of Variations by finding the equation of this line. The algebra may seem overcomplicated for this simple problem, but the same theory can be applied to far more complicated problems, and we shall see in chapter 3 that the most important equation of the Calculus of Variations, the Euler-Lagrange equation, can be derived with almost no extra effort.
The chapter ends with a description of some of the problems that can be formulated
in terms of variational principles, some of which will be solved later in the course.
The approach adopted is intuitive, that is we assume that functionals behave like functions of n real variables. This is exactly the approach used by Euler (1707–1783) and Lagrange (1736–1813) in their original analysis and it can be successfully applied
to many important problems. However, it masks a number of problems, all to do
with the subtle differences between infinite and finite dimensional spaces which are not
considered in this course.
The curve must pass through the end points, so $y(x)$ satisfies the boundary conditions, $y(a) = A$ and $y(b) = B$. We shall usually assume that $y'(x)$ is continuous on $(a, b)$.
We require the equation of the function that makes S[y] stationary, that is we need
to understand how the values of the functional S[y] change as the path between Pa and
80 CHAPTER 2. THE CALCULUS OF VARIATIONS
Pb varies. These ideas are introduced here, and developed in chapter 3, using analogies
with the theory of functions of many real variables.
A stationary point is defined to be one for which the term $O(\epsilon)$ is zero for all $\boldsymbol{\xi}$. This gives the familiar conditions for a point to be stationary, namely $\partial G/\partial x_k = 0$ for $k = 1, 2, \cdots, n$.
For a functional we proceed in the same way. That is, we choose adjacent paths joining $P_a$ to $P_b$ and compare the values of $S$ along these paths. If a path is represented by a differentiable function $y(x)$, adjacent paths may be represented by $y(x) + \epsilon h(x)$, where $\epsilon$ is a real variable and $h(x)$ another differentiable function. Since all paths must pass through $P_a$ and $P_b$, we require $y(a) = A$, $y(b) = B$ and $h(a) = h(b) = 0$; otherwise $h(x)$ is arbitrary. The difference
\[ \delta S = S[y + \epsilon h] - S[y], \]
may be considered as a function of the real variable $\epsilon$, for arbitrary $y(x)$ and $h(x)$ and for small values of $\epsilon$, $|\epsilon| \ll 1$. When $\epsilon = 0$, $\delta S = 0$ and for small $|\epsilon|$ we expect $\delta S$ to be proportional to $\epsilon$; in general this is true as seen in equation 2.3 below.
However, there may be some paths for which $\delta S$ is proportional to $\epsilon^2$, rather than $\epsilon$. These paths are special and we define these to be the stationary paths, curves or stationary functions. Thus a necessary condition for a path $y(x)$ to be a stationary path is that
\[ S[y + \epsilon h] - S[y] = O(\epsilon^2), \]
for all suitable $h(x)$. The equation for the stationary function $y(x)$ is obtained by examining this difference more carefully.
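The distances along these adjacent curves are
\[ S[y] = \int_a^b dx\,\sqrt{1 + y'(x)^2}, \quad\text{and}\quad S[y + \epsilon h] = \int_a^b dx\,\sqrt{1 + [y'(x) + \epsilon h'(x)]^2}. \]
We proceed by expanding the integrand of $S[y + \epsilon h]$ in powers of $\epsilon$, retaining only the terms proportional to $\epsilon$. One way of making this expansion is to consider the integrand as a function of $\epsilon$ and to use Taylor's series to expand in powers of $\epsilon$,
\begin{align*}
\sqrt{1 + (y' + \epsilon h')^2} &= \sqrt{1 + y'^2} + \epsilon\left.\frac{d}{d\epsilon}\sqrt{1 + (y' + \epsilon h')^2}\,\right|_{\epsilon=0} + O(\epsilon^2), \\
&= \sqrt{1 + y'^2} + \epsilon\frac{y'h'}{\sqrt{1 + y'^2}} + O(\epsilon^2).
\end{align*}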
2.2. THE SHORTEST DISTANCE BETWEEN TWO POINTS IN A PLANE 81
Substituting this expansion into the integral and rearranging gives the difference between the two lengths,
\[ S[y + \epsilon h] - S[y] = \epsilon\int_a^b dx\,\frac{y'(x)}{\sqrt{1 + y'(x)^2}}\,h'(x) + O(\epsilon^2). \qquad (2.3) \]
This difference depends upon both $y(x)$ and $h(x)$, just as for functions of $n$ real variables the difference $G(\mathbf{x} + \epsilon\boldsymbol{\xi}) - G(\mathbf{x})$, equation 2.2, depends upon both $\mathbf{x}$ and $\boldsymbol{\xi}$, the equivalents of $y(x)$ and $h(x)$ respectively.
Since $S[y]$ is stationary it follows, by definition, that
\[ \int_a^b dx\,\frac{y'(x)}{\sqrt{1 + y'(x)^2}}\,h'(x) = 0 \qquad (2.4) \]
In section 3.3 we show that condition 2.5 is necessary as well as sufficient for equation 2.4
to hold.
Equation 2.5 shows that $y'(x) = m$, where $m$ is a constant, and integration gives the general solution,
\[ y(x) = mx + c \]
for another constant $c$: this is the equation of a straight line as expected. The constants $m$ and $c$ are determined by the conditions that the straight line passes through $P_a$ and $P_b$:
\[ y(x) = \frac{B-A}{b-a}\,x + \frac{Ab - Ba}{b-a}. \qquad (2.6) \]
This analysis shows that the functional S[y] defined in equation 2.1 is stationary along
the straight line joining Pa to Pb . We have not shown that this gives a minimum
distance: this is proved in exercise 2.2.
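The defining property of a stationary path, that the change in length is of second order in the size of the perturbation, can be seen numerically. The sketch below is an illustration under stated assumptions: the end points are taken as (0, 0) and (1, 1) by default, the perturbation is the hypothetical choice $h(x) = \sin(\pi(x-a)/(b-a))$, which vanishes at both ends, and the integral is approximated by the midpoint rule.

```python
import math


def path_length(dy, a, b, n=2000):
    """Midpoint-rule approximation of the integral of sqrt(1 + dy(x)^2) over [a, b]."""
    step = (b - a) / n
    return step * sum(
        math.sqrt(1.0 + dy(a + (i + 0.5) * step) ** 2) for i in range(n)
    )


def length_change(eps, a=0.0, b=1.0, A=0.0, B=1.0):
    """S[y + eps*h] - S[y] for the straight line y and h(x) = sin(pi (x-a)/(b-a))."""
    m = (B - A) / (b - a)  # slope of the straight line through the end points

    def dh(x):
        return math.pi / (b - a) * math.cos(math.pi * (x - a) / (b - a))

    perturbed = path_length(lambda x: m + eps * dh(x), a, b)
    straight = path_length(lambda x: m, a, b)
    return perturbed - straight


if __name__ == "__main__":
    for eps in (4e-3, 2e-3, 1e-3):
        print(eps, length_change(eps), length_change(eps) / eps**2)
```

If the path were not stationary the printed ratio would grow like $1/\epsilon$ as $\epsilon$ decreases; for the straight line it settles to a constant, and the change in length is positive for this particular $h$.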
Exercise 2.1
Use the above method on the functional
\[ S[y] = \int_0^1 dx\,\sqrt{1 + y'(x)}, \quad y(0) = 0, \quad y(1) = B > -1, \]
Figure 2.1 Diagram to illustrate the difference between local and global extrema.
It is clear from this example that to classify a point as a local extremum requires an
examination of the function values only in the neighbourhood of the point. Whereas,
determining whether a point is a global extremum requires examining all values of the
function; this type of analysis usually invokes special features of the function.
The local analysis of a stationary point of a function, $G(\mathbf{x})$, of $n$ variables proceeds by making a second-order Taylor expansion about a point $\mathbf{x} = \mathbf{a}$,
\[ G(\mathbf{a} + \epsilon\boldsymbol{\xi}) = G(\mathbf{a}) + \epsilon\sum_{k=1}^{n}\frac{\partial G}{\partial x_k}\xi_k + \frac{1}{2}\epsilon^2\sum_{k=1}^{n}\sum_{j=1}^{n}\frac{\partial^2G}{\partial x_k\partial x_j}\xi_k\xi_j + \cdots, \]
with $|\boldsymbol{\xi}| = 1$. The stationary point is a local maximum if this quadratic form is strictly negative. For large $n$ it is usually difficult to determine whether these inequalities are satisfied, although there are well defined tests which are described in chapter 7.
For a functional we proceed in the same way: the nature of a stationary path is usually determined by the second-order expansion. If $S[y]$ is stationary then, by definition,
\[ S[y + \epsilon h] - S[y] = \frac{1}{2}\Delta_2[y, h]\,\epsilon^2 + O(\epsilon^3) \]
for some quantity $\Delta_2[y, h]$, depending upon both $y$ and $h$; special cases of this expansion are found in exercises 2.2 and 2.3. Then $S[y]$ is a local minimum if $\Delta_2[y, h] > 0$ for all $h(x)$, and a local maximum if $\Delta_2[y, h] < 0$ for all $h(x)$. Normally it is difficult to establish these inequalities, and the general theory is described in chapter 7. For the functional defined by equation 2.1, however, the proof is straightforward; the following exercise guides you through it.
Exercise 2.2
(a) Use the binomial expansion, exercise 1.32 (page 36), to obtain the following expansion in $\epsilon$,
\[ \sqrt{1 + (\alpha + \epsilon\beta)^2} = \sqrt{1 + \alpha^2} + \frac{\alpha\beta\,\epsilon}{\sqrt{1 + \alpha^2}} + \frac{\beta^2\epsilon^2}{2(1 + \alpha^2)^{3/2}} + O(\epsilon^3). \]
(b) Use this result to show that if $y(x)$ is the straight line defined in equation 2.6 and $S[y]$ the functional 2.1, then
\[ S[y + \epsilon h] - S[y] = \frac{\epsilon^2}{2(1 + m^2)^{3/2}}\int_a^b dx\,h'(x)^2, \quad m = \frac{B-A}{b-a}. \]
Deduce that the straight line is a local minimum for the distance between $P_a$ and $P_b$.
Exercise 2.3
In this exercise the functional defined in exercise 2.1 is considered in more detail. By expanding the integrand of $S[y + \epsilon h]$ to second-order in $\epsilon$ show that, if $y(x)$ is the stationary path, then
\[ S[y + \epsilon h] = S[y] - \frac{\epsilon^2}{8(1 + B)^{3/2}}\int_0^1 dx\,h'(x)^2, \quad B > -1. \]
Deduce that the path $y(x) = Bx$, $B > -1$, is a local maximum of this functional.
Now we show that the straight line between the points (0, 0) and (a, A) gives a global
minimum of the functional, not just a local minimum. This analysis relies on a special
property of the integrand that follows from the Cauchy-Schwarz inequality.
Exercise 2.4
Use the Cauchy-Schwarz inequality (page 43) with $\mathbf{a} = (1, z)$ and $\mathbf{b} = (1, z + u)$ to show that
\[ \sqrt{1 + (z+u)^2}\,\sqrt{1 + z^2} \ge 1 + z^2 + zu. \]
The distance between the points $(0, 0)$ and $(a, A)$ along the path $y(x)$ is
\[ S[y] = \int_0^a dx\,\sqrt{1 + y'^2}, \quad y(0) = 0, \quad y(a) = A. \]
On using the inequality derived in the previous exercise in the form $\sqrt{1 + (z+u)^2} - \sqrt{1 + z^2} \ge zu/\sqrt{1 + z^2}$, with $z = y'(x)$ and $u = h'(x)$, we see that
\[ S[y + h] - S[y] \ge \int_0^a dx\,\frac{y'}{\sqrt{1 + y'^2}}\,h'. \]
But on the stationary path $y'$ is a constant and since $h(0) = h(a) = 0$ we have
\[ S[y + h] \ge S[y] \quad\text{for all } h(x). \]
This analysis did not assume that $|h|$ is small, and since all admissible paths can be expressed in the form $y(x) + h(x)$, we have shown that in the class of differentiable functions the straight line gives the global minimum of the functional.
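The pointwise inequality behind this global result, $\sqrt{1+(z+u)^2} - \sqrt{1+z^2} \ge zu/\sqrt{1+z^2}$, holds for all real $z$ and $u$, not only small $u$. A brute-force scan over a grid gives a quick sanity check; the grid size and function names below are arbitrary illustrative choices.

```python
import math


def gap(z, u):
    """sqrt(1 + (z+u)^2) - sqrt(1 + z^2) - z*u/sqrt(1 + z^2); expected to be >= 0."""
    root = math.sqrt(1.0 + z * z)
    return math.sqrt(1.0 + (z + u) ** 2) - root - z * u / root


def smallest_gap(limit=10.0, steps=201):
    """Minimum of gap(z, u) over a uniform grid on [-limit, limit] x [-limit, limit]."""
    grid = [-limit + 2.0 * limit * i / (steps - 1) for i in range(steps)]
    return min(gap(z, u) for z in grid for u in grid)


if __name__ == "__main__":
    print(smallest_gap())
```

The minimum is attained on the line $u = 0$, where the two sides coincide; this corresponds to $h'(x) = 0$, that is, to the unperturbed straight line.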
An observation
Problems involving shortest distances on surfaces other than a plane illustrate other
features of variational problems. Thus if we replace the plane by the surface of a sphere
then the shortest distance between two points on the surface is the arc length of a great circle joining the two points, that is, the circle created by the intersection of the spherical surface and the plane passing through the two points and the centre of the sphere; this problem is examined in exercise 4.20 (page 184). Now, for most points,
there are two stationary paths corresponding to the long and the short arcs of the great
circle. However, if the points are at opposite ends of a diameter, there are infinitely
many shortest paths. This example shows that solutions to variational problems may
be complicated.
In general, the stationary paths between two points on a surface are named geodesics¹.
For a plane surface the only geodesics are straight lines; for a sphere, most pairs of points
are joined by just two geodesics that are the segments of the great circle through the
points. For other surfaces there may be several stationary paths: an example of the
consequences of such complications is described next.
Figure 2.2 Diagram showing how an intervening galaxy can sufficiently distort a path of light from a bright object, such as a quasar, to provide two stationary paths and hence two images. Many examples of such multiple images, and more complicated but similar optical effects, have now been observed. Usually there are more than two stationary paths.
\[ \frac{dF}{dz} = \frac{z}{\sqrt{1 + z^2}} \quad\text{and}\quad \frac{dF}{dy'} = \frac{y'(x)}{\sqrt{1 + y'(x)^2}}. \]
Exercise 2.5
Find the expressions for $dF/dy'$ when
(a) $F(y') = (1 + y'^2)^{1/4}$, (b) $F(y') = \sin y'$, (c) $F(y') = \exp(y')$.
The functional $S[y]$ is stationary if the term $O(\epsilon)$ is zero for all suitable functions $h(x)$.
As before we give a sufficient condition, deferring the proof that it is also necessary. In
this analysis it is important to remember that F (z) is a given function and that y(x)
is an unknown function that we need to find. Observe that if
d
F (y 0 ) = = constant (2.12)
dy 0
then
In general equation 2.12 is true only if $y'(x)$ is also constant, and hence
\[ y(x) = mx + c \quad\text{and therefore}\quad y(x) = \frac{B-A}{b-a}\,x + \frac{Ab - Ba}{b-a}, \]
the last result following from the boundary conditions $y(a) = A$ and $y(b) = B$.
This is the same solution as given in equation 2.6. Thus, for this class of functional,
the stationary function is always a straight line, independent of the form of the inte-
grand, although its nature can sometimes depend upon the boundary conditions, see
for instance exercise 2.18 (page 103).
The exceptional example is when F (z) is linear, in which case the value of S[y]
depends only upon the end points and not the values of y(x) in between, as shown in
the following exercise.
Exercise 2.6
If $F(z) = Cz + D$, where $C$ and $D$ are constants, by showing that the value of the functional $S[y] = \int_a^b dx\,F(y')$ is independent of the chosen path, deduce that equation 2.12 does not imply that $y'(x) = $ constant.
What is the effect of making either or both of $C$ and $D$ a function of $x$?
2.3. TWO GENERALISATIONS 87
where the integrand $F(x, y')$ depends explicitly upon the two variables $x$ and $y'$. The difference in the value of the functional along adjacent paths is
\[ S[y + \epsilon h] - S[y] = \int_a^b dx\,\Big[F(x, y' + \epsilon h') - F(x, y')\Big]. \qquad (2.14) \]
In this example $F(x, z)$ is a function of two variables and we require the expansion
\[ F(x, z + \epsilon u) = F(x, z) + \epsilon u\frac{\partial F}{\partial z} + O(\epsilon^2) \]
where Taylor's series for functions of two variables is used. Comparing this with the expression in equation 2.9 we see that the only difference is that the derivative with respect to $y'$ has been replaced by a partial derivative. As before, replacing $z$ by $y'(x)$ and $u$ by $h'(x)$, equation 2.14 becomes
\[ S[y + \epsilon h] - S[y] = \epsilon\int_a^b dx\,h'(x)\frac{\partial}{\partial y'}F(x, y') + O(\epsilon^2). \qquad (2.15) \]
If $y(x)$ is the stationary path it is necessary that
\[ \int_a^b dx\,h'(x)\frac{\partial}{\partial y'}F(x, y') = 0 \quad\text{for all } h(x). \]
As before a sufficient condition for this is that $F_{y'}(x, y') = $ constant, which gives the following differential equation for $y(x)$,
\[ \frac{\partial}{\partial y'}F(x, y') = c, \quad y(a) = A, \quad y(b) = B, \qquad (2.16) \]
where $c$ is a constant. This is the equivalent of equation 2.12, but now the explicit presence of $x$ in the equation means that $y'(x) = $ constant is not a solution.
Exercise 2.7
Consider the functional
\[ S[y] = \int_0^1 dx\,\sqrt{1 + x + y'^2}, \quad y(0) = A, \quad y(1) = B. \]
\[ y'(x) = c\sqrt{1 + x + y'(x)^2}, \]
2.4 Notation
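In the previous sections we used the notation $F(y')$ to denote a function of the derivative of $y(x)$ and proceeded to treat $y'$ as an independent variable, so that the expression $dF/dy'$ had the meaning defined in equation 2.10. This notation and its generalisation are very important in subsequent analysis; it is therefore essential that you are familiar with it and can use it.
Consider a function $F(x, u, v)$ of three variables, for instance $F = x\sqrt{u^2 + v^2}$, and assume that all necessary partial derivatives of $F(x, u, v)$ exist. If $y(x)$ is a function of $x$ we may form a function of $x$ with the substitutions $u \to y(x)$, $v \to y'(x)$, thus obtaining $F(x, y(x), y'(x))$.
Because $y$ depends upon $x$ we may also form the total derivative of $F(x, y, y')$ with respect to $x$ using the chain rule, equation 1.22 (page 29)
\[ \frac{dF}{dx} = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}y'(x) + \frac{\partial F}{\partial y'}y''(x). \qquad (2.17) \]
In the particular case $F(x, u, v) = x\sqrt{u^2 + v^2}$ these rules give
\[ \frac{\partial F}{\partial x} = \sqrt{y^2 + y'^2}, \quad \frac{\partial F}{\partial y} = \frac{xy}{\sqrt{y^2 + y'^2}}, \quad \frac{\partial F}{\partial y'} = \frac{xy'}{\sqrt{y^2 + y'^2}}. \]
\[ \frac{\partial^2F}{\partial y^2} = \left.\frac{\partial^2F}{\partial u^2}\right|_{u=y,\,v=y'}, \quad \frac{\partial^2F}{\partial y'^2} = \left.\frac{\partial^2F}{\partial v^2}\right|_{u=y,\,v=y'} \quad\text{and}\quad \frac{\partial^2F}{\partial y\,\partial y'} = \left.\frac{\partial^2F}{\partial u\,\partial v}\right|_{u=y,\,v=y'}. \]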
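The distinction between the partial derivatives of $F(x, u, v)$ and the total derivative along a particular curve can be made concrete with a small numerical experiment. The sketch below uses the example $F = x\sqrt{u^2 + v^2}$ from the text; the curve $y(x) = \sin 2x$ and the function names are assumptions introduced only for illustration.

```python
import math


def F(x, u, v):
    """The example function F(x, u, v) = x * sqrt(u^2 + v^2)."""
    return x * math.sqrt(u * u + v * v)


# A hypothetical curve y(x) = sin(2x) with its first two derivatives.
def y(x):
    return math.sin(2.0 * x)


def dy(x):
    return 2.0 * math.cos(2.0 * x)


def d2y(x):
    return -4.0 * math.sin(2.0 * x)


def total_derivative_chain_rule(x):
    """dF/dx = F_x + F_y * y' + F_y' * y'' with the partial derivatives of the example."""
    u, v = y(x), dy(x)
    root = math.sqrt(u * u + v * v)
    F_x = root
    F_y = x * u / root
    F_yp = x * v / root
    return F_x + F_y * dy(x) + F_yp * d2y(x)


def total_derivative_numeric(x, step=1e-5):
    """Central difference of the composite function x -> F(x, y(x), y'(x))."""
    def g(t):
        return F(t, y(t), dy(t))
    return (g(x + step) - g(x - step)) / (2.0 * step)


if __name__ == "__main__":
    print(total_derivative_chain_rule(0.7), total_derivative_numeric(0.7))
```

The two printed numbers should agree to within the accuracy of the finite difference, which confirms that the partial derivatives are taken with $y$ and $y'$ held fixed, and that the dependence of $y$ on $x$ enters only through the chain rule.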
Because you must be able to use this notation we suggest that you do all the following
exercises before proceeding.
Exercise 2.8
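If $F(x, y') = \sqrt{x^2 + y'^2}$ find $\dfrac{\partial F}{\partial x}$, $\dfrac{\partial F}{\partial y}$, $\dfrac{\partial F}{\partial y'}$, $\dfrac{dF}{dx}$ and $\dfrac{d}{dx}\left(\dfrac{\partial F}{\partial y'}\right)$. Also, show that,
\[ \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = \frac{\partial}{\partial y'}\left(\frac{dF}{dx}\right). \]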
Exercise 2.9
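Show that for an arbitrary differentiable function $F(x, y, y')$
\[ \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = \frac{\partial^2F}{\partial y'^2}\,y'' + \frac{\partial^2F}{\partial y\,\partial y'}\,y' + \frac{\partial^2F}{\partial x\,\partial y'}. \]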
Exercise 2.10
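Use the first identity found in exercise 2.9 to show that the equation
\[ \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = 0 \]
is equivalent to the second-order differential equation
\[ \frac{\partial^2F}{\partial y'^2}\,y'' + \frac{\partial^2F}{\partial y\,\partial y'}\,y' + \frac{\partial^2F}{\partial x\,\partial y'} - \frac{\partial F}{\partial y} = 0. \]
Note the first equation will later be seen as crucial to the general theory described in chapter 3. The fact that it is a second-order differential equation means that unique solutions can be obtained only if two initial or two boundary conditions are given. Note also that the coefficient of $y''(x)$, $\partial^2F/\partial y'^2$, is very important in the general theory of the existence of solutions of this type of equation.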
Exercise 2.11
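(a) If $F(y, y') = y\sqrt{1 + y'^2}$ find $\dfrac{\partial F}{\partial y}$, $\dfrac{\partial F}{\partial y'}$, $\dfrac{\partial^2F}{\partial y'^2}$ and show that the equation
\[ \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = 0 \quad\text{becomes}\quad y\frac{d^2y}{dx^2} - \left(\frac{dy}{dx}\right)^2 - 1 = 0 \]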
for some constants A and B. Hint, let y be the independent variable and define a
new variable z by the equation yz(y) = dy/dx to obtain an expression for dy/dx
that can be integrated.
Figure 2.3 The curved line joining $P_a$ to $P_b$ is a segment of a cycloid. In this diagram the axes are chosen to give a = A = 0.
The name given to this curve is the brachistochrone, from the Greek, brachistos, shortest,
and chronos, time.
If the $y$-axis is vertical it can be shown that the time taken along the curve $y(x)$ is
\[ T[y] = \int_a^b dx\,\sqrt{\frac{1 + y'^2}{C - 2gy}}, \quad y(a) = A, \quad y(b) = B, \]
where $g$ is the acceleration due to gravity and $C$ a constant depending upon the initial speed of the particle. This expression is derived in section 4.2.
2.5. EXAMPLES OF FUNCTIONALS 91
This problem was first considered by Galileo (1564–1642) in his 1638 work Two New Sciences, but lacking the necessary mathematical methods he concluded, erroneously, that the solution is the arc of a circle passing vertically through $P_a$; exercise 4.4 (page 166) gives part of the reason for this error.
It was John Bernoulli (1667–1748), however, who made the problem famous when in June 1696 he challenged the mathematical world to solve it. He followed his statement of the problem by a paragraph reassuring readers that the problem was very useful in mechanics, that it is not the straight line through $P_a$ and $P_b$ and that the curve is well known to geometers. He also stated that he would show that this is so at the end of the year provided no one else had.
In December 1696 Bernoulli extended the time limit to Easter 1697, though by this time he was in possession of Leibniz's solution, sent in a letter dated 16th June 1696, Leibniz having received notification of the problem on 9th June. Newton also solved the problem quickly: apparently² the letter from Bernoulli arrived at Newton's house, in London, on 29th January 1697 at the time when Newton was Warden of the Mint. He returned from the Mint at 4pm, set to work on the problems and had solved it by the early hours of the next morning. The solution was returned anonymously, to no avail, with Bernoulli stating upon receipt "The lion is recognised by his paw." Further details of this history and details of these solutions may be found in Goldstine (1980, chapter 1).
The curve giving this shortest time is a segment of a cycloid, which is the curve traced
out by a point fixed on the circumference of a vertical circle rolling, without slipping,
along a straight line. The parametric equations of the cycloid shown in figure 2.3 are
where a is the radius of the circle: these equations are derived in section 4.2.1, where
other properties of the cycloid are discussed.
Other historically important names are the isochronous curve and the tautochrone.
A tautochrone is a curve such that a particle travelling along it under gravity reaches
a fixed point in a time independent of its starting point; a cycloid is a tautochrone
and a brachistochrone. Isochronal means equal times so isochronous curves and
tautochrones are the same.
There are many variations of the brachistochrone problem. Euler³ considered the effect of resistance proportional to $v^{2n}$, where $v$ is the speed and $n$ an integer. The problem of a wire with friction, however, was not considered until 1975⁴. Both these extensions require the use of Lagrange multipliers and are described in chapter 10. Another variation was introduced by Lagrange⁵ who allowed the end point, $P_b$ in figure 2.3, to lie on a given surface and this introduces different boundary conditions that the cycloid needs to satisfy: the simpler variant in which the motion remains in the plane and one or both end points lie on given curves is treated in chapter 9.
2 This anecdote is from the records of Catherine Conduitt, née Barton, Newton's niece who acted as his housekeeper in London, see Newton's Apple by P Aughton, (Weidenfeld and Nicolson), page 201.
3 Chapter 3 of his 1744 opus, The Method of Finding Plane Curves that Show Some Property of Maximum or Minimum. . . .
4 Ashby A, Brittin W E, Love W F and Wyss W, Brachistochrone with Coulomb Friction, Amer J Physics 43 902-5.
5 Essay on a new method. . . , published in Vol II of the Miscellanea Taurinensai, the memoirs of
and we shall see that this problem has solutions that can be expressed in terms of
differentiable functions only for certain combinations of $A$, $B$ and $b - a$.
ural Philosophy.
7 Smith G E Fluid Resistance: Why Did Newton Change His Mind?, in The Foundations of New-
tonian Scholarship.
8 Note that this suggests that the 30 C change in temperature between summer and winter changes
imagine the fluid as comprising many particles, each of mass $m$ and all stationary. If there are $N$ particles per unit volume, the density is $\rho = mN$. In the small time $\delta t$ the area $\delta A$ sweeps through a volume $v\,\delta t\,\delta A$, so $Nv\,\delta t\,\delta A$ particles collide with the area, as shown schematically on the left-hand side of figure 2.5.
Figure 2.5 Diagram showing the motion of a small area, $\delta A$, through a rarefied gas. On the left-hand side the normal to the area is perpendicular to the relative velocity; on the right-hand side the area is at an angle. The direction of the arrows is in the direction of the gas velocity relative to the area.
For an elastic collision between a very large mass (that of which $\delta A$ is the small surface element) with velocity $v$, and a small initially stationary mass, $m$, the momentum change of the light particle is $2mv$; you may check this by doing exercise 2.23, although this is not part of the course. Thus in a time $\delta t$ the total momentum transfer is in the opposite direction to $v$, $\Delta P = (-2mv) \times (Nv\,\delta t\,\delta A)$. Newton's law equates force with the rate of change of momentum, so the force on the area opposing the motion is, since $\rho = mN$,
\[ \delta F = \frac{\Delta P}{\delta t} = -2\rho v^2\,\delta A. \qquad (2.19) \]
Equation 2.19 is a justification for the $v^2$-law. If the normal, $ON$, to the area $\delta A$ is at an angle $\psi$ to the velocity, as in the right-hand side of figure 2.5, where the arrows denote the fluid velocity relative to the body, then the formula 2.19 is modified in two ways. First, the significant area is the projection of $\delta A$ onto $v$, so $\delta A \to \delta A\cos\psi$. Second, the fluid particles are elastically scattered through an angle $2\psi$ (because the angle of incidence equals the angle of reflection), so the momentum transfer along the direction of travel is $v(1 + \cos 2\psi) = 2v\cos^2\psi$: hence $2v \to 2v\cos^2\psi$, and the force in the direction $(-v)$ is $\delta F = 2\rho v^2\cos^3\psi\,\delta A$. We now apply this formula to find the force on a surface of revolution. We define $Oy$ to be the axis: consider a segment $CD$ of the curve in the $Oxy$-plane, with normal $PN$ at an angle $\psi$ to $Oy$, as shown in the left-hand panel of figure 2.6.
Figure 2.6 Diagram showing change in velocity of a particle colliding with the element CD, on the left, and the whole curve which is rotated about the y-axis, on the right.
The force on the ring formed by rotating the segment CD about Oy is, because of axial
symmetry, in the y-direction. The area of the ring is $2\pi x\,\delta s$, where $\delta s$ is the length of
the element CD, so the magnitude of the force opposing the motion is
$$\delta F = 2\pi x\,\delta s \times 2\rho v^2\cos^3\psi.$$
The total force on the curve in figure 2.6 is obtained by integrating from x = 0 to x = b,
and is given by the functional,
$$F[y] = 4\pi\rho v^2\int_{x=0}^{x=b} ds\; x\cos^3\psi, \quad y(0) = A,\; y(b) = 0. \qquad (2.20)$$
For a disc of area $A_f$, $y'(x) = 0$, and this reduces to $F = 2A_f\rho v^2$, giving a drag
coefficient $C_D = 4$, which compares with the measured value of about 1.3. Newton's
problem is to find the path making this functional a minimum and this is solved in
section 9.6.
Exercise 2.12
Use the definition of the drag coefficient, equation 2.18, to show that, according
to the theory described here,
$$C_D = \frac{8}{b^2}\int_0^b dx\,\frac{x}{1 + y'^2}.$$
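The formula of exercise 2.12 is easy to evaluate numerically. The following is a minimal sketch, not part of the course text: the function name `drag_coefficient`, the use of Simpson's rule and the cone profile $y(x) = A(1 - x/b)$ are illustrative assumptions. It reproduces the value $C_D = 4$ quoted above for a flat disc and shows that a cone with $A = b$ halves this.

```python
def drag_coefficient(slope, b, n=2000):
    """Approximate C_D = (8/b^2) * int_0^b x / (1 + y'(x)^2) dx with Simpson's rule.

    slope: callable returning y'(x) (an assumed profile, supplied by the caller)
    b:     outer radius of the body of revolution
    n:     number of Simpson panels (made even if necessary)
    """
    if n % 2:
        n += 1
    h = b / n

    def integrand(x):
        return x / (1.0 + slope(x) ** 2)

    total = integrand(0.0) + integrand(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return 8.0 / b**2 * total * h / 3.0


# Flat disc: y'(x) = 0 everywhere.
cd_disc = drag_coefficient(lambda x: 0.0, b=1.0)

# Hypothetical cone of height A: y(x) = A (1 - x/b), so y'(x) = -A/b.
A, b = 1.0, 1.0
cd_cone = drag_coefficient(lambda x: -A / b, b=b)

print(cd_disc, cd_cone)
```

For the cone the integral can be done by hand, giving $C_D = 4/(1 + A^2/b^2)$, which is a useful check on the quadrature.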
Variations of this problem were considered by Newton: one is the curve CBD, shown
in figure 2.7, rotated about Oy.
Figure 2.7 Diagram showing the modified geometry considered by Newton.
Here the variable a is an unknown, the line CB is parallel to the x-axis and
the coordinates of C are (0, A).
In this problem the position D is fixed, but the position of B is not; it is merely
constrained to be on the line y = A, parallel to Ox. The resisting force is now given by
the functional
$$\frac{F_1[y]}{4\pi\rho v^2} = \frac{1}{2}a^2 + \int_a^b dx\,\frac{x}{1 + y'^2}, \quad y(a) = A,\; y(b) = 0. \qquad (2.22)$$
Now the path y(x) and the number a are to be chosen to make the functional stationary.
Problems such as this, where the position of one (or both) of the end points is
also to be determined, are known as variable end point problems and are dealt with in
chapter 9.
Figure 2.8 Diagram showing the area, $S[y]$, under a
curve of given length joining $P_a$ to $P_b$.
This is a classic problem discussed by Pappus of Alexandria in about 300 AD. Pappus
showed, in Book V of his collection, that of two regular polygons having equal perimeters
the one with the greater number of sides has the greater area. In the same book he
demonstrates that for a given perimeter the circle has a greater area than does any
regular polygon. This work seems to follow closely the earlier work of Zenodorus (circa
180 BC): extant fragments of his work include a proposition that of all solid figures, the
surface areas of which are equal, the sphere has the greatest volume.
Returning to figure 2.8, a modern analytic treatment of the problem requires a
differentiable function y(x) satisfying y(a) = A, y(b) = B, such that the area,
$$S[y] = \int_a^b dx\; y,$$
is largest when the length of the curve,
$$L[y] = \int_a^b dx\,\sqrt{1 + y'^2},$$
is given.
Figure 2.9 Diagram showing the catenary formed by
a uniform chain hanging between two points, $(-a, A)$ and $(a, A)$, at the
same height.
If the lowest point of the chain is taken as the origin, the catenary equation is shown
in section 11.2.3 to be
$$y = c\left(\cosh\frac{x}{c} - 1\right) \qquad (2.23)$$
for some constant c determined by the length of the chain and the value of a.
If a curve is described by a differentiable function y(x) it can be shown, see
exercise 2.19, that the potential energy E of the chain is proportional to the functional
$$S[y] = \int_{-a}^{a} dx\; y\sqrt{1 + y'^2}.$$
The curve that minimises this functional, subject to the length of the chain
$L[y] = \int_{-a}^{a} dx\,\sqrt{1 + y'^2}$ remaining constant, is the shape assumed by the hanging chain. In
common with the previous example, the catenary problem involves a constraint, again
the length of the chain, and is dealt with using the methods described in chapter 11.
at R to the observer at O. The plane of the mirror is perpendicular to the page and it
is assumed that the plane SRO is in the page.
[Figure 2.10: diagram with the labels S, $\theta_1$, $\theta_2$, $h_1$, $h_2$, A, R and B.]
It is known that light travels in straight lines and is reflected from the mirror at a
point R as shown in the diagram. But without further information the position of R is
unknown. Observations, however, show that the angle of incidence, $\theta_1$, and the angle
of reflection, $\theta_2$, are equal. This law of reflection was known to Euclid (circa 300 BC)
and Aristotle (384–322 BC); but it was Hero of Alexandria (circa 125 BC) who showed
by geometric argument that the equality of the angles of incidence and reflection is a
consequence of the Aristotelean principle that nature does nothing the hard way; that
is, if light is to travel from the source S to the observer O via a reflection in the mirror
then it travels along the shortest path.
This result was generalised by the French mathematician Fermat (1601–1665) into
what is now known as Fermat's principle which states that the path taken by light rays
is that which minimises the time of passage¹¹. For the mirror, because the speed along
SR and RO is the same this means that the distance along SR plus RO is a minimum.
If AB = d and AR = x, the total distance travelled by the light ray depends only upon
x and is
$$f(x) = \sqrt{x^2 + h_1^2} + \sqrt{(d - x)^2 + h_2^2}.$$
This function has a minimum when $\theta_1 = \theta_2$, that is when the angle of incidence, $\theta_1$,
equals the angle of reflection, $\theta_2$, see exercise 2.14.
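The statement that $f(x)$ is smallest when the two angles are equal can be checked numerically. The sketch below is an illustration only: the golden-section search, the function names and the values $d = 3$, $h_1 = 1$, $h_2 = 2$ are assumptions chosen for the example, not part of the text. It locates the minimum of $f$ and compares $\sin\theta_1 = x/\sqrt{x^2 + h_1^2}$ with $\sin\theta_2 = (d - x)/\sqrt{(d - x)^2 + h_2^2}$.

```python
import math


def path_length(x, d, h1, h2):
    """Total distance SR + RO when the reflection point R is a distance x from A."""
    return math.hypot(x, h1) + math.hypot(d - x, h2)


def golden_minimise(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimiser of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        e = lo + inv_phi * (hi - lo)
        if f(c) < f(e):
            hi = e   # the minimum lies in [lo, e]
        else:
            lo = c   # the minimum lies in [c, hi]
    return 0.5 * (lo + hi)


# Hypothetical geometry chosen for illustration.
d, h1, h2 = 3.0, 1.0, 2.0
x_min = golden_minimise(lambda x: path_length(x, d, h1, h2), 0.0, d)

sin1 = x_min / math.hypot(x_min, h1)
sin2 = (d - x_min) / math.hypot(d - x_min, h2)
print(x_min, sin1, sin2)
```

For these values the equal-angle condition gives $x = d h_1/(h_1 + h_2) = 1$, so the search should return a point very close to 1 with the two sines agreeing.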
In general, for light moving in the Oxy-plane, in a medium with refractive index
n(x, y), with the source at the origin and observer at (a, A) the time of passage, T,
along an arbitrary path y(x) joining these points is
$$T[y] = \frac{1}{c}\int_0^a dx\; n(x, y)\sqrt{1 + y'^2}, \quad y(0) = 0,\; y(a) = A.$$
This follows because the time taken to travel along an element of length $\delta s$ is $n(x, y)\,\delta s/c$
and $\delta s = \sqrt{1 + y'(x)^2}\,\delta x$. If the refractive index, n(x, y), is constant then this integral
reduces to the integral 2.1 and the path of a ray is a straight line, as would be expected.
11 Fermat's original statement was that light travelling between two points seeks a path such that the
number of waves is equal, as a first approximation, to that in a neighbouring path. This formulation
has the form of a variational principle, which is remarkable because Fermat announced this result in
1658, before the calculus of either Newton or Leibniz was developed.
Fermat's principle can be used to show that for light reflected at a mirror the angle
of incidence equals the angle of reflection. For light crossing the boundary between two
media it gives Snell's law,
$$\frac{\sin\theta_1}{\sin\theta_2} = \frac{c_1}{c_2},$$
where $\theta_1$ and $\theta_2$ are the angles between the ray and the normal to the boundary and
$c_k$ is the speed of light in the media, as shown in figure 2.11: in water the speed of light
is approximately $c_2 = c_1/1.3$, where $c_1$ is the speed of light in air, so $1.3\sin\theta_2 = \sin\theta_1$.
Figure 2.11 Diagram showing the refraction of light at the surface of water.
The angles of incidence and refraction are defined to be $\theta_2$ and $\theta_1$
respectively; these are connected by Snell's law.
In figure 2.11 the observer at O sees an object S in a pond and the light ray from S
to O travels along the two straight lines SN and NO, but the observer perceives the
object to be at $S'$, on the straight line $OS'$. This explains why a stick put partly into
water appears bent.
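Snell's law can be recovered numerically from Fermat's principle by minimising the travel time over the point N at which the ray crosses the surface. The following is a sketch under stated assumptions: the coordinates (object at depth 1 below the surface, observer at height 1 and a horizontal distance 2 away), the function names and the ternary search are all illustrative choices, and the speeds are in units where $c_1 = 1$.

```python
import math


def travel_time(x, d, depth, height, c_water, c_air):
    """Time from S = (0, -depth) in water to O = (d, height) in air via N = (x, 0)."""
    return math.hypot(x, depth) / c_water + math.hypot(d - x, height) / c_air


def ternary_minimise(f, lo, hi, tol=1e-9):
    """Ternary search for the minimiser of a unimodal function on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


c_air = 1.0
c_water = c_air / 1.3            # the value quoted in the text
d, depth, height = 2.0, 1.0, 1.0  # hypothetical geometry

x_n = ternary_minimise(
    lambda x: travel_time(x, d, depth, height, c_water, c_air), 0.0, d
)

sin_water = x_n / math.hypot(x_n, depth)             # sin(theta_2), measured in the water
sin_air = (d - x_n) / math.hypot(d - x_n, height)    # sin(theta_1), measured in the air
snell_ratio = sin_air / sin_water
print(snell_ratio)
```

The printed ratio should be close to $c_1/c_2 = 1.3$, independently of the particular geometry chosen.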
of variational principles is discussed in chapter 5 and this allows easier use of more
general coordinate systems.
The next major step was taken by Hamilton (1805–1865), in 1834, who cast
Lagrange's equations as a variational principle; confusingly, we now name this Lagrange's
variational principle. Hamilton also generalised this theory to lay the foundations for
the development of modern physics that occurred in the early part of the 20th century.
These developments are important because they provide a coordinate-free formulation
of dynamics which emphasises the underlying mathematical structure of the equations
of motion, which is important in helping to understand how solutions behave.
Summary
These few examples provide some idea of the significance of variational principles. In
summary, they are important for three distinct reasons:
• A variational principle is often the easiest or the only method of formulating a
  problem.
• Often conventional boundary value problems may be re-formulated in terms of a
  variational principle which provides a powerful tool for approximating solutions.
  This technique is introduced in chapter 12.
• A variational formulation provides a coordinate-free method of expressing the
  laws of dynamics, allowing powerful analytic techniques to be used in ordinary
  Newtonian dynamics. The use of variational principles also paved the way for
  the formulation of dynamical laws describing motion of objects moving at speeds
  close to that of light (special relativity), particles interacting through
  gravitational forces (general relativity) and the laws of the microscopic world (quantum
  mechanics).
Find the values of these functionals for the functions $y(x) = x^2$ and $y(x) = \cos x$
when a(x) = x and b(x) = 1.
Exercise 2.14
Show that the function
$$f(x) = \sqrt{x^2 + h_1^2} + \sqrt{(d - x)^2 + h_2^2},$$
where $h_1$, $h_2$ are defined in figure 2.10 (page 99) and x and d denote the lengths
AR and AB respectively, is stationary when $\theta_1 = \theta_2$ where
$$\sin\theta_1 = \frac{x}{\sqrt{x^2 + h_1^2}}, \quad \sin\theta_2 = \frac{d - x}{\sqrt{(d - x)^2 + h_2^2}}.$$
Exercise 2.15
Consider the functional
$$S[y] = \int_0^1 dx\; y'\sqrt{1 + y'}, \quad y(0) = 0,\; y(1) = B > -1.$$
Exercise 2.16
Using the method described in the text, show that the functionals
$$S_1[y] = \int_a^b dx\,(1 + xy')\,y' \quad\text{and}\quad S_2[y] = \int_a^b dx\; xy'^2,$$
where b > a > 0, y(b) = B and y(a) = A, are both stationary on the same curve,
namely
$$y(x) = A + (B - A)\frac{\ln(x/a)}{\ln(b/a)}.$$
Explain why the same function makes both functionals stationary.
Exercise 2.17
In this exercise the theory developed in section 2.3.1 is extended. The function
F(z) has a continuous second derivative and the functional S is defined by the
integral
$$S[y] = \int_a^b dx\; F(y').$$
(a) Show that
$$S[y + \epsilon h] - S[y] = \epsilon\int_a^b dx\,\frac{dF}{dy'}h'(x) + \frac{1}{2}\epsilon^2\int_a^b dx\,\frac{d^2F}{dy'^2}h'(x)^2 + O(\epsilon^3),$$
where h(a) = h(b) = 0.
(b) Show that if y(x) is chosen to make $dF/dy'$ constant then the functional is
stationary.
(c) Deduce that this stationary path makes the functional either a maximum or a
minimum, provided $F''(y') \neq 0$.
Exercise 2.18
Show that the functional
$$S[y] = \int_0^1 dx\,\left(1 + y'(x)^2\right)^{1/4}, \quad y(0) = 0,\; y(1) = B,$$
Harder exercises
Exercise 2.19
If a uniform, flexible, inextensible chain of length L is suspended between two
supports having the coordinates (a, A) and (b, B), with the y-axis pointing
vertically upwards, show that, if the shape assumed by the chain is described by the
differentiable function y(x), then its length is given by $L[y] = \int_a^b dx\,\sqrt{1 + y'^2}$ and
its potential energy by
$$E[y] = g\rho\int_a^b dx\; y\sqrt{1 + y'^2}, \quad y(a) = A,\; y(b) = B,$$
where $\rho$ is the line-density of the chain and g the acceleration due to gravity.
Exercise 2.20
This question is about the shortest distance between two points on the surface of a
right-circular cylinder, so is a generalisation of the theory developed in section 2.2.
(a) If the cylinder axis coincides with the z-axis we may use the polar coordinates
$(\rho, \phi, z)$ to label points on the cylindrical surface, where $\rho$ is the cylinder radius.
Show that the Cartesian coordinates of a point (x, y) are given by $x = \rho\cos\phi$,
$y = \rho\sin\phi$ and hence that the distance between two adjacent points on the cylinder,
$(\rho, \phi, z)$ and $(\rho, \phi + \delta\phi, z + \delta z)$ is, to first-order, given by $\delta s^2 = \rho^2\delta\phi^2 + \delta z^2$.
(b) A curve on the surface may be defined by prescribing z as a function of $\phi$.
Show that the length of a curve from $\phi = \phi_1$ to $\phi_2$ is
$$L[z] = \int_{\phi_1}^{\phi_2} d\phi\,\sqrt{\rho^2 + z'(\phi)^2}.$$
(c) Deduce that the shortest distance on the cylinder between the two points
$(\rho, 0, 0)$ and $(\rho, \alpha, \zeta)$ is along the curve $z = \zeta\phi/\alpha$.
Exercise 2.21
An inverted cone has its apex at the origin and axis along the z-axis. Let $\beta$ be
the angle between this axis and the sides of the cone, and define a point on the
conical surface by the coordinates $(\rho, \phi)$, where $\rho$ is the perpendicular distance to
the z-axis and $\phi$ is the polar angle measured from the x-axis.
Show that the distance on the cone between adjacent points $(\rho, \phi)$ and
$(\rho + \delta\rho, \phi + \delta\phi)$ is, to first-order,
$$\delta s^2 = \rho^2\delta\phi^2 + \frac{\delta\rho^2}{\sin^2\beta}.$$
Hence show that if $\rho(\phi)$, $\phi_1 \le \phi \le \phi_2$, is a curve on the conical surface then its
length is
$$L[\rho] = \int_{\phi_1}^{\phi_2} d\phi\,\sqrt{\rho^2 + \frac{\rho'^2}{\sin^2\beta}}.$$
Exercise 2.22
A straight river of uniform width a flows with velocity (0, v(x)), where the axes
are chosen so the left-hand bank is the y-axis and where v(x) > 0. A boat can
travel with constant speed c > max(v(x)) relative to still water. If the starting
and landing points are chosen to be the origin and (b, B), respectively, show that
the path giving the shortest time of crossing is given by minimising the functional
Exercise 2.23
In this exercise the basic dynamics required for the derivation of the minimum
resistance functional, equation 2.21, is derived. This exercise is optional, because it
requires knowledge of elementary mechanics which is not part of, or a prerequisite
of, this course.
Consider a block of mass M sliding smoothly on a plane, the cross section of which
is shown in figure 2.12.
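[Figure 2.12: the block of mass M and the particle of mass m, shown before the
collision (speeds V, v) and after the collision (speeds $V'$, $v'$).]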
The block is moving from left to right, with speed V, towards a small particle of
mass m moving with speed v, such that initially the distance between the particle
and the block is decreasing. Suppose that after the inevitable collision the block
is moving with speed $V'$, in the same direction, and the particle is moving with
speed $v'$ to the right. Use conservation of energy and linear momentum to show
that $(V', v')$ are related to $(V, v)$ by the equations
$$MV^2 + mv^2 = MV'^2 + mv'^2 \quad\text{and}\quad MV - mv = MV' + mv'.$$
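The two conservation equations can be solved for $(V', v')$, and the result checked numerically. The sketch below uses the standard closed-form solution for a head-on elastic collision, written in the sign convention of this exercise; the function name `elastic_collision` and the numerical values are illustrative assumptions. It also illustrates the claim used in section 2.5 that a stationary light particle struck by a very heavy body gains momentum close to $2mV$.

```python
def elastic_collision(M, m, V, v):
    """Solve M V^2 + m v^2 = M V'^2 + m v'^2 and M V - m v = M V' + m v'.

    Before the collision the block (mass M) moves right with speed V and the
    particle (mass m) moves left with speed v. Returns (V', v'), the rightward
    velocities of block and particle after the collision.
    """
    V_after = ((M - m) * V - 2.0 * m * v) / (M + m)
    v_after = (2.0 * M * V + (M - m) * v) / (M + m)
    return V_after, v_after


# Illustrative values: check both conservation laws.
M, m, V, v = 3.0, 1.0, 2.0, 1.0
V_after, v_after = elastic_collision(M, m, V, v)
energy_before = M * V**2 + m * v**2
energy_after = M * V_after**2 + m * v_after**2
momentum_before = M * V - m * v
momentum_after = M * V_after + m * v_after

# Very heavy block hitting a stationary particle: the particle leaves with speed ~ 2V.
_, v_light = elastic_collision(1.0e9, 1.0, 2.0, 0.0)
print(V_after, v_after, v_light)
```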
The functional is stationary if the first-order term is zero for all h(x), otherwise
$S[y + \epsilon h] - S[y]$ would change sign with $\epsilon$. Using the result quoted in the text (after equation 2.5)
and proved in exercise 3.4 (page 124) this gives $\sqrt{1 + y'(x)} =$ constant, that is
$y'(x) =$ constant and $y(x) = \alpha x + \beta$. The boundary conditions then give $y = Bx$ for
the stationary path. With this value for y(x), the integrand is real if $B > -1$ and has
the value $S = \sqrt{1 + B}$.
Hence
$$\sqrt{1 + (\alpha + \beta)^2} = \sqrt{1 + \alpha^2} + \frac{\alpha\beta}{\sqrt{1 + \alpha^2}} + \frac{\beta^2}{2(1 + \alpha^2)^{3/2}} + O(\beta^3).$$
(b) With $\alpha = y'(x)$ and $\beta = \epsilon h'(x)$ we see, using the argument described in the text,
that the term $O(\epsilon)$ in the expansion of $S[y + \epsilon h] - S[y]$ is zero if $y'(x) =$ constant, hence
the straight line defined by equation 2.6 makes the functional stationary. With this
choice of y(x), $\alpha = m$ and the second term in the above expansion gives the result
quoted. The second-order term is positive for $\epsilon \neq 0$ and all h(x), so the functional has
a minimum along this line.
Because this term is always negative, for sufficiently small $|\epsilon|$ we have $S[y_s + \epsilon h] < S[y_s]$,
where $y_s(x) = Bx$ is the stationary path, which is therefore a local maximum.
and the first result follows. There is equality only if a = b, that is u = 0. Divide the
first inequality by $1 + z^2$ to derive the second result.
Since h(a) = h(b) = 0, $\Delta S = 0$ for any y(x). That is, there is no unique stationary path.
Alternatively, in this case the functional becomes
$$S[y] = \int_a^b dx\,\left(Cy'(x) + D\right) = C\left[y(b) - y(a)\right] + D(b - a).$$
This depends only upon C, D and the boundaries a and b: the value of the functional
is therefore independent of the chosen path.
If C and D depend upon x then
$$\Delta S = \int_a^b dx\; C(x)h'(x).$$
The same theory that leads to equation 2.12 shows that $\Delta S = 0$ for all h(x) if and
only if C(x) = constant, which is the case considered first. In either case there are no
stationary paths.
$$\frac{2}{3}a = \frac{B - A}{2^{3/2} - 1} \quad\text{and hence}\quad y(x) = A + \frac{(B - A)}{(2^{3/2} - 1)}\left[(1 + x)^{3/2} - 1\right].$$
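$$\frac{\partial F}{\partial y} = 0, \quad \frac{\partial F}{\partial x} = \frac{x}{\sqrt{x^2 + y'^2}} \quad\text{and}\quad \frac{\partial F}{\partial y'} = \frac{y'}{\sqrt{x^2 + y'^2}}$$
giving
$$\frac{dF}{dx} = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}y' + \frac{\partial F}{\partial y'}y'' = \frac{x + y'y''}{\sqrt{x^2 + y'^2}}.$$
Since F does not depend explicitly upon y, we have
$$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = \frac{\partial^2 F}{\partial y'^2}y'' + \frac{\partial^2 F}{\partial x\,\partial y'}$$
and
$$\frac{\partial^2 F}{\partial x\,\partial y'} = -\frac{xy'}{(x^2 + y'^2)^{3/2}}, \quad \frac{\partial^2 F}{\partial y'^2} = \frac{1}{(x^2 + y'^2)^{1/2}} - \frac{y'^2}{(x^2 + y'^2)^{3/2}} = \frac{x^2}{(x^2 + y'^2)^{3/2}}$$
which gives
$$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = \frac{x^2y''}{(x^2 + y'^2)^{3/2}} - \frac{xy'}{(x^2 + y'^2)^{3/2}} = \frac{x(xy'' - y')}{(x^2 + y'^2)^{3/2}} = \frac{x^3(y'/x)'}{(x^2 + y'^2)^{3/2}}.$$
Also
$$\frac{\partial}{\partial y'}\left(\frac{dF}{dx}\right) = \frac{y''}{\sqrt{x^2 + y'^2}} - \frac{(x + y'y'')y'}{(x^2 + y'^2)^{3/2}} = \frac{x(xy'' - y')}{(x^2 + y'^2)^{3/2}},$$
so, in this case, $\dfrac{d}{dx}\left(\dfrac{\partial F}{\partial y'}\right) = \dfrac{\partial}{\partial y'}\left(\dfrac{dF}{dx}\right)$.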
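$$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = \frac{\partial}{\partial y'}\left(\frac{\partial F}{\partial y'}\right)\frac{dy'}{dx} + \frac{\partial}{\partial y}\left(\frac{\partial F}{\partial y'}\right)\frac{dy}{dx} + \frac{\partial}{\partial x}\left(\frac{\partial F}{\partial y'}\right)
= \frac{\partial^2 F}{\partial y'^2}y'' + \frac{\partial^2 F}{\partial y'\,\partial y}y' + \frac{\partial^2 F}{\partial x\,\partial y'}$$
which gives the required expression and is the left-hand side of the inequality.
The right-hand side of the inequality is
$$\frac{\partial}{\partial y'}\left(\frac{dF}{dx}\right) = \frac{\partial}{\partial y'}\left(\frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}y' + \frac{\partial F}{\partial y'}y''\right)
= \frac{\partial^2 F}{\partial x\,\partial y'} + \frac{\partial F}{\partial y} + \frac{\partial^2 F}{\partial y\,\partial y'}y' + \frac{\partial^2 F}{\partial y'^2}y''$$
which differs from the left-hand side by the term $\partial F/\partial y$. Thus, only if F is independent
of y are the derivatives equal.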
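we obtain
$$z = \frac{yy''}{(1 + y'^2)^{3/2}} + \frac{y'^2}{(1 + y'^2)^{1/2}} - \left(1 + y'^2\right)^{1/2}
= \frac{1}{(1 + y'^2)^{3/2}}\left[yy'' + (1 + y'^2)y'^2 - (1 + y'^2)^2\right] = \frac{yy'' - y'^2 - 1}{(1 + y'^2)^{3/2}},$$
hence the equation z = 0 becomes $yy'' - 1 - y'^2 = 0$. But
$$\frac{d}{dx}\left(\frac{y'}{y}\right) = \frac{y''}{y} - \frac{y'^2}{y^2} \quad\text{giving}\quad yy'' - y'^2 = y^2\frac{d}{dx}\left(\frac{y'}{y}\right), \quad\text{if } y \neq 0,$$
and hence
$$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = \frac{1}{(1 + y'^2)^{3/2}}\left[y^2\frac{d}{dx}\left(\frac{y'}{y}\right) - 1\right].$$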
(b) If the left-hand side is zero we have
$$y^2\frac{d}{dx}\left(\frac{y'}{y}\right) = 1 \quad\text{or}\quad y^2y'\frac{d}{dy}\left(\frac{y'}{y}\right) = 1.$$
Now define $z = y'/y$ and consider z to be a function of y, so in the following $z' = dz/dy$;
note this is possible because x may be considered a function of y so $y'/y$ can be
expressed in terms of y. Now put the second equation in the form $y^3 z\,z'(y) = 1$, which
can be integrated directly to give $z^2 = C^2 - y^{-2}$, for some constant C. Hence, since
$z = y'/y$,
$$\frac{dy}{dx} = \sqrt{(Cy)^2 - 1} \quad\text{giving}\quad \int\frac{dy}{\sqrt{(Cy)^2 - 1}} = x + D.$$
Finally, set $Cy = \cosh\phi$
to give $\phi = C(x + D)$, that is $y = (1/C)\cosh(Cx + CD)$, which is the required solution,
if C = A and CD = B.
$$S[x^2] = \int_0^1 ds\int_0^1 dt\,\left(s^2 + st\right)s^2t^2 = \int_0^1 ds\left[\frac{1}{3}s^4t^3 + \frac{1}{4}s^3t^4\right]_{t=0}^{1}
= \int_0^1 ds\left(\frac{1}{3}s^4 + \frac{1}{4}s^3\right) = \frac{31}{240}.$$
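$$\sin\theta_1 = \frac{AR}{SR} = \frac{x}{\sqrt{x^2 + h_1^2}}, \quad\text{and}\quad \sin\theta_2 = \frac{RB}{RO} = \frac{d - x}{\sqrt{(d - x)^2 + h_2^2}},$$
where the distances are defined in figure 2.10 (page 99), we see that the distance travelled
by the light is stationary when $\sin\theta_1 = \sin\theta_2$, that is $\theta_1 = \theta_2$. Further since
$$f''(x) = \frac{h_1^2}{(x^2 + h_1^2)^{3/2}} + \frac{h_2^2}{((d - x)^2 + h_2^2)^{3/2}} > 0,$$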
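$$\sqrt{1 + \alpha + \beta} = \sqrt{1 + \alpha}\left[1 + \frac{\beta}{2(1 + \alpha)} - \frac{\beta^2}{8(1 + \alpha)^2} + O(\beta^3)\right],$$
and so
$$(\alpha + \beta)\sqrt{1 + \alpha + \beta} = \alpha\sqrt{1 + \alpha}\left[1 + \frac{\beta}{2(1 + \alpha)} - \frac{\beta^2}{8(1 + \alpha)^2}\right]
+ \beta\sqrt{1 + \alpha}\left[1 + \frac{\beta}{2(1 + \alpha)}\right] + \cdots,$$
$$= \alpha\sqrt{1 + \alpha} + \frac{\beta(2 + 3\alpha)}{2\sqrt{1 + \alpha}} + \frac{\beta^2(4 + 3\alpha)}{8(1 + \alpha)^{3/2}} + \cdots.$$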
If y(x) is a stationary path of S then the term $O(\epsilon)$ is zero. Since h(0) = h(1) = 0 it
follows, as in the text, that $y'(x) =$ constant is a possible solution. Since y(0) = 0 and
y(1) = B this gives y(x) = Bx and $S[y] = B\sqrt{1 + B}$.
Alternatively, using equation 2.12 (page 86), with $F(y') = y'\sqrt{1 + y'}$, we see that
the stationary path is given by $F'(y') =$ constant and hence $y' =$ constant, that is
y = mx + c: since y(0) = 0 and y(1) = B this gives y(x) = Bx.
(b) On substituting Bx for y(x) we see that $S[y + \epsilon h] - S[y]$ takes the value,
$$S[y + \epsilon h] - S[y] = \frac{\epsilon^2(4 + 3B)}{8(1 + B)^{3/2}}\int_0^1 dx\; h'(x)^2 + O(\epsilon^3).$$
Then, provided $B > -1$, this difference is positive and the functional is a minimum on the
stationary path.
$A = d$ and $B = d + c\ln(b/a)$ and hence $y(x) = A + (B - A)\dfrac{\ln(x/a)}{\ln(b/a)}$.
then, provided F(z) is not a constant or a linear function of z, $y'(x)$ is also a constant.
(c) On the stationary path $y'(x)$ is a constant and hence $d^2F/dy'^2$ is constant and
$$S[y + \epsilon h] - S[y] = \frac{1}{2}\epsilon^2\frac{d^2F}{dy'^2}\int_a^b dx\; h'(x)^2 + O(\epsilon^3).$$
(b) The length along a curve is just the sum of the small elements which in the limit
$\delta\phi \to 0$ becomes the integral $L[z] = \int_{\phi_1}^{\phi_2} d\phi\,\sqrt{\rho^2 + z'(\phi)^2}$.
(c) The functional L[z] is the same type as that considered in section 2.3.1 hence its
minimum value is given when $z(\phi)$ is a linear function of $\phi$. The boundary conditions
give the result quoted.
Hence the distance between the points $\phi_1$ and $\phi_2$ along the curve $\rho(\phi)$ is
$$L[\rho] = \int_{\phi_1}^{\phi_2} d\phi\,\sqrt{\rho^2 + \frac{\rho'^2}{\sin^2\beta}}.$$
Hence
$$T[y] = \int_0^a dx\,\frac{\sqrt{(vy')^2 + (c^2 - v^2)(1 + y'^2)} - vy'}{c^2 - v^2} = \int_0^a dx\,\frac{\sqrt{(1 + y'^2)c^2 - v^2} - vy'}{c^2 - v^2}.$$
$MV^2 + mv^2 = MV'^2 + mv'^2$ (energy conservation),
$MV - mv = MV' + mv'$ (linear momentum in the direction of the block motion).
3.1 Introduction
In this chapter we apply the methods introduced in section 2.2 to more general problems
and derive the most important result of the Calculus of Variations. We show that for
the functional
$$S[y] = \int_a^b dx\; F(x, y, y'), \quad y(a) = A,\; y(b) = B, \qquad (3.1)$$
where F(x, u, v) is a real function of three real variables, a necessary and sufficient
condition for the twice differentiable function y(x) to be a stationary path is that it
satisfies the equation
$$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = 0 \quad\text{and the boundary conditions}\quad y(a) = A,\; y(b) = B. \qquad (3.2)$$
This equation is known either as Euler's equation or the Euler-Lagrange equation, and
is a second-order equation for y(x), see exercise 2.10 (page 89). Conditions for a stationary
path to give either a local maximum or a local minimum are more difficult to find and
we defer a discussion of this problem to chapter 7.
In order to derive the Euler-Lagrange equation it is helpful to first discuss some
preliminary ideas. We start by briefly describing Euler's original analysis, because
it provides an intuitive understanding of functionals and provides a link between the
calculus of functions of many variables and the Calculus of Variations. This leads
directly to the idea of the rate of change of a functional, which is required to define
a stationary path. This section is followed by the proof of the fundamental lemma of
the Calculus of Variations which is essential for the derivation of the Euler-Lagrange
equation, which follows.
The Euler-Lagrange equation is usually a nonlinear boundary value problem: this
combination causes severe difficulties, both theoretical and practical. First, solutions
may not exist and if they do uniqueness is not ensured: second, if solutions do exist
it is often difficult to compute them. These difficulties are in sharp contrast to initial
value problems and, because the differences are so marked, in section 3.5 we compare
these two types of equations in a little detail. Finally, in section 3.6, we show why the
limiting process used by Euler is subtle and can lead to difficulties.
$$a = x_0, x_1, x_2, \ldots, x_N, x_{N+1} = b, \quad\text{where}\quad x_{k+1} - x_k = \delta,$$
and replacing the curve y(x) with segments of straight lines with vertices
Figure 3.1 Diagram showing the rectification of a curve by a
series of six straight lines, N = 5.
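Approximating the derivative at $x_k$ by the difference $(y_k - y_{k-1})/\delta$ the functional 3.1
is replaced by a function of the N variables $(y_1, y_2, \cdots, y_N)$,
$$S(y_1, y_2, \cdots, y_N) = \delta\sum_{k=1}^{N+1} F\left(x_k, y_k, \frac{y_k - y_{k-1}}{\delta}\right), \quad\text{where}\quad \delta = \frac{b - a}{N + 1}, \qquad (3.3)$$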
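Euler's finite sum is straightforward to evaluate. The sketch below is an illustration under stated assumptions: the function names, the choice of the arc-length integrand $F = \sqrt{1 + y'^2}$ and the end points (0, 0) and (1, 1) are not taken from the text. It evaluates the sum 3.3 for the straight line and for a polygon in which one vertex is displaced, showing that the displaced polygon is longer.

```python
import math


def discretised_functional(F, ys, a, b, A, B):
    """Euler's sum (3.3): delta * sum_{k=1}^{N+1} F(x_k, y_k, (y_k - y_{k-1}) / delta).

    ys holds the N interior ordinates y_1, ..., y_N; the end values
    y_0 = A and y_{N+1} = B are fixed by the boundary conditions.
    """
    n = len(ys)
    delta = (b - a) / (n + 1)
    y = [A] + list(ys) + [B]
    total = 0.0
    for k in range(1, n + 2):
        x_k = a + k * delta
        total += F(x_k, y[k], (y[k] - y[k - 1]) / delta)
    return delta * total


def arc_length_integrand(x, y, dy):
    """Integrand of the distance functional, F = sqrt(1 + y'^2)."""
    return math.sqrt(1.0 + dy * dy)


a, b, A, B, N = 0.0, 1.0, 0.0, 1.0, 5
straight = [A + (B - A) * k / (N + 1) for k in range(1, N + 1)]
bumped = list(straight)
bumped[2] += 0.1  # displace the middle vertex

s_straight = discretised_functional(arc_length_integrand, straight, a, b, A, B)
s_bumped = discretised_functional(arc_length_integrand, bumped, a, b, A, B)
print(s_straight, s_bumped)
```

With N = 5 this is the six-segment polygon of figure 3.1; for the straight line every segment has slope 1, so the sum equals $\sqrt{2}$ up to rounding.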
Remember that if the usual derivative of a function exists at any point x, it is continuous
at x.
The type of functional defined by equation 3.1 involves paths joining the points
(a, A) and (b, B) which are differentiable or piecewise differentiable for $a \le x \le b$.
In order to find a stationary path we need to compare values of the functional on
nearby paths; this means that a careful definition of the distance between nearby paths
(functions) is important. This is achieved most easily by using the notion of a norm of
a function. A norm defined on a function space is a map taking elements of the space
to the non-negative real numbers; it represents the distance from an element to the
origin (zero function). It has the same properties as the Euclidean distance defined in
equation 1.2 (page 13).
In $\mathbb{R}^n$ the Euclidean distance suffices for most purposes. In infinite dimensional
function spaces there is no obvious choice of norm that can be used in all circumstances.
Use of different norms and the corresponding concepts of distance can lead to different
classifications of stationary paths as is seen in section 3.6.
For this reason it is usual to distinguish between a function space and a normed
space by using a different name whenever a specific norm on the set of functions is being
considered. For example, we have introduced the space $C_0[a, b]$ of continuous functions
on the interval [a, b]. One of the simplest norms on this space is the supremum norm¹
$$\|y(x)\| = \max_{a \le x \le b}|y(x)|,$$
and this norm can be shown to satisfy the conditions of equation 1.3 (page 13). The
distance between two functions y and z is of course $\|y - z\|$. When we wish to
emphasise that we are considering this particular normed space, and not just the space
of continuous functions, we shall write $D_0[a, b]$, by which we shall mean the space of
continuous functions with the specified norm. When we write $C_0[a, b]$, no particular
norm is implied.
In what follows, we shall sometimes need to restrict attention to functions which
have a continuous and bounded derivative. A suitable norm for such functions is
$$\|y(x)\|_1 = \max_{a \le x \le b}|y(x)| + \max_{a \le x \le b}|y'(x)|,$$
and we shall denote by $D_1[a, b]$ the normed space of functions with continuous bounded
derivative equipped with the norm $\|\cdot\|_1$ defined above. This space consists of the same
functions as the space $C_1[a, b]$, but as before use of the latter notation will not imply
the use of any particular norm on the space.
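The difference between the two norms can be seen with a rapidly oscillating function of small amplitude. The following sketch is illustrative only: the test function $z(x) = \sin(m^2x)/m$ with m = 10, the interval [0, 1] and the grid-based estimates of the maxima are assumptions made for the example. The function is close to zero in the supremum norm but far from zero in the norm $\|\cdot\|_1$, because its derivative is large.

```python
import math


def sup_norm(f, a, b, n=20001):
    """Grid estimate of max |f(x)| on [a, b]; a lower bound for the true maximum."""
    return max(abs(f(a + (b - a) * k / (n - 1))) for k in range(n))


def norm_1(f, df, a, b, n=20001):
    """Grid estimate of max |f| + max |f'| on [a, b]; df is the derivative of f."""
    return sup_norm(f, a, b, n) + sup_norm(df, a, b, n)


m = 10


def z(x):
    """Small-amplitude, rapidly oscillating test function."""
    return math.sin(m * m * x) / m


def dz(x):
    """Derivative of z, supplied analytically."""
    return m * math.cos(m * m * x)


small = sup_norm(z, 0.0, 1.0)      # close to 1/m
large = norm_1(z, dz, 0.0, 1.0)    # close to 1/m + m
print(small, large)
```

Increasing m makes the first value smaller and the second larger, so a path $y + z$ can be arbitrarily close to $y$ in one norm while remaining far from it in the other.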
It is usually necessary to restrict the class of functions we consider to the subset
of all possible functions that satisfy the boundary conditions, if defined. Normally we
shall simply refer to this restricted class of functions as the admissible functions: these
are defined to be those differentiable functions that satisfy any boundary conditions
and, in most circumstances, to be in $D_1[a, b]$, because it is important to bound the
variation in y 0 (x). Later we shall be less restrictive and allow piecewise differentiable
functions.
We now come to the most important part of this section, that is the idea of the rate
of change of a functional which is implicit in the idea of a stationary path. Recall that a
1 In analysis texts max |y(x)| is replaced by sup |y(x)|, but for continuous functions on closed finite
A stationary point is defined to be one at which the rate of change, $\Delta G(x, \xi)$, is zero
in every direction; it follows that at a stationary point all first partial derivatives must
be zero.
The idea embodied in equation 3.4 may be applied to the functional
$$S[y] = \int_a^b dx\; F(x, y, y'), \quad y(a) = A,\; y(b) = B,$$
which has a real value for each admissible function y(x). The rate of change of a
functional S[y] is obtained by examining the difference between neighbouring admissible
paths, $S[y + \epsilon h] - S[y]$; since both y(x) and $y(x) + \epsilon h(x)$ are admissible functions for all
real $\epsilon$, it follows that h(a) = h(b) = 0. This difference is a function of the real variable
$\epsilon$, so we define the rate of change of S[y] by the limit,
$$\Delta S[y, h] = \lim_{\epsilon \to 0}\frac{S[y + \epsilon h] - S[y]}{\epsilon} = \left.\frac{d}{d\epsilon}S[y + \epsilon h]\right|_{\epsilon = 0}, \qquad (3.5)$$
which we assume exists. The functional $\Delta S$ depends upon both y(x) and h(x), just as
the limit of the difference $[G(x + \epsilon\xi) - G(x)]/\epsilon$, of equation 3.4, depends upon $x$ and $\xi$.
Definition 3.1
The functional S[y] is said to be stationary if y(x) is an admissible function and if
$\Delta S[y, h] = 0$ for all h(x) for which y(x) and $y(x) + \epsilon h(x)$ are admissible.
The functions for which S[y] is stationary are named stationary paths. The stationary
path, y(x), and the varied path $y(x) + \epsilon h(x)$ must be admissible: for most variational
problems considered in this chapter both paths need to satisfy the boundary conditions,
so h(a) = h(b) = 0. In more general problems considered later, particularly in
chapter 9, these conditions on h(x) are removed, but see exercises 3.12 and 3.13. If
y(x) is an admissible path we name the allowed variations, h(x), to be those for which
$y(x) + \epsilon h(x)$ are admissible.
On a stationary path the functional may achieve a maximum or a minimum value,
and then the path is named an extremal. The nature of stationary paths is usually
determined by the term $O(\epsilon^2)$ in the expansion of $S[y + \epsilon h]$: this theory is described in
chapter 7.
is linear in h, that is if c is any constant then $\Delta S[y, ch] = c\,\Delta S[y, h]$; in this case it is
named the Gâteaux differential.
Notice that if S is an ordinary function of n variables, $(y_1, y_2, \cdots, y_n)$, rather than
a functional, then the Gâteaux differential is
$$\Delta S = \lim_{\epsilon \to 0}\frac{d}{d\epsilon}S(y + \epsilon h) = \sum_{k=1}^{n}\frac{\partial S}{\partial y_k}h_k,$$
for the distance between (a, A) and (b, B), discussed in section 2.2.1. We have
$$\frac{d}{d\epsilon}S[y + \epsilon h] = \frac{d}{d\epsilon}\int_a^b dx\,\sqrt{1 + (y' + \epsilon h')^2} = \int_a^b dx\,\frac{d}{d\epsilon}\sqrt{1 + (y' + \epsilon h')^2}
= \int_a^b dx\,\frac{(y' + \epsilon h')}{\sqrt{1 + (y' + \epsilon h')^2}}\,h'.$$
Note that we may change the order of differentiation with respect to $\epsilon$ and integration
with respect to x because a and b are independent of $\epsilon$ and all integrands are assumed
to be sufficiently well-behaved functions of x and $\epsilon$. Hence, on putting $\epsilon = 0$
$$\Delta S[y, h] = \left.\frac{d}{d\epsilon}S[y + \epsilon h]\right|_{\epsilon = 0} = \int_a^b dx\,\frac{y'}{\sqrt{1 + y'^2}}\,h',$$
Comparing this with $\Delta G$, equation 3.4, we can make the equivalences $y \equiv x$ and $h \equiv \xi$.
However, for functions of N variables there is no relation between the variables $\xi_k$ and
$\xi_{k+1}$, but h(x) is differentiable, so $|h_k - h_{k+1}| = O(\delta)$. This suggests that some care is
required in taking the limit $N \to \infty$ of equation 3.3 and shows why problems involving
finite numbers of variables can be different from those with infinitely many variables
and why the choice of norms, discussed above, is important. Nevertheless, provided
caution is exercised, the analogy with functions of several variables can be helpful.
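The definition 3.5 can be tested numerically for the distance functional by comparing a finite-difference estimate of $dS[y + \epsilon h]/d\epsilon$ at $\epsilon = 0$ with the integral $\int_a^b dx\; y'h'/\sqrt{1 + y'^2}$. The sketch below is illustrative: the path $y = x^2$, the variation $h = \sin\pi x$ on [0, 1], the step size and the Simpson quadrature are all assumptions chosen for the example.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with n (even) panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def dy(x):
    """Slope of the assumed path y(x) = x^2."""
    return 2.0 * x


def dh(x):
    """Slope of the assumed variation h(x) = sin(pi x), which vanishes at x = 0 and 1."""
    return math.pi * math.cos(math.pi * x)


def length(eps):
    """S[y + eps*h] for the distance functional on [0, 1]."""
    return simpson(lambda x: math.sqrt(1.0 + (dy(x) + eps * dh(x)) ** 2), 0.0, 1.0)


eps = 1.0e-5
finite_difference = (length(eps) - length(-eps)) / (2.0 * eps)
formula = simpson(lambda x: dy(x) * dh(x) / math.sqrt(1.0 + dy(x) ** 2), 0.0, 1.0)
print(finite_difference, formula)
```

The two numbers agree to many figures but are not zero, because $y = x^2$ is not the stationary path; repeating the calculation with the straight line $y = x$ gives a value of zero to within rounding error.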
Exercise 3.2
Find the Gâteaux differentials of the following functionals:
(a) $S[y] = \int_0^{\pi/2} dx\,\left(y'^2 - y^2\right)$,
(b) $S[y] = \int_a^b dx\,\dfrac{y'^2}{x^3}$, b > a > 0,
(c) $S[y] = \int_a^b dx\,\left(y'^2 + y^2 + 2ye^x\right)$,
(d) $S[y] = \int_0^1 dx\,\sqrt{x^2 + y^2}\sqrt{1 + y'^2}$.
Exercise 3.3
Show that the Gâteaux differential of the functional,
$$S[y] = \int_a^b ds\int_a^b dt\; K(s, t)\,y(s)\,y(t)$$
is
$$\Delta S[y, h] = \int_a^b ds\; h(s)\int_a^b dt\,\left(K(s, t) + K(t, s)\right)y(t).$$
for all functions h(x) that are continuous for $a \le x \le b$ and are zero at x = a and
x = b, then z(x) = 0 for $a \le x \le b$.
In order to prove this we assume on the contrary that $z(\eta) \neq 0$ for some $\eta$ satisfying
$a < \eta < b$. Then, since z(x) is continuous there is an interval $[x_1, x_2]$ around $\eta$ with
$$a < x_1 \le \eta \le x_2 < b$$
in which $z(x) \neq 0$. We now construct a suitable function h(x) that yields a
contradiction. Define h(x) to be
$$h(x) = \begin{cases} (x - x_1)(x_2 - x), & a < x_1 \le x \le x_2 < b, \\ 0, & \text{otherwise,} \end{cases}$$
Exercise 3.4
In this exercise a result due to du Bois-Reymond (1831–1889) which is closely
related to the fundamental lemma will be derived. This is required later, see
exercise 3.11.
If z(x) and $h'(x)$ are continuous, h(a) = h(b) = 0 and
$$\int_a^b dx\; z(x)h'(x) = 0$$
Euler-Lagrange equation. Here we consider stationary paths and then the condition is also sufficient.
to be stationary on the path y(x) is that it satisfies the differential equation and
boundary conditions,
$$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = 0, \quad y(a) = A,\; y(b) = B. \qquad (3.7)$$
This is named Euler's equation or the Euler-Lagrange equation. It is a second-order
differential equation, as shown in exercise 2.10, and is the analogue of the conditions
$\partial G/\partial x_k = 0$, $k = 1, 2, \cdots, n$, for a function of n real variables to be stationary, as
discussed in section 3.2.2. We now derive this equation.
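The integral 3.6 is defined for functions y(x) that are differentiable for $a \le x \le b$.
Using equation 3.5 we find that the rate of change of S[y] is
$$\Delta S[y, h] = \left.\frac{d}{d\epsilon}\int_a^b dx\; F(x, y + \epsilon h, y' + \epsilon h')\right|_{\epsilon = 0}
= \left.\int_a^b dx\,\frac{d}{d\epsilon}F(x, y + \epsilon h, y' + \epsilon h')\right|_{\epsilon = 0}. \qquad (3.8)$$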
The integration limits a and b are independent of $\epsilon$ and we assume that the order of
integration and differentiation may be interchanged. The integrand of equation 3.8 is a
total derivative with respect to $\epsilon$ and equation 1.21 (page 28) shows how to write this
expression in terms of the partial derivatives of F. Using equation 1.21 with n = 3,
$t = \epsilon$ and the variable changes $(x_1, x_2, x_3) = (x, y, y')$ and $(h_1, h_2, h_3) = (0, h(x), h'(x))$,
so that
we obtain
$$\frac{d}{d\epsilon}F(x, y + \epsilon h, y' + \epsilon h') = h\frac{\partial F}{\partial y} + h'\frac{\partial F}{\partial y'}.$$
Now set $\epsilon = 0$, so the partial derivatives are evaluated at $(x, y, y')$, to obtain,
$$\Delta S[y, h] = \int_a^b dx\,\left(h(x)\frac{\partial F}{\partial y} + h'(x)\frac{\partial F}{\partial y'}\right). \qquad (3.9)$$
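assuming that $F_{y'}$ is differentiable. But h(a) = h(b) = 0 so the boundary term on the
right-hand side vanishes and the rate of change of the functional S[y] becomes
\[ \Delta S[y, h] = -\int_a^b dx \left( \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} \right) h(x). \tag{3.10} \]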
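The step between equations 3.9 and 3.10 is an integration by parts applied to the term containing h'(x). As a sketch, written under the assumption stated in the text that $F_{y'}$ is differentiable along the path, the identity being used is:

```latex
\int_a^b dx\, h'(x)\,\frac{\partial F}{\partial y'}
  = \left[ h(x)\,\frac{\partial F}{\partial y'} \right]_a^b
  - \int_a^b dx\, h(x)\,\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)
```

The bracketed boundary term vanishes when h(a) = h(b) = 0, which leaves only an integral multiplying h(x).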
If S[y] is stationary then, by definition, $\Delta S[y, h] = 0$ for all allowed h and it follows
from the fundamental lemma of the Calculus of Variations that y(x) satisfies the
second-order differential equation
\[ \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = 0, \quad y(a) = A, \quad y(b) = B. \tag{3.11} \]
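The connection between a vanishing rate of change and the Euler-Lagrange equation can be checked numerically. The sketch below is only an illustration: the integrand $F = y'^2 + y^2$ on [0, 1], the variation h(x) = x(1 - x) and the helper names `functional` and `gateaux` are all assumptions made here, not taken from the text. For this F the Euler-Lagrange equation is y'' - y = 0, so the rate of change should vanish on y = sinh x / sinh 1 but not on the straight line y = x.

```python
import math

def functional(y, dy, n=2000):
    """Midpoint-rule value of S[y] = integral over [0, 1] of (y'^2 + y^2)."""
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * step
        total += (dy(x) ** 2 + y(x) ** 2) * step
    return total

def gateaux(y, dy, h, dh, eps=1e-3):
    """Central-difference estimate of d/d(eps) S[y + eps*h] at eps = 0."""
    def shifted(sign):
        return functional(lambda x: y(x) + sign * eps * h(x),
                          lambda x: dy(x) + sign * eps * dh(x))
    return (shifted(+1) - shifted(-1)) / (2 * eps)

# Admissible variation: h(0) = h(1) = 0.
h = lambda x: x * (1 - x)
dh = lambda x: 1 - 2 * x

# Path satisfying y'' - y = 0 with y(0) = 0, y(1) = 1.
stationary = gateaux(lambda x: math.sinh(x) / math.sinh(1),
                     lambda x: math.cosh(x) / math.sinh(1), h, dh)
# Straight line with the same end points; it does not satisfy y'' - y = 0.
straight = gateaux(lambda x: x, lambda x: 1.0, h, dh)

print(stationary, straight)
```

For the straight line the rate of change can also be computed by hand from equation 3.9, giving 1/6 for this choice of h, which is a useful check on the quadrature.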
Exercise 3.5
Show that the Euler-Lagrange equation for the functional
\[ S[y] = \int_0^X dx\, \left( y'^2 - y^2 \right), \quad y(0) = 0, \quad y(X) = 1, \quad X > 0, \]
\[ y'\,\frac{\partial G}{\partial y'} - G = c, \quad y(a) = A, \quad y(b) = B, \tag{3.13} \]
where c is a constant determined by the boundary conditions, see for example
exercise 3.6 below. The expression on the left-hand side of this equation is often named the
first-integral of the Euler-Lagrange equation. This result is important because, when
applicable, it usually saves a great deal of effort: the lower-order equation is normally far
easier to solve. Two proofs of equation 3.13 are provided: the first involves
deriving an algebraic identity, see exercise 3.7, and it is important to do this yourself.
The second proof is given in section 6.2.1 and uses the invariance properties of the
integrand G(y, y'). A warning, however: in some circumstances a solution of equation 3.13
will not be a solution of the original Euler-Lagrange equation, see exercise 3.8, also
section 4.3 and chapter 5.
Another important consequence is that the stationary function, the solution of 3.13,
depends only upon the variables $u = x - a$ and $b - a$ (besides A and B), rather than
x, a and b independently, as is the case when the integrand depends explicitly upon x.
A specific example illustrating this behaviour is given in exercise 3.21.
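As a small numerical illustration of the first-integral, the sketch below evaluates $y'\,\partial G/\partial y' - G$ at several points along a path that satisfies the Euler-Lagrange equation. The integrand $G = y'^2 + y^2$ and the path y = sinh x / sinh 1 are assumptions chosen for this example (they are not taken from the text); for them the expression reduces to $y'^2 - y^2$, which should equal the constant $1/\sinh^2 1$.

```python
import math

def G(y, p):
    """Example integrand G(y, y') = y'^2 + y^2 (an assumed illustration)."""
    return p * p + y * y

def dG_dp(y, p):
    """Partial derivative of G with respect to y'."""
    return 2 * p

def first_integral(x):
    """Value of y' * dG/dy' - G on the path y = sinh(x) / sinh(1)."""
    y = math.sinh(x) / math.sinh(1)
    p = math.cosh(x) / math.sinh(1)
    return p * dG_dp(y, p) - G(y, p)

values = [first_integral(k / 10) for k in range(11)]
c = 1 / math.sinh(1) ** 2  # expected constant for this example

print(values[0], values[-1], c)
```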
An observation
You may have noticed that the original functional 3.6 is defined on the class of
functions for which F(x, y(x), y'(x)) is integrable: if F(x, u, v) is differentiable in all three
variables this condition is satisfied if y'(x) is piecewise continuous. However, the
Euler-Lagrange equation 3.11 requires the stronger condition that y'(x) is differentiable. This
extra condition is created by the derivation of the Euler-Lagrange equation, in
particular the step between equations 3.9 and 3.10: a necessary condition for the functional
S[y] to be stationary, that does not make this step and does not require y'' to exist, is
derived in exercise 3.11.
There are important problems where y''(x) does not exist at all points on a stationary
path: the minimal surface of revolution, dealt with in the next chapter, is one simple
example; the general theory of this type of problem will be considered in chapter 9.
Exercise 3.6
Consider the functional
\[ S[y] = \int_0^1 dx\, \left( y'^2 - y \right), \quad y(0) = 0, \quad y(1) = 1. \]
Exercise 3.7
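If G(y, y') does not depend explicitly upon x, that is $\partial G/\partial x = 0$, show that
\[ y'(x)\left( \frac{d}{dx}\left(\frac{\partial G}{\partial y'}\right) - \frac{\partial G}{\partial y} \right) = \frac{d}{dx}\left( y'\,\frac{\partial G}{\partial y'} - G \right) \]
and hence derive equation 3.13.
Hint: you will find the result derived in exercise 2.10 (page 89) helpful.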
128 CHAPTER 3. THE EULER-LAGRANGE EQUATION
Exercise 3.8
(a) Show that, provided $G_{y'}(y, 0)$ exists, the differential equation 3.13 (without the
boundary conditions) has a solution $y(x) = \gamma$, where the constant $\gamma$ is defined
implicitly by the equation $G(\gamma, 0) = -c$.
(b) Under what circumstances is the solution $y(x) = \gamma$ also a solution of the
Euler-Lagrange equation 3.11?
Exercise 3.9
Show that the Euler-Lagrange equation for the functional
\[ S[y] = \int_0^1 dx\, \left( y'^2 + y^2 + 2axy \right), \quad y(0) = 0, \quad y(1) = B, \]
Exercise 3.10
In this exercise we consider a problem, due to Weierstrass (1815-1897), in which
the functional achieves its minimum value of zero for a piecewise continuous
function but for continuous functions the functional is always positive.
The functional is
\[ J[y] = \int_{-1}^{1} dx\, x^2 y'^2, \quad y(-1) = -1, \quad y(1) = 1, \]
(c) A similar result can be proved for a class of continuously differentiable
functions. For the functions
\[ y(x) = \frac{1}{\alpha}\tan^{-1}\left(\frac{x}{\epsilon}\right), \quad \tan\alpha = \frac{1}{\epsilon}, \quad 0 < \epsilon < 1, \]
show that
\[ J[y] = \frac{2\epsilon}{\pi} + O(\epsilon^2). \]
Deduce that J[y] may take arbitrarily small values, but cannot be zero.
Hint: the relation $\tan^{-1}(1/z) = \pi/2 - \tan^{-1}(z)$ is needed.
It may be shown that for no continuous function satisfying the boundary
conditions is J[y] = 0. Thus on the class of continuous functions J[y] never equals its
minimum value, but can approach it arbitrarily closely.
Exercise 3.11
The Euler-Lagrange equation 3.11 requires that y''(x) exists, yet the original
functional does not. The second derivative arises when equation 3.9 is integrated by
parts to replace h'(x) by h(x). In this exercise you will show that this step may be
avoided and that a necessary condition not depending upon y''(x) may be derived.
Define the function $\phi(x)$ by the integral
\[ \phi(x) = \int_a^x dt\, F_y(t, y(t), y'(t)), \]
so that $\phi(a) = 0$ and $\phi'(x) = F_y(x, y, y')$, and show that equation 3.9 becomes
\[ \Delta S = \int_a^b dx\, h'(x) \left( \frac{\partial F}{\partial y'} - \phi(x) \right). \]
Using the result derived in exercise 3.4 show that a necessary condition for S[y]
to be stationary is that
\[ \frac{\partial F}{\partial y'} - \int_a^x dt\, \frac{\partial F}{\partial y} = C, \]
where C is a constant.
In practice, this equation is not usually as useful as the Euler-Lagrange equation.
Exercise 3.12
The boundary conditions y(a) = A, y(b) = B are not always appropriate so we
need functionals that yield different conditions. In this exercise we illustrate how
this can sometimes be achieved. The technique used here is important and will
be used extensively in chapter 9.
Consider the functional
\[ S[y] = -G(y(b)) + \frac{1}{2}\int_a^b dx\, \left( y'^2 + y^2 \right), \quad y(a) = A, \]
with no condition being given at x = b. For this functional the variation h(x)
satisfies h(a) = 0, but h(b) is not constrained.
(a) Use the fact that h(a) = 0 to show that the Gâteaux differential can be written
in the form
\[ \Delta S[y, h] = \left( y'(b) - G_y(y(b)) \right) h(b) - \int_a^b dx\, \left( y'' - y \right) h. \]
(b) Using a subset of variations with h(b) = 0 show that the stationary paths
satisfy the equation $y'' - y = 0$, y(a) = A, and that on this path
\[ \Delta S[y, h] = \left( y'(b) - G_y(y(b)) \right) h(b). \]
Deduce that S[y] is stationary only if y(b) and y'(b) satisfy the equation
\[ y'(b) = G_y(y(b)). \]
\[ S[y] = -B\,y(b) + \frac{1}{2}\int_a^b dx\, \left( y'^2 + y^2 \right), \quad y(a) = A, \]
Exercise 3.13
Use the ideas outlined in the previous exercise to show that if G(b, y, B) is defined
by the integral
\[ G(b, y, B) = \int^{y} dz\, F_{y'}(b, z, B) \]
the functional
\[ S[y] = -G(b, y(b), B) + \int_a^b dx\, F(x, y, y'), \quad y(a) = A, \]
when none exists. In this course there is insufficient space to discuss approximate
and numerical methods, but this section is devoted to a discussion of a theorem that
provides some information about the existence and uniqueness of solutions for the Euler-
Lagrange equation. In the last part of this section we contrast these results with those
for the equivalent equation, but with initial conditions rather than boundary values.
First, however, we return to the question, discussed on page 127, of whether the
second derivative of the stationary path exists, that is whether it satisfies the Euler-
Lagrange equation in the whole interval.
The following theorem, due to the German mathematician du Bois-Reymond (1831-1889),
gives sufficient conditions for the second derivative of a stationary path to exist.
Theorem 3.1
If
(a) y(x) has a continuous first derivative,
(b) $\Delta S[y, h] = 0$ for all allowed h(x),
(c) F(x, u, v) has continuous first and second derivatives in all variables and
(d) $\partial^2 F/\partial y'^2 \neq 0$ for $a \le x \le b$,
then y(x) has a continuous second derivative and satisfies the Euler-Lagrange
equation 3.11 for all $a \le x \le b$.
This result is of limited practical value because its application sometimes requires
knowledge of the solution, or at least some of its properties. A proof of this theorem may
be found in Gelfand and Fomin (1963, page 17)³. An example in which $F_{y'y'} = 0$ on the
stationary path and where this path does not possess a second derivative, yet satisfies
the Euler-Lagrange equation almost everywhere, is given in exercise 3.29 (page 139).
\[ \frac{d^2 y}{dx^2} = H\left( x, y, \frac{dy}{dx} \right), \quad y(a) = A, \quad y(b) = B. \tag{3.15} \]
For such equations this is one of the few general results about the nature of the solutions
and is due to the Ukrainian mathematician S N Bernstein (1880-1968). This theorem
provides a sufficient condition for equation 3.15 to have a unique solution.
Theorem 3.2
If for all finite y, y' and x in an open interval containing [a, b], that is $c < a \le x \le b < d$,
(a) the functions H, $H_y$ and $H_{y'}$ are continuous,
(b) there is a constant k > 0 such that $H_y > k$, and,
(c) for any Y > 0 and all |y| < Y and $a \le x \le b$ there are positive constants $\alpha(Y)$
and $\beta(Y)$, depending upon Y, and possibly c and d, such that
\[ |H(x, y, y')| \le \alpha(Y)\, y'^2 + \beta(Y), \]
3 I M Gelfand and S V Fomin Calculus of Variations, (Prentice Hall, translated from the Russian
Some examples
The usefulness of Bernstein's theorem is somewhat limited because the conditions of
the theorem are too stringent; it is, however, one of the rare general theorems applying
to this type of problem. Here we apply it to the two problems dealt with in the next
chapter, for which the integrands of the functionals are
\[ F = y\sqrt{1 + y'^2} \quad \text{(minimal surface of revolution)}, \qquad
   F = \sqrt{\frac{1 + y'^2}{y}} \quad \text{(brachistochrone)}. \]
Substituting these into the Euler-Lagrange equation 3.14 we obtain the following
expressions for H,
\[ y'' = H = \frac{1 + y'^2}{y} \quad \text{(minimal surface of revolution)}, \qquad
   y'' = H = -\frac{1 + y'^2}{2y} \quad \text{(brachistochrone)}. \]
In both cases H is discontinuous at y = 0, so the conditions of the theorem do not hold.
In fact, the Euler-Lagrange equation for the minimal surface problem has one piecewise
smooth solution and, in addition, either two or no differentiable solutions, depending
upon the boundary values. The brachistochrone problem always has exactly one
solution. These examples emphasise the fact that Bernstein's theorem gives sufficient as
opposed to necessary conditions.
Exercise 3.14
Use Bernstein's theorem to show that the equation $y'' - y = x$, y(0) = A, y(1) = B,
has a unique solution, and find this solution.
Exercise 3.15
(a) Apply Bernstein's theorem to the equation $y'' + y = x$, y(0) = 0, y(X) = 1
with X > 0.
(b) Show that the solution of this equation is
\[ y = x + (1 - X)\,\frac{\sin x}{\sin X} \]
and explain why this does not contradict Bernstein's theorem.
Exercise 3.16
Consider the functional
\[ S[y] = \int_{-1}^{1} dx\, y^2 \left( 1 - y' \right)^2, \quad y(-1) = 0, \quad y(1) = 1, \]
the smallest value of which is zero. Show that the solution of the Euler-Lagrange
equation that minimises this functional is
\[ y(x) = \begin{cases} 0, & -1 \le x \le 0, \\ x, & 0 < x < 1, \end{cases} \]
we simply define the variables $z_k(x) = y^{(k-1)}(x)$, $k = 1, 2, \ldots, n$, so that the equation
becomes
\[ z_1' = z_2, \quad z_2' = z_3, \quad \ldots, \quad z_n' = F(x, z_1, z_2, \ldots, z_n). \]
The second-order equation 3.15 is trivially cast into this form by defining the three
dependent variables $(z_1, z_2, z_3)$ by the equations $z_1 = y$ and
\[ \frac{dz_1}{dx} = z_2, \quad \frac{dz_2}{dx} = H(z_3, z_1, z_2), \quad \frac{dz_3}{dx} = 1, \]
so $v = (z_2, H(z_3, z_1, z_2), 1)$. Other examples of this procedure are considered in
exercises 3.17 and 3.18.
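The reduction to a first-order system is also the form in which initial value problems are usually integrated numerically. The sketch below is an illustration only: the choice H = -y (so that y'' = -y), the initial values y(0) = 0, y'(0) = 1 and the function names `rk4_system`, `H` and `v` are assumptions made here. It uses the classical fourth-order Runge-Kutta method on the three variables $(z_1, z_2, z_3) = (y, y', x)$ and compares the result with sin x.

```python
import math

def rk4_system(v, z0, length, steps):
    """Integrate dz/dx = v(z) over an interval of the given length using RK4."""
    z = list(z0)
    dx = length / steps
    for _ in range(steps):
        k1 = v(z)
        k2 = v([zi + 0.5 * dx * ki for zi, ki in zip(z, k1)])
        k3 = v([zi + 0.5 * dx * ki for zi, ki in zip(z, k2)])
        k4 = v([zi + dx * ki for zi, ki in zip(z, k3)])
        z = [zi + dx * (a + 2 * b + 2 * c + d) / 6
             for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
    return z

def H(x, y, dy):
    """Right-hand side of y'' = H(x, y, y'); here the assumed example y'' = -y."""
    return -y

def v(z):
    """Vector field (z2, H(z3, z1, z2), 1) with z1 = y, z2 = y', z3 = x."""
    z1, z2, z3 = z
    return [z2, H(z3, z1, z2), 1.0]

# Initial conditions y(0) = 0, y'(0) = 1, x = 0; integrate up to x = 1.
z = rk4_system(v, [0.0, 1.0, 0.0], 1.0, 100)
print(z[0], math.sin(1.0))
```

Carrying x along as the third variable makes the system autonomous, which is why the vector field v depends on z alone.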
If the second derivatives of v are continuous in a neighbourhood of x0 and $v(z_0) \neq 0$,
then it is possible to find a new set of variables, u, such that in the neighbourhood of
$z_0$ equation 3.16 transforms to
\[ \frac{du_1}{dx} = 1, \quad \frac{du_k}{dx} = 0, \quad k = 2, 3, \ldots, n. \]
Such a transformation is said to rectify the system. It follows that a unique solution
exists. A proof of this result may be found in Arnold (1973, sections 7 and 32)⁵. Thus
solutions of this type of equation exist and are unique under far less stringent conditions
than the solutions of second-order, boundary value problems. This example illustrates
one very important difference between local and global problems.
Moreover, solutions of the initial value problem are differentiable in the initial
conditions, $z_0$, and these differentials are continuous in $z_0$. Further, if equations 3.16 are
linear, so may be put in the form
\[ \frac{dz}{dx} = A(x; \mu)\, z, \]
where A is a nonsingular, $n \times n$, real matrix, which is also a twice differentiable function
of a parameter $\mu$, then the solution $z(x; \mu)$ is a differentiable function of $\mu$. This is not
true of linear boundary value problems as is seen in exercise 3.18.
5 V I Arnold, Ordinary Differential Equations, (The MIT Press).
Exercise 3.17
The integrand of the functional for the brachistochrone problem, described in
section 2.5.1, is $F = \sqrt{1 + y'^2}/\sqrt{y}$. Show that the associated Euler-Lagrange
equation is
\[ y'' = -\frac{1 + y'^2}{2y} \]
and that this may be written as the pair of first-order equations
\[ \frac{dy_1}{dx} = y_2, \quad \frac{dy_2}{dx} = -\frac{1 + y_2^2}{2y_1} \quad \text{where } y_1 = y. \]
Exercise 3.18
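(a) Show that the second-order linear equation $y'' = -\omega^2 y$, where $\omega$ is a positive
constant, can be written as the pair of coupled linear equations
\[ \frac{dz_1}{dx} = z_2, \quad \frac{dz_2}{dx} = -\omega^2 z_1 \quad \text{where } z_1 = y, \ z_2 = \frac{dy}{dx}. \]
(b) Show that with the initial conditions y(0) = 0, $y'(0) = \alpha$ the solution is
$y(x) = (\alpha/\omega)\sin\omega x$, and that this exists for all $\alpha$ and $\omega$ and is a differentiable
function of $\omega$.
(c) Show that with the boundary conditions y(0) = 0, $y(\pi) = \beta$ the solution is
\[ y(x) = \begin{cases} \beta\,\dfrac{\sin\omega x}{\sin\omega\pi}, & \omega \neq 1, 2, \ldots, \text{ and all } \beta, \\[1ex] B\sin\omega x, & \omega = 1, 2, \ldots, \ \beta = 0, \text{ for any } B. \end{cases} \]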
and without loss of generality we may restrict h to satisfy $\|h\|_1 = 1$, so that
$|h'(x)| \le H_1 < 1$. On the varied path the value of the functional is
\[ S[\epsilon h] = \int_0^1 dx\, \sqrt{1 + \epsilon^2 h'^2} \le \sqrt{1 + (\epsilon H_1)^2} \]
3.6. STRONG AND WEAK VARIATIONS 135
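and hence
\[ S[\epsilon h] - S[0] \le \sqrt{1 + (\epsilon H_1)^2} - 1 = \frac{(\epsilon H_1)^2}{1 + \sqrt{1 + (\epsilon H_1)^2}} < (\epsilon H_1)^2 < \epsilon^2. \]
Thus if h(x) belongs to $D_1(0, 1)$, S[y] changes by $O(\epsilon^2)$ on the neighbouring path and
since $S[\epsilon h] - S[0] > 0$ for all $\epsilon$ the straight line path is a minimum.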
Now consider the less restrictive norm
which restricts the magnitude of h, but not the magnitude of its derivative. A suitable
path close to y = 0 is given by $h(x) = \sin n\pi x$, n being a positive integer. Now we
have
\[ S[\epsilon h] = \int_0^1 dx\, \sqrt{1 + (\epsilon n\pi)^2 \cos^2 n\pi x} \ \ge\ \epsilon n\pi \int_0^1 dx\, |\cos n\pi x|. \]
But
\[ \int_0^1 dx\, |\cos n\pi x| = 2n \int_0^{1/2n} dx\, \cos n\pi x = \frac{2}{\pi}. \]
Hence $S[\epsilon h] \ge 2n\epsilon$. Thus for any $\epsilon > 0$ we may choose a value of n to make
$S[\epsilon h]$ as large as we please, even though the varied path is arbitrarily close to the
straight-line path: hence the path y = 0 is not stationary when this norm is used.
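The growth of the length with n is easy to see numerically. The following sketch evaluates the arc length of the path $y = \epsilon \sin n\pi x$ on [0, 1] by the midpoint rule and compares it with the lower bound $2n\epsilon$; the function name `path_length`, the sampling density and the value of `eps` are illustrative assumptions.

```python
import math

def path_length(eps, n, samples_per_half_period=200):
    """Midpoint-rule arc length of y = eps * sin(n * pi * x) for 0 <= x <= 1."""
    total_samples = samples_per_half_period * n
    step = 1.0 / total_samples
    total = 0.0
    for i in range(total_samples):
        x = (i + 0.5) * step
        slope = eps * n * math.pi * math.cos(n * math.pi * x)
        total += math.sqrt(1.0 + slope * slope) * step
    return total

eps = 0.01  # the path never departs from y = 0 by more than eps
lengths = {n: path_length(eps, n) for n in (1, 10, 100, 200)}

for n, length in lengths.items():
    print(n, length, 2 * n * eps)
```

With eps fixed, the maximum of |y| stays at 0.01 while the length grows roughly in proportion to n once $\epsilon n \pi$ is large, in line with the inequality above.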
These two quite different types of behaviour show why the choice of norm is impor-
tant. These two types of norm are so important in the general theory that the variations
satisfying each have a special name.
Norms such as $\|z(x)\|_1$ restrict the variation of both the function and its derivative.
A variation in a path, h(x), that is restricted in this manner is named a weak variation.
The derivation of the Euler-Lagrange equation in section 3.4 assumed weak variations.
If the norm $\|z(x)\|_0$ is used to constrain variations about the path, so that derivatives
of the function need not be bounded, then the variation is named a strong variation.
Note that these names are not tied to the specific norms used here.
If the Gâteaux differential of a functional, S[y], defined on [a, b], is zero for all
variations in $D_0(a, b)$ then S[y] is said to have a strong stationary path. If the Gâteaux
differential is zero for all variations in $D_1(a, b)$ then S[y] is said to have a weak stationary
path.
Exercise 3.19
In this exercise we give another example of a path satisfying the $\|z\|_0$ norm which
is arbitrarily close to the line y = 0, but for which S is arbitrarily large.
Consider the isosceles triangle with base AC of length a, height h and base angle $\beta$,
as shown on the left-hand side of the figure.
[Figure 3.2: on the left, the triangle ABC with base AC, height BD = h and side AB = l; on the right, the two smaller triangles AB1D and DB2C.]
(a) Construct the two smaller triangles $AB_1D$ and $DB_2C$ by halving the height
and width of ABC, as shown on the right. If AB = l and BD = h, show that
$AB_1 = l/2$, $2l = a/\cos\beta$ and $h = l\sin\beta$. Hence show that the lengths of the lines
$AB_1DB_2C$ and ABC are the same and equal to 2l.
(b) Show that after n such divisions there are $2^n$ similar triangles of height $2^{-n}h$
and that the total length of the curve is 2l. Deduce that arbitrarily close to AC,
the shortest distance between A and C, we may find a continuous curve every
point of which is arbitrarily close to AC, but which has any given length.
3.7. MISCELLANEOUS EXERCISES 137
is $y'' + y = -x$. Hence show that the stationary function is $y(x) = \sin x/\sin 1 - x$.
Exercise 3.21
Consider the functional
\[ S[y] = \int_a^b dx\, F(y, y'), \quad y(a) = A, \quad y(b) = B, \]
where F(y, y') does not depend explicitly upon x. By changing the independent
variable to $u = x - a$ show that the solution of the Euler-Lagrange equation
depends on the difference $b - a$ rather than a and b separately.
Exercise 3.22
Euler's original method for finding solutions of variational problems is described
in equation 3.3 (page 118). Consider approximating the functional defined in
exercise 3.20 using the polygon passing through the points (0, 0), $(\tfrac12, y_1)$ and (1, 0),
so there is one variable $y_1$ and two segments.
This polygon can be defined by the straight line segments
\[ y(x) = \begin{cases} 2y_1 x, & 0 \le x \le \tfrac12, \\ 2y_1(1 - x), & \tfrac12 \le x \le 1. \end{cases} \]
Exercise 3.23
Find the stationary paths of the following functionals.
(a) \[ S[y] = \int_0^1 dx\, \left( y'^2 + 12xy \right), \quad y(0) = 0, \quad y(1) = 2. \]
(b) \[ S[y] = \int_0^1 dx\, \left( 2y^2 y'^2 - (1 + x)y^2 \right), \quad y(0) = 1, \quad y(1) = 2. \]
(c) \[ S[y] = -\frac{1}{2}B\,y(2) + \int_1^2 dx\, \frac{y'^2}{x^2}, \quad y(1) = A. \]
(d) \[ S[y] = -\frac{y(0)^2}{A^3} + \int_0^b dx\, \frac{y}{y'^2}, \quad y(b) = B^2, \quad B^2 > 2Ab > 0. \]
Hint: for (c) and (d) use the method described in exercise 3.12.
Exercise 3.24
What is the equivalent of the fundamental lemma of the Calculus of Variations in
the theory of functions of many real variables?
Exercise 3.25
Find the general solution of the Euler-Lagrange equation corresponding to the
functional $S[y] = \int_a^b dx\, w(x)\sqrt{1 + y'^2}$, and find explicit solutions in the special
cases $w(x) = \sqrt{x}$ and $w(x) = x$.
Exercise 3.26
Consider the functional
\[ S[y] = \int_0^1 dx\, \left( y'^2 - 1 \right)^2, \quad y(0) = 0, \quad y(1) = A > 0. \]
(a) Show that the Euler-Lagrange equation reduces to $y'^2 = m^2$, where m is a
constant.
(b) Show that the equation $y'^2 = m^2$, with m > 0, has the following three
solutions that fit the boundary conditions, $y_1(x) = Ax$,
\[ y_2(x) = \begin{cases} mx, & 0 \le x \le \dfrac{A + m}{2m}, \\[1ex] A + m(1 - x), & \dfrac{A + m}{2m} \le x \le 1, \end{cases} \qquad m > A, \]
and
\[ y_3(x) = \begin{cases} -mx, & 0 \le x \le \dfrac{m - A}{2m}, \\[1ex] A - m(1 - x), & \dfrac{m - A}{2m} \le x \le 1, \end{cases} \qquad m > A. \]
Show also that on these solutions the functional has the values
\[ S[y_1] = (A^2 - 1)^2, \quad S[y_2] = (m^2 - 1)^2 \quad \text{and} \quad S[y_3] = (m^2 - 1)^2. \]
(c) Deduce that if $A \ge 1$ the minimum value of S[y] is $(A^2 - 1)^2$ and that this
occurs on the curve $y_1(x)$, but if A < 1 the minimum value of S[y] is zero and this
occurs on the curves $y_2(x)$ and $y_3(x)$ with m = 1.
Exercise 3.27
Show that the following functionals do not have stationary values
\[ \text{(a)} \ \int_0^1 dx\, y', \qquad \text{(b)} \ \int_0^1 dx\, y y', \qquad \text{(c)} \ \int_0^1 dx\, x y y', \]
Exercise 3.28
Show that the Euler-Lagrange equations for the functionals
\[ S_1[y] = \int_a^b dx\, F(x, y, y') \quad \text{and} \quad S_2[y] = \int_a^b dx \left( F(x, y, y') + \frac{d}{dx}G(x, y) \right) \]
are identical.
Exercise 3.29
Show that the functional
\[ S[y] = \int_{-1}^{1} dx\, y^2 \left( 2x - y' \right)^2, \quad y(-1) = 0, \quad y(1) = 1, \]
achieves its minimum value, zero, when
\[ y(x) = \begin{cases} 0, & -1 \le x \le 0, \\ x^2, & 0 \le x \le 1, \end{cases} \]
which has no second derivative at x = 0. Show that, despite the fact that y''(x)
does not exist everywhere, the Euler-Lagrange equation is satisfied for $x \neq 0$.
Exercise 3.30
The functional $S[y] = \int_a^b dx\, F(x, y, y')$, y(a) = A, y(b) = B, is stationary on
those paths satisfying the Euler-Lagrange equation
\[ \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = 0, \quad y(a) = A, \quad y(b) = B. \]
In this formulation of the problem we choose to express y in terms of x: however,
we could express x in terms of y, so the functional has the form
\[ J[x] = \int_A^B dy\, G(y, x, x'), \quad x(A) = a, \quad x(B) = b, \]
Exercise 3.31
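Use the approximation 3.3 (page 118) to show that the equations for the values
of $y = (y_1, y_2, \ldots, y_n)$, where $x_{k+1} = x_k + \delta$, that make S(y) stationary are
\[ \frac{\partial S}{\partial y_k} = \delta\,\frac{\partial F}{\partial u}(z_k) + \frac{\partial F}{\partial v}(z_k) - \frac{\partial F}{\partial v}(z_{k+1}) = 0, \quad k = 1, 2, \ldots, n, \]
where $z_k = (x_k, u, v)$, $u = y_k$, $v = (y_k - y_{k-1})/\delta$ and where $y_0 = A$ and $y_{n+1} = B$.
Show also that $z_{k+1} = z_k + \delta(1, y_k', y_k'') + O(\delta^2)$, and hence that
\[ \frac{\partial S}{\partial y_k} = \delta\left( \frac{\partial F}{\partial u} - \frac{\partial^2 F}{\partial x\,\partial v} - y_k'\,\frac{\partial^2 F}{\partial u\,\partial v} - y_k''\,\frac{\partial^2 F}{\partial v^2} \right) + O(\delta^2)
 = -\delta\left( \frac{d}{dx}\left(\frac{\partial F}{\partial v}\right) - \frac{\partial F}{\partial u} \right) + O(\delta^2), \]
where F and its derivatives are evaluated at $z = z_k$.
Hence derive the Euler-Lagrange equations.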
Harder exercises
Exercise 3.32
This exercise is a continuation of exercise 3.22 and uses a set of n variables to
define the polygon. Take a set of n + 2 equally spaced points on the x-axis,
$x_k = k/(n + 1)$, $k = 0, 1, \ldots, n + 1$ with $x_0 = 0$ and $x_{n+1} = 1$, and a polygon
passing through the points $(x_k, y_k)$. Since y(0) = y(1) = 0 we have $y_0 = y_{n+1} = 0$,
leaving n unknown variables.
Show that the functional defined in exercise 3.20 approximates to
\[ S = \frac{1}{h}\sum_{k=0}^{n}\left\{ (y_{k+1} - y_k)^2 - h^2\left( y_k^2 + \frac{2k}{n+1}\,y_k \right) \right\}, \quad h = \frac{1}{n+1}. \]
(a) For n = 1, the case treated in exercise 3.22, show that this reduces to
\[ S(y_1) = \frac{7}{2}y_1^2 - \frac{1}{2}y_1. \]
Explain the difference between this and the previous expression for $S(y_1)$, given
in exercise 3.22.
(b) For n = 2 show that this becomes
\[ S = \frac{17}{3}y_1^2 + \frac{17}{3}y_2^2 - 6y_1 y_2 - \frac{2}{9}y_1 - \frac{4}{9}y_2, \]
and hence that the equations for $y_1$ and $y_2$ are
\[ 34y_1 - 18y_2 = \frac{2}{3}, \quad 34y_2 - 18y_1 = \frac{4}{3}. \]
Solve these equations to show that $y(1/3) \simeq 35/624 \simeq 0.0561$ and
$y(2/3) \simeq 43/624 \simeq 0.0689$. Note that these compare favourably with the exact values,
y(1/3) = 0.0555 and y(2/3) = 0.0682.
Exercise 3.33
Consider the functional $S[y] = \int_a^b dx\, F(y'')$ where F(z) is a differentiable
function and the admissible functions are at least twice differentiable and satisfy the
boundary conditions $y(a) = A_1$, $y(b) = B_1$, $y'(a) = A_2$ and $y'(b) = B_2$.
(a) Show that the function making S[y] stationary satisfies the equation
\[ \frac{\partial F}{\partial y''} = c(x - a) + d \]
where c and d are constants.
(b) In the case that $F(z) = \tfrac12 z^2$ show that the solution is
\[ y(x) = \frac{1}{6}c(x - a)^3 + \frac{1}{2}d(x - a)^2 + A_2(x - a) + A_1, \]
where c and d satisfy the equations
\[ \frac{1}{6}cD^3 + \frac{1}{2}dD^2 = B_1 - A_1 - A_2 D \quad \text{where } D = b - a, \]
\[ \frac{1}{2}cD^2 + dD = B_2 - A_2. \]
(c) Show that this stationary function is also a minimum of the functional.
Exercise 3.34
The theory described in the text considered functionals with integrands
depending only upon x, y(x) and y'(x). However, functionals depending upon higher
derivatives also exist and are important, for example in the theory of stiff beams,
and the equivalent of the Euler-Lagrange equation may be derived using a direct
extension of the methods described in this chapter.
Consider the functional
\[ S[y] = \int_a^b dx\, F(x, y, y', y''), \quad y(a) = A_1, \quad y'(a) = A_2, \quad y(b) = B_1, \quad y'(b) = B_2. \]
being careful to describe the necessary properties of h(x). Hence show that S[y]
is stationary for the functions that satisfy the fourth-order differential equation
\[ \frac{d^2}{dx^2}\left(\frac{\partial F}{\partial y''}\right) - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) + \frac{\partial F}{\partial y} = 0, \]
Exercise 3.35
Using the result derived in the previous exercise, find the stationary functions of
the functionals
(a) \[ S[y] = \int_0^1 dx\, \left( 1 + y''^2 \right), \quad y(0) = 0, \quad y'(0) = y(1) = y'(1) = 1, \]
(b) \[ S[y] = \int_0^{\pi/2} dx\, \left( y''^2 - y^2 + x^2 \right), \quad y(0) = 1, \quad y'(0) = y\left(\frac{\pi}{2}\right) = 0, \quad y'\left(\frac{\pi}{2}\right) = -1. \]
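and hence, since F depends only upon y' and not y, the stationary points are given by
the equations,
\[ \frac{\partial S}{\partial y_k} = F'\left(\frac{y_k - y_{k-1}}{\delta}\right) - F'\left(\frac{y_{k+1} - y_k}{\delta}\right) = 0, \quad k = 1, 2, \ldots, N. \]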
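\[ \frac{d}{d\epsilon}S[y + \epsilon h] = 2\int_0^{\pi/2} dx\, \left[ (y' + \epsilon h')h' - (y + \epsilon h)h \right] \quad \text{and} \quad \Delta S[y, h] = 2\int_0^{\pi/2} dx\, (y'h' - yh). \]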
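(b) We have $S[y + \epsilon h] = \int_a^b dx\, (y' + \epsilon h')^2 x^{-3}$. Hence
\[ \frac{d}{d\epsilon}S[y + \epsilon h] = 2\int_a^b dx\, \frac{(y' + \epsilon h')h'}{x^3} \quad \text{and} \quad \Delta S[y, h] = 2\int_a^b dx\, \frac{y'h'}{x^3}. \]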
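(c) We have $S[y + \epsilon h] = \int_a^b dx\, \left[ (y' + \epsilon h')^2 + (y + \epsilon h)^2 + 2e^x(y + \epsilon h) \right]$. Hence
\[ \frac{d}{d\epsilon}S[y + \epsilon h] = 2\int_a^b dx\, \left[ (y' + \epsilon h')h' + (y + \epsilon h)h + e^x h \right] \quad \text{and} \quad \Delta S[y, h] = 2\int_a^b dx\, \left[ y'h' + (y + e^x)h \right]. \]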
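(d) We have $S[y + \epsilon h] = \int_0^1 dx\, \sqrt{x^2 + (y + \epsilon h)^2}\,\sqrt{1 + (y' + \epsilon h')^2}$. Hence
\[ \frac{d}{d\epsilon}S[y + \epsilon h] = \int_0^1 dx \left[ \frac{(y + \epsilon h)h}{\sqrt{x^2 + (y + \epsilon h)^2}}\sqrt{1 + (y' + \epsilon h')^2} + \frac{\sqrt{x^2 + (y + \epsilon h)^2}\,(y' + \epsilon h')h'}{\sqrt{1 + (y' + \epsilon h')^2}} \right] \]
and
\[ \Delta S[y, h] = \int_0^1 dx \left[ \frac{y\sqrt{1 + y'^2}}{\sqrt{x^2 + y^2}}\,h + \frac{\sqrt{x^2 + y^2}\,y'}{\sqrt{1 + y'^2}}\,h' \right]. \]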
3.8. SOLUTIONS FOR CHAPTER 3 143
so that
\[ \frac{d}{d\epsilon}S[y + \epsilon h] = \int_a^b ds \int_a^b dt\, K(s, t)\left( y(s)h(t) + h(s)y(t) \right) + O(\epsilon). \]
where, in the second integral we have put t0 = s and s0 = t and then changed the
integration order of the first integral to obtain the final result.
Unless z(x) = C, the integrand is almost everywhere positive and hence the integral
is zero only if z(x) = C.
\[ \int \frac{dy}{\sqrt{c - y}} = x \quad \text{or} \quad 2\sqrt{c - y} = A - x. \]
Putting x = 0 gives $2\sqrt{c} = A$ and hence $y = Ax/2 - x^2/4$; putting x = 1 gives
$y(1) = 1 = A/2 - 1/4$, and hence $y = x(5 - x)/4$.
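\[ y'\left( \frac{d}{dx}\left(\frac{\partial G}{\partial y'}\right) - \frac{\partial G}{\partial y} \right) = \frac{\partial^2 G}{\partial y'^2}\,y'y'' + \frac{\partial^2 G}{\partial y\,\partial y'}\,y'^2 - \frac{\partial G}{\partial y}\,y'. \]
\[ \frac{d}{dx}\left( y'\,\frac{\partial G}{\partial y'} - G \right) = y'\,\frac{d}{dx}\left(\frac{\partial G}{\partial y'}\right) - \frac{\partial G}{\partial y}\,y' = \frac{\partial^2 G}{\partial y\,\partial y'}\,y'^2 + \frac{\partial^2 G}{\partial y'^2}\,y'y'' - \frac{\partial G}{\partial y}\,y', \]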
\[ \frac{\partial^2 F}{\partial y'^2}\,y'' + \frac{\partial^2 F}{\partial y\,\partial y'}\,y' + \frac{\partial^2 F}{\partial x\,\partial y'} - \frac{\partial F}{\partial y} = 0. \]
If F(x, y, y') = G(y, y') the third term is zero and if $y = \gamma$ this equation becomes
$G_y(\gamma, 0) = 0$, assuming that $G_{y'y'}(\gamma, 0)$ and $G_{yy'}(\gamma, 0)$ exist.
Let g(y) = G(y, 0) be a function of y. The equation $G_y(\gamma, 0) = 0$ shows that $\gamma$ must
be at a stationary point of g(y) whereas the equation $G(\gamma, 0) = -c$, found in part (a),
imposes the weaker restriction that $-c$ lies in the range of g(y).
Thus, in general, the constant solution $y = \gamma$ of the first-integral is not a solution of
the Euler-Lagrange equation.
so the functional is
\[ J[y] = \frac{1}{\epsilon^2}\int_{-\epsilon}^{\epsilon} dx\, x^2 = \frac{2\epsilon}{3}. \]
The function is continuous provided $\epsilon > 0$ and hence on this class of continuous functions
J[y] can be made arbitrarily small, but not zero.
(c) The given functions behave similarly to the piecewise continuous function defined
in part (b), as seen in figure 3.3 which depicts graphs for $\epsilon = 0.1$ and 0.01.
[Figure 3.3: Graphs of the functions y(x) for $\epsilon = 0.1$ (solid line) and 0.01 (dashed line).]
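\[ J[y] = \frac{2\epsilon^2}{\phi_1^2}\int_0^1 dx\, \frac{x^2}{(x^2 + \epsilon^2)^2} = \frac{2\epsilon}{\phi_1^2}\int_0^{\phi_1} d\phi\, \sin^2\phi \quad \text{where } \tan\phi_1 = \frac{1}{\epsilon}, \]
so that $\phi_1 = \tan^{-1}(1/\epsilon)$ is the constant appearing in the definition of y(x),
and the second integral is obtained by putting $x = \epsilon\tan\phi$. Integration gives
\[ J[y] = \frac{\epsilon}{2\phi_1^2}\left( 2\phi_1 - \sin 2\phi_1 \right)
 = \frac{\epsilon}{2\left(\pi/2 - \tan^{-1}\epsilon\right)^2}\left( \pi - 2\tan^{-1}\epsilon - \frac{2\epsilon}{1 + \epsilon^2} \right)
 = \frac{2\epsilon}{\pi}\,\frac{1 - \dfrac{2}{\pi}\left( \tan^{-1}\epsilon + \dfrac{\epsilon}{1 + \epsilon^2} \right)}{\left( 1 - \dfrac{2}{\pi}\tan^{-1}\epsilon \right)^2}. \]
Since $\tan^{-1}\epsilon = \epsilon + O(\epsilon^3)$ we see that $J[y] = 2\epsilon/\pi + O(\epsilon^2)$. Since
$0 < \epsilon < 1$, J[y] > 0, but can be made arbitrarily small.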
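The closed form can be compared with a direct numerical evaluation of the integral. The sketch below assumes the functional is $J[y] = \int_{-1}^{1} x^2 y'^2\,dx$ with $y(x) = \tan^{-1}(x/\epsilon)/\tan^{-1}(1/\epsilon)$; the function names `J_numeric` and `J_closed` and the number of quadrature points are choices made for this illustration.

```python
import math

def J_numeric(eps, n=20000):
    """Midpoint-rule value of the integral of x^2 * y'(x)^2 over [-1, 1]."""
    norm = math.atan(1.0 / eps)  # constant that makes y(1) = 1
    step = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * step
        dy = eps / (norm * (x * x + eps * eps))  # derivative of atan(x/eps)/norm
        total += x * x * dy * dy * step
    return total

def J_closed(eps):
    """Closed form eps * (2*phi1 - sin(2*phi1)) / (2*phi1^2), tan(phi1) = 1/eps."""
    phi1 = math.atan(1.0 / eps)
    return eps * (2 * phi1 - math.sin(2 * phi1)) / (2 * phi1 ** 2)

for eps in (0.1, 0.01):
    print(eps, J_numeric(eps), J_closed(eps), 2 * eps / math.pi)
```

Both evaluations approach the leading term $2\epsilon/\pi$ as $\epsilon$ decreases, and neither reaches zero for $\epsilon > 0$.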
The boundary term is zero, because h(a) = h(b) = 0, so equation 3.9 becomes
\[ \Delta S[y, h] = \int_a^b dx \left( \frac{\partial F}{\partial y'} - \phi(x) \right) h'(x). \]
On a stationary path $\Delta S = 0$ for all admissible h(x), so the result proved in exercise 3.4
shows that $\partial F/\partial y' - \phi = C$ for some constant C.
\[ S[y + \epsilon h] = -G\left( y(b) + \epsilon h(b) \right) + \frac{1}{2}\int_a^b dx\, \left[ (y' + \epsilon h')^2 + (y + \epsilon h)^2 \right] \]
differentiation with respect to $\epsilon$ and then setting $\epsilon = 0$ gives the Gâteaux differential
\[ \Delta S[y, h] = -G_y(y(b))\,h(b) + \int_a^b dx\, (h'y' + hy). \]
Now integrate by parts and use the fact that h(a) = 0 to cast this in the form
\[ \Delta S[y, h] = \left( y'(b) - G_y(y(b)) \right) h(b) - \int_a^b dx\, \left( y'' - y \right) h. \]
(b) On the variations with h(b) = 0 the boundary term of $\Delta S$ is zero. For S[y] to be
stationary it is necessary that $\Delta S[y, h] = 0$ and it follows from the fundamental lemma
that $y'' - y = 0$ with y(a) = A.
On the path defined by this equation
\[ \Delta S[y, h] = \left( y'(b) - G_y(y(b)) \right) h(b). \]
Since we require $\Delta S[y, h]$ to be zero for all allowed h, which includes those variations
for which $h(b) \neq 0$, we must have
\[ y'(b) = G_y(y(b)). \]
On the variations with h(b) = 0 the boundary term of $\Delta S$ is zero. For S[y] to be
stationary it is necessary that $\Delta S[y, h] = 0$ and it follows from the fundamental lemma
that
\[ \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = 0, \quad y(a) = A. \]
Since we require $\Delta S[y, h]$ to be zero for all allowed h, which includes those variations
for which $h(b) \neq 0$, we must have
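Solution for Exercise 3.17
If $F = \sqrt{1 + y'^2}/\sqrt{y}$ we have
\[ \frac{\partial F}{\partial y} = -\frac{\sqrt{1 + y'^2}}{2y^{3/2}} \quad \text{and} \quad \frac{\partial F}{\partial y'} = \frac{y'}{\sqrt{y}\sqrt{1 + y'^2}} \]
so the Euler-Lagrange equation is
\[ \frac{d}{dx}\left( \frac{y'}{\sqrt{y}\sqrt{1 + y'^2}} \right) + \frac{\sqrt{1 + y'^2}}{2y^{3/2}} = 0 \]
which expands to
\[ \frac{y''}{\sqrt{y}\sqrt{1 + y'^2}} - \frac{y'^2}{2y^{3/2}\sqrt{1 + y'^2}} - \frac{y'^2 y''}{\sqrt{y}\,(1 + y'^2)^{3/2}} + \frac{\sqrt{1 + y'^2}}{2y^{3/2}} = 0 \quad \text{that is} \quad y'' = -\frac{1 + y'^2}{2y}. \]
Now define $y_1 = y$ and $y_2 = y_1'$; the above equation becomes $y_2' = -(1 + y_2^2)/(2y_1)$.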
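(b) This solution gives y(0) = A and $y'(0) = B\omega$ and so we have A = 0 and $B\omega = \alpha$,
so $y = (\alpha/\omega)\sin\omega x$. This solution exists for all $\omega$ and $\alpha$, except possibly at
$\omega = 0$: for small $|\omega|$ we have
\[ y = \alpha x\,\frac{\sin\omega x}{\omega x} = \alpha x\left( 1 - \frac{1}{6}\omega^2 x^2 + \cdots \right) \to \alpha x \quad \text{as } \omega \to 0. \]
Further
\[ \frac{\partial y}{\partial\omega} = -\frac{\alpha}{\omega^2}\sin\omega x + \frac{\alpha x}{\omega}\cos\omega x = -\frac{1}{3}\alpha\omega x^3 + O(\omega^3) \to 0 \quad \text{as } |\omega| \to 0. \]
Hence $\partial y/\partial\alpha$ and $\partial y/\partial\omega$ exist for all $(\alpha, \omega)$.
(c) The general solution gives y(0) = A = 0 and $y(\pi) = B\sin\omega\pi = \beta$. Thus there are
two cases to consider.
First, if $\omega \neq 1, 2, 3, \ldots$, so $\sin\omega\pi \neq 0$, then the solutions are
\[ y(x) = \beta\,\frac{\sin\omega x}{\sin\omega\pi} \quad \text{for all } \beta. \]
Second, if $\omega = 1, 2, 3, \ldots$, so $\sin\omega\pi = 0$, the equation for B is satisfied only if $\beta = 0$ and
then for all B. The solutions are clearly discontinuous at $\omega = 1, 2, 3, \ldots$.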
(b) A second division gives $2^2$ similar triangles of height $2^{-2}h$ and a line of length 2l.
After n divisions there are therefore $2^n$ similar triangles of height $2^{-n}h$ and a continuous
line of length 2l. Since this is true for any l, the length of the line is unbounded.
This function is stationary at the root of $S'(y_1) = 22y_1/3 - 1/2$, that is
$y_1 = 3/44 \simeq 0.0682$.
\[ y\,\frac{dy}{dx} = \frac{1}{2}\frac{d}{dx}\,y^2 = -\frac{1}{4}(1 + x)^2 + \frac{A}{2}, \]
and integrating again, $y(x)^2 = B + Ax - \tfrac16(1 + x)^3$. The boundary conditions then give
$y(0)^2 = B - \tfrac16 = 1$, so $B = \tfrac76$, and $y(1)^2 = \tfrac76 + A - \tfrac86 = 4$, so $A = \tfrac{25}{6}$. Hence the
solution is
\[ y(x)^2 = \frac{1}{6}(1 + x)\left( 25 - (1 + x)^2 \right) - 3 = -3 + \frac{1}{6}(1 + x)(6 + x)(4 - x). \]
The solution is written in this way because it is easier to understand. The cubic
$f = (1 + x)(6 + x)(4 - x)$ is zero at x = -6, -1 and 4; f is positive for x < -6 and
negative for x > 4. It follows that y is real only for $x < x_1$, for some $x_1 < -6$, and
possibly for some x in the interval -1 < x < 4, depending upon the magnitude of f
in this interval. Numerical calculations, which you are not expected to do, show that
$x_1 \simeq -6.33$ and that y is real in the interval (-0.264, 3.59).
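The numerical values quoted above can be reproduced with a few lines of code. The sketch below locates the zeros of $y(x)^2 = -3 + \tfrac16(1 + x)(6 + x)(4 - x)$ by bisection; the bracketing intervals and the helper names `y_squared` and `bisect` are choices made for this illustration.

```python
def y_squared(x):
    """Right-hand side of y(x)^2 = -3 + (1 + x)(6 + x)(4 - x)/6."""
    return -3.0 + (1 + x) * (6 + x) * (4 - x) / 6.0

def bisect(f, lo, hi, tol=1e-10):
    """Find a zero of f in [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid  # zero lies in the upper half
        else:
            hi = mid               # zero lies in the lower half
    return 0.5 * (lo + hi)

# Each bracket contains exactly one sign change of y_squared.
roots = [bisect(y_squared, a, b) for a, b in ((-7, -6), (-1, 0), (3, 4))]
print(roots)
```

The first root is the value $x_1$ below which y is real, and the other two bound the interval containing the boundary points x = 0 and x = 1.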
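(c) The Gâteaux differential is
\[ \Delta S[y, h] = -\frac{1}{2}B\,h(2) + 2\int_1^2 dx\, \frac{y'h'}{x^2}, \quad y(1) = A, \]
\[ \phantom{\Delta S[y, h]} = \left( -\frac{1}{2}B + \frac{1}{2}y'(2) \right) h(2) - 2\int_1^2 dx\, h\,\frac{d}{dx}\left( \frac{y'}{x^2} \right), \]
the second result being obtained using integration by parts and the fact that h(1) = 0.
Using the subset of variations with h(2) = 0 and using the fundamental lemma shows
that the stationary paths must satisfy the Euler-Lagrange equation,
\[ \frac{d}{dx}\left( \frac{y'}{x^2} \right) = 0 \quad \text{that is} \quad \frac{dy}{dx} = \alpha x^2 \quad \text{with } y(1) = A, \]
\[ \Delta S[y, h] = -\frac{1}{2}\left( B - y'(2) \right) h(2), \]
and since h(2) need not be zero, S[y] is stationary only on those paths that satisfy
y'(2) = B, because it is necessary that $\Delta S[y, h] = 0$ for all allowed h. The general
solution of $y' = \alpha x^2$ is $y(x) = \alpha x^3/3 + \beta$ and the boundary conditions give
\[ A = \frac{1}{3}\alpha + \beta, \quad B = 4\alpha \quad \text{so} \quad \alpha = \frac{1}{4}B \quad \text{and} \quad \beta = A - \frac{1}{12}B. \]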
(d) The Gâteaux differential is
\[ \Delta S[y, h] = -\frac{2y(0)h(0)}{A^3} + \int_0^b dx \left( \frac{h}{y'^2} - \frac{2yh'}{y'^3} \right), \quad y(b) = B^2, \]
\[ \phantom{\Delta S[y, h]} = 2h(0)y(0)\left( \frac{1}{y'(0)^3} - \frac{1}{A^3} \right) + \int_0^b dx \left( \frac{1}{y'^2} + 2\frac{d}{dx}\left( \frac{y}{y'^3} \right) \right) h, \]
where we have integrated by parts and used the fact that h(b) = 0. Using the subset
of variations with h(0) = 0 and the fundamental lemma shows that S[y] is stationary
only on those paths that satisfy the Euler-Lagrange equation with $F = y\,y'^{-2}$ and with
the single boundary condition $y(b) = B^2$. Since F is independent of x we may use
the first-integral, equation 3.13 (page 126), to give $y\,y'^{-2} = c^2$, $y(b) = B^2$, where c is a
positive constant (since y(b) > 0 the constant must be positive).
The fact that the sum is zero for all k is the equivalent of the fundamental lemma of
the Calculus of Variations.
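\[ \bar{F}(x, y, y') = F(x, y, y') + \frac{dG}{dx} = F(x, y, y') + \frac{\partial G}{\partial x} + \frac{\partial G}{\partial y}\,y' \]
so that
\[ \frac{\partial\bar{F}}{\partial y} = \frac{\partial F}{\partial y} + \frac{\partial^2 G}{\partial x\,\partial y} + \frac{\partial^2 G}{\partial y^2}\,y' \quad \text{and} \quad \frac{\partial\bar{F}}{\partial y'} = \frac{\partial F}{\partial y'} + \frac{\partial G}{\partial y}. \]
Hence the Euler-Lagrange equation for $\bar{F}$ is
\[ \frac{d}{dx}\left(\frac{\partial\bar{F}}{\partial y'}\right) - \frac{\partial\bar{F}}{\partial y} = \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} + \frac{d}{dx}\left(\frac{\partial G}{\partial y}\right) - \frac{\partial^2 G}{\partial x\,\partial y} - \frac{\partial^2 G}{\partial y^2}\,y'. \]
But,
\[ \frac{\partial^2 G}{\partial x\,\partial y} + \frac{\partial^2 G}{\partial y^2}\,y' = \frac{d}{dx}\left(\frac{\partial G}{\partial y}\right), \]
so the Euler-Lagrange equations for F and $\bar{F}$ are identical, as expected.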
which is satisfied by the functions y(x) = 0 and $y(x) = x^2$. Thus the given function
satisfies the Euler-Lagrange equation except at x = 0 where y''(x) is not defined.
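and
\[ \frac{\partial^2 G}{\partial x\,\partial x'} = \frac{\partial F}{\partial x} - \frac{1}{x'}\frac{\partial^2 F}{\partial x\,\partial y'}, \quad
   \frac{\partial^2 G}{\partial x'^2} = \frac{1}{x'^3}\frac{\partial^2 F}{\partial y'^2}, \quad
   \frac{\partial^2 G}{\partial x'\,\partial y} = \frac{\partial F}{\partial y} - \frac{1}{x'}\frac{\partial^2 F}{\partial y\,\partial y'}. \]
Hence the Euler-Lagrange equation for G becomes
\[ \frac{F_{y'y'}}{x'^3}\,x'' + \left( F_x - \frac{1}{x'}F_{xy'} \right) x' + F_y - \frac{1}{x'}F_{yy'} - x'F_x = 0 \]
which reduces to
\[ \frac{F_{y'y'}}{x'^3}\,x'' - \frac{1}{x'}F_{yy'} - F_{xy'} + F_y = 0. \tag{3.20} \]
(b) The Euler-Lagrange equation for F is $F_{y'y'}\,y'' + F_{yy'}\,y' + F_{xy'} - F_y = 0$. But
\[ \frac{d^2 y}{dx^2} = \frac{d}{dy}\left( \frac{1}{x'} \right)\frac{dy}{dx} = -\frac{x''}{x'^3}, \]
so this equation becomes
\[ -\frac{F_{y'y'}}{x'^3}\,x'' + \frac{1}{x'}F_{yy'} + F_{xy'} - F_y = 0, \]
which is the same as equation 3.20.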
which gives
\[ F(z_{k+1}) = F(z_k) + \delta\left\{ F_x(z_k) + y_k' F_u(z_k) + y_k'' F_v(z_k) \right\} + O(\delta^2). \]
where z(x) is any function and the set of equally spaced points xk = k/(n + 1) defined
in the question. Hence the functional becomes
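\[ S = \frac{1}{h}\sum_{k=0}^{n}(y_{k+1} - y_k)^2 - h\sum_{k=0}^{n} y_k^2 - 2h\sum_{k=0}^{n} x_k y_k, \quad h = \frac{1}{n+1}, \]
\[ \phantom{S} = \frac{1}{h}\sum_{k=0}^{n}\left[(y_{k+1} - y_k)^2 - h^2\left(y_k^2 + \frac{2k}{n+1}\,y_k\right)\right]. \]
(a) If $n = 1$ there are two terms in the sum; the first is $y_1^2/h$, since $y_0 = 0$, and the
second is $(1/h - h)y_1^2 - 2hx_1y_1$, since $y_2 = 0$; with $h = x_1 = 1/2$ this gives
\[ S(y_1) = \frac{7}{2}\,y_1^2 - \frac{1}{2}\,y_1. \]
This function is stationary where $\partial S/\partial y_1 = 7y_1 - 1/2 = 0$, that is $y_1 = 1/14 = 0.0714$,
compared to the exact value of $y(1/2) = 0.0697$.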
The difference between this approximation to S and that obtained in exercise 3.22 is
because the approximations to the functional are different. In both cases we approxi-
mate the solution by the same type of polygon; but in the first case we evaluated the
integrals exactly; in the second case we made an additional approximation to evaluate
which simplify to the given equations. These have the solutions $y_1 = 35/624 \simeq 0.0561$
and $y_2 = 43/624 \simeq 0.0689$, which are the approximate values of the solution at $x = 1/3$
and $2/3$ respectively.
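The polygon values quoted above can be reproduced with a short script. The following is only a sketch, under the assumption that the functional being discretised is $S[y] = \int_0^1 dx\,(y'^2 - y^2 - 2xy)$ with $y(0) = y(1) = 0$, which is what the discrete sum suggests; the function names are illustrative and not part of the text. Setting $\partial S/\partial y_j = 0$ gives a tridiagonal linear system, solved here in exact rational arithmetic.

```python
from fractions import Fraction
import math


def stationary_polygon(n):
    """Interior values y_1..y_n that make the discretised functional stationary.

    Setting dS/dy_j = 0 in the sum gives the tridiagonal system
        -y_{j-1} + (2 - h^2) y_j - y_{j+1} = h^2 x_j,   j = 1..n,
    with y_0 = y_{n+1} = 0, h = 1/(n+1) and x_j = j h.
    """
    h = Fraction(1, n + 1)
    diag = 2 - h * h
    rhs = [h * h * (j * h) for j in range(1, n + 1)]

    # Thomas algorithm: forward elimination, then back substitution.
    c_mod, d_mod = [], []
    for j in range(n):
        denom = diag if j == 0 else diag + c_mod[j - 1]
        c_mod.append(Fraction(-1) / denom)
        prev = Fraction(0) if j == 0 else d_mod[j - 1]
        d_mod.append((rhs[j] + prev) / denom)

    y = [Fraction(0)] * n
    y[-1] = d_mod[-1]
    for j in range(n - 2, -1, -1):
        y[j] = d_mod[j] - c_mod[j] * y[j + 1]
    return y


def exact(x):
    """Solution of y'' + y = -x, y(0) = y(1) = 0 (the assumed exact stationary path)."""
    return math.sin(x) / math.sin(1.0) - x


if __name__ == "__main__":
    for n in (1, 2, 10):
        ys = stationary_polygon(n)
        h = 1.0 / (n + 1)
        worst = max(abs(float(y) - exact((j + 1) * h)) for j, y in enumerate(ys))
        print(n, [str(y) for y in ys[:2]], f"max error {worst:.2e}")
```

Using `Fraction` keeps the small cases exact, so the values for $n = 1$ and $n = 2$ can be compared directly with the fractions in the text.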
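But $h(x)$ and $h'(x)$ are both zero at $x = a$ and $b$, so for the functional to be stationary
we need
\[ \frac{d^2}{dx^2}\left(\frac{\partial F}{\partial y''}\right) = 0. \quad\text{Integrating this twice gives}\quad \frac{\partial F}{\partial y''} = c(x - a) + d, \]
for some constants $c$ and $d$.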
(b) If $F(z) = \frac12 z^2$ the differential equation for $y(x)$ is $y''(x) = c(x - a) + d$. Integrating
this twice gives
\[ y'(x) = \frac12 c(x - a)^2 + d(x - a) + \alpha \quad\text{and} \]
\[ y(x) = \frac16 c(x - a)^3 + \frac12 d(x - a)^2 + \alpha(x - a) + \beta. \]
The boundary conditions at $x = a$ give $y'(a) = A_2 = \alpha$ and $y(a) = A_1 = \beta$, so
\[ y(x) = \frac16 c(x - a)^3 + \frac12 d(x - a)^2 + A_2(x - a) + A_1, \]
and the constants $(c, d)$ are determined from the boundary conditions at $x = b$. Setting
$D = b - a$ the two equations $y(b) = B_1$ and $y'(b) = B_2$ become, respectively,
\[ \frac16 cD^3 + \frac12 dD^2 + A_2D + A_1 = B_1 \quad\text{and}\quad \frac12 cD^2 + dD + A_2 = B_2, \]
which simplify to the quoted equations.
(c) Consider the general functional $S[y] = \int_a^b dx\, F(y'')$, so
\[ S[y + \epsilon h] = S[y] + \epsilon\int_a^b dx\, h''(x)\frac{\partial F}{\partial y''} + \frac12\epsilon^2\int_a^b dx\, h''(x)^2\frac{\partial^2 F}{\partial y''^2} + \cdots \]
and on the stationary path
\[ S[y + \epsilon h] - S[y] = \frac12\epsilon^2\int_a^b dx\, h''(x)^2\frac{\partial^2 F}{\partial y''^2} + \cdots. \]
Since $h''(x)^2 \ge 0$ the sign of this integral depends upon $\partial^2 F/\partial y''^2$. But, in the present
case $F(z) = z^2/2$, $F''(z) = 1$ and hence the integral is positive and the stationary path
is a minimum.
Applications of the
Euler-Lagrange equation
4.1 Introduction
In this chapter we solve the Euler-Lagrange equations for two classic problems, the
brachistochrone, section 4.2, and the minimal surface of revolution, section 4.3. These
examples are of historic importance and special because the Euler-Lagrange equations
can be solved in terms of elementary functions. They are also important because they
are relatively simple yet provide some insight into the complexities of variational prob-
lems.
The first example, the brachistochrone problem, is the simpler of these two prob-
lems and there is always a unique solution satisfying the Euler-Lagrange equation. The
second example is important because it is one of the simplest examples of a minimum
energy problem; but it also illustrates the complexities inherent in nonlinear boundary
value problems and we shall see that there are sometimes two and sometimes no differ-
entiable solutions, depending upon the values of the various parameters. This example
also shows that some stationary paths have discontinuous derivatives and therefore cannot
satisfy the Euler-Lagrange equations everywhere. This effect is illustrated in the
discussion of soap films in section 4.4 and is considered in more detail in chapter 9.
In both these cases you may find the analysis leading to the required solutions com-
plicated. It is, however, important that you are familiar with this type of mathematics
so you should understand the text sufficiently well to be able to write the analysis in
your own words.
162 CHAPTER 4. APPLICATIONS OF THE EULER-LAGRANGE EQUATION
solutions were published in 1697: Newton's comprised the simple statement that the
solution was a cycloid, giving no proof. In section 4.2.3 we prove this result algebraically,
but first we describe necessary preliminary material. In the next section we derive
the parametric equations for the cycloid after giving some historical background. In
section 4.2.2 the brachistochrone problem is formulated in terms of a functional and
the stationary path of this is found in section 4.2.3.
Figure 4.1 Diagram showing how the cycloid OPD is traced out by a circle
rolling along a straight line.
In figure 4.1 a circle of radius a rolls along the x-axis, starting with its centre on the
y-axis. Fix attention on the point P attached to the circle, initially at the origin O. As
the circle rolls P traces out the curve OPD, named the cycloid.
The cycloid has been studied by many mathematicians from the time of Galileo
(1564-1642), and was the cause of so many controversies and quarrels in the 17th
century that it became known as the 'Helen of geometers'. Galileo named the cycloid
but knew insufficient mathematics to make progress. He tried to find the area between
it and the x-axis, but the best he could do was to trace the curve on paper, cut out the
arc and weigh it, to conclude that its area was a little less than three times that of the
generating circle; in fact it is exactly three times the area of this circle, as you can
show in exercise 4.3. He abandoned his study of the cycloid, suggesting only that the
cycloid would make an attractive arch for a bridge. This suggestion was implemented
in 1764 with the building of a bridge with three cycloidal arches over the river Cam in
the grounds of Trinity College, Cambridge, shown in figure 4.2.
The reason why cycloidal arches were used is no longer known, all records and
original drawings having been lost. However, it seems likely that the architect, James
Essex (1722-1784), chose this shape to impress Robert Smith (1689-1768), the Master
of Trinity College, who was keen to promote the study of applied mathematics.
4.2. THE BRACHISTOCHRONE 163
Figure 4.2 Essex's bridge over the Cam, in the grounds of Trinity
College, having three cycloidal arches.
The area under a cycloid was first calculated in 1634 by Roberval (1602-1675). In
1638 he also found the tangent to the curve at any point, a problem solved at about
the same time by Fermat (1601-1665) and Descartes (1596-1650). Indeed, it was at
this time that Fermat gave the modern definition of a tangent to a curve. Later, in
1658, Wren (1632-1723), the architect of St Paul's Cathedral, determined the length
of a cycloid.
Pascal's (1623-1662) last mathematical work, in 1658, was on the cycloid and,
having found certain areas, volumes and centres of gravity associated with the cycloid,
he proposed a number of such questions to the mathematicians of his day with first and
second prizes for their solution. However, publicity and timing were so poor that only
two solutions were submitted and because these contained errors no prizes were awarded,
which caused a degree of aggravation among the two contenders, A de Lalouvère
(1600-1664) and John Wallis (1616-1703).
At about the time of this contest Huygens (1629-1695) designed the first pendulum
clock, which was made by Salomon Coster in 1658, but he was aware that the period of the
pendulum depended upon the amplitude of the swing. It occurred to him to consider the
motion of an object sliding on an inverted cycloidal arch and he found that the object
reaches the lowest point in a time independent of the starting point. The question
that remained was how to persuade a pendulum to oscillate in a cycloidal, rather than
a circular, arc. Huygens now made the remarkable discovery illustrated in figure 4.3.
If a pendulum is suspended from a point P at the cusp between two inverted cycloidal
arcs PQ and PR, and its length equals that of one of the semi-arcs, it will swing in a
cycloidal arc QSR which has the same size and shape as the cycloidal arcs of which PQ
and PR are parts. Such a pendulum will have a period independent of the amplitude
of the swing.
Figure 4.3 Diagram showing how Huygens' cycloidal pendulum, PT, swings between two fixed,
similar cycloidal arcs PR and PQ.
Huygens made a pendulum clock with cycloidal jaws, but found that in practice it
was no more accurate than an ordinary pendulum clock: his results on the cycloid
were published in 1673 when his Horologium Oscillatorium appeared (see footnote 1). However, the
discovery illustrated in figure 4.3 was significant in the development of the mathematical
understanding of curves in space.
which are the parametric equations of the cycloid. For $|\theta| \ll 1$, $x$ and $y$ are related
approximately by $y = (a/2)(6x/a)^{2/3}$, see exercise 4.2. The arc OPD is traced out as
$\theta$ increases from 0 to $2\pi$.
If, in figure 4.3, the y-axis is in the direction PS, that is pointing downwards, the
upper arc QPR, with the cusp at P, is given by these equations with $-\pi \le \theta \le \pi$ and
it can be shown, see exercise 4.28, that the lower arc is described by $x = a(\theta + \sin\theta)$,
$y = a(3 + \cos\theta)$, and the same range of $\theta$. The following three exercises provide practice
in the manipulation of the cycloid equations; further examples are given in exercises 4.26-4.28.
Exercise 4.1
Show that the gradient of the cycloid is given by $\dfrac{dy}{dx} = \dfrac{1}{\tan(\theta/2)}$. Deduce that the
cycloid intersects the x-axis perpendicularly when $\theta = 0$ and $2\pi$.
1 A more detailed account of Huygens' work is given in Unrolling Time by J G Yoder (Cambridge
University Press).
Exercise 4.2
By using the Taylor series of $\sin\theta$ and $\cos\theta$ show that for small $|\theta|$, $x \simeq a\theta^3/6$
and $y \simeq a\theta^2/2$. By eliminating $\theta$ from these equations show that near the origin
$y \simeq (a/2)(6x/a)^{2/3}$.
Exercise 4.3
Show that the area under the arc OPD in figure 4.1 is $3\pi a^2$ and that the length
of the cycloidal arc OP is $s(\theta) = 8a\sin^2(\theta/4)$.
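Before attempting the algebra, the two results of exercise 4.3 can be checked numerically. This is a rough sketch, assuming the standard parametrisation $x = a(\theta - \sin\theta)$, $y = a(1 - \cos\theta)$ for the cycloid (consistent with the small-$\theta$ limits in exercise 4.2); the function names and the midpoint rule are illustrative choices only.

```python
import math


def area_under_arch(a=1.0, n=20000):
    """Area between one cycloid arch and the x-axis: integral of y dx over 0 <= theta <= 2 pi."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h                      # midpoint of the k-th sub-interval
        y = a * (1.0 - math.cos(t))
        dx_dtheta = a * (1.0 - math.cos(t))
        total += y * dx_dtheta * h
    return total


def arc_length(theta, a=1.0, n=20000):
    """Length of the cycloid from the origin to the point with parameter theta."""
    h = theta / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        # sqrt((dx/dtheta)^2 + (dy/dtheta)^2)
        total += a * math.hypot(1.0 - math.cos(t), math.sin(t)) * h
    return total


if __name__ == "__main__":
    print(area_under_arch(), 3.0 * math.pi)
    print(arc_length(2.0), 8.0 * math.sin(0.5) ** 2)
```

Each printed pair should agree closely if the stated formulae are correct; the check is no substitute for the derivation asked for in the exercise.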
Figure 4.4 Diagram showing the curve y(x) through (0, A) and
(b, 0) on which the bead slides. Here s(x) is the distance along
the curve from the starting point to P = (x, y(x)) on it.
At a point P = (x, y(x)) on this curve let s(x) be the distance along the curve from the
starting point, so the speed of the bead is defined to be $v = ds/dt$. The kinetic energy
of a bead having mass $m$ at P is $\frac12 mv^2$ and its potential energy is $mgy$; because the
bead is sliding without friction, energy conservation gives
\[ \frac12 mv^2 + mgy = E, \tag{4.2} \]
where the energy $E$ is given by the initial conditions, $E = \frac12 mv_0^2 + mgA$, $v_0$ being the
initial speed at $P_a = (0, A)$. Small changes in $s$ are given by $\delta s^2 = \delta x^2 + \delta y^2$, and so
\[ \left(\frac{ds}{dt}\right)^2 = \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 = \left(\frac{dx}{dt}\right)^2\left(1 + y'(x)^2\right). \tag{4.3} \]
Thus on rearranging equation 4.2 we obtain
\[ \left(\frac{ds}{dt}\right)^2 = \frac{2E}{m} - 2gy \quad\text{or}\quad \frac{dx}{dt}\sqrt{1 + y'(x)^2} = \sqrt{\frac{2E}{m} - 2gy(x)}. \tag{4.4} \]
\[ T = \int_0^T dt = \int_0^b dx\,\frac{1}{dx/dt}. \]
Thus on re-arranging equation 4.4 to express $dx/dt$ in terms of $y(x)$ we obtain the
required functional,
\[ T[y] = \int_0^b dx\,\sqrt{\frac{1 + y'^2}{2E/m - 2gy}}. \tag{4.5} \]
This functional may be put in a slightly more convenient form by noting that the energy
and the initial conditions are related by equation 4.2, so by defining the new dependent
variable
\[ z(x) = A + \frac{v_0^2}{2g} - y(x) \quad\text{we obtain}\quad T[z] = \int_0^b dx\,\sqrt{\frac{1 + z'^2}{2gz}}. \tag{4.6} \]
Exercise 4.4
(a) Find the time, $T$, taken for a particle of mass $m$ to slide down the straight
line, $y = Ax$, from the point $(X, AX)$ to the origin when the initial speed is $v_0$.
Show that if $v_0 = 0$ this is
\[ T = \sqrt{\frac{2X}{gA}}\,\sqrt{1 + A^2}. \]
(b) Show also that if the point $(X, AX)$ lies on the circle of radius $R$ and with
centre at $(0, R)$, so the equation of the circle is $x^2 + (y - R)^2 = R^2$, then the time
taken to slide along the straight line to the origin is independent of $X$ and is given
by
\[ T = 2\sqrt{\frac{R}{g}}. \]
This surprising result was known by Galileo and seems to have been one reason
why he thought that the solution to the brachistochrone problem was a circle.
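The independence of $X$ claimed in part (b) is easy to test numerically before proving it. The sketch below takes the formula of part (a) as given and assumes that the chord meets the circle at $X = 2RA/(1 + A^2)$, obtained by substituting $y = Ax$ into the circle equation; the function names and the values of $R$ and $g$ are illustrative.

```python
import math


def slide_time(X, A, g=9.81):
    """Time from rest down y = A x from (X, A X) to the origin, exercise 4.4(a)."""
    return math.sqrt(2.0 * X * (1.0 + A * A) / (g * A))


def chord_start(A, R):
    """Abscissa X != 0 where y = A x meets the circle x^2 + (y - R)^2 = R^2."""
    return 2.0 * R * A / (1.0 + A * A)


if __name__ == "__main__":
    R, g = 1.5, 9.81
    for A in (0.2, 1.0, 5.0):
        # The two printed times should agree for every slope A.
        print(A, slide_time(chord_start(A, R), A, g), 2.0 * math.sqrt(R / g))
```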
Exercise 4.5
Show that the functional defined in equation 4.6 when expressed using $z$ as the
independent variable and if $v_0 = 0$ becomes
\[ T[x] = \frac{1}{\sqrt{2g}}\int_0^A dz\,\sqrt{\frac{1 + x'(z)^2}{z}}, \quad x(0) = 0, \quad x(A) = b, \]
4.2.3 A solution
The integrand of the functional 4.6 is independent of $x$, so we may use equation 3.13
(page 126) to write Euler's equation in the form
\[ z'\frac{\partial F}{\partial z'} - F = \text{constant} \quad\text{where}\quad F(z, z') = \sqrt{\frac{1 + z'^2}{z}}. \]
Note that the external constant $(2g)^{-1/2}$ can be ignored. Since
\[ \frac{\partial F}{\partial z'} = \frac{z'}{\sqrt{z(1 + z'^2)}} \quad\text{this gives}\quad \frac{z'^2}{\sqrt{z(1 + z'^2)}} - \frac{\sqrt{1 + z'^2}}{\sqrt{z}} = -\frac{1}{c} \]
for some positive constant $c$; note that $c$ must be positive because the left-hand side
of the above equation is negative. Rearranging the last expression gives
\[ z\left(1 + z'^2\right) = c^2 \quad\text{or}\quad \frac{dz}{dx} = \pm\sqrt{\frac{c^2}{z} - 1}. \tag{4.7} \]
This first-order differential equation is separable and can be solved. First, however, note
that because the y-axis is vertically upwards we expect the solution $y(x)$ to decrease
away from $x = 0$, that is $z(x)$ will increase, so we take the positive sign and then
integration gives
\[ x = \int dz\,\sqrt{\frac{z}{c^2 - z}}. \]
Now substitute $z = c^2\sin^2\phi$ to give
\[ x = 2c^2\int d\phi\,\sin^2\phi = c^2\int d\phi\,(1 - \cos 2\phi) = \frac12 c^2(2\phi - \sin 2\phi) + d \quad\text{and}\quad z = \frac12 c^2(1 - \cos 2\phi), \tag{4.8} \]
where d is a constant. Both c and d are determined by the values of A, b and the
initial speed, v0 . Comparing these equations with equation 4.1 we see that the required
stationary curve is a cycloid. It is shown in chapter 7 that, in some cases, this solution
is a global minimum of T [z].
In the case that the particle starts from rest, $v_0 = 0$, these solutions give
\[ x = d + \frac12 c^2(2\phi - \sin 2\phi), \quad y = A - \frac12 c^2(1 - \cos 2\phi), \]
where $c$ and $d$ are constants determined by the known end points of the curve.
At the starting point $y = A$ so here $\phi = 0$ and since $x = 0$ it follows that $d = 0$:
because (0) = 0 the particle initially falls vertically downwards. At the final point of
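the curve, $x = b$, $y = 0$, let $\phi = \phi_b$. Then
\[ \frac{2b}{c^2} = 2\phi_b - \sin 2\phi_b, \quad \frac{2A}{c^2} = 1 - \cos 2\phi_b, \]
giving two equations for $c$ and $\phi_b$: we now show that these equations have a unique,
real solution. Consider the cycloid
\[ u = 2\theta - \sin 2\theta, \quad v = 1 - \cos 2\theta, \quad 0 \le \theta \le \pi. \tag{4.9} \]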
The value of $\phi_b$ is given by the value of $\theta$ where this cycloid intersects the straight line
$Au = bv$. The graphs of these two curves are shown in the following figure.
Figure 4.5 Graph of the cycloid defined in equation 4.9 and
the straight line $bv = Au$.
Because the gradient of the cycloid at $\theta = 0$, ($u = v = 0$), is infinite this graph shows
that there is a single value of $\phi_b$ for all positive values of the ratio $A/b$. By dividing the
first of equations 4.9 by the second we see that $\phi_b$ is given by solving the equation
\[ \frac{2\phi_b - \sin 2\phi_b}{2\sin^2\phi_b} = \frac{b}{A}, \quad 0 < \phi_b < \pi. \tag{4.10} \]
Unless $b/A$ is small this equation can only be solved numerically. Once $\phi_b$ is known,
the value of $c$ is given from the equation $2A/c^2 = 1 - \cos 2\phi_b$, which may be put in the
more convenient form $c^2 = A/\sin^2\phi_b$.
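One way to carry out the numerical solution of equation 4.10 is by bisection, since the left-hand side increases from 0 to infinity on $(0, \pi)$. The following is a minimal sketch, not a definitive implementation: the function names, the bracketing interval and the value of $g$ are assumptions made for illustration, and the time of passage uses the formula quoted in exercise 4.7.

```python
import math


def ratio(phi):
    """Left-hand side of equation 4.10."""
    return (2.0 * phi - math.sin(2.0 * phi)) / (2.0 * math.sin(phi) ** 2)


def solve_phi_b(b, A, iters=200):
    """Bisection for phi_b in (0, pi); assumes b/A is neither extremely small nor extremely large."""
    target = b / A
    lo, hi = 1e-6, math.pi - 1e-6
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def brachistochrone(b, A, g=9.81):
    """Return (phi_b, c^2, T, T_SL) for a particle starting from rest at (0, A) and ending at (b, 0)."""
    phi_b = solve_phi_b(b, A)
    c2 = A / math.sin(phi_b) ** 2
    T = math.sqrt(2.0 * A / g) * phi_b / math.sin(phi_b)      # cycloid, exercise 4.7
    T_sl = math.sqrt(2.0 * (A * A + b * b) / (A * g))          # straight line
    return phi_b, c2, T, T_sl


if __name__ == "__main__":
    for b in (0.1, 0.5, 1.0, math.pi / 2, 2.0, 4.0):
        phi_b, c2, T, T_sl = brachistochrone(b, 1.0)
        print(f"b={b:.3f}  phi_b={phi_b:.4f}  c^2={c2:.4f}  T_SL/T={T_sl / T:.4f}")
```

For very small $b/A$ the numerator of `ratio` suffers from cancellation, and there the series of exercise 4.6 is the better tool.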
Exercise 4.6
Show that if $A \gg b$ then $\phi_b \simeq 3b/2A$ and that $y/A \simeq 1 - (x/b)^{2/3}$.
Exercise 4.7
Use the solution defined in equation 4.8 to show that on the stationary path the
time of passage is
\[ T[z] = \sqrt{\frac{2A}{g}}\,\frac{\phi_b}{\sin\phi_b}. \]
We end this section by showing a few graphs of the solution 4.8 and quoting some
formulae that help understand them; the rest of this section is not assessed.
In the following figure are depicted graphs of the stationary paths for A = 1 and
various values of b, ranging from small to large, so all curves start at (0, 1) but end at
the points (b, 0), with $0.1 \le b \le 4$.
Figure 4.6 Graphs showing the stationary paths joining the points
(0, 1) and (b, 0) for $b = 0.1$, $1/2$, $1$, $\pi/2$, $2$, $3$ and $4$.
From figure 4.6 we see that for small $b$ the stationary path is close to that of a straight
line, as would be expected. In this case $\phi_b$ is small and it was shown in exercise 4.6
that
\[ \phi_b = \frac{3b}{2A} - \frac{9b^3}{20A^3} + O(b^5) \quad\text{and that}\quad \frac{y}{A} \simeq 1 - \left(\frac{x}{b}\right)^{2/3}. \]
Also the time of passage is
\[ T = \sqrt{\frac{2A}{g}}\left(1 + \frac{3b^2}{8A^2} - \frac{81b^4}{640A^4} + O(b^6)\right). \]
By comparison, if a particle slides down the straight line joining (0, A) to (b, 0), that is
$y/A + x/b = 1$, so $z = Ax/b$, then the time of passage is
\[ T_{SL} = \sqrt{\frac{2(A^2 + b^2)}{Ag}} = \begin{cases} \sqrt{\dfrac{2A}{g}}\left(1 + \dfrac{b^2}{2A^2} + O(b^4)\right), & b \ll A, \\[2ex] b\sqrt{\dfrac{2}{Ag}}\left(1 + \dfrac{A^2}{2b^2} + O(b^{-4})\right), & b \gg A. \end{cases} \]
Thus, for small $b$, the relative difference is
\[ T_{SL} - T = \frac{b^2}{8A^2}\,T + O(b^4). \]
Returning to figure 4.6 we see for small $b$ the stationary paths cross the x-axis at
the terminal point. At some critical value of $b$ the stationary path is tangential to the
x-axis at the terminal point. We can see from the equation for $x(\phi)$ that this critical
path occurs when $y'(\phi) = 0$, that is when $\phi_b = \pi/2$ and, from equation 4.10, we see
that this gives $b = A\pi/2$. On this path the time of passage is
\[ T = \frac{\pi}{2}\sqrt{\frac{2A}{g}} \quad\text{and also}\quad T_{SL} = T\sqrt{1 + \frac{4}{\pi^2}} = 1.185\,T. \]
For $b > A\pi/2$ the stationary path dips below the x-axis and approaches the terminal
point from below. For $b \gg A\pi/2$ it can be shown that $\phi_b = \pi - \sqrt{\pi A/b} + O(b^{-3/2})$,
Exercise 4.8
Galileo thought that the solution to the brachistochrone problem was given by the
circle passing through the initial and final points, (0, A) and (b, 0), and tangential
to the y-axis at the start point.
Show that the equation of this circle is $(x - R)^2 + (y - A)^2 = R^2$, where $R$ is
its radius given by $2bR = A^2 + b^2$. Show also that if $x = R(1 - \cos\theta)$ and
$y = A - R\sin\theta$, then the time of passage is
\[ T = \sqrt{\frac{R}{2g}}\int_0^{\theta_b} d\theta\,\frac{1}{\sqrt{\sin\theta}} \quad\text{where}\quad \sin\theta_b = \frac{A}{R} = \frac{2Ab}{A^2 + b^2}. \]
If $b \ll A$ show that $T \simeq \sqrt{2A/g}$.
Figure 4.7 Diagram showing the construction of a surface of revolution, on the left,
and, on the right, the small segment used to construct the integral 4.11.
4.3. MINIMAL SURFACE OF REVOLUTION 171
This section is divided into three parts. First, we derive the functional S[y] giving the
required area. Second, we derive the equation that a sufficiently differentiable function
must satisfy to make the functional stationary. Finally we solve this equation in a
simple case and show that even this relatively simple problem has pitfalls.
The area $\delta S$ traced out by this segment as it rotates about the x-axis is the circumference
of the circle of radius $y(x)$ times $\delta s$; to order $\delta x$ this is
\[ \delta S = 2\pi y(x)\,\delta s = 2\pi y\sqrt{1 + y'^2}\,\delta x. \]
Hence the area of the whole surface from $x = a$ to $x = b$ is given by the functional
\[ S[y] = 2\pi\int_a^b dx\, y\sqrt{1 + y'^2}, \quad y(a) = A \ge 0, \quad y(b) = B > 0, \tag{4.11} \]
with no loss of generality we may assume that $A \le B$ and hence that $B > 0$.
Exercise 4.9
Show that the equation of the straight line joining (a, A) to (b, B) is
\[ y = \frac{B - A}{b - a}(x - a) + A. \]
Use this together with equation 4.11 to show that the surface area of the frustum
of the cone shown in figure 4.8 is given by
\[ S = \pi(B + A)\sqrt{(b - a)^2 + (B - A)^2}. \]
Note that the frustum of a solid is that part of the solid lying between two parallel
planes which cut the solid; its area does not include the area of the parallel ends.
Figure 4.8 Diagram showing the frustum of a cone, the unshaded area. The
slant-height is l and the radii of the circular ends are A and B.
Show further that this expression may be written in the form $\pi(A + B)l$ where $l$
is the length of the slant height and $A$ and $B$ are the radii of the end circles.
The following exercise may seem a non-sequitur, but it illustrates two important points.
First, it shows how a simple version of Euler's method, section 3.2, can provide a useful
approximation to a functional. Second, it shows how a very simple approximation can
capture the essential, quite complicated, behaviour of a functional: this is important
because only rarely can the Euler-Lagrange equation be solved exactly. In particular
it suggests that in the simple case $A = B$, with $y(x)$ defined on $|x| \le a$, there are
stationary paths only if $A/a$ is sufficiently large and then there are two stationary
paths.
Exercise 4.10
Consider the case $A = B$ and with $-a \le x \le a$, so the functional 4.11 becomes
\[ S[y] = 2\pi\int_{-a}^{a} dx\, y\sqrt{1 + y'^2}, \quad y(\pm a) = A > 0. \]
(a) Assume that the required stationary paths are even and use a variation of
Euler's method, described in section 3.2.1, by assuming that
\[ y(x) = \alpha + \frac{A - \alpha}{a}\,x, \quad 0 \le x \le a, \]
where $\alpha$ is a constant, to derive an approximation, $S(\alpha)$, for $S[y]$.
(b) By differentiating this expression with respect to $\alpha$ show that $S(\alpha)$ is stationary
if $\alpha = \alpha_\pm = \left(A \pm \sqrt{A^2 - 2a^2}\right)/2$, and deduce that no such solutions exist if
$A < a\sqrt{2}$. Note that the exact calculation, described below, shows that there are
no continuous stationary paths if $A < 1.51a$.
(c) Show that if $A > a\sqrt{2}$ the two stationary values of $S$ satisfy $S(\alpha_-) > S(\alpha_+)$.
(d) If $A \gg a$ show that the two values of $\alpha$ are given approximately by
\[ \alpha_+ = A - \frac{a^2}{2A} + \cdots \quad\text{and}\quad \alpha_- = \frac{a^2}{2A}\left(1 + \frac{a^2}{2A^2} + \cdots\right), \]
and find suitable approximations for the associated stationary paths. Show also
that the stationary values of $S$ are given approximately by $S(\alpha_-) \simeq 2\pi A^2$ and
$S(\alpha_+) \simeq 4\pi Aa$, and give a physical interpretation of these values.
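\[ \frac{\partial G}{\partial y'} = \frac{yy'}{\sqrt{1 + y'^2}} \quad\text{and}\quad y'\frac{\partial G}{\partial y'} - G = -\frac{y}{\sqrt{1 + y'^2}}. \]
Hence the Euler-Lagrange equation integrates to
\[ \frac{y}{\sqrt{1 + y'^2}} = c, \quad y(a) = A \ge 0, \quad y(b) = B > 0, \tag{4.12} \]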
for some constant $c$; since $y(b) > 0$ we may assume that $c$ is positive. By squaring and
re-arranging this equation we obtain the simpler first-order equation
\[ \frac{dy}{dx} = \frac{\sqrt{y^2 - c^2}}{c}, \quad y(a) = A \ge 0, \quad y(b) = B > 0. \tag{4.13} \]
The solutions of equation 4.13, if they exist, ensure that the functional 4.11 is stationary.
We shall see, however, that suitable solutions do not always exist and that when they
do further work is necessary in order to determine the nature of the stationary point.
of the original Euler-Lagrange equation, see the discussion in section 3.4, in particular exercise 3.8.
Notice that f (0) = c, so c is the height of the curve at the origin, where f (x) is
stationary; also, because = 0 the solution is even. The required solutions are obtained
by finding the real values of c satisfying this equation. Unfortunately, the equation
A = c cosh(a/c) cannot be inverted to express c in terms of known functions of A.
Numerical solutions may be found, but first it is necessary to determine those values of
a and A for which real solutions exist.
A convenient way of writing this equation is to introduce a new dimensionless variable
$\eta = a/c$ so we may write the equation for $c$ in the form
\[ \frac{A}{a} = g(\eta) \quad\text{where}\quad g(\eta) = \frac{1}{\eta}\cosh\eta. \tag{4.16} \]
This equation shows directly that $\eta$ depends only upon the dimensionless ratio $A/a$. In
terms of $\eta$ and $A$ the solution 4.15 becomes
\[ f(x) = \frac{a}{\eta}\cosh\left(\frac{\eta x}{a}\right) = A\,\frac{\cosh(\eta x/a)}{\cosh\eta}. \tag{4.17} \]
The stationary solutions are found by solving the equation $A/a = g(\eta)$ for $\eta$. The
graph of $g(\eta)$, depicted in figure 4.9, shows that $g(\eta)$ has a single minimum and that for
$A/a > \min(g)$ there are two real solutions, $\eta_1$ and $\eta_2$, with $\eta_1 < \eta_2$, giving the shapes
$f_1(x)$ and $f_2(x)$ respectively.
Figure 4.9 Graph of $g(\eta) = \eta^{-1}\cosh\eta$ showing the solutions
of the equation $g(\eta) = A/a$.
This graph also suggests that $g(\eta) \to \infty$ as $\eta \to 0$ and $\infty$; this behaviour can be verified
with the simple analysis performed in exercise 4.12, which shows that
\[ g(\eta) \sim \frac{1}{\eta} \quad\text{for}\quad \eta \ll 1 \quad\text{and}\quad g(\eta) \sim \frac{e^{\eta}}{2\eta} \quad\text{for}\quad \eta \gg 1. \]
The minimum of $g(\eta)$ is at the real root of $\eta\tanh\eta = 1$, see exercise 4.13; this may be
found numerically, and is at $\eta_m \simeq 1.200$, and here $g(\eta_m) = 1.509$. Hence if $A < 1.509a$
there are no real solutions of equation 4.16, meaning that there are no functions with
continuous derivatives making the area stationary. For $A > 1.509a$ there are two
real solutions giving two stationary values of the functional 4.11; we denote these two
solutions by $\eta_1$ and $\eta_2$ with $\eta_1 < \eta_2$. Because there is no upper bound on the area
neither solution can be a global maximum. Recall that in exercise 4.10 it was shown
that a simple polygon approximation to the stationary path did not exist if $A < a\sqrt{2}$
and there were two solutions if $A > a\sqrt{2}$.
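The numerical values quoted here can be reproduced with a simple bisection. This is a sketch under stated assumptions rather than a prescribed method: the helper names, bracketing intervals and iteration count are illustrative choices, and $g(\eta) = \cosh\eta/\eta$ is taken from equation 4.16.

```python
import math


def g(eta):
    """g(eta) = cosh(eta) / eta, equation 4.16."""
    return math.cosh(eta) / eta


def bisect(f, lo, hi, iters=200):
    """Root of f in [lo, hi], assuming f(lo) <= 0 <= f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def eta_min():
    """Position of the minimum of g: the real root of eta * tanh(eta) = 1."""
    return bisect(lambda e: e * math.tanh(e) - 1.0, 0.5, 2.0)


def catenary_roots(ratio):
    """Both solutions of g(eta) = A/a, or None when A/a is below min(g)."""
    em = eta_min()
    if ratio < g(em):
        return None
    # g decreases on (0, eta_m) and increases on (eta_m, infinity).
    eta1 = bisect(lambda e: ratio - g(e), 1e-9, em)
    hi = 2.0 * em
    while g(hi) < ratio:
        hi *= 2.0
    eta2 = bisect(lambda e: g(e) - ratio, em, hi)
    return eta1, eta2


if __name__ == "__main__":
    em = eta_min()
    print("eta_m =", em, " g(eta_m) =", g(em))
    for ratio in (1.4, 2.0, 10.0):
        print(ratio, catenary_roots(ratio))
```

Splitting the search at $\eta_m$ ensures that each bisection runs on an interval where $g$ is monotonic, so each call finds exactly one of the two roots.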
The following graph shows values of the dimensionless area $S/a^2$ for these two stationary
solutions as functions of $A/a$ when $A/a \ge g(\eta_m) \simeq 1.509$. The area associated
with the smaller root, $\eta_1$, is denoted by $S_1$, with $S_2$ denoting the area associated with
$\eta_2$. These graphs show that $S_2 > S_1$ for $A > ag(\eta_m) \simeq 1.51a$.
Figure 4.10 Graphs showing how the dimensionless area
$S/a^2$ varies with $A/a$.
It is difficult to find simple approximations for the area $S[f]$ except when $A \gg a$, in
which case the results obtained in exercise 4.12 and 4.13 may be used, as shown in the
following analysis. We consider the smaller and larger roots separately.
If $A \gg a$ the smaller root, $\eta_1$, is seen from figure 4.9 to be small. The approximation
developed in exercise 4.12 gives $\eta_1 \simeq a/A$ so that equation 4.17 becomes
\[ f_1(x) \simeq A\cosh(x/A) \simeq A, \]
since $|x| \le a \ll A$ and $\cosh(x/A) \simeq 1$. Because $f_1(x)$ is approximately constant the
original functional, equation 4.11, is easily evaluated to give
\[ S_1 = S[f_1] = 4\pi aA \quad\text{or}\quad \frac{S_1}{a^2} = 4\pi\frac{A}{a}. \]
The latter expression is the equation of the approximately straight line seen in
figure 4.10. The area $S_1$ is that of the right circular cylinder formed by joining the ends
with parallel lines.
For the larger root, $\eta_2$, since $\cosh\eta \simeq e^{\eta}/2$, for large $\eta$, equation 4.16 for $\eta$ becomes,
see exercise 4.12,
\[ \frac{A}{a} = \frac{1}{2\eta}\,e^{\eta} \tag{4.18} \]
and
\[ f_2(x) \simeq A\exp\left(-\frac{\eta_2}{a}(a - x)\right) + A\exp\left(-\frac{\eta_2}{a}(a + x)\right), \quad \eta_2 \gg 1. \]
For positive $x$ the second term is negligible (because $\eta_2 \gg 1$) provided $x\eta_2 \gg a$. For
negative $x$ the first term is negligible, for the same reason. Hence an approximation for
$f_2(x)$ is
\[ f_2(x) \simeq A\exp\left(-\frac{\eta_2}{a}(a - |x|)\right) \quad\text{provided}\quad |x|\eta_2 \gg a. \tag{4.19} \]
The behaviour of this function as $A/a \to \infty$ is discussed after equation 4.20. In
exercise 4.12 it is shown that the area is given by
\[ S_2 = S[f_2] \simeq 2\pi A^2 \quad\text{or}\quad \frac{S_2}{a^2} = 2\pi\left(\frac{A}{a}\right)^2, \]
which is the same as the area of the cylinder ends. The latter expression increases
quadratically with $A/a$, as seen in figure 4.10.
These approximations show directly that if $A \gg a$ then $S_2 > S_1$, confirming the
conclusions drawn from figure 4.10. They also show that when $A \gg a$ the smallest area
is given when the surface of revolution approximates that of a right circular cylinder.
In the following three figures we show examples of these solutions for $A = 2a$,
$A = 10a$ and $A = 100a$. In the first example, on the left, the ratio $A/a = 2$ is only
a little larger than $\min(g(\eta)) \simeq 1.509$, but the two solutions differ substantially, with
$f_1(x)$ already close to the constant value of $A$ for all $x$. In the two other figures the
ratio $A/a$ is larger and now $f_1(x)$ is indistinguishable from the constant $A$, while $f_2(x)$
is relatively small for most values of $x$.
Figure 4.11 Graphs showing the stationary solutions $f(x)/A = \cosh(\eta x/a)/\cosh\eta$ as a function of $x/a$
and for various values of $A/a$, with $a = 1$.
These figures and the preceding analysis show that when the ends are relatively close,
that is $A/a$ large, $f_1(x) \simeq A$, for all $x$, and that as $A/a \to \infty$, $f_2(x)$ tends to the
function
\[ f_2(x) \to f_G(x) = \begin{cases} 0, & |x| < a, \\ A, & |x| = a. \end{cases} \tag{4.20} \]
This result may be derived from the approximate solution given in equation 4.19.
Consider positive values of $x$, with $x\eta_2 \gg a$. If $x = a(1 - \delta)$, where $\delta$ is a small positive
number, then
\[ f_2(x) \simeq Ae^{-\delta\eta_2}. \]
But from equation 4.18 $\ln(A/a) = \eta_2 - \ln(2\eta_2)$ and if $\eta_2 \gg 1$, $\eta_2 \gg \ln(2\eta_2)$, so $\eta_2 \simeq \ln(A/a)$
and the above approximation for $f_2(x)$ becomes
\[ \frac{f_2(x)}{A} = \left(\frac{a}{A}\right)^{\delta}, \quad x = a(1 - \delta). \]
Hence, provided $\delta > 0$, that is $x \ne a$, $f_2/A \to 0$ as $A/a \to \infty$.
The surface defined by the limiting function $f_G(x)$ comprises two discs of radius $A$, a
distance $2a$ apart, so has area $S_G = 2\pi A^2$, independent of $a$. Since this limiting solution
has discontinuous derivatives at $x = \pm a$ it is not an admissible function. Nevertheless
it is important because if $A < ag(\eta_m) \simeq 1.509a$ it can be shown that this surface gives
the global minimum of the area and, as will be seen in the next subsection, has physical
significance. This solution to the problem was first found by B C W Goldschmidt in
1831 and is now known as the Goldschmidt curve or Goldschmidt solution.
4.3.4 Summary
We have considered the special case where the ends of the cylinder are at $x = \pm a$ and
each end has the same radius $A$; in this case the curve $y = f(x)$ is symmetric about
$x = 0$ and we have obtained the following results.
1. If the radius of the ends is small by comparison to the distance between them,
$A < ag(\eta_m) \simeq 1.509a$, there are no curves described by differentiable functions
making the traced out area stationary. In this case it can be shown that the
smallest area is given by the Goldschmidt solution, $f_G(x)$, defined in equation 4.20,
and that this is the global minimum.
2. If $A > 1.51a$ there are two smooth stationary curves. One of these approaches
the Goldschmidt solution as $A/a \to \infty$ and the other approaches the constant
function $f(x) = A$ in this limit, and this gives the smaller area. This solution is
a local minimum of the functional, as will be shown in chapter 7.
The nature of the stationary solutions is not easy to determine. In the following graph
we show the areas $S_1/a^2$ and $S_2/a^2$, as in figure 4.10 and also, with the dashed lines,
the areas given by the Goldschmidt solution, $S_G/a^2 = 2\pi(A/a)^2$, curve G, and the area
of the right circular cylinder, $S_c/a^2 = 4\pi A/a$, curve c.
Figure 4.12 Graphs showing how the dimensionless area $S/a^2$ varies
with $A/a$. Here the curves k, k = 1, 2, denote the area $S_k/a^2$
as in figure 4.10; G the scaled area of the Goldschmidt curve,
$S_G/a^2 = 2\pi(A/a)^2$ and c the scaled area of the cylinder, $4\pi A/a$.
If $A > ag(\eta_m) \simeq 1.509a$ it will be shown in chapter 7 that $S_1$ is a local minimum of
the functional. The graphs shown in figure 4.12 suggest that for large enough $A/a$,
$S_1 < S_G$, but for smaller values of $A/a$, $S_G < S_1$. The value of $\eta$ at which $S_G = S_1$ is
given by the solution of $1 + e^{-2\eta} = 2\eta$, see exercise 4.14. The numerical solution of this
equation gives $\eta = 0.639$ at which $A = ae^{\eta} = 1.8950a$. Hence if $A < 1.89a$ the Goldschmidt
curve yields a smaller area, even though S1 is a local minimum. For A > 1.89a, S1
gives the smallest area.
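The crossover between the catenary and the Goldschmidt solution can be located with a few lines of code. The sketch below assumes the area formula of exercise 4.11(a), $S/a^2 = 2\pi(\eta + \sinh\eta\cosh\eta)/\eta^2$, together with $S_G/a^2 = 2\pi(A/a)^2$ and $A/a = \cosh\eta/\eta$; the function names and the bracketing interval are illustrative.

```python
import math


def g(eta):
    """A/a as a function of eta, equation 4.16."""
    return math.cosh(eta) / eta


def catenary_area(eta):
    """S/a^2 for f(x) = c cosh(x/c) with eta = a/c, exercise 4.11(a)."""
    return 2.0 * math.pi * (eta + math.sinh(eta) * math.cosh(eta)) / eta ** 2


def goldschmidt_area(ratio):
    """S_G/a^2 = 2 pi (A/a)^2: two discs of radius A."""
    return 2.0 * math.pi * ratio ** 2


def crossover_eta(iters=200):
    """Root of 1 + exp(-2 eta) = 2 eta by bisection on [0.1, 2]."""
    def f(e):
        return 2.0 * e - 1.0 - math.exp(-2.0 * e)   # increasing in eta
    lo, hi = 0.1, 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    eta = crossover_eta()
    print("eta =", eta, " A/a =", g(eta))
    print("S_1/a^2 =", catenary_area(eta), " S_G/a^2 =", goldschmidt_area(g(eta)))
```

At the root the condition $1 + e^{-2\eta} = 2\eta$ is equivalent to $\cosh\eta = \eta e^{\eta}$, so $A/a = g(\eta) = e^{\eta}$ there, which gives a quick consistency check on the printed ratio.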
This relatively simple example of a variational problem provides some idea of the
possible complications that can arise with nonlinear boundary value problems.
Exercise 4.11
(a) If $f(x) = c\cosh(x/c)$ show that
\[ \frac{S[f]}{a^2} = \frac{2\pi}{\eta^2}\left(\eta + \sinh\eta\cosh\eta\right), \quad \eta = \frac{a}{c}. \]
(b) Show that $S[f]$ considered as function of $\eta$ is stationary at the root of $\eta\tanh\eta = 1$.
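Exercise 4.12
(a) Use the expansion $\cosh\eta = 1 + \frac12\eta^2 + O(\eta^4)$ to show that, for small $\eta$, $g(\eta) =
1/\eta + \eta/2 + O(\eta^3)$, where $g(\eta)$ is defined in equation 4.16. Hence show that if
$A \gg a$ then $\eta \simeq a/A$ and hence that $c \simeq A$ and $f(x) \simeq A$. Using the result
obtained in the previous exercise, or otherwise, show that $S_1 = 4\pi Aa$.
(b) Show that if $\eta_2$ is large the equation defining it is given approximately by
\[ \frac{A}{a} \simeq \frac{1}{2\eta}\,e^{\eta} \]
and, using the result obtained in the previous exercise, that
\[ \frac{S_2}{a^2} \simeq \frac{2\pi}{\eta} + 2\pi\left(\frac{e^{\eta}}{2\eta}\right)^2 \simeq 2\pi\left(\frac{e^{\eta}}{2\eta}\right)^2, \quad (\eta = \eta_2). \]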
Exercise 4.13
(a) Show that the position of the minimum of the function $g(\eta) = \eta^{-1}\cosh\eta$,
$\eta > 0$, is at the real root, $\eta_m$, of $\eta\tanh\eta = 1$.
By sketching the graphs of $y = 1/\eta$ and $y = \tanh\eta$, for $\eta > 0$, show that the
equation $\eta\tanh\eta = 1$ has only one real root.
(b) If $a/c = \eta_m$ and $A/a = g(\eta_m)$ use the result derived in exercise 4.11 to
show that the area of the cylinder formed is $S_m = 2\pi A^2\eta_m$, and that $S_m/a^2 =
2\pi\eta_m^{-1}\cosh^2\eta_m$.
Exercise 4.14
Use the result derived in exercise 4.12 to show that $S_G = S_1$ when $\eta$ satisfies
the equation $\cosh^2\eta = \eta + \sinh\eta\cosh\eta$. Show that this equation simplifies to
$1 + e^{-2\eta} = 2\eta$ and that there is only one positive root, given by $\eta = 0.639232$.
Exercise 4.15
(a) Show that the functional
\[ S[y] = \int_{-1}^{1} dx\,\sqrt{y\left(1 + y'^2\right)}, \quad y(-1) = y(1) = A > 0, \]
Figure 4.13 Diagrams showing two configurations assumed by soap films on two rings of radius
$A$ and a distance $2a$ apart. On the left, $1.89a > A$, the soap film simply fills the two circular
wires because they are too far apart: this is the Goldschmidt solution, equation 4.20. On the right
$1.51a < A$ the soap film joins the two rings in the shape defined by equation 4.17 with $\eta = \eta_1$.
The methods discussed previously provide the shape of the right-hand film, but the
matter of determining whether these stationary positions are extrema, local or global,
is of a different order of difficulty. The complexity of this physical problem is further
compounded when one realises that there can be minimum energy solutions of a quite
unexpected form. The following diagram illustrates a possible configuration of this kind.
We do not expect the theory described in the previous section to find such a solution
because the mathematical formulation of the physical problem makes no allowance for
this type of behaviour.
Figure 4.14 Diagram showing a possible soap film. In this example a circular
film, perpendicular to the axis, is formed in the centre and this is joined to
both outer rings by a catenary.
The relationship between soap films and some problems in the Calculus of Variations
can certainly add to our intuitive understanding, but this example should provide a
salutary warning against dependence on intuition.
Examples of the complex shapes that soap films can form, but which are difficult
to describe mathematically, are produced by dipping a wire frame into a soap solution.
Photographs of the varied shapes obtained by cubes and tetrahedrons are provided in
Isenberg's book.
Here we describe a conceptually simple problem which is difficult to deal with math-
ematically, but which helps to understand the difficulties that may be encountered with
certain variational problems. Further, this example has potential practical applications.
Consider the soap film formed between two clear, parallel planes joined by a number
of pins, of negligible diameter, perpendicular to the planes. When dipped into a soap
4.4. SOAP FILMS 181
solution the resulting film will join the pins in such a manner as to minimise the length
of film, because the surface tension energy is proportional to the area, which is propor-
tional to the length of film. In figure 4.15 we show three cases, viewed from above, with
two and three pins.
In panel A there are two pins: the natural shape for the soap films is the straight line
joining them. In panels B and C there are three pins and two different configurations
are shown which, it transpires, are the only two allowed; but which of the pair is actually
assumed depends upon the relative positions of the pins.
[Figure 4.15: panels A, B and C]
The reason for this follows from elementary geometry and the application of one of
Plateau's (1801–1883)⁴ three geometric rules governing the shapes of soap films, which
he inferred from his experiments. In the present context the relevant rule is that three
intersecting planes meet at equal angles of 120°: this is a consequence of the surface
tension forces in each plane being equal. Plateau's other two rules are given by Isenberg
(1992, pages 83–4).
We can see how this works, and some of the consequences for certain problems in
the Calculus of Variations, by fixing two points, a and b, and allowing the position of
the third point to vary. The crucial mathematical result needed is Proposition 20 of
Euclid⁵, described next.
Euclid: proposition 20
The angle subtended by a chord AB at the centre of the circle, at O, is twice the angle subtended at any point C on the circumference of the circle, as shown in the figure. This is proved using the properties of similar triangles.
With this result in mind draw a circle through the points a and b such that the angle
subtended by ab on the circumference is 120°, figures 4.16 and 4.17. If $L$ is the distance
between a and b the radius of this circle is $R = L/\sqrt{3}$. The orientation of this circle is
chosen so the third point is on the same side of the line ab as the 120° angle.
Then for any point c outside this circle the shortest set of lines is obtained by joining
c to the centre of the circle, O, and if c′ is the point where this line intersects the circle,
see figure 4.16, the lines cc′, ac′ and c′b are the shortest set of lines joining the three
points a, b and c.
⁴ Joseph Plateau was a Belgian physicist who made extensive studies of the surface properties of fluids.
⁵ See Euclid Elements, Book III.
Figure 4.16 Diagram of the shortest length for a point c outside the circle. The point O is the centre of the circle.
Figure 4.17 Diagram of the shortest length for a point c inside the circle.
If the third point c is inside this circle the shortest line joining the points comprises
the two straight line segments ac and cb, as shown in figure 4.17. This result can be
proved, see Isenberg (1992, pages 67–73) and also exercise 4.16.
As the point c moves radially from outside to inside the circle the shortest config-
uration changes its nature: this type of behaviour is generally difficult to predict and
may cause problems in the conventional theory of the Calculus of Variations.
If more pins join the parallel planes the soap film will form configurations making
the total length a local minimum; there are usually several different minimum configu-
rations, and which is found depends upon a variety of factors, such as the orientation of
the planes when extracted from the soap solution. The problem of minimising the total
length of a path joining n points in a plane was first investigated by the Swiss mathematician
Steiner (1796–1863) and such problems are now known as Steiner problems.
The mathematical analysis of such problems is difficult. One physical manifestation of
this type of situation is the laying of pipes between a number of centres, where, all else
being equal, the shortest total length of pipe is desirable.
Exercise 4.16
Consider the three points, O, A and C, in the Cartesian plane with coordinates
O = (0, 0), A = (a, 0) and C = (c, d) and where the angle OAC is less than 120°.
Consider a point X, with coordinates (x, y) inside the triangle OAC. Show that
the sum of the lengths OX, AX and CX is stationary and is a minimum when
the angles between the three lines are all equal to 120°.
Exercise 4.17
Consider the case where four pins are situated at the corners of a square with side
of length L.
(a) One possible configuration of the soap films is for them to lie along the two
diagonals, to form a cross. Show that the length of the films is $2\sqrt{2}\,L = 2.83L$.
(b) Another configuration is the H-shape. Show that the length of film is $3L$.
(c) Another possible configuration is one where the angle between three intersecting
lines is 120°. Show that the length of film is $(1 + \sqrt{3})L = 2.73L$.
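The three lengths quoted in this exercise can be checked numerically. The sketch below is only an illustration: the coordinates of the pins and of the two junction points are assumptions chosen so that the three films at each junction meet at 120°, and the names `total_length`, `cross`, `h_shape` and `steiner` are hypothetical.

```python
import math

def total_length(segments):
    """Sum of the Euclidean lengths of a list of (start, end) segments."""
    return sum(math.dist(p, q) for p, q in segments)

L = 1.0
corners = [(0.0, 0.0), (L, 0.0), (L, L), (0.0, L)]

# (a) the two diagonals
cross = [(corners[0], corners[2]), (corners[1], corners[3])]

# (b) two opposite sides joined by a bar through the centre
h_shape = [(corners[0], corners[3]), (corners[1], corners[2]),
           ((0.0, L / 2), (L, L / 2))]

# (c) two junction points on the horizontal centre line; the offset
# L / (2 sqrt 3) is the assumed position that gives 120 degree angles
offset = L / (2.0 * math.sqrt(3.0))
s1, s2 = (offset, L / 2), (L - offset, L / 2)
steiner = [(corners[0], s1), (corners[3], s1),
           (corners[1], s2), (corners[2], s2), (s1, s2)]

for name, config in (("cross", cross), ("H", h_shape), ("120 degrees", steiner)):
    print(name, round(total_length(config), 4))
```

With $L = 1$ the three totals agree with $2\sqrt{2}$, $3$ and $1 + \sqrt{3}$ respectively.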
Exercise 4.18
Consider the configuration of four pins forming a rectangle with sides of length $L$ and $aL$. (In the accompanying figure the top panel shows the case $a > 1$ and the bottom panel the case $a < 1$.)
(a) For the case shown in the top panel, $a > 1$, show that the total line length is $d_1 = L(a + \sqrt{3})$ and that for the case in the bottom panel, $a < 1$, it is $d_2 = L(1 + a\sqrt{3})$.
(b) Show that the minimum of these two lengths is $d_1$ if $a > 1$ and $d_2$ if $a < 1$.
Exercise 4.20
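Show that the functional giving the distance between two points on a sphere of
radius $r$, labelled by the spherical polar coordinates $(\theta_a, \phi_a)$ and $(\theta_b, \phi_b)$, can be
expressed in either of the forms
$$S = r\int_{\theta_a}^{\theta_b} d\theta\,\sqrt{1 + \phi'(\theta)^2\sin^2\theta} \quad\text{or}\quad S = r\int_{\phi_a}^{\phi_b} d\phi\,\sqrt{\theta'(\phi)^2 + \sin^2\theta}.$$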
Exercise 4.21
Consider the minimal surface problem with end points Pa = (0, A) and Pb = (b, B),
where b, A and B are given and A B.
(a) Show that the general solution of the appropriate Euler-Lagrange equation is
$$y = c\cosh\left(\frac{x - \alpha}{c}\right),$$
where $\alpha$ and $c$ are real constants with $c > 0$. Show that if $c = b\eta$ the boundary
conditions give the following equation for $\eta$,
$$\bar B = f(\eta) \quad\text{where}\quad f(x) = \bar A\cosh(1/x) - \sqrt{\bar A^2 - x^2}\,\sinh(1/x)$$
and $\bar A = A/b$, $\bar B = B/b$, with $0 \le \eta \le \bar A$.
(b) Show that for small $x$ and $x \simeq \bar A$ the function $f(x)$ behaves, respectively, as
$$f(x) \simeq \frac{x^2}{4\bar A}e^{1/x} \quad\text{and}\quad f(x) \simeq \bar A\cosh(1/\bar A) - \sqrt{2\bar A(\bar A - x)}\,\sinh(1/\bar A).$$
Deduce that $f(x)$ has at least one minimum in the interval $0 < x < \bar A$ and that
the equation $\bar B = f(\eta)$ has at least two roots for sufficiently large values of $\bar B$ and
none for small $\bar B$.
4.5. MISCELLANEOUS EXERCISES 185
(c) If $\bar A \ll 1$ show that the minimum value of $f(x)$ occurs near $x = \bar A - \bar A^3/2$ and
that $\min(f) \simeq \bar A\cosh(1/\bar A)$. Deduce that if $\bar A \ll 1$ there are no smooth solutions
of the Euler-Lagrange equation for $\bar B < \bar A\cosh(1/\bar A)$, approximately.
Exercise 4.22
(a) For the brachistochrone problem suppose that the initial and final points of
the curve are $(x, y) = (0, A)$ and $(b, 0)$, respectively, as in the text, but that the
initial speed, $v_0$, is not zero.
Show that the parametric equations for the stationary path are
$$x = d + \frac{1}{2}c^2(2\theta - \sin 2\theta), \qquad z = c^2\sin^2\theta, \qquad y = A + \frac{v_0^2}{2g} - z,$$
where $\theta_0 \le \theta \le \theta_b$, for some constants $c$, $d$, $\theta_0$ and $\theta_b$. Show that these four
constants are related by the equations
$$\sin^2\theta_0 = k^2\sin^2\theta_b, \qquad k^2 = \frac{v_0^2}{v_0^2 + 2gA} < 1,$$
$$b = \frac{v_0^2}{4gk^2\sin^2\theta_b}\left\{(2\theta_b - \sin 2\theta_b) - (2\theta_0 - \sin 2\theta_0)\right\},$$
$$c^2\sin^2\theta_b = A + \frac{v_0^2}{2g}.$$
(b) If $v_0^2 \ll Ag$, show that $k$ is small and find an approximate solution for these
equations. Note, this last part is technically demanding.
Exercise 4.23
In this exercise you will show that the cycloid is a local minimum for the brachistochrone
problem using the functional found in exercise 4.5. Consider the varied
path $x(z) + \epsilon h(z)$ and show that (ignoring the irrelevant factor $1/\sqrt{2g}$)
$$T[x + \epsilon h] - T[x] = \frac{\epsilon^2}{2}\int_0^A dz\,\frac{h'(z)^2}{\sqrt{z}\,(1 + x'^2)^{3/2}} + O(\epsilon^3) = \epsilon^2 c\int_0^{\theta_A} d\theta\,h'(z)^2\cos^4\theta,$$
Exercise 4.24
The Oxy-plane is vertical with the Oy-axis vertically upwards. A straight line
is drawn from the origin to the point P with coordinates $(x, f(x))$, for some
differentiable function $f(x)$. Show that the time taken for a particle to slide
smoothly from P to the origin is
$$T(x) = 2\sqrt{\frac{x^2 + f(x)^2}{2gf(x)}}.$$
By forming a differential equation for $f(x)$, and solving it, show that $T(x)$ is
independent of $x$ if $f$ satisfies the equation $x^2 + (f - \beta)^2 = \beta^2$, for some constant $\beta$.
Describe the shape of the curve defined by this equation.
Exercise 4.25
A cylindrical shell of negligible thickness is formed by rotating the curve $y(x)$,
$a \le x \le b$, about the x-axis. If the material is uniform with density $\rho$ the moment
of inertia about the x-axis is given by the functional
$$I[y] = \pi\rho\int_a^b dx\,y^3\sqrt{1 + y'^2}, \qquad y(a) = A, \quad y(b) = B,$$
where $A$ and $B$ are the radii of the ends and are given.
(a) In the case $A = B$ and with the end points at $x = \pm a$ show that $I[y]$ is
stationary on the curve $y = c\cosh\phi(x)$ where $\phi(x)$ is given implicitly by
$$\frac{x}{c} = \int_0^{\phi} dv\,\frac{1}{\sqrt{1 + \cosh^2 v + \cosh^4 v}}.$$
Hence show that for $a/A \ll 1$ there are two solutions. Show, also, that there
is a critical value of $a/A$ above which there are no appropriate solutions of the
Euler-Lagrange equation.
Problems on cycloids
Exercise 4.26
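The cycloid OPD of figure 4.1 (page 162) is rotated about the x-axis to form a
solid of revolution. Show that the surface area, $S$, and volume, $V$, of this solid are
$$S = 2\pi\int_0^{2\pi} d\theta\,y\frac{ds}{d\theta} = 4\pi a^2\int_0^{2\pi} d\theta\,(1 - \cos\theta)\sin(\theta/2) = \frac{64}{3}\pi a^2,$$
$$V = \pi\int_0^{2\pi} d\theta\,y^2\frac{dx}{d\theta} = \pi a^3\int_0^{2\pi} d\theta\,(1 - \cos\theta)^3 = 5\pi^2a^3.$$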
Exercise 4.27
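The half cycloid with parametric equations $x = a(\theta - \sin\theta)$, $y = a(1 - \cos\theta)$ with
$0 \le \theta \le \pi$ is rotated about the y-axis to form a container.
(a) Show that the surface area, $S(\theta)$, and volume, $V(\theta)$, are given by
$$S(\theta) = 4\pi a^2\int_0^{\theta} d\phi\,(\phi - \sin\phi)\sin(\phi/2),$$
$$V(\theta) = \pi a^3\int_0^{\theta} d\phi\,(\phi - \sin\phi)^2\sin\phi.$$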
(c) Find the general expressions for $S(\theta)$ and $V(\theta)$ and their values at $\theta = \pi$.
Exercise 4.28
This exercise shows that the arc QST in figure 4.3, (page 164) is a cycloid, a
result discovered by Huygens and used in his attempt to construct a pendulum
with period independent of its amplitude for use in a clock.
Consider the situation shown in figure 4.18, where the arcs ABO and OCD are
cycloids defined parametrically by the equations
Figure 4.18 (diagram of the cycloid arcs ABO and OCD with the x- and y-axes)
The curve OQR has length l, is wrapped round the cycloid along OQ, is a straight
line between Q and R and is tangential to the cycloid at Q.
(a) If the point Q has the coordinates
show that the angle between QR and the x-axis is given by $\alpha = (\pi - \theta)/2$.
(b) Show that the coordinates of the point R are
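For the length of a curve we use a variant of equation 1.5 (page 17). Suppose that
$\theta$ increases from $\theta$ to $\theta + \delta\theta$, then to $O(\delta\theta)$, $x$ and $y$ increase by $x'(\theta)\delta\theta$ and $y'(\theta)\delta\theta$
respectively. Hence the length of the small element of the curve is, using Pythagoras'
theorem,
$$\delta s = \sqrt{x'(\theta)^2 + y'(\theta)^2}\,\delta\theta + O(\delta\theta^2),$$
and the length of the curve between $\theta_1$ and $\theta_2$ is
$$s = \int_{\theta_1}^{\theta_2} d\theta\,\sqrt{x'(\theta)^2 + y'(\theta)^2}.$$
For the cycloid, $x'(\theta) = a(1 - \cos\theta)$, $y'(\theta) = a\sin\theta$ and the length of the arc OP is
$$s = a\int_0^{\theta} d\phi\,\sqrt{(1 - \cos\phi)^2 + \sin^2\phi} = a\int_0^{\theta} d\phi\,\sqrt{2 - 2\cos\phi} = 2a\int_0^{\theta} d\phi\,\sin(\phi/2) = 4a\left(1 - \cos(\theta/2)\right) = 8a\sin^2(\theta/4),$$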
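The closed form $8a\sin^2(\theta/4)$ for the arc length of the cycloid $x = a(\theta - \sin\theta)$, $y = a(1 - \cos\theta)$ can be compared with a direct numerical integration of $\sqrt{x'^2 + y'^2}$. The following is a minimal sketch using Simpson's rule; the function names and the number of sub-intervals are assumptions made for this illustration.

```python
import math

def cycloid_arc_length(theta, a=1.0, n=2000):
    """Arc length of x = a(t - sin t), y = a(1 - cos t) for 0 <= t <= theta,
    estimated with the composite Simpson rule (n must be even)."""
    def speed(t):
        # sqrt(x'(t)^2 + y'(t)^2)
        return a * math.hypot(1.0 - math.cos(t), math.sin(t))
    h = theta / n
    total = speed(0.0) + speed(theta)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * speed(i * h)
    return total * h / 3.0

def closed_form(theta, a=1.0):
    """The result quoted in the text, 8 a sin^2(theta / 4)."""
    return 8.0 * a * math.sin(theta / 4.0) ** 2

for theta in (0.5, math.pi, 2.0 * math.pi):
    print(theta, cycloid_arc_length(theta), closed_form(theta))
```

For a complete arch, $\theta = 2\pi$, both columns give $8a$.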
(b) The initial point $(X, Y)$, where $Y = AX$, satisfies the equation $X^2 + (Y - R)^2 = R^2$,
which becomes $(1 + A^2)X = 2AR$. Substituting this into the above equation for $T$ gives
the required, rather surprising, result.
reduces to $dx/dz = \sqrt{z/(c^2 - z)}$.
so that 3 2
x 2A y 2A x 2/3
= , =1 =1 .
b 3b A 3b b
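But $z' = 2c^2\sin\theta\cos\theta$ and, since $x = \frac{1}{2}c^2(2\theta - \sin 2\theta) + d$, $x' = 2c^2\sin^2\theta$, so that
$x'^2 + z'^2 = 4c^4\sin^2\theta$ and
$$T = \frac{2c}{\sqrt{2g}}\int_0^{\theta_b} d\theta = \frac{2c\,\theta_b}{\sqrt{2g}}.$$
But, from the analysis preceding the exercise, $c = \sqrt{A}/\sin\theta_b$ and so
$$T = \sqrt{\frac{2A}{g}}\,\frac{\theta_b}{\sin\theta_b}.$$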
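$$(b - R)^2 + A^2 = R^2 \quad\text{giving}\quad R = \frac{A^2 + b^2}{2b}.$$
The time of passage is given by equation 4.6, with $z = A - y$. The parametric equations
$x = R(1 - \cos\theta)$ and $y = A - R\sin\theta$ satisfy the equation of the circle and as the particle
moves downwards from $(0, A)$, $\theta$ increases from $\theta = 0$ to $\theta = \theta_b$ where $y = 0$, that is
$$\sin\theta_b = \frac{A}{R} = \frac{2Ab}{A^2 + b^2},$$
so $\theta_b$ depends only on the ratio $b/A$. Since $z = R\sin\theta$, using the relation $dz/dx = z'(\theta)/x'(\theta)$, equation 4.6 becomes
$$T = \frac{1}{\sqrt{2g}}\int_0^{\theta_b} d\theta\,\sqrt{\frac{x'(\theta)^2 + z'(\theta)^2}{z(\theta)}} = \sqrt{\frac{R}{2g}}\int_0^{\theta_b} d\theta\,\frac{1}{\sqrt{\sin\theta}}.$$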
and hence
a2 a4 a6 a2 a4
= + +O , + = A +O .
2A 4A3 A5 2A A3
A
Hence y (x) ' x and y+ ' A. With = + , y(x) ' A and the solution approxi-
a
mates a right circular cylinder. With = , y(x) ' Ax/a, so the solution increases as
|x| increases. We shall see later that both these solutions behave like the exact solutions.
where we have used the relations $2\cosh^2 u = 1 + \cosh 2u$ to evaluate the integral and
$\sinh 2u = 2\sinh u\cosh u$ to cast the result in this form. Dividing this by $a^2$ we see that
the dimensionless area $S[f]/a^2$ depends only upon $\eta$,
$$\frac{S[f]}{a^2} = \frac{2\pi}{\eta^2}\left(\eta + \sinh\eta\cosh\eta\right).$$
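(b) Since $2\sinh\eta\cosh\eta = \sinh 2\eta$ we define
$$F(\eta) = \frac{1}{\eta} + \frac{\sinh 2\eta}{2\eta^2} \quad\text{giving}\quad F'(\eta) = \frac{1}{\eta^2}(\cosh 2\eta - 1) - \frac{\sinh 2\eta}{\eta^3}.$$
Hence $F'(\eta) = 0$ if
$$1 = \frac{\eta(\cosh 2\eta - 1)}{\sinh 2\eta} = \eta\tanh\eta.$$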
$$\frac{A}{a} = \frac{1}{2\eta}\left(e^{\eta} + e^{-\eta}\right) = \frac{e^{\eta}}{2\eta}\left(1 + e^{-2\eta}\right).$$
4.6. SOLUTIONS FOR CHAPTER 4 193
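$$\frac{A}{a} \simeq \frac{1}{2\eta}e^{\eta}, \qquad (\eta \gg 1).$$
For large $\eta$,
$$\cosh\eta = \frac{1}{2}e^{\eta}\left(1 + e^{-2\eta}\right) \simeq \frac{1}{2}e^{\eta} \quad\text{and}\quad \sinh\eta = \frac{1}{2}e^{\eta}\left(1 - e^{-2\eta}\right) \simeq \frac{1}{2}e^{\eta}$$
so
$$\frac{S[f]}{a^2} \simeq \frac{2\pi}{\eta^2}\left(\eta + \frac{1}{4}e^{2\eta}\right) = \frac{\pi e^{2\eta}}{2\eta^2} + \frac{2\pi}{\eta}.$$
Since $\eta \gg 1$, $e^{2\eta} \gg \eta$, that is $e^{2\eta}/\eta^2 \gg 1/\eta$, so the first term dominates.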
Figure 4.19 Graph of $y = \tanh\eta$ and $y = 1/\eta$.
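(b) At the stationary point the area is, using the result obtained in exercise 4.11,
$$S = 2\pi a^2\left(\frac{1}{\eta_m} + \frac{1}{\eta_m^2}\sinh\eta_m\cosh\eta_m\right) = \frac{2\pi a^2}{\eta_m}\left(1 + \sinh^2\eta_m\right) = \frac{2\pi a^2}{\eta_m}\cosh^2\eta_m \quad\text{since}\quad \eta_m\sinh\eta_m = \cosh\eta_m.$$
But, by definition,
$$\frac{A}{a} = g(\eta_m) = \frac{1}{\eta_m}\cosh\eta_m \quad\text{hence}\quad S = 2\pi a^2\eta_m\,g(\eta_m)^2 = 2\pi A^2\eta_m.$$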
a m
Figure 4.20 Graphs of $y_\pm(x)$ for $A = 1.2$, on the left, and $A = 3$ on the right.
(b) Substituting the general solution (for any $c$) into the functional gives
$$S[y] = \int_{-1}^{1} dx\,\frac{1}{2c}\sqrt{4c^4 + x^2}\,\sqrt{1 + \frac{x^2}{4c^4}} = \frac{1}{4c^3}\int_{-1}^{1} dx\,(4c^4 + x^2) = 2c + \frac{1}{6c^3}. \qquad (4.21)$$
In order to determine which path gives the largest value of $S[y]$ we consider the difference
$$S[y_-] - S[y_+] = 2(c_- - c_+) + \frac{1}{6}\left(\frac{1}{c_-^3} - \frac{1}{c_+^3}\right) = (c_+ - c_-)\left(\frac{c_+^2 + c_+c_- + c_-^2}{6(c_+c_-)^3} - 2\right) = \frac{4}{3}(c_+ - c_-)(A - 1) > 0 \quad\text{if } A > 1,$$
where we have used the relations $c_+c_- = \frac{1}{2}$ and $c_+^2 + c_-^2 = A$, which follow directly
from the original quadratic equation for $c^2$. This relation shows that $S[y_-] > S[y_+]$ for
$A > 1$.
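Equation 4.21 and the comparison of the two stationary paths can be checked numerically. The sketch below assumes that the general solution has the form $y = c^2 + x^2/(4c^2)$, which is what the integrand $\sqrt{4c^4 + x^2}/(2c)$ above implies, so that the boundary condition $y(1) = A$ gives the quadratic $4c^4 - 4Ac^2 + 1 = 0$ for $c^2$; the function names are hypothetical.

```python
import math

def c_values(A):
    """The roots c_+ >= c_- > 0 of 4c^4 - 4Ac^2 + 1 = 0 (requires A >= 1)."""
    root = math.sqrt(A * A - 1.0)
    return math.sqrt(0.5 * (A + root)), math.sqrt(0.5 * (A - root))

def S_closed(c):
    """Equation 4.21: S = 2c + 1/(6 c^3)."""
    return 2.0 * c + 1.0 / (6.0 * c ** 3)

def S_numeric(c, n=2000):
    """Simpson estimate of the integral of sqrt(y (1 + y'^2)) over [-1, 1]
    for the assumed solution y = c^2 + x^2 / (4 c^2)."""
    def integrand(x):
        y = c * c + x * x / (4.0 * c * c)
        dy = x / (2.0 * c * c)
        return math.sqrt(y * (1.0 + dy * dy))
    h = 2.0 / n
    total = integrand(-1.0) + integrand(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(-1.0 + i * h)
    return total * h / 3.0

A = 2.0
c_plus, c_minus = c_values(A)
print(S_closed(c_plus), S_numeric(c_plus))
print(S_closed(c_minus) - S_closed(c_plus), 4.0 / 3.0 * (c_plus - c_minus) * (A - 1.0))
```

Each printed pair should agree, the second pair confirming that $S[y_-] > S[y_+]$ when $A > 1$.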
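If $A = 1$, $c_+ = c_- = 1/\sqrt{2}$ and $S = 4\sqrt{2}/3$. Further if $A \gg 1$ we have
$$c_\pm^2 = \frac{A}{2}\left(1 \pm \sqrt{1 - \frac{1}{A^2}}\right) = \frac{A}{2}\left\{1 \pm \left(1 - \frac{1}{2A^2} - \frac{1}{8A^4} + \cdots\right)\right\}$$
where we have used the binomial expansion $\sqrt{1 - x} = 1 - \frac{1}{2}x - \frac{1}{8}x^2 + \cdots$. Hence
$$c_+^2 = A\left(1 - \frac{1}{4A^2} - \frac{1}{16A^4} + \cdots\right)$$
and on taking the square root
$$c_+ = \sqrt{A}\left(1 - \frac{1}{4A^2} - \cdots\right)^{1/2} = \sqrt{A}\left(1 - \frac{1}{8A^2} + \cdots\right).$$
Similarly
$$c_-^2 = \frac{1}{4A}\left(1 + \frac{1}{4A^2} + \cdots\right) \quad\text{giving}\quad c_- = \frac{1}{2\sqrt{A}}\left(1 + \frac{1}{8A^2} + \cdots\right).$$
Putting $c_+ = \sqrt{A}$ and $c_- = 1/(2\sqrt{A})$ we obtain the following approximations
$$y_+ \simeq A + \frac{x^2}{4A} \simeq A \quad\text{and}\quad y_- \simeq \frac{1}{4A} + Ax^2 \simeq Ax^2, \qquad A \gg 1.$$
Substituting these approximations for $c$ into the integral 4.21 for $S$ we obtain
$$S[y_+] \simeq 2\sqrt{A} \quad\text{and}\quad S[y_-] \simeq \frac{4}{3}A^{3/2}.$$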
Figure 4.21 Graphs of $S[y_\pm]$ as functions of $A$.
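so $y'(x)$ is not defined at $|x| = 1$. Hence we define a function that approaches $y_G(x)$ as
$\delta \to 0$ for some parameter $\delta$. We need only consider positive values of $x$:
$$y_\delta(x) = \begin{cases} 0, & 0 \le x < 1 - \delta, \quad 0 < \delta \ll 1,\\ A - \dfrac{A}{\delta}(1 - x), & 1 - \delta \le x \le 1. \end{cases}$$
Then
$$S[y_\delta] = 2\int_{1-\delta}^{1} dx\,\sqrt{A - \frac{A}{\delta}(1 - x)}\,\sqrt{1 + \frac{A^2}{\delta^2}} = 2\sqrt{A(A^2 + \delta^2)}\int_0^1 dv\,\sqrt{1 - v} = \frac{4}{3}\sqrt{A(A^2 + \delta^2)} \to \frac{4}{3}A^{3/2} \quad\text{as } \delta \to 0.$$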
Figure 4.22 (diagram of the triangle OAC, with C = (c, d), and the interior point X joined to O, A and C by lines of lengths $l_1$, $l_2$ and $l_3$)
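The point X has the coordinates $(x, y)$ and we need to find these coordinates so that
the length $L = l_1 + l_2 + l_3$ is stationary. With the geometry shown
$$l_1 = \sqrt{x^2 + y^2}, \qquad l_2 = \sqrt{(a - x)^2 + y^2}, \qquad l_3 = \sqrt{(c - x)^2 + (d - y)^2},$$
$$\sin\theta_1 = \frac{y}{l_1}, \qquad \sin\theta_2 = \frac{y}{l_2}, \qquad \sin\theta_3 = \frac{d - y}{l_3},$$
$$\cos\theta_1 = \frac{x}{l_1}, \qquad \cos\theta_2 = \frac{a - x}{l_2}, \qquad \cos\theta_3 = \frac{c - x}{l_3}.$$
The derivatives are
$$\frac{\partial L}{\partial x} = \frac{x}{l_1} - \frac{a - x}{l_2} - \frac{c - x}{l_3} = \cos\theta_1 - \cos\theta_2 - \cos\theta_3 = 0,$$
$$\frac{\partial L}{\partial y} = \frac{y}{l_1} + \frac{y}{l_2} - \frac{d - y}{l_3} = \sin\theta_1 + \sin\theta_2 - \sin\theta_3 = 0.$$
Now let $\phi_k$, $k = 1, 2, 3$, be the angles between the intersecting lines, as shown on the
right of the figure, so $\phi_1 + \phi_2 + \phi_3 = 2\pi$. Also $\phi_1 = \pi - \theta_1 - \theta_2$, $\phi_2 = \theta_3 + \theta_2$ and
$\phi_3 = \pi + \theta_1 - \theta_3$, so that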
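$$\sin\phi_1 = \sin\phi_2, \qquad 1 + \cos\phi_1 + \cos\phi_2 = 0.$$
The first of these equations has the solutions $\phi_1 = 2n\pi + \phi_2$ and $\phi_1 = (2n+1)\pi - \phi_2$, but
only the first of these also solves the second equation, and then only if $\cos\phi_2 = -1/2$,
that is $\phi_2 = \phi_1 = 2\pi/3$, and hence $\phi_1 = \phi_2 = \phi_3 = 120°$.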
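In order to classify this stationary point we need the second derivatives: these are
$$\frac{\partial^2 L}{\partial x^2} = \left(\frac{1}{l_1} - \frac{x^2}{l_1^3}\right) + \left(\frac{1}{l_2} - \frac{(a - x)^2}{l_2^3}\right) + \left(\frac{1}{l_3} - \frac{(c - x)^2}{l_3^3}\right) = \frac{y^2}{l_1^3} + \frac{y^2}{l_2^3} + \frac{(d - y)^2}{l_3^3} > 0.$$
Similarly,
$$\frac{\partial^2 L}{\partial y^2} = \frac{x^2}{l_1^3} + \frac{(a - x)^2}{l_2^3} + \frac{(c - x)^2}{l_3^3} > 0,$$
$$\frac{\partial^2 L}{\partial x\,\partial y} = -\frac{xy}{l_1^3} + \frac{(a - x)y}{l_2^3} - \frac{(c - x)(d - y)}{l_3^3}.$$
For a minimum we need $L_{xx} > 0$, $L_{yy} > 0$ and $\Delta = L_{xx}L_{yy} - L_{xy}^2 > 0$. Using the above
expressions we find that
$$\Delta = \frac{a^2y^2}{(l_1l_2)^3} + \frac{1}{(l_1l_3)^3}\left(yc - xd\right)^2 + \frac{1}{(l_2l_3)^3}\left(y(c - x) + (d - y)(a - x)\right)^2 > 0.$$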
Hence the stationary point is a minimum.
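The 120° condition can also be observed numerically by minimising the total length directly. The sketch below uses Weiszfeld's fixed-point iteration, which is not the method of the text but a standard way of finding the point with the smallest sum of distances to a set of points; the triangle chosen, the iteration count and the function names are assumptions of this illustration, and every angle of the triangle is taken to be less than 120°.

```python
import math

def fermat_point(points, iterations=5000):
    """Weiszfeld iteration for the point minimising the sum of distances
    to the given points (assumes the minimiser is not one of them)."""
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iterations):
        wsum = xs = ys = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            wsum += 1.0 / d
            xs += px / d
            ys += py / d
        x, y = xs / wsum, ys / wsum
    return x, y

def angles_at(point, points):
    """Angles, in degrees, between neighbouring lines from `point` to the vertices."""
    dirs = sorted(math.atan2(py - point[1], px - point[0]) for px, py in points)
    gaps = [dirs[(i + 1) % len(dirs)] - dirs[i] for i in range(len(dirs))]
    gaps[-1] += 2.0 * math.pi  # wrap-around gap
    return [math.degrees(gap) for gap in gaps]

O, A, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
X = fermat_point([O, A, C])
print(X, angles_at(X, [O, A, C]))
```

The three angles printed should all be very close to 120°.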
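On squaring and adding these we see that the cross-terms cancel and that
$$\left(\frac{ds}{dt}\right)^2 = r^2\left\{\left(\frac{d\theta}{dt}\right)^2 + \left(\frac{d\phi}{dt}\right)^2\sin^2\theta\right\}.$$
and
$$\frac{d}{d\phi}\left(\frac{\theta'}{\sqrt{\theta'^2 + \sin^2\theta}}\right) - \frac{\sin\theta\cos\theta}{\sqrt{\theta'^2 + \sin^2\theta}} = 0.$$
Expanding this gives the equation quoted.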
(a) Using $\theta$ as the independent variable, the initial condition $\theta = 0$ gives $c = 0$ and
hence $\phi(\theta) = $ constant, which is the equation of the great circles through the poles.
(b) If a = b = /2, the origin may be chosen to give (a ) = 0. The equation for ()
can be simplified by noting that, for any $f(\theta)$,
$$\frac{d}{d\phi}\left(\theta' f(\theta)\right) = \theta'' f(\theta) + \theta'^2 f'(\theta),$$
Since there are three lengths, $b$, $A$ and $B$, we expect the solution to depend upon only
two ratios, which we take to be $\bar A = A/b$ and $\bar B = B/b$. Defining $\eta = c/b$, another
dimensionless ratio, gives the equation for $\eta$,
$$\bar B = f(\eta) \quad\text{where}\quad f(\eta) = \bar A\cosh\frac{1}{\eta} - \sqrt{\bar A^2 - \eta^2}\,\sinh\frac{1}{\eta}.$$
$$f(x) = \frac{1}{2}e^{1/x}\left(\bar A - \sqrt{\bar A^2 - x^2}\right) + O\left(e^{-1/x}\right) = e^{1/x}\left(\frac{x^2}{4\bar A} + O(x^4)\right).$$
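Now suppose that $x \simeq \bar A$ and set $x = \bar A - u$, where $u$ is small and positive, and the
Taylor expansions are, to first-order in $u$,
$$\cosh\frac{1}{x} = \cosh\frac{1}{\bar A} + \frac{u}{\bar A^2}\sinh\frac{1}{\bar A}, \qquad \sinh\frac{1}{x} = \sinh\frac{1}{\bar A} + \frac{u}{\bar A^2}\cosh\frac{1}{\bar A}$$
and also
$$\sqrt{\bar A^2 - x^2} = \sqrt{u}\,\sqrt{2\bar A - u} = \sqrt{2\bar Au}\left(1 - \frac{u}{2\bar A}\right)^{1/2} = \sqrt{2\bar Au} + O(u^{3/2}).$$
Hence, keeping only the leading correction,
$$f(x) = \bar A\cosh\frac{1}{\bar A} - \sqrt{2\bar Au}\,\sinh\frac{1}{\bar A} + O(u), \qquad u = \bar A - x.$$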
Figure 4.23 Graphs of the function g(y) for $\bar A$ = 10, 5, 1, 0.5 and 0.35.
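(c) When $\bar A \ll 1$, $x$ is necessarily small and we may use the approximations $\cosh(1/x) \simeq \sinh(1/x) \simeq \exp(1/x)/2$, accurate to $O(\exp(-1/x))$. The derivative of $f(x)$ is
$$f'(x) = \frac{\sqrt{\bar A^2 - x^2}}{x^2}\cosh\frac{1}{x} - \left(\frac{\bar A}{x^2} - \frac{x}{\sqrt{\bar A^2 - x^2}}\right)\sinh\frac{1}{x} \simeq \frac{1}{2x^2}e^{1/x}\left(\sqrt{\bar A^2 - x^2} - \bar A + \frac{x^3}{\sqrt{\bar A^2 - x^2}}\right).$$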
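$$x = d + \frac{1}{2}c^2(2\theta - \sin 2\theta), \qquad y = A + \frac{v_0^2}{2g} - c^2\sin^2\theta$$
where $c$ and $d$ are constants and the path starts at $(x, y) = (0, A)$, where $\theta = \theta_0$, and
ends at $(b, 0)$, where $\theta = \theta_b$. We need equations for the four unknowns $c$, $d$, $\theta_0$ and $\theta_b$,
in terms of $A$, $b$ and $v_0$. The initial conditions give
$$d = -\frac{1}{2}c^2(2\theta_0 - \sin 2\theta_0) \quad\text{and}\quad c^2\sin^2\theta_0 = \frac{v_0^2}{2g},$$
and the final conditions give
$$b = d + \frac{1}{2}c^2(2\theta_b - \sin 2\theta_b) \quad\text{and}\quad c^2\sin^2\theta_b = A + \frac{v_0^2}{2g}.$$
From these equations we see that $\theta_0$ and $\theta_b$ are related by the equation
$$\frac{\sin^2\theta_0}{\sin^2\theta_b} = k^2 = \frac{v_0^2}{2Ag + v_0^2}, \quad\text{that is}\quad \sin^2\theta_0 = k^2\sin^2\theta_b.$$
Then $\theta_b$ is determined by
$$b = \frac{1}{2}c^2\left\{(2\theta_b - \sin 2\theta_b) - (2\theta_0 - \sin 2\theta_0)\right\} = \frac{v_0^2}{4gk^2\sin^2\theta_b}\left\{(2\theta_b - \sin 2\theta_b) - (2\theta_0 - \sin 2\theta_0)\right\}, \qquad \theta_0 = \sin^{-1}(k\sin\theta_b).$$
This gives $\theta_b$, which then allows $\theta_0$ to be determined and from these $c$ and $d$ are found.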
(b) In the limiting case $v_0^2 \ll 2Ag$ we expect the solution to be close to the $v_0 = 0$
solution found in the text. In this case $k^2 \simeq v_0^2/(2Ag) \ll 1$, so $\theta_0$ is small and, to a
first approximation, is given by $\theta_0 = k\sin\theta_b$. Thus the above equation for $\theta_b$ becomes
$$b = \frac{A}{2\sin^2\theta_b}\left\{(2\theta_b - \sin 2\theta_b) + O(k^3)\right\}.$$
The function on the right-hand side of this equation is monotonic increasing for $0 \le \theta_b < \pi$: for small $\theta_b$ it behaves as $2A\theta_b/3$ and it is infinite at $\theta_b = \pi$. Hence, for all
$b \ge 0$ there is a unique real solution. In the limit $v_0 = 0$ this is the same equation
determined in the text (the equation immediately preceding 4.9). With this value of
$\theta_b$ we have
$$c^2 = \frac{A + v_0^2/2g}{\sin^2\theta_b} \quad\text{and}\quad \theta_0 = k\sin\theta_b + O(k^3).$$
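Because the right-hand side of the equation for the final parameter value is monotonic increasing on $(0, \pi)$, the equation can be solved by bisection. The sketch below treats the $v_0 = 0$ limit, $b = A(2\theta_b - \sin 2\theta_b)/(2\sin^2\theta_b)$, writing $\theta_b$ for the final value of the parameter; the function names, the bracketing interval and the tolerance are assumptions of this illustration.

```python
import math

def end_point_x(theta_b, A):
    """b(theta_b) = A (2 theta_b - sin 2 theta_b) / (2 sin^2 theta_b), the v0 = 0 limit."""
    return A * (2.0 * theta_b - math.sin(2.0 * theta_b)) / (2.0 * math.sin(theta_b) ** 2)

def solve_theta_b(b, A, tol=1e-12):
    """Bisection for the unique root in (0, pi), relying on monotonicity."""
    lo, hi = 1e-9, math.pi - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if end_point_x(mid, A) < b:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

A = 1.0
print(solve_theta_b(math.pi / 2.0, A))  # b = pi A / 2 corresponds to theta_b = pi / 2
print(solve_theta_b(0.03, A))           # small b: theta_b is close to 3b / (2A)
```

The second call illustrates the small-angle behaviour $b \simeq 2A\theta_b/3$ noted above.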
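If $v_0^2 \gg 2Ag$ we should expect gravity to have little effect because the initial kinetic
energy ($mv_0^2/2$) greatly exceeds the initial potential energy ($mgA$), so the motion will
be close to the straight line joining $(0, A)$ to $(b, 0)$.
In this case $k \simeq 1$ and we can write
$$k^2 = \frac{1}{1 + \epsilon}, \qquad \epsilon = \frac{2Ag}{v_0^2} \ll 1,$$
$$\sin\theta_0 = \frac{\sin\theta_b}{\sqrt{1 + \epsilon}} \quad\text{or}\quad \theta_0 = \sin^{-1}(\sin\theta_b - \delta\sin\theta_b)$$
where
$$\delta = 1 - \frac{1}{\sqrt{1 + \epsilon}} = \frac{\epsilon}{1 + \epsilon + \sqrt{1 + \epsilon}} \simeq \frac{\epsilon}{2}.$$
$$b = \frac{v_0^2}{g}\,\delta\sqrt{1 + \epsilon}\,\tan\theta_b = A\tan\theta_b.$$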
g
Thus b is the angle between the downward vertical and the straight line between the
end points.
Now put = b tan b , where is a parameter such that = 0 and b when
= 1 and 0, respectively. The x-coordinate is
1 2
x= c {(2 sin 2) (20 sin 20 )}
2
and since, to first-order,
we find that x = 2c2 (1 ) tan b sin2 b . But c2 sin2 b = A(1 + )/, tan b = b/A
and = /2 so x = (1 )b. For the y-coordinate, since sin = (1 ) sin b
A
y = (1 + ) c2 (1 )2 sin2 b
A A
= (1 + ) (1 + )(1 )2 ' A.
As expected this gives the parametric equation of a straight line between the initial and
final points.
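$$x\frac{dv}{dx} = \frac{v(1 + v^2)}{1 - v^2} \quad\text{or}\quad \int dv\,\frac{1 - v^2}{v(1 + v^2)} = \int dv\left(\frac{1}{v} - \frac{2v}{1 + v^2}\right) = \int\frac{dx}{x}.$$
This integrates to
$$\frac{v}{1 + v^2} = Ax \quad\text{and since}\quad v = \frac{f}{x} \quad\text{this gives}\quad x^2 + (f - \beta)^2 = \beta^2.$$
This equation represents a circle of radius $\beta$ with centre at $(0, \beta)$.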
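This result, that the sliding time along a chord to the lowest point of a vertical circle does not depend on the starting point, is easy to confirm numerically. The sketch below evaluates $T = 2\sqrt{(x^2 + f^2)/(2gf)}$ for several points on a circle of radius $\beta$ centred at $(0, \beta)$; the value of $g$, the radius and the function names are assumptions of this illustration.

```python
import math

def slide_time(x, fx, g=9.81):
    """Time to slide from rest along the straight chord from (x, fx) to the origin."""
    return 2.0 * math.sqrt((x * x + fx * fx) / (2.0 * g * fx))

def circle_height(x, beta, upper=False):
    """A height f satisfying x^2 + (f - beta)^2 = beta^2 (lower arc by default)."""
    root = math.sqrt(beta * beta - x * x)
    return beta + root if upper else beta - root

beta = 1.5
times = [slide_time(x, circle_height(x, beta)) for x in (0.2, 0.7, 1.2, 1.45)]
times.append(slide_time(0.9, circle_height(0.9, beta, upper=True)))
expected = 2.0 * math.sqrt(beta / 9.81)  # since x^2 + f^2 = 2 beta f on the circle
print(times, expected)
```

Every entry of `times` equals $2\sqrt{\beta/g}$ to rounding error.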
The integrand is independent of $x$, so the first-integral is $\dfrac{y^3}{\sqrt{1 + y'^2}} = c^3$. Symmetry
about $x = 0$ suggests that $y(x)$ is even, so $y'(0) = 0$ and then $y(0) = c$, where $c$ is
positive. Rearranging this gives
$$y'^2 = \frac{y^6}{c^6} - 1 \quad\text{or}\quad x = \int_c^y \frac{c^3\,du}{\sqrt{u^6 - c^6}} = c^3\int_c^y du\,\frac{1}{\sqrt{(u^2 - c^2)(c^4 + c^2u^2 + u^4)}}.$$
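If $\phi(a) = \phi_a$ then $A$, $c$ and $a$ are related by $A = c\cosh\phi_a$ and, from the above integral,
$$\frac{a}{A}\cosh\phi_a = \int_0^{\phi_a} dv\,\frac{1}{\sqrt{1 + \cosh^2 v + \cosh^4 v}}, \quad\text{that is}\quad \frac{a}{A} = f(\phi_a)$$
where
$$f(z) = \frac{1}{\cosh z}\int_0^z dv\,\frac{1}{\sqrt{1 + \cosh^2 v + \cosh^4 v}}.$$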
we have $f(z) \le \gamma/\cosh z < \gamma$. Thus the equation $a/A = f(\phi_a)$ has real solutions only
if $a < \gamma A$: for large separations of the ends, $a > \gamma A$, there are no solutions of the
Euler-Lagrange equation. Numerical evaluation of the integral gives $\gamma = 0.701$.
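The numerical value quoted here can be reproduced with a few lines of code. The sketch below assumes that the constant in question is the integral of $1/\sqrt{1 + \cosh^2 v + \cosh^4 v}$ over $0 \le v < \infty$, truncated at a finite upper limit where the integrand is negligible; the cut-off, the step count and the function names are choices made for this illustration.

```python
import math

def integrand(v):
    """1 / sqrt(1 + cosh^2 v + cosh^4 v)."""
    c2 = math.cosh(v) ** 2
    return 1.0 / math.sqrt(1.0 + c2 + c2 * c2)

def simpson(func, a, b, n=4000):
    """Composite Simpson rule on [a, b] with n (even) sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

upper = 20.0  # the integrand decays like 4 exp(-2v), so the neglected tail is tiny
value = simpson(integrand, 0.0, upper)
print(round(value, 3))
```

The same routine applied to $[0, z]$ and divided by $\cosh z$ gives $f(z)$, which can be tabulated to see the single maximum discussed below.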
Now we show, by approximating $f(z)$, that for small $z$, $f(z)$ is increasing and for large
$z$ it is decreasing, so $f(z)$ has at least one maximum and the equation $a/A = f(\phi_a)$ has
at least two real roots for small $a/A$.
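For small $v$
$$\frac{\sqrt{3}}{\sqrt{1 + \cosh^2 v + \cosh^4 v}} = \frac{1}{\sqrt{1 + \sinh^2 v + \frac{1}{3}\sinh^4 v}} = 1 - \frac{1}{2}v^2 + \frac{1}{24}v^4 + \cdots,$$
$$f(z) = \frac{z}{\sqrt{3}} - \frac{2}{3\sqrt{3}}z^3 + \cdots.$$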
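Provided $\cosh^{-2} v + \cosh^{-4} v < 1$, that is $v > 0.722$, we may expand the square root to
give
$$g(z) = \int_z^{\infty}\frac{dv}{\cosh^2 v}\left(1 - \frac{1}{2\cosh^2 v} - \frac{1}{8\cosh^4 v} + \cdots\right).$$
But
$$\int_z^{\infty}\frac{dv}{\cosh^{2n} v} = 2^{2n}\int_z^{\infty} dv\,e^{-2nv}\left(1 + e^{-2v}\right)^{-2n} = \frac{2^{2n}}{2n}e^{-2nz}\left(1 + O(e^{-2z})\right).$$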
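Using the expressions for $y$ and $s$ we find that the surface area is
$$S = 4\pi a^2\int_0^{2\pi} d\theta\,(1 - \cos\theta)\sin(\theta/2) = 8\pi a^2\int_0^{2\pi} d\theta\,\sin^3(\theta/2) = 32\pi a^2\int_0^{\pi/2} d\phi\,\sin^3\phi = \frac{64}{3}\pi a^2.$$
Similarly the volume is
$$V = \pi a^3\int_0^{2\pi} d\theta\,(1 - \cos\theta)^3 = \pi a^3\int_0^{2\pi} d\theta\,\left(1 + 3\cos^2\theta\right) = 5\pi^2a^3$$
where we have used the fact that the mean of odd powers of the cosine function is zero,
$$\int_a^{2\pi + a} dx\,\cos^{2n+1}x = 0 \quad\text{for any real } a.$$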
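Both results can be checked by numerical quadrature of the defining integrals $S = 2\pi\int y\,(ds/d\theta)\,d\theta$ and $V = \pi\int y^2\,(dx/d\theta)\,d\theta$ over one arch of the cycloid. The following is a minimal sketch with $a = 1$; the helper `simpson` and the step count are assumptions of this illustration.

```python
import math

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) sub-intervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

a = 1.0

def y(t):
    return a * (1.0 - math.cos(t))

def ds_dt(t):
    # arc-length element of the cycloid, 2 a sin(t / 2) for 0 <= t <= 2 pi
    return 2.0 * a * math.sin(t / 2.0)

def dx_dt(t):
    return a * (1.0 - math.cos(t))

S = 2.0 * math.pi * simpson(lambda t: y(t) * ds_dt(t), 0.0, 2.0 * math.pi)
V = math.pi * simpson(lambda t: y(t) ** 2 * dx_dt(t), 0.0, 2.0 * math.pi)
print(S, 64.0 * math.pi * a ** 2 / 3.0)
print(V, 5.0 * math.pi ** 2 * a ** 3)
```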
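and hence
$$S(x) = \frac{2}{5}\,6^{2/3}\pi a^{1/3}x^{5/3} + O(x^{7/3}).$$
Similarly the volume for small $\theta$ is
$$V(\theta) = \pi a^3\int_0^{\theta} d\phi\left(\phi\left(\frac{\phi^3}{6}\right)^2 + O(\phi^9)\right) = \frac{\pi a^3}{8\cdot 6^2}\theta^8 + O(\theta^{10}) = \frac{\pi}{8}\,6^{2/3}a^{1/3}x^{8/3} + O(x^{10/3}).$$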
and hence
$$S(\theta) = 4\pi a^2\left(3\sin(\theta/2) - 2\theta\cos(\theta/2) + \frac{1}{3}\sin(3\theta/2)\right) \quad\text{with}\quad S(\pi) = \frac{32}{3}\pi a^2.$$
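But
$$\int_0^{\theta} d\phi\,\sin^3\phi = \frac{1}{4}\int_0^{\theta} d\phi\,(3\sin\phi - \sin 3\phi) = \frac{1}{4}\left[-3\cos\phi + \frac{1}{3}\cos 3\phi\right]_0^{\theta} = \frac{2}{3} - \frac{3}{4}\cos\theta + \frac{1}{12}\cos 3\theta$$
and
$$\int_0^{\theta} d\phi\,\phi\sin^2\phi = \frac{1}{2}\int_0^{\theta} d\phi\,\phi(1 - \cos 2\phi) = \frac{1}{4}\theta^2 - \frac{1}{2}\left\{\left[\frac{1}{2}\phi\sin 2\phi\right]_0^{\theta} - \frac{1}{2}\int_0^{\theta} d\phi\,\sin 2\phi\right\}$$
$$= \frac{1}{4}\theta^2 - \frac{1}{4}\theta\sin 2\theta - \frac{1}{8}\Big[\cos 2\phi\Big]_0^{\theta} = \frac{1}{4}\theta^2 - \frac{1}{4}\theta\sin 2\theta + \frac{1}{8}(1 - \cos 2\theta).$$
and finally
$$\int_0^{\theta} d\phi\,\phi^2\sin\phi = \Big[-\phi^2\cos\phi\Big]_0^{\theta} + 2\int_0^{\theta} d\phi\,\phi\cos\phi = -\theta^2\cos\theta + 2\left\{\Big[\phi\sin\phi\Big]_0^{\theta} - \int_0^{\theta} d\phi\,\sin\phi\right\}$$
$$= -\theta^2\cos\theta + 2\theta\sin\theta - 2(1 - \cos\theta).$$
If $\theta = \pi$, $V = \pi a^3\left(\pi^2/2 - 8/3\right)$.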
$$\tan\alpha = \frac{dy}{dx} = \frac{\sin\theta}{1 - \cos\theta} = \frac{1}{\tan(\theta/2)},$$
where we use the identities $\sin 2x = 2\sin x\cos x$, $\cos 2x = 1 - 2\sin^2 x$. Hence $\tan\alpha\tan(\theta/2) = 1$,
so $\cos(\alpha + \theta/2) = 0$ which means that $\alpha + \theta/2$ is an odd integer multiple of $\pi/2$. But
when $\theta = 0$, $\alpha = \pi/2$ and when $\theta = \pi$, $\alpha = 0$, so $\alpha + \theta/2 = \pi/2$.
(b) If $s(\theta)$ is the length OQ the straight line QR is of length $l - s(\theta)$ and the horizontal
and vertical distances from Q to R are $(l - s(\theta))\cos\alpha$ and $(l - s(\theta))\sin\alpha$, respectively.
Since $\alpha = \pi/2 - \theta/2$ we see that the coordinates of R are
(c) Since $s(\theta) = 8a\sin^2(\theta/4)$, see exercise 4.3 (page 165), the length OQC is given by
putting $\theta = \pi$, $L_{OCD} = 4a$. Then if $l = L_{OCD}$
and