
MATH 401: GREEN'S FUNCTIONS AND VARIATIONAL METHODS
The main goal of this course is to learn how to solve various PDE (= partial differential equation) problems arising in science. Since only very few special such problems have solutions we can write explicitly, "solve" here often means either (a) find an explicit approximation to the true solution, or (b) learn some important qualitative properties of the solution.

A number of techniques are available for this. The method of separation of variables (together with Fourier series), perhaps familiar from earlier courses, unfortunately only applies to very special problems. The related (but more general) method of eigenfunction expansion is very useful and will be discussed in places in this course. Integral transform techniques (such as the Fourier transform and Laplace transform) will also be touched on here. If time permits, we will also study some perturbation methods, which are extremely useful if there is a small parameter somewhere in the problem.

The bulk of this course, however, will be devoted to the method/notion of Green's function (a function which, roughly speaking, expresses the effect of the data at one point on the solution at another point), and to variational methods (by which PDE problems are solved by minimizing (or maximizing) some quantities). Variational methods are extremely powerful and even apply directly to some nonlinear problems (as well as linear ones). Green's functions are for linear problems, but in fact play a key role in nonlinear problems when they are treated as (in some sense) perturbations of linear ones (which is frequently the only feasible treatment!).

We begin with Green's functions.
A. GREEN'S FUNCTIONS

I. INTRODUCTION
As an introductory example, consider the following initial-boundary value problem for the inhomogeneous heat equation (HE) in one (space) dimension:

  $u_t = \partial^2 u/\partial x^2 + f(x,t)$,  $0 < x < L$, $t > 0$   (HE)
  $u(0,t) = u(L,t) = 0$   (BC)
  $u(x,0) = u_0(x)$   (IC)

which, physically, describes the temperature $u(x,t)$ at time $t$ and at point $x$ along a rod $0 \le x \le L$ of length $L$ subject to (time- and space-varying) heat source $f(x,t)$, with the ends held fixed at temperature 0 (boundary condition (BC)), and with initial (time $t = 0$) temperature distribution $u_0(x)$ (initial condition (IC)).
This problem is easily solved by the (hopefully) familiar methods of separation of variables and eigenfunction expansion. Without going into details, the eigenfunctions of the spatial part of the (homogeneous) PDE (namely $d^2/dx^2$) satisfying (BC) are $\sin(n\pi x/L)$, $n = 1, 2, 3, \ldots$ (with corresponding eigenvalues $-n^2\pi^2/L^2$), and so we seek a solution of the form

  $u(x,t) = \sum_{n=1}^\infty a_n(t)\sin(n\pi x/L)$.

We similarly expand the data of the problem (the source term $f(x,t)$ and the initial condition $u_0(x)$) in terms of these eigenfunctions; that is, as Fourier series:
  $f(x,t) = \sum_{n=1}^\infty f_n(t)\sin(n\pi x/L)$,  $f_n(t) = \frac{2}{L}\int_0^L f(x,t)\sin(n\pi x/L)\,dx$

  $u_0(x) = \sum_{n=1}^\infty g_n\sin(n\pi x/L)$,  $g_n = \frac{2}{L}\int_0^L u_0(x)\sin(n\pi x/L)\,dx$.
Plugging the expression for $u(x,t)$ into the PDE (HE) and comparing coefficients then yields the family of ODE problems

  $a_n'(t) + (n^2\pi^2/L^2)\,a_n(t) = f_n(t)$,  $a_n(0) = g_n$,

which are easily solved (after all, they are first-order linear ODEs) by using an integrating factor, to get

  $a_n(t) = e^{-(n^2\pi^2/L^2)t}\left[ g_n + \int_0^t e^{(n^2\pi^2/L^2)s} f_n(s)\,ds \right]$,
and hence the solution we sought:

  $u(x,t) = \frac{2}{L}\sum_{n=1}^\infty e^{-(n^2\pi^2/L^2)t}\left[ \int_0^L u_0(y)\sin(n\pi y/L)\,dy + \int_0^t e^{(n^2\pi^2/L^2)s}\int_0^L f(y,s)\sin(n\pi y/L)\,dy\,ds \right]\sin(n\pi x/L)$.
No problem. But it is instructive to re-write this expression by exchanging the order of integration and summation (note we are not worrying here about the convergence of the sum or the exchange of sum and integral; suffice it to say that for reasonable (say continuous) functions $u_0$ and $f$ all our manipulations are justified and the sum converges beautifully due to the decaying exponential) to obtain

  $u(x,t) = \int_0^L G(x,t;y,0)\,u_0(y)\,dy + \int_0^t\int_0^L G(x,t;y,s)\,f(y,s)\,dy\,ds$   (1)
where

  $G(x,t;y,s) = \frac{2}{L}\sum_{n=1}^\infty e^{-(n^2\pi^2/L^2)(t-s)}\sin(n\pi y/L)\sin(n\pi x/L)$.

Expression (1) gives the solution as an integral (OK, 2 integrals) of the data (the initial condition $u_0(x)$ and the source term $f(x,t)$) against the function $G$, which is called, of course, the Green's function for our problem (precisely, for the heat equation on $[0,L]$ with 0 boundary conditions).
Our computation above suggests a few observations about Green's functions:

- if we can find the Green's function for a problem, we have effectively solved the problem for any data: we just need to plug the data into an integral like (1)
- a Green's function is a function of 2 sets of variables: one set are the variables of the solution ($x$ and $t$ above); the other set ($y$ and $s$ above) gets integrated
- one can think of a Green's function as giving the effect of the data at one point ($(y,s)$ above) on the solution at another point ($(x,t)$)
- the domain of a Green's function is determined by the original problem: in the above example, the spatial variables $x$ and $y$ run over the interval $[0,L]$ (the rod), and the time variables satisfy $0 \le s \le t$; here the condition $s \le t$ reflects the fact that the solution at time $t$ is only determined by the data at previous times (not future times)

The first part of this course will be devoted to a systematic study of Green's functions, first for ODEs (where computations are generally easier), and then for PDEs, where Green's functions really come into their own.
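As a quick sanity check (an editorial addition, not part of the notes' derivation; assumes NumPy is available), here is a minimal sketch: it truncates the series for $G$ and checks formula (1) against the exact solution $u(x,t) = e^{-\pi^2 t/L^2}\sin(\pi x/L)$ for the data $f \equiv 0$, $u_0(x) = \sin(\pi x/L)$.

```python
import numpy as np

L, N_terms = 1.0, 50          # rod length; number of series terms kept

def G(x, t, y, s, N=N_terms):
    """Truncated series for the heat-equation Green's function on [0, L]."""
    n = np.arange(1, N + 1)
    return (2.0 / L) * np.sum(
        np.exp(-(n * np.pi / L) ** 2 * (t - s))
        * np.sin(n * np.pi * y / L) * np.sin(n * np.pi * x / L))

# data: f = 0, u0(y) = sin(pi y / L)  =>  exact solution e^{-pi^2 t/L^2} sin(pi x/L)
u0 = lambda y: np.sin(np.pi * y / L)

x, t = 0.3, 0.1
ys = np.linspace(0.0, L, 2001)
u_green = np.trapz([G(x, t, y, 0.0) * u0(y) for y in ys], ys)  # formula (1)
u_exact = np.exp(-np.pi ** 2 * t / L ** 2) * np.sin(np.pi * x / L)
print(u_green, u_exact)       # agree to ~1e-6
```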
II. GREEN'S FUNCTIONS FOR ODEs

1. An ODE boundary value problem

Consider the ODE (= ordinary differential equation) boundary value problem

  $Lu := a_0 u'' + a_1 u' + a_2 u = f(x)$,  $x_0 < x < x_1$
  $u(x_0) = u(x_1) = 0$.   (2)
Here

  $L := a_0(x)\frac{d^2}{dx^2} + a_1(x)\frac{d}{dx} + a_2(x)$

is a (second-order, linear) differential operator. As motivation for problem (2), one can think, for example, of $u(x)$ as giving the steady-state temperature along a rod $[x_0, x_1]$ with (non-uniform) thermal conductivity $\kappa(x)$, subject to a heat source $f(x)$ and with ends held fixed at temperature 0, which leads to the problem

  $-(\kappa(x)u')' = f$,  $u(x_0) = u(x_1) = 0$

of the form (2). Zero boundary conditions are the simplest, but later we will consider other boundary conditions (for example, if the ends of the rod are insulated, we should take $u'(x_0) = u'(x_1) = 0$).
We would like to solve (2) by finding a function $G(x;z)$, the Green's function, so that

  $u(x) = \int_{x_0}^{x_1} G(x;z)f(z)\,dz =: (G_x, f)$

where we have introduced the notations

  $G_x(z) := G(x;z)$

(when we want to emphasize the dependence of $G$ specifically on the variable $z$, thinking of $x$ as fixed), and

  $(g, f) := \int_{x_0}^{x_1} g(z)f(z)\,dz$   ("inner product").

Then since $u$ solves $Lu = f$, we want

  $u(x) = (G_x, f) = (G_x, Lu)$.

Next we want to move the operator $L$ over from $u$ to $G_x$ on the other side of the inner product, for which we need the notion of adjoint.

Definition: The adjoint of the operator $L$ is the operator $L^*$ such that

  $(v, Lu) = (L^* v, u) + \text{boundary terms}$
for all smooth functions $u$ and $v$.

The following example illustrates the adjoint, and explains what is meant by "boundary terms".

Example: Let, as above, $L = a_0(x)\frac{d^2}{dx^2} + a_1(x)\frac{d}{dx} + a_2(x)$, acting on functions defined for $x_0 \le x \le x_1$. Then for two such (smooth) functions $u(x)$ and $v(x)$, integration by parts gives (check it!)

  $(v, Lu) = \int_{x_0}^{x_1} v\,[a_0 u'' + a_1 u' + a_2 u]\,dx$
  $= \int_{x_0}^{x_1} u\,[a_0 v'' + (2a_0' - a_1)v' + (a_2 + a_0'' - a_1')v]\,dx + [a_0 v u' - (a_0 v)' u + a_1 u v]_{x_0}^{x_1}$
  $= \big( a_0 v'' + (2a_0' - a_1)v' + (a_2 + a_0'' - a_1')v,\; u \big) + [a_0(v u' - v' u) + (a_1 - a_0')u v]_{x_0}^{x_1}$.

The terms after the integral (the ones evaluated at the endpoints $x_0$ and $x_1$) are what we mean by "boundary terms". Hence the adjoint is

  $L^* = a_0\frac{d^2}{dx^2} + (2a_0' - a_1)\frac{d}{dx} + (a_2 + a_0'' - a_1')$.

The differential operator $L^*$ is of the same form as $L$, but with (in general) different coefficients.
An important class of operators are those which are equal to their adjoints.

Definition: An operator $L$ is called (formally) self-adjoint if $L = L^*$.

Example: Comparison of $L$ and $L^*$ for our example $L = a_0\frac{d^2}{dx^2} + a_1\frac{d}{dx} + a_2$ shows that $L$ is formally self-adjoint if and only if $a_0' = a_1$. Note that in this case

  $Lu = a_0 u'' + a_0' u' + a_2 u = (a_0 u')' + a_2 u$,

which is an ordinary differential operator of Sturm-Liouville type.
Now we can return to our search for a Green's function for problem (2):

  $u(x) = (G_x, f) = (G_x, Lu) = (L^* G_x, u) + \text{BT}$

where we know from our computations above that the boundary terms are

  $\text{BT} = [a_0(G_x u' - G_x' u) + (a_1 - a_0')G_x u]_{x_0}^{x_1} = [a_0 G_x u']_{x_0}^{x_1}$

where we used the zero boundary conditions $u(x_0) = u(x_1) = 0$ in problem (2). We can make the remaining boundary term disappear if we also impose the boundary conditions $G_x(x_0) = G_x(x_1) = 0$ on our Green's function $G$. Thus we are led to the problem

  $u(x) = (L^* G_x, u)$,  $G_x(x_0) = G_x(x_1) = 0$
for $G$.

So $L^* G_x$ should be a "function" $g(z)$ which satisfies

  $u(x) = (g, u) = \int_{x_0}^{x_1} g(z)u(z)\,dz$

for all (nice) functions $u$. What kind of function is this? In fact, it is a (Dirac) delta function, which is not really a function at all! Rather, it is a generalized function, a notion we need to explore further before proceeding with Green's functions.
2. Generalized functions (distributions).

The precise mathematical definition of a generalized function is:

Definition: A generalized function or distribution is a (continuous) linear functional acting on the space of test functions

  $C_0^\infty(\mathbb{R})$ = infinitely differentiable functions on $\mathbb{R}$ which vanish outside some interval

(an example of such a test function is

  $\phi(x) = e^{-1/(a^2 - x^2)}$ for $-a < x < a$,  $\phi(x) = 0$ for $|x| \ge a$,

for any $a > 0$). That is, a distribution $f$ maps a test function $\phi$ to a real number

  $f(\phi) = (f, \phi)\ \ \big( = \text{``}\int f(x)\phi(x)\,dx\text{''}\big)$

which it is useful to think of as an integral (as in the parentheses above), hence the common notation $(f, \phi)$ for $f(\phi)$, but is not in general an integral (it cannot be, since $f$ is not in general an actual function!). Further, this map should be linear: for test functions $\phi$ and $\psi$ and numbers $\alpha$ and $\beta$,

  $f(\alpha\phi + \beta\psi) = \alpha f(\phi) + \beta f(\psi)\ \ \big(\text{or } (f, \alpha\phi + \beta\psi) = \alpha(f, \phi) + \beta(f, \psi)\big)$.
Some examples should help clarify.

Example:

1. If $f(x)$ is a usual function (say, a piecewise continuous one), then it is also a distribution (after all, we wouldn't call them generalized functions if they didn't include regular functions), which acts by integration,

  $f(\phi) = (f, \phi) = \int f(x)\phi(x)\,dx$,

which is indeed a linear operation. This is why we use inner-product (and sometimes even integral) notation for the action of a distribution: when the distribution is a real function, its action on test functions is integration.

2. The (Dirac) delta function, denoted $\delta(z)$ (or more generally $\delta_x(z) = \delta(z - x)$ for the delta function centred at a point $x$), is not a function, but a distribution, whose action on test functions is defined to be

  $(\delta, \phi) := \phi(0)\ \ \big( = \text{``}\int \delta(z)\phi(z)\,dz\text{''}\big)$
  $(\delta_x, \phi) := \phi(x)\ \ \big( = \text{``}\int \delta(z - x)\phi(z)\,dz\text{''}\big)$

(where, again, the integral here is just notation). That is, $\delta$ acts on test functions by picking out their value at 0 (and $\delta_x$ acts on test functions by picking out their value at $x$).
Generalized functions are so useful because we can perform on them many of the operations we can perform on usual functions. We can:

1. Differentiate them: if $f$ is a usual differentiable function, then for a test function $\phi$, by integration by parts,

  $(f', \phi) = \int f'(x)\phi(x)\,dx = -\int f(x)\phi'(x)\,dx = -(f, \phi')$

(there are no boundary terms because $\phi$ vanishes outside of some interval). Now if $f$ is any distribution, these integrals make no sense, but we can use the above calculation as the definition of how the distribution $f'$ acts on test functions:

  $(f', \phi) := -(f, \phi')$

and by iterating, we can differentiate $f$ $n$ times:

  $(f^{(n)}, \phi) = (f, (-1)^n\phi^{(n)})$.

Example:

(a) the derivative of a delta function:

  $(\delta', \phi) = -(\delta, \phi') = -\phi'(0)$

(b) the derivative of the Heaviside function

  $H(x) := \begin{cases} 0 & x \le 0 \\ 1 & x > 0 \end{cases}$

(which is a usual function, but is not differentiable in the classical sense at $x = 0$):

  $(H', \phi) = -(H, \phi') = \int H(x)(-\phi'(x))\,dx = -\int_0^\infty \phi'(x)\,dx = -\phi(x)\big|_{x=0}^{x=\infty} = \phi(0)$

(since $\phi$ vanishes outside an interval). Hence

  $\frac{d}{dx}H(x) = \delta(x)$.
The fact that we can always differentiate a distribution is what makes them so useful for differential equations.

2. Multiply them by smooth functions: if $f$ is a distribution and $a(x)$ is a smooth (infinitely differentiable) function, we define

  $(a(x)f, \phi) := (f, a(x)\phi)$

(which makes sense, since $a\phi$ is again a test function). Note this definition coincides with the usual one when $f$ is a usual function.

3. Consider convergence of distributions: we say that a sequence $\{f_j\}_{j=1}^\infty$ of distributions converges to another distribution $f$ if

  $\lim_{j\to\infty}(f_j, \phi) = (f, \phi)$ for all test functions $\phi$.

This kind of convergence is called weak convergence.

Example: Let $\eta(x)$ be a smooth, non-negative function with $\int \eta(x)\,dx = 1$, and set $\eta_j(x) := j\,\eta(jx)$ for $j = 1, 2, 3, \ldots$. Note that as $j$ increases, the graph of $\eta_j(x)$ becomes both taller and more concentrated near $x = 0$, while maintaining $\int \eta_j(x)\,dx = 1$. In fact, we have

  $\lim_{j\to\infty}\eta_j(x) = \lim_{j\to\infty} j\,\eta(jx) = \delta(x)$

in the weak sense; it is a nice exercise to show this!
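Here is a small numerical illustration of that exercise (an editorial sketch, not from the notes; assumes NumPy): with $\eta$ a normalized Gaussian, the pairing $(\eta_j, \phi) = \int \eta_j(x)\phi(x)\,dx$ is computed by quadrature and approaches $\phi(0)$ as $j$ grows.

```python
import numpy as np

# weak convergence of eta_j(x) = j*eta(j*x) to the delta function:
# (eta_j, phi) should tend to phi(0) as j -> infinity
eta = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # integrates to 1
phi = lambda x: np.cos(x) * np.exp(-x**2)                # a smooth, rapidly decaying "test function"

x = np.linspace(-20, 20, 400001)
for j in [1, 4, 16, 64]:
    pairing = np.trapz(j * eta(j * x) * phi(x), x)
    print(j, pairing)        # tends to phi(0) = 1.0
```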
4. Compose them with invertible functions: let $g : \mathbb{R} \to \mathbb{R}$ be a one-to-one and onto differentiable function, with $g'(x) > 0$. If $f$ is a usual function, then by changing variables $y = g(x)$ (so $dy = g'(x)\,dx$), we have for the composition $f \circ g(x) = f(g(x))$:

  $(f \circ g, \phi) = \int f(g(x))\phi(x)\,dx = \int f(y)\,\phi(g^{-1}(y))\,\frac{dy}{g'(g^{-1}(y))} = \Big( f,\ \frac{1}{g' \circ g^{-1}}\,\phi \circ g^{-1} \Big)$

and so for $f$ a distribution, we define

  $(f \circ g, \phi) := \Big( f,\ \frac{1}{g' \circ g^{-1}}\,\phi \circ g^{-1} \Big)$.

Example: Composing the delta function with $g(x)$ gives

  $(\delta(g(x)), \phi) = \Big( \delta,\ \frac{1}{g' \circ g^{-1}}\,\phi \circ g^{-1} \Big) = \frac{\phi(g^{-1}(0))}{g'(g^{-1}(0))}$,

and in particular if $g(x) = cx$ (constant $c > 0$),

  $(\delta(cx), \phi) = \frac{1}{c}\phi(0) = \Big( \frac{1}{c}\delta, \phi \Big)$

and hence $\delta(cx) = \frac{1}{c}\delta(x)$.
3. Green's functions for ODEs.

Returning now to the ODE problem (2), we had concluded that we want our Green's function $G(x;z) = G_x(z)$ to satisfy

  $u(x) = (L^* G_x, u)$ for all $u$,  $G_x(x_0) = G_x(x_1) = 0$.

After our discussion of generalized functions, then, we see that what we want is really

  $L^* G_x(z) = \delta(z - x)$,  $G_x(x_0) = G_x(x_1) = 0$.

Notice that for $z \ne x$, we are simply solving $L^* G_x = 0$. The strategy is to solve this equation for $z < x$ and for $z > x$, and then "glue" the two pieces together. Some examples should help clarify.

Example: use the Green's function method to solve the problem

  $u'' = f(x)$,  $0 < x < L$,  $u(0) = u(L) = 0$.   (3)
(which could, of course, be solved simply by integrating twice).

First note that $L = \frac{d^2}{dx^2} = L^*$ (the operator is self-adjoint), so the problem for our Green's function $G(x;z) = G_x(z)$ is

  $G_x''(z) = \delta(z - x)$,  $G_x(0) = G_x(L) = 0$

(here $'$ denotes $\frac{d}{dz}$). For $z < x$ and $z > x$, we have simply $G_x'' = 0$, and so

  $G_x(z) = \begin{cases} Az + B & 0 \le z < x \\ Cz + D & x < z \le L \end{cases}$

The BC $G_x(0) = 0$ implies $B = 0$, and the BC $G_x(L) = 0$ implies $D = -LC$, so we have

  $G_x(z) = \begin{cases} Az & 0 \le z < x \\ C(z - L) & x < z \le L \end{cases}$

Now our task is to determine the remaining two unknown constants by using matching conditions to glue the two pieces together:

1. continuity: we demand that $G_x$ be continuous at $z = x$: $G_x(x-) = G_x(x+)$ (the notation here is $g(x\pm) := \lim_{\epsilon\downarrow 0} g(x \pm \epsilon)$). This yields $Ax = C(x - L)$.

2. jump condition: for any $\epsilon > 0$, integrating the equation $G_x'' = \delta(z - x)$ between $x - \epsilon$ and $x + \epsilon$ yields

  $G_x'(x + \epsilon) - G_x'(x - \epsilon) = G_x'\big|_{x-\epsilon}^{x+\epsilon} = \int_{x-\epsilon}^{x+\epsilon} G_x''(z)\,dz = \int_{x-\epsilon}^{x+\epsilon}\delta(z - x)\,dz = 1$

and letting $\epsilon \downarrow 0$, we arrive at

  $G_x'(x+) - G_x'(x-) = 1$.

This jump condition requires $C - A = 1$.
Solving the two equations for $A$ and $C$ yields $C = x/L$, $A = (x - L)/L$, and so

  $G(x;z) = G_x(z) = \begin{cases} z(x - L)/L & 0 \le z < x \\ x(z - L)/L & x < z \le L \end{cases}$

which gives our solution of problem (3):

  $u(x) = \int_0^L G(x;z)f(z)\,dz = \frac{x - L}{L}\int_0^x z f(z)\,dz + \frac{x}{L}\int_x^L (z - L)f(z)\,dz$.

Remark:

1. whatever you may think of the derivation of this expression for the solution, it is easy to check (by direct differentiation and the fundamental theorem of calculus) that it is correct (assuming $f$ is reasonable, say, continuous).

2. notice the form of the Green's function $G_x(z)$ (graph it!): it has a singularity (in the sense of being continuous, but not differentiable) at the point $z = x$. This singularity must be there, since differentiating $G$ twice has to yield a delta function.
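A quick numerical check of this formula (an editorial sketch, not in the notes; assumes NumPy): for $f \equiv 1$ the exact solution of (3) is $u(x) = x(x - L)/2$, and quadrature of the Green's function integral reproduces it.

```python
import numpy as np

L = 2.0

def G(x, z):
    # Green's function for u'' = f on [0, L], u(0) = u(L) = 0
    return np.where(z < x, z * (x - L) / L, x * (z - L) / L)

f = lambda z: np.ones_like(z)          # constant source, so u(x) = x(x - L)/2
z = np.linspace(0.0, L, 100001)
for x in [0.3, 1.0, 1.7]:
    u = np.trapz(G(x, z) * f(z), z)    # u(x) = integral of G(x; z) f(z) dz
    print(u, x * (x - L) / 2)          # the two columns agree
```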
Example: use the Green's function method to solve the problem

  $x^2 u'' + 2x u' - 2u = f(x)$,  $0 < x < 1$,  $u(0) = u(1) = 0$.   (4)

Remark: Notice that the coefficients $x^2$ and $2x$ vanish at the left endpoint $x = 0$ (this point is a "regular singular point" in ODE parlance). This suggests that unless $u$ is very wild as $x$ approaches 0, we will require $f(0) = 0$ to fully solve the problem. Let's come back to this point after we find a solution formula.

Notice the operator $L = x^2\frac{d^2}{dx^2} + 2x\frac{d}{dx} - 2 = \frac{d}{dx}x^2\frac{d}{dx} - 2$ is self-adjoint ($L = L^*$), and so the problem for the Green's function is

  $LG_x = z^2 G_x'' + 2z G_x' - 2G_x = \delta(z - x)$,  $G_x(0) = G_x(1) = 0$.

For $z \ne x$, the equation $z^2 G'' + 2z G' - 2G = 0$ is an ODE of Euler type, and so looking for solutions of the form $G = z^r$ yields

  $0 = z^2(r z^{r-1})' + 2z(r z^{r-1}) - 2z^r = (r(r - 1) + 2r - 2)z^r = (r + 2)(r - 1)z^r$

and so we want $r = -2$ or $r = 1$. Thus

  $G_x(z) = \begin{cases} Az + B/z^2 & 0 \le z < x \\ Cz + D/z^2 & x < z \le 1 \end{cases}$

The BCs $G_x(0) = G_x(1) = 0$ imply $B = 0$ and $C + D = 0$, so

  $G_x(z) = \begin{cases} Az & 0 \le z < x \\ C(z - 1/z^2) & x < z \le 1 \end{cases}$

The matching conditions are:
1. continuity: $G_x(x-) = G_x(x+)$ implies $Ax = C(x - 1/x^2)$

2. jump condition:

  $1 = \int_{x-\epsilon}^{x+\epsilon}\delta(z - x)\,dz = \int_{x-\epsilon}^{x+\epsilon}\big[(z^2 G_x')' - 2G_x\big]\,dz = z^2 G_x'(z)\big|_{x-\epsilon}^{x+\epsilon} - 2\int_{x-\epsilon}^{x+\epsilon} G_x\,dz \to x^2\big(G_x'(x+) - G_x'(x-)\big)$ as $\epsilon \downarrow 0$,

which implies $x^2 C(1 + 2/x^3) - x^2 A = 1$.

Solving the simultaneous linear equations for $A$ and $C$ yields

  $1 = C\big[(x^2 + 2/x) - x^2(1 - 1/x^3)\big] = 3C/x \implies C = x/3,\ \ A = (x - 1/x^2)/3$

and so the Green's function is

  $G(x;z) = \frac{1}{3}\begin{cases} z(x - 1/x^2) & 0 \le z < x \\ x(z - 1/z^2) & x < z \le 1 \end{cases}$

and the corresponding solution formula is

  $u(x) = \int_0^1 G(x;z)f(z)\,dz = \frac{1}{3}(x - 1/x^2)\int_0^x z f(z)\,dz + \frac{1}{3}x\int_x^1 (z - 1/z^2)f(z)\,dz$.
Does this really solve (4) (supposing $f$ is, say, continuous)? For $0 < x < 1$, differentiation (and the fundamental theorem of calculus) gives

  $3u'(x) = (1 + 2/x^3)\int_0^x z f(z)\,dz + (x - 1/x^2)x f(x) + \int_x^1 (z - 1/z^2)f(z)\,dz - x(x - 1/x^2)f(x)$
  $= (1 + 2/x^3)\int_0^x z f(z)\,dz + \int_x^1 (z - 1/z^2)f(z)\,dz$
so

  $3(x^2 u'' + 2x u' - 2u) = x^2\Big[ -\frac{6}{x^4}\int_0^x z f(z)\,dz + (1 + 2/x^3)x f(x) - (x - 1/x^2)f(x) \Big]$
  $\quad + 2x\Big[ (1 + 2/x^3)\int_0^x z f(z)\,dz + \int_x^1 (z - 1/z^2)f(z)\,dz \Big] - 2\Big[ (x - 1/x^2)\int_0^x z f(z)\,dz + x\int_x^1 (z - 1/z^2)f(z)\,dz \Big]$
  $= \Big( -\frac{6}{x^2} + 2x + \frac{4}{x^2} - 2x + \frac{2}{x^2} \Big)\int_0^x z f(z)\,dz + (2x - 2x)\int_x^1 (z - 1/z^2)f(z)\,dz + (x^3 + 2 - x^3 + 1)f(x)$
  $= 3f(x)$
and we see that the ODE is indeed solved. What about the BCs? Well, $u(1) = 0$ obviously holds. As alluded to above, the BC at $x = 0$ is subtler. We see that

  $3\lim_{x\to 0+} u(x) = \lim_{x\to 0+}\Big( -\frac{1}{x^2}\int_0^x z f(z)\,dz \Big) - \lim_{x\to 0+} x\int_x^1 \frac{f(z)}{z^2}\,dz$.

If $f$ is smooth, we have $f(z) = f(0) + O(z)$ for small $z$, so

  $3\lim_{x\to 0+} u(x) = -\tfrac{1}{2}f(0) - f(0) = -\tfrac{3}{2}f(0)$,

and so we require $f(0) = 0$ to genuinely satisfy the boundary condition at $x = 0$.
4. Boundary conditions, and self-adjoint problems.

The only BCs we have seen so far have been homogeneous (i.e. 0) Dirichlet (specifying the value of the function) ones; namely $u(x_0) = u(x_1) = 0$. Let's make this more general, first by considering an ODE problem with inhomogeneous Dirichlet BCs:

  $Lu := a_0 u'' + a_1 u' + a_2 u = f$,  $x_0 < x < x_1$
  $u(x_0) = u_0$,  $u(x_1) = u_1$.   (5)
Recall that by integration by parts, for functions $u$ and $v$,

  $(v, Lu) = (L^* v, u) + [a_0(v u' - v' u) + (a_1 - a_0')v u]_{x_0}^{x_1}$.

Suppose we find a Green's function $G(x;z) = G_x(z)$ solving the problem

  $L^* G_x = \delta(z - x)$
  $G_x(x_0) = G_x(x_1) = 0$

with the homogeneous (i.e. 0) BCs corresponding to the BCs in problem (5). Then

  $u(x) = (L^* G_x, u) = (G_x, Lu) + [a_0 G_x' u]_{x_0}^{x_1} = (G_x, f) + [a_0 G_x' u]_{x_0}^{x_1}$
  $= \int_{x_0}^{x_1} G(x;z)f(z)\,dz + a_0(x_1)G_x'(x_1)u_1 - a_0(x_0)G_x'(x_0)u_0$,

a formula which gives the solution of problem (5) in terms of the Green's function $G(x;z)$ and the data (the source term $f(x)$, and the boundary data $u_0$ and $u_1$).
The other way to generalize boundary conditions is to include the value of the derivative of $u$ (as well as $u$ itself) at the boundary (i.e. the endpoints of the interval). For example, if $u$ is the temperature along a rod $[x_0, x_1]$ whose ends are insulated, we should impose the Neumann BCs $u'(x_0) = u'(x_1) = 0$ (no heat flux through the ends).

In general, for the following discussion, think of imposing 2 boundary conditions, each of which is a linear combination of $u(x_0)$, $u'(x_0)$, $u(x_1)$, and $u'(x_1)$, set equal to 0 (homogeneous case) or some non-zero number (inhomogeneous case).
Definition:

1. A problem $\{Lu = f$, BCs on $u\}$ is called (essentially) self-adjoint if

(a) $L = L^*$ (so the operator is self-adjoint), and

(b) $(v, Lu) = (Lv, u)$ (i.e. with no boundary terms) whenever both $u$ and $v$ satisfy the homogeneous BCs corresponding to the BCs on $u$ in the problem.

Remark: As in the above example, the Green's function for a self-adjoint problem should satisfy the homogeneous BCs corresponding to the BCs in the original problem. More generally, the Green's function for a problem should satisfy:

2. The homogeneous adjoint boundary conditions for a problem $\{Lu = f$, BCs on $u\}$ are the BCs on $v$ which guarantee that $(v, Lu) = (L^* v, u)$ (i.e. no boundary terms) when $u$ satisfies the homogeneous BCs corresponding to the BCs in the original problem.
Remark:

1. these definitions are quite abstract; it is better to see what is going on by doing some specific examples

2. a problem can be non-self-adjoint even if $L = L^*$, for example (see homework)

  $u'' + q(x)u = f(x)$
  $u'(0) - u(1) = 0$,  $u'(1) = 0$

3. if $L \ne L^*$, we can make $Lu = f$ self-adjoint by multiplying by a function (again, see homework).
Example: (Sturm-Liouville problem)

  $Lu := -(p(x)u')' + q(x)u = f(x)$,  $0 < x < 1$
  $\alpha_0 u(0) + \beta_0 u'(0) = 0$
  $\alpha_1 u(1) + \beta_1 u'(1) = 0$   (6)

where $p(x) > 0$, and $\alpha_0, \alpha_1, \beta_0, \beta_1$ are numbers with $\alpha_0, \beta_0$ not both 0, and $\alpha_1, \beta_1$ not both 0.
First notice that $L = L^*$ (the operator is self-adjoint), and integration by parts (as usual) gives

  $(v, Lu) = (Lv, u) + [p(v u' - v' u)]_0^1$.

If $u$ and $v$ both satisfy the BCs in problem (6) (which are homogeneous) then

  $\alpha_0[v(0)u'(0) - v'(0)u(0)] = u'(0)(-\beta_0 v'(0)) - v'(0)(-\beta_0 u'(0)) = 0$
  $\beta_0[v(0)u'(0) - v'(0)u(0)] = v(0)(-\alpha_0 u(0)) - u(0)(-\alpha_0 v(0)) = 0$

and so (since $\alpha_0, \beta_0$ are not both 0), $v(0)u'(0) - v'(0)u(0) = 0$. A similar computation shows that $v(1)u'(1) - v'(1)u(1) = 0$. Hence $(v, Lu) = (Lv, u)$ (the boundary terms disappear), and the problem is, indeed, self-adjoint.
Thus a Green's function $G(x;z) = G_x(z)$ for problem (6) should satisfy

  $LG_x = -(p(z)G_x'(z))' + q(z)G_x(z) = \delta(z - x)$
  $\alpha_0 G_x(0) + \beta_0 G_x'(0) = 0$
  $\alpha_1 G_x(1) + \beta_1 G_x'(1) = 0$.

For $z \ne x$, we have $LG_x = 0$, so

  $G_x(z) = \begin{cases} c_0 w_0(z) & 0 \le z < x \\ c_1 w_1(z) & x < z \le 1 \end{cases}$

where for $j = 0, 1$, $w_j$ denotes any fixed, non-zero solution of

  $Lw_j = 0$,  $\alpha_j w_j(j) + \beta_j w_j'(j) = 0$

(which is an initial value problem for a second-order, linear ODE, hence there is a one-dimensional family of solutions), and $c_0, c_1$ are non-zero constants. Continuity of $G$ at $x$ implies

  $c_0 w_0(x) = c_1 w_1(x)$.
The jump condition at $x$ is (check it!) $p(x)[G_x'(x+) - G_x'(x-)] = -1$, so

  $c_1 w_1'(x) - c_0 w_0'(x) = -\frac{1}{p(x)}$.

Hence

  $c_1\big[ w_0(x)w_1'(x) - w_1(x)w_0'(x) \big] = -\frac{w_0(x)}{p(x)}$,

which involves the Wronskian

  $W = W[w_0, w_1](x) = w_0(x)w_1'(x) - w_1(x)w_0'(x)$.

Recall that since $w_0$ and $w_1$ satisfy $Lw = 0$,

  $p(x)\,W[w_0, w_1](x) \equiv \text{constant}$.

There are two possibilities:

1. If $W \equiv 0$, then we cannot satisfy the equations for the coefficients above: there is no Green's function! In this case, $w_0$ and $w_1$ are linearly dependent, which means that $w_0$ and $w_1$ are both actually solutions of the homogeneous equation $Lu = 0$ satisfying both BCs in (6). We'll discuss this case more later.
2. Otherwise, $W$ is non-zero everywhere in $(0,1)$ and so we have

  $c_1 = -\frac{w_0(x)}{p(x)W}$,  $c_0 = -\frac{w_1(x)}{p(x)W}$

and hence a Green's function

  $G_x(z) = -\frac{1}{W[w_0, w_1]\,p(x)}\begin{cases} w_1(x)w_0(z) & 0 \le z < x \\ w_0(x)w_1(z) & x < z \le 1 \end{cases}$

and a solution to our Sturm-Liouville problem (6):

  $u(x) = -\frac{1}{W[w_0, w_1]\,p(x)}\left[ w_1(x)\int_0^x w_0(z)f(z)\,dz + w_0(x)\int_x^1 w_1(z)f(z)\,dz \right]$.

Remark:

1. Using the "variation of parameters" procedure from ODE theory would result in precisely the same formula.

2. Notice that the Green's function here (and for all our examples so far) is symmetric in its variables: $G_x(z) = G(x;z) = G(z;x) = G_z(x)$. This is no accident; rather, it is a general property of Green's functions for self-adjoint problems.
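The two-solution construction above is easy to carry out numerically. Here is a minimal sketch (an editorial addition, not from the notes; assumes NumPy and SciPy): it integrates $Lw = 0$ from each endpoint with initial data chosen to satisfy the corresponding BC, and assembles the solution formula above. For the test case $p \equiv 1$, $q \equiv 0$ (so $Lu = -u''$), Dirichlet BCs, and $f \equiv 1$, the exact solution $u(x) = x(1 - x)/2$ is recovered.

```python
import numpy as np
from scipy.integrate import solve_ivp

p = lambda x: 1.0          # Lu = -(p u')' + q u; here Lu = -u''
q = lambda x: 0.0
f = lambda x: 1.0          # exact solution of -u'' = 1, u(0) = u(1) = 0: x(1-x)/2

def rhs(x, y):             # first-order system for L w = 0 (p constant here)
    w, dw = y
    return [dw, q(x) * w / p(x)]

# w0 satisfies the BC at x = 0 (w(0) = 0); w1 the BC at x = 1 (w(1) = 0)
xs = np.linspace(0, 1, 2001)
w0 = solve_ivp(rhs, [0, 1], [0.0, 1.0], t_eval=xs, rtol=1e-10).y[0]
w1 = solve_ivp(rhs, [1, 0], [0.0, -1.0], t_eval=xs[::-1], rtol=1e-10).y[0][::-1]

# p W is constant, so evaluate the Wronskian once (finite differences)
dw0, dw1 = np.gradient(w0, xs), np.gradient(w1, xs)
pW = p(0.5) * (w0 * dw1 - w1 * dw0)[1000]

for x_i in [0.25, 0.5, 0.75]:
    i = np.searchsorted(xs, x_i)
    u = -(w1[i] * np.trapz(w0[:i] * f(xs[:i]), xs[:i])
          + w0[i] * np.trapz(w1[i:] * f(xs[i:]), xs[i:])) / pW
    print(u, x_i * (1 - x_i) / 2)     # the two columns agree
```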
5. Modified Green's functions.

We have seen that we run into trouble in constructing a Green's function for the Sturm-Liouville problem

  $Lu := -(p(x)u')' + q(x)u = f(x)$,  $0 < x < 1$
  $\alpha_0 u(0) + \beta_0 u'(0) = 0$
  $\alpha_1 u(1) + \beta_1 u'(1) = 0$   (7)

(where $p > 0$, $q$, and $f$ are smooth functions, and $\alpha_0, \alpha_1, \beta_0, \beta_1$ are numbers with $\alpha_0, \beta_0$ not both 0, and $\alpha_1, \beta_1$ not both 0) if the corresponding homogeneous problem ((7) with $f \equiv 0$) has a non-trivial solution $u_*$:

  $Lu_* = 0$,  $\alpha_0 u_*(0) + \beta_0 u_*'(0) = 0$,  $\alpha_1 u_*(1) + \beta_1 u_*'(1) = 0$.   (8)

In fact in this case, a simple integration by parts

  $0 = (u, Lu_*) = (Lu, u_*) = (f, u_*)$

leads to the solvability condition

  $(f, u_*) = \int_0^1 f(x)u_*(x)\,dx = 0$   (9)

which the source term $f$ must satisfy, to have any hope of a solution $u$. In fact:

Theorem: [Fredholm alternative (for the Sturm-Liouville problem)] Either

1. Problem (7) has exactly one solution; or,

2. There is a non-zero solution $u_*$ of the corresponding homogeneous problem (8). In this case, problem (7) has a solution if and only if the solvability condition (9) holds (and the solution is not unique: you can add any multiple of $u_*$ to get another).
Proof:

We've already seen that if there is no such $u_*$, we have a solution (constructed above via the Green's function). Also, it is unique, since the difference of any two solutions would be a solution of the homogeneous problem.

If there is a non-zero $u_*$, we've seen already that the solvability condition (9) is required. If it is satisfied, we will show below how to construct a solution $u$ using a modified Green's function.

So suppose now that $u_*$ is a non-zero solution to (8), and that the solvability condition (9) on $f$ holds. A function $\tilde G(x;z) = \tilde G_x(z)$ satisfying

  $L\tilde G_x(z) = \delta(z - x) + c(x)u_*(z)$
  same BCs on $\tilde G_x$ as in (7)
is called a modified Green's function. Notice that we can choose the "constant" $c(x)$ here so that the solvability condition

  $0 = (\tilde G_x, Lu_*) = (L\tilde G_x, u_*) = (\delta_x + c(x)u_*, u_*) = u_*(x) + c(x)(u_*, u_*)$

holds, namely $c(x) = -u_*(x)/(u_*, u_*)$. This allows the problem for $\tilde G_x$ to be solved (though we won't do it here; in general you can use the variation of parameters method to do it). Given such a $\tilde G$, if we have a solution $u$ of (7), then

  $u(x) = (\delta_x, u) = (L\tilde G_x - c u_*, u) = (\tilde G_x, f) + \frac{(u_*, u)}{(u_*, u_*)}u_*(x)$

is a solution formula for $u$ in terms of $f$. Note that the constant in front of $u_*$ doesn't matter: remember we can add any multiple of $u_*$ and still have a solution. In fact, using reciprocity ($\tilde G$ is symmetric in $x$ and $z$), we can check this formula indeed solves problem (7):

  $Lu = (L_x\tilde G_z(x), f) = (\delta_z(x), f) + (c(z)u_*(x), f) = (\delta_x(z), f) - \frac{u_*(x)}{(u_*, u_*)}(u_*(z), f) = f$

where we had to use (of course!) the solvability condition (9) on $f$. Similarly, the boundary conditions hold.

Hopefully an example will clarify the use of modified Green's functions.
Example: Solve

  $u''(x) = f(x)$,  $0 < x < 1$,  $u'(0) = u'(1) = 0$.

Obviously $u_*(x) \equiv 1$ solves the corresponding homogeneous problem, leading to the solvability condition

  $0 = (f, u_*) = \int_0^1 f(x)\,dx$

(physically: there is no steady-state temperature distribution in a uniform rod with insulated ends unless the (spatial) average heat source is zero).

Noticing that $(1,1) = \int_0^1 1\,dx = 1$, the modified Green's function $\tilde G_x(z)$ should solve

  $\tilde G_x'' = \delta_x(z) - 1$,  $\tilde G_x'(0) = \tilde G_x'(1) = 0$,

leading (since $\tilde G'' = -1 \implies \tilde G = -z^2/2 + c_1 z + c_2$, and using the BCs) to

  $\tilde G_x(z) = \begin{cases} -z^2/2 + A & 0 \le z < x \\ -z^2/2 + z + B & x < z \le 1 \end{cases}$
Continuity at $x$ requires $A = x + B$. The jump condition $\tilde G_x'\big|_{x-}^{x+} = 1$ already holds. (This leaves one free parameter, which is characteristic of modified Green's function problems, since a multiple of $u_*$ (in this case a constant) can always be added.) So, assuming the solvability condition $\int_0^1 f(x)\,dx = 0$ holds, we arrive at a formula for the general solution:

  $u(x) = (\tilde G_x, f) + C$
  $= -\frac{1}{2}\int_0^1 z^2 f(z)\,dz + (B + x)\int_0^x f(z)\,dz + B\int_x^1 f(z)\,dz + \int_x^1 z f(z)\,dz + C$
  $= x\int_0^x f(z)\,dz + \int_x^1 z f(z)\,dz + C$

where we simplified using the solvability condition on $f$, and we absorbed $-\frac{1}{2}\int_0^1 z^2 f(z)\,dz$ into the general constant $C$. It is very easy to check that this expression solves our original problem (of course, we could have solved this particular problem just by integrating).
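A numerical check of this formula (an editorial sketch; assumes NumPy): take $f(x) = \cos(2\pi x)$, which satisfies the solvability condition $\int_0^1 f = 0$; then $u(x) = -\cos(2\pi x)/(4\pi^2) + C$ solves the problem, and the quadrature below reproduces it up to the free additive constant.

```python
import numpy as np

f = lambda z: np.cos(2 * np.pi * z)      # satisfies the solvability condition
z = np.linspace(0.0, 1.0, 200001)

def u_formula(x):
    # u(x) = x * int_0^x f dz + int_x^1 z f(z) dz   (+ arbitrary constant)
    mask = z <= x
    return (x * np.trapz(f(z[mask]), z[mask])
            + np.trapz(z[~mask] * f(z[~mask]), z[~mask]))

xs = np.array([0.2, 0.5, 0.8])
u_num = np.array([u_formula(x) for x in xs])
u_exact = -np.cos(2 * np.pi * xs) / (4 * np.pi ** 2)
print(u_num - u_exact)    # a constant vector: the two differ only by C
```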
Remark: So far in this section we discussed the issues of solvability conditions, modified Green's functions, and so on, only for self-adjoint problems. For a general problem, the obstacle to solvability comes from considering the homogeneous adjoint problem:

  $L^* u_* = 0$,  hom. adj. BCs on $u_*$.

If this problem has a non-trivial solution $u_*$, then the solvability condition for $Lu = f$ with homogeneous BCs is

  $0 = (u, L^* u_*) = (Lu, u_*) = (f, u_*)$.

One can also construct a modified Green's function in this case, but we won't go into it here.
6. Green's functions and eigenfunction expansion.

Consider again the Sturm-Liouville problem (with homogeneous BCs)

  $Lu := -(p(x)u')' + q(x)u = f(x)$,  $x_0 < x < x_1$
  (BC)  $\alpha_0 u(x_0) + \beta_0 u'(x_0) = 0$,  $\alpha_1 u(x_1) + \beta_1 u'(x_1) = 0$   (10)

($p > 0$, $q$, and $f$ smooth functions). Recall that the Sturm-Liouville problem has eigenvalues

  $\lambda_0 < \lambda_1 < \lambda_2 < \lambda_3 < \cdots \to \infty$

and corresponding eigenfunctions $\phi_j(x)$ such that

  $L\phi_j = \lambda_j\phi_j$,  $\phi_j$ satisfies (BC),

which can be taken orthonormal:

  $(\phi_j, \phi_k) = \delta_{jk} = \begin{cases} 1 & j = k \\ 0 & j \ne k \end{cases}$

Furthermore, the eigenfunctions are complete, meaning that any function $g(x)$ on $[x_0, x_1]$ can be expanded:

  $g(x) = \sum_{j=0}^\infty c_j\phi_j(x)$,  $c_j = (\phi_j, g)$ = Fourier coefficient.

To be more precise, this series does not (in general) converge at every point $x$ in $[x_0, x_1]$ (for example, it cannot converge at the endpoints if $g$ does not satisfy (BC)!), but rather converges in the $L^2$-sense: provided $\int_{x_0}^{x_1} g^2(x)\,dx < \infty$,

  $\lim_{N\to\infty}\int_{x_0}^{x_1}\Big( g(x) - \sum_{j=0}^N c_j\phi_j(x) \Big)^2 dx = 0$.

So let's express the Green's function $G(x;z) = G_x(z)$ for problem (10) as such an eigenfunction expansion

  $G(x;z) = G_x(z) = \sum_{j=0}^\infty c_j(x)\phi_j(z)$
and try to find the coefficients $c_j(x)$. Using $LG_x = \delta_x$, and applying the linear operator $L$ to the expression for $G_x$, we find

  $\delta_x = LG_x = \sum_{j=0}^\infty c_j(x)(L\phi_j)(z) = \sum_{j=0}^\infty c_j(x)\lambda_j\phi_j(z)$

and so the coefficients $c_j(x)\lambda_j$ of this expansion should satisfy

  $c_j(x)\lambda_j = (\phi_j, \delta_x) = \phi_j(x)$,

and hence we arrive at an expression for the Green's function as an eigenfunction expansion:

  $G_x(z) = \sum_{j=0}^\infty \frac{1}{\lambda_j}\phi_j(x)\phi_j(z)$.

Remark:

1. The reciprocity (symmetry in $x$ and $z$ of $G$) is very clearly displayed by this formula.

2. If, for some $j$, $\lambda_j = 0$, this expression is ill-defined. Indeed, in this case, the eigenfunction $\phi_j$ is a non-trivial solution to the homogeneous problem (i.e. it is a "$u_*$"), and we have already seen there is no (usual) Green's function in this case.

3. The corresponding solution formula for problem (10) is

  $u(x) = (G_x, f) = \int_{x_0}^{x_1}\sum_{j=0}^\infty \frac{1}{\lambda_j}\phi_j(x)\phi_j(z)f(z)\,dz = \sum_{j=0}^\infty \frac{1}{\lambda_j}\phi_j(x)(\phi_j, f)$.

Notice that if $\lambda_j = 0$ (for some $j$), the only way for this expression to make sense is if also $(\phi_j, f) = 0$, which is exactly the solvability condition on $f$ we have seen earlier.
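To see the expansion concretely (an editorial sketch; assumes NumPy): for $Lu = -u''$ on $[0,1]$ with Dirichlet BCs, $\lambda_j = (j\pi)^2$ and $\phi_j(x) = \sqrt{2}\sin(j\pi x)$ (indexing from $j = 1$), and the truncated series converges to the closed-form Green's function $G(x;z) = z(1-x)$ for $z < x$, $x(1-z)$ for $z > x$.

```python
import numpy as np

def G_series(x, z, N=2000):
    # eigenfunction expansion for L = -d^2/dx^2 on [0,1], Dirichlet BCs:
    # lambda_j = (j pi)^2, phi_j = sqrt(2) sin(j pi x), j = 1, 2, ...
    j = np.arange(1, N + 1)
    return np.sum(2 * np.sin(j * np.pi * x) * np.sin(j * np.pi * z)
                  / (j * np.pi) ** 2)

def G_exact(x, z):
    return z * (1 - x) if z < x else x * (1 - z)

for (x, z) in [(0.3, 0.7), (0.6, 0.2), (0.5, 0.5)]:
    print(G_series(x, z), G_exact(x, z))   # agree to several digits
```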
III. GREEN'S FUNCTIONS FOR ELLIPTIC (STEADY-STATE) PROBLEMS

We turn now from ODEs to PDEs (partial differential equations).

1. Green's functions for the Poisson equation.

As a motivating example, consider the electrostatic potential $u(x)$ in a uniformly conducting two-dimensional (or three-dimensional) region $D \subset \mathbb{R}^2$ (or $D \subset \mathbb{R}^3$), which satisfies

  $-\Delta u = f(x)$,  $x \in D$  (Poisson equation)
  $u = g(x)$,  $x \in S = \partial D$   (11)

where $f(x)$ is the charge density on the region $D$, $g(x)$ is the potential applied on the boundary $S = \partial D$ of $D$, and

  $\Delta u := \frac{\partial^2 u}{\partial x_1^2} + \frac{\partial^2 u}{\partial x_2^2}\ \Big( + \frac{\partial^2 u}{\partial x_3^2} \Big)$

is the Laplacian of $u$. (Notice also our notation for vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$: $x = (x_1, x_2)$ (or $x = (x_1, x_2, x_3)$).) The Poisson problem (11) arises in other physical contexts as well, for example:

- $u(x)$ is the steady-state temperature distribution inside the (uniform) region $D$ subject to heat source $f(x)$ and with the temperature fixed at $g(x)$ on the boundary
- $u$ is the steady-state (small) deformation of an elastic membrane/solid from its reference position $D$, when it is subject to applied force $f(x)$ and held at fixed deformation $g(x)$ on the boundary.

If we are looking to solve problem (11) by finding a Green's function $G(x;y) = G_x(y)$ ($x, y \in D$), then by analogy with the self-adjoint ODE problems studied earlier, we might seek $G$ which solves

  $-\Delta G_x(y) = \delta_x(y)$,  $y \in D$
  $G_x(y) = 0$,  $y \in S$   (12)

where here $\delta_x(y) = \delta(y - x)$ is a 2- or 3-dimensional delta function centred at $x$, defined in the same way as in 1 dimension: for a test function $\phi(x)$,

  $(\delta_x, \phi) = \text{``}\int_{\mathbb{R}^{2\text{ or }3}} \delta_x(y)\phi(y)\,dy\text{''} = \phi(x)$
(one can think $\delta(y - x) = \delta(y_1 - x_1)\delta(y_2 - x_2)$ in 2 dimensions, and $\delta(y - x) = \delta(y_1 - x_1)\delta(y_2 - x_2)\delta(y_3 - x_3)$ in 3 dimensions).

Supposing we can find a Green's function solving (12), we would like to derive a formula for the solution $u$ of (11) in terms of $G$. To do this, we will use a simple integration by parts formula, for which we need to define the normal derivative of a function $u$ at the boundary $S = \partial D$:

  $\frac{\partial u}{\partial n} := n\cdot\nabla u$

where $n$ denotes the outward unit normal vector to the curve/surface $S$.

Lemma 1 (Green's second identity): Let $D \subset \mathbb{R}^{2\text{ or }3}$ be bounded by a smooth curve/surface $S$. Then if $u_1$ and $u_2$ are twice continuously differentiable functions on $\bar D$,

  $\int_D (u_1\Delta u_2 - u_2\Delta u_1)\,dV = \int_S \Big( u_1\frac{\partial u_2}{\partial n} - u_2\frac{\partial u_1}{\partial n} \Big)\,dS$.   (13)
Proof: subtracting the relations

  $\nabla\cdot(u_1\nabla u_2) = \nabla u_1\cdot\nabla u_2 + u_1\Delta u_2$
  $\nabla\cdot(u_2\nabla u_1) = \nabla u_1\cdot\nabla u_2 + u_2\Delta u_1$

and using the divergence theorem, we find

  $\int_D (u_1\Delta u_2 - u_2\Delta u_1)\,dV = \int_S n\cdot(u_1\nabla u_2 - u_2\nabla u_1)\,dS = \int_S \Big( u_1\frac{\partial u_2}{\partial n} - u_2\frac{\partial u_1}{\partial n} \Big)\,dS$.  □
Taking $u_1 = u$ (solution of (11)) and $u_2 = G_x$ (solution of (12)) in (13), we find

  $\int_D \big( G_x(y)f(y) - u(y)\delta_x(y) \big)\,dy = \int_S g(y)\frac{\partial G_x}{\partial n}\,dS(y)$

or

  $u(x) = \int_D G(x;y)f(y)\,dy - \int_S \frac{\partial G(x;y)}{\partial n(y)}\,g(y)\,dS(y)$   (14)

which is the solution formula we sought.

Free-space Green's functions. For a general region $D$, we have no chance of finding an explicit Green's function solving (12). One case we can easily solve explicitly is the whole space, $D = \mathbb{R}^2$ or $D = \mathbb{R}^3$, where there is no boundary at all.
$\mathbb{R}^2$: for a given $x \in \mathbb{R}^2$, we want to find $G_x(y)$ solving $-\Delta G_x(y) = \delta(y - x)$. It is reasonable to try a $G$ which depends only on the distance from the singularity: $G = h(r)$, $r := |y - x|$. So for $y \ne x$ ($r > 0$), we need

  $0 = \Delta G = \Big( \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} \Big) h(r) = \frac{1}{r}(r h'(r))'$

and so

  $h' = c_1/r \implies h = c_1\log r + c_2$

for some constants $c_1$ and $c_2$. Including $c_2$ simply adds an overall constant to the Green's function, so we will drop it: $c_2 = 0$. The constant $c_1$ is determined by the condition $-\Delta G_x = \delta_x$: for any $\epsilon > 0$, we have, by the divergence theorem,

  $1 = \int_{|y-x|\le\epsilon}\delta_x(y)\,dy = -\int_{|y-x|\le\epsilon}\Delta G_x(y)\,dy = -\int_{|y-x|=\epsilon}\nabla G_x(y)\cdot n\,dS(y)$
  $= -\int_{|y-x|=\epsilon} c_1\frac{y - x}{|y - x|^2}\cdot\frac{y - x}{|y - x|}\,dS(y) = -\frac{c_1}{\epsilon}\cdot 2\pi\epsilon = -2\pi c_1$

and so $c_1 = -(2\pi)^{-1}$, and our expression for the two-dimensional free-space Green's function is

  $G_{\mathbb{R}^2}(x;y) = -\frac{1}{2\pi}\log|y - x|$.

$\mathbb{R}^3$: we play the same game in three dimensions. Try $G_x(y) = h(r)$, $r := |y - x|$, so for $r > 0$,

  $0 = \Delta G_x = \Big( \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} \Big) h(r) = \frac{1}{r^2}(r^2 h'(r))'$

and so

  $h' = c_1/r^2 \implies h = -c_1/r + c_2$

and again we set $c_2 = 0$. The constant $c_1$ is determined by

  $1 = \int_{|y-x|\le\epsilon}\delta_x(y)\,dy = -\int_{|y-x|\le\epsilon}\Delta G_x(y)\,dy = -\int_{|y-x|=\epsilon}\nabla G_x(y)\cdot n\,dS(y)$
  $= -\int_{|y-x|=\epsilon} c_1\frac{y - x}{|y - x|^3}\cdot\frac{y - x}{|y - x|}\,dS(y) = -\frac{c_1}{\epsilon^2}\cdot 4\pi\epsilon^2 = -4\pi c_1$

and so $c_1 = -(4\pi)^{-1}$, and our expression for the three-dimensional free-space Green's function is

  $G_{\mathbb{R}^3}(x;y) = \frac{1}{4\pi|y - x|}$.
Knowing the free-space Green's functions, we are led immediately to solution formulas for the Poisson equation $-\Delta u(x) = f(x)$ in $\mathbb{R}^2$ and $\mathbb{R}^3$:

  $u(x) = \begin{cases} -\frac{1}{2\pi}\int_{\mathbb{R}^2}\log|x - y|\,f(y)\,dy & \mathbb{R}^2 \\ \frac{1}{4\pi}\int_{\mathbb{R}^3}\frac{f(y)}{|x - y|}\,dy & \mathbb{R}^3 \end{cases}$   (15)

The following theorem makes precise the claim that these formulas indeed solve Poisson's equation:

Theorem: Let $f(x)$ be a twice continuously differentiable function on $\mathbb{R}^2$ (respectively $\mathbb{R}^3$) with compact support (i.e. it vanishes outside of some ball). Then $u(x)$ defined by (15) is a twice continuously differentiable function, and $-\Delta u = f(x)$ for all $x$.
Proof: we'll do the $\mathbb{R}^3$ case (the $\mathbb{R}^2$ case is analogous). Notice first that by the change of variable $y \mapsto x - y$ in the integral,

  $u(x) = \frac{1}{4\pi}\int_{\mathbb{R}^3}\frac{f(y)}{|x - y|}\,dy = \frac{1}{4\pi}\int_{\mathbb{R}^3}\frac{f(x - y)}{|y|}\,dy$.

If $e_j$ denotes the $j$-th standard basis vector, then

  $\frac{u(x + h e_j) - u(x)}{h} = \frac{1}{4\pi}\int_{\mathbb{R}^3}\Big[ \frac{f(x + h e_j - y) - f(x - y)}{h} \Big]\frac{dy}{|y|}$,

and since $\frac{f(x + h e_j - y) - f(x - y)}{h} \to \frac{\partial f}{\partial x_j}(x - y)$ as $h \to 0$ uniformly in $y$ (since $f$ is continuously differentiable and compactly supported),

  $\frac{\partial u}{\partial x_j}(x) = \lim_{h\to 0}\frac{u(x + h e_j) - u(x)}{h} = \frac{1}{4\pi}\int_{\mathbb{R}^3}\frac{\partial f}{\partial x_j}(x - y)\,\frac{dy}{|y|}$.

Similarly,

  $\frac{\partial^2 u}{\partial x_j\partial x_k}(x) = \frac{1}{4\pi}\int_{\mathbb{R}^3}\frac{\partial^2 f}{\partial x_j\partial x_k}(x - y)\,\frac{dy}{|y|}$

which is a continuous function of $x$ (again, because $f$ is twice continuously differentiable and compactly supported); hence $u$ is twice continuously differentiable. Now, to find $\Delta u$ we want to integrate by parts twice. But the Green's function $1/|y|$ is not twice continuously differentiable (indeed not even continuous) at $y = 0$, so to do this "legally", we remove a small neighbourhood of the origin: for $\epsilon > 0$,

  $\Delta u(x) = \frac{1}{4\pi}\int_{|y|>\epsilon}\Delta f(x - y)\,\frac{dy}{|y|} + \frac{1}{4\pi}\int_{|y|\le\epsilon}\Delta f(x - y)\,\frac{dy}{|y|} =: A_\epsilon + B_\epsilon$.

Now,

  $|B_\epsilon| \le \frac{1}{4\pi}\max_x|\Delta f(x)|\int_{|y|\le\epsilon}\frac{dy}{|y|} = \frac{1}{4\pi}\max_x|\Delta f(x)|\cdot 4\pi\int_0^\epsilon r\,dr = \max_x|\Delta f(x)|\,\frac{\epsilon^2}{2} \to 0$ as $\epsilon \to 0$
($\Delta f$ is bounded because it is continuous and compactly supported). For the other term, we apply Green's second identity on the domain $|y| > \epsilon$, and use the fact that $\Delta(1/|y|) = 0$ for $|y| > 0$, to find

  $A_\epsilon = -\frac{1}{4\pi}\int_{|y|=\epsilon}\Big( f(x - y)\frac{\partial}{\partial n}\frac{1}{|y|} - \frac{1}{|y|}\frac{\partial}{\partial n}f(x - y) \Big)\,dS(y) =: C_\epsilon + D_\epsilon$.

For the second term,

  $|D_\epsilon| \le \frac{1}{4\pi}\max|\nabla f|\int_{|y|=\epsilon}\frac{1}{|y|}\,dS(y) = \frac{1}{4\pi}\max|\nabla f|\,\frac{1}{\epsilon}\,4\pi\epsilon^2 \to 0$ as $\epsilon \to 0$.

Finally, using $\frac{\partial}{\partial n}\frac{1}{|y|} = \frac{1}{|y|^2}$ (here $n = -y/|y|$ is the outward normal of the domain $|y| > \epsilon$),

  $C_\epsilon = -\frac{1}{4\pi}\int_{|y|=\epsilon}\frac{1}{|y|^2}f(x - y)\,dS(y) = -\frac{1}{4\pi\epsilon^2}\int_{|y|=\epsilon}f(x - y)\,dS(y) = -f(x - y_\epsilon)$

for some $y_\epsilon$ with $|y_\epsilon| = \epsilon$, by the mean-value theorem for integrals. So as $\epsilon \to 0$, $y_\epsilon \to 0$, and so sending $\epsilon \to 0$ in all terms yields $\Delta u(x) = -f(x)$, as required.  □
Remark:

1. for the above theorem to hold true, we don't really need $f$ to be compactly supported (just enough spatial decay so that the integral defining $u$ makes sense), or twice continuously differentiable (actually, $f$ just needs to be a little better than continuous, although just continuous is not quite enough)

2. the free-space Green's functions are not just useful for solving Poisson's equation on $\mathbb{R}^2$ or $\mathbb{R}^3$, but in fact play a key role in boundary value problems, as we will soon see

3. a physical interpretation: the three-dimensional free-space Green's function $(4\pi|x - y|)^{-1}$ is the electrostatic potential generated by a point charge located at $x$ (i.e. the Coulomb potential).
2. The method of images.

Example: Solve the boundary value problem for Laplace's equation on the half-plane $\{(x_1, x_2) \mid x_2 > 0\}$ by the Green's function method:

  $\Delta u = 0$,  $x_2 > 0$
  $u(x_1, 0) = g(x_1)$,  $x_2 = 0$.   (16)

The corresponding Green's function problem is: for $x$ with $x_2 > 0$,

  $-\Delta G_x(y) = \delta_x(y)$,  $y_2 > 0$
  $G_x(y) \equiv 0$,  $y_2 = 0$.

We know that the two-dimensional free-space Green's function $G^f(x;y) = -\log|y - x|/(2\pi)$ will generate the right delta function at $x$, but most certainly does not satisfy the boundary conditions. The idea of the method of images is to balance $G^f(x;y)$ with other copies of $G^f$ with singularities at different points, in hopes of satisfying the boundary conditions. For this problem, the geometry suggests placing an "image charge" (the name comes from the interpretation of free-space Green's functions as potentials generated by point electric charges) at $\tilde x := (x_1, -x_2)$, the reflection of $x$ about the $x_1$-axis. So we set

  $G_x(y) := G^f(x;y) - G^f(\tilde x;y) = -\frac{1}{2\pi}\big( \log|y - x| - \log|y - \tilde x| \big)$.

Notice that

  $y_2 = 0 \implies |y - x| = |y - \tilde x| \implies G_x(y) = 0$

so we have satisfied the boundary conditions. But have we satisfied the right PDE? Yes, since for $y$ with $y_2 > 0$,

  $-\Delta G_x(y) = \delta(y - x) - \delta(y - \tilde x) = \delta(y - x)$

since the singularity introduced at $\tilde x$ lies outside the domain, and so does not contribute. So we have our Green's function. To write the corresponding solution formula, we need to compute its normal derivative on the boundary $y_2 = 0$: for $y$ with $y_2 = 0$ (using $|y - x| = |y - \tilde x|$),

  $\nabla G_x(y) = -\frac{1}{2\pi|y - x|^2}\big( (y - x) - (y - \tilde x) \big) = -\frac{1}{2\pi|y - x|^2}(\tilde x - x)$

and since the outward unit normal on the boundary is $n = (0, -1)$,

  $\frac{\partial}{\partial n}G_x(y) = -\frac{n\cdot(\tilde x - x)}{2\pi|y - x|^2} = -\frac{x_2}{\pi\big( (y_1 - x_1)^2 + x_2^2 \big)}$.
Hence our formula for the solution of problem (16) is

  $u(x) = \frac{x_2}{\pi}\int_{-\infty}^{\infty}\frac{g(y_1)}{(y_1 - x_1)^2 + x_2^2}\,dy_1$.
Now, does this expression really yield a solution to problem (16)? For $x_2 > 0$, the function $x_2/(x_1^2 + x_2^2)$ is harmonic:

  $\Delta\frac{x_2}{x_1^2 + x_2^2} = \frac{\partial}{\partial x_1}\frac{-2x_1 x_2}{(x_1^2 + x_2^2)^2} + \frac{\partial}{\partial x_2}\frac{x_1^2 - x_2^2}{(x_1^2 + x_2^2)^2}$
  $= \frac{-2x_2(x_1^2 + x_2^2) + 8x_1^2 x_2 - 2x_2(x_1^2 + x_2^2) - 4x_2(x_1^2 - x_2^2)}{(x_1^2 + x_2^2)^3} = 0$,

and hence so is $x_2/((y_1 - x_1)^2 + x_2^2)$ for any $y_1$. So supposing the function $g$ is bounded and continuous, we may differentiate under the integral sign to conclude that $\Delta u(x) = 0$ for $x_2 > 0$. What about the boundary conditions? Since $\int_{-\infty}^{\infty}(s^2 + 1)^{-1}\,ds = \pi$, and changing variables $y_1 = x_1 + x_2 s$, we have for fixed $x_1$, and for $x_2 > 0$,

  $|u(x_1, x_2) - g(x_1)| = \Big| \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{1}{s^2 + 1}\big[ g(x_1 + x_2 s) - g(x_1) \big]\,ds \Big|$
  $\le \frac{4\max|g|}{\pi}\int_M^{\infty}\frac{ds}{s^2 + 1} + \frac{1}{\pi}\int_{-M}^{M}\frac{1}{1 + s^2}\big| g(x_1 + x_2 s) - g(x_1) \big|\,ds$.

Let $\epsilon > 0$ be given, and then choose $M$ large enough so that the first term above is $< \epsilon/2$. Then since a continuous function on a compact set is uniformly continuous, we may choose $x_2$ small enough so that the second term above is $< \epsilon/2$. Thus we have shown that

  $\lim_{x_2\downarrow 0} u(x_1, x_2) = g(x_1)$,

i.e., that the boundary conditions in problem (16) are also satisfied.
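Here is a small numerical illustration of this Poisson-integral formula (an editorial sketch, not from the notes; assumes NumPy): for $g(y_1) = \cos(y_1)$, the bounded harmonic extension to the half-plane is $u(x_1, x_2) = e^{-x_2}\cos(x_1)$, and quadrature over a large interval reproduces it.

```python
import numpy as np

g = lambda y1: np.cos(y1)   # boundary data; bounded harmonic extension: e^{-x2} cos(x1)
y1 = np.linspace(-400.0, 400.0, 4_000_001)

def u(x1, x2):
    # Poisson integral for the half-plane:
    # u(x) = (x2/pi) * int g(y1) / ((y1 - x1)^2 + x2^2) dy1
    return x2 / np.pi * np.trapz(g(y1) / ((y1 - x1) ** 2 + x2 ** 2), y1)

for (x1, x2) in [(0.0, 0.5), (1.0, 1.0), (2.0, 0.1)]:
    print(u(x1, x2), np.exp(-x2) * np.cos(x1))   # columns agree to ~3 decimals
```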
In practice, the method of images can only be used to compute the Green's function for very special, highly symmetric geometries. Examples include:

- half-plane, quarter-plane, half-space, octant, etc.
- disks, balls
- some combinations of the above, such as a half-disk

Example: Solve the boundary value problem for Laplace's equation in the 3D ball of radius $a$, $\{x \in \mathbb{R}^3 \mid |x| < a\}$, using the method of images:

  $\Delta u = 0$,  $|x| < a$
  $u = g$,  $|x| = a$.   (17)
The Green's function should solve

  $-\Delta G_x(y) = \delta(y - x)$,  $|y| < a$
  $G_x(y) \equiv 0$,  $|y| = a$.

Given $x$ in the ball ($|x| < a$), a natural place to put an image charge is on the ray from the origin through $x$. In fact, we choose the point

  $x^* := \frac{a^2}{|x|^2}x$

and notice that $|x^*| > a$, so $x^*$ is not in our domain, and hence $G^f(x^*;y)$ is harmonic for $|y| < a$ and will contribute nothing to $-\Delta G_x$ there. Now suppose that $y$ is on the boundary ($|y| = a$), and notice that

  $|y - x^*|^2 = \frac{a^2}{|x|^2}\Big| \frac{|x|}{a}y - \frac{a}{|x|}x \Big|^2 = \frac{a^2}{|x|^2}\big( |x|^2 - 2y\cdot x + a^2 \big) = \frac{a^2}{|x|^2}|x - y|^2$

so $|y - x^*| = (a/|x|)|x - y|$. Thus if we define

  $G_x(y) = G^f(x;y) - \frac{a}{|x|}G^f(x^*;y) = \frac{1}{4\pi}\Big( \frac{1}{|y - x|} - \frac{a}{|x||y - x^*|} \Big)$,

we have satisfied the boundary condition: $|y| = a \implies G_x(y) = 0$. The normal derivative of $G$ on the boundary is

  $\frac{\partial G_x}{\partial n} = \frac{y}{a}\cdot\Big( -\frac{1}{4\pi} \Big)\Big( \frac{y - x}{|y - x|^3} - \frac{a(y - x^*)}{|x||y - x^*|^3} \Big) = -\frac{a^2 - |x|^2}{4\pi a|y - x|^3}$.

Our solution formula is then

  $u(x) = \frac{a^2 - |x|^2}{4\pi a}\int_{|y|=a}\frac{g(y)}{|y - x|^3}\,dS(y)$,   (18)
which is Poisson's formula for harmonic functions in a 3D ball. Again, it is possible to prove rigorously that for $g$ a continuous function on the sphere $|y| = a$, this formula produces a function $u$ which is harmonic in the ball, and takes on the values $g$ on the sphere. Furthermore, any function harmonic on the ball and with continuous boundary values $g$ is given by formula (18) (we will address this issue later on).

A nice consequence of Poisson's formula (though there are also direct proofs) is the:

Mean value formula for harmonic functions: let $u$ be harmonic ($\Delta u \equiv 0$) in the ball $|y| < a$ in $\mathbb{R}^n$ and continuous for $|y| \le a$. Then the value of $u$ at the centre of the ball is equal to its average over the boundary of the ball.
Proof (for the $\mathbb{R}^3$ case): take $x = 0$ in (18) (with $g$ the boundary values of $u$) to find

  $u(0) = \frac{a^2}{4\pi a}\int_{|y|=a}\frac{g(y)}{|y|^3}\,dS(y) = \frac{1}{4\pi a^2}\int_{|y|=a}g(y)\,dS(y)$

as required.
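A quick numerical illustration of the mean value property (an editorial addition; assumes NumPy): average the harmonic function $u(x) = x_1^2 - x_2^2$ over spheres of several radii by Monte Carlo sampling; each average is close to $u(0) = 0$.

```python
import numpy as np

u = lambda x: x[..., 0] ** 2 - x[..., 1] ** 2    # harmonic in R^3: Laplacian = 2 - 2 = 0

rng = np.random.default_rng(0)
pts = rng.normal(size=(2_000_000, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)  # uniform points on the unit sphere

for a in [0.5, 1.0, 2.0]:
    print(a, np.mean(u(a * pts)))    # sphere averages: all close to u(0) = 0
```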
Solvability conditions and modified Green's functions.

Example: Consider the problem

  $-\Delta u = f$ in $D \subset \mathbb{R}^2$
  $\frac{\partial u}{\partial n} = g$ on $\partial D$   (19)

with so-called Neumann boundary conditions. Notice that the corresponding homogeneous problem

  $-\Delta u_* = 0$ in $D \subset \mathbb{R}^2$,  $\frac{\partial u_*}{\partial n} = 0$ on $\partial D$

has the non-trivial solution $u_*(x) \equiv 1$. As a result, there is a solvability condition on the data $f$ and $g$ in problem (19), which can be derived using Green's second identity (with $u_* \equiv 1$, $\Delta u_* = 0$, $\partial u_*/\partial n \equiv 0$):

  $\int_D f\,dx = -\int_D u_*\,\Delta u\,dx = \int_{\partial D}\Big( u\frac{\partial u_*}{\partial n} - u_*\frac{\partial u}{\partial n} \Big)\,dS = -\int_{\partial D} g\,dS$,

so we arrive at the solvability condition

  $\int_D f(x)\,dx = -\int_{\partial D} g(x)\,dS(x)$.

Problem (19) will not have a Green's function, but will have a modified Green's function, satisfying

  $-\Delta\tilde G_x(y) = \delta(y - x) + C$,  $y \in D$
  $\frac{\partial\tilde G_x}{\partial n}(y) = 0$,  $y \in \partial D$

for some constant $C$, which can be determined using the divergence theorem:

  $1 = \int_D \delta(y - x)\,dy = \int_D \big( -\Delta\tilde G_x - C \big)\,dy = -\int_{\partial D}\frac{\partial\tilde G_x}{\partial n}\,dS - C|D| = -C|D|$

(where $|D|$ denotes the area/volume of $D$), and hence $C = -1/|D|$. If we can find the modified Green's function, we can construct the family of solutions of problem (19) (assuming the solvability condition holds), as

  $u(x) = \int_D \tilde G_x(y)f(y)\,dy + \int_{\partial D}\tilde G_x(y)g(y)\,dS(y) + A$

for any constant $A$.
3. Green's functions for the Laplacian: some general theory.

So far we have mainly discussed methods for explicit computation of Green's functions and solutions for certain problems involving the Laplacian operator. For general regions, of course, explicit computations are not possible. Nevertheless, some general theory tells us that Green's functions exist, and that the corresponding solution formulas do yield solutions to the problems of interest. We describe some of this theory here.

Let $D$ be a region in $\mathbb{R}^2$ or $\mathbb{R}^3$ bounded by a smooth curve/surface $S = \partial D$. Let $G^f(x;y)$ denote the free-space Green's function (i.e. $-\log|x - y|/(2\pi)$ for $\mathbb{R}^2$ and $(4\pi|y - x|)^{-1}$ for $\mathbb{R}^3$).

Definition: a (Dirichlet) Green's function $G(x;y)$ for the operator $-\Delta$ and the region $D$ is a function continuous for $x, y \in \bar D := D \cup S$, $x \ne y$, satisfying

1. the function $H_x(y) := G_x(y) - G^f_x(y)$ is harmonic (i.e. $\Delta_y H_x(y) = 0$) for $y \in D$

2. $G_x(y) \equiv 0$ for $y \in S$

Remark:

- the first property implies that $G_x$ solves $-\Delta G_x = \delta_x$, and also implies that $G_x(y)$ is a smooth function on $D \setminus \{x\}$ (since $G^f_x$ is, and since harmonic functions are smooth).
- the terminology "Dirichlet" Green's function refers to the Dirichlet boundary conditions: specifying the value of the function (rather than, say, its normal derivative) on the boundary.

Properties of Green's functions:

1. Existence: a Green's function as above exists. (The proof of this fact calls for some mathematics a little beyond the level of this course, e.g. some functional analysis, so we will have to simply take it as given.)

2. Uniqueness: the Green's function is unique; in particular, we may refer to "the" Green's function (rather than "a" Green's function).

3. Symmetry (reciprocity): $G(x;y) = G(y;x)$.

4. Relation to solution of BVP for Poisson: let $f$ be a smooth function on $\bar D$, and let $g$ be a smooth function on $S$. Then $u$ solves the boundary value problem for Poisson's equation

  $-\Delta u = f$ in $D$
  $u = g$ on $\partial D$   (20)
if and only if

  $u(x) = \int_D G(x;y)f(y)\,dy - \int_{\partial D}\frac{\partial G_x(y)}{\partial n}\,g(y)\,dS(y)$.   (21)
Some (sketches of) proofs:

Symmetry: a "formal" (i.e. cheating) argument giving the symmetry is:

  $G(x;y) - G(y;x) = \int_D \big( G_x(z)\delta_y(z) - G_y(z)\delta_x(z) \big)\,dz$
  $= \int_D \big( G_y(z)\Delta G_x(z) - G_x(z)\Delta G_y(z) \big)\,dz$
  $= \int_S \Big( G_y(z)\frac{\partial G_x(z)}{\partial n(z)} - G_x(z)\frac{\partial G_y(z)}{\partial n(z)} \Big)\,dS(z) = 0$.

This is not a rigorous argument, because we applied Green's second identity to functions (i.e. the Green's functions) which are NOT twice continuously differentiable (indeed, not even continuous). To make the argument rigorous, try applying this identity on the domain $D$ with balls of (small) radius $\epsilon$ around the points $x$ and $y$ removed; the Green's functions are smooth on this new domain. Then take $\epsilon \to 0$. This is left as an exercise.

Representation of solution of Poisson: suppose $u$ is twice continuously differentiable on $D$, continuous on $\bar D := D \cup S$, and satisfies (20). We will show that $u$ is given by formula (21). Assume first that $f = 0$, so that $u$ is harmonic in $D$. We claim then that

  $u(x) = \int_S \Big( \frac{\partial u}{\partial n}G^f_x - u\frac{\partial G^f_x}{\partial n} \Big)\,dS$.   (22)

Assuming this for a moment, applying Green's second identity to the harmonic functions $u$ and $H_x := G_x - G^f_x$ in $D$, we arrive at

  $0 = \int_S \Big( \frac{\partial u}{\partial n}H_x - u\frac{\partial H_x}{\partial n} \Big)\,dS$

which we add to (22) to arrive at

  $u(x) = \int_S \Big( \frac{\partial u}{\partial n}G_x - u\frac{\partial G_x}{\partial n} \Big)\,dS = -\int_S \frac{\partial G_x}{\partial n}\,g\,dS$

(using $G_x \equiv 0$ on $S$), as required. Now let's prove (22) (we will just do the three-dimensional case here, for simplicity).
Let $D_\epsilon$ be $D$ with a ball of (small) radius $\epsilon$ about $x$ removed, and apply Green's second identity to $u$ and $G^f_x$ on $D_\epsilon$ (both functions are smooth and harmonic here) to find

  $0 = \int_{\partial D_\epsilon}\Big( u\frac{\partial G^f_x}{\partial n} - G^f_x\frac{\partial u}{\partial n} \Big)\,dS$
  $= \int_{\partial D}\Big( u\frac{\partial G^f_x}{\partial n} - G^f_x\frac{\partial u}{\partial n} \Big)\,dS + \int_{|y-x|=\epsilon}\Big( \frac{1}{4\pi|y - x|^2}u(y) + \frac{1}{4\pi|y - x|}\frac{\partial u}{\partial r}(y) \Big)\,dS(y)$

(on the small sphere the outward normal of $D_\epsilon$ points toward $x$). Now the second term in the second integral is bounded by

  $\frac{1}{4\pi}\max|\nabla u|\,\frac{1}{\epsilon}\,4\pi\epsilon^2 = \epsilon\max|\nabla u| \to 0$ as $\epsilon \to 0$,

while the first term is, in the limit as $\epsilon \to 0$,

  $\lim_{\epsilon\to 0}\frac{1}{4\pi\epsilon^2}\int_{|y-x|=\epsilon}u(y)\,dS(y) = u(x)$

(by the mean value theorem for integrals), and we recover (22), as desired.

Finally, suppose $f \ne 0$ in (20). Let

  $w(x) := \int_D G(x;y)f(y)\,dy$.

Notice that for $x \in S$, $G(x;y) = 0$ (reciprocity), and so $w(x) = 0$. Just as we did for the Poisson equation in $\mathbb{R}^3$, we can show that $-\Delta w = f$, and so

  $-\Delta(u - w) = f - f = 0$,

and we may write

  $u(x) - w(x) = -\int_{\partial D}\frac{\partial G_x}{\partial n}\big[ g(y) - w(y) \big]\,dS(y) = -\int_{\partial D}\frac{\partial G_x}{\partial n}\,g(y)\,dS(y)$

and we recover (21), as needed.

Formula (21) solves the Poisson BVP: it remains to show that (21) really does solve the BVP (20). We already argued that the first term in (21) (what we called $w$ above) solves $-\Delta w = f$ and has zero boundary conditions. It remains to show that the boundary integral is harmonic, and has boundary values $g$. We won't do this here; the proof can be found in rigorous PDE texts.

Uniqueness of the Green's function: we will prove this using the maximum principle in the next section.
4. The maximum principle.

Let $D$ be an (open, connected) region in $\mathbb{R}^n$, bounded by a smooth surface $\partial D$.

Theorem: [maximum principle for harmonic functions] Let $u$ be a function which is harmonic in $D$ ($\Delta u = 0$) and continuous on $\bar D = D \cup \partial D$. Then

1. $u$ attains its maximum and minimum values on the boundary $\partial D$

2. if $u$ also attains its maximum or minimum value at an interior point of $D$, it must be a constant function

Remark:

1. $u$ must attain its max. and min. values somewhere on $\bar D$, since it is a continuous function on a closed, bounded set

2. the "maximum" part of the theorem extends to subharmonic functions ($\Delta u \ge 0$), and the "minimum" part to superharmonic functions ($\Delta u \le 0$).

We'll give two proofs of the maximum principle.

Proof #1: suppose $u$ does, in fact, attain its maximum (say) at an interior point $x_0 \in D$. Then for any $r > 0$ such that the ball $B_r$ of radius $r$ about $x_0$ lies in $D$, the mean-value property of harmonic functions implies that

  $u(x_0) = \frac{1}{|\partial B_r|}\int_{\partial B_r}u\,dS \le u(x_0)$

since $u(x_0)$ is the maximum value of $u$; so we must have equality above, and $u$ must be equal to $u(x_0)$ everywhere on $\partial B_r$. Hence $u$ is constant on $B_r$. Repeating this argument, we can fill out all of $D$, and conclude $u$ is constant on $D$. (A slick argument for this: the set where $u(x) = u(x_0)$ is both closed and open in $D$, hence is all of $D$, since $D$ is connected.)

Our second proof does not rely so heavily on the mean-value property (at least for the first statement of the maximum principle), and hence generalizes to other Laplacian-like (a.k.a. elliptic) operators, for which the mean-value property does not hold.

Proof #2: again, suppose $u$ were to have a maximum (say) at an interior point $x_0 \in D$. Then we know from vector calculus that $\Delta u(x_0) \le 0$, which is almost a contradiction to $\Delta u = 0$, but not quite (because it is $\le$ rather than $<$). We can fix this by introducing a new function, for any $\epsilon > 0$,

  $v(x) := u(x) + \epsilon|x|^2$.

Notice that

  $\nabla v = \nabla u + 2\epsilon x$,  $\Delta v = \Delta u + 2\epsilon n = 2\epsilon n > 0$
($n$ is the space dimension), and so $v$ really cannot have a maximum at an interior point of $D$. That means that for any $x \in D$,

  $u(x) \le v(x) < \max_{y\in\partial D}v(y) = \max_{y\in\partial D}\big[ u(y) + \epsilon|y|^2 \big] \le \max_{y\in\partial D}u(y) + \epsilon\max_{y\in\partial D}|y|^2$

and now letting $\epsilon \to 0$, we see

  $u(x) \le \max_{y\in\partial D}u(y)$,

which is the first statement of the maximum principle (sometimes called the weak maximum principle). For the second statement (sometimes called the strong maximum principle), we can use the mean value property as before.
Consequences of the maximum principle:

Uniqueness of solutions of the Dirichlet problem for the Poisson equation:

Theorem: Let $f$ and $g$ be continuous functions on $D$ and $\partial D$ respectively. There is at most one function $u$ which is twice continuously differentiable in $D$, continuous on $\bar D$, and solves

  $-\Delta u = f$ in $D$,  $u = g$ on $\partial D$.

Proof: If $u_1(x)$ and $u_2(x)$ are both solutions, then their difference $w(x) := u_1(x) - u_2(x)$ solves

  $\Delta w = 0$ in $D$,  $w = 0$ on $\partial D$.

The maximum principle says that $w$ attains its maximum and minimum on the boundary $\partial D$. Since $w \equiv 0$ on $\partial D$, we must have $w \equiv 0$ in $D$, and hence $u_1 \equiv u_2$ in $D$.  □

Uniqueness of the (Dirichlet) Green's function:

Theorem: There is at most one Dirichlet Green's function for a given region $D$.

Proof: if $G(x;y)$ is a (Dirichlet) Green's function for $D$, then the function $H_x(y) := G(x;y) - G^f(x;y)$ (where $G^f$ denotes the free-space Green's function) solves the problem

  $\Delta H_x(y) = 0$,  $y \in D$
  $H_x(y) = -G^f(x;y)$,  $y \in \partial D$

and so is unique, by the previous theorem.  □
5. Green's functions by eigenfunction expansion.

We will just do one example.

Example: Find the Dirichlet Green's function in the infinite wedge of angle $\alpha$, described in polar coordinates $(r, \theta)$ by $0 \le \theta \le \alpha$, $r \ge 0$.

So given a point $x$ in the wedge, we are looking for a function $G_x$ solving $-\Delta G_x = \delta_x$ inside the wedge, and $G_x = 0$ on the rays $\theta = 0$ and $\theta = \alpha$. It is natural to work in polar coordinates. Let $(r_*, \theta_*)$ be the polar coordinates of $x$. Then the delta function centred at $x$ is written in polar coordinates as $\frac{1}{r_*}\delta(r - r_*)\delta(\theta - \theta_*)$ (the factor $1/r_*$ is there since, integrating in polar coordinates, $dy$ becomes $r\,dr\,d\theta$; check it). Also, the Laplacian in polar coordinates is

  $\Delta = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}$

and so we are looking for $G(r,\theta)$ which solves

  $-\Big( G_{rr} + \frac{1}{r}G_r + \frac{1}{r^2}G_{\theta\theta} \Big) = \frac{1}{r_*}\delta(r - r_*)\delta(\theta - \theta_*)$
  $G(r, 0) = G(r, \alpha) = 0$

The eigenfunctions of $-\frac{\partial^2}{\partial\theta^2}$ (the angular part of the Laplacian) which satisfy zero boundary conditions at $\theta = 0$ and $\theta = \alpha$ are $\sin(n\pi\theta/\alpha)$, $n = 1, 2, 3, \ldots$, so we will try to find $G$ in the form of an eigenfunction expansion

  $G(r, \theta) = \sum_{n=1}^\infty c_n(r)\sin(n\pi\theta/\alpha)$

(in fact, a Fourier sine series, since the eigenfunctions are sines), and our job is to find the coefficients $c_n(r)$.

Applying $-\Delta$ term-by-term to the expansion, we find

  $-\sum_{n=1}^\infty \Big( c_n'' + \frac{1}{r}c_n' - \frac{n^2\pi^2}{\alpha^2 r^2}c_n \Big)\sin(n\pi\theta/\alpha) = \frac{1}{r_*}\delta(r - r_*)\delta(\theta - \theta_*)$,

and integrating (in $\theta$) both sides against $\sin(k\pi\theta/\alpha)$ (and using $\int_0^\alpha \sin^2(k\pi\theta/\alpha)\,d\theta = \alpha/2$) yields

  $-\Big( c_k'' + \frac{1}{r}c_k' - \frac{k^2\pi^2}{\alpha^2 r^2}c_k \Big) = \frac{2}{\alpha}\frac{1}{r_*}\delta(r - r_*)\int_0^\alpha \sin(k\pi\theta/\alpha)\delta(\theta - \theta_*)\,d\theta = \frac{2}{\alpha r_*}\sin(k\pi\theta_*/\alpha)\,\delta(r - r_*)$.

For $r \ne r_*$, this is

  $c_k'' + \frac{1}{r}c_k' - \frac{k^2\pi^2}{\alpha^2 r^2}c_k = 0$,
an equation of Euler type whose solutions are of the form $r^b$, where plugging $r^b$ in yields

  $0 = b(b - 1) + b - \frac{k^2\pi^2}{\alpha^2} = (b - k\pi/\alpha)(b + k\pi/\alpha)$,

so $b = \pm k\pi/\alpha$. Thus we find

  $c_k(r) = \begin{cases} A r^{k\pi/\alpha} + B r^{-k\pi/\alpha} & 0 < r < r_* \\ C r^{k\pi/\alpha} + D r^{-k\pi/\alpha} & r_* < r < \infty \end{cases}$

For finiteness at $r = 0$, we take $B = 0$, and to avoid growth as $r \to \infty$, we take $C = 0$; hence

  $c_k(r) = \begin{cases} A r^{k\pi/\alpha} & 0 < r < r_* \\ D r^{-k\pi/\alpha} & r_* < r < \infty \end{cases}$

We will find $A$ and $D$ through matching conditions at $r = r_*$. Continuity at $r = r_*$ implies $A(r_*)^{k\pi/\alpha} = D(r_*)^{-k\pi/\alpha}$, and so, defining a new constant $\tilde A := A(r_*)^{k\pi/\alpha}$, we have

  $c_k(r) = \tilde A\begin{cases} (r/r_*)^{k\pi/\alpha} & 0 < r < r_* \\ (r_*/r)^{k\pi/\alpha} & r_* < r < \infty \end{cases} = \tilde A\,\rho^{k\pi/\alpha}$,  $\rho := \frac{\min(r, r_*)}{\max(r, r_*)}$.
The jump condition is

  $-\frac{2}{\alpha r_*}\sin\Big( \frac{k\pi\theta_*}{\alpha} \Big) = \int_{r_*-}^{r_*+}\Big[ \frac{1}{r}(r c_k')' - \frac{k^2\pi^2}{\alpha^2 r^2}c_k \Big]\,dr = c_k'\Big|_{r_*-}^{r_*+} = -\frac{2k\pi}{\alpha r_*}\tilde A$

so $\tilde A = \sin(k\pi\theta_*/\alpha)/(k\pi)$, and $c_k(r) = \sin(k\pi\theta_*/\alpha)\,\rho^{k\pi/\alpha}/(k\pi)$. So our expression for the Green's function is

  $G = \frac{1}{\pi}\sum_{n=1}^\infty \frac{1}{n}\sin(n\pi\theta_*/\alpha)\sin(n\pi\theta/\alpha)\,\rho^{n\pi/\alpha}$,

and using the cosine summation law to write

  $\sin(n\pi\theta_*/\alpha)\sin(n\pi\theta/\alpha) = \frac{1}{2}\Big[ \cos\Big( \frac{n\pi}{\alpha}(\theta - \theta_*) \Big) - \cos\Big( \frac{n\pi}{\alpha}(\theta + \theta_*) \Big) \Big]$,

we find

  $G = \frac{1}{2\pi}\sum_{n=1}^\infty \frac{1}{n}\Big[ \cos\Big( \frac{n\pi}{\alpha}(\theta - \theta_*) \Big)\rho^{n\pi/\alpha} - \cos\Big( \frac{n\pi}{\alpha}(\theta + \theta_*) \Big)\rho^{n\pi/\alpha} \Big]$
  $= \frac{1}{2\pi}\,\mathrm{Re}\sum_{n=1}^\infty \frac{1}{n}\Big[ (e^{i(\theta - \theta_*)})^{n\pi/\alpha} - (e^{i(\theta + \theta_*)})^{n\pi/\alpha} \Big]\rho^{n\pi/\alpha}$.
Now setting $z_1 := \rho e^{i(\theta - \theta_*)}$, $z_2 := \rho e^{i(\theta + \theta_*)}$, noting that $|z_1^{\pi/\alpha}| = |z_2^{\pi/\alpha}| = \rho^{\pi/\alpha} < 1$ (for $r \ne r_*$), and recalling that the Taylor series

  $-\log(1 - z) = \sum_{n=1}^\infty \frac{1}{n}z^n$

is convergent for $|z| < 1$, we arrive at

  $G(r, \theta; r_*, \theta_*) = -\frac{1}{2\pi}\,\mathrm{Re}\log\Big( \frac{1 - z_1^{\pi/\alpha}}{1 - z_2^{\pi/\alpha}} \Big) = -\frac{1}{2\pi}\log\Big| \frac{1 - z_1^{\pi/\alpha}}{1 - z_2^{\pi/\alpha}} \Big| = -\frac{1}{4\pi}\log\Big| \frac{1 - z_1^{\pi/\alpha}}{1 - z_2^{\pi/\alpha}} \Big|^2$
  $= -\frac{1}{4\pi}\log\Big( \frac{1 + \rho^{2\pi/\alpha} - 2\rho^{\pi/\alpha}\cos(\pi(\theta - \theta_*)/\alpha)}{1 + \rho^{2\pi/\alpha} - 2\rho^{\pi/\alpha}\cos(\pi(\theta + \theta_*)/\alpha)} \Big)$
  $= -\frac{1}{4\pi}\log\Big( \frac{(\rho^{\pi/\alpha} + \rho^{-\pi/\alpha})/2 - \cos(\pi(\theta - \theta_*)/\alpha)}{(\rho^{\pi/\alpha} + \rho^{-\pi/\alpha})/2 - \cos(\pi(\theta + \theta_*)/\alpha)} \Big)$

and so finally

  $G(r, \theta; r_*, \theta_*) = -\frac{1}{4\pi}\log\Big( \frac{\cosh\big( \frac{\pi}{\alpha}\log\rho \big) - \cos\big( \frac{\pi}{\alpha}(\theta - \theta_*) \big)}{\cosh\big( \frac{\pi}{\alpha}\log\rho \big) - \cos\big( \frac{\pi}{\alpha}(\theta + \theta_*) \big)} \Big)$,  $\rho := \frac{\min(r, r_*)}{\max(r, r_*)}$,

a rather attractive formula for the (Dirichlet) Green's function in the wedge of angle $\alpha$. (And we should pause and appreciate it when we are lucky enough to find an explicit formula; it is pretty rare!)

Amusing exercise: check that we recover the formulas that we previously derived (using the method of images) for the half-plane ($\alpha = \pi$) and the quarter-plane ($\alpha = \pi/2$).
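A quick numerical take on this exercise (an editorial sketch; assumes NumPy): at $\alpha = \pi$ the wedge formula should agree with the half-plane image formula $G = -\frac{1}{2\pi}(\log|y - x| - \log|y - \tilde x|)$, and it does.

```python
import numpy as np

def G_wedge(r, th, rs, ths, alpha):
    rho = min(r, rs) / max(r, rs)
    c = np.cosh(np.pi / alpha * np.log(rho))
    return -np.log((c - np.cos(np.pi / alpha * (th - ths)))
                   / (c - np.cos(np.pi / alpha * (th + ths)))) / (4 * np.pi)

def G_halfplane(y, x):
    xt = np.array([x[0], -x[1]])     # image point
    return -(np.log(np.linalg.norm(y - x))
             - np.log(np.linalg.norm(y - xt))) / (2 * np.pi)

# compare at a few points, with alpha = pi (the upper half-plane)
x = np.array([1.0, 2.0])             # source point
rs, ths = np.hypot(*x), np.arctan2(x[1], x[0])
for y in [np.array([2.0, 1.0]), np.array([-1.0, 3.0]), np.array([0.5, 0.5])]:
    r, th = np.hypot(*y), np.arctan2(y[1], y[0])
    print(G_wedge(r, th, rs, ths, np.pi), G_halfplane(y, x))   # columns agree
```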
IV. GREENS FUNCTIONS FOR TIME-DEPENDENT PROB-
LEMS
1. Greens functions for the heat equation.
Suppose we measure the temperature u(x, t) at each point x in a bounded region V
in R
n
(for us, usually n = 1, 2, or 3), and at each time t. The region is subjected to
heat sources f(x, t), and at its boundary, it is held at temperature g(x, t) (x V ).
Initially (at time t = 0, say), the temperature distribution in V is u
0
(x). The following
initial-boundary value problem for the heat equation describes the temperature
distribution u(x, t) at later times t > 0:
_
_
_
heat equation:
u
t
Du = f(x, t), x V, t > 0
boundary value: u(x, t) = g(x, t), x V, t > 0
initial condition: u(x, 0) = u
0
(x), x V
, (23)
where D > 0 is the diusion rate constant.
We would like to represent the solution to this problem using a Greens function.
The rst observation is that the dierential operator appearing in the equation is not
self-adjoint:
L :=

t
Du = L

=

t
Du.
Next, some notation. We fix a time interval $[0,T]$ (some $T>0$), and let $C_T$ denote the space-time cylinder $C_T = V\times[0,T]$.

As before, a Green's function for our problem should be a function of two sets of variables: $G(x,t;y,\tau)$, $x,y\in V$, $t,\tau\geq 0$. To determine the problem that $G$ should solve, we suppose $u(y,\tau)$ solves problem (23), integrate $G$ against $Lu$ over the space-time cylinder, and integrate by parts:
\[
\int_{C_T} G\,Lu\,dy\,d\tau = \int_{C_T} G\,(u_\tau - D\Delta u)\,dy\,d\tau
\]
\[
= \int_{C_T}(L^*G)\,u\,dy\,d\tau + D\int_0^T\!\!\int_{\partial V}\left(u\frac{\partial G}{\partial n} - G\frac{\partial u}{\partial n}\right)dS(y)\,d\tau + \int_V\left(Gu\big|_{\tau=T} - Gu\big|_{\tau=0}\right)dy.
\]
So supposing we demand that our Green's function $G(x,t;y,\tau)$ solves
\[
\begin{cases}
-\dfrac{\partial G}{\partial\tau} - D\Delta G = \delta(y-x)\,\delta(\tau-t) \\
G \equiv 0 \ \text{for } y\in\partial V \\
G \equiv 0 \ \text{for } \tau > t \quad\text{(causality)},
\end{cases}
\tag{24}
\]
we arrive at a representation formula for $u(x,t)$, $x\in V$, $0\leq t<T$:
\[
u(x,t) = \int_0^T\!\!\int_V G(x,t;y,\tau)f(y,\tau)\,dy\,d\tau + \int_V G(x,t;y,0)\,u_0(y)\,dy - D\int_0^T\!\!\int_{\partial V}\frac{\partial G}{\partial n}\,g(y,\tau)\,dS(y)\,d\tau.
\tag{25}
\]
Note that the above causality condition implies that the solution at time $t$ does not depend on any of its values at later times; i.e. we are solving forward in time.

Problem (24) for the Green's function looks a little odd. It is a backwards heat equation, which would be nasty, except that it is also solved backwards in time (starting from time $\tau = t$ and going down to $\tau = 0$). So, to straighten it out, it is useful to change the time variable from $\tau$ to $\sigma := t-\tau$. Problem (24) then becomes
\[
\begin{cases}
G_\sigma - D\Delta G = \delta(y-x)\,\delta(\sigma) \\
G = 0 \ \text{for } y\in\partial V \\
G = 0 \ \text{for } \sigma < 0.
\end{cases}
\tag{26}
\]
This is the problem we will try to solve in various situations. The simplest case is when there is no boundary: the free-space case.
2. Free-space Green's function for the heat equation.

We will find the free-space Green's function for the heat equation by using the Fourier transform, so let us first recall the definition and some key properties of it.

Definition: Let $f$ be an integrable function on $\mathbb{R}^n$ (that means $\int_{\mathbb{R}^n}|f(x)|\,dx < \infty$). The Fourier transform of $f$ is another function, $\hat{f}$, defined by
\[
\hat{f}(\xi) := (2\pi)^{-n/2}\int_{\mathbb{R}^n} e^{-i\xi\cdot x}f(x)\,dx.
\]
Here are some useful properties of the Fourier transform. Let $f, g$ be smooth functions with rapid decay at $\infty$.

1. F.T. is linear: $\widehat{(\alpha f + \beta g)}(\xi) = \alpha\hat{f}(\xi) + \beta\hat{g}(\xi)$.

2. F.T. is (almost) its own inverse: $\check{\hat{f}} = f$, where
\[
\check{g}(x) := (2\pi)^{-n/2}\int_{\mathbb{R}^n} e^{i\xi\cdot x}g(\xi)\,d\xi
\]
is the inverse Fourier transform.

3. F.T. is unitary (its inverse is its adjoint): $(\hat{f},g) = (f,\check{g})$, and in particular (taking $g = \hat{f}$), it preserves the $L^2$-norm:
\[
\int_{\mathbb{R}^n}|\hat{f}(\xi)|^2\,d\xi = \int_{\mathbb{R}^n}|f(x)|^2\,dx.
\]
4. F.T. interchanges differentiation and coordinate multiplication:
\[
\widehat{\frac{\partial f}{\partial x_j}}(\xi) = (i\xi_j)\hat{f}(\xi), \qquad \widehat{(x_jf(x))}(\xi) = i\frac{\partial}{\partial\xi_j}\hat{f}(\xi).
\]
5. F.T. interchanges convolution and multiplication:
\[
\widehat{f*g}(\xi) = (2\pi)^{n/2}\hat{f}(\xi)\,\hat{g}(\xi), \qquad (f*g)(x) := \int_{\mathbb{R}^n}f(x-y)g(y)\,dy.
\]
6. F.T. interchanges coordinate translation and multiplication by an exponential: for $a\in\mathbb{R}^n$,
\[
\widehat{f(x-a)}(\xi) = e^{-ia\cdot\xi}\hat{f}(\xi).
\]
7. F.T. maps Gaussians to Gaussians: for $a>0$,
\[
\widehat{e^{-\frac{a|x|^2}{2}}}(\xi) = a^{-n/2}e^{-\frac{|\xi|^2}{2a}}.
\]
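Property 7 is easy to test numerically. The following sketch (our own check, using NumPy quadrature; the value of $a$ and the grid are arbitrary choices) approximates the one-dimensional transform directly and compares it against the stated Gaussian formula:

```python
import numpy as np

# Check property 7 in one dimension by direct quadrature:
# (2*pi)^(-1/2) * integral of e^{-i*xi*x} e^{-a*x^2/2} dx
# should equal a^(-1/2) * exp(-xi^2 / (2a)).
a = 1.7
x = np.linspace(-30, 30, 200001)
f = np.exp(-a * x**2 / 2)

for xi in [0.0, 0.5, 2.0]:
    ft = np.trapz(np.exp(-1j * xi * x) * f, x) / np.sqrt(2 * np.pi)
    exact = a**-0.5 * np.exp(-xi**2 / (2 * a))
    print(xi, ft.real, exact)  # imaginary parts ~ 0; real parts agree
```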
It is property 4 which makes the Fourier transform so useful for differential equations: it converts differential equations into algebraic ones.

Property 2 is deep, and difficult to prove. See an analysis textbook for this. The other properties are easy to show. We will do 4 and 7, the two that we will use shortly.

Proof of property 4: integrating by parts,
\[
\widehat{\frac{\partial f}{\partial x_j}}(\xi) = (2\pi)^{-n/2}\int_{\mathbb{R}^n}e^{-i\xi\cdot x}\frac{\partial f}{\partial x_j}(x)\,dx = -(2\pi)^{-n/2}\int_{\mathbb{R}^n}(-i\xi_j)e^{-i\xi\cdot x}f(x)\,dx = i\xi_j\hat{f}(\xi).
\]
And for the other one,
\[
i\frac{\partial}{\partial\xi_j}\hat{f}(\xi) = i\frac{\partial}{\partial\xi_j}(2\pi)^{-n/2}\int_{\mathbb{R}^n}e^{-i\xi\cdot x}f(x)\,dx = i(2\pi)^{-n/2}\int_{\mathbb{R}^n}(-ix_j)e^{-i\xi\cdot x}f(x)\,dx = \widehat{(x_jf(x))}(\xi).
\]
Proof of property 7: the higher-dimensional cases follow easily from the case $n=1$, so we'll do that one. Completing the square,
\[
\widehat{e^{-\frac{a}{2}x^2}}(\xi) = (2\pi)^{-1/2}\int_{-\infty}^{\infty}e^{-i\xi x}e^{-\frac{a}{2}x^2}dx = e^{-\frac{\xi^2}{2a}}(2\pi)^{-1/2}\int_{-\infty}^{\infty}e^{-\frac{a}{2}(x+i\xi/a)^2}dx.
\]
The last integral is the integral of the entire complex function $f(z) = e^{-az^2/2}$ along the contour $z = x + i\xi/a$, $-\infty < x < \infty$, in the complex plane. We can shift the contour to the real axis using Cauchy's theorem:
\[
\int_{-\infty}^{\infty}e^{-\frac{a}{2}(x+i\xi/a)^2}dx = \lim_{R\to\infty}\int_{[-R,R]+i\xi/a}f(z)\,dz = \lim_{R\to\infty}\left(\int_{[-R,R]}f(z)\,dz + \int_{A_R}f(z)\,dz - \int_{B_R}f(z)\,dz\right)
\]
where $A_R$ denotes the contour $z = -R+iy$, $y: \xi/a\to 0$, and $B_R$ denotes the contour $z = R+iy$, $y: \xi/a\to 0$ (draw a picture!). Along $A_R$ and $B_R$, we have
\[
|f(z)| = e^{-\frac{a}{2}\mathrm{Re}(z^2)} \leq e^{-\frac{a}{2}(R^2 - \xi^2/a^2)}
\]
and so
\[
\left|\int_{A_R}f(z)\,dz - \int_{B_R}f(z)\,dz\right| \leq \frac{2|\xi|}{a}\,e^{\frac{\xi^2}{2a}}\,e^{-\frac{a}{2}R^2} \to 0 \ \text{as } R\to\infty.
\]
Hence
\[
\int_{-\infty}^{\infty}e^{-\frac{a}{2}(x+i\xi/a)^2}dx = \int_{-\infty}^{\infty}e^{-\frac{a}{2}x^2}dx = \sqrt{\frac{2\pi}{a}}
\]
(the last equality is a standard fact which can be proved, for example, by squaring the integral, interpreting it as a two-dimensional integral, and changing to polar coordinates). Finally, then, we arrive at
\[
\widehat{e^{-\frac{a}{2}x^2}}(\xi) = a^{-1/2}e^{-\frac{\xi^2}{2a}}
\]
as needed.
Now let's return to the problem of finding the free-space Green's function for the heat equation; that is, solving (26) when $V = \mathbb{R}^n$. Let $\hat{G}(x,t;\xi,\sigma)$ denote the Fourier transform of $G(x,t;y,\sigma)$ in the variable $y$. Property 4 of the Fourier transform shows that $\Delta$ in the variable $y$ corresponds to multiplication by $-|\xi|^2$. Also, note that
\[
\widehat{\delta_x(y)}(\xi) = (2\pi)^{-n/2}\int_{\mathbb{R}^n}e^{-i\xi\cdot y}\delta(y-x)\,dy = (2\pi)^{-n/2}e^{-i\xi\cdot x},
\]
and so we have to solve
\[
\hat{G}_\sigma + D|\xi|^2\hat{G} = (2\pi)^{-n/2}e^{-i\xi\cdot x}\delta(\sigma).
\]
For $\sigma > 0$, this is $\hat{G}_\sigma + D|\xi|^2\hat{G} = 0$, an ODE which is easily solved to find
\[
\hat{G} = Ce^{-D|\xi|^2\sigma},
\]
where $C = C(x,t,\xi)$ can be found by a jump condition:
\[
1 = \int_{0-}^{0+}\delta(\sigma)\,d\sigma = (2\pi)^{n/2}e^{i\xi\cdot x}\int_{0-}^{0+}\left(\hat{G}_\sigma + D|\xi|^2\hat{G}\right)d\sigma = (2\pi)^{n/2}e^{i\xi\cdot x}\,\hat{G}\Big|_{\sigma=0-}^{\sigma=0+} = (2\pi)^{n/2}e^{i\xi\cdot x}\,C
\]
where we used the fact that $\hat{G}$ is bounded, so that the second term in the integral contributes nothing, and the causality condition that $\hat{G} = 0$ for $\sigma < 0$. Thus we arrive at
\[
\hat{G} = (2\pi)^{-n/2}e^{-i\xi\cdot x}e^{-D|\xi|^2\sigma}.
\]
Finally, then, inverting the Fourier transform, and using properties 7 and 6, we find for $\sigma > 0$,
\[
G(x,t;y,\sigma) = (4\pi D\sigma)^{-n/2}e^{-\frac{|y-x|^2}{4D\sigma}},
\]
and so, changing back to $\sigma = t-\tau$, our free-space Green's function (also called the fundamental solution) of the heat equation in $\mathbb{R}^n$ is
\[
G(x,t;y,\tau) = \begin{cases}(4\pi D(t-\tau))^{-n/2}e^{-\frac{|y-x|^2}{4D(t-\tau)}} & \tau < t \\ 0 & \tau > t.\end{cases}
\]
We obtain a solution formula for the heat equation in $\mathbb{R}^n$,
\[
\begin{cases}
u_t - D\Delta u = f(x,t), & x\in\mathbb{R}^n,\ t>0 \\
u(x,0) = u_0(x), & x\in\mathbb{R}^n,
\end{cases}
\tag{27}
\]
by substituting our expression for $G$ into (25):
\[
u(x,t) = \int_0^t\!\!\int_{\mathbb{R}^n}(4\pi D(t-\tau))^{-n/2}e^{-\frac{|y-x|^2}{4D(t-\tau)}}f(y,\tau)\,dy\,d\tau + (4\pi Dt)^{-n/2}\int_{\mathbb{R}^n}e^{-\frac{|y-x|^2}{4Dt}}u_0(y)\,dy.
\tag{28}
\]
It is not hard to prove the following (though we will not do it here):

Theorem: Suppose $f(x,t)$ is continuously differentiable and bounded on $\mathbb{R}^n\times[0,T]$, and $u_0(x)$ is continuous on $\mathbb{R}^n$. Then for $0<t<T$, $u(x,t)$ given by formula (28) is continuously differentiable in $t$ and twice continuously differentiable in $x$, and solves the heat equation in (27). Furthermore, for any $x\in\mathbb{R}^n$, $\lim_{t\to 0}u(x,t) = u_0(x)$.
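To make formula (28) concrete, here is a small numerical sketch ($f\equiv 0$, one space dimension; the box-shaped initial data, diffusion constant, and grid are our own choices) that evaluates the second integral of (28) by quadrature:

```python
import numpy as np

D = 0.5
y = np.linspace(-20, 20, 4001)             # quadrature grid for the y-integral
u0 = np.where(np.abs(y) < 1.0, 1.0, 0.0)   # discontinuous initial temperature

def heat_solution(x, t):
    """Evaluate u(x,t) from formula (28) with f = 0, n = 1."""
    kernel = (4 * np.pi * D * t) ** -0.5 * np.exp(-(y - x) ** 2 / (4 * D * t))
    return np.trapz(kernel * u0, y)

# even at t = 0.01 the solution is smooth and positive everywhere,
# previewing the two properties discussed next
print(heat_solution(0.0, 0.01), heat_solution(5.0, 0.01))
```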
Based on our free-space solution, we can infer a couple of important properties of diffusion:

Instantaneous smoothing: in the absence of sources, solutions of the heat equation become instantaneously smooth (even if the initial data is not). In the free-space case, this property is reflected in the fact that the fundamental solution $(4\pi Dt)^{-n/2}e^{-|x|^2/4Dt}$ is infinitely differentiable for $t>0$ (though it is a delta function at $t=0$!), and in the second integral of (28) (the one containing the initial data $u_0(x)$), derivatives in $x$ and $t$ will fall on the fundamental solution.

Infinite propagation speed: suppose $f\equiv 0$ (no sources), and $u_0(x)\geq 0$ is positive in a ball of radius 1 about the origin, and vanishes outside a ball of radius 2. Then for any $t>0$, and any $x\in\mathbb{R}^n$,
\[
u(x,t) = (4\pi Dt)^{-n/2}\int_{\mathbb{R}^n}e^{-|y-x|^2/(4Dt)}u_0(y)\,dy > 0.
\]
That is, the solution becomes positive at all points in space instantaneously for $t>0$; hence the initial, localized disturbance is propagated with infinite speed.
3. Maximum principle for the heat equation.

We have already seen that the (elliptic) maximum principle is a powerful tool for analysing solutions of Laplace and Poisson equations. An analogous (parabolic) maximum principle plays the same role for heat equations.

Theorem: [Maximum principle for the heat equation]. Let $V$ be a bounded (and open, and connected) region in $\mathbb{R}^n$, and let $T>0$. Suppose $u(x,t)$ is continuous on the closed cylinder $\bar{V}\times[0,T]$, and continuously differentiable (once in $t$, twice in $x$) and solving the heat equation $\partial u/\partial t = D\Delta u$ in $V\times(0,T]$. Then the maximum (and the minimum) of $u$ over $\bar{V}\times[0,T]$ is attained either initially (at $t=0$) or on the spatial boundary ($x\in\partial V$).

Remark: Just as in the elliptic case, there is also a strong maximum principle, which says that if the max. (or min.) of $u$ is also attained at an interior point $(x_0,t_0)$ ($x_0\in V$, $t_0>0$), then $u\equiv$ const. for $t\leq t_0$. (Notice it doesn't say anything for $t>t_0$.) We will not prove this here.

Proof of the maximum principle: the same argument as for harmonic functions, i.e. considering the function
\[
v(x,t) := u(x,t) + \epsilon|x|^2
\]
for $\epsilon>0$, shows that the max. (and min.) is attained somewhere on the boundary of the cylinder (since $\partial v/\partial t = 0$ and $\Delta v\leq 0$ at an interior max., contradicting the computation below). We only have to show that the max. (or min.) is attained somewhere on the boundary other than at the final time $t=T$. So suppose $v$ has a max. (say) at $(x_0,T)$, for some $x_0\in V$. Then
\[
\Delta v(x_0,T)\leq 0,
\]
and
\[
\frac{\partial v}{\partial t}(x_0,T) = \lim_{h\to 0+}\frac{v(x_0,T)-v(x_0,T-h)}{h} \geq 0.
\]
Hence
\[
0 \leq \frac{\partial v}{\partial t}(x_0,T) - D\Delta v(x_0,T) = -2n\epsilon D < 0,
\]
a contradiction. So for any $(x,t)\in\bar{V}\times[0,T]$, we have
\[
u(x,t) = v(x,t) - \epsilon|x|^2 \leq v(x,t) \leq \max_{\partial C_T\setminus\{t=T\}}v \leq \max_{\partial C_T\setminus\{t=T\}}u + \epsilon\,(\text{const.}),
\]
and letting $\epsilon\to 0$, we find
\[
u(x,t) \leq \max_{\partial C_T\setminus\{t=T\}}u
\]
as desired.
Just as for the Laplace/Poisson equation, an immediate consequence of the maximum principle is the uniqueness of solutions of the initial-boundary-value problem for the heat equation.

As usual, let $V$ be a bounded (open, connected) domain in $\mathbb{R}^n$. Fix any $T>0$. Let $f(x,t)$, $g(x,t)$, and $u_0(x)$ be continuous functions (on $V\times(0,T)$, $\partial V\times[0,T]$, and $V$, respectively), and consider the problem
\[
\begin{cases}
u_t - D\Delta u = f(x,t), & x\in V,\ 0<t\leq T \\
u(x,t) = g(x,t), & x\in\partial V,\ 0\leq t\leq T \\
u(x,0) = u_0(x), & x\in V.
\end{cases}
\tag{29}
\]
Theorem: There is at most one function $u(x,t)$ which is continuous on $\bar{V}\times[0,T]$, continuously differentiable (once in $t$, twice in $x$) in $V\times(0,T]$, and solves problem (29).

Proof: if there are 2 solutions, their difference $w(x,t)$ satisfies
\[
\begin{cases}
w_t - D\Delta w = 0, & x\in V,\ t>0 \\
w(x,t) = 0, & x\in\partial V,\ t>0 \\
w(x,0) = 0, & x\in V.
\end{cases}
\]
Applying the (parabolic) maximum principle, we conclude that the max. and min. values of $w$ on the cylinder are 0. Hence $w\equiv 0$.
4. Methods of images and eigenfunction expansion.

Notation: to make the writing easier, we will often denote derivatives using subscripts, e.g. $u_t = \frac{\partial u}{\partial t}$, $u_{xx} = \frac{\partial^2 u}{\partial x^2}$, etc.

Example: (diffusion on the half-line). Solve
\[
\begin{cases}
u_t - u_{xx} = f(x,t), & x>0,\ t>0 \\
u(0,t) = 0, & t>0 \\
u(x,0) = u_0(x), & x>0
\end{cases}
\tag{30}
\]
by finding the Green's function.

Recalling our notation $\sigma = t-\tau$, the problem for the Green's function $G(x,t;y,\sigma)$ is: for $x>0$, $t>0$,
\[
\begin{cases}
G_\sigma - G_{yy} = \delta(y-x)\,\delta(\sigma), & y>0,\ \sigma\geq 0 \\
G|_{y=0} = 0 \\
G = 0, & \sigma<0.
\end{cases}
\]
Since the geometry is so simple, let's try the method of images. We know that the free-space Green's function (in one space dimension) with singularity at $x$ is
\[
G_f(x,t;y,\sigma) = \frac{1}{\sqrt{4\pi\sigma}}e^{-\frac{(y-x)^2}{4\sigma}}.
\]
If we put our image charge at $-x$ (outside our domain!), we can also satisfy the boundary conditions: set, for $\sigma>0$,
\[
G(x,t;y,\sigma) := G_f(x,t;y,\sigma) - G_f(-x,t;y,\sigma) = \frac{1}{\sqrt{4\pi\sigma}}\left(e^{-\frac{(y-x)^2}{4\sigma}} - e^{-\frac{(y+x)^2}{4\sigma}}\right).
\]
Then
\[
G_\sigma - G_{yy} = \delta(y-x)\,\delta(\sigma) - \delta(y+x)\,\delta(\sigma) = \delta(y-x)\,\delta(\sigma)
\]
(since $x>0$ and $y>0$), and furthermore
\[
G|_{y=0} = \frac{1}{\sqrt{4\pi\sigma}}\left(e^{-\frac{x^2}{4\sigma}} - e^{-\frac{x^2}{4\sigma}}\right) = 0,
\]
so we are in business! Replacing $\sigma$ by $t-\tau$, our Green's function is, for $\tau<t$ (recall it is zero for $\tau>t$),
\[
G(x,t;y,\tau) = \frac{1}{\sqrt{4\pi(t-\tau)}}\left(e^{-\frac{(y-x)^2}{4(t-\tau)}} - e^{-\frac{(y+x)^2}{4(t-\tau)}}\right)
\]
and the resulting solution formula for our half-line problem (30) is
\[
u(x,t) = \int_0^t\!\!\int_0^\infty\frac{1}{\sqrt{4\pi(t-\tau)}}\left(e^{-\frac{(y-x)^2}{4(t-\tau)}} - e^{-\frac{(y+x)^2}{4(t-\tau)}}\right)f(y,\tau)\,dy\,d\tau
\]
\[
+\ \frac{1}{\sqrt{4\pi t}}\int_0^\infty\left(e^{-\frac{(y-x)^2}{4t}} - e^{-\frac{(y+x)^2}{4t}}\right)u_0(y)\,dy.
\]
It can be checked rigorously that for reasonable functions $f$ and $u_0$ (precisely: bounded and continuous, with $f$ also continuously differentiable), this formula gives a continuously differentiable (once in $t$, twice in $x$) function $u$ which solves (30), but we won't do it here.
Example: (diffusion on a rod). Solve
\[
\begin{cases}
u_t - u_{xx} = 0, & 0<x<L,\ t>0 \\
u(0,t) = u(L,t) = 0, & t>0 \\
u(x,0) = u_0(x), & 0\leq x\leq L
\end{cases}
\tag{31}
\]
by finding the Green's function.

The method of images will not work so well here, so we try an eigenfunction expansion instead. The eigenfunctions of the spatial part of the differential operator ($-d^2/dx^2$) with zero BCs at $x=0$ and $x=L$ are $\sin(n\pi x/L)$, $n = 1,2,3,\ldots$, and so we seek our Green's function in the form
\[
G(x,t;y,\sigma) = \sum_{n=1}^{\infty}g_n(x,t;\sigma)\sin(n\pi y/L)
\]
for $\sigma>0$. We require $G_\sigma - G_{yy} = \delta(y-x)\,\delta(\sigma)$, thus
\[
\sum_{n=1}^{\infty}\left[(g_n)_\sigma + \frac{n^2\pi^2}{L^2}g_n\right]\sin(n\pi y/L) = \delta(y-x)\,\delta(\sigma)
\]
and so
\[
(g_n)_\sigma + \frac{n^2\pi^2}{L^2}g_n = \frac{2}{L}\int_0^L\sin(n\pi y/L)\,\delta(y-x)\,\delta(\sigma)\,dy = \frac{2}{L}\sin(n\pi x/L)\,\delta(\sigma).
\]
For $\sigma>0$, we have $(g_n)_\sigma + (n^2\pi^2/L^2)g_n = 0$, and so
\[
g_n = Ce^{-(n^2\pi^2/L^2)\sigma}
\]
and we determine the constant $C$ from a jump condition:
\[
1 = \int_{0-}^{0+}\delta(\sigma)\,d\sigma = \frac{L}{2\sin(n\pi x/L)}\int_{0-}^{0+}\left[(g_n)_\sigma + (n^2\pi^2/L^2)g_n\right]d\sigma = \frac{L}{2\sin(n\pi x/L)}\,g_n\Big|_{\sigma=0-}^{\sigma=0+} = \frac{L}{2\sin(n\pi x/L)}\,C
\]
(where we used $G = 0$ for $\sigma<0$). Hence $C = 2\sin(n\pi x/L)/L$, and we have an expression for our Green's function
\[
G = \frac{2}{L}\sum_{n=1}^{\infty}e^{-(n^2\pi^2/L^2)\sigma}\sin(n\pi x/L)\sin(n\pi y/L)
\]
(which indeed satisfies the zero boundary conditions at $y=0$ and $y=L$). The corresponding formula for the solution of problem (31) is
\[
u(x,t) = \frac{2}{L}\sum_{n=1}^{\infty}e^{-n^2\pi^2t/L^2}\sin(n\pi x/L)\int_0^L\sin(n\pi y/L)\,u_0(y)\,dy.
\]
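The series converges very quickly for $t>0$ (the $n$-th term decays like $e^{-n^2\pi^2t/L^2}$), so a short truncation usually suffices. Here is a minimal sketch (the initial profile, grid, and truncation are our own choices) evaluating the solution formula for (31):

```python
import numpy as np

L_rod, N = 1.0, 50               # rod length; number of series terms
y = np.linspace(0, L_rod, 2001)  # grid for the Fourier-coefficient integrals
u0 = y * (L_rod - y)             # sample initial temperature profile

def rod_solution(x, t):
    """Truncated eigenfunction-expansion solution of problem (31)."""
    total = 0.0
    for n in range(1, N + 1):
        coeff = np.trapz(np.sin(n * np.pi * y / L_rod) * u0, y)
        total += (np.exp(-n**2 * np.pi**2 * t / L_rod**2)
                  * np.sin(n * np.pi * x / L_rod) * coeff)
    return 2.0 / L_rod * total

print(rod_solution(0.5, 0.1))
```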
5. Green's function for the 1D wave equation.

Suppose we measure the displacement $u(x,t)$ from equilibrium, at each point $x$ in a bounded region $V$ in $\mathbb{R}^n$ (representing an elastic string ($n=1$), membrane ($n=2$), or solid ($n=3$)), and at each time $t$. The string/membrane/solid is subjected to forces $f(x,t)$, and at its boundary, it is held at fixed displacement $g(x,t)$ ($x\in\partial V$). At time $t=0$, at each $x\in V$, the initial displacement is $u_0(x)$, and the initial velocity is $v_0(x)$. Assuming the displacements are small, the following initial-boundary value problem for the wave equation is a reasonable description of the displacement $u(x,t)$ at later times $t>0$:
\[
\begin{cases}
\text{wave equation:} & \dfrac{\partial^2u}{\partial t^2} - c^2\Delta u = f(x,t), \quad x\in V,\ t>0 \\
\text{boundary value:} & u(x,t) = g(x,t), \quad x\in\partial V,\ t>0 \\
\text{initial conditions:} & u(x,0) = u_0(x),\ \ \dfrac{\partial u}{\partial t}(x,0) = v_0(x), \quad x\in V,
\end{cases}
\tag{32}
\]
where $c>0$ is the wave speed. The wave equation also describes the propagation of other waves, such as sound and light.

Just as for the heat equation, we would like to solve problems like this one via Green's functions. We will jump right in, by looking immediately for the free-space Green's functions (or "fundamental solutions") in dimensions one, two, and three (unlike for the heat equation, the form of the Green's function depends significantly on the dimension).

1D free-space Green's function for the wave equation:

As with the heat equation, the Green's function should be a function of two sets of space-time variables: $x,t;y,\tau$. And as for the heat equation, it is sometimes convenient to work with $\sigma := t-\tau$, rather than $\tau$. Hence, for $x,t,y,\tau\in\mathbb{R}$, we seek $G(x,t;y,\sigma)$ solving
\[
\begin{cases}
G_{\sigma\sigma} - c^2G_{yy} = \delta(y-x)\,\delta(\sigma), & -\infty<y<\infty,\ \sigma\geq 0 \\
G\equiv 0, & \sigma<0 \quad\text{(causality)}.
\end{cases}
\tag{33}
\]
You can think of $G$ as the signal (e.g. sound or light) emitted by a unit point source at spatial point $x$ and at time 0.

For the sake of variety, we will solve this problem using the Laplace transform. Recall that the Laplace transform of a (say, bounded, continuous) function $f$ defined on $[0,\infty)$ is another function defined on $[0,\infty)$:
\[
\tilde{f}(s) = \mathcal{L}(f)(s) := \int_0^\infty e^{-st}f(t)\,dt,
\]
and recall the key property (Laplace transform turns differentiation into coordinate multiplication)
\[
\mathcal{L}(f')(s) = s\mathcal{L}(f)(s) - f(0),
\]
which is easily verified by integration by parts. Thus if we let $\tilde{G}(x,t;y,s)$ be the Laplace transform of $G$ in the variable $\sigma$, and use $\mathcal{L}(\delta) = 1$, and the causality condition, we find
\[
s^2\tilde{G} - c^2\tilde{G}_{yy} = \delta(y-x).
\]
Solving this ODE (in $y$) to the left and right of $x$, and imposing the conditions that $\tilde{G}$ decay as $y\to\pm\infty$, and that it be continuous at $y=x$, yields
\[
\tilde{G} = Ae^{-\frac{s}{c}|y-x|}.
\]
As usual, the remaining constant $A$ is determined by the jump condition:
\[
1 = \int_{x-}^{x+}\delta(y-x)\,dy = \int_{x-}^{x+}\left(s^2\tilde{G} - c^2\tilde{G}_{yy}\right)dy = -c^2\tilde{G}_y\Big|_{y=x-}^{y=x+} = 2scA,
\]
hence $A = (2sc)^{-1}$, and
\[
\tilde{G}(x;y,s) = \frac{1}{2c}\,\frac{e^{-(|y-x|/c)s}}{s}.
\]
Now for any $r\geq 0$, the Laplace transform of the Heaviside function
\[
H(t) := \begin{cases}0 & t<0 \\ 1 & t\geq 0\end{cases} \quad\text{is}\quad \mathcal{L}(H(t-r)) = \int_r^\infty e^{-st}\,dt = \frac{e^{-rs}}{s},
\]
so by comparison, we must have
\[
G = \frac{1}{2c}H\!\left(\sigma - \frac{1}{c}|y-x|\right).
\]
Restoring $\sigma = t-\tau$, we arrive at
\[
G(x,t;y,\tau) = \frac{1}{2c}H\!\left(t-\tau-\frac{1}{c}|y-x|\right),
\]
our 1D free-space Green's function. It is instructive to sketch a space-time graph (i.e. in the $y$-$\tau$ plane) of $G$!

Knowing the Green's function, we can find the solution to the initial value problem for the wave equation on the line:
\[
\begin{cases}
u_{tt} = c^2u_{xx}, & -\infty<x<\infty,\ t>0 \\
u(x,0) = u_0(x), \quad u_t(x,0) = v_0(x),
\end{cases}
\tag{34}
\]
which is sometimes called the Cauchy problem for the wave equation.
Integrating by parts, we have, for any $T>t$,
\[
u(x,t) = \int_0^T\!\!\int_{-\infty}^{\infty}u(y,\tau)\,\delta(y-x)\,\delta(\tau-t)\,dy\,d\tau = \int_0^T\!\!\int_{-\infty}^{\infty}u(y,\tau)\left(G_{\tau\tau} - c^2G_{yy}\right)dy\,d\tau
\]
\[
= \int_0^T\!\!\int_{-\infty}^{\infty}\left(u_{\tau\tau} - c^2u_{yy}\right)G\,dy\,d\tau + \int_{-\infty}^{\infty}\left(uG_\tau - u_\tau G\right)\Big|_{\tau=0}^{\tau=T}\,dy
\]
\[
= \int_{-\infty}^{\infty}\left(v_0(y)\,G(x,t;y,0) - u_0(y)\,G_\tau(x,t;y,0)\right)dy
\]
where we used the causality condition $G\equiv G_\tau\equiv 0$ for $\tau>t$. Now, since $H' = \delta$,
\[
-G_\tau(x,t;y,0) = G_\sigma\big|_{\sigma=t} = \frac{1}{2c}\,\delta\!\left(t-\frac{1}{c}|y-x|\right) = \frac{1}{2}\left[\delta(y-(x+ct)) + \delta(y-(x-ct))\right],
\]
and so our solution to (34) is
\[
u(x,t) = \frac{1}{2}\left[u_0(x+ct) + u_0(x-ct)\right] + \frac{1}{2c}\int_{x-ct}^{x+ct}v_0(y)\,dy,
\]
which is known as d'Alembert's formula.

Remark:

Notice that d'Alembert's formula represents the sum of two waves, one moving to the left with speed $c$, one moving to the right with speed $c$:
\[
u(x,t) = f_+(x+ct) + f_-(x-ct), \qquad f_\pm(z) := \frac{1}{2}u_0(z) \pm \frac{1}{2c}\int_0^zv_0(z')\,dz'.
\]
It is simple to check d'Alembert's formula does, in fact, solve problem (34). Indeed, if $f$ is any twice differentiable function, then $f(x\mp ct)$ solves the wave equation: $\left[\frac{\partial^2}{\partial t^2} - c^2\frac{\partial^2}{\partial x^2}\right]f(x\mp ct) = c^2f'' - c^2f'' = 0$. It is also easy to check the initial data are satisfied. Hence, assuming $u_0$ is twice differentiable, and $v_0$ is once differentiable, we have solved problem (34).

Finite speed of propagation: The solution at space-time point $(x,t)$ depends only on the initial data ($u_0$ and $v_0$) in the interval $[x-ct,x+ct]$ (draw a graph!); that is, signals propagate with speed at most $c$.

Sidenote: d'Alembert's formula makes sense even if $u_0$ and $v_0$ are not differentiable (just continuous), though then the PDE doesn't hold in the classical sense, only in the sense of distributions.
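D'Alembert's formula is also trivial to evaluate numerically. A minimal sketch (the Gaussian initial bump and zero initial velocity are our own choices) which exhibits the splitting into two travelling waves:

```python
import numpy as np

c = 2.0
u0 = lambda z: np.exp(-z**2)       # sample initial displacement
v0 = lambda z: np.zeros_like(z)    # start from rest

def dalembert(x, t, nquad=2001):
    """Evaluate d'Alembert's formula at the point (x, t)."""
    y = np.linspace(x - c * t, x + c * t, nquad)
    return 0.5 * (u0(x + c * t) + u0(x - c * t)) + np.trapz(v0(y), y) / (2 * c)

# the initial bump splits into two half-height bumps moving at speed c
print(dalembert(0.0, 1.0), dalembert(2.0, 1.0))
```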
6. The wave equation in higher dimensions.

2D free-space Green's function for the wave equation:

We look for a Green's function depending on the spatial variable $y$ only through $r := |y-x|$, leading to the problem
\[
\begin{cases}
G_{\sigma\sigma} - c^2\left(G_{rr} + \dfrac{1}{r}G_r\right) = \delta(\sigma)\,\delta(y-x), & r>0,\ \sigma>0 \\
G\equiv 0, & \sigma<0.
\end{cases}
\]
Taking again the Laplace transform in the variable $\sigma$, we are led to
\[
s^2\tilde{G} - c^2\left(\tilde{G}_{rr} + \frac{1}{r}\tilde{G}_r\right) = \delta(y-x).
\]
For $r>0$, the ODE
\[
\tilde{G}_{rr} + \frac{1}{r}\tilde{G}_r - \frac{s^2}{c^2}\tilde{G} = 0
\]
is solved by $\tilde{G} = AK_0((s/c)r)$, where $K_0(z)$ is the modified Bessel function of order 0, which solves
\[
\begin{cases}
z^2K_0''(z) + zK_0'(z) - z^2K_0(z) = 0 \\
K_0(z) \sim -\log z \ \text{as } z\to 0 \\
K_0(z) \sim (\text{const})\,\dfrac{e^{-z}}{\sqrt{z}} \ \text{as } z\to\infty.
\end{cases}
\]
The constant $A$ can be determined using the divergence theorem: denoting the disk of radius $\epsilon$ about $x$ by $B_\epsilon$,
\[
1 = \lim_{\epsilon\to 0+}\int_{B_\epsilon}\delta(y-x)\,dy = -c^2\lim_{\epsilon\to 0+}\int_{B_\epsilon}\Delta\tilde{G}\,dy = -c^2\lim_{\epsilon\to 0+}\int_{r=\epsilon}\frac{\partial\tilde{G}}{\partial n}\,dS
\]
\[
= c^2A\,\frac{s}{c}\lim_{\epsilon\to 0+}\int_{r=\epsilon}\frac{c}{sr}\,dS = 2\pi c^2A,
\]
so $A = (2\pi c^2)^{-1}$, and
\[
\tilde{G} = \frac{1}{2\pi c^2}K_0\!\left(\frac{s}{c}r\right).
\]
It turns out we can invert the Laplace transform on the Bessel function explicitly. In fact, for $b>0$, we have
\[
\mathcal{L}\left(\frac{1}{\sqrt{t^2-b^2}}H(t-b)\right) = K_0(bs),
\]
and so
\[
G = \frac{1}{2\pi c^2}\,\frac{1}{\sqrt{\sigma^2 - r^2/c^2}}\,H(\sigma - r/c)
\]
and replacing $\sigma = t-\tau$, $r = |y-x|$, our two-dimensional free-space Green's function for the wave equation is
\[
G(x,t;y,\tau) = \frac{1}{2\pi c^2}\,\frac{1}{\sqrt{(t-\tau)^2 - \frac{1}{c^2}|y-x|^2}}\,H\!\left(t-\tau-\frac{1}{c}|y-x|\right).
\]
Remark: As with the 1D wave equation, the fact that the Green's function is supported inside the cone $|y-x|\leq c(t-\tau)$ shows that signals propagate with speed no greater than $c$. Notice also, as in the 1D case, signals do not propagate sharply in 2D, since for fixed $x$, $y$, $\tau$, the Green's function is non-zero for all times $t$ beyond $\tau + \frac{1}{c}|y-x|$ (though it does decay like $1/t$).

It is left as an exercise to use this expression for the free-space Green's function to show that the solution of the Cauchy problem
\[
\begin{cases}
u_{tt} = c^2\Delta u, & x\in\mathbb{R}^2,\ t>0 \\
u(x,0) = u_0(x), \quad u_t(x,0) = v_0(x),
\end{cases}
\]
for the 2D wave equation, is
\[
u(x,t) = \frac{1}{2\pi c^2}\left[\int_{|y-x|\leq ct}\frac{v_0(y)}{\sqrt{t^2 - \frac{1}{c^2}|y-x|^2}}\,dy + \frac{\partial}{\partial t}\int_{|y-x|\leq ct}\frac{u_0(y)}{\sqrt{t^2 - \frac{1}{c^2}|y-x|^2}}\,dy\right],
\]
sometimes known as Poisson's formula.
3D free-space Green's function for the wave equation:

We again look for a Green's function depending on the spatial variable $y$ only through $r := |y-x|$, leading to the problem
\[
\begin{cases}
G_{\sigma\sigma} - c^2\left(G_{rr} + \dfrac{2}{r}G_r\right) = \delta(\sigma)\,\delta(y-x), & r>0,\ \sigma>0 \\
G\equiv 0, & \sigma<0.
\end{cases}
\]
Taking again the Laplace transform in the variable $\sigma$, we are led to
\[
s^2\tilde{G} - c^2\left(\tilde{G}_{rr} + \frac{2}{r}\tilde{G}_r\right) = \delta(y-x).
\]
For $r>0$, the ODE
\[
\tilde{G}_{rr} + \frac{2}{r}\tilde{G}_r - \frac{s^2}{c^2}\tilde{G} = 0
\]
is solved by (we've seen this before in a homework problem)
\[
\tilde{G} = A\,\frac{e^{-\frac{s}{c}r}}{r}.
\]
Again, we find the constant $A$ using the divergence theorem. Let $B_\epsilon$ be the ball of radius $\epsilon$ about $x$. Then
\[
1 = \lim_{\epsilon\to 0+}\int_{B_\epsilon}\delta(y-x)\,dy = -c^2\lim_{\epsilon\to 0+}\int_{B_\epsilon}\Delta\tilde{G}\,dy = -c^2\lim_{\epsilon\to 0+}\int_{r=\epsilon}\frac{\partial\tilde{G}}{\partial n}\,dS
\]
\[
= -c^2\lim_{\epsilon\to 0+}\int_{r=\epsilon}A\left(-\frac{s}{c}\frac{e^{-\frac{s}{c}r}}{r} - \frac{e^{-\frac{s}{c}r}}{r^2}\right)dS = 4\pi c^2A,
\]
so $A = (4\pi c^2)^{-1}$, and
\[
\tilde{G} = \frac{1}{4\pi c^2}\,\frac{e^{-\frac{s}{c}r}}{r}.
\]
Now notice that for $b>0$,
\[
\mathcal{L}(\delta(t-b)) = \int_0^\infty e^{-st}\delta(t-b)\,dt = e^{-bs},
\]
and so by comparison,
\[
G = \frac{1}{4\pi c^2r}\,\delta(\sigma - r/c).
\]
Reinstating $\sigma = t-\tau$, we have the 3D free-space Green's function for the wave equation,
\[
G(x,t;y,\tau) = \frac{1}{4\pi c^2|y-x|}\,\delta\!\left(t-\tau-\frac{1}{c}|y-x|\right).
\]
Remark: Notice that the Green's function is supported exactly on the cone $|y-x| = c(t-\tau)$ (sketch a graph). So not only do signals propagate with speed $c$, they are also "crisp" in 3D: standing at $y$, you receive a signal emitted at time $\tau=0$ and point $x$ exactly at time $t = |y-x|/c$, and then it is gone! It's nice to live in 3D.

Again, it is left as an exercise to use this expression for the free-space Green's function to show that the solution of the Cauchy problem
\[
\begin{cases}
u_{tt} = c^2\Delta u, & x\in\mathbb{R}^3,\ t>0 \\
u(x,0) = u_0(x), \quad u_t(x,0) = v_0(x),
\end{cases}
\]
for the 3D wave equation, is
\[
u(x,t) = \frac{1}{4\pi c^2}\left[\frac{1}{t}\int_{|y-x|=ct}v_0(y)\,dS(y) + \frac{\partial}{\partial t}\left(\frac{1}{t}\int_{|y-x|=ct}u_0(y)\,dS(y)\right)\right],
\]
sometimes known as Kirchhoff's formula. Note again the crisp signal propagation, reflected in the fact that the solution at $(x,t)$ is given in terms of integrals of the data over the sphere of radius $ct$ about $x$.
B. VARIATIONAL METHODS

I. EIGENVALUE PROBLEMS

Let $D\subset\mathbb{R}^n$ be a bounded domain, and consider the following variable-coefficient generalizations of the heat equation
\[
\begin{cases}
r(x)u_t = \nabla\cdot[p(x)\nabla u] - q(x)u, & x\in D,\ t>0 \\
\dfrac{\partial u}{\partial n} + bu = 0, & x\in\partial D
\end{cases}
\tag{35}
\]
and the wave equation
\[
\begin{cases}
r(x)u_{tt} = \nabla\cdot[p(x)\nabla u] - q(x)u, & x\in D,\ t>0 \\
\dfrac{\partial u}{\partial n} + bu = 0, & x\in\partial D
\end{cases}
\tag{36}
\]
for (smooth) functions $p(x)>0$ (the spatially variable diffusion rate in the first case, wave speed in the second), $r(x)>0$, and $q(x)$ (a source of heat loss in the first case, friction in the second), and constant $b\geq 0$ (notice the boundary conditions are a mix of Dirichlet and Neumann for the moment).

Suppose for some $\lambda\geq 0$, we seek solutions of (35) of the separated-variables form $u(x,t) = e^{-\lambda t}\phi(x)$, or solutions of (36) of the (oscillatory in time) form $u(x,t) = e^{i\sqrt{\lambda}t}\phi(x)$. Then in both cases, as an easy substitution will show, we arrive at the following problem for $\phi(x)$, involving only the variable $x$:
\[
\begin{cases}
L\phi(x) := -\nabla\cdot[p(x)\nabla\phi] + q(x)\phi = \lambda r(x)\phi & \text{in } D \\
\dfrac{\partial\phi}{\partial n} + b\phi = 0 & \text{on } \partial D.
\end{cases}
\tag{37}
\]
Problem (37) is an eigenvalue problem for the (differential) operator $L$. If there is a non-zero solution $\phi(x)$ for some $\lambda\in\mathbb{R}$, we call $\phi$ an eigenfunction, and $\lambda$ the corresponding eigenvalue.

We will be studying the eigenvalue problem (37) for the next few lectures.

1. Basic properties of eigenvalues and eigenfunctions

We begin by listing the fundamental properties of the eigenvalues and eigenfunctions of problem (37). Our basic assumptions: $D$ is a smooth, bounded, connected region in $\mathbb{R}^n$; $p$, $q$, and $r$ are smooth functions on $\bar{D}$, with $p(x)>0$ and $r(x)>0$.

1. The eigenvalues form an infinite sequence, tending to plus infinity:
\[
\lambda_1\leq\lambda_2\leq\lambda_3\leq\cdots, \qquad \lambda_j\to\infty \ \text{as } j\to\infty.
\]
The multiplicity of an eigenvalue $\lambda$ is the dimension of the subspace of eigenfunctions corresponding to $\lambda$, that is, the maximum number of linearly independent solutions of $L\phi = \lambda r\phi$. It is always finite. Convention: we will incorporate multiplicity in our ordered list of eigenvalues above by repeating each eigenvalue according to its multiplicity.
2. Eigenfunctions corresponding to different eigenvalues are orthogonal in the sense:
\[
\lambda_j\neq\lambda_k \implies (\phi_j,\phi_k)_r := \int_D\phi_j(x)\phi_k(x)\,r(x)\,dx = 0.
\]
This means that by normalizing, we may (and will!) assume that the eigenfunctions $\phi_j$ corresponding to the eigenvalues $\lambda_j$ form an orthonormal set:
\[
(\phi_j,\phi_k)_r = \int_D\phi_j(x)\phi_k(x)\,r(x)\,dx = \delta_{jk} = \begin{cases}1 & j=k \\ 0 & j\neq k\end{cases}
\]
(for eigenvalues of multiplicity greater than 1, we can choose an orthonormal set of eigenfunctions using the Gram-Schmidt orthogonalization procedure).

3. The eigenfunctions form a complete set. This means that any function $u$ which is continuous on $\bar{D}$ can be expanded in terms of the eigenfunctions:
\[
u(x) = \sum_{j=1}^{\infty}c_j\phi_j(x), \qquad c_j = (\phi_j,u)_r = \int_D\phi_j(x)u(x)r(x)\,dx
\]
in the sense
\[
\lim_{N\to\infty}\left\|u(x) - \sum_{j=1}^{N}c_j\phi_j(x)\right\| = 0
\]
where $\|\cdot\|$ denotes the $L^2$ norm, defined by $\|f\|^2 := \int_D(f(x))^2\,dx$.

4. If $q(x)\geq 0$, then $\lambda_1\geq 0$ (and hence $\lambda_j\geq 0$ for all $j$), and $\lambda_1 = 0$ only if $q\equiv 0$, $b=0$, and $\phi_1\equiv$ constant.

Proofs of properties 1 (existence of eigenfunctions) and 3 (completeness of eigenfunctions) are beyond the scope of this course, although we will sketch some ideas a bit later on. Properties 2 and 4 are easy to prove, and we will do that here.

First, notice that $L$ is self-adjoint. That is, if $u$ and $v$ are smooth functions on $D$ both satisfying the boundary conditions $\frac{\partial u}{\partial n} + bu = \frac{\partial v}{\partial n} + bv = 0$ on $\partial D$, then using
the divergence theorem,
\[
(v,Lu) - (Lv,u) = \int_D\left[v(x)(Lu)(x) - u(x)(Lv)(x)\right]dx
\]
\[
= \int_D\left[-v\nabla\cdot(p\nabla u) + qvu + u\nabla\cdot(p\nabla v) - quv\right]dx = \int_D\nabla\cdot\left(p(u\nabla v - v\nabla u)\right)dx = \int_{\partial D}p\left(u\frac{\partial v}{\partial n} - v\frac{\partial u}{\partial n}\right)dS
\]
\[
= b\int_{\partial D}\left[p(-uv + uv)\right]dS = 0.
\]
Proof of property 2, orthogonality of eigenfunctions with different eigenvalues: using $L\phi_j = \lambda_jr\phi_j$, $L\phi_k = \lambda_kr\phi_k$, the fact that $\phi_j$ and $\phi_k$ satisfy the BCs, and the self-adjointness of $L$, we have
\[
(\lambda_j - \lambda_k)(\phi_j,\phi_k)_r = (\lambda_jr\phi_j,\phi_k) - (\phi_j,\lambda_kr\phi_k) = (L\phi_j,\phi_k) - (\phi_j,L\phi_k) = 0.
\]
Hence if $\lambda_j\neq\lambda_k$, then $(\phi_j,\phi_k)_r = 0$.
To prove property 4, we first derive a simple, but important identity: if $u$ is a smooth function satisfying the BCs, then
\[
(u,Lu) = \int_Du(x)\left[-\nabla\cdot(p(x)\nabla u(x)) + q(x)u(x)\right]dx
\]
\[
= \int_D\left[-\nabla\cdot(u(x)p(x)\nabla u(x)) + p(x)|\nabla u(x)|^2 + q(x)u^2(x)\right]dx
\]
\[
= \int_D\left[p(x)|\nabla u(x)|^2 + q(x)u^2(x)\right]dx - \int_{\partial D}p(x)u(x)\frac{\partial u}{\partial n}(x)\,dS(x)
\]
\[
= \int_D\left[p(x)|\nabla u(x)|^2 + q(x)u^2(x)\right]dx + b\int_{\partial D}p(x)u^2(x)\,dS(x).
\tag{38}
\]
Proof of property 4, positivity of the first eigenvalue for positive $q$: apply the above identity to the eigenfunction $\phi_1$ corresponding to the first eigenvalue $\lambda_1$:
\[
\lambda_1 = \lambda_1(\phi_1,\phi_1)_r = (\phi_1,\lambda_1r\phi_1) = (\phi_1,L\phi_1) = \int_D\left[p(x)|\nabla\phi_1(x)|^2 + q(x)\phi_1^2(x)\right]dx + b\int_{\partial D}p(x)\phi_1^2(x)\,dS(x) \geq 0
\]
since $p>0$, $b\geq 0$, and assuming $q\geq 0$. And the only way to get $\lambda_1 = 0$ is if $\phi_1\equiv$ constant, $q\equiv 0$, and $b=0$.
2. The energy, and variational principles for eigenvalues.

The identity (38) turns out to be a very important and useful one. We will specialize for now to the two most common boundary conditions:

Neumann BCs: $b=0$, i.e. $\frac{\partial u}{\partial n} = 0$ on $\partial D$;

Dirichlet BCs: "$b=\infty$", i.e. $u=0$ on $\partial D$.

In both of these cases, the boundary integral in (38) vanishes. Let's give this quantity a name.

Definition: Given the operator $L$ (i.e. the functions $p$ and $q$), the (Dirichlet) energy of a function $u$ on $D$ is
\[
E(u) := \int_D\left[p(x)|\nabla u(x)|^2 + q(x)u^2(x)\right]dx.
\]
Now suppose $u$ is any (smooth) function satisfying the boundary conditions (Dirichlet or Neumann). By completeness of the eigenfunctions, we may write
\[
u(x) = \sum_{j=1}^{\infty}c_j\phi_j(x),
\]
and notice that
\[
(u,u)_r = \left(\sum_{j=1}^{\infty}c_j\phi_j,\ \sum_{k=1}^{\infty}c_k\phi_k\right)_r = \sum_{j,k=1}^{\infty}c_jc_k(\phi_j,\phi_k)_r = \sum_{j,k=1}^{\infty}c_jc_k\delta_{jk} = \sum_{j=1}^{\infty}c_j^2.
\]
Now, as computed in (38),
\[
E(u) = (u,Lu) = \sum_{j,k=1}^{\infty}c_jc_k(\phi_j,L\phi_k) = \sum_{j,k=1}^{\infty}c_jc_k(\phi_j,\lambda_kr\phi_k) = \sum_{j,k=1}^{\infty}c_jc_k\lambda_k\delta_{jk} = \sum_{k=1}^{\infty}\lambda_kc_k^2.
\]
And since $\lambda_k\geq\lambda_1$ for all $k$,
\[
E(u) \geq \lambda_1\sum_{k=1}^{\infty}c_k^2 = \lambda_1(u,u)_r,
\]
or
\[
\lambda_1 \leq \frac{E(u)}{(u,u)_r},
\]
and we see that equality holds here only if the only non-zero coefficients $c_j$ are those corresponding to the lowest eigenvalue (i.e. $\lambda_j>\lambda_1 \implies c_j = 0$). That is, equality holds if and only if $u$ is an eigenfunction corresponding to the lowest eigenvalue $\lambda_1$. This observation gives us our first example of a variational principle:

Theorem: [variational principle for the first eigenvalue]: the lowest eigenvalue of the operator $L$ (with Dirichlet or Neumann BCs) is given by
\[
\lambda_1 = \min_{u \text{ satisfying BCs}}\frac{E(u)}{(u,u)_r} = \min_{u \text{ satisfying BCs}}\frac{\int_D\left[p(x)|\nabla u(x)|^2 + q(x)u^2(x)\right]dx}{\int_Du^2(x)r(x)\,dx}.
\]
Remark: The quantity $\frac{E(u)}{(u,u)_r}$ is sometimes called the Rayleigh quotient.

Here is a simple example of how a variational principle like this might be used.

Example: Find an upper bound for the lowest eigenvalue (with $r\equiv 1$) of $L = -\frac{d^2}{dx^2} + \epsilon x$ on the interval $[0,1]$ with Dirichlet (zero) BCs at the endpoints.

We have $D = [0,1]$, $r(x)\equiv 1$, $p(x)\equiv 1$, $q(x) = \epsilon x$. According to the variational principle, any function $u(x)$ on $[0,1]$ which is zero at the endpoints gives us an upper bound on $\lambda_1$: $\lambda_1\leq E(u)/(u,u)$.

1. One simple such choice is $u(x) = x(1-x) = x-x^2$. Then $u'(x) = 1-2x$, and we compute
\[
(u,u) = \int_0^1(x-x^2)^2dx = \int_0^1(x^2-2x^3+x^4)\,dx = \left(\frac{x^3}{3} - \frac{x^4}{2} + \frac{x^5}{5}\right)\Big|_0^1 = \frac{1}{30}
\]
and
\[
E(u) = \int_0^1\left[(1-2x)^2 + \epsilon x(x-x^2)^2\right]dx = \int_0^1\left[\epsilon(x^5-2x^4+x^3) + 4x^2-4x+1\right]dx = \frac{1}{3} + \frac{\epsilon}{60},
\]
and hence
\[
\lambda_1 \leq \frac{E(u)}{(u,u)} = 10 + \frac{\epsilon}{2}.
\]
2. A little more thought will give us a better (i.e. smaller) upper bound, supposing $\epsilon$ is small. For $\epsilon = 0$, we know the (Dirichlet) eigenfunctions of $-\frac{d^2}{dx^2}$ are $\sin(k\pi x)$, $k = 1,2,\ldots$, with eigenvalues $\pi^2k^2$. Let's use the lowest $\epsilon = 0$ eigenfunction
$u(x) = \sqrt{2}\sin(\pi x)$ (notice we have normalized it so $(u,u) = 1$) as our test function in the variational principle:
\[
\lambda_1 \leq \frac{E(u)}{(u,u)} = \int_0^1\left[(\sqrt{2}\,\pi\cos(\pi x))^2 + \epsilon x(\sqrt{2}\sin(\pi x))^2\right]dx = \pi^2 + 2\epsilon\int_0^1x\sin^2(\pi x)\,dx = \pi^2 + \frac{\epsilon}{2},
\]
which is indeed better (since $\pi^2 < 10$), and in particular becomes exact in the limit $\epsilon\to 0$.
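Both bounds are easy to reproduce symbolically. Here is a quick sketch using SymPy (the trial functions are the two from the example; the helper name is ours):

```python
import sympy as sp

x, eps = sp.symbols('x epsilon', positive=True)

def rayleigh(u):
    """Rayleigh quotient E(u)/(u,u) for L = -d^2/dx^2 + eps*x on [0,1]."""
    E = sp.integrate(sp.diff(u, x)**2 + eps * x * u**2, (x, 0, 1))
    return sp.simplify(E / sp.integrate(u**2, (x, 0, 1)))

print(rayleigh(x * (1 - x)))        # 10 + epsilon/2
print(rayleigh(sp.sin(sp.pi * x)))  # pi**2 + epsilon/2
```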
We can easily generalize the above variational principle to the higher eigenvalues. Suppose $u$ is a function on $D$ satisfying the boundary conditions, and which is orthogonal to the first $n-1$ eigenfunctions:
\[
(u,\phi_j)_r = 0, \qquad j = 1,2,\ldots,n-1.
\]
Then $c_1 = c_2 = \cdots = c_{n-1} = 0$, so
\[
u(x) = \sum_{j=n}^{\infty}c_j\phi_j(x), \qquad (u,u)_r = \sum_{j=n}^{\infty}c_j^2
\]
and
\[
E(u) = \sum_{j=n}^{\infty}\lambda_jc_j^2 \geq \lambda_n\sum_{j=n}^{\infty}c_j^2 = \lambda_n(u,u)_r
\]
with equality if and only if $c_j = 0$ for all $j$ with $\lambda_j>\lambda_n$, i.e., if and only if $u$ is an eigenfunction with eigenvalue $\lambda_n$. Hence:

Theorem: [variational principle for higher eigenvalues]: the $n$-th eigenvalue of the operator $L$ (with Dirichlet or Neumann BCs) is given by
\[
\lambda_n = \min_{u \text{ satisfying BCs},\ (u,\phi_1)_r = \cdots = (u,\phi_{n-1})_r = 0}\frac{E(u)}{(u,u)_r}.
\]
Remark: Our variational principles apply also for the more general boundary conditions $\frac{\partial u}{\partial n} + bu = 0$ on $\partial D$, provided we use the full form of the energy from (38):
\[
E(u) = \int_D\left(p|\nabla u|^2 + qu^2\right)dx + b\int_{\partial D}pu^2\,dS.
\]
In practice, the variational principle above for higher eigenvalues ($n>1$) is not so useful, since we typically do not know what the eigenfunctions $\phi_1,\ldots,\phi_{n-1}$ are! A
more useful version is the following, in which we minimize over functions orthogonal to any set of $n-1$ functions (not necessarily the eigenfunctions), and then maximize over that set.

Theorem: [Courant max-min principle]
\[
\lambda_n = \max_{\text{functions } m_1,\ldots,m_{n-1}}\left[\min_{u \text{ satisfying BCs},\ (u,m_j)_r = 0}\frac{E(u)}{(u,u)_r}\right]
\]
Proof: let $m_1,m_2,\ldots,m_{n-1}$ be any functions on $D$. We will build a function $u$, satisfying the BCs, and which is orthogonal to all the $m_k$, of the form
\[
u(x) = \sum_{j=1}^{n}a_j\phi_j(x).
\]
That is, we want
\[
0 = (u,m_k)_r = \sum_{j=1}^{n}a_j(\phi_j,m_k)_r, \qquad k = 1,2,\ldots,n-1.
\]
This is $n-1$ linear equations in the $n$ variables $a_1,a_2,\ldots,a_n$. Since there are more variables than equations, linear algebra tells us there is a non-zero solution (in fact, at least a one-parameter family of solutions). Now as above,
\[
E(u) = (u,Lu) = \sum_{j=1}^{n}\lambda_ja_j^2 \leq \lambda_n\sum_{j=1}^{n}a_j^2 = \lambda_n(u,u)_r,
\]
hence
\[
\min_{u \text{ satisfying BCs},\ (u,m_j)_r = 0}\frac{E(u)}{(u,u)_r} \leq \lambda_n.
\]
But since this is true for any choice of $m_1,\ldots,m_{n-1}$, it is true for the maximum over all such choices:
\[
\max_{\text{functions } m_1,\ldots,m_{n-1}}\left[\min_{u \text{ satisfying BCs},\ (u,m_j)_r = 0}\frac{E(u)}{(u,u)_r}\right] \leq \lambda_n.
\]
It remains to show that the maximum actually equals $\lambda_n$; that is, for some choice of $m_1,\ldots,m_{n-1}$, and for any $u$ which is orthogonal to all the $m_j$, $E(u)/(u,u)_r\geq\lambda_n$. For this, we take $m_j = \phi_j$ for $j = 1,\ldots,n-1$. Then any $u$ orthogonal to these functions can be written
\[
u(x) = \sum_{j=n}^{\infty}c_j\phi_j(x),
\]
and, as above,
\[
E(u) = \sum_{j=n}^{\infty}\lambda_jc_j^2 \geq \lambda_n\sum_{j=n}^{\infty}c_j^2 = \lambda_n(u,u)_r
\]
as required. $\square$

Important remark: It turns out that in the case of Neumann BCs, the variational principles for eigenvalues discussed above (including the max-min principle) remain true if the minimization is done over all test functions $u$; i.e. we do not have to impose the Neumann BCs on the functions $u$ inserted into the Rayleigh quotient. In the case of Dirichlet BCs, however, the test functions $u$ do indeed need to vanish on the boundary. We will return to this point when we discuss the calculus of variations.
3. Bounds on eigenvalues.

We fix here Dirichlet (zero) BCs, and, as always, denote by $\lambda_n$ the $n$-th (Dirichlet) eigenvalue of $L = -\nabla\cdot p(x)\nabla + q(x)$ (with respect to $r(x)$).

Bounds on coefficients

Suppose
\[
\begin{cases}
0 < p_{\min} \leq p(x) \leq p_{\max} \\
q_{\min} \leq q(x) \leq q_{\max} \\
0 < r_{\min} \leq r(x) \leq r_{\max}
\end{cases}
\]
in our domain $D$. Denote by $\lambda_{n,\min}$ the $n$-th eigenvalue of
\[
\begin{cases}
-\nabla\cdot(p_{\min}\nabla\phi) + q_{\min}\phi = \lambda_{n,\min}\,r_{\max}\,\phi & \text{in } D \\
\phi = 0 & \text{on } \partial D,
\end{cases}
\]
and by $\lambda_{n,\max}$ the $n$-th eigenvalue of
\[
\begin{cases}
-\nabla\cdot(p_{\max}\nabla\phi) + q_{\max}\phi = \lambda_{n,\max}\,r_{\min}\,\phi & \text{in } D \\
\phi = 0 & \text{on } \partial D.
\end{cases}
\]
Theorem: $\lambda_{n,\min} \leq \lambda_n \leq \lambda_{n,\max}$.

Proof: notice that for any function $u$ on $D$,
\[
E(u) = \int_D\left(p|\nabla u|^2 + qu^2\right)dx \leq \int_D\left(p_{\max}|\nabla u|^2 + q_{\max}u^2\right)dx =: E_{\max}(u)
\]
and
\[
(u,u)_r = \int_Du^2r\,dx \geq \int_Du^2r_{\min}\,dx = (u,u)_{r_{\min}},
\]
so that
\[
\frac{E(u)}{(u,u)_r} \leq \frac{E_{\max}(u)}{(u,u)_{r_{\min}}}.
\]
Also notice that for any functions $m_1,\ldots,m_{n-1}$ on $D$,
\[
(u,m_j)_r = 0 \ \text{if and only if} \ (u,\tilde{m}_j)_{r_{\min}} = 0, \ \text{where } \tilde{m}_j(x) := \frac{r(x)}{r_{\min}}m_j(x).
\]
So by the max-min principle,
\[
\lambda_n = \max_{m_1,\ldots,m_{n-1}}\ \min_{\{u=0 \text{ on }\partial D,\ (u,m_j)_r = 0\}}\frac{E(u)}{(u,u)_r} \leq \max_{\tilde{m}_1,\ldots,\tilde{m}_{n-1}}\ \min_{\{u=0 \text{ on }\partial D,\ (u,\tilde{m}_j)_{r_{\min}} = 0\}}\frac{E_{\max}(u)}{(u,u)_{r_{\min}}} = \lambda_{n,\max}.
\]
The other inequality is really the same thing.

Example: Find upper and lower bounds for the $n$-th eigenvalue of $L = -\frac{d^2}{dx^2} + \epsilon x^2$ on $[0,1]$ with Dirichlet (zero) BCs. Here $\epsilon\geq 0$.

So here $D = [0,1]$, $p\equiv 1$, $r\equiv 1$, $q(x) = \epsilon x^2$. Let's use the easy bounds
\[
q_{\min} := 0 \leq q(x) \leq \epsilon =: q_{\max} \quad\text{on } [0,1].
\]
So $\lambda_n$ is sandwiched between the $n$-th (Dirichlet) eigenvalue of $-\frac{d^2}{dx^2}$, and that of $-\frac{d^2}{dx^2} + \epsilon$. That is,
\[
\pi^2n^2 \leq \lambda_n \leq \pi^2n^2 + \epsilon.
\]
Notice that we get a better (i.e. smaller) upper bound for the first eigenvalue by taking $\sin(\pi x)$ (the first Dirichlet eigenfunction of $-\frac{d^2}{dx^2}$) as a test function in the variational principle (and using $\int_0^1x^2\sin^2(\pi x)\,dx = \frac{1}{6} - \frac{1}{4\pi^2}$):
\[
\lambda_1 \leq \frac{E(\sin(\pi x))}{(\sin(\pi x),\sin(\pi x))} = \frac{\int_0^1\left[\pi^2\cos^2(\pi x) + \epsilon x^2\sin^2(\pi x)\right]dx}{\int_0^1\sin^2(\pi x)\,dx} = \frac{\pi^2/2 + \epsilon\left(\frac{1}{6} - \frac{1}{4\pi^2}\right)}{1/2} = \pi^2 + \epsilon\left(\frac{1}{3} - \frac{1}{2\pi^2}\right).
\]
Bounds on geometry

Now let's fix the coefficients $p$, $q$, and $r$, and make the dependence on the domain $D$ explicit; that is, denote by $\lambda_n(D)$ the $n$-th eigenvalue of our operator $L = -\nabla\cdot p\nabla + q$ on $D$ with Dirichlet BCs.

Theorem:
\[
\tilde{D}\subset D \implies \lambda_n(D) \leq \lambda_n(\tilde{D}).
\]
That is, the smaller the domain, the larger the eigenvalue.

Proof: Again, we'll use the max-min principle. Let $m_1,\ldots,m_{n-1}$ be any given functions on $D$. Notice that if $u$ is a function on $\tilde{D}$ which vanishes on $\partial\tilde{D}$ and satisfies (with integrals restricted to $\tilde{D}$) $(m_1,u)_{r,\tilde{D}} = \cdots = (m_{n-1},u)_{r,\tilde{D}} = 0$ (i.e. $u$ is an admissible test function for the Rayleigh quotient in the max-min principle for $\tilde{D}$), then its extension
\[
\bar{u}(x) := \begin{cases}u(x) & x\in\tilde{D} \\ 0 & x\in D\setminus\tilde{D}\end{cases}
\]
is a function on $D$, vanishing on the boundary $\partial D$, and satisfying (with integrals now over $D$) $(m_1,\bar{u})_{r,D} = \cdots = (m_{n-1},\bar{u})_{r,D} = 0$ (i.e. $\bar{u}$ is an admissible test function for the Rayleigh quotient in the max-min principle for $D$). Further (denoting the region of integration with a subscript),
\[
E_D(\bar{u}) = E_{\tilde{D}}(u), \qquad (\bar{u},\bar{u})_{r,D} = (u,u)_{r,\tilde{D}}.
\]
So
\[
\min_{\{\bar{u}=0 \text{ on }\partial D,\ (m_j,\bar{u})_r = 0\}}\frac{E(\bar{u})}{(\bar{u},\bar{u})_r} \leq \min_{\{\bar{u}\equiv 0 \text{ in }D\setminus\tilde{D},\ (m_j,\bar{u})_r = 0\}}\frac{E(\bar{u})}{(\bar{u},\bar{u})_r} \qquad\text{(minimizing over a smaller set)}
\]
\[
= \min_{\{u=0 \text{ on }\partial\tilde{D},\ (m_j,u)_r = 0\}}\frac{E(u)}{(u,u)_r} \leq \max_{\{\text{fns. }m_1,\ldots,m_{n-1}\text{ on }\tilde{D}\}}\ \min_{\{u=0 \text{ on }\partial\tilde{D},\ (m_j,u)_r = 0\}}\frac{E(u)}{(u,u)_r} = \lambda_n(\tilde{D}).
\]
Since this inequality holds for all choices of $m_1,\ldots,m_{n-1}$ (functions on $D$), maximizing over all such choices yields $\lambda_n(D)\leq\lambda_n(\tilde{D})$, as required.
Example: Find upper and lower bounds for the Dirichlet eigenvalues of $-\Delta$ on the following 2D domain, which is the union of 2 rectangles: $D := ([0,a]\times[0,b])\cup([a,a+h]\times[c,c+g])$ with $a,b,c,h,g$ positive, and $0<c<c+g<b$ (draw it!).

Well, we have
\[
[0,a]\times[0,b] =: D_{\min} \subset D \subset D_{\max} := [0,a+h]\times[0,b].
\]
By separation of variables, we know the Dirichlet eigenvalues of $-\Delta$ in a rectangle $[0,a]\times[0,b]$ are $\pi^2(n^2/a^2 + m^2/b^2)$, $m,n = 1,2,3,\ldots$ (with eigenfunctions $\sin(n\pi x_1/a)\sin(m\pi x_2/b)$). It is a bit annoying, though, to notate them. One way is to denote
\[
\mu_k(a,b) := k\text{-th element of the set } \{n^2/a^2 + m^2/b^2 \mid n,m = 1,2,3,\ldots\}
\]
ordered by size, and counting multiplicity (so, e.g., $\mu_1(1,1) = 2$, $\mu_2(1,1) = 5$, $\mu_3(1,1) = 5$, $\mu_4(1,1) = 8$, etc.). In this notation, $\lambda_n([0,a]\times[0,b]) = \pi^2\mu_n(a,b)$, and so we can conclude for the original problem
\[
\pi^2\mu_n(a+h,b) \leq \lambda_n \leq \pi^2\mu_n(a,b).
\]
In particular, for the first eigenvalue,
\[
\pi^2\left(\frac{1}{(a+h)^2} + \frac{1}{b^2}\right) \leq \lambda_1 \leq \pi^2\left(\frac{1}{a^2} + \frac{1}{b^2}\right).
\]
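The ordered list $\mu_k(a,b)$ is awkward to tabulate by hand but trivial by machine. A minimal sketch (the truncation nmax is our own choice and just needs to be large enough for the $k$ requested):

```python
import numpy as np

def mu(k, a, b, nmax=60):
    """k-th smallest value of n^2/a^2 + m^2/b^2 (n, m >= 1), with multiplicity."""
    n = np.arange(1, nmax + 1)
    vals = np.sort((n[:, None]**2 / a**2 + n[None, :]**2 / b**2).ravel())
    return vals[k - 1]

# mu_1(1,1) = 2, mu_2(1,1) = mu_3(1,1) = 5, mu_4(1,1) = 8
print([mu(k, 1, 1) for k in (1, 2, 3, 4)])

# bounds for the first Dirichlet eigenvalue of the two-rectangle region
a, b, h = 1.0, 1.0, 0.3
print(np.pi**2 * mu(1, a + h, b), np.pi**2 * mu(1, a, b))
```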
One more observation, using the geometric bounds established in this section. As usual, let $D\subset\mathbb{R}^n$ be a bounded domain, and let $\lambda_k(D)$ denote the $k$-th Dirichlet eigenvalue of the operator $L = -\nabla\cdot p(x)\nabla + q(x)$ on $D$. Let $p_{\min}$, $q_{\min}$, and $r_{\max}$ be constants such that $0<p_{\min}\leq p(x)$, $q_{\min}\leq q(x)$, and $r(x)\leq r_{\max}$. Since $D$ is bounded, it fits inside an $n$-dimensional (hyper-)cube $C$ of (sufficiently large) side-length $a$. Separation of variables shows easily that the Dirichlet eigenvalues (with respect to $r_{\max}$) of $L_{\min} := -p_{\min}\Delta + q_{\min}$ on $C$ are
\[
\frac{p_{\min}\pi^2}{r_{\max}a^2}(k_1^2 + \cdots + k_n^2) + \frac{q_{\min}}{r_{\max}}, \qquad k_1,\ldots,k_n = 1,2,3,\ldots.
\]
In particular, these eigenvalues go off to infinity: $\lambda_{\min,k}(C)\to\infty$ as $k\to\infty$. Then our eigenvalue bounds above imply that the Dirichlet eigenvalues $\lambda_k(D)$ of $L$ on $D$ satisfy
\[
\lambda_k(D) \geq \lambda_{\min,k}(C) \to \infty, \qquad k\to\infty,
\]
and so
\[
\lim_{k\to\infty}\lambda_k(D) = \infty.
\]
This was a general property of the eigenvalues we listed at the beginning, of which our eigenvalue bounds (based, in turn, on variational principles) give a nice, slick proof.
II. CALCULUS OF VARIATIONS

1. Euler-Lagrange equations.

We begin with an example.

Example: Let $D\subset\mathbb{R}^n$ be a bounded domain. Let $q(x)$ and $g(x)$ be given smooth functions on $D$ and $\partial D$ respectively. Among all functions $u(x)$ on $D$ which are sufficiently smooth (say, twice continuously differentiable) and satisfy $u(x) = g(x)$ for $x\in\partial D$, find the one which minimizes the functional
\[
I(u) := \int_D\left[\frac{1}{2}|\nabla u(x)|^2 + q(x)u(x)\right]dx.
\]
Well, suppose $I(u)$ is minimized by a function $u_*$ (with $u_* = g$ on $\partial D$). Consider now a one-parameter family of functions nearby $u_*$:
\[
u_\epsilon(x) := u_*(x) + \epsilon w(x)
\]
where $w$ is some fixed function. In order for $u_\epsilon$ to be in our allowed class of functions (twice continuously differentiable and equal to $g$ on the boundary), we require that $w$ also be twice continuously differentiable, and $w = 0$ on $\partial D$. Key observation:
\[
u_* \text{ minimizes } I(u) \implies I(u_\epsilon) \text{ is minimized at } \epsilon = 0.
\]
Since $I(u_\epsilon)$ is a function of just the single variable $\epsilon$, we are in the realm of simple calculus, and we know that at a minimum, the derivative is zero:
\[
0 = \frac{d}{d\epsilon}I(u_\epsilon)\Big|_{\epsilon=0} = \frac{d}{d\epsilon}\int_D\left[\frac{1}{2}|\nabla u_\epsilon(x)|^2 + q(x)u_\epsilon(x)\right]dx\,\Big|_{\epsilon=0}
\]
\[
= \int_D\left(\nabla u_*(x)\cdot\nabla w(x) + q(x)w(x)\right)dx
\]
\[
= \int_D\left(\nabla\cdot[w(x)\nabla u_*(x)] + w(x)\left[-\Delta u_*(x) + q(x)\right]\right)dx
\]
\[
= \int_Dw(x)\left[-\Delta u_*(x) + q(x)\right]dx + \int_{\partial D}w(x)\frac{\partial}{\partial n}u_*(x)\,dS(x) = \int_Dw(x)\left[-\Delta u_*(x) + q(x)\right]dx
\]
where we used the divergence theorem toward the end, and the fact that $w = 0$ on $\partial D$ (so the boundary term disappears).
To summarize: in order for $u_*$ to minimize $I(u)$ (among sufficiently smooth functions with fixed boundary values $g(x)$), the integral above must be zero for any (sufficiently smooth) function $w(x)$ vanishing on the boundary $\partial D$. It turns out that the only way this can happen is if the function multiplying $w$ in the integral is zero, that is, $-\Delta u_*(x) + q(x) = 0$, as the following lemma shows.

Lemma: if $g(x)$ is a continuous function on $D$, and if $\int_Dg(x)w(x)\,dx = 0$ for all smooth functions $w$ on $D$ vanishing on the boundary, then $g(x)\equiv 0$ on $D$.

Proof: suppose, for some $x_0\in D$, $g(x_0)\neq 0$. Since $g$ is continuous, in some ball $B$ around $x_0$, we have either $g(x)>0$ or $g(x)<0$. So let $w(x)$ be a little "bump" function at $x_0$: that is, a smooth, non-negative function, with $w(x_0)>0$, and vanishing outside $B$. Then $0 = \int_Dg(x)w(x)\,dx$ is contradicted. Hence we must have $g\equiv 0$ in $D$.

So, to conclude:
\[
u_* \text{ minimizes } I(u) \text{ with fixed boundary data} \implies \Delta u_*(x) = q(x) \text{ in } D.
\]
That is, the PDE $\Delta u_* = q$ is a necessary condition, known as the Euler-Lagrange equation for $I$, for $u_*$ to be a minimizer of $I$.

Remarks:

1. Notice that we have not touched the question of whether or not a minimizer $u_*$ exists, but merely written down a necessary condition that such a minimizer would have to satisfy. (This example, however, is simple enough that we can be sure of the existence of a minimizer; it is the solution of the Poisson equation $\Delta u_* = q$ in $D$, $u_* = g$ on $\partial D$.) The situation is completely analogous to finding the minimum of a function of several variables. There, the necessary condition is that the gradient vanish, which gives an algebraic equation to solve to find the critical points. The question of which (if any) of these minimize the function is then addressed separately. The difference is that in the calculus of variations, we are trying to minimize a function defined on an (infinite-dimensional) space of functions, rather than (finite-dimensional) $\mathbb{R}^n$, and, as our example showed, the necessary condition may be not just an algebraic equation, but in fact a differential equation.
2. Indeed, the key point here is the relation between variational problems (minimizing functionals) and the (partial) differential equations which arise as their Euler-Lagrange equations. This idea is ubiquitous in physics (and many other applications); think, for example, of the principle of minimal action in mechanics, and its relation to the equations of motion (Newton's equations). In practice, the relationship works both ways: we may try to solve a variational problem by solving the corresponding Euler-Lagrange equation; or, we may try to solve a PDE by recognizing it as the E-L equation of some variational problem, which we then try to solve directly (we got a hint of this latter approach when we discussed variational problems for eigenvalues; more on this to come).

3. A natural question follows from the example above: suppose we indeed have a solution $u_*$ of the problem
\[
\begin{cases}
\Delta u_* = q & \text{in } D \\
u_* = g & \text{on } \partial D.
\end{cases}
\]
Does it really minimize $I(u)$ among functions $u$ with boundary values $g$? For this example, in fact, it does, since for any such $u$, using the divergence theorem,
\[
I(u) - I(u_*) = \int_D\left[\frac{1}{2}|\nabla u|^2 - \frac{1}{2}|\nabla u_*|^2 + q(u-u_*)\right]dx
\]
\[
= \int_D\left[\frac{1}{2}|\nabla(u-u_*)|^2 + \nabla u_*\cdot\nabla(u-u_*) + q(u-u_*)\right]dx
\]
\[
= \int_D\left[\frac{1}{2}|\nabla(u-u_*)|^2 + (u-u_*)(-\Delta u_* + q)\right]dx + \int_{\partial D}(u-u_*)\frac{\partial u_*}{\partial n}\,dS(x)
\]
\[
= \frac{1}{2}\int_D|\nabla(u-u_*)|^2\,dx \geq 0.
\]
4. Natural BCs: consider the same minimization problem as above, except now we would like to minimize $I(u)$ among all (smooth) functions $u$ (without the condition that $u = g$ on $\partial D$). What problem should a minimizer $u_*$ solve in this case? Proceeding as above, we conclude that for $u_\epsilon = u_* + \epsilon w$,
\[
\frac{d}{d\epsilon}I(u_\epsilon)\Big|_{\epsilon=0} = 0
\]
where $w$ now is any smooth function (it does not have to vanish on $\partial D$). This leads, as above, to
\[
0 = \int_Dw(x)\left[-\Delta u_*(x) + q(x)\right]dx + \int_{\partial D}w(x)\frac{\partial}{\partial n}u_*(x)\,dS(x)
\]
(the boundary term remains). In particular, this must hold for all $w$ vanishing on the boundary, and so we conclude, as above, that the E-L equation $\Delta u_* = q$ must hold. Then in addition we require that $\int_{\partial D}w(x)\frac{\partial}{\partial n}u_*(x)\,dS(x) = 0$, again for any function $w$. This can only hold if $\frac{\partial u_*}{\partial n} = 0$ on $\partial D$. Thus
\[
\begin{cases}
\Delta u_* = q & \text{in } D \\
\dfrac{\partial u_*}{\partial n} = 0 & \text{on } \partial D.
\end{cases}
\]
These Neumann BCs are sometimes called the "natural boundary conditions" for the minimization problem, since they arise naturally from the minimization when no BCs are imposed in the problem.
Our next example generalizes our first one.

Example: Find the Euler-Lagrange equation for the problem
\[
\min_{u\in H}I(u), \qquad I(u) := \int_DF(x,u,\nabla u)\,dx, \qquad H = \{u\in C^2(D) \mid u = g \text{ on } \partial D\}
\]
(notation: $C^2(D)$ denotes the twice continuously differentiable functions on $D$).

As above, a necessary condition for $u$ to be a minimizer is, for any smooth function $w$ vanishing on $\partial D$,
\[
0 = \frac{d}{d\epsilon}I(u+\epsilon w)\Big|_{\epsilon=0} = \frac{d}{d\epsilon}\int_DF(x,u+\epsilon w,\nabla u+\epsilon\nabla w)\,dx\,\Big|_{\epsilon=0}
\]
\[
= \int_D\left(F_u(x,u,\nabla u)\,w + F_{\nabla u}(x,u,\nabla u)\cdot\nabla w\right)dx = \int_D\left(F_u(x,u,\nabla u) - \nabla\cdot F_{\nabla u}(x,u,\nabla u)\right)w\,dx
\]
(where again we used the divergence theorem, and the fact that $w$ vanishes on the boundary). So the Euler-Lagrange equation for this problem is
\[
\nabla\cdot F_{\nabla u}(x,u,\nabla u) = F_u(x,u,\nabla u).
\tag{39}
\]
(Note that $F_{\nabla u}$ is a gradient in the $\nabla u$ variable; that is, it is a vector field.) As a special case, consider the previous example above, which corresponds to $F(x,u,\nabla u) = \frac{1}{2}|\nabla u|^2 + q(x)u$, so that $F_u = q$, $F_{\nabla u} = \nabla u$, $\nabla\cdot F_{\nabla u} = \Delta u$, and the Euler-Lagrange equation is $\Delta u = q$.

Remark: Notice that in general, the Euler-Lagrange equation (39) is a nonlinear PDE, whereas so far this course has been concerned almost exclusively with linear problems.
2. Some further examples.

Example: (Brachistochrone problem) A bead slides (frictionlessly) down a curve from a point $P$ to a (lower) point $Q$. What curve gives the shortest travel time?

Put point $P$ at the origin of an $xy$-plane (with the positive $y$ direction pointing down, for convenience), and describe possible curves $C$ from $P$ to $Q$ by functions $y = f(x)$, with $Q$ at $(x_1,f(x_1))$ (draw a picture!). The speed of the bead along the curve is $v = ds/dt$, where $s$ denotes the arc length along the curve (and $t$ denotes time, of course), so the travel time is
\[
T = \int_Cdt = \int_C\frac{ds}{v}.
\]
The arc length differential is given by
\[
ds = \sqrt{dx^2 + dy^2} = \sqrt{dx^2 + (f'(x)dx)^2} = \sqrt{1 + (f'(x))^2}\,dx.
\]
We can determine the velocity by conservation of energy:
\[
\text{energy} = \text{kinetic} + \text{potential} = \frac{1}{2}mv^2 - mgy = \text{constant} = 0,
\]
where $m$ is the mass of the bead, and $g$ is the gravitational acceleration, so
\[
v = \sqrt{2gy} = \sqrt{2gf(x)}.
\]
So now we have an expression for the travel time
\[
T = \int_C\frac{ds}{v} = \frac{1}{\sqrt{2g}}\int_0^{x_1}\frac{\sqrt{1 + (f'(x))^2}}{\sqrt{f(x)}}\,dx.
\]
So now our variational problem is clear: among smooth functions $f(x)$ defined on $[0,x_1]$ with boundary conditions $f(0) = 0$, $f(x_1) = y_1$, minimize
\[
I(f) := \int_0^{x_1}F(f(x),f'(x))\,dx, \qquad F(f,f') := \frac{\sqrt{1+(f')^2}}{\sqrt{f}}.
\]
We can read the Euler-Lagrange equation for this problem directly off (39):
\[
0 = F_f - \frac{d}{dx}F_{f'} = -\sqrt{1+(f')^2}\,\frac{1}{2}f^{-3/2} - \frac{d}{dx}\left[f^{-1/2}(1+(f')^2)^{-1/2}f'\right]
\]
\[
= \frac{1}{2(1+(f')^2)^{3/2}f^{3/2}}\left[-(1+(f')^2)^2 + (f')^2(1+(f')^2) + 2f(f')^2f'' - 2f(1+(f')^2)f''\right]
\]
\[
= -\frac{1}{2(1+(f')^2)^{3/2}f^{3/2}}\left[(1+(f')^2) + 2ff''\right].
\]
So if there is a minimizing function $f(x)$, it should satisfy the nonlinear ODE
\[
2ff'' + (f')^2 = -1.
\]
It turns out that the cycloid (the curve traced out by a point on a circle rolling along a line), which is given parametrically by
\[
x(\theta) = R(\theta - \sin\theta), \qquad y(\theta) = R(1-\cos\theta),
\]
solves this ODE, as we now verify:
\[
f' = \frac{dy/d\theta}{dx/d\theta} = \frac{R\sin\theta}{R(1-\cos\theta)} = \frac{\sin\theta}{1-\cos\theta}
\]
\[
f'' = \frac{d}{dx}f' = \frac{[\sin\theta/(1-\cos\theta)]'}{R(1-\cos\theta)} = \frac{(1-\cos\theta)\cos\theta - \sin^2\theta}{R(1-\cos\theta)^3} = -\frac{1}{R(1-\cos\theta)^2}
\]
\[
2ff'' + (f')^2 + 1 = \frac{-2(1-\cos\theta) + (1-\cos\theta)^2 + \sin^2\theta}{(1-\cos\theta)^2} = 0.
\]
So the cycloid does indeed solve the E-L equation. Choosing $R$ and $\theta_1$ so that $(R(\theta_1-\sin\theta_1),R(1-\cos\theta_1)) = (x_1,y_1)$, we can also satisfy the boundary conditions. The question of whether or not the cycloid really minimizes the travel time is a more difficult one to answer.
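The verification is mechanical, so it is a natural thing to hand to a computer algebra system. A minimal SymPy sketch of the same computation:

```python
import sympy as sp

theta, R = sp.symbols('theta R', positive=True)
x = R * (theta - sp.sin(theta))          # cycloid, parametrically
f = R * (1 - sp.cos(theta))              # y along the cycloid

fp = sp.diff(f, theta) / sp.diff(x, theta)     # f'  = dy/dx
fpp = sp.diff(fp, theta) / sp.diff(x, theta)   # f'' = d(f')/dx

print(sp.simplify(2 * f * fpp + fp**2 + 1))    # prints 0
```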
Example: (Higher derivatives). Let $p(x)$ be a given function on a bounded domain $D$. Find the E-L equation for the minimization problem
\[
\min_{u\in C^4,\ u = \frac{\partial u}{\partial n} = 0 \text{ on }\partial D}\int_D\left[\frac{1}{2}(\Delta u(x))^2 - p(x)u(x)\right]dx,
\]
which arises in elasticity (where $u(x)$ represents the deflection of a plate $D\subset\mathbb{R}^2$ as a result of a load $p(x)$).

If $u_*$ is a minimizer, and $w(x)$ is any smooth function with $w = \frac{\partial w}{\partial n} = 0$ on $\partial D$, then
\[
0 = \frac{d}{d\epsilon}\int_D\left[\frac{1}{2}(\Delta(u_*+\epsilon w))^2 - p(u_*+\epsilon w)\right]dx\,\Big|_{\epsilon=0} = \int_D\left(\Delta u_*\,\Delta w - pw\right)dx
\]
\[
= \int_D(\Delta\Delta u_* - p)w\,dx + \int_{\partial D}\left(\Delta u_*\frac{\partial w}{\partial n} - \frac{\partial}{\partial n}\Delta u_*\,w\right)dS = \int_D(\Delta\Delta u_* - p)w\,dx
\]
using the divergence theorem twice (or, if you prefer, Green's second identity) and the boundary conditions on $w$. Employing the same argument we used above, we conclude that the E-L equation for $u_*$ is $\Delta^2u_* = \Delta\Delta u_* = p$, sometimes known as the biharmonic equation. That is, the problem for a minimizer $u_*$ is
\[
\begin{cases}
\Delta^2u_*(x) = p(x) & \text{in } D \\
u_* = \dfrac{\partial u_*}{\partial n} = 0 & \text{on } \partial D.
\end{cases}
\]
3. Variational problems with constraints.

Here we consider the constrained variational problem
\[
\min_{u\in H,\ M(u)=c}I(u)
\tag{40}
\]
where
\[
H = \{u\in C^2(D) \mid u\equiv 0 \text{ on }\partial D\}
\]
is the class of functions we work in,
\[
I(u) = \int_DF(x,u,\nabla u)\,dx
\]
is the functional we are minimizing, and
\[
M(u) = \int_DG(x,u,\nabla u)\,dx
\]
is another functional, which gives the constraint.

Just as for functions of several variables, the necessary condition (Euler-Lagrange equation) for a constrained problem involves Lagrange multipliers:

Theorem: If $u_*$ solves the variational problem (40), then $u_*$ is a critical point of (that is, satisfies the Euler-Lagrange equation for) the functional
\[
I(u) + \mu M(u)
\]
for some $\mu\in\mathbb{R}$ (called a Lagrange multiplier).

Sketch of proof: Suppose $u_*(x)$ is a minimizer of problem (40). That means $M(u_*) = c$, and $u_*$ minimizes $I(u)$ among functions $u\in H$ with $M(u) = c$. In particular, let $u_\epsilon$ be a one-parameter family of functions in $H$ with $M(u_\epsilon) = c$, and $u_0 = u_*$. Then $I(u_\epsilon)$ (which is a function of $\epsilon$) is minimized at $\epsilon = 0$, and so
\[
0 = \frac{d}{d\epsilon}I(u_\epsilon)\Big|_{\epsilon=0} = \frac{d}{d\epsilon}\int_DF(x,u_\epsilon,\nabla u_\epsilon)\,dx\,\Big|_{\epsilon=0} = \int_D\left[F_u(x,u_*,\nabla u_*)\frac{\partial u_\epsilon}{\partial\epsilon}\Big|_{\epsilon=0} + F_{\nabla u}(x,u_*,\nabla u_*)\cdot\nabla\frac{\partial u_\epsilon}{\partial\epsilon}\Big|_{\epsilon=0}\right]dx.
\]
Let us denote
\[
\eta := \frac{\partial u_\epsilon}{\partial\epsilon}\Big|_{\epsilon=0},
\]
notice that $\eta\equiv 0$ on $\partial D$, and use (as always!) the divergence theorem, to get
\[
0 = \int_D\left[F_u(x,u_*,\nabla u_*) - \nabla\cdot F_{\nabla u}(x,u_*,\nabla u_*)\right]\eta(x)\,dx.
\]
Now if $\eta(x)$ could be any (smooth) function on $D$ (with zero BCs), we would conclude, as before, that
\[
f(x) := F_u(x,u_*,\nabla u_*) - \nabla\cdot F_{\nabla u}(x,u_*,\nabla u_*)
\]
is zero, and that would be our E-L equation. However, because of the constraint $M(u_\epsilon) = c$, $\eta(x)$ cannot be just any function. Indeed, by the same sort of calculation we just did,
\[
0 = \frac{d}{d\epsilon}M(u_\epsilon)\Big|_{\epsilon=0} = \int_Dg(x)\eta(x)\,dx, \qquad g(x) := G_u(x,u_*,\nabla u_*) - \nabla\cdot G_{\nabla u}(x,u_*,\nabla u_*).
\]
That is, $\eta$ must be orthogonal to the function $g$. In fact, it turns out that a one-parameter family of functions $u_\epsilon$ as above, with $\frac{\partial u_\epsilon}{\partial\epsilon}\big|_{\epsilon=0} = \eta$, can be constructed for any (smooth enough) $\eta$ satisfying $\int g\eta = 0$. So, we have
\[
\int_Df(x)\eta(x)\,dx = 0 \quad\text{for any } \eta(x) \text{ with } \int_Dg(x)\eta(x)\,dx = 0.
\]
What can we conclude about $f(x)$ from this? The following:

Claim: $f(x) + \mu g(x) \equiv 0$ for some $\mu\in\mathbb{R}$.

To see this, consider any (smooth) function $\eta(x)$ on $D$, and write it as
\[
\eta(x) = \frac{(g,\eta)}{(g,g)}g(x) + \tilde{\eta}(x), \qquad \int_Dg(x)\tilde{\eta}(x)\,dx = (g,\tilde{\eta}) = 0
\]
(i.e., in linear algebra language, $\tilde{\eta}$ is the orthogonal projection of $\eta$ onto the subspace perpendicular to $g$). Since $\tilde{\eta}$ is perpendicular to $g$, we have
\[
0 = \int_Df(x)\tilde{\eta}(x)\,dx = \int_Df(x)\left[\eta(x) - g(x)\frac{1}{(g,g)}\int_Dg(y)\eta(y)\,dy\right]dx = \int_D\left[f(x) - \frac{(g,f)}{(g,g)}g(x)\right]\eta(x)\,dx.
\]
Finally, since $\eta$ can be any function, we conclude, as in the earlier lemma, that
\[
f(x) - \frac{(g,f)}{(g,g)}g(x) \equiv 0,
\]
or, in other words, $u_*$ is a critical point of the functional
\[
I(u) + \mu M(u), \qquad \mu = -\frac{(g,f)}{(g,g)}. \qquad\square
\]
Example: (Eigenvalue problem). Recall that the first (Dirichlet) eigenvalue of $-\Delta$ on a domain $D$ is given by
\[
\lambda_1 = \min_{u=0 \text{ on }\partial D}\frac{\int_D|\nabla u(x)|^2dx}{\int_Du(x)^2dx} = \min_{u=0 \text{ on }\partial D,\ \int_Du^2 = 1}\int_D|\nabla u(x)|^2dx
\]
(the second expression equals the first, since we may normalize any function $u(x)$, by multiplying by a number, to get $\int u^2 = 1$, and this doesn't change the Rayleigh quotient). The second expression is a constrained minimization problem; let's find its Euler-Lagrange equation. The method of Lagrange multipliers tells us to consider the functional
\[
\int_D|\nabla u|^2dx + \mu\int_Du^2dx = \int_D\left[|\nabla u|^2 + \mu u^2\right]dx,
\]
with Lagrange multiplier $\mu$, whose Euler-Lagrange equation we can find (as usual) by considering, for any (smooth) $\eta(x)$ vanishing on the boundary, $u + \epsilon\eta$:
\[
0 = \frac{d}{d\epsilon}\int_D\left[|\nabla(u+\epsilon\eta)|^2 + \mu(u+\epsilon\eta)^2\right]dx\,\Big|_{\epsilon=0} = \int_D2\left[\nabla u\cdot\nabla\eta + \mu u\eta\right]dx = 2\int_D\eta(x)\left[-\Delta u(x) + \mu u(x)\right]dx,
\]
and therefore
\[
-\Delta u(x) + \mu u(x) = 0,
\]
that is, $u$ is an eigenfunction of $-\Delta$ with eigenvalue $-\mu$. And notice, finally, that if we integrate the equation above against $u$, and use $\int u^2dx = 1$, we find
\[
-\mu = -\mu\int u^2(x)\,dx = -\int_Du(x)\Delta u(x)\,dx = \int_D|\nabla u(x)|^2dx = \lambda_1,
\]
so indeed, the minimizer is an eigenfunction with eigenvalue $\lambda_1$ (which, in any case, we already knew).

Extensions:

1. (first Neumann eigenvalue). Notice that if we do not impose any boundary conditions in the minimization problem, the minimizer will still satisfy the natural boundary conditions, namely Neumann conditions $\frac{\partial u}{\partial n}\equiv 0$ on $\partial D$. Hence the first Neumann eigenvalue of $-\Delta$ on $D$ is given by
\[
\lambda_1 = \min_{\int_Du^2dx = 1}\int_D|\nabla u|^2dx,
\]
and the minimizer is a corresponding eigenfunction.
2. (higher eigenvalues). Recall that the $n$-th (back to Dirichlet again) eigenvalue is given by
\[
\lambda_n = \min_{u=0 \text{ on }\partial D,\ \int_Du^2 = 1,\ (u,\phi_1) = \cdots = (u,\phi_{n-1}) = 0}\int_D|\nabla u|^2dx
\]
where $\phi_1,\ldots,\phi_{n-1}$ are the first $n-1$ eigenfunctions. This is a constrained variational problem with several constraints, in fact $n$ of them. In this case, we need $n$ Lagrange multipliers. That is, we are looking for critical points of
\[
\int_D|\nabla u|^2dx + \mu_0\int_Du^2dx + \sum_{j=1}^{n-1}\mu_j\int_Du\phi_j\,dx
\]
for some numbers $\mu_0,\mu_1,\ldots,\mu_{n-1}$. The Euler-Lagrange equation (check it!) is
\[
-2\Delta u + 2\mu_0u + \sum_{j=1}^{n-1}\mu_j\phi_j = 0.
\]
Integrating this equation against $\phi_k$ for some $k\in\{1,2,\ldots,n-1\}$, and using
\[
(\phi_k,u) = 0, \qquad (\phi_k,\Delta u) = (\Delta\phi_k,u) = (-\lambda_k\phi_k,u) = -\lambda_k(\phi_k,u) = 0,
\]
and the orthonormality of the eigenfunctions, we find $\mu_k = 0$ for $k = 1,2,\ldots,n-1$. And then integrating against $u$, we find
\[
-\mu_0 = \int_D|\nabla u|^2dx = \lambda_n.
\]
Hence
\[
\begin{cases}
-\Delta u = \lambda_nu & \text{in } D \\
u = 0 & \text{on } \partial D.
\end{cases}
\]
That is, the minimizer is indeed an eigenfunction corresponding to the eigenvalue $\lambda_n$ (which, again, we already knew).

Example: (Minimizing the surface area for fixed volume). Let $D$ be a region in $\mathbb{R}^2$, and consider surfaces described by graphs of functions $x_3 = u(x)\geq 0$, $x\in D$, with $u = 0$ on $\partial D$. The problem is to minimize the surface area
\[
A(u) = \int_D\sqrt{1 + |\nabla u(x)|^2}\,dx
\]
subject to the constraint of a fixed volume:
\[
V(u) = \int_Du(x)\,dx = V_0.
\]
Thus we consider the functional
\[
A + \mu V = \int_D\left[\sqrt{1 + |\nabla u(x)|^2} + \mu u(x)\right]dx
\]
for a Lagrange multiplier $\mu$, and find its Euler-Lagrange equation:
\[
0 = \frac{d}{d\epsilon}(A+\mu V)(u+\epsilon\eta)\Big|_{\epsilon=0} = \int_D\left[\frac{1}{2}(1+|\nabla u(x)|^2)^{-1/2}\,2\nabla u(x)\cdot\nabla\eta(x) + \mu\eta(x)\right]dx
\]
\[
= \int_D\eta(x)\left[-\nabla\cdot\frac{\nabla u(x)}{\sqrt{1+|\nabla u(x)|^2}} + \mu\right]dx.
\]
Thus a minimizing function should satisfy
\[
\nabla\cdot\frac{\nabla u(x)}{\sqrt{1+|\nabla u(x)|^2}} = \text{constant}.
\]
As a special case, let $D$ be the disk $x_1^2 + x_2^2 \leq r_0^2$, and try
\[
u(x_1,x_2) = \sqrt{c_0^2 - x_1^2 - x_2^2} - \sqrt{c_0^2 - r_0^2},
\]
for $r_0\leq c_0$, which is (part of) a sphere, of radius $c_0$, centred at $(0,0,-\sqrt{c_0^2 - r_0^2})$. It is a good exercise to check that this satisfies our minimal surface equation above. Notice that the radius $c_0$ can be chosen to satisfy the volume constraint $V_0$, provided $V_0$ is not too large (exercise: find how large, in terms of $r_0$).
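For the spherical-cap trial solution, the "good exercise" is another easy symbolic check. A SymPy sketch (the additive constant in $u$ drops out of all gradients, so it is omitted):

```python
import sympy as sp

x1, x2, c0 = sp.symbols('x1 x2 c0', positive=True)
u = sp.sqrt(c0**2 - x1**2 - x2**2)   # spherical cap, up to an additive constant

gu = sp.Matrix([sp.diff(u, x1), sp.diff(u, x2)])
den = sp.sqrt(1 + gu.dot(gu))
divergence = sp.diff(gu[0] / den, x1) + sp.diff(gu[1] / den, x2)
print(sp.simplify(divergence))       # -2/c0: a constant, as required
```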
4. Approximating minimizers: Rayleigh-Ritz method.

Since it is very rare to be able to solve variational problems (indeed, PDE problems in general) explicitly, one usually needs to try and find approximate solutions, whether by hand, or (more commonly) by computer. The Rayleigh-Ritz method is an elementary, classical technique for approximating solutions to variational problems. The idea could not be simpler. Suppose we want to find an approximation to the solution of
\[
\min_{u\in H}I(u)
\]
where $H$ is some class of functions (e.g. with some given BCs), and $I$ is a functional. Let's assume that $H$ is a vector space (i.e. closed under linear combinations), which is often the case. Let
\[
v_1(x), v_2(x), \ldots, v_m(x)
\]
be $m$ trial functions, and consider linear combinations of these $m$ functions:
\[
u(x) = \sum_{j=1}^{m}c_jv_j(x), \qquad c_j\in\mathbb{R} \text{ for } j = 1,2,\ldots,m.
\]
Such functions lie in $H$ (i.e. satisfy the given BCs), and if we try to minimize the functional $I$ over all such functions (rather than over all functions in $H$), we get a finite-dimensional minimization problem,
\[
\min_{c_1,\ldots,c_m}I\left(\sum_{j=1}^{m}c_jv_j\right),
\]
which we can solve by standard multi-variable calculus (i.e. set the partial derivatives with respect to each $c_j$ equal to 0).

Example: Find an approximation to the solution of
\[
\min_{u=0 \text{ on }\partial D}\int_D\left[\frac{1}{2}|\nabla u|^2 + f(x)u\right]dx
\]
where $D$ is the square $[0,1]^2$. (Recall, the minimizer satisfies Poisson's equation
\[
\begin{cases}
\Delta u = f & \text{in } D \\
u = 0 & \text{on } \partial D.)
\end{cases}
\]
Convenient trial functions $v_j$ for computation are eigenfunctions of the Laplacian (with zero BCs), namely, products of sines. That is, let's minimize the given functional among functions of the form
\[
\sum_{j=1}^{N}\sum_{k=1}^{N}c_{jk}v_{jk}(x), \qquad v_{jk}(x) = \sin(j\pi x_1)\sin(k\pi x_2),
\]
for some positive integer $N$. Then a nice way to compute the functional is, by the divergence theorem, and the orthogonality of the $v_{jk}$,
\[
\frac{1}{2}\int_D|\nabla u|^2dx = \frac{1}{2}\int_Du(-\Delta u)\,dx = \frac{1}{2}\sum_{j,k,j',k'=1}^{N}c_{jk}c_{j'k'}\int_Dv_{j'k'}(x)\,\pi^2(j^2+k^2)\,v_{jk}(x)\,dx = \frac{\pi^2}{8}\sum_{j,k=1}^{N}(j^2+k^2)c_{jk}^2.
\]
And
\[
\int_Df(x)u\,dx = \sum_{j,k=1}^{N}c_{jk}\int_Dv_{jk}(x)f(x)\,dx =: \sum_{j,k=1}^{N}c_{jk}\hat{f}_{jk}.
\]
So our job is to find the choice of the coefficients $c_{jk}$ which minimizes
\[
I(u) = \sum_{j,k=1}^{N}\left[\frac{\pi^2}{8}(j^2+k^2)c_{jk}^2 + \hat{f}_{jk}c_{jk}\right].
\]
That is, we want to minimize a function of the $N^2$ variables $c_{jk}$. So, as we know from calculus, we should take the partial derivatives and set them equal to zero:
\[
0 = \frac{\partial}{\partial c_{jk}}\sum_{j',k'=1}^{N}\left[\frac{\pi^2}{8}(j'^2+k'^2)c_{j'k'}^2 + \hat{f}_{j'k'}c_{j'k'}\right] = \frac{\pi^2}{4}(j^2+k^2)c_{jk} + \hat{f}_{jk}.
\]
So we take
\[
c_{jk} = -\frac{4\hat{f}_{jk}}{\pi^2(j^2+k^2)},
\]
and our approximate minimizer is
\[
\bar{u}(x) = -\frac{4}{\pi^2}\sum_{j,k=1}^{N}\frac{\hat{f}_{jk}}{j^2+k^2}\sin(j\pi x_1)\sin(k\pi x_2).
\]
Remark: In this example, the approximate minimizer turns out to be nothing but the eigenfunction expansion solution of the corresponding Poisson equation, truncated at $j,k\leq N$.

Remark: In general it is somewhat difficult to assess how good an approximation the Rayleigh-Ritz method generates. In this particular example, from what we know about the completeness of eigenfunctions, we can at least conclude that our approximation should get better as $N$ gets larger, and, in particular, should approach the true solution as $N\to\infty$.
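Carrying this out by machine is immediate. Here is a minimal sketch (the load $f\equiv 1$, the grid, and the truncation $N$ are our own choices) that computes the coefficients $\hat{f}_{jk}$ by quadrature and assembles the approximate minimizer:

```python
import numpy as np

N = 20
x = np.linspace(0, 1, 201)
X1, X2 = np.meshgrid(x, x, indexing='ij')
f = np.ones_like(X1)                 # sample load f(x) = 1

def fhat(j, k):
    """2D quadrature for the coefficient integral over [0,1]^2."""
    integrand = np.sin(j * np.pi * X1) * np.sin(k * np.pi * X2) * f
    return np.trapz(np.trapz(integrand, x, axis=1), x)

u = np.zeros_like(X1)
for j in range(1, N + 1):
    for k in range(1, N + 1):
        c = -4 * fhat(j, k) / (np.pi**2 * (j**2 + k**2))
        u += c * np.sin(j * np.pi * X1) * np.sin(k * np.pi * X2)

print(u[100, 100])   # approximate minimizer at the centre (0.5, 0.5)
```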
Example: Let $D$ be the 2D region bounded by the ellipse $x_1^2/a^2 + x_2^2/b^2 = 1$. Approximate the solution $u$ of

$$\min_{u \in C^2(D)} \left\{ I(u) = \int_D \left[ \left( \frac{\partial u}{\partial x_1} - x_2 \right)^2 + \left( \frac{\partial u}{\partial x_2} + x_1 \right)^2 \right] dx \right\}.$$
Notice that no boundary conditions are imposed in the problem.
Remark: Before we start, notice something: clearly $I(u) \ge 0$. Can we simply get a minimizer with $I(u) = 0$ by setting $u_{x_1} = x_2$ and $u_{x_2} = -x_1$? Well, let's try: the first equation requires $u = x_1 x_2 + f(x_2)$, and then the second requires $x_1 + f'(x_2) = -x_1$, so $f'(x_2) = -2x_1$, which we cannot satisfy, since the left side is independent of $x_1$. So, no. In vector calculus language, we are trying to find a function $u(x)$ so that $\nabla u = (x_2, -x_1)$. A necessary condition for a vector field to be a gradient is that its curl vanishes. But $\mathrm{curl}(x_2, -x_1) = -2 \ne 0$, so we cannot solve the equation.
OK, back to the question of approximating the minimizer. To keep things simple, let's use just one trial function ($m = 1$). One computationally nice choice is $v_1(x) = x_1 x_2$, since a multiple of this ($v_1$ itself) makes the first term in the integral vanish, while a different multiple ($-v_1$) makes the second term vanish. So, we are led to

$$\min_{\alpha \in \mathbb{R}} I(\alpha\, x_1 x_2),$$

i.e. just minimizing a function of the single variable $\alpha$. Let's compute

$$I(\alpha\, x_1 x_2) = \int_D \left[ (\alpha - 1)^2 x_2^2 + (\alpha + 1)^2 x_1^2 \right] dx = (\alpha-1)^2 \int_D x_2^2\, dx + (\alpha+1)^2 \int_D x_1^2\, dx.$$
Before proceeding, let's pause and ask what we expect in the case of a disk: $a = b$. The symmetry suggests that neither positive nor negative $\alpha$ should be favoured, and so $\alpha = 0$ should be the minimizer. We'll keep this idea in mind as a check on our computation at the end.
Now

$$\int_D x_1^2\, dx = 4 \int_0^a x_1^2 \int_0^{b\sqrt{1 - x_1^2/a^2}} dx_2\, dx_1 = 4b \int_0^a x_1^2 \sqrt{1 - x_1^2/a^2}\, dx_1 = 4a^3 b \int_0^{\pi/2} \sin^2(\theta)\cos^2(\theta)\, d\theta = \frac{\pi a^3 b}{4},$$

and similarly,

$$\int_D x_2^2\, dx = \frac{\pi a b^3}{4}.$$
So our problem is

$$\min_{\alpha} \frac{\pi a b}{4} \left[ b^2 (\alpha - 1)^2 + a^2 (\alpha + 1)^2 \right],$$
which we solve as usual:

$$0 = \frac{d}{d\alpha} \left[ b^2(\alpha-1)^2 + a^2(\alpha+1)^2 \right] = 2b^2(\alpha - 1) + 2a^2(\alpha + 1) = 2\alpha(a^2+b^2) + 2(a^2-b^2),$$

hence $\alpha = \frac{b^2 - a^2}{a^2 + b^2}$ gives the minimum value (it must be a minimum, since the graph of this function of $\alpha$ is an upward parabola), and so our (admittedly crude) approximate minimizer is

$$\bar{u}(x) = \frac{b^2 - a^2}{a^2 + b^2}\, x_1 x_2,$$
which gives a value of

$$I(\bar{u}) = \frac{\pi a b}{4} \left[ b^2 \left( \frac{-2a^2}{a^2+b^2} \right)^2 + a^2 \left( \frac{2b^2}{a^2+b^2} \right)^2 \right] = \frac{\pi a^3 b^3}{a^2 + b^2}.$$
(And note that our guess that $\alpha = 0$ when $a = b$ is indeed confirmed.)
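As a sanity check, one can confirm this value of $\alpha$ numerically; the sketch below uses crude Monte Carlo integration over the ellipse with sample semi-axes $a = 2$, $b = 1$ (all assumptions made for the demo):

```python
# Numerical check that alpha = (b^2 - a^2)/(a^2 + b^2) minimizes I(alpha*x1*x2).
import numpy as np
from scipy.optimize import minimize_scalar

a, b = 2.0, 1.0                              # sample semi-axes (assumption)
rng = np.random.default_rng(0)
pts = rng.uniform([-a, -b], [a, b], size=(200_000, 2))
mask = (pts[:, 0] / a)**2 + (pts[:, 1] / b)**2 <= 1.0
x1, x2 = pts[mask, 0], pts[mask, 1]          # points inside the ellipse
area = np.pi * a * b                         # area of the ellipse

def I(alpha):
    # u = alpha*x1*x2 gives u_x1 = alpha*x2 and u_x2 = alpha*x1
    return area * np.mean((alpha * x2 - x2)**2 + (alpha * x1 + x1)**2)

print(minimize_scalar(I).x)                  # approx -0.6 = (1 - 4)/(4 + 1)
```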
5. Approximating eigenvalues and eigenfunctions
Recall that the first eigenvalue of the problem

$$\begin{cases} L\phi := -\nabla \cdot [p(x) \nabla \phi] + q(x)\phi = \lambda\, r(x)\phi & \text{in } D \\ \phi = 0 & \text{on } \partial D \end{cases}$$

is given by minimizing the Rayleigh quotient:

$$\lambda_1 = \min_{u = 0 \text{ on } \partial D} \frac{\int_D \left( p|\nabla u|^2 + q u^2 \right) dx}{\int_D r u^2\, dx},$$
and the minimizing function is a corresponding eigenfunction. So we can use a Rayleigh-Ritz approach to approximate the eigenfunction.

Start with a set of $m$ trial functions

$$v_1(x),\ v_2(x),\ \ldots,\ v_m(x)$$

with the correct boundary conditions (in this case Dirichlet: $v_j \equiv 0$ on $\partial D$), and we will minimize the Rayleigh quotient among linear combinations

$$u(x) = c_1 v_1(x) + c_2 v_2(x) + \cdots + c_m v_m(x).$$
Let's compute the relevant quantities. Using the convention of summation over repeated indices:

$$\int_D r u^2\, dx = c_j c_k \int_D r v_j v_k\, dx = \vec{c} \cdot B \vec{c}$$

where $\vec{c}$ denotes the $m$-vector $\vec{c} = (c_1, c_2, \ldots, c_m)$, and $B$ is the $m \times m$ matrix

$$B := (B_{jk})_{j,k=1}^m, \qquad B_{jk} = \int_D r v_j v_k\, dx = (v_j, v_k)_r.$$
Similarly,

$$\int_D \left( p|\nabla u|^2 + q u^2 \right) dx = c_j c_k \int_D \left( p \nabla v_j \cdot \nabla v_k + q v_j v_k \right) dx = \vec{c} \cdot A \vec{c}$$

where

$$A := (A_{jk})_{j,k=1}^m, \qquad A_{jk} = \int_D \left( p \nabla v_j \cdot \nabla v_k + q v_j v_k \right) dx = (v_j, L v_k) = (v_k, L v_j) = A_{kj},$$
where we have noted that the matrix $A$ is symmetric, because the operator $L$ is self-adjoint. Notice $B$ is symmetric, too. Hence our problem is:

$$\min_{\vec{c} \in \mathbb{R}^m} \frac{\vec{c} \cdot A \vec{c}}{\vec{c} \cdot B \vec{c}},$$

a simple minimization of a function of $m$ variables. Of course, by calculus, the minimizer will satisfy (going back to summation over repeated indices notation)

$$0 = \frac{\partial}{\partial c_j} \frac{A_{kl}\, c_k c_l}{B_{nr}\, c_n c_r} = \frac{(\vec{c} \cdot B \vec{c})(A_{jl} + A_{lj})\, c_l - (\vec{c} \cdot A \vec{c})(B_{nj} + B_{jn})\, c_n}{(\vec{c} \cdot B \vec{c})^2}, \qquad j = 1, 2, \ldots, m,$$

which, using the symmetry of $A$ and $B$, yields

$$(\vec{c} \cdot B \vec{c})\, A\vec{c} = (\vec{c} \cdot A \vec{c})\, B\vec{c}$$

or

$$A \vec{c} = \lambda B \vec{c}, \qquad \lambda := \frac{\vec{c} \cdot A \vec{c}}{\vec{c} \cdot B \vec{c}},$$
which is a (generalized) matrix eigenvalue problem.
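In practice one hands such a generalized problem to a linear algebra library; here is a minimal sketch using SciPy's symmetric generalized eigensolver, with small sample matrices (assumptions, purely for illustration):

```python
# Solve A c = lambda B c for symmetric A and symmetric positive-definite B.
import numpy as np
from scipy.linalg import eigh

A = np.array([[2.0, -1.0],                  # sample symmetric A (assumption)
              [-1.0, 2.0]])
B = np.diag([0.5, 0.5])                     # sample SPD B (assumption)

lams, C = eigh(A, B)                        # eigenvalues in ascending order
print("smallest eigenvalue:", lams[0])
print("its eigenvector:    ", C[:, 0])
```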
Example: Approximate the first Dirichlet eigenfunction of $-\frac{d^2}{dx^2} + \epsilon x$ on $[0, 1]$ using trial functions $v_1(x) = \sin(\pi x)$, $v_2(x) = \sin(2\pi x)$ (notice that $v_1$ and $v_2$ are the first two eigenfunctions of $-\frac{d^2}{dx^2}$ on $[0, 1]$ with 0 BCs, that is, of the problem with $\epsilon = 0$).
Easy computations give

$$(v_1, v_1) = \int_0^1 \sin^2(\pi x)\, dx = \frac{1}{2}, \qquad (v_2, v_2) = \int_0^1 \sin^2(2\pi x)\, dx = \frac{1}{2},$$

$$(v_1, v_2) = \int_0^1 \sin(\pi x)\sin(2\pi x)\, dx = 0,$$

so our matrix $B$ is

$$B = \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{pmatrix}.$$

Next,

$$v_1' = \pi \cos(\pi x), \qquad v_2' = 2\pi \cos(2\pi x),$$

so

$$\int_0^1 (v_1')^2\, dx = \frac{\pi^2}{2}, \qquad \int_0^1 (v_2')^2\, dx = 2\pi^2, \qquad \int_0^1 v_1' v_2'\, dx = 0.$$
Slightly more involved computations (using integration by parts) yield

$$\int_0^1 x v_1^2\, dx = \frac{1}{4}, \qquad \int_0^1 x v_2^2\, dx = \frac{1}{4}, \qquad \int_0^1 x v_1 v_2\, dx = -\frac{8}{9\pi^2}.$$
Hence our matrix $A$ is

$$A = \begin{pmatrix} \frac{\pi^2}{2} + \frac{\epsilon}{4} & -\frac{8\epsilon}{9\pi^2} \\[4pt] -\frac{8\epsilon}{9\pi^2} & 2\pi^2 + \frac{\epsilon}{4} \end{pmatrix}.$$
To solve the eigenvalue problem $(A - \lambda B)\vec{c} = 0$, we set

$$0 = \det(A - \lambda B) = \left( \frac{\pi^2}{2} + \frac{\epsilon}{4} - \frac{\lambda}{2} \right)\left( 2\pi^2 + \frac{\epsilon}{4} - \frac{\lambda}{2} \right) - \frac{64\epsilon^2}{81\pi^4},$$

and the quadratic formula leads to

$$2\lambda = 5\pi^2 + \epsilon \pm \sqrt{ (5\pi^2 + \epsilon)^2 - 4\left[ \left( \pi^2 + \frac{\epsilon}{2} \right)\left( 4\pi^2 + \frac{\epsilon}{2} \right) - \frac{256\epsilon^2}{81\pi^4} \right] }.$$
Since we are trying to approximate the lowest eigenvalue/eigenfunction, we take the negative sign here, to obtain the approximate eigenvalue $\lambda$. Then one can find an eigenvector $\vec{c}$, which will produce an approximate eigenfunction $c_1 v_1 + c_2 v_2$. Of course, the expressions are a little messy. One thing we can do, supposing $\epsilon$ is small, is expand the approximate eigenvalue in a Taylor series in $\epsilon$ to find, for example,

$$\lambda = \pi^2 + \frac{\epsilon}{2} + O(\epsilon^2).$$
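One can check this expansion numerically; a hedged sketch, with a sample value of $\epsilon$ chosen purely for the demo:

```python
# Check the small-eps expansion lambda ~ pi^2 + eps/2 for the 2x2 problem above.
import numpy as np
from scipy.linalg import eigh

eps = 0.1                                       # sample value (assumption)
off = -8.0 * eps / (9.0 * np.pi**2)             # eps * integral of x*v1*v2
A = np.array([[np.pi**2 / 2 + eps / 4, off],
              [off, 2 * np.pi**2 + eps / 4]])
B = np.diag([0.5, 0.5])

lam = eigh(A, B, eigvals_only=True)[0]          # smaller root of det(A - l B)
print(lam, np.pi**2 + eps / 2)                  # both approx 9.9196
```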
Remark: Notice that this approximate eigenvalue is actually an upper bound for the true first eigenvalue:

$$\lambda \ge \lambda_1,$$

since we obtain it by minimizing the Rayleigh quotient over a subset of functions (the true eigenvalue is the minimum over all admissible functions, hence cannot be larger).