
BUDAPEST UNIVERSITY OF TECHNOLOGY AND ECONOMICS

FACULTY OF CIVIL ENGINEERING

György Popper

SOME CONCEPTS OF FUNCTIONAL ANALYSIS


USING MATHEMATICA

2006.
György Popper

Research Group for Computational Structural Mechanics


Hungarian Academy of Sciences-Budapest University of Technology and Economics

e-mail: gypopper@epito.bme.hu

Copyright 2006 by György Popper


Contents

Preface .................................................................................................................v

1. Vector spaces, subspaces, linear manifolds ...................................................... 7

2. Dimension, spanning sets and (algebraic) basis...............................................14

3. Linear operator ...............................................................................................19

4. Normed spaces ...............................................................................................26

5. Convergence, complete spaces........................................................................31

6. Continuous and bounded linear operator .........................................................37

7. Dense sets, separable spaces ...........................................................................41

8. Inner product, Hilbert space............................................................................43

9. Sets of measure zero, measurable functions ....................................................51

10. The space L² ...............................................................................................53

11. Generalized derivatives, distributions, Sobolev spaces ..................................57

12. Weak (or generalized) solutions....................................................................70

13. Orthogonal systems, Fourier series ...............................................................73

14. The projection theorem, the best approximation............................................81

Appendix A. Construction of basis functions for the vector space C₃⁽¹⁾[−1, 1]

using Mathematica .............................................................................................83

Appendix B. The Lebesgue integral...................................................................85

References..........................................................................................................89

Index ..................................................................................................................91

Preface

This textbook contains the course of lectures on some concepts of functional
analysis that I have delivered periodically to Ph.D. students of civil engineering at
the Budapest University of Technology and Economics.

Functional analysis is a study of abstract, primarily linear, spaces resulting
from a synthesis of geometry, linear algebra and mathematical analysis.
Functional analysis generalizes mathematical disciplines, and its popularity
originates from its geometric character: most of the principal results in functional
analysis are expressed as abstractions of intuitive geometric properties of the
three-dimensional space.

The importance of getting acquainted with functional analysis lies in its
frequent use in the up-to-date engineering literature. I aim to present the basics of
functional analysis believed necessary to understand the mathematical theory of
the finite element method, the variational solution of boundary-value problems, as
well as other problems of continuum mechanics.

For the purpose of this text, it is only necessary to acquire a simple
understanding of the Lebesgue integral and some other concepts related to it.
Hence, I have decided to avoid introducing a formalized framework for the
Lebesgue measure and integration theory. However, a short introduction to the
Lebesgue integration theory is given in Appendix B.

Some of the calculations of this textbook were made using the symbolic-
numeric computer algebra system Mathematica. I found this system very useful
for presenting these concepts because it does algebra and calculus computations
quickly in an exact, symbolic manner.

During the preparation of this textbook, I received helpful suggestions from
Professors Zsolt Gáspár, Béla Paláncz and Tibor Tarnai, to whom I would like to
express my thanks. I would also like to thank Lajos R. Kozák for his help in
preparing this manuscript for the press.

This work was partially supported by the grant T 037880 from the Hungarian
Scientific Research Fund (OTKA).

Budapest, April 2006. György Popper

1. Vector spaces, subspaces, linear manifolds

It is well known that the sum of two vectors in a plane and the product of a vector
by a real number result in a vector in the same plane. In other words, the set is
closed under these two operations. These "space" properties of the geometric vector
space hold for some other sets too, e.g. for some function sets. Hence it is
advantageous to introduce the following definition:

A linear space or vector space is a non-empty set X of elements, often called
vectors, for which two operations are defined:

1. Addition, that is, if x, y ∈ X then x + y ∈ X.

2. Multiplication by a scalar, that is, if x ∈ X and α is an
arbitrary scalar, then αx ∈ X.

These two operations are to satisfy the usual axioms:

If x, y, z ∈ X and α, β ∈ ℝ (or ℂ)¹, then

1. x + y = y + x (law of commutativity);

2. (x + y) + z = x + (y + z) (law of associativity);

3. there exists an element θ ∈ X such that θ + x = x for any x ∈ X

(existence of the zero element);

4. for any x ∈ X, there is an element −x ∈ X such that x + (−x) = θ

(existence of negative elements);

5. α(x + y) = αx + αy (law of distributivity with respect to vectors);

6. (α + β)x = αx + βx (law of distributivity with respect to scalars);

7. (αβ)x = α(βx) (law of associativity for scalar multiplication);

8. 0·x = θ and 1·x = x.

If the scalars are real, X is a real vector space, while if the scalars are complex,
X is a complex vector space.

¹ ℝ (ℂ) denotes the set of real (complex) numbers.

Examples

E.1.1. The set ℝⁿ of all real ordered n-tuples x = (x₁, ..., xₙ), y = (y₁, ..., yₙ),
is a real linear space if addition is defined by x + y = (x₁ + y₁, ..., xₙ + yₙ), and
scalar multiplication is defined by αx = (αx₁, ..., αxₙ) with α ∈ ℝ. The zero
vector: θ = (0, ..., 0).

E.1.2. The set C⁽⁰⁾[a, b] of all (real-valued) continuous functions on a finite
interval a ≤ t ≤ b with addition and real number multiplication

(f + g)(t) = f(t) + g(t),  (αf)(t) = αf(t),  t ∈ [a, b]

forms a linear space. The zero vector θ: f(t) = 0 for all t ∈ [a, b].

Note that C⁽⁰⁾[a, b] ⊂ C⁽⁰⁾(a, b) is true, because f ∈ C⁽⁰⁾[a, b] ⇒ f ∈ C⁽⁰⁾(a, b).

More generally, C⁽ᵏ⁾[a, b] denotes the linear space of k-times continuously
differentiable functions on a finite closed interval [a, b]. (This means the set of
functions whose derivatives at least up to order k inclusive are continuous
in [a, b].)

For example, the function

f(t) = { t + t²/2, if −1 ≤ t ≤ 0
       { t − t²/2, if 0 ≤ t ≤ 1

defined on the interval −1 ≤ t ≤ 1, is continuously differentiable, but only
once; i.e. f ∈ C⁽¹⁾[−1, 1], but its first derivative

g(t) = df/dt(t) = 1 − |t|,  −1 ≤ t ≤ 1

belongs only to C⁽⁰⁾[−1, 1] (see Fig. 1.1).

Using Mathematica:

In[1]:= Clear[f, g]

In[2]:= f[t_] := Piecewise[{{t + t^2/2, -1 <= t <= 0}, {t - t^2/2, 0 <= t <= 1}}]

In[3]:= g[t_] = D[f[t], t] // Simplify

Out[3]= 1 - t           0 < t < 1
        1 + t           -1 < t < 0
        Indeterminate   otherwise (at the break points t = -1, 0, 1)

In[4]:= Plot[{f[t], g[t]}, {t, -1, 1},
          PlotStyle -> {
            {Thickness[.008]}, {Dashing[{.04, .02}]}
          },
          Prolog -> {
            Text["f", {0.9, 0.6}],
            Text["g", {0.6, 0.6}]
          }]

[Plot: f (solid) and g (dashed) on −1 ≤ t ≤ 1.]

Out[4]= -Graphics-

Figure 1.1. A function that is only once differentiable.

In spite of the generality of vector spaces, it is easy to find a class of functions
which does not form a linear space:

E.1.3. The class of positive functions with the usual addition and number
multiplication rules (introduced in E.1.2) does not form a linear space. Indeed, if
f(t) > 0 then (−1)·f(t) < 0.

E.1.4. The set of functions with the property

|f(t)| < 1,  t ∈ [a, b]

does not form a linear space (if f ≠ θ and the number α > 0 is chosen sufficiently
large, then |αf(t)| > 1 for some t).

However, in general: the class of bounded functions with the usual addition and
number multiplication is a linear space.

Direct product. If X and Y are two sets, then the set of all ordered pairs

{(x, y) : x ∈ X, y ∈ Y}

is called the direct, Cartesian or Descartes product of the sets X and Y and is
denoted by X × Y.

The direct product X × Y can be obtained using the Mathematica function
Outer[], which combines each element of the list given in its second argument
with each element of the list given in its third argument, while applying the
function given in the first argument to all of the possible pairs. The result is in the
form of a list:

In[1]:= Outer[f, {a, b, c}, {1, 2}]  (* outer product *)

Out[1]= {{f[a, 1], f[a, 2]}, {f[b, 1], f[b, 2]}, {f[c, 1], f[c, 2]}}

In[2]:= Outer[List, {a, b, c}, {1, 2}]

Out[2]= {{{a, 1}, {a, 2}}, {{b, 1}, {b, 2}}, {{c, 1}, {c, 2}}}

In[3]:= Flatten[%]  (* flatten all levels in the list *)

Out[3]= {a, 1, a, 2, b, 1, b, 2, c, 1, c, 2}

In[4]:= Flatten[%%, 1]  (* flatten the topmost level in the list *)

Out[4]= {{a, 1}, {a, 2}, {b, 1}, {b, 2}, {c, 1}, {c, 2}}

Consequently it is reasonable to define the function:

In[5]:= Descartes[x_, y_] := Flatten[Outer[List, x, y], 1]

In[6]:= X = {a, b, c};
        Y = {1, 2};
        Descartes[X, Y]

Out[8]= {{a, 1}, {a, 2}, {b, 1}, {b, 2}, {c, 1}, {c, 2}}

In Mathematica, the direct product can occur as an argument or definition
domain of many built-in functions. For example:

In[9]:= Plot3D[x^3 + y^2 - 1, {x, -3, 3}, {y, 0, 1}]

[Surface plot of x³ + y² − 1 over {x, −3, 3} × {y, 0, 1}.]

Out[9]= -SurfaceGraphics-

Figure 1.2. Illustration of the use of the Descartes product.

E.1.5. If X and Y are two linear spaces, then the direct product

X × Y = {(x, y) : x ∈ X, y ∈ Y}

is a linear space with the following operations:

(x₁, y₁) + (x₂, y₂) = (x₁ + x₂, y₁ + y₂),  x₁, x₂ ∈ X, y₁, y₂ ∈ Y

α(x, y) = (αx, αy),  x ∈ X, y ∈ Y, α scalar.

Note that the addition in the vector space X × Y is defined using the additions of
both linear spaces X and Y.

Subspaces. A nonempty subset S of a vector space X is called a linear
subspace of X if S itself is a linear space with respect to the addition and scalar
multiplication defined on X. In other terms, subspaces are closed under vector
addition and multiplication by a scalar.

Subspaces are linear spaces; hence all subspaces contain the zero vector θ.

E.1.6. Any straight line and any plane passing through the origin of the three-
dimensional geometric vector space is a linear subspace of it.

Among all subsets of the plane shown in Figure 1.3, only the sketches 1.3.c)
and 1.3.d) contain subspaces of the plane.

[Figure: six sketches a)–f) of subsets M of the plane; M = point of origin.]

Figure 1.3. Subsets and subspaces in ℝ².

E.1.7. In the vector space C⁽⁰⁾[a, b] defined in the example E.1.2, the subset
{f : f(a) = f(b) = 0} is a linear subspace. The subset {f : f(a) = f(b) = 1} is not a
linear subspace.

E.1.8. If M and N are linear subspaces of a linear space X, then the
intersection M ∩ N is also a linear subspace of X. The set

M + N = {m + n : m ∈ M, n ∈ N}

is a linear subspace of X, too. If

M + N = X and M ∩ N = {θ},

then M and N are each said to be the complement of the other with respect to X,
and the vector space X is called the direct sum of M and N, which is denoted by
X = M ⊕ N.

For example, see the sketch:

[Sketch: two straight lines M and N through the origin O, with m ∈ M, n ∈ N
and their sum m + n.]

Linear manifold or affine subspace. Any subset of a vector space X which is
of the form {x₀ + s : s ∈ S}, where x₀ ∈ X is a fixed vector and S is a linear
subspace, is said to be a linear manifold or affine linear subspace of X. The
affine subspace {x₀ + s : s ∈ S} can also be written in the form of the direct sum
x₀ ⊕ S.

Linear manifolds (with x₀ ≠ θ) can be considered as generalizations of
straight lines or planes which do not contain the origin.

2. Dimension, spanning sets and (algebraic) basis

The vectors x₁, x₂, ..., xₙ of a linear space X are said to be linearly dependent
if there are scalars α₁, α₂, ..., αₙ, not all of which are zero, such that

α₁x₁ + α₂x₂ + ... + αₙxₙ = θ.

The vectors x₁, x₂, ..., xₙ are linearly independent if this equality holds only if
all of the scalars α₁, α₂, ..., αₙ are zero.

An infinite set of vectors x₁, x₂, ..., xₙ, ... is said to be linearly independent if all of
its finite subsets are linearly independent.

Note that a single vector x ≠ θ forms a linearly independent set, because the
only way to have αx = θ is α = 0.

An important example of linearly independent vectors is given by the polynomials.
For example, the powers of x, that is, {1, x, x², ..., xⁿ}, form a linearly independent
set. We give the proof for n = 2 only, because it can be extended directly (without
any new idea) to the general case.

Define the linear combination

p(x) = a₀ + a₁x + a₂x²

and determine when p(x) ≡ 0. This identity implies that all derivatives of p(x)
are also identically equal to zero. That is,

p(x) = a₀ + a₁x + a₂x² = 0,
p′(x) = a₁ + 2a₂x = 0,
p″(x) = 2a₂ = 0.

The last equation implies a₂ = 0, then from the second one a₁ = 0, and then
the first one implies a₀ = 0. Therefore, p(x) ≡ 0 if and only if aᵢ = 0, i = 0, 1, 2.
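The proof can also be checked numerically: a polynomial a₀ + a₁x + a₂x² vanishing identically must in particular vanish at three distinct points, and the resulting Vandermonde system admits only the trivial solution because its determinant is nonzero. A Python sketch (the three points are an assumed choice, not from the text):

```python
# Independence of {1, x, x^2}: if p(x) = a0 + a1*x + a2*x^2 vanishes at
# three distinct points, the Vandermonde determinant is nonzero, so the
# only solution is a0 = a1 = a2 = 0.
from fractions import Fraction

pts = [Fraction(-1), Fraction(0), Fraction(1)]   # three distinct points

# Vandermonde matrix V[i][j] = pts[i]**j
V = [[x**j for j in range(3)] for x in pts]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1]*m[2][2] - m[1][2]*m[2][1])
          - m[0][1] * (m[1][0]*m[2][2] - m[1][2]*m[2][0])
          + m[0][2] * (m[1][0]*m[2][1] - m[1][1]*m[2][0]))

d = det3(V)
print(d)  # 2 -- nonzero, hence only the trivial solution exists
```

The determinant of a Vandermonde matrix is the product of the differences of the points, so it is nonzero for any choice of distinct points.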

Dimension. The vector space X is called n-dimensional or finite-
dimensional if X contains n linearly independent vectors but any n + 1 vectors
of X are linearly dependent.

The linear space X is said to be infinite-dimensional if, for any positive
integer n, n linearly independent vectors of X can be found.

Spanning set. A subset S = {x₁, x₂, ..., xₙ, ...} of a vector space X is said to
span or generate X if every x ∈ X can be written as a linear combination of the
elements of S.

Basis. A set of vectors x₁, x₂, ..., xₙ of a vector space X is said to be an
algebraic or Hamel basis for X if and only if

1. the set spans X, and

2. the vectors x₁, x₂, ..., xₙ are linearly independent.

Theorem 2.1. Let {x₁, x₂, ..., xₙ} denote a basis for a vector space X. Then
every vector x ∈ X is a unique linear combination of x₁, x₂, ..., xₙ.

Proof: By contradiction: if x = α₁x₁ + ... + αₙxₙ = β₁x₁ + ... + βₙxₙ, then
x − x = θ = (α₁ − β₁)x₁ + ... + (αₙ − βₙ)xₙ. But x₁, x₂, ..., xₙ are linearly
independent, therefore the coefficients of x₁, x₂, ..., xₙ must be zero. Hence
α₁ = β₁, ..., αₙ = βₙ.

E.2.1. Consider the vector space C⁽⁰⁾[a, b] of continuous functions defined on
the finite interval [a, b]. Its subset consisting of polynomials of degree at most n
is a linear subspace in which the functions tᵏ (k = 0, 1, ..., n) are linearly
independent. An arbitrary polynomial of degree at most n can be expressed as a
linear combination

α₁ + α₂t + ... + αₙ₊₁tⁿ.

Therefore the vectors tᵏ (k = 0, 1, ..., n) span this (n + 1)-dimensional linear
subspace and create its Hamel basis.

The linear space C⁽⁰⁾[a, b] is infinite-dimensional. Indeed, for any positive
integer n the functions

x₁(t) = 1, x₂(t) = t, ..., xₙ₊₁(t) = tⁿ

are linearly independent and are elements of C⁽⁰⁾[a, b].

The following example contains one of the basic concepts of the finite element
method:

E.2.2. Let C₁⁽⁰⁾[−1, 1] denote that linear subspace of the vector space C⁽⁰⁾[−1, 1]
which consists of piecewise linear polynomials on the intervals [−1, 0] and [0, 1].
This means that an arbitrary function p(t) ∈ C₁⁽⁰⁾[−1, 1] can be written in a unique
way as

p(t) = { p₁(t) ≡ a₁t + b₁, for t ∈ [−1, 0]
       { p₂(t) ≡ a₂t + b₂, for t ∈ [0, 1]

where, in consequence of the continuity condition p₁(0) = p₂(0), b₁ = b₂ follows. If
we express the coefficients a₁, a₂, b₁ = b₂ = b using the function values
g(−1), g(0), g(1) at the nodal points −1, 0, 1, then

p(t) = { −t·g(−1) + (1 + t)·g(0), if t ∈ [−1, 0]
       { (1 − t)·g(0) + t·g(1),   if t ∈ [0, 1]

This piecewise polynomial can be expressed in the form of a linear combination

p(t) = g(−1)φ₁(t) + g(0)φ₂(t) + g(1)φ₃(t)

where

φ₁(t) = { −t, if t ∈ [−1, 0]; 0, if t ∈ [0, 1] },
φ₂(t) = { 1 + t, if t ∈ [−1, 0]; 1 − t, if t ∈ [0, 1] },
φ₃(t) = { 0, if t ∈ [−1, 0]; t, if t ∈ [0, 1] },

see Figure 2.1.

[Figure: the three piecewise linear "hat" functions φ₁(t), φ₂(t), φ₃(t) on [−1, 1],
each with peak value 1 at the nodes −1, 0, 1, respectively.]

Figure 2.1. Basis in the function space C₁⁽⁰⁾[−1, 1].

The functions φ₁(t), φ₂(t), φ₃(t) are linearly independent (see Figure 2.1) and
any function p(t) ∈ C₁⁽⁰⁾[−1, 1] is a linear combination of φ₁(t), φ₂(t), φ₃(t). Hence
these functions span (generate) this 3-dimensional function space.

Computing the basis functions for the vector space C₁⁽⁰⁾[−1, 1] can be done
using Mathematica:

In[1]:= Clear[p]

In[2]:= p[t_] := Piecewise[{{a1 t + b, -1 <= t <= 0}, {a2 t + b, 0 <= t <= 1}}]

In[3]:= (* g[-1], g[0], g[1]: given function values of p[t_] at the node points -1, 0, 1 *)
        s = Solve[Map[p, {-1, 0, 1}] == Map[g, {-1, 0, 1}], {a1, a2, b}] // First

Out[3]= {a1 -> -g[-1] + g[0], a2 -> -g[0] + g[1], b -> g[0]}

In[4]:= (* substitution of the solution into p[t_]: *)
        ss = p[t] /. s

Out[4]= g[0] + t (-g[-1] + g[0])   -1 <= t <= 0
        g[0] + t (-g[0] + g[1])    0 <= t <= 1

In[5]:= sss = {ss[[1, 1, 1]], ss[[1, 2, 1]]}

Out[5]= {g[0] + t (-g[-1] + g[0]), g[0] + t (-g[0] + g[1])}

In[6]:= (* rearrange to the form p[t_] := g[-1] φ[1][t] + g[0] φ[2][t] + g[1] φ[3][t] *)
        fi = Table[Coefficient[sss, g[k - 2]], {k, 1, 3}]

Out[6]= {{-t, 0}, {1 + t, 1 - t}, {0, t}}

In[7]:= Table[
          φ[k][t_] = Piecewise[{{fi[[k, 1]], -1 <= t <= 0}, {fi[[k, 2]], 0 <= t <= 1}}],
          {k, 1, 3}]

Out[7]= {Piecewise[{{-t, -1 <= t <= 0}, {0, 0 <= t <= 1}}],
         Piecewise[{{1 + t, -1 <= t <= 0}, {1 - t, 0 <= t <= 1}}],
         Piecewise[{{0, -1 <= t <= 0}, {t, 0 <= t <= 1}}]}

In[8]:= (* drawing the basis functions *)
        Do[Plot[φ[k][t], {t, -1, 1}, AspectRatio -> 1/3,
           PlotStyle -> Thickness[0.01]], {k, 1, 3}];

[Three plots: the basis functions φ[1], φ[2], φ[3] on −1 ≤ t ≤ 1.]
Training problem: Let C₃⁽¹⁾[−1, 1] denote the linear subspace of the vector space
C⁽¹⁾[−1, 1] which consists of piecewise cubic polynomials on the intervals [−1, 0]
and [0, 1]. Similarly to example E.2.2, determine the basis of the function space
C₃⁽¹⁾[−1, 1] and plot it using Mathematica!

A possible solution is shown in Appendix A.

E.2.3. Consider the linear space X whose elements are infinite sequences of
real numbers x = {ξ₁, ξ₂, ...} with Σₖ₌₁^∞ ξₖ² < ∞. The subset of X consisting of the
vectors

e₁ = {1, 0, 0, ...}
e₂ = {0, 1, 0, ...}
⋮
eₖ = {0, ..., 0, 1, 0, ...}  (the 1 standing in the k-th position)
⋮

spans X in an extended sense made precise below (infinite sums are needed).
X is infinite-dimensional: for any positive integer k the vectors e₁, ..., eₖ are
linearly independent and are elements of X. Clearly, all of the vectors e₁, e₂, ...
are linearly independent, too. However, they do not form a Hamel basis, because
not every x ∈ X is a linear combination of finitely many of the vectors e₁, e₂, ....

It is true that any element of X can be written in the form

x = Σₖ₌₁^∞ ξₖeₖ

but this requires defining the convergence of infinite series of vectors. Later on,
the concept of bases will be extended.

E.2.4. Let S denote the set of those infinite sequences of real numbers in which
all elements except for a finite number of elements are zero. S is a subspace of the
linear space X specified in example E.2.3. Any vector x ∈ S can be written as a
linear combination of a finite set {e₁, e₂, ..., eₙ}. Hence, these vectors form an
algebraic (Hamel) basis for the subspace S.

3. Linear operator

Operator. The term operator is synonymous with function, map or mapping,
and transformation.

Given two linear spaces (sets) X and Y, an operator T from X to Y,
denoted T : X → Y, is a rule that assigns one and only one vector y = T(x) ∈ Y to
every vector x ∈ D ⊂ X. The subset D of X is the domain of T, and the image
of D,

R = {T(x) : x ∈ D},

is the range of T.

The operator T is considered to be given if its domain D, its co-domain Y,
and the rule of transformation are all given.

The operator T : X → Y is said to be one-to-one or injective if

x₁ ≠ x₂ ⇒ T(x₁) ≠ T(x₂).

In other terms, T : X → Y is one-to-one if for every y ∈ R there is exactly one
x ∈ X such that y = T(x).

The operator is said to map X onto Y, or is called surjective, if R = Y.

If T is both injective and surjective, it is called a bijection from X to Y.

E.3.1. Let ℝ denote the set of real numbers and ℝ⁺ the set of positive real
numbers. Using the rule T(x) = x², define the following operators (functions):

1. T₁ : ℝ → ℝ. This operator is not one-to-one, since both −x and +x are
mapped into x². It is not surjective either, since the negative real numbers are in
the co-domain ℝ but not in the range ℝ⁺.

2. T₂ : ℝ → ℝ⁺. This operator is not one-to-one, but it is onto.

3. T₃ : ℝ⁺ → ℝ. This operator is one-to-one, but it is not onto.

4. T₄ : ℝ⁺ → ℝ⁺. This operator is bijective, i.e. it is both one-to-one and
onto.

Note that although the rule T(x) = x² defining each operator T₁, T₂, T₃ and T₄
is the same, the four operators are quite different.

Similarly, changes in the domain (e.g. in the boundary conditions) of a
differential operator lead to a different operator (having properties different from
the original one).

Linear operator. An operator T : X → Y is said to be linear if the domain D
of T is a linear space (that is, X or a subspace of X), and if

T(αx + βy) = αT(x) + βT(y)

for α, β scalars and x, y ∈ X.

For a linear operator T, we usually write Tx instead of T(x).

E.3.2. Any m × n (read "m by n") real matrix A represents a linear operator
T : ℝⁿ → ℝᵐ.

The operator T is injective if and only if rank(A) = n; and T is surjective if and
only if rank(A) = m.

(Indeed, if rank(A) = n, then from x − y ≠ θ the inequality A(x − y) ≠ θ follows.
That is, x ≠ y ⇒ Ax ≠ Ay. The case rank(A) = m is easy to verify by partition
of A.)
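The rank conditions can be tested concretely: for a real matrix, rank(A) = n is equivalent to the Gram matrix AᵀA being nonsingular, which gives a quick injectivity check. A Python sketch (the 3 × 2 matrix A below is an assumed example, not from the text):

```python
from fractions import Fraction

# A 3x2 matrix of rank 2, viewed as an operator T : R^2 -> R^3.
A = [[Fraction(1), Fraction(0)],
     [Fraction(0), Fraction(1)],
     [Fraction(1), Fraction(1)]]

def transpose(P):
    return [list(col) for col in zip(*P)]

def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

# rank(A) = n  <=>  the Gram matrix A^T A is nonsingular, so T is injective.
G = matmul(transpose(A), A)                 # 2x2 Gram matrix
det = G[0][0]*G[1][1] - G[0][1]*G[1][0]
print(det)  # 3 -- nonzero, hence T is injective

# T is not surjective onto R^3, since rank(A) = 2 < m = 3.
```
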

E.3.3. The linear operator of differentiation

d/dx : C⁽¹⁾[a, b] ⊂ C⁽⁰⁾[a, b] → C⁽⁰⁾[a, b]

is surjective, i.e. onto the range C⁽⁰⁾[a, b] (surely, every continuous function is
integrable), but it is not injective (the derivatives of f(x) and of f(x) + const are
equal).

The composition of two functions, f : X → Y and g : Y → Z, is defined by

(g ∘ f)(x) = g(f(x)).

In Mathematica the composition of the functions f and g can be calculated using
the function Composition[]:

In[1]:= f[x_] := x^2; g[x_] := Sqrt[x - 1]

In[2]:= h1[x_] = Composition[f, g][x]

Out[2]= -1 + x

In[3]:= h2[x_] = Composition[g, f][x]

Out[3]= Sqrt[-1 + x^2]

or alternatively

In[4]:= h1[x_] = f[g[x]]

Out[4]= -1 + x

In[5]:= h2[x_] = g[f[x]]

Out[5]= Sqrt[-1 + x^2]

If T₁ : X → Y and T₂ : Y → Z are linear operators, then the composition T₂ ∘ T₁
is a linear operator, too. Indeed,

(T₂ ∘ T₁)(αx₁ + βx₂) = T₂(αT₁(x₁) + βT₁(x₂)) =
  = αT₂(T₁(x₁)) + βT₂(T₁(x₂)) = α(T₂ ∘ T₁)(x₁) + β(T₂ ∘ T₁)(x₂).

Null space. The null space N(T) of the linear operator T : X → Y is the set

N(T) = {x : x ∈ X, Tx = θ}.

The null space, also known as the kernel of the transformation T : X → Y, is the
subset of elements of X which have the zero image.

It is easy to see that the null space of T is a linear subspace of X.

(Indeed, if x₁ ∈ N(T) and x₂ ∈ N(T), then Tx₁ = θ and Tx₂ = θ. The linearity
implies T(αx₁ + βx₂) = αTx₁ + βTx₂ = θ. That is, αx₁ + βx₂ ∈ N(T).)

If A is an m × n matrix, then the Mathematica function NullSpace[A] gives a
list of vectors that forms a basis for the null space of the matrix A.

As an example:

In[1]:= A = {{1, 0, 1, 2}, {0, 1, -1, 1}, {0, 0, 0, 0}}; MatrixForm[A]

Out[1]//MatrixForm=
        1  0   1  2
        0  1  -1  1
        0  0   0  0

In[2]:= B = NullSpace[A]

Out[2]= {{-2, -1, 0, 1}, {-1, 1, 1, 0}}

Check:

In[3]:= A . (α B[[1]] + β B[[2]])  (* notice the symbolic computation *)

Out[3]= {0, 0, 0}
If the matrix A is nonsingular, then NullSpace[A] gives the empty set { }.

(Notice that the Mathematica function NullSpace[A] can be used to compute
all the solutions of a homogeneous set of linear algebraic equations.)

The following result is useful in the study of operator equations.

Theorem 3.1. A linear operator T : X → Y is one-to-one if and only if its null
space is trivial, N(T) = {θ}.

Proof: The linearity of T implies Tx = T(x + θ) = Tx + Tθ and hence
Tθ = θ, that is, θ ∈ N(T).

If T is one-to-one, then x ≠ θ implies Tx ≠ Tθ, that is, Tx ≠ θ. In other terms, if
x ≠ θ then x ∉ N(T). Hence N(T) = {θ}.

Conversely, if N(T) = {θ}, then the equality Tx₁ = Tx₂ implies T(x₁ − x₂) = θ, that
is, x₁ − x₂ ∈ N(T), and since it is supposed that N(T) = {θ}, it follows that x₁ = x₂.
Thus, if N(T) = {θ} then the operator T is one-to-one. With this the proof is
completed.

Linear operators on finite-dimensional spaces. The study of continuous
systems often leads to solving boundary-value problems. The solution is a vector
in an infinite-dimensional space. The numerical methods approximate the solution
in finite-dimensional subspaces of the infinite-dimensional space. Hence it is of
great importance that:

Linear operators on finite-dimensional vector spaces can be represented by
matrices.

Let X and Y be finite-dimensional linear spaces, and let T : X → Y be a
linear operator. Let {φ₁, ..., φₙ} and {ψ₁, ..., ψₘ} be bases for X and
Y, respectively. Then x ∈ X and y ∈ Y can be expressed uniquely as

x = Σᵢ₌₁ⁿ ξᵢφᵢ,    y = Σⱼ₌₁ᵐ ηⱼψⱼ.

Thus for any x ∈ X, it can be written

y = Tx = T(Σᵢ₌₁ⁿ ξᵢφᵢ) = Σᵢ₌₁ⁿ ξᵢTφᵢ = Σⱼ₌₁ᵐ ηⱼψⱼ.

Since Tφᵢ ∈ Y, that is, the image of φᵢ is an element of the space Y, we can write

Tφᵢ = Σⱼ₌₁ᵐ tⱼᵢψⱼ.

If we substitute this formula into the previous one, then, because

Σᵢ₌₁ⁿ ξᵢ Σⱼ₌₁ᵐ tⱼᵢψⱼ = Σⱼ₌₁ᵐ (Σᵢ₌₁ⁿ tⱼᵢξᵢ)ψⱼ,

we get

Σⱼ₌₁ᵐ (Σᵢ₌₁ⁿ tⱼᵢξᵢ)ψⱼ = Σⱼ₌₁ᵐ ηⱼψⱼ,

and the uniqueness (see Theorem 2.1) implies

Σᵢ₌₁ⁿ tⱼᵢξᵢ = ηⱼ,   j = 1, ..., m,

or in matrix form

⎡ t₁₁ ⋯ t₁ₙ ⎤ ⎡ ξ₁ ⎤   ⎡ η₁ ⎤
⎢  ⋮      ⋮ ⎥ ⎢ ⋮  ⎥ = ⎢ ⋮  ⎥
⎣ tₘ₁ ⋯ tₘₙ ⎦ ⎣ ξₙ ⎦   ⎣ ηₘ ⎦

The matrix [tⱼᵢ] is said to represent the linear transformation with respect to the
bases {φ₁, ..., φₙ} and {ψ₁, ..., ψₘ}.
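As a concrete illustration of this representation (a Python sketch; the example is an assumption, not from the text): the differentiation operator d/dt maps the polynomials of degree at most 2 into those of degree at most 1. With respect to the bases {1, t, t²} and {1, t}, the columns of the representing matrix are the coordinate vectors of the images of the basis vectors: d/dt(1) = 0, d/dt(t) = 1, d/dt(t²) = 2t.

```python
# Representing matrix of d/dt : P2 -> P1 with respect to the bases
# {1, t, t^2} of P2 and {1, t} of P1; the columns are (0,0), (1,0), (0,2).
T = [[0, 1, 0],
     [0, 0, 2]]

def apply(T, xi):
    """Multiply the representing matrix by a coordinate vector xi."""
    return [sum(T[j][i] * xi[i] for i in range(len(xi))) for j in range(len(T))]

# p(t) = 3 + 5t - 4t^2 has coordinates (3, 5, -4); its derivative is 5 - 8t.
eta = apply(T, [3, 5, -4])
print(eta)  # [5, -8] -- the coordinates of p'(t) in the basis {1, t}
```

Applying the matrix to a coordinate vector carries out the operator entirely in coordinates, which is exactly the point of the construction above.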

Inverse operator. If T : X → Y is a one-to-one linear operator, then the linear
operator T⁻¹ : R(T) ⊂ Y → X, which to every element y ∈ R(T) assigns the
element x ∈ X for which Tx = y, is called the inverse operator to the operator T.

Functional. If the operator assigns scalars to the vectors of a linear space X
(e.g. if Y = ℝ, the set of real numbers), then the operator is called a functional
on X.

Since a functional is a special operator, the functional is said to be linear when
it is a linear operator.

E.3.4. A linear functional is e.g. f(x) = ∫ₐᵇ x(t) dt.

Bilinear functional. Let X × Y = {(x, y) : x ∈ X, y ∈ Y} denote the direct or
Cartesian product of two real linear spaces X and Y. A functional

b(x, y) ≡ ⟨x, y⟩

defined on X × Y is called bilinear if

⟨αx₁ + βx₂, y⟩ = α⟨x₁, y⟩ + β⟨x₂, y⟩
⟨x, αy₁ + βy₂⟩ = α⟨x, y₁⟩ + β⟨x, y₂⟩

for all x, x₁, x₂ ∈ X, y, y₁, y₂ ∈ Y and α, β ∈ ℝ.

The bilinear functional ⟨x, y⟩ (defined on X × X) is said to be symmetric if
⟨x, y⟩ = ⟨y, x⟩, and is positive definite if ⟨x, x⟩ ≥ 0 and ⟨x, x⟩ = 0 if and only if
x = θ.

The bilinear functional is sometimes also called a bilinear form.
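A standard example of a bilinear functional on ℝ² × ℝ² is ⟨x, y⟩ = xᵀAy for a fixed matrix A; when A is symmetric the form is symmetric, and when A is positive definite so is the form. The defining identities can be checked directly (a Python sketch; the matrix A and the test vectors are assumed examples):

```python
from fractions import Fraction as F

A = [[F(2), F(1)], [F(1), F(3)]]   # symmetric positive definite

def b(x, y):
    """Bilinear form <x, y> = x^T A y on R^2 x R^2."""
    return sum(x[i] * A[i][j] * y[j] for i in range(2) for j in range(2))

x1, x2, y1, y2 = [F(1), F(2)], [F(-3), F(4)], [F(5), F(-1)], [F(2), F(7)]
alpha, beta = F(3), F(-2)

# linearity in the first and in the second argument:
lhs1 = b([alpha*x1[k] + beta*x2[k] for k in range(2)], y1)
rhs1 = alpha*b(x1, y1) + beta*b(x2, y1)
lhs2 = b(x1, [alpha*y1[k] + beta*y2[k] for k in range(2)])
rhs2 = alpha*b(x1, y1) + beta*b(x1, y2)
print(lhs1 == rhs1, lhs2 == rhs2)       # True True

# symmetry (A is symmetric) and positivity at a sample point:
print(b(x1, y1) == b(y1, x1), b(x1, x1) > 0)   # True True
```
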

Algebraic dual space. Let X and Y be two real linear spaces. The set of all
linear transformations from X to Y is itself a linear space and it is denoted by
L(X, Y).

For example, L(ℝⁿ, ℝᵐ) is the set of all real m × n matrices, which is clearly a
linear space. When Y = ℝ¹, L(X, Y) becomes the space of all linear functionals
on X. This vector space is called the algebraic dual of X and is denoted by X*.
That is, X* = L(X, ℝ¹).

Linear functionals on X are often expressed in the form of dual pairs, that is,
as

f(x) ≡ ⟨f, x⟩

where f ∈ X*, x ∈ X and ⟨·, ·⟩ is a bilinear map from X* × X into ℝ¹. That
is, ⟨·, ·⟩ : X* × X → ℝ¹.

E.3.5. If X is a finite, say n-dimensional, vector space, then its algebraic dual
X* is a finite-dimensional vector space, too.

There is a special relation between the bases of X and X*. Let {x₁, ..., xₙ} be a
basis of X. Then any x ∈ X can be uniquely written in the form

x = Σᵢ₌₁ⁿ ξᵢxᵢ.

If F is a linear functional on the space X, then evidently

F(x) ≡ ⟨F, x⟩ = Σᵢ₌₁ⁿ F(xᵢ)ξᵢ;

that is, every linear functional is uniquely determined by its values on the basis
vectors x₁, ..., xₙ. Define the linear functionals l₁, ..., lₙ by the formula

lⱼ(x) ≡ ⟨lⱼ, x⟩ = ξⱼ.

We show that lⱼ is a linear functional. Indeed, for vectors x = Σᵢ₌₁ⁿ ξᵢxᵢ,
y = Σᵢ₌₁ⁿ ηᵢxᵢ and for any scalars α, β, we get

lⱼ(αx + βy) = lⱼ(Σᵢ₌₁ⁿ (αξᵢ + βηᵢ)xᵢ) = αξⱼ + βηⱼ = αlⱼ(x) + βlⱼ(y).

With respect to the trivial relation xᵢ = 0·x₁ + ... + 1·xᵢ + ... + 0·xₙ, it is obvious
that the relationship between {lⱼ} and {xᵢ} is

lⱼ(xᵢ) ≡ ⟨lⱼ, xᵢ⟩ = { 1, if i = j
                   { 0, if i ≠ j.

Since the basis vectors xᵢ are linearly independent, the lⱼ are also linearly
independent. Therefore the set {lⱼ} forms a basis for the dual space X*, and we
call {lⱼ} the dual basis.

Using the formula lⱼ(x) ≡ ⟨lⱼ, x⟩ = ξⱼ, any linear functional F(x) can be written in
the usual form of a linear combination

F(x) = Σᵢ₌₁ⁿ F(xᵢ)lᵢ(x).
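In ℝⁿ the dual basis can be computed explicitly: if the basis vectors xᵢ are the columns of a matrix B, the functionals lⱼ are represented by the rows of B⁻¹, since B⁻¹B = I encodes exactly lⱼ(xᵢ) = δⱼᵢ. A Python sketch for n = 2 (the basis below is an assumed example):

```python
from fractions import Fraction as F

# Basis of R^2 as the columns of B: x1 = (1, 1), x2 = (1, 2).
B = [[F(1), F(1)],
     [F(1), F(2)]]

# Inverse of the 2x2 matrix B; its rows represent the dual functionals l_j.
det = B[0][0]*B[1][1] - B[0][1]*B[1][0]
Binv = [[ B[1][1]/det, -B[0][1]/det],
        [-B[1][0]/det,  B[0][0]/det]]

def l(j, x):
    """Dual functional l_j applied to x: the j-th row of B^-1 times x."""
    return sum(Binv[j][k] * x[k] for k in range(2))

x1, x2 = [F(1), F(1)], [F(1), F(2)]
print([[int(l(j, x)) for x in (x1, x2)] for j in range(2)])
# [[1, 0], [0, 1]] -- i.e. l_j(x_i) = 1 if i = j, else 0
```
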

4. Normed spaces

Norm. Let X be a linear space. A functional ‖·‖ : X → ℝ₀⁺ (the non-negative
real numbers) is a norm on X if and only if

N1. ‖x‖ ≥ 0 for every x ∈ X, and ‖x‖ = 0 implies x = θ;

N2. ‖αx‖ = |α|·‖x‖, x ∈ X, α a scalar;

N3. ‖x + y‖ ≤ ‖x‖ + ‖y‖, x, y ∈ X (triangle inequality).

Normed vector spaces. Let X be a linear space and ‖·‖ be a norm on X.
Then the pair (X, ‖·‖) is called a normed space or normed linear space.

Distance. The function

d(x, y) = ‖x − y‖,  x, y ∈ X

is a possible measure for distance and is said to be the distance between the
vectors x and y.

Note that this definition of distance (induced by the norm) is a special case of
the distance (metric) defined in the theory of metric spaces. The conditions for a
general distance in metric spaces are:

1. d(x, y) ≥ 0, and d(x, y) = 0 if and only if x = y;

2. d(x, y) = d(y, x);

3. d(x, z) ≤ d(x, y) + d(y, z).

E.4.1. With the linear space ℝⁿ (of n-tuples of real numbers) the most
frequently associated norms are:

‖x‖₁ = Σₖ₌₁ⁿ |xₖ|  (sum norm),

‖x‖₂ = (Σₖ₌₁ⁿ xₖ²)^(1/2)  (Euclidean norm)

(however, Σₖ₌₁ⁿ xₖ² is not a norm, because it contradicts axioms N2 and N3),

‖x‖∞ = max₁≤ₖ≤ₙ |xₖ|  (max or infinite norm).
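The parenthetical claim about Σₖ xₖ² is easy to verify numerically: without the square root, homogeneity N2 fails (the functional scales with α² instead of |α|), and the triangle inequality N3 fails as well. A quick Python check (the vectors are an assumed example):

```python
# q(x) = sum of squares (the Euclidean norm without the square root)
# violates the norm axioms N2 and N3.
def q(x):
    return sum(v * v for v in x)

x = [1.0, 2.0]
alpha = 3.0

# N2 fails: q(alpha*x) = alpha^2 * q(x), not |alpha| * q(x).
print(q([alpha * v for v in x]), abs(alpha) * q(x))   # 45.0 15.0

# N3 fails: with y = x, q(x + y) = 4*q(x) > 2*q(x) = q(x) + q(y).
y = x
print(q([a + b for a, b in zip(x, y)]), q(x) + q(y))  # 20.0 10.0
```
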

With the same vector space ℝⁿ, the different norms ‖x‖₁, ‖x‖₂ and ‖x‖∞
define different normed spaces, usually denoted l¹ₙ, l²ₙ and l∞ₙ.

Notice that the norms and the corresponding normed spaces above are
special cases of the normed spaces lᵖₙ with norms

‖x‖ₚ = (Σₖ₌₁ⁿ |xₖ|ᵖ)^(1/p),  1 ≤ p < ∞,  and  ‖x‖∞ = lim_{p→∞} ‖x‖ₚ.

In Mathematica 5.0, vector p-norms can be computed using the function

Norm[x, p]

where p can be omitted if p = 2.

In[1]:= Clear["Global`*"]

In[2]:= x = {a, b, c}; y = {3, 4};

In[3]:= {Norm[x, Infinity], Norm[x], Norm[x, 1]}

Out[3]= {Max[Abs[a], Abs[b], Abs[c]],
         Sqrt[Abs[a]^2 + Abs[b]^2 + Abs[c]^2], Abs[a] + Abs[b] + Abs[c]}

In[4]:= {Norm[y, Infinity], Norm[y], Norm[y, 1]}

Out[4]= {4, 5, 7}

In[5]:= {Norm[x, p], Norm[y, p]}

Out[5]= {(Abs[a]^p + Abs[b]^p + Abs[c]^p)^(1/p), (3^p + 4^p)^(1/p)}

Using Mathematica it is easy to sketch the unit balls

Sₚ = {x ∈ ℝ² : ‖x‖ₚ ≤ 1}.

An example for p = 1, 2, ∞:

In[6]:= << Graphics`ImplicitPlot`

In[7]:= ImplicitPlot[Norm[{x1, x2}, 1] == 1, {x1, -1.2, 1.2}, {x2, -1.2, 1.2},
          AxesOrigin -> {0, 0}, AxesLabel -> {"x1", "x2"},
          PlotStyle -> {Thickness[0.008]}]

[Plot: the unit ball S₁, a square with vertices (±1, 0) and (0, ±1).]

Out[7]= -ContourGraphics-

In[8]:= ImplicitPlot[Norm[{x1, x2}] == 1, {x1, -1.2, 1.2}, {x2, -1.2, 1.2},
          AxesOrigin -> {0, 0}, AxesLabel -> {"x1", "x2"},
          PlotStyle -> {Thickness[0.008]}]

[Plot: the unit ball S₂, the unit circle.]

Out[8]= -ContourGraphics-

In[9]:= ImplicitPlot[Norm[{x1, x2}, Infinity] == 1, {x1, -1.2, 1.2}, {x2, -1.2, 1.2},
          AxesOrigin -> {0, 0}, AxesLabel -> {"x1", "x2"},
          PlotStyle -> {Thickness[0.008]}]

[Plot: the unit ball S∞, the square with vertices (±1, ±1).]

Out[9]= -ContourGraphics-

E.4.2. With the space C⁽⁰⁾[a, b] of continuous functions on the interval [a, b],
different normed spaces can be established. The norms

‖x‖₁ = ∫ₐᵇ |x(t)| dt,  ‖x‖₂ = (∫ₐᵇ |x(t)|² dt)^(1/2)  and  ‖x‖∞ = max_{a≤t≤b} |x(t)|

define the normed spaces L₁⁰[a, b], L₂⁰[a, b] and C[a, b], respectively.

Note that the norms in example E.4.1 are the discrete analogues of the norms
in E.4.2.

In Mathematica, e.g. the norm of the space L₂⁰[a, b] can be defined as follows:

"###############################
In[1]:= Func2Norm@f_, a_, b_D := abAbs@f@tDD2 t

In[2]:= g@x_D := x3

In[3]:= Func2Norm@g, 1, 1D

Out[3]= $%%%%%%
2
7

In[4]:= % N

Out[4]= 0.534522
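The same norm can be approximated without symbolic integration. The following Python sketch (the name l2_norm is our own helper, not a library routine) uses a composite trapezoidal rule and reproduces the value ‖g‖ = √(2/7) ≈ 0.534522 computed above:

```python
import math

def l2_norm(f, a, b, n=20000):
    """Approximate the L2[a,b] norm of f with the composite trapezoidal rule."""
    h = (b - a) / n
    s = 0.5 * (abs(f(a)) ** 2 + abs(f(b)) ** 2)
    s += sum(abs(f(a + k * h)) ** 2 for k in range(1, n))
    return math.sqrt(h * s)

g = lambda t: t ** 3
print(l2_norm(g, -1.0, 1.0))   # close to sqrt(2/7) ~ 0.534522
print(math.sqrt(2.0 / 7.0))
```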

E.4.3. The vector space C^(2)[a,b] of twice continuously differentiable functions on the finite interval [a,b] with the norm

‖x‖ = ( ∫_a^b |x(t)|² dt + ∫_a^b |dx(t)/dt|² dt + ∫_a^b |d²x(t)/dt²|² dt )^{1/2}
    = ( Σ_{k=0}^{2} ∫_a^b |d^k x(t)/dt^k|² dt )^{1/2}

is also a normed space.

The same linear space C^(2)[a,b] with the norm

‖x‖ = max_{a≤t≤b} |x(t)| + max_{a≤t≤b} |dx(t)/dt| + max_{a≤t≤b} |d²x(t)/dt²|

is another normed space.

E.4.4. The real-valued function

x → ( ∫_a^b |dx(t)/dt|² dt )^{1/2}

is not a norm in the linear space C^(1)[a,b], because it contradicts axiom N1. Clearly, from ∫_a^b |dx(t)/dt|² dt = 0 it follows only that dx(t)/dt ≡ 0 (and not x(t) ≡ 0) everywhere in [a,b].

5. Convergence, complete spaces

Iterative methods are often the only practical choice for the numerical solution of operator equations. For this reason, it is necessary to extend the concepts of convergence and limit from sequences of numbers to sequences of vectors in linear spaces.

Convergence. A sequence of elements {x_n} in a normed space X is said to be convergent if there exists x* ∈ X such that the sequence of numbers ‖x_n − x*‖ converges to zero. We refer to x* as the limit of the sequence {x_n} and write

x_n → x* as n → ∞, or lim_{n→∞} x_n = x*.

We emphasize that in this definition the condition x* ∈ X is of fundamental importance. Surely, the sequence of rational numbers {1, 1.4, 1.41, 1.414, 1.4142, ...} in the normed space of real numbers (ℝ, |·|) converges to √2, but in the normed space of rational numbers (Rac, |·|) this sequence does not converge.

Cauchy sequence. A sequence {x_n} in a normed space is called a Cauchy sequence if

‖x_n − x_m‖ → 0, as n, m → ∞.

Theorem 5.1. In normed spaces every convergent sequence is a Cauchy sequence.

Proof: This follows immediately from the triangle inequality

‖x_n − x_m‖ = ‖(x_n − x) + (x − x_m)‖ ≤ ‖x_n − x‖ + ‖x_m − x‖,

supposing x_n → x, x_m → x as m, n → ∞. (From the norm axiom N2 it follows that ‖x − x_m‖ = ‖(−1)(x_m − x)‖ = |−1| ‖x_m − x‖.)

In the finite-dimensional space ℝⁿ, the converse of this statement is also true: any Cauchy sequence is convergent. However, in general infinite-dimensional spaces, a Cauchy sequence may fail to converge.

Complete space. A normed space X is said to be complete if every Cauchy sequence in X has a limit (in X).

Banach space. A complete normed linear space is called a Banach space.

31
Note that metric spaces other than normed spaces may be complete too, but a Banach space is always a normed space.

E.5.1. Consider the linear space X whose elements are infinite sequences of real numbers x = {ξ₁, ξ₂, ...} with Σ_{k=1}^∞ ξ_k² < ∞. The norm in X can be defined by

‖x‖ = ( Σ_{k=1}^∞ ξ_k² )^{1/2},

and this (infinite-dimensional) normed vector space is usually denoted by l².

Let l₀² denote those infinite sequences of l² in which all elements are zero except a finite number of elements. It is easy to see that the sequence

x_n = { 2^{−1}, 2^{−2}, ..., 2^{−n}, 0, 0, ... } ∈ l₀²

does not have a limit in l₀², because the sequence

x = { 2^{−1}, 2^{−2}, ..., 2^{−n}, 2^{−(n+1)}, ... } ∈ l²

does not belong to l₀².

The completion of l₀² is l².
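The distances involved are easy to tabulate. A small Python sketch (the helper names xn and dist are ours) shows that ‖x_n − x_m‖ = (Σ_{k=n+1}^m 4^{−k})^{1/2} becomes arbitrarily small, so {x_n} is a Cauchy sequence in l₀² even though its only candidate limit lies outside l₀²:

```python
import math

def xn(n):
    """First n terms 2^-1, ..., 2^-n of the candidate limit, padded with zeros."""
    return [2.0 ** -(k + 1) for k in range(n)]

def dist(n, m):
    """l2 distance ||x_n - x_m|| for n < m: sqrt(sum_{k=n+1}^{m} 4^-k)."""
    return math.sqrt(sum(4.0 ** -k for k in range(n + 1, m + 1)))

print(dist(10, 20))   # already below 1e-3
print(dist(20, 40))   # smaller still
# The only possible limit {2^-k} has infinitely many nonzero terms,
# so it lies in l2 but not in l2_0.
```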

E.5.2. With the vector space C^(0)[0,1] of continuous functions on the closed interval [0,1], different normed spaces can be defined.

The normed space L_0[0,1] defined by the integral norm

‖x‖_1 = ∫_0^1 |x(t)| dt

is not complete, but the normed space C[0,1], defined by the maximum (infinity) norm

‖x‖_∞ = max_{0≤t≤1} |x(t)|,

is complete, that is, a Banach space.

To show that the linear space C^(0)[0,1] of continuous functions with respect to the integral norm is not complete, it is enough to find a Cauchy sequence which does not converge in L_0[0,1].

Therefore consider the sequence of continuous functions on the interval [0,1] given by

x_n(t) = 0              if 0 ≤ t ≤ 1/2 − 1/n,
x_n(t) = n t − n/2 + 1  if 1/2 − 1/n ≤ t ≤ 1/2,
x_n(t) = 1              if t ≥ 1/2

(n ≥ 2) and illustrated in Figure 5.1.


[Figure 5.1. Sequence of continuous functions: the ramps x_n and x_m (n < m) rise from 0 to 1 on the intervals [1/2 − 1/n, 1/2] and [1/2 − 1/m, 1/2], respectively.]

Figure 5.1 illustrates geometrically (area of the hatched triangle) that

‖x_n − x_m‖_1 = ∫_0^1 |x_n(t) − x_m(t)| dt = (1/2)(1/n − 1/m)·1 = (1/2)(1/n − 1/m),

and if m, n → ∞, then ‖x_n − x_m‖_1 → 0; that is, we have proved that the sequence {x_n(t)} is a Cauchy sequence in the norm ‖·‖_1.
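This computation can be confirmed numerically. The following Python sketch (x and l1_dist are our own helper names) approximates ‖x_n − x_m‖_1 by a midpoint rule and matches the triangle-area formula (1/2)(1/n − 1/m):

```python
def x(n, t):
    """The ramp function x_n(t) from Figure 5.1 (n >= 2)."""
    if t <= 0.5 - 1.0 / n:
        return 0.0
    if t <= 0.5:
        return n * t - n / 2.0 + 1.0
    return 1.0

def l1_dist(n, m, steps=100000):
    """Midpoint-rule approximation of ||x_n - x_m||_1 on [0,1]."""
    h = 1.0 / steps
    return h * sum(abs(x(n, (k + 0.5) * h) - x(m, (k + 0.5) * h))
                   for k in range(steps))

n, m = 4, 8
print(l1_dist(n, m))                # ~ (1/2)(1/n - 1/m) = 0.0625
print(0.5 * (1.0 / n - 1.0 / m))
```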

However, no continuous function x(t) exists as a limit of the sequence {x_n(t)}.

To show this, consider the step function

y(t) = 0 if 0 ≤ t < 1/2,  y(t) = 1 if 1/2 ≤ t ≤ 1.

It is easy to verify that ‖x_n − y‖_1 → 0 as n → ∞, that is, the sequence {x_n} converges to the discontinuous function y. Indeed,

‖x_n − y‖_1 = (1/2)(1/n)·1 = 1/(2n) → 0, as n → ∞.

Moreover, the sequence {x_n} cannot converge to another continuous function x ≠ y, because if ‖x_n − x‖_1 → 0 as n → ∞, then the inequality

‖x − y‖_1 = ‖(x − x_n) + (x_n − y)‖_1 ≤ ‖x_n − x‖_1 + ‖x_n − y‖_1

implies x = y, which is a contradiction.

Hence the space L_0[0,1] of continuous functions with respect to the integral norm ‖·‖_1 is not complete.

The fact that the space C[0,1] of continuous functions with respect to the maximum norm ‖·‖_∞ is complete follows immediately from the theorem known from elementary analysis:

"The limit of every uniformly convergent sequence of continuous functions is a


continuous function."

(In more detail we demonstrate that the normed space C[0,1] is complete. For any fixed point t₀ ∈ [0,1], a Cauchy sequence {x_n(t)} in C[0,1] yields a Cauchy sequence of real numbers: |x_n(t₀) − x_m(t₀)| ≤ max_{t∈[0,1]} |x_n(t) − x_m(t)| → 0, as m, n → ∞.

Consequently, for every point t ∈ [0,1] there exists a real number x(t) to which {x_n(t)} converges. This pointwise convergence holds for any t in [0,1]; moreover, since the Cauchy property holds in the maximum norm, uniformly in t, the sequence {x_n(t)} converges uniformly on [0,1]. Thus there is a limit function x(t) for which

‖x_n(t) − x(t)‖_∞ = max_{t∈[0,1]} |x_n(t) − x(t)| → 0, as n → ∞.

It remains to be shown that x(t) is continuous. Let t ∈ [0,1] and let {t_m} be a sequence of points in [0,1] which converges to the point t as m → ∞. Clearly, for any n ≥ 1

|x(t) − x(t_m)| ≤ |x(t) − x_n(t_m)| + |x_n(t_m) − x(t_m)|,

where {x_n} is a Cauchy sequence in C[0,1]. Since each x_n is continuous, x_n(t_m) → x_n(t) as m → ∞. Since {x_n(t)} is a Cauchy sequence, we have already shown that x_n(t) → x(t) as n → ∞, and thus x_n(t_m) → x(t) as n, m → ∞. Likewise x_n(t_m) → x(t_m) as n → ∞.

Hence the previous inequality implies

|x(t) − x(t_m)| → 0,

that is, x(t_m) → x(t) if t_m → t. Thus the function x(t) is continuous.)

We note that convergence in the maximum norm implies pointwise convergence, and that the sequence of continuous functions illustrated in Figure 5.1 is not even a Cauchy sequence with respect to the maximum norm. From the figure (n < m) it is clear that

max_{0≤t≤1} |x_n(t) − x_m(t)| = x_n(1/2 − 1/m) = n(1/2 − 1/m) − n/2 + 1 = 1 − n/m,

and the limit of this expression is indeterminate. Indeed, using Mathematica we get

In[1]:= 1 - n/m /. {n -> Infinity, m -> Infinity}

∞::indet : Indeterminate expression 0 ∞ encountered. More…

Out[1]= Indeterminate

(Here we could not use the built-in function Limit[], because it can only work with one variable.)

Completion. Every incomplete normed space can be completed by adding the limit points of all Cauchy sequences in the space. If X is an incomplete normed space, one can construct a new space X̄ that is complete and contains X as a subspace. The space X̄ is called the completion of X.

In general the only, but not simple, problem is to find out what the limit elements will be. Remember that the normed space (ℝ, |·|), i.e. the set of real numbers with the usual norm (the absolute value), forms a complete normed space.

Similarly, the normed space (ℝⁿ, ‖·‖₂) is a Banach space. However, the rational numbers with the absolute-value norm do not form a Banach space.

Compact set. A subset S of a normed space X is said to be compact if every infinite sequence {x_n : x_n ∈ S} contains a convergent subsequence (that converges to a vector x ∈ S).

If S is a set for which its completion S̄ is compact, we say S is precompact.

The normed space (ℝ, |·|) is not compact. Indeed, {0, 1, 2, ...} contains no convergent subsequence.

Recall that a set S = {x₁, x₂, ..., x_n, ...} is said to be bounded if there is a number K > 0 such that ‖x_n‖ < K for all n. The subset S is closed if x_n ∈ S and x_n → x implies x ∈ S.

Theorem 5.2 (Heine-Borel). Let X be a finite-dimensional normed linear


space, and let S be a subset of X . Then S is compact if and only if S is both
closed and bounded.

However, in an arbitrary Banach space the statement of this theorem is not


true.

Consider the normed space l² introduced in example E.5.1, which consists of infinite sequences of real numbers x = {ξ₁, ξ₂, ...} with norm

‖x‖ = ( Σ_{k=1}^∞ ξ_k² )^{1/2}.

In l², consider the bounded, closed ball S defined by ‖x‖ ≤ 1. This ball is not compact. Indeed, consider the sequence (of sequences)

e₁ = {1, 0, 0, ...}, e₂ = {0, 1, 0, ...}, ...

For m ≠ n we have ‖e_n − e_m‖ = √2. Hence the sequence {e_k : e_k ∈ S} contains no Cauchy subsequence, and so it contains no convergent subsequence either.

However, compact sets are always bounded and closed.

6. Continuous and bounded linear operator

Let X and Y be two normed spaces. The convergence of sequences in X and Y is understood with respect to the norms ‖·‖_X and ‖·‖_Y associated with the spaces X and Y, respectively.

Continuous operator. An operator T : X → Y (linear or nonlinear) is said to be continuous at a point x̄ ∈ X if for every sequence {x_n : x_n ∈ X} that converges to x̄ the sequence T(x_n) → T(x̄) as n → ∞.

If T : X → Y is continuous at every point of its domain D ⊂ X, we simply say that T is continuous on D.

Theorem 6.1. An operator T : X → Y is continuous at a point x̄ ∈ X if and only if for any ε > 0 there exists a δ = δ(ε) > 0 so that

if ‖x − x̄‖_X < δ(ε), then ‖T(x) − T(x̄)‖_Y < ε.

We want to emphasize that the continuity of an operator depends on the norms ‖·‖_X and ‖·‖_Y used.

Theorem 6.2. If a linear operator T : X → Y is continuous at the point x = θ (the zero vector), then T is continuous at all points of X.

Proof: If x is an arbitrary vector and x_n → x as n → ∞, then x_n − x → θ, and so by the condition of the theorem we have T(x_n − x) → Tθ. The linearity T(x_n − x) = Tx_n − Tx and the equality Tx = T(x + θ) imply Tθ = θ. Hence Tx_n → Tx, so that indeed T is continuous at an arbitrary point x.

Bounded linear operator. A linear operator T : X → Y is said to be bounded (above) if there is a constant K > 0 so that

‖Tx‖_Y ≤ K ‖x‖_X, ∀ x ∈ X.

The number K is called a bound of the operator T.

In view of this definition, a linear functional F : D_F ⊂ X → ℝ is bounded on its domain D_F if there is a number K > 0 so that

|F(x)| ≤ K ‖x‖, ∀ x ∈ D_F.

Theorem 6.3. (The equivalence of boundedness and continuity.)

A linear operator T is continuous if and only if it is bounded.

Proof: If T is a bounded linear operator, then

‖Tx_n − Tx‖_Y = ‖T(x_n − x)‖_Y ≤ K ‖x_n − x‖_X

is valid for any x_n, x ∈ X. Therefore T is continuous in X.

Conversely, if T is not a bounded linear operator, we shall show that T is not continuous at the point θ. Because T is not bounded, there exists a bounded sequence {x_n : x_n ∈ X} such that ‖Tx_n‖_Y → ∞. Without loss of generality we may suppose Tx_n ≠ θ for any positive integer n. Define the sequence

x̃_n = x_n / ‖Tx_n‖_Y.

It is obvious that x̃_n → θ, but ‖T x̃_n‖_Y = 1, so T is not continuous.

E.6.1. In the normed space C[a,b] the integral operator

T f(t) = ∫_a^t f(τ) dτ, t ∈ [a,b]

is continuous. Clearly,

|T f(t)| = |∫_a^t f(τ) dτ| ≤ ∫_a^t |f(τ)| dτ ≤ ∫_a^b |f(τ)| dτ ≤ (b − a) max_{a≤t≤b} |f(t)|, t ∈ [a,b],

and so

‖T f‖_∞ = max_{a≤t≤b} |∫_a^t f(τ) dτ| ≤ (b − a) ‖f‖_∞.

This means that T is bounded and hence continuous too.

E.6.2. Let D denote the subspace of the normed space C[a,b] which consists of differentiable functions. Then the differential operator

L f(t) = df(t)/dt, f ∈ D

is not continuous in C[a,b]. To show this, it is enough to find one convergent sequence {g_n : g_n ∈ D} such that the sequence {L g_n, n = 1, 2, ...} does not converge. If g_n(t) = (1/n) sin(nt), then g_n ∈ D and g_n → θ as n → ∞; however, the sequence L g_n(t) = cos(nt) does not have a limit. Thus the operator L is not continuous (and by Theorem 6.3 it is also unbounded).
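The effect is easy to observe numerically. The Python sketch below (sup_norm is our own sampling-based estimate of the maximum norm, not an exact supremum) shows ‖g_n‖_∞ = 1/n shrinking while ‖L g_n‖_∞ = max |cos(nt)| stays equal to 1:

```python
import math

def sup_norm(f, a=0.0, b=math.pi, samples=10000):
    """Crude sup-norm estimate by sampling f on [a, b]."""
    return max(abs(f(a + k * (b - a) / samples)) for k in range(samples + 1))

for n in (1, 10, 100):
    gn = lambda t, n=n: math.sin(n * t) / n   # g_n -> 0 in the maximum norm
    dgn = lambda t, n=n: math.cos(n * t)      # but L g_n = cos(nt) does not
    print(n, sup_norm(gn), sup_norm(dgn))
```

The first column of norms decays like 1/n; the second stays at 1, so L cannot be continuous in the maximum norm.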

We note that by an appropriate selection of the norm, the operator of differentiation can be made continuous (and hence bounded too). For instance, let C_new[a,b] denote the normed space of functions from the linear space C^(1)[a,b] with the norm defined by

‖f‖_new = max_{a≤t≤b} |f(t)| + max_{a≤t≤b} |df(t)/dt| = ‖f‖_∞ + ‖f′‖_∞.

The operator

L = d/dt : C_new[a,b] → C[a,b],

i.e. differentiation using the norm ‖f‖_new, is continuous. Really, if ‖f_n‖_new → 0, that is, if ‖f_n‖_∞ + ‖f_n′‖_∞ → 0, then ‖L f_n‖_∞ = ‖f_n′‖_∞ → 0 as n → ∞.

Norm of linear operator. Let X, Y be normed spaces. The smallest bound of the continuous linear operator T : X → Y, that is, the smallest number K for which

‖Tx‖_Y ≤ K ‖x‖_X,

is called the norm of T and is denoted by ‖T‖.

It follows that ‖Tx‖_Y ≤ ‖T‖ ‖x‖_X.

The definition of the norm of a continuous linear operator T can also be written as

‖T‖ = sup_{x≠θ} ‖Tx‖_Y / ‖x‖_X = sup_{‖ξ‖_X = 1} ‖Tξ‖_Y.

(The second form follows from the first one if the vector x is written as x = ‖x‖_X ξ, ‖ξ‖_X = 1.)

If T = F is a continuous linear functional, then clearly

‖F‖ = sup_{x≠θ} |F(x)| / ‖x‖ = sup_{‖x‖=1} |F(x)|.

It can be shown that ‖T‖ satisfies the norm axioms.

Topological dual space. Let X be a normed space. The set of all continuous linear functionals

f : X → ℝ

is a normed space, denoted by X′, and called the (topological) dual of X.

Let f ∈ X′ and express the functional f as a duality pair

f(x) = ⟨f, x⟩, f ∈ X′, x ∈ X,

where the symbol ⟨·,·⟩ denotes a bilinear map from the space X′ × X into ℝ.

If we substitute f(x) = ⟨f, x⟩ into the formula above for the norm of a continuous linear functional, we get the norm

‖f‖_X′ = sup_{x≠θ} |⟨f, x⟩| / ‖x‖_X

of the dual normed space X′.

Compact operator. A linear operator T is compact if it transforms bounded sets into precompact sets.

7. Dense sets, separable spaces

Dense set. Let X be a normed space. A set S ⊂ X is said to be dense in X if for every element x ∈ X there is a sequence {s_n : s_n ∈ S} such that s_n → x as n → ∞.

In other words, any element in X can be reached as the limit of a sequence selected from the elements of the dense set S.

Theorem 7.1. Let X be a normed space. The set S ⊂ X is dense in X if and only if for every ε > 0 and every x ∈ X there is an element s ∈ S such that

‖x − s‖ < ε.

In other words, the set S is dense in the normed space X if and only if any
x X can be approximated (with arbitrary precision) by elements of S .

Separable spaces. The normed space X is said to be separable if it contains a countable dense subset.

In other words, the separability of X means that there exists a sequence {x_n : x_n ∈ X} such that every x ∈ X is either an element of this sequence, or x is a limit of a subsequence of {x_n : x_n ∈ X}.

We note that the rational numbers, which are dense in the (by absolute value) normed space of real numbers, are countable (see the diagram below). Hence the real numbers create a separable space.¹

1/1  1/2  1/3  1/4  ...
2/1  2/2  2/3  2/4  ...
3/1  3/2  3/3  3/4  ...
4/1  4/2  4/3  4/4  ...
 ⋮    ⋮    ⋮    ⋮

¹ Note that Georg Cantor (1845-1918) proved that the set of real numbers in the interval [0,1] is not countable. (We say that the points of the interval [0,1] create a continuum.)

Recall that a subset S = {x₁, x₂, ...} of a vector space X is said to span or generate X if every x ∈ X can be written as a linear combination of the vectors of S.

Spanning set. A subset S of a normed space X is said to span or generate X if the set of all linear combinations of the vectors of S is dense in X.

This means that every x ∈ X either is an element of S, or is a linear combination of vectors of S, or is a limit of a sequence of such linear combinations.

Theorem 7.2. If the normed space X is separable, then it contains a countable


subset S which spans X .

The dense sequence {xn : xn X } is the desired subset S .

The converse of this statement is also true.

Theorem 7.3. If the spanning set S of the normed space X is countable, then
X is separable.

Theorem 7.4. (Weierstrass approximation theorem.)

Let f ∈ C[a,b], and let ε > 0. Then there exists a polynomial p(x) for which

‖f − p‖_∞ ≤ ε.

E.7.1. According to the Weierstrass approximation theorem, any continuous function on a finite interval [a,b] can be obtained as the limit of a uniformly convergent sequence of polynomials. This means that polynomials form a dense subspace in the normed space C[a,b]. Every polynomial can be written as a linear combination of the power functions 1, t, t², ...; hence the sequence {t^k; k = 0, 1, 2, ...} spans the space C[a,b], and by Theorem 7.3 it is separable, too.
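A constructive proof of the Weierstrass theorem uses the Bernstein polynomials B_n f(x) = Σ_k f(k/n) C(n,k) x^k (1−x)^{n−k}, which converge to f uniformly on [0,1]. A minimal Python sketch (assuming [a,b] = [0,1]; the helper name bernstein is ours):

```python
import math

def bernstein(f, n):
    """Degree-n Bernstein polynomial of f on [0,1]; B_n f -> f uniformly."""
    def p(x):
        return sum(f(k / n) * math.comb(n, k) * x ** k * (1 - x) ** (n - k)
                   for k in range(n + 1))
    return p

f = lambda x: abs(x - 0.5)   # continuous, but not differentiable at 1/2
for n in (4, 16, 64):
    # sampled estimate of the maximum error on [0,1]
    err = max(abs(bernstein(f, n)(j / 200) - f(j / 200)) for j in range(201))
    print(n, err)
```

The sampled maximum error shrinks as n grows, even for this non-smooth f, illustrating uniform convergence.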

8. Inner product, Hilbert space

Inner product. Let X be a real linear space. A function ⟨·,·⟩ : X × X → ℝ, defined for each pair x, y ∈ X, is called an inner product (or scalar product) on X if and only if it satisfies the following axioms:

S1. ⟨x, y⟩ = ⟨y, x⟩,

S2. ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩, ∀ z ∈ X,

S3. ⟨λx, y⟩ = λ⟨x, y⟩, λ a real number,

S4. ⟨x, x⟩ ≥ 0 and ⟨x, x⟩ = 0 if and only if x = θ.

This definition is a generalization of the scalar product of geometric vectors.

We note that if X denotes a complex linear space, then axiom S1 is replaced by ⟨x, y⟩ = conj(⟨y, x⟩), where conj denotes the complex conjugate.

Observe that the inner product is a bilinear functional which is positive definite and symmetric, too.

Inner product space. A real linear space with an inner product is called inner
product space or pre-Hilbert space (in finite dimensional cases also Euclidean
space).

E.8.1. In the linear space ℝⁿ of real n-tuples x = (x₁, ..., x_n), y = (y₁, ..., y_n),

⟨x, y⟩ = x₁y₁ + ... + x_n y_n

is the usual Euclidean inner product.

E.8.2. In the real vector space C^(1)[a,b] of all continuous functions with continuous first derivative, the expressions

⟨x, y⟩ = ∫_a^b x(t) y(t) dt

and

⟨x, y⟩ = ∫_a^b x(t) y(t) dt + ∫_a^b (dx(t)/dt)(dy(t)/dt) dt

define inner products on C^(1)[a,b], but

∫_a^b (dx(t)/dt)(dy(t)/dt) dt

is not an inner product on C^(1)[a,b] because it contradicts axiom S4: from ⟨x, x⟩ = 0 it follows only that dx(t)/dt = 0, and not x(t) = 0, everywhere in [a,b].

Orthogonal vectors. The vectors x and y of an inner product space X are said to be orthogonal if

⟨x, y⟩ = 0.

Test for linear dependence. Using the inner product, an effective method can
be given for determining whether or not the vectors x1 , ... , x n are linearly
independent.

Theorem 8.1. Let X be an inner product space. A set of vectors x₁, ..., x_n of X is linearly independent if and only if the Gram matrix

[ ⟨x₁, x₁⟩   ...   ⟨x₁, x_n⟩ ]
[    ...     ...      ...     ]
[ ⟨x_n, x₁⟩  ...   ⟨x_n, x_n⟩ ]

is nonsingular.
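The Gram test is easy to apply numerically. The Python sketch below (gram_det is our own helper; a library determinant would do as well) builds the Gram matrix from the Euclidean inner product and returns its determinant: nonzero for independent vectors, zero for dependent ones:

```python
def gram_det(vectors):
    """Determinant of the Gram matrix <x_i, x_j>, via Gaussian elimination."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    g = [[dot(u, v) for v in vectors] for u in vectors]
    n, det = len(g), 1.0
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(g[r][i]))
        if abs(g[pivot][i]) < 1e-12:
            return 0.0            # singular Gram matrix: dependent vectors
        if pivot != i:
            g[i], g[pivot] = g[pivot], g[i]
            det = -det
        det *= g[i][i]
        for r in range(i + 1, n):
            factor = g[r][i] / g[i][i]
            g[r] = [a - factor * b for a, b in zip(g[r], g[i])]
    return det

print(gram_det([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]))  # 1.0
print(gram_det([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 0.0]]))  # 0.0
```

The second set is dependent (the second vector is twice the first), and the Gram determinant detects this without any row reduction of the vectors themselves.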

Theorem 8.2. (Cauchy-Schwarz inequality.) For a real inner product space X,

|⟨x, y⟩| ≤ √⟨x, x⟩ √⟨y, y⟩, ∀ x, y ∈ X.

Proof: For any real scalar λ,

0 ≤ ⟨x − λy, x − λy⟩ = ⟨x, x⟩ − λ⟨y, x⟩ − λ⟨x, y⟩ + λ²⟨y, y⟩,

that is,

⟨y, y⟩λ² − 2⟨x, y⟩λ + ⟨x, x⟩ ≥ 0.

The left-hand side of this inequality is a quadratic polynomial in the variable λ, having real coefficients. The inequality indicates that this polynomial cannot have distinct real zeros. Consequently, its discriminant cannot be positive, that is,

(2⟨x, y⟩)² − 4⟨y, y⟩⟨x, x⟩ ≤ 0,

from which the Cauchy-Schwarz inequality is immediate.
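A quick numerical sanity check of the inequality, and of the equality case for linearly dependent vectors, can be sketched in Python (the Euclidean inner product stands in for a general one):

```python
import math
import random

random.seed(1)
dot = lambda u, v: sum(a * b for a, b in zip(u, v))

# |<x,y>| <= sqrt(<x,x>) sqrt(<y,y>) for randomly chosen vectors
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(5)]
    y = [random.uniform(-1, 1) for _ in range(5)]
    assert abs(dot(x, y)) <= math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)) + 1e-12

# Equality holds exactly when x and y are linearly dependent:
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
print(abs(dot(x, y)))                                  # 28.0
print(math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))     # also 28 (up to rounding)
```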

Note the general character¹ of this inequality and the simplicity of its proof.

Theorem 8.3. Every inner product space is a normed space. In fact, if x is an element in an inner product space X, the mapping

x → ‖x‖ = √⟨x, x⟩

defines a norm on X.

Proof: That √⟨x, x⟩ satisfies the norm axioms N1 and N2 is obvious. The validity of axiom N3, that is, of the triangle inequality, is easy to see using the Cauchy-Schwarz inequality:

‖x + y‖² = ⟨x + y, x + y⟩ = ⟨x, x⟩ + 2⟨x, y⟩ + ⟨y, y⟩ ≤ ⟨x, x⟩ + 2|⟨x, y⟩| + ⟨y, y⟩
         ≤ ⟨x, x⟩ + 2√⟨x, x⟩ √⟨y, y⟩ + ⟨y, y⟩ = (‖x‖ + ‖y‖)².

The existence of this norm makes it possible to define the completeness of inner product spaces.

Hilbert space. A complete inner product space is called a Hilbert space.

Every finite-dimensional inner product space is complete; hence the Euclidean


space is a Hilbert space.

We recall that a Banach space is a complete normed space. A Hilbert space is a special Banach space, where the norm is induced by the inner product: ‖·‖ = √⟨·,·⟩.

¹ For example |Σ_{i=1}^n x_i y_i| ≤ (Σ_{i=1}^n x_i²)^{1/2} (Σ_{i=1}^n y_i²)^{1/2} or |∫_a^b x(t) y(t) dt| ≤ (∫_a^b x²(t) dt)^{1/2} (∫_a^b y²(t) dt)^{1/2}.

In the following we will use the fact that any continuous linear functional F(u) can be written as a duality pair, i.e. in the form F(u) = ⟨F, u⟩, F ∈ X′, u ∈ X, where X′ is the dual space of the normed space X.

Theorem 8.4. (Riesz¹ representation theorem.) Let H be a Hilbert space, F ∈ H′. Then there is a unique g ∈ H for which

F(u) = ⟨g, u⟩, ∀ u ∈ H.

In addition,

‖F‖_H′ = ‖g‖_H.

We want to emphasize that F is a continuous linear functional defined on the


space H while g is an element of H .

If the linear functional is not bounded, then the Riesz theorem is not valid.

Proof: At first, assuming the existence of g, we prove its uniqueness. Suppose g̃ ∈ H satisfies

F(u) = ⟨g, u⟩ = ⟨g̃, u⟩, ∀ u ∈ H.

Then

⟨g − g̃, u⟩ = 0, ∀ u ∈ H.

Choose u = g − g̃. Then ‖g − g̃‖ = 0, which implies g = g̃.

We prove the existence of the vector g.

Denote by

N = N(F) = {u : u ∈ H, Fu = 0}

the null space of F, which is a subspace of H. If N = H, then Fu ≡ 0, and choosing g = θ the proof is finished.

Now suppose N ≠ H. Then there exists at least one u_H ∈ H such that F(u_H) ≠ 0. It is possible to decompose H as the direct sum N ⊕ M, where M = {u ∈ H : ⟨u, v⟩ = 0 ∀ v ∈ N} is the orthogonal complement of N with respect to H. Then we can write

¹ Frigyes Riesz (1880-1956), famous Hungarian mathematician.

u_H = u_N + u_M, u_N ∈ N, u_M ∈ M,

where

F(u_M) = F(u_H − u_N) = F(u_H) − F(u_N) = F(u_H) ≠ 0.

For any u ∈ H, it is true that

F( u − (F(u)/F(u_M)) u_M ) = 0.

Hence u − (F(u)/F(u_M)) u_M ∈ N, and because u_M is orthogonal to N,

⟨ u − (F(u)/F(u_M)) u_M , u_M ⟩ = 0.

That is, ⟨u, u_M⟩ − (F(u)/F(u_M)) ‖u_M‖² = 0, i.e.

F(u) = (F(u_M)/‖u_M‖²) ⟨u_M, u⟩.

In other words, we may choose g to be

g = (F(u_M)/‖u_M‖²) u_M.

We complete the proof of the theorem by showing ‖F‖_H′ = ‖g‖_H. From

F(u) = ⟨g, u⟩, ∀ u ∈ H

and the Cauchy-Schwarz inequality, |F(u)| = |⟨g, u⟩| ≤ ‖g‖_H ‖u‖_H, ∀ u ∈ H.

Hence (see the definition of the norm of a continuous linear functional)

‖F‖_H′ = sup_{u≠θ} |⟨g, u⟩| / ‖u‖_H ≤ ‖g‖_H.

However, it is impossible that ‖F‖_H′ < ‖g‖_H, because F(g) = ⟨g, g⟩ = ‖g‖²_H, so that really ‖F‖_H′ = ‖g‖_H holds.

We note that this theorem is a fundamental tool in the solvability theory for
elliptic partial differential equations.

Adjoint operator. Consider a bounded (i.e. continuous) linear operator T : U → V, where U and V are normed linear spaces. Let f(v) be a continuous linear functional defined on V. Then f ∈ V′, where V′ is the (topological) dual space of V. Since v = Tu, the functional f(v) is defined for vectors u ∈ U too. That is,

f(v) ≡ ⟨f, v⟩ = ⟨f, Tu⟩ ≡ f(Tu) = g(u) ≡ ⟨g, u⟩,

where g(u) is a linear functional defined on U. The linear operator T is supposed to be bounded; hence, using the boundedness of f and of T, we get

|g(u)| = |⟨f, Tu⟩| ≤ ‖f‖ ‖Tu‖ ≤ ‖f‖ ‖T‖ ‖u‖ = K ‖u‖, where K = ‖f‖ ‖T‖,

so that g(u) = ⟨g, u⟩ is a bounded, i.e. continuous, linear functional on U. Hence g ∈ U′, where U′ is the (topological) dual space of U.

          T
     U -------> V

   g ∈ U′     f ∈ V′

          T*
     U′ <------- V′

Consequently, to each functional f ∈ V′ we have associated a corresponding functional g ∈ U′; that is, we have defined an operator which transforms the normed space V′ into the normed space U′. This operator is denoted T* and called the adjoint of T. That is, we can write T* f = g.

We saw the equality ⟨f, Tu⟩ = ⟨g, u⟩, and substituting T* f = g we get

⟨f, Tu⟩ = ⟨T* f, u⟩.

This equality can also be used as the definition of the adjoint operator.

If U = V = H is a Hilbert space, then the Riesz representation theorem guarantees the existence of a unique element g = T* f which satisfies the equation ⟨f, Tu⟩ = ⟨g, u⟩.

The operator T is called self-adjoint if T* = T.

E.8.3. For any matrix A, the adjoint of A is defined as the matrix A* which satisfies the equation

⟨Ax, y⟩ = ⟨x, A*y⟩

for all x, y ∈ Cⁿ, where ⟨u, v⟩ = ū^T v denotes the inner product of the complex Euclidean space.

The matrix A is self-adjoint if A* = A.

To find the adjoint matrix A* explicitly, note that (using the Euclidean inner product)

⟨Ax, y⟩ = conj(Ax)^T y = x̄^T Ā^T y = ⟨x, Ā^T y⟩,

so that A* = Ā^T, i.e. the adjoint of A is equal to the transpose of its conjugate.

If A is real symmetric, then A* = Ā^T = A^T = A, so that A is self-adjoint.
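The defining identity ⟨Ax, y⟩ = ⟨x, A*y⟩ with A* the conjugate transpose can be verified numerically; a small Python sketch using complex arithmetic (the helper names inner, matvec and conj_transpose are ours):

```python
def inner(u, v):
    """Complex Euclidean inner product <u, v> = conj(u)^T v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def matvec(m, x):
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]

def conj_transpose(m):
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]

A = [[1 + 2j, 3j], [4 + 0j, 5 - 1j]]
A_star = conj_transpose(A)
x, y = [1 + 1j, 2 + 0j], [3 + 0j, 1 - 2j]

lhs = inner(matvec(A, x), y)        # <Ax, y>
rhs = inner(x, matvec(A_star, y))   # <x, A*y>
print(lhs, rhs)                     # the two values agree
```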

E.8.4. The adjoint of a differential operator L is defined to be the operator L* for which

⟨Lu, v⟩ = ⟨u, L*v⟩

for all u in the domain of L and v in the domain of L*.

This definition determines not only the operational definition of L*, but its domain as well. Consider the example

Lu = du/dx

with the boundary condition u(0) = 2u(1) and with the inner product

⟨u, v⟩ = ∫_0^1 u(x) v(x) dx.

Using this inner product (and integration by parts) we have that

⟨Lu, v⟩ = ∫_0^1 (du/dx)(x) v(x) dx = u(1)(v(1) − 2v(0)) − ∫_0^1 u(x) (dv/dx)(x) dx.

In order to make ⟨Lu, v⟩ = ⟨u, L*v⟩, we must take L*v = −dv/dx with the boundary condition v(1) = 2v(0).
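The integration-by-parts computation can be checked numerically for concrete functions satisfying the two boundary conditions, e.g. u(x) = 2 − x (so u(0) = 2u(1)) and v(x) = x + 1 (so v(1) = 2v(0)); these sample functions are our choice, not from the text. The boundary term then vanishes and ⟨Lu, v⟩ = ⟨u, −v′⟩:

```python
def integral(f, a=0.0, b=1.0, n=100000):
    """Midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# u satisfies u(0) = 2 u(1); v satisfies v(1) = 2 v(0)
u, du = (lambda x: 2.0 - x), (lambda x: -1.0)
v, dv = (lambda x: x + 1.0), (lambda x: 1.0)

lhs = integral(lambda x: du(x) * v(x))     # <Lu, v>  = integral of u' v
rhs = integral(lambda x: u(x) * -dv(x))    # <u, L*v> = integral of u (-v')
print(lhs, rhs)   # both -1.5
```

Since u(1)(v(1) − 2v(0)) = 1·(2 − 2) = 0 for this pair, the boundary term drops out exactly, as the general formula predicts.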

The operator L is self-adjoint if L* = L. That means that not only do the operational definitions of L and L* agree, but their domains are equal too.

Another application of the Riesz representation theorem is the Lax-Milgram theorem, which is traditionally used to demonstrate the existence and uniqueness of weak solutions of boundary value problems.

Theorem 8.5. (Lax-Milgram theorem.) Given a Hilbert space H with the inner product ⟨u, v⟩, let a(u, v) be a bilinear functional in u, v ∈ H such that

(1) |a(u, v)| ≤ K ‖u‖ ‖v‖,

(2) a(u, u) ≥ γ ‖u‖²,

with positive constants K, γ that are independent of u, v.

Further, let l be a continuous linear functional on H, i.e. l ∈ H′. Then there exists a unique u ∈ H such that

l(v) = a(u, v), ∀ v ∈ H.

9. Sets of measure zero, measurable functions

Set of measure zero. A set A ⊂ [a,b] is said to be of measure zero if for every ε > 0 there is a sequence of open intervals I_k = (a_k, b_k) such that

Σ_{k=1}^∞ (b_k − a_k) ≤ ε and A ⊂ ∪_{k=1}^∞ I_k.

In other words, a set has measure zero if and only if it can be covered by a
collection of open intervals whose total length is arbitrarily small.

Theorem 9.1. Any countable set A = {x_k; k = 1, 2, ...} ⊂ [a,b] has measure zero.

Proof: To any ε > 0 we can choose the intervals I_k = (a_k, b_k) in the form

I_k = ( x_k − ε/2^{k+1}, x_k + ε/2^{k+1} ), k = 1, 2, ... .

Then

Σ_{k=1}^∞ (b_k − a_k) = Σ_{k=1}^∞ ε/2^k = ε (1/2)/(1 − 1/2) = ε,

and A = {x₁, ..., x_k, ...} ⊂ ∪_{k=1}^∞ I_k is valid too.
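The covering in the proof is easy to tabulate: the k-th interval has length ε/2^k, so the total length of any finite batch of them stays below ε. A minimal Python sketch (cover_lengths is our own name):

```python
def cover_lengths(eps, terms=60):
    """Lengths eps/2^k of the covering intervals I_k; their sum tends to eps."""
    return [eps / 2.0 ** k for k in range(1, terms + 1)]

eps = 0.001
total = sum(cover_lengths(eps))
print(total)   # ~ 0.001: the countable set is covered with total length <= eps
```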

E.9.1. A finite set A = {x₁, ..., x_n} ⊂ [a,b] has measure zero.

The intervals [a,b] and (a,b) do not have measure zero. (Their common measure is their length: b − a.)

Almost everywhere property. A property P = P(x) on an interval [a,b] is said to hold almost everywhere on [a,b] if P fails to hold only on subsets of [a,b] of measure zero.

Measurable function. A (real-valued) function f(x) defined on [a,b] is said to be measurable on [a,b] if there is a sequence of continuous functions {g_n(x) : a ≤ x ≤ b} which converges (pointwise) to f(x) almost everywhere on [a,b].

In other terms, f(x) = lim_{n→∞} g_n(x) (x ∈ [a,b] \ A), where A is a set of measure zero. (The set A depends on f(x).)

The set of measurable functions is a linear space.

51
We note that in practice all functions one meets are measurable.

The previous results for intervals can be extended to higher-dimensional domains as well.

10. The space L 2

Let Ω denote a bounded domain (connected open set) in the n-dimensional Euclidean space.

Square integrable function. A measurable (real-valued) function f(x), x ∈ Ω, is said to be square integrable¹ if

∫_Ω f² dx < ∞.

The set L₂(Ω). The collection of all square integrable measurable (real-valued) functions defined on Ω is denoted L₂(Ω).

L₂(Ω) is a very large set. It contains all continuous functions on Ω and all bounded piecewise continuous functions on Ω. The unbounded f(x) = x^{−1/3} is also an element of L₂(0,1) because ∫_0^1 (x^{−1/3})² dx = 3. However, the unbounded function f(x) = x^{−1/2} does not belong to L₂(0,1) since ∫_0^1 (x^{−1/2})² dx = +∞.

We note that using Mathematica we get:

In[1]:= Integrate[(x^(-1/3))^2, {x, 0, 1}]

Out[1]= 3

but

In[2]:= Integrate[(x^(-1/2))^2, {x, 0, 1}]

Integrate::idiv : Integral of 1/x does not converge on {0, 1}. More…

Out[2]= Integrate[1/x, {x, 0, 1}]

The sum of square integrable functions is a square integrable function as well. Indeed,

0 ≤ (|f| − |g|)² = f² − 2|f||g| + g² and |f g| ≤ |f| |g|,

¹ in the sense of Lebesgue integration; Henri Lebesgue (1875-1941)

so that

|f g| ≤ |f| |g| ≤ (1/2)(f² + g²).

This inequality indicates that

(f + g)² = f² + 2fg + g² ≤ 2(f² + g²),

and after integration the assertion follows. Since the product of a square integrable function with a real number is a square integrable function, we have proved that L₂(Ω) is a linear vector space.

By integrating the last-but-one inequality it follows that

∫_Ω f g dx is finite,

and hence there exists the inner product

⟨f, g⟩ = ∫_Ω f g dx, f, g ∈ L₂(Ω).

In other words, the linear space of square integrable measurable functions L₂(Ω) is a pre-Hilbert space. But any inner product space is a normed space, where the norm is defined using the inner product:

‖f‖ = √⟨f, f⟩ = ( ∫_Ω f² dx )^{1/2}, f ∈ L₂(Ω).

If the integrals in this section are defined in the Lebesgue sense, then this real inner product space is also complete.

Theorem 10.1. The space L2 ( ) with respect to the inner product above is a
(real) Hilbert space.

In contrast with the linear space C^(0)(Ω̄) of continuous functions on a closed domain Ω̄, in the space L₂(Ω) expressions such as the fourth axiom of the inner product or the first axiom of the norm need a more detailed commentary:

In C^(0)(Ω̄) the relation

∫_Ω f² dx = 0 (10-1)

indicates f ≡ 0 on the closed domain Ω̄ and so on the open domain Ω too¹.

¹ Remember that C^(0)(Ω̄) ⊂ C^(0)(Ω), because if f ∈ C^(0)(Ω̄) then f ∈ C^(0)(Ω).
54
In L₂(Ω), (10-1) does not imply f = 0 everywhere in Ω.

Consider the function

f₁(x) = 1 if x = 1/2, and f₁(x) = 0 otherwise,

defined on Ω = (0,1). Obviously f₁ is square integrable on Ω and satisfies the equality (10-1). But this equality is also satisfied, e.g., by the function f₂ for which f₂(x) = 0 holds on Ω = (0,1) except on the countable set {x_k = 1/k; k = 2, 3, ...}.

In general, the equality (10-1) is satisfied by all functions for which f(x) = 0 on Ω = (0,1) except on a set of measure zero, that is, for which f(x) = 0 almost everywhere on Ω. At the points where f(x) ≠ 0 the values of f may be arbitrary, or f may be undefined.

Equivalent functions. Two square integrable functions f and g are said to be equivalent if f(x) = g(x) almost everywhere on Ω. Then we also write

f = g in L₂(Ω),

and then obviously ∫_Ω (f − g)² dx = 0.

E.10.1. The Dirichlet function

f(x) = 1 if x is rational, f(x) = 0 if x is irrational; x ∈ [0,1]

is equal to zero almost everywhere on [0,1], because the rational numbers are countable.

The Riemann integral of this function does not exist; the Lebesgue integral is equal to zero.

The proof of the following theorem is based on deep results from the theory of measurable functions and the Lebesgue integral.

Theorem 10.2. Continuous functions are dense in L2 ( ) .

As already mentioned in example E.7.1, any continuous function on a finite interval [a,b] is the limit of a uniformly convergent, and hence all the more in the mean (in an integral norm) convergent, sequence of polynomials. This fact and Theorem 10.2 imply:

55
Theorem 10.3. The polynomials are dense in L2 (a , b ) .

Every polynomial is a linear combination of the power functions 1, t, t², ...; hence the sequence {t^k; k = 0, 1, 2, ...} spans the space L₂(a,b), and so, by Theorem 7.3, L₂(a,b) is separable.

More generally:

Theorem 10.4. L2 ( ) is separable.

11. Generalized derivatives, distributions, Sobolev spaces

Multi-index notation. The ordered n-tuple of non-negative integers

i = (i₁, i₂, ..., i_n)

is called a multi-index. The sum i₁ + i₂ + ... + i_n is denoted by |i|.

Using the multi-index notation, the partial derivatives of the function f = f(x₁, x₂, ..., x_n) can be expressed in a shorter, so-called operator form:

D^i f = ∂^{|i|} f / (∂x₁^{i₁} ∂x₂^{i₂} ... ∂x_n^{i_n}).

3f 3 f
E.11.1. If i = (3,0) , then instead of Di f = we write D i
f = .
x13 x20 x13

If i = 0 , then Di f = f .

If n = 2, k = 3, then |i| ≤ k represents the following multi-indices i = (i₁, i₂):

    (0, 0)                           |i| = 0
    (1, 0), (0, 1)                   |i| = 1
    (2, 0), (1, 1), (0, 2)           |i| = 2
    (3, 0), (2, 1), (1, 2), (0, 3)   |i| = 3

With this notation, for instance instead of

    f + ∂f/∂x₁ + ∂f/∂x₂ + ∂²f/∂x₁² + ∂²f/∂x₁∂x₂ + ∂²f/∂x₂² + ∂³f/∂x₁³ + ∂³f/∂x₁²∂x₂ + ∂³f/∂x₁∂x₂² + ∂³f/∂x₂³

(each term evaluated at (x₁, x₂)) we can use the short operator form Σ_(|i|≤k) Dⁱf.

Smooth function. The function φ: ℝⁿ → ℝ is said to be smooth if it is infinitely many times continuously differentiable.

Support of a function. Let Ω be a domain (i.e. a connected open subset) in ℝⁿ. The support of the function φ(x), x ∈ ℝⁿ, denoted by supp φ, is the closure of the set of those points x at which φ(x) ≠ 0. That is,

    supp φ = closure{ x : x ∈ Ω, φ(x) ≠ 0 }.

Compact support. The function φ(x), x ∈ ℝⁿ is said to have compact support with respect to Ω if supp φ is a compact set (i.e., if it is bounded) and is a proper subset of Ω (i.e. if supp φ ⊂ Ω).

Since supp φ is a closed set by definition, and since it lies, by assumption, in an open domain Ω, it has a positive distance from the boundary ∂Ω of this domain.

Test function. Let Ω be a domain in ℝⁿ. Denote by C₀^(∞)(Ω) the set of all smooth functions with compact support in Ω. That is,

    C₀^(∞)(Ω) = { φ ∈ C^(∞)(Ω) : supp φ ⊂ Ω }.

Any φ ∈ C₀^(∞)(Ω) is called a test function.

E.11.2. In ℝ¹ the function

    φ(x) = e^(1/(x²−1)) if |x| < 1,  φ(x) = 0 if |x| ≥ 1

is an example of a smooth function with compact support in Ω = (−2, 2). But φ(x) is not a test function with respect to Ω = (−1, 1), because supp φ = { −1 ≤ x ≤ 1 } is not a proper subset of the domain Ω = { −1 < x < 1 }.

Using Mathematica we can easily plot this function:

In[1]:= Clear[φ]
In[2]:= φ[x_] := Piecewise[{{0, x <= -1}, {E^(1/(x^2 - 1)), -1 < x < 1}, {0, x >= 1}}]
In[3]:= Plot[φ[x], {x, -2, 2}, PlotStyle -> {Thickness[0.008]}]

(The plot shows the bump function: zero outside [−1, 1], with maximum value e^(−1) ≈ 0.368 at x = 0.)

This function has limits at the points x = −1, x = 1 which are equal to zero. Indeed, e.g. at the point x = 1 the limit from the left

In[4]:= Limit[φ[x], x -> 1, Direction -> 1]
Out[4]= 0

and obviously the limit from the right

In[5]:= Limit[φ[x], x -> 1, Direction -> -1]
Out[5]= 0

(The option Direction -> -1 and +1 denotes the right- and left-hand limit, respectively.) At the points x = −1, x = 1 all derivatives of φ(x) exist and they are also equal to zero. Consequently φ(x) is smooth too.

For example:

In[6]:= D[φ[x], {x, 3}] // Simplify  (* the third derivative of φ *)
Out[6]= -((4 E^(1/(-1 + x^2)) x (3 - 10 x^2 + 3 x^4 + 6 x^6))/(-1 + x^2)^6)  for -1 < x < 1,  0 otherwise

In[7]:= Plot[%, {x, -2, 2}, PlotStyle -> {Thickness[0.008]}]

(The plot of the third derivative oscillates inside (−1, 1), with values of order ±150, and vanishes outside.)
Out[7]= Graphics

Suppose that the function f = f(x₁, x₂, ..., xₙ) is locally integrable in its domain Ω. That is, the Lebesgue integral of f over every compact subdomain of Ω is finite.

Generalized derivative. We say that Dⁱf is the i-th generalized (or distributional, or weak) derivative of the function f in the domain Ω, if

    ∫_Ω Dⁱf φ dx = (−1)^|i| ∫_Ω f Dⁱφ dx,    (11-1)

where φ ∈ C₀^(∞)(Ω), that is, φ is an arbitrary test function.

It is easy to see that if a function is differentiable, then its generalized derivative coincides with the classical one. For example, the integration by parts in Ω = (0,1), using the test-function property φ(0) = φ(1) = 0, leads to

    ∫₀¹ xⁿ (dφ/dx) dx = [xⁿ φ(x)]₀¹ − n ∫₀¹ x^(n−1) φ(x) dx = (−1)¹ ∫₀¹ D⁽¹⁾(xⁿ) φ dx,

so that D⁽¹⁾(xⁿ) = n x^(n−1).

E.11.3. Consider the function

    f(x) = c x        for 0 < x ≤ 1/2,
    f(x) = c (1 − x)  for 1/2 ≤ x < 1,

where c is a constant. This function is not differentiable at x = 1/2 (see Figure 11.1).

The piecewise derivative of f(x) on the interval (0,1) coincides with its generalized derivative

    D⁽¹⁾f = c   if 0 < x < 1/2,
    D⁽¹⁾f = −c  if 1/2 < x < 1.

Figure 11.1. Functions non-differentiable at x = 1/2: the hat function f(x), with peak value c/2 at x = 1/2, and its derivative D⁽¹⁾f, jumping from c to −c.

Indeed, integration by parts yields

    ∫₀¹ f D⁽¹⁾φ dx = ∫₀^(1/2) c x (dφ/dx) dx + ∫_(1/2)^1 c (1 − x) (dφ/dx) dx
    = [c x φ(x)]₀^(1/2) − ∫₀^(1/2) c φ(x) dx + [c (1 − x) φ(x)]_(1/2)^1 − ∫_(1/2)^1 (−c) φ(x) dx
    = (c/2) φ(1/2) − ∫₀^(1/2) c φ(x) dx − (c/2) φ(1/2) − ∫_(1/2)^1 (−c) φ(x) dx
    = (−1)¹ ∫₀¹ D⁽¹⁾f φ dx.
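The integration-by-parts identity above is easy to check numerically. The following Python sketch (the text's own computations use Mathematica; this is only an illustrative cross-check) compares ∫₀¹ f φ′ dx with −∫₀¹ (D⁽¹⁾f) φ dx for the hat function with c = 1, using the hypothetical stand-in φ(x) = x²(1 − x)², which vanishes at 0 and 1, which is all the integration by parts uses here.

```python
# Illustrative cross-check of the weak-derivative identity
#     integral f*phi' dx = -integral (Df)*phi dx
# for the hat function of E.11.3 with c = 1.
def f(x):            # hat function, c = 1
    return x if x <= 0.5 else 1.0 - x

def Df(x):           # its piecewise (= generalized) derivative
    return 1.0 if x < 0.5 else -1.0

def phi(x):          # stand-in for a test function: vanishes at 0 and 1
    return x ** 2 * (1 - x) ** 2

def dphi(x):
    return 2 * x * (1 - x) ** 2 - 2 * x ** 2 * (1 - x)

def integrate(g, a, b, n=20000):   # midpoint rule
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

lhs = integrate(lambda x: f(x) * dphi(x), 0.0, 1.0)
rhs = -integrate(lambda x: Df(x) * phi(x), 0.0, 1.0)
print(abs(lhs - rhs) < 1e-8)   # the two sides agree
```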

E.11.4. Now consider the second generalized derivative D⁽²⁾f of the function f(x) defined in (0,1) in the previous example. The relation (11-1), as a definition, gives

    ∫₀¹ D⁽²⁾f φ dx = (−1)² ∫₀¹ f D⁽²⁾φ dx = ∫₀^(1/2) c x (d²φ/dx²) dx + ∫_(1/2)^1 c (1 − x) (d²φ/dx²) dx
    = [c x φ′(x)]₀^(1/2) − ∫₀^(1/2) c φ′(x) dx + [c (1 − x) φ′(x)]_(1/2)^1 + ∫_(1/2)^1 c φ′(x) dx
    = (c/2) φ′(1/2) − [c φ(x)]₀^(1/2) − (c/2) φ′(1/2) + [c φ(x)]_(1/2)^1 = −2c φ(1/2),

since φ(x) is a test function on the interval (0,1), that is, φ(0) = φ(1) = 0.

But no integrable function D⁽²⁾f exists which satisfies the definition (11-1), that is, the relationship

    ∫₀¹ D⁽²⁾f φ(x) dx = −2c φ(1/2).

If we introduce the symbolic notation

    ∫₀¹ δ(x − 1/2) φ(x) dx = φ(1/2),

where δ(x − 1/2) is not a function but a so-called delta-distribution, then the second generalized derivative of the f(x) given in the previous example is

    D⁽²⁾f = −2c δ(x − 1/2).

In applications it is often the case that the delta "function" is symbolically defined by

    δ(x − ξ) = 0 if x ≠ ξ  and  ∫_(−∞)^(∞) δ(x − ξ) dx = 1.

However, for any reasonable definition of integration there can be no such function. How is it, then, that many practicing scientists have used this definition with impunity for years?

To make mathematical sense of the delta "function" there are, traditionally, two ways: the first through delta sequences and the second through the theory of distributions.

The idea of delta sequences is to realize that, although a delta function satisfying

    ∫ δ(x − ξ) φ(x) dx = φ(ξ)

cannot exist, we might be able to find a sequence of functions s_n(x − ξ) which in the limit n → ∞ satisfies the defining equation

    lim_(n→∞) ∫ s_n(x − ξ) φ(x) dx = φ(ξ)

for all continuous functions φ(x). Notice that this definition of s_n in no way implies that lim_(n→∞) s_n(x − ξ) = δ(x − ξ) exists. In fact, we are certainly not allowed to interchange the limit process with integration. (We note that one criterion that allows the interchange of limit and integration is given by the Lebesgue theorem B.4 in Appendix B.)

There are many suggestive examples of delta sequences. The choice

    s_n(x − ξ) = n  if ξ − 1/(2n) ≤ x ≤ ξ + 1/(2n),
    s_n(x − ξ) = 0  if |x − ξ| > 1/(2n)

(see Figure 11.2) is a delta sequence, since

    lim_(n→∞) ∫ s_n(x − ξ) φ(x) dx = lim_(n→∞) n ∫_(ξ−1/(2n))^(ξ+1/(2n)) φ(x) dx = φ(ξ)

because of the mean value theorem¹, provided φ(x) is continuous near x = ξ.

Figure 11.2. Delta sequence: a rectangular pulse of height n on the interval [ξ − 1/(2n), ξ + 1/(2n)].
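The defining property of this delta sequence can also be observed numerically. In the Python sketch below (illustrative only; ξ = 0.3 and φ = cos are arbitrary choices, not from the text), the action of s_n on φ is evaluated in closed form and tends to φ(ξ).

```python
import math

# Box delta sequence s_n(x - xi) = n on [xi - 1/(2n), xi + 1/(2n)] acting on
# phi(x) = cos x.  The action has the closed form
#     n * (sin(xi + 1/(2n)) - sin(xi - 1/(2n))),
# which tends to phi(xi) = cos(xi) as n grows.
xi = 0.3

def action(n):
    return n * (math.sin(xi + 1 / (2 * n)) - math.sin(xi - 1 / (2 * n)))

errs = [abs(action(n) - math.cos(xi)) for n in (1, 10, 100, 1000)]
print(errs)   # the error shrinks (like 1/n^2) toward zero
assert all(e1 > e2 for e1, e2 in zip(errs, errs[1:]))
assert errs[-1] < 1e-7
```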

The application of delta sequences in practice is very cumbersome. For example, if we want to solve the boundary-value problem

    d²u/dx² = δ(x − 1/2),  0 ≤ x ≤ 1,
    u(0) = u(1) = 0

(establishing the shape of a cable under a concentrated force), then this problem can be approximated by the sequence of boundary-value problems

    d²u_n/dx² = s_n(x − 1/2),  0 ≤ x ≤ 1,
    u_n(0) = u_n(1) = 0.

Then, taking the limit

    lim_(n→∞) u_n(x) = u(x),

we can produce the correct solution of the examined boundary-value problem.

In more detail: if we solve the previous differential equation separately on the intervals

    [0, 1/2 − 1/(2n)),  [1/2 − 1/(2n), 1/2 + 1/(2n)],  (1/2 + 1/(2n), 1]

¹ There is a point η ∈ [a, b] such that ∫_a^b φ(x) dx = (b − a) φ(η).

with respect to u_n(0) = u_n(1) = 0, then we have

    u_n(x) = a x                  if 0 ≤ x < 1/2 − 1/(2n),
    u_n(x) = (n/2) x² + b x + c   if 1/2 − 1/(2n) ≤ x ≤ 1/2 + 1/(2n),
    u_n(x) = d (x − 1)            if 1/2 + 1/(2n) < x ≤ 1.

We can compute the coefficients a, b, c, d from the conditions of continuity of the values and the first derivatives of the function at x = 1/2 − 1/(2n) and x = 1/2 + 1/(2n).

The solution is the function

    u_n(x) = −x/2                                     if 0 ≤ x < 1/2 − 1/(2n),
    u_n(x) = (n/2) [ x(x − 1) + (1/2 − 1/(2n))² ]     if 1/2 − 1/(2n) ≤ x ≤ 1/2 + 1/(2n),
    u_n(x) = (x − 1)/2                                if 1/2 + 1/(2n) < x ≤ 1,

which depends on n, is once continuously differentiable in the interval [0,1] and satisfies the boundary conditions u_n(0) = u_n(1) = 0.

Using Mathematica it is easy to plot the function u_n(x) and its first derivative with respect to x as follows:

In[1]:= u[n_, x_] := Piecewise[{
          {-x/2, 0 <= x < 1/2 - 1/(2 n)},
          {n/2 (x (x - 1) + (1/2 - 1/(2 n))^2), 1/2 - 1/(2 n) <= x <= 1/2 + 1/(2 n)},
          {(x - 1)/2, 1/2 + 1/(2 n) < x <= 1}}]
In[2]:= v[x_] := u[5, x]
In[3]:= Plot[v[x], {x, 0, 1}, PlotStyle -> {Thickness[0.008]}]

(The plot of v = u₅ is a V-shaped curve with a rounded bottom near −0.225 at x = 1/2.)
Out[3]= Graphics

In[4]:= Derivative[1][v]
Out[4]= Piecewise[{{-(1/2), 0 < #1 < 2/5}, {5/2 (-1 + 2 #1), 2/5 < #1 < 3/5}, {1/2, 3/5 < #1 < 1}, {0, True}}] &

In[5]:= Plot[%[x], {x, 0, 1}, PlotStyle -> {Thickness[0.008]}]

(The derivative equals −1/2 up to x = 2/5, rises linearly through the middle interval, and equals 1/2 from x = 3/5 on.)
Out[5]= Graphics

If n → ∞, then u_n → u pointwise, and hence in integral norm too, where

    u(x) = −x/2       if 0 ≤ x ≤ 1/2,
    u(x) = (x − 1)/2  if 1/2 ≤ x ≤ 1

(at the point x = 1/2 we get u_n(1/2) = −1/4 + 1/(8n), so that if n → ∞, then u_n(1/2) → u(1/2) = −1/4).
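The formulas for u_n and u can be cross-checked numerically; the following Python sketch (illustrative, mirroring the Mathematica computation above) verifies u_n(1/2) = −1/4 + 1/(8n) and the pointwise convergence u_n → u.

```python
# Delta-sequence approximations u_n for u'' = delta(x - 1/2), u(0) = u(1) = 0.
def u_n(n, x):
    x1, x2 = 0.5 - 1 / (2 * n), 0.5 + 1 / (2 * n)
    if x < x1:
        return -x / 2
    if x <= x2:
        return n / 2 * (x * (x - 1) + (0.5 - 1 / (2 * n)) ** 2)
    return (x - 1) / 2

def u(x):   # the limit function
    return -x / 2 if x <= 0.5 else (x - 1) / 2

# midpoint value u_n(1/2) = -1/4 + 1/(8n), tending to u(1/2) = -1/4
for n in (5, 50, 500):
    assert abs(u_n(n, 0.5) - (-0.25 + 1 / (8 * n))) < 1e-9

# pointwise convergence, sampled on a grid
errs = [max(abs(u_n(n, k / 100) - u(k / 100)) for k in range(101))
        for n in (5, 50, 500)]
print(errs)   # maximal sampled deviation shrinks like 1/(8n)
assert errs[0] > errs[1] > errs[2] > 0
```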

However, there is a further significant difficulty with this approach: we not only need to find u_n(x), but also to show that the limit exists independently of the particular choice of delta sequence.

We note that using Mathematica this boundary-value problem can be solved easily as follows:

In[1]:= Clear[x, u]
In[2]:= DSolve[{u''[x] == DiracDelta[x - 1/2], u[0] == 0, u[1] == 0}, u[x], x]
Out[2]= {{u[x] -> 1/2 (-x - UnitStep[-1 + 2 x] + 2 x UnitStep[-1 + 2 x])}}

In[3]:= Plot[Evaluate[u[x] /. %], {x, 0, 1}, PlotStyle -> {Thickness[0.008]}]

(The plot is the piecewise-linear "hanging cable" with minimum −1/4 at x = 1/2.)
Out[3]= Graphics

A much more useful way to discuss delta "functions" is through the theory of distributions. As we shall see, distributions provide a generalization of functions and inner products.

Distributions.¹

Convergence in the linear space C₀^(∞)(Ω). In our previous discussions, convergence was defined using a norm. In C₀^(∞)(Ω) = { φ ∈ C^(∞)(Ω) : supp φ ⊂ Ω }, that is, in the set of all smooth functions with compact support in Ω, no norm suitable for our purposes exists. Therefore it is advantageous to introduce the following definition:

The sequence {φ_n} of test functions is said to be convergent in C₀^(∞)(Ω), with limit the test function φ, if there is a bounded set Ω′ ⊂ Ω containing the supports of φ, φ₁, φ₂, ..., and if the sequence {φ_n} and all its generalized derivatives {Dⁱφ_n}, |i| > 0, converge uniformly to φ and its generalized derivatives Dⁱφ, respectively. That is, if

    lim_(n→∞) max_x |Dⁱφ_n − Dⁱφ| = 0

for every |i| = 0, 1, 2, ... .

¹ The concept of distributions was first used by Laurent Schwartz in 1944.

A linear functional f: C₀^(∞)(Ω) → ℝ is said to be continuous (in C₀^(∞)(Ω)) if f maps every convergent sequence in C₀^(∞)(Ω) into a convergent sequence in ℝ, i.e. if ⟨f, φ_n⟩ → ⟨f, φ⟩ whenever φ_n → φ in C₀^(∞)(Ω).¹

Distribution. A continuous linear functional on C₀^(∞)(Ω) is called a distribution or generalized function.

E.11.5. An example of a distribution is the delta-distribution δ, defined by

    ⟨δ(x − ξ), φ(x)⟩ = φ(ξ)  for all φ ∈ C₀^(∞)(Ω).

The notation ⟨f, φ⟩ is used to denote the "action" of the distribution f on the test function φ.

The linearity of distributions means that the operations of addition and multiplication by a scalar are defined as follows: if f and g are distributions (i.e. linear functionals on C₀^(∞)(Ω)) and α and β are scalars, we define the distribution αf + βg to be the functional ⟨αf + βg, φ⟩ = α⟨f, φ⟩ + β⟨g, φ⟩ for all φ ∈ C₀^(∞)(Ω).

The notation of distributions looks exactly like an inner product between functions f and φ, and this similarity is intentional although misleading, since ⟨f, φ⟩ need not be representable as a true inner product.

The simplest examples of linear functionals are indeed inner products. Suppose f(x) is locally integrable, that is, the Lebesgue integral ∫_I f(x) dx is defined and bounded for every finite interval I. Then the inner product

    ⟨f, φ⟩ = ∫_(−∞)^(∞) f(x) φ(x) dx

is a linear functional if φ is in C₀^(∞)(−∞, ∞).

¹ The dual-pair notation ⟨f, φ⟩ ≡ f(φ) is used.

Every locally integrable function f induces a (so-called regular) distribution through the usual inner product. (Two locally integrable functions which are the same almost everywhere induce the same distribution.)

E.11.6. One important distribution is the Heaviside-distribution

    ⟨H, φ⟩ = ∫₀^∞ φ(x) dx,

which is equivalent to the inner product of φ with the well-known Heaviside (unit step) function

    H(x) = 1 if x ≥ 0,  H(x) = 0 if x < 0.

For any function f that is locally integrable we can interchangeably refer to its function values f(x) or to its distributional values (or action)

    ⟨f, φ⟩ = ∫ f(x) φ(x) dx.

That is, either x → f(x), or φ → ⟨f, φ⟩.

There are numerous distributions which are not representable as an inner product. The example we have already seen is the (so-called singular) distribution ⟨δ, φ⟩ = φ(ξ). The situation is similar to that of the differential operator d/dx, which cannot be evaluated at the point x = 2, for example; df/dx can be evaluated pointwise only after the operator has first acted on a differentiable function f(x). Likewise, the distribution δ can be evaluated only after φ is known.

Sobolev spaces.

The set C⁽ᵏ⁾(Ω) (the linear space of k-times continuously differentiable functions on Ω), with respect to the inner product

    ⟨u, v⟩_k = Σ_(|i|≤k) ∫_Ω Dⁱu Dⁱv dx    (11-2)

is a pre-Hilbert space with the norm

    ‖u‖_k = ⟨u, u⟩_k^(1/2) = ( Σ_(|i|≤k) ∫_Ω (Dⁱu)² dx )^(1/2)
induced by (11-2).

Let V₂⁽ᵏ⁾(Ω) denote this incomplete inner product space. The space V₂⁽ᵏ⁾(Ω) can be completed by adding the limit points of all Cauchy sequences in V₂⁽ᵏ⁾(Ω). These limit points are distributions.

The set D of all distributions generated by Cauchy sequences in V₂⁽ᵏ⁾(Ω) is itself a linear space: if the distributions f, g ∈ D are generated by {φ_n} and {ψ_n} respectively, then the distribution αf + βg is generated by the Cauchy sequence {αφ_n + βψ_n}. When the space D is endowed with the inner product

    ⟨f, g⟩_k = lim_(n→∞) ⟨φ_n, ψ_n⟩_k    (11-3)

where ⟨·, ·⟩_k is defined by (11-2), the resulting inner product space is the Sobolev space W₂⁽ᵏ⁾(Ω).

It is also usual to define the Sobolev spaces as follows:

The Sobolev space W₂⁽ᵏ⁾(Ω) = H⁽ᵏ⁾(Ω) is the set of those real-valued functions which, together with their generalized derivatives up to and including the order k, exist and are square-integrable in the Lebesgue sense in Ω.

The space W₂⁽ᵏ⁾(Ω) contains, besides the continuous functions, e.g. those functions which are (k−1)-times continuously differentiable on Ω and whose derivative of order k is piecewise continuous in Ω.

The function f sketched in Figure 11.1 belongs to the Sobolev space W₂⁽¹⁾(0,1). Indeed, its generalized derivative D⁽¹⁾f exists and, together with the function f, is square-integrable in (0,1). The function f does not belong to W₂⁽²⁾(0,1), because its generalized derivative D⁽²⁾f exists only as a distribution and is not a square-integrable function.

Theorem 11.1. The Sobolev space W₂⁽ᵏ⁾(Ω) = H⁽ᵏ⁾(Ω) with the inner product (11-3) is a real Hilbert space.
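The membership f ∈ W₂⁽¹⁾(0,1) claimed for the hat function amounts to the finiteness of ‖f‖₁² = ∫ f² dx + ∫ (D⁽¹⁾f)² dx. For c = 1 this can be evaluated exactly; the small Python sketch below (illustrative, not part of the text) does the piecewise integration in rational arithmetic.

```python
from fractions import Fraction

# Sobolev norm of the hat function of Figure 11.1 with c = 1:
# f = x on (0, 1/2), f = 1 - x on (1/2, 1), and D(1)f = +-1.
int_f2 = 2 * Fraction(1, 3) * Fraction(1, 2) ** 3   # 2 * integral_0^{1/2} x^2 dx = 1/12
int_df2 = Fraction(1)                               # (D(1)f)^2 = 1 almost everywhere
norm_sq = int_f2 + int_df2
print(norm_sq)   # 13/12, finite, so f lies in W2^(1)(0,1)
assert norm_sq == Fraction(13, 12)
```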

12. Weak (or generalized) solutions

Consider the model boundary-value problem

    −d²u/dx² = f,  a ≤ x ≤ b,    (12-1)
    u(a) = u(b) = 0.

If f ∈ C⁽⁰⁾(a, b), then the classical solution u (if it exists) belongs to C⁽²⁾(a, b) ∩ C⁽⁰⁾[a, b] and u(a) = u(b) = 0. Multiplying (12-1) by an arbitrary test function φ (recall that φ(x) is smooth with compact support in (a, b), so that φ(a) = φ(b) = 0) and integrating the result, we obtain

    −∫_a^b (d²u/dx²) φ dx = ∫_a^b f φ dx.    (12-2)

Now an integration by parts yields

    ∫_a^b (du/dx)(dφ/dx) dx = ∫_a^b f φ dx,  φ ∈ C₀^(∞)(a, b).    (12-3)

If f ∉ C⁽⁰⁾(a, b), then equation (12-1) does not have a classical solution (because then u ∉ C⁽²⁾(a, b)). For such a case it is possible to weaken, i.e. to generalize, the concept of the solution. Note that for f ∈ L₂(a, b) the equation (12-3) makes sense if du/dx ∈ L₂(a, b). This requirement is satisfied if

    u ∈ W̊₂⁽¹⁾(a, b) = { u ∈ W₂⁽¹⁾(a, b) : u(a) = u(b) = 0 },

where the ring symbol refers to the homogeneous boundary conditions. Then the definition of the Sobolev space W₂⁽¹⁾(a, b) implies that the derivatives are considered in the generalized sense and that du/dx ∈ L₂(a, b).

The function u ∈ W̊₂⁽¹⁾(a, b) is called the weak or generalized solution of equation (12-1) if, for f ∈ L₂(a, b), u satisfies equation (12-3). In other terms, a generalized solution of equation (12-1) is a distribution u ∈ W̊₂⁽¹⁾(a, b) such that

equation (12-2), or equivalently equation (12-3), is satisfied for all φ ∈ C₀^(∞)(a, b) and a given distribution f in L₂(a, b).

We note that C₀^(∞)(a, b) is a dense subspace of W̊₂⁽¹⁾(a, b) (because W̊₂⁽¹⁾(a, b) is the closure (extension) of C₀^(∞)(a, b)). This property guarantees that if u is a classical solution of equation (12-1), then it is also a solution of the weak formulation (12-3). Indeed, the integration by parts on the left-hand side of (12-3) yields

    ∫_a^b (d²u/dx² + f) φ dx = 0,  φ ∈ C₀^(∞)(a, b).

By the lemma¹ known from the calculus of variations we must have

    d²u/dx² + f = θ  in L₂(a, b),

and the continuity of f implies that the differential equation (12-1) is satisfied.

The equation (12-3) can be expressed in the form

    ⟨D⁽¹⁾(u), D⁽¹⁾(φ)⟩ = ⟨f, φ⟩,  φ ∈ C₀^(∞)(a, b),

where ⟨·, ·⟩ is the inner product in the space L₂(a, b). This equation makes sense when φ is any element of W̊₂⁽¹⁾(a, b). Since C₀^(∞)(a, b) is a dense subspace of W̊₂⁽¹⁾(a, b), it follows that this equation is equivalent to

    ⟨D⁽¹⁾(u), D⁽¹⁾(v)⟩ = ⟨f, v⟩,  v ∈ W̊₂⁽¹⁾(a, b).

Denote H = W̊₂⁽¹⁾(a, b), and let a⟨·, ·⟩ : H × H → ℝ be the bilinear functional defined by

    a⟨u, v⟩ = ⟨D⁽¹⁾(u), D⁽¹⁾(v)⟩  for u, v ∈ H,

and let l : H → ℝ be the continuous linear functional on H (i.e. l ∈ H′) defined by

    l(v) = ⟨f, v⟩  for v ∈ H.

¹ Lemma: If u ∈ H is orthogonal to all elements v of a set S which is dense in a pre-Hilbert space H, then u is the null-element of H.

Then the weak formulation of the boundary-value problem is to find u ∈ H such that

    a⟨u, v⟩ = l(v),  ∀ v ∈ H.    (12-4)

The solvability of the problem (12-4) is given by the Lax-Milgram theorem (see Theorem 8.5), which is a generalization of the Riesz representation theorem.
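The weak formulation (12-4) is exactly what the finite element method discretizes. The following Python sketch is illustrative only (it is not part of the text): it applies the Galerkin idea with piecewise-linear hat functions to −u″ = f on (0,1) with the assumed right-hand side f ≡ 1, whose exact solution is u(x) = x(1 − x)/2.

```python
# Minimal Galerkin / finite-element sketch of (12-4) for -u'' = f on (0,1),
# u(0) = u(1) = 0, with hat basis functions: a<u,v> = int u'v', l(v) = int f v.
def solve_fem(n, f):
    """Solve a<u,v> = l(v) on a uniform mesh with n elements."""
    h = 1.0 / n
    # stiffness matrix of the hat functions: (1/h) * tridiag(-1, 2, -1);
    # load vector l(phi_i) ~ h * f(x_i) (exact for constant f)
    a = [[0.0] * (n - 1) for _ in range(n - 1)]
    b = [h * f((i + 1) * h) for i in range(n - 1)]
    for i in range(n - 1):
        a[i][i] = 2.0 / h
        if i + 1 < n - 1:
            a[i][i + 1] = a[i + 1][i] = -1.0 / h
    # forward elimination for the tridiagonal SPD system (Thomas algorithm)
    for i in range(1, n - 1):
        m = a[i][i - 1] / a[i - 1][i - 1]
        a[i][i] -= m * a[i - 1][i]
        b[i] -= m * b[i - 1]
    u = [0.0] * (n - 1)
    for i in range(n - 2, -1, -1):
        rhs = b[i] - (a[i][i + 1] * u[i + 1] if i + 1 < n - 1 else 0.0)
        u[i] = rhs / a[i][i]
    return u

n = 10
uh = solve_fem(n, lambda x: 1.0)
exact = [x * (1 - x) / 2 for x in ((i + 1) / n for i in range(n - 1))]
err = max(abs(p - q) for p, q in zip(uh, exact))
print(err)   # tiny: linear elements are nodally exact for this 1D problem
```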

Weak convergence. A sequence of distributions {f_n} is said to converge to the distribution f if their actions converge in ℝ, that is, if

    ⟨f_n, φ⟩ → ⟨f, φ⟩  for all φ ∈ C₀^(∞)(Ω).

This convergence is called convergence in the sense of distributions or weak convergence.

If the sequence of distributions f_n converges to f, then the sequence of derivatives D⁽¹⁾f_n converges to D⁽¹⁾f. This follows since

    ⟨D⁽¹⁾f_n, φ⟩ = −⟨f_n, D⁽¹⁾φ⟩ → −⟨f, D⁽¹⁾φ⟩ = ⟨D⁽¹⁾f, φ⟩

for all φ ∈ C₀^(∞)(Ω).

E.12.1. The sequence {f_n} = {(cos n x)/n} is both a sequence of functions and a sequence of distributions. As n → ∞, f_n converges to 0 both as a function (pointwise) and as a distribution. It follows that D⁽¹⁾f_n = −sin n x converges to the zero distribution even though the pointwise limit is not defined.
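The statement of E.12.1 can be tested numerically; the Python sketch below is illustrative (φ(x) = x²(1 − x)² on (0,1) is an arbitrary smooth stand-in for a test function, not from the text) and shows the action of sin nx shrinking as n grows, even though sin nx itself does not converge pointwise.

```python
import math

# Action <sin nx, phi> = integral_0^1 sin(n x) phi(x) dx with the smooth
# stand-in phi(x) = x^2 (1-x)^2; the actions tend to 0 as n grows.
def action(n, m=50000):   # midpoint rule on (0, 1)
    h = 1.0 / m
    s = 0.0
    for k in range(m):
        x = (k + 0.5) * h
        s += math.sin(n * x) * (x * (1 - x)) ** 2
    return s * h

vals = [abs(action(n)) for n in (1, 10, 100)]
print(vals)   # the actions shrink toward zero
assert vals[0] > vals[1] > vals[2]
```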

Recall that using distributions we are able to generalize the concepts of function and derivative to many objects which previously made no sense under the usual definitions. It is also possible to generalize the concept of a differential equation.

Weak formulation. The differential equation Lu = f is a differential equation in the sense of distributions (i.e., in the weak sense) if f and u are distributions and all derivatives are interpreted in the sense of distributions. Such a differential equation is called the weak formulation of the differential equation.

13. Orthogonal systems, Fourier series

Orthonormal systems. A sequence { φ_k; k = 1, 2, ... } of elements of an inner product space is said to be orthonormal if

    ⟨φ_i, φ_j⟩ = 0 if i ≠ j,  ⟨φ_i, φ_j⟩ = 1 if i = j.

The vectors of an orthonormal system are linearly independent. Indeed, if we select n vectors from an orthonormal system { φ_k; k = 1, 2, ... }, then they are linearly independent. This follows from Theorem 8.1 because the corresponding Gram matrix is equal to the identity matrix.

Fourier series. Suppose that { φ_k; k = 1, 2, ... } is an orthonormal system in a Hilbert space H and that u is an arbitrary element of H. The series

    Σ_(k=1)^∞ α_k φ_k;  α_k = ⟨u, φ_k⟩    (13-1)

is said to be an orthogonal or generalized Fourier series representation of u ∈ H with respect to the orthonormal system { φ_k; k = 1, 2, ... }, and the scalars α_k; k = 1, 2, ... are called the (generalized) Fourier coefficients of u ∈ H.

Convergence of infinite series. Let { u_k; k = 1, 2, ... } denote a sequence of vectors in a normed space U. An infinite series

    Σ_(k=1)^∞ u_k

is said to be convergent if and only if the sequence of n-th partial sums

    s_n = Σ_(k=1)^n u_k

converges. In other words, an infinite series Σ_(k=1)^∞ u_k converges if and only if there exists a vector s ∈ U such that for every ε > 0 there is an integer N > 0 such that

    ‖s_n − s‖ < ε  whenever n ≥ N.

Theorem 13.1. Let { φ_k; k = 1, 2, ... } be an orthonormal system in a Hilbert space H. Then the Fourier series representation (13-1) of an arbitrary element u ∈ H converges.

However, it does not follow that the limit of this series is u!

E.13.1. The sequence

    φ_k = √(2/π) sin((2k − 1) x);  k = 1, 2, ...    (13-2)

is orthonormal in the Hilbert space H = L₂(0, π). Indeed, using Mathematica we get

In[1]:= φ[k_] := Sqrt[2/Pi] Sin[(2 k - 1) x]
In[2]:= Assuming[Element[{m, n}, Integers] && m != n,
          {Integrate[φ[m] φ[n], {x, 0, Pi}], Integrate[φ[n]^2, {x, 0, Pi}]}]
Out[2]= {0, 1}

The function

    u = sin 2x

is orthogonal to every φ_k, and hence the Fourier coefficients of u are

    α_k = ⟨u, φ_k⟩ = ∫₀^π sin 2x φ_k dx = 2 ∫₀^π φ_k sin x d(sin x) = 0

(the substitution t = sin x works because φ_k sin x can be written as a polynomial in sin x, and t runs from 0 back to 0 as x runs from 0 to π).

The corresponding Fourier series

    0 · √(2/π) sin x + 0 · √(2/π) sin 3x + 0 · √(2/π) sin 5x + ...

converges, in accordance with Theorem 13.1, but its limit is the zero function u = θ and not u = sin 2x.

This would not occur if the sequence (13-2) contained the function √(2/π) sin 2x too. Hence the sequence (13-2) is in this respect deficient, "incomplete".
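The same orthogonality facts can be spot-checked numerically; the Python sketch below (illustrative, paralleling the symbolic Mathematica computation above) verifies the orthonormality of (13-2) and the vanishing Fourier coefficients of u = sin 2x.

```python
import math

# Numerical check that phi_k = sqrt(2/pi) sin((2k-1)x) is orthonormal in
# L2(0, pi) and that u = sin 2x is orthogonal to every phi_k.
def inner(f, g, m=20000):   # midpoint rule on (0, pi)
    h = math.pi / m
    return h * sum(f((j + 0.5) * h) * g((j + 0.5) * h) for j in range(m))

def phi(k):
    return lambda x: math.sqrt(2 / math.pi) * math.sin((2 * k - 1) * x)

u = lambda x: math.sin(2 * x)
assert abs(inner(phi(1), phi(1)) - 1) < 1e-6          # unit norm
assert abs(inner(phi(1), phi(2))) < 1e-6              # mutual orthogonality
assert all(abs(inner(u, phi(k))) < 1e-6 for k in (1, 2, 3))
print("all Fourier coefficients of sin 2x vanish in this system")
```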

Complete orthonormal sets. Let { φ_k; k = 1, 2, ... } be an orthonormal set in a Hilbert space H. The orthonormal set { φ_k } is said to be complete if for every element u ∈ H the Fourier series representation of u with respect to { φ_k; k = 1, 2, ... } converges to the limit u.

Do not be confused by this second use of the notion complete. In Section 5 we defined that a (normed) space S is complete if every Cauchy sequence in S is convergent in S. Here we say that the set { φ_k; k = 1, 2, ... } is complete if Σ_(k=1)^∞ ⟨u, φ_k⟩ φ_k = u for every u in the Hilbert space H.

Theorem 13.2. An orthonormal system { φ_k; k = 1, 2, ... } is complete if and only if

    ⟨u, φ_k⟩ = 0 for all k implies u = θ.

That is, the orthonormal system { φ_k; k = 1, 2, ... } is complete if u = θ is the only vector in H which is orthogonal to all elements of the set { φ_k; k = 1, 2, ... }.

We saw that the function u = sin 2x is orthogonal to all elements of the orthonormal set (13-2). Hence, also by Theorem 13.2, the set (13-2) is not complete.

Recall that a subset S of a Hilbert space H spans or generates the Hilbert space if the set of all linear combinations of the elements of S is dense in H.

Complete sequences. Let H be a Hilbert space. A sequence

    { ψ_k; k = 1, 2, ... }    (13-3)

of (not necessarily mutually orthogonal) elements of H is called a complete sequence of H if { ψ_k; k = 1, 2, ... } spans the Hilbert space H.

In other words, { ψ_k; k = 1, 2, ... } is a complete sequence of a Hilbert space H if for any u ∈ H and for every ε > 0 there is a positive integer N and numbers a₁⁽ⁿ⁾, a₂⁽ⁿ⁾, ..., aₙ⁽ⁿ⁾ such that

    ‖u − Σ_(k=1)^n a_k⁽ⁿ⁾ ψ_k‖ < ε  for all n ≥ N.

Bases for a Hilbert space. Let H be a Hilbert space. A sequence { ψ_k; k = 1, 2, ... } of elements of H is called a basis for H if { ψ_k; k = 1, 2, ... }

1. is a complete sequence of H (that is, every u ∈ H can be approximated with an arbitrary accuracy by linear combinations of elements of { ψ_k; k = 1, 2, ... }), and

2. is linearly independent.

In other terms, a basis for a Hilbert space H is any linearly independent countable subset of H which spans H.

We note that in finite element approximations not only the coefficients a₁⁽ⁿ⁾, a₂⁽ⁿ⁾, ..., aₙ⁽ⁿ⁾ but also the basis functions ψ₁⁽ⁿ⁾, ψ₂⁽ⁿ⁾, ..., ψₙ⁽ⁿ⁾ depend on the accuracy ε.

Schauder basis. If the basis for a Hilbert space H has the property that any u ∈ H can be written uniquely in the form of an infinite sum

    u = Σ_(k=1)^∞ a_k ψ_k

(which is interpreted as ‖u − Σ_(k=1)^n a_k ψ_k‖ < ε for all n > N), then the linearly independent sequence { ψ_k; k = 1, 2, ... } is called a Schauder basis.

If a sequence (13-3) is orthonormal, then the previous definitions of complete sequences and of complete orthonormal sets express identical concepts.

Orthonormal bases. An orthonormal sequence

    { φ_k; k = 1, 2, ... }

that forms a basis for a Hilbert space H is called an orthonormal basis for H.

Now we give an algorithm for the construction of an orthonormal basis { φ_k; k = 1, 2, ... } once a basis { ψ_k; k = 1, 2, ... } for H is given.

Gram-Schmidt orthonormalization. Let

    φ₁ = ψ₁ / ‖ψ₁‖

be the first element of the orthonormal set. (The linear independence of { ψ_k; k = 1, 2, ... } implies that the zero vector is not an element of { ψ_k }.) Obviously ‖φ₁‖ = 1.

Let

    g₂ = ψ₂ + c₁₂ φ₁

be orthogonal to φ₁ (see Fig. 13.1), that is, let

    ⟨φ₁, g₂⟩ = ⟨φ₁, ψ₂⟩ + c₁₂ ⟨φ₁, φ₁⟩ = 0,

from which it follows that c₁₂ = −⟨φ₁, ψ₂⟩.

Figure 13.1. First step of the orthonormalization procedure: g₂ = ψ₂ + c₁₂ φ₁ is the component of ψ₂ orthogonal to φ₁, and φ₂ = g₂/‖g₂‖.

We note that g₂ ≠ θ, because if

    g₂ = ψ₂ + c₁₂ φ₁ = θ,

then the vectors ψ₁, ψ₂ are not linearly independent (φ₁ is a multiple of ψ₁). Hence

    φ₂ = g₂ / ‖g₂‖,

and φ₁ and φ₂ are mutually orthogonal vectors with unit norms.

Let

    g₃ = ψ₃ + c₁₃ φ₁ + c₂₃ φ₂

be orthogonal to φ₁ and to φ₂, that is, let

    ⟨φ₁, g₃⟩ = ⟨φ₁, ψ₃⟩ + c₁₃ ⟨φ₁, φ₁⟩ + c₂₃ ⟨φ₁, φ₂⟩ = 0,
    ⟨φ₂, g₃⟩ = ⟨φ₂, ψ₃⟩ + c₁₃ ⟨φ₂, φ₁⟩ + c₂₃ ⟨φ₂, φ₂⟩ = 0,

from which it follows that

    c₁₃ = −⟨φ₁, ψ₃⟩,  c₂₃ = −⟨φ₂, ψ₃⟩.

Similarly to the case of the vector g₂ it can be shown that g₃ ≠ θ, and hence

    φ₃ = g₃ / ‖g₃‖;

φ₁, φ₂ and φ₃ are mutually orthogonal and have unit norms.

If we continue this technique, then we get an orthonormal system { φ₁, φ₂, ... }.

It is easy to see that the sequence { φ_k; k = 1, 2, ... } is an orthonormal basis for the Hilbert space H. The orthonormality of the sequence immediately follows from its construction, so it is enough to prove that it is complete. From the process of the orthonormalization it is clearly seen that the n-th element of { φ_k; k = 1, 2, ... } is a linear combination of the first n elements of { ψ_k; k = 1, 2, ... }, and vice versa. Therefore, if any u ∈ H can be approximated with arbitrary accuracy by linear combinations of the elements of { ψ_k; k = 1, 2, ... } ⊂ H, then this is also true for the linear combinations of the elements of { φ_k; k = 1, 2, ... } ⊂ H, which means that the sequence { φ_k; k = 1, 2, ... } is complete in H.
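The procedure described above is short to implement directly. The following Python sketch (illustrative; the text itself uses Mathematica's GramSchmidt) orthonormalizes the vectors u1 = (3, 4, 2), u2 = (2, 5, 2), u3 = (1, 2, 6) of the three-dimensional Mathematica example below.

```python
# Gram-Schmidt orthonormalization for vectors in R^n.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return dot(u, u) ** 0.5

def gram_schmidt(basis):
    ortho = []
    for psi in basis:
        # g_k = psi_k - sum_i <phi_i, psi_k> phi_i  (the text's c_ik = -<phi_i, psi_k>)
        g = list(psi)
        for ph in ortho:
            c = dot(ph, psi)
            g = [gj - c * pj for gj, pj in zip(g, ph)]
        n = norm(g)   # nonzero by linear independence
        ortho.append([gj / n for gj in g])
    return ortho

v1, v2, v3 = gram_schmidt([[3, 4, 2], [2, 5, 2], [1, 2, 6]])
for a, b in ((v1, v2), (v1, v3), (v2, v3)):
    assert abs(dot(a, b)) < 1e-12      # mutually orthogonal
for v in (v1, v2, v3):
    assert abs(norm(v) - 1) < 1e-12    # unit norms
print([round(x, 4) for x in v1])       # the components of u1/||u1||
```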

E.13.2. Recall that the sequence of functions

    { 1, x, x², x³, x⁴, ... }

forms a basis for the Hilbert space L₂(−1, 1). But this basis is not orthonormal. To orthonormalize it we may apply the process of Gram-Schmidt orthonormalization, which can be realized using Mathematica as follows:

In[1]:= << LinearAlgebra`Orthogonalization`
In[2]:= {y1, y2, y3, y4, y5} =
          GramSchmidt[{1, x, x^2, x^3, x^4}, Normalized -> True,
            InnerProduct -> (Integrate[#1 #2, {x, -1, 1}] &)]
Out[2]= {1/Sqrt[2], Sqrt[3/2] x, Sqrt[5/2] (-1 + 3 x^2)/2,
         Sqrt[7/2] (-3 x + 5 x^3)/2, Sqrt[9/2] (3 - 30 x^2 + 35 x^4)/8}

(these are the normalized Legendre polynomials)

Control:

In[3]:= {Integrate[y1^2, {x, -1, 1}], Integrate[y1 y2, {x, -1, 1}]}
Out[3]= {1, 0}

The following example applies the Gram-Schmidt orthonormalization to a given list of three-dimensional vectors.

In[1]:= << LinearAlgebra`Orthogonalization`
In[2]:= u1 = {3, 4, 2}; u2 = {2, 5, 2}; u3 = {1, 2, 6};
In[3]:= {v1, v2, v3} = GramSchmidt[{u1, u2, u3}]
Out[3]= {{3/Sqrt[29], 4/Sqrt[29], 2/Sqrt[29]},
         {-32/Sqrt[1653], 25/Sqrt[1653], -2/Sqrt[1653]},
         {-2/Sqrt[57], -2/Sqrt[57], 7/Sqrt[57]}}

The result is an orthonormal basis, so the scalar product of each pair of vectors is zero and each vector has unit length:

In[4]:= {v1.v2, v2.v3, v1.v3, v1.v1, v2.v2, v3.v3}
Out[4]= {0, 0, 0, 1, 1, 1}

In[5]:= Show[Graphics3D[{Hue[0], Thickness[0.008],
          Map[Line[{{0, 0, 0}, #}] &, {v1, v2, v3}]}], Axes -> True]

(The 3D plot shows the three mutually perpendicular unit vectors drawn from the origin.)
Out[5]= Graphics3D

Theorem 13.3. If a Hilbert space H is separable, then H contains a basis.

Proof: If a Hilbert space H is separable, then H contains a countable dense subset, which spans H. If we order the elements of this subset into a sequence, then we have a complete sequence; if we omit those elements which are linear combinations of the others, then this does not disturb the completeness of the sequence, so we have a linearly independent complete sequence of H, which is a basis for H.

Theorem 13.3 and the Gram-Schmidt orthonormalization imply:

Theorem 13.4. A separable Hilbert space has an orthonormal basis.

14. The projection theorem, the best approximation

From simple geometry it is well known how to determine, among all points m of a plane M, the point m* for which the distance from a point x ∉ M is the shortest. The point m* is given as the intersection of the plane M with the line passing through the point x and perpendicular to the plane M (see Fig. 14.1).

Figure 14.1. Orthogonal projection in ℝ³: the distance ‖x − m*‖ from x to the plane M is realized by the foot m* of the perpendicular.

This obvious and intuitive result can be generalized to the problem of finding the best approximation m* of a given vector x of a Hilbert space in a subspace M.

Minimizing vector. Let H be an inner product space and M a linear subspace of H. A vector m* ∈ M is said to be a minimizing vector of a given vector x ∈ H in the subspace M if

    ‖x − m*‖ ≤ ‖x − m‖  for all m ∈ M.

Theorem 14.1. (The projection theorem.)

(1) m* ∈ M is a minimizing vector of a vector x ∈ H in a subspace M if and only if

    ⟨x − m*, m⟩ = 0  for all m ∈ M.

(2) If a minimizing vector exists, then it is unique.

(3) If H is a Hilbert space and M is a closed linear subspace of H (i.e. if m_n ∈ M and m_n → m imply m ∈ M), then to every x ∈ H there exists a minimizing vector m* in M.

We note that neither the orthogonality of x − m* nor the uniqueness of m* depends on the completeness of the space H.

The best approximation. Let H be a real Hilbert space and suppose we have an orthonormal sequence of functions { φ_k; k = 1, 2, ..., n } ⊂ H with which we wish to approximate an arbitrary function u in H. To approximate u in the best possible way, we want a linear combination of { φ_k; k = 1, 2, ..., n } which is as close as possible, in terms of the norm in H, to u. In other words, we want to minimize

    ‖u − Σ_(k=1)^n c_k φ_k‖.

By the projection theorem, this norm is minimal if and only if

    ⟨u − Σ_(k=1)^n c_k φ_k, φ_i⟩ = 0,  i = 1, 2, ..., n.

By the orthonormality of the sequence { φ_k; k = 1, 2, ..., n }, it follows that c_k = ⟨u, φ_k⟩, k = 1, 2, ..., n; that is, the scalars c_k; k = 1, 2, ..., n are the Fourier coefficients of u ∈ H.

With this choice of c_k, the error of our approximation is

    ‖u − Σ_(k=1)^n c_k φ_k‖² = ⟨u − Σ_(k=1)^n c_k φ_k, u − Σ_(k=1)^n c_k φ_k⟩ = ‖u‖² − Σ_(k=1)^n ⟨u, φ_k⟩².

Since the error (norm) can never be negative, it follows that

    Σ_(k=1)^n c_k² = Σ_(k=1)^n ⟨u, φ_k⟩² ≤ ‖u‖² < ∞,

which is known as Bessel's inequality. Since this is true for all n, if the set { φ_k } is infinite we can take the limit n → ∞ and conclude that Σ_(k=1)^∞ c_k φ_k converges to some function in H. (We know this because the partial sums of Σ_(k=1)^∞ c_k φ_k form a Cauchy sequence in H and H is complete.)

For any orthonormal system { φ_k; k = 1, 2, ..., n }, the best approximation of u is Σ_(k=1)^n ⟨u, φ_k⟩ φ_k, which is the projection of u onto the space spanned by the set { φ_k; k = 1, 2, ..., n }.
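As an illustration of the best approximation, the Python sketch below (illustrative; u(x) = eˣ is an arbitrary choice, not from the text) projects u onto the first few normalized Legendre polynomials of E.13.2 in L₂(−1, 1) and watches the squared error ‖u‖² − Σ c_k² decrease, in agreement with Bessel's inequality.

```python
import math

# Project u(x) = e^x onto the span of the first normalized Legendre
# polynomials; the Fourier coefficients c_k = <u, phi_k> give the best
# approximation and the squared error ||u||^2 - sum c_k^2.
P = [lambda x: 1 / math.sqrt(2),
     lambda x: math.sqrt(3 / 2) * x,
     lambda x: math.sqrt(5 / 2) * (3 * x * x - 1) / 2,
     lambda x: math.sqrt(7 / 2) * (5 * x ** 3 - 3 * x) / 2]

def inner(f, g, m=20000):   # midpoint rule on (-1, 1)
    h = 2.0 / m
    return h * sum(f(-1 + (j + 0.5) * h) * g(-1 + (j + 0.5) * h) for j in range(m))

u = math.exp
c = [inner(u, p) for p in P]
norm_u_sq = inner(u, u)     # = (e^2 - e^-2)/2
errs = [norm_u_sq - sum(ck * ck for ck in c[:n + 1]) for n in range(4)]
print(errs)                 # positive, strictly decreasing squared errors
assert all(e1 > e2 > 0 for e1, e2 in zip(errs, errs[1:]))
```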
Appendix A. Construction of basis functions for the vector space C₃⁽¹⁾[−1, 1] using Mathematica

Training problem: Let C₃⁽¹⁾[−1, 1] denote the linear subspace of the vector space C⁽¹⁾[−1, 1] which consists of piecewise cubic polynomials on the intervals [−1, 0] and [0, 1]. Similarly to example E.2.2, determine a basis of the function space C₃⁽¹⁾[−1, 1] and plot it using Mathematica!

In[1]:= Clear["Global`*"]    (* Training problem *)

        p1[t_] := a1 t^3 + b1 t^2 + c1 t + d1
        p2[t_] := a2 t^3 + b2 t^2 + c2 t + d2

        (* conditions for continuity at t = 0 *)

        p1[0] == p2[0]

Out[4]= d1 == d2

In[5]:= (D[p1[t], t] /. t -> 0) == (D[p2[t], t] /. t -> 0)

Out[5]= c1 == c2

In[6]:= d1 = d2 = d; c1 = c2 = c;

        p[t_]  := Piecewise[{{p1[t], -1 <= t <= 0}, {p2[t], 0 <= t <= 1}}]
        dp[t_] := Piecewise[{{p1'[t], -1 <= t <= 0}, {p2'[t], 0 <= t <= 1}}]

        s = Solve[{Map[p, {-1, 0, 1}] == Map[g, {-1, 0, 1}],
                   Map[dp, {-1, 0, 1}] == Map[dg, {-1, 0, 1}]} // Flatten,
                  {a1, b1, a2, b2, c, d}] // First

Out[9]= {a1 -> dg[-1] + dg[0] + 2 g[-1] - 2 g[0],
         b1 -> dg[-1] + 2 dg[0] + 3 g[-1] - 3 g[0],
         a2 -> dg[0] + dg[1] + 2 g[0] - 2 g[1],
         b2 -> -2 dg[0] - dg[1] - 3 g[0] + 3 g[1],
         c -> dg[0], d -> g[0]}

In[10]:= (* substitution of the solution into p[t] *)

         ss = p[t] /. s

Out[10]= Piecewise[{
           {g[0] + t dg[0] + t^2 (dg[-1] + 2 dg[0] + 3 g[-1] - 3 g[0])
                 + t^3 (dg[-1] + dg[0] + 2 g[-1] - 2 g[0]),  -1 <= t <= 0},
           {g[0] + t dg[0] + t^2 (-2 dg[0] - dg[1] - 3 g[0] + 3 g[1])
                 + t^3 (dg[0] + dg[1] + 2 g[0] - 2 g[1]),     0 <= t <= 1}}]

In[11]:= (* arrange to the form p[t_] :=
            g[-1] φ1[t] + dg[-1] φ2[t] + g[0] φ3[t] + dg[0] φ4[t] +
            g[1] φ5[t] + dg[1] φ6[t] *)

         sss = {ss[[1, 1, 1]], ss[[1, 2, 1]]}

Out[11]= {g[0] + t dg[0] + t^2 (dg[-1] + 2 dg[0] + 3 g[-1] - 3 g[0]) +
            t^3 (dg[-1] + dg[0] + 2 g[-1] - 2 g[0]),
          g[0] + t dg[0] + t^2 (-2 dg[0] - dg[1] - 3 g[0] + 3 g[1]) +
            t^3 (dg[0] + dg[1] + 2 g[0] - 2 g[1])}

In[12]:= fi = Table[Map[Coefficient[sss, #] &, {g[k - 2], dg[k - 2]}], {k, 1, 3}] //
            Flatten[#, 1] &

Out[12]= {{3 t^2 + 2 t^3, 0}, {t^2 + t^3, 0},
          {1 - 3 t^2 - 2 t^3, 1 - 3 t^2 + 2 t^3},
          {t + 2 t^2 + t^3, t - 2 t^2 + t^3},
          {0, 3 t^2 - 2 t^3}, {0, -t^2 + t^3}}

In[13]:= Table[Subscript[φ, k][t_] = Piecewise[{{fi[[k, 1]], -1 <= t <= 0},
            {fi[[k, 2]], 0 < t <= 1}}], {k, 1, 6}]

Out[13]= the six piecewise cubics φ1[t], ..., φ6[t], with the branches on
         [-1, 0] and (0, 1] as listed in Out[12]

In[14]:= drawing = Table[Plot[Subscript[φ, k][t], {t, -1, 1}, AspectRatio -> 1/2,
            PlotStyle -> Thickness[0.011], DisplayFunction -> Identity], {k, 1, 6}];

In[15]:= Show[GraphicsArray[Partition[drawing, 2]], DisplayFunction -> $DisplayFunction]

         (plots of the six basis functions φ1[t], ..., φ6[t] on the interval [-1, 1])

Out[15]= -GraphicsArray-
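As a cross-check of the Mathematica results, the six basis functions from Out[12]/Out[13] can be written out directly and their cardinal-interpolation property verified. The following Python sketch is an added illustration, not part of the original notebook:

```python
# The six piecewise-cubic basis functions, copied from Out[12]/Out[13]:
# entry k holds the branch on [-1, 0] (left) and on (0, 1] (right).
def phi(k, t):
    left  = [3*t**2 + 2*t**3, t**2 + t**3, 1 - 3*t**2 - 2*t**3,
             t + 2*t**2 + t**3, 0.0, 0.0]
    right = [0.0, 0.0, 1 - 3*t**2 + 2*t**3,
             t - 2*t**2 + t**3, 3*t**2 - 2*t**3, -t**2 + t**3]
    return left[k - 1] if t <= 0 else right[k - 1]

def dphi(k, t, h=1e-6):
    # central-difference derivative (the cubic formulas extend smoothly past +-1)
    return (phi(k, t + h) - phi(k, t - h)) / (2 * h)

# phi_1, phi_3, phi_5 interpolate the values g(-1), g(0), g(1):
assert [phi(1, t) for t in (-1, 0, 1)] == [1, 0, 0]
assert [phi(3, t) for t in (-1, 0, 1)] == [0, 1, 0]
assert [phi(5, t) for t in (-1, 0, 1)] == [0, 0, 1]
# phi_2, phi_4, phi_6 vanish at the nodes but interpolate the slopes
# dg(-1), dg(0), dg(1):
for k, node in ((2, -1), (4, 0), (6, 1)):
    assert abs(phi(k, node)) < 1e-12
    assert abs(dphi(k, node) - 1) < 1e-4
```

These value/slope properties are exactly what makes the linear combination in the In[11] comment reproduce g and dg at the nodes -1, 0, 1.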

Appendix B. The Lebesgue integral

For the purposes of this text, it is only necessary to acquire a simple understanding
of the definition of this integral and some of its fundamental properties.

The Lebesgue integral for bounded measurable functions: Let f(x) be a
(real-valued), bounded, measurable function on a closed interval [a, b]. Since
f(x) is measurable on [a, b], there exists a sequence of continuous functions
\{ g_n(x) \} which converges to f(x) almost everywhere on [a, b]. Since f(x) is
bounded, there is a number K > 0 for which | f(x) | \le K on the interval a \le x \le b.
Then we may write

    \int_a^b f(x)\,dx = \lim_{n \to \infty} \int_a^b g_n(x)\,dx ,

where the integrals on the right-hand side are Riemann integrals of continuous
functions.

We note that this definition is well posed, because it can be shown that the limit
does not depend on the choice of the sequence \{ g_n(x) \}.

Nonnegative integrable functions: Let f(x) \ge 0, a \le x \le b, be a measurable
function. Define the sequence of bounded measurable functions

    f_n(x) = \begin{cases} f(x) & x \in [a,b] \text{ and } 0 \le f(x) \le n \\ n & x \in [a,b] \text{ and } n < f(x) \end{cases}
    \qquad (n = 1, 2, \ldots) .

The function f(x) is said to be (Lebesgue) integrable on the interval [a, b] if the
sequence of integrals \int_a^b f_n(x)\,dx is bounded from above. Then we may write

    \int_a^b f(x)\,dx = \lim_{n \to \infty} \int_a^b f_n(x)\,dx = \sup_n \int_a^b f_n(x)\,dx

and this expression is called the Lebesgue integral of the function f(x) on [a, b].
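The truncation construction can be made concrete numerically. The following Python sketch is an added illustration (my example, not from the text): f(x) = 1/sqrt(x) on (0, 1] is unbounded but integrable, its truncations f_n = min(f, n) are bounded measurable functions, and their integrals, which equal 2 - 1/n exactly, increase to the supremum 2.

```python
import math

# f(x) = 1/sqrt(x) on (0, 1]: unbounded, nonnegative, integrable.
def f(x):
    return 1.0 / math.sqrt(x)

def integral_fn(n, m=100000):
    # midpoint rule for the bounded truncation f_n = min(f, n) on [0, 1]
    h = 1.0 / m
    return h * sum(min(f((i + 0.5) * h), n) for i in range(m))

ns = (1, 2, 4, 8)
vals = [integral_fn(n) for n in ns]

# the integrals are non-decreasing and bounded above ...
assert all(vals[i] <= vals[i + 1] for i in range(len(vals) - 1))
# ... and match the exact values 2 - 1/n, whose supremum is 2
assert all(abs(v - (2 - 1 / n)) < 1e-3 for v, n in zip(vals, ns))
```

The supremum 2 is the Lebesgue integral of f over [0, 1], even though f itself is unbounded and has no proper Riemann integral there.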

Integrable functions of arbitrary sign: Let f(x) be a (real-valued)
measurable function on a closed interval [a, b]. If f_1(x) and f_2(x) are two
nonnegative integrable functions such that

    f(x) = f_1(x) - f_2(x) \qquad ( x \in [a,b] ) ,

then we say that f(x) is (Lebesgue) integrable on [a, b]. The number

    \int_a^b f(x)\,dx = \int_a^b f_1(x)\,dx - \int_a^b f_2(x)\,dx

is called the (Lebesgue) integral of the function f(x) on the interval [a, b].

Some properties of the Lebesgue integral and the Riemann integral are the
same. For example, the set of integrable functions is a linear space and the map

    f \mapsto \int_a^b f(x)\,dx

is linear, that is,

    \int_a^b \big( f(x) + g(x) \big)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx
    \qquad \text{and} \qquad
    \int_a^b \big( \lambda f(x) \big)\,dx = \lambda \int_a^b f(x)\,dx .

In Section 9 we defined sets of (Lebesgue) measure zero. Sets of arbitrary
Lebesgue measure can be defined using their characteristic functions. If
A \subset [a, b], then the function

    \chi_A(x) = \begin{cases} 0 & \text{if } x \notin A \\ 1 & \text{if } x \in A \end{cases} \qquad ( a \le x \le b )

is called the characteristic function of A.

Measurable sets. A set A \subset [a, b] is said to be measurable (Lebesgue
measurable) if its characteristic function \chi_A is integrable. The number

    \int_a^b \chi_A(x)\,dx

is called the (Lebesgue) measure of A and is denoted mes A.

E.B.1. If A = [c, d] \subset [a, b], then mes A = d - c. Consequently, the Lebesgue
measure is a generalization of length in elementary geometry.

Theorem B.1. If A \subset [a, b] and B \subset [a, b] are two measurable disjoint sets on
the interval [a, b], that is, if A \cap B = \emptyset, then A \cup B is measurable and

    \mathrm{mes}\,( A \cup B ) = \mathrm{mes}\, A + \mathrm{mes}\, B .
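Theorem B.1 can be illustrated with a small Python sketch (an added example with assumed sets, not from the text): inside [a, b] = [0, 1] take the disjoint intervals A = [0, 1/4] and B = [1/2, 3/4]; integrating characteristic functions by a Riemann sum reproduces mes A + mes B = mes(A ∪ B) = 1/2.

```python
# Characteristic function of the interval [c, d], as in the definition above.
def chi(c, d):
    return lambda x: 1.0 if c <= x <= d else 0.0

def integrate(fun, m=40000):
    # midpoint rule on [0, 1]
    h = 1.0 / m
    return h * sum(fun((i + 0.5) * h) for i in range(m))

chi_A, chi_B = chi(0.0, 0.25), chi(0.5, 0.75)
mes_A = integrate(chi_A)
mes_B = integrate(chi_B)
# since A and B are disjoint, chi_{A ∪ B} = max(chi_A, chi_B)
mes_union = integrate(lambda x: max(chi_A(x), chi_B(x)))

assert abs(mes_A - 0.25) < 1e-3
assert abs(mes_union - (mes_A + mes_B)) < 1e-6
```

For overlapping sets the additivity would fail, which is why the disjointness assumption A ∩ B = ∅ is essential.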

So far we have supposed that the interval [a, b] is arbitrary but fixed. It can
be shown that the previous definitions and theorems are independent of the
choice of such an interval. Up to this point we have dealt with bounded Lebesgue
measurable sets and with functions integrable on bounded intervals. The extension
of the above ideas to functions defined on unbounded intervals can be made by a
limit process. Nonnegative integrable functions are obtained using the following
properties:

    f : ( -\infty , +\infty ) \to \mathbb{R} , \quad f(x) \ge 0 \ \ (\forall x) ,
    f \ \text{is integrable on every interval} \ [ -n , n ] \ \ ( n = 1, 2, \ldots ) ,
    \lim_{n \to \infty} \int_{-n}^{n} f(x)\,dx \ \text{exists.}

The last limit is denoted by \int_{-\infty}^{\infty} f(x)\,dx .
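A Python sketch of this limit process (an added example with an assumed integrand, not from the text): f(x) = e^{-|x|} satisfies the three properties, the integrals over [-n, n] equal 2(1 - e^{-n}), and they increase to the limit 2, which is by definition the integral of f over (-∞, +∞).

```python
import math

# f(x) = exp(-|x|): nonnegative and integrable on every [-n, n].
def integral_on(n, m=100000):
    # midpoint rule on [-n, n]
    h = 2.0 * n / m
    return h * sum(math.exp(-abs(-n + (i + 0.5) * h)) for i in range(m))

vals = [integral_on(n) for n in (1, 2, 5, 10)]

# the integrals over [-n, n] increase and approach 2(1 - e^{-n}) -> 2
assert all(vals[i] < vals[i + 1] for i in range(len(vals) - 1))
assert abs(vals[-1] - 2.0) < 1e-3
```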

Now it is possible to extend the definition of integrable functions of arbitrary
sign to the interval ( -\infty , +\infty ), and unbounded measurable sets can be
discussed in a way analogous to bounded measurable sets.

The theory of the Lebesgue integral in \mathbb{R}^1 can easily be extended to the
n-dimensional space \mathbb{R}^n.

Arbitrary measurable sets may also serve as domains of definition of
integrable functions.

Finally, we mention some very important theorems:

Theorem B.2. Let f(x) be an integrable function on the interval [a, b] and let

    \int_a^b | f(x) |\,dx = 0 ;

then f(x) = 0 almost everywhere on [a, b].

Theorem B.3 (B. Levi): Suppose \{ f_n(x) \} is a non-decreasing sequence of
non-negative integrable functions on [a, b] and let there be a real number K > 0,
independent of n, such that

    \int_a^b f_n(x)\,dx \le K \qquad ( n = 1, 2, \ldots ) .

Then the function f(x) = \lim_{n \to \infty} f_n(x) is integrable on [a, b] and

    \int_a^b f(x)\,dx = \lim_{n \to \infty} \int_a^b f_n(x)\,dx
      = \int_a^b \Big( \lim_{n \to \infty} f_n(x) \Big)\,dx .

Theorem B.4 (H. Lebesgue): Suppose \{ f_n(x) \} is a sequence of integrable
functions on [a, b] and f_n(x) converges to f(x) pointwise almost everywhere
(that is, except on a set of measure zero). If there is an integrable function g(x)
such that | f_n(x) | \le g(x) for every n \in \mathbb{N}, then the pointwise limit f(x)
is integrable and

    \lim_{n \to \infty} \int_a^b f_n(x)\,dx
      = \int_a^b \Big( \lim_{n \to \infty} f_n(x) \Big)\,dx
      = \int_a^b f(x)\,dx .

In other words, the limit process and integration may be interchanged without
harm.
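Theorem B.4 can be tried numerically. The following Python sketch is an added illustration (my examples, not from the text): on [0, 1], f_n(x) = x^n tends to 0 pointwise almost everywhere and is dominated by the integrable g(x) = 1, so lim ∫ f_n = ∫ lim f_n = 0; by contrast, for f_n = n on (0, 1/n) and 0 elsewhere there is no integrable dominating function, and the interchange fails.

```python
def integral(fun, m=20000):
    # midpoint rule on [0, 1]
    h = 1.0 / m
    return h * sum(fun((i + 0.5) * h) for i in range(m))

# dominated case: f_n(x) = x^n, |f_n| <= g = 1, integrals 1/(n+1) -> 0
good = [integral(lambda x, n=n: x ** n) for n in (1, 10, 100, 1000)]
assert all(abs(v - 1.0 / (n + 1)) < 1e-3 for v, n in zip(good, (1, 10, 100, 1000)))
assert good[-1] < 1e-2            # approaches the integral of the limit, 0

# no integrable dominating function: f_n = n on (0, 1/n), limit is 0 a.e.,
# but every integral equals 1, so lim of integrals != integral of limit
bad = [integral(lambda x, n=n: n if x < 1.0 / n else 0.0) for n in (10, 100)]
assert all(abs(v - 1.0) < 1e-2 for v in bad)
```

The second family is dominated only by g(x) = 1/x, which is not integrable on (0, 1], so Theorem B.4 does not apply to it.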

References

K. Atkinson and W. Han, Theoretical Numerical Analysis, Springer, New York, 2001.

S.C. Brenner and L.R. Scott, The Mathematical Theory of Finite Element Methods,
Springer, New York, 1994.

A. Göpfert and T. Riedrich, Funktionalanalysis, Teubner, Leipzig, 1980.

J.P. Keener, Principles of Applied Mathematics, Transformation and Approximation,
Addison-Wesley Publishing Company, New York, 1988.

A.N. Kolmogorov and S.V. Fomin, Elements of the Theory of Functions and
Functional Analysis, Graylock Press, New York, 1957.

L.P. Lebedev and I.I. Vorovich, Functional Analysis in Mechanics, Springer,
New York, 2003.

J.T. Oden, Applied Functional Analysis, A First Course for Students of Mechanics
and Engineering Science, Prentice-Hall, New Jersey, 1979.

J.N. Reddy, Applied Functional Analysis and Variational Methods in Engineering,
McGraw-Hill Book Company, 1987.

K. Rektorys, Variational Methods in Mathematics, Science and Engineering,
D. Reidel Publishing Company, London, 1980.

S. Wolfram, The Mathematica Book, Fifth edition, Cambridge University Press, 2003.

Index

composite, 20
adjoint operator, 48 Composition[], 20
affine-subspace, 13 construction, of basis functions, 83
algebraic continuous
basis, 15 linear functional, 47
dual, 24 continuous operator, 37
dual space, 24 convergence, 31, 73
in C0(∞)(Ω), 66
almost everywhere property, 51, 55, 85
approximation, best, 82
of infinite series, 73
Banach space, 31, 45 weak, 72
basis, 14 convergent, 31
algebraic, 15 countable, 41
dual, 25 set, 51
for Hilbert space, 75
Hamel, 15 delta
orthonormal, 76 distribution, 61
Schauder, 76 sequences, 62
Bessel's inequality, 82 dense
best approximation, 82 functions, 55
bijection, 19 set, 41
bijective operator, 19 subspace, 71
bilinear derivative
form, 24 distributional, 59
functional, 24, 71 generalized, 57
functional, symmetric, 24 generalized, 59
bound of operator, 37 piecewise, 60
boundary value problem, 63 weak, 59
bounded Descartes product, 10
linear operator, 37 differentiation, 20
set, 36 dimension, 14
dimensional
C(k)[a,b], 8 finite-, 14
C[a,b], 29 infinite-, 14, 18
C0()(), 58, 66 n-, 14
Cantor, Georg, 41 direct
Cartesian product, 10
product, 10 sum, 13
Cauchy sequence, 31, 69 Dirichlet function, 55
Cauchy-Schwarz inequality, 44 distance, 26
characteristic function, 86 distribution, 66
classical solution, 70 delta, 61
closed subset, 36 regular, 68
co-domain, 19 singular, 68
compact distributional derivative, 59
operator, 40 domain, 19
set, 36 dual
support, 58 basis, 25
complement, 13 pair, 24
complete dual space, topological, 40
sequences, 75 duality pair, 40
sets, orthonormal, 74, 76
space, 31 equivalent functions, 55
completion, 35 Euclidean

norm, 26 injective operator, 19, 20
space, 43 inner product, 43, 68, 73
integral
finite element method, 15 Lebesgue, 53, 55, 67, 85
finite-dimensional space, 14, 22 norm, 32
formulation Riemann, 55
weak, 72 intersection, 12
Fourier inverse operator, 23
coefficients, 73
series, 73 kernel, 21
function, 19
characteristic, 86 l0,l02, 32
dense, 55 L0[a,b], 29
Dirichlet, 55 L02[a,b], 29
equivalent, 55 L2(Ω), 53
generalized, 67 Lax-Milgram theorem, 50, 72
Heaviside, 68 Lebesgue
locally integrable, 68 integral, 53, 55, 67, 85
measurable, 51, 85 measure, 86
smooth, 57 theorem, 62, 85
square integrable, 53 Lebesgue, Henri, 53
support of, 58 Levi theorem, 87
test, 58, 67 limit, 31
unitstep, 68 Limit[], 35
functional, 23, 67 linear
bilinear, 24, 71 functional, 24, 67
linear, 24, 67 functional, continuous, 47
linear, continuous, 47 manifold, 13
positive definite bilinear, 24 operator, 20, 22
symmetric bilinear, 24 operator, bounded, 37
space, 7
generalized subspace, 11
derivative, 57, 59 linearly
function, 67 dependent, 14
representation, 73 independent, 14
solution, 70 ln1, ln2, ln, 27
generate, 15, 75 locally integrable function, 68
Gram matrix, 44, 73
Gram-Schmidt orthonormalization, 76 map, 19
mapping, 19
H(k)(Ω), 69 Mathematica function
Hamel Composition[], 20
basis, 15 Limit[], 35
Heaviside Norm[], 27
distribution, 68 NullSpace[], 21
function, 68 Outer[], 10
Heine-Borel theorem, 36 matrices, 21
Hilbert space, 45, 69, 75, 80 matrix
basis for, 75 Gram, 44
maximum norm, 27, 32
inequality mean value theorem, 63
Bessel's, 82 measurable
Cauchy-Schwarz, 44 function, 51, 85
infinite set, 86
-dimensional, 14, 18 method
norm, 27, 32 finite element, 15
sequences, 18, 36 metric space, 26
set, 14 minimizing vector, 81

multi index notation, 57 pre-Hilbert space, 43
problem
n-dimensional, 14 boundary value, 63
norm, 26 product
Euclidean, 26 Cartesian, 10
infinite, 27, 32 Descartes, 10
integral, 32 direct, 10
maximum, 27, 32 inner, 43, 68, 73
of linear operator, 39 scalar, 43
sum, 26 projection theorem, 81
Norm[], 27 property
normed space, 26, 30, 31, 32, 40, 45, 46, 48 almost everywhere, 51, 55, 85
C[a,b], 29
complete linear, 31 range, 19
L0[a,b], 29 regular distribution, 68
L02[a,b], 29 representation
linear, 26 Fourier series, 73
notation generalized, 73
multi index, 57 orthogonal, 73
symbolic, 61 Riemann integral, 55
Null space, 21 Riesz representation theorem, 46, 72
NullSpace[], 21 Riesz, Frigyes, 46

one-to-one operator, 19 scalar product, 43


onto operator, 19 Schauder basis, 76
operator, 19 Schwartz, Laurent, 66
adjoint, 48 self-adjoint operator, 49
bijective, 19 separable space, 41, 56, 80
bound of, 37 sequence(s)
compact, 40 Cauchy, 31, 69
continuous, 37 complete, 75
injective, 19, 20 delta, 62
inverse, 23 infinite, 18, 36
linear, 20, 22 set
linear, bounded, 37 bounded, 36
linear, norm of, 39 compact, 36
one-to-one, 19 complete orthonormal, 74, 76
onto, 19 countable, 51
self-adjoint, 49 dense, 41
surjective, 19, 20 infinite, 14
orthogonal L2(), 53
representation, 73 measurable, 86
orthogonal vectors, 44 of measure zero, 51
orthonormal precompact, 36
basis, 76 spanning, 14, 15, 42
set, complete, 74, 76 singular distribution, 68
systems, 73 smooth function, 57
orthonormalization Sobolev space, 57, 68
Gram-Schmidt, 76 H(k)(), 69
Outer[], 10 W2 (k)(), 69
solution
piecewise classical, 70
derivative, 60 generalized, 70
linear polynomials, 15 weak, 70
linear polynomials, 83 space
positive definite algebraic dual, 24
bilinear functional, 24 Banach, 31, 45
precompact set, 36 complete, 31

complete normed linear, 31 norm, 26
Euclidean, 43 support
finite-dimensional, 14, 22 compact, 58
Hilbert, 45, 69, 75, 80 of function, 58
infinite-dimensional, 14 surjective operator, 19, 20
inner product, 43 symbolic notation, 61
L2, 53
linear, 7 test function, 58, 67
metric, 26 theorem
n-dimensional, 14 Heine-Borel, 36
normed, 26, 30, 32, 40, 45, 48 Lax-Milgram, 50, 72
normed, C[a,b], 29 Lebesgue, 62, 85
normed, L0[a,b], 29 Levi, 87
normed, L02[a,b], 29 mean value, 63
null, 21 projection, 81
pre-Hilbert, 43 Riesz representation, 46, 72
separable, 41, 56, 80 Weierstrass approximation, 42
Sobolev, 57, 68 topological dual space, 40
Sobolev, H(k)(), 69 transformation, 19
Sobolev, W2(k)(), 69
sub-, 11 unitstep function, 68
topological dual, 40
vector, 7 vector
span, 15, 75 minimizing, 81
spanning set, 14, 15, 42 orthogonal, 44
square integrable function, 53 space, 7
subset
closed, 36 W2 (k)(), 69
subspace, 11 weak
affine, 13 convergence, 72
dense, 71 derivative, 59
linear, 11 formulation, 72
sum solution, 70
direct, 13 Weierstrass approximation theorem, 42
