GAANS-Probability Measures On Metric Spaces

Probability measures on metric spaces
Onno van Gaans

These are some loose notes supporting the first sessions of the seminar Stochastic
Evolution Equations organized by Dr. Jan van Neerven at the Delft University
of Technology during Winter 2002/2003. They contain less information than
the common textbooks on the topic of the title. Their purpose is to present a
brief selection of the theory that provides a basis for later study of stochastic
evolution equations in Banach spaces. The notes aim at an audience that feels
more at ease in analysis than in probability theory. The main focus is on
Prokhorov's theorem, which serves both as an important tool for future use and
as an illustration of techniques that play a role in the theory.
The field of measures on topological spaces has the luxury of several excellent
textbooks. The main source that has been used to prepare these notes is the
book by Parthasarathy [6]. A clear exposition is also available in one of Bourbaki's volumes [2] and in [9, Section 3.2]. The theory on the Prokhorov metric
is taken from Billingsley [1]. The additional references for standard facts on
general measure theory and general topology have been Halmos [4] and Kelley
[5].
Contents
1 Borel sets
2 Borel probability measures
3 Weak convergence of measures
4 The Prokhorov metric
5 Prokhorov's theorem
6 Riesz representation theorem
7 Riesz representation for non-compact spaces
8 Integrable functions on metric spaces
9 More properties of the space of probability measures
The distribution of a random variable in a Banach space X will be a probability

measure on X. When we study limit properties of stochastic processes we will
be faced with convergence of probability measures on X. For certain aspects of
the theory the linear structure of X is irrelevant and the theory of probability
measures on metric spaces supplies some powerful tools. In view of the Banach
space setting that we have in mind, it is not too restrictive to assume separability
and completeness but we should avoid assuming compactness of the metric
space.
Borel sets
Let $(X, d)$ be a metric space. The Borel $\sigma$-algebra ($\sigma$-field) $\mathcal{B} = \mathcal{B}(X)$ is the
smallest $\sigma$-algebra in $X$ that contains all open subsets of $X$. The elements of $\mathcal{B}$
are called the Borel sets of $X$.
The metric space $(X, d)$ is called separable if it has a countable dense subset,
that is, there are $x_1, x_2, \ldots$ in $X$ such that $\overline{\{x_1, x_2, \ldots\}} = X$. ($\overline{A}$ denotes the
closure of $A \subset X$.)
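Lemma 1.1. If $X$ is a separable metric space, then $\mathcal{B}(X)$ equals the $\sigma$-algebra
generated by the open (or closed) balls of $X$.
Proof. Denote
$\mathcal{A}$ := $\sigma$-algebra generated by the open (or closed) balls of $X$.
Clearly, $\mathcal{A} \subset \mathcal{B}$.
Let $D$ be a countable dense set in $X$. Let $U \subset X$ be open. For $x \in U$ take
$r > 0$, $r \in \mathbb{Q}$, such that $B(x, r) \subset U$ ($B(x, r)$ open or closed ball with center $x$
and radius $r$) and take $y_x \in D \cap B(x, r/3)$. Then $x \in B(y_x, r/2) \subset B(x, r)$. Set
$r_x := r/2$. Then
\[ U = \bigcup \{ B(y_x, r_x) : x \in U \}, \]
which is a countable union, because the centers $y_x$ lie in $D$ and the radii $r_x$ are rational.
Therefore $U \in \mathcal{A}$. Hence $\mathcal{B} \subset \mathcal{A}$.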
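Lemma 1.2. Let $(X, d)$ be a separable metric space. Let $\mathcal{C} \subset \mathcal{B}$ be countable.
If $\mathcal{C}$ separates closed balls from points in the sense that for every closed ball $B$
and every $x \in X \setminus B$ there exists $C \in \mathcal{C}$ such that $B \subset C$ and $x \notin C$, then the
$\sigma$-algebra generated by $\mathcal{C}$ is the Borel $\sigma$-algebra.
Proof. Clearly $\sigma(\mathcal{C}) \subset \mathcal{B}$, where $\sigma(\mathcal{C})$ denotes the $\sigma$-algebra generated by $\mathcal{C}$. Let
$B$ be a closed ball in $X$. Then $B = \bigcap \{C \in \mathcal{C} : B \subset C\}$, which is a countable
intersection and hence a member of $\sigma(\mathcal{C})$. By the previous lemma we obtain
$\mathcal{B} \subset \sigma(\mathcal{C})$.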
If $f : S \to T$ and $\mathcal{A}_S$ and $\mathcal{A}_T$ are $\sigma$-algebras in $S$ and $T$, respectively, then $f$ is
called measurable (w.r.t. $\mathcal{A}_S$ and $\mathcal{A}_T$) if
\[ f^{-1}(A) = \{x \in S : f(x) \in A\} \in \mathcal{A}_S \quad \text{for all } A \in \mathcal{A}_T. \]
Proposition 1.3. Let $(X, d)$ be a metric space. $\mathcal{B}(X)$ is the smallest $\sigma$-algebra
with respect to which all (real valued) continuous functions on $X$ are measurable
(w.r.t. $\mathcal{B}(X)$ and $\mathcal{B}(\mathbb{R})$).
(See [6, Theorem I.1.7, p. 4].)
Borel probability measures
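Let $(X, d)$ be a metric space. A finite Borel measure on $X$ is a map $\mu : \mathcal{B}(X) \to [0, \infty)$ such that
$\mu(\emptyset) = 0$, and
\[ A_1, A_2, \ldots \in \mathcal{B} \text{ mutually disjoint} \implies \mu\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} \mu(A_i). \]
$\mu$ is called a Borel probability measure if in addition $\mu(X) = 1$.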

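The following well known continuity properties will be used many times.
Lemma 2.1. Let $X$ be a metric space and $\mu$ a finite Borel measure on $X$. Let
$A_1, A_2, \ldots$ be Borel sets.
(1) If $A_1 \subset A_2 \subset \cdots$ and $A = \bigcup_{i=1}^{\infty} A_i$, then $\mu(A) = \lim_{n \to \infty} \mu(A_n)$.
(2) If $A_1 \supset A_2 \supset \cdots$ and $A = \bigcap_{i=1}^{\infty} A_i$, then $\mu(A) = \lim_{n \to \infty} \mu(A_n)$.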
The next observation is important in the proof of Theorems 3.2 and 4.2.
Lemma 2.2. If $\mu$ is a finite Borel measure on $X$ and $\mathcal{A}$ is a collection of
mutually disjoint Borel sets of $X$, then at most countably many elements of $\mathcal{A}$
have nonzero $\mu$-measure.
Proof. For $m \ge 1$, let $\mathcal{A}_m := \{A \in \mathcal{A} : \mu(A) > 1/m\}$. For any distinct
$A_1, \ldots, A_k \in \mathcal{A}_m$ we have
\[ \mu(X) \ge \mu\Big(\bigcup_{i=1}^{k} A_i\Big) = \mu(A_1) + \cdots + \mu(A_k) > k/m, \]
hence $\mathcal{A}_m$ has at most $m\mu(X)$ elements. Thus
\[ \{A \in \mathcal{A} : \mu(A) > 0\} = \bigcup_{m=1}^{\infty} \mathcal{A}_m \]
is countable.
Example. If $\mu$ is a finite Borel measure on $\mathbb{R}$, then $\mu(\{t\}) = 0$ for all except at
most countably many $t \in \mathbb{R}$.
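Proposition 2.3. Any finite Borel measure $\mu$ on $X$ is regular, that is, for every
$B \in \mathcal{B}$
\[ \mu(B) = \sup\{\mu(C) : C \subset B,\ C \text{ closed}\} \quad \text{(inner regular)} \]
\[ \phantom{\mu(B)} = \inf\{\mu(U) : U \supset B,\ U \text{ open}\} \quad \text{(outer regular)}. \]
Proof. Define the collection $\mathcal{R}$ by
\[ A \in \mathcal{R} \iff \mu(A) = \sup\{\mu(C) : C \subset A,\ C \text{ closed}\} \text{ and } \mu(A) = \inf\{\mu(U) : U \supset A,\ U \text{ open}\}. \]
We have to show that $\mathcal{R}$ contains the Borel sets.
Step 1: $\mathcal{R}$ is a $\sigma$-algebra. Clearly $\emptyset \in \mathcal{R}$. Let $A \in \mathcal{R}$, let $\varepsilon > 0$. Take $C$ closed and $U$ open with $C \subset A \subset U$ and
$\mu(A) < \mu(C) + \varepsilon$, $\mu(A) > \mu(U) - \varepsilon$. Then $U^c \subset A^c \subset C^c$, $U^c$ is closed, $C^c$ is
open, and
\[ \mu(A^c) = \mu(X) - \mu(A) > \mu(X) - \mu(C) - \varepsilon = \mu(C^c) - \varepsilon, \]
\[ \mu(A^c) = \mu(X) - \mu(A) < \mu(X) - \mu(U) + \varepsilon = \mu(U^c) + \varepsilon. \]
Hence $A^c \in \mathcal{R}$.
Let $A_1, A_2, \ldots \in \mathcal{R}$ and let $\varepsilon > 0$. Take for each $i$
$U_i$ open, $C_i$ closed with
\[ C_i \subset A_i \subset U_i, \qquad \mu(U_i) - \mu(A_i) < 2^{-i}\varepsilon, \qquad \mu(A_i) - \mu(C_i) < 2^{-i}\varepsilon/2. \]
Then $\bigcup_i C_i \subset \bigcup_i A_i \subset \bigcup_i U_i$ and $\bigcup_i U_i$ is open, and
\[ \mu\Big(\bigcup_{i=1}^{\infty} U_i\Big) - \mu\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \mu\Big(\bigcup_{i=1}^{\infty} U_i \setminus \bigcup_{i=1}^{\infty} A_i\Big) \le \mu\Big(\bigcup_{i=1}^{\infty} (U_i \setminus A_i)\Big) \le \sum_{i=1}^{\infty} \mu(U_i \setminus A_i) = \sum_{i=1}^{\infty} \big(\mu(U_i) - \mu(A_i)\big) < \sum_{i=1}^{\infty} 2^{-i}\varepsilon = \varepsilon. \]
Further, $\mu(\bigcup_{i=1}^{\infty} C_i) = \lim_{k \to \infty} \mu(\bigcup_{i=1}^{k} C_i)$, hence for some large $k$,
$\mu(\bigcup_{i=1}^{\infty} C_i) - \mu(\bigcup_{i=1}^{k} C_i) < \varepsilon/2$. Then $C := \bigcup_{i=1}^{k} C_i \subset \bigcup_{i=1}^{\infty} A_i$, $C$ is closed, and
\[ \mu\Big(\bigcup_{i=1}^{\infty} A_i\Big) - \mu(C) < \mu\Big(\bigcup_{i=1}^{\infty} A_i\Big) - \mu\Big(\bigcup_{i=1}^{\infty} C_i\Big) + \varepsilon/2 = \mu\Big(\bigcup_{i=1}^{\infty} A_i \setminus \bigcup_{i=1}^{\infty} C_i\Big) + \varepsilon/2 \]
\[ \le \mu\Big(\bigcup_{i=1}^{\infty} (A_i \setminus C_i)\Big) + \varepsilon/2 \le \sum_{i=1}^{\infty} \mu(A_i \setminus C_i) + \varepsilon/2 = \sum_{i=1}^{\infty} \big(\mu(A_i) - \mu(C_i)\big) + \varepsilon/2 \le \varepsilon/2 + \varepsilon/2. \]
Hence $\bigcup_{i=1}^{\infty} A_i \in \mathcal{R}$. Thus $\mathcal{R}$ is a $\sigma$-algebra.
Step 2: $\mathcal{R}$ contains all open sets. Since $\mathcal{R}$ is closed under complements, it suffices to prove that $\mathcal{R}$ contains all closed sets. Let
$A \subset X$ be closed. Inner regularity is trivial, as $A$ itself is closed. Let $U_n := \{x \in X : d(x, A) < 1/n\} = \{x \in X : \exists\, a \in A \text{ with } d(a, x) < 1/n\}$, $n = 1, 2, \ldots$. Then $U_n$ is open, $U_1 \supset U_2 \supset \cdots$, and
$\bigcap_{i=1}^{\infty} U_i = A$, as $A$ is closed. Hence $\mu(A) = \lim_{n \to \infty} \mu(U_n) = \inf_n \mu(U_n)$. So
\[ \mu(A) \le \inf\{\mu(U) : U \supset A,\ U \text{ open}\} \le \inf_n \mu(U_n) = \mu(A). \]
Hence $A \in \mathcal{R}$.
Conclusion: $\mathcal{R}$ is a $\sigma$-algebra that contains all open sets, so $\mathcal{R} \supset \mathcal{B}$.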
Corollary 2.4. If $\mu$ and $\nu$ are finite Borel measures on the metric space $X$ and
$\mu(A) = \nu(A)$ for all closed $A$ (or all open $A$), then $\mu = \nu$.
A finite Borel measure $\mu$ on $X$ is called tight if for every $\varepsilon > 0$ there exists a
compact set $K \subset X$ such that $\mu(X \setminus K) < \varepsilon$, or, equivalently, $\mu(K) \ge \mu(X) - \varepsilon$.
A tight finite Borel measure is also called a Radon measure.
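To make the definition concrete, here is a minimal Python sketch, assuming $X = \mathbb{R}$ and taking $\mu$ to be the standard normal law (an illustrative choice, not taken from the notes). For a given $\varepsilon$ it searches for an integer radius $R$ such that the compact set $K = [-R, R]$ satisfies $\mu(X \setminus K) < \varepsilon$. The function names are hypothetical.

```python
import math


def mass_outside(radius: float) -> float:
    """Mass the standard normal law puts outside K = [-radius, radius]."""
    # P(|Z| > r) = erfc(r / sqrt(2)) for a standard normal Z.
    return math.erfc(radius / math.sqrt(2.0))


def compact_radius(eps: float) -> int:
    """Smallest integer R >= 1 such that mu(X minus [-R, R]) < eps."""
    radius = 1
    while mass_outside(radius) >= eps:
        radius += 1
    return radius


if __name__ == "__main__":
    for eps in (0.05, 0.01):
        r = compact_radius(eps)
        print(f"eps={eps}: K=[-{r}, {r}], mass outside K = {mass_outside(r):.5f}")
```

The loop terminates for every $\varepsilon > 0$ because the tail mass decreases to $0$ as the radius grows, which is exactly the tightness of this particular measure.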
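Corollary 2.5. If $\mu$ is a tight finite Borel measure on the metric space $X$, then
\[ \mu(A) = \sup\{\mu(K) : K \subset A,\ K \text{ compact}\} \]
for every Borel set $A$ in $X$.
Proof. Take for every $\varepsilon > 0$ a compact set $K_\varepsilon$ such that $\mu(X \setminus K_\varepsilon) < \varepsilon$. Then
\[ \mu(A \cap K_\varepsilon) = \mu(A \setminus K_\varepsilon^c) \ge \mu(A) - \mu(K_\varepsilon^c) > \mu(A) - \varepsilon \]
and
\[ \mu(A \cap K_\varepsilon) = \sup\{\mu(C) : C \subset K_\varepsilon \cap A,\ C \text{ closed}\} \le \sup\{\mu(K) : K \subset A,\ K \text{ compact}\}, \]
because each closed subset contained in a compact set is compact. Combination
completes the proof.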
Of course, if (X, d) is a compact metric space, then every finite Borel measure
on X is tight. There is another interesting case. A complete separable metric
space is sometimes called a Polish space.
Theorem 2.6. If (X, d) is a complete separable metric space, then every finite
Borel measure on X is tight.
We need a lemma from topology.
Lemma 2.7. If $(X, d)$ is a complete metric space, then a closed set $K$ in $X$ is
compact if and only if it is totally bounded, that is, for every $\varepsilon > 0$ the set $K$ is
covered by finitely many balls (open or closed) of radius less than or equal to $\varepsilon$.
Proof. $\Rightarrow$) Clear: the covering with all $\varepsilon$-balls with centers in $K$ has a finite
subcovering.
$\Leftarrow$) Let $(x_n)_n$ be a sequence in $K$. For each $m \ge 1$ there are finitely many
$1/m$-balls that cover $K$, at least one of which contains $x_n$ for infinitely many
$n$. For $m = 1$ take a ball $B_1$ with radius $\le 1$ such that $N_1 := \{n : x_n \in B_1\}$
is infinite, and take $n_1 \in N_1$. Take a ball $B_2$ with radius $\le 1/2$ such that
$N_2 := \{n > n_1 : x_n \in B_2 \cap B_1\}$ is infinite, and take $n_2 \in N_2$. Take $B_3$, radius
$\le 1/3$, $N_3 := \{n > n_2 : x_n \in B_3 \cap B_2 \cap B_1\}$ infinite, $n_3 \in N_3$. And so on.
Thus $(x_{n_k})_k$ is a subsequence of $(x_n)_n$ and since $x_{n_\ell} \in B_k$ for all $\ell \ge k$,
$(x_{n_k})_k$ is a Cauchy sequence. As $X$ is complete, $(x_{n_k})_k$ converges in $X$ and as
$K$ is closed, the limit is in $K$. So $(x_n)_n$ has a convergent subsequence and $K$ is
compact.
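Proof of Theorem 2.6. We have to prove that for every $\varepsilon > 0$ there exists a
compact set $K$ such that $\mu(X \setminus K) < \varepsilon$. Let $D = \{a_1, a_2, \ldots\}$ be a countable
dense subset of $X$ and write $\overline{B}(a, r)$ for the closed ball with center $a$ and radius $r$. Then for each $\delta > 0$, $\bigcup_{k=1}^{\infty} \overline{B}(a_k, \delta) = X$. Hence
$\mu(X) = \lim_{n \to \infty} \mu(\bigcup_{k=1}^{n} \overline{B}(a_k, \delta))$ for all $\delta > 0$. Let $\varepsilon > 0$. Then there is for each $m \ge 1$
an $n_m$ such that
\[ \mu\Big(\bigcup_{k=1}^{n_m} \overline{B}(a_k, 1/m)\Big) > \mu(X) - 2^{-m}\varepsilon. \]
Let
\[ K := \bigcap_{m=1}^{\infty} \bigcup_{k=1}^{n_m} \overline{B}(a_k, 1/m). \]
Then $K$ is closed and for each $\delta > 0$,
\[ K \subset \bigcup_{k=1}^{n_m} \overline{B}(a_k, 1/m) \subset \bigcup_{k=1}^{n_m} B(a_k, \delta) \]
if we choose $m > 1/\delta$. So $K$ is totally bounded and hence compact, by the lemma. Further,
\[ \mu(X \setminus K) = \mu\Big(\bigcup_{m=1}^{\infty} \Big(X \setminus \bigcup_{k=1}^{n_m} \overline{B}(a_k, 1/m)\Big)\Big) \le \sum_{m=1}^{\infty} \mu\Big(X \setminus \bigcup_{k=1}^{n_m} \overline{B}(a_k, 1/m)\Big) \]
\[ = \sum_{m=1}^{\infty} \Big(\mu(X) - \mu\Big(\bigcup_{k=1}^{n_m} \overline{B}(a_k, 1/m)\Big)\Big) < \sum_{m=1}^{\infty} 2^{-m}\varepsilon = \varepsilon. \]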
Weak convergence of measures
Let $(X, d)$ be a metric space and denote
\[ C_b(X) := \{f : X \to \mathbb{R} : f \text{ is continuous and bounded}\}. \]
Each $f \in C_b(X)$ is integrable with respect to any finite Borel measure on $X$.
Definition 3.1. Let $\mu, \mu_1, \mu_2, \ldots$ be finite Borel measures on $X$. We say that
$(\mu_i)_i$ converges weakly to $\mu$ if
\[ \int f \, d\mu_i \to \int f \, d\mu \quad \text{as } i \to \infty \text{ for all } f \in C_b(X). \]
Notation: $\mu_i \Rightarrow \mu$. (There is at most one such limit $\mu$, as follows from the
metrization by the Prokhorov metric, which is discussed in the next section.)
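Theorem 3.2. Let $(X, d)$ be a metric space, let $\mu, \mu_1, \mu_2, \ldots$ be Borel probability
measures on $X$. The following statements are equivalent:
(a) $\mu_i \Rightarrow \mu$
(b) $\int g \, d\mu_i \to \int g \, d\mu$ for all $g \in UC_b(X) := \{f : X \to \mathbb{R} : f$ is uniformly
continuous and bounded$\}$
(c) $\limsup_{i \to \infty} \mu_i(C) \le \mu(C)$ for all closed $C \subset X$
(d) $\liminf_{i \to \infty} \mu_i(U) \ge \mu(U)$ for all open $U \subset X$
(e) $\mu_i(A) \to \mu(A)$ for every Borel set $A$ in $X$ with $\mu(\partial A) = 0$. (Here $\partial A = \overline{A} \setminus A^\circ$.)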
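Proof. (a)$\Rightarrow$(b) is clear.
(b)$\Rightarrow$(c): Let $C$ be a closed set, nonempty. Let $U_m := \{x : d(x, C) < 1/m\}$,
$m \ge 1$. (Here $d(x, A) := \inf_{a \in A} d(x, a)$ if $A \ne \emptyset$, and $d(x, \emptyset) := \infty$.) Then $U_m^c$
is closed and
\[ \inf_{x \in C,\ y \in U_m^c} d(x, y) \ge 1/m. \]
Hence there is an $f_m \in UC_b(X)$ with $0 \le f_m \le 1$ on $X$, $f_m = 1$ on $C$, and $f_m = 0$
on $U_m^c$. (Indeed,
\[ f_m(x) := \frac{d(x, U_m^c)}{d(x, U_m^c) + d(x, C)} \]
does the job.) Since
\[ \mu_i(C) = \int \mathbf{1}_C \, d\mu_i \le \int f_m \, d\mu_i, \]
we get by assumption (b)
\[ \limsup_{i \to \infty} \mu_i(C) \le \limsup_{i \to \infty} \int f_m \, d\mu_i = \int f_m \, d\mu \le \int \mathbf{1}_{U_m} \, d\mu = \mu(U_m). \]
Because $\bigcap_{m=1}^{\infty} U_m = C$ (since $C$ is closed) we find
\[ \mu(C) = \lim_{m \to \infty} \mu(U_m) \ge \limsup_{i \to \infty} \mu_i(C). \]
(c)$\Rightarrow$(d): By complements,
\[ \liminf_{i \to \infty} \mu_i(U) = \liminf_{i \to \infty} \big(\mu_i(X) - \mu_i(U^c)\big) = 1 - \limsup_{i \to \infty} \mu_i(U^c) \ge 1 - \mu(U^c) = \mu(X) - \mu(U^c) = \mu(U). \]
(d)$\Rightarrow$(c): Similarly.
(c)+(d)$\Rightarrow$(e): $A^\circ \subset A \subset \overline{A}$, $A^\circ$ is open and $\overline{A}$ is closed, so by (c) and (d)
\[ \limsup_{i \to \infty} \mu_i(A) \le \limsup_{i \to \infty} \mu_i(\overline{A}) \le \mu(\overline{A}) = \mu(A \cup \partial A) \le \mu(A) + \mu(\partial A) = \mu(A), \]
\[ \liminf_{i \to \infty} \mu_i(A) \ge \liminf_{i \to \infty} \mu_i(A^\circ) \ge \mu(A^\circ) = \mu(A \setminus \partial A) \ge \mu(A) - \mu(\partial A) = \mu(A), \]
hence $\mu_i(A) \to \mu(A)$.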

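(e)$\Rightarrow$(a): Let $g \in C_b(X)$. Idea: we have $\int f \, d\mu_i \to \int f \, d\mu$ for suitable simple
functions $f$; we want to approximate $g$ to get $\int g \, d\mu_i \to \int g \, d\mu$.
Define
\[ \nu(E) := \mu(\{x : g(x) \in E\}) = \mu(g^{-1}(E)), \quad E \text{ Borel set in } \mathbb{R}. \]
Then $\nu$ is a finite Borel measure (probability measure) on $\mathbb{R}$ and if we take
$a < -\|g\|_\infty$, $b > \|g\|_\infty$, then $\nu(\mathbb{R} \setminus (a, b)) = 0$. As $\nu$ is finite, there are at most
countably many $\alpha$ with $\nu(\{\alpha\}) > 0$. Hence for $\varepsilon > 0$ there are $t_0, \ldots, t_m \in \mathbb{R}$
such that
(i) $a = t_0 < t_1 < \cdots < t_m = b$,
(ii) $t_j - t_{j-1} < \varepsilon$, $j = 1, \ldots, m$,
(iii) $\nu(\{t_j\}) = 0$, i.e., $\mu(\{x : g(x) = t_j\}) = 0$, $j = 0, \ldots, m$.
Take
\[ A_j := \{x \in X : t_{j-1} \le g(x) < t_j\} = g^{-1}([t_{j-1}, t_j)), \quad j = 1, \ldots, m. \]
Then $A_j \in \mathcal{B}(X)$ for all $j$ and $X = \bigcup_{j=1}^{m} A_j$. Further,
$\overline{A_j} \subset \{x : t_{j-1} \le g(x) \le t_j\}$ (since this set is closed and $\supset A_j$),
$A_j^\circ \supset \{x : t_{j-1} < g(x) < t_j\}$ (since this set is open and $\subset A_j$),
so
\[ \mu(\partial A_j) = \mu(\overline{A_j} \setminus A_j^\circ) \le \mu(\{x : g(x) = t_{j-1} \text{ or } g(x) = t_j\}) = \mu(\{x : g(x) = t_{j-1}\}) + \mu(\{x : g(x) = t_j\}) = 0 + 0. \]
Hence by (e), $\mu_i(A_j) \to \mu(A_j)$ as $i \to \infty$ for $j = 1, \ldots, m$. Put
\[ h := \sum_{j=1}^{m} t_{j-1} \mathbf{1}_{A_j}, \]
then $h(x) \le g(x) \le h(x) + \varepsilon$ for all $x \in X$. Hence
\[ \Big|\int g \, d\mu_i - \int g \, d\mu\Big| = \Big|\int (g - h) \, d\mu_i + \int h \, d\mu_i - \int (g - h) \, d\mu - \int h \, d\mu\Big| \]
\[ \le \int |g - h| \, d\mu_i + \Big|\int h \, d\mu_i - \int h \, d\mu\Big| + \int |g - h| \, d\mu \le \varepsilon \mu_i(X) + \Big|\sum_{j=1}^{m} t_{j-1} \big(\mu_i(A_j) - \mu(A_j)\big)\Big| + \varepsilon \mu(X). \]
It follows that $\limsup_{i \to \infty} |\int g \, d\mu_i - \int g \, d\mu| \le 2\varepsilon$. Thus $\int g \, d\mu_i \to \int g \, d\mu$ as
$i \to \infty$.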
Remark. The condition that the measures $\mu, \mu_1, \mu_2, \ldots$ in the above theorem
are probability measures can be weakened to finite Borel measures such that
$\mu_i(X) \to \mu(X)$ as $i \to \infty$. The same proof can be used with only minor
modifications in the proof of the equivalence (c)$\Leftrightarrow$(d).
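The inequalities in (c) and (d) can be strict, and (e) really needs the boundary condition. A standard illustration, not taken from the notes, is $X = \mathbb{R}$ with $\mu_i = \delta_{1/i}$ and $\mu = \delta_0$. The Python sketch below represents a set by its membership test; all names in it are hypothetical.

```python
def dirac(a):
    """Dirac measure at the point a; a set is passed as a membership test."""
    return lambda contains: 1.0 if contains(a) else 0.0


def f(x):
    """A bounded continuous test function on R."""
    return 1.0 / (1.0 + x * x)


def in_closed_set(x):      # C = {0}
    return x == 0.0


def in_open_set(x):        # U = (0, infinity)
    return x > 0.0


def in_continuity_set(x):  # A = (-1, infinity); its boundary {-1} is mu-null
    return x > -1.0


limit = dirac(0.0)                                   # mu = delta_0
sequence = [dirac(1.0 / i) for i in range(1, 1001)]  # mu_i = delta_{1/i}

# Integrals against Dirac measures are point evaluations, so
# |int f dmu_i - int f dmu| = |f(1/i) - f(0)|, which tends to 0.
gap = abs(f(1.0 / 1000) - f(0.0))

if __name__ == "__main__":
    print("gap at i = 1000:", gap)
    print("mu_i(C), mu(C):", sequence[-1](in_closed_set), limit(in_closed_set))
    print("mu_i(U), mu(U):", sequence[-1](in_open_set), limit(in_open_set))
    print("mu_i(A), mu(A):", sequence[-1](in_continuity_set), limit(in_continuity_set))
```

Here $\mu_i(C) = 0 < 1 = \mu(C)$ and $\mu_i(U) = 1 > 0 = \mu(U)$ for every $i$, so (c) and (d) hold with strict inequality, while both $C$ and $U$ have boundary $\{0\}$ of $\mu$-measure one and therefore fall outside the scope of (e).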
The Prokhorov metric
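Let $(X, d)$ be a metric space. Denote
\[ \mathcal{P} = \mathcal{P}(X) := \text{all Borel probability measures on } X. \]
We have defined the notion of weak convergence in $\mathcal{P}$. Define for $\mu, \nu \in \mathcal{P}$
\[ d_P(\mu, \nu) := \inf\{\alpha > 0 : \mu(A) \le \nu(A_\alpha) + \alpha \text{ and } \nu(A) \le \mu(A_\alpha) + \alpha \ \ \forall A \in \mathcal{B}(X)\}, \]
where
\[ A_\alpha := \{x : d(x, A) < \alpha\} \text{ if } A \ne \emptyset, \qquad \emptyset_\alpha := \emptyset \text{ for all } \alpha > 0. \]
(Here $d(x, A) = \inf\{d(x, a) : a \in A\}$.) The function $d_P$ is called the Prokhorov
metric on $\mathcal{P}$ (induced by $d$), which makes sense because of the next theorem. If
$X$ is separable, then convergence in the metric $d_P$ is the same as weak convergence in $\mathcal{P}$.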
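On a finite metric space every subset is Borel, so the defining formula can be evaluated by brute force. The following Python sketch is only an illustration under that assumption (it is exponential in the number of points). It tests whether a given $\alpha$ is admissible and then bisects, using that admissibility is monotone in $\alpha$ and that $\alpha = 1$ is always admissible. All names are hypothetical.

```python
from itertools import combinations


def mass(measure, subset):
    """Measure of a finite set; `measure` maps points to point masses."""
    return sum(measure[x] for x in subset)


def enlarge(subset, alpha, points, dist):
    """A_alpha = {x : d(x, A) < alpha}; empty when A is empty."""
    return {x for x in points if any(dist(x, a) < alpha for a in subset)}


def admissible(alpha, mu, nu, points, dist):
    """Check mu(A) <= nu(A_alpha) + alpha and the symmetric bound for all A."""
    for size in range(len(points) + 1):
        for subset in combinations(points, size):
            bigger = enlarge(subset, alpha, points, dist)
            if mass(mu, subset) > mass(nu, bigger) + alpha + 1e-12:
                return False
            if mass(nu, subset) > mass(mu, bigger) + alpha + 1e-12:
                return False
    return True


def prokhorov_distance(mu, nu, points, dist, tol=1e-9):
    """Approximate d_P(mu, nu) by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0  # alpha = 1 is always admissible
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if admissible(mid, mu, nu, points, dist):
            hi = mid
        else:
            lo = mid
    return hi


points = (0.0, 0.3, 1.0)
dist = lambda x, y: abs(x - y)
delta_0 = {0.0: 1.0, 0.3: 0.0, 1.0: 0.0}   # Dirac measure at 0
delta_03 = {0.0: 0.0, 0.3: 1.0, 1.0: 0.0}  # Dirac measure at 0.3

if __name__ == "__main__":
    print(prokhorov_distance(delta_0, delta_03, points, dist))
```

For two Dirac measures the definition gives $d_P(\delta_a, \delta_b) = \min\{d(a, b), 1\}$ directly: with $A = \{a\}$ the inequality $1 \le \delta_b(A_\alpha) + \alpha$ fails for $\alpha \le d(a, b)$ unless $\alpha \ge 1$. The sketch reproduces this value up to the bisection tolerance.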
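Theorem 4.1. Let $(X, d)$ be a metric space.
(1) $d_P$ is a metric on $\mathcal{P} = \mathcal{P}(X)$.
(2) Let $\mu, \mu_1, \mu_2, \ldots \in \mathcal{P}$. Then $d_P(\mu_i, \mu) \to 0$ implies $\mu_i \Rightarrow \mu$.
Proof. (1): Any $\alpha \ge 1$ is in the set of the defining formula of $d_P$, so the infimum
is well defined. Clearly $d_P(\mu, \nu) \ge 0$ and $d_P(\mu, \nu) = d_P(\nu, \mu)$ for all $\mu, \nu \in \mathcal{P}$.
$d_P(\mu, \mu) = 0$: Let $\mu \in \mathcal{P}$. For every Borel set $A$ and $\alpha > 0$, $A_\alpha \supset A$, so
$\mu(A) \le \mu(A_\alpha) + \alpha$, hence $d_P(\mu, \mu) \le \alpha$, whence $d_P(\mu, \mu) = 0$.
$d_P(\mu, \nu) = 0 \Rightarrow \mu = \nu$: If $d_P(\mu, \nu) = 0$, then there is a sequence $\alpha_n \downarrow 0$ such
that $\mu(A) \le \nu(A_{\alpha_n}) + \alpha_n$ and $\nu(A) \le \mu(A_{\alpha_n}) + \alpha_n$ for all $n$. As $\overline{A} = \bigcap_n A_{\alpha_n}$,
it follows that $\mu(A) \le \nu(\overline{A})$ and $\nu(A) \le \mu(\overline{A})$. In particular, $\mu(A) = \nu(A)$ for
all closed sets $A$ and therefore $\mu = \nu$ (by inner regularity).
Triangle inequality: Let $\mu, \nu, \eta \in \mathcal{P}$. Let $\alpha > 0$ be such that
\[ \mu(A) \le \eta(A_\alpha) + \alpha, \qquad \eta(A) \le \mu(A_\alpha) + \alpha \qquad \text{for all } A \in \mathcal{B} \]
and $\beta > 0$ such that
\[ \eta(A) \le \nu(A_\beta) + \beta, \qquad \nu(A) \le \eta(A_\beta) + \beta \qquad \text{for all } A \in \mathcal{B}. \]
Then for $A \in \mathcal{B}$:
\[ \mu(A) \le \eta(A_\alpha) + \alpha \le \nu((A_\alpha)_\beta) + \beta + \alpha, \]
\[ \nu(A) \le \eta(A_\beta) + \beta \le \mu((A_\beta)_\alpha) + \alpha + \beta. \]
Now notice that $(A_\alpha)_\beta \subset A_{\alpha+\beta}$. (Indeed, $x \in (A_\alpha)_\beta \Rightarrow d(x, A_\alpha) < \beta \Rightarrow
\exists\, y \in A_\alpha : d(x, y) < \beta$, and $y \in A_\alpha \Rightarrow \exists\, a \in A : d(y, a) < \alpha$, so that
$d(x, a) \le d(x, y) + d(y, a) < \alpha + \beta$, and $x \in A_{\alpha+\beta}$.) Of course also $(A_\beta)_\alpha \subset A_{\alpha+\beta}$.
Hence for all $A \in \mathcal{B}$,
\[ \mu(A) \le \nu(A_{\alpha+\beta}) + \alpha + \beta, \]
\[ \nu(A) \le \mu(A_{\alpha+\beta}) + \alpha + \beta. \]
Thus, by the definition, $d_P(\mu, \nu) \le \alpha + \beta$. The infimum over the $\alpha$ under
consideration is $d_P(\mu, \eta)$ and the infimum over the $\beta$ is $d_P(\eta, \nu)$. Thus taking
infimum over $\alpha$ and $\beta$ yields
\[ d_P(\mu, \nu) \le d_P(\mu, \eta) + d_P(\eta, \nu). \]
The proof of (1) is complete.
(2): Assume that $d_P(\mu_i, \mu) \to 0$ as $i \to \infty$. Then there are $\alpha_i \to 0$, $\alpha_i > 0$, with
$\mu_i(A) \le \mu(A_{\alpha_i}) + \alpha_i$ and $\mu(A) \le \mu_i(A_{\alpha_i}) + \alpha_i$ for all $A \in \mathcal{B}$. Hence for $A \in \mathcal{B}$,
\[ \limsup_{i \to \infty} \mu_i(A) \le \limsup_{i \to \infty} \big(\mu(A_{\alpha_i}) + \alpha_i\big) = \lim_{i \to \infty} \mu(A_{\alpha_i}) = \mu(\overline{A}). \]
In particular, for any closed $C \subset X$, $\limsup_{i \to \infty} \mu_i(C) \le \mu(C)$, and therefore
$\mu_i \Rightarrow \mu$.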
Theorem 4.2. If $(X, d)$ is a separable metric space, then for any $\mu, \mu_1, \mu_2, \ldots \in
\mathcal{P}(X)$ one has
\[ \mu_i \Rightarrow \mu \quad \text{if and only if} \quad d_P(\mu_i, \mu) \to 0. \]
For the proof we need a lemma on existence of special coverings with small balls.
Lemma 4.3. Let $X$ be a separable metric space and $\mu$ be a finite Borel measure
on $X$. Then for each $\delta > 0$ there are countably many open (or closed) balls
$B_1, B_2, \ldots$ such that
$\bigcup_{i=1}^{\infty} B_i = X$,
radius of $B_i$ is $< \delta$ for all $i$,
$\mu(\partial B_i) = 0$ for all $i$.
Proof. Let $D$ be a countable dense set in $X$. Let $x \in D$. Let $S(x, r) := \{y \in
X : d(y, x) = r\}$. Observe that the boundary of the open or closed ball centered
at $x$ and with radius $r$ is contained in $S(x, r)$. Given $\delta > 0$, the collection
$\mathcal{S} := \{S(x, r) : \delta/2 < r < \delta\}$ is disjoint and therefore at most countably
many of its members have $\mu$-measure $> 0$. As the index set $(\delta/2, \delta)$ is uncountable, there exists
an $r \in (\delta/2, \delta)$ such that $\mu(S(x, r)) = 0$. In this way we find for each $x \in D$
an open (or closed) ball $B(x, r)$ centered at $x$ with radius $r \in (\delta/2, \delta)$ and
$\mu(\partial B(x, r)) = 0$. As $D$ is dense these balls cover $X$, and as $D$ is countable we
have countably many, say $B_1, B_2, \ldots$.
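Proof of Theorem 4.2. $\Leftarrow$) already done (Theorem 4.1(2)).
$\Rightarrow$) Let $\varepsilon > 0$. We want to show that there exists an $N$ such that for
every $i \ge N$ we have $d_P(\mu_i, \mu) \le \varepsilon$, for which it suffices that $\mu_i(B) \le \mu(B_\varepsilon) + \varepsilon$ and
$\mu(B) \le \mu_i(B_\varepsilon) + \varepsilon$ for all $B \in \mathcal{B}$.
Take $\delta > 0$ with $\delta < \varepsilon/3$ and take with aid of the previous lemma open balls
$B_1, B_2, \ldots$ with radius $< \delta/2$ such that $\bigcup_{j=1}^{\infty} B_j = X$ and $\mu(\partial B_j) = 0$ for all $j$.
Fix $k$ such that
\[ \mu\Big(\bigcup_{j=1}^{k} B_j\Big) \ge 1 - \delta. \]
Consider the collection of sets that can be built by combining the balls $B_1, \ldots, B_k$:
\[ \mathcal{A} := \Big\{\bigcup_{j \in J} B_j : J \subset \{1, \ldots, k\}\Big\}, \]
which is a finite collection. We are going to use this collection to approximate
arbitrary Borel sets. For each $A \in \mathcal{A}$, $\partial A \subset \partial B_1 \cup \cdots \cup \partial B_k$, so $\mu(\partial A) \le
\mu(\partial B_1) + \cdots + \mu(\partial B_k) = 0$. Since $\mu_i \Rightarrow \mu$, we have $\mu_i(A) \to \mu(A)$ for all $A \in \mathcal{A}$.
Fix $N$ such that
\[ |\mu_i(A) - \mu(A)| < \delta \quad \text{for all } i \ge N \text{ and for all } A \in \mathcal{A}. \]
Then in particular $\mu_i(\bigcup_{j=1}^{k} B_j) \ge \mu(\bigcup_{j=1}^{k} B_j) - \delta \ge 1 - 2\delta$ for all $i \ge N$.
Let now $B \in \mathcal{B}$ be given. Take as approximation of $B$ the set
\[ A := \bigcup \{B_j : j \in \{1, \ldots, k\} \text{ such that } B_j \cap B \ne \emptyset\} \in \mathcal{A}. \]
We find
$A \subset B_\delta = \{x : d(x, B) < \delta\}$ because the diameter of each $B_j$ is $< \delta$,
$B = [B \cap \bigcup_{j=1}^{k} B_j] \cup [B \cap (\bigcup_{j=1}^{k} B_j)^c] \subset A \cup (\bigcup_{j=1}^{k} B_j)^c$, because $B \cap \bigcup_{j=1}^{k} B_j = \bigcup_{j=1}^{k} (B \cap B_j) \subset A$,
$|\mu_i(A) - \mu(A)| < \delta$ for all $i \ge N$,
$\mu((\bigcup_{j=1}^{k} B_j)^c) \le \delta$, $\mu_i((\bigcup_{j=1}^{k} B_j)^c) \le 2\delta$ for all $i \ge N$.
Hence for every $i \ge N$:
\[ \mu(B) \le \mu(A) + \mu\Big(\Big(\bigcup_{j=1}^{k} B_j\Big)^c\Big) \le \mu(A) + \delta \le \mu_i(A) + 2\delta \le \mu_i(B_\delta) + 2\delta \le \mu_i(B_\varepsilon) + \varepsilon, \]
\[ \mu_i(B) \le \mu_i(A) + \mu_i\Big(\Big(\bigcup_{j=1}^{k} B_j\Big)^c\Big) \le \mu_i(A) + 2\delta \le \mu(A) + 3\delta \le \mu(B_\delta) + 3\delta \le \mu(B_\varepsilon) + \varepsilon. \]
This is true for every $B \in \mathcal{B}$, so $d_P(\mu_i, \mu) \le \varepsilon$ for all $i \ge N$.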
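Proposition 4.4. Let $(X, d)$ be a separable metric space. Then $\mathcal{P} = \mathcal{P}(X)$ with
the Prokhorov metric $d_P$ is separable.
Proof. Let $D := \{a_1, a_2, \ldots\}$ be a countable dense set in $X$. Let
\[ \mathcal{M} := \Big\{\alpha_1 \delta_{a_1} + \cdots + \alpha_k \delta_{a_k} : \alpha_1, \ldots, \alpha_k \in \mathbb{Q} \cap [0, 1],\ \sum_{j=1}^{k} \alpha_j = 1,\ k = 1, 2, \ldots\Big\}. \]
(Here $\delta_a$ denotes the Dirac measure at $a \in X$: $\delta_a(A) = 1$ if $a \in A$, $0$ otherwise.)
Clearly, $\mathcal{M} \subset \mathcal{P}$ and $\mathcal{M}$ is countable.
Claim: $\mathcal{M}$ is dense in $\mathcal{P}$. Indeed, let $\mu \in \mathcal{P}$. For each $m \ge 1$, $\bigcup_{j=1}^{\infty} B(a_j, 1/m) =
X$. Take $k_m$ such that
\[ \mu\Big(\bigcup_{j=1}^{k_m} B(a_j, 1/m)\Big) \ge 1 - 1/m. \]
Modify the balls $B(a_j, 1/m)$ into disjoint sets by taking $A_1^m := B(a_1, 1/m)$,
\[ A_j^m := B(a_j, 1/m) \setminus \Big[\bigcup_{i=1}^{j-1} B(a_i, 1/m)\Big], \quad j = 2, \ldots, k_m. \]
Then $A_1^m, \ldots, A_{k_m}^m$ are disjoint and $\bigcup_{i=1}^{j} A_i^m = \bigcup_{i=1}^{j} B(a_i, 1/m)$ for all $j$. In particular, $\mu(\bigcup_{j=1}^{k_m} A_j^m) \ge
1 - 1/m$, so
\[ \sum_{j=1}^{k_m} \mu(A_j^m) \in [1 - 1/m, 1]. \]
We approximate
\[ \mu(A_1^m) \delta_{a_1} + \cdots + \mu(A_{k_m}^m) \delta_{a_{k_m}} \]
by
\[ \mu_m := \alpha_1^m \delta_{a_1} + \cdots + \alpha_{k_m}^m \delta_{a_{k_m}}, \]
where we choose $\alpha_j^m \in [0, 1] \cap \mathbb{Q}$ such that $\sum_{j=1}^{k_m} \alpha_j^m = 1$ and
\[ \sum_{j=1}^{k_m} |\mu(A_j^m) - \alpha_j^m| < 2/m. \]
(First take $\beta_j \in [0, 1] \cap \mathbb{Q}$ with $\sum_{j=1}^{k_m} |\beta_j - \mu(A_j^m)| < 1/2m$, then $\sum_j \beta_j \in
[1 - 3/2m, 1 + 1/2m]$. Take $\alpha_j := \beta_j / \sum_i \beta_i \in [0, 1] \cap \mathbb{Q}$, then $\sum_j \alpha_j = 1$ and
$\sum_{j=1}^{k_m} |\beta_j - \alpha_j| = |1 - 1/\sum_i \beta_i| \sum_{j=1}^{k_m} \beta_j = |\sum_j \beta_j - 1| \le 3/2m$, so
$\sum_{j=1}^{k_m} |\alpha_j - \mu(A_j^m)| < 1/2m + 3/2m = 2/m$.)
Then for each $m$, $\mu_m \in \mathcal{M}$. To show: $\mu_m \Rightarrow \mu$. Let $g \in UC_b(X)$. Then
\[ \begin{aligned}
\Big|\int g \, d\mu_m - \int g \, d\mu\Big| &= \Big|\sum_{j=1}^{k_m} \alpha_j^m g(a_j) - \int g \, d\mu\Big| \\
&\le \Big|\sum_{j=1}^{k_m} \mu(A_j^m) g(a_j) - \int g \, d\mu\Big| + (2/m) \sup_j |g(a_j)| \\
&\le \Big|\sum_{j=1}^{k_m} g(a_j) \int \mathbf{1}_{A_j^m} \, d\mu - \int g \, d\mu\Big| + (2/m) \|g\|_\infty \\
&\le \Big|\sum_{j=1}^{k_m} \int_{A_j^m} \big(g(a_j) - g\big) \, d\mu\Big| + \Big|\int_{(\bigcup_{j=1}^{k_m} A_j^m)^c} g \, d\mu\Big| + (2/m) \|g\|_\infty \\
&\le \sum_{j=1}^{k_m} \sup_{x \in A_j^m} |g(a_j) - g(x)| \, \mu(A_j^m) + \|g\|_\infty \, \mu\Big(\Big(\bigcup_{j=1}^{k_m} A_j^m\Big)^c\Big) + (2/m) \|g\|_\infty.
\end{aligned} \]
Each $A_j^m$ is contained in a ball with radius $1/m$ around $a_j$. Since $g$ is uniformly
continuous, for every $\varepsilon > 0$ there is a $\delta > 0$ such that $|g(y) - g(x)| < \varepsilon$ whenever
$d(x, y) < \delta$, so for $m > 1/\delta$ we have $|g(x) - g(a_j)| < \varepsilon$ for all $x \in A_j^m$ for all $j$. Then for $m > 1/\delta$ it
follows from the above computation that
\[ \Big|\int g \, d\mu_m - \int g \, d\mu\Big| \le \varepsilon + \|g\|_\infty (1/m) + (2/m) \|g\|_\infty. \]
Hence $\int g \, d\mu_m \to \int g \, d\mu$ as $m \to \infty$. Thus, $\mu_m \Rightarrow \mu$.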
Conclusion. If (X, d) is a separable metric space, then so is P(X) with the

induced Prokhorov metric. Moreover, a sequence in P(X) converges in metric
if and only if it converges weakly and to the same limit.
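The density argument can be tried numerically in a simple case. The sketch below assumes $X = [0, 1]$ with $\mu$ the uniform (Lebesgue) measure, which is an illustrative choice and not part of the notes. It uses the cells $((j-1)/m, j/m]$ in the role of the disjoint sets $A_j^m$, their midpoints as atoms $a_j$, and the rational weights $1/m$. It then compares $\int g \, d\mu_m$ with $\int g \, d\mu$ for the uniformly continuous function $g = \cos$. All function names are hypothetical.

```python
import math


def dirac_approximation(m):
    """Atoms and weights of mu_m = sum_j (1/m) * delta_{a_j} on [0, 1]."""
    atoms = [(j - 0.5) / m for j in range(1, m + 1)]  # midpoints of the cells
    weights = [1.0 / m] * m                           # mu(cell) = 1/m, rational
    return atoms, weights


def integrate(g, atoms, weights):
    """Integral of g against a finite combination of Dirac measures."""
    return sum(w * g(a) for a, w in zip(atoms, weights))


def error(m):
    """|int g dmu_m - int g dmu| for g = cos and mu uniform on [0, 1]."""
    atoms, weights = dirac_approximation(m)
    exact = math.sin(1.0)  # integral of cos over [0, 1]
    return abs(integrate(math.cos, atoms, weights) - exact)


if __name__ == "__main__":
    for m in (1, 10, 100, 1000):
        print(m, error(m))
```

The printed errors shrink as $m$ grows, in line with $\mu_m \Rightarrow \mu$. In this smooth example the decay is faster than the bound in the proof, which only uses uniform continuity of $g$.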
Prokhorov's theorem
Let $(X, d)$ be a metric space and let $\mathcal{P}(X)$ be the set of Borel probability measures on $X$. Endow $\mathcal{P}(X)$ with the Prokhorov metric induced by $d$.
In the study of limit behavior of stochastic processes one often needs to
know when a sequence of random variables is convergent in distribution or, at
least, has a subsequence that converges in distribution. This comes down to
finding a good description of the sequences in $\mathcal{P}(X)$ that have a convergent
subsequence or rather of the relatively compact sets of $\mathcal{P}(X)$. Recall that a
subset $S$ of a metric space is called relatively compact if its closure $\overline{S}$ is compact.
The following theorem by Yu.V. Prokhorov [7] gives a useful description of the
relatively compact sets of $\mathcal{P}(X)$ in case $X$ is separable and complete. Let us
first attach a name to the equivalent condition.
Definition 5.1. A set $\Gamma$ of Borel probability measures on $X$ is called tight if
for every $\varepsilon > 0$ there exists a compact subset $K$ of $X$ such that
\[ \mu(K) \ge 1 - \varepsilon \quad \text{for all } \mu \in \Gamma. \]
(Also other names and phrases are in use instead of "$\Gamma$ is tight": "$\Gamma$ is uniformly
tight", "$\Gamma$ satisfies Prokhorov's condition", "$\Gamma$ is uniformly Radon", and maybe
more.)
Remark. We have shown already: if $(X, d)$ is a complete separable metric space,
then $\{\mu\}$ is tight for each $\mu \in \mathcal{P}(X)$ (see Theorem 2.6).
Theorem 5.2 (Prokhorov). Let $(X, d)$ be a complete separable metric space
and let $\Gamma$ be a subset of $\mathcal{P}(X)$. Then the following two statements are equivalent:
(a) $\overline{\Gamma}$ is compact in $\mathcal{P}(X)$.
(b) $\Gamma$ is tight.
Let us first remark here that completeness of $X$ is not needed for the implication
(b)$\Rightarrow$(a). The proof of the theorem is quite involved. We start with the more
straightforward implication (a)$\Rightarrow$(b).
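Proof of (a)$\Rightarrow$(b). Claim: If $U_1, U_2, \ldots$ are open sets in $X$ that cover $X$ and if
$\varepsilon > 0$, then there exists a $k \ge 1$ such that
\[ \mu\Big(\bigcup_{i=1}^{k} U_i\Big) > 1 - \varepsilon \quad \text{for all } \mu \in \Gamma. \]
To prove the claim by contradiction, suppose that for every $k \ge 1$ there is a
$\mu_k \in \Gamma$ with $\mu_k(\bigcup_{i=1}^{k} U_i) \le 1 - \varepsilon$. As $\overline{\Gamma}$ is compact, there is a $\mu \in \overline{\Gamma}$ and a
subsequence with $\mu_{k_j} \Rightarrow \mu$. For any $n \ge 1$, $\bigcup_{i=1}^{n} U_i$ is open, so
\[ \mu\Big(\bigcup_{i=1}^{n} U_i\Big) \le \liminf_{j \to \infty} \mu_{k_j}\Big(\bigcup_{i=1}^{n} U_i\Big) \le \liminf_{j \to \infty} \mu_{k_j}\Big(\bigcup_{i=1}^{k_j} U_i\Big) \le 1 - \varepsilon. \]
But $\bigcup_{i=1}^{\infty} U_i = X$, so $\mu(\bigcup_{i=1}^{n} U_i) \to \mu(X) = 1$ as $n \to \infty$, which is a contradiction. Thus the claim is proved.
Now let $\varepsilon > 0$ be given. Take $D = \{a_1, a_2, \ldots\}$ dense in $X$. For every $m \ge 1$
the open balls $B(a_i, 1/m)$, $i = 1, 2, \ldots$, cover $X$, so by the claim there is a $k_m$
such that
\[ \mu\Big(\bigcup_{i=1}^{k_m} B(a_i, 1/m)\Big) > 1 - 2^{-m}\varepsilon \quad \text{for all } \mu \in \Gamma. \]
Take
\[ K := \bigcap_{m=1}^{\infty} \bigcup_{i=1}^{k_m} \overline{B}(a_i, 1/m), \]
where $\overline{B}(a, r)$ denotes the closed ball with center $a$ and radius $r$.
Then $K$ is closed and for each $\delta > 0$ we can take $m > 1/\delta$ and obtain $K \subset
\bigcup_{i=1}^{k_m} B(a_i, \delta)$, so that $K$ is totally bounded. Hence $K$ is compact, since $X$ is
complete. Moreover, for each $\mu \in \Gamma$
\[ \mu(X \setminus K) = \mu\Big(\bigcup_{m=1}^{\infty} \Big[\bigcup_{i=1}^{k_m} \overline{B}(a_i, 1/m)\Big]^c\Big) \le \sum_{m=1}^{\infty} \mu\Big(\Big[\bigcup_{i=1}^{k_m} \overline{B}(a_i, 1/m)\Big]^c\Big) \]
\[ = \sum_{m=1}^{\infty} \Big(1 - \mu\Big(\bigcup_{i=1}^{k_m} \overline{B}(a_i, 1/m)\Big)\Big) < \sum_{m=1}^{\infty} 2^{-m}\varepsilon = \varepsilon. \]
Hence $\Gamma$ is tight.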
The proof that condition (b) implies (a) is more difficult. We will follow the
proof from [6], which is based on compactifications and the Riesz representation
theorem. The latter will be discussed in a later section and, accordingly, we
invoke it with almost no explanation here.
Observe that if $X$ is a compact metric space, every set of Borel probability
measures on $X$ is tight, so that in particular $\mathcal{P}(X)$ itself is tight. Thus, the
implication (b)$\Rightarrow$(a) entails the assertion that $\mathcal{P}(X)$ is compact whenever $X$ is
compact. We choose the latter as an important intermediate step in the proof
of (b)$\Rightarrow$(a).
Proposition 5.3. If $(X, d)$ is a compact metric space, then $(\mathcal{P}(X), d_P)$ is a
compact metric space. (Note that any compact metric space is separable.)
Proof. (Revisited in Corollary 6.8.) As $X$ is compact, $C_b(X) = C(X) = \{f :
X \to \mathbb{R} : f$ is continuous$\}$, which is a Banach space under the supremum norm
defined by
\[ \|f\|_\infty = \sup_{x \in X} |f(x)|. \]
Denote by $C(X)'$ the Banach dual space of $C(X)$ and consider
\[ \Phi := \{\varphi \in C(X)' : \|\varphi\| \le 1,\ \varphi(\mathbf{1}) = 1,\ \varphi(f) \ge 0 \ \forall f \in C(X) \text{ with } f \ge 0\}. \]
For $\mu \in \mathcal{P}(X)$ define $\varphi_\mu(f) := \int f \, d\mu$, $f \in C(X)$. According to the Riesz
representation theorem, the map $T : \mu \mapsto \varphi_\mu$ is a bijection from $\mathcal{P}(X)$ onto $\Phi$.
Moreover, $T$ is a sequential homeomorphism relative to the weak* topology on
$\Phi$. By Alaoglu's theorem, $B' := \{\varphi \in C(X)' : \|\varphi\| \le 1\}$ is weak* compact and
therefore $\Phi$ is weak* compact, since $\Phi$ is weak* closed in $B'$. Hence $\Phi$ is weak*
sequentially compact and hence $\mathcal{P}(X)$ is compact.
Remark. Also the converse is true: if $\mathcal{P}(X)$ is compact then so is $X$. This comes
from the fact that $x \mapsto \delta_x$ is a homeomorphism from $X$ onto $\{\delta_x : x \in X\} \subset
\mathcal{P}(X)$, and $\{\delta_x : x \in X\}$ is closed in $\mathcal{P}(X)$. (See Proposition 9.3.)
In the cases that we want to consider, $X$ is typically not compact. We can make
use of the previous proposition by considering a compactification of $X$.
Lemma 5.4. If $(X, d)$ is a separable metric space, then there exist a compact
metric space $(Y, \rho)$ and a map $T : X \to Y$ such that $T$ is a homeomorphism
from $X$ onto $T(X)$.
($T$ is in general not an isometry. If it were, then $X$ complete $\Rightarrow$ $T(X)$ complete
$\Rightarrow$ $T(X) \subset Y$ closed $\Rightarrow$ $T(X)$ compact, which is not true for, e.g., $X = \mathbb{R}$.)
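Proof. Let $Y := [0, 1]^{\mathbb{N}} = \{(\xi_i)_{i=1}^{\infty} : \xi_i \in [0, 1] \ \forall i\}$ and
\[ \rho(\xi, \eta) := \sum_{i=1}^{\infty} 2^{-i} |\xi_i - \eta_i|, \quad \xi, \eta \in Y. \]
Then $\rho$ is a metric on $Y$, its topology is the topology of coordinatewise convergence, and $(Y, \rho)$ is compact.
Let $D = \{a_1, a_2, \ldots\}$ be dense in $X$ and define
\[ \alpha_i(x) := \min\{d(x, a_i), 1\}, \quad x \in X,\ i = 1, 2, \ldots. \]
Then for each $k$, $\alpha_k : X \to [0, 1]$ is continuous. For $x \in X$ define
\[ T(x) := (\alpha_i(x))_{i=1}^{\infty} \in Y. \]
Claim: for any $C \subset X$ closed and $x \notin C$ there exist $\varepsilon > 0$ and $i$ such that
\[ \alpha_i(x) \le \varepsilon/3, \qquad \alpha_i(y) \ge 2\varepsilon/3 \text{ for all } y \in C. \]
To prove the claim, take $\varepsilon := \min\{d(x, C), 1\} \in (0, 1]$. Take $i$ such that
$d(a_i, x) < \varepsilon/3$. Then $\alpha_i(x) \le \varepsilon/3$ and for $y \in C$ we have
\[ \alpha_i(y) = \min\{d(y, a_i), 1\} \ge \min\{d(y, x) - d(x, a_i), 1\} \ge \min\{d(x, C) - \varepsilon/3, 1\} \ge \min\{2\varepsilon/3, 1\} = 2\varepsilon/3. \]
In particular, if $x \ne y$ then there exists an $i$ such that $\alpha_i(x) \ne \alpha_i(y)$, so $T$ is
injective. Hence $T : X \to T(X)$ is a bijection. It remains to show that for $(x_n)_n$
and $x$ in $X$:
\[ x_n \to x \iff T(x_n) \to T(x). \]
If $x_n \to x$, then $\alpha_i(x_n) \to \alpha_i(x)$ for all $i$, so $\rho(T(x_n), T(x)) \to 0$ as $n \to \infty$.
Conversely, suppose that $x_n \not\to x$. Then there is a subsequence such that
$x \notin \overline{\{x_{n_1}, x_{n_2}, \ldots\}}$. Then by the claim there are $\varepsilon > 0$ and $i$ such that $\alpha_i(x) \le \varepsilon/3$
and $\alpha_i(x_{n_k}) \ge 2\varepsilon/3$ for all $k$, so that $\alpha_i(x_{n_k}) \not\to \alpha_i(x)$ as $k \to \infty$ and hence
$T(x_{n_k}) \not\to T(x)$.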
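The construction can be explored numerically. The sketch below assumes $X = \mathbb{R}$ and, as a purely illustrative stand-in for a dense sequence, keeps only finitely many centers $a_i$, so it computes a truncation of $T(x)$ and of $\rho$. The names `embed`, `rho` and `centers` are hypothetical.

```python
def embed(x, centers):
    """First coordinates of T(x) = (min{d(x, a_i), 1})_i for X = R."""
    return [min(abs(x - a), 1.0) for a in centers]


def rho(xi, eta):
    """Truncation of rho(xi, eta) = sum_i 2^{-i} |xi_i - eta_i| (i starts at 1)."""
    return sum(abs(p - q) / 2 ** (i + 1) for i, (p, q) in enumerate(zip(xi, eta)))


# Finite piece of a dense set: the grid k/4 in [-10, 10] (an assumption).
centers = [k / 4 for k in range(-40, 41)]

near = rho(embed(0.0, centers), embed(0.001, centers))   # points at distance 0.001
far = rho(embed(0.0, centers), embed(100.0, centers))    # points at distance 100

if __name__ == "__main__":
    print("rho for d = 0.001:", near)
    print("rho for d = 100:  ", far)
```

Each coordinate map $\alpha_i$ is 1-Lipschitz, so nearby points have nearby images, while points at distance $100$ are still at $\rho$-distance at most $1$. This is the numerical face of the remark that $T$ is a homeomorphism onto its image but not an isometry.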
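We can now complete the proof of Prokhorov's theorem.
Proof of (b)$\Rightarrow$(a). We will show more: If $(X, d)$ is a separable metric space and
$\Gamma \subset \mathcal{P}(X)$ is tight, then $\overline{\Gamma}$ is compact. Let $\Gamma \subset \mathcal{P}(X)$ be tight. First observe
that $\overline{\Gamma}$ is tight as well. Indeed, let $\varepsilon > 0$ and let $K$ be a compact subset of $X$ such
that $\mu(K) \ge 1 - \varepsilon$ for all $\mu \in \Gamma$. Then for every $\mu \in \overline{\Gamma}$ there is a sequence $(\mu_n)_n$
in $\Gamma$ that converges to $\mu$ and then we have $\mu(K) \ge \limsup_{n \to \infty} \mu_n(K) \ge 1 - \varepsilon$.
Let $(\mu_n)_n$ be a sequence in $\overline{\Gamma}$. We have to show that it has a convergent
subsequence. Let $(Y, \rho)$ be a compact metric space and $T : X \to Y$ be such that
$T$ is a homeomorphism from $X$ onto $T(X)$. For $B \in \mathcal{B}(Y)$, $T^{-1}(B)$ is Borel in
$X$. Define
\[ \nu_n(B) := \mu_n(T^{-1}(B)), \quad B \in \mathcal{B}(Y),\ n = 1, 2, \ldots. \]
Then $\nu_n \in \mathcal{P}(Y)$ for all $n$. As $Y$ is a compact metric space, $\mathcal{P}(Y)$ is a compact
metric space, hence there is a $\nu \in \mathcal{P}(Y)$ and a subsequence such that $\nu_{n_k} \Rightarrow \nu$
in $\mathcal{P}(Y)$. We want to translate $\nu$ back to a measure on $X$. Set $Y_0 := T(X)$.
Claim: $\nu$ is concentrated on $Y_0$ in the sense that there exists a set $E \in \mathcal{B}(Y)$
with $E \subset Y_0$ and $\nu(E) = 1$.
If we assume the claim, define
\[ \nu_0(A) := \nu(A \cap E), \quad A \in \mathcal{B}(Y_0). \]
(Note: $A \in \mathcal{B}(Y_0) \Rightarrow A \cap E$ Borel in $E$ $\Rightarrow A \cap E$ Borel in $Y$, since $E$ is
a Borel subset of $Y$.) The measure $\nu_0$ is a finite Borel measure on $Y_0$ and
$\nu_0(E) = \nu(E) = 1$. Now we can translate $\nu_0$ back to
\[ \mu(A) := \nu_0(T(A)) = \nu_0((T^{-1})^{-1}(A)), \quad A \in \mathcal{B}(X). \]
Then $\mu \in \mathcal{P}(X)$. We want to show that $\mu_{n_k} \Rightarrow \mu$ in $\mathcal{P}(X)$. Let $C$ be closed
in $X$. Then $T(C)$ is closed in $T(X) = Y_0$. ($T(C)$ need not be closed in $Y$.)
Therefore there exists $Z \subset Y$ closed with $Z \cap Y_0 = T(C)$. Then $C = \{x \in X :
T(x) \in T(C)\} = \{x \in X : T(x) \in Z\} = T^{-1}(Z)$, because $T$ takes no values
outside $Y_0$, and $Z \cap E = T(C) \cap E$. Hence
\[ \limsup_{k \to \infty} \mu_{n_k}(C) = \limsup_{k \to \infty} \nu_{n_k}(Z) \le \nu(Z) = \nu(Z \cap E) + \nu(Z \cap E^c) = \nu(T(C) \cap E) + 0 = \nu_0(T(C)) = \mu(C). \]
So $\mu_{n_k} \Rightarrow \mu$.
Finally, to prove the claim we use tightness of $\overline{\Gamma}$. For each $m \ge 1$ take $K_m$
compact in $X$ such that $\mu(K_m) \ge 1 - 1/m$ for all $\mu \in \overline{\Gamma}$. Then $T(K_m)$ is a
compact subset of $Y$ hence closed in $Y$, so
\[ \nu(T(K_m)) \ge \limsup_{k \to \infty} \nu_{n_k}(T(K_m)) = \limsup_{k \to \infty} \mu_{n_k}(K_m) \ge 1 - 1/m. \]
Take $E := \bigcup_{m=1}^{\infty} T(K_m)$. Then $E \in \mathcal{B}(Y)$, $E \subset Y_0$, and $\nu(E) \ge \nu(T(K_m))$ for all $m$, so
$\nu(E) = 1$.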
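Example. Let $X = \mathbb{R}$, $\mu_n(A) := n^{-1} \lambda(A \cap [0, n])$, $A \in \mathcal{B}(\mathbb{R})$. Here $\lambda$ denotes
Lebesgue measure on $\mathbb{R}$. Then $\mu_n \in \mathcal{P}(\mathbb{R})$ for all $n$. The sequence $(\mu_n)_n$ has no
convergent subsequence. Indeed, suppose $\mu_{n_k} \Rightarrow \mu$, then
\[ \mu((-N, N)) \le \liminf_{k \to \infty} \mu_{n_k}((-N, N)) = \liminf_{k \to \infty} n_k^{-1} \lambda([0, N)) = \liminf_{k \to \infty} N/n_k = 0, \]
so $\mu(\mathbb{R}) = \sup_{N \ge 1} \mu((-N, N)) = 0$. There is leaking mass to infinity; the set
$\{\mu_n : n = 1, 2, \ldots\}$ is not tight.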
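The leaking mass in this example is easy to tabulate. The following Python sketch evaluates $\mu_n((a, b))$ exactly for integer endpoints, using rational arithmetic; the function name `mu_n` is a hypothetical helper, not notation from the notes.

```python
from fractions import Fraction


def mu_n(n, a, b):
    """mu_n((a, b)) = (1/n) * Lebesgue((a, b) intersected with [0, n]).

    Assumes integers a < b and n >= 1, so the result is an exact fraction.
    """
    left, right = max(a, 0), min(b, n)
    return Fraction(max(right - left, 0), n)


if __name__ == "__main__":
    N = 10
    for n in (5, 10, 100, 1000):
        # Mass that mu_n gives to the bounded interval (-N, N).
        print(n, mu_n(n, -N, N))
```

For fixed $N$ the mass of $(-N, N)$ equals $\min\{N, n\}/n$, which is $1$ for $n \le N$ and tends to $0$ as $n \to \infty$. Since every compact subset of $\mathbb{R}$ lies in some $(-N, N)$, no single compact set can carry mass $\ge 1 - \varepsilon$ under all $\mu_n$ once $\varepsilon < 1$.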
Riesz representation theorem
In the proof of Prokhorov's theorem we have used the Riesz representation
theorem. It yields a correspondence between functionals on a space of continuous
functions and measures on the underlying set. The standard theorem deals with
compact spaces and will be discussed in this section. The next section derives
via compactification an extension for non-compact spaces.
Let $(X, d)$ be a metric space. For each finite Borel measure $\mu$ on $X$, the map
$\varphi_\mu$ defined by
\[ \varphi_\mu(f) := \int f \, d\mu, \quad f \in C_b(X), \]
is linear from $C_b(X)$ to $\mathbb{R}$ and
\[ |\varphi_\mu(f)| \le \int |f| \, d\mu \le \|f\|_\infty \, \mu(X). \]
Hence $\varphi_\mu \in C_b(X)'$, where $C_b(X)'$ denotes the Banach dual space of the Banach
space $(C_b(X), \|\cdot\|_\infty)$. (Here $\|f\|_\infty = \sup_{x \in X} |f(x)|$.) Further, $\|\varphi_\mu\| \le \mu(X)$
and since $\varphi_\mu(\mathbf{1}) = \mu(X) = \|\mathbf{1}\|_\infty \, \mu(X)$ we have
\[ \|\varphi_\mu\| = \mu(X). \]
Moreover,
\[ f \ge 0 \implies \varphi_\mu(f) \ge 0. \]
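Definition 6.1. A linear map $\varphi : C_b(X) \to \mathbb{R}$ is called positive if
\[ \varphi(f) \ge 0 \quad \text{for all } f \in C_b(X) \text{ with } f \ge 0. \]
(Then $f \le g \Rightarrow \varphi(f) \le \varphi(g)$.)
Lemma 6.2. For every positive $\varphi \in C_b(X)'$ one has
\[ \|\varphi\| = \varphi(\mathbf{1}). \]
Proof. Clearly, $\varphi(\mathbf{1}) \le \|\varphi\| \, \|\mathbf{1}\|_\infty = \|\varphi\|$. For $f \in C_b(X)$,
\[ -\|f\|_\infty \mathbf{1} \le f \le \|f\|_\infty \mathbf{1}, \]
so
\[ -\|f\|_\infty \, \varphi(\mathbf{1}) \le \varphi(f) \le \|f\|_\infty \, \varphi(\mathbf{1}), \]
so
\[ |\varphi(f)| \le \varphi(\mathbf{1}) \, \|f\|_\infty, \]
hence $\|\varphi\| \le \varphi(\mathbf{1})$.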
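On a finite set $X = \{1, \ldots, n\}$ the lemma can be checked by hand or by machine: $C_b(X) = \mathbb{R}^n$ with the maximum norm, and a linear functional is given by a weight vector $w$ via $\varphi(f) = \sum_i w_i f_i$. The Python sketch below (hypothetical names, finite $X$ assumed) computes the operator norm by maximizing over sign vectors, which suffices because $|\varphi|$ is convex and therefore attains its maximum over the unit cube at a vertex.

```python
from itertools import product


def functional(weights):
    """phi(f) = sum_i w_i * f_i on R^n, viewed as C_b of an n-point space."""
    return lambda f: sum(w * v for w, v in zip(weights, f))


def operator_norm(phi, size):
    """sup of |phi(f)| over ||f||_inf <= 1, attained at a vector of signs."""
    return max(abs(phi(signs)) for signs in product((-1.0, 1.0), repeat=size))


positive_weights = [0.5, 0.25, 0.25, 1.0]   # phi is positive
signed_weights = [0.5, -0.25]               # phi is not positive

phi_pos = functional(positive_weights)
phi_signed = functional(signed_weights)
ones_4 = [1.0] * 4
ones_2 = [1.0] * 2

if __name__ == "__main__":
    print(operator_norm(phi_pos, 4), phi_pos(ones_4))        # equal
    print(operator_norm(phi_signed, 2), phi_signed(ones_2))  # norm is larger
```

For the positive functional the norm and $\varphi(\mathbf{1})$ coincide, as the lemma states. For the signed weights the norm is $\sum_i |w_i|$, which exceeds $\varphi(\mathbf{1})$, so positivity cannot be dropped.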
If $X$ is compact, then $C_b(X) = C(X) = \{f : X \to \mathbb{R} : f$ is continuous$\}$ and
every positive bounded linear functional on $C(X)$ is represented by a finite
Borel measure on $X$. The truth of this statement does not depend on $X$ being
a metric space. In the extension to the non-compact case that we will discuss
in the next section we need the generality of non-metrizable compact Hausdorff
spaces. Formally we have not defined Borel sets, Borel measures, $C_b(X)$, etc.
for topological spaces that are not metrizable. The appropriate definitions are
literally the same and omitted.
Theorem 6.3 (Riesz representation theorem). If $X$ is a compact Hausdorff space and $\varphi \in C(X)'$ is positive and $\|\varphi\| = 1$, then there exists a unique
Borel probability measure $\mu$ on $X$ such that
\[ \varphi(f) = \int f \, d\mu \quad \text{for all } f \in C(X). \]
(See [8, Theorem 2.14, p. 40].)

By obvious scaling, the Riesz representation theorem can be extended to a
correspondence between not necessarily normalized positive bounded functionals
on $C(X)$ and finite Borel measures on $X$. More than that, there is also a
correspondence of topologies.
Consider the weak* topology on $C_b(X)'$, which is the coarsest topology such
that the functional $\varphi \mapsto \varphi(f)$ on $C_b(X)'$ is continuous for every $f \in C_b(X)$. A
sequence $\varphi_1, \varphi_2, \ldots$ in $C_b(X)'$ converges in the weak* topology to $\varphi \in C_b(X)'$
if and only if $\varphi_n(f) \to \varphi(f)$ for all $f \in C_b(X)$. The following observation is
immediate.
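Proposition 6.4. Let $(X, d)$ be a compact metric space and let $\mu, \mu_1, \mu_2, \ldots$ be
finite Borel measures on $X$. Then the following two statements are equivalent:
(a) $\mu_n \Rightarrow \mu$, that is, $\int f \, d\mu_n \to \int f \, d\mu$ for all $f \in C_b(X)$.
(b) $\varphi_{\mu_n} \to \varphi_\mu$ in the weak* topology, that is, $\varphi_{\mu_n}(f) \to \varphi_\mu(f)$ for all $f \in
C_b(X)$.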
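With a suitable notion of not necessarily positive measure, the representation by a measure
can be extended to every member of $C_b(X)'$. We include the statements without
proofs.
Definition 6.5. A signed Borel measure on a metric space $(X, d)$ is a map
$\mu : \mathcal{B}(X) \to \mathbb{R}$ of the form
\[ \mu = \mu_1 - \mu_2 \]
where $\mu_1$ and $\mu_2$ are finite Borel measures on $X$. This is equivalent to
$\mu(\emptyset) = 0$,
$\mu$ is $\sigma$-additive, i.e., $A_1, A_2, \ldots \in \mathcal{B}(X)$ disjoint $\Longrightarrow$ $\mu(\bigcup_{i=1}^{\infty} A_i) = \sum_{i=1}^{\infty} \mu(A_i)$,
$\sup_{A \in \mathcal{B}(X)} |\mu(A)| < \infty$.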
Theorem 6.6. Let $(X, d)$ be a compact metric space. For a finite Borel measure
$\mu$ on $X$ let $\varphi_\mu(f) := \int f \, d\mu$, $f \in C(X)$, and let $T$ be the map $\mu \mapsto \varphi_\mu$. Then
(1) $T(\mu + \nu) = T(\mu) + T(\nu)$ and $T(c\mu) = cT(\mu)$ for all finite Borel measures
$\mu$ and $\nu$ on $X$ and all $c \ge 0$,
(2) $T$ is a sequential homeomorphism from $\{\mu : \mu$ finite Borel measure on $X\}$
onto $\{\varphi \in C(X)' : \varphi$ positive$\}$ with the weak* topology, and $T(\mathcal{P}(X)) =
\{\varphi \in C_b(X)' : \|\varphi\| = 1,\ \varphi$ positive$\}$,
(3) $T$ extends uniquely to a linear sequential homeomorphism from the signed
Borel measures on $X$ onto $C(X)'$ with the weak* topology.
Remark. One can show that $C(X)$ is separable if $X$ is compact and metrizable
([5, 7.S(d), p. 245]) and one can derive from the separability of $C(X)$ that $\{\varphi \in
C(X)' : \|\varphi\| \le 1\}$ is metrizable ([3, Theorem V.5.1, p. 426]). Therefore $T$ in the
above theorem is a homeomorphism and not only a sequential homeomorphism.
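We are now in a position to have a closer look at the proof of Proposition 5.3.
Theorem 6.7. Let $(X, d)$ be a metric space. Then
(1) $B' := \{\varphi \in C_b(X)' : \|\varphi\| \le 1\}$ is weak* compact (Alaoglu's theorem, see
[3, Theorem V.4.2, p. 424]),
(2) $\{\varphi \in C_b(X)' : \|\varphi\| = 1,\ \varphi$ is positive$\}$ is weak* closed in $B'$.
Proof of (2). For positive $\varphi \in B'$, we have $\|\varphi\| = 1 \iff \varphi(\mathbf{1}) = 1$. Hence
\[ \begin{aligned}
\{\varphi \in C_b(X)' : \|\varphi\| = 1,\ \varphi \text{ is positive}\}
&= \{\varphi \in C_b(X)' : \|\varphi\| \le 1,\ \varphi(\mathbf{1}) = 1,\ \varphi(f) \ge 0 \ \forall f \in C_b(X) \text{ s.t. } f \ge 0\} \\
&= \{\varphi \in B' : \varphi(\mathbf{1}) = 1\} \cap \bigcap_{f \in C_b(X),\ f \ge 0} \{\varphi \in B' : \varphi(f) \ge 0\}.
\end{aligned} \]
Since $\varphi \mapsto \varphi(f)$ is weak* continuous for all $f \in C_b(X)$, this set is weak* closed
in $B'$.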
Corollary 6.8. If $(X, d)$ is a compact metric space, then $(\mathcal{P}(X), d_P)$ is a compact metric space.
Proof. The map $T : \mathcal{P}(X) \to \{\varphi \in C(X)' : \|\varphi\| = 1,\ \varphi$ positive$\} =: \Phi$ is a
sequential homeomorphism with respect to the weak* topology on $\Phi$. By the
previous theorem, $\Phi$ is weak* compact, hence sequentially weak* compact. So
$\mathcal{P}(X)$ is sequentially compact. As $\mathcal{P}(X)$ is a metric space, $\mathcal{P}(X)$ is compact.
Riesz representation for non-compact spaces
As we are mainly interested in metric spaces that are not compact, it is natural
for us to study an extension of the Riesz representation theorem to non-compact
spaces. Such an extension can be obtained by means of a compactification of
the space.
The compactification of Lemma 5.4 has the advantage of being metrizable,
but it is not suitable for the present purposes. We have to step outside metric
topology for a moment. We want a connection between the continuous functions
on the compactification and the bounded continuous functions on the original
space. Such a compactification is the famous Stone-Čech compactification.
Theorem 7.1. Let $(X, d)$ be a metric space. There exists a compact Hausdorff space $Y$ and a map $T : X \to Y$ such that
(i) $T$ is a homeomorphism from $X$ onto $T(X)$,
(ii) $T(X)$ is dense in $Y$,
(iii) for every $f \in C_b(X)$ there exists one and only one $g \in C(Y)$ that extends $f$, that is, $g \circ T = f$.
The pair $(Y, T)$ of the above theorem is essentially unique and called the Stone-Čech compactification of $X$ (see [5, 5.24, p. 152–153; 5.P, p. 166]). We will not be unnecessarily cautious, and view $X$ as a subspace of $Y$. Then the above theorem says that every metric space $X$ is a dense subspace of a compact Hausdorff space $Y$ such that $C_b(X) \cong C(Y)$ under the natural isomorphism of extension and restriction. From the Riesz representation theorem for compact Hausdorff spaces we thus have the next conclusion.
Corollary 7.2. Let $(X, d)$ be a metric space. If $\varphi : C_b(X) \to \mathbb{R}$ is bounded linear and positive, then there exists a unique finite Borel measure $\mu$ on the Stone-Čech compactification $Y$ of $X$ such that
\[ \varphi(f) = \int \overline{f} \, d\mu \quad \text{for all } f \in C_b(X), \]
where $\overline{f} \in C(Y)$ denotes the extension of $f$.

Thus the positive bounded linear functionals on $C_b(X)$ correspond to the finite Borel measures on the Stone-Čech compactification of $X$. It is interesting to know when such a measure is concentrated on $X$ itself. It turns out to be connected with a stronger continuity property of the functional than mere norm continuity. The precise statement is in the next theorem, which is an extension of the Riesz representation theorem for compact spaces (cf. [2, 5.2 Proposition 5, p. 58, and 5.6 Proposition 12, p. 65]). For theory on convergence of nets (generalized sequences), see [3, I.7, p. 26–31].
Theorem 7.3. Let $(X, d)$ be a metric space and let $\varphi \in C_b(X)'$ be positive. The following statements are equivalent:
(a) There exists a tight finite Borel measure $\mu$ on $X$ such that
\[ \varphi(f) = \int f \, d\mu \quad \text{for all } f \in C_b(X). \]
(b) For every $\varepsilon > 0$ there exists a compact $K \subset X$ such that $|\varphi(f)| \le \varepsilon$ for all $f \in C_b(X)$ with $\|f\|_\infty \le 1$ and $f = 0$ on $K$.
(c) The restriction of $\varphi$ to the unit ball $B = \{f \in C_b(X) : \|f\|_\infty \le 1\}$ is continuous with respect to the topology of uniform convergence on compact sets.
If (a) holds, then the measure $\mu$ is unique.
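Proof. The proof of the uniqueness is routine. It also follows from the denseness theorem in Section 8.
(a)$\Rightarrow$(c): Let $(f_i)_{i \in I}$ be a net in $B$ and let $f \in B$ be such that $f_i \to f$ uniformly on compact sets. Let $\varepsilon > 0$. We want to show that there is an $i_0 \in I$ such that $|\varphi(f_i) - \varphi(f)| < \varepsilon$ for all $i \in I$ with $i \ge i_0$. Since $\mu$ is tight, there is a compact $K \subset X$ with $\mu(X \setminus K) < \varepsilon/3$. Then $f_i \to f$ uniformly on $K$, so there is an $i_0 \in I$ such that
\[ |f_i - f| < \varepsilon/(3\mu(K) + 1) \text{ on } K \text{ for all } i \ge i_0. \]
Then for $i \ge i_0$,
\[
\begin{aligned}
|\varphi(f_i) - \varphi(f)|
&\le \int_K |f_i - f| \, d\mu + \int_{X \setminus K} |f_i - f| \, d\mu \\
&\le \frac{\varepsilon}{3\mu(K) + 1} \, \mu(K) + \|f_i - f\|_\infty \, \mu(X \setminus K) \\
&< \varepsilon/3 + 2\varepsilon/3 = \varepsilon.
\end{aligned}
\]
Hence $\varphi(f_i) \to \varphi(f)$, and $\varphi$ is continuous on $B$.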

(c)$\Rightarrow$(b): Suppose that (b) is not true. Then there exists an $\varepsilon > 0$ such that for every compact $K \subset X$ there is an $f_K \in C_b(X)$ with $\|f_K\|_\infty \le 1$ and $f_K = 0$ on $K$ and such that $|\varphi(f_K)| > \varepsilon$. Then $(f_K)_{K \in \mathcal{K}}$, where $\mathcal{K} = \{K \subset X : K \text{ compact}\}$ with inclusion as ordering, is a net in $B$ that converges to zero in the topology of uniform convergence on compact sets. Indeed, for each compact $K_0 \subset X$, $f_K = 0$ on $K_0$ for all $K \supset K_0$. Since $|\varphi(f_K)| > \varepsilon$ for all $K \in \mathcal{K}$, it follows that $\varphi$ is not continuous on $B$.
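(b)$\Rightarrow$(a): Take for each $m \ge 1$ a compact set $K_m \subset X$ such that $|\varphi(f)| \le 1/m$ for all $f \in C_b(X)$ with $\|f\|_\infty \le 1$ and $f = 0$ on $K_m$. Let $Y$ be the Stone-Čech compactification of $X$. For every $g \in C(Y)$ its restriction to $X$ is an element of $C_b(X)$ and we can define
\[ \overline{\varphi}(g) := \varphi(g|_X), \quad g \in C(Y). \]
Then $\overline{\varphi} : C(Y) \to \mathbb{R}$ is a bounded linear and positive functional, so by the Riesz representation theorem there exists a finite Borel measure $\overline{\mu}$ on $Y$ such that
\[ \overline{\varphi}(g) = \int g \, d\overline{\mu} \quad \text{for all } g \in C(Y). \]
We want to restrict $\overline{\mu}$ to a measure on $X$ that represents $\varphi$. Therefore we need that $\overline{\mu}$ has no mass outside $X$.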
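Let $E := \bigcup_m K_m \subset X$. Since every $K_m$ is compact, $E$ is a Borel set in $Y$. To show that $\overline{\mu}$ has no mass outside $E$ we exploit the assumption (b) by means of an approximation of $\mathbf{1}_{K_m^c}$ by continuous functions. Let
\[ h_m(x) := \min\{d(x, K_m), 1\}, \quad x \in Y,\ m \ge 1. \]
Then $h_m \in C(Y)$, $0 \le \sqrt[n]{h_m} \le \mathbf{1}_{K_m^c}$ and $\sqrt[n]{h_m} \uparrow \mathbf{1}_{K_m^c}$ as $n \to \infty$, since $h_m(x) > 0$ for every $x \in K_m^c$. Hence by the monotone convergence theorem,
\[
\overline{\mu}(Y \setminus K_m) = \int \mathbf{1}_{K_m^c} \, d\overline{\mu}
= \lim_{n \to \infty} \int \sqrt[n]{h_m} \, d\overline{\mu}
= \lim_{n \to \infty} \overline{\varphi}(\sqrt[n]{h_m})
= \lim_{n \to \infty} \varphi(\sqrt[n]{h_m}|_X) \le 1/m,
\]
by assumption (b). Therefore
\[ \overline{\mu}(Y \setminus E) = \overline{\mu}\Big(\bigcap_{m=1}^\infty K_m^c\Big) = 0. \]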
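Define
\[ \mu(A) := \overline{\mu}(A \cap E), \quad A \in \mathcal{B}(X). \]
(Notice that $A \in \mathcal{B}(X)$ $\Rightarrow$ $A \cap E$ Borel in $E$, hence Borel in $Y$.) Then $\mu$ is a finite Borel measure on $X$. To show that $\mu$ represents $\varphi$, let $f \in C_b(X)$ and let $\overline{f} \in C(Y)$ be its extension. Since
\[ \overline{\mu}(Y \setminus E) = 0 \quad \text{and} \quad \mu(X \setminus E) = 0, \]
it follows that
\[ \int f \, d\mu = \int f \mathbf{1}_E \, d\mu = \int \overline{f} \mathbf{1}_E \, d\overline{\mu} = \int \overline{f} \, d\overline{\mu} = \overline{\varphi}(\overline{f}) = \varphi(f). \]
Finally, notice that $\mu(X \setminus K_m) = \overline{\mu}(E \setminus K_m) = \overline{\mu}(Y \setminus K_m) \le 1/m$ for all $m$, so that $\mu$ is tight.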
Remark. (1) If $X$ is compact, then every $\varphi \in C_b(X)'$ satisfies condition (c). Thus we retrieve the Riesz representation theorem for compact metric spaces.
(2) We have shown earlier that if $(X, d)$ is a complete separable metric space, then each finite Borel measure on $X$ is tight. Hence for such a space condition (c) is necessary to have representation by any finite Borel measure.
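Example. Let $X = \mathbb{N}$, $d(x, y) = |x - y|$, $x, y \in X$. We will show that there exists a $\varphi \in C_b(X)'$ that is not represented by a finite Borel measure. Observe that $C_b(X) = \ell^\infty(\mathbb{N})$ and define
\[ \varphi_0(x) := \lim_{k \to \infty} x(k) \]
for all $x \in c := \{y \in \ell^\infty(\mathbb{N}) : \lim_{k \to \infty} y(k) \text{ exists}\}$. The set $c$ is a closed subspace of $\ell^\infty(\mathbb{N})$ and $\varphi_0$ is a positive bounded linear functional on $c$. Let
\[ p(x) := \max\{\limsup_{k \to \infty} x(k), 0\}, \quad x \in \ell^\infty(\mathbb{N}). \]
Then $p(x + y) \le p(x) + p(y)$ and $p(\lambda x) = \lambda p(x)$ for all $x, y \in \ell^\infty(\mathbb{N})$, $\lambda \ge 0$. Further, $\varphi_0(x) \le p(x)$ for all $x \in c$. Hence by the Hahn-Banach theorem (see [3, II.3.10, p. 62]) there exists a linear functional $\varphi : \ell^\infty(\mathbb{N}) \to \mathbb{R}$ that extends $\varphi_0$ and such that $\varphi(x) \le p(x)$ for all $x \in \ell^\infty(\mathbb{N})$. Then $|\varphi(x)| \le \max\{p(x), p(-x)\} \le \|x\|_\infty$ for all $x \in \ell^\infty(\mathbb{N})$, so $\varphi$ is bounded, and for $x \in \ell^\infty(\mathbb{N})$ with $x \ge 0$ we have $-\varphi(x) = \varphi(-x) \le p(-x) = 0$, so $\varphi$ is positive.
Let now
\[ x_n := \mathbf{1}_{\{n, n+1, \ldots\}} \in c, \quad n = 1, 2, \ldots. \]
Then $\varphi(x_n) = \varphi_0(x_n) = 1$ for all $n$, but for any finite Borel measure $\mu$ on $\mathbb{N}$ we have $\int x_n \, d\mu \to 0$ as $n \to \infty$, since $x_n \to 0$ pointwise and $0 \le x_n \le \mathbf{1}$ for all $n$. Hence $\varphi$ cannot be represented by a finite Borel measure.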
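The mismatch in this example can be seen numerically for one concrete measure. The sketch below assumes, purely for illustration, the finite measure on the natural numbers with point masses $2^{-k}$; the function names and the truncation of the series are choices made here, not part of the notes.

```python
def x(n, k):
    """The sequence x_n = indicator of {n, n+1, ...}, evaluated at index k."""
    return 1.0 if k >= n else 0.0

def integral(n, terms=200):
    """Integral of x_n against the assumed measure mu({k}) = 2**(-k), series truncated."""
    return sum(x(n, k) * 2.0 ** (-k) for k in range(1, n + terms))

# The integrals shrink with n, while every x_n is eventually equal to 1,
# so the limit functional phi_0 takes the value 1 on each x_n.
integrals = [integral(n) for n in range(1, 11)]
eventual_values = [x(n, n + 1000) for n in range(1, 11)]
print(integrals)
print(eventual_values)
```

Any other summable choice of point masses would show the same behaviour, because the integrals are tails of a convergent series.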
Integrable functions on metric spaces
Let $(X, d)$ be a metric space and let $\mu$ be a finite Borel measure on $X$. Is $C_b(X)$ dense in $L^1(\mu)$? The answer is positive and we can show more. Let
\[ \mathrm{Lip}_b(X) := \{f : X \to \mathbb{R} : f \text{ is bounded and Lipschitz continuous with respect to } d\}. \]
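Lemma 8.1. Let $(X, d)$ be a metric space and let $\mu$ be a finite Borel measure on $X$.
(1) For every $U \subset X$ open and every $\varepsilon > 0$ there exists an $f \in \mathrm{Lip}_b(X)$ with $0 \le f \le \mathbf{1}_U$ and $\int (\mathbf{1}_U - f) \, d\mu < \varepsilon$.
(2) For every $A \in \mathcal{B}(X)$ and every $\varepsilon > 0$ there exists an $f \in \mathrm{Lip}_b(X)$ with $\int |f - \mathbf{1}_A| \, d\mu < \varepsilon$.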
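Proof. (1): If $U = X$, take $f = \mathbf{1}$. If $U \subset X$ is open and $U \ne X$, let
\[ h(x) := \min\{d(x, U^c), 1\}, \quad x \in X. \]
Then $h \in \mathrm{Lip}_b(X)$. Indeed, observe that $\min\{a, c\} - \min\{b, c\} \le a - b$ if $a \ge b$, so that
\[ |h(x) - h(y)| \le |d(x, U^c) - d(y, U^c)| \le d(x, y) \]
for all $x, y \in X$. Further, $0 \le h \le \mathbf{1}_U$ on $X$ and $h(x) > 0$ for all $x \in U$.
Take a strictly concave Lipschitz continuous function $\varphi : [0, 1] \to [0, 1]$ with $\varphi(0) = 0$ and $\varphi(1) = 1$. For instance, $\varphi(x) = x(2 - x)$. Denote the iterates of $\varphi$ by $\varphi_1 := \varphi$, $\varphi_n := \varphi \circ \varphi_{n-1}$, $n = 2, 3, \ldots$. For $0 < \alpha < 1$,
\[ \varphi(\alpha) = \varphi((1 - \alpha) 0 + \alpha 1) > (1 - \alpha)\varphi(0) + \alpha\varphi(1) = \alpha, \]
so $\varphi_n(\alpha)$ is increasing in $n$ and its limit must be 1. Thus, $\varphi_n(0) = 0$, $\varphi_n(1) = 1$ for all $n$, and $\varphi_n(\alpha) \uparrow 1$ for every $0 < \alpha < 1$.
For each $n$, $\varphi_n \circ h \in \mathrm{Lip}_b(X)$, $\varphi_n \circ h \ge 0$, and $\varphi_n(h(x)) \uparrow \mathbf{1}_U(x)$ for all $x \in X$. By the monotone convergence theorem we therefore find
\[ \int (\varphi_n \circ h) \, d\mu \uparrow \int \mathbf{1}_U \, d\mu \quad \text{as } n \to \infty. \]
So for large $n$ the function $f := \varphi_n \circ h$ has the desired properties.
(2): With aid of the outer regularity of $\mu$, take $U \subset X$ open with $A \subset U$ and $\mu(U \setminus A) < \varepsilon/2$. Take, by (1), $f \in \mathrm{Lip}_b(X)$ with $\int |\mathbf{1}_U - f| \, d\mu < \varepsilon/2$. Then $\int |f - \mathbf{1}_A| \, d\mu < \varepsilon$.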
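The construction in part (1) can be traced numerically. The sketch below assumes, for illustration only, that $X$ is the real line, $U = (0, 1)$, and $\mu$ is a discrete measure with equal point masses on a grid; the names `phi_n`, `h` and `gap` are hypothetical helpers introduced here.

```python
def phi(t):
    """Strictly concave map of [0, 1] onto itself with phi(0) = 0 and phi(1) = 1."""
    return t * (2.0 - t)

def phi_n(t, n):
    """n-fold iterate of phi, for n >= 1."""
    for _ in range(n):
        t = phi(t)
    return t

def h(x, a=0.0, b=1.0):
    """min{d(x, U^c), 1} for the open interval U = (a, b) in the real line."""
    dist_to_complement = max(0.0, min(x - a, b - x))
    return min(dist_to_complement, 1.0)

# Assumed measure: equal point masses on a grid of 401 points in [-0.5, 1.5].
points = [-0.5 + 2.0 * i / 400 for i in range(401)]
mass = 1.0 / len(points)

def gap(n):
    """Integral of (1_U - phi_n(h)) with respect to the assumed discrete measure."""
    return sum(mass * ((1.0 if 0.0 < x < 1.0 else 0.0) - phi_n(h(x), n)) for x in points)

gaps = [gap(n) for n in (1, 2, 4, 8, 16)]
print(gaps)
```

For this particular $\varphi$ one has $1 - \varphi(t) = (1 - t)^2$, so the gap at a point with $h(x) = t$ is $(1 - t)^{2^n}$, which explains the fast decay of the printed values.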
Theorem 8.2. If $(X, d)$ is a metric space and $\mu$ is a finite Borel measure on $X$, then $\mathrm{Lip}_b(X)$ is dense in $L^1(\mu)$. Consequently, $C_b(X)$ is dense in $L^1(\mu)$.
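Proof. 1. Let $A_1, \ldots, A_n \in \mathcal{B}(X)$ and $\alpha_1, \ldots, \alpha_n \in \mathbb{R} \setminus \{0\}$, and let $\varepsilon > 0$. Take for each $k \in \{1, \ldots, n\}$ an $h_k \in \mathrm{Lip}_b(X)$ with
\[ \int |h_k - \mathbf{1}_{A_k}| \, d\mu < \frac{\varepsilon}{n|\alpha_k|}. \]
Then $\sum_{k=1}^n \alpha_k h_k \in \mathrm{Lip}_b(X)$ and
\[ \int \Big| \sum_{k=1}^n \alpha_k h_k - \sum_{k=1}^n \alpha_k \mathbf{1}_{A_k} \Big| \, d\mu \le \sum_{k=1}^n |\alpha_k| \int |h_k - \mathbf{1}_{A_k}| \, d\mu < \varepsilon. \]
2. Step functions as in 1. are dense in $L^1(\mu)$.
(See also [2, 5.2 Proposition 3, p. 57].)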
Corollary 8.3. Let $(X, d)$ be a metric space and let $\mu$ and $\nu$ be finite Borel measures on $X$. If
\[ \int f \, d\mu = \int f \, d\nu \quad \text{for all } f \in \mathrm{Lip}_b(X), \]
then $\mu = \nu$.
More properties of the space of probability measures
Let $(X, d)$ be a metric space, let $\mathcal{P}(X)$ be the set of Borel probability measures on $X$, and let $d_P$ be the Prokhorov metric on $\mathcal{P}(X)$ as defined in Section 4. With aid of Prokhorov's theorem we can show that $(\mathcal{P}(X), d_P)$ is complete if $(X, d)$ is complete (cf. [7, Lemma 1.4, p. 169]).
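Lemma 9.1. Let $(X, d)$ be a complete metric space and let $\Gamma \subset \mathcal{P}(X)$. In order that $\Gamma$ is tight, it suffices that for every $\varepsilon, \delta > 0$ there are $a_1, \ldots, a_n \in X$ such that
\[ \mu\Big(\bigcup_{i=1}^n B(a_i, \delta)\Big) \ge 1 - \varepsilon \quad \text{for all } \mu \in \Gamma. \]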
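Proof. Let $\varepsilon > 0$. Assume that for each $m \ge 1$ the points $a_1^m, \ldots, a_{n_m}^m \in X$ are such that
\[ \mu\Big(\bigcup_{i=1}^{n_m} B(a_i^m, 1/m)\Big) \ge 1 - \varepsilon 2^{-m} \quad \text{for all } \mu \in \Gamma. \]
Take
\[ K := \bigcap_{m=1}^\infty \bigcup_{i=1}^{n_m} \overline{B}(a_i^m, 1/m), \]
where $\overline{B}(a, r)$ denotes the closed ball with centre $a$ and radius $r$. Then $K$ is closed and for a given $\delta > 0$ we can take $m > 1/\delta$ and obtain
\[ K \subset \bigcup_{i=1}^{n_m} \overline{B}(a_i^m, 1/m) \subset \bigcup_{i=1}^{n_m} B(a_i^m, \delta). \]
Hence $K$ is totally bounded and thus compact, because $X$ is complete. Further, for $\mu \in \Gamma$,
\[
\begin{aligned}
\mu(K) &= \mu\Big(\bigcap_{m=1}^\infty \bigcup_{i=1}^{n_m} \overline{B}(a_i^m, 1/m)\Big)
= \lim_{M \to \infty} \mu\Big(\bigcap_{m=1}^M \bigcup_{i=1}^{n_m} \overline{B}(a_i^m, 1/m)\Big) \\
&= 1 - \lim_{M \to \infty} \mu\Big(\bigcup_{m=1}^M \Big(\bigcup_{i=1}^{n_m} \overline{B}(a_i^m, 1/m)\Big)^c\Big)
\ge 1 - \lim_{M \to \infty} \sum_{m=1}^M \varepsilon 2^{-m} = 1 - \varepsilon.
\end{aligned}
\]
So $\Gamma$ is tight.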
Theorem 9.2. Let $(X, d)$ be a separable metric space. If $(X, d)$ is complete, then $(\mathcal{P}(X), d_P)$ is complete.
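Proof. Let $(\mu_k)_k$ be a Cauchy sequence in $(\mathcal{P}(X), d_P)$. We will show that $\{\mu_k : k = 1, 2, \ldots\}$ is tight. Take $D = \{a_1, a_2, \ldots\}$ dense in $X$. Let $\varepsilon, \delta > 0$. Set
\[ \gamma := \min\{\varepsilon, \delta\}/2 \]
and fix $N$ such that
\[ d_P(\mu_k, \mu_\ell) < \gamma \quad \text{for all } k, \ell \ge N. \]
Then for $k, \ell \ge N$ we have
\[ \mu_k(A) \le \mu_\ell(A_\gamma) + \gamma \quad \text{and} \quad \mu_\ell(A) \le \mu_k(A_\gamma) + \gamma \quad \text{for all } A \in \mathcal{B}(X). \]
Take now $n \ge 1$ such that for $k \in \{1, \ldots, N\}$
\[ \mu_k\Big(\bigcup_{i=1}^n B(a_i, \delta/2)\Big) \ge 1 - \gamma. \]
(Such an $n$ exists because $\bigcup_{i=1}^\infty B(a_i, \delta/2) = X$, so that $\lim_{m \to \infty} \mu_k\big(\bigcup_{i=1}^m B(a_i, \delta/2)\big) = 1$ for each of the finitely many $k \in \{1, \ldots, N\}$.) Observe that
\[ \Big(\bigcup_{i=1}^n B(a_i, \delta/2)\Big)_\gamma \subset \bigcup_{i=1}^n B(a_i, \delta/2 + \gamma) \subset \bigcup_{i=1}^n B(a_i, \delta). \]
Therefore,
\[ \mu_N\Big(\bigcup_{i=1}^n B(a_i, \delta/2)\Big) \le \mu_k\Big(\Big(\bigcup_{i=1}^n B(a_i, \delta/2)\Big)_\gamma\Big) + \gamma \le \mu_k\Big(\bigcup_{i=1}^n B(a_i, \delta)\Big) + \gamma \]
for all $k \ge N$. Then
\[ \mu_k\Big(\bigcup_{i=1}^n B(a_i, \delta)\Big) \ge 1 - 2\gamma \ge 1 - \varepsilon \quad \text{for } k \ge N \]
and
\[ \mu_k\Big(\bigcup_{i=1}^n B(a_i, \delta)\Big) \ge \mu_k\Big(\bigcup_{i=1}^n B(a_i, \delta/2)\Big) \ge 1 - \gamma \ge 1 - \varepsilon \]
for $k = 1, \ldots, N$. By the previous lemma it follows that the set $\{\mu_k : k = 1, 2, \ldots\}$ is tight and therefore relatively compact in $\mathcal{P}(X)$ by Prokhorov's theorem. Hence there is a subsequence $(\mu_{k_i})_i$ that converges to some $\mu \in \mathcal{P}(X)$. As $(\mu_k)_k$ is Cauchy it follows that $\mu_k \to \mu$. Thus, $(\mathcal{P}(X), d_P)$ is complete.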
Finally, we want to see that completeness of $(X, d)$ is necessary for completeness of $(\mathcal{P}(X), d_P)$. We can derive this by embedding $X$ in $\mathcal{P}(X)$. More specifically, we show that $X$ and the set of Dirac measures $\Delta := \{\delta_x : x \in X\}$ are in a suitable sense isomorphic. ($\delta_x$ denotes the Dirac measure at $x$.)
Proposition 9.3. Let $(X, d)$ be a separable metric space. Then:
(1) $d_P(\delta_x, \delta_y) = \min\{d(x, y), 1\}$ for every $x, y \in X$,
(2) $x \mapsto \delta_x$ is a homeomorphism from $X$ onto $\Delta := \{\delta_x : x \in X\} \subset \mathcal{P}(X)$,
(3) a sequence $(x_n)_n$ is Cauchy in $(X, d)$ if and only if $(\delta_{x_n})_n$ is Cauchy in $(\mathcal{P}(X), d_P)$,
(4) $\Delta$ is closed in $\mathcal{P}(X)$.
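Proof. (1): From the very definition of $d_P$, $d_P(\mu, \nu) \le 1$ for all $\mu, \nu \in \mathcal{P}(X)$. Let $\alpha > d(x, y)$. Then for each $A \in \mathcal{B}(X)$,
\[ x \in A \Rightarrow y \in A_\alpha \quad \text{and} \quad y \in A \Rightarrow x \in A_\alpha, \]
so
\[ \delta_x(A) \le \delta_y(A_\alpha) + \alpha, \qquad \delta_y(A) \le \delta_x(A_\alpha) + \alpha, \]
and hence $d_P(\delta_x, \delta_y) \le \alpha$. Thus, $d_P(\delta_x, \delta_y) \le d(x, y)$.
Assume $d_P(\delta_x, \delta_y) < 1$ and let $d_P(\delta_x, \delta_y) < \alpha < 1$. Then
\[ \delta_x(A) \le \delta_y(A_\alpha) + \alpha \quad \text{and} \quad \delta_y(A) \le \delta_x(A_\alpha) + \alpha \quad \text{for all } A \in \mathcal{B}(X). \]
Hence for $A = \{x\}$ we find
\[ 1 \le \delta_y(B(x, \alpha)) + \alpha. \]
As $\alpha < 1$ it follows that $y \in B(x, \alpha)$, so $d(x, y) < \alpha$. Thus $d(x, y) \le d_P(\delta_x, \delta_y)$.
(2) and (3) are clear from (1).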
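(4): Let $(x_n)_n$ be a sequence in $X$ such that $\delta_{x_n} \to \mu$ for some $\mu \in \mathcal{P}(X)$. We have to show that $\mu \in \Delta$. Suppose $(x_n)_n$ has no convergent subsequence. Then $S := \{x_1, x_2, \ldots\}$ is infinite and closed, and so is every subset of $S$. Hence for every infinite subset $C$ of $S$ we have $x_n \in C$ for infinitely many $n$, so that
\[ \mu(C) \ge \limsup_{n \to \infty} \delta_{x_n}(C) = 1. \]
Applying this to two disjoint infinite subsets of $S$ contradicts $\mu(X) = 1$. Hence there is a subsequence $(x_{n_k})_k$ and an $x \in X$ such that $x_{n_k} \to x$. By (2), $\delta_{x_{n_k}} \to \delta_x$, so $\mu = \delta_x$.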
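Part (1) of the proposition can be checked by brute force on a finite metric space, where the defining inequalities of the Prokhorov metric only have to be tested on finitely many sets. The sketch below assumes a hypothetical four-point subset of the real line and finds the infimum by bisection; `admissible`, `prokhorov` and `dirac` are illustrative helper names.

```python
from itertools import combinations

points = [0.0, 0.3, 0.7, 2.0]  # hypothetical finite subset of the real line

def dist(i, j):
    return abs(points[i] - points[j])

def nonempty_subsets(n):
    for r in range(1, n + 1):
        yield from combinations(range(n), r)

def admissible(mu, nu, alpha):
    """Check mu(A) <= nu(A_alpha) + alpha and nu(A) <= mu(A_alpha) + alpha for all A."""
    n = len(points)
    for A in nonempty_subsets(n):
        # A_alpha = {x : d(x, A) < alpha}, the open alpha-neighbourhood of A.
        A_alpha = [j for j in range(n) if min(dist(j, a) for a in A) < alpha]
        mu_A = sum(mu[i] for i in A)
        nu_A = sum(nu[i] for i in A)
        if mu_A > sum(nu[j] for j in A_alpha) + alpha:
            return False
        if nu_A > sum(mu[j] for j in A_alpha) + alpha:
            return False
    return True

def prokhorov(mu, nu, tol=1e-9):
    """Approximate inf{alpha > 0 : alpha admissible}; admissibility is monotone in alpha."""
    lo, hi = 0.0, 1.0  # alpha = 1 is always admissible for probability measures
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if admissible(mu, nu, mid):
            hi = mid
        else:
            lo = mid
    return hi

def dirac(i):
    """Dirac measure at points[i], stored as a list of point masses."""
    return [1.0 if j == i else 0.0 for j in range(len(points))]

print(prokhorov(dirac(0), dirac(1)), prokhorov(dirac(0), dirac(3)))
```

The enumeration of all subsets grows exponentially with the number of points, so this is only meant as a check of the formula on very small spaces.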
With aid of the above proposition we can add the "only if" counterparts to Proposition 5.3 and Theorem 9.2.
Theorem 9.4. Let $(X, d)$ be a separable metric space.
(1) $(X, d)$ is compact $\Leftrightarrow$ $(\mathcal{P}(X), d_P)$ is compact.
(2) $(X, d)$ is complete $\Leftrightarrow$ $(\mathcal{P}(X), d_P)$ is complete.
Remark. There are other metrics on $\mathcal{P}(X)$ in use than the Prokhorov metric. For instance the bounded Lipschitz metric, which is defined by
\[ d_{BL}(\mu, \nu) := \sup\Big\{ \Big| \int f \, d\mu - \int f \, d\nu \Big| : f \in \mathrm{Lip}_b(X),\ \|f\|_{\mathrm{Lip}} \le 1 \Big\}, \quad \mu, \nu \in \mathcal{P}(X), \]
where
\[ \|f\|_{\mathrm{Lip}} = \|f\|_\infty + \sup_{x \ne y} \frac{|f(x) - f(y)|}{d(x, y)}, \quad f \in \mathrm{Lip}_b(X). \]
If $(X, d)$ is separable, then a sequence in $\mathcal{P}(X)$ converges weakly if and only if it converges in the metric $d_{BL}$. Further, $(\mathcal{P}(X), d_{BL})$ is separable and complete if $(X, d)$ is separable and complete. (See [10, 1.12, p. 73–74].)
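As a worked illustration of this definition, not taken from the cited sources, the bounded Lipschitz distance between two Dirac measures can be computed directly. The computation below is a sketch under the norm convention stated above, with $r := d(x, y) > 0$, $s := \|f\|_\infty$ and $L$ the Lipschitz constant of $f$.

```latex
% Upper bound: if \|f\|_{\mathrm{Lip}} = s + L \le 1, then
|f(x) - f(y)| \le \min\{2s,\ L r\} \le \min\{2s,\ (1 - s) r\} \le \frac{2r}{2 + r},
% because 2s increases and (1 - s) r decreases in s, and they cross at s = r/(2 + r).
% Lower bound: with s = r/(2 + r) and L = 2/(2 + r), the function
f(z) := \max\{\, s - L\, d(x, z),\ -s \,\}
% satisfies \|f\|_\infty \le s, has Lipschitz constant at most L, and f(x) - f(y) = 2s. Hence
d_{BL}(\delta_x, \delta_y) = \frac{2\, d(x, y)}{2 + d(x, y)}.
```

In particular $d_{BL}(\delta_x, \delta_y)$ and $d_P(\delta_x, \delta_y) = \min\{d(x, y), 1\}$ tend to zero together, consistent with both metrics describing weak convergence on separable spaces.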
References
[1] Billingsley, Patrick, Convergence of Probability Measures, Wiley & Sons, New York - London, 1968.
[2] Bourbaki, N., Éléments de mathématique. Livre 6. Intégration. Chapitre 9, Hermann, Paris, 1969.
[3] Dunford, Nelson, and Schwartz, Jacob T., Linear Operators. Part I: General Theory, Wiley-Interscience, New York, 1957.
[4] Halmos, Paul R., Measure Theory [GTM 18], Springer, New York - Berlin - Heidelberg, 1974.
[5] Kelley, John L., General Topology [GTM 27], Springer, New York - Heidelberg - Berlin, 1975.
[6] Parthasarathy, K.R., Probability Measures on Metric Spaces, Academic Press, New York - London, 1967.
[7] Prokhorov, Yu.V., Convergence of random processes and limit theorems in probability theory, Theor. Prob. Appl. 1 (1956), 157–214.
[8] Rudin, Walter, Real and Complex Analysis, 3rd ed., McGraw-Hill, New York, 1987.
[9] Stroock, Daniel W., Probability Theory, an Analytic View, Cambridge University Press, Cambridge, 1993.
[10] van der Vaart, Aad W. and Wellner, Jon A., Weak convergence and empirical processes, with applications to statistics, Springer, New York, 1996.