
Introduction to Probability Theory

K. Suresh Kumar
Department of Mathematics
Indian Institute of Technology Bombay

July 28, 2017



LECTURES 3 - 4

Theorem 0.0.1 (Properties of probability measure) Let (Ω, F, P) be a probability space and A, B, A1, A2, . . . be in F. Then

(1) P (Ac ) = 1 − P (A) .

(2) Monotonicity: if A ⊆ B, then

P (A) ≤ P (B) .

(3) Inclusion - exclusion formula:


P(∪_{k=1}^{n} A_k) = Σ_{k=1}^{n} P(A_k) − Σ_{i<j} P(A_i A_j) + Σ_{i<j<k} P(A_i A_j A_k) − · · · + (−1)^{n+1} P(A_1 A_2 · · · A_n) .

(4) Finite sub-additivity:

P (A ∪ B) ≤ P (A) + P (B) .

(5) Continuity property:


(i) For A_1 ⊆ A_2 ⊆ · · · ,

P(∪_{n=1}^{∞} A_n) = lim_{n→∞} P(A_n) .

(ii) For A_1 ⊇ A_2 ⊇ · · · ,

P(∩_{n=1}^{∞} A_n) = lim_{n→∞} P(A_n) .

(6) Boole’s inequality (Countable sub-additivity):



P(∪_{n=1}^{∞} A_n) ≤ Σ_{n=1}^{∞} P(A_n) .

Proof. Since

Ω = A ∪ A^c ∪ ∅ ∪ ∅ ∪ · · · , we have 1 = P(A) + P(A^c) + P(∅) + P(∅) + · · · .



Since RHS is a convergent series, we get P (∅) = 0 and hence (1) follows.
Now

A ⊆ B =⇒ B = A ∪ (B \ A) .

Therefore

P (B) = P (A) + P (B \ A) ⇒ P (B) ≥ P (A) ,

since P (B \ A) ≥ 0. This proves (2).


We prove (3) by induction. For n = 2

A1 ∪ A2 = A1 ∪ (A2 \ A1 ) .

and

A2 = (A2 \ A1 ) ∪ A1 A2 .

(Here A1 A2 = A1 ∩ A2 ) . Hence we have

P (A1 ∪ A2 ) = P (A1 ) + P (A2 \ A1 )

and

P(A_2) = P(A_2 \ A_1) + P(A_1 A_2) .

Combining the above, we have

P (A1 ∪A2 ) = P (A1 ) + P (A2 ) − P (A1 A2 ) .



Assume that equality holds for n ≤ m . Consider


P(∪_{k=1}^{m+1} A_k) = P(∪_{k=1}^{m} A_k ∪ A_{m+1})

= P(∪_{k=1}^{m} A_k) + P(A_{m+1}) − P([∪_{k=1}^{m} A_k] ∩ A_{m+1})

= P(∪_{k=1}^{m} A_k) + P(A_{m+1}) − P(∪_{k=1}^{m} A_k A_{m+1})

= Σ_{k=1}^{m} P(A_k) − Σ_{i<j}^{m} P(A_i A_j) + · · · + (−1)^{m+1} P(A_1 · · · A_m) + P(A_{m+1})
  − [ Σ_{k=1}^{m} P(A_k A_{m+1}) − Σ_{i<j}^{m} P(A_i A_j A_{m+1}) + · · · + (−1)^{m+1} P(A_1 A_2 · · · A_m A_{m+1}) ]

= Σ_{k=1}^{m+1} P(A_k) − [ Σ_{i<j}^{m} P(A_i A_j) + Σ_{k=1}^{m} P(A_k A_{m+1}) ]
  + [ Σ_{i<j<k}^{m} P(A_i A_j A_k) + Σ_{i<j}^{m} P(A_i A_j A_{m+1}) ] + · · · + (−1)^{m+2} P(A_1 A_2 · · · A_m A_{m+1})

= Σ_{k=1}^{m+1} P(A_k) − Σ_{i<j}^{m+1} P(A_i A_j) + Σ_{i<j<k}^{m+1} P(A_i A_j A_k) − · · · + (−1)^{m+2} P(A_1 · · · A_{m+1}) .

Here the fourth equality applies the induction hypothesis to ∪_{k=1}^{m} A_k and to ∪_{k=1}^{m} (A_k A_{m+1}), using (A_i A_{m+1})(A_j A_{m+1}) = A_i A_j A_{m+1}.

Therefore the result is true for n = m + 1. Hence, by induction, property (3) follows.
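The inclusion-exclusion formula can be checked numerically on a finite uniform space; the following sketch uses three arbitrary illustrative events (my own choices, not from the notes) and compares both sides of (3):

```python
from fractions import Fraction
from itertools import combinations

# Uniform probability on a small finite sample space.
omega = set(range(12))
def P(A):
    return Fraction(len(A), len(omega))

events = [{n for n in omega if n % 2 == 0},   # even outcomes
          {n for n in omega if n % 3 == 0},   # multiples of 3
          {n for n in omega if n < 5}]        # outcomes below 5

# Left side: probability of the union.
lhs = P(set().union(*events))

# Right side: alternating sum over all non-empty subfamilies, as in (3).
rhs = Fraction(0)
for r in range(1, len(events) + 1):
    for sub in combinations(events, r):
        rhs += (-1) ** (r + 1) * P(set.intersection(*sub))

assert lhs == rhs
```

Exact arithmetic via `Fraction` makes the equality test meaningful; with floats the alternating sum could differ from the left side by rounding error.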
From property (3), we have
P (A ∪ B) = P (A) + P (B) − P (AB) .
Hence
P (A ∪ B) ≤ P (A) + P (B) .
Thus we have (4).
Now we prove (5)(i). Set

B_1 = A_1 , B_n = A_n \ A_{n−1} , n = 2, 3, · · · .

Then B_n ∈ F for all n = 1, 2, · · · , the B_n's are pairwise disjoint, and

A_n = ∪_{k=1}^{n} B_k , n ≥ 1 . (0.0.1)

Also

∪_{n=1}^{∞} B_n = ∪_{n=1}^{∞} A_n . (0.0.2)
Using (0.0.1), we get

P(A_n) = Σ_{k=1}^{n} P(B_k) .

Hence, from the definition of series convergence, we get

lim_{n→∞} P(A_n) = lim_{n→∞} Σ_{k=1}^{n} P(B_k) = Σ_{n=1}^{∞} P(B_n) . (0.0.3)

Similarly, using (0.0.2), we get

P(∪_{n=1}^{∞} A_n) = Σ_{n=1}^{∞} P(B_n) .

Therefore from (0.0.3), we have

P(∪_{n=1}^{∞} A_n) = lim_{n→∞} P(A_n) .

Proof of (5)(ii) is as follows. Note that

A_1^c ⊆ A_2^c ⊆ · · · .

Now using (5)(i), we have

lim_{n→∞} P(A_n^c) = P(∪_{n=1}^{∞} A_n^c) ,

i.e.,

1 − lim_{n→∞} P(A_n) = P[(∩_{n=1}^{∞} A_n)^c] = 1 − P(∩_{n=1}^{∞} A_n) .

Hence

lim_{n→∞} P(A_n) = P(∩_{n=1}^{∞} A_n) .
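Both continuity properties can be illustrated concretely with the length measure on (0, 1] (which appears later in these notes); the interval families below are my own illustrative choices:

```python
from fractions import Fraction

# Length measure of a subinterval (a, b] of (0, 1]; Fraction keeps the
# arithmetic exact.
def P(a, b):
    return Fraction(b) - Fraction(a)

# Increasing sets A_n = (0, 1 - 1/n]; their union is (0, 1), of measure 1.
inc = [P(0, 1 - Fraction(1, n)) for n in range(1, 1001)]
assert all(p <= q for p, q in zip(inc, inc[1:]))  # P(A_n) is non-decreasing
# P(A_1000) = 999/1000, approaching P(union) = 1 as n grows.

# Decreasing sets A_n = (0, 1/n]; their intersection is empty, of measure 0.
dec = [P(0, Fraction(1, n)) for n in range(1, 1001)]
assert all(p >= q for p, q in zip(dec, dec[1:]))  # P(A_n) is non-increasing
# P(A_1000) = 1/1000, approaching P(intersection) = 0.
```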

From property (4), it follows that

P (A1 ∪ A2 ∪ · · · ∪ An ) ≤ P (A1 ) + · · · + P (An ) ∀ n ≥ 1 .



i.e.,

P(∪_{k=1}^{n} A_k) ≤ Σ_{k=1}^{n} P(A_k) ∀ n ≥ 1 .

Therefore

P(∪_{k=1}^{n} A_k) ≤ Σ_{k=1}^{∞} P(A_k) ∀ n ≥ 1 . (0.0.4)

Set

B_n = ∪_{k=1}^{n} A_k .

Then B_1 ⊆ B_2 ⊆ · · · and the B_n are in F. Also

∪_{n=1}^{∞} A_n = ∪_{n=1}^{∞} B_n .

Hence

P(∪_{n=1}^{∞} A_n) = P(∪_{n=1}^{∞} B_n) = lim_{n→∞} P(B_n) = lim_{n→∞} P(∪_{k=1}^{n} A_k) . (0.0.5)

Here the second equality follows from the continuity property (5)(i). Using (0.0.5) and letting n → ∞ in (0.0.4), we have

P(∪_{n=1}^{∞} A_n) ≤ Σ_{k=1}^{∞} P(A_k) .
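As a quick sanity check of Boole's inequality on a finite uniform space (the events are arbitrary illustrative choices):

```python
from fractions import Fraction

# Uniform probability on a 10-point sample space.
omega = set(range(10))
def P(A):
    return Fraction(len(A), len(omega))

# Overlapping events, so the bound is strict here.
events = [{0, 1, 2}, {2, 3}, {3, 4, 5}, {0, 5}]
union_prob = P(set().union(*events))
bound = sum(P(A) for A in events)
assert union_prob <= bound  # Boole's inequality
```

The union is {0, 1, 2, 3, 4, 5}, of probability 3/5, while the sum of the four probabilities is 1: the inequality is strict exactly because the events overlap.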

Let us end this chapter with the formulation and solution of the problem of points, essentially in the spirit of the solution by Pascal, initiated by Fermat.

Note that since A needs m points and B needs l points, one can consider the random experiment of m + l − 1 tosses of a fair coin. This gives Ω as

Ω = {(ω_1 , · · · , ω_{m+l−1}) | ω_i ∈ {H, T}}

and #Ω = 2^{m+l−1} .

Let E denote the event that A gets at least m points. Then E^c denotes the event that B gets at least l points. Then

#E = C(m+l−1, m) + C(m+l−1, m+1) + · · · + C(m+l−1, m+l−1) ,

#E^c = C(m+l−1, 0) + C(m+l−1, 1) + · · · + C(m+l−1, m−1) ,

where C(n, k) denotes the binomial coefficient. This gives the probabilities p and q for the win of A and B respectively. Hence clearly

p/q = #E / #E^c .

This gives the division of the prize money p : q.
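The counting above translates directly into code; a minimal sketch (the function name is my own):

```python
from math import comb

def division_of_stakes(m, l):
    """Counts of favourable outcomes for A and B when A needs m more
    points and B needs l more, settled by m + l - 1 fair-coin tosses."""
    n = m + l - 1
    count_A = sum(comb(n, k) for k in range(m, n + 1))   # A gets >= m points
    count_B = sum(comb(n, k) for k in range(0, m))       # A gets < m points
    assert count_A + count_B == 2 ** n                   # E and E^c partition Omega
    return count_A, count_B

# The classical case discussed by Pascal and Fermat: A needs 2, B needs 3.
print(division_of_stakes(2, 3))  # → (11, 5)
```

So when A needs 2 points and B needs 3, the stakes divide 11 : 5 in A's favour.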
For additional reading (not a part of the syllabus)
Recall that all the examples of probability spaces we have seen till now have a finite or countable sample space, with the σ-field being the power set of the sample space. Now let us look at a random experiment with an uncountable sample space, where the σ-field is a proper subset of the power set.

Consider the random experiment in Example 1.0.5, i.e., pick a point 'at random' from the interval (0, 1]. Since the point is picked 'at random', the probability measure should satisfy the following.

P [a, b] = P (a, b] = P [a, b) = P (a, b) = b − a . (0.0.6)

The σ-field we are using to define P is B(0, 1] , the σ-field generated by all
intervals in (0, 1]. B(0, 1] is called the Borel σ-field of subsets of (0, 1].

Our aim is to define P for all elements of B(0, 1] , preserving (0.0.6). Set

B0 := all finite unions of intervals in (0, 1] of the form (a, b], 0 ≤ a ≤ b ≤ 1.

Clearly Ω = (0, 1] ∈ B0 .
Let A ∈ B0 . Then A can be represented as

A = ∪_{i=1}^{m} I_i ,

where I_i = (a_i , b_i ] and

0 ≤ a_1 < b_1 ≤ a_2 < b_2 ≤ a_3 < b_3 ≤ · · · ≤ a_m < b_m ≤ 1 .

Then

A^c = J_1 ∪ J_2 ∪ · · · ∪ J_m ∪ J_{m+1} ,

where

J_1 = (0, a_1 ], J_i = (b_{i−1} , a_i ], i = 2, 3, · · · , m, J_{m+1} = (b_m , 1] .

Therefore Ac ∈ B0 .
For A, B ∈ B0 , it follows from the definition of B0 that A ∪ B ∈ B0 .

Hence B0 is a field.

Define P on B0 as follows:

P(A) = Σ_{i=1}^{m} P(J_i) , (0.0.7)

where

A = ∪_{i=1}^{m} J_i

and the J_i 's are pairwise disjoint intervals of the form (a, b].
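In code, evaluating P on an element of B0 amounts to merging the intervals into disjoint (a, b] pieces and summing their lengths, per (0.0.6) and (0.0.7). A sketch (the merging step handles overlapping representations of the same set):

```python
from fractions import Fraction

def measure(intervals):
    """P(A) for A given as a finite union of half-open intervals (a, b]
    in (0, 1]: merge into disjoint pieces, then add lengths."""
    # Sort by left endpoint; (a, b] and (b, c] fuse into (a, c].
    pieces = sorted((Fraction(a), Fraction(b)) for a, b in intervals if a < b)
    merged = []
    for a, b in pieces:
        if merged and a <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return sum(b - a for a, b in merged)

# Overlap is handled: (0, 1/2] ∪ (1/4, 3/4] has measure 3/4, not 1.
print(measure([(0, Fraction(1, 2)), (Fraction(1, 4), Fraction(3, 4))]))
```

Well-definedness of P (the value does not depend on the chosen representation of A) is exactly what the merging step makes visible.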

The extension of P from B0 to B(0, 1] follows from the extension theorem of Carathéodory. To understand the statement of the extension theorem, we need the following definition.

Definition 1.8 (Probability measure on a field) Let Ω be a nonempty


set and F be a field. Then P : F → [0, 1] is said to be a probability
measure on F if
(i) P(Ω) = 1 ;
(ii) if A_1 , A_2 , · · · ∈ F are such that the A_i 's are pairwise disjoint and ∪_{n=1}^{∞} A_n ∈ F, then

P(∪_{n=1}^{∞} A_n) = Σ_{n=1}^{∞} P(A_n) .

Example 0.0.1 The set function P given by (0.0.7) is a probability measure


on the field B0 .

Theorem 0.0.2 (Extension Theorem) A probability measure defined on a


field F has a unique extension to σ(F).

Using Theorem 0.0.2, one can extend P defined by (0.0.7) to σ(B0 ).


Since
σ(B0 ) = B(0, 1] ,
there exists a unique probability measure P on B(0, 1] preserving (0.0.6).
Chapter 1

Random Variables - General Facts

Key words: Random variable, Borel σ-field of subsets of R, σ-field generated


by random variable.

This chapter explains some general ideas related to random variables.


In many situations, one is interested in only some aspects of a random experiment/phenomenon. For example, consider the experiment of tossing 3 unbiased coins, where we are only interested in the number of 'Heads' turned up. The probability space corresponding to the experiment of tossing 3 coins is given by

Ω = {(ω_1 , ω_2 , ω_3 ) | ω_i ∈ {H, T}} , F = P(Ω)

and P is described by

P({(ω_1 , ω_2 , ω_3 )}) = 1/8 .

Our interest is in knowing the number of 'Heads', i.e., in the map (ω_1 , ω_2 , ω_3 ) → I_{ω_1 = H} + I_{ω_2 = H} + I_{ω_3 = H} . In general we are interested in functions of the sample space. Also, in many cases, one never observes the underlying random phenomenon directly but observes 'measurements' coming from it, and these measurements can be modeled as functions of the sample space. So, in short, our interest lies mainly in functions of the sample space and their analysis, in order to make deductions about the random phenomenon. But to study a function associated with a random phenomenon, one should be able to assign probabilities to a 'reasonably large' class of events associated with the function. In general, we cannot do this for all functions defined on the sample space.
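For the 3-coin example above, the map counting Heads and its distribution can be tabulated by brute force:

```python
from itertools import product
from fractions import Fraction
from collections import Counter

# Sample space of 3 tosses; each singleton has probability 1/8.
omega = list(product("HT", repeat=3))
assert len(omega) == 8

# The map of interest: number of Heads in the outcome.
def X(w):
    return sum(1 for s in w if s == "H")

# Distribution of X, i.e. P pushed forward through X.
probs = {k: Fraction(v, 8) for k, v in Counter(map(X, omega)).items()}
assert probs == {0: Fraction(1, 8), 1: Fraction(3, 8),
                 2: Fraction(3, 8), 3: Fraction(1, 8)}
```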


So one needs to restrict oneself to a certain class of functions of the sample space for which this can be done, and we call such functions random variables. In short, random variables are nothing but 'measurable observations' from a random phenomenon.

Definition 2.1 Let Ω be a sample space and F a σ-field of subsets of Ω. A


function X : Ω → R is said to be a random variable (with respect to F) if

{ω ∈ Ω | X(ω) ≤ x} ∈ F for all x ∈ R .

Now on, we denote {ω ∈ Ω | X(ω) ≤ x} by {X ≤ x}.

Remark 1.0.1 (1) Sometimes we refer to X as a random variable on (Ω, F), and if no ambiguity arises, we simply call X a random variable.
(2) In the definition of a random variable, the probability measure P does not enter the picture.
(3) Also, it is possible that a function which is not a random variable with respect to one σ-field becomes a random variable with respect to another σ-field.

Example 1.0.2 Let Ω = {H, T }, F = P(Ω). Define X : Ω → R as


follows:
X(H) = 1 , X(T ) = 0 .
Then X is a random variable.

Note that any map Z : Ω → R is a random variable for the probability


space given in Example 1.0.2.

Now, we give an example, where this is not the case.

Example 1.0.3 Let Ω = {1, 2, 3, 4, 5, 6} and F = σ({{2}, {4}, {6}, {1, 3, 5}}).
Then X(1) = 1, X(ω) = 0 otherwise is not a random variable with respect
to F.
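For a finite Ω, Example 1.0.3 can be checked mechanically: the σ-field generated by a partition consists of all unions of its atoms, and it suffices to test the finitely many distinct events {X ≤ x}. A sketch (helper names are my own):

```python
from itertools import combinations

omega = {1, 2, 3, 4, 5, 6}
# Atoms of F = σ({{2}, {4}, {6}, {1, 3, 5}}): the generators already
# partition Ω, so F is the set of all unions of these four blocks.
atoms = [frozenset({2}), frozenset({4}), frozenset({6}), frozenset({1, 3, 5})]
F = {frozenset().union(*c)
     for r in range(len(atoms) + 1)
     for c in combinations(atoms, r)}
assert len(F) == 2 ** len(atoms)  # 16 sets

def is_random_variable(X, F, omega):
    # Only the distinct values of X give distinct events {X <= x}.
    return all(frozenset(w for w in omega if X(w) <= x) in F
               for x in {X(w) for w in omega})

X = lambda w: 1 if w == 1 else 0          # X = indicator of {1}
print(is_random_variable(X, F, omega))    # False: {X <= 0} = {2,...,6} is not in F
```

The failing event {X ≤ 0} = {2, 3, 4, 5, 6} would require the atom {1}, which F does not separate out.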
 
That X is a random variable implies that the basic events X^{−1}((−∞, x]) associated with X are in F. Does this imply that a larger class of events associated with X can be assigned probabilities (i.e., lies in F)? To examine this, one needs the following σ-field.

Definition 2.2 The σ-field generated by the collection of all open sets1 in
R is called the Borel σ-field of subsets of R and is denoted by BR .

Lemma 1.0.1 Let I1 = {(−∞, x] | x ∈ R}. Then σ(I1 ) = BR .

Proof. For x ∈ R,

(−∞, x] = ∩_{n=1}^{∞} (−∞, x + 1/n) ∈ BR .

Therefore

I1 ⊆ BR .

Hence

σ(I1 ) ⊆ BR .

For x ∈ R,

(−∞, x) = ∪_{n=1}^{∞} (−∞, x − 1/n] .

Therefore

{(−∞, x) | x ∈ R} ⊆ σ(I1 ) .
Also, for a, b ∈ R, a < b,

(a, b) = (−∞, b) \ (−∞, a] ∈ σ(I1 ) .

Therefore, σ(I1 ) contains all open intervals of the form (a, b). Since any
open set in R can be written as a countable union of open intervals with
rational end points, it follows that

O ⊆ σ(I1 ) ,

where O denotes the set of all open sets in R. Thus,

BR ⊆ σ(I1 ) .

This completes the proof.

Theorem 1.0.3 Let X : Ω → R be a function. Then X is a random


variable on (Ω, F, P ) iff X −1 (B) ∈ F for all B ∈ BR .
¹ O ⊆ R is said to be an open set if for every x ∈ O there exists an ε > 0 such that B(x, ε) := (x − ε, x + ε) ⊆ O. A set is closed in R if its complement in R is open. Any open set in R can be written as a countable union of open intervals with rational end points.

Proof. Suppose X is a random variable, i.e.,

{X ≤ x} = X −1 {(−∞, x]} ∈ F for all x ∈ R. (1.0.1)

Set
A = {B ∈ BR | X −1 (B) ∈ F} .
From (1.0.1), we have
I1 ⊆ A ⊆ BR . (1.0.2)
Note that X^{−1}(R) = Ω. Hence R ∈ A.
Now

B ∈ A ⇒ X^{−1}(B) ∈ F
⇒ [X^{−1}(B)]^c ∈ F
⇒ X^{−1}(B^c) ∈ F (since [X^{−1}(B)]^c = X^{−1}(B^c))
⇒ B^c ∈ A .

Also

B_1 , B_2 , · · · ∈ A ⇒ X^{−1}(B_n) ∈ F ∀ n = 1, 2, 3, · · ·
⇒ ∪_{n=1}^{∞} X^{−1}(B_n) ∈ F
⇒ X^{−1}(∪_{n=1}^{∞} B_n) ∈ F
⇒ ∪_{n=1}^{∞} B_n ∈ A .

Hence A is a σ-field. Now from (1.0.2) and Lemma 1.0.1, it follows that A = BR , i.e., X^{−1}(B) ∈ F for all B ∈ BR .

Converse statement follows from the observation

I1 ⊆ BR .

This completes the proof.


 
Remark 1.0.2 Theorem 1.0.3 tells us that if X^{−1}((−∞, x]) ∈ F for all x ∈ R, then X^{−1}(B) ∈ F for all B ∈ BR .

Example 1.0.4 Let Ω = (0, 1], F = B(0, 1] . Define X(ω) = 3ω + 1.
Here B(0, 1] is the σ-field generated by all open sets in (0, 1].²
For x ∈ R,

{X ≤ x} = {ω ∈ Ω | 3ω + 1 ≤ x}
= ∅ if x < 1 ,
= (0, (x − 1)/3] if 1 ≤ x ≤ 4 ,
= (0, 1] if x > 4 .

² A set O in (0, 1] is open if O = (0, 1] ∩ G, where G is an open set in R.

Since

∅, (0, (x − 1)/3], (0, 1] ∈ B(0, 1] ,
X is a random variable.

Lemma 1.0.2 Let X : Ω → R. Define

σ(X) = {X −1 (B) | B ∈ BR } .

Then σ(X) is a σ-field.

Proof.
X −1 (R) = Ω .
Hence Ω ∈ σ(X).

For A ∈ σ(X), there exists B ∈ BR such that

X −1 (B) = A

and
Ac = X −1 (B c ) .
Hence A ∈ σ(X) implies Ac ∈ σ(X). Similarly from

X^{−1}(∪_{n=1}^{∞} B_n) = ∪_{n=1}^{∞} X^{−1}(B_n)

it follows that

A_1 , A_2 , · · · ∈ σ(X) ⇒ ∪_{n=1}^{∞} A_n ∈ σ(X) .

This completes the proof.

Definition 2.3 Let X be a random variable with respect to a σ-field F.


Then σ(X) is called the σ-field generated by X.

Remark 1.0.3 (1) The occurrence or non-occurrence of the event X^{−1}(B) tells whether a realization X(ω) is in B or not. Thus σ(X) collects all such information. Hence σ(X) can be viewed as the information available with the random variable.
(2) For a function X defined on a sample space Ω, σ(X) is a σ-field of
subsets of Ω and is the smallest σ-field under which X is a random variable.

Example 1.0.5 Let X be as in Example 1.0.3, i.e., Ω = {1, 2, 3, 4, 5, 6} and X = I_{1} . Then

X^{−1}(B) = ∅ if B ∩ {0, 1} = ∅ ,
= {2, 3, 4, 5, 6} if B ∩ {0, 1} = {0} ,
= {1} if B ∩ {0, 1} = {1} ,
= Ω if B ∩ {0, 1} = {0, 1} .

Hence σ(X) = σ({1}). It is easy to see that X is a random variable with respect to any σ-field containing σ({1}). Now see again (2) of Remark 1.0.3.
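The four preimages in Example 1.0.5 can be collected explicitly and the σ-field axioms checked directly (a small sketch for the finite case):

```python
# σ(X) for X = indicator of {1} on Ω = {1,...,6}: for an indicator,
# only the trace B ∩ {0, 1} of a Borel set B matters.
omega = frozenset({1, 2, 3, 4, 5, 6})

# Preimages of the four possible traces of B on the range {0, 1}.
sigma_X = {
    frozenset(),                       # B ∩ {0,1} = ∅
    frozenset({2, 3, 4, 5, 6}),        # B ∩ {0,1} = {0}
    frozenset({1}),                    # B ∩ {0,1} = {1}
    omega,                             # B ∩ {0,1} = {0,1}
}

# σ(X) is indeed a σ-field: closed under complement and (finite) union.
assert all(omega - A in sigma_X for A in sigma_X)
assert all(A | B in sigma_X for A in sigma_X for B in sigma_X)
```

With only four sets, countable unions reduce to finite ones, so the two assertions verify the full σ-field structure.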
