
Introduction to Probability Theory

K. Suresh Kumar
Department of Mathematics
Indian Institute of Technology Bombay

October 1, 2017

LECTURES 18-19

Real valued transformations of multi random variables: In this subsection, we look at transformations of (X, Y ) of the form ϕ ◦ (X, Y ), i.e.
ϕ(X, Y ) where ϕ : R² → R, i.e. real valued functions of (X, Y ), for example
X + Y , XY etc. One can write down the distribution function of Z = ϕ ◦ (X, Y )
using the distribution µ of (X, Y ) as follows. Set

Az = {(x, y) | ϕ(x, y) ≤ z},  z ∈ R.

Then

FZ(z) = P({ϕ(X, Y ) ∈ (−∞, z]}) = P({(X, Y ) ∈ Az}) = µ(Az),

where µ denotes the joint distribution of (X, Y ) and FZ denotes the distribution function of Z = ϕ(X, Y ). Hence it is all about computing Az and then µ(Az).

Example 0.1 (Distribution of sum) Let X, Y be random variables with joint pdf f . Then one
can compute the distribution of the sum Z = X + Y as follows. Note

Az = {(x, y) | −∞ < x < ∞, −∞ < y ≤ z − x}.

Hence

FZ(z) = ∫_{−∞}^{∞} ∫_{−∞}^{z−x} f(x, y) dy dx
(put t = y + x)  = ∫_{−∞}^{∞} ∫_{−∞}^{z} f(x, t − x) dt dx
(change order of integration)  = ∫_{−∞}^{z} ∫_{−∞}^{∞} f(x, t − x) dx dt.

Hence X + Y has pdf given by

fZ(z) = ∫_{−∞}^{∞} f(x, z − x) dx.
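As an informal check (not part of the notes), the formula fZ(z) = ∫ f(x, z − x) dx can be verified numerically for a concrete joint density; here I take X, Y to be independent standard normals (my own choice), so that Z = X + Y ∼ N(0, 2):

```python
# Sketch: numerically checking f_Z(z) = ∫ f(x, z - x) dx for a chosen joint pdf.
import numpy as np
from scipy import integrate, stats

def joint_pdf(x, y):
    # X, Y independent standard normals, so f(x, y) = phi(x) * phi(y)
    return stats.norm.pdf(x) * stats.norm.pdf(y)

def pdf_of_sum(z):
    # f_Z(z) = integral of f(x, z - x) over x
    val, _ = integrate.quad(lambda x: joint_pdf(x, z - x), -np.inf, np.inf)
    return val

for z in [-1.0, 0.0, 2.5]:
    exact = stats.norm.pdf(z, loc=0.0, scale=np.sqrt(2.0))  # X + Y ~ N(0, 2)
    print(z, pdf_of_sum(z), exact)                           # the two values agree
```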

Corollary 0.1 Let X, Y be independent random variables with joint pdf f .
Then the pdf of X + Y is given by

fX+Y(z) = fX ∗ fY(z),

where fX ∗ fY denotes the convolution of fX and fY and is defined as

fX ∗ fY(z) = ∫_R fX(y) fY(z − y) dy = ∫_R fX(z − y) fY(y) dy .

Proof. The proof (exercise) follows immediately from f (x, y) = fX (x)fY (y)
and the above example.

Example 0.2 Let X, Y be independent exponential random variables with
parameters λ1 and λ2 respectively. Then

fX(x) = λ1 e^{−λ1 x} if x ≥ 0,   and   fX(x) = 0 if x < 0.

fY is given similarly. Now for z ≤ 0, clearly fX ∗ fY(z) = 0. For z > 0,

fX ∗ fY(z) = ∫_0^z λ1 e^{−λ1 x} λ2 e^{−λ2 (z−x)} dx
           = (λ1 λ2 / (λ2 − λ1)) (e^{−λ1 z} − e^{−λ2 z})   if λ1 ≠ λ2,
           = λ² z e^{−λ z}                                   if λ1 = λ2 = λ.

The above gives the pdf of X + Y .
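A numerical sanity check of the closed form above (the parameter values are arbitrary and only for illustration): the convolution integral is computed directly and compared with the formula, in both the λ1 ≠ λ2 and λ1 = λ2 cases:

```python
# Sketch: verifying the density of X + Y for independent exponentials.
import numpy as np
from scipy import integrate

def conv_numeric(z, lam1, lam2):
    # ∫_0^z lam1*e^{-lam1*x} * lam2*e^{-lam2*(z-x)} dx
    val, _ = integrate.quad(
        lambda x: lam1 * np.exp(-lam1 * x) * lam2 * np.exp(-lam2 * (z - x)), 0.0, z)
    return val

def conv_closed_form(z, lam1, lam2):
    if lam1 != lam2:
        return lam1 * lam2 / (lam2 - lam1) * (np.exp(-lam1 * z) - np.exp(-lam2 * z))
    return lam1 ** 2 * z * np.exp(-lam1 * z)

for z in [0.5, 1.0, 3.0]:
    print(conv_numeric(z, 1.0, 2.5), conv_closed_form(z, 1.0, 2.5))  # lam1 != lam2
    print(conv_numeric(z, 2.0, 2.0), conv_closed_form(z, 2.0, 2.0))  # lam1 == lam2
```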

Linear transformations of multi random variables: In this section we
look at the case when ϕ(x, y) = (x, y)A, where A is a 2 × 2 invertible (i.e. non
singular) matrix, and we find the distribution of ϕ(X, Y ). Note that

A = [ cos θ   −sin θ ]
    [ sin θ    cos θ ]

rotates (X, Y ) through an angle θ in the counter clockwise direction. To
find the distribution of (U, V ) = (X, Y )A, we use the following change of
variable formula. Let ϕ = (ϕ1, ϕ2) : R² → R² be a map (with continuous
partial derivatives) such that the Jacobian at (x, y)

J(ϕ1(x, y), ϕ2(x, y)) = det [ ∂ϕ1/∂x   ∂ϕ1/∂y ]
                            [ ∂ϕ2/∂x   ∂ϕ2/∂y ]   ≠ 0.

Then the area element under the mapping (x, y) ↦ (u = ϕ1(x, y), v = ϕ2(x, y))
makes the transformation dx dy ↦ du dv = |J(x, y)| dx dy.

(Footnote: here note that the infinitesimal (small) rectangle [x, x + dx] × [y, y + dy], i.e. dx × dy,
is approximately mapped to the parallelogram spanned by du and dv. Now du is the vector joining the
points (ϕ1(x, y), ϕ2(x, y)) and (ϕ1(x + dx, y), ϕ2(x + dx, y)) and hence

du = (ϕ1(x + dx, y) − ϕ1(x, y), ϕ2(x + dx, y) − ϕ2(x, y)) ∼ ( (∂ϕ1(x, y)/∂x) dx, (∂ϕ2(x, y)/∂x) dx ).

Similarly

dv ∼ ( (∂ϕ1(x, y)/∂y) dy, (∂ϕ2(x, y)/∂y) dy ).

Writing du = (∂ϕ1(x, y)/∂x) dx i + (∂ϕ2(x, y)/∂x) dx j + 0 k and dv = (∂ϕ1(x, y)/∂y) dy i + (∂ϕ2(x, y)/∂y) dy j + 0 k,

area of the parallelogram = |du × dv|
  = |(∂ϕ1(x, y)/∂x)(∂ϕ2(x, y)/∂y) − (∂ϕ2(x, y)/∂x)(∂ϕ1(x, y)/∂y)| dx dy
  = |J(ϕ1(x, y), ϕ2(x, y))| dx dy.)

Theorem 0.1 Let D be an elementary region in R² and f : D → R be
continuous. Let O be an open set in R² and ϕ = (ϕ1, ϕ2) : O → R² be such that

(i) ϕ is one to one,

(ii) ϕ1, ϕ2 have continuous partial derivatives on O,

(iii) J(ϕ1(x, y), ϕ2(x, y)) ≠ 0 on O,

(iv) there exists E ⊆ O such that E is elementary and ϕ(E) = D.

Then

∫∫_D f(u, v) du dv = ∫∫_E f(ϕ1(x, y), ϕ2(x, y)) |J(ϕ1(x, y), ϕ2(x, y))| dx dy.
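As an informal illustration (not part of the notes), the formula of Theorem 0.1 can be checked numerically for a concrete choice of f and ϕ; here ϕ is the polar-coordinate map and f(u, v) = e^{−(u²+v²)} on the unit disc, both my own choices:

```python
# Sketch: checking the change of variable formula for the polar-coordinate map
# (u, v) = (r cos t, r sin t), whose Jacobian is r, over the unit disc.
import numpy as np
from scipy import integrate

f = lambda u, v: np.exp(-(u ** 2 + v ** 2))

# Left-hand side: integral of f over the unit disc D.
# dblquad integrates func(y, x) over the inner variable first.
lhs, _ = integrate.dblquad(lambda v, u: f(u, v),
                           -1.0, 1.0,
                           lambda u: -np.sqrt(1.0 - u ** 2),
                           lambda u: np.sqrt(1.0 - u ** 2))

# Right-hand side: the same integral in polar coordinates, with the factor |J| = r.
rhs, _ = integrate.dblquad(lambda r, t: f(r * np.cos(t), r * np.sin(t)) * r,
                           0.0, 2.0 * np.pi,
                           lambda t: 0.0,
                           lambda t: 1.0)

print(lhs, rhs, np.pi * (1.0 - np.exp(-1.0)))  # all three values agree
```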

As an application of the above change of variable formula, we have the


following result.

Theorem 0.2 Let (X, Y ) be a random vector in R² with joint pdf f and A
be a non singular 2 × 2 matrix. Then (U, V ) = (X, Y )A has pdf g given by

g(u, v) = (1/|det A|) f((u, v)A^{−1}).

Proof: For u, v ∈ R, set Ruv = (−∞, u] × (−∞, v] and I = Ruv A^{−1}.
One can see that I is an elementary set. Now the distribution function
F(U,V) of (U, V ) is given by

F(U,V)(u, v) = P{(U, V ) ∈ Ruv}
[using that (x, y) ↦ (x, y)A is a bijection]  = P{(X, Y ) ∈ I}
  = ∫∫_I f(x, y) dx dy
(Theorem 0.1 applied to (u, v) ↦ (u, v)A^{−1})  = ∫∫_{Ruv} f((u, v)A^{−1}) (1/|det A|) du dv.

Hence (U, V ) has a pdf, say g, given by

g(u, v) = (1/|det A|) f((u, v)A^{−1}).
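A hedged numerical sanity check of Theorem 0.2 (my own illustration, not part of the notes): take (X, Y ) to be independent standard normals, pick an arbitrary non-singular A, and compare g(u, v) = f((u, v)A^{−1})/|det A| with the known density of (X, Y )A:

```python
# Sketch: checking g(u, v) = f((u, v) A^{-1}) / |det A| for a Gaussian example.
import numpy as np
from scipy.stats import multivariate_normal

A = np.array([[2.0, 1.0],
              [0.5, 1.5]])           # any non-singular 2x2 matrix (my choice)
A_inv = np.linalg.inv(A)
f = multivariate_normal(mean=[0.0, 0.0], cov=np.eye(2)).pdf   # joint pdf of (X, Y)

def g(u, v):
    xy = np.array([u, v]) @ A_inv    # (u, v) A^{-1}, row-vector convention
    return f(xy) / abs(np.linalg.det(A))

# With the row-vector convention (U, V) = (X, Y) A, the covariance of (U, V) is A^T A.
g_exact = multivariate_normal(mean=[0.0, 0.0], cov=A.T @ A).pdf

for point in [(0.0, 0.0), (1.0, -0.5), (2.0, 1.0)]:
    print(g(*point), g_exact(list(point)))   # the two densities agree at each point
```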

Multinormal distribution. X = (X1, X2) is said to be a non degenerate multinormal
with parameters µ = (µ1, µ2) and

Σ = [ σ1²   σ12 ]
    [ σ21   σ2² ]

if the joint pdf fX is given by

fX(x1, x2) = (1/(2π √(det Σ))) e^{−(1/2)(x−µ)Σ^{−1}(x−µ)⊥},

where Σ is symmetric positive definite and (x − µ)⊥ is the column vector
corresponding to the row vector (x1 − µ1, x2 − µ2). Also, Σ positive definite
implies that |σ12| < σ1 σ2. Rewrite Σ as

Σ = [ σ1²      ρσ1σ2 ]
    [ ρσ1σ2    σ2²   ]

then (exercise)

fX(x1, x2) = (1/(2πσ1σ2 √(1 − ρ²))) exp{ −(1/(2(1 − ρ²))) [ (x1 − µ1)²/σ1² − 2ρ(x1 − µ1)(x2 − µ2)/(σ1σ2) + (x2 − µ2)²/σ2² ] }.
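The ρ-form above is left as an exercise; a quick numerical cross-check (my own illustration, with arbitrary parameter values) compares it against the matrix form of the density:

```python
# Sketch: the rho-form of the bivariate normal density versus the matrix form.
import numpy as np
from scipy.stats import multivariate_normal

mu1, mu2, s1, s2, rho = 1.0, -2.0, 1.5, 0.8, 0.6
Sigma = np.array([[s1 ** 2, rho * s1 * s2],
                  [rho * s1 * s2, s2 ** 2]])

def pdf_rho_form(x1, x2):
    q = ((x1 - mu1) ** 2 / s1 ** 2
         - 2.0 * rho * (x1 - mu1) * (x2 - mu2) / (s1 * s2)
         + (x2 - mu2) ** 2 / s2 ** 2)
    return np.exp(-q / (2.0 * (1.0 - rho ** 2))) / (
        2.0 * np.pi * s1 * s2 * np.sqrt(1.0 - rho ** 2))

pdf_matrix_form = multivariate_normal(mean=[mu1, mu2], cov=Sigma).pdf

for pt in [(1.0, -2.0), (0.0, 0.0), (2.5, -1.0)]:
    print(pdf_rho_form(*pt), pdf_matrix_form(list(pt)))   # identical up to rounding
```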

Now we will see the marginal pdfs of the multinormal.

The essence of the calculation is 'completing the square'. I will do it separately. Consider

(x1 − µ1)²/σ1² − 2ρ(x1 − µ1)(x2 − µ2)/(σ1σ2) + (x2 − µ2)²/σ2²
  = [ (x2 − µ2)²/σ2² − 2ρ(x1 − µ1)(x2 − µ2)/(σ1σ2) + (ρ(x1 − µ1)/σ1)² ] + (1 − ρ²) ((x1 − µ1)/σ1)²
  = ( (x2 − µ2)/σ2 − ρ(x1 − µ1)/σ1 )² + (1 − ρ²) ((x1 − µ1)/σ1)²
  = ( (x2 − µ2)/σ2 − a )² + (1 − ρ²) ((x1 − µ1)/σ1)² ,
where

a = ρ(x1 − µ1)/σ1 .

Now

∫_{−∞}^{∞} fX(x1, x2) dx2 = ( e^{−(1/2)((x1 − µ1)/σ1)²} / (2πσ1σ2 √(1 − ρ²)) ) ∫_{−∞}^{∞} e^{ −(1/(2(1 − ρ²))) ((x2 − µ2)/σ2 − a)² } dx2

[ put x = ( (x2 − µ2)/σ2 − a ) / √(1 − ρ²) ]
  = ( e^{−(1/2)((x1 − µ1)/σ1)²} / (√(2π) σ1) ) · (1/√(2π)) ∫_{−∞}^{∞} e^{−x²/2} dx
  = (1/(√(2π) σ1)) e^{−(1/2)((x1 − µ1)/σ1)²} .

Hence X1 ∼ N(µ1, σ1²). Similarly X2 ∼ N(µ2, σ2²).
Theorem 0.3 Let X = (X1, X2) be a non degenerate multinormal random
variable with parameters µ and Σ. Then for any α, β ∈ R, αX1 + βX2 is a
normal random variable.
Proof: Since X − µ is multinormal with parameters 0 and Σ, it is enough to
prove the theorem when µ = 0 (exercise).

Let A be a 2 × 2 symmetric matrix such that AA⊥ = Σ. [Here the
choice of A is Σ^{1/2}, and Σ^{1/2} = P Λ^{1/2} P^{−1}, where Λ is the diagonal matrix of
eigenvalues of Σ (so that Σ = P Λ P^{−1}) and hence Λ^{1/2} is the diagonal matrix
with diagonal entries the square roots of the eigenvalues of Σ.] Define
Y = XA^{−1}; then using Theorem 0.2, the pdf g of Y exists and is given by

g(y1, y2) = |det A| f(yA)
          = √(det Σ) · (1/(2π √(det Σ))) e^{−(1/2) yAΣ^{−1}Ay⊥}
          = (1/(2π)) e^{−(1/2)‖y‖²} ,

since AΣ^{−1}A = I.

Hence Y is multinormal with parameters 0 and I. Therefore g(y1, y2) =
gY1(y1) gY2(y2). This implies that Y1 and Y2 are independent standard normal
random variables.
Now, since X = Y A, we can see that αX1 + βX2 = aY1 + bY2 for some a, b ∈ R and hence
is a normal random variable. (exercise)
This completes the proof.
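The following sketch mirrors the construction in the proof (the specific Σ is arbitrary): A = Σ^{1/2} is built from the eigendecomposition, and a Monte Carlo draw confirms that Y = XA^{−1} has approximately identity covariance:

```python
# Sketch: the symmetric square root A = Sigma^{1/2} via the eigendecomposition,
# and a Monte Carlo check that Y = X A^{-1} is approximately N(0, I).
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

eigvals, P = np.linalg.eigh(Sigma)        # Sigma = P diag(eigvals) P^T, P orthogonal
A = P @ np.diag(np.sqrt(eigvals)) @ P.T   # symmetric square root, so A A = Sigma
print(np.allclose(A @ A, Sigma))          # True

X = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma, size=200_000)  # rows are samples
Y = X @ np.linalg.inv(A)                  # Y = X A^{-1}
print(np.cov(Y, rowvar=False))            # close to the 2x2 identity matrix
```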

Remark 0.1 The proof of Theorem 0.3 tells us something more. Let X ∼
N(µ, Σ), i.e. X is a multinormal random variable with parameters µ and
Σ. Set X̄ = X − µ; then X̄ ∼ N(0, Σ) and Ȳ = X̄Σ^{−1/2} ∼ N(0, I). Hence

Y := XΣ^{−1/2} = µΣ^{−1/2} + X̄Σ^{−1/2} ∼ N(µΣ^{−1/2}, I).

Note that the above is a generalization to the multidimensional case of the
following result for normal random variables:

if X ∼ N(µ, σ²), then aX ∼ N(aµ, a²σ²).

Theorem 0.3 leads to a more general definition of multinormal, one which
also includes degenerate multinormal random variables.

Definition 6.7 A random vector X = (X1, X2) is said to be multinormal if
αX1 + βX2 is normal for all α, β ∈ R.

Example 0.3 Let X1 ∼ N(0, 1) and X = (X1, −X1). Then any linear
combination of the components of X is normally distributed. Also X does not
have a joint density function. To see this, let L = {(x, y) | x + y = 0}, the
graph of x + y = 0. Then

P{X ∈ L} = P(Ω) = 1.

Now suppose X has a joint pdf f ; then for Ln = L ∩ ([−n, n] × [−n, n]),

P{X ∈ Ln} = ∫∫_{Ln} f(x, y) dx dy = 0, for all n ≥ 1,

since Ln has zero area. Now

P{X ∈ L} = lim_{n→∞} P{X ∈ Ln} = 0,

a contradiction to P{X ∈ L} = 1. Hence X doesn't have a density, i.e., X
is an example of a degenerate multinormal distribution.
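A short simulation (my own illustration, not part of the notes) of Example 0.3: the sample covariance matrix of X = (X1, −X1) is singular, and every sample lies on the line x + y = 0:

```python
# Sketch: the degenerate multinormal (X1, -X1) has singular covariance.
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.standard_normal(100_000)
X = np.column_stack([x1, -x1])            # samples of (X1, -X1)

C = np.cov(X, rowvar=False)
print(C)                                  # approximately [[1, -1], [-1, 1]]
print(np.linalg.det(C))                   # approximately 0: Sigma is singular
print(np.all(X[:, 0] + X[:, 1] == 0))     # every sample lies on the line x + y = 0
```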

Chapter 7: Expectation and other moments

In this chapter, we introduce the expected value or mean of a random
variable. First we define expectation for discrete random variables and then
for general random variables. Finally we introduce other moments and comment
on the moment problem.

First we give a useful representation of discrete random variables.


Theorem 0.4 Let X be a discrete random variable defined on a probability
space (Ω, F, P ). Then there exists a partition {An | n = 1, 2, . . . , N} ⊆ F of
Ω and {an | n = 1, 2, . . . , N} ⊆ R such that the an's are distinct and

X = Σ_{n=1}^{N} an I_{An}  a.s. ,

where N may be ∞.
Proof. Let F be the distribution function of X. Let {a1, a2, . . . , aN} be the
set of all discontinuities of F . Here N may be ∞. Since X is discrete, we
have

Σ_{n=1}^{N} [F(an) − F(an−)] = 1 .

Set

An = {X = an} .

Then {An} is pairwise disjoint and ∪_{n=1}^{N} An = Ω a.s., i.e., {An} is a partition of
Ω (up to a null set) and

X = Σ_{n=1}^{N} an I_{An}  a.s.

Remark 0.2 If X is a discrete random variable on a probability space
(Ω, F, P ), then the 'effective' range of X is at most countable. Here
'effective' range means those values taken by X which have positive probability.
This leads to the name 'discrete' random variable.
In fact, in the above proof {An} is a partition of Ω0, a subset of Ω obtained
by excluding the non-probable values of X, and hence P(Ω0) = 1.

Remark 0.3 If X is a discrete random variable, then one can assume without
loss of generality that

X = Σ_{n=1}^{∞} an I_{An} ,

since if N < ∞, we can set An = ∅ for n ≥ N + 1 and choose an, n ≥ N + 1,
so that the an's remain distinct.

Theorem 0.5 Let {Bn} be a countable partition of Ω from F and {bn} be
a sequence of real numbers which are not necessarily distinct. If

Σ_{n=1}^{∞} an I_{An} = Σ_{n=1}^{∞} bn I_{Bn} ,

then

Σ_{n=1}^{∞} an P(An) = Σ_{n=1}^{∞} bn P(Bn) .

Proof. (Reading exercise) Note that the an's are distinct and hence it follows
that each Bm is contained in An for some n. For each n ≥ 1, set

In = {m ≥ 1 | An Bm ≠ ∅} .

Then clearly

An = ∪_{m∈In} Bm ,  n ≥ 1 .

Also if m0 ∈ In then an = bm0. Therefore

Σ_{m=1}^{∞} bm P(Bm) = Σ_{n=1}^{∞} Σ_{m∈In} bm P(Bm)
                    = Σ_{n=1}^{∞} Σ_{m∈In} an P(Bm)
                    = Σ_{n=1}^{∞} an P(An) .

This completes the proof.


Definition 7.1. Let X be a discrete random variable represented by {(An, an) | n ≥ 1}.
Then the expectation of X, denoted by EX, is defined as

EX = Σ_{n=1}^{∞} an P(An) ,

provided the series on the right hand side converges absolutely.

Remark 0.4 In view of Remark 6.0.5., if X has range {a1, a2, . . . , aN}, then

EX = Σ_{n=1}^{N} an P{X = an} .
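As a minimal illustration of Definition 7.1 and Remark 0.4 (the distribution below is made up for the example), the expectation is just the probability-weighted sum of the values:

```python
# Sketch: EX = sum of a_n * P(A_n) for a discrete random variable.
values = [-1.0, 0.0, 2.0, 5.0]
probs  = [0.1, 0.4, 0.3, 0.2]      # P(A_n) = P(X = a_n), summing to 1

EX = sum(a * p for a, p in zip(values, probs))
print(EX)                           # -0.1 + 0.0 + 0.6 + 1.0 = 1.5
```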

Example 0.4 Let X be a Bernoulli(p) random variable. Then

X = IA , where A = {X = 1} .

Hence
EX = P (A) = p .

Example 0.5 Let X be a Binomial(n, p) random variable. Then

X = Σ_{k=0}^{n} k I_{Ak} , where Ak = {X = k} .

Hence

EX = Σ_{k=0}^{n} k P(Ak) = Σ_{k=0}^{n} k (n choose k) p^k (1 − p)^{n−k}
   = n Σ_{k=1}^{n} (n−1 choose k−1) p^k (1 − p)^{n−k} = np .

Here we used the identity

k (n choose k) = n (n−1 choose k−1) .

Example 0.6 Let X be a Poisson(λ) random variable. Then

X = Σ_{n=0}^{∞} n I_{An} , where An = {X = n} .

Hence

EX = Σ_{n=0}^{∞} n e^{−λ} λ^n / n! = λ .

Example 0.7 Let X be a Geometric(p) random variable. Then

X = Σ_{n=1}^{∞} n I_{An} , where An = {X = n} .

Hence

EX = Σ_{n=1}^{∞} n p (1 − p)^{n−1} = p / (1 − (1 − p))² = 1/p .
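The three expectations above can be cross-checked numerically by summing an P(X = an) with scipy's pmf's (truncating the infinite sums at a large cutoff); this is only an illustration, not part of the notes:

```python
# Sketch: numerically checking EX for Binomial, Poisson and Geometric.
import numpy as np
from scipy import stats

n, p, lam = 10, 0.3, 2.5

k = np.arange(0, n + 1)
print(np.sum(k * stats.binom.pmf(k, n, p)), n * p)         # Binomial(n, p): np

k = np.arange(0, 200)
print(np.sum(k * stats.poisson.pmf(k, lam)), lam)           # Poisson(lam): lam

k = np.arange(1, 2000)
# scipy's geom is supported on {1, 2, ...} with pmf p(1-p)^{k-1}, matching Example 0.7
print(np.sum(k * stats.geom.pmf(k, p)), 1.0 / p)            # Geometric(p): 1/p
```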

Theorem 0.6 (Properties of expectation) (Reading exercise) Let X and Y


be discrete random variables with finite means. Then
(i) If X ≥ 0, then EX ≥ 0.
(ii) For a ∈ R
E(aX + Y ) = aEX + EY .

Proof. (i) Let {(An, an) | n ≥ 1} be a representation of X. Then X ≥ 0
implies an ≥ 0 for all n ≥ 1. Hence

EX = Σ_{n=1}^{∞} an P(An) ≥ 0 .

(ii) Let Y have a representation {(Bn, bn) | n ≥ 1}. Now by setting

Cnm = An Bm, n, m ≥ 1,   anm = an, m ≥ 1,   bnm = bm, n ≥ 1,

one can use the same partition for X and Y . Therefore

aX + Y = Σ_{n,m=1}^{∞} (a anm + bnm) I_{Cnm} .

Hence

E(aX + Y ) = Σ_{n,m=1}^{∞} (a anm + bnm) P(Cnm)
           = a Σ_{n=1}^{∞} Σ_{m=1}^{∞} anm P(Cnm) + Σ_{m=1}^{∞} Σ_{n=1}^{∞} bnm P(Cnm)
           = a Σ_{n=1}^{∞} Σ_{m=1}^{∞} an P(An Bm) + Σ_{m=1}^{∞} Σ_{n=1}^{∞} bm P(An Bm)
           = a Σ_{n=1}^{∞} an P(An) + Σ_{m=1}^{∞} bm P(Bm)
           = a EX + EY .
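A small finite example (entirely made up) checking linearity E(aX + Y ) = aEX + EY directly on a common sample space:

```python
# Sketch: linearity of expectation on a finite probability space.
omega_probs = [0.2, 0.3, 0.1, 0.4]          # P({w_i}) on Omega = {w_1, ..., w_4}
X = [1.0, 1.0, -2.0, 3.0]                    # X(w_i)
Y = [0.0, 5.0, 5.0, -1.0]                    # Y(w_i)
a = 2.5

E = lambda Z: sum(z * p for z, p in zip(Z, omega_probs))
lhs = E([a * x + y for x, y in zip(X, Y)])   # E(aX + Y)
rhs = a * E(X) + E(Y)                        # a EX + EY
print(lhs, rhs)                              # both equal 5.35
```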

Definition 7.2. (Simple random variable) A random variable is said to be
simple if it is discrete and its distribution function has only finitely many
discontinuities.

Theorem 0.7 Let X be a random variable on (Ω, F, P ) such that X ≥ 0.
Then there exists a sequence of simple random variables {Xn} satisfying

(i) For each n ≥ 1, Xn ≥ 0 and Xn ≤ Xn+1 ≤ X.

(ii) For each ω ∈ Ω, Xn(ω) → X(ω) as n → ∞.

Proof. For n ≥ 1, define the simple random variable Xn as follows:

Xn(ω) = k/2^n   if k/2^n ≤ X(ω) < (k + 1)/2^n,  k = 0, . . . , n2^n − 1,
Xn(ω) = 0       if X(ω) ≥ n .

Then the Xn's satisfy the following:

Xn ≤ Xn+1 ,  n ≥ 1,

lim_{n→∞} Xn(ω) = X(ω),  ω ∈ Ω .
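A sketch of the dyadic approximation used in the proof, implemented exactly as defined above (including the convention Xn = 0 on {X ≥ n}); only for illustration:

```python
# Sketch: the approximating simple functions X_n from the proof of Theorem 0.7.
import numpy as np

def X_n(x, n):
    """Level-n dyadic approximation of a non-negative value x."""
    x = np.asarray(x, dtype=float)
    out = np.floor(x * 2 ** n) / 2 ** n      # k/2^n with k/2^n <= x < (k+1)/2^n
    return np.where(x >= n, 0.0, out)        # the convention used in the proof

x = np.array([0.3, 1.7, 2.49, 5.0])
for n in [1, 2, 4, 8, 16]:
    print(n, X_n(x, n))                      # increases to x once n exceeds x
```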

Lemma 0.1 Let X be a non negative random variable and {Xn} be a sequence
of simple random variables satisfying (i) and (ii) of Theorem 0.7.
Then limn→∞ EXn exists and is given by

lim_{n→∞} EXn = sup{EY | Y is simple and Y ≤ X} .

Proof. (Reading exercise) Since Xn ≤ Xn+1, we have EXn ≤ EXn+1, n ≥ 1
(see exercise). Hence limn→∞ EXn exists. Also, since the Xn's are simple
and Xn ≤ X, clearly

EXn ≤ sup{EY | Y is simple and Y ≤ X} ,  n ≥ 1.

Therefore

lim_{n→∞} EXn ≤ sup{EY | Y is simple and Y ≤ X} .

Hence to complete the proof, it suffices to show that for Y simple and
Y ≤ X,

EY ≤ lim_{n→∞} EXn .

Let

Y = Σ_{k=1}^{m} ak I_{Ak} ,

where {Ak | k = 1, . . . , m} is a partition of Ω. Fix ε > 0 and set, for
1 ≤ k ≤ m and n ≥ 1,

Akn = {ω ∈ Ak | Xn(ω) ≥ ak − ε} .

Since Xn ≤ Xn+1, n ≥ 1, we have for each k,

Akn ⊆ Ak,n+1 ,  n ≥ 1 .

Also

ω ∈ Ak =⇒ X(ω) ≥ Y(ω) = ak
       =⇒ lim_{n→∞} Xn(ω) = X(ω) ≥ ak
       =⇒ Xn0(ω) ≥ ak − ε for some n0
       =⇒ ω ∈ Akn0 ⊆ ∪_{n=1}^{∞} Akn .

Hence

∪_{n=1}^{∞} Akn = Ak ,  1 ≤ k ≤ m .

From the definition of Akn we have

Xn ≥ Σ_{k=1}^{m} (ak − ε) I_{Akn} .

Hence

EXn ≥ Σ_{k=1}^{m} (ak − ε) P(Akn) .     (0.1)

Using the continuity property of probability, we have

lim_{n→∞} P(Akn) = P(Ak) ,  1 ≤ k ≤ m .

Now letting n → ∞ in (0.1), we get

lim_{n→∞} EXn ≥ Σ_{k=1}^{m} (ak − ε) P(Ak) = EY − ε .

Since ε > 0 is arbitrary, we get

lim_{n→∞} EXn ≥ EY .

This completes the proof.


Definition 7.3. The expectation of a non negative random variable X is
defined as

EX = lim_{n→∞} EXn ,     (0.2)

where {Xn} is a sequence of simple random variables as in Theorem 0.7.

Remark 0.5 One can define the expectation of a non negative random variable
X as

EX = sup{EY | Y is simple and Y ≤ X} .

But we use Definition 7.3, since it is more handy.
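As an illustration of Definition 7.3 (assuming X ∼ Exp(1), so that EX = 1; this example is my own, not from the notes), EXn can be computed directly from the cdf and is seen to increase to EX:

```python
# Sketch: EX_n for the simple approximations of Theorem 0.7, with X ~ Exp(1).
import numpy as np

def EX_n(n, cdf):
    # E X_n = sum over k of (k/2^n) * P(k/2^n <= X < (k+1)/2^n); the event {X >= n}
    # contributes 0 with the convention used in the proof of Theorem 0.7.
    k = np.arange(0, n * 2 ** n)
    left, right = k / 2 ** n, (k + 1) / 2 ** n
    return np.sum(left * (cdf(right) - cdf(left)))

exp_cdf = lambda t: 1.0 - np.exp(-t)          # cdf of Exp(1)
for n in [1, 2, 4, 8, 12]:
    print(n, EX_n(n, exp_cdf))                 # increases towards EX = 1
```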
