Slide 1
Modelling
We develop an imitation of the system. It could be, for example,
We use a model
Slide 2
Why do we model?
too slow,
too expensive,
Slide 3
interacting populations
epidemics
Slide 4
Slide 5
Stochastic modelling
Stochastic modelling is about the study of random experiments.
For example,
Slide 6
Stochastic modelling
Slide 7
[0, ∞).
Slide 8
A ∪ B.
A ∪ B = {ω : ω ∈ A or ω ∈ B} = B ∪ A.
A^c = {ω : ω ∉ A}.
A ∩ B = {ω : ω ∈ A and ω ∈ B} = B ∩ A = AB.
Slide 9
Slide 10
We say two points on S are in the same family if you can get
from one to the other by taking steps of arclength 1 around
the circle.
Slide 11
The issue is that the event A is not one we can see or measure so
should not be included in F.
Slide 12
These kinds of issues are technical to resolve and are dealt with in
later probability or analysis subjects which use measure theory.
Slide 13
Slide 14
How do we specify P?
The modelling process consists of
Example: Toss a fair coin 1000 times. Any length-1000 sequence of Hs and Ts has chance 2^(-1000).
Slide 15
Properties of P
P(∅) = 0.
P(A^c) = 1 − P(A).
Slide 16
Conditional probability
Let A, B F be events with P(B) > 0. Supposing we know that
B occurred, how likely is A given that information? That is, what
is the conditional probability P(A|B)?
For a frequency interpretation, consider the situation where we
have n trials and B has occurred nB times. What is the relative
frequency of A in these nB trials? The answer is
n_AB / n_B = (n_AB / n) / (n_B / n) ≈ P(A ∩ B) / P(B).
Hence, we define
P(A|B) = P(A ∩ B) / P(B).
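As a quick numerical check of this definition (not on the slides), we can estimate P(A|B) by simulation; the die experiment below is a made-up example with A = "the roll is 2" and B = "the roll is even", so the exact answer is 1/3.

```python
import random

random.seed(42)

# Monte Carlo check of P(A|B) = P(A ∩ B) / P(B).
# Hypothetical experiment: roll a fair die, B = "even", A = "roll is 2".
n = 100_000
n_B = 0   # number of trials in which B occurred
n_AB = 0  # number of trials in which both A and B occurred
for _ in range(n):
    roll = random.randint(1, 6)
    if roll % 2 == 0:
        n_B += 1
        if roll == 2:
            n_AB += 1

estimate = n_AB / n_B  # relative frequency of A among the trials where B occurred
print(round(estimate, 3))  # exact value is 1/3
```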
Example:
even?
labelled 3?
Slide 18
Bayes' formula
If B_1, …, B_n are disjoint events with Ω = ∪_{j=1}^n B_j and P(A) > 0, then
P(B_j | A) = P(B_j ∩ A) / P(A) = P(A|B_j) P(B_j) / Σ_{k=1}^n P(A|B_k) P(B_k).
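In code, Bayes' formula is a direct translation of the display above; a Python sketch with made-up priors P(B_k) and likelihoods P(A|B_k) (three hypothetical sources of items with different defect rates):

```python
# Bayes' formula: P(B_j | A) = P(A | B_j) P(B_j) / sum_k P(A | B_k) P(B_k).
# Hypothetical priors P(B_k) and likelihoods P(A | B_k).
prior = [0.5, 0.3, 0.2]
likelihood = [0.01, 0.02, 0.03]

evidence = sum(p * l for p, l in zip(prior, likelihood))           # P(A)
posterior = [p * l / evidence for p, l in zip(prior, likelihood)]  # P(B_j | A)

print([round(x, 3) for x in posterior])  # → [0.294, 0.353, 0.353]
```

Note that the posterior probabilities automatically sum to one, because the denominator is exactly the total probability P(A).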
Slide 19
Example:
Slide 20
Independent events
Slide 21
Random variables
A random variable (rv) on a probability space (Ω, F, P) is a function X : Ω → IR.
Usually, we want to talk about the probabilities that the values of random variables lie in sets of the form (a, b) = {x : a < x < b}.
The smallest σ-algebra of subsets of IR that contains these sets is called the set B(IR) of Borel sets, after Émile Borel (1871-1956).
The probability that X ∈ (a, b) is the probability of the subset {ω : X(ω) ∈ (a, b)}. In order for this to make sense, we need this set to be in F; we require this condition for all a < b, and we say the function X is measurable with respect to F.
So X is measurable with respect to F if {ω : X(ω) ∈ B} ∈ F for all Borel sets B ⊆ IR.
Slide 22
Distribution Functions
Slide 23
Distribution Functions
We say that
Slide 24
Examples of distributions
Slide 25
Random Vectors
A random vector X = (X_1, ..., X_d) is a measurable mapping of (Ω, F) to IR^d, that is, for each Borel set B ⊆ IR^d,
{ω : X(ω) ∈ B} ∈ F.
The distribution function of a random vector is
F_X(t) = P(X_1 ≤ t_1, …, X_d ≤ t_d),  t = (t_1, …, t_d) ∈ IR^d.
It follows that
P(s_1 < X_1 ≤ t_1, s_2 < X_2 ≤ t_2) = F(t_1, t_2) − F(s_1, t_2) − F(t_1, s_2) + F(s_1, s_2).
Slide 26
Slide 27
Revision Exercise
Slide 28
Expectation of X
For a discrete, continuous or mixed random variable X that takes
on values in the set SX , the expectation of X is
E(X) = ∫_{S_X} x dF_X(x).
Expectation of g (X )
Slide 30
Properties of Expectation
E(aX + bY) = aE(X) + bE(Y).
If X ≤ Y, then E(X) ≤ E(Y).
If X = c (a constant), then E(X) = c.
Slide 31
Moments
V(cX) = c² V(X).
Slide 32
Conditional Probability
if P(X = x) = 0, then
P(A | X = x) = lim_{δ→0+} P(A ∩ {X ∈ (x, x + δ)}) / P({X ∈ (x, x + δ)}).
Slide 33
Conditional Distribution
Slide 34
Conditional Expectation
Slide 35
E (c|X ) = c,
E (E (Y |X )) = E (Y ),
Slide 36
Exercise
Let Ω = {a, b, c, d} with
P({a}) = 1/2, P({b}) = P({c}) = 1/8, and P({d}) = 1/4,
and define
X(ω) = 1 if ω = a or b, X(ω) = 0 if ω = c or d,
Y(ω) = 2 if ω = a or c, Y(ω) = 5 if ω = b or d.
Slide 37
Example
Slide 38
Exercise
f(x, y) = e^(−x/y) e^(−y) / y,  x > 0, y > 0.
Slide 39
X̄_n = (1/n) Σ_{j=1}^n X_j → μ as n → ∞.
In the strong form, this is true almost surely, which means that it is true on a set A of sequences x_1, x_2, . . . that has probability one.
In the weak form, this is true in probability, which means that, for all ε > 0,
P(|X̄_n − μ| > ε) → 0
as n → ∞.
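A simulation sketch (mine, not from the slides) showing the sample mean settling down, with each X_j Bernoulli(1/2) so μ = 1/2:

```python
import random

random.seed(0)

# Law of large numbers: the sample mean of i.i.d. Bernoulli(1/2) draws
# approaches the true mean mu = 1/2 as n grows.
def sample_mean(n):
    return sum(random.random() < 0.5 for _ in range(n)) / n

means = {n: sample_mean(n) for n in (100, 10_000, 1_000_000)}
for n, m in means.items():
    print(n, round(m, 4))
```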
Slide 40
as n → ∞.
That is, a suitably-scaled deviation from the mean approaches a standard normal distribution as n → ∞.
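A matching sketch for the central limit theorem, again not from the slides: standardize the sum of n uniforms on [0, 1] (mean 1/2, variance 1/12) and check that the sample looks standard normal.

```python
import random
import statistics

random.seed(1)

# Central limit theorem sketch: Z = (sum X_i - n/2) / sqrt(n/12) should
# be approximately N(0, 1) when the X_i are i.i.d. uniform on [0, 1].
n, reps = 500, 5_000
zs = []
for _ in range(reps):
    s = sum(random.random() for _ in range(n))
    zs.append((s - n / 2) / (n / 12) ** 0.5)

# Sample mean should be near 0 and sample standard deviation near 1.
print(round(statistics.mean(zs), 3), round(statistics.stdev(zs), 3))
```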
Slide 41
Slide 42
Example
Slide 43
Slide 44
http://www.ms.uky.edu/~mai/java/stat/brmo.html
Brownian motion
Lect 04
620-301
Slide 45
Slide 46
Slide 47
Interpretations
We can think of Ω as consisting of the set of sample paths ω = {X_t : t ∈ T}, that is, a set of sequences if T is discrete or a set of functions if T is continuous. Each ω has a value at each time point t ∈ T. With this interpretation,
Slide 48
If Xt is a counting process:
Slide 49
Finite-Dimensional Distributions
Slide 50
Slide 51
Independence?
Is it reasonable to assume that neighbouring letters are
independent?
Slide 53
Slide 54
P = [ p_11 p_12 … p_1m
      p_21 p_22 … p_2m
      ⋮              ⋮
      p_m1 p_m2 … p_mm ].
Slide 55
Slide 56
Each entry is ≥ 0.
Slide 57
Slide 58
Slide 59
Slide 60
p_ij^(n) = Σ_k p_ik^(r) p_kj^(n−r).
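In matrix form, the Chapman-Kolmogorov equations say P^(n) = P^n, so n-step probabilities can be computed by repeated matrix multiplication; a pure-Python sketch with a hypothetical two-state chain:

```python
# n-step transition probabilities via Chapman-Kolmogorov: P^(n) = P^n.
def mat_mul(A, B):
    m = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(m))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_pow(P, n):
    # Start from the identity and multiply by P a total of n times.
    result = [[float(i == j) for j in range(len(P))] for i in range(len(P))]
    for _ in range(n):
        result = mat_mul(result, P)
    return result

# Hypothetical 2-state chain.
P = [[0.9, 0.1],
     [0.5, 0.5]]
P10 = mat_pow(P, 10)
print([round(x, 4) for x in P10[0]])  # rows of P^(10) still sum to 1
```

By n = 10 the rows are already nearly identical, illustrating convergence to the stationary distribution (5/6, 1/6) of this particular chain.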
Slide 61
P^(n) = [ p_11^(n) p_12^(n) p_13^(n) …
          p_21^(n) p_22^(n) p_23^(n) …
          ⋮                        ⋱ ],
Slide 63
Slide 64
Slide 65
Slide 66
Example
Draw a transition diagram for the DTMC with transition matrix
P = [ 0   0.5 0.5 0
      0.5 0   0   0.5
      0   0   0.5 0.5
      0   0   0.5 0.5 ].
Slide 67
j ↔ j (reflexivity),
Slide 68
Slide 69
Definition:
Slide 70
Example
Classify the states of the DTMC with
P = [ 0.5  0.5  0    0
      0.5  0.5  0    0
      0.25 0.15 0.45 0.15
      0    0    0    1 ].
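Classification of this kind can be mechanized: i and j communicate exactly when each is reachable from the other along positive-probability transitions. A Python sketch (mine), applied to the matrix above with the states renumbered 0-3:

```python
# Find communicating classes: i and j communicate iff j is reachable
# from i and i is reachable from j (reflexivity gives i ↔ i).
def reachable(P, i):
    seen, stack = {i}, [i]
    while stack:
        s = stack.pop()
        for t, p in enumerate(P[s]):
            if p > 0 and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

P = [[0.5, 0.5, 0, 0],
     [0.5, 0.5, 0, 0],
     [0.25, 0.15, 0.45, 0.15],
     [0, 0, 0, 1]]

reach = [reachable(P, i) for i in range(len(P))]
classes = {frozenset(j for j in reach[i] if i in reach[j]) for i in range(len(P))}
print(sorted(sorted(c) for c in classes))  # → [[0, 1], [2], [3]]
```

Here {0, 1} is a closed class, state 2 is transient (it leaks into the other classes), and state 3 is absorbing.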
Slide 71
Exercise
Classify the states of the DTMC with
P = [ 0 0 + 0 0 0 +
      0 + 0 + 0 0 +
      + 0 0 0 0 0 0
      0 0 0 + 0 0 0
      0 + 0 0 0 0 0
      0 + 0 0 + + 0
      0 0 + 0 0 0 + ],
where + denotes a positive entry.
Slide 72
For each fixed state j, with probability one, the DTMC will
visit j only finitely many times.
Slide 73
Slide 74
Slide 75
Characterizing Recurrence
If the DTMC starts in a recurrent state j then, with probability
one, it will eventually re-enter j. At this point, the process will
start anew (by the Markov property) and it will re-enter again with
probability one. So the DTMC will (with probability one) visit j
infinitely-many times.
If the DTMC starts in a transient state j then there is a probability 1 − f_j > 0 that it will never return. So, letting N_j be the number of visits to state j after starting there, we see that N_j has a geometric distribution.
Specifically, for n ≥ 0,
P(N_j = n | X_0 = j) = P(T_1(j) < ∞, …, T_n(j) < ∞, T_{n+1}(j) = ∞).
This is equal to f_j^n (1 − f_j), which implies that E(N_j | X_0 = j) = f_j / (1 − f_j).
Slide 76
Characterizing Recurrence
If
q_j = Σ_{n=1}^∞ E[I(X_n = j) | X_0 = j] = Σ_{n=1}^∞ p_jj^(n),
then q_j = E[Σ_{n=1}^∞ I(X_n = j) | X_0 = j] = f_j / (1 − f_j), so
f_j = q_j / (1 + q_j).
It follows that state j is recurrent if and only if q_j = ∞.
[Read: j is recurrent if and only if the expected number of returns
to state j is infinite.]
Slide 77
Suppose that j is recurrent and that j communicates with k, and choose s, t such that p_jk^(s) > 0 and p_kj^(t) > 0. Then
p_jj^(s+n+t) = P(X_{s+t+n} = j | X_0 = j)
             ≥ P(X_{s+t+n} = j, X_{s+n} = k, X_s = k | X_0 = j)
             = p_jk^(s) p_kk^(n) p_kj^(t).
Similarly, p_kk^(n+s+t) ≥ α p_jj^(n), where α = p_jk^(s) p_kj^(t) > 0. So the series Σ_{n=1}^∞ p_kk^(n) must diverge because Σ_{n=1}^∞ p_jj^(n) diverges, and we conclude that state k is also recurrent.
Slide 78
Slide 79
Slide 80
By Stirling's formula, n! ≈ √(2πn)(n/e)^n, so for the simple random walk
p_jj^(2n) = C(2n, n) p^n q^n ≈ (4pq)^n / √(πn),
and the series Σ_{n=1}^∞ p_jj^(2n) diverges if and only if 4pq = 1, that is, if and only if p = q = 1/2.
Slide 81
Periodicity
Definition: The period d(j) of state j is
d(j) = gcd{n ≥ 1 : p_jj^(n) > 0}.
Slide 82
Find the period for the DTMC with
P = [ 0 0 0.5 0.5
      1 0 0   0
      0 1 0   0
      0 1 0   0 ].
Slide 83
Slide 84
Theorem:
Slide 85
Slide 86
Slide 87
Exercises
Analyse the DTMC with
P = [ 0 0.5 0.5
      1 0   0
      1 0   0 ].
Consider a DTMC with
P = [ 0 0 0.5 0.5
      1 0 0   0
      0 1 0   0
      0 1 0   0 ].
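The period of a state can be found numerically as the gcd of the times n at which p_jj^(n) > 0; a Python sketch (mine) applied to the second matrix above:

```python
from math import gcd

# Period of state j: gcd of { n >= 1 : p_jj^(n) > 0 }, approximated
# here by scanning the first max_n powers of P.
def mat_mul(A, B):
    m = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(m))
             for j in range(len(B[0]))] for i in range(len(A))]

def period(P, j, max_n=50):
    d, Pn = 0, [row[:] for row in P]  # Pn holds P^n as n runs 1, 2, ...
    for n in range(1, max_n + 1):
        if Pn[j][j] > 1e-12:
            d = gcd(d, n)
        Pn = mat_mul(Pn, P)
    return d

P = [[0, 0, 0.5, 0.5],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 1, 0, 0]]
print(period(P, 0))  # → 3
```

Every return path 0 → {2, 3} → 1 → 0 has length 3, so all return times are multiples of 3 and the period is 3 (the same for every state in the class).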
Slide 88
Example
Analyse the Markov chain with states numbered 1 to 5 and with
one-step transition probability matrix
P = [ 1   0   0   0   0
      1/2 0   1/2 0   0
      0   1/2 1/2 0   0
      0   0   0   0   1
      0   0   0   1   0 ].
Slide 89
Suppose that
Σ_{n=1}^∞ p_jj^(n) < ∞.
This means that the DTMC visits j only finitely-many times (with
probability one), given that it starts there.
Let S be the set of states, and fj,k be the probability that the
DTMC ever visits state k, given that it starts in state j.
Slide 90
Suppose instead that every state is transient. Then, for each k,
Σ_{n=1}^∞ E[I(X_n = k) | X_0 = j] ≤ f_{j,k} (1 + Σ_{n=1}^∞ p_kk^(n)) < ∞,
so, summing over the states,
Σ_{k∈S} Σ_{n=1}^∞ E[I(X_n = k) | X_0 = j] < ∞ when S is finite.
But, interchanging the order of summation,
Σ_{n=1}^∞ Σ_{k∈S} E[I(X_n = k) | X_0 = j] = Σ_{n=1}^∞ 1 = ∞,
a contradiction. So a finite-state DTMC must have at least one recurrent state. When S is infinite, the argument does not go through. And it shouldn't: the random walk with p > 1/2 has all states transient.
Slide 92
Slide 93
Slide 94
a d²y/dt² + b dy/dt + cy = 0,
Slide 95
Slide 96
Slide 97
Slide 98
((1 − p)/p)^j .
Slide 99
((1 − p)/p)^j .
Slide 100
Slide 101
We show by induction that f_{j,0}(m) ≤ g_{j,0} for all m. Clearly this is true for m = 1. Assume that it is true for m = ℓ. Then
f_{j,0}(ℓ + 1) = p_{j0} + Σ_{k≠0} p_{jk} f_{k,0}(ℓ)
              ≤ p_{j0} + Σ_{k≠0} p_{jk} g_{k,0}
              = g_{j,0}.
It follows that f_{j,0} = lim_{m→∞} f_{j,0}(m) ≤ g_{j,0}, and so {f_{j,0}} is the minimal nonnegative solution to (∗).
Slide 102
For the random walk with (1 − p)/p < 1, the general solution for j ≥ 1 was of the form
f_{j,0} = A + B ((1 − p)/p)^j.
The minimal nonnegative solution is f_{j,0} = ((1 − p)/p)^j.
Slide 103
The gambler would like to know the probability that he/she will
win $M before becoming bankrupt.
Slide 104
((1 − p)/p)^j .
Slide 105
For p ≠ 1/2, starting with $j and stopping at $0 or $(M + N), the probability of reaching $(M + N) before $0 is
(1 − ((1 − p)/p)^j) / (1 − ((1 − p)/p)^(M+N)),
so a gambler who starts with $N wins $M before going bankrupt with probability
(1 − ((1 − p)/p)^N) / (1 − ((1 − p)/p)^(M+N)).
Slide 106
For p = 1/2, the corresponding probabilities are
j / (M + N)
and
N / (M + N).
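The hitting probabilities can be checked by simulation. A Python sketch (mine, not from the slides), writing the target fortune simply as M and the initial fortune as j, so that for p ≠ 1/2 the win probability is (1 − r^j)/(1 − r^M) with r = (1 − p)/p:

```python
import random

random.seed(7)

# Gambler's ruin: start with $j, bet $1 with win probability p, stop at
# $0 or $M. For p != 1/2, P(reach M before 0) = (1 - r**j) / (1 - r**M)
# with r = (1 - p) / p. Compare the formula with a Monte Carlo estimate.
def reaches_M(j, M, p):
    x = j
    while 0 < x < M:
        x += 1 if random.random() < p else -1
    return x == M

j, M, p = 5, 10, 0.6
trials = 20_000
estimate = sum(reaches_M(j, M, p) for _ in range(trials)) / trials
r = (1 - p) / p
exact = (1 - r ** j) / (1 - r ** M)
print(round(estimate, 3), round(exact, 3))  # the two numbers agree closely
```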
Slide 107
Slide 108
Slide 109
Recall that we used T_i(j) to denote the time between the (i − 1)st and ith returns to state j. We then defined state j (and hence its communicating class) to be
Slide 110
Examples
Slide 111
Theorem: If j is an aperiodic, positive-recurrent state with mean return time μ_j, then
lim_{n→∞} p_jj^(n) = 1 / μ_j.
If j is null recurrent, the limit is 0.
Slide 112
P = [ 0 1
      1 0 ].
Slide 113
Further:
Theorem:
lim_{n→∞} p_ij^(n) = 1 / μ_j.
Slide 114
πP = π and hence πP^n = π.
Slide 115
Theorem:
Slide 116
Examples
An m m stochastic matrix P is called doubly-stochastic if all the
column sums are equal to one.
If an aperiodic DTMC has a doubly-stochastic transition matrix,
then we can easily verify that
(1/m, 1/m, 1/m, . . .)P = (1/m, 1/m, 1/m, . . .).
It follows that
π = (1/m, 1/m, 1/m, . . .),
and the stationary distribution is uniform on S.
Find a stationary distribution for
P = [ 0 1
      1 0 ].
Slide 117
Examples
Find a stationary distribution for
P = [ 1/2 1/2 0
      1/2 1/2 0
      0   0   1 ].
Find the stationary distribution for
P = [ 0.8  0.2  0    0
      0    0    0.5  0.5
      0.75 0.25 0    0
      0    0    0.4  0.6 ].
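For small chains the stationary distribution can be found by power iteration, repeatedly applying π ← πP; a Python sketch (mine) using the four-state matrix above:

```python
# Power iteration for the stationary distribution: start from any
# distribution and iterate pi <- pi P; for an irreducible aperiodic
# chain this converges to the unique solution of pi P = pi.
P = [[0.8, 0.2, 0, 0],
     [0, 0, 0.5, 0.5],
     [0.75, 0.25, 0, 0],
     [0, 0, 0.4, 0.6]]

pi = [0.25, 0.25, 0.25, 0.25]
for _ in range(500):
    pi = [sum(pi[i] * P[i][j] for i in range(4)) for j in range(4)]

print([round(x, 4) for x in pi])  # → [0.5357, 0.1429, 0.1429, 0.1786]
```

Solving πP = π by hand gives the exact answer π = (15/28, 4/28, 4/28, 5/28), matching the iteration.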
Slide 118
P = [ q p 0 0 …
      q 0 p 0 …
      0 q 0 p …
      ⋮        ⋱ ].
Slide 119
π_k = (p/q)^k (1 − (p/q)).
Slide 120
The distribution π is:
limiting,
stationary,
ergodic.
Slide 121
The distribution
π_j = lim_{n→∞} p_ij^(n)
Slide 122
The distribution
Slide 123
The distribution
Slide 124
Reducible DTMC
Slide 125
Periodic DTMC
so, if the chain has period d with cyclic subclasses S_r^(0), …, S_r^(d−1), then as n → ∞
p_ij^(nd+k) → d π_j for j ∈ S_r^(ℓ+k (mod d)) when i ∈ S_r^(ℓ),
p_ij^(nd+k) → 0 for j ∉ S_r^(ℓ+k (mod d)),
and Σ_{j∈S_r^(ℓ)} d π_j = 1 for any ℓ.
Slide 126
[Transition diagram and matrix for a five-state DTMC (states 1 to 5); the entries include 0.1, 0.9, 1/4, 1/3, 1/5 and 1/8.]
Slide 127
Good Trick
Sometimes we want to model a physical system where the future depends on part of the past. Consider the following example. A sequence of random variables {X_n} describes the weather at a particular location, with X_n = 1 if it is sunny and X_n = 2 if it is rainy on day n.
Suppose that the weather on day n + 1 depends on the weather conditions on days n − 1 and n as shown below:
P(X_{n+1} = 2 | X_n = X_{n−1} = 2) = 0.6
P(X_{n+1} = 1 | X_n = X_{n−1} = 1) = 0.8
P(X_{n+1} = 2 | X_n = 2, X_{n−1} = 1) = 0.5
P(X_{n+1} = 1 | X_n = 1, X_{n−1} = 2) = 0.75
Slide 128
Good Trick
With the pair states (X_{n−1}, X_n) ordered (1,1), (1,2), (2,1), (2,2), the pair process is a DTMC with transition matrix
P = [ 0.8  0.2  0    0
      0    0    0.5  0.5
      0.75 0.25 0    0
      0    0    0.4  0.6 ].
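A short simulation sketch (mine) of the trick: run the pair chain Z_n = (X_{n−1}, X_n) and estimate the long-run fraction of sunny days, which for these numbers works out to 19/28 ≈ 0.68.

```python
import random

random.seed(3)

# The pair chain Z_n = (X_{n-1}, X_n) is a DTMC. States, in order:
# 1 = (sun, sun), 2 = (sun, rain), 3 = (rain, sun), 4 = (rain, rain),
# with transitions read off the conditional probabilities above.
P = {1: [(1, 0.8), (2, 0.2)],
     2: [(3, 0.5), (4, 0.5)],
     3: [(1, 0.75), (2, 0.25)],
     4: [(3, 0.4), (4, 0.6)]}

state, sunny, steps = 1, 0, 100_000
for _ in range(steps):
    (a, pa), (b, _pb) = P[state]
    state = a if random.random() < pa else b
    sunny += state in (1, 3)  # today is sunny in states (·, sun)

print(round(sunny / steps, 2))
```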
Slide 129