
Introduction to Stochastic Processes

A stochastic process is a probabilistic model of a system that evolves randomly.

Definition 0.1 A stochastic process is a collection of random variables indexed by time, i.e., X =
{X(t); t ∈ T }, where T is some index set. The random variables X(t) take values in a set S called the
state space of the stochastic process. Any realization of X is a sample path and the set of all sample
paths is called the sample space of the stochastic process.

A stochastic process can also be viewed as a probability distribution over a space of paths which describe
the evolution of some random value, or system, over time. In a deterministic process, there is a fixed
trajectory (path) that the process follows with probability 1, while in a stochastic process, we do not know
a priori which path we will be given. But we still have some information, given by the probability distribution
that governs the process.
Examples:

(i.) X(t) = t with probability 1 is a deterministic process.

(ii.) X(t) = t for all t with probability 1/2 and X(t) = −t for all t with probability 1/2 is a stochastic
process: you either take the whole path t or the whole path −t, so the values depend on each other. If you
know one value (at any t ≠ 0), you automatically know all the values.

(iii.) The more interesting process where, independently at each t, X(t) = t with probability 1/2 and
X(t) = −t with probability 1/2 does not have this dependency. Its sample paths jump up and down
infinitely often between the lines y = t and y = −t.

When you examine a real stochastic process, e.g., a stock price, you stand at time t and know all the
values in the past and now (i.e., the initial segment of a path). You don’t know the future and want
to say something intelligent about it based on the past and now. You know exactly what is going to
happen for process (i). Even though (ii) is random, once you know what happened at some point you
know exactly what is going to happen. For (iii), you don’t know exactly what is going to happen, only
the distribution (independent of the past in this case).
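
To make the contrast concrete, here is a minimal simulation sketch in Python (assuming NumPy is available; the time grid and random seed are illustrative choices, not part of the notes). Process (ii) flips one coin for the whole path, while process (iii) flips an independent coin at every time point.

import numpy as np

rng = np.random.default_rng(0)
t = np.arange(1, 11)  # evaluate the processes at t = 1, ..., 10

# Process (ii): one coin flip chooses the WHOLE path, t or -t.
sign = rng.choice([1, -1])                # a single global sign
path_ii = sign * t                        # knowing one value reveals all others

# Process (iii): an independent coin flip at EACH time t.
signs = rng.choice([1, -1], size=t.size)
path_iii = signs * t                      # jumps between the lines t and -t

print(path_ii)    # either all of t or all of -t
print(path_iii)   # a mixture of +t and -t values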
More examples of stochastic processes: the Dow Jones Industrial Average at the end of the tth day,
the amount of inventory at the beginning of the tth week, the number obtained on the tth toss of a die,
the number of failed components in a shop at time t, and the number of customers waiting at time t.


A stochastic process has discrete time if the time variable takes nonnegative integer values, and continuous
time if the time variable takes nonnegative real values. Combined with a discrete or continuous state space,
this gives four basic types of sample paths, described in the notes below.
Notes:

Discrete If the system is observed only at discrete times (may not be equally spaced along the time
axis), the stochastic process is a discrete time stochastic process and is denoted by {Xn , n ≥ 0}.

Continuous If the system is observed continuously over time with X(t) being its state at time t, then
it is a continuous time stochastic process and is denoted by {X(t), t ≥ 0}.

Sample paths A sample path is a possible evolution (trajectory) of X. Figure 1 shows some typical
sample paths. (1) Continuous time, discrete space: the number of customers in a queue, the number
of failed components in a shop; of interest is the distribution of the number of customers at time t.
(2) Continuous time, continuous space: the amount of inventory (a) and the amount of water in a
tank (b); (3) discrete time: the DJIA at the end of the nth day and the number obtained on the nth
toss of a die (both with discrete state space), and inventory (in continuous units) at the beginning
of the nth day.

[Figure 1: Typical sample paths. Panels: (1) continuous time, discrete state space; (2) continuous time,
continuous state space (panels (a) and (b)); (3) discrete time, discrete or continuous state space.]

Although we are still dealing with a single basic experiment that involves outcomes governed by a prob-
ability law, the newly introduced time variable raises many new interesting questions. We want to say
something intelligent about the future.

(i.) The dependencies in the sequence of values generated by the process. For example, how do future
prices of a stock depend on past values?

(ii.) Long-term averages involving the entire sequence of generated values. For example, what is the
fraction of time that a machine is idle?

(iii.) The boundary events. For example, what is the probability that within a given hour all circuits
of some telephone system become simultaneously busy? We will not answer this question in this
course.

In this course, we will start by studying discrete time stochastic processes {Xn : n ≥ 0} (Chapter 4:
Discrete Time Markov Chains). These processes can be expressed explicitly, and thus are more tangible
and easier to visualize. Later we address continuous time processes {X(t), t ≥ 0} (Chapters 5, 6, and 8).
Below we first look at some interesting stochastic processes.

The simplest stochastic process is the iid sequence {Xn}, which by itself is not very interesting. Let
Sn = X1 + X2 + · · · + Xn. Then {Sn} is a stochastic process referred to as a random walk, and
X̄n = Sn/n is also a stochastic process.
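
The random walk and its running average are easy to simulate. A sketch in Python (assuming NumPy; the ±1 step distribution is one illustrative choice of iid sequence, not prescribed by the notes):

import numpy as np

rng = np.random.default_rng(1)
X = rng.choice([1, -1], size=1000)      # an iid sequence of +/-1 steps
S = np.cumsum(X)                        # the random walk Sn = X1 + ... + Xn
X_bar = S / np.arange(1, X.size + 1)    # the running average Xbar_n = Sn/n

print(S[:10])     # the first few positions of the walk
print(X_bar[-1])  # close to E(Xk) = 0 when n is large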
Weak Law of Large Numbers
To estimate the average number of pizza slices we will sell in a day, we will measure pizza sales over the
next couple of weeks. Each day gets a random variable Xk, which represents the amount of pizza we sell
on day k. Each Xk has the same distribution, because the randomness in each day is similar. We then
take the average over the n days we measured. This average is itself a random variable,
X̄n = (X1 + · · · + Xn)/n.

(i.) Does X̄n have anything to do with the average pizza sales per day? In other words, does measuring
X̄n tell us anything about the Xk's? For instance, is X̄n (on average) close to the average sales per
day, E(Xk)?

(ii.) Does it matter what the distribution of pizza sales is? For instance, what if we usually have 1-4
customers but sometimes we have a conference where we have 45-50 people; that is kind of a weird
distribution. Does it mean that X̄n needs to be adjusted in some way to help us understand
the Xk's?

It turns out that, pretty often, the average X̄n is very close to the average value of Xk, especially when
n is large. This means that no matter what the distribution is, as long as we measure enough days, we
can get a pretty good sense of the average pizza sales.
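
As a sanity check, here is a Python sketch of exactly this situation (assuming NumPy; the 5% conference rate and the customer counts are made-up illustrative numbers). Even with this two-lump distribution, the sample mean lands near E(Xk):

import numpy as np

rng = np.random.default_rng(2)
n = 10_000                            # number of days measured
conference = rng.random(n) < 0.05     # assume 5% of days host a conference
sales = np.where(conference,
                 rng.integers(45, 51, size=n),   # conference days: 45-50 people
                 rng.integers(1, 5, size=n))     # ordinary days: 1-4 customers

true_mean = 0.95 * 2.5 + 0.05 * 47.5  # E(Xk) for this mixture distribution
print(sales.mean(), true_mean)        # sample mean is close despite the odd shape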

Theorem 0.1 (The Weak Law of Large Numbers): Let X1, X2, . . . be i.i.d. r.v.s with mean μ < ∞ and
variance σ² < ∞. For any c > 0, P(|X̄n − μ| ≥ c) → 0 as n → ∞ (convergence in probability).

The Weak LLN says that X̄n approaches μ as n → ∞. The Weak LLN is nice because it tells us that no matter
what the distribution of the Xk's is, the sample mean approaches the true mean.
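
The statement P(|X̄n − μ| ≥ c) → 0 can be checked by Monte Carlo. A sketch in Python (assuming NumPy; the Exponential(1) distribution, the tolerance c = 0.1, and the repetition count are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(3)
mu, c, reps = 1.0, 0.1, 1000          # Exponential(1) has mean mu = 1
for n in [10, 100, 1000, 10_000]:
    means = rng.exponential(1.0, size=(reps, n)).mean(axis=1)  # reps copies of Xbar_n
    print(n, np.mean(np.abs(means - mu) >= c))  # estimated P(|Xbar_n - mu| >= c) shrinks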

Theorem 0.2 (The Strong Law of Large Numbers): Let X1, X2, . . . be i.i.d. r.v.s with mean μ < ∞ and
variance σ² < ∞. Then X̄n converges to μ with probability 1 (i.e., almost sure convergence); that is,
X̄n → μ as n → ∞.

Notes: The Strong LLN says that the sample average is approximately equal to the mean, i.e., X̄n → μ, in a
stronger sense of convergence. Let

Xi = 1 if event E occurs, and Xi = 0 if event E does not occur.

Then (1/n)(X1 + · · · + Xn) is the relative frequency of occurrence of E in n repetitions of an experiment.
According to the LLN, (1/n)(X1 + · · · + Xn) → E(Xi) = P(E). The LLN justifies the frequency
interpretation of probabilities and much statistical estimation theory, where it underlies the notion of
consistency of an estimator.
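
The dice version of this, as a Python sketch (assuming NumPy; the event E = "a fair die shows 6", with P(E) = 1/6, is an illustrative choice):

import numpy as np

rng = np.random.default_rng(4)
rolls = rng.integers(1, 7, size=100_000)       # 100,000 tosses of a fair die
indicators = (rolls == 6).astype(int)          # Xi = 1 if E occurs, 0 otherwise
freq = indicators.cumsum() / np.arange(1, indicators.size + 1)  # relative frequency
print(freq[99], freq[9_999], freq[-1], 1 / 6)  # drifts toward P(E) as n grows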

Theorem 0.3 (The Central Limit Theorem) Let X1 , X2 , . . . be i.i.d. r.v.s with mean μ < ∞ and variance
σ 2 < ∞. Then, as n → ∞,

(Sn − nμ)/(σ√n), or equivalently (X̄n − μ)/(σ/√n), converges in distribution to N(0, 1).
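
A Python sketch of the CLT (assuming NumPy; Exponential(1), for which μ = σ = 1, is an illustrative and deliberately skewed choice of distribution):

import numpy as np

rng = np.random.default_rng(5)
n, reps = 500, 20_000
S = rng.exponential(1.0, size=(reps, n)).sum(axis=1)  # reps copies of Sn
Z = (S - n * 1.0) / (1.0 * np.sqrt(n))    # standardize: (Sn - n*mu)/(sigma*sqrt(n))
print(Z.mean(), Z.std())                  # approximately 0 and 1
print(np.mean(Z <= 1.96), 0.975)          # matches the standard normal CDF at 1.96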

One very important property of the random walk is the following.

P(Sn+1 ≤ y | Sn = x, Sn−1, . . . , S0) = P(Xn+1 ≤ y − x) = F(y − x)

for any (Sn−1, . . . , S0), where F is the common distribution function of the Xk's. That is, given the
entire history of a random walk, the probabilistic behavior of the future depends only on the present.
This property is called the Markov property. A random walk, as a stochastic process, is a Markov
process, and more specifically, a discrete time Markov chain (DTMC).
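
An empirical illustration of the Markov property, as a Python sketch (assuming NumPy; the ±1 step walk and the particular conditioning events are illustrative). Conditioning on very different histories leaves the distribution of the next increment unchanged:

import numpy as np

rng = np.random.default_rng(6)
steps = rng.choice([1, -1], size=(100_000, 21))  # 100,000 independent walks
walks = np.cumsum(steps, axis=1)                 # S1, ..., S21 for each walk

# Condition on two very different histories up to n = 10, then look at X11.
high = walks[:, 9] > 2                   # walks that are high at n = 10
low = walks[:, 9] < -2                   # walks that are low at n = 10
step = walks[:, 10] - walks[:, 9]        # S11 - S10 = X11
print(np.mean(step[high] == 1), np.mean(step[low] == 1))  # both are about 0.5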
