N sample times (in engineering parlance the original waveform is oversampled, or sampled at many times the Nyquist rate). The binary quantizer then produces outputs for which the average over N samples is very near the input, so that the decoder output x̂_{kN} is a good approximation to the input at the corresponding times. Since x̂_n has only a discrete number of possible values (N + 1 to be exact), one has thereby accomplished analog-to-digital conversion. Because the system involves only a binary quantizer used repeatedly, it is a popular one for microcircuit implementation.
As an approximation to a very slowly changing input sequence x_n, it is of interest to analyze the response of the system to the special case of a constant input x_n = x ∈ [−b, b) for all n (called a quiet input). This can be accomplished by recasting the system as a dynamical system as follows: Given a fixed input x, define the transformation T by

    Tu = u + x − b   if u ≥ 0
         u + x + b   if u < 0.

Given a constant input x_n = x, n = 1, 2, . . . , N, and an initial condition u_0 (which may be fixed or random), the resulting u_n sequence is given by

    u_n = T^n u_0.
If the initial condition u0 is selected at random, then the preceding formula defines a dynamical
system which can be analyzed.
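The iteration is easy to simulate. The following minimal Python sketch (the function names and the particular values of b, x, and u_0 are illustrative choices, not taken from the text) iterates the map T for a constant input:

def T(u, x, b):
    # One step of the constant-input sigma-delta map.
    return u + x - b if u >= 0 else u + x + b

def orbit(u0, x, b, n_steps):
    # Return u_1, ..., u_{n_steps}, where u_n = T^n u_0.
    u, out = u0, []
    for _ in range(n_steps):
        u = T(u, x, b)
        out.append(u)
    return out

print(orbit(u0=0.0, x=0.25, b=0.5, n_steps=8))
# [-0.25, 0.5, 0.25, 0.0, -0.25, 0.5, 0.25, 0.0]

Note that the orbit remains inside [−b, b), which is exactly what Exercise 2 below asks you to prove.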
The example is provided simply to emphasize the fact that time shifts are not the only interesting transformations when modeling communication systems.
The different constructions provide equivalent models for a given process: one emphasizing the sequence of outputs and the other emphasizing the action of a transformation on the underlying space in
producing these outputs. In order to demonstrate in what sense the models are equivalent for given
random processes, we next turn to the notion of the distribution of a random process.

Exercises
1. Consider the sigma-delta example with a constant input in the case b = 1/2, u_0 = 0, and x = 1/. Find u_n for n = 1, 2, 3, 4.
2. Show by induction in the constant input sigma-delta example that if u_0 = 0 and x ∈ [−b, b), then u_n ∈ [−b, b) for all n = 1, 2, . . ..
3. Let Ω = [0, 1) and F = [0, 1/2) and fix an α ∈ (0, 1). Define the transformation Tx = ⟨x + α⟩, where ⟨r⟩ ∈ [0, 1) denotes the fractional part of r; that is, every real number r has a unique representation as r = K + ⟨r⟩ for some integer K. Show that if α is rational, then T^n x is a periodic sequence in n.
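For readers who want to experiment with Exercise 3, the periodicity for rational α is easy to observe numerically. In this sketch the particular α and starting point are arbitrary choices, and exact rational arithmetic avoids floating-point drift:

from fractions import Fraction

# T x = <x + alpha>, the fractional part of x + alpha.
alpha = Fraction(2, 5)       # a rational alpha = p/q in lowest terms
x = Fraction(1, 3)           # an arbitrary starting point in [0, 1)
for n in range(11):
    print(n, x)
    x = (x + alpha) % 1      # the fractional part of x + alpha
# The printed orbit repeats with period q = 5.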

1.4 Distributions

Although in principle all probabilistic quantities of a random process can be determined from the
underlying probability space, it is often more convenient to deal with the induced probability measures or distributions on the space of possible outputs of the random process. In particular, this
allows us to compare different random processes without regard to the underlying probability spaces
and thereby permits us to reasonably equate two random processes if their outputs have the same
probabilistic structure, even if the underlying probability spaces are quite different.

We have already seen that each random variable Xn of the random process {Xn } inherits a
distribution because it is measurable. To describe a process, however, we need more than simply
probability measures on output values of separate single random variables: we require probability
measures on collections of random variables, that is, on sequences of outputs. In order to place
probability measures on sequences of outputs of a random process, we first must construct the
appropriate measurable spaces. A convenient technique for accomplishing this is to consider product
spaces, spaces for sequences formed by concatenating spaces for individual outputs.
Let I denote any finite or infinite set of integers; in particular, I = Z(n) = {0, 1, 2, . . . , n − 1}, I = Z, or I = Z+. Define x^I = {x_i}_{i∈I}. For example, x^Z = (. . . , x_{−1}, x_0, x_1, . . .) is a two-sided infinite sequence. When I = Z(n) we abbreviate x^{Z(n)} to simply x^n. Given alphabets A_i, i ∈ I, define the cartesian product spaces

    ×_{i∈I} A_i = {all x^I : x_i ∈ A_i, all i ∈ I}.

In most cases all of the A_i will be replicas of a single alphabet A, and the preceding product will be denoted simply by A^I. We shall abbreviate the space A^{Z(n)}, the space of all n-dimensional vectors with coordinates in A, by A^n. Thus, for example, A^{m,m+1,...,n} is the space of all possible outputs of the process from time m to time n; A^Z is the sequence space of all possible outputs of a two-sided process.
To obtain useful σ-fields of the preceding product spaces, we introduce the idea of a rectangle in a product space. A rectangle in A^I taking values in the coordinate σ-fields ℬ_i, i ∈ J, is defined as any set of the form

    B = {x^I ∈ A^I : x_i ∈ B_i; all i ∈ J},        (1.13)

where J is a finite subset of the index set I and B_i ∈ ℬ_i for all i ∈ J. (Hence rectangles are sometimes referred to as finite-dimensional rectangles.) A rectangle as in (1.13) can be written as a finite intersection of one-dimensional rectangles as

    B = ∩_{i∈J} {x^I ∈ A^I : x_i ∈ B_i} = ∩_{i∈J} X_i^{−1}(B_i),        (1.14)

where here we consider the X_i as the coordinate functions X_i : A^I → A defined by X_i(x^I) = x_i.
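As a concrete illustration, take A = {0, 1}, I = Z+, and J = {0, 2} with B_0 = {1} and B_2 = {0}. Then the rectangle B = {x^I ∈ A^I : x_0 = 1, x_2 = 0} = X_0^{−1}({1}) ∩ X_2^{−1}({0}) constrains only the two coordinates indexed by J; all remaining coordinates are free.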


As rectangles in A^I are clearly fundamental events, they should be members of any useful σ-field of subsets of A^I. One approach is simply to define the product σ-field ℬ_A^I as the smallest σ-field containing all of the rectangles, that is, the collection of sets that contains the clearly important class of rectangles and the minimum amount of other stuff required to make the collection a σ-field.
In general, given any collection G of subsets of a space Ω, σ(G) will denote the smallest σ-field of subsets of Ω that contains G, and it will be called the σ-field generated by G. By smallest we mean that any σ-field containing G must also contain σ(G). The σ-field is well defined since there must exist at least one σ-field containing G, namely the collection of all subsets of Ω. The intersection of all σ-fields that contain G must be a σ-field, it must contain G, and it must in turn be contained by all σ-fields that contain G.
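On a finite space Ω the generated σ-field can even be computed mechanically, since on a finite space a field is automatically a σ-field. The following Python sketch (the space and the generating class are arbitrary illustrations) closes G under complement and pairwise union until nothing new appears:

def generated_field(omega, gen):
    # Close gen under complement and (finite) union; on a finite
    # space the fixed point is sigma(gen).
    omega = frozenset(omega)
    sets = {frozenset(), omega} | {frozenset(g) for g in gen}
    while True:
        new = {omega - s for s in sets}
        new |= {s | t for s in sets for t in sets}
        if new <= sets:
            return sets
        sets |= new

sigma_G = generated_field({0, 1, 2, 3}, [{0}, {1, 2}])
print(sorted(tuple(sorted(s)) for s in sigma_G))
# The 8 events generated by the atoms {0}, {1, 2}, and {3}.

Here σ(G) consists of all unions of the atoms {0}, {1, 2}, and {3}, illustrating that generation adds exactly the complements and unions needed and nothing more.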
Given an index set I of integers, let rect(ℬ_i, i ∈ I) denote the set of all rectangles in A^I taking coordinate values in sets in ℬ_i, i ∈ I. We then define the product σ-field of A^I by

    ℬ_A^I = σ(rect(ℬ_i, i ∈ I)).
At first glance it would appear that given an index set I and an A-valued random process {X_n}_{n∈I} defined on an underlying probability space (Ω, ℬ, P), then given any index set J ⊂ I, the measurable space (A^J, ℬ_A^J) should inherit a probability measure from the underlying space through the random variables X^J = {X_n; n ∈ J}. The only hitch is that so far we only know that individual random variables X_n are measurable (and hence inherit a probability measure). To make sense here we must first show that collections of random variables such as the random sequence X^Z or the random vector X^n = {X_0, . . . , X_{n−1}} are also measurable and hence themselves random variables.
Observe that for any index set I of integers it is easy to show that inverse images under the mapping X^I from Ω to A^I will yield events in ℬ if we confine attention to rectangles. To see this we simply use the measurability of each individual X_n and observe that since (X^I)^{−1}(B) = ∩_{i∈J} X_i^{−1}(B_i) for a rectangle B as in (1.13), and since finite and countable intersections of events are events, we have for rectangles that

    (X^I)^{−1}(B) ∈ ℬ.        (1.15)

We will have the desired measurability if we can show that if (1.15) is satisfied for all rectangles, then it is also satisfied for all events in the σ-field generated by the rectangles. This result is an application of an approach named the good sets principle by Ash [1], p. 5. We shall occasionally wish to prove that all events possess some particular desirable property that is easy to prove for generating events. The good sets principle consists of the following argument: Let S be the collection of good sets consisting of all events F ∈ σ(G) possessing the desired property. If

    G ⊂ S, and hence all the generating events are good, and
    S is a σ-field,

then σ(G) ⊂ S and hence all of the events F ∈ σ(G) are good.
Lemma 1.4.1 Given measurable spaces (Ω_1, ℬ) and (Ω_2, σ(G)), a function f : Ω_1 → Ω_2 is ℬ-measurable if and only if f^{−1}(F) ∈ ℬ for all F ∈ G; that is, measurability can be verified by showing that inverse images of generating events are events.

Proof: If f is ℬ-measurable, then f^{−1}(F) ∈ ℬ for all F ∈ σ(G) and hence for all F ∈ G. Conversely, if f^{−1}(F) ∈ ℬ for all generating events F ∈ G, then define the class of sets

    S = {G : G ∈ σ(G), f^{−1}(G) ∈ ℬ}.

It is straightforward to verify that S is a σ-field: clearly Ω_2 ∈ S since its inverse image is Ω_1 ∈ ℬ, and S is closed under complementation since f^{−1}(Ω_2 − G) = Ω_1 − f^{−1}(G). The fact that S contains countable unions of its elements follows from the fact that σ(G) is closed under countable unions and inverse images preserve set-theoretic operations, that is,

    f^{−1}(∪_i G_i) = ∪_i f^{−1}(G_i).

Furthermore, S contains every member of G by assumption. Since S contains G and is a σ-field, σ(G) ⊂ S by the good sets principle. □
We have shown that the mappings X^I : Ω → A^I are measurable, and hence the output measurable space (A^I, ℬ_A^I) will inherit a probability measure from the underlying probability space and thereby determine a new probability space (A^I, ℬ_A^I, P_{X^I}), where the induced probability measure is defined by

    P_{X^I}(F) = P((X^I)^{−1}(F)) = P({ω : X^I(ω) ∈ F}),  F ∈ ℬ_A^I.        (1.16)
Such probability measures induced on the outputs of random variables are referred to as distributions for the random variables, exactly as in the simpler case first treated. When I = {m, m + 1, . . . , m + n − 1}, e.g., when we are treating X^n taking values in A^n, the distribution is referred to as an n-dimensional or nth-order distribution and it describes the behavior of an n-dimensional random variable. If I is the entire process index set, e.g., if I = Z for a two-sided process or I = Z+ for a one-sided process, then the induced probability measure is defined to be the distribution of the process. Thus, for example, a probability space (Ω, ℬ, P) together with a doubly infinite sequence of random variables {X_n}_{n∈Z} induces a new probability space (A^Z, ℬ_A^Z, P_{X^Z}), and P_{X^Z} is the distribution of the process. For simplicity, let us now denote the process distribution simply by m. We shall call the probability space (A^I, ℬ_A^I, m) induced in this way by a random process {X_n}_{n∈I} the output space or sequence space of the random process.
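The definition (1.16) is mechanical to apply when everything is finite. In the toy Python sketch below, the space, measure, and random variable are invented purely for illustration:

# A finite probability space Omega = {a, b, c} and a random variable X on it.
P = {'a': 0.25, 'b': 0.25, 'c': 0.5}   # point masses defining P
X = {'a': 0, 'b': 1, 'c': 1}           # X : Omega -> A = {0, 1}

def P_X(F):
    # Induced distribution: P_X(F) = P(X^{-1}(F)) = P({w : X(w) in F}).
    return sum(p for w, p in P.items() if X[w] in F)

print(P_X({0}), P_X({1}), P_X({0, 1}))   # 0.25 0.75 1.0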

Equivalence
Since the sequence space (A^I, ℬ_A^I, m) of a random process {X_n}_{n∈I} is a probability space, we can define random variables and hence also random processes on this space. One simple and useful such definition is that of a sampling or coordinate or projection function defined as follows: Given a product space A^I, define the sampling functions Π_n : A^I → A by

    Π_n(x^I) = x_n,  x^I ∈ A^I, n ∈ I.        (1.17)

The sampling function is named Π since it is also a projection. Observe that the distribution of the random process {Π_n}_{n∈I} defined on the probability space (A^I, ℬ_A^I, m) is exactly the same as the distribution of the random process {X_n}_{n∈I} defined on the probability space (Ω, ℬ, P). In fact, so far they are the same process since the {Π_n} simply read off the values of the {X_n}.
What happens, however, if we no longer build the Π_n on the X_n, that is, we no longer first select ω from Ω according to P, then form the sequence x^I = X^I(ω) = {X_n(ω)}_{n∈I}, and then define Π_n(x^I) = X_n(ω)? Instead we directly choose an x in A^I using the probability measure m and then view the sequence of coordinate values. In other words, we are considering two completely separate experiments, one described by the probability space (Ω, ℬ, P) and the random variables {X_n} and the other described by the probability space (A^I, ℬ_A^I, m) and the random variables {Π_n}. In these
two separate experiments, the actual sequences selected may be completely different. Yet intuitively
the processes should be the same in the sense that their statistical structures are identical, that
is, they have the same distribution. We make this intuition formal by defining two processes to
be equivalent if their process distributions are identical, that is, if the probability measures on the
output sequence spaces are the same, regardless of the functional form of the random variables of the
underlying probability spaces. In the same way, we consider two random variables to be equivalent
if their distributions are identical.
We have described two equivalent processes or two equivalent models for the same random
process, one defined as a sequence of perhaps very complicated random variables on an underlying
probability space, the other defined as a probability measure directly on the measurable space of
possible output sequences. The second model will be referred to as a directly given random process.
Which model is better depends on the application. For example, a directly given model for a
random process may focus on the random process itself and not its origin and hence may be simpler
to deal with. If the random process is then coded or measurements are taken on the random process,
then it may be better to model the encoded random process in terms of random variables defined
on the original random process and not as a directly given random process. This model will then
focus on the input process and the coding operation. We shall let convenience determine the most
appropriate model.
We can now describe yet another model for the random process described previously, that is, another means of describing a random process with the same distribution. This time the model is in terms of a dynamical system. Given the probability space (A^I, ℬ_A^I, m), define the (left) shift transformation T : A^I → A^I by

    T(x^I) = T({x_n}_{n∈I}) = y^I = {y_n}_{n∈I},

where

    y_n = x_{n+1},  n ∈ I.

Thus the nth coordinate of y^I is simply the (n + 1)th coordinate of x^I. (Recall that we assume for random processes that I is closed under addition and hence if n and 1 are in I, then so is n + 1.) If the alphabet of such a shift is not clear from context, we will occasionally denote it by T_A or T_{A^I}.
It can easily be shown that the shift is indeed measurable by showing it for rectangles and then
invoking Lemma 1.4.1.
Consider next the dynamical system (A^I, ℬ_A^I, m, T) and the random process formed by combining the dynamical system with the zero time sampling function Π_0 (we assume that 0 is a member of I). If we define Y_n(x) = Π_0(T^n x) for x = x^I ∈ A^I, or, in abbreviated form, Y_n = Π_0 T^n, then the random process {Y_n}_{n∈I} is equivalent to the processes developed previously. Thus we have
developed three different, but equivalent, means of producing the same random process. Each will
be seen to have its uses.
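The identity Y_n = Π_0 T^n can be checked directly on a concrete sequence. In the sketch below (illustrative only), a point of the one-sided sequence space is modeled as a Python function from Z+ into A:

def shift(x):
    # Left shift: (T x)_n = x_{n+1}.
    return lambda n: x(n + 1)

def pi0(x):
    # Zero time sampling (projection) function.
    return x(0)

x = lambda n: n % 3          # an arbitrary one-sided sequence x_0, x_1, ...
y = x
for n in range(6):
    assert pi0(y) == x(n)    # Y_n = Pi_0(T^n x) reads off x_n
    y = shift(y)
print('Y_n = x_n verified for n = 0, ..., 5')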
The preceding development shows that a dynamical system is a more fundamental entity than
a random process since we can always construct an equivalent model for a random process in terms
of a dynamical system: use the directly given representation, shift transformation, and zero time
sampling function.
The shift transformation introduced previously on a sequence space is the most important transformation that we shall encounter. It is not, however, the only important transformation. Hence
when dealing with transformations we will usually use the notation T to reflect the fact that it is
often related to the action of a simple left shift of a sequence, yet we should keep in mind that
occasionally other operators will be considered and the theory to be developed will remain valid;
that is, T is not required to be a simple time shift. For example, we will also consider block shifts
of vectors instead of samples and variable length shifts.
Most texts on ergodic theory deal with the case of an invertible transformation, that is, where T
is a one-to-one transformation and the inverse mapping T^{−1} is measurable. This is the case for the shift on A^Z, the so-called two-sided shift. It is not the case, however, for the one-sided shift defined on A^{Z+}, and hence we will avoid use of this assumption. We will, however, often point out in the
discussion and exercises what simplifications or special properties arise for invertible transformations.
Since random processes are considered equivalent if their distributions are the same, we shall
adopt the notation [A, m, X] for a random process {X_n; n ∈ I} with alphabet A and process distribution m, the index set I usually being clear from context. We will occasionally abbreviate this to
the more common notation [A, m], but it is often convenient to note the name of the output random
variables as there may be several; e.g., a random process may have an input X and output Y . By the
associated probability space of a random process [A, m, X] we shall mean the sequence probability space (A^I, ℬ_A^I, m). It will often be convenient to consider the random process as a directly given random process, that is, to view X_n as the coordinate functions Π_n on the sequence space A^I rather
than as being defined on some other abstract space. This will not always be the case, however, as
often processes will be formed by coding or communicating other random processes. Context should
render such bookkeeping details clear.

Monotone Classes
Unfortunately there is no constructive means of describing the σ-field generated by a class of sets.
That is, we cannot give a prescription of adding all countable unions, then all complements, and so
