Chapter V: Some Continuous Models


Section 5.1: The uniform distribution
In many examples in Chapter 2, we considered continuous random variables whose
probability density function is a constant. For example, the random numbers generated
by a calculator are equally likely to take on any value between 0 and 1. More
precisely, if we divide the interval [0, 1] into small subintervals of length h, the chosen
number is equally likely to be in any of those intervals. Random variables with constant
density functions are said to be uniformly distributed.
Definition: X is uniformly distributed on the interval [a, b] if its probability density
function is $f(x) = \frac{1}{b-a}$, $a \le x \le b$.
We'll denote this as X ~ Uni(a, b).
Note that $\int_a^b f(x)\,dx = \left.\frac{x}{b-a}\right|_a^b = 1$. Geometrically, the region bounded by the graph of the
density function is a rectangle of length $b - a$ and height $\frac{1}{b-a}$, thus making the area of
the rectangle equal to 1.

[Figure: the Uni(a, b) density, a rectangle of height 1/(b - a) over the interval from a to b.]

The cumulative distribution function is $F(x) = \int_a^x f(t)\,dt = \left.\frac{t}{b-a}\right|_a^x = \frac{x-a}{b-a}$, $a \le x \le b$.
(And, of course, $F(x) = 0$ for $x < a$ and $F(x) = 1$ for $x > b$.)
Theorem 5.1: Let X ~ Uni(a, b). Then the expected value and variance of X are
$$E[X] = \frac{a+b}{2} \quad \text{and} \quad V[X] = \frac{(b-a)^2}{12}.$$
Proof: By definition,
$$E[X] = \int_a^b \frac{x}{b-a}\,dx = \left.\frac{x^2}{2(b-a)}\right|_a^b = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2}$$
(which, quite logically, is the midpoint of the interval [a, b]) and
$$V[X] = \int_a^b \frac{x^2}{b-a}\,dx - \left(\frac{a+b}{2}\right)^2 = \frac{b^3 - a^3}{3(b-a)} - \frac{a^2 + 2ab + b^2}{4}$$
$$= \frac{b^2 + ab + a^2}{3} - \frac{a^2 + 2ab + b^2}{4} = \frac{b^2 - 2ab + a^2}{12} = \frac{(b-a)^2}{12}.$$

Notice that the variance depends on the length of the interval and not on where it
starts and ends. So if X ~ Uni(0, 3) and Y ~ Uni(4, 7), then $V[X] = V[Y] = \frac{9}{12} = \frac{3}{4}$.

Example 5.1.1: The M14 bus arrives at a certain stop every 20 minutes, at 0, 20, and 40
minutes past the hour. A passenger arrives at a random time between 9:10 and 9:25.
What is the probability that she has to wait no more than two minutes for a bus?
Let X equal the number of minutes past 9:10 that she arrives. We are given that
X ~ Uni(0, 15), so the density function of X is $f(x) = \frac{1}{15}$, $0 \le x \le 15$. She waits no more than
two minutes if and only if she arrives between 9:18 and 9:20; that is, if $8 \le X \le 10$. The
probability of this event is $\int_8^{10} \frac{1}{15}\,dx = \frac{2}{15}$.
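A quick Monte Carlo check of this answer, sketched in Python (standard library only; the exact value is 2/15, about .133):

    import random

    # Example 5.1.1: draw the arrival time uniformly on (0, 15) minutes past 9:10
    # and count how often it lands between 8 and 10 minutes (9:18 to 9:20).
    trials = 100_000
    hits = sum(1 for _ in range(trials) if 8 <= random.uniform(0, 15) <= 10)
    print(hits / trials)   # should be close to 2/15, about 0.133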
Example 5.1.2: A stick, 1 meter long, is broken at a random point, uniformly chosen
over the length of the stick. What is the probability that the longer piece is at least twice
as long as the shorter piece?
Let X denote the distance from one end of the stick to the point at which it is broken.
We are told that X ~ Uni(0, 1), so its density function is $f(x) = 1$, $0 \le x \le 1$. The two
pieces are of length X and 1 − X. In order for the longer piece to be at least twice as long
as the shorter piece, X must either be less than $\frac{1}{3}$ or greater than $\frac{2}{3}$. The probability of
this event is $P\left(X < \tfrac{1}{3} \text{ or } X > \tfrac{2}{3}\right) = \int_0^{1/3} 1\,dx + \int_{2/3}^{1} 1\,dx = \frac{2}{3}$.
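The same kind of simulation confirms this answer; the sketch below (Python, standard library only) breaks the stick at a Uni(0, 1) point and checks how often the longer piece is at least twice the shorter one:

    import random

    # Example 5.1.2: the event occurs exactly when X < 1/3 or X > 2/3.
    trials = 100_000
    count = 0
    for _ in range(trials):
        x = random.random()
        shorter, longer = min(x, 1 - x), max(x, 1 - x)
        if longer >= 2 * shorter:
            count += 1
    print(count / trials)   # should be close to 2/3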
Example 5.1.3: Let Y be a continuous random variable with cumulative distribution
function $F(y) = y^3$, $0 \le y \le 1$, and let $X = F(Y) = Y^3$. Show that X ~ Uni(0, 1).
The cumulative distribution function of X is $G(x) = P(X < x) = P(Y^3 < x) =
P(Y < x^{1/3}) = F(x^{1/3}) = x$. Therefore, the probability density function of X is $g(x) = 1$,
implying that X ~ Uni(0, 1).
The result of Example 5.1.3 can be generalized, as follows:
Theorem 5.2: If Y is any continuous random variable whose cumulative distribution
function is F, and if X = F(Y), then X ~ Uni(0, 1). Conversely, if X ~ Uni(0, 1) and
$W = F^{-1}(X)$, where F is any cumulative distribution function, then F is the cumulative
distribution function of W.
This theorem is useful for generating random numbers from any distribution. For
example, suppose we want to generate values of a random variable whose density
function is $f(w) = 2e^{-2w}$, $w > 0$. The corresponding cumulative distribution function is
$F(w) = 1 - e^{-2w}$, from which $F^{-1}(x) = -\frac{1}{2}\ln(1-x)$. Thus, if we use a calculator or


computer to generate numbers that are Uni(0, 1) and transform them according to
$w = -\frac{1}{2}\ln(1-x)$, we will get numbers that are distributed according to the density
function $f(w) = 2e^{-2w}$.
The histogram on the left shows 100 randomly generated numbers, uniformly
distributed between 0 and 1. The histogram on the right shows what happens after those
numbers are transformed according to $w = -\frac{1}{2}\ln(1-x)$. The second histogram certainly
resembles the graph of the density function $f(w) = 2e^{-2w}$.
[Figure: two histograms: the 100 Uni(0, 1) values on the left, the transformed values W on the right.]
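Here is a sketch of that recipe in Python (standard library only; the function name exponential_by_inversion is just for illustration): Uni(0, 1) draws x are mapped to $w = -\frac{1}{2}\ln(1-x)$, which should follow the density $f(w) = 2e^{-2w}$.

    import math
    import random

    # Inverse-transform sampling: apply F^{-1}(x) = -(1/2) ln(1 - x) to Uni(0, 1) draws.
    def exponential_by_inversion(rate=2.0, n=1000):
        return [-math.log(1.0 - random.random()) / rate for _ in range(n)]

    ws = exponential_by_inversion()
    print(sum(ws) / len(ws))   # the sample average should settle near 0.5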

The moment generating function for the uniform distribution is:
$$M(t) = E[e^{tX}] = \int_a^b e^{tx}\,\frac{1}{b-a}\,dx = \frac{e^{bt} - e^{at}}{t(b-a)}$$
Unfortunately, this is not defined at t = 0, which makes its use for finding moments more
complicated. One way to do it is to expand M(t ) as a Taylor series:

$$M(t) = \frac{\left(1 + bt + \frac{b^2t^2}{2} + \frac{b^3t^3}{6} + \cdots\right) - \left(1 + at + \frac{a^2t^2}{2} + \frac{a^3t^3}{6} + \cdots\right)}{t(b-a)}$$
$$= \frac{(b-a)t + \frac{(b^2-a^2)t^2}{2} + \frac{(b^3-a^3)t^3}{6} + \cdots}{t(b-a)} = 1 + \frac{a+b}{2}\,t + \frac{a^2+ab+b^2}{6}\,t^2 + \cdots$$
Now, $E[X] = M'(0) = \frac{a+b}{2}$, $E[X^2] = M''(0) = \frac{a^2+ab+b^2}{3}$, etc.
Alternatively, we could find the derivatives of M(t ) by the quotient rule. Then we
would need L'Hopital's Rule, perhaps more than once, to evaluate the limit of the
derivatives as t approaches 0.

A statistical question

Suppose X ~ Uni(0, θ), where θ is an unknown parameter. We collect a random
sample $X_1, X_2, \ldots, X_n$ from this population. Our goal is to estimate θ. The key
observation is that, since all the data must lie between 0 and θ (that's the range of X),
θ must be at least as big as the largest observed value. For example, if we observe
{12, 17, 9, 23, 34, 10, 15}, then θ can't be less than 34.
Let Y represent the largest of the random variables in the sample; that is,
$Y = \max\{X_1, X_2, \ldots, X_n\}$. This is sometimes called the largest order statistic of the
sample. In view of our observation above, it might be reasonable to use Y to estimate θ.
To see if this is unbiased, we need to find the probability density function of Y.
Consider the event {Y < y}. The only way this can occur is if all of the $X_i$'s are less
than y. (Think about that!) So, $P(Y < y) = P(X_1 < y, X_2 < y, \ldots, X_n < y)$. Since the $X_i$'s
are independent and they all have the same distribution as X, the cumulative
distribution function of Y is $F(y) = P(Y < y) = [P(X < y)]^n = \left(\frac{y}{\theta}\right)^n$. So, the probability
density function of Y is $f(y) = F'(y) = \frac{n}{\theta}\left(\frac{y}{\theta}\right)^{n-1} = \frac{n\,y^{n-1}}{\theta^n}$, $0 \le y \le \theta$. For n > 1, this is an
increasing function of y, which means that the value of Y (i.e., the largest observed value)
is more likely to be near θ, the largest possible value, than near 0.
"
ny n!1
n
The expected value of Y is E[Y ] = # y n dy =
" . Hence, Y is a biased
"
n+1
0
n +1
estimator of ! . However, if we let ! =
Y , then ! is unbiased. This means that to
n
estimate ! in an unbiased way, we should take the largest observed value and make it a
n +1
little bigger by multiplying by
. Note that this fudge factor approaches 1 as n
n
increases since, as we take more observations, it is more likely that the largest observed
value is close to the largest possible observation ! .
This is not the only unbiased estimator of θ. Since $E[X] = \frac{\theta}{2}$, then $E[\bar X] = \frac{\theta}{2}$,
where $\bar X$ is the sample mean for a random sample of size n. So, $\tilde\theta = 2\bar X$ is an unbiased
estimator. However, it is not a reasonable estimator. Consider the random sample {3, 5,
6, 10, 26}. Then $\bar x = 10$ and the estimate of θ would be $\tilde\theta = 2\bar x = 20$. But, since the
largest data point is 26, in view of our earlier observation, θ must be bigger than 26.
So, 20 is not a possible value. This shows that not every unbiased estimator is useful.
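A short simulation makes the point concrete; this is a Python sketch (standard library only) with arbitrary illustrative choices θ = 10 and n = 7, comparing the biased Y, the corrected estimator, and 2 times the sample mean:

    import random

    # Compare averages of Y = max(sample), (n+1)/n * Y, and 2 * (sample mean)
    # over many samples from Uni(0, theta).
    theta, n, reps = 10.0, 7, 20_000
    avg = {"max": 0.0, "corrected max": 0.0, "2 * mean": 0.0}
    for _ in range(reps):
        sample = [random.uniform(0, theta) for _ in range(n)]
        y = max(sample)
        avg["max"] += y / reps
        avg["corrected max"] += (n + 1) / n * y / reps
        avg["2 * mean"] += 2 * (sum(sample) / n) / reps
    print(avg)   # "max" averages near n/(n+1)*theta = 8.75; the other two near 10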
Exercises 5.1
1. Suppose X ~ Uni(a, b) where $E[X] = 3$ and $V[X] = \frac{4}{3}$. Determine a and b.

2. Let X ~ Uni(0,1) and Y = a + bX . Show that Y ~ Uni ( a, a + b ) .

3. Let X ~ Uni(0,1) . Determine the probability density function of:
(a) $Y = X^2$
(b) $Y = \sqrt{X}$
(c) $Y = -\ln X$

4. Twenty-five numbers are randomly selected between 0 and 1.


(a) What is the probability that at least 20 of them are greater than .25?
(b) Use the Central Limit Theorem to determine the approximate probability that the
average of the 25 numbers will be greater than .48.
5. Vince has a large collection of marbles of various sizes. The radii of these marbles are
uniformly distributed between 5 and 10 mm. What is the expected volume of a marble?
[Recall: The volume of a sphere is $V = \frac{4}{3}\pi r^3$.]
6. A highway is 2L miles long. An accident is equally likely to occur anywhere along
the highway. A police car is stationed at the midpoint of the highway.
(a) What is the expected distance from the police car to the accident? [Introduce a
coordinate system so that the highway goes from −L to L. The police car is at 0. Let X be
the location of the accident. Then the distance from the police car to the accident is |X|.]
(b) What is the probability that the police car is more than $\frac{L}{2}$ miles from the accident?
7. A one-meter stick is broken at a random point, as in Example 5.1.2. Let R be the ratio
of the longer piece to the shorter piece. (Also see Exercise 2.3.11.)
(a) Argue that
$$R = \begin{cases} \dfrac{X}{1-X}, & \text{if } X > \dfrac{1}{2} \\[2ex] \dfrac{1-X}{X}, & \text{if } X < \dfrac{1}{2} \end{cases}$$
(b) Use LOTUS to show that E[R] is infinite.
8. A discrete uniform random variable X has range {1, 2, ..., n} and probability mass
function $p(k) = \frac{1}{n}$, $k = 1, 2, \ldots, n$. Determine the expected value and variance of X. [You
may wish to use the facts that $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$ and $\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$.]
9. Five buses are supposed to leave a depot at 12:00. However, there is some
randomness in their actual departure time. Assume that a bus is equally likely to depart
at any time between 12:00 and 12:10, and that the departure times are independent. What
is the probability that the last bus leaves before 12:08?
10. Determine the variance of the unbiased estimator $\hat\theta = \frac{n+1}{n}\,Y$, where Y is the largest
of n observations from a Uni(0, θ) population.


Section 5.2: The exponential and gamma distributions


In Section 4.3, we studied the Poisson process N(t), which counts the number of
events that occur in an interval of length t, based on some assumptions (notably, the
stationary and independent increments properties) about how the events occur. We
showed that $P(N(t) = k) = \frac{e^{-\lambda t}(\lambda t)^k}{k!}$, for some constant λ > 0; in other words,
N(t) ~ Poi(λt).
There are two related sets of random variables that accompany the Poisson process.
Let T represent the time between consecutive events. T is a continuous random variable
with range {t > 0}. (Technically, we should let $T_i$ represent the time between the ith and
(i + 1)st events. However, it is not hard to argue that the distribution of $T_i$ is the same, no
matter which two consecutive events we consider. This follows from the stationary
increments assumption about the Poisson process. So we can drop the subscript.)
Let's derive the density function of T. Consider the event {T > t}, meaning that the
time from one event until the next event is at least t. This implies that no events occurred
in an interval of length t. Hence, the event {T > t} is equivalent to {N(t) = 0}. Thus,
$P(T > t) = P(N(t) = 0) = \frac{e^{-\lambda t}(\lambda t)^0}{0!} = e^{-\lambda t}$. Therefore, the cumulative distribution function
of T is $F(t) = 1 - P(T > t) = 1 - e^{-\lambda t}$ and the probability density function of T is
$f(t) = F'(t) = \lambda e^{-\lambda t}$. We give this density function a name.

Definition: A continuous random variable X is said to be exponentially distributed with
parameter λ > 0 if its probability density function is $f(x) = \lambda e^{-\lambda x}$, $x > 0$.
We'll denote this as X ~ exp(λ).
Note that $\int_0^\infty \lambda e^{-\lambda x}\,dx = \left.-e^{-\lambda x}\right|_0^\infty = 1$, so the density function is valid.

Here's a graph of the density function when λ = 2. It is a strictly decreasing
function, so the most likely values are near 0. As λ increases, the density function is
squished to the left and the y-intercept, which is λ, increases.

Theorem 5.3: Let X ~ exp(λ). Then the expected value and variance of X are
$$E[X] = \frac{1}{\lambda} \quad \text{and} \quad V[X] = \frac{1}{\lambda^2}.$$
Proof: By definition, $E[X] = \int_0^\infty \lambda x e^{-\lambda x}\,dx$. Using integration by parts, we get
$$E[X] = \left.-xe^{-\lambda x}\right|_0^\infty + \int_0^\infty e^{-\lambda x}\,dx = \left.-xe^{-\lambda x}\right|_0^\infty - \left.\frac{1}{\lambda}e^{-\lambda x}\right|_0^\infty.$$
Evaluating the first term at x = ∞ requires some care since it is an indeterminate form $\infty \cdot 0$. Applying L'Hopital's Rule, we have
$$\lim_{x\to\infty} xe^{-\lambda x} = \lim_{x\to\infty} \frac{x}{e^{\lambda x}} = \lim_{x\to\infty} \frac{1}{\lambda e^{\lambda x}} = 0.$$
Therefore, $E[X] = 0 + \frac{1}{\lambda} = \frac{1}{\lambda}$.
The calculation of the variance is similar, except that two integrations by parts (and
two applications of L'Hopital's Rule) are needed to show $E[X^2] = \int_0^\infty \lambda x^2 e^{-\lambda x}\,dx = \frac{2}{\lambda^2}$.
The formula for the expected value makes sense because we showed earlier that
$\lambda = \frac{E[N(t)]}{t}$ represents the average rate at which events in the Poisson process occur.
If λ is large, then the average time between events should be small, and conversely.

Example 5.2.1: Telephone calls to a switchboard follow a Poisson process at rate λ = 3
per minute. What is the probability that the time between consecutive calls is at least 1
minute?
Let T represent the time between consecutive calls. Then T ~ exp(3) and the density
function is $f(t) = 3e^{-3t}$. Hence, $P(T > 1) = \int_1^\infty 3e^{-3t}\,dt = e^{-3} \approx .05$.
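A quick numerical check of this value (a sketch; SciPy is an assumed extra, and it parameterizes the exponential by scale = 1/λ):

    import math
    from scipy import stats   # assumed available

    # P(T > 1) for T ~ exp(3), two ways.
    print(math.exp(-3))                  # about 0.0498
    print(stats.expon.sf(1, scale=1/3))  # survival function gives the same value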

The exponential distribution has a very important property. Consider the conditional
probability that X > s + t given that X > s. In Example 5.2.1, this is the probability that at
least s + t time units will elapse until the next call given that s time units have already
elapsed. By definition of conditional probability,
$$P(X > s+t \mid X > s) = \frac{P(X > s+t \cap X > s)}{P(X > s)} = \frac{P(X > s+t)}{P(X > s)} = \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t),$$

which is independent of s. This means that the probability that it will take at least t more
time units until the next event is independent of how much time has elapsed since the last
event. This is called the memoryless property and the exponential distribution is the only
continuous random variable that has it. (In Section 4.2, we mentioned that the geometric
distribution is the only discrete distribution with the memoryless property.)
Theorem 5.4: If X ~ exp(λ), then P(X > s + t | X > s) = P(X > t) for all s and t.


So, for example, P ( X > 7 | X > 2 ) = P ( X > 5 ) . Pay attention to the direction of the
inequalities. Theorem 5.4 is not true if the > symbols are replaced by < symbols.
Theorem 5.4 implies that we could rephrase Example 5.2.1 as, "What is the
probability that at least 1 minute elapses until the next call?" We don't need to know
when the most recent call occurred or that it is the time between consecutive calls.
The exponential distribution has applications other than as the time between events
in a Poisson process. For instance, it can be used to model the lifetime of some electrical
components. There the memoryless property implies that the probability that a
component will last at least t more time units is independent of how old the component
is! In other words, components don't get old, they just fail at some point. For this
reason, the memoryless property is sometimes called the lack-of-aging property. Alas, it
doesn't apply to humans: surely, an 80-year-old person is more likely than a 20-year-old
person to die within the next 5 years. In Exercise 5.2.9, we'll see a function that
measures the degree of aging as a function of age. For the exponential distribution, this
is a constant. For other distributions, it may be an increasing or decreasing function of
the age.
Finally, we note that the moment-generating function for the exponential distribution
is:
$$M(t) = E[e^{tX}] = \int_0^\infty e^{tx} f(x)\,dx = \int_0^\infty \lambda e^{-(\lambda - t)x}\,dx = \left.\frac{-\lambda}{\lambda - t}\,e^{-(\lambda - t)x}\right|_0^\infty = \frac{\lambda}{\lambda - t}, \quad \text{if } t < \lambda.$$
If t > λ, the integral diverges. This is not a problem since we evaluate the moment-
generating function and its derivatives only at t = 0. We can use M(t) to verify the
expected value and variance of the exponential distribution. (See Exercise 5.2.1.)
The Gamma distribution
The other set of random variables related to the Poisson process is the sequence
$S_1, S_2, \ldots$ of times at which the events occur. Clearly, this is an increasing sequence (e.g.,
the third event must occur after the second event) and, hence, they are dependent. Let $S_n$
be the time at which the nth event occurs. The event $\{S_n < t\}$ means that the nth event
occurred before time t. This, in turn, implies that at least n events occurred in the time
interval [0, t]. (Think about that!) Hence, $\{S_n < t\}$ is equivalent to $\{N(t) \ge n\}$. So:
e " # t ( #t )
P ( Sn < t ) = P ( N ( t ) ! n ) = %
k!
k=n
$

e ! " t ( "t )
Then the cumulative distribution function of Sn is Fn ( t ) = $
, from which the
k!
k=n
probability density function of Sn is:
#

fn ( t ) = Fn! ( t ) = + e
k=n

" #t

k "1
k
)
$ # k ( #t )k "1 # ( #t )k '
#
!t ) &
(
" ! t ( !t )
"
= * !e %
"
&
k!
k! )( k = n
k! ('
%
$ ( k " 1)!

124

Section 5.2

= !e

"!t

(!t )n"1

( n " 1)!

, since the summation collapses, eliminating all

but the first term.


Definition: A continuous random variable X has a gamma distribution with parameters
n and λ if its probability density function is $f(x) = \lambda e^{-\lambda x}\,\frac{(\lambda x)^{n-1}}{(n-1)!}$, $x > 0$.
We'll denote this as X ~ Gam(n, λ).
Due to the presence of the (n − 1)! in the denominator, this density function makes
sense only if n is a positive integer. We shall extend the definition to the case in which
n is any positive real number later. Note that when n = 1, we get the exponential
distribution.
It requires n integrations by parts to show that this density function is valid. We'll
omit the details.
If n > 1, the graph passes through the origin. There is a local maximum at $x = \frac{n-1}{\lambda}$,
so as n increases, the maximum (or mode) moves to the right. Here's a graph of the
density when n = 3 and λ = 2.
density when n = 3 and ! = 2.

Example 5.2.2: As in Example 5.2.1, telephone calls arrive at a switchboard according
to a Poisson process with λ = 3 per minute. What is the probability that the 5th call
occurs within the first minute?
Let $S_5$ be the time at which the 5th call occurs. Then $S_5$ ~ Gam(5, 3). Its density
function is $f(t) = 3e^{-3t}\,\frac{(3t)^4}{4!} = \frac{81}{8}\,t^4 e^{-3t}$. We want $P(S_5 < 1) = \int_0^1 \frac{81}{8}\,t^4 e^{-3t}\,dt$. This is a
nasty integral to do unless you have a computer. However, we can argue that the 5th call
occurs within the first minute if and only if there are at least 5 calls within the first
minute; that is, $P(S_5 < 1) = P(N(1) \ge 5)$. This, in turn, can be computed by the formula
for the Poisson process:
$$P(N(1) \ge 5) = 1 - \sum_{k=0}^{4} \frac{e^{-3}3^k}{k!} = 1 - e^{-3} - 3e^{-3} - \frac{9}{2}e^{-3} - \frac{27}{6}e^{-3} - \frac{81}{24}e^{-3} = .18474.$$


This is a reasonable answer because the calls occur at a rate of 3 per minute, so we would
expect the 5th call to occur near time t = 5/3, about 1.67 minutes. (The terms in the summation are
exactly the terms you would get by doing multiple integrations by parts.)
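Both routes to $P(S_5 < 1)$ are easy to check numerically; in the sketch below (SciPy assumed available), SciPy's gamma is parameterized by shape a = 5 and scale = 1/λ = 1/3, and the Poisson tail uses mean 3:

    from scipy import stats   # assumed available

    # P(S_5 < 1) via the Gam(5, 3) cdf and via P(N(1) >= 5).
    print(stats.gamma.cdf(1, a=5, scale=1/3))   # about 0.18474
    print(stats.poisson.sf(4, mu=3))            # about 0.18474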
The connection between the gamma distribution and the Poisson process, namely
$\{S_n < t\} \Leftrightarrow \{N(t) \ge n\}$, is very similar to the connection between the binomial and
negative binomial distributions as stated in Theorem 4.6: $P(Y \ge r) = P(T \le n)$, where
Y ~ Bin(n, p) and T ~ NBin(r, p). The binomial Y counts the number of successes in n
trials; the Poisson process N(t) counts the number of events in time t. The negative
binomial T counts the number of trials needed to obtain r successes; the gamma random
variable $S_n$ measures the time needed to see n events.

The gamma and exponential distributions are related as follows: Let $T_1, T_2, \ldots, T_n$ be
independent, exponential random variables, each with parameter λ, and let
$S_n = T_1 + T_2 + \cdots + T_n$ be their sum. Then $S_n$ ~ Gam(n, λ). This is evident from the
Poisson process: the time of the nth event is just the sum of the times between the events
up to n. Continuing the analogy to Bernoulli trials, the exponential distribution (which
measures the time until the next event) is equivalent to the geometric distribution
(which measures the number of trials until the next success). We stated that the negative
binomial is the sum of geometric distributions; the equivalent here is that the gamma
distribution is the sum of exponentials.
This relationship allows us to compute the expected value and variance of the
gamma distribution.
Theorem 5.5: Let X ~ Gam(n, λ). Then $E[X] = \frac{n}{\lambda}$ and $V[X] = \frac{n}{\lambda^2}$.
Furthermore, the moment-generating function of the gamma distribution is
$$M(t) = \left(\frac{\lambda}{\lambda - t}\right)^n.$$
Now let's define the gamma distribution for the case in which n is not a positive
integer. For that, we need to extend the notion of factorials:
Definition: The gamma function is defined by $\Gamma(r) = \int_0^\infty e^{-x} x^{r-1}\,dx$, for r > 0.
Since this is a definite integral, $\Gamma(r)$ is a real number for any value of r for which
the improper integral converges. The fact that the integral converges for $r \ge 1$ is easy to
prove; the fact that it converges for 0 < r < 1 is somewhat more difficult.
The next theorem gives a recursive formula for $\Gamma(r)$.
Theorem 5.6: $\Gamma(r) = (r-1)\,\Gamma(r-1)$ for all r > 1.

Proof: The proof uses integration by parts. Let $u = x^{r-1}$ and $dv = e^{-x}\,dx$. Then
$du = (r-1)x^{r-2}\,dx$ and $v = -e^{-x}$. Hence, $\Gamma(r) = \left.-e^{-x}x^{r-1}\right|_0^\infty + (r-1)\int_0^\infty e^{-x}x^{r-2}\,dx$. The first
term is 0 due to L'Hopital's Rule. The integral in the second term, by definition, is
$\Gamma(r-1)$. Thus, the proof is complete.
#

Now, ! (1) = $ e "x dx = 1 , from which ! (2) = 1"! (1) = 1, ! (3) = 2 "! (2 ) = 2 ,
0

! (4) = 3" !(3) = 6 , etc. By induction, we claim:


Corollary: ! (r) = ( r " 1)!, if r is a positive integer.
If r is not a positive integer, then other techniques must be used to evaluate the
" 1%
gamma function. For instance, it can be shown that ! $ ' = ( . Then, using Theorem
# 2&
" 3% 1 " 1% 1
" 5% 3 " 3% 3
5.6, we get ! $ ' = !$ ' =
( , ! $ ' = !$ ' =
( , etc. A computer can be
# 2& 2 # 2& 2
# 2& 2 # 2& 4
used to get other values. For example, ! ( 3.2 ) = 2.424 .
Here's a graph of the gamma function.
[Figure: graph of the gamma function for r > 0.]
There is a global minimum at approximately (1.46, .886). The graph passes through
points $(n, (n-1)!)$, for all positive integers n. Furthermore, $\lim_{r \to 0^+} \Gamma(r) = +\infty$. The gamma
function has many applications in a variety of mathematical areas.
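The values quoted above are easy to spot-check (math.gamma is in the Python standard library):

    import math

    # Gamma-function values used in this section.
    print(math.gamma(0.5))   # sqrt(pi), about 1.7725
    print(math.gamma(1.5))   # (1/2) sqrt(pi), about 0.8862
    print(math.gamma(4.0))   # 3! = 6
    print(math.gamma(3.2))   # about 2.4240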


Now we can generalize the definition of the gamma distribution. The only change
we have to make is to replace the factorial in the denominator by the corresponding
gamma function.
Definition: A continuous random variable X has a gamma distribution with parameters
r > 0 and λ > 0 if its probability density function is $f(x) = \lambda e^{-\lambda x}\,\frac{(\lambda x)^{r-1}}{\Gamma(r)}$, $x > 0$.
We'll denote this as X ~ Gam(r, λ).
We shall make use of this in the special case when r is half an integer in the next
section when we discuss the chi-square distribution.

Exercises 5.2
1. Use the moment-generating function of an exponential distribution to verify the
expected value and variance given in Theorem 5.3.
2. Let X ~ exp(2). Compute:
(a) P(X > 1)            (b) P(X > 1.5 | X > .5)     (c) P(X < 1 | X < 3)
(d) P(X > 1 | X < 3)    (e) P(X > 4 | X < 2)

3. The median of a random variable X is the number m such that P( X < m) = 0.5 .
Determine the median of X if X ~ exp(λ).
4. The lifetime of a lightbulb is exponentially distributed with a mean of 1000 hours.
(a) Determine the probability that a bulb lasts more than 1500 hours.
(b) Determine the probability that a bulb lasts more than 1500 hours given that it has
lasted more than 1000 hours.
(c) If 20 bulbs are tested, what is the probability that exactly 6 of them last more
than 1500 hours?
(d) If 100 bulbs are tested, use the Central Limit Theorem to approximate the
probability that their average lifetime will exceed 1100 hours.
5. Murders in a certain city occur according to a Poisson process at a rate of 3 per week.
What is the probability that at least 1 week will elapse with no murders?
6. Births in a hospital occur according to a Poisson process at a rate of 8 per week.
(a) What is the probability that the third birth will occur within the first two days?
(b) What is the expected value and variance of the number of births in 4 weeks?
7. Let X and Y be independent, exponentially distributed random variables, each with
parameter λ. Let Z = min{X, Y}.
(a) Argue that the event {Z > z} is equivalent to {X > z and Y > z}.
(b) Show that $P(Z > z) = e^{-2\lambda z}$.
(c) Determine the probability density function of Z.
(d) Suppose the amount of time it takes to check out of a supermarket is
exponentially distributed with parameter λ = 0.2 min⁻¹. If two people begin to check out
at the same time, what is the probability that the first one will be done in less than 3
minutes?
(e) Does the answer to (d) change if we don't know they started checking out at the
same time?
(f) Generalize the result of (c) to the case in which X ~ exp(λ) and Y ~ exp(μ).
8. Let X ~ exp(λ). Since $E[X] = \frac{1}{\lambda}$, then it would seem reasonable to estimate λ by
$\hat\lambda = \frac{1}{X}$. Is $\hat\lambda$ unbiased?


9. One way to measure the age effect of a random variable T is by the failure rate
function $r(t) = \frac{f(t)}{1 - F(t)}$, where f(t) is the probability density function and F(t) is the
cumulative distribution function. In other words, $r(t) \approx \frac{P(t < T < t + \Delta t \mid T > t)}{\Delta t}$. So if T
represents the lifetime of a component, then r(t) is proportional to the probability that the
component will fail within an additional Δt time units given that it has already lasted t time
units.
(a) Show that the exponential distribution has a constant failure rate function (which
is equivalent to saying it has the memoryless property).
(b) Let T have probability density function $f(t) = 2te^{-t^2}$, $t > 0$. Determine the failure
rate function for T.
(c) If the random variable in (b) is the lifetime of a component, is an old component
more or less likely to fail than a new component?
10. Let X ~ Gam(n, λ) and Y ~ Gam(m, λ) be independent random variables. Show that
X + Y ~ Gam(n + m, λ). [Hint: Consider X and Y as sums of exponential random
variables. Or you can use moment-generating functions.]
11. Determine the value of x at which the density function of a Gam(3, λ) random
variable attains its maximum value. (This is called the mode of the distribution.)

Section 5.3: The chi-square distribution


In Section 1.4, we talked about how to verify whether a probability model is
plausible. As an example, we considered the problem of determining whether a die is
fair. Suppose we toss the die 150 times. If the die were fair, we would expect to see 25
of each number. The observed values (data) and expected values are given in the table
below:
Value               1    2    3    4    5    6
Observed number    21   17   30   26   23   33
Expected number    25   25   25   25   25   25
We measured how far the observed values are from the expected values by the
statistic $W = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$. The bigger the value of W, the further the
observed values are from what the model predicts, meaning that it is more likely the
model is incorrect. For this data, $W = \frac{(21-25)^2}{25} + \frac{(17-25)^2}{25} + \cdots + \frac{(33-25)^2}{25} = 6.96$.
What we have done here is test the null hypothesis $H_0$: "the die is fair" vs. the
alternative $H_a$: "the die is not fair." We call this procedure a goodness-of-fit test. W is the test
statistic. In order to come to a statistically justifiable conclusion, we need to know the
distribution of W.
In general, suppose we have n cells (values of the random variable). Let $X_i$
represent the number of observations in cell i. $X_i$ is approximately a binomial random
variable (the observation is either in cell i or not) with parameters N and $p_i$, where N is
the total number of observations and $p_i$ is the probability that an observation is in cell i.
Thus, $E[X_i] = Np_i$ and $V[X_i] = Np_i(1 - p_i)$.
It follows that $Z_i = \frac{X_i - Np_i}{\sqrt{Np_i(1-p_i)}} \sim N(0,1)$. Except for the $(1 - p_i)$ factor in the
denominator, this is roughly of the form $\frac{\text{observed} - \text{expected}}{\sqrt{\text{expected}}}$. Then W is approximately
the sum of squares of standard normal random variables. We'll need to find the
distribution of W.
The first step is to find the distribution of $U = Z^2$, where Z ~ N(0, 1). The density

1 " z2
e . To get the density function of U, note that, since the
function of Z is f (z) =
2!
range of Z is (", "), the event {U < u} is equivalent to ! u < Z < u . Hence, the

1 ! z2
"
" 2# e dz .
! u
! u
This integral is not solvable in closed form. However, we can use the Fundamental
Theorem of Calculus to find the derivative with respect to u to get the probability density
function. The first step is to split the integral into two parts:
cumulative distribution function of U is G ( u ) =

$$G(u) = \int_{-\sqrt{u}}^{\sqrt{u}} f(z)\,dz = \int_{-\sqrt{u}}^{0} f(z)\,dz + \int_{0}^{\sqrt{u}} f(z)\,dz = \int_{0}^{\sqrt{u}} f(z)\,dz - \int_{0}^{-\sqrt{u}} f(z)\,dz$$
This is in the proper form for the Fundamental Theorem to apply.
$$g(u) = G'(u) = f\!\left(\sqrt{u}\right)\frac{d\left(\sqrt{u}\right)}{du} - f\!\left(-\sqrt{u}\right)\frac{d\left(-\sqrt{u}\right)}{du} = \frac{1}{2\sqrt{u}}\left(f\!\left(\sqrt{u}\right) + f\!\left(-\sqrt{u}\right)\right)$$
$$= \frac{1}{\sqrt{2\pi}}\,e^{-u/2}\,u^{-1/2}.$$

By comparing to the gamma density function, we see that U has a gamma distribution
with parameters $r = \frac{1}{2}$ and $\lambda = \frac{1}{2}$. (Remember $\Gamma\!\left(\frac{1}{2}\right) = \sqrt{\pi}$.)
Now let $Z_1, Z_2, \ldots, Z_n \sim N(0,1)$ be independent, and let $Y = \sum_{i=1}^{n} Z_i^2$. Since each $Z_i^2$ has
a Gam(1/2, 1/2) distribution, then Y has a Gam(n/2, 1/2) distribution. (See Exercise 5.2.10.)
We give this special gamma distribution a name.
Definition: Y has a chi-square distribution with n degrees of freedom if its density
function is the same as that of a Gam(n/2, 1/2) random variable, where n is a positive integer.
We'll denote this as Y ~ χ²(n).
Thus, the density function of a χ²(n) random variable is $f(y) = \frac{e^{-y/2}\,y^{n/2 - 1}}{2^{n/2}\,\Gamma\!\left(\frac{n}{2}\right)}$, $y > 0$.

So, we have the following:


Theorem 5.7: Let $Z_1, Z_2, \ldots, Z_n \sim N(0,1)$ be independent, and let $Y = \sum_{i=1}^{n} Z_i^2$. Then
Y ~ χ²(n).
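An empirical check of Theorem 5.7 (a sketch assuming NumPy and SciPy are available): the proportion of simulated sums of n squared standard normals falling below a point should match the chi-square cdf there.

    import numpy as np
    from scipy import stats   # both assumed available

    # Sum n squared standard normals many times and compare to chi-square(n).
    rng = np.random.default_rng(0)
    n = 5
    y = (rng.standard_normal((100_000, n)) ** 2).sum(axis=1)
    print((y < 5).mean())            # empirical P(Y < 5)
    print(stats.chi2.cdf(5, df=n))   # about 0.584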

Now let $W = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$. We have argued that W is approximately
the sum of the squares of n standard normal random variables. Thus, W should have a
chi-square distribution. However, since the individual terms in the summation are not
independent, we lose one degree of freedom. Hence, we claim that W ~ χ²(n − 1).
In the die example, there are six cells, so W has a chi-square distribution with five
degrees of freedom.
Since the chi-square density function is hard to integrate, we use a table or calculator
to compute the probabilities. The format is identical to the Student's t distribution. On
the TI-83, enter χ²cdf(a, b, n) to find P(a < Y < b), where Y ~ χ²(n).
For the goodness-of-fit hypothesis test, we'll reject $H_0$ if the test statistic W is bigger
than $\chi^2_{\alpha,\,n-1}$, which is the table value such that $P(W > \chi^2_{\alpha,\,n-1}) = \alpha$, where W has a chi-
square distribution with n − 1 degrees of freedom. Alternatively, the p-value of the test is
the probability that a chi-square distribution with n − 1 degrees of freedom would exceed
the observed value. This can be calculated with the χ²cdf(a, b, n) command.
In the die example, the p-value is P(W > 6.96) = .2236, where W ~ χ²(5). (Enter
χ²cdf(6.96, 1E99, 5).) This is a pretty large p-value, so we won't reject $H_0$, implying that
there is not enough evidence to claim the die is unfair.
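The same computation can be done without a table or a TI-83; a sketch assuming SciPy is available:

    from scipy import stats   # assumed available

    # Die data: chisquare defaults to equal expected counts (150/6 = 25) and
    # returns the statistic W and the p-value.
    observed = [21, 17, 30, 26, 23, 33]
    w, p = stats.chisquare(observed)
    print(w, p)                        # W is about 6.96, p about 0.224
    print(stats.chi2.sf(6.96, df=5))   # the same p-value, directly from chi-square(5)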
Example 5.3.1: The Mendelian theory of genetics states that the number of peas falling
into the categories round and yellow, wrinkled and yellow, round and green, and
wrinkled and green, should be in the ratio of 9:3:3:1. Suppose that 100 such peas
revealed 59, 20, 15 and 6 in the respective categories. Is this data consistent with the
theory?
Let $H_0$: the theory is correct. The table below gives the observed and expected values.
(Note: The statement "in the ratio 9:3:3:1" means probabilities of $\frac{9}{16}$, $\frac{3}{16}$, $\frac{3}{16}$ and $\frac{1}{16}$,
respectively.)

Type          RY       WY       RG      WG
Observed      59       20       15       6
Expected    56.25    18.75    18.75    6.25

The test statistic is:
$$W = \frac{(59 - 56.25)^2}{56.25} + \frac{(20 - 18.75)^2}{18.75} + \frac{(15 - 18.75)^2}{18.75} + \frac{(6 - 6.25)^2}{6.25} = .9778$$
The corresponding p-value (with 3 degrees of freedom) is .806, so $H_0$ will not be
rejected. The theory is supported by the data.
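For unequal cell probabilities like these, the expected counts can be passed explicitly; a sketch assuming SciPy is available:

    from scipy import stats   # assumed available

    # Peas data with 9:3:3:1 expected proportions out of 100 peas.
    observed = [59, 20, 15, 6]
    expected = [100 * p for p in (9/16, 3/16, 3/16, 1/16)]   # [56.25, 18.75, 18.75, 6.25]
    w, p = stats.chisquare(observed, f_exp=expected)
    print(w, p)   # W is about 0.978, p about 0.806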
Example 5.3.2: Over the course of a 20-game season, the number of goals per game
scored by a soccer team is given in the table below. Does this data support the claim
that the data come from a Poisson distribution with λ = 1?
To compute the expected values, we first compute the probabilities using the Poisson
distribution with λ = 1. For example, $P(X = 0) = e^{-1} \approx .368$, so the expected number of
games with no goals is 20(.368) = 7.36.
Number of goals      0       1       2       3     4 or more
Observed            10       5       2       2       1
Expected           7.36    7.36    3.68    1.23     .37
Then $W = \frac{(10 - 7.36)^2}{7.36} + \frac{(5 - 7.36)^2}{7.36} + \frac{(2 - 3.68)^2}{3.68} + \frac{(2 - 1.23)^2}{1.23} + \frac{(1 - .37)^2}{.37} = 4.025$.
The corresponding p-value (with 4 degrees of freedom) is .403, so again we will not
reject $H_0$.
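The computation above can be reproduced directly; in this sketch (SciPy assumed available) the expected counts are the Poisson(1) probabilities times 20 games, rounded as in the table:

    from scipy import stats   # assumed available

    # Goodness-of-fit statistic and p-value for the soccer data.
    observed = [10, 5, 2, 2, 1]
    expected = [7.36, 7.36, 3.68, 1.23, 0.37]
    w = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    print(round(w, 3))               # 4.025
    print(stats.chi2.sf(w, df=4))    # about 0.403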
Here, we specified a value of λ so that we could compute the probabilities. We
could have just asked if the data came from a Poisson distribution, without specifying a
value of λ. In that case, we would have to estimate λ. Since, for a Poisson distribution,
λ is the population mean, it seems reasonable to estimate λ by the sample mean $\bar x$.
Treating the one observation in the "4 or more" category as a 4, we get:
$$\bar x = \frac{10 \cdot 0 + 5 \cdot 1 + 2 \cdot 2 + 2 \cdot 3 + 1 \cdot 4}{20} = .95.$$
We could use this to recompute the probabilities and expected values. However, the
degrees of freedom must be diminished by one, due to the estimated parameter.
As a general rule, for every estimated parameter needed to compute the probabilities,
we lose one degree of freedom.
As we have proceeded through these notes, we have encountered numerous
examples in which the calculations are based on a model. For example, we have assumed
something is normally distributed, or exponentially distributed, and so on. Our calculations
and conclusions are only as good as the model. We now have a method of determining
whether the model is plausible. This is a very important part of the modeling process,
one that is too often overlooked. Failure to check the model leads to useless results.

Exercises 5.3
1. Determine the expected value and variance of a chi-squared distribution with n
degrees of freedom.

2. A city expressway with four lanes in each direction was studied to see whether drivers
preferred to drive on the inside lanes. A total of 1000 automobiles were observed during
a one hour period. There were 294 cars in lane 1, 276 in lane 2, 238 in lane 3 and 192 in
lane 4. Does this data support the hypothesis that some lanes are preferred more than
others?
3. A survey in 1988 showed that 69% of laptop computers were used in business, 21%
were used in government, 7% were used in education and 3% were used in the home. A
recent survey of 150 users showed that 102 were used in business, 32 in government, 12
in education and 4 in the home. Does this data support the percentages in the 1988
survey?
4. The number of accidents per machinist in a certain industry was recorded over a one
year period. The data is given in the table below. Does this data fit a Poisson
distribution? [You will have to estimate λ. Treat the observations in the "4 or more"
category as a 5.]
Accidents per machinist      0      1     2     3    4 or more
Observed                   296     74    26     8      10
5. The amount of time, in minutes, it takes to be served at a post office is recorded for
100 customers. The data is given below. Does this data fit an exponential distribution
with λ = .5?
Time to be served    0-1 min    1-2 min    2-3 min    3-4 min    more than 4
Observed                49         17         12         14          8
