
Chapter Four

Continuous Random Variables and Probability Distributions


4.1 Continuous Random Variables and Probability Density Functions
For a discrete random variable X, one can assign a probability to each value that X can take (i.e., through the probability mass function). Random variables such as heights, weights, lifetimes, and measurement error can all assume an infinite number of values. As a result, we need a different mechanism for understanding the probability distribution.

For a continuous random variable, any real number x between A and B (for A < B) is a possible value. The number of values is theoretically infinite, but one won't have such precision in practice. Representing such a random variable using a continuous model is still appropriate. Note that a continuous random variable can still have specific endpoints to its range (e.g., x > 0, 0 < x < 1 or x > 1), but the number of possible values is still infinite.
As discussed earlier, we can use histograms to represent the relative frequency of a random variable X. Suppose X is the depth of a lake at a randomly chosen point on the surface. X has a range from 0 to M, where M is the maximum depth of the lake. The relative frequency histogram can represent the probability distribution for X. The finer the discretization of the X axis (i.e., the precision of the measurement), the smaller the subintervals for the histogram.

Infinitely small subintervals lead to the continuous probability distribution being represented as a smooth curve. As with the relative frequency histograms, the probability associated with the values of X between any two values a and b is the area under the smooth curve between a and b. For functions consisting of straight line segments, geometry and simple relationships for known areas (shapes) can be used. For more complex functions, we may be able to subdivide the area under the curve into smaller rectangles and sum their areas. However, calculus and integration techniques solve this for us.
If X is a continuous random variable, the probability distribution or the probability density function (pdf) of X is a function f(x) such that for any two numbers a and b with a ≤ b:

P(a ≤ X ≤ b) = ∫_a^b f(x) dx.

This probability is the area under the curve f(x) between the points a and b.

For f(x) to be a legitimate density function, the following two conditions must hold: f(x) ≥ 0 for all x, and ∫_{−∞}^{∞} f(x) dx = 1.
With discrete random variables, we only had a finite (or countably infinite) set of values for X, each with probability mass p(x) at x and with the possible p(x) values summing to 1 (the equivalent of the two conditions above).
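As a quick numerical illustration (a Python/scipy sketch, not part of the original notes, which use Maple), the two conditions can be checked for the density f(x) = 0.5x on [0, 2] that appears in a later example:

import numpy as np
from scipy.integrate import quad

def f(x):
    """Density from the library-reserve example: 0.5x on [0, 2], else 0."""
    return 0.5 * x if 0 <= x <= 2 else 0.0

# Condition 1: f(x) >= 0 everywhere (checked on a grid).
assert all(f(x) >= 0 for x in np.linspace(-1, 3, 401))

# Condition 2: the total area under f is 1.
total, _ = quad(f, 0, 2)
print(total)                 # ~1.0

# Probabilities are areas, e.g. P(0.5 <= X <= 1.5):
prob, _ = quad(f, 0.5, 1.5)
print(prob)                  # ~0.5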
Since integration is fundamental to the understanding of continuous random variables, some integration review is suggested. Some basic indefinite integrals you should know are:

∫ dx = x + C
∫ c dx = c ∫ dx
∫ (f + g + ···) dx = ∫ f dx + ∫ g dx + ···
∫ x^r dx = x^(r+1)/(r + 1) + C    (r ≠ −1)
∫ exp(x) dx = exp(x) + C
Of course, the same rules apply when the integrals are definite rather than indefinite. Hence, suppose that you want the area under the parabola y = x^2 for x ∈ [0, 1]. Then:

∫_0^1 x^2 dx = [x^3/3]_0^1 = 1/3 − 0 = 1/3.
Maple will be used extensively, particularly for more complicated functions. The equivalent differentiation rules for the antiderivatives (integrals) given above should also be known. For example:

(d/dx) exp(2x) = 2 exp(2x).

Corresponding derivative rules for the above integration rules are:

(d/dx) x = 1
(d/dx) cx = c
(d/dx) [f + g + ···] = df/dx + dg/dx + ···
(d/dx) x^r = r x^(r−1)
(d/dx) exp(x) = exp(x)
The Uniform Distribution    A simple but commonly occurring continuous distribution is the uniform distribution on the interval [A, B]. If a continuous random variable X follows a uniform distribution, the pdf of X is:

f(x; A, B) = 1/(B − A),   A ≤ x ≤ B
           = 0,           otherwise.
To verify that this is a valid pdf, note that f(x) ≥ 0 for all x ∈ [A, B] and that:

∫_{−∞}^{∞} f(x) dx = ∫_A^B 1/(B − A) dx = [x/(B − A)]_A^B = B/(B − A) − A/(B − A) = (B − A)/(B − A) = 1.
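The same check can be done with scipy.stats (an illustrative Python sketch, not in the original notes; scipy parameterizes the uniform distribution by loc = A and scale = B − A, and the endpoints A = 2, B = 5 are arbitrary values chosen for the example):

from scipy.stats import uniform

A, B = 2.0, 5.0
X = uniform(loc=A, scale=B - A)

print(X.pdf(3.0))                  # density 1/(B - A) = 1/3 inside [A, B]
print(X.cdf(B) - X.cdf(A))         # total probability 1
print(X.cdf(4.0) - X.cdf(3.0))     # P(3 <= X <= 4) = 1/3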
Example (Devore, Page 143, Exercise 1)  Let X denote the amount of time for which a book on two-hour reserve at a college library is checked out by a randomly selected student and suppose that X has density function:

f(x) = 0.5x,   0 ≤ x ≤ 2
     = 0,      otherwise.

Calculate the following probabilities: P(X ≤ 1), P(0.5 ≤ X ≤ 1.5), and P(1.5 < X).

Solution  The desired probabilities are:

P(X ≤ 1) = ∫_0^1 0.5x dx = [0.25x^2]_0^1 = 0.2500
P(0.5 ≤ X ≤ 1.5) = ∫_{0.5}^{1.5} 0.5x dx = [0.25x^2]_{0.5}^{1.5} = 0.5000
P(1.5 < X) = ∫_{1.5}^{2} 0.5x dx = [0.25x^2]_{1.5}^{2} = 0.4375.

Example (Devore, Page 144, Exercise 5)  A college professor never finishes his lecture before the bell rings to end the period and always finishes his lecture within one minute after the bell rings. Let X = the time that elapses between the bell and the end of the lecture and suppose the pdf of X is:

f(x) = kx^2,   0 ≤ x ≤ 1
     = 0,      otherwise.

a. Find the value of k.

Solution  For f(x) to be a valid pdf, we must have ∫_{−∞}^{∞} f(x) dx = 1. Hence:

∫_{−∞}^{∞} f(x) dx = ∫_0^1 kx^2 dx = [kx^3/3]_0^1 = k(1/3 − 0) = k/3 = 1  ⇒  k = 3.

b. What is the probability that the lecture ends within 1/2 minute of the bell ringing?

Solution  P(X ≤ 1/2) = ∫_0^{1/2} 3x^2 dx = [x^3]_0^{1/2} = (1/2)^3 = 1/8.

c. What is the probability that the lecture continues beyond the bell for between 15 and 30 seconds?

Solution  P(1/4 ≤ X ≤ 1/2) = ∫_{1/4}^{1/2} 3x^2 dx = [x^3]_{1/4}^{1/2} = (1/2)^3 − (1/4)^3 = 7/64.

d. What is the probability that the lecture continues for at least 40 seconds beyond the bell?

Solution  P(X ≥ 2/3) = ∫_{2/3}^{1} 3x^2 dx = [x^3]_{2/3}^{1} = (1)^3 − (2/3)^3 = 19/27.

The area under a single point is zero, i.e., P(X = a) = ∫_a^a f(x) dx = 0. Although in practice a single real number may seem to be a likely value for x, a continuous model for X assumes zero probability for any single value. This implies that for a continuous random variable X which has 1 within its range of possible values, P(X ≤ 1) = P(X < 1). Similarly, P(1 ≤ X ≤ 2) = P(1 < X < 2). Of course, for an integer-valued discrete random variable X, this is not true.
4.2 Cumulative Distribution Functions and Expected Values
The Cumulative Distribution Function    The cumulative distribution function (cdf), F(x), for a continuous random variable X is defined for every real number x by:

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(y) dy.

Here f is the pdf; notationally, y is used as the variable of integration to avoid confusion with the upper limit of integration x. The cdf F(x) is the area under the density curve f to the left of x.
Example (Devore, Page 145, Example 4.5)  Suppose X has a uniform distribution on [A, B]. To compute the cdf of X, we integrate f over the range of X up to x. For A ≤ x ≤ B:

F(x) = ∫_{−∞}^{x} f(y) dy = ∫_A^x 1/(B − A) dy = (1/(B − A)) [y]_A^x = (x − A)/(B − A).

More precisely, the cdf is defined to be:

F(x) = 0,                 x < A
     = (x − A)/(B − A),   A ≤ x ≤ B
     = 1,                 x ≥ B.
Example (Devore, Page 143, Exercise 3)  Suppose the distance X between a point target and a shot aimed at the point in a coin-operated target game is a continuous random variable with pdf:

f(x) = (3/4)(1 − x^2),   −1 ≤ x ≤ 1
     = 0,                otherwise.

Note that you should be able to sketch the graph of this pdf. It is a parabola opening downward with a y-intercept of 3/4. For −1 < x < 1, the cdf of X is then:

F(x) = ∫_{−∞}^{x} f(t) dt = ∫_{−1}^{x} (3/4)(1 − t^2) dt = (3/4) [t − t^3/3]_{−1}^{x}
     = (3/4) [x − x^3/3 − (−1 + 1/3)] = (3/4) (x − x^3/3 + 2/3) = 1/2 + (3/4)(x − x^3/3).

Hence, the cdf of X is:

F(x) = 0,                        x < −1
     = 1/2 + (3/4)(x − x^3/3),   −1 ≤ x ≤ 1
     = 1,                        x > 1.
Using F(x) to Compute Probabilities    Let X be a continuous random variable with pdf f(x) and cdf F(x). Then for any two numbers a and b with a < b:

P(a ≤ X ≤ b) = F(b) − F(a).

The desired probability is the area under the density curve between a and b, and it equals the difference between the two cumulative areas F(b) and F(a).
Example (Devore, Page 146, Example 4.6)  Suppose the pdf of the magnitude X of a dynamic load on a bridge (in newtons) is given by:

f(x) = 1/8 + (3/8)x,   0 ≤ x ≤ 2
     = 0,              otherwise.

Find the cdf of X and use it to find P(1 ≤ X ≤ 1.5).

Solution  For 0 ≤ x ≤ 2, the cdf is:

F(x) = ∫_{−∞}^{x} f(y) dy = ∫_0^x (1/8 + (3/8)y) dy = x/8 + (3/16)x^2.

Hence, the cdf is:

F(x) = 0,                  x < 0
     = x/8 + (3/16)x^2,    0 ≤ x ≤ 2
     = 1,                  x > 2.

Therefore:

P(1 ≤ X ≤ 1.5) = F(1.5) − F(1) = 1.5/8 + (3/16)(1.5)^2 − (1/8 + (3/16)(1)^2) = 19/64 = 0.2969.
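The same probability can be verified by numerical integration (an illustrative Python/scipy sketch, not part of the original notes):

from scipy.integrate import quad

def f(x):
    """pdf of the dynamic load: 1/8 + (3/8)x on [0, 2], else 0."""
    return 1/8 + (3/8) * x if 0 <= x <= 2 else 0.0

def F(x):
    """cdf obtained by integrating the pdf from 0 up to x."""
    val, _ = quad(f, 0, x)
    return val

print(F(1.5) - F(1.0))   # ~0.296875 = 19/64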
Obtaining f(x) from F(x)    If X is a continuous random variable with pdf f(x) and cdf F(x), then at every x at which the derivative F'(x) exists:

F'(x) = (d/dx) F(x) = f(x).
Percentiles of a Continuous Distribution    Let p be a number between 0 and 1. The (100p)th percentile of the distribution of a continuous random variable X, denoted by η(p), is defined by:

p = F(η(p)) = ∫_{−∞}^{η(p)} f(y) dy.

Therefore, (100p)% of the area under f(x) lies to the left of η(p) and 100(1 − p)% lies to the right.
Example (Devore, Page 143, Exercise 1)  Let X denote the amount of time for which a book on two-hour reserve at a college library is checked out by a randomly selected student and suppose that X has density function:

f(x) = x/2,   0 ≤ x ≤ 2
     = 0,     otherwise.

Find the 75th percentile.

Solution  The cdf is:

F(x) = ∫_0^x (y/2) dy = [y^2/4]_0^x = x^2/4

for 0 < x < 2. The 75th percentile, η(0.75), is found as follows:

0.75 = F(η(0.75)) = η(0.75)^2/4  ⇒  η(0.75)^2 = 3  ⇒  η(0.75) = ±√3.

But since X is a measure of time, −√3 is not a possible value. Therefore, the 75th percentile is η(0.75) = √3.
The median of a continuous distribution, denoted by μ̃, is the 50th percentile, so μ̃ satisfies 0.5 = F(μ̃). That is, half the area under the density curve is to the left of μ̃ and half is to the right of μ̃.
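Percentiles can also be found numerically by solving F(η(p)) = p with a root finder. This is an illustrative Python/scipy sketch (not part of the original notes) for the density f(x) = x/2 on [0, 2] used above:

from scipy.integrate import quad
from scipy.optimize import brentq

def F(x):
    """cdf of f(y) = y/2 on [0, 2], obtained by numerical integration."""
    val, _ = quad(lambda y: y / 2, 0, x)
    return val

q75 = brentq(lambda x: F(x) - 0.75, 0, 2)      # solves F(x) = 0.75 on [0, 2]
med = brentq(lambda x: F(x) - 0.50, 0, 2)      # the median solves F(x) = 0.5
print(q75, med)                                # ~1.7321 (= sqrt(3)) and ~1.4142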
Expected Values for Continuous Random Variables    The expected or mean value of a continuous random variable X with pdf f(x) is:

μ_X = E(X) = ∫_{−∞}^{∞} x f(x) dx.

If X is a continuous random variable with pdf f(x) and h(X) is any function of X, then:

E[h(X)] = μ_{h(X)} = ∫_{−∞}^{∞} h(x) f(x) dx.
The Variance of a Continuous Random Variable    The variance of a continuous random variable X is:

σ_X^2 = V(X) = ∫_{−∞}^{∞} (x − μ)^2 f(x) dx = E[(X − μ)^2],

and the standard deviation of X is:

σ_X = √V(X).

It is usually more convenient to compute V(X) as:

V(X) = E(X^2) − [E(X)]^2,

but keep in mind that the variance is a measure of spread about the mean. The variance is the second central moment of X; E(X^2) is the second moment about the origin, and the mean is the first moment.
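As a numerical illustration (a Python/scipy sketch, not part of the notes), the mean and variance of the density f(x) = 0.5x on [0, 2] from the earlier example can be computed directly from these definitions; the exact values are E(X) = 4/3, E(X^2) = 2, and V(X) = 2/9:

from scipy.integrate import quad

f = lambda x: 0.5 * x                        # pdf on [0, 2]

EX, _  = quad(lambda x: x * f(x), 0, 2)      # mean
EX2, _ = quad(lambda x: x**2 * f(x), 0, 2)   # second moment about the origin
VX = EX2 - EX**2                             # shortcut formula for the variance

print(EX, EX2, VX)                           # ~1.3333, 2.0, 0.2222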
Example (Devore, Page 153, Exercise 22)  The weekly demand for propane gas (in 1000s of gallons) from a particular facility is a random variable X with pdf:

f(x) = 2(1 − 1/x^2),   1 ≤ x ≤ 2
     = 0,              otherwise.

a. Compute the cdf of X.

Solution  For 1 ≤ x ≤ 2:

F(x) = ∫_{−∞}^{x} f(y) dy = 2 ∫_1^x (1 − 1/y^2) dy = 2 [y + 1/y]_1^x = 2(x + 1/x) − 4,

so:

F(x) = 0,                 x < 1
     = 2(x + 1/x) − 4,    1 ≤ x ≤ 2
     = 1,                 x > 2.

b. Obtain an expression for the (100p)th percentile. What is the value of μ̃?

Solution  Let x_p be the (100p)th percentile. Then:

p = 2(x_p + 1/x_p) − 4  ⇒  2x_p^2 − (4 + p)x_p + 2 = 0  ⇒  x_p = (1/4)[4 + p ± √(p^2 + 8p)],

but since 0 < p < 1 and 1 ≤ x_p ≤ 2:

x_p = (1/4)[4 + p + √(p^2 + 8p)].

To find μ̃, set p = 0.5 to obtain:

μ̃ = (1/4)[4 + 0.5 + √((0.5)^2 + 8(0.5))] = 1.6404.

c. Compute E(X) and V(X).

Solution  Using the definitions:

E(X) = ∫_1^2 2x(1 − 1/x^2) dx = 2 ∫_1^2 (x − 1/x) dx = 2 [x^2/2 − ln x]_1^2 = 1.6137
E(X^2) = ∫_1^2 2x^2 (1 − 1/x^2) dx = 2 ∫_1^2 (x^2 − 1) dx = 2 [x^3/3 − x]_1^2 = 2.6667
V(X) = E(X^2) − [E(X)]^2 = 2.6667 − (1.6137)^2 = 0.0627.

d. If 1.5 thousand gallons is in stock at the beginning of the week and no new supply is due in during the week, how much of the 1.5 thousand gallons is expected to be left at the end of the week?

Solution  The amount left is max(1.5 − X, 0). Therefore:

E(Amount Left) = ∫_1^2 max(1.5 − x, 0) f(x) dx = 2 ∫_1^{1.5} (1.5 − x)(1 − 1/x^2) dx = 0.0609.
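Parts (c) and (d) can be checked by numerical integration (an illustrative Python/scipy sketch, not part of the original notes):

from scipy.integrate import quad

f = lambda x: 2 * (1 - 1 / x**2)                       # pdf on [1, 2]

EX, _   = quad(lambda x: x * f(x), 1, 2)               # ~1.6137
EX2, _  = quad(lambda x: x**2 * f(x), 1, 2)            # ~2.6667
left, _ = quad(lambda x: (1.5 - x) * f(x), 1, 1.5)     # integrand is 0 beyond 1.5
print(EX, EX2 - EX**2, left)                           # ~1.6137, ~0.0627, ~0.0609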

Example (Devore, Page 152, Exercise 15)  Suppose the pdf of weekly gravel sales X (in tons) is:

f(x) = 2(1 − x),   0 ≤ x ≤ 1
     = 0,          otherwise.

a. Obtain the cdf of X.

Solution  For 0 ≤ x ≤ 1:

F(x) = ∫_{−∞}^{x} f(y) dy = ∫_0^x 2(1 − y) dy = 2 [y − y^2/2]_0^x = 2(x − x^2/2).

Thus:

F(x) = 0,                x < 0
     = 2(x − x^2/2),     0 ≤ x ≤ 1
     = 1,                x > 1.

b. What is P(X ≤ 1/2)?

Solution  P(X ≤ 1/2) = F(1/2) = 2(1/2 − (1/2)^2/2) = 3/4 = 0.75.

c. Using part (a), what is P(1/4 < X ≤ 1/2)? What is P(1/4 ≤ X ≤ 1/2)?

Solution  P(1/4 < X ≤ 1/2) = P(1/4 ≤ X ≤ 1/2) = F(1/2) − F(1/4) = 3/4 − 7/16 = 5/16 = 0.3125.

d. What is the 75th percentile of the sales distribution?

Solution  We must solve F(η(0.75)) = 0.75 for η(0.75):

0.75 = 2(η(0.75) − η(0.75)^2/2)  ⇒  η(0.75)^2 − 2η(0.75) + 3/4 = 0  ⇒  η(0.75) ∈ {1/2, 3/2}.

Since the range of X is [0, 1], η(0.75) = 1/2.

e. What is the median of the sales distribution?

Solution  0.5 = F(μ̃) = 2(μ̃ − μ̃^2/2)  ⇒  μ̃^2 − 2μ̃ + 1/2 = 0  ⇒  μ̃ = (2 − √2)/2 = 0.2929.

f. Compute E(X) and σ_X.

Solution  From the definitions:

E(X) = ∫_0^1 2x(1 − x) dx = 2 [x^2/2 − x^3/3]_0^1 = 1/3 = 0.3333
E(X^2) = ∫_0^1 2x^2 (1 − x) dx = 2 [x^3/3 − x^4/4]_0^1 = 1/6 = 0.1667
V(X) = 1/6 − (1/3)^2 = 1/6 − 1/9 = 1/18 = 0.0556
σ_X = √V(X) = √(1/18) = √2/6 = 0.2357.
4.3 The Normal Distribution
The normal distribution is the most common and important distribution in statistics and applications. Many random variables (e.g., heights, weights, reaction times) follow a normal distribution. Even when the underlying distribution is discrete, the normal distribution is often a very good approximation. Sums and averages of non-normal variables will usually be approximately normally distributed.

A continuous random variable X is said to have a normal distribution with parameters μ and σ if the pdf of X is:

f(x; μ, σ) = (1/(σ√(2π))) exp(−(x − μ)^2/(2σ^2))

for −∞ < x < ∞. The statement that X is normally distributed with parameters μ and σ^2 is often abbreviated as X ~ N(μ, σ^2). The parameters μ and σ define the normal family. Showing that the pdf integrates to 1 is not simple by hand, or even with Maple, because the integrand has no closed-form antiderivative.

If X ~ N(μ, σ^2), then the mean and variance are:

E(X) = μ
V(X) = σ^2.

Since the normal density curve is symmetric, the center of the single peak of the bell-shaped curve represents both the median and the mean. The spread of the distribution, as given by σ, is reflected in the shape of the bell and the amount of area in the tails. The smaller the value of σ, the higher the peak and the greater the area concentrated around μ. The area under the normal density curve within one standard deviation of the mean is approximately 0.68, and P(|X − μ| ≤ 2σ) ≈ 0.95.
The Standard Normal Distribution    Rather than compute probabilities by numerically evaluating:

P(a ≤ X ≤ b) = ∫_a^b (1/(σ√(2π))) exp(−(x − μ)^2/(2σ^2)) dx,

we standardize to the random variable:

Z = (X − μ)/σ.

The standard normal random variable Z reflects how many standard deviations we are from the mean. In other words, X = μ + σZ.

The standard normal random variable Z has parameters μ = 0 and σ = 1 and has pdf:

f(z; 0, 1) = (1/√(2π)) exp(−z^2/2)

for −∞ < z < ∞. Notationally, we write Z ~ N(0, 1).

The cdf of Z, denoted by Φ(z), is:

Φ(z) = P(Z ≤ z) = ∫_{−∞}^{z} f(y; 0, 1) dy.
For any random variable X ~ N(μ, σ^2), we standardize to Z via Z = (X − μ)/σ and compute any probability as:

P(a ≤ X ≤ b) = P((a − μ)/σ ≤ Z ≤ (b − μ)/σ)

using standard normal tables such as Table A.3 in Devore (pp. 704-705).

More formally, any probability involving X can be expressed as a probability involving a standard normal random variable Z. Hence:

P(X ≤ x) = P(Z ≤ (x − μ)/σ) = Φ((x − μ)/σ).

So, assuming a handy method of integration such as Maple is not available, we can always convert any probability statement involving X into one in terms of Z. These should already be familiar calculations from other courses.
Example  If X ~ N(0.5, 1), compute P(0.28 < X < 1.75).

Solution  P(0.28 < X < 1.75) = P((0.28 − 0.5)/1 < Z < (1.75 − 0.5)/1) = P(−0.22 < Z < 1.25) = Φ(1.25) − Φ(−0.22) = 0.8944 − 0.4129 = 0.4815.
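The calculation can be confirmed with scipy.stats (an illustrative Python sketch; the notes use the z-table):

from scipy.stats import norm

mu, sigma = 0.5, 1.0
X = norm(loc=mu, scale=sigma)

# P(0.28 < X < 1.75) directly, and again via standardization
print(X.cdf(1.75) - X.cdf(0.28))            # ~0.4814
z_lo, z_hi = (0.28 - mu) / sigma, (1.75 - mu) / sigma
print(norm.cdf(z_hi) - norm.cdf(z_lo))      # same value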
Percentiles of an Arbitrary Normal Distribution    In statistical inference, the percentile of the standard normal distribution which has area α to its right is denoted by z_α; it is the 100(1 − α)th percentile of the standard normal distribution. By symmetry, the (100α)th percentile is −z_α. These critical values are seen quite often in applications and hypothesis testing.

The (100p)th percentile of a normal distribution with mean μ and standard deviation σ can be obtained by finding the (100p)th percentile of the standard normal distribution and reversing the standardization. Hence, if X has a normal distribution with mean μ and standard deviation σ, the (100p)th percentile for X is given by:

[(100p)th percentile for N(μ, σ)] = μ + [(100p)th percentile for Z] · σ.
Example (Devore, Page 165, Exercise 39)  The distribution of resistance for resistors of a certain type is known to be normal. 10% of all resistors have a resistance exceeding 10.256 ohms and 5% have resistance smaller than 9.671 ohms. What are the mean value and standard deviation of the resistance distribution?

Solution  We are given that P(X > 10.256) = 0.10 and P(X < 9.671) = 0.05. Hence:

P(Z > (10.256 − μ)/σ) = 0.10    and    P(Z < (9.671 − μ)/σ) = 0.05.

From the standard normal tables, we see that P(Z > 1.28) = 0.10 and P(Z < −1.65) = 0.05. Therefore we have two equations in two unknowns:

(10.256 − μ)/σ = 1.28   ⇒   μ + 1.28σ = 10.256
(9.671 − μ)/σ = −1.65   ⇒   μ − 1.65σ = 9.671.

Subtracting the bottom equation from the top equation gives:

(μ + 1.28σ) − (μ − 1.65σ) = 10.256 − 9.671
2.93σ = 0.585
σ = 0.1997.

Substitution gives μ = 10. Hence the distribution of the resistance is normal with mean 10 ohms and standard deviation 0.1997 ohms.
Example (Devore, Page 165, Exercise 35a)  If a normal distribution has μ = 25 and σ = 5, what is the 91st percentile of the distribution?

Solution  For the standard normal random variable Z ~ N(0, 1), z_{0.09} = 1.34. Hence:

η(0.91) = 25 + (1.34)(5) = 31.7.
Example (Devore, Page 165, Exercise 35c)  The width of a line etched on an integrated circuit chip is normally distributed with mean 3.000 μm and standard deviation 0.150 μm. What width value separates the widest 10% of all such lines from the other 90%?

Solution  Let X = the width of a line etched on an integrated circuit chip. Then X ~ N(3.000, 0.150^2). The widest 10% corresponds to the 90th percentile, and z_{0.10} = 1.28. Hence:

η(0.90) = 3.000 + (1.28)(0.150) = 3.192.
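Both percentile calculations can be reproduced with the inverse normal cdf (an illustrative Python/scipy sketch; the notes use the z-table and μ + z·σ):

from scipy.stats import norm

# 91st percentile of N(25, 5^2)
print(norm.ppf(0.91, loc=25, scale=5))          # ~31.7

# Width separating the widest 10% of lines, X ~ N(3.000, 0.150^2)
print(norm.ppf(0.90, loc=3.000, scale=0.150))   # ~3.192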
The Normal Approximation to the Binomial Distribution    The normal distribution is often used as an approximation to the distribution of values in a discrete population. The point mass P(X = x) from the discrete distribution is replaced with the interval probability P(x − 1/2 ≤ X ≤ x + 1/2).

Devore notes in Example 4.19 that IQ scores are usually assumed to be normally distributed although the score is an integer. The histogram of scores has rectangles which are centered at integers. In using the normal approximation, the area under the normal density curve between 124.5 and 125.5 would approximate the point mass P(X = 125). The correction of 0.5 is called the continuity correction.

The binomial distribution is often approximated by the normal distribution. This approximation is considered reasonable if np ≥ 5 and n(1 − p) ≥ 5. If these two conditions hold, then X ~ BIN(n, p) is approximately normal with mean μ = np and variance σ^2 = np(1 − p).

More formally, if X ~ BIN(n, p) with np ≥ 5 and n(1 − p) ≥ 5:

P(X ≤ x) ≈ Φ((x + 0.5 − np)/√(np(1 − p))),

where Φ is the standard normal cdf.

As an illustration, a histogram of 5000 samples from a BIN(20, 0.1) distribution has np = 2 < 5 and is still rather skewed, whereas a histogram of 5000 samples from a BIN(20, 0.5) distribution satisfies the necessary conditions and appears approximately normal.

The 0.5 is a continuity correction which ensures that all of the probability to the left of the original x is included. If we are interested in the binomial probability P(a ≤ X ≤ b), then:

P(a ≤ X ≤ b) ≈ Φ((b + 0.5 − np)/√(np(1 − p))) − Φ((a − 0.5 − np)/√(np(1 − p))).

Example (Devore, Page 166, Exercise 49)  Suppose only 40% of all drivers in a certain state regularly wear a seatbelt. A random sample of 500 drivers is selected. What is the probability that:

a. Between 180 and 230 (inclusive) of the drivers in the sample regularly wear their seatbelt?

Solution  Let X = the number of drivers in the sample who regularly wear their seatbelt. Then X ~ BIN(500, 0.40). Since np = 200 and n(1 − p) = 300 (both much larger than 5), the normal approximation should be very good here, i.e., X is approximately N(200, 120). Therefore:

P(180 ≤ X ≤ 230) ≈ Φ((230.5 − 200)/√120) − Φ((179.5 − 200)/√120)
                 = Φ(2.78) − Φ(−1.87) = 0.9973 − 0.0307 = 0.9666.

b. Fewer than 175 of those in the sample regularly wear a seatbelt?

Solution  P(X < 175) ≈ Φ((174.5 − 200)/√120) = Φ(−2.33) = 0.0099.
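The quality of the approximation can be checked against the exact binomial probability (an illustrative Python/scipy sketch, not part of the original notes):

from math import sqrt
from scipy.stats import binom, norm

n, p = 500, 0.40
mu, sigma = n * p, sqrt(n * p * (1 - p))

# Exact P(180 <= X <= 230) and the normal approximation with continuity correction
exact  = binom.cdf(230, n, p) - binom.cdf(179, n, p)
approx = norm.cdf((230.5 - mu) / sigma) - norm.cdf((179.5 - mu) / sigma)

print(exact, approx)    # both ~0.966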
4.4 and 4.5 The Gamma Family and Other Continuous Distributions
Many situations exist where the normal family of distributions is not applicable. Other models for continuous variables are used in many practical situations. One such application is in reliability or survival analysis. Survival times or times to failure are typically not normally distributed. Although specification of a distribution is not always necessary, it is often done effectively.
The Lognormal Distribution    The lognormal distribution may be specified for a random variable whose natural logarithm is normally distributed. A continuous random variable X is said to have a lognormal distribution if Y = ln(X) has a normal distribution. Since Y ~ N(μ, σ^2), X = exp(Y) is also specified in terms of μ and σ. The pdf of X is:

f(x; μ, σ) = (1/(xσ√(2π))) exp(−[ln(x) − μ]^2/(2σ^2)),   x ≥ 0
           = 0,                                           x < 0.

Note that X must be non-negative and that:

E(X) = exp(μ + σ^2/2)
V(X) = exp(2μ + σ^2) [exp(σ^2) − 1].

Also note that E(X) ≠ exp(μ) and V(X) ≠ exp(σ^2). In other words, we cannot simply transform the parameters via exponentiation.

Because ln(X) has a normal distribution, the cdf of X can be expressed in terms of the cdf Φ(z) of a standard normal random variable Z. For x ≥ 0:

F(x; μ, σ) = P(X ≤ x) = P[ln(X) ≤ ln(x)] = P(Z ≤ (ln(x) − μ)/σ) = Φ((ln(x) − μ)/σ).
Example (Devore, Page 178, Example 4.27)  Let X = the hourly median power (in decibels) of received radio signals transmitted between two cities. The authors of a journal article argue that the lognormal distribution provides a reasonable probability model for X. Suppose the parameter values are μ = 3.5 and σ = 1.2.

a. What are E(X) and V(X)?

Solution  Using the proposition above:

E(X) = exp(3.5 + (1.2)^2/2) = 68.0335
V(X) = exp(2(3.5) + (1.2)^2) [exp((1.2)^2) − 1] = 14907.1677.

b. What is the probability that received power is between 50 and 250 dB?

Solution  Using the standard normal cdf:

P(50 ≤ X ≤ 250) = F(250; 3.5, 1.2) − F(50; 3.5, 1.2)
                = Φ((ln(250) − 3.5)/1.2) − Φ((ln(50) − 3.5)/1.2)
                = Φ(1.68) − Φ(0.34) = 0.9535 − 0.6331 = 0.3204.

c. What is the probability that X does not exceed its mean?

Solution  Using the standard normal cdf:

P(X ≤ 68.0) = Φ((ln(68.0) − 3.5)/1.2) = Φ(0.60) = 0.7257.

If the distribution were symmetric, this probability would equal 0.5; it is much larger because the positive skew (long upper tail) of the distribution pulls the mean outward past the median.
Example (Devore, Page 181, Exercise 73)  A theoretical justification based on a certain material failure mechanism underlies the assumption that the ductile strength X of a material has a lognormal distribution. Suppose the parameters are μ = 5 and σ = 0.1.

a. Compute E(X) and V(X).

Solution  Using the proposition above:

E(X) = exp(5 + (0.1)^2/2) = 149.1571
V(X) = exp(2(5) + (0.1)^2) [exp((0.1)^2) − 1] = 223.5945.

b. Compute P(X > 120) and P(110 ≤ X ≤ 130).

Solution  Using the standard normal cdf:

P(X > 120) = P(Z > (ln(120) − 5)/0.1) = 1 − Φ(−2.12) = 0.9830
P(110 ≤ X ≤ 130) = P((ln(110) − 5)/0.1 ≤ Z ≤ (ln(130) − 5)/0.1) = Φ(−1.32) − Φ(−3.00) = 0.0921.

c. What is the value of median ductile strength?

Solution  μ̃ = exp(5) = 148.4132.

d. If ten different samples of an alloy steel of this type were subjected to a strength test, how many would you expect to have strength at least 120?

Solution  Let Y = the number out of 10 which have strength at least 120. Then:

E(Y) = 10 P(X > 120) = 10(0.9830) = 9.8300.

e. If the smallest 5% of strength values were unacceptable, what would the minimum acceptable strength be?

Solution  We want the 5th percentile. Hence:

0.05 = Φ((ln(η(0.05)) − 5)/0.1)  ⇒  η(0.05) = exp[5 + (0.1)(−1.645)] = 125.9015.
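These lognormal calculations can be reproduced with scipy.stats (an illustrative Python sketch; note the assumption that scipy's lognorm is parameterized by shape s = σ and scale = exp(μ)):

import numpy as np
from scipy.stats import lognorm

mu, sigma = 5.0, 0.1
X = lognorm(s=sigma, scale=np.exp(mu))

print(X.mean(), X.var())           # ~149.16 and ~223.6
print(X.sf(120))                   # P(X > 120), ~0.983
print(X.cdf(130) - X.cdf(110))     # P(110 <= X <= 130), ~0.091
print(X.median(), X.ppf(0.05))     # ~148.41 and ~125.9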
The Weibull Distribution    The Weibull distribution is a very flexible distributional family which is very applicable in reliability and survival analysis. Its flexibility allows it to be a reasonable model for a variety of random variables with skewed probability distributions.

A random variable X is said to follow a Weibull distribution with parameters α and β (α > 0, β > 0) if the pdf of X is:

f(x; α, β) = (α/β^α) x^(α−1) exp(−(x/β)^α),   x ≥ 0
           = 0,                                x < 0.

A variety of shapes are possible. Changing the scale parameter β for a fixed α stretches the pdf.

Integrating to obtain E(X) and E(X^2) yields:

E(X) = β Γ(1 + 1/α)
V(X) = β^2 [Γ(1 + 2/α) − (Γ(1 + 1/α))^2],

where Γ(·) is the gamma function:

Γ(α) = ∫_0^∞ x^(α−1) exp(−x) dx.

The most important properties of the gamma function are:

For any α > 1, Γ(α) = (α − 1) Γ(α − 1),
For any positive integer n, Γ(n) = (n − 1)!, and
Γ(1/2) = √π.

One of the most useful features of the Weibull distribution is the simple form of the cdf:

F(x; α, β) = 1 − exp(−(x/β)^α),   x ≥ 0
           = 0,                    x < 0.

Note that X may be shifted so that its minimum value is not 0 as in the pdf above. If this minimum value γ is unknown but positive, the pdf and cdf simply have x − γ in place of x.
Example (Devore, Page 180, Exercise 67)  The authors of a paper state that the Weibull distribution is widely used in statistical problems relating to aging of solid insulating materials subjected to aging and stress. They propose the use of the distribution as a model for time (in hours) to failure of solid insulating specimens subjected to AC voltage. The values of the parameters depend on the voltage and temperature; suppose that α = 2.5 and β = 200 (values suggested by data in the paper).

a. What is the probability that a specimen's lifetime is at most 200? Less than 200? More than 300?

Solution  Let X = a specimen's time to failure. Then the desired probabilities are:

P(X ≤ 200) = 1 − exp(−(200/200)^2.5) = 1 − exp(−1) = 0.6321
P(X < 200) = P(X ≤ 200) = 0.6321
P(X > 300) = 1 − P(X ≤ 300) = exp(−(300/200)^2.5) = 0.0636.

b. What is the probability that a specimen's lifetime is between 100 and 200?

Solution  P(100 ≤ X ≤ 200) = exp(−(100/200)^2.5) − exp(−(200/200)^2.5) = 0.4701.

c. What value is such that exactly 50% of all specimens have lifetimes exceeding that value?

Solution  We want the median. The equation F(μ̃) = 0.5 reduces to:

0.5 = exp(−(μ̃/200)^2.5).

Therefore μ̃ = 172.7223.
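The Weibull example can be reproduced with scipy.stats (an illustrative Python sketch; scipy's weibull_min takes the shape α as c and the scale β as scale):

from scipy.stats import weibull_min

alpha, beta = 2.5, 200.0
X = weibull_min(c=alpha, scale=beta)

print(X.cdf(200))                  # P(X <= 200), ~0.6321
print(X.sf(300))                   # P(X > 300),  ~0.0636
print(X.cdf(200) - X.cdf(100))     # P(100 <= X <= 200), ~0.4701
print(X.median())                  # ~172.72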
The Beta Distribution    We often need to model the distribution of a proportion (i.e., 0 < X < 1), where X is a continuous random variable. The beta distribution is often used in this framework. X is said to have a beta distribution with parameters α, β, A, and B if the pdf of X is:

f(x; α, β, A, B) = (1/(B − A)) (Γ(α + β)/(Γ(α) Γ(β))) ((x − A)/(B − A))^(α−1) ((B − x)/(B − A))^(β−1),   A ≤ x ≤ B
                 = 0,                                                                                     otherwise.

The case A = 0 and B = 1 gives the standard beta distribution.

The mean and variance for a beta distribution are:

E(X) = A + (B − A) α/(α + β)
V(X) = (B − A)^2 αβ / [(α + β)^2 (α + β + 1)].
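The mean and variance formulas can be checked with scipy.stats (an illustrative Python sketch; the parameter values α = 2, β = 5 are arbitrary, and scipy handles the interval [A, B] through loc = A and scale = B − A):

from scipy.stats import beta as beta_dist

a, b, A, B = 2.0, 5.0, 0.0, 1.0               # standard beta when A = 0, B = 1
X = beta_dist(a, b, loc=A, scale=B - A)

print(X.mean())   # A + (B - A)*a/(a + b)                      = 2/7
print(X.var())    # (B - A)^2 * a*b / ((a + b)^2 (a + b + 1))  = 10/392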
The Family of Gamma Distributions    Another widely used distributional model for skewed data is the gamma family of distributions.

A continuous random variable X is said to have a gamma distribution if the pdf of X is:

f(x; α, β) = (1/(β^α Γ(α))) x^(α−1) exp(−x/β),   x ≥ 0
           = 0,                                   otherwise,

where α > 0, β > 0, and Γ(α) is the gamma function defined above. The standard gamma distribution has β = 1.

For the standard gamma distribution, the pdf is f(x; α) = (1/Γ(α)) x^(α−1) exp(−x). The pdf is strictly decreasing for any α ≤ 1 and has a skewed shape (with a definite maximum) for any α > 1.

The mean and variance for a gamma distribution are:

E(X) = αβ
V(X) = αβ^2.
The proof that E(X) = αβ is:

E(X) = ∫_{−∞}^{∞} x f(x) dx = ∫_0^∞ x · x^(α−1) exp(−x/β) / (β^α Γ(α)) dx
     = αβ ∫_0^∞ x^α exp(−x/β) / (β^(α+1) Γ(α + 1)) dx = αβ (1) = αβ,

where the final integrand is the GAMMA(α + 1, β) pdf, so the integral equals 1.

The parameter α is the shape parameter. For α ≤ 1, the pdf is strictly decreasing. The parameter β is a scale parameter. It stretches (β > 1) or compresses (β < 1) the pdf for a constant α. Therefore, the gamma pdf can be very flexible for representing many skewed distributions.
The cdf of the standard gamma distribution (β = 1) is given by the incomplete gamma function:

F(x; α) = ∫_0^x y^(α−1) exp(−y) / Γ(α) dy

for x > 0. Table A.4 in Devore (p. 706) gives values of the incomplete gamma function for various values of x and α.

The incomplete gamma function can also be used for a general β. Let X ~ GAMMA(α, β). Then:

P(X ≤ x) = F(x/β; α),

where F is the incomplete gamma function.
Example (Devore, Page 173, Exercise 57)  Suppose that when a transistor of a certain type is subjected to an accelerated life test, the lifetime X (in weeks) has a gamma distribution with mean 24 weeks and standard deviation 12 weeks.

a. What is the probability that a transistor will last between 12 and 24 weeks?

Solution  We are given that E(X) = αβ = 24 and V(X) = αβ^2 = 12^2 = 144. Therefore:

β = αβ^2/(αβ) = 144/24 = 6
α = 24/β = 24/6 = 4.

Hence:

P(12 ≤ X ≤ 24) = F(24/6; 4) − F(12/6; 4) = F(4; 4) − F(2; 4) = 0.567 − 0.143 = 0.424.

b. What is the probability that a transistor will last at most 24 weeks? Is the median of the lifetime distribution less than 24? Why or why not?

Solution  The desired probability is:

P(X ≤ 24) = F(4; 4) = 0.567,

so while the mean is 24, the median is less than 24. This is a result of the positive skew of the gamma distribution.

c. What is the 99th percentile of the lifetime distribution?

Solution  We need c such that 0.99 = P(X ≤ c) = F(c/6; 4). From Table A.4, we see that c/6 = 10. Hence c = 60.

d. Suppose the test will actually be terminated after t weeks. What value of t is such that only one-half of 1% of all transistors would still be operating at termination?

Solution  The desired value of t is the 99.5th percentile, so:

0.995 = F(t/6; 4)  ⇒  t/6 = 11  ⇒  t = 66.
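The table lookups in this example can be reproduced with scipy.stats (an illustrative Python sketch; scipy's gamma takes the shape α as a and the scale β as scale):

from scipy.stats import gamma

alpha, beta = 4.0, 6.0
X = gamma(a=alpha, scale=beta)

print(X.mean(), X.std())            # 24.0 and 12.0
print(X.cdf(24) - X.cdf(12))        # P(12 <= X <= 24), ~0.424
print(X.cdf(24))                    # F(4; 4), ~0.567
print(X.ppf(0.99), X.ppf(0.995))    # ~60 and ~66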
The Exponential Distribution    A special case of the gamma distribution is the exponential distribution. Taking α = 1 and β = 1/λ gives the exponential pdf:

f(x; λ) = λ exp(−λx),   x ≥ 0
        = 0,            x < 0,

for λ > 0.

The expected value and variance of an exponential random variable X are:

E(X) = αβ = (1)(1/λ) = 1/λ
V(X) = αβ^2 = (1)(1/λ)^2 = 1/λ^2.

The cdf of an exponential random variable X is:

F(x; λ) = 1 − exp(−λx),   x ≥ 0
        = 0,              x < 0.

The exponential distribution is also a special case of the Weibull distribution, with α = 1 and β = 1/λ.
The exponential distribution is frequently used as the model for the distribution of inter-arrival times, or the times between successive events.

We have already discussed the Poisson distribution as a possible model for the number of events occurring in a time interval of length t. If the number of such events is distributed as Poisson with parameter αt, then the elapsed time between successive events is distributed as exponential with parameter λ = α.

Another application of the exponential distribution is to model the distribution of component lifetimes. Relevant to this application is the memoryless property:

P(X ≥ t_2 | X ≥ t_1) = P(X ≥ t) = exp(−λt),

for t_2 = t_1 + t. Although we haven't discussed conditional probability in much detail, this property says that the probability of surviving an additional t units of time is the same as the probability of surviving the first t units of time, irrespective of the fact that we know the component did not fail during the first t_1 units.
Example (Devore, Page 173, Exercise 59)  The time X (in seconds) that it takes a librarian to locate an entry in a file of records on checked-out books has an exponential distribution with expected time 20 seconds. Calculate the following probabilities: P(X ≤ 30), P(X ≥ 20), and P(20 ≤ X ≤ 30).

Solution  The value of λ is λ = 1/E(X) = 1/20. The desired probabilities are then:

P(X ≤ 30) = 1 − exp(−30/20) = 0.7769
P(X ≥ 20) = 1 − [1 − exp(−20/20)] = exp(−1) = 0.3679
P(20 ≤ X ≤ 30) = 1 − exp(−30/20) − [1 − exp(−20/20)] = 0.1447.

Note that for this non-symmetric distribution, the probability of an observed X being greater than the mean is not 0.5.
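The librarian example can be reproduced with scipy.stats (an illustrative Python sketch; scipy's expon is parameterized by scale = 1/λ, the mean):

from scipy.stats import expon

X = expon(scale=20)                  # mean 20 seconds, so lambda = 1/20

print(X.cdf(30))                     # P(X <= 30), ~0.7769
print(X.sf(20))                      # P(X >= 20), ~0.3679
print(X.cdf(30) - X.cdf(20))         # P(20 <= X <= 30), ~0.1447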
The Chi-Squared Distribution    The chi-squared distribution is another member of the gamma family, with α = ν/2 and β = 2. A random variable X has a chi-squared distribution with ν degrees of freedom, denoted X ~ χ^2_ν, if it has pdf:

f(x; ν) = (1/(2^(ν/2) Γ(ν/2))) x^(ν/2 − 1) exp(−x/2),   x ≥ 0
        = 0,                                             x < 0,

for ν = 1, 2, 3, ... .

If X ~ χ^2_ν, then:

E(X) = ν
V(X) = 2ν.
The chi-squared distribution is widely used in statistical inference. In fact, if X ~ N(0, 1), then X^2 ~ χ^2_1. The most commonly used procedure in inference which involves the chi-squared distribution is the Pearson statistic:

Σ_{i=1}^{k} (O_i − E_i)^2 / E_i,

used to see whether categorical data consisting of elements y_1, y_2, ..., y_k in k cells fit pre-specified probabilities of falling in these cells. Under fairly general conditions, this statistic will (in the long run, over many samples) follow a χ^2_{k−1} distribution.
4.6 Probability Plots
Given a sample of size n, (x_1, x_2, ..., x_n), how do we decide whether or not a particular distribution is appropriate for modelling? Assuming we have a continuous random variable, a probability plot is a mechanism for comparing the distribution of the observed data with a reference distribution in order to determine whether the family of the reference distribution is a reasonable model for the data. The method makes use of sample quantiles. The pth quantile is the (100p)th percentile. Recall that the (100p)th percentile is the number η(p) such that F(η(p)) = p.

We order the data set from smallest to largest to obtain the sample order statistics x_(1), x_(2), ..., x_(n), where x_(i) is the ith smallest observation. Because the position of a sample percentile may fall between two observations, we define x_(i) to be the [100(i − 0.5)/n]th sample percentile.

A probability plot uses the ordered pairs:

([100(i − 0.5)/n]th reference percentile, ith smallest sample observation).

In other words, we are looking to see how closely the ith smallest sample observation matches up with the [100(i − 0.5)/n]th percentile of the reference distribution. Points falling close to a straight line indicate a good match.
Example (Devore, Page 183, Example 4.29)  The value of a certain physical constant is known to an experimenter. The experimenter makes n = 10 independent measurements of this value using a particular measurement device and records the resulting measurement errors. Is it plausible that the random variable measurement error has a standard normal distribution? Pairing the ordered measurement errors with the corresponding standard normal percentiles gives the points of the probability plot, (−1.645, −1.91), (−1.037, −1.25), ..., and (1.645, 1.56). Although the points deviate a bit from the 45° line, the predominant impression is that this line fits the points very well. The plot suggests that the standard normal distribution is a reasonable probability model for measurement error.
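A probability plot of this kind can be produced with scipy and matplotlib (an illustrative sketch; the data are simulated, and scipy.stats.probplot uses plotting positions close to, though not identical to, the (i − 0.5)/n rule in these notes):

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
errors = rng.normal(loc=0.0, scale=1.0, size=10)    # simulated measurement errors

(theo_q, ordered), (slope, intercept, r) = stats.probplot(errors, dist="norm")
plt.scatter(theo_q, ordered)                        # sample vs. reference percentiles
plt.plot(theo_q, theo_q, linestyle="--")            # 45-degree reference line
plt.xlabel("Standard normal percentile")
plt.ylabel("Ordered sample value")
plt.show()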
Families other than the normal family can also be compared to a standard distribution. For two-parameter families with location and scale parameters θ_1 and θ_2, this corresponds to θ_1 = 0 and θ_2 = 1. Hence the percentiles of the standard reference distribution corresponding to 100(i − 0.5)/n for i = 1, 2, ..., n are plotted against the sorted data set, as with the normal distribution. Again, plots close to a straight line indicate a good fit.

Another distribution which is useful when looking at probability or quantile-quantile (QQ) plots is the extreme value distribution with parameters θ_1 and θ_2. The cdf of the extreme value distribution is:

F(x; θ_1, θ_2) = 1 − exp[−exp((x − θ_1)/θ_2)].

When considering a Weibull distribution for X, we know that for X ~ WEIB(α, β), ln(X) has an extreme value distribution with θ_1 = ln(β) and θ_2 = 1/α. Because the Weibull shape parameter α changes the shape of the distribution, the Weibull family has no single standard reference distribution, so this result is beneficial: ln(X) can be plotted against extreme value percentiles instead. Since the exponential distribution is a special case of the Weibull distribution, it can also be checked via this method.