ISOM 2500

1. Key Learning Objectives

In this lesson, we will discuss continuous random variables and their probability
distributions. We will start with the general setup of a continuous
distribution and then continue with two forms of it, namely, the
normal and sampling distributions. It is very important that you understand
the material discussed here, as it constitutes the theoretical background to
inferential statistics.

2. Discussion
A continuous random variable is a random variable that can take any value
contained in one or more intervals (i.e., an uncountable number of values).
Examples: salary, time, volume of milk in a container, etc. Since a continuous
random variable can assume infinitely many values, the probability of any
individual value is zero. As such, we can only determine the probability of a
range of values.
Figure 1a: Histogram of the gas-mileages of 49 mid-sized cars.
[Figure: Histogram of Mileage; x-axis: Mileage (30 to 33), y-axis: Frequency (0 to 12)]

K.H. Chen

Topic 2: Probability & Distributions


Part III: Continuous Random Variables & Distributions

Figure 1b: Density Histogram of the gas-mileages of 49 mid-sized cars.
[Figure: Density Histogram of Mileage; x-axis: Mileage (30 to 33), y-axis: Density (0.0 to 0.5)]

Figure 1c: Density Function of the gas-mileages of 49 mid-sized cars.
[Figure: Density Function; x-axis: x (29 to 34), y-axis: f(x)]

Probabilities for a continuous random variable are represented by areas under a
curve called the probability density function $f(x)$.
Requirements for a Probability Density Function:
1. $f(x) \ge 0$ for all $x$ between $a$ and $b$.
2. The total area under the curve between $a$ and $b$ is 1.0, i.e.,
$$\int_a^b f(x)\,dx = 1$$
The probability that the (continuous) random variable lies between $c$ and $d$ is
$$P(c < X < d) = P(c \le X \le d) = \int_c^d f(x)\,dx \quad \text{for } c \ge a \text{ and } d \le b$$

Mean and Variance of a Continuous Random Variable:

From Topic 2 Part II, the mean and variance of a discrete random variable are
determined as follows:
$$E[X] = \mu = \sum_{\text{all } x} x \, p(x) \quad \text{and} \quad V[X] = \sigma^2 = E\left[(X - \mu)^2\right] = \sum_{\text{all } x} (x - \mu)^2 \, p(x)$$
The mean and variance of a continuous random variable that ranges between $a$
and $b$ are determined in a similar fashion, using the integral sign rather than the
summation sign. That is,
Mean
$$E[X] = \mu = \int_a^b x \, f(x)\,dx$$
Variance
$$V[X] = \sigma^2 = E\left[(X - \mu)^2\right] = \int_a^b (x - \mu)^2 f(x)\,dx = \int_a^b x^2 f(x)\,dx - \mu^2 = E\left[X^2\right] - \mu^2, \quad \text{where } E\left[X^2\right] = \int_a^b x^2 f(x)\,dx$$
Standard deviation
$$\sigma = \sqrt{V(X)}$$

Example 1:
After playing golf for many years, a statistics professor determined the density
function for the distance his drives travel in hundreds of yards (denoted by X). It
is
$$f(x) = \frac{3}{19}x^2 \quad \text{for } 2 \le x \le 3$$
a. Confirm that the above function satisfies the requirements for a probability
density function.
[Figure: Plot of f(x) vs x; x-axis: x (2.0 to 3.0), y-axis: f(x) (0.6 to 1.5)]
From the above plot, we can see that $f(x) > 0$ for $2 \le x \le 3$, and thus the first
requirement for a probability density function is met.
The total area under $f(x)$ for $2 \le x \le 3$ is
$$\int_2^3 \frac{3}{19}x^2\,dx = \frac{3}{19}\int_2^3 x^2\,dx = \frac{3}{19}\left[\frac{x^3}{3}\right]_2^3 = \left[\frac{x^3}{19}\right]_2^3 = \frac{(3)^3 - (2)^3}{19} = \frac{27 - 8}{19} = 1$$
The second requirement for a probability density function is also met. Thus,
$f(x) = \frac{3}{19}x^2$ for $2 \le x \le 3$ satisfies the requirements for a probability density
function.

b. Find the probability that the professor's next drive is more than 250 yards.
$$P(X > 2.5) = \int_{2.5}^{3} \frac{3}{19}x^2\,dx = \left[\frac{x^3}{19}\right]_{2.5}^{3} = \frac{(3)^3 - (2.5)^3}{19} = \frac{27 - 15.625}{19} = 0.5987$$

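c. Determine the mean, variance, and standard deviation of the professor's
drives.
$$\text{Mean, } E[X] = \int_2^3 x \, f(x)\,dx = \int_2^3 \frac{3}{19}x^3\,dx = \left[\frac{3x^4}{76}\right]_2^3 = \frac{243 - 48}{76} = 2.5658 \quad (256.58 \text{ yards})$$
$$E\left[X^2\right] = \int_2^3 x^2 f(x)\,dx = \int_2^3 \frac{3}{19}x^4\,dx = \left[\frac{3x^5}{95}\right]_2^3 = \frac{729 - 96}{95} = \frac{633}{95}$$
$$\text{Variance, } V[X] = E\left[X^2\right] - \mu^2 = \frac{633}{95} - (2.5658)^2 = 0.07988227149 \quad (798.823 \text{ yards}^2)$$
$$\text{Standard deviation, } \sigma = \sqrt{0.07988227149} = 0.2826345193 \quad (28.26 \text{ yards})$$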
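The arithmetic in Example 1 can be checked with a short script. The sketch below is only an illustration: the helper name `integrate_power` is hypothetical (not part of the course material), and it relies on the power rule for integrating a single term $c\,x^k$, which is all this density needs.

```python
from fractions import Fraction
from math import sqrt


def integrate_power(coef, power, lo, hi):
    """Exact integral of coef * x**power over [lo, hi] using the power rule."""
    n = power + 1
    return coef * (Fraction(hi) ** n - Fraction(lo) ** n) / n


c = Fraction(3, 19)  # f(x) = (3/19) x^2 on [2, 3], as given in Example 1
lo, hi = 2, 3

total_area = integrate_power(c, 2, lo, hi)               # requirement 2: area is 1
p_over_250 = integrate_power(c, 2, Fraction(5, 2), hi)   # P(X > 2.5)
mean = integrate_power(c, 3, lo, hi)                     # E[X]
second_moment = integrate_power(c, 4, lo, hi)            # E[X^2]
variance = second_moment - mean ** 2                     # V[X] = E[X^2] - mu^2
std_dev = sqrt(variance)

print("total area:", total_area)
print("P(X > 2.5):", round(float(p_over_250), 4))
print("mean:", round(float(mean), 4))
print("variance:", round(float(variance), 8))
print("standard deviation:", round(std_dev, 8))
```

Working with `Fraction` keeps the mean exact (195/76), so the variance is not affected by rounding the mean to 2.5658 before squaring.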

Uniform/Rectangular Distribution
A continuous random variable X is a uniform random variable over an interval
$a \le x \le b$, or equivalently $[a, b]$, if X can take on any value in the closed interval
$[a, b]$ and if the probability density function of X is constant over this interval.
That is,
$$f(x) = \begin{cases} \dfrac{1}{b - a} & \text{for } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$
Figure 2: Probability density function of a uniform random variable X.
[Figure: Distribution plot, Uniform, Lower=a, Upper=b; f(X) is constant at 1/(b - a) between a and b]

Note that for a uniform random variable X,
$$E(X) = \frac{a + b}{2} \quad \text{and} \quad V(X) = \frac{(b - a)^2}{12}, \quad \text{so that} \quad \sigma_X = \frac{b - a}{\sqrt{12}}$$
Example 2:
The weekly output of a steel mill is a uniformly distributed random variable that
lies between 110 and 175 metric tons.
a. Sketch the probability density function of the weekly output.
b. Find the probability that the steel mill will produce more than 150 metric
tons next week.
c. Determine the probability that the steel mill will produce between 120 and
160 metric tons next week.
d. The operations manager labels any week that is in the bottom 20% of
production a "bad week". How many metric tons should be used to define a
bad week?
e. Find the expected value and standard deviation of the weekly output.
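One possible way to check answers to parts b to e of Example 2 is sketched below. It applies the uniform formulas stated above; the function name `uniform_cdf` and the variable names are illustrative choices, not something the notes prescribe.

```python
from math import sqrt

a, b = 110.0, 175.0  # bounds of the weekly output in metric tons (Example 2)


def uniform_cdf(x, lower, upper):
    """P(X <= x) for X uniform on [lower, upper]: area of a rectangle."""
    if x <= lower:
        return 0.0
    if x >= upper:
        return 1.0
    return (x - lower) / (upper - lower)


p_more_than_150 = 1 - uniform_cdf(150, a, b)                 # part b
p_120_to_160 = uniform_cdf(160, a, b) - uniform_cdf(120, a, b)  # part c
bad_week_cutoff = a + 0.20 * (b - a)                         # part d: 20th percentile
mean = (a + b) / 2                                           # part e
std_dev = (b - a) / sqrt(12)                                 # part e

print(round(p_more_than_150, 4), round(p_120_to_160, 4))
print(round(bad_week_cutoff, 2), mean, round(std_dev, 3))
```

Because the density is flat, every probability here is just (width of the interval) divided by (b - a), which is why no integration routine is needed.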

Normal Distribution
A continuous random variable X with the following probability density function is
called a normal random variable:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} \quad \text{for } -\infty < x < +\infty,\ -\infty < \mu < +\infty,\ \sigma > 0$$
where $e = 2.71828\ldots$ and $\pi = 3.14159\ldots$. Its distribution is called the normal
distribution. In short, we write
$$X \sim N(\mu, \sigma^2)$$
Here, $E[X] = \mu$ and $V[X] = \sigma^2$. The probability density function of a normal
random variable is mound-shaped (or bell-shaped), symmetric about its mean,
and has points of inflexion at $\mu - \sigma$ and $\mu + \sigma$.
To calculate the probability that a normal random variable falls into any interval,
we need to compute the area under the curve over that interval, that is, the
integral of the probability density function over the interval.
However, this integral has no closed-form expression, so we resort to using a
probability table or a statistical software package to calculate normal probabilities.
For the probability-table approach, it would appear that (countless) separate tables
are needed for different combinations of means ($-\infty < \mu < +\infty$) and standard
deviations ($\sigma > 0$). Fortunately, this won't be necessary, as we can reduce the
number of tables needed to one by standardizing the normal random variable.
That is,
$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$$
The standardized normal random variable Z is called the standard
normal random variable, and it has the following probability density function
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2} \quad \text{for } -\infty < z < +\infty$$
and its distribution is called the standard normal distribution. In short, we write
$$Z \sim N(0, 1)$$
Here, $E(Z) = \mu_Z = 0$ and $V(Z) = \sigma_Z^2 = 1$.
Using the probability table to find standard normal probabilities


Table 3 (in Appendix B) on pages B8 & B9 on lmes2.ust.hk can be used to find
standard normal probabilities. To use this table: first find the row that corresponds
to the integer part and the first decimal of the z value, and then find the column
that corresponds to the second decimal. Then, at the intersection of the row and
the column, you will find the probability that corresponds to an area similar to the
one depicted in the graph that accompanies the table.
Example 3:
Find $P(Z < 1.95)$. Note that this corresponds to the area under the
standard normal curve between $-\infty$ and 1.95.

Figure 3: $P(Z < 1.95)$.
[Figure: Distribution plot, Normal, Mean=0, StDev=1; shaded area to the left of Z = 1.95; y-axis: f(Z)]

From the table, $P(Z < 1.95) = 0.9744$.
MINITAB Instructions: Computing standard normal probabilities


Step 1: Click on the Calc menu and select Probability Distributions, followed
by Normal.

Step 2: Once the Normal Distribution window pops up,
- use the default Cumulative probability
- enter the values of $\mu$ and $\sigma$ next to Mean: and Standard deviation:,
  respectively
- select Input constant: and enter the value of the normal random
  variable next to it.

Step 3: Click on the OK button.

From the Session Window of MINITAB, $P(Z < 1.95) = 0.974412$
Note: When you need to calculate probabilities other than of the $P(-\infty < Z < z)$ or
P ( Z < z ) type, you need to be able to express your probability in terms of the
P ( Z < z ) probability. Homework problems will give you the chance to practice
doing so. Figure 4 depicts the way to do these manipulations.
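As a software alternative to the table, the manipulations described in this note can be sketched in Python with the standard library's `statistics.NormalDist`. The cut-offs 1, 2 and -1 below are illustrative choices for the sketch, not values taken from a specific exercise.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal distribution

# Table-style probability P(Z < z), as in Example 3
p_less_195 = Z.cdf(1.95)

# Right tail: P(Z > 2) = 1 - P(Z < 2)
p_greater_2 = 1 - Z.cdf(2.0)

# Interval: P(1 < Z < 2) = P(Z < 2) - P(Z < 1)
p_1_to_2 = Z.cdf(2.0) - Z.cdf(1.0)

# Interval straddling zero: P(-1 < Z < 2) = P(Z < 2) - P(Z < -1)
p_m1_to_2 = Z.cdf(2.0) - Z.cdf(-1.0)

for label, value in [
    ("P(Z < 1.95)", p_less_195),
    ("P(Z > 2)", p_greater_2),
    ("P(1 < Z < 2)", p_1_to_2),
    ("P(-1 < Z < 2)", p_m1_to_2),
]:
    print(f"{label} = {value:.4f}")
```

Every case reduces to one or two evaluations of the cumulative probability $P(Z < z)$, which is exactly what the table provides.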
Figure 4: Visualization of the simple arithmetic manipulations needed to express
other types of probabilities in terms of the P(Z < z) type.
[Figure: three rows of standard normal distribution plots (Normal, Mean=0, StDev=1; y-axis: f(Z)), each expressing a shaded area in terms of P(Z < z) areas: 0.136 = 0.977 - 0.841, 0.0228 = 1.000 - 0.977, and 0.819 = 0.977 - 0.159]

Example 4:
Let X be a normally distributed random variable with mean $\mu = 40$ and $\sigma = 5$.
Find the probability $P(X < 49)$.
To compute $P(X < 49)$ using the standard normal table, we need to standardize X:
Figure 5: $P(X < 49)$
[Figure: two distribution plots; left: Normal, Mean=40, StDev=5, area 0.964 to the left of X = 49; right (after standardizing): Normal, Mean=0, StDev=1, area 0.964 to the left of Z = 1.8]
$$P(X < 49) = P\left(\frac{X - \mu}{\sigma} < \frac{49 - 40}{5}\right) = P(Z < 1.8) = 0.9641$$

MINITAB Instructions: Computing normal probabilities


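Steps: Same as Steps 1, 2 & 3 under "MINITAB Instructions: Computing standard normal probabilities" above, entering the mean and standard deviation of X.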

From the Session Window of MINITAB, P ( X < 49 ) = P ( Z < 1.8 ) = 0.964070


Example 5:
Let X be a normally distributed random variable with mean $\mu = 50$ and $\sigma = 8$.
Find P (30 < X < 39) .
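A hedged sketch of how Example 5 could be worked in software is shown below, following the same standardization idea as Example 4. It assumes Python's `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 50.0, 8.0  # parameters given in Example 5

# Standardize both end points: z = (x - mu) / sigma
z_low = (30 - mu) / sigma    # -2.5
z_high = (39 - mu) / sigma   # -1.375

Z = NormalDist()  # standard normal, N(0, 1)
p_via_z = Z.cdf(z_high) - Z.cdf(z_low)  # P(z_low < Z < z_high)

# The same probability without standardizing by hand
X = NormalDist(mu, sigma)
p_direct = X.cdf(39) - X.cdf(30)

# Cross-check against Example 4: P(X < 49) for X ~ N(40, 25)
p_example_4 = NormalDist(40, 5).cdf(49)

print(round(p_via_z, 4), round(p_direct, 4), round(p_example_4, 4))
```

Both routes give the same answer because standardizing only relabels the horizontal axis; the shaded area is unchanged.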


Determining Z and X values when the probability is given.

Use the table in the reverse way.


Destandardize using $X = \mu + Z\sigma$

Example 6:
The life of a calculator manufactured by CASIO is normally distributed with $\mu = 50$
months and $\sigma = 8$ months. What should the warranty period be if the
company does not want to replace more than 5% of its products?
Figure 6a: $P(X < x_{0.95}) = 0.05$, where $x_{0.95}$ denotes the 5th percentile of $X \sim N(50, 64)$.
[Figure: Distribution plot, Normal, Mean=50, StDev=8; area 0.05 to the left of x0.95; y-axis: f(X)]

Figure 6b: $P(Z < z_{0.95}) = 0.05$, where $z_{0.95}$ denotes the 5th percentile of $Z \sim N(0, 1)$.
[Figure: Distribution plot, Normal, Mean=0, StDev=1; area 0.05 to the left of z0.95; y-axis: f(Z)]

From Table 3 (in Appendix B) on pages B8 & B9 on lmes2.ust.hk,
$$P(Z < -1.645) = 0.05 \;\Rightarrow\; z_{0.95} = -1.645$$
Thus, $x_{0.95} = \mu + z_{0.95}\,\sigma = 50 + (-1.645)(8) = 50 - 13.16 = 36.84$
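The reverse table lookup in Example 6 can also be sketched in Python. This is only an illustration assuming `statistics.NormalDist.inv_cdf` as the inverse cumulative probability; the variable names are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 50.0, 8.0  # calculator life in months (Example 6)

# Step 1: use the standard normal "table" in reverse.
z_05 = NormalDist().inv_cdf(0.05)  # z value with 5% of the area below it

# Step 2: destandardize with X = mu + Z * sigma.
warranty_months = mu + z_05 * sigma

# The same percentile obtained directly from N(50, 8^2).
warranty_direct = NormalDist(mu, sigma).inv_cdf(0.05)

print(round(z_05, 5), round(warranty_months, 4), round(warranty_direct, 4))
```

The small difference from the hand calculation (36.84 versus about 36.8412) comes only from rounding the z value to -1.645 in the table approach.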


MINITAB Instructions: Computing normal percentiles


Step 1: Click on the Calc menu and select Probability Distributions, followed
by Normal.

Step 2: Once the Normal Distribution window pops up,
- select Inverse cumulative probability
- enter the values of $\mu$ and $\sigma$ next to Mean: and Standard deviation:,
  respectively
- select Input constant: and enter the value of the cumulative
  probability next to it.

Step 3: Click on the OK button.

From the Session Window of MINITAB, x0.95 = 36.8412

Example 7:
At a certain university, the SAT scores on the verbal portion of the first-year
students are normally distributed with mean 520 and standard deviation 40.
a. Find the proportion of first-year students whose SAT scores on the verbal
portion are between 500 and 650.
b. How high must a verbal test score be in order to be among the highest 5% of
test scores?
c. If 5 first-year students are randomly selected, what is the probability that
there will be 3 students whose scores are between 500 and 650?
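The three parts of Example 7 combine a normal probability, a normal percentile, and a binomial probability from Topic 2 Part II. The sketch below is one possible way to check answers; it assumes part c asks for exactly 3 of the 5 students and that the students are selected independently.

```python
from math import comb
from statistics import NormalDist

scores = NormalDist(mu=520, sigma=40)  # verbal SAT scores (Example 7)

# Part a: P(500 < X < 650) = P(X < 650) - P(X < 500)
p_between = scores.cdf(650) - scores.cdf(500)

# Part b: the cut-off for the highest 5% is the 95th percentile
top_5_percent_cutoff = scores.inv_cdf(0.95)

# Part c (assumption: "3 students" means exactly 3 out of n = 5):
# binomial probability with success probability p_between
n, k = 5, 3
p_exactly_3 = comb(n, k) * p_between ** k * (1 - p_between) ** (n - k)

print(round(p_between, 4), round(top_5_percent_cutoff, 2), round(p_exactly_3, 4))
```

If part c is read as "at least 3" instead, the terms for k = 4 and k = 5 would have to be added to `p_exactly_3`.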


Check for normality

- Construct a dotplot/stem-and-leaf/histogram of the variable to see whether
  the data are normally distributed.
- Apply the empirical rule:
  o Compute $(\bar{x} - 1s, \bar{x} + 1s)$, $(\bar{x} - 2s, \bar{x} + 2s)$, $(\bar{x} - 3s, \bar{x} + 3s)$.
  o Compute the (actual) proportions of data points that fall within each of
    the above 3 intervals.
  o Compare the computed proportions with the theoretical proportions:
    68%, 95%, and nearly 100% (99.7%), respectively.
- Construct the normal quantile (Q-Q) plot.
- Conduct normality tests.

A normal quantile (Q-Q) plot is a graph designed to show whether a normal model
is a reasonable description of the variation in the data. The basic idea behind the
normal quantile plot is to compare the data values with the values one would
expect from a standard normal distribution. The comparison is based on the idea
of quantiles.
Example 8:
0.0  -0.3  -0.1  0.5  -0.4  2.8  2.6  -1.3  0.5  2.6
(one of the two 0.5 entries is -0.5; compare the sorted values in Column III of the table below)
To construct a normal quantile plot, do the following:
1. Sort the data in ascending order (see Column III of the table below).
2. Determine which quantile each data value represents. In this example, the
smallest of the 10 values represents the smallest 10% of the data. We will
consider this data value to lie halfway between 0% and 10% (the middle of
the lowest 10%). In general, the computation $\frac{i - 0.5}{n}$ gives the desired value
of the position (expressed as a decimal), since that is halfway between $\frac{i - 1}{n}$
and $\frac{i}{n}$ (see Column IV of the table below).
3. Compute the theoretical quantile of the standard normal distribution,
$z^*$ or $z_{\left(\frac{i - 0.5}{n}\right)}$, that corresponds to the proportion computed in Column IV
(see Column V of the table below).

For example, to obtain the theoretical quantile for the 1st row of the table
below, we need to know what value in the standard normal
distribution has approximately 5% of the distribution below it. So we search
for something close to 0.05 in the body of the standard normal table, and see
that it lies roughly halfway between -1.64 and -1.65 (let's fix it at -1.645).
A computer can get the value more accurately and indicates that it is
-1.64485. MINITAB will give you this value if you type invcdf 0.05 next
to the MTB > command in the session window or if you use the menu under
Calc > Probability Distributions > Normal.
A normal quantile plot is then constructed by plotting the values under Column III
($x_{(i)}$) against the values under Column V ($z^*$ or $z_{\left(\frac{i - 0.5}{n}\right)}$).

If the data came perfectly from a standard normal distribution, Columns III and V
of the table below would be identical (the theoretical quantile and the data value
would match). This means that all the points would fall along the straight line y =
x. Since other normal distributions are just linear transformations of the standard
normal distribution ($x = \mu + \sigma z$), perfect data from a normal distribution with
mean $\mu$ and standard deviation $\sigma$ would give a line with slope $\sigma$ and intercept
$\mu$.
We use normal quantile plots to assess the plausibility that a data set is a sample
from a normally distributed population. If the resulting plot is approximately
linear, then it is plausible that the data come from a normal distribution. Else (if
the plot is markedly nonlinear), it is doubtful that the data come from a normal
distribution. Of course, this will work much better for large data sets than for
small data sets.
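The three construction steps can be sketched in a few lines of Python. This is a minimal illustration, assuming the sorted values of Example 8 (Column III, with the minus signs implied by the ascending order) and `statistics.NormalDist.inv_cdf` for the theoretical quantiles; plotting itself is left out.

```python
from statistics import NormalDist

# Step 1: data sorted in ascending order (Column III; signs are an assumption
# inferred from the ascending order of the column)
sorted_data = [-1.3, -0.5, -0.4, -0.3, -0.1, 0.0, 0.5, 2.6, 2.6, 2.8]
n = len(sorted_data)

# Step 2: proportion below each sorted value, (i - 0.5) / n (Column IV)
proportions = [(i - 0.5) / n for i in range(1, n + 1)]

# Step 3: theoretical standard normal quantiles (Column V)
std_normal = NormalDist()
theoretical = [std_normal.inv_cdf(p) for p in proportions]

# Points of the normal quantile plot: (sample quantile, theoretical quantile)
qq_points = list(zip(sorted_data, theoretical))

for x, z in qq_points:
    print(f"{x:5.1f}  {z:9.5f}")
```

Passing `qq_points` to any plotting tool and judging whether the points fall near a straight line completes the check described above.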

I          II           III                  IV                      V
Position   Data Value   Sample Quantile      Proportion below x(i):  Theoretical Quantile
i          x_i          (Sorted Data Value)  (i - 0.5)/n             z* or z((i - 0.5)/n)
                        x(i)
1           0.0         -1.3                 0.05                    -1.64485
2          -0.3         -0.5                 0.15                    -1.03643
3          -0.1         -0.4                 0.25                    -0.67449
4           0.5 (*)     -0.3                 0.35                    -0.38532
5          -0.4         -0.1                 0.45                    -0.12566
6           2.8          0.0                 0.55                     0.12566
7           2.6          0.5                 0.65                     0.38532
8          -1.3          2.6                 0.75                     0.67449
9           0.5 (*)      2.6                 0.85                     1.03643
10          2.6          2.8                 0.95                     1.64485
(*) One of these two entries is -0.5, as Column III shows.

[Figure: Normal Q-Q Plot; x-axis: Sample Quantiles, y-axis: Theoretical Quantiles]

Exponential Distribution
A continuous random variable X is exponentially distributed if its probability
density function is given by
$$f(x) = \frac{1}{\mu}\, e^{-x/\mu} \quad \text{for } x \ge 0$$
where $e = 2.71828\ldots$ and $\mu$ is the mean of the exponential random variable.

It can be shown that the mean of an exponential random variable X is equal to its
standard deviation, i.e., $E[X] = \mu_X = \sqrt{V[X]} = \sigma_X$.

Figure 7: Exponential distributions with $\mu_X$ = 0.5, 1.0, and 2.0.
[Figure: Probability Density Function of Exponential Random Variable X; curves for Mean = 0.5, Mean = 1 and Mean = 2; x-axis: X, y-axis: f(X)]

Probabilities associated with an exponential random variable X:
a. $P(X > x^*) = e^{-x^*/\mu}$
b. $P(X < x^*) = 1 - e^{-x^*/\mu}$
c. $P(x_1^* < X < x_2^*) = P(X < x_2^*) - P(X < x_1^*) = e^{-x_1^*/\mu} - e^{-x_2^*/\mu}$

Note that if the number of arrivals follows a Poisson distribution, the times
between arrivals follow an exponential distribution.

Example 9:
Toll booths on the New York State Thruway are often congested because of the
large number of cars waiting to pay. A consultant working for the state concluded
that if service times are measured from the time a car stops in line until it leaves,
service times are exponentially distributed with a mean of 2.7 minutes.
a. What is the probability that a car will take more than 2 minutes to get
through the toll booth?
b. What is the probability that a car will take less than 3 minutes to get through
the toll booth?
c. What is the probability that a car will take at least 2 but no more than 4
minutes to get through the toll booth?
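Example 9 can be checked with the three exponential probability formulas given earlier. The sketch below is an illustration only; it assumes the distribution is parameterized by its mean (2.7 minutes), and the function name `exp_cdf` is hypothetical.

```python
from math import exp

mean_service = 2.7  # mean service time in minutes (Example 9)


def exp_cdf(x, mean):
    """P(X < x) = 1 - e^(-x/mean) for an exponential variable with the given mean."""
    if x <= 0:
        return 0.0
    return 1 - exp(-x / mean)


p_more_than_2 = 1 - exp_cdf(2, mean_service)                      # part a
p_less_than_3 = exp_cdf(3, mean_service)                          # part b
p_2_to_4 = exp_cdf(4, mean_service) - exp_cdf(2, mean_service)    # part c

print(round(p_more_than_2, 4), round(p_less_than_3, 4), round(p_2_to_4, 4))
```

Since X is continuous, "at least 2 but no more than 4" and "between 2 and 4" have the same probability, so part c needs no special treatment of the end points.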


Population vs. Sample

Parameter: a numerical measure (or characteristic) of the population, $\theta$: $\mu$, $\sigma^2$, $p$.
Statistic: a numerical measure (or characteristic) of a sample, $\hat{\theta}$: $\bar{X}$, $S^2$, $\hat{p}$.
Sampling error: the absolute difference between the parameter $\theta$ and its statistic
$\hat{\theta}$, that is, $|\theta - \hat{\theta}|$.
Sampling distribution: the probability distribution of a statistic.
Standard error: the standard deviation of a statistic.
Sampling distribution of the sample mean $\bar{X}$
Suppose a random sample of size n is taken from an infinite (very large) population which
has a mean $\mu$ (or $\mu_X$) and a standard deviation $\sigma$ (or $\sigma_X$). The mean of $\bar{X}$
(average of all possible sample means) will then be $\mu$ (or $\mu_X$) and the variance of
$\bar{X}$ is $\frac{\sigma^2}{n}$ (or $\frac{\sigma_X^2}{n}$). That is,
$$E(\bar{X}) = \mu_{\bar{X}} = \mu \ (\text{or } \mu_X) \quad \text{and} \quad Var(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \ \left(\text{or } \frac{\sigma_X^2}{n}\right)$$
Furthermore,

- if X is a normal random variable (the population from which the samples are
  drawn is normally distributed), then $\bar{X}$ is also a normal random variable
  (the probability distribution of the sample mean $\bar{X}$ is also normal)
  with mean $E(\bar{X}) = \mu_{\bar{X}} = \mu$ and variance $Var(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$;

- if X is not a normal random variable (the population from which the samples
  are drawn is not normally distributed), then $\bar{X}$ is approximately a normal
  random variable (the probability distribution of the sample mean $\bar{X}$ is
  approximately normal) provided n is large, according to the
  Central Limit Theorem. In many practical situations, a sample size of 30
  ($n \ge 30$) may be sufficiently large to allow us to use the normal approximation
  for the sampling distribution of $\bar{X}$. However, if the population is extremely
  nonnormal (for example, bimodal and highly skewed distributions), the
  sampling distribution will also be nonnormal even for moderately large
  values of n.

The Central Limit Theorem:


If the sample size n is sufficiently large, then the population of all possible sample
means is approximately normally distributed (with mean $\mu_{\bar{X}} = \mu$ and standard
deviation (standard error of the mean) $\sigma_{\bar{X}} = \sigma/\sqrt{n}$), no matter what probability
distribution describes the sampled population. Furthermore, the larger the sample
size n is, the more nearly normally distributed is the population of all possible
sample means.
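The theorem can be explored with a small simulation. The sketch below is illustrative only: it draws k = 20,000 samples of size n = 30 from an exponential population with mean 50 and compares the mean and standard deviation of the sample means with $\mu$ and $\sigma/\sqrt{n}$. The function names, the seed and the choice of k are assumptions made for this sketch.

```python
import random
from math import sqrt
from statistics import fmean, pstdev


def simulate_sample_means(draw, n, k, seed=0):
    """Return k sample means, each computed from n draws of draw(rng)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [fmean(draw(rng) for _ in range(n)) for _ in range(k)]


POP_MEAN = 50.0  # exponential population: mean = standard deviation = 50


def draw_exponential(rng):
    """One observation from an exponential population with mean POP_MEAN."""
    return rng.expovariate(1 / POP_MEAN)


n, k = 30, 20_000
means = simulate_sample_means(draw_exponential, n=n, k=k)

observed_mean = fmean(means)            # should be close to mu = 50
observed_se = pstdev(means)             # should be close to sigma / sqrt(n)
theoretical_se = POP_MEAN / sqrt(n)     # 50 / sqrt(30)

print(round(observed_mean, 3), round(observed_se, 3), round(theoretical_se, 3))
```

Rerunning with other values of n, or with a different `draw` function such as one based on `rng.uniform(25, 75)`, shows how the spread of the sample means shrinks like $\sigma/\sqrt{n}$ and how the shape of their histogram approaches the normal curve.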


Example 10:
a. X ~ N(50, 64)
[Figure: six histograms of sample means (k = 1,000,000 samples each) for n = 5, 10, 15, 20, 25 and 30; x-axis: Means, y-axis: Frequency; reference mark at 50 in each panel]

Descriptive Statistics: Means (n = 5) to Means (n = 30)

Variable         N         Mean     StDev   Variance   Minimum   Q1       Median   Q3       Maximum
Means (n = 5)    1000000   49.997   3.578   12.803     33.563    47.580   49.996   52.409   67.410
Means (n = 10)   1000000   49.998   2.525    6.373     36.832    48.292   49.998   51.701   62.532
Means (n = 15)   1000000   50.000   2.067    4.272     39.973    48.604   49.998   51.393   60.710
Means (n = 20)   1000000   50.000   1.791    3.206     41.550    48.790   50.001   51.208   58.101
Means (n = 25)   1000000   50.001   1.600    2.559     42.476    48.923   50.002   51.081   57.262
Means (n = 30)   1000000   50.003   1.459    2.130     42.315    49.017   50.003   50.987   56.804
Mode: * (N for Mode = 0) for every variable.

b. X ~ Unif(25, 75)
[Figure: six histograms of sample means (k = 1,000,000 samples each) for n = 5, 10, 15, 20, 25 and 30; x-axis: Means, y-axis: Frequency; reference mark at 50 in each panel]

Descriptive Statistics: Means (n = 5) to Means (n = 30)

Variable         N         Mean     StDev   Variance   Minimum   Q1       Median   Q3       Maximum
Means (n = 5)    1000000   49.994   6.451   41.616     26.619    45.536   49.997   54.460   73.408
Means (n = 10)   1000000   49.998   4.561   20.800     30.057    46.872   49.994   53.117   69.436
Means (n = 15)   1000000   49.993   3.731   13.921     32.819    47.457   49.995   52.531   67.824
Means (n = 20)   1000000   50.002   3.229   10.427     35.332    47.812   50.002   52.198   64.841
Means (n = 25)   1000000   50.000   2.883    8.309     37.017    48.047   50.002   51.953   63.044
Means (n = 30)   1000000   49.996   2.636    6.949     38.557    48.212   49.996   51.782   62.576
Mode: * (N for Mode = 0) for every variable.

c. X ~ exp(50)
[Figure: twenty histograms of sample means (k = 1,000,000 samples each) for n = 5, 10, 15, ..., 100; x-axis: Means, y-axis: Frequency; reference mark at 50 in each panel]

Descriptive Statistics: Means (n = 5) to Means (n = 100)

Variable          N         Mean     StDev    Variance   Minimum   Q1       Median   Q3       Maximum
Means (n = 5)     1000000   50.030   22.350   499.538     2.441    33.723   46.738   62.776   222.738
Means (n = 10)    1000000   50.016   15.834   250.717     5.632    38.622   48.375   59.601   164.371
Means (n = 15)    1000000   49.997   12.914   166.765     8.239    40.773   48.892   58.009   143.829
Means (n = 20)    1000000   50.005   11.194   125.304    13.661    42.074   49.162   57.033   124.532
Means (n = 25)    1000000   50.013   10.000   100.002    15.563    42.950   49.356   56.353   120.890
Means (n = 30)    1000000   49.989    9.130    83.355    18.710    43.573   49.428   55.808   111.209
Means (n = 35)    1000000   50.004    8.450    71.402    20.072    44.075   49.529   55.417   105.860
Means (n = 40)    1000000   49.993    7.902    62.447    21.147    44.459   49.571   55.064    94.210
Means (n = 45)    1000000   50.005    7.460    55.647    21.845    44.800   49.635   54.813    96.548
Means (n = 50)    1000000   50.009    7.071    49.998    22.887    45.074   49.683   54.586    91.011
Means (n = 55)    1000000   49.999    6.736    45.376    24.549    45.302   49.698   54.363    93.361
Means (n = 60)    1000000   49.991    6.457    41.688    23.093    45.502   49.709   54.173    87.456
Means (n = 65)    1000000   50.000    6.199    38.433    24.911    45.686   49.742   54.022    84.908
Means (n = 70)    1000000   49.992    5.980    35.762    27.437    45.835   49.745   53.890    83.259
Means (n = 75)    1000000   49.999    5.775    33.351    25.150    45.989   49.769   53.764    81.664
Means (n = 80)    1000000   50.000    5.583    31.171    27.228    46.132   49.793   53.646    82.805
Means (n = 85)    1000000   50.013    5.427    29.454    28.619    46.251   49.820   53.567    84.893
Means (n = 90)    1000000   50.006    5.268    27.749    28.847    46.355   49.826   53.458    81.011
Means (n = 95)    1000000   50.000    5.121    26.225    27.900    46.462   49.829   53.349    79.137
Means (n = 100)   1000000   49.998    5.003    25.031    27.543    46.537   49.827   53.275    77.490
Mode: * (N for Mode = 0) for every variable.

Example 11:
An automatic machine in a manufacturing process is operating properly if the
lengths of an important subcomponent are normally distributed with mean 117 and
standard deviation 5.2 (in centimeters).
a. Find the probability that one randomly selected subcomponent is longer than
120 cm.
b. Find the sampling distribution of the sample mean from a random sample of
size 4.
c. Find the probability that if four subcomponents are randomly selected, their
mean length exceeds 120 cm.
d. Find the probability that if four subcomponents are randomly selected, all
four have lengths that exceed 120 cm.
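Example 11 contrasts the distribution of a single observation with the sampling distribution of the mean. A possible check, sketched with `statistics.NormalDist` and assuming the four subcomponents are selected independently, is:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 117.0, 5.2, 4  # subcomponent lengths in cm (Example 11)

# Part a: one subcomponent, X ~ N(117, 5.2^2)
single = NormalDist(mu, sigma)
p_single_over_120 = 1 - single.cdf(120)

# Part b: sample mean of n = 4, X-bar ~ N(117, 5.2^2 / 4)
standard_error = sigma / sqrt(n)
mean_dist = NormalDist(mu, standard_error)

# Part c: P(X-bar > 120)
p_mean_over_120 = 1 - mean_dist.cdf(120)

# Part d: all four exceed 120 cm (independence assumed)
p_all_four_over_120 = p_single_over_120 ** n

print(round(p_single_over_120, 4), round(standard_error, 2))
print(round(p_mean_over_120, 4), round(p_all_four_over_120, 5))
```

Parts c and d answer different questions: the mean can exceed 120 cm without every individual length doing so, which is why the probability in part d is much smaller.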


Example 12:
The restaurant in a large commercial building provides coffee for the building's
occupants. The restaurateur has determined that the mean number of cups of
coffee consumed in a day by all the occupants is 2.0 with a standard deviation of
0.6. A new tenant of the building intends to have a total of 125 new employees.
What is the probability that the new employees will consume more than 240 cups
per day?
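One way to approach Example 12 is to rewrite the event about the total as an event about the sample mean and then apply the Central Limit Theorem. The sketch below assumes the 125 new employees behave like a random sample from the population of occupants and that n = 125 is large enough for the normal approximation.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 2.0, 0.6   # cups per occupant per day (Example 12)
n = 125                # number of new employees
total_threshold = 240  # cups per day

# Total > 240 cups is the same event as sample mean > 240 / 125 cups
mean_threshold = total_threshold / n

# CLT (assumption): X-bar is approximately N(mu, sigma^2 / n)
standard_error = sigma / sqrt(n)
sample_mean_dist = NormalDist(mu, standard_error)

z = (mean_threshold - mu) / standard_error
p_more_than_240 = 1 - sample_mean_dist.cdf(mean_threshold)

print(round(mean_threshold, 2), round(standard_error, 4))
print(round(z, 4), round(p_more_than_240, 4))
```

The population of cups consumed need not be normal for this to work; the approximation rests on the sample size, as described in the Central Limit Theorem section.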
