
Statistics for Business and Economics
6th Edition

Chapter 9
Estimation: Additional Topics

Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.


Chapter Goals
After completing this chapter, you should be able to:

Form confidence intervals for the mean difference from dependent samples
Form confidence intervals for the difference between two independent population means (standard deviations known or unknown)
Compute confidence interval limits for the difference between two independent population proportions
Create confidence intervals for a population variance
Find chi-square values from the chi-square distribution table
Determine the required sample size to estimate a mean or proportion within a specified margin of error

Estimation: Additional Topics


Chapter Topics

Population Means, Dependent Samples. Example: same group before vs. after treatment
Population Means, Independent Samples. Example: Group 1 vs. independent Group 2
Population Proportions. Example: Proportion 1 vs. Proportion 2
Population Variance. Example: variance of a normal distribution

Dependent Samples
Tests Means of 2 Related Populations

Paired or matched samples
Repeated measures (before/after)
Use the difference between paired values: $d_i = x_i - y_i$

Eliminates variation among subjects
Assumption: both populations are normally distributed

Mean Difference

The $i$th paired difference is $d_i$, where

$$d_i = x_i - y_i$$

The point estimate for the population mean paired difference is $\bar{d}$:

$$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}$$

The sample standard deviation is:

$$s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}}$$

$n$ is the number of matched pairs in the sample

Confidence Interval for Mean Difference

The confidence interval for the difference between population means, $\mu_d$, is

$$\bar{d} - t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}}$$

where $n$ is the sample size (the number of matched pairs in the paired sample)

Confidence Interval for Mean Difference
(continued)

The margin of error is

$$ME = t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}}$$

$t_{n-1,\alpha/2}$ is the value from the Student's t distribution with $(n - 1)$ degrees of freedom for which

$$P(t_{n-1} > t_{n-1,\alpha/2}) = \frac{\alpha}{2}$$

Paired Samples Example

Six people sign up for a weight loss program. You collect the following data:

Person    Weight before (x)    Weight after (y)    Difference, d_i
1         136                  125                  11
2         205                  195                  10
3         157                  150                   7
4         138                  140                  -2
5         175                  165                  10
6         166                  160                   6
Total                                               42

$$\bar{d} = \frac{\sum d_i}{n} = \frac{42}{6} = 7.0$$

$$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = 4.82$$

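Paired Samples Example
(continued)

For a 95% confidence level, the appropriate t value is $t_{n-1,\alpha/2} = t_{5,0.025} = 2.571$

The 95% confidence interval for the difference between means, $\mu_d$, is

$$\bar{d} - t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}}$$

$$7 - (2.571)\frac{4.82}{\sqrt{6}} < \mu_d < 7 + (2.571)\frac{4.82}{\sqrt{6}}$$

$$1.94 < \mu_d < 12.06$$

The margin of error is $(2.571)(4.82)/\sqrt{6} = 5.06$, so the lower limit is $7 - 5.06 = 1.94$. Since this interval does not contain zero, these data support, at the 95% confidence level, that the weight loss program helps people lose weight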
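
The paired-sample interval can be checked numerically. The following is a minimal sketch, assuming the critical value $t_{5,0.025} = 2.571$ is read from a t table rather than computed; the function name `paired_mean_ci` is illustrative and not part of the slides.

```python
import math
import statistics

# Weights from the six-person example (before = x, after = y).
before = [136, 205, 157, 138, 175, 166]
after = [125, 195, 150, 140, 165, 160]

# Critical value t_{5, 0.025}, taken from a t table (assumed, not computed).
T_CRIT = 2.571


def paired_mean_ci(x, y, t_crit):
    """Return (d_bar, s_d, lower, upper) for the mean paired difference."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation, divisor n - 1
    margin = t_crit * s_d / math.sqrt(n)
    return d_bar, s_d, d_bar - margin, d_bar + margin


d_bar, s_d, lower, upper = paired_mean_ci(before, after, T_CRIT)
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}")
print(f"95% CI: {lower:.2f} < mu_d < {upper:.2f}")
```

Both limits come out positive, so zero lies outside the interval.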

Difference Between Two Means

Population means, independent samples

Different data sources: unrelated and independent
Sample selected from one population has no effect on the sample selected from the other population

Goal: Form a confidence interval for the difference between two population means, $\mu_x - \mu_y$

The point estimate is the difference between the two sample means:

$$\bar{x} - \bar{y}$$

Difference Between Two Means
(continued)

Population means, independent samples:

$\sigma_x^2$ and $\sigma_y^2$ known: confidence interval uses $z_{\alpha/2}$
$\sigma_x^2$ and $\sigma_y^2$ unknown (assumed equal or assumed unequal): confidence interval uses a value from the Student's t distribution

$\sigma_x^2$ and $\sigma_y^2$ Known

Assumptions:
Samples are randomly and independently drawn
Both population distributions are normal
Population variances are known

$\sigma_x^2$ and $\sigma_y^2$ Known
(continued)

When $\sigma_x^2$ and $\sigma_y^2$ are known and both populations are normal, the variance of $\bar{X} - \bar{Y}$ is

$$\sigma_{\bar{X}-\bar{Y}}^2 = \frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}$$

and the random variable

$$Z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}}}$$

has a standard normal distribution

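Confidence Interval, $\sigma_x^2$ and $\sigma_y^2$ Known

The confidence interval for $\mu_x - \mu_y$ is:

$$(\bar{x} - \bar{y}) - z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}$$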
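
The known-variance interval translates directly into a few lines of code. This is a sketch only: the function name `two_mean_ci_known_var` and all input values are hypothetical, chosen to exercise the formula, and $z_{\alpha/2}$ is obtained from the standard library's `statistics.NormalDist` instead of a table.

```python
import math
from statistics import NormalDist


def two_mean_ci_known_var(x_bar, y_bar, var_x, var_y, n_x, n_y, conf=0.95):
    """CI for mu_x - mu_y when both population variances are known."""
    alpha = 1.0 - conf
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    margin = z * math.sqrt(var_x / n_x + var_y / n_y)
    diff = x_bar - y_bar
    return diff - margin, diff + margin


# Hypothetical inputs (not from the slides), used only for illustration.
lower, upper = two_mean_ci_known_var(
    x_bar=50.0, y_bar=45.0, var_x=16.0, var_y=9.0, n_x=40, n_y=30, conf=0.95
)
print(f"95% CI: {lower:.3f} < mu_x - mu_y < {upper:.3f}")
```

The interval is symmetric around the point estimate $\bar{x} - \bar{y}$, and it widens as the confidence level rises or as either sample size shrinks.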

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Equal

Assumptions:
Samples are randomly and independently drawn
Populations are normally distributed
Population variances are unknown but assumed equal

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Equal
(continued)

Forming interval estimates:
The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate $\sigma$
Use a t value with $(n_x + n_y - 2)$ degrees of freedom

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Equal
(continued)

The pooled variance is

$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$$

Confidence Interval, $\sigma_x^2$ and $\sigma_y^2$ Unknown, Equal

The confidence interval for $\mu_x - \mu_y$ is:

$$(\bar{x} - \bar{y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}$$

where

$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$$

Pooled Variance Example

You are testing two computer processors for speed. Form a confidence interval for the difference in CPU speed. You collect the following speed data (in MHz):

                  CPU_x    CPU_y
Number tested     17       14
Sample mean       3004     2538
Sample std dev    74       56

Assume both populations are normal with equal variances, and use 95% confidence

Calculating the Pooled Variance

The pooled variance is:

$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{(n_x - 1) + (n_y - 1)} = \frac{(17 - 1)74^2 + (14 - 1)56^2}{(17 - 1) + (14 - 1)} = 4427.03$$

The t value for a 95% confidence interval is:

$$t_{n_x+n_y-2,\alpha/2} = t_{29,0.025} = 2.045$$

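Calculating the Confidence Limits

The 95% confidence interval is

$$(\bar{x} - \bar{y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}$$

$$(3004 - 2538) - (2.045)\sqrt{\frac{4427.03}{17} + \frac{4427.03}{14}} < \mu_x - \mu_y < (3004 - 2538) + (2.045)\sqrt{\frac{4427.03}{17} + \frac{4427.03}{14}}$$

$$416.89 < \mu_x - \mu_y < 515.11$$

We are 95% confident that the difference in mean CPU speed is between 416.89 and 515.11 MHz.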
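
The pooled-variance calculation can be reproduced from the summary statistics alone. The sketch below assumes the tabulated critical value $t_{29,0.025} = 2.045$; the function names are illustrative. The limits are sensitive to the critical value: with 2.045 the computation gives roughly 416.89 and 515.11 MHz.

```python
import math

# Summary statistics from the CPU example (speeds in MHz).
n_x, mean_x, sd_x = 17, 3004.0, 74.0
n_y, mean_y, sd_y = 14, 2538.0, 56.0

# t_{29, 0.025} from a t table (assumed, not computed here).
T_CRIT = 2.045


def pooled_variance(n_x, sd_x, n_y, sd_y):
    """Pooled estimate of the common variance, weighted by degrees of freedom."""
    return ((n_x - 1) * sd_x**2 + (n_y - 1) * sd_y**2) / (n_x + n_y - 2)


def pooled_ci(mean_x, mean_y, sp2, n_x, n_y, t_crit):
    """CI for mu_x - mu_y under the equal-variance assumption."""
    margin = t_crit * math.sqrt(sp2 / n_x + sp2 / n_y)
    diff = mean_x - mean_y
    return diff - margin, diff + margin


sp2 = pooled_variance(n_x, sd_x, n_y, sd_y)
lower, upper = pooled_ci(mean_x, mean_y, sp2, n_x, n_y, T_CRIT)
print(f"pooled variance = {sp2:.2f}")
print(f"95% CI: {lower:.2f} < mu_x - mu_y < {upper:.2f}")
```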

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Unequal

Assumptions:
Samples are randomly and independently drawn
Populations are normally distributed
Population variances are unknown and assumed unequal

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Unequal
(continued)

Forming interval estimates:
The population variances are assumed unequal, so a pooled variance is not appropriate
Use a t value with $\nu$ degrees of freedom, where

$$\nu = \frac{\left[\left(\dfrac{s_x^2}{n_x}\right) + \left(\dfrac{s_y^2}{n_y}\right)\right]^2}{\left(\dfrac{s_x^2}{n_x}\right)^2 / (n_x - 1) + \left(\dfrac{s_y^2}{n_y}\right)^2 / (n_y - 1)}$$

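Confidence Interval, $\sigma_x^2$ and $\sigma_y^2$ Unknown, Unequal

The confidence interval for $\mu_x - \mu_y$ is:

$$(\bar{x} - \bar{y}) - t_{\nu,\alpha/2}\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + t_{\nu,\alpha/2}\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}}$$

where

$$\nu = \frac{\left[\left(\dfrac{s_x^2}{n_x}\right) + \left(\dfrac{s_y^2}{n_y}\right)\right]^2}{\left(\dfrac{s_x^2}{n_x}\right)^2 / (n_x - 1) + \left(\dfrac{s_y^2}{n_y}\right)^2 / (n_y - 1)}$$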
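
The degrees-of-freedom formula is easy to get wrong by hand, so a short sketch may help. It applies the formula (often called the Welch-Satterthwaite approximation) to the summary statistics of the CPU example, purely as an illustration: that example assumes equal variances, and the function name `unequal_var_df` is hypothetical.

```python
def unequal_var_df(s_x, n_x, s_y, n_y):
    """Approximate degrees of freedom nu for the unequal-variance interval."""
    vx = s_x**2 / n_x  # estimated variance of the sample mean of x
    vy = s_y**2 / n_y  # estimated variance of the sample mean of y
    return (vx + vy) ** 2 / (vx**2 / (n_x - 1) + vy**2 / (n_y - 1))


# CPU example summary statistics, reused here only for illustration.
nu = unequal_var_df(s_x=74.0, n_x=17, s_y=56.0, n_y=14)
print(f"nu = {nu:.2f}")
```

The result is generally not an integer. A common convention is to round it down before looking up $t_{\nu,\alpha/2}$ in a table, which gives a slightly wider, more conservative interval.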

Two Population Proportions

Goal: Form a confidence interval for the difference between two population proportions, $P_x - P_y$

Assumptions:
Both sample sizes are large (generally at least 40 observations in each sample)

The point estimate for the difference is

$$\hat{p}_x - \hat{p}_y$$

Two Population Proportions
(continued)

The random variable

$$Z = \frac{(\hat{p}_x - \hat{p}_y) - (P_x - P_y)}{\sqrt{\dfrac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \dfrac{\hat{p}_y(1 - \hat{p}_y)}{n_y}}}$$

is approximately normally distributed

Confidence Interval for Two Population Proportions

The confidence limits for $P_x - P_y$ are:

$$(\hat{p}_x - \hat{p}_y) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}}$$

Example: Two Population Proportions

Form a 90% confidence interval for the difference between the proportion of men and the proportion of women who have college degrees.

In a random sample, 26 of 50 men and 28 of 40 women had an earned college degree

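Example: Two Population Proportions
(continued)

Men: $\hat{p}_x = \frac{26}{50} = 0.52$
Women: $\hat{p}_y = \frac{28}{40} = 0.70$

$$\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}} = \sqrt{\frac{0.52(0.48)}{50} + \frac{0.70(0.30)}{40}} = 0.1012$$

For 90% confidence, $z_{\alpha/2} = 1.645$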

Example: Two Population Proportions
(continued)

The confidence limits are:

$$(\hat{p}_x - \hat{p}_y) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}} = (0.52 - 0.70) \pm 1.645(0.1012)$$

so the confidence interval is

$$-0.3465 < P_x - P_y < -0.0135$$

Since this interval does not contain zero, we are 90% confident that the two proportions are not equal
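
The arithmetic of this example can be verified with a short script. This is a minimal sketch, assuming $z_{\alpha/2} = 1.645$ is taken from a normal table as in the example; the function name `two_proportion_ci` is illustrative.

```python
import math

# Counts from the example: 26 of 50 men, 28 of 40 women.
Z_CRIT = 1.645  # z_{0.05} for 90% confidence, from a normal table (assumed)


def two_proportion_ci(successes_x, n_x, successes_y, n_y, z_crit):
    """Large-sample CI for P_x - P_y; returns (std_error, lower, upper)."""
    p_x = successes_x / n_x
    p_y = successes_y / n_y
    std_error = math.sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)
    diff = p_x - p_y
    return std_error, diff - z_crit * std_error, diff + z_crit * std_error


std_error, lower, upper = two_proportion_ci(26, 50, 28, 40, Z_CRIT)
print(f"standard error = {std_error:.4f}")
print(f"90% CI: {lower:.4f} < P_x - P_y < {upper:.4f}")
```

Both limits are negative, which matches the conclusion that the interval excludes zero.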

Confidence Intervals for the Population Variance

Goal: Form a confidence interval for the population variance, $\sigma^2$
The confidence interval is based on the sample variance, $s^2$
Assumed: the population is normally distributed

Confidence Intervals for the Population Variance
(continued)

The random variable

$$\chi_{n-1}^2 = \frac{(n - 1)s^2}{\sigma^2}$$

follows a chi-square distribution with $(n - 1)$ degrees of freedom

The chi-square value $\chi_{n-1,\alpha}^2$ denotes the number for which

$$P(\chi_{n-1}^2 > \chi_{n-1,\alpha}^2) = \alpha$$

Confidence Intervals for the Population Variance
(continued)

The $100(1 - \alpha)\%$ confidence interval for the population variance is

$$\frac{(n - 1)s^2}{\chi_{n-1,\alpha/2}^2} < \sigma^2 < \frac{(n - 1)s^2}{\chi_{n-1,1-\alpha/2}^2}$$

Example

You are testing the speed of a computer processor. You collect the following data (in MHz):

                  CPU_x
Sample size       17
Sample mean       3004
Sample std dev    74

Assume the population is normal.
Determine the 95% confidence interval for $\sigma_x^2$

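Finding the Chi-square Values

$n = 17$, so the chi-square distribution has $(n - 1) = 16$ degrees of freedom
$\alpha = 0.05$, so use the chi-square values with area 0.025 in each tail:

$$\chi_{n-1,\alpha/2}^2 = \chi_{16,0.025}^2 = 28.85$$

$$\chi_{n-1,1-\alpha/2}^2 = \chi_{16,0.975}^2 = 6.91$$

(Figure: chi-square density with 16 degrees of freedom. Probability $\alpha/2 = 0.025$ lies below $\chi_{16}^2 = 6.91$ and probability $\alpha/2 = 0.025$ lies above $\chi_{16}^2 = 28.85$.)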

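Calculating the Confidence Limits

The 95% confidence interval is

$$\frac{(n - 1)s^2}{\chi_{n-1,\alpha/2}^2} < \sigma^2 < \frac{(n - 1)s^2}{\chi_{n-1,1-\alpha/2}^2}$$

$$\frac{(17 - 1)(74)^2}{28.85} < \sigma^2 < \frac{(17 - 1)(74)^2}{6.91}$$

$$3037 < \sigma^2 < 12683$$

(The upper limit 12683 uses the unrounded table value 6.908; with 6.91 it is about 12680.)

Converting to standard deviation, we are 95% confident that the population standard deviation of CPU speed is between 55.1 and 112.6 MHz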
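
The variance interval and its conversion to a standard deviation can be sketched as follows. The chi-square critical values 28.85 and 6.91 are taken from the table as constants (an assumption of this sketch; they are not computed), and the function name `variance_ci` is illustrative.

```python
import math

# Example data: n = 17 processors, sample standard deviation 74 MHz.
N = 17
SAMPLE_SD = 74.0

# Chi-square critical values for 16 degrees of freedom, from a table.
CHI2_UPPER = 28.85  # area 0.025 in the upper tail
CHI2_LOWER = 6.91   # area 0.975 in the upper tail


def variance_ci(n, sample_sd, chi2_upper, chi2_lower):
    """CI for sigma^2; the larger critical value gives the lower limit."""
    numerator = (n - 1) * sample_sd**2
    return numerator / chi2_upper, numerator / chi2_lower


var_low, var_high = variance_ci(N, SAMPLE_SD, CHI2_UPPER, CHI2_LOWER)
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)
print(f"variance: {var_low:.0f} to {var_high:.0f}")
print(f"std dev:  {sd_low:.1f} to {sd_high:.1f} MHz")
```

Unlike the intervals for means, this interval is not symmetric around the sample variance $74^2 = 5476$, because the chi-square distribution is skewed.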

Sample PHStat Output

(Figure: PHStat input dialog and output worksheet; screenshots not reproduced)

Sample Size Determination

Determining sample size: for the mean, for the proportion

Margin of Error

The required sample size can be found to reach a desired margin of error (ME) with a specified level of confidence $(1 - \alpha)$

The margin of error is also called sampling error:
the amount of imprecision in the estimate of the population parameter
the amount added and subtracted to the point estimate to form the confidence interval

Sample Size Determination

Determining sample size for the mean

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Margin of error (sampling error):

$$ME = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Sample Size Determination
(continued)

Determining sample size for the mean

$$ME = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Now solve for $n$ to get

$$n = \frac{z_{\alpha/2}^2\sigma^2}{ME^2}$$

Sample Size Determination
(continued)

To determine the required sample size for the mean, you must know:
The desired level of confidence $(1 - \alpha)$, which determines the $z_{\alpha/2}$ value
The acceptable margin of error (sampling error), ME
The standard deviation, $\sigma$

Required Sample Size Example

If $\sigma = 45$, what sample size is needed to estimate the mean within $\pm 5$ with 90% confidence?

$$n = \frac{z_{\alpha/2}^2\sigma^2}{ME^2} = \frac{(1.645)^2(45)^2}{5^2} = 219.19$$

So the required sample size is $n = 220$
(Always round up)
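
The "always round up" rule corresponds to taking the ceiling of the computed value. A minimal sketch, assuming $z_{\alpha/2} = 1.645$ from a normal table; the function name `sample_size_for_mean` is illustrative.

```python
import math


def sample_size_for_mean(z_crit, sigma, margin_of_error):
    """Return (raw n, rounded-up n) needed to estimate a mean within ME."""
    raw_n = (z_crit**2 * sigma**2) / margin_of_error**2
    return raw_n, math.ceil(raw_n)  # round up so ME is not exceeded


# Values from the example: sigma = 45, ME = 5, 90% confidence.
raw_n, n_required = sample_size_for_mean(z_crit=1.645, sigma=45.0, margin_of_error=5.0)
print(f"raw n = {raw_n:.2f}, required n = {n_required}")
```

Rounding down to 219 would leave the margin of error slightly above 5, which is why the ceiling is used rather than ordinary rounding.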

Sample Size Determination

Determining sample size for the proportion

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

Margin of error (sampling error):

$$ME = z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

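Sample Size Determination
(continued)

Determining sample size for the proportion

$$ME = z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

$\hat{p}(1 - \hat{p})$ cannot be larger than 0.25, which occurs when $\hat{p} = 0.5$

Substitute 0.25 for $\hat{p}(1 - \hat{p})$ and solve for $n$ to get

$$n = \frac{0.25\,z_{\alpha/2}^2}{ME^2}$$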

Sample Size Determination
(continued)

The sample and population proportions, $\hat{p}$ and $P$, are generally not known (since no sample has been taken yet)

$P(1 - P) = 0.25$ generates the largest possible margin of error (so guarantees that the resulting sample size will meet the desired level of confidence)

To determine the required sample size for the proportion, you must know:
The desired level of confidence $(1 - \alpha)$, which determines the critical $z_{\alpha/2}$ value
The acceptable sampling error (margin of error), ME
Estimate $P(1 - P) = 0.25$

Required Sample Size Example


How large a sample would be necessary
to estimate the true proportion defective in
a large population within 3%, with 95%
confidence?

Required Sample Size Example
(continued)

Solution:
For 95% confidence, use $z_{0.025} = 1.96$
$ME = 0.03$
Estimate $P(1 - P) = 0.25$

$$n = \frac{0.25\,z_{\alpha/2}^2}{ME^2} = \frac{(0.25)(1.96)^2}{(0.03)^2} = 1067.11$$

So use $n = 1068$
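
The same calculation in code, as a sketch under the conservative assumption $P(1 - P) = 0.25$ and with $z_{0.025} = 1.96$ taken from a normal table. The function name and its `p_times_q` parameter are illustrative, not part of the slides.

```python
import math


def sample_size_for_proportion(z_crit, margin_of_error, p_times_q=0.25):
    """Return (raw n, rounded-up n); p_times_q = 0.25 is the worst case."""
    raw_n = p_times_q * z_crit**2 / margin_of_error**2
    return raw_n, math.ceil(raw_n)  # always round up


# Values from the example: 95% confidence, ME = 0.03.
raw_n, n_required = sample_size_for_proportion(z_crit=1.96, margin_of_error=0.03)
print(f"raw n = {raw_n:.2f}, required n = {n_required}")
```

Because 0.25 is the maximum of $P(1 - P)$, any other value of the proportion would require a sample no larger than this.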

PHStat Sample Size Options

(Figure: PHStat screenshot not reproduced)

Chapter Summary

Compared two dependent samples (paired samples)
Formed confidence intervals for the paired difference
Compared two independent samples
Formed confidence intervals for the difference between two means, population variance known, using z
Formed confidence intervals for the differences between two means, population variance unknown, using t
Formed confidence intervals for the differences between two population proportions
Formed confidence intervals for the population variance using the chi-square distribution
Determined required sample size to meet confidence and margin of error requirements
