Hypothesis Testing for Population Proportions and Means

3/16/2011
Hypothesis Testing 2
CE 21. Engineering Statistics
Summary:
Let X1,…,Xn be a large (n>30) sample from a population with
mean µ and standard deviation σ.
To test a null hypothesis of the form
H0: µ ≤ µ0, H0: µ ≥ µ0, or H0: µ = µ0:
- Compute the z-score: $z = \dfrac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$.
If σ is unknown, it can be approximated with s.
- Compute the P-value. The P-value is an area under the
normal curve, which depends on the alternate hypothesis as
follows:
Alternate Hypothesis    P-value
H1: µ > µ0              Area to the right of z
H1: µ < µ0              Area to the left of z
H1: µ ≠ µ0              Sum of the areas in the tails cut off by z and -z
- The smaller the P-value, the more certain we can be
that H0 is false.
- The larger the P-value, the more plausible H0
becomes, but we can never be certain that H0 is true.
- A common rule of thumb is to reject H0 whenever
P ≤ 0.05. While this rule is convenient, it has no
scientific basis.
- If P ≤ 0.05, the result is statistically significant at the
5% level; equivalently, the null hypothesis is rejected at
the 5% level.
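The mapping from alternate hypothesis to tail area can be written as a small function. This is a minimal Python sketch rather than part of the original slides: the names `normal_upper_tail` and `p_value` and the labels "greater", "less", and "two-sided" are illustrative choices, and the normal tail area is obtained from the standard library's `math.erfc`.

```python
import math

def normal_upper_tail(z: float) -> float:
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def p_value(z: float, alternative: str) -> float:
    """P-value for a z-score under the given alternate hypothesis."""
    if alternative == "greater":       # H1: mu > mu0, area to the right of z
        return normal_upper_tail(z)
    if alternative == "less":          # H1: mu < mu0, area to the left of z
        return normal_upper_tail(-z)
    if alternative == "two-sided":     # H1: mu != mu0, both tails beyond |z|
        return 2.0 * normal_upper_tail(abs(z))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value(1.96, alt), 4))
```

For z = 1.96 the two-sided P-value is very close to 0.05, which is why 1.96 is the multiplier used in 95% confidence intervals.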
The Relationship Between Hypothesis Tests and Confidence Intervals
Example: The sample mean lifetime of 50 micro drills was
X̄ = 12.68 holes drilled, with s = 6.83. Setting
α to 0.05 (5%), the 95% confidence interval for µ
was computed to be (10.79, 14.57).
Test H0: µ = 10.79 versus H1: µ ≠ 10.79. Do a
similar test for 14.57.
What conclusion can you make from the results?
What about for values inside the interval?
- The 95% confidence interval consists of precisely those
values µ0 for which the two-sided test of H0: µ = µ0
gives a P-value greater than 0.05.
- The confidence interval contains all the values
that are plausible for the population mean µ.
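The link between the interval and the test can be checked numerically with the micro-drill sample (X̄ = 12.68, s = 6.83, n = 50). The sketch below is an illustration, not part of the slides: the helper name `two_sided_p` is hypothetical, and the interior value 12.0 and exterior value 16.0 are arbitrary test points chosen only to show the pattern.

```python
import math

def two_sided_p(xbar, s, n, mu0):
    """Two-sided large-sample P-value for H0: mu = mu0 (s approximates sigma)."""
    z = (xbar - mu0) / (s / math.sqrt(n))
    # erfc(|z| / sqrt(2)) equals twice the normal area to the right of |z|
    return math.erfc(abs(z) / math.sqrt(2.0))

xbar, s, n = 12.68, 6.83, 50            # micro-drill sample from the example
half_width = 1.96 * s / math.sqrt(n)    # 1.96 is the usual 95% normal multiplier
ci = (xbar - half_width, xbar + half_width)
print("95% CI:", tuple(round(v, 2) for v in ci))

# Endpoints of the interval, one value inside it, one value outside it
for mu0 in (10.79, 14.57, 12.0, 16.0):
    print(f"mu0 = {mu0}: P = {two_sided_p(xbar, s, n, mu0):.4f}")
```

Testing at either endpoint gives a P-value of about 0.05, values inside the interval give larger P-values, and values outside give smaller ones.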
Quiz:
1. For which value is the null hypothesis more plausible:
P=0.5 or P=0.05?
2. If P=0.01, which is the best conclusion?
a. H0 is definitely false.
b. H0 is definitely true.
c. There is a 1% probability that H0 is true.
d. H0 might be true, but it’s unlikely.
e. H0 might be false, but it’s unlikely.
f. H0 is plausible.
3. True or False: If P=0.02, then
a. The result is statistically significant at the 5% level.
b. The result is statistically significant at the 1% level.
c. The null hypothesis is rejected at the 5% level.
d. The null hypothesis is rejected at the 1% level.
Tests for a Population Proportion
Essentially the same procedures apply when
dealing with population proportions.
However, as discussed in previous lessons, the sample
proportion has mean and variance
$\mu = p$ and $\sigma^2 = \dfrac{p(1-p)}{n}$.
This test requires that the sample proportion be approximately
normally distributed.
- This assumption is justified when both np0 > 10 and n(1-p0) > 10.
- p0 is the population proportion specified in the null hypothesis.
Example 7:
A supplier of semiconductor wafers claims
that of all the wafers he supplies, no more than
10% are defective. A sample of 400 wafers is
tested and 50 of them, or 12.5%, are defective.
Can we conclude that the claim is false?
Example 8:
The article “Refinement of Gravimetric Geoid
Using GPS and Leveling Data” presents a
method for measuring orthometric heights
above sea level. For a sample of 1225 baselines,
926 gave results that were within the class C
spirit leveling tolerance limits. Can we conclude
that this method produces results within the
tolerance limits more than 75% of the time?
Summary:
Let X be the number of successes in n independent
Bernoulli trials, each with success probability p; in other
words, let X ~ Bin(n, p).
To test a null hypothesis of the form H0: p ≤ p0, H0: p ≥ p0,
or H0: p = p0, assuming that both np0 and
n(1-p0) are greater than 10:
- Compute the z-score: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$,
where $\hat{p} = X/n$ is the sample proportion.
- Compute the P-value. The P-value is an area under the
normal curve, which depends on the alternate hypothesis
as follows:
H1: p > p0              Area to the right of z
H1: p < p0              Area to the left of z
H1: p ≠ p0              Sum of the areas in the tails cut off by z and -z
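Examples 7 and 8 can both be worked as upper-tailed tests of a proportion. The sketch below assumes the usual large-sample statistic z = (p̂ − p0) / sqrt(p0(1 − p0)/n) with p̂ = X/n; the function name `proportion_z_test` is an illustrative choice, and treating both examples as H1: p > p0 is an interpretation of the questions asked.

```python
import math

def proportion_z_test(x, n, p0):
    """Upper-tailed large-sample test of H0: p <= p0 versus H1: p > p0."""
    # The normal approximation is only used when both counts exceed 10.
    if n * p0 <= 10 or n * (1 - p0) <= 10:
        raise ValueError("need n*p0 > 10 and n*(1 - p0) > 10")
    p_hat = x / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))   # area to the right of z
    return z, p_value

# Example 7: 50 defective wafers out of 400, claim is p <= 0.10
z7, p7 = proportion_z_test(50, 400, 0.10)
# Example 8: 926 of 1225 baselines within tolerance, question is p > 0.75
z8, p8 = proportion_z_test(926, 1225, 0.75)

print(f"Example 7: z = {z7:.2f}, P = {p7:.4f}")
print(f"Example 8: z = {z8:.2f}, P = {p8:.4f}")
```

Under these assumptions the P-value for Example 7 falls just below 0.05, while the P-value for Example 8 is far above it, so the data in Example 8 do not support the claim that the proportion exceeds 75%.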
Small-Sample Tests for a Population Mean
- Uses the t-test rather than the z-test.
- If σ is known, use z, not t.
Example 9: Spacer collars for a transmission
countershaft have a thickness specification of 38.98-39.02 mm.
The process that manufactures the collars is
supposed to be calibrated so that the mean thickness
is 39.00 mm.
A sample of six collars is drawn and measured for
thickness. The six thicknesses are 39.030, 38.997,
39.012, 39.008, 39.019, and 39.002. Assume that the
population of thicknesses is approximately normal.
Can we conclude that the process needs recalibration?
Example 10: Before a substance can be deemed
safe for landfilling, its chemical properties
must be characterized. An article reports that
in a sample of six replicates of sludge from a
New Hampshire wastewater treatment plant,
the mean pH was 6.68 with a standard
deviation of 0.20. Can we conclude that the
mean pH is less than 7.0?
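Examples 9 and 10 can be worked with the usual one-sample t statistic, t = (X̄ − µ0) / (s/√n) with n − 1 degrees of freedom; the slides do not show this formula, so it is stated here as an assumption. The sketch below uses only the standard library. The helper `t_upper_tail` is a hypothetical name, and it finds the tail area by integrating the t density numerically with Simpson's rule instead of reading a t table.

```python
import math
import statistics

def t_upper_tail(t, df, steps=2000):
    """Area to the right of t (t >= 0) under Student's t curve with df degrees of freedom."""
    # Density constant, then Simpson's rule on [0, t]; the area to the right of 0 is 0.5.
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)
    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3

def t_statistic(xbar, s, n, mu0):
    """One-sample t statistic; it has n - 1 degrees of freedom."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Example 9: H0: mu = 39.00 versus H1: mu != 39.00 (two-sided)
collars = [39.030, 38.997, 39.012, 39.008, 39.019, 39.002]
t9 = t_statistic(statistics.mean(collars), statistics.stdev(collars), len(collars), 39.00)
p9 = 2 * t_upper_tail(abs(t9), len(collars) - 1)

# Example 10: H0: mu >= 7.0 versus H1: mu < 7.0 (left-tailed)
t10 = t_statistic(6.68, 0.20, 6, 7.0)
p10 = t_upper_tail(abs(t10), 5)   # t10 is negative: area to its left = area right of |t10|

print(f"Example 9:  t = {t9:.3f}, P = {p9:.4f}")
print(f"Example 10: t = {t10:.3f}, P = {p10:.4f}")
```

Under these assumptions, Example 9 gives a two-sided P-value between 0.05 and 0.10, and Example 10 gives a P-value between 0.005 and 0.01, matching what a t table with 5 degrees of freedom shows.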
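Tests for the Difference Between Two Means
For Large Samples (nX > 30 and nY > 30):
To test a null hypothesis of the form
H0: µX − µY ≤ ∆0,
H0: µX − µY ≥ ∆0, or
H0: µX − µY = ∆0:
Compute the z-score: $z = \dfrac{(\bar{X} - \bar{Y}) - \Delta_0}{\sqrt{\sigma_X^2/n_X + \sigma_Y^2/n_Y}}$
If σX and σY are unknown, they may be
approximated with sX and sY, respectively.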
Example 11:
An engineer claims that a new type of power
supply for home computers lasts longer than
the old type. Independent random samples of
75 each of the two types are chosen, and the
sample means and standard deviations of their
lifetimes are computed:
New: X̄ = 4387 h, s1 = 252 h
Old: X̄ = 4260 h, s2 = 231 h
Can you conclude that the mean lifetime of
new power supplies is greater than that of the
old power supplies?
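Example 11 can be worked as an upper-tailed two-sample z-test with ∆0 = 0, using the sample standard deviations in place of the unknown population values. The sketch below assumes the standard statistic z = ((X̄ − Ȳ) − ∆0) / sqrt(sX²/nX + sY²/nY); the function name `two_sample_z` is an illustrative choice.

```python
import math

def two_sample_z(xbar, s_x, n_x, ybar, s_y, n_y, delta0=0.0):
    """Large-sample z-score for H0: mu_X - mu_Y <= delta0."""
    standard_error = math.sqrt(s_x**2 / n_x + s_y**2 / n_y)
    return ((xbar - ybar) - delta0) / standard_error

# New power supplies (X) versus old power supplies (Y), 75 of each
z11 = two_sample_z(4387, 252, 75, 4260, 231, 75)
p11 = 0.5 * math.erfc(z11 / math.sqrt(2.0))   # H1: mu_X - mu_Y > 0, area to the right of z

print(f"z = {z11:.2f}, P = {p11:.4f}")
```

The P-value is well below 0.01, so under these assumptions the data support the claim that the new power supplies last longer.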
For Small Samples:
The samples should come from normal populations
with means µX and µY and standard
deviations σX and σY.
If σX and σY are not known to be equal, use the
following procedure in testing the null
hypothesis.
i. Compute the degrees of freedom, ν, rounded
down to the nearest integer:
$\nu = \dfrac{\left(s_X^2/n_X + s_Y^2/n_Y\right)^2}{\dfrac{(s_X^2/n_X)^2}{n_X - 1} + \dfrac{(s_Y^2/n_Y)^2}{n_Y - 1}}$
ii. Compute the test statistic, t:
$t = \dfrac{(\bar{X} - \bar{Y}) - \Delta_0}{\sqrt{s_X^2/n_X + s_Y^2/n_Y}}$
iii. Compute the P-value. The P-value is an area under the
Student's t curve with ν degrees of freedom, which depends
on the alternate hypothesis as follows:
Alternate Hypothesis    P-value
H1: µX − µY > ∆0        Area to the right of t
H1: µX − µY < ∆0        Area to the left of t
H1: µX − µY ≠ ∆0        Sum of the areas in the tails cut off by t and -t
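Steps i and ii can be sketched in a few lines. The formulas for ν and t did not survive in the slide text, so the code assumes the standard unequal-variance (Welch) forms: ν = (a + b)² / (a²/(nX − 1) + b²/(nY − 1)) with a = sX²/nX and b = sY²/nY, rounded down, and t = ((X̄ − Ȳ) − ∆0) / sqrt(a + b). The slides give no small-sample data set, so the summary statistics from Example 11 are reused purely to exercise the arithmetic; with samples that large the z-test would normally be used instead.

```python
import math

def welch_df_and_t(xbar, s_x, n_x, ybar, s_y, n_y, delta0=0.0):
    """Degrees of freedom (rounded down) and t statistic when variances may differ."""
    a = s_x**2 / n_x
    b = s_y**2 / n_y
    nu = math.floor((a + b) ** 2 / (a**2 / (n_x - 1) + b**2 / (n_y - 1)))
    t = ((xbar - ybar) - delta0) / math.sqrt(a + b)
    return nu, t

# Illustration only: summary statistics borrowed from Example 11
nu, t = welch_df_and_t(4387, 252, 75, 4260, 231, 75)
print(f"nu = {nu}, t = {t:.3f}")
```

The P-value in step iii would then be read from a t table (or computed numerically) using ν degrees of freedom.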
READING ASSIGNMENT:
Tests for the difference between two proportions
(pages 425-428)
Tests with paired data (pages 439-441)
Schedule
- March 16: Chi-Square Tests
- March 18: F Tests, Power
- March 23 (Wednesday): Third Long Exam, with cheat sheet
  - 4-6 PM
  - Conflict: 25 1 PM
- April 1 (Friday): Final Presentation
The Chi-Square Test
- Used when the data consist of nominal or ordinal variables.
Nominal variables:
Variables with no inherent order or ranking sequence,
- e.g. numbers used as names (group 1, group 2, ...), gender.
Ordinal variables:
Variables with an ordered series,
- e.g. "greatly dislike, moderately dislike, indifferent,
moderately like, greatly like".
***Numbers assigned to such variables indicate rank order only;
the "distance" between the numbers has no meaning.
Multinomial trial: an experiment that
can result in k outcomes, where k ≥ 2
- A generalization of the Bernoulli trial
- Example: roll of a fair die (6 outcomes)
The Chi-Square test has two main uses:
- Comparing the distribution of one categorical variable
(nominal or ordinal) with another.
- Comparing an observed distribution with a
theoretically expected one.
Comparing the distribution of one categorical variable
with another
Example:
Of 120 male and 100 female applicants to university, 90 male and 40 female had
work experience.
Does the gender of an applicant to university correspond to whether or not they
have prior work experience?
Comparing an observed distribution with a
theoretically expected one
Example:
In a population of mice, do the proportions differ from those
expected?
Examples
Example:
A gambler wants to test a die to see whether it is fair.
H0: The die is fair (p01 = … = p06 = 1/6).
He rolls the die 600 times and obtains the following
results:
Category   Observed   Expected
1          115        100
2           97        100
3           91        100
4          101        100
5          110        100
6           86        100
Total      600        600
- Expected value = mean number of trials that
would result in a specific outcome if H0 were true.
- Chi-square statistic: measures how close the
observed values are to the expected values:
$\chi^2 = \sum_{i=1}^{k} \dfrac{(O_i - E_i)^2}{E_i}$
- The larger χ² is, the stronger the evidence against H0.
- For k outcomes, there are k-1 degrees of freedom.
- The chi-square test provides a good approximation when
all the expected values are ≥ 5.
- The chi-square statistic for the example is 6.12.
P-value:
- Check that all expected values are ≥ 5.
- Look up the chi-square value in the table, using k-1 degrees
of freedom. The areas given across
the top are the areas to the right of the critical value.
- The P-value for the example is > 0.10. We therefore do not
reject H0.
Example 1:
Rivets are manufactured for a certain purpose. The
length specification is 1.20-1.30 cm. It is thought that
90% of the rivets manufactured meet the specification,
while 5% are too short, and 5% are too long.
In a random sample of 1000 rivets, 860 met the specs, 60
were too short, and 80 were too long. Can you conclude
that the true percentages differ from 90%, 5%, and 5%?
State the appropriate null hypothesis.
Compute the expected values under the null hypothesis.
Compute the value of the chi-square statistic.
Find the P-value. What do you conclude?
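The die example and Example 1 follow the same recipe, so both can be checked with one short script. This is a sketch under stated assumptions: the function names are illustrative, and instead of a table lookup the P-value uses the closed-form upper-tail area of the chi-square distribution for integer degrees of freedom (a finite sum, built on `math.erfc` when the degrees of freedom are odd).

```python
import math

def chi_square_gof(observed, probs):
    """Chi-square statistic and degrees of freedom for H0: outcome probabilities = probs."""
    n = sum(observed)
    expected = [n * p for p in probs]
    if min(expected) < 5:
        raise ValueError("all expected values should be at least 5")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

def chi2_upper_tail(x, df):
    """Area to the right of x under the chi-square curve with integer df."""
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.exp(-x / 2) * total
    chi = math.sqrt(x)
    term = chi * math.exp(-x / 2) * math.sqrt(2 / math.pi)
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return math.erfc(chi / math.sqrt(2)) + total

# Die example: 600 rolls, H0: every face has probability 1/6
die_stat, die_df = chi_square_gof([115, 97, 91, 101, 110, 86], [1 / 6] * 6)
die_p = chi2_upper_tail(die_stat, die_df)

# Example 1: rivets that meet the spec, are too short, are too long
rivet_stat, rivet_df = chi_square_gof([860, 60, 80], [0.90, 0.05, 0.05])
rivet_p = chi2_upper_tail(rivet_stat, rivet_df)

print(f"Die:    chi2 = {die_stat:.2f}, df = {die_df}, P = {die_p:.4f}")
print(f"Rivets: chi2 = {rivet_stat:.2f}, df = {rivet_df}, P = {rivet_p:.6f}")
```

The die statistic reproduces the 6.12 quoted in the slides with a P-value above 0.10. For the rivets the statistic is much larger and the P-value is far below 0.005, so H0 would be rejected.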
Chi-Square test for homogeneity
- Used when several multinomial experiments are conducted,
each with the same set of possible outcomes, to determine
whether the outcome probabilities are the same for all of them.
H0: The probabilities of the outcomes are the
same for each experiment.
Example:
Four machines manufacture cylindrical steel pins. The pins are
subject to a diameter specification. A pin may meet the
specification, or it may be too thin or too thick. Pins are sampled
for each machine, and the number of pins in each category is
counted. The results are shown in the contingency table:
            Too thin    OK    Too thick   Total
Machine 1      10       102       8        120
Machine 2      34       161       5        200
Machine 3      12        79       9        100
Machine 4      10        60      10         80
Total          66       402      32        500
Cell: each row-column intersection
Marginal totals: the row totals and the column totals
Notation for observed values (i → rows, j → columns)
H0: For each column j, p1j = … = pIj
Oi. = sum of observed values in row i
O.j = sum of observed values in column j
         Column 1   Column 2   …   Column J   Total
Row 1    O11        O12        …   O1J        O1.
Row 2    O21        O22        …   O2J        O2.
:        :          :          :   :          :
Row I    OI1        OI2        …   OIJ        OI.
Total    O.1        O.2        …   O.J        O..
Computing the expected value:
For cell ij,
$E_{ij} = \dfrac{O_{i.}\, O_{.j}}{O_{..}}$
$\chi^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$
Degrees of freedom = (I-1)(J-1)
Example 2: Given the table below, test the null
hypothesis that the proportions of pins that are too
thin, OK, or too thick are the same for all the
machines.
            Too thin    OK    Too thick   Total
Machine 1      10       102       8        120
Machine 2      34       161       5        200
Machine 3      12        79       9        100
Machine 4      10        60      10         80
Total          66       402      32        500
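Example 2 can be worked by applying the expected-value rule (row total times column total, divided by the grand total) to every cell of the pins table. The sketch below is one way to do that in plain Python; the function names are illustrative. The two critical values are the standard chi-square table entries for 6 degrees of freedom, used here to bracket the P-value.

```python
def expected_counts(observed):
    """Expected cell counts E_ij = (row total i) * (column total j) / (grand total)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_table(observed):
    """Chi-square statistic, degrees of freedom, and expected counts for a contingency table."""
    expected = expected_counts(observed)
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df, expected

# Rows: machines 1-4; columns: too thin, OK, too thick (marginal totals are recomputed)
pins = [[10, 102, 8], [34, 161, 5], [12, 79, 9], [10, 60, 10]]
stat, df, expected = chi_square_table(pins)

CRIT_05, CRIT_01 = 12.592, 16.812   # chi-square table, 6 df, upper-tail areas 0.05 and 0.01
print(f"chi2 = {stat:.2f}, df = {df}")
print("reject H0 at the 5% level:", stat > CRIT_05)
print("reject H0 at the 1% level:", stat > CRIT_01)
```

Under these assumptions the statistic falls between the two critical values, so the P-value lies between 0.01 and 0.05: H0 is rejected at the 5% level but not at the 1% level.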
Chi-square test for Independence
- In the previous example, the column totals were
random, while the row totals were fixed in
advance.
            Too thin    OK    Too thick   Total
Machine 1      10       102       8        120
Machine 2      34       161       5        200
Machine 3      12        79       9        100
Machine 4      10        60      10         80
Total          66       402      32        500
- When both the row and column totals are random,
the same procedure applies. The null hypothesis is then
that the row and column classifications are independent.
Chi-square test for Independence
A public opinion poll surveyed a simple random sample of
1000 voters. Respondents were classified by gender and by
voting preference. Results are shown in the contingency
table below.
Is there a gender gap? Do the men's voting preferences
differ significantly from the women's preferences?
Example 3:
Cylindrical steel pins are subject to a length and
diameter specification. With respect to length, a pin
may meet the specification, or it may be too long or
too short.
A total of 1021 pins are sampled and categorized
with respect to both length and diameter specification. The
results are presented in the table below.
Test the null hypothesis that the proportions of pins
that are too thin, OK, or too thick with respect to the diameter
specification do not depend on the classification
with respect to the length specification.
Observed values for 1021 steel pins
                        Diameter
Length       Too thin    OK    Too thick   Total
Too short       13       117       4        134
OK              62       664      80        806
Too long         5        68       8         81
Total           80       849      92       1021
Example 4:
For the given table of observed values,
- Construct the corresponding table of expected
values.
- If appropriate, perform the chi-square test for the
null hypothesis that the row and column outcomes
are independent. If not appropriate, explain why.
Observed Values
     1    2    3
A   15   10   12
B    3   11   11
C    9   14   12
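Examples 3 and 4 can be checked together. The sketch below builds the table of expected values, verifies the rule of thumb that every expected value is at least 5 (the "if appropriate" part of Example 4), and then computes the statistic. The function name `independence_test` is an illustrative choice, and 7.779 is the standard chi-square table value for 4 degrees of freedom with upper-tail area 0.10.

```python
def independence_test(observed, min_expected=5.0):
    """Expected counts, chi-square statistic, df, and whether the test is appropriate."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    # Rule of thumb from the slides: every expected value should be >= 5
    appropriate = min(min(row) for row in expected) >= min_expected
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, stat, df, appropriate

example3 = [[13, 117, 4], [62, 664, 80], [5, 68, 8]]   # rows: too short, OK, too long
example4 = [[15, 10, 12], [3, 11, 11], [9, 14, 12]]    # rows: A, B, C

CRIT_10_DF4 = 7.779   # chi-square table, 4 df, upper-tail area 0.10

for name, table in (("Example 3", example3), ("Example 4", example4)):
    expected, stat, df, appropriate = independence_test(table)
    print(f"{name}: appropriate = {appropriate}, chi2 = {stat:.2f}, df = {df}, "
          f"P > 0.10: {stat < CRIT_10_DF4}")
```

In both tables every expected value is at least 5, so the test is appropriate. Both statistics fall below 7.779, so under these assumptions the P-values exceed 0.10 and the hypothesis of independence is not rejected.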
End.