
Chapter 13-14

Analysis of Variance - ANOVA

Statistics for Managers Using Microsoft Excel, 4e 2004 Prentice-Hall, Inc.


Chapter Goals
After completing this chapter, you should be able to:

Recognize situations in which to use analysis of variance

Understand different analysis of variance designs

Perform a single-factor hypothesis test and interpret results

Conduct and interpret post-hoc multiple comparisons procedures

Analyze two-factor analysis of variance tests

Chapter Overview
Analysis of Variance (ANOVA)

One-Way ANOVA: F-test, Tukey-Kramer test

Two-Way ANOVA: Interaction Effects

13.1 General ANOVA Setting

Investigator controls one or more independent variables

Called factors (or treatment variables)

Each factor contains two or more levels (or groups or categories/classifications)

Observe effects on the dependent variable

Response to levels of independent variable

Experimental design: the plan used to collect the data

13.3 Completely Randomized Design

Experimental units (subjects) are assigned randomly to treatments

Subjects are assumed homogeneous

Only one factor or independent variable

With two or more treatment levels

Analyzed by one-factor analysis of variance (one-way ANOVA)

One-Way Analysis of Variance

Evaluate the difference among the means of three or more groups

Examples: Accident rates for 1st, 2nd, and 3rd shift
Expected mileage for five brands of tires

Assumptions
Populations are normally distributed
Populations have equal variances
Samples are randomly and independently drawn

Hypotheses of One-Way ANOVA

$$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$$

All population means are equal

i.e., no treatment effect (no variation in means among groups)

$H_1$: Not all of the population means are the same

At least one population mean is different

i.e., there is a treatment effect

Does not mean that all population means are different (some pairs may be the same)

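One-Factor ANOVA

$$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$$
$H_1$: Not all $\mu_i$ are the same

All means are the same:
The null hypothesis is true
(No treatment effect)

[Figure: population distributions with $\mu_1 = \mu_2 = \mu_3$]

One-Factor ANOVA
(continued)

$$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$$
$H_1$: Not all $\mu_i$ are the same

At least one mean is different:
The null hypothesis is NOT true
(Treatment effect is present)

[Figure: two alternative arrangements of the population distributions for $\mu_1$, $\mu_2$, $\mu_3$ in which at least one mean differs]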

Partitioning the Variation

Total variation can be split into two parts:

SST = SSA + SSW


SST = Total Sum of Squares
(Total variation)
SSA = Sum of Squares Among Groups
(Among-group variation)
SSW = Sum of Squares Within Groups
(Within-group variation)

Partitioning the Variation


(continued)

SST = SSA + SSW


Total Variation = the aggregate dispersion of the individual
data values across the various factor levels (SST)
Among-Group Variation = dispersion between the factor
sample means (SSA)
Within-Group Variation = dispersion that exists among the
data values within a particular factor level (SSW)
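
The partition can be checked numerically. The following is a minimal Python sketch and is not part of the original slides; the function name `partition_variation` and the three small groups are made up purely for illustration.

```python
from statistics import mean

def partition_variation(groups):
    """Return (SST, SSA, SSW) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Total variation: every observation against the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Among-group variation: each group mean against the grand mean,
    # weighted by the group size
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: each observation against its own group mean
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return sst, ssa, ssw

# Hypothetical data (not from the slides): three groups of unequal size
groups = [[4, 6, 8], [10, 12], [1, 3, 5, 7]]
sst, ssa, ssw = partition_variation(groups)
print(f"SST = {sst:.3f}, SSA + SSW = {ssa + ssw:.3f}")
```

The two printed numbers should agree up to floating-point rounding, whatever data are supplied, because the identity does not depend on equal group sizes.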

Partition of Total Variation


Total Variation (SST) = Variation Due to Factor (SSA) + Variation Due to Random Sampling (SSW)

Variation Due to Factor (SSA), commonly referred to as:

Sum of Squares Between
Sum of Squares Among
Sum of Squares Explained
Among Groups Variation

Variation Due to Random Sampling (SSW), commonly referred to as:

Sum of Squares Within
Sum of Squares Error
Sum of Squares Unexplained
Within Groups Variation

Total Sum of Squares


SST = SSA + SSW

$$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2$$

Where:

SST = Total sum of squares
c = number of groups (levels or treatments)
$n_j$ = number of observations in group j
$X_{ij}$ = ith observation from group j
$\bar{X}$ = grand mean (mean of all data values)

Total Variation
(continued)

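$$SST = (X_{11} - \bar{X})^2 + (X_{21} - \bar{X})^2 + \cdots + (X_{n_c c} - \bar{X})^2$$

[Figure: response X plotted for Group 1, Group 2, and Group 3, with the grand mean $\bar{X}$ marked]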

Among-Group Variation
SST = SSA + SSW

$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{X})^2$$

Where:

SSA = Sum of squares among groups
c = number of groups or populations
$n_j$ = sample size from group j
$\bar{X}_j$ = sample mean from group j
$\bar{X}$ = grand mean (mean of all data values)

Among-Group Variation
(continued)

$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{X})^2$$

Variation due to differences among groups

$$MSA = \frac{SSA}{c - 1}$$

Mean Square Among = SSA / degrees of freedom

Among-Group Variation
(continued)

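$$SSA = n_1 (\bar{X}_1 - \bar{X})^2 + n_2 (\bar{X}_2 - \bar{X})^2 + \cdots + n_c (\bar{X}_c - \bar{X})^2$$

[Figure: response X plotted for Group 1, Group 2, and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ marked]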

Within-Group Variation
SST = SSA + SSW

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Where:

SSW = Sum of squares within groups
c = number of groups
$n_j$ = sample size from group j
$\bar{X}_j$ = sample mean from group j
$X_{ij}$ = ith observation in group j
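
Why the three sums of squares satisfy SST = SSA + SSW can be seen with a short derivation. The sketch below is added as an aid and is not part of the original slides; it uses only the symbols defined above ($X_{ij}$, $\bar{X}_j$, $\bar{X}$, $n_j$, c).

```latex
\begin{align*}
SST &= \sum_{j=1}^{c}\sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2
     = \sum_{j=1}^{c}\sum_{i=1}^{n_j}
       \big[(X_{ij} - \bar{X}_j) + (\bar{X}_j - \bar{X})\big]^2 \\
    &= \underbrace{\sum_{j=1}^{c}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2}_{SSW}
     + \underbrace{\sum_{j=1}^{c} n_j (\bar{X}_j - \bar{X})^2}_{SSA}
     + 2\sum_{j=1}^{c} (\bar{X}_j - \bar{X})
       \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)
\end{align*}
```

The last term vanishes because the deviations within each group sum to zero, $\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j) = 0$, which leaves SST = SSW + SSA.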

Within-Group Variation
(continued)

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Summing the variation within each group and then adding over all groups

$$MSW = \frac{SSW}{n - c}$$

Mean Square Within = SSW / degrees of freedom

Within-Group Variation
(continued)

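$$SSW = (X_{11} - \bar{X}_1)^2 + (X_{21} - \bar{X}_1)^2 + \cdots + (X_{n_c c} - \bar{X}_c)^2$$

[Figure: response X plotted for Group 1, Group 2, and Group 3, with each observation compared against its own group mean $\bar{X}_1$, $\bar{X}_2$, or $\bar{X}_3$]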

Obtaining the Mean Squares


$$MSA = \frac{SSA}{c - 1}$$

$$MSW = \frac{SSW}{n - c}$$

$$MST = \frac{SST}{n - 1}$$

One-Way ANOVA Table


Source of Variation    SS                 df       MS (Variance)          F ratio
Among Groups           SSA                c - 1    MSA = SSA / (c - 1)    F = MSA / MSW
Within Groups          SSW                n - c    MSW = SSW / (n - c)
Total                  SST = SSA + SSW    n - 1

c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom

13.5 One-Factor ANOVA


F Test Statistic
$H_0: \mu_1 = \mu_2 = \cdots = \mu_c$
$H_1$: At least two population means are different

Test statistic

$$F = \frac{MSA}{MSW}$$

MSA is the mean square among groups (the among-group estimate of variance)

MSW is the mean square within groups (the within-group estimate of variance)

Degrees of freedom

$df_1 = c - 1$ (c = number of groups)

$df_2 = n - c$ (n = sum of sample sizes from all populations)

Interpreting One-Factor ANOVA


F Statistic

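The F statistic is the ratio of the among estimate of variance and the within estimate of variance

The ratio can never be negative, since both estimates are sums of squares divided by positive degrees of freedom

$df_1 = c - 1$ will typically be small
$df_2 = n - c$ will typically be large

Decision Rule:
Reject $H_0$ if $F > F_U$, otherwise do not reject $H_0$

[Figure: F distribution with an upper-tail rejection region of area $\alpha = .05$ to the right of $F_U$; "Do not reject $H_0$" to the left of $F_U$, "Reject $H_0$" to the right]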

One-Factor ANOVA
F Test Example
You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the .05 significance level, is there a difference in mean distance?

Club 1    Club 2    Club 3
254       234       200
263       218       222
241       235       197
237       227       206
251       216       204

One-Factor ANOVA Example: Scatter Diagram

[Figure: scatter diagram of Distance (vertical axis, 190 to 270) against Club, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{X}$ marked]

$\bar{X}_1 = 249.2$, $\bar{X}_2 = 226.0$, $\bar{X}_3 = 205.8$

$\bar{X} = 227.0$

One-Factor ANOVA Example Computations

$\bar{X}_1 = 249.2$    $n_1 = 5$
$\bar{X}_2 = 226.0$    $n_2 = 5$
$\bar{X}_3 = 205.8$    $n_3 = 5$
$\bar{X} = 227.0$      n = 15
c = 3

$$SSA = 5(249.2 - 227)^2 + 5(226 - 227)^2 + 5(205.8 - 227)^2 = 4716.4$$

$$SSW = (254 - 249.2)^2 + (263 - 249.2)^2 + \cdots + (204 - 205.8)^2 = 1119.6$$

MSA = 4716.4 / (3 - 1) = 2358.2

MSW = 1119.6 / (15 - 3) = 93.3

$$F = \frac{2358.2}{93.3} = 25.275$$
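
The arithmetic in this example can be reproduced directly from the fifteen measurements. The following Python sketch is an illustration added here, not part of the original slides; the variable names are arbitrary, and only the data values come from the example.

```python
from statistics import mean

# Distances from the golf club example
clubs = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

c = len(clubs)                                  # number of groups
n = sum(len(v) for v in clubs.values())         # total sample size
grand_mean = mean([x for v in clubs.values() for x in v])

# Among-group and within-group sums of squares
ssa = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in clubs.values())
ssw = sum((x - mean(v)) ** 2 for v in clubs.values() for x in v)

# Mean squares and the F statistic
msa = ssa / (c - 1)
msw = ssw / (n - c)
f_stat = msa / msw

print(f"SSA = {ssa:.1f}, SSW = {ssw:.1f}")
print(f"MSA = {msa:.1f}, MSW = {msw:.1f}, F = {f_stat:.3f}")
```

Run as written, the printed values should match the hand computation (SSA = 4716.4, SSW = 1119.6, F = 25.275) up to rounding.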

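One-Factor ANOVA Example Solution

$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: $\mu_i$ not all equal
$\alpha = .05$
$df_1 = 2$, $df_2 = 12$

Critical Value:
$F_U = 3.89$

Test Statistic:

$$F = \frac{MSA}{MSW} = \frac{2358.2}{93.3} = 25.275$$

Decision:
Reject $H_0$ at $\alpha = 0.05$

Conclusion:
There is evidence that at least one $\mu_i$ differs from the rest

[Figure: F distribution with the $\alpha = .05$ rejection region to the right of $F_U = 3.89$; the test statistic F = 25.275 falls in the rejection region]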
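
The critical value and the p-value for this example can be checked without a table, because $df_1 = 2$ is a special case in which the upper-tail probability of the F distribution has a closed form: $P(F > f) = (1 + 2f/df_2)^{-df_2/2}$. The Python sketch below relies on that identity, so it is a shortcut that applies only when $df_1 = 2$ and is not a general replacement for an F table. The function names are hypothetical.

```python
def f_upper_tail_df1_2(f, df2):
    """P(F > f) for an F distribution with df1 = 2 and df2 degrees of freedom."""
    return (1 + 2 * f / df2) ** (-df2 / 2)

def f_critical_df1_2(alpha, df2):
    """Upper-tail critical value F_U for df1 = 2 (inverse of the formula above)."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

alpha, df2 = 0.05, 12
f_stat = 2358.2 / 93.3          # MSA / MSW from the example

f_u = f_critical_df1_2(alpha, df2)
p_value = f_upper_tail_df1_2(f_stat, df2)

print(f"F_U = {f_u:.2f}, F = {f_stat:.3f}, p-value = {p_value:.2e}")
print("Reject H0" if f_stat > f_u else "Do not reject H0")
```

For other numerator degrees of freedom, a statistical table or a statistics library would be needed instead.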

ANOVA -- Single Factor: Excel Output

EXCEL: tools | data analysis | ANOVA: single factor

SUMMARY
Groups    Count    Sum     Average    Variance
Club 1    5        1246    249.2      108.2
Club 2    5        1130    226        77.5
Club 3    5        1029    205.8      94.2

ANOVA
Source of Variation    SS        df    MS        F         P-value     F crit
Between Groups         4716.4    2     2358.2    25.275    4.99E-05    3.89
Within Groups          1119.6    12    93.3
Total                  5836.0    14

13.6 Multiple Comparisons: Tukey-Kramer Procedure

Tells which population means are significantly different

e.g.: $\mu_1 = \mu_2 \neq \mu_3$

Done after rejection of equal means in ANOVA

Allows pair-wise comparisons

Compare absolute mean differences with critical range

[Figure: population distributions with $\mu_1 = \mu_2$ and a separate third mean]

Tukey-Kramer Critical Range

$$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2} \left( \frac{1}{n_j} + \frac{1}{n_{j'}} \right)}$$

where:
$Q_U$ = Value from Studentized Range Distribution with c and n - c degrees of freedom for the desired level of $\alpha$ (see appendix A.12 table)
MSW = Mean Square Within
$n_j$ and $n_{j'}$ = Sample sizes from groups j and j'

Tukey-Kramer Procedure: Example


1. Compute absolute mean differences:

$|\bar{X}_1 - \bar{X}_2| = |249.2 - 226.0| = 23.2$
$|\bar{X}_1 - \bar{X}_3| = |249.2 - 205.8| = 43.4$
$|\bar{X}_2 - \bar{X}_3| = |226.0 - 205.8| = 20.2$

2. Find the $Q_U$ value from the table in appendix E.9 with c = 3 and (n - c) = (15 - 3) = 12 degrees of freedom for the desired level of $\alpha$ ($\alpha$ = .05 used here):

$Q_U = 3.77$

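Tukey-Kramer Procedure: Example
(continued)

3. Compute Critical Range:

$$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2} \left( \frac{1}{n_j} + \frac{1}{n_{j'}} \right)} = 3.77 \sqrt{\frac{93.3}{2} \left( \frac{1}{5} + \frac{1}{5} \right)} = 16.285$$

4. Compare:

$|\bar{X}_1 - \bar{X}_2| = 23.2$
$|\bar{X}_1 - \bar{X}_3| = 43.4$
$|\bar{X}_2 - \bar{X}_3| = 20.2$

5. All of the absolute mean differences are greater than the critical range. Therefore there is a significant difference between each pair of means at the 5% level of significance.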
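
The pairwise comparison in steps 1 to 5 can be sketched in Python as follows. This is an illustration added here, not part of the original slides: the helper name `critical_range` and the dictionary layout are assumptions, and $Q_U$ = 3.77 is taken from the table lookup in step 2 rather than computed.

```python
from itertools import combinations
from math import sqrt

# Summary values from the golf club example
means = {"Club 1": 249.2, "Club 2": 226.0, "Club 3": 205.8}
sizes = {"Club 1": 5, "Club 2": 5, "Club 3": 5}
msw = 93.3
q_u = 3.77   # studentized range table value for c = 3, n - c = 12, alpha = .05

def critical_range(q_u, msw, n_j, n_j_prime):
    """Tukey-Kramer critical range for one pair of groups."""
    return q_u * sqrt((msw / 2) * (1 / n_j + 1 / n_j_prime))

results = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    cr = critical_range(q_u, msw, sizes[a], sizes[b])
    results[(a, b)] = (diff, cr, diff > cr)
    print(f"{a} vs {b}: |diff| = {diff:.1f}, "
          f"critical range = {cr:.3f}, significant = {diff > cr}")
```

Computing the critical range per pair keeps the sketch usable when group sizes differ, which is the case the Tukey-Kramer form of the formula is meant to handle.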

Duncan's Test

Least Significant Range:

$$R_p = r_p \sqrt{\frac{MSW}{2} \left( \frac{1}{n_j} + \frac{1}{n_{j'}} \right)}$$

where:
$r_p$ = Least significant studentized range value with n - c degrees of freedom for the desired level of $\alpha$ (see appendix A.13 table)
p = 2, 3, ..., k; the range of any subset of sample means

Tukey-Kramer in PHStat

14. Two-Way ANOVA

Examines the effect of

Two factors of interest on the dependent variable

e.g., Percent carbonation and line speed on soft drink bottling process

Interaction between the different levels of these two factors

e.g., Does the effect of one particular carbonation level depend on which level the line speed is set?

Two-Way ANOVA
(continued)

Assumptions

Populations are normally distributed

Populations have equal variances

Independent random samples are drawn

14.3 Two-Way ANOVA


Sources of Variation
Two Factors of interest: A and B
r = number of levels of factor A
c = number of levels of factor B
n' = number of replications for each cell
n = total number of observations in all cells
(n = rcn')
$X_{ijk}$ = value of the kth observation of level i of factor A and level j of factor B

Two-Way ANOVA
Sources of Variation
(continued)

SST = SSA + SSB + SSAB + SSE

Source                                                  Degrees of Freedom
SSA     Factor A Variation                              r - 1
SSB     Factor B Variation                              c - 1
SSAB    Variation due to interaction between A and B    (r - 1)(c - 1)
SSE     Random variation (Error)                        rc(n' - 1)
SST     Total Variation                                 n - 1

Two Factor ANOVA Equations


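Total Variation:

$$SST = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} (X_{ijk} - \bar{X})^2$$

Factor A Variation:

$$SSA = cn' \sum_{i=1}^{r} (\bar{X}_{i..} - \bar{X})^2$$

Factor B Variation:

$$SSB = rn' \sum_{j=1}^{c} (\bar{X}_{.j.} - \bar{X})^2$$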

Two Factor ANOVA Equations


(continued)

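Interaction Variation:

$$SSAB = n' \sum_{i=1}^{r} \sum_{j=1}^{c} (\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{X})^2$$

Sum of Squares Error:

$$SSE = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} (X_{ijk} - \bar{X}_{ij.})^2$$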

Two Factor ANOVA Equations


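(continued)

where:

$$\bar{X} = \frac{\sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} X_{ijk}}{rcn'} = \text{Grand Mean}$$

$$\bar{X}_{i..} = \frac{\sum_{j=1}^{c} \sum_{k=1}^{n'} X_{ijk}}{cn'} = \text{Mean of ith level of factor A } (i = 1, 2, \ldots, r)$$

$$\bar{X}_{.j.} = \frac{\sum_{i=1}^{r} \sum_{k=1}^{n'} X_{ijk}}{rn'} = \text{Mean of jth level of factor B } (j = 1, 2, \ldots, c)$$

$$\bar{X}_{ij.} = \sum_{k=1}^{n'} \frac{X_{ijk}}{n'} = \text{Mean of cell } ij$$

r = number of levels of factor A
c = number of levels of factor B
n' = number of replications in each cell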

Mean Square Calculations


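$$MSA = \text{Mean square factor A} = \frac{SSA}{r - 1}$$

$$MSB = \text{Mean square factor B} = \frac{SSB}{c - 1}$$

$$MSAB = \text{Mean square interaction} = \frac{SSAB}{(r - 1)(c - 1)}$$

$$MSE = \text{Mean square error} = \frac{SSE}{rc(n' - 1)}$$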
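
The two-factor sums of squares and mean squares can be traced on a small balanced layout. The Python sketch below is an illustration added here, not part of the original slides: the 2 x 2 data set with two replications per cell is hypothetical, and `n_rep` stands in for the number of replications per cell (n').

```python
from statistics import mean

# Hypothetical balanced data (not from the slides): data[i][j] holds the
# replications for level i of factor A and level j of factor B.
data = [
    [[10, 12], [14, 16]],
    [[20, 22], [18, 16]],
]
r, c, n_rep = len(data), len(data[0]), len(data[0][0])

grand = mean([x for row in data for cell in row for x in cell])
a_means = [mean([x for cell in row for x in cell]) for row in data]
b_means = [mean([x for row in data for x in row[j]]) for j in range(c)]
cell_means = [[mean(cell) for cell in row] for row in data]

# Sums of squares, following the equations term by term
sst = sum((x - grand) ** 2 for row in data for cell in row for x in cell)
ssa = c * n_rep * sum((m - grand) ** 2 for m in a_means)
ssb = r * n_rep * sum((m - grand) ** 2 for m in b_means)
ssab = n_rep * sum(
    (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
    for i in range(r) for j in range(c)
)
sse = sum((x - cell_means[i][j]) ** 2
          for i in range(r) for j in range(c) for x in data[i][j])

# Mean squares and F statistics (each uses MSE as the denominator)
msa = ssa / (r - 1)
msb = ssb / (c - 1)
msab = ssab / ((r - 1) * (c - 1))
mse = sse / (r * c * (n_rep - 1))

print(f"F_A = {msa / mse:.3f}, F_B = {msb / mse:.3f}, F_AB = {msab / mse:.3f}")
print(f"SST = {sst:.3f}, SSA + SSB + SSAB + SSE = {ssa + ssb + ssab + sse:.3f}")
```

The sketch assumes every cell has the same number of replications, greater than one; with a single observation per cell the error degrees of freedom rc(n' - 1) would be zero and MSE undefined.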

Two-Way ANOVA:
The F Test Statistic
F Test for Factor A Effect

$H_0: \mu_{1..} = \mu_{2..} = \mu_{3..} = \cdots$
$H_1$: Not all $\mu_{i..}$ are equal

$$F = \frac{MSA}{MSE}$$

Reject $H_0$ if $F > F_U$

F Test for Factor B Effect

$H_0: \mu_{.1.} = \mu_{.2.} = \mu_{.3.} = \cdots$
$H_1$: Not all $\mu_{.j.}$ are equal

$$F = \frac{MSB}{MSE}$$

Reject $H_0$ if $F > F_U$

F Test for Interaction Effect

$H_0$: the interaction of A and B is equal to zero
$H_1$: interaction of A and B is not zero

$$F = \frac{MSAB}{MSE}$$

Reject $H_0$ if $F > F_U$

Two-Way ANOVA
Summary Table
Source of Variation    Sum of Squares    Degrees of Freedom    Mean Squares                       F Statistic
Factor A               SSA               r - 1                 MSA = SSA / (r - 1)                MSA / MSE
Factor B               SSB               c - 1                 MSB = SSB / (c - 1)                MSB / MSE
AB (Interaction)       SSAB              (r - 1)(c - 1)        MSAB = SSAB / [(r - 1)(c - 1)]     MSAB / MSE
Error                  SSE               rc(n' - 1)            MSE = SSE / [rc(n' - 1)]
Total                  SST               n - 1

Features of Two-Way ANOVA F Test

Degrees of freedom always add up

n - 1 = rc(n' - 1) + (r - 1) + (c - 1) + (r - 1)(c - 1)

Total = error + factor A + factor B + interaction

The denominator of the F Test is always the same but the numerator is different

The sums of squares always add up

SST = SSE + SSA + SSB + SSAB

Total = error + factor A + factor B + interaction
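
That the degrees of freedom add up can be confirmed by expanding the right-hand side. The short derivation below is added as a check and is not part of the original slides; it assumes n = rcn', with n' the number of replications per cell.

```latex
\begin{align*}
rc(n' - 1) + (r - 1) + (c - 1) + (r - 1)(c - 1)
  &= rcn' - rc + r - 1 + c - 1 + (rc - r - c + 1) \\
  &= rcn' - 1 \\
  &= n - 1
\end{align*}
```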

Examples: Interaction vs. No Interaction

[Figure: two panels, "No interaction" and "Interaction is present". Each plots Mean Response against Factor A Levels, with one line for each of Factor B Level 1, Level 2, and Level 3.]

Chapter Summary

Described one-way analysis of variance

The logic of ANOVA

ANOVA assumptions

F test for difference in c means

The Tukey-Kramer procedure for multiple comparisons

Described two-way analysis of variance

Examined effects of multiple factors

Examined interaction between factors
