ANOVA (ANAVAR)

Analysis of Variance (Analyse de la VARiance)


Continuous/Variable Response (Y) Data
Normally Distributed
Equal Variances

ANOVA Objectives
At the end of this module, we will be able to...

1. Graph the Main Effects Plots.


2. Understand basic construction of the ANOVA table.
3. Determine if main effects are significant
using the P value from the ANOVA table.

ANOVA
What does ANOVA mean?

ANalysis Of VAriance.

What is an ANOVA?
Analysis of Variance is a tool used to detect whether a statistically
significant difference exists between the means of several factor levels, and
whether that difference is attributable to chance or to a specific cause.
The tool compares the variation within the factor levels with the variation
between the factor levels. If the variation between factor levels is
sufficiently greater than the variation within the factor levels, then the
factor is said to be significant.
Assumptions:
Data comes from normal populations having equal variances.

ANOVA - Theory and Manual Calculations


A financial services company has four different sites that process credit cases. The
table below contains productivity data on the average number of cases worked per
hour for a sample of employees at each of the four sites.
Site 1: 14.9, 15.7, 15.2, 15.8, 15.1, 16.3, 14.4, 15.9
Site 2: 15.7, 16.6, 16.5, 16.0, 15.7, 16.4, 16.7, 16.8, 16.3, 16.5
Site 3: 17.3, 17.2, 17.4, 17.2, 17.0, 17.6, 17.4, 17.3, 16.5, 16.7
Site 4: 15.2, 14.8, 14.3, 14.9, 15.4, 14.9, 14.6, 15.1, 15.0, 14.7

What can you tell about the data?


Is there a difference between Sites?
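
A first look at the data can be sketched in Python using only the standard library. The dictionary name `sites` and the helper `summarize` are illustrative choices, not part of the original material.

```python
from statistics import mean, stdev

# Cases worked per hour, copied from the table above.
sites = {
    "Site 1": [14.9, 15.7, 15.2, 15.8, 15.1, 16.3, 14.4, 15.9],
    "Site 2": [15.7, 16.6, 16.5, 16.0, 15.7, 16.4, 16.7, 16.8, 16.3, 16.5],
    "Site 3": [17.3, 17.2, 17.4, 17.2, 17.0, 17.6, 17.4, 17.3, 16.5, 16.7],
    "Site 4": [15.2, 14.8, 14.3, 14.9, 15.4, 14.9, 14.6, 15.1, 15.0, 14.7],
}


def summarize(groups):
    """Return {name: (n, mean, sample standard deviation)} for each group."""
    return {name: (len(v), mean(v), stdev(v)) for name, v in groups.items()}


summary = summarize(sites)
for name, (n, avg, sd) in summary.items():
    print(f"{name}: n={n}, mean={avg:.3f}, stdev={sd:.3f}")
```

The summary shows that the sites differ in sample size (8 versus 10) and in average productivity, but it does not say whether the differences are statistically significant.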

ANOVA - Evaluation of Site Productivity


[Figure: Main Effects Plot for Cases / Hour. x-axis: Site; y-axis: Cases / Hour]

It appears that productivity in cases per hour is highest at Site 3.
How can we find out for sure?

Main Effects Plots are Not Statistical Tests!

ANOVA - Minitab Output for our Example


One-way Analysis of Variance
Analysis of Variance for Cases Per Hour

Source   DF       SS      MS       F      P
Site      3   29.536   9.845   55.81  0.000
Error    34    5.998   0.176
Total    37   35.534

Source: Sources of Variability
DF: Amount of Information (Degrees of Freedom)
SS: Quantitative Measure of Variability Explained by Each Source
MS: Estimate of Variances
F: The statistical measure used to determine if a factor is significant
P: Type I Error (p-value)
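
The F and P columns of this table can be cross-checked outside Minitab. The sketch below assumes SciPy is available and uses `scipy.stats.f_oneway`; the variable names are illustrative.

```python
from scipy.stats import f_oneway

# Cases worked per hour for each site, from the example data.
site1 = [14.9, 15.7, 15.2, 15.8, 15.1, 16.3, 14.4, 15.9]
site2 = [15.7, 16.6, 16.5, 16.0, 15.7, 16.4, 16.7, 16.8, 16.3, 16.5]
site3 = [17.3, 17.2, 17.4, 17.2, 17.0, 17.6, 17.4, 17.3, 16.5, 16.7]
site4 = [15.2, 14.8, 14.3, 14.9, 15.4, 14.9, 14.6, 15.1, 15.0, 14.7]

# One-way ANOVA: returns the F statistic and its p-value.
result = f_oneway(site1, site2, site3, site4)
print(f"F = {result.statistic:.2f}, P = {result.pvalue:.3f}")
```

`f_oneway` reports only F and the p-value; the DF, SS, and MS columns have to be computed separately.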

ANOVA - Normality Assumption


[Figure: Normal Probability Plot. x-axis: Site 4; y-axis: Probability]

Site 4
Average: 14.89
StDev: 0.314289
N: 10

Anderson-Darling Normality Test
A-Squared: 0.125
P-Value: 0.977

The P-value must be > 0.05 in order to fail to reject Ho,
and conclude that the population is normally distributed.

For ANOVA Results to be Valid, Populations Must Be Normally Distributed
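
The Anderson-Darling statistic for Site 4 can be reproduced with a short standard-library sketch. The function name `anderson_darling_normal` is hypothetical. The p-value uses a commonly cited piecewise approximation (usually attributed to D'Agostino and Stephens), and treating it as the formula Minitab applies is an assumption.

```python
from math import exp, log
from statistics import NormalDist, mean, stdev

site4 = [15.2, 14.8, 14.3, 14.9, 15.4, 14.9, 14.6, 15.1, 15.0, 14.7]


def anderson_darling_normal(data):
    """Return (A-squared, approximate p-value) for a test of normality."""
    n = len(data)
    # Normal distribution fitted with the sample mean and standard deviation.
    fitted = NormalDist(mean(data), stdev(data))
    cdf = [fitted.cdf(x) for x in sorted(data)]
    total = sum(
        (2 * i + 1) * (log(cdf[i]) + log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    # Small-sample adjustment, then a piecewise p-value approximation
    # (assumed here; not taken from the original slides).
    a2_adj = a2 * (1 + 0.75 / n + 2.25 / n**2)
    if a2_adj >= 0.6:
        p = exp(1.2937 - 5.709 * a2_adj + 0.0186 * a2_adj**2)
    elif a2_adj >= 0.34:
        p = exp(0.9177 - 4.279 * a2_adj - 1.38 * a2_adj**2)
    elif a2_adj > 0.2:
        p = 1 - exp(-8.318 + 42.796 * a2_adj - 59.938 * a2_adj**2)
    else:
        p = 1 - exp(-13.436 + 101.14 * a2_adj - 223.73 * a2_adj**2)
    return a2, p


a_squared, p_value = anderson_darling_normal(site4)
print(f"A-Squared: {a_squared:.3f}, P-Value: {p_value:.3f}")
```

The same check would be repeated for Sites 1, 2, and 3, since the assumption applies to every population being compared.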

ANOVA - Equal Variance Assumption

Homogeneity of Variance Test for Cases / Hour

[Figure: 95% Confidence Intervals for Sigmas, one interval per factor level (Site)]

Bartlett's Test (Normal Data)
Ho: σ1² = σ2² = σ3² = σ4²
Ha: at least one is different
Test Statistic: 4.646
P-Value: 0.200
The P-value must be > 0.05 in order to fail to reject Ho.

Levene's Test (Non-Normal Data)
Ho: σ1² = σ2² = σ3² = σ4²
Ha: at least one is different
Test Statistic: 2.326
P-Value: 0.092
The P-value must be > 0.05 in order to fail to reject Ho.

For ANOVA Results to be Valid, Variances Must Be Equal
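
Both equal-variance tests can be cross-checked with SciPy, assuming it is installed. The sketch uses `scipy.stats.bartlett` and `scipy.stats.levene`. The choice `center="median"` is an assumption about which Levene variant the Minitab output corresponds to.

```python
from scipy.stats import bartlett, levene

site1 = [14.9, 15.7, 15.2, 15.8, 15.1, 16.3, 14.4, 15.9]
site2 = [15.7, 16.6, 16.5, 16.0, 15.7, 16.4, 16.7, 16.8, 16.3, 16.5]
site3 = [17.3, 17.2, 17.4, 17.2, 17.0, 17.6, 17.4, 17.3, 16.5, 16.7]
site4 = [15.2, 14.8, 14.3, 14.9, 15.4, 14.9, 14.6, 15.1, 15.0, 14.7]

# Bartlett's test: appropriate when the data are normally distributed.
b_stat, b_p = bartlett(site1, site2, site3, site4)

# Levene's test (median-centered): more robust to non-normal data.
l_stat, l_p = levene(site1, site2, site3, site4, center="median")

print(f"Bartlett: statistic = {b_stat:.3f}, P = {b_p:.3f}")
print(f"Levene:   statistic = {l_stat:.3f}, P = {l_p:.3f}")
```

A P-value above 0.05 in the applicable test means Ho (equal variances) is not rejected.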

ANOVA - Theory and Manual Calculations

[Figure: Scatter Plot of Area Data. x-axis: Site; y-axis: Cases Per Hour; overall mean marked at 15.97]

Site means:
Site 1: 15.41
Site 2: 16.32
Site 3: 17.16
Site 4: 14.89

Is the variation within a site less than the variation between sites?

Step 1: Develop the Hypothesis Statements


Ho: μ(Site 1) = μ(Site 2) = μ(Site 3) = μ(Site 4)
Ha: at least one mean for a site is different
Step 2: Determine a planning value for α, typically 0.05.
Step 3: Calculate the estimates for variability explained by each
source.

Variations Source:
  Variations Due to Factor Levels (Between Group; Process)
+ Variations Due to Experimental Noise (Within Group; Technology)

Step 3 (continued): Calculate the estimates of variability explained by each source.

Variations = Variations Due to Factor Levels + Variations Due to Experimental Noise

SS Total = SS Factor Level + SS Error

$$
\sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{\bar{y}})^2
= \sum_{i=1}^{k} n_i (\bar{y}_i - \bar{\bar{y}})^2
+ \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2
$$

Where:
$\bar{\bar{y}}$ = the overall mean
$\bar{y}_i$ = the mean of the ith sample
$y_{ij}$ = the jth observation in the ith sample
$n_i$ = the number of observations in the ith group (i.e., 8 or 10)
$k$ = the number of factor levels (i.e., 4)

Step 3 as applied to our example:


The overall mean ($\bar{\bar{y}}$) of the 38 observations is 15.974.

The four site means ($\bar{y}_i$) of
$\bar{y}_1$ = 15.413
$\bar{y}_2$ = 16.320
$\bar{y}_3$ = 17.160
$\bar{y}_4$ = 14.890
result in the factor level effects ($\bar{y}_i - \bar{\bar{y}}$):
$\bar{y}_1 - \bar{\bar{y}}$ = 15.413 - 15.974 = -0.561
$\bar{y}_2 - \bar{\bar{y}}$ = 16.320 - 15.974 = 0.346
$\bar{y}_3 - \bar{\bar{y}}$ = 17.160 - 15.974 = 1.186
$\bar{y}_4 - \bar{\bar{y}}$ = 14.890 - 15.974 = -1.084

The error terms, or residuals ($y_{ij} - \bar{y}_i$), are computed as follows:
$y_{11} - \bar{y}_1$ = 14.9 - 15.413 = -0.513

Overall Mean: 15.974

Factor Level Effects
Site 1: -0.561
Site 2: 0.346
Site 3: 1.186
Site 4: -1.084

Error Terms or Residuals
Site 1: -0.513, 0.287, -0.213, 0.388, -0.313, 0.888, -1.012, 0.488
Site 2: -0.62, 0.28, 0.18, -0.32, -0.62, 0.08, 0.38, 0.48, -0.02, 0.18
Site 3: 0.14, 0.04, 0.24, 0.04, -0.16, 0.44, 0.24, 0.14, -0.66, -0.46
Site 4: 0.31, -0.09, -0.59, 0.01, 0.51, 0.01, -0.29, 0.21, 0.11, -0.19
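
The overall mean, factor level effects, and residuals can be recomputed from the raw data. This is a minimal standard-library sketch, and the names `sites`, `effects`, and `residuals` are illustrative.

```python
from statistics import mean

sites = {
    "Site 1": [14.9, 15.7, 15.2, 15.8, 15.1, 16.3, 14.4, 15.9],
    "Site 2": [15.7, 16.6, 16.5, 16.0, 15.7, 16.4, 16.7, 16.8, 16.3, 16.5],
    "Site 3": [17.3, 17.2, 17.4, 17.2, 17.0, 17.6, 17.4, 17.3, 16.5, 16.7],
    "Site 4": [15.2, 14.8, 14.3, 14.9, 15.4, 14.9, 14.6, 15.1, 15.0, 14.7],
}

# Overall mean of all 38 observations.
all_values = [x for values in sites.values() for x in values]
grand_mean = mean(all_values)

# Factor level effect: site mean minus overall mean.
effects = {name: mean(values) - grand_mean for name, values in sites.items()}

# Residual: observation minus the mean of its own site.
residuals = {
    name: [x - mean(values) for x in values] for name, values in sites.items()
}

print(f"Overall mean: {grand_mean:.3f}")
for name in sites:
    print(f"{name}: effect = {effects[name]:.3f}")
```

Within each site the residuals sum to zero (up to floating-point error), which is a quick sanity check on the calculation.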

Step 3 (continued): Calculate the estimates of variability explained by each source.

Variations = Variations Due to Factor Levels + Variations Due to Experimental Noise

SS Total = SS Factor Level + SS Error

$$
\sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{\bar{y}})^2
= \sum_{i=1}^{k} n_i (\bar{y}_i - \bar{\bar{y}})^2
+ \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2
$$

35.534 = 29.536 + 5.998
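
The three sums of squares can be verified directly from the data. The sketch below uses only the standard library, and the function name `sums_of_squares` is hypothetical.

```python
from statistics import mean

sites = [
    [14.9, 15.7, 15.2, 15.8, 15.1, 16.3, 14.4, 15.9],              # Site 1
    [15.7, 16.6, 16.5, 16.0, 15.7, 16.4, 16.7, 16.8, 16.3, 16.5],  # Site 2
    [17.3, 17.2, 17.4, 17.2, 17.0, 17.6, 17.4, 17.3, 16.5, 16.7],  # Site 3
    [15.2, 14.8, 14.3, 14.9, 15.4, 14.9, 14.6, 15.1, 15.0, 14.7],  # Site 4
]


def sums_of_squares(groups):
    """Return (SS Total, SS Factor Level, SS Error) for a one-way layout."""
    grand_mean = mean([x for g in groups for x in g])
    ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)
    ss_factor = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return ss_total, ss_factor, ss_error


ss_total, ss_factor, ss_error = sums_of_squares(sites)
print(f"SS Total = {ss_total:.3f}")
print(f"SS Factor Level = {ss_factor:.3f}")
print(f"SS Error = {ss_error:.3f}")
```

Each group is weighted by its own size, which matters here because Site 1 has 8 observations and the other sites have 10.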


Step 4: Calculate the estimates of variance.
When the Sums of Squares are divided by the appropriate number of
degrees of freedom, the Mean Squares give a good estimate of the
variability.

MS Factor Level = SS Factor Level / (k - 1) = 29.536 / (4 - 1) = 9.845

MS Error = SS Error / (N - k) = 5.998 / (38 - 4) = 0.176

Here N = 38 is the total number of observations and k = 4 is the number of factor levels.
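
As a small arithmetic check, the mean squares follow from the sums of squares and their degrees of freedom. The helper `mean_square` and the variable names are illustrative.

```python
def mean_square(sum_of_squares, degrees_of_freedom):
    """Divide a sum of squares by its degrees of freedom."""
    return sum_of_squares / degrees_of_freedom


k = 4          # number of factor levels (sites)
total_n = 38   # total number of observations across all sites

ms_factor = mean_square(29.536, k - 1)        # SS Factor Level / (k - 1)
ms_error = mean_square(5.998, total_n - k)    # SS Error / (total_n - k)

print(f"MS Factor Level = {ms_factor:.3f}")
print(f"MS Error = {ms_error:.3f}")
```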


Step 5: Calculate the F-test Statistic.
Fo = MS Factor Level / MS Error = 9.845 / 0.176 = 55.81
(55.81 is obtained from the unrounded mean squares; dividing the rounded values shown gives 55.94.)

Fo is the ratio Variation (factor) / Variation (error).

If the ratio is small, then error may be playing as big a role as the factor.
Therefore, the factor cannot be proven to be significantly responsible for the
differences in successes. Fail to reject Ho.

If the ratio is large, then the factor is playing a significant role
in the differences in the successes. Can reject Ho.

The F ratio is used to determine the P-value!
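
The F ratio can be computed from the raw data without intermediate rounding. The sketch below uses only the standard library; `f_statistic` is a hypothetical helper name.

```python
from statistics import mean

sites = [
    [14.9, 15.7, 15.2, 15.8, 15.1, 16.3, 14.4, 15.9],              # Site 1
    [15.7, 16.6, 16.5, 16.0, 15.7, 16.4, 16.7, 16.8, 16.3, 16.5],  # Site 2
    [17.3, 17.2, 17.4, 17.2, 17.0, 17.6, 17.4, 17.3, 16.5, 16.7],  # Site 3
    [15.2, 14.8, 14.3, 14.9, 15.4, 14.9, 14.6, 15.1, 15.0, 14.7],  # Site 4
]


def f_statistic(groups):
    """Return MS Factor Level / MS Error for a one-way layout."""
    k = len(groups)
    total_n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ss_factor = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_factor / (k - 1)) / (ss_error / (total_n - k))


f_exact = f_statistic(sites)
f_from_rounded = 9.845 / 0.176  # ratio of the rounded mean squares

print(f"F from raw data: {f_exact:.2f}")
print(f"F from rounded mean squares: {f_from_rounded:.2f}")
```

The two results differ slightly because rounding the mean squares to three decimals before dividing shifts the ratio. The value reported by Minitab corresponds to the unrounded calculation.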


Step 6: Evaluate the P-value.
If the P-value is < 0.05, then reject the Ho and conclude that
at least one of the means is different.
In our example, Minitab provides us with a P-value = 0.000 (that is, below 0.0005 when shown to three decimal places).
Therefore, the Ho can be rejected and we can conclude that the
average cases worked per hour in at least one of the Sites is different.
In other words, the variation between sites is greater than the
variation within each site.
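
The decision rule in Step 6 can be sketched as follows, assuming SciPy is available. The P-value is the upper-tail area of the F distribution with 3 and 34 degrees of freedom, obtained here with `scipy.stats.f.sf`. The names `ALPHA` and `reject_h0` are illustrative.

```python
from scipy.stats import f

ALPHA = 0.05            # planning value for the Type I error risk
f_observed = 55.81      # F statistic from the ANOVA table
df_factor, df_error = 3, 34

# Probability of an F at least this large if Ho were true.
p_value = f.sf(f_observed, df_factor, df_error)
reject_h0 = p_value < ALPHA

print(f"P-value = {p_value:.3g}")
print("Reject Ho" if reject_h0 else "Fail to reject Ho")
```

The printed P-value is far smaller than 0.0005, which is why Minitab displays it as 0.000.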

Which Site is the Best? Which Site is Different?
