Unbalanced ANOVA designs
1. Why is the design unbalanced?
2. What happens with unbalanced designs?
3. An introduction to the problem
4. Types of sums of squares
5. An example
Random factors
o The unequal cell sizes are randomly unequal
o The process leading to the missingness is independent of the levels of the
independent variable
Scheduling problems
Computer errors
                      IV 1
IV B       Level 1     Level 2     Level 3     Total
Level 1    n11 = 15    n21 = 10    n31 = 20    45
Level 2    n12 = 20    n22 = 20    n32 = 15    55
Total      35          30          35          100

                      IV 1
IV B       Level 1     Level 2     Level 3     Total
Level 1    n11 = 4     n21 = 7     n31 = 3     14
Level 2    n12 = 4     n22 = 3     n32 = 6     13
Level 3    n13 = 5     n23 = 4     n33 = 5     14
Total      13          14          14          41
Systematic factors
o The unequal cell sizes are directly or indirectly related to the levels of the
independent variables
A treatment is painful/ineffective
High prejudice individuals refuse to answer questions regarding
attitudes toward ethnic groups
                      IV 1
IV B       Level 1     Level 2     Level 3     Total
Level 1    n11 = 40    n21 = 40    n31 = 50    130
Level 2    n12 = 20    n22 = 20    n32 = 30    70
Total      60          60          80          200

                      IV 1
IV B       Level 1     Level 2     Level 3     Total
Level 1    n11 = 3     n21 = 6     n31 = 9     18
Level 2    n12 = 2     n22 = 6     n32 = 9     17
Level 3    n13 = 4     n23 = 8     n33 = 13    25
Total      9           20          31          60
All of the methods we discuss for analyzing unbalanced designs assume the
cell sizes are either a result of:
o Random factors
o Real differences in the population
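Two contrasts
$\psi_1 = (a_1, a_2, a_3, \ldots, a_a)$
$\psi_2 = (b_1, b_2, b_3, \ldots, b_a)$
are orthogonal if
$\sum_{i=1}^{a} \frac{a_i b_i}{n_i} = 0$ or $\frac{a_1 b_1}{n_1} + \frac{a_2 b_2}{n_2} + \cdots + \frac{a_a b_a}{n_a} = 0$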
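The orthogonality condition can be checked numerically. The following is a minimal Python sketch (Python is used here for illustration; it is not part of the original SPSS material). The helper name `weighted_dot`, the cell ordering, and both sets of cell sizes are assumptions chosen only to show that two main-effect contrasts that are orthogonal with equal n stop being orthogonal with unequal n.

```python
from fractions import Fraction

def weighted_dot(c1, c2, n):
    """Return sum_i c1[i] * c2[i] / n[i]; zero means the contrasts are orthogonal."""
    return sum(Fraction(a * b, size) for a, b, size in zip(c1, c2, n))

# Hypothetical 2 x 2 design; cells ordered (A1B1, A1B2, A2B1, A2B2).
main_a = [-1, -1, 1, 1]   # main effect of factor A
main_b = [1, -1, 1, -1]   # main effect of factor B

balanced = weighted_dot(main_a, main_b, [5, 5, 5, 5])      # equal cell sizes
unbalanced = weighted_dot(main_a, main_b, [8, 4, 3, 7])    # unequal cell sizes

print(balanced)    # zero: orthogonal when cell sizes are equal
print(unbalanced)  # nonzero: the same contrasts are no longer orthogonal
```

Exact fractions are used so that "equals zero" is not blurred by floating-point rounding.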
In general the tests for main effects and interactions are no longer orthogonal
for unbalanced designs.
Because of this non-orthogonality, the sums of squares will not nicely
partition.
SSA + SSB + SSAB ≠ SSModel
As a result:
o The tests for the main effects and interactions are not independent of each
other.
o Single degree of freedom contrasts may not be combined into a
simultaneous test.
The most popular method for dealing with these issues is to use different
methods of computing the sums of squares for each effect.
[Figure: diagrams of the SSA, SSB, and SSAB partitions]
Approach #2: Start with only the main effects. Use a unique SS approach to
divide the main effect sums of squares. Then, add the next highest order
effects. For the remaining SS, use the unique approach to divide the SS.
Continue until all effects have been added.
o This approach is known as using Type II SS
[Figure: diagram of SSA, SSB, and SSAB under Type II SS]
(The MSW and the highest order interaction are unaffected by these
different methods because they do not average across any cells; they say
something about individual cells.)
              Female                     Male
              College     No College     College     No College
              Degree      Degree         Degree      Degree
              24          15             25          19
              26          17             29          18
              25          20             27          21
              24          16                         20
              27                                     21
              24                                     22
              27                                     19
              23
Mean          25          17             27          20
Sample Size   8           4              3           7
                                 Gender
                                 Female        Male
Education   College Degree       n11 = 8       n21 = 3
                                 X̄11 = 25     X̄21 = 27
            No College Degree    n12 = 4       n22 = 7
                                 X̄12 = 17     X̄22 = 20
Gender
Women Men
Education College Degree -1 1
No College Degree -1 1
Women earn (25 + 17)/2 = 21, or $21,000
Men earn (27 + 20)/2 = 23.5, or $23,500
            Gender
            Women          Men
            nF = 12        nM = 10
            X̄F = 22.33    X̄M = 22.10
Based on this approach we look at the marginal means for gender, and
conclude that women earn slightly more than men.
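The two sets of marginal means differ only in how the cells are weighted. A minimal Python sketch, assuming the cell means and cell sizes of the salary example (the dictionary layout and the function name `marginal_means` are illustrative choices, not part of the original notes):

```python
# Cell means (in $1000s) and cell sizes from the salary example.
cells = {
    ("women", "college"): (25.0, 8),
    ("women", "no college"): (17.0, 4),
    ("men", "college"): (27.0, 3),
    ("men", "no college"): (20.0, 7),
}

def marginal_means(gender):
    """Return (unweighted, weighted) marginal means for one gender."""
    rows = [(mean, n) for (g, _), (mean, n) in cells.items() if g == gender]
    # Unweighted: every cell mean counts equally.
    unweighted = sum(mean for mean, _ in rows) / len(rows)
    # Weighted: every observation counts equally, so large cells dominate.
    weighted = sum(mean * n for mean, n in rows) / sum(n for _, n in rows)
    return unweighted, weighted

for gender in ("women", "men"):
    unweighted, weighted = marginal_means(gender)
    print(f"{gender}: unweighted = {unweighted:.2f}, weighted = {weighted:.2f}")
```

With equal weighting of cells, men come out ahead; with weighting by cell size, the ordering reverses, because most women in the sample have a college degree and most men do not.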
Testing the main effect for gender using a Type III SS approach:
Gender
Women Men
Education College Degree X 11 = 25 X 21 = 27
-1 1
No College Degree X 12 = 17 X 22 = 20
-1 1
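For a single-degree-of-freedom effect such as this one, the equally weighted contrast on the cell means gives a sum of squares directly. A minimal Python sketch, assuming the cell means and sizes of the salary example and the error term (SS = 50, df = 18) reported in the SPSS output for this analysis; the function name `contrast_ss` is an illustrative choice:

```python
# Cell means, cell sizes, and contrast weights, ordered
# (women/college, women/no college, men/college, men/no college).
means = [25.0, 17.0, 27.0, 20.0]
sizes = [8, 4, 3, 7]
gender_weights = [-1, -1, 1, 1]

MSE = 50.0 / 18  # error SS / error df, taken from the SPSS output in the text

def contrast_ss(weights, means, sizes):
    """SS for a single-df contrast on cell means: psi^2 / sum(c^2 / n)."""
    psi = sum(c * m for c, m in zip(weights, means))
    return psi ** 2 / sum(c ** 2 / n for c, n in zip(weights, sizes))

ss_gender = contrast_ss(gender_weights, means, sizes)
f_gender = ss_gender / MSE
print(round(ss_gender, 3), round(f_gender, 3))
```

The result can be compared against the GENDER row of the Type III table.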
In SPSS:
UNIANOVA dv BY gender edu
/METHOD = SSTYPE(3).
Tests of Between-Subjects Effects
Dependent Variable: DV
Type III Sum
Source of Squares df Mean Square F Sig.
Corrected Model 273.864a 3 91.288 32.864 .000
Intercept 9305.790 1 9305.790 3350.084 .000
GENDER 29.371 1 29.371 10.573 .004
EDU 264.336 1 264.336 95.161 .000
GENDER * EDU 1.175 1 1.175 .423 .524
Error 50.000 18 2.778
Total 11193.000 22
Corrected Total 323.864 21
a. R Squared = .846 (Adjusted R Squared = .820)
Main effect for gender such that men earn more than women,
F(1,18) = 10.57, p = .004
Main effect for education such that college educated individuals earn
more than non-college educated individuals,
F(1,18) = 95.16, p < .001
            Gender
            Women          Men
            nF = 12        nM = 10
            X̄F = 22.33    X̄M = 22.10
In SPSS:
UNIANOVA dv BY gender edu
/METHOD = SSTYPE(1).
Dependent Variable: DV
Type I Sum
Source of Squares df Mean Square F Sig.
Corrected Model 273.864a 3 91.288 32.864 .000
Intercept 10869.136 1 10869.136 3912.889 .000
GENDER .297 1 .297 .107 .747
EDU 272.392 1 272.392 98.061 .000
GENDER * EDU 1.175 1 1.175 .423 .524
Error 50.000 18 2.778
Total 11193.000 22
Corrected Total 323.864 21
a. R Squared = .846 (Adjusted R Squared = .820)
Dependent Variable: DV
Type I Sum
Source of Squares df Mean Square F Sig.
Corrected Model 273.864a 3 91.288 32.864 .000
Intercept 10869.136 1 10869.136 3912.889 .000
EDU 242.227 1 242.227 87.202 .000
GENDER 30.462 1 30.462 10.966 .004
EDU * GENDER 1.175 1 1.175 .423 .524
Error 50.000 18 2.778
Total 11193.000 22
Corrected Total 323.864 21
a. R Squared = .846 (Adjusted R Squared = .820)
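The order dependence of Type I SS can be seen by computing the sum of squares for whichever factor is entered first, which ignores the other factor entirely. A minimal Python sketch using the raw salaries from the example (the dictionary keys and the function name `first_entered_ss` are illustrative assumptions):

```python
# Raw salaries (in $1000s) from the example, keyed by (gender, education).
data = {
    ("F", "college"): [24, 26, 25, 24, 27, 24, 27, 23],
    ("F", "none"): [15, 17, 20, 16],
    ("M", "college"): [25, 29, 27],
    ("M", "none"): [19, 18, 21, 20, 21, 22, 19],
}

def first_entered_ss(factor_index):
    """Between-groups SS for one factor, ignoring the other factor."""
    groups = {}
    for key, values in data.items():
        groups.setdefault(key[factor_index], []).extend(values)
    scores = [y for values in groups.values() for y in values]
    grand_mean = sum(scores) / len(scores)
    return sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups.values()
    )

print(round(first_entered_ss(0), 3))  # gender entered first
print(round(first_entered_ss(1), 3))  # education entered first
```

These values can be compared with the first-listed effect in each of the two Type I tables: the SS for a factor depends on whether it is entered before or after the other factor.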
o In general, you ran the design because you wanted to compare the cell
means. In this case, the unequal cell sizes are irrelevant and you should
use Type III SS
If we have an experimental design and the data are missing at random,
then there is no defensible reason for allowing cells with larger
numbers of observations to exert a greater influence on the analysis
For men and women with equal levels of education, do men and
women receive equal pay?
Type III SS also have the advantage of being the simplest to convert
to contrast coefficients
Management Level (MM = Middle-Management, UM = Upper-Management;
Minor/Major = Minor/Major Division)
Gender   MM, Minor        UM, Minor      MM, Major      UM, Major         CEO
Female   21 25 29 26 24   31 30 23 28    25 22 31 30    35 25 30          27 36 27
Male     25 18 26         31 28 22 31    33 31 40 36    35 35 43 37 40    36 44 43 45 42
DV = Scores on an Affirmative Action Attitude Scale
Note that this design is rather odd: it is a 2*2*2 with an extra 2 cells.
Management Level
Middle Management Upper Management
Minor Major Minor Major
Gender Division Division Division Division Gender CEO
Male Male
Female Female
Rather than trying to analyze it as a 2*2*3 with two missing cells, it is much
easier to consider this design to be a 2*5 design. Using appropriate contrasts,
we can test
o Main effect of management level
o Main effect of division
o Management by division interaction
o Interactions between all these terms and gender
But we can also make comparisons between these groups and CEOs.
Using this approach, we can avoid designs with empty cells and the need to
learn about Type IV SS.
[Figure: boxplots of DV by management level (MANAGE) and gender; cell sizes N = 5, 3, 4, 4, 4, 4, 3, 5, 3, 5]
Tests of Normality (DV)
         Shapiro-Wilk
GROUP    Statistic   df   Sig.
1.00     .989        5    .977
2.00     .895        4    .405
3.00     .912        4    .492
4.00     1.000       3    1.000
5.00     .750        3    .000
6.00     .842        3    .220
7.00     .827        4    .161
8.00     .971        4    .850
9.00     .887        5    .341
10.00    .836        5    .154

Test of Homogeneity of Variances (DV)
Levene Statistic   df1   df2   Sig.
.348               9     30    .950
o We should adopt a Type III SS approach: the variations in the cell sizes
appear to be random and we are interested in the cell means.
Hypothesis 1
o Do middle and upper management in the minor divisions differ in their
support for AA?
o Does this level of support differ by gender?
Management Level
Gender MM, UM, MM, UM,
Minor Minor Major Major CEO
Hyp1: Female -1 1 0 0 0
Male -1 1 0 0 0
Hyp 1B: Female -1 1 0 0 0
Male 1 -1 0 0 0
ONEWAY dv by group
/cont = -1 1 0 0 0 -1 1 0 0 0
/cont = -1 1 0 0 0 1 -1 0 0 0.
Contrast Tests
Value of
Contrast Contrast Std. Error t df Sig. (2-tailed)
DV Hyp 1 8.0000 4.00638 1.997 30 .055
Hyp 1 * Gender -2.0000 4.00638 -.499 30 .621
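Hypothesis 1: $\hat{\psi}_1 = 8$
$SS_{\hat{\psi}_1} = \frac{\hat{\psi}_1^2}{\sum_j \frac{c_j^2}{n_j}} = \frac{(8)^2}{\frac{(-1)^2}{5} + \frac{(1)^2}{4} + 0 + 0 + 0 + \frac{(-1)^2}{3} + \frac{(1)^2}{4} + 0 + 0 + 0} = \frac{64}{1.033} = 61.935$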
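The standard error, t statistic, and sum of squares for Hypothesis 1 can be reproduced from the cell sizes and the error term. A minimal Python sketch, assuming the cell ordering female groups 1 to 5 followed by male groups 1 to 5, the contrast value of 8 from the SPSS output, and the error term (SS = 466, df = 30) from the omnibus ANOVA table reported for this design:

```python
import math

# Cell sizes, ordered (F: MM-minor, UM-minor, MM-major, UM-major, CEO,
# then M in the same order), and the Hypothesis 1 weights.
sizes = [5, 4, 4, 3, 3, 3, 4, 4, 5, 5]
weights = [-1, 1, 0, 0, 0, -1, 1, 0, 0, 0]
psi_hat = 8.0          # value of the contrast reported in the SPSS output
ms_error = 466.0 / 30  # error SS / error df from the omnibus ANOVA table

scale = sum(c ** 2 / n for c, n in zip(weights, sizes))  # sum of c_j^2 / n_j
std_error = math.sqrt(ms_error * scale)
t_observed = psi_hat / std_error
ss_contrast = psi_hat ** 2 / scale

print(round(std_error, 5), round(t_observed, 3), round(ss_contrast, 3))
```

Changing `weights` to the other contrast vectors in this section gives the corresponding standard errors in the same way.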
Hypothesis 2
o Do minor division managers differ from major division managers in their
support for AA?
o Does this level of support differ by gender?
Management Level
Gender MM, UM, MM, UM,
Minor Minor Major Major CEO
Hyp 2: Female -1 -1 1 1 0
Male -1 -1 1 1 0
Hyp 2B: Female -1 -1 1 1 0
Male 1 1 -1 -1 0
ONEWAY dv by group
/cont = -1 -1 1 1 0 -1 -1 1 1 0
/cont = -1 -1 1 1 0 1 1 -1 -1 0.
Contrast Tests
Value of
Contrast Contrast Std. Error t df Sig. (2-tailed)
DV Hyp 2 26.0000 5.66588 4.589 30 .000
Hyp 2 * Gender -18.0000 5.66588 -3.177 30 .003
Value of
Contrast Contrast Std. Error t df Sig. (2-tailed)
DV Hyp 2 - Women only 4.0000 4.00638 .998 30 .326
Hyp 2 - Men only 22.0000 4.00638 5.491 30 .000
Hypothesis 3
o Do CEOs differ from other management in their support for AA?
o Does this level of support differ by gender?
Management Level
Gender MM, UM, MM, UM, CEO
Minor Minor Major Major
Hyp 3: Female -1 -1 -1 -1 4
Male -1 -1 -1 -1 4
Hyp 3B: Female -1 -1 -1 -1 4
Male 1 1 1 1 -4
ONEWAY dv by group
/cont = -1 -1 -1 -1 4 -1 -1 -1 -1 4
/cont = -1 -1 -1 -1 4 1 1 1 1 -4.
Contrast Tests
Value of
Contrast Contrast Std. Error t df Sig. (2-tailed)
DV Hyp 3 54.0000 12.83173 4.208 30 .000
Hyp 3 * Gender -34.0000 12.83173 -2.650 30 .013
Value of
Contrast Contrast Std. Error t df Sig. (2-tailed)
DV Hyp 3 - Women only 10.0000 9.94462 1.006 30 .323
Hyp 3 - Men only 44.0000 8.10912 5.426 30 .000
Note that for a contrast-based analysis, we are implicitly adopting a Type III
SS approach by weighting each cell mean equally. Single degree of
freedom tests of cell means are not affected by an unbalanced design.
(However, we would not be able to combine single df tests into a
simultaneous test.)
Dependent Variable: DV
Type III Sum
Source of Squares df Mean Square F Sig.
Corrected Model 1427.100a 9 158.567 10.208 .000
Intercept 36013.846 1 36013.846 2318.488 .000
GENDER 260.000 1 260.000 16.738 .000
MANAGE 687.429 4 171.857 11.064 .000
GENDER * MANAGE 268.351 4 67.088 4.319 .007
Error 466.000 30 15.533
Total 40706.000 40
Corrected Total 1893.100 39
a. R Squared = .754 (Adjusted R Squared = .680)
[Figure: cell means of DV by gender (Female, Male) for the UM - Major and CEO groups]
o A random effect is one in which the factor levels are randomly sampled
from a population. Inferences are made not only for the factor levels
included in the study, but to the entire population of factor levels.
o The effect is random in that if someone were to replicate the study, the
different treatments would be sampled from the population.
o A mixed model is a model containing at least one fixed effect and at least
one random effect
In psychology many people refer to a design with at least one between-subjects
factor and at least one within-subjects factor as a mixed design. Although this
terminology is common in psychology it is inconsistent with the statistical usage
of the term. Consistent with the statistical usage, we will reserve the term mixed
model for a model with fixed and random factors
The test factor is a fixed factor. These three kinds of tasks are the
only tasks of interest to the experimenter. The classroom factor is a
random factor. If the design were to be replicated, six different
classrooms would be randomly sampled from the population.
The key idea of the random effects model is that you not only take into
account random noise, $\sigma_\varepsilon^2$, you also take into account the variability due to
the sampling of the factor levels, $\sigma_\alpha^2$.
Store
1 2 3 4 5
5.80 6.00 6.30 6.40 5.70
5.10 6.10 5.50 6.40 5.90
5.70 6.60 5.70 6.50 6.50
5.90 6.50 6.00 6.10 6.30
5.60 5.90 6.10 6.60 6.20
5.40 5.90 6.20 5.90 6.40
5.30 6.40 5.80 6.70 6.00
5.20 6.30 5.60 6.00 6.30
X̄1 = 5.50 X̄2 = 6.21 X̄3 = 5.90 X̄4 = 6.33 X̄5 = 6.16
[Figure: boxplots of DV by STORE; N = 8 per store]
Tests of Normality (DV)
         Shapiro-Wilk
STORE    Statistic   df   Sig.
1.00     .950        8    .716
2.00     .913        8    .373
3.00     .950        8    .716
4.00     .930        8    .516
5.00     .946        8    .667

Test of Homogeneity of Variance (DV)
Levene Statistic   df1   df2   Sig.
.073               4     35    .990
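If H0 is true: $\frac{\sum n_i \alpha_i^2}{a-1} = 0$
Then $F = \frac{MS_{Bet}}{MS_W} = \frac{\sigma_\varepsilon^2 + \frac{\sum n_i \alpha_i^2}{a-1}}{\sigma_\varepsilon^2} = \frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2} = 1$
If H1 is true: $\frac{\sum n_i \alpha_i^2}{a-1} > 0$
Then $F = \frac{MS_{Bet}}{MS_W} = \frac{\sigma_\varepsilon^2 + \frac{\sum n_i \alpha_i^2}{a-1}}{\sigma_\varepsilon^2} > 1$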
Source           SS      df    MS          E(MS)                                        F
Between          SSBet   a-1   SSB/DFBet   $\sigma_\varepsilon^2 + n\sigma_\alpha^2$    MSBet/MSW
Within (Error)   SSW     N-a   SSW/DFW     $\sigma_\varepsilon^2$
Total            SST     N-1
o Although the F-tests are constructed in the same manner as a fixed effects
model, under the hood different components are being estimated
Dependent Variable: DV
Type III Sum
Source of Squares df Mean Square F Sig.
Intercept Hypothesis 1449.616 1 1449.616 1665.507 .000
Error 3.482 4 .870a
STORE Hypothesis 3.482 4 .870 10.717 .000
Error 2.843 35 8.121E-02b
a. MS(STORE)
b. MS(Error)
o We reject the null hypothesis of no store effect and conclude that the
effectiveness of the sales campaign varies by store
Variance Component
Quadratic
Source Var(STORE) Var(Error) Term
Intercept 8.000 1.000 Intercept
STORE 8.000 1.000
Error .000 1.000
a. For each source, the expected mean square
equals the sum of the coefficients in the cells
times the variance components, plus a quadratic
term involving effects in the Quadratic Term cell.
$E(MS_{STORE}) = 8\sigma_\alpha^2 + \sigma_\varepsilon^2$
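Because E(MS for STORE) exceeds E(MS for Error) by 8 times the store variance, the variance component can be estimated by subtracting the two mean squares and dividing by 8. A minimal Python sketch using the store data from this example (the variable names are illustrative, and the moment estimator shown is the standard ANOVA estimator rather than anything specific to SPSS):

```python
# Sales data: one inner list per store, 8 observations each.
stores = [
    [5.8, 5.1, 5.7, 5.9, 5.6, 5.4, 5.3, 5.2],
    [6.0, 6.1, 6.6, 6.5, 5.9, 5.9, 6.4, 6.3],
    [6.3, 5.5, 5.7, 6.0, 6.1, 6.2, 5.8, 5.6],
    [6.4, 6.4, 6.5, 6.1, 6.6, 5.9, 6.7, 6.0],
    [5.7, 5.9, 6.5, 6.3, 6.2, 6.4, 6.0, 6.3],
]

n = len(stores[0])   # observations per store
a = len(stores)      # number of sampled stores
grand_mean = sum(map(sum, stores)) / (a * n)
store_means = [sum(s) / n for s in stores]

ms_between = n * sum((m - grand_mean) ** 2 for m in store_means) / (a - 1)
ms_within = sum(
    (y - m) ** 2 for s, m in zip(stores, store_means) for y in s
) / (a * (n - 1))

f_ratio = ms_between / ms_within
var_store = (ms_between - ms_within) / n  # solves E(MS_STORE) = n*var + E(MS_W)
print(round(ms_between, 3), round(ms_within, 4), round(f_ratio, 2), round(var_store, 4))
```

The mean squares and F ratio can be checked against the SPSS table; the last number printed is the estimated variance among stores.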
Administrator
Order 1 2 3 4
1 26 15 30 33 25 23 28 30
2 26 24 25 33 27 17 27 26
3 33 27 26 32 30 24 31 26
4 36 28 37 42 37 33 39 25
[Figure: boxplots of DV by ADMIN and ORDER; N = 2 per cell]
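$\varepsilon_{ijk} \sim N(0, \sigma_\varepsilon)$
$\alpha_j \sim N(0, \sigma_\alpha)$
$\beta_k \sim N(0, \sigma_\beta)$
$(\alpha\beta)_{jk} \sim N(0, \sigma_{\alpha\beta})$
So that $\sigma_Y^2 = \sigma_\alpha^2 + \sigma_\beta^2 + \sigma_{\alpha\beta}^2 + \sigma_\varepsilon^2$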
o In the two factor random effects model, we need to be much more careful
about examining the E(MS) and constructing appropriate tests of each
effect.
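Source           SS     df           MS          E(MS)                                                                      F
Factor A         SSA    a-1          SSA/DFA     $\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2 + nb\sigma_\alpha^2$       MSA/MSAB
Factor B         SSB    b-1          SSB/DFB     $\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2 + na\sigma_\beta^2$        MSB/MSAB
A*B              SSAB   (a-1)(b-1)   SSAB/DFAB   $\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2$                           MSAB/MSW
Within (Error)   SSW    N-ab         SSW/DFW     $\sigma_\varepsilon^2$
Total            SST    N-1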
o For multi-factor random effects ANOVA, you must always examine the
expected MS to make sure you are using the correct error term!
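Testing factor A with MSAB as the error term:
If H0 is true: $\sigma_\alpha^2 = 0$
Then $F = \frac{MSA}{MSAB} = \frac{\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2 + nb\sigma_\alpha^2}{\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2} = \frac{\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2}{\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2} = 1$
If H1 is true: $\sigma_\alpha^2 > 0$
Then $F = \frac{MSA}{MSAB} = \frac{\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2 + nb\sigma_\alpha^2}{\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2} > 1$

If MSW were used as the error term for factor A instead, the ratio would
exceed 1 even under the null hypothesis:
If H0 is true: $\sigma_\alpha^2 = 0$
Then $F = \frac{MSA}{MSW} = \frac{\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2 + nb\sigma_\alpha^2}{\sigma_\varepsilon^2} = \frac{\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2}{\sigma_\varepsilon^2} > 1$

Testing the A*B interaction with MSW as the error term:
If H0 is true: $\sigma_{\alpha\beta}^2 = 0$
Then $F = \frac{MSAB}{MSW} = \frac{\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2}{\sigma_\varepsilon^2} = \frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2} = 1$
If H1 is true: $\sigma_{\alpha\beta}^2 > 0$
Then $F = \frac{MSAB}{MSW} = \frac{\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2}{\sigma_\varepsilon^2} > 1$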
Dependent Variable: DV
Type III Sum
Source of Squares df Mean Square F Sig.
Intercept Hypothesis 26507.531 1 26507.531 155.441 .000
Error 716.173 4.200 170.531a
ADMIN Hypothesis 151.094 3 50.365 3.446 .065
Error 131.531 9 14.615b
ORDER Hypothesis 404.344 3 134.781 9.222 .004
Error 131.531 9 14.615b
ADMIN * Hypothesis 131.531 9 14.615 .631 .755
ORDER Error 370.500 16 23.156c
a. MS(ADMIN) + MS(ORDER) - MS(ADMIN * ORDER)
b. MS(ADMIN * ORDER)
c. MS(Error)
o SPSS highlights the fact that it is using different error terms to test each
factor
o We conclude:
There is a significant effect of order of the test on number of
responses, F(3,9) = 9.22, p < .01
Also there is a marginally significant effect of administrator on the
number of responses, F(3,9) = 3.45, p = .07
But that there is no order by administrator interaction effect on the
number of responses, F(9,16) = 0.63, p = .76.
Variance Component
Var(ADMIN * Quadratic
Source Var(ADMIN) Var(ORDER) ORDER) Var(Error) Term
Intercept 8.000 8.000 2.000 1.000 Intercept
ADMIN 8.000 .000 2.000 1.000
ORDER .000 8.000 2.000 1.000
ADMIN * ORDER .000 .000 2.000 1.000
Error .000 .000 .000 1.000
a. For each source, the expected mean square equals the sum of the coefficients in
the cells times the variance components, plus a quadratic term involving effects in
the Quadratic Term cell.
b. Expected Mean Squares are based on the Type III Sums of Squares.
Variance Estimates
Component Estimate
Var(ORDER) 15.021
Var(ADMIN) 4.469
Var(ORDER * ADMIN) -4.271a
Var(Error) 23.156
Dependent Variable: DV
Method: Minimum Norm Quadratic Unbiased Estimation
(Weight = 1 for Random Effects and Residual)
a. For the ANOVA and MINQUE methods, negative
variance component estimates may occur. Some
possible reasons for their occurrence are: (a) the
specified model is not the correct model, or (b)
the true value of the variance equals zero.
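For this balanced design, the variance estimates can be recovered by solving the expected mean squares for each component. A minimal Python sketch, assuming the mean squares from the SPSS table of F-tests and the coefficients (8, 2, 1) from the expected mean squares table; this is the simple moment (ANOVA) calculation, written in Python for illustration, not a reimplementation of the MINQUE procedure:

```python
# Mean squares from the SPSS table; n = 2 observations per cell,
# a = 4 administrators, b = 4 orders.
ms_admin, ms_order, ms_inter, ms_error = 50.365, 134.781, 14.615, 23.156
n, a, b = 2, 4, 4

# E(MS_AB)    = n * var_inter + var_error
# E(MS_ADMIN) = n * b * var_admin + n * var_inter + var_error
# E(MS_ORDER) = n * a * var_order + n * var_inter + var_error
var_error = ms_error
var_inter = (ms_inter - ms_error) / n
var_admin = (ms_admin - ms_inter) / (n * b)
var_order = (ms_order - ms_inter) / (n * a)

print(round(var_order, 3), round(var_admin, 3), round(var_inter, 3), var_error)
```

The negative interaction estimate arises simply because MS(ADMIN * ORDER) is smaller than MS(Error); small differences from the SPSS values in the last decimal place reflect rounding of the mean squares used as input.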
A return to the study on the effect of mental activity on blood flow (BF)
See p. 9-24. This design is a 2-factor between-subjects mixed model
ANOVA
Factor 1: Test (Math, Reading Comprehension, or History)
Factor 2: Classroom (6 classrooms)
Task (fixed)
Classroom
(random) Math Reading Comp History
1 7.8 8.7 11.1 12.0 11.7 10.0
2 8.0 9.2 11.3 10.6 9.8 11.9
3 4.0 6.9 9.8 10.1 11.7 12.6
4 10.3 9.4 11.4 10.5 7.9 8.1
5 9.3 10.6 13.0 11.7 8.3 7.9
6 9.5 9.8 12.2 12.3 8.6 10.5
[Figure: boxplots of DV by CLASS and TASK (Math, Reading, History); N = 2 per cell]
So that $\sigma_Y^2 = \sigma_\beta^2 + \sigma_{\alpha\beta}^2 + \sigma_\varepsilon^2$
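Source              SS     df           MS          E(MS)                                                                                   F
Factor A (Fixed)    SSA    a-1          SSA/DFA     $\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2 + \frac{nb\sum\alpha_j^2}{a-1}$         MSA/MSAB
Factor B (Random)   SSB    b-1          SSB/DFB     $\sigma_\varepsilon^2 + na\sigma_\beta^2$                                               MSB/MSW
A*B                 SSAB   (a-1)(b-1)   SSAB/DFAB   $\sigma_\varepsilon^2 + n\sigma_{\alpha\beta}^2$                                        MSAB/MSW
Within (Error)      SSW    N-ab         SSW/DFW     $\sigma_\varepsilon^2$
Total               SST    N-1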
Why does having a random effect change the error term of the fixed effect,
but not of the random effect?
o We assume that the three trainees used in the study were drawn from a
population of trainees. Imagine that we can put on our magic glasses and
see population means for the therapy modes for the entire population of
trainees (and for simplicity, we will assume that the population is small,
consisting of 18 trainees).
Clinical Trainee
Therapy a b c d e f g h i j k l m n o p q r Mean
A 7 6 5 7 6 5 4 4 4 1 2 3 4 4 4 1 2 3 4
B 4 4 4 1 2 3 7 6 5 7 6 5 1 2 3 4 4 4 4
C 1 2 3 4 4 4 1 2 3 4 4 4 7 6 5 7 6 5 4
Mean 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
Clinical Trainee
Therapy g k r Mean
A 4 2 3 3.0
B 7 6 4 5.67
C 1 4 5 3.33
Mean 4 4 4 4
o The random trainee factor does not affect our estimation of the effect of
trainee
o The random trainee factor does affect our estimation of the therapy (the
fixed factor)
Trainee and Therapy interact, which can cause variability among
means for the fixed factor to increase
MS(A) must be measuring something other than just error and the
effect of Therapy. When we look at the EMS for factor A, we see that
it captures variability due to the A*B interaction
Dependent Variable: DV
Type III Sum
Source of Squares df Mean Square F Sig.
Intercept Hypothesis 3570.062 1 3570.062 2626.655 .000
Error 6.796 5 1.359a
TASK Hypothesis 44.042 2 22.021 3.784 .060
Error 58.195 10 5.820b
CLASS Hypothesis 6.796 5 1.359 .234 .939
Error 58.195 10 5.820b
TASK * Hypothesis 58.195 10 5.820 7.207 .000
CLASS Error 14.535 18 .808c
a. MS(CLASS)
b. MS(TASK * CLASS)
c. MS(Error)
o But wait!! SPSS is using the wrong error term for test of the main effect
of classroom!!!
Classroom is a random effect. To test the random effect, we need to
use MSW as the error term. SPSS is using MSAB.
o We can also use the TEST subcommand and ask SPSS to compute the F-
test. We need to enter the effect (class), the SS of the denominator
(14.54) and the df of the denominator (18)
Dependent Variable: DV
Sum of
Source Squares df Mean Square F Sig.
Contrast 6.796 5 1.359 1.683 .190
Error 14.540a 18a .808
a. User specified.
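The F ratios implied by the E(MS) table for the mixed model can be recomputed by pairing each effect with its error term. A minimal Python sketch, assuming the rounded mean squares from the SPSS output above (the dictionary names are illustrative; p-values are omitted because they would require an F distribution routine):

```python
# Mean squares from the SPSS output for the task (fixed) by class (random) design.
ms = {"task": 22.021, "class": 1.359, "task*class": 5.820, "error": 0.808}

# Error term for each effect, following the E(MS) table in the text:
# fixed factor -> interaction; random factor and interaction -> within-cell error.
error_term = {"task": "task*class", "class": "error", "task*class": "error"}

f_ratios = {effect: ms[effect] / ms[denom] for effect, denom in error_term.items()}
for effect, f in f_ratios.items():
    print(f"{effect}: F = {f:.2f} (error term: MS {error_term[effect]})")
```

Only the classroom test changes relative to the default SPSS output; the values agree with the tables to within rounding of the mean squares.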
o Note: SPSS does not consider this to be an error. They state that
statisticians differ in how they approach this problem.
http://spss.com/tech/answer/details.cfm?tech_tan_id=100000073
As indicated in this tech note, SAS makes the same error. Thus,
even if you run the analysis in SAS, you will still have to rerun the
analysis
I cannot find any recent texts that agree with the SPSS approach.
Neter et al (1996, p 981), Kirk (1995, p 374) and Maxwell & Delaney
(1990, p 429/431) all give the E(MS) I list on the previous page. For
balanced designs, SPSS does the wrong analysis. For unbalanced
designs, SPSS's approach may be appropriate.
Expected Mean Squares (SPSS output)
                      Variance Component
Source          Var(class)   Var(task * class)   Var(Error)   Quadratic Term
Intercept       6.000        2.000               1.000        Intercept, task
task            .000         2.000               1.000        task
class           6.000        2.000               1.000
task * class    .000         2.000               1.000
Error           .000         .000                1.000
a. For each source, the expected mean square equals the sum of the
coefficients in the cells times the variance components, plus a quadratic
term involving effects in the Quadratic Term cell.
b. Expected Mean Squares are based on the Type III Sums of Squares.

Expected Mean Squares(a)
                      Variance Component
Source          Var(CLASS)   Var(TASK * CLASS)   Var(Error)   Quadratic Term
Intercept       6.000        2.000               1.000        Intercept, TASK
TASK            .000         2.000               1.000        TASK
CLASS           6.000        .000                1.000
TASK * CLASS    .000         2.000               1.000
Error           .000         .000                1.000
a. Andy's Hand-Corrected Table
o SPSS's VARCOMP command also errs on the variance estimate for the
class effect (SPSS output not shown here)
To perform contrasts or post-hoc tests, you can use the same formulas
previously discussed for ANOVA with one exception. You must use the
correct error term in place of MSW, and the degrees of freedom associated
with that error term
$\hat{\rho}_A = \frac{\hat{\sigma}_A^2}{\hat{\sigma}_Y^2}$
Omega squared must still be used for fixed effects in a mixed model. In
general, for a fixed factor A:
$\hat{\omega}_{Task}^2 = \frac{44.04 - (2)5.82}{44.04 - (2)5.82 + (36)(.808)} = .53$
The distinction between fixed and random effects is not always as clear as
presented here. For example, Clark (1973) argued convincingly that
when a list of words is used in a study, the words should be treated as a
random effect. The key is what type of inference you want to make
We estimate the distribution of the random effects based on the means (and
the variability of those means) of the random factor. If you only have 2-3
levels of your random factor, you will not get a good estimate of the
distribution. It is desirable to have a relatively large number of levels of any
random factor. In addition, it is important that the levels of the random
factor be randomly sampled from the population of interest
In designs with three or more factors that include two or more random
effects, it is common to encounter situations where no exact F-test can be
constructed. In this case, quasi-F ratios (linear combinations of MSs) are
used to approximate an F-ratio.
All of our calculations assume that cell sizes are equal. Things get very
wacky with unequal cell sizes, and it is no longer possible to construct exact
F-tests (the ratios of expected MSs no longer satisfy the requirements for a
valid F-test). Approximate tests are available and are calculated in SPSS.
Therapist 1 2 3 4 5 6
Jury 1 2 3 4 5 6 7 8 9 10 11 12
Old intervention
Classroom 1 2 3 4 5 6 7 8 9 10 11 12
New intervention
Classroom 1 2 3 4 5 6 7 8 9 10 11 12
$Y_{ijk} = \mu + \alpha_j + \beta_{k(j)} + \varepsilon_{i(jk)}$
$Y_{ijkl} = \mu + \alpha_j + \beta_{k(j)} + \gamma_{l(jk)} + \varepsilon_{i(jkl)}$
Note that because these designs are nested, not crossed, there is no way to
estimate an interaction effect.
With nested effects, we again need to make sure we use the correct error
term when constructing F-tests.
o Just as for the random effect designs the SS are calculated in the same
manner as before. The only difference is the construction of the F-test
o For more complex designs, you'll have to look up the error term, or trust
SPSS
Sex of Therapist
Male Female
1 2 3 4 5 6
49 42 42 54 44 57
40 48 46 60 54 62
31 52 50 64 54 66
40 58 54 70 64 71
Sex of Therapist
Male Female
40 50 48 62 54 64
Dependent Variable: DV
Type III Sum
Source of Squares df Mean Square F Sig.
Intercept Hypothesis 67416.000 1 67416.000 601.929 .000
Error 448.000 4 112.000a
SEX Hypothesis 1176.000 1 1176.000 10.500 .032
Error 448.000 4 112.000a
THERA(SEX) Hypothesis 448.000 4 112.000 2.459 .083
Error 820.000 18 45.556b
a. MS(THERA(SEX))
b. MS(Error)
Sex of Therapist
Male Female
40 50 48 62 54 64
Descriptives (DV)
         N   Mean
1.00     3   46.0000
2.00     3   60.0000
Total    6   53.0000

ANOVA (DV)
                 Sum of Squares   df   Mean Square   F        Sig.
Between Groups   294.000          1    294.000       10.500   .032
Within Groups    112.000          4    28.000
Total            406.000          5
This analysis produces the same results; only the SS are different.
This analysis was tricked into thinking each observation was one
participant, but in the actual analysis, we know that each observation
was based on data from four participants. If you multiply the SS in
this oneway analysis by 4, you will get the same results as the nested
analysis. (This trick only works for balanced designs.)
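The multiply-by-four relationship can be verified from the raw data. A minimal Python sketch, assuming the therapist data from this example with four clients per therapist (the dictionary layout and variable names are illustrative):

```python
# Scores for the four clients of each therapist, grouped by therapist sex.
therapists = {
    "male": [[49, 40, 31, 40], [42, 48, 52, 58], [42, 46, 50, 54]],
    "female": [[54, 60, 64, 70], [44, 54, 54, 64], [57, 62, 66, 71]],
}
n = 4  # clients per therapist

# Collapse each therapist to a single mean, then run a one-way ANOVA on the means.
means = {sex: [sum(t) / n for t in group] for sex, group in therapists.items()}
all_means = [m for group in means.values() for m in group]
grand = sum(all_means) / len(all_means)

ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in means.values())
ss_within = sum((m - sum(g) / len(g)) ** 2 for g in means.values() for m in g)
df_between, df_within = 1, 4   # 2 sexes - 1; 6 therapists - 2 sexes
f_ratio = (ss_between / df_between) / (ss_within / df_within)

print(ss_between, ss_within, f_ratio)   # one-way analysis on therapist means
print(n * ss_between, n * ss_within)    # rescaled to the nested-analysis SS
```

The rescaled values can be compared with the SEX and THERA(SEX) rows of the nested analysis; the F ratio is unchanged because numerator and denominator are multiplied by the same constant.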
$\hat{\omega}_{Sex}^2 = \frac{1176 - (1)112}{1176 + (1)112 + (24)45.56} = .45$
$\hat{\rho}_{Thera(sex)} = \frac{\hat{\sigma}_{Thera(sex)}^2}{\hat{\sigma}_Y^2}$
Variance Component
Var(THER Quadratic
Source A(SEX)) Var(Error) Term
Intercept Intercept,
4.000 1.000
SEX
SEX 4.000 1.000 SEX
THERA(SEX) 4.000 1.000
Error .000 1.000
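$\hat{\sigma}_Y^2 = \hat{\sigma}_{Thera(sex)}^2 + \hat{\sigma}_\varepsilon^2$
$\hat{\rho}_{Thera(sex)} = \frac{\hat{\sigma}_{Thera(sex)}^2}{\hat{\sigma}_Y^2} = \frac{18.86}{64.42} = .29$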
Old Intervention
School 1 School 2 School 3
1 2 3 4 1 2 3 4 1 2 3 4
11.2 16.5 18.3 19 7.3 11.9 11.3 8.9 15.3 19.5 14.1 16.5
11.6 16.8 18.7 18.5 7.8 12.4 10.9 9.4 15.9 20.1 13.8 17.2
12.0 16.1 19.0 18.2 7.0 12.0 10.5 9.3 16.0 19.3 14.2 16.9
New Intervention
School 1 School 2 School 3
1 2 3 4 1 2 3 4 1 2 3 4
13.2 17.25 20.3 20.5 9.3 12.9 10.3 10.9 17.55 20.75 15.1 18.75
12.35 18.8 18.45 17.5 7.05 14.65 12.15 8.15 14.9 22.1 14.55 17.2
13.25 15.85 21.0 19.2 8.5 14.25 10.0 11.55 17.75 21.3 13.7 16.9
Intervention
Old New
16.33 9.89 16.57 17.30 10.81 17.55
ONEWAY dv by treat
/STAT = DESC.
Descriptives (DV)
         N   Mean      Std. Deviation
1.00     3   14.2613   3.78589
2.00     3   15.2200   3.82122
Total    6   14.7407   3.44232

ANOVA (DV)
                 Sum of Squares   df   Mean Square   F      Sig.
Between Groups   1.379            1    1.379         .095   .773
Within Groups    57.869           4    14.467
Total            59.248           5
School (Treatment)
1(Old) 2(Old) 3(Old) 1(New) 2(New) 3(New)
11.60 7.37 15.73 12.93 8.28 16.73
16.47 12.10 19.63 17.30 13.93 21.38
18.67 10.90 14.03 19.92 10.81 14.45
18.57 9.20 16.86 19.07 10.20 17.61
Dependent Variable: DV
Type III Sum
Source of Squares df Mean Square F Sig.
Corrected Model 237.029 5 47.406 6.427 .001
Intercept 5213.833 1 5213.833 706.816 .000
TREAT 5.491a 1 5.491 .744 .400
SCHOOL(TREAT) 231.538 4 57.885 7.847 .001
Error 132.777 18 7.377
Total 5583.639 24
Corrected Total 369.807 23
a. Ignore this test for the effect of treatment in this setup
Dependent Variable: DV
Type III Sum
Source of Squares df Mean Square F Sig.
Intercept Hypothesis 15643.857 1 15643.857 90.088 .001
Error 694.600 4 173.650a
TREAT Hypothesis 16.531 1 16.531 .095 .773
Error 694.600 4 173.650a
SCHOOL(TREAT) Hypothesis 694.600 4 173.650 7.850 .001
Error 398.194 18 22.122b
CLASS(SCHOOL Hypothesis 398.194 18 22.122 27.682 .000
(TREAT)) Error 38.358 48 .799c
a. MS(SCHOOL(TREAT))
b. MS(CLASS(SCHOOL(TREAT)))
c. MS(Error)
o SPSS also provides the variance components so that effect sizes can be
calculated for the random effects
Expected Mean Squares(a,b)
                                     Variance Component
Source                 Var(SCHOOL(TREAT))   Var(CLASS(SCHOOL(TREAT)))   Var(Error)   Quadratic Term
Intercept              12.000               3.000                       1.000        Intercept, TREAT
TREAT                  12.000               3.000                       1.000        TREAT
SCHOOL(TREAT)          12.000               3.000                       1.000
CLASS(SCHOOL(TREAT))   .000                 3.000                       1.000
Error                  .000                 .000                        1.000
a. For each source, the expected mean square equals the sum of the
coefficients in the cells times the variance components, plus a
quadratic term involving effects in the Quadratic Term cell.
b. Expected Mean Squares are based on the Type III Sums of Squares.
In these examples, we did not test the assumptions for the model because of
small cell sizes. However, the ANOVA assumptions must be satisfied for the
results to be valid. The assumptions for a nested model are the same as the
assumptions for a fixed or random effects model (depending on if there are
fixed or random effects in the model).
Pay attention to the small degrees of freedom in the tests for some of the
nested effects. In both examples, the test of the fixed effect (the effect of
most interest in these designs) is based on six observations! Nested designs
can have very low power unless you have a large number of levels of the
nested effects.
As in the random effects case, contrasts and post-hoc tests can be conducted
by using the appropriate error term in previously developed equations.
When we test the effect of a factor on a dependent variable, there are always
many other factors that lead to variability in the DV. When these variables
are not of interest to us, they are called nuisance variables.
For example, if we are interested in the relationship between type of therapy
and psychological wellness, there are many other factors that influence
wellness other than the type of therapy.
o You can also include the nuisance variable(s) as factors in the study.
This approach is known as blocking.
SS Total (SS Corrected Total)
  = SS A (df = a-1) + SS Error (df = N-a)
  = SS A (df = a-1) + SS Blocks (df = bl-1) + SS Residual (df = N-a-bl+1)
Participant
1 2 3
Block 1 (Oldest participants) C W U
2 C U W
3 U W C
4 W U C
5 (Youngest participants) W C U
Method
Block Utility Worry Comparison Average
1 (oldest) 1 5 8 4.7
2 2 8 14 8.0
3 7 9 16 10.7
4 6 13 18 12.3
5 (youngest) 12 14 17 14.3
Average 5.6 9.8 14.6 10.0
Note that a randomized block design looks like a factorial design, but
there is only one participant per cell. If there were two or more
participants per cell, we would call this design a two-way ANOVA.
By treatment:
[Figure: boxplots of DV by TREAT; N = 5 per treatment]

Test of Homogeneity of Variances (DV)
Levene Statistic   df1   df2   Sig.
.048               2     12    .953

Tests of Normality (DV)
         Shapiro-Wilk
TREAT    Statistic   df   Sig.
1.00     .940        5    .665
2.00     .943        5    .687
3.00     .860        5    .227

By block:
[Figure: boxplots of DV by BLOCK; N = 3 per block]

Test of Homogeneity of Variances (DV)
Levene Statistic   df1   df2   Sig.
.552               4     10    .702

Tests of Normality (DV)
         Shapiro-Wilk
BLOCK    Statistic   df   Sig.
1.00     .993        3    .843
2.00     1.000       3    1.000
3.00     .907        3    .407
4.00     .991        3    .817
5.00     .987        3    .780
But with three observations per block, these tests are essentially
worthless!
[Figure: DV by method (Utility, Worry, Comparison), one line per block (Block 1 to Block 5)]
o Structural model for a randomized block design with one factor and one
block:
$Y_{ij} = \mu + \alpha_j + \beta_i + \varepsilon_{ij}$
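                                           E(MS), Treatments Fixed                                   E(MS), Treatments Random
Source      SS        df            MS
Treatment   SSA       a-1           MSA    $\sigma_\varepsilon^2 + \frac{bl\sum\alpha_j^2}{a-1}$     $\sigma_\varepsilon^2 + bl\,\sigma_\alpha^2$
Blocks      SSBL      bl-1          MSBL   $\sigma_\varepsilon^2 + \frac{a\sum\beta_i^2}{bl-1}$      $\sigma_\varepsilon^2 + \frac{a\sum\beta_i^2}{bl-1}$
Error       SSError   (a-1)(bl-1)   MSE    $\sigma_\varepsilon^2$                                    $\sigma_\varepsilon^2$
Total       SST       N-1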
Dependent Variable: DV
Type III Sum
Source of Squares df Mean Square F Sig.
Corrected Model 374.133a 6 62.356 20.901 .000
Intercept 1500.000 1 1500.000 502.793 .000
TREAT 202.800 2 101.400 33.989 .000
BLOCK 171.333 4 42.833 14.358 .001
Error 23.867 8 2.983
Total 1898.000 15
Corrected Total 398.000 14
a. R Squared = .940 (Adjusted R Squared = .895)
Note that post-hoc tests on the marginal treatment means are required
to identify the effect
DV
Sum of
Squares df Mean Square F Sig.
Between Groups 202.800 2 101.400 6.234 .014
Within Groups 195.200 12 16.267
Total 398.000 14
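The gain from blocking can be seen by computing both analyses from the same scores. A minimal Python sketch, assuming the 5-block by 3-method data table from this example (variable names are illustrative):

```python
# Rows are blocks 1 (oldest) to 5 (youngest); columns are
# Utility, Worry, Comparison.
scores = [
    [1, 5, 8],
    [2, 8, 14],
    [7, 9, 16],
    [6, 13, 18],
    [12, 14, 17],
]
bl, a = len(scores), len(scores[0])
grand = sum(map(sum, scores)) / (a * bl)

treat_means = [sum(row[j] for row in scores) / bl for j in range(a)]
block_means = [sum(row) / a for row in scores]

ss_total = sum((y - grand) ** 2 for row in scores for y in row)
ss_treat = bl * sum((m - grand) ** 2 for m in treat_means)
ss_block = a * sum((m - grand) ** 2 for m in block_means)
ss_resid = ss_total - ss_treat - ss_block  # what is left after removing blocks

# Randomized block analysis: residual error with (a-1)(bl-1) df.
f_blocked = (ss_treat / (a - 1)) / (ss_resid / ((a - 1) * (bl - 1)))
# One-way analysis ignoring blocks: block variability stays in the error term.
f_unblocked = (ss_treat / (a - 1)) / ((ss_total - ss_treat) / (a * bl - a))

print(round(ss_treat, 3), round(ss_block, 3), round(ss_resid, 3))
print(round(f_blocked, 3), round(f_unblocked, 3))
```

The treatment SS is identical in both analyses; the F ratio is larger in the blocked analysis because the block SS has been removed from the error term.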
[Figure: boxplots of DV by BLOCK (N = 3 per block) and by FAT (N = 5 per level)]

Test of Homogeneity of Variance (DV)
                                       Levene Statistic   df1   df2      Sig.
Based on Mean                          .336               2     12       .721
Based on Median                        .047               2     12       .954
Based on Median and with adjusted df   .047               2     11.893   .954
Based on trimmed mean                  .302               2     12       .745

[Figure: DV by level of fat in the diet (Extreme, Fair, Moderate), one line per age block (Age 15-24, 25-34, 35-44, 45-54, 55-64)]
o To examine the effect of fat in the diet on plasma lipid levels, let's
conduct a randomized block ANOVA
Dependent Variable: DV
Type III Sum
Source of Squares df Mean Square F Sig.
Corrected Model 2.045a 6 .341 141.102 .000
Intercept 12.440 1 12.440 5151.017 .000
FAT .626 2 .313 129.527 .000
BLOCK 1.419 4 .355 146.890 .000
Error 1.932E-02 8 2.415E-03
Total 14.504 15
Corrected Total 2.064 14
a. R Squared = .991 (Adjusted R Squared = .984)
Dependent Variable: DV
Tukey HSD
Mean
Difference 95% Confidence Interval
(I) FAT (J) FAT (I-J) Std. Error Sig. Lower Bound Upper Bound
1.00 2.00 .1180* .03108 .013 .0292 .2068
3.00 .4800* .03108 .000 .3912 .5688
2.00 1.00 -.1180* .03108 .013 -.2068 -.0292
3.00 .3620* .03108 .000 .2732 .4508
3.00 1.00 -.4800* .03108 .000 -.5688 -.3912
2.00 -.3620* .03108 .000 -.4508 -.2732
Based on observed means.
*. The mean difference is significant at the .050 level.
DV
Sum of
Squares df Mean Square F Sig.
Between Groups .626 2 .313 2.610 .115
Within Groups 1.438 12 .120
Total 2.064 14
o What would happen if we forgot this was a randomized block design, and
attempted to analyze it as a factorial design?
UNIANOVA dv BY fat block
/DESIGN = fat block fat*block.
Tests of Between-Subjects Effects
Dependent Variable: DV
Type III Sum
Source of Squares df Mean Square F Sig.
Corrected Model 2.064a 14 .147 . .
Intercept 12.440 1 12.440 . .
FAT .626 2 .313 . .
BLOCK 1.419 4 .355 . .
FAT * BLOCK 1.932E-02 8 2.415E-03 . .
Error .000 0 .
Total 14.504 15
Corrected Total 2.064 14
a. R Squared = 1.000 (Adjusted R Squared = .)
Puzzle Type
Block P1 P2 P3 P4 P5 P6
1 5 14 8 10 11 6
2 7 10 7 9 12 5
3 11 9 10 11 14 6
4 9 10 6 13 15 7
5 13 12 7 14 16 11
6 7 9 8 6 11 5
7 10 11 8 12 13 8
8 4 8 5 7 9 4
9 14 13 11 15 17 12
10 9 9 8 10 14 9
By factor
[Figure: boxplots of DV by PUZZLE; N = 10 per puzzle type]

Tests of Normality (DV)
         Shapiro-Wilk
PUZZLE   Statistic   df   Sig.
1.00     .970        10   .891
2.00     .924        10   .394
3.00     .941        10   .560
4.00     .974        10   .925
5.00     .979        10   .959
6.00     .927        10   .415

[Figure: boxplots of DV by BLOCK (1.00 to 10.00)]

Tests of Normality by block (only the last row survives)
BLOCK    Statistic   df   Sig.
10.00    .750        6    .020

Test of Homogeneity of Variance (DV)
                Levene Statistic   df1   df2   Sig.
Based on Mean   .521               9     50    .852

[Figure: DV by puzzle type (P1 to P6)]
Dependent Variable: DV
Type III Sum
Source of Squares df Mean Square F Sig.
Corrected Model 488.000a 14 34.857 15.121 .000
Intercept 5684.267 1 5684.267 2465.861 .000
PUZZLE 238.933 5 47.787 20.730 .000
BLOCK 249.067 9 27.674 12.005 .000
Error 103.733 45 2.305
Total 6276.000 60
Corrected Total 591.733 59
a. R Squared = .825 (Adjusted R Squared = .770)
Dependent Variable: DV
Tukey HSD
Mean
Difference 95% Confidence Interval
(I) PUZZLE (J) PUZZLE (I-J) Std. Error Sig. Lower Bound Upper Bound
1.00 2.00 -1.6000 .67900 .194 -3.6207 .4207
3.00 1.1000 .67900 .590 -.9207 3.1207
4.00 -1.8000 .67900 .106 -3.8207 .2207
5.00 -4.3000 .67900 .000 -6.3207 -2.2793
6.00 1.6000 .67900 .194 -.4207 3.6207
2.00 3.00 2.7000 .67900 .003 .6793 4.7207
4.00 -.2000 .67900 1.000 -2.2207 1.8207
5.00 -2.7000 .67900 .003 -4.7207 -.6793
6.00 3.2000 .67900 .000 1.1793 5.2207
3.00 4.00 -2.9000 .67900 .001 -4.9207 -.8793
5.00 -5.4000 .67900 .000 -7.4207 -3.3793
6.00 .5000 .67900 .976 -1.5207 2.5207
4.00 5.00 -2.5000 .67900 .008 -4.5207 -.4793
6.00 3.4000 .67900 .000 1.3793 5.4207
5.00 6.00 5.9000 .67900 .000 3.8793 7.9207
Based on observed means.
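$StdError(\hat{\psi}) = \sqrt{MS_{Error} \sum_{j=1}^{a} \frac{c_j^2}{n_j}}$
$t_{observed} = \frac{\hat{\psi}}{\text{standard error}(\hat{\psi})} = \frac{\sum c_j \bar{X}_{.j.}}{\sqrt{MSE \sum \frac{c_j^2}{n_j}}}$
$SS(\hat{\psi}) = \frac{\hat{\psi}^2}{\sum_j \frac{c_j^2}{n_j}}$
$F(1, df_w) = \frac{SSC / df_c}{SSE / df_w} = \frac{SSC}{MSE}$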
Dependent Variable: DV
PUZZLE Mean Std. Deviation N
1.00 8.9000 3.24722 10
2.00 10.5000 1.95789 10
3.00 7.8000 1.75119 10
4.00 10.7000 2.90784 10
5.00 13.2000 2.48551 10
6.00 7.3000 2.66875 10
Total 9.7333 3.16692 60
$\psi_2: F(1,45) = \frac{.8}{2.305} = 0.35, p = .56$
$\psi_3: F(1,45) = \frac{45.067}{2.305} = 19.55, p < .01$
$\psi_4: F(1,45) = \frac{93.025}{2.305} = 40.36, p < .01$
o Note that if these were post-hoc tests, then we would need to apply the
Tukey HSD or Scheffé correction.
As shown in the last SPSS output, when there is one participant per cell, the
SS for the interaction is the error term. Some authors create ANOVA tables
with no error term, and use the SS(BL*A) to test the effect of A. The only
difference in these approaches is the labeling of the error term.
If the blocking variable is not related to the DV, then you actually lose
power by including it in the design.
Blocked Design
Source      SS        df            MS     F
Treatment   SSA       a-1           MSA    F[(a-1), (N-a-bl+1)] = MSA/MSE
Blocks      0         bl-1          MSBL
Error       SSError   (a-1)(bl-1)   MSE
Total       SST       N-1

Standard Design
Source      SS        df     MS     F
Treatment   SSA       a-1    MSA    F[(a-1), (N-a)] = MSA/MSE
Within      SSError   N-a    MSE
Total       SST       N-1
o When SSBL = 0, the error sum of squares is the same in both designs, so
the F-ratios differ only through their error degrees of freedom.
o There are fewer degrees of freedom in the error term for the blocked
design (N-a-bl+1) than in the standard design (N-a). The loss of these
bl-1 df results in lower power for the blocked design.
o In reality, the SSBL will never be exactly zero, but when SSBL is small
and the number of blocks is large, you will lose power.
o If you have more than 1 observation per cell, then you have a factorial
design. You can calculate a SS(Bl*A) and test the interaction.
If you want to block on two factors, you can use the same procedure outlined
here. Simply combine the two factors into one block. For example, to block
on age and education:
Young and no education
Young and education
Old and no education
Old and education