ANalysis Of VAriance
Basic concepts
One-way analysis of variance to test for differences among the means of several groups
Two-way analysis of variance and interpreting the interaction
QAM II by Gaurav Garg (IIM Lucknow)
The difference between two means can be examined using a t test or a Z test.
If we have more than 2 samples, we wish to test the hypothesis that all the samples are drawn from populations having the same mean,
that is, that all population means are the same.
For this, we use ANOVA.
Example:
There are 5 varieties of a fertilizer.
Each variety is applied to some plots of wheat.
The yield of wheat on each of the plots is recorded.
We wish to test whether the effects of these varieties of fertilizer on yield are the same,
given that all other conditions are the same.
This is tested by ANOVA.
Thus, the basic purpose of ANOVA is to test the homogeneity of several means.
Example:
From time to time, unknown to its employees, the research department at a multinational bank observes various employees for work productivity.
Recently, this department wanted to check whether the 4 tellers at a branch serve, on average, the same number of customers per hour.
The research manager observed each teller for a certain number of hours.
The following table gives the number of customers served by the 4 tellers during each of the observed hours:
Teller A: 19 21 26 14 18
Teller B: 14 16 14 13 17 13
Teller C: 11 14 21 13 16 18
Teller D: 24 19 21 26 20
The average numbers of customers served per hour by these 4 tellers are:
A: 19.6, B: 14.5, C: 15.5, D: 22
Can you conclude that the average number of customers served per hour is the same for each of these 4 tellers?
Or do the averages differ significantly?
ANOVA is essentially a procedure for testing the difference among various groups of data for homogeneity.
At its simplest, ANOVA tests the following hypotheses:
$H_0$: The means of all the groups are equal.
$H_1$: Not all the means are equal.
  The test doesn't say how or which ones differ.
  We can follow up with multiple comparisons.
[Figure: population distributions with all means equal, $\mu_1 = \mu_2 = \mu_3$, or with not all means equal.]
Assumptions of ANOVA
Samples are randomly and independently drawn.
Each population is approximately normal.
  This may be checked by looking at histograms or normal Q-Q plots.
Standard deviations of the populations are approximately equal.
  Rule of thumb: the ratio of the largest to the smallest sample std. dev. must be less than 2:1.
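The rule of thumb for equal standard deviations can be checked numerically. The following is a minimal sketch using the teller data from the earlier example; the function name `sd_ratio` is an illustrative choice, not part of the lecture, and the check is only a rough screen, not a formal test.

```python
from statistics import stdev

# Customers served per observed hour by each teller (data from the example).
tellers = {
    "A": [19, 21, 26, 14, 18],
    "B": [14, 16, 14, 13, 17, 13],
    "C": [11, 14, 21, 13, 16, 18],
    "D": [24, 19, 21, 26, 20],
}

def sd_ratio(groups):
    """Ratio of the largest to the smallest sample standard deviation."""
    sds = [stdev(values) for values in groups.values()]
    return max(sds) / min(sds)

ratio = sd_ratio(tellers)
print(f"largest / smallest sample std. dev. = {ratio:.2f}")
print("within the 2:1 rule of thumb" if ratio < 2 else "outside the 2:1 rule of thumb")
```

A ratio outside 2:1 does not by itself invalidate the analysis; it suggests looking more closely at the spread of the groups before relying on the F test.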
One-Way Classification
Let X be a random variable.
The values of X are affected by different levels of one factor.
These different levels may be termed treatments.
Let there be k such treatments.
Let n observations be collected on X.
These n observations are grouped on some basis into k groups (treatments) of sizes $n_1, n_2, \ldots, n_k$, respectively.
$$n = n_1 + n_2 + \cdots + n_k$$
$H_0$: All means are the same.
$H_1$: Not all means are the same (at least one mean is different from the others).
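Treatment   Observations                               Total    Mean
1           $x_{11}, x_{12}, \ldots, x_{1n_1}$         $T_1$    $\bar{x}_1$
2           $x_{21}, x_{22}, \ldots, x_{2n_2}$         $T_2$    $\bar{x}_2$
...
k           $x_{k1}, x_{k2}, \ldots, x_{kn_k}$         $T_k$    $\bar{x}_k$
Grand total                                            G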
Overall variation in the data is represented by the Total Sum of Squares (TSS):
$$TSS = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2, \qquad \bar{x} = \frac{1}{n} \sum_{i=1}^{k} \sum_{j=1}^{n_i} x_{ij}, \qquad n = n_1 + n_2 + \cdots + n_k$$
TSS is partitioned into two parts:
Between Groups Variation, or Sum of Squares due to Treatments (SST):
$$SST = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2, \qquad \bar{x}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} x_{ij}, \quad i = 1, 2, \ldots, k$$
Within Groups Variation, or Sum of Squares due to Error (SSE):
$$SSE = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$
These formulae can be simplified as below:
$$CF = \frac{G^2}{n}$$
$$TSS = \sum_{i} \sum_{j} x_{ij}^2 - CF$$
$$SST = \sum_{i} \frac{T_i^2}{n_i} - CF$$
$$SSE = TSS - SST$$
Then, we do the analysis of the partitioned or separated variations.
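As an illustration of the simplified formulae, here is a minimal sketch that computes TSS, SST and SSE for the teller data from the earlier example, taking the correction factor as CF = G^2 / n. The function name `one_way_sums_of_squares` is hypothetical and chosen only for this sketch.

```python
def one_way_sums_of_squares(groups):
    """Return (TSS, SST, SSE) via the correction-factor shortcut formulae."""
    n = sum(len(g) for g in groups)              # total number of observations
    grand_total = sum(sum(g) for g in groups)    # G
    cf = grand_total ** 2 / n                    # CF = G^2 / n
    tss = sum(x * x for g in groups for x in g) - cf
    sst = sum(sum(g) ** 2 / len(g) for g in groups) - cf   # sum of T_i^2 / n_i, minus CF
    sse = tss - sst
    return tss, sst, sse

tellers = [
    [19, 21, 26, 14, 18],        # Teller A
    [14, 16, 14, 13, 17, 13],    # Teller B
    [11, 14, 21, 13, 16, 18],    # Teller C
    [24, 19, 21, 26, 20],        # Teller D
]

tss, sst, sse = one_way_sums_of_squares(tellers)
k = len(tellers)
n = sum(len(g) for g in tellers)
mse = sse / (n - k)              # mean sum of squares due to error
print(f"TSS = {tss:.3f}, SST = {sst:.3f}, SSE = {sse:.3f}, MSE = {mse:.3f}")
```

The MSE obtained this way agrees with the value 10.567 quoted for the tellers example.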
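Groups i and j differ significantly if
$$|\bar{x}_i - \bar{x}_j| > \text{Critical Difference (C.D.)}$$
where
$$\text{C.D.} = \left[ MSE \left( \frac{1}{n_i} + \frac{1}{n_j} \right) (k-1) \, F_{\alpha;\, k-1,\, n-k} \right]^{1/2}$$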
Example: Consider the 4 tellers example.
For $\alpha = 0.05$, the critical value $F_{(0.05)}$ at (3, 18) d.f. = 3.1599.
We obtained MSE = 10.567.
C.D. for tellers A and B = 6.0605.
Absolute difference of the sample means of tellers A and B = 5.1 < C.D.
So, tellers A and B do not differ significantly.
Teller   Mean   Sample size
A        19.6   5
B        14.5   6
C        15.5   6
D        22     5
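The same comparison can be repeated for every pair of tellers. The sketch below assumes the critical difference has the form C.D. = sqrt(MSE (1/n_i + 1/n_j) (k - 1) F), which reproduces the value 6.0605 quoted for tellers A and B; the MSE and the critical F value are taken from the example rather than recomputed, and the helper name `critical_difference` is illustrative.

```python
from itertools import combinations
from math import sqrt

means = {"A": 19.6, "B": 14.5, "C": 15.5, "D": 22.0}
sizes = {"A": 5, "B": 6, "C": 6, "D": 5}
mse = 10.567       # from the example
f_crit = 3.1599    # F(0.05) at (3, 18) d.f., from the example
k = len(means)     # number of groups

def critical_difference(n_i, n_j):
    """Critical difference for two groups of sizes n_i and n_j."""
    return sqrt(mse * (1 / n_i + 1 / n_j) * (k - 1) * f_crit)

significant_pairs = []
for i, j in combinations(means, 2):
    cd = critical_difference(sizes[i], sizes[j])
    diff = abs(means[i] - means[j])
    if diff > cd:
        significant_pairs.append((i, j))
    print(f"{i} vs {j}: |difference| = {diff:.1f}, C.D. = {cd:.4f}")

print("pairs that differ significantly:", significant_pairs)
```

Because the group sizes are unequal, the critical difference changes from pair to pair, so each pair needs its own C.D.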
Example:
You want to see if three different golf clubs yield different distances.
You randomly select five measurements from trials on an automated driving machine for each club.
At the 0.05 significance level, is there a difference in mean distance?
Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204
ANOVA Table
Source of Variation           Sum of Squares   Degrees of Freedom   Mean Sum of Squares   Variance Ratio
Treatments (Between Groups)   4716.4           2                    2358.2                25.275
Error (Within Groups)         1119.6           12                   93.3
Total                         5836.0           14
Distribution of the test statistic: $F_c \sim F(2, 12)$
For the 5% level of significance, the critical value $F_{(\alpha)}$ = 3.89.
Since $F_c > F_{(\alpha)}$, we reject $H_0$ at the 5% level of significance.
Which pair(s) of clubs differ significantly?
Critical Difference?
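[Figure: F distribution with the rejection region of area $\alpha = 0.05$ beyond the critical value $F_{(\alpha)} = 3.89$; the computed value $F_c = 25.275$ falls in the rejection region.]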
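The ANOVA table for the golf club example can be reproduced from the raw distances. This is a minimal sketch using the correction-factor formulae; the critical value 3.89 is taken from the example rather than computed, and the variable names are illustrative.

```python
clubs = [
    [254, 263, 241, 237, 251],   # Club 1
    [234, 218, 235, 227, 216],   # Club 2
    [200, 222, 197, 206, 204],   # Club 3
]

k = len(clubs)                               # number of treatments
n = sum(len(c) for c in clubs)               # total number of observations
cf = sum(sum(c) for c in clubs) ** 2 / n     # CF = G^2 / n

tss = sum(x * x for c in clubs for x in c) - cf
sst = sum(sum(c) ** 2 / len(c) for c in clubs) - cf
sse = tss - sst

mst = sst / (k - 1)      # mean sum of squares, treatments
mse = sse / (n - k)      # mean sum of squares, error
f_c = mst / mse          # variance ratio

f_crit = 3.89            # F(0.05) at (2, 12) d.f., from the example
reject_h0 = f_c > f_crit

print(f"SST = {sst:.1f} (df {k - 1}), SSE = {sse:.1f} (df {n - k}), TSS = {tss:.1f}")
print(f"F_c = {f_c:.3f}, reject H0 at 5%: {reject_h0}")
```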
Example:
The data in the table (in thousands of dollars) were extracted from Business Week's 1986 Executive Compensation Scoreboard.
Assume that the data represent independent samples of 1986 total cash compensations for eight corporate executives in each of the three industries: banks, utilities, and office equipment/computers.
Banks   Utilities   Office Equipment/Computers
755     520         438
712     295         828
845     553         622
985     950         453
1300    930         562
1143    428         348
733     510         405
1189    864         938
A partial ANOVA table is given here. Complete the table.
Is there evidence of a difference among the means of 1986 total cash compensations for the three groups of corporate executives? Test at the 1% level of significance.
Find the industry (industries) for which mean compensation is (are) different from the others at the 1% level.
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Sum of Squares   Variance Ratio
Industry
Error                 1,115,232.50
Total                 1,800,361.83
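One way to fill in the missing entries is to use the identity SST = TSS - SSE together with the degrees of freedom for 3 groups of 8 observations. The sketch below starts from the two sums of squares given in the partial table; the variable names are illustrative, and the comparison with the 1% critical value of F at (2, 21) d.f. is left to a table lookup.

```python
ss_total = 1_800_361.83     # from the partial table
ss_error = 1_115_232.50     # from the partial table

k = 3                       # industries
per_group = 8               # executives sampled per industry
n = k * per_group

ss_industry = ss_total - ss_error       # between-groups sum of squares
df_industry = k - 1
df_error = n - k
df_total = n - 1

ms_industry = ss_industry / df_industry
ms_error = ss_error / df_error
variance_ratio = ms_industry / ms_error

print(f"Industry: SS = {ss_industry:,.2f}, df = {df_industry}, MS = {ms_industry:,.2f}")
print(f"Error:    SS = {ss_error:,.2f}, df = {df_error}, MS = {ms_error:,.2f}")
print(f"Total:    SS = {ss_total:,.2f}, df = {df_total}")
print(f"Variance ratio = {variance_ratio:.4f}")
```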
Two-Way Classification
Example:
A chef was experiencing difficulty in getting types of pasta to be al dente.
She conducts an experiment with two types of pasta, American and Italian.
150 grams of pasta of each type were used.
Samples were cooked for either 4 minutes or 8 minutes, because cooking enables pasta to absorb water.
The weights of the cooked pasta were measured.
The results for two replicates for each type and cooking time are as follows:
            COOKING TIME
TYPE        4 Minutes   8 Minutes
American    265, 270    310, 320
Italian     250, 245    300, 305
Is there an effect on the cooked pasta (in terms of weight of cooked pasta)
  due to type of pasta?
  due to cooking time?
Is there an interaction effect between type of pasta and cooking time?
Is any particular combination of pasta type and cooking time significantly different?
Two-way analysis of variance is an extension of one-way analysis of variance.
The variation is controlled by two factors.
The values of the random variable X are affected by different levels of two factors.
Assumptions
The populations are normally distributed.
The samples are independent.
The variances of the populations are equal.
$H_{A0}$: All levels of Factor A have the same effect.
$H_{A1}$: Not all levels of Factor A have the same effect.
$H_{B0}$: All levels of Factor B have the same effect.
$H_{B1}$: Not all levels of Factor B have the same effect.
$H_{AB0}$: There is no interaction effect.
$H_{AB1}$: There is an interaction effect.
a = number of levels of Factor A
b = number of levels of Factor B
m = number of observations (repetitions) per cell
n = abm = total number of observations
$x_{ijk}$ = k-th observation of the cell receiving the i-th level of Factor A and the j-th level of Factor B
G = grand total
$T_{Ai}$ = sum of observations receiving the i-th level of Factor A
$T_{Bj}$ = sum of observations receiving the j-th level of Factor B
$T_{ij}$ = sum of observations receiving the i-th level of Factor A as well as the j-th level of Factor B
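$$CF = \frac{G^2}{abm}$$
$$TSS = \sum_{i} \sum_{j} \sum_{k} x_{ijk}^2 - CF$$
$$SSA = \frac{1}{mb} \sum_{i} T_{Ai}^2 - CF$$
$$SSB = \frac{1}{ma} \sum_{j} T_{Bj}^2 - CF$$
$$SSAB = \frac{1}{m} \sum_{i} \sum_{j} T_{ij}^2 - SSA - SSB - CF$$
$$SSE = TSS - SSA - SSB - SSAB$$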
            4 Minutes                   8 Minutes                   Total
American    265, 270 ($T_{11}$ = 535)   310, 320 ($T_{12}$ = 630)   $T_{A1}$ = 1165
Italian     250, 245 ($T_{21}$ = 495)   300, 305 ($T_{22}$ = 605)   $T_{A2}$ = 1100
Total       $T_{B1}$ = 1030             $T_{B2}$ = 1235             G = 2265
CF = 641278.125
TSS = 647175 - 641278.125 = 5896.875
SSA = 641806.25 - 641278.125 = 528.125
SSB = 646531.25 - 641278.125 = 5253.125
SSAB = 647087.5 - 528.125 - 5253.125 - 641278.125 = 28.125
SSE = 87.5
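The pasta calculations can be checked with a short script. This is a minimal sketch of the two-way sums of squares with m replicates per cell, with Factor A as pasta type and Factor B as cooking time; the function name `two_way_sums_of_squares` and the nested-list layout are assumptions made for illustration.

```python
def two_way_sums_of_squares(cells):
    """cells[i][j] holds the m replicates for level i of Factor A and level j of Factor B."""
    a, b, m = len(cells), len(cells[0]), len(cells[0][0])
    t_cell = [[sum(cell) for cell in row] for row in cells]   # T_ij
    t_a = [sum(row) for row in t_cell]                         # T_Ai
    t_b = [sum(col) for col in zip(*t_cell)]                   # T_Bj
    g = sum(t_a)                                               # grand total G
    cf = g ** 2 / (a * b * m)
    tss = sum(x * x for row in cells for cell in row for x in cell) - cf
    ssa = sum(t * t for t in t_a) / (m * b) - cf
    ssb = sum(t * t for t in t_b) / (m * a) - cf
    ssab = sum(t * t for row in t_cell for t in row) / m - ssa - ssb - cf
    sse = tss - ssa - ssb - ssab
    return {"CF": cf, "TSS": tss, "SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse}

pasta = [
    [[265, 270], [310, 320]],   # American: 4 minutes, 8 minutes
    [[250, 245], [300, 305]],   # Italian:  4 minutes, 8 minutes
]

ss = two_way_sums_of_squares(pasta)
for name, value in ss.items():
    print(f"{name} = {value}")
```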
ANOVA Table
Source of Variation   SS         df   MS         Variance Ratio   Critical F
Factor A              528.125    1    528.125    24.14286         7.70865
Factor B              5253.125   1    5253.125   240.1429         7.70865
Interaction           28.125     1    28.125     1.285714         7.70865
Error                 87.5       4    21.875
Total                 5896.875   7
$F_{Ac} \sim F(1, 4)$
$F_{Bc} \sim F(1, 4)$
$F_{ABc} \sim F(1, 4)$
Critical value at the 5% level of significance = 7.70865
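The variance ratios and the resulting decisions can be derived from the sums of squares as follows. This sketch takes the sums of squares and the critical value 7.70865 from the pasta example as given; a single critical value serves all three tests here only because each effect has (1, 4) degrees of freedom.

```python
a, b, m = 2, 2, 2                       # levels of A, levels of B, replicates per cell
sums_of_squares = {"Factor A": 528.125, "Factor B": 5253.125, "Interaction": 28.125}
sse = 87.5

degrees_of_freedom = {
    "Factor A": a - 1,
    "Factor B": b - 1,
    "Interaction": (a - 1) * (b - 1),
}
df_error = a * b * (m - 1)
mse = sse / df_error                    # mean sum of squares due to error

f_crit = 7.70865                        # F(0.05) at (1, 4) d.f., from the example

results = {}
for source, ss in sums_of_squares.items():
    ratio = (ss / degrees_of_freedom[source]) / mse
    results[source] = (ratio, ratio > f_crit)
    verdict = "significant" if ratio > f_crit else "not significant"
    print(f"{source}: variance ratio = {ratio:.5f} ({verdict} at 5%)")
```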
In the pasta example:
[Figure: two interaction plots of Mean Response against Factor A Levels, with one line for each of Factor B Level 1, Level 2 and Level 3; one panel is titled "No Significant Interaction" and the other "Significant Interaction".]
Example:
A company stamps gaskets out of sheets of rubber, plastic and cork.
The manufacturer wants to determine whether
  one machine is more productive than the other
  one machine is more productive in producing rubber gaskets while the other is more productive in producing plastic or cork gaskets.
The manufacturer decides to conduct an experiment using 3 types of gasket material.
Each machine is operated for 3 one-hour time periods for each of the gasket materials, with the 18 one-hour time periods assigned to the 6 machine-material combinations in random order.
The purpose of randomization is to eliminate the possibility that uncontrolled environmental factors might bias the results.
The data (no. of gaskets in thousands) are as follows:
          Gasket Material
Machine   Cork               Rubber             Plastic            Total
I         4.31, 4.27, 4.40   3.36, 3.42, 3.48   4.01, 3.94, 3.89   35.08
II        3.94, 3.81, 3.99   3.91, 3.80, 3.85   3.48, 3.53, 3.42   33.73
Total     24.72              21.82              22.27              68.81
Help the manufacturer.
Use the 5% level of significance.
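A possible starting point for the gasket example is sketched below. It applies the two-way correction-factor formulae with machine as Factor A and material as Factor B, and computes the three variance ratios; the variable names are illustrative, and the comparison with critical values is left to the reader.

```python
# Gaskets produced (thousands) per one-hour period; rows are machines, columns are materials.
materials = ["Cork", "Rubber", "Plastic"]
gaskets = {
    "I":  [[4.31, 4.27, 4.40], [3.36, 3.42, 3.48], [4.01, 3.94, 3.89]],
    "II": [[3.94, 3.81, 3.99], [3.91, 3.80, 3.85], [3.48, 3.53, 3.42]],
}

cells = list(gaskets.values())
a, b, m = len(cells), len(materials), len(cells[0][0])

t_cell = [[sum(cell) for cell in row] for row in cells]   # T_ij
t_a = [sum(row) for row in t_cell]                         # machine totals
t_b = [sum(col) for col in zip(*t_cell)]                   # material totals
g = sum(t_a)                                               # grand total
cf = g ** 2 / (a * b * m)

tss = sum(x * x for row in cells for cell in row for x in cell) - cf
ssa = sum(t * t for t in t_a) / (m * b) - cf
ssb = sum(t * t for t in t_b) / (m * a) - cf
ssab = sum(t * t for row in t_cell for t in row) / m - ssa - ssb - cf
sse = tss - ssa - ssb - ssab

mse = sse / (a * b * (m - 1))
f_machine = (ssa / (a - 1)) / mse
f_material = (ssb / (b - 1)) / mse
f_interaction = (ssab / ((a - 1) * (b - 1))) / mse

print(f"SSA = {ssa:.5f}, SSB = {ssb:.5f}, SSAB = {ssab:.5f}, SSE = {sse:.5f}")
print(f"F (machine) = {f_machine:.2f}, F (material) = {f_material:.2f}, "
      f"F (interaction) = {f_interaction:.2f}")
```

The ratios would then be compared with the 5% critical values of F at (1, 12) d.f. for machines and (2, 12) d.f. for materials and for the interaction.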
When the interaction effects are significant,
  the hypothesis testing of main effects becomes complicated.
  We cannot directly conclude that the main effects are not significant.
  Which combination is the best can be judged from the plot.
When the interaction effects are not significant but the main effects are significant,
  we can determine the particular levels of the factors that differ significantly.
The method is the same as used in one-way classification.
For the levels of Factor A:
  Obtain the means of all levels of Factor A:
  mean of the i-th level of Factor A = $T_{Ai} / bm$
  The i-th and j-th levels differ significantly if
  $$|T_{Ai} - T_{Aj}| / bm > CD$$
  where CD is given by
  $$CD = \left[ MSE \left( \frac{1}{bm} + \frac{1}{bm} \right) (a-1) \, F_{\alpha;\, a-1,\, ab(m-1)} \right]^{1/2}$$
The method for the levels of Factor B is similar.
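To make the critical difference for factor levels concrete, the sketch below applies it to the two pasta types (Factor A) using the totals, MSE and critical F value from the pasta example. It assumes CD = sqrt(MSE (1/bm + 1/bm) (a - 1) F) with F taken at (a - 1, ab(m - 1)) d.f.; with only two levels the comparison adds little beyond the F test and serves purely as an illustration.

```python
from math import sqrt

a, b, m = 2, 2, 2            # levels of A, levels of B, replicates per cell
mse = 21.875                 # from the pasta ANOVA table
f_crit = 7.70865             # F(0.05) at (a - 1, ab(m - 1)) = (1, 4) d.f.

totals_a = {"American": 1165, "Italian": 1100}   # T_A1, T_A2

# Each level mean of Factor A is based on b * m observations.
cd = sqrt(mse * (1 / (b * m) + 1 / (b * m)) * (a - 1) * f_crit)
diff = abs(totals_a["American"] - totals_a["Italian"]) / (b * m)

print(f"|difference of level means| = {diff:.2f}, CD = {cd:.3f}")
print("levels differ significantly" if diff > cd else "levels do not differ significantly")
```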
Example:
Suppose you want to determine whether the brand of laundry detergent used and the water temperature affect the amount of dirt removed from your laundry.
You buy two different brands of detergent, Super and Best,
and choose three different temperature levels: Cold, Warm, Hot.
The amount of dirt removed is given in the following table.
At the 5% level of significance, test if
  the brands of detergent have a significantly different effect on the dirt removed
  the water temperatures have a significantly different effect on the dirt removed
        Cold   Warm   Hot
Super   5      9      10
Best    5      13     12
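With a single observation per cell, an interaction term cannot be separated from the error. The sketch below therefore assumes the usual approach for unreplicated two-way data, which is not spelled out in these slides: the residual TSS - SSA - SSB is used as the error sum of squares with (a - 1)(b - 1) degrees of freedom. The variable names are illustrative, and the critical values still have to be read from F tables.

```python
temperatures = ["Cold", "Warm", "Hot"]
dirt = {
    "Super": [5, 9, 10],
    "Best":  [5, 13, 12],
}

rows = list(dirt.values())
a, b = len(rows), len(temperatures)          # brands, temperature levels

g = sum(sum(r) for r in rows)                # grand total
cf = g ** 2 / (a * b)                        # one observation per cell, so n = ab

tss = sum(x * x for r in rows for x in r) - cf
ss_brand = sum(sum(r) ** 2 for r in rows) / b - cf
ss_temp = sum(sum(col) ** 2 for col in zip(*rows)) / a - cf
sse = tss - ss_brand - ss_temp               # assumption: residual treated as error

df_error = (a - 1) * (b - 1)
mse = sse / df_error
f_brand = (ss_brand / (a - 1)) / mse
f_temp = (ss_temp / (b - 1)) / mse

print(f"SS brand = {ss_brand}, SS temperature = {ss_temp}, SSE = {sse}, TSS = {tss}")
print(f"F (brand) = {f_brand:.2f} on ({a - 1}, {df_error}) d.f.")
print(f"F (temperature) = {f_temp:.2f} on ({b - 1}, {df_error}) d.f.")
```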
Example:
Three varieties of coal were analyzed by four chemists, and the ash content in the varieties was found to be as under:
            Chemists
Varieties   I   II   III   IV
A           8   5    5     7
B           7   6    4     4
C           3   6    5     4
Do the varieties of coal differ significantly in their ash content?
Do the chemists differ significantly in their analysis?
Summary
One-way analysis of variance
  One factor at various levels
  F test for difference in more than two means
  Scheffé's procedure for multiple comparisons
Two-way analysis of variance
  Effects of two factors
  Interaction between two factors
  Multiple comparisons