
30/05/2016

6 BASIC STATISTICAL TOOLS

Produced by: Natural Resources Management and Environment Department
Title: Guidelines for quality management in soil and plant laboratories. (FAO Soils...

There are lies, damn lies, and statistics......
(Anon.)

6.1 Introduction
6.2 Definitions
6.3 Basic Statistics
6.4 Statistical tests

6.1 Introduction
In the preceding chapters basic elements for the proper execution of analytical work such as personnel, laboratory facilities, equipment, and reagents were discussed. Before embarking upon the actual analytical work, however, one more tool for the quality assurance of the work must be dealt with: the statistical operations necessary to control and verify the analytical procedures (Chapter 7) as well as the resulting data (Chapter 8).
It was stated before that making mistakes in analytical work is unavoidable. This is the reason why a complex system of precautions to prevent errors and traps to detect them has to be set up. An important aspect of the quality control is the detection of both random and systematic errors. This can be done by critically looking at the performance of the analysis as a whole and also of the instruments and operators involved in the job. For the detection itself as well as for the quantification of the errors, statistical treatment of data is indispensable.
A multitude of different statistical tools is available, some of them simple, some complicated, and often very specific for certain purposes. In analytical work, the most important common operation is the comparison of data, or sets of data, to quantify accuracy (bias) and precision. Fortunately, with a few simple convenient statistical tools most of the information needed in regular laboratory work can be obtained: the "t-test", the "F-test", and regression analysis. Therefore, examples of these will be given in the ensuing pages.
Clearly, statistics are a tool, not an aim. Simple inspection of data, without statistical treatment, by an experienced and dedicated analyst may be just as useful as statistical figures on the desk of the disinterested. The value of statistics lies with organizing and simplifying data, to permit some objective estimate showing that an analysis is under control or that a change has occurred. Equally important is that the results of these statistical procedures are recorded and can be retrieved.

6.2 Definitions
http://www.fao.org/docrep/w7295e/w7295e08.htm#6.4.5analysisofvariance(anova)

6.2.1 Error
6.2.2 Accuracy
6.2.3 Precision
6.2.4 Bias
Discussing Quality Control implies the use of several terms and concepts with a specific (and sometimes confusing) meaning. Therefore, some of the most important concepts will be defined first.

6.2.1 Error
Error is the collective noun for any departure of the result from the "true" value*. Analytical errors can be:
1. Random or unpredictable deviations between replicates, quantified with the "standard deviation".
2. Systematic or predictable regular deviation from the "true" value, quantified as "mean difference" (i.e. the difference between the true value and the mean of replicate determinations).
3. Constant, unrelated to the concentration of the substance analyzed (the analyte).
4. Proportional, i.e. related to the concentration of the analyte.
* The "true" value of an attribute is by nature indeterminate and often has only a very relative meaning. Particularly in soil science for several attributes there is no such thing as the true value as any value obtained is method-dependent (e.g. cation exchange capacity). Obviously, this does not mean that no adequate analysis serving a purpose is possible. It does, however, emphasize the need for the establishment of standard reference methods and the importance of external QC (see Chapter 9).

6.2.2 Accuracy
The "trueness" or the closeness of the analytical result to the "true" value. It is constituted by a combination of random and systematic errors (precision and bias) and cannot be quantified directly. The test result may be a mean of several values. An accurate determination produces a "true" quantitative value, i.e. it is precise and free of bias.

6.2.3 Precision
The closeness with which results of replicate analyses of a sample agree. It is a measure of dispersion or scattering around the mean value and usually expressed in terms of standard deviation, standard error or a range (difference between the highest and the lowest result).

6.2.4 Bias
The consistent deviation of analytical results from the "true" value caused by systematic errors in a procedure. Bias is the opposite but most used measure for "trueness", which is the agreement of the mean of analytical results with the true value, i.e. excluding the contribution of randomness represented in precision. There are several components contributing to bias:
1. Method bias
The difference between the (mean) test result obtained from a number of laboratories using the same method and an accepted reference value. The method bias may depend on the analyte level.
2. Laboratory bias
The difference between the (mean) test result from a particular laboratory and the accepted reference value.
3. Sample bias
The difference between the mean of replicate test results of a sample and the ("true") value of the target population from which the sample was taken. In practice, for a laboratory this refers mainly to sample preparation, subsampling and weighing techniques. Whether a sample is representative for the population in the field is an extremely important aspect but usually falls outside the responsibility of the laboratory (in some cases laboratories have their own field sampling personnel).
The relationship between these concepts can be expressed in the following equation:
[Figure: equation not reproduced]

The types of errors are illustrated in Fig. 6-1.
Fig. 6-1. Accuracy and precision in laboratory measurements. (Note that the qualifications apply to the mean of results: in c the mean is accurate but some individual results are inaccurate)

6.3 Basic Statistics
6.3.1 Mean
6.3.2 Standard deviation
6.3.3 Relative standard deviation. Coefficient of variation
6.3.4 Confidence limits of a measurement
6.3.5 Propagation of errors
In the discussions of Chapters 7 and 8 basic statistical treatment of data will be considered. Therefore, some understanding of these statistics is essential and they will briefly be discussed here.
The basic assumption to be made is that a set of data, obtained by repeated analysis of the same analyte in the same sample under the same conditions, has a normal or Gaussian distribution. (When the distribution is skewed statistical treatment is more complicated). The primary parameters used are the mean (or average) and the standard deviation (see Fig. 6-2) and the main tools the F-test, the t-test, and regression and correlation analysis.
Fig. 6-2. A Gaussian or normal distribution. The figure shows that (approx.) 68% of the data fall in the range $\bar{x} \pm s$, 95% in the range $\bar{x} \pm 2s$, and 99.7% in the range $\bar{x} \pm 3s$.
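The percentages quoted for the Gaussian curve can be checked numerically. The following is a minimal sketch in Python using only the standard library; the helper name `fraction_within` is introduced here for illustration and does not come from the original text.

```python
from statistics import NormalDist


def fraction_within(k: float) -> float:
    """Fraction of a normal population lying within mean +/- k standard deviations."""
    z = NormalDist()  # standard normal distribution: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(f"within +/- {k}s: {fraction_within(k):.4f}")
```

The exact fractions are slightly different from the rounded 68%, 95% and 99.7%, which is why the caption says "approx.".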
6.3.1 Mean
The average of a set of n data $x_i$:
$$\bar{x} = \frac{\sum x_i}{n}$$ (6.1)

6.3.2 Standard deviation
This is the most commonly used measure of the spread or dispersion of data around the mean. The standard deviation is defined as the square root of the variance (V). The variance is defined as the sum of the squared deviations from the mean, divided by n - 1. Operationally, there are several ways of calculation:
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$ (6.2)

or
$$s = \sqrt{\frac{\sum x_i^2 - (\sum x_i)^2 / n}{n - 1}}$$ (6.3)

or
$$s = \sqrt{\frac{\sum x_i^2 - n \bar{x}^2}{n - 1}}$$ (6.4)

The calculation of the mean and the standard deviation can easily be done on a calculator but most conveniently on a PC with computer programs such as dBASE, Lotus 123, Quattro-Pro, Excel, and others, which have simple ready-to-use functions. (Warning: some programs use n rather than n - 1!).
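The warning about n versus n - 1 is easy to check in practice. The sketch below, in Python, computes the standard deviation from the definition and from a shortcut form based on sums, and compares both with the two standard-library functions. The data values are hypothetical and serve only as an illustration.

```python
import math
import statistics

data = [10.2, 10.7, 10.5, 9.9, 9.0]  # hypothetical replicate results

n = len(data)
mean = sum(data) / n

# Definition: sum of squared deviations from the mean, divided by n - 1
s_definition = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

# Shortcut form that needs only the sum of x and the sum of x squared
s_shortcut = math.sqrt((sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1))

s_sample = statistics.stdev(data)       # divides by n - 1
s_population = statistics.pstdev(data)  # divides by n: the case the warning refers to

print(s_definition, s_shortcut, s_sample, s_population)
```

For small n the difference is noticeable: the n-based value is smaller than the (n - 1)-based value by a factor $\sqrt{(n-1)/n}$.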

6.3.3 Relative standard deviation. Coefficient of variation
Although the standard deviation of analytical data may not vary much over limited ranges of such data, it usually depends on the magnitude of such data: the larger the figures, the larger s. Therefore, for comparison of variations (e.g. precision) it is often more convenient to use the relative standard deviation (RSD) than the standard deviation itself. The RSD is expressed as a fraction, but more usually as a percentage and is then called coefficient of variation (CV). Often, however, these terms are confused.
$$RSD = \frac{s}{\bar{x}}$$ (6.5)
$$CV = \frac{s}{\bar{x}} \times 100\%$$ (6.6)

Note. When needed (e.g. for the F-test, see Eq. 6.11) the variance can, of course, be calculated by squaring the standard deviation:
$$V = s^2$$ (6.7)
6.3.4 Confidence limits of a measurement
The more an analysis or measurement is replicated, the closer the mean $\bar{x}$ of the results will approach the "true" value $\mu$ of the analyte content (assuming absence of bias).
A single analysis of a test sample can be regarded as literally sampling the imaginary set of a multitude of results obtained for that test sample. The uncertainty of such subsampling is expressed by
$$\mu = \bar{x} \pm t \cdot \frac{s}{\sqrt{n}}$$ (6.8)

where
$\mu$ = "true" value (mean of large set of replicates)
$\bar{x}$ = mean of subsamples
t = a statistical value which depends on the number of data and the required confidence (usually 95%).
s = standard deviation of mean of subsamples
n = number of subsamples
(The term $s/\sqrt{n}$ is also known as the standard error of the mean.)

The critical values for t are tabulated in Appendix 1 (they are, therefore, here referred to as ttab). To find the applicable value, the number of degrees of freedom has to be established by: df = n - 1 (see also Section 6.4.2).
Example
For the determination of the clay content in the particle-size analysis, a semi-automatic pipette installation is used with a 20 mL pipette. This volume is approximate and the operation involves the opening and closing of taps. Therefore, the pipette has to be calibrated, i.e. both the accuracy (trueness) and precision have to be established.
A tenfold measurement of the volume yielded the following set of data (in mL):
19.941 19.812 19.829 19.828 19.742
19.797 19.937 19.847 19.885 19.804

The mean is 19.842 mL and the standard deviation 0.0627 mL. According to Appendix 1, for n = 10 ttab = 2.26 (df = 9), and using Eq. (6.8) this calibration yields:
pipette volume = 19.842 ± 2.26 × (0.0627/$\sqrt{10}$) = 19.84 ± 0.04 mL

(Note that the pipette has a systematic deviation from 20 mL as this is outside the found confidence interval. See also bias).
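The pipette calibration can be reproduced with a few lines of Python. This is a sketch using only the standard library; the critical value 2.26 is copied from the text (Appendix 1, df = 9) rather than computed, and the variable names are chosen here for illustration.

```python
import math
import statistics

# Ten replicate volume measurements of the pipette (mL), from the example above
volumes_ml = [19.941, 19.812, 19.829, 19.828, 19.742,
              19.797, 19.937, 19.847, 19.885, 19.804]

T_TAB = 2.26  # two-sided 95 % critical t for df = 9, as quoted from Appendix 1

n = len(volumes_ml)
mean = statistics.mean(volumes_ml)
s = statistics.stdev(volumes_ml)        # n - 1 in the denominator
half_width = T_TAB * s / math.sqrt(n)   # t times the standard error of the mean

lower, upper = mean - half_width, mean + half_width
nominal_inside = lower <= 20.0 <= upper  # is the nominal 20 mL inside the interval?

print(f"mean = {mean:.3f} mL, s = {s:.4f} mL")
print(f"95 % confidence interval: {lower:.3f} to {upper:.3f} mL")
print(f"nominal 20 mL inside interval: {nominal_inside}")
```

Because the nominal volume falls outside the interval, the script arrives at the same conclusion as the text: the pipette shows a systematic deviation.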
In routine analytical work, results are usually single values obtained in batches of several test samples. No laboratory will analyze a test sample 50 times to be confident that the result is reliable. Therefore, the statistical parameters have to be obtained in another way. Most usually this is done by method validation (see Chapter 7) and/or by keeping control charts, which is basically the collection of analytical results from one or
more control samples in each batch (see Chapter 8). Equation (6.8) is then reduced to
$$\mu = x \pm t \cdot s$$ (6.9)

where
$\mu$ = "true" value
x = single measurement
t = applicable ttab (Appendix 1)
s = standard deviation of set of previous measurements.
In Appendix 1 it can be seen that if the set of replicated measurements is large (say > 30), t is close to 2. Therefore, the (95%) confidence of the result x of a single test sample (n = 1 in Eq. 6.8) is approximated by the commonly used and well-known expression
$$\mu = x \pm 2s$$ (6.10)

where s is the previously determined standard deviation of the large set of replicates (see also Fig. 6-2).
Note: This "method-s" or s of a control sample is not a constant and may vary for different test materials, analyte levels, and with analytical conditions.
Running duplicates will, according to Equation (6.8), increase the confidence of the (mean) result by a factor $\sqrt{2}$:
$$\mu = \bar{x} \pm \frac{t \cdot s}{\sqrt{2}}$$

where
$\bar{x}$ = mean of duplicates
s = known standard deviation of large set
Similarly, triplicate analysis will increase the confidence by a factor $\sqrt{3}$, etc. Duplicates are further discussed in Section 8.3.3.

Thus, in summary, Equation (6.8) can be applied in various ways to determine the size of errors (confidence) in analytical work or measurements: single determinations in routine work, determinations for which no previous data exist, certain calibrations, etc.

6.3.5 Propagation of errors
6.3.5.1 Propagation of random errors
6.3.5.2 Propagation of systematic errors
The final result of an analysis is often calculated from several measurements performed during the procedure (weighing, calibration, dilution, titration, instrument readings, moisture correction, etc.). As was indicated in Section 6.2, the total error in an analytical result is an adding-up of the sub-errors made in the various steps. For
daily practice, the bias and precision of the whole method are usually the most relevant parameters (obtained from validation, Chapter 7, or from control charts, Chapter 8). However, sometimes it is useful to get an insight into the contributions of the subprocedures (and then these have to be determined separately), for instance if one wants to change (part of) the method.
Because the "adding-up" of errors is usually not a simple summation, this will be discussed. The main distinction to be made is between random errors (precision) and systematic errors (bias).
6.3.5.1 Propagation of random errors
In estimating the total random error from factors in a final calculation, the treatment of summation or subtraction of factors is different from that of multiplication or division.
1. Summation calculations
If the final result x is obtained from the sum (or difference) of (sub)measurements a, b, c, etc.:
x = a + b + c + ...
then the total precision is expressed by the standard deviation obtained by taking the square root of the sum of individual variances (squares of standard deviation):
$$s_x = \sqrt{s_a^2 + s_b^2 + s_c^2 + \dots}$$

If a (sub)measurement has a constant multiplication factor or coefficient (such as an extra dilution), then this is included to calculate the effect of the variance concerned, e.g. $(2 s_b)^2$.
Example
The Effective Cation Exchange Capacity of soils (ECEC) is obtained by summation of the exchangeable cations:
ECEC = Exch. (Ca + Mg + Na + K + H + Al)
Standard deviations experimentally obtained for exchangeable Ca, Mg, Na, K and (H + Al) on a certain sample, e.g. a control sample, are: 0.30, 0.25, 0.15, 0.15, and 0.60 cmolc/kg respectively. The total precision is:
$$s_{ECEC} = \sqrt{0.30^2 + 0.25^2 + 0.15^2 + 0.15^2 + 0.60^2} = 0.75 \text{ cmol}_c/\text{kg}$$

It can be seen that the total standard deviation is larger than the highest individual standard deviation, but (much) less than their sum. It is also clear that if one wants to reduce the total standard deviation, qualitatively the best result can be expected from reducing the largest individual contribution, in this case the exchangeable acidity.
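The ECEC example can be worked through in Python as follows. This is a sketch under the assumption, implicit in the rule above, that the individual random errors are independent; the function name `combined_sd` and its optional `coefficients` argument are introduced here for illustration.

```python
import math


def combined_sd(sds, coefficients=None):
    """Standard deviation of a sum or difference of independent (sub)measurements.

    Each standard deviation is multiplied by its constant coefficient (default 1)
    before squaring, as described for factors such as an extra dilution.
    """
    if coefficients is None:
        coefficients = [1.0] * len(sds)
    return math.sqrt(sum((c * s) ** 2 for c, s in zip(coefficients, sds)))


# Standard deviations of the exchangeable cations (cmolc/kg), from the example
sd_cations = {"Ca": 0.30, "Mg": 0.25, "Na": 0.15, "K": 0.15, "H+Al": 0.60}

total_sd = combined_sd(list(sd_cations.values()))
print(f"total standard deviation of ECEC: {total_sd:.2f} cmolc/kg")
print(f"largest single contribution: {max(sd_cations.values()):.2f}")
print(f"simple sum of contributions: {sum(sd_cations.values()):.2f}")
```

Setting the standard deviation of one cation to zero and rerunning the calculation shows directly how much each contribution matters for the total.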
2. Multiplication calculations
If the final result x is obtained from multiplication (or division) of (sub)measurements according to
$$x = \frac{a \times b \times \dots}{c \times d \times \dots}$$
then the total error is expressed by the relative standard deviation obtained by taking the square root of the sum of the squares of the individual relative standard deviations (RSD or CV, as a fraction or as percentage, see Eqs. 6.5 and 6.6):
$$RSD_x = \sqrt{RSD_a^2 + RSD_b^2 + RSD_c^2 + \dots}$$

If a (sub)measurement has a constant multiplication factor or coefficient, then this is included to calculate the effect of the RSD concerned, e.g. $(2\,RSD_b)^2$.
Example
The calculation of Kjeldahl-nitrogen may be as follows:
$$\%N = \frac{(a - b) \times M \times 1.4 \times mcf}{s}$$

where
a = ml HCl required for titration sample
b = ml HCl required for titration blank
s = air-dry sample weight in gram
M = molarity of HCl
1.4 = 14 × 10^-3 × 100% (14 = atomic weight of N)
mcf = moisture correction factor
Note that in addition to multiplications, this calculation contains a subtraction also (often, calculations contain both summations and multiplications.)
Firstly, the standard deviation of the titration (a - b) is determined as indicated under Summation calculations above. This is then transformed to RSD using Equations (6.5) or (6.6). Then the RSDs of the other individual parameters have to be determined experimentally. The found RSDs are, for instance:
distillation: 0.8%,
titration: 0.5%,
molarity: 0.2%,
sample weight: 0.2%,
mcf: 0.2%.
The total calculated precision is:
$$RSD_{total} = \sqrt{0.8^2 + 0.5^2 + 0.2^2 + 0.2^2 + 0.2^2} = 1.0\%$$

Here again, the highest RSD (of distillation) dominates the total precision. In practice, the precision of the Kjeldahl method is usually considerably worse (2.5%), probably mainly as a result of the heterogeneity of the sample. The present example does not take that into account. It would imply that 2.5% - 1.0% = 1.5% or 3/5 of the total random error is due to sample heterogeneity (or other overlooked cause). This implies that painstaking efforts to improve subprocedures such as the titration or the preparation of standard solutions may not be very rewarding. It would, however, pay to improve the homogeneity of the sample, e.g. by careful grinding and mixing in the preparatory stage.
Note. Sample heterogeneity is also represented in the moisture correction factor. However, the influence of this factor on the final result is usually
very small.
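The Kjeldahl precision budget can be checked with the short Python sketch below. The function name `combined_rsd` is hypothetical. The last step is an addition to the text's reasoning: if the overlooked contribution is itself combined in quadrature, like the other terms, its size would be nearer 2.3% than the simple difference of 1.5%. Either way the conclusion that sample heterogeneity dominates is unchanged.

```python
import math


def combined_rsd(rsds):
    """Relative standard deviation of a product or quotient of independent factors."""
    return math.sqrt(sum(r ** 2 for r in rsds))


# RSDs (in %) of the individual steps, from the Kjeldahl example
rsd_percent = {
    "distillation": 0.8,
    "titration": 0.5,
    "molarity": 0.2,
    "sample weight": 0.2,
    "mcf": 0.2,
}

total_rsd = combined_rsd(rsd_percent.values())

# Share of the total variance that comes from the distillation step
distillation_share = rsd_percent["distillation"] ** 2 / total_rsd ** 2

# Assumption (not in the source): the unexplained part of the 2.5 % found in
# practice also adds in quadrature to the calculated total
observed_rsd = 2.5
unexplained_rsd = math.sqrt(observed_rsd ** 2 - total_rsd ** 2)

print(f"calculated total RSD: {total_rsd:.1f} %")
print(f"distillation share of variance: {distillation_share:.0%}")
print(f"unexplained RSD (quadrature): {unexplained_rsd:.1f} %")
```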
6.3.5.2 Propagation of systematic errors
Systematic errors of (sub)measurements contribute directly to the total bias of the result since the individual parameters in the calculation of the final result each carry their own bias. For instance, the systematic error in a balance will cause a systematic error in the sample weight (as well as in the moisture determination). Note that some systematic errors may cancel out, e.g. weighings by difference may not be affected by a biased balance.
The only way to detect or avoid systematic errors is by comparison (calibration) with independent standards and outside reference or control samples.

6.4 Statistical tests
6.4.1 Two-sided vs. one-sided test
6.4.2 F-test for precision
6.4.3 t-Tests for bias
6.4.4 Linear correlation and regression
6.4.5 Analysis of variance (ANOVA)
In analytical work a frequently recurring operation is the verification of performance by comparison of data. Some examples of comparisons in practice are:
- performance of two instruments,
- performance of two methods,
- performance of a procedure in different periods,
- performance of two analysts or laboratories,
- results obtained for a reference or control sample with the "true", "target" or "assigned" value of this sample.
Some of the most common and convenient statistical tools to quantify such comparisons are the F-test, the t-tests, and regression analysis.
Because the F-test and the t-tests are the most basic tests they will be discussed first. These tests examine if two sets of normally distributed data are similar or dissimilar (belong or not belong to the same "population") by comparing their standard deviations and means respectively. This is illustrated in Fig. 6-3.
Fig. 6-3. Three possible cases when comparing two sets of data (n1 = n2). A. Different mean (bias), same precision; B. Same mean (no bias), different precision; C. Both mean and precision are different. (The fourth case, identical sets, has not been drawn).

6.4.1 Two-sided vs. one-sided test
These tests for comparison, for instance between methods A and B, are based on the assumption that there is no significant difference (the "null hypothesis"). In other words, when the difference is so small that a tabulated critical value of F or t is not exceeded, we can be confident (usually at 95% level) that A and B are not different. Two fundamentally different questions can be asked concerning both the comparison of the standard deviations s1 and s2 with the F-test, and of the means $\bar{x}_1$ and $\bar{x}_2$ with the t-test:
1. are A and B different? (two-sided test)
2. is A higher (or lower) than B? (one-sided test).
This distinction has an important practical implication as statistically the probabilities for the two situations are different: the chance that A and B are only different ("it can go two ways") is twice as large as the chance that A is higher (or lower) than B ("it can go only one way"). The most common case is the two-sided (also called two-tailed) test: there are no particular reasons to expect that the means or the standard deviations of two data sets are different. An example is the routine comparison of a control chart with the previous one (see 8.3). However, when it is expected or suspected that the mean and/or the standard deviation will go only one way, e.g. after a change in an analytical procedure, the one-sided (or one-tailed) test is appropriate. In this case the probability that it goes the other way than expected is assumed to be zero and, therefore, the probability that it goes the expected way is doubled. Or, more correctly, the uncertainty in the two-way test of 5% (or the probability of 5% that the critical value is exceeded) is divided over the two tails of the Gaussian curve (see Fig. 6-2), i.e. 2.5% at the end of each tail beyond 2s. If we perform the one-sided test with 5% uncertainty, we actually increase this 2.5% to 5% at the end of one tail. (Note that for the whole Gaussian curve, which is symmetrical, this is then equivalent to an uncertainty of 10% in two ways!)
This difference in probability in the tests is expressed in the use of two tables of critical values for both F and t. In fact, the one-sided table at 95% confidence level is equivalent to the two-sided table at 90% confidence level.
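The equivalence between the one-sided 95% table and the two-sided 90% table can be illustrated for the limiting case of a very large data set, where the t-distribution converges to the standard normal curve. The sketch below uses Python's standard library; the helper name `critical_z` is introduced here for illustration, and the values it returns are normal deviates, not the t values of Appendix 1.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def critical_z(confidence: float, two_sided: bool) -> float:
    """Critical value of the standard normal curve for a given confidence level.

    For a two-sided test the uncertainty (1 - confidence) is split over both
    tails; for a one-sided test it is placed entirely in one tail.
    """
    alpha = 1.0 - confidence
    tail = alpha / 2 if two_sided else alpha
    return STANDARD_NORMAL.inv_cdf(1.0 - tail)


print(f"two-sided 95 %: {critical_z(0.95, two_sided=True):.3f}")
print(f"one-sided 95 %: {critical_z(0.95, two_sided=False):.3f}")
print(f"two-sided 90 %: {critical_z(0.90, two_sided=True):.3f}")
```

The one-sided critical value is the smaller of the two 95% values, which is why a calculated t or F can exceed the one-sided limit while staying below the two-sided one.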
It is emphasized that the one-sided test is only appropriate when a difference in one direction is expected or aimed at. Of course it is tempting to perform this test after the results show a clear (unexpected) effect. In fact, however, then a two times higher probability level was used in retrospect. This is underscored by the observation that in this way even contradictory conclusions may arise: if in an experiment calculated values of F and t are found within the range between the two-sided and one-sided values of Ftab and ttab, the two-sided test indicates no significant difference, whereas the one-sided test says that the result of A is significantly higher (or lower) than that of B. What actually happens is that in the first case the 2.5% boundary in the tail was just not exceeded, and then, subsequently, this 2.5% boundary is relaxed to 5% which is then obviously more easily exceeded. This illustrates that statistical tests differ in strictness and that for proper interpretation of results in reports, the statistical techniques used, including the confidence limits or probability, should always be specified.
6.4.2 F-test for precision
Because the result of the F-test may be needed to choose between the Student's t-test and the Cochran variant (see next section), the F-test is discussed first.
The F-test (or Fisher's test) is a comparison of the spread of two sets of data to test if the sets belong to the same population, in other words if the precisions are similar or dissimilar.
The test makes use of the ratio of the two variances:
$$F = \frac{s_1^2}{s_2^2}$$ (6.11)

where the larger s² must be the numerator by convention. If the performances are not very different, then the estimates s1 and s2 do not differ much and their ratio (and that of their squares) should not deviate much from unity. In practice, the calculated F is compared with the applicable F value in the F-table (also called the critical value, see Appendix 2). To read the table it is necessary to know the applicable number of degrees of freedom for s1 and s2. These are calculated by:
df1 = n1 - 1
df2 = n2 - 1
If Fcal ≤ Ftab one can conclude with 95% confidence that there is no significant difference in precision (the "null hypothesis" that s1 = s2 is accepted). Thus, there is still a 5% chance that we draw the wrong conclusion. In certain cases more confidence may be needed; then a 99% confidence table can be used, which can be found in statistical textbooks.
Example 1 (two-sided test)
Table 6-1 gives the data sets obtained by two analysts for the cation exchange capacity (CEC) of a control sample. Using Equation (6.11) the calculated F value is 1.62. As we had no particular reason to expect that the analysts would perform differently, we use the F-table for the two-sided test and find Ftab = 4.03 (Appendix 2, df1 = df2 = 9). This exceeds the calculated value and the null hypothesis (no difference) is accepted. It can be concluded with 95% confidence that there is no significant difference in precision between the work of Analyst 1 and 2.
Table 6-1. CEC values (in cmolc/kg) of a control sample determined by two analysts.

        Analyst 1   Analyst 2
          10.2         9.7
          10.7         9.0
          10.5        10.2
           9.9        10.3
           9.0        10.8
          11.2        11.1
          11.5         9.4
          10.9         9.2
           8.9         9.8
          10.6        10.2
  x̄:     10.34        9.97
  s:       0.819       0.644
  n:      10          10

  Fcal = 1.62    tcal = 1.12
  Ftab = 4.03    ttab = 2.10

Example 2 (one-sided test)
The determination of the calcium carbonate content with the Scheibler standard method is compared with the simple and more rapid "acid-neutralization" method using one and the same sample. The results are given in Table 6-2. Because of the nature of the rapid method we suspect it to produce a lower precision than obtained with the Scheibler method and we can, therefore, perform the one-sided F-test. The applicable Ftab = 3.07 (App. 2, df1 = 12, df2 = 9) which is lower than Fcal (= 18.3) and the null hypothesis (no difference) is rejected. It can be concluded (with 95% confidence) that for this one sample the precision of the rapid titration method is significantly worse than that of the Scheibler method.
Table 6-2. Contents of CaCO3 (in mass/mass %) in a soil sample determined with the Scheibler method (A) and the rapid titration method (B).

           A        B
          2.5      1.7
          2.4      1.9
          2.5      2.3
          2.6      2.3
          2.5      2.8
          2.5      2.5
          2.4      1.6
          2.6      1.9
          2.7      2.6
          2.4      1.7
                   2.4
                   2.2
                   2.6
  x̄:     2.51     2.13
  s:      0.099    0.424
  n:     10       13

  Fcal = 18.3    tcal = 3.12
  Ftab = 3.07    ttab* = 2.18

(ttab* = Cochran's "alternative" ttab)
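Both F-test examples can be reproduced from the summary statistics in Tables 6-1 and 6-2. The Python sketch below takes the critical values Ftab from the text rather than computing them; the function names `f_ratio` and `precision_differs` are introduced here for illustration.

```python
def f_ratio(s_a: float, s_b: float) -> float:
    """Ratio of two variances with the larger variance in the numerator (Eq. 6.11)."""
    v_a, v_b = s_a ** 2, s_b ** 2
    return max(v_a, v_b) / min(v_a, v_b)


def precision_differs(f_cal: float, f_tab: float) -> bool:
    """True when the calculated F exceeds the tabulated critical value."""
    return f_cal > f_tab


# Example 1: two analysts (Table 6-1), two-sided test, Ftab = 4.03
f_cal_1 = f_ratio(0.819, 0.644)
print(f"Example 1: Fcal = {f_cal_1:.2f}, significant: {precision_differs(f_cal_1, 4.03)}")

# Example 2: Scheibler vs. rapid method (Table 6-2), one-sided test, Ftab = 3.07
f_cal_2 = f_ratio(0.099, 0.424)
print(f"Example 2: Fcal = {f_cal_2:.1f}, significant: {precision_differs(f_cal_2, 3.07)}")
```

Taking the maximum and minimum of the two variances enforces the convention that the larger variance is the numerator, so the order in which the two standard deviations are passed does not matter.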

6.4.3 t-Tests for bias
6.4.3.1 Student's t-test
6.4.3.2 Cochran's t-test
6.4.3.3 t-Test for large data sets (n ≥ 30)
6.4.3.4 Paired t-test
Depending on the nature of two sets of data (n, s, sampling nature), the means of the sets can be compared for bias by several variants of the t-test. The following most common types will be discussed:
1. Student's t-test for comparison of two independent sets of data with very similar standard deviations
2. the Cochran variant of the t-test when the standard deviations of the independent sets differ significantly
3. the paired t-test for comparison of strongly dependent sets of data.
Basically, for the t-tests Equation (6.8) is used but written in a different way:
$$t = \frac{|\bar{x} - \mu|}{s / \sqrt{n}}$$ (6.12)

where
$\bar{x}$ = mean of test results of a sample
$\mu$ = "true" or reference value
s = standard deviation of test results
n = number of test results of the sample.
To compare the mean of a data set with a reference value normally the "two-sided t-table of critical values" is used (Appendix 1). The applicable number of degrees of freedom here is:
df = n - 1
If a value for t calculated with Equation (6.12) does not exceed the critical value in the table, the data are taken to belong to the same population: there is no difference and the "null hypothesis" is accepted (with the applicable probability, usually 95%).
As with the F-test, when it is expected or suspected that the obtained results are higher or lower than that of the reference value, the one-sided t-test can be performed: if tcal > ttab, then the results are significantly higher (or lower) than the reference value.
More commonly, however, the "true" value of proper reference samples is accompanied by the associated standard deviation and number of replicates used to determine these parameters. We can then apply the more general case of comparing the means of two data sets: the "true" value in Equation (6.12) is then replaced by the mean of a second data set. As is shown in Fig. 6-3, to test if two data sets belong to the same population it is tested if the two Gauss curves do sufficiently overlap, in other words, if the difference between the means $\bar{x}_1 - \bar{x}_2$ is small. This is discussed next.
Similarity or non-similarity of standard deviations
When using the t-test for two small sets of data (n1 and/or n2 < 30), a choice of the type of test must be made depending on the similarity (or non-similarity) of the standard deviations of the two sets. If the standard deviations are sufficiently similar they can be "pooled" and the Student t-test can be used. When the standard deviations are not sufficiently similar an alternative procedure for the t-test must be followed in which the standard deviations are not pooled. A convenient alternative is the Cochran variant of the t-test. The criterion for the choice is the passing or non-passing of the F-test (see 6.4.2), that is, if the variances do or do not significantly differ. Therefore, for small data sets, the F-test should precede the t-test.
For dealing with large data sets (n1, n2 ≥ 30) the "normal" t-test is used (see Section 6.4.3.3 and App. 3).
6.4.3.1 Student's t-test
To be applied to small data sets (n1, n2 < 30) where s1 and s2 are similar according to F-test.
When comparing two sets of data, Equation (6.12) is rewritten as:
$$t = \frac{|\bar{x}_1 - \bar{x}_2|}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$ (6.13)

where
$\bar{x}_1$ = mean of data set 1
$\bar{x}_2$ = mean of data set 2
sp = "pooled" standard deviation of the sets
n1 = number of data in set 1
n2 = number of data in set 2.
The pooled standard deviation sp is calculated by:
$$s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$$ (6.14)

where
s1 = standard deviation of data set 1
s2 = standard deviation of data set 2
n1 = number of data in set 1
n2 = number of data in set 2.
To perform the t-test, the critical ttab has to be found in the table (Appendix 1); the applicable number of degrees of freedom df is here calculated by:
df = n1 + n2 - 2
Example
The two data sets of Table 6-1 can be used: With Equations (6.13) and (6.14) tcal is

16/29

30/05/2016

6BASICSTATISTICALTOOLS

calculated as 1.12 which is lower than the critical value ttab of 2.10 (App. 1, df = 18, two-sided), hence the null hypothesis (no difference) is accepted and the two data sets are assumed to belong to the same population: there is no significant difference between the mean results of the two analysts (with 95% confidence).
Note. Another illustrative way to perform this test for bias is to calculate if the difference between the means falls within or outside the range where this difference is still not significantly large. In other words, if this difference is less than the least significant difference (lsd). This can be derived from Equation (6.13):
$$lsd = t_{tab} \cdot s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$ (6.15)

In the present example of Table 6-1, the calculation yields lsd = 0.69. The measured difference between the means is 10.34 - 9.97 = 0.37 which is smaller than the lsd, indicating that there is no significant difference between the performance of the analysts.
In addition, in this approach the 95% confidence limits of the difference between the means can be calculated (cf. Equation 6.8):
confidence limits = 0.37 ± 0.69 = -0.32 and 1.06
Note that the value 0 for the difference is situated within this confidence interval, which agrees with the null hypothesis of $\bar{x}_1 = \bar{x}_2$ (no difference) having been accepted.
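The Student's t-test, the least significant difference and the confidence limits of this example can be reproduced from the summary statistics of Table 6-1. The following Python sketch takes ttab = 2.10 from the text; the function names `pooled_sd` and `student_t` are chosen here for illustration.

```python
import math


def pooled_sd(s1: float, n1: int, s2: float, n2: int) -> float:
    """Pooled standard deviation of two data sets, weighted by degrees of freedom."""
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))


def student_t(mean1, s1, n1, mean2, s2, n2) -> float:
    """Calculated t for two independent sets with similar standard deviations."""
    sp = pooled_sd(s1, n1, s2, n2)
    return abs(mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))


# Summary statistics of Analyst 1 and Analyst 2 (Table 6-1)
mean1, s1, n1 = 10.34, 0.819, 10
mean2, s2, n2 = 9.97, 0.644, 10
T_TAB = 2.10  # two-sided 95 % critical t for df = 18, as quoted from Appendix 1

t_cal = student_t(mean1, s1, n1, mean2, s2, n2)

# Least significant difference and confidence limits of the difference
lsd = T_TAB * pooled_sd(s1, n1, s2, n2) * math.sqrt(1 / n1 + 1 / n2)
difference = mean1 - mean2
limits = (difference - lsd, difference + lsd)

print(f"tcal = {t_cal:.2f} (ttab = {T_TAB})")
print(f"lsd = {lsd:.2f}, difference = {difference:.2f}")
print(f"95 % confidence limits of the difference: {limits[0]:.2f} and {limits[1]:.2f}")
```

The two views are equivalent: tcal stays below ttab exactly when the difference between the means stays below the lsd.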
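6.4.3.2 Cochran's t-test
To be applied to small data sets (n1, n2 < 30) where s1 and s2 are dissimilar according to F-test.
Calculate t with:
$$t = \frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$ (6.16)

Then determine an "alternative" critical t-value:
$$t_{tab}^{*} = \frac{t_1 \dfrac{s_1^2}{n_1} + t_2 \dfrac{s_2^2}{n_2}}{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$$ (6.17)

where
t1 = ttab at n1 - 1 degrees of freedom
t2 = ttab at n2 - 1 degrees of freedom
Now the t-test can be performed as usual: if tcal < ttab* then the null hypothesis that the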
means do not significantly differ is accepted.
Example
The two data sets of Table 6-2 can be used.
According to the F-test, the standard deviations differ significantly so that the Cochran variant must be used. Furthermore, in contrast to our expectation that the precision of the rapid test would be inferior, we have no idea about the bias and therefore the two-sided test is appropriate. The calculations yield tcal = 3.12 and ttab* = 2.18, meaning that tcal exceeds ttab*, which implies that the null hypothesis (no difference) is rejected and that the mean of the rapid analysis deviates significantly from that of the standard analysis (with 95% confidence, and for this sample only). Further investigation of the rapid method would have to include the use of more different samples and then comparison with the one-sided t-test would be justified (see 6.4.3.4, Example 1).
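The Cochran calculation for Table 6-2 can be sketched in Python as follows. The two critical values used for t1 and t2 (2.262 for df = 9 and 2.179 for df = 12, two-sided, 95%) are standard t-table values entered by hand; they are an assumption in the sense that Appendix 1 is not reproduced here. The function names are chosen for illustration.

```python
import math


def cochran_t(mean1, s1, n1, mean2, s2, n2) -> float:
    """Calculated t without pooling the standard deviations."""
    return abs(mean1 - mean2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


def cochran_critical_t(t1, s1, n1, t2, s2, n2) -> float:
    """'Alternative' critical t: the two tabulated t values weighted by s^2/n."""
    w1, w2 = s1 ** 2 / n1, s2 ** 2 / n2
    return (t1 * w1 + t2 * w2) / (w1 + w2)


# Summary statistics of methods A (Scheibler) and B (rapid titration), Table 6-2
mean_a, s_a, n_a = 2.51, 0.099, 10
mean_b, s_b, n_b = 2.13, 0.424, 13

T_DF9 = 2.262   # two-sided 95 % critical t, df = n_a - 1 = 9
T_DF12 = 2.179  # two-sided 95 % critical t, df = n_b - 1 = 12

t_cal = cochran_t(mean_a, s_a, n_a, mean_b, s_b, n_b)
t_crit = cochran_critical_t(T_DF9, s_a, n_a, T_DF12, s_b, n_b)

print(f"tcal = {t_cal:.2f}, ttab* = {t_crit:.2f}")
print("means differ significantly" if t_cal > t_crit else "no significant difference")
```

Because method B has by far the larger variance, ttab* ends up very close to the critical value that belongs to B's degrees of freedom.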
6.4.3.3 t-Test for large data sets (n ≥ 30)
In the example above (6.4.3.2) the conclusion would have been the same if the Student's t-test with pooled standard deviations had been used. This is caused by the fact that the difference in result of the Student and Cochran variants of the t-test is largest when small sets of data are compared, and decreases with increasing number of data. Namely, with increasing number of data a better estimate of the real distribution of the population is obtained (the flatter t-distribution converges then to the standardized normal distribution). When n ≥ 30 for both sets, e.g. when comparing Control Charts (see 8.3), for all practical purposes the difference between the Student and Cochran variant is negligible. The procedure is then reduced to the "normal" t-test by simply calculating tcal with Eq. (6.16) and comparing this with ttab at df = n1 + n2 - 2. (Note in App. 1 that the two-sided ttab is now close to 2).
The proper choice of the t-test as discussed above is summarized in a flow diagram in Appendix 3.
6.4.3.4 Paired t-test
When two data sets are not independent, the paired t-test can be a better tool for comparison than the "normal" t-test described in the previous sections. This is for instance the case when two methods are compared by the same analyst using the same sample(s). It could, in fact, also be applied to the example of Table 6-1 if the two analysts used the same analytical method at (about) the same time.
As stated previously, comparison of two methods using different levels of analyte gives more validation information about the methods than using only one level. Comparison of results at each level could be done by the F and t-tests as described above. The paired t-test, however, allows for different levels provided the concentration range is not too wide. As a rule of thumb, the range of results should be within the same order of magnitude. If the analysis covers a longer range, i.e. several powers of ten, regression analysis must be considered (see Section 6.4.4). In intermediate cases, either technique may be chosen.
The null hypothesis is that there is no difference between the data sets, so the test is to see if the mean of the differences between the data deviates significantly from zero or not (two-sided test). If it is expected that one set is systematically higher (or lower) than the other set, then the one-sided test is appropriate.
Example 1
The "promising" rapid single-extraction method for the determination of the cation exchange capacity of soils using the silver thiourea complex (AgTU, buffered at pH 7) was compared with the traditional ammonium acetate method (NH4OAc, pH 7). Although for certain soil types the difference in results appeared insignificant, for other types differences seemed larger. Such a suspect group were soils with ferralic (oxic) properties (i.e. highly weathered sesquioxide-rich soils). In Table 6-3 the results of ten soils with these properties are grouped to test if the CEC methods give different results. The difference d within each pair and the parameters needed for the paired t-test are given also.
Table 6-3. CEC values (in cmolc/kg) obtained by the NH4OAc and AgTU methods (both at pH 7) for ten soils with ferralic properties.

  Sample   NH4OAc   AgTU     d
     1       7.1     6.5   -0.6
     2       4.6     5.6   +1.0
     3      10.6    14.5   +3.9
     4       2.3     5.6   +3.3
     5      25.2    23.8   -1.4
     6       4.4    10.4   +6.0
     7       7.8     8.4   +0.6
     8       2.7     5.5   +2.8
     9      14.3    19.2   +4.9
    10      13.6    15.0   +1.4

  d̄ = +2.19     tcal = 2.89
  sd = 2.395    ttab = 2.26

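Using Equation (6.12) and noting that μd = 0 (hypothesized value of the differences, i.e.
no difference), the t value can be calculated as:

$$t = \frac{|\bar{d} - \mu_d|}{s_d / \sqrt{n}} = \frac{|2.19 - 0|}{2.395 / \sqrt{10}} = 2.89$$

where
d̄ = mean of differences within each pair of data
sd = standard deviation of the differences
n = number of pairs of data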
The calculated t value (= 2.89) exceeds the critical value of 1.83 (App. 1, df = n - 1 = 9,
one-sided); it also exceeds the two-sided value of 2.26 listed in Table 6-3. Hence the null
hypothesis that the methods do not differ is rejected and it is concluded that the silver
thiourea method gives significantly higher results as compared with the ammonium
acetate method when applied to such highly weathered soils.
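The calculation of Example 1 can be retraced with a few lines of Python. This is only a minimal sketch using the standard library: the function name `paired_t` and the variable names are illustrative choices, not part of the Guidelines, and the critical value is the one quoted in the text rather than computed.

```python
import math
import statistics

# CEC values (cmolc/kg) for the ten soils of Table 6-3
nh4oac = [7.1, 4.6, 10.6, 2.3, 25.2, 4.4, 7.8, 2.7, 14.3, 13.6]
agtu = [6.5, 5.6, 14.5, 5.6, 23.8, 10.4, 8.4, 5.5, 19.2, 15.0]


def paired_t(x, y, mu_d=0.0):
    """Return (mean difference, sd of differences, t) for paired data.

    Differences are taken as y - x; mu_d is the hypothesized mean difference.
    """
    if len(x) != len(y):
        raise ValueError("paired data sets must have equal length")
    d = [yi - xi for xi, yi in zip(x, y)]
    n = len(d)
    d_mean = statistics.mean(d)
    s_d = statistics.stdev(d)  # sample standard deviation (n - 1 in the denominator)
    t = abs(d_mean - mu_d) / (s_d / math.sqrt(n))
    return d_mean, s_d, t


d_mean, s_d, t_cal = paired_t(nh4oac, agtu)

# Critical value as quoted in the text (App. 1, df = 9, one-sided); not computed here
T_CRIT_ONE_SIDED_DF9 = 1.83

print(f"mean d = {d_mean:.2f}, s_d = {s_d:.3f}, t = {t_cal:.2f}")
print("H0 rejected" if t_cal > T_CRIT_ONE_SIDED_DF9 else "H0 not rejected")
```

The same function can be applied to any other set of paired results, provided the appropriate critical value (one- or two-sided, df = n - 1) is looked up separately.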
Note. Since such data sets do not have a normal distribution, the "normal"
t-test which compares means of sets cannot be used here (the means do
not constitute a fair representation of the sets). For the same reason no
information about the precision of the two methods can be obtained, nor
can the F-test be applied. For information about precision, replicate
determinations are needed.
Example 2
Table 6-4 shows the data of total P in four plant tissue samples obtained by a
laboratory L and the median values obtained by 123 laboratories in a proficiency
(round-robin) test.
Table 6-4. Total P contents (in mmol/kg) of plant tissue as determined by 123
laboratories (Median) and Laboratory L.
Sample   Median   Lab L     d
   1      93.0     85.2   -7.8
   2     201      224     23
   3      78.9     84.5    5.6
   4     175      185     10

d̄ = 7.70      tcal = 1.21
sd = 12.702    ttab = 3.18

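To verify the performance of the laboratory a paired t-test can be performed:
Using Eq. (6.12) and noting that μd = 0 (hypothesized value of the differences, i.e. no
difference), the t value can be calculated as:

$$t = \frac{|\bar{d} - \mu_d|}{s_d / \sqrt{n}} = \frac{|7.70 - 0|}{12.702 / \sqrt{4}} = 1.21$$
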
The calculated t value is below the critical value of 3.18 (Appendix 1, df = n - 1 = 3,
two-sided), hence the null hypothesis that the laboratory does not significantly differ
from the group of laboratories is not rejected, and the results of Laboratory L seem to
agree with those of "the rest of the world" (this is a so-called third-line control).

6.4.4 Linear correlation and regression
6.4.4.1 Construction of calibration graph
6.4.4.2 Comparing two sets of data using many samples at different
analyte levels
Linear correlation and regression are also among the most common and useful statistical
tools for comparing effects and performances X and Y. Although the technique is in
principle the same for both, there is a fundamental difference in concept: correlation
analysis is applied to independent factors: if X increases, what will Y do (increase,
decrease, or perhaps not change at all)? In regression analysis a unilateral response is
assumed: changes in X result in changes in Y, but changes in Y do not result in changes
in X.
For example, in analytical work, correlation analysis can be used for comparing
methods or laboratories, whereas regression analysis can be used to construct
calibration graphs. In practice, however, comparison of laboratories or methods is
usually also done by regression analysis. The calculations can be performed on a
(programmed) calculator or more conveniently on a PC using a home-made program.
Even more convenient are the regression programs included in statistical packages
such as Statistix, Mathcad, Eureka, Genstat, Statcal, SPSS, and others. Also, most
spreadsheet programs such as Lotus 1-2-3, Excel, and Quattro Pro have functions for
this.
Laboratories or methods are in fact independent factors. However, for regression
analysis one factor has to be the independent or "constant" factor (e.g. the reference
method, or the factor with the smallest standard deviation). This factor is by convention
designated X, whereas the other factor is then the dependent factor Y (thus, we speak
of "regression of Y on X").
As was discussed in Section 6.4.3, such comparisons can often be done with the
Student/Cochran or paired t-tests. However, correlation analysis is indicated:
1. When the concentration range is so wide that the errors, both random
and systematic, are not independent (which is the assumption for the
t-tests). This is often the case where concentration ranges of several
orders of magnitude are involved.
2. When pairing is inappropriate for other reasons, notably a long time
span between the two analyses (sample aging, change in laboratory
conditions, etc.).
The principle is to establish a statistical linear relationship between two sets of
corresponding data by fitting the data to a straight line by means of the "least squares"
technique. Such data are, for example, analytical results of two methods applied to the
same samples (correlation), or the response of an instrument to a series of standard
solutions (regression).
Note: Naturally, non-linear higher-order relationships are also possible, but
since these are less common in analytical work and more complex to
handle mathematically, they will not be discussed here. Nevertheless, to
avoid misinterpretation, always inspect the kind of relationship by plotting
the data, either on paper or on the computer monitor.
The resulting line takes the general form:
$$y = bx + a \qquad (6.18)$$

where
a = intercept of the line with the y-axis
b = slope (tangent)
In laboratory work ideally, when there is perfect positive correlation without bias, the
intercept a = 0 and the slope b = 1. This is the so-called "1:1 line" passing through the
origin (dashed line in Fig. 6-5).
If the intercept a ≠ 0 then there is a systematic discrepancy (bias, error) between X and
Y; when b ≠ 1 then there is a proportional response or difference between X and Y.
The correlation between X and Y is expressed by the correlation coefficient r which can
be calculated with the following equation:
$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}} \qquad (6.19)$$

where
xi = data X
x̄ = mean of data X
yi = data Y
ȳ = mean of data Y
It can be shown that r can vary from 1 to -1:
r = 1: perfect positive linear correlation
r = 0: no linear correlation (maybe other correlation)
r = -1: perfect negative linear correlation
Often, the correlation coefficient r is expressed as r²: the coefficient of determination.
The advantage of r² is that, when multiplied by 100, it indicates the percentage of
variation in Y associated with variation in X. Thus, for example, when r = 0.71 about
50% (r² = 0.504) of the variation in Y is associated with the variation in X.
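The line parameters b and a are calculated with the following equations:
$$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \qquad (6.20)$$

and
$$a = \bar{y} - b\bar{x} \qquad (6.21)$$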

It is worth noting that r is independent of the choice of which factor is the independent
factor X and which is the dependent factor Y. However, the regression parameters a and
b do depend on this choice as the regression lines will be different (except when there is
ideal 1:1 correlation).
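The point that r does not depend on the choice of X and Y, while a and b do, can be checked numerically. The sketch below applies Equations (6.19) to (6.21) to the P standard series of Section 6.4.4.1, first regressing absorbance on concentration and then the reverse. The function name `linear_fit` is an illustrative choice and only the Python standard library is assumed.

```python
import math


def linear_fit(x, y):
    """Least-squares fit of y = b*x + a; returns (a, b, r) following Eqs. 6.19-6.21."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx                    # Eq. 6.20
    a = y_mean - b * x_mean          # Eq. 6.21
    r = sxy / math.sqrt(sxx * syy)   # Eq. 6.19
    return a, b, r


# Standard series of Section 6.4.4.1: P concentration (mg/L) and absorbance
conc = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
absorbance = [0.05, 0.14, 0.29, 0.43, 0.52, 0.67]

a_yx, b_yx, r_yx = linear_fit(conc, absorbance)  # regression of Y on X
a_xy, b_xy, r_xy = linear_fit(absorbance, conc)  # variables reversed

print(f"Y on X: a = {a_yx:.3f}, b = {b_yx:.3f}, r = {r_yx:.4f}")
print(f"X on Y: a = {a_xy:.3f}, b = {b_xy:.3f}, r = {r_xy:.4f}")
```

The two runs give the same r but different line parameters; the product of the two slopes equals r², which is why the two lines coincide only when the correlation is perfect.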
6.4.4.1 Construction of calibration graph
As an example, we take a standard series of P (0-1.0 mg/L) for the spectrophotometric
determination of phosphate in a Bray-I extract ("available P"), reading in absorbance
units. The data and calculated terms needed to determine the parameters of the
calibration graph are given in Table 6-5. The line itself is plotted in Fig. 6-4.
Table 6-5 is presented here to give an insight into the steps and terms involved. The
calculation of the correlation coefficient r with Equation (6.19) yields a value of 0.997 (r²
= 0.995). Such high values are common for calibration graphs. When the value is not
close to 1 (say, below 0.98) this must be taken as a warning and it might then be
advisable to repeat or review the procedure. Errors may have been made (e.g. in
pipetting) or the used range of the graph may not be linear. On the other hand, a high r
may be misleading as it does not necessarily indicate linearity. Therefore, to verify this,
the calibration graph should always be plotted, either on paper or on computer monitor.
Using Equations (6.20) and (6.21) we obtain:
$$b = \frac{0.438}{0.70} = 0.626$$

and
$$a = 0.350 - 0.313 = 0.037$$
Thus, the equation of the calibration line is:
$$y = 0.626x + 0.037 \qquad (6.22)$$

Table 6-5. Parameters of calibration graph in Fig. 6-4.
   xi     yi    xi-x̄   (xi-x̄)²    yi-ȳ   (yi-ȳ)²   (xi-x̄)(yi-ȳ)
  0.0    0.05   -0.5     0.25    -0.30    0.090       0.150
  0.2    0.14   -0.3     0.09    -0.21    0.044       0.063
  0.4    0.29   -0.1     0.01    -0.06    0.004       0.006
  0.6    0.43    0.1     0.01     0.08    0.006       0.008
  0.8    0.52    0.3     0.09     0.17    0.029       0.051
  1.0    0.67    0.5     0.25     0.32    0.102       0.160
Σ 3.0    2.10    0       0.70     0       0.2754      0.438

x̄ = 0.5    ȳ = 0.35

Fig. 6-4. Calibration graph plotted from data of Table 6-5. The dashed lines
delineate the 95% confidence area of the graph. Note that the confidence is
highest at the centroid of the graph.

During calculation, the maximum number of decimals is used; rounding off to the last
significant figure is done at the end (see instruction for rounding off in Section 8.2).
Once the calibration graph is established, its use is simple: for each y value measured
the corresponding concentration x can be determined either by direct reading or by
calculation using Equation (6.22). The use of calibration graphs is further discussed in
Section 7.2.2.
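Reading a concentration by calculation amounts to inverting Equation (6.22). The following sketch shows this for the calibration line found above. The function name `concentration` and the range check are illustrative additions, not part of the Guidelines; the check simply flags results that fall outside the 0-1.0 mg/L standard series, where the line has not been verified.

```python
A_INTERCEPT = 0.037       # intercept a of Eq. 6.22 (absorbance units)
B_SLOPE = 0.626           # slope b of Eq. 6.22 (absorbance per mg/L P)
X_MIN, X_MAX = 0.0, 1.0   # range covered by the standard series, mg/L P


def concentration(absorbance):
    """Invert y = b*x + a; return (x, in_range) for a measured absorbance y."""
    x = (absorbance - A_INTERCEPT) / B_SLOPE
    in_range = X_MIN <= x <= X_MAX
    return x, in_range


x, ok = concentration(0.35)
print(f"{x:.3f} mg/L P, within calibrated range: {ok}")
```

A reading that is flagged as out of range would normally call for dilution of the extract and re-measurement rather than extrapolation of the line.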
Note. A treatise of the error or uncertainty in the regression line is given below under
"Error or uncertainty in the regression line".
6.4.4.2 Comparing two sets of data using many samples at different analyte
levels
Although regression analysis assumes that one factor (on the x-axis) is constant, when
certain conditions are met the technique can also successfully be applied to comparing
two variables such as laboratories or methods. These conditions are:
- The most precise data set is plotted on the x-axis.
- At least 6, but preferably more than 10 different samples are analyzed.
- The samples should rather uniformly cover the analyte level range of
interest.
To decide which laboratory or method is the most precise, multi-replicate results have
to be used to calculate standard deviations (see 6.4.2). If these are not available then
the standard deviations of the present sets could be compared (note that we are now
not dealing with normally distributed sets of replicate results). Another convenient way
is to run the regression analysis on the computer, reverse the variables and run the
analysis again. Observe which variable has the lowest standard deviation (or standard
error of the intercept a, both given by the computer) and then use the results of the
regression analysis where this variable was plotted on the x-axis.
If the analyte level range is incomplete, one might have to resort to spiking or standard
additions, with the inherent drawback that the original analyte-sample combination may
not adequately be reflected.
Example
In the framework of a performance verification programme, a large number of soil
samples were analyzed by two laboratories X and Y (a form of "third-line control", see
Chapter 9) and the data compared by regression. (In this particular case, the paired
t-test might have been considered also.) The regression line of a common attribute, the
pH, is shown here as an illustration. Figure 6-5 shows the so-called "scatter plot" of
124 soil pH-H2O determinations by the two laboratories. The correlation coefficient r is
0.97 which is very satisfactory. The slope b (= 1.03) indicates that the regression line is
only slightly steeper than the 1:1 ideal regression line. Very disturbing, however, is the
intercept a of -1.18. This implies that laboratory Y measures the pH more than a whole
unit lower than laboratory X at the low end of the pH range (the intercept -1.18 is at
pHx = 0), a difference which decreases to about 0.8 unit at the high end.
Fig. 6-5. Scatter plot of pH data of two laboratories. Drawn line: regression line;
dashed line: 1:1 ideal regression line.

The t-test for significance is as follows:
For intercept a: μa = 0 (null hypothesis: no bias; ideal intercept is then zero), standard
error = 0.14 (calculated by the computer), and using Equation (6.12) we obtain:

$$t = \frac{|a - \mu_a|}{s_a} = \frac{|-1.18 - 0|}{0.14} = 8.4$$

Here, ttab = 1.98 (App. 1, two-sided, df = n - 2 = 122; n - 2 because an extra degree of
freedom is lost as the data are used for both a and b); hence, the laboratories have a
significant mutual bias.
For slope b: μb = 1 (ideal slope: null hypothesis is no difference), standard error = 0.02
(given by computer), and again using Equation (6.12) we obtain:

$$t = \frac{|b - \mu_b|}{s_b} = \frac{|1.03 - 1|}{0.02} = 1.5$$

Again, ttab = 1.98 (App. 1, two-sided, df = 122); hence, the difference between the
laboratories is not significantly proportional (or: the laboratories do not have a
significant difference in sensitivity). These results suggest that in spite of the good
correlation, the two laboratories would have to look into the cause of the bias.
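The two significance tests above follow the same pattern, which can be sketched as a small helper. The function name `t_value` is illustrative; the estimates, standard errors and the critical value of 1.98 are taken from the text, not computed from the 124 data pairs (which are not reproduced here).

```python
def t_value(estimate, hypothesis, std_error):
    """t = |estimate - hypothesized value| / standard error of the estimate."""
    return abs(estimate - hypothesis) / std_error


T_TAB = 1.98  # two-sided critical value quoted in the text for df = 122

t_a = t_value(-1.18, 0.0, 0.14)  # intercept against the ideal value 0
t_b = t_value(1.03, 1.0, 0.02)   # slope against the ideal value 1

for name, t in (("intercept a", t_a), ("slope b", t_b)):
    verdict = "significant" if t > T_TAB else "not significant"
    print(f"{name}: t = {t:.1f} ({verdict})")
```

Testing the slope against 1 rather than 0 is the essential design choice here: the question is not whether Y responds to X at all, but whether it departs from the ideal 1:1 line.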
Note. In the present example, the scattering of the points around the
regression line does not seem to change much over the whole range. This
indicates that the precision of laboratory Y does not change very much
over the range with respect to laboratory X. This is not always the case.
Where the scatter does change with the level, weighted regression (not
discussed here) is more appropriate than the unweighted regression as
used here.
Validation of a method (see Section 7.5) may reveal that precision can
change significantly with the level of analyte (and with other factors such
as sample matrix).

6.4.5 Analysis of variance (ANOVA)
When results of laboratories or methods are compared where more than one factor
can be of influence and must be distinguished from random effects, then ANOVA is a
powerful statistical tool to be used. Examples of such factors are: different analysts,
samples with different pre-treatments, different analyte levels, and different methods
within one of the laboratories. Most statistical packages for the PC can perform this
analysis.
As a treatise of ANOVA is beyond the scope of the present Guidelines, for further
discussion the reader is referred to statistical textbooks, some of which are given in the
list of Literature.
Error or uncertainty in the regression line
The "fitting" of the calibration graph is necessary because the response points yi
composing the line do not fall exactly on the line. Hence, random errors are implied.
This is expressed by an uncertainty about the slope and intercept b and a defining the
line. A quantification can be found in the standard deviation of these parameters. Most
computer programmes for regression will automatically produce figures for these. To
illustrate the procedure, the example of the calibration graph in Section 6.4.4.1 is
elaborated here.
A practical quantification of the uncertainty is obtained by calculating the standard
deviation of the points on the line, the "residual standard deviation" or "standard error
of the y-estimate", which we assume to be constant (but which is only approximately
so, see Fig. 6-4):
$$s_y = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}} \qquad (6.23)$$

where
ŷi = "fitted" y-value for each xi (read from graph or calculated with Eq.
6.22). Thus, yi - ŷi is the (vertical) deviation of the found y-values from the
line.
n = number of calibration points.
Note: Only the y-deviations of the points from the line are considered. It is
assumed that deviations in the x-direction are negligible. This is, of course,
only the case if the standards are very accurately prepared.
Now the standard deviations for the intercept a and slope b can be calculated with:
$$s_a = s_y \sqrt{\frac{\sum x_i^2}{n \sum (x_i - \bar{x})^2}} \qquad (6.24)$$

and
$$s_b = \frac{s_y}{\sqrt{\sum (x_i - \bar{x})^2}} \qquad (6.25)$$

To make this procedure clear, the parameters involved are listed in Table 6-6.
The uncertainty about the regression line is expressed by the confidence limits of a and
b according to Eq. (6.9): a ± t·sa and b ± t·sb
Table 6-6. Parameters for calculating errors due to calibration graph (use also figures
of Table 6-5).
   xi     yi      ŷi     yi-ŷi   (yi-ŷi)²
  0.0    0.05   0.037    0.013    0.0002
  0.2    0.14   0.162   -0.022    0.0005
  0.4    0.29   0.287    0.003    0.0000
  0.6    0.43   0.413    0.017    0.0003
  0.8    0.52   0.538   -0.018    0.0003
  1.0    0.67   0.663    0.007    0.0001
                           Σ      0.001337

In the present example, using Eq. (6.23), we calculate

$$s_y = \sqrt{\frac{0.001337}{6 - 2}} = 0.0183$$

and, using Eq. (6.24) and Table 6-5:

$$s_a = 0.0183 \sqrt{\frac{2.20}{6 \times 0.70}} = 0.0132$$

and, using Eq. (6.25) and Table 6-5:

$$s_b = \frac{0.0183}{\sqrt{0.70}} = 0.0219$$

The applicable ttab is 2.78 (App. 1, two-sided, df = n - 2 = 4) hence, using Eq. (6.9):
a = 0.037 ± 2.78 × 0.0132 = 0.037 ± 0.037
and
b = 0.626 ± 2.78 × 0.0219 = 0.626 ± 0.061
Note that if sa is large enough, a negative value for a is possible, i.e. a negative reading
for the blank or zero-standard. (For a discussion about the error in x resulting from a
reading in y, which is particularly relevant for reading a calibration graph, see Section
7.2.3.)
The uncertainty about the line is somewhat decreased by using more calibration points
(assuming sy has not increased): one more point reduces ttab from 2.78 to 2.57 (see
Appendix 1).
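The whole procedure of this section can be retraced in a short script. The sketch below recomputes the line, the residual standard deviation and the standard deviations of intercept and slope from the six calibration points, carrying full precision until the end as recommended in Section 6.4.4.1. Variable names are illustrative, only the Python standard library is assumed, and the t value of 2.78 is taken from the text rather than computed.

```python
import math

# Calibration points: P concentration (mg/L) and absorbance
x = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
y = [0.05, 0.14, 0.29, 0.43, 0.52, 0.67]
T_TAB = 2.78  # two-sided critical value quoted in the text for df = n - 2 = 4

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n
sxx = sum((xi - x_mean) ** 2 for xi in x)
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

b = sxy / sxx            # slope
a = y_mean - b * x_mean  # intercept

# Residual standard deviation: vertical deviations from the fitted line, n - 2 df
ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
s_y = math.sqrt(ss_res / (n - 2))

# Standard deviations of intercept and slope
s_a = s_y * math.sqrt(sum(xi ** 2 for xi in x) / (n * sxx))
s_b = s_y / math.sqrt(sxx)

print(f"s_y = {s_y:.4f}, s_a = {s_a:.4f}, s_b = {s_b:.4f}")
print(f"a = {a:.3f} +/- {T_TAB * s_a:.3f}")
print(f"b = {b:.3f} +/- {T_TAB * s_b:.3f}")
```

Because the confidence half-width of a is about as large as a itself, the interval for the intercept includes zero, which illustrates why a slightly negative blank reading is not necessarily alarming.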
