
Normality Test tutorial, Spider Financial Corp, 2012

Facts and Myths about Normality Test
In time series and econometric analysis and modeling, we often encounter the normality test as part of the residuals diagnosis to validate a model's assumption(s).

Does the Normality test[1] tell us whether standardized residuals follow a Gaussian distribution? Not exactly.
So, what exactly does this test do? Why do we have several different methods for testing normality?
Note: For illustration, we simulated 5 series of random numbers using the Analysis Pack in Excel. Each series has a different underlying distribution: Normal, Uniform, Binomial, Poisson, Student's t and F distribution.
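
For readers without Excel, comparable series can be sketched with Python's standard library. This is only an illustrative stand-in, not the procedure used for this tutorial: the function name `simulate_series`, the seed, the sample size and every distribution parameter below are assumptions chosen for the example.

```python
import math
import random


def simulate_series(n=50, seed=2012):
    """Draw one series of length n from each distribution named in the note."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable

    def chi_square(df):
        # A chi-square(df) variate is Gamma(shape=df/2, scale=2).
        return rng.gammavariate(df / 2.0, 2.0)

    def binomial(trials, p):
        # Count successes in `trials` independent Bernoulli(p) draws.
        return sum(1 for _ in range(trials) if rng.random() < p)

    def poisson(lam):
        # Knuth's multiplication method; adequate for small lam.
        limit, count, product = math.exp(-lam), 0, rng.random()
        while product > limit:
            count += 1
            product *= rng.random()
        return count

    def student_t(df):
        # Standard normal divided by sqrt(chi-square / df).
        return rng.gauss(0.0, 1.0) / math.sqrt(chi_square(df) / df)

    def f_ratio(df1, df2):
        # Ratio of two independent chi-squares, each scaled by its df.
        return (chi_square(df1) / df1) / (chi_square(df2) / df2)

    # All parameter values below are illustrative assumptions.
    return {
        "Normal": [rng.gauss(0.0, 1.0) for _ in range(n)],
        "Uniform": [rng.uniform(0.0, 1.0) for _ in range(n)],
        "Binomial": [binomial(10, 0.3) for _ in range(n)],
        "Poisson": [poisson(3.0) for _ in range(n)],
        "Student's t": [student_t(5) for _ in range(n)],
        "F": [f_ratio(5, 10) for _ in range(n)],
    }


series = simulate_series()
for name, values in series.items():
    print(f"{name:12s} n={len(values)} mean={sum(values) / len(values):.3f}")
```

Each list can then be passed to whichever normality test is under study.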
Background
Let's assume we have a univariate data set ($\{x_t\}$), and we wish to determine whether the data set is well modeled by a Gaussian distribution.

$$
\begin{aligned}
H_o &: X \sim N(\cdot) \\
H_1 &: X \not\sim N(\cdot)
\end{aligned}
$$

Where
$H_o$ = null hypothesis ($X$ is normally distributed)
$H_1$ = alternative hypothesis (the distribution of $X$ deviates from Gaussian)
$N(\cdot)$ = Gaussian or normal distribution
In essence, the normality test is a regular test of a hypothesis that can have two possible outcomes: (1) rejection of the null hypothesis of normality ($H_o$), or (2) failure to reject the null hypothesis.

[1] You can use normal probability plots (i.e. Q-Q plots) as an informal means of assessing the non-normality of a set of data. However, you may need considerable practice before you can judge them with any degree of confidence.


In practice, when we can't reject the null hypothesis of normality, it means that the test fails to find deviance from a normal distribution for this sample. Therefore, it is possible the data is normally distributed.
The problem we typically face is that when the sample size is small, even large departures from normality are not detected; conversely, when your sample size is large, even the smallest deviations from normality will lead to a rejected null.
NormalityTests
How do we test for normality? In principle, we compare the empirical (sample) distribution with a theoretical normal distribution. The measure of deviance can be defined based on distribution moments, a Q-Q plot, or the difference summary between two distribution functions.
Let's examine the following normality tests:
- Jarque-Bera test
- Shapiro-Wilk test
- Anderson-Darling test
Jarque-Bera
The Jarque-Bera test is a goodness-of-fit measure of departure from normality based on the sample kurtosis and skew. In other words, JB determines whether the data have the skew and kurtosis matching a normal distribution.
The test is named after Carlos M. Jarque and Anil K. Bera. The test statistic for JB is defined as:

$$JB = \frac{n}{6}\left(S^2 + \frac{K^2}{4}\right) \sim \chi^2_{\nu=2}$$

Where
$S$ = the sample skew
$K$ = the sample excess kurtosis
$n$ = the number of non-missing values in the sample
$JB$ = the test statistic; $JB$ has an asymptotic chi-square distribution
Notes: For small samples, the chi-squared approximation is overly sensitive, often rejecting the null hypothesis (i.e. normality) when it is in fact true.
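
The JB statistic is simple enough to compute by hand. The sketch below is one possible implementation, not the NumXL function: it assumes the biased (divide-by-n) moment estimates for skew and excess kurtosis, and it relies on the fact that the upper-tail probability of a chi-square variable with 2 degrees of freedom is exp(-x/2). The names `jarque_bera` and `uniform_grid` are made up for the example.

```python
import math


def jarque_bera(sample):
    """Return (JB, p_value) from biased (divide-by-n) moment estimates."""
    x = [v for v in sample if v is not None]  # keep non-missing values only
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurtosis ** 2 / 4.0)
    # Chi-square survival function with 2 degrees of freedom.
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


def uniform_grid(n):
    """Evenly spaced points on (0, 1): a deterministic, uniform-shaped sample."""
    return [(i + 0.5) / n for i in range(n)]


jb_small, p_small = jarque_bera(uniform_grid(50))
jb_large, p_large = jarque_bera(uniform_grid(5000))
print(f"n=50:   JB={jb_small:.3f}  p={p_small:.4f}")
print(f"n=5000: JB={jb_large:.3f}  p={p_large:.3g}")
```

With 50 evenly spaced points the statistic is about 3.0 (p about 0.22), so normality is not rejected at the 5% level even though the data are plainly uniform; with 5000 points of exactly the same shape the null is rejected decisively.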


In the table above, we compute the P-value of the normality test (using the Normality Test function in NumXL). Note that the JB test failed to detect a departure from normality for symmetric distributions (e.g. Uniform and Student's t) using a small sample size ($n \le 50$).
Shapiro-Wilk
Based on the informal approach to judging normality, one rather obvious way to judge the near linearity of any Q-Q plot (see Figure 1) is to compute its "correlation coefficient."

When this is done for normal probability (Q-Q) plots, a formal test can be obtained that is essentially equivalent to the powerful Shapiro-Wilk test $W$ and its approximation $W'$.

$$W = \frac{\left(\sum_{i=1}^{n} a_i x_{(i)}\right)^2}{\sum_{i=1}^{n} \left(x_{(i)} - \bar{x}\right)^2}$$

Figure 1: Q-Q Plot Example


Where
$x_{(i)}$ = the $i$-th order statistic (i.e. the $i$-th smallest number) in the sample
$a_i$ = a constant given by
$$(a_1, a_2, \ldots, a_n) = \frac{m^T V^{-1}}{\left(m^T V^{-1} V^{-1} m\right)^{1/2}}$$
$m$ = the expected values of the order statistics of independent and identically distributed random variables sampled from a Gaussian distribution
$V$ = the covariance matrix of those order statistics

In the table above, the SW P-values are significantly better for small sample sizes ($n \le 50$) in detecting departure from normality, but exhibit similar issues with symmetric distributions (e.g. Uniform, Student's t).
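
The Q-Q plot "correlation coefficient" idea behind this test can be sketched in a few lines. The code below is not the Shapiro-Wilk test itself (it computes no $a_i$ coefficients and no P-value); it only correlates the sorted sample with approximate expected normal order statistics. Using Blom's plotting positions $(i - 0.375)/(n + 0.25)$ for that approximation is an assumption of this sketch, as is the function name `qq_correlation`.

```python
import math
from statistics import NormalDist, mean


def qq_correlation(sample):
    """Pearson correlation between sorted data and normal quantiles."""
    x = sorted(sample)
    n = len(x)
    std_normal = NormalDist()
    # Blom plotting positions approximate the expected normal order
    # statistics (an assumption of this sketch).
    q = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    mean_x, mean_q = mean(x), mean(q)
    sxq = sum((a - mean_x) * (b - mean_q) for a, b in zip(x, q))
    sxx = sum((a - mean_x) ** 2 for a in x)
    sqq = sum((b - mean_q) ** 2 for b in q)
    return sxq / math.sqrt(sxx * sqq)


n = 50
probs = [(i - 0.5) / n for i in range(1, n + 1)]
# Deterministic samples: quantiles of a normal and of an exponential law.
normal_like = [10.0 + 2.0 * NormalDist().inv_cdf(p) for p in probs]
skewed = [-math.log(1.0 - p) for p in probs]

r_normal = qq_correlation(normal_like)
r_skewed = qq_correlation(skewed)
print(f"normal-shaped sample: r = {r_normal:.4f}")
print(f"skewed sample:        r = {r_skewed:.4f}")
```

A value of r close to 1 means the Q-Q plot is nearly a straight line; the skewed sample gives a visibly lower value. Turning r into a formal decision still requires critical values or a P-value, which this sketch does not provide.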
Anderson-Darling
The Anderson-Darling tests for normality are based on the empirical distribution function (EDF). The test statistic is based on the squared difference between the normal and empirical distribution functions:

$$A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}\left[(2i-1)\ln U_i + (2n+1-2i)\ln(1-U_i)\right]$$


In sum, we construct an empirical distribution using the sorted sample data, compute the theoretical (Gaussian) cumulative distribution ($U_i$) at each point ($x_{(i)}$) and, finally, calculate the test statistic.

And, in the case where the variance and mean of the normal distribution are both unknown, the test statistic is expressed as follows:

$$A^{*2} = A^2\left(1 + \frac{4}{n} - \frac{25}{n^2}\right)$$
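
The Anderson-Darling steps described in this section (sort the data, evaluate the fitted Gaussian CDF at each point, accumulate the weighted log terms, then apply the small-sample adjustment) can be sketched as follows. This is a minimal illustration rather than a NumXL routine: estimating the mean and standard deviation from the sample (with the n-1 divisor) and the name `anderson_darling_normal` are assumptions, and no critical values or P-values are computed.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """Return (A2, A2_star) for a normality check with estimated mean and sd."""
    x = sorted(sample)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))  # parameters estimated from data
    # U_i: fitted Gaussian CDF at each ordered point. Extreme outliers could
    # give U_i of exactly 0 or 1, where the logarithm below is undefined.
    u = [fitted.cdf(v) for v in x]
    total = sum(
        (2 * i - 1) * math.log(u[i - 1])
        + (2 * n + 1 - 2 * i) * math.log(1.0 - u[i - 1])
        for i in range(1, n + 1)
    )
    a2 = -n - total / n
    # Adjustment for the case where mean and variance are both unknown.
    a2_star = a2 * (1.0 + 4.0 / n - 25.0 / n ** 2)
    return a2, a2_star


n = 50
probs = [(i - 0.5) / n for i in range(1, n + 1)]
# Deterministic samples: quantiles of a normal and of an exponential law.
normal_like = [10.0 + 2.0 * NormalDist().inv_cdf(p) for p in probs]
skewed = [-math.log(1.0 - p) for p in probs]

a2_normal, a2_star_normal = anderson_darling_normal(normal_like)
a2_skewed, a2_star_skewed = anderson_darling_normal(skewed)
print(f"normal-shaped sample: A2 = {a2_normal:.4f}, A*2 = {a2_star_normal:.4f}")
print(f"skewed sample:        A2 = {a2_skewed:.4f}, A*2 = {a2_star_skewed:.4f}")
```

The larger the statistic, the further the empirical distribution sits from the fitted Gaussian; the skewed sample yields a much larger value than the normal-shaped one.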

Note: The AD Test is currently planned for the next NumXL release; we won't show results here, as you can't yet reproduce them.
Conclusion
These three tests use very different approaches to test for normality: (1) JB uses the moments-based comparison, (2) SW examines the correlation in the Q-Q plot and (3) AD tests the difference between empirical and theoretical distributions.
In a way, the tests complement each other, but some are more useful in certain situations than others. For example, JB works poorly for small sample sizes (n < 50) or very large sample sizes (n > 5000).
The SW method works better for small sample sizes (n > 3 but less than 5000).
In terms of power, Stephens[i] found AD statistics ($A^2$) to be one of the best EDF statistics for detecting departure from normality, even when used with small samples ($n \le 25$). Nevertheless, the AD test has the same problem with a large sample size, where slight imperfections lead to a rejection of a null hypothesis.

[i] Stephens, M.A. (1974). "EDF Statistics for Goodness of Fit and Some Comparisons". Journal of the American Statistical Association 69: 730-737.
Figure 2: Empirical Distribution Function (EDF vs. Normal)
