Normality Test in Excel

Normality Test tutorial, Spider Financial Corp, 2012
Facts and Myths about Normality Test
In time series and econometric analysis and modeling, we often encounter the normality test as part of
the residuals diagnosis to validate a model's assumption(s).
Does the normality test [1] tell us whether standardized residuals follow a Gaussian distribution? Not
exactly.
So, what exactly does this test do? Why do we have several different methods for testing normality?
Note: For illustration, we simulated 5 series of random numbers using the Analysis Pack in Excel. Each
series has a different underlying distribution: Normal, Uniform, Binomial, Poisson, Student's t and F
distribution.
Background
Let's assume we have a data set of a univariate ($\{x_t\}$), and we wish to determine whether the data set
is well modeled by a Gaussian distribution.
$$H_o: X \sim N(\cdot)$$
$$H_1: X \nsim N(\cdot)$$

Where
$H_o$ = null hypothesis ($X$ is normally distributed)
$H_1$ = alternative hypothesis ($X$ distribution deviates from Gaussian)
$N(\cdot)$ = Gaussian or normal distribution
In essence, the normality test is a regular test of a hypothesis that can have two possible outcomes: (1)
rejection of the null hypothesis of normality ($H_o$), or (2) failure to reject the null hypothesis.
[1] You can use the normal probability plots (i.e. Q-Q plots) as an informal means of assessing the
non-normality of a set of data. However, you may need considerable practice before you can judge them
with any degree of confidence.
In practice, when we can't reject the null hypothesis of normality, it means that the test fails to find
a deviation from a normal distribution for this sample. Therefore, it is possible that the data are
normally distributed, but the test does not prove it.
The problem we typically face is that when the sample size is small, even large departures from
normality are not detected; conversely, when the sample size is large, even the smallest deviations
from normality will lead to a rejected null.
Normality Tests
How do we test for normality? In principle, we compare the empirical (sample) distribution with a
theoretical normal distribution. The measure of deviance can be defined based on distribution
moments, a Q-Q plot, or the difference summary between two distribution functions.
Let's examine the following normality tests:
Jarque-Bera test
Shapiro-Wilk test
Anderson-Darling test
Jarque-Bera
The Jarque-Bera test is a goodness-of-fit measure of departure from normality based on the sample
kurtosis and skew. In other words, JB determines whether the data have the skew and kurtosis matching
a normal distribution.
The test is named after Carlos M. Jarque and Anil K. Bera. The test statistic for JB is defined as:
$$JB = \frac{n}{6}\left(S^2 + \frac{K^2}{4}\right) \sim \chi^2_{\nu=2}$$

Where
$S$ = the sample skew
$K$ = the sample excess kurtosis
$n$ = the number of non-missing values in the sample
$JB$ = the test statistic; $JB$ has an asymptotic chi-square distribution with 2 degrees of freedom
Notes: For small samples, the chi-squared approximation is overly sensitive, often rejecting the null
hypothesis (i.e. normality) when it is in fact true.
In the table above, we compute the P-value of the normality test (using the NormalityTest function in
NumXL). Note that the JB test failed to detect a departure from normality for symmetric distributions
(e.g. Uniform and Student's t) using a small sample size ($n \le 50$).
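To make the JB formula concrete, here is a minimal Python sketch using only the standard library. It is not the NumXL implementation: the function names `jarque_bera` and `uniform_grid` are hypothetical, the moments use the divide-by-n convention (an assumption; other software may apply small-sample corrections), and the "uniform" data are evenly spaced points rather than random draws so that the result is reproducible. The P-value uses the fact that a chi-square variable with 2 degrees of freedom has upper tail probability exp(-x/2).

```python
import math


def jarque_bera(sample):
    """Return (S, K, JB, p_value) for a sequence of numbers."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments, divide-by-n convention (an assumption of this sketch).
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurtosis ** 2 / 4.0)
    # Chi-square with 2 degrees of freedom: P(X > x) = exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return skew, excess_kurtosis, jb, p_value


def uniform_grid(n):
    """Evenly spaced points in (0, 1): a deterministic stand-in for uniform data."""
    return [(i + 0.5) / n for i in range(n)]


# Same (non-normal) shape, two sample sizes.
for n in (20, 2000):
    s, k, jb, p = jarque_bera(uniform_grid(n))
    print(f"n={n:5d}  S={s:+.3f}  K={k:+.3f}  JB={jb:8.3f}  p={p:.3g}")
```

With n = 20 the P-value stays well above 0.05 even though the data are clearly not Gaussian, while with n = 2000 the same shape is rejected decisively. This mirrors the small-sample behavior described for symmetric distributions.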
Shapiro-Wilk
Based on the informal approach to judging normality, one rather obvious way to judge
the near linearity of any Q-Q plot (see Figure 1) is to compute its "correlation coefficient."
When this is done for normal probability (Q-Q) plots, a formal test can be obtained that is
essentially equivalent to the powerful Shapiro-Wilk test $W$ and its approximation $W'$.
$$W = \frac{\left(\sum_{i=1}^{n} a_i x_{(i)}\right)^2}{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}$$
Figure 1: Q-Q Plot Example
Where
$x_{(i)}$ = the i-th order statistic (i.e. the i-th smallest number) in the sample
$\bar{x}$ = the sample mean
$a_i$ = a constant given by
$$(a_1, a_2, \dots, a_n) = \frac{m^T V^{-1}}{\left(m^T V^{-1} V^{-1} m\right)^{1/2}}$$
$m$ = the expected values of the order statistics of independent and identically distributed random
variables sampled from a Gaussian distribution
$V$ = the covariance matrix of those order statistics
In the table above, the SW P-values are significantly better for small sample sizes ($n \le 50$) in detecting
departure from normality, but exhibit similar issues with symmetric distributions (e.g. Uniform, Student's
t).
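The "correlation coefficient of the Q-Q plot" idea can be sketched in a few lines of Python. This is not the exact Shapiro-Wilk $W$, which needs the covariance matrix $V$. It is a simplified approximation that replaces the expected order statistics $m$ with normal quantiles at Blom's plotting positions $(i - 0.375)/(n + 0.25)$ (an assumption of this sketch) and returns the squared correlation between those scores and the sorted data. The function name `qq_correlation_squared` and the two test series are hypothetical, and the series are built from quantiles rather than random draws so the output is reproducible.

```python
import math
from statistics import NormalDist, fmean


def qq_correlation_squared(sample):
    """Squared correlation between sorted data and approximate normal scores."""
    n = len(sample)
    ordered = sorted(sample)
    std_normal = NormalDist()
    # Blom's plotting positions approximate the expected normal order statistics.
    scores = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    x_bar = fmean(ordered)
    m_bar = fmean(scores)
    sxy = sum((x - x_bar) * (m - m_bar) for x, m in zip(ordered, scores))
    sxx = sum((x - x_bar) ** 2 for x in ordered)
    smm = sum((m - m_bar) ** 2 for m in scores)
    return sxy ** 2 / (sxx * smm)


n = 50
probs = [(i - 0.5) / n for i in range(1, n + 1)]
# Quantiles of N(10, 2): a sample that looks Gaussian by construction.
normal_like = [NormalDist(10, 2).inv_cdf(p) for p in probs]
# Quantiles of a unit exponential: a strongly right-skewed sample.
skewed = [-math.log(1.0 - p) for p in probs]

w_normal = qq_correlation_squared(normal_like)
w_skewed = qq_correlation_squared(skewed)
print(f"normal-like: {w_normal:.4f}   skewed: {w_skewed:.4f}")
```

Values close to 1 correspond to a nearly straight Q-Q plot; the skewed series produces a visibly lower value. Turning the statistic into a P-value requires its sampling distribution, which this sketch does not attempt.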
Anderson-Darling
The Anderson-Darling tests for normality are based on the empirical distribution function (EDF). The test
statistic is based on the squared difference between the normal and empirical distribution functions:
$$A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}\left[(2i-1)\ln U_i + (2n+1-2i)\ln(1-U_i)\right]$$
In sum, we construct an empirical distribution using the sorted sample data, compute the
theoretical (Gaussian) cumulative distribution ($U_i$) at each point ($x_{(i)}$) and, finally, calculate
the test statistic.
And, in the case where the variance and mean of the normal distribution are both unknown, the test
statistic is expressed as follows:
$$A^{*2} = A^2\left(1 + \frac{4}{n} - \frac{25}{n^2}\right)$$
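The two Anderson-Darling formulas can be traced step by step in a short Python sketch. It is offered under stated assumptions and is not the NumXL implementation: the function name `anderson_darling_normal` is hypothetical, the mean and standard deviation are estimated from the sample (using the n - 1 divisor for the standard deviation), and the two example series are built from quantiles so the output is reproducible. Note that the logarithms are undefined if any $U_i$ equals exactly 0 or 1, which can happen with extreme outliers.

```python
import math
from statistics import NormalDist, fmean, stdev


def anderson_darling_normal(sample):
    """Return (A2, A2_star) with mean and variance estimated from the sample."""
    n = len(sample)
    ordered = sorted(sample)
    fitted = NormalDist(fmean(ordered), stdev(ordered))
    # U_i: fitted Gaussian CDF evaluated at the i-th smallest observation.
    u = [fitted.cdf(x) for x in ordered]
    total = 0.0
    for i in range(1, n + 1):
        total += (2 * i - 1) * math.log(u[i - 1])
        total += (2 * n + 1 - 2 * i) * math.log(1.0 - u[i - 1])
    a2 = -n - total / n
    # Adjustment for the case where both mean and variance are unknown.
    a2_star = a2 * (1.0 + 4.0 / n - 25.0 / n ** 2)
    return a2, a2_star


n = 50
probs = [(i - 0.5) / n for i in range(1, n + 1)]
normal_like = [NormalDist(10, 2).inv_cdf(p) for p in probs]  # Gaussian quantiles
skewed = [-math.log(1.0 - p) for p in probs]                  # exponential quantiles

for name, data in (("normal-like", normal_like), ("skewed", skewed)):
    a2, a2_star = anderson_darling_normal(data)
    print(f"{name:12s} A2={a2:.4f}  A*2={a2_star:.4f}")
```

Larger values of the statistic indicate a larger gap between the empirical and fitted Gaussian distribution functions. Converting the adjusted statistic into a decision requires critical values from published tables, which are not reproduced here.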

Note: The AD test is currently planned for the next NumXL release; we won't show results here, as you
can't yet reproduce them.
Conclusion
These three tests use very different approaches to test for normality: (1) JB uses the moments-based
comparison, (2) SW examines the correlation in the Q-Q plot and (3) AD tests the difference between
empirical and theoretical distributions.
In a way, the tests complement each other, but some are more useful in certain situations than others.
For example, JB works poorly for small sample sizes (n < 50) or very large sample sizes (n > 5000).
The SW method works better for small sample sizes (n > 3 but less than 5000).
In terms of power, Stephens [i] found the AD statistic ($A^2$) to be one of the best EDF statistics for
detecting departure from normality, even when used with small samples ($n \le 25$). Nevertheless, the AD
test has the same problem with a large sample size, where slight imperfections lead to a rejection of
the null hypothesis.
[i] Stephens, M. A. (1974). "EDF Statistics for Goodness of Fit and Some Comparisons". Journal of the
American Statistical Association 69: 730-737.
Figure 2: Empirical Distribution Function (EDF vs. Normal)
