Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered
office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of the American Statistical Association
Publication details, including instructions for authors and
subscription information:
http://amstat.tandfonline.com/loi/uasa20

A Comparative Study of Various Tests for Normality
S. S. Shapiro (a), M. B. Wilk (a) & Mrs. H. J. Chen (a)
(a) Computer Applications Inc. and Bell Telephone Laboratories, Inc.
Published online: 10 Apr 2012.

To cite this article: S. S. Shapiro , M. B. Wilk & Mrs. H. J. Chen (1968) A Comparative Study of Various
Tests for Normality, Journal of the American Statistical Association, 63:324, 1343-1372

To link to this article: http://dx.doi.org/10.1080/01621459.1968.10480932

A COMPARATIVE STUDY OF VARIOUS TESTS FOR NORMALITY
S. S. SHAPIRO,* M. B. WILK* AND MRS. H. J. CHEN
Computer Applications Inc. and Bell Telephone Laboratories, Inc.
Results are given of an empirical sampling study of the sensitivities
of nine statistical procedures for evaluating the normality of a complete
sample. The nine statistics are W (Shapiro and Wilk, 1965),
$\sqrt{b_1}$ (standard third moment), $b_2$ (standard fourth moment), KS
(Kolmogorov-Smirnov), CM (Cramér-von Mises), WCM (weighted
CM), D (modified KS), CS (chi-squared) and u (Studentized range).
Forty-five alternative distributions in twelve families and five sample
sizes were studied. Results are included on the comparison of the
statistical procedures in relation to groupings of the alternative
distributions, on means and variances of the statistics under the various
alternatives, on dependence of sensitivities on sample size, on approach to
normality as measured by the W statistic within some classes of
distribution, and on the effect of misspecification of parameters on the
performance of the simple hypothesis test statistics.

The general findings include: (i) The W statistic provides a generally
superior omnibus measure of non-normality; (ii) the distance tests
(KS, CM, WCM, D) are typically very insensitive; (iii) the u statistic
is excellent against symmetric, especially short-tailed, distributions but
has virtually no sensitivity to asymmetry; (iv) a combination of both
$\sqrt{b_1}$ and $b_2$ usually provides a sensitive judgment but even their
combined performance is usually dominated by W; (v) with sensitive
procedures, good indication of extreme non-normality (e.g., the exponential
distribution) can be achieved with samples of size less than 20.

1. INTRODUCTION

This paper summarizes some of the results of an empirical sampling
study of the comparative sensitivities of nine statistical procedures for
evaluating the supposed normality of a complete sample, covering a range of
alternative distributions and several sample sizes (n = 10, 15, 20, 35, 50). A
motivation for the study was a desire to evaluate the performance of the
procedure for testing normality described in Shapiro and Wilk (1965).
The nine statistics employed in this study are defined in Table 1, each
considered in the context of a test for normality. Four of these, namely W,
$\sqrt{b_1}$, $b_2$, and u, are each scale and origin invariant and hence are
appropriate for testing the composite hypothesis of normality. The remaining
five, namely KS, CM, WCM, D and CS, as studied here require the complete
specification of the null distribution. For these, the mean and variance of the
specified simple normal hypothesis were taken as the (known) mean and variance
of the actual alternative distribution in the study. Thus, for example, if the
alternative distribution was chi-squared with 4 degrees of freedom, the simple
hypothesis tested was that the sample came from a normal distribution with
mean 4 and variance 8.
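
The matching of the simple null hypothesis to the alternative can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the Kolmogorov-Smirnov distance is taken in its usual two-sided form, which may differ in scaling from the form used in the study.

```python
import math
from statistics import NormalDist


def matched_normal_for_chi2(df):
    """Normal null with the mean (df) and variance (2*df) of chi-squared(df)."""
    return NormalDist(mu=df, sigma=math.sqrt(2 * df))


def ks_statistic(sample, null):
    """Two-sided distance between the empirical c.d.f. of `sample` and the
    fully specified null c.d.f. (assumed standard form of the KS statistic)."""
    ys = sorted(sample)
    n = len(ys)
    d = 0.0
    for i, y in enumerate(ys, start=1):
        f = null.cdf(y)
        # The empirical c.d.f. jumps from (i-1)/n to i/n at y.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


null = matched_normal_for_chi2(4)  # N(4, 8), as in the example above
print(null.mean, null.variance)
print(ks_statistic([0.7, 1.9, 2.8, 3.4, 4.1, 5.6, 7.3, 9.9], null))
```

The eight data values in the last line are made up for the demonstration and are not from the study.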
The chi-squared (CS) test defined here has been based on equiprobable cells.
It is de facto a simple hypothesis test, based on an arbitrary decision as to the
* The early research of these authors on this project was done while both were
at Rutgers University and supported by the Office of Naval Research under
Contract Nonr 404(16).

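TABLE 1. DEFINITION OF TESTS STUDIED

No.  Code            Description, reference and definition

0.   W               Shapiro and Wilk (1965)
                     $W = \left[\sum_{i=1}^{k} a_{n-i+1}\,(y_{n-i+1} - y_i)\right]^2 \Big/ \sum_{i=1}^{n} (y_i - \bar{y})^2$
                     k = greatest integer in n/2
                     $a_{n-i+1}$ = coefficients tabulated in Shapiro and Wilk (1965).

1.   $\sqrt{b_1}$    Standard third moment
                     $\sqrt{b_1} = \sqrt{n}\,\sum_{i=1}^{n} (y_i - \bar{y})^3 \Big/ \left[\sum_{i=1}^{n} (y_i - \bar{y})^2\right]^{3/2}$

2.   $b_2$           Standard fourth moment
                     $b_2 = n \sum_{i=1}^{n} (y_i - \bar{y})^4 \Big/ \left[\sum_{i=1}^{n} (y_i - \bar{y})^2\right]^{2}$

3.   KS              Kolmogorov-Smirnov

4.   CM              Cramér-von Mises, Cramér (1928)
                     $CM = n \int_{-\infty}^{\infty} \left[F_n(y) - F(y)\right]^2 dF(y)$
                     $F_n$ is the empirical distribution function.

5.   WCM             Weighted Cramér-von Mises

6.   D               Durbin, Durbin (1961)
                     $D = \max_{i} \left( \frac{i}{n} - \sum_{j=1}^{i} g_j \right)$, i = 1, 2, ..., n
                     $g_j = (n+2-j)\,(c'_j - c'_{j-1})$, j = 1, 2, ..., n
                     $0 = c'_0 \le c'_1 \le \cdots \le c'_{n+1}$ obtained by ordering
                     $c_1 = u_1,\ c_2 = u_2 - u_1,\ \ldots,\ c_{n+1} = 1 - u_n$
                     $u_i = F(y_i)$, i = 1, 2, ..., n

7.   CS              Chi-squared (equiprobable cells, see Section 2)
                     $CS = \frac{k}{n} \sum_{i=1}^{k} c_i^2 - n$
                     k = number of cells, $c_i$ = number of observations per cell

8.   u               David et al. (1954)
                     $u = (y_n - y_1) \Big/ \left[\sum_{i=1}^{n} (y_i - \bar{y})^2 / (n-1)\right]^{1/2}$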
number of cells (k) used. For this study, the selected values of k for the various
sample sizes (n) were: (n, k) = (10, 4), (15, 5), (20, 5), (35, 7) and (50, 9).
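
A minimal Python sketch of a chi-squared statistic on equiprobable cells is given here for illustration. It assumes the usual Pearson form with expected count n/k in every cell under the specified normal null; the function name and the tie-handling rule at cell boundaries are assumptions, not taken from the study.

```python
from statistics import NormalDist

# Number of cells k used for each sample size n in the study.
CELLS_BY_N = {10: 4, 15: 5, 20: 5, 35: 7, 50: 9}


def cs_statistic(sample, mu, sigma, k=None):
    """Pearson chi-squared statistic on k cells that are equiprobable
    under the fully specified null N(mu, sigma**2)."""
    n = len(sample)
    if k is None:
        k = CELLS_BY_N[n]
    null = NormalDist(mu, sigma)
    # Interior cell boundaries are the j/k quantiles of the null.
    edges = [null.inv_cdf(j / k) for j in range(1, k)]
    counts = [0] * k
    for y in sample:
        # Cell index = number of boundaries strictly below y (assumed rule).
        counts[sum(1 for e in edges if y > e)] += 1
    expected = n / k
    return sum((c - expected) ** 2 for c in counts) / expected


print(cs_statistic([-1.0, -0.5, 0.5, 1.0], 0.0, 1.0, k=4))
```

With every cell holding exactly n/k observations the statistic is zero; it grows as observations pile into a few cells, which is why ties in discrete data interact strongly with the choice of k.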
A notation common to all the definitions is that $y_1 \le y_2 \le \cdots \le y_n$
denotes the ordered observations from a complete sample of size n, $\bar{y}$ is
the sample mean, and $F_n(\cdot)$ is the empirical distribution function.
The results of the study are summarized in Section 9, which may be read
independently of the more detailed presentations.
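
For concreteness, the three simplest scale and origin invariant statistics can be computed as in the Python sketch below. It assumes the conventional definitions: standardized third and fourth sample moments for $\sqrt{b_1}$ and $b_2$, and range divided by the sample standard deviation (divisor n - 1) for u. The function name is hypothetical.

```python
import math


def moment_and_range_statistics(sample):
    """Return (sqrt_b1, b2, u) for one sample, under assumed standard forms."""
    n = len(sample)
    mean = sum(sample) / n
    dev = [y - mean for y in sample]
    s2 = sum(d ** 2 for d in dev)  # sum of squares about the mean
    s3 = sum(d ** 3 for d in dev)
    s4 = sum(d ** 4 for d in dev)
    sqrt_b1 = math.sqrt(n) * s3 / s2 ** 1.5          # standardized third moment
    b2 = n * s4 / s2 ** 2                            # standardized fourth moment
    u = (max(sample) - min(sample)) / math.sqrt(s2 / (n - 1))  # Studentized range
    return sqrt_b1, b2, u


print(moment_and_range_statistics([1, 2, 3, 4, 5]))
```

Adding a constant to every observation, or multiplying all of them by a positive constant, leaves the three values unchanged, which is the invariance that makes these statistics usable for the composite hypothesis.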
2. ALTERNATIVE DISTRIBUTIONS STUDIED

The alternative distributions employed are listed in Table 2, by families.
These give a wide range of shapes. Since some of the statistics are scale and
origin invariant while the remainder were adjusted for such parameters, it was
only necessary to vary shape parameters in each family. Thus, for example, only
the $\mathrm{sech}^2$ distribution was studied amongst the logistic family.
Because the null distribution is symmetric, the majority of the alternative
distributions were chosen to be symmetric to provide stringent conditions for
comparison of the methods.
Table 2 also gives the $\sqrt{\beta_1}$ and $\beta_2$ values for the
distributions, i.e., the standardized third and fourth moments. The
$\sqrt{\beta_1}$ values range up to 6.62 (for WE(.5)), while $\beta_2$ values
lie between 1.75 (Tu(1, 1.5)) and 113.94 (LN).
Also several misspecified normal distributions were used to study the
effect of small errors in the assumed values of the normal parameters in
testing the simple hypothesis that the distribution was N(0, 1). The alternative
parameters used were: $(\mu, \sigma)$ = (0, 1.2), (0, 1.3), (.15, 1.0), (.18, 1.2),
(.195, 1.3), (.3, 1), (.36, 1.2), (.39, 1.3); note that $\mu/\sigma$ has the
values 0, .15 and .30.
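
The remark on the ratio of mean to standard deviation can be checked directly. The short Python sketch below uses the eight parameter pairs listed above; the variable names are hypothetical.

```python
# (mean, standard deviation) pairs of the misspecified normal alternatives.
PAIRS = [(0, 1.2), (0, 1.3), (.15, 1.0), (.18, 1.2),
         (.195, 1.3), (.3, 1.0), (.36, 1.2), (.39, 1.3)]

# Distinct values of mean / standard deviation, rounded to absorb float noise.
ratios = sorted({round(mu / sigma, 6) for mu, sigma in PAIRS})
print(ratios)
```

The design therefore crosses three relative shifts of the mean with three standard deviations (1.0, 1.2 and 1.3), leaving out only the correctly specified N(0, 1) case.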
3. DETAILS OF THE SAMPLING STUDY

The results given are based on empirical sampling. Samples from the various
distributions were generated by a system of procedures developed by Fowlkes
(1965), in which the basic input was the Rand Corporation's (1955) normal
and uniform deviates. In generating samples of a given size from a distribution,
reuse of the same deviates was avoided. However, the same deviates were
reused for the differing sizes of samples.
The study involved sample sizes of n = 10, 15, 20, 35 and 50. For convenience,
the null distributions of eight of the nine statistics were obtained by empirical
sampling; for the CS test tabulated chi-squared values were used. For all
statistics except W, the empirical null distribution was based on M = 500 samples
for each sample size; for the W statistic, it was based on M = 5000 for
$n \le 20$ and on M = [100,000/n] for $20 < n \le 50$.
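
The rule for the number of null samples can be written out as a small Python sketch. The square brackets are read here as "greatest integer in", which is an assumption consistent with their use elsewhere in the paper; the function name is hypothetical.

```python
def null_sample_count(n, statistic):
    """Number M of samples behind the empirical null distribution.

    The CS test is not covered: it used tabulated chi-squared values.
    """
    if statistic != "W":
        return 500
    # For W: 5000 samples up to n = 20, then the integer part of 100,000 / n.
    return 5000 if n <= 20 else 100_000 // n


for n in (10, 15, 20, 35, 50):
    print(n, null_sample_count(n, "W"), null_sample_count(n, "KS"))
```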
For the non-null distributions, the empirical c.d.f.s of the various statistics
were based on M=200 samples. The same samples were submitted to each of
the statistics.
For a typical null empirical distribution the output consisted of a listing of
the quantiles corresponding to values of the c.d.f. of p = .005,
.01(.01).05(.025).15(.05).30(.10).70(.05).85(.025).95(.01).99, .995.
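
The compact notation a(step)b for the probability grid can be expanded mechanically. The Python sketch below does this in integer thousandths to avoid floating-point drift; the helper name is hypothetical.

```python
def expand_grid(start, segments):
    """Expand 'start(step)stop(step)stop...' into an explicit list.

    All values are integers in thousandths, so .025 is written 25.
    """
    points = [start]
    for step, stop in segments:
        while points[-1] < stop:
            points.append(points[-1] + step)
    return points


# p = .005, .01(.01).05(.025).15(.05).30(.10).70(.05).85(.025).95(.01).99, .995
thousandths = [5] + expand_grid(
    10, [(10, 50), (25, 150), (50, 300), (100, 700),
         (50, 850), (25, 950), (10, 990)]) + [995]
levels = [t / 1000 for t in thousandths]
print(len(levels), levels)
```

The grid is finest in the two tails, where significance levels of practical interest lie, and coarsest near the median.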
For a typical non-null run, the empirical null distribution was part of the
input. The output results consisted of a listing giving, for each null quantile
TABLE 2. DISTRIBUTIONS USED IN STUDY

Family                                  Parameters            $\sqrt{\beta_1}$   $\beta_2$

(1) Beta (p, q)                         p = 1, q = 1              0              1.80
                                        p = 1.1, q = 1.1          0              1.85
                                        p = 1.3, q = 1.3          0              1.93
                                        p = 1.5, q = 1.5          0              2.00
                                        p = 2, q = 2              0              2.14
                                        p = 2, q = 1              [?]            2.40
                                        p = 3, q = 2              [?]            2.36
(2) Binomial BI(k)                      k = 4                     0              2.50
                                        k = 8                     0              2.75
                                        k = 12                    0              2.83
                                        k = 20                    0              2.90
(3) Chi-squared $\chi^2(\nu)$           $\nu$ = 1                 2.83          15.00
                                        $\nu$ = 2 (exponential)   2.00           9.00
                                        $\nu$ = 4                 1.41           6.00
                                        $\nu$ = 10                0.89           4.20
(4) Double Chi-Squared ($\nu$)          [?]                       0              1.88
                                        [?]                       0              2.19
                                        [?]                       0              3.00
                                        $\nu$ = 1 (Laplace)       0              6.00
                                        [?]                       0              8.56
                                        [?]                       0             12.26
(5) Johnson SB                          [?]                       0              [?]
    (transform of a normal variable,    [?]                       0.65           2.13
    0 < x < 1)                          [?]                       0              2.63
    Ref. Johnson (1949)                 [?]                       0.73           2.91
                                        [?]                       0.28           2.77
(6) Logistic ($\alpha$, $\beta$), L                               0              4.2
    $-\infty < x < \infty$
(7) Log Normal ($\mu$, $\sigma^2$)      $\mu$ = 0, $\sigma^2$ = 1  6.18        113.94
(8) Non-Central Chi-Squared             $\nu$ = 1, $\lambda$ = 16  0.73          3.72
    ($\nu$, $\lambda$)
(9) Poisson ($\lambda$)                 $\lambda$ = 1             1.00           4.00
                                        $\lambda$ = 4             0.50           3.25
                                        $\lambda$ = 10            0.32           3.10
(10) Student T($\nu$)                   $\nu$ = 1 (Cauchy)        0              --
                                        $\nu$ = 2                 0              --
                                        $\nu$ = 4                 0              --
                                        $\nu$ = 10                0              4.00
(11) Tukey (a, $\lambda$)               a = 1, $\lambda$ = 0.1    0              [?]
    variates defined by transformation  a = 1, $\lambda$ = 0.2    0              2.71
    $y = aR^{\lambda} - (1-R)^{\lambda}$  a = 1, $\lambda$ = 0.7  0              1.92
    where R is uniform                  a = 1, $\lambda$ = 1.5    0              1.75
    on the unit interval                a = 1, $\lambda$ = 3.0    0              2.06
    Ref. Hastings, et al. (1947)        a = 1, $\lambda$ = 5.0    0              2.90
                                        a = 1, $\lambda$ = 10.0   0              5.33
(12) Weibull (k, $\lambda$)             $\lambda$ = 1, k = 0.5    6.62          87.72
                                        $\lambda$ = 1, k = 2.0    0.63           3.25


z(p), the corresponding non-null empirical c.d.f. value. Thus, for example, for
n = 10 with the W statistic used in testing samples from P(1), the results were

    z(p)     p       $p_p$
    .746    .005    .163
    .781    .01     .233
    .806    .02     .422
    .820    .03     .460
    etc.

where $p_p$ is the non-null c.d.f. of z(p) and hence of p. Thus for samples of
n = 10 from P(1), one sees that the proportion of times that the W statistic lies
below the 1% point of the null distribution is about .233; i.e., that the
proportion of times that the significance level p lies below 1% is about .233;
i.e., that the power of a 1% test is about .233.
This output was available for computer plotting, using the Stromberg-
Carlson 4020. Thus one could obtain so-called "merit curves," which are plots
of $p_p$ (the non-null c.d.f. value) versus p (the null c.d.f. value) at each value
of z(p). Clearly then, the value of the ordinate, $p_p$, corresponding to an
abscissa of p, is the power of a 100p% test in the usual Neyman-Pearson
framework. Alternatively, if one regards the significance level, p, attained in a
test, as a random variable, then $p_p$ represents the c.d.f. of p.
Such empirical c.d.f.'s and merit curves were obtained for each statistic,
alternative distribution and sample size as listed above. Only a subset of this
information is included in the paper.¹
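
The construction of a merit curve from simulated statistic values can be sketched in Python as follows. This is an illustration only: it assumes a lower-tail test (small values significant, as for W), an order-statistic definition of the empirical quantile, and hypothetical function names; the study's own quantile convention is not stated.

```python
import math


def null_quantile(null_stats, p):
    """Empirical p-quantile z(p) of the simulated null statistics
    (assumed convention: the ceil(p*M)-th order statistic)."""
    ordered = sorted(null_stats)
    index = max(math.ceil(p * len(ordered) - 1e-9) - 1, 0)
    return ordered[index]


def merit_curve(null_stats, alt_stats, levels):
    """For each null c.d.f. value p, return (z(p), p, non-null c.d.f. at z(p)).

    The third entry is the estimated power of a lower-tail test at level p.
    """
    curve = []
    for p in levels:
        z = null_quantile(null_stats, p)
        power = sum(1 for s in alt_stats if s <= z) / len(alt_stats)
        curve.append((z, p, power))
    return curve


# Toy inputs standing in for simulated statistic values (not study data).
null_stats = list(range(1, 101))
alt_stats = [3] * 20 + [30] * 30 + [200] * 50
for row in merit_curve(null_stats, alt_stats, [0.05, 0.5]):
    print(row)
```

For a statistic whose large values are significant, the same sketch applies after negating both lists. If the alternative coincides with the null, the curve lies close to the diagonal, so height above the diagonal measures sensitivity.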
4. TYPES OF RESULTS
A subset of merit curves is given in Figures 1.01 to 1.36, each showing the
comparative performance of the nine statistics for a selected alternative dis-
tribution and sample size. The sensitivity of a test statistic is indicated by the
height of the merit curve.
A compilation of the power of a 10% level test is given in Table 3 for each of
the 5 sample sizes and the 44 alternative distributions studied.
A partial summary of the effects of sample size is provided in Figures 2.01
to 2.39, each giving, for a specified alternative distribution, a plot of the power
of a 5% test as a function of log n, with all nine procedures shown in the same
graph. Because the comparative sensitivities of the procedures are roughly the
same at all significance levels, the plots at the 5% level are generally indicative.
Another compilation of results, directed towards approach to normality
within a family of distributions, is provided by Figures 3.01 to 3.06. Each plot
deals with a given family, indexed by a parameter represented as abscissa.
The ordinate is the power of a 5% test using the W statistic.
Results on the effect of misspecification of null distribution parameters for
those statistics which can be used only with simple hypotheses are summarized
in Table 4, the entry being the actual probability of exceeding the null 5%
point.
¹ More details are available on request from the authors.

[Figures 1.01 to 1.36. Merit curves for test comparisons: $p_p$ (ordinate, 0 to
1.0) plotted against p (abscissa, 0 to 0.20), one panel per selected alternative
distribution and sample size N. Curve codes: W = 0, $\sqrt{b_1}$ = 1, $b_2$ = 2,
KS = 3, CM = 4, WCM = 5, D = 6, CS = 7, u = 8.]


TABLE 3
10 PERCENT POWERS IN % FOR W, $\sqrt{b_1}$ AND $b_2$ TESTS

TEST               W                $\sqrt{b_1}$        $b_2$
SAMPLE SIZE   10 15 20 35 50   10 15 20 35 50   10 15 20 35 50
ALT. DISTN.
15 28 39 76 96 5 2 0 2 1 20 22 38 80 93

1 16 25 30 66 90
14 17 27 46 82
9 13 20 38 61
13 12 12 22 37
23 30 47 78 96
6 3 1 1 1
3 3 1 1
2 1 2 1 1
5 3 0 3 1
17 11 21 43 59
19 22 33 76 92
18 18 28 58 82
12 15 26 57 69
16 11 15 33 51
22 14 21 26 29
16 21 19 22 42 7 6 8 837 14 11 10 25 29
58 74 93 * * 910 4 4 2 11 8 5 10 14
34 45 51 78 96 8 8 8 4 8 11 7 6 9 10
27 32 39 58 76 10 g 8 11 9 14 6 7 11 7
20 22 24 32 37 10 6 11 10 6 10 5 10 12 8
82 94 99 * 66 89 92 * 45 55 64 80 93
57 82 89 * * 43 71 82 96 * 27 38 44 69 81
33 43 65 84 97 29 50 64 83 95 19 27 35 49 51
24 30 40 56 74 19 35 35 59 71 12 15 25 33 37
18 18 23 61 84 6 5 1 3 2 19 19 28 66 88
14 11 14 27 37 6 3 4 .2 3 16 13 21 37 48
25 27 32 48 49 24 28 35 44 44 18 27 3 54 60
31 44 52 68 77 31 44 50 57 64 2 40 43 75 84
39 50 66 78 85 42 47 54 58 69 3.? 49 60 77 85
10 27 29 57 85 4 3 2 0 1 18 22 41 74 89
44 70 85 * 13 35 34 46 67 29 21 38 49 51
13 513 7 1 1 7 6 6 4 10 3 6 10 10
13 :1 12 12 8 10 16 11 10 9 10 9 g 16 8
12 12 12 14 19 7 9 14 11 17 10 8 14 13 11
L 12 14 16 20 20 16 19 1527 29 12 13 13 31 34
LN 72 88 96 60 83 91 * 46 59 67 89 90
(1,16)
NCX~ 37 63 74 95 26 56 61 85 95 22 30 29 41 47
84 99 * * *
:It1
P lo) 24 31 38 57 80
21 16 16 27 24
22 36 37 68 81
9 16 15 23 35
10 12 11 15 17
20 17 16 37 30
10 12 12 17 18
14 8 g 13 10

i1):
42 81 92 99 * 65 76 82 90 94 56 82 88 97 99
41 51 58 80 84 42 51 54 74 75 38 53 58 87 92
T 10) 23 26 27 38 42 27 30 29 43 49 20 22 30 42 60
13 19 18 17 21 12 18 18 22 28 10 13 15 26 28
13 13 12 g 8 12 12 14 13 14 10 8 9 13 13
10 12 15 g 12 1 1 8 9 6 5 10 6 11 15 g
15 16 25 57 78 3 2 0 2 19 16 32 66 81
17 36 44 82 97 3 4 2 2 1 19 29 53 87 98
13 8 14 28 55 5 2 2 1 1 13 12 18 47 60
12 12 14 22 24 1 1 6 8 8 3 g 6 710 2
70 79 90 96 50 49 48 37 43 44 53 61 88 go
9 4 * * * * 76 96 99 * 54 74 8 98 98
18 23 24 40 59 16 23 23 38 51 11 15 12 22 18
TABLE 3. (continued)
10 PERCENT POWERS IN % FOR KS, CM AND WCM TESTS

TEST               KS               CM               WCM
SAMPLE SIZE   10 15 20 35 50   10 15 20 35 50   10 15 20 35 50
ALT. DISTN.
14 11 19 18 26 11 8 19 14 20 14 8 19 18 27
17 14 19 13 9 1 13 12 14 7 17 14 20
i6313 17 18 24
19 13 17 14 20
16 13 13 13 1
18 1 13 10 1 z 17
18
ii ii 15 21
11 13 10 18
12 7 17 14 14
10 10 14 16 22
9 217 13 10
16 12 20 15 21
11 g 17 15 12
17 13 20 18 28
12 6 8 12 10 8 9 11 10 8 12 8 12 8 11
26 48 58 86 17 18 31 57 95 20 18 33 56 *
23 33 38 47 88 16 14 15 21 26 18 13 17 -21 35
10 1 28 42 72 9 8 19 13 22 11 9 19 14 26
8 12 23 20 40 8 10 17 7 14 9 16 20 7 19
41 51 58 * 30 41 62 83 98 42 48 72 93
20 24 33 48 64 23 26 41 61 86
32 30 37 51 6
20 22 26 33 4
15 18 19 22 34
2 15 16 21 30
9 13 15 16
36
25
14 16 23 34 47
13 11 16 15 27
18 14 15 26 27 1 13 14 21 17 12 16 22 30
19 14 12 12 18
13 7 13 10 18
2
1 12 12
12 7 15
9 11
8 14
20
17 12 13 10 14
36 20 37 34 52
12 13 20 28 50 4 11 17 27 40 10 12 23 33 53
23 20 30 42 59 12 15 26 34 50 22 14 32 42 63
11 13 15 18 30 11 14 12 12 22 10 14 15 15 28
20 28 38 46 63 14 21 31 39 57 17 17 56 45 73
9 810 7 9 8 11 15 10 11 8 9 15 9 13
1011 11 8 14 10 6 g 7 '11 13 5 12 6 12
13 10 9 13 14 12 11 11 13 11 13 11 14 12 12
L 9 6 9 9 8 1 0 9 8 9 5 1 3 8 9 9 7
IAN 28 47 57 * * 25 41 67 90 99 34 43 81 98 *
NCX2 (1 16) 24 25 32 36 45 18 18 24 33 35 21 20 28 32 52

y1
P .lo)
37 52 75 90
13 26 25 29 50
13 14 15 19 29
18 21 38 69 *
9 9 11 12 17
8 6 12 6 11
20 26 52 88
11 g 18 16 22
10 6 11 7 13
30 47 65 86 95 32 46 71 92 95 98 99 *
15 18 23 24 43 18 19 28 25 63 69 81 92 99
11 9 11 19 22 10 7 9 15 lh 14 7 11 17 20
9 611 6 g 6 811 7 g 8 11 13 9 9
9 7 810 8 7 7 1 2 10 7 9 7 13 10 8
10 12 10 9 11 10 11 12 6 a 11 10 14 6 10
11 8 17 13 16 12 7 18 1 21
15 9 17 16 26
16 17 15 21 27
14 8 10 10 14
~
14 16 '15 15 18
14 8 15 11a 14 7 16 10 15
2
16 15- 16 1 28
9 9 14 14 9 12 8 1 16 17
31 45 63 86 99 2
23 33 5 83 95
58 65 * * * 57 73 96 * 63 77 99 * *
12 14 15 17 21 10 11 14 11 16 11 11 15 12 16
TABLE 3. (continued)
10 PERCENT POWERS IN % FOR D, CS AND u TESTS

TEST               D                CS               u
SAMPLE SIZE   10 15 20 35 50   10 15 20 35 50   10 15 20 35 50
ALT. DISTN.
16 17 19 20 35 14 17 18 21 22 32 34 23 88 99
17 14 8 21 23 17 1 14 21 21 27 33 9 81 96
18 12 13 1 20 14 12 17 21 20 22 24 36 61 90
14 12 10 12 11 13 16 11 13 16 19 20 31 51 75
12 10 9 10 12 9 13 14 17 16 20 16 21 30 49
16 15 23 34 37 16 14 12 22 31 25 21 32 48 64
13 10 14 9 8 14 12 9 12 10 19 13 15 22 35
* * * * * 4 * * * * 34 24 31 25 36
:pi1 *
97
*
* * *
432114 * * ig 17 ig 21 18
BI 20 261310 22 16 12 13 12
78 * 17 17 21 15 * 16 13 15 13 10
66 83 93 * 41 94 97 99 18 15 19 31 29
31 39 56 88 93 81 43 43 97 96 15 18 12 30 21
14 16 25 47 48 12 23 20 46 69 9 9 13 21 17
10 14 10 12 21 14 11 14 21 28 13 8 12 11 19
14 14.11 18 20 17 15 14 27 31 20 23 42 70 go
15 9 5 15 12 12 16 9 12 13 24 14 19 27 38
23 21 24 33 35 14 17 25 25 35 L6 23 26 42 51
13
30
20 22 45
29 39 55
45
59
19
32
33 35 56 66
55 71 78
41
1
2z 29
31
35 63 76
45 67 75
12 14 11 15 19 12 20 19 19 27 %' 33 46 77 94
27 27 47 77 89 20 25 29 64 * 41 43 68 92 gg
15 9 a 13 9 12 11 6 12 12 14 7 8 10 7
13 8 8 12 8 11 12 10 7 8 10 7 8 11 10
9 10 9 11 10 8 13 8 10 13 14 10 12 9 13
L 13 4 10 14 12 12 12 8 12 10 14 12 11 22 29
LN 57 76 88 * 4 80 94 98 99 * 13 11 14 35 43
NCX2 (1,16) 21 27 32 51 66 19 23 17 42 93 21 13 17 29 28
* * * + * 3 4 4 * * * 36 36 44 48 56
89 * * * 20 12 10 26 * 15 13 16 13 18
50 93 * * 15 15 13 19 23 19 9 8 12 7
5 84 91 9 * 20 46 2 92 7
$0 41 52 25 66
12 10 11 21 20
23 46 54 95 99
1 18 22 31 59
12 20 18 20 32
19 36 24 78 87
17 16 24 36 51
13 8 9 15 12 9 10 9 13 12 13 9 14 22 24
10 12 6 10 13 12 10 9 12 12
12 8 9 9 12 L4
lo la 8 lo
1 12 l2 12 14 5 10 13 17
17 8 8 18 17 12 14 15 17 18 22 69 87
20 43
14 12 13 30 41 13 23 ig 28 29 26 43 61
91 99
10 6 9 14 13 14 14 9 13 14 ig 12 20 52 75
18 18 18 25 22 11 23 16 26 31 13 11 6 9 5
72 77 87 97 99 3' 83 89 99 37 40 46 68 58
89 99 * * 94 97 gg ig 1017 43 44
10 7 11 13 18 11 15 12 13 17 11 12 10 9 14
* = 100%

[Figures 2.01 to 2.39. Power of a 5% test as a function of log n, one panel per
alternative distribution, with all nine procedures shown in the same graph.]

[Figures 3.01 to 3.06. Power of 5% W test for the families of distribution:
power in % (ordinate) against the family parameter (abscissa), one panel per
family. Fig. 3.01 is BE(p, p) against parameter p.]

TABLE 4. 5 PER CENT POWER IN % FOR THE MISSPECIFIED NORMALS

                                  Test
$\mu$    $\sigma$  $\mu/\sigma$    KS    CM    WCM   D     CS

Sample Size = 10
.000    1.2    .00                  7     6     8    12     4
.000    1.3    .00                  9    11    22    21     6
.150    1.0    .15                  3     8     8     9     4
.300    1.0    .30                  7    13    13    11     9
.180    1.2    .15                  7     7    12    13     1
.360    1.2    .30                 10    16    23    24    17
.195    1.3    .15                  5    12    18    19     8
.390    1.3    .30                 12    21    33    21     9

Sample Size = 15
.000    1.2    .00                  5     4    11    10     6
.000    1.3    .00                 11     8    15    15     9
.150    1.0    .15                  3     8     9     4     5
.300    1.0    .30                 10    17    17    10    12
.180    1.2    .15                  6    13    20     8    10
.360    1.2    .30                  6    21    29    10    16
.195    1.3    .15                  8    13    26    15    12
.390    1.3    .30                 14    26    38    17    29

Sample Size = 20
.000    1.2    .00                  6     8    18     9     7
.000    1.3    .00                 12    12    29    11     9
.150    1.0    .15                  5     8    11     3     4
.300    1.0    .30                 14    26    [?]    7    11
.180    1.2    .15                  8    16    24    11    12
.360    1.2    .30                 21    34    46    16    21
.195    1.3    .15                  7    12    31    12    10
.390    1.3    .30                 21    38    55    19    [?]

Sample Size = 35
.000    1.2    .00                  6     5    11    13    10
.000    1.3    .00                 13    10    23    17    16
.150    1.0    .15                  4     7     9     8     8
.300    1.0    .30                 18    31    40    13    19
.180    1.2    .15                 15    80    30    15    19
.360    1.2    .30                [?]    43    54    16    [?]
.195    1.3    .15                 14    17    33    [?]   [?]
.390    1.3    .30                 36    45    66    [?]   [?]

Sample Size = 50
.000    1.2    .00                 16    11    21    13    13
.000    1.3    .00                 22    20    43    26    30
.150    1.0    .15                 13    17    18     6     9
.300    1.0    .30                 40    54    58    13    25
.180    1.2    .15                 25    33    44    19    29
.360    1.2    .30                 54    64    75    23    48
.195    1.3    .15                 26    26    51    20    30
.390    1.3    .30                 61    68    86    39    57


Summary discussions based on the results referenced above are given in the
ensuing sections concerning comparisons of test statistics, effects of sample
size, approach to normality and effect of parameter misspecification.

Group 1: $|\sqrt{\beta_1}| > .3$, $\beta_2 > 3.0$ (Asymmetric, long-tailed)
Group 2: $|\sqrt{\beta_1}| > .3$, $\beta_2 < 3.0$ (Asymmetric, short-tailed)
Group 3: $|\sqrt{\beta_1}| \le .3$, $\beta_2 > 4.5$ (Symmetric, long-tailed)
Group 4: $|\sqrt{\beta_1}| \le .3$, $\beta_2 < 2.5$ (Symmetric, short-tailed)
Group 5: $|\sqrt{\beta_1}| \le .3$, $2.5 \le \beta_2 \le 4.5$ (Near normal)
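
The partition of the continuous alternatives by their standardized third and fourth moments can be expressed as a small Python function. This is a sketch with a hypothetical function name; the grouping rule does not say where a skewed distribution with fourth moment exactly 3.0 belongs, so placing it in Group 2 is an assumption marked in the code.

```python
def moment_group(sqrt_beta1, beta2):
    """Group number (1 to 5) for a continuous alternative, from its
    standardized third moment sqrt_beta1 and fourth moment beta2."""
    if abs(sqrt_beta1) > 0.3:
        # beta2 == 3.0 is not covered by the stated rule; Group 2 is assumed.
        return 1 if beta2 > 3.0 else 2
    if beta2 > 4.5:
        return 3
    if beta2 < 2.5:
        return 4
    return 5


# Moment values as listed in Table 2 of the paper.
print(moment_group(2.00, 9.00))   # chi-squared, 2 d.f. (exponential)
print(moment_group(0.73, 2.91))   # a skew Johnson SB member
print(moment_group(0.00, 6.00))   # double chi-squared (Laplace)
print(moment_group(0.00, 1.80))   # BE(1, 1) (uniform)
print(moment_group(0.00, 4.20))   # logistic
```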
5.2 Group 2: Continuous, Skew, Low $\beta_2$ Alternatives ($|\sqrt{\beta_1}| \ge .3$, $\beta_2 < 3.0$)
This group includes the BE(2, 1), BE(3, 2), SB(.533, .5) and SB(1, 1)
distributions, some of whose merit curves are included in Figures 1.01 to 1.36.
The W statistic has the highest sensitivity for detecting non-normality for
this group. The u test also performed well. For example, the 5% power for the
BE(2, 1) alternative, for sample size 35, was 63% for W, and 35% for u, while
for the other statistics it ranged from 8 to 22%. For the SB(.533, .5)
alternative, for n = 20, the respective powers of W and u were 71% and 56% while
the rest ranged from 18% (for CM) through 26% for $b_2$ to 40% for D.
5.3 Group 3: Continuous, Symmetric, High $\beta_2$ Alternatives ($|\sqrt{\beta_1}| \le .3$, $\beta_2 > 4.5$)
This group includes T(1) (Cauchy), T(2), T(4), Tu(1, 10), $D\chi^2(1)$,
$D\chi^2(1.5)$ and $D\chi^2(2.0)$ distributions, some of whose merit curves are
included in Figures 1.01 to 1.36.
The W, $\sqrt{b_1}$, $b_2$ and u statistics are superior to the other procedures
for this group. The only exceptions are the excellent performance of WCM against
the two very long-tailed distributions, T(1) and T(2), and that of D against the
T(1) alternative.
The W statistic retains good sensitivity for all members of this group. Its
merit curve is generally above that of $\sqrt{b_1}$, $b_2$ and u. For example,
for n = 15, the power of W in a 5% test against T(1) is 81%, while that for
$\sqrt{b_1}$ is 72%, for $b_2$ is 75% and for u is 38% (Figure 1.31); against
Tu(1, 10) the powers are 71%, 40%, 39% and 28% respectively.
It is interesting to note the good power of $\sqrt{b_1}$ against the symmetric
Cauchy and T(2) alternatives. This sensitivity is associated with the very
long-tailedness of these distributions which tends, in small samples, to give
rise to asymmetric data configurations.
As another example, the 5% powers for the $D\chi^2(1)$ (Laplace) alternative
with n = 35, are 41% for $b_2$, 38% for W, 37% for $\sqrt{b_1}$, 29% for u, 22%
for D, 19% for WCM, 16% for CS, and 4% for KS and CM.
5.4 Group 4: Continuous, Symmetric, Low $\beta_2$ Alternatives ($|\sqrt{\beta_1}| \le .3$, $\beta_2 < 2.5$)
This group includes BE(1, 1), BE(1.1, 1.1), BE(1.3, 1.3), BE(1.5, 1.5),
BE(2, 2), $D\chi^2(-.8)$, $D\chi^2(-.5)$, SB(0, .7071), Tu(1, .7), Tu(1, 1.5) and
Tu(1, 3.0), some of whose merit curves are included in Figures 1.01 to 1.36.
The u, $b_2$ and W tests far outperformed the remaining procedures against
alternatives in this group. For example, against BE(1, 1) (uniform) with
n = 50 the 5% powers are 97% for u, 89% for W, 86% for $b_2$ with the rest
between 0 (for $\sqrt{b_1}$) and 22% (for D). Against $D\chi^2(-.5)$, for n = 50,
the 5% powers are 34%, 28%, and 25% for $b_2$, u and W, with the remaining tests
having powers less than 10% (Figure 1.15).
5.5 Group 5: Continuous, $\sqrt{\beta_1}$, $\beta_2$ Close to Normal Values ($|\sqrt{\beta_1}| \le .3$, $2.5 \le \beta_2 \le 4.5$)
This group includes the SB(0, 2), SB(1, 2), Tu(1, .1), Tu(1, .2), Tu(1, 5),
L and T(10) distributions, some of whose merit curves are included in Figures
1.01 to 1.36.
None of the tests showed much sensitivity against the alternatives in this
group. Against the L distribution with n = 50, the $b_2$ test achieved the highest
5% power, namely, 24%; in the same case, u had 20%; $\sqrt{b_1}$, 18%; W, 16%;
and the rest gave 7% or less (Figure 1.24). Against L for n = 35, $\sqrt{b_1}$
had a 5% power of 23%, followed by $b_2$ with 19% and W with 15%. The only other
case where a test achieved as much as 20% power even for n = 50 was against
the Tu(1, 5) distribution, using the CS procedure. The power of D was 15%,
of W was 14%, while the remainder ranged between 0 and 11% against this
alternative. The 5% powers for n = 50 for the SB(1, 2) alternative ranged from
5% to 14% with W the highest (Figure 1.23).
5.6 Group 6: Discrete Distributions
This group includes the BI(4), BI(8), BI(12), BI(20), P(1), P(4) and P(10)
distributions, some of whose merit curves are included in Figures 1.01 to 1.36.
The D statistic was far superior to all others for this group and had 100%
power for almost all sample sizes against these alternatives. This is believed
to be due to the fact that the series of transformations applied to a sample
containing a large proportion of identical values yields a large proportion of
zeros.
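
The mechanism can be illustrated with a Python sketch of the transformation behind D. It is written under assumptions: the spacings of $u_i = F(y_i)$ are ordered and rescaled as $g_j = (n+2-j)(c'_j - c'_{j-1})$, and D is taken as the largest value of $r/n$ minus the running sum of the $g_j$. That last form and the function names are assumptions for illustration, not a statement of the study's exact computation.

```python
from statistics import NormalDist


def durbin_g(sample, null):
    """Rescaled ordered spacings g_1..g_n of u_i = F(y_i) (assumed form)."""
    n = len(sample)
    u = sorted(null.cdf(y) for y in sample)
    # n + 1 spacings: u_1, u_2 - u_1, ..., 1 - u_n
    spacings = [u[0]] + [u[i] - u[i - 1] for i in range(1, n)] + [1 - u[-1]]
    c = [0.0] + sorted(spacings)  # c'_0 = 0, then c'_1 <= ... <= c'_{n+1}
    return [(n + 2 - j) * (c[j] - c[j - 1]) for j in range(1, n + 1)]


def durbin_d(sample, null):
    """Largest r/n minus the running sum of g_1..g_r (assumed one-sided form)."""
    g = durbin_g(sample, null)
    n = len(g)
    total = 0.0
    best = float("-inf")
    for r, g_r in enumerate(g, start=1):
        total += g_r
        best = max(best, r / n - total)
    return best


tied = [0.0, 0.0, 0.0, 0.0]  # artificial sample made only of identical values
print(durbin_g(tied, NormalDist()))
print(durbin_d(tied, NormalDist()))
```

Tied observations give zero spacings, which sort to the front and produce zero $g_j$; the running sum then stays at zero while $r/n$ climbs, which inflates the statistic.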
The CS procedure performs erratically against many of the members of this
group, showing poor power for one sample size and jumping to 100% power
for the next. Thus for example the 5% power for CS against BI(8) is 9% at
n = 20 and jumps to 100% at n = 35. This peculiarity is probably due to a
combined effect from the discreteness of the data and the arbitrarily chosen
number of cells, depending on n, used in the test.
Both the $\sqrt{b_1}$ and $b_2$ procedures appear to have no power against any
of the binomial alternatives in any sample size and performed relatively poorly
against the Poisson alternatives. The W test performed well against this group
while the u procedure did not. For example, the 5% powers, for n = 20, for
BI(4) were 100% for D, 100% for CS, 71% for W, and 20% for u (Figure 1.05).
Of the distance tests (not including D) only the KS procedure performed
relatively well. For example, the 5% power against the P(1) alternative with
n = 20 was 100% for D, 100% for CS, 99% for W, and 55% for KS with the
remaining tests having power less than 30%.
Since the comparatively remarkable sensitivity of the D statistic is a result
of its response to the discreteness of the sample, one would expect it to show
similar sensitivity in testing other continuous null distributions against
discrete alternatives.
5.7 General Comments
The W statistic exhibits sensitivity to non-normality over a wide range of
alternative distributions. In most cases it has power as good as or better than
the other eight procedures. For continuous alternatives, it is the only test
which never has very low power where another test shows high power. Even
for discrete alternatives it shows poorly only against the results for the D
test and occasionally the CS statistic.
The $\sqrt{b_1}$ statistic is a good measure of non-normality against highly
skewed and also long-tailed distributions. However, it often has lowest
sensitivity against symmetric and asymmetric finite range distributions, often
being biased. Moreover, it has very poor performance with the discrete
distributions.
The $b_2$ statistic performs comparatively well with finite range distributions,
as well as with symmetric long-tailed infinite range distributions. It is not
effective, relatively, against skewed and discrete distributions.
There does not appear to be a clear cut superiority of $\sqrt{b_1}$ versus $b_2$
as an omnibus test for normality. Generally, $\sqrt{b_1}$ responds to skewness
and $b_2$ to kurtosis but in several cases their powers are quite similar. There
are cases in which the alternative distributions, though quite non-normal, give
both $\sqrt{b_1}$ and $b_2$ values which resemble the normal. For example, in the
case of the BE(2, 1) distribution for samples of size 20, $\sqrt{b_1}$ and $b_2$
have 5% powers of 8% and 13% respectively, while the W statistic has a power of
35% (Figure 1.03).
The distance tests, KS, CM, WCM and D, exhibit surprisingly poor power,
with some exceptions mainly in connection with discrete alternatives in the
case of the D statistic. In general, the D procedure does improve the power
of the KS test for highly skewed continuous distributions but usually not to
the level achieved by other tests.
The D statistic has exceptional performance for discrete alternatives. Pre-
sumably this would apply relative to any null continuous distribution. A
possible explanation for this, as has been noted above, is that the transforma-
tions involved in computing D lead to a large number of exact zeros in the
case of samples from discrete distributions.
The very arbitrariness of definition of the CS statistic makes it difficult to
comment generally on its properties. These evidently depend mainly on the
choice of class intervals. From the present results, one may infer that the CS
test performs well against the very highly skewed distributions but in general
does not have good sensitivity overall.
The u statistic has, comparatively, good sensitivity against symmetric
short-tailed alternatives, typified by the uniform distribution. Its performance
against symmetric long-tailed distributions is comparable to that of W,
though in general not as good. The u test fails badly in the case of skewed dis-
tributions, having very low power even for alternatives as badly asymmetric
as χ²(1).
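The u statistic is the ratio of the sample range to the sample standard deviation (David, Hartley and Pearson [4]). A minimal sketch follows, assuming the usual n − 1 divisor for the standard deviation; the function name is hypothetical.

```python
import math


def u_statistic(xs):
    """Ratio of sample range to sample standard deviation (n - 1 divisor assumed)."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return (max(xs) - min(xs)) / s


if __name__ == "__main__":
    print(u_statistic([1, 2, 3, 4, 5]))
```

Because the statistic depends only on the two extreme observations and the spread, it carries no information about the direction of asymmetry, which is consistent with its low power against skewed alternatives noted above.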
The five groups for the continuous distribution alternatives were made up
based on a partition of the (√β1, β2) space. Indeed one finds that these groups
correspond generally to varying non-normality as reflected by the sensitivity
of the W statistic. This indicates that, roughly speaking, the W test subsumes
the information provided by both √b1 and b2.
6. THE EFFECT OF SAMPLE SIZE

The effects of sample sizes are indicated in Figures 2.01 to 2.39, which give
plots of the 5% power as a function of log n, for all nine statistics for each of
39 alternative distributions, and by the tabular results on 10% power given
in Table 3.²
² Tabular results for 5% power analogous to Table 3 are available from the authors.

In most cases the sensitivity increases markedly with sample size. The
change with n does however vary considerably both in regard to the alterna-
tive distribution and the test procedure. Sometimes this effect is dramatic
even in the low sample size range of this investigation. Thus, for example, the
power of the W test against the Cauchy goes from 37% to 78% as the sample
size increases from 10 to 15 (Figure 2.29).
It is seen from Figures 2.01 to 2.39 that the 5% power is, loosely speaking,
linear in log n, at least up until powers close to 100% are realized.
Several of the test procedures give very little increase in power, for certain
alternatives, as n changes. For instance, Figure 2.01 for the BE(1, 1) (uniform)
alternative, shows that the 5% power of each of the √b1, KS, CM, WCM, D
and CS statistics changes from a range of 2 to 10% at n = 10 to a range of
0 to 22% at n = 50. For contrast, the corresponding 5% power range for W,
b2 and u for BE(1, 1) is 5 to 20% at n = 10 and 85 to 96% at n = 50. As another
example, in Figure 2.15 for the χ²(10) alternative, the power for b2, KS, CM,
WCM, D, CS and u ranges between 5 and 10% at n = 10 and between 11 and
25% at n = 50; for contrast, the W and √b1 statistics have powers of 15 and
11% at n = 10 and 62 and 55% at n = 50.
A number of the alternative distributions give relatively flat 5% power curves
as a function of log n, for all nine procedures. For instance, Figure 2.32 for
T(10) has all tests ranging between 3 and 9% power at n = 10, and the highest
5% power attained at n = 50 amongst the tests is only about 18%.
These plots indicate that one may achieve reasonable sensitivity with quite
small sample sizes, depending of course on the appropriate choice of test
procedure. Thus, for instance, against the χ²(2) alternative, 50% power on a 5%
test is achieved by the use of W with n = 11, and by √b1 with n = 13, although,
for contrast, KS would need n = 46 and u would require n > 50. Against all the
discrete alternatives studied, the D statistic achieved a 5% power of 50% with
n ≤ 10. For distributions which are close to the normal, however, for example
L, T(10), etc., all statistics require n > 50 to achieve a 5% power of 50%.
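Sample sizes of this kind can be read off a power curve by interpolation. The sketch below is only an illustration of that reading: it assumes that power is linear in log n between two tabulated points, as the plots loosely suggest, and it uses the W-versus-Cauchy powers quoted earlier in this section (37% at n = 10, 78% at n = 15) as example inputs. The function name is hypothetical.

```python
import math


def n_for_power(n1, p1, n2, p2, target):
    """Sample size at which power reaches `target`, assuming power is
    linear in log n between the points (n1, p1) and (n2, p2)."""
    slope = (p2 - p1) / (math.log(n2) - math.log(n1))
    return math.exp(math.log(n1) + (target - p1) / slope)


if __name__ == "__main__":
    # W against the Cauchy: 37% power at n = 10, 78% at n = 15.
    print(n_for_power(10, 37.0, 15, 78.0, 50.0))
```

Under the linearity assumption, 50% power against the Cauchy would be reached between n = 11 and n = 12. The interpolation should not be extended beyond the range of the tabulated sample sizes or into the region where power approaches 100%.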
Table 3 and the plots of Figures 2.01 to 2.39 indicate that it is sometimes true
that the power ranking of the test statistics remains the same for each sample
size. This is sufficiently often contradicted (perhaps augmented by sampling
fluctuations) that the specific results really need to be consulted.
Perhaps the most instructive indication of this study of relation of power
and sample size is that it is meaningful, given appropriate procedures, to try
to assess possible non-normality even with samples as small as 10 or 20. The
vague impression which appears to be widely current that very large samples
are required to evaluate non-normality may in part be due to the use of
relatively insensitive procedures.
7. APPROACH TO NORMALITY IN FAMILIES OF DISTRIBUTIONS

In this section the W statistic is used as an index of non-normality within
families of distributions. Results are given for representatives of the following
families: BE(p, p) (the symmetric beta), BI(k) (symmetric binomial), χ²(ν),
Dχ²(β), P(λ) and T(ν).
Figures 3.01 to 3.06 give plots of the empirical 5% powers of the W test as
a function of parameter value for each of the five sample sizes, n = 10, 15, 20,
35 and 50, one graph for each family.
From Figure 3.01 for the symmetric beta family, one sees that small sample
sizes show little evidence of non-normality even for p = 1. However, for n = 50,
the W index falls from 89% at p = 1 to 26% at p = 2. These results give some
measure of the nature and rapidity of approach of the symmetric beta to the
normal with increasing p , as reflected in the configuration of random (small)
samples.
Figure 3.02 for the symmetric binomial, shows that the index depends
sharply on k for all five sample sizes. At k = 20, the range of 5% power of W
is only from about 9% to 23% over the five sample sizes, whereas at k=4, the
powers range between 49% and 100%. This suggests that, as measured by W,
the binomial distribution with k > 20 tends to give rise to samples having a
near normal configuration.
Figure 3.03 for the χ²(ν) family shows a sharp dependence of the W index on
ν over the range of ν employed. However, in moderate size samples, χ²(10) gives
rise to appreciably non-normal configurations. Thus, at n = 50, the 5% power
of W is 62% for χ²(10).
Figure 3.04 gives measures of departure from normality for the Dχ²(β) family
(which includes the normal, β = 0). The curves rise rapidly on both sides of
β = 0 for large sample sizes. The change of the index is more gradual at smaller
sample sizes, especially for negative β. The general behaviour of the Dχ²(β)
family is that it goes from a mass of 1 at 0 for β → −1 through beta-like dis-
tributions, becomes normal at β = 0, double exponential at β = 1 and develops
larger and larger tails as β increases. The differential behaviour, in regard to
dependence on sample size, between positive and negative β values for the
Dχ²(β) family is similar to comparison of results for the beta distributions
with those for the T(1) and T(2) distributions.
Figure 3.05 for the Poisson family, P(λ), shows a sharp drop in the 5% power
of W, as λ varies between 1 and 10, for each of the five sample sizes. With λ = 10
the 5% power of W is only 18% for n = 50, indicating that for λ > 10 even larger
samples would tend to have a normal-like configuration.
It is seen from Figure 3.06 that the W index of non-normality changes
rapidly for the Student's T(ν) distribution as the parameter ν goes from 1 to 4,
while for ν larger than 4 the approach to normality becomes much more
gradual. For example, as judged in samples of n = 50, the 5% power of W is
99.7% for ν = 1, 40.7% for ν = 4 and 17.3% for ν = 10. Presumably it would
require substantially larger sample sizes to develop appreciable power against
Student's T(ν) distribution with ν > 10.
The results given here should not be interpreted directly to indicate the
quality of a mathematical approximation of, say, the Poisson with λ = 10, by a
N(μ = 10, σ² = 10) distribution. Rather it is an index which assesses the con-
figuration of samples of specified size relative to normal samples and will not
respond, in small samples, to moderate systematic differences in the actual
c.d.f.s. Thus, in the case of, say, the T(ν) family, the difference from the
normal is essentially in the tails of the distribution which, however, account for
only a small overall proportion of the population. Hence large sample sizes
may be needed to reflect even large relative differences in the tails.

8. EFFECT OF MISSPECIFICATION OF PARAMETERS

Of the test statistics considered in the present study, the KS, CM, WCM,
D and CS procedures are appropriate directly only as tests of simple hypotheses.
Thus, in their use as tests for normality, the parameters, μ and σ, of the hy-
pothesized null distribution must be specified. In most applied statistics cir-
cumstances where a test for normality might be of interest, prior information
on the parameters of the supposed normal distribution is not available.
In some cases one may be prepared to approximate the unknown parameters,
for purposes of execution of the test. The following results are of interest
in indicating the sensitivity of the above five statistics to errors of
misspecification.
Table 4 summarizes results of an empirical sampling study of this issue.
Values tabulated are the actual probabilities of exceeding the null 5% point
for each of 9 combinations of (μ, μ/σ) values, for each of five sample sizes,
n = 10, 15, 20, 35 and 50, for each of the five tests, applied to the N(0, 1) hy-
pothesis. Thus, for example, if the KS test is used in samples of size 50 to test
N(0, 1), and in fact the sample comes from N(μ = 0, σ = 1.2), then the 5%
"power" is 16%.
In general, the KS, D and CS tests are less subject to distortion than the
CM and WCM procedures. The errors are, however, substantial even for rela-
tively small departures. As might be expected, the effect of misspecification
becomes more pronounced at larger sample sizes. For example, for the KS
test, with μ = .39 and σ = 1.3, the 5% power is 12% for n = 10 and 61% at
n = 50.
The tabled results show that even when μ/σ is held constant, the power in-
creases with μ in all cases. For example, with n = 50, and μ/σ = .15, the 5% power
of CS is 9% when μ = .15 and rises to 30% when μ = .195.
These results on effects of misspecification are especially disquieting when
one makes a comparison with powers for non-normal alternatives, even of an
extreme type. For example, the 5% power of KS against the χ²(2) alternative in
samples of size 50 is only 51%, which is the power attained when the N(0, 1)
hypothesis is tested on samples from N(μ = .36, σ = 1.2).
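The kind of sampling experiment summarized in Table 4 can be sketched as follows. This is an illustrative reconstruction and not the authors' program: the function names, the number of runs, the seed, and the use of an empirically estimated null 5% point are all assumptions made for the example. It estimates how often the KS test of the fully specified N(0, 1) hypothesis rejects when the sample is in fact normal with a different μ or σ.

```python
import math
import random


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal c.d.f. evaluated through the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(sample, mu0=0.0, sigma0=1.0):
    """Two-sided Kolmogorov-Smirnov distance from the N(mu0, sigma0^2) c.d.f."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu0, sigma0)
        # Compare the hypothesized c.d.f. with the empirical c.d.f. just
        # after and just before the jump at x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def rejection_rate(n, mu, sigma, runs=2000, level=0.05, seed=1):
    """Monte Carlo estimate of P(KS exceeds its null upper `level` point)
    when N(0, 1) is tested but the data come from N(mu, sigma^2)."""
    rng = random.Random(seed)
    # Empirical null distribution of the KS statistic under N(0, 1).
    null = sorted(
        ks_statistic([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(runs)
    )
    crit = null[min(int((1.0 - level) * runs), runs - 1)]
    hits = sum(
        ks_statistic([rng.gauss(mu, sigma) for _ in range(n)]) > crit
        for _ in range(runs)
    )
    return hits / runs


if __name__ == "__main__":
    print(rejection_rate(50, 0.0, 1.0))   # correctly specified: near the nominal 5%
    print(rejection_rate(50, 0.0, 1.2))   # sigma misspecified by 20%
```

For n = 50 and σ = 1.2 the paper reports a rejection rate of 16%; an estimate from a sketch such as this one is subject to sampling error in both the critical value and the rejection count, so only rough agreement should be expected.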

9. SUMMARY AND CONCLUDING REMARKS


9.1 Summary
The results have been presented to correspond with various areas of inter-
est, namely, the comparison of alternate test statistics (Section 5), the effect
of sample size (Section 6), and the approach to normality in several families
of distributions (Section 7). Additionally, results are given of the effect of
misspecification of parameters for the simple hypothesis test procedures ex-
amined (Section 8).
Some of the salient indications of the study are listed below. The statements
are not intended to be taken precisely or literally, but rather as loose, approxi-
mate generalizations. (The test procedures and alternative distributions stud-
ied, together with notational definitions, are summarized in Tables 1 and 2.
The sample sizes used were n = 10, 15, 20, 35 and 50. Of the nine procedures,
W, √b1, b2 and u provide composite tests, while KS, CM, WCM, D and CS
can be employed only as simple hypothesis tests.)
(a) The W statistic provides a superior omnibus indicator of non-normality,
judged over the various symmetric, asymmetric, short- and long-tailed alter-
natives and over all the sample sizes used.
(b) The distance tests (KS, CM, WCM, D) are typically inferior in sen-
sitivity against continuous distribution alternatives, with few exceptions.
(c) The u statistic has particularly good properties against symmetric, es-
pecially short-tailed (e.g. the uniform) distributions but seems to have vir-
tually no power with respect to asymmetry.
(d) While it is usually true that a judgment based on both √b1 and b2 will
be sensitive, the W statistic is typically as good as the best of either of these
and, in some cases, has considerably higher power than either.
(e) The results for the CS test were quite erratic, due in part to the difficulty
imposed by the requirement of arbitrary choice of class intervals.
(f) An important finding from the point of view of merit rating of the pro-
cedures was that the order of the test procedures in regard to power was usu-
ally similar for a range of preset significance levels.
(g) Loosely speaking, the powers of the test procedures varied linearly as
log n. However, both slope and location were very different depending on test
and alternative distribution.
(h) Contrary to popular beliefs, sensitive assessment (approximately 50%
power at 5% level) of even moderate non-normality (e.g. χ²(4)) is possible in
samples as small as n = 20. Furthermore, extreme non-normality (e.g. χ²(1))
can be detected with sample size less than 10.
(i) The results given in Section 7 on approach to normality in families of
distributions may be of use as a guide in studies of robustness to non-normality.
(j) The high power of the simple hypothesis tests (KS, CM, WCM, D
and CS) against relatively small misspecification of parameters throws into doubt
their usefulness as practical statistical test procedures.
9.2 Concluding Remarks
Essentially all the results of this paper derive from empirical sampling
studies. They are accordingly subject to various sampling errors and uncer-
tainties. The information is intended to be broadly indicative and should be
used with a wide gauge. As a guide, the standard deviation of the sampling
estimate of power (assuming the null distribution is specified exactly) is
bounded by .036.
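A bound of this kind follows from the binomial variance of an estimated proportion: with N independent runs and true power p, the standard deviation of the estimate is sqrt(p(1 − p)/N), which is largest at p = 1/2. The sketch below works through the arithmetic. The choice N = 200 is an assumption made here for illustration (it is one of the run counts mentioned in these concluding remarks), and the function names are hypothetical.

```python
import math


def power_sd(p, runs):
    """Standard deviation of a power estimate from `runs` independent trials."""
    return math.sqrt(p * (1.0 - p) / runs)


def power_sd_bound(runs):
    """Worst-case standard deviation, attained at p = 0.5."""
    return 0.5 / math.sqrt(runs)


if __name__ == "__main__":
    for runs in (100, 200, 500):
        print(runs, round(power_sd_bound(runs), 4))
```

With N = 200 the bound is about .035, which agrees with the stated figure of .036 to within rounding; at powers far from 50% the sampling error is smaller.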
The experience of this study has been that empirical sampling using a high
speed computer can provide a very useful general guide on sensitivity proper-
ties even with a few Monte Carlo runs (e.g. 100, 200, or 500). One feels strongly
inclined to recommend that protracted theoretical studies of null distributions,
optimum asymptotic properties, etc., might well be preceded by some simple
sampling studies to gain insight and assurance concerning the usefulness of the
proposed procedure.
One of the major difficulties encountered in the present study was that of
organization, analysis and presentation of the voluminous results. This aspect
is reminiscent of the central features of most large scale statistical data analysis
problems.
REFERENCES
[1] Anderson, T. W. and Darling, D. A. (1954). A test of goodness of fit, J. Amer.
Stat. Assn. 49, pp. 765-69.
[2] Box, G. E. P. and Tiao, G. C. (1964). A Bayesian approach to the importance of
assumptions applied to the comparison of variances, Biometrika 51, pp. 153-67.
[3] Cramér, H. (1928). On the composition of elementary errors, Skand. Aktuar. 11,
pp. 141-80.
[4] David, H. A., Hartley, H. O., and Pearson, E. S. (1954). The distribution of the
ratio, in a single normal sample, of range to standard deviation, Biometrika 41, pp.
482-93.
[5] Durbin, J. (1961). Some methods of constructing exact tests, Biometrika 48, pp.
41-55.
[6] Fowlkes, E. B. (1965). A Fortran II system for the generation of random samples,
unpublished Bell Laboratories memorandum.
[7] Hastings, C., Mosteller, F., Tukey, J., and Winsor, C. (1947). Low moments for
small samples: a comparative study of order statistics, Annals of Math. Statist. 18,
pp. 413-26.
[8] Johnson, N. L. (1949). Systems of frequency curves generated by methods of trans-
lation, Biometrika 36, pp. 149-76.
[9] Kolmogorov, A. N. (1933). Sulla determinazione empirica di una legge di distri-
buzione, G. Inst. Ital. Attuari 4, pp. 83-91.
[10] Rand Corporation (1955). A Million Random Digits With 100,000 Normal Deviates.
The Free Press Publishers, Glencoe, Ill.
[11] Shapiro, S. S. and Wilk, M. B. (1965). An analysis of variance test for normality
(complete samples), Biometrika 52, pp. 591-611.
