
A Quality Index Based on Data Depth and Multivariate Rank Tests
Author(s): Regina Y. Liu and Kesar Singh
Source: Journal of the American Statistical Association, Vol. 88, No. 421 (Mar., 1993), pp. 252-260
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2290720

A Quality Index Based on Data Depth and Multivariate Rank Tests

REGINA Y. LIU and KESAR SINGH*
Let $F$ and $G$ be the distribution functions of two given populations on $\mathbb{R}^p$, $p \ge 1$. We introduce and study a parameter $Q = Q(F, G)$, which measures the overall "outlyingness" of population $G$ relative to population $F$. The parameter $Q$ can be defined using any concept of data depth. Its value ranges from 0 to 1, and is .5 when $F$ and $G$ are identical. We show that within the class of elliptical distributions, when $G$ departs from $F$ in location or $G$ has a larger spread, or both, the value of $Q$ dwindles down from .5. Hence $Q$ can be used to detect the loss of accuracy or precision of a manufacturing process, and thus it should serve as an important measure in quality assurance. This in fact is the reason why we refer to $Q$ as a quality index in this article. In addition to studying the properties of $Q$, we provide an exact rank test for testing $Q = .5$ vs. $Q < .5$. This can be viewed as a multivariate analog of Wilcoxon's rank sum test. The tests proposed here have power against location change and scale increase simultaneously. We introduce some estimates of $Q$ and investigate their limiting distributions when $F = G$. We also consider a version of $Q$ and its estimates, which are defined after correcting the location shift of $G$. In this case $Q$ is used to measure scale increase only.

KEY WORDS: Data depth; Multivariate rank tests; Quality assurance; Quality index.

1. INTRODUCTION

A data depth is a device for measuring the centrality of a multivariate data point with respect to a given data cloud. The main purpose of this article is to propose and study a two-population parameter based on a concept of data depth, which we shall call the quality index $Q = Q(F, G)$, and some test statistics pertaining to $Q$. Here $F$ and $G$ are the distribution functions of the two given independent populations in $\mathbb{R}^p$, $p \ge 1$. The parameter $Q$ gauges the overall "outlyingness" of the $G$ population with respect to the given $F$ population. The index $Q$ can detect whether $G$ has a different location and/or has additional dispersion as compared to $F$. The range of $Q$ is $[0, 1]$. Under the null hypothesis $H_0$: $F = G$, $Q = 1/2$, and $Q < 1/2$ indicates that there is a possible location shift and/or a scale increase from $F$ to $G$. If $Q > 1/2$, then $G$ has a smaller dispersion, perhaps with the same location or with a relatively minor location shift. Suppose $F$ is the population of measurements of an accepted lot arising from a manufacturing process and $G$ is the population of a future lot whose quality in terms of the same measurements is to be monitored. The parameter $Q$ studied here appears to be an attractive tool for measuring whether $G$ is meeting the same standards as $F$. The proposed tests on $Q$ here may serve as useful tests for carrying out the on-line inspection in quality assurance.

We now proceed to describe briefly the proposed parameter $Q$ and its related statistics. To begin with, we use $D(F; \cdot)$ to denote a measure of data depth. Generally speaking, for a point $x$ in $\mathbb{R}^p$, the larger $D(F; x)$ is, the deeper (or the more central) the point $x$ is with respect to the distribution $F$. In Section 2 we review the following specific notions of data depth: Mahalanobis's depth, Tukey's depth, the simplicial depth, and the majority depth. For any $y \in \mathbb{R}^p$, let

$$R(F; y) = P_F\{X : D(F; X) \le D(F; y)\},$$

where $X$ is a random sample from the $F$ population. In the context of quality control, $F$ is regarded as the "good" population (i.e., meeting the required standard specifications), and $y$ is an observation from a future population $G$. If $Y$ is a random observation from $G$, then Proposition 3.1 shows that under the hypothesis $F = G$, the distribution of $R(F; Y)$ is $U[0, 1]$, a uniform distribution with the support $[0, 1]$. We define

$$Q(F, G) = P\{D(F; X) \le D(F; Y) \mid X \sim F,\ Y \sim G\}.$$

Note that $Q(F, G) = E_G R(F; Y)$. Roughly speaking, with respect to the $F$ population, $R(F; y)$ is the fraction of the $F$ population that is "less central" than the value $y$, and $Q(F, G)$ is the average of such fractions over all $y$'s from the $G$ population. When $Q < 1/2$, it means that on the average more than 50% of the $F$ population is deeper than any given measurement $Y$ from $G$, indicating an inconsistency between $F$ and $G$.

Because affine invariance is a practical requirement of statistics used in data analysis, we focus mainly on some affine-invariant data depths. A functional defined on $F$, say $h(F; \cdot)$, is said to be affine invariant if for any nonsingular matrix $A$ and any constant vector $b$, $h(F_{A,b}; A\cdot + b) = h(F; \cdot)$, where $F_{A,b}$ is the distribution function of $AX + b$ and $X \sim F$. It is worth mentioning that if a depth notion $D(F; \cdot)$ is affine invariant, then so are its corresponding $R(F; \cdot)$ and $Q$. All depths mentioned in Section 2 are affine invariant.

This article is organized as follows. After reviewing some notions of data depth in Section 2, we establish several important properties of $Q$ for elliptical distributions in Section 3. In particular, we appeal to a celebrated lemma of T. W. Anderson (1955) to show that $Q$ is monotonically decreasing as the location parameter shifts along a line and that $Q < 1/2$ when $G$ is more spread out than $F$. In Section 4 we propose an exact rank test that is in essence a multivariate analog of the Wilcoxon rank sum test for testing $H_0$: $Q = 1/2$ versus $H_a$: $Q < 1/2$. In fact the test may be called a three-sample rank sum test.
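The identity $Q(F, G) = E_G R(F; Y)$ and the null value $Q = 1/2$ can be checked in a few lines. The sketch below assumes, as in Proposition 3.1, that $D(F; X)$ has a continuous distribution, and it uses the independence of $X$ and $Y$; $U$ denotes a generic $U[0, 1]$ variable introduced here for illustration.

```latex
\begin{align*}
Q(F, G) &= P\{D(F; X) \le D(F; Y) \mid X \sim F,\ Y \sim G\} \\
        &= E_G\Big[\, P_F\{X : D(F; X) \le D(F; y)\}\Big|_{y = Y} \Big]
         = E_G R(F; Y), \\
F = G \ \Longrightarrow\ R(F; Y) \sim U[0, 1]
        \ &\Longrightarrow\ Q(F, F) = E\,U = \tfrac{1}{2}.
\end{align*}
```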
* Regina Y. Liu is Associate Professor and Kesar Singh is Professor, Department of Statistics, Rutgers University, New Brunswick, NJ 08903. The authors gratefully acknowledge the support from the National Science Foundation through Grants DMS 9022126 and DMS 90-04658 and thank Arthur B. Yeh for his computing assistance and the referees and the associate editor for their helpful comments.

© 1993 American Statistical Association. Journal of the American Statistical Association, March 1993, Vol. 88, No. 421, Theory and Methods


Liu and Singh: Data Depth and Multivariate Rank Tests


In Sections 5 and 6 we consider the estimation of $Q$ together with the related testing procedure. Specifically, the two-sample estimate $Q(F_m, G_n)$ is studied in detail in Section 6. Here $X = \{X_1, \ldots, X_m\}$ is a sample from $F$ and $Y = \{Y_1, \ldots, Y_n\}$ is a sample from $G$, and $F_m$ and $G_n$ are their empirical distributions. Finding the limiting null distribution of this estimate is a nontrivial problem in asymptotics. In some cases we have been able to prove this to be $N(1/2, (1/m + 1/n)/12)$ (cf. Thm. 6.5), and we believe the same should hold for the remaining cases. Here $N(a, b)$ stands for a normal distribution with the mean $a$ and the variance $b$. In Section 7 we argue that for measuring scale increase alone one may filter out the location effect before defining $Q$. We refer to this modified version of $Q$ as the scale index. Its estimate may be used for testing whether $G$ has a higher scale than $F$, disregarding the location factor. To tighten the article's exposition, we defer most of the proofs to the Appendix.

For the sake of demonstration, we list in Table 1 some values of $Q$ for a bivariate case, which clearly show that the location change and the scale increase have a compounded effect (though it is nonlinear) in bringing down the value of $Q$. At the end of the article we also present a histogram of $Q(F_m, G_n)$ for trivariate standard normal $F$ and $G$ that clearly demonstrates that the limiting distribution of $Q(F_m, G_n)$ is $N(1/2, (1/m + 1/n)/12)$.

Table 1. $Q(F, G)$ Values for $F \sim N((0, 0)', I)$ and $G \sim N(\mu(1, 1)', \sigma^2 I)$

  μ       σ = 1      σ = √2     σ = 2      σ = 3
  0       .5         .333333    .2         .1
  .5      .441248    .282161    .163746    .079852
  1       .303265    .171139    .089866    .040657
  2       .067668    .023161    .008152    .002732

Some related papers, though intrinsically different in objectives, are Brown and Hettmansperger (1987, 1989), Brown, Hettmansperger, Nyblom, and Oja (1992), and Oja and Nyblom (1989). In those papers the authors developed bivariate rank tests for testing location shifts, using the concept of ordering due to Oja (1983). In principle one can also define $Q$ based on Oja's idea of ordering. Similarly, several other criteria that have been used to define multivariate location, seen in Rousseeuw and Leroy (1987, chap. 7), can be viewed as depth measures, and their corresponding ranks can be used to define $Q$.

2. DATA DEPTH

Let $X = \{X_1, \ldots, X_m\}$ be a random sample from a $p$-dimensional distribution $F$, $p \ge 1$. We describe here some notions of data depth that are affine-invariant and whose population versions all have the ability to measure "depth" of a distribution at a point in the following sense.

Definition 2.1 (Monotonicity Property). Let $F$ be a distribution symmetric around a point $\theta_0$; that is, if $F$ is the distribution function of $X$, then $(X - \theta_0)$ and $(\theta_0 - X)$ have the same distribution. We say that the depth function $D(F; \cdot)$ has the monotonicity property if $D(F; \theta_0 + a(x - \theta_0)) \ge D(F; x)$ for all $x$ and for any $a$ such that $0 < a < 1$.

The monotonicity property means that $D(F; \cdot)$ is monotonically decreasing along any fixed ray stemming from the center $\theta_0$. Unless stated otherwise, the notation $D(F; \cdot)$ stands for any of the following four depth measures throughout the article:

1. Mahalanobis's depth (MhD) (Mahalanobis 1936): Let $d(x, \mu_F)$ be the Mahalanobis distance between $x$ and $\mu_F$ (the mean of $F$); that is, $d(x, \mu_F) = (x - \mu_F)' \Sigma_F^{-1} (x - \mu_F)$. We define $\mathrm{MhD}(F; x) = [1 + d(x, \mu_F)]^{-1}$, and its sample version is $[1 + (x - \bar{X})' S^{-1} (x - \bar{X})]^{-1}$, where $\Sigma_F$ and $S$ are the covariance matrix and the sample covariance matrix and $\bar{X}$ is the sample mean. Here $x'$ indicates the transpose of the $p \times 1$ vector $x$.

2. Tukey's depth (TD) (Tukey 1975): For a point $x$, we define Tukey's depth at $x$ to be

$$\mathrm{TD}(F; x) = \inf\{F(\mathcal{H}) : \mathcal{H} \text{ is a closed half space containing } x\}.$$

The sample version of $\mathrm{TD}(F; x)$ is defined by replacing $F$ by $F_m$. Note that when $p = 1$, $\mathrm{TD}(F; x) = \min\{F(x), 1 - F(x-)\}$.

3. Simplicial depth (SD) (Liu 1988, 1990): Let $X_1, \ldots, X_{p+1}$ be $(p + 1)$ iid observations from $F$. The simplicial depth at the point $x$ is

$$\mathrm{SD}(F; x) = P_F(x \text{ is inside the closed simplex whose vertices are } X_1, \ldots, X_{p+1}),$$

and its sample version can be obtained by replacing $F$ by $F_m$ in this expression or by computing $\binom{m}{p+1}^{-1} \sum_{(*)} I(x \text{ is inside the closed simplex whose vertices are } X_{i_1}, \ldots, X_{i_{p+1}})$, where $(*)$ is taken over all possible subsets of $X$ of size $(p + 1)$ and $I(A) = 1$ if event $A$ occurs and 0 otherwise. In the real line case, $\mathrm{SD}(F; x) = 2F(x)(1 - F(x-))$.

4. Majority depth (MjD) (Singh 1991): For a given random sample $X_1, \ldots, X_p$ from $F$, a unique hyperplane containing these points is obtained and denoted by $H(X_1, \ldots, X_p)$. With this hyperplane as the common boundary, two closed half-spaces are obtained. We say a point $x$ is in the major side if $x$ falls inside the half-space that has probability greater than or equal to $1/2$, and we define

$$\mathrm{MjD}(F; x) = P_F\{(X_1, \ldots, X_p) : x \text{ is in a major side}\}.$$

The sample version is $\mathrm{MjD}(F_m; x)$. Note that when $p = 1$, $\mathrm{MjD}(F; x) = \tfrac{1}{2} + \min\{F(x), 1 - F(x-)\}$.

Remark 2.1. The monotonicity property holds for the simplicial, Tukey's, and majority depths even for the so-called angularly symmetric distributions, a class of distributions broader than the class of symmetric distributions (cf. Liu 1990). A distribution $F$ is angularly symmetric about $\theta_0$ if $(X - \theta_0)/\|X - \theta_0\|$ and $-(X - \theta_0)/\|X - \theta_0\|$ are identically distributed.
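The sample versions above translate directly into code. The following is a minimal sketch, assuming NumPy is available: `mahalanobis_depth` implements the sample MhD formula, and the three `*_1d` helpers are plug-in versions of the real-line identities for TD, SD, and MjD with $F$ replaced by the empirical distribution $F_m$. All function names are illustrative and are not taken from the article.

```python
import numpy as np


def mahalanobis_depth(x, sample):
    """Sample MhD: [1 + (x - mean)' S^{-1} (x - mean)]^{-1}."""
    sample = np.asarray(sample, dtype=float)            # shape (m, p)
    diff = np.asarray(x, dtype=float) - sample.mean(axis=0)
    cov = np.atleast_2d(np.cov(sample, rowvar=False))   # sample covariance S
    return 1.0 / (1.0 + diff @ np.linalg.solve(cov, diff))


def _cdf_pair(x, sample):
    """Return (F_m(x), F_m(x-)) for a univariate sample."""
    s = np.asarray(sample, dtype=float)
    return np.mean(s <= x), np.mean(s < x)


def tukey_depth_1d(x, sample):
    """Plug-in form of TD(F; x) = min{F(x), 1 - F(x-)} for p = 1."""
    f, f_minus = _cdf_pair(x, sample)
    return min(f, 1.0 - f_minus)


def simplicial_depth_1d(x, sample):
    """Plug-in form of SD(F; x) = 2 F(x) (1 - F(x-)) for p = 1."""
    f, f_minus = _cdf_pair(x, sample)
    return 2.0 * f * (1.0 - f_minus)


def majority_depth_1d(x, sample):
    """Plug-in form of MjD(F; x) = 1/2 + min{F(x), 1 - F(x-)} for p = 1."""
    f, f_minus = _cdf_pair(x, sample)
    return 0.5 + min(f, 1.0 - f_minus)
```

All four functions grow as `x` moves toward the center of the sample, which is the sense of "depth" used throughout. The multivariate TD, SD, and MjD need a search over half-spaces or simplices and are not attempted here.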


Journal of the American Statistical Association, March 1993

Remark 2.2. Under proper conditions, the uniform consistency of the sample version of a depth function $D(F; \cdot)$ (i.e., $\sup_{x \in \mathbb{R}^p} |D(F_m; x) - D(F; x)| \to 0$ almost surely as $m \to \infty$) holds for all four depths. Specifically, for MhD it holds if $E_F\|X\|^2 < \infty$, for TD and SD it holds if $F$ is absolutely continuous, and for MjD it holds if $F$ is an elliptical distribution.

Remark 2.3. The depth MhD is the easiest to compute; however, it leads in general to nonrobust procedures, as it is defined in terms of nonrobust statistics: the sample mean and the sample dispersion.

3. THE PROPERTIES OF Q

Recall that under a given data depth $D(\cdot; \cdot)$, for any two given $p$-dimensional populations $F$ and $G$, $X \sim F$ and $Y \sim G$, we define

$$R(F; y) = P\{D(F; X) \le D(F; y) \mid X \sim F\}$$

and

$$Q(F, G) = P\{D(F; X) \le D(F; Y) \mid X \sim F,\ Y \sim G\} \quad (= E_G R(F; Y)).$$

In addition to the intuitive interpretation of $Q$ in the context of quality control mentioned in Section 1, $Q$ has several useful mathematical properties.

Proposition 3.1. If the distribution of $D(F; X)$ is continuous, then $R(F; X) \sim U[0, 1]$, a uniform distribution supported on $[0, 1]$.

The result clearly follows from the probability distribution transformation.

Proposition 3.2. If $D(F; \cdot)$ is affine-invariant, then so are $R(F; Y)$ and $Q(F, G)$; that is,

$$R(F; Y) = R(F_{A,b}; AY + b)$$

and

$$Q(F, G) = Q(F_{A,b}, G_{A,b}),$$

where $F_{A,b}$ is the distribution function of $AX + b$, $G_{A,b}$ is the distribution function of $AY + b$, $A$ is a $p \times p$ nonsingular matrix, and $b$ is a $p \times 1$ vector.

It is worth pointing out that the preceding proposition asserts that the value of $Q$ does not depend on the scales of the underlying measurements as long as they are the same for $F$ and $G$. In the following we argue that $Q$ decreases from $1/2$ if there is a location shift or a scale increase, or both. For mathematical convenience, we restrict ourselves to the class of elliptical distributions on $\mathbb{R}^p$. We begin with a definition of such a distribution.

Definition 3.1. If a distribution has a density of the form

$$g(x) = c\,|\Sigma|^{-1/2}\, h((x - \theta)' \Sigma^{-1} (x - \theta)),$$

where $h(\cdot)$ is a function from $\mathbb{R}^+$ to $\mathbb{R}^+$, then it is called elliptical with parameters $\theta$ and $\Sigma$, and we denote it by $\mathrm{ell}(h; \theta, \Sigma)$. Here $\Sigma$ is positive definite.

For instance, if $h(t) = \exp(-t/2)$, then the distribution is multivariate normal. If the second moment of the distribution exists, then $\theta$ is the mean vector and $k\Sigma$ is the dispersion matrix. Without moment conditions, $\theta$ can be viewed as a center and $\Sigma$ as a measure of spread in the following sense. If $W_1 \sim \mathrm{ell}(h; \theta, \Sigma_1)$, $W_2 \sim \mathrm{ell}(h; \theta, \Sigma_2)$, and $\Sigma_2 - \Sigma_1$ is positive definite, then $\|W_1 - \theta\| \stackrel{st}{\le} \|W_2 - \theta\|$, where $\stackrel{st}{\le}$ means "stochastically smaller."

We first consider the case of a location shift but no change in dispersion. We prove that when $F \sim \mathrm{ell}(h; \theta_0, \Sigma_0)$ and $G \sim \mathrm{ell}(h; \theta, \Sigma_0)$, the $Q$ function decreases as $\theta$ is moved away from $\theta_0$ along any line. For mathematical convenience, we assume hereafter that $h$ is a nonincreasing function.

Proposition 3.3. Let $F \sim \mathrm{ell}(h; \theta_0, \Sigma_0)$. Assume that $D(F; \cdot)$ has affine invariance and the monotonicity property. Let $\theta_1$ and $\theta_2$ be related as

$$\theta_1 = \theta_0 + a(\theta_2 - \theta_0), \qquad 0 < a < 1.$$

If $Y \sim \mathrm{ell}(h; \theta_1, \Sigma_0)$ and $Z \sim \mathrm{ell}(h; \theta_2, \Sigma_0)$, then

$$R(F; Z) \stackrel{st}{\le} R(F; Y)$$

and

$$Q(F, \mathrm{ell}(h; \theta_2, \Sigma_0)) \le Q(F, \mathrm{ell}(h; \theta_1, \Sigma_0)).$$

Consider now the case of the same location but different dispersions; that is, $F \sim \mathrm{ell}(h; \theta_0, \Sigma_0)$ and $G \sim \mathrm{ell}(h; \theta_1, \Sigma_1)$, where

$$\theta_0 = \theta_1 \text{ and } (\Sigma_1 - \Sigma_0) \text{ is positive definite.} \qquad (1)$$

Proposition 3.4. Assume that (1) and the affine invariance of $D(F; \cdot)$ hold. Then $R(F; Y) \stackrel{st}{\le} R(F; X)$, and thus $Q(F, G) \le 1/2$.

Proof. Let $Z \sim \mathrm{ell}(h; \theta_0, I)$, and let $X - \theta_0 = \Sigma_0^{1/2}(Z - \theta_0)$ and $Y - \theta_0 = \Sigma_1^{1/2}(Z - \theta_0)$. (Thus $X$ and $Y$ have their desired marginal distributions.) We need to show that

$$P((X - \theta_0)' \Sigma_0^{-1} (X - \theta_0) \le a) \ge P((Y - \theta_0)' \Sigma_0^{-1} (Y - \theta_0) \le a). \qquad (2)$$

This, together with the arguments used in the proof of Proposition 3.3, is sufficient to claim the result. To prove (2), we note that

$$\text{left side} = P((Z - \theta_0)'(Z - \theta_0) \le a)$$

and

$$\text{right side} = P((Z - \theta_0)' \Sigma_1^{1/2} \Sigma_0^{-1} \Sigma_1^{1/2} (Z - \theta_0) \le a).$$

The condition that $(\Sigma_1 - \Sigma_0)$ is positive definite implies that $(\Sigma_0^{-1} - \Sigma_1^{-1})$ is positive definite, which in turn implies that $(\Sigma_1^{1/2} \Sigma_0^{-1} \Sigma_1^{1/2} - I)$ is positive definite. The latter clearly suffices for the claim.

We provide a special case here to exemplify the just-discussed property of $Q$.

Example 3.1. Let $F$ be bivariate normal $N(\mu, \Sigma)$ and let $G$ be $N(\mu, \sigma^2 \Sigma)$, where $\sigma > 0$. To obtain the $Q$ value in this case, it suffices to consider the case where $\Sigma = I$ and $\mu = 0$ in view of the invariance property. It can be shown that $Q = \int_0^\infty \sigma^{-2} \exp(-w) \exp(-w/\sigma^2)\, dw$, and thus $Q = (1 + \sigma^2)^{-1}$. The figures in row 1 of Table 1, obtained with a different approach, agree with this result.

For the case of both location and scale changes, we have the following result.

Proposition 3.5. Let $F \sim \mathrm{ell}(h; \theta_0, \Sigma_0)$ and $G \sim \mathrm{ell}(h; \theta, \Sigma_1)$, where $\Sigma_1 - \Sigma_0$ is positive definite. Then $Q(F, G)$ decreases monotonically as $\theta$ is moved away from $\theta_0$ along any line.

This proposition follows from the previous propositions.

Finally, we mention a monotonic change in $Q$ in terms of increase in the scale due to Huber's contamination

$$G_a = (1 - a)F + aH, \qquad (3)$$

where $0 < a < 1$ and $H$ is an elliptical distribution with a higher scale than $F$.

Proposition 3.6. Let $F \sim \mathrm{ell}(h; \theta_0, \Sigma_0)$ and $H \sim \mathrm{ell}(h; \theta_0, \Sigma_1)$ with $\Sigma_1 - \Sigma_0$ being positive definite. Define $G_a$ as in (3). Then $Q(F, G_a)$ is monotonically decreasing as $a$ increases in $[0, 1]$.

This result can be shown following Propositions 3.4 and 3.5 and the definition in (3).

Remark 3.1. One naturally expects some type of monotonicity property of $Q(F, G)$ as the dispersion of $G$ "increases," but so far we have not been able to establish such a result. We tend to believe that $Q(F, G_1) \ge Q(F, G_2)$ when $F \sim \mathrm{ell}(h; \theta_0, \Sigma_0)$, $G_1 \sim \mathrm{ell}(h; \theta_0, \Sigma_1)$, and $G_2 \sim \mathrm{ell}(h; \theta_0, \Sigma_2)$, with both $(\Sigma_1 - \Sigma_0)$ and $(\Sigma_2 - \Sigma_1)$ being positive definite. This belief is in fact supported by Example 3.1.

Remark 3.2. It is expected from the theory that when $F$ and $G$ have the same location and $G$ has a smaller scale, then $Q > 1/2$. For example, when $F \sim N(0, 1)$ and $G \sim N(0, \tfrac{1}{4})$, then $Q$ is approximately .7.

To illustrate this point, we present some values of $Q(F, G)$ in Table 1, where $F \sim N((0, 0)', I)$ and $G \sim N(\mu(1, 1)', \sigma^2 I)$ with different $\mu$ and $\sigma$. Note that due to the invariance property of normal distributions, the same result holds if $I$ is replaced by a general nonsingular $\Sigma$ and if the same vector is added to both means. Also, the value of $Q$ is independent of the depth used as long as it is affine-invariant and has the strict monotonicity property defined in the beginning of Section 2. The values in Table 1 are obtained based on the following observations. Under MhD, $Q(F, G) = P(X'X \ge Y'Y \mid X \sim F, Y \sim G)$. Note that $X'X \sim \chi^2_2$, a chi-squared distribution with degree of freedom 2, and $Y'Y \sim \sigma^2 \chi^2_2(2\mu^2)$, where $\chi^2_2(2\mu^2)$ is a chi-squared distribution with degree of freedom 2 and noncentrality $(2\mu^2)$. Because $X'X$ and $Y'Y$ are independent, $Q(F, G) = P(W \le 1/\sigma^2)$, where $W$ follows a noncentral $F$ distribution with degree of freedom $(2, 2)$ and noncentrality $(2\mu^2)$. Based on the last formula, the computer software Mathematica is used to calculate $Q$ for various combinations of $\mu$ and $\sigma$.

4. AN EXACT RANK SUM TEST FOR Q

Let $X = \{X_1, X_2, \ldots, X_m\}$ and $Y = \{Y_1, Y_2, \ldots, Y_n\}$ be samples from $F$ and $G$. A well-known nonparametric test for location shift in the real line case is the Wilcoxon rank sum test. The test statistic $W$ in this method is the sum of the ranks of $Y$ values in the combined sample $X \cup Y$. The exact distribution of $W$ is the same as that of $\gamma_1 + \gamma_2 + \cdots + \gamma_n$, where $(\gamma_1, \gamma_2, \ldots, \gamma_n)$ is a random draw without replacement from the numbers $\{1, 2, 3, \ldots, m + n\}$. The exact distributions of this statistic for different $m$, $n$ are tabulated, and the asymptotics are known (see, for example, Hettmansperger 1984 and Lehmann 1975). We propose a test here for testing $H_0$: $F = G$ versus $H_a$: $Q < 1/2$. Even though there are differences in the nature of ranking here, the proposed test can be viewed as a multivariate extension of the Wilcoxon rank sum test.

Suppose that we simply combine the $X$ and $Y$ samples and define $W$ as the sum of center-outward ranks of $Y$ values according to a certain notion of data depth. Such a $W$ can surely detect an increase of scale in the $G$ population, but it cannot detect a change of location. If the scales are the same, then the distribution of this $W$ under a change of location will be similar to the null distribution. Therefore, we approach the problem of defining a meaningful rank sum statistic somewhat differently.

Suppose that there is an additional sample $Z = \{Z_1, Z_2, \ldots, Z_{n_0}\}$ from the $F$ population, perhaps with $n_0$ substantially larger than $m$, $n$. We propose to use the empirical population of $Z$, $F_{n_0}$, as the reference population. That is, for each $X_i$ we define $R(F_{n_0}; X_i)$ = the proportion of the $Z$ sample having lower depth (computed with respect to the $Z$ sample itself) than $X_i$. In other words, $R(F_{n_0}; X_i)$ = the proportion of $Z_j$'s with $D(F_{n_0}; Z_j) \le D(F_{n_0}; X_i)$. Note that $R(F_{n_0}; X_i)$ can be viewed as the relative rank of $X_i$ with respect to $Z$. Similarly, define $R(F_{n_0}; Y_j)$. Arrange the $m + n$ values $R(F_{n_0}; X_i)$ and $R(F_{n_0}; Y_j)$ for all $i$ and $j$ in an ascending order and assign ranks $1, \ldots, m + n$ to them accordingly. Let

$W$ = the sum of the ranks of $R(F_{n_0}; Y_j)$ for $j = 1, \ldots, n$.

Under the null hypothesis $F = G$, the ranks of the $R(F_{n_0}; Y_j)$'s behave like a random draw of $n$ numbers from $\{1, 2, \ldots, m + n\}$ without replacement; however, under the alternative $Q < 1/2$, the $R(F_{n_0}; Y_j)$ values will tend to be lower than the $R(F_{n_0}; X_i)$ values and, as a result, their ranks will be lower. This in turn will make $W$ relatively smaller. Thus this $W$ will be able to successfully detect changes in the index $Q$. One technical difficulty in using this $W$ may be the problem of tie-breaking. We propose the following nonrandom scheme for breaking the ties.

Tie-Breaking Scheme (*). Regard the values $R(F_{n_0}; X_i)$ and $R(F_{n_0}; Y_j)$, $i = 1, \ldots, m$ and $j = 1, \ldots, n$, as $\rho_1, \rho_2, \ldots, \rho_m, \rho_{m+1}, \ldots, \rho_{m+n}$. If $k$ of these values are equal, say $\rho_{i_1} = \rho_{i_2} = \cdots = \rho_{i_k}$ with $i_1 < i_2 < \cdots < i_k$, and $\gamma$ of the $\rho_i$'s are smaller than these $k$ values, then the ranks of $\rho_{i_1}, \rho_{i_2}, \ldots, \rho_{i_k}$ are $\gamma + 1, \gamma + 2, \ldots, \gamma + k$, in that order.

Theorem 4.1. Let $\gamma_1, \gamma_2, \ldots, \gamma_n$ be a random sample without replacement from $\{1, 2, \ldots, m + n\}$, and let $H_{m,n}$ stand for the sampling distribution of $\gamma_1 + \cdots + \gamma_n$. Assume that $F$ has density function $f(\cdot)$. Under the null hypothesis $F = G$, the conditional distribution of $W$ with the scheme (*), given a set of distinct values $\{z_1, z_2, \ldots, z_{n_0}\}$ for $Z$ and $\{x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_n\}$ for $(X, Y)$ disregarding the order, is $H_{m,n}$. Moreover, because the sample values are distinct with probability 1, the unconditional distribution of $W$ is also $H_{m,n}$.

Proof. Under the scheme (*), each of the $(m + n)!$ permutations of $\{x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_n\}$ of $X \cup Y$ corresponds to exactly one ranking of the $\rho_i$'s. The claim clearly follows from this observation.

Remark 4.1. Instead of the preceding scheme (*), one may use the following random tiebreaker. If $\rho_{i_1} = \rho_{i_2} = \cdots = \rho_{i_k}$, then select one of the $k!$ choices of ranks randomly from $\gamma + 1, \gamma + 2, \ldots, \gamma + k$ and assign them to $\rho_{i_1}, \ldots, \rho_{i_k}$.

Remark 4.2. The proposed test will require a considerably larger sample from the $F$ population (of size $n_0 + m$). But this should not be of much concern, because the same sample from the $F$ population will be used repeatedly for testing the quality of upcoming batches, a quite standard practice in the manufacturing sector.

5. Q AS A LOCATION PARAMETER

In this section it is assumed that the population $F$ is known and that one has a sample $Y = \{Y_1, Y_2, \ldots, Y_n\}$ from the population $G$. In reality it could mean either (a) one regards $F$ as the collection of all measurements of one (or several) acceptable lot(s), or (b) one specifies a model on $F$, say $F$ is an elliptical distribution (e.g., multivariate normal) with $\mu$ and $\Sigma$ obtained from the measurements of a large acceptable batch.

Recall that $Q = E_G R(F; Y)$, the mean of the random variable $R(F; Y)$ where $Y \sim G$. We may estimate $Q$ by the corresponding sample mean:

$$Q(F, G_n) = \frac{1}{n} \sum_{i=1}^{n} R(F; Y_i).$$

From Proposition 3.1 of Section 3, we obtain the following theorem.

Theorem 5.1. If the distribution of $D(F; X)$ is continuous, then the distribution of $Q(F, G_n)$, under the null $F = G$, is the same as that of $\sum_{i=1}^{n} U_i / n$, where $U_1, U_2, \ldots, U_n$ are iid $U[0, 1]$ random variables.

With the values $R(F; Y_i)$, for $i = 1, \ldots, n$, now known, the problem of testing $Q(F, G) < 1/2$ is only a location problem. One may use the sample mean $Q(F, G_n)$ to this end or resort to a nonparametric procedure, such as the sign test or the sign rank test.

Even if $F$ is assumed to be a completely specified distribution, it may be an extremely tedious job (computationally) to compute the required function $R(F; Y_i)$, except in the case when one is using MhD as the underlying depth. But it turns out that when $F$ is an elliptical distribution, the value of $R(F; Y)$ does not depend on the depth used, provided that the depth is affine-invariant and satisfies the monotonicity property. We formally state this fact as the following theorem.

Theorem 5.2. Assume that $F \sim \mathrm{ell}(h; \mu, \Sigma)$. If the depth function $D(F; \cdot)$ is affine-invariant and satisfies the strict monotonicity property on the support of $F$, then

$$R_D(F; Y) = R_{\mathrm{MhD}}(F; Y).$$

Here $R_D(F; Y)$ stands for $R(F; Y)$ derived from the depth $D(F; \cdot)$.

Proof. Under the assumed conditions, the contours of constant $D$ are of the form $(x - \mu_F)' \Sigma_F^{-1} (x - \mu_F) = c$. Thus

$$R_D(F; Y) = P_F\{(X - \mu_F)' \Sigma_F^{-1} (X - \mu_F) \ge (Y - \mu_F)' \Sigma_F^{-1} (Y - \mu_F)\} = P_F\{\mathrm{MhD}(F; X) \le \mathrm{MhD}(F; Y)\} = R_{\mathrm{MhD}}(F; Y).$$

Remark 5.1. Note that if $h(t) > 0$ on $t \in [0, k]$ for some $k > 0$ or on $[0, \infty)$, then the depths of Section 2 have strict monotonicity on the support of $F$.

Theorem 5.2 states that if $F$ is elliptical, then it suffices to obtain $R(F; Y_j)$ with MhD as the depth. Fortunately, computing MhD requires only $\mu$ and $\Sigma$ and not the function $h$.

An alternative to assuming that $F$ is elliptical with known $\mu$ and $\Sigma$ is to treat $F$ as a finite population consisting of the measurements of an acceptable, fully inspected lot and use a computer-intensive approach. This approach involves repeatedly drawing a random sample of size $n$ from $F$, computing $Q(F, G_n)$ each time, and finally obtaining a histogram of these $Q(F, G_n)$ values. Then a future value of $Q(F, G_n)$ based on an actual $Y$ from $G$ is placed on this histogram to test whether this value agrees with the hypothesis $F = G$.

6. TWO-SAMPLE ESTIMATE OF Q

In practice a more realistic situation would be that one has two samples, $X = \{X_1, \ldots, X_m\}$ from $F$ and $Y = \{Y_1, \ldots, Y_n\}$ from $G$, rather than that the distribution $F$ is completely known. In this case a natural estimate of $Q$ is

$$Q(F_m, G_n) = \frac{1}{n} \sum_{i=1}^{n} R(F_m; Y_i),$$

where $R(F_m; Y_i)$ = the proportion of $X_j$'s having $D(F_m; X_j) \le D(F_m; Y_i)$. Here $D(F_m; \cdot)$ is the empirical depth computed with respect to $F_m$. The difference of $Q(F_m, G_n)$ and $1/2$ can be used to test $H_0$: $F = G$ versus $H_a$: $Q(F, G) < 1/2$. In this section we first prove the consistency of $Q(F_m, G_n)$ and then present some asymptotic distribution results under $H_0$.

Theorem 6.1: Consistency of $Q(F_m, G_n)$. Assume that

$$\lim_{m \to \infty} \sup_{x \in \mathbb{R}^p} |D(F_m; x) - D(F; x)| = 0, \text{ almost surely} \qquad (4)$$

and that $D(F; Y)$ has a continuous distribution. Then $Q(F_m, G_n) \to Q(F, G)$ almost surely as $\min(m, n) \to \infty$.

Proof. Define

$$R_\epsilon^+(F; y) = P_F\{X : D(F; X) \le D(F; y) + \epsilon\}$$

and

$$R_\epsilon^-(F; y) = P_F\{X : D(F; X) \le D(F; y) - \epsilon\}.$$

Note that for any $\epsilon > 0$, and for all large $m$ and $n$,

$$\frac{1}{n} \sum_{i=1}^{n} R_\epsilon^-(F; Y_i) \le Q(F_m, G_n) \le \frac{1}{n} \sum_{i=1}^{n} R_\epsilon^+(F; Y_i)$$

almost surely. Now the result can be obtained using the strong law of large numbers and the following facts: As $\epsilon \to 0$,

$$E_G R_\epsilon^+(F; Y) \to E_G R(F; Y) = Q(F, G)$$

and

$$E_G R_\epsilon^-(F; Y) \to E_G R(F; Y) = Q(F, G).$$

The latter claims follow from the monotone convergence theorem and the condition that $D(F; Y)$ has a continuous distribution.

We now turn to the asymptotic null distribution of $[Q(F_m, G_n) - Q(F, G)]$ $(= [Q(F_m, G_n) - \tfrac{1}{2}])$. Assume that $F = G$ throughout the rest of this section. We believe this distribution to be $N(0, (1/m + 1/n)/12)$ in general; however, we have been able to establish this result in Theorems 6.3, 6.4, and 6.5 only in the real line case for all four depths and in the general multivariate case for MhD. Finding the limiting distribution in general multivariate cases for SD, TD, and MjD remains an open problem.

We begin with a general result on the conditional limiting distribution of the difference $(Q(F_m, G_n) - Q(F_m, F))$, where

$$Q(F_m, F) = E[Q(F_m, G_n) \mid X].$$

Theorem 6.2. Assume that $D(F; X)$ has a continuous distribution under $F$ and assume that (4) holds. Then, conditionally on $X$, as $n \to \infty$ and $m \to \infty$,

$$\sqrt{n}\,[Q(F_m, G_n) - Q(F_m, F)] \to N(0, \tfrac{1}{12})$$

along almost all sequences of $X$.

We first comment on the role played by this result on the eventual null distribution. Write

$$Q(F_m, G_n) - \tfrac{1}{2} = [Q(F_m, G_n) - Q(F_m, F)] + [Q(F_m, F) - \tfrac{1}{2}] = \text{(I)} + \text{(II)}, \text{ say.}$$

In the cases considered in Theorems 6.3 and 6.4, the limiting distribution of (II) is $N(0, 1/(12m))$. The final claim of the limit $N(0, (1/m + 1/n)/12)$ in Theorem 6.5 follows from the fact that the limit in Theorem 6.2 is independent of $X$, via a characteristic function argument.

Theorem 6.2 can be shown using the Lindeberg-Feller central limit theorem and the following lemma.

Lemma 6.1. Under the condition of Theorem 6.2 and the assumption that $F = G$, conditionally on $X$, as $m \to \infty$,

[...]

along almost all $X$ sequences.

Before stating the final result in Theorem 6.5, we first prove the following results regarding the limit distribution of (II).

Theorem 6.3. Assume that $F$ is defined on $\mathbb{R}^1$ and is continuous and that it has a density bounded above and below in a neighborhood of the median $M$. For TD, SD, and MjD in $\mathbb{R}^1$,

$$\sqrt{m}\,[Q(F_m, F) - \tfrac{1}{2}] \to N(0, \tfrac{1}{12})$$

as $m \to \infty$.

Theorem 6.4. Assume that $F$ is defined on $\mathbb{R}^p$ and is absolutely continuous. Assume also that $E_F\|X\|^{[\ldots]} < \infty$. If MhD is used to define $Q$, then, as $m \to \infty$,

$$\sqrt{m}\,[Q(F_m, F) - \tfrac{1}{2}] \to N(0, \tfrac{1}{12}).$$

Theorem 6.5. Under the conditions of Theorem 6.3 or the conditions of Theorem 6.4,

$$[(1/m + 1/n)/12]^{-1/2}\,[Q(F_m, G_n) - \tfrac{1}{2}] \to N(0, 1),$$

as $\min(m, n) \to \infty$.

Figure 1 illustrates the result established in Theorem 6.5 when $p = 3$. We let $m = n = 50$ and use MhD as the depth measure. The figure is a histogram consisting of 1,000 random values of $Z = \sqrt{300}\,(Q(F_m, G_n) - \tfrac{1}{2})$, where $F$ and $G$ are both standard normal distributions in $\mathbb{R}^3$. Note that $(1/m + 1/n)/12 = 1/300$ here.

Figure 1. 1,000 Simulations of Z's for the Three-Dimensional Case.

Our simulation results reveal a negative bias in $Q(F_m, G_n)$ that seems to increase with the dimension. According to our simulation, the bias is roughly .016 when $p = 2$ and .03 when $p = 3$. A theoretical study on the nature of this bias should be interesting. We would recommend using resampling procedures, such as jackknife or bootstrap, to eliminate this bias in high-dimensional cases. In view of the second-order properties of bootstrap, one would expect that bootstrap could capture this bias and give a better approximation of the limiting distribution of $[(1/m + 1/n)/12]^{-1/2}(Q(F_m, G_n) - \tfrac{1}{2})$ than the normal approximation. (See also Remark 5 in Section 8.) Our simulation seems to confirm this expectation. The bootstrap bias estimate we obtained is .018 when $p = 2$.

7. INDEX OF SCALE: TESTING SCALE WHEN LOCATIONS ARE DIFFERENT

We begin this section with an important remark regarding local alternatives and $Q$. There is a stark resemblance between $Q$ and the mean squared error (MSE) in terms of their sensitivity to the local change in the variance and in the location. For convenience, we consider only the univariate case ($p = 1$), though this discussion clearly carries over to general multivariate cases. Let $X \sim F$ and $Y \sim G$ on $\mathbb{R}$, where $F(\cdot) = F_0((\cdot - \theta_0)/\sigma_0)$ and $G(\cdot) = F_0((\cdot - \theta)/\sigma)$. Assume that $F_0$ has a density symmetric about 0 and has a bounded derivative. In case $\sigma = \sigma_0$ and $\theta = \theta_0 + O(n^{-1/2})$, we have

$$Q = \tfrac{1}{2} + O(n^{-1})$$

and

$$E(Y - \theta_0)^2 = E(X - \theta_0)^2 + O(n^{-1}). \qquad (5)$$

Note that the diluted effect is of order $O(n^{-1})$ on both $Q$ and the MSE, whereas the location change is of the order $O(n^{-1/2})$. On the other hand, if $\theta = \theta_0$ and $\sigma = \sigma_0 + O(n^{-1/2})$, then [...]

We state the univariate version of this result in the following theorem and present its proof in the Appendix.

Theorem 7.1. Assume that $F(\cdot) = F_0((\cdot - \theta_0)/\sigma_0)$, $G(\cdot) = F_0((\cdot - \theta)/\sigma_0)$, and $F_0$ has a bounded, uniformly continuous density symmetric around 0. With any of the four depths on the real line, if $\hat{\theta}_0$ and $\hat{\theta}$ are $\sqrt{m}$- and $\sqrt{n}$-consistent estimators of $\theta_0$ and $\theta$, then, as $\min(m, n) \to \infty$,

$$(Q_s - \tfrac{1}{2}) \to N(0, (1/m + 1/n)/12).$$

Remark 7.1. If $G$ and $F$ differ in their scales (that is, $F(\cdot) = F_0((\cdot - \theta_0)/\sigma_0)$, $G = F_0((\cdot - \theta)/\sigma)$, and $\sigma \ne \sigma_0$), then $Q(F_m, G_n^*)$ will reflect the scale difference the way $Q(F_m, G_{0n})$ does. Note that the populations of $X$ and of $Y$ shifted to the location $\theta_0$, namely $F$ and $G_0$, have the same location and the original standard deviations $\sigma_0$ and $\sigma$. For instance, if $\sigma_0 < \sigma$, then $Q(F_m, G_{0n})$ (or $Q(F_m, G_n^*)$) will tend to be smaller than $1/2$.
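The two-sample estimate of Section 6 and the standardization of Theorem 6.5 can be sketched in a few lines. The code below is an illustrative sketch, assuming NumPy and using MhD as the depth; the function names are hypothetical. The last function is only one possible reading of the location correction discussed in this section: it shifts $Y$ so that its sample mean matches that of $X$, and the choice of the sample mean as the location estimator is an assumption made here, not the article's stated construction.

```python
import numpy as np


def mahalanobis_depth(points, reference):
    """Sample MhD of each row of `points` with respect to the rows of `reference`."""
    ref = np.asarray(reference, dtype=float)          # shape (m, p)
    pts = np.asarray(points, dtype=float)             # shape (k, p)
    diff = pts - ref.mean(axis=0)
    cov = np.atleast_2d(np.cov(ref, rowvar=False))    # sample covariance S
    d = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    return 1.0 / (1.0 + d)


def q_two_sample(x, y):
    """Q(F_m, G_n) = (1/n) sum_i R(F_m; Y_i)."""
    depth_x = mahalanobis_depth(x, x)
    depth_y = mahalanobis_depth(y, x)
    # R(F_m; Y_i): proportion of X_j with D(F_m; X_j) <= D(F_m; Y_i)
    r = (depth_x[None, :] <= depth_y[:, None]).mean(axis=1)
    return float(r.mean())


def standardized_q(x, y):
    """[(1/m + 1/n)/12]^(-1/2) (Q(F_m, G_n) - 1/2), the statistic of Theorem 6.5."""
    m, n = len(x), len(y)
    return (q_two_sample(x, y) - 0.5) / np.sqrt((1.0 / m + 1.0 / n) / 12.0)


def q_location_corrected(x, y):
    """Assumed scale-only variant: shift Y so its sample mean matches that of X."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return q_two_sample(x, y - y.mean(axis=0) + x.mean(axis=0))
```

Large negative values of `standardized_q` point toward $Q < 1/2$. In view of the negative bias reported in Section 6, the normal reference should be treated as approximate in small samples.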

8. SOME CONCLUDINGREMARKS, QUESTIONS, AND OPEN PROBLEMS


1. Ifone usesTD, SD orMjD to define Q, thenthetheory presented hereis free ofanymoment requirement. This may be regarded as an advantageof our approachfortesting looftheoutputofa production cation,scale,or inconsistency line over othermoment-dependent methods. For instance, in Q could be used to testa possibleincreaseof dispersion a multivariate Cauchypopulation. 2. Kolmogorov-Smimov fortesting (K-S)-typestatistics, F = G versus F # G, can also detect locationshift and scale changesimultaneously. (By thescale changewe mean both scale increaseand decrease.)But in qualityassurance,the K-S may well give a falsealarm whenthe locationof Y1's staysthe same (accuracystaysthe same) and its scale deis improved), is obviously creases which a desirable (precision in qualitycontrol. In thiscase of scale decrease,Q property in willactuallygivehigher values,indicating improvement withthetarget measurements. consistency 3. Q maybe viewedas a "loss function"-free measureof theaverageincreased "deviation"from a preset target value, in contrast to othermeasuressuch as MSE matrix (around thetarget value). 4. How would thepowerfunction dependon thechoice ofdata depth? Is there an optimalnotionofdepthfor a given For ellipticaldistrilocation-scale familyof distributions? have similarcontours.It butions,all fourdepthfunctions thishas on the shouldbe interesting to see whatimplication within theclass of elliptical distributions. powerfunctions of the bootstrapasserts, 5. The second-order property thatifthelimiting distribution is free from roughly speaking, with thelimiting population parameters (as in ourcase here, distribution N(0, (1 /m + 1/n)/12)), thenthebootstrap apthanthelimis closerto theactualdistribution proximation distribution. This extraaccuracyobtainedin thebootiting strap is due to the cancellationof an extra termin the asymptotic expansion of the samplingdistribution. 
This suggests thatin principle one shouldbe able to geta better approximation forthe sampling distribution of Q by using thebootstrap. Our simulation results support thissuggestion.

Q(F, G)= 2 +

(n-)

Q= 2 + O(n112 and E(Y


-o)2

E(X

-o)2

+ O(n-1/2).

(6)

The implication of(5) is thattests based on Q are locally inferior to other standard tests for locationshift alone (keepingthedispersion unchanged).On thepositive side,(5) and (6) suggest thatto testscale changeonly,one could translate G to G?( *) = G( * - ( - 00)), redefine Q betweenF and Go, and thenobtain a testforscale accordingly. This test seemsto havegoodlocal properties. Specifically, we letQS(F, G) = Q(F, GO).Thena natural two-sample estimate ofQS(F, G) is

Q,

Qs(Fm, Gn)(=Q(Fm,

G*)),

whereG* standsforthe empiricaldistribution of the ad- ( - o), i = justed sample Y * Y ...,n} and 6 and bo are l/_and lm-consistent estimators for0 and 00. Because the effect of a location shifton Q is diluted, one expectsto see that Q(Fm, G*) = Q(Fm, GOn) + o(max(m-1/2, n-1/2)). Here Go standsforthe theoretic empiricaldistribution of Y = {Y = Yi- (6 - Oo),
i
=

l, ...

.,

sequently, Q(Fm, Gn*) and Q(Fm, G?n) shouldhavethesame limitingdistribution under the hypothesis that* F and G merely in location(in thiscase Fand G? areidentical) diffier .

n },

which is a theoretical sample from G?. Con-

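As an illustration of the estimates discussed in Section 7, the following is a minimal Python sketch for the univariate case only. It assumes the rank form $R(F_m; y) = 2[F_m(y) \wedge (1 - F_m(y))]$ that the Appendix uses for TD, SD, or MjD on the line, and it standardizes by the asymptotic variance $(1/m + 1/n)/12$. The function names (`rank_univariate`, `quality_index`, `scale_index`, `z_score`) and the use of the sample median as the location estimator are illustrative assumptions, not part of the article.

```python
import numpy as np


def rank_univariate(x_ref, y):
    """R(F_m; y) = 2 * min(F_m(y), 1 - F_m(y)), with F_m the empirical CDF of x_ref."""
    x_sorted = np.sort(np.asarray(x_ref, dtype=float))
    # F_m(y) = (number of reference points <= y) / m
    fm = np.searchsorted(x_sorted, np.asarray(y, dtype=float), side="right") / x_sorted.size
    return 2.0 * np.minimum(fm, 1.0 - fm)


def quality_index(x, y):
    """Sample version Q(F_m, G_n): the average of R(F_m; Y_i) over the Y sample."""
    return float(np.mean(rank_univariate(x, y)))


def scale_index(x, y, loc=np.median):
    """Location-corrected estimate: shift Y by (theta_hat - theta0_hat), then compute Q.

    The median is an assumed choice of location estimator; any root-n consistent
    estimator could be passed in through `loc`.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    y_star = y - (loc(y) - loc(x))  # adjusted sample Y*
    return quality_index(x, y_star)


def z_score(q, m, n):
    """Standardize q - 1/2 using the asymptotic variance (1/m + 1/n)/12."""
    return (q - 0.5) / np.sqrt((1.0 / m + 1.0 / n) / 12.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(loc=0.0, scale=1.0, size=500)  # reference (in-control) sample
    y = rng.normal(loc=3.0, scale=2.0, size=400)  # shifted sample with larger spread
    q_s = scale_index(x, y)
    # Large negative z values point toward a scale increase (Q_s < 1/2).
    print(q_s, z_score(q_s, x.size, y.size))
```

Passing the estimator through `loc` keeps the sketch agnostic about how the locations are estimated; the one-sided use of the z value mirrors testing Q = .5 against Q < .5.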

APPENDIX: PROOFS

A.1 Proof of Proposition 3.3

We state one of the key steps of the proof as the following lemma.

Lemma 3.1. Let $F \sim \mathrm{ell}(h; \theta_0, \Sigma_0)$. If D is affine-invariant and has the strict monotonicity property on the support of F, then the contours $\{x: D(F; x) = c\}$ are of the form

$$(x - \theta_0)'\Sigma_0^{-1}(x - \theta_0) = d_c$$

and the contours are nested within one another as c decreases.

Proof. Let Z be a random variable with distribution H, $H \sim \mathrm{ell}(h; \theta_0, I)$. Applying proper rotations to Z and both the affine invariance and the monotonicity properties of $D(H; \cdot)$, we see that the contours $\{x: D(H; x) = c\}$ are nested spheres. We then apply the transformation $(X - \theta_0) = \Sigma_0^{1/2}(Z - \theta_0)$ (which implies $X \sim F$) and the invariance of $D(F; \cdot)$ to see that the transformed contours are elliptical and nested.

As a direct consequence of Lemma 3.1, the set $\{R(F; Y) \ge t\}$ is of the form $(Y - \theta_0)'\Sigma_0^{-1}(Y - \theta_0) \le t^*$. Using theorem 1 of Anderson (1955), we see that $P_G((Y - \theta_0)'\Sigma_0^{-1}(Y - \theta_0) \le t^*)$ decreases monotonically as $\theta$ moves away from $\theta_0$ on a line. The result is thus proved, because

$$E_G R(F; Y) = \int_0^1 P_G(R(F; Y) \ge t)\, dt.$$

A.2 Proof of Lemma 6.1

It suffices to show that $R(F_m; y)$ converges to $R(F; y)$ for almost all fixed y (with respect to F) along almost all sequences X. Fix a sequence X along which (4) holds. For any given $\varepsilon > 0$, $\sup_{y \in \mathbb{R}^p} |D(F_m; y) - D(F; y)| \le \varepsilon/2$ for all $m \ge m_0$ for some $m_0$. Therefore, for all such m,

$$\{Y: D(F; Y) \ge D(F; y) + \varepsilon\} \subset \{Y: D(F_m; Y) \ge D(F_m; y)\} \subset \{Y: D(F; Y) \ge D(F; y) - \varepsilon\}.$$

The claim is deduced by letting $\varepsilon$ tend to 0.

A.3 Proof of Theorem 6.3

In the cases considered here we can write

$$Q(F_m, F) - \tfrac{1}{2} = 2\int [F_m(x) \wedge (1 - F_m(x))]\, dF(x) - \tfrac{1}{2}$$
$$= -2\int F(x)\, d[F_m(x) \wedge (1 - F_m(x))] - \tfrac{1}{2} \quad \text{(using integration by parts)}$$
$$= -2\Big(\frac{1}{m}\sum_{i=1}^m \xi_i\Big) + 2[1 - F_m(M_m-)] - \tfrac{1}{2} + O(m^{-1})$$
$$= -2\Big(\frac{1}{m}\sum_{i=1}^m \xi_i\Big) + \tfrac{1}{2} + O(m^{-1}),$$

where $M_m$ is the sample median of $F_m$ and

$$\xi_i = F(X_i) \ \text{if } X_i \le M_m, \qquad \xi_i = 1 - F(X_i) \ \text{if } X_i > M_m.$$

Let

$$\xi_{i,t} = F(X_i) \ \text{if } X_i \le t, \qquad \xi_{i,t} = 1 - F(X_i) \ \text{if } X_i > t.$$

Note that $\xi_i = \xi_{i,M_m}$. Note also that $E\xi_{i,t} = \tfrac{1}{4} + O((t - M)^2)$ in a neighborhood of M. Here M is the median of F. This can be shown by checking that

$$\xi_{i,t} - \xi_{i,M} = O(|t - M|)$$

on the interval [M, t] if M < t or on the interval [t, M] if M ≥ t, and = 0 elsewhere. Thus it remains to be shown that

$$2\sqrt{m}\,\Big[\frac{1}{m}\sum_{i=1}^m \big(\xi_{i,M_m} - E\xi_{i,M}\big)\Big] \xrightarrow{\ \mathcal{L}\ } N(0, \tfrac{1}{12})$$

as $m \to \infty$. Because the distribution of $2\xi_{i,M}$ is U[0, 1], this is achieved by showing

$$T(M_m) = \frac{1}{m}\sum_{i=1}^m \big(\xi_{i,M_m} - E\xi_{i,M_m}\big) - \frac{1}{m}\sum_{i=1}^m \big(\xi_{i,M} - E\xi_{i,M}\big) = o_p(m^{-1/2}).$$

Toward this end, we show that

$$\sup_{|x - M| \le m^{-1/2}\log m} |T(x)| = o_p(m^{-1/2}),$$

using a standard set of arguments on the line. The sup is estimated by the max at the endpoints of shrinking partitions of length $m^{-1}$, and finally a probability bound coupled with the Bonferroni inequality is used.

A.4 Proof of Theorem 6.4

Let $\bar{X}$ and S denote the sample mean and sample dispersion matrix of the X data set. Given some $x \in \mathbb{R}^p$, define

$$A(x, \theta, A) = \{y \in \mathbb{R}^p: (y - \theta)'A^{-1}(y - \theta) \le (x - \theta)'A^{-1}(x - \theta)\},$$

where $\theta$ is a p × 1 vector and A is a p × p invertible matrix. With this notation, one can write

$$-[Q(F_m, F) - \tfrac{1}{2}] = \int \{F_m(A(x, \bar{X}, S)) - F(A(x, \bar{X}, S))\}\, dF(x),$$

provided that S is nonsingular, which is true almost surely for all large n. Because for any fixed $\mu$ and nonsingular $\Sigma$, $F(A(X, \mu, \Sigma))$ is distributed as U[0, 1], it follows that

$$\sqrt{m}\int \{F_m(A(x, \mu, \Sigma)) - F(A(x, \mu, \Sigma))\}\, dF(x) \xrightarrow{\ \mathcal{L}\ } N(0, \tfrac{1}{12}).$$

The proof is completed by showing that

$$\int \{F_m(A(x, \bar{X}, S)) - F(A(x, \bar{X}, S)) - F_m(A(x, \mu, \Sigma)) + F(A(x, \mu, \Sigma))\}\, dF(x) = o_p(1/\sqrt{m}).$$

Note that this result needs suitable estimates of fluctuations of multivariate empirical processes. We have been unable to find results in probability literature that could meet this need. On the other hand, we have succeeded in constructing a very lengthy and tedious proof for the bivariate case along the lines of Bahadur-Kiefer representation on the real line (see, for instance, Babu and Singh 1978). We choose to omit this part of the proof here.

A.5 Proof of Theorem 7.1

Without loss of generality, we assume that $\sigma_0$ is equal to 1. It suffices to prove that as $l = \min\{n, m\} \to \infty$,

$$\sqrt{l}\,\big(Q(F_m, G_n^*) - Q(F_m, G_n^0)\big) \to 0$$

in probability.


Note that when $D(F; \cdot)$ is taken as TD, SD, or MjD on the line,

$$Q(F_m, G_n^*) = \int 2[F_m(y) \wedge (1 - F_m(y))]\, dG_n^*(y), \qquad (A.1)$$

where $G_n^*(\cdot)$ is the empirical distribution of $Y^*$, and $a \wedge b = \min\{a, b\}$.

The following proof is for $Q(F_m, G_n^*)$ in (A.1). A similar but separate argument is needed for the case of MhD. To prove the claim for (A.1), we first fix constants K and $\varepsilon$ such that $P(|(\hat\theta - \hat\theta_0) - d| > K/\sqrt{l}) < \varepsilon$ and show that, with $d = \theta - \theta_0$,

$$\sup_{|\Delta| \le K/\sqrt{l}} \sqrt{l}\,\Big|\int \big\{[F_m(y - d) \wedge (1 - F_m(y - d))] - [F_m(y - d + \Delta) \wedge (1 - F_m(y - d + \Delta))]\big\}\, dG_n(y)\Big| \to 0 \qquad (A.2)$$

as $l \to \infty$. We write this expression as (I) + (II), where (I) = (A.2) with $F_m$ replaced by F and (II) = the difference of (A.2) and (I). On the interval $|y - \theta| \ge \varepsilon$, the integrand in (I) equals (with $f(\cdot) = F'(\cdot) = f_0(\cdot - \theta_0)$)

$$\Delta\,[f(y - d)\,\mathrm{sign}(y - \theta) + \delta_l(y)], \qquad (A.3)$$

where $\sup_y |\delta_l(y)| \to 0$ as $l \to \infty$, using the uniform continuity of the density. On the other hand, when $|y - \theta| < \varepsilon$, the integrand in (I) is of the order $O(l^{-1/2})$ uniformly in $\Delta$. Thus

$$\text{(I)} = \frac{\Delta}{n}\sum_{i=1}^n t(Y_i) + O(\varepsilon\, l^{-1/2}),$$

where $t(Y_i) = f(Y_i - d)\,\mathrm{sign}(Y_i - \theta)\,I(|Y_i - \theta| \ge \varepsilon) = f_0(Y_i - \theta)\,\mathrm{sign}(Y_i - \theta)\,I(|Y_i - \theta| \ge \varepsilon)$. Because $E\,t(Y_i) = 0$ for all i and $\varepsilon$ is arbitrary, (I) = $o_p(l^{-1/2})$. Now it remains to be shown that

$$\sqrt{l}\,|\text{(II)}| \to 0$$

in probability. The result follows from the known bounds on the empirical process. We estimate the integral in two different regions separately: $|y - \theta| \le l^{-1/8}$ and $|y - \theta| > l^{-1/8}$. On the first we use the usual bound $O_p(n^{-1/2})$ on the empiricals; on the second we use the oscillation bound on the empirical process (see, for example, Stute 1982).

[Received November 1990. Revised June 1992.]

REFERENCES

Anderson, T. W. (1955), "The Integral of a Symmetric Unimodal Function Over a Symmetric Convex Set and Some Probability Inequalities," Proceedings of the American Mathematical Society, 6, 170-176.
Babu, G. J., and Singh, K. (1978), "On Deviation Between Empirical and Quantile Processes for Mixing Random Variables," Journal of Multivariate Analysis, 8, 532-549.
Brown, B. M., and Hettmansperger, T. P. (1987), "Affine-Invariant Rank Methods in the Bivariate Location Model," Journal of the Royal Statistical Society, Ser. B, 49, 301-310.
Brown, B. M., and Hettmansperger, T. P. (1989), "The Affine-Invariant Bivariate Version of the Sign Test," Journal of the Royal Statistical Society, Ser. B, 51, 117-125.
Brown, B. M., Hettmansperger, T. P., Nyblom, J., and Oja, H. (1992), "On Certain Bivariate Sign Tests and Medians," unpublished manuscript submitted to Journal of the American Statistical Association.
Hettmansperger, T. P. (1984), Statistical Inference Based on Ranks, New York: John Wiley.
Lehmann, E. L. (1975), Nonparametrics: Statistical Methods Based on Ranks, San Francisco: Holden-Day.
Liu, R. (1988), "On a Notion of Simplicial Depth," in Proceedings of the National Academy of Sciences, 85, 1732-1734.
Liu, R. (1990), "On a Notion of Data Depth Based on Random Simplices," The Annals of Statistics, 18, 405-414.
Mahalanobis, P. C. (1936), "On the Generalized Distance in Statistics," in Proceedings of the National Academy of India, 12, 49-55.
Oja, H. (1983), "Descriptive Statistics for Multivariate Distributions," Statistics and Probability Letters, 1, 327-332.
Oja, H., and Nyblom, J. (1989), "On Bivariate Sign Tests," Journal of the American Statistical Association, 84, 249-259.
Rousseeuw, P. J., and Leroy, A. M. (1987), Robust Regression and Outlier Detection, New York: John Wiley.
Singh, K. (1991), "A Notion of Majority Depth," technical report, Rutgers University, Dept. of Statistics.
Stute, W. (1982), "The Oscillation Behavior of Empirical Processes," The Annals of Probability, 10, 86-107.
Tukey, J. W. (1975), "Mathematics and Picturing Data," in Proceedings of the 1974 International Congress of Mathematicians, Vancouver, 2, 523-531.
