
RANK CORRELATION: SPEARMAN'S AND KENDALL'S COEFFICIENTS

Tests of Rank Order

Rank order connotes the systematic and orderly arrangement of the quantities or qualities/attributes associated with the observed entities, in ascending or descending order of their magnitudes. Objects can also be arranged in order of merit, such as the best, the second best, ..., the worst in the group, along a continuum. However, before the observed items/units are arranged, each has to be assigned a discrete number such as 1, 2, 3, 4 on the basis of its magnitude in the case of cardinal measurement, its score in the case of ordinal measurement, or its order of merit. These discrete numbers are defined as the ranks, which are allotted on the basis of the relative position of each unit in the arrangement. Rank ordering of the information/data transforms it into a rectangular distribution. Rank ordering is valid for, and hence applicable to, both cardinally and ordinally measured variables and qualitative attributes.

Rank ordering is useful as (i) a preliminary device to screen the observations or discriminate among them; (ii) an exploratory method to evaluate their nature; and (iii) a test for certain types of data that may not be amenable to other statistical tools. As a device for testing hypotheses the method also has certain advantages: it is (i) simple to understand; (ii) easy to operate; and (iii) independent of unrealistic and complex assumptions.

Types of Hypotheses Amenable to Rank Ordering

The following types of hypotheses may be evaluated easily but rigorously through rank ordering:

(i) The randomness of the rank ordering of any given set of observations. In this case, the problem is to determine whether a given order of arrangement of values/scores is random. Thus, it is one among several tests of randomness of data.

(ii) The independence of two treatments, judgments or evaluations. The problem of independence requires us to know whether the ranks awarded to the observed units are correlated or independent. For this purpose, one may estimate the rank correlation coefficient.

Examples

(I) In a test of ability to distinguish shades of colour, 15 discs of various shades, whose true order is 1, 2, ..., 15, are arranged by a subject in the order 7, 4, 2, 3, 1, 10, 6, 8, 9, 5, 11, 15, 14, 12, 13. Find the rank correlation coefficients $\rho$ and $\tau$ between the real and the observed ranks.

(II) Ten competitors in a beauty contest are ranked by three judges in the orders

1, 6, 5, 10, 3, 2, 4, 9, 7, 8
3, 5, 8, 4, 7, 10, 2, 1, 2, 6
6, 4, 9, 8, 1, 2, 3, 10, 5, 7

Use rank correlation coefficients to discuss which pair of judges has the nearest approach to common tastes in beauty.

(III) Tied Ranks

$$T_X = \frac{1}{12} \sum_j \left(t_j^3 - t_j\right)$$

$$\rho = \frac{\frac{1}{6}(n^3 - n) - (T_X + T_Y) - \sum d^2}{\left[\frac{1}{6}(n^3 - n) - 2T_X\right]^{1/2} \left[\frac{1}{6}(n^3 - n) - 2T_Y\right]^{1/2}}$$

Here $t_j$ is the number of observations tied in the $j$-th group of ties in the X ranking, $T_Y$ is defined in the same way for the Y ranking, and $d$ is the difference between paired ranks.

(IV) Two foremen rank ten employees according to suitability for promotion as follows:

Employee    Foreman 1    Foreman 2
A           1½           [illegible]
B           1½           [illegible]
C           3            4
D           4            4
E           6            4
F           6            6
G           6            7
H           8            8
I           9½           [illegible]
J           9½           10

Find the rank correlation.
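The tied-rank formula in (III) can be sketched in code as follows. This is a minimal illustration only: the function names `tie_correction` and `spearman_tied` are hypothetical, and the two rank lists are invented midranks, not data from the examples.

```python
from collections import Counter
from math import sqrt


def tie_correction(ranks):
    """T = (1/12) * sum of (t^3 - t) over groups of t tied ranks."""
    # Untied ranks have t = 1 and contribute nothing.
    return sum(t**3 - t for t in Counter(ranks).values()) / 12.0


def spearman_tied(x_ranks, y_ranks):
    """Rank correlation from two lists of midranks, using the tied-rank formula."""
    n = len(x_ranks)
    if n != len(y_ranks):
        raise ValueError("rank lists must have equal length")
    base = (n**3 - n) / 6.0
    t_x = tie_correction(x_ranks)
    t_y = tie_correction(y_ranks)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(x_ranks, y_ranks))
    return (base - (t_x + t_y) - sum_d2) / sqrt((base - 2 * t_x) * (base - 2 * t_y))


# Hypothetical midranks, invented purely for illustration.
x = [1.5, 1.5, 3, 4, 5]
y = [1, 2, 3.5, 3.5, 5]
print(round(spearman_tied(x, y), 4))  # 18/19, i.e. 0.9474
```

With no ties both corrections vanish and the expression reduces to the usual $1 - 6\sum d^2 / (n^3 - n)$.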

Q1. The following are three sets of weekly wage earnings of workers (in Rs.), one set from each of three factories. Find whether there is any relation between the workers' earnings among these factories.

Weekly wage (in Rs.)

Set 1    Set 2    Set 3
  3       13       10
  9       27       21
 17       23       12
  2       20       15
 30       18       25
 25       12       22
 24       21       16
 15       16       18
 16       15       13
 11       28       17

Rank within each set

Set 1    Set 2    Set 3
  2        2        1
  3        9        8
  7        8        2
  1        6        4
 10        5       10
  9        1        9
  8        7        5
  5        4        7
  6        3        3
  4       10        6

Q2. In a trade test, a person is asked to arrange 8 metals in order of their hardness. Do the following three sets of ranks show any evidence of discrimination?

No. 1: 1, 2, 5, 4, 3, 7, 6, 8 (4 inversions)
No. 2: 4, 2, 1, 5, 3, 8, 7, 6 (8 inversions)
No. 3: 1, 3, 2, 6, 4, 5, 8, 7 (4 inversions)

Q3. Twelve students obtain the following ranks in examinations in two subjects. Do these ranks indicate a close relation between the attainments in the two tests?

Rank on First Examination    Rank on Second Examination
 1                            3
 2                            4
 3                            1
 4                            2
 5                            5
 6                            9
 7                           11
 8                            6
 9                            8
10                            7
11                           12
12                           10

Q4. A third danger is that one extreme rank in a matrix not otherwise significant may lead to a significant test. Consider the following hypothetical matrix:

        1     2     3     4     5     6     7
        2     3     4     5     6     1     7
        3     4     5     6     1     2     7
        4     5     6     1     2     3     7
        5     6     1     2     3     4     7
        6     1     2     3     4     5     7
Sum    21    21    21    21    21    21    42

Hypotheses to test

Rank order transformation can be used to test two major hypotheses: the hypothesis of random order and the hypothesis of independence. In testing the hypothesis of random order, the problem is to determine whether a series of values, arranged in the order in which they were observed or in some other specified order, can be considered a random arrangement. In testing the hypothesis of independence, the problem is to determine whether two or more series of ranks are correlated, or whether the rank order of one set is independent of the rank order of any of the others. Several methods of testing these hypotheses will be described: the inversion frequency distribution, the rank order correlation coefficient of Spearman and a t test derived by Kendall, the W distribution suggested by Kendall, which can be referred to the usual z tables, and the chi-square test developed by Friedman.

The inversion frequency distribution

To find the number of inversions in a series of rank orders, start at the left of the series and proceed to the right, taking each rank in turn. For each rank, count the number of ranks to its right that are smaller. Consider, for example, the three series 1, 4, 3, 2, 5; 2, 1, 5, 4, 3; and 5, 1, 2, 6, 4, 3:

Rank     Number of       Rank     Number of       Rank     Number of
         inversions               inversions               inversions
1        0               2        1               5        4
4        2               1        0               1        0
3        1               5        2               2        0
2        0               4        1               6        2
5        0               3        0               4        1
                                                  3        0
Total:   3               Total:   4               Total:   7
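The counting rule just described can be sketched in a few lines of Python. The function name `count_inversions` is a hypothetical choice for this illustration; the three series are the ones used in the text.

```python
def count_inversions(ranks):
    """Return (per-rank counts, total): for each rank, how many ranks
    to its right are smaller."""
    per_rank = [
        sum(1 for later in ranks[i + 1:] if later < r)
        for i, r in enumerate(ranks)
    ]
    return per_rank, sum(per_rank)


for series in ([1, 4, 3, 2, 5], [2, 1, 5, 4, 3], [5, 1, 2, 6, 4, 3]):
    per_rank, total = count_inversions(series)
    print(series, per_rank, total)
# totals: 3, 4 and 7
```

This direct double loop takes time proportional to the square of the series length, which is ample for the short series considered here.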

It was stated in the chapter on chi-square that, if m objects can be arranged in all possible orders from 1 to m, and if the order 1, 2, 3, ..., m is considered a standard order, then the frequency distribution of the number of inversions of rank (x), as measured from this standard order, is a discrete, single-peaked, symmetrical distribution with moments that depend only on m:

$$\bar{X} = \frac{m(m-1)}{4}, \qquad \sigma_x^2 = \frac{m(m-1)(2m+5)}{72}$$

This distribution, which is symmetrical about the population mean $\bar{X}$, approaches the normal distribution very rapidly, so that the approximation is close when m is as small as 10. For a small number of ranks, up to and including m = 10, it is preferable to use the actual inversion distribution for making probability statements; for m of 11 or larger, the normal probability table can be used. Table 43 gives the inversion frequency distribution for each of the cases of 2 through 10 ranks; Table 44 gives the probability of obtaining by chance a value equal to or less than, or equal to or larger than, the obtained number of inversions, x. This latter table is based upon cumulative frequencies derived from the former.

Hypothesis of random order. Assume that we wish to make a quick test of the randomness of a series of observations, such as the consecutive values of weekly wages obtained from employed youth. Consider the following data, obtained from 30 consecutive interviews divided into time groups of 10 (begin at the upper left, read down each column, and end at the lower right). The problem is to test whether these values are in random order:

  3       13       10
  9       27       21
 17       23       12
  2       20       15
 30       18       25
 25       12       22
 24       21       16
 15       16       18
 16       15       13
 11       28       17

Ranking by sets of 10, we have

  2        2        1
  3        9        8
  7        8        2
  1        6        4
 10        5       10
  9        1        9
  8        7        5
  5        4        7
  6        3        3
  4       10        6
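As a rough sketch, the ranking step and the inversion count for each time group can be reproduced as follows. The helper names `rank_within` and `inversions` are hypothetical, and the three lists are the wage figures as read down the columns of the table, one list per time group.

```python
def rank_within(values):
    """Rank values from 1 (smallest) to n; assumes no ties within the group."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def inversions(ranks):
    """Total number of pairs in which a larger rank precedes a smaller one."""
    return sum(
        1
        for i in range(len(ranks))
        for j in range(i + 1, len(ranks))
        if ranks[i] > ranks[j]
    )


# Wage figures read down each column of the table (one list per time group).
groups = [
    [3, 9, 17, 2, 30, 25, 24, 15, 16, 11],
    [13, 27, 23, 20, 18, 12, 21, 16, 15, 28],
    [10, 21, 12, 15, 25, 22, 16, 18, 13, 17],
]

for group in groups:
    ranks = rank_within(group)
    print(ranks, inversions(ranks))
# inversion totals: 20, 24 and 19
```

The sketch ranks only within a group of 10, where no two values coincide; pooling groups would introduce tied values, which this simple ranking helper does not handle.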

From these three sets we obtain 20, 24, and 19 inversions respectively. The expected number is

$$\frac{10 \times 9}{4} = 22.5.$$

Since all of these values are within one standard deviation of the mean ($\sigma_x = 5.59$), we may infer that, so far as groups of 10 are concerned, these values are quite random. Table 44 shows the exact probability of getting a larger deviation from the population mean than the one obtained.

Taking the first 20 numbers as a sequence gives 80 inversions, which is about one standard deviation below the expected number, $20 \times 19 / 4$, or 95. This is evidence that the first 20 values may be considered a random sequence.

Ranking all 30 values gives a total of 201 inversions, where the expected number is $30 \times 29 / 4$, or 217.5. Since $\sigma_x$ is 28.0, this represents a deviation from the mean of

$$\frac{201 - 217.5}{28.0} = -0.59.$$
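The expected values, standard deviations and standardized deviations used in this passage follow directly from the two moment formulas, $m(m-1)/4$ and $m(m-1)(2m+5)/72$. A minimal sketch, with hypothetical function names and the observed inversion counts taken as quoted in the text:

```python
from math import sqrt


def inversion_moments(m):
    """Mean and standard deviation of the number of inversions among m ranks."""
    mean = m * (m - 1) / 4
    sd = sqrt(m * (m - 1) * (2 * m + 5) / 72)
    return mean, sd


def z_score(x, m):
    """Standardized deviation of an observed inversion count x."""
    mean, sd = inversion_moments(m)
    return (x - mean) / sd


# (number of ranks, observed inversions) as quoted in the text
for m, x in [(10, 20), (10, 24), (10, 19), (20, 80), (30, 201)]:
    mean, sd = inversion_moments(m)
    print(m, x, mean, round(sd, 2), round(z_score(x, m), 2))
# m = 10: mean 22.5, sd 5.59
# m = 30: mean 217.5, sd 28.03, z = -0.59
```

For m of 10 or fewer the exact inversion distribution is preferable to the normal approximation, so the z values for the groups of 10 are only a rough guide.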

Referred to a table of normal deviates, this value is not significant. Hence the entire sequence, as well as the shorter ones, is consistent with the hypothesis of random order.

In one type of psychological problem, the hypothesis of rank order can be used as a test of the ability to discriminate. Consider, as an example, the problem of how far apart each pair of a series of ordered weights (of equal volume) should be in order for a person to be able to perceive a difference between adjacent pairs. If adjacent weights are widely different, discrimination will be perfect (the number of inversions will be equal to zero), whereas, if they are close together, discrimination will be difficult and the number of inversions will increase. If ability to discriminate is lacking, then the number of inversions will tend toward the mean number $m(m-1)/4$. This ability to discriminate can be studied by using inversions y as a function of the differences in adjacent weights. Clearly this principle can be applied to a wide variety of problems involving discrimination.

In a trade test, a person is asked to arrange 8 metals in order of their hardness. Do the following three sets of ranks show any evidence of discrimination?

No. 1: 1, 2, 5, 4, 3, 7, 6, 8 (4 inversions)
No. 2: 4, 2, 1, 5, 3, 8, 7, 6 (8 inversions)
No. 3: 1, 3, 2, 6, 4, 5, 8, 7 (4 inversions)

The first and last, each with 4 inversions and a probability (Table 44) of 0.007, show real evidence of discrimination, whereas the middle one, with a probability of 0.089, is not so clear.

The hypothesis of independence. The hypothesis of independence may be considered in two parts: one in which the independence of two sets of m ranks is tested, and the more general case in which the independence of k sets of m ranks is tested. The first case is one of testing whether a significant correlation exists between two sets of ranks.
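The tail probabilities quoted for the trade test (0.007 for 4 inversions and 0.089 for 8 inversions among 8 ranks) can be reproduced by building the exact inversion frequency distribution. The sketch below is one possible way to do so, not necessarily how the printed tables were computed; the function names are hypothetical.

```python
from math import factorial


def inversion_counts(m):
    """counts[x] = number of orderings of m ranks with exactly x inversions."""
    counts = [1]  # a single rank: one ordering, zero inversions
    for size in range(2, m + 1):
        new = [0] * (len(counts) + size - 1)
        # Inserting the new largest rank into an ordering of size - 1 ranks
        # adds between 0 and size - 1 inversions, one way for each amount.
        for x, c in enumerate(counts):
            for extra in range(size):
                new[x + extra] += c
        counts = new
    return counts


def prob_at_most(x, m):
    """Chance probability of x or fewer inversions among m ranks."""
    counts = inversion_counts(m)
    return sum(counts[: x + 1]) / factorial(m)


print(round(prob_at_most(4, 8), 3))  # 0.007
print(round(prob_at_most(8, 8), 3))  # 0.089
```

Because the distribution is symmetrical, the probability of a count equal to or larger than a given value can be obtained from the same list by summing from the other end.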
Testing the first case calls for the calculation of some type of correlation coefficient based on ranks, and the testing of the hypothesis that this coefficient is zero. The common rank correlation coefficient, as given by Spearman, is

$$r_d = 1 - \frac{6 \sum d^2}{m(m^2 - 1)}$$

where d is the difference between corresponding ranks, and there are m ranks. A major limitation of this test statistic is that its distribution is ragged and approaches normality slowly. Kendall has shown that, if m is large, $r_d$ can be tested by means of the t distribution, where

$$t = r_d \left[\frac{m - 2}{1 - r_d^2}\right]^{1/2}$$

and the t table is entered with m − 2 degrees of freedom.
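As an illustration of the two formulas, the following sketch evaluates $r_d$ and t for the twelve pairs of examination ranks used in this section's worked example. The function names `spearman_rd` and `t_statistic` are hypothetical.

```python
from math import sqrt

# Ranks of 12 students on the first and second examinations.
first = list(range(1, 13))
second = [3, 4, 1, 2, 5, 9, 11, 6, 8, 7, 12, 10]


def spearman_rd(a, b):
    """r_d = 1 - 6 * sum(d^2) / (m * (m^2 - 1)) for untied ranks."""
    m = len(a)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * sum_d2 / (m * (m**2 - 1))


def t_statistic(r, m):
    """t = r * sqrt((m - 2) / (1 - r^2)), with m - 2 degrees of freedom."""
    return r * sqrt((m - 2) / (1 - r**2))


r_d = spearman_rd(first, second)
t = t_statistic(r_d, len(first))
print(round(r_d, 2), round(t, 2))  # 0.79 and about 4.08
```

Carrying $r_d$ at full precision gives t of about 4.08; rounding $r_d$ to 0.79 before substituting gives 4.07, so small differences in the second decimal are a matter of rounding only.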
Example: What is the value of $r_d$ in the following set of 12 ranks, and is it significantly different from zero?

Rank on First Examination    Rank on Second Examination
 1                            3
 2                            4
 3                            1
 4                            2
 5                            5
 6                            9
 7                           11
 8                            6
 9                            8
10                            7
11                           12
12                           10

The value of $r_d$ is

$$1 - \frac{6 \times 60}{12 \times 143} = 0.79.$$

The value of t is

$$\frac{0.79 \times 10^{1/2}}{0.3759^{1/2}} = 4.07.$$

For 10 degrees of freedom the value of t at the 1 per cent level is 3.17; hence the coefficient can be assumed to be different from zero.

A rank order correlation coefficient based upon the number of inversions x of one series relative to the other can be defined as follows:

$$r' = 1 - \frac{4x}{m(m-1)}$$

If the two sets of ranks are in perfect inverse order, then x is the maximum number of inversions, or $m(m-1)/2$, so that $r'$ is −1. If there is a perfect positive relation between the two sets of ranks, x is zero and the value of $r'$ is +1. Hence this definition of $r'$ is in harmony with the characteristics of the regular linear correlation coefficient. Since $r'$ is a linear function of x, the significance of $r'$ can be tested by first testing the significance of the number of inversions x. If x is not significant, there is no point in calculating the correlation coefficient $r'$.

Example. Consider the data in the previous example. The number of inversions of the second set relative to the first is 13; therefore

$$r' = 1 - \frac{4 \times 13}{12 \times 11} = 0.61.$$

The expected number of inversions is $12 \times 11 / 4$, or 33, while the standard deviation of the population distribution is

$$\sigma_x^2 = \frac{12 \times 11 \times 29}{72} = 53.1667, \qquad \sigma_x = 7.29.$$

Since m exceeds 10, we can safely assume that this distribution approximates the normal probability distribution. Hence the obtained value of x of 13 is

$$\frac{12.5 - 33}{7.29} = -2.8,$$

that is, 2.8 standard deviations below the expected value based upon the hypothesis of no correlation. Referring the value of 2.8 to the normal probability table gives a probability of 0.003 of being exceeded. Hence this evidence is in harmony with that obtained from the t test.

The general case of k sets of m ranks each. For the most general case of k sets of m ranks each we shall describe two methods: a chi-square test developed by Friedman and the W distribution suggested by Kendall. We shall describe them in this order. The statistic

$$\chi_r^2 = \frac{12 \sum (r)^2}{km(m+1)} - 3k(m+1)$$

is distributed as chi-square with m − 1 degrees of freedom for large values of k and m. Here $\sum (r)^2$ is the sum of the squared rank totals of the m columns. The regular table of chi-square can be used, provided that

1. m is 3 and k is larger than 9;
2. m is 4 and k is larger than 5; or
3. m is greater than 4 and k is larger than 4.

Otherwise, special tables of $\chi_r^2$ given by Friedman must be used.

Example. The following data are derived from the average yearly expenditures for food of Chicago families included in the urban study of consumer purchases, 1935-1936, by occupational group and by income class. Each value is the sum of the averages over family composition. These sums are ranked by occupational group within each income class. The problem is to test whether the expenditures by income class are independent of occupational group.

Average yearly food expenditure (dollars), by income class and occupational group

Income Class    Salaried        Salaried    Independent     Independent    Clerical    Wage Earner
(dollars)       Professional    Business    Professional    Business
1500-1750       2,713           2,889       2,657           3,003          2,954       3,076
1750-2000       3,157           3,137       3,260           3,415          3,178       3,115
2000-2250       3,547           3,734       3,758           3,379          3,788       3,924
2250-2500       3,727           4,145       4,208           3,823          3,729       4,060
2500-3000       3,950           4,169       4,367           4,306          4,319       4,160
3000-3500       4,789           4,588       4,718           4,949          4,658       4,431
3500-4000       4,638           5,084       5,199           5,066          6,118       4,755

The ranks by rows are as follows:

        2     3     1     5     4     6
        3     2     5     6     4     1
        2     3     4     1     5     6
        1     5     6     3     2     4
        1     3     6     4     5     2
        5     2     4     6     3     1
        1     4     5     3     6     2
Sum    15    22    31    28    29    22

so that $\sum (r)^2 = 3779$. Since m is 6 and k is 7,

$$\chi_r^2 = \frac{12 \times 3779}{7 \times 6 \times 7} - 3 \times 7 \times 7 = 7.24.$$

Referred to the regular chi-square table with 5 degrees of freedom, this value of chi-square has a probability of being exceeded of about 0.20. This is not low enough to lead to a rejection of the hypothesis that these two categories are independent.
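The whole calculation, from the expenditure table to $\chi_r^2$, can be sketched as follows. The function names are hypothetical, the ranking helper assumes no ties within a row (true for this table), and each row of `expenditures` is one income class with the six occupational groups in the column order of the table.

```python
def row_ranks(row):
    """Rank the entries of one row from 1 (smallest) to m; assumes no ties."""
    ordered = sorted(row)
    return [ordered.index(v) + 1 for v in row]


def friedman_chi_square(table):
    """Return (rank totals per column, sum of squared totals, chi_r^2)."""
    k = len(table)      # number of sets of ranks (rows)
    m = len(table[0])   # number of ranks in each set (columns)
    ranks = [row_ranks(row) for row in table]
    totals = [sum(column) for column in zip(*ranks)]
    sum_sq = sum(total**2 for total in totals)
    chi_r2 = 12 * sum_sq / (k * m * (m + 1)) - 3 * k * (m + 1)
    return totals, sum_sq, chi_r2


# One row per income class; six occupational groups per row.
expenditures = [
    [2713, 2889, 2657, 3003, 2954, 3076],
    [3157, 3137, 3260, 3415, 3178, 3115],
    [3547, 3734, 3758, 3379, 3788, 3924],
    [3727, 4145, 4208, 3823, 3729, 4060],
    [3950, 4169, 4367, 4306, 4319, 4160],
    [4789, 4588, 4718, 4949, 4658, 4431],
    [4638, 5084, 5199, 5066, 6118, 4755],
]

totals, sum_sq, chi_r2 = friedman_chi_square(expenditures)
print(totals, sum_sq, round(chi_r2, 2))
# [15, 22, 31, 28, 29, 22] 3779 7.24
```

The resulting value is then referred to a chi-square table with m − 1 = 5 degrees of freedom, as in the text.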
