Measures of Association

G. David Garson.
Overview
Association refers to a wide variety of coefficients which measure strength of
relationship, defined in various ways. In common usage "association" refers to
measures of strength of relationship in which at least one of the variables is a
dichotomy, nominal, or ordinal.
Correlation, which is a type of association used when both variables are interval, is
discussed separately.
Reliability, which is a type of association used to establish the consistency of a
measure or to assess inter-rater similarity on a variable, is also discussed
separately.
Key Concepts and Terms
o Significance versus association. Measures of significance test the null
hypothesis that the strength of an observed relationship is not different
from what would be expected due to the chance of random sampling.
Significance coefficients reflect not only strength of relationship but also
sample size and sometimes other parameters. Therefore it is possible to
have a relationship which displays strong association but is not significant
(ex., all males are Republicans and all females are Democrats, but the
sample size is only 8) or a relationship which displays an extremely weak
association but is very significant (ex., 50.1% of males are Republicans
compared to 50.0% of females, but sample size is 15,000 and the
significance level is .001). Because significance and association are not at
all equivalent, researchers ordinarily must report both significance and
association when discussing their findings. Note also that significance is
relevant only when one has a random sample, whereas association is
always relevant to research inferences.
o Coefficients of association. Most coefficients of association vary from 0
(indicating no relationship) to 1 (indicating perfect relationship) or -1
(indicating perfect negative relationship). As discussed below, however,
there are various types of "perfect relationship" and various types of "no
relationship." Which definitions the researcher selects may strongly affect
the conclusions to which he or she comes. When particular coefficients are
discussed later in this section, their definitions of perfect and null
relationships are cited, and this is one important criterion used by
researchers in selecting among possible measures of association. If you
wish to skip the rather long discussion below, just keep in mind that most
but not all coefficients of association define "perfect relationship" as strict
monotonicity and define a "null relationship" as statistical independence.
Types of perfect relationship. There are four definitions of
"perfect" linear relationship in association, plus the definition of
perfect curvilinear relationships. The linear definitions are those
dealing with strict monotonic, ordered monotonic, predictive
monotonic, and weak monotonic relationships. All relationships
which are perfect by strict monotonicity are also perfect by the
others. Likewise, perfect ordered and predictive monotonic
relationships will also be perfect by the criterion of weak
monotonicity. A relationship which is perfect by both ordered and
predictive monotonicity at the same time is necessarily perfect by
strict monotonicity. None of the definitions based on monotonicity
are appropriate for curvilinear or discontinuous relationships.
1. The concept of pairs. Strength of linear relationship is
defined in terms of degree of monotonicity, which is based
on counting various types of pairs in a relationship shown in
a table. A pair is two cases, each of which is in a different
cell in the table representing the joint distribution of two
variables. Let x be an independent variable with three
values and let y be a dependent variable with two values, with
a, b, ..., f being the cell counts in the resulting table,
illustrated below:

          x
y     1   2   3
1     a   b   c
2     d   e   f
2. The four types of pairs, how they are counted, and their
symbols are shown in the table below.

Type of Pair   Number of Pairs               Symbol
Concordant     a(e+f) + b(f)                 P
Discordant     c(d+e) + b(d)                 Q
Tied on x      ad + be + cf                  Xo
Tied on y      a(b+c) + bc + d(e+f) + ef     Yo
3. All definitions of "perfect relationship" increase the
coefficient of association toward 1 as concordant pairs
increase. However, there is disagreement about how to
handle ties, leading to the different definitions below.
4. Strict monotonicity. Perfect positive strict monotonicity
requires that discordant pairs (Q), ties on x (Xo), and ties on
y (Yo) all be zero. For perfect negative monotonicity,
concordant pairs (P), ties on x (Xo), and ties on y (Yo) all
must be zero. That is, by this most common definition, a
relationship is "perfect" if (1) when the independent
variable x increases, then the dependent variable y increases
(or decreases in the case of perfect negative relationships),
and (2) each value of x corresponds to only one y value.
Likewise, when y increases, x also increases (or decreases
for negative relationships), and each y value corresponds to
only one x value. An implication is that there are no ties on
x and no ties on y (that is, no cases other than on the
diagonal). Examples of perfect strict monotonic association
are below:

         x
y    15    0    0
      0   15    0
      0    0   15

         x
y    15    0    0    0
      0   15    0    0
      0    0    0   15
5. Ordered monotonic. Perfect positive ordered monotonicity
exists when discordant pairs (Q) and ties on y (Yo) are zero.
Perfect negative ordered monotonicity exists when
concordant pairs (P) and ties on y (Yo) are zero. That is,
perfect ordered monotonicity exists when (1) as x increases,
y also increases (or decreases for perfect negative
relationships) and (2) every y value corresponds to
just one x value. Examples of perfect ordered monotonic
association are below:

         x
y    15    0    0
      0   15    0
      0   15    0

         x
y    15    0    0    0
      0    0    0   15
      0    0    0   15
6. Predictive monotonic. Perfect positive predictive
monotonicity exists when discordant pairs (Q) and ties on x
(Xo) are zero. Perfect negative predictive monotonicity
exists when concordant pairs (P) and ties on x (Xo) are zero.
That is, perfect predictive monotonicity exists when (1) as x
increases, y also increases (or decreases for perfect negative
relationships) or remains the same, and (2) every x
value corresponds to just one y value. Examples of perfect
predictive monotonic association are below:

         x
y    15    0    0
      0   15   15
      0    0    0

         x
y    15   15    0    0
      0    0    0    0
      0    0   15   15
7. Note this form of association is called "predictive" because
the dependent variable can be predicted uniquely from
knowing the value of the independent variable, given that
each independent x value corresponds uniquely to one
dependent y value.
8. Weak monotonic. Perfect positive weak monotonicity
exists when discordant pairs (Q) are zero. Perfect negative
weak monotonicity exists when concordant pairs (P) are
zero. Perfect weak monotonicity exists when, as x
increases, y also increases (or decreases for perfect negative
relationships) or remains the same. In 2-by-2 tables this
corresponds to having a zero cell in the table. Examples of
perfect weak monotonic association are below:

         x
y    15    0    0
     15    0    0
     15   15   15

         x
y    15    0    0    0
     15   15    0    0
      0    0   15   15
9. Curvilinear. Curvilinear association is perfect when every
x value of the independent variable corresponds to only one y
value of the dependent variable. The reverse need not be true,
nor need the relationship be continuous. Most investigations of
curvilinear relationships involve the use of curve-fitting
software, however, which usually does require that the
distribution be continuous. Some applications also require that
the curve be describable as a mathematical function.
Curvilinear association is asymmetric in that its definition
depends on which variable is independent and which is
dependent. Thus for hypotheses in which y is the
independent variable, curvilinear association is perfect
when every y value corresponds to only one x value. Note
curvilinear association is never applicable to nominal
variables.
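
The pair-counting rules and the four monotonic definitions above can be sketched in Python. This is only an illustration under stated assumptions, not code from the original text: the function names count_pairs and perfect_positive_types are hypothetical, rows are taken to be y values and columns x values in increasing order, and pairs of cases falling in the same cell are ignored, in line with the definition of a pair given above.

```python
from itertools import combinations

def count_pairs(table):
    """Return (P, Q, Xo, Yo) for a table whose rows are y values and
    whose columns are x values (hypothetical helper, for illustration)."""
    cells = [(r, c, n) for r, row in enumerate(table) for c, n in enumerate(row)]
    P = Q = Xo = Yo = 0
    for (r1, c1, n1), (r2, c2, n2) in combinations(cells, 2):
        pairs = n1 * n2
        if r1 == r2:                      # same y, different x: tied on y
            Yo += pairs
        elif c1 == c2:                    # same x, different y: tied on x
            Xo += pairs
        elif (r1 - r2) * (c1 - c2) > 0:   # higher on both: concordant
            P += pairs
        else:                             # higher on one, lower on other
            Q += pairs
    return P, Q, Xo, Yo

def perfect_positive_types(table):
    """Check each definition of a perfect positive relationship."""
    P, Q, Xo, Yo = count_pairs(table)
    return {
        "strict": Q == 0 and Xo == 0 and Yo == 0,
        "ordered": Q == 0 and Yo == 0,
        "predictive": Q == 0 and Xo == 0,
        "weak": Q == 0,
    }

# The 2-by-3 table a..f with counts 1..6 reproduces the formulas in item 2.
print(count_pairs([[1, 2, 3], [4, 5, 6]]))   # (23, 35, 32, 85)

# First example tables from items 5 and 6 above.
ordered_example = [[15, 0, 0], [0, 15, 0], [0, 15, 0]]
predictive_example = [[15, 0, 0], [0, 15, 15], [0, 0, 0]]
print(perfect_positive_types(ordered_example))
print(perfect_positive_types(predictive_example))
```

For the ordered example only the "ordered" and "weak" checks come out true, and for the predictive example only "predictive" and "weak", which matches the statement that either kind of perfect relationship is also perfect by weak monotonicity.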
Types of null relationship. There are four ways to define "no
relationship" between two variables. The leading definition,
independence, is a symmetric criterion making no assumption
about the direction of causation, whereas accord is asymmetric.
Both independence and accord are nominal criteria, making no
assumption about the level of data. Balance is an ordinal criterion,
except for dichotomies, and assumes the values of the two variables
are ordered. Cleavage is a sufficient condition for independence
and balance, but is a more stringent definition such that
independence or balance do not imply cleavage.
1. Independence. By far the most common definition of null
relationship is based on the laws of probability. Two
variables are independent when their joint distribution is as
would be predicted on the basis of the number of cases in
their individual categories. The expected value for any joint
category, calculated as in chi-square, is the product of the
number of cases in their separate categories divided by n,
the sample size. For instance, if in a sample of 100 there are
50 men and 40 Republicans, the expected number of male
Republicans is 50*40/100 = 20. If every joint category
(each of the cells in a table) is the expected value, then there
is a null relationship as defined by the criterion of
independence. Note independence makes no assumption
about which is the independent and which is the dependent
variable (it is symmetric). When a relationship is
independent, chi-square will be zero and thus chi-square
may be viewed as a test of independence.
2. Accord. By this criterion, two variables have a null
relationship if the largest-count categories of the
independent variable all have the same value on the
dependent variable. For instance, let the independent be
low, medium, and high education and let the dependent be
unsatisfactory, satisfactory, and meritorious performance
evaluations. There might be a tendency to have more
meritorious ratings as one moved from low to medium to
high education. However, it might be true at the same time
that most low-educated, most medium-educated, and most
high-educated employees all received satisfactory ratings.
By the criterion of accord there would be a null relationship,
whereas by the criterion of independence there would be a
relationship. Accord is the second most common definition
of strength of relationship and is an asymmetric definition.
3. Balance. When the value categories of both variables are
ordered and crosstabulated, by this criterion a null
relationship is said to exist when the number of cases on the
right-sloping diagonal(s) is equal to the number of cases on
the left-sloping diagonal(s). Consider the following table:

Degree/Rating     < BA    BA    > BA    Row total
Unsatisfactory       8     4       7           19
Satisfactory         4     4       3           11
Meritorious          3     3       2            8
Column Total        15    11      12           38

4. left diagonals = 28, right diagonals = 28
5. In this table there is a tendency for those with less than a
BA degree or more than a BA degree to receive low
performance ratings, and for those with exactly a BA to do
proportionately best. However, since the count on the right-
and left-sloping diagonals is 28 in each case, by the criterion
of balance there is a null relationship.
6. Cleavage. By this criterion, a null relationship exists when
the number of cases associated with each category of the
independent variable is split evenly among the dependent
variable categories. Consider the following table:

Degree/Rating     < BA    BA    > BA    Row total
Unsatisfactory       3     5       8           16
Satisfactory         3     5       8           16
Meritorious          3     5       8           16
Column Total         9    15      24           48

7. left diagonals = 37, right diagonals = 37
8. When a null relationship exists by cleavage, as above, there
will also be a null relationship by balance and
independence. Since there are equal numbers of cases in
each dependent category for each independent category,
accord cannot be computed but it also approaches null for
tables with perfect cleavage. However, note that the reverse
is not true: tables with a null relationship by any of the other
criteria need not have a null relationship by the cleavage
criterion.
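
The independence and balance criteria can be checked with a short Python sketch. The function names are hypothetical, and the diagonal rule is an assumption on my part: the text does not spell out how diagonals are counted, but summing every sloping diagonal that contains at least two cells reproduces the diagonal counts reported for the two 3-by-3 tables in this section (28 and 37).

```python
from collections import defaultdict

def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    n = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def diagonal_sums(table):
    """Return (left, right) sums over sloping diagonals with 2+ cells.
    Assumption: single-cell corner diagonals are not counted."""
    right, left = defaultdict(list), defaultdict(list)
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            right[j - i].append(count)   # top-left to bottom-right
            left[j + i].append(count)    # top-right to bottom-left
    def total(groups):
        return sum(sum(g) for g in groups.values() if len(g) >= 2)
    return total(left), total(right)

balance_table = [[8, 4, 7], [4, 4, 3], [3, 3, 2]]
cleavage_table = [[3, 5, 8], [3, 5, 8], [3, 5, 8]]

print(diagonal_sums(balance_table))     # (28, 28)
print(diagonal_sums(cleavage_table))    # (37, 37)

# Under cleavage every cell already equals its expected value,
# so the table is also null by independence.
print(expected_counts(cleavage_table) == cleavage_table)   # True
print(expected_counts(balance_table) == balance_table)     # False
```

The last two lines illustrate the asymmetry described above: the cleavage table is null by independence and by balance, while the balance table is null by balance only.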
o Association with Control Variables (The Elaboration Model). In
crosstabulation, for an original table of X and Y, the researcher may seek
to control for the effects of a third variable, Z. For instance, for a table of
religious affiliation with party vote, one may seek to control for gender.
This is done by computing measures of association for the original table
(X=religion and Y=vote) and for similar X-Y tables for each value of Z (in
this case, a male table and a female table of religion and vote). By
comparing an appropriate coefficient of association in the original table
with its counterparts for the control subtables, one of several effects may
be noticed:
No effect occurs when the original and subtable measures of
association are equivalent in magnitude and sign.
Explanation occurs when the control variable is an anteceding
cause of the independent and dependent, or when it is an intervening
variable on the path from the independent to the dependent, and
there is no direct causal path from the independent to the
dependent. In this case, the subtable associations approach 0 and
for random samples should test as not significant. In such cases the
original association is said to be spurious. Note that one cannot
differentiate statistically between an anteceding and an intervening
control effect but rather must do so on some other basis, such as
knowledge of time sequences of related events.
Partial explanation occurs when there is a direct path from the
independent to the dependent variable, but the control variable is
also either an anteceding or intervening cause. In this case subtable
association drops only part way to 0 compared to the original
bivariate association. Note that if the association drops sufficiently
far as no longer to be significant, this is considered indicative of
explanation, not partial explanation.
Suppression occurs when the control variable has a positive effect
on the dependent through one path and a negative effect through
another path. For three-variable models, suppression may occur
when there is an odd number of negative arrows. In such situations,
the control variable acts in one direction by way of the independent
and in the opposite direction in terms of direct effect on the
dependent, thereby masking some of the correlation which would
exist in the absence of the control. When suppression occurs,
subtable association will be higher than the original bivariate
association.
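
As a rough illustration of the "explanation" pattern, the Python sketch below compares phi in an original 2-by-2 table with phi in the two control subtables. The cell counts are invented for this example and are not from the original text, phi stands in here for "an appropriate coefficient of association", and the names phi, z0, z1, and original are hypothetical.

```python
from math import sqrt

def phi(table):
    """Phi for a 2-by-2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical X-by-Y subtables for the two values of the control variable Z.
z0 = [[32, 8], [8, 2]]
z1 = [[2, 8], [8, 32]]

# The original table is the cell-by-cell sum of the subtables.
original = [[z0[i][j] + z1[i][j] for j in range(2)] for i in range(2)]

print(original)                  # [[34, 16], [16, 34]]
print(round(phi(original), 2))   # 0.36
print(phi(z0), phi(z1))          # 0.0 0.0
```

Here the original association of .36 drops to 0 in both subtables, the pattern described above as explanation. Whether Z is anteceding or intervening cannot be read from these numbers.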
o Specific Measures of Association. With the exception of eta, when data
are mixed by data level, the researcher uses a measure of association for
the lower data level. Thus, for nominal-by-ordinal association one would
use a measure for nominal-level association.
Dichotomous Association (2-by-2 tables): Percent Difference,
Yule's Q, Yule's Y, Risk
Nominal Association: Phi, Contingency Coefficient,
Tschuprow's T, Cramer's V, Lambda, Uncertainty Coefficient
Nominal-by-Interval Nonlinear Association: Eta
Ordinal Association: Gamma, Kendall's tau-b and tau-c,
Somers' d
Association for Inter-rater Agreement (rows and columns are
the same variable): Kappa
Assumptions
Assumptions are discussed in the sections for each particular
measure of association. Measures of association may assume
nominal, ordinal, or interval levels of measurement; symmetry or
asymmetry of causal direction; square versus any shape table; and
alternative definitions of "perfect relationship" and "null
relationship" as described above.
Strict monotonicity and the assumption of equal marginals.
Measures of association which define perfect relationship in terms
of strict monotonicity can reach 1.0 only when the two variables
have the same marginal distribution, ignoring null rows and null
columns. One such measure, tau b, is used to illustrate this in the
four tables below:

TABLE A          Male   Female   Row Totals
Republican         15       10           25
Democrat            5       20           25
Column Totals      20       30       n = 50
tau b = .408

TABLE B          Male   Female   Row Totals
Republican         20        5           25
Democrat            0       25           25
Column Totals      20       30       n = 50
tau b = .816

TABLE C          Male   Female   Row Totals
Republican         25        5           30
Democrat            5       15           20
Column Totals      30       20       n = 50
tau b = .583

TABLE D          Male   Female   Row Totals
Republican         30        0           30
Democrat            0       20           20
Column Totals      30       20       n = 50
tau b = 1.0

Table A illustrates a hypothetical relationship between gender and
political party, shown to have a level of association by tau b of
.408. Table B represents the strongest possible relationship between
gender and party if one is forced to keep the marginal totals the
same as in Table A. Even though Table B is as strong as possible
keeping the same total number of men and women, and
Republicans and Democrats, its association is less than 1.0 (it is
.816). Table C illustrates a relationship between the same two
variables, but where gender and party have equal marginals, with a
tau b strength of .583. Table D represents the strongest possible
relationship between gender and party, keeping the marginal totals
the same as in Table C, and its strength is a perfect 1.0, reflecting
strict monotonicity. That is, a monotonic measure of association
like tau b can reach 1.0 only when the marginal distributions of the
two variables are the same, as they are in Tables C and D. In the
2-by-2 case, ordered and predictive monotonic measures of
association exhibit the same behavior, although in larger tables they
can reach 1.0 even when row and column marginals are not the
same.
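
The tau b values quoted for Tables A through D can be reproduced with a short Python sketch. It relies on the usual pair-count formula for tau b, (P - Q) / sqrt((P + Q + Xo)(P + Q + Yo)), which this section does not itself state, so treat the formula and the function name tau_b as assumptions brought in for illustration.

```python
from itertools import combinations
from math import sqrt

def tau_b(table):
    """Kendall's tau b from pair counts: P concordant, Q discordant,
    Xo tied on the column variable only, Yo tied on the row variable only."""
    cells = [(r, c, n) for r, row in enumerate(table) for c, n in enumerate(row)]
    P = Q = Xo = Yo = 0
    for (r1, c1, n1), (r2, c2, n2) in combinations(cells, 2):
        if r1 == r2:
            Yo += n1 * n2
        elif c1 == c2:
            Xo += n1 * n2
        elif (r1 - r2) * (c1 - c2) > 0:
            P += n1 * n2
        else:
            Q += n1 * n2
    return (P - Q) / sqrt((P + Q + Xo) * (P + Q + Yo))

tables = {
    "A": [[15, 10], [5, 20]],
    "B": [[20, 5], [0, 25]],
    "C": [[25, 5], [5, 15]],
    "D": [[30, 0], [0, 20]],
}
results = {name: round(tau_b(t), 3) for name, t in tables.items()}
print(results)   # {'A': 0.408, 'B': 0.816, 'C': 0.583, 'D': 1.0}
```

Table B has no discordant pairs, yet the ties forced by its unequal marginals keep tau b at .816 rather than 1.0.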
Monotonicity and table size. In a non-square table with no null
rows and no null columns, there will always be ties on the variable
with the smaller number of classes. When the row variable has
fewer classes there will be ties on the row variable (y), and thus
such a table cannot have perfect association by strict or ordered
monotonicity, but may be perfect by predictive or weak
monotonicity. When the column variable has fewer classes there
will be ties on the column variable (x), and thus such a table cannot
have perfect association by strict or predictive monotonicity, but
may be perfect by ordered or weak monotonicity.
Frequently Asked Questions
Where does one find these measures of association in SPSS?
Most are found in the SPSS CROSSTABS module. From
the menu, select Statistics, Summarize, Crosstabs. In the
"Crosstabs" dialog box, click the "Statistics" button, then in
the "Crosstabs: Statistics" dialog box, check the measures
you want. SPSS does not offer Yule's Q, Yule's Y, or
Tschuprow's T. Another SPSS Inc. product, SYSTAT,
offers Yule's Q and Yule's Y.
Bibliography
Garson, G. David (1976). Political Science Methods. Boston, MA:
Holbrook Press. Chapter 11 covers most measures of association.
See also any standard introductory textbook on statistics associated
with survey research.
Liebetrau, Albert M. (1983). Measures of association. Newbury
Park, CA: Sage Publications. Quantitative Applications in the
Social Sciences Series No. 32.
Copyright 1998, 2008 by G. David Garson.
Last update, 3/24/2008.
