2 MARCH, 1959
[Multitrait-multimethod matrix table (traits A, B, and C measured by several methods, including Methods 2 and 3); the cell values are not recoverable from the source.]
Note.-The validity diagonals are the three sets of italicized values. The reliability diagonals are the three sets
of values in parentheses. Each heterotrait-monomethod triangle is enclosed by a solid line. Each heterotrait-
heteromethod triangle is enclosed by a broken line.
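The four kinds of cells named in the Note can be told apart mechanically from the trait and method labels alone. The following is a minimal sketch, assuming three traits (A, B, C) and three methods (1, 2, 3) as the Note's "three sets" suggests; the function name `classify_cell` and the (trait, method) tuple layout are illustrative choices, not part of the original paper.

```python
from itertools import combinations


def classify_cell(unit_a, unit_b):
    """Classify one cell of a multitrait-multimethod matrix.

    Each unit is a hypothetical (trait, method) pair, e.g. ("A", 1)
    for Trait A measured by Method 1.
    """
    (trait_a, method_a), (trait_b, method_b) = unit_a, unit_b
    same_trait = trait_a == trait_b
    same_method = method_a == method_b
    if same_trait and same_method:
        return "reliability diagonal"
    if same_trait:
        return "validity diagonal"
    if same_method:
        return "heterotrait-monomethod triangle"
    return "heterotrait-heteromethod triangle"


# Assumed layout: 3 traits x 3 methods = 9 trait-method units.
units = [(trait, method) for method in (1, 2, 3) for trait in "ABC"]

# Count how many off-diagonal cells fall into each category.
counts = {}
for a, b in combinations(units, 2):
    kind = classify_cell(a, b)
    counts[kind] = counts.get(kind, 0) + 1

print(counts)
```

With this layout, the 36 off-diagonal cells divide into 9 validity values, 9 heterotrait-monomethod values, and 18 heterotrait-heteromethod values.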
                            A1     B1     C1     D1     A2     B2     C2     D2
Peer Ratings
  Courtesy          A1    (.82)
  Honesty           B1      .74  (.80)
  Poise             C1      .63    .65  (.74)
  School Drive      D1      .76    .78    .65  (.89)
Association Test
  Courtesy          A2      .13    .14    .10    .14  (.28)
  Honesty           B2      .06    .12    .16    .08    .27  (.38)
  Poise             C2      .01    .08    .10    .02    .19    .37  (.42)
  School Drive      D2      .12    .15    .14    .16    .27    .32    .18  (.36)
                            A1     B1     C1     A2     B2     C2
Obstruction Box
  Hunger            A1    (.58)
  Thirst            B1      .54    ( )
  Sex               C1      .46    .70    ( )
Activity Wheel
  Hunger            A2      .48    .31    .37  (.83)
  Thirst            B2      .35    .33    .43    .87  (.92)
  Post Sex          C2      .31    .37    .44    .69    .78    ( )
Note.-Empty parentheses appear in this and subsequent tables where no appropriate reliability estimates are
reported in the original paper.
VALIDATION BY THE MULTITRAIT-MULTIMETHOD MATRIX 87
TABLE 4
SOCIAL INTELLIGENCE AND MENTAL ALERTNESS SUBTEST INTERCORRELATIONS FROM
THORNDIKE'S DATA
(N=750)
                                                           Memory     Comprehension   Vocabulary
                                                           A1     B1     A2     B2     A3     B3
Memory
  Social Intelligence (Memory for Names & Faces)    A1    ( )
  Mental Alertness (Learning Ability)               B1    .31    ( )
Comprehension
  Social Intelligence (Sense of Humor)              A2    .30    .31    ( )
  Mental Alertness (Comprehension)                  B2    .29    .38    .48    ( )
Vocabulary
  Social Intelligence (Recog. of Mental State)      A3    .23    .35    .31    .35    ( )
  Mental Alertness (Vocabulary)                     B3    .30    .58    .40    .48    .47    ( )
TABLE 5
MEMORY, COMPREHENSION, AND VOCABULARY MEASURED WITH
SOCIAL AND ABSTRACT CONTENT
                                                        A1     B1     C1     A2     B2     C2
Social Content
  Memory (Memory for Names and Faces)            A1    ( )
  Comprehension (Sense of Humor)                 B1    .30    ( )
  Vocabulary (Recognition of Mental State)       C1    .23    .31    ( )
Abstract Content
  Memory (Learning Ability)                      A2    .31    .31    .35    ( )
  Comprehension                                  B2    .29    .48    .35    .38    ( )
  Vocabulary                                     C2    .30    .40    .47    .58    .48    ( )
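Tables 4 and 5 report the same fifteen correlations under two different assignments of "trait" and "method." The sketch below checks this by relabeling Table 4 and comparing it with Table 5. The label mapping is inferred from the row descriptions printed in the two tables (for example, "Learning Ability" is B1 in Table 4 and A2 in Table 5); the names `RELABEL`, `relabel`, and `validity_values` are illustrative and not from the original paper.

```python
# Lower-triangle correlations as printed, keyed by (row label, column label).
TABLE_4 = {
    ("B1", "A1"): .31,
    ("A2", "A1"): .30, ("A2", "B1"): .31,
    ("B2", "A1"): .29, ("B2", "B1"): .38, ("B2", "A2"): .48,
    ("A3", "A1"): .23, ("A3", "B1"): .35, ("A3", "A2"): .31, ("A3", "B2"): .35,
    ("B3", "A1"): .30, ("B3", "B1"): .58, ("B3", "A2"): .40, ("B3", "B2"): .48,
    ("B3", "A3"): .47,
}
TABLE_5 = {
    ("B1", "A1"): .30,
    ("C1", "A1"): .23, ("C1", "B1"): .31,
    ("A2", "A1"): .31, ("A2", "B1"): .31, ("A2", "C1"): .35,
    ("B2", "A1"): .29, ("B2", "B1"): .48, ("B2", "C1"): .35, ("B2", "A2"): .38,
    ("C2", "A1"): .30, ("C2", "B1"): .40, ("C2", "C1"): .47, ("C2", "A2"): .58,
    ("C2", "B2"): .48,
}

# Table 4 label -> Table 5 label, inferred from the printed row descriptions.
RELABEL = {"A1": "A1", "B1": "A2", "A2": "B1", "B2": "B2", "A3": "C1", "B3": "C2"}


def relabel(table, mapping):
    """Rename both variables of every correlation; pair order is ignored."""
    return {frozenset((mapping[a], mapping[b])): r for (a, b), r in table.items()}


def validity_values(table):
    """Correlations between two measures that share a trait letter."""
    return sorted(r for (a, b), r in table.items() if a[0] == b[0])


rearranged = relabel(TABLE_4, RELABEL)
table_5 = {frozenset(pair): r for pair, r in TABLE_5.items()}
same_correlations = rearranged == table_5

print(same_correlations)
print(validity_values(TABLE_4))  # validities when content area is the trait
print(validity_values(TABLE_5))  # validities when the ability is the trait
```

The numbers are identical in the two tables; only the set of cells that counts as the validity diagonal changes with the labeling.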
                            A1     B1     C1     A2     B2     C2
Interview
(N=57)
  Father            A1     ( )
  Boss              B1     .64    ( )
  Peer              C1     .65    .76    ( )
Trait Check-List
(N=155)
  Father            A2     .40    .08    .09  (.24)
  Boss              B2     .19   -.10   -.03    .23  (.34)
  Peer              C2     .27    .11    .23    .21    .45  (.55)
                            A1     B1     A2     B2
Sociometric by Others
  Popularity        A1     ( )
  Expansiveness     B1     .47    ( )
Sociometric by Self
  Popularity        A2     .19    .18    ( )
  Expansiveness     B2     .07    .08    .32    ( )
any consistent validation by the requirement that the validity diagonals exceed the heterotrait-heteromethod control values. As a most minimal requirement, it might be asked if the sum of the two values in the validity diagonal exceeds the sum of the two control values, providing a comparison in which differences in reliability or communality are roughly partialled out. This condition is achieved at the purely chance level of three times in the six tetrads. This matrix provides an interesting range of methodological independence. The two "Sociometric by Others" measures, while representing the judgments of the same set of fellow participants, come from distinct tasks: Popularity is based upon each participant's expression of his own friendship preferences, while Expansiveness is based upon each participant's guesses as to the other participant's choices, from which has been computed each participant's reputation for liking lots of other persons, i.e., being "expansive." In line with this considerable independence, the evidence for a method factor is relatively low in comparison with the observational procedures. Similarly, the two "Sociometric by Self" measures represent quite separate tasks, Popularity coming from his estimates of the choices he will receive from others, Expansiveness from the number of expressions of attraction to others which he makes on the sociometric task. In contrast, the measures of Popularity and Expansiveness from the observations of group interaction and the role playing not only involve the same specific observers, but in addition the observers rated the pair of variables as a part of the same rating task in each situation. The apparent degree of method variance within each of the two observational situations, and the apparent sharing of method variance between them, is correspondingly high.

In another paper by Borgatta (1955), 12 interaction process variables were measured by quantitative observation under two conditions, and by a projective test. In this test, the stimuli were pictures of groups, for which the S generated a series of verbal interchanges; these were then scored in Interaction Process Analysis categories. For illustrative purposes, Table 8 presents the five traits which had the highest mean communalities in the over-all factor analysis. Between the two highly similar observational methods, validation is excellent: trait variance runs higher than method variance; validity diagonals are in general higher than heterotrait values of both the heteromethod and monomethods blocks, most unexceptionably so for Gives Opinion and Gives Orientation. The pattern of correlation among the traits is also in general confirmed.

Of greater interest because of the greater independence of methods are the blocks involving the projective test. Here the validity picture is much poorer. Gives Orientation comes off best, its projective test validity values of .35 and .33 being bested by only three monomethod values and by no heterotrait-heteromethod values within the projective blocks. All of the other validities are exceeded by some heterotrait-heteromethod value.

The projective test specialist may object to the implicit expectations of a one-to-one correspondence between projected action and overt action. Such expectations should not be attributed to Borgatta, and are not necessary to the method here proposed. For the simple symmetrical model of this paper, it has been as-
TABLE 8
INTERACTION PROCESS VARIABLES IN OBSERVED FREE BEHAVIOR, OBSERVED ROLE PLAYING AND A PROJECTIVE TEST
(N=125)
                              A1    B1    C1    D1    E1    A2    B2    C2    D2    E2    A3    B3    C3    D3    E3
Free Behavior
  Shows solidarity      A1   ( )
  Gives suggestion      B1   .25   ( )
  Gives opinion         C1   .13   .24   ( )
  Gives orientation     D1  -.14   .26   .52   ( )
  Shows disagreement    E1   .34   .41   .27   .02   ( )
Role Playing
  Shows solidarity      A2   .43   .43   .08   .10   .29   ( )
  Gives suggestion      B2   .16   .32   .00   .24   .07   .37   ( )
  Gives opinion         C2   .15   .27   .60   .38   .12   .01   .10   ( )
  Gives orientation     D2  -.12   .24   .44   .74   .08   .04   .18   .40   ( )
  Shows disagreement    E2   .51   .36   .14  -.12   .50   .39   .27   .23  -.11   ( )
Projective Test
  Shows solidarity      A3   .20   .17   .16   .12   .08   .17   .12   .30   .17   .22   ( )
  Gives suggestion      B3   .05   .21   .05   .08   .13   .10   .19  -.02   .06   .30   .32   ( )
  Gives opinion         C3   .31   .30   .13  -.02   .26   .25   .19   .15  -.04   .53   .31   .63   ( )
  Gives orientation     D3  -.01   .09   .30   .35  -.05   .03   .00   .19   .33   .00   .37   .29   .32   ( )
  Shows disagreement    E3   .13   .18   .10   .14   .19   .22   .28   .02   .04   .23   .27   .51   .47   .30   ( )
92 D. T. CAMPBELL AND D. W. FISKE
TABLE 9
MAYO'S INTERCORRELATIONS BETWEEN OBJECTIVE AND RATING
MEASURES OF INTELLIGENCE AND EFFORT
(N=166)
                            A1     B1     A2     B2
Peer Rating
  Intelligence      A1    (.85)
  Effort            B1      .66  (.84)
Objective Measures
  Intelligence      A2      .46    .29    ( )
  Effort            B2      .46    .40    .10    ( )
sumed that the measures are labeled in correspondence with the correlations expected, i.e., in correspondence with the traits that the tests are alleged to diagnose. Note that in Table 8, Gives Opinion is the best projective test predictor of both free behavior and role playing Shows Disagreement. Were a proper theoretical rationale available, these values might be regarded as validities.

Mayo (1956) has made an analysis of test scores and ratings of effort and intelligence, to estimate the contribution of halo (a kind of methods variance) to ratings. As Table 9 shows, the validity picture is ambiguous. The method factor or halo effect for ratings is considerable although the correlation between the two ratings (.66) is well below their reliabilities (.84 and .85). The objective measures share no appreciable apparatus overlap because they were independent operations. In spite of Mayo's argument that the ratings have some valid trait variance, the .46 heterotrait-heteromethod value seriously depreciates the otherwise impressive .46 and .40 validity values.

Cronbach (1949, p. 277) and Vernon (1957, 1958) have both discussed the multitrait-multimethod matrix shown in Table 10, based upon data originally presented by H. S. Conrad. Using an approximative technique, Vernon estimates that 61% of the systematic variance is due to a general factor, that 21½% is due to the test-form factors specific to verbal or to pictorial forms of items, and that but 11½% is due to the content fac-
TABLE 10
MECHANICAL AND ELECTRICAL FACTS MEASURED BY VERBAL AND PICTORIAL ITEMS
                              A1     B1     A2     B2
Verbal Items
  Mechanical Facts    A1    (.89)
  Electrical Facts    B1      .63  (.71)
Pictorial Items
  Mechanical Facts    A2      .61    .45  (.82)
  Electrical Facts    B2      .49    .51    .64  (.67)
tors specific to electrical or to mechanical contents. Note that for the purposes of estimating validity, the interpretation of the general factor, which he estimates from the .49 and .45 heterotrait-heteromethod values, is equivocal. It could represent desired competence variance, representing components common to both electrical and mechanical skills, perhaps resulting from general industrial shop experience, common ability components, overlapping learning situations, and the like. On the other hand, this general factor could represent overlapping method factors, and be due to the presence in both tests of multiple choice item format, IBM answer sheets, or the heterogeneity of the Ss in conscientiousness, test-taking motivation, and test-taking sophistication. Until methods that are still more different and traits that are still more independent are introduced into the validation matrix, this general factor remains uninterpretable. From this standpoint it can be seen that 21½% is a very minimal estimate of the total test-form variance in the tests, as it represents only test-form components specific to the verbal or the pictorial items, i.e., test-form components which the two forms do not share. Similarly, and more hopefully, the 11½% content variance is a very minimal estimate of the total true trait variance of the tests, representing only the true trait variance which electrical and mechanical knowledge do not share.

Carroll (1952) has provided data on the Guilford-Martin Inventory of Factors STDCR and related ratings which can be rearranged into the matrix of Table 11. (Variable R has been inverted to reduce the number of negative correlations.) Two of the methods, Self Ratings and Inventory scores, can be seen as sharing method variance, and thus as having an inflated validity diagonal. The more independent heteromethod blocks involving Peer Ratings show some evidence of discriminant and convergent validity, with validity diagonals averaging .33 (Inventory × Peer Ratings) and .39 (Self Ratings × Peer Ratings) against heterotrait-heteromethod control values averaging .14 and .16. While not intrinsically impressive, this picture is nonetheless better than most of the validity matrices here assembled. Note that the Self Ratings show slightly higher validity diagonal elevations than do the Inventory scores, in spite of the much greater length and undoubtedly higher reliability of the latter. In addition, a method factor seems almost totally lacking for the Self Ratings, while strongly present for the Inventory, so that the Self Ratings come off much the best if true trait variance is expressed as a proportion of total reliable variance (as Vernon [1958] suggests). The method factor in the STDCR Inventory is undoubtedly enhanced by scoring the same item in several scales, thus contributing correlated error variance, which could be reduced without loss of reliability by the simple expedient of adding more equivalent items and scoring each item in only one scale. It should be noted that Carroll makes explicit use of the comparison of the validity diagonal with the heterotrait-heteromethod values as a validity indicator.

RATINGS IN THE ASSESSMENT STUDY OF CLINICAL PSYCHOLOGISTS

The illustrations of multitrait-multimethod matrices presented so far give a rather sorry picture of the validity of the measures of individual differences involved. The typical case shows an excessive amount of
TABLE 11
GUILFORD-MARTIN FACTORS STDCR AND RELATED RATINGS
(N=110)
TABLE 12
RATINGS FROM ASSESSMENT STUDY OF CLINICAL PSYCHOLOGISTS
(N=124)
                              A1    B1    C1    D1    E1    A2    B2    C2    D2    E2    A3    B3    C3    D3    E3
Staff Ratings
  Assertive             A1 (.89)
  Cheerful              B1   .37 (.85)
  Serious               C1  -.24  -.14 (.81)
  Unshakable Poise      D1   .25   .46   .08 (.84)
  Broad Interests       E1   .35   .19   .09   .31 (.92)
Teammate Ratings
  Assertive             A2   .71   .35  -.18   .26   .41 (.82)
  Cheerful              B2   .39   .53  -.15   .38   .29   .37 (.76)
  Serious               C2  -.27  -.31   .43  -.06   .03  -.15  -.19 (.70)
  Unshakable Poise      D2   .03  -.05   .03   .20   .07   .11   .23   .19 (.74)
  Broad Interests       E2   .19   .05   .04   .29   .47   .33   .22   .19   .29 (.76)
Self Ratings
  Assertive             A3   .48   .31  -.22   .19   .12   .46   .36  -.15   .12   .23   ( )
  Cheerful              B3   .17   .42  -.10   .10  -.03   .09   .24  -.25  -.11  -.03   .23   ( )
  Serious               C3  -.04  -.13   .22  -.13  -.05  -.04  -.11   .31   .06   .06  -.05  -.12   ( )
  Unshakable Poise      D3   .13   .27  -.03   .22  -.04   .10   .15   .00   .14  -.03   .16   .26   .11   ( )
  Broad Interests       E3   .37   .15  -.22   .09   .26   .27   .12  -.07   .05   .35   .21   .15   .17   .31   ( )
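The comparison of each validity value with the heterotrait values in its own row and column of a heteromethod block can be carried out mechanically on Table 12. The following is a rough sketch, not the authors' procedure: the function names are illustrative, the values are transcribed from the table above, and the comparison is made on signed values (an assumption; a comparison on absolute magnitude would give a different count for Cheerful).

```python
TRAITS = "ABCDE"  # Assertive, Cheerful, Serious, Unshakable Poise, Broad Interests

# Lower triangle of Table 12 without the diagonal; row order is
# A1..E1 (Staff), A2..E2 (Teammate), A3..E3 (Self).
LOWER = [
    [],
    [.37],
    [-.24, -.14],
    [.25, .46, .08],
    [.35, .19, .09, .31],
    [.71, .35, -.18, .26, .41],
    [.39, .53, -.15, .38, .29, .37],
    [-.27, -.31, .43, -.06, .03, -.15, -.19],
    [.03, -.05, .03, .20, .07, .11, .23, .19],
    [.19, .05, .04, .29, .47, .33, .22, .19, .29],
    [.48, .31, -.22, .19, .12, .46, .36, -.15, .12, .23],
    [.17, .42, -.10, .10, -.03, .09, .24, -.25, -.11, -.03, .23],
    [-.04, -.13, .22, -.13, -.05, -.04, -.11, .31, .06, .06, -.05, -.12],
    [.13, .27, -.03, .22, -.04, .10, .15, .00, .14, -.03, .16, .26, .11],
    [.37, .15, -.22, .09, .26, .27, .12, -.07, .05, .35, .21, .15, .17, .31],
]


def index(trait, method):
    """Position of a trait-method unit in the 15-variable ordering."""
    return (method - 1) * len(TRAITS) + TRAITS.index(trait)


def corr(i, j):
    """Correlation between two different variables, read from the lower triangle."""
    return LOWER[max(i, j)][min(i, j)]


def exceptions(trait):
    """Number of heterotrait-heteromethod values, in the row and column of each
    validity cell, that exceed that validity value (signed comparison)."""
    count = 0
    for m1, m2 in ((1, 2), (1, 3), (2, 3)):
        validity = corr(index(trait, m1), index(trait, m2))
        for other in TRAITS:
            if other == trait:
                continue
            same_row = corr(index(trait, m2), index(other, m1))
            same_column = corr(index(other, m2), index(trait, m1))
            count += (same_row > validity) + (same_column > validity)
    return count


result = {trait: exceptions(trait) for trait in TRAITS}
print(result)
```

Under these assumptions the counts are 0 for Assertive, 1 each for Cheerful and Broad Interests, 0 for Serious, and 5 for Unshakable Poise.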
loading on the first recurrent factor.)

The picture presented in Table 12 is, we believe, typical of the best validity in personality trait ratings that psychology has to offer at the present time. It is comforting to note that the picture is better than most of those previously examined. Note that the validities for Assertive exceed heterotrait values of both the monomethod and heteromethod triangles. Cheerful, Broad Interests, and Serious have validities exceeding the heterotrait-heteromethod values with two exceptions. Only for Unshakable Poise does the evidence of validity seem trivial. The elevation of the reliabilities above the heterotrait-monomethod triangles is further evidence for discriminant validity.

A comparison of Table 12 with the full matrix shows that the procedure of having but one variable to represent each factor has enhanced the appearance of validity, although not necessarily in a misleading fashion. Where several variables are all highly loaded on the same factor, their "true" level of intercorrelation is high. Under these conditions, sampling errors can depress validity diagonal values and enhance others to produce occasional exceptions to the validity picture, both in the heterotrait-monomethod matrix and in the heteromethod-heterotrait triangles. In this instance, with an N of 124, the sampling error is appreciable, and may thus be expected to exaggerate the degree of invalidity.

Within the monomethod sections, errors of measurement will be correlated, raising the general level of values found, while within the heteromethods block, measurement errors are independent, and tend to lower the values both along the validity diagonal and in the heterotrait triangles. These effects, which may also be stated in terms of method factors or shared confounded irrelevancies, operate strongly in these data, as probably in all data involving ratings. In such cases, where several variables represent each factor, none of the variables consistently meets the criterion that validity values exceed the corresponding values in the monomethod triangles, when the full matrix is examined.

To summarize the validation picture with respect to comparisons of validity values with other heteromethod values in each block, Table 13 has been prepared. For each trait and for each of the three heteromethod blocks, it presents the value of the validity diagonal, the highest heterotrait value involving that trait, and the number out of the 42 such heterotrait values which exceed the validity diagonal in magnitude. (The number 42 comes from the grouping of the 21 other column values and the 21 other row values for the column and row intersecting at the given diagonal value.)

On the requirement that the validity diagonal exceed all others in its heteromethod block, none of the traits has a completely perfect record, although some come close. Assertive has only one trivial exception in the Teammate-Self block. Talkative has almost as good a record, as does Imaginative. Serious has but two inconsequential exceptions and Interest in Women three. These traits stand out as highly valid in both self-description and reputation. Note that the actual validity coefficients of these four traits range from but .22 to .82, or, if we concentrate on the Teammate-Self block as most certainly representing independent methods, from but .31 to .46. While these are the best traits, it seems that most of the traits have far above
TABLE 13
VALIDITIES OF TRAITS IN THE ASSESSMENT STUDY OF CLINICAL PSYCHOLOGISTS,
AS JUDGED BY THE HETEROMETHOD COMPARISONS
Note.-Val. = value in validity diagonal; Highest Het. = highest heterotrait value; No. Higher = number of heterotrait values exceeding the validity diagonal.
* Trait names which have validities in all three heteromethod blocks significantly greater than the heterotrait-heteromethod values at the .001 level.
chance validity. All those having 10 or fewer exceptions have a degree of validity significant at the .001 level as crudely estimated by a one-tailed sign test.⁵ All but one of the variables meet this level for the Staff-Teammate block, all but four for the Staff-Self block, all but five for the most independent block, Teammate-Self. The exceptions to significant validity are not parallel from column to column, however, and only 13 of 22 variables have .001 significant validity in all three blocks. These are indicated by an asterisk in Table 13.
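The cutoff of 10 exceptions can be checked with a short binomial calculation. This is a sketch of the crude sign test as described in the text and its footnote, assuming that each of the 42 comparison values independently has a one-half chance of exceeding the validity value; the function name `sign_test_p` is illustrative.

```python
from math import comb


def sign_test_p(exceptions, n_comparisons):
    """One-tailed probability of `exceptions` or fewer comparison values lying
    above the validity value, if each of `n_comparisons` values independently
    had a 50% chance of exceeding it."""
    favorable = sum(comb(n_comparisons, k) for k in range(exceptions + 1))
    return favorable / 2 ** n_comparisons


n_variables = 22
# 21 other row values plus 21 other column values for each validity cell.
n_comparisons = 2 * (n_variables - 1)

p_10 = sign_test_p(10, n_comparisons)
p_11 = sign_test_p(11, n_comparisons)
print(n_comparisons, p_10, p_11)
```

Under these assumptions the tail probability for 10 exceptions falls below .001 while that for 11 does not, which agrees with the stated criterion of 10 or fewer exceptions.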
⁵ If we take the validity value as fixed (ignoring its sampling fluctuations), then we can determine whether the number of values larger than it in its row and column is less than expected on the null hypothesis that half the values would be above it. This procedure requires the assumption that the position (above or below the validity value) of any one of these comparison values is independent of the position of each of the others, a dubious assumption when common methods and trait variance are present.

This highly significant general level of validity must not obscure the meaningful problem created by the occasional exceptions, even for the best variables. The excellent traits of Assertive and Talkative provide a case in point. In terms of Fiske's original analysis, both have high loadings on the recurrent factor "Confident self-expression" (represented by Assertive in Table 12). Talkative also had high loadings on the recurrent factor of Social Adaptability (represented by Cheerful in Table 12). We would expect, therefore, both high correlation between them and significant discrimination as well. And even at the common sense level, most psychologists would expect fellow psychologists to discriminate validly between assertiveness (nonsubmissiveness) and talkativeness. Yet in the Teammate-Self block, Assertive rated by self correlates .48 with Talkative by teammates, higher than either of their validities in this block, .43 and .46.

In terms of the average values of the validities and the frequency of exceptions, there is a distinct trend for the Staff-Teammate block to show the greatest agreement. This can be attributed to several factors. Both represent ratings from the external point of view. Both are averaged over three judges, minimizing individual biases and undoubtedly increasing reliabilities. Moreover, the Teammate ratings were available to the Staff in making their ratings. Another effect contributing to the less adequate convergence and discrimination of Self ratings was a response set toward the favorable pole which greatly reduced the range of these measures (Fiske, 1949, p. 342). Inspection of the details of the instances of invalidity summarized in Table 13 shows that in most instances the effect is attributable to the high specificity and low communality for the self-rating trait. In these instances, the column and row intersecting at the low validity diagonal are asymmetrical as far as general level of correlation is concerned, a fact covered over by the condensation provided in Table 13.

The personality psychologist is initially predisposed to reinterpret self-ratings, to treat them as symptoms rather than to interpret them literally. Thus, we were alert to instances in which the self ratings were not literally interpretable, yet nonetheless had a diagnostic significance when properly "translated." By and large, the instances of invalidity of self-descriptions found in this assessment study are not of this type, but rather are to be explained in terms of an absence of communality for one of the variables involved. In general, where these self descriptions are interpretable at all, they are as literally interpretable as are teammate descriptions. Such a finding may, of course, reflect a substantial degree of insight on the part of these Ss.

The general success in discriminant validation coupled with the parallel factor patterns found in Fiske's earlier analysis of the three intramethod matrices seemed to justify an inspection of the factor pattern validity in this instance. One possible procedure would be to do a single analysis of the whole 66 × 66 matrix. Other approaches focused upon separate factoring of heteromethods blocks, matrix by matrix, could also be suggested. Not only would such methods be extremely tedious, but in addition they would leave undetermined the precise comparison of factor-pattern similarity. Correlating factor loadings over the population of variables was employed for this purpose by Fiske (1949) but while this provided for the identification of recurrent factors, no single over-all index of factor pattern similarity was generated. Since our immediate interest was in confirming a pattern of interrelationships, rather than in describing it, an efficient short cut was available: namely to test the similarity of the sets of heter-
the testing situation are determined by explicit theoretical considerations, as Jessor and Hammond have advocated (Jessor & Hammond, 1957).

Relation to operationalism. Underwood (1957, p. 54) in his effective presentation of the operationalist point of view shows a realistic awareness of the amorphous type of theory with which most psychologists work. He contrasts a psychologist's "literary" conception with the latter's operational definition as represented by his test or other measuring instrument. He recognizes the importance of the literary definition in communicating and generating science. He cautions that the operational definition "may not at all measure the process he wishes to measure; it may measure something quite different" (1957, p. 55). He does not, however, indicate how one would know when one was thus mistaken.

The requirements of the present paper may be seen as an extension of the kind of operationalism Underwood has expressed. The test constructor is asked to generate from his literary conception or private construct not one operational embodiment, but two or more, each as different in research vehicle as possible. Furthermore, he is asked to make explicit the distinction between his new variable and other variables, distinctions which are almost certainly implied in his literary definition. In his very first validational efforts, before he ever rushes into print, he is asked to apply the several methods and several traits jointly. His literary definition, his conception, is now best represented in what his independent measures of the trait hold distinctively in common. The multitrait-

investigator will fall into the trap of thinking that because he went from an artistic or literary conception ... to the construction of items for a scale to measure it, he has validated his artistic conception" (Underwood, 1957, p. 55). In contrast with the single operationalism now dominant in psychology, we are advocating a multiple operationalism, a convergent operationalism (Garner, 1954; Garner, Hake, & Eriksen, 1956), a methodological triangulation (Campbell, 1953, 1956), an operational delineation (Campbell, 1954), a convergent validation.

Underwood's presentation and that of this paper as a whole imply moving from concept to operation, a sequence that is frequent in science, and perhaps typical. The same point can be made, however, in inspecting a transition from operation to construct. For any body of data taken from a single operation, there is a subinfinity of interpretations possible; a subinfinity of concepts, or combinations of concepts, that it could represent. Any single operation, as representative of concepts, is equivocal. In an analogous fashion, when we view the Ames distorted room from a fixed point and through a single eye, the data of the retinal pattern are equivocal, in that a subinfinity of hexahedrons could generate the same pattern. The addition of a second viewpoint, as through binocular parallax, greatly reduces this equivocality, greatly limits the constructs that could jointly account for both sets of data. In Garner's (1954) study, the fractionation measures from a single method were equivocal; they could have been a function of the stimulus distance being fractionated, or they
could have been a function of the comparison stimuli used in the judgment process. A multiple, convergent operationalism reduced this equivocality, showing the latter conceptualization to be the appropriate one, and revealing a preponderance of methods variance. Similarly for learning studies: in identifying constructs with the response data from animals in a specific operational setup there is equivocality which can operationally be reduced by introducing transposition tests, different operations so designed as to put to comparison the rival conceptualizations (Campbell, 1954).

Garner's convergent operationalism and our insistence on more than one method for measuring each concept depart from Bridgman's early position that "if we have more than one set of operations, we have more than one concept, and strictly there should be a separate name to correspond to each different set of operations" (Bridgman, 1927, p. 10). At the current stage of psychological progress, the crucial requirement is the demonstration of some convergence, not complete congruence, between two distinct sets of operations. With only one method, one has no way of distinguishing trait variance from unwanted method variance. When psychological measurement and conceptualization become better developed, it may well be appropriate to differentiate conceptually between Trait-Method Unit A1 and Trait-Method Unit A2, in which Trait A is measured by different methods. More likely, what we have called method variance will be specified theoretically in terms of a set of constructs. (This has in effect been illustrated in the discussion above in which it was noted that the response set variance might be viewed as trait variance, and in the rearrangement of the social intelligence matrices of Tables 4 and 5.) It will then be recognized that measurement procedures usually involve several theoretical constructs in joint application. Using obtained measurements to estimate values for a single construct under this condition still requires comparison of complex measures varying in their trait composition, in something like a multitrait-multimethod matrix. Mill's joint method of similarities and differences still epitomizes much about the effective experimental clarification of concepts.

The evaluation of a multitrait-multimethod matrix. The evaluation of the correlation matrix formed by intercorrelating several trait-method units must take into consideration the many factors which are known to affect the magnitude of correlations. A value in the validity diagonal must be assessed in the light of the reliabilities of the two measures involved: e.g., a low reliability for Test A2 might exaggerate the apparent method variance in Test A1. Again, the whole approach assumes adequate sampling of individuals: the curtailment of the sample with respect to one or more traits will depress the reliability coefficients and intercorrelations involving these traits. While restriction of range over all traits produces serious difficulties in the interpretation of a multitrait-multimethod matrix and should be avoided whenever possible, the presence of different degrees of restriction on different traits is the more serious hazard to meaningful interpretation.

Various statistical treatments for multitrait-multimethod matrices might be developed. We have considered rough tests for the elevation
of a value in the validity diagonal above the comparison values in its row and column. Correlations between the columns for variables measuring the same trait, variance analyses, and factor analyses have been proposed to us. However, the development of such statistical methods is beyond the scope of this paper. We believe that such summary statistics are neither necessary nor appropriate at this time. Psychologists today should be concerned not with evaluating tests as if the tests were fixed and definitive, but rather with developing better tests. We believe that a careful examination of a multitrait-multimethod matrix will indicate to the experimenter what his next steps should be: it will indicate which methods should be discarded or replaced, which concepts need sharper delineation, and which concepts are poorly measured because of excessive or confounding method variance. Validity judgments based on such a matrix must take into account the stage of development of the constructs, the postulated relationships among them, the level of technical refinement of the methods, the relative independence of the methods, and any pertinent characteristics of the sample of Ss. We are proposing that the validational process be viewed as an aspect of an ongoing program for improving measuring procedures and that the "validity coefficients" obtained at any one stage in the process be interpreted in terms of gains over preceding stages and as indicators of where further effort is needed.

The design of a multitrait-multimethod matrix. The several methods and traits included in a validational matrix should be selected with care. The several methods used to measure each trait should be appropriate to the trait as conceptualized. Although this view will reduce the range of suitable methods, it will rarely restrict the measurement to one operational procedure.

Wherever possible, the several methods in one matrix should be completely independent of each other: there should be no prior reason for believing that they share method variance. This requirement is necessary to permit the values in the heteromethod-heterotrait triangles to approach zero. If the nature of the traits rules out such independence of methods, efforts should be made to obtain as much diversity as possible in terms of data-sources and classification processes. Thus, the classes of stimuli or the background situations, the experimental contexts, should be different. Again, the persons providing the observations should have different roles or the procedures for scoring should be varied.

Plans for a validational matrix should take into account the difference between the interpretations regarding convergence and discrimination. It is sufficient to demonstrate convergence between two clearly distinct methods which show little overlap in the heterotrait-heteromethod triangles. While agreement between several methods is desirable, convergence between two is a satisfactory minimal requirement. Discriminative validation is not so easily achieved. Just as it is impossible to prove the null hypothesis, or that some object does not exist, so one can never establish that a trait, as measured, is differentiated from all other traits. One can only show that this measure of Trait A has little overlap with those measures of B and C, and no dependable generalization beyond B and C can be made. For example, social poise could probably
REFERENCES
AMERICAN PSYCHOLOGICAL ASSOCIATION. Technical recommendations for psychological tests and diagnostic techniques. Psychol. Bull., Suppl., 1954, 51, Part 2, 1-38.
ANDERSON, E. E. Interrelationship of drives in the male albino rat. I. Intercorrelations of measures of drives. J. comp. Psychol., 1937, 24, 73-118.
AYER, A. J. The problem of knowledge. New York: St Martin's Press, 1956.
BORGATTA, E. F. Analysis of social interaction and sociometric perception. Sociometry, 1954, 17, 7-32.
BORGATTA, E. F. Analysis of social interaction: Actual, role-playing, and projective. J. abnorm. soc. Psychol., 1955, 51, 394-405.
BRIDGMAN, P. W. The logic of modern physics. New York: Macmillan, 1927.
BURWEN, L. S., & CAMPBELL, D. T. The generality of attitudes toward authority and nonauthority figures. J. abnorm. soc. Psychol., 1957, 54, 24-31.
CAMPBELL, D. T. A study of leadership among submarine officers. Columbus: Ohio State Univer. Res. Found., 1953.
CAMPBELL, D. T. Operational delineation of "what is learned" via the transposition experiment. Psychol. Rev., 1954, 61, 167-174.
CAMPBELL, D. T. Leadership and its effects upon the group. Monogr. No. 83. Columbus: Ohio State Univer. Bur. Business Res., 1956.
CARROLL, J. B. Ratings on traits measured by a factored personality inventory. J. abnorm. soc. Psychol., 1952, 47, 626-632.
CHI, P.-L. Statistical analysis of personality rating. J. exp. Educ., 1937, 5, 229-245.
CRONBACH, L. J. Response sets and test validity. Educ. psychol. Measmt, 1946, 6, 475-494.
CRONBACH, L. J. Essentials of psychological testing. New York: Harper, 1949.
CRONBACH, L. J. Further evidence on response sets and test design. Educ. psychol. Measmt, 1950, 10, 3-31.
CRONBACH, L. J., & MEEHL, P. E. Construct validity in psychological tests. Psychol. Bull., 1955, 52, 281-302.
EDWARDS, A. L. The social desirability variable in personality assessment and research. New York: Dryden, 1957.
FEIGL, H. The mental and the physical. In H. Feigl, M. Scriven, & G. Maxwell (Eds.), Minnesota studies in the philosophy of science. Vol. II. Concepts, theories and the mind-body problem. Minneapolis: Univer. Minnesota Press, 1958.
FISKE, D. W. Consistency of the factorial structures of personality ratings from different sources. J. abnorm. soc. Psychol., 1949, 44, 329-344.
GARNER, W. R. Context effects and the validity of loudness scales. J. exp. Psychol., 1954, 48, 218-224.
GARNER, W. R., HAKE, H. W., & ERIKSEN, C. W. Operationism and the concept of perception. Psychol. Rev., 1956, 63, 149-159.
JESSOR, R., & HAMMOND, K. R. Construct validity and the Taylor Anxiety Scale. Psychol. Bull., 1957, 54, 161-170.
KELLEY, T. L., & KREY, A. C. Tests and measurements in the social sciences. New York: Scribner, 1934.
KELLY, E. L., & FISKE, D. W. The prediction of performance in clinical psychology. Ann Arbor: Univer. Michigan Press, 1951.
LOEVINGER, J., GLESER, G. C., & DuBOIS, P. H. Maximizing the discriminating power of a multiple-score test. Psychometrika, 1953, 18, 309-317.
LORGE, I. Gen-like: Halo or reality? Psychol. Bull., 1937, 34, 545-546.
MAYO, G. D. Peer ratings and halo. Educ. psychol. Measmt, 1956, 16, 317-323.
STRANG, R. Relation of social intelligence to certain other factors. Sch. & Soc., 1930, 32, 268-272.
SYMONDS, P. M. Diagnosing personality and conduct. New York: Appleton-Century, 1931.
THORNDIKE, E. L. A constant error in psychological ratings. J. appl. Psychol., 1920, 4, 25-29.
THORNDIKE, R. L. Factor analysis of social and abstract intelligence. J. educ. Psychol., 1936, 27, 231-233.
THURSTONE, L. L. The reliability and validity of tests. Ann Arbor: Edwards, 1937.
TRYON, R. C. Individual differences. In F. A. Moss (Ed.), Comparative psychology. (2nd ed.) New York: Prentice-Hall, 1942. Pp. 330-365.
UNDERWOOD, B. J. Psychological research. New York: Appleton-Century-Crofts, 1957.
VERNON, P. E. Educational ability and psychological factors. Address given to the Joint Education-Psychology Colloquium, Univer. of Illinois, March 29, 1957.
VERNON, P. E. Educational testing and test-form factors. Princeton: Educational Testing Service, 1958. (Res. Bull. RB-58-3.)

Received June 19, 1958.