Stephen G. West
John F. Finch
Texas A&M University
Monte Carlo computer simulations were used to investigate the performance
of three χ² test statistics in confirmatory factor analysis (CFA). Normal theory
maximum likelihood χ² (ML), Browne's asymptotic distribution free χ²
(ADF), and the Satorra-Bentler rescaled χ² (SB) were examined under varying conditions of sample size, model specification, and multivariate distribution. For properly specified models, ML and SB showed no evidence of bias
under normal distributions across all sample sizes, whereas ADF was biased
at all but the largest sample sizes. ML was increasingly overestimated with
increasing nonnormality, but both SB (at all sample sizes) and ADF (only
at large sample sizes) showed no evidence of bias. For misspecified models,
ML was again inflated with increasing nonnormality, but both SB and ADF
were underestimated with increasing nonnormality. It appears that the power
of the SB and ADF test statistics to detect a model misspecification is attenuated given nonnormally distributed data.
Model Specification
Four specifications of an oblique three-factor
model with three indicators per factor were examined. The basic confirmatory factor model is presented in Figure 1. The population parameters
consisted of factor loadings (each λ = .70), uniquenesses (each Θδ = .51), interfactor correlations
(each φ = .30), and factor variances (all set to 1.0).
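The population values above imply a covariance matrix of the form Σ = ΛΦΛ' + Θ. The following is a minimal sketch of that computation for the population without cross-loadings, not the authors' EQS setup. The item-to-factor layout (Items 1-3 on Factor 1, Items 4-6 on Factor 2, Items 7-9 on Factor 3) and all variable names are assumptions made for illustration.

```python
# Population values taken from the text; the simple-structure layout
# (items 1-3, 4-6, 7-9 on Factors 1, 2, 3) is an assumption.
N_ITEMS, N_FACTORS = 9, 3
LOADING, UNIQUENESS, FACTOR_CORR = 0.70, 0.51, 0.30

# Lambda: 9 x 3 loading matrix with one nonzero loading per item.
lam = [[LOADING if i // 3 == k else 0.0 for k in range(N_FACTORS)]
       for i in range(N_ITEMS)]
# Phi: factor covariance matrix (unit variances, correlations of .30).
phi = [[1.0 if j == k else FACTOR_CORR for k in range(N_FACTORS)]
       for j in range(N_FACTORS)]


def implied_covariance(lam, phi, uniqueness):
    """Return Sigma = Lambda * Phi * Lambda' + Theta, with Theta diagonal."""
    p, m = len(lam), len(phi)
    sigma = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            common = sum(lam[i][a] * phi[a][b] * lam[j][b]
                         for a in range(m) for b in range(m))
            sigma[i][j] = common + (uniqueness if i == j else 0.0)
    return sigma


sigma = implied_covariance(lam, phi, UNIQUENESS)
print(round(sigma[0][0], 3))  # variance of Item 1
print(round(sigma[0][1], 3))  # covariance of two items on the same factor
print(round(sigma[0][3], 3))  # covariance of items on different factors
```

With these values each item has unit variance (.49 + .51), so the implied covariance matrix is also a correlation matrix. The population for the models with true cross-loadings would add the two λ = .35 entries to `lam`.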
Model 1. Model 1 was properly specified such
that the model that was estimated in the sample
directly corresponded to the model that existed in
the population. Thus, both the sample and the
population models corresponded to the solid lines
presented in Figure 1.
Model 2. Model 2 contained two factor loadings that were estimated in the sample but did
not exist in the population. Thus, in Figure 1, the
double dashed lines represent the two factor loadings that linked Item 5 to Factor 3 and Item 8 to
Factor 2, and the expected value of these parameters was 0. This is a misspecification of inclusion.
Note that from the standpoint of statistical theory,
estimation of parameters with an expected value
of 0 in the population does not bias the sample
results. Model 2 is thus considered to be a properly
specified model.
Model 3. Model 3 excluded two loadings from
the sample that did exist in the population. Thus,
in Figure 1, the single dashed lines represent the
two excluded factor loadings (both population
λs = .35) that linked Item 6 to Factor 3 and Item
7 to Factor 2. The value of λ = .35 was chosen to
reflect a small to moderate factor loading that
might be commonly encountered in practice. This
is a misspecification of exclusion.
Model 4. Finally, Model 4 was the combination of Models 2 and 3. Like Model 2, two factor
loadings were estimated in the sample that did
not exist in the population (the double dashed
lines linking Item 5 to Factor 3 and Item 8 to
Factor 2). Additionally, like Model 3, two factor
loadings were excluded from the sample that did
exist in the population (the single dashed lines
that linked Item 6 to Factor 3 and Item 7 to
Factor 2). This is a misspecification of both
inclusion and exclusion.
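The four specifications differ only in which cross-loadings are estimated, so their degrees of freedom follow from a simple count. The sketch below assumes the usual identification with factor variances fixed at 1.0, so that each estimated model has 9 primary loadings, 9 uniquenesses, and 3 interfactor correlations, plus any estimated cross-loadings. The function and dictionary names are hypothetical.

```python
def degrees_of_freedom(n_items, n_free_params):
    """Unique variances and covariances minus freely estimated parameters."""
    n_moments = n_items * (n_items + 1) // 2
    return n_moments - n_free_params


# Assumed free parameters shared by all four estimated models:
# 9 primary loadings + 9 uniquenesses + 3 interfactor correlations.
BASE_PARAMS = 9 + 9 + 3

# Cross-loadings estimated in the sample under each specification.
estimated_cross_loadings = {"Model 1": 0, "Model 2": 2,
                            "Model 3": 0, "Model 4": 2}

df = {name: degrees_of_freedom(9, BASE_PARAMS + extra)
      for name, extra in estimated_cross_loadings.items()}
print(df)
```

Under these assumptions Models 1 and 3 have 24 degrees of freedom and Models 2 and 4 have 22, which agrees with the expected χ² values of 24.0 and 22.0 reported for the two properly specified models.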
[Figure 1: path diagram of the oblique three-factor model with three indicators per factor; factor loadings of .70, uniquenesses of .51, and interfactor correlations of .30.]
Results
Measures
[Figure 2 plot: frequency (vertical axis, 0 to 5000) against standardized value (horizontal axis, -4.5 to 4.5).]
Figure 2. Plots of normal (solid line), moderately nonnormal (dotted line), and severely nonnormal (dashed
line) empirical distributions based on a random sample
of N = 10,000.
Table 1
Observed Chi-Square, Expected Chi-Square, Percentage Bias, and Percentage of Rejected Models
for Model Specification 1
Obs. = observed χ²; Exp. = expected χ²; %Bias = percentage bias; %Rej = percentage of rejected models.

              Normal                          Moderately nonnormal            Severely nonnormal
   N  χ²      Obs.    Exp.   %Bias    %Rej    Obs.    Exp.   %Bias    %Rej    Obs.    Exp.   %Bias    %Rej
 100  ML     25.01    24.0     4.0     5.5   29.35    24.0    22.0    20.0   33.54    24.0    40.0    30.0
      SB     25.87    24.0     8.0     7.5   26.06    24.0     9.0     8.5   27.26    24.0    14.0    13.0
      ADF    36.44    24.0    52.0    43.0   38.04    24.0    59.0    49.0   44.82    24.0    87.0    68.0
 200  ML     24.78    24.0     3.0     6.5   30.15    24.0    26.0    25.0   34.40    24.0    43.0    36.0
      SB     25.22    24.0     5.0     8.5   25.44    24.0     6.0     8.0   25.80    24.0     8.0     6.5
      ADF    29.19    24.0    22.0    19.0   29.27    24.0    22.0    19.0   31.29    24.0    30.0    25.0
 500  ML     23.94    24.0     0.0     3.5   31.26    24.0    30.0    24.0   35.55    24.0    48.0    40.0
      SB     24.10    24.0     0.0     5.0   25.44    24.0     6.0     6.9   24.85    24.0     4.0     8.5
      ADF    25.92    24.0     8.0    11.0   26.42    24.0    10.0     6.7   26.83    24.0    12.0     8.5
1000  ML     25.05    24.0     4.0     7.0   30.78    24.0    28.0    24.0   37.40    24.0    56.0    48.0
      SB     25.16    24.0     5.0     8.0   24.77    24.0     3.0     7.5   25.01    24.0     4.0     7.0
      ADF    25.79    24.0     7.0     9.5   25.36    24.0     6.0     7.5   25.47    24.0     6.0     7.2
Note. Univariate skewness and kurtoses were (0,0), (2,7), and (3,21) for normal, moderately nonnormal, and severely nonnormal
distributions, respectively. ML = maximum likelihood; SB = Satorra-Bentler rescaled; ADF = asymptotic distribution
free.
increasingly positively biased with increasing nonnormality, and this inflation was exacerbated with
increasing sample size. In comparison, the SB χ²
showed minimal bias with increasing nonnormality, although the observed rejection rates at the
smallest sample size were slightly larger than expected. Even under severe nonnormality, the SB
χ² again showed little evidence of bias, especially
at sample sizes of N = 200 or greater. Finally,
the ADF χ² was positively biased with increasing
Table 2
Observed Chi-Square, Expected Chi-Square, Percentage Bias, and Percentage of Rejected Models
for Model Specification 2
Obs. = observed χ²; Exp. = expected χ²; %Bias = percentage bias; %Rej = percentage of rejected models.

              Normal                          Moderately nonnormal            Severely nonnormal
   N  χ²      Obs.    Exp.   %Bias    %Rej    Obs.    Exp.   %Bias    %Rej    Obs.    Exp.   %Bias    %Rej
 100  ML     23.42    22.0     6.0     9.6   26.89    22.0    22.0    22.2   29.82    22.0    36.0    34.4
      SB     24.19    22.0     9.0    12.1   23.80    22.0     8.0     8.5   25.07    22.0    14.0    11.1
      ADF     31.0    22.0    41.0    30.5   34.95    22.0    59.0    40.8   46.45    22.0   111.0    63.2
 200  ML     22.48    22.0     2.0     6.0   27.70    22.0    26.0    26.5   31.77    22.0    44.0    36.7
      SB     22.86    22.0     3.0     7.0   23.75    22.0     8.0     8.5   24.10    22.0    10.0     8.5
      ADF    26.43    22.0    20.0    17.5   26.06    22.0    18.0    14.5   28.52    22.0    30.0    22.0
 500  ML     21.89    22.0     0.0     6.0   26.68    22.0    21.0    19.0   31.86    22.0    45.0    32.5
      SB     22.02    22.0     0.0     7.0   21.90    22.0     0.0     5.5   23.33    22.0     6.0     6.5
      ADF    23.13    22.0     5.0     7.0   23.04    22.0     5.0     8.0   24.02    22.0     9.0     5.5
1000  ML     22.25    22.0     0.0     5.0   26.05    22.0    18.0    14.5   33.37    22.0    52.0    41.5
      SB     22.31    22.0     0.0     3.5   21.14    22.0     4.0     4.0   22.74    22.0     3.0     7.0
      ADF    22.09    22.0     0.0     6.0   23.32    22.0     6.0     9.5   23.41    22.0     6.0     7.5

Note. Univariate skewness and kurtoses were (0,0), (2,7), and (3,21) for normal, moderately nonnormal, and severely nonnormal
distributions, respectively. ML = maximum likelihood; SB = Satorra-Bentler rescaled; ADF = asymptotic distribution
free.
CHI-SQUARE TEST STATISTICS
Table 3
Observed Chi-Square, Expected Chi-Square, Percentage Bias, and Percentage of Rejected Models
for Model Specification 3
Obs. = observed χ²; Exp. = expected χ²; %Bias = percentage bias; %Rej = percentage of rejected models.

              Normal                          Moderately nonnormal            Severely nonnormal
   N  χ²      Obs.    Exp.   %Bias    %Rej    Obs.    Exp.   %Bias    %Rej    Obs.    Exp.   %Bias    %Rej
 100  ML     38.45   37.62     2.0    54.3   46.50   37.75    23.0    78.9   52.87   37.77    40.0    81.4
      SB     39.63   37.65     5.0    57.9   37.63   33.99    11.0    49.1   38.10   30.85    24.0    47.1
      ADF    52.04   33.74    54.0    80.3   60.88   29.68   105.0    86.3   81.31   27.29   198.0    95.3
 200  ML     51.07   51.38     0.0    88.1   59.72   51.66    16.0    93.7   68.58   51.68    33.0    95.4
      SB     51.65   51.44     0.0    89.9   47.04   44.09     7.0    82.4   44.66   37.77    18.0    78.9
      ADF    50.17   43.59    15.0    85.6   47.93   35.42    36.0    84.8   47.16   30.61    54.0    75.2
 500  ML     92.27   92.66     0.0   100.0   99.39   93.37     6.0   100.0  109.87   93.42    18.0   100.0
      SB     92.52   92.80     0.0   100.0   75.04   74.40     1.0   100.0   63.46   58.52     8.0    97.2
      ADF    76.89   73.11     5.0   100.0   60.10   52.63    14.0    99.5   53.67   40.58    32.0    96.7
1000  ML    161.46  161.35     0.0   100.0  171.07  162.88     5.0   100.0  180.90  162.98    11.0   100.0
      SB    161.66  161.74     0.0   100.0  126.23  124.89     2.0   100.0  101.25   93.11     9.0   100.0
      ADF   127.71  122.32     4.0   100.0   90.23   81.31    11.0   100.0   76.10   57.20    33.0   100.0

Note. Univariate skewness and kurtoses were (0,0), (2,7), and (3,21) for normal, moderately nonnormal, and severely nonnormal
distributions, respectively. ML = maximum likelihood; SB = Satorra-Bentler rescaled; ADF = asymptotic distribution free.
sample, the percentage of rejected models was no
longer a meaningful guide with which to judge the
behavior of the test statistics. Thus, the following
results will now be presented in terms of the relative bias in the test statistics.⁴
Under multivariate normality, the expected values for the ML and SB χ² were nearly identical
across all four sample sizes. This is further support
that for normal distributions, no scaling correction
is required for the ML χ², and the SB χ² thus
simplifies to the ML χ². Additionally, neither the
ML nor the SB test statistic showed appreciable bias
under normality across all four sample sizes. In
comparison, the empirical estimate of the expected value of the ADF χ² was smaller than that
of the ML or SB test statistics. Recall that the
expected values for all three test statistics were
equal for the properly specified models. The lower
expected value of the ADF for misspecified models even under multivariate normality suggests that
this test statistic may have less power to reject the
null hypothesis compared with the ML or SB χ².
Unlike the ML and SB, the ADF was significantly
positively biased at the two smaller sample sizes.
For example, at N = 100 the average observed
ADF χ² was 54% larger than the expected value.
This bias dropped to 15% at N = 200 and was
negligible at the two larger sample sizes.
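The relative bias figures quoted here can be reproduced from the observed and expected values in Table 3. The sketch below assumes relative bias is defined as the percentage difference between the mean observed χ² and its expected value, a definition that matches the tabled numbers. The function and variable names are illustrative.

```python
def percent_bias(observed, expected):
    """Relative bias of a mean observed chi-square, in percent."""
    return 100.0 * (observed - expected) / expected


# (observed, expected) ADF chi-square under normality, Model Specification 3,
# keyed by sample size; values copied from Table 3.
adf_normal = {100: (52.04, 33.74), 200: (50.17, 43.59),
              500: (76.89, 73.11), 1000: (127.71, 122.32)}

bias = {n: percent_bias(obs, exp) for n, (obs, exp) in adf_normal.items()}
for n, b in bias.items():
    print(f"N = {n}: {b:.1f}%")
```

Rounded to whole percentages, the first two entries give the 54% and 15% cited in the text for N = 100 and N = 200.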
The findings become more complicated given
nonnormal distributions. The expected value of
CURRAN, WEST, AND FINCH
Table 4
Observed Chi-Square, Expected Chi-Square, Percentage Bias, and Percentage of Rejected Models
for Model Specification 4
Obs. = observed χ²; Exp. = expected χ²; %Bias = percentage bias; %Rej = percentage of rejected models.

              Normal                          Moderately nonnormal            Severely nonnormal
   N  χ²      Obs.    Exp.   %Bias    %Rej    Obs.    Exp.   %Bias    %Rej    Obs.    Exp.   %Bias    %Rej
 100  ML     34.03   32.52     5.0    48.7   40.37   32.52    24.0    63.4   45.15   32.45    39.0    78.6
      SB     34.97   32.56     7.0    51.8   33.94   29.76    14.0    45.5   34.09   27.33    25.0    50.3
      ADF    47.99   30.66    56.0    82.1   48.67   27.19    79.0    85.4   63.25   25.06   152.0    91.8
 200  ML     43.75   43.14     1.0    81.1   49.84   43.14    16.0    91.0   58.14   43.01    35.0    88.1
      SB     44.34   43.23     3.0    81.6   39.55   37.59     5.0    67.9   38.48   32.71    18.0    60.4
      ADF    46.42   39.41    18.0    85.5   41.84   32.43    29.0    73.4   46.05   28.16    64.0    84.5
 500  ML     75.29   75.04     0.0   100.0   81.35   75.04     8.0   100.0   91.71   74.68    23.0   100.0
      SB     75.62   75.23     0.0   100.0   62.09   61.09     2.0    97.0   55.94   48.85    15.0    92.6
      ADF    69.62   65.66     6.0   100.0   53.84   48.15    12.0    94.2   49.13   37.45    31.0    93.2
1000  ML    128.71  128.20     0.0   100.0  133.86  128.20     4.0   100.0  144.56  127.46    13.0   100.0
      SB    129.16  128.56     0.0   100.0  100.48  100.26     0.0   100.0   83.44   75.76    10.0   100.0
      ADF   111.39  109.41     2.0   100.0   80.91   74.35     9.0   100.0   68.25   52.92    30.0   100.0

Note. Univariate skewness and kurtoses were (0,0), (2,7), and (3,21) for normal, moderately nonnormal, and severely nonnormal
distributions, respectively. ML = maximum likelihood; SB = Satorra-Bentler rescaled; ADF = asymptotic distribution free.
and remained biased even at N = 1,000 (11%).
Under severe nonnormality, the SB χ² showed
substantial bias at the two smaller sample sizes
(e.g., 18% at N = 200) but was only moderately
biased at the larger sample sizes (e.g., 9% at N =
1,000). The ADF showed very high levels of relative bias across all four sample sizes under severe
nonnormality and was overestimated by 33% even
at the largest sample size, N = 1,000.
Model Specification 4. Model Specification 4
contained both errors of inclusion and exclusion.
Two cross-loadings existed in the population that
were not estimated in the sample (λ = .35), and
two cross-loadings were estimated in the sample
that did not exist in the population (λ = 0). These
results are presented in Table 4.
The findings from Model Specification 4 followed the same general pattern as was observed
for Model Specification 3. The primary difference
was that the expected values and rejection rates
in Model 4 were lower compared with those of
Model 3. This result may initially appear counterintuitive given that Model 4 combined errors of
both inclusion and exclusion. However, unlike
Model 3, Model 4 contained the simultaneous estimation of the two truly nonexistent paths and the
exclusion of the two truly existent paths. The mean
estimated factor loading for the two additional
paths in Model 2 (where no other paths were excluded) was λ = 0 (the population expected value).
However, the mean factor loading for these same
two additional paths in Model 4 was λ = -.40.
Thus, these additional free parameters served to
"absorb" the misspecification, and Model 4 resulted in a better fit to the data than did Model 3.
As with Model 3, under multivariate normality,
the expected values of the ML and SB χ² were
equal to one another whereas the expected value
of the ADF χ² was smaller. Neither the ML nor the SB
χ² showed any significant bias under the normal
distribution across any of the four sample sizes.
The largest bias was for the SB χ² at N = 100
(7%), but the magnitude of bias dropped to near
0 at sample sizes of N = 200 and above. In comparison, the ADF χ² was again significantly overestimated at the two smaller sample sizes but was
unbiased at sample sizes of N = 500 and above.
The expected value of the ML χ² was again
equal across distributions, and the ML χ² was increasingly positively biased with increasing nonnormality. Like Model 3, the expected values for
the SB and ADF χ² decreased with increasing
nonnormality. The SB χ² showed increasing positive bias with increasing nonnormality. This bias
became negligible at N = 200 under moderate
nonnormality (5%) but was still slightly biased
even at N = 1,000 under severe nonnormality
(10%). Finally, the ADF χ² was again strongly
biased, with increasing nonnormality even at the
largest sample size.
References
Bentler, P. M. (1989). EQS: Structural equations program manual. Los Angeles: BMDP Statistical
Software.
Bollen, K. A. (1989). Structural equations with latent
variables. New York: Wiley.
Browne, M. W. (1982). Covariance structures. In D. M.
Hawkins (Ed.), Topics in applied multivariate analysis
(pp. 72-141). Cambridge: Cambridge University
Press.
Browne, M. W. (1984). Asymptotic distribution free
methods in the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 127-141.
Browne, M. W., Mels, G., & Coward, M. (1994). Path
Analysis: RAMONA. In L. Wilkinson & M. A. Hill
(Eds.), Systat: Advanced applications (pp. 163-224).
Evanston, IL: SYSTAT.
Chou, C. P., Bentler, P. M., & Satorra, A. (1991). Scaled
test statistics and robust standard errors for non-normal data in covariance structure analysis: A Monte
Carlo study. British Journal of Mathematical and Statistical Psychology, 44, 347-357.
Fleishman, A. I. (1978). A method for simulating nonnormal distributions. Psychometrika, 43, 521-532.
Hu, L., Bentler, P. M., & Kano, Y. (1992). Can test
Vale, C. D., & Maurelli, V. A. (1983). Simulating multivariate nonnormal distributions. Psychometrika, 48,
465-471.
Yung, Y.-F., & Bentler, P. M. (1994). Bootstrap-corrected ADF test statistics in covariance structure analysis. British Journal of Mathematical and Statistical
Psychology, 47, 63-84.
Appendix
Technical Details
Data Generation
EQS generates the raw data with non-zero skewness
and kurtosis using the formulae developed by Fleishman
(1978) in accordance with the procedures described by
Vale and Maurelli (1983). The raw data were generated
based upon the covariance matrix implied by the model
parameters for each of the three models. The availability
of the raw data was necessary for the computation of
the ADF and SB χ² test statistics.
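Fleishman's (1978) method transforms a standard normal deviate Z into Y = a + bZ + cZ² + dZ³, with a = -c, so that Y has a chosen skewness and kurtosis. The following is a minimal univariate sketch and not the EQS implementation. The coefficient values are assumed inputs for the moderately nonnormal condition (skewness 2.0, kurtosis 7.0) and are checked against Fleishman's moment equations rather than taken on trust. The Vale and Maurelli (1983) step that adjusts the intermediate correlations for multivariate data is not shown.

```python
import random

# Assumed polynomial coefficients for skewness 2.0 and kurtosis 7.0 (a = -c).
# They are illustrative inputs, verified below against the moment equations.
B, C, D = 0.761585, 0.260022, 0.053072


def fleishman_moments(b, c, d):
    """Variance, skewness, and excess kurtosis implied by
    Y = -c + b*Z + c*Z**2 + d*Z**3 for standard normal Z."""
    variance = b**2 + 6*b*d + 2*c**2 + 15*d**2
    skewness = 2*c*(b**2 + 24*b*d + 105*d**2 + 2)
    kurtosis = 24*(b*d + c**2*(1 + b**2 + 28*b*d)
                   + d**2*(12 + 48*b*d + 141*c**2 + 225*d**2))
    return variance, skewness, kurtosis


def fleishman_draw(rng, b=B, c=C, d=D):
    """Transform one standard normal deviate into a nonnormal deviate."""
    z = rng.gauss(0.0, 1.0)
    return -c + b*z + c*z**2 + d*z**3


variance, skewness, kurtosis = fleishman_moments(B, C, D)
print(round(variance, 3), round(skewness, 3), round(kurtosis, 3))

rng = random.Random(12345)  # fixed seed so the draw is repeatable
sample = [fleishman_draw(rng) for _ in range(10_000)]
```

The check on the moment equations is deterministic, whereas the skewness and kurtosis of any one generated sample will fluctuate around the target values, more so for kurtosis.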
The samples were created using the model implied
population covariance matrix Σ(θ). The measurement
equations consisted of the population parameter values
that defined the particular model. EQS generated the
population covariance matrix based on these measurement equations. The sample raw data were created using
a random number generator in conjunction with the
characteristics of the population covariance matrix. The
raw data were generated under two constraints: (a) the
expected value of S should equal the population covariance matrix Σ(θ), and (b) the expected value of the
indices of skewness and kurtosis should equal the values
specified for each measured variable.
(skewness = 0, kurtosis = 0), the mean univariate skewness for the nine variables was .001 and the mean kurtosis was .004. For the moderately nonnormal distribution
(skewness = 2.0, kurtosis = 7.0), the mean skewness
was 1.973 and the mean kurtosis was 6.648. Finally, for
the severely nonnormal distribution (skewness = 3.0,
kurtosis = 21.0), the mean skewness was 2.986 and the
mean kurtosis was 21.44. These large sample values of
skewness and kurtosis closely reflected the population
values. Previous published studies have also successfully
utilized this same method of data generation (Chou et
al., 1991; Hu et al., 1992).
Expected Value of Test Statistics
For a properly specified model, the expected value
for all three chi-square test statistics is equal to the
model degrees of freedom. Thus, the expected chi-square for Model 1 was 24.0 and for Model 2 was 22.0.
A complication arises when computing the expected
value of the test statistics for the misspecified models.
Under misspecification, the expected value of the model
chi-square is a combination of the model degrees of
freedom plus the noncentrality parameter, λ. The value
of λ is dependent on both the particular method of
estimation and sample size, with the expected value of
the model chi-square becoming larger with increasing
sample size.
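The dependence of the expected value on sample size can be illustrated with the ML values tabled for Model Specification 3. The sketch below assumes that the noncentrality parameter is proportional to N - 1 and that the estimated Model 3 has 24 degrees of freedom (it frees the same parameters as Model 1). Both are assumptions made for illustration, and the function name is hypothetical.

```python
DF_MODEL_3 = 24            # assumed degrees of freedom of the estimated model
EXPECTED_ML_N100 = 37.62   # expected ML chi-square at N = 100 (Table 3, normal)


def expected_chi_square(n, df=DF_MODEL_3,
                        ref_expected=EXPECTED_ML_N100, ref_n=100):
    """Expected chi-square = df + noncentrality, assuming the noncentrality
    parameter grows in proportion to (N - 1)."""
    ref_noncentrality = ref_expected - df
    noncentrality = ref_noncentrality * (n - 1) / (ref_n - 1)
    return df + noncentrality


predicted = {n: round(expected_chi_square(n), 2) for n in (200, 500, 1000)}
print(predicted)
```

The predictions can be compared with the tabled ML expected values of 51.38, 92.66, and 161.35 at N = 200, 500, and 1,000. Small discrepancies are expected because the reference value at N = 100 is itself rounded.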
Satorra and Saris (1985) provided a method for computing the noncentrality parameter for normal theory
ML for misspecified models. First, a covariance matrix
is created to reflect the structure of the model as it exists
in the population. Second, this covariance matrix is used
to estimate the model as it is thought to exist in the
sample. The chi-square value that results from this
model is the corresponding noncentrality parameter λ.
This value, when added to the model degrees of freedom, provides the expected value of the ML χ² test