Topic 10: Factor Analysis and Reliability
LEARNING OUTCOMES
By the end of this topic, you should be able to:
1. Describe the requirements for factor analysis for a given data set;
2. Use the appropriate method to determine the principal components
underpinning the responses on a set of variables; and
3. Compute the reliability index.
INTRODUCTION
Factor analysis is used to uncover the latent structures (dimensions) of a set of variables. It belongs to a family of data-reduction methods; other members of the family are Latent Class Analysis, Latent Profile Analysis, Latent Trait Analysis, and Principal Component Analysis (PCA). The focus of Principal Component Analysis is to reduce a large number of variables to a smaller set of principal components (dimensions). It allows researchers to use a small number of "factors" to explain what a long list of "variables" actually measures. Prior to multiple regression analysis, factor analysis can be used to create a set of factors to be treated as uncorrelated variables, which is one approach to handling multicollinearity. Factor analysis is an interdependency technique; it aims to find the latent factors that account for the patterns of collinearity among multiple metric variables. Note that some statisticians do not consider PCA to be factor analysis.
Consider a set of nine observed variables, X1 to X9.
Basic Principle:
Variables that significantly correlate with each other do so because they are
measuring the same "construct".
The Problem:
What is the "construct" that brings the variables together?
To find the answer, we need to carry out factor analysis or, to be more precise, Principal Component Analysis.
Step 1:
Collect data on the two variables (X and Y) from a group of respondents (let's say we use 10 respondents).
Table 10.2 shows an example of the scores collected for the two variables:
Step 2:
Determine the variance-covariance matrix for the two variables. The formulas
below define the variance and covariance.
Variance [Cov(X, X)] = Σᵢ₌₁ⁿ (Xᵢ − X̄)(Xᵢ − X̄) / (n − 1); using the standardised values, Cov(X, X) = 1.00
Variance [Cov(Y, Y)] = Σᵢ₌₁ⁿ (Yᵢ − Ȳ)(Yᵢ − Ȳ) / (n − 1); using the standardised values, Cov(Y, Y) = 1.00
Covariance [Cov(X, Y)] = Σᵢ₌₁ⁿ (Xᵢ − X̄)(Yᵢ − Ȳ) / (n − 1) = 0.77
The variance-covariance matrix is

    ⎛ Cov(x, x)  Cov(x, y) ⎞
    ⎝ Cov(y, x)  Cov(y, y) ⎠

If the X and Y scores are transformed into standardised scores, the variance-covariance matrix gives us the correlation matrix. Thus, for the above example, the variance-covariance matrix for the standardised values of X and Y is

    ⎛ 1.00  0.77 ⎞
    ⎝ 0.77  1.00 ⎠
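Step 2 can be sketched numerically. The scores below are hypothetical stand-ins for the Table 10.2 data (which is not reproduced here); the point is that, once the scores are standardised, the variance-covariance matrix becomes the correlation matrix.

```python
import numpy as np

# Hypothetical scores for 10 respondents on X and Y (illustrative
# only; the module's Table 10.2 data is not reproduced here).
x = np.array([3, 5, 2, 8, 7, 4, 6, 5, 9, 1], dtype=float)
y = np.array([2, 6, 3, 7, 8, 3, 5, 6, 9, 2], dtype=float)

# Standardise: subtract the mean and divide by the sample standard
# deviation (ddof=1 matches the n - 1 in the formulas above).
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)

# Variance-covariance matrix of the standardised scores: the
# diagonal entries are 1.00 and the off-diagonal entry is the
# correlation between X and Y.
vcov = np.cov(zx, zy, ddof=1)
print(np.round(vcov, 2))
```

With the module's actual Table 10.2 scores, the off-diagonal entry would be the 0.77 used in the example.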
Step 3
Calculate the eigenvalues and eigenvectors of the covariance matrix. The eigenvalues λ are the solutions of the characteristic equation

    (1 − λ)² − r12² = 0

where r12 is the correlation between the two variables; in this case, r12 = 0.77. Thus, the eigenvalues are λ1 = 1 + 0.77 = 1.77 and λ2 = 1 − 0.77 = 0.23. Substituting each eigenvalue back into the matrix equation gives the corresponding eigenvector.
When the eigenvalue is 1.77, the eigenvector satisfies −0.77v1 + 0.77v2 = 0, i.e. v1 = v2, so the eigenvector is (1, 1)ᵀ. When the eigenvalue is 0.23, the eigenvector satisfies 0.77v1 + 0.77v2 = 0, so the eigenvector is (1, −1)ᵀ. (An eigenvector is determined only up to a scaling constant.)
Step 4
Plot the standardised values on a two-dimensional plane and overlay the eigenvectors.
[Scatter plot: the standardised X and Y scores plotted on axes running from −2.00 to 2.00, with the two eigenvectors overlaid on the point cloud.]
Figure 10.1: Plotting on a two-dimensional plane and overlaying the eigenvectors
From the plot shown in Figure 10.1, it can be concluded that the data set is fairly
well represented by the eigenvector derived when the eigenvalue is 1.77.
The above discussion is just for illustrative purposes. In real situations, there will be more than two "observed" variables, and thus a visual (e.g. graphical) representation is not possible.
Having run the correlation analysis, the researcher found that some of the items have high correlations with one another while others do not. Table 10.4 shows an example of the correlation analysis.
      X1    X2    X3    X4    X5    X6    …    Xk
X1   1.00  0.76  0.84   …
X2         1.00  0.76   …
X3               1.00   …
X4                     1.00  0.76  0.77   …
X5                           1.00  0.81
X6                                 1.00
…
Xk                                            1.00
The next logical thing to do is to cluster the variables with high inter-correlations together and define them as belonging to the same family. This is what factor analysis (or, to be precise, Principal Component Analysis) is all about. Table 10.5 displays an example of the factor analysis. The values in the cells are the factor loadings (refer to Subsection 10.3.1 for further explanation of factor loadings).
Example
Note the factor loadings for variable X1.
As such, the initial communality for each variable, before the factors are extracted, is always 1.00. In the above example, emotional intelligence is operationalised using 23 specific situations, so there will be 23 initial factors, with some having greater dominance than others (this is reflected in the eigenvalues). Once the dominant factors are identified (e.g. those with an eigenvalue greater than 1.00), the communality value for each variable will be less than 1.00. This is because, in factor analysis, the factors that have a negligible effect on the variables are dropped.
Step 1
Compute a k by k inter-correlation matrix. According to Hair et al., inter-correlation values must be at least 0.3 for the items to be considered for factor analysis.
Step 2
Extract an initial solution.
Step 3
Determine the appropriate number of factors to be extracted in the final solution.
Step 4
Rotate the factors, if necessary, to clarify the factor pattern and better interpret the nature of the factors.
Step 5
Establish the measures of goodness-of-fit of the factor solution.
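Steps 1 to 3 above can be sketched in a few lines of code. This is a minimal illustration with hypothetical data, not the SPSS procedure used later in the topic; rotation (Step 4) and goodness-of-fit checks (Step 5) are omitted.

```python
import numpy as np

def pca_extract(data):
    """Steps 1-3 sketched: build the k x k inter-correlation matrix,
    extract the initial eigen-solution, and keep components whose
    eigenvalue exceeds 1 (Kaiser's criterion)."""
    # Step 1: k x k inter-correlation matrix.
    R = np.corrcoef(data, rowvar=False)
    # Step 2: initial solution via eigendecomposition of R.
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    order = np.argsort(eigenvalues)[::-1]          # largest first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Step 3: retain components with eigenvalue > 1; loadings are
    # eigenvectors scaled by the square root of their eigenvalue.
    keep = eigenvalues > 1.0
    loadings = eigenvectors[:, keep] * np.sqrt(eigenvalues[keep])
    return eigenvalues, loadings

# Hypothetical data: 200 respondents on 6 items generated from two
# underlying dimensions plus noise (illustrative only).
rng = np.random.default_rng(0)
f = rng.normal(size=(200, 2))
items = [f[:, 0] + 0.4 * rng.normal(size=200) for _ in range(3)]
items += [f[:, 1] + 0.4 * rng.normal(size=200) for _ in range(3)]
eigenvalues, loadings = pca_extract(np.column_stack(items))
print((eigenvalues > 1.0).sum())   # number of components retained
```

Because the six items were built from two dimensions, the eigenvalue > 1 rule retains two components here.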
[Diagram: two common factors, C1 and C2, each linked to all ten observed variables X1 to X10.]
All the observed variables will have some influence on all the factors extracted; however, different sets of variables will have different degrees of influence on the different common factors.
The following Figure 10.2 summarises the requirements and assumptions for
principal component analysis.
Correlation Matrixa
rq1 rq2 rq3 rq4 rq5 rq6 rq7 rq8 rq9 rq10
Correlation rq1 1.000 .604 .578 .419 .514 .580 .497 .555 .554 .481
rq2 .604 1.000 .615 .518 .488 .545 .543 .402 .402 .401
rq3 .578 .615 1.000 .519 .567 .536 .572 .481 .484 .496
rq4 .419 .518 .519 1.000 .581 .430 .450 .336 .174 .357
rq5 .514 .488 .567 .581 1.000 .577 .577 .466 .382 .574
rq6 .580 .545 .536 .430 .577 1.000 .575 .510 .417 .437
rq7 .497 .543 .572 .450 .577 .575 1.000 .459 .442 .521
rq8 .555 .402 .481 .336 .466 .510 .459 1.000 .585 .602
rq9 .554 .402 .484 .174 .382 .417 .442 .585 1.000 .529
rq10 .481 .401 .496 .357 .574 .437 .521 .602 .529 1.000
a. Determinant =0.005
Test Results
χ2 = 887.955 ; df = 45 ; p < 0.0001
Statistical Decision
The inter-correlation matrix of the variables is significantly different from
an identity matrix. In other words, the sample inter-correlation matrix did
not come from a population in which the inter-correlation matrix is an
identity matrix.
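Bartlett's test statistic can be computed from the determinant of the inter-correlation matrix: χ² = −(n − 1 − (2k + 5)/6) ln|R|, with df = k(k − 1)/2. A sketch follows; note that the sample size n is not given in the text, so the value used below is hypothetical.

```python
import math

def bartlett_sphericity(det_R, n, k):
    """Bartlett's test of sphericity from the determinant of the
    k x k inter-correlation matrix R and the sample size n."""
    chi2 = -(n - 1 - (2 * k + 5) / 6) * math.log(det_R)
    df = k * (k - 1) // 2
    return chi2, df

# k = 10 items and determinant = 0.005, as in the SPSS output above.
# The sample size is NOT stated in the text; n = 173 is a
# hypothetical value chosen only to illustrate the calculation.
chi2, df = bartlett_sphericity(det_R=0.005, n=173, k=10)
print(df)                 # 45 degrees of freedom, as reported above
print(round(chi2, 1))
```

The degrees of freedom depend only on k, which is why df = 45 matches the SPSS output regardless of the assumed n.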
If the anti-image (partial) correlations aij ≅ 0.0, the variables are measuring a common factor, and KMO ≅ 1.0.
If aij ≅ 1.0, the variables are not measuring a common factor, and KMO ≅ 0.0.
Interpretation
The degree of common variance among the ten variables is "marvellous" (in Kaiser's verbal labels for KMO values).
Note: In this module, we will only focus on the Principal Component Method.
Each initial factor (item) now belongs to the "new factors", and the new factors explain a certain proportion of the variance in each variable. Thus, the proportion of variance of each variable (item) explained by the new factors is less than 1.00 (refer to Table 10.12).
[Table 10.12: Communalities, showing the Initial and Extraction values for each item.]
The variance of each standardised variable is 1.0, so the total variance to be explained is 10 (10 variables, each with a variance of 1.0). Since a single variable can account for 1.0 unit of variance, a useful "new factor" must account for more than 1.0 unit of variance, that is, have an eigenvalue (λ) greater than 1.0. Otherwise, the extracted factor (new factor) explains less variance than a single variable. Table 10.7 shows the results of the factor analysis of the 10 items.
Referring to Table 10.13 above, the results of the initial solution are as follows:
Interpretation
10 factors (components) were extracted, the same as the number of variables
factored:
(a) Factor I
The 1st factor has an eigenvalue = 5.489. Since this value is greater than 1.0, the factor explains more variance than a single variable; in fact, 5.489 times as much.
(b) Factor II
The 2nd factor has an eigenvalue = 1.041. It is also a value greater than 1.0,
and therefore, explains more variance than a single variable.
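As a cross-check, these two eigenvalues can be recovered (up to the rounding of the printed correlations) directly from the inter-correlation matrix shown earlier:

```python
import numpy as np

# Inter-correlation matrix of rq1-rq10, transcribed from the SPSS
# output shown earlier (entries rounded to three decimals there).
R = np.array([
    [1.000, .604, .578, .419, .514, .580, .497, .555, .554, .481],
    [ .604, 1.000, .615, .518, .488, .545, .543, .402, .402, .401],
    [ .578, .615, 1.000, .519, .567, .536, .572, .481, .484, .496],
    [ .419, .518, .519, 1.000, .581, .430, .450, .336, .174, .357],
    [ .514, .488, .567, .581, 1.000, .577, .577, .466, .382, .574],
    [ .580, .545, .536, .430, .577, 1.000, .575, .510, .417, .437],
    [ .497, .543, .572, .450, .577, .575, 1.000, .459, .442, .521],
    [ .555, .402, .481, .336, .466, .510, .459, 1.000, .585, .602],
    [ .554, .402, .484, .174, .382, .417, .442, .585, 1.000, .529],
    [ .481, .401, .496, .357, .574, .437, .521, .602, .529, 1.000],
])

# Eigenvalues in descending order; the two largest should be close
# to the reported 5.489 and 1.041.
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]
print(np.round(eigenvalues[:2], 3))
```

Only these two eigenvalues exceed 1.0, which is why two components are extracted.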
The total proportion of the variance in rq1 explained by the two factors is:
(0.785² + 0.099²) = 0.626
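The communality is simply the sum of squared loadings across the retained components; a one-line check:

```python
# Loadings of rq1 on the two extracted components (from the
# Component Matrix).
loadings_rq1 = [0.785, 0.099]

# Communality = sum of squared loadings across retained components.
communality = sum(l ** 2 for l in loadings_rq1)
print(round(communality, 3))  # 0.626
```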
The key to determining what the factors measure is the factor loadings.
Component Matrixa
Component
1 2
rq1 .785 .099
rq2 .748 -.253
rq3 .795 -.127
rq4 .640 -.567
rq5 .776 -.216
rq6 .762 -.086
rq7 .765 -.096
rq8 .727 .406
rq9 .667 .563
rq10 .728 .289
Extraction Method: Principal Component
Analysis.
a. 2 components extracted.
Each factor loading is the correlation coefficient between a variable and a factor. Reading down the second column of the Component Matrix gives the loadings on Factor II:

Variable   Factor Loading on Factor II
rq1         .099
rq2        -.253
rq3        -.127
rq4        -.567
rq5        -.216
rq6        -.086
rq7        -.096
rq8         .406
rq9         .563
rq10        .289

For example, the correlation coefficient between rq1 and Factor II is 0.099, and the correlation coefficient between rq4 and Factor II is -0.567. The loadings on Factor I are read from the first column of the Component Matrix in the same way.
Reproduced Correlations
rq1 rq2 rq3 rq4 rq5 rq6 rq7 rq8 rq9 rq10
Reproduced rq1 .626a .562 .611 .446 .588 .590 .591 .611 .580 .600
Correlation rq2 .562 .623a .626 .622 .635 .591 .596 .441 .357 .471
rq3 .611 .626 .647a .580 .644 .616 .620 .526 .459 .542
rq4 .446 .622 .580 .732a .619 .536 .544 .235 .108 .302
rq5 .588 .635 .644 .619 .649a .610 .614 .477 .397 .503
rq6 .590 .591 .616 .536 .610 .588a .591 .519 .460 .530
rq7 .591 .596 .620 .544 .614 .591 .594a .517 .456 .529
rq8 .611 .441 .526 .235 .477 .519 .517 .694a .714 .647
rq9 .580 .357 .459 .108 .397 .460 .456 .714 .762a .649
rq10 .600 .471 .542 .302 .503 .530 .529 .647 .649 .614a
Residualb rq1 .042 -.033 -.027 -.074 -.009 -.094 -.056 -.026 -.119
rq2 .042 -.011 -.104 -.147 -.047 -.053 -.039 .046 -.070
rq3 -.033 -.011 -.061 -.077 -.080 -.048 -.045 .025 -.046
rq4 -.027 -.104 -.061 -.038 -.106 -.094 .101 .066 .055
rq5 -.074 -.147 -.077 -.038 -.033 -.037 -.011 -.014 .071
rq6 -.009 -.047 -.080 -.106 -.033 -.016 -.009 -.042 -.093
rq7 -.094 -.053 -.048 -.094 -.037 -.016 -.058 -.014 -.008
rq8 -.056 -.039 -.045 .101 -.011 -.009 -.058 -.129 -.045
rq9 -.026 .046 .025 .066 -.014 -.042 -.014 -.129 -.120
rq10 -.119 -.070 -.046 .055 .071 -.093 -.008 -.045 -.120
The upper half of Table 10.14 above presents the reproduced correlations, i.e. the correlations implied by the two-factor solution. Compare these with the lower half of the table, which presents the residuals (observed minus reproduced correlations). Less than half of the residuals (42%) are greater than 0.05.
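Each reproduced correlation is the sum, over the retained components, of the products of the two variables' loadings, and each residual is the observed correlation minus the reproduced one. A sketch for the rq1-rq2 pair, using the loadings from the Component Matrix:

```python
import numpy as np

# Loadings of rq1 and rq2 on the two components (Component Matrix).
L = np.array([[0.785,  0.099],   # rq1
              [0.748, -0.253]])  # rq2

# Reproduced correlation: sum over components of loading products.
reproduced = L[0] @ L[1]
print(round(reproduced, 3))      # 0.562, as in the reproduced matrix

# Residual = observed correlation - reproduced correlation.
observed = 0.604                 # rq1-rq2 correlation, SPSS output
print(round(observed - reproduced, 3))  # 0.042, as in the residuals
```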
10.5 RELIABILITY
In many areas of educational and psychological research, the precise
measurement of various variables or theoretical constructs poses a challenge. For
example, the precise measurement of personality variables or attitudes is usually
a necessary first step before any theories of personality or attitudes can be
considered. In general, unreliable measurements of people's beliefs or intentions
will obviously hamper efforts to predict their behaviour. Reliability analysis is often used to statistically check the reliability of an instrument. Reliability is a measure of the consistency of a particular instrument: the "capability" of the instrument to produce consistently similar results if it were administered to a homogeneous group of respondents. Generally, there are four classes of reliability estimates: inter-rater (or inter-observer) reliability, test-retest reliability, parallel-form reliability, and internal consistency. Inter-rater or inter-observer reliability is used to assess the degree to which two different observers agree in describing a phenomenon. It is widely used in establishing reliability
for open-ended questions. Test-retest, parallel-form and internal-consistency reliability are mainly used to assess the reliability of fixed-response items. Test-retest reliability measures the consistency of a measure from one time to another, while parallel-form reliability measures the consistency of two tests constructed from the same content domain.
α = [k / (k − 1)] × [1 − (Σᵢ₌₁ᵏ Sᵢ²) / S²sum]
where
Sᵢ² = variance of item i (for each of the k items)
S²sum = variance of the sum of all items
• If there is no true score but only random error in the items (uncorrelated across items), then Σ Sᵢ² = S²sum and α = 0
• If all items measure the same thing (true score), then α = 1
• Nunnally (1978) suggests an α > 0.7
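The formula can be sketched directly. The function below follows the definition above; the scores are hypothetical and serve only to illustrate the calculation.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n respondents x k items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)   # S_i^2, one per item
    total_variance = items.sum(axis=1).var(ddof=1)  # S^2_sum
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical scores: 6 respondents x 4 items (illustrative only).
scores = [[4, 5, 4, 4],
          [2, 3, 2, 3],
          [5, 5, 4, 5],
          [3, 3, 3, 2],
          [4, 4, 5, 4],
          [1, 2, 2, 1]]
print(round(cronbach_alpha(scores), 3))  # 0.959
```

Because these hypothetical items move together closely, alpha is well above Nunnally's 0.7 threshold.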
Example
A researcher gave a 10-item questionnaire on Emotional Intelligence to a sample of randomly selected secondary school students. The aim is to determine the internal consistency of the scale using Cronbach's alpha. Table 10.16 below shows the SPSS output.
Item-Total Statistics
        Scale Mean if   Scale Variance     Corrected Item-     Squared Multiple   Cronbach's Alpha
        Item Deleted    if Item Deleted    Total Correlation   Correlation        if Item Deleted
rq1 41.89 63.948 .718 .560 .895
rq2 41.78 64.915 .676 .533 .897
rq3 41.89 64.380 .731 .555 .894
rq4 42.24 65.499 .560 .458 .905
rq5 42.19 62.074 .713 .573 .895
rq6 42.14 63.800 .692 .516 .896
rq7 42.00 63.202 .696 .508 .896
rq8 41.83 64.745 .654 .521 .899
rq9 41.93 66.185 .583 .491 .903
rq10 41.97 64.849 .658 .517 .898
• Among the required assumptions for factor analysis are a large sample, normality (not for PCA), linear relationships among variables, absence of outliers, and no multicollinearity.
• Factor loading is the correlation between a variable and a factor that has been
extracted from the data.
• There are four classes of reliability estimates. They are inter-rater or inter-
observer reliability, test-retest reliability, parallel-form reliability, and internal
consistency.