Topic 10: Factor Analysis and Reliability
LEARNING OUTCOMES
By the end of this topic, you should be able to:
1. Describe the requirements for factor analysis for a given data set;
2. Use the appropriate method to determine the principal components
underpinning the responses on a set of variables; and
3. Compute the reliability index.

INTRODUCTION
Factor Analysis is used to uncover the latent structures (dimensions) of a set of
variables. It belongs to a family of data reduction methods that also includes
Latent Class Analysis, Latent Profile Analysis, Latent Trait Analysis, and
Principal Component Analysis (PCA). The focus of Principal Component
Analysis is to reduce the number of variables into a smaller set of principal
components (dimensions). It allows researchers to use a smaller number of
'factors' to explain what the long list of 'variables' actually measures. PCA is
normally used to reduce a large number of variables to a smaller number of
factors. Prior to Multiple Regression analysis, factor analysis has been used to
create a set of factors to be treated as uncorrelated variables, as one approach to
handling multi-collinearity. Factor analysis is an interdependency technique; it
aims to find the latent factors that account for the patterns of collinearity among
multiple metric variables. Some statisticians do not consider PCA to be factor
analysis.

Salient features of Principal Component Analysis:


(a) It is a variable reduction procedure.
(b) The main purpose is to reduce the number of variables into a smaller set of
principal components (dimensions).
(c) It is a large sample procedure where the focus is only on summarising the
sample information into a smaller set of "principal components" as
opposed to detecting the "latent factors" that influence the scores on the
observed variables.

The following are the assumptions for Factor Analysis:


(a) A large enough sample to yield reliable estimates of the correlations among
the variables (according to Hair et al., at least 5 respondents per item in the
scale are preferred).
(b) Statistical inference is improved if the variables are multivariate normal (not
required for PCA).
(c) Relationships among the pairs of variables are linear.
(d) Absence of outliers among the cases.
(e) Some degree of collinearity among the variables, but not an extreme degree
or singularity among the variables (according to Kline (1998), the
correlation values between the variables should fall between 0.3 and 0.8).

10.1 ILLUSTRATING THE INTER-DEPENDENCY BETWEEN VARIABLES
A teacher wanted to gauge the Emotional Intelligence of the Form Five students
of his school. Based on his readings, he drafted a nine-item questionnaire.
Respondents are required to provide their ratings on a five-point Likert Scale. He
administered the questionnaire to a group of Form Five students and ran a
simple correlation analysis.

The following Table 10.1 shows the results:

Table 10.1: Correlation Analysis Output

        X1    X2    X3    X4    X5    X6    X7    X8    X9
X1    1.00  0.75  0.70  0.65  0.01  0.20  0.18  0.16  0.03
X2          1.00  0.63  0.65  0.08  0.11  0.13  0.04  0.09
X3                1.00  0.74  0.02  0.12  0.07  0.15  0.05
X4                      1.00  0.01  0.11  0.06  0.02  0.13
X5                            1.00  0.73  0.72  0.65  0.83
X6                                  1.00  0.71  0.79  0.72
X7                                        1.00  0.95  0.75
X8                                              1.00  0.73
X9                                                    1.00

Basic Principle:
Variables that significantly correlate with each other do so because they are
measuring the same "construct".

The Problem:
What is the "construct" that brings the variables together?

The interpretation of Table 10.1:


(a) Variables 1, 2, 3 & 4 correlate highly with each other, but not with the rest of
the variables.
(b) Variables 5, 6, 7, 8 & 9 correlate highly with each other, but not with the rest
of the variables.
(c) The nine variables seem to be measuring TWO "constructs" or underlying
factors.

To find out the answer, we need to carry out factor analysis or, more precisely,
Principal Component Analysis.

10.2 THE FUNDAMENTALS OF FACTOR ANALYSIS (PRINCIPAL COMPONENT ANALYSIS)
The purpose of factor analysis is to reduce multiple variables to a lesser number
of underlying factors that are being measured by the variables.

10.2.1 The Mathematics behind Data Reduction


Let's say we have two observable variables (X and Y); if we would like to see
whether these two variables can be represented by a single variable, the following
steps should be followed.

Step 1:
Collect data on the two variables (X and Y) from a group of respondents (let's say
we use 10 respondents).

Table 10.2 shows the example of scores for the above data collected:

Table 10.2: Data on Two Variables


X Y Standardised X Standardised Y
5 4 0.52 -0.07
4 5 -0.22 0.66
6 5 1.27 0.66
3 2 -0.97 -1.53
4 4 -0.22 -0.07
3 2 -0.97 -1.53
5 6 0.52 1.39
6 5 1.27 0.66
2 3 -1.72 -0.80
5 5 0.52 0.66

Step 2:
Determine the variance-covariance matrix for the two variables. The formulas
below define the variance and covariance.

Variance: $\mathrm{Cov}(X, X) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X})}{n - 1}$; using the standardised values, Cov(X, X) = 1.00

Variance: $\mathrm{Cov}(Y, Y) = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})(Y_i - \bar{Y})}{n - 1}$; using the standardised values, Cov(Y, Y) = 1.00

Covariance: $\mathrm{Cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} = 0.77$

Variance-covariance matrix: $\begin{pmatrix} \mathrm{Cov}(x,x) & \mathrm{Cov}(x,y) \\ \mathrm{Cov}(y,x) & \mathrm{Cov}(y,y) \end{pmatrix}$; if the X and Y scores are
transformed into standardised scores, the variance-covariance matrix gives us the
correlation matrix.

Thus, for the above example, the variance-covariance matrix for the standardised
values of X and Y is $\begin{pmatrix} 1.00 & 0.77 \\ 0.77 & 1.00 \end{pmatrix}$

Step 3:
Calculate the eigenvectors and eigenvalues of the covariance matrix:

$\begin{pmatrix} \mathrm{Cov}(x,x) & \mathrm{Cov}(x,y) \\ \mathrm{Cov}(y,x) & \mathrm{Cov}(y,y) \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \lambda \begin{pmatrix} a \\ b \end{pmatrix}$

where the matrix on the left is the variance-covariance matrix, $(a, b)^{T}$ is an
eigenvector, and $\lambda$ is the corresponding eigenvalue.

The number of eigenvalues depends on the number of variables in the analysis.
In general, if there are n variables in the analysis, there will be n eigenvalues.
The eigenvalues will generally differ in magnitude, but (for a correlation matrix)
their total is equal to the number of variables in the analysis.

Each eigenvalue will have its corresponding eigenvector. The computation of


eigenvalues and eigenvectors involves complicated mathematical procedures
especially if the number of variables in the analysis is large. Any computer
software that performs the Principal Component Analysis will compute the
eigenvalues (while some programmes will also provide the eigenvectors).

For the example above, the eigenvalues are:

$\lambda_1 = 1 + r_{12}$ and $\lambda_2 = 1 - r_{12}$

(where $r_{12}$ is the correlation between the two variables; in this case, the
correlation value is 0.77)

Thus, the eigenvalues are 1.77 and 0.23.
Each eigenvalue has a corresponding eigenvector. For the correlation matrix above:

When the eigenvalue is 1.77, the eigenvector is $(1, 1)^{T}$

When the eigenvalue is 0.23, the eigenvector is $(1, -1)^{T}$

Step 4:
Plot the standardised values on a two-dimensional plane and overlay the two
eigenvectors.

[Scatter plot of the standardised X and Y scores with the two eigenvectors
overlaid; the points cluster along the direction of the eigenvector associated with
the larger eigenvalue]

Figure 10.1: Plotting on a two-dimensional plane and overlaying the eigenvectors

From the plot shown in Figure 10.1, it can be concluded that the data set is fairly
well represented by the eigenvector derived when the eigenvalue is 1.77.

The above discussion is just for illustrative purposes. In real situations, there will
be more than two "observed" variables and thus, a visual (e.g. graphical)
representation is not possible.

10.2.2 Types of Factor Analysis


Basically, there are two types of factor analysis:
(a) Exploratory factor analysis
It is a non-theoretical application. The aim is to answer the question: "Given
a set of variables, what are the underlying dimensions (factors) that account
for the patterns of collinearity among the variables?"

Example: Respondents' responses on a scale measuring delinquency are not
governed by any predetermined theory; what, then, are the latent factors
that influence their behaviour?

(b) Confirmatory factor analysis
It is used to validate a predetermined theory. The aim is to answer the question:
"Do the responses on a scale conform with the theory that explains
respondents' behaviour?"

Example: Given a theory that attributes delinquency to four independent
factors, do respondents' responses on a scale that measures delinquency
converge into these four factors?

10.3 THE LOGIC OF FACTOR ANALYSIS (PRINCIPAL COMPONENT ANALYSIS)
In studying the Emotional Intelligence of teachers, a researcher used focus group
interviews in developing the instrument for his study. The following items (the
data below are attached in Appendix II, Data Set B) were generated based on focus
group interviews with selected teachers from the Klang Valley.

1 It is difficult for me to face unpleasant situations.


2 I am able to face challenges pretty well.
3 I am able to deal with upsetting problems.
4 I find it difficult to control my anxiety.
5 I am able to keep calm in difficult situations.
6 I can handle stress without getting too nervous.
7 I am usually calm when facing challenging situations.
8 I am motivated to continue, even when things get difficult.
9 Whatever the situation, I believe I can handle it well.
10 I am optimistic about most things I do.

11 I am sure of what I am doing in most situations.


12 I believe things will turn out all right despite setbacks from time to time.
13 I believe in my ability to handle upsetting problems/situations.
14 If others can do it, I don't see why I can't.
15 I feel good about myself.
16 I feel that I am not inferior compared with others.
17 I feel confident of myself in most situations.
18 I have good self respect.
19 I am happy with what I am now.
20 It is fairly easy for me to express my feelings.
21 I am aware of what is happening around me even when I am upset.
22 I am aware of the way I feel.
23 It is difficult for me to describe my feelings.

The researcher developed a questionnaire to assess the Emotional Intelligence of
teachers using the items generated from the focus group interviews. He used a
7-point Likert Scale for his questionnaire. The following is the description of the
Likert Scale:

[1 = Strongly Disagree; 2 = Disagree; 3 = Slightly Disagree; 4 = Not Sure;
5 = Slightly Agree; 6 = Agree; 7 = Strongly Agree]

He administered the questionnaire to 176 randomly selected teachers from both
private and public schools in the Klang Valley. Table 10.3 shows a sample of the
responses.

Table 10.3: Sample of Respondents' Responses

Subject                    Variable
          X1   X2   X3   X4   X5   X6   ...   Xn
1          6    5    7    3    4    4   ...
2          5    7    4    4    4    3   ...
3          7    5    6    2    5    4   ...
...       ...  ...  ...  ...  ...  ...  ...
N      Mean X1  Mean X2  Mean X3  Mean X4  Mean X5  Mean X6  ...  Mean Xn

Having run the correlation analysis, the researcher found that some of the items
have high correlations with one another while others do not. Table 10.4 shows an
example of the correlation analysis.

Table 10.4: Sample of Inter-Correlation Values between Variables

        X1    X2    X3    X4    X5    X6   ...   Xk
X1    1.00  0.76  0.84                     ...
X2          1.00  0.76                     ...
X3                1.00                     ...
X4                      1.00  0.76  0.77   ...
X5                            1.00  0.81
X6                                  1.00
...
Xk                                         1.00

The next logical step is to cluster the variables with high inter-correlations
together and define them as belonging to the same family. This is what factor
analysis (or, more precisely, Principal Component Analysis) is all about. Table 10.5
displays an example of a factor analysis outcome. The values in the cells are the
factor loadings (refer to Subsection 10.3.1 for further explanation of factor
loadings).

Table 10.5: Sample of Factor Analysis Outcome

Variables   Factor I   Factor II   Factor III   ...   Factor n
X1            0.932      0.013       0.250
X2            0.851      0.426       0.211
X3            0.634      0.451       0.231
X4            0.322      0.644       0.293
X5            0.725      0.714       0.293
X6            0.435      0.641       0.332
X7            0.322      0.311       0.677
X8            0.211      0.233       0.771
...           ...        ...         ...
Xk            0.122      0.110       0.200

10.3.1 Factor Loading


What is a Factor Loading?
A factor loading is the correlation between a variable and a factor that has been
extracted from the data.

Example
Note the factor loadings for variable X1.

Variables Factor I Factor II Factor III


X1 0.932 0.013 0.250

Variable X1 is highly correlated with Factor I, but negligibly correlated with


Factors II and III.

Communality: Refers to the total variance in variable X1 accounted for by the


three factors that were extracted.

Simply square the factor loadings and add them together:

(0.932² + 0.013² + 0.250²) = 0.93129

As such, the initial communality for the variables before extracting the factors is
always 1.00. In the above example, emotional intelligence is operationalised
using 23 specific situations and the initial factors will be 23, with some having
greater dominance than the others (this will be reflected in the eigenvalues).

Once the dominant factors are identified (e.g. those with eigenvalue greater than
1.00), the communality value for each variable will be less than 1.00. This is
because in factor analysis, those factors that have negligible effect on the variables
will be dropped.
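The communality arithmetic above can be checked in a few lines of plain Python (a sketch; the loadings are the example values for variable X1):

```python
# Factor loadings of variable X1 on the three extracted factors (from the example)
loadings_x1 = [0.932, 0.013, 0.250]

# Communality = sum of squared loadings = proportion of X1's variance
# accounted for by the extracted factors
communality = sum(b ** 2 for b in loadings_x1)
print(round(communality, 5))   # 0.93129
```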

10.4 STEPS IN FACTOR ANALYSIS (PRINCIPAL COMPONENT ANALYSIS)

There are a few crucial steps to be followed in factor analysis or, to put it more
precisely, Principal Component Analysis:

Step 1:
Compute a k × k inter-correlation matrix. According to Hair et al., inter-correlation
values must be at least 0.3 for the items to be considered for factor analysis.

Step 2
Extract an initial solution.

Step 3
Determine the appropriate number of factors to be extracted in the final solution.

Step 4
Rotate the factors to clarify the factor pattern in order to better interpret the
nature of the factors if necessary.

Step 5
Establish the measures of goodness-of-fit of the factor solution

A Ten-Variable Example

The researcher used the responses on the first 10 questions of the Emotional
Intelligence questionnaire to perform factor analysis. Table 10.6 below shows
the codes and variable names for the variables included in the factor analysis.

Table 10.6: Codes and Variable Names


Code Variable Name
rq1 It is difficult for me to face unpleasant situations.
rq2 I am able to face challenges pretty well.
rq3 I am able to deal with upsetting problems.
rq4 I find it difficult to control my anxiety.
rq5 I am able to keep calm in difficult situations.
rq6 I can handle stress without getting too nervous.
rq7 I am usually calm when facing challenging situations.
rq8 I am motivated to continue, even when things get difficult.
rq9 Whatever the situation, I believe I can handle it well.
rq10 I am optimistic about most things I do.

The principal components can be illustrated as follows. Each component is
formed from all ten observed variables (X1 to X10):

C1 = b11(X1) + b12(X2) + ... + b1,10(X10)

C2 = b21(X1) + b22(X2) + ... + b2,10(X10)

where:

Ci = factor score on Component i

bij = regression weight of variable j on component i (also known as the factor
loading)

Xj = respondent's score on observed variable j

In the original diagrams, thick arrows from the variables to a component denote
strong regression weights (large factor loadings), while thin arrows denote weak
regression weights (small factor loadings).


All the observed variables will have some influence on all the factors extracted;
however, different sets of variables will have different degrees of influence on the
different common factors.

In short, a principal component is a linear combination of optimally weighted
observed variables. The weighting is done in such a way that each component, in
turn, accounts for the maximum possible amount of variance in the data set.
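This "optimally weighted linear combination" can be sketched with NumPy on simulated data (both NumPy and the simulated scores are assumptions here; the module's own data are not used). The weights for the first component are the entries of the leading eigenvector, and the variance of the resulting component scores equals the largest eigenvalue, which is what "maximises the variance" means in practice:

```python
import numpy as np

# Simulated standardised responses: 100 respondents x 10 items
rng = np.random.default_rng(42)
Z = rng.standard_normal((100, 10))
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0, ddof=1)

# Inter-correlation matrix and its eigen-decomposition
R = np.cov(Z, rowvar=False, ddof=1)
eigenvalues, eigenvectors = np.linalg.eigh(R)

# First principal component = optimally weighted sum of the variables
w1 = eigenvectors[:, -1]    # weights: the leading (unit-length) eigenvector
C1 = Z @ w1                 # component scores, one per respondent

# Its variance equals the largest eigenvalue; no other unit-length
# weighting can capture more variance
print(np.isclose(np.var(C1, ddof=1), eigenvalues[-1]))   # True
```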

Figure 10.2 summarises the requirements and assumptions for principal
component analysis:

(a) Purpose: summarise the original information into a minimal number of
factors.
(b) Measurement level of the principal component "factor": continuous.
(c) Parameter for analysis: total variance.
(d) Assumption of normality: none; it is a large sample procedure and no
generalisation is involved.
(e) Type of analysis: Principal Component Analysis.
(f) Type of rotation: oblique if the principal components correlate; orthogonal
if they do not.

Figure 10.2: Requirements and assumptions for principal component analysis

Extracting the principal components from the list of observed variables is an


iterative procedure that requires one to check for the assumptions along the
process until the final conclusion is made. The procedural map in Appendix VI
summarises the procedure and assumptions required for PCA with orthogonal
rotation.

10.4.1 Correlation between Variables


As a first step, correlations between the variables are computed. Table 10.7 shows
the values of the correlations between the variables. The shaded cells represent the
diagonal, while the values below and above the diagonal are the correlation
values between the variables. Since every variable has a correlation greater than
0.3 with at least one other variable, all 10 variables are factorable. At the same
time, the values are not too high (not more than 0.85) and, as such, each variable
is distinct from the others.

Table 10.7: Inter-Correlation among the Variables

Correlation Matrixa

rq1 rq2 rq3 rq4 rq5 rq6 rq7 rq8 rq9 rq10

Correlation rq1 1.000 .604 .578 .419 .514 .580 .497 .555 .554 .481

rq2 .604 1.000 .615 .518 .488 .545 .543 .402 .402 .401

rq3 .578 .615 1.000 .519 .567 .536 .572 .481 .484 .496

rq4 .419 .518 .519 1.000 .581 .430 .450 .336 .174 .357

rq5 .514 .488 .567 .581 1.000 .577 .577 .466 .382 .574

rq6 .580 .545 .536 .430 .577 1.000 .575 .510 .417 .437

rq7 .497 .543 .572 .450 .577 .575 1.000 .459 .442 .521

rq8 .555 .402 .481 .336 .466 .510 .459 1.000 .585 .602

rq9 .554 .402 .484 .174 .382 .417 .442 .585 1.000 .529

rq10 .481 .401 .496 .357 .574 .437 .521 .602 .529 1.000

a. Determinant = 0.005

There is more evidence of factorability:

(a) Bartlett's Test of Sphericity
Table 10.8 shows the inter-correlation matrix of a set of totally uncorrelated
variables, which is an identity matrix.

Table 10.8: Intercorrelation of an Identity Matrix


X1 X2 X3 X4 X5
X1 1.00 0.00 0.00 0.00 0.00
X2 1.00 0.00 0.00 0.00
X3 1.00 0.00 0.00
X4 1.00 0.00
X5 1.00

The variables are totally non-collinear. If this matrix were factor-analysed, it
would extract as many factors as variables, since each variable would be its
own factor. As such, it is totally non-factorable: the factor solution would be
exactly the same as the initial solution.

The determinant of an identity matrix is equal to one, while the determinant
of a non-identity matrix is some other value (different from one).
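A tiny NumPy sketch illustrates this point (the 0.9 correlation is an arbitrary illustrative value): the determinant of an identity correlation matrix is exactly 1, and it shrinks towards 0 as the variables become more collinear.

```python
import numpy as np

# Identity inter-correlation matrix: five totally uncorrelated variables
identity = np.eye(5)
print(np.linalg.det(identity))    # 1.0

# Two highly collinear variables (r = 0.9)
collinear = np.array([[1.0, 0.9],
                      [0.9, 1.0]])
print(np.linalg.det(collinear))   # approx. 0.19 (= 1 - 0.9**2)
```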

Bartlett's Test of Sphericity calculates the determinant of the matrix of the


sums of products and cross-products (S) from which the inter-correlation
matrix is derived. The determinant of the matrix S is converted to a chi-
square statistic and tested for significance.

Null Hypothesis: The inter-correlation matrix of the variables is not


different from an identity matrix.

Alternate Hypothesis: The inter-correlation matrix of the variables is


different from an identity matrix.

Table 10.9 shows the sample results:

Table 10.9: Sample Results of Bartlett's Test of Sphericity


KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy. 0.914
Bartlett's Test of Sphericity Approx. Chi-Square 887.955
df 45
Sig. .000
Test Results:
χ² = 887.955; df = 45; p < 0.0001

Statistical Decision:
The inter-correlation matrix of the variables is significantly different from
an identity matrix. In other words, the sample inter-correlation matrix did
not come from a population in which the inter-correlation matrix is an
identity matrix.
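Bartlett's statistic can be reproduced from the correlation matrix in Table 10.7 itself. The sketch below assumes NumPy and SciPy and uses the standard formula χ² = −[(n − 1) − (2p + 5)/6]·ln|R|, with n = 176 respondents and p = 10 variables; because the printed matrix is rounded to three decimals, the result only approximates the SPSS value of 887.955:

```python
import numpy as np
from scipy.stats import chi2

# Inter-correlation matrix from Table 10.7 (rounded to 3 decimals)
R = np.array([
    [1.000, .604, .578, .419, .514, .580, .497, .555, .554, .481],
    [.604, 1.000, .615, .518, .488, .545, .543, .402, .402, .401],
    [.578, .615, 1.000, .519, .567, .536, .572, .481, .484, .496],
    [.419, .518, .519, 1.000, .581, .430, .450, .336, .174, .357],
    [.514, .488, .567, .581, 1.000, .577, .577, .466, .382, .574],
    [.580, .545, .536, .430, .577, 1.000, .575, .510, .417, .437],
    [.497, .543, .572, .450, .577, .575, 1.000, .459, .442, .521],
    [.555, .402, .481, .336, .466, .510, .459, 1.000, .585, .602],
    [.554, .402, .484, .174, .382, .417, .442, .585, 1.000, .529],
    [.481, .401, .496, .357, .574, .437, .521, .602, .529, 1.000]])

n, p = 176, 10
det_R = np.linalg.det(R)                                   # approx. 0.005
statistic = -((n - 1) - (2 * p + 5) / 6) * np.log(det_R)   # approx. 888
df = p * (p - 1) // 2                                      # 45
p_value = chi2.sf(statistic, df)                           # far below 0.0001
```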

(b) Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO)

If two variables share a common factor with the other variables, their partial
correlation (aij), controlling for the other variables, will be small, indicating
that they share little unique variance.

If aij ≈ 0.0, the variables are measuring a common factor, and KMO ≈ 1.0.

If aij ≈ 1.0, the variables are not measuring a common factor, and KMO ≈ 0.0.

Table 10.10 portrays the interpretation of the KMO as characterised by


Kaiser, Meyer, and Olkin:

Table 10.10: Degree of Common Variance

KMO Value Degree of Common Variance


0.90 to 1.00 Marvelous
0.80 to 0.89 Meritorious
0.70 to 0.79 Middling
0.60 to 0.69 Mediocre
0.50 to 0.59 Miserable
0.00 to 0.49 Not Appropriate for Factor Analysis
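The overall KMO can be computed directly from an inter-correlation matrix: the partial correlations come from the inverse of R, and KMO = Σr²ij / (Σr²ij + Σa²ij) over the off-diagonal cells. A NumPy sketch (the 4-variable matrix is an invented illustration, not the module's data):

```python
import numpy as np

def overall_kmo(R):
    """Kaiser-Meyer-Olkin measure of sampling adequacy for a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    inv = np.linalg.inv(R)
    # Partial correlations a_ij, controlling for all other variables
    d = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / d
    off = ~np.eye(R.shape[0], dtype=bool)
    r2 = (R[off] ** 2).sum()        # sum of squared correlations
    a2 = (partial[off] ** 2).sum()  # sum of squared partial correlations
    return r2 / (r2 + a2)

# Invented example: four items that all correlate 0.5 with each other
R = np.full((4, 4), 0.5)
np.fill_diagonal(R, 1.0)
print(round(overall_kmo(R), 2))   # 0.8
```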
As characterised by Kaiser, Meyer, and Olkin, the KMO result for the present
analysis is shown in Table 10.11 below.

Table 10.11: KMO and Bartlett's Test

KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy. 0.914


Bartlett's Test of Sphericity Approx. Chi-Square 887.955
df 45
Sig. .000

The KMO = 0.914

Interpretation
The degree of common variance among the ten variables is marvellous.

If a factor analysis is conducted, the factors extracted will account for a


substantial amount of variance.

10.4.2 Extracting an Initial Solution


A variety of methods have been developed to extract factors from an inter-
correlation matrix. SPSS Statistics offers the following methods:
(i) Principal components
(ii) Unweighted least-squares
(iii) Generalised least squares
(iv) Maximum likelihood
(v) Principal axis factoring
(vi) Alpha factoring
(vii) Image factoring

Note: In this module, we will only focus on the Principal Component Method.

Communality is the proportion of variance of a particular variable (item in the


questionnaire) that is due to common factors. In the initial solution, each variable
(item) is considered as a single factor, as such, the communality for the initial
solution is 1.00. After extraction, the number of factors will be reduced and each
initial factor (item) now belongs to 'new factors', and the new factors explain a
certain proportion of the variance in the variable. Thus, the proportion of
variance of each variable (item) explained by the new factors is less than 1.00
(refer to Table 10.12).

Table 10.12: Communalities

Communalities

Initial Extraction

rq1 1.000 .626


rq2 1.000 .623
rq3 1.000 .647
rq4 1.000 .732
rq5 1.000 .649
rq6 1.000 .588
rq7 1.000 .594
rq8 1.000 .694
rq9 1.000 .762
rq10 1.000 .614

The variance of each variable is 1.0, so the total variance to be explained is 10 (10
variables, each with a variance of 1.0). Since a single variable can account for 1.0
unit of variance, a useful 'new factor' must account for more than 1.0 unit of
variance, or have an eigenvalue (λ) greater than 1.0. Otherwise, the factor
extracted (new factor) explains less variance than a single variable. Table 10.13
shows the results of the factor analysis of the 10 items.
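The eigenvalue > 1.0 rule (the Kaiser criterion) is easy to apply in code. A plain Python sketch using the eigenvalues reported in Table 10.13:

```python
# Eigenvalues of the 10-item correlation matrix (Table 10.13)
eigenvalues = [5.489, 1.041, 0.691, 0.539, 0.506, 0.395,
               0.383, 0.359, 0.320, 0.278]

# Kaiser criterion: keep components whose eigenvalue exceeds 1.0
retained = [e for e in eigenvalues if e > 1.0]
print(len(retained))               # 2 components retained

# Proportion of the total variance (10 units) explained by the retained components
explained = sum(retained) / len(eigenvalues)
print(round(100 * explained, 2))   # 65.3 per cent
```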

10.4.3 Determine the Appropriate Number of Factors to be Extracted in the Final Solution
Table 10.13: The Results of Factor Analysis

                                       Extraction Sums of       Rotation Sums of
             Initial Eigenvalues       Squared Loadings         Squared Loadings
             Total   % of     Cum.     Total   % of     Cum.    Total   % of     Cum.
Component            Variance %                Variance %               Variance %
1            5.489   54.888   54.888   5.489   54.888   54.888  3.515   35.152   35.152
2            1.041   10.406   65.294   1.041   10.406   65.294  3.014   30.143   65.294
3             .691    6.910   72.205
4             .539    5.387   77.592
5             .506    5.056   82.648
6             .395    3.948   86.596
7             .383    3.830   90.426
8             .359    3.590   94.017
9             .320    3.201   97.218
10            .278    2.782  100.000

Extraction Method: Principal Component Analysis.

Referring to the above Table 10.13, the results of the initial solution:

Interpretation
10 factors (components) were extracted, the same as the number of variables
factored:

(a) Factor I
The 1st factor has an eigenvalue = 5.489. The value is greater than 1.0, as
such, it explains more variance than a single variable, in fact 5.489 times as
much.

The percentage of variance explained by Factor I is:

(5.489 / 10 units of variance) × 100 = 54.89%

(b) Factor II
The 2nd factor has an eigenvalue = 1.041. It is also a value greater than 1.0,
and therefore, explains more variance than a single variable.

The percentage of variance explained by Factor II is:

(1.041 / 10 units of variance) × 100 = 10.41%
(c) Subsequent factors
The subsequent factors (3 through 10) have eigenvalues less than 1.0 and, as
such, explain less variance than a single variable. These are not 'good'
factors.

The Key Points

• The eigenvalues associated with the factors (components) sum to 10
(i.e. 5.489 + 1.041 + 0.691 + 0.539 + … + 0.278 = 10).
• The cumulative percentage of variance explained by the first two factors
is 65.29%.
• In other words, 65.29% of the common variance shared by the 10
variables can be accounted for by the two factors.
• This initial solution suggests that the final solution should extract not
more than 2 factors.

In determining the appropriate number of factors to be extracted in the final
solution, there are two more important elements to be addressed:

(a) Cattell's Scree Plot


Another way to determine the number of factors to extract in the final
solution is via Cattell's Scree plot (refer to Figure 10.3). This is a plot of the
eigenvalues associated with each of the factors extracted, against each
factor. At the point that the plot begins to level off, the additional factors
explain less variance than a single variable.
[Plot of the eigenvalues against the component number; the curve levels off after
the second component]

Figure 10.3: Cattell's Scree Plot
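A scree plot is straightforward to draw from the eigenvalues in Table 10.13. A sketch using matplotlib (assumed available; the module itself produces this plot with SPSS):

```python
import matplotlib
matplotlib.use("Agg")   # render without a display
import matplotlib.pyplot as plt

# Eigenvalues from Table 10.13, already in descending order
eigenvalues = [5.489, 1.041, 0.691, 0.539, 0.506, 0.395,
               0.383, 0.359, 0.320, 0.278]

components = range(1, len(eigenvalues) + 1)
plt.plot(components, eigenvalues, "o-")
plt.axhline(1.0, linestyle="--", color="grey")   # Kaiser criterion line
plt.xlabel("Component number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.savefig("scree_plot.png")
```

The curve levels off after the second component, which agrees with the eigenvalue > 1.0 rule.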

(b) Factor Loadings

The component matrix indicates the correlation of each variable with each factor.

Component Matrixᵃ

         Component
           1       2
rq1      .785    .099
rq2      .748   -.253
rq3      .795   -.127
rq4      .640   -.567
rq5      .776   -.216
rq6      .762   -.086
rq7      .765   -.096
rq8      .727    .406
rq9      .667    .563
rq10     .728    .289

Extraction Method: Principal Component Analysis.
a. 2 components extracted.

Explanation: the variable rq1 correlates 0.785 with Factor I and 0.099 with
Factor II.

The total proportion of the variance in rq1 explained by the two factors is:
(0.785² + 0.099²) = 0.626

This is called the communality of the variable rq1.

The communalities of the 10 variables are as follows (cf. the column headed
"Extraction"). Note that the proportion of variance in each variable accounted for
by the two factors is not the same.

Communalities

         Initial   Extraction
rq1       1.000      .626
rq2       1.000      .623
rq3       1.000      .647
rq4       1.000      .732
rq5       1.000      .649
rq6       1.000      .588
rq7       1.000      .594
rq8       1.000      .694
rq9       1.000      .762
rq10      1.000      .614

The key to determining what the factors measure is the factor loadings, shown in
the Component Matrix above.

Factor I
Each loading is the correlation coefficient between the variable and Factor I:

Variable   Factor Loading
rq1          .785
rq2          .748
rq3          .795
rq4          .640
rq5          .776
rq6          .762
rq7          .765
rq8          .727
rq9          .667
rq10         .728

Factor II
Each loading is the correlation coefficient between the variable and Factor II:

Variable   Factor Loading
rq1          .099
rq2         -.253
rq3         -.127
rq4         -.567
rq5         -.216
rq6         -.086
rq7         -.096
rq8          .406
rq9          .563
rq10         .289

10.4.4 Rotate the Factors to Clarify the Factor Pattern in Order to Better Interpret the Nature of the Factors

In many instances, one or more variables may load about the same on more than
one factor, making the interpretation of the factors ambiguous. Ideally, the
analyst would like to find that each variable loads high (close to 1.0) on one factor
and approximately zero on all the others. The factor pattern can be clarified by
"rotating" the factors in F-dimensional space. There are two types of rotation:

(a) Orthogonal Rotation: Preserves the independence of the factors,


geometrically they remain 90° apart.
(b) Oblique Rotation: Will produce factors that are not independent,
geometrically not 90° apart.
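Orthogonal (varimax) rotation can be sketched in NumPy using the classic iterative algorithm. This is plain varimax without the Kaiser normalization SPSS applies, so the rotated loadings will only approximate the SPSS values; being orthogonal, however, it preserves each variable's communality exactly:

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Rotate a loading matrix with the classic varimax criterion."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)        # accumulated rotation matrix (orthogonal)
    d = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - (gamma / p) * Lr @ np.diag((Lr ** 2).sum(axis=0))))
        R = u @ vt
        d_new = s.sum()
        if d != 0.0 and d_new / d < 1 + tol:
            break
        d = d_new
    return L @ R

# Unrotated component matrix for the ten items (Component Matrix)
L = np.array([[.785, .099], [.748, -.253], [.795, -.127], [.640, -.567],
              [.776, -.216], [.762, -.086], [.765, -.096], [.727, .406],
              [.667, .563], [.728, .289]])

rotated = varimax(L)
# Communalities (row sums of squared loadings) are unchanged by rotation
print(np.allclose((L ** 2).sum(axis=1), (rotated ** 2).sum(axis=1)))   # True
```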

Below is the comparison between the Component Matrix and the Rotated
Component Matrix (using Varimax rotation, an orthogonal type) for the ten
variables:

Component Matrixᵃ                     Rotated Component Matrixᵇ

         Component                             Component
           1       2                             1       2
rq1      .785    .099                 rq1      .519    .597
rq2      .748   -.253                 rq2      .726    .309
rq3      .795   -.127                 rq3      .677    .435
rq4      .640   -.567                 rq4      .855    .003
rq5      .776   -.216                 rq5      .723    .356
rq6      .762   -.086                 rq6      .625    .443
rq7      .765   -.096                 rq7      .634    .438
rq8      .727    .406                 rq8      .272    .788
rq9      .667    .563                 rq9      .123    .864
rq10     .728    .289                 rq10     .350    .701

Extraction Method (both): Principal Component Analysis.
Rotation Method (right): Varimax with Kaiser Normalization.
a. 2 components extracted.
b. Rotation converged in 3 iterations.

Reproduced Correlation Matrix

One measure of the goodness-of-fit is whether the factor solution can reproduce
the original inter-correlation matrix among the ten variables.

Table 10.14: Reproduced Correlations

Reproduced Correlations
rq1 rq2 rq3 rq4 rq5 rq6 rq7 rq8 rq9 rq10
Reproduced rq1 .626a .562 .611 .446 .588 .590 .591 .611 .580 .600
Correlation rq2 .562 .623a .626 .622 .635 .591 .596 .441 .357 .471
rq3 .611 .626 .647a .580 .644 .616 .620 .526 .459 .542
rq4 .446 .622 .580 .732a .619 .536 .544 .235 .108 .302
rq5 .588 .635 .644 .619 .649a .610 .614 .477 .397 .503
rq6 .590 .591 .616 .536 .610 .588a .591 .519 .460 .530
rq7 .591 .596 .620 .544 .614 .591 .594a .517 .456 .529
rq8 .611 .441 .526 .235 .477 .519 .517 .694a .714 .647
rq9 .580 .357 .459 .108 .397 .460 .456 .714 .762a .649
rq10 .600 .471 .542 .302 .503 .530 .529 .647 .649 .614a
Residualb rq1 .042 -.033 -.027 -.074 -.009 -.094 -.056 -.026 -.119
rq2 .042 -.011 -.104 -.147 -.047 -.053 -.039 .046 -.070
rq3 -.033 -.011 -.061 -.077 -.080 -.048 -.045 .025 -.046
rq4 -.027 -.104 -.061 -.038 -.106 -.094 .101 .066 .055
rq5 -.074 -.147 -.077 -.038 -.033 -.037 -.011 -.014 .071
rq6 -.009 -.047 -.080 -.106 -.033 -.016 -.009 -.042 -.093
rq7 -.094 -.053 -.048 -.094 -.037 -.016 -.058 -.014 -.008
rq8 -.056 -.039 -.045 .101 -.011 -.009 -.058 -.129 -.045
rq9 -.026 .046 .025 .066 -.014 -.042 -.014 -.129 -.120
rq10 -.119 -.070 -.046 .055 .071 -.093 -.008 -.045 -.120

Extraction Method: Principal Component Analysis.


a. Reproduced communalities
b. Residuals are computed between observed and reproduced correlations. There are 21
(46.0%) non-redundant residuals with absolute values greater than 0.05.

The upper half of Table 10.14 presents the correlations reproduced from the two-component solution (the superscript a marks the reproduced communalities on the diagonal). Compare these with the lower half of the table, which presents the residuals:

Residual = (observed correlation − reproduced correlation)

Fewer than half of the residuals (46%) have absolute values greater than 0.05, indicating that the two-component solution reproduces the observed correlations reasonably well.
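For an orthogonal solution, a reproduced correlation is simply the sum of the cross-products of the two variables' loadings. A quick check against Table 10.14, using the unrotated loadings of rq1 and rq4 from the Component Matrix:

```python
import numpy as np

# Unrotated loadings (Component Matrix): one row per variable
rq1 = np.array([.785,  .099])
rq4 = np.array([.640, -.567])

# Reproduced correlation = sum of cross-products of loadings
reproduced = rq1 @ rq4
print(round(reproduced, 3))  # 0.446, the rq1-rq4 entry in Table 10.14
```

Subtracting this reproduced value from the observed correlation gives the corresponding residual in the lower half of the table.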

10.4.5 Establish the Measures of Goodness-of-Fit of the Factor Solution

Table 10.15 shows the goodness-of-fit of the two-factor solution.

Table 10.15: Goodness-of-Fit of the Two-Factor Solution

Measure                    Value                  Interpretation
KMO                        0.914                  Marvelous
Bartlett's Test            χ² = 887.955;          The inter-correlation matrix provides
                           df = 45; p < 0.0001    evidence of the presence of common factors
Total Variance Explained   65.29%                 The two factors extracted can explain 65.29%
                                                  of the variance in the ten variables
Factor pattern             2 factors              The pattern is clear for two factors
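Bartlett's statistic can be computed directly from a correlation matrix. The sketch below is a generic implementation of the standard formula, not a reanalysis of the chapter's data; the sample size n = 300 and the identity matrix are illustrative placeholders.

```python
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(R, n):
    """Bartlett's test of sphericity: H0 says the correlation matrix is identity."""
    p = R.shape[0]
    statistic = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return statistic, df, chi2.sf(statistic, df)

# Placeholder inputs: an identity matrix (no inter-correlation at all) and n = 300
stat, df, p_value = bartlett_sphericity(np.eye(10), n=300)
print(stat, df, p_value)
```

For an identity matrix the determinant is 1, so the statistic is 0 and the null hypothesis cannot be rejected; by contrast, the chapter's χ² of 887.955 on 45 df is strong evidence of factorability.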

10.5 RELIABILITY
In many areas of educational and psychological research, the precise measurement of variables or theoretical constructs poses a challenge. For example, the precise measurement of personality variables or attitudes is usually a necessary first step before any theories of personality or attitudes can be considered. In general, unreliable measurements of people's beliefs or intentions will hamper efforts to predict their behaviour. Reliability analysis is often used to statistically check the reliability of an instrument. Reliability is the measure of consistency of a particular instrument: the "capability" of the instrument to produce consistently similar results if it were administered to a homogeneous group of respondents. Generally, there are four classes of reliability estimates: inter-rater (or inter-observer) reliability, test-retest reliability, parallel-form reliability, and internal consistency. Inter-rater (or inter-observer) reliability is used to assess the degree to which two different observers agree when describing the same phenomenon. This is widely used in establishing reliability

for open-ended questions. Test-retest, parallel-form and internal consistency reliability are mainly used to assess the reliability of fixed-response items. Test-retest reliability measures the consistency of the measure from one occasion to another, while parallel-form reliability measures the consistency of two tests constructed from the same content domain.

Internal consistency evaluates the consistency of the responses across the items within the instrument. It is reported as the Cronbach's alpha coefficient, which ranges from zero to one and is given by the formula:

α = (k / (k − 1)) × (1 − (Σ Si²) / S²sum), where the sum runs over the k items

where
Si² = variance of item i
S²sum = variance of the sum of all items (the total score)

• If there is no true score but only random error in the items (uncorrelated across items), then Σ Si² = S²sum and α = 0
• If all items measure the same thing (true score), then α = 1
• Nunnally (1978) suggests an α > 0.7
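The formula translates directly into code. The sketch below uses a small made-up data set purely for illustration (it is not the chapter's Emotional Intelligence data):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha; items is a 2-D array, rows = respondents, columns = items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # Si^2 for each item
    total_var = items.sum(axis=1).var(ddof=1)   # S_sum^2, variance of the total score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Illustrative scores: 5 respondents, 3 items
scores = np.array([
    [4, 5, 4],
    [3, 4, 3],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 4],
])
print(round(cronbach_alpha(scores), 3))  # ≈ 0.968
```

As a sanity check of the bullet points above, a matrix of identical columns (items that measure exactly the same thing) gives α = 1.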

10.5.1 Reliability using Cronbach’s Alpha


There are many different statistics for checking reliability, and one of the most commonly used is Cronbach's alpha, which is based on the average correlation of items within a test. Cronbach's alpha is the most common form of internal consistency reliability coefficient. By convention, a lenient cut-off of 0.60 is common in exploratory research; alpha should be at least 0.70 to retain an item in an "adequate" scale; and many researchers require a cut-off of 0.80 for a "good" scale.

SPSS STATISTICS Commands for Reliability Analysis

• Select the Analyze menu and click on Scale and then Reliability Analysis… to open the Reliability Analysis dialogue box.
• Select the variables or items you require, and click the right arrow to move them to the Items: box.
• Ensure that Alpha is displayed in the Model: box.
• Click on the Statistics… pushbutton to open the Reliability Analysis: Statistics sub-dialogue box.
• In the Descriptives for box, select the Scale and Scale if item deleted check boxes.
• In the Inter-Item box, select the Correlations check box.
• Click on Continue and then OK.

Example
A researcher gave a 10-item questionnaire on Emotional Intelligence to a sample of randomly selected secondary school students. The aim is to determine the internal consistency of the scale using Cronbach's alpha. Table 10.16 below shows the SPSS output.

Table 10.16: Item-Total Statistics

Item-Total Statistics
Corrected Squared Cronbach's
Scale Mean if Scale Variance Item-Total Multiple Alpha if Item
Item Deleted if Item Deleted Correlation Correlation Deleted
rq1 41.89 63.948 .718 .560 .895
rq2 41.78 64.915 .676 .533 .897
rq3 41.89 64.380 .731 .555 .894
rq4 42.24 65.499 .560 .458 .905
rq5 42.19 62.074 .713 .573 .895
rq6 42.14 63.800 .692 .516 .896
rq7 42.00 63.202 .696 .508 .896
rq8 41.83 64.745 .654 .521 .899
rq9 41.93 66.185 .583 .491 .903
rq10 41.97 64.849 .658 .517 .898

10.5.2 Interpretation of Cronbach's Alpha

Several parts of the output are interpreted as follows:

(a) Scale Mean If Item Deleted
This column tells us the average total score if the specific item is excluded from the scale. So, if rq1 is deleted, the mean of the total score on the remaining nine items will be 41.89.

(b) Corrected Item-Total Correlation
This column gives the Pearson correlation coefficient between the individual item and the sum of the scores on the remaining items. A low item-total correlation means that the item correlates weakly with the overall scale, and the researcher should consider dropping it. Note, however, that a scale with an acceptable Cronbach's alpha may still have one or more items with low item-total correlations. Items rq4 and rq9 are not very strong, in that they are less consistent with the rest of the scale: their correlations with the sum scale are 0.56 and 0.58 respectively, while all other items correlate at 0.65 or better.

(c) Cronbach's Alpha if Item Deleted
This column gives the alpha coefficient that would result if the item were removed from the attitude scale. The researcher may wish to drop items whose value in this column exceeds the overall alpha, as another way to improve the alpha level.
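"Alpha if Item Deleted" is simply Cronbach's alpha recomputed on the remaining k − 1 items. In the made-up data below, the fourth item is deliberately inconsistent with the others, so deleting it raises alpha sharply:

```python
import numpy as np

def cronbach_alpha(items):
    k = items.shape[1]
    return (k / (k - 1)) * (1 - items.var(axis=0, ddof=1).sum()
                            / items.sum(axis=1).var(ddof=1))

def alpha_if_deleted(items):
    """Cronbach's alpha recomputed with each item removed in turn."""
    return np.array([
        cronbach_alpha(np.delete(items, j, axis=1))
        for j in range(items.shape[1])
    ])

# Illustrative scores; the fourth item runs against the other three
scores = np.array([
    [4, 5, 4, 2],
    [3, 4, 3, 5],
    [5, 5, 5, 1],
    [2, 3, 2, 4],
    [4, 4, 4, 3],
], dtype=float)

# A value well above the overall alpha flags an item worth dropping
print(cronbach_alpha(scores), alpha_if_deleted(scores))
```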

(d) Cronbach's Alpha
The Cronbach's alpha for the overall attitude scale with all 10 items retained is about 0.90, consistent with the "Cronbach's Alpha if Item Deleted" column in Table 10.16. The alpha can be increased slightly if the two weak items (rq4 and rq9) are removed. It is common practice for researchers either to remove the problematic items, or to rewrite them and administer them again to see if the alpha improves.

ACTIVITY 10.1

(a) What is reliability analysis?
(b) What does Cronbach's alpha indicate?
(c) Explain Cronbach's alpha if an item is deleted.

• Factor analysis is used to uncover the latent structure (dimensions) of a set of variables.

• Principal Component Analysis is used to reduce the number of variables into a smaller set of principal components (dimensions).

• Among the required assumptions for factor analysis are a large sample, normality (not for PCA), a linear relationship among variables, absence of outliers, and no multicollinearity.

• Factor loading is the correlation between a variable and a factor that has been extracted from the data.

• Bartlett's Test of Sphericity and the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO) are two commonly used tests of the factorability of the data.

• An initial factor solution is normally rotated to obtain a more interpretable solution.

• The initial solution can be rotated using orthogonal or oblique rotations.

• Reliability is the measure of consistency of a particular instrument.

• There are four classes of reliability estimates: inter-rater (or inter-observer) reliability, test-retest reliability, parallel-form reliability, and internal consistency.

• Cronbach's alpha is the most common form of internal consistency reliability coefficient.

Factor analysis                  Rotation
Principal component analysis     Orthogonal
Factor loading                   Oblique
Correlation matrix               Reliability
Covariance                       Cronbach's alpha coefficient

Carry out Factor Analysis to determine the dimensions in the Emotional Intelligence construct developed by the researcher (you can either name the factors or label them as Factor 1, Factor 2, etc.).

Report the Cronbach's alpha for each dimension.

