
AN INTRODUCTION TO

FACTOR ANALYSIS

Philip Hyland
University of Ulster
philipehyland@gmail.com
www.philiphyland.webs.com
Workshop Outline
1. General Introduction to Factor Analysis

2. Exploratory Factor Analysis

3. Confirmatory Factor Analysis

4. Bi-Factor Modelling
FACTOR ANALYSIS
Learning statistics can be
stressful.

Latent Variable Modelling,
Structural Equation
Modelling, Factor Analysis,
Path Analysis... Ahhhhhhh!

No equations, no maths.
FACTOR ANALYSIS
You probably already know
everything you need in order to
understand factor analysis.

It's all about correlation.

Hands-on learning is what it
takes!
LET'S TALK ABOUT STRESS
Stress is well known to be associated
with poor health:
cancer, heart disease, poor immune
functioning.
Little was known about the
mechanisms by which stress affects the
body physically.
Stress affects the body at a cellular
level: telomere length & telomerase
activity.

Epel, E. S. et al. (2004). Accelerated telomere shortening in response to
life stress. PNAS, 101, 17312-17315.
STRESS & TELOMERE DEGRADATION

Perceived Psychological Stress and Telomere Length: r = -.31, p < .01

Perceived Psychological Stress and Telomerase Activity: r = -.24, p < .01

LET'S TALK ABOUT STRESS
Incredible findings! Our psychological
state appears to affect our physiology at
the most basic level.
There is a limitation with this study!
You will understand what it is by the end
of this lecture.
Clue: it has to do with how we
measure psychological variables!

THE HARDEST SCIENCE
Psychology is the really hard science
~ Michael Shermer

Why is this so?


Psychologists are interested in things
that are not directly observable.
How can we possibly study that which
we cannot directly see?
The answer lies in factor analysis!

http://www.michaelshermer.com/2007/10/really-hard-science/
FACTOR ANALYSIS

WHAT IS IT ALL ABOUT?


KEEPING THINGS SIMPLE
Factor analysis is all about
simplification.

It is a method that allows us to understand
large numbers of observable variables
in terms of a smaller number of
unobservable variables.

Like it or not, we are all factor analysts!


WHAT'S PETER GRIFFIN LIKE?
KEEPING THINGS SIMPLE
Well, he sticks his tongue in fans.
He mixes his cereal with Red Bull.
He loiters in the wrong places.
He wires his nipples to jumper leads.

These are all directly observable phenomena.

It might be easier just to say he's stupid!

Stupidity is a latent variable.


KEEPING THINGS SIMPLE
We've just conducted a factor analysis!

We explained a range of observable characteristics
in terms of something simpler which isn't directly
observable.

The thing to notice is that all of these individual
observable characteristics seem to be highly related
to each other.
KEEPING THINGS SIMPLE
A more psychological example!
How has John been feeling recently?

He feels sad all the time.
He talks of committing suicide.
He has lost interest in activities he used to enjoy.
He has no motivation.

In other words, he is depressed.


STUDYING THE INVISIBLE
As psychologists we are interested in studying the
human mind.

We are usually interested in studying elusive
phenomena: anxiety, psychological stress, social
identity, irrationality, personality etc.

These are all unobservable constructs, which is why factor
analysis is the psychologist's most important tool.
COVARIATION
Whether we are talking as regular people or as stuffy
statistically-minded psychologists, the method of factor
analysis is identical: it's all about covariation!

What's covariation? It's the level of association
between a set of variables!

A correlation coefficient is a standardised covariance.
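
The idea that a correlation is just a covariance put on a standard scale can be checked by hand. The sketch below is illustrative only: the two score lists are made-up numbers, and the helper names `covariance` and `correlation` are hypothetical, not part of the workshop materials.

```python
import numpy as np

# Hypothetical scores on two questionnaire items (made-up numbers).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

def covariance(a, b):
    """Sample covariance: summed products of deviations from the means, over n - 1."""
    return np.sum((a - a.mean()) * (b - b.mean())) / (len(a) - 1)

def correlation(a, b):
    """Covariance divided by the product of the two sample standard deviations."""
    return covariance(a, b) / (a.std(ddof=1) * b.std(ddof=1))

cov_xy = covariance(x, y)
r_xy = correlation(x, y)
print(round(cov_xy, 2), round(r_xy, 2))
```

The covariance depends on the units of the two items, whereas the correlation always falls between -1 and +1, which is why it is the more convenient quantity to talk about.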


COVARIATION
The relationships that we are interested in when it
comes to Factor Analysis are the relationships
between the latent variable (e.g. Psychopathy) and
the observed variables employed to measure the
latent construct.
These relationships are expressed as factor loadings
(and measurement error).

We estimate these relationships (latent to observed)
by looking at the correlations among observed
variables.
FACTOR LOADINGS & ERROR
The relationship between the observed
indicators of Psychopathy and the latent
construct is expressed in terms of a regression
coefficient known as a factor loading.
Why a regression coefficient and not a
correlation coefficient?
The FA model assumes that the latent variable
influences or determines the nature of the
observed indicators.
As Psychopathy intensifies, the level of
endorsement for every given observable
indicator should increase.
[Diagram: latent variable (LV) pointing to an observed variable (OV), with a factor loading of .8]
COVARIATION
An observed variable can take many forms:
an indicator on a self-report measure, a
score on a test, a physiological measure,
reaction time etc.
Psychologists tend not to distinguish
between an observed variable and the
latent variable.
For example, the total score on the Psychopathy Checklist is
treated as equal to the true score.
Why is this a concern? It has to do with
measurement error.
MEASUREMENT ERROR
The observed level of Psychopathy is
extremely unlikely to be a perfect
representation of the true level of Psychopathy.
Self-report measures are fallible, imperfect
methods of capturing the psychological
construct under investigation.
Observed scores will be related to the true level
of that variable, but the relationship will hardly be perfect.
This is far less of a problem in the physical/hard sciences,
where all you deal with is observed variables.
MEASUREMENT ERROR
Measurement error takes two
forms: random error and systematic error.

Random error is that which occurs due to
chance or innocuous factors: lack of
concentration, or forgetting that 1 is strong
endorsement and circling a 5 instead.
Systematic error is the result of the
particular indicator inadvertently tapping into
some other variable, e.g. borderline
personality disorder.
MEASUREMENT ERROR
Imperfect measurement means that our observable
indicators are not only measuring the construct we are
interested in, but they are also measuring things we are
not interested in.

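Random measurement error has the consequence of weakening
the observed correlation between two variables, so that it
understates the true correlation.

Random error does not systematically inflate the
correlation between two variables, it only weakens it.
(Systematic error shared by both measures is the exception
and can inflate it.)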
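
The weakening effect of random error on a correlation can be seen in a small simulation. This is a minimal sketch under stated assumptions: the true correlation of .50, the error variance and the sample size are all arbitrary illustrative choices, not values from any study.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
n = 100_000

# Two "true" scores with a true correlation of .50 (arbitrary illustrative value).
true_r = 0.5
true_x = rng.standard_normal(n)
true_y = true_r * true_x + np.sqrt(1 - true_r**2) * rng.standard_normal(n)

# Observed score = true score + independent random error.
# The error has the same variance as the true score, so each measure
# is only half "signal" (a reliability of .50, assumed for illustration).
obs_x = true_x + rng.standard_normal(n)
obs_y = true_y + rng.standard_normal(n)

r_true = np.corrcoef(true_x, true_y)[0, 1]
r_observed = np.corrcoef(obs_x, obs_y)[0, 1]
print(round(r_true, 2), round(r_observed, 2))
```

With these settings the correlation between the error-laden scores should land near .25, roughly half the true value, even though nothing about the underlying relationship has changed.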
FACTOR ANALYSIS
Factor Analysis involves estimating the relationship
between the observed indicators and the latent variable
by examining the covariation among observable
indicators.
The variation in observable indicators can be due
to two factors:

1. The influence of a latent variable, e.g. Psychopathy
2. Other unwanted factors, i.e. measurement error

These unwanted factors are independent of (or
unrelated to) the latent variable.
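
To see why correlations among indicators carry information about the latent variable, it can help to simulate the model described above. The sketch below assumes a single latent variable, four indicators and independent error; the loading values are hypothetical. Under those assumptions the correlation between any two indicators should be close to the product of their loadings.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the sketch is repeatable
n = 200_000

# Hypothetical standardised loadings for four indicators x1..x4.
loadings = np.array([0.8, 0.7, 0.6, 0.5])

# Each indicator = loading * latent variable + independent error.
# The error variance is 1 - loading^2, so every indicator has variance of about 1.
latent = rng.standard_normal(n)
errors = rng.standard_normal((n, 4)) * np.sqrt(1 - loadings**2)
indicators = latent[:, None] * loadings + errors

# Correlations actually observed among the four indicators.
observed_corr = np.corrcoef(indicators, rowvar=False)

# Correlations the one-factor model implies: loading_i * loading_j.
implied_corr = np.outer(loadings, loadings)
np.fill_diagonal(implied_corr, 1.0)

max_gap = np.abs(observed_corr - implied_corr).max()
print(np.round(observed_corr, 2))
```

Factor analysis works in the opposite direction: it starts from the observed correlations and searches for the loadings that reproduce them best.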
LATENT VARIABLES

HOW DO WE MEASURE THEM?


LATENT VARIABLES
When psychologists seek to measure an
unobserved variable, we generally try to
capture that variable using multiple
indicators.
Unlikely on theoretical grounds that a single
question can capture the complexity of a
psychological construct (psychopathy, social
anxiety etc.).
Methodologically it is also preferable to use
multiple indicators because it allows for
greater reliability.
With numerous indicators we can obtain
greater confidence that the intended latent
construct is being reliably measured.
[Diagram: latent variable (LV) with four indicators x1, x2, x3, x4]
LATENT VARIABLES
So where are we? We've determined the unobserved
psychological construct that we are interested in measuring
and we have carefully selected a number of directly
observable variables that we believe will effectively capture
that latent construct. Now for the FA!
FA is simply about estimating the strength of the relationships
from the latent variable to each of the indicators (factor
loadings) and estimating the amount of variation in the
observable indicator not explained by the latent variable
(measurement error).
Remember I said that if you understand correlation, you
understand factor analysis. Here's why.
FACTOR LOADINGS
Standardised factor loadings range from -1 to +1, just like
a correlation coefficient.
The closer to +1 or -1 the better: larger factor loadings
demonstrate a higher degree of association between
the latent variable and that indicator.
More of the variance in responses to that indicator is
attributable to the latent factor than to measurement
error.
Simply put, high factor loadings signify that the
indicator is effectively capturing the construct we are
most interested in.
FACTOR LOADINGS
If we had a factor loading of .80, it means 64% of the
variance in responses to that indicator is attributable
to the latent variable (.80 squared = .64).
The remaining 36% of the variance is due to either systematic or
random sources of error.

Factor loadings above .6 are desirable (Hair,
Anderson, Tatham, & Black, 1998).

Factor loadings > .4 are acceptable.

Hair, J. F., Jr., Anderson, R. E., Tatham, R. L., & Black, W. C. (1998). Multivariate Data Analysis
with Readings (5th ed.). Englewood Cliffs, NJ: Prentice Hall.
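
The arithmetic behind the .80 example is simply squaring the loading. A minimal sketch, assuming a standardised loading; the function name `variance_split` is hypothetical and used only for illustration.

```python
def variance_split(loading):
    """Return (share of variance explained by the latent variable, share left over)."""
    if not -1.0 <= loading <= 1.0:
        raise ValueError("a standardised loading must lie between -1 and +1")
    explained = loading ** 2          # squared loading = variance explained
    return explained, 1.0 - explained  # the remainder is treated as error

explained, unexplained = variance_split(0.80)
print(f"{explained:.0%} explained, {unexplained:.0%} error")
```

Because the loading is squared, its sign does not matter here: a loading of -.80 explains the same share of variance as a loading of +.80.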
FACTOR LOADINGS
Based on this notion of variance explained, Comrey
and Lee (1992) have proposed the following
conventions.

Loading   Variance explained   Rating
0.32      10%                  Poor
0.45      20%                  Fair
0.55      30%                  Good
0.63      40%                  Very Good
0.71+     50%+                 Excellent

Comrey, A. L., & Lee, H. B. (1992). A first course in factor analysis (2nd ed.). Routledge: London.
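
The conventions listed above amount to a simple lookup on the size of the loading. The sketch below is one possible way to apply them. The function name is hypothetical, and the label returned for loadings under 0.32 is an assumption, since the list does not name that category.

```python
def comrey_lee_rating(loading):
    """Label a standardised loading using the cut-offs listed above."""
    size = abs(loading)  # the sign shows direction, not strength
    cutoffs = [
        (0.71, "Excellent"),
        (0.63, "Very Good"),
        (0.55, "Good"),
        (0.45, "Fair"),
        (0.32, "Poor"),
    ]
    for threshold, label in cutoffs:
        if size >= threshold:
            return label
    # Assumption: the conventions give no label below 0.32.
    return "Below the lowest cut-off"

for value in (0.80, 0.60, 0.40, 0.20):
    print(value, comrey_lee_rating(value), f"{value ** 2:.0%} of variance explained")
```

Cut-offs like these are rules of thumb, so a loading just under a threshold is not meaningfully different from one just over it.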
CONCLUSION
Factor analysis, being about simplification, is
an invaluable tool for the scientifically
minded psychologist.
The goal of science is to develop testable
theories to explain natural phenomena.
Our models or theories that explain
complex observable phenomena need to be
parsimonious.
We want to explain as much about that
complex variable as we can with the
simplest model possible.
CONCLUSION
Factor analysis allows for the development of
parsimonious theoretical models
by simplifying large amounts of data into fewer and
more meaningful variables.

Factor analysis also facilitates more accurate
assessments of relationships between variables.
It does this by creating latent variables which take
measurement error into account.
CONCLUSION
Back to our study at the start of this lecture.
How might this study have been improved?

Perceived Psychological Stress and Telomere Length: r = -.31, p < .01

Perceived psychological stress measured as an
observed variable?
What if we created a latent variable?
What might the effect be on the relationship between
these two variables?
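
One rough way to think about the last question is Spearman's classic correction for attenuation, which divides an observed correlation by the square root of the product of the two reliabilities. The sketch below is illustrative only. The reliability of .80 for the stress measure is a HYPOTHETICAL value, not a figure from the study, and telomere length is assumed (for simplicity) to be measured without error.

```python
import math

def disattenuate(r_observed, reliability_x, reliability_y=1.0):
    """Spearman's correction: estimate the correlation free of random error."""
    return r_observed / math.sqrt(reliability_x * reliability_y)

r_observed = -0.31          # correlation reported on the slide
assumed_reliability = 0.80  # HYPOTHETICAL reliability, not taken from the study

r_corrected = disattenuate(r_observed, assumed_reliability)
print(round(r_corrected, 2))
```

Under these assumptions the corrected correlation is somewhat stronger than the observed one. A latent variable model aims at the same thing, but it estimates the measurement error from the data rather than relying on an assumed reliability.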
Thank you for
your time!
Questions?

Philip Hyland
philipehyland@gmail.com
www.philiphyland.webs.com
