
PROF. DR. ANANDA KUMAR PALANIAPPAN, Ph.D.
Faculty of Education, University of Malaya,
50603 Kuala Lumpur, Malaysia.
Tel: 603-79675046 (Off)
603-79675010 (Fax)
Email: anandak@um.edu.my,
ananda4989@yahoo.com

Outline

Brief overview of SPSS Part I workshop
Instrument Validity and Reliability
Factor Analyses and Interpretation
Multiple Regression

Types of Research

Research
  Qualitative
    Ethnography
    Grounded Theory
    Action Research
    Case Study
    Phenomenology
  Quantitative
    Historical
    Experimental
    Non-experimental
      Descriptive
      Correlational
      Causal Comparative

Steps in Educational Research

1) Identify the problem area / the need for investigation
2) Write the statement of the problem in either
   (a) Question form [e.g. Do children with kindergarten experience perform better at school compared to children who have no kindergarten experience?]
   (b) Hypothesis form [e.g. There is no significant difference in academic achievement between children with kindergarten experience and children without kindergarten experience]
3) Decide which research design is most appropriate.
4) Review studies on the variables indicated in the research questions / hypotheses, (a) to form a conceptual framework for the research and (b) to gather the information required to design instruments.

Steps in Educational Research (Contd.)

5) Define the variables involved in operational terms [e.g. academic achievement is the grades assigned by teachers; intelligence is the score obtained on Cattell's Culture Fair Intelligence Test]
6) Design instruments to measure the variables involved
7) Pilot test the instruments to ascertain (1) whether they are suitable for the sample under study and (2) internal reliabilities (item analyses), test reliabilities and test validities.
8) Administer the instruments and score based on a predetermined score sheet.

Steps in Educational Research (Contd.)

9) Analyse the data using SPSS
10) Interpret the analyses and answer the research questions or reject/accept the hypotheses
11) State any assumptions or limitations in the study.

Pilot Study - Reliability and Validation of Instrument

Ascertain Reliability:
(A) INTERNAL CONSISTENCY: (1) Item Analysis - index of discriminability (2) Split-half reliability (3) Kuder-Richardson reliability (for dichotomous data) (4) Cronbach Alpha (for ordinal data). SPSS: Data Editor - Statistics - Scale - Reliability Analysis - Model (Alpha, Split-half, Guttman, Parallel)
(B) STABILITY: (1) Test-retest reliability (2) Alternate Forms reliability - use SPSS: Data Editor - Statistics - Compare Means - Paired-Samples t-test.

Ascertain Validity: (1) Content Validity - use expert testimony (2) Construct Validity - SPSS: Data Editor - Analyze - Data Reduction (3) Criterion-related Validity / Concurrent Validity - use correlation (4) Predictive Validity - use correlation
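The reliability analysis can also be run from a syntax window. A minimal sketch, assuming 20 Likert-type items named item1 to item20 (hypothetical names) in the active dataset:

* Internal consistency of the 20 pilot items.
RELIABILITY
  /VARIABLES=item1 TO item20
  /MODEL=ALPHA.

Replacing /MODEL=ALPHA with /MODEL=SPLIT, /MODEL=GUTTMAN or /MODEL=PARALLEL gives the other models listed above.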

Validity

Content Validity - whether the instrument tests all aspects that should be tested (ascertained using expert testimony)
Construct Validity - whether the test measures what it is supposed to measure (ascertained using Factor Analysis)
Criterion-related Validity / Concurrent Validity - whether the test scores are closely related to another test which measures a similar construct (ascertained using Pearson Correlation)
Predictive Validity - whether the instrument can correctly predict a particular outcome (ascertained using Pearson Correlation)

METHODS OF ESTIMATING RELIABILITY

Test-retest method (measure of stability): Give the same test twice to the same group, with a time interval between tests of anywhere from several minutes to several years.

Equivalent-forms method (measure of equivalence): Give two forms of the test to the same group in close succession.

Test-retest with equivalent forms (measure of stability and equivalence): Give two forms of the test to the same group with an increased time interval between forms.

Split-half method (measure of internal consistency): Give the test once. Score two equivalent halves of the test (e.g. odd items and even items).

Kuder-Richardson method (measure of internal consistency): Give the test once. Score the total test and apply the Kuder-Richardson formula.

DESIGNING INSTRUMENTS

Should be suitable for the population under study
Should sample the universe of data pertaining to the variable measured
Should be reliable
Should be reliably scored

Outline of SPSS Part 1

Types of Data
How to enter data and examine data
How to explore data for normality
What analyses / statistics to use
How to run these analyses
How to COMPUTE and RECODE

Outline

How to SELECT cases
How to interpret results and report using APA format
How to create and edit tables and place in other applications

Exercise 1

Start your SPSS for Windows now. You will get the Data Editor window. Study the menu bar and the options available in each menu.
Then,
1. Open the data file called PRACTICE.
2. Run some simple frequency analyses on the following variables:
   a) sex
   b) race
   c) region
   d) happy
3. From the results in your Output Navigator, describe the respondents in this study.
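For reference, a rough sketch of the equivalent syntax for step 2 (assuming the PRACTICE file is already open):

* Frequency tables for the demographic variables.
FREQUENCIES VARIABLES=sex race region happy.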

Types of Measurement Scales and their Statistical Analyses

Nominal
  Characteristics: simple classification into categories without any order
    e.g. Boy / Girl; Happy / Not Happy; Muslim / Buddhist / Hindu
  Type of data: Nonparametric
  Statistical tests: Chi-square

Ordinal
  Characteristics: has order or rank ordering
    e.g. strongly agree, agree, undecided, disagree, strongly disagree (LIKERT SCALE)
  Type of data: Nonparametric
  Statistical tests: Spearman's rho, Mann-Whitney, Wilcoxon

Types of Measurement Scales and their Statistical Analyses

Interval
  Characteristics: does not have a true 0 point. Has order as well as equal distances or intervals between judgements (social sciences). e.g. an IQ score of 95 is better than an IQ of 85 by 10 IQ points
  Type of data: Parametric
  Statistical tests: COMPARISON: t-tests, ANOVA; RELATIONSHIP: Pearson r

Ratio
  Characteristics: has a true 0 point. Has order, equal distances between judgements and a true zero value (physical sciences). e.g. age, no. of children; 9 ohm is 3 times 3 ohm and 6 ohm is 3 times 2 ohm. But an IQ of 120 is more comparable to an IQ of 100 than to an IQ of 144, although the ratios are equal: 120/100 = 144/120 = 1.2 (IQ is interval, not ratio)
  Type of data: Parametric
  Statistical tests: COMPARISON: t-tests, ANOVA; RELATIONSHIP: Pearson r

Types of Measurement Scales and their Statistical Analyses

A higher order of measurement can be converted to a lower order, e.g. interval ---> ordinal, nominal.
But not ordinal, nominal ----> interval.

Refer to the handout provided.

Exercise 1
Indicate in the spaces provided in Table 1 the level of measurement of the corresponding variables.

Data Collection

Identify the population to be studied
Choose the sample randomly or by stratified random sampling
The accuracy of the findings of a research study depends greatly on (1) how the sample is chosen, (2) whether the correct instruments are used, and (3) the reliability and validity of the instruments

Entering & Editing Data

Open SPSS by double clicking the SPSS icon or START - PROGRAM - SPSS
Define variables
Enter data
Adding labels for variables and value labels
Inserting new cases
Inserting new variables
Adding Missing Value codes
Examining data by running FREQUENCY

Refer to the handout provided.

Exercise 2:
Enter the data given in the handout, then answer the questions.

Exploring Data Graphically

To check normality graphically and decide on the appropriate analyses:
1) By displaying data
   Histogram
   Boxplot
   Stem-and-leaf plot
2) By statistical analyses
   Descriptive statistics
   M-Estimators
   Kolmogorov-Smirnov test
   Shapiro-Wilk test
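All of these displays and tests can be requested in one pass through the Explore procedure. A rough syntax sketch (the variable name cra, for the child rearing practices scores used in the following slides, is an assumption):

* Explore: plots plus normality tests (NPPLOT gives K-S, Shapiro-Wilk and Q-Q plots).
EXAMINE VARIABLES=cra
  /PLOT BOXPLOT STEMLEAF HISTOGRAM NPPLOT
  /MESTIMATORS HUBER(1.339)
  /STATISTICS DESCRIPTIVES.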

Histogram

[Histogram of CHILD REARING PRACTICES scores, 10.0 to 25.0: Std. Dev = 3.89, Mean = 18.0, N = 41.00]

Checking Normality - Skewness

Skewness measures the symmetry of the sample distribution.
Skewness ratio = Statistic / Standard Error
If the skewness ratio < -2 or > +2, reject normality.
If -2 < skewness ratio < 2 ---> normal distribution.

Negatively Skewed

If the skewness ratio is negative, and the Mean < Median, the distribution is negatively skewed.

[Boxplot of CRA scores by SEX (MALE, n = 13; FEMALE, n = 22) illustrating a negatively skewed distribution]

Positively Skewed

If the skewness ratio is positive, and the Mean > Median, the distribution is positively skewed.

Checking Normality - Kurtosis

Kurtosis measures the spread of the data in the tails of the distribution.
Kurtosis ratio = Statistic / Standard Error
If the kurtosis ratio < -2 or > +2, reject normality.
If -2 < kurtosis ratio < 2 ---> normal distribution.

Kurtosis

A large positive value of kurtosis indicates that the tails of the distribution are longer than those of a normal distribution.

[Graph of a heavy-tailed distribution against the normal curve]

Kurtosis

A negative value of kurtosis indicates shorter tails (a box-like distribution).

[Graph of a short-tailed distribution against the normal curve]

Boxplot

[Annotated boxplot of CHILD REARING PRACTICES (N = 41), slightly positively skewed, illustrating:
- values more than 3 box-lengths from the 75th percentile (extreme values)
- values more than 1.5 box-lengths from the 75th percentile (outliers)
- the largest observed value that isn't an outlier
- the 75th percentile
- the median
- the 25th percentile
- the smallest observed value that isn't an outlier]

Descriptive Statistics

Fig. 1. Boxplot comparisons of the creativity scores of Malaysian and American students:
Elaboration > Fluency > Flexibility > Originality

Example: Boxplots for more than one variable / time series


Stem-and-Leaf Plot

[Stem-and-leaf plot of CHILD REARING PRACTICES scores: frequencies 1, 2, 8, 11, 3, 8, 4, 3 and 1 across the stems, with the stem width and the unit each leaf represents noted below the plot]

Testing Normality of Data Collected

All data must be tested for normality before analyzing them statistically.
Normality - if the data sample the population representatively, they will be normally distributed, with the mean and median approximately equal.
The type of analysis depends on the normality of the data and the level of measurement of the data:
- Normally distributed data: use parametric tests like t-tests, ANOVA, Pearson r.
- Non-normally distributed data: use non-parametric tests like Chi-square, Spearman's rho, Mann-Whitney, Wilcoxon.

To Show Normality of Data

[SPSS Explore output: descriptive statistics for CRA (mean, 95% confidence interval for the mean, 5% trimmed mean, median, variance, std. deviation, minimum, maximum, range, interquartile range, skewness and kurtosis)]

Data Editor - Analyze - Descriptive Statistics - Explore

[Tests of Normality output: Kolmogorov-Smirnov and Shapiro-Wilk statistics for CRA]

Not sig. at p < .01, so the data are normally distributed.

Boxplot for Male and Female Parents

[Boxplot of CRA scores by SEX (MALE, n = 13; FEMALE, n = 22): one group slightly negatively skewed, the other slightly positively skewed]

Normal and Detrended Normal Q-Q Plots of CRA

[Normal Q-Q plots and detrended normal Q-Q plots of CRA, drawn separately for SEX = MALE and SEX = FEMALE (observed value vs expected normal / deviation from normal)]

Exercise

Open the data file PRACTICE and check the normality of the Age data of the respondents using
a) Histogram
b) Boxplot
c) Stem-and-leaf
d) M-Estimators
e) Kolmogorov-Smirnov & Shapiro-Wilk
f) Normal Q-Q Plot
g) Detrended Normal Q-Q Plot

Testing Equality of Variance

Levene's Test (SPSS - Data Editor - Analyze - Descriptive Statistics - Explore - Plots (Levene))

[Levene's test output comparing the variance of mothers' and fathers' scores]

If the Levene statistic is highly significant (p < .001), the groups do not have equal variance.
If the Levene statistic is not significant (p > .001), the groups have equality of variance and t-test analyses can be undertaken.

Exercise

You wish to compare the ages of male and female respondents using the t-test. To use the t-test, you must make sure the variances in the age of male and female respondents are similar. How are you going to do it? Can you use the t-test to compare the ages of male and female respondents in the sample?

Compute Data

SPSS Data Editor - Transform - Compute

Please try exercise 3.
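For reference, a minimal syntax sketch of COMPUTE (the variable names total and item1 to item3 are hypothetical):

* Create a new variable as the sum of three items.
COMPUTE total = item1 + item2 + item3.
EXECUTE.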

RECODE

SPSS Data Editor - Transform - Recode - into different variable / into same variable
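A syntax sketch of recoding into a different variable, using the HAPPY recoding applied later in these notes (categories 1 and 2 become 1, category 3 becomes 0):

* Recode HAPPY into the binary variable HAPPYrec.
RECODE happy (1,2=1) (3=0) INTO happyrec.
EXECUTE.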

Recode (contd)

Please try exercise 4.

Select Cases

SPSS Data Editor - Data - Select Cases
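Selecting cases through the dialog generates filter syntax along these lines (a sketch; the coding sex = 1 for males is an assumption):

* Filter the file down to male respondents only.
USE ALL.
COMPUTE filter_$ = (sex = 1).
FILTER BY filter_$.
EXECUTE.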

Select Cases

Please try Exercise 5.

To Analyze & Report Demographic Data

ANALYZE - DESCRIPTIVE STATISTICS - EXPLORE

Source: Palaniappan, A. K. (2009). Penyelidikan Pendidikan dan SPSS. Kuala Lumpur, Malaysia: Pearson.

Source: American Psychological Association. (2010). Publication Manual of the American Psychological Association (6th ed.). Washington, DC: Author.

Source: American Psychological Association. (2010). Publication Manual of the American Psychological Association (6th ed.). Washington, DC: Author.

Sample APA Reporting of Demographic Information for 4 Subsamples

GOLDEN TABLE


Parametric Statistical Analyses (Degree of Association / Relationship)

SPSS Data Editor - Statistics - Correlate - Bivariate
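A syntax sketch for the bivariate correlations reported on the following slides (SAM, WKOPAY and CRA; one-tailed significance as in Table 1):

* Pearson correlations among the three scores.
CORRELATIONS
  /VARIABLES=sam wkopay cra
  /PRINT=ONETAIL SIG.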

Parametric Statistical Analyses (Degree of Association / Relationship)

Pearson Product-moment Correlation

[SPSS bivariate correlation output for CRA, SAM and WKOPAY scores]

Presenting Correlation Table

Table 1
Pearson Product Moment Correlations between SAM, WKOPAY and CRA Scores

          CRA    SAM     WKOPAY
SAM       .20    1.00    .38*
WKOPAY    .29    .38*    1.00

N of cases: 165    1-tailed signif.: * - .01    ** - .001

Effect size for correlation


Reporting Product Moment Correlations

Table 1 presents the inter-correlations among Creative Child Rearing Practices (CRA), Something About Myself (SAM) and What Kind of Person Are You? (WKOPAY) scores. The correlation coefficient between CRA and SAM scores is .20, which is not significant at p < .05 and has a small effect size. This indicates that parents who perceive themselves as creative based on their past creative performances do not engage in creative child rearing practices.

The correlation coefficient between CRA and WKOPAY scores is also not significant (r = .29, p > .05), with a small effect size. This indicates that parents who perceive themselves as creative based on their personality characteristics also do not engage in creative child rearing practices.

Report

There is a significant correlation between SAM and WKOPAY (r = .375, p < .05) with a small effect size. The correlation is positive, indicating that higher SAM scores are associated with higher WKOPAY scores. Results also show that 14% (r squared) of the variance of SAM scores is explained by WKOPAY scores. About 86% of the variance in SAM is unaccounted for.

Sample of Correlation Report

Creed, P. A., & Lehmann, K. (2009). The relationship between core self-evaluations, employment commitment and well-being in the unemployed. Personality and Individual Differences, 47, 310-315.

Sample of Correlation Table

Creed, P. A., & Lehmann, K. (2009). The relationship between core self-evaluations, employment commitment and well-being in the unemployed. Personality and Individual Differences, 47, 310-315.

Means and SDs for the Upper Group
Means and SDs for the Lower Group

Source: American Psychological Association. (2010). Publication Manual of the American Psychological Association (6th ed.). Washington, DC: Author.

An Example of a Scatter Plot (Palaniappan, 2007)

[Scatter plot: Graphical Representation of the Relationship Between Individualism (x-axis, 20 to 100) and Overall Expressivity Endorsement (y-axis, 0.39 to 0.51), with data points for countries including the USA, Canada, Australia, Zimbabwe, Denmark, New Zealand, Belgium, Mexico, India, Brazil, the Netherlands, Hungary, Portugal, Poland, the People's Republic of China, Japan, Turkey, South Korea, Malaysia, Lebanon, Germany, Israel, Croatia, Russia, Italy, Switzerland, Indonesia and Hong Kong]

t-tests

Paired t-tests
Grouped t-tests

Assumptions of t-tests

1) Data must be interval or ratio
2) Data must be obtained via random sampling from the population
3) Data must be normally distributed

Parametric Statistical Analyses (comparisons - t-tests)

SPSS Data Editor - Compare Means - Independent-Samples T Test
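A syntax sketch for the grouped (independent-samples) t-test of CRA scores by gender (the grouping codes 1 and 2 are an assumption):

* Compare CRA scores of the two gender groups.
T-TEST GROUPS=sex(1 2)
  /VARIABLES=cra.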

Parametric Statistical Analyses (comparisons - t-tests)

[SPSS Independent Samples Test output: Levene's test for equality of variances (F, Sig.) and the t-test for equality of means (t, df, Sig. 2-tailed, mean difference, std. error difference, 95% confidence interval of the difference), with rows for equal variances assumed and not assumed]

Presentation of t-test Results

Table 2
T-test comparisons of CRA scores by gender

        Father      Mother      t-value      Effect
        (n = 13)    (n = 12)    (p < .05)    Size
Mean    15.06       14.36       .54 (NS)     .18
SD      4.05        3.63

Effect Size

Effect Size (Cohen's d) = (X̄1 - X̄2) / [(s1 + s2) / 2]

Example: X̄1 = 15.08, s1 = 4.05; X̄2 = 14.36, s2 = 3.63

Effect Size = (15.08 - 14.36) / [(4.05 + 3.63) / 2] = 0.72 / 3.84 = .1875

Result: Effect Size (Cohen's d) = .1875 (small effect size)

Note: effect size ~ .5 (medium); ~ .8 (high)

Effect Size Measured by Cohen's d

Effect size (Cohen's d), Eta Squared and interpretation:

Effect Size (Cohen's d)    Eta Squared (η²)      Interpretation
0.2 <= d < 0.5             .01 <= η² < .06       Small
0.5 <= d < 0.8             .06 <= η² < .14       Moderate
d >= 0.8                   η² >= .14             Large

Report

The mean CRA scores of fathers and mothers are 15.08 and 14.36, and the standard deviations are 4.05 and 3.63 respectively. These scores were subjected to a t-test analysis. Levene's Test for equality of variance indicates that the variances are similar. The t-value obtained is .54, which is not significant at p < .05. The effect size is .18.

These results indicate that fathers and mothers do not differ in their child rearing practices. The effect size indicates that parents' gender has only a small effect on their creative child-rearing practices.

See handout for a clearer page (article page # 971)

Palaniappan, A. K. (2000). Sex differences in creative perceptions of Malaysian students. Perceptual and Motor Skills, 91, 970-972.

Sample F test for a 2-group comparison

Paired t-test

Assumptions
1) Normality of the population difference of scores - this is ascertained by ensuring the normality of each variable separately.
2) The other assumptions are similar to the grouped t-test:
   a) Data must be interval or ratio
   b) Data must be obtained via random sampling from the population
   c) Data must be normally distributed
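A syntax sketch for a paired t-test, using the variables from the exercise below:

* Paired comparison of father's and mother's highest year of education.
T-TEST PAIRS=paeduc WITH maeduc (PAIRED).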

Exercise

1) Is there a significant difference in the highest year of education between the respondent's mother and father?
2) Is there a significant difference in the highest year of education of the respondent and his/her spouse?

Parametric Statistical Analyses (comparisons - One-way ANOVA)

SPSS Data Editor - Compare Means - One-way ANOVA
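A syntax sketch for the one-way ANOVA of CRA scores across WKOPAY groups with a Scheffe post-hoc test (the grouping variable name wkgrp is hypothetical):

* One-way ANOVA with descriptives, Levene's test and Scheffe post-hoc comparisons.
ONEWAY cra BY wkgrp
  /STATISTICS DESCRIPTIVES HOMOGENEITY
  /POSTHOC=SCHEFFE ALPHA(0.05).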

Parametric Statistical Analyses (comparisons - One-way ANOVA)

[SPSS One-way ANOVA output: descriptives for the WKOPAY and SAM groups, and the ANOVA table (between-groups / within-groups sums of squares, df, mean squares, F and Sig.)]

Understanding the ANOVA Table

F = Variation among the sample means / Variance within the samples

F = (Between-groups sum of squares / df1) / (Within-groups sum of squares / df2) = Between mean square / Within mean square

The between mean square is computed by subtracting the mean of all observations (the overall mean) from the mean of each group, squaring each difference, multiplying each square by the number of cases in its group, and adding the results for all groups together. The total is called the between-groups sum of squares.

The within-groups sum of squares is computed by multiplying each group variance by the number of cases in the group minus 1 and adding the results for all groups.

The mean square column reports each sum of squares divided by its respective degrees of freedom.

The F ratio is the ratio of the two mean squares.

Presentation of One-way ANOVA Results

Table 3
One-way ANOVA for CRA scores by WKOPAY groups

Source         df    Sum of     Mean of    F       F
                     Squares    Squares    Ratio   Probability
Between Gps    2     31.145     15.573     .632    .537
Within Gps     38    936.660    24.649
Total          40    967.805

Multiple Range Test
Scheffe Procedure: no groups are significantly different at the .05 level

Interpreting F

If the F value is significant, then the groups are significantly different.
To ascertain which groups are significantly different, perform the Scheffe test.
F (Groups - 1, No. of Participants - Groups) = F value

Report

Results show that the three groups do not differ significantly on CRA scores (F (2, 38) = .632, p > .05). This represents an effect size of 3.22% [{31 / (31 + 937)} x 100], which indicates that only 3.22% of the variance of CRA scores was accounted for by the 3 groups.
(Do the same for SAM.)

Effect Size

The degree to which the phenomenon exists (Cohen, 1988).

Effect Size = (Sum of Squares between Groups / Total Sum of Squares) x 100

Bonferroni Correction for Multiple Comparisons

For multiple comparisons, Bonferroni corrections must be made.
If the overall level of significance is set at p < .05 and the number of comparisons involved is 10, then the level of significance for each comparison must be .05/10, which is .005.

Table for Post-hoc Comparisons


Power of a Test

The power of a statistical test is the probability of observing a treatment effect when it occurs.
It is the probability that the test will correctly lead to the rejection of a false null hypothesis (Green, 2000).
Statistical power is the ability of the test to detect an effect if it actually exists (High, 2000).
Statistical power is denoted by 1 - β, where β is the Type II error, the probability of failing to reject the null hypothesis when it is false.
Conventionally, a test with power greater than .8 (or β <= .2) is considered statistically powerful.

α is the probability of rejecting a true null hypothesis (Type I error).
β is the probability of not rejecting a false null hypothesis (Type II error).

There are four components that influence the power of a test:

1) Sample size, or the number of units (e.g., people) accessible to the study
2) Effect size, the difference between the means divided by the standard deviation (i.e. 'sensitivity')
3) Alpha level (significance level), or the probability that the observed result is due to chance
4) Power, or the probability that you will observe a treatment effect when it occurs

Usually, experimenters can only change the sample size of the study and/or the alpha value.

Other Ways to Calculate Sample Size and Confidence Interval

To Calculate Sample Size or Power

http://www.stat.ubc.ca/~rollin/stats/ssize/n2.html
http://www.downloadforge.com/Windows/Mathematics/Download/GPower-319.html

Sample size and Effect size Table

Sample size and Effect size Table

ANOVA (1-way)

To compare 3 or more groups on a dependent variable.
The same assumptions as for t-tests apply.
Analyze - Compare Means - One-way ANOVA
Do Exercise 10A, page 11.

Sample APA Table for One-way ANOVA

Source: Palaniappan, A. K. (2009). Penyelidikan Pendidikan dan SPSS. Kuala Lumpur, Malaysia: Pearson.


2-way, 3-way ANOVA

Statistics - General Linear Model - GLM General Factorial
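A syntax sketch for a factorial ANOVA matching Table 4 later in these notes (the group variable names samgrp and wkgrp are hypothetical):

* Three-way factorial ANOVA of CRA by sex, SAM groups and WK groups.
UNIANOVA cra BY sex samgrp wkgrp
  /PRINT=DESCRIPTIVE ETASQ
  /DESIGN=sex samgrp wkgrp sex*samgrp sex*wkgrp
          samgrp*wkgrp sex*samgrp*wkgrp.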

Parametric Statistical Analyses (Comparison of more than 2 groups on interval data - ANOVA - Simple Factorial)

[SPSS ANOVA output (Table 2), using CRA as the dependent variable: sums of squares, df, mean squares, F and Sig. for the main effects and interactions of sex, SAM groups and WK groups]

2-way ANOVA, 3-way ANOVA

Do the exercise on p. 11.

ANCOVA

Try the exercise on ANCOVA on page 10.

Presentation of Three-way ANOVA Results

Table 4
Analysis of Variance using CRA scores as the dependent variable

Source of Variation           Sum of     DF   Mean      F      Signif.
                              Squares         Squares          of F
Main Effects                  14.916     3    4.972     .318   .812
  Sex                         .192       1    .192      .012   .913
  SAM grps                    12.994     1    12.994    .830   .370
  WK grps                     3.346      1    3.346     .214   .648
2-way Interactions            32.025     3    10.675    .682   .571
  Sex x SAM grps              8.403      1    8.403     .537   .470
  Sex x WK grps               15.077     1    15.077    .963   .335
  SAM grps x WK grps          13.149     1    13.149    .840   .367
3-way Interactions            2.472      1    2.472     .158   .894
  (Sex x SAM grps x WK grps)
Model                         55.588     7    7.941     .507   .821
Residual                      422.583    27   15.651
Total                         478.171    34   14.064

Reporting ANOVA Simple Factorial

As shown in Table 2, there are no significant differences between fathers and mothers with respect to child rearing practices (F = .12, p > .05). The results also show that WK groups (F = .83, p > .05) and SAM groups (F = .24, p > .05) also do not have significant effects on CRA scores. There are also no significant two-way or three-way interactions between sex, WK groups and SAM groups.

The results indicate male parents do not differ from female parents in their child rearing practices. Their creative perceptions also do not affect their child rearing practices.

Sample Report of an Experimental Research

Dalton, J. J., & Glenwick, J. S. (2009). Effects of expressive writing on standardized graduate entrance exam performance and physical health functioning. The Journal of Psychology, 143(3), 279-292.

Part II

Factor Analysis
Reliability - Item Analysis
Multiple Regression
One-way Repeated Measures ANOVA
Multivariate ANOVA (MANOVA)
Discriminant Analysis
Testing for Moderating Effects of a Variable

FACTOR ANALYSIS

Factor analysis is undertaken to ascertain how many factors are measured by the items you have constructed. This is sometimes called Data Reduction.
To do this, you need to enter the data item by item in your datafile. Using factor analysis you will be able to tell which items are strongly correlated and lump together to form a factor. By looking at these items you will be able to give a collective name to represent these items, or Factor.
SPSS will be able to tell how many factors there are and how many items fall in each factor.

FACTOR ANALYSIS

Data are entered item by item in the datafile.
In factor analysis you will be able to tell which items are strongly correlated and lump together to form a factor. By looking at these items you will be able to give a collective name to represent these items, or Factor.
SPSS will indicate how many factors there are and how many items fall in each factor.

Assumptions for Factor Analysis

There must be at least [number of variables (items) x 5] respondents, or more than 200 respondents, to run factor analysis reliably.
There must be linear relationships between the variables or items.
There should not be any outliers for each variable.
The correlations among the items must be more than .3 in order to be factorizable.
To be factorizable, Bartlett's test of sphericity must be significant and large.
To be factorizable, the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy must be more than .6.
To ensure sampling adequacy, the anti-image correlation matrix is used. Variables with sampling adequacy below .5 (see the diagonal of the anti-image correlation matrix) should be excluded from the factor analysis.
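A syntax sketch for a factor analysis of the 20 items in the exercise below (principal components extraction, varimax rotation and the eigenvalue > 1 rule are assumed defaults here, not prescribed by the slides):

* Factor analysis of the 20 items, with KMO and Bartlett's test.
FACTOR
  /VARIABLES item1 TO item20
  /PRINT INITIAL KMO EXTRACTION ROTATION
  /CRITERIA MINEIGEN(1)
  /EXTRACTION PC
  /ROTATION VARIMAX.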

FACTOR ANALYSIS

Exercise 19
Using the datafile "Datafile for Item Analysis and Factor Analysis", run a factor analysis of all 20 items and determine how many factors there are. By looking at the items that fall within each factor, can you give a common name to represent all the items in each factor?

Factor Analysis Output

KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy: .466
Bartlett's Test of Sphericity: Approx. Chi-Square = 7478.285, df = 3741, Sig. = .000

The Kaiser-Meyer-Olkin Measure of Sampling Adequacy is less than .6 (it should be more than .6, the higher the better), so the variables are only marginally factorizable.

Bartlett's Test of Sphericity is significant at p < .05. This indicates that the variables are related and therefore factorizable.

Factor Analysis - KMO

The Kaiser-Meyer-Olkin Measure of Sampling Adequacy is the statistic that indicates the proportion of variance in your variables that might be caused by underlying factors.
High values (close to 1.0) generally indicate that a factor analysis may be useful with your data. If the value is less than 0.6, the results of the factor analysis probably won't be very useful.

ITEM ANALYSIS

Item analysis is undertaken to ascertain to what extent the items measuring a certain construct are correlated. Items that are closely correlated indicate high internal consistency or reliability of the test. The measure of internal consistency or reliability is given by Cronbach Alpha.
If the items are ordinal (e.g. Likert scale), SPSS will give the Cronbach Alpha. But if the items are dichotomous, you will need to use Kuder-Richardson 20, which is also obtained by requesting Cronbach Alpha.
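A syntax sketch for item analysis via Cronbach Alpha with item-total statistics (the item names are hypothetical):

* Cronbach Alpha plus "Alpha if item deleted" statistics.
RELIABILITY
  /VARIABLES=item1 TO item20
  /MODEL=ALPHA
  /SUMMARY=TOTAL.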

Item Analysis

Exercise 19
From the SPSS WORKSHOP icon on your desktop, open the data file called "Datafile for Item Analysis and Factor Analysis", run the item analysis and ascertain the best Cronbach Alpha.

Sample Factor Analysis table


Multiple Regression

Bivariate Regression
  Aca Ach = Constant + b(Motivation)

Multivariate Multiple Regression
  Aca Ach = Constant + b1(Motivation) + b2(Creativity) + b3(Self-confidence)

Multiple Regression - Assumptions

1) Ratio of cases to independent variables: 20 times more cases than predictors.
2) Variables must be normally distributed - check graphically or statistically (e.g. boxplot, histogram, skewness and kurtosis, Kolmogorov-Smirnov or Shapiro-Wilk).
3) IVs must be linearly related to the DV (use a scatterplot for bivariate regression). For multivariate regression, use the residual scatterplot between the standardized residuals (Y-axis) and the standardized predicted values (X-axis): if linearly related, the points in the scatterplot are evenly distributed on both sides of the 0 value of the standardized predicted value (X-axis).
4) No multicollinearity - IVs must not be significantly correlated. Use the Pearson correlation matrix to check / Tolerance = 1 - R² (must be more than .1) / VIF (Variance Inflation Factor) = 1/Tolerance (must be less than 10). [R is the correlation coefficient between the 2 IVs or predictors, which should not be more than .7. If more than .7, omit one of the IVs or combine the IVs.]
5) No multivariate outliers - use Mahalanobis Distance to ascertain this. Use the chi-square value at p < .001 and df (= no. of IVs) from the chi-square table to determine which cases are outliers in the MAHAL column produced in the datafile.

Residuals are the differences between the predicted DV values calculated from the predictors and the DV values obtained from the study.

Normality: the residuals must be normally distributed about the predicted DV scores.
Linearity: the residuals should have a straight-line relationship with the predicted DV scores.
Homoscedasticity: the variance of the residuals about the predicted DV scores should be the same for all predicted scores.

Normality, linearity and homoscedasticity can be checked using the residuals scatterplots generated by SPSS.

Example of Scatterplot between Std Residual and Std Predicted Value

[Scatterplot (Dependent Variable: Highest Year of School Completed): regression standardized residuals (y-axis, -4 to 4) against regression standardized predicted values (x-axis, -4 to 4)]

Collinearity Statistics - Tolerance

Tolerance is the statistic used to determine how much the independent variables are linearly related to one another (multicollinear).
Tolerance is the proportion of a variable's variance not accounted for by the other independent variables in the model and is given by 1 - R², where R is the correlation coefficient between the 2 IVs or predictors.
The tolerance level must be more than .1.

Collinearity Statistics - VIF

VIF (Variance Inflation Factor) is the reciprocal of the Tolerance.
VIF should be less than 10.

Durbin-Watson

Gives a measure of the autocorrelation in the residuals (or errors) of the observations in multiple regression analyses.
If the Durbin-Watson value is between 1.5 and 2.5, then the observations or values are independent - there is no systematic trend in the errors of the observations (there should not be a systematic trend in the errors).

Multivariate Outlier - an Example

It is usual to find a person who is 15 years old, and this person will not be an outlier when you plot a histogram for age (univariate).
It is also common to find a person earning a salary of RM10,000 a month, and this person may not be an outlier when you plot a histogram for salary (univariate).
However, if you combine both age and salary (multivariate), a person who is 15 years old earning RM10,000 may become an outlier - called a multivariate outlier.
You need to get rid of multivariate outliers using Mahalanobis Distance before you run your multiple regression.

What havoc can a multivariate outlier do to your results?

It can change your R from .08 to .88!

Checking for Multivariate Outliers - Use Mahalanobis Distance

No. of Independent    Critical Chi-square Value to
Variables             Determine Multivariate Outliers (p < .001)
2                     13.82
3                     16.27
4                     18.47
5                     20.52
6                     22.46
7                     24.32

Methods for Selecting Variables

Forward Selection - starting from the constant term, a variable is added to the equation or regression model if it results in the largest significant (at p < .05, for example) increase in multiple R².

Backward Selection - all variables are put into the equation or regression model. At each step, a variable is removed if its removal results in only a small, insignificant change in R².

Stepwise Variable Selection - the most commonly used method for model building; a combination of Forward Selection and Backward Selection. Variables already in the model can be removed if they are no longer significant predictors when new variables are added to the regression model.

Types of Regression Analyses

Standard Multiple Regression
Sequential / Hierarchical Multiple Regression
Statistical / Stepwise Multiple Regression

Coding for Dummy Variables

Example: Gender - dichotomous
  Male - 1
  Female - 2
Need to convert to a dummy variable:
  Male - 1
  Female - 0
to study the effect of gender on the DV.

If r is significantly positive, males have the significantly higher effect on the DV.
If r is significantly negative, females have the significantly higher effect on the DV.

Using the PRACTICE data file

Research Questions:
1) To what extent do PAEDU and MAEDU predict EDUC?
2) To what extent do PAEDU, MAEDU and SEX predict EDUC?
3) To what extent do PAEDU, MAEDU, SIBS and SEX predict EDUC?
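A syntax sketch for Research Question 2, matching the output on the following slides (predictors entered in three successive blocks; MAHAL saves Mahalanobis distances for outlier screening):

* Multiple regression of EDUC on paeduc, maeduc and sexdummy.
REGRESSION
  /STATISTICS COEFF OUTS R ANOVA CHANGE COLLIN TOL
  /DEPENDENT educ
  /METHOD=ENTER paeduc
  /METHOD=ENTER maeduc
  /METHOD=ENTER sexdummy
  /RESIDUALS DURBIN
  /SAVE MAHAL.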

Results of Mul Reg for Research Question 2

Descriptive Statistics (N = 973)
            Mean     Std. Deviation
educ        13.54    2.797
paeduc      11.01    4.117
maeduc      11.02    3.409
sexdummy    .4245    .49452

Pearson Correlations (N = 973; all significant at p < .05, 1-tailed)
            educ     paeduc   maeduc   sexdummy
educ        1.000    .450     .429     .112
paeduc      .450     1.000    .672     .102
maeduc      .429     .672     1.000    .065
sexdummy    .112     .102     .065     1.000

Results of Mul Reg for Research Question 2 (contd)

Model Summary (Dependent Variable: educ)
                                  Std. Error       Change Statistics
Model  R     R Square  Adj. R²    of the Estimate  R² Change  F Change  df1  df2  Sig. F Change
1      .450  .203      .202       2.499            .203       246.937   1    971  .000
2      .481  .232      .230       2.454            .029       36.704    1    970  .000
3      .486  .236      .234       2.448            .004       5.670     1    969  .017
Durbin-Watson = 1.738
Predictors: Model 1 - (Constant), paeduc; Model 2 - (Constant), paeduc, maeduc; Model 3 - (Constant), paeduc, maeduc, sexdummy

ANOVA (Dependent Variable: educ)
Model            Sum of Squares  df   Mean Square  F        Sig.
1  Regression    1541.572        1    1541.572     246.937  .000
   Residual      6061.733        971  6.243
   Total         7603.305        972
2  Regression    1762.582        2    881.291      146.361  .000
   Residual      5840.724        970  6.021
   Total         7603.305        972
3  Regression    1796.560        3    598.853      99.934   .000
   Residual      5806.745        969  5.993
   Total         7603.305        972

Multiple Regression Results

Coefficients (Dependent Variable: educ)
                 Unstandardized      Standardized
Model            B       Std. Error  Beta    t       Sig.   95% CI for B      Tolerance  VIF
1  (Constant)    10.178  .229                44.499  .000   9.729 to 10.627
   paeduc        .306    .019        .450    15.714  .000   .268 to .344      1.000      1.000
2  (Constant)    9.254   .272                34.077  .000   8.721 to 9.787
   paeduc        .201    .026        .295    7.768   .000   .150 to .251      .548       1.826
   maeduc        .189    .031        .230    6.058   .000   .128 to .250      .548       1.826
3  (Constant)    9.142   .275                33.250  .000   8.602 to 9.681
   paeduc        .196    .026        .288    7.574   .000   .145 to .246      .544       1.837
   maeduc        .189    .031        .231    6.085   .000   .128 to .250      .548       1.826
   sexdummy      .380    .160        .067    2.381   .017   .067 to .693      .990       1.011

Reporting Results of Mul Reg for Research Question 2

Table XX
Standard Multiple Regression of PAEDUC, MAEDUC and SEXDUMMY on EDUC

Variables   EDUC   PAEDUC   MAEDUC   B      β      t       p < .05
PAEDUC      .45                      .20    .29    7.57    Sig
MAEDUC      .43    .67               .19    .23    6.09    Sig
SEXDUMMY    .11    .10      .07      .38    .07    2.38    Sig

Intercept = 9.14
Means       13.54  11.01    11.02
SD          2.80   4.12     3.41

R = .49, R² = .24, Adjusted R² = .23

Reporting Multiple Regression Results

A standard multiple regression was performed with respondents' level of education (EDUC) as the dependent variable and father's level of education (PAEDUC), mother's level of education (MAEDUC) and respondent's gender (SEXDUMMY) as predictors. The assumptions were evaluated using SPSS EXPLORE.

Table XX displays the correlations between the variables, the unstandardized regression coefficients (B) and intercept, the standardized regression coefficients (β), R² and adjusted R².

R for the regression was significant, F (3, 969) = 99.93, p < .05, with R² = .24. The adjusted R² of .23 indicates that more than one-fifth of the variability of EDUC is predicted by the three predictors.

The regression equation is:
EDUC = 9.14 + .20 (PAEDUC) + .19 (MAEDUC) + .38 (SEXDUMMY)

Multiple Regression

Try the exercise on Linear Regression and Multiple Regression on page 26.

Using Dummy Variables to Check the Moderating Effect of a Variable

Hierarchical Multiple Regression

Is used when there is a need to control for certain variables.
For example, we may wish to study how PAEDUC and MAEDUC predict EDUC while controlling for the age of the respondent (AGE) and the number of siblings (SIBS).
We enter AGE and SIBS in the first batch of variables, and then enter PAEDUC and MAEDUC in the second batch, as predictors of EDUC.
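A syntax sketch for this hierarchical regression (the control variables entered in Block 1, the predictors of interest in Block 2):

* Hierarchical regression: controls first, then parental education.
REGRESSION
  /STATISTICS COEFF R ANOVA CHANGE ZPP TOL
  /DEPENDENT educ
  /METHOD=ENTER age sibs
  /METHOD=ENTER paeduc maeduc.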

[Coefficients output (Dependent Variable: Highest Year of School Completed):
Model 1 - (Constant) B = 15.528 (SE .263), t = 59.09; Age of Respondent B = -.038 (SE .005), Beta = -.226, t = -7.46; Number of Brothers and Sisters Beta = -.238, t = -7.84; both predictors sig at p < .05.
Model 2 - (Constant) B = 9.855 (SE .512), t = 19.23; Age B = -.007 (SE .005), Beta = -.044, t = -1.39 (not sig); SIBS B = -.126 (SE .029), Beta = -.128, t = -4.39; Highest Year School Completed, Father B = .219 (SE .028), Beta = .303, t = 7.83; Highest Year School Completed, Mother B = .137 (SE .033), Beta = .159, t = 4.10; with zero-order, partial and part correlations and collinearity statistics (all tolerances above .5)]

Model Summary (Dependent Variable: Highest Year of School Completed)
                                  Std. Error       Change Statistics
Model  R     R Square  Adj. R²    of the Estimate  R² Change  F Change  df1  df2  Sig. F Change
1      .347  .120      .118       2.802            .120       66.311    2    971  .000
2      .502  .252      .249       2.586            .132       85.238    2    969  .000
Predictors: Model 1 - (Constant), Number of Brothers and Sisters, Age of Respondent; Model 2 - adds Highest Year School Completed, Father, and Highest Year School Completed, Mother

APA Report:
Hierarchical multiple regression was used to assess the ability of PAEDUC and MAEDUC to predict EDUC while controlling for AGE and SIBS. AGE and SIBS were entered at Step 1 (Model 1), explaining 12% of the variance in EDUC. On entering PAEDUC and MAEDUC at Step 2 (Model 2), the total variance explained was 25.2%, F(4, 969) = 81.53, p < .001.
PAEDUC and MAEDUC explained an additional 13.2% of the variance in EDUC after controlling for AGE and SIBS, R squared change = .13, F change (2, 969) = 85.24.
In the final model, only SIBS, PAEDUC and MAEDUC were statistically significant, with PAEDUC having a higher significant effect on EDUC than MAEDUC or SIBS.

Exercise

1) Are PAEDUC and MAEDUC significant predictors of SIBS if we control for AGE and EDUC? Report your findings in the APA format.

Binary Logistic Regression

Used when you want to predict a binary criterion (dependent) variable.
Examples of binary dependent variables:
  0 - No diabetes, 1 - Has diabetes
  0 - No default, 1 - Defaults
  0 - Does not graduate, 1 - Graduates

Assumptions of Binary Logistic Regression

The dependent variable must be binary (1 for the desired outcome and 0 for the other outcome) for binary logistic regression.
The dependent variable must be ordinal for ordinal or multinomial logistic regression.
It does not need many of the assumptions of linear regression, e.g. it does not need to satisfy the conditions of linearity, normality, homoscedasticity and measurement level.
It does not need a linear relationship between the dependent and independent variables.
It can handle all types of relationships because it uses a non-linear log transformation to predict the odds ratio.

Assumptions of Binary Logistic Regression

Independent variables do not need to be multivariate normal.
Residuals do not need to be multivariate normally distributed.
It can use both ordinal and nominal independent variables; in fact all predictors can be nominal.
There should be no multicollinearity among the IVs.
It needs a bigger sample size - about 30 cases for each IV.

E.g. of Research Question: Do EDUC, PAEDUC and MAEDUC predict HAPPYrec (1 = happy, 0 = not happy)?

Recode HAPPY to HAPPYrec (HAPPY 1 and 2 recode to 1, and HAPPY 3 recode to 0).
In SPSS: Analyze - Regression - Binary Logistic.
Enter HAPPYrec into the Dependent box.
Enter EDUC, PAEDUC and MAEDUC into the Covariates box.
Click Save - check Probabilities and Group membership (in the datafile, the respondents will be classified into groups).
Click Options - select Hosmer-Lemeshow goodness-of-fit (to test to what extent the model fits the data) and Iteration History.
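A syntax sketch of this analysis (GOODFIT requests the Hosmer-Lemeshow test, ITER(1) the iteration history, and PRED/PGROUP the saved probabilities and group membership):

* Binary logistic regression of HAPPYrec on the three predictors.
LOGISTIC REGRESSION VARIABLES happyrec
  /METHOD=ENTER educ paeduc maeduc
  /SAVE=PRED PGROUP
  /PRINT=GOODFIT ITER(1).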

In the Output:

A) Step 1 is like the test of the null hypothesis, when there are no predictors in the equation. The prediction is 90.7% accurate.

[Classification table and variables-in-the-equation output: the predictors are all not sig.]

In Step 2:

- Similar to R squared: the % of variance explained by the predictors - 1.8% only.
- Sig.: there is a sig difference between the model and the data - the model does not fit the data.
- All 3 predictors are not sig. - not included in the model.

B) Step 2: when the predictors are entered,

- The percentage accuracy is still 90.7%.
- All 3 predictors are not sig. - not included in the model.

One-way Repeated Measures ANOVA

This analysis is used to compare one sample on three or more variables.
Click Analyze - General Linear Model - Repeated Measures.
You will get the Repeated Measures Define Factors dialogue box.
Example of a research question: Are there significant differences in Health1, Health2 and Health3?

One-way Repeated Measures ANOVA

In the Within-Subject Factor Name box, type "health", which is (assumed to be) measured at 3 different times.
In the Number of Levels box, type 3.
Click Add.
Click Define, and in the Repeated Measures dialogue box click the 3 variables: Health1, Health2 and Health3.
If you want to compare this between males and females, click on the between-subjects variable - in this case Sex - and move it to the Between-Subjects Factors box.
Click on Options, then under Display click Descriptive statistics, Estimates of effect size, Homogeneity tests and Observed power, then Continue.
Click on Plots, then click on the within-group variable (in this case health) and move it to the box labeled Horizontal Axis.
In the Separate Lines box, click on the grouping variable (i.e. Race).
Click Add.
Click Continue and OK.
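A syntax sketch of the repeated measures model just described (health as the within-subjects factor, sex as the between-subjects factor):

* One-way repeated measures ANOVA with a between-subjects factor.
GLM health1 health2 health3 BY sex
  /WSFACTOR=health 3 Polynomial
  /PLOT=PROFILE(health*sex)
  /PRINT=DESCRIPTIVE ETASQ OPOWER HOMOGENEITY
  /WSDESIGN=health
  /DESIGN=sex.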

OUTPUT:

Descriptive Statistics (N = 1009)
                                      Mean    Std. Deviation
Ill Enough to Go to a Doctor          1.45    .498
Counselling for Mental Problems       1.94    .233
Infertility, Unable to Have a Baby    1.97    .183

Multivariate Tests (effect: health)
Pillai's Trace = .508, Wilks' Lambda = .492, Hotelling's Trace = 1.034, Roy's Largest Root = 1.034;
each with F = 520.785, hypothesis df = 2, error df = 1007, Sig. = .000, partial eta squared = .508, observed power = 1.000 (computed using alpha = .05)

Test Equality of Variance or Sphericity

Mauchly's Test of Sphericity (Measure: MEASURE_1; Within-Subjects Effect: health)
Mauchly's W = .666, Approx. Chi-Square = 408.769, df = 2, Sig. = .000
Epsilon: Greenhouse-Geisser = .750, Huynh-Feldt = .751, Lower-bound = .500
(Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.)

Mauchly's W is sig at p < .05: there is a sig difference in variance among the 3 measures, so a statistical correction must be made - choose the Huynh-Feldt correction: F = 810.81, which is sig with df = 1.50 and 1513.40.
If Mauchly's W is NOT sig at p < .05, read the F value in the Sphericity Assumed row.

Tests of Within-Subjects Effects (Measure: MEASURE_1)
health - Sphericity Assumed: SS 171.779, df 2, MS 85.889, F 810.813, Sig .000; Huynh-Feldt: SS 171.779, df 1.501, MS 114.413, F 810.813, Sig .000; partial eta squared (effect size) = .446; observed power = 1.000
Error(health) - Sphericity Assumed: SS 213.555, df 2016; Huynh-Feldt: SS 213.555, df 1513.402

Check Assumptions of Equality of Error Variance and Equality of Covariance Matrices

In the output, check Levene's Test of Equality of Error Variances. If it is not sig at p = .05, then the assumption of homogeneity of variances is not violated.
Then check Box's Test of Equality of Covariance Matrices. If it is sig at p = .001, then the assumption of equality of covariances is violated.

Tests of Within-Subjects Effects (Measure: MEASURE_1)

HEALTH (Huynh-Feldt): SS 35.625, df 1.501, MS 23.727, F 169.004, Sig .000, partial eta squared .144, observed power 1.000
HEALTH * RACE (Huynh-Feldt): SS 1.496, df 3.003, MS .498, F 3.548, Sig .014, partial eta squared .007, observed power .789
Error(HEALTH) (Huynh-Feldt): SS 212.059, df 1510.448, MS .140

The Within-Subjects table shows F is sig at p < .05: there is a sig difference in Health.

Tests of Between-Subjects Effects (Measure: MEASURE_1; Transformed Variable: Average)

Source      Type III SS   df     Mean Square   F           Sig.   Partial Eta Squared   Observed Power
Intercept   2178.266      1      2178.266      17697.670   .000   .946                  1.000
RACE        .698          2      .349          2.836       .059   .006                  .557
Error       123.821       1006   .123

If compared between subjects (Race - White, Black and Other), the RACE line shows F is not sig at p < .05.

[Plot of estimated marginal means of MEASURE_1 across the health measures, with separate lines for Race of Respondent (White, Black, Other), confirming this]

Pairwise Comparisons (Measure: MEASURE_1)

(I) health   (J) health   Mean Difference (I-J)   Std. Error   Sig.
1            2            -.494*                  .017         .000
1            3            -.516*                  .016         .000
2            1            .494*                   .017         .000
2            3            -.023*                  .009         .016
3            1            .516*                   .016         .000
3            2            .023*                   .009         .016

Based on estimated marginal means.
* The mean difference is significant at the .05 level.
Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).

APA style report:
There are sig differences in the health measures, F (1.50, 1513.40) = 810.81, p < .05, with a moderate effect size (eta squared = .45). LSD (Least Significant Difference) comparisons reveal that Health3 is significantly higher than Health2 and Health1, while Health2 is significantly higher than Health1.

Exercise: Try exercise 23 on p. 12 (SPSS Module Part 2/Advanced)

Exercise 23 (additional questions)

1. Are there significant differences in EDUC, MAEDU and PAEDU?
2. Are there significant differences in EDUC, PRESTIG80 and OCCAT80?
3. Assuming hlth1, hlth2 and hlth3 are interval data, are there significant differences in these 3 variables?
For each analysis, write a report using the APA style.

Formulate a research question based on your study which will require a one-way repeated measures ANOVA.

MULTIVARIATE ANOVA (MANOVA)

MANOVA is used when you wish to compare two or more dependent variables (INTERVAL DATA) across a grouping independent variable (NOMINAL DATA), e.g. REGION.
For example, you may wish to check whether respondents in the various locations (REGION) (IV) differ in the levels of EDUC, MAEDU and PAEDU (several DVs).

Assumptions of MANOVA

1) Sample size - each subgroup n > 30.
2) Linearity between DVs. Can be tested using scatterplots among pairs of the DVs across IV groups. (Click Graph - Legacy Dialogs - Scatter/Dot - Matrix Scatter - Define; send all dependent variables to the Matrix Variables box and the IV to the Rows box; Continue, OK.)
3) Univariate and multivariate normality. Test univariate normality using skewness and kurtosis (or Kolmogorov-Smirnov), or use EXPLORE in Descriptive Statistics (boxplot). Test multivariate normality using the Mahalanobis Distance in a multiple regression analysis (use ID as the dependent variable and the predictors as independent variables).

4) Univariate test of equality of variance - use Levene's test in the output to test this. If Levene's test is not significant at p < .05, there is equality of variance for each DV.
5) Homogeneity of variance-covariance matrices - use the Box's M test. If Box's M is not significant at p < .001 (you need to set the level at .001 because the Box's M test is very sensitive), it means that there is homogeneity of variance-covariance.
6) Multicollinearity - use Pearson r (consider removing one of the DV pairs with r > .8).

MULTIVARIATE ANOVA (MANOVA)

Analyze - General Linear Model - Multivariate
Send the DVs to the Dependent Variables box and the independent variable to the Fixed Factor box.
Click Options, click REGION and enter it into Display Means.
Click Compare Main Effects and click Bonferroni, and check Descriptive Statistics and Homogeneity tests.
Click Continue and OK.
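A syntax sketch of this MANOVA (EDUC, MAEDU and PAEDU as the DVs, REGION as the fixed factor, with Bonferroni-adjusted comparisons of the estimated marginal means):

* MANOVA of the three education variables by region.
GLM educ maeduc paeduc BY region
  /PRINT=DESCRIPTIVE ETASQ HOMOGENEITY
  /EMMEANS=TABLES(region) COMPARE ADJ(BONFERRONI)
  /DESIGN=region.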

Descriptive Statistics - Mean (Std. Deviation)

Region of the United States             North East      South East      West            Total
                                        (n = 454)       (n = 239)       (n = 280)       (N = 973)
Highest Year of School Completed        13.53 (2.719)   13.33 (3.060)   13.75 (2.679)   13.54 (2.797)
Highest Year School Completed, Mother   11.20 (3.218)   10.59 (3.466)   11.10 (3.633)   11.02 (3.409)
Highest Year School Completed, Father   11.04 (3.838)   10.69 (4.421)   11.22 (4.282)   11.01 (4.117)

Box's Test of Equality of Covariance Matrices: Box's M = 26.711, F = 2.215, df1 = 12, df2 = 2786265, Sig. = .009
(Tests the null hypothesis that the observed covariance matrices of the dependent variables are equal across groups. Design: Intercept+region)

The Box's M test assesses the homogeneity of the variance-covariance matrices at p < .001. Box's M is not significant at p < .001, so there is no sig difference in the variance-covariance matrices - homogeneity of variance-covariance holds.

Levene's Test of Equality of Error Variances (df1 = 2, df2 = 970)
Highest Year of School Completed: F = 1.529, Sig. = .217
Highest Year School Completed, Mother: F = 4.363, Sig. = .013
Highest Year School Completed, Father: F = 5.416, Sig. = .005
(Tests the null hypothesis that the error variance of the dependent variable is equal across groups. Design: Intercept+region)

The univariate tests for homogeneity of variance for each DV show that for EDUC (not sig at p < .05) there is no sig difference in variance - there is equality of variance. For MAEDU and PAEDU there are sig differences - no equality of variance - so the F for MAEDU and PAEDU needs to be interpreted at a more stringent alpha level, say p < .01.

Multivariate Tests (effect of Region of the United States)
Pillai's trace = .008, F = 1.323, hypothesis df = 6, error df = 1938, Sig. = .243
Wilks' lambda = .992, F = 1.322, hypothesis df = 6, error df = 1936, Sig. = .244
Hotelling's trace = .008, F = 1.321, hypothesis df = 6, error df = 1934, Sig. = .244
Roy's largest root = .006, F = 1.800, hypothesis df = 3, error df = 969, Sig. = .146

These multivariate tests assess whether there is a sig group (REGION) difference on the linear combination of the DVs. Pillai's Trace (the statistic most robust against violation of assumptions) is NOT sig at p < .05, so there is no sig multivariate effect for REGION. There is no need to interpret the univariate between-subjects (REGION) tests.

Tests of Between-Subjects Effects

[Univariate ANOVA output for the effect of REGION on each DV: Highest Year of School Completed (SS 23.697, df 2, F 1.516, Sig .220), Highest Year School Completed, Mother (SS 59.902, df 2, F 2.586, Sig .076) and Highest Year School Completed, Father (SS 38.095, df 2, F 1.124, Sig .325); R Squared = .003, .005 and .002 respectively]

As shown by the Pillai's Trace test, the multivariate tests are not sig (using the Bonferroni correction, alpha = .05/3 = .017). There are no significant EDUC, MAEDU and PAEDU differences by REGION.

APA report
MANOVA was undertaken to investigate Region differences in
PAEDUC, MAEDUC and EDUC. All assumptions relating to normality,
linearity, univariate and multivariate outliers (Mahalanobis Distance
within required limits), homogeneity of variance-covariance
matrices (Box's M was not sig at p < .001) and multicollinearity were
satisfied. There were no region differences in PAEDUC, MAEDUC
and EDUC, F(6, 1938) = 1.32, p > .05.
Note:
(If F is significant, you will need to report Pillai's Trace and the
effect size, partial eta squared. Check the mean scores on the
significant DV for the 3 regions to see which two regions differ
significantly on that DV.)
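For reference, the same MANOVA can be run from a syntax window. A
minimal sketch, assuming the variable names EDUC, MAEDUC, PAEDUC and
REGION used above; HOMOGENEITY prints Box's M and Levene's tests:

  * One-way MANOVA of the three education DVs by region.
  GLM educ maeduc paeduc BY region
    /PRINT=DESCRIPTIVE ETASQ OPOWER HOMOGENEITY
    /DESIGN=region.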
158

Another example of MANOVA output

Statistical assumptions of the analyses were met, and descriptive
statistics are reported in Table xx. A one-way between-groups
MANOVA partially supported the first hypothesis of
there being a difference in procrastination types between
students and white-collar workers, Pillai's Trace = .05,
F(3, 181) = 3.2, p = .03, partial eta squared = .05, power = .73.

159

Another example of a MANOVA table with Tukey

Jin Hwang & YoungHo Kim (2011). Adolescents' physical activity and its
related cognitive and behavioral processes. Biology of Sport, 28, 19-22. (ISI TIER 4)

160

DISCRIMINANT ANALYSIS
Is used when you wish to find out, for example,
which students, based on their personality characteristics or
interests (independent, Scale data), will choose
which career (dependent, Nominal data).
So the independent variables will be the students'
personality characteristics or interests, e.g.
extrovert, creative, etc. (Scale data), and the
dependent variable will be the choice of career,
e.g. Medicine or Architecture (Nominal data).
161

Assumptions

1. The data must be multivariate normal. Test multivariate normality
using Mahalanobis Distance in Multiple Regression Analysis. (If each
DV is normally distributed and the groups are of equal sample sizes,
multivariate normality is indicated.)
2. Homogeneity of variance-covariance matrices. Use the Box's M
test. If Box's M is not significant at p < .001 (you need to set .001
because the Box's M test is very sensitive), it means that there is
homogeneity of variance-covariance.
3. Multicollinearity and singularity. Use Pearson r to check this.
(Singularity refers to a situation where one variable is perfectly
correlated with another variable.)
4. Use the EXPLORE boxplot to remove outliers. (Go to EXPLORE,
click Plots in the Display box, enter the IVs (predictors) into the
Dependent List and the grouping variable (the nominal DV) into the
Factor List.)

162

To analyze, click:

ANALYZE -> CLASSIFY -> DISCRIMINANT
Let's say you wish to find out if you can classify
respondents into Very Happy, Pretty Happy and Not Too
Happy (a Nominal variable, HAPPY) using the
information from AGE, EDUC and PRESTIG80.

163

Move the dependent variable (e.g. Career) to the Grouping
Variable box. Click Define Range to indicate how many
different types of Career you wish to study, entering the
Minimum and Maximum codes.
Move the independent variables (e.g. personality variables)
into the Independents box.
Click Use Stepwise Method.
Click STATISTICS, and select Means, Univariate
ANOVAs, Box's M, Unstandardized Function
Coefficients, Total Covariance Matrix and Separate-Groups
Covariance. Click Continue.
Click CLASSIFY and select Summary table, then click
Continue.

164

Click the METHOD button. Wilks' Lambda is selected by
default as the statistic that will be used for the entry and
removal of variables to and from the discriminant
functions. The F criteria set for entry and removal are 3.84
and 2.71 respectively. [Or check the lower radio button to
set the criteria using probabilities of F instead, i.e. .05 for
entry and .10 for removal.]
Click SAVE to get the Discriminant Analysis: Save dialogue
box, which saves Discriminant Scores and Predicted Group
Membership into the Data File.
If you wish to analyze Male students only, you can use
Selection Variable and enter 1 (Male) in the Value box.
Then click OK to execute the Discriminant Analysis.
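The same analysis as a minimal syntax sketch, assuming the GSS
variable names used here (HAPPY coded 1 to 3, plus AGE, EDUC and
PRESTG80); FIN and FOUT are the entry/removal F criteria above:

  * Stepwise discriminant analysis of HAPPY from AGE, EDUC, PRESTG80.
  DISCRIMINANT
    /GROUPS=happy(1 3)
    /VARIABLES=age educ prestg80
    /METHOD=WILKS
    /FIN=3.84
    /FOUT=2.71
    /PRIORS EQUAL
    /STATISTICS=MEAN STDDEV UNIVF BOXM RAW TABLE
    /CLASSIFY=NONMISSING POOLED.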

165

OUTPUT
Group Statistics

  General Happiness                                        Mean    Std. Deviation   Valid N (unweighted / weighted)
  Very Happy      Age of Respondent                        47.28   17.766           441 / 441.000
                  Highest Year of School Completed         13.52    2.987           441 / 441.000
                  R's Occupational Prestige Score (1980)   45.19   12.883           441 / 441.000
  Pretty Happy    Age of Respondent                        44.82   17.422           814 / 814.000
                  Highest Year of School Completed         12.87    2.914           814 / 814.000
                  R's Occupational Prestige Score (1980)   42.22   12.925           814 / 814.000
  Not Too Happy   Age of Respondent                        46.66   17.329           147 / 147.000
                  Highest Year of School Completed         12.28    2.835           147 / 147.000
                  R's Occupational Prestige Score (1980)   40.35   13.653           147 / 147.000
  Total           Age of Respondent                        45.79   17.547           1402 / 1402.000
                  Highest Year of School Completed         13.01    2.952           1402 / 1402.000
                  R's Occupational Prestige Score (1980)   42.96   13.080           1402 / 1402.000

(N = number of respondents in each group)

166

Tests of Equality of Group Means

                                            Wilks' Lambda   F        df1   df2    Sig.
  Age of Respondent                         .996             3.018     2   1399   .049
  Highest Year of School Completed          .983            12.109     2   1399   .000
  R's Occupational Prestige Score (1980)    .985            10.823     2   1399   .000

There are sig differences among the 3 groups
(Very Happy, Pretty Happy, Not Too Happy) on each
of the 3 IVs (AGE, EDUC, PRESTIG80) at p < .05.

Variables in the Analysis

  Step                                       Tolerance   F to Remove   Wilks' Lambda
  1     Highest Year of School Completed     1.000       12.109
  2     Highest Year of School Completed      .929       15.166        .996
        Age of Respondent                     .929        6.042        .983

Variables Not in the Analysis

  Step                                             Tolerance   Min. Tolerance   F to Enter   Wilks' Lambda
  0     Age of Respondent                          1.000       1.000              3.018      .996
        Highest Year of School Completed           1.000       1.000             12.109      .983
        R's Occupational Prestige Score (1980)     1.000       1.000             10.823      .985
  1     Age of Respondent                           .929        .929              6.042      .975
        R's Occupational Prestige Score (1980)      .737        .737              3.146      .979
  2     R's Occupational Prestige Score (1980)      .716        .665              1.993      .972

A high Tolerance value means that the IV can contribute
to the discrimination. F to Remove tests the significance of
the decrease in discrimination if the variable is removed.
Since PRESTIG80's F to Enter (1.993) falls below even the
default removal criterion of 2.71, let alone the entry criterion
of 3.84, it is left out of the prediction.

167

Eigenvalues

  Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
  1          .024 (a)     91.1             91.1          .152
  2          .002 (a)      8.9            100.0          .048

  a. First 2 canonical discriminant functions were used in the analysis.

Function 1 has the highest % of variance.

Wilks' Lambda

  Test of Function(s)   Wilks' Lambda   Chi-square   df   Sig.
  1 through 2           .975            36.038        4   .000
  2                     .998             3.244        1   .072

Wilks' Lambda is sig for Functions 1 and 2 combined
(p < .001) but not for Function 2 alone (p = .072), so
only Function 1 discriminates significantly.

Structure Matrix

                                               Function 1   Function 2
  Highest Year of School Completed              .837*       -.547
  R's Occupational Prestige Score (1980) (a)    .509*       -.160
  Age of Respondent                             .305         .952*

  Pooled within-groups correlations between discriminating
  variables and standardized canonical discriminant functions.
  Variables ordered by absolute size of correlation within function.
  *. Largest absolute correlation between each variable and
  any discriminant function
  a. This variable not used in the analysis.

168

Classification Results (a)

  Original   General Happiness   Predicted Group Membership                    Total
                                 Very Happy   Pretty Happy   Not Too Happy
  Count      Very Happy          214          105            148               467
             Pretty Happy        310          227            329               866
             Not Too Happy        47           39             77               163
             Ungrouped cases       5            1              6                12
  %          Very Happy          45.8         22.5           31.7              100.0
             Pretty Happy        35.8         26.2           38.0              100.0
             Not Too Happy       28.8         23.9           47.2              100.0
             Ungrouped cases     41.7          8.3           50.0              100.0

  a. 34.6% of original grouped cases correctly classified.

The success rate of predicting HAPPY using
EDUC, AGE and PRESTIG80 is 34.6%.

Those in Not Too Happy were most accurately
classified (47.2%), followed by those in Very Happy (45.8%).
Pretty Happy was least successfully classified (26.2%).
Misclassified Very Happy respondents were placed in
Not Too Happy (31.7%) slightly more often than in
Pretty Happy (22.5%).

169

Note: if we click Save and Predicted Group Membership,
you will get a column in the data file with the predicted
group each respondent belongs to!

170

Testing for Moderating Effects of a Variable

Use Multiple Regression with the Moderating Variable
as a Dummy Variable.
E.g. if sex is the moderating variable,
RECODE Male = 1 and Female = 0 to see if
there is any difference in the DV when
gender changes from 0 to 1.
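A minimal syntax sketch of this approach (the IV, DV and variable
names here are illustrative): create the dummy, form the product
(interaction) term, then test the R-square change when the
interaction is added in a second block:

  * Dummy-code sex (1 = Male, 0 = Female) and build the interaction term.
  RECODE sex (1=1) (2=0) INTO male.
  COMPUTE educXmale = educ * male.
  * Block 2 tests whether the moderation effect adds to prediction.
  REGRESSION
    /STATISTICS=COEFF R ANOVA CHANGE
    /DEPENDENT=prestg80
    /METHOD=ENTER educ male
    /METHOD=ENTER educXmale.

A significant interaction term (i.e. a significant R-square change)
indicates that sex moderates the IV-DV relationship.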

171

Testing for Mediation

You can test the mediation effect of a variable
online:
http://www.people.ku.edu/~preacher/sobel/sobel.htm

172

To conduct the Sobel test
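The calculator applies the standard Sobel formula. As a sketch,
where a is the unstandardized coefficient for the IV -> mediator
path (standard error SEa) and b is the coefficient for the
mediator -> DV path, controlling for the IV (standard error SEb):

  z = ab / √(b²·SEa² + a²·SEb²)

If |z| > 1.96, the indirect (mediated) effect is significant at p < .05.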

173

Testing Mediating Effects of a Variable

Use Partial Correlations.
Go to ANALYZE -> CORRELATE -> PARTIAL
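A minimal syntax sketch for the example that follows (the variable
names stand in for the creativity, achievement and IQ scores in the
data file):

  * Creativity-achievement correlation, controlling for IQ.
  PARTIAL CORR
    /VARIABLES=creativity achievement BY iq
    /SIGNIFICANCE=TWOTAIL
    /STATISTICS=DESCRIPTIVES CORR
    /MISSING=LISTWISE.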

174

RESULTS:
Descriptive Statistics

                                        Mean       Std. Deviation   N
  Standard Overall Creativity Regular   350.4529   50.82975         465
  Std Overall Achievement               150.1848   23.97305         465
  Raw IQ                                114.7419   24.32108         465

Correlations
(Creativity = Standard Overall Creativity Regular; Achievement = Std Overall Achievement)

  Control Variables                                           Creativity   Achievement   Raw IQ
  -none- (a)   Creativity    Correlation                      1.000         .178          .178
                             Significance (2-tailed) / df     . / 0         .000 / 463    .000 / 463
               Achievement   Correlation                       .178        1.000          .340
                             Significance (2-tailed) / df     .000 / 463   . / 0          .000 / 463
               Raw IQ        Correlation                       .178         .340         1.000
                             Significance (2-tailed) / df     .000 / 463   .000 / 463    . / 0
  Raw IQ       Creativity    Correlation                      1.000         .127
                             Significance (2-tailed) / df     . / 0         .006 / 462
               Achievement   Correlation                       .127        1.000
                             Significance (2-tailed) / df     .006 / 462   . / 0

  a. Cells contain zero-order (Pearson) correlations.

The zero-order creativity-achievement correlation (r = .178,
p < .001) drops to r = .127 (p = .006) when IQ is controlled, but
remains significant, suggesting at most partial mediation by IQ.

175

Bootstrapping

Is the process of taking sub-samples from the original
sample (itself taken from the population) with replacement.
With replacement means the same data point can be sampled
multiple times.
The original sample (the data collected) is treated as a
stand-in for the population.
Bootstrapping allows you to obtain a range of values that
better represents the real value in the population.
This is used in calculating the Confidence Interval, CI.
Researchers quantify the precision of their estimates using
Confidence Intervals.
Confidence Intervals provide more information than Null
Hypothesis Significance Testing (NHST), so you should use
both.
176

E.g. Null Hypothesis testing shows that a difference of
means of 5 is significant.
If the 95% confidence interval of the difference of the
means ranges from 2 to 8, the whole interval lies above
zero, so we can say with 95% confidence that the means are
significantly different.
But if the confidence interval ranges from -1 to 8, there is a
possibility that the difference in the population is zero (or
even negative), i.e. NOT significantly different.

177

178

Group Statistics
Highest Year of School Completed, by Respondent's Sex

                                                     Bootstrap (a)
  Sex      N     Statistic          Value    Bias    Std. Error   95% CI (Lower, Upper)
  Male     633   Mean               13.23    .00     .12          (12.99, 13.47)
                 Std. Deviation      3.143   .001    .097         (2.948, 3.336)
                 Std. Error Mean      .125
  Female   877   Mean               12.63    .00     .10          (12.43, 12.83)
                 Std. Deviation      2.839   .002    .082         (2.667, 3.000)
                 Std. Error Mean      .096

  a. Unless otherwise noted, bootstrap results are based on 1000 bootstrap samples

Exercise on Bootstrapping
Similar to the earlier exercise, use bootstrapping to confirm
whether there is a significant difference between Male and Female
respondents' level of education (EDUC).
Discuss the results.
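A minimal syntax sketch (requires the SPSS Bootstrapping module;
EDUC and SEX are the variable names used above):

  * Bootstrap the group statistics, then run the usual t-test.
  BOOTSTRAP
    /SAMPLING METHOD=SIMPLE
    /VARIABLES TARGET(educ) INPUT(sex)
    /CRITERIA CILEVEL=95 CITYPE=PERCENTILE NSAMPLES=1000
    /MISSING USERMISSING=EXCLUDE.
  T-TEST GROUPS=sex(1 2)
    /VARIABLES=educ.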
179

Non-parametric tests

do not require a normal distribution
do not require equal group variances
are used with variables that are ordinal or nominal
e.g. Chi-square for determining the relationship between
nominal - nominal data or nominal - ordinal data
(SPSS-Data Editor-Statistics-Summarize-Crosstabs)
e.g. Spearman Rank-Order correlation for seeking the
relationship between ordinal - ordinal data
e.g. Mann-Whitney U-test to compare 2 different
groups on ordinal/interval data
180

Non-parametric tests
Kruskal-Wallis Test (to compare > 2 different
groups)
Friedman Test (to compare the same group > 2
times)

181

Analysis and Reporting of Inferential Statistics
Chi-square, Mann-Whitney U-test, Kruskal-Wallis

182

Non-Parametric Statistical Analyses
(Degree of Association) - Chi-square
SPSS Data Editor - Statistics - Summarize - Crosstabs
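A minimal syntax sketch (SEX and REGION as in Exercise 5 below;
the PHI keyword also prints Cramer's V for the effect size):

  * Chi-square test of association between two nominal variables.
  CROSSTABS
    /TABLES=sex BY region
    /STATISTICS=CHISQ PHI
    /CELLS=COUNT ROW.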

183

Non-Parametric Statistical Analyses
(Degree of Association) - Chi-square

Chi-square: used to find the degree of
association between 2 nominal variables

[SPSS crosstab of parents' creativity level by their responses on the
CR (Creative Childrearing) item; the counts and chi-square result are
reported on slide 187 below]

184

Effect Size for Chi-square

If df = 1, use

  φ = √(χ² / N)

where φ (Phi) is the effect size. φ can range from -1 to +1, with a
higher absolute value indicating a stronger relationship. Like r,
φ² is the percentage of the variance accounted for.

If df > 1 (for tables bigger than 2 x 2), use Cramer's V, given by

  V = √(χ² / (df × N))

where df is the smaller of the 2 dfs, i.e. the smaller of
(rows - 1) and (columns - 1).


185

Interpretation of Phi and Cramer's V

For df = 1 (Phi):
.10 to .30  Small
.31 to .50  Medium
.51 and up  Large

For df = 2:
.07 to .21  Small
.22 to .35  Medium
.36 and up  Large

For df = 3:
.06 to .17  Small
.18 to .29  Medium
.30 and up  Large

186

Reporting Cross Tabulations / Chi-square Analysis

df = (No of rows - 1)(No of columns - 1)
χ²(df, N = sample size) = 12.47, p < .05.
Descriptive:
Sixteen low, 8 average and 9 high creative
parents answered No, while 1 average and 8 high creative
parents answered Yes on item 29. The chi-square analysis
reveals a significant association between parents' creativity
and their responses, χ²(2, N = 41) = 12.47, p < .05, V = .55.
Cramer's V indicates the effect size is large.
Interpretation:
The results show that creative and non-creative parents answer
item 29 differently, with the creative parents significantly more
likely to answer Yes.
187

Non-Parametric Statistical Analyses (Relationship)

[SPSS crosstab of item 30 by the childrearing practices; the
chi-square is not significant]

(Effect size = Cramer's V for a 2 x 3 table)

FINDING:
There is no relationship between item 30 and the
childrearing practices, χ²(2, N = 16) = 4.09, p > .05, V = .31.

188

Non-Parametric Statistical Analyses (Relationship)

[SPSS crosstab of SAM by CR; the chi-square is not significant]

(Effect size = Cramer's V for a 2 x 3 table)

FINDING:
There is no relationship between SAM and CR,
χ²(2, N = 16) = 2.24, p > .05, V = .23.

189

Exercise 5

Chi-square
ANALYZE -> DESCRIPTIVE STATISTICS -> CROSSTABS

1. Is sex of respondents related to region?
2. Is race related to region?
3. Is there a relationship between the level
of general happiness (happy) and sex of the
respondents?
4. Is perception of life related to general
happiness (happy)?
190

ISI paper Chi-square Example

191

Non-Parametric Statistical Analyses (Comparison
of Groups on ordinal data) - Mann-Whitney U Test

SPSS Data Editor - Nonparametric Tests - 2 Independent Samples
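A minimal syntax sketch for the example on the next slide (ARTISTRY
and PARENT, coded 1 = Father and 2 = Mother, are placeholder names):

  * Mann-Whitney U test comparing two independent groups.
  NPAR TESTS
    /M-W=artistry BY parent(1 2).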

192

Non-Parametric Statistical Analyses
(Comparison of 2 Groups on ordinal data)
Mann-Whitney U-Test

[SPSS Mann-Whitney output: ranks and test statistics; not significant]

FINDING:
Fathers (Md = , n = 15) and mothers (Md = , n = 26) do not
differ in the variable Artistry
(U = 194.50, z = -.01, p > .05).

193

Effect size for Mann-Whitney

Given by

  r = z / √N

where r is the effect size; z and N can be obtained from the
output.
Use Cohen's (1988) criteria to interpret the effect size.

Effect size (Cohen's d), Eta Squared and Interpretation

  Effect Size (Cohen's d)   Eta Squared (η²)    Interpretation
  0.2 <= d < 0.5            .01 <= η² < .06     Small
  0.5 <= d < 0.8            .06 <= η² < .14     Moderate
  d >= 0.8                  η² >= .14           Large

194

Exercise 6
Mann-Whitney U-test
ANALYZE -> NONPARAMETRIC -> 2 INDEPENDENT SAMPLES

1. How do older people (age over 50) and
younger people (age 50 and below) differ
in their perception of
a) their general happiness (happy)
b) their perception of life (life)?
195

Wilcoxon Matched Pair Signed Ranks

Used when the same sample is compared on two different
nominal or ordinal variables.
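A minimal syntax sketch for the FIKIR/IKUT example that follows:

  * Wilcoxon signed-ranks test for two paired ratings.
  NPAR TESTS
    /WILCOXON=fikir WITH ikut (PAIRED)
    /STATISTICS=DESCRIPTIVES QUARTILES.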

196

Descriptive Statistics

           N     Mean   Std. Deviation   Min   Max   Percentiles: 25th   50th (Median)   75th
  FIKIR    120   2.40   1.111            1     4                 1.00    2.00            3.00
  IKUT     120   2.38    .979            1     4                 2.00    2.00            3.00

Ranks

  IKUT - FIKIR         N     Mean Rank   Sum of Ranks
  Negative Ranks (a)   45    50.04       2252.00
  Positive Ranks (b)   49    45.16       2213.00
  Ties (c)             26
  Total                120

  a. IKUT < FIKIR
  b. IKUT > FIKIR
  c. IKUT = FIKIR

Test Statistics (b)

  IKUT - FIKIR:  Z = -.076 (a), Asymp. Sig. (2-tailed) = .939
  a. Based on positive ranks.
  b. Wilcoxon Signed Ranks Test

Effect Size:
  r = z / √N = .076 / √240 = .005
(N = 240 is the total number of observations over the two ratings, 2 x 120.)

The Wilcoxon analysis is used to ascertain whether there are
significant differences between students' perception of IKUT and
the importance of following instructions, FIKIR. The results show
that there is no significant difference between IKUT (Md = 2.00)
and FIKIR (Md = 2.00), z = -.076, p > .05, r = .005.

197

Exercise 7 (Wilcoxon)
1) Are there significant differences between
HAPPY and LIFE among the respondents?

198

Non-Parametric Statistical Analyses
(Comparison of more than 2 Groups
on ordinal data)
Kruskal-Wallis One-way ANOVA (mean ranks)
Research Question:
Is there a significant difference between respondents from
different regions in their perception of life (LIFE)?
ANALYZE -> NONPARAMETRIC -> K INDEPENDENT SAMPLES
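A minimal syntax sketch (LIFE and REGION as in the GSS file, with
REGION coded 1 to 3):

  * Kruskal-Wallis test comparing the three regions on LIFE.
  NPAR TESTS
    /K-W=life BY region(1 3).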

199

Non-Parametric Statistical Analyses
(Comparison of more than 2 Groups on
ordinal data) (contd.)

Ranks (Is Life Exciting or Dull, by Region of the United States)

  Region       N     Mean Rank
  North East   433   497.30
  South East   267   511.12
  West         280   460.32
  Total        980

South East has the highest mean rank on LIFE.

Test Statistics (a,b)

  Is Life Exciting or Dull:  Chi-Square = 6.247, df = 2, Asymp. Sig. = .044
  a. Kruskal Wallis Test
  b. Grouping Variable: Region of the United States

The p value is significant at p < .05.

FINDING:
There are significant differences in the perception of LIFE
between the three regions, χ²(2, N = 980) = 6.25, p < .05.

200

Holm's sequential Bonferroni for multiple
non-parametric comparisons

To find out which group is significantly different from
which group, use Holm's sequential Bonferroni method.
Step 1: Set the familywise alpha (usually α(family) = .05).
Step 2: How many pair-wise comparisons, Nc? Use the
formula: Nc = Ng(Ng - 1)/2, where Ng = no of groups.
Step 3: Rank the comparisons based on p values from
smallest to largest.
Step 4: Evaluate the first comparison using α1 = α(family)/Nc.
Step 5: If p < α1, reject the null hypothesis and compare the
next pair. If p > α1, do not reject the null hypothesis and
declare all comparisons not significant.
201

Step 6: Evaluate the second comparison using α2 = α(family)/(Nc - 1).
Step 7: If p < α2, reject the null hypothesis and compare
the next pair. If p > α2, do not reject the null hypothesis
and declare the rest of the comparisons not significant.
Step 8: Evaluate the third comparison using
α3 = α(family)/(Nc - 2).
Step 9: If p < α3, reject the null hypothesis and compare the
next pair. If p > α3, do not reject the null hypothesis and
declare the rest of the comparisons not significant.
AND SO ON

202

Holm's sequential Bonferroni for multiple
non-parametric comparisons
(LIFE coding: 1 = exciting, 2 = routine, 3 = dull)

Ranks (Is Life Exciting or Dull, by Region of the United States)

  Region       N     Mean Rank
  North East   433   497.30
  South East   267   511.12
  West         280   460.32
  Total        980

Rank the comparisons based on p values from smallest to largest:

  Comparison   Mann-Whitney U   p value   (Holm's correction, α)
  NE vs W      56048.00         .05       (.017)
  NE vs SE     56178.00         .17       (.025)
  SE vs W      33502.00         .48       (.05)

Since the smallest p (.05) is greater than α1 = .017, do not reject
the null hypothesis and declare all comparisons not significant.

Results of Holm's sequential Bonferroni comparison:
NE respondents (mean rank = 497.30) do not find life significantly
more exciting than SE respondents (mean rank = 511.12). There are
also no significant differences between NE and W, or between SE
and W.

203

Effect size for Kruskal-Wallis

Ranks (FIKIR, by RAS)

  RAS      N     Median   Mean Rank
  Ras A    71    2.00     63.83
  Ras B    31    2.00     57.08
  Ras C    18    2.00     53.25
  Total    120   2.00

Test Statistics (a,b)

  FIKIR:  Chi-Square = 1.854, df = 2, Asymp. Sig. = .396
  a. Kruskal Wallis Test
  b. Grouping Variable: RAS

Effect size is given by eta squared:

  η² = χ² / (N - 1) = 1.854 / (120 - 1) = .016

The Kruskal-Wallis comparison of the three races on perception
was not significant, χ²(2, N = 120) = 1.854, p > .05, η² = .02.
204

ISI paper Kruskal-Wallis Example

Source:
Rubie-Davies, C. M. (2007). Classroom interactions: Exploring
the practices of high- and low-expectation teachers. British Journal of
Educational Psychology, 77, 289-306.

206

Exercise 8 Kruskal-Wallis
Are there significant differences in HAPPY
among the Whites, Blacks and Others?

207

Friedman's ANOVA
To compare three (or more) nominal or
ordinal variables from a single group
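A minimal syntax sketch for the FIKIR/KERJA/POPULAR example on the
next slide (the KENDALL subcommand prints Kendall's W, the effect
size used below):

  * Friedman test for three related ratings, with Kendall's W.
  NPAR TESTS
    /FRIEDMAN=fikir kerja popular
    /KENDALL=fikir kerja popular
    /STATISTICS=QUARTILES.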

208

Descriptive Statistics

            N     Percentiles: 25th   50th (Median)   75th
  FIKIR     120                1.00   2.00            3.00
  KERJA     120                2.00   3.00            4.00
  POPULAR   118                2.00   2.00            3.00

Ranks

            Mean Rank
  FIKIR     1.96
  KERJA     2.12
  POPULAR   1.92

Test Statistics (a)

  N = 118, Chi-Square = 3.247, df = 2, Asymp. Sig. = .197
  a. Friedman Test

Test Statistics

  N = 118, Kendall's W (a) = .014, Chi-Square = 3.247, df = 2,
  Asymp. Sig. = .197
  a. Kendall's Coefficient of Concordance

Friedman's ANOVA effect size is given by Kendall's W:

  W = χ² / (N(k - 1))

In this example, k = 3 and N = 118, so
W = 3.247 / (118 × 2) = .014.
Using Cohen's (1988) criteria, W = .01
indicates a small effect size.
209

Exercise 9 Friedman's ANOVA

1) Are there significant differences between
HAPPY, LIFE and HELPOTH among the
respondents?

210

GOLDEN TABLE

211

212
