
SW388R6

Data Analysis
and Computers I

Chi-square Test of Goodness-of-Fit

Slide 1

Key Points for the Statistical Test


Sample Homework Problem
Solving the Problem with SPSS
Logic for Chi-square Test of Goodness-of-Fit
Power Analysis


Chi-square Test of Goodness-of-Fit : Purpose

Slide 2

Purpose: test whether or not the proportion of subjects in each category matches our expectations

Examples:

The ethnic breakdown of the university differs from the statewide proportions
The ethnic breakdown of the school of social work differs from the university

Slide 3

Chi-square Test of Goodness-of-Fit:
Hypotheses

Hypotheses:
Null: Observed frequencies = expected frequencies
versus
Research: Observed frequencies ≠ expected frequencies

Decision:
Reject the null hypothesis if p (from SPSS) ≤ alpha


Chi-square Test of Goodness-of-Fit:
Assumptions and Requirements

Slide 4

Variable contains categories or groupings

Sample size is sufficiently large so that:
No cell has an expected frequency less than 1
and
No more than 20% of the cells have an expected frequency less than 5


Chi-square Test of Goodness-of-Fit:
Effect Size

Slide 5

Cohen's w measures the difference between expected and observed proportions over all categories of the variable

Interpretation:
small: w = .10 to .30
medium: w = .30 to .50
large: w = .50 and higher
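
The effect size can be sketched in Python. This is a minimal illustration, assuming the standard definition of Cohen's w as the square root of the sum over categories of (observed proportion - expected proportion)^2 / expected proportion. The function names and the two-category example are hypothetical and are not taken from the course data.

```python
import math

def cohens_w(expected_props, observed_props):
    """Cohen's w: sqrt of the sum over categories of (p1 - p0)^2 / p0."""
    return math.sqrt(sum((p1 - p0) ** 2 / p0
                         for p0, p1 in zip(expected_props, observed_props)))

def interpret_w(w):
    """Label w using the cutoffs listed on this slide."""
    if w >= 0.50:
        return "large"
    if w >= 0.30:
        return "medium"
    if w >= 0.10:
        return "small"
    return "below small"

# Hypothetical two-category example: expected 50/50, observed 60/40.
w = cohens_w([0.5, 0.5], [0.6, 0.4])
print(round(w, 2), interpret_w(w))
```

When observed and expected proportions are identical, w is 0, which corresponds to a perfect fit.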


Chi-square Test of Goodness-of-Fit:
APA Style

Slide 6

A chi-square test of goodness-of-fit is presented as follows:

χ²(4, N = 57) = 3.17, p = .53

4: degrees of freedom
N = 57: number of cases
3.17: value of statistic
p = .53: significance of statistic
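
The reported significance can be checked against the statistic and its degrees of freedom. The sketch below is a hedged illustration, not SPSS's own routine: it uses a closed form for the chi-square upper tail that is exact only when the degrees of freedom are even (as in this example), and the function names are hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square statistic, exact for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k! step by step
        total += term
    return math.exp(-half) * total

def apa_chi_square(df, n, statistic):
    """Format the result in the layout shown on this slide."""
    p = chi2_sf_even_df(statistic, df)
    p_text = f"{p:.2f}".lstrip("0")  # APA style drops the leading zero
    return f"chi-square({df}, N = {n}) = {statistic:.2f}, p = {p_text}"

print(apa_chi_square(4, 57, 3.17))
```

For odd degrees of freedom a statistics library would be needed instead of this closed form.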


Homework problems:
Chi-square test of goodness-of-fit

Slide 7

This problem analyzes the variable "marital status" [marital] for a subset of the cases in
GSS2000R.Sav. The subset is based on the variable "attitude toward life" [life]. Using an
alpha of .05, is the following statement true, true with caution, false, or an incorrect
application of a statistic?
Previous research on survey respondents who said that they generally find life pretty
routine found that the breakdown for "marital status" was 38.7% in the category 'married',
16.1% in the category 'widowed', 10.8% in the category 'divorced', 5.4% in the category
'separated' and 29.0% in the category 'never married'.
A chi-square test of goodness-of-fit was performed on the variable "marital status" and
found that the breakdown in our sample was significantly different from the breakdown
found in previous research.
HINT: Applying the percentage breakdown from previous research to our sample of 93 cases
would result in expected frequencies of 36 in the category 'married', 15 in the category
'widowed', 10 in the category 'divorced', 5 in the category 'separated' and 27 in the
category 'never married'.
o True
o True with caution
o False
o Incorrect application of a statistic

This is the general framework for the problems in the homework assignment on the
chi-square goodness-of-fit test. The description is similar to findings one might
state in a research article.


Homework problems:
Data set, variables, and sample

Slide 8

The first two paragraphs of the problem statement identify:

The data set to use, e.g. GSS2000R.Sav
The subset of cases to include in the analysis
The variable to use to create the subset
The variable used in the chi-square test of goodness-of-fit
The alpha level to use in the hypothesis test


Homework problems:
Specifications for the test

Slide 9

The second paragraph of the problem statement identifies:
The breakdown of the categories found in previous research.

The fourth paragraph provides:
A hint that computes the expected frequencies that SPSS will need to compute the goodness-of-fit test.

Expected frequencies are computed by multiplying the percentage found in each
category as reported in the previous research (38.7%, 16.1%, etc.) times the
total number of cases in our sample (93).
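
The arithmetic behind the hint can be reproduced in a few lines. This sketch uses the percentages and the sample size of 93 from the problem statement; the variable names are illustrative, and rounding to whole cases is an assumption about how the hint was prepared.

```python
# Proportions reported in the previous research (from the problem statement).
previous_research = {
    "married": 0.387,
    "widowed": 0.161,
    "divorced": 0.108,
    "separated": 0.054,
    "never married": 0.290,
}
n_cases = 93  # cases in our subset

# Expected frequency = proportion from previous research * sample size.
expected = {cat: p * n_cases for cat, p in previous_research.items()}

# Assumption: the hint rounds each expected frequency to a whole number.
rounded = {cat: round(freq) for cat, freq in expected.items()}
print(rounded)
```

The rounded values agree with the hint (36, 15, 10, 5 and 27) and add up to the 93 cases in the sample.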

Homework problems:
Choosing an answer

Slide 10

The answer to a problem will be True if the goodness-of-fit test supports the finding in the problem statement.

Since it is legitimate to use ordinal variables in the chi-square goodness-of-fit test, True with caution is not used for these problems.

The answer to a problem will be False if the goodness-of-fit test does not support the finding in the problem statement.

The answer to a problem will be Incorrect application of a statistic if the
goodness-of-fit test violates the sample size requirement, i.e. no expected
frequencies less than 1 and no more than 20% of the expected frequencies
less than 5.

Slide 11

Solving the problem with SPSS:
Selecting the subset - 1

Our next task in SPSS is to select the subset of cases that will be used in the analysis.

The problem statement tells us that the subset is based on the variable "attitude toward life" [life], and that we are specifically interested in survey respondents who said that they generally find life pretty routine.
Our first step is to find the data value for life which represents survey respondents who said life was pretty routine.
We go to the Variable View in the SPSS Data Editor and locate the variable.

Slide 12

Solving the problem with SPSS:
Selecting the subset - 2

We scroll to the right until we see the Values column. When we click on the cell for life in the Values column, a button with an ellipsis on it appears. Click on this button to open the Value Labels dialog box.

The Value Labels dialog box shows us the text labels that the creator of the data set assigned to each of the possible numeric responses for this variable.
2 = ROUTINE would be the obvious choice to indicate respondents who said that they generally find life pretty routine.
This analysis will include cases that have a score of 2 for the variable life.

Click on OK to close the dialog box.

Slide 13

Solving the problem with SPSS:
Selecting the subset - 3

To select the subset of cases for this analysis, we return to the Data View of the SPSS Data Editor and choose the Select Cases command from the Data menu.

Slide 14

Solving the problem with SPSS:
Selecting the subset - 4

In the Select Cases dialog box, we mark the option button If condition is satisfied, and click on the If button, which becomes active when the option button is marked.

Slide 15

Solving the problem with SPSS:
Selecting the subset - 5

First, we highlight the variable we want to use, life, in selecting the subset.

Second, we click on the right arrow button to move the variable to the text box where we will compose our selection criteria.

Slide 16

Solving the problem with SPSS:
Selecting the subset - 6

First, we complete the selection criteria by typing the value for the cases we want to include, = 2.

Second, we click on the Continue button to close the Select Cases: If dialog box.

Slide 17

Solving the problem with SPSS:
Selecting the subset - 7

When we return to the Select Cases dialog, we see that SPSS has printed our selection criteria next to the If button.

Click on the OK button to complete the selection of the subset.

Slide 18

Solving the problem with SPSS:
Selecting the subset - 8

When we return to the Data Editor, we scroll the variables to the right until we see the column for life.
We see that SPSS has marked out the cases that will be excluded by drawing a diagonal slash through the row number.
The cases that are excluded have a 1 for DULL, a 3 for EXCITING, or are missing answers.
The cases with a value of 2 for life do not have the slash and will be included in the analysis.
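
The selection rule that SPSS applies here can be sketched outside SPSS as a simple filter. The records below are hypothetical stand-ins for rows of GSS2000R.Sav (the real file is not reproduced here), and the field names simply mirror the variable names used in the slides.

```python
# Hypothetical records; in the course these come from GSS2000R.Sav.
# life: 1 = DULL, 2 = ROUTINE, 3 = EXCITING, None = missing answer.
cases = [
    {"id": 1, "life": 2, "marital": 1},
    {"id": 2, "life": 3, "marital": 5},
    {"id": 3, "life": None, "marital": 2},
    {"id": 4, "life": 2, "marital": 3},
    {"id": 5, "life": 1, "marital": 1},
]

ROUTINE = 2

# Keep only the cases that satisfy the condition life = 2.
subset = [case for case in cases if case["life"] == ROUTINE]
print([case["id"] for case in subset])
```

Cases with a missing answer drop out automatically because a missing value never equals 2, which matches the slashed rows described above.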

Slide 19

Solving the problem with SPSS:
Level of measurement

The chi-square test of goodness-of-fit can be used with variables at any level of measurement, provided there are a discrete number of categories. Continuous variables should be grouped in classes.
Marital status [marital] is a nominal variable with 5 categories (9 = NA is a missing data value).

Slide 20

Solving the problem with SPSS:
The chi-square test of goodness-of-fit - 1

To get the information to answer the question of sample size, we must run the test.

Select Nonparametric Tests > Chi-Square from the Analyze menu.

Slide 21

Solving the problem with SPSS:
The chi-square test of goodness-of-fit - 2

The finding we are trying to verify is:
A chi-square test of goodness-of-fit was performed on the variable "marital status" and found that the breakdown in our sample was significantly different from the breakdown found in previous research.

First, move the variable marital to the Test Variable List box.

Second, click on the Values option button to let SPSS know we will enter the expected frequencies ourselves.

Third, type in the first expected frequency from the problem Hint.

Fourth, click on the Add button to move the value 36 to the list.

HINT: Applying the percentage breakdown from previous research to our sample of 93 cases would result in expected frequencies of 36 in the category 'married', 15 in the category 'widowed', 10 in the category 'divorced', 5 in the category 'separated' and 27 in the category 'never married'.

Slide 22

Solving the problem with SPSS:
The chi-square test of goodness-of-fit - 3

When we clicked on the Add button, the value 36 was added to the end of the list.

First, type in the second expected frequency from the problem Hint.

Second, click on the Add button to move the value 15 to the list.

Slide 23

Solving the problem with SPSS:
The chi-square test of goodness-of-fit - 4

When we clicked on the Add button, the value 15 was added to the end of the list.
Add the remaining expected frequencies 10, 5, and 27 to the list.

When we have typed in all of the expected frequencies, we click on the OK button to generate the output.

Slide 24

Solving the problem with SPSS:
Checking expected frequencies

The finding we are trying to verify is:
A chi-square test of goodness-of-fit was performed on the variable "marital status" and found that the breakdown in our sample was significantly different from the breakdown found in previous research.
Our first task is to make certain we have entered the expected frequencies correctly.

We double check the expected frequencies in the table against the hint. HINT: Applying the percentage breakdown from previous research to our sample of 93 cases would result in expected frequencies of 36 in the category 'married', 15 in the category 'widowed', 10 in the category 'divorced', 5 in the category 'separated' and 27 in the category 'never married'.
The categories and expected frequencies match up correctly.

Slide 25

Solving the problem with SPSS:
Sample size requirements

Our second task is to verify the sample size requirements.

The information we need to verify that we meet the sample size requirements is in the footnote to the Test Statistics table.
The minimum expected frequency in any cell was 5, which is larger than the minimum requirement of 1. None of the cells had an expected frequency less than 5. The sample size requirements for the chi-square test are satisfied.
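
The same check can be written out directly from the expected frequencies in the hint. This is a minimal sketch of the two requirements as stated in the slides; the function name and return layout are illustrative choices.

```python
def check_sample_size(expected):
    """Return (minimum expected frequency, share of cells below 5, requirement met?)."""
    minimum = min(expected)
    share_below_5 = sum(1 for freq in expected if freq < 5) / len(expected)
    # Both conditions must hold: no cell below 1, at most 20% of cells below 5.
    ok = minimum >= 1 and share_below_5 <= 0.20
    return minimum, share_below_5, ok

# Expected frequencies from the hint: married, widowed, divorced, separated, never married.
expected = [36, 15, 10, 5, 27]
minimum, share_below_5, ok = check_sample_size(expected)
print(minimum, share_below_5, ok)
```

An expected frequency of exactly 5 does not count as "less than 5", so the 'separated' category does not count against the 20% limit.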

Slide 26

Solving the problem with SPSS:
Answering the question - 1

The finding we are trying to verify is:
A chi-square test of goodness-of-fit was performed on the variable "marital status" and found that the breakdown in our sample was significantly different from the breakdown found in previous research.
Having satisfied the sample size requirement, we look to the table of Test Statistics for the hypothesis test.

The chi-square test of goodness-of-fit for this problem produced the statistical result: Chi-square (4, N = 93) = 10.02, p = .04.

Slide 27

Solving the problem with SPSS:
Answering the question - 2

Since the probability for the chi-square statistic is less than or equal to the alpha level of 0.05, we reject the null hypothesis and support the research hypothesis.
The breakdown for our sample is different from that found in previous research. Our sample is either unlikely to be from the same population reported in previous research, or some event has altered the breakdown of the cases.
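
The decision can be retraced from the reported statistic alone. The sketch below is an illustration under two stated assumptions: it uses the closed-form upper tail that applies only when the chi-square statistic has exactly 4 degrees of freedom, and it uses the identity w = sqrt(chi-square / N), which holds for the goodness-of-fit statistic, to recover the effect size.

```python
import math

# Values reported on the slides for this problem.
chi_square, df, n, alpha = 10.02, 4, 93, 0.05

# For df = 4 the upper-tail probability is exp(-x/2) * (1 + x/2).
p_value = math.exp(-chi_square / 2) * (1 + chi_square / 2)

# Decision rule from the slides: reject the null hypothesis if p <= alpha.
reject_null = p_value <= alpha

# Effect size recovered from the statistic (assumes w = sqrt(chi-square / N)).
effect_size_w = math.sqrt(chi_square / n)

print(f"p = {p_value:.3f}, reject null: {reject_null}, w = {effect_size_w:.2f}")
```

Under these assumptions the effect size comes out just above .30, which the scale on the effect size slide would label a medium effect.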

Slide 28

Restoring all of the cases to the dataset - 1

We have selected a specific subset of cases for this problem. To make sure we do not use the wrong subset for the next problem, we will restore all of the cases to the data set.
Click on the Select Cases command from the Data menu.

Slide 29

Restoring all of the cases to the dataset - 2

Click on the All cases option button to remove the If condition.

Click on the OK button to complete the command.

Slide 30

Restoring all of the cases to the dataset - 3

The slashes through the case numbers are removed, indicating that all of the cases are available to the next command.

Slide 31

Logic for homework problems:
Chi-square Test of Goodness-of-Fit 1

Select subset of cases specified in problem

Compute chi-square test of goodness-of-fit

No expected frequencies < 1?
No: Inappropriate application of a statistic
Yes: continue to the next check

Slide 32

Logic for homework problems:
Chi-square Test of Goodness-of-Fit 2

No more than 20% of expected frequencies < 5?
No: Inappropriate application of a statistic
Yes: continue to the hypothesis test

Probability of the test statistic less than or equal to alpha?
Yes: True
No: False
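
The flowchart on these two slides can be expressed as one small function. This is a hedged sketch of the logic as the slides describe it. The function name is hypothetical, and the two calls use the expected frequencies and probabilities reported for the two problems in this deck.

```python
def homework_answer(expected, p_value, alpha):
    """Follow the two-slide flowchart and return the homework answer."""
    # Check 1: no expected frequency may be less than 1.
    if any(freq < 1 for freq in expected):
        return "Inappropriate application of a statistic"
    # Check 2: no more than 20% of expected frequencies may be less than 5.
    share_below_5 = sum(1 for freq in expected if freq < 5) / len(expected)
    if share_below_5 > 0.20:
        return "Inappropriate application of a statistic"
    # Hypothesis test: the finding is supported when p <= alpha.
    return "True" if p_value <= alpha else "False"

# Marital status for respondents who find life routine (alpha = .05).
print(homework_answer([36, 15, 10, 5, 27], 0.04, 0.05))
# Marital status for respondents who had not seen an x-rated movie (alpha = .01).
print(homework_answer([71, 10, 16, 4, 35], 0.153, 0.01))
```

In the second call one of the five expected frequencies is below 5, which is exactly 20% of the cells and therefore still within the limit.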

Power Analysis: Chi-square Goodness-of-fit
Problem that was False

Slide 33

This problem analyzes the variable "marital status" [marital] for a subset of the cases
in GSS2000R.Sav. The subset is based on the variable "seen x-rated movie in last year"
[xmovie]. Using an alpha of .01, is the following statement true, true with caution,
false, or an incorrect application of a statistic?
Previous research on survey respondents who had not seen an x-rated movie in the last
year found that the breakdown for "marital status" was 52.2% in the category 'married',
7.4% in the category 'widowed', 11.8% in the category 'divorced', 2.9% in the category
'separated' and 25.7% in the category 'never married'.
A chi-square test of goodness-of-fit was performed on the variable "marital status" and
found that the breakdown in our sample was significantly different from the
breakdown found in previous research.
HINT: Applying the percentage breakdown from previous research to our sample of 136
cases would result in expected frequencies of 71 in the category 'married', 10 in the
category 'widowed', 16 in the category 'divorced', 4 in the category 'separated' and 35
in the category 'never married'.
1 True
2 True with caution
3 False
4 Incorrect application of a statistic

The answer to this problem was false because the probability for the chi-square test was 0.15, greater than the alpha of 0.01.

We can conduct a post-hoc power analysis to determine if the number of available cases was sufficient to find a statistically significant difference.

Slide 34

Power Analysis: Results for Chi-square
Goodness of Fit - 1

The answer to the problem was false because the significance of the chi-square statistic (Asymp. Sig.) = .153, greater than the alpha of .01.

Slide 35

Power Analysis: Results for Chi-square
Goodness of Fit - 2

To conduct the power analysis, we will need to compute the effect size statistic, w, which compares the expected proportions stated in the problem to the actual proportions in the SPSS output.
The proportion in each cell is found in the Valid Percent column.

Slide 36

Access to G*Power Program

SamplePower, the SPSS program for power analysis, does not include the chi-square goodness-of-fit test. However, another free software program, G*Power, does the calculations for us.
Navigate to the page shown in the web address box, scroll down the page, download and install the program.

Slide 37

Power Analysis for Chi-square
Goodness-of-fit Test - 1

Click on the OK button on the title screen.
This is an old DOS program that will run in Windows.

Slide 38

Power Analysis for Chi-square
Goodness-of-fit Test - 2

Click on the OK button on the Information screen.

Slide 39

Power Analysis for Chi-square
Goodness-of-fit Test - 3

Click on the Tests menu to open it. Scroll down the list and click on the Chi-Test command.

Slide 40

Power Analysis for Chi-square
Goodness-of-fit Test - 3a

Since we have already computed the statistic, we mark the Post hoc option to compute the power that we had for the test.

Slide 41

Power Analysis for Chi-square
Goodness-of-fit Test - 4

In order to calculate the power for the test, we must first compute the effect size, w, for our problem.
Click on the Calc Effectsize button.

Slide 42

Power Analysis for Chi-square
Goodness-of-fit Test - 5

In our problem, the variable had five categories, so we change the default 4 to a 5.

The P(H0) column contains the proportions expected under the null hypothesis, i.e., stated in the problem.

The P(H1) column contains the proportions under the research hypothesis, i.e. tested for differences in SPSS.

Slide 43

Power Analysis for Chi-square
Goodness-of-fit Test - 6

First, to change the values for each cell, we double click on the row number in the list to move the row data to the editing row.

Second, correct the entries in the row by backspacing over the previous entry and typing the correct number.
Pressing the enter key will generate an error message rather than moving the corrected data to the table.
To edit another row, click on its row number.

The P(H0) column contains the proportions expected under the null hypothesis, i.e., stated in the problem.

The P(H1) column contains the proportions under the research hypothesis, i.e. tested for differences in SPSS.

Slide 44

Power Analysis for Chi-square
Goodness-of-fit Test - 7

To compute the effect size and move the value to the window where power is calculated, click on the Calc & Copy button.

If the proportions in each column do not sum to 1, GPOWER will not go back to the previous window. To correct the problem, change the entry in row 5 from 0.191 to 0.192.

Slide 45

Power Analysis for Chi-square
Goodness-of-fit Test - 8

The effect size for our data is 0.22, a small effect. The scale is conveniently listed at the bottom of the window.

Power Analysis for Chi-square
Goodness-of-fit Test - 9

Slide 46

First, change the Total sample size to the 136 cases we had for this analysis.

Second, change the degrees of freedom Df to the 4 which we found in our SPSS output.

Third, click on the Calculate button to obtain the answer.

Slide 47

Power Analysis for Chi-square
Goodness-of-fit Test - 10

GPOWER computes that we had 0.52 as the measure of power for our analysis, i.e. about a 50-50 chance of detecting the small effect associated with the data available for our analysis.
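
The post-hoc power figure can be approximated without G*Power. The sketch below is an illustration, not G*Power's own algorithm, which the slides do not show. It assumes the textbook approach of taking the upper tail of a noncentral chi-square distribution with noncentrality N * w^2, written as a Poisson-weighted sum of central chi-square tails. The closed form used for those tails is exact only for even degrees of freedom, and all function names are hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a central chi-square, exact for even df."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def critical_value(alpha, df):
    """Find x with chi2_sf_even_df(x, df) == alpha by bisection."""
    low, high = 0.0, 200.0
    for _ in range(100):
        mid = (low + high) / 2.0
        if chi2_sf_even_df(mid, df) > alpha:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

def post_hoc_power(w, n, df, alpha, terms=100):
    """Power as the upper tail of a noncentral chi-square with lambda = n * w**2."""
    half_lambda = n * w ** 2 / 2.0
    crit = critical_value(alpha, df)
    weight = math.exp(-half_lambda)  # Poisson weight for j = 0
    power = 0.0
    for j in range(terms):           # series truncated after `terms` terms
        power += weight * chi2_sf_even_df(crit, df + 2 * j)
        weight *= half_lambda / (j + 1)
    return power

# Effect size, sample size and df from the slides; two candidate alpha levels.
power_05 = post_hoc_power(w=0.22, n=136, df=4, alpha=0.05)
power_01 = post_hoc_power(w=0.22, n=136, df=4, alpha=0.01)
print(round(power_05, 2), round(power_01, 2))
```

With these inputs the sketch appears to land close to the 0.52 shown above when alpha is .05, and gives a lower value at the problem's alpha of .01, so it is worth confirming which alpha was entered in the G*Power window.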
