MOHD RAZIF BIN IBRAHIM


M20142001732
QUESTION 1 : USING THE TT04 DATA, CREATE A CROSSTAB EXAMINING THE
RELATIONSHIP BETWEEN HAPPY (GENERAL HAPPINESS) AND MARITAL
(MARITAL STATUS). ANSWER THE FOLLOWING QUESTIONS:

To answer this question, I ran a crosstab between the two categorical
variables, HAPPY and MARITAL. I used Analyze > Descriptive Statistics >
Crosstabs and placed HAPPY in the rows and MARITAL in the columns. Under
Cells, I ticked Row under Percentages, so each cell shows the percentage
within its row (within each level of GENERAL HAPPINESS), which is very
useful in the analysis. The output is as follows;
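The choice between row and column percentages decides which question a crosstab answers. The short sketch below illustrates the difference. The counts are hypothetical and are not the TT04 frequencies, and the function names are made up for this illustration.

```python
# Hypothetical counts for illustration only; these are NOT the TT04 frequencies.
counts = {
    "VERY HAPPY":   {"MARRIED": 30, "NOT MARRIED": 10},
    "PRETTY HAPPY": {"MARRIED": 20, "NOT MARRIED": 40},
}


def row_percentages(table):
    """Percentage of each row (happiness level) that falls in each column."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {col: 100.0 * n / row_total for col, n in cells.items()}
    return result


def column_percentages(table):
    """Percentage of each column (marital status) that falls in each row."""
    col_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {
        row: {col: 100.0 * n / col_totals[col] for col, n in cells.items()}
        for row, cells in table.items()
    }


row_pct = row_percentages(counts)
col_pct = column_percentages(counts)

# Share of very happy people who are married (row percentage).
print(row_pct["VERY HAPPY"]["MARRIED"])
# Share of married people who are very happy (column percentage).
print(col_pct["VERY HAPPY"]["MARRIED"])
```

In SPSS terms, the first figure corresponds to "% within GENERAL HAPPINESS" and the second to "% within MARITAL STATUS".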

Case Processing Summary

                                                        Cases
                                         Valid          Missing           Total
                                       N    Percent    N    Percent    N    Percent
GENERAL HAPPINESS * MARITAL STATUS   1337    47.5%   1475    52.5%   2812   100.0%

Looking at the Case Processing Summary, we can see that there is a large number of
missing cases: 1475 in total, or 52.5% of all cases. In my opinion, the analysis that
follows may therefore not describe the real situation exactly, because it is based on
fewer than half of the cases.
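GENERAL HAPPINESS * MARITAL STATUS Crosstabulation
% within GENERAL HAPPINESS

                                        MARITAL STATUS
GENERAL HAPPINESS   MARRIED   WIDOWED   DIVORCED   SEPARATED   NEVER MARRIED   Total
VERY HAPPY           68.5%      5.0%      8.4%       1.2%         16.9%        100.0%
PRETTY HAPPY         46.3%      7.3%     17.3%       3.4%         25.6%        100.0%
NOT TOO HAPPY        27.8%     15.6%     21.7%       9.4%         25.6%        100.0%
Total                50.8%      7.7%     15.1%       3.5%         22.9%        100.0%

The table above shows the result of the crosstab between HAPPY and MARITAL.
Because the cells are row percentages (% within GENERAL HAPPINESS), each figure
is the share of a happiness group that falls in a given marital status. To state
the share of each marital group that is very happy, column percentages
(% within MARITAL STATUS) would be needed instead. The answers to the sub
questions, read from the VERY HAPPY row, are as follows;

Percentage of married individuals who are very happy
- 68.5% (of the very happy respondents are married)

Percentage of divorced individuals who are very happy
- 8.4% (of the very happy respondents are divorced)

Percentage of never married individuals who are very happy
- 16.9% (of the very happy respondents have never married)

TUTORIAL 2 & 3 (Crosstab/Correlation/T-Test/Anova)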

Before we go on to the Chi-square test, there are three assumptions to be made concerning
our data collection, which are :
1) A random sample.
2) Adequate sample size.
3) Adequate expected cell counts (at least 5 in each cell).
Looking at the table below for the Chi-square analysis;
Chi-Square Tests

                                                   Asymptotic Significance
                                Value       df     (2-sided)
Pearson Chi-Square              120.097a    8      .000
Likelihood Ratio                117.350     8      .000
Linear-by-Linear Association    54.295      1      .000
N of Valid Cases                1337

a. 0 cells (0.0%) have expected count less than 5. The minimum
expected count is 6.33.

First, there are 0 cells with an expected count of less than 5, which is good because it satisfies
the Chi-square assumption. If more than 20% of the cells had an expected count of less than 5,
we would have to take another course of action. The answers to the sub questions are as follows;

Pearson Chi-square value
- 120.097

Pearson Chi-square significance level
- .000 (Asymptotic Significance, 2-sided), that is, p < 0.001

Is the relationship statistically significant?

Before answering the question, we must state two hypotheses, which are;
Ho : There is no significant association between general happiness and marital status.
Ha : There is a significant association between general happiness and marital status.

After that, from the Chi-Square Tests table, look at the Asymptotic Significance (2-sided),
or p value, for the Pearson Chi-Square, which is 0.000, so p < 0.05. We therefore conclude
that we have enough evidence to reject Ho and accept Ha. So, there is a significant
association between general happiness and marital status.
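As a rough check on the figures reported by SPSS, the degrees of freedom and the p value can be recomputed by hand. The sketch below is only an illustration: it uses the closed-form upper tail of the chi-square distribution, which holds for even degrees of freedom only, and the function name is made up for this example.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) of a chi-square variable.

    Uses the closed form exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!,
    which is exact only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    term = 1.0   # k = 0 term of the sum
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# HAPPY has 3 categories (rows) and MARITAL has 5 categories (columns).
rows, cols = 3, 5
df = (rows - 1) * (cols - 1)

chi_square = 120.097  # Pearson Chi-Square value from the output
p_value = chi2_sf_even_df(chi_square, df)

print(df)
print(p_value < 0.001)
```

SPSS prints the p value as .000 because it rounds to three decimals; the actual value is simply smaller than 0.0005, not exactly zero.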

How would you interpret this result?

Looking at the bar graph above, my interpretation is that general
happiness is associated with marital status. Married respondents make
up 68.5% of the very happy group and 46.3% of the pretty happy group,
but only 27.8% of the not too happy group, so married individuals are
more strongly represented among the happier groups than the other
categories are.

QUESTION 2 : IS THERE A RELATIONSHIP BETWEEN A PERSON'S AGE (AGE)
AND THE NUMBER OF HOURS SPENT PER DAY WATCHING TV (TVHOURS)?
USE TT04 DATA TO TEST THIS RELATIONSHIP. ANSWER THE FOLLOWING
QUESTIONS:

First of all, before we find the correlation between the two variables, we
have to make sure that all the assumptions have been met. I ran Graphs >
Legacy Dialogs > Scatter/Dot > Simple Scatter. I entered AGE as the Y Axis
and TVHOURS as the X Axis. The output is as follows;

From the graph above, we can see that the dots follow a roughly linear
pattern, so a straight line can be fitted between the AGE OF RESPONDENT and
HOURS PER DAY WATCHING TV variables. The spread of the scores around that
line looks reasonably even, so the assumptions that the relationship between
the variables is linear and that the data are homoscedastic are taken as met.
So, we can proceed to examine the relationship between the two variables.

Correlation
I selected Analyze > Correlate > Bivariate to run the test.
The output is as follows;

Correlations

                                                   AGE OF        HOURS PER DAY
                                                   RESPONDENT    WATCHING TV
AGE OF RESPONDENT           Pearson Correlation    1             .157**
                            Sig. (2-tailed)                      .000
                            N                      2803          895
HOURS PER DAY WATCHING TV   Pearson Correlation    .157**        1
                            Sig. (2-tailed)        .000
                            N                      895           899
**. Correlation is significant at the 0.01 level (2-tailed).

A correlation was run between AGE OF RESPONDENT and HOURS PER DAY
WATCHING TV. The output shows that there is a positive correlation,
but the correlation is very weak (r = 0.157, N = 895, p < 0.01).
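To see why such a weak correlation can still be statistically significant, the test statistic can be recomputed from r and N alone. The sketch below is a hand check, not SPSS output: it uses the standard t statistic for a Pearson correlation and, as a simplifying assumption, a normal approximation for the two-sided p value (reasonable here because the degrees of freedom are large).

```python
import math

r = 0.157  # Pearson correlation from the Correlations output
n = 895    # number of cases with both AGE and TVHOURS

# t statistic for testing H0: rho = 0, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Two-sided p value using a normal approximation to the t distribution
# (an assumption of this sketch; adequate for several hundred cases).
p_approx = math.erfc(abs(t_stat) / math.sqrt(2))

# Proportion of variance in one variable shared with the other.
r_squared = r ** 2

print(round(t_stat, 2))
print(p_approx < 0.01)
print(round(r_squared, 4))
```

The large sample makes even a small r significant, while r squared shows that age shares only about 2.5% of the variance in TV hours.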

Significance level
- Sig. (2-tailed) = .000, so the correlation is significant at the 0.01 level (2-tailed).

Is the relationship statistically significant?

Before we answer the question, we have to state two hypotheses,
which are;
Ho : There is no significant relationship between age of respondents
and hours per day watching TV.
Ha : There is a significant relationship between age of respondents
and hours per day watching TV.
From the output, the p value is 0.000, so p < 0.05. So we can
conclude that we have enough evidence to reject Ho and accept
Ha. So there is a significant relationship between age of
respondent and hours per day watching TV.


Do people tend to watch more or less TV as they age?

Because the correlation is positive (r = 0.157), people tend to watch slightly more TV as
they age, although the relationship is very weak.

QUESTION 3 : PERFORM A T-TEST WHETHER THERE IS AN ASSOCIATION
BETWEEN A PERSON'S GENDER AND THE AGE AT WHICH THEY FIRST HAVE
CHILDREN. USE TT04 DATA SET TO PERFORM AN INDEPENDENT SAMPLE T
TEST ON SEX AND AGEKDBRN. ANSWER THE FOLLOWING QUESTIONS:

Before carrying out the T-Test, there are four assumptions that have to be met;
1) The variables or data must be in interval, ratio or continuous format.
2) The sample must be taken randomly from the population.
3) The dependent variable(s) must be normally distributed in the
population.
4) The mean for the population is known.
When doing the independent sample T Test, two more assumptions have to
be met;
5) The two groups are independent of each other.
6) The variances are homogeneous.
The hypotheses for the T-Test are as follows;
Ho : There is no significant difference between men and women in the age
at which they first have children.
Ha : There is a significant difference between men and women in the age
at which they first have children.
After that, I ran Analyze > Compare Means > Independent-Samples T Test, with
AGEKDBRN as the test variable and SEX as the grouping variable. The output is
as follows;

Group Statistics

                                       RESPONDENTS SEX   N      Mean    Std. Deviation   Std. Error Mean
RESPONDENT'S AGE WHEN 1ST CHILD BORN   Female            1182   22.79   4.956            .144
                                       Male              844    25.78   5.686            .196

Mean age for men
- The mean age for men is 25.78, about 26 years old.

Mean age for women
- The mean age for women is 22.79, about 23 years old.

We can see that there is a difference of about 3 years between the mean ages of men and
women when they have their first child. Men are older than women.

T test of means significance level
- Sig. (2-tailed) = .000, tested at the 0.05 level (95% confidence)

Is the relationship statistically significant?

Observe the table below;

Independent Samples Test
RESPONDENT'S AGE WHEN 1ST CHILD BORN

Levene's Test for Equality of Variances
                              F        Sig.
Equal variances assumed       12.445   .000

t-test for Equality of Means
                                                                                             95% Confidence Interval
                                                                Mean         Std. Error      of the Difference
                              t         df         Sig. (2-tailed)   Difference   Difference   Lower     Upper
Equal variances assumed       -12.563   2024       .000              -2.985       .238         -3.451    -2.519
Equal variances not assumed   -12.280   1657.542   .000              -2.985       .243         -3.462    -2.508

The Independent Samples Test table shows whether the difference in means between men and
women is significant or not. Before that, we must check whether the assumption that the
variances are homogeneous is met. The test for this is Levene's Test for Equality of Variances.
Look at its Sig. value. If it is not significant (p > 0.05), we can conclude that the assumption of
homogeneity of variances has been met, so the report must be based on the first row of the
output (equal variances assumed). Otherwise, if Levene's Test is significant (p < 0.05), we have
to reject the null hypothesis that the variances are homogeneous, and the report must be based
on the second row of the table (equal variances not assumed).
From Levene's Test, F = 12.445 and the Sig. value is 0.000, so p < 0.05. So, we reject the Ho that
the variances are the same (homogeneous) and use the second row.

For the Independent Samples Test, look at the Sig. (2-tailed) value in the second row (equal
variances not assumed). The value is 0.000 (p < 0.05), so the T Test rejects Ho and accepts Ha,
with t = -12.280 and df = 1657.542. We can conclude that there is indeed a significant difference
between men and women in the age at which they first have children.
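The second-row figures can be approximately reproduced from the Group Statistics alone. The sketch below is a hand check under the assumption that the rounded means and standard deviations printed by SPSS are close enough to the unrounded values; it computes the unequal-variance (Welch) standard error, t statistic and degrees of freedom.

```python
import math

# Summary statistics from the Group Statistics table.
n_f, mean_f, sd_f = 1182, 22.79, 4.956   # Female
n_m, mean_m, sd_m = 844, 25.78, 5.686    # Male

# Squared standard error of each group mean.
var_f = sd_f ** 2 / n_f
var_m = sd_m ** 2 / n_m

# Difference in means (Female minus Male, matching the sign in the output).
diff = mean_f - mean_m

# Standard error and t statistic when equal variances are NOT assumed.
se_welch = math.sqrt(var_f + var_m)
t_welch = diff / se_welch

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch = (var_f + var_m) ** 2 / (
    var_f ** 2 / (n_f - 1) + var_m ** 2 / (n_m - 1)
)

print(round(se_welch, 3))
print(round(t_welch, 2))
print(round(df_welch, 1))
```

The results differ from the SPSS row only in the later decimals, because the printed means are rounded to two places.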

On the basis of these data, would you say that gender is associated
with the age at which people have children?

I ran Analyze > Correlate > Bivariate to find the correlation between
gender and the age at which people have children. The output is as follows;

Correlations

                                                                                  RESPONDENT'S AGE
                                                              RESPONDENTS SEX     WHEN 1ST CHILD BORN
RESPONDENTS SEX                        Pearson Correlation    1                   .269**
                                       Sig. (2-tailed)                            .000
                                       N                      2812                2026
RESPONDENT'S AGE WHEN 1ST CHILD BORN   Pearson Correlation    .269**              1
                                       Sig. (2-tailed)        .000
                                       N                      2026                2026
**. Correlation is significant at the 0.01 level (2-tailed).

From the Correlations table, the Pearson Correlation coefficient is 0.269 (p < 0.01), which is a
significant but weak correlation. Because SEX has only two categories, the sign of the coefficient
depends only on how SEX is coded. Thus, we can conclude that gender is associated with the age
at which people have children, and the group means show that men tend to be older than women.

QUESTION 4 : TEST WHETHER THERE IS AN ASSOCIATION BETWEEN A
PERSON'S RACE AND THE AGE AT WHICH THEY FIRST HAVE CHILDREN. USE
TT04 DATA SET TO PERFORM AN ANOVA ON RACE AND AGEKDBRN. ANSWER
THE FOLLOWING QUESTIONS:

Before doing ANOVA, there are two assumptions that have to be met;
1) The data are normally distributed.
2) Homogeneity of variance.
So, I ran Analyze > Compare Means > One-Way ANOVA and put AGEKDBRN in
the Dependent List and RACE as the Factor. The outputs are as follows;

Descriptives
RESPONDENT'S AGE WHEN 1ST CHILD BORN

                                                      95% Confidence Interval for Mean
         N      Mean    Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
White    1608   24.34   5.329            .133         24.08         24.60         13        51
Black    287    21.98   5.588            .330         21.33         22.63         12        50
Others   131    24.82   5.977            .522         23.79         25.86         13        40
Total    2026   24.03   5.473            .122         23.80         24.27         12        51

From the Descriptives table, we can read the mean, the standard deviation, the standard error, the
95% confidence interval for the mean, and the minimum and maximum age for each race group.

Mean age for Whites
- The mean age for Whites is 24.34, about 24 years old.

Mean age for African Americans
- The mean age for African Americans is 21.98, about 22 years old.

Mean age for other races
- The mean age for other races is 24.82, about 25 years old.

ANOVA significance level
- Sig. = .000, tested at the 0.05 level (95% confidence)

Is the relationship statistically significant?


Before we proceed to analyze the relationship, take a look at the table below;

Test of Homogeneity of Variances
RESPONDENT'S AGE WHEN 1ST CHILD BORN

Levene Statistic   df1   df2    Sig.
2.992              2     2023   .050

The Levene Statistic for the Test of Homogeneity of Variances is 2.992, with a Sig. value
of 0.050. This is a borderline result, but because the value is not below 0.05 we cannot
reject the Ho that the variances are equal. So, we can assume that there is no difference
in the variance of age at first child across the three race groups, and the assumption of
homogeneity of variance is treated as met.
Next, we will look at the ANOVA table;

ANOVA
RESPONDENT'S AGE WHEN 1ST CHILD BORN

                 Sum of Squares   df     Mean Square   F        Sig.
Between Groups   1441.434         2      720.717       24.623   .000
Within Groups    59214.147        2023   29.270
Total            60655.581        2025

From the ANOVA table, F(2, 2023) = 24.623, Sig. = .000 (p < 0.05), so we can reject Ho
and have enough evidence to accept Ha. This shows that the mean age at which people
first have children differs between at least two of the race groups. Generally, there is
a relationship between Race and Age at first child.
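The F value can be approximately rebuilt from the Descriptives table, which makes the logic of the ANOVA table easier to follow. The sketch below is a hand check, assuming the rounded group means and standard deviations are close enough to the unrounded ones; it computes the between-groups and within-groups sums of squares and their ratio.

```python
# (N, mean, standard deviation) for each group, from the Descriptives table.
groups = {
    "White": (1608, 24.34, 5.329),
    "Black": (287, 21.98, 5.588),
    "Others": (131, 24.82, 5.977),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

# Between-groups sum of squares: spread of the group means around the grand mean.
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())

# Within-groups sum of squares: pooled spread inside each group.
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(df_between, df_within)
print(round(f_stat, 2))
```

The result lands close to the F of 24.623 reported by SPSS; the small gap comes from rounding in the printed means and standard deviations.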

Perform Post Hoc analysis (Tukey) and determine which categories have means that are
significantly different from the mean of Whites.
Observe the output below;

Multiple Comparisons
Dependent Variable: RESPONDENT'S AGE WHEN 1ST CHILD BORN
Tukey HSD

(I) RACE OF   (J) RACE OF   Mean Difference                          95% Confidence Interval
RESPONDENT    RESPONDENT    (I-J)             Std. Error   Sig.     Lower Bound   Upper Bound
White         Black         2.358*            .347         .000     1.54          3.17
              Others        -.487             .492         .582     -1.64         .67
Black         White         -2.358*           .347         .000     -3.17         -1.54
              Others        -2.845*           .570         .000     -4.18         -1.51
Others        White         .487              .492         .582     -.67          1.64
              Black         2.845*            .570         .000     1.51          4.18
*. The mean difference is significant at the 0.05 level.

From the table above, the only group whose mean differs significantly from the mean of Whites
is the Black group. Looking at the Mean Difference (I-J) column, White respondents were on
average 2.358 years older than Black respondents when their first child was born, and the Sig.
value is 0.000 (p < 0.05), so the difference between the means of Black and White is significant.
On the other hand, the difference between White and Others (-.487) is not significant (Sig. = .582,
p > 0.05) and its confidence interval (-1.64 to .67) includes zero, so we cannot say that White
respondents are younger than the Others group. The Black and Others groups also differ
significantly (mean difference -2.845, Sig. = .000). We have to look deeper into the data to
explain the cause(s) of these differences.
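One quick way to read a Tukey table is to check whether each 95% confidence interval excludes zero, which agrees with the asterisks SPSS prints at the 0.05 level. The sketch below applies that rule to the three distinct pairs in the output. The helper name is made up for this illustration, and the numbers are copied from the table rather than recomputed.

```python
# (group I, group J, mean difference I-J, lower bound, upper bound)
comparisons = [
    ("White", "Black", 2.358, 1.54, 3.17),
    ("White", "Others", -0.487, -1.64, 0.67),
    ("Black", "Others", -2.845, -4.18, -1.51),
]


def excludes_zero(lower, upper):
    """True when the whole interval lies on one side of zero."""
    return lower > 0 or upper < 0


significant = {
    (group_i, group_j): excludes_zero(lower, upper)
    for group_i, group_j, _, lower, upper in comparisons
}

for (group_i, group_j), flag in significant.items():
    label = "significant" if flag else "not significant"
    print(f"{group_i} vs {group_j}: {label}")
```

Only the comparisons involving the Black group come out as significant, so Whites differ from Blacks but not from the Others group.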
