
INFERENTIAL STATISTICS:

HYPOTHESIS TESTING FOR CORRELATION AND REGRESSION ANALYSIS
(TESTS: rP, rS, rPb)

Rohani Ahmad Tarmizi - EDU5950


• Correlation analysis is used to answer research questions such as:
  • Is there a relationship between the two variables?
  • How strong is the relationship?
  • What is the direction of the relationship?
CORRELATION ANALYSIS
• The analysis also involves two categories of variable: the predictor variable and the criterion variable.
• The predictor variable is the one that affects or influences the second variable.
• The criterion variable is the one that is affected or influenced by the first variable.
  • X (predictor) → Y (criterion)
  • X1, X2, X3, ... → Y (criterion)
• However, this analysis only describes the relationship; it does not support a cause-and-effect conclusion.
• For example, a researcher may want to determine the relationship between:
  • Confidence in administering and leadership performance among principals
  • Perceptions of senior teachers and of administrative staff regarding the principal's level of leadership in the school
  • Age and job satisfaction
  • Dietary practices and ranked confidence in joining a marathon
Two Ways to Determine Correlation
• 1. Graphically, using a scatter diagram, which shows the pattern formed by the paired data points.
  • From the scatter diagram we can judge the strength (magnitude) of the correlation as well as its direction.
• 2. Numerically, by computing a coefficient or index.
  • From the coefficient we can tell the strength (magnitude) of the correlation and whether its direction is positive or negative.
Scatter Plots and Types of Correlation

[Figure: scatter plot of GPA (y) against Math SAT score (x).]
Positive correlation: as x increases, y increases.

[Figure: scatter plot of number of accidents (y) against hours of training (x).]
Negative correlation: as x increases, y decreases.

[Figure: scatter plot of IQ (y) against height (x).]
No linear correlation.
Correlation Analysis Shows
Three important things:

• Direction (positive or negative)
• Form (linear or non-linear)
• Strength/magnitude (size of the coefficient)

CORRELATION COEFFICIENTS
• THERE ARE SEVERAL TYPES OF CORRELATION COEFFICIENT:
  • Pearson product-moment correlation
    • Used when variables x and y are both measured on an interval or ratio scale, or a combination of the two.
  • Spearman rho correlation
    • Used when variables x and y are both measured on an ordinal scale, or a combination of ordinal with interval/ratio.
  • Point-biserial correlation
    • Used when variable x is dichotomous and variable y is measured on an interval or ratio scale.
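Pearson Coefficient
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

• n = number of pairs of scores
• Σxy = sum of the products of each x score and its paired y score
• Σx = sum of the x scores
• Σy = sum of the y scores
Spearman Coefficient
$$r_s = 1 - \frac{6\sum B^2}{n(n^2 - 1)}$$

• n = number of pairs of scores
• B = difference between the two ranks in each pair (ΣB² is the sum of the squared differences)
Point-biserial Coefficient
$$r_{pb} = \frac{\bar{y}_1 - \bar{y}_2}{s_y}\sqrt{\frac{n_1 n_2}{n(n - 1)}}$$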
Correlation Coefficient: a measure of the strength and direction of a linear relationship between two variables.

The range of r is from -1 to 1.

• If r is close to -1, there is a strong negative correlation.
• If r is close to 0, there is no linear correlation.
• If r is close to 1, there is a strong positive correlation.
Guilford's Rule of Thumb

r            Strength of Relationship

< 0.2        Negligible relationship
0.2 – 0.4    Low relationship
0.4 – 0.7    Moderate relationship
0.7 – 0.9    High relationship
> 0.9        Very high relationship
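
The bands above can be applied mechanically once r is known. The sketch below is only an illustration: the function name `guilford_strength` is hypothetical, and because the table lists shared boundaries (0.4 and 0.7 each appear in two rows), it assumes a boundary value belongs to the stronger band, except 0.9, which stays "high" because the top band is strictly greater than 0.9.

```python
def guilford_strength(r: float) -> str:
    """Label the size of r using the rule-of-thumb bands listed above.

    Assumption: shared boundaries (0.2, 0.4, 0.7) go to the stronger band.
    The sign of r is ignored; it only gives the direction.
    """
    size = abs(r)
    if size > 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if size < 0.2:
        return "negligible"
    if size < 0.4:
        return "low"
    if size < 0.7:
        return "moderate"
    if size <= 0.9:
        return "high"
    return "very high"


for value in (-0.15, 0.35, 0.865, 0.95):
    print(value, guilford_strength(value))
```

The sign is dropped before classifying because the same strength labels hold for negative values of r.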


Other Strengths of Association-
By Johnson and Nelson (1986)

r-value Interpretation

0.00 No relationship

0.01-0.19 Low relationship

0.20-0.49 Slightly Moderate relationship

0.50-0.69 Moderate relationship

0.70-0.99 Strong relationship

1.00 Perfect relationship

The same strength interpretations hold for negative values of r, only the direction
interpretations of the association would change.
Association Between Two Scores: Degree and Strength of Association
• .20–.35: there is only a slight relationship.
• .35–.65: correlations above .35 are useful for limited prediction.
• .66–.85: good prediction can result from one variable to the other. Coefficients in this range would be considered very good.
• .86 and above: correlations in this range are typically achieved for studies of construct validity or test-retest reliability.
Step 1. State the hypotheses
• Research hypothesis:
There is a significant relationship between the level of principals' instructional leadership and school academic performance in Sabah.
• Null hypothesis:
There is no significant relationship between the level of principals' instructional leadership and school academic performance in Sabah.
Step 2. SET THE ALPHA LEVEL (0.01 / 0.05 / 0.10), THE SAMPLING DISTRIBUTION AND THE TEST STATISTIC
• The alpha value is set by the researcher.
• It is the amount of error the researcher is prepared to accept when making a decision in the hypothesis test.
• The error is kept as small as possible: 0.01 (1%), 0.05 (5%) or 0.10 (10%).
• This value is also called the significance value, significance level, or alpha level.
Step 2. Sampling Distribution
• This is the distribution appropriate to the analysis being carried out. It is a model of the distribution of correlation values, in which the correlation values are normally distributed.
• The critical region contains the "unusual" correlation values -> Ha is supported
• The non-critical region contains the "usual" correlation values -> Ho is retained
Step 3. Critical Value
• The critical value is the boundary between the region where Ho is retained and the region where Hp is supported.
• It is the point at which the researcher decides whether there is enough evidence to reject Ho (and so accept Hp) or not enough evidence to reject Ho (accept Ho).
• It depends on the alpha value and on the direction of the hypothesis test.
Step 4. Test Statistic
• This is the computed value that serves as evidence for or against the null hypothesis.
• If the test statistic falls in the critical region, Ho is rejected and Hp is accepted.
• If the test statistic falls in the non-critical region, Ho is accepted.

Tested r =

$$r_{tested} = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

Step 5. Decision, Conclusion and Interpretation
• If the test statistic falls in the non-critical region, Ho is accepted.
• If the test statistic falls in the critical region, Ho is rejected and Hp is accepted (meaning there is evidence that Hp is true).
Example of Pearson correlation
Data were collected from a randomly selected sample to determine the relationship between average assignment scores and test scores in statistics. The data are presented in the table below. Assume the data are normally distributed.
1. Calculate an appropriate correlation coefficient.
2. Describe the nature of the relationship between the two variables.
3. Test the hypothesis on the relationship at the 0.01 level of significance.

Data set:
Assign   Test
8.5      88
6        66
9        94
10       98
8        87
7        72
5        45
6        63
7.5      85
5        77

Calculate the test statistic
X      Y     XY       X2      Y2
8.5    88    748      72.25   7744
6      66    396      36      4356
9      94    846      81      8836
10     98    980      100     9604
8      87    696      64      7569
7      72    504      49      5184
5      45    225      25      2025
6      63    378      36      3969
7.5    85    637.5    56.25   7225
5      77    385      25      5929
Σ 72   775   5795.5   544.5   62441
Steps in Hypothesis Testing
1. State the null and alternative hypotheses: HO: ρp = 0, HA: ρp ≠ 0
2. Calculate the test statistic: r = .865
3. Determine the critical value: df = n – 2 = 8, two-tailed, α = 0.01; critical r = 0.7646
4. Make your decision: calculated r > critical r, so reject the null hypothesis and accept the alternative hypothesis.
5. Draw a conclusion: there is a significant relationship between assignment scores and test scores, r(8) = 0.87, p < 0.01.
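
The calculation behind r = .865 can be reproduced from the ten (assignment, test) pairs with the raw-score Pearson formula. This is a minimal sketch using only the standard library; the function name `pearson_r` and the variable names are illustrative, and the critical value is simply the table value quoted in step 3, not something the code derives.

```python
from math import sqrt

# The ten (assignment, test) score pairs from the example.
assign = [8.5, 6, 9, 10, 8, 7, 5, 6, 7.5, 5]
test = [88, 66, 94, 98, 87, 72, 45, 63, 85, 77]


def pearson_r(x, y):
    """Raw-score Pearson formula: sums of x, y, xy, x^2 and y^2."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(assign, test)
R_CRITICAL = 0.7646  # table value quoted above: df = 8, two-tailed, alpha = 0.01
reject_null = abs(r) > R_CRITICAL  # two-tailed test compares |r| with the critical value
print(f"r = {r:.3f}, reject H0: {reject_null}")
```

Comparing the absolute value of r with the critical value matches the two-tailed alternative HA: ρ ≠ 0.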
Spearman’s rank correlation coefficient
• Non-parametric method:
  • Less power but more robust.
  • Does not assume normal distribution.
• The correlation coefficient also varies between -1 and 1

Example of Spearman correlation

Data solicited from a randomly selected sample of employees were used to measure the relationship between ratings of working environment (X) and one’s work commitment (Y).

1. Calculate and describe the appropriate correlation coefficient.
2. Test the hypothesis on the relationship at the 0.05 level of significance.

ID   X   Y
1    1   1
2    2   1
3    3   2
4    4   3
5    5   4
6    1   3
7    2   3
8    3   2
9    4   5
10   5   5
11   6   5

Null hypothesis: There is no significant correlation between ratings of working environment and one’s work commitment among employees.

Research hypothesis: There is a significant correlation between ratings of working environment and one’s work commitment among employees.
[Figure: sampling distribution with a central region labelled "Null hypothesis is true" and two tail regions labelled "Research hypothesis is true".]

Determine the critical values in the sampling distribution.

Degrees of freedom

From Table r, r = ±.456

Participant   Rating of work    Rating of work    Rank of X   Rank of Y   D      D2
              environment (X)   commitment (Y)
1             1                 1                 1.5         1.5         0      0
2             2                 1                 3.5         1.5         2      4
3             3                 2                 5.5         3.5         2      4
4             4                 3                 7.5         6           1.5    2.25
5             5                 4                 9.5         8           1.5    2.25
6             1                 3                 1.5         6           -4.5   20.25
7             2                 3                 3.5         6           -2.5   6.25
8             3                 2                 5.5         3.5         2      4
9             4                 5                 7.5         10          -2.5   6.25
10            5                 5                 9.5         10          -0.5   0.25
11            6                 5                 11          10          1      1
                                                                          ΣD2 =  50.5
Make a decision: Reject the null hypothesis and accept the research hypothesis.
Conclusion: There was a statistically significant positive correlation between ratings of working environment and one’s work commitment among employees (rho = 0.77, p < 0.05, N = 11).

$$r_s = 1 - \frac{6\sum D^2}{n(n^2 - 1)}$$

$$r_s = 1 - \frac{6(50.5)}{11(121 - 1)} = 1 - \frac{303}{1320}$$

$$r_s = 1 - 0.230 = 0.77$$

There is a positive and strong relationship between ratings of working environment and one’s work commitment among employees.
2. Test the hypothesis on the relationship between the two variables at the 0.05 level of significance.
a. State the null and alternative hypotheses
HO: ρs = 0
HA: ρs ≠ 0
b. rs = 0.77
c. Determine the critical value
Critical rs = 0.456
d. Decision: Since the calculated rs (0.77) is larger than the critical rs (0.456), we reject the null hypothesis and accept the alternative hypothesis.
e. Conclusion
There is a significant relationship between ratings of work environment and level of work commitment at the 0.05 level of significance, rs (11) = 0.77, p < .05. Employees who rated their work environment more highly also tended to report higher work commitment.
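
The ranking and the ΣD² total in the Spearman example can be checked with a short script. This is a sketch under stated assumptions: the helper names `average_ranks` and `spearman_rho` are illustrative, tied scores receive the mean of the positions they occupy (as in the table), and the result comes from the simple 1 − 6ΣD²/(n(n² − 1)) formula used in the example. With tied ranks that formula is an approximation, so a routine that computes Pearson's r on the ranks may return a slightly different value.

```python
def average_ranks(values):
    """Rank scores from 1 upward; tied scores share the mean of their positions."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1          # first 1-based position held by v
        count = ordered.count(v)              # how many scores are tied at v
        rank_of[v] = first + (count - 1) / 2  # mean of the tied positions
    return [rank_of[v] for v in values]


def spearman_rho(x, y):
    """Return (rho, sum of squared rank differences) using the simple formula."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1)), sum_d2


environment = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6]  # X ratings from the example
commitment = [1, 1, 2, 3, 4, 3, 3, 2, 5, 5, 5]   # Y ratings from the example

rho, sum_d2 = spearman_rho(environment, commitment)
print(f"sum of D^2 = {sum_d2}, rho = {rho:.2f}")
```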
Point-biserial Correlation
$$r_{pb} = \frac{\bar{y}_1 - \bar{y}_2}{s_y}\sqrt{\frac{n_1 n_2}{n(n - 1)}}$$

• ȳ1 = mean of group 1
• ȳ2 = mean of group 2
• sy = standard deviation of the continuous variable
• n1 = number of subjects in group 1
• n2 = number of subjects in group 2
• n = total number of subjects
Example on Point-biserial correlation
A psychologist hypothesizes an association between marital status (1 = single, 2 = married) and need for achievement. A questionnaire measuring need for achievement is administered to married and single people.
1. Calculate the appropriate correlation coefficient.
2. Describe the nature of the relationship between the two variables.
3. Test the hypothesis on the relationship at the 0.05 level of significance.

Marital status   Need for Achievement
2                3
2                7
1                12
1                16
1                24
2                11
1                15
2                10
2                11
1                18
1                22
2                9
1                19
1                17
Point-biserial Correlation

• Mean of married subjects = 8.5
• Mean of single subjects = 17.9
• Std dev. of need for achievement scores = 5.89
• No. of married subjects (coded 2) = 6
• No. of single subjects (coded 1) = 8
• Total no. of subjects = 14

$$r_{pb} = \frac{17.9 - 8.5}{5.89}\sqrt{\frac{8 \times 6}{14(14 - 1)}} = 0.82$$

The mean need for achievement for single individuals is 17.9 and for married individuals is 8.5. There is a strong relationship between marital status and need for achievement.
3. Test the hypothesis on the relationship between the two variables at the 0.05 level of significance.
a. State the null and alternative hypotheses
HO: ρpb = 0
HA: ρpb ≠ 0
b. rpb = 0.82
c. Determine the critical value: critical rpb = 0.532
d. Decision: Since the calculated rpb (0.82) is greater than the critical rpb (0.532), we reject the null hypothesis and accept the alternative hypothesis.
e. Conclusion
There is a significant relationship between marital status and need for achievement, rpb (12) = .82, p < 0.05. Findings also indicated that single individuals showed a higher need for achievement than married individuals. Hence marital status is associated with one’s need for achievement.
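
The point-biserial value can be recomputed directly from the fourteen (marital status, need for achievement) pairs. The sketch below assumes, as the worked example does, that group 1 is the single subjects, group 2 the married subjects, and that the standard deviation is the sample standard deviation (n − 1 in the denominator) of all fourteen scores; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Marital status (1 = single, 2 = married) and need-for-achievement scores.
status = [2, 2, 1, 1, 1, 2, 1, 2, 2, 1, 1, 2, 1, 1]
need = [3, 7, 12, 16, 24, 11, 15, 10, 11, 18, 22, 9, 19, 17]

single = [score for s, score in zip(status, need) if s == 1]   # group 1
married = [score for s, score in zip(status, need) if s == 2]  # group 2

n1, n2, n = len(single), len(married), len(need)
s_y = stdev(need)  # sample standard deviation of all scores

r_pb = (mean(single) - mean(married)) / s_y * sqrt(n1 * n2 / (n * (n - 1)))
print(f"means: {mean(single):.1f} vs {mean(married):.1f}, s_y = {s_y:.2f}, r_pb = {r_pb:.2f}")
```

Swapping which group is called group 1 only flips the sign of the coefficient, so the direction has to be read from the group means.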
REGRESSION ANALYSIS
Regression analysis is an extension of correlation analysis, applied once a relationship has been found.
It is carried out after a linear pattern of relationship is expected and a coefficient has been determined that indicates a linear relationship between the two variables.
We can then predict one variable (the criterion variable) once the other variable (the predictor variable) is known.
The Procedure
• SIMPLE REGRESSION ANALYSIS consists of:
  • Drawing a scatter diagram of the paired scores
  • Determining the equation of the regression line
  • This equation is also called the regression model
  • The equation/model of the line is

    Y’ = a + bx
  • Then, using this equation, the value of y can be determined for a given value of x, and vice versa.
EQUATION OF THE REGRESSION LINE
(LEAST-SQUARES REGRESSION LINE)
• Y’ = a + bx
• Y’ = estimated value of y
• b = slope of the line
• a = y-intercept
SLOPE OF THE REGRESSION LINE
$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$

• n = number of pairs of scores
• Σxy = sum of the products of each x score and its paired y score
• Σx = sum of the x scores
• Σy = sum of the y scores
a = Y-INTERCEPT

$$a = \bar{y} - b\bar{x}$$
Data: Principals' level of leadership and teachers' perceptions of the principals' level of leadership

REGRESSION ANALYSIS CALCULATION
X     Y     XY    X2    Y2
12    8     96    144   64
2     3     6     4     9
1     4     4     1     16
6     6     36    36    36
5     9     45    25    81
8     6     48    64    36
4     6     24    16    36
15    22    330   225   484
11    14    154   121   196
13    6     78    169   36

Σ 77  84    821   805   994


EQUATION OF THE REGRESSION LINE
(LEAST-SQUARES REGRESSION LINE)
• Y’ = bx + a
• Y’ = estimated value of y
• b = slope of the line
• a = y-intercept
• r = 0.70
• This indicates that 49% of the variation in y is accounted for by x
• The slope is 0.82
• The mean of x is 7.7
• The mean of y is 8.4
• a = 2.1 (y-intercept)
• The regression model is Y’ = 0.82x + 2.1
• If x = 7, then Y’ = 7.84
• If x = 10, then Y’ = 10.3
• If x = 14, then Y’ = 13.58
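
The slope, intercept and predictions for this data set can be reproduced with the least-squares formulas. This is a minimal sketch; the names `least_squares` and `predict` are illustrative. It keeps the unrounded coefficients, so its predictions differ in the second decimal from the slide values, which were worked out with the rounded b = 0.82 and a = 2.1.

```python
# The ten (X, Y) pairs from the leadership data set.
x = [12, 2, 1, 6, 5, 8, 4, 15, 11, 13]
y = [8, 3, 4, 6, 9, 6, 6, 22, 14, 6]


def least_squares(x, y):
    """Return (a, b) for the line Y' = a + b*x fitted by least squares."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(p * q for p, q in zip(x, y))
    sum_x2 = sum(p * p for p in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    a = sum_y / n - b * sum_x / n                                  # intercept: mean(y) - b * mean(x)
    return a, b


a, b = least_squares(x, y)


def predict(x_new):
    """Predicted Y' for a given x, using the fitted line."""
    return a + b * x_new


print(f"Y' = {a:.2f} + {b:.2f}x")
for x_new in (7, 10, 14):
    print(x_new, round(predict(x_new), 2))
```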
Regression & Correlation
• A correlation measures the “degree of association” between two variables (interval (50, 100, 150, ...) or ordinal (1, 2, 3, ...))
• Associations can be positive (an increase in one variable is associated with an increase in the other) or negative (an increase in one variable is associated with a decrease in the other)

Example: Height vs. Weight
[Figure: Graph One, relationship between height (cm, x-axis) and weight (kg, y-axis).]
• Strong positive correlation between height and weight
• Can see how the relationship works, but cannot predict one from the other
• If 120 cm tall, then how heavy?
Example: Symptom Index vs Drug A
[Figure: Graph Two, relationship between Symptom Index (y-axis) and Drug A dose in mg (x-axis).]
• Strong negative correlation
• Can see how the relationship works, but cannot make predictions
• What Symptom Index might we predict for a standard dose of 150 mg?
Correlation examples
Regression
Regression analysis procedures have as their primary purpose the development of an equation that can be used for predicting values on some DV for all members of a population.
A secondary purpose is to use regression analysis as a means of explaining causal relationships among variables.
• The most basic application of regression analysis is the bivariate situation, which is referred to as simple linear regression, or just simple regression.
• Simple regression involves a single IV and a single DV.
• Goal: to obtain a linear equation so that we can predict the value of the DV if we have the value of the IV.
• Simple regression capitalizes on the correlation between the DV and IV in order to make specific predictions about the DV.
• The correlation tells us how much information about the DV is contained in the IV.
• If the correlation is perfect (i.e. r = ±1.00), the IV contains everything we need to know about the DV, and we will be able to perfectly predict one from the other.
• Regression analysis is the means by which we determine the best-fitting line, called the regression line.
• The regression line is the straight line that lies closest to all points in a given scatterplot.
• This line always passes through the centroid of the scatterplot, the point (mean of X, mean of Y).
Example: Symptom Index vs Drug A
[Figure: Graph Three, relationship between Symptom Index (y-axis) and Drug A dose in mg (x-axis), with best-fit line.]
• “Best fit line”
• Allows us to describe the relationship between variables more accurately.
• We can now predict specific values of one variable from knowledge of the other
• All points are close to the line
Example: Symptom Index vs Drug B
[Figure: Graph Four, relationship between Symptom Index (y-axis) and Drug B dose in mg (x-axis), with best-fit line.]
• We can still predict specific values of one variable from knowledge of the other
• Will predictions be as accurate?
• Why not?
• “Residuals”
• Three important facts about the regression line must be known:
  • The extent to which points are scattered around the line
  • The slope of the regression line
  • The point at which the line crosses the Y-axis
• The extent to which the points are scattered around the line is typically indicated by the degree of relationship between the IV (X) and DV (Y).
• This relationship is measured by a correlation coefficient: the stronger the relationship, the higher the degree of predictability between X and Y.
• The degree of slope is determined by the amount of change in Y that accompanies a unit change in X.
• It is the slope that largely determines the predicted values of Y from known values for X.
• It is important to determine exactly where the regression line crosses the Y-axis (this value is known as the Y-intercept).
• The regression line is essentially an equation that expresses Y as a function of X.
• The basic equation for simple regression is:
  • Y = a + bX
  • where Y is the predicted value for the DV,
  • X is the known raw score value on the IV,
  • b is the slope of the regression line,
  • a is the Y-intercept
Simple Linear Regression

♠ Purpose
To determine the relationship between two metric variables
To predict the value of the dependent variable (Y) based on the value of the independent variable (X)
♠ Requirement:
DV: Interval / Ratio
IV: Interval / Ratio
♠ Requirement:
The independent and dependent variables are normally distributed in the population
The cases represent a random sample from the population
Simple Regression
How best to summarise the data?

[Figure: two scatter plots of Symptom Index (y-axis) against Drug A dose in mg (x-axis), one without and one with a best-fit line.]

Adding a best-fit line allows us to describe data simply
General Linear Model (GLM)
How best to summarise the data?

• Establish the equation for the best-fit line: Y = a + bX

Where: a = y-intercept (constant)
b = slope of best-fit line
Y = dependent variable
X = independent variable

[Figure: scatter plot with best-fit line.]
Simple Regression
R2 - “Goodness of fit”

• For simple regression, R2 is the square of the correlation coefficient
• Reflects variance accounted for in data by the best-fit line
• Takes values between 0 (0%) and 1 (100%)
• Frequently expressed as a percentage, rather than a decimal
• High values show good fit, low values show poor fit
Simple Regression
Low values of R2

[Figure: scatter plot of DV (y-axis) against IV (regressor, predictor; x-axis) with randomly scattered points.]
• R2 = 0 (0%: randomly scattered points, no apparent relationship between X and Y)
• Implies that a best-fit line will be a very poor description of the data
Simple Regression
High values of R2

[Figure: two scatter plots of DV (y-axis) against IV (x-axis) in which the points lie directly on a line.]
• R2 = 1 (100%: points lie directly on the line, perfect relationship between X and Y)
• Implies that a best-fit line will be a very good description of the data
Simple Regression
R2 - “Goodness of fit”

[Figure: two scatter plots with best-fit lines, Symptom Index against Drug A (dose in mg) and Symptom Index against Drug B (dose in mg).]
• Drug A: good fit → R2 high; high variance explained
• Drug B: moderate fit → R2 lower; less variance explained
Problem: to draw a straight line through the points that best explains the variance

[Figure: scatter plot with a fitted straight line.]
Line can then be used to predict Y from X
Regression
• Establish the equation for the best-fit line:
Y = a + bX

• Best-fit line same as regression line
• b is the regression coefficient for x
• x is the predictor or regressor variable for y

Regression - Types
Step: Descriptive Analysis

Derive Regression / Prediction equation

● Calculate a and b

$$a = \bar{y} - b\bar{X}$$

$$\hat{Y} = a + bX$$
Example on regression analysis
Data were collected from a randomly selected sample to determine the relationship between average assignment scores and test scores in statistics. The data are presented in the table below.

1. Calculate the coefficient of determination and the correlation coefficient.
2. Determine the prediction equation.
3. Test the hypothesis for the slope at the 0.05 level of significance.

Data set (scores):
ID   Assign   Test
1    8.5      88
2    6        66
3    9        94
4    10       98
5    8        87
6    7        72
7    5        45
8    6        63
9    7.5      85
10   5        77
1. Derive Regression / Prediction equation

Summary statistics (X = assignment score, Y = test score):
n     10
ΣX    72
ΣY    775
ΣX²   544.5
ΣY²   62,441
ΣXY   5,795.5

$$b = \frac{\sum XY - \sum X \sum Y / n}{\sum X^2 - (\sum X)^2 / n} = \frac{215.5}{26.1} = 8.257$$

$$a = \bar{y} - b\bar{x} = 77.5 - 8.257(7.2) = 18.050$$

Prediction equation:
Ŷ = 18.05 + 8.257X
Interpretation of regression equation

Ŷ = 18.05 + 8.257x

For every 1-unit change in X, Y will change by 8.257 units.

[Figure: regression line with y-intercept 18.05 and slope ΔY/ΔX.]
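
Task 1 of this example asks for the coefficient of determination, which follows from the same summary statistics used for the slope. The sketch below starts from the six totals given for the assignment/test data (n, ΣX, ΣY, ΣX², ΣY², ΣXY); the names `ss_xy`, `ss_xx` and `ss_yy` are illustrative labels for the corrected sums of products and squares, not notation taken from the slides.

```python
from math import sqrt

# Summary statistics for the assignment (X) and test (Y) scores.
n = 10
sum_x, sum_y = 72, 775
sum_x2, sum_y2, sum_xy = 544.5, 62441, 5795.5

# Corrected sums of products and squares.
ss_xy = sum_xy - sum_x * sum_y / n   # numerator of the slope (215.5)
ss_xx = sum_x2 - sum_x ** 2 / n      # denominator of the slope (about 26.1)
ss_yy = sum_y2 - sum_y ** 2 / n

b = ss_xy / ss_xx                    # slope
a = sum_y / n - b * sum_x / n        # intercept: mean(Y) - b * mean(X)
r_squared = ss_xy ** 2 / (ss_xx * ss_yy)  # coefficient of determination
r = sqrt(r_squared) if b >= 0 else -sqrt(r_squared)  # r takes the sign of the slope

print(f"b = {b:.3f}, a = {a:.2f}, R^2 = {r_squared:.2f}, r = {r:.3f}")
```

On these figures about three quarters of the variance in test scores is accounted for by assignment scores, consistent with the r = .865 reported in the Pearson example.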
Example on regression analysis:
MARITAL SATISFACTION

Parents: X   Children: Y

1            3
3            2
7            6
9            7
8            8
4            6
5            3
Mean of X            Mean of Y
No. of pairs
ΣX                   ΣY
ΣX squared           ΣY squared
Standard deviation   Standard deviation
ΣXY
1. Derive Regression / Prediction equation

$$a = \bar{y} - b\bar{x} = 5.00 - 0.65(5.29) = 1.56$$

Prediction equation:
Ŷ = 1.56 + 0.65x
Interpretation of regression equation

Ŷ = 1.56 + 0.65x

For every 1-unit change in X, Y will change by 0.65 units.

[Figure: regression line with y-intercept 1.56 and slope ΔY/ΔX.]
CHI-SQUARE ANALYSIS
• This is also an analysis of relationship, but it is better known as an analysis of association.
• It is used to determine the association between a pair of variables measured on a nominal or ordinal scale, or when one of them is paired with interval or ratio data.
• Examples of such variables:
  • Ethnicity
  • Gender
  • Likes / dislikes a food
  • High achievement / low achievement
  • High anxiety / moderate anxiety / low anxiety
• Frequency data are obtained by counting the occurrences of each category. Suitable for survey research.
• From the observed frequencies, chi-square analysis tells us whether or not there is an association between the two variables.
CHI-SQUARE ANALYSIS
• SUPPOSE a researcher collects information on respondents' ethnicity and on each respondent's category of dietary practice,
• OR a researcher surveys students in several schools on gender and on whether or not they are interested in the science stream,
• OR a researcher surveys fathers and collects information on level of education (high / moderate / low) and relates it to salary category.
• For all three examples the appropriate analysis is a non-parametric one (chi-square analysis),
• and a contingency table or “crosstabulation” table is then constructed.
• From the observed frequencies, chi-square analysis tells us whether or not there is an association between the two variables.
CHI-SQUARE ANALYSIS
• There are two types: the CHI-SQUARE TEST OF GOODNESS OF FIT and the TEST OF INDEPENDENCE/DEPENDENCE
• TEST OF GOODNESS OF FIT: answers the question “Is there a difference in the rates of an item/event/agreement?”
• TEST OF INDEPENDENCE/DEPENDENCE: answers the question “Is there an association/dependence/relationship between two things?”
CHI-SQUARE ANALYSIS
• The results of this analysis are usually presented as a frequency table called a contingency table or “crosstabulation” table.
• From the observed frequencies, chi-square analysis tells us whether or not there is a significant association between the two variables studied,
• or whether or not there is a significant difference in frequencies among the categories studied.
•From the table we can examine whether there is a relationship or association between the two variables.

•Next, a hypothesis test is carried out to test whether the association between the two variables is significant.

•This hypothesis test is the chi-square test.

•If there is a significant association, the next step is to determine the degree or magnitude of the relationship.
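
As an illustration of the test of independence, the sketch below computes the chi-square statistic for a 2 × 2 contingency table. The counts are hypothetical (invented for illustration, loosely following the gender by interest-in-science example) and do not come from the slides; the function name `chi_square_independence` is also illustrative. Expected frequencies are (row total × column total) / grand total, and 3.841 is the standard table value for df = 1 at α = 0.05.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical observed frequencies (rows: gender; columns: interested / not interested).
observed = [
    [30, 20],
    [15, 35],
]

chi2, df = chi_square_independence(observed)
CHI2_CRITICAL = 3.841  # standard table value for df = 1, alpha = 0.05
print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {chi2 > CHI2_CRITICAL}")
```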
•For this analysis the data are frequencies, so the scores are not normally distributed.
•The test is therefore called distribution-free.
•It is also called a non-parametric test because it does not assume a normal distribution.
•As a rule of thumb, parametric tests are preferred because of their power; however, if the data are nominal or not normally distributed, non-parametric tests are used.

•Non-parametric tests: sign test, Mann-Whitney U test, Wilcoxon matched-pairs signed-ranks test, Kruskal-Wallis, chi-square.
Test yourself! Which statistical test is required? Then carry out the required analysis.
EXAMPLE DATA
Subject   Parents' Marital   Children's Marital   Performance
          Satisfaction       Satisfaction
1         1                  3                    70
2         3                  2                    80
3         7                  6                    40
4         9                  7                    35
5         8                  8                    50
6         4                  6                    40
7         5                  3                    30

Subject   Aggression rank   Aggression rank
1         8                 14
2         10                12
3         4                 9
4         1                 4
5         5                 11
6         6                 10
7         3                 1
8         9                 12
9         7                 10
10        2                 4

EXAMPLE DATA 3
Gender   Leadership level   Leadership style   Performance as perceived by teachers
1        18                 Autocratic         20
1        20                 Autocratic         30
1        24                 Autocratic         40
1        11                 Democratic         85
1        15                 Democratic         70
2        16                 Democratic         30
2        12                 Democratic         80
2        19                 Autocratic         40
2        17                 Democratic         25
2        22                 Autocratic         75
