Logistic Regression I

(Explanatory Variables: Continuous)

Dr. Kusman Sadik, M.Si


Graduate Study Program
Department of Statistics, IPB, 2018/2019
• The logistic regression model is a generalized linear
  model appropriate for binary outcomes.
• The logistic regression model is similar to the more
  familiar linear regression model in that both models
  predict an outcome variable from a set of predictor
  variables (also known as explanatory or independent
  variables).
• In logistic regression, the response variable is binary
  (dichotomous): it can take only one of two possible
  values.
• To model the probabilities of certain conditions or states (e.g.,
  divorce, disease, resilience, etc.) as a function of some predictor
  variables. For example, one might want to model whether an
  individual has diabetes as a function of weight, plasma insulin,
  and fasting plasma glucose.
• To describe differences between individuals from separate
  groups as a function of some predictor variables, also known
  as descriptive discriminant analysis. For example, one might
  want to describe the difference between students who attend
  public versus private schools as a function of achievement test
  scores, desired occupation, and socioeconomic status (SES).

• To classify individuals into one of two categories on the basis
  of the predictor variables, also known as predictive
  discriminant analysis.
• This is closely related to descriptive discriminant analysis,
  but the descriptive information is used to predict group
  membership or classify individuals into one of two groups.
• For example, one may want to predict whether a student is
  more likely to attend a private school (as opposed to a public
  school) as a function of achievement test scores, desired
  occupation, and SES.
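The classification idea above can be sketched in R. This minimal example uses the first eight observed responses and fitted probabilities from the horseshoe crab example later in these slides. The 0.5 cutoff and the names `cutoff`, `predicted`, and `accuracy` are assumptions made for illustration, not part of the original material.

```r
# First eight rows of the fitted-value table shown later in these slides
y      <- c(1, 1, 0, 0, 1, 1, 0, 0)                          # observed response
dugaan <- c(0.85, 0.64, 0.59, 0.13, 0.89, 0.52, 0.66, 0.51)  # fitted P(Y = 1)

# Assumed decision rule: classify as 1 when the fitted probability exceeds 0.5
cutoff    <- 0.5
predicted <- as.integer(dugaan > cutoff)

# Cross-tabulate observed against predicted class
print(table(observed = y, predicted = predicted))

# Proportion of these eight cases classified correctly
accuracy <- mean(predicted == y)
print(accuracy)
```

A different cutoff changes the classification, so in practice the choice depends on the relative cost of the two kinds of misclassification.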

• In the psychometric field, there are specific applications that
  are closely tied to predictive discriminant analysis.
• For example, one might want to predict the probability that
  an examinee will correctly answer a test item as a function
  of race and gender.
• These types of studies are known in the psychometric
  literature as differential item functioning analyses.

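\[
\frac{\pi}{1-\pi} = \frac{P(Y=1)}{1-P(Y=1)} = \frac{P(Y=1)}{P(Y=0)} = \text{Odds} = \exp(\alpha + \boldsymbol{\beta}\boldsymbol{X})
\]

\[
\pi = \frac{\exp(\alpha + \boldsymbol{\beta}\boldsymbol{X})}{1 + \exp(\alpha + \boldsymbol{\beta}\boldsymbol{X})}
\]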
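As a quick numerical check of the relation between the odds and π, the following R sketch converts a linear predictor to odds and then to a probability. The values of `alpha`, `beta`, and `x` are illustrative assumptions, not estimates from these slides.

```r
# Illustrative (assumed) parameter values; not taken from the slides
alpha <- -1.0
beta  <- 0.5
x     <- 3

eta  <- alpha + beta * x   # linear predictor alpha + beta * x
odds <- exp(eta)           # odds = pi / (1 - pi)
p    <- odds / (1 + odds)  # solve the odds equation for pi

# plogis() is base R's inverse-logit function and gives the same value
p_check <- plogis(eta)

# Going back: the odds recovered from the probability equal exp(eta)
odds_back <- p / (1 - p)

print(c(eta = eta, odds = odds, p = p, p_check = p_check))
```

The probability is stored in `p` rather than `pi` so that R's built-in constant `pi` is not masked.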

χ²(α, df): qchisq(α, df, lower.tail=FALSE)

> qchisq(0.05, 1, lower.tail=FALSE)
[1] 3.841459

So χ²(α = 0.05, df = 1) = 3.841459
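To see how this critical value relates to other base R distribution functions, the sketch below recovers the tail area with `pchisq()` and compares the value with the squared standard normal critical value. The variable names are chosen here for illustration.

```r
alpha <- 0.05
dfree <- 1   # degrees of freedom (named dfree to avoid masking stats::df)

# Upper-tail critical value, as computed on the slide
crit <- qchisq(alpha, dfree, lower.tail = FALSE)

# pchisq() inverts qchisq(): the upper-tail area beyond crit is alpha
tail_area <- pchisq(crit, dfree, lower.tail = FALSE)

# With 1 degree of freedom, the chi-square critical value equals the
# square of the two-sided standard normal critical value
z_crit <- qnorm(1 - alpha / 2)

print(c(crit = crit, tail_area = tail_area, z_squared = z_crit^2))
```

This is why a two-sided Wald z test and a 1-df Wald chi-square test lead to the same decision.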

if (sa[i] > 0) (y[i] = 1) else (y[i] = 0)
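The same rule can be written without an explicit loop. The sketch below applies it to the first eight satellite counts (`Sa`) from the horseshoe crab table in these slides. Using `as.integer()` on a logical comparison is one possible idiom, offered here as an alternative rather than as the author's method.

```r
# First eight satellite counts (Sa) from the horseshoe crab table
sa <- c(8, 4, 0, 0, 1, 3, 0, 0)

# Vectorized form of the rule: y = 1 if the crab has at least one satellite
y <- as.integer(sa > 0)

print(data.frame(sa, y))
```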

** Horseshoe Crab Data (Agresti, Section 5.1.3) **

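# Read the horseshoe crab data (columns: C, S, W, Wt, Sa)
dataku <- read.csv(file = "Data.Horseshoe.Crab.csv")

color <- factor(dataku[, 1])  # C: color (not named c, to avoid masking base c())
spine <- factor(dataku[, 2])  # S: spine condition
w     <- dataku[, 3]          # W: carapace width
wt    <- dataku[, 4]          # Wt: weight
sa    <- dataku[, 5]          # Sa: number of satellites

# Binary response: y = 1 if the crab has at least one satellite, else 0
y <- numeric(length(sa))
for (i in seq_along(sa)) {
  if (sa[i] > 0) y[i] <- 1 else y[i] <- 0
}

# Logistic regression of y on width
model <- glm(y ~ w, family = binomial(link = logit))
summary(model)

# Fitted probabilities P(Y = 1), rounded to two decimals
dugaan <- round(fitted(model), 2)
data.frame(w, y, dugaan)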

     C  S     W    Wt  Sa
  1  2  3  28.3  3.05   8
  2  3  3  26.0  2.60   4
  3  3  3  25.6  2.15   0
  4  4  2  21.0  1.85   0
  5  2  3  29.0  3.00   1
  6  1  2  25.0  2.30   3
  7  4  3  26.2  1.30   0
  8  2  3  24.9  2.10   0
  9  2  1  25.7  2.00   8
 10  2  3  27.5  3.15   6
 11  1  1  26.1  2.80   5
 12  3  3  28.9  2.80   4
 13  2  1  30.3  3.60   3
...
170  2  3  26.5  2.35   4
171  2  3  26.5  2.75   7
172  3  3  26.1  2.75   3
173  2  2  24.5  2.00   0

        w  sa  y
  1  28.3   8  1
  2  26.0   4  1
  3  25.6   0  0
  4  21.0   0  0
  5  29.0   1  1
  6  25.0   3  1
  7  26.2   0  0
  8  24.9   0  0
...
172  26.1   3  1
173  24.5   0  0
Call:
glm(formula = y ~ w, family = binomial(link = logit))

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  -12.3508     2.6287  -4.698 2.62e-06 ***
w              0.4972     0.1017   4.887 1.02e-06 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’

    Null deviance: 225.76 on 172 degrees of freedom
Residual deviance: 194.45 on 171 degrees of freedom
AIC: 198.45
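The numbers in this output can be combined into the usual tests for the effect of `w`. The sketch below is a hedged illustration that re-enters the rounded values printed above by hand. The variable names are assumptions, and results may differ in the last digits from what R would report from the fitted model object.

```r
# Values copied from the summary(model) output above (rounded as printed)
null_dev  <- 225.76; null_df  <- 172
resid_dev <- 194.45; resid_df <- 171
b_w  <- 0.4972   # estimated coefficient of w
se_w <- 0.1017   # its standard error

# Likelihood-ratio statistic: drop in deviance when w is added to the model
G2      <- null_dev - resid_dev
df_diff <- null_df - resid_df
crit    <- qchisq(0.05, df_diff, lower.tail = FALSE)
p_lr    <- pchisq(G2, df_diff, lower.tail = FALSE)

# Wald statistic; differs slightly from the printed z value because the
# inputs used here are already rounded
z_wald <- b_w / se_w

# Estimated multiplicative change in the odds per one-unit increase in w,
# with an approximate 95% Wald interval
or_w  <- exp(b_w)
or_ci <- exp(b_w + c(-1, 1) * qnorm(0.975) * se_w)

print(c(G2 = G2, crit = crit, p_lr = p_lr, z_wald = z_wald, or_w = or_w))
print(or_ci)
```

Because the deviance drop far exceeds the 5% critical value for 1 degree of freedom, the likelihood-ratio test agrees with the Wald test in rejecting the hypothesis that the coefficient of `w` is zero.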

        w  y  dugaan
  1  28.3  1    0.85
  2  26.0  1    0.64
  3  25.6  0    0.59
  4  21.0  0    0.13
  5  29.0  1    0.89
  6  25.0  1    0.52
  7  26.2  0    0.66
  8  24.9  0    0.51
...
171  26.5  1    0.70
172  26.1  1    0.65
173  24.5  0    0.46
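The fitted values in the `dugaan` column can be reproduced by hand from the estimated coefficients. This short sketch uses the rounded estimates from the model output and the widths of the first five crabs. The names `b0` and `b1` are introduced here for illustration.

```r
# Rounded estimates from the summary(model) output
b0 <- -12.3508   # intercept
b1 <- 0.4972     # coefficient of w

# Widths of the first five crabs in the table above
w <- c(28.3, 26.0, 25.6, 21.0, 29.0)

# Estimated P(Y = 1) = exp(b0 + b1 * w) / (1 + exp(b0 + b1 * w))
dugaan <- round(plogis(b0 + b1 * w), 2)

print(data.frame(w, dugaan))
```

The results match the first five entries of the `dugaan` column, which confirms that `fitted(model)` returns probabilities on the response scale rather than values of the linear predictor.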

Compare this SAS output
with the R output

1. Use R for the Horseshoe Crabs Revisited data
(Agresti, Section 5.1.3).
a. Fit a logistic regression model with Width (x) as the
explanatory variable. Compare the R output with the
SAS output in Agresti's book.
b. Fit a logistic regression model with Width (x) and
√(Width), i.e. √(x), as the explanatory variables. Use the Wald
test to determine whether each of the two explanatory variables
has a significant effect. What do you conclude?
c. Compare the models from parts (a) and (b) above. Which model
is better? Explain.
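The R syntax needed for parts (b) and (c) can be sketched as follows. This example runs on simulated data, not on the horseshoe crab data, so its numerical results say nothing about the exercise. The sample size, the coefficients used to simulate, and the names `m_a` and `m_b` are all assumptions made for illustration.

```r
set.seed(1)  # simulated data only; NOT the horseshoe crab data

# Hypothetical predictor and binary response for illustration
x <- runif(200, min = 21, max = 34)
y <- rbinom(200, size = 1, prob = plogis(-12 + 0.5 * x))

# Model with x only, and model with x and sqrt(x)
m_a <- glm(y ~ x,           family = binomial(link = "logit"))
m_b <- glm(y ~ x + sqrt(x), family = binomial(link = "logit"))

# Wald z tests for each coefficient appear in the coefficient table
print(coef(summary(m_b)))

# Two common ways to compare the nested models:
# a likelihood-ratio (deviance) test and AIC
print(anova(m_a, m_b, test = "Chisq"))
print(AIC(m_a, m_b))
```

Since x and √(x) are strongly correlated over a narrow range of widths, the individual Wald tests in the larger model can be much weaker than the test for x alone. This is worth keeping in mind when interpreting part (b).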

2. Use R to solve Problem 8.9 (Azen, pp. 212-213).

References

1. Azen, R. and Walker, C.M. (2011). Categorical Data
Analysis for the Behavioral and Social Sciences.
New York: Routledge, Taylor and Francis Group.
2. Agresti, A. (2002). Categorical Data Analysis, 2nd ed.
New York: Wiley.
3. Other relevant references.

Available for download at

kusmansadik.wordpress.com

Thank You
