Logistic transformation

$$P(y \mid x) = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}}$$

$$\ln\!\left(\frac{P(y \mid x)}{1 - P(y \mid x)}\right) = \alpha + \beta x$$

The left-hand side of the second equation is the logit of P(y|x).
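The two transformations can be sketched directly in R. The function names `logistic` and `logit` are illustrative; base R's `plogis()` and `qlogis()` implement the same maps.

```r
# Sketch of the logistic transformation and its inverse, the logit.
logistic <- function(eta) exp(eta) / (1 + exp(eta))   # P(y|x) from alpha + beta*x
logit    <- function(p)   log(p / (1 - p))            # alpha + beta*x from P(y|x)

# Round trip: logit(logistic(eta)) recovers eta
eta <- 1.7
p   <- logistic(eta)
logit(p)   # back to 1.7
```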
Log likelihood

$$l(w) = \sum_{i=1}^{N} y_i \log p(x_i; w) + (1 - y_i)\log\bigl(1 - p(x_i; w)\bigr)$$

$$= \sum_{i=1}^{N} y_i \log\frac{p(x_i; w)}{1 - p(x_i; w)} + \log\frac{1}{1 + e^{x_i w}}$$

$$= \sum_{i=1}^{N} y_i x_i w - \log\bigl(1 + e^{x_i w}\bigr)$$
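The algebraic reduction above can be checked numerically. A minimal sketch (hypothetical function names; `X` holds the rows $x_i$, `w` the coefficient vector):

```r
# Log likelihood of logistic regression, written both ways from the
# derivation above; the two forms agree term by term.
loglik_direct <- function(w, X, y) {
  p <- plogis(X %*% w)                       # p(x_i; w)
  sum(y * log(p) + (1 - y) * log(1 - p))
}
loglik_reduced <- function(w, X, y) {
  eta <- X %*% w                             # x_i w
  sum(y * eta - log(1 + exp(eta)))
}
```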
Wald test

$$z = \frac{\hat{\beta}_{\mathrm{MLE}} - \beta_0}{\widehat{SE}\bigl(\hat{\beta}_{\mathrm{MLE}}\bigr)}$$
Measuring the Performance of a Binary Classifier

Training Data for a Logistic Regression Model: predicted probabilities

Suppose we use a cutoff of 0.5…

                      actual outcome
                        1      0
predicted    1          8      3
outcome      0          0      9
Test Data
More generally…

                      actual outcome
                        1      0
predicted    1          a      b
outcome      0          c      d

misclassification rate: (b + c) / (a + b + c + d)
sensitivity: a / (a + c)   (aka recall)
specificity: d / (b + d)
predictive value positive: a / (a + b)   (aka precision)
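These four quantities follow directly from the cell counts a, b, c, d. A small R helper (hypothetical name) makes the bookkeeping explicit:

```r
# Classifier metrics from the 2x2 table: a, b = predicted 1; c, d = predicted 0;
# columns are actual 1 and actual 0.
classifier_metrics <- function(a, b, c, d) {
  list(
    misclassification = (b + c) / (a + b + c + d),
    sensitivity       = a / (a + c),   # aka recall
    specificity       = d / (b + d),
    ppv               = a / (a + b)    # predictive value positive, aka precision
  )
}

classifier_metrics(8, 3, 0, 9)   # the cutoff-0.5 table above
```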
Suppose we use a cutoff of 0.5…

                      actual outcome
                        1      0
predicted    1          8      3
outcome      0          0      9

sensitivity: 8 / (8 + 0) = 100%
specificity: 9 / (9 + 3) = 75%
Suppose we use a cutoff of 0.8…

                      actual outcome
                        1      0
predicted    1          6      2
outcome      0          2     10

sensitivity: 6 / (6 + 2) = 75%
specificity: 10 / (10 + 2) = 83%
• Note there are 20 possible thresholds
• Note if threshold = minimum, every case is predicted positive: sensitivity = 100%, specificity = 0%
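Sweeping the cutoff across the predicted probabilities traces out the sensitivity/specificity trade-off (the ROC curve). A sketch with made-up probabilities (`phat` and `y` stand in for the model's predicted probabilities and the actual outcomes):

```r
# Sensitivity and specificity at every candidate threshold.
roc_points <- function(phat, y) {
  t(sapply(sort(unique(phat)), function(cut) {
    pred <- as.numeric(phat >= cut)
    c(threshold   = cut,
      sensitivity = sum(pred == 1 & y == 1) / sum(y == 1),
      specificity = sum(pred == 0 & y == 0) / sum(y == 0))
  }))
}

roc_points(c(0.9, 0.8, 0.4, 0.2), c(1, 1, 0, 0))
```

At the minimum threshold every case is predicted positive, giving sensitivity 100% and specificity 0%, as noted above.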
Ridge and lasso constraints on the coefficients:

$$\sum_{j=1}^{p} \beta_j^2 \le s \qquad\qquad \sum_{j=1}^{p} |\beta_j| \le s$$
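Fits under constraints like these can be sketched in base R via the equivalent penalized (Lagrangian) form; the ridge case is smooth, so a generic optimizer suffices. Dedicated tools (e.g. glmnet, or the l1_logreg package linked below) handle this far more efficiently, and the lasso case needs special handling because of the non-smooth penalty. `ridge_logistic` is a hypothetical name:

```r
# L2-penalized (ridge) logistic regression by direct minimization of the
# penalized negative log likelihood; lambda plays the role of the Lagrange
# multiplier for the constraint sum(beta^2) <= s.
# Sketch only: the intercept is penalized too, which real tools avoid.
ridge_logistic <- function(X, y, lambda) {
  negpenll <- function(w) {
    eta <- X %*% w
    -sum(y * eta - log(1 + exp(eta))) + lambda * sum(w^2)
  }
  optim(rep(0, ncol(X)), negpenll, method = "BFGS")$par
}
```

Larger lambda (smaller s) shrinks the coefficient vector toward zero.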
http://www.bayesianregression.org
http://www.stanford.edu/~boyd/l1_logreg/
Polytomous Logistic Regression (PLR)

$$P(y_i = k \mid x_i) = \frac{\exp(\beta_k x_i)}{\sum_{k'=1}^{r} \exp(\beta_{k'} x_i)}$$
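The PLR probabilities are a softmax over the r linear predictors. A minimal sketch (hypothetical name; `B` is an r × p matrix whose rows are the class coefficient vectors $\beta_k$):

```r
# Class probabilities for polytomous (multinomial) logistic regression:
# softmax of the r linear predictors beta_k x_i.
plr_probs <- function(B, x) {
  eta <- B %*% x                 # r linear predictors
  exp(eta) / sum(exp(eta))       # P(y = k | x), k = 1..r
}
```

The probabilities are positive and sum to one by construction.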
Adrian Dobra
Logistic Regression in R
plot(nomove~conc,data=anesthetic)
> anes.logit <- glm(nomove ~ conc, family=binomial(link="logit"),
data=anesthetic)
> summary(anes.logit)
Call:
glm(formula = nomove ~ conc, family = binomial(link = "logit"),
data = anesthetic)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.76666 -0.74407 0.03413 0.68666 2.06900
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -6.469 2.418 -2.675 0.00748 **
conc 5.567 2.044 2.724 0.00645 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
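Given the fitted coefficients above, the estimated probability of no movement at concentration `conc` is plogis(−6.469 + 5.567·conc). For example (`p_nomove` is an illustrative helper, not part of the fit):

```r
# Fitted probability of nomove at a given concentration, using the
# coefficients reported in the summary above.
p_nomove <- function(conc) plogis(-6.469 + 5.567 * conc)

p_nomove(1.0)            # roughly 0.29
p_nomove(6.469 / 5.567)  # exactly 0.5 where the linear predictor crosses zero
```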
data(frogs)
plot(northing ~ easting, data=frogs, pch=c(1,16)[frogs$pres.abs+1],
xlab="Meters east of reference point", ylab="Meters north")
> frogs.glm0 <- glm(pres.abs ~ .,data=frogs,family=binomial(link=logit))
> summary(frogs.glm0)
Call:
glm(formula = pres.abs ~ ., family = binomial(link = logit),
data = frogs)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.8987 -0.7987 -0.2735 0.8035 2.6991
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.635e+02 2.153e+02 -0.759 0.44764
northing 1.041e-02 1.654e-02 0.630 0.52901
easting -2.158e-02 1.268e-02 -1.702 0.08872 .
altitude 7.091e-02 7.705e-02 0.920 0.35745
distance -4.835e-04 2.060e-04 -2.347 0.01893 *
NoOfPools 2.968e-02 9.444e-03 3.143 0.00167 **
NoOfSites 4.294e-02 1.095e-01 0.392 0.69482
avrain -4.058e-05 1.300e-01 -0.000312 0.99975
meanmin 1.564e+01 6.479e+00 2.415 0.01574 *
meanmax 1.708e+00 6.809e+00 0.251 0.80198
Null deviance: 279.99 on 211 degrees of freedom
Residual deviance: 195.66 on 202 degrees of freedom
AIC: 215.66
> frogs.glm1 <- glm(pres.abs ~ easting + distance + NoOfPools +
meanmin,data=frogs,family=binomial(link=logit))
> summary(frogs.glm1)
Call:
glm(formula = pres.abs ~ easting + distance + NoOfPools + meanmin,
family = binomial(link = logit), data = frogs)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.8082 -0.7942 -0.4048 0.8751 2.9700
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -6.4501010 2.5830600 -2.497 0.012522 *
easting 0.0024885 0.0031229 0.797 0.425533
distance -0.0005908 0.0001744 -3.387 0.000706 ***
NoOfPools 0.0244347 0.0080995 3.017 0.002555 **
meanmin 1.1130796 0.4191963 2.655 0.007924 **
Fold: 5 4 9 6 3 2 1 10 8 7
Internal estimate of accuracy = 0.792
Cross-validation estimate of accuracy = 0.778
> CVbinary(frogs.glm1)
Fold: 4 6 1 10 8 2 3 7 5 9
Internal estimate of accuracy = 0.759
Cross-validation estimate of accuracy = 0.731
> all.acc
[1] 0.7688679 0.7783019 0.7783019 0.7830189 0.7735849 0.7735849 0.7735849 0.7735849 0.7
> red.acc
[1] 0.7264151 0.7500000 0.7264151 0.7358491 0.7358491 0.7358491 0.7547170 0.7452830 0.7
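CVbinary (from the DAAG package) reports both an internal (resubstitution) accuracy and a k-fold cross-validation accuracy; the latter is the more honest estimate. The idea can be sketched in base R (hypothetical function name, simulated data in the test):

```r
# k-fold cross-validated classification accuracy for a binomial glm,
# mirroring what CVbinary reports.  Sketch only: CVbinary handles the
# details (stratification, repeated folds) with more care.
cv_accuracy <- function(formula, data, k = 10, cutoff = 0.5) {
  n    <- nrow(data)
  fold <- sample(rep(1:k, length.out = n))   # random fold assignment
  hits <- 0
  for (f in 1:k) {
    fit  <- glm(formula, data = data[fold != f, ], family = binomial)
    phat <- predict(fit, newdata = data[fold == f, ], type = "response")
    yobs <- model.response(model.frame(formula, data[fold == f, ]))
    hits <- hits + sum((phat >= cutoff) == (yobs == 1))
  }
  hits / n
}
```

Because the folds are random, repeated calls give slightly different estimates, just as the two CVbinary runs above do.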
par(mfrow=c(3,3))
for (i in 2:10) {
  ints <- pretty(frogs[,i], n=10)       # cut the i-th predictor into bins
  J <- length(ints) - 2
  y <- n <- x <- rep(0, J-1)
  for (j in 2:J) {
    temp <- frogs[(frogs[,i] > ints[j]) & (frogs[,i] <= ints[j+1]), ]
    y[j-1] <- sum(temp$pres.abs)        # successes in bin
    n[j-1] <- dim(temp)[1]              # bin size
    x[j-1] <- (ints[j] + ints[j+1])/2   # bin midpoint
  }
  y <- logit((y + 0.5)/(n + 1))         # empirical logits; logit() from e.g. car
  # plot(x, y, type="b", col="red")
  # par(new=TRUE)
  plot(lowess(x, y), main=names(frogs)[i], type="l", col="red")
}
> frogs.glm2 <- glm(pres.abs ~ northing + easting + altitude +
    log(distance) + NoOfPools + NoOfSites + avrain + meanmin + meanmax,
    data=frogs, family=binomial(link=logit))
Deviance Residuals:
Min 1Q Median 3Q Max
-2.0974 -0.7644 -0.2583 0.7443 2.6454
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.435e+02 2.130e+02 -0.674 0.50061
northing 9.661e-03 1.576e-02 0.613 0.53992
easting -1.874e-02 1.280e-02 -1.464 0.14322
altitude 6.550e-02 7.579e-02 0.864 0.38749
log(distance) -7.999e-01 2.551e-01 -3.136 0.00171 **
NoOfPools 2.813e-02 9.499e-03 2.961 0.00306 **
NoOfSites 1.897e-03 1.104e-01 0.017 0.98629
avrain -7.682e-03 1.229e-01 -0.062 0.95018
meanmin 1.463e+01 6.531e+00 2.239 0.02513 *
meanmax 1.336e+00 6.690e+00 0.200 0.84174
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
• changepoints
• other transformations: log, ^2, exp, etc.
• Prob(Yi = 1) → 0 as β0 + β1X1i → −∞
• Prob(Yi = 1) → 1 as β0 + β1X1i → ∞
  – Thus, probabilities from the logit model will be between 0 and 1
Probit Model
• In the probit model, we assume the error in
the utility index model is normally distributed
– εi ~ N(0, σ²)
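In R the probit model is fit with the same glm machinery as the logit, swapping the link. A sketch on simulated data (the true coefficients 0.5 and 1.2 are made up for illustration):

```r
# Probit fit: binomial glm with a normal-CDF link, matching the
# normally distributed error in the utility index model.
set.seed(4)
x <- rnorm(100)
y <- rbinom(100, 1, pnorm(0.5 + 1.2 * x))   # latent-normal-error model
probit.fit <- glm(y ~ x, family = binomial(link = "probit"))
coef(probit.fit)   # estimates of 0.5 and 1.2, up to sampling noise
```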
[Figure: scatterplot of y2 against x]
OLS with heteroskedastic error
• The OLS regression model still gives good values for the
slope and intercept, but you can’t really trust the t scores.
. * try models with heteroskedastic error terms
. generate e2 = x*zscore
. generate y2 = x + e2
. regress y2 x
------------------------------------------------------------------------------
y2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
x | 1.002492 .1700166 5.896 0.000 .6650998 1.339884
_cons | -.2737584 .90764 -0.302 0.764 -2.07494 1.527424
------------------------------------------------------------------------------
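The same experiment can be reproduced in R (variable names follow the Stata snippet; the x range 0 to 9 matches the plot, and the sample size is an assumption):

```r
# Heteroskedastic error: variance grows with x, as in the Stata example
# (e2 = x * zscore).  OLS still estimates the slope near 1, but the usual
# standard errors, and hence the t scores, are unreliable.
set.seed(5)
x      <- runif(100, 0, 9)
zscore <- rnorm(100)
e2     <- x * zscore
y2     <- x + e2
fit    <- lm(y2 ~ x)
coef(fit)   # slope near 1
```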
Tobit with heteroskedastic error
• The Tobit model gives values for the slope and intercept
that are simply incorrect. (Too steep)
------------------------------------------------------------------------------
y2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
x | 1.880274 .3261122 5.766 0.000 1.233196 2.527351
_cons | -6.947138 2.0908 -3.323 0.001 -11.09574 -2.798537
---------+--------------------------------------------------------------------
_se | 6.612864 .7459541 (Ancillary parameter)
------------------------------------------------------------------------------