
HE3021, NTU

Lecture 7

LECTURE 7: BINARY CHOICE MODELS


1 Modeling Choice Decision
2 Binary Choice Models: LPM, Probit, Logit
3 Estimation and Testing

FENG Qu

1. Modeling Choice Decision


Example of a (binary) choice decision: having another child
Other examples:
female labor force participation: work or be a housewife
marriage decision: married or single
applying for a mortgage: accepted or denied
admitted or not by NTU
go to graduate school or work: yes or no
vote or not vote
buy or not buy
(Q: Common feature?)
Binary choice: the decision variable y = 1 (yes) or y = 0 (no)

How to model people's choice decisions?


Example: having another child
Economics: benefit-(opportunity) cost analysis (Gary Becker, 1981)
y = 1 for having another child; y = 0 for not having one.

u1 denotes the benefit from having another child, e.g., the policy packages above, tax rebate, baby bonus, a bigger HDB flat, subsidized childcare, happiness, etc.;
u0 denotes the (opportunity) cost, e.g., economic costs (of pregnancy, birth, raising the child, education, healthcare), forgone leisure, forgone pay increases, etc.
Then the decision can be modeled by:
y = 1 if u1 − u0 > 0;
y = 0 otherwise

Another example: female labor force participation


Economics: the outcome of a market process
Demanders: offer a wage based on labor's expected marginal product
Suppliers: accept the offer or not, depending on whether it exceeds their own reservation wage
From the woman's side, being in the labor force or not is a trade-off between the wage and its alternatives: taking care of the kids, housekeeping, leisure, etc.
y = 1 for working; y = 0 for not being in the labor force. If u1 denotes the utility from working and u0 denotes the utility from staying at home, then the decision can be modeled by:
y = 1 if u1 − u0 > 0;
y = 0 otherwise

As in linear regression models, suppose that the difference u1 − u0 can be explained by observable characteristics and an unobserved error term:

u1 − u0 = β0 + β1 x1 + … + βk xk + e.

Thus,
P(y = 1 | x) = P(u1 − u0 > 0) = P(β0 + β1 x1 + … + βk xk + e > 0)
Suppose the random variable e follows a distribution with CDF F(·). Thus
P(y = 1 | x) = P(e > −(β0 + β1 x1 + … + βk xk))
             = 1 − F(−(β0 + β1 x1 + … + βk xk))

If the distribution of e is symmetric around 0, then

P(y = 1 | x) = F(β0 + β1 x1 + … + βk xk) and
P(y = 0 | x) = 1 − P(y = 1 | x) = 1 − F(β0 + β1 x1 + … + βk xk).
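The symmetry step above can be checked by simulation. A minimal Python sketch, with hypothetical coefficient values b0, b1 and a standard normal error (so F = Φ):

```python
import math
import random

random.seed(42)
b0, b1 = -0.5, 1.0    # hypothetical coefficients
x = 1.2               # a fixed value of the single regressor

# Simulate the latent-variable rule y = 1 if b0 + b1*x + e > 0, with e ~ N(0, 1)
n = 200_000
y = [1 if b0 + b1 * x + random.gauss(0, 1) > 0 else 0 for _ in range(n)]
p_hat = sum(y) / n

# By symmetry of the normal distribution, P(y = 1 | x) = Phi(b0 + b1*x)
p_theory = 0.5 * (1 + math.erf((b0 + b1 * x) / math.sqrt(2)))
print(p_hat, p_theory)   # the two probabilities should be very close
```

The simulated frequency of y = 1 matches Φ(β0 + β1 x), which is exactly the probit model introduced on the next slide.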

2. Binary Choice Models: LPM, Probit & Logit


What functional forms G(·) can we use?

Linear probability model (LPM): linear (identity) function

G(β0 + β1 x1 + … + βk xk) = β0 + β1 x1 + … + βk xk

Probit model: standard normal CDF Φ(·)

G(β0 + β1 x1 + … + βk xk) = Φ(β0 + β1 x1 + … + βk xk)

Logit model: logistic CDF

G(β0 + β1 x1 + … + βk xk) = exp(β0 + β1 x1 + … + βk xk) / [1 + exp(β0 + β1 x1 + … + βk xk)]
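To see how the three functional forms behave, here is an illustrative Python sketch of G(·) for each model (function names are mine):

```python
import math

def G_lpm(z):     # LPM: identity function
    return z

def G_probit(z):  # probit: standard normal CDF Phi(z)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def G_logit(z):   # logit: logistic CDF
    return math.exp(z) / (1 + math.exp(z))

for z in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(z, G_lpm(z), round(G_probit(z), 4), round(G_logit(z), 4))
```

The probit and logit columns always stay strictly inside (0, 1), while the LPM column is just z itself, which foreshadows the LPM's out-of-range-prediction problem discussed below.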

LPM: OLS with a binary dependent variable (y = 1 or 0)

yi = β0 + β1 xi1 + … + βk xik + ui,  i = 1, …, n.

Example 1: birth intention in Singapore (data set: babybonus.dta)


dep. var. (y): (yb) 1 for intention to have another child; 0 otherwise
indep. var. (x): (a scale measure of the) policy package (cpl), current number of children (num), husband's monthly income (hmi), wife's education (we), wife's age (wa)
. reg yb cpl num hmi we wa

      Source |       SS       df       MS              Number of obs =     122
-------------+------------------------------           F(  5,   116) =   12.96
       Model |  10.9213986     5  2.18427972           Prob > F      =  0.0000
    Residual |  19.5458145   116  .168498401           R-squared     =  0.3585
-------------+------------------------------           Adj R-squared =  0.3308
       Total |  30.4672131   121   .25179515           Root MSE      =  .41049

          yb |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         cpl |   .1865787    .076402     2.44   0.016     .0352548    .3379026
         num |  -.2759397   .0454467    -6.07   0.000    -.3659526   -.1859267
         hmi |  -.0005362   .0362719    -0.01   0.988    -.0723773    .0713049
          we |  -.0551932   .0474809    -1.16   0.247    -.1492351    .0388486
          wa |  -.0175984   .0085417    -2.06   0.042    -.0345164   -.0006805
       _cons |   1.521159   .2663598     5.71   0.000     .9935997    2.048718

Result:
The policy package (cpl) has a large positive effect on people's birth intention.

(Q: how to interpret the coefficient .187?)

Prediction:
Stata command: predict yhat, xb

(Q: What does the predicted value ŷ = 0.499 mean? Other predicted values: 1.121, −.016)

First, for the LPM, E[y|x] = P(y = 1|x) = β0 + β1 x1 + … + βk xk, so

ŷ = β̂0 + β̂1 x1 + … + β̂k xk = est. P(y = 1|x)

predicted value ŷ: the predicted probability of success (having another child)

0.499 is couple 1's predicted probability of having another child, given the other factors.

Second, βj is the partial effect of xj on the probability of success P(y = 1|x):

∂E[y|x]/∂xj = ∂P(y = 1|x)/∂xj = βj,  j = 1, …, k

β̂j: the estimated partial effect (the ceteris paribus interpretation)

E.g., β̂1 = .187 can be interpreted as: one additional unit of the policy package increases the probability of having another child by 18.7 percentage points, holding other factors fixed.
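Because the LPM is linear, this .187 effect is the same at every starting point. A quick Python check using the coefficients from the table above (the couple's x-values here are hypothetical):

```python
# Coefficients from the LPM regression of yb above
b = {"_cons": 1.521159, "cpl": .1865787, "num": -.2759397,
     "hmi": -.0005362, "we": -.0551932, "wa": -.0175984}

def yhat(cpl, num, hmi, we, wa):
    # fitted value = predicted probability of having another child
    return (b["_cons"] + b["cpl"] * cpl + b["num"] * num
            + b["hmi"] * hmi + b["we"] * we + b["wa"] * wa)

# A hypothetical couple, then the same couple with one more unit of cpl
base = yhat(cpl=2, num=1, hmi=5, we=12, wa=30)
plus = yhat(cpl=3, num=1, hmi=5, we=12, wa=30)
print(plus - base)   # equals the cpl coefficient, .1865787, at any x
```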

Example 2: Women's Labor Force Participation (data set: MROZ.dta)


inlf = β0 + β1 nwifeinc + β2 educ + β3 kidslt6 + u,  i = 1, …, n

o inlf: 1 for being in the labor force and 0 for being a housewife;
o nwifeinc: husband's earnings
o educ: wife's education
o kidslt6: number of children less than 6 years old
Estimation: run the multiple regression:
. reg inlf nwifeinc educ kidslt6

      Source |       SS       df       MS              Number of obs =     753
-------------+------------------------------           F(  3,   749) =   34.38
       Model |  22.3586557     3  7.45288523           Prob > F      =  0.0000
    Residual |    162.3691   749  .216781175           R-squared     =  0.1210
-------------+------------------------------           Adj R-squared =  0.1175
       Total |  184.727756   752  .245648611           Root MSE      =   .4656

        inlf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0077404    .001519    -5.10   0.000    -.0107224   -.0047583
        educ |   .0572465   .0077912     7.35   0.000     .0419513    .0725418
     kidslt6 |  -.2227047   .0325987    -6.83   0.000    -.2867004    -.158709
       _cons |   .0737593   .0931678     0.79   0.429    -.1091417    .2566604

Advantages of LPM:

easy to implement: OLS
simple to interpret results
straightforward to test hypotheses
the OLS estimator is consistent
(Q: why?)

Disadvantages of LPM:
1. heteroskedasticity:
o use heteroskedasticity-robust inference (exercise)
2. the predicted probability could be < 0 or > 1!
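Disadvantage 2 can be seen directly from the Example 2 estimates. A short Python check (the two x-vectors are hypothetical but lie inside the observed ranges of the data):

```python
# LPM coefficients from the inlf regression above
b0, b_nwifeinc, b_educ, b_kidslt6 = .0737593, -.0077404, .0572465, -.2227047

def yhat(nwifeinc, educ, kidslt6):
    # LPM "predicted probability" of being in the labor force
    return b0 + b_nwifeinc * nwifeinc + b_educ * educ + b_kidslt6 * kidslt6

p_high = yhat(nwifeinc=0, educ=17, kidslt6=0)   # high education, no young kids
p_low = yhat(nwifeinc=96, educ=5, kidslt6=3)    # high other income, 3 young kids
print(p_high, p_low)   # one "probability" above 1 and one below 0
```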

Graphic Interpretation of LPM


Example: explain mortgage application outcomes by the debt payments-to-income (P/I) ratio
LPM:

Probit and Logit Models


LPM model:
E[y|x] = P(y = 1|x) = G(β0 + β1 x1 + … + βk xk) = β0 + β1 x1 + … + βk xk

Probit model: standard normal CDF Φ(·)

P(y = 1|x) = Φ(β0 + β1 x1 + … + βk xk)

Logit model: logistic CDF

P(y = 1|x) = exp(β0 + β1 x1 + … + βk xk) / [1 + exp(β0 + β1 x1 + … + βk xk)]

For these CDFs, 0 ≤ Φ(z) ≤ 1 and 0 ≤ exp(z)/[1 + exp(z)] ≤ 1, so the estimated probability is

ŷ = Φ(β̂0 + β̂1 x1 + … + β̂k xk) for the probit model

and

ŷ = exp(β̂0 + β̂1 x1 + … + β̂k xk) / [1 + exp(β̂0 + β̂1 x1 + … + β̂k xk)] for the logit model

Probit & Logit Models

The estimated P(y = 1|x) falls in the unit interval [0, 1] for all x1, …, xk

probit regression of women's labor force participation


Stata command: probit inlf nwifeinc educ kidslt6

. probit inlf nwifeinc educ kidslt6

Iteration 0:   log likelihood =  -514.8732
Iteration 1:   log likelihood = -466.34923
Iteration 2:   log likelihood =  -465.4538
Iteration 3:   log likelihood = -465.45302

Probit regression                                 Number of obs   =        753
                                                  LR chi2(3)      =      98.84
                                                  Prob > chi2     =     0.0000
Log likelihood = -465.45302                       Pseudo R2       =     0.0960

        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0231133   .0045451    -5.09   0.000    -.0320216    -.014205
        educ |   .1666664   .0235149     7.09   0.000      .120578    .2127547
     kidslt6 |  -.6525247   .0996887    -6.55   0.000     -.847911   -.4571383
       _cons |  -1.245253   .2714193    -4.59   0.000    -1.777225   -.7132807

.1667 is the coefficient of educ from the probit regression (.057 in the LPM).

(Q: why so different?)
Compare the estimation results with those of the LPM
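The Pseudo R2 in the probit header can be recovered from the iteration log: iteration 0 is the log likelihood of the intercept-only model. A one-line Python check (this is McFadden's pseudo R-squared, which is what Stata reports):

```python
ll0 = -514.8732    # iteration 0: intercept-only log likelihood
ll = -465.45302    # converged log likelihood
pseudo_r2 = 1 - ll / ll0   # McFadden's pseudo R-squared
print(round(pseudo_r2, 4))   # matches the reported Pseudo R2 of 0.0960
```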

logit regression of women's labor force participation


Stata command: logit inlf nwifeinc educ kidslt6

. logit inlf nwifeinc educ kidslt6

Iteration 0:   log likelihood =  -514.8732
Iteration 1:   log likelihood = -466.55427
Iteration 2:   log likelihood = -465.55673
Iteration 3:   log likelihood = -465.55373

Logistic regression                               Number of obs   =        753
                                                  LR chi2(3)      =      98.64
                                                  Prob > chi2     =     0.0000
Log likelihood = -465.55373                       Pseudo R2       =     0.0958

        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0385731   .0078653    -4.90   0.000    -.0539887   -.0231574
        educ |   .2741035   .0399976     6.85   0.000     .1957097    .3524973
     kidslt6 |  -1.068074    .167187    -6.39   0.000    -1.395755    -.740394
       _cons |  -2.046709   .4552589    -4.50   0.000       -2.939   -1.154418

.274 is the coefficient of educ from the logit regression (.057 in the LPM and .1666 in the probit).
(Q: why so different?)

Partial (or marginal) effect of xj on P(y = 1|x) = G(β0 + β1 x1 + … + βk xk)


ceteris paribus effect: the effect of a one-unit change in xj on the probability of success P(y = 1|x) = G(β0 + β1 x1 + … + βk xk), holding the other factors fixed.

(i) Continuous xj:

∂P(y = 1|x)/∂xj = g(β0 + β1 x1 + … + βk xk) βj,  j = 1, …, k,

where g(z) = dG(z)/dz.

For the probit model, G(z) = Φ(z) and g(z) = φ(z)

For the logit model, G(z) = exp(z)/[1 + exp(z)] and g(z) = G′(z) = exp(z)/[1 + exp(z)]²

(For the LPM, g(z) = 1)

(ii) Discrete xj, e.g. x1 changing from 1 to 0: the partial effect is defined as

G(β0 + β1 + β2 x2 + … + βk xk) − G(β0 + β2 x2 + … + βk xk)
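The logit density formula g(z) = exp(z)/[1 + exp(z)]² can be verified numerically: it should match a finite-difference derivative of the logistic CDF. A small Python sketch:

```python
import math

def Lam(z):   # logistic CDF
    return math.exp(z) / (1 + math.exp(z))

def lam(z):   # its derivative, the logistic density
    return math.exp(z) / (1 + math.exp(z)) ** 2

z, h = 0.7, 1e-6
finite_diff = (Lam(z + h) - Lam(z - h)) / (2 * h)
print(lam(z), finite_diff)   # analytical vs. numerical derivative
```

Note that lam(z) also equals Lam(z)·(1 − Lam(z)), a convenient form for logit marginal-effect calculations.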

Remarks:
1. Different from the LPM, the partial effects in probit and logit models are not constant; they depend on the values of x. The slope parameter βj is NOT the partial effect of xj on the probability of success, implying that the interpretations of the coefficients in these 3 models are different and not comparable.

2. Since g(·) > 0 for probit and logit, the direction of the partial effect of xj depends on the sign of βj.

3. Calculation of marginal effects at the mean values of the regressors in probit and logit regressions:
Stata command: mfx

Probit regression:

. probit inlf nwifeinc educ kidslt6

Iteration 0:   log likelihood =  -514.8732
Iteration 1:   log likelihood = -466.34923
Iteration 2:   log likelihood =  -465.4538
Iteration 3:   log likelihood = -465.45302

Probit regression                                 Number of obs   =        753
                                                  LR chi2(3)      =      98.84
                                                  Prob > chi2     =     0.0000
Log likelihood = -465.45302                       Pseudo R2       =     0.0960

        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0231133   .0045451    -5.09   0.000    -.0320216    -.014205
        educ |   .1666664   .0235149     7.09   0.000      .120578    .2127547
     kidslt6 |  -.6525247   .0996887    -6.55   0.000     -.847911   -.4571383
       _cons |  -1.245253   .2714193    -4.59   0.000    -1.777225   -.7132807

following the probit regression, run Stata command: mfx

. mfx

Marginal effects after probit
      y  = Pr(inlf) (predict)
         =  .57228348

variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
nwifeinc |  -.0090691      .00178   -5.08   0.000  -.012566 -.005572   20.129
    educ |   .0653958      .00921    7.10   0.000   .047335  .083457  12.2869
 kidslt6 |  -.2560349      .03923   -6.53   0.000  -.332929 -.179141  .237716

(Note: at the mean values of x)
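The mfx numbers above can be reproduced by hand from the probit coefficients and the mean regressor values (the X column). A Python sketch:

```python
import math

def Phi(z):   # standard normal CDF
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def phi(z):   # standard normal density
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

# Probit coefficients and mean regressor values from the output above
b = {"_cons": -1.245253, "nwifeinc": -.0231133,
     "educ": .1666664, "kidslt6": -.6525247}
xbar = {"nwifeinc": 20.129, "educ": 12.2869, "kidslt6": .237716}

idx = b["_cons"] + sum(b[k] * xbar[k] for k in xbar)
pr = Phi(idx)                      # predicted probability at the means
me_educ = phi(idx) * b["educ"]     # marginal effect of educ at the means
print(round(pr, 4), round(me_educ, 4))   # ~ .5723 and ~ .0654, as in mfx
```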

logit regression:

. logit inlf nwifeinc educ kidslt6

Iteration 0:   log likelihood =  -514.8732
Iteration 1:   log likelihood = -466.55427
Iteration 2:   log likelihood = -465.55673
Iteration 3:   log likelihood = -465.55373

Logistic regression                               Number of obs   =        753
                                                  LR chi2(3)      =      98.64
                                                  Prob > chi2     =     0.0000
Log likelihood = -465.55373                       Pseudo R2       =     0.0958

        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0385731   .0078653    -4.90   0.000    -.0539887   -.0231574
        educ |   .2741035   .0399976     6.85   0.000     .1957097    .3524973
     kidslt6 |  -1.068074    .167187    -6.39   0.000    -1.395755    -.740394
       _cons |  -2.046709   .4552589    -4.50   0.000       -2.939   -1.154418

. mfx

Marginal effects after logit
      y  = Pr(inlf) (predict)
         =  .57219848

variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
nwifeinc |  -.0094422      .00193   -4.90   0.000  -.013222 -.005663   20.129
    educ |   .0670971      .00977    6.87   0.000   .047943  .086251  12.2869
 kidslt6 |  -.2614511      .04111   -6.36   0.000  -.342023  -.18088  .237716

(Q: any interesting finding from these results?)

the mean values of the regressors:


. sum nwifeinc educ kidslt6

    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
    nwifeinc |       753    20.12896     11.6348   -.0290575         96
        educ |       753    12.28685    2.280246           5         17
     kidslt6 |       753    .2377158     .523959           0          3

LPM result:

. reg inlf nwifeinc educ kidslt6

      Source |       SS       df       MS              Number of obs =     753
-------------+------------------------------           F(  3,   749) =   34.38
       Model |  22.3586557     3  7.45288523           Prob > F      =  0.0000
    Residual |    162.3691   749  .216781175           R-squared     =  0.1210
-------------+------------------------------           Adj R-squared =  0.1175
       Total |  184.727756   752  .245648611           Root MSE      =   .4656

        inlf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0077404    .001519    -5.10   0.000    -.0107224   -.0047583
        educ |   .0572465   .0077912     7.35   0.000     .0419513    .0725418
     kidslt6 |  -.2227047   .0325987    -6.83   0.000    -.2867004    -.158709
       _cons |   .0737593   .0931678     0.79   0.429    -.1091417    .2566604

This empirical example tells us that though the estimates of the βj's are different in the LPM, probit and logit regressions, their partial effects evaluated at the mean values of the regressors are very close.
(Q: why does this make sense?)
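The closeness of the partial effects can be checked directly in Python, using the educ coefficients and mean regressor values reported above (the probit and logit marginal effects at the means are computed by hand):

```python
import math

# means of nwifeinc, educ, kidslt6 (from sum)
m_nwifeinc, m_educ, m_kidslt6 = 20.12896, 12.28685, .2377158

# educ coefficients from the three regressions above
b_lpm, b_probit, b_logit = .0572465, .1666664, .2741035

# fitted indices at the means, probit and logit
idx_p = -1.245253 - .0231133 * m_nwifeinc + .1666664 * m_educ - .6525247 * m_kidslt6
idx_l = -2.046709 - .0385731 * m_nwifeinc + .2741035 * m_educ - 1.068074 * m_kidslt6

me_lpm = b_lpm                                                    # constant in the LPM
me_probit = math.exp(-idx_p ** 2 / 2) / math.sqrt(2 * math.pi) * b_probit
lam = math.exp(idx_l) / (1 + math.exp(idx_l))
me_logit = lam * (1 - lam) * b_logit
print(round(me_lpm, 4), round(me_probit, 4), round(me_logit, 4))  # all close
```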


Partial (marginal) effects in the LPM: β̂j

Partial effects in the probit regression: φ(x′β̂) β̂j

Partial effects in the logit regression: [exp(x′β̂)/(1 + exp(x′β̂))²] β̂j

A simple rule for comparing coefficients across these 3 models: treat the partial effects as approximately equal,

β̂j,LPM ≈ φ(x′β̂) β̂j,probit  or  β̂j,LPM ≈ [exp(x′β̂)/(1 + exp(x′β̂))²] β̂j,logit

Since φ(0) ≈ 0.4 for probit and exp(0)/(1 + exp(0))² = 0.25 for logit, we obtain:

β̂j,probit ≈ 2.5 β̂j,LPM,  β̂j,logit ≈ 4 β̂j,LPM,  and  β̂j,probit ≈ 0.625 β̂j,logit

Example above: β̂educ,LPM = 0.057, β̂educ,probit = 0.167, β̂educ,logit = 0.274
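The 2.5 / 4 / 0.625 factors come from the two densities evaluated at zero, which can be confirmed in a few lines of Python:

```python
import math

phi0 = 1 / math.sqrt(2 * math.pi)              # phi(0) ~ 0.3989, rounded to 0.4
lam0 = math.exp(0) / (1 + math.exp(0)) ** 2    # logistic density at 0 = 0.25

# rule-of-thumb conversion factors between the three models' coefficients
print(round(1 / phi0, 2), round(1 / lam0, 2), round(lam0 / phi0, 3))  # ~2.5, 4, ~0.625
```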

Calculation of the predicted probability in the probit/logit model:

Stata command: predict ypr, pr (after the probit/logit regression)

Check whether ŷ lies in the unit interval and compare the predicted probabilities from the probit and logit models.
Note:
In Stata 11, the calculation of marginal effects has 3 cases: the marginal effect at the mean, the marginal effect at a representative value, and the average marginal effect. The Stata commands are:
margins, dydx(*) atmean
margins, dydx(*) at(nwifeinc=0 educ=6 kidslt6=1)
margins, dydx(*)

3. Estimation and Testing: Probit and Logit


We need to calculate the likelihood function in probit and logit models:
P(yi = 1|xi) = G(β0 + β1 xi1 + … + βk xik),  P(yi = 0|xi) = 1 − G(·)
or, equivalently, f(yi|xi) = G(·)^yi [1 − G(·)]^(1−yi)

Then the likelihood function is

L(β0, β1, …, βk) = ∏(i=1..n) f(yi|xi) = ∏(i=1..n) G(·)^yi [1 − G(·)]^(1−yi)
or
ln L(β0, β1, …, βk) = Σ(i=1..n) { yi ln G(·) + (1 − yi) ln[1 − G(·)] }
as a function of the unknown parameters β0, β1, …, βk.

G(z) = Φ(z) for the probit model; G(z) = exp(z)/[1 + exp(z)] for the logit model.

Maximizing ln L(β0, β1, …, βk) gives the probit (or logit) estimates of the β's.
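The maximization is done numerically. Below is a minimal Newton-Raphson sketch of a logit MLE on a tiny hypothetical dataset (not MROZ), with one regressor plus an intercept:

```python
import math

# Tiny hypothetical dataset: binary y, single regressor x
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
y = [0,   0,   0,   1,   0,   1,   1,   1]

def G(z):   # logistic CDF
    return 1.0 / (1.0 + math.exp(-z))

def loglik(b0, b1):
    # ln L = sum of y*ln(G) + (1-y)*ln(1-G)
    ll = 0.0
    for xi, yi in zip(x, y):
        p = G(b0 + b1 * xi)
        ll += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return ll

# Newton-Raphson iterations: b_new = b + (information)^-1 * score
b0, b1 = 0.0, 0.0
for _ in range(25):
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = G(b0 + b1 * xi)
        g0 += yi - p               # score w.r.t. b0
        g1 += (yi - p) * xi        # score w.r.t. b1
        w = p * (1 - p)            # information weights
        h00 += w; h01 += w * xi; h11 += w * xi * xi
    det = h00 * h11 - h01 * h01    # invert the 2x2 information matrix
    b0 += (h11 * g0 - h01 * g1) / det
    b1 += (-h01 * g0 + h00 * g1) / det

print(b0, b1, loglik(b0, b1))   # converged estimates and maximized ln L
```

Each iteration increases ln L, exactly like the iteration log Stata prints before the probit/logit table.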

For probit and logit models, we can't solve for the maximum explicitly. We need to use numerical methods (iterations): e.g.,

. probit inlf nwifeinc educ kidslt6

Iteration 0:   log likelihood =  -514.8732
Iteration 1:   log likelihood = -466.34923
Iteration 2:   log likelihood =  -465.4538
Iteration 3:   log likelihood = -465.45302

Probit regression                                 Number of obs   =        753
                                                  LR chi2(3)      =      98.84
                                                  Prob > chi2     =     0.0000
Log likelihood = -465.45302                       Pseudo R2       =     0.0960

        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0231133   .0045451    -5.09   0.000    -.0320216    -.014205
        educ |   .1666664   .0235149     7.09   0.000      .120578    .2127547
     kidslt6 |  -.6525247   .0996887    -6.55   0.000     -.847911   -.4571383
       _cons |  -1.245253   .2714193    -4.59   0.000    -1.777225   -.7132807

Properties of MLE:
o consistent
o asymptotically normal
o asymptotically efficient

Hypothesis Testing in probit and logit models: same as in OLS


Example 1: H0: β1 = β2 = β3 = 0 in the logit model
Stata commands:
quietly logit inlf nwifeinc educ kidslt6
test nwifeinc educ kidslt6

. test nwifeinc educ kidslt6

 ( 1)  [inlf]nwifeinc = 0
 ( 2)  [inlf]educ = 0
 ( 3)  [inlf]kidslt6 = 0

           chi2(  3) =    78.00
         Prob > chi2 =    0.0000

H0 is rejected since the p-value is essentially 0.
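The Wald chi2 above (78.00) tests the same hypothesis as the LR chi2(3) = 98.64 printed in the logit header; the two statistics are asymptotically equivalent but differ in finite samples. The LR statistic can be recomputed in Python from the iteration log, since iteration 0 gives the restricted (intercept-only) model:

```python
ll_restricted = -514.8732      # iteration 0: intercept-only logit
ll_unrestricted = -465.55373   # converged logit log likelihood

lr = 2 * (ll_unrestricted - ll_restricted)   # LR statistic, chi2(3) under H0
print(round(lr, 2))   # matches the reported LR chi2(3) = 98.64
```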

Example 2: a linear restriction: H0: 2β2 − β3 = 0


. test 2*educ - kidslt6 = 0

 ( 1)  2*[inlf]educ - [inlf]kidslt6 = 0

           chi2(  1) =    65.05
         Prob > chi2 =    0.0000

H0 is rejected since the p-value is essentially 0.

Example 3: H0: 2β2 − β3 = 0 and β1 + β2 = 0
Stata commands:
test (2*educ - kidslt6 = 0) (educ + nwifeinc = 0)

. test (2*educ - kidslt6 = 0) (educ + nwifeinc = 0)

 ( 1)  2*[inlf]educ - [inlf]kidslt6 = 0
 ( 2)  [inlf]nwifeinc + [inlf]educ = 0

           chi2(  2) =    69.16
         Prob > chi2 =    0.0000

H0 is rejected since the p-value is essentially 0.
