
HE3021, NTU

Lecture 7

LECTURE 7: BINARY CHOICE MODELS


1 Modeling Choice Decision
2 Binary Choice Models: LPM, Probit, Logit
3 Estimation and Testing

FENG Qu

1. Modeling Choice Decision


Example of a (binary) choice decision: having another child
Other examples:
female labor force participation: work or be a housewife
marriage decision: married or single
applying for a mortgage: accepted or denied
admitted or not by NTU
go to graduate school or work: yes or no
vote or not vote
buy or not buy
(Q: Common feature?)
Binary choice: the decision variable y = 1 (yes) or y = 0 (no)

How to model people's choice decisions?


Example: having another child
Economics: benefit-(opportunity) cost analysis (Gary Becker, 1981)
y = 1 for having another child; y = 0 for not having one.

u1 denotes the benefit from having another child, e.g., the policy packages above, tax rebate, baby bonus, a bigger HDB flat, subsidized childcare, happiness, etc.;
u0 denotes the (opportunity) cost, e.g., economic costs (of pregnancy, birth, raising the child, education, healthcare), forgone leisure, forgone pay increases, etc.
Then the decision can be modeled by:
y = 1 if u1 − u0 > 0;
y = 0 otherwise

Another example: female labor force participation


Economics: the outcome of a market process
Demanders: offer a wage based on labor's expected marginal product
Suppliers: accept the offer or not, depending on whether it exceeds their own reservation wage
From the woman's side, being in the labor force or not is a trade-off between the wage and its alternatives: taking care of the kids, housekeeping, leisure, etc.
y = 1 for working; y = 0 for not being in the labor force. If u1 denotes the utility from working and u0 denotes the utility from staying at home, then the decision can be modeled by:
y = 1 if u1 − u0 > 0;
y = 0 otherwise

As in linear regression models, suppose that the difference u1 − u0 can be explained by observable characteristics and an unobserved error term:

u1 − u0 = β0 + β1 x1 + … + βk xk + e.

Thus,
P(y = 1 | x) = P(u1 − u0 > 0) = P(β0 + β1 x1 + … + βk xk + e > 0)
Suppose the random variable e follows a distribution with CDF F(·). Thus
P(y = 1 | x) = P(e > −(β0 + β1 x1 + … + βk xk))
             = 1 − F(−(β0 + β1 x1 + … + βk xk))

If the distribution of e is symmetric around 0, then

P(y = 1 | x) = F(β0 + β1 x1 + … + βk xk) and
P(y = 0 | x) = 1 − P(y = 1 | x) = 1 − F(β0 + β1 x1 + … + βk xk).
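The symmetry step above can be checked by simulation. A minimal Python sketch, with hypothetical coefficient values b0, b1 and a standard normal error (so F = Φ):

```python
import math
import random

random.seed(42)
b0, b1 = -0.5, 1.0    # hypothetical coefficients
x = 1.2               # a fixed value of the single regressor

# Simulate the latent-variable rule y = 1 if b0 + b1*x + e > 0, with e ~ N(0, 1)
n = 200_000
y = [1 if b0 + b1 * x + random.gauss(0, 1) > 0 else 0 for _ in range(n)]
p_hat = sum(y) / n

# By symmetry of the normal distribution, P(y = 1 | x) = Phi(b0 + b1*x)
p_theory = 0.5 * (1 + math.erf((b0 + b1 * x) / math.sqrt(2)))
print(p_hat, p_theory)   # the two probabilities should be very close
```

The simulated frequency of y = 1 matches Φ(β0 + β1 x), which is exactly the probit model introduced on the next slide.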

2. Binary Choice Models: LPM, Probit & Logit


What functional forms G(·) can we use?

Linear probability model (LPM): linear (identity) function

G(β0 + β1 x1 + … + βk xk) = β0 + β1 x1 + … + βk xk

Probit model: standard normal CDF Φ(·)

G(β0 + β1 x1 + … + βk xk) = Φ(β0 + β1 x1 + … + βk xk)

Logit model: logistic CDF

G(β0 + β1 x1 + … + βk xk) = exp(β0 + β1 x1 + … + βk xk) / [1 + exp(β0 + β1 x1 + … + βk xk)]
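To see how the three functional forms behave, here is an illustrative Python sketch of G(·) for each model (function names are mine):

```python
import math

def G_lpm(z):     # LPM: identity function
    return z

def G_probit(z):  # probit: standard normal CDF Phi(z)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def G_logit(z):   # logit: logistic CDF
    return math.exp(z) / (1 + math.exp(z))

for z in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(z, G_lpm(z), round(G_probit(z), 4), round(G_logit(z), 4))
```

The probit and logit columns always stay strictly inside (0, 1), while the LPM column is just z itself, which foreshadows the LPM's out-of-range-prediction problem discussed below.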

LPM: OLS with a binary dependent variable (y = 1 or 0)

yi = β0 + β1 xi1 + … + βk xik + ui,  i = 1, …, n.

Example 1: birth intention in Singapore (data set: babybonus.dta)


dep. var. (y): (yb) 1 for intention to have another child; 0 otherwise
indep. var. (x): (a scale measure of the) policy package (cpl), current number of children (num), husband's monthly income (hmi), wife's education (we), wife's age (wa)
. reg yb cpl num hmi we wa

      Source |       SS       df       MS              Number of obs =     122
-------------+------------------------------           F(  5,   116) =   12.96
       Model |  10.9213986     5  2.18427972           Prob > F      =  0.0000
    Residual |  19.5458145   116  .168498401           R-squared     =  0.3585
-------------+------------------------------           Adj R-squared =  0.3308
       Total |  30.4672131   121   .25179515           Root MSE      =  .41049

          yb |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         cpl |   .1865787    .076402     2.44   0.016     .0352548    .3379026
         num |  -.2759397   .0454467    -6.07   0.000    -.3659526   -.1859267
         hmi |  -.0005362   .0362719    -0.01   0.988    -.0723773    .0713049
          we |  -.0551932   .0474809    -1.16   0.247    -.1492351    .0388486
          wa |  -.0175984   .0085417    -2.06   0.042    -.0345164   -.0006805
       _cons |   1.521159   .2663598     5.71   0.000     .9935997    2.048718

Result:
The policy package (cpl) has a large positive effect on people's birth intention.

(Q: how to interpret the coefficient .187?)

Prediction:
Stata command: predict yhat, xb

(Q: What does the predicted value ŷ = 0.499 mean? Other predicted values: 1.121, −.016)

First, for the LPM, E[y|x] = P(y = 1|x) = β0 + β1 x1 + … + βk xk, so

ŷ = β̂0 + β̂1 x1 + … + β̂k xk = est. P(y = 1|x)

predicted value ŷ: the predicted probability of success (having another child)

0.499 is couple 1's predicted probability of having another child, given the other factors.

Second, βj is the partial effect of xj on the probability of success P(y = 1|x):

∂E[y|x]/∂xj = ∂P(y = 1|x)/∂xj = βj,  j = 1, …, k

β̂j: the estimated partial effect (the ceteris paribus interpretation)

E.g., β̂1 = .187 can be interpreted as: one additional unit of the policy package increases the probability of having another child by 18.7 percentage points, holding other factors fixed.
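Because the LPM is linear, this .187 effect is the same at every starting point. A quick Python check using the coefficients from the table above (the couple's x-values here are hypothetical):

```python
# Coefficients from the LPM regression of yb above
b = {"_cons": 1.521159, "cpl": .1865787, "num": -.2759397,
     "hmi": -.0005362, "we": -.0551932, "wa": -.0175984}

def yhat(cpl, num, hmi, we, wa):
    # fitted value = predicted probability of having another child
    return (b["_cons"] + b["cpl"] * cpl + b["num"] * num
            + b["hmi"] * hmi + b["we"] * we + b["wa"] * wa)

# A hypothetical couple, then the same couple with one more unit of cpl
base = yhat(cpl=2, num=1, hmi=5, we=12, wa=30)
plus = yhat(cpl=3, num=1, hmi=5, we=12, wa=30)
print(plus - base)   # equals the cpl coefficient, .1865787, at any x
```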

Example 2: Women's Labor Force Participation (data set: MROZ.dta)


inlf = β0 + β1 nwifeinc + β2 educ + β3 kidslt6 + u,  i = 1, …, n

o inlf: 1 for being in the labor force and 0 for being a housewife;
o nwifeinc: husband's earnings
o educ: wife's education
o kidslt6: number of children less than 6 years old
Estimation: run the multiple regression:
. reg inlf nwifeinc educ kidslt6

      Source |       SS       df       MS              Number of obs =     753
-------------+------------------------------           F(  3,   749) =   34.38
       Model |  22.3586557     3  7.45288523           Prob > F      =  0.0000
    Residual |    162.3691   749  .216781175           R-squared     =  0.1210
-------------+------------------------------           Adj R-squared =  0.1175
       Total |  184.727756   752  .245648611           Root MSE      =   .4656

        inlf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0077404    .001519    -5.10   0.000    -.0107224   -.0047583
        educ |   .0572465   .0077912     7.35   0.000     .0419513    .0725418
     kidslt6 |  -.2227047   .0325987    -6.83   0.000    -.2867004    -.158709
       _cons |   .0737593   .0931678     0.79   0.429    -.1091417    .2566604

Advantages of LPM:

easy to implement: OLS
simple to interpret results
straightforward to test hypotheses
the OLS estimator is consistent
(Q: why?)

Disadvantages of LPM:
1. heteroskedasticity:
o use heteroskedasticity-robust inference (exercise)
2. the predicted probability could be < 0 or > 1!
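Disadvantage 2 can be seen directly from the Example 2 estimates. A short Python check (the two x-vectors are hypothetical but lie inside the observed ranges of the data):

```python
# LPM coefficients from the inlf regression above
b0, b_nwifeinc, b_educ, b_kidslt6 = .0737593, -.0077404, .0572465, -.2227047

def yhat(nwifeinc, educ, kidslt6):
    # LPM "predicted probability" of being in the labor force
    return b0 + b_nwifeinc * nwifeinc + b_educ * educ + b_kidslt6 * kidslt6

p_high = yhat(nwifeinc=0, educ=17, kidslt6=0)   # high education, no young kids
p_low = yhat(nwifeinc=96, educ=5, kidslt6=3)    # high other income, 3 young kids
print(p_high, p_low)   # one "probability" above 1 and one below 0
```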

Graphic Interpretation of LPM


Example: explain mortgage application outcomes by the debt payments-to-income (P/I) ratio
LPM:

Probit and Logit Models


LPM model:
E[y|x] = P(y = 1|x) = G(β0 + β1 x1 + … + βk xk) = β0 + β1 x1 + … + βk xk

Probit model: standard normal CDF Φ(·)

P(y = 1|x) = Φ(β0 + β1 x1 + … + βk xk)

Logit model: logistic CDF

P(y = 1|x) = exp(β0 + β1 x1 + … + βk xk) / [1 + exp(β0 + β1 x1 + … + βk xk)]

For these CDFs, 0 ≤ Φ(z) ≤ 1 and 0 ≤ exp(z)/[1 + exp(z)] ≤ 1, so the estimated probability is

ŷ = Φ(β̂0 + β̂1 x1 + … + β̂k xk) for the probit model

and

ŷ = exp(β̂0 + β̂1 x1 + … + β̂k xk) / [1 + exp(β̂0 + β̂1 x1 + … + β̂k xk)] for the logit model

Probit & Logit Models

The estimated P(y = 1|x) falls in the unit interval [0, 1] for all x1, …, xk

probit regression of women's labor force participation


Stata command: probit inlf nwifeinc educ kidslt6

. probit inlf nwifeinc educ kidslt6

Iteration 0:   log likelihood =  -514.8732
Iteration 1:   log likelihood = -466.34923
Iteration 2:   log likelihood =  -465.4538
Iteration 3:   log likelihood = -465.45302

Probit regression                                 Number of obs   =        753
                                                  LR chi2(3)      =      98.84
                                                  Prob > chi2     =     0.0000
Log likelihood = -465.45302                       Pseudo R2       =     0.0960

        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0231133   .0045451    -5.09   0.000    -.0320216    -.014205
        educ |   .1666664   .0235149     7.09   0.000      .120578    .2127547
     kidslt6 |  -.6525247   .0996887    -6.55   0.000     -.847911   -.4571383
       _cons |  -1.245253   .2714193    -4.59   0.000    -1.777225   -.7132807

.1667 is the coefficient of educ from the probit regression (.057 in the LPM).

(Q: why so different?)
Compare the estimation results with those of the LPM
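The Pseudo R2 in the probit header can be recovered from the iteration log: iteration 0 is the log likelihood of the intercept-only model. A one-line Python check (this is McFadden's pseudo R-squared, which is what Stata reports):

```python
ll0 = -514.8732    # iteration 0: intercept-only log likelihood
ll = -465.45302    # converged log likelihood
pseudo_r2 = 1 - ll / ll0   # McFadden's pseudo R-squared
print(round(pseudo_r2, 4))   # matches the reported Pseudo R2 of 0.0960
```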

logit regression of women's labor force participation


Stata command: logit inlf nwifeinc educ kidslt6

. logit inlf nwifeinc educ kidslt6

Iteration 0:   log likelihood =  -514.8732
Iteration 1:   log likelihood = -466.55427
Iteration 2:   log likelihood = -465.55673
Iteration 3:   log likelihood = -465.55373

Logistic regression                               Number of obs   =        753
                                                  LR chi2(3)      =      98.64
                                                  Prob > chi2     =     0.0000
Log likelihood = -465.55373                       Pseudo R2       =     0.0958

        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0385731   .0078653    -4.90   0.000    -.0539887   -.0231574
        educ |   .2741035   .0399976     6.85   0.000     .1957097    .3524973
     kidslt6 |  -1.068074    .167187    -6.39   0.000    -1.395755    -.740394
       _cons |  -2.046709   .4552589    -4.50   0.000       -2.939   -1.154418

.274 is the coefficient of educ from the logit regression (.057 in the LPM and .1666 in the probit).
(Q: why so different?)

Partial (or marginal) effect of xj on P(y = 1|x) = G(β0 + β1 x1 + … + βk xk)


ceteris paribus effect: the effect of a one-unit change in xj on the probability of success P(y = 1|x) = G(β0 + β1 x1 + … + βk xk), holding the other factors fixed.

(i) Continuous xj:

∂P(y = 1|x)/∂xj = g(β0 + β1 x1 + … + βk xk) βj,  j = 1, …, k,

where g(z) = dG(z)/dz.

For the probit model, G(z) = Φ(z) and g(z) = φ(z)

For the logit model, G(z) = exp(z)/[1 + exp(z)] and g(z) = G′(z) = exp(z)/[1 + exp(z)]²

(For the LPM, g(z) = 1)

(ii) Discrete xj, e.g. x1 changing from 1 to 0: the partial effect is defined as

G(β0 + β1 + β2 x2 + … + βk xk) − G(β0 + β2 x2 + … + βk xk)
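The logit density formula g(z) = exp(z)/[1 + exp(z)]² can be verified numerically: it should match a finite-difference derivative of the logistic CDF. A small Python sketch:

```python
import math

def Lam(z):   # logistic CDF
    return math.exp(z) / (1 + math.exp(z))

def lam(z):   # its derivative, the logistic density
    return math.exp(z) / (1 + math.exp(z)) ** 2

z, h = 0.7, 1e-6
finite_diff = (Lam(z + h) - Lam(z - h)) / (2 * h)
print(lam(z), finite_diff)   # analytical vs. numerical derivative
```

Note that lam(z) also equals Lam(z)·(1 − Lam(z)), a convenient form for logit marginal-effect calculations.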

Remarks:
1. Different from the LPM, the partial effects in probit and logit models are not constant; they depend on the values of x. The slope parameter βj is NOT the partial effect of xj on the probability of success, implying that the interpretations of the coefficients in these 3 models are different and not comparable.

2. Since g(·) > 0 for probit and logit, the direction of the partial effect of xj depends on the sign of βj.

3. Calculation of marginal effects at the mean values of the regressors in probit and logit regressions:
Stata command: mfx

Probit regression:

. probit inlf nwifeinc educ kidslt6

Iteration 0:   log likelihood =  -514.8732
Iteration 1:   log likelihood = -466.34923
Iteration 2:   log likelihood =  -465.4538
Iteration 3:   log likelihood = -465.45302

Probit regression                                 Number of obs   =        753
                                                  LR chi2(3)      =      98.84
                                                  Prob > chi2     =     0.0000
Log likelihood = -465.45302                       Pseudo R2       =     0.0960

        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0231133   .0045451    -5.09   0.000    -.0320216    -.014205
        educ |   .1666664   .0235149     7.09   0.000      .120578    .2127547
     kidslt6 |  -.6525247   .0996887    -6.55   0.000     -.847911   -.4571383
       _cons |  -1.245253   .2714193    -4.59   0.000    -1.777225   -.7132807

following the probit regression, run Stata command: mfx

. mfx

Marginal effects after probit
      y  = Pr(inlf) (predict)
         =  .57228348

variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
nwifeinc |  -.0090691      .00178   -5.08   0.000  -.012566 -.005572   20.129
    educ |   .0653958      .00921    7.10   0.000   .047335  .083457  12.2869
 kidslt6 |  -.2560349      .03923   -6.53   0.000  -.332929 -.179141  .237716

(Note: at the mean values of x)
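The mfx numbers above can be reproduced by hand from the probit coefficients and the mean regressor values (the X column). A Python sketch:

```python
import math

def Phi(z):   # standard normal CDF
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def phi(z):   # standard normal density
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

# Probit coefficients and mean regressor values from the output above
b = {"_cons": -1.245253, "nwifeinc": -.0231133,
     "educ": .1666664, "kidslt6": -.6525247}
xbar = {"nwifeinc": 20.129, "educ": 12.2869, "kidslt6": .237716}

idx = b["_cons"] + sum(b[k] * xbar[k] for k in xbar)
pr = Phi(idx)                      # predicted probability at the means
me_educ = phi(idx) * b["educ"]     # marginal effect of educ at the means
print(round(pr, 4), round(me_educ, 4))   # ~ .5723 and ~ .0654, as in mfx
```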

logit regression:

. logit inlf nwifeinc educ kidslt6

Iteration 0:   log likelihood =  -514.8732
Iteration 1:   log likelihood = -466.55427
Iteration 2:   log likelihood = -465.55673
Iteration 3:   log likelihood = -465.55373

Logistic regression                               Number of obs   =        753
                                                  LR chi2(3)      =      98.64
                                                  Prob > chi2     =     0.0000
Log likelihood = -465.55373                       Pseudo R2       =     0.0958

        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0385731   .0078653    -4.90   0.000    -.0539887   -.0231574
        educ |   .2741035   .0399976     6.85   0.000     .1957097    .3524973
     kidslt6 |  -1.068074    .167187    -6.39   0.000    -1.395755    -.740394
       _cons |  -2.046709   .4552589    -4.50   0.000       -2.939   -1.154418

. mfx

Marginal effects after logit
      y  = Pr(inlf) (predict)
         =  .57219848

variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
nwifeinc |  -.0094422      .00193   -4.90   0.000  -.013222 -.005663   20.129
    educ |   .0670971      .00977    6.87   0.000   .047943  .086251  12.2869
 kidslt6 |  -.2614511      .04111   -6.36   0.000  -.342023  -.18088  .237716

(Q: any interesting finding from these results?)

the mean values of the regressors:


. sum nwifeinc educ kidslt6

    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
    nwifeinc |       753    20.12896     11.6348   -.0290575         96
        educ |       753    12.28685    2.280246           5         17
     kidslt6 |       753    .2377158     .523959           0          3

LPM result:

. reg inlf nwifeinc educ kidslt6

      Source |       SS       df       MS              Number of obs =     753
-------------+------------------------------           F(  3,   749) =   34.38
       Model |  22.3586557     3  7.45288523           Prob > F      =  0.0000
    Residual |    162.3691   749  .216781175           R-squared     =  0.1210
-------------+------------------------------           Adj R-squared =  0.1175
       Total |  184.727756   752  .245648611           Root MSE      =   .4656

        inlf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0077404    .001519    -5.10   0.000    -.0107224   -.0047583
        educ |   .0572465   .0077912     7.35   0.000     .0419513    .0725418
     kidslt6 |  -.2227047   .0325987    -6.83   0.000    -.2867004    -.158709
       _cons |   .0737593   .0931678     0.79   0.429    -.1091417    .2566604

This empirical example tells us that though the estimates of the βj's are different in the LPM, probit and logit regressions, their partial effects evaluated at the mean values of the regressors are very close.
(Q: why does this make sense?)
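The closeness of the partial effects can be checked directly in Python, using the educ coefficients and mean regressor values reported above (the probit and logit marginal effects at the means are computed by hand):

```python
import math

# means of nwifeinc, educ, kidslt6 (from sum)
m_nwifeinc, m_educ, m_kidslt6 = 20.12896, 12.28685, .2377158

# educ coefficients from the three regressions above
b_lpm, b_probit, b_logit = .0572465, .1666664, .2741035

# fitted indices at the means, probit and logit
idx_p = -1.245253 - .0231133 * m_nwifeinc + .1666664 * m_educ - .6525247 * m_kidslt6
idx_l = -2.046709 - .0385731 * m_nwifeinc + .2741035 * m_educ - 1.068074 * m_kidslt6

me_lpm = b_lpm                                                    # constant in the LPM
me_probit = math.exp(-idx_p ** 2 / 2) / math.sqrt(2 * math.pi) * b_probit
lam = math.exp(idx_l) / (1 + math.exp(idx_l))
me_logit = lam * (1 - lam) * b_logit
print(round(me_lpm, 4), round(me_probit, 4), round(me_logit, 4))  # all close
```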


Partial (marginal) effects in the LPM: β̂j

Partial effects in the probit regression: φ(x′β̂) β̂j

Partial effects in the logit regression: [exp(x′β̂)/(1 + exp(x′β̂))²] β̂j

A simple rule for comparing coefficients across these 3 models: treat the partial effects as approximately equal,

β̂j,LPM ≈ φ(x′β̂) β̂j,probit  or  β̂j,LPM ≈ [exp(x′β̂)/(1 + exp(x′β̂))²] β̂j,logit

Since φ(0) ≈ 0.4 for probit and exp(0)/(1 + exp(0))² = 0.25 for logit, we obtain:

β̂j,probit ≈ 2.5 β̂j,LPM,  β̂j,logit ≈ 4 β̂j,LPM,  and  β̂j,probit ≈ 0.625 β̂j,logit

Example above: β̂educ,LPM = 0.057, β̂educ,probit = 0.167, β̂educ,logit = 0.274
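The 2.5 / 4 / 0.625 factors come from the two densities evaluated at zero, which can be confirmed in a few lines of Python:

```python
import math

phi0 = 1 / math.sqrt(2 * math.pi)              # phi(0) ~ 0.3989, rounded to 0.4
lam0 = math.exp(0) / (1 + math.exp(0)) ** 2    # logistic density at 0 = 0.25

# rule-of-thumb conversion factors between the three models' coefficients
print(round(1 / phi0, 2), round(1 / lam0, 2), round(lam0 / phi0, 3))  # ~2.5, 4, ~0.625
```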

Calculation of the predicted probability in the probit/logit model:

Stata command: predict ypr, pr (after the probit/logit regression)

Check whether ŷ lies in the unit interval and compare the predicted probabilities from the probit and logit models.
Note:
In Stata 11, the calculation of marginal effects has 3 cases: the marginal effect at the mean, the marginal effect at a representative value, and the average marginal effect. The Stata commands are:
margins, dydx(*) atmean
margins, dydx(*) at(nwifeinc=0 educ=6 kidslt6=1)
margins, dydx(*)

3. Estimation and Testing: Probit and Logit


We need to calculate the likelihood function in probit and logit models:
P(yi = 1|xi) = G(β0 + β1 xi1 + … + βk xik),  P(yi = 0|xi) = 1 − G(·)
or, equivalently, f(yi|xi) = G(·)^yi [1 − G(·)]^(1−yi)

Then the likelihood function is

L(β0, β1, …, βk) = ∏(i=1..n) f(yi|xi) = ∏(i=1..n) G(·)^yi [1 − G(·)]^(1−yi)
or
ln L(β0, β1, …, βk) = Σ(i=1..n) { yi ln G(·) + (1 − yi) ln[1 − G(·)] }
as a function of the unknown parameters β0, β1, …, βk.

G(z) = Φ(z) for the probit model; G(z) = exp(z)/[1 + exp(z)] for the logit model.

Maximizing ln L(β0, β1, …, βk) gives the probit (or logit) estimates of the β's.
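The maximization is done numerically. Below is a minimal Newton-Raphson sketch of a logit MLE on a tiny hypothetical dataset (not MROZ), with one regressor plus an intercept:

```python
import math

# Tiny hypothetical dataset: binary y, single regressor x
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
y = [0,   0,   0,   1,   0,   1,   1,   1]

def G(z):   # logistic CDF
    return 1.0 / (1.0 + math.exp(-z))

def loglik(b0, b1):
    # ln L = sum of y*ln(G) + (1-y)*ln(1-G)
    ll = 0.0
    for xi, yi in zip(x, y):
        p = G(b0 + b1 * xi)
        ll += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return ll

# Newton-Raphson iterations: b_new = b + (information)^-1 * score
b0, b1 = 0.0, 0.0
for _ in range(25):
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = G(b0 + b1 * xi)
        g0 += yi - p               # score w.r.t. b0
        g1 += (yi - p) * xi        # score w.r.t. b1
        w = p * (1 - p)            # information weights
        h00 += w; h01 += w * xi; h11 += w * xi * xi
    det = h00 * h11 - h01 * h01    # invert the 2x2 information matrix
    b0 += (h11 * g0 - h01 * g1) / det
    b1 += (-h01 * g0 + h00 * g1) / det

print(b0, b1, loglik(b0, b1))   # converged estimates and maximized ln L
```

Each iteration increases ln L, exactly like the iteration log Stata prints before the probit/logit table.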

For probit and logit models, we can't solve for the maximum explicitly. We need to use numerical methods (iterations): e.g.,

. probit inlf nwifeinc educ kidslt6

Iteration 0:   log likelihood =  -514.8732
Iteration 1:   log likelihood = -466.34923
Iteration 2:   log likelihood =  -465.4538
Iteration 3:   log likelihood = -465.45302

Probit regression                                 Number of obs   =        753
                                                  LR chi2(3)      =      98.84
                                                  Prob > chi2     =     0.0000
Log likelihood = -465.45302                       Pseudo R2       =     0.0960

        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0231133   .0045451    -5.09   0.000    -.0320216    -.014205
        educ |   .1666664   .0235149     7.09   0.000      .120578    .2127547
     kidslt6 |  -.6525247   .0996887    -6.55   0.000     -.847911   -.4571383
       _cons |  -1.245253   .2714193    -4.59   0.000    -1.777225   -.7132807

Properties of MLE:
o consistent
o asymptotically normal
o asymptotically efficient

Hypothesis Testing in probit and logit models: same as in OLS


Example 1: H0: β1 = β2 = β3 = 0 in the logit model
Stata commands:
quietly logit inlf nwifeinc educ kidslt6
test nwifeinc educ kidslt6

. test nwifeinc educ kidslt6

 ( 1)  [inlf]nwifeinc = 0
 ( 2)  [inlf]educ = 0
 ( 3)  [inlf]kidslt6 = 0

           chi2(  3) =    78.00
         Prob > chi2 =    0.0000

H0 is rejected since the p-value is essentially 0.
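The Wald chi2 above (78.00) tests the same hypothesis as the LR chi2(3) = 98.64 printed in the logit header; the two statistics are asymptotically equivalent but differ in finite samples. The LR statistic can be recomputed in Python from the iteration log, since iteration 0 gives the restricted (intercept-only) model:

```python
ll_restricted = -514.8732      # iteration 0: intercept-only logit
ll_unrestricted = -465.55373   # converged logit log likelihood

lr = 2 * (ll_unrestricted - ll_restricted)   # LR statistic, chi2(3) under H0
print(round(lr, 2))   # matches the reported LR chi2(3) = 98.64
```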

Example 2: a linear restriction: H0: 2β2 − β3 = 0


. test 2*educ - kidslt6 = 0

 ( 1)  2*[inlf]educ - [inlf]kidslt6 = 0

           chi2(  1) =    65.05
         Prob > chi2 =    0.0000

H0 is rejected since the p-value is essentially 0.

Example 3: H0: 2β2 − β3 = 0 and β1 + β2 = 0
Stata commands:
test (2*educ - kidslt6 = 0) (educ + nwifeinc = 0)

. test (2*educ - kidslt6 = 0) (educ + nwifeinc = 0)

 ( 1)  2*[inlf]educ - [inlf]kidslt6 = 0
 ( 2)  [inlf]nwifeinc + [inlf]educ = 0

           chi2(  2) =    69.16
         Prob > chi2 =    0.0000

H0 is rejected since the p-value is essentially 0.
