
Have you ever thought about this?

What factors improve the chances of making a merger successful?

Have you ever thought about this?

Can we increase the probability of making a company more financially sound by increasing its ratio of Cash from Operating Activities to Sales?

Have you ever thought about this?

What factors make a person invest, or not invest, in the Stock Market?

Have you ever thought about this?

What kinds of attributes contribute towards making an account a delinquent account?

Have you ever thought about this?

Is it possible to identify the underlying characteristics of companies that are responsible for putting them either at the top or at the bottom?

If you are curious enough to look for answers to these questions, then you have to equip yourself with techniques that are capable of handling a

Dummy/Qualitative/Binary/Limited Dependent Variable

Yi is a dummy variable.

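We are looking for models in which the dependent variable Yi is a dummy variable:

$Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \varepsilon_i$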

And this takes us to

the Linear Probability Model, the Probit Model, and the Logit Model

Topic for the day is

MODELING WITH
QUALITATIVE AND LIMITED
DEPENDENT VARIABLES

Let's consider a problem

A researcher, Mr. Surya Kant, wants to know which factors determine the bankruptcy of a firm. He took a sample of 33 solvent firms and 33 bankrupt firms and collected information on the following financial parameters:
1. Working Capital to Total Assets (WC_ASSET)
2. Retained Earnings to Total Assets (RE_ASSET)
3. Return on Assets (ROA)
4. Assets Turnover Ratio (ASSET_TURN)
5. Market Value to Book Value (MV_BV)

Mr. Surya Kant has specified the following model

GROUP is a binary variable: GROUP = 0 if the company is bankrupt and GROUP = 1 if the company is solvent.

$\mathrm{GROUP} = \alpha + \beta_1\,\mathrm{WC\_ASSET} + \beta_2\,\mathrm{RE\_ASSET} + \beta_3\,\mathrm{ROA} + \beta_4\,\mathrm{ASSET\_TURN} + \beta_5\,\mathrm{MV\_BV} + \varepsilon$

What does this equation mean? If the estimated value of the dependent variable GROUP is 0.75, how should we interpret it?

To appreciate the issues involved in a regression model with a binary dependent variable, let's consider a model with one explanatory variable

We take a general model as follows:

$Y_i = \alpha + \beta_1 X_i + \varepsilon_i$

Assume that OLS is used to estimate the above model. How should we then interpret the estimated value of Yi? Taking the expectation of both sides, with E(εi | Xi) = 0,

$E(Y_i \mid X_i) = \alpha + \beta_1 X_i$

E(Yi | Xi) is the expected value of Yi given Xi.

Alternatively, the expectation of Yi given Xi can also be calculated as follows. Let pi be the probability that Yi takes the value 1 given a particular Xi, and (1 - pi) the probability that Yi takes the value 0 given that Xi. Then the expected value of Yi given Xi is

$E(Y_i \mid X_i) = 1 \cdot p_i + 0 \cdot (1 - p_i) = p_i$

Thus, we can say that

$p_i = \alpha + \beta_1 X_i$


Therefore, such a
model is called the
LINEAR
PROBABILITY
MODEL.

It is nothing but the estimated value of the following model:

$Y_i = \alpha + \beta_1 X_i + \varepsilon_i$

Thus, the estimated value is interpreted as the probability that Yi equals one. The coefficient of Xi is interpreted as the marginal increase in the probability that Y takes the value one for a one-unit increase in X.

We may graph the estimated model as follows:

[Figure: the fitted line pi = P(Yi = 1) = α + β1 Xi plotted against Xi, with y and pi on the vertical axis. For a point A observed at Yi = 1, the vertical distance to the line is 1 - α - β1 Xi; for a point B observed at Yi = 0, it is α + β1 Xi.]

An Example
Assume that the bankruptcy model is as follows:

$\mathrm{GROUP}_i = \alpha + \beta_1\,\mathrm{ROA}_i + \varepsilon_i$

The model assumes that bankruptcy depends upon the Return on Assets: the higher the Return on Assets a company generates, the higher the probability that the company will be solvent. Using OLS, we obtain the following results:

An Example (continued)

How do we interpret the results?

If a firm has ROA = 20%, then its probability of being solvent is 55.25178%.

Now, let's take the Complete Example
The model as hypothesized by Mr. Surya Kant:

$\mathrm{GROUP} = \alpha + \beta_1\,\mathrm{WC\_ASSET} + \beta_2\,\mathrm{RE\_ASSET} + \beta_3\,\mathrm{ROA} + \beta_4\,\mathrm{ASSET\_TURN} + \beta_5\,\mathrm{MV\_BV} + \varepsilon$

The model assumes that bankruptcy depends upon the financial ratios: Working Capital to Total Assets, Retained Earnings to Total Assets, Return on Assets, Assets Turnover, and Market Value to Book Value. Using OLS, we obtain the following results:

The Complete Example (continued)

How do we interpret the results?

Does this model create problems for us?

The Linear Probability Model
Problem No. 1
Under the Linear Probability Model, the predicted probability changes linearly with X, so for some values of X the predicted probability can be negative (< 0) or greater than one (> 1).
In that case, the result cannot be interpreted as a probability.
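A minimal sketch of Problem No. 1. The intercept and slope below are hypothetical values chosen only for illustration; they are not estimates from the bankruptcy data.

```python
# Hypothetical linear probability model: p = a + b * x.
# a and b are illustrative assumptions, not estimates from the slides.
a, b = 0.20, 0.05

def lpm_probability(x):
    """Fitted 'probability' from a linear probability model."""
    return a + b * x

x_values = [-10, 0, 10, 20]
fitted = [lpm_probability(x) for x in x_values]

for x, p in zip(x_values, fitted):
    # Flag fitted values that cannot be read as probabilities.
    flag = "" if 0.0 <= p <= 1.0 else "  <-- outside [0, 1]"
    print(f"x = {x:>4}: p = {p:+.2f}{flag}")
```

With a straight line, any sufficiently small or large x pushes the fitted value below 0 or above 1.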

The Linear Probability Model
Problem No. 1 (Example)

The Linear Probability Model
Problem No. 2
The error term is not normally distributed. For a given Xi it can take only two values, so it follows a binomial (Bernoulli) probability distribution.

The Linear Probability Model
Problem No. 3
The error terms are not homoscedastic; they suffer from heteroscedasticity.
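Problems No. 2 and No. 3 can be seen from a short derivation, sketched here for the one-variable model Yi = α + β1 Xi + εi with pi = α + β1 Xi. Because Yi is either 1 or 0, the error term can take only two values:

```latex
\varepsilon_i =
\begin{cases}
1 - \alpha - \beta_1 X_i = 1 - p_i & \text{with probability } p_i \quad (Y_i = 1)\\
-\alpha - \beta_1 X_i = -p_i & \text{with probability } 1 - p_i \quad (Y_i = 0)
\end{cases}

E(\varepsilon_i \mid X_i) = (1 - p_i)\,p_i + (-p_i)(1 - p_i) = 0

\operatorname{Var}(\varepsilon_i \mid X_i) = (1 - p_i)^2\,p_i + p_i^2\,(1 - p_i) = p_i\,(1 - p_i)
```

The variance depends on Xi through pi, so it is not constant across observations.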

The Linear Probability Model
Problem No. 4
In the Linear Probability Model, as X increases the probability continues to increase or decrease at a constant rate. Since probability values should lie between 0 and 1, that is 0 < p < 1, a constant rate of increase or decrease cannot hold over the whole range of X.
We need a non-linear relation to restrict the probability values between 0 and 1.

How to ensure that p stays within the interval [0,1]?
To limit the value of p within [0,1], an S-shaped relationship between x and p is usually used.
In this regard, two widely used approaches exist: one makes use of the Logistic Function, and the other uses the Standard Normal Probability Distribution.

By using the Logistic Function, we obtain a model which is called the LOGIT MODEL; the one obtained by using the Standard Normal Probability Distribution is called the PROBIT MODEL.

PROBIT AND LOGIT MODELS

NORMAL PROBABILITY DISTRIBUTION -> PROBIT MODEL

LOGISTIC FUNCTION -> LOGIT MODEL

Let's start with the PROBIT Model

Unlike the Linear Probability Model, the PROBIT Model assumes that the underlying distribution is the cumulative standard normal probability distribution, whose shape is as follows:

[Figure: the cumulative standard normal distribution curve.]

Note that it is non-linear and its value lies between 0 and 1.

PROBIT Model
The PROBIT Model expresses the probability p that a dependent variable Y takes the value 1 for a given Xi:

$p = P(Z \le \alpha + \beta X) = F(\alpha + \beta X)$

Here the model is considered to have only one explanatory variable, Z is a standard normal variable, and the function F is the cumulative standard normal probability distribution.
Please note the density function of the standard normal probability distribution:

$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-0.5 z^2}$

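PROBIT Model

$p = P(Z \le \alpha + \beta X) = F(\alpha + \beta X)$

The quantity α + βX, that is, the value of Z whose cumulative probability is p, is called the Normal Equivalent Deviate (n.e.d.) or Normit.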

Estimation of PROBIT Model

The PROBIT Model is not estimated by OLS. It is estimated by the Maximum Likelihood Method, and the estimates we obtain are therefore known as Maximum Likelihood Estimates.

Revisiting our Bankruptcy Example
Assume that the bankruptcy model is as follows:

$\mathrm{GROUP}_i = \alpha + \beta_1\,\mathrm{ROA}_i + \varepsilon_i$

The model assumes that bankruptcy depends upon the Return on Assets: the higher the Return on Assets a company generates, the higher the probability that the company will be solvent.

The Result obtained from EViews

Z = -0.028116 + 0.110879 ROA
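To turn the estimated Z into a probability, we pass it through the cumulative standard normal distribution F. The sketch below uses the two coefficients reported above; the ROA values and the assumption that ROA is measured in percentage points (20 means 20%) are illustrative, since the slides do not state the units.

```python
import math

def standard_normal_cdf(z):
    """Cumulative standard normal distribution, F(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Estimated equation reported on the slide: Z = -0.028116 + 0.110879 * ROA
INTERCEPT, SLOPE = -0.028116, 0.110879

def probit_probability(roa):
    """P(GROUP = 1 | ROA), i.e. the probability of being solvent."""
    z = INTERCEPT + SLOPE * roa
    return standard_normal_cdf(z)

# Assumption: ROA is in percentage points; these inputs are hypothetical.
for roa in (-10, 0, 10, 20):
    print(f"ROA = {roa:>4}: P(solvent) = {probit_probability(roa):.4f}")
```

Whatever the value of ROA, the result stays strictly between 0 and 1, which is what the Linear Probability Model could not guarantee.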

Now, let's take the Complete Example
The model as hypothesized by Mr. Surya Kant:

$\mathrm{GROUP} = \alpha + \beta_1\,\mathrm{WC\_ASSET} + \beta_2\,\mathrm{RE\_ASSET} + \beta_3\,\mathrm{ROA} + \beta_4\,\mathrm{ASSET\_TURN} + \beta_5\,\mathrm{MV\_BV} + \varepsilon$

The model assumes that bankruptcy depends upon the financial ratios: Working Capital to Total Assets, Retained Earnings to Total Assets, Return on Assets, Assets Turnover, and Market Value to Book Value. Using the PROBIT Model, we obtain the following results:

The Result obtained from EViews

Z = -1.100487 + 0.117069 ROA + 0.013031 MV_BV

Now, let's start with the LOGIT Model

The LOGIT Model assumes that the underlying distribution is a logistic function, or Sigmoid Function, whose shape is as follows:

[Figure: the sigmoid (logistic) curve.]

Note that it is non-linear and its value lies between 0 and 1.

LOGIT Model
The LOGIT Model expresses the probability p that a dependent variable Y takes the value 1 given Xi.
For the LOGIT Model, a particular type of logistic function is used, called the SIGMOID FUNCTION:

$f(x) = \frac{1}{1 + e^{-x}}$

Using this functional form, and assuming that there is only one explanatory variable, we may express the probability p that Y takes the value 1 given Xi as

$p_i = \frac{1}{1 + e^{-Z}}, \quad \text{where } Z = \alpha + \beta X_i$

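LOGIT Model (continued)
If pi is given by

$p_i = \frac{1}{1 + e^{-Z}}$

then it can be shown that

$\frac{p_i}{1 - p_i} = e^{Z} \quad\Rightarrow\quad \ln\left(\frac{p_i}{1 - p_i}\right) = Z$

This term is called the LOGIT, and the model is known as the LOGIT Model.

Finally, the LOGIT Model is

$\ln\left(\frac{p_i}{1 - p_i}\right) = Z = \alpha + \beta X_i$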

Estimation of LOGIT Model

The LOGIT Model is not estimated by OLS. It is estimated by the Maximum Likelihood Method, and the estimates we obtain are therefore known as Maximum Likelihood Estimates.

Revisiting our Bankruptcy Example
Assume that the bankruptcy model is as follows:

$\mathrm{GROUP}_i = \alpha + \beta_1\,\mathrm{ROA}_i + \varepsilon_i$

The model assumes that bankruptcy depends upon the Return on Assets: the higher the Return on Assets a company generates, the higher the probability that the company will be solvent.

The Result obtained from EViews

Z = -0.180031 + 0.200139 ROA
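For the LOGIT Model, the estimated Z is converted into a probability with the sigmoid function, and taking the log of the odds recovers Z. The sketch below uses the two coefficients reported above; the ROA values are hypothetical inputs, and treating ROA as percentage points is an assumption.

```python
import math

def logistic(z):
    """Sigmoid function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Estimated equation reported on the slide: Z = -0.180031 + 0.200139 * ROA
INTERCEPT, SLOPE = -0.180031, 0.200139

def logit_probability(roa):
    """P(GROUP = 1 | ROA) under the estimated logit model."""
    return logistic(INTERCEPT + SLOPE * roa)

def log_odds(p):
    """The logit: ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# Hypothetical ROA values, assumed to be in percentage points.
for roa in (-10, 0, 5, 10):
    p = logit_probability(roa)
    print(f"ROA = {roa:>4}: p = {p:.4f}, ln(p/(1-p)) = {log_odds(p):+.4f}")
```

The last column equals -0.180031 + 0.200139 × ROA, which shows that the model is linear in the log-odds even though it is non-linear in p.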

Now, let's take the Complete Example
The model as hypothesized by Mr. Surya Kant:

$\mathrm{GROUP} = \alpha + \beta_1\,\mathrm{ROA} + \beta_2\,\mathrm{MV\_BV} + \varepsilon$

The model assumes that bankruptcy depends upon the financial ratios Return on Assets and Market Value to Book Value. Using the LOGIT Model, we obtain the following results:

The Result obtained from EViews

Z = -2.042474 + 0.218424 ROA + 0.023347 MV_BV

Another Example

A survey is conducted among users and non-users of microwave ovens. The following information is collected about them:
1. Users/Non-Users (users = 1 and non-users = 0)
2. Monthly Income (Rs. in 000)
3. Monthly Consumption of Food Items (in Kgs.)

Data Collected is

Think???
What should be the
LOGIT MODEL?

The Result of the Estimated LOGIT Model


Now, let's think about this:

What should be the PROBIT MODEL?

The Result of the Estimated PROBIT Model

Dependent Variable: USER_MICRO_WAVE
Method: ML - Binary Probit
Date: 06/16/10   Time: 16:36
Sample: 1 24
Included observations: 24
Convergence achieved after 6 iterations
Covariance matrix computed using second derivatives

Variable            Coefficient   Std. Error   z-Statistic   Prob.
C                    -10.81293     4.550141     -2.376395    0.0175
INCOME                 0.193225    0.090056      2.14562     0.0319
CONSUMPTION_FOOD       0.113303    0.052992      2.138121    0.0325

Mean dependent var       0.5          S.D. dependent var       0.510754
S.E. of regression       0.349943     Akaike info criterion    0.879612
Sum squared resid        2.571666     Schwarz criterion        1.026868
Log likelihood          -7.55534      Hannan-Quinn criter.     0.918679
Restr. log likelihood  -16.63553      Avg. log likelihood     -0.314806
LR statistic (2 df)     18.16038      McFadden R-squared       0.545831
Probability(LR stat)     0.000114
Obs with Dep=0          12            Total obs               24
Obs with Dep=1          12


Using LOGIT and PROBIT models, can we predict who is going to be the OWNER of a microwave?
Yes. For that, one has to choose a threshold for the probability.
Normally, the threshold is taken as p = 50%. If the predicted probability is more than 50%, the person is predicted to be an owner; otherwise not.
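A minimal sketch of the threshold rule, using the probit coefficients from the EViews output for USER_MICRO_WAVE shown earlier. The two households are hypothetical inputs made up for illustration; they are not observations from the survey.

```python
import math

def standard_normal_cdf(z):
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Coefficients reported in the probit output (constant, INCOME, CONSUMPTION_FOOD).
CONST, B_INCOME, B_FOOD = -10.81293, 0.193225, 0.113303
THRESHOLD = 0.5  # the usual 50% cut-off

def predict_owner(income, food):
    """Return (probability, predicted owner?) for one household."""
    z = CONST + B_INCOME * income + B_FOOD * food
    p = standard_normal_cdf(z)
    return p, p > THRESHOLD

# Hypothetical households: (monthly income in Rs. thousand, food in kg).
for income, food in [(30, 50), (20, 40)]:
    p, owner = predict_owner(income, food)
    print(f"income={income}, food={food}: p={p:.4f}, owner={owner}")
```

A 50% threshold treats both kinds of misclassification as equally costly; a different cut-off can be used when that is not the case.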

Can we use R² or Adjusted R² for determining Goodness of Fit?

Answer is NO.

Then, what should we use for Goodness of Fit?

Option #1: Use Log-likelihood

The Log-likelihood statistic can be used to gauge the Goodness of Fit of Logit/Probit Models. It is defined as

$\text{log-likelihood} = \sum_{i=1}^{N} \left[\, Y_i \ln(\hat{Y}_i) + (1 - Y_i) \ln(1 - \hat{Y}_i) \,\right]$

Some software reports this as -2 × Log-likelihood, or -2LL, to get a positive number. The lower the -2LL, the better the fit.
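The definition above can be sketched in a few lines. As an illustration, the code evaluates it for an intercept-only model of the microwave example, where 12 of the 24 respondents are users, so every fitted probability is taken to be 0.5. That choice of fitted values is an assumption made here to keep the example self-contained.

```python
import math

def log_likelihood(y, p_hat):
    """Sum of Y*ln(p) + (1 - Y)*ln(1 - p) over all observations."""
    return sum(
        yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
        for yi, pi in zip(y, p_hat)
    )

# Intercept-only case: 12 users (1) and 12 non-users (0), fitted p = 0.5 for all.
y = [1] * 12 + [0] * 12
p_hat = [0.5] * 24

ll = log_likelihood(y, p_hat)
minus_2ll = -2.0 * ll
print(f"log-likelihood = {ll:.5f}, -2LL = {minus_2ll:.5f}")
```

The result agrees with the restricted log likelihood of -16.63553 reported in the probit output for this example.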

Option #2: Use Log-likelihood Ratio

The -2 Log-likelihood Ratio (-2LLR) can also be used to gauge the Goodness of Fit of Logit/Probit Models. It is defined as

$-2LLR = -2 \ln\left(\frac{L_R}{L_U}\right) = -2\,(LL_R - LL_U)$

where L_R and LL_R are the likelihood and the Log-likelihood from the restricted model (with only an intercept), and L_U and LL_U are those from the unrestricted model (the one with all explanatory variables).
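As a worked check, the sketch below applies the ratio to the two log-likelihoods reported in the microwave probit output (restricted -16.63553, unrestricted -7.55534). The p-value uses the fact that, for 2 degrees of freedom, the chi-square upper-tail probability is exp(-x/2); comparing the statistic with a chi-square distribution is the standard likelihood-ratio test and is an addition to what the slide states.

```python
import math

# Log-likelihoods reported in the probit output for USER_MICRO_WAVE.
LL_RESTRICTED = -16.63553    # intercept only
LL_UNRESTRICTED = -7.55534   # INCOME and CONSUMPTION_FOOD included

# -2 ln(L_R / L_U) = -2 (LL_R - LL_U)
lr_statistic = -2.0 * (LL_RESTRICTED - LL_UNRESTRICTED)

# Two restrictions (two slope coefficients set to zero) -> 2 degrees of freedom.
# For 2 df the chi-square survival function is exactly exp(-x / 2).
p_value = math.exp(-lr_statistic / 2.0)

print(f"LR statistic = {lr_statistic:.5f}, p-value = {p_value:.6f}")
```

Both numbers agree with the "LR statistic (2 df)" and "Probability(LR stat)" lines of the output.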

If you cannot use R-Square, then USE Pseudo R-Square.

There are several measures which are intended to mimic the R-Squared, but none of them is an R-Squared. Hence, such measures are called Pseudo R-Square.

Option #3: Use Pseudo R-Square

McFadden's R-Square is one of the Pseudo R-Square measures. It is defined as

$\text{McFadden's } R^2 = 1 - \frac{LL_U}{LL_R}$

Values between 0.2 and 0.4 are considered highly satisfactory.
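A short check of McFadden's measure, using the restricted and unrestricted log-likelihoods reported in the microwave probit output.

```python
# Log-likelihoods reported in the probit output for USER_MICRO_WAVE.
LL_RESTRICTED = -16.63553    # LLR: intercept-only model
LL_UNRESTRICTED = -7.55534   # LLU: model with all explanatory variables

def mcfadden_r2(ll_unrestricted, ll_restricted):
    """McFadden's pseudo R-square: 1 - LLU / LLR."""
    return 1.0 - ll_unrestricted / ll_restricted

r2 = mcfadden_r2(LL_UNRESTRICTED, LL_RESTRICTED)
print(f"McFadden R-squared = {r2:.6f}")
```

The value matches the "McFadden R-squared" line of the output, 0.545831.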

Option #4: Use Pseudo R-Square

Another measure of Pseudo R-Square is the Cox and Snell measure. It is also based on the log-likelihood, but it takes the sample size n into account.
It is defined as

$R^2_{CS} = 1 - \exp\left[-\frac{2}{n}\,(LL_U - LL_R)\right]$

Option #5: Use Pseudo R-Square

Another measure of Pseudo R-Square is the Nagelkerke measure. It adjusts the Cox and Snell measure by its maximum value so that a value of 1 can be achieved.
It is defined as

$R^2_N = \frac{R^2_{CS}}{R^2_{MAX}}, \quad \text{where } R^2_{MAX} = 1 - \exp\left[\frac{2}{n}\,LL_R\right]$
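The Cox and Snell and Nagelkerke measures can be computed together. The sketch below uses the log-likelihoods and sample size (n = 24) from the microwave probit output. EViews does not report these two measures in that output, so the results are illustrative calculations rather than figures from the slides.

```python
import math

# Values reported in the probit output for USER_MICRO_WAVE.
LL_RESTRICTED = -16.63553    # LLR
LL_UNRESTRICTED = -7.55534   # LLU
N_OBS = 24

def cox_snell_r2(ll_u, ll_r, n):
    """1 - exp[-(2/n) (LLU - LLR)]."""
    return 1.0 - math.exp(-(2.0 / n) * (ll_u - ll_r))

def nagelkerke_r2(ll_u, ll_r, n):
    """Cox and Snell divided by its maximum, 1 - exp[(2/n) LLR]."""
    r2_max = 1.0 - math.exp((2.0 / n) * ll_r)
    return cox_snell_r2(ll_u, ll_r, n) / r2_max

r2_cs = cox_snell_r2(LL_UNRESTRICTED, LL_RESTRICTED, N_OBS)
r2_max = 1.0 - math.exp((2.0 / N_OBS) * LL_RESTRICTED)
r2_n = nagelkerke_r2(LL_UNRESTRICTED, LL_RESTRICTED, N_OBS)
print(f"Cox-Snell = {r2_cs:.4f}, max = {r2_max:.4f}, Nagelkerke = {r2_n:.4f}")
```

Because the maximum of the Cox and Snell measure is below 1, the Nagelkerke value is always the larger of the two.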

Let's take one more EXAMPLE

A company sells R. O. Systems for the home under the brand Puredews. Its marketing head was interested in predicting the chance that a person will use its products. For this purpose, a sample of 231 users and 519 non-users of its R. O. System was taken, recording their monthly income (in thousands), Age and Family Size, in the belief that these determine whether a person is a user or a non-user. The data is given in an EViews file.

We obtain the following PROBIT Results

We obtain the following LOGIT Results

Which one should we use?

PROBIT MODEL OR
LOGIT MODEL?

LOGIT vs PROBIT
The main difference is that the logistic distribution has slightly fatter tails than the normal distribution used by the Probit Model.

In the majority of applications, both give similar results. However, researchers and practitioners prefer to use LOGIT because of its comparative mathematical simplicity.
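The "fatter tails" remark can be illustrated numerically. The sketch below compares lower-tail probabilities of the standard normal distribution with those of a logistic distribution rescaled to unit variance. The rescaling is an assumption made here so that the two are comparable; the slides do not specify one.

```python
import math

def normal_cdf(z):
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z, scale=1.0):
    """Cumulative logistic distribution with the given scale."""
    return 1.0 / (1.0 + math.exp(-z / scale))

# A logistic distribution has variance scale^2 * pi^2 / 3,
# so this scale gives it unit variance, like the standard normal.
UNIT_VARIANCE_SCALE = math.sqrt(3.0) / math.pi

for z in (-2, -3, -4):
    tail_normal = normal_cdf(z)
    tail_logistic = logistic_cdf(z, UNIT_VARIANCE_SCALE)
    print(f"z = {z}: normal = {tail_normal:.6f}, logistic = {tail_logistic:.6f}")
```

Far from the centre the logistic tail probability is the larger one, while near the centre the two curves are close, which is why the two models usually give similar results.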

Now, the question is: what is the relation between Age, Size and Income and the probability that a person selected is a user of the company's R. O. System?

The first challenge is the relationship between the probability values and the independent variables

Though the relation between the probability and an independent variable is non-linear, the sign of the coefficient shows the direction of the relation between the probability and that independent variable.

The second challenge is: what is the rate of change of the probability due to a change in an independent variable?

For the LOGIT Model, the rate of change is given by

$\frac{\partial p}{\partial x_i} = \beta_i \, p \, (1 - p)$

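A minimal sketch of the rate of change for a logit model, read as ∂p/∂x_i = β_i p (1 - p). It reuses the one-variable bankruptcy logit reported earlier (Z = -0.180031 + 0.200139 ROA). The ROA values are hypothetical inputs, and the numerical derivative is included only as a cross-check.

```python
import math

# Estimated logit equation reported earlier: Z = -0.180031 + 0.200139 * ROA
INTERCEPT, SLOPE = -0.180031, 0.200139

def logit_probability(roa):
    """P(GROUP = 1 | ROA) under the estimated logit model."""
    return 1.0 / (1.0 + math.exp(-(INTERCEPT + SLOPE * roa)))

def marginal_effect(roa):
    """Rate of change of p with respect to ROA: beta * p * (1 - p)."""
    p = logit_probability(roa)
    return SLOPE * p * (1.0 - p)

def numerical_derivative(roa, h=1e-6):
    """Central-difference approximation, used only as a cross-check."""
    return (logit_probability(roa + h) - logit_probability(roa - h)) / (2.0 * h)

# Hypothetical ROA values.
for roa in (-10, 0, 10):
    print(f"ROA = {roa:>4}: dp/dROA = {marginal_effect(roa):.5f}")
```

The effect is largest where p is near 0.5 and shrinks as p approaches 0 or 1, so a single coefficient does not give a single rate of change.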
For further and better understanding

GO TO EXCEL

How do you feel about LOGIT and PROBIT?

THANK YOU
VERY MUCH
