Name: Answer key
Student ID#:

Points:
Question 1 _________
Question 2 _________
Question 3 _________
Question 4 _________
Total _________
Final
Exam time 2 hours and 45 minutes
This final contains four questions for a total of 100 points. The points for every question and sub-question are listed
on the exam. After each sub-question there is a box that you can use to fill in your answer. Apart from the answers
that you are explicitly asked to fill in in a table or draw in a figure, the content of the answer boxes is all that will
be graded. Hand in the exam at the end of the 2 hours and 45 minutes. Please do not forget to fill in your
name and student ID on this front page! THIS IS A CLOSED BOOK EXAM AND THE USE OF CALCULATORS
IS NOT ALLOWED. BRIEFLY EXPLAIN ALL YOUR ANSWERS. ANSWERS WITHOUT
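$$Y = X\beta + \varepsilon, \tag{1}$$
where
$$Y = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}, \quad X = \begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}, \quad \text{and} \quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix}. \tag{2}$$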
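a) (5 points) Show that the OLS estimator of $\beta$ amounts to OLS per equation. That is, $\hat\beta_1$ is the OLS
estimator obtained by regressing $Y_1$ on $X_1$ and $\hat\beta_2$ is the OLS estimator obtained by regressing $Y_2$
on $X_2$.
Hint: Use the fact that the inverse of a block diagonal matrix satisfies
$$\begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}^{-1} = \begin{bmatrix} A^{-1} & 0 \\ 0 & B^{-1} \end{bmatrix}.$$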
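$$\hat\beta = \left( \begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}' \begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix} \right)^{-1} \begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}' \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} = \begin{bmatrix} X_1'X_1 & 0 \\ 0 & X_2'X_2 \end{bmatrix}^{-1} \begin{bmatrix} X_1'Y_1 \\ X_2'Y_2 \end{bmatrix}$$
Using the hint about the inverse of a block diagonal matrix provided in the question, we can write
$$\hat\beta = \begin{bmatrix} (X_1'X_1)^{-1} & 0 \\ 0 & (X_2'X_2)^{-1} \end{bmatrix} \begin{bmatrix} X_1'Y_1 \\ X_2'Y_2 \end{bmatrix} = \begin{bmatrix} (X_1'X_1)^{-1}X_1'Y_1 \\ (X_2'X_2)^{-1}X_2'Y_2 \end{bmatrix} = \begin{bmatrix} \hat\beta_1 \\ \hat\beta_2 \end{bmatrix}.$$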
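The result can also be checked numerically. The following is a minimal sketch, assuming simulated data and hypothetical dimensions (`n1`, `n2`, `k1`, `k2` and the coefficient values are illustrative choices, not part of the exam): it stacks two equations into one block-diagonal system and compares the stacked OLS estimate with the two per-equation estimates.

```python
import numpy as np

# Hypothetical simulated data; sizes and coefficients are illustrative only.
rng = np.random.default_rng(0)
n1, n2, k1, k2 = 40, 60, 2, 3
X1 = rng.normal(size=(n1, k1))
X2 = rng.normal(size=(n2, k2))
Y1 = X1 @ np.array([1.0, -2.0]) + rng.normal(size=n1)
Y2 = X2 @ np.array([0.5, 0.0, 3.0]) + rng.normal(size=n2)

# Stacked system with a block-diagonal regressor matrix.
X = np.block([[X1, np.zeros((n1, k2))],
              [np.zeros((n2, k1)), X2]])
Y = np.concatenate([Y1, Y2])


def ols(X, Y):
    """OLS coefficients via the normal equations (X'X) b = X'Y."""
    return np.linalg.solve(X.T @ X, X.T @ Y)


beta_stacked = ols(X, Y)
beta_per_eq = np.concatenate([ols(X1, Y1), ols(X2, Y2)])

# True when the stacked estimate matches OLS per equation.
print(np.allclose(beta_stacked, beta_per_eq))
```

The two equations are allowed to have different numbers of regressors here, since the block-diagonal argument does not require equal dimensions.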
b) (2 points) Suppose, instead, that the covariance matrix of the residuals equals
$$E[\varepsilon\varepsilon' \mid X] = \sigma^2 I_{n_1+n_2}$$
Would the OLS estimator from part a) still be consistent and unbiased in this case?
Yes (well, formally yes potentially), this general form of heteroskedasticity does not in all cases
affect the consistency and unbiasedness of the OLS estimator, but it would make the estimator
inefficient.
Because it is the BLUE in this transformed system, it is the BLUE for the linear regression model
that we consider in this subquestion.
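d) (5 points) Suppose again that $E[\varepsilon\varepsilon' \mid X] = \sigma^2 I_{n_1+n_2}$. Show that the restricted OLS estimator,
under the restriction that $\beta_1 = \beta_2$ equals
$$\tilde\beta = W_1\hat\beta_1 + W_2\hat\beta_2, \quad \text{where } W_i = (X_1'X_1 + X_2'X_2)^{-1}(X_i'X_i), \text{ for } i = 1,2,$$
where $\hat\beta_1$ and $\hat\beta_2$ are the OLS estimates from part (a).
Hint: Redefine the matrix $X$ to reflect that both parameter vectors are the same.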
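In principle, one could apply restricted least squares here, but that turns out to be rather
cumbersome. The easiest way to show this is to rewrite the model under the restriction. In that case
the model can be written as
$$Y = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}\beta + \varepsilon,$$
such that
$$\tilde\beta = \left( \begin{bmatrix} X_1' & X_2' \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \right)^{-1} \begin{bmatrix} X_1' & X_2' \end{bmatrix} \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} = (X_1'X_1 + X_2'X_2)^{-1}(X_1'Y_1 + X_2'Y_2).$$
We can write this as
$$\tilde\beta = (X_1'X_1 + X_2'X_2)^{-1}(X_1'X_1)(X_1'X_1)^{-1}X_1'Y_1 + (X_1'X_1 + X_2'X_2)^{-1}(X_2'X_2)(X_2'X_2)^{-1}X_2'Y_2,$$
which is
$$\tilde\beta = (X_1'X_1 + X_2'X_2)^{-1}(X_1'X_1)\hat\beta_1 + (X_1'X_1 + X_2'X_2)^{-1}(X_2'X_2)\hat\beta_2.$$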
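As a numerical illustration of this identity, here is a minimal sketch assuming simulated data with a common coefficient vector (the sample sizes, `beta_true`, and the names `W1`, `W2`, `b_pooled` are assumptions made for this example). It compares the pooled OLS estimate on the stacked data with the matrix-weighted average of the two per-equation estimates.

```python
import numpy as np

# Hypothetical simulated data with the same coefficient vector in both equations.
rng = np.random.default_rng(1)
n1, n2, k = 30, 50, 2
beta_true = np.array([1.0, -0.5])
X1 = rng.normal(size=(n1, k))
X2 = rng.normal(size=(n2, k))
Y1 = X1 @ beta_true + rng.normal(size=n1)
Y2 = X2 @ beta_true + rng.normal(size=n2)

S1, S2 = X1.T @ X1, X2.T @ X2

# Per-equation OLS estimates.
b1 = np.linalg.solve(S1, X1.T @ Y1)
b2 = np.linalg.solve(S2, X2.T @ Y2)

# Weighting matrices (X1'X1 + X2'X2)^{-1} (Xi'Xi).
W1 = np.linalg.solve(S1 + S2, S1)
W2 = np.linalg.solve(S1 + S2, S2)
b_weighted = W1 @ b1 + W2 @ b2

# Restricted estimator: pooled OLS on the stacked data.
X_stack = np.vstack([X1, X2])
Y_stack = np.concatenate([Y1, Y2])
b_pooled = np.linalg.solve(X_stack.T @ X_stack, X_stack.T @ Y_stack)

print(np.allclose(b_weighted, b_pooled))    # weighted average equals pooled OLS
print(np.allclose(W1 + W2, np.eye(k)))      # the weights sum to the identity
```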
e) (5 points) How are the matrices, $W_1$ and $W_2$, related to the estimated variance-covariance
matrices of the OLS estimators $\hat\beta_1$ and $\hat\beta_2$?
The covariance matrices of $\hat\beta_1$ and $\hat\beta_2$ are given by $V_1 = \sigma^2(X_1'X_1)^{-1}$ and $V_2 = \sigma^2(X_2'X_2)^{-1}$,
respectively. The covariance matrix of $\tilde\beta$ is given by $V = \sigma^2(X_1'X_1 + X_2'X_2)^{-1}$.
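One way to make the link explicit is the short derivation below. The labels are assumptions introduced here for readability: $W_i$ for the weighting matrices, $V_i$ for the covariance matrix of the per-equation estimator $\hat\beta_i$, and $V$ for the covariance matrix of the restricted estimator.

$$W_i = (X_1'X_1 + X_2'X_2)^{-1}X_i'X_i = \left[\sigma^2 (X_1'X_1 + X_2'X_2)^{-1}\right]\left[\sigma^2 (X_i'X_i)^{-1}\right]^{-1} = V V_i^{-1}, \qquad W_1 + W_2 = I.$$

Each weight is thus the covariance matrix of the restricted estimator times the inverse covariance matrix (the precision) of the corresponding per-equation estimator.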
f) (3 points) Give an intuitive interpretation of the result in parts (d) and (e). In what way can the
matrices $W_1$ and $W_2$ be interpreted as weighting matrices and $\tilde\beta$ as a weighted average of
$\hat\beta_1$ and $\hat\beta_2$?
The above result suggests that $\tilde\beta$ is a weighted average of $\hat\beta_1$ and $\hat\beta_2$, where the relative weights
are inversely related to the covariance matrix of the estimator. That is, the more precisely estimated
of the two gets the highest weight in the restricted estimator $\tilde\beta$.
$$Y_i = \beta_0 + \beta_1 X_i + u_i \tag{4}$$
The sample consists of $n$ independent, identically distributed observations but the problem is that
$E[u_i \mid X_i] \neq 0$ for all $i$ and OLS would yield an inconsistent estimate of $\beta_1$. Fortunately, there is an
instrumental variable, $Z_i \in \{0,1\}$, which is an indicator variable and has the properties that
$E[X_i \mid Z_i = 1] \neq E[X_i \mid Z_i = 0]$ and that $E[u_i \mid Z_i = 1] = E[u_i \mid Z_i = 0] = 0$.
Throughout the rest of this problem we consider the case of a sample $i = 1, \ldots, n$, where
$$n_1 = \sum_{i=1}^{n} Z_i \quad \text{and} \quad n_0 = n - n_1. \tag{5}$$
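Moreover, besides the standard sample means, $\bar X$ and $\bar Y$, we consider the conditional sample means
$$\bar X_0 = \frac{1}{n_0}\sum_{i=1}^{n}(1 - Z_i)X_i, \quad \bar X_1 = \frac{1}{n_1}\sum_{i=1}^{n} Z_i X_i, \quad \bar Y_0 = \frac{1}{n_0}\sum_{i=1}^{n}(1 - Z_i)Y_i, \quad \text{and} \quad \bar Y_1 = \frac{1}{n_1}\sum_{i=1}^{n} Z_i Y_i. \tag{6}$$
Note that
$$\bar X = \frac{n_0}{n}\bar X_0 + \frac{n_1}{n}\bar X_1, \quad \text{and} \quad \bar Y = \frac{n_0}{n}\bar Y_0 + \frac{n_1}{n}\bar Y_1. \tag{7}$$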
Using this notation, we derive the two-stage least squares estimator of $\beta_1$ in this problem.
First-stage regression
The first step of the two-stage least squares procedure in this case would be
to estimate the first-stage regression
$$X_i = \pi_0 + \pi_1 Z_i + v_i \tag{8}$$
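a) (5 points) Show that the OLS estimator, $\hat\pi_1$, in this first-stage regression equals $\hat\pi_1 = \bar X_1 - \bar X_0$.
The OLS estimator, $\hat\pi_1$, in this case is given by
$$\hat\pi_1 = \frac{\frac{1}{n}\sum_{i=1}^{n}(Z_i - \bar Z)(X_i - \bar X)}{\frac{1}{n}\sum_{i=1}^{n}(Z_i - \bar Z)^2} = \frac{\frac{1}{n}\sum_{i=1}^{n} Z_i X_i - \bar Z \bar X}{\frac{1}{n}\sum_{i=1}^{n} Z_i^2 - \bar Z^2} = \frac{\frac{n_1}{n}\bar X_1 - \frac{n_1}{n}\bar X}{\frac{n_1}{n}\left(1 - \frac{n_1}{n}\right)},$$
where the last step uses $Z_i^2 = Z_i$ and $\bar Z = n_1/n$. Substituting $\bar X = \frac{n_0}{n}\bar X_0 + \frac{n_1}{n}\bar X_1$ gives
$$\hat\pi_1 = \frac{\frac{n_1}{n}\left(1 - \frac{n_1}{n}\right)\bar X_1 - \frac{n_1}{n}\frac{n_0}{n}\bar X_0}{\frac{n_1}{n}\left(1 - \frac{n_1}{n}\right)} = \frac{\frac{n_1}{n}\frac{n_0}{n}\left(\bar X_1 - \bar X_0\right)}{\frac{n_1}{n}\frac{n_0}{n}} = \bar X_1 - \bar X_0.$$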
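A quick numerical check of this first-stage result is sketched below, assuming simulated data (the sample size and the data-generating values are hypothetical). With a binary instrument, the OLS slope from regressing the endogenous variable on the instrument should coincide with the difference between the two conditional sample means.

```python
import numpy as np

# Hypothetical simulated data with a binary instrument Z.
rng = np.random.default_rng(2)
n = 200
Z = rng.integers(0, 2, size=n)
X = 1.0 + 0.8 * Z + rng.normal(size=n)

# OLS slope: sample covariance of (Z, X) over sample variance of Z (both with 1/n).
pi1_hat = np.cov(Z, X, bias=True)[0, 1] / np.var(Z)

# Difference between the conditional sample means of X.
diff_means = X[Z == 1].mean() - X[Z == 0].mean()

print(np.isclose(pi1_hat, diff_means))
```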
c) (2 points) Show that the fitted value of $X_i$ based on the first-stage regression equals
$$\hat X_i = \begin{cases} \bar X_0 & \text{if } Z_i = 0 \\ \bar X_1 & \text{if } Z_i = 1 \end{cases} \tag{9}$$
Second-stage regression
The next step is to run the second-stage regression by estimating $\beta_0$ and $\beta_1$
using OLS in the following regression model
$$Y_i = \beta_0 + \beta_1 \hat X_i + u_i \tag{10}$$
d) (5 points) What happens in this regression if $\bar X_0 = \bar X_1$? Why would we expect this not to be the
case as $n$ gets large?
If $\bar X_0 = \bar X_1$ then $\hat X_i$ is constant across all observations $i$. Consequently, the explanatory variable and
constant term are linearly dependent and the regression suffers from perfect multicollinearity.
We do not expect this to happen as $n \to \infty$, because from the LLN, we know that
$$\operatorname{plim} \bar X_0 = E[X_i \mid Z_i = 0] \neq E[X_i \mid Z_i = 1] = \operatorname{plim} \bar X_1.$$
Thus, as $n \to \infty$ the conditional sample means $\bar X_0$ and $\bar X_1$ converge in probability to different
limits and, with probability approaching one, will not be equal.
e) (10 points) Show that the 2SLS estimator of $\beta_1$, that is obtained from estimating $\beta_1$ using OLS in
this second-stage regression equals
$$\hat\beta_{1,2SLS} = \frac{\bar Y_1 - \bar Y_0}{\bar X_1 - \bar X_0} \tag{11}$$
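Because the sample mean of the fitted values $\hat X_i$ equals $\bar X$, the OLS estimator in the second-stage regression is
$$\hat\beta_{1,2SLS} = \frac{\frac{1}{n}\sum_{i=1}^{n}(\hat X_i - \bar X)(Y_i - \bar Y)}{\frac{1}{n}\sum_{i=1}^{n}(\hat X_i - \bar X)^2} = \frac{\frac{1}{n}\sum_{i=1}^{n}\hat X_i Y_i - \bar X \bar Y}{\frac{1}{n}\sum_{i=1}^{n}\hat X_i^2 - \bar X^2}.$$
For the denominator,
$$\frac{1}{n}\sum_{i=1}^{n}\hat X_i^2 - \bar X^2 = \frac{n_0}{n}\bar X_0^2 + \frac{n_1}{n}\bar X_1^2 - \bar X^2 = \frac{n_0}{n}\bar X_0^2 + \frac{n_1}{n}\bar X_1^2 - \left(\frac{n_0}{n}\bar X_0 + \frac{n_1}{n}\bar X_1\right)^2,$$
which equals
$$\frac{n_0}{n}\left(1 - \frac{n_0}{n}\right)\bar X_0^2 + \frac{n_1}{n}\left(1 - \frac{n_1}{n}\right)\bar X_1^2 - 2\frac{n_0}{n}\frac{n_1}{n}\bar X_0 \bar X_1 = \frac{n_0}{n}\frac{n_1}{n}\left(\bar X_0 - \bar X_1\right)^2.$$
For the numerator,
$$\frac{1}{n}\sum_{i=1}^{n}\hat X_i Y_i - \bar X \bar Y = \frac{n_0}{n}\bar X_0 \bar Y_0 + \frac{n_1}{n}\bar X_1 \bar Y_1 - \left(\frac{n_0}{n}\bar X_0 + \frac{n_1}{n}\bar X_1\right)\left(\frac{n_0}{n}\bar Y_0 + \frac{n_1}{n}\bar Y_1\right).$$
Expanding this product (FOIL) is similar to, but slightly more complicated than, the expansion for the denominator. The above
equals
$$\frac{n_0}{n}\left(1 - \frac{n_0}{n}\right)\bar X_0 \bar Y_0 + \frac{n_1}{n}\left(1 - \frac{n_1}{n}\right)\bar X_1 \bar Y_1 - \frac{n_0}{n}\frac{n_1}{n}\left(\bar X_0 \bar Y_1 + \bar X_1 \bar Y_0\right) = \frac{n_0}{n}\frac{n_1}{n}\left(\bar X_0 - \bar X_1\right)\left(\bar Y_0 - \bar Y_1\right).$$
Combining the two results gives
$$\hat\beta_{1,2SLS} = \frac{\frac{n_0}{n}\frac{n_1}{n}\left(\bar X_0 - \bar X_1\right)\left(\bar Y_0 - \bar Y_1\right)}{\frac{n_0}{n}\frac{n_1}{n}\left(\bar X_0 - \bar X_1\right)^2} = \frac{\bar Y_0 - \bar Y_1}{\bar X_0 - \bar X_1} = \frac{\bar Y_1 - \bar Y_0}{\bar X_1 - \bar X_0}.$$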
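The two-step procedure can be replicated numerically. The sketch below assumes simulated data in which the regressor is correlated with the error term (all parameter values and variable names are hypothetical); it runs the first and second stage by OLS and compares the resulting slope with the ratio of differences in conditional sample means.

```python
import numpy as np

# Hypothetical simulated data: X is endogenous because it depends on u.
rng = np.random.default_rng(3)
n = 500
Z = rng.integers(0, 2, size=n)
u = rng.normal(size=n)
X = 1.0 + 0.8 * Z + 0.5 * u + rng.normal(size=n)
Y = 2.0 + 1.5 * X + u


def ols(D, y):
    """OLS coefficients of y on the columns of the design matrix D."""
    return np.linalg.lstsq(D, y, rcond=None)[0]


ones = np.ones(n)

# First stage: regress X on a constant and Z, then form fitted values.
pi_hat = ols(np.column_stack([ones, Z]), X)
X_hat = pi_hat[0] + pi_hat[1] * Z

# Second stage: regress Y on a constant and the fitted values.
beta_2sls = ols(np.column_stack([ones, X_hat]), Y)[1]

# Ratio of differences in conditional sample means.
ratio = (Y[Z == 1].mean() - Y[Z == 0].mean()) / (X[Z == 1].mean() - X[Z == 0].mean())

print(np.isclose(beta_2sls, ratio))
```

The fitted values take only two distinct values, the conditional sample means of the regressor, which is why the second-stage slope collapses to a ratio of two differences in means.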
g) (5 points) Suppose that the instrumental variable $Z_i$ is a fully random treatment. This means that
for any set of potentially omitted variables, $w_i$, in the regression equation we estimate, where
$$Y_i = \beta_0 + \beta_1 X_i + w_i'\gamma + u_i, \tag{12}$$
In an influential early example of this sort of study, Joshua Angrist of the Massachusetts Institute of
Technology (MIT) and Alan Krueger of Princeton University used America's education laws to
create an instrumental variable based on years of schooling. These laws mean that children born
earlier in the year are older when they start school than those born later in the year, which means
they have received less schooling by the time they reach the legal leaving-age. Since a child's birth
date is unrelated to intrinsic ability, it is a good instrument for teasing out schooling's true effect on
wages. Over time, uses of such instrumental variables have become a standard part of economists'
set of tools. Freakonomics, the 2005 bestseller by Steven Levitt and Stephen Dubner, provides a
popular treatment of many of the techniques. Mr Levitt's analysis of crime during American election
cycles, when police numbers rise for reasons unconnected to crime rates, is a celebrated example of
an instrumental variable.
Two recent papers, one by James Heckman of Chicago University and Sergio Urzua of
Northwestern University, and another by Angus Deaton of Princeton, are sharply critical of this
approach. The authors argue that the causal effects that instrumental strategies identify are
uninteresting because such techniques often give answers to narrow questions. The results from the
quarter-of-birth study, for example, do not say much about the returns from education for college
graduates, whose choices were unlikely to have been affected by when they were legally eligible to
drop out of school. According to Mr Deaton, using such instruments to estimate causal parameters
is like choosing "to let light fall where it may, and then proclaim[ing] that whatever it illuminates is
what we were looking for all along."
IV leagues
This is too harsh. It is no doubt possible to use instrumental variables to estimate effects on
uninteresting subgroups of the population. But the quarter-of-birth study, for example, shone light
on something that was both interesting and significant. The instrumental variable in this instance
allows a clear, credible estimate of the return from extra schooling for those most inclined to drop
out from school early. These are precisely the people whom a policy that sought to prolong the
amount of education would target. Proponents of instrumental variables also argue that accurate
answers to narrower questions are more useful than unreliable answers to wider questions.
A more legitimate fear is that important questions for which no good instrumental variables can be
found are getting short shrift because of economists' obsession with solving statistical problems. Mr
Deaton says that instrumental variables encourage economists to avoid thinking about how and
why things work. Striking a balance between accuracy of result and importance of issue is tricky.
If economists end up going too far in emphasising accuracy, they may succeed in taking the con
out of econometrics, as Mr Leamer urged them to, only to leave more pressing questions on the
shelf.
a) (5 points) "According to Mr Deaton, using such instruments to estimate causal parameters is like
choosing to let light fall where it may, and then proclaim[ing] that whatever it illuminates is
what we were looking for all along." Does Angus Deaton criticize internal or external validity of
many instrumental variable studies in this quote? Explain your answer.
Angus Deaton criticizes the external validity of instrumental variable studies. He does so by
suggesting that the results that IV studies come up with shine light on issues that are not necessarily
helpful in answering broader research questions economists are interested in. He is critical of the
claim by researchers that do IV regressions that their results are applicable beyond what they
illuminate, i.e. of their claims of external validity beyond their narrow illuminated results.
b) (5 points) "Mr Leamer showed how different (but apparently reasonable) choices about which
variables to include in an analysis of the effect of capital punishment on murder rates could lead
to the conclusion that the death penalty led to more murders, fewer murders, or had no effect at
all." What type of bias in OLS estimates did Ed Leamer illustrate with this example about the
effect of capital punishment on murder rates?
Mr. Leamer illustrated omitted variable bias, where the estimated coefficient of interest is biased
because it partially picks up the effects of variables that are not included in the regression and that
are correlated with the explanatory variable of interest.
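For reference, a standard way to write this bias is given below. The notation is an assumption made for this illustration: a single omitted variable $w_i$ with coefficient $\gamma$ in the model $Y_i = \beta_0 + \beta_1 X_i + \gamma w_i + u_i$, where $u_i$ is uncorrelated with both $X_i$ and $w_i$, and $w_i$ is left out of the estimated regression.

$$\hat\beta_1^{OLS} \xrightarrow{\;p\;} \beta_1 + \gamma\,\frac{\operatorname{Cov}(X_i, w_i)}{\operatorname{Var}(X_i)}$$

The sign of the bias therefore depends on the sign of $\gamma$ and on the sign of the correlation between the included and the omitted variable, which is how different sets of included variables can push the estimate in different directions.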
c) (5 points) "A more legitimate fear is that important questions for which no good instrumental
variables can be found are getting short shrift because of economists' obsession with solving
statistical problems." What two main properties does a good instrumental variable have to
have?
A good instrumental variable needs to be (i) relevant in that it is (highly) correlated with the
endogenous variable of interest that it instruments for, and (ii) exogenous in that it is uncorrelated
with the residuals in the regression that the endogenous explanatory variable of interest is correlated
with.
Justin McCrary points out, in a replication study, that Levitt has inverted the weights in his
weighted least squares regression in which he aimed to correct for heteroskedasticity. With the correct
weights the results he emphasizes are not significant anymore.
Question 4: Fertility and the labor supply decisions of married women (20 points)
In the paper "Children and Their Parents' Labor Supply: Evidence from Exogenous Variation in
Family Size" Joshua Angrist and Bill Evans are interested in estimating the effect of number of
children on a married woman's labor supply decision.
Let $Y_i$ be a measure of the labor supply and $X_i$ be a measure of fertility. Angrist and Evans are
interested in the causal effect of fertility on labor supply. The particular measure of fertility that
Angrist and Evans use is whether or not a woman has more than two children. That is, they would
like an unbiased estimate of the parameter $\beta_1$ in the simple regression model
$$Y_i = \beta_0 + \beta_1 X_i + w_i'\gamma + u_i, \tag{13}$$
Angrist and Evans then consider whether the likelihood of a woman having another child depends on
the gender composition of children she already has. They do so because the gender of children can be
considered as random. Their aim is to use this gender composition as an instrumental variable for
their fertility variable $X_i$. They report the table below.
The top part of the table reports the likelihood of a woman having another child, after having one
child already, conditional on the gender of the first child. The bottom half reports the likelihood of a
woman having another child, after she already had two children, conditional on the gender of the first
two.
b) (4 points) Based on this table, explain why Angrist and Evans do not use whether or not a woman
has more than one child as the fertility measure, $X_i$, in their regression.
The top of the table suggests that the gender of the first child is not correlated with the probability
that a woman has a second child. That is, the estimated probabilities that a woman has a second child
are not significantly different when we condition on the gender of the first child.
This means that the gender of the first child is not a relevant instrument for the endogenous fertility
variable that a woman has a second child. So, for that endogenous variable Angrist and Evans do not
have a good instrument.
Angrist and Evans then run the first-stage regression, where they estimate
$$X_i = Z_i'\pi + w_i'\delta + v_i \tag{14}$$
The instruments that they include in the vector $Z_i$ describe the gender composition of the first
two children. That is, $Z_i$ includes (i) whether the first child was a boy, (ii) whether the second child was a
boy, (iii) whether the first two children are of the same gender, (iv) whether the first two children are
boys, and (v) whether the first two children are girls.
Of course, (iii), (iv), and (v) are multicollinear and are never jointly included in any specification.
In fact, Angrist and Evans either include (iii) or (iv) and (v). The table on the next page shows the
first-stage regression results.
c) (4 points) If you would like to advise Angrist and Evans to make the case that the instruments are
relevant more directly, which additional test statistic would you suggest they include in this
table? Why? What information would it add to what is already reported?
What is of interest is whether the potential instruments explain a significant part of the variation in
the fertility variable. The best way to test for this is an F-test of the joint hypothesis that all coefficients
on the instruments in this first-stage regression are zero. They do not report such a test and, since they
include other covariates, it is not implied by the reported $R^2$.
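As an illustration of the kind of statistic meant here, the sketch below computes a homoskedasticity-only F-statistic for the joint null that all instrument coefficients in a first-stage regression are zero, by comparing restricted and unrestricted sums of squared residuals. The data are simulated and every name and value (`W` for other covariates, `Z` for instruments, the coefficients) is a hypothetical choice; this is not the Angrist and Evans data or their specification.

```python
import numpy as np

# Hypothetical simulated first stage with two instruments and two other covariates.
rng = np.random.default_rng(4)
n = 400
W = rng.normal(size=(n, 2))                          # other covariates
Z = rng.integers(0, 2, size=(n, 2)).astype(float)    # binary instruments
X = 0.3 * Z[:, 0] + 0.2 * Z[:, 1] + W @ np.array([0.5, -0.5]) + rng.normal(size=n)


def ssr(D, y):
    """Sum of squared OLS residuals from regressing y on the columns of D."""
    coef = np.linalg.lstsq(D, y, rcond=None)[0]
    resid = y - D @ coef
    return resid @ resid


ones = np.ones((n, 1))
D_restricted = np.hstack([ones, W])        # instruments excluded
D_unrestricted = np.hstack([ones, W, Z])   # instruments included

q = Z.shape[1]                  # number of restrictions tested
k = D_unrestricted.shape[1]     # number of coefficients in the unrestricted model

ssr_r = ssr(D_restricted, X)
ssr_u = ssr(D_unrestricted, X)
F = ((ssr_r - ssr_u) / q) / (ssr_u / (n - k))

print(F)
```

Because the other covariates appear in both the restricted and the unrestricted regression, this statistic isolates the explanatory power of the instruments, which the overall $R^2$ of the first stage does not. A heteroskedasticity-robust version would require a different variance estimator and is not shown.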
The table below shows the results of the second-stage regression from Angrist and Evans. They run
their second-stage regression with the labor supply variables for unmarried women, married women,
as well as the husbands of married women.
d) (4 points) What does the fact that the estimated labor supply response reduces when estimated
with 2SLS rather than OLS tell you about properties of the omitted variables that you discussed
in part a)?
If the 2SLS estimate is consistent and the OLS estimate is biased, then this suggests that the OLS
coefficient has a negative bias and that omitted variables that tend to increase labor supply tend to
reduce fertility, as discussed in my answer to part a).
e) (4 points) How does this paragraph address the criticism of Angus Deaton from part a) of Question
3?
Angrist and Evans use this paragraph to make the point that the fertility effect on labor supply
that this study analyzes illuminates a part of individuals' labor supply decisions that accounts for a
substantial part of the aggregate movements in labor supply. Thus, though the light to some extent
fell randomly on more-than-two children, this turns out to be a substantial part of the overall labor
supply effect between 1970 and 1990.