TUTORIAL 3

ECO3021S 2014
UCT
Katherine Eyal
Material Covered: Rest of Chapter 2 (4 lectures), beginning of chapter 3
Handin Date: 11th August 2014
Problems
1. In the simple regression model, we derive that the sampling variance of
$\hat{\beta}_1$ is as follows:
$$\mathrm{Var}(\hat{\beta}_1) = \frac{\sigma^2}{SST_x}$$
Explain the meaning of the two components of $\mathrm{Var}(\hat{\beta}_1)$.
Under what conditions will this expression be large? Is it possible to pick
a sample which minimises the value of $\mathrm{Var}(\hat{\beta}_1)$? Why or
why not? [5]
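As a rough illustration of the variance expression in Problem 1, the following Stata sketch simulates two samples that share the same error term but differ in how spread out the regressor is. The seed, coefficient values and variable names (u, x_narrow, x_wide, y_narrow, y_wide) are invented for this sketch and have nothing to do with the tutorial dataset.

```stata
* Simulated data only: none of these variables come from NIDSdataset.dta
clear
set seed 3021
set obs 200

* Same error term in both samples
generate u = rnormal(0, 2)

* A regressor with little spread, and the same regressor with five times the spread
generate x_narrow = rnormal(0, 1)
generate x_wide   = 5 * x_narrow

generate y_narrow = 1 + 0.5 * x_narrow + u
generate y_wide   = 1 + 0.5 * x_wide + u

* Compare the standard error reported for the slope in each regression
regress y_narrow x_narrow
regress y_wide x_wide
```

In this construction the residuals are the same in both regressions, while the total sum of squares of x_wide is 25 times that of x_narrow, so the slope's standard error in the second regression should be one fifth of that in the first.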
2. (a) Let $\hat{\beta}_0$ and $\hat{\beta}_1$ be the OLS intercept and slope
from the regression of $y_i$ on $x_i$, using $n$ observations. Let $c_1$ and
$c_2$, with $c_2 \neq 0$, be constants. Let $\tilde{\beta}_0$ and
$\tilde{\beta}_1$ be the intercept and slope from the regression of
$c_1 y_i$ on $c_2 x_i$. Show that
$$\tilde{\beta}_1 = \frac{c_1}{c_2}\hat{\beta}_1 \quad \text{and} \quad \tilde{\beta}_0 = c_1 \hat{\beta}_0$$
[6]
(b) Explain your findings to (a) in words. [2]
(c) Now let $\hat{\beta}_0$ and $\hat{\beta}_1$ be the OLS intercept and slope
from the regression of $\log(y_i)$ on $x_i$, where we must assume $y_i > 0$
for all $i$. For $c_1 > 0$, let $\tilde{\beta}_0$ and $\tilde{\beta}_1$ be the
intercept and slope from the regression of $\log(c_1 y_i)$ on $x_i$. What is
the relationship between $\tilde{\beta}_0$ and $\tilde{\beta}_1$ and
$\hat{\beta}_0$ and $\hat{\beta}_1$? Give your answer algebraically and in
words. [5]
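Before or after working through the algebra in Problem 2, the relationships can be checked numerically. The sketch below is one possible way to do this in Stata on simulated data; the seed, the data-generating process, the constants c1 = 4 and c2 = 0.5, and all variable names are illustrative assumptions, not part of the tutorial.

```stata
* Simulated data only: variable names and constants are illustrative
clear
set seed 3021
set obs 100

* y is strictly positive by construction, so its log is defined
generate x = rnormal(10, 2)
generate y = exp(0.5 + 0.1 * x + rnormal(0, 0.3))

* Illustrative constants, with c2 different from zero and c1 positive
scalar c1 = 4
scalar c2 = 0.5

generate y_scaled = scalar(c1) * y
generate x_scaled = scalar(c2) * x

* Part (a): compare the intercepts and slopes of these two regressions
regress y x
regress y_scaled x_scaled

* Part (c): compare the intercepts and slopes of these two regressions
generate log_y  = ln(y)
generate log_cy = ln(scalar(c1) * y)
regress log_y x
regress log_cy x

* Useful for comparing the two intercepts in part (c)
display "log(c1) = " ln(scalar(c1))
```

A numerical check like this does not replace the derivation asked for in the question, but it is a quick way to catch an algebra slip.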
3. Download the Stata file NIDSdataset.dta under Resources, Tutorials in
Vula. This is a cleaned version of some of the variables in the National
Income Dynamics Survey wave 3.
This dataset is NOT for distribution or use in further research.
Some variables have been left uncleaned or values have been
changed to prevent it being re-used for other research.
Potential duration of child support grant receipt is just that. In 2012, all
children under the age of 18 are eligible for the grant (assume everyone is
means eligible, i.e. meets the income requirements to receive the grant).
Thus a 1 year old in 2012 has been exposed for 1 year, a 2 year old for 2
years, and so on. Do a cross tab to check this.
(a) Run a regression of years of attained education on the potential
duration for which a child has been exposed to the child support grant.
Report your results in equation form, and interpret the slope and
intercept coefficients. [4]
(b) (OMITTED FOR HANDIN, BUT IMPORTANT OTHERWISE) What is the mean
number of years of education and potential duration of receipt? Why does
the regression only contain 16,360 observations, when 16,499 observations
exist for the potential duration variable? Use a count command to verify
your answer. [4]
(c) Run the regression only for 1 year olds. What happens? Why? (You
may find it beneficial to refer to your answers for question 1). [3]
(d) Which values of potential duration show variation in the age variable
(i.e. which different age groups have the same values for potential
duration)?
Use the bysort command combined with tab to answer this question. Write
your code, and if you can, give an explanation for this odd pattern in
potential duration by age. [3]
(e) From the regression in 3(a), if potential duration increased by 3 years,
by how many years is education predicted to rise? [2]
(f) (OMITTED FOR HANDIN, BUT IMPORTANT OTHERWISE) What are the minimum
and maximum values of the predicted years of education? Do you find this
odd? Why or why not?
(You would use the predict command after the regression to obtain
your yhat variable). [4]
(g) Does potential duration explain a lot of the variation in years of ed-
ucation? Why or why not? [2]
(h) If instead we had regressed years of education on household income,
what coefficient might you expect and why? Could you interpret this
as a causal relationship? Why or why not? [3]
(i) Can we calculate or almost calculate the standard error of the
slope coefficient using the relationship
$$\mathrm{se}(\hat{\beta}_1) = \frac{\hat{\sigma}}{\sqrt{SST_x}}$$
from the output of the regression in (h)? Why or why not? [3]
(j) Calculate the standard error of the regression from the output
of the regression in (h), and check that your answer is the same as
that reported in the regression output. Use the display command
to do these calculations. (For bonus points, use the ereturn list
command, and use those scalars to calculate your answer). Report
your answers and your code. (PAGE 52 IN THE TEXTBOOK) [3]
(k) Take the regression in (h), and add in age. What happens to the sum
of squared residuals and to the R-squared? Does this make sense?
Explain your answers. [3]
(l) (OMITTED FOR HANDIN, BUT IMPORTANT OTHERWISE) Interpret the
coefficient on age. Does it make sense to have a constant effect of one
more year of age? Why or why not? [3]
(m) If we were to then add in an age squared term, would this mean that
we were no longer running a linear regression? Why or why not? [2]
(n) Now run a regression of the log of monthly wages on years of
education, household income, age and age squared. Report your answer in
equation format (use 2 decimal places only), and interpret the
coefficient on years of education. [2]
(o) Using the regression in 3(n), calculate the predicted wage for someone
who is 25 years old, with a household income of R10,000 and 12 years of
education. Use the version of the model you reported in 3(n). [3]
TOTAL: 65
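
The Stata commands named in Problem 3 (cross tabs, count, bysort with tab, predict, display and ereturn list) can be tried out first on a dataset that ships with Stata. The sketch below uses the built-in auto data purely as a stand-in; its variables (price, mpg, rep78, foreign) are unrelated to NIDSdataset.dta, whose variable names are not given here, so the commands will need adapting before they answer any part of the tutorial.

```stata
* Illustration on Stata's shipped auto data, NOT on NIDSdataset.dta
sysuse auto, clear

* Cross-tabulate two variables
tabulate rep78 foreign

* Count observations that are missing on a variable;
* regress drops any observation with a missing value in the model
count if missing(rep78)

* Tabulate one variable separately within each group of another
bysort foreign: tabulate rep78

* Run a simple regression, then list the results Stata stores
regress price mpg
ereturn list

* Standard error of the regression from stored scalars:
* square root of (residual sum of squares / residual degrees of freedom),
* compared with the value Stata stores as e(rmse)
display sqrt(e(rss) / e(df_r))
display e(rmse)

* Fitted values and their minimum and maximum
predict yhat, xb
summarize yhat
```

The two display lines should print the same number, which is also the Root MSE shown in the regression output.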