
ECON 10: Introduction to Statistical Methods

Final Exam (Part 2)

General instructions:
- Read the instructions that I posted on Blackboard.
- Copy and paste all relevant Stata commands and Stata output into your Word
document.
- Explain all results in full sentences.
Data description:
The National Longitudinal Survey of Youth (NLSY) 79 is a nationally representative sample
of 12,686 young men and women who were 14-22 years old when they were first surveyed in
1979. Individuals were interviewed each year up to 1994, and on a biennial basis thereafter.
The attached data file (ECON 10 final.dta) contains a small subset of the variables available in
the sample - all with the official labels and value labels provided by the Bureau of Labor
Statistics (BLS). All variables should be self-explanatory, except for the AFQT variable,
which stands for Armed Forces Qualification Test - a standard aptitude test administered to
the respondents in 1980 and 1981.
Questions:
1) Familiarize yourself with the variables in the data set - use codebook, sum and
tab to explore the variables (you don't have to document this).
2) Calculate descriptive statistics for all variables (mean and standard deviation for
quantitative variables and frequencies for categorical variables). Briefly summarize
the results (no need to explain every number).
3) In addition, calculate descriptive statistics for the wage variable by sex, marital status
and education. Briefly summarize the results.
4) Calculate the correlation between education and the education of the parents (the
command is corr). Describe the result and give possible explanations.
5) Generate the following three variables:
a. log_wage, the natural logarithm of wage income (AMT OF RS
WAGES/SALARY/TIPS (PCY) 2006).
b. married, a binary variable which is 1 if the person is married in 2006 and 0
otherwise.
c. female, a binary variable which is 1 if the person is female and 0 otherwise.
6) Draw a histogram for the variable log_wage and describe it.
7) Test the null hypothesis that the average wage income is the same for each marital
status (check the assumption of equal variances).
8) Test the null hypothesis that the average wage income is the same for men and
women.
9) Run a simple regression with log_wage as the dependent variable and education as
the explanatory variable. Interpret the results.
10) Add other variables from the data set (which make sense) as explanatory variables to
the regression. Interpret the results.
11) Generate a variable no_wage which is 1 if the wage income is zero and 0 otherwise.
12) Run a logistic regression with no_wage as the dependent variable and married, female,
education, AFQT percentile, residence and jobs_had as explanatory variables.
Interpret the results.
13) Discuss why the regression results (9, 10, 12) may or may not be interpreted as causal.
