$tstat = \dfrac{\hat\beta_1 - \beta_1}{sd(\hat\beta_1)} = \dfrac{\hat\beta_1 - 1}{sd(\hat\beta_1)}$
Answer:
$H_0: \beta_{lenroll} = 1$
$H_1: \beta_{lenroll} > 1$
$t = (1.26976 - 1)/0.109776 = 2.45736$
The one-sided (right-tail) critical value at $\alpha = 5\%$ is 1.66. It can be obtained with the
command disp invttail(95,0.05), which yields 1.6610518.
Conclusion: Since the t-stat (2.46) is greater than the critical value (1.66), $H_0$ is rejected at the 5% level. We
conclude that the elasticity of crime with respect to enroll is greater than one.
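The arithmetic of this answer can be re-checked in Stata from the reported numbers alone, without loading any data. This is a minimal sketch: the scalar names b, se, tstat, crit, and pval are introduced here for illustration, and the coefficient and standard error are the values quoted in the answer above.

```stata
* Coefficient on lenroll and its standard error, as quoted in the answer above
scalar b  = 1.26976
scalar se = 0.109776

* t-statistic for H0: beta_lenroll = 1
scalar tstat = (b - 1)/se

* Right-tail 5% critical value and one-sided p-value with 95 degrees of freedom
scalar crit = invttail(95, 0.05)
scalar pval = ttail(95, tstat)

display "tstat = " tstat "   critical value = " crit "   p-value = " pval
```

H0 is rejected at the 5% level when tstat exceeds crit, or equivalently when pval is below 0.05.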
Example 4.5 Housing prices and air pollution
Data: hprice2s8.dta
For a sample of 506 communities in the Boston area, we estimate a model relating median housing price
(price) in the community to various community characteristics: nox is the amount of nitrogen oxide in the
air, in parts per million; dist is a weighted distance of the community from five employment centers, in
miles; rooms is the average number of rooms in houses in the community; and stratio is the average
student-teacher ratio of schools in the community.
1. Estimate the regression:
$\log(price_i) = \beta_0 + \beta_1 \log(nox_i) + \beta_2 \log(dist_i) + \beta_3 rooms_i + \beta_4 stratio_i + u_i$
2. Test the hypothesis that the elasticity of price with respect to nox is negative one against the
alternative that it does not equal negative one. Write the hypothesis. Determine the t-statistic and the
critical value at $\alpha = 5\%$. What is your conclusion?
Note: The t-stat should be computed as
$tstat = \dfrac{\hat\beta_1 - \beta_1}{sd(\hat\beta_1)} = \dfrac{\hat\beta_1 - (-1)}{sd(\hat\beta_1)}$
3. Calculate the p-value for testing the hypothesis that the elasticity of price with respect to nox is
negative one against the alternative that it does not equal negative one.
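A minimal Stata sketch of how items 1 to 3 could be carried out. It assumes that hprice2s8.dta stores the variables under the names price, nox, dist, rooms, and stratio as described above; the ln_* variables and the scalar names are hypothetical and introduced here only for illustration.

```stata
* Assumption: price, nox, dist, rooms, stratio exist under these names
use hprice2s8.dta, clear
gen ln_price = ln(price)
gen ln_nox   = ln(nox)
gen ln_dist  = ln(dist)

* Item 1: estimate the regression
regress ln_price ln_nox ln_dist rooms stratio

* Item 2: H0: beta_1 = -1 against H1: beta_1 != -1
scalar tstat = (_b[ln_nox] - (-1))/_se[ln_nox]
* Two-sided 5% critical value with the residual degrees of freedom
scalar crit = invttail(e(df_r), 0.025)

* Item 3: two-sided p-value
scalar pval = 2*ttail(e(df_r), abs(tstat))
display "tstat = " tstat "   critical value = " crit "   p-value = " pval

* Built-in cross-check: reports F = tstat^2 with the same p-value
test ln_nox = -1
```

H0 is rejected at the 5% level when the absolute value of tstat exceeds crit.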
Example 4.7 Effect of job training on firm scrap rates
Data: jtrains8.dta
The scrap rate for a manufacturing firm is the number of defective items out of every 100 produced. Thus,
for a given number of items produced, a decrease in the scrap rate reflects higher worker productivity. We
want to use the scrap rate to measure the effect of worker training on productivity.
1. Check the data. For the year 1987 and for non-unionized firms, how many observations does the data set have?
2. Estimate the following regression only for the year 1987 and for non-unionized firms:
$\log(scrap_i) = \beta_0 + \beta_1 hrsemp_i + \beta_2 \log(sales_i) + \beta_3 \log(employ_i) + u_i$
where hrsemp is annual hours of training per employee, sales is annual firm sales (in dollars), and
employ is the number of firm employees.
3. What is the average scrap rate and average hrsemp in 1987?
4. What is your comment on the economic significance of the training variable?
5. What about the statistical significance of the training variable?
5a. Test the hypothesis that the effect of training on scrap is zero in the population against the
alternative that it is negative. Write the hypothesis. Determine the t-statistic and the critical value
at $\alpha = 1\%$, $\alpha = 5\%$, and $\alpha = 10\%$. What is your conclusion?
5b. Calculate the p-value to test the hypothesis that the effect of training on scrap is zero in the
population against the alternative that it is negative.
Note: The one-sided p-value is obtained as one-half of the p-value for the two-tailed test.
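A minimal Stata sketch for items 1, 2, 3, and 5. It assumes that jtrains8.dta contains variables named year, union, scrap, hrsemp, sales, and employ; the ln_* variables and the scalar tstat are hypothetical names introduced here for illustration.

```stata
* Assumption: year, union, scrap, hrsemp, sales, employ exist under these names
use jtrains8.dta, clear

* Item 1: rows for 1987 and non-unionized firms (includes rows with missing values)
count if year == 1987 & union == 0

gen ln_scrap  = ln(scrap)
gen ln_sales  = ln(sales)
gen ln_employ = ln(employ)

* Item 2: the regression uses only rows with no missing values
regress ln_scrap hrsemp ln_sales ln_employ if year == 1987 & union == 0

* Item 5a: t-statistic and left-tail critical values at 1%, 5%, and 10%
scalar tstat = _b[hrsemp]/_se[hrsemp]
display "tstat = " tstat
display "critical values: " invttail(e(df_r), 0.99) "  " invttail(e(df_r), 0.95) "  " invttail(e(df_r), 0.90)

* Item 5b: left-tail p-value, P(T < tstat) = P(T > -tstat)
display "one-sided p-value = " ttail(e(df_r), -tstat)

* Item 3: average scrap rate and training hours in 1987
summarize scrap hrsemp if year == 1987
```

The left-tail p-value equals half of the two-tailed p-value only when the estimated coefficient is negative, which is the direction of the alternative.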
Section 4.4 Testing hypotheses about a single linear combination of the parameters
In this section we show how to test a single hypothesis involving more than one of the $\beta_j$. Consider a
simple model to compare the returns to education at junior colleges and four-year colleges (universities).
The model is $\log(wage) = \beta_0 + \beta_1 jc + \beta_2 univ + \beta_3 exper + u$
The hypothesis of interest is whether one year at junior college is worth one year at a university against a
one-sided alternative that a year at a junior college is worth less than a year at a university. Thus:
$H_0: \beta_1 = \beta_2$, $H_1: \beta_1 < \beta_2$.
The t-statistic for these hypotheses is
$t = \dfrac{\hat\beta_1 - \hat\beta_2}{se(\hat\beta_1 - \hat\beta_2)}$
1. Estimate with OLS the regression
$\log(wage_i) = \beta_0 + \beta_1 jc_i + \beta_2 univ_i + \beta_3 exper_i + u_i$
Answer:
use twoyears8.dta, clear
desc
Contains data from C:\STORAGE\Z_Copy\Dept IE\Ekonometri 1_S2\BKF\Data\Data stata
8\twoyears8.dta
obs: 6,763
vars: 22 27 Feb 2012 08:36
size: 311,098 (97.3% of memory free)
------------------------------------------------------------------------------
storage display value
variable name type format label variable label
------------------------------------------------------------------------------
female byte %8.0g =1 if female
phsrank byte %8.0g % high school rank; 100 = best
ba byte %8.0g =1 if bachelor's degree
aa byte %8.0g =1 if associate's degree
black byte %8.0g =1 if african-american
hispanic byte %8.0g =1 if hispanic
id long %12.0g id number
exper int %8.0g total (actual) work experience
jc float %9.0g total 2-year credits
univ float %9.0g total 4-year credits
lwage float %9.0g log hourly wage
stotal float %9.0g total standardized test score
smcity byte %8.0g =1 if small city, 1972
medcity byte %8.0g =1 if med. city, 1972
submed byte %8.0g =1 if suburb med. city, 1972
lgcity byte %8.0g =1 if large city, 1972
sublg byte %8.0g =1 if suburb large city, 1972
vlgcity byte %8.0g =1 if very large city, 1972
subvlg byte %8.0g =1 if sub. very lge. city, 1972
variabl0 byte %8.0g =1 if northeast
nc byte %8.0g =1 if north central
south byte %8.0g =1 if south
------------------------------------------------------------------------------
sum lwage jc univ exper
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
lwage | 6763 2.248096 .4876918 .5555456 3.911953
jc | 6763 .3388946 .7721268 0 3.833333
univ | 6763 1.926274 2.297001 0 7.5
exper | 6763 122.3816 33.42799 3 166
reg lwage jc univ exper
Source | SS df MS Number of obs = 6763
-------------+------------------------------ F( 3, 6759) = 644.53
Model | 357.752575 3 119.250858 Prob > F = 0.0000
Residual | 1250.54352 6759 .185019014 R-squared = 0.2224
-------------+------------------------------ Adj R-squared = 0.2221
Total | 1608.29609 6762 .237843255 Root MSE = .43014
------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
jc | .0666967 .0068288 9.77 0.000 .0533101 .0800833
univ | .0768762 .0023087 33.30 0.000 .0723504 .0814021
exper | .0049442 .0001575 31.40 0.000 .0046355 .0052529
_cons | 1.472326 .0210602 69.91 0.000 1.431041 1.51361
------------------------------------------------------------------------------
2. Find $se(\hat\beta_1 - \hat\beta_2) = \{[se(\hat\beta_1)]^2 + [se(\hat\beta_2)]^2 - 2 s_{12}\}^{1/2}$
Answer:
qui reg lwage jc univ exper
matrix v=e(V)
matrix list v
symmetric v[4,4]
jc univ exper _cons
jc .00004663
univ 1.928e-06 5.330e-06
exper -1.718e-08 3.933e-08 2.480e-08
_cons -.00001741 -.00001573 -3.105e-06 .00044353
scalar s12 = v[2,1]
scalar se_jc_min_univ=(_se[jc]^2 + _se[univ]^2 -2*s12)^(1/2)
disp se_jc_min_univ
.00693591
From the last command we get $se(\hat\beta_1 - \hat\beta_2) = 0.00693591$.
3. Find the p-value for the test.
Answer:
The t-statistic for these hypotheses is
$t = \dfrac{\hat\beta_1 - \hat\beta_2}{se(\hat\beta_1 - \hat\beta_2)}$
gen t = (_b[jc]-_b[univ])/se_jc_min_univ
disp t
-1.4676566
disp 1-ttail(6759, -1.4676566)
.07112203
The p-value for this test is 0.07112203.
Since the one-sided p-value (7.11%) is greater than 5%, H0 is not rejected at the 5%
significance level: the data do not provide enough evidence that one year of junior college is worth less than one year at a university.
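The standard error and t-statistic computed by hand in items 2 and 3 can be cross-checked with Stata's lincom command, which estimates a linear combination of coefficients after a regression. This is a sketch under the same setup as above (twoyears8.dta in the working directory); the scalar names t_lincom and df_lincom are introduced here for illustration.

```stata
use twoyears8.dta, clear
quietly regress lwage jc univ exper

* Estimate beta_1 - beta_2 together with its standard error
lincom jc - univ

* lincom leaves the estimate, its standard error, and the residual df in r()
scalar t_lincom  = r(estimate)/r(se)
scalar df_lincom = r(df)
display "t = " t_lincom
* Left-tail p-value for H1: beta_1 < beta_2
display "one-sided p-value = " ttail(df_lincom, -t_lincom)
```

The reported estimate and standard error should agree with the hand calculation above up to rounding.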
Define a new parameter $\theta_1 = \beta_1 - \beta_2$ and rearrange the model to give:
$\log(wage) = \beta_0 + \beta_1 jc - \beta_2 jc + \beta_2 jc + \beta_2 univ + \beta_3 exper + u$
$= \beta_0 + (\beta_1 - \beta_2) jc + \beta_2 (jc + univ) + \beta_3 exper + u$
$= \beta_0 + \theta_1 jc + \beta_2 totcoll + \beta_3 exper + u$
4. Estimate the modified model with OLS.
Answer:
gen totcoll=jc+univ
reg lwage jc totcoll exper
Source | SS df MS Number of obs = 6763
-------------+------------------------------ F( 3, 6759) = 644.53
Model | 357.752575 3 119.250858 Prob > F = 0.0000
Residual | 1250.54352 6759 .185019014 R-squared = 0.2224
-------------+------------------------------ Adj R-squared = 0.2221
Total | 1608.29609 6762 .237843255 Root MSE = .43014
------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
jc | -.0101795 .0069359 -1.47 0.142 -.0237761 .003417
totcoll | .0768762 .0023087 33.30 0.000 .0723504 .0814021
exper | .0049442 .0001575 31.40 0.000 .0046355 .0052529
_cons | 1.472326 .0210602 69.91 0.000 1.431041 1.51361
------------------------------------------------------------------------------
5. Test $H_0: \theta_1 = 0$ against the one-sided alternative $H_1: \theta_1 < 0$
Answer:
$t = -0.0101795/0.0069359 = -1.4676$
The one-sided (left-tail) critical value at $\alpha = 5\%$ is -1.65. It can be obtained with the command
disp invttail(6759,0.95), which yields -1.6450791.
Since the absolute value of the t-stat is less than the absolute value of the critical value, $H_0$ is not rejected at the 5% level. We
cannot reject $\theta_1 = 0$, that is, $\beta_1 = \beta_2$.
6. What is the p-value for the test?
Answer:
The one-sided (left-tail) p-value is 0.071. It can be obtained with the command disp 1-ttail(6759,-1.47)
or the command disp ttail(6759,1.47), which yields 0.07080415, or by
taking the P>|t| value from the Stata output (0.142) and dividing it by two.
Section 4.5 Testing multiple linear restrictions: The F test
Data: mlb1s8.dta
1. Check the data.
2. Estimate the regression model that explains major league baseball players' salaries:
$\log(salary_i) = \beta_0 + \beta_1 years_i + \beta_2 gamesyr_i + \beta_3 bavg_i + \beta_4 hrunsyr_i + \beta_5 rbisyr_i + u_i$
where salary is the 1993 total salary, years is years in the league, gamesyr is average games played
per year, bavg is career batting average, hrunsyr is home runs per year, and rbisyr is runs batted in per
year.
3. Test whether bavg, hrunsyr, and rbisyr are jointly statistically insignificant, using the SSR form of
the F-test.
4. Test whether bavg, hrunsyr, and rbisyr are jointly statistically insignificant, using the R-squared form of
the F-test.
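A minimal Stata sketch of both forms of the F-test for items 3 and 4, with q = 3 restrictions. It assumes that mlb1s8.dta contains variables named salary, years, gamesyr, bavg, hrunsyr, and rbisyr; ln_salary and the scalar names are hypothetical and introduced here for illustration.

```stata
* Assumption: salary, years, gamesyr, bavg, hrunsyr, rbisyr exist under these names
use mlb1s8.dta, clear
gen ln_salary = ln(salary)

* Unrestricted model: keep SSR, R-squared, and residual df (n - k - 1)
quietly regress ln_salary years gamesyr bavg hrunsyr rbisyr
scalar ssr_ur = e(rss)
scalar r2_ur  = e(r2)
scalar df_ur  = e(df_r)

* Restricted model: drop bavg, hrunsyr, rbisyr
quietly regress ln_salary years gamesyr
scalar ssr_r = e(rss)
scalar r2_r  = e(r2)

* Number of restrictions
scalar q = 3

* SSR form and R-squared form of the F statistic
scalar F_ssr = ((ssr_r - ssr_ur)/q)/(ssr_ur/df_ur)
scalar F_r2  = ((r2_ur - r2_r)/q)/((1 - r2_ur)/df_ur)
display "F (SSR form) = " F_ssr "   F (R-squared form) = " F_r2
display "p-value = " Ftail(q, df_ur, F_ssr)

* Built-in cross-check after re-estimating the unrestricted model
quietly regress ln_salary years gamesyr bavg hrunsyr rbisyr
test bavg hrunsyr rbisyr
```

The two forms give the same value because both regressions share the same dependent variable and therefore the same total sum of squares.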
Example 5.3 Testing multiple linear restrictions: The LM test
Data: crime1.dta
1. Check the data (number of observations and number of variables)
Answer: # obs is 2725; # variables is 16
2. We will conduct the LM test using the crime model below
$narr86 = \beta_0 + \beta_1 pcnv + \beta_2 avgsen + \beta_3 tottime + \beta_4 ptime86 + \beta_5 qemp86 + u$
Estimate the model using OLS.
Answer:
reg narr86 pcnv avgsen tottime ptime86 qemp86
Source | SS df MS Number of obs = 2725
-------------+------------------------------ F( 5, 2719) = 24.29
Model | 85.9532425 5 17.1906485 Prob > F = 0.0000
Residual | 1924.39391 2719 .707757967 R-squared = 0.0428
-------------+------------------------------ Adj R-squared = 0.0410
Total | 2010.34716 2724 .738012906 Root MSE = .84128
------------------------------------------------------------------------------
narr86 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pcnv | -.1512246 .040855 -3.70 0.000 -.2313346 -.0711145
avgsen | -.0070487 .0124122 -0.57 0.570 -.031387 .0172897
tottime | .0120953 .0095768 1.26 0.207 -.0066833 .030874
ptime86 | -.0392585 .0089166 -4.40 0.000 -.0567425 -.0217745
qemp86 | -.1030909 .0103972 -9.92 0.000 -.1234782 -.0827037
_cons | .7060607 .0331524 21.30 0.000 .6410542 .7710671
------------------------------------------------------------------------------
3. Use the LM statistic to test the null hypothesis that avgsen and tottime have no effect on narr86
once the other factors have been controlled for.
Step 1. Estimate the restricted model
reg narr86 pcnv ptime86 qemp86
Source | SS df MS Number of obs = 2725
-------------+------------------------------ F( 3, 2721) = 39.10
Model | 83.0741941 3 27.691398 Prob > F = 0.0000
Residual | 1927.27296 2721 .708295833 R-squared = 0.0413
-------------+------------------------------ Adj R-squared = 0.0403
Total | 2010.34716 2724 .738012906 Root MSE = .8416
------------------------------------------------------------------------------
narr86 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pcnv | -.1499274 .0408653 -3.67 0.000 -.2300576 -.0697973
ptime86 | -.0344199 .008591 -4.01 0.000 -.0512655 -.0175744
qemp86 | -.104113 .0103877 -10.02 0.000 -.1244816 -.0837445
_cons | .7117715 .0330066 21.56 0.000 .647051 .776492
------------------------------------------------------------------------------
Step 2. Obtain the residuals $\tilde{u}$ from the regression.
predict ures, resid
Step 3. Run the regression of $\tilde{u}$ on pcnv, ptime86, qemp86, avgsen, and tottime
reg ures pcnv avgsen tottime ptime86 qemp86
Source | SS df MS Number of obs = 2725
-------------+------------------------------ F( 5, 2719) = 0.81
Model | 2.87904835 5 .575809669 Prob > F = 0.5398
Residual | 1924.39392 2719 .707757969 R-squared = 0.0015
-------------+------------------------------ Adj R-squared = -0.0003
Total | 1927.27297 2724 .707515773 Root MSE = .84128
------------------------------------------------------------------------------
ures | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pcnv | -.0012971 .040855 -0.03 0.975 -.0814072 .0788129
avgsen | -.0070487 .0124122 -0.57 0.570 -.031387 .0172897
tottime | .0120953 .0095768 1.26 0.207 -.0066833 .030874
ptime86 | -.0048386 .0089166 -0.54 0.587 -.0223226 .0126454
qemp86 | .0010221 .0103972 0.10 0.922 -.0193652 .0214093
_cons | -.0057108 .0331524 -0.17 0.863 -.0707173 .0592956
------------------------------------------------------------------------------
Step 4. Calculate the LM statistic
. scalar lm = e(N)*e(r2)
. disp lm
4.0707294
Step 5a. Calculate the p-value
. disp chi2tail(2,lm)
.13063283
Step 5b. Calculate the critical value at $\alpha = 10\%$
. disp invchi2tail(2,0.1)
4.6051702
Step 6. Conclude
Since the LM statistic (4.0707) is smaller than the critical value (4.605), we fail to reject the null
hypothesis that $\beta_2 = 0$ and $\beta_3 = 0$ at the 10% level.
The p-value is $P(\chi^2_2 > 4.0707294) = 0.1306$, so we would reject the null at the 15% level.
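The arithmetic in Steps 4 to 6 can be re-checked from the Step 3 output alone, without re-running the regressions. This is a sketch: the scalar names n, r2_u, and lm_check are introduced here for illustration, and the sums of squares are copied from the Step 3 table. For a chi-square distribution with 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2), which gives an independent check on chi2tail.

```stata
* Number of observations and R-squared of the auxiliary regression in Step 3
scalar n    = 2725
scalar r2_u = 2.87904835/1927.27297

* LM statistic: n times the R-squared of the auxiliary regression
scalar lm_check = n*r2_u
display "LM = " lm_check

* p-value from the chi-square(2) distribution, two equivalent ways
display "p-value (chi2tail)  = " chi2tail(2, lm_check)
display "p-value (exp(-x/2)) = " exp(-lm_check/2)

* 10% critical value, two equivalent ways
display "critical value (invchi2tail) = " invchi2tail(2, 0.10)
display "critical value (-2*ln(0.10)) = " -2*ln(0.10)
```

The results should agree with the values reported in Steps 4, 5a, and 5b up to rounding.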
Problems 3.4 Heteroskedasticity
Data: sleep75.dta
1. Check the data
2. The model below
$sleep = \beta_0 + \beta_1 totwrk + \beta_2 educ + \beta_3 age + u$
can be used to study the tradeoff between time spent sleeping and working and to look at other
factors affecting sleep. If adults trade off sleep for work, what is the sign of $\beta_1$? What signs do
you think $\beta_2$ and $\beta_3$ will have?
3. Estimate the model.
4. If someone works five more hours per week, by how many minutes is sleep predicted to fall? Is
this a large tradeoff?
5. Discuss the sign and magnitude of the estimated coefficient on educ.
6. Would you say totwrk, educ, and age explain much of the variation in sleep? What other factors
might affect the time spent sleeping? Are these likely to be correlated with totwrk? [Note: F stat]
7. Explain intuitively the procedures of the Breusch-Pagan and White tests for the presence of
heteroskedastic errors. Compare the two approaches. [Note: with Breusch-Pagan]
8. Conduct the Breusch-Pagan test for heteroskedasticity for the error term in the equation above
and explain whether you think the error u is heteroskedastic.
9. Conduct the White test for heteroskedasticity for the error term in the equation above and explain
whether you think the error u is heteroskedastic.
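A minimal Stata sketch for items 8 and 9. It assumes that sleep75.dta contains variables named sleep, totwrk, educ, and age; uhat, uhat2, and lm_bp are hypothetical names introduced here for illustration. The Breusch-Pagan part follows the same LM recipe used in Example 5.3 (an auxiliary regression, then n times its R-squared), applied to the squared OLS residuals.

```stata
* Assumption: sleep, totwrk, educ, age exist under these names
use sleep75.dta, clear
regress sleep totwrk educ age

* Breusch-Pagan by hand: regress the squared residuals on the regressors
predict uhat, resid
gen uhat2 = uhat^2
quietly regress uhat2 totwrk educ age
scalar lm_bp = e(N)*e(r2)
display "BP LM = " lm_bp "   p-value = " chi2tail(e(df_m), lm_bp)
* F version: the overall F statistic of the auxiliary regression
display "BP F  = " e(F) "   p-value = " Ftail(e(df_m), e(df_r), e(F))

* Built-in versions, run after the original regression
quietly regress sleep totwrk educ age
* n*R-squared form of Breusch-Pagan on the right-hand-side variables
estat hettest, rhs iid
* White test: adds squares and cross products of the regressors
estat imtest, white
```

A small p-value in either test is evidence against homoskedasticity. The White test uses more auxiliary regressors, so it is more general but spends more degrees of freedom.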