
Contents

2 Tests and Confidence Regions
  2.1 Restricted Least Squares
      2.1.1 Linear Restrictions
      2.1.2 Restricted LS Estimator
      2.1.3 Expectation and Variance
      2.1.4 Properties with Normal Errors
      2.1.5 Asymptotic Properties
  2.2 Hypothesis Testing
      2.2.1 Steps of Hypothesis Testing
      2.2.2 Power of a Test
      2.2.3 The F-Test Statistic
      2.2.4 Normal Errors
      2.2.5 Asymptotic Test
      2.2.6 Consistency
      2.2.7 Examples
      2.2.8 P-Value (Observed Significance Level)
      2.2.9 Wald Test of Linear Restrictions
      2.2.10 Wald Test of Nonlinear Restrictions
  2.3 Confidence Region
      2.3.1 Univariate Case
      2.3.2 Multivariate Case

A Prerequisites: Tests of Hypotheses

B Further and Deeper Details
  B.1 Derivation of the Restricted Least Squares Estimator
  B.2 Oblique Projection
  B.3 Proof of the Asymptotic Distribution of RLS (Theorem 2.1.6)
  B.4 Proof of the Distribution of the F Statistic (Theorem 2.2.1)
Chapter 2

Tests and Confidence Regions

Learning Objectives of this Chapter

At the end of this chapter, you should be able to:

→ Perform hypothesis testing with the linear regression model:

  – State the null and alternative hypotheses,
  – Determine the test statistic,
  – Determine its behavior,
  – Determine a decision rule,
  – Determine the result of the test,
  – Perform a test with a p-value.

→ Build a confidence region.

→ Know the usual tests (F-test, t-test, Wald test).

→ Know and be able to show the main properties of the Restricted Least Squares estimator.

Standard Assumptions

Assumption 1 (Iid). (y_i, x_{1i}, . . . , x_{Ki}), i = 1, . . . , n, are independent and identically distributed.

Assumption 2 (Errors). E(ε | x₁, . . . , x_K) = 0.

Assumption 3 (Moments). All variables have a bounded moment of order 4, i.e. E(x_j⁴) < ∞, j = 1, . . . , K, and E(y⁴) < ∞.

Assumption 4 (Rank). No perfect collinearity among the explanatory variables.

Assumption 5 (Variance). Error terms are homoskedastic.

2.1 Restricted Least Squares

2.1.1 Linear Restrictions

Examples

→ Consider the model y = X₁β₁ + X₂β₂ + ε.

  X₂ can be omitted, i.e. β₂ = 0.

→ Constant returns to scale, i.e. β₁ + β₂ = 1, in

      ln Q = β₀ + β₁ ln K + β₂ ln L + ε ,    with β₀ = ln A ,

  where Q is production, K capital, and L labor.

Implicit Restrictions

      Rβ = r

R is a known (K − M) × K full-rank matrix, r is a known (K − M)-vector.

→ There are K − M restrictions; M is the number of unrestricted (free) parameters.

→ Example: constant returns to scale in

      ln Q = β₀ + β₁ ln(K) + β₂ ln(L) + ε

  is

      (0  1  1) (β₀, β₁, β₂)′ = 1 .

→ R has full rank, so there are no redundant restrictions: ρ(R) = K − M.


Explicit Restrictions

      β = Sγ + s

S is a known K × M full-rank matrix, γ is an unknown M-vector, s is a known K-vector.

→ There are M free parameters γ.

→ Example: constant returns to scale: β₁ + β₂ = 1, i.e. β₂ = 1 − β₁:

      (β₀, β₁, β₂)′ = S (γ₀, γ₁)′ + s ,    S = [1 0; 0 1; 0 −1] ,    s = (0, 0, 1)′ ,

  so that β₀ = γ₀, β₁ = γ₁, and β₂ = 1 − γ₁ = 1 − β₁.

→ One can go from explicit to implicit form and vice versa.

Here we consider implicit restrictions. See Ruud (Chapter 4) for a treatment of restricted least squares under explicit restrictions.
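To make the correspondence concrete, here is a minimal numerical sketch (not part of the notes) using the constant-returns example above: the explicit form β = Sγ + s describes exactly the solutions of the implicit form Rβ = r, which requires RS = 0 and Rs = r.

```python
# Minimal check that the implicit and explicit forms of the
# constant-returns-to-scale restriction agree: beta satisfies R beta = r
# exactly when beta = S gamma + s for some gamma (requires RS = 0, Rs = r).
import numpy as np

R = np.array([[0.0, 1.0, 1.0]])        # implicit form: R beta = r
r = np.array([1.0])
S = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, -1.0]])            # explicit form: beta = S gamma + s
s = np.array([0.0, 0.0, 1.0])

print(R @ S)        # [[0. 0.]] -> R S = 0
print(R @ s, r)     # [1.] [1.] -> R s = r

gamma = np.array([0.3, 0.8])           # any value of the free parameters
beta = S @ gamma + s
print(R @ beta)     # [1.] -> the restriction holds for every gamma
```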

2.1.2 Restricted LS Estimator

We consider the linear regression

      y = Xβ + ε

under the restriction that

      Rβ = r .

So one should solve the restricted LS problem

      β̂_R = arg min ε′ε = arg min (y − Xβ)′(y − Xβ)  subject to  Rβ = r .

The Lagrangian form of the problem is

      arg min_{β,λ} L(β, λ) ,    where L(β, λ) = (y − Xβ)′(y − Xβ) + λ′(Rβ − r)

and λ is the Lagrange multiplier (a (K − M)-vector). The solutions are (see B.1 for details):

      λ̂ = 2 [R(X′X)⁻¹R′]⁻¹ (Rβ̂ − r) ,

      β̂_R = β̂ − (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ (Rβ̂ − r)          (2.1)
           = Aβ̂ + b ,                                         (2.2)

where

      A = I − (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ R ,
      b = (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ r .

When Rβ̂ = r, then β̂_R = β̂. In general the two estimators differ, since (X′X)⁻¹R′[R(X′X)⁻¹R′]⁻¹ has full rank provided X and R have full rank.

Exercise 2.1. Check that Rβ̂_R = r.

Note: It so happens that C = (X′X)⁻¹R′[R(X′X)⁻¹R′]⁻¹R and A = I − C are oblique (non-orthogonal) projection matrices. See B.2 for details.
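As a quick sanity check, a minimal numerical sketch (simulated data, not from the notes) that computes β̂_R via formula (2.1) and verifies that it satisfies the restriction exactly:

```python
# Restricted LS via formula (2.1), on simulated data.
import numpy as np

rng = np.random.default_rng(0)
n, K = 200, 3
X = rng.normal(size=(n, K))
beta_true = np.array([0.5, 0.7, 0.3])            # satisfies b1 + b2 = 1
y = X @ beta_true + rng.normal(size=n)

R = np.array([[0.0, 1.0, 1.0]])                  # restriction: b1 + b2 = 1
r = np.array([1.0])

XtX_inv = np.linalg.inv(X.T @ X)
beta_ols = XtX_inv @ X.T @ y

# beta_R = beta_ols - (X'X)^{-1} R' [R (X'X)^{-1} R']^{-1} (R beta_ols - r)
middle = np.linalg.inv(R @ XtX_inv @ R.T)
beta_rls = beta_ols - XtX_inv @ R.T @ middle @ (R @ beta_ols - r)

print(R @ beta_rls)   # [1.] -- the restriction holds exactly
```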

2.1.3 Expectation and Variance

Theorem 2.1.1. Under Assumptions Iid, Errors, Rank,

      E(β̂_R | X) = β − (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ (Rβ − r) .

So β̂_R is unbiased if Rβ = r.

Theorem 2.1.2. Under Assumptions Iid, Errors, Rank, Variance,

      Var[β̂_R | X] = A Var[β̂ | X] A′ = σ² A (X′X)⁻¹ A′ = σ² A (X′X)⁻¹ .

Exercise 2.2. Show that A(X′X)⁻¹A′ = A(X′X)⁻¹ = (X′X)⁻¹A′. Deduce that Var[β̂_R | X] = σ² A(X′X)⁻¹.

Theorem 2.1.3. Under Assumptions Iid, Errors, Rank, Variance,

      Var[β̂ | X] − Var[β̂_R | X] = σ² (I − A)(X′X)⁻¹
                                  = σ² (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ R(X′X)⁻¹

is positive semi-definite. Hence, if the restrictions hold, β̂_R is more efficient than β̂.

2.1.4 Properties with Normal Errors

Theorem 2.1.4. If the restrictions hold, y | X ∼ N(Xβ, σ²I), and Assumption Rank holds, then

→ β̂_R | X ∼ N(β, σ² A(X′X)⁻¹),

→ the restricted LS estimator is efficient in the class of unbiased estimators.


2.1.5 Asymptotic Properties

Consistency

Theorem 2.1.5. If the restrictions hold, then under Assumptions Iid, Errors, Moments, Rank,

      β̂_R →ᵖ β .

Exercise 2.3. Given that n⁻¹X′X →ᵖ Q and β̂ →ᵖ β, show this result.

Asymptotic Distribution

Theorem 2.1.6. If the restrictions hold, then under Assumptions Iid, Errors, Moments, Rank, Variance,

      √n (β̂_R − β) →ᵈ N(0, σ² V) ,

where V = Q⁻¹ − Q⁻¹R′(RQ⁻¹R′)⁻¹RQ⁻¹ and Q = plim n⁻¹X′X.

Proof. See Appendix B.3.

Incorrect Restrictions

If Rβ ≠ r, then β̂_R is biased, even asymptotically. Hence it cannot be consistent. Consider for instance omitted variables, that is,

      y = X₁β₁ + X₂β₂ + ε

with β₂ ≠ 0, but we regress y on X₁ only, that is, we falsely assume β₂ = 0. Then β̂₁ = (X₁′X₁)⁻¹X₁′y is biased for the true value β₁, unless X₂ and X₁ are orthogonal, see Chap. 1.
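To illustrate, a minimal simulation sketch (assumed data generating process, not from the notes): imposing a false restriction β₂ = 0 leaves the restricted estimator biased no matter how large n is.

```python
# Omitted-variable bias: regressing y on x1 alone when x2 matters and is
# correlated with x1 biases the estimate of beta1, even asymptotically.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)             # x2 correlated with x1
y = 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)   # true beta1 = 1, beta2 = 0.5

# Restricted fit: regress y on x1 only (falsely assumes beta2 = 0)
b1_restricted = (x1 @ y) / (x1 @ x1)
print(b1_restricted)   # ~1.4 = 1 + 0.5 * 0.8, not 1: the bias persists
```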

2.2 Hypothesis Testing

2.2.1 Steps of Hypothesis Testing

In what follows, we take as an example a test of a univariate restriction in a linear model. For an application to the test of a mean, see Appendix A.

Example: consider the following linear model:

      y = β₁x₁ + x₂′β₂ + ε .

1. Determine the hypotheses of interest.

   → Alternative hypothesis H₁: what we would like to show.
   → Null hypothesis H₀: the complement of H₁, but the equal sign should be in H₀.

   Example:

   → One-sided (one-tailed) test: H₀: β₁ ≥ b against H₁: β₁ < b.
   → Two-sided (two-tailed) test: H₀: β₁ = b against H₁: β₁ ≠ b.

2. Determine the test statistic t, whose behavior is known under H₀, or when at the boundary between H₀ and H₁.

   Example: under H₀ (at the boundary β₁ = b),

      t̂ = (β̂₁ − b) / s.e.(β̂₁) →ᵈ N(0, 1) .

   Note: this result will be shown later.

3. Choose the level of the test and determine the rejection (or critical) region.

   → Type I error: rejecting H₀ when H₀ is true.
   → The level is the maximum type I error probability, sup_{μ∈H₀} Pr[Reject H₀ | μ ∈ H₀].
   → Rejection (or critical) region R:

         t̂ ∈ R ⇔ the test rejects H₀ .

     R is determined so that the level of the test equals α.
   → Type II error: not rejecting H₀ when H₀ is false.

   Example: level of the test: α. See Figure 2.1 for a representation of the rejection region.

4. Compute the test statistic and conclude.

   Example:

   → One-sided test: H₀ is rejected if t̂ < −z_{1−α}.
   → Two-sided test: H₀ is rejected if |t̂| > z_{1−α/2}.


Figure 2.1: Example: Test on a Univariate Restriction

2.2.2 Power of a Test

The power of a test is the probability that the test rejects H₀ when H₀ is not true. Take the instance of a one-sided test on a slope parameter:

      H₀: β₁ ≥ b against H₁: β₁ < b.

The power function is

      Power(β₁) = Pr[Reject H₀ | β₁ ∈ H₁] .

Hence Power(β₁) is 1 minus the probability of an error of Type II, that is, 1 − β(β₁).

Probabilities of Error

                                   Truth
                              H₀            H₁
  Decision  0 (keep H₀)     1 − α         β(β₁)
            1 (reject H₀)     α         1 − β(β₁)

What is the power of the t-test for H₁: β₁ < b? The test statistic is

      t̂ = (β̂₁ − b)/s.e.(β̂₁) = (β̂₁ − β₁)/s.e.(β̂₁) + (β₁ − b)/s.e.(β̂₁) .

The first term is asymptotically distributed as N(0, 1) whatever the true value of β₁ (this result will be shown later). Hence the power of the t-test when the true value equals β₁ is

      Pr[t̂ < −z_{1−α}] = Pr[(β̂₁ − β₁)/s.e.(β̂₁) + (β₁ − b)/s.e.(β̂₁) < −z_{1−α}]
                        = Pr[z < −z_{1−α} − (β₁ − b)/s.e.(β̂₁)] ,

where z is a random variable that is asymptotically distributed as a standard Normal.

As

      s.e.(β̂₁) = sqrt( s² (X′X)⁻¹_{(1,1)} ) = sqrt( (1/(n − K)) ε̂′ε̂ (X′X)⁻¹_{(1,1)} ) ,

we get that s.e.(β̂₁) → 0 as n → +∞.

The test's power indeed depends on the true β₁. But when n → ∞, it tends to one for any β₁ < b.
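A minimal sketch (hypothetical numbers, not from the notes) of this asymptotic power function, Power(β₁) ≈ Φ(−z_{1−α} − (β₁ − b)/s.e.(β̂₁)):

```python
# Asymptotic power of the one-sided t-test of H0: beta1 >= b vs H1: beta1 < b.
from scipy.stats import norm

def power(beta1, b=0.0, se=0.1, alpha=0.05):
    """Asymptotic power: Phi(-z_{1-alpha} - (beta1 - b)/se)."""
    z = norm.ppf(1 - alpha)
    return norm.cdf(-z - (beta1 - b) / se)

for beta1 in [-0.3, -0.2, -0.1, 0.0]:
    print(beta1, round(power(beta1), 3))
# Power rises toward 1 as beta1 moves below b, and equals alpha at beta1 = b.
# A smaller se (larger n) pushes power toward 1 for any beta1 < b.
```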

Definition 6. A test is consistent if its power function converges to 1 as n → ∞ for any parameter value in H₁.

Exercise 2.4. For the two-sided t-test, write the power function and show that the test is consistent.

2.2.3 The F-Test Statistic

We want to test H₀: Rβ = r against H₁: Rβ ≠ r. The F-test statistic is

      F̂ = [ ‖Xβ̂ − Xβ̂_R‖² / (K − M) ] / s²

         = [ (Rβ̂ − r)′ (R(X′X)⁻¹R′)⁻¹ (Rβ̂ − r) / (K − M) ] / s²

         = [ (RSS_R − RSS) / (K − M) ] / [ RSS / (n − K) ] .

This corresponds to three different ways of writing the same statistic, and gives three different ways of interpreting it:

→ We compare the fitted values Xβ̂ and Xβ̂_R.

→ We compare (Rβ̂ − r) to a vector of zeros.

→ We compare RSS_R and RSS.

We now show that these three forms of the test statistic are equal, using some useful equalities.

→ We have

      ‖Xβ̂ − Xβ̂_R‖² = ‖X(β̂ − β̂_R)‖²                                  (2.3)
                     = (β̂ − β̂_R)′ (X′X) (β̂ − β̂_R)
                     = (Rβ̂ − r)′ [R(X′X)⁻¹R′]⁻¹ (Rβ̂ − r)              (2.4)

  from formula (2.1) for β̂_R. This shows that the first form of F̂ is equal to the second one.

→ We know that for any β,

      y − Xβ = (y − Xβ̂) + (Xβ̂ − Xβ)

  and the two parts are orthogonal (remember that y − Xβ̂ = M_X y and M_X X = 0), so

      ‖y − Xβ‖² = ‖y − Xβ̂‖² + ‖Xβ̂ − Xβ‖² .

  In particular, for β = β̂_R:

      ‖y − Xβ̂_R‖² = ‖y − Xβ̂‖² + ‖Xβ̂ − Xβ̂_R‖² ,
      i.e.  RSS_R = RSS + ‖Xβ̂ − Xβ̂_R‖² .                              (2.5)

→ If you also use that s² = RSS/(n − K), this shows that F̂ can also be written using the third formula.
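A minimal numerical sketch (simulated data, not from the notes) checking that the three forms coincide:

```python
# The three forms of the F statistic agree numerically.
import numpy as np

rng = np.random.default_rng(2)
n, K = 200, 3
X = rng.normal(size=(n, K))
y = X @ np.array([0.5, 0.7, 0.3]) + rng.normal(size=n)

R = np.array([[0.0, 1.0, 1.0]]); r = np.array([1.0])   # H0: b1 + b2 = 1
M = K - R.shape[0]                                      # free parameters

XtX_inv = np.linalg.inv(X.T @ X)
b_ols = XtX_inv @ X.T @ y
b_rls = b_ols - XtX_inv @ R.T @ np.linalg.solve(R @ XtX_inv @ R.T,
                                                R @ b_ols - r)
RSS  = np.sum((y - X @ b_ols) ** 2)
RSSR = np.sum((y - X @ b_rls) ** 2)
s2 = RSS / (n - K)

d = R @ b_ols - r
F1 = np.sum((X @ (b_ols - b_rls)) ** 2) / (K - M) / s2
F2 = (d @ np.linalg.solve(R @ XtX_inv @ R.T, d)) / (K - M) / s2
F3 = ((RSSR - RSS) / (K - M)) / (RSS / (n - K))
print(F1, F2, F3)   # all three agree
```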

2.2.4 Normal Errors

Theorem 2.2.1. If the restrictions hold, y | X ∼ N(Xβ, σ²I), and Assumption Rank holds, then F̂ ∼ F_{K−M, n−K}.

Proof. See Appendix B.4.

Hence the rejection region is

      F̂ > F_{K−M, n−K, 1−α} ,

where F_{K−M, n−K, 1−α} is the 1 − α quantile of an F_{K−M, n−K}. See Figure 2.2.


Figure 2.2: F-Test: Critical Region

Case of a univariate restriction  Let y = β₁x₁ + x₂′β₂ + ε, and consider H₀: β₁ = b against H₁: β₁ ≠ b. Then

      F̂ = (β̂₁ − b)² / [ s² e₁′(X′X)⁻¹e₁ ] ,    e₁ = (1, 0, . . . , 0)′ ,      (2.6)

         = [ (β̂₁ − b) / s.e.(β̂₁) ]² = t̂² ,                                    (2.7)

where t̂ is the Student statistic.

Rejection region: |t̂| > t_{n−K, 1−α/2}.
Exercise 2.5. A regression of log earnings on (years of) education and (years of) work experience for 655 women yields the estimated linear regression

      Log Earnings = 6.662 + 0.159 Educ + 0.051 Exp
                    (0.209)  (0.014)     (0.011)

with standard errors in parentheses. Test whether the coefficient of education is zero. Then test whether the coefficient of education is equal to 0.16. What does it mean in economic terms?


One-sided test of a univariate inequality  For H₀: β₁ ≥ b against H₁: β₁ < b, the critical region is

      t̂ < −t_{n−K, 1−α} .

2.2.5 Asymptotic Test

Theorem 2.2.2. If the restrictions hold, then under Assumptions Iid, Errors, Moments, Rank, Variance,

      (K − M) F̂ →ᵈ χ²_{K−M} .

Proof.

      (K − M) F̂ = [ (Rβ̂ − r)′ (R(X′X)⁻¹R′)⁻¹ (Rβ̂ − r) / σ² ] / [ s²/σ² ] = A/B .

Under H₀,

      √n (Rβ̂ − r) →ᵈ N(0, σ² RQ⁻¹R′)

with Q = plim n⁻¹X′X. So A →ᵈ χ²_{K−M}. Moreover, B = s²/σ² →ᵖ 1. Hence

      (K − M) F̂ →ᵈ χ²_{K−M} .

Hence the asymptotic rejection region is

      (K − M) F̂ > χ²_{K−M, 1−α} ,

where χ²_{K−M, 1−α} is the 1 − α quantile of a χ²_{K−M}.

The test is called asymptotic because it is based on the asymptotic distribution of the test statistic under the null hypothesis, so it has asymptotic level α.

Univariate Restriction  The rejection region is

      |t̂| > z_{1−α/2} ,

where z_{1−α/2} is the 1 − α/2 quantile of a N(0, 1), since t̂ →ᵈ N(0, 1) as n → ∞.

2.2.6 Consistency

      F̂ = √n (Rβ̂ − r)′ [R(n⁻¹X′X)⁻¹R′]⁻¹ √n (Rβ̂ − r) / [ (K − M) s² ] .

We have (Rβ̂ − r) →ᵖ (Rβ − r), n⁻¹X′X →ᵖ Q, and s² →ᵖ σ². So under H₁, F̂ →ᵖ ∞ and Pr[Reject H₀ | μ] → 1 for all μ ∈ H₁.

Exercise 2.6. Consider testing a univariate restriction H₀: β₁ = β₁₀ with the t-test. Show that the test is consistent.


2.2.7 Examples

“Usefulness of the Model”

      y = β₁ι + X₂β₂ + ε ,    ι = (1, . . . , 1)′ .

To test H₀: β₂ = 0 against H₁: β₂ ≠ 0, the statistic becomes

      F̂ = [ (TSS − RSS)/(K − 1) ] / [ RSS/(n − K) ] = [ R²/(K − 1) ] / [ (1 − R²)/(n − K) ] .

Exercise 2.7. Show the last equality.

Stability Test

Consider two groups of observations on the same variables, and

      ( y_A )   ( X_A   0  ) ( β_A )   ( ε_A )
      ( y_B ) = (  0   X_B ) ( β_B ) + ( ε_B ) ,

with dimensions ((n_A + n_B) × 1), ((n_A + n_B) × 2K), (2K × 1), and ((n_A + n_B) × 1) respectively, where ε_j | X_j ∼ N(0, σ²I) for j = A, B, and the two groups are independent.

We want to test H₀: β_A = β_B against H₁: β_A ≠ β_B.

Alternative way to write the hypotheses:

→ H₀: restricted model:

      ( y_A )   ( X_A )       ( ε_A )
      ( y_B ) = ( X_B ) β  +  ( ε_B )

→ H₁: unrestricted model: y_A = X_A β_A + ε_A and y_B = X_B β_B + ε_B can be estimated separately.

The statistic becomes

      F̂ = [ (RSS − (RSS_A + RSS_B)) / K ] / [ (RSS_A + RSS_B) / (n_A + n_B − 2K) ] .
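A minimal sketch (simulated data, not from the notes) of this stability (Chow) test:

```python
# Stability (Chow) test: compare pooled and group-specific RSS.
import numpy as np
from scipy.stats import f as f_dist

def rss(X, y):
    """Residual sum of squares of the OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

rng = np.random.default_rng(3)
nA, nB, K = 120, 150, 2
XA = np.column_stack([np.ones(nA), rng.normal(size=nA)])
XB = np.column_stack([np.ones(nB), rng.normal(size=nB)])
yA = XA @ np.array([1.0, 2.0]) + rng.normal(size=nA)
yB = XB @ np.array([1.0, 2.5]) + rng.normal(size=nB)   # slope differs

RSS_A, RSS_B = rss(XA, yA), rss(XB, yB)
RSS = rss(np.vstack([XA, XB]), np.concatenate([yA, yB]))  # restricted fit

F = ((RSS - (RSS_A + RSS_B)) / K) / ((RSS_A + RSS_B) / (nA + nB - 2 * K))
crit = f_dist.ppf(0.95, K, nA + nB - 2 * K)
print(F, crit, F > crit)   # reject H0: beta_A = beta_B at the 5% level
```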

2.2.8 P-Value (Observed Significance Level)

By definition, it is the level of the test above which the decision switches. Equivalently, it is the level of the test for which the critical value equals the test statistic.

For instance,

→ For the asymptotic F-test, the p-value is the p such that

      (K − M) F̂ = χ²_{K−M, 1−p} .

→ For the asymptotic t-test, the p-value is the p such that:

  – |t̂| = z_{1−p/2} for a two-sided test, z_q being the quantile of order q of a N(0, 1);
  – t̂ = z_{1−p} for a one-sided test.

See Figure 2.3.

With a p-value, we can run the test at any level.

Testing with a p-value

1. Determine the hypotheses of interest.

2. Determine the test statistic and its behavior under H₀ (or at the boundary between H₀ and H₁).

3. Choose the level α.

4. Compute the statistic and its p-value.

5. If p-value < α, reject H₀. Otherwise do not reject H₀.

Univariate Case  For the one-sided asymptotic test of H₀: β₁ ≤ β₁₀ against H₁: β₁ > β₁₀, the p-value is

      Pr[N(0, 1) > t̂] .

For the two-sided asymptotic test of H₀: β₁ = β₁₀, the p-value is

      Pr[|N(0, 1)| > |t̂|] .
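A minimal sketch (hypothetical statistic values, not from the notes) of p-value computation for the asymptotic t- and F-tests:

```python
# Asymptotic p-values with scipy.
from scipy.stats import norm, chi2

t_hat = 2.1               # hypothetical t statistic
K_minus_M, W = 2, 7.3     # hypothetical number of restrictions and (K-M)*F

p_one_sided = norm.sf(t_hat)            # Pr[N(0,1) > t_hat]
p_two_sided = 2 * norm.sf(abs(t_hat))   # Pr[|N(0,1)| > |t_hat|]
p_F_asymp   = chi2.sf(W, df=K_minus_M)  # Pr[chi2_{K-M} > (K-M)*F]
print(p_one_sided, p_two_sided, p_F_asymp)
# Reject H0 at level alpha whenever the p-value is below alpha.
```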

Exercise 2.8. For the regression of log earnings for 655 women,

      Log Earnings = 6.662 + 0.159 Educ + 0.051 Exp
                    (0.209)  (0.014)     (0.011)

what is the p-value for the t-test of the null hypothesis that the coefficient of experience is zero? What is the p-value for the t-test of the null hypothesis that the coefficient of experience is equal to 0.05?


Figure 2.3: How to Compute a p-value for the t- and F-tests?


2.2.9 Wald Test of Linear Restrictions

Check whether the restrictions are (close to being) fulfilled by the unrestricted estimator. We maintain Assumptions Iid, Errors, Moments, Rank, Variance throughout.

The test statistic is based on a quadratic form in Rβ̂ − r, that is,

      Ŵ = (Rβ̂ − r)′ [ V̂ar(Rβ̂ − r) ]⁻¹ (Rβ̂ − r) .

But V̂ar(Rβ̂ − r) = s² R(X′X)⁻¹R′, so that Ŵ = (K − M) F̂.

Under H₀, Ŵ →ᵈ χ²_{K−M}, and the rejection region is Ŵ > χ²_{K−M, 1−α}.

2.2.10 Wald Test of Nonlinear Restrictions

We want to test H₀: r(β) = 0, where r(·) is nonlinear.

Example: wage w depends on gender d and characteristics x:

      log w = β₀ + β₁ d + x′β₂ + ε ,    ε | (d, x) ∼ N(0, σ²) .

The relative male-female difference of expected wages is exp(β₁) − 1, and we may want to test whether exp(β₁) − 1 equals some given value.

The Delta Method

Theorem 2.2.3. If √n (θ̂ − θ₀) →ᵈ N(0, Σ) and g(·): R^K → R^{K−M} is differentiable at θ₀, then

      √n (g(θ̂) − g(θ₀)) →ᵈ N(0, G₀ Σ G₀′) ,    where G₀ = ∇_{θ′} g(θ₀) = ∂g(θ₀)/∂θ′ .

For testing H₀, consider

      Ŵ = r(β̂)′ [ R̂(X′X)⁻¹R̂′ ]⁻¹ r(β̂) / s² ,    where R̂ = ∂r(β̂)/∂β′ = ∇_{β′} r(β̂) .

Wald Test

For testing H₀, Ŵ →ᵈ χ²_{K−M} under H₀, and the rejection region is Ŵ > χ²_{K−M, 1−α}.


Proof.

      √n (β̂ − β) →ᵈ N(0, σ² Q⁻¹) .

By the Delta method,

      √n (r(β̂) − r(β)) →ᵈ N(0, σ² R̂ Q⁻¹ R̂′) .

Under H₀, r(β) = 0, so √n r(β̂) →ᵈ N(0, σ² R̂Q⁻¹R̂′). The rest of the proof is the same as for the asymptotic distribution of the F-test statistic. It uses the fact that y′Σ⁻¹y ∼ χ²_p when y ∼ N(0, Σ), and that s²/σ² →ᵖ 1.
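A minimal sketch (hypothetical estimate and standard error, not the numbers of Exercise 2.10 below) of the delta-method Wald test for a single nonlinear restriction H₀: exp(β₁) − 1 = c₀:

```python
# Delta-method Wald test of H0: exp(b1) - 1 = c0.
import numpy as np
from scipy.stats import chi2

b1_hat, se_b1 = 0.20, 0.05    # hypothetical estimate and standard error
c0 = 0.0                      # hypothesized value of exp(b1) - 1

g = np.exp(b1_hat) - 1.0 - c0       # r(beta_hat)
grad = np.exp(b1_hat)               # derivative of exp(b1) - 1 w.r.t. b1
var_g = grad ** 2 * se_b1 ** 2      # delta-method variance of r(beta_hat)

W = g ** 2 / var_g                  # quadratic form; one restriction
p = chi2.sf(W, df=1)
print(W, p)     # reject H0 at level alpha when W > chi2_{1,1-alpha}
```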
Exercise 2.9. Consider testing a univariate restriction H₀: r(β) = 0. What reasoning would show that the Wald test is consistent?

Exercise 2.10. In the regression

      log w = β₀ + β₁ d + x′β₂ + ε ,    ε | (d, x) ∼ N(0, σ²) ,

we obtain β̂₁ = 0.1 with a standard error of 0.04.

→ Apply the delta method to test H₀: exp(β₁) − 1 = 0 against H₁: exp(β₁) − 1 ≠ 0 at the 5% level with the asymptotic Wald test.

→ Given that H₀ is equivalent to β₁ = 0, test H₀: β₁ = 0 against H₁: β₁ ≠ 0 at the 5% level with the asymptotic Wald test.

You should have concluded differently in the two tests. This is an example of the non-invariance of the Wald test: testing equivalent hypotheses expressed in two different ways may not yield the same conclusion!

2.3 Confidence Region

Definition 7. An (asymptotic) confidence region for θ at confidence level 1 − α is the set of values θ₀ such that a two-sided test of H₀: θ = θ₀ does not reject H₀ at (asymptotic) level α.

2.3.1 Univariate Case

→ If we assume homoskedasticity of errors, the asymptotic confidence interval for β₁ of confidence level 1 − α is

      { β₁₀ : |t̂(β₁₀)| ≤ z_{1−α/2} } ,

  where t̂(β₁₀) = (β̂₁ − β₁₀)/s.e.(β̂₁). So it is

      β̂₁ ∓ z_{1−α/2} s.e.(β̂₁) .

→ If we assume normality (and still homoskedasticity), the confidence interval of confidence level 1 − α is

      β̂₁ ∓ t_{n−K, 1−α/2} s.e.(β̂₁) .

  (A numerical sketch of both intervals follows after this list.)

→ A parameter estimate is said to be statistically insignificant when 0 belongs to the CI. This could happen because

  – the estimate is “close to zero” and the standard error is small (what does “close to zero” mean?), or
  – the estimate is not “close to zero,” but the standard error is large.

→ Don't confuse significance with economic importance. A parameter may be statistically significant even though the economic effect is small.
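A minimal sketch (hypothetical numbers, not from the notes) of both intervals:

```python
# Two-sided confidence intervals for a single coefficient:
# asymptotic (normal) and exact under normal errors (Student).
from scipy.stats import norm, t

b1_hat, se_b1 = 0.08, 0.03      # hypothetical coefficient and standard error
n, K, alpha = 200, 4, 0.05      # hypothetical sample size and regressors

z = norm.ppf(1 - alpha / 2)             # asymptotic critical value
tc = t.ppf(1 - alpha / 2, df=n - K)     # critical value under normal errors
print((b1_hat - z * se_b1,  b1_hat + z * se_b1))
print((b1_hat - tc * se_b1, b1_hat + tc * se_b1))
# With n - K large, the two intervals nearly coincide.
```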

Exercise 2.11. In the regression of log earnings for 655 women,

      Log Earnings = 6.662 + 0.159 Educ + 0.051 Exp
                    (0.209)  (0.014)     (0.011)

construct a confidence interval for the coefficient of education. What does it mean in economic terms?

2.3.2 Multivariate Case

Consider

      y = X₁β₁ + X₂β₂ + ε ,    with dim β₂ = K − M .

→ If we assume normality and homoskedasticity of errors, the confidence region for β₂ of confidence level 1 − α is

      { β₂₀ : (β̂₂ − β₂₀)′ [R(X′X)⁻¹R′]⁻¹ (β̂₂ − β₂₀) / ((K − M) s²) ≤ F_{K−M, n−K, 1−α} } ,

  with R the matrix that selects the components β₂.

→ If we don't assume normality (but still homoskedasticity), the confidence region of asymptotic confidence level 1 − α is

      { β₂₀ : (β̂₂ − β₂₀)′ [R(X′X)⁻¹R′]⁻¹ (β̂₂ − β₂₀) / s² ≤ χ²_{K−M, 1−α} } .

→ The confidence region is always an ellipse (see the sketch after this list).

→ The confidence region is not the intersection of individual confidence intervals: it takes into account the covariance of the individual estimators.

→ Some coefficients may be individually “insignificant” but “jointly significant.” The reverse can also happen.
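A minimal sketch (simulated data, not from the notes) that checks pointwise whether a candidate value lies inside the joint confidence region, using the F form under normal errors:

```python
# Joint 95% confidence region for two slope coefficients, checked pointwise.
import numpy as np
from scipy.stats import f as f_dist

rng = np.random.default_rng(4)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
s2 = np.sum((y - X @ b) ** 2) / (n - K)

R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])        # selects the two slope coefficients

def in_region(b20, alpha=0.05):
    """True if b20 lies in the joint confidence region at level 1 - alpha."""
    d = R @ b - b20
    stat = d @ np.linalg.solve(R @ XtX_inv @ R.T, d) / (2 * s2)
    return stat <= f_dist.ppf(1 - alpha, 2, n - K)

print(in_region(R @ b))                 # the OLS estimate is always inside
print(in_region(np.array([0.5, -0.3]))) # true values: inside ~95% of the time
print(in_region(np.array([5.0, 5.0])))  # a distant point: outside
```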

Exercise 2.12. What is the definition of an asymptotic confidence region for a nonlinear function of the parameters r(β)?

Figure 2.4: Confidence Region

Appendix A

Prerequisites: Tests of Hypotheses

Student and Fisher Distributions

Definition 8. If z ∼ N(0, 1) and U ∼ χ²_p are two independent random variables, then z/√(U/p) follows a Student distribution with p degrees of freedom, denoted z/√(U/p) ∼ t_p.

When p → ∞, t_p becomes N(0, 1) (why?).

Definition 9. If X₁ ∼ χ²_{m₁} and X₂ ∼ χ²_{m₂} are independent, then F = (X₁/m₁)/(X₂/m₂) follows a Fisher distribution with m₁ and m₂ degrees of freedom, denoted F ∼ F(m₁, m₂).

Particular case: F_{1,p} = t_p².

Test on a Mean

1. Determine the hypotheses of interest.

   → H₀: μ = μ₀ against H₁: μ ≠ μ₀: two-sided alternative.
   → H₀: μ ≤ μ₀ against H₁: μ > μ₀: one-sided alternative.

2. Determine the test statistic t, whose behavior is known under H₀, or when at the boundary between H₀ and H₁:

      t̂ = √n (ȳ − μ₀) / s .

   When μ = μ₀, t̂ ∼ t_{n−1} if y ∼ N(μ₀, σ²).

3. Choose the level of the test and determine the rejection (or critical) region. (Recall the Type II error: not rejecting H₀ when H₀ is false.)

   → Two-sided test: reject when |t̂| > c. Since t̂ ∼ t_{n−1} when μ = μ₀, Pr[|t̂| > c | μ = μ₀] = α when c is the 1 − (α/2) quantile of a t_{n−1}, denoted t_{n−1; 1−(α/2)}.
   → One-sided test: reject when t̂ > c, with Pr[t̂ > c | μ = μ₀] = α for c = t_{n−1; 1−α}.

   See Figure A.1.

4. Compute the test statistic and conclude.
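A minimal sketch (simulated data, not from the notes) of the two-sided test on a mean:

```python
# Two-sided t-test of H0: mu = mu0 on simulated data.
import numpy as np
from scipy.stats import t as t_dist

rng = np.random.default_rng(5)
y = rng.normal(loc=0.3, scale=1.0, size=50)   # true mean 0.3
mu0, alpha = 0.0, 0.05

n = y.size
t_hat = np.sqrt(n) * (y.mean() - mu0) / y.std(ddof=1)
c = t_dist.ppf(1 - alpha / 2, df=n - 1)
p = 2 * t_dist.sf(abs(t_hat), df=n - 1)
print(t_hat, c, p, abs(t_hat) > c)   # reject H0 when |t_hat| > c
```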


Figure A.1: Tests of a mean: Critical Region

Appendix B

Further and Deeper Details

B.1 Derivation of the Restricted Least Squares Estimator

We should solve

      arg min_{β,λ} L(β, λ) ,    where L(β, λ) = (y − Xβ)′(y − Xβ) + λ′(Rβ − r) .

→ FOC:

      ∂L/∂β = 0  ⇔  −2X′y + 2X′Xβ̂_R + R′λ̂ = 0
                 ⇔  2X′Xβ̂_R = 2X′y − R′λ̂
                 ⇔  β̂_R = (X′X)⁻¹X′y − ½ (X′X)⁻¹R′λ̂
                 ⇔  β̂_R = β̂ − ½ (X′X)⁻¹R′λ̂ ,                      (B.1)

      ∂L/∂λ = 0  ⇔  Rβ̂_R − r = 0 ,                                   (B.2)

  where β̂ ≡ β̂_OLS.

→ Inserting (B.1) into (B.2):

      Rβ̂ − ½ R(X′X)⁻¹R′λ̂ − r = 0
      ⇔  ½ R(X′X)⁻¹R′λ̂ = Rβ̂ − r
      ⇔  λ̂ = 2 [R(X′X)⁻¹R′]⁻¹ (Rβ̂ − r) .                            (B.3)

→ Inserting (B.3) into (B.1):

      β̂_R = β̂ − (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ (Rβ̂ − r)                  (B.4)
           = Aβ̂ + b ,

  where

      A = I − (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ R ,                            (B.5)
      b = (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ r .                                (B.6)

B.2 Oblique Projection

→ C is a K × K matrix, but it is not symmetric.

→ C is idempotent, i.e. CC = C. So any of its eigenvalues must be either zero or one.

→ Hence C must be a projection: it leaves a whole linear subspace unchanged, and it maps the complementary subspace to 0. But the two subspaces are not orthogonal in general.

→ Any vector u can be written as

      u = Cu + Au ,

  but the two elements are not orthogonal.
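A minimal numerical sketch (assumed X and R, not from the notes) of these properties:

```python
# C = (X'X)^{-1} R' [R (X'X)^{-1} R']^{-1} R is idempotent but not
# symmetric: an oblique projection.
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(50, 3))
R = np.array([[0.0, 1.0, 1.0]])

XtX_inv = np.linalg.inv(X.T @ X)
C = XtX_inv @ R.T @ np.linalg.inv(R @ XtX_inv @ R.T) @ R

print(np.allclose(C @ C, C))              # True: idempotent
print(np.allclose(C, C.T))                # False: not symmetric
print(np.round(np.linalg.eigvals(C), 6))  # eigenvalues are zeros and ones
```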

B.3 Proof of the Asymptotic Distribution of RLS (Theorem 2.1.6)

→ Since β̂ = β + (X′X)⁻¹X′ε,

      Rβ̂ − r = Rβ − r + R(X′X)⁻¹X′ε .

→ If the restrictions hold, Rβ = r, and using (2.1),

      β̂_R − β = (X′X)⁻¹X′ε − (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ R(X′X)⁻¹X′ε
               = ( (X′X)⁻¹ − (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ R(X′X)⁻¹ ) X′ε .

→ Therefore:

      √n (β̂_R − β)
        = ( n(X′X)⁻¹ − n(X′X)⁻¹R′ [R n(X′X)⁻¹ R′]⁻¹ R n(X′X)⁻¹ ) (1/√n) X′ε .

  Using n⁻¹X′X →ᵖ Q, (1/√n) X′ε →ᵈ N(0, σ²Q), and applying the Slutsky theorem,

      √n (β̂_R − β) →ᵈ V N(0, σ²Q) ,

  where V = Q⁻¹ − Q⁻¹R′(RQ⁻¹R′)⁻¹RQ⁻¹.

→ After some matrix manipulation, it follows that the variance matrix of the asymptotic distribution is given by

      σ² V Q V′ = σ² V .

  Therefore:

      √n (β̂_R − β) →ᵈ N(0, σ² V) .

B.4 Proof of the Distribution of the F Statistic (Theorem 2.2.1)

Proof.

      F̂ = [ (Rβ̂ − r)′ (R(X′X)⁻¹R′)⁻¹ (Rβ̂ − r) / ((K − M) σ²) ] / [ s²/σ² ] = A/B .

Since

      ε | X ∼ N(0, σ²I)  ⟹  β̂ | X ∼ N(β, σ² (X′X)⁻¹) ,

we get

      Rβ̂ − r | X ∼ N(Rβ − r, σ² R(X′X)⁻¹R′) .

Under H₀, Rβ = r, so that A ∼ χ²_{K−M}/(K − M). We also know that

      s²/σ² ∼ χ²_{n−K}/(n − K) .

Are the numerator and denominator independent? Yes, because

→ the residuals ε̂ are orthogonal to the fitted values ŷ = Xβ̂ conditionally on X, so they are also independent of β̂ conditionally on X,

→ s² depends only on the residuals, so it is independent of β̂ conditionally on X.

Hence A is independent of B conditionally on X, and F̂ ∼ F_{K−M, n−K} under H₀ conditionally on X. But this distribution does not depend on X, so F̂ ∼ F_{K−M, n−K} under H₀.
