
Contents

2 Tests and Confidence Regions
  2.1 Restricted Least Squares
      2.1.1 Linear Restrictions
      2.1.2 Restricted LS Estimator
      2.1.3 Expectation and Variance
      2.1.4 Properties with Normal Errors
      2.1.5 Asymptotic Properties
  2.2 Hypothesis Testing
      2.2.1 Steps of Hypothesis Testing
      2.2.2 Power of a Test
      2.2.3 The F-Test Statistic
      2.2.4 Normal Errors
      2.2.5 Asymptotic Test
      2.2.6 Consistency
      2.2.7 Examples
      2.2.8 P-Value (Observed Significance Level)
      2.2.9 Wald Test of Linear Restrictions
      2.2.10 Wald Test of Nonlinear Restrictions
  2.3 Confidence Region
      2.3.1 Univariate Case
      2.3.2 Multivariate Case

A Prerequisites: Tests of Hypotheses

B Further and Deeper Details
  B.1 Derivation of the Restricted Least Squares Estimator
  B.2 Oblique Projection
  B.3 Proof of the Asymptotic Distribution of RLS (Theorem 2.1.6)
  B.4 Proof of the Distribution of the F Statistic (Theorem 2.2.1)
Chapter 2

Tests and Confidence Regions

Learning Objectives of this Chapter

At the end of this chapter, you should be able to:

→ Perform hypothesis testing with the linear regression model:

  – State the null and alternative hypotheses,
  – Determine the test statistic,
  – Determine its behavior,
  – Determine a decision rule,
  – Determine the result of the test,
  – Perform a test with a p-value.

→ Build a confidence region.

→ Know the usual tests (F-test, t-test, Wald test).

→ Know and be able to show the main properties of the Restricted Least Squares estimator.

Standard Assumptions

Assumption 1 (Iid). (y_i, x_{1i}, . . . , x_{Ki}), i = 1, . . . , n, are independent and identically distributed.

Assumption 2 (Errors). E(ε | x₁, . . . , x_K) = 0.

Assumption 3 (Moments). All variables have a bounded moment of order 4, i.e. E(x_j⁴) < ∞, j = 1, . . . , K, and E(y⁴) < ∞.

Assumption 4 (Rank). No perfect collinearity among the explanatory variables.

Assumption 5 (Variance). Error terms are homoskedastic.

2.1 Restricted Least Squares

2.1.1 Linear Restrictions

Examples

→ Consider the model y = X₁β₁ + X₂β₂ + ε.

  X₂ can be omitted, i.e. β₂ = 0.

→ Constant returns to scale, i.e. β₁ + β₂ = 1, in

      ln Q = β₀ + β₁ ln K + β₂ ln L + ε ,    with β₀ = ln A ,

  where Q is production, K capital, and L labor.

Implicit Restrictions

      Rβ = r

R is a known (K − M) × K full-rank matrix, r is a known (K − M)-vector.

→ There are K − M restrictions; M is the number of unrestricted (free) parameters.

→ Example: constant returns to scale in

      ln Q = β₀ + β₁ ln(K) + β₂ ln(L) + ε

  is

      (0  1  1) (β₀, β₁, β₂)′ = 1 .

→ R has full rank, so there are no redundant restrictions: ρ(R) = K − M.


Explicit Restrictions

      β = Sγ + s

S is a known K × M full-rank matrix, γ is an unknown M-vector, s is a known K-vector.

→ There are M free parameters γ.

→ Example: constant returns to scale: β₁ + β₂ = 1, i.e. β₂ = 1 − β₁:

      (β₀, β₁, β₂)′ = S (γ₀, γ₁)′ + s ,    S = [1 0; 0 1; 0 −1] ,    s = (0, 0, 1)′ ,

  so that β₀ = γ₀, β₁ = γ₁, and β₂ = 1 − γ₁ = 1 − β₁.

→ One can go from explicit to implicit form and vice versa.

Here we consider implicit restrictions. See Ruud (Chapter 4) for a treatment of restricted least squares under explicit restrictions.
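To make the correspondence concrete, here is a minimal numerical sketch (not part of the notes) using the constant-returns example above: the explicit form β = Sγ + s describes exactly the solutions of the implicit form Rβ = r, which requires RS = 0 and Rs = r.

```python
# Minimal check that the implicit and explicit forms of the
# constant-returns-to-scale restriction agree: beta satisfies R beta = r
# exactly when beta = S gamma + s for some gamma (requires RS = 0, Rs = r).
import numpy as np

R = np.array([[0.0, 1.0, 1.0]])        # implicit form: R beta = r
r = np.array([1.0])
S = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, -1.0]])            # explicit form: beta = S gamma + s
s = np.array([0.0, 0.0, 1.0])

print(R @ S)        # [[0. 0.]] -> R S = 0
print(R @ s, r)     # [1.] [1.] -> R s = r

gamma = np.array([0.3, 0.8])           # any value of the free parameters
beta = S @ gamma + s
print(R @ beta)     # [1.] -> the restriction holds for every gamma
```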

2.1.2 Restricted LS Estimator

We consider the linear regression

      y = Xβ + ε

under the restriction that

      Rβ = r .

So one should solve the restricted LS problem

      β̂_R = arg min ε′ε = arg min (y − Xβ)′(y − Xβ)  subject to  Rβ = r .

The Lagrangian form of the problem is

      arg min_{β,λ} L(β, λ) ,    where L(β, λ) = (y − Xβ)′(y − Xβ) + λ′(Rβ − r)

and λ is the Lagrange multiplier (a (K − M)-vector). The solutions are (see B.1 for details):

      λ̂ = 2 [R(X′X)⁻¹R′]⁻¹ (Rβ̂ − r) ,

      β̂_R = β̂ − (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ (Rβ̂ − r)          (2.1)
           = Aβ̂ + b ,                                         (2.2)

where

      A = I − (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ R ,
      b = (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ r .

When Rβ̂ = r, then β̂_R = β̂. In general the two estimators differ, since (X′X)⁻¹R′[R(X′X)⁻¹R′]⁻¹ has full rank provided X and R have full rank.

Exercise 2.1. Check that Rβ̂_R = r.

Note: It so happens that C = (X′X)⁻¹R′[R(X′X)⁻¹R′]⁻¹R and A = I − C are oblique (non-orthogonal) projection matrices. See B.2 for details.
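As a quick sanity check, a minimal numerical sketch (simulated data, not from the notes) that computes β̂_R via formula (2.1) and verifies that it satisfies the restriction exactly:

```python
# Restricted LS via formula (2.1), on simulated data.
import numpy as np

rng = np.random.default_rng(0)
n, K = 200, 3
X = rng.normal(size=(n, K))
beta_true = np.array([0.5, 0.7, 0.3])            # satisfies b1 + b2 = 1
y = X @ beta_true + rng.normal(size=n)

R = np.array([[0.0, 1.0, 1.0]])                  # restriction: b1 + b2 = 1
r = np.array([1.0])

XtX_inv = np.linalg.inv(X.T @ X)
beta_ols = XtX_inv @ X.T @ y

# beta_R = beta_ols - (X'X)^{-1} R' [R (X'X)^{-1} R']^{-1} (R beta_ols - r)
middle = np.linalg.inv(R @ XtX_inv @ R.T)
beta_rls = beta_ols - XtX_inv @ R.T @ middle @ (R @ beta_ols - r)

print(R @ beta_rls)   # [1.] -- the restriction holds exactly
```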

2.1.3 Expectation and Variance

Theorem 2.1.1. Under Assumptions Iid, Errors, Rank,

      E(β̂_R | X) = β − (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ (Rβ − r) .

So β̂_R is unbiased if Rβ = r.

Theorem 2.1.2. Under Assumptions Iid, Errors, Rank, Variance,

      Var[β̂_R | X] = A Var[β̂ | X] A′ = σ² A (X′X)⁻¹ A′ = σ² A (X′X)⁻¹ .

Exercise 2.2. Show that A(X′X)⁻¹A′ = A(X′X)⁻¹ = (X′X)⁻¹A′. Deduce that Var[β̂_R | X] = σ² A(X′X)⁻¹.

Theorem 2.1.3. Under Assumptions Iid, Errors, Rank, Variance,

      Var[β̂ | X] − Var[β̂_R | X] = σ² (I − A)(X′X)⁻¹
                                  = σ² (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ R(X′X)⁻¹

is positive semi-definite. Hence, if the restrictions hold, β̂_R is more efficient than β̂.

2.1.4 Properties with Normal Errors

Theorem 2.1.4. If the restrictions hold, y | X ∼ N(Xβ, σ²I), and Assumption Rank holds, then

→ β̂_R | X ∼ N(β, σ² A(X′X)⁻¹),

→ the restricted LS estimator is efficient in the class of unbiased estimators.


2.1.5 Asymptotic Properties

Consistency

Theorem 2.1.5. If the restrictions hold, then under Assumptions Iid, Errors, Moments, Rank,

      β̂_R →ᵖ β .

Exercise 2.3. Given that n⁻¹X′X →ᵖ Q and β̂ →ᵖ β, show this result.

Asymptotic Distribution

Theorem 2.1.6. If the restrictions hold, then under Assumptions Iid, Errors, Moments, Rank, Variance,

      √n (β̂_R − β) →ᵈ N(0, σ² V) ,

where V = Q⁻¹ − Q⁻¹R′(RQ⁻¹R′)⁻¹RQ⁻¹ and Q = plim n⁻¹X′X.

Proof. See Appendix B.3.

Incorrect Restrictions

If Rβ ≠ r, then β̂_R is biased, even asymptotically. Hence it cannot be consistent. Consider for instance omitted variables, that is,

      y = X₁β₁ + X₂β₂ + ε

with β₂ ≠ 0, but we regress y on X₁ only, that is, we falsely assume β₂ = 0. Then β̂₁ = (X₁′X₁)⁻¹X₁′y is biased for the true value β₁, unless X₂ and X₁ are orthogonal, see Chap. 1.
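To illustrate, a minimal simulation sketch (assumed data generating process, not from the notes): imposing a false restriction β₂ = 0 leaves the restricted estimator biased no matter how large n is.

```python
# Omitted-variable bias: regressing y on x1 alone when x2 matters and is
# correlated with x1 biases the estimate of beta1, even asymptotically.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)             # x2 correlated with x1
y = 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)   # true beta1 = 1, beta2 = 0.5

# Restricted fit: regress y on x1 only (falsely assumes beta2 = 0)
b1_restricted = (x1 @ y) / (x1 @ x1)
print(b1_restricted)   # ~1.4 = 1 + 0.5 * 0.8, not 1: the bias persists
```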

2.2 Hypothesis Testing

2.2.1 Steps of Hypothesis Testing

In what follows, we take as an example a test of a univariate restriction in a linear model. For an application to the test of a mean, see Appendix A.

Example: consider the following linear model:

      y = β₁x₁ + x₂′β₂ + ε .

1. Determine the hypotheses of interest.

   → Alternative hypothesis H₁: what we would like to show.
   → Null hypothesis H₀: the complement of H₁, but the equal sign should be in H₀.

   Example:

   → One-sided (one-tailed) test: H₀: β₁ ≥ b against H₁: β₁ < b.
   → Two-sided (two-tailed) test: H₀: β₁ = b against H₁: β₁ ≠ b.

2. Determine the test statistic t, whose behavior is known under H₀, or when at the boundary between H₀ and H₁.

   Example: under H₀ (at the boundary β₁ = b),

      t̂ = (β̂₁ − b) / s.e.(β̂₁) →ᵈ N(0, 1) .

   Note: this result will be shown later.

3. Choose the level of the test and determine the rejection (or critical) region.

   → Type I error: rejecting H₀ when H₀ is true.
   → The level is the maximum type I error probability, sup_{μ∈H₀} Pr[Reject H₀ | μ ∈ H₀].
   → Rejection (or critical) region R:

         t̂ ∈ R ⇔ the test rejects H₀ .

     R is determined so that the level of the test equals α.
   → Type II error: not rejecting H₀ when H₀ is false.

   Example: level of the test: α. See Figure 2.1 for a representation of the rejection region.

4. Compute the test statistic and conclude.

   Example:

   → One-sided test: H₀ is rejected if t̂ < −z_{1−α}.
   → Two-sided test: H₀ is rejected if |t̂| > z_{1−α/2}.


Figure 2.1: Example: Test on a Univariate Restriction

2.2.2 Power of a Test

The power of a test is the probability that the test rejects H₀ when H₀ is not true. Take the instance of a one-sided test on a slope parameter:

      H₀: β₁ ≥ b against H₁: β₁ < b.

The power function is

      Power(β₁) = Pr[Reject H₀ | β₁ ∈ H₁] .

Hence Power(β₁) is 1 minus the probability of an error of Type II, that is, 1 − β(β₁).

Probabilities of Error

                                   Truth
                              H₀            H₁
  Decision  0 (keep H₀)     1 − α         β(β₁)
            1 (reject H₀)     α         1 − β(β₁)

What is the power of the t-test for H₁: β₁ < b? The test statistic is

      t̂ = (β̂₁ − b)/s.e.(β̂₁) = (β̂₁ − β₁)/s.e.(β̂₁) + (β₁ − b)/s.e.(β̂₁) .

The first term is asymptotically distributed as N(0, 1) whatever the true value of β₁ (this result will be shown later). Hence the power of the t-test when the true value equals β₁ is

      Pr[t̂ < −z_{1−α}] = Pr[(β̂₁ − β₁)/s.e.(β̂₁) + (β₁ − b)/s.e.(β̂₁) < −z_{1−α}]
                        = Pr[z < −z_{1−α} − (β₁ − b)/s.e.(β̂₁)] ,

where z is a random variable that is asymptotically distributed as a standard Normal.

As

      s.e.(β̂₁) = sqrt( s² (X′X)⁻¹_{(1,1)} ) = sqrt( (1/(n − K)) ε̂′ε̂ (X′X)⁻¹_{(1,1)} ) ,

we get that s.e.(β̂₁) → 0 as n → +∞.

The test's power indeed depends on the true β₁. But when n → ∞, it tends to one for any β₁ < b.
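A minimal sketch (hypothetical numbers, not from the notes) of this asymptotic power function, Power(β₁) ≈ Φ(−z_{1−α} − (β₁ − b)/s.e.(β̂₁)):

```python
# Asymptotic power of the one-sided t-test of H0: beta1 >= b vs H1: beta1 < b.
from scipy.stats import norm

def power(beta1, b=0.0, se=0.1, alpha=0.05):
    """Asymptotic power: Phi(-z_{1-alpha} - (beta1 - b)/se)."""
    z = norm.ppf(1 - alpha)
    return norm.cdf(-z - (beta1 - b) / se)

for beta1 in [-0.3, -0.2, -0.1, 0.0]:
    print(beta1, round(power(beta1), 3))
# Power rises toward 1 as beta1 moves below b, and equals alpha at beta1 = b.
# A smaller se (larger n) pushes power toward 1 for any beta1 < b.
```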

Definition 6. A test is consistent if its power function converges to 1 as n → ∞ for any parameter value in H₁.

Exercise 2.4. For the two-sided t-test, write the power function and show that the test is consistent.

2.2.3 The F-Test Statistic

We want to test H₀: Rβ = r against H₁: Rβ ≠ r. The F-test statistic is

      F̂ = [ ‖Xβ̂ − Xβ̂_R‖² / (K − M) ] / s²

         = [ (Rβ̂ − r)′ (R(X′X)⁻¹R′)⁻¹ (Rβ̂ − r) / (K − M) ] / s²

         = [ (RSS_R − RSS) / (K − M) ] / [ RSS / (n − K) ] .

This corresponds to three different ways of writing the same statistic, and gives three different ways of interpreting it:

→ We compare the fitted values Xβ̂ and Xβ̂_R.

→ We compare (Rβ̂ − r) to a vector of zeros.

→ We compare RSS_R and RSS.

We now show that these three forms of the test statistic are equal, using some useful equalities.

→ We have

      ‖Xβ̂ − Xβ̂_R‖² = ‖X(β̂ − β̂_R)‖²                                  (2.3)
                     = (β̂ − β̂_R)′ (X′X) (β̂ − β̂_R)
                     = (Rβ̂ − r)′ [R(X′X)⁻¹R′]⁻¹ (Rβ̂ − r)              (2.4)

  from formula (2.1) for β̂_R. This shows that the first form of F̂ is equal to the second one.

→ We know that for any β,

      y − Xβ = (y − Xβ̂) + (Xβ̂ − Xβ)

  and the two parts are orthogonal (remember that y − Xβ̂ = M_X y and M_X X = 0), so

      ‖y − Xβ‖² = ‖y − Xβ̂‖² + ‖Xβ̂ − Xβ‖² .

  In particular, for β = β̂_R:

      ‖y − Xβ̂_R‖² = ‖y − Xβ̂‖² + ‖Xβ̂ − Xβ̂_R‖² ,
      i.e.  RSS_R = RSS + ‖Xβ̂ − Xβ̂_R‖² .                              (2.5)

→ If you also use that s² = RSS/(n − K), this shows that F̂ can also be written using the third formula.
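A minimal numerical sketch (simulated data, not from the notes) checking that the three forms coincide:

```python
# The three forms of the F statistic agree numerically.
import numpy as np

rng = np.random.default_rng(2)
n, K = 200, 3
X = rng.normal(size=(n, K))
y = X @ np.array([0.5, 0.7, 0.3]) + rng.normal(size=n)

R = np.array([[0.0, 1.0, 1.0]]); r = np.array([1.0])   # H0: b1 + b2 = 1
M = K - R.shape[0]                                      # free parameters

XtX_inv = np.linalg.inv(X.T @ X)
b_ols = XtX_inv @ X.T @ y
b_rls = b_ols - XtX_inv @ R.T @ np.linalg.solve(R @ XtX_inv @ R.T,
                                                R @ b_ols - r)
RSS  = np.sum((y - X @ b_ols) ** 2)
RSSR = np.sum((y - X @ b_rls) ** 2)
s2 = RSS / (n - K)

d = R @ b_ols - r
F1 = np.sum((X @ (b_ols - b_rls)) ** 2) / (K - M) / s2
F2 = (d @ np.linalg.solve(R @ XtX_inv @ R.T, d)) / (K - M) / s2
F3 = ((RSSR - RSS) / (K - M)) / (RSS / (n - K))
print(F1, F2, F3)   # all three agree
```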

2.2.4 Normal Errors

Theorem 2.2.1. If the restrictions hold, y | X ∼ N(Xβ, σ²I), and Assumption Rank holds, then F̂ ∼ F_{K−M, n−K}.

Proof. See Appendix B.4.

Hence the rejection region is

      F̂ > F_{K−M, n−K, 1−α} ,

where F_{K−M, n−K, 1−α} is the 1 − α quantile of an F_{K−M, n−K}. See Figure 2.2.


Figure 2.2: F-Test: Critical Region

Case of a univariate restriction  Let y = β₁x₁ + x₂′β₂ + ε, and consider H₀: β₁ = b against H₁: β₁ ≠ b. Then

      F̂ = (β̂₁ − b)² / [ s² e₁′(X′X)⁻¹e₁ ] ,    e₁ = (1, 0, . . . , 0)′ ,      (2.6)

         = [ (β̂₁ − b) / s.e.(β̂₁) ]² = t̂² ,                                    (2.7)

where t̂ is the Student statistic.

Rejection region: |t̂| > t_{n−K, 1−α/2}.
Exercise 2.5. A regression of log earnings on (years of) education and (years of) work experience for 655 women yields the estimated linear regression

      Log Earnings = 6.662 + 0.159 Educ + 0.051 Exp
                    (0.209)  (0.014)     (0.011)

with standard errors in parentheses. Test whether the coefficient of education is zero. Then test whether the coefficient of education is equal to 0.16. What does it mean in economic terms?


One-sided test of a univariate inequality  For H₀: β₁ ≥ b against H₁: β₁ < b, the critical region is

      t̂ < −t_{n−K, 1−α} .

2.2.5 Asymptotic Test

Theorem 2.2.2. If the restrictions hold, then under Assumptions Iid, Errors, Moments, Rank, Variance,

      (K − M) F̂ →ᵈ χ²_{K−M} .

Proof.

      (K − M) F̂ = [ (Rβ̂ − r)′ (R(X′X)⁻¹R′)⁻¹ (Rβ̂ − r) / σ² ] / [ s²/σ² ] = A/B .

Under H₀,

      √n (Rβ̂ − r) →ᵈ N(0, σ² RQ⁻¹R′)

with Q = plim n⁻¹X′X. So A →ᵈ χ²_{K−M}. Moreover, B = s²/σ² →ᵖ 1. Hence

      (K − M) F̂ →ᵈ χ²_{K−M} .

Hence the asymptotic rejection region is

      (K − M) F̂ > χ²_{K−M, 1−α} ,

where χ²_{K−M, 1−α} is the 1 − α quantile of a χ²_{K−M}.

The test is called asymptotic because it is based on the asymptotic distribution of the test statistic under the null hypothesis, so it has asymptotic level α.

Univariate Restriction  The rejection region is

      |t̂| > z_{1−α/2} ,

where z_{1−α/2} is the 1 − α/2 quantile of a N(0, 1), since t̂ →ᵈ N(0, 1) as n → ∞.

2.2.6 Consistency

      F̂ = √n (Rβ̂ − r)′ [R(n⁻¹X′X)⁻¹R′]⁻¹ √n (Rβ̂ − r) / [ (K − M) s² ] .

We have (Rβ̂ − r) →ᵖ (Rβ − r), n⁻¹X′X →ᵖ Q, and s² →ᵖ σ². So under H₁, F̂ →ᵖ ∞ and Pr[Reject H₀ | μ] → 1 for all μ ∈ H₁.

Exercise 2.6. Consider testing a univariate restriction H₀: β₁ = β₁₀ with the t-test. Show that the test is consistent.


2.2.7 Examples

“Usefulness of the Model”

      y = β₁ι + X₂β₂ + ε ,    ι = (1, . . . , 1)′ .

To test H₀: β₂ = 0 against H₁: β₂ ≠ 0, the statistic becomes

      F̂ = [ (TSS − RSS)/(K − 1) ] / [ RSS/(n − K) ] = [ R²/(K − 1) ] / [ (1 − R²)/(n − K) ] .

Exercise 2.7. Show the last equality.

Stability Test

Consider two groups of observations on the same variables, and

      ( y_A )   ( X_A   0  ) ( β_A )   ( ε_A )
      ( y_B ) = (  0   X_B ) ( β_B ) + ( ε_B ) ,

with dimensions ((n_A + n_B) × 1), ((n_A + n_B) × 2K), (2K × 1), and ((n_A + n_B) × 1) respectively, where ε_j | X_j ∼ N(0, σ²I) for j = A, B, and the two groups are independent.

We want to test H₀: β_A = β_B against H₁: β_A ≠ β_B.

Alternative way to write the hypotheses:

→ H₀: restricted model:

      ( y_A )   ( X_A )       ( ε_A )
      ( y_B ) = ( X_B ) β  +  ( ε_B )

→ H₁: unrestricted model: y_A = X_A β_A + ε_A and y_B = X_B β_B + ε_B can be estimated separately.

The statistic becomes

      F̂ = [ (RSS − (RSS_A + RSS_B)) / K ] / [ (RSS_A + RSS_B) / (n_A + n_B − 2K) ] .
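A minimal sketch (simulated data, not from the notes) of this stability (Chow) test:

```python
# Stability (Chow) test: compare pooled and group-specific RSS.
import numpy as np
from scipy.stats import f as f_dist

def rss(X, y):
    """Residual sum of squares of the OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

rng = np.random.default_rng(3)
nA, nB, K = 120, 150, 2
XA = np.column_stack([np.ones(nA), rng.normal(size=nA)])
XB = np.column_stack([np.ones(nB), rng.normal(size=nB)])
yA = XA @ np.array([1.0, 2.0]) + rng.normal(size=nA)
yB = XB @ np.array([1.0, 2.5]) + rng.normal(size=nB)   # slope differs

RSS_A, RSS_B = rss(XA, yA), rss(XB, yB)
RSS = rss(np.vstack([XA, XB]), np.concatenate([yA, yB]))  # restricted fit

F = ((RSS - (RSS_A + RSS_B)) / K) / ((RSS_A + RSS_B) / (nA + nB - 2 * K))
crit = f_dist.ppf(0.95, K, nA + nB - 2 * K)
print(F, crit, F > crit)   # reject H0: beta_A = beta_B at the 5% level
```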

2.2.8 P-Value (Observed Significance Level)

By definition, it is the level of the test above which the decision switches. Equivalently, it is the level of the test for which the critical value equals the test statistic.

For instance,

→ For the asymptotic F-test, the p-value is the p such that

      (K − M) F̂ = χ²_{K−M, 1−p} .

→ For the asymptotic t-test, the p-value is the p such that:

  – |t̂| = z_{1−p/2} for a two-sided test, z_q being the quantile of order q of a N(0, 1);
  – t̂ = z_{1−p} for a one-sided test.

See Figure 2.3.

With a p-value, we can run the test at any level.

Testing with a p-value

1. Determine the hypotheses of interest.

2. Determine the test statistic and its behavior under H₀ (or at the boundary between H₀ and H₁).

3. Choose the level α.

4. Compute the statistic and its p-value.

5. If p-value < α, reject H₀. Otherwise do not reject H₀.

Univariate Case  For the one-sided asymptotic test of H₀: β₁ ≤ β₁₀ against H₁: β₁ > β₁₀, the p-value is

      Pr[N(0, 1) > t̂] .

For the two-sided asymptotic test of H₀: β₁ = β₁₀, the p-value is

      Pr[|N(0, 1)| > |t̂|] .
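A minimal sketch (hypothetical statistic values, not from the notes) of p-value computation for the asymptotic t- and F-tests:

```python
# Asymptotic p-values with scipy.
from scipy.stats import norm, chi2

t_hat = 2.1               # hypothetical t statistic
K_minus_M, W = 2, 7.3     # hypothetical number of restrictions and (K-M)*F

p_one_sided = norm.sf(t_hat)            # Pr[N(0,1) > t_hat]
p_two_sided = 2 * norm.sf(abs(t_hat))   # Pr[|N(0,1)| > |t_hat|]
p_F_asymp   = chi2.sf(W, df=K_minus_M)  # Pr[chi2_{K-M} > (K-M)*F]
print(p_one_sided, p_two_sided, p_F_asymp)
# Reject H0 at level alpha whenever the p-value is below alpha.
```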

Exercise 2.8. For the regression of log earnings for 655 women,

      Log Earnings = 6.662 + 0.159 Educ + 0.051 Exp
                    (0.209)  (0.014)     (0.011)

what is the p-value for the t-test of the null hypothesis that the coefficient of experience is zero? What is the p-value for the t-test of the null hypothesis that the coefficient of experience is equal to 0.05?


Figure 2.3: How to Compute a p-value for the t- and F-tests?


2.2.9 Wald Test of Linear Restrictions

Check whether the restrictions are (close to being) fulfilled by the unrestricted estimator. We maintain Assumptions Iid, Errors, Moments, Rank, Variance throughout.

The test statistic is based on a quadratic form in Rβ̂ − r, that is,

      Ŵ = (Rβ̂ − r)′ [ V̂ar(Rβ̂ − r) ]⁻¹ (Rβ̂ − r) .

But V̂ar(Rβ̂ − r) = s² R(X′X)⁻¹R′, so that Ŵ = (K − M) F̂.

Under H₀, Ŵ →ᵈ χ²_{K−M}, and the rejection region is Ŵ > χ²_{K−M, 1−α}.

2.2.10 Wald Test of Nonlinear Restrictions

We want to test H₀: r(β) = 0, where r(·) is nonlinear.

Example: wage w depends on gender d and characteristics x:

      log w = β₀ + β₁ d + x′β₂ + ε ,    ε | (d, x) ∼ N(0, σ²) .

The relative male-female difference of expected wages is exp(β₁) − 1, and we may want to test whether exp(β₁) − 1 equals some given value.

The Delta Method

Theorem 2.2.3. If √n (θ̂ − θ₀) →ᵈ N(0, Σ) and g(·): R^K → R^{K−M} is differentiable at θ₀, then

      √n (g(θ̂) − g(θ₀)) →ᵈ N(0, G₀ Σ G₀′) ,    where G₀ = ∇_{θ′} g(θ₀) = ∂g(θ₀)/∂θ′ .

For testing H₀, consider

      Ŵ = r(β̂)′ [ R̂(X′X)⁻¹R̂′ ]⁻¹ r(β̂) / s² ,    where R̂ = ∂r(β̂)/∂β′ = ∇_{β′} r(β̂) .

Wald Test

For testing H₀, Ŵ →ᵈ χ²_{K−M} under H₀, and the rejection region is Ŵ > χ²_{K−M, 1−α}.


Proof.

      √n (β̂ − β) →ᵈ N(0, σ² Q⁻¹) .

By the Delta method,

      √n (r(β̂) − r(β)) →ᵈ N(0, σ² R̂ Q⁻¹ R̂′) .

Under H₀, r(β) = 0, so √n r(β̂) →ᵈ N(0, σ² R̂Q⁻¹R̂′). The rest of the proof is the same as for the asymptotic distribution of the F-test statistic. It uses the fact that y′Σ⁻¹y ∼ χ²_p when y ∼ N(0, Σ), and that s²/σ² →ᵖ 1.
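A minimal sketch (hypothetical estimate and standard error, not the numbers of Exercise 2.10 below) of the delta-method Wald test for a single nonlinear restriction H₀: exp(β₁) − 1 = c₀:

```python
# Delta-method Wald test of H0: exp(b1) - 1 = c0.
import numpy as np
from scipy.stats import chi2

b1_hat, se_b1 = 0.20, 0.05    # hypothetical estimate and standard error
c0 = 0.0                      # hypothesized value of exp(b1) - 1

g = np.exp(b1_hat) - 1.0 - c0       # r(beta_hat)
grad = np.exp(b1_hat)               # derivative of exp(b1) - 1 w.r.t. b1
var_g = grad ** 2 * se_b1 ** 2      # delta-method variance of r(beta_hat)

W = g ** 2 / var_g                  # quadratic form; one restriction
p = chi2.sf(W, df=1)
print(W, p)     # reject H0 at level alpha when W > chi2_{1,1-alpha}
```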
Exercise 2.9. Consider testing a univariate restriction H₀: r(β) = 0. What reasoning would show that the Wald test is consistent?

Exercise 2.10. In the regression

      log w = β₀ + β₁ d + x′β₂ + ε ,    ε | (d, x) ∼ N(0, σ²) ,

we obtain β̂₁ = 0.1 with a standard error of 0.04.

→ Apply the delta method to test H₀: exp(β₁) − 1 = 0 against H₁: exp(β₁) − 1 ≠ 0 at the 5% level with the asymptotic Wald test.

→ Given that H₀ is equivalent to β₁ = 0, test H₀: β₁ = 0 against H₁: β₁ ≠ 0 at the 5% level with the asymptotic Wald test.

You should have concluded differently in the two tests. This is an example of the non-invariance of the Wald test: testing equivalent hypotheses expressed in two different ways may not yield the same conclusion!

2.3 Confidence Region

Definition 7. An (asymptotic) confidence region for θ at confidence level 1 − α is the set of values θ₀ such that a two-sided test of H₀: θ = θ₀ does not reject H₀ at (asymptotic) level α.

2.3.1 Univariate Case

→ If we assume homoskedasticity of errors, the asymptotic confidence interval for β₁ of confidence level 1 − α is

      { β₁₀ : |t̂(β₁₀)| ≤ z_{1−α/2} } ,

  where t̂(β₁₀) = (β̂₁ − β₁₀)/s.e.(β̂₁). So it is

      β̂₁ ∓ z_{1−α/2} s.e.(β̂₁) .

→ If we assume normality (and still homoskedasticity), the confidence interval of confidence level 1 − α is

      β̂₁ ∓ t_{n−K, 1−α/2} s.e.(β̂₁) .

  (A numerical sketch of both intervals follows after this list.)

→ A parameter estimate is said to be statistically insignificant when 0 belongs to the CI. This could happen because

  – the estimate is “close to zero” and the standard error is small (what does “close to zero” mean?), or
  – the estimate is not “close to zero,” but the standard error is large.

→ Don't confuse significance with economic importance. A parameter may be statistically significant even though the economic effect is small.
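A minimal sketch (hypothetical numbers, not from the notes) of both intervals:

```python
# Two-sided confidence intervals for a single coefficient:
# asymptotic (normal) and exact under normal errors (Student).
from scipy.stats import norm, t

b1_hat, se_b1 = 0.08, 0.03      # hypothetical coefficient and standard error
n, K, alpha = 200, 4, 0.05      # hypothetical sample size and regressors

z = norm.ppf(1 - alpha / 2)             # asymptotic critical value
tc = t.ppf(1 - alpha / 2, df=n - K)     # critical value under normal errors
print((b1_hat - z * se_b1,  b1_hat + z * se_b1))
print((b1_hat - tc * se_b1, b1_hat + tc * se_b1))
# With n - K large, the two intervals nearly coincide.
```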

Exercise 2.11. In the regression of log earnings for 655 women,

      Log Earnings = 6.662 + 0.159 Educ + 0.051 Exp
                    (0.209)  (0.014)     (0.011)

construct a confidence interval for the coefficient of education. What does it mean in economic terms?

2.3.2 Multivariate Case

Consider

      y = X₁β₁ + X₂β₂ + ε ,    with dim β₂ = K − M .

→ If we assume normality and homoskedasticity of errors, the confidence region for β₂ of confidence level 1 − α is

      { β₂₀ : (β̂₂ − β₂₀)′ [R(X′X)⁻¹R′]⁻¹ (β̂₂ − β₂₀) / ((K − M) s²) ≤ F_{K−M, n−K, 1−α} } ,

  with R the matrix that selects the components β₂.

→ If we don't assume normality (but still homoskedasticity), the confidence region of asymptotic confidence level 1 − α is

      { β₂₀ : (β̂₂ − β₂₀)′ [R(X′X)⁻¹R′]⁻¹ (β̂₂ − β₂₀) / s² ≤ χ²_{K−M, 1−α} } .

→ The confidence region is always an ellipse (see the sketch after this list).

→ The confidence region is not the intersection of individual confidence intervals: it takes into account the covariance of the individual estimators.

→ Some coefficients may be individually “insignificant” but “jointly significant.” The reverse can also happen.
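A minimal sketch (simulated data, not from the notes) that checks pointwise whether a candidate value lies inside the joint confidence region, using the F form under normal errors:

```python
# Joint 95% confidence region for two slope coefficients, checked pointwise.
import numpy as np
from scipy.stats import f as f_dist

rng = np.random.default_rng(4)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
s2 = np.sum((y - X @ b) ** 2) / (n - K)

R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])        # selects the two slope coefficients

def in_region(b20, alpha=0.05):
    """True if b20 lies in the joint confidence region at level 1 - alpha."""
    d = R @ b - b20
    stat = d @ np.linalg.solve(R @ XtX_inv @ R.T, d) / (2 * s2)
    return stat <= f_dist.ppf(1 - alpha, 2, n - K)

print(in_region(R @ b))                 # the OLS estimate is always inside
print(in_region(np.array([0.5, -0.3]))) # true values: inside ~95% of the time
print(in_region(np.array([5.0, 5.0])))  # a distant point: outside
```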

Exercise 2.12. What is the definition of an asymptotic confidence region for a nonlinear function of the parameters r(β)?

Figure 2.4: Confidence Region

Appendix A

Prerequisites: Tests of Hypotheses

Student and Fisher Distributions

Definition 8. If z ∼ N(0, 1) and U ∼ χ²_p are two independent random variables, then z/√(U/p) follows a Student distribution with p degrees of freedom, denoted z/√(U/p) ∼ t_p.

When p → ∞, t_p becomes N(0, 1) (why?).

Definition 9. If X₁ ∼ χ²_{m₁} and X₂ ∼ χ²_{m₂} are independent, then F = (X₁/m₁)/(X₂/m₂) follows a Fisher distribution with m₁ and m₂ degrees of freedom, denoted F ∼ F(m₁, m₂).

Particular case: F_{1,p} = t_p².

Test on a Mean

1. Determine the hypotheses of interest.

   → H₀: μ = μ₀ against H₁: μ ≠ μ₀: two-sided alternative.
   → H₀: μ ≤ μ₀ against H₁: μ > μ₀: one-sided alternative.

2. Determine the test statistic t, whose behavior is known under H₀, or when at the boundary between H₀ and H₁:

      t̂ = √n (ȳ − μ₀) / s .

   When μ = μ₀, t̂ ∼ t_{n−1} if y ∼ N(μ₀, σ²).

3. Choose the level of the test and determine the rejection (or critical) region. (Recall the Type II error: not rejecting H₀ when H₀ is false.)

   → Two-sided test: reject when |t̂| > c. Since t̂ ∼ t_{n−1} when μ = μ₀, Pr[|t̂| > c | μ = μ₀] = α when c is the 1 − (α/2) quantile of a t_{n−1}, denoted t_{n−1; 1−(α/2)}.
   → One-sided test: reject when t̂ > c, with Pr[t̂ > c | μ = μ₀] = α for c = t_{n−1; 1−α}.

   See Figure A.1.

4. Compute the test statistic and conclude.
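A minimal sketch (simulated data, not from the notes) of the two-sided test on a mean:

```python
# Two-sided t-test of H0: mu = mu0 on simulated data.
import numpy as np
from scipy.stats import t as t_dist

rng = np.random.default_rng(5)
y = rng.normal(loc=0.3, scale=1.0, size=50)   # true mean 0.3
mu0, alpha = 0.0, 0.05

n = y.size
t_hat = np.sqrt(n) * (y.mean() - mu0) / y.std(ddof=1)
c = t_dist.ppf(1 - alpha / 2, df=n - 1)
p = 2 * t_dist.sf(abs(t_hat), df=n - 1)
print(t_hat, c, p, abs(t_hat) > c)   # reject H0 when |t_hat| > c
```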


Figure A.1: Tests of a mean: Critical Region

Appendix B

Further and Deeper Details

B.1 Derivation of the Restricted Least Squares Estimator

We should solve

      arg min_{β,λ} L(β, λ) ,    where L(β, λ) = (y − Xβ)′(y − Xβ) + λ′(Rβ − r) .

→ FOC:

      ∂L/∂β = 0  ⇔  −2X′y + 2X′Xβ̂_R + R′λ̂ = 0
                 ⇔  2X′Xβ̂_R = 2X′y − R′λ̂
                 ⇔  β̂_R = (X′X)⁻¹X′y − ½ (X′X)⁻¹R′λ̂
                 ⇔  β̂_R = β̂ − ½ (X′X)⁻¹R′λ̂ ,                      (B.1)

      ∂L/∂λ = 0  ⇔  Rβ̂_R − r = 0 ,                                   (B.2)

  where β̂ ≡ β̂_OLS.

→ Inserting (B.1) into (B.2):

      Rβ̂ − ½ R(X′X)⁻¹R′λ̂ − r = 0
      ⇔  ½ R(X′X)⁻¹R′λ̂ = Rβ̂ − r
      ⇔  λ̂ = 2 [R(X′X)⁻¹R′]⁻¹ (Rβ̂ − r) .                            (B.3)

→ Inserting (B.3) into (B.1):

      β̂_R = β̂ − (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ (Rβ̂ − r)                  (B.4)
           = Aβ̂ + b ,

  where

      A = I − (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ R ,                            (B.5)
      b = (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ r .                                (B.6)

B.2 Oblique Projection

→ C is a K × K matrix, but it is not symmetric.

→ C is idempotent, i.e. CC = C. So any of its eigenvalues must be either zero or one.

→ Hence C must be a projection: it leaves a whole linear subspace unchanged, and it maps the complementary subspace to 0. But the two subspaces are not orthogonal in general.

→ Any vector u can be written as

      u = Cu + Au ,

  but the two elements are not orthogonal.
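A minimal numerical sketch (assumed X and R, not from the notes) of these properties:

```python
# C = (X'X)^{-1} R' [R (X'X)^{-1} R']^{-1} R is idempotent but not
# symmetric: an oblique projection.
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(50, 3))
R = np.array([[0.0, 1.0, 1.0]])

XtX_inv = np.linalg.inv(X.T @ X)
C = XtX_inv @ R.T @ np.linalg.inv(R @ XtX_inv @ R.T) @ R

print(np.allclose(C @ C, C))              # True: idempotent
print(np.allclose(C, C.T))                # False: not symmetric
print(np.round(np.linalg.eigvals(C), 6))  # eigenvalues are zeros and ones
```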

B.3 Proof of the Asymptotic Distribution of RLS (Theorem 2.1.6)

→ Since β̂ = β + (X′X)⁻¹X′ε,

      Rβ̂ − r = Rβ − r + R(X′X)⁻¹X′ε .

→ If the restrictions hold, Rβ = r, and using (2.1),

      β̂_R − β = (X′X)⁻¹X′ε − (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ R(X′X)⁻¹X′ε
               = ( (X′X)⁻¹ − (X′X)⁻¹R′ [R(X′X)⁻¹R′]⁻¹ R(X′X)⁻¹ ) X′ε .

→ Therefore:

      √n (β̂_R − β)
        = ( n(X′X)⁻¹ − n(X′X)⁻¹R′ [R n(X′X)⁻¹ R′]⁻¹ R n(X′X)⁻¹ ) (1/√n) X′ε .

  Using n⁻¹X′X →ᵖ Q, (1/√n) X′ε →ᵈ N(0, σ²Q), and applying the Slutsky theorem,

      √n (β̂_R − β) →ᵈ V N(0, σ²Q) ,

  where V = Q⁻¹ − Q⁻¹R′(RQ⁻¹R′)⁻¹RQ⁻¹.

→ After some matrix manipulation, it follows that the variance matrix of the asymptotic distribution is given by

      σ² V Q V′ = σ² V .

  Therefore:

      √n (β̂_R − β) →ᵈ N(0, σ² V) .

B.4 Proof of the Distribution of the F Statistic (Theorem 2.2.1)

Proof.

      F̂ = [ (Rβ̂ − r)′ (R(X′X)⁻¹R′)⁻¹ (Rβ̂ − r) / ((K − M) σ²) ] / [ s²/σ² ] = A/B .

Since

      ε | X ∼ N(0, σ²I)  ⟹  β̂ | X ∼ N(β, σ² (X′X)⁻¹) ,

we get

      Rβ̂ − r | X ∼ N(Rβ − r, σ² R(X′X)⁻¹R′) .

Under H₀, Rβ = r, so that A ∼ χ²_{K−M}/(K − M). We also know that

      s²/σ² ∼ χ²_{n−K}/(n − K) .

Are the numerator and denominator independent? Yes, because

→ the residuals ε̂ are orthogonal to the fitted values ŷ = Xβ̂ conditionally on X, so they are also independent of β̂ conditionally on X,

→ s² depends only on the residuals, so it is independent of β̂ conditionally on X.

Hence A is independent of B conditionally on X, and F̂ ∼ F_{K−M, n−K} under H₀ conditionally on X. But this distribution does not depend on X, so F̂ ∼ F_{K−M, n−K} under H₀.
