
UNSW MATH2931 Linear Models

Writing Assignment 2
KEVIN GE (z5113188@ad.unsw.edu.au)
RISHIKESAN MARAN (z5113193@ad.unsw.edu.au)
TIMOTHY YU (z5116065@ad.unsw.edu.au)

15 September 2017

Problem 3:
Let $y = (y_1, \ldots, y_n)^T$ denote a set of responses, and consider the linear model
\[
y = \mu + \epsilon
\]
where $\mu = (\mu, \ldots, \mu)^T$ and $\epsilon$ is a vector of zero-mean uncorrelated errors with variance $\sigma^2$.

1. This model can be written in general linear model form $y = X\beta + \epsilon$, where
\[
X = \underbrace{[1, 1, \ldots, 1]^T}_{n \text{ times}}
\]
and $\beta = \mu$.
2. • $X^TX = [1, 1, \ldots, 1][1, 1, \ldots, 1]^T = n$
   • $(X^TX)^{-1} = 1/n$
   • $X^TY = [1, 1, \ldots, 1][y_1, \ldots, y_n]^T = \sum_{i=1}^n y_i$
   • $(X^TX)^{-1}X^TY = \frac{1}{n}\sum_{i=1}^n y_i = \bar{y}$
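As a quick numerical sketch (using a made-up response vector for illustration), the general least squares formula $(X^TX)^{-1}X^Ty$ applied to the all-ones design recovers exactly the sample mean:

```python
import numpy as np

# Hypothetical responses, for illustration only.
y = np.array([3.2, 4.1, 5.0, 2.7, 4.4])
n = len(y)

# Intercept-only design: a single column of ones.
X = np.ones((n, 1))

# beta_hat = (X^T X)^{-1} X^T y
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y
assert np.isclose(beta_hat[0], y.mean())
```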

3. The least squares estimate of $\mu$ is the value $\hat{\mu}$ which minimises the sum of squared distances between $y_i$ and $E(y_i)$; i.e. we are required to minimise
\begin{align*}
S(\mu) &= \sum_{i=1}^n (y_i - E[y_i])^2 \\
&= \sum_{i=1}^n (y_i - E[\mu + \epsilon_i])^2 \\
&= \sum_{i=1}^n (y_i - \mu)^2 \quad (\text{since } E(\epsilon_i) = 0)
\end{align*}

We now minimise $S$ with respect to $\mu$, obtaining:
\[
\frac{\partial S}{\partial \mu} = -2\sum_{i=1}^n (y_i - \mu).
\]
By setting the partial derivative to zero, we retrieve the equation which yields $\hat{\mu}$:
\begin{align*}
-2\sum_{i=1}^n (y_i - \mu) &= 0 \\
\implies \sum_{i=1}^n (y_i - \mu) &= 0 \\
\implies n\mu &= \sum_{i=1}^n y_i \\
\implies \hat{\mu} &= \bar{y}.
\end{align*}
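The minimisation above can also be checked numerically: evaluating $S(\mu)$ over a fine grid of candidates (with an illustrative data vector) lands on the sample mean. A minimal sketch:

```python
import numpy as np

# Illustrative responses.
y = np.array([2.0, 3.0, 5.0, 10.0])

def S(mu):
    """Sum of squared deviations of y from a candidate mu."""
    return np.sum((y - mu) ** 2)

# Evaluate S over a fine grid; the minimiser should be ybar.
grid = np.linspace(y.min(), y.max(), 100001)
mu_hat = grid[np.argmin([S(m) for m in grid])]
assert abs(mu_hat - y.mean()) < 1e-3
```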

Problem 6:
Suppose we have the linear model $y = X\beta + \epsilon$ with $n \times p$ design matrix $X$, normal errors $\epsilon \sim N(0, \sigma^2 I_{n \times n})$, and
\[
(n - p)\hat{\sigma}^2 = y^T(I - X(X^TX)^{-1}X^T)y = y^TAy.
\]

1. We prove that $A$ is symmetric and idempotent.
\begin{align*}
A^T &= (I - X(X^TX)^{-1}X^T)^T \\
&= I^T - (X(X^TX)^{-1}X^T)^T \\
&= I - (X^T)^T((X^TX)^{-1})^TX^T \\
&= I - X(X^TX)^{-1}X^T = A \quad (\text{since } (X^TX)^{-1} \text{ is symmetric}).
\end{align*}
So $A$ is symmetric. Also,
\begin{align*}
A^2 &= (I - X(X^TX)^{-1}X^T)^2 \\
&= I - 2X(X^TX)^{-1}X^T + X\underbrace{(X^TX)^{-1}X^TX}_{=I}(X^TX)^{-1}X^T \\
&= I - 2X(X^TX)^{-1}X^T + X(X^TX)^{-1}X^T \\
&= I - X(X^TX)^{-1}X^T = A.
\end{align*}
So $A$ is idempotent.
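Both properties are easy to confirm numerically for an arbitrary full-rank design; a minimal sketch with a randomly generated $X$ (an assumption, not part of the problem):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 8, 3
X = rng.standard_normal((n, p))   # random full-rank design (assumption)

# A = I - X (X^T X)^{-1} X^T
A = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

assert np.allclose(A, A.T)     # symmetric
assert np.allclose(A @ A, A)   # idempotent
```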
2. Since $A$ is symmetric and idempotent, $\text{rank}(A) = \text{tr}(A)$. Also,
\begin{align*}
\text{tr}(A) &= \text{tr}(I_{n \times n} - X(X^TX)^{-1}X^T) \\
&= \text{tr}(I_{n \times n}) - \text{tr}(X(X^TX)^{-1}X^T) \\
&= n - \text{tr}(X^TX(X^TX)^{-1}) \quad (\text{since } \text{tr}(BC) = \text{tr}(CB)) \\
&= n - \text{tr}(I_{p \times p}) \\
&= n - p.
\end{align*}
Hence $\text{rank}(A) = \text{tr}(A) = n - p$.
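Again this can be spot-checked numerically with a random full-rank design (an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 10, 4
X = rng.standard_normal((n, p))   # random full-rank design (assumption)

A = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

# Trace and rank both equal n - p for this symmetric idempotent matrix.
assert np.isclose(np.trace(A), n - p)
assert np.linalg.matrix_rank(A) == n - p
```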

3. We first compute $\text{Var}(Ay + b)$, where $b = (X^TX)^{-1}X^Ty$ is the least squares estimator:
\begin{align*}
\text{Var}(Ay + b) &= \text{Var}((A + (X^TX)^{-1}X^T)y) \\
&= (A + (X^TX)^{-1}X^T)\,\text{Var}(y)\,(A + (X^TX)^{-1}X^T)^T \\
&= (A + (X^TX)^{-1}X^T)(\sigma^2 I)(A + X(X^TX)^{-1}) \\
&= \sigma^2(A^2 + AX(X^TX)^{-1} + (X^TX)^{-1}X^TA + \underbrace{(X^TX)^{-1}X^TX}_{=I}(X^TX)^{-1}) \\
&= \sigma^2(A + AX(X^TX)^{-1} + (X^TX)^{-1}X^TA + (X^TX)^{-1}).
\end{align*}
Now,
\[
AX = (I - X(X^TX)^{-1}X^T)X = IX - X(X^TX)^{-1}X^TX = X - X = 0
\]
and
\[
X^TA = X^T(I - X(X^TX)^{-1}X^T) = X^TI - X^TX(X^TX)^{-1}X^T = X^T - X^T = 0.
\]
Hence, substituting $AX = 0$ and $X^TA = 0$ into the expansion above, we obtain:
\begin{align*}
\text{Var}(Ay + b) &= \sigma^2(A + (0)(X^TX)^{-1} + (X^TX)^{-1}(0) + (X^TX)^{-1}) \\
&= \sigma^2(A + (X^TX)^{-1}).
\end{align*}

Next we compute $\text{Var}(Ay)$ and $\text{Var}(b)$ individually:
\[
\text{Var}(Ay) = A\,\text{Var}(y)\,A^T = \sigma^2 AA^T = \sigma^2 A
\]
and
\begin{align*}
\text{Var}(b) &= \text{Var}((X^TX)^{-1}X^Ty) \\
&= (X^TX)^{-1}X^T\,\text{Var}(y)\,((X^TX)^{-1}X^T)^T \\
&= \sigma^2\underbrace{(X^TX)^{-1}X^TX}_{=I}(X^TX)^{-1} \\
&= \sigma^2(X^TX)^{-1}.
\end{align*}
Finally we substitute the obtained variances into the variance-of-a-sum property:
\begin{align*}
\text{Var}(Ay + b) &= \text{Var}(Ay) + \text{Var}(b) + 2\,\text{Cov}(Ay, b) \\
\sigma^2(A + (X^TX)^{-1}) &= \sigma^2 A + \sigma^2(X^TX)^{-1} + 2\,\text{Cov}(Ay, b) \\
2\,\text{Cov}(Ay, b) &= 0 \\
\implies \text{Cov}(Ay, b) &= 0.
\end{align*}
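Since $\text{Cov}(Ay, b) = A\,\text{Var}(y)\,((X^TX)^{-1}X^T)^T = \sigma^2 AX(X^TX)^{-1}$, the zero covariance follows directly from $AX = 0$; a small numerical sketch (random design and an illustrative $\sigma^2$, both assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, sigma2 = 12, 3, 1.5          # illustrative dimensions and variance
X = rng.standard_normal((n, p))    # random full-rank design (assumption)

XtX_inv = np.linalg.inv(X.T @ X)
A = np.eye(n) - X @ XtX_inv @ X.T

# AX = 0, so Cov(Ay, b) = sigma^2 * A X (X^T X)^{-1} vanishes.
assert np.allclose(A @ X, 0)
cov = sigma2 * A @ X @ XtX_inv
assert np.allclose(cov, 0)
```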

4. To determine the distribution of
\[
\frac{(b - \beta)^TX^TX(b - \beta)}{\sigma^2}
\]
(which we will denote as $E$), we can rewrite $Xb$:
\begin{align*}
Xb &= X(X^TX)^{-1}X^Ty \\
&= X(X^TX)^{-1}X^T(X\beta + \epsilon) \\
&= X(X^TX)^{-1}X^TX\beta + X(X^TX)^{-1}X^T\epsilon \\
&= X\beta + X(X^TX)^{-1}X^T\epsilon.
\end{align*}
Therefore,

\begin{align*}
E &= \frac{(Xb - X\beta)^T(Xb - X\beta)}{\sigma^2} \\
&= \frac{(X(X^TX)^{-1}X^T\epsilon)^T\,X(X^TX)^{-1}X^T\epsilon}{\sigma^2} \\
&= \frac{\epsilon^TX(X^TX)^{-1}X^T\,X(X^TX)^{-1}X^T\epsilon}{\sigma^2} \\
&= \frac{\epsilon^TX(X^TX)^{-1}X^T\epsilon}{\sigma^2} \\
&= \frac{\epsilon^T(I - A)\epsilon}{\sigma^2}. \tag{1}
\end{align*}
Now, since $A$ is symmetric,
\[
(I - A)^T = I^T - A^T = I - A.
\]
Hence $I - A$ is also symmetric, so using the spectral theorem we can diagonalise $I - A$ in equation (1):
\[
\frac{\epsilon^T(I - A)\epsilon}{\sigma^2} = \frac{\epsilon^TQ^{-1}DQ\epsilon}{\sigma^2},
\]
where $D$ is a diagonal matrix containing the eigenvalues of $I - A$ and $Q$ is a real orthogonal matrix (that is, $Q^{-1} = Q^T$). Therefore,
\begin{align*}
E &= \frac{\epsilon^TQ^TDQ\epsilon}{\sigma^2} \\
&= \left(\frac{Q\epsilon}{\sigma}\right)^T D \left(\frac{Q\epsilon}{\sigma}\right) \\
&= z^TDz \quad \left(\text{where } z = \frac{Q\epsilon}{\sigma}\right) \\
&= \sum_{1 \le i,j \le n} z_i[D]_{ij}z_j \\
&= \sum_{i=1}^n [D]_{ii}z_i^2 \quad (\text{since } D \text{ is a diagonal matrix}).
\end{align*}

Now, we determine the possible values of $[D]_{ii}$, that is, the eigenvalues of $I - A$. Since $A$ is idempotent, $I - A$ is also idempotent because
\[
(I - A)^2 = I^2 - IA - AI + A^2 = I - 2A + A = I - A.
\]
This implies that its eigenvalues only take values 1 and 0: if $(I - A)v = \lambda v$ then $\lambda v = (I - A)^2v = \lambda^2 v$, so $\lambda^2 = \lambda$. Also, since the sum of the eigenvalues of $I - A$ is
\[
\text{tr}(I - A) = \text{tr}(I) - \text{tr}(A) = n - (n - p) = p,
\]
the eigenvalues 1 and 0 must have multiplicities $p$ and $n - p$ respectively. Therefore $[D]_{ii} = 1$ for $p$ values of $i$ (say $i = i_1, i_2, \ldots, i_p$) and 0 otherwise.

Next we determine the distribution of $z_i^2$. It is given that $\epsilon \sim N(0, \sigma^2 I_{n \times n})$, so it follows that
\[
z \sim N\left(\frac{Q}{\sigma}\cdot 0,\ \frac{Q}{\sigma}\,\sigma^2 I_{n \times n}\,\frac{Q^T}{\sigma}\right) \stackrel{d}{=} N(0, I_{n \times n}) \quad (\text{since } QQ^T = I).
\]
Therefore $z_i \sim N(0, 1) \implies z_i^2 \sim \chi_1^2$, and the $z_i$ are independent since $z$ has identity covariance. To conclude,
\[
E = \sum_{i=1}^n [D]_{ii}z_i^2 = \sum_{k=1}^p 1 \cdot z_{i_k}^2 = \sum_{k=1}^p z_{i_k}^2 \sim \chi_p^2.
\]
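A Monte Carlo sketch supports this conclusion: with a random design and made-up $\beta$ and $\sigma$ (all illustrative assumptions), simulated values of $E$ should have mean close to $p$ and variance close to $2p$, matching a $\chi^2_p$ distribution:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma, reps = 50, 3, 2.0, 20000   # illustrative settings
X = rng.standard_normal((n, p))          # random full-rank design (assumption)
beta = np.array([1.0, -2.0, 0.5])        # made-up true coefficients
XtX = X.T @ X
XtX_inv = np.linalg.inv(XtX)

# Simulate E = (b - beta)^T X^T X (b - beta) / sigma^2 repeatedly.
E_vals = np.empty(reps)
for r in range(reps):
    y = X @ beta + sigma * rng.standard_normal(n)
    b = XtX_inv @ X.T @ y
    d = b - beta
    E_vals[r] = d @ XtX @ d / sigma**2

# chi^2_p has mean p and variance 2p.
assert abs(E_vals.mean() - p) < 0.15
assert abs(E_vals.var() - 2 * p) < 0.6
```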
