
Prof. Johanna F. Ziegel
Anja Mühlemann

Multivariate Statistics, Fall Semester 2017

Solutions 1

1. Let F be the multivariate distribution function of a random vector X ∈ R^p, and let Fi be the univariate distribution function of its i-th component Xi.

   a) Show that

          max( 1 − p + ∑_{i=1}^p Fi(ri), 0 ) ≤ F(r) ≤ min_{i=1,...,p} Fi(ri)

      for arbitrary r ∈ R^p.
      Proof: Let us first consider the right inequality. For any r ∈ R^p we have {X ≤ r} ⊂ {Xi ≤ ri} for all i ∈ {1, . . . , p}, so taking probabilities yields P(X ≤ r) ≤ P(Xi ≤ ri) = Fi(ri) for all i ∈ {1, . . . , p}. In summary, F(r) ≤ min_{1≤i≤p} Fi(ri).
      Turning to the left inequality, it clearly suffices to show that 1 − p + ∑_{i=1}^p Fi(ri) ≤ F(r). Elementary algebraic manipulations lead to the equivalent inequality 1 − F(r) ≤ ∑_{i=1}^p (1 − Fi(ri)), or in other terms P(∪_{i=1}^p {Xi > ri}) ≤ ∑_{i=1}^p P(Xi > ri). The latter holds by subadditivity of the measure P.
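
      Remark (not part of the exercise): the bounds can be checked numerically. The following Python sketch draws a sample from an arbitrarily chosen dependent distribution and compares the empirical joint distribution function at a point r of my choosing with the two bounds computed from the empirical marginals; numpy, the seed, the sample construction and r are all assumptions of this illustration.

```python
# Minimal Monte Carlo check of the Fréchet–Hoeffding bounds (illustration only).
import numpy as np

rng = np.random.default_rng(0)
n, p = 100_000, 3

# Some dependent random vector X in R^3 (the construction is arbitrary).
Z = rng.standard_normal((n, p))
X = np.column_stack([Z[:, 0], 0.7 * Z[:, 0] + 0.3 * Z[:, 1], Z[:, 2] ** 2])

r = np.array([0.5, 0.2, 1.0])                       # arbitrary evaluation point
F_joint = np.mean(np.all(X <= r, axis=1))           # empirical F(r)
F_marg = np.array([np.mean(X[:, i] <= r[i]) for i in range(p)])  # empirical Fi(ri)

lower = max(1 - p + F_marg.sum(), 0.0)
upper = F_marg.min()
print(lower, "<=", F_joint, "<=", upper)            # holds up to Monte Carlo error
```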

   b) For 0 < u < 1 let Fi^{-1}(u) := min{r ∈ R : Fi(r) ≥ u}. Determine the distribution function F of the random vector

          X := (Fi^{-1}(U))_{i=1}^p

      with U ∼ Unif[0, 1]. In the case p = 2, determine the distribution function of

          X := (F1^{-1}(U), F2^{-1}(1 − U))^T.

      Proof: Let X := (Fi^{-1}(U))_{i=1}^p with U ∼ Unif[0, 1]. Let us first record a simple but essential fact that will help us solve the question without any additional assumption on F (such as continuity of the Fi's): for any u ∈ (0, 1) and x ∈ R we have the equivalence (Fi^{-1}(u) ≤ x) ⇔ (Fi(x) ≥ u). From this we directly obtain

          P(X ≤ r) = P(F1^{-1}(U) ≤ r1, . . . , Fp^{-1}(U) ≤ rp)
                   = P(U ≤ F1(r1), . . . , U ≤ Fp(rp))
                   = P(U ≤ min_{i∈{1,...,p}} Fi(ri))
                   = min_{i∈{1,...,p}} Fi(ri),

      which realizes the upper bound given in a).
      Assuming now that p = 2 and X := (F1^{-1}(U), F2^{-1}(1 − U))^T, we obtain further

          P(X ≤ r) = P(F1^{-1}(U) ≤ r1, F2^{-1}(1 − U) ≤ r2)
                   = P(U ≤ F1(r1), 1 − U ≤ F2(r2))
                   = P(1 − F2(r2) ≤ U ≤ F1(r1)).

      If 1 − F2(r2) ≥ F1(r1), the latter probability is zero. If 1 − F2(r2) < F1(r1), it equals

          F1(r1) − (1 − F2(r2)) = 1 − p + ∑_{i=1}^p Fi(ri)   (with p = 2),

      so that X realizes the lower bound given in a) for p = 2.
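
      Remark (not part of the exercise): a short Python sketch of these two couplings with two arbitrarily chosen marginals (Exp(1) and Unif[0,1]); the quantile functions, the evaluation point and the seed are assumptions of this illustration.

```python
# Comonotonic coupling (F1^{-1}(U), F2^{-1}(U)) attains the upper bound,
# countermonotonic coupling (F1^{-1}(U), F2^{-1}(1-U)) attains the lower bound.
import numpy as np

rng = np.random.default_rng(1)
U = rng.uniform(size=200_000)

F1_inv = lambda u: -np.log(1.0 - u)   # quantile function of Exp(1)
F2_inv = lambda u: u                  # quantile function of Unif[0,1]
F1 = lambda x: 1.0 - np.exp(-x)       # Exp(1) distribution function
F2 = lambda x: np.clip(x, 0.0, 1.0)   # Unif[0,1] distribution function

r1, r2 = 0.8, 0.6                     # arbitrary evaluation point

upper = np.mean((F1_inv(U) <= r1) & (F2_inv(U) <= r2))        # empirical F(r), comonotonic
lower = np.mean((F1_inv(U) <= r1) & (F2_inv(1.0 - U) <= r2))  # empirical F(r), countermonotonic

print(upper, "vs", min(F1(r1), F2(r2)))
print(lower, "vs", max(F1(r1) + F2(r2) - 1.0, 0.0))
```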

   c) Show that the lower bound for F(r) in part (a) is sharp (pointwise). That means, for arbitrary univariate distribution functions F1, F2, . . . , Fp and any vector r ∈ R^p there exists a random vector X with marginal distribution functions F1, F2, . . . , Fp such that

          IP(X ≤ r) = max( 1 − p + ∑_{i=1}^p Fi(ri), 0 ).

      Proof: We previously obtained in the case p = 2 that the left Fréchet–Hoeffding inequality is realized for X := (F1^{-1}(U), F2^{-1}(1 − U))^T. Let us now show by induction that this bound is pointwise sharp for any p ≥ 2. Given r_k = (r_1, . . . , r_k)^T ∈ R^k, let us assume that some random vector X_k := (X_1, . . . , X_k)^T realizes the bound for p = k. Extending r_k to r_{k+1} by adding an arbitrary component r_{k+1} ∈ R, we will show that there exists an F_{k+1}-distributed random variable X_{k+1} such that the left Fréchet–Hoeffding inequality is realized, or equivalently such that

          P(∪_{i=1}^{k+1} {X_i > r_i}) = min( ∑_{i=1}^{k+1} P(X_i > r_i), 1 ).

      For 1 ≤ j ≤ k + 1, denote A_j := ∪_{i=1}^j {X_i > r_i} and m_j := P(A_j). If m_k = 1, then m_{k+1} = 1, and since ∑_{i=1}^{k+1} P(X_i > r_i) ≥ m_k = 1 by subadditivity, the desired equality is immediate. If m_k < 1, we construct a random variable X_{k+1} in such a way that {X_{k+1} > r_{k+1}} fills A_k^c as much as possible (and possibly covers it). The key idea is to construct a standard uniform random variable U whose values above m_k are attained on A_k^c. For instance, let U_1 and U_2 be independent standard uniform random variables, also independent of X_k, and set U := m_k U_1 1_{A_k} + (m_k + (1 − m_k) U_2) 1_{A_k^c}. Then U is standard uniformly distributed.
      Setting X_{k+1} := F_{k+1}^{-1}(U), we get the equivalence (X_{k+1} > r_{k+1}) ⇔ (U > F_{k+1}(r_{k+1})). If F_{k+1}(r_{k+1}) ≤ m_k, then A_{k+1} = A_k ∪ {U > F_{k+1}(r_{k+1})} ⊃ A_k ∪ {U > m_k} ⊃ A_k ∪ A_k^c = Ω up to a null set, so P(A_{k+1}) = 1; since moreover ∑_{i=1}^{k+1} P(X_i > r_i) ≥ m_k + (1 − F_{k+1}(r_{k+1})) ≥ 1, both sides of the claimed equality are 1. If F_{k+1}(r_{k+1}) > m_k, then A_{k+1} = A_k ∪ {U > F_{k+1}(r_{k+1})} is a disjoint union because {U > F_{k+1}(r_{k+1})} ⊂ A_k^c, and so P(A_{k+1}) = P(A_k) + P(X_{k+1} > r_{k+1}) = ∑_{i=1}^{k+1} P(X_i > r_i). In either case the equality holds for p = k + 1, which concludes the proof.
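
      Remark (not part of the exercise): the inductive construction can be traced numerically. The sketch below carries it out for p = 3 with all marginals Unif[0,1] (so the quantile functions are the identity), an evaluation point r of my choosing, and numpy as an assumption.

```python
# Start from the countermonotonic pair of b), then build X3 from a uniform
# variable whose values above m2 = P(A2) are attained on the complement of A2.
import numpy as np

rng = np.random.default_rng(2)
n = 500_000
r = np.array([0.9, 0.8, 0.7])                 # arbitrary evaluation point

U = rng.uniform(size=n)
X1, X2 = U, 1.0 - U                           # attains the lower bound for p = 2

A2 = (X1 > r[0]) | (X2 > r[1])                # event A_2
m2 = A2.mean()                                # its probability

U1, U2 = rng.uniform(size=n), rng.uniform(size=n)
V = np.where(A2, m2 * U1, m2 + (1.0 - m2) * U2)   # standard uniform by construction
X3 = V                                        # F3^{-1} is the identity for Unif[0,1]

lhs = np.mean((X1 <= r[0]) & (X2 <= r[1]) & (X3 <= r[2]))
rhs = max(1 - 3 + r.sum(), 0.0)               # lower Fréchet–Hoeffding bound at r
print(lhs, "vs", rhs)
```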



2. For real-valued random variables X, Y with finite expectations and real numbers α, β
it is well-known that
IE(X + Y ) = IE(X) + IE(Y ),
IE(XY ) = IE(X) IE(Y ) if X and Y are stochastically independent,
IE(α + βX) = α + β IE(X).

   Deduce from these facts the following formulae for random matrices: Let M, M̃ ∈ R^{p×q} be random matrices such that IE‖M‖, IE‖M̃‖ < ∞. Then

       IE(M^T) = IE(M)^T,
       IE(M + M̃) = IE(M) + IE(M̃),
       IE(M M̃^T) = IE(M) IE(M̃)^T   if M, M̃ are stoch. independent.

   Furthermore, for fixed matrices A ∈ R^{k×ℓ}, B ∈ R^{k×p} and C ∈ R^{q×ℓ},

       IE(A + BM C) = A + B IE(M) C.

   Proof: Let M = (m_ij)_{i≤p, j≤q}, M̃ = (m̃_ij)_{i≤p, j≤q} ∈ R^{p×q} be random matrices such that IE‖M‖, IE‖M̃‖ < ∞. Then

       IE(M^T) = IE( (m_ji)_{j≤q, i≤p} ) = ( IE(m_ji) )_{j≤q, i≤p} = ( ( IE(m_ij) )_{i≤p, j≤q} )^T = IE(M)^T

   and

       IE(M + M̃) = ( IE(m_ij + m̃_ij) )_{i≤p, j≤q}
                  = ( IE(m_ij) + IE(m̃_ij) )_{i≤p, j≤q}
                  = ( IE(m_ij) )_{i≤p, j≤q} + ( IE(m̃_ij) )_{i≤p, j≤q}
                  = IE(M) + IE(M̃).

   Furthermore, let M, M̃ be stoch. independent. Then

       IE(M M̃^T) = IE( ( ∑_{k=1}^q m_ik m̃_jk )_{i≤p, j≤p} )
                  = ( ∑_{k=1}^q IE(m_ik m̃_jk) )_{i≤p, j≤p}
                  = ( ∑_{k=1}^q IE(m_ik) IE(m̃_jk) )_{i≤p, j≤p}
                  = IE(M) IE(M̃)^T.

   Finally, consider fixed matrices A, B, C with suitable dimensions. Since A, B, C are non-random, they are stoch. independent of M and the above facts can be used:

       IE(A + BM C) = IE(A) + IE(B) IE(M) IE(C) = A + B IE(M) C.
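
   Remark (not part of the exercise): a quick Monte Carlo sanity check of the product rule; the dimensions, the chosen distributions and the seed are assumptions of this illustration.

```python
# Check IE(M M~^T) = IE(M) IE(M~)^T for independent random matrices by simulation.
import numpy as np

rng = np.random.default_rng(3)
n, p, q = 200_000, 3, 4

M  = rng.normal(loc=1.0, size=(n, p, q))          # n draws of M in R^{p x q}
Mt = rng.exponential(scale=2.0, size=(n, p, q))   # n independent draws of M~

lhs = np.mean(M @ Mt.transpose(0, 2, 1), axis=0)  # empirical IE(M M~^T)
rhs = M.mean(axis=0) @ Mt.mean(axis=0).T          # IE(M) IE(M~)^T, empirically
print(np.max(np.abs(lhs - rhs)))                  # small up to Monte Carlo error
```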



3. Let X, X̃ ∈ R^p and Y ∈ R^q be random vectors such that IE(‖X‖²), IE(‖X̃‖²) and IE(‖Y‖²) are finite. Verify the following formulae:

       Cov(Y, X) = Cov(X, Y)^T,
       Cov(X + X̃, Y) = Cov(X, Y) + Cov(X̃, Y),
       Var(X + X̃) = Var(X) + Cov(X, X̃) + Cov(X, X̃)^T + Var(X̃).

   Furthermore,

       Cov(X, Y) = 0 if X and Y are stoch. independent,
       Var(X + X̃) = Var(X) + Var(X̃) if X and X̃ are stoch. independent.

   Finally, for fixed vectors a ∈ R^k and matrices B ∈ R^{k×p},

       IE(a + BX) = a + B IE(X),
       Cov(a + BX, Y) = B Cov(X, Y),
       Var(a + BX) = B Var(X) B^T.

   Proof: We utilize the facts that, for arbitrary random matrices M, M̃ and fixed matrices A, B, C with suitable dimensions,

       IE(M^T) = IE(M)^T,                    (i)
       IE(M + M̃) = IE(M) + IE(M̃),           (ii)
       IE(A + BM C) = A + B IE(M) C.         (iii)

   Then (i) implies that

       Cov(Y, X) = IE( (Y − IE Y)(X − IE X)^T )
                 = IE( ( (X − IE X)(Y − IE Y)^T )^T )
                 = IE( (X − IE X)(Y − IE Y)^T )^T
                 = Cov(X, Y)^T.

   Moreover, (ii) implies that

       Cov(X + X̃, Y) = IE( (X + X̃ − IE(X + X̃))(Y − IE Y)^T )
                      = IE( (X − IE X + X̃ − IE X̃)(Y − IE Y)^T )
                      = IE( (X − IE X)(Y − IE Y)^T + (X̃ − IE X̃)(Y − IE Y)^T )
                      = IE( (X − IE X)(Y − IE Y)^T ) + IE( (X̃ − IE X̃)(Y − IE Y)^T )
                      = Cov(X, Y) + Cov(X̃, Y).

   The next equality follows from the two calculations above:

       Var(X + X̃) = Cov(X + X̃, X + X̃)
                   = Cov(X, X + X̃) + Cov(X̃, X + X̃)
                   = Cov(X + X̃, X)^T + Cov(X + X̃, X̃)^T
                   = Cov(X, X)^T + Cov(X̃, X)^T + Cov(X, X̃)^T + Cov(X̃, X̃)^T
                   = Var(X) + Cov(X, X̃) + Cov(X̃, X) + Var(X̃).



   Furthermore, let X, Y and X, X̃ be stoch. independent, respectively. We use the facts from Exercise 2 to obtain

       Cov(X, Y) = IE( (X − IE X)(Y − IE Y)^T )
                 = IE(X Y^T) − IE(X) IE(Y)^T
                 = IE(X) IE(Y)^T − IE(X) IE(Y)^T
                 = 0

   and

       Var(X + X̃) = Var(X) + Cov(X, X̃) + Cov(X̃, X) + Var(X̃)
                   = Var(X) + Var(X̃).
   The fact that

       IE(a + BX) = a + B IE(X)

   follows from the last statement in Exercise 2 with M = X, A = a ∈ R^{k×1} and C = 1 ∈ R^{1×1}. Next, it follows from (iii) that
       Cov(a + BX, Y) = IE( (a + BX − IE(a + BX))(Y − IE Y)^T )
                      = IE( B(X − IE X)(Y − IE Y)^T )
                      = B IE( (X − IE X)(Y − IE Y)^T )
                      = B Cov(X, Y).

   Finally, (i) and (iii) together imply that

       Var(a + BX) = Cov(a + BX, a + BX)
                   = B Cov(X, a + BX)
                   = B Cov(a + BX, X)^T
                   = B (B Cov(X, X))^T
                   = B Var(X) B^T.
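
   Remark (not part of the exercise): the affine transformation rules hold exactly for empirical means and covariance matrices of a sample, which gives a simple numerical check; numpy, the chosen dimensions, a, B and the seed are assumptions of this illustration.

```python
# For a sample X, the empirical mean and covariance of Y = a + B X transform
# exactly as IE(a + BX) = a + B IE(X) and Var(a + BX) = B Var(X) B^T.
import numpy as np

rng = np.random.default_rng(4)
n, p, k = 10_000, 3, 2

X = rng.multivariate_normal(mean=[1.0, -2.0, 0.5],
                            cov=[[2.0, 0.3, 0.0],
                                 [0.3, 1.0, 0.4],
                                 [0.0, 0.4, 1.5]], size=n)
a = np.array([0.5, -1.0])
B = rng.normal(size=(k, p))

Y = a + X @ B.T                                            # the transformed sample

print(np.allclose(Y.mean(axis=0), a + B @ X.mean(axis=0)))                      # True
print(np.allclose(np.cov(Y, rowvar=False), B @ np.cov(X, rowvar=False) @ B.T))  # True
```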

4. Let X ∈ R^n be a random vector which is exchangeable in the sense that X has the same distribution as (X_{π(i)})_{i=1}^n for any fixed permutation π of {1, 2, . . . , n}.

   a) Show that the covariance matrix of X equals

          Σ = α I_n + β 1_n 1_n^T

      for certain real numbers α, β, where 1_n := (1)_{i=1}^n.


Proof: Since X is exchangeable we have
L(X1 , X2 , . . . , Xn ) = L(Xπ(1) , Xπ(2) , . . . , Xπ(n) )
and in particular
L(Xπ(1) ) = L(X1 ),
L(Xπ(1) , Xπ(2) ) = L(X1 , X2 )



      for arbitrary permutations π of {1, 2, . . . , n}. Therefore,

          IE Xi = IE X1 and Var(Xi) = Var(X1) for i = 1, 2, . . . , n,

      and

          Cov(Xi, Xj) = Cov(X1, X2) for i, j = 1, 2, . . . , n with i ≠ j.

      Taking β := Cov(X1, X2) and α := Var(X1) − β we obtain

          Σ := Var(X) = α I_n + β 1_n 1_n^T.
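
      Remark (not part of the exercise): a quick numerical illustration with an exchangeable vector of the form Xi = Z + εi (a shared component plus iid noise); this particular construction, numpy and the seed are assumptions of the sketch.

```python
# An exchangeable vector X_i = Z + eps_i has covariance matrix alpha*I_n + beta*1_n 1_n^T
# with beta = Var(Z) and alpha = Var(eps_1).
import numpy as np

rng = np.random.default_rng(5)
n, nsim = 5, 500_000

Z = rng.standard_normal((nsim, 1))        # shared component, Var(Z) = 1
eps = rng.standard_normal((nsim, n))      # iid individual components, Var = 1
X = Z + eps                               # exchangeable random vector in R^n

alpha, beta = 1.0, 1.0
print(np.round(np.cov(X, rowvar=False), 2))            # empirical covariance matrix
print(alpha * np.eye(n) + beta * np.ones((n, n)))      # alpha*I_n + beta*1_n 1_n^T
```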

   b) Under which conditions on α and β is Σ non-singular? Show that, under these conditions, the inverse Σ^{-1} is also of the type

          Σ^{-1} = α̃ I_n + β̃ 1_n 1_n^T

      for some real numbers α̃, β̃.


      Proof: To avoid matrix calculus we use a geometric approach:

          Σ 1_n = (α + nβ) 1_n,
          Σ v = α v for arbitrary v ∈ R^n with v ⊥ 1_n.

      Thus, α + nβ and α are the eigenvalues of Σ, and therefore Σ is invertible if and only if α ≠ 0 and α + nβ ≠ 0.
      If α ≠ 0 and α + nβ ≠ 0, we make for Σ^{-1} the ansatz

          Σ^{-1} = γ I_n + δ 1_n 1_n^T.

      Then

          I_n = (α I_n + β 1_n 1_n^T)(γ I_n + δ 1_n 1_n^T) = αγ I_n + (βγ + αδ + nβδ) 1_n 1_n^T,

      so

          αγ = 1 and βγ + αδ + nβδ = 0.

      This is true if and only if γ = 1/α and δ = −β/(α(α + nβ)). Therefore,

          Σ^{-1} = (1/α) I_n − (β/(α(α + nβ))) 1_n 1_n^T.
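
      Remark (not part of the exercise): the inverse formula is easy to verify numerically; the values of n, α and β below are arbitrary choices, and numpy is an assumption of this sketch.

```python
# Verify (alpha*I_n + beta*1_n 1_n^T)^{-1} = (1/alpha)*I_n - beta/(alpha*(alpha+n*beta)) * 1_n 1_n^T.
import numpy as np

n, alpha, beta = 6, 1.3, 0.7          # arbitrary, with alpha != 0 and alpha + n*beta != 0
J = np.ones((n, n))                   # 1_n 1_n^T
Sigma = alpha * np.eye(n) + beta * J

Sigma_inv = (1.0 / alpha) * np.eye(n) - beta / (alpha * (alpha + n * beta)) * J
print(np.allclose(Sigma @ Sigma_inv, np.eye(n)))   # True
```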

c) What can we say about the distribution of X if Σ is singular?


      Proof: Assume α = 0. Then Var(b^T X) = b^T Σ b = 0 for all b ∈ R^n with b ⊥ 1_n. Applying this to a basis b_1, . . . , b_{n−1} of the orthogonal complement of 1_n, we conclude that X lies with probability one on the straight line R1_n = {t 1_n : t ∈ R}, i.e.

          X1 = X2 = . . . = Xn almost surely.

      Assume β = −α/n, i.e. α + nβ = 0. Then Σ 1_n = 0 and therefore Var(1_n^T X) = 1_n^T Σ 1_n = 0. Thus, X lies with probability one on the hyperplane

          H := {x ∈ R^n : 1_n^T x = 1_n^T IE X} = {x ∈ R^n : 1_n^T x = n IE(X1)}.

      In other words,

          X̄ := n^{-1} ∑_{i=1}^n Xi = IE(X1) almost surely.
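
      Remark (not part of the exercise): a small numerical illustration of the degenerate case α + nβ = 0, obtained by centering an iid vector at its own sample mean; this construction, numpy and the seed are assumptions of the sketch.

```python
# Centering an iid vector at its sample mean gives an exchangeable vector with
# Sigma = I_n - (1/n) 1_n 1_n^T, i.e. alpha = 1 and beta = -1/n, so alpha + n*beta = 0.
import numpy as np

rng = np.random.default_rng(6)
n, nsim = 4, 100_000

eps = rng.standard_normal((nsim, n))
X = eps - eps.mean(axis=1, keepdims=True)       # exchangeable, components sum to zero

print(np.round(np.cov(X, rowvar=False), 2))     # ≈ I_n - (1/n) 1_n 1_n^T
print(np.max(np.abs(X.sum(axis=1))))            # ≈ 0: X lies on {x : 1_n^T x = n*IE(X1)}
```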
