Romain Couillet
Alcatel-Lucent Chair on Flexible Radio, Supélec, Gif sur Yvette, FRANCE
romain.couillet@supelec.fr
Outline
is a “good” estimate of R.
I if N/n = O(1), and if both (n, N) are large, we can still say, for all (i, j),
$$(R_n)_{ij} \xrightarrow{\text{a.s.}} (R)_{ij}$$
What about the global behaviour? What about the eigenvalue distribution?
Tools for Random Matrix Theory: Introduction to Large Dimensional Random Matrix Theory
Figure: Histogram of the eigenvalues of $R_n$.

Figure: The Marčenko–Pastur density $f_c(x)$ for c = 0.1, 0.2, 0.5.
Let $W \in \mathbb{C}^{N\times n}$ have i.i.d. elements of zero mean and variance 1/n. Consider the eigenvalues of the Gram matrix
$$W^H W \in \mathbb{C}^{n\times n}$$
Remark: If the entries are Gaussian, the matrix is called a Wishart matrix with n degrees of
freedom. The exact distribution is known in the finite case.
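A quick numerical sanity check (illustrative code, not part of the original slides): simulating such a W and diagonalizing $WW^H$ (which shares its non-zero eigenvalues with $W^HW$), the first two empirical spectral moments should approach the Marčenko–Pastur values 1 and 1 + c, and the eigenvalues should stay inside the support $[(1-\sqrt c)^2, (1+\sqrt c)^2]$:

```python
import numpy as np

rng = np.random.default_rng(0)
N, n = 200, 400                      # ratio c = N/n = 0.5 (illustrative sizes)

# W has i.i.d. complex entries of zero mean and variance 1/n
W = (rng.standard_normal((N, n)) + 1j * rng.standard_normal((N, n))) / np.sqrt(2 * n)
eig = np.linalg.eigvalsh(W @ W.conj().T)

c = N / n
a, b = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2   # Marcenko-Pastur support edges
m1, m2 = eig.mean(), (eig ** 2).mean()                # should approach 1 and 1 + c
```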
Tools for Random Matrix Theory: History of Mathematical Advances
E. Wigner, “Characteristic vectors of bordered matrices with infinite dimensions,” The Annals of Mathematics, vol. 62, pp. 546-564, 1955.
$$X_N = \frac{1}{\sqrt{N}}\begin{pmatrix}
0 & +1 & +1 & +1 & -1 & -1 & \cdots\\
+1 & 0 & -1 & +1 & +1 & +1 & \cdots\\
+1 & -1 & 0 & +1 & +1 & +1 & \cdots\\
+1 & +1 & +1 & 0 & +1 & +1 & \cdots\\
-1 & +1 & +1 & +1 & 0 & -1 & \cdots\\
-1 & +1 & +1 & +1 & -1 & 0 & \cdots\\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}$$
As the matrix dimension increases, what can we say about the eigenvalues (energy levels)?
I If $X_N \in \mathbb{C}^{N\times N}$ is Hermitian with i.i.d. entries of mean 0 and variance 1/N above the diagonal, then $F^{X_N} \xrightarrow{\text{a.s.}} F$, where F has density f, the semi-circle law:
$$f(x) = \frac{1}{2\pi}\sqrt{(4 - x^2)^+}$$
I Shown from the method of moments:
$$\lim_{N\to\infty}\frac{1}{N}\operatorname{tr} X_N^{2k} = \frac{1}{k+1}\binom{2k}{k}$$
which are exactly the (even) moments of f(x)!
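The moment convergence is easy to observe numerically (illustrative code, not from the slides): the normalized even traces of a variance-1/N Hermitian Wigner matrix approach the first Catalan numbers 1, 2, 5:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(1)
N = 400

# Hermitian Wigner matrix with entry variance 1/N
G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
X = (G + G.conj().T) / np.sqrt(4 * N)

moments = [np.trace(np.linalg.matrix_power(X, 2 * k)).real / N for k in (1, 2, 3)]
catalan = [comb(2 * k, k) // (k + 1) for k in (1, 2, 3)]   # 1, 2, 5
```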
I If $X_N \in \mathbb{C}^{N\times N}$ has i.i.d. entries of zero mean and variance 1/N, then asymptotically its complex eigenvalues distribute uniformly over the complex unit disc (the circular law).
Semi-circle law
Figure: Histogram of the eigenvalues of Wigner matrices and the semi-circle law, for N = 500
Circular law
Figure: Empirical eigenvalue distribution of $X_N$ and the circular law, in the complex plane.
I Much study has surrounded the Marčenko–Pastur law, the Wigner semi-circle law, etc.
I for practical purposes, we often need more general matrix models
I products and sums of random matrices
I i.i.d. models with correlation/variance profile
I distribution of inverses etc.
I for these models, it is often impossible to obtain a closed-form expression for the limiting distribution.
I sometimes we do not have a limiting convergence.
$$M_k^N = \frac{1}{N}\sum_{i=1}^{N}\lambda_i^k$$
$$c_k(A + B) = c_k(A) + c_k(B)$$
with $c_k(X)$ the cumulants of X. The cumulants $c_k$ are connected to the moments $m_k$ by
$$m_k = \sum_{\pi\in P(k)}\ \prod_{V\in\pi} c_{|V|}$$
A natural extension of classical probability to non-commutative random variables exists, called free probability.
Tools for Random Matrix Theory: The Moment Approach and Free Probability
Free probability
$$C_1 = M_1,\qquad C_2 = M_2 - M_1^2,\qquad C_3 = M_3 - 3M_1M_2 + 2M_1^3$$
R. Speicher, “Combinatorial theory of the free product with amalgamation and operator-valued
free probability theory,” Mem. A.M.S., vol. 627, 1998.
I Combinatorial description by non-crossing partitions,
$$M_n = \sum_{\pi\in NC(n)}\ \prod_{V\in\pi} C_{|V|}$$
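This formula can be checked by brute force (illustrative code, not from the slides): enumerate all set partitions, keep the non-crossing ones, and evaluate the sum. With $C_2 = 1$ and all other free cumulants zero, the formula must return the Catalan numbers, i.e., the moments of the semi-circle law:

```python
from math import prod

def partitions(s):
    """All set partitions of the tuple s."""
    if not s:
        yield []
        return
    first, rest = s[0], s[1:]
    for p in partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [p[i] + (first,)] + p[i + 1:]
        yield [(first,)] + p

def noncrossing(p):
    """No a < b < c < d with a, c in one block and b, d in another."""
    for B1 in p:
        for B2 in p:
            if B1 is not B2:
                for a in B1:
                    for c in B1:
                        for b in B2:
                            for d in B2:
                                if a < b < c < d:
                                    return False
    return True

def free_moment(n, cumulant):
    """M_n = sum over NC(n) of prod_{V in pi} C_{|V|}."""
    return sum(prod(cumulant(len(V)) for V in p)
               for p in partitions(tuple(range(n))) if noncrossing(p))

semicircle = lambda k: 1 if k == 2 else 0   # free cumulants of the semi-circle law
```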
Non-crossing partitions
Figure: Non-crossing partition π = {{1, 3, 4}, {2}, {5, 6, 7}, {8}} of NC(8).
$$C_k(A + B) = C_k(A) + C_k(B)$$
$$M_n(AB) = \sum_{(\pi_1,\pi_2)\in NC(n)}\ \prod_{\substack{V_1\in\pi_1\\ V_2\in\pi_2}} C_{|V_1|}(A)\,C_{|V_2|}(B)$$
which, in conjunction with the free moment-cumulant formula, gives all moments of sums and products.
Theorem
If F is a compactly supported distribution function, then F is determined by its moments.
I Without support compactness, the distribution function can in general not be retrieved from its moments. This is in particular the case for Vandermonde matrices.
Free convolution
I In classical probability theory, for independent A, B,
$$\mu_{A+B}(x) = \mu_A(x) * \mu_B(x) \triangleq \int \mu_A(t)\,\mu_B(x - t)\,dt$$
Theorem (convolution of the information-plus-noise model)
Let $W_N \in \mathbb{C}^{N\times n}$ have i.i.d. Gaussian entries of mean 0 and variance 1, and $A_N \in \mathbb{C}^{N\times n}$ be such that $\mu_{\frac{1}{n}A_NA_N^H} \Rightarrow \mu_A$, as $n/N \to c$. Then the eigenvalue distribution of
$$B_N = \frac{1}{n}(A_N + \sigma W_N)(A_N + \sigma W_N)^H$$
converges weakly and almost surely to $\mu_B$ such that
$$\mu_B = \big((\mu_A \boxslash \mu_c) \boxplus \delta_{\sigma^2}\big) \boxtimes \mu_c$$
with $\boxplus$ the free additive convolution, $\boxtimes$ the free multiplicative convolution, $\boxslash$ the free multiplicative deconvolution, and $\mu_c$ the Marčenko–Pastur law with ratio c.
Classical probability versus free probability:
Moments: $m_k = \int x^k\,dF(x)$, versus $M_k = \int x^k\,dF(x)$
Cumulants: $m_n = \sum_{\pi\in P(n)}\prod_{V\in\pi}c_{|V|}$, versus $M_n = \sum_{\pi\in NC(n)}\prod_{V\in\pi}C_{|V|}$
Independence: classical independence, versus freeness
Additive convolution: $f_{A+B} = f_A * f_B$, versus $\mu_{A+B} = \mu_A \boxplus \mu_B$
Multiplicative convolution: $f_{AB}$, versus $\mu_{AB} = \mu_A \boxtimes \mu_B$
Sum rule: $c_k(A+B) = c_k(A) + c_k(B)$, versus $C_k(A+B) = C_k(A) + C_k(B)$
Central limit: $\frac{1}{\sqrt n}\sum_{i=1}^n x_i \to \mathcal N(0,1)$, versus $\frac{1}{\sqrt n}\sum_{i=1}^n X_i \Rightarrow$ the semi-circle law
Definition
Let F be a real distribution function. The Stieltjes transform $m_F$ of F is the function defined, for $z \in \mathbb{C}\setminus\mathbb{R}$, as
$$m_F(z) = \int \frac{1}{\lambda - z}\,dF(\lambda)$$
Denoting z = x + iy, we have the inverse formula
$$F'(x) = \lim_{y\to 0}\frac{1}{\pi}\Im[m_F(x + iy)]$$
The Stieltjes transform is substantially more powerful than the moment approach:
I it conveys more information than any finite sequence of moments $M_1,\dots,M_K$;
I it is not handicapped by the support-compactness constraint;
I however, Stieltjes transform methods, while stronger, are often more painful to work with.
Tools for Random Matrix Theory: Introduction of the Stieltjes Transform
$$f_c(x) = (1 - c^{-1})^+\,\delta(x) + \frac{1}{2\pi c x}\sqrt{(x - a)^+(b - x)^+}$$
where $a = (1 - \sqrt c)^2$ and $b = (1 + \sqrt c)^2$.
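A small numerical sanity check of this density (illustrative code, not from the slides): for c < 1 there is no atom at 0, so the continuous part should integrate to 1, with mean 1:

```python
import numpy as np

def mp_density(x, c):
    """Continuous part of the Marcenko-Pastur density f_c(x)."""
    a, b = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
    out = np.zeros_like(x)
    on = (x > a) & (x < b)
    out[on] = np.sqrt((x[on] - a) * (b - x[on])) / (2 * np.pi * c * x[on])
    return out

c = 0.5
x = np.linspace(1e-6, 4.0, 200001)
f = mp_density(x, c)
dx = np.diff(x)
mass = float((0.5 * (f[1:] + f[:-1]) * dx).sum())                 # ~1
mean = float((0.5 * ((x * f)[1:] + (x * f)[:-1]) * dx).sum())     # ~1
```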
Figure: The Marčenko–Pastur density $f_c(x)$ for c = 0.1, 0.2, 0.5.
Since we want an expression for $m_F$, we start by identifying the diagonal entries of the resolvent $(X_NX_N^H - zI_N)^{-1}$ of $X_NX_N^H$. Denote
$$X_N = \begin{pmatrix} y^H\\ Y \end{pmatrix}$$
Now, for $z \in \mathbb{C}^+$, we have
$$X_NX_N^H - zI_N = \begin{pmatrix} y^Hy - z & y^HY^H\\ Yy & YY^H - zI_{N-1} \end{pmatrix}$$
Consider the first diagonal element of $(X_NX_N^H - zI_N)^{-1}$. From the matrix inversion lemma,
$$\begin{pmatrix} A & B\\ C & D \end{pmatrix}^{-1} = \begin{pmatrix} (A - BD^{-1}C)^{-1} & -A^{-1}B(D - CA^{-1}B)^{-1}\\ -(A - BD^{-1}C)^{-1}CA^{-1} & (D - CA^{-1}B)^{-1} \end{pmatrix}$$
Trace Lemma
Z. Bai, J. Silverstein, “Spectral Analysis of Large Dimensional Random Matrices”, Springer Series
in Statistics, 2009.
To go further, we need the following result.
Theorem
Let $\{A_N\}$, $A_N \in \mathbb{C}^{N\times N}$. Let $\{x_N\}$, $x_N \in \mathbb{C}^N$, be a random vector of i.i.d. entries with zero mean, variance 1/N and finite 8th order moment, independent of $A_N$. Then
$$x_N^H A_N x_N - \frac{1}{N}\operatorname{tr}A_N \xrightarrow{\text{a.s.}} 0.$$
$$m_F(z) = \frac{1}{N}\operatorname{tr}\big(X_NX_N^H - zI_N\big)^{-1} \simeq \frac{1}{c - 1 - z - z\,m_F(z)}$$
Theorem
Let $B_N = X_NT_NX_N^H \in \mathbb{C}^{N\times N}$, where $X_N \in \mathbb{C}^{N\times n}$ has i.i.d. entries of mean 0 and variance 1/N, $F^{T_N} \Rightarrow F^T$ and $n/N \to c$. Then $F^{B_N} \Rightarrow F$ almost surely, F having Stieltjes transform
$$m_F(z) = \left(c\int \frac{t}{1 + t\,m_F(z)}\,dF^T(t) - z\right)^{-1} = \left(\frac{1}{N}\operatorname{tr}\,T_N\big(m_F(z)\,T_N + I_n\big)^{-1} - z\right)^{-1}$$
and, for the companion distribution $\underline F$ of $X_N^HT_NX_N$,
$$m_{\underline F}(z) = c\,m_F(z) + (c - 1)\frac{1}{z}$$
Spectrum of the sample covariance matrix model $B_N = \sum_{i=1}^{n} x'_i x'^H_i$, with $X_N = [x'_1,\dots,x'_n]$, $x'_i$ i.i.d. with zero mean and covariance $T_N = E[x'_1x'^H_1]$.
Getting F' from $m_F$
$$f(x) \triangleq F'(x) = \lim_{y\to 0}\frac{1}{\pi}\Im[m_F(x + iy)]$$
I To plot the density f(x), span z = x + iy over the line $\{x\in\mathbb{R},\ y=\varepsilon\}$ parallel and close to the real axis, solve $m_F(z)$ for each z, and plot $\frac{1}{\pi}\Im[m_F(z)]$.
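The recipe above can be sketched as follows (illustrative code, not from the slides), here for the sample covariance model with T having three evenly weighted masses 1, 3 and 7 and c = N/n = 0.1; the standard fixed-point equation for the companion transform is iterated at z = x + iε:

```python
import numpy as np

t = np.array([1.0, 3.0, 7.0])        # masses of T
w = np.array([1.0, 1.0, 1.0]) / 3.0  # their weights
c = 0.1                              # ratio N/n

def density(x, eps=1e-3, iters=2000):
    """f(x) ~ Im[m_F(x + i*eps)] / pi via damped fixed-point iteration."""
    z = x + 1j * eps
    um = 1j * np.ones_like(z)        # companion Stieltjes transform, Im > 0
    for _ in range(iters):
        integral = (w[:, None] * t[:, None] / (1.0 + t[:, None] * um[None, :])).sum(axis=0)
        um = 0.5 * um + 0.5 / (-z + c * integral)
    m = (um - (c - 1.0) / z) / c     # recover m_F from the companion transform
    return np.imag(m) / np.pi

xs = np.linspace(0.05, 12.0, 1200)
f = density(xs)
mass = float((0.5 * (f[1:] + f[:-1]) * np.diff(xs)).sum())   # total mass, ~1
probe = density(np.array([0.2, 3.0]))                        # outside / inside the support
```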
Figure: Histogram of the eigenvalues of $B_N = T_N^{1/2}X_N^HX_NT_N^{1/2}$, N = 3000, n = 300, with $T_N$ diagonal composed of three evenly weighted masses at (i) 1, 3 and 7 (top), (ii) 1, 3 and 4 (bottom).
A. M. Tulino, S. Verdú, “Random matrix theory and wireless communications,” Now Publishers Inc., 2004.
Definition
Let F be a probability distribution and $m_F$ its Stieltjes transform. The Shannon transform $\mathcal V_F$ of F is defined as
$$\mathcal V_F(x) \triangleq \int_0^\infty \log(1 + x\lambda)\,dF(\lambda) = \int_{1/x}^{\infty}\left(\frac{1}{t} - m_F(-t)\right)dt$$
For F the empirical spectral distribution of $XX^H$,
$$\mathcal V_F(x) = \frac{1}{N}\log\det\big(I_N + xXX^H\big).$$
Note that this last relation is fundamental to wireless communication purposes!
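Both forms of the Shannon transform are easy to compare numerically (illustrative code, not from the slides): the normalized log-det equals the average of $\log(1 + x\lambda_i)$ over the eigenvalues exactly, and approaches the Marčenko–Pastur integral for large N:

```python
import numpy as np

rng = np.random.default_rng(8)
N, n, x = 400, 800, 2.0
c = N / n

X = (rng.standard_normal((N, n)) + 1j * rng.standard_normal((N, n))) / np.sqrt(2 * n)
lam = np.linalg.eigvalsh(X @ X.conj().T)

# two equivalent forms of the Shannon transform of the e.s.d. of XX^H
V_logdet = np.linalg.slogdet(np.eye(N) + x * (X @ X.conj().T))[1] / N
V_eigs = float(np.mean(np.log1p(x * lam)))

# Marcenko-Pastur limit of the same quantity
a, b = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
grid = np.linspace(a + 1e-9, b - 1e-9, 200001)
g = np.log1p(x * grid) * np.sqrt((grid - a) * (b - grid)) / (2 * np.pi * c * grid)
V_mp = float((0.5 * (g[1:] + g[:-1]) * np.diff(grid)).sum())
```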
Tools for Random Matrix Theory: Summary of what we know and what is left to be done
In most cases, T and R can be taken random, but independent of X. More involved random matrices, such as Vandermonde matrices, have not yet been studied.
I asymptotic results
I most of the above models with Gaussian X.
I products $V_1V_1^HT_1V_2V_2^HT_2\cdots$ of Vandermonde and deterministic matrices
I conjecture: any probability space of matrices invariant to row or column permutations.
I marginal studies, not yet fully explored
I rectangular free convolution: singular values of rectangular matrices
I finite size models: instead of almost sure convergence of $m_{X_N}$ as $N\to\infty$, we can study the finite size behaviour of $E[m_{X_N}]$.
Related bibliography
Technical Bibliography
Figure: Uplink CDMA network with K users, channels $h_k$ and transmit powers $P_k$.
Random Matrix Theory and Performance of Communication Systems: Performance of CDMA
with
I s = [s1 , . . . , sK ]T ∈ CK ,
I W = [w1 , . . . , wK ] ∈ CN×K ,
I P = diag(P1 , . . . , PK ) ∈ CK ×K ,
I H = diag(h1 , . . . , hK ) ∈ CK ×K .
MMSE decoder
I Consists in taking
$$r_k = w_k^H\big(WHPH^HW^H + \sigma^2 I_N\big)^{-1}y$$
as the symbol estimate for user k.
I The SINR for user k's signal is
$$\gamma_k^{(\mathrm{MMSE})} = P_k|h_k|^2\,w_k^H\Big(\sum_{\substack{1\le i\le K\\ i\ne k}}P_i|h_i|^2 w_iw_i^H + \sigma^2 I_N\Big)^{-1}w_k \qquad(1)$$
$$= P_k|h_k|^2\,w_k^H\big(WHPH^HW^H - P_k|h_k|^2 w_kw_k^H + \sigma^2 I_N\big)^{-1}w_k. \qquad(2)$$
I From the trace lemma,
$$w_k^H\big(WHPH^HW^H - P_k|h_k|^2 w_kw_k^H + \sigma^2 I_N\big)^{-1}w_k - \frac{1}{N}\operatorname{tr}\big(WHPH^HW^H - P_k|h_k|^2 w_kw_k^H + \sigma^2 I_N\big)^{-1} \xrightarrow{\text{a.s.}} 0.$$
MMSE decoder
I As N grows large, the rank-one perturbation does not affect the normalized trace:
$$\frac{1}{N}\operatorname{tr}\big(WHPH^HW^H - P_k|h_k|^2 w_kw_k^H + \sigma^2 I_N\big)^{-1} - \frac{1}{N}\operatorname{tr}\big(WHPH^HW^H + \sigma^2 I_N\big)^{-1} \to 0,$$
the latter quantity being $m_{WHPH^HW^H}(-\sigma^2)$.
MMSE decoder
I From the previous result,
$$m_{WHPH^HW^H}(-\sigma^2) - m_N(-\sigma^2) \xrightarrow{\text{a.s.}} 0$$
with $m_N(-\sigma^2)$ the unique positive solution of
$$m = \left(\frac{1}{N}\operatorname{tr}\,HPH^H\big(m\,HPH^H + I_K\big)^{-1} + \sigma^2\right)^{-1}$$
independent of k!
I This is also
$$m = \left(\sigma^2 + \frac{1}{N}\sum_{1\le i\le K}\frac{P_i|h_i|^2}{1 + m\,P_i|h_i|^2}\right)^{-1}$$
I Finally,
$$\gamma_k^{(\mathrm{MMSE})} - P_k|h_k|^2\,m_N(-\sigma^2) \xrightarrow{\text{a.s.}} 0$$
and the capacity reads
$$C^{(\mathrm{MMSE})}(\sigma^2) - \frac{1}{N}\sum_{k=1}^{K}\log_2\big(1 + P_k|h_k|^2\,m_N(-\sigma^2)\big) \xrightarrow{\text{a.s.}} 0.$$
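The fixed point and the resulting SINR approximation are easy to validate by simulation (illustrative code, not from the slides; random spreading codes with variance-1/N entries and hypothetical powers are assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, sigma2 = 256, 128, 0.1
P = rng.uniform(0.5, 2.0, K)                       # hypothetical user powers
h = (rng.standard_normal(K) + 1j * rng.standard_normal(K)) / np.sqrt(2)
g = P * np.abs(h) ** 2                             # effective gains P_k |h_k|^2

# deterministic equivalent: fixed point for m_N(-sigma^2)
m = 1.0
for _ in range(500):
    m = 1.0 / (sigma2 + np.sum(g / (1.0 + m * g)) / N)

# simulated MMSE SINRs with random spreading codes
W = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2 * N)
A = (W * g) @ W.conj().T + sigma2 * np.eye(N)
Ainv = np.linalg.inv(A)
p = np.einsum('nk,nm,mk->k', W.conj(), Ainv, W).real   # w_k^H A^{-1} w_k
gamma = g * p / (1.0 - g * p)          # Sherman-Morrison removes user k's own term
rel_err = np.abs(gamma - g * m) / (g * m)
```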
MMSE decoder
I AWGN channel, $P_k = P$, $h_k = 1$:
$$C^{(\mathrm{MMSE})}(\sigma^2) \xrightarrow{\text{a.s.}} c\,\log_2\left(1 + \frac{-(\sigma^2 + (c-1)P) + \sqrt{(\sigma^2 + (c-1)P)^2 + 4P\sigma^2}}{2\sigma^2}\right)$$
I Rayleigh fading channels:
$$C^{(\mathrm{MMSE})}(\sigma^2) \xrightarrow{\text{a.s.}} c\int \log_2\big(1 + Pt\,m(-\sigma^2)\big)\,e^{-t}\,dt.$$
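One can verify, algebraically or numerically as below (illustrative code, not from the slides), that the AWGN closed form is precisely the positive root of the fixed-point equation $m = (\sigma^2 + cP/(1 + mP))^{-1}$, with $\gamma = Pm$:

```python
import numpy as np

P, sigma2, c = 1.0, 0.1, 0.5

# closed-form asymptotic MMSE SINR for the AWGN channel
gamma = (-(sigma2 + (c - 1) * P)
         + np.sqrt((sigma2 + (c - 1) * P) ** 2 + 4 * P * sigma2)) / (2 * sigma2)
m = gamma / P

# it must solve the fixed point m = 1 / (sigma2 + c P / (1 + m P))
residual = m - 1.0 / (sigma2 + c * P / (1 + m * P))
capacity = c * np.log2(1 + gamma)   # spectral efficiency in bits/s/Hz
```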
Figure: Spectral efficiency of random CDMA decoders, AWGN channels, versus SNR [dB]. Comparison between simulations and deterministic equivalents (det. eq.) for the matched-filter, the MMSE decoder and the optimal decoder, K = 16 users, N = 32 chips per code. Error bars indicate two standard deviations.
Figure: Spectral efficiency of random CDMA decoders, Rayleigh fading channels, versus SNR [dB]. Comparison between simulations and deterministic equivalents (det. eq.) for the matched-filter, the MMSE decoder and the optimal decoder, K = 16 users, N = 32 chips per code. Error bars indicate two standard deviations.
Figure: Spectral efficiency [bits/s/Hz] of random CDMA decoders (matched-filter, MMSE, optimal; deterministic equivalents), for different asymptotic ratios c = K/N, SNR = 10 dB, AWGN channels.
Random Matrix Theory and Performance of Communication Systems: Performance of MIMO Systems
$$B_N = \sum_{k=1}^{K}H_kH_k^H,\qquad \text{with } H_k = R_k^{1/2}X_kT_k^{1/2}$$
with $X_k \in \mathbb{C}^{N\times n_k}$ with i.i.d. entries of zero mean and variance $1/n_k$, $R_k$ Hermitian nonnegative definite, $T_k$ diagonal. Denote $c_k = N/n_k$. Then, as all N and $n_k$ grow large with ratios $c_k$,
$$\frac{1}{N}\operatorname{tr}(B_N - zI_N)^{-1} - \frac{1}{N}\operatorname{tr}\left(-z\Big(I_N + \sum_{k=1}^{K}\bar e_k(z)R_k\Big)\right)^{-1} \xrightarrow{\text{a.s.}} 0$$
where the set of functions $\{e_i(z)\}$ forms the unique solution to the K equations
$$e_i(z) = \frac{1}{N}\operatorname{tr}\,R_i\left(-z\Big(I_N + \sum_{k=1}^{K}\bar e_k(z)R_k\Big)\right)^{-1},\qquad \bar e_i(z) = \frac{1}{n_i}\operatorname{tr}\,T_i\Big(-z\big(I_{n_i} + c_ie_i(z)T_i\big)\Big)^{-1}$$
Hence, the SINR at the output of the MMSE receiver for stream i of user k, $\gamma_{ik}$, satisfies
$$\gamma_{ik} - t_{k,i}\,e_k(-\sigma^2) \xrightarrow{\text{a.s.}} 0,\qquad \gamma_{ik} = h_{k,i}^H\big(B_N - h_{k,i}h_{k,i}^H + \sigma^2 I_N\big)^{-1}h_{k,i}.$$
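A minimal numerical check of the coupled fixed point (illustrative code, not from the slides; diagonal $R_k$, $T_k$, equal $n_k$ and z = −1 are assumptions made for simplicity):

```python
import numpy as np

rng = np.random.default_rng(3)
N, n, K, z = 64, 96, 2, -1.0
c = N / n
R = [np.diag(rng.uniform(0.5, 2.0, N)) for _ in range(K)]   # hypothetical correlation profiles
T = [np.diag(rng.uniform(0.5, 2.0, n)) for _ in range(K)]

# coupled fixed point for e_k(z) and ebar_k(z)
e, ebar = np.ones(K), np.ones(K)
for _ in range(500):
    Minv = np.linalg.inv(-z * (np.eye(N) + sum(ebar[k] * R[k] for k in range(K))))
    e = np.array([np.trace(R[k] @ Minv).real / N for k in range(K)])
    ebar = np.array([np.trace(T[k] @ np.linalg.inv(-z * (np.eye(n) + c * e[k] * T[k]))).real / n
                     for k in range(K)])
det_eq = np.trace(np.linalg.inv(-z * (np.eye(N) + sum(ebar[k] * R[k] for k in range(K))))).real / N

# Monte-Carlo counterpart of (1/N) tr (B_N - z I)^{-1}
B = np.zeros((N, N), dtype=complex)
for k in range(K):
    X = (rng.standard_normal((N, n)) + 1j * rng.standard_normal((N, n))) / np.sqrt(2 * n)
    H = np.sqrt(np.diagonal(R[k]))[:, None] * X * np.sqrt(np.diagonal(T[k]))[None, :]
    B += H @ H.conj().T
emp = np.trace(np.linalg.inv(B - z * np.eye(N))).real / N
```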
We will use here the “guess-work” method to find the deterministic equivalent. Consider the simpler case K = 1.
Back to the original notations, we seek a matrix D such that
$$\frac{1}{N}\operatorname{tr}(B_N - zI_N)^{-1} - \frac{1}{N}\operatorname{tr}\,D^{-1} \xrightarrow{\text{a.s.}} 0$$
as $N\to\infty$.
Resolvent lemma
For invertible matrices A and B,
$$A^{-1} - B^{-1} = -A^{-1}(A - B)B^{-1}$$
It seems convenient to take $D = -zI_N + \bar e_{B_N}R$, with $\bar e_{B_N}$ left to be defined (the notation $B_N$ in $\bar e_{B_N}$ is a reminder that we are not yet looking for a deterministic quantity).
$$x^H(A + \tau xx^H)^{-1} = \frac{x^HA^{-1}}{1 + \tau\,x^HA^{-1}x}$$
Theorem
Under the previous model for $B_N$, as N, $n_k$ grow large,
$$\frac{1}{N}E\log\det(xB_N + I_N) - \left[\frac{1}{N}\log\det\Big(I_N + \sum_{k=1}^{K}\bar e_k(-1/x)R_k\Big) + \sum_{k=1}^{K}\frac{1}{N}\log\det\big(I_{n_k} + c_ke_k(-1/x)T_k\big) - \frac{1}{x}\sum_{k=1}^{K}\bar e_k(-1/x)\,e_k(-1/x)\right] \to 0$$
We look for $P_1^\star,\dots,P_K^\star$ achieving the optimal sum rate, with $\mu_k$ such that $\operatorname{tr}P_k^\circ = P_k$, and $e_k^\circ$ defined as $e_k$ for $(P_1,\dots,P_K) = (P_1^\circ,\dots,P_K^\circ)$.
Moreover, under some conditions,
$$\|P_k^\star - P_k^\circ\| \to 0.$$
Variance profile
W. Hachem, Ph. Loubaton, J. Najim, “Deterministic equivalents for certain functionals of large random matrices,” Annals of Applied Probability, vol. 17, no. 3, pp. 875-930, 2007.
Theorem
Let $X_N \in \mathbb{C}^{N\times n}$ have independent entries with (i, j)th entry of zero mean and variance $\frac{1}{n}\sigma_{ij}^2$. Let $A_N \in \mathbb{R}^{N\times n}$ be deterministic with uniformly bounded column norm. Then
$$\frac{1}{N}\operatorname{tr}\big((X_N + A_N)(X_N + A_N)^H - zI_N\big)^{-1} - \frac{1}{N}\operatorname{tr}\,T_N(z) \xrightarrow{\text{a.s.}} 0$$
where $T_N(z)$ is the unique function that solves
$$T_N(z) = \big(\Psi^{-1}(z) - zA_N\tilde\Psi(z)A_N^T\big)^{-1},\qquad \tilde T_N(z) = \big(\tilde\Psi^{-1}(z) - zA_N^T\Psi(z)A_N\big)^{-1}$$
with $\Psi(z) = \operatorname{diag}(\psi_i(z))$, $\tilde\Psi(z) = \operatorname{diag}(\tilde\psi_j(z))$, whose entries are defined as
$$\psi_i(z) = -\Big(z\big(1 + \tfrac{1}{n}\operatorname{tr}\tilde D_i\tilde T(z)\big)\Big)^{-1},\qquad \tilde\psi_j(z) = -\Big(z\big(1 + \tfrac{1}{n}\operatorname{tr}D_jT(z)\big)\Big)^{-1}$$
Theorem
For the previous model, $\frac{1}{N}E\log\det\big(I_N + \frac{1}{\sigma^2}(X_N + A_N)(X_N + A_N)^H\big)$ also admits a deterministic equivalent.
I Recent results were proposed for the case where the matrices $X_N$ are unitary and unitarily invariant (Haar matrices).
I The central result is the trace lemma:
Lemma
Let $W \in \mathbb{C}^{N\times n}$ be n < N columns of a Haar matrix and $w$ a column of W. Let $B_N \in \mathbb{C}^{N\times N}$ be a random matrix, function of all columns of W except $w$. Then, assuming that, for growing N, $c = \sup_n n/N < 1$ and $B = \sup_N\|B_N\| < \infty$, we have:
$$w^HB_Nw - \frac{1}{N-n}\operatorname{tr}\big(I_N - WW^H\big)B_N \xrightarrow{\text{a.s.}} 0.$$
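The lemma can be observed in simulation (illustrative code, not from the slides; the matrix $B_N$ below is an arbitrary choice that only depends on the other columns of W, and the QR phase correction makes Q exactly Haar distributed):

```python
import numpy as np

rng = np.random.default_rng(4)
N, n = 400, 160   # c = n/N < 1

# Haar-distributed unitary via QR with phase correction
Z = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
Q, Rm = np.linalg.qr(Z)
Q = Q * (np.diagonal(Rm) / np.abs(np.diagonal(Rm)))

W = Q[:, :n]       # n columns of a Haar matrix
w = W[:, 0]        # the distinguished column
Wrest = W[:, 1:]   # B_N may only depend on these

D = np.diag(rng.uniform(0.5, 2.0, N))
B = D @ (np.eye(N) + Wrest @ Wrest.conj().T) @ D   # random, independent of w

lhs = (w.conj() @ B @ w).real
rhs = np.trace((np.eye(N) - W @ W.conj().T) @ B).real / (N - n)
```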
R. Couillet, J. Hoydis, M. Debbah, “Random beamforming over quasi-static and fading channels: a deterministic equivalent approach,” to appear in IEEE Trans. on Inf. Theory.
Theorem
Let $T_i \in \mathbb{C}^{n_i\times n_i}$ be nonnegative diagonal and let $H_i \in \mathbb{C}^{N\times N_i}$. Define $R_i \triangleq H_iH_i^H \in \mathbb{C}^{N\times N}$, $c_i = \frac{n_i}{N_i}$ and $\bar c_i = \frac{N_i}{N}$. Denote
$$B_N = \sum_{i=1}^{K}H_iW_iT_iW_i^HH_i^H.$$
Then $\frac{1}{N}\operatorname{tr}(B_N - zI_N)^{-1} - \frac{1}{N}\operatorname{tr}\big(\sum_{j=1}^{K}\bar e_j(z)R_j - zI_N\big)^{-1} \xrightarrow{\text{a.s.}} 0$, where $(\bar e_1(z),\dots,\bar e_K(z))$ are the solutions (conditionally unique) of
$$e_i(z) = \frac{1}{N}\operatorname{tr}\,R_i\Big(\sum_{j=1}^{K}\bar e_j(z)R_j - zI_N\Big)^{-1}$$
$$\bar e_i(z) = \frac{1}{N}\operatorname{tr}\,T_i\Big(e_i(z)T_i + [\bar c_i - e_i(z)\bar e_i(z)]I_{n_i}\Big)^{-1}\quad\text{(compare to the i.i.d. case!)}$$
Random Matrix Theory and Signal Source Sensing
Problem statement
Solution
$$P_{\mathcal H_1} = P_{\mathcal H_0} = \frac{1}{2}$$
Therefore,
$$C(Y) = \frac{P_{Y|\mathcal H_1}(Y)}{P_{Y|\mathcal H_0}(Y)}$$
Random Matrix Theory and Signal Source Sensing: Finite Random Matrix Analysis
$$P_{Y|\mathcal H_0}(Y) = \frac{1}{(\pi\sigma^2)^{NL}}\,e^{-\frac{1}{\sigma^2}\operatorname{tr}YY^H}$$
with
$$\Sigma = L\begin{pmatrix}
h_{11} & \cdots & h_{1M} & \sigma & \cdots & 0\\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots\\
h_{N1} & \cdots & h_{NM} & 0 & \cdots & \sigma
\end{pmatrix}\begin{pmatrix}
h_{11} & \cdots & h_{1M} & \sigma & \cdots & 0\\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots\\
h_{N1} & \cdots & h_{NM} & 0 & \cdots & \sigma
\end{pmatrix}^H = U(L\Lambda)U^H$$
Case M = 1.
The maximum entropy distribution for H is the Gaussian i.i.d. channel. Unordered eigenvalue distribution for $\Sigma$:
$$P_\Lambda(\Lambda)d\Lambda = \frac{1}{N}\,\frac{e^{-(\lambda_1-\sigma^2)}}{(N-1)!}\,1_{(\lambda_1>\sigma^2)}(\lambda_1 - \sigma^2)^{N-1}\prod_{i=2}^{N}\delta(\lambda_i - \sigma^2)\,d\lambda_1\cdots d\lambda_N$$
Neyman-Pearson Test
I M = 1,
$$P_{Y|\mathcal I_1}(Y) = \frac{\sigma^2\,e^{-\frac{1}{\sigma^2}\sum_{i=1}^{N}\lambda_i}}{N\pi^{LN}\sigma^{2(N-1)(L-1)}}\sum_{l=1}^{N}\frac{e^{\frac{\lambda_l}{\sigma^2}}}{\prod_{i\ne l}(\lambda_l - \lambda_i)}\,J_{N-L-1}(\sigma^2,\lambda_l)$$
i6=l
Figure: ROC curve for SIMO transmission, M = 1, N = 4, L = 8, SNR = −3 dB, FAR range of practical interest.
Figure: The Marčenko–Pastur density, with support $[\sigma^2(1-\sqrt c)^2,\ \sigma^2(1+\sqrt c)^2]$.
Z. D. Bai, J. W. Silverstein, “No eigenvalues outside the support of the limiting spectral distribution
of large-dimensional sample covariance matrices,” The Annals of Probability, vol. 26, no.1 pp.
316-345, 1998.
Theorem
$$P\big(\text{no eigenvalues outside } [\sigma^2(1-\sqrt c)^2,\ \sigma^2(1+\sqrt c)^2] \text{ for all large } N\big) = 1$$
I Under $\mathcal H_0$,
$$\frac{\lambda_{\max}\big(\frac{1}{N}YY^H\big)}{\lambda_{\min}\big(\frac{1}{N}YY^H\big)} \xrightarrow{\text{a.s.}} \frac{(1+\sqrt c)^2}{(1-\sqrt c)^2}$$
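This condition-number limit is simple to observe in simulation (illustrative code, not from the slides; note that the eigenvalue ratio does not depend on the normalization of $YY^H$):

```python
import numpy as np

rng = np.random.default_rng(5)
N, L = 200, 2000   # c = N/L = 0.1
# H0: pure noise observation (unit variance entries)
Y = (rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))) / np.sqrt(2)

eig = np.linalg.eigvalsh(Y @ Y.conj().T)
ratio = eig.max() / eig.min()

c = N / L
limit = (1 + np.sqrt(c)) ** 2 / (1 - np.sqrt(c)) ** 2
```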
P. Bianchi, J. Najim, M. Maida, M. Debbah, “Performance of Some Eigen-based Hypothesis Tests for Collaborative Sensing,” Proceedings of the IEEE Statistical Signal Processing Workshop, 2009.
I GLRT test
$$C_{\mathrm{GLRT}}(Y) = \left(\frac{\lambda_{\max}(\frac{1}{N}YY^H)}{\frac{1}{N}\sum_{i=1}^{N}\lambda_i}\ \frac{\Big(1 - \frac{1}{N}\,\frac{\lambda_{\max}(\frac{1}{N}YY^H)}{\frac{1}{N}\sum_{i=1}^{N}\lambda_i}\Big)^{N-1}}{\big(1 - \frac{1}{N}\big)^{N-1}}\right)^{-L}.$$
Figure: ROC curve for a priori unknown $\sigma^2$ of the Bayesian method, the conditioning number method and the GLRT method, M = 1, N = 4, L = 8, SNR = 0 dB. For the Bayesian method, both uniform and Jeffreys priors, with exponent α = 1, are provided.
Random Matrix Theory and Statistical Inference
I It has long been difficult to analytically invert the simplest $B_N = T_N^{1/2}X_NX_N^HT_N^{1/2}$ model to recover the diagonal entries of $T_N$. Indeed, we only have the deterministic equivalent result
$$m_N(z) = \left(-z + c\int\frac{t}{1 + t\,m_N(z)}\,dF^{T_N}(t)\right)^{-1}$$
All moments of the l.s.d. of BN are obtained as polynomials of those of the l.s.d. of RN
Random Matrix Theory and Statistical Inference: Free Probability Method
I The kth moment of the l.s.d. of $B_N$ is a polynomial in the first k moments of the l.s.d. of $R_N$.
I We can therefore invert the problem and express the kth moment of $R_N$ from the first k moments of $B_N$. This entails deconvolution operations,
$$\mu_A = \mu_{A+B} \boxminus \mu_B$$
$$\mu_A = \mu_{AB} \boxslash \mu_B$$
and, for the information-plus-noise model $B_N = \frac{1}{n}(R_N + \sigma X_N)(R_N + \sigma X_N)^H$,
$$\mu_\Gamma = \big((\mu_B \boxslash \mu_c) \boxminus \delta_{\sigma^2}\big) \boxtimes \mu_c$$
with $\boxminus$ and $\boxslash$ the free additive and multiplicative deconvolutions.
I For more involved models, the polynomial relations can be iterated and even automatically generated.
$$Y = D + X$$
with $Y, D, X \in \mathbb{C}^{N\times n}$, X with i.i.d. entries of mean 0 and variance 1. Denote
$$M_k = \lim_{n\to\infty}\frac{1}{n}\operatorname{tr}\Big(\frac{1}{N}YY^H\Big)^k,\qquad D_k = \lim_{n\to\infty}\frac{1}{n}\operatorname{tr}\Big(\frac{1}{N}DD^H\Big)^k$$
I For that model, with c = n/N, we have the relations
$$M_1 = D_1 + 1$$
$$M_2 = D_2 + (2 + 2c)D_1 + (1 + c)$$
$$M_3 = D_3 + (3 + 3c)D_2 + 3cD_1^2 + (3 + 9c + 3c^2)D_1 + (1 + 3c + c^2)$$
hence
$$D_1 = M_1 - 1$$
$$D_2 = M_2 - (2 + 2c)M_1 + (1 + c)$$
$$D_3 = M_3 - (3 + 3c)M_2 - 3cM_1^2 + (3 + 9c + 3c^2)M_1 - (1 + 3c + c^2)$$
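These relations can be verified empirically (illustrative code, not from the slides; D is taken with bounded i.i.d. entries so that its spectrum stays bounded, and c = n/N):

```python
import numpy as np

rng = np.random.default_rng(6)
N, n = 300, 600
c = n / N

D = rng.uniform(-1.0, 1.0, (N, n))     # "signal" matrix (illustrative, bounded spectrum)
X = rng.standard_normal((N, n))        # i.i.d., zero mean, variance 1
Y = D + X

def mom(A, k):
    """(1/n) tr ((1/N) A A^T)^k, the k-th normalized spectral moment."""
    B = (A @ A.T) / N
    return float(np.trace(np.linalg.matrix_power(B, k))) / n

M = [mom(Y, k) for k in (1, 2, 3)]
Dk = [mom(D, k) for k in (1, 2, 3)]
pred = [Dk[0] + 1,
        Dk[1] + (2 + 2 * c) * Dk[0] + (1 + c),
        Dk[2] + (3 + 3 * c) * Dk[1] + 3 * c * Dk[0] ** 2
        + (3 + 9 * c + 3 * c ** 2) * Dk[0] + (1 + 3 * c + c ** 2)]
```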
I For practical finite size applications, the deconvolved moments will exhibit errors. Different strategies are available:
I direct inversion with Newton-Girard formulas. Assuming perfect evaluation of $\frac{1}{K}\sum_{k=1}^{K}P_k^m$, $P_1,\dots,P_K$ are given by the K roots of the polynomial
$$X^K - \Pi_1X^{K-1} + \Pi_2X^{K-2} - \dots + (-1)^K\Pi_K$$
where the $\Pi_m$'s (known as the elementary symmetric polynomials) are iteratively defined by
$$(-1)^k k\Pi_k + \sum_{i=1}^{k}(-1)^{k+i}S_i\Pi_{k-i} = 0$$
where $S_k = \sum_{i=1}^{K}P_i^k$.
I may lead to non-real solutions!
I does not minimize any conventional error criterion
I convenient for one-shot power inference
I when multiple realizations are available, statistical solutions are preferable
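The Newton-Girard inversion above is a few lines of code (illustrative sketch, not from the slides), here recovering hypothetical powers 1, 3, 10 from their power sums:

```python
import numpy as np

P_true = np.array([1.0, 3.0, 10.0])
K = len(P_true)
S = [float(np.sum(P_true ** k)) for k in range(1, K + 1)]   # power sums S_k

# Newton-Girard recursion: (-1)^k k Pi_k + sum_{i=1}^k (-1)^{k+i} S_i Pi_{k-i} = 0
Pi = [1.0]
for k in range(1, K + 1):
    acc = sum((-1) ** (k + i) * S[i - 1] * Pi[k - i] for i in range(1, k + 1))
    Pi.append(-acc / ((-1) ** k * k))

# P_1, ..., P_K are the roots of X^K - Pi_1 X^{K-1} + ... + (-1)^K Pi_K
coeffs = [(-1) ** k * Pi[k] for k in range(K + 1)]
roots = np.sort(np.roots(coeffs).real)
```

Note that, as the slides warn, with imperfectly estimated moments the recovered roots may turn complex; here the power sums are exact, so the recovery is exact.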
$$Y = T^{1/2}X \in \mathbb{C}^{N\times n},\qquad B_N \triangleq \frac{1}{n}YY^H \in \mathbb{C}^{N\times N},\qquad \underline B_N \triangleq \frac{1}{n}Y^HY \in \mathbb{C}^{n\times n}$$
where $T \in \mathbb{C}^{N\times N}$ has eigenvalues $t_1,\dots,t_K$, $t_k$ with multiplicity $N_k$, and $X \in \mathbb{C}^{N\times n}$ has i.i.d. entries of zero mean and variance 1.
Complex integration
I From the Cauchy integral formula, denoting $\mathcal C_k$ a contour enclosing only $t_k$,
$$t_k = \frac{1}{2\pi i}\oint_{\mathcal C_k}\frac{\omega}{\omega - t_k}\,d\omega = \frac{1}{2\pi i}\oint_{\mathcal C_k}\sum_{j=1}^{K}\frac{N_j}{N_k}\,\frac{\omega}{\omega - t_j}\,d\omega = \frac{N}{2\pi i\,N_k}\oint_{\mathcal C_k}\omega\,m_T(\omega)\,d\omega.$$
I This leads to the estimate
$$\hat t_k = \frac{N}{N_k}\,\frac{1}{2\pi i}\oint_{\mathcal C_{F,k}} z\,m_F(z)\,\frac{m'_F(z)}{m_F^2(z)}\,dz,$$
with $\mathcal N_k$ the indexes of cluster k and $\mu_1 < \dots < \mu_N$ the ordered eigenvalues of the matrix $\operatorname{diag}(\lambda) - \frac{1}{n}\sqrt{\lambda}\sqrt{\lambda}^T$, $\lambda = (\lambda_1,\dots,\lambda_N)^T$.
Random Matrix Theory and Multi-Source Power Estimation
Problem statement
System model
I At time t, source k transmits the signal $x_k^{(t)} \in \mathbb{C}^{n_k}$ with i.i.d. entries of zero mean and variance 1.
I We denote $P_k$ the power emitted by user k.
I The channel $H_k \in \mathbb{C}^{N\times n_k}$ from user k to the receiver has i.i.d. entries of zero mean and variance 1/N.
I At time t, the additive noise is denoted $\sigma w^{(t)}$, with $w^{(t)} \in \mathbb{C}^N$ with i.i.d. entries of zero mean and variance 1.
I Hence the received signal $y^{(t)}$ at time t,
$$y^{(t)} = \sum_{k=1}^{K}H_k\sqrt{P_k}\,x_k^{(t)} + \sigma w^{(t)}$$
Gathering M time instants into $Y = [y^{(1)},\dots,y^{(M)}] \in \mathbb{C}^{N\times M}$, this can be written
$$Y = \sum_{k=1}^{K}H_k\sqrt{P_k}\,X_k + \sigma W = HP^{1/2}X + \sigma W$$
I From $HP^{1/2}XX^HP^{1/2}H^H$, up to a Gram matrix commutation, we can deconvolve the signal X, leaving $P^{1/2}H^HHP^{1/2}$.
I From $P^{1/2}H^HHP^{1/2}$, a new matrix commutation (to $PH^HH$) allows one to deconvolve $H^HH$.
Random Matrix Theory and Multi-Source Power Estimation: Free Probability Approach
I Channel deconvolution:
$$\mu_P = \mu_{\frac{1}{N}PH^HH} \boxslash \mu_{c_1}$$
with $c_1 = n/N$ and $\boxslash$ the free multiplicative deconvolution.
$$m_2 = \dots + \big(2N^{-1}n + 2M^{-1}n\big)d_1 + \big(1 + NM^{-1}\big)$$
$$\begin{aligned}
m_3 ={}& \big(3N^{-3}M^{-2} + N^{-3} + 6N^{-2}M^{-1} + N^{-1}M^{-2} + N^{-1}\big)n\,d_3\\
&+ \big(6N^{-3}M^{-1} + 6N^{-2}M^{-2} + 3N^{-2} + 3N^{-1}M^{-1}\big)n^2\,d_2d_1\\
&+ \big(N^{-3}M^{-2} + N^{-3} + 3N^{-2}M^{-1} + N^{-1}M^{-2}\big)n^3\,d_1^3\\
&+ \big(6N^{-2}M^{-1} + 6N^{-1}M^{-2} + 3N^{-1} + 3M^{-1}\big)n\,d_2\\
&+ \big(3N^{-2}M^{-2} + 3N^{-2} + 9N^{-1}M^{-1} + 3M^{-2}\big)n^2\,d_1^2\\
&+ \big(3N^{-1}M^{-2} + 3N^{-1} + 9M^{-1} + 3NM^{-2}\big)n\,d_1
\end{aligned}$$
where
$$m_k = \frac{1}{N}\sum_{i=1}^{N}\lambda_i^k\Big(\frac{1}{M}YY^H\Big)\qquad\text{and}\qquad d_k = \frac{1}{K}\sum_{i=1}^{K}\lambda_i^k(P) = \frac{1}{K}\sum_{i=1}^{K}P_i^k$$
Random Matrix Theory and Multi-Source Power Estimation: The Stieltjes Transform Approach
$$HPH^H + \sigma^2 I_N$$
Asymptotic spectrum
$$m(z) = \frac{M}{N}\,\underline m(z) + \frac{M-N}{N}\,\frac{1}{z}$$
where $\underline m(z)$ is the unique solution in $\mathbb{C}^+$ of
$$\frac{1}{\underline m(z)} = -\sigma^2 + \frac{1}{f(z)} - \frac{1}{N}\sum_{k=1}^{K}\frac{n_kP_k}{1 + P_kf(z)}$$
Figure: Empirical and asymptotic eigenvalue distribution of $\frac{1}{M}YY^H$ when P has three distinct entries $P_1 = 1$, $P_2 = 3$, $P_3 = 10$, $n_1 = n_2 = n_3$, N/n = 10, M/N = 10, $\sigma^2 = 0.1$. Empirical test: n = 60.
Theorem
Let $B_N = \frac{1}{M}YY^H \in \mathbb{C}^{N\times N}$, with Y defined as previously. Denote its ordered eigenvalue vector $\lambda = (\lambda_1,\dots,\lambda_N)$, $\lambda_1 < \dots < \lambda_N$. Further assume asymptotic spectrum separability. Then, for $k \in \{1,\dots,K\}$, as N, n, M grow large, we have
$$\hat P_k - P_k \xrightarrow{\text{a.s.}} 0,\qquad \hat P_k = \frac{NM}{n_k(M-N)}\sum_{i\in\mathcal N_k}(\eta_i - \mu_i)$$
with $\mathcal N_k = \{N - \sum_{i=k}^{K}n_i + 1,\dots,N - \sum_{i=k+1}^{K}n_i\}$ the set of indexes matching the cluster corresponding to $P_k$, $(\eta_1,\dots,\eta_N)$ the ordered eigenvalues of $\operatorname{diag}(\lambda) - \frac{1}{N}\sqrt{\lambda}\sqrt{\lambda}^T$ and $(\mu_1,\dots,\mu_N)$ the ordered eigenvalues of $\operatorname{diag}(\lambda) - \frac{1}{M}\sqrt{\lambda}\sqrt{\lambda}^T$.
Random Matrix Theory and Multi-Source Power EstimationThe Stieltjes Transform Approach 106/113
Figure: Estimated power distribution for the Stieltjes transform method and the moment-based method, powers $P_1 = 1$, $P_2 = 3$, $P_3 = 10$.

Figure: Normalized mean square error of the vector $(\hat P_1, \hat P_2, \hat P_3)$, $P_1 = 1$, $P_2 = 3$, $P_3 = 10$, $n_1/n = n_2/n = n_3/n = 1/3$, n/N = N/M = 1/10, as a function of the SNR [dB], for 10,000 simulation runs.
I To date,
I the moment approach is much simpler to derive;
I it does not require any cluster separation;
I the finite size case is treated in the mean, which the Stieltjes transform approach cannot do.
I However, the Stieltjes transform approach makes full use of the spectral knowledge, whereas the moment approach is limited to a few moments.
I The results are more natural, and more “telling”.
I In the future,
I it is expected that the cluster separation requirement can be overcome;
I a natural general framework attached to the Stieltjes transform method could arise;
I central limit results on the estimates are expected.
Related bibliography
I N. El Karoui, “Spectrum estimation for large dimensional covariance matrices using random
matrix theory,” Annals of Statistics, vol. 36, no. 6, pp. 2757-2790, 2008.
I N. R. Rao, J. A. Mingo, R. Speicher, A. Edelman, “Statistical eigen-inference from large
Wishart matrices,” Annals of Statistics, vol. 36, no. 6, pp. 2850-2885, 2008.
I R. Couillet, M. Debbah, “Free deconvolution for OFDM multicell SNR detection”, PIMRC
2008, Cannes, France.
I X. Mestre, “Improved estimation of eigenvalues and eigenvectors of covariance matrices using their sample estimates,” IEEE Trans. on Information Theory, vol. 54, no. 11, pp. 5113-5129, 2008.
I R. Couillet, J. W. Silverstein, M. Debbah, “Eigen-inference for multi-source power estimation”,
submitted to ISIT 2010.
I Z. D. Bai, J. W. Silverstein, “No eigenvalues outside the support of the limiting spectral
distribution of large-dimensional sample covariance matrices,” The Annals of Probability, vol.
26, no.1 pp. 316-345, 1998.
I Z. D. Bai, J. W. Silverstein, “CLT of linear spectral statistics of large dimensional sample
covariance matrices,” Annals of Probability, vol. 32, no. 1A, pp. 553-605, 2004.
I J. Silverstein, Z. Bai, “Exact separation of eigenvalues of large dimensional sample
covariance matrices” Annals of Probability, vol. 27, no. 3, pp. 1536-1555, 1999.
I Ø. Ryan, M. Debbah, “Free Deconvolution for Signal Processing Applications,” IEEE
International Symposium on Information Theory, pp. 1846-1850, 2007.
I Articles in Journals,
I R. Couillet, J. W. Silverstein, Z. Bai, M. Debbah, “Eigen-Inference for Energy Estimation of Multiple
Sources,” IEEE Trans. on Information Theory, 2010, submitted.
I R. Couillet, J. W. Silverstein, M. Debbah, “A Deterministic Equivalent for the Capacity Analysis of
Correlated Multi-User MIMO Channels,” IEEE Trans. on Information Theory, 2010, accepted for
publication.
I P. Bianchi, J. Najim, M. Maida, M. Debbah, “Performance of Some Eigen-based Hypothesis Tests for
Collaborative Sensing,” IEEE Trans. on Information Theory, 2010, accepted for publication.
I R. Couillet, M. Debbah, “A Bayesian Framework for Collaborative Multi-Source Signal Sensing,” IEEE
Trans. on Signal Processing, 2010, accepted for publication.
I S. Wagner, R. Couillet, M. Debbah, D. T. M. Slock, “Large System Analysis of Linear Precoding in
MISO Broadcast Channels with Limited Feedback,” IEEE Trans. on Information Theory, 2010,
submitted.
I A. Masucci, Ø. Ryan, S. Yang, M. Debbah, “Gaussian Finite Dimensional Statistical Inference,” IEEE
Trans. on Information Theory, 2009, submitted.
I Ø. Ryan, M. Debbah, “Asymptotic Behaviour of Random Vandermonde Matrices with Entries on the
Unit Circle,” IEEE Trans. on Information Theory, vol. 55, no. 7 pp. 3115-3148, July 2009.
I M. Debbah, R. Müller, “MIMO channel modeling and the principle of maximum entropy,” IEEE Trans.
on Information Theory, vol. 51, no. 5, pp. 1667-1690, 2005.
I Articles in International Conferences
I R. Couillet, S. Wagner, M. Debbah, A. Silva, “The Space Frontier: Physical Limits of Multiple
Antenna Information Transfer”, Inter-Perf 2008, Athens, Greece. BEST STUDENT PAPER
AWARD.
I R. Couillet, M. Debbah, V. Poor, “Self-organized spectrum sharing in large MIMO multiple access
channels”, submitted to ISIT 2010.
I R. Couillet, M. Debbah, “Uplink capacity of self-organizing clustered orthogonal CDMA networks in
flat fading channels”, ITW 2009 Fall, Taormina, Sicily.
Available in bookstores!
Detailed outline
Romain Couillet, Mérouane Debbah, Random Matrix Methods for Wireless Communications.
1. Theoretical aspects
1.1 Preliminary
1.2 Tools for random matrix theory
1.3 Deterministic equivalents
1.4 Central limit theorems
1.5 Spectrum analysis
1.6 Eigen-inference
1.7 Extreme eigenvalues
2. Applications to wireless communications
2.1 Introduction
2.2 System performance: capacity and rate-regions
2.2.1 Introduction
2.2.2 Performance of CDMA technologies
2.2.3 Performance of multiple antenna systems
2.2.4 Multi-user communications, rate regions and sum-rate
2.2.5 Design of multi-user receivers
2.2.6 Analysis of multi-cellular networks
2.2.7 Communications in ad-hoc networks
2.3 Detection
2.4 Estimation
2.5 Modelling
2.6 Random matrix theory and self-organizing networks
2.7 Perspectives
2.8 Conclusion