
2164 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 43, NO. 9, SEPTEMBER 1995
A Maximal Invariant Framework for Adaptive Detection with Structured and Unstructured Covariance Matrices

Sandip Bose, Member, IEEE, and Allan O. Steinhardt, Senior Member, IEEE

Abstract—We introduce a framework for exploring array detection problems in a reduced dimensional space by exploiting the theory of invariance in hypothesis testing. This involves calculating a low-dimensional basis set of functions called the maximal invariant, the statistics of which are often tractable to obtain, thereby making analysis feasible and facilitating the search for tests with some optimality property. Using this approach, we obtain a locally most powerful invariant test for the unstructured covariance case and show that all invariant tests can be expressed in terms of the previously published Kelly's generalized likelihood ratio test (GLRT) and Robey's adaptive matched filter (AMF) test statistics. Applying this framework to structured covariance matrices, corresponding to stochastic interferers in a known subspace, for which the GLRT is unavailable, we obtain the maximal invariant and propose several new invariant detectors that are shown to perform as well as or better than existing ad hoc detectors. These invariant tests are unaffected by most nuisance parameters, hence the variation in the level of performance is sharply reduced. This framework facilitates the search for such tests even when the usual GLRT is unavailable.

I. INTRODUCTION

A. Background and Previous Work

THE problem of detecting a signal vector of known direction but unknown strength in Gaussian noise whose covariance matrix is unknown has received much attention lately. In [1], Reed et al. used the sample covariance estimate from secondary (signal-free) data vectors to derive a weight vector for adaptive detection. This was modified by Robey to obtain the adaptive matched filter (AMF) detector, which has the desirable constant false alarm rate (CFAR) property [2], [3]. In [4], Kelly used the method of the generalized likelihood ratio test (GLRT) to derive another CFAR test.

Both methods assume that the covariance matrix is completely unknown (unstructured). In many applications, however, the array geometry and partial information about the noise environment (number of interferers, rough bearing estimates, etc.) impose a structure on the covariance matrix. It has been shown in [5] and [6] that the use of structured covariance estimates results in a significant improvement in performance, in terms of gain in probability of detection (PD) for fixed PFA, and in a reduction in the number of secondary data vectors required.

In this paper, we will be studying detection problems in the array environment for a number of scenarios within the framework of invariant testing. We will focus on the following structure for the covariance matrix:

R = Ψ B Ψ† + λ R_n    (1)

where R (N × N) is the covariance matrix, Ψ (N × d) spans a rank-d subspace, and R_n is a known covariance matrix. For this paper, we assume that Ψ is known while B and λ are not. This structure not only corresponds to the case of a low-rank interference component in a dominant subspace (which frequently arises in narrowband processing when the noise has an interference component due to a small number of sources superimposed on the receiver noise, which is usually white), but also reduces, as a special case, to the unstructured matrix when d equals N. We shall therefore work with this model to obtain general results that can then be applied to specific instances. We will also look at related cases such as the block diagonal form for the covariance, which may be used to model a nonstationary environment.

Unfortunately, it turns out that for these covariance structures, with the signal bearing and waveform known, it becomes intractable to use the GLRT procedure to obtain a test statistic (the function of the data that the test compares to a threshold). Further, the GLRT has no optimality property among all tests. However, if attention is restricted to invariant tests, that is, tests that do not distinguish between scenarios differing in their nuisance parameters, then indeed the GLRT has the property of asymptotically approaching the uniformly most powerful test among those tests. The invariance criterion is a reasonable restriction to impose since it equalizes performance among diverse scenarios sharing the same significant parameters (e.g., signal-to-noise ratio (SNR)). Moreover, it can be shown in certain cases that the uniformly most powerful invariant (UMPI) test—if it exists—yields the minimax test for the problem in the sense of maximizing the minimum (over parameter values) of the probability of detection (PD) for a given probability of false alarm (PFA). In practice, invariant tests are widely used in hypothesis testing ([7]-[9]). Applying this criterion has the effect of drastically reducing the problem size, since it can be shown that all invariant test statistics can be expressed in terms of a function (possibly vector valued

Manuscript received February 25, 1993; revised September 14, 1994. This work was supported by the Air Force under Contract AFOSR-9190149. The associate editor coordinating the review of this paper and approving it for publication was Prof. Douglas Williams.
S. Bose is a postdoctoral research associate at the University of California, Davis, CA 95616 USA.
A. Steinhardt is with Lincoln Laboratory, Massachusetts Institute of Technology, Lexington, MA 02173-9108 USA.
IEEE Log Number 9413310.
1053-587X/95$04.00 © 1995 IEEE

but having much fewer dimensions than the data) of the data, called the maximal invariant. Therefore, characterizing this function greatly facilitates and sometimes completes the search for a good test statistic. Moreover, the distribution of the maximal invariant is parameterized by another low-dimensional function on the parameter space (called the induced maximal invariant). In this way, most of the nuisance parameters are removed from the problem. This could lead to CFAR tests and, further, there are fewer unknown parameters that could adversely affect performance. Consequently, this approach is ideal for array detection problems, which are characterized by high-dimensional data and parameter values, the latter consisting mostly of nuisance parameters.

In the next section, after a brief note on notation, we will look at the problem in the array setting in somewhat greater detail and spell out the cases to be considered. Subsequently, we will digress to give a brief introduction to the theory of invariant testing (for a more thorough treatment, the reader is referred to any of a number of texts [7], [9], [10]; the lecture notes [11] contain a good detailed exposition; Scharf [12] has a lucid treatment of this theory applied to other signal processing applications). Then, we attack the problem of finding the maximal invariant for the main problem being considered. The proofs will be shown in detail and will in fact serve as a guide for obtaining all the results subsequently. The maximal invariant will be used for proposing an invariant test whose performance will be studied and compared with existing tests. We will then take a brief look at the unstructured case studied by Robey and Kelly as a special case and present some key results worked out elsewhere [13]. These include the equivalence of the AMF and GLRT test statistics proposed by them to the maximal invariant, and a new locally most powerful invariant (LMPI) test for low SNR. Finally, as extensions, we will look at the block diagonal structure of the covariance, and the case of an unknown signal bearing vector with unstructured covariance, which leads to a uniformly most powerful invariant (UMPI) test. Further generalizations of our signal models leading to related problems are dealt with elsewhere [14].

B. A Note on Notation

As a general rule, we shall represent scalar quantities by plain italics, e.g., v_1; vectors by boldface lowercase letters, e.g., v_1; and matrices by boldface capitals, e.g., V_1. The corresponding components of larger structures are likewise denoted. When it aids clarity, we indicate the dimensions in the subscript, as in V_{1,N×L}. In respect of random quantities, we depart from the usual convention and represent both the random variable and the value it takes by the same symbol (decided by its dimensions as above) but not italicized, e.g., x_1. This is done for economy: frequently, when dealing with such quantities, we could mean either the random variable (when referring to its distribution) or the value taken (when referring to the test statistic as a function of the observed data). We elaborate wherever it is not clear from the context. Note that we will be dealing with complex quantities throughout this paper. The complex conjugate transpose is denoted by the dagger sign, e.g., x_i†. I_k denotes the k × k identity, and U_m usually refers to an m × m unitary matrix. For handling the distributions of random matrices, we define the vec notation for a matrix X_{N×L} = [x_1 ⋮ x_2 ⋮ … ⋮ x_L] as vec(X) = [x_1^t … x_L^t]^t (the columns of the matrix are stacked up to form a vector in an NL-dimensional augmented space), and the Kronecker product, ⊗, according to the convention of Kelly [15], as A ⊗ B = [(A b_kl)], where B = [b_kl]. The quantities pertaining to the augmented space are denoted with a tilde, as in the following: x̃ = vec(X). The distribution of the matrix X can now be expressed in terms of that of the vector x̃. If the columns of X are i.i.d. Gaussian random vectors, each with covariance R, then the covariance of x̃ is easily seen to be R ⊗ I_L.

C. Problem Statement: The Array Environment

Fig. 1 illustrates the array environment. An array of N sensors is used to collect data and outputs L N-dimensional complex vectors. These could be time snapshots with quadrature sampling, or they could be the output of some preprocessing (e.g., beam-space [16, ch. 5]).

These output vectors are modeled as the sum of Gaussian noise (which contains a component due to interference sources, the rest being sensor noise) and possibly the signal whose presence we are trying to detect:

x_i = ρ a b_i + n_i,  i = 1, …, L.

The noise components n_i are assumed to be i.i.d. zero-mean complex Gaussian vectors with an unknown covariance matrix R. However, it may be possible to impose a structure on R, and we shall look primarily at the subspace structure in (1), corresponding to the presence of strong stochastic interferers at known (roughly) locations (Fig. 3). The signal bearing vector a (which may correspond to an actual bearing of a target in the radar context) is common to all the vectors and is assumed to be known. Likewise, the time waveform given by b_i is assumed to be known. This is equivalent to the time waveform being an impulse concentrating all its energy in the first snapshot, for one could unitarily mix the i.i.d. vectors so as to make this true without changing anything else in the problem. Another way of obtaining the latter case is when there is one vector (the primary) that might contain the signal, and the rest (the secondaries) are noise vectors that, along with that in the primary, are i.i.d. This basic signal model might arise in active radar scenarios. In a later paper [14], we shall explore multidimensional signal models as well. Here, ρ refers to the signal amplitude (both a and the time waveform b are normalized to have unit norm) and is unknown. Our results are equally applicable to deterministic signals or stochastic signals (then ρ is random and |ρ|² has to be replaced by its expectation in the expression for signal-to-noise ratio), though for most purposes we shall assume that they are deterministic unknown.
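To make the model concrete, the following sketch simulates data from x_i = ρ a b_i + n_i and verifies the claim above that unitarily mixing the snapshots concentrates all the signal energy in the first (primary) snapshot without coloring the noise. All sizes and values are illustrative placeholders, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 4, 6
cn = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)

a = cn(N); a /= np.linalg.norm(a)      # known unit-norm bearing vector
b = cn(L); b /= np.linalg.norm(b)      # known unit-norm time waveform
rho = 3.0                              # unknown amplitude

# snapshots x_i = rho * a * b_i + n_i collected as the columns of X
X = rho * np.outer(a, b) + cn(N, L)

# build a unitary V with V @ b = e_1; mixing snapshots via X -> X @ V.T then
# moves all signal energy into the first snapshot
Q, _ = np.linalg.qr(np.column_stack([b, cn(L, L - 1)]))
phase = Q[:, 0].conj() @ b             # QR fixes the first column only up to a phase
V = Q.conj().T / phase
assert np.allclose(V @ b, np.eye(L)[:, 0])
assert np.allclose(V @ V.conj().T, np.eye(L))   # still unitary: noise stays white

Mean = rho * np.outer(a, b) @ V.T      # mean of the mixed data
assert np.allclose(Mean[:, 0], rho * a)          # primary carries the signal
assert np.allclose(Mean[:, 1:], 0, atol=1e-12)   # secondaries are signal free
```

Because V is unitary and the n_i are i.i.d., the mixed columns remain i.i.d. with the same covariance, which is the equivalence claimed in the text.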

Fig. 1. Typical array environment. [Figure: panel titled "Adaptive detection in the array environment."]
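The augmented-space identity stated in the notation section—that i.i.d. columns with covariance R give vec(X) the block-diagonal covariance written R ⊗ I_L in the paper's Kronecker convention—can be checked numerically. A sketch with arbitrary sizes follows; note that with numpy's Kronecker convention the same matrix is np.kron(I_L, R).

```python
import numpy as np

rng = np.random.default_rng(1)
N, L, M = 4, 3, 100_000

A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
R = A @ A.conj().T                     # some Hermitian positive definite covariance
C = np.linalg.cholesky(R)

# M realizations of X whose columns are i.i.d. CN(0, R)
U = (rng.standard_normal((M, N, L)) + 1j * rng.standard_normal((M, N, L))) / np.sqrt(2)
Xs = C @ U                             # broadcasted matmul colors every column

# vec stacks columns: x_tilde lives in the NL-dimensional augmented space
vecs = Xs.transpose(0, 2, 1).reshape(M, N * L)
assert np.allclose(vecs[0, :N], Xs[0][:, 0])

cov_emp = vecs.T @ vecs.conj() / M     # empirical E[x x^dagger]
cov_true = np.kron(np.eye(L), R)       # the paper's R (x) I_L, in numpy's convention
rel_err = np.max(np.abs(cov_emp - cov_true)) / np.max(np.abs(R))
assert rel_err < 0.1
```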

Having said all this, we can now formulate our signal detection problem as a hypothesis testing problem:

H_0: ρ = 0 versus H_1: ρ ≠ 0.

Note that the covariance matrix is a high-dimensional nuisance parameter. We now proceed to reduce the size of the problem and eliminate the nuisance parameter as far as possible by imposing the requirement of invariance on our test. But for this, we need to review the theory of invariance in hypothesis testing, which we shall do in the next section.

II. INVARIANCE

A. Motivation

We can motivate the notion of invariance by observing that in the kind of hypothesis testing problems looked at here, there is an unknown high-dimensional parameter, namely, the covariance, that is not relevant to the decision. However, this could affect the distribution of a given test statistic, thereby throwing its CFAR nature into question. Besides, it could also adversely affect the performance of the test. We would therefore want a framework for choosing tests in an optimal way that is immune to such nuisance parameters.

We can do this by using a special class of transformations on the data that have the property of leaving the decision problem unchanged. Now, these are those for which the distribution of the transformed data belongs to the same family of distributions as the original, with possibly different values of the parameters, but corresponds to the same hypothesis as the original. This means that the parameters that are significant for the hypothesis are left unchanged, or stay within a certain region of the parameter space, while the remaining parameters—the nuisance parameters—are left free to change. Thus, for the signal detection problem, the statistical mean of the data (which is significant) should be left unchanged, whereas we do not care if the covariance (the nuisance parameter) is altered.

When we apply such a transformation to the observed data, the transformed data continues to support the original hypothesis, and therefore it is reasonable to require our test to yield the same decision. At the same time, the altered nuisance parameters get washed out from the description of such an invariant test. Therefore, we try to characterize as many transformations as possible that satisfy the above prescription. It turns out that these transformations possess a group structure that greatly facilitates this task. In fact, the requirement of the invariance of the test can be achieved by requiring that the test statistic (a function of the data) be invariant to these transformation groups. Further, all invariant tests can be characterized by this means, and it becomes possible to answer general questions such as whether they are CFAR (which they often are) or whether an optimum test exists among them. Thus, we have a means for generating all invariant tests and describing their performance.
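The practical consequence can be seen in a toy Monte Carlo experiment (our own illustration, not from the paper): a statistic that ignores the unknown noise power needs a detection threshold that moves with that power, while a scale-invariant ratio does not.

```python
import numpy as np

rng = np.random.default_rng(2)
n, M = 16, 20_000

def h0_quantile(stat, sigma, q=0.99):
    # empirical H0 threshold for a target false-alarm rate of 1 - q
    X = sigma * (rng.standard_normal((M, n)) + 1j * rng.standard_normal((M, n))) / np.sqrt(2)
    return np.quantile(stat(X), q)

t_energy = lambda X: np.abs(X[:, 0])**2                 # not invariant to common scaling
t_ratio = lambda X: np.abs(X[:, 0])**2 / np.sum(np.abs(X[:, 1:])**2, axis=1)  # invariant

q_e1, q_e9 = h0_quantile(t_energy, 1.0), h0_quantile(t_energy, 3.0)
q_r1, q_r9 = h0_quantile(t_ratio, 1.0), h0_quantile(t_ratio, 3.0)

assert q_e9 > 5 * q_e1                 # threshold depends on the nuisance parameter
assert abs(q_r1 - q_r9) < 0.3 * q_r1   # CFAR-like: threshold is essentially fixed
```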

[Figure: the maximal invariant approach—the data X is scaled and rotated in the signal-free subspace, and the decision should be the same; the orbits are cones on which |x_1|²/‖x_2‖² is constant (and different on different cones), so this ratio is a 1-D maximal invariant and the decision statistic.]

Fig. 2. Various signal and interference scenarios.
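In code, the orbit picture reads as follows (a hypothetical sketch: the group scales the whole vector and rotates the signal-free components, and the cone statistic is constant on each orbit):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
cn = lambda *s: rng.standard_normal(s) + 1j * rng.standard_normal(s)

stat = lambda x: np.abs(x[0])**2 / np.sum(np.abs(x[1:])**2)  # the cone index

x = cn(n)
s0 = stat(x)
for _ in range(100):
    c = complex(rng.standard_normal(), rng.standard_normal())  # common scaling
    U, _ = np.linalg.qr(cn(n - 1, n - 1))                      # rotation in the signal-free subspace
    gx = c * np.concatenate(([x[0]], U @ x[1:]))
    assert np.isclose(stat(gx), s0)    # constant on the orbit (cone) of x

# a generic second point lies on a different cone, and the statistic separates them
y = cn(n)
assert not np.isclose(stat(y), s0)
```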

Further, there are other advantages to restricting attention to such tests, namely, that there is a substantial reduction in the size of the problem, both in terms of the class of test statistics to be considered and the parameters describing the performance of such tests.

B. Formulation

We now formulate this idea more precisely following Lehmann [7].

Let X be the observed data, which we regard as a value in the sample space of a random variable (possibly multivariate). This has the probability distribution P_ω, ω ∈ Ω, where ω is the parameter (possibly vector-valued) describing the distribution and lying in the parameter space Ω. Let g be a bijective (1-1 onto) transformation on the sample space such that g(X) is distributed according to P_ω′, ω′ ∈ Ω. This transformation thereby induces a transformation ḡ on the parameter space defined by ḡ(ω) = ω′.

Now, the decision problem can be specified in terms of the location of ω in Ω. Thus, if Ω_0 and Ω_1 form a partitioning of Ω, we can write the choice between the hypotheses as follows:

H_0: ω ∈ Ω_0 versus H_1: ω ∈ Ω_1.

It is easy to see that this decision problem is invariant to the transformation g if the corresponding induced transformation maps each of the partitions of Ω to itself:

ḡΩ_i = Ω_i,  i = 0, 1.

It is shown in [7] that the set of all such transformations, g, forms a group G. In that case, we require the decision statistic to be invariant to all transformations in G.

C. The Maximal Invariant

It turns out that the class of all invariant tests can be characterized as follows: the group, G, acts on the sample space and partitions it into equivalence classes or orbits. These orbits can be indexed by a set of functions or, more concisely, by a vector-valued function on the sample space that is called the maximal invariant. Clearly, the maximal invariant is not unique; any other function related in 1-1 fashion to it also indexes the orbits and can be called a maximal invariant. However, the orbits are unique, and any function indexing them is related as above to any other such function; in that sense, we shall refer to this as the maximal invariant. It is shown in [7] that all invariant test statistics are functions of the maximal invariant.

Formally, the function M(X) is maximal invariant under the group G if and only if

M(X) = M[g(X)] for all g in the group G, and
M(X_1) = M(X_2) implies that X_1 = g(X_2) for some g ∈ G.

It is further shown that the distribution of the maximal invariant depends on the corresponding quantity generated by the induced transformation group acting on the parameter space, called the induced maximal invariant, which we denote by θ(ω):

θ(ω) = θ(ḡω) for all ḡ, and
θ(ω_1) = θ(ω_2) implies that ω_1 = ḡω_2 for some ḡ.

[Figure: the various cases considered—I: R unstructured, known steering vector a; II: R unstructured, unknown a; III: R with the subspace structure (Ψ known; B, σ² unknown), known a. Note: the signal waveform, b, is known; hence we can take b = e_1.]

Fig. 3. Toy example: cell averaging CFAR.

This is again a low-dimensional quantity, and thereby most of the nuisance parameters are eliminated from the problem, which is a very desirable feature. The maximal invariant also often turns out to be low dimensional (since it is obtained by exploiting the symmetries in the problem), and that greatly reduces the problem size and facilitates the construction of a reasonable test statistic.

D. A Toy Example

We illustrate these ideas with the following toy example adapted from [12] (see Fig. 2), which involves sinusoid detection in white noise (the cell-averaging CFAR detector).

We run a discrete Fourier transform (DFT) on a given sequence z_1, …, z_n consisting of white Gaussian noise with possibly a sinusoid, and call the result Z(1), …, Z(n). We want to detect if the sinusoid is in the kth bin (say). Note that the DFT values of white noise are also distributed as white noise, while the sinusoid amplitude in the kth bin gets transformed into the mean of Z(k). Therefore, our problem is to determine if that mean is nonzero. We write z̃_1 = Z(k) and z̃_2 = […, Z(k′), …]^t, k′ ≠ k. Then, any unitary operation on z̃_2 and common scaling of all the DFT values will preserve the whiteness of the noise and will not shift the location of the mean. These transformations, therefore, will leave the decision problem unchanged. The orbits under this group of transformations are the cones (as in Fig. 3) whose axis lies along the span of z̃_1, and these are indexed by the ratio of the norm along this axis to that perpendicular to it, i.e., by the function given by |z̃_1|²/‖z̃_2‖². Therefore, this specifies the maximal invariant. But this is also the familiar scalar CFAR statistic ([12, p. 140], [16]). Further, the induced maximal invariant is simply the SNR, and so under H_0 the distribution of the maximal invariant is independent of any parameters, leading to the CFAR property.

The principle of invariance greatly reduces the class of detectors to be considered and, frequently, it may become possible to find a uniformly most powerful test within this smaller invariant class (UMPI), even though no general UMP test may exist. Often, the GLRT procedure leads to such a test. In our case, since the GLRT is unavailable, we proceed by deriving the group of transformations that leave the problem invariant and then derive the maximal invariant.

III. APPLICATION OF INVARIANCE

A. The Subspace Detection Problem

To begin, consider as a special case of (1) the following structure for the covariance:

R = Ψ̃ B Ψ̃† + σ² I_N,  Ψ̃ = [e_1 ⋮ … ⋮ e_d].    (2)

This is completely equivalent to (1), since it can be obtained from it by a known linear transformation on the data (prewhiten with respect to R_n, and apply a unitary transformation rotating the basis vectors of the subspace spanned by Ψ onto E_d, the subspace formed by the vectors e_i, i = 1, …, d). The statistics of the data and the nature of the problem are unaltered.
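The reduction from (1) to (2) can be sketched numerically: prewhitening with respect to R_n and rotating span(Ψ) onto the first d coordinates leaves a covariance of exactly the form (2). All values below are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(5)
N, d, lam = 6, 2, 0.7
cn = lambda *s: rng.standard_normal(s) + 1j * rng.standard_normal(s)

Psi = cn(N, d)                                   # known interference subspace basis
B = (lambda A: A @ A.conj().T)(cn(d, d))         # unknown interference covariance
Rn = (lambda A: A @ A.conj().T + N * np.eye(N))(cn(N, N))  # known noise covariance

R = Psi @ B @ Psi.conj().T + lam * Rn            # the structure in (1)

Ln = np.linalg.cholesky(Rn)                      # R_n = Ln Ln^dagger
Lninv = np.linalg.inv(Ln)
Rw = Lninv @ R @ Lninv.conj().T                  # prewhitened covariance

Pw = Lninv @ Psi                                 # whitened subspace basis
W, _ = np.linalg.qr(np.column_stack([Pw, cn(N, N - d)]))  # unitary; first d cols span Pw
R2 = W.conj().T @ Rw @ W                         # rotate span(Psi) onto e_1..e_d

# R2 has the form (2): interference confined to the leading d x d block,
# plus lam * I everywhere
assert np.allclose(R2[d:, :d], 0, atol=1e-8)
assert np.allclose(R2[d:, d:], lam * np.eye(N - d), atol=1e-8)
```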

Likewise, invoking appropriate unitary transformations within the two subspaces, we can arrive at the following structure for the signal vector a, and partition the data accordingly:

a = [a_1, 0^t, a_3, 0^t]^t,

X = [ x_11  X_12 ]
    [ x_21  X_22 ]    (3)
    [ x_31  X_32 ]
    [ x_41  X_42 ]

with dimensions x_11 (1×1), X_12 (1×(L−1)); x_21 ((d−1)×1), X_22 ((d−1)×(L−1)); x_31 (1×1), X_32 (1×(L−1)); x_41 ((N−d−1)×1), X_42 ((N−d−1)×(L−1)), where the first column in the data matrix refers to the components in the primary vector, while the second column partitioning is made up of the corresponding components in the secondaries. The first row corresponds to the projection of the signal space onto the interference subspace, while the third contains its projection onto the sensor noise (interference-free) subspace. The second and fourth components are the signal-free auxiliaries in the two spaces, respectively.

We will now use the vec notation to define the following vector so as to facilitate further discussion:

x̃ = [x̃_1^t ⋮ x̃_2^t]^t,  x̃_1 = vec([x_11 X_12; x_21 X_22]),  x̃_2 = vec([x_31 X_32; x_41 X_42]).    (4)

Thus, for L = 2, we will have x̃ = [x_11, x_21^t, x_12, x_22^t, x_31, x_41^t, x_32, x_42^t]^t.

It is easy to see that x̃ has the complex Gaussian distribution x̃ ~ CN(ρã, R̃), with the mean equal to ρã and the covariance R̃ having the following structure:

R̃ = [ R_1 ⊗ I_L      0             ]    (5)
    [ 0               σ² I_{(N−d)L} ]

where R_1 denotes the (unstructured) covariance within the interference subspace.

B. Characterization of the Transformation Groups

This decision problem is invariant to all transformations that preserve:
a) the Gaussian nature of the distribution;
b) the structure of the covariance matrix; and
c) the signal space, i.e., the mean vector to within a scale factor. Also, zero-mean data should stay zero mean.

Condition a) is assured if we consider linear affine transformations. In fact, we need consider only such transformations, since for any general transformation that takes a Gaussian-distributed input to a Gaussian output, there is an equivalent affine transformation in terms of the distribution of the output. Thus, the transformation has to be of the type

g(x̃) = G x̃ + b    (7)

where G is an NL-dimensional square matrix.

When we take conditions b) and c) into account, we obtain the following result:

Proposition 1: A linear affine transformation, g, of the form in (7) satisfies the conditions b) and c) given above if and only if b = 0 and G is of the form

G = [ G_11  0    ],  G_11 = G_1 ⊗ Q,  G_1 = [ α  β ],  Q = [ 1  0   ],  G_22 = α [ 1  0   ]    (8)
    [ 0     G_22 ]                          [ 0  Γ ]       [ 0  U_1 ]             [ 0  U_2 ]

where α (1×1) and β (1×(d−1)) are arbitrary quantities (α ≠ 0) with the dimensions as in the subscripts, Γ ((d−1)×(d−1)) is any nonsingular matrix, and U_1 ((L−1)×(L−1)) and U_2 (((N−d)L−1)×((N−d)L−1)) are unitary matrices but otherwise arbitrary.

Proof: In the Appendix.

Discussion: First, we note that the set of transformations typified by (8) forms a group under matrix multiplication. The lemma implies that this is the largest useful group within linear transformations that leaves the problem invariant.

Let us now look at what the above transformation does. The upper left block G_11 operates on x̃_1 in the interference space. In fact, we have

G_11 x̃_1 = vec(G_1 X_1 Q)    (9)

where we have used the fact that (A ⊗ B) vec(X) = vec(AXB).

Now, the premultiplying matrix G_1 is the most general matrix (to within a scale factor) that has e_1 (which is the mean vector in the interference space) as its eigenvector. Thus, it transforms the covariance R_1 to G_1 R_1 G_1† while leaving the mean vector unchanged. The postmultiplying matrix Q leaves the primary unchanged while unitarily mixing the i.i.d. secondary vectors, which again yields i.i.d. vectors with the same distribution.

The lower right block G_22 likewise preserves the mean vector in the sensor noise subspace while unitarily mixing the white noise components (i.i.d.), thereby leaving the distribution unchanged. The common scale factor α ensures that the overall mean vector a is preserved.

Thus, we see that the block diagonal structure of G enables us to look at the effect on x̃_1 and x̃_2 separately. This greatly simplifies our task of finding the maximal invariant, which we do in the next section.

C. Characterization of the Maximal Invariant

Recall that the maximal invariant is a function of x̃, m(x̃), which may be vector valued, such that

m(x̃) = m(Gx̃) for all G, and
m(x̃′) = m(x̃) implies there exists G such that x̃′ = Gx̃    (11)

where G is of the form in (8).
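A quick numerical check of two claims in the discussion—that e_1 is an eigenvector of every such G_1 and that the set is closed under multiplication—using the block form reconstructed above with placeholder dimensions:

```python
import numpy as np

rng = np.random.default_rng(6)
d = 4
cn = lambda *s: rng.standard_normal(s) + 1j * rng.standard_normal(s)

def random_G1():
    """A matrix of the form [alpha, beta; 0, Gamma]: e_1 is an eigenvector."""
    G1 = np.zeros((d, d), dtype=complex)
    G1[0, 0] = complex(rng.standard_normal(), rng.standard_normal())   # alpha
    G1[0, 1:] = cn(d - 1)                                              # beta
    G1[1:, 1:] = cn(d - 1, d - 1)                                      # Gamma (generically nonsingular)
    return G1

e1 = np.eye(d)[:, 0]
A, B = random_G1(), random_G1()

assert np.allclose(A @ e1, A[0, 0] * e1)   # e_1 is an eigenvector, eigenvalue alpha
C = A @ B
assert np.allclose(C[1:, 0], 0)            # the product has the same zero pattern
assert abs(np.linalg.det(C)) > 0           # and is again nonsingular (generically)
```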


Using this prescription, we prove the following:

Proposition 2: The maximal invariant m(x̃) to the group of transformations characterized by (8) is the 4-D vector whose components are given by

m_1 = |x_11 − X_12 X_22†(X_22 X_22†)^{-1} x_21|² / (X_12 [I − X_22†(X_22 X_22†)^{-1} X_22] X_12†)
m_2 = x_21†(X_22 X_22†)^{-1} x_21
m_3 = |x_31|² / (‖X_32‖² + ‖x_41‖² + ‖X_42‖²)    (12)
m_4 = (x_11 − X_12 X_22†(X_22 X_22†)^{-1} x_21) / x_31.

Proof: It is easily verified that the components above are all invariant to G (the first half of (11)). To show the second half, we assume that m(x̃′) = m(x̃) and construct a matrix G of the form in (8) such that x̃′ = Gx̃. We will follow the convention that all primed quantities refer to x̃′.

First, look at m_2. Let [x_21 ⋮ X_22] = US[v_1 ⋮ V_2] be the rectangular SVD of [x_21 ⋮ X_22]. Then, m_2 simplifies as

m_2 = v_1†(V_2 V_2†)^{-1} v_1.

We can simplify this further using the fact that the rows of [v_1 ⋮ V_2] are orthonormal:

v_1 v_1† + V_2 V_2† = I.    (13)

Applying Woodbury's inversion formula and simplifying, we get

m_2 = ‖v_1‖² / (1 − ‖v_1‖²).    (14)

Thus, m_2 is a monotonic and hence one-to-one function of ‖v_1‖. Therefore, m_2 = m_2′ implies that ‖v_1‖ = ‖v_1′‖, and there exists a unitary Q_1 such that v_1′ = Q_1 v_1.

Now, choose Γ = U′S′Q_1 S^{-1} U†. Then

Γ[x_21 ⋮ X_22] = U′S′[v_1′ ⋮ Q_1 V_2].

The form in (8) permits us to postmultiply by a unitary matrix, so we need to find a unitary Q_2 such that Q_1 V_2 Q_2 = V_2′. To show that this is possible, we examine the SVDs of V_2 and V_2′. Again, using the orthonormality of the rows in (13), we obtain

V_2 = [ηv_1 ⋮ U_v] [σ, 0; 0, I] V_v†  and  V_2′ = [ηv_1′ ⋮ U_v′] [σ, 0; 0, I] V_v′†    (15)

where η = 1/‖v_1‖ = 1/‖v_1′‖ and σ = (1 − ‖v_1‖²)^{1/2}. Hence, the left singular vectors of Q_1V_2 are given by [ηv_1′ ⋮ Q_1U_v]. Since these form a completely orthonormal set, the columns of Q_1U_v must span the column space of U_v′, i.e., Q_1U_v = U_v′Θ for some unitary Θ. Thus, we can simplify and write

Q_1 V_2 = [ηv_1′ ⋮ U_v′] [σ, 0; 0, I] ([1, 0; 0, Θ] V_v†)    (16)

where we have commuted Θ and I. Clearly, we can find Q_2 such that the right singular vectors in (16) correspond to V_v′, since these are orthonormal vectors spanning a (d − 1)-dimensional subspace of C^{L−1}. In fact, since L ≥ d, Q_2 is not uniquely specified by the above constraint. Any rotation Q_3 in the orthogonal complement of the row space of V_2′ will leave (16) unchanged if operated from the right. Thus, we can replace Q_2 by Q_2Q_3. We shall use this fact later, keeping in mind that the row space of V_2′ is also the row space of X_22′.

Now, look at m_1. We can write

X_12 = (X_12 − b†X_22) + b†X_22,  b† = X_12 X_22†(X_22 X_22†)^{-1}    (17)

where b is some (d − 1)-dimensional vector and the first term is orthogonal to the rows of X_22. Then, m_1 simplifies to

m_1 = |x_11 − b†x_21|² / ‖X_12 − b†X_22‖²

with a corresponding equation for m_1′. Let ξ be a scalar with |ξ| = ‖X_12′ − b′†X_22′‖ / ‖X_12 − b†X_22‖. Then, m_1 = m_1′ leads to the following, for an appropriate choice of the phase of ξ:

x_11′ − b′†x_21′ = ξ(x_11 − b†x_21).    (18)

We are done with respect to G_1 if we can find α and β such that the first row of G_1 reproduces the primed data, i.e.,

x_11′ = αx_11 + βx_21,  X_12′ = (αX_12 + βX_22)Q_2.    (19)

To satisfy (18), we need α = ξ and β = b′†Γ − ξb†. Equation (19) then implies that we must have

X_12′ = αX_12Q_2 + (b′†Γ − ξb†)X_22Q_2

which leads to

X_12′ = ξ(X_12 − b†X_22)Q_2 + b′†X_22′.    (20)

Now, we throw in Q_3 to rotate ξ(X_12 − b†X_22)Q_2 onto X_12′ − b′†X_22′. (Note that these are both in the orthogonal complement of the row space of X_22′; of course, X_22′ itself is not affected by Q_3.) Since ξ has been chosen to match their respective norms, we can make the first term in (20) equal to X_12′ − b′†X_22′, and so (20) becomes identical to (17) expressed in primed quantities. Thus, we have obtained G_1 of the required form (U_1 = Q_2Q_3) to transform x̃_1 to x̃_1′.

The case of m_3 is straightforward to show. In particular, we note that it is simply the ratio of the squared norm of the first element of x̃_2 to that of the rest of the vector. It is trivially true that the first element of x̃_2′ is a scaled version of that of x̃_2, and likewise that the rest of x̃_2′ is a scaled and rotated version of the rest of x̃_2. m_3 = m_3′ implies that the scaling factor is common, so a scaled block unitary matrix having the form of G_22 will take x̃_2 to x̃_2′. Finally, m_4 = m_4′ implies that this scaling factor equals α in (19), and so we get the form in (8) for the transformation relating x̃_1 and x̃_2. That completes the proof.
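As a numerical sanity check of the invariance half of the proposition, the sketch below implements one consistent reading of the badly scanned display (12)—the exact formulas inside invariants() are our reconstruction from the proof, not verbatim from the paper—and verifies that all four components are unchanged by a random group element of the form (8):

```python
import numpy as np

rng = np.random.default_rng(7)
N, d, L = 7, 3, 6
cn = lambda *s: rng.standard_normal(s) + 1j * rng.standard_normal(s)

X1 = cn(d, L)                  # data in the interference subspace, primary column first
x2 = cn((N - d) * L)           # stacked sensor-noise-subspace data, signal entry first

def invariants(X1, x2):
    x11, x12 = X1[0, 0], X1[0, 1:]
    x21, X22 = X1[1:, 0], X1[1:, 1:]
    Sinv = np.linalg.inv(X22 @ X22.conj().T)
    bdag = x12 @ X22.conj().T @ Sinv               # the b^dagger of the proof
    num = x11 - bdag @ x21
    P = X22.conj().T @ Sinv @ X22                  # projector onto the row space of X22
    m1 = np.abs(num)**2 / np.real(x12 @ (np.eye(L - 1) - P) @ x12.conj())
    m2 = np.real(x21.conj() @ Sinv @ x21)
    m3 = np.abs(x2[0])**2 / np.sum(np.abs(x2[1:])**2)
    m4 = num / x2[0]
    return np.array([m1, m2, m3, m4])

# a random group element: G1 = [alpha, beta; 0, Gamma], Q mixes the secondaries,
# and alpha * diag(1, U2) acts on the noise-subspace vector
alpha = complex(rng.standard_normal(), rng.standard_normal())
G1 = np.zeros((d, d), dtype=complex)
G1[0, 0], G1[0, 1:], G1[1:, 1:] = alpha, cn(d - 1), cn(d - 1, d - 1)
Q = np.zeros((L, L), dtype=complex)
Q[0, 0] = 1.0
Q[1:, 1:], _ = np.linalg.qr(cn(L - 1, L - 1))
U2, _ = np.linalg.qr(cn((N - d) * L - 1, (N - d) * L - 1))

x2p = alpha * np.concatenate(([x2[0]], U2 @ x2[1:]))
assert np.allclose(invariants(G1 @ X1 @ Q, x2p), invariants(X1, x2))
```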

Discussion: The maximal invariant set, interestingly, can be related to quantities already known in the literature. Thus, m_1 is exactly the AMF detector introduced by Robey [3] if applied only to the data in the interference subspace, within which the covariance is unstructured. 1/(1 + m_2) is the corresponding loss factor. We will have more to say on this when we consider the unstructured covariance case in the next section. m_3 is simply the F-test statistic as applied to the sensor noise space. m_4 involves data from the signal space in each subspace and incorporates the requirement of coherently combining this data so as to exploit the fact that the signal bearing vector is known.

D. The Induced Maximal Invariant

In order to arrive at the test statistic and gain insight into its performance, we need to characterize the distribution of this 4-D maximal invariant vector. We recall from Lehmann [7] that the distribution depends only on the maximal invariant to the transformation induced on the parameter space. To this end, we prove the following.

Proposition 3: The maximal invariant θ(ρa, R) to the transformations induced on the parameter space by the group characterized by (8) is given by

θ_1 = |ρ|² r,  θ_2 = σ² r,  r = (R_1^{-1})_{11}.    (21)

Proof: We first note that the induced transformation group is characterized by ḡ: (ρa, R) → (ρ′a, R′), with ρ′a = ρGa and R′ = GRG†, where G is as in (8). As before, invariance is straightforward to verify.

The parameter θ_1 depends on R_1 and ρ only. As before, we assume that θ_1′ = θ_1 and construct a transformation G_1 as in (40) taking (ρ, R_1) to (ρ′, R_1′). Using (38) and the structure (which is full rank) of G_1 in (40), we see that these conditions are equivalent to

R_1′^{-1} = G_1^{-†} R_1^{-1} G_1^{-1},  ρ′ = αρ.    (23)

Again, the structure in (40) implies that G_1^{-1} can be represented as α^{-1}[e_1 ⋮ H]. Partition R_1^{-1} as

R_1^{-1} = [ r   p̄† ]
           [ p̄  R̄  ].

Thus, from (23) we require

r′ = |α|^{-2} r,  ρ′ = αρ    (24)

p̄′ = |α|^{-2} H†R_1^{-1}e_1,  R̄′ = |α|^{-2} H†R_1^{-1}H.    (25)

Since θ_1 = |ρ|²r, it follows that θ_1′ = θ_1 implies that both conditions in (24) can be satisfied by the appropriate choice of α. Now, we have to show that those in (25) can be met by exploiting the freedom in picking H. Let R_1^{-1} = UΛU† be the eigendecomposition of R_1^{-1}, and likewise let R̄′ = Ū′Λ̄′Ū′†. Choose H = αUΛ^{-1/2}PΛ̄′^{1/2}Ū′†, where P is a d × (d − 1) matrix with orthonormal columns. Then, the condition involving R̄′ is satisfied. The first condition in (25) then becomes a constraint on P, which can be satisfied provided the quantity on the left-hand side has a lower norm than the quantity in parentheses on the right-hand side. But these norms are governed by r′ and r, respectively, and this is exactly the condition required for R_1′^{-1} to be positive definite. Thus, this form of H works, and we are done with respect to θ_1. The case for θ_2 is now trivial to show, and that completes the proof.

Discussion: Let us examine what the quantities θ_1 and θ_2 signify. The former is simply the SNR in the interference subspace, while the latter is the ratio of the noise powers in the signal space in each subspace. This also specifies the SNR in the sensor white noise (interference-free) space, since the signal bearing is known. The distribution of the maximal invariant (and hence of any function of it) then depends only on these parameters. Since θ_2 is nontrivial even under H_0, our invariant test statistic will have a distribution with an unknown parameter. Therefore, a good invariant test will not be CFAR in general.

The joint pdf is derived in [17]. The expressions are complicated; however, the marginals give insight into the distribution. Thus, m_1, m_2, and m_3 are all distributed like the F-statistic, while m_4 corresponds to the complex Cauchy. There is no uniformly most powerful test, and a good test has to be picked based on some heuristic involving the F-statistic. We shall do so next.

E. Proposed Detector

For the structure in (2), the UMPI test does not exist, and the notion of an LMPI test is not directly applicable either. However, based on considerations of the maximum likelihood estimates of the covariance from the signal-free data, we obtain a heuristic invariant test that reduces to the Kelly statistic for the unstructured case and asymptotically approaches the clairvoyant detector (the colored noise matched filter that uses the true covariance). The test statistic is given by (26), where S is given by (27), shown at the bottom of the next page, with β_d = 1/[(L − d)(1 + m_2)], and the partitioning of the data and the signal vector is as in (3).

This is shown to be approximately CFAR [17], and the simulation results in Fig. 4 show that it outperforms the Kelly test applied to the data truncated to the span of the
partitions of RL-'. interference and signal spaces. In fact, it does nearly as well
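The clairvoyant benchmark mentioned above, the colored-noise matched filter built from the true covariance R, can be sketched as follows. The dimensions, the steering vector, and all variable names are illustrative assumptions of ours, not values from the paper.

```python
import numpy as np

# Sketch of the clairvoyant detector: the matched filter in colored
# noise with KNOWN covariance R. Sizes and vectors are illustrative.
N = 6
rng = np.random.default_rng(1)

# An example Hermitian positive definite "true" covariance and bearing.
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
R = A @ A.conj().T + N * np.eye(N)            # guaranteed positive definite
s = np.exp(1j * np.pi * 0.3 * np.arange(N))   # example steering vector

def clairvoyant_stat(x, R, s):
    """|s^H R^{-1} x|^2 / (s^H R^{-1} s): whitened matched-filter power."""
    Ri_x = np.linalg.solve(R, x)
    Ri_s = np.linalg.solve(R, s)
    return np.abs(s.conj() @ Ri_x) ** 2 / np.real(s.conj() @ Ri_s)

# When the data equals the signal itself, the statistic reduces to
# s^H R^{-1} s, the output SNR of the whitened matched filter.
t = clairvoyant_stat(s, R, s)
assert np.isclose(t, np.real(s.conj() @ np.linalg.solve(R, s)))
```

Since R is assumed known exactly, this statistic needs no secondary data; it is the performance ceiling that the adaptive detectors in Fig. 4 are measured against.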

Fig. 4. Comparison of ROC curves, subspace structure case (N = 10, L = 12, d = 3).

We now turn to the problem of detection with unstructured covariance studied by Kelly and Robey, which is a special case of the foregoing.

IV. THE UNSTRUCTURED CASE

When the structure of the covariance matrix R is not known a priori, the unstructured assumption applies. No constraint is placed on R other than that of being positive definite Hermitian. This is essentially the case studied by Kelly [4], Reed [1], and Robey [2]. We see that this is a special case of the structured subspace case we have studied, with the interference space now being the whole N-dimensional sensor space. There is no interference-free space here. Consequently, the transformation group is given by G11 alone as specified in (10), and the maximal invariant now corresponds to the components involving the interference space only. These are:

   m1 = |x11 − x12 X22†(X22 X22†)⁻¹ x21|² / [x12 (I − X22†(X22 X22†)⁻¹ X22) x12†]
   m2 = x21†(X22 X22†)⁻¹ x21

where x21 is now N − 1 dimensional.

These functions bear an intimate relationship to the AMF (tR) and GLRT (tK) statistics derived by Robey and Kelly, respectively. We show elsewhere [13] that these are given by

   tR = m1,   tK = m1/(1 + m1 + m2) = η/(1 + η)

where η = m1/(1 + m2). Thus, we see that these two test statistics, variously described in the literature, form the components of the maximal invariant. If one is interested in invariant tests for the unstructured covariance case, algebraic combinations of these are all that need be considered.

The corresponding induced maximal invariant is now equivalent to θ1, which is simply the SNR. Thus, we see that the distribution, which depends only on this parameter, is completely given under H0. Thus, not only do we get CFAR tests, but we can also set the threshold for our test for a given PFA without knowing the actual distribution of the noise. This is a revealing insight into the nice CFAR properties of the AMF and the GLRT tests.

B. Joint PDF and the LMPI Test

The joint pdf for the maximal invariant for this case (which is considerably simpler than for the subspace structure case) is derived in [13]. We summarize the key results here. The quantities m1 and m2 are distributed according to the complex F distribution, and the joint pdf involves the constants K = (L − 1)!/[(L − N − 1)!(N − 2)!] and Kk = (L − N)!/[(L − N − k)!(k!)²]. We show in [13] that the uniformly most powerful invariant (UMPI) test does not exist. There is therefore no optimality property for the GLRT detector, and simulations do bear out the fact that the AMF detector performs better in some SNR regimes.

However, since the parameter set is 1-D, it is now possible to derive the locally most powerful invariant (LMPI) test (optimum in the limit of 0 SNR). This is obtained in [13] as

   tL = ρ[(L − N)η − 1/(η + 1)]

where η = m1/(1 + m2) and ρ = 1/(1 + m2) is the loss factor studied in the literature. This test is also CFAR since it is a function of the maximal invariant and so has its distribution parameterized by the SNR only. We obtain a closed-form expression for the PFA and a series expansion for the probability of detection (PD) in [13]; the closed-form PFA expression holds for thresholds τ > 0, which corresponds to PFA < 1/e ≈ 0.37.

The PD series can be calculated numerically, while the closed-form PFA expression enables the calculation of the threshold; the series involves the incomplete Gamma function Gk(x) = e^(−x) Σ(n = 0 to k) xⁿ/n!. Comparison with the Kelly statistic indicates a slightly better performance at very low SNR (a gain of 0.1 dB for N = 4, L = 9, PFA = 0.1 at −5 dB SNR) at the expense of a degradation in the higher SNR region (0.3 dB loss at 10 dB SNR).

V. EXTENSIONS

A. Block Covariance Structure

All the results derived for the subspace structure case generalize easily when R has the block diagonal structure

   R = [Ψ1  0; 0  Ψ2].

Such a model may be appropriate for dealing with nonstationarity, for instance. Now we have two subspaces, each of which is similar to the Ψ-space considered previously. Therefore, we have a five-dimensional maximal invariant, with m1 and m2 the same as before, m3 and m4 equivalent to m1 and m2 but for the second subspace, and m5 corresponding to the "coupling," m4, in the last section.

When there are more blocks, these results generalize with two functions similar to m1 and m2 in each subspace. Further, m5 has to be replaced by similar ratios that reflect the fact that the signal spans all these subspaces.

The induced maximal invariant is again 2-D and includes the ratio of the noise powers in the two subspaces. Therefore, an invariant test is not guaranteed to be CFAR, and a test with a high PD need not be CFAR in general. However, in some cases, the noise level may be known to be of the same order of magnitude, for example, when modeling certain kinds of nonstationary environments. In this case, a CFAR test is shown to perform almost as well as the test proposed by Kelly in [18] (Fig. 5).

Fig. 5. Comparison of ROC curves: the block diagonal case (N = 10, L = 12, d = 5).

B. Unknown Signal Bearing Vector

The application of the maximal invariant framework to more general signal models is discussed in detail in [14]. Here, we shall summarize the results for an important special case when the signal vector is not known and likewise no structure can be imposed on the covariance matrix. This arises in a variety of problems studied in the literature (which appear as forms of the general linear hypothesis problems in statistical textbooks, e.g., Anderson [8, ch. 8]). In our context, this could arise if the bearing is not known because of multipath or because of array calibration errors.

In this case, the invariance group does not have to preserve any subspaces. Thus, it can be characterized as

   X = [x1 ⋮ X2] → ΓX [1  0; 0  U],   Γ nonsingular, U unitary.

The corresponding maximal invariant is 1-D:

   m = x1†(X2 X2†)⁻¹ x1.

Further, if the signal is deterministic unknown, the induced maximal invariant is simply the SNR, and the likelihood ratio is a monotonic function of m. Thus, m itself yields a CFAR and UMPI test.

VI. CONCLUSION

Typical array detection problems involve the search for a suitable function projecting down the multivariate data to a scalar statistic, which is complicated by the presence of high-dimensional nuisance parameters (e.g., the covariance). In many cases, by imposing the requirement of invariance on our detectors, we can reduce the size of the problem and simplify our search. Such a requirement is reasonable on grounds of symmetry and can be shown in some cases to lead to optimality in the minimax sense. To this end, we compute the maximal invariant function for problems involving covariance matrices that are unstructured or have a subspace structure. The maximal invariant specifies the class of all invariant detectors and, being small, makes it feasible to do analysis and to search for an optimal detector in this class. Therefore, we have an alternative and more general framework to the GLRT for arriving at invariant tests and, in addition, can study their optimality properties. Thus, for the unstructured case studied by Kelly and others, we show that the Kelly and the AMF statistics together form the maximal invariant. Further, we show that a UMPI test does not exist and obtain an LMPI test for low SNR. However, a UMPI test exists when the signal bearing vector is not known. For the structured covariance case where the GLRT is intractable, we again obtain a small invariant set whose statistics can be analyzed. For this case, the UMP test does not exist. We propose several new tests and show via simulation that they are equivalent to or better than existing ones. These tests depend on only two parameters; thus most of the high-dimensional nuisance parameters are eliminated.

Further, in cases where the GLRT is intractable or ill-posed, the maximal invariant set provides pointers toward a test statistic and sometimes, if it is 1-D, supplies it as well. This framework is therefore a powerful tool for viewing detection problems in the array context.

APPENDIX
PROOF OF PROPOSITION 1

In order to prove that the transformation matrix G in (7) must have the form in (8) so as to satisfy the conditions b) and c), we partition G as

   G = [G11  G12; G21  G22]

where G11 is Ld × Ld. The transformed data has the covariance matrix

   R′ = [R11′  R12′; R21′  R22′],   Rij′ = Gi1(Ψ̃ ⊗ IL)Gj1† + σ² Gi2 Gj2†.   (29)

Condition b) implies that this has the same structure as R, i.e., for all Ψ̃ and σ², we must have

   R′ = [Ψ̃G ⊗ IL  0; 0  σ′² I(N−d)L]   (30)

where Ψ̃G is any other positive definite Hermitian (PDH) matrix and σ′² is any positive scalar.

Comparing (29) and (30), we see that the following must hold:

   G11(Ψ̃ ⊗ IL)G21† + σ² G12 G22† = 0   (31)
   G21(Ψ̃ ⊗ IL)G21† + σ² G22 G22† = σ′² I(N−d)L.   (32)

Further, since (Ψ̃ ⊗ IL) and σ² can be varied independently, the conditions in (31) and (32) must hold for each of the terms therein. Thus, we have

   G11(Ψ̃ ⊗ IL)G21† = 0,   σ² G12 G22† = 0   (33)
   G21(Ψ̃ ⊗ IL)G21† = a1² I(N−d)L   (34)
   σ² G22 G22† = a2² I(N−d)L   (35)

where a1² and a2² are two nonnegative scalars such that a1² + a2² = σ′².

We note that the diagonalization of (Ψ̃ ⊗ IL) for all PDH Ψ̃, as required by (34), can only be achieved trivially with G21 = 0. Thus, a1² = 0, and so a2² cannot be 0 as well. Hence, G22 = aU, where U ((N − d)L × (N − d)L) is a unitary matrix and a is a nonzero scalar. Equation (33) then implies that G12 = 0. Finally, to establish (30), we have to impose constraints on G11 such that the following holds for all PDH Ψ̃:

   G11(Ψ̃ ⊗ IL)G11† = Ψ̃G ⊗ IL   (36)

where Ψ̃G is some other PDH matrix.

First, we characterize D, the set of all block diagonal PDH matrices of the form of the above covariance matrix: D = {ÃD : ÃD = Ã ⊗ IL, Ã PDH}. If UA Λ UA† is the eigendecomposition of Ã, then that of ÃD is given by

   ÃD = (UA ⊗ Q)(Λ ⊗ IL)(UA ⊗ Q)†   (37)

where Q is any L × L unitary matrix. This is easily verified using the following identity involving the Kronecker product:

   (A ⊗ B)(C ⊗ D) = AC ⊗ BD   (38)

where the dimensions of A, B, C, and D are such that the multiplications are meaningful.

We want to find the form of G11 such that G11 ÃD G11† ∈ D whenever ÃD ∈ D. Let G̃: D → D denote the corresponding linear transformation. Now, G11 has to be full rank for the transformed output to be PDH. Hence, it is invertible, and so G̃ is a 1-1 map. Further, since it is a linear transformation on a finite-dimensional vector space, D, it is an onto map as well, and so the inverse of G̃ exists (which corresponds to the action of G11⁻¹) and also maps D to D. In particular, it maps the identity to an element in D. This means that G11⁻¹ G11⁻† ∈ D, and hence G11 G11† = (G11⁻¹ G11⁻†)⁻¹ belongs to D as well (since the inverse of an element of D is also in D). Therefore, its eigendecomposition must be of the form of (37):

   G11 G11† = (UG ⊗ Q̃)(Σ² ⊗ IL)(UG ⊗ Q̃)†.

Likewise, G11† G11 admits a decomposition of the same form:

   G11† G11 = (VG ⊗ Q̂)(Σ² ⊗ IL)(VG ⊗ Q̂)†.

Thus, the singular value decomposition of G11 is given by

   G11 = (UG ⊗ Q̃)(Σ ⊗ IL)(VG ⊗ Q̂)†   (39)

where Q̃ and Q̂ are unitary matrices as in (37). (This decomposition is not unique; the implications will be addressed shortly.) Note, however, that the degeneracy of the singular values implies that the singular vectors connected with each singular value could be unitarily mixed independently of those associated with all the others. This means that we could pick a different Q̃ and Q̂ for the singular vectors in UG and VG corresponding to each singular value in Σ. However, this more general form of G11 results in the same distribution of the transformed output as the form given above in (39), and therefore it is sufficient to consider this form in what follows.

Again, using (38), we can write G11 as

   G11 = G1 ⊗ Q

where G1 = UG Σ VG† and Q = Q̃ Q̂†.

Finally, to establish the form of G1 and U, we have to consider the effect on the mean vector. Condition c) requires that if x̃ is zero mean, then so is E[T(x̃)], which is possible only if b = 0. It further requires that Gā = aā. The form of ā in (3) then implies that

   (G1 ⊗ Q) ē1 = a ē1,   G22 ē1 = a ē1.

These conditions on the mean vector will be satisfied only if the first column of each of these matrices is a multiple of e1. It is easy to see that this implies the following structure for their constituents:

   G1 = a [1  ρ†; 0  Γ1],   Q = [1  0; 0  Q1],   G22 = a [1  0; 0  U2].   (40)

The structure for G in (8) follows, and that completes the proof.

ACKNOWLEDGMENT

The authors would like to thank E. J. Kelly, who originally proposed the detection problem with structured covariance matrices, and subsequently provided invaluable comments throughout. The GLRT's intractability in the structured case led to our study of invariance techniques for array detection problems, the subject of this paper. We would also like to give special thanks to Prof. A. Hero, who shared his own related work and supplied fresh insights and guidance during several illuminating discussions.

REFERENCES

[1] I. S. Reed, J. D. Mallett, and L. E. Brennan, "Rapid convergence rate in adaptive arrays," IEEE Trans. Aerosp. Electron. Syst., vol. AES-10, pp. 853-863, Nov. 1974.
[2] F. C. Robey, "A covariance modeling approach to adaptive beamforming and detection," MIT Lincoln Laboratory, Tech. Rep. TR-918, July 1991.
[3] F. C. Robey, D. R. Fuhrmann, E. J. Kelly, and R. Nitzberg, "A CFAR adaptive matched filter detector," IEEE Trans. Aerosp. Electron. Syst., vol. 28, no. 1, pp. 208-216, Jan. 1992.
[4] E. J. Kelly, "An adaptive detection algorithm," IEEE Trans. Aerosp. Electron. Syst., vol. AES-22, pp. 115-127, Mar. 1986.
[5] D. Fuhrmann, "Application of Toeplitz covariance estimation to adaptive beamforming and detection," IEEE Trans. Acoust., Speech, Signal Processing, vol. 39, no. 10, pp. 2194-2198, 1991.
[6] I. P. Kirsteins and D. W. Tufts, "Rapidly adaptive nulling of interference," in Proc. GRETSI Conf., Juan-les-Pins, France, June 1989.
[7] E. L. Lehmann, Testing Statistical Hypotheses, 2nd ed. New York: Wiley, 1986, ch. 6, pp. 284-286.
[8] T. W. Anderson, An Introduction to Multivariate Statistical Analysis, 2nd ed., Wiley Series in Probability and Mathematical Statistics. New York: Wiley, 1984.
[9] R. J. Muirhead, Aspects of Multivariate Statistical Theory. New York: Wiley, 1982.
[10] T. S. Ferguson, Mathematical Statistics: A Decision Theoretic Approach. New York: Academic, 1967.
[11] M. L. Eaton, Group Invariance Applications in Statistics, Regional Conference Series in Probability and Statistics, vol. 1. Inst. Math. Stat., 1989.
[12] L. L. Scharf, Statistical Signal Processing: Detection, Estimation, and Time-Series Analysis. Reading, MA: Addison-Wesley, 1991.
[13] S. Bose and A. O. Steinhardt, "The optimum invariant array detector for a weak signal," IEEE Trans. Aerosp. Electron. Syst., to appear, Jan. 1996.
[14] S. Bose and A. O. Steinhardt, "A maximal invariant approach to detection with multidimensional signal models," IEEE Trans. Signal Processing, submitted Sept. 1993.
[15] E. J. Kelly and K. M. Forsythe, "Adaptive detection and parameter estimation for multidimensional signal models," Lincoln Laboratory, Massachusetts Institute of Technology, Tech. Rep. 848, 1989.
[16] S. Haykin and A. Steinhardt, Eds., Adaptive Radar Detection and Estimation, Wiley Series in Remote Sensing. New York: Wiley, 1992, ch. 3.
[17] S. Bose, "Invariant hypothesis testing with sensor arrays," Ph.D. thesis, Cornell University, Ithaca, NY, 1995.
[18] E. J. Kelly, "Adaptive detection in nonstationary interference, Part 1 and Part 2," Lincoln Laboratory, Massachusetts Institute of Technology, Tech. Rep., 1985.

Sandip Bose (M'95) was born in Nagpur, India, in 1966. He obtained his B.Tech. degree from the Indian Institute of Technology, Kanpur, India, in 1988 and his Ph.D. degree from Cornell University, New York, in 1994, both in electrical engineering. He is currently a postdoctoral research associate at the University of California, Davis. His research interests are in statistical and array signal processing with special emphasis on optimum detection and estimation.

Allan O. Steinhardt (S'79-M'82-SM'90) is a member of the technical staff at MIT Lincoln Laboratory, where he conducts research in adaptive array processing. He earned his Ph.D. from the University of Colorado, Boulder. From 1987 to 1993, he was on the faculty at Cornell University. Dr. Steinhardt received the 1986 IEEE Signal Processing Society Paper Award and the 1990 Best Professor Award from the Cornell student chapter of the IEEE. He is co-editor/author with Simon Haykin of Adaptive Radar Detection and Estimation (Wiley, 1992). His research interests include space-time adaptive processing for sensor arrays, optimal detection, and numerical linear algebra.
