Professional Documents
Culture Documents
4, APRIL 2012
AbstractIn global positioning systems (GPS), classical localiza- have penetrated the transport field through a variety of appli-
tion algorithms assume, when the signal is received from the satel- cations such as monitoring of containers, fleet management, or
lite in line-of-sight (LOS) environment, that the pseudorange error car navigation. These applications do not necessarily request
distribution is Gaussian. Such assumption is in some way very re-
strictive since a random error in the pseudorange measure with an high availability, integrity, and accuracy of the positioning
unknown distribution form is always induced in constrained en- system. However, for safety applications such as guidance of
vironments especially in urban canyons due to multipath/masking autonomous vehicles, performances require to be much more
effects. In order to ensure high accuracy positioning, a good es- stringent.
timation of the observation error in these cases is required. To Unfortunately, most of all these transport applications are
address this, an attractive flexible Bayesian nonparametric noise
model based on Dirichlet process mixtures (DPM) is introduced. mainly used in dense urban environments. Even with modern
Since the considered positioning problem involves elements of non- GPS receivers, high positioning accuracy is only achieved in
Gaussianity and nonlinearity and besides, it should be processed LOS conditions when the signal is received directly without
on-line, the suitability of the proposed modeling scheme in a joint any reflection. Since the signals delivered by sensors are easily
state/parameter estimation problem is handled by an efficient Rao- blurred because of hard external conditions in urban canyons,
Blackwellized particle filter (RBPF). Our approach is illustrated
on a data analysis task dealing with joint estimation of vehicles they may deliver totally erroneous measurements, especially
positions and pseudorange errors in a global navigation satellite when the signal reaches the receiver after interaction with sev-
system (GNSS)-based localization context where the GPS informa- eral objects/obstacles. In urban areas, some obstacles (buildings,
tion may be inaccurate because of hard reception conditions. trees, etc.) can be modeled using a simulator and thus some
Index TermsDensity estimation, global navigation satellite of the errors can be predicted. However, some other obstacles
system (GNSS), global positioning systems (GPS), nonpara- (cars, pedestrians, etc.) can appear suddenly and can induce a
metric Bayesian methods, pseudorange errors, Rao-Blackwellized random error in the measurements. Therefore, in order to en-
particle filter (RBPF), sequential Monte Carlo methods, urban sure high accuracy positioning, a good estimation of the obser-
canyon.
vation error in these cases is required. As a consequence, the
incoming signal is most of the time the sum of direct signals
and several delayed replica called non-LOS signals. However,
I. INTRODUCTION
non-LOS signals which are signals received after reflections on
the surrounding obstacles, frequently occur in dense environ-
ments. In some of these environments considered as very con-
A global positioning system (GPS) is a radio navigation
system that relies on radio-frequency signals emitted by
a constellation of satellites. Consequently, it can be easily dis-
strained, it can happen that no direct signals can reach the re-
ceiver which leads to severe accuracy degradation due to the
induced high propagation time delays.
torted by the measuring system and propagation impairments
In this paper, we address a very challenging issue related to
(atmospheric layers, multipath, diffraction, and mask phe-
the positioning degradation accuracy problem when we do not
nomena). Today, global navigation satellite systems (GNSS)
receive the direct signal but just reflected replica of it. In this
case, the receiver tracks only reflected signals. Such phenomena
make the observation error distribution to become a noncentered
Manuscript received May 15, 2011; revised October 04, 2011; accepted De-
cember 02, 2011. Date of publication December 21, 2011; date of current ver- Gaussian distribution because of additional errors induced by
sion March 06, 2012. The associate editor coordinating the review of this man- the propagation time delays of reflected signals. As a conse-
uscript and approving it for publication was Dr. Lawrence Carin. quence, classical localization methods assuming that observa-
A. Rabaoui is with the IMS, LAPS CNRS UMR 5218, Bordeaux, France
(e-mail: asma.rabaoui@u-bordeaux1.fr). tion noises are zero mean Gaussian are not efficient anymore
N. Viandier and J. Marais are with the IFSTTAR, LEOST, Villeneuve dAscq, and can severely impair positioning accuracy.
France.
E. Duflos and P. Vanheeghe are with the LAGIS FRE CNRS 3303, Ecole
Centrale de Lille, Villeneuve dAscq, France. They are also with Team-Project
A. Previous Work
SequeL, INRIA Lille Nord-Europe, Villeneuve dAscq, France.
Literature focusing on techniques for localization perfor-
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org. mance enhancement in constrained environments is abundant.
Digital Object Identifier 10.1109/TSP.2011.2180901 Most previous work such as [1] and [2] proposed to improve
the GNSS performances by adding other sensors (odometer, B. The Motivating Problem
inertial measurement unit, etc.). However it seems that these
multisensors-based methods show some drawbacks and in In this paper, the state evolution equation governing the po-
particular the high system cost and complexity. For that reason, sition of a land vehicle is a dynamic nonlinear model and the
we propose here solutions allowing to improve positioning measurement equation (observation model) is nonlinear as well
accuracy where the sensor available is the GPS-based sensor. with additive observation errors. Within this setting, a joint esti-
However, related to this question of positioning accuracy in mation problem of hidden states and noise probability densities
hard conditions (for instance in urban canyons), there have been will be considered. Often, the observations which are the only
very few approaches proposed for improving the precision of GPS measurements arrive sequentially in time and one is inter-
the GPS-based framework in the presence of multipath effects. ested in performing the joint estimation on-line.
In recent work, some solutions have been proposed for this Note that in a dynamic nonlinear model with additive ob-
question. Many involve model adaptation-based framework. In servation errors, it is usual to assume that errors are normally
[2][4], the proposed approaches allow to adapt the error model distributed or can be approximated by a finite Gaussian mix-
in the filtering process to the reception condition of each satel- ture. This can cause problems when there are for example out-
lite signal. This modeling permits to handle the overall recep- lying errors leading to many modes in the density distribution
tion process with switches between some preselected measure- form whose variability over time induces poor inference about
ment models. The switching process permits to define three re- the model parameters. As we are considering in this paper the
ception states (LOS, non-LOS, and no reception). The principal case where multipath propagation affects severely GPS signals
idea of such modeling scheme is to define an indicator variable qualities, it is our intention therefore to model the observation
which governs the behavior of the statistical model and allows errors using a highly flexible family of density functions. It is
for switching from one model to another. An interesting way important to notice that mixture modeling is considered as a
of defining the statistical structure of this indicator variable has successful and widely used density estimation method, capable
been developed in [2], [5]. of representing the phenomena that underlie many real-world
Switching state-space models, also called jump state-space datasets. However, the main difficulty in mixture analysis is
models, have been widely studied in the literature. In [6], [7] how to choose the number of mixture components. Model se-
a jump Markov linear system (JMLS) has been proposed for lection methods in general treat the number of components as
the case where the considered state-space model is conditionally an unknown constant and set its value based on the observed
linear and Gaussian. The estimation problem was handled using data. Such approaches lack flexibility, since in practice we often
finite mixture modeling [interacting multiple model (IMM) [8], need to model the possibility that new observations come from
[9] and generalized pseudo-Bayes (GPB) [6] algorithms]. For as yet unseen components. This is a challenging research issue
nonlinear and non-Gaussian cases, efficient sequential Monte and all the related questions motivate the general framework we
Carlo algorithms have been applied [10][13]. are going to develop in this paper. To the best of our knowledge,
In some recent studies, as in [3], the proposed particle filter on-line estimation on a nonlinear dynamic system is a topic that
algorithm with observation switching models shows acceptable has not been sufficiently addressed in navigation literature.
results in terms of accuracy and stability in the context of GPS In this paper, we propose to use a flexible density modeling
positioning. The filtering algorithm takes into account two ob- based on a Bayesian nonparametric approach involving an
servation noise models depending on the reception state of each infinite mixture model. The Dirichlet process mixture (DPM)
satellite. In the direct path, a Gaussian probability distribution model, studied in Bayesian nonparametrics [15][17], is the
is used. Whereas, in the non-LOS case, a Gaussian mixture most used approach for nonparametric density estimation.
(GM) model is used. In this setting, the proposed approach It is based upon the construction of probability measures on
for error estimation uses the expectation-maximization (EM) the space of distribution functions. A rich literature exists on
algorithm [14] which shows some limits especially related to its theoretical properties, and it has been used in a variety of
slow convergence of the EM algorithm and the nonability of problems [18][25]. Despite this activity, only a few work
the density estimation model to capture the right shape of the has been conducted on how to tune the hyperparameters of
very changing distribution of observation errors. In fact, the nonparametric mixture models. Most of the time, in real-world
non-LOS case is modeled by a finite Gaussian mixture and the application contexts, the observed data can help in defining
number of Gaussian components was fixed. Consequently, the a prior for some of the hyperparameters in the considered
mixture model were not able to represent the true measurement Bayesian model. Computing every unknown hyperparameter
error along the mobile trajectory. with pure sampling is not the most effective approach in all
In this paper, in order to limit costs, we have chosen to work cases since this can be practically difficult and time consuming.
only with GNSS signals, no additive sensors will be used. For Our objective is to estimate jointly the hidden states
a better accuracy, we have proposed a new statistical filtering and the error density parameters that
method based on a better definition (and use) of the observation pertain to the observation vectors . The error density
noise for each satellite signal. In order to ensure high accuracy parameters are the latent variables to be estimated using a non-
positioning a good estimation of the observation error in such parametric Bayesian approach. For on-line Bayesian filtering,
cases was required. the particle filter permits to compute the posterior distribution
recursively over time. In this case, we have to
take into consideration the fact that computing everything
1640 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 4, APRIL 2012
with pure sampling can be computationally expensive. One context and the interesting validation schemes. The efficiency
solution consists in considering the augmented hidden state of the proposed approaches is demonstrated by applying a vali-
vector . By the partition of the state-space into dation step involving both simulated and real GNSS signals.
two subspaces drawn by and (for all ), improved The remainder of this paper is organized as follows. Section II
formulations of the particle filter can be applied. Rao-Black- presents the observation noise modeling. Section III is dedicated
wellization [26], [27] is one way of improving the efficiency to the Bayesian modeling of the problem. Section IV is about
of the particle filter. In the case where it is possible to evaluate the nonparametric density estimation problem. We show how to
some of the filtering equations analytically and the others with tune the DPM hyperparameters in a flexible way for a better fit-
Monte Carlo sampling, the use of Rao-Blackwellization leads ting of the data distribution shape. Section V discusses the use
to estimators with less variance than what could be obtained of an efficient particle filter to perform optimal estimation. In
with pure Monte Carlo sampling [28]. The idea is to partition Section VI the efficiency of the proposed approaches is demon-
the state vector so that one component of the partition is a strated by conducting validation experiments involving simu-
conditionally nonlinear Gaussian state-space model; for this lated and real GNSS signals. Section VII concludes this paper
component one can work out the solution analytically and with a summary and discussion.
use for instance the extended Kalman filter (EKF) to compute
. The particle filter is then used only for the
nonlinear non-Gaussian portion of the state-space to compute II. PSEUDORANGE ERROR MODELING
. In this way, we can say that the hard part of the
problem will be reduced into the computation of the posterior In multisensors-based systems, each sensor transmits a signal
distribution . (an information) to the receiver (or antenna). In this paper, the
GNSS constellation [29] is considered as a sensor network,
C. Contributions and Paper Organization and consequently each GNSS satellite is considered as a sensor
This paper considers the problem of density estimation from [30]. These sensors work according to three different operation
a Bayesian nonparametric viewpoint, using a hierarchical mix- modes which are:
ture model. A typical choice is to assume the unknown density normal mode (LOS case), when the information delivered
as a mixture of a parametric family with a discrete random prob- by the sensor can be considered as accurate;
ability as a mixing distribution. This random probability is in- degraded mode (non-LOS case), when the information de-
duced by the Dirichlet process (DP) and the hierarchical model livered by the sensor is not accurate due to noisy data;
is the DPM model. By using DPM, it is important to mention failure mode (no reception case), when the sensor provides
that no matter what additive error distributions are involved we no information.
are confident that our family of densities will be able to capture The navigation signal transmitted by each satellite includes a
the right shape and hence statistical inference for the parame- precise time at which the signal was transmitted. The distance
ters of interest will be improved and reliable. A drawback of or range from a receiver to each satellite may be determined
using a model based on the DP is that it is infinite dimensional using this time of transmission which is included in each navi-
and therefore inference will be complicated. However, recent gation signal. By noting the time at which the incoming signal
innovations in sampling algorithms within infinite dimensional was received at the antenna, a propagation time delay can be
frameworks have made considerable progress in recent years to calculated. This time delay when multiplied by the speed of
such an extent that it is now possible to perform exact inference propagation of the signal yields a pseudorange from the trans-
without the need to set up arbitrary approximations. mitting satellite to the receiver. The range is called a pseudor-
The flexibility of DPM models grows when we assume one ange because the receiver clock may not be precisely synchro-
more step in the hierarchy, i.e., when the DPM hyperparame- nized to GPS time and because propagation through the atmos-
ters are random. Our aim using this family of densities is first to phere introduces delays into the navigation signal propagation
be able to capture the right shape of the noise probability den- times. These result, respectively, in a clock bias and an atmo-
sity functions (pdfs) and then to improve the statistical inference spheric bias. Clock biases may be as large as several millisec-
for the parameters of interest. Therefore, we will show how to onds. Using information from at least four satellites, the posi-
tune the DPM hyperparameters in a flexible way. Some hyper- tion of a receiver with respect to the center of the Earth can be
parameters will be estimated as part of the Gibbs sampler, and determined using triangulation techniques [29], [31]. The pseu-
some others will be chosen carefully in a data-adaptive way for dorange measure deduced from the signal propagation time is
a better fitting of the data distribution shape. expressed as follows:
To sum up, this paper proposes several contributions. The first
is concerned with the modeling of the observation noises using (1)
DPMs. Then, we focus on the suitability of this family of den-
sities in the estimation problem handled by an improved par- where
ticle filter called Rao-Blackwellized particle filter (RBPF). The is the pseudorange between the satellite and the re-
use of Rao-Blackwellization permits to improve the efficiency ceiver at time ;
of the filtering process since we do not need to compute every- is the true satellite-receiver distance;
thing with pure sampling which reduces significantly the com- is the three position coordinates of the
putational cost. An another contribution is about the application vehicle in the reference system of coordinates East, North,
RABAOUI et al.: DIRICHLET PROCESS MIXTURES FOR DENSITY ESTIMATION 1641
Up (ENU)1 and is a vector containing estimating the hidden variables (states) when the related mea-
the position coordinates of the satellite ; surements (observations) involve outliers is a challenging issue.
is the receiver clock offset with respect to the GPS In this paper, we focus on a particular joint state/parameter es-
reference time2 and is the celerity (the timation problem wherein data should be processed on-line.
speed of light); In this section we are interested in designing a dynamic model
is the satellite clock offset; for the parameters to be estimated. Our ultimate objective is to
and are, respectively, the ionospheric and the tropo- estimate accurately the state of a moving vehicle. The vehicle
spheric errors; state is defined by the state vector containing vehicle positions
is the receiver noise considered as a white Gaussian and velocities . Since we are
random variable with variance in general considered to interested in estimating jointly the hidden state
be equal to unity; and the pseudorange noise density, we will consider an aug-
is the error due to the signal reflections in case of mul- mented hidden state vector containing both hidden state param-
tipath propagations. eters and observation noise parameters. In this setting, the obser-
The atmospheric propagation errors and the satellites clock vation error is assumed to be distributed according to an un-
bias can be modeled by incorporating some correction blocks known probability density function whose parameter vector
in the receiver [29]. Consequently, after corrections, (1) can be denoted has to be estimated and stands for the vector
rewritten as containing all the parameters to be estimated. Note that our main
concern in the handled problem is the estimation of nonlinear
(2) models with unknown statistic noises.
It follows that the pseudorange error can be defined in function In the following, a state-space approach providing a general
of two different error sources as framework for describing the dynamic state estimation problem
at hand is provided. Here, we will use the state-space formu-
lation to introduce the filtering algorithms. Statistical Bayesian
filtering aims to compute the posterior probability density func-
where is the sum of the error induced by reflections between tion of a state vector from sequentially obtained sensor
the satellite and the receiver and the receiver noise. According measurements (pseudoranges) . In our context, it relies
to the reception conditions, can switch between different ob- on the following model:
servation models as following:
(4)
(3) (5)
techniques that provide a convenient approach to computing the multimodal due to hard reception conditions especially in urban
posterior distributions in our nonlinear and non-Gaussian case canyons. In this paper, we address the problem of optimal state
and lead most of the time to accurate filtering results. estimation when the pdfs of the noise sequences are unknown
In this paper, the inferential problem is to estimate sequen- and need to be estimated on-line from the data.
tially the augmented hidden states driving the observations Although the desired probability density is unknown, we
along with the starting value. At the same time, our focus is to assume that it has a known functional form.4 Lets see how the
model the additive observation errors using a highly flexible Bayesian approach can be used to obtain the desired density .
family of density functions based on an infinite mixture model. The basic assumptions are summarized as follows:
In the following, in order to introduce the statistical Bayesian 1) The form of the density is assumed to be known, but
filtering for estimating the hidden states, we first review some the value of the parameter vector is not known exactly.
theoretical aspects about nonlinear dynamic state-space mod- 2) Our initial knowledge about is assumed to be contained
eling. Then, we will provide some typical ways dealing with in a prior density .
density estimation of errors via mixture-based models. 3) the rest of our knowledge about is contained in a set of
samples drawn independently according to the
A. Nonlinear Dynamic State-Space Model and Statistical unknown probability density .
Bayesian Filtering The basic problem is to compute the posterior density ,
The unobserved signals are modeled as a because this is a necessary step in computing . There
Markov Process of initial distribution and transition have been several efficient sampling algorithms in [32][34]
equation provided by the motion model (4). The which enable to sample from . However, these algo-
observations , are assumed to be conditionally rithms cannot be applied to our case since the noise sequence
independent given the process and the marginal is not observed directly.
distribution is deduced from (5). To sum up, the model Generally, in density estimation problems, it is the value of
is described by the mixture likelihood at individual points that is of interest,
rather than the membership of the components, which is im-
portant in clustering or discriminant analysis. In this paper, a
Bayesian nonparametric framework based on a mixture model
for density estimation will be introduced. Later, we will study
the inference procedure when the nonparametric component is
We denote by and , applied to the additive observation errors . More
respectively, the states and the observations up to time . Our precisely, we will show that by using a Dirichlet process prior
aim is to estimate recursively in time the posterior distribution [15], [16], [35], we can derive a Bayesian nonparametric mix-
and its associated features (including the marginal ture model. Before we discuss the Dirichlet process prior and the
distribution , known as the filtering distribution). The nonparametric Bayes analysis, let us first consider some typical
posterior distribution can be calculated in two steps theoreti- ways to deal with a mixture model in general.
cally: prediction and update. In the prediction step, we integrate
the state distribution from the previous state using the system C. Bayesian Analysis of Mixtures for Density Estimation
model in (4)(6).
1) General Framework: Many papers, including [15], have
The update step modifies the prediction distribution making
shown that mixtures of normal distributions provide a simple
use of the latest observation. Given the initial distribution
and effective basis for nonparametric Bayesian density estima-
, transition distribution , and the likelihood
tion. Such an approach is very flexible and Markov chain Monte
distribution , the objective of the filtering is to esti-
Carlo methods make it easy to sample from the posterior. To
mate the state , given the observations up to time .
better understand Gaussian mixtures, we will first consider these
From Bayesian perspective, when an observation becomes
models for a fixed value of components, and later explore the
available at time , we can obtain the posterior distribution
properties in the limit where . The finite Gaussian mix-
via the Bayes rule. The expression of this distri-
ture model with components may be written as
bution requires the evaluation of complex high-dimensional
integrals. Since we are in a nonlinear case, the multidimen-
sional integrals are intractable and some approximate methods (7)
must be used. More details about the used filtering approach
will be given in Section V.
where , is the number of com-
B. Bayesian Computations Objectives ponents, the s are the mixture weights which must be non-
negative and sum to one, the s denote the vector of all un-
We will use the model formulated in (4)(6) to estimate the
known parameters associated with th component, and is the
hidden state given the observations . It is a very common
choice to assume that the noise pdfs are Gaussian, with known 4Note that the pdf of the unknown density is defined using a nonparametric ex-
parameters, as this enables the use of Kalman filtering. So far pression with an infinite-dimensional parameter space. In fact, when we cannot
assume that the density is characterised by a set of parameters, we must resort to
we have explained that the Gaussian assumption is inappro- nonparametric methods of density estimation; that is, there is no formal struc-
priate since the observation noise distribution are likely to be ture for the density prescribed.
RABAOUI et al.: DIRICHLET PROCESS MIXTURES FOR DENSITY ESTIMATION 1643
scaling parameter . Lets suppose now that we draw a random such that . Here, the
measure from a DP, and independently draw random vari- s give the model hyperparameters, while s indicates
ables from : their corresponding labels, such that . From the Polya
urn probabilities in (12), we can claim that the draws for the
(9) indicator variables are obtained according to the following
sampling scheme:
where . Marginalizing out the random mea-
sure , the joint distribution of follows a Polya
urn scheme [40]. In another representation of the DP called the (13)
stick-breaking scheme, is introduced explicitly as an infinite
sum of atomic measures [41]. The realizations of a DP are ex-
where , the number of s
pressed as follows:
which equal . For computational reasons, we assume for
a conjugate normal inverse Wishart base distribution as it will
(10) be detailed later in Section IV-F.
D. Estimation of DP Hyperparameters
with , and
where denotes the Beta distribution and denotes the Dirac In a hierarchical model, a hyperprior can be placed on the pa-
delta measure located in . The underlying random measure rameters of the DP by simply putting prior distributions on the
is then discrete with probability one. Using (8), it comes that scaling (precision) parameter and the parameters of the base
the following flexible prior model is adopted for the unknown distribution . The data is then used to calculate the posterior
distribution distribution. By this, we can say that another layer is added in
the hierarchical model. Some authors have chosen a fixed ,
(11) such as a normal distribution with large variance relative to the
data as in [19] or a , where and are stipulated without
further discussion as in [43]. Other authors have used a para-
More details about other important properties of DPs can be metric family of priors for , such as uniform-inverse Wishart
found for instance in [16], [40][42]. [44] or normal-inverse Gamma [44], [45], with fixed hyperprior
C. Dirichlet Process Mixture Model distributions. The hyperprior distributions are usually chosen by
sensitivity analysis, or else they are chosen to be diffuse with re-
The core of the DPM model can basically be thought of as spect to the observed data.
a simple Bayes model given by the likelihood In this paper, priors on the hyperparameters will be speci-
and prior , with added uncertainty about the prior fied using a data-adaptive way without applying a pure sampling
distribution approach. Most of the time, in real-world application contexts,
the observed data can help in defining a prior for some of the
hyperparameters in the considered model. Clearly, computing
every unknown hyperparameter with pure sampling is not the
most effective approach in all cases since this can be practically
Through this hierarchy, a marginalization step is often used in difficult and time consuming.
actual implementations through the so-called Polya urn repre- Concerning the scaling parameter which is an important
sentation [40]. By integrating over through the Polya urn rep- parameter to be tuned (or chosen) carefully, one may want to
resentation, the joint distribution of may be put a prior on and then use the data to calculate a posterior
factored into a product of successive conditional distributions distribution. To understand what values of to use, the rela-
of the following form: tionship between and the expected number of clusters in the
data might be considered. In [46], the author shows how, with
respect to a flexible class of prior distributions for parameter ,
the posterior may be represented in a simple conditional form
(12) that is easily simulated. As a result, for a better fitting of DPM
where is the hyperparameter vector. This fac- models, inference about this parameter may be developed with
torization implies that has discrete, though infinite support. classical Gibbs sampling algorithms [15].
This implies that a random draw from either equals one of the
previous draws or is drawn independently from the base proba- E. Prior and Posterior Distributions for
bility measure . We have mentioned in this paper that a key feature of the
Note that may be represented in an equiv- Dirichlet is its discreteness, which in our context implies that
alent way by its latent variables and cluster locations . the pairs , concentrate on a set of some
Therefore, we introduce in the following the notion of a distinct pairs. Let be the number of distinct values of
class label assigned to each hyperparameter . We set . To sample the precision parameter , we first deter-
and we denote by the set of mine the prior distribution for , the number of normal compo-
values taken by the labels. The location variables are denoted nents in the mixture (clusters). At each stage of the simulation
RABAOUI et al.: DIRICHLET PROCESS MIXTURES FOR DENSITY ESTIMATION 1645
(14)
where is the mean of the Normal distribution, is a scale
with weights defined by parameter, and are, respectively, the inverse scale matrix
and the degree of freedom for the inverse Wishart distribution
(see [16] for more details about these distributions). In the
(15)
sequel, we denote by the base distribution
hyperparameter vector. The same base distribution should
where , , and . More not be applied in all navigation situations (LOS, non-LOS, and
details about this sampling can be found in [46]. It is important no reception). Here, we propose to adapt the priors on the
to mention that should be sampled at each Gibbs iteration parameters at each time . More precisely, we propose to use an
stage in the simulation process that allow to generate the additional parameter defined at each time for each satellite
according to (12). The current sampled values of and permit . This parameter adds one more step in the DPM hierarchical
to compute a new value of . In [42] the expected number of model as shown in Fig. 2.
components sampled from a Dirichlet process is given by The parameter is an indicator of the propagation state for
each satellite at each time . It will be used to update the param-
eters of . In practice, when a LOS reception state is detected
the hyperparameters of the base distribution will be sam-
pled from the normal-scaled inverse Wishart distribution whose
parameters , , , and should be chosen carefully. How-
ever, when a non-LOS reception state is detected, it will be in-
teresting to adapt the distribution of the hyperparameters as fol-
(16) lows:
F. The Parameters
with , , , and .
DP mixtures have received a considerable interest in the The additive term is the estimation of the pseudorange
Bayesian nonparametric literature. However, little has been error computed at each time as
written on choosing a centering distribution for these
models. Building on a previous work [47], the latent sources of (19)
heterogeneity to be specified through the parameters
are considered using a data-adaptive way. is a normal dis- where is the corrected pseudorange measurement and
tribution with unknown mean and covariance . The couple is the predicted position and
are chosen to be distributed according to a normal-scaled is the predicted receiver clock bias computed at
inverse Wishart distribution, denoted as time .
and the hyperparameters of the base distribution . In weights and removing those with small weights. The variance
practice, only the state variable is of interest. We have seen introduced by the resampling procedure can be reduced by
in Section IV-C that through the Polya urn representation, we proper choice of the resampling method [26].
integrate out analytically from the posterior. The parameters
and remain and the inference is based upon B. Improving the Particle Filtering by Rao-Blackwellization
where is the augmented hidden state vector. A method to increase the efficiency of sampling techniques
Note that the on-line estimation of is necessary for a good is to reduce the size of the state space by marginalizing out if
estimation of which implies a better fitting of the DPM possible some of the variables analytically; this is called Rao-
model and hence the estimation accuracy of the state vector Blackwellization [28]. In the following, this approach will be
could be enhanced. introduced in order to improve filtering. As shown previously,
In Section V-A, a filtering approach for estimating sequen- we can compute the posterior distribution on-line by applying
tially the augmented hidden state using a particle filter (PF) Bayes rule sequentially. However, in general, the required in-
is provided. In this setting, a first solution based on computing tegrals cannot be computed in closed form. Particle filtering
everything with pure sampling is proposed. Then, to improve therefore approximates the posterior using sequential impor-
the efficiency of this filtering approach, some of the variables tance sampling. Now, if we partition the state-space into two
will be marginalized out and an efficient Rao-Blackwellized PF subspaces, drawn by and (for all ). Then, by the
is provided in Section V-B. chain rule of probability, we can write
A. Particle Filtering
Unscented Kalman filters (UKF) and EKF which are widely (22)
used as suboptimal solution in case of white Gaussian systems
where . We will need to compute the posterior
are not suitable for treating the models which exhibit noncen-
distribution using the conditional distributions
tered and non-Gaussian distributions. Except in a few simple
and . Recall that details for
cases, there is no closed-form solution to this problem. It is
the computation of these conditional distributions were given in
therefore necessary to adopt numerical techniques in order to
Sections IV-D, -E, and -F. Based on these details, note that we
compute reasonable approximations. SMC methods (particle fil-
can sample from the posterior distribution
ters) are powerful tools that allow us to accomplish this goal.
by simulating a Markov chain that has the posterior as its
This filtering method is able to cope with noises of any distribu-
equilibrium distribution. The simplest such methods are based
tion and provides a convenient and attractive approach to com-
on Gibbs sampling. Then, the hyperparameters distribution
puting the posterior distributions.
can be included in the Markov chain simulation.
The PF approximates the posterior probability density func-
On the other hand, if we can update
tion via the discrete weighted sum
analytically and efficiently, then we only need to sample
using the PF. Since we are now sampling in a
(20) smaller space, in general we will need far fewer particles to
reach the same accuracy as standard PF. This is the key idea
The individual weights , are computed by applying the prin- behind RBPF.
ciple of importance sampling. Since it is difficult or impos- The following is a generalization of the RBPF to DPM-based
sible to directly sample , particles are simulated framework as presented in [26] and applied in [5], [48]. At time
according to a proposal distribution , we use the following empirical distribution to approximate
through a set of particles
Then, to correct for the discrepancy between the proposal and (23)
the target distribution, importance weights will be assigned to
the generated particles with
(24)
where By using an EKF or an UKF, for each particle we can com-
pute recursively the terms and . For more
details about how to build the sampling algorithm refer to [2],
[5], and [48]. In RBPF, each particle maintains not just a sample
(21) from but also a parametric representation of
According to (21), the set of particles is updated and reweighted the distribution (the parametric represen-
using a recursive version of importance sampling. The most tation in our case are a mean vector and a covariance matrix).
likely particles yield high importance weights. An additional For our tracking application, the samples are updated as in
resampling procedure is used for replacing particles with large standard PF, and then the distributions are updated using an
RABAOUI et al.: DIRICHLET PROCESS MIXTURES FOR DENSITY ESTIMATION 1647
EKF, conditional on . The overall algorithm is shown in Al- nonparametric modeling. We know that in real applications,
gorithm 1. Therefore, the optimal importance distribution can the sources of this heterogeneity cannot be completely known.
be written as . We will prove through some experiments that the Dirichlet
Without deriving approximations of this importance distribu- process prior may provide a satisfactory model in case where
tion, no efficient importance distributions can be used directly the sources of heterogeneity in the observation errors density
since the associated importance weights will be computationally are unobserved. The DP prior is robust to errors in model spec-
intractable. In this paper, for the sake of simplicity, the impor- ification and allows heterogeneity in the patterns distributions
tance distribution is chosen to be equal to the Polya urn distri- to be specified in a data-adaptive way.
bution as in (12). This approximation yields For example, the predictive distribution of densities based on
to the simplification of the weight expression (21). Finally, the our Dirichlet process prior can be multimodal (non-LOS case)
minimum mean squared error (MMSE) estimate and posterior owing to its natural potential for identifying clusters. In contrast,
covariance matrix of are given by standard parametric prior distributions of densities, such as the
normal distribution, cannot capture multimodality because of
(25) their shape restrictions. But, if the actual distribution of density
can be approximated by a unimodal distribution (LOS case), the
Dirichlet process prior also adapts to this situation. In fact, the
(26) parametric model based on finite Gaussian mixtures is a limiting
case of our modeling scheme when applying Dirichlet process
Note that in this subsection and in Algorithm 1 the estimation prior with . Therefore, we can say that the Dirichlet
of the hyperparameter is not explicitly shown, more details process prior is a natural extension of fully parametric models.
about this will be discussed later in Section IV-A. In our application of on-line pseudorange error density es-
timation, the latent variables are the mean and the variance
Algorithm 1: Computing the posterior distribution of each Gaussian distribution included in the infinite mixture,
using a Rao-Blackwellized Particle Filter i.e., , for . To compute a vehicle position
at each step of the filtering process, particles are
At time , computed. The problem with GNSS applications is that signal
propagation can be considered in different reception modes. In
Step 0: initialization
LOS reception, the pseudorange error noise follows a zero-mean
For , sample and set
Gaussian distribution with a standard deviation inferior or
equal to 1 (actually it depends on the used receiver). In non-LOS
reception, the pseudorange error can be higher than 100 m. In
At time , the proposed DPM model, the will be estimated by sampling
as detailed in Section IV-D. The same base distribution will
Step 1: Importance Sampling step
not be applied in all navigation situations. Here, we propose to
For
adapt and each time using the proposed approaches in
Sample
Section IV-D.
Sample
Evaluate the importance weights
TABLE I
COMPARISON OF POSITIONING ERRORS USING DPM AND GM (TRAJECTORY 1)
Fig. 7. Three zooms into Fig. 6: zoom (a) (top right), zoom (b) (top left), and zoom (c) (bottom).
different curvatures. The top left graph in Fig. 7 depicts the po-
sitions values when the vehicle dynamic changes suddenly. In
this segment of the route, there is a first turning on the left on
a long radius bend and a second turning on the right. In these
situations, the simulated positions using the proposed approach
are the nearest ones to the actual positions throughout the en-
tire segment. This demonstrates that the proposed positioning
approach still robust when abrupt changes occur in the dynamic
of the vehicle. The same remarks are also applicable for the top
right graph of Fig. 7. The bottom graph in Fig. 7 shows the be-
havior of the tested tracking algorithms at the beginning of the
trajectory. At the beginning, the vehicle run about 200 m near
obstacles (high building and trees) and being not completely in
the middle of the road. The proposed approach behaves well
Fig. 8. Evolution of the positioning error for different tracking approaches:
compared to the other approaches since it does not need a long EKF, PF (GM is used for the density estimation of observation errors) and the
initialization period of time. proposed approach based on RBPF (DPM is used for density estimation of the
To model the observation errors density, both GM and DPM- errors).
based models have been used in a sequential Monte Carlo fil-
tering approach. Table I shows the localization performances for
the two models in terms of mean positioning error. The average In fact, a random draw from either equals one
error between the actual and estimated trajectories drops below of the previous draws or is drawn independently from the base
3 m. We have considered three scenarios, each scenario corre- probability measure . The parameter obviously plays an
sponds to a different satellite configuration at a specific time of important role in the distribution of . In (18), note that the prob-
the day. ability that the new draw of the parameter differs from all pre-
In most applications of DPM models, the precision parameter viously drawn parameter values is proportional to ; therefore,
is not known and must be well chosen. In this paper, some higher values of lead to a higher probability of many unique
empirical tests were conducted to understand the influence of values (i.e., a higher number of classes relative to the sample
values on the estimation performance. In [16] for example, size ). Conversely, lower values of lead to a greater chance
it was shown that in problems where is unknown, estimation of clustered distributions with fewer unique values in .
results can be extremely sensitive to the choice of prior assumed In preliminary experiments, we considered a range of values
for . in order to explore the extend to which our results are sensitive
RABAOUI et al.: DIRICHLET PROCESS MIXTURES FOR DENSITY ESTIMATION 1651
Fig. 9. Estimation of the pseudorange errors using DPM models with fixed and
estimated values of .
Fig. 12. Locations of the clusters in the DPM model for different satellites (satellites 18, 1, and 13) and their corresponding reception states. (a) Temporal location
of the clusters (s). (b) Time diagram illustrating the reception states (LOS, NLOS, Blocked) of satellites during simulation.
TABLE II
COMPARISON OF POSITIONING ERRORS USING DPM, GM, AND SAFEDRIVE (2D ERRORS)
Fig. 13. The reference trajectory in the city of Toulouse: positions are drawn
in the local East North Up (ENU) Cartesian coordinate system. The unit of the
vehicle coordinates is the meter (m). The speed of the mobile is about 30 km/h Fig. 14. The reference trajectory of a vehicle in the city of Toulouse: Positions
and the traveled distance is 4333 meters. are drawn in the local East North Up (ENU) Cartesian coordinate system. The
unit of the vehicle coordinates is the meter (m).
Fig. 15. Three zooms into Fig. 14: zoom (a) (top right), zoom (b) (top left), and zoom (c) (bottom).
Fig. 16. Evolution of the positioning error (in trajectory 2) for different
tracking approaches: EKF, PF (GM is used for the density estimation of
observation errors) and the proposed approach based on RBPF (DPM is used
for density estimation of the errors).
[38] Y. W. Teh and M. I. Jordan, Hierarchical Bayesian nonparametric Emmanuel Duflos was in the Institut Suprieur
models with applications, in Bayesian Nonparametrics: Principles dElectronique du Nord, Lille, France, from 1987 to
and Practice. Cambridge, U.K.: Cambridge Univ. Press, 2010. 1991, as an undergraduate. He then spent one year at
[39] T. L. Griffiths and Z. Ghahramani, Infinite latent feature models and the University of Paris XI, Orsay, France, to prepare
the Indian buffet process, Adv. Neural Inf. Process. Syst. (NIPS), 2006. a Diplme dEtude Approdondie (DEA) in automatic
[40] D. Blackwell and J. MacQueen, Ferguson distributions via Polya urn control and signal processing. From 1992 to 1995,
schemes, Ann. Statist., vol. 1, no. 2, pp. 353355, 1973.
[41] J. Stthuraman, A constructive definition of Dirichlet priors, Statistica he was with the University of Toulon, where he
Sinica, vol. 4, pp. 639650, 1994. received the Ph.D. degree in signal processing, in
[42] C. E. Antoniak, Mixtures of Dirichlet processes with applications September 1995. In December 2002, he received the
to Bayesian nonparametric problems, Ann. Statist., vol. 2, pp. Habilitation a Diriger des Recherches (HDR), from
11521174, 1974. the Universit des Sciences et Technologies de Lille.
[43] J. S. Liu, Nonparametric hierarchical Bayes via sequential imputa- From September 1995 to August 2003, he was a teacher at the Institut Su-
tions, Ann. Statist., vol. 24, pp. 911930, 1996. prieur dElectronique du Nord where, from September 1999 to August 2003,
[44] S. N. MacEachern and P. Muller, Estimating mixture of Dirichlet he was the Head of the Signal, System and Communication Department. In
process models, J. Computat. Graph. Statist., vol. 7, pp. 223238, September 1999, he joined the Laboratoire dAutomatique et dInformatique
1998. Industrielle de Lille (Laboratory LAIL) as a researcher. The LAIL became the
[45] C. E. Rasmussenan and Z. Ghahramani, The infinite Gaussian mixture Laboratoire dAutomatique, Gnie Informatique et Signal (LAGIS) in January
model, Adv. Neural Inf. Process. Syst. (NIPS), pp. 294300, 2000.
[46] M. West, Hyperparameter estimation in Dirichlet process mixture 2006. In September 2003, he was a Full Professor with the Ecole Centrale de
models Duke Univ., NC, 1992, Tech. Rep.. Lille, where he is the Deputy Director and the Director of Research. He is also
[47] A. Rabaoui, N. Viandier, J. Marais, and E. Duflos, On selecting the the head of the Signal and Image team of LAGIS. In April 2006, he participated
hyperparameters of the DPM models for the density estimation of ob- in the creation of the INRIA project team SequeL devoted sequential learning.
servations errors, in Proc. ICASSP, 2011, pp. 40924095. SequeL was hosted by the INRIA Lille Nord Europe. His main research con-
[48] F. Caron, M. Davy, A. Doucet, E. Duflos, and P. Vanheeghe, Bayesian cerns are Bayesian nonparametric methods for signal processing, multisensor
inference for dynamic models with Dirichlet process mixtures, in probability hypothesis density (PHD) filter for sensor management, particle fil-
Proc. 9th Int. Conf. Inf. Fusion, 2006, pp. 18. tering. and MCMC methods.
[49] A. Belloni and V. Chernozhukov, On the computational complexity Dr. Duflos was a member of many international committees in the field of
of MCMC-based estimators in large samples, Ann. Statist., vol. 37, signal processing and data fusion. He is a member of the Board of Directors at
no. 4, pp. 20112055, 2009. the Ecole Centrale de Lille.
[50] N. Viandier, J. Marais, A. Prestail, and E. De Verdalle, Positioning
urban buses: GNSS performances, in Proc. ITST, 2008, pp. 5155.
[51] F. Caron, M. Davy, and A. Doucet, Generalized Polya urn for time-
varying Dirichlet process mixtures, in Proc. 23rd Conf. Uncertainty
in Artif. Intell. (UAI), 2007. Juliette Marais received the Engineering degree
from the Institut Suprieur dElectronique du Nord
Asma Rabaoui received the engineering degree in (ISEN), Lille, France, in 1998, the Masters degree
2004 and the Ph.D. degree in electrical engineering from the University of Lille in 1999, and the Ph.D.
with emphasis in signal processing in 2008 from degree from the French National Institute for Trans-
The National Engineering School of Tunis [in col- port and Safety Research (INRETS, now IFSTTAR)
laboration with SyNeR team in LAGIS (Laboratoire in 2002.
dAutomatique, Gnie Informatique et Signal in Since 2002, she has been with the French Institute
Lille, France)]. of Science and Technology for transport, devel-
Her research has explored a variety of topics opment, and networks (IFSTTAR) where she is a
mainly in statistical audio signal processing, more researcher. Her research interests focus on satel-
specifically analysis and characterization of im- lite-based navigation systems including performance evaluation, in particular,
pulsive sounds, wavelets, signal classification for transport applications, accuracy enhancement, signal propagation modeling,
surveillance applications using hidden Markov models (HMMs) and support and error estimation.
vector machines (SVMs). She is currently a postdoctoral researcher with the
Laboratoire de lIntgration du Matriau au Systme (IMS) working with the
Signal team, since October 2010. Her current research project (supported by
the TOTAL group) involves applying Markov chain Monte Carlo (MCMC) Philippe Vanheeghe (M92SM97) was born in
methods and sequential Monte Carlo for data estimation in multisensors-based France on July 20, 1956. He received the M.S.
systems. She is also interested in Bayesian approaches to model selection. degree in data processing, the Diploma of Advanced
Prior to that, she was working with INRETS (in collaboration with LAGIS) Study in data processing, the Ph.D. degree, and the
on improving GNSS localization precision in urban canyons using sequential Habilitation Diriger des Recherches (HDR) from the
estimation of signals and nonparametrical Bayesian analysis to deal with University of Lille, France, in 1981, 1982, 1984, and
nonlinear dynamic models. She is applying Dirichlet processes and Gaussian 1996, respectively.
processes to a range of problems including mobile tracking and audio-based He joined the Institut Suprieur dElectronique du
systems. Nord (a private engineering school) as an Assistant
Professor and became a Full Professor with the Ecole
Centrale de Lille in 1999. His research activities are
focused on statistical information processing in the multisensor systems. Partic-
Nicolas Viandier was born in Boulogne sur Mer, ularly in the development of new data analysis algorithms for inference and clus-
France, on November 1, 1982. He received the tering based on nonparametric Bayesian modeling implemented using particle
Research Masters degree in numerical engineering filters and Monte Carlo Markov chains methods, these algorithms can be used
in 2007 from the Universit du Littoral te dOpale, to solve a problem of joint state parameter estimation problem, in this case the
Calais, and the Ph.D. degree in automatic, signal utilization of a Bayesian nonparametric noise model based on Dirichlet process
processing and industrial computing from the Ecole mixture has been used. Research activities are developed within the SequeL
Centrale, Lille, France, in 2011. (acronym for Sequential Learning) of the INRIA Lille-Nord Europe Research
He is currently an engineer of the French MoD. His Center.
Ph.D. degree thesis deals with the modelling and use Dr. Vanheeghe is involved in the Technical Program Committees of con-
of GNSS pseudorange errors in the navigation solu- ferences (recently, FUSION 2011, SPIE 2012) and in review activities for
tion to improve the GNSS accuracy with application the IEEE journals (IEEE TRANSACTIONS ON SIGNAL PROCESSING and IEEE
to the transportation field. TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS).