Data Acquisition for Probabilistic Nearest-Neighbor Query

Abstract:
Managing uncertain data in spatial queries has attracted extensive
research interest, because device granularity and noise in the collection
and delivery of data both introduce uncertainty. Most previous work
models the uncertain data as given and derives the required results from
it directly. However, it becomes harder for users to obtain useful insights
as data uncertainty increases. In this case, users are usually willing to
invest more resources in reducing the data uncertainty, so that the
existing schemes return more informative results. In light of
this important need, this paper formulates a new problem of selecting a
given number of uncertain data objects for acquiring their attribute values
to improve the result of the Probabilistic k-Nearest-Neighbor (k-PNN)
query. We prove that better query results are guaranteed to be returned
with data acquisition, and we devise several algorithms to maximize the
expected improvement. We first explore the optimal single-object
acquisition for 1-PNN to examine the fundamental problem structure and
then propose an efficient algorithm that discovers crucial properties to
simplify the probability derivation in varied situations. We extend the
proposed algorithm to achieve the optimal multi-object acquisition for
1-PNN by deriving an upper bound to facilitate efficient pruning of
unnecessary sets of objects. Moreover, for data acquisition of k-PNN, we
extract the k-PNN answers with sufficiently large probabilities to trim the
search space and properly exploit the result of single-object acquisition for
estimating the gain from multi-object acquisition. The experimental results
demonstrate that the probability of k-PNN can be significantly improved
even with only a small number of objects for data acquisition.

Existing System:
With only the current models of uncertain data and the existing solution
approaches, it is difficult for a user to gain useful insights and make
correct decisions from the answer sets if their k-PNN probabilities are not
sufficiently large, especially when the data uncertainty increases
dramatically.
In this situation, the crux of the problem is the diversity of the uncertain
input data rather than the corresponding query processing techniques.
Facing this difficulty, users are usually willing to improve the accuracy of
a few data objects, as far as their available resources allow. In a sensor
network, for example, sensor values are uncertain because readings are
only sampled in order to reduce power consumption.

Proposed System:
We propose the notion of data acquisition for the k-PNN query to reduce
the uncertainty of a specified number of data objects according to the
available resources. As an initial attempt to tackle this realistic but
challenging issue, we focus on a fundamental model in which the exact
values of the selected data objects can be acquired.

An intuitive approach is to explore all possible selections and derive the
corresponding probabilities. However, this approach is computationally
intensive, especially when there are many data objects and each object
has non-negligible probability on many different attribute values.
In this paper, therefore, we formulate a new problem, named the
s-acquisition for k-PNN problem. Given the probability distributions of data
objects, the problem is to select a specified number, s, of objects for data
acquisition to maximize the improvement of the k-PNN result.
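
The intuitive exhaustive approach can be sketched for the 1-PNN case as follows. This is only an illustration under stated assumptions, not the algorithm proposed in the paper: objects are one-dimensional with discrete value distributions, ties in distance are split evenly, and the quality of a result is taken to be the largest 1-PNN probability of any object. The improvement measure and all function names (pnn_probabilities, quality, expected_quality_after, best_acquisition) are hypothetical choices made for this sketch.

```python
from itertools import combinations, product

# Assumed model: objects maps a name to {candidate value: probability}.


def pnn_probabilities(objects, query):
    """Probability of each object being the nearest neighbor of `query`."""
    names = list(objects)
    probs = dict.fromkeys(names, 0.0)
    # Enumerate every possible world: one concrete value per object.
    for world in product(*(objects[n].items() for n in names)):
        weight = 1.0
        for _, p in world:
            weight *= p
        dists = [abs(v - query) for v, _ in world]
        nearest = min(dists)
        winners = [n for n, d in zip(names, dists) if d == nearest]
        for n in winners:
            # Assumption: ties share the world's probability evenly.
            probs[n] += weight / len(winners)
    return probs


def quality(objects, query):
    """Assumed quality measure: the largest 1-PNN probability."""
    return max(pnn_probabilities(objects, query).values())


def expected_quality_after(objects, query, chosen):
    """Expected quality once the exact values of `chosen` are acquired."""
    total = 0.0
    # Each acquisition outcome fixes one value per chosen object.
    for outcome in product(*(objects[n].items() for n in chosen)):
        weight = 1.0
        resolved = dict(objects)
        for n, (v, p) in zip(chosen, outcome):
            weight *= p
            resolved[n] = {v: 1.0}  # acquired object is no longer uncertain
        total += weight * quality(resolved, query)
    return total


def best_acquisition(objects, query, s):
    """Try every s-subset and return the one with the largest expected gain."""
    base = quality(objects, query)
    best_set, best_gain = None, float("-inf")
    for chosen in combinations(objects, s):
        gain = expected_quality_after(objects, query, chosen) - base
        if gain > best_gain:
            best_set, best_gain = chosen, gain
    return best_set, best_gain


# Illustrative data only: three objects, query point at 0, budget s = 1.
objects = {
    "a": {1: 0.5, 4: 0.5},
    "b": {2: 0.5, 3: 0.5},
    "c": {5: 1.0},
}
chosen, gain = best_acquisition(objects, query=0, s=1)
print(chosen, gain)
```

In this toy instance only object a decides which object is nearest, so acquiring a is the only selection with a positive expected gain. The number of possible worlds grows with the product of the distribution sizes, and the number of candidate selections grows combinatorially in s, which is the cost that motivates looking for more efficient algorithms.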

Hardware Requirements:

System       : Pentium IV 2.4 GHz
Hard Disk    : 40 GB
Floppy Drive : 1.44 MB
Monitor      : 15" VGA Colour
Mouse        : Logitech
RAM          : 256 MB

Software Requirements:

Operating System : Windows XP
Front End        : JSP
Back End         : SQL Server

Software Requirements:

Operating System : Windows XP
Front End        : .Net
Back End         : SQL Server