
Acquisition in Wireless Sensor Networks: Directed Diffusion vs Pseudo-Distance Data Dissemination ∗

Ben Tatham
November 2007

Abstract
Directed Diffusion (DD) [1] is a data-centric method of disseminating
packets in a wireless sensor network to sink nodes, using application
layer information to determine network routing while remaining reactive to
network changes. Pseudo-Distance Data Dissemination (PDDD) [3] by Lee and
Lee modifies DD for greater efficiency. Lee and Lee compare PDDD
to DD on a few key points in specific mobile network topologies. This
study extends their comparison to better explain why DD is
so inefficient with data traffic. Further, we show that PDDD is also
worthwhile for more fixed sensor network topologies. Finally,
this paper presents the drawbacks of PDDD, namely the extra memory
that the protocol requires at each node.

1 Introduction
This work is a clarification, analysis, and extension of an article entitled
Data Dissemination for Wireless Sensor Networks written by Lee and Lee [3].
The article presents a novel modification of Directed Diffusion (DD) [1],
significantly improving both data traffic efficiency and performance.
Unfortunately, the authors do a poor job of explaining why DD is inefficient
and leave many critical details of their analysis to the reader. This work
attempts to fill those gaps in the argument and to provide a better direct
comparison of DD to PDDD. Further, we present a modification to PDDD
to boost efficiency even further.

∗ The LaTeX source for this paper and the embedded figures can be found at
http://triplipse.googlecode.com/svn/trunk/SensSimDoc/

tatham@ieee.org, The Department of Systems and Computer Engineering, Carleton
University, 1125 Colonel By Drive, Ottawa, ON, Canada K1S 5B6

1.1 Context
Diffusion-style networking differs from traditional networking techniques in
a number of ways. These differences make it applicable only to task-specific
networks, like sensor networks, and not to general-purpose networks.
First, diffusion is data-centric, meaning all packet passing is done based on
named, or typed, data. Therefore, while it is definitely related to traditional
reactive routing protocols, it crosses the abstraction boundaries up to the
application layer. Second, communication in diffusion is neighbor-to-neighbor;
it is difficult if not impossible to determine physical network topology outside
of a simulation. Therefore, there are no routers in the network and there is
no “end” of the network; each node is an “end”. Because of the neighbor-
to-neighbor communications, nodes do not necessarily need globally unique
addresses; they only need to be unique among their neighbors.
As mentioned above, diffusion is similar to more general reactive ad-
hoc routing protocols. However, unlike routing protocols, diffusion does not
attempt to find loop-free routes between two points. It does not even try
to find a single “best” route between nodes. Constrained flooding allows
messages to reach their destination, and through the use of reinforcement,
the empirically discovered best route is used most often. Message caching at
each node is required for loop avoidance.

1.2 Problem
The problem that both DD and PDDD attempt to solve is to efficiently get
data from source nodes, where information about a sensed event is gathered,
to sink nodes, where it can be analyzed, further processed, and forwarded
to end users. Wireless sensor networks present key problems that effectively
disallow the use of general-purpose routing techniques. Primarily, this
involves power constraints. Among the power-hungry activities of a sensor
node, radio transmission tops the list. Each bit transmitted consumes as
much power as 2090 processor cycles on a mica2dot mote [4]. So for simple
analysis and system development, it makes sense to focus on minimizing the
number of bits sent from each node.
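As a rough illustration of why the bit count dominates, the sketch below converts a packet size into the equivalent number of processor cycles using the 2090 cycles-per-bit figure quoted from [4]. The function name and the 32-byte packet size are hypothetical and chosen only for illustration; they do not come from the original analysis.

```python
# Energy-equivalent cost of transmitting one bit on a mica2dot mote,
# expressed in processor cycles, as quoted from [4].
CYCLES_PER_BIT = 2090


def tx_cost_in_cycles(packet_bytes: int) -> int:
    """Processor cycles that cost as much energy as sending the packet."""
    return packet_bytes * 8 * CYCLES_PER_BIT


# Hypothetical 32-byte packet (size assumed, not taken from the paper).
print(tx_cost_in_cycles(32))  # 535040
```

Under this assumption, even a small packet costs as much energy as several hundred thousand cycles of local computation, which is why the comparison that follows counts transmitted messages.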
Lee and Lee [3] studied the performance of Directed Diffusion and realized
there was great room for improvement. They were especially interested in
reducing the control message overhead, which becomes even more significant
in DD when nodes are mobile.

1.3 Results
The work of Lee and Lee is revisited in this paper. We explore the
differences in network lifetime and memory usage between DD and
PDDD. Finally, we extend PDDD to remove acknowledgement messages
between nodes.

1.4 Outline
Section 2 summarizes the key points of the Directed Diffusion (DD) protocol,
followed by a similar description of Pseudo-Distance Data Dissemination in
Section 3. Lee and Lee’s comparative results are revisited in Section 4. We
then describe the simplified simulation recreated for this paper in Section 5.
Finally, we present a minor modification to PDDD in Section 6 and show the
efficiency improvements it provides. We conclude in Section 7.

2 Directed Diffusion
2.1 Overview
Section 1.1 described the basic concepts of diffusion-based
protocols, on which DD is built. DD begins with sink nodes
flooding interest messages into the network describing:

• The type of data of interest

• The interval at which to send it and when it expires

• The geographic region of the network to send data from

As each node receives the interest message, it saves in memory a gradient,
which is used to determine where to forward data messages as they arrive
or as they are generated locally. Each gradient consists of

• data rate and interest description

• direction (e.g. address of the node that the interest was received from)

The node then forwards the interest message, changing the source address
to its own. To each node that subsequently receives the interest, it therefore
appears to have originated at the neighbor it was received from. Each node
must avoid loops in the network by remembering interests that it has already
forwarded. In general, it suffices to remember only the most recent one:
interests are not sent that frequently, and a loop would occur after only one
hop, when a neighbor forwards the interest right back to the node.
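The interest handling described above can be sketched as follows. This is a minimal illustration, not the implementation from [1] or from SensSim: the class names, the field names, and the `interest_id` used to recognize a repeated interest are all assumptions made for the example.

```python
from dataclasses import dataclass, field, replace
from typing import List, Optional


@dataclass(frozen=True)
class Interest:
    interest_id: int   # hypothetical id, used here only to spot duplicates
    data_type: str
    interval: int      # requested interval between data messages
    expires_at: int
    sender: str        # address of the neighbor that last transmitted it


@dataclass
class Gradient:
    data_type: str
    interval: int
    expires_at: int
    direction: str     # neighbor the interest was received from


@dataclass
class Node:
    address: str
    gradients: List[Gradient] = field(default_factory=list)
    last_forwarded: Optional[int] = None  # only the most recent interest

    def on_interest(self, interest: Interest) -> Optional[Interest]:
        """Store a gradient; return the interest to rebroadcast, or None."""
        self.gradients.append(Gradient(interest.data_type, interest.interval,
                                       interest.expires_at, interest.sender))
        if interest.interest_id == self.last_forwarded:
            return None  # already forwarded once: drop it to avoid a loop
        self.last_forwarded = interest.interest_id
        # Rewrite the source address so receivers see this node as the origin.
        return replace(interest, sender=self.address)
```

In this sketch a node records a gradient toward every neighbor it hears the interest from, but rebroadcasts the interest only the first time, which is the one-entry loop avoidance the text describes.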
For the first pass, the sink sends out an exploratory request such that the
interval is large and it expires relatively quickly. Since the interest message is
flooding the network, this allows the network to use the real data to indicate
the best paths for data, which then allows the sink to reinforce these paths.
When the sink receives the first data message from each sensor, it sends back
a reinforcement interest via unicast, just to the neighbor node that the data
message came from. When each subsequent node receives a unicast interest
message, it only chooses one other node to forward to, thus propagating
the empirically determined best path back to the source. While complex
algorithms could be used to determine which path to reinforce, they require
more information than is available in simple DD. Therefore, each node
reinforces the neighbor that first delivered a data message of the given type
to it. Because of this requirement, and for data message path loop avoidance,
each node must maintain a relatively large list of information about recently
received data messages and the order in which they were received.
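A minimal sketch of the per-node bookkeeping this implies is shown below. It assumes that data messages are identified by a (source, sequence number) pair and that the cache is unbounded; both are simplifications for illustration, and the class and method names are hypothetical.

```python
class DataCache:
    """Remembers received data messages and who delivered each type first."""

    def __init__(self):
        self.seen = set()          # (source, seq) pairs already received
        self.first_deliverer = {}  # data_type -> neighbor that delivered first

    def on_data(self, data_type, source, seq, from_neighbor):
        """Return True if the message is new and should be forwarded."""
        if (source, seq) in self.seen:
            return False  # duplicate arriving over another path: drop it
        self.seen.add((source, seq))
        # Only the first neighbor to deliver this data type is remembered.
        self.first_deliverer.setdefault(data_type, from_neighbor)
        return True

    def neighbor_to_reinforce(self, data_type):
        """Neighbor to send the unicast reinforcement to, or None."""
        return self.first_deliverer.get(data_type)
```

A real node would have to age entries out of this cache, which is where the memory cost discussed later comes from.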

2.2 Link Breakage Detection


In DD, each node detects link breakage by monitoring incoming data
messages and knowing the gradient on which they arrive. In practice,
the node does not actually have to maintain the exact gradient that the
sender is sending with. Rather, it can simply expect messages to keep
arriving at the same data rate indefinitely, because the sink will likely
renew the interest before the gradient expires.
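The timer logic this implies might look like the sketch below. The tolerance of two missed intervals is an assumed parameter, not a value from [1] or [3], and time is measured in arbitrary units.

```python
class LinkMonitor:
    """Flags upstream neighbors that stop sending at the expected rate."""

    def __init__(self, interval, tolerance=2):
        self.interval = interval    # expected time between data messages
        self.tolerance = tolerance  # assumed: intervals that may be missed
        self.last_heard = {}        # neighbor -> time of last data message

    def on_data(self, neighbor, now):
        self.last_heard[neighbor] = now

    def broken_links(self, now):
        """Neighbors silent for longer than tolerance * interval."""
        limit = self.interval * self.tolerance
        return sorted(n for n, t in self.last_heard.items() if now - t > limit)
```

Because the check relies only on the arrival of regular data traffic, it adds no control messages of its own.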

2.3 Network Density
One aspect of DD left out of both [1] and [3] is an analysis of how node
density affects performance. Each node must keep information about each
neighbor node; therefore, the higher the node density, and thus the more
neighbors each node has, the more memory is required at each node.
Multiply that by the data message rate, and the memory requirement at each
node grows with the product of the two rather than with either alone.
Therefore, DD is only suitable for relatively low density and low data rate
networks. More precise analysis will be shown in Section 5 below.

2.4 Drawbacks of Directed Diffusion


DD, while an innovative protocol for its time in 2003, has some key
drawbacks. By the authors' own admission, DD was designed to optimize
network reaction to topological changes, often at the expense of more traffic.
In practice, these changes are caused by sensor nodes dying due to lack of
energy. However, if sink nodes are mobile, as is often the case in public safety
networks where rescue workers are moving through the scene of an accident,
DD requires that new interest messages be flooded into the network far more
frequently.
In addition, DD is over-zealous in its transmission of data messages.
While DD is indeed extremely reactive to changes in the network, it wastes
precious power by sending data messages along multiple paths. The authors
of DD do mention that low-level radio protocols may allow a single radio
broadcast to reach multiple network-unicast recipients, but this is complex
to handle and difficult to achieve in all cases.

3 Pseudo-Distance Data Dissemination


PDDD attempts to solve one particular drawback of DD: the excessive
interest flooding required for mobile sink nodes.

3.1 Overview
PDDD removes the gradient algorithm from DD. Instead, each node knows
which of its neighbor nodes are closer to, at the same distance from, and
further from each sink node, in terms of Level. PDDD defines a new variable
Level, which is calculated at each node and kept up to date at each neighbor
node. The level consists of a few parameters,
$L_{i,\mathit{sinkID}} = \langle \lambda, -\alpha, -\beta \rangle$, where:

• λ: a distance metric, which is effectively pseudo-distance, or the number of hops to the sink.

• α: the number of neighbors with lower λ, or in other words, that are closer to the sink

• β: the number of neighbors with the same λ, or that are the same number of hops from the sink

• νi: unique id of the node.

Lee and Lee do a poor job of explaining what pseudo-distance is in this
paper. Their previous work in [2] does a somewhat better job, but it is not
exactly the same as the later work in [3]. They complicate the explanation
with varying terminology, calling it λ, distance metric, and
only on occasion pseudo-distance. This is surprising since it is the keyword
of the protocol itself.
In any case, pseudo-distance is just the number of hops to the sink. Much
discussion is given in [3] explaining why the addition of β, the number of
nodes at the same hop distance, is important to the protocol. In terms of
theory, it boils down to the difference between a partially-ordered graph and
a totally-ordered graph. In a partially-ordered graph, each node only knows
about the nodes closer to the sink, while in a totally-ordered graph, it also
knows its sibling nodes. Figure 1 shows an arbitrary network on the left, and
the totally-ordered graph layout of it, if node 5 is a sink node.
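To make the definitions of λ, α, and β concrete, the following sketch computes them centrally with a breadth-first search from the sink, for a small hypothetical topology (not the network of Figure 1). In PDDD itself each node derives its level from the interest messages it hears, so this only illustrates the resulting values. It assumes symmetric links; the function name `compute_levels` and the node names are ours.

```python
from collections import deque


def compute_levels(adjacency, sink):
    """Return {node: (lambda, -alpha, -beta)} for nodes reachable from sink.

    adjacency maps each node to a list of its neighbors (symmetric links).
    """
    hops = {sink: 0}
    queue = deque([sink])
    while queue:  # breadth-first search yields the hop count to the sink
        node = queue.popleft()
        for nb in adjacency[node]:
            if nb not in hops:
                hops[nb] = hops[node] + 1
                queue.append(nb)
    levels = {}
    for node, lam in hops.items():
        alpha = sum(1 for nb in adjacency[node] if hops[nb] < lam)  # closer
        beta = sum(1 for nb in adjacency[node] if hops[nb] == lam)  # siblings
        levels[node] = (lam, -alpha, -beta)
    return levels


# Hypothetical topology: a and b both hear the sink s and each other;
# c hears a and b only.
example = {"s": ["a", "b"], "a": ["s", "b", "c"],
           "b": ["s", "a", "c"], "c": ["a", "b"]}
levels = compute_levels(example, "s")
```

Here the sink gets (0, 0, 0), nodes a and b get (1, -1, -1), and node c gets (2, -2, 0).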

3.2 How it Works


Similar to DD, PDDD starts off by the sink node flooding an interest message
to its neighbors. The interest message in PDDD adds some more information
to the existing content of DD:

• Original sink address

• Level

Figure 1: Totally Ordered Graph

As the nodes receive the interest message, they update the address field
while leaving the original sink address in the message. Each node does,
however, change the Level information to its own. The sink's level is always
$\langle 0, 0, 0 \rangle$, because it is 0 hops from itself and has no
neighbors closer to, or at the same distance from, itself.
In PDDD, each node within one hop of the sink detects a link breakage
with the sink via heartbeats sent by the sink. The assumption is that sink
nodes have more power available. The other nodes in the network use
acknowledgement packets for each data message to detect link breakage. It
struck me as odd that this part of DD was changed; why introduce the additional
overhead of acknowledgements? While acknowledgements do provide more robust
breakage detection, being able to detect breakage in either direction of data
traffic, overhearing the original data message being forwarded on to another
node should be sufficient to detect breakage. Section 6 describes my proposal
in more detail.

4 Comparison
While Lee and Lee did perform comparative simulations of DD and PDDD using
NS-2, they left out any analytical comparison. This section attempts to
provide a more analytical approach to the comparison.

4.1 Memory Usage


As presented in the paper via simulation, PDDD clearly has less traffic than
DD. But at what cost? One of the tradeoffs is in terms of memory usage per
node, especially as the number of neighbors and the number of sinks in the
network increase.
The model for memory usage was determined through careful analysis
during the writing of the simulation code. The memory statistics do not
include code space, but rather just the data structures required to keep track
of neighbor node statistics for gradients or levels for DD and PDDD,
respectively. The assumptions are that addresses are 4 bytes and timestamps
are 8 bytes. We also assume that half of the neighbors will be upstream, and the
other half downstream. This affects both protocols because each must keep
timers to determine if a link is broken or not, albeit timers with different
semantics. See the protocol descriptions above for details.

Figure 2: Memory comparison for DD and PDDD. (3-D plot of memory per node [B] against number of neighbors and number of sink nodes.)

To be fair, Lee and Lee do limit PDDD to use with a small
number of sinks, but they do not mention the number-of-neighbors problem.
Figure 2 shows, in 3-D, the memory usage as the number of neighbors and
the number of sink nodes increase. It is difficult to see there how the number
of neighbors affects DD at all, but Figure 3 shows more clearly that memory
per node in DD also increases, though at a slower rate than in PDDD.
While memory usage may not be the critical factor in designing sensor
networks, it is nonetheless important to keep in mind when choosing either
the acquisition protocols or the node hardware when designing extremely
dense networks.

Figure 3: Memory comparison as number of neighbors increases. (Plot of memory usage [B] against number of neighbors for DD and PDDD.)

5 Simulation
5.1 SensSim
For the purposes of this analysis, a simplified simulation engine was developed
in Java called SensSim¹. The simulation progresses as a sequence of ticks,
with no real-world time associated with them. During each tick, a node
processes any packets for it coming in on the links it belongs to and, in the
same tick, forwards the packets that were received. Any new packets it
needs to inject into the network are also sent to the link. The destination
of each packet will process it on the subsequent tick. For example, a data
message that must traverse 10 nodes to get from source to sink arrives at
the destination on the 9th subsequent tick. The packets are cached in a Link
object between ticks.
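The tick semantics can be illustrated with the short sketch below. It is written in Python for brevity and is not the SensSim source. It assumes that "traverse 10 nodes" counts the source and the sink, so that a chain of 10 nodes has 9 links; the class and function names are hypothetical.

```python
class Link:
    """Caches packets between ticks, like the Link object described above."""

    def __init__(self):
        self.ready = []    # sent in an earlier tick, can be processed now
        self.pending = []  # sent in the current tick, visible next tick

    def send(self, packet):
        self.pending.append(packet)

    def receive(self):
        packets, self.ready = self.ready, []
        return packets

    def end_tick(self):
        self.ready.extend(self.pending)
        self.pending = []


def ticks_until_delivery(num_nodes):
    """Ticks after injection at which the last node of a chain gets the packet."""
    if num_nodes < 2:
        raise ValueError("need at least a source and a sink")
    links = [Link() for _ in range(num_nodes - 1)]  # links[i] joins i and i+1
    links[0].send("data")  # tick 0: the source injects one packet
    for link in links:
        link.end_tick()
    tick = 0
    while True:
        tick += 1
        for i in range(1, num_nodes):
            for packet in links[i - 1].receive():
                if i == num_nodes - 1:
                    return tick        # the sink processes the packet
                links[i].send(packet)  # forwarded within the same tick
        for link in links:
            link.end_tick()
```

Under these assumptions, `ticks_until_delivery(10)` returns 9, matching the example in the text.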
The topology is very simple: a grid of nodes is created. A broadcast from
a node is heard by all eight of its neighbors, if they
exist. The user can specify a node density, which determines the probability
that each grid position is occupied. A density of 100 ensures that every
position in the grid is occupied. One edge of the grid is all sources and at the opposite edge, half
of the nodes are sinks. The density factor does not affect the source or sink
nodes. This topology is similar to the one used in the original simulations
for Directed Diffusion [1].
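A sketch of such a topology generator follows. Several details are assumptions, since the text does not fix them: the first column holds the sources, every other row of the last column holds a sink, the remaining positions of the last column are treated as ordinary nodes subject to the density, and density is a percentage. All names are hypothetical.

```python
import random


def build_grid(rows, cols, density, rng):
    """Return (occupied, sources, sinks) as sets of (row, col) positions."""
    sources = {(r, 0) for r in range(rows)}             # one edge: all sources
    sinks = {(r, cols - 1) for r in range(0, rows, 2)}  # opposite edge: half
    occupied = sources | sinks  # density does not affect sources or sinks
    for r in range(rows):
        for c in range(cols):
            if (r, c) not in occupied and rng.random() * 100 < density:
                occupied.add((r, c))
    return occupied, sources, sinks


def neighbors(pos, occupied):
    """Occupied positions among the eight cells surrounding pos."""
    r, c = pos
    return [(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0) and (r + dr, c + dc) in occupied]


occupied, sources, sinks = build_grid(4, 5, 100, random.Random(0))
```

With a density of 100 every position is occupied, so an interior node hears eight neighbors and a corner node hears three.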
Each node object keeps track of basic statistics during the simulation.
When complete, a set of cumulative statistics is built based on the various
parameters. Naturally, a real-world environment could not collect the same
level of statistics about the underlying overheads of the protocols.

¹ The source code for the simulator is available at
http://triplipse.googlecode.com/svn/trunk/SensSim/, and can be built quickly with
Apache Maven.

6 Modification to PDDD
As mentioned above, PDDD uses acknowledgement of each data message to
detect link breakage. Many other wireless protocols use eavesdropping of
packets to monitor another node's status. In sensor networks, the general
case is to have symmetric links with omni-directional antennas. Therefore,
rather than wait for an explicit acknowledgement, the sending node need
only watch for the same data message, identified by sequence number and
source address, to be transmitted to another node. The monitoring node
could go as far as making sure the destination address is not one of its own
neighbors, but this is not technically necessary. When the rebroadcast data
message is heard, it can clean up its timers the same way that PDDD does
when an acknowledgement is received.
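The passive scheme can be sketched as below. This is only an illustration of the idea, not code from SensSim. The timeout value, the names, and the check that the overheard transmitter is the expected next hop are assumptions added for the example.

```python
class PassiveAckTracker:
    """Treats an overheard rebroadcast as the acknowledgement of a send."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.pending = {}  # (source, seq) -> (next_hop, deadline)

    def on_send(self, source, seq, next_hop, now):
        """Start a timer when a data message is handed to next_hop."""
        self.pending[(source, seq)] = (next_hop, now + self.timeout)

    def on_overheard(self, source, seq, transmitter):
        """Clear the timer if next_hop is heard forwarding the same message."""
        entry = self.pending.get((source, seq))
        if entry is not None and entry[0] == transmitter:
            del self.pending[(source, seq)]
            return True
        return False

    def expired(self, now):
        """Next hops whose rebroadcast was not overheard before the deadline."""
        late = [(key, hop) for key, (hop, deadline) in self.pending.items()
                if now >= deadline]
        for key, _ in late:
            del self.pending[key]
        return [hop for _, hop in late]
```

A next hop returned by `expired` would be handled the same way PDDD handles a missing acknowledgement.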
One drawback of replacing active acknowledgements with passive listening
is that the node either receives more bits, since data messages are typically
larger than acknowledgements, or needs more complex low-level software to stop
listening to data messages after the header fields. However, it is likely that
the software would already handle partial packet listening anyway to save
power. It would just be slightly more complicated to instruct the low-level
software to watch for other header fields besides a destination address equal
to the local address or broadcast.

7 Conclusion
In this paper, we have provided a more detailed analysis of Lee and Lee's
Pseudo-Distance Data Dissemination protocol. We have filled in analytical
holes in their initial presentation, most importantly by clarifying what
“pseudo-distance” is and by analyzing memory usage. A simulation environment
was developed to fully understand the loose terminology provided in the
original PDDD document [3].
Finally PDD was modifie

Acknowledgements
The author would like to thank Professor Michel Barbeau and the students
of COMP5402 in Fall 2007 for their help and breadth of experience in all
aspects of wireless networking.

References
[1] Chalermek Intanagonwiwat, Ramesh Govindan, Deborah Estrin, John
Heidemann, and Fabio Silva. Directed diffusion for wireless sensor
networking. IEEE/ACM Transactions on Networking, 11(1):2–16, 2003.

[2] Min-Gu Lee and Sunggu Lee. A pseudo-distance routing algorithm for
mobile ad-hoc networks. IEICE Transactions on Fundamentals of
Electronics, Communications, and Computer Sciences, E89-A(6):1647–1656,
June 2006.

[3] Min-Gu Lee and Sunggu Lee. Data dissemination for wireless sensor
networks. Proceedings of the 10th IEEE International Symposium on Object
and Component-Oriented Real-Time Distributed Computing (ISORC'07),
May 2007.

[4] Arvinderpal S. Wander, Nils Gura, Hans Eberle, Vipul Gupta, and
Sheueling Chang Shantz. Energy analysis of public-key cryptography for
wireless sensor networks. Third IEEE International Conference on Pervasive
Computing and Communication (PerCom 2005), March 2005.
