
MStore: Enabling Storage-Centric Sensornet Research

Kresimir Mihic, Ajay Mani, Manjunath Rajashekhar, and Philip Levis


{kmihic,ajaym,manj,pal}@stanford.edu
Computer Systems Laboratory
Stanford University
Stanford, CA 94305

Abstract
We present MStore, an expansion board for telos and mica
family nodes that provides a non volatile memory hierarchy.
MStore has four memory chips: a 32KB FRAM, an 8MB
NOR flash, a 2MB NOR flash, and a 256MB NAND flash,
which can be expanded to 8GB if needed. All chips provide
an SPI bus interface to the node processor. MStore also
includes a Complex Programmable Logic Device (CPLD),
whose primary purpose is to be an SPI to parallel interface
for the NAND chip. The CPLD can also be used to offload
complex data processing.
Using TinyOS TEP-compliant drivers, we measure the current draw and latencies of read, write, and erase operations of different sizes on each of the storage chips. Through this quantitative evaluation, we show that MStore's many-level hierarchy and simple design provide an open and flexible platform for sensor network storage research and experimentation.

1. INTRODUCTION

The primary purpose of sensor networks is to sense and process readings from the environment. Local storage can benefit many application scenarios, such as archival storage [16], temporary data storage [12], storage of sensor calibration tables [17], in-network indexing [21], in-network querying [22], and code storage for network reprogramming [15], among others. Recent work [20] has also shown that local storage on flash chips is two orders of magnitude cheaper than transmitting over the radio, and is comparable to computation. Such gains in cost, performance, and energy consumption have strengthened the case for in-network storage and data-centric applications. While flash memory has been a cheap, viable storage alternative for low-power, energy-constrained sensor nodes, storage subsystems on existing sensor platforms have not caught up with recent technological advancements in non-volatile memory chip designs.
Existing storage capabilities on sensor nodes are restricted to a single flash memory. Current designs of storage-centric applications thus focus their storage strategies around a single storage chip. But existing storage solutions like Matchbox [10], ELF [9], and Capsule [19], and indexing systems like MicroHash [23] and Tinx [18], could greatly benefit from a multi-level storage subsystem. Having such a multi-level storage hierarchy will radically improve the flexibility of design choices, enhance overall sensor performance, and influence newer designs and system architectures.

Figure 1: The MStore mounted on a Telosb Sensor

Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
Copyright 200X ACM X-XXXXX-XX-X/XX/XX ...$5.00.

MStore, our new extension storage board for the telos and mica sensor nodes, introduces such a non-volatile memory hierarchy built from different memory chips, providing a much-needed module to enable research and experimentation in storage-centric sensornet applications.
Non-Volatile Memory Characteristics: There are many different types of non-volatile memory on the market that target low-power scenarios. Among them, flash memory is the most prominent, while newer types such as Magnetic RAM (MRAM) and FRAM are gaining popularity. These chips vary widely in their characteristics.
NOR Flash: NOR flash memory has traditionally been used to store relatively small amounts of executable code for embedded computing devices such as PDAs and cell phones. NOR is well suited to code storage because of its reliability, fast read operations, and random access capabilities. Because code can be directly executed in place, NOR is ideal for storing firmware, boot code, operating systems, and other data that changes infrequently. Apart from being used as a ROM, NOR memories can also be partitioned with a file system and used as any storage device. NOR flashes have capacities of up to 512 MB.
NAND Flash: NAND flash memory has become the preferred format for storing larger quantities of data. Higher density, lower cost, faster write and erase times, and a longer re-write life expectancy make NAND especially well suited for applications in which large amounts of sequential data need to be loaded into memory quickly and replaced repeatedly. Unlike NOR flash chips, NAND chips are accessed like a block device, with block sizes from 512 bytes to 2048 bytes. Associated with each block are a few bytes (typically 12-16 bytes) that can be used to store an error detection and correction checksum for the block. NAND flashes have capacities of up to 8GB.

Figure 2: The MStore

FRAM: Ferroelectric RAM (FeRAM or FRAM) is an advanced non-volatile computer memory. It is similar in construction to DRAM, but uses a ferroelectric layer to achieve non-volatility. Although the market for non-volatile memory is currently dominated by flash chips, FeRAM offers a number of advantages, notably lower power usage, faster write speed, and a much greater maximum number of write-erase cycles (exceeding 10^16 for 3.3V devices). FeRAMs have smaller capacities; the largest manufactured FRAM is 1MB.
Given the different characteristics available, we have included four different NVRAM chips on MStore: a 32KB FRAM, an 8MB NOR flash, a 2MB NOR flash, and a 256MB NAND flash. The NAND chip's family also includes chips with sizes from 256MB to 8GB that share identical current and timing characteristics. Thus the NAND chip on MStore can be expanded to 8GB if needed. Individual chip details are given in Section 3.
Drivers and Evaluation: To support reading and writing data on these memory chips, we have written a set of drivers in TinyOS 2.0 [14] that follow the guidelines of TinyOS Enhancement Proposal 103 [6]. Details of this implementation are in Section 4. We evaluated the read, write, and erase characteristics using these drivers. The results are compared with the values specified in the datasheets for each of the chips. These evaluations are in Section 5.

2. BOARD DESIGN

The MStore memory extension board consists of five functional elements (four memory chips and a complex programmable logic device (CPLD)) and supporting elements such as current resistors, bypass capacitors, and a 1.8 V voltage regulator for the CPLD. The board has been designed to support both the telosb and mica2 platforms by providing interfaces to the platforms by means of extension ports. The board fits both platforms without any alteration.

The backbone of the MStore board is the SPI bus that connects the extension ports with the functional elements (Figure 3). The FRAM and NOR memory devices are connected directly to the bus, while the NAND uses the CPLD as its SPI to parallel interface. The interface has been designed as a finite state machine that handles the NAND chip control inputs (command, address and data latch modes, input and output data) and performs serial to parallel and parallel to serial data conversion. The state machine is chip specific, as it follows the specifics of the K9 flash memory family, but it can be easily updated to support similar memory chips from other manufacturers. The interface has been implemented in VHDL and synthesized using the Xilinx ISE 8.1i design software suite.

MStore provides seven extension ports (Figures 4 and 5). Four connect MStore to a sensor platform: two for telosb and two for mica2. The next two provide a direct connection to CPLD pins: one to access the JTAG ports and the other for I/O pins (CPLD EX). The seventh extension port (RP) gives access to the current resistors R placed on the power supply inputs of each chip, and can thus be used for various measurements on the power lines (for example, operating current).

Figure 3: Schematic diagram of MStore components

Figure 4: MStore Board (Front)

3. CHIPS

The MStore storage board consists of four different storage chips with varying characteristics and behaviors. It also includes a Complex Programmable Logic Device, which lets one program logic directly into the storage board, offloading processing from the sensor mote. The board can act as an extension to both the telosb [2] and the micaz [13] sensor nodes.

Table 1: Overview of storage and data organization as per datasheets

Chip Name    Total Size  Page Size  Block Size  Sector Size  Read Unit        Write Unit  Erase Unit
                         (Bytes)    (KB)        (KB)         (Bytes)          (Bytes)     (KB)
M25P64       8MB         256        -           64           1 to ∞ (fn. 1)   1 to 256    64
AT26DF161    2MB         256        -           128          1 to ∞ (fn. 1)   1 to 256    4/32/64
FM25L256     32KB        -          -           -            1                1           NA
K9F2G08UXA   256MB       2048       128         -            2048             2048        128


Table 3: M25P64: Current draw characteristics

Operation        Max   Measured   Units
Standby Mode     50    -          µA
Read at 20MHz    4     1.7        mA
Page Program     15    4.5        mA
Fast Program     20    -          mA
Sector Erase     20    4.8        mA
Bulk Erase       20    4.8        mA


Figure 5: MStore Board (Back)

3.1 M25P64 NOR Flash

The M25P64 [5] is an 8MB serial NOR flash memory that can be accessed over a high-speed SPI-compatible bus. The memory can be programmed 1 to 256 bytes at a time using the Page Program instruction. An enhanced Fast Program/Erase mode is available to speed up operations. The memory is organized as 128 sectors, each containing 256 pages. Each page is 256 bytes wide. Thus, the whole memory can be viewed as consisting of 32768 pages, or 8388608 bytes. Even though there is no way to erase a single page, the memory can be erased a sector at a time using the Sector Erase instruction, or in its entirety using the Bulk Erase instruction.

Table 2: M25P64 Program/Erase characteristics

Parameter                              Typ             Max   Units
Page Program Cycle Time (256 Bytes)    1.4             5     ms
Page Program Cycle Time (n Bytes)      (0.4 + n/256)         ms
Sector Erase Cycle Time                1               3     sec
Bulk Erase Cycle Time                  68              160   sec

Protection: The M25P64 chip protects at a sector level. Three Block Protect bits in the status register select which sectors are protected. The protected sectors must be consecutive; depending on the bit values, the top 2, 4, 8, 16, 32, or 64 sectors are protected.

Energy and latency: The chip consumes 50 µA in standby mode and about 20 mA in program or erase mode. Reads are much less expensive than writes; our measurements put the read current at about 2 mA.

3.2 AT26DF161 NOR Chip

The AT26DF161 [1] is a 2MB serial interface flash memory device. It has sixteen 128-Kbyte physical sectors, and each sector can be individually protected from program and erase operations. The chip has a flexible erase architecture supporting four erase granularities: 4KB, 32KB, 64KB, and full chip. The datasheet states that the chip is designed for use in a wide variety of high-volume consumer applications in which program code is shadowed from flash memory into embedded or external RAM for execution. The small erase granularity also makes it well suited to data storage.

Protection: The AT26DF161 also offers a sophisticated method for protecting individual sectors against erroneous or malicious program and erase operations. By providing the ability to individually protect and unprotect sectors, a system can unprotect a specific sector to modify its contents while keeping the remaining sectors of the memory array securely protected. This is useful in applications where program code is patched or updated on a subroutine or module basis, or in applications where data storage segments need to be modified without running the risk of errant modifications to the program code segments. In addition to individual sector protection capabilities, the AT26DF161 incorporates Global Protect and Global Unprotect features that allow the entire memory array to be either protected or unprotected all at once. This reduces overhead during the manufacturing process, since sectors do not have to be unprotected one-by-one prior to initial programming.

Specifically designed for use in 3-volt systems, the AT26DF161 supports read, program, and erase operations with a supply voltage range of 2.7V to 3.6V. No separate voltage is required for programming and erasing. The chip-specific energy properties are laid out in Table 5.

Footnote 1: Address values greater than the chip's maximum address are ignored by our driver implementation.

Table 4: AT26DF161: Program/Erase characteristics

Parameter                        Typ    Max   Units
Page Program Time (256 Bytes)    1.5    5.0   ms
Block Erase Time
  4-Kbyte                        0.05   0.2   sec
  32-Kbyte                       0.35   0.6   sec
  64-Kbyte                       0.7    1.0   sec
Chip Erase Time                  18     28    sec

Deep Power Down: During normal operation, the AT26DF161 is placed in standby mode to consume less power as long as the CS pin remains deasserted and no internal operation is in progress. The chip also accepts a Deep Power-Down command that places the device into an even lower power consumption state called Deep Power-Down mode. The datasheet states that the chip consumes about 4 µA in this mode. This property would be useful for energy-efficient duty cycling of the flash chip.
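
The 4KB, 32KB, and 64KB erase granularities of the AT26DF161 suggest a simple planning step: cover the region to be erased with the largest aligned blocks that fit. The following is a minimal sketch under stated assumptions. The function name `plan_erase` and its greedy strategy are hypothetical and are not taken from the MStore driver. The region is first widened to 4KB boundaries, because a partial block cannot be erased.

```python
# Illustrative sketch (not the MStore driver): cover a byte range with the
# AT26DF161 block-erase sizes of 4KB, 32KB and 64KB, largest aligned block first.
KB = 1024
ERASE_SIZES = (64 * KB, 32 * KB, 4 * KB)  # granularities named in the text


def plan_erase(start, length):
    """Return a list of (address, block_size) erase operations.

    The range is widened to 4KB boundaries, because the smallest
    erasable unit is a 4KB block.
    """
    if length <= 0:
        return []
    smallest = ERASE_SIZES[-1]
    addr = (start // smallest) * smallest                          # round down
    end = ((start + length + smallest - 1) // smallest) * smallest  # round up
    ops = []
    while addr < end:
        for size in ERASE_SIZES:
            # Use the largest block that is aligned here and fits in the range.
            if addr % size == 0 and addr + size <= end:
                ops.append((addr, size))
                addr += size
                break
    return ops
```

For example, `plan_erase(0, 100 * KB)` yields one 64KB, one 32KB, and one 4KB erase. Whether fewer, larger erases actually save energy on this chip depends on the measured costs reported in the evaluation.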

Table 5: AT26DF161: Current draw characteristics

Parameter                    Typ(Max)   Measured   Units
Standby Current              25(35)     -          µA
Deep Power-Down Current      4(8)       -          µA
Read Operation at 20 MHz     7(10)      10.7       mA
Page Program                 12(18)     12.8       mA
Sector Erase                 14(20)     11.4       mA
Bulk Erase                   14(20)     11.3       mA

3.3 FM25L256 FRAM

Ferroelectric RAM (FeRAM or FRAM) is a new type of non-volatile computer memory that uses a ferroelectric layer to achieve non-volatility. FRAM is competitive in applications where its low write voltage, fast write speed, and much greater write-erase endurance outweigh its low storage capacity and give it a compelling advantage over flash memory. We therefore included this chip on our storage board so that we can investigate the possibilities of using chips with these properties.

The FM25L256 [3] is a 32KB nonvolatile memory that uses this advanced ferroelectric process. The FM25L256 performs write operations at bus speed. The datasheet states that no write delays are incurred: the next bus cycle may commence immediately without the need for data polling. In addition, the product offers virtually unlimited write endurance. FRAM also exhibits much lower power consumption than EEPROM. These capabilities make the FM25L256 ideal for nonvolatile memory applications requiring frequent or rapid writes or low-power operation. Example applications range from data collection, where the number of write cycles may be critical, to demanding industrial controls, where the long write time of EEPROM can cause data loss. The FM25L256 provides substantial benefits to users of serial EEPROM as a hardware drop-in replacement. The FM25L256 uses the high-speed SPI bus, which enhances the high-speed write capability of FRAM technology.

Protection: The FM25L256 chip does not have the same powerful protection system as the AT26DF161. The programmer can protect only fixed regions of the chip: the upper 1/4 of the address space, the upper 1/2, or all of the chip memory.

The datasheet states that the current drawn in standby mode for this chip is in the range of 1 µA, which is quite small compared to the 50 µA of the M25P64 and the AT26DF161 chips. There is no difference between an erase and a write, and the latency values for both reads and writes are said to be at bus speeds.

Table 6: FM25L256: Current draw characteristics

Operation       Typ(Max)   Measured   Units
Standby Mode    -(1)       -          µA
Byte Read       15(30)     0.70       mA
Byte Write      15(30)     0.71       mA

3.4 K9F2G08UXA NAND FLASH

The K9F2G08X0A [4] is a 2,112 Mbit memory organized as 131,072 rows (pages) by 2,112x8 columns. The spare 64x8 columns are located at column addresses 2,048 to 2,111. A 2,112-byte data register is connected to the memory cell arrays, accommodating data transfer between the I/O buffers and memory during page read and page program operations. The memory array is made up of 32 cells that are serially connected to form a NAND structure. Each of the 32 cells resides in a different page. A block consists of two NAND structured strings. A total of 1,081,344 NAND cells reside in a block.

The program and read operations are executed on a page basis, while the erase operation is executed on a block basis. The memory array consists of 2,048 separately erasable 128K-byte blocks. This means that bit-by-bit erase operations are not possible on the K9F2G08X0A.

Some commands require one bus cycle; for example, the Reset and Status Read commands. Other commands, such as page read, block erase, and page program, require two cycles: one cycle for setup and the other for execution.

In addition to the enhanced architecture and interface, the device incorporates a copy-back program feature that copies data from one page to another without transporting it to and from the external buffer memory. Since the time-consuming serial access and data-input cycles are removed, system performance for solid-state disk applications is significantly increased. This feature is not used in our driver but is a potential optimization.

The chip does not have any programmable protection scheme.
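
The page and block organization described in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant names and the `locate` helper are assumptions made for exposition and are not part of the MStore driver. It maps a byte offset in the main data area to a block, a page within that block, and a column.

```python
# Geometry of the K9F2G08X0A as described in the text; names are illustrative.
PAGE_DATA_BYTES = 2048       # main area of a page
PAGE_SPARE_BYTES = 64        # spare columns 2,048 to 2,111
PAGE_TOTAL_BYTES = PAGE_DATA_BYTES + PAGE_SPARE_BYTES   # 2,112
NUM_PAGES = 131072           # rows
NUM_BLOCKS = 2048            # separately erasable blocks
PAGES_PER_BLOCK = NUM_PAGES // NUM_BLOCKS               # 64


def locate(data_offset):
    """Map a byte offset in the main data area to (block, page_in_block, column)."""
    if not 0 <= data_offset < NUM_PAGES * PAGE_DATA_BYTES:
        raise ValueError("offset outside the data area")
    row, column = divmod(data_offset, PAGE_DATA_BYTES)
    block, page_in_block = divmod(row, PAGES_PER_BLOCK)
    return block, page_in_block, column
```

With these constants, a block holds 64 pages of 2,048 data bytes (128KB), and 64 pages of 2,112 bytes account for the 1,081,344 cells per block quoted above.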

Table 7: K9F2G08UXA: Program/Erase characteristics

Parameter            Typ   Max   Units
Page Program Time    0.2   0.7   ms
Block Erase Time     1.5   2     ms

Table 8: K9F2G08UXA: Current draw characteristics

Operation       Typ(Max)   Measured   Units
Standby Mode    10(50)     NI         µA
Page Read       15(30)     2          mA
Page Program    15(30)     7.03       mA
Block Erase     15(30)     8.63       mA

3.5 XC2C32A CPLD Chip

The XC2C32A [8] is a Complex Programmable Logic Device (CPLD) manufactured by Xilinx and is part of its CoolRunner-II CPLD family. It provides a 100% digital core with up to 323 MHz performance. The CPLD provides high performance with ultra-low power consumption, using about 28.8 µW in standby mode. It comes in one of the smallest CPLD packages available and features 32 macrocells. The device supports both IEEE 1532 In-System Programming (ISP) and IEEE 1149.1 JTAG Boundary Scan testing.

During our experiments, we found that the chip adds an overhead of 1.4 mA while performing the serial to parallel and parallel to serial data conversion. The latency of operations on the CPLD is negligible. According to the datasheet, it draws a standby current of about 90 µA.
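
The serial to parallel conversion that the CPLD performs can be pictured in software as a shift register. The following is a behavioral sketch only: the real interface is a VHDL state machine, and the MSB-first bit order and the function names here are assumptions made for illustration.

```python
# Behavioral sketch only: the real interface is a VHDL state machine in the CPLD.
def serial_to_parallel(bits):
    """Assemble SPI bits (MSB first, an assumption) into a list of byte values."""
    if len(bits) % 8 != 0:
        raise ValueError("bit stream must be a whole number of bytes")
    out = []
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | (bit & 1)   # shift in one bit per clock
        out.append(value)
    return out


def parallel_to_serial(data):
    """Split byte values from the 8-bit NAND bus back into SPI bits, MSB first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
```

The two functions are inverses of each other, which mirrors the way the interface converts in both directions between the SPI bus and the NAND chip's parallel I/O pins.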

4. DRIVER DESIGN

To support evaluation of the characteristics of the board and its various memory chips, we needed to write drivers to talk to the hardware. We implemented these drivers in TinyOS 2.0 [14] and followed the guidelines of TinyOS Enhancement Proposal 103 [6].
TEP 103 documents a set of hardware-independent interfaces to non-volatile storage for TinyOS and describes some design principles for the Hardware Presentation Layer (HPL) and Hardware Adaptation Layer (HAL) of various flash chips. We follow the three-layer Hardware Abstraction Architecture (HAA), with each chip providing a presentation layer (HPL), an adaptation layer (HAL), and a platform-independent interface layer (HIL) [11].
The TEP also describes three high-level storage abstractions: large objects written in a single session (Block interface), small objects with arbitrary reads and writes (Config interface), and logs (Log interface). We used our implementation of the Block interface for our tests.
TinyOS 2.x divides flash chips into separate volumes (with sizes fixed at compile time), with each volume providing a single storage abstraction (the abstraction defines the format). We used each chip as a single volume, and the drivers operated over this volume.
Footnote 2: The readings were not complete at the time of submission.

Footnote 3: These values include the fixed 1.4 mA overhead because of the CPLD.

Bad Blocks and CRCs: The Block interface of TEP 103 contains no direct support for verifying the integrity of the data. Our drivers add CRC support and allow the CRC to be checked when desired.
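
As an illustration of the kind of integrity check described here, the sketch below appends a 16-bit CRC to a payload and verifies it on read. The text does not say which CRC the MStore drivers use, so the choice of CRC-16-CCITT with an initial value of 0, the trailer layout, and the function names are all assumptions.

```python
import binascii

# Assumption: CRC-16-CCITT (polynomial 0x1021) with initial value 0; the text
# does not say which CRC the MStore drivers use.
def append_crc(payload: bytes) -> bytes:
    """Return the payload followed by its 2-byte big-endian CRC."""
    crc = binascii.crc_hqx(payload, 0)
    return payload + crc.to_bytes(2, "big")


def check_crc(stored: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the trailer."""
    if len(stored) < 2:
        return False
    payload, trailer = stored[:-2], stored[-2:]
    return binascii.crc_hqx(payload, 0) == int.from_bytes(trailer, "big")
```

A caller would write `append_crc(data)` to flash and call `check_crc` on the bytes read back, skipping the check when the cost of recomputing it is not warranted.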

5. EVALUATION

The datasheets for the various chips give the current usage and latency values for read, write, and erase operations. However, the actual values in practice tend to be very different because of the driver overhead associated with each individual operation. Hence, we decided to experimentally measure the latency and current usage characteristics of various operations by exercising the driver code corresponding to each individual chip.
Experimental Setup: The experiments were carried out on an MStore board connected to a Telosb sensor node. Each of the chips had a single volume implementing the Block storage abstraction as defined by TEP 103. Read and write characteristics were measured by reading and writing a byte, a page, two pages, a sector, and the full flash size worth of data. The erase characteristics were measured by exercising the bulk and sector erase commands (see footnote 4). The sensor node did not have any other application running on top of TinyOS while the measurements were taken.

5.1 Latency Characteristics

We define the latency of a particular operation as the time required to complete it at the Hardware Interface Layer (HIL). For a split-phase operation, it is the average time between invoking the operation and the signalling of its completion callback.
The latency of an operation includes the SPI bus arbitration time, the time needed to prepare a chip for the operation (opcode, address, data, etc.), and the actual time needed for the operation to complete. The latter information (min, max, and typical values) is usually given in the AC Characteristics section of the chip's datasheet. For some operations the latency also includes the timing of instructions that must precede the actual command. For example, before each write, a command to enable writing must be sent to the flash. Also, the SPI bus may not send a regular stream of clock cycles, but may have bytes separated by some interval T1 and groups of bytes separated by some other interval T2. The actual latency of an operation is thus the sum of the latencies of its constituent operations and the timing and operational constraints of the SPI bus.
Using the Alarm interface of TinyOS for measuring these
latencies would have limited the precision and accuracy of
the readings to that supported by the timer system. To have
more accurate readings, we decided to calculate latency in
terms of number of clock cycles required to execute each
split phase operation. We did this by starting a counter
before a call to an operation and stopping the counter once
the operation completion was signalled. The value in this
counter then represents the number of clock cycles consumed
in carrying out a particular operation.
Our testing suite includes code to check the correctness of the read and write operations. This introduces a slight overestimate in the measured latency of chip operations.

Footnote 4: Erase operations occur at a minimum granularity of the block size. Hence, to erase any region of the flash smaller than the block size, we must pay at least the cost of erasing the smallest number of blocks that encompass the region.

The clock on the Telosb runs at 8MHz, so the final latency is calculated using the formula:
latency (in ns) = (125 * #clock-cycles)
The latency measurement for each flash operation was repeated 10 times. The values in Table 9 represent the average over these 10 experiments. There were slight variations in the running time of flash operations, but in all cases the standard deviation was below 1% of the average value.
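
The conversion from counted clock cycles to latency, and the averaging over repeated runs, can be sketched as follows. This is a minimal illustration: the function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics

CLOCK_HZ = 8_000_000                        # clock rate stated in the text
NS_PER_CYCLE = 1_000_000_000 // CLOCK_HZ    # 125 ns per cycle


def cycles_to_ms(cycles):
    """Convert a clock-cycle count to milliseconds at 8MHz."""
    return cycles * NS_PER_CYCLE / 1_000_000


def summarize(cycle_counts):
    """Return (mean latency in ms, standard deviation as a fraction of the mean)."""
    latencies = [cycles_to_ms(c) for c in cycle_counts]
    mean = statistics.mean(latencies)
    return mean, statistics.pstdev(latencies) / mean
```

Passing the ten cycle counts recorded for one operation to `summarize` gives the averaged latency and the relative spread that the 1% bound above refers to.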

5.2 Energy Characteristics

The energy requirements of a memory chip are a primary concern for adoption and a motivating factor in design choices. The energy required to read, store, or erase data is a function of latency and current draw. This value is much higher than what can be calculated from the information in the datasheet, which only includes the time for an operation to execute on the chip. For proper energy calculations, the latency of the operation as provided by the driver must be used. We use the operation latencies measured in the latency experiments to calculate the energy costs.
To measure the current drawn by each of the operations, we executed the different read, write, and erase operations using the drivers, just as in the latency measurements. We used a Velleman oscilloscope [7] to measure these values. Current measurements were done by measuring the average voltage drop across the input resistors during memory operations (see footnote 5). For the flash chips and the CPLD, the resistor values are 1 ohm, while for the FRAM a resistor of 10 ohms is used. The tolerance of all resistors is 1%.
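
The current follows from the measured voltage drop by Ohm's law, I = V / R. The sketch below shows the conversion and the spread introduced by the 1% resistor tolerance. The function name is hypothetical, and the bounds account only for the resistor tolerance, not for the error of the oscilloscope itself.

```python
def current_ma(voltage_drop_mv, resistor_ohms, tolerance=0.01):
    """Return (nominal, low, high) current in mA from Ohm's law, I = V / R.

    Millivolts divided by ohms gives milliamps directly. The low and high
    bounds reflect only the resistor tolerance.
    """
    nominal = voltage_drop_mv / resistor_ohms
    low = voltage_drop_mv / (resistor_ohms * (1 + tolerance))
    high = voltage_drop_mv / (resistor_ohms * (1 - tolerance))
    return nominal, low, high
```

For instance, a 4.8 mV average drop across a 1 ohm resistor corresponds to a nominal 4.8 mA, and the same current through the FRAM's 10 ohm resistor would produce a drop ten times larger, which makes small currents easier to resolve.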
The energy value depends on the operating voltage. We assume a constant running voltage of 2.8 volts. The consumed energy is calculated using the formula:
energy (µJ) = 2.8 (V) * measured current (mA) * latency (ms)

Figure 6: Energy consumed during Read operation


The measured readings are tabulated in the current usage tables of each of the chips (Tables 3, 5 and 6). We notice that these energy values also vary substantially from the values provided in the datasheets.
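
The energy formula can be applied directly to the tabulated measurements. The sketch below is a worked example with a hypothetical helper name. It uses the M25P64 sector erase figures from Tables 3 and 9 and relies on the fact that volts times milliamps times milliseconds gives microjoules.

```python
SUPPLY_VOLTS = 2.8   # constant operating voltage assumed in the text


def energy_uj(current_ma, latency_ms, volts=SUPPLY_VOLTS):
    """Energy in microjoules: V * mA * ms = uJ."""
    return volts * current_ma * latency_ms


# M25P64 sector erase: 4.8 mA (Table 3) for 0.289 s, i.e. 289 ms (Table 9).
sector_erase_mj = energy_uj(4.8, 289.0) / 1000.0
```

This gives roughly 3.88 mJ, in line with the 3.89 mJ listed for the M25P64 sector erase in Table 10; the small difference is consistent with rounding in the tabulated inputs.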

Table 10: Energy consumed for Erase operation (mJ)

                FM25L256   M25P64   AT26DF161
Sector Erase    NA         3.89     18.47
Block Erase
  4KB           NA         NA       9.24
  32KB          NA         NA       9.26
  64KB          NA         NA       18.48
Bulk Erase      NA         241.47   245.52


Footnote 5: Standby current could not be measured due to the limited resolution of the oscilloscope.

Figure 7: Energy consumed during Write operation

Table 9: Operation latency over driver code

Operation   Size (in bytes)   FM25L256   M25P64   AT26DF161   Units
Erase       Sector Erase      NA         0.289    0.578       sec
            Bulk Erase        NA         17.967   13.449      sec
Read        1                 0.576      0.571    0.572       ms
            256               6.222      6.218    6.219       ms
            512               12.441     12.430   12.432      ms
            Sector Size       NA         1.656    3.274       sec
            Flash Size        0.829      19.117   52.587      sec
Write       1                 0.717      34.198   34.301      ms
            256               5.946      39.309   72.690      ms
            512               11.985     78.760   145.382     ms
            Sector Size       NA         8.985    25.066      sec
            Flash Size        0.772      51.806   56.502      sec

6. DISCUSSION

The different chips that we have chosen to be a part of MStore have differing capabilities and properties. FRAM has very desirable properties, but has a small size. NOR chips are byte addressable, so we can use them to operate on smaller chunks of data, but energy consumption for operations on the NAND may be much better than the measured values for the NOR flashes. NAND flashes have a larger granularity for reads and writes, which might undermine their energy advantage over the NOR flashes. The designer is thus faced with making choices based on these tradeoffs to decide where to put the data.
Abusing operating system terminology, we can divide data into three categories, hot, cold, and warm, based on its usage characteristics. Hot data tends to be updated and accessed very often, and cold data is seldom accessed or updated. Usage of warm data falls in between these two categories. From the energy consumption values in the evaluation section, we can conclude that the FRAM is a perfect candidate for hot data, while the NOR and NAND flashes are best suited to warm and cold data, respectively.
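
The hot, warm, and cold classification can be turned into a simple placement rule. The sketch below is purely illustrative: the function name, the use of update rate as the only temperature signal, and both thresholds are invented for exposition and do not come from our measurements.

```python
# Hypothetical placement policy; the thresholds are invented for illustration.
FRAM_BYTES = 32 * 1024   # capacity of the FM25L256


def choose_chip(updates_per_hour, size_bytes, hot_threshold=60, cold_threshold=1):
    """Pick a storage chip from how often the data is updated."""
    if updates_per_hour >= hot_threshold and size_bytes <= FRAM_BYTES:
        return "FRAM"      # hot data that fits in the 32KB FRAM
    if updates_per_hour <= cold_threshold:
        return "NAND"      # cold data, accessed in large chunks
    return "NOR"           # warm data, or hot data too large for the FRAM
```

A real policy would also need to weigh read frequency and access granularity, which this sketch deliberately leaves out.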
An important observation that we make from the measured values is the superior performance characteristics of
the FRAM chip. Except for its limitations with size, the
read and write times and energy costs are very low. The
FRAM is ideal to store small sized hot data. It could also
be used as an extension to the main memory on the sensor
board to offload data from the RAM when memory constraints kick in. Neighbor tables, other short lived but live
data like packet queues and active meta-data are primary
candidates to be stored in the FRAM.
Unlike the M25P64, the AT26DF161 NOR flash has three different erase granularities of 4, 32, and 64KB. While the datasheets suggest that the AT chip may be a good candidate for storing warm data that is updated in chunks smaller than 64KB, the energy usage in Figure 6 and Table 10 shows that the payoff may not be worth it: erasing a 4KB block on the AT26DF161 costs more energy than erasing a 64KB block on the M25P64 chip. This result could be due either to a bug in our driver code or to the code being unoptimized.
The NAND chip is ideal for collecting large data that does not change often and can be accessed and read in larger chunks. While we could not measure its energy consumption at the time of submission, based on the datasheets and our readings in Tables 7 and 8, we expect these values to be better than those of the NOR flashes.

6.1 Example Applications for MStore

Many existing applications can benefit from the storage hierarchy that MStore provides. More specifically, they can assign each of their data structures to the appropriate memory chip based on its temperature. We discuss two applications that could improve their energy savings and performance using MStore.
TINX: TINX [18] is an indexing scheme that can be used for fast retrieval of archived sensor data. It maintains an index over the actual sensor data and also maintains an in-memory second-level index to optimize searching for the location of the index pages. The first-level index is updated often, and the second-level index is referenced very frequently and is an example of hot data. If the actual sensor data is not edited, it behaves like cold data and can be stored in the NAND memory. The first-level index, which requires byte addressability and finer-grained modifications, is more like warm data and can be stored in either of the NOR chips. The second-level index is a perfect candidate for the FRAM. Manipulating these index structures at page granularity makes TINX waste energy on minor index updates; moving them to the NOR and FRAM would recover a large part of that wasted energy.
Capsule: Capsule [19] exposes storage abstractions for general use in sensor network applications. Implementations of its stack and index objects could be moved to the NOR flashes, while the stream object could be stored on the NAND chip. Stack compaction could use the FRAM to store the pointers. Capsule also allows applications to tolerate software faults and device failures through checkpointing and rollback of storage objects. Whenever a new checkpoint is created, a new entry is made in the root directory, which points to the newly created checkpoint. Such a directory entry needs to be overwritten every time a new checkpoint is created, so the root directory could be maintained in the FRAM. Capsule also implements a memory reclamation scheme through a cleaner task that can run periodically. We believe this cleaner task can be offloaded to the CPLD on the MStore.

7. CONCLUSION

We have presented the design and implementation of MStore, an extension storage board for the mica and telos sensor nodes. MStore includes four non-volatile memory chips with varying characteristics, laid out as a hierarchy that will enable storage-centric research in wireless sensor networks. The enhancements and availability of newer non-volatile chips complement present thinking and designs in distributed sensor networks. We have presented the latency and energy values measured for various operations on these chips and compared them against the datasheet values from the manufacturers. With these values, the TEP 103 compliant drivers written in TinyOS, and the flexibility of the design, researchers can come up with new designs, broaden their design choices, and tune existing designs for sensor network research.

Acknowledgements
We would like to thank Kevin Klues for providing us with
the code to help calibrate the number of clock cycles
required to execute each split-phase operation. We would also
like to thank Prabal Dutta, Gaurav Mathur, and Deepak
Ganesan for their discussions, feedback, and general advice.

8.

REFERENCES

[1] Atmel Corporation, atmelchips.com.
[2] Private communication, Joe Polastre, moteiv.com.
[3] Ramtron International Corporation,
http://www.ramtron.com.
[4] Samsung Corporation, http://www.samsung.com/.
[5] ST Microelectronics, st.com.
[6] TinyOS Extension Proposal 103,
http://www.tinyos.net/tinyos-2.x/doc/html/tep103.html.
[7] Velleman Inc.,
http://www.vellemanusa.com/us/enu/product/view/?id=522377.
[8] Xilinx Inc.,
http://direct.xilinx.com/bvdocs/publications/ds310.pdf.
[9] H. Dai, M. Neufeld, and R. Han. ELF: An efficient
log-structured flash file system for micro sensor nodes. In
SenSys '04: Proceedings of the 2nd international
conference on Embedded networked sensor systems, pages
176-187, New York, NY, USA, 2004. ACM Press.
[10] D. Gay, P. Levis, R. von Behren, M. Welsh, E. Brewer, and
D. Culler. The nesC language: A holistic approach to
networked embedded systems. In SIGPLAN Conference on
Programming Language Design and Implementation
(PLDI '03), June 2003.
[11] V. Handziski, J. Polastre, J.-H. Hauer, C. Sharp,
A. Wolisz, and D. Culler. Flexible hardware abstraction
for wireless sensor networks. In Proceedings of the Second
European Workshop on Wireless Sensor Networks
(EWSN), Feb 2005.
[12] J. Hellerstein, W. Hong, S. Madden, and K. Stanek.
Beyond average: Towards sophisticated sensing with
queries, 2003.
[13] J. Hill and D. E. Culler. Mica: A wireless platform for
deeply embedded networks. IEEE Micro, 22(6):12-24,
Nov/Dec 2002.
[14] J. Hill, R. Szewczyk, A. Woo, P. Levis, K. Whitehouse,
J. Polastre, D. Gay, S. Madden, M. Welsh, D. Culler, and
E. Brewer. TinyOS: An operating system for sensor
networks, 2003. Submitted for publication.
[15] J. W. Hui and D. Culler. The dynamic behavior of a data
dissemination protocol for network programming at scale.
In SenSys '04: Proceedings of the 2nd international
conference on Embedded networked sensor systems, pages
81-94, New York, NY, USA, 2004. ACM Press.
[16] M. Li, D. Ganesan, and P. Shenoy. Presto: Feedback-driven
data management in sensor networks. In Third
USENIX/ACM Symposium on Network Systems Design
and Implementation (NSDI), May 2006.
[17] S. Madden, M. J. Franklin, J. M. Hellerstein, and
W. Hong. TinyDB: An acquisitional query processing
system for sensor networks. Transactions on Database
Systems (TODS), 2005.
[18] A. Mani, M. B. Rajashekhar, and P. Levis. TINX - a tiny
index design for flash memory on wireless sensor devices. In
Proceedings of the Fourth ACM Conference on Embedded
Networked Sensor Systems (SenSys), 2006.
[19] G. Mathur, P. Desnoyers, D. Ganesan, and P. Shenoy.
Capsule: An energy-optimized object storage system for
memory-constrained sensor devices. In Proceedings of the
Fourth ACM Conference on Embedded Networked Sensor
Systems (SenSys), November 2006.
[20] G. Mathur, P. Desnoyers, D. Ganesan, and P. Shenoy.
Ultra-low power data storage for sensor networks. In IPSN
'06: Proceedings of the fifth international conference on
Information processing in sensor networks, pages 374-381,
New York, NY, USA, 2006. ACM Press.
[21] S. Ratnasamy, B. Karp, L. Yin, F. Yu, D. Estrin,
R. Govindan, and S. Shenker. GHT: A geographic hash table
for data-centric storage. In Proceedings of the first ACM
international workshop on Wireless sensor networks and
applications, pages 78-87. ACM Press, 2002.
[22] S. Shenker, S. Ratnasamy, B. Karp, R. Govindan, and
D. Estrin. Data-centric storage in sensornets. SIGCOMM
Comput. Commun. Rev., 33(1):137-142, 2003.
[23] D. Zeinalipour-Yazti, S. Lin, V. Kalogeraki, D. Gunopulos,
and W. A. Najjar. MicroHash: An efficient index structure
for flash-based sensor devices. In 4th USENIX Conference
on File and Storage Technologies (FAST 2005), pages
31-44, December 2005.