
COMMUNICATIONS OF THE ACM
CACM.ACM.ORG 06/2009 VOL.52 NO.06

One Laptop Per Child: Vision vs. Reality
Hard-Disk Drives: The Good, The Bad, and the Ugly
How CS Serves The Developing World
Network Front-End Processors
The Claremont Report On Database Research
Autonomous Helicopters

Association for
Computing Machinery
Think Parallel.....
It’s not just what we make.
It’s what we make possible.
Advancing Technology Curriculum
Driving Software Evolution
Fostering Tomorrow’s Innovators

Learn more at: www.intel.com/thinkparallel

springer.com

Noteworthy Computer Science Journals

Autonomous Robots
G. Sukhatme, University of Southern California, Viterbi School of Engineering, Dept. Computer Science
Autonomous Robots reports on the theory and applications of robotic systems capable of some degree of self-sufficiency. It features papers that include performance data on actual robots in the real world. The focus is on the ability to move and be self-sufficient, not on whether the system is an imitation of biology. Of course, biological models for robotic systems are of major interest to the journal since living systems are prototypes for autonomous behavior.
High Impact Factor in Robotics and AI
ISSN 0929-5593 (print version)
ISSN 1573-7527 (electronic version)
Journal no. 10514

Biological Cybernetics
W. Senn, Universität Bern, Physiologisches Institut; J. Rinzel, National Institutes of Health (NIH), Dept. Health Education & Welfare; J. L. van Hemmen, TU München, Abt. Physik
Biological Cybernetics is an interdisciplinary medium for experimental, theoretical and application-oriented aspects of information processing in organisms, including sensory, motor, cognitive, and ecological phenomena. Under the main aspects of performance and function of systems, emphasis is laid on communication between life sciences and technical/theoretical disciplines.
ISSN 0340-1200 (print version)
ISSN 1432-0770 (electronic version)
Journal no. 422

Personal and Ubiquitous Computing (ACM)
P. Thomas, Univ. Coll. London Interaction Centre
Personal and Ubiquitous Computing publishes peer-reviewed international research on handheld, wearable and mobile information devices and the pervasive communications infrastructure that supports them to enable the seamless integration of technology and people in their everyday lives. The journal carries compellingly-written, timely and accessible contributions that illuminate the technological, social and design challenges of personal and ubiquitous computing technologies.
ISSN 1617-4909 (print version)
ISSN 1617-4917 (electronic version)
Journal no. 779

Data Mining and Knowledge Discovery
G. I. Webb, Monash University, School of Computer Science & Software Engineering
The premier technical publication in the field, Data Mining and Knowledge Discovery is a resource collecting relevant common methods and techniques and a forum for unifying the diverse constituent research communities. The journal publishes original technical papers in both the research and practice of data mining and knowledge discovery, surveys and tutorials of important areas and techniques, and detailed descriptions of significant applications.
High Impact Factor in Information Systems and AI
ISSN 1384-5810 (print version)
ISSN 1573-756X (electronic version)
Journal no. 10618

Scientometrics
T. Braun, Lorand Eötvös University, Inst. Inorganic and Analytical Chemistry
Scientometrics is concerned with the quantitative features and characteristics of science. Emphasis is placed on investigations in which the development and mechanism of science are studied by statistical mathematical methods. The journal publishes original studies, short communications, preliminary reports, review papers, letters to the editor and book reviews on scientometrics.
High Impact Factors in Computer Science, Interdisciplinary Applications and Information Science & Library Science
ISSN 0138-9130 (print version)
ISSN 1588-2861 (electronic version)
Journal no. 11192

Cybernetics and Systems Analysis
I. V. Sergienko, Acad. Science Ukraine, Glushkov Institute Cybernetics
Cybernetics and Systems Analysis publishes articles on: software and hardware; algorithm theory and languages; programming and programming theory; optimization; operations research; digital and analog methods; hybrid systems; machine-machine and man-machine interfacing.
ISSN 1060-0396 (print version)
ISSN 1573-8337 (electronic version)
Journal no. 10559

Easy Ways to Order
For the Americas: Write: Springer Order Department, PO Box 2485, Secaucus, NJ 07096-2485, USA. Call: (toll free) 1-800-SPRINGER. Fax: 1-201-348-4505. Email: journals-ny@springer.com
Outside the Americas: Write: Springer Customer Service Center GmbH, Haberstrasse 7, 69126 Heidelberg, Germany. Call: +49 (0) 6221-345-4303. Fax: +49 (0) 6221-345-4229. Email: subscriptions@springer.com
COMMUNICATIONS OF THE ACM

Departments

5 ACM-W Letter
ACM-W Celebrates Women in Computing
By Elaine Weyuker

9 Letters To The Editor
Share the Threats

10 blog@CACM
Speech-Activated User Interfaces and Climbing Mt. Exascale
Tessa Lau discusses why she doesn’t use the touch screen on her in-car GPS unit anymore and Daniel Reed considers the future of exascale computing.

12 CACM Online
Making That Connection
By David Roman

27 Calendar

101 Careers

Last Byte

103 Puzzled
Solutions and Sources
By Peter Winkler

104 Future Tense
Webmind Says Hello
By Robert J. Sawyer

News

13 Micromedicine to the Rescue
Medical researchers have long dreamed of “magic bullets” that go directly where they are needed. With micromedicine, this dream could become a life-saving reality.
By Don Monroe

16 Content Control
Entertainment businesses say digital rights management prevents the theft of their products, but access control technologies have been a uniform failure when it comes to preventing piracy. Fortunately, change is on the way.
By Leah Hoffmann

18 Autonomous Helicopters
Researchers are improving unmanned helicopters’ capabilities to address regulatory requirements and commercial uses.
By Gregory Goth

21 Looking Backward and Forward
CRA’s Computing Community Consortium hosted a day-long symposium to discuss the important computing advances of the last several decades and how to sustain that track record of innovation.
By Bob Violino

Viewpoints

22 Privacy and Security
Answering the Wrong Questions Is No Answer
Asking the wrong questions when building and deploying systems results in systems that cannot be sufficiently protected against the threats they face.
By Eugene H. Spafford

25 Inside Risks
Reducing Risks of Implantable Medical Devices
A prescription to improve security and privacy of pervasive health care.
By Kevin Fu

28 The Profession of IT
Beyond Computational Thinking
If we are not careful, our fascination with “computational thinking” may lead us back into the trap we are trying to escape.
By Peter J. Denning

31 Viewpoint
Why “Open Source” Misses the Point of Free Software
Decoding the important differences in terminology, underlying philosophy, and value systems between two similar categories of software.
By Richard Stallman

34 Kode Vicious
Obvious Truths
How to determine when to put the brakes on late-running projects and untested software patches.
By George V. Neville-Neil

Photograph courtesy of the Computing Research Association
Association for Computing Machinery
Advancing Computing as a Science & Profession

2 COMMUNICATIONS OF THE ACM | JUNE 2009 | VOL. 52 | NO. 6


Practice

38 Hard-Disk Drives: The Good, the Bad, and the Ugly
New drive technologies and increased capacities create new categories of failure modes that will influence system designs.
By Jon Elerath

46 Network Front-end Processors, Yet Again
The history of NFE processors sheds light on the trade-offs involved in designing network stack software.
By Mike O’Dell

51 Whither Sockets?
High bandwidth, low latency, and multihoming challenge the sockets API.
By George V. Neville-Neil

Article development led by queue.acm.org

Contributed Articles

56 The Claremont Report on Database Research
By Rakesh Agrawal, Anastasia Ailamaki, Philip A. Bernstein, Eric A. Brewer, Michael J. Carey, Surajit Chaudhuri, AnHai Doan, Daniela Florescu, Michael J. Franklin, Hector Garcia-Molina, Johannes Gehrke, Le Gruenwald, Laura M. Haas, Alon Y. Halevy, Joseph M. Hellerstein, Yannis E. Ioannidis, Hank F. Korth, Donald Kossmann, Samuel Madden, Roger Magoulas, Beng Chin Ooi, Tim O’Reilly, Raghu Ramakrishnan, Sunita Sarawagi, Michael Stonebraker, Alexander S. Szalay, and Gerhard Weikum

66 One Laptop Per Child: Vision vs. Reality
By Kenneth L. Kraemer, Jason Dedrick, and Prakul Sharma

Review Articles

74 How Computer Science Serves the Developing World
By M. Bernardine Dias and Eric Brewer

Research Highlights

82 Technical Perspective
Reframing Security for the Web
By Andrew Myers

83 Securing Frame Communication in Browsers
By Adam Barth, Collin Jackson, and John C. Mitchell

92 Technical Perspective
Software and Hardware Support for Deterministic Replay of Parallel Programs
By Norman P. Jouppi

93 Two Hardware-based Approaches for Deterministic Multiprocessor Replay
By Derek R. Howe, Pablo Montesinos, Luis Ceze, Mark D. Hill, and Josep Torrellas

Virtual Extension

As with all magazines, page limitations often prevent the publication of articles that might otherwise be included in the print edition. To ensure timely publication, ACM created Communications’ Virtual Extension (VE). VE articles undergo the same rigorous review process as those in the print edition and are accepted for publication on their merit. These articles are now available to ACM members in the Digital Library.

Deriving Mutual Benefits from Offshore Outsourcing
Amar Gupta

Advancing Information Technology in Health Care
Steven M. Thompson and Matthew D. Dean

The Challenge of Epistemic Divergence in IS Development
Mark Lycett and Chris Partridge

Hyperlinking the Work for Self-Management of Flexible Workflows
Jonghun Park and Kwanho Kim

Re-Tuning the Music Industry—Can They Re-Attain Business Resonance?
Sudip Bhattacharjee, Ram D. Gopal, James R. Marsden, and Ramesh Sankaranarayanan

A Holistic Framework for Knowledge Discovery and Management
Dursun Delen and Suliman Al-Hawamdeh

Forensics of Computers and Handheld Devices: Identical or Fraternal Twins?
Nena Lim and Anne Khoo

Technical Opinion
Leveraging First-Mover Advantages in Internet-based Consumer Services
T.P. Liang, Andrew J. Czaplewski, Gary Klein, and James J. Jiang

About the Cover:
The One Laptop Per Child vision is being overwhelmed by the reality of business, politics, logistics, and competing interests worldwide. The photo illustration on the cover is adapted from OLPC photos taken in the Gobi Desert.
Illustration by Superbrothers

COMMUNICATIONS OF THE ACM
A monthly publication of ACM Media

Communications of the ACM is the leading monthly print and online magazine for the computing and information technology fields.
Communications is recognized as the most trusted and knowledgeable source of industry information for today’s computing professional.
Communications brings its readership in-depth coverage of emerging areas of computer science, new trends in information technology,
and practical applications. Industry leaders use Communications as a platform to present and debate various technology implications,
public policies, engineering challenges, and market trends. The prestige and unmatched reputation that Communications of the ACM
enjoys today is built upon a 50-year commitment to high-quality editorial content and a steadfast dedication to advancing the arts,
sciences, and applications of information technology.

ACM, the world’s largest educational and scientific computing society, delivers resources that advance computing as a science and profession. ACM provides the computing field’s premier Digital Library and serves its members and the computing profession with leading-edge publications, conferences, and career resources.

Executive Director and CEO
John White
Deputy Executive Director and COO
Patricia Ryan
Director, Office of Information Systems
Wayne Graves
Director, Office of Financial Services
Russell Harris
Director, Office of Membership
Lillian Israel
Director, Office of SIG Services
Donna Cappo

ACM COUNCIL
President
Wendy Hall
Vice-President
Alain Chesnais
Secretary/Treasurer
Barbara Ryder
Past President
Stuart I. Feldman
Chair, SGB Board
Alexander Wolf
Co-Chairs, Publications Board
Ronald Boisvert, Holly Rushmeier
Members-at-Large
Carlo Ghezzi; Anthony Joseph; Mathai Joseph; Kelly Lyons; Bruce Maggs; Mary Lou Soffa;
SGB Council Representatives
Norman Jouppi; Robert A. Walker; Jack Davidson

PUBLICATIONS BOARD
Co-Chairs
Ronald F. Boisvert and Holly Rushmeier
Board Members
Gul Agha; Michel Beaudouin-Lafon; Jack Davidson; Nikil Dutt; Carol Hutchins; Ee-Peng Lim; M. Tamer Ozsu; Vincent Shen; Mary Lou Soffa; Ricardo Baeza-Yates

ACM U.S. Public Policy Office
Cameron Wilson, Director
1100 Seventeenth St., NW, Suite 50
Washington, DC 20036 USA
T (202) 659-9711; F (202) 667-1066

Computer Science Teachers Association
Chris Stephenson
Executive Director
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA
T (800) 401-1799; F (541) 687-1840

Association for Computing Machinery (ACM)
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA
T (212) 869-7440; F (212) 869-0481

STAFF

GROUP PUBLISHER
Scott E. Delman
publisher@cacm.acm.org

Executive Editor
Diane Crawford
Managing Editor
Thomas E. Lambert
Senior Editor
Andrew Rosenbloom
Senior Editor/News
Jack Rosenberger
Web Editor
David Roman
Editorial Assistant
Zarina Strakhan
Rights and Permissions
Deborah Cotton

Art Director
Andrij Borys
Associate Art Director
Alicia Kubista
Assistant Art Director
Mia Angelica Balaquiot
Production Manager
Lynn D’Addesio
Director of Media Sales
Jennifer Ruzicka
Marketing & Communications Manager
Brian Hebert
Public Relations Coordinator
Virgina Gold
Publications Assistant
Emily Eng

Columnists
Alok Aggarwal; Phillip G. Armour; Martin Campbell-Kelly; Michael Cusumano; Peter J. Denning; Shane Greenstein; Mark Guzdial; Peter Harsha; Leah Hoffmann; Mari Sako; Pamela Samuelson; Gene Spafford; Cameron Wilson

CONTACT POINTS
Copyright permission
permissions@cacm.acm.org
Calendar items
calendar@cacm.acm.org
Change of address
acmcoa@cacm.acm.org
Letters to the Editor
letters@cacm.acm.org

WEBSITE
http://cacm.acm.org

AUTHOR GUIDELINES
http://cacm.acm.org/guidelines

ADVERTISING
ACM ADVERTISING DEPARTMENT
2 Penn Plaza, Suite 701, New York, NY 10121-0701
T (212) 869-7440
F (212) 869-0481

Director of Media Sales
Jennifer Ruzicka
jen.ruzicka@hq.acm.org

Media Kit acmmediasales@acm.org

EDITORIAL BOARD

EDITOR-IN-CHIEF
Moshe Y. Vardi
eic@cacm.acm.org

NEWS
Co-chairs
Marc Najork and Prabhakar Raghavan
Board Members
Brian Bershad; Hsiao-Wuen Hon; Mei Kobayashi; Rajeev Rastogi; Jeannette Wing

VIEWPOINTS
Co-chairs
Susanne E. Hambrusch; John Leslie King; J Strother Moore
Board Members
P. Anandan; William Aspray; Stefan Bechtold; Judith Bishop; Soumitra Dutta; Stuart I. Feldman; Peter Freeman; Seymour Goodman; Shane Greenstein; Mark Guzdial; Richard Heeks; Richard Ladner; Susan Landau; Carlos Jose Pereira de Lucena; Helen Nissenbaum; Beng Chin Ooi; Loren Terveen

PRACTICE
Chair
Stephen Bourne
Board Members
Eric Allman; Charles Beeler; David J. Brown; Bryan Cantrill; Terry Coatta; Mark Compton; Benjamin Fried; Pat Hanrahan; Marshall Kirk McKusick; George Neville-Neil

The Practice section of the CACM Editorial Board also serves as the Editorial Board of .

CONTRIBUTED ARTICLES
Co-chairs
Al Aho and Georg Gottlob
Board Members
Yannis Bakos; Gilles Brassard; Alan Bundy; Peter Buneman; Ghezzi Carlo; Andrew Chien; Anja Feldmann; Blake Ives; James Larus; Igor Markov; Gail C. Murphy; Shree Nayar; Lionel M. Ni; Sriram Rajamani; Jennifer Rexford; Marie-Christine Rousset; Avi Rubin; Abigail Sellen; Ron Shamir; Marc Snir; Larry Snyder; Manuela Veloso; Michael Vitale; Wolfgang Wahlster; Andy Chi-Chih Yao; Willy Zwaenepoel

RESEARCH HIGHLIGHTS
Co-chairs
David A. Patterson and Stuart J. Russell
Board Members
Martin Abadi; Stuart K. Card; Deborah Estrin; Shafi Goldwasser; Maurice Herlihy; Norm Jouppi; Andrew B. Kahng; Linda Petzold; Michael Reiter; Mendel Rosenblum; Ronitt Rubinfeld; David Salesin; Lawrence K. Saul; Guy Steele, Jr.; Gerhard Weikum; Alexander L. Wolf

WEB
Co-chairs
Marti Hearst and James Landay
Board Members
Jason I. Hong; Jeff Johnson; Greg Linden; Wendy E. MacKay

BPA Audit Pending

ACM Copyright Notice
Copyright © 2009 by Association for Computing Machinery, Inc. (ACM). Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from permissions@acm.org or fax (212) 869-0481.

For other copying of articles that carry a code at the bottom of the first or last page or screen display, copying is permitted provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center; www.copyright.com.

Subscriptions
Annual subscription cost is included in the society member dues of $99.00 (for students, cost is included in $42.00 dues); the nonmember annual subscription rate is $100.00.

ACM Media Advertising Policy
Communications of the ACM and other ACM Media publications accept advertising in both print and electronic formats. All advertising in ACM Media publications is at the discretion of ACM and is intended to provide financial support for the various activities and services for ACM members. Current Advertising Rates can be found by visiting http://www.acm-media.org or by contacting ACM Media Sales at (212) 626-0654.

Single Copies
Single copies of Communications of the ACM are available for purchase. Please contact acmhelp@acm.org.

COMMUNICATIONS OF THE ACM (ISSN 0001-0782) is published monthly by ACM Media, 2 Penn Plaza, Suite 701, New York, NY 10121-0701. Periodicals postage paid at New York, NY 10001, and other mailing offices.

POSTMASTER
Please send address changes to
Communications of the ACM
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA

Printed in the U.S.A.

acm-w letter

DOI:10.1145/1516046.1516047 Elaine Weyuker

ACM-W Celebrates Women in Computing

Computer science is no longer the hot, high-enrollment field it once was.

This is not news. While many suggestions have been made for increasing enrollments, it is unlikely that computer science will ever be as vibrant as it could be, and should be, as long as a large portion of the talent pool remains underrepresented. After all, if we are missing the best and the brightest of a group who can offer exciting ideas that would enrich the field, computer science suffers. In addition, different groups often present different perspectives, a scenario completely lost when we do not encourage diversity.

With this in mind, the mission of the ACM Women’s Council (ACM-W) is to inform and support women in computing. Since ACM is an international organization, this means developing programs with a worldwide reach, with something for each of ACM’s very broad constituencies: K–12 students, undergraduates at liberal arts and research institutions, master’s and Ph.D. students, faculty from all types of institutions, and women in industry and government working as computer practitioners and researchers. Increasingly, we strive to partner both with other segments within ACM and other organizations dedicated to improving gender diversity.

Some of our active programs include scholarships to help women students attend research conferences. This effort is not aimed at the advanced Ph.D. student who has already committed to a career in academia or industrial research. Rather we look to support the undergraduate woman by giving her a chance to see the types of options available and encourage her to continue on to graduate school. Similarly we hope to encourage the master’s student to aim for a Ph.D. We offer up to 20 $500 scholarships per year. Moreover, we have recently asked the ACM’s Special Interest Groups (SIGs) to partner with us by offering scholarship recipients complimentary registration as well as provide conference mentors to help them learn the ropes. We are thrilled by the response we have received from many of the SIGs.

Another program involving SIG cooperation is our Athena Lecturer Award honoring the most outstanding women scholars. It was established to address the fact that women are often overlooked when nominations are considered for advanced membership grades or awards. The goal of the Athena Lecturer Award is to celebrate women’s scholarship and technical contributions to the field as well as increase the visibility of women scholars. Rather than asking for individual nominations, each SIG is invited to nominate their most outstanding women scholars. By using this format, we encourage SIGs to think about promoting women in the field, and hopefully remember these women when they are nominating people for other awards or selecting keynote speakers or program chairs for future conferences.

Many readers will be familiar with the Grace Hopper Celebration of Women. To keep the Hopper momentum going throughout the year, ACM-W offers regional Hopper-like events designed to attract attendees within a two-hour driving radius of each other. Not only does this make it relatively inexpensive to attend meetings since students and faculty often travel together, the proximity also helps establish and maintain a local community of women pursuing a common goal. We have sponsored quite a number of these meetings both within the U.S. and Australia, with one being planned in Turkey.

Another unique ACM-W initiative is the Ambassador program in which a woman serves as the Ambassador from her country and shares information about the climate there for women in computing. At times we have had representatives from six different continents. We are now developing our first internationally distributed program aimed at attracting middle school girls to computer science by adapting a successful program to several different cultures.

This is just a sampling of the many programs within ACM-W created to promote and further advance women in the computing field. Readers are encouraged to visit our Web site at http://women.acm.org to learn about the full range of programs and initiatives offered. ACM-W is an all-volunteer organization open to anyone interested in improving gender diversity. If you see a project that interests you, please consider volunteering. If you have an idea for a new project, let us know. Take a look at our newsletter to see project details, read interviews with outstanding women, and learn about upcoming events.

Diversity is not the problem of the underrepresented group. It is everyone’s problem. If we want our field to grow and flourish, we need the contribution of talented people of all types.

Elaine Weyuker is chair of ACM-W and is a researcher at AT&T Labs specializing in empirical software engineering and testing research.

© 2009 ACM 0001-0782/09/0600 $10.00




ACM Senior Members

Additional ACM Senior Members will be included in an upcoming issue. http://seniormembers.acm.org



THE ACM A.M. TURING AWARD

BY THE COMMUNITY...
FROM THE COMMUNITY...
FOR THE COMMUNITY...

ACM, INTEL, AND GOOGLE CONGRATULATE BARBARA H. LISKOV
FOR HER FOUNDATIONAL INNOVATIONS IN PROGRAMMING LANGUAGE DESIGN THAT HAVE MADE SOFTWARE MORE RELIABLE AND HER MANY CONTRIBUTIONS TO BUILDING AND INFLUENCING THE PERVASIVE COMPUTER SYSTEMS THAT POWER DAILY LIFE.

“Intel is a proud sponsor of the ACM A.M. Turing Award, and is pleased to join the community in congratulating this year’s recipient, Professor Barbara Liskov. Her contributions lie at the foundation of all modern programming languages and complex distributed software. Barbara’s work consistently reflects rigorous problem formulation and sound mathematics, a potent combination she used to create lasting solutions.”

Andrew A. Chien
Vice President, Corporate Technology Group
Director, Intel Research

“Google is delighted to help recognize Professor Liskov for her research contributions in the areas of data abstraction, modular architectures, and distributed computing fundamentals, areas of fundamental importance to Google. We are proud to be a sponsor of the ACM A.M. Turing Award to recognize and encourage the research that is essential not only to computer science, but to all the fields that depend on its continued advancement.”

Alfred Z. Spector
Vice President, Research and Special Initiatives, Google

For more information see www.intel.com/research.
For more information, see http://www.google.com/corporate/index.html and http://research.google.com/.

Financial support for the ACM A.M. Turing Award is provided by Intel Corporation and Google.
letters to the editor

DOI:10.1145/1516046.1516049

Share the Threats

OTHMAN EL MOULAT’S comment tual machines, aiming to verify or refute fining an effective dependability case.
“What Role for Computer a problem a customer might be having Is this correct?
Science in the War on Ter- and possibly provide a workaround or CJ Fearnley, Upper Darby, PA
ror?” (Apr. 2009) concern- new build of the software. If this sce-
ing the article “The Topolo- nario turns out to be common, we roll it
gy of Dark Networks” by Jennifer Xu and into our testing sandbox; Author’s Response:
Hsinchun Chen (Oct. 2008) that the views Project services. When trying to re- Requirements traditionally break the
and articles in Communications should motely configure and build a solution behavior of a system into a collection of
have no bearing on or bias toward any for a customer, we first build it in a vir- functions, each describing in full some
agenda, political or religious, is a point tual machine, then apply the solution feature of the system. A radiotherapy
well taken. However, in light of the se- and test. This process also greatly im- machine might, for example, offer
curity breaches occurring throughout proves delivery of the solution; and functions to recall a patient’s prescribed
the digital world, any information that Software demonstration. We build dose from a database; set the equipment
exposes threats should indeed be well our demos in a virtual machine, mak- to deliver a given dose; activate the
received and published wherever it is ing it much easier for us to get them out equipment; and so on. Prioritizing functions
relevant to technologists and security to field personnel. isn’t very useful, because the critical
specialists, as in Communications. Jerry Walter, Troy, OH aspects of a system typically involve many
It is reasonable to suspect that poten- functions, though often not in their entirety.
tial terrorist cells or factions willingly A property, on the other hand, describes
and wantonly seek ways to destroy West- How to Define the Granularity an expected observation of the system’s
ern technologies and organizations. An of Properties and Functions behavior and can be expressed at any level
article aimed at exposing threats or edu- I was confused about the discussion of granularity: that, for example, some
cating the public on future threats does of properties and functions in Daniel of the dose delivered to a patient never
not in any way target a specific race, Jackson’s review article “A Direct Path exceeds some fixed limit; that a patient
creed, or religion. to Dependable Software” (Apr. 2009). receives his or her prescribed dose within
I applaud the authors of “The Topol- Jackson seemed to be saying that prop- some tolerance; that the dose delivered
ogy of Dark Networks” and hope Com- erties are more fine-grain than func- and the dose logged always match; and
munications continues to keep us up to tions yet also that a property cuts across so on. So a property can at the same
date with factual articles of this nature. several functions at the same time. time be more fine-grain than a single
Organizations that are concerned with Doesn’t this imply that properties are function (since it describes the function
their own beliefs, traditions, and ob- coarse-grain, assuming they transcend only partially) and cut across multiple
jectives should be willing to transpar- several functions? functions.
ently share their interests with the rest Trying to resolve my questions with Daniel Jackson, Cambridge, MA
of the world. the help of Webster’s dictionary, I
John Orlock, IL learned that a function is a “factor” and a Corrections
property is any attribute or characteristic. In the Q&A “Our Dame Commander”
So functions and properties can be both (Apr. 2009), Leah Hoffmann described
Virtualization Still Evolving fine- and coarse-grain, depending on Wendy Hall as “the third female presi-
Kirk L. Kroeker’s news article “The Evo- the assumptions of abstraction inher- dent of ACM.” Hall is the sixth, pre-
lution of Virtualization” (Mar. 2009) ent in the mind of the author. ceded by: Jean Sammet (1974–1976),
took a limited view of its subject. I con- Does Jackson view a “function” as a Adele Goldberg (1984–1986), Gwen Bell
trast it with how my company uses vir- modularity construct in a programming (1992–1994), Barbara Simons (1998–
tual machines for several quite practi- language? Does he mean that properties 2000), and Maria Klawe (2002–2004).
cal purposes: are those factors or attributes (“func-
Software testing. Rather than build a tions” if you will) that are independent The photographs of the Rebooting
test environment, then rebuild it after a of the software’s special-case imple- Computing Summit (Apr. 2009) were
series of tests, we set it up with a set of mentation? taken by Richard P. Gabriel (page 2) and
baseline virtual machines (perhaps cli- I may still be confused, but trying by Mary Bronzan (page 19).
ent/server), then run our tests. This way to infer Jackson’s meaning led me to
when we finish testing, we copy back conclude the following: Fine-grain at- Communications welcomes your opinion. To submit a Letter
over the baseline virtual machine and tention to the software’s behavior-level to the Editor, please limit your comments to 500 words or
less and send to letters@cacm.acm.org.
are ready for the next round of testing; characteristics (including properties,
Customer support. We look to mimic functions, or whatever abstractions a
customer configurations in a set of vir- developer is using) is important in de- © 2009 ACM 0001-0782/09/0600 $10.00

JUNE 2009 | VOL. 52 | NO. 6 | COMMUNICATIONS OF THE ACM 9


blog@cacm

The Communications Web site, cacm.acm.org, features 13 bloggers in the BLOG@CACM community. In each issue of Communications, we’ll publish excerpts from selected posts, plus readers’ comments.

DOI:10.1145/1516046.1516072 cacm.acm.org/blogs/blog-cacm

Speech-Activated User Interfaces and Climbing Mt. Exascale

Tessa Lau discusses why she doesn’t use the touch screen on her in-car GPS unit anymore and Daniel Reed considers the future of exascale computing.

From Tessa Lau’s “Hello, Computer”

Four years ago when I bought my first in-car Global Positioning System (GPS) unit, it felt like a taste of the future. The unit knew where I was, and regardless of how many wrong turns I made, it could tell me how to get where I wanted to go. It was the ultimate adaptive interface: No matter where I started, it created a customized route that would lead me to my destination.

Alas, my first GPS unit met an untimely end in a theft involving a dark night, an empty street, and a smashed window.

My new GPS, a Garmin nüvi 850, comes with a cool new feature: speech-activated controls.

Speech recognition brings a new dimension to the in-car human-computer interface. When you’re driving, you’re effectively partially blind and have no hands. Being able to talk to the computer and instruct it using nothing but your voice is amazingly empowering, and makes me excited about the future of voice-based interfaces.

The nüvi’s interface is simple and well designed. There’s a wireless, button-activated microphone that you mount to your steering wheel. When you activate the mic, a little icon appears on the GPS screen to indicate that it’s listening, and the GPS plays a short “I’m listening” tone. You can speak the names of any buttons that appear on the screen or one of the always-active global commands (e.g., “main menu,” “music player,” or “go home”). Musical tones indicate whether the GPS has successfully interpreted your utterance. If it recognized your command, it takes you to the next screen and verbally prompts you for the next piece of information (e.g., the street address of your destination). Most of the common GPS functionality can be activated via spoken confirmations without even looking at the screen.

Lists (e.g., of restaurant names) are annotated with numbers so you only have to speak the number of the item you want from the list. However, it also seems to correctly recognize the spoken version of anything in the list, even if it’s not displayed on the current screen (e.g., the name of an artist in the music player).

In my tests it’s been surprisingly accurate at interpreting my speech, despite the generally noisy environment on the road.

What has surprised me the most about this interface is that the voice-based control is so enjoyable and fast that I don’t use the touch screen anymore. Speech recognition, which had been in the realm of artificial intelligence for decades, has finally matured to the point where it’s now reliable enough for use in consumer devices.

Part of the power of the speech-activated user interface comes from the ability to jump around in the interface by spoken word. Instead of having to navigate through several different screens by clicking buttons, you can jump straight to the desired screen by speaking its name. It’s reminiscent of the difference between graphical user interfaces (GUIs) and command lines; GUIs are easier to learn, but once you master them, command lines offer more efficiency and power. As is the case with command lines, it takes some experimentation to discover what commands are available when; I’m still learning about my GPS and how to control it more effectively.

Kudos, Garmin, you’ve done a great


job with the nüvi 850. I can’t wait to see what the future will bring! (Voice-based access to email on the road? It seems almost within reach.)

Disclaimer: The views expressed here do not necessarily reflect the views of my employer, ACM, or any other entity besides myself.

Reader’s comment:
Information I’ve read lately on the topic of speech recognition indicates that a device’s ability to correctly recognize commands depends in large measure on the quietness of the environment. I have often found that voice systems on my cell phone don’t work well unless I find a quiet place to access them. So it is good to hear that Garmin has found an effective way to interpret commands while driving, an environment that you note can be noisy.
As you speak of future enhancements, it brings up the issue of what drivers should be able to do while on the road. Multitasking is great, but I’m not sure email while driving is such a good idea…
Debra Gouchy

From Daniel Reed’s “When Petascale Is Just Too Slow”

It seems as if it were just yesterday when I was at the National Center for Supercomputing Applications and we deployed a one-teraflop Linux cluster as a national resource. We were as excited as proud parents by the configuration: 512 dual-processor nodes (1 GHz Intel Pentium III processors), a Myrinet interconnect, and (gasp) a stunning 5 terabytes of RAID storage. It achieved a then-astonishing 594 gigaflops on the High-Performance LINPACK benchmark, and was ranked 41st on the Top500 list.

The world has changed since then. We hit the microprocessor power (and clock rate) wall, birthing the multicore era; vector processing returned incognito, renamed as graphics processing units (GPUs); terabyte disks are available for a pittance at your favorite consumer electronics store; and the top-ranked system on the Top500 list broke the petaflop barrier last year, built from a combination of multicore processors and gaming engines. The last is interesting for several reasons, both sociological and technological.

Petascale Retrospective
On the sociological front, I remember participating in the first petascale workshop at Caltech in the 1990s. Seymour Cray, Burton Smith, and others were debating future petascale hardware and architectures, a second group was debating device technologies, a third was discussing application futures, and a final group of us was down the hall debating future software architectures. All this was prelude to an extended series of architecture, system software, programming models, algorithms, and applications workshops that spanned several years and multiple retreats.

At the time, most of us were convinced that achieving petascale performance within a decade would require new architectural approaches and custom designs, along with radically new system software and programming tools. We were wrong, or at least so it superficially seems. We broke the petascale barrier in 2008, using commodity x86 microprocessors and GPUs, InfiniBand interconnects, minimally modified Linux, and the same message-based programming model we have been using for the past 20 years.

However, as peak system performance has risen, the number of users has declined. Programming massively parallel systems is not easy, and even terascale computing is not routine. Horst Simon explained this with an interesting analogy, which I have taken the liberty of elaborating slightly. The ascent of Mt. Everest by Edmund Hillary and Tenzing Norgay in 1953 was heroic. Today, amateurs still die each year attempting to replicate the feat. We may have scaled Mt. Petascale, but we are far from making it pleasant or even a routine weekend hike.

This raises the real question: Were we wrong in believing different hardware and software approaches would be needed to make petascale computing a reality? I think we were absolutely right that new approaches were needed. However, our recommendations for a new research and development agenda were not realized. At least in part, I believe this is because we have been loath to mount the integrated research and development needed to change our current hardware/software ecosystem and procurement models.

Exascale Futures
Evolution or revolution, it’s the persistent question. Can we build reliable exascale systems from extrapolations of current technology or will new approaches be required? There is no definitive answer as almost any approach might be made to work at some level with enough heroic effort. The bigger question is: What design would enable the most breakthrough scientific research in a reliable and cost-effective way?

My personal opinion is that we need to rethink some of our dearly held beliefs and take a different approach. The degree of parallelism required at exascale, even with future many-core designs, will challenge even our most heroic application developers, and the number of components will raise new reliability and resilience challenges. Then there are interesting questions about many-core memory bandwidth, achievable system bisection bandwidth, and I/O capability and capacity. There are just a few programmability issues as well!

I believe it is time for us to move from our deus ex machina model of explicitly managed resources to a fully distributed, asynchronous model that embraces component failure as a standard occurrence. To draw a biological analogy, we must reason about systemic organism health and behavior rather than cellular signaling and death, and not allow cell death (component failure) to trigger organism death (system failure). Such a shift in world view has profound implications for how we structure the future of international high-performance computing research, academic/government/industrial collaborations, and system procurements.

Tessa Lau is a research staff member at IBM Almaden Research Center in San Jose, CA. Daniel Reed is director of scalable and multicore systems at Microsoft Research in Redmond, WA.

© 2009 ACM 0001-0782/09/0500 $10.00

cacm online

DOI:10.1145/1516046.1516050 David Roman

Making that Connection

The goal of holding readers’ attention has made provocation a timeworn editorial strategy. Communications doesn’t resort to screaming headlines like most storefront fare, but it does strive to publish eye-catching imagery for its must-read articles. This month’s cover story, “One Laptop Per Child: Vision vs. Reality,” with its title’s inherent tension, is a case in point.

Communications also aims for authority; its articles can be a beginning as much as an end. The “Viewpoints” pages, for example, may introduce unsettled and unsettling ideas that prompt readers to react and respond not only to the editorial but to each other. Indeed, the recent debate on network neutrality, first presented in the pages of the February issue, continued into the May issue, and it’s hardly over yet.

You can be a part of this debate at cacm.acm.org. Communications’ Web site invites and lends itself to quick feedback via the “User Comments” feature that allows a continued conversation about a topic. Reachable from the “Tools for Readers” at the top right of each article page, and at the bottom of every article page, the feature requires a simple sign-in (so we can follow who’s speaking). From there, readers are welcome to present what Editor-in-Chief Moshe Vardi calls “well-reasoned and well-argued opinions” to keep the discussion lively. I encourage all readers to start or join an online discussion.

Wanted: Expert Bloggers
Ever consider yourself a blogger? If so, we should talk.

Communications wants to expand its ever-evolving roster of expert bloggers. Experience is a plus but credentials and passion are equally important. The level of commitment we require is open-ended; if you are willing to work with us, we will accommodate your schedule. If you are interested but cannot add it to your workload at the moment, we could put you on our future schedule or at least get you on our radar.

In addition, if you follow the blogs of someone you consider a good fit for Communications, we’d like to hear your recommendations. Contact us at blog@cacm.acm.org.

ACM Member News

EMER WINS ECKERT-MAUCHLY AWARD
ACM and the IEEE Computer Society will jointly present the Eckert-Mauchly Award to Joel Emer, director of microarchitecture research at Intel, for pioneering contributions to performance analysis, modeling methodologies, and design innovations in several significant industry microprocessors. Emer developed quantitative methods including measurement of real machines, analytical modeling, and simulation techniques that are now widely employed to evaluate the performance of complex computer processors. Emer will receive the 2009 Eckert-Mauchly Award, the most prestigious award in the computer architecture community, at the International Symposium on Computer Architecture, June 20–24, in Austin, TX.

EGGERS RECEIVES ATHENA LECTURER AWARD
Susan Eggers, a professor of computer science and engineering at the University of Washington, has won ACM’s 2009–2010 Athena Lecturer Award. Eggers’ work on computer architecture and experimental performance analysis led to the development of Simultaneous Multithreading, the first commercially viable multithreaded architecture. This technique improves the overall efficiency of certain processors known as superscalar and has been adopted by Intel, IBM, and others.

WHITNEY RECOGNIZED FOR DISTINGUISHED SERVICE
ACM presented the Distinguished Service Award to Telle Whitney for her profound impact on the participation of women in computing. Whitney, president and CEO of the Anita Borg Institute for Women and Technology, cofounded the Grace Hopper Celebration of Women in Computing, which has grown into an annual event. The conference is widely recognized as one of the best ways to encourage women to major in computing, continue on to graduate school, and pursue a career in computing.

Photograph courtesy of the Intel Corporation

news

Science | DOI:10.1145/1516046.1516051 Don Monroe

Micromedicine to the Rescue

Medical researchers have long dreamed of “magic bullets” that go directly where they are needed. With micromedicine, this dream could become a life-saving reality.

A headache or other pain will send many of us to the medicine cabinet for a pain reliever. Molecules from the swallowed pill quickly find their way directly to the source of the pain. But how do they know where to go? Of course, they don’t; the molecules travel throughout the body, chemically reacting wherever they can.

The consequences of “broadcasting” drugs to the whole body are profound. Drugs that attack rogue, cancer-causing cells also afflict other dividing cells, such as those in the intestine. In fact, chemotherapy doses are often reduced to avoid nausea and other unpleasant side effects, and other, more powerful drugs are too toxic to even be considered.

Researchers have long dreamed of “magic bullets” that go directly where they are needed. Indeed, many current drugs are formulated to be taken up by particular tissues, and nanotechnology is giving researchers even more delivery options. But what if the delivery system could “diagnose” the local conditions? In contrast to today’s “dumb envelopes,” Ehud Shapiro, a computer scientist and biological chemist at the Weizmann Institute of Science in Rehovot, Israel, likens this approach to a “smart envelope.” The envelope “would open up only at the right place and the right time for the specific action,” such as releasing a potent but toxic cancer drug, he says. “This would open up a whole range of molecules that are totally inaccessible today as drugs.”

[Figure: An example of a test-tube “molecular computer” created by Ehud Shapiro and colleagues. Stages shown: input (identification of mRNA disease indicators), computation (diagnosis), and output (drug administration, ssDNA drug). Adapted from Yaakov Benenson, Binyamin Gil, Uri Ben-Dor, Rivka Adar & Ehud Shapiro, Nature 429, 423-429 (27 May 2004).]

In addition to delivering drugs, microscopic agents could transform the regeneration of damaged tissues and the diagnosis of disease. The time until these ideas help patients is “probably measured in decades, not in years,” Shapiro admits. Long before that, however, researchers could use the new tools to explore biology in the lab. The challenge of engineering biology, rather than merely observing it, could yield powerful insights into how biological systems work.

Hijacking Biology
Recent years have been revolutionary for biology. The human genome, as well as computer-based tools that measure thousands of biological chemicals simultaneously, have inundated biologists with data about how these chemicals interact to create the processes of life. An eager group of researchers around the world take this data glut as a challenge to build new biological circuits from scratch, in what is known as “synthetic biology.” Using various strategies, they are assembling pieces


that might enable completely new approaches to medical technology.

In 2004, for example, Shapiro and his colleagues created a test-tube “molecular computer” consisting of three interconnected modules. The first module sensed the concentrations of four types of messenger RNA, the working copies of the genetic instructions in DNA, which are used to produce proteins. The second module performed a “diagnosis,” computing whether two of the messenger RNA levels decreased while two others increased, a signature that might indicate a disease. Depending on the results of the computation, the third module dispensed a drug molecule. “We demonstrated the whole process, beginning to end,” Shapiro notes, “but in a test tube.”

To both sense specific strands of messenger RNA and to perform the computation, the Weizmann researchers exploited the sequence-specific matching of DNA strands. So far, though, they have not operated their molecular computer in the complex environment of a living cell. Other teams have had success with different schemes. For instance, a group including Shapiro’s former collaborator Yaakov Benenson, now a researcher at the FAS Center for Systems Biology at Harvard University, demonstrated computation (but not sensing) in cultured human kidney cells. They exploited the newly discovered phenomenon of RNA interference, in which the presence of short RNA templates activates cellular mechanisms that suppress protein synthesis for matching messenger RNA.

A Caltech team led by Christina Smolke, now a professor of bioengineering at Stanford University, designed complex RNA molecules that included three separate sections, performing sensing, computation, and actuation. Although all three modules are part of the same molecule, they act independently, so the function of each part can be separately modified, she says. “You have this plug-and-play type capability to build many types of functions from a smaller set of modular components.” The RNA molecules they design are manufactured by yeast or even mammal cells after the researchers insert the corresponding DNA.

In addition to computing Boolean logic operations, Smolke’s team has demonstrated other signal-processing functions, including bandpass filtering and adjustable signal gain, with their RNA platform. But she acknowledges that the field has yet to settle on the best approach. “Ultimately, you want to get to a place where there’s some level of standardization,” she says.

[Figure: An example of a synthetic riboswitch engineered by Maung Nyan Win and Christina Smolke in which the ribozyme is turned off when the aptamer binds ligand. Credit: Maung Nyan Win and Christina D. Smolke, Proc. Natl. Acad. Sci. 104, 14287 (2007).]

Send in the Clones
One barrier to standardization is the wide range of possibilities for using biological agents in medicine. Smolke’s technique, for example, might be used to genetically modify cells in a particular tissue, but she is also exploring modification of immune cells outside of the body to combat cancer. “We’re utilizing the function that [the immune cell] already does really well, and then endowing it with enhanced functions,” she says.

Chris Anderson, a professor in the department of bioengineering at the University of California, Berkeley, envisions a different strategy, one based on engineered bacteria, but admits that “it’s impossible to know what’s going to win out.” A “huge advantage” of using bacteria, Anderson says, is that the biological processes targeted by antibacterial agents are very different from those of human cells, so the engineered bacteria can be easily killed.

For bacteria to be effective, they must be able to evade the body’s immune defenses. Anderson and his colleagues have transplanted genes from other bacteria that allow their E. coli to survive for hours in the bloodstream, instead of just a few minutes. They also introduced growth-control mechanisms into the bacteria, he stressed. “They’re not able to grow without feeding something to the patient.”
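The three-module design Shapiro describes above (sense, diagnose, dispense) can be loosely mirrored in ordinary software. The Python below is only an illustrative analogy, not the DNA chemistry the Weizmann team used: the marker names, the baseline levels, and the function names are all hypothetical, and the rule assumes that two designated messenger RNA levels must fall below baseline while the other two rise above it.

```python
from typing import Dict

# Hypothetical marker names; the article says only that four types of
# messenger RNA were sensed, not which ones.
EXPECTED_DOWN = ("mrna_a", "mrna_b")  # assumed to decrease in disease
EXPECTED_UP = ("mrna_c", "mrna_d")    # assumed to increase in disease


def sense(sample: Dict[str, float], baseline: Dict[str, float]) -> Dict[str, float]:
    """Module 1: report each marker's level relative to its baseline."""
    return {name: sample[name] / baseline[name] for name in baseline}


def diagnose(relative: Dict[str, float]) -> bool:
    """Module 2: positive only if two markers fell and the other two rose."""
    down = all(relative[name] < 1.0 for name in EXPECTED_DOWN)
    up = all(relative[name] > 1.0 for name in EXPECTED_UP)
    return down and up


def dispense(positive: bool) -> str:
    """Module 3: release the drug only on a positive diagnosis."""
    return "release drug" if positive else "do nothing"


def run_molecular_computer(sample: Dict[str, float], baseline: Dict[str, float]) -> str:
    """Chain the three modules, end to end."""
    return dispense(diagnose(sense(sample, baseline)))


# Made-up concentrations in arbitrary units, purely for illustration.
baseline = {"mrna_a": 10.0, "mrna_b": 10.0, "mrna_c": 10.0, "mrna_d": 10.0}
diseased = {"mrna_a": 4.0, "mrna_b": 5.0, "mrna_c": 18.0, "mrna_d": 25.0}
healthy = {"mrna_a": 10.0, "mrna_b": 9.0, "mrna_c": 11.0, "mrna_d": 10.0}

print(run_molecular_computer(diseased, baseline))  # release drug
print(run_molecular_computer(healthy, baseline))   # do nothing
```

Keeping the three stages as separate functions echoes the modularity the researchers emphasize: the diagnostic rule can be swapped out without touching the sensing or dispensing steps.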


In 2006, Anderson and his colleagues unveiled a bacterium they had engineered to invade nearby cells. Importantly, the invasion only occurred under chosen conditions, including lack of oxygen, which often occurs near tumors. Rather than directly combining sensing, computation, and actuation into a single DNA or RNA molecule, Anderson’s genetic modules communicate using smaller molecules, in much the same way as normal cells. When the researchers insert new DNA into the bacteria, they include special sequences that respond to other chemicals in the cell or the environment. They “connect” their modules by inducing this sensitivity to the products of other genes that they insert. In addition, by requiring that two different molecules attach to adjacent regions of DNA, they created the cellular equivalent of an AND gate.

A Need to Communicate
The chemical sensitivity of genes gives cells some ability to communicate with each other. For example, one of the signals that stimulated Anderson’s bacteria to invade was the well-known “quorum-sensing” response that kicks in for some bacteria when they are present in large numbers. Ron Weiss, a professor of electrical engineering and molecular biology at Princeton University, has used the quorum-sensing machinery to build bidirectional communication between two groups of bacteria. The collective behavior constitutes a kind of computation that reflects the interaction between the two strains, each of which could be tuned to detect separate conditions. In some cases, Weiss says, “a cell that specializes in the detection of one condition can do it much better than a cell that tries to do too many things at once.”

From a broader perspective, says computer scientist Tatsuya Suda of the University of California at Irvine, “there’s always communication involved” in micromedicine. The sensing of the environment by the tiny agents is a kind of communication, he notes, as is the dispensing of drugs. As researchers design these tiny communications systems, he stresses, they need to pay careful attention to noise.

In addition to communicating with their environment, microscopic agents may communicate with each other. As an example, Suda cites regenerative medicine, in which the creation of a replacement organ requires a coordinated response by many agents. But he admits that, for now, “the state of the art is just trying to find out how they work together as a group, as opposed to how we can take advantage of group behavior.”

For biologically based agents, as for ordinary cells, any communication is likely to occur through the emission and sensing of molecules. In contrast, artificial or hybrid systems incorporating nanometer-scale electronic components might also communicate by ultrasound or radio. In principle, as described by researcher Tad Hogg of Hewlett-Packard Labs, they could signal to point others to medically important locations. In addition, they might be able to transmit information to the outside world.

Augmenting, rather than replacing, the diagnostic strengths of the medical community could be an important early application of micromedicine, and would relax the demands for on-board computation and drug delivery. At a minimum, small devices might extend the capabilities of chemicals whose locations are monitored in modern medical equipment. “As those imaging devices advance,” Hogg says, “they should be able to give you some information more than just ‘here’ or ‘not here,’ but what they found” in a particular region, perhaps by combining several important local measurements.

Even before medical applications become practical, Shapiro suggests, the emerging tools could provide new resources for basic biology research. “I think that these types of molecular computing devices might be able to analyze living cells ex vivo and help researchers understand cells without killing them,” Shapiro notes. “These applications are probably measured in years rather than in decades.”

Don Monroe is a science and technology writer based in Murray Hill, NJ.

© 2009 ACM 0001-0782/09/0600 $10.00

Search Technology

Kleinberg Wins ACM-Infosys Foundation Award

Jon Kleinberg, a professor of computer science at Cornell University, is the winner of the 2008 ACM-Infosys Foundation Award in the Computing Sciences for his contributions to improving Web search techniques that allow billions of Web users worldwide to find relevant, credible information on the ever-evolving Internet. Kleinberg, 37, developed models that document how information is organized on the Web, how it spreads through large social networks, and how these networks are structured to create the small-world phenomenon known as “six degrees of separation.”

Kleinberg’s use of mathematical models to illuminate search and social networking tools that underpin today’s social structure has created interest in computing from people not formerly drawn to this field.

The ACM-Infosys Foundation Award, established in 2007, recognizes personal contributions by young scientists and system developers to a contemporary innovation that exemplifies the greatest recent achievements in the computing field. Financial support for the $150,000 award is provided by an endowment from the Infosys Foundation.

“Professor Kleinberg’s achievements mark him as a founder and leader of social network analysis in computer science,” says Professor Dame Wendy Hall, president of ACM. “With his innovative models and algorithms, he has broadened the scope of computer science to extend its influence to the burgeoning world of the Web and the social connections it enables. We are fortunate to have the benefit of his profound insights into the link between computer network structure and information that has transformed the way information is retrieved and shared online.”

The ACM-Infosys Foundation Award recognizes young researchers who are currently making sizeable contributions to their fields and furthering computer science innovation. The goal is to identify scientifically sound breakthrough research with potentially broad implications, and encourage the recipients to further their research.


Society | DOI:10.1145/1516046.1516052 Leah Hoffmann

Content Control
Entertainment businesses say digital rights management prevents
the theft of their products, but access control technologies have been
a uniform failure when it comes to preventing piracy. Fortunately,
change is on the way.

By now, the story is familiar: CD sales are falling. Digital music sales are growing, but have not offset the loss. The music business is struggling to adapt to a new technological era. It’s not the first time. At the turn of the 20th century, for instance, as the phonograph gained popularity, the industry’s model of compensation and copyright was suddenly thrown into question. Previously, composers of popular songs relied on the sale of sheet music for their income. After all, musicians needed sheet music to learn and perform a work, even if individual performances generated no royalties. Once performances could be recorded and sold or broadcast on the radio, however, the system grew less appealing to both groups of artists, who were essentially getting paid once for something that could be consumed thousands of discrete, different times. Eventually, collection societies were set up to make sure each party had a share in the new revenue streams.

Today, musical copyright is most prominently embodied not by sheet music but by audio recordings, along with their translations and derivatives (that is, their copies). Yet computers have made light work of reproducing most audio recordings, and the industry is unable to prevent what many young fans are now used to: free copies of their favorite songs from online file-sharing networks like BitTorrent and LimeWire. Legal barriers, like the injunctions imposed by the Digital Millennium Copyright Act (DMCA) against copying protected works or circumventing their digital protections, are unpopular and difficult to enforce. (The industry’s John Doe suits have touched a mere fraction of file sharers, and their effect on the overall volume of illegal downloads is questionable.) Technological barriers, like the widespread security standards and controls known as digital rights management (DRM), have been even less effective.

[Photograph by Molly Kleinman. Caption: By putting copyrighted books online, Google Book Search may soon revolutionize book publishing.]

DRM attempts to control the way digital media are used by preventing purchasers from copying or converting them to other formats. In theory, it gives content providers absolute power over how their work is consumed, enabling them to restrict even uses that are ordinarily covered by the fair use doctrine. Purchase a DVD in Europe, and you’ll be unable to play it on a DVD player in the U.S. because of region-coding DRM. What’s more, according to the DMCA, it would be illegal for you to copy your DVD’s contents into a different format, or otherwise attempt to circumvent its region-coding controls. To take a musical example, until recently songs purchased in Apple’s

16 COM MUNICATIO NS OF TH E ACM | J U NE 20 09 | VOL . 5 2 | NO. 6


news

popular iTunes music store could only els: Can companies preserve their cur-
be played on an iPod due to the com- rent revenue structures through DRM
pany’s proprietary DRM. DRM is being or in court, or must they find some oth-
Entertainment businesses say “wielded as a er way of making money? For music,
they need DRM to prevent the theft the iTunes model appears to be a viable
of products that represent their liveli- powerful tool” one, though questions still remain. For
hood. In practice, however, DRM has against unapproved movies, the path is less clear. What will
been a uniform failure when it comes happen when DVDs become obsolete?
to preventing piracy. Those who are technologies, says Will consumers take out subscriptions
engaged in large-scale, unauthorized Aaron Perzanowski. to online movie services, or make dis-
commercial duplication find DRM crete one-time purchases? “Nobody
“trivial to defeat,” says Jessica Litman, knows what the marketplace of the fu-
a professor of law at the University of ture will look like,” says Litman. And
Michigan. The people who don’t find the wholesale copyright reform that
it trivial: ordinary consumers, who are digital activists long for is years away.
often frustrated to discover that their on their hard drives. Angry gamers re- One industry whose business mod-
purchases are restricted in unintuitive sponded by posting copies of the game el may soon be radically transformed
and cumbersome ways. online, making Spore the most pirated is publishing. Under the terms of a
In the music industry, at least, game on the Internet. recent settlement reached with the
change is underway. In 2007, Amazon Authors Guild (which sued Google
announced the creation of a digital DRM and Movies in 2005 to prevent the digitization
music store that offered DRM-free Yet DRM is nowhere near dead outside and online excerpting of copyrighted
songs, and in January 2009, Apple fi- the music business. Hollywood, pro- books as part of its Book Search proj-
nalized a deal with music companies tected thus far from piracy by the large ect), Google agreed to set up a book
to remove anti-copying restrictions file size of the average feature film, con- rights registry to collect and distrib-
on the songs it sold through iTunes. tinues to employ it as movies become ute payments to authors and publish-
Since iTunes is the world’s most popu- available through illegal file-sharing ers. Much like the collection societies
lar digital music vendor—and the iPod networks. Buy a movie on iTunes, and that were established for musicians,
its most popular player—critics com- you’ll still face daunting restrictions the registry would pay copyright hold-
plained the deal would only further so- about the number and kind of devices ers whenever Internet users elected to
lidify Apple’s hold on the industry. Yet you can play it on. Buy a DVD, and you’ll view or purchase a digital book; 63% of
because consumers can now switch to be unable to make a personal-use copy the fee would go to authors and pub-
a different music player without losing to watch on your laptop or in the car. lishers, and 37% to Google.
the songs they’ve purchased, the pre- DRM has also proven useful as a le- If approved, the settlement would
diction seems dubious. gal weapon. Kaleidescape, a company be “striking in its scope and potential
“As long as the cost of switching whose digital “jukeboxes” organize future impact,” says Deirdre Mulli-
technologies is low, I don’t think Apple and store personal media collections, gan, a professor of law at the Univer-
will exert an undue influence on con- was sued in 2004 by the DVD Content sity of California, Berkeley’s School of
sumers,” says Edward Felten, a profes- Control Association, which licenses the Information. It is nonetheless highly
sor of computer science and public af- Content Scrambling System that pro- controversial. Some, like James Grim-
fairs at Princeton University. tects most DVDs. (In 2007, a judge ruled melmann, a New York Law School
What about piracy? Since DRM there was no breach of the license; the professor, believe it is a “universal win
never halted musical piracy in the first case is still open on appeal.) compared with the status quo.” Others
place, experts say, there’s little reason The Kaleidescape case is instructive, are disappointed by what they see as
to believe its absence will have much experts say, since it shows that prevent- a missed opportunity to set a power-
effect. In fact, piracy may well decrease ing piracy isn’t necessarily Hollywood’s ful court precedent for fair use in the
thanks to a tiered pricing scheme in biggest concern. Entry-level Kaleides- digital age, and the undeniable danger
the Apple deal whereby older and less cape systems start at $10,000—unlike- of monopoly. “No other competitors
popular songs are less expensive than ly purchases for would-be copyright appear poised to undertake similar ef-
the latest hits. “The easier it is to buy infringers. “Instead, DRM is wielded as forts and risk copyright legislation,”
legitimate high-quality, high-value a powerful tool to prevent the develop- says Perzanowski.
products,” explains Felten, “the less of ment and emergence of unapproved One thing, at least, is clear: It frees
a market there is for pirated versions.” technologies. In some instances, that the courts to consider other industries’
By way of illustration, he points to the may overlap with some concern over in- complaints as they slouch toward the
2008 release of Spore, a hotly anticipat- fringement, but as the Kaleidescape ex- digital age.
ed game whose restrictive DRM system ample shows, it need not,” says Aaron
not only prevented purchasers from Perzanowski, a research fellow at the Leah Hoffmann is a Brooklyn, NY-based science and
installing it on more than three com- Berkeley Center for Law & Technology. technology writer.

puters, but surreptitiously installed Indeed, the real question typically


a separate program called SecuROM comes down to one of business mod- © 2009 ACM 0001-0782/09/0600 $10.00



news

Technology | DOI:10.1145/1516046.1516053 Gregory Goth

Autonomous Helicopters
Researchers are improving unmanned helicopters’ capabilities
to address regulatory requirements and commercial uses.

There would seem to be a clear market niche for unmanned helicopters. Equipped with lightweight onboard cameras, they could serve as mapping agents or search-and-rescue “eyes” in places where using a full-sized helicopter and a human crew is life-threatening or cost-prohibitive. Motion-picture producers have explored the use of autonomous helicopters in filming action scenes in locations where the safety of both flight crews and movie cast members could be at risk from using larger aircraft. Humanitarian groups have considered using autonomous helicopters for land-mine detection, while public safety agencies have explored using them for inspecting bridges and other structures where human inspectors might be endangered. And they are becoming mainstays in applications such as crop dusting in Japan, where the need to fly at a low altitude and spray chemicals can be dangerous for pilots.

One of Stanford University’s autonomous helicopters flying upside down in an aerobatically challenging airshow. For more photos and video, visit http://heli.stanford.edu/index.html. (Photograph by Eugene Fratkin.)

Academic and commercial research teams have been perfecting the capabilities of autonomous helicopters for nearly two decades, with such widespread deployments as a goal. Algorithmic and technological advances are occurring at a steady pace, but regulatory roadblocks and trade restrictions are hampering market acceptance. And, though much of the cutting-edge research in autonomous helicopters demonstrates significant crossover potential between disparate computational and scientific disciplines as well as other aviation applications, many researchers find themselves stymied by these non-technological obstacles that stem from policy concerns.

“A lot of vehicles have at least kinematics that are similar to helicopters,” says Adam Coates, a Stanford University Ph.D. student who coauthored Learning for Control from Multiple Demonstrations, which won the International Conference on Machine Learning’s best application paper for 2008, and describes how he and colleagues programmed an autonomous helicopter to perform complex aerobatics. “But I think the biggest hurdle is regulatory. It’s virtually impossible to do real UAV [unmanned aerial vehicle] operations unless you’re a defense contractor or the military, so you have to go to a big defense contractor if you want to do real UAV research.”

Regulatory hurdles vary, depending on the sovereignty involved. In the U.S., for example, the Federal Aviation Administration (FAA) has yet to issue regulations regarding the use of autonomous helicopters in public airspace. A 2008 report by the U.S. Government Accountability Office (GAO) noted that unmanned aircraft, whether fixed wing or rotor powered, cannot meet the National Airspace System’s safety regulations for tasks such as avoiding other aircraft. Therefore, the use of autonomous craft is limited to case-by-case approval by the FAA, and usually restricted to line-of-sight operation. In Japan, the government has placed strict trade restrictions on the Yamaha RMAX autonomous helicopter, which is regarded as the industry benchmark, to prevent it from being used for military operations by unfriendly nations.

Omead Amidi, a research faculty member at Carnegie Mellon University and CEO of SkEyes Unlimited, a Washington, PA-based firm that manufactures instruments for autonomous aircraft, concurs with Coates’ observation about the dearth of regulatory infrastructure hindering wider development and deployment of the craft.

“If you have a helicopter flying over your head, it’s because everything about it is regulated,” Amidi says. “No such thing exists for autonomous helicopters. If you could convince me to fly one of these over the head of my daughter, OK, it’s ready, but I’m not doing it now.”

AI to the Forefront

Despite the regulatory issues, which the GAO estimated might take 10 years to resolve in the U.S., researchers have continued to improve autonomous helicopters’ capabilities. The most advanced can take off, hover, and maintain flight autonomously through a combination of advanced sensing and navigation equipment such as laser sensors, GPS modules, inertial measurement units that contain accelerometers and gyroscopes, and communications modules that communicate with ground-based computers or human pilots when necessary. The RMAX, for example, first flew fully autonomously out of visual range in Japan in 2000, following preprogrammed instructions.

While the RMAX is well suited for commercial purposes, it is also prohibitively large and expensive for applications such as the surveillance of building interiors or for bootstrapped university research programs. A base model used by the U.S. Army for research weighs approximately 185 pounds, has a rotor diameter of three meters, and costs $86,000, while fully autonomous units, complete with navigational and control equipment, can cost $1 million.

Researchers are successfully applying disparate technologies to improve the vehicles, using much smaller and cheaper helicopters than the RMAX. For example, Coates and coauthor Pieter Abbeel, now a professor in the department of electrical engineering and computer sciences at the University of California, Berkeley, utilized artificial intelligence principles to demonstrate their assertion that an off-the-shelf expectation-maximization algorithm could result in the most advanced autonomous aerobatics yet performed, using a commercially available radio-controlled hobbyist helicopter that weighed about 10 pounds.

Coates says the Stanford project was the culmination of five years of effort, in which numerous approaches were discussed and dismissed. Andrew Ng, a professor of computer science at Stanford, who advised Coates and Abbeel in their project and was a coauthor of the Learning for Control paper, says the project successfully transferred machine learning techniques into a discipline that had hitherto been extremely labor-intensive, relying on painstaking expert modeling of likely behaviors. Ultimately, they decided to have the helicopter “watch” an expert human pilot’s maneuvers via data input from onboard controls and a radio receiver that saved a copy of the human pilot’s control stick positions during demonstration flights.

“From those two things, you can examine state changes over time and what the pilot does, and can record a whole trajectory to build up a model,” Coates says.

“Previously, the most common approach to designing controllers for autonomous aircraft, both helicopters and fixed wing, was to hire a human engineer to choose parameters for the controller,” Ng says. “For example, if the helicopter is pitched forward a little more than you want, how aggressively do you want to pull back on the stick? The traditional approach was to have a person knowledgeable in aerodynamics and helicopters sit down and model that. This approach can often work, but it is a very slow design process and often doesn’t perform nearly as well as modern machine learning methods.”

Coates and Abbeel discovered that even the most expert human pilot’s aerobatic routine contains errors (or, in the language of the problem, is suboptimal). “However, repeated expert demonstrations are often suboptimal in different ways,” their Learning for Control paper noted, “suggesting that a large number of suboptimal expert demonstrations could implicitly encode the ideal trajectory the suboptimal expert is trying to demonstrate.”

They discovered that merely using an arithmetic average of the states observed at any given time in the expert demonstrations would fall short of arriving at the desired trajectory, explaining that, in practice, the demonstrations proceed at different rates, which makes it impossible to combine states from the same timestep in each demonstration.

However, by employing the machine learning algorithm, which includes an extended Kalman filter and a dynamic programming algorithm, the researchers were able to infer the intended target trajectory and time alignment of all the demonstrations. And, while real-time variables such as the state of the air around the craft, rotor speed, actuator delays, and the behavior of the helicopter’s onboard avionics contribute to an extremely complex environment that cannot be modeled accurately, these variables can be mitigated if the programming is able to make the helicopter fly the same trajectory each time. If so, the errors caused by these variables will tend to be the same, and therefore can be predicted more accurately.

In addition to the aerobatic results of the project, Coates says the ramifications for machine learning theory go deeper. “One of the reasons people liked our paper is that it was an off-the-shelf machine learning algorithm and we solved a strange little application nobody had thought of before,” Coates notes. “People know how hard this is, and to see that AI people solved this, I think has made a big impact. We had been preaching for a while that AI is the key to solving really hard problems that aren’t accessible to us when we’re using lots of classical methods, and if you come up with a problem and make such large strides, it really adds some weight to the argument that AI can be real and practical with algorithms that solve really hard problems.”

Smaller, Lighter, Safer

The future of autonomous helicopters might be even more profoundly affected by the march to increasingly powerful processors and smaller form factors.

“One way to avoid safety troubles is by making the helicopters smaller, so there are a lot of efforts going into miniaturizing the machines,” says Eric Feron, professor of aerospace software engineering at Georgia Tech, who studied autonomous helicopters while a graduate student at MIT. “That’s where I think things are going now.”

Coates says the breakthrough Stanford research was greatly facilitated by increased processor capability that allowed real-time instruction every 20th of a second, which was not possible even five years ago. Additionally, the advent of microelectromechanical systems-based sensing technology, such as gyroscopes, accelerometers, and magnetometers, is leading to increased miniaturization.

Navigationally, academic researchers are now also concentrating on developing obstacle detection technology that will allow autonomous helicopters to fly safely in urban areas teeming with tall buildings, overhead wires, and light poles. Such uses are not on the near horizon, however; the ongoing safety concerns probably point to deployment in sparsely populated areas for natural resource mapping, forest firefighting, and marine search and rescue. Human-generated mapping at quarter-meter resolution can cost $20,000 per square mile, for example, while autonomous helicopters could probably deliver the same results 10 times cheaper, says Amidi.

Georgia Tech’s Feron says autonomous helicopters will continue to offer researchers an excellent platform for further research in robotics, whether the researcher is an “aeronaut” who will still be utilizing them 10 years hence, or instead testing a more universally applicable methodology on the machines, and that wider deployment will indeed follow at some point. “The safety and reliability issues are not unworkable,” Feron says. “I think it’s just a matter of time.”

Gregory Goth is an Oakville, CT-based writer who specializes in science and technology.

© 2009 ACM 0001-0782/09/0600 $10.00

Programming

Repeat Winners

For the second year in a row, students from St. Petersburg University of Information Technology, Mechanics and Optics won the annual ACM International Collegiate Programming Contest (ICPC). With this year’s victory, St. Petersburg University has now won the ACM-ICPC world championship three times in the last four years.

Known as “The Battle of the Brains,” the ACM ICPC World Finals took place this year at the Royal Institute of Technology in Stockholm, Sweden. The world’s top 100 university teams used open standard technology to solve 11 real-world problems involving traffic congestion, suffix-replacement grammars, and other issues, with the goal being to correctly solve the largest number of problems in the shortest amount of time.

The 33rd annual ACM ICPC, sponsored by IBM, was dominated by teams from Russia and China. This year’s top 12, medal-winning teams are St. Petersburg University (Russia), which solved nine problems, followed by Tsinghua University (China), St. Petersburg State University (Russia), Saratov State University (Russia), the University of Oxford (U.K.), and Zhejiang University (China). Massachusetts Institute of Technology (U.S.) finished in seventh place, followed by Altai State Technical University (Russia), University of Warsaw (Poland), University of Waterloo (Canada), I Javakhishvili Tbilisi State University (Georgia), and Carnegie Mellon University (U.S.).

“It is clear that computational thinking, which is at the heart of the information technology revolution, is the engine that is driving innovation in these countries,” says ACM President Professor Dame Wendy Hall. “As we seek to strengthen computing education and fill the talent pipeline for future workers, it is an important reminder that, while U.S. enrollment in computer science programs may have increased, we need to continue investing in programs that attract women and other underrepresented groups to this field.”
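As a rough illustration of the time-alignment problem described in the autonomous helicopter article above, the following Python sketch contrasts naive per-timestep averaging with averaging after alignment. It is not the Stanford team’s method: the article describes an expectation-maximization procedure with an extended Kalman filter and a dynamic programming step, whereas this sketch swaps in plain dynamic time warping on a one-dimensional signal. The function names (`dtw_path`, `aligned_average`) and the toy data are assumptions made purely for illustration.

```python
def dtw_path(ref, demo):
    """Dynamic time warping: minimum-cost monotone alignment of two 1-D sequences.

    Returns a list of (ref_index, demo_index) pairs. Illustrative stand-in only,
    not the algorithm used in the research described in the article.
    """
    n, m = len(ref), len(demo)
    inf = float("inf")
    # cost[i][j] = best cost of aligning ref[:i] with demo[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (ref[i - 1] - demo[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end, always stepping to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def aligned_average(ref, demos):
    """Average demonstrations after warping each one onto the reference timeline."""
    sums = list(ref)
    counts = [1] * len(ref)
    for demo in demos:
        for i, j in dtw_path(ref, demo):
            sums[i] += demo[j]
            counts[i] += 1
    return [s / c for s, c in zip(sums, counts)]


# Toy data: the same "maneuver" flown at normal speed and at half speed.
fast = [0, 1, 2, 3, 2, 1, 0]
slow = [v for v in fast for _ in range(2)]

# Naive averaging pairs up equal timesteps and blurs the peak of the maneuver.
naive = [(a + b) / 2 for a, b in zip(fast, slow)]
aligned = aligned_average(fast, [slow])
print(max(naive), max(aligned))
```

On this toy input the naive average peaks at 2.0 while the aligned average recovers the true peak of 3.0, which is the effect the article attributes to demonstrations occurring at different rates.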

Awards

American Academy Names 2009 Fellows


Computer science was well represented when the American Academy of Arts & Sciences (AAAS) recently announced the election of the 2009 class of fellows and foreign honorary members. The 212 new fellows and 19 foreign honorary members (including scholars, scientists, jurists, writers, artists, civic, corporate and philanthropic leaders) come from 28 states and 11 countries and range in age from 33 to 83. They join one of America’s most prestigious honorary societies and a center for independent policy research.

“Since 1780, the Academy has served the public good by convening leading thinkers and doers from diverse perspectives to provide practical policy solutions to the pressing issues of the day,” said Leslie Berlowitz, AAAS chief executive officer. “I look forward to welcoming into the Academy these new members to help continue that tradition.”

Elected in the category of computer sciences (including AI and information technologies):
• John Seely Brown, Deloitte Center for Edge Innovation/University of Southern California
• Mary Jane Irwin, Pennsylvania State University
• Maria Klawe, Harvey Mudd College
• Ray Kurzweil, Kurzweil Technologies
• Michael Sipser, MIT
• Alfred Z. Spector, Google
• Jennifer Widom, Stanford University.

Elected in the category of business, corporate, and philanthropic leadership:
• John Doerr, Kleiner, Perkins, Caufield & Byers.

In an email interview, John Seely Brown offered this career advice for young people: “Nurture a disposition that embraces change and that encourages you to challenge your own assumptions and having others challenge yours.”

John Seely Brown. (Photograph by J.D. Lasica.)



news

News | DOI:10.1145/1516046.1516071 Bob Violino

Looking Backward and Forward


CRA’s Computing Community Consortium hosted a day-long symposium
to discuss the important computing advances of the last several decades and how
to sustain that track record of innovation.

What are the major computing innovations of the recent past? How did research enable them? What advances are on the horizon, and how can they be realized? These were among the key questions addressed at an invitation-only symposium held at the Library of Congress in Washington, DC, in March.

The symposium, “Computing Research that Changed the World: Reflections and Perspectives,” was organized by the Computing Research Association’s (CRA’s) Computing Community Consortium in cooperation with a half-dozen U.S. congressmen.

“The main goals were to explore past game-changing research in the computing fields to understand how they came about and then to take a peek at the future to see how this knowledge could be used to maximize the chances for future game-changing research,” says CRA Executive Director Andrew Bernat.

From left: Daphne Koller, Stanford; Barbara Liskov, MIT; Rodney Brooks, MIT and Heartland Robotics; and Alfred Spector, Google, were among the symposium’s session speakers. (Photograph courtesy of the Computing Research Association.)

“It became pretty clear that there is no foolproof way to figure out what research will turn into the big hits of tomorrow; rather, that big hits generally are a combination of independent efforts driven by curiosity and applications,” Bernat says. “No one foresaw the ultimate outcomes of the initial research, so we must continue to fund a broad range of efforts in [multiple] subdisciplines, using a variety of funding mechanisms.”

The symposium’s sessions included The Internet and the World Wide Web, which examined areas such as search technology and cloud computing; Evolving Foundations, which looked at the security of online information and global information networks; The Transformation of the Sciences via Computation, which covered topics such as supercomputers and the future of medicine; and Computing Everywhere!, which focused on sensing, computer graphics, and robotics.

Each session featured three talks and a short discussion that identified future challenges. The sessions were followed by an hour-long discussion among all the speakers, with comments from attendees, and a call to action for the future.

As for which areas of research seem particularly promising, Bernat says mobile computing will “continue to be a huge area for exploration and change as are digital media of all types. And networking will continue to boom, not just computer networking, but social networks which will help us understand the dynamics of human behavior.”

Daphne Koller, a professor of computer science at Stanford University and one of the symposium’s session speakers, says one of the most exciting directions in computing is the ability to use computational methods and models to analyze scientific data, particularly biomedical data.

“New biological assays are producing important data at an ever-increasing rate,” Koller says. “These data have the potential of providing unprecedented insight into basic biological processes as well as into the mechanisms and processes underlying human disease. They also have the potential of allowing us to understand the complex genetic and environmental factors that lead to differences in human phenotype, including both disease and response to drug treatment.”

However, Koller adds, it’s impossible to extract these insights without new computational methods. “Developing these tools is a direction where a lot of progress has been made,” she says, “but much more work remains to be done.”

Videos and other material from the symposium are available on the CRA Web site, and the Computing Community Consortium will host additional symposiums later this year, including one on artificial intelligence and education and another on educational data mining.

Bob Violino is a writer based in Massapequa Park, NY, who covers business and technology.

© 2009 ACM 0001-0782/09/0600 $10.00

viewpoints

DOI:10.1145/1516046.1516056 Eugene H. Spafford

Privacy and Security


Answering the Wrong
Questions Is No Answer
Asking the wrong questions when building and deploying systems results in systems
that cannot be sufficiently protected against the threats they face.

For over 50 years we have been trying to build computing systems that are trustworthy. The efforts are most notable by the lack of enduring success, and by the oftentimes spectacular security and privacy failures along the way. With each passing year (and each new threat and breach) we seem to be further away from our goals.

Consider what is present in too many organizations. Operating systems with weak controls and flaws have been widely adopted because of cost and convenience. Thus, firewalls have been deployed to put up another layer of defense against the most obvious problems. Firewalls are often configured laxly, so complex intrusion and anomaly detection tools are deployed to discover when the firewalls are penetrated. These are also imperfect, especially when insider threats are considered, so we deploy data loss detection and prevention tools. We also employ virtual machine environments intended to erect barriers against buggy implementations. These are all combined with malware detection and patch management, yet still attacks succeed. Each time we apply a new layer, new attacks appear to defeat it.

I conjecture that one reason for these repeated failures is that we may be trying to answer the wrong questions. Asking how to make system “XYZ” secure against all threats is, at its core, a nonsensical question. Almost every environment and its threats are different. A system controlling a communications satellite is different from one in a bank, which in turn is different from one in an elementary school computer lab, which is different from one used to control military weapons. There are some issues in common, certainly, but the overall design and deployment should reflect the differences.

The availability and familiarity of a few common artifacts has led us to deploy them (or variants) everywhere, even to unsuitable environments. By analogy, what if everything in society was constructed of bricks because they are cheap, common, and easy to use? Imagine not only homes built of bricks, but everything else from the space shuttle to submarines to medical equipment. Thankfully, other fields have better sense and choose appropriate tools for important tasks.

A time-honored way of reinforcing a point is by means of a story told as a parable, a fairy tale, or as a joke. One classic example I tell my students:

Two buddies leaving a tavern find a distressed and somewhat inebriated man on his hands and knees in the parking lot, apparently searching for something. They ask him what he has lost, and he replies that he has dropped his


keys. He describes the keys, and says if shouldn’t expect to find what they are ingful manner. That is not necessarily
the two men find them they will receive a really seeking.a the case.
reward. They begin to help search. Other So it is in research—especially in We have generally failed to under-
people come by and they too are drawn cyber security and privacy. We have stand that when we build and deploy
into the search. Soon, there is a crowd people seeking answers to the wrong systems they are used in a variety of en-
combing the lot, with an air of competi- questions, often because that is where vironments, facing different threats.
tion to see who will be the first to find the “the light is better” and there seems to There is no perfect security in any real
keys. Periodically someone informs the be a bigger crowd around them. Until system—hardware fails, people make
crowd of the discovery of a coin or a par- we start asking questions that better mistakes, and attacks outside our ex-
ticularly interesting piece of rock. address the problems that really need pectations may defeat our protection
After a while, one in the crowd stands to be solved, we shouldn’t expect to mechanisms. If an attacker is suffi-
up and inquires of the fellow who lost see progress. Here are a few examples ciently motivated and has enough re-
his keys, “Say, are you sure you lost your of misleading questions: sources (including time), every system
keys out here in the lot?” To which the ! How do I secure my commodity can be defeated in some manner.b If
man replies, “No. I lost them in the al- operating system against all threats? the attacker doesn’t care if the defeat
ley.” Everyone stops to stare at the man. ! How do I protect my system with is noticed, it may reduce the work fac-
“Well, why the heck are you searching an intrusion-detection system, data tor involved; as an obvious example,
for them here in the parking lot!?” some- loss and prevention tools, firewalls, an assured denial-of-service attack can
one exclaimed. To which the man re- and other techniques? be accomplished with enough nuclear
plied, “Well, the light is so much better ! How do I find coding flaws in the weapons. The goal in the practice of se-
here. And besides, now I have such good system I am using so I can patch them? curity is to construct sufficient defens-
company!” ! How do we build multilevel secure es against the likely threats in such a
There are many lessons that can be systems?
inferred from this story, but the one I Each of these questions implies it b There are many books on this topic, and the
IL LUSTRAT IO N BY JON H AN

stress with my students is that if they can be answered in a positive, mean- basic premise is at the heart of nearly every big
don’t properly define the problem, heist movie, including Ocean’s 11, The Italian
Job, and The Thomas Crown Affair. For some in-
ask the right questions, and search a Another story that resonates with my students teresting, real-life examples outside comput-
in the proper places, they may have is http://spaf.cerias.purdue.edu/Archive/race- ing, I recommend the book Spycraft by Robert
good company and funding, but they horse.html. Wallace and H. Keith Melton.


way as to reduce the risk of compromise to an acceptable level; if the attack can be made to cost far more than the perceived gain resulting from its success, then that is usually sufficient.

By asking the wrong questions—such as how to patch or modify existing items rather than ask what is appropriate to build or acquire—we end up with systems that cannot be adequately protected against the threats they face. Few current systems are designed according to known security practices,c nor are they operated within an appropriate policy regime. Without understanding the risks involved, management seeks to “add on” security technology to the current infrastructure, which may add new vulnerabilities.

The cost of replacing existing systems with different ones requiring new training seems so daunting that it is seldom considered, even by organizations that face prospects of catastrophic loss. There is so much legacy code that developers and customers alike believe they cannot afford to move to something else. Thus, the market tends toward “add on” solutions and patches rather than fundamental reengineering. Significant research funding is applied to tinkering with current platforms rather than addressing the more fundamental issues. Instead of asking “How do we design and build systems that are secure in a given threat environment?” and “What tools and programming constructs should we be using to produce systems that do not exhibit easily exploited flaws?” we, as a community, continue to ask the wrong questions.

Note that I am not arguing against standards, per se. Standards are important for interoperability and innovation. However, standards are best applied at the interfaces so as to allow innovation and good engineering practice to take place inside. I am also not overlooking the potential expense. Creating new systems, training developers, and developing new code bases might be costly, but only initially—given current losses and trends, this approach would eventually reduce costs in many environments.

Robert H. (Bob) Courtney Jr., one of the first computer security professionals and an early recipient of the NIST/NCSC National Computer Systems Security Award, articulated three “laws” for those who seek to build secure, operational computational artifacts:d
• Nothing useful can be said about the security of a mechanism except in the context of a specific application and environment.
• Never spend more mitigating a risk than tolerating what it will cost you.
• There are management solutions to technical problems but no technical solutions to management problems.

Although not everyone will agree with these three laws, they provide a good starting point for thinking about the practice of information security. The questions we should be asking are not about how to secure system “XYZ,” but whether “XYZ” is appropriate for use in the environment at hand. Can it be configured and protected against the expected threats to a level that matches our risk tolerance? What policies and procedures need to be put in place to augment the technology? What is the true value of what we are protecting? Do we even know what we are protecting?e

As researchers and practitioners, we need to stop looking for solutions where the light is good and people seem to be gathered. Consider a quote I have been using recently: “Insanity is doing the same thing over and over again while expecting different results.”f Asking the wrong questions repeatedly is not only hindering us from making real progress but may even be considered insane.

So, what questions are you trying to answer?

c There are many fine works on security engineering, including Ross Anderson’s opus of that title. If we return to the fundamentals, tried-and-true design principles were articulated by Jerome H. Saltzer and Michael D. Schroeder in “The Protection of Information in Computer Systems,” republished in Communications of the ACM 17, 7 (July 1974), but few systems are designed using these principles.
d My thanks to William Hugh Murray for his restatement of Courtney’s Laws.
e Many firms do not understand the value of what they are protecting or where it is located; see http://snipurl.com/sec-econ.
f This quote is widely attributed to Albert Einstein and to John Dryden. I have been unable to find a definitive source for it, however.

Eugene H. Spafford (spaf@cerias.purdue.edu) is a professor of computer science and the executive director of the Center for Education and Research in Information Assurance and Security (CERIAS) at Purdue University.

Copyright held by author.


DOI:10.1145/1516046.1516055 Kevin Fu

Inside Risks
Reducing Risks of
Implantable Medical Devices
A prescription to improve security and privacy of pervasive health care.

Millions of patients benefit from programmable, implantable medical devices (IMDs) that treat chronic ailments such as cardiac arrhythmia,6 diabetes, and Parkinson’s disease with various combinations of electrical therapy and drug infusion. Modern IMDs rely on radio communication for diagnostic and therapeutic functions—allowing health-care providers to remotely monitor patients’ vital signs via the Web and to give continuous rather than periodic care. However, the convergence of medicine with radio communication and Internet connectivity exposes these devices not only to safety and effectiveness risks, but also to security and privacy risks. IMD security risks have greater direct consequences than security risks of desktop computing. Moreover, IMDs contain sensitive information with privacy risks more difficult to mitigate than those of electronic health records or pharmacy databases. This column explains the impact of these risks on patient care, and makes recommendations for legislation, regulation, and technology to improve security and privacy of IMDs.

From left, Benjamin Ransford (University of Massachusetts), Daniel Halperin (University of Washington), Benessa Defend (University of Massachusetts), and Shane Clark (University of Massachusetts) worked to uncover security flaws in implantable medical devices.
PHOTOGRAPH BY BEN RANSFORD

Consequences and Causes: Security Risks
The consequences of an insecure IMD can be fatal. However, it is fair to ask whether intentional IMD malfunctions represent a genuine threat. Unfortunately, there are people who cause patients harm. In 1982, someone deliberately laced Tylenol capsules with cyanide and placed the contaminated products on store shelves in the Chicago area. This unsolved crime led to seven confirmed deaths, a recall of an estimated 31 million bottles of Tylenol, and a rethinking of security for packaging medicine in a tamper-evident manner. Today, IMDs appear to offer a similar opportunity to other depraved people. While there are no reported incidents of deliberate interference, this can change at any time. The global reach of the Internet and the prevalence and intermingling of radio communications expose IMDs to historically open environments with difficult-to-control perimeters.3,4 For instance, vandals caused seizures in photosensitive individuals by posting flashing animations on a Web-based epilepsy support group.1

Knowing that such vandals will always exist, the next question is whether genuine security risks exist. What could possibly go wrong by allowing an IMD to communicate over great distances with radio and then mixing in Internet-based services? It does not require much sophistication


to think of numerous ways to cause intentional malfunctions in an IMD. Few desktop computers have failures as consequential as those of an IMD. Intentional malfunctions can actually kill people, and are more difficult to prevent than accidental malfunctions. For instance, lifesaving therapies were silently modified and disabled via radio communication on an implantable defibrillator that had passed premarket approval by regulators.3 In my research lab, the same device was reprogrammed with an unauthenticated radio-based command to induce a shock that causes ventricular fibrillation (a fatal heart rhythm).

Manufacturers point out that IMDs have used radio communication for decades, and that they are not aware of any unreported security problems. Spam and viruses were also not prevalent on the Internet during its many-decade nascent period. Firewalls, encryption, and proprietary techniques did not stop the eventual onslaught. It would be foolish to assume IMDs are any more immune to malware. For instance, if malware were to cause an IMD to continuously wake from power-saving mode, the battery would wear out quickly. The malware creator need not be physically present, but could expose a patient to risks of unnecessary surgery that could lead to infection or death. Much as Macintosh users can take comfort in the fact that most current malware takes aim at the Windows platform, patients can take comfort in the fact that IMDs seldom rely on such widely targeted software, for now.

Consequences and Causes: Privacy
A second risk is violation of patient privacy. Today’s IMDs contain detailed medical information and sensory data (including vital signs, patient name, date of birth, therapies, and medical diagnosis). Data can be read from an IMD by passively listening to radio communication. With newer IMDs providing nominal read ranges of several meters, eavesdropping will become easier. The privacy risks are similar to those of online medical records.

Remedies
Improving IMD security and privacy requires a proper mix of technology and regulation.

Remedy: Technology
Technological approaches to improving IMD security and privacy include judicious use of cryptography and limiting unnecessary exposure to would-be hackers. IMDs that rely on radio communication or have pathways to the Internet must resist a determined adversary.5 IMDs can last upward of 20 years, and doctors are unlikely to surgically replace an IMD just because a less-vulnerable one becomes available. Thus, technologists must think 20 to 25 years out. Cryptographic systems available today may not last 25 years.

It is tempting to consider software updates as a remedy for maintaining the security of IMDs. Because software updates can lead to unexpected malfunctions with serious consequences, pacemaker and defibrillator patients make an appointment with a health-care provider to receive firmware updates in a clinic. Thus, it could take too long to patch a security hole.

Beyond cryptography, several steps could reduce exposure to potential misuse. When and where should an IMD permit radio-based, remote reprogramming of therapies (such as changing the magnitude of defibrillation shocks)? When and where should an IMD permit radio-based, remote collection of telemetry (for example, vital signs)? Well-designed cryptographic authentication and authorization make these two questions solvable. Does a pacemaker really need to accept requests for reprogramming and telemetry in all locations from street corners to subway stations? The answer is no. Limit unnecessary exposure.

Remedy: Regulation
Premarket approval for life-sustaining IMDs should explicitly evaluate security and privacy—leveraging the body of knowledge from the secure systems and security metrics communities. Manufacturers have already deployed hundreds of thousands of IMDs without voluntarily including reasonable technology to prevent the unauthorized induction of a fatal heart rhythm. Thus, future regulation should provide incentives for improved security and privacy in IMDs.

Regulatory aspects of protecting privacy are more complicated, especially in the United States. Although the U.S. Food and Drug Administration has acknowledged deleterious effects of privacy violations on patient health,2 there is no ongoing process or explicit requirement that a manufacturer demonstrate adequate privacy protection. The FDA has no legal remit from Congress to directly regulate privacy (the FDA does not administer HIPAA privacy regulations).

Equipment used to attack an implantable cardiac defibrillator (ICD).
PHOTOGRAPH BY BEN RANSFORD

Call to Action
My call to action consists of two parts


legislation, one part regulation, and one part technology.

First, legislators should mandate stronger security during premarket approval of life-sustaining IMDs that rely on either radio communication or computer networking. Action at premarket approval is crucial because unnecessary surgical replacement directly exposes patients to risk of infection and death. Moreover, the threat models and risk retention chosen by the manufacturer should be made public so that health-care providers and patients can make informed decisions when selecting an IMD. Legislation should avoid mandating specific technical approaches, but instead should provide incentives and penalties for manufacturers to improve IMD security.

Second, legislators should give regulators the authority to require adequate privacy controls before allowing an IMD to reach the market. The FDA writes that privacy violations can affect patient health,2 and yet the FDA has no direct authority to regulate privacy of medical devices. IMDs increasingly store large amounts of sensitive medical information and fixing a privacy flaw after deployment is especially difficult on an IMD. Moreover, security and privacy are often intertwined. Inadequate security can lead to inadequate privacy, and inadequate privacy can lead to inadequate security. Thus, device regulators have the unique vantage point for not only determining safety and effectiveness, but also determining security and privacy.

Third, regulators such as the FDA should draw upon industry, the health-care community, and academics to conduct a thorough and open review of security and privacy metrics for IMDs. Today’s guidelines are so ambiguous that an implantable cardioverter defibrillator with no apparent authentication whatsoever has been implanted in hundreds of thousands of patients.3

Fourth, technologists should ensure that IMDs do not continue to repeat the mistakes of history by underestimating the adversary, using outdated threat models, and neglecting to use cryptographic controls.5 In addition, technologists should not dismiss the importance of usable security and human factors.

Conclusion
There is no doubt that IMDs save lives. Patients prescribed such devices are much safer with the device than without, but IMDs are no more immune to security and privacy risks than any other computing device. Yet the consequences for IMD patients can be fatal. Tragically, it took seven cyanide poisonings in the 1982 Chicago Tylenol poisoning case for the pharmaceutical industry to redesign the physical security of its product distribution to resist tampering by a determined adversary. The security and privacy problems of IMDs are obvious, and the consequences just as deadly. We’d better get it right today, because surgically replacing an insecure IMD is much more difficult than an automated Windows update.

References
1. Epilepsy Foundation. Epilepsy Foundation Takes Action Against Hackers. March 31, 2008; http://www.epilepsyfoundation.org/aboutus/pressroom/action_against_hackers.cfm.
2. FDA Evaluation of Automatic Class III Designation VeriChip™ Health Information Microtransponder System, October 2004; http://www.sec.gov/Archives/edgar/data/924642/000106880004000587/ex99p2.txt.
3. Halperin, D. et al. Pacemakers and implantable cardiac defibrillators: Software radio attacks and zero-power defenses. In Proceedings of the 29th Annual IEEE Symposium on Security and Privacy, May 2008.
4. Halperin, D. et al. Security and privacy for implantable medical devices. In IEEE Pervasive Computing, Special Issue on Implantable Electronics (Jan. 2008).
5. Schneier, B. Security in the real world: How to evaluate security technology. Computer Security Journal 15, 4 (Apr. 1999); http://www.schneier.com/essay-031.html.
6. Webster, J.G., Ed. Design of Cardiac Pacemakers. IEEE Press, 1995.

Kevin Fu (kevinfu@cs.umass.edu) is an assistant professor of computer science at the University of Massachusetts Amherst.

This work was supported by NSF grant CNS-0831244.

Copyright held by author.

Calendar of Events

June 16–18
Conference on the Future of the Internet 2009,
Seoul, Republic of Korea,
Contact: Craig Partridge,
Phone: 517-324-3425,
Email: craig@bbn.com

June 19–20
International Symposium on Memory Management,
Dublin, Ireland,
Sponsored: SIGPLAN,
Contact: Elliot K Kolodner,
Email: kolodner@il.ibm.com

June 19–20
ACM SIGPLAN/SIGBED 2009 Conference on Languages, Compilers, and Tools for Embedded Systems,
Dublin, Ireland,
Sponsored: SIGPLAN,
Contact: Christoph Kirsch,
Email: ck@cs.uni-salzburg.at

June 20–24
The 36th Annual International Symposium on Computer Architecture,
Austin, TX,
Sponsored: SIGARCH,
Contact: Stephen W. Keckler,
Phone: 512-471-9763,
Email: sheckler@cs.utexas.edu

June 22
Fourth International Workshop on Mobility in the Evolving Internet Architecture,
Krakow, Poland,
Contact: Prof. Xiaoming,
Email: fu@cs.uni-goettingen.de

June 22–23
Second International Workshop on Future Multimedia Networking,
Coimbra, Portugal,
Contact: Eduardo Cerquiera,
Email: ecoelho@dei.uc.pt

June 22–25
23rd ACM/IEEE/SCS Workshop on Principles of Advanced and Distributed Simulation,
Lake Placid, NY,
Contact: Carl Tropper,
Email: carl@cs.mcgill.ca

June 23–26
12th International Symposium on Component Based Software Engineering,
East Stroudsburg, PA,
Sponsored: SIGSOFT,
Contact: Christine Hofmeister,
Email: Christine.hofmeister@gmail.com


DOI:10.1145/1516046.1516054 Peter J. Denning

The Profession of IT
Beyond Computational
Thinking
If we are not careful, our fascination with “computational thinking”
may lead us back into the trap we are trying to escape.

In the midst of our struggle to better articulate why computing is so much broader than programming, a movement of sorts has emerged. It is being called “computational thinking.”8 The U.S. National Science Foundation’s Computer and Information Science and Engineering (CISE) directorate has asked most proposers, especially those in its CPATH initiative, to include a discussion of how their projects advance computational thinking. Carnegie Mellon University’s Center for Computational Thinking says, “It is nearly impossible to do research in any scientific or engineering discipline without an ability to think computationally. … [We] advocate for the widespread use of computational thinking to improve people’s lives.”1

Computational thinking is seen by its adherents as a novel way to say what the core of the field is about, a lever to reverse the decline of enrollments, and a rationale for accepting computer science as a legitimate field of science. This movement is driven by four main concerns:
• Bringing computer science to the table of science (as partner, not programmer).
• Finding ways to make computer science a more attractive field for students to major in and for other sciences to collaborate with.
• Resurrecting ongoing inquiry into the deep questions of the field.6,9
• Showing that computation is fundamental, and often unavoidable, in most endeavors—a desire to proselytize.

Since starting a stint at NASA-Ames in 1983, I have been heavily involved with computational science and I have devoted a substantial part of my own career to advancing these objectives. Since 2003 I have advocated a great-principles approach to the perennially open question, “What is computer science?”4

Yet I am uneasy. I am concerned that the computational thinking movement reinforces a narrow view of the field and will not sell well with the other sciences or with the people we want to attract. I worry that we are not getting out of the box, but are merely repackaging it with new paper and a fresh ribbon.

In this column, I will examine two key questions:
• Is computational thinking a unique and distinctive characterization of computer science?
• Is computational thinking an adequate characterization of computer science?

My own conclusion is that both answers are no. I will suggest that a principles-based framework answers both questions yes. We are custodians of a deep and powerful discourse: Let’s not hide it with an inadequate name.

What is Computational Thinking?
Computational thinking has a long history within computer science. Known in the 1950s and 1960s as “algorithmic thinking,” it means a mental orientation to formulating problems as conversions of some input to an output and looking for algorithms to perform the conversions.

Today the term has been expanded to include thinking with many levels of abstractions, use of mathematics to develop algorithms, and examining how well a solution scales across different sizes of problems.1

Is Computational Thinking Unique to Computer Science?
In the 1940s, John von Neumann wrote prolifically on how computers would be not just a tool for helping science, but a way of doing science.

As early as 1975, Physics Nobel Laureate Ken Wilson promoted the idea that simulation and computation


were a way to do science that was not previously available. Wilson’s Nobel Prize was based on breakthroughs he achieved in creating computational models whose simulations produced radical new understandings of phase changes in materials. In the early 1980s, Wilson joined with other leading scientists in many fields to advocate that the grand challenges of science could be cracked by computation and that the government could accelerate the process by supporting a network of supercomputing centers.7 They argued that computation had become a third leg of science, joining the traditional legs of theory and experiment. The term “computational thinking” was common in their discussions.

The computational sciences movement eventually grew into a huge interagency initiative in high-performance computing, and culminated in the U.S. Congress passing a law funding a high-performance computing initiative in 1991.

This movement validated the notion that computation (and computational thinking) is essential to the advancement of science. It generated a powerful political movement that codified this notion into a U.S. federal law.

It is important to notice that this movement originated with the leaders of the physical and life sciences. Computer science was present but was not a key player. Computer scientists, in fact, resisted participation until NSF CISE and DARPA set up research programs open only to those collaborating with other sciences.

In the middle 1980s, Ken Wilson advocated the formation of departments of computational science in universities. He carefully distinguished them from computer science. The term “computational science” was chosen to avoid confusion with computer science.

Thus, computational science is seen in the other sciences not as a notion that flows out of computer science, but as a notion that flows from science itself. Computational thinking is seen as a characteristic of this way of science. It is not seen as a distinctive feature of computer science.

Therefore, it is unwise to pin our hopes on computational thinking as a way of telling people about the unique character of computer science. We need some other way to do that.

The sentiment that computational thinking is a recent insight into the true nature of computer science ignores the venerable history of computational thinking in computer science and in all the sciences. Computer science is a science in its own right (see the sidebar “Computer Science as Science”).

Computer Science as Science

Since its beginnings in the late 1930s, computer science has been a unique combination of math, engineering, and science. It is not one, but all three. Major subsets form legitimate fields of math, engineering, or science. But if you focus on a single subset, you cannot express the uniqueness of the field.

The term “computer science” traces back to the writings of John von Neumann, who believed that the architecture of machines and applications could be put on a rigorous scientific basis.

Until about 1990, the emphasis within the field was developing and advancing the technology. Building reliable computers within a networking infrastructure was a grand challenge that took many years. Now that this has been accomplished, we are increasingly able to emphasize the experimental method and reinvigorate our image as a science. Our many partnerships with other sciences, including biology, physics, astronomy, materials science, economics, cognitive science, and sociology, have led to amazing innovations.

These collaborations have uncovered questions in the other fields about whether computer science is legitimately science. Many see computer people as engineers implementing principles they did not discover rather than equal partners in the search for new principles. So it matters whether computer science qualifies as a full-fledged science. Whether a field is seen as a science depends on its satisfying six criteria:5
• Has an organized body of knowledge
• Results are reproducible
• Has well-developed experimental methods
• Enables predictions, including surprises
• Offers hypotheses open to falsification
• Deals with natural objects

Computer science easily passes the first five of these tests, so the debate has tended to center on the last. During the past decade, prominent scientists in other fields have discovered natural information processes—affirming the sixth criterion.3 The older definition of computer science as “the study of phenomena surrounding computers,” which dates back to Alan Perlis, George Forsythe, and Allen Newell around 1970, is giving way to “the study of information processes, natural and artificial.” The shift from computer as object of study to computer as tool is enabling us to revisit the deep questions of our field in the new light of computation as a lens through which to see the world. The most fundamental of these questions is: What is computation?6,9

Is Computational Thinking Adequate for Computer Science?
In 1936 Alan Turing defined what it means to compute a number. He offered a model of a computing machine and showed that the machines were universal (one could simulate another). He then used his theory to settle a century-old “decision problem” of mathematics, whether there is a by-inspection method to tell if a set of decision rules can terminate with a decision in a finite number of moves. He showed that the “decision problem” was not computable and argued that the very act of inspecting is inherently computational: not even inspectors can avoid computation. Computation is universal and unavoidable. His paper truly was the birth of computer science.

The modern formulations of science


recognize the same truth when they say that computation is an essential method of doing science. In fact, a growing number of scientists are now saying that information processes occur naturally (for example, DNA transcription) and that computation is needed to understand and eventually control them.3 So computation is unavoidable not only in the method of study, but in what is studied.

This is a subtle but important distinction. Computation is present in nature even when scientists are not observing it or thinking about it. Computation is more fundamental than computational thinking. For this reason alone, computational thinking seems like an inadequate characterization of computer science.

A number of us developed a great principles framework that exposes the fundamental scientific principles of computing4,6 (see the sidebar “The Great Principles Framework”). This framework interprets computer science as the study of fundamental properties of information processes, both natural and artificial. Computers are the tool, not the object of study. Computation pervades everyday life.2

The great principles framework reveals that there is something even more fundamental than an algorithm: the representation. Representations convey information. A computation is an evolving representation and an algorithm is a representation of a method to control the evolution.

In this framework, computational

at which we can develop various levels of skill. Computational thinking is one of several key practices at which every computer scientist should be competent (see the sidebar “The Great Principles Framework”). It shortchanges computer science to try to characterize the field by mentioning only one essential practice without mentioning the others or the principles of the field.

Conclusion
Computation is widely accepted as a lens for looking at the world. We do not need to sell that idea. Computational thinking is one of the key practices of computer science. But it is not unique to computing and is not adequate to portray the whole of the field.

In the 1960s and 1970s we allowed, and even encouraged, the perception “CS = programming,” which is now to

ity to take care of the concerns listed at the beginning of this column. But given the outside perception, computational thinking is all too easily seen as a repackaging—a change of appearance but not of substance. Do we really want to replace that older notion with “CS = computational thinking”? A colleague from another field recently said to me: “You computer scientists are hungry! First you wanted us to take your courses on literacy and fluency. Now you want us to think like you!”

I suggest that the real value of computer science is in the offers we are able to make from our expertise, which is founded in a rich and deep discourse. We are valued at the table when we help the others solve problems they care about. We are most valued not for our computational thinking, but for our computational doing.

References
1. Carnegie Mellon University Center for Computational Thinking; http://www.cs.cmu.edu/~CompThink.
2. Computer Science Unplugged Web site; http://csunplugged.org.
3. Denning, P. Computing is a natural science. Commun. ACM 50, 7 (July 2007), 13–18.
4. Denning, P. Great principles of computing. Commun. ACM 46, 11 (Nov. 2003), 15–20.
5. Denning, P. Is computer science science? Commun. ACM 48, 4 (Apr. 2005), 27–31.
6. Great Principles of Computing Web site; http://greatprinciples.org.
7. Wilson, K.G. Grand challenges to computational science. In Future Generation Computer Systems. Elsevier, 1989, 171–189.
8. Wing, J. Computational thinking. Commun. ACM 49, 3 (Mar. 2006), 33–35.
9. Wing, J. Five deep questions in computing. Commun. ACM 51, 1 (Jan. 2008), 58–60.

Peter J. Denning (pjd@nps.edu) is the director of the Cebrowski Institute for Information Innovation and Superiority at the Naval Postgraduate School in Monterey,
CA, and is a past president of ACM.
thinking is not a principle; it is a prac- our dismay widely accepted outside the
tice. A practice is a way of doing things field and is connected with our inabil- Copyright held by author.

The Great Principles Framework


The Great Principles (GP) framework is a way to express computer science as a field of science based on deep and enduring fundamental principles.³,⁴,⁶ The framework has two parts: core principles and core practices.

The core principles are statements and stories about the immutable laws and recurrences that shape and constrain all computing technologies. They can be grouped into seven categories:
• Computation
• Communication
• Coordination
• Recollection
• Automation
• Evaluation
• Design

These are not mutually exclusive groups of principles, but windows that bring particular perspectives about computing. The Internet, for example, is a technology that draws its operating principles primarily from communication, coordination, and recollection, and its architecture from design and evaluation.

The core practices are areas of skill and ability at which computing people can display various levels of performance such as beginner, competent, and expert. There are four core practices:
• Programming
• Engineering of systems
• Modeling
• Applying

Computational thinking can be seen either as a style of thought that runs through the practices or as a fifth practice. It is the ability to interpret the world as algorithmically controlled conversions of inputs to outputs.

30 COMMUNICATIONS OF THE ACM | JUNE 2009 | VOL. 52 | NO. 6


viewpoints

DOI:10.1145/1516046.1516058 Richard Stallman

Viewpoint
Why “Open Source” Misses
the Point of Free Software
Decoding the important differences in terminology, underlying philosophy,
and value systems between two similar categories of software.

When we call software “free,” we mean it respects the users’ essential freedoms: the freedom to run it, to study and change it, and to redistribute copies with or without changes (see http://www.gnu.org/philosophy/free-sw.html). This is a matter of freedom, not price, so think of “free speech,” not “free beer.”

These freedoms are vitally important. They are essential, not just for the individual users’ sake, but because they promote social solidarity—that is, sharing and cooperation. They become even more important as more aspects of our culture and life activities are digitized. In a world of digital sounds, images, and words, free software increasingly equates with freedom in general.

Tens of millions of people around the world now use free software; the schools in regions of India and Spain now teach all students to use the free GNU/Linux operating system (see http://www.gnu.org/gnu/linux-and-gnu.html). But most of these users have never heard of the ethical reasons for which we developed this system and built the free software community, because today this system and community are more often described as “open source,” and attributed to a different philosophy in which these freedoms are hardly mentioned.

The free software movement has campaigned for computer users’ freedom since 1983. In 1984 we launched the development of the free operating system GNU, so we could avoid the non-free operating systems that deny freedom to their users. During the 1980s, we developed most of the essential components of such a system, as well as the GNU General Public License (see http://www.gnu.org/licenses/gpl.html), a license designed specifically to protect freedom for all users of a program.

However, not all of the users and developers of free software agreed with the goals of the free software movement. In 1998, a part of the free software community splintered off and began campaigning in the name of “open source.” The term was originally proposed to avoid a possible misunderstanding of the term “free software,” but it soon became associated with philosophical views quite different from those of the free software movement.

Some of the proponents of “open source” considered it a marketing campaign for free software, which would appeal to business executives by citing practical benefits, while avoiding ideas of right and wrong they might not like to hear. Other proponents flatly rejected the free software movement’s ethical and social values. Whichever their views, when campaigning for “open source” they did not cite or advocate those values. The term “open source” quickly became associated with the practice of citing only practical values, such as making powerful, reliable software. Most of the supporters of “open source” have come to it since then, and that practice is what they take it to mean.

Nearly all open source software is free software; the two terms describe almost the same category of software. But they stand for views based on fundamentally different values. Open source is a development methodology; free software is a social movement. For the free software movement, free software is an ethical imperative, because only free software respects the users’ freedom. By contrast, the philosophy of open source considers issues in terms of how to make software “better”—in a practical sense only. It says that non-free software is a suboptimal solution. For the free software movement, however, non-free software is a social problem, and moving to free software is the solution.

Free software. Open source. If it’s the same software, does it matter which name you use? Yes, because different words convey different ideas.


While a free program by any other name would give you the same freedom today, establishing freedom in a lasting way depends above all on teaching people to value freedom. If you want to help do this, it is essential to speak about “free software.”

We in the free software movement don’t think of the open source camp as an enemy; the enemy is proprietary (non-free) software. But we want people to know we stand for freedom, so we do not accept being misidentified as open source supporters.

Common Misunderstandings of “Free Software” and “Open Source”
The term “free software” has a problem of misinterpretation: an unintended meaning, “software you can get for zero price,” fits the term just as well as the intended meaning, “software that gives the user certain freedoms.” We address this problem by publishing the definition of free software, and by saying “Think of free speech, not free beer.” This is not a perfect solution; it cannot completely eliminate the problem. An unambiguous, correct term would be better, if it didn’t have other problems.

Unfortunately, all the alternatives in English have problems of their own. We’ve looked at many alternatives that people have suggested, but none is so clearly correct that switching to it would be a good idea. Every proposed replacement for “free software” has some kind of semantic problem—and this includes “open source software.”

The official definition of “open source software,” which is published by the Open Source Initiative (see http://opensource.org/docs/osd) and too long to cite here, was derived indirectly from our criteria for free software. It is not the same; it is a little looser in some respects, so open source supporters have accepted a few licenses that we consider unacceptably restrictive of the users. Nonetheless, it is fairly close to our definition in practice.

However, the obvious meaning for the expression “open source software” is “You can look at the source code,” and most people seem to think that’s what it means. That is a much weaker criterion than free software, and much weaker than the official definition of open source. It includes many programs that are neither free nor open source. Since that obvious meaning for “open source” is not the meaning that its advocates intend, the result is that most people misunderstand the term. Here is how writer Neal Stephenson defined “open source”: “Linux is ‘open source’ software meaning, simply, that anyone can get copies of its source code files.”

I don’t think Stephenson deliberately sought to reject or dispute the “official” definition. I think he simply applied the conventions of the English language to come up with a meaning for the term. The state of Kansas published a similar definition: “Make use of open-source software (OSS). OSS is software for which the source code is freely and publicly available, though the specific licensing agreements vary as to what one is allowed to do with that code.”

Open source supporters try to deal with this by pointing to their official definition, but that corrective approach is less effective for them than it is for us. The term “free software” has two natural meanings, one of which is the intended meaning, so a person who has grasped the idea of “free speech, not free beer” will not get it wrong again. But “open source” has only one natural meaning, which is different from the meaning its supporters intend. So there is no succinct way to explain and justify the official definition of “open source.” That makes for worse confusion.

Another common misunderstanding of “open source” is the idea that it means “not using the GNU GPL.” It tends to accompany a misunderstanding of “free software,” equating it to “GPL-covered software.” These are equally mistaken, since the GNU GPL is considered an open source license, and most of the open source licenses are considered free software licenses.

Different Values Can Lead to Similar Conclusions… But Not Always
Radical groups in the 1960s had a reputation for factionalism: some organizations split because of disagreements on details of strategy, and the two resultant groups treated each other as enemies despite having similar basic goals and values. The right wing made much of this, and used it to criticize the entire left.

Some try to disparage the free software movement by comparing our disagreement with open source to the disagreements of those radical groups. They have it backward. We disagree with the open source camp on the basic goals and values, but their views and ours lead in many cases to the same practical behavior—such as developing free software.

As a result, people from the free software movement and the open source camp often work together on practical projects such as software development. It is remarkable that such different philosophical views can so often motivate different people to participate in the same projects. Nonetheless, these views are very different, and there are situations where they lead to very different actions.

The idea of open source is that allowing users to change and redistribute the software will make it more powerful and reliable. But this is not guaranteed. Developers of proprietary software are not necessarily incompetent. Sometimes they produce a program that is powerful and reliable, even though it does not respect the users’ freedom. How will free software activists and open source enthusiasts react to that?

A pure open source enthusiast, one that is not at all influenced by the ideals of free software, will say, “I am surprised you were able to make the program work so well without using our development model, but you did. How can I get a copy?” This attitude will reward schemes that take away our freedom, leading to its loss.

The free software activist will say, “Your program is very attractive, but not at the price of my freedom. So I have to do without it. Instead I will support a project to develop a free replacement.”


If we value our freedom, we can act to maintain and defend it.

Powerful, Reliable Software Can Be Bad
The idea that we want software to be powerful and reliable comes from the supposition that the software is designed to serve its users. If it is powerful and reliable, that means it serves them better.

But software can only be said to serve its users if it respects their freedom. What if the software is designed to put chains on its users? Then powerfulness only means the chains are more constricting, and reliability that they are harder to remove. Malicious features, such as spying on the users, restricting the users, back doors, and imposed upgrades are common in proprietary software, and some open source supporters want to do likewise.

Under the pressure of the movie and record companies, software for individuals to use is increasingly designed specifically to restrict them. This malicious feature is known as DRM, or Digital Restrictions Management (see DefectiveByDesign.org), and it is the antithesis in spirit of the freedom that free software aims to provide. And not just in spirit: since the goal of DRM is to trample your freedom, DRM developers try to make it difficult, impossible, or even illegal for you to change the software that implements the DRM.

Yet some open source supporters have proposed “open source DRM” software. Their idea is that by publishing the source code of programs designed to restrict your access to encrypted media, and allowing others to change it, they will produce more powerful and reliable software for restricting users like you. Then it will be delivered to you in devices that do not allow you to change it.

This software might be “open source,” and use the open source development model; but it won’t be free software, since it won’t respect the freedom of the users that actually run it. If the open source development model succeeds in making this software more powerful and reliable for restricting you, that will make it even worse.

Fear of Freedom
The main initial motivation for the term “open source software” is that the ethical ideas of “free software” make some people uneasy. That’s true: talking about freedom, about ethical issues, about responsibilities as well as convenience, is asking people to think about things they might prefer to ignore, such as whether their conduct is ethical. This can trigger discomfort, and some people may simply close their minds to it. It does not follow that we ought to stop talking about these things.

However, that is what the leaders of “open source” decided to do. They figured that by keeping quiet about ethics and freedom, and talking only about the immediate practical benefits of certain free software, they might be able to “sell” the software more effectively to certain users, especially businesses.

This approach has proved effective, in its own terms. The rhetoric of open source has convinced many businesses and individuals to use, and even develop, free software, which has extended our community—but only at the superficial, practical level. The philosophy of open source, with its purely practical values, impedes understanding of the deeper ideas of free software; it brings many people into our community, but does not teach them to defend it. That is good, as far as it goes, but it is not enough to make freedom secure. Attracting users to free software takes them just part of the way to becoming defenders of their own freedom.

Sooner or later these users will be invited to switch back to proprietary software for some practical advantage. Countless companies seek to offer such temptation, some even offering copies gratis. Why would users decline? Only if they have learned to value the freedom free software gives them, to value freedom as such rather than the technical and practical convenience of specific free software. To spread this idea, we have to talk about freedom. A certain amount of the “keep quiet” approach to business can be useful for the community, but it is dangerous if it becomes so common that the love of freedom comes to seem like an eccentricity.

That dangerous situation is exactly what we have. Most people involved with free software say little about freedom—usually because they seek to be more acceptable to business. Software distributors especially show this pattern. Nearly all GNU/Linux operating system distributions add proprietary packages to the basic free system, and they invite users to consider this an advantage, rather than a step backward from freedom.

Proprietary add-on software and partially non-free GNU/Linux distributions find fertile ground because most of our community does not insist on freedom with its software. This is no coincidence. Most GNU/Linux users were introduced to the system by “open source” discussion, which doesn’t say freedom is a goal. The practices that don’t uphold freedom and the words that don’t talk about freedom go hand in hand, each promoting the other. To overcome this tendency, we need more, not less, talk about freedom.

Conclusion
As the advocates of open source draw new users into our community, we free software activists must work even more to bring the issue of freedom to those new users’ attention. We have to say, “It’s free software and it gives you freedom!”—more and louder than ever. Every time you say “free software” rather than “open source,” you help our campaign.

Further Reading
1. Joe Barr wrote an article called “Live and Let License” (see http://www.itworld.com/LWD010523vcontrol4) that gives his perspective on this issue.
2. Lakhani and Wolf’s paper on the motivation of free software developers (see http://freesoftware.mit.edu/papers/lakhaniwolf.pdf) states that a considerable fraction are motivated by the view that software should be free. This was despite the fact they surveyed the developers on SourceForge, a site that does not support the view that this is an ethical issue.

Richard Stallman (rms@gnu.org) is the author of the free symbolic debugger GDB, the founder of the project to develop the free GNU operating system, and the founder of the Free Software Foundation.

Copyright © 2009 Richard Stallman
Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.


DOI:10.1145/1516046.1516057 George V. Neville-Neil

Kode Vicious
Obvious Truths
How to determine when to put the brakes on
late-running projects and untested software patches.
Dear KV,
I’ve been working on a project that, like all software projects, is late. And we’re not just late a little, but a lot. The project was supposed to take four weeks and we’re now in our third month. People are blaming the usual suspects: poorly spec’d-out work, management interference, and lack of proper infrastructure. What I want to know is how late is too late? How do people decide to just stop a project?
Late and Getting Later

Dear Later,
If I understand you correctly, and I hope I can because your email message is both short and direct, you are involved in a project that has now taken more than twice as long as predicted to implement and is approaching the thrice mark. I would say this is scary if it weren’t so common. Projects take on a life of their own at some point and when a group of people get together and continue to try to “look on the bright side” they keep finding “silver linings,” even though they are now drenched by the rain. It is amazing to me that a group of people who often seem so hard-headed and pragmatic—that is, engineers—can continue to believe there is a pot of gold just over the rainbow somewhere. Many projects can go on for years when they should have only gone on for months, so long as the money doesn’t run out.

From my point of view there are a few good places to pause and reflect in the life of the project.
1. You have gotten to the end of the originally published schedule and the work is not even 50% complete.
2. The originally published schedule has been extended by 50% or more.
3. The schedule is updated daily and the dates keep getting further out.
4. The engineers avoid coming to team meetings and when they do attend they:
   a. break down in tears;
   b. pretend to be asleep;
   c. bang their heads on the table.

[Illustration by Alexey Dudoladov / iStockphoto.com]

KV is in category C, but then I bet you knew that already. All of the above are indicators of schedule creep and a loss of control of the project. They are all good times to consider pulling the emergency brake handle. The reason the handle gets pulled so rarely is the aforementioned optimism of the staff, whereby if they “just work a little harder” the project will get done. I have never, in my entire working career, seen a project that is 50% off course get back on track because the team worked 80 hours a week instead of 60. Most people in high-pressure professions know how this works. The harder they work past a certain point, the more mistakes they make and the worse their output becomes. Pilots, fire fighters, emergency-room doctors, and the like all know that past a certain point everything they do will actually cause more trouble than if they did nothing at all. Because our profession is not as extreme as theirs we seem to never learn this, which is a shame, because it’s an important lesson. Learn when to stop.
KV

Driver Education
In this month’s installment of “things that ought to be obvious” I discuss patching, compiling, and testing code. I’m sure many of you have had these experiences before, and if you have a fun one to share please write to me and tell me about it.

I am sure most of you have heard the old programmer’s joke, “It compiles, ship it!,” which gets a good guffaw now and again from the denizens of cubicle land. I’m also sure that many readers have been subjected to using code that clearly compiled, ran maybe once, and then actually was shipped. But have you ever had to deal with people sending you patches that just didn’t work?

Recently, KV has been fixing a device driver that seems always to be very close to working. The driver wasn’t originally written by KV, and it certainly wasn’t originally tested by KV, although it now seems that the company I’m dealing with is using me as their unwitting alpha tester. There are few things more frustrating than a piece of software that almost works. It might tick along for days, doing just what it’s supposed to when—bang!—it breaks. With a bit of debugging and a bit of time in the lab I can explain what’s broken to the vendor. I even have source code for the driver so I can patch it when I understand what’s broken and send them patches; sometimes they send me patches.

It’s the part where they send me patches that has been a bit more interesting. I had been faithfully applying patches from the vendor and testing their fixes and I kept getting this sneaking feeling that they were not testing the patches before they sent them out. I had that feeling not just because I’m a naturally paranoid and suspicious person, which I am, but because each patch would fix, say, only 70% or 80% of the problem and then I’d have to provide the remaining bit of the fix. Finally, I got a patch that proved that although I am paranoid, it is not without reason. I applied a patch and it didn’t compile: the C keyword struct had been spelled incorrectly. Ha! I had them. They had not even applied their own patch; they were just sending me hacked bits of the driver that they thought would work. All I could think was, “Did you even compile this!?!?” But of course I already knew the answer.

Now I don’t bring this up just because I like to say, “I told you so,” because I don’t. I’d much prefer the code I received worked the first time, since my employers expect the same from me. I bring this up as yet another example of unwarranted programmer optimism. People who send out a “small patch” without even compiling it are far too confident in their own abilities. Please! Stop! Don’t do that! I don’t care if you see bits in your dreams and they assemble correctly in the morning when you type them in. The amount of time you waste by not doing the most basic tests on code you’re patching isn’t only your own; it’s multiplied by all the hapless suckers who took your patch and tried to use it.

Returning to my earlier remark, I would have thought this was obvious—as obvious as how to spell “struct”—but I have discovered this is not the case.
KV

George V. Neville-Neil (kv@acm.org) is the proprietor of Neville-Neil Consulting and a member of the ACM Queue editorial board. He works on networking and operating systems code for fun and profit, teaches courses on various programming-related subjects, and encourages your comments, quips, and code snips pertaining to his Communications column.

Copyright held by author.
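
The column's complaint about patches that were never applied or compiled by their senders suggests a simple pre-send routine: apply the patch, build, and only then mail it. The following is a minimal sketch of that idea, not anything described in the column itself. The function name `first_failing_step`, the patch file name `driver-fix.patch`, and the use of `git apply` and `make` as the apply and build commands are all assumptions chosen for illustration; a real project would substitute its own commands.

```python
import subprocess
from typing import List, Optional, Sequence, Tuple

# A step is a human-readable name plus the command line to run.
Step = Tuple[str, Sequence[str]]

# Hypothetical example steps: the patch file name and the build command are
# assumptions for illustration, not details taken from the column.
# `git apply --check` only tests whether the patch would apply.
EXAMPLE_STEPS: List[Step] = [
    ("patch applies cleanly", ["git", "apply", "--check", "driver-fix.patch"]),
    ("apply patch", ["git", "apply", "driver-fix.patch"]),
    ("build", ["make"]),
]


def first_failing_step(steps: List[Step]) -> Optional[str]:
    """Run the steps in order and return the name of the first one that fails.

    A step fails if its command exits non-zero or cannot be started at all.
    Returns None when every step succeeds.
    """
    for name, argv in steps:
        try:
            result = subprocess.run(list(argv), capture_output=True, text=True)
        except OSError:
            return name  # command not found or not executable
        if result.returncode != 0:
            return name
    return None
```

Under these assumptions, a sender would call `first_failing_step(EXAMPLE_STEPS)` from the top of the source tree and send the patch only if the result is `None`. Stopping at the first failure keeps the feedback specific: a misspelled `struct` would surface as a failed build step before anyone else had to apply the patch.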


practice
DOI:10.1145/1516046.1516059
COMMUNICATIONS OF THE ACM | JUNE 2009 | VOL. 52 | NO. 6
Article development led by queue.acm.org

New drive technologies and increased capacities create new categories of failure modes that will influence system designs.

BY JON ELERATH
ILLUSTRATION BY SUPERBROTHERS

Hard-Disk Drives: The Good, the Bad, and the Ugly

HARD-DISK DRIVES (HDDS) are like the bread in a peanut butter and jelly sandwich: seemingly unexciting pieces of hardware necessary to hold the software. They are simply a means to an end. HDD reliability, however, has always been a significant weak link, perhaps the weak link, in data storage. In the late 1980s people recognized that HDD reliability was inadequate for large data storage systems, so redundancy was added at the system level with some brilliant software algorithms, and RAID (redundant array of independent disks) became a reality. RAID moved the reliability requirements from the HDD itself to the system of data disks. Commercial implementations of RAID include n+1 configurations such as mirroring, RAID-4 and RAID-5, and the n+2 configuration, RAID-6, which increases storage system reliability using two redundant disks (dual parity). Additionally, reliability at the RAID group level has been favorably enhanced because HDD reliability has been improving as well.

Several manufacturers produce one-terabyte HDDs, and higher capacities are being designed. With higher areal densities (also known as bit densities), lower fly-heights (the distance between the head and the disk media), and perpendicular magnetic recording technology, can HDD reliability continue to improve? The new technology required to achieve these capacities is not without concern. Are the failure mechanisms or the probability of failure any different from predecessors? Not only are there new issues to address stemming from the new technologies, but also failure mechanisms and modes vary by manufacturer, capacity, interface, and production lot.

How will these new failure modes affect system designs? Understanding failure causes and modes for HDDs using technology of the current era and the near future will highlight the need for design alternatives and trade-offs that are critical to future storage systems. Software developers and RAID architects can not only better understand the effects of their decisions, but also know which HDD failures are outside their control and which they can manage, albeit with possible adverse performance or availability consequences. Based on technology and design, where must the developers and architects place the efforts for resiliency?

This article identifies significant HDD failure modes and mechanisms, their effects and causes, and relates them to system operation. Many failure mechanisms for new HDDs remain unchanged from the past, but the insidious undiscovered data corruptions (latent defects) that have plagued all HDD designs to one degree or another will continue to worsen in the near future as areal densities increase.
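The n+1 schemes named in the introduction (RAID-4 and RAID-5) protect each stripe with a single parity block. As a minimal sketch, assuming byte-wise XOR parity and toy four-byte blocks (the names `xor_blocks` and `rebuild` are hypothetical and not taken from any RAID implementation), the following shows that one lost block can be rebuilt from the survivors, and that a second, undiscovered bad block in the same stripe leaves the rebuild with nothing correct to work from:

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (single-parity arithmetic)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild(surviving_blocks):
    """Rebuild the one missing block of an n+1 stripe from every surviving
    block, data and parity alike."""
    return xor_blocks(surviving_blocks)

# Toy stripe: three data blocks plus one parity block (illustrative values).
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(data)

# Operational failure of the disk holding data[1]: the block is recoverable.
recovered = rebuild([data[0], data[2], parity])
print("single failure recovered:", recovered == data[1])

# Same failure, but data[2] also carries an undiscovered (latent) defect,
# modeled here as one corrupted byte. The rebuilt block is now wrong.
latent = b"\xaa\xbb\x00\xdd"
degraded = rebuild([data[0], latent, parity])
print("double failure recovered:", degraded == data[1])
```

The sketch covers only the single-parity case; the dual-parity arithmetic of RAID-6 is not shown.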
Two major categories of HDD failure can prevent access to data: those that fail the entire HDD and those that leave the HDD functioning but corrupt the data. Each of these modes has significantly different causes, probabilities, and effects. The first type of failure, which I term operational, is rather easy to detect, but has lower rates of occurrence than the data corruptions or latent defects that are not discovered until data is read. Figure 1 is a fault tree for the inability to read data (the topmost event in the tree), showing the two basic reasons that data cannot be read.

Figure 1: Fault tree for HDD read failures.
  Cannot read data (OR)
    Operational failures: cannot find data (OR)
      bad servo track
      SMART limit exceeded
      bad electronics
      bad read head
      can’t stay on track
    Latent failures: data missing (OR)
      error during writing (OR)
        bad media
        inherent bit errors
        high-fly write
      written but destroyed (OR)
        thermal asperities
        corrosion
        scratched media

Operational Failures: Cannot Find Data

Operational failures occur in two ways: first, data cannot be written to the HDD; second, after data is written correctly and is still present on the HDD uncorrupted, electronic or mechanical malfunction prevents it from being retrieved.

Bad servo track. Servo data is written at regular intervals on every data track of every disk surface. The servo data is used to control the positioning of the read/write heads. Servo data is required for the heads to find and stay on a track, whether executing a read, write, or seek command. Servo-track information is written only during the manufacturing process and can be neither reconstructed using RAID nor rewritten in the field. Media defects in the servo-wedges cause the HDD to lose track of the heads’ locations or where to move the head for the next read or write. Faulty servo tracks result in the inability to access data, even though the data is written and uncorrupted. Particles, contaminants, scratches, or thermal asperities can damage servo data.

Can’t stay on track. Tracks on an HDD are not perfectly circular; some are actually spiral. The head position is continuously measured and compared with where it should be. A PES (position error signal) repositions the head over the track. This repeatable run-out is all part of normal HDD head positioning control. NRRO (nonrepeatable run-out) cannot be corrected by the HDD firmware since it is nonrepeatable. Caused by mechanical tolerances from the motor bearings, actuator arm bearings, noise, vibration, and servo-loop response errors, NRRO can make the head positioning take too long to lock onto a track and ultimately produce an error. This mode can be induced by excessive wear and is exacerbated by high rotational speeds. It affects both ball and fluid-dynamic bearings. The insidious aspect of this type of problem is that it can be intermittent. Specific HDD usage conditions may cause a failure while reading data in a system, but under test conditions the problem might not recur.

Two very interesting examples of inability to stay on track are caused by audible noise. A video file available on YouTube shows a member of Sun’s Fishworks team yelling at his disk drives and monitoring the latency in disk operations [5]. The vibrations from his yelling induce sufficient NRRO that the actuator cannot settle for over 520 ms. While most (some) of us don’t yell at our HDDs, vibrations induced by thermal alarms (warning buzzers) have also been noted to induce NRRO and cause excessive latency and time-outs.

SMART limits exceeded. Today’s HDDs collect and analyze functional and performance data to predict impending failure using SMART (self-monitoring, analysis, and reporting technology). In general, sector reallocations are expected, and many spare sectors are available on each HDD. If an excessive number occurs in a specific time interval, however, the HDD is deemed unreliable and is failed out.

SMART isn’t really that smart. One trade-off that HDD manufacturers face during design is the amount of RAM available for storing SMART data and the frequency and method for calculating SMART parameters. When the RAM containing SMART data becomes full, is it purged, then refilled with new data? Or is the most recent x% of the data preserved and the oldest (100–x)% purged? The former method means that a rate calculation such as read-error-rate can be erroneous if the memory fills up during an event that produces many errors. The errors before filling RAM may not be sufficient to trigger a SMART event, nor may the errors after the purge, but had the purge not occurred, the error conditions may easily have resulted in a SMART trip.

In general, the SMART thresholds are set very low, missing numerous
conditions that could proactively fail an HDD. Making the trip levels more sensitive (trip at lower levels) runs the risk of failing HDDs with a few errors that really aren’t progressing to the point of failure. The HDD may simply have had a series of reallocations, say, that went smoothly, mapping out the problematic area of the HDD. Integrators must assess the HDD manufacturer’s implementation of SMART and see if there are other more instructive calculations. Integrators must at least understand the SMART data collection and analysis process at a very low level, then assess their specific usage pattern to decide whether the implementation of SMART is adequate or whether the SMART decisions need to be moved up to the system (RAID group) level.

Head games and electronics. Most head failures result from changes in the magnetic properties, not electrical characteristics. ESD (electrostatic discharge), high temperatures, and physical impact from particles affect magnetic properties. As with any highly integrated circuit, ESD can leave the read heads in a degraded mode. Subsequent moderate to low levels of heat may be sufficient to fail the read heads magnetically. A recent publication from Google didn’t find a significant correlation between temperature and reliability [6]. In my conversations with numerous engineers from all the major HDD manufacturers, none has said the temperature does not affect head reliability, but none has published a transfer function relating head life to time and temperature. The read element is physically hidden and difficult to damage, but heat can be conducted from the shields to the read element, affecting magnetic properties of the reader element, especially if it is already weakened by ESD.

The electronics on an HDD are complex. Failed DRAM and cracked chip capacitors have been known to cause HDD failure. As the HDD capacities increase, the buffer sizes increase and more RAM is required to cache writes. Is RAID at the RAM level required to assure reliability of the ever-increasing solid-state memory?

Operational Failure Data

In a number of studies on disk failure rates, all mean times between failures disagree with the manufacturers’ specification [1–3, 6, 7, 10, 11]. More disconcerting are the realizations that the failure rates are rarely constant; there are significant differences across suppliers, and great differences within a specific HDD family from a single supplier. These inconsistencies are further complicated by unexpected and uncontrolled lot-to-lot differences. In a population of HDDs that are all the same model from a single manufacturer, there can be statistically significant subpopulations, each having a different time-to-failure distribution with different parameters. Analyses of HDD data indicate these subpopulations are so different that they should not be grouped together for analyses because the failure causes and modes are different. HDDs are a technology that defies the idea of “average” failure rate or MTBF; inconsistency is synonymous with variability and unpredictability.

The following are examples of unpredictability that existed to such an extent that at some point in the product’s life, these subpopulations dominated the failure rate:

• Airborne contamination. Particles within the enclosure tend to fail HDDs early (scratches and head damage). This can give the appearance of an increasing failure rate. After all the contaminated HDDs fail, the failure rate often decreases.
• Design changes. Manufacturers periodically find it necessary to reduce cost, resolve a design issue discovered late in the test phase, or improve yields. Often, the change creates an improvement in field reliability, but can create more problems than it solves. For example, one design change had an immediately positive effect on reliability, but after two years another failure mode began to dominate and the HDD reliability became significantly worse.
• Yield changes. HDD manufacturers are constantly tweaking their processes to improve yield. Unfortunately, HDDs are so complex that these yield enhancements can inadvertently reduce reliability. Continuous tweaks can result in one month’s production being highly reliable and another month being measurably worse.

Figure 2: Weibull time to failure plot for three very different populations. (Vertical axis: Probability of Failure; horizontal axis: Time to Failure, hrs; series: HDD #1, HDD #2, HDD #3.)

Figure 3: Failure rate over time for five vintages and the composite. (Vertical axis: Probability of Failure; horizontal axis: Time to Failure, hrs; series: Vintage 1 through Vintage 5 and Composite.)

The net impact of variability in reliability is that RAID designers and software developers must develop logic and operating rules that will accommodate significant variability and the worst-case issues for all HDDs. Figure 2 shows a plot for three different HDD populations. If a single straight line fits the data points, the population can be represented by a Weibull probability distribution; if its slope is also 1.0, the population has a constant failure rate. (The Weibull distribution is used to create the common bathtub curve.) A single straight line cannot fit either population HDD#2 or HDD#3, so they do not even fit a Weibull distribution. In fact, these do not fit any single closed-form distribution, but are composed of multiple failure distributions from causes that dominate at different points in time. Figure 3 is an example of five HDD vintages from a single supplier. A straight line indicates a constant failure rate; the lower the slope, the more reliable the HDD. A vintage represents a product from a single month.
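The Weibull-plot reading described for Figure 2 can be checked numerically. The sketch below is illustrative only: it assumes the standard two-parameter Weibull form F(t) = 1 - exp(-(t/eta)^beta), uses made-up scale and shape values rather than data from the figures, and all function names are hypothetical. On Weibull axes, ln(-ln(1 - F)) plotted against ln(t) is a straight line whose slope is the shape parameter beta, and the failure (hazard) rate is constant only when beta equals 1.

```python
import math

def weibull_cdf(t, eta, beta):
    """Probability of failure by time t (hours), scale eta and shape beta."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def hazard_rate(t, eta, beta):
    """Instantaneous failure rate h(t) = (beta / eta) * (t / eta)**(beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)

def weibull_plot_slope(t1, t2, eta, beta):
    """Slope between two points on Weibull axes: x = ln t, y = ln(-ln(1 - F))."""
    y1 = math.log(-math.log(1.0 - weibull_cdf(t1, eta, beta)))
    y2 = math.log(-math.log(1.0 - weibull_cdf(t2, eta, beta)))
    return (y2 - y1) / (math.log(t2) - math.log(t1))

ETA = 500_000.0  # assumed characteristic life in hours (illustrative only)

for beta in (0.7, 1.0, 1.6):
    slope = weibull_plot_slope(1_000.0, 10_000.0, ETA, beta)
    early = hazard_rate(1_000.0, ETA, beta)
    late = hazard_rate(10_000.0, ETA, beta)
    if math.isclose(early, late):
        trend = "constant"
    else:
        trend = "rising" if late > early else "falling"
    print(f"beta={beta}: plot slope={slope:.3f}, failure rate is {trend}")
```

A population whose points bend away from any single straight line, as described for HDD#2 and HDD#3, cannot be summarized by one (eta, beta) pair; a mixture of such distributions would be needed, which this sketch does not attempt.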

Latent Defects: Data is Corrupted or Missing

The preceding discussion centered on failure modes in which data was good (uncorrupted) but some other electrical, mechanical, or magnetic function was impaired. These modes are usually rather easily detected and allow the system operator to replace the faulty HDD, reconstruct data on the new HDD, and resume storage functions. But what about data that is missing or corrupted because it either was not written well initially or was erased or corrupted after being written well? All errors resulting from missing data are latent because the corrupted data is resident without the knowledge of the user (software). The importance of latent defects cannot be overemphasized. The combination of a latent defect followed by an operational failure is the most likely sequence to result in a double failure and loss of data [1].

To understand latent defects better, consider the common causes.

Write errors can be caught and corrected using a read-verify command, but this requires an extra read command after writing and can nearly double the effective time to write data. The BER (bit-error rate) is a statistical measure of the effectiveness of all the electrical, mechanical, magnetic, and firmware control systems working together to write (or read) data. Most bit errors occur on a read command and are corrected using the HDD’s built-in error-correcting code algorithms, but errors can also occur during writes. While BER does account for some fraction of defective data, a greater source of data corruption is the magnetic recording media coating the disks.

The distance that the read-write head flies above the media is carefully controlled by the aerodynamic design of the slider, which contains the reader and writer elements. In today’s designs, the fly height is less than 0.3 µ-in. Events that disturb the fly height, increasing it above the specified height during a write, can result in poorly written data because the magnetic-field strength is too weak. Remember that magnetic-field strength does not decrease linearly as a function of distance from the media, but is a power function, so field strength falls off very rapidly as the distance between the head and media increases. Writing data while the head is too high can result in the media being insufficiently magnetized so it cannot be read even when the read element is flying at the specified height. If writing over a previously written track, the old data may persist where the head was flying too high. For example, if all the HDDs in a cabinet are furiously writing at the same time, self-induced vibrations and resonances can be great enough to affect the fly height. Physically bumping or banging an HDD during a write or walking heavily across a poorly supported raised floor can create excessive vibration that affects the write.

A more difficult problem to solve is a persistent increase in the fly height caused by buildup of lubrication or other hydrocarbons on the surface of the slider. Hydrocarbon lubricants are used in three places within enclosed HDDs. To reduce the NRRO, motors often use fluid-dynamic bearings. The actuator arm that moves the heads pivots using an enclosed bearing cartridge that contains a lubricant. The media itself also has a very thin layer of lubricant applied to prevent the
heads from touching the media itself. Lubricants on the media can build up on the head under certain circumstances and cause the head to fly too high. Lube buildup can also mean that uncorrupted, well-written data cannot be read because the read element is too far from the media. Lube buildup can be caused by the mechanical properties of the lubricant, which are dependent upon its chemical composition. Persistent high fly height can also be caused by specific operations. For example, when not writing or reading, if the head is left to sit above the same track while the disks spin, lubricant can collect on the heads. In some cases simply powering down the HDD will cause the heads to touch down (as they are designed to do) in the landing zone to disturb the lube buildup. This is very design specific, however, and does not always work.

During the manufacturing process, the surface of the HDD is checked and defects are mapped out, and the HDD firmware knows not to write in these locations. Manufacturers also add “padding” around the defective area, mapping out more blocks than the estimated minimum and creating additional physical distance around the defect that is not available for storing data. Since it is difficult to determine the exact length, width, and shape of a defect, the added padding provides an extra safeguard against writing on a media defect.

Media imperfections such as voids (pits), scratches, hydrocarbon contamination (various oils), and smeared soft particles can not only cause errors during writing, but also corrupt data after it has been written. The sputtering process used to apply some of the media layers can leave contaminants buried within the media. Subsequent contact by the slider can remove these bumps, leaving voids in which the media is defective. If data is already written there, the data is corrupted. If none is written, the next write process will be unsuccessful, but the user won’t know this unless a write-verify command is used.

Early reliability analyses assumed that once written, data will remain undestroyed except by degradation of the magnetic properties of the media, a process known as bit-rot. Bit-rot, in which the magnetic media is not capable of holding the proper magnetic field to be correctly interpreted as a 0 or a 1, is really not an issue. Media can degrade, but the probability of this mode is inconsequential compared with other modes. Data can become corrupted whenever the disks are spinning, even when data is not being written to or read from the disk. Common causes for erasure include thermal asperities, corrosion, and scratches or smears.

Thermal asperities are instances of high heat for a short duration caused by head-disk contact. This is usually the result of heads hitting small “bumps” created by particles that remain embedded in the media surface even after burnishing and polishing. The heat generated on a single contact can be high enough to erase data. Even if not on the first contact, cumulative effects of numerous contacts may be sufficient to thermally erase data or mechanically destroy the media coatings and erase data.

The sliders are designed to push away airborne particles so they do not become trapped between the head and disk surface. Unfortunately, removing all particles that are in the 0.3 µ-in. range is very difficult, so particles do get caught. Hard particles used in the manufacture of an HDD, such as Al2O3, TiW, and C, will cause surface scratches and data erasure. These scratches are then media defects that are not mapped out, so the next time data is written to those locations the data will be corrupted immediately. Other “soft” materials such as stainless steel can come from assembly tooling and aluminum from residuals from machining the case. Soft particles tend to smear across the surface of the media, rendering the data unreadable and unwritable. Corrosion, although carefully controlled, can also cause data erasure and may be accelerated by high ambient heat within the HDD enclosure and the very high heat flux from thermal asperities.

Latent Defects Data

Latent defects are the most insidious kinds of errors. These data corruptions are present on the HDD but undiscovered until the data is read. If no operational failures occur at the first
reading of the data, the corruption is corrected using the parity disk and no data is lost. If one HDD, however, has experienced an operational failure and the RAID group is in the process of reconstruction when the latent defect is discovered, that data is lost. Since latent defects persist until discovered (read) and corrected, their rate of occurrence is an extremely important aspect of RAID reliability.

One study concludes that the BER is fairly inconsequential in terms of creating corrupted data [4], while another claims the rate of data corruption is five times the rate of HDD operating failures [8]. Analyses of corrupted data identified by specific SCSI error codes and subsequent detailed failure analyses show that the rate of data corruption for all causes is significant and must be included in the reliability model.

NetApp (Network Appliance) completed a study in late 2004 on 282,000 HDDs used in RAID architecture. The RER (read-error rate) over three months was 8 × 10^-14 errors per byte read. At the same time, another analysis of 66,800 HDDs showed an RER of approximately 3.2 × 10^-13 errors per byte. A more recent analysis of 63,000 HDDs over five months showed a much-improved 8 × 10^-15 errors per byte read. In these studies, data corruption is verified by the HDD manufacturer as an HDD problem and not a result of the operating system controlling the RAID group.

While Jim Gray of Microsoft Research asserted that it is reasonable to transfer 4.32 × 10^12 bytes/day/HDD, the study of 63,000 HDDs read 7.3 × 10^17 bytes of data in five months, an approximate read rate of 2.7 × 10^11 bytes/day/HDD [4]. Using combinations of the RERs and number of bytes read yields the hourly read failure rates shown in the table here. For example, 8.0 × 10^-15 errors per byte at 1.35 × 10^9 bytes read per hour gives 1.08 × 10^-5 errors per hour.

Range of average read error rates.

Read errors per byte per HDD    Bytes read per hour
                                Low rate (1.35 × 10^9)    High rate (1.35 × 10^10)
Low (8.0 × 10^-15)              1.08 × 10^-5 err/hr       1.08 × 10^-4 err/hr
Medium (8.0 × 10^-14)           1.08 × 10^-4 err/hr       1.08 × 10^-3 err/hr
High (3.2 × 10^-13)             4.32 × 10^-4 err/hr       4.32 × 10^-3 err/hr

Latent defects do not occur at a constant rate, but in bursts or adjacent physical (not logical) locations. Although some latent defects are created by wear-out mechanisms, data is not available to discern wear-out from those that occur randomly at a constant rate. These rates are between 2 and 100 times greater than the rates for operational failures.

Potential Value of Data Scrubbing

Latent defects (data corruptions) can occur during almost any HDD activity: reading, writing, or simply spinning. If not corrected, these latent defects will result in lost data when an operational failure occurs. They can be eliminated, however, by background scrubbing, which is essentially preventive maintenance on data errors. During scrubbing, which occurs during times of idleness or low I/O activity, data is read and compared with the parity. If they are consistent, no action is taken. If they are inconsistent, the corrupted data is recovered and rewritten to the HDD. If the media is defective, the recovered data is written to new physical sectors on the HDD and the bad blocks are mapped out.

If scrubbing does not occur, the period of time to accumulate latent defects starts when the HDD begins operation in the system. Since scrubbing requires reading and writing data, it can act as a time-to-failure accelerator for HDD components with usage-dependent time-to-failure mechanisms. The optimal scrub pattern, rate, and time of scrubbing are HDD-specific and must be determined in conjunction with the HDD manufacturer to assure that operational failure rates are not increased.

Frequent scrubbing can affect performance, but too infrequent scrubbing makes the n+1 RAID group highly susceptible to double disk failures. Scrubbing, as with full HDD data reconstruction, has a minimum time to cover the entire HDD. The time to complete the scrub is a random variable that depends on HDD capacity and I/O activity. The operating system may invoke a maximum time to complete scrubbing.

Future Technology and Trade-Offs

How are those failure modes going to impact future HDDs that have more than one-terabyte capacity? Certainly, all the failure mechanisms that occur in the 1TB drive will persist in higher density drives that use perpendicular magnetic recording (PMR) technology. PMR uses a “thick,” somewhat soft underlayer, making it susceptible to media scratching and gouging. The materials that cause media damage include softer metals and compositions that were not as great a problem in older, longitudinal magnetic recording. Future higher density drives are likely to be even more susceptible to scratching because the track width will be narrower.

Another PMR problem that will persist as density increases is side-track erasure. Changing the direction of the magnetic grains also changes the direction of the magnetic fields. PMR has a return field that is close to the adjacent tracks and can potentially erase data in those tracks. In general, the track spacing is wide enough to mitigate this mechanism, but if a particular track is written repeatedly, the probability of side-track erasure increases. Some applications are optimized for performance and keep the head in a static position (few tracks). This increases the chances of not only lube buildup (high fly writes) but also erasures.

One concept being developed to increase bit-density is heat assisted magnetic recording (HAMR) [9]. This technology requires a laser within the write head to heat a very small area on the media to enable writing. High-stability media using iron-platinum alloys allow bits to be recorded on much
smaller areas than today’s standard media without being limited by super-paramagnetism. Controlling the amount and location of the heat is, of course, a significant concern.

RAID is designed to accommodate corrupted data from scratches, smears, pits, and voids. The data is re-created from the parity disk and the corrupted data is reconstructed and rewritten. Depending on the size of the media defect, this may be a few blocks or hundreds of blocks. As the areal density of the HDDs increases, the same physical size of the defect will affect more blocks or tracks and require more time for re-creation of data. One trade-off is the amount of time spent recovering corrupted data. A desktop HDD (most ATA drives) is optimized to find the data no matter how long it takes. In a desktop there is no redundancy and it is (correctly) assumed that the user would rather wait 30–60 seconds and eventually retrieve the data than to have the HDD give up and lose data.

Each HDD manufacturer has a proprietary set of recovery algorithms it employs to recover data. If the data cannot be found, the servo controller will move the heads a little to one side of the nominal center of the track, then to the other side. This off-track reading may be performed several times at different off-track distances. This is a very common process used by all HDD manufacturers, but how long can a RAID group wait for this recovery?

Some RAID integrators may choose to truncate these steps with the knowledge that the HDD will be considered failed even though it is not an operational failure. On the other hand, how long can a RAID group response be delayed while one HDD is trying to recover data that is readily recoverable using RAID? Also consider what happens when a scratch is encountered. The process of recovery for a large

to a spare HDD (even the corrupted data), and resume recovery. A copy command is much quicker than reconstructing the data based on parity, and if there are no defects, little data will be corrupted. This means that reconstruction of this small amount of data will be fast and not result in the same time-out condition. The offending HDD can be (logically) taken out of the RAID group and undergo detailed diagnostics to restore the HDD and map out bad sectors.

In fact, a recent analysis shows the true impact of latent defects on the frequency of double disk failures [1]. Early RAID papers stated that the only failures of concern were operational failures because, once written, data does not change except by bit-rot.

Improving Reliability

Hard-disk drives don’t just fail catastrophically. They may also silently corrupt data. Unless checked or scrubbed, these data corruptions result in double disk failures if a catastrophic failure also occurs. Data loss resulting from these events is the dominant mode of failure for an n+1 RAID group. If the reliability of RAID groups is to increase, or even keep up with technology, the effects of undiscovered data corruptions must be mitigated or eliminated. Although scrubbing is one clear answer, other creative methods to deal with latent defects should be explored.

Multi-terabyte capacity drives using perpendicular recording will be available soon, increasing the probability of both correctable and uncorrectable errors by virtue of the narrowed track widths, lower flying heads, and susceptibility to scratching by softer particle contaminants. One mitigation factor is to turn uncorrectable errors into correctable errors through greater error-correcting capability on the drive (4KB blocks rather than 512-

should consider optimizations around these high-probability events and their effects on the RAID operation.

Only when these high-probability events are included in the optimization of the RAID operation will reliability improve. Failure to address them is a recipe for disaster.

Related articles on queue.acm.org

You Don’t Know Jack about Disks
Dave Anderson
http://queue.acm.org/detail.cfm?id=864058

CTO Roundtable: Storage
http://queue.acm.org/detail.cfm?id=1466452

A Conversation with Jim Gray
http://queue.acm.org/detail.cfm?id=864078

References
1. Elerath, J.G. Reliability model and assessment of redundant arrays of inexpensive disks (RAID) incorporating latent defects and non-homogeneous poisson process events. Ph.D. dissertation, Department of Mechanical Engineering, University of Maryland, 2007.
2. Elerath, J.G. and Pecht, M. Enhanced reliability modeling of RAID storage systems. In Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, (Edinburgh, UK, June 2007).
3. Elerath, J.G. and Shah, S. Server class disk drives: How reliable are they? In Proceedings of the Annual Reliability and Maintainability Symposium, (January 2004), 151–156.
4. Gray, J. and van Ingen, C. Empirical measurements of disk failure rates and error rates. Microsoft Research Technical Report, MSR-TR-2005-166, December 2005.
5. Gregg, B. Shouting in the datacenter, 2008; http://www.youtube.com/watch?v=tDacjrSCeq4.
6. Pinheiro, E., Weber, W.-D., and Barroso, L.A. Failure trends in a large disk drive population. In Proceedings of the Fifth Usenix Conference on File and Storage Technologies (FAST), (February 2007).
7. Schroeder, B. and Gibson, G. Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you? In Proceedings of the Fifth Usenix Conference on File and Storage Technologies (FAST), (February 2007).
8. Schwarz, T.J.E., et al. Disk scrubbing in large archival storage systems. In Proceedings of the IEEE Computer Society Symposium (2004), 1161–1170.
9. Seigler, M. and McDaniel, T. What challenges remain to achieve heat-assisted magnetic recording? Solid State Technology (Sept. 2007); http://www.solid-state.com/display_article/304597/5/ARTCL/none/none/What-challenges-remain-to-achieve-heat-assisted-magnetic-recording?/.
10. Shah, S. and Elerath, J.G. Disk drive vintage and its affect on reliability. In Proceedings of the Annual Reliability and Maintainability Symposium, (January 2004), 163–167.
11. Sun, F. and Zhang, S. Does hard-disk drive failure rate
number of blocks, even if the process or 520-byte blocks) and by using the enter steady-state after one year? In Proceedings of
The Annual Reliability and Maintainability Symposium,
is truncated, may result in a time-out complete set of recovery steps. These IEEE, (January 2007).
condition. The HDD is off recovering will decrease performance, so RAID
data or the RAID group is reconstruct- architects must address this trade-off. Jon Elerath is a staff reliability engineer at SolFocus.
ing data for so long that the perfor- Operational failure rates are not He has focused on hard-disk drive reliability for more
than half his 35-plus-year career, which includes
mance comes to a halt; a time-out constant. It is necessary to analyze positions at NetApp, General Electric, Tegal, Tandem
threshold is exceeded and the HDD is field data, determine failure modes Computers, Compaq, and IBM.

considered failed. and mechanisms, and implement cor-


One option is quickly to call the of- rective actions for those that are most
fending HDD failed, copy all the data problematic. The operating system © 2009 ACM 0001-0782/09/0600 $10.00



practice
DOI:10.1145/1516046.1516060
Article development led by queue.acm.org

Network Front-end Processors, Yet Again

The history of NFE processors sheds light on the trade-offs involved in designing network stack software.

BY MIKE O’DELL

“This time for sure, Rocky!”
—Bullwinkle J. Moose

The history of the network front-end (NFE) processor, best known as a TCP offload engine (or TOE), extends back to the Arpanet interface message processor and possibly before. The notion is beguilingly simple: partition the work of executing communications protocols from the work of executing the applications that require the services of those protocols. That way, the applications and the network machinery can achieve maximum performance and efficiency, possibly taking advantage of special hardware performance assistance. While this looks utterly compelling on the whiteboard, architectural and implementation realities intrude, often with considerable force.

This article will not attempt to discern whether the NFE is a heavenly gift or a manifestation of evil incarnate. Rather, it will follow its evolution starting from a pure host-based implementation of a network stack and then moving the network stack farther from that initial position, observing the issues that arise. The goal is to offer insight into the trade-offs that influence the location choice for network stack software in a larger systems context. As such, it is an attempt to prevent old mistakes from being reinvented while harvesting as much clean grain as possible.

As a starting point, consider the canonical structure of a common workstation or server before the advent of multicore processors. Ignoring the provenance of the operating-system code, this model springs directly from the quintessential early to mid-1980s computer science department computer, the DEC VAX 11/780 with a 10Mb Ethernet interface with single-cycle direct memory access (DMA) ability and connected to a relatively slow 16-bit bus (the DEC Unibus).

Since there is only one processor, the network stack vies for the attention of the CPU with everything else running on the machine, albeit probably with the aid of a software priority mechanism that makes the network code “more equal than others.”

When a packet arrives, the Ethernet interface validates the Ethernet frame cyclic redundancy check (CRC) and then uses DMA to transfer the packet into buffers used by the network code for protocol processing. The DMA transfers require only one local bus cycle for each 16-bit word, and on the VAX 11/780 the processor controller for the Unibus buffers 16-bit words into a single 32-bit transfer into main memory.

The TCP checksum is then calculated by the network code, the protocol state machinery conducts its business, and the TCP payload data is copied into “socket buffers” to await consumption
by the application program. When the read for the payload data happens, it is copied from the socket buffer into application process memory to be digested as required. That makes a total of four passes over the data in a single packet before the application gets a shot at using it. When networks were slow compared with memory bandwidth and processor speed, the data-copy inefficiency was considered minor compared with the joy of a working network stack, so it failed to provoke immediate improvement.

[Photograph: A VAX-11/780 from 1983 with 16MB of RAM, and the Ethernet interface containing a Motorola 68000 processor to handle the network traffic. Photograph by Patrick Finnegan.]

This base-case platform appears to be the origin of the folk theorem that “TCP needs one (VAX-)MIPS per 10 megabits/second of network performance.” The 10Mbps Ethernet can deliver about a megabyte/second of payload, so this is consistent with the other folk theorem of “one megabyte of memory per MIPS per megabyte of I/O.” Where this came from is difficult to pin down, but it is frequently credited to Gene Amdahl.

Now, let’s move this same model to PC hardware. For a long time, one of the principal distinctions between PCs and minicomputers was I/O performance. To be brutal, compared with its minicomputer forebears, the PC platform started life with almost no I/O capabilities. Over the life of the PC platform, that conspicuous lack prompted major renovations of the PC’s I/O architecture. For the period of our interest, that progressed from the 16-bit ISA bus, to 32-bit PCI, and now PCI Express. For reasons too boring to explore here, for a very long time, packets moved from PC Ethernet cards into protocol processing buffers with a byte-copy operation performed by the CPU, upping the data-handling pass count to five.

The first significant improvement came when the raw-packet copy operation and TCP checksum were combined. Some network code tried to do this in software. As PCI Ethernet
cards developed efficient DMA hardware, some combined the TCP checksum generation with the copy operation, reducing the pass count to three. This clearly reduced CPU use for a given amount of TCP throughput and started the march to “protocol assist” services performed by network interfaces. (“If a little help is good, a lot of help should be better!”) Adapting the network stack code to exploit this new checksum capability was not trivial, but the handwriting on the wall made it clear that such evolution was likely to continue. Significant redesign of the network code had to be done to allow functions to move between hardware and software with greater ease in the future. This was genuine architectural progress, although it did not happen overnight.

A Success Disaster
With the explosion of the Web, performance demands on network servers skyrocketed. Processors and network interfaces were getting faster, and memory bandwidth strangulation was being solved. Gigabit Ethernet quickly became commonplace on server motherboards (and gamer desktop motherboards!). By this time, the cost of all those data copies was clearly unacceptable. Simply halving the number of copies would come close to doubling the sustainable transaction rate for many Web workloads.

This gave rise to the Holy Grail of what became known as zero-copy TCP. The idea was that programs written to exploit this new capability could have data delivered right into application buffers without any intervening copies (ignoring the possible exception of one efficient DMA transfer from the hardware). Clearly this would require some cooperation (or at least reduced antagonism) from designers of Ethernet interface hardware, but a working solution would win many hearts and minds.

The step from a zero-copy TCP network stack to a full-blown TCP offload engine looks pretty obvious at this point. It seems even more attractive given that many PC-based platforms were slow to exploit the multiprocessor abilities the PC was developing. (Whether it is multiple chips or multiple cores on one chip is largely irrelevant.) The ability to add a fast processor that can be applied entirely to protocol processing is certainly an attractive idea. It is, however, much more difficult to do in real life than it first appears on the whiteboard.

Simply moving data directly off the network wire into application buffers is not sufficient. The delivery of packets must be coordinated with all the other things the application is doing and all the other operating-system machinery behind the scenes. As a result, the network protocol stack interacts with the rest of the operating system in exquisitely delicate ways. Truth be told, this coordination machinery is the lion’s share of the code in most stack implementations. The actual TCP state machine fits on a half page, once divorced of all the glue and scaffolding needed to integrate it with the rest of the system environment. It is precisely this subtle and complex control coupling that makes it surprisingly difficult to isolate a network protocol stack fully from its host operating system. There are multiple reasons why this interaction is such a rich breeding ground for implementation bugs, but one vast category is “abstraction mismatch.”

Because communications protocols inherently deal with multiple communicating entities, some assumptions must be made about the behavior of those entities. The degree to which those assumptions match between a host system and protocol code determines how difficult it will be to map to existing semantics and how much new structure and machinery will be required. When networking first went into Berkeley Unix, subtleties on both sides required considerable effort to reconcile. There was a critical desire to make network connections appear to be natural extensions of existing Unix machinery: file descriptors, pipes, and the other ideas that make Unix conceptually compact. But because of radical differences in behavior, especially delay, it is impossible to completely disguise reading 1,000 bytes from a round-the-world network connection so that it appears indistinguishable from reading that same 1,000 bytes from a file on a local file system. Networks have new behaviors that require new interfaces to capture and manage, but those new interfaces must make sense with existing interfaces. This was difficult work, and the modifications left few pieces of the system untouched; a few changed in profound ways.

The fundamental capabilities provided by a network protocol stack are data transfer, multiplexing, flow control, and error management. All of these functions are required for the coordinated delivery of data between endpoints across the Internet. Indeed, this is the purpose of all the structure in the packet headers: to carry the control coordination information, as well as the payload data.

The critical observation is that the exact same operations are required to coordinate the interaction of a network protocol stack and the host operating system within a single system. When all the code is in the same place (that is, running on the same processor), this signaling is easily done with simple procedure calls. If, however, the network protocol stack executes on a remote processor such as a TOE, this signaling must be done with an explicit protocol carried across whatever connects the front-end processor to the host operating system. This protocol is called a host-front end protocol (HFEP).

Designing an HFEP is not trivial, especially if the goal is that it be materially simpler than the protocol being offloaded to the remote processor. Historically, the HFEP has been the Achilles’ heel of NFE processors. The HFEP ends up being asymptotically as complex as the “primary” protocol being offloaded, so there is very little to gain in offloading it. In addition, the HFEP must be implemented twice: once in the host and once in the front-end processor, each one of those being a different host platform as far as the HFEP is concerned. Two implementations, two integrations with host operating systems—this means twice as many sources of subtle race conditions, deadlocks, buffer starvations, and other nasty bugs. This cost requires a huge payoff to cover it.

But Wait a Minute…
About now some readers may be eager to throw a penalty flag for “unconvincing hand waving” because even in the base case, there is a protocol between the Ethernet interface and the host computer device driver. “Doesn’t that count?” you rightfully ask. Yes, indeed, it does.

There is a long history of peripheral chips being designed with absolutely dreadful interfaces. Such chips have been known to make device-driver writers contemplate slow, painful violence if they ever meet the chip designer in a dark alley. The very early Ethernet chips from one famous semiconductor company were absolute masterpieces of egregious overdesign. Not only did they contain many complex functions of dubious utility, but also the functions that were genuinely required suffered from the same virulent infestation of bugs that plagued the useless bits. Tom Lyon wrote a famous Usenix paper in 1985, “All the Chips that Fit,” delivering an epic rant on this expansive topic. (It should be required reading for anyone contemplating hardware design.)

If the goal is efficiency and performance of network code, all of the “mini-protocols” in the entire network protocol subsystem must be examined carefully. Both internal complexity and integration complexity can be serious bottlenecks. Ultimately, the question is how hard is it to glue this piece onto the other pieces it must interact with frequently? If it is very difficult, it is likely not fast (in an absolute sense), nor is it likely robust from a bug standpoint.

Remember the protocol state machines are generally not the principal source of complexity or performance issues. One extra data copy can make a huge difference in the maximum achievable performance. Therefore, implementations must focus on avoiding data motion: put it where it goes the first time it is touched, then leave it alone. If some other operation on packet payload is required, such as checksum computation, bury it inside an unavoidable operation such as the single transfer into memory. In line with those suggestions, streamline the operating-system interface to maximize concurrency. Once all those issues have been addressed aggressively, there’s not a lot of work left to avoid.

What Does All this Mean for NFEs?
Many times, but not every time, an NFE is likely to be an overly complex solution to the wrong part of the problem. It is possibly an expedient short-term measure (and there’s certainly a place in the world for those), but as a long-term architectural approach, the commoditization of processor cores makes specialized hardware very difficult to justify.

Lacking NFEs, what is required for maximizing host-based network performance? Here are some guidelines:

• Wire interfaces should be designed to be fast and brilliantly simple. Do the bit-speed work and then get the data into memory as quickly as possible, doing any additional work such as checksums that can readily be buried in the unavoidable transfer. Streamline the device as seen by the driver so as to avoid playing “Twenty Questions” with the hardware to determine what just happened.
• Interconnects should have sufficient capacity to carry the network traffic without strangling other I/O operations. From the standpoint of a network interface, PCI Express appears to have adequate performance for 10Gbps Ethernet as does HyperTransport 3.0.
• The system must have sufficient memory bandwidth to get the network payload in and out without strangling the rest of the system, especially the processors. Historically, the PC platform has been chronically starved for memory bandwidth.
• Processors should have enough cores able to exploit the sufficient memory bandwidth.
• Network protocol stacks should be designed to maximize parallelism and minimize blocking, while never copying data.
• A set of network APIs should be designed to maximize performance as opposed to mandatory similarity with existing system calls. Backward compatibility is important to support, but some applications may wish to pay more to get more.

Historical Perspective
NFEs have been rediscovered in at least four or five different periods. In the spirit of full and fair disclosure, I must admit to having directly contributed to two of those efforts and having purchased and integrated yet another. So why does this idea keep recurring if it turns out to be much more difficult than it first appears?
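The advice to bury checksum work inside an unavoidable transfer can be illustrated in software. The sketch below is not taken from any particular network stack: the function names `internet_checksum` and `copy_with_checksum` are hypothetical, and the checksum is the 16-bit ones'-complement sum of the kind TCP uses. It contrasts a separate checksum pass with a copy loop that accumulates the sum as each 16-bit word moves, so the payload is touched once instead of twice. Python is used only to show the structure; a real stack would do this in C, in assembly, or in the interface's DMA hardware.

```python
def internet_checksum(data: bytes) -> int:
    """Separate pass: 16-bit ones'-complement checksum over data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def copy_with_checksum(src: bytes, dst: bytearray) -> int:
    """Fused pass: copy src into dst and checksum it in the same loop."""
    total = 0
    n = len(src)
    for i in range(0, n - 1, 2):
        hi, lo = src[i], src[i + 1]
        dst[i], dst[i + 1] = hi, lo      # the unavoidable data movement
        total += (hi << 8) | lo          # checksum work rides along
    if n % 2:                            # trailing odd byte, zero-padded
        dst[n - 1] = src[n - 1]
        total += src[n - 1] << 8
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


if __name__ == "__main__":
    payload = bytes(range(256)) * 3 + b"x"
    socket_buffer = bytearray(len(payload))
    fused = copy_with_checksum(payload, socket_buffer)
    print(fused == internet_checksum(payload), bytes(socket_buffer) == payload)
```

Both routines return the same value; the only difference is the number of passes over the data, which is the cost the pass counts in this article are tallying.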
The capacities and economics of computer systems do not advance smoothly, nor are the rates of improvement of various components synchronized. The resulting interactions produce dramatically different trade-offs in system partitioning that evolve over time. What is correct today may not be right after the next technology improvement. An example will illustrate the point.

Once upon a time, disk storage was expensive—really expensive—but it also exhibited significant economy of scale. At that time, LAN connectivity and processor performance were sufficient to make it desirable to share large disks among multiple workstations, giving rise to the diskless workstation. This lasted for a number of years, but as disks slid down the learning curve, the decreasing cost per megabyte of disk space overwhelmed the operational complexity of diskless workstations so they became diskfull, and they have been ever since—until relatively recently. Today the typical large organization averages the better part of one PC per employee, so the operational grief of administering all those desktop PCs is substantial. This cost is now high enough that the diskless workstation has been rediscovered, this time named thin clients. All the storage is elsewhere; nothing permanent exists on the desktop unit. History is busily repeating itself. Why? Because the various cost curves have moved enough, relative to each other, to the point where centralization makes sense.

The same thing happens with NFEs. At a point in time, systems don’t have enough network “go-fast” to deliver the performance required, so just add a dedicated processor to the network interface to make up for it. The economics of that are fleeting at best, however. Between chip design and system-integration complexity, an NFE will need to be an economically attractive solution for quite some time to recoup the development costs. Unfortunately, the relentless improvements in processor, memory system, and system interconnect in the base PC platform make that window of advantage a shrinking, fast-moving target. Does anyone else remember the HiFN file compression processor chip? It was built into PC systems for a very short time. Processors quickly improved enough to do the compression/decompression on the fly, however, and that was the end of HiFN’s dream—well before the dropping cost of disk storage would have killed it.

Any effort to question the efficacy of NFEs should include a caveat for one particular case that merits a special mention because it indeed makes a compelling case for a particular style of NFE.

The proliferation of microcontrollers in devices such as thermostats, light switches, toasters, and almost everything else with more than a simple on/off switch has created a real opportunity for NFEs. Almost all of these microcontroller applications are typified by intense cost pressure, which usually translates into extreme limitations on available computing resources. It is simply out of the question to put a network stack in the vast majority of these systems, but the desirability of remote management of these devices increases daily.

This has created a new breed of NFE: the network communications adapter (NCA) that specializes in the simplicity of the protocol between the microcontroller host and the NFE—serial ASCII. Most microcontrollers have some serial port ability, so by looking like a terminal, the NCA can play the role of translator, speaking serial out one side and TCP/IP out the other. The NCA appears as a host on the TCP network, often containing a simple Web server that vends state information and may provide certain other management functions that get translated into simple ASCII exchanges with the microcontroller system.

An NCA is usually implemented in one of the more powerful microcontrollers that have been designed to provide an Ethernet interface and support enough RAM and ROM to contain a simplified network stack. The NCA is now available as an off-the-shelf module designed for easy integration no more difficult than a modem on a serial port.

The question of which is the tail and which is the dog comes to mind in many of these applications. From the TCP network’s point of view, the NCA is the host and the microcontroller is being managed. From the point of view of the lighting controller, the NCA looks like just one more switch, albeit a chatty one. This distinction is usually irrelevant—it just makes hash of pedantic layering diagrams. There’s something quite satisfying about that.

Conclusion
Rather than debate the religious propriety of NFEs, particularly the TOE variety, I have examined the architectural issues that have produced their recurring rise and fall. The TOE-style NFE is best viewed as a tactical tool with a limited expected lifetime of economic viability, not an enduring architectural approach. This is just another example of the recurring ebb and flow of functions between specialized peripherals and the system CPU(s), as the economics slosh back and forth interacting with system requirements. The limited lifetime of the NFE’s advantages makes it difficult to justify the significant development costs for any but the highest-value applications.

That said, the inexpensive NCA is likely to be an approach that does endure. It literally transforms network communication into an inexpensive, pluggable physical component. By doing so, it provides an avenue for dealing with the extreme cost pressure inherent in microcontroller applications while providing an incremental option of genuine network citizenship when the customer will pay for it.

Related articles on queue.acm.org

TCP Offload to the Rescue
Andy Currid
http://queue.acm.org/detail.cfm?id=1005069

Network Virtualization
Scott Rixner
http://queue.acm.org/detail.cfm?id=1348592

DAFS: A New High-Performance Networked File System
Steve Kleiman
http://queue.acm.org/detail.cfm?id=1388770

Mike O’Dell is a venture partner at New Enterprise Associates (NEA), Chevy Chase, MD, where he works to identify early-stage IT, communications, and energy opportunities. Prior to this position, O’Dell was chief scientist at UUNET Technologies, responsible for network and product architecture during the emergence of the commercial Internet. He has also held positions at Bellcore (now Telcordia), a GaAs Sparc supercomputer startup, and a U.S. government contractor. He was founding editor of Computing Systems, an international refereed scholarly journal.

© 2009 ACM 0001-0782/09/0600 $10.00
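As an illustration of the network communications adapter (NCA) role described in the article above, here is a minimal sketch of the translation step only. Everything in it is an assumption made for illustration: the `FakeMicrocontroller` class stands in for a device on a serial port, the `GET NAME` / `OK NAME VALUE` line format is an invented ASCII protocol, and `handle_http_request` is a hypothetical helper rather than any vendor's firmware API. A real NCA would read the request from a TCP connection and write the command to a UART; both ends are replaced by plain strings here so the sketch stays self-contained.

```python
class FakeMicrocontroller:
    """Stand-in for the managed device; speaks one-line ASCII commands."""

    def __init__(self):
        self.state = {"TEMP": "21.5", "MODE": "HEAT"}

    def exchange(self, line: str) -> str:
        """Accept one ASCII command line and return one ASCII reply line."""
        parts = line.strip().split()
        if len(parts) == 2 and parts[0] == "GET" and parts[1] in self.state:
            return f"OK {parts[1]} {self.state[parts[1]]}\r\n"
        return "ERR\r\n"


def handle_http_request(request_line: str, device) -> str:
    """Translate one HTTP request line into a serial exchange and back."""
    parts = request_line.split()
    if len(parts) != 3 or parts[0] != "GET":
        return "HTTP/1.0 400 Bad Request\r\n\r\n"
    name = parts[1].strip("/").upper()
    if not name:
        return "HTTP/1.0 400 Bad Request\r\n\r\n"
    # Network side in, serial side out: one short ASCII command.
    reply = device.exchange(f"GET {name}\r\n").strip()
    if not reply.startswith("OK "):
        return "HTTP/1.0 404 Not Found\r\n\r\n"
    body = f"{name}={reply.split()[2]}\n"
    return (
        "HTTP/1.0 200 OK\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n\r\n{body}"
    )


if __name__ == "__main__":
    print(handle_http_request("GET /temp HTTP/1.0", FakeMicrocontroller()))
```

The microcontroller side never sees HTTP or TCP, only short ASCII lines, which is the simplicity the NCA approach trades on.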


DOI:10.1145/1516046.1516061
Article development led by queue.acm.org

Whither Sockets?

High bandwidth, low latency, and multihoming challenge the sockets API.

BY GEORGE V. NEVILLE-NEIL

One of the most pervasive and longest-lasting interfaces in software is the sockets API. Developed by the Computer Systems Research Group at the University of California at Berkeley, the sockets API was first released as part of the 4.1c BSD operating system in 1982. While there are longer-lived APIs—for example, those dealing with Unix file I/O—it is quite impressive for an API to have remained in use and largely unchanged for 27 years. The only major update to the sockets API has been the extension of ancillary routines to accommodate the larger addresses used by IPv6 [2].

The Internet and the networking world in general have changed in very significant ways since the sockets API was first developed, but in many ways the API has had the effect of narrowing the way in which developers think about and write networked applications. This article briefly examines some of the conditions present when the sockets API was developed and considers how those conditions shaped the way in which networking code was written. Later, I look at ways in which developers have tried to get around some of the inherent limitations in the API and address the future of sockets in a changing networked world.

The two biggest differences between the networks of 1982 and 2009 are topology and speed. For the most part it is the increase in speed rather than the changes in topology that people notice. The maximum bandwidth of a commercially available long-haul network link in 1982 was 1.5Mbps. The Ethernet LAN, which was being deployed at the same time, had a speed of 10Mbps. A home user—and there were very few of these—was lucky to have a 300bps connection over a phone line to any computing facility. The round-trip time between two machines on a local area network was measured in tens of milliseconds, and between systems over the Internet in hundreds of milliseconds, depending of course on location and the number of hops a packet would be subjected to when being routed between machines. (See page 52 for a look at the early Internet.)

[Figure: map of the early Internet; the node labels did not survive extraction. Legible note: “This map does not show ARPA’s experimental satellite connections. Names shown are IMP names, not necessarily host names.” Source: cybergeography atlas, personalpages.manchester.ac.uk (exact URL not recoverable).]

The topology of networks at the time was relatively simple. Most computers had a single connection to a local area network; the LAN was connected to a primitive router that might have a few connections to other LANs and a single
connection to the Internet. For one ap- seen as a way of extending the Unix file 1, it is those five shown that are central
plication to another application, the I/O model over a computer network. to the API and that differentiate it from
connection was either across a LAN or One other factor that focused the sock- regular file I/O. In reality the socket()
transiting one or more routers, called ets API down to the client/server model call could have been dropped and re-
IMPs (Internet message passing). was that the most popular protocol it placed with a variant of open(), but this
supported was TCP, which has an in- was not done at the time. The sock-
History of Sockets herently 1:1 communication model. et() and open() calls actually return
The model of distributed program- The sockets API made the client/ the same thing to a program: a process-
ming that came to be most popularized by the sockets API was the client/server model, in which there is a server and a set of clients. The clients send messages to the server to ask it to do work on their behalf, wait for the server to do the work requested, and at some later point receive an answer. This model of computing is now so ubiquitous it is often the only model with which many software engineers are familiar. At the time it was designed, however, it was

server model easy to implement because of the small number of extra system calls that programmers would need to add to their non-networked code so it could take advantage of other computing resources. Although other models are possible, with the sockets API the client/server model is the one that has come to dominate networked computing.

Although the sockets API has more entry points than those shown in Table

Table 1: Sockets API system calls.
socket()     Create a communication endpoint
bind()       Bind the endpoint to some set of network-layer parameters
listen()     Set a limit on the number of outstanding work requests
accept()     Accept one or more work requests from a client
connect()    Contact a server to submit a work request

unique file descriptor that is used in all subsequent operations with the API. It is the simplicity of the API that has led to its ubiquity, but that ubiquity has held back the development of alternative or enhanced APIs that could help programmers develop other types of distributed programs.

Client/server computing had many advantages at the time it was developed. It allowed many users to share resources, such as large storage arrays and expensive printing facilities, while keeping these facilities within the control of the same departments that had once run mainframe computing facilities. With this sharing model, it was possible to increase the utilization of what, at the time, were expensive resources.

Three disparate areas of networking are not well served by the sockets API: low-latency or real-time applications; high-bandwidth applications;
and multihomed systems, that is, those with multiple network interfaces. Many people confuse increasing network bandwidth with higher performance, but increasing bandwidth does not necessarily reduce latency. The challenge for the sockets API is giving the application faster access to network data.

The way in which any program using the sockets API sends and receives data is via calls to the operating system. All of these calls have one thing in common: the calling program must repeatedly ask for data to be delivered. In a world of client/server computing these constant requests make perfect sense, because the server cannot do anything without a request from the client. It makes little sense for a print server to call a client unless the client has something it wishes to print. What, however, if the service provided is music or video distribution? In a media distribution service there may be one or more sources of data and many listeners. For as long as the user is listening to or viewing the media, the most likely case is that the application will want whatever data has arrived. Specifically requesting new data is a waste of time and resources for the application. The sockets API does not provide the programmer a way in which to say, “Whenever there is data for me, call me to process it directly.”

Sockets programs are instead written from the viewpoint of a dearth of, rather than a wealth of, data. Network programs are so used to waiting on data that they use a separate system call, select(), so that they can listen to multiple sources of data without blocking on a single request. The typical processing loop of a sockets-based program isn’t simply read(), process(), read(), but instead select(), read(), process(), select(). Although the addition of a single system call to a loop would not seem to add much of a burden, this is not the case. Each system call requires arguments to be marshaled and copied into the kernel, as well as causing the system to block the calling process and schedule another. If there were data available to the caller when it invoked select(), then all of the work that went into crossing the user/kernel boundary was wasted because a read() would have returned data immediately. The constant check/read/check is wasteful unless the time between successive requests is quite long.

Solving this problem requires inverting the communication model between an application and the operating system. Various attempts to provide an API that allows the kernel to call directly into a program have been proposed but none has gained wide acceptance, for a few reasons. The operating systems that existed at the time the sockets API was developed were, except in very esoteric circumstances, single threaded and executed on single-processor computers. If the kernel had been fitted with an up-call API, there would have been the problem of which context the call could have executed in. Having all other work on a system pause because the kernel was executing an up-call into an application would have been unacceptable, particularly in timesharing systems with tens to hundreds of users. The only place in which such software architecture did gain currency was in embedded systems and networked routers where there were no users and no virtual memory.

The issue of virtual memory compounds the problems of implementing a kernel up-call mechanism. The memory allocated to a user process is virtual memory, but the memory used by devices such as network interfaces is physical. Having the kernel map physical memory from a device into a user-space program breaks one of the fundamental protections provided by a virtual memory system.

Attempts to Overcome Performance Issues

A couple of different mechanisms have been proposed and sometimes implemented on various operating systems to overcome the performance issues present in the sockets API. One such mechanism is zero-copy sockets. Anyone who has worked on a network stack knows that copying data is what kills the performance of networking protocols. Therefore, to improve the speed of networked applications that are more interested in high bandwidth than in low latency, the operating system is modified to remove as many data copies as possible. Traditionally, an operating system performs two copies for each packet received by the system.
The first copy is performed by the network driver from the network device’s memory into the kernel’s memory, and the second is performed by the sockets layer in the kernel when the data is read by the user program. Each of these copy operations is expensive because it must occur for each message that the system receives. Similarly, when the program wants to send a message, data must be copied from the user’s program into the kernel for each message sent; then that data will be copied into the buffers used by the device to transmit it on the network.

Most operating-system designers and developers know that data copying is anathema to system performance and work to minimize such copies within the kernel. The easiest way for the kernel to avoid a data copy is to have device drivers copy data directly into and out of kernel memory. On modern network devices this is a result of how they structure their memory. The driver and kernel share two rings of packet descriptors, one for transmit and one for receive, where each descriptor has a single pointer to memory. The network device driver initially fills these rings with memory from the kernel. When data is received, the device sets a flag in the correct receive descriptor and tells the kernel, usually via an interrupt, that there is data waiting for it. The kernel then removes the filled buffer from the receive descriptor ring and replaces it with a fresh buffer for the device to fill. The packet, in the form of the buffer, then moves through the network stack until it reaches the socket layer, where it is copied out of the kernel when the user’s program calls read(). Data sent by the program is handled in a similar way by the kernel, in that kernel buffers are eventually added to the transmit descriptor ring and a flag is then set to tell the device that it can place the data in the buffer on the network.

All of this work in the kernel leaves the last copy problem unsolved, and several attempts have been made to extend the sockets API to remove this copy operation.1,3 The problem remains as to how memory can be safely shared across the user/kernel boundary. The kernel cannot give its memory over to the user program, because at that point it loses control over the memory. A user program that crashes may leave the kernel without a significant chunk of usable memory, leading to system performance degradation. There are also security issues inherent in sharing memory buffers across the kernel/user boundary. There is no single answer to how a user program might achieve higher bandwidth using the sockets API.

For programmers who are more concerned with latency than with bandwidth, even less has been done. The only significant improvement for programs that are waiting for a network event has been the addition of a set of kernel events that a program can wait on. Kernel events, or kevents(), are an extension of the select() mechanism to encompass any possible event that the kernel might be able to tell the program about. Before the advent of kevents, a user program could call select() on any file descriptor, which would let the program know when any of a set of file descriptors was readable, writable, or had an error. When programs were written to sit in a loop and wait on a set of file descriptors (for example, reading from the network and writing to disk), the select() call was sufficient, but once a program wanted to check for other events, such as timers and signals, select() no longer served. The problem for low-latency apps is that kevents() do not deliver data; they deliver only a signal that data is ready, just as the select() call did. The next logical step would be to have an event-based API that also delivered data. There is no reason to have the application cross the user/kernel boundary twice simply to get the data the kernel knows the application wants.

Lack of Support for Multihoming

The sockets API not only presents performance problems to the application writer, but also narrows the type of communication that can take place. The client/server paradigm is inherently a 1:1 type of communication. Although a server may handle requests from a diverse group of clients, each client has only one connection to a single server for a request or set of requests. In a world in which each computer had only one network interface, that paradigm made perfect sense. A connection between a client and server is identified by a quad of <Source IP, Source Port, Destination IP, Destination Port>. Since services generally have a well-known destination port (for example, 80 for HTTP), the only value that can easily vary is the source port, since the IP addresses are fixed.

Table 2: APIs added by SCTP.
API                                                   Explanation
sctp_bindx()                                          Bind or unbind an SCTP socket to a list of addresses
sctp_connectx()                                       Connect an SCTP socket with multiple destination addresses
sctp_generic_recvmsg()                                Receive data from a peer
sctp_generic_sendmsg(), sctp_generic_sendmsg_iov()    Send data to a peer
sctp_getaddrlen()                                     Return the address length of an address family
sctp_getassocid()                                     Return an association ID for a specified socket address
sctp_getpaddrs(), sctp_getladdrs()                    Return list of addresses to caller
sctp_peeloff()                                        Detach an association from a one-to-many socket to a separate file descriptor
sctp_sendx()                                          Send a message from an SCTP socket
sctp_sendmsgx()                                       Send a message from an SCTP socket

In the Internet of 1982 each machine that was not a router had only a single network interface, meaning that to identify a service, such as a remote printer, the client computer needed a single destination address and port and had, itself, only a single source address and port to work with. While it did exist, the idea that a computer might have multiple ways of reaching a service was too complicated and far too expensive to implement. Given these constraints, there was no reason for the sockets API to expose to the programmer the ability to write a multihomed program, one that could manage
which interfaces or connections mattered to it. Such features, when they were implemented, were a part of the routing software within the operating system. The only way programs could get access to them was through an obscure set of nonstandard kernel APIs called a routing socket.

On a system with multiple network interfaces it is not possible, using the standard sockets API, to write an application that can easily be multihomed, that is, take advantage of both interfaces so if one fails, or if the primary route over which the packets were flowing breaks, the application would not lose its connection to the server.

The recently developed Stream Control Transmission Protocol (SCTP)4 incorporates support for multihoming at the protocol level, but it is impossible to export this support through the sockets API. Several ad-hoc system calls were initially provided and are the only way to access this functionality. At the moment this is the only protocol that has both the capacity and user demand for this feature, so the API has not been standardized across more than a few operating systems. Table 2 shows the APIs that SCTP added.

While the list of functions in Table 2 contains more APIs than are strictly necessary, it is important to note that many are derivatives of preexisting APIs, such as send(), which need to be extended to work in a multihoming world. The set of APIs needs to be harmonized to make multihoming a first-class citizen in the sockets world. The problem now is that sockets are so successful and ubiquitous that it is very hard to change the existing API set for fear of confusing its users or the preexisting programs that use it.

As systems come to have more network interfaces built in, providing the ability to write applications that take advantage of multihoming will be an absolute necessity. One can easily imagine the use of such technology in a smartphone, which already has three network interfaces: its primary connection via the cellular network, a WiFi interface, and often a Bluetooth interface as well. There is no reason for an application to lose connectivity if even one of these network interfaces is working properly. The problem for application designers is that they want their code to work, with few or no changes, across a plethora of devices, from cellphones, to laptops, to desktops, and so on. With properly defined APIs we would remove the artificial barrier that prevents this. It is only because of the history of the sockets API and the fact that it has been “good enough” to date that this need has not yet been addressed.

High bandwidth, low latency, and multihoming are driving the development of alternatives to the sockets API. With LANs now reaching 10 Gbps, it is obvious that for many applications client/server style communication is far too inefficient to use the available bandwidth. The communication paradigms supported by the sockets API must be expanded to allow for memory sharing across the kernel boundary, as well as for lower-latency mechanisms to deliver data to applications. Multihoming must become a first-class feature of the sockets API because devices with multiple active interfaces are now becoming the norm for networked systems.

Related articles on queue.acm.org

Code Spelunking: Exploring Cavernous Code Bases
George Neville-Neil
http://queue.acm.org/detail.cfm?id=945136

API Design Matters
Michi Henning
http://queue.acm.org/detail.cfm?id=1255422

You Don’t Know Jack about Network Performance
Kevin Fall and Steve McCanne
http://queue.acm.org/detail.cfm?id=1066069

References
1. Balaji, P., Bhagvat, S., Jin, H.-W., and Panda, D.K. Asynchronous zero-copy communication for synchronous sockets in the sockets direct protocol (sdp) over infiniband journal. In Proceedings of the 20th IEEE International Parallel and Distributed Processing Symposium.
2. Gilligan, R., Thomson, S., Bound, J., McCann, J., and Stevens, W. Basic Socket Interface Extensions for IPv6. RFC 3493 (Feb. 2003); http://www.rfc-editor.org/rfc/rfc3493.txt.
3. Romanow, A., Mogul, J., Talpey, T., and Bailey, S. Remote Direct Memory Access (RDMA) over IP Problem Statement. RFC 4297 (Dec. 2005); http://www.rfc-editor.org/rfc/rfc4297.txt.
4. Stewart, R., et al. Stream Control Transmission Protocol. RFC 2960 (Oct. 2000); http://www.ietf.org/rfc/rfc2960.txt.

George V. Neville-Neil (kv@acm.org) is the proprietor of Neville-Neil Consulting. He works on networking and operating systems code and teaches courses on program-related topics.

© 2009 ACM 0001-0782/09/0600 $10.00
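
The select(), read(), process(), select() loop discussed in this article can be sketched in a few lines. What follows is only an illustration under stated assumptions, not code from any particular system: the names `process`, `run_loop`, and `demo` are hypothetical, the "work" is a stand-in that upper-cases the bytes, and a local socket pair takes the place of a real network peer. It shows the two user/kernel crossings per message that the article describes, one to ask whether data is ready and one to fetch it.

```python
import select
import socket


def process(data):
    """Hypothetical stand-in for the application's real work."""
    return data.upper()


def run_loop(sources, max_messages, timeout=1.0):
    """Run select(), read(), process() until max_messages are handled."""
    sources = list(sources)
    results = []
    while sources and len(results) < max_messages:
        # First user/kernel crossing: ask which sockets have data waiting.
        readable, _, _ = select.select(sources, [], [], timeout)
        if not readable:
            break  # timed out with nothing to read
        for sock in readable:
            # Second crossing: actually fetch the data.
            data = sock.recv(4096)
            if not data:
                sources.remove(sock)  # peer closed the connection
                continue
            results.append(process(data))
    return results


def demo():
    """Send one message over a local socket pair and run the loop on it."""
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(b"hello")
        return run_loop([receiver], max_messages=1)
    finally:
        sender.close()
        receiver.close()
```

Calling `demo()` drives a single pass through the loop. When data is already waiting, as it is here, the select() call reports only what the following recv() would have returned immediately anyway, which is the wasted crossing the article points to.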

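
The five calls in Table 1 and the connection quad of <Source IP, Source Port, Destination IP, Destination Port> can also be seen together in a small experiment. This is a minimal sketch under assumptions that are not in the article: it uses the loopback address 127.0.0.1, asks the operating system to choose the server port by binding to port 0, and the helper names `four_tuple` and `demo` are hypothetical.

```python
import socket


def four_tuple(sock):
    """Return (source IP, source port, destination IP, destination port)."""
    src_ip, src_port = sock.getsockname()[:2]
    dst_ip, dst_port = sock.getpeername()[:2]
    return (src_ip, src_port, dst_ip, dst_port)


def demo():
    """Open one loopback TCP connection and report both ends' view of it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket()
    listener.bind(("127.0.0.1", 0))  # bind(); port 0 lets the OS choose
    listener.listen(1)               # listen()
    client = socket.create_connection(listener.getsockname())    # connect()
    server_side, _ = listener.accept()                            # accept()
    try:
        return four_tuple(client), four_tuple(server_side)
    finally:
        client.close()
        server_side.close()
        listener.close()
```

The two tuples returned by `demo()` describe the same connection from opposite ends, so each is the mirror image of the other. Both addresses are pinned to one interface for the life of the connection, which is the limitation the multihoming discussion is concerned with.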
contributed articles
DOI:10.1145/1516046.1516062

The Claremont Report on Database Research

Database research is expanding, with major efforts in system architecture, new languages, cloud services, mobile and virtual worlds, and interplay between structure and text.

RAKESH AGRAWAL, ANASTASIA AILAMAKI, PHILIP A. BERNSTEIN, ERIC A. BREWER, MICHAEL J. CAREY, SURAJIT CHAUDHURI, ANHAI DOAN, DANIELA FLORESCU, MICHAEL J. FRANKLIN, HECTOR GARCIA-MOLINA, JOHANNES GEHRKE, LE GRUENWALD, LAURA M. HAAS, ALON Y. HALEVY, JOSEPH M. HELLERSTEIN, YANNIS E. IOANNIDIS, HANK F. KORTH, DONALD KOSSMANN, SAMUEL MADDEN, ROGER MAGOULAS, BENG CHIN OOI, TIM O’REILLY, RAGHU RAMAKRISHNAN, SUNITA SARAWAGI, MICHAEL STONEBRAKER, ALEXANDER S. SZALAY, GERHARD WEIKUM

A group of database researchers, architects, users, and pundits met in May 2008 at the Claremont Resort in Berkeley, CA, to discuss the state of database research and its effects on practice. This was the seventh meeting of this sort over the past 20 years and was distinguished by a broad consensus that the database community is at a turning point in its history, due to both an explosion of data and usage scenarios and major shifts in computing hardware and platforms.

Here, we explore the conclusions of this self-assessment. It is by definition somewhat inward-focused but may be of interest to the broader computing community as both a window into upcoming directions in database research and
a description of some of the community issues and initiatives that surfaced. We describe the group’s consensus view of new focus areas for research, including database engine architectures, declarative programming languages, interplay of structured data and free text, cloud data services, and mobile and virtual worlds. We also report on discussions of the database community’s growth and processes that may be of interest to other research areas facing similar challenges.

Over the past 20 years, small groups of database researchers have periodically gathered to assess the state of the field and propose directions for future research.1,3–7 Reports of the meetings served to foster debate within the database research community, explain research directions to external organizations, and help focus community efforts on timely challenges.

[Illustration by Gluekit]

The theme of the Claremont meeting was that database research and the data-management industry are at a turning point, with unusually rich opportunities for technical advances, intellectual achievement, entrepreneurship, and benefits for science and society. Given the large number of opportunities, it is important for the database research community to address issues that maximize relevance within the field, across computing, and in external fields as well.

The sense of change that emerged in the meeting was a function of several factors:

Excitement over “big data.” In recent years, the number of communities working with large volumes of data has grown considerably to include not only traditional enterprise applications and Web search but also e-science efforts (in astronomy, biology, earth science, and more), digital entertainment, natural-language processing, and social-network analysis. While the user base for traditional database management systems (DBMSs) is growing quickly, there is also a groundswell of effort to design new custom data-management solutions from simpler components. The ubiquity of big data is expanding the base of users and developers of data-management technologies and will undoubtedly shake up the database research field.

Data analysis as profit center. In traditional enterprise settings, the barriers between IT departments and business units are coming down, and there are many examples of companies where data is indeed the business itself. As a consequence, data capture, integration, and analysis are no longer viewed as a business cost but as the keys to efficiency and profit. The value of software to support data analytics has been growing as a result. In 2007, corporate acquisitions of business-intelligence vendors alone totaled $15 billion,2 and that is only the “front end” of the data-analytics tool chain. Market pressure for better analytics also brings new users to the technology with new demands. Statistically sophisticated analysts are being hired in a growing number of industries, with increasing interest in running their formulae on the raw data. At the same time, a growing number of nontechnical decision makers want to “get their hands on the numbers” as well in simple and intuitive ways.

Ubiquity of structured and unstructured data. There is an explosion of structured data on the Web and on enterprise intranets. This data is from a variety of sources beyond traditional databases, including large-scale efforts to extract structured information from text, software logs and sensors, and crawls of deep-Web sites. There is also an explosion of text-focused semistructured data in the public domain in the form of blogs, Web 2.0 communities, and instant messaging. New incentive structures and Web sites have emerged for publishing and curating structured data in a shared fashion as well. Text-centric approaches to managing the data are easy to use but ignore latent structure in the data that might add significant value. The race is on to develop techniques that extract useful data from mostly noisy text and structured corpora, enable deeper exploration into individual data sets, and connect data sets together to wring out as much value as possible.

Expanded developer demands. Programmer adoption of relational DBMSs and query languages has grown significantly in recent years, accelerated by the maturation of open source systems (such as MySQL and PostgreSQL) and the growing popularity of object-relational mapping packages (such as Ruby on Rails). However, the expanded user base brings new expectations for programmability and usability from a larger, broader, less-specialized community of programmers. Some of them are unhappy or unwilling to “drop into” SQL, viewing DBMSs
as unnecessarily complicated and daunting to learn and manage relative to other open source components. As the ecosystem for database management evolves beyond the typical DBMS user base, opportunities are emerging for new programming models and new system components for data management and manipulation.

Architectural shifts in computing. While the variety of user scenarios is increasing, the computing substrates for data management are shifting dramatically as well. At the macro scale, the rise of cloud computing services suggests fundamental changes in software architecture. It democratizes access to parallel clusters of computers; every programmer has the opportunity and motivation to design systems and services that scale out incrementally to arbitrary degrees of parallelism. At a micro scale, computer architectures have shifted the focus of Moore’s Law from increasing clock speed per chip to increasing the number of processor cores and threads per chip. In storage technologies, major changes are under way in the memory hierarchy due to the availability of more and larger on-chip caches, large inexpensive RAM, and flash memory. Power consumption has become an increasingly important aspect of the price/performance metric of large systems. These hardware trends alone motivate a wholesale reconsideration of data-management software architecture.

These factors together signal an urgent, widespread need for new data-management technologies. There is an opportunity for making a positive difference. Traditionally, the database community is known for the practical relevance of its research; relational databases are emblematic of technology transfer. But in recent years, the externally visible contribution of the database research community has not been as pronounced, and there is a mismatch between the notable expansion of the community’s portfolio and its contribution to other fields of research and practice. In today’s increasingly rich technical climate, the database community must recommit itself to impact and breadth. Impact is evaluated by external measures, so success involves helping new classes of users, powering new computing platforms, and making conceptual breakthroughs across computing. These should be the motivating goals for the next round of database research.

To achieve these goals, discussion at the 2008 Claremont Resort meeting revolved around two broad agendas we call reformation and synthesis. The reformation agenda involves deconstructing traditional data-centric ideas and systems and reforming them for new applications and architectural realities. One part of this entails focusing outside the traditional RDBMS stack and its existing interfaces, emphasizing new data-management systems for growth areas (such as e-science). Another part of the reformation agenda involves taking data-centric ideas like declarative programming and query optimization outside their original context in storage and retrieval to attack new areas of computing where a data-centric mindset promises to yield significant benefit. The synthesis agenda is intended to leverage research ideas in areas that have yet to develop identifiable, agreed-upon system architectures, including data integration, information extraction, and data privacy. Many of these subcommunities of database research seem ready to move out of the conceptual and algorithmic phase to work together on comprehensive artifacts (such as systems, languages, and services) that combine multiple techniques to solve complex user problems. Efforts toward synthesis can serve as rallying points for research, likely leading to new challenges and breakthroughs, and promise to increase the overall visibility of the work.

Research Opportunities

After two days of intense discussion at the 2008 Claremont meeting, it was surprisingly easy for the group to reach consensus on a set of research topics for investigation in coming years. Before exploring them, we stress a few points regarding what is not on the list.

First, while we tried to focus on new opportunities, we do not propose they be pursued at the expense of existing good work. Several areas we deemed critical were left off because they are already focus topics in the database community. Many were mentioned in previous reports1,3–7 and are the subject of significant efforts that require continued investigation and funding. Second, we kept the list short, favoring focus over coverage. Though most of us have other promising research topics we would have liked to discuss at greater length here, we focus on topics that
attracted the broadest interest within the group.

In addition to the listed topics, the main issues raised during the meeting included management of uncertain information, data privacy and security, e-science and other scholarly applications, human-centric interaction with data, social networks and Web 2.0, personalization and contextualization of query- and search-related tasks, streaming and networked data, self-tuning and adaptive systems, and the challenges raised by new hardware technologies and energy constraints. Most are captured in the following discussion, with many cutting across multiple topics.

Revisiting database engines. System R and Ingres pioneered the architecture and algorithms of relational databases; current commercial databases are still based on their designs. But many of the changes in applications and technology demand a reformation of the entire system stack for data management. Current big-market relational database systems have well-known limitations. While they provide a range of features, they have only narrow regimes in which they provide peak performance; online transaction processing (OLTP) systems are tuned for lots of small, concurrent transactional debit/credit workloads, while decision-support systems are tuned for a few read-mostly, large-join-and-aggregation workloads. Meanwhile, for many popular data-intensive tasks developed over the past decade, relational databases provide poor price/performance and have been rejected; critical scenarios include text indexing, serving Web pages, and media delivery. New workloads are emerging in the sciences, Web 2.0-style applications, and other environments where database-engine technology could prove useful but is not bundled in current database systems.

Even within traditional application domains, the database marketplace today suggests there is room for significant innovation. For example, in the analytics markets for business and science, customers can buy petabytes of storage and thousands of processors, but the dominant commercial database systems typically cannot scale that far for many workloads. Even when they can, the cost of software and management relative to hardware is exorbitant. In the OLTP market, business imperatives like regulatory compliance and rapid response to changing business conditions raise the need to address data life-cycle issues (such as data provenance, schema evolution, and versioning).

Given these requirements, the commercial database market is wide open to new ideas and systems, as reflected in the recent funding climate for entrepreneurs. It is difficult to recall when there were so many start-up companies developing database engines, and the challenging economy has not trimmed the field much. The market will undoubtedly consolidate over time, but things are changing fast, and it remains a good time to try radical ideas.

Some research projects have begun taking revolutionary steps in database system architecture. There are two distinct directions: broadening the useful range of applicability for multipurpose database systems (for example, to incorporate streams, text search, XML, and information integration) and radically improving performance by designing special-purpose database systems for specific domains (for example, read-mostly analytics, streams, and XML). Both directions have merit, and the overlap in their stated targets suggests they may be more synergistic than not. Special-purpose techniques (such as new storage and compression formats) may be reusable in more general-purpose systems, and general-purpose architectural components (such as extensible query optimizer frameworks) may help speed prototyping of new special-purpose systems.

Important research topics in the core database engine area include:
• Designing systems for clusters of many-core processors that exhibit limited and nonuniform access to off-chip memory;
• Exploiting remote RAM and Flash as persistent media, rather than relying solely on magnetic disk;
• Treating query optimization and physical data layout as a unified, adaptive, self-tuning task to be carried out continuously;
• Compressing and encrypting data at the storage layer, integrated with data layout and query optimization;
• Designing systems that embrace nonrelational data models, rather than shoehorning them into tables;
• Trading off consistency and availability for better performance and scale-out to thousands of machines; and
• Designing power-aware DBMSs that limit energy costs without sacrificing scalability.

This list is not exhaustive. One industrial participant at the Claremont meeting noted that this is a time of opportunity for academic researchers; the landscape has shifted enough that access to industrial legacy code provides little advantage, and large-scale clustered hardware is rentable in the cloud at low cost. Moreover, industrial players and investors are aggressively looking for bold new ideas. This opportunity for academics to lead in system design is a major change in the research environment.

Declarative programming for emerging platforms. Programmer productivity is a key long-acknowledged challenge in computing, with its most notable mention in the database context in Jim Gray’s 1998 Turing lecture. Today, the urgency of the challenge is increasing exponentially as programmers target ever more complex environments, including many-core chips, distributed services, and cloud computing platforms.

Nonexpert programmers must be able to write robust code that scales out across processors in both loosely and tightly coupled architectures. Although developing new programming paradigms is not a database problem per se, ideas of data independence, declarative programming, and cost-based optimization provide a promising angle of attack. There is considerable evidence that data-centric approaches will significantly influence programming in the near term.

The recent popularity of the MapReduce programming framework for manipulating big data sets is an example of this potential. MapReduce is attractively simple, building on language and data-parallelism techniques that have been known for decades. For database researchers, the significance of MapReduce is in demonstrating the benefits of data-parallel programming to new classes of developers. This opens opportunities for the database community to extend its contribution to the broader community, developing more powerful and efficient languages and runtime mechanisms that help these developers address more complex problems.

As another example of declarative programming, in the past five years a variety of new declarative languages, often grounded in Datalog, have been developed for domain-specific systems in fields as diverse as networking and distributed systems, computer games, machine learning and robotics, compilers, security protocols, and information extraction. In many of these scenarios, the use of a declarative language has reduced code size by orders of magnitude while also enabling distributed or parallel execution. Surprisingly, the groups behind these efforts have coordinated very little with one another; the move to revive declarative languages in these new contexts has grown up organically.

A third example arises in enterprise-application programming. Recent language extensions (such as Ruby on Rails and LINQ) encourage query-like logic in programmer design patterns. But these packages have yet to address the challenge of enterprise-style programming across multiple machines; the closest effort here is DryadLINQ, focusing on parallel analytics rather than on distributed application development. For enterprise applications, a key distributed design decision is the partitioning of logic and data across multiple “tiers,” including Web clients, Web servers, application servers, and a backend DBMS. Data independence is particularly valuable here, allowing programs to be specified without making a priori permanent decisions about physical deployment across tiers. Automatic optimization processes could make these decisions and move data and code as needed to achieve efficiency and correctness. XQuery has been proposed as an existing language that would facilitate this kind of declarative programming, in part because XML is often used in cross-tier protocols.

It is unusual to see this much energy surrounding new data-centric programming techniques, but the opportunity brings challenges as
well. The research challenges include language design, efficient compilers and runtimes, and techniques to optimize code automatically across both the horizontal distribution of parallel processors and the vertical distribution of tiers. It seems natural that the techniques behind parallel and distributed databases (partitioned dataflow and cost-based query optimization) should extend to new environments. However, to succeed, these languages must be fairly expressive, going beyond simple MapReduce and select-project-join-aggregate dataflows. This agenda will require “synthesis” work to harvest useful techniques from the literature on database and logic programming languages and optimization, as well as to realize and extend them in new programming environments.

To genuinely improve programmer productivity, these new approaches also need to pay attention to the softer issues that capture the hearts and minds of programmers (such as attractive syntax, typing and modularity, development tools, and smooth interaction with the rest of the computing ecosystem, including networks, files, user interfaces, Web services, and other languages). This work also needs to consider the perspective of programmers who want to use their favorite programming languages and data services as primitives in those languages. Example code and practical tutorials are also critical.

[Illustration by Gluekit]

To execute successfully, database research must look beyond its traditional boundaries and find allies throughout computing. This is a unique opportunity for a fundamental “reformation” of the notion of data management, not as a single system but as a set of services that can be embedded as needed in many computing contexts.

Interplay of structured and unstructured data. A growing number of data-management scenarios involve both structured and unstructured data. Within enterprises, we see large heterogeneous collections of structured data linked with unstructured data (such as document and email repositories). On the Web, we also see a growing amount of structured data primarily from three sources: millions of databases hidden behind forms (the deep Web); hundreds of millions of high-quality data items in HTML tables on Web pages and a growing number of mashups providing dynamic views on structured data; and data contributed by Web 2.0 services (such as photo and video sites, collaborative annotation services, and online structured-data repositories).

A significant long-term goal for the database community is to transition from managing traditional databases consisting of well-defined schemata for structured business data to the much more challenging task of managing a rich collection of structured, semistructured, and unstructured data spread over many repositories in the enterprise and on the Web, sometimes referred to as the challenge of managing dataspaces.

In principle, this challenge is closely related to the general problem of data integration, a longstanding area for database research. The recent advances in this area and the new issues due to Web 2.0 resulted in significant discussion at the Claremont meeting. On the Web, the database community has contributed primarily in two ways: First, it developed technology that enables the generation of domain-specific (“vertical”) search engines with relatively little effort; and second, it developed domain-independent technology for crawling through forms (that is, automatically submitting well-formed queries to forms) and surfacing the resulting HTML pages in a search-engine index. Within the enterprise, the database research community recently contributed to enterprise search and the discovery of relationships between structured and unstructured data.

The first challenge database researchers face is how to extract structure and meaning from unstructured and semistructured data. Information-extraction technology can now pull structured entities and relationships out of unstructured text, even in unsupervised Web-scale contexts. We expect in coming years that hundreds of extractors will be applied to a given data source. Hence developers and analysts need techniques for applying and managing predictions from large numbers of independently developed extractors. They also need algorithms that can introspect about the correctness of extractions and therefore combine multiple pieces of extraction evidence in a principled fashion. The database community is not alone in these efforts; to contribute in this area, database researchers should continue
to strengthen ties with researchers in information retrieval and machine learning.

Context is a significant aspect of the semantics of the data, taking multiple forms (such as the text and hyperlinks that surround a table on a Web page, the name of the directory in which data is stored, accompanying annotations or discussions, and relationships to physically or temporally proximate data items). Context helps analysts interpret the meaning of data in such applications because the data is often less precise than in traditional database applications, as it is extracted from unstructured text, extremely heterogeneous, or sensitive to the conditions under which it was captured. Better database technology is needed to manage data in context. In particular, there is a need for techniques to discover data sources, enhance the data by discovering implicit relationships, determine the weight of an object’s context when assigning it semantics, and maintain the provenance of data through these steps of storage and computation.

The second challenge is to develop methods for querying and deriving insight from the resulting sea of heterogeneous data. A specific problem is to develop methods to answer keyword queries over large collections of heterogeneous data sources. We must be able to break down the query to extract its intended semantics and route the query to the relevant source(s) in the collection. Keyword queries are just one entry point into data exploration, and there is a need for techniques that lead users into the most appropriate querying mechanism. Unlike previous work on information integration, the challenges here are that we cannot assume we have semantic mappings for the data sources and we cannot assume that the domain of the query or the data sources is known. We need to develop algorithms for providing best-effort services on loosely integrated data. The system should provide meaningful answers to queries with no need for manual integration and improve over time in a pay-as-you-go fashion as semantic relationships are discovered and refined. Developing index structures to support querying hybrid data is also a significant challenge. More generally, we need to develop new notions of correctness and consistency in order to provide metrics and enable users or system designers to make cost/quality trade-offs. We also need to develop the appropriate systems concepts around which these functionalities are tied.

In addition to managing existing data collections, there is an opportunity to innovate in the creation of data collections. The emergence of Web 2.0 creates the potential for new kinds of data-management scenarios in which users join ad hoc communities to create, collaborate, curate, and discuss data online. As an example, consider creating a database of access to clean water in different places around the world. Since such communities rarely agree on schemata ahead of time, the schemata must be inferred from the data; however, the resulting schemata are still used to guide users to consensus. Systems in this context must incorporate visualizations that drive exploration and analysis. Most important, these systems must be extremely easy to use and so will probably require compromising on some typical database functionality and providing more semiautomatic “hints” mined from the data. There is an important opportunity for a feedback loop here; as more data is created with such tools, information extraction and querying could become easier. Commercial and academic prototypes are beginning to appear, but there is plenty of room for additional innovation and contributions.

Cloud data services. Economic and technological factors have motivated a resurgence of shared computing infrastructure, providing software and computing facilities as a service, an approach known as cloud services or cloud computing. Cloud services provide efficiencies for application providers by limiting up-front capital expenses and by reducing the cost of ownership over time. Such services are typically hosted in a data center using shared commodity hardware for computation and storage. A varied set of cloud services is available today, including application services (salesforce.com), storage services (Amazon S3), compute services (Amazon EC2, Google App Engine, and Microsoft Azure), and data services (Amazon SimpleDB, Microsoft SQL Data Services, and Google’s Datastore). They represent a major reformation of data-management architectures, with more on the horizon. We anticipate many future data-centric applications leveraging data services in the cloud.

A cross-cutting theme in cloud services is the trade-off providers face between functionality and operational costs. Today’s early cloud data services offer an API that is much more restricted than that of traditional database systems, with a minimalist query language, limited consistency guarantees, and in some cases explicit constraints on resource utilization. This limited functionality pushes more programming burden on developers but allows cloud providers to build more predictable services and offer service-level agreements that would be difficult to provide for a full-function SQL data service. More work and experience are needed on several fronts to fully understand the continuum between today’s early cloud data services and more full-function but possibly less-predictable alternatives.

Manageability is particularly important in cloud environments. Relative to traditional systems, it is complicated by three factors: limited human intervention, high-variance workloads, and a variety of shared infrastructures. In the majority of cloud-computing settings, there will be no database administrators or system administrators to assist developers with their cloud-based applications; the platform must do much of that work automatically. Mixed workloads have always been difficult to tune but may be unavoidable in this context.

Even a single customer’s workload can vary widely over time; the elastic provisioning of cloud services makes it economical for a user to occasionally harness orders-of-magnitude more resources than usual for short bursts of work. Meanwhile, service tuning depends heavily on the way the shared infrastructure is “virtualized.” For example, Amazon EC2 uses hardware-level virtual machines as its programming interface. On the opposite end of the spectrum, salesforce.com implements “multi-tenant” hosting of many independent schemas in a single managed DBMS. Many other virtualization solutions are possible, each with different views into the workloads above and platforms below and different abilities to control each. These variations require revisiting traditional roles and responsibilities for resource management across layers.

The need for manageability adds urgency to the development of self-managing database technologies that have been explored over the past decade. Adaptive, online techniques will be required to make these systems viable, while new architectures and APIs, including the flexibility to depart from traditional SQL and transactional semantics when prudent, reduce requirements for backward compatibility and increase the motivation for aggressive redesign.

The sheer scale of cloud computing involves its own challenges. Today’s SQL databases were designed in an era of relatively reliable hardware and intensive human administration; as a result, they do not scale effectively to thousands of nodes being deployed in a massively shared infrastructure. On the storage front, it is unclear whether these limitations should be addressed with different transactional implementation techniques, different storage semantics, or both simultaneously. The database literature is rich in proposals on these issues. Cloud services have begun to explore simple pragmatic approaches, but more work is needed to synthesize ideas from the literature in modern cloud computing regimes. In terms of query processing and optimization, it will not be feasible to exhaustively search a domain that considers thousands of processing sites, so some limitations on either the domain or the search will be required. Finally, it is unclear how programmers will express their programs in the cloud, as discussed earlier.

The sharing of physical resources in a cloud infrastructure puts a premium on data security and privacy that cannot be guaranteed by physical boundaries of machines or networks. Hence cloud services are fertile ground for efforts to synthesize and accelerate the work the database community has done in these areas. The key to success is to specifically target usage scenarios in the cloud, seated in practical economic incentives for service providers and customers.

As cloud data services become popular, new scenarios will emerge with their own challenges. For example, we anticipate specialized services that are pre-loaded with large data sets (such as
stock prices, weather history, and Web crawls). The ability to “mash up” interesting data from private and public domains will be increasingly attractive and provide further motivation for the challenges discussed earlier concerning the interplay of structured and unstructured data. The desire to mash up data also points to the inevitability of services reaching out across clouds, an issue already prevalent in scientific data “grids” that typically have large shared data servers at multiple sites, even within a single discipline. It also echoes, in the large, the standard proliferation of data sources in most enterprises. Federated cloud architectures will only add to these challenges.

Mobile applications and virtual worlds. This new class of applications, exemplified by mobile services and virtual worlds, is characterized by the need to manage massive amounts of diverse user-created data, synthesize it intelligently, and provide real-time services. The database community is beginning to understand the challenges faced by these applications, but much more work is needed. Accordingly, the discussion of these topics at the meeting was more speculative than that of the earlier topics, but they still deserve attention.

Two important trends are changing the nature of the field. First, the platforms on which mobile applications are built (hardware, software, and network) have attracted large user bases and ubiquitously support powerful interactions “on the go.” Second, mobile search and social networks suggest an exciting new set of mobile applications that can deliver timely information (and advertisements) to mobile users depending on location, personal preferences, social circles, and extraneous factors (such as weather), as well as the context in which they operate. Providing these services requires synthesizing user input and behavior from multiple sources to determine user location and intent.

The popularity of virtual worlds like Second Life has grown quickly and in many ways mirrors the themes of mobile applications. While they began as interactive simulations for multiple users, they increasingly blur the distinctions with the real world and suggest the potential for a more data-rich mix. The term “co-space” is sometimes used to refer to a coexisting space for both virtual and physical worlds. In it, locations and events in the physical world are captured by a large number of sensors and mobile devices and materialized within a virtual world. Correspondingly, certain actions or events within the virtual world affect the physical world (such as shopping, product promotion, and experiential computer gaming). Applications of co-space include rich social networking, massive multi-player games, military training, edutainment, and knowledge sharing.

In both areas, large amounts of data flow from users and get synthesized and used to affect the virtual and/or real world. These applications raise new challenges, including how to process heterogeneous data streams in order to materialize real-world events, how to balance privacy against the collective benefit of sharing personal real-time information, and how to apply more intelligent processing to send interesting events in the co-space to someone in the physical world.

The programming of virtual actors in games and virtual worlds requires large-scale parallel programming; declarative methods have been proposed as a solution in this environment, as discussed earlier. These applications also require development of efficient systems, as suggested earlier in the context of database engines, including appropriate storage and retrieval methods, data-processing engines, parallel and distributed architectures, and power-sensitive software techniques for managing the events and communications across large numbers of concurrent users.

Moving Forward
The 2008 Claremont meeting also involved discussions on the database research community’s processes, including organization of publication procedures, research agendas, attraction and mentorship of new talent, and efforts to ensure a benefit from the research on practice and toward furthering our understanding of the field. Some of the trends seen in database research are echoed in other areas of computer science. Whether or not they are, the discussion may be of broader interest in the field.


Prior to the meeting, a team led by one of the participants performed a bit of ad hoc data analysis over database conference bibliographies from the DBLP repository (dblp.uni-trier.de). While the effort was not scientific, the results indicated that the database research community has doubled in size over the past decade, as suggested by several metrics: number of published papers, number of distinct authors, number of distinct institutions to which these authors belong, and number of session topics at conferences, loosely defined. This served as a backdrop to the discussion that followed. An open question is whether this phenomenon is emerging at larger scales, in computer science and in science in general. If so, it may be useful to discuss the management of growth at those larger scales.

The growth of the database community puts pressure on the content and processes of database research publications. In terms of content, the increasingly technical scope of the community makes it difficult for individual researchers to keep track of the field. As a result, survey articles and tutorials are increasingly important to the community. These efforts should be encouraged informally within the community, as well as via professional incentive structures (such as academic tenure and promotion in industrial labs). In terms of processes, the reviewing load for papers is increasingly burdensome, and there was a perception at the Claremont meeting that the quality of reviews had been decreasing. It was suggested at the meeting that the lack of face-to-face program-committee meetings in recent years has exacerbated the problem of poor reviews and removed opportunities for risky or speculative papers to be championed effectively over well-executed but more pedestrian work.

There was some discussion at the meeting about recent efforts, notably by ACM-SIGMOD and VLDB, to enhance the professionalism of papers and the reviewing process via such mechanisms as double-blind reviewing and techniques to encourage experimental repeatability. Many participants were skeptical that the efforts to date have contributed to long-term research quality, as measured in intellectual and practical relevance. At the same time, it was acknowledged that the database community’s growth increases the need for clear and clearly enforced processes for scientific publication. The challenge going forward is to find policies that simultaneously reward big ideas and risk-taking while providing clear and fair rules for achieving these rewards. The publication venues would do well to focus as much energy on processes to encourage relevance and innovation as they do on processes to encourage rigor and discipline.

In addition to tuning the mainstream publication venues, there is an opportunity to take advantage of other channels of communication. For example, the database research community has had little presence in the relatively active market for technical books. Given the growing population of developers working with big data sets, there is a need for accessible books on scalable data-management algorithms and techniques that programmers can use to build software. The current crop of college textbooks is not targeted at this market. There is also an opportunity to present database research contributions as big ideas in their own right, targeted at intellectually curious readers outside the specialty. In addition to books, electronic media (such as blogs and wikis) can complement technical papers by opening up different stages of the research life cycle to discussion, including status reports on ongoing projects, concise presentation of big ideas, vision statements, and speculation. Online fora can also spur debate and discussion if appropriately provocative. Electronic media underscore the modern reality that it is easy to be widely published but much more difficult to be widely read. This point should be reflected in the mainstream publication context, as well as by authors and reviewers. In the end, the consumers of an idea define its value.

Given the growth in the database research community, the time is ripe for ambitious projects to stimulate collaboration and cross-fertilization of ideas. One proposal is to foster more data-driven research by building a globally shared collection of structured data, accepting contributions from all parties. Unlike previous efforts in this vein, the collection should not be designed for any particular benchmark; in fact, it is likely that most of the interesting problems suggested by this data are as yet unidentified.

There was also discussion at the meeting of the role of open source software development in the database community. Despite a tradition of open source software, academic database researchers have only rarely reused or shared software. Given the current climate, it might be useful to move more aggressively toward sharing software and collaborating on software projects across institutions. Information integration was mentioned as an area in which such an effort is emerging.

Finally, interest was expressed in technical competitions akin to the Netflix Prize (www.netflixprize.com) and KDD Cup (www.sigkdd.org/kddcup/index.php) competitions. To kick off this effort in the database domain, meeting participants identified two promising areas for competitions: system components for cloud computing (likely measured in terms of efficiency) and large-scale information extraction (likely measured in terms of accuracy and efficiency). While it was noted that each of these proposals requires a great deal of time and care to realize, several participants volunteered to initiate efforts. That work has begun with the 2009 SIGMOD Programming Contest (db.csail.mit.edu/sigmod09contest).

References
1. Abiteboul, S. et al. The Lowell database research self assessment. Commun. ACM 48, 5 (May 2005), 111–118.
2. Austin, I. I.B.M. acquires Cognos, maker of business software, for $4.9 billion. New York Times (Nov. 11, 2007).
3. Bernstein, P.A. et al. The Asilomar report on database research. SIGMOD Record 27, 4 (Dec. 1998), 74–80.
4. Bernstein, P.A. et al. Future directions in DBMS research: The Laguna Beach participants. SIGMOD Record 18, 1 (Mar. 1989), 17–26.
5. Silberschatz, A. and Zdonik, S. Strategic directions in database systems: Breaking out of the box. ACM Computing Surveys 28, 4 (Dec. 1996), 764–778.
6. Silberschatz, A., Stonebraker, M., and Ullman, J.D. Database research: Achievements and opportunities into the 21st century. SIGMOD Record 25, 1 (Mar. 1996), 52–63.
7. Silberschatz, A., Stonebraker, M., and Ullman, J.D. Database systems: Achievements and opportunities. Commun. ACM 34, 10 (Oct. 1991), 110–120.

Correspondence regarding this article should be addressed to Joseph M. Hellerstein (hellerstein@cs.berkeley.edu).

© 2009 ACM 0001-0782/09/0600 $10.00

DOI:10.1145/1516046.1516063

One Laptop Per Child: Vision vs. Reality

The vision is being overwhelmed by the reality of business, politics, logistics, and competing interests worldwide.

BY KENNETH L. KRAEMER, JASON DEDRICK, AND PRAKUL SHARMA

[From the top left, photograph by: 1. Carla Gomez Monroy, 2. Daniel Drake, 3–5. One Laptop Per Child, 6. Daniel Drake, 7–9. OLPC, 10. Daniel Drake, 11. Rodolfo Arce, 12. OLPC, 13. Carla Gomez Monroy, 14. OLPC, 15. Niels Olson]

At the World Economic Forum in Davos, Switzerland, January 2005, Nicholas Negroponte unveiled the idea of One Laptop Per Child (OLPC), a $100 PC that would transform education for the world’s disadvantaged schoolchildren by giving them the means to teach themselves and each other. He estimated that up to 150 million of these laptops could be shipped annually by the end of 2007.4 With $20 million in startup investment, sponsorships and partnerships with major IT industry players, and interest from developing countries, the nonprofit OLPC project generated excitement among international leaders and the world media. Yet as of June 2009 only a few hundred thousand laptops have been distributed (they were first available in 2007), and OLPC has been forced to dramatically scale back its ambitions.

Although some developing countries are indeed deploying OLPC laptops, others have cancelled planned deployments or are waiting on the results of pilot projects before deciding whether to acquire them in numbers. Meanwhile, the OLPC organization (www.olpc.com/) struggles with key staff defections, budget cuts, and ideological disillusionment, as it appears to some that the educational mission has given way to just getting laptops out the door. In addition, low-cost commercial netbooks from Acer, Asus, Hewlett-Packard, and other PC vendors have been launched with great early success.

So rather than distributing millions of laptops to poor children itself, OLPC has motivated the PC industry to develop lower-cost, education-oriented PCs, providing developing countries with low-cost computing options directly in competition with OLPC’s own innovation. In that sense, OLPC’s apparent failure may be a step toward broader success in providing a new tool for children in developing countries. However, it is also clear that the PC industry cannot profitably reach millions of the poorest children, so the OLPC objectives might never be achieved through the commercial market alone.

Here, we review and analyze the OLPC experience, focusing on the two most important issues: the successes and failures of OLPC in understanding and adapting to the developing-country environment and the unexpectedly aggressive reaction by the PC industry, including superpowers Intel and Microsoft, to defeat or co-opt the OLPC effort.

OLPC created a novel technology, the XO laptop, developed with close attention to the needs of students in poor rural areas. Yet it failed to anticipate the social and institutional problems that could arise in trying to diffuse that innovation in the developing-country context. In addition, OLPC has been stymied by underestimating the aggressive reaction of the PC industry to the perceived threat of a $100 laptop
66 COMMUNICATIONS OF THE ACM | JUNE 2009 | VOL. 52 | NO. 6


contributed articles

being widely distributed in places the industry sees as emerging markets for its own products.

The case of OLPC can be seen as a study in the general diffusion of innovation in developing countries. Our analysis draws on diffusion-of-innovation theory, exemplified by Rogers,[18] and illustrates the difficulty in getting widespread adoption of even proven innovation due to misunderstanding the social and cultural environment in which the innovation is to be introduced. We also bring to bear specific insights from the literature on adoption of IT in developing countries,[2,25] using them to analyze the OLPC experience and draw implications for developers and policymakers.

Worldwide distribution of XO laptops. (Map omitted.)

Country        OLPC Web site (a)   Actual Deployments   Date of Actual Deployment Information/Detail
Uruguay        202,000             150,000              November 2008 (b)
Peru           145,000             40,000               100,000 in distribution (c)
Mexico         50,000              50,000               Starting to be shipped (d)
Haiti          13,000              Dozens               Pilot began in summer 2008 (e)
Afghanistan    11,000              450                  Expected to rise to 2010 (f)
Mongolia       10,100              3,000                G1G1 laptops beneficiary (g)
Rwanda         16,000              10,000               Arrived, not deployed; infrastructure issues (h)
Nepal          6,000               6,000                Delivered April 2007 (i)
Ethiopia       5,000               5,000                Three schools (j)
Paraguay       4,000               150                  4,000 planned next quarter (k)
Cambodia       3,200               1,040                January 29, 2009 (l)
Guatemala      3,000               -                    Planned before third quarter 2009 (m)
Colombia       2,600               1,580                January 25, 2009 (n); agreement to buy 65,000 XOs (o)
Brazil         2,600               630                  February 6, 2009 (p)
India          505                 31                   January 20, 2009 (q)

(a) OLPC numbers include “XO’s delivered, shipped, or ordered” but do not distinguish between these categories; wiki.laptop.org/go/Deployments
(b) Tabare, V. Uruguay: When education meets technology. Miami Herald (Nov. 22, 2008), A21.
(c) Peru on the up and up, lessons to be learned. Business News Americas (Dec. 18, 2008).
(d) www.bnamericas.com/story.xsql?id_sector=1&id_noticia=431002&Tx_idioma=I&source=
(e) www.olpceu.org/content/xo_stories/haiti/Haiti.html
(f) www.olpcnews.com/countries/afghanistan/olpc_afghanistan_first_school_day.html
(g) www.olpceu.org/content/xo_stories/mongolia/Mongolia.html
(h) www.olpceu.org/content/xo_stories/rwanda/Rwanda.html
(i) www.olpceu.org/content/xo_stories/nepal/Nepal.html
(j) http://www.olpceu.org/content/xo_stories/ethiopia/Ethiopia.html
(k) Bucaramanga computers, OLPC, Gemalto. Business News Americas (Feb. 9, 2009).
(l) wiki.laptop.org/go/OLPC_Cambodia
(m) wiki.laptop.org/go/OLPC_Guatemala
(n) wiki.laptop.org/go/OLPC_Colombia
(o) Pilar Saenz, OLPC Volunteer in Colombia (email)
(p) download.laptop.org/content/conf/20080520-country-wkshp/Presentations/OLPC%20Country%20Meeting%20-%20Day%204%20-%20May%2023rd,%202008/Brazil%20-%20Jose%20Aquino%20-%20Govt%20of%20Brazil.ppt#266,8,Slide 8
(q) www.olpceu.org/content/xo_stories/india/India.html

The original OLPC vision was to change education through the development and distribution of low-cost laptops embodying a new learning model to every child in the developing countries. Despite shifting over time, it can be characterized by the following text from the OLPC charter: “OLPC is not, at heart, a technology program, nor is the XO a product in any conventional sense of the word. OLPC is a nonprofit organization providing a means to an end—an end that sees children in even the most remote regions of the globe being given the opportunity to tap into their own potential, to be exposed to a whole world of ideas, and to contribute to a more productive and saner world community” (www.olpcnews.com/people/negroponte/new_olpc_mission_statement.html).

Conceived and led by Nicholas Negroponte, a former director of MIT’s Media Lab, OLPC aimed to achieve its vision through extraordinary innovation in hardware and software that fosters self-learning and fits with the often-harsh environment in developing countries. The hardware was to be a $100 laptop that would make affordable the large-scale deployment of computer networks in their schools.

The XO laptop developed by OLPC reflects hardware innovation in the power supply, display, networking, keyboard, and touchpad to provide a durable and interactive laptop (see the figure here). The shell of the machine is resistant to dirt and moisture, with all key parts designed to fit behind the display. It contains a pivoting, reversible,


dual-mode (monochrome for outside, color for indoors) display, movable rubber WiFi antennas with wireless mesh networking, and a sealed rubber-membrane keyboard that can be customized for different languages. For low power consumption and ruggedness, the XO design intentionally omits all motor-driven moving parts. It was developed jointly by the MIT Media Lab, OLPC, and Quanta, a Taiwan-based original design manufacturer, and is manufactured by Quanta in Songjiang, China.

The software for the XO consists of a pared-down version of the Fedora Linux operating system and specially designed graphical user interface called Sugar. It was developed by the project to explore naturalistic concepts related to learning, openness, and collaboration.[a]

[a] Chief among them are collaboration and expression (such as Web browsing, email, online chat, word processing, drawing, music sequencing, and programming); groups and neighborhoods to signify other users in physical and logical proximity; a view-source-code key to encourage users to tinker with the code; replacing files and folders with “journals” that store activities performed by users; and tagging, clipping, sharing, and searching as systemwide features.[22]

Pilot Implementation

High-level officials, including even prime ministers and education ministers, in some developing countries are enthusiastic about OLPC, committed to purchases and/or trial-distribution projects. OLPC pilots in a half-dozen countries report positive changes (such as increased enrollment in schools, decreased absenteeism, increased discipline, and more participation in classrooms), but it is not clear if these changes are directly related to OLPC, as many evaluations are neither independent nor systematic. Independent evaluations in Ethiopia and Uruguay cite a positive effect on the availability of learning material via the laptop but also problems with buggy input devices, connectivity, software functionality, and teacher training.[8,12,13]

As of June 2009 the largest ongoing pilot project is in Peru, which planned to distribute 140,000 XOs in 2008, even into rural areas high in the Andes where electricity is often limited and Internet connections are not available. There is enthusiasm among students and teachers in the villages and support from the national education ministry and regional governors who have requested 500,000 more laptops.[9] However, reports from the classroom suggest that teacher training is limited, and willingness to adopt a new approach to teaching is questionable. Children are excited but somewhat confused about the use of the machines, and educational software is lacking or difficult to use. Also, if a machine fails, it is up to the family to replace it or the child must do without.[20]

Targeted Cost

Despite its considerable innovation, or perhaps because of it, the OLPC project has been unable to achieve its $100 targeted cost. The current cost of each unit is listed on the OLPC Website as $199 (www.laptop.org/en/participate/ways-to-give.shtml). However, this does not include upfront deployment costs, which are said to add an additional 5%–10% to the cost of each machine (wiki.laptop.org/go/Larger_OLPC), and subsequent IT-management costs. Nor does it include the cost of teacher training, additional software, and ongoing maintenance and support. OLPC initially required governments to purchase a million units, then reduced the number to 250,000 in April 2007. Such large purchases are difficult to justify for governments in developing countries, and the requirement was ultimately eliminated.

Some countries eventually lost interest due to the higher costs of the XO. For example, Nigeria failed to honor a pledge by its former president to purchase a million units, partly because they no longer cost $100 apiece.[21] Meanwhile, other countries, including Libya, have opted for the Intel Classmate, which is priced at approximately $250 for the PC alone. Officials in Libya, which had planned to buy up to 1.2 million XO laptops, became concerned that the machines lacked Windows, and that service, teacher training, and future upgrades would not be provided directly by OLPC. Subsidies from Intel, including donated laptops and teacher training, also helped persuade the Libyan government to choose the Classmate.[21]

Production, Sales, Distribution

OLPC originally estimated that it would


ship 100–150 million XO laptops by the end of 2007, but the program has clearly fallen far short. Under more modest goals, production was supposed to reach five million laptops by the end of 2008. By contrast, industry analysts report that Quanta’s manufacturing effort began only in December 2007 and reached a total of 370,500 units by third quarter 2008.[16]

Early commitments for a million XOs each from Brazil, Libya, and Nigeria evaporated, but relatively large purchases were made by Uruguay (200,000), Peru (145,000), and Mexico (50,000). In November 2007, OLPC launched a philanthropy program called Give One Get One (G1G1, www.olpcnews.com/countries/usa/olpc_xo_laptop_sale.html) where people in the U.S. could buy two machines for $399, with one being sent to a child in a developing country. The first program was successful, with about 167,000 units sold, but a second G1G1 program in November 2008 resulted in only 12,500 units sold.

Lagging production and sales mean that distribution has also lagged. The table here lists distribution as reported by OLPC, but many units have yet to be deployed to their intended recipients.

What has the project accomplished? Why is it so short of its original goals? To answer, we look in more detail at where OLPC succeeded and failed in understanding the developing-country environment and how it was being confronted by the PC industry.

Analysis

OLPC dedicated a great deal of effort to designing a laptop that would function well in a developing-country environment. OLPC’s technologist culture encouraged innovation, showing a good understanding of what was needed in developing countries. For example, the XO is sealed to keep out dirt, has a display that can be read in bright sunlight, runs on low power, and is rugged.

At the same time, the decision to use the Linux/Sugar operating system and interface was driven by a combination of pragmatic considerations and open source ideology. From a pragmatic point of view, Linux doesn’t require the computing power of Windows and has a price tag (zero) compatible with the goal of minimizing cost.

Diffusion of IT innovation does not depend only on the nature of the innovation itself. Often, more important is the social and cultural environment in which it will operate.[3,26] Information technologies are not standalone innovations but system innovations, the value of which depends largely on an ecosystem that includes hardware, applications, peripherals, network infrastructure, and services (such as installation, training, repair, and technical support). Deployment involves training teachers, creating software and digital content, delivering maintenance and support, and sustaining a long-term commitment. Such capabilities are in short supply in developing countries,[7,26] and OLPC simply never had the resources to provide them.

The OLPC plan was to rely on governments to buy its machines, provide distribution and support, train teachers to use and maintain them, and even sponsor development of local-language software. OLPC established its own distribution network or worked with local voluntary organizations in some countries to help with implementation. For global distribution, OLPC reached (in 2007) a comprehensive agreement with cellphone distributor Brightstar of Miami, FL, to help manage the complexities of entering diverse markets.[23] However, none of these institutions had the ability to scale up to deployment of millions of machines. This situation is common in developing countries where endemic problems of infrastructure, financial resources, technical skills, and waning political support “hinder both the completion of IS innovation initiatives and the realization of their expected benefits.”[b]

[b] Negroponte seems to question whether teachers are needed at all. Speaking about providing the rural poor a solid educational basis for development at the 2007 Digital, Life, Design conference in Munich, Germany, Negroponte said: “It’s not about training teachers. It’s not about building schools. With all due respect [to Hewlett-Packard’s e-inclusion efforts], it’s not about curriculum or content. It’s about leveraging the children themselves.”[24]

IT innovation is also part of socially embedded systems, the use of which cannot be isolated from the social and cultural environment or from local norms of practice.[1,25] In some cases, teachers and the educational establishment have resisted innovation that


requires a significant change in pedagogy and that might reduce teacher status.[b] Even when the laptops are adopted, they are not always used as envisioned by OLPC or by education ministers. One Peruvian teacher said, “The ministry would want us to use the laptop every day for long periods of time. But we have decided to set rules in our school and, really, the laptop, it’s only a tool for us.”[10]

[Figure: XO features. Labeled parts: antennae, indicator lights, microphone, USB ports, speakers, directional pad, camera, screen rotate, storage access, Wi-Fi access, game buttons, battery light, power button, power light, SD slot, stylus area/touch pad, mouse buttons, latch. Photograph by Mike Lee.]

Such resistance is no surprise to students of innovation diffusion or of IT for development. Rogers[18] pointed to examples where innovation diffusion failed due to cultural norms and the effects of such innovation on existing institutional arrangements. Avgerou[2] noted that attitudes toward hierarchy are particularly problematic in developing countries. An example illustrating both themes is that the Peruvian experiment was initiated without being explained to the national teachers’ union.[10] OLPC has strong support from the Peruvian Education Ministry, but ultimately teachers must actually use the machines in the classroom, and they are likely to see the union as an ally while possibly mistrusting the ministry.

The fact that OLPC was much stronger in developing innovative technology than in understanding how to diffuse it may reflect the engineering orientation of the organization and its lack of understanding of the needs or interests of the nontechnical people who will ultimately buy and use the innovation. This is illustrated by David Cavallo, OLPC’s chief education architect, saying, “We’re hoping that these countries won’t just make up ground but will jump into a new educational environment.”[9] Expecting a laptop to cause such revolutionary change showed a degree of naiveté, even for an organization with the best intentions and smartest people.

Competitive Response from the PC Industry

The OLPC project was a potential threat to the PC industry in emerging markets. OLPC’s use of an AMD microprocessor and Linux operating system was a potential threat to the dominant position and historically high profit margins of Intel and Microsoft. Its targeting of a new market (developing-country schools) that existing PC makers were not serving raised the prospect that OLPC might gain a foothold in emerging markets more generally. Moreover, the XO’s ultra-low price raised the likelihood of a new price point for notebooks, potentially forcing PC makers to cannibalize existing low-end products in order to compete (which is what ultimately happened).

Branded PC makers have always faced competition from cheap local brands and clone makers in developing countries, but OLPC threatened to grab a share of education budgets worldwide that PC makers hoped to tap for themselves. Negroponte’s high-profile announcement of the project and the publicity he garnered quickly caught the industry’s attention.

Leading companies first responded by disparaging the XO as a useless toy. Intel’s Craig Barrett called it “a gadget,” saying people want the full functionality of a PC.[17] Bill Gates said “...geez, get a decent computer where you can actually read the text and you’re not sitting there cranking the thing while you’re trying to type.”[11] Before long, however, the industry began to respond with action, not just words.

In 2006, Intel introduced a small laptop, the Classmate, for developing countries that today sells for $230–$300. Intel has since licensed the Classmate reference design to PC makers to manufacture and distribute and is marketing it aggressively against the XO worldwide. It secured deals to sell hundreds of thousands of Classmates in Libya, Nigeria, and Pakistan, some of the very countries OLPC was counting on. Intel launched a series of pilot projects in these countries, saying it will also test the Classmate in at least 22 others while donating thousands of machines.[21] Intel briefly joined OLPC in July 2007 but got into a nondisparagement dispute with Negroponte and dropped out only seven months later.[14]

In 2007, Microsoft offered to make available Windows, a student version of Microsoft Office, and educational programs to developing countries for $3 per copy when used on computers in schools. OLPC then decided to allow Windows on the XO, a choice driven by demands from some governments for Windows-based PCs. Even in countries with very low levels of PC penetration, officials who make purchasing decisions may favor a technology standard (the Wintel design) they are familiar with or believe children must learn on systems they will encounter later in the work force.

The OLPC project also stimulated innovation in low-cost, low-power PCs. Seeing OLPC’s success in developing a sub-$200 notebook, Asustek introduced the EeePC notebook in 2007 for the educational and consumer markets in both developed and developing countries, selling more than 300,000


units in four months. It was soon joined by major PC makers, including Acer, Dell, Hewlett-Packard, and many smaller ones in creating a new category of PC known today as netbooks.

[Figure: The 2010 version of the One Laptop per Child, the XO-2, will have a foldable e-book form and reduce power consumption to one watt. Photograph courtesy of Fuseproject.]

While the XO was specifically designed for the poor, rural education market in developing countries, netbook vendors target urban consumer and education markets in developed, as well as emerging, markets. In 2008, the netbook market exploded, with sales of 10 million units worldwide mostly running Intel’s low-cost Atom processor and Windows; sales are expected to double in 2009.[16]

The OLPC has been credited with spurring the netbook market, but the competition it spurred is now OLPC’s own biggest challenge. Developing countries today have a wide choice of vendors offering inexpensive netbooks, and, though not designed like the XO for the rigors of poor rural villages, they are competitive in large, easier-to-serve urban populations. OLPC responded by announcing in January 2009 that its second-generation laptop design would be licensed freely to PC makers to manufacture and distribute, hoping to use the resources of these firms to get millions of laptops into the hands of poor children in developing countries. Negroponte hopes to have a prototype in 18 months (from January), selling them for perhaps $75 each.[5]

Lessons

The OLPC experience offers lessons for innovators and others aiming to introduce and deploy IT innovation to benefit the poor, as well as for the governments of developing countries. For innovators, we thus draw three general lessons:

Diffusing a new innovation requires understanding the local environment. OLPC recognized correctly that laptops could reach the poorest children only if they were subsidized by government or other funding sources. This is similar to rural electrification and telephone service, which usually cannot be provided economically and end up subsidized by government or by charges to urban customers who can be served profitably. However, innovators should understand that governments are not monolithic entities, nor are they the same from one country to the next. In some cases, funding can be allocated by an education ministry, in others it must be approved by the legislature, and in others provincial or local governments have jurisdiction. Commitments from high-level officials or political leaders are as binding as a politician’s campaign promises.[26] Flying into a country and winning initial support is only a first step and must be followed by a sustained effort by people with a deep understanding of the local environment to ensure commitment leads to money and action.

Likewise, social, economic, and cultural environments vary greatly across and even within countries, and deploying new technologies requires understanding these environments. Innovators must consider the need for expertise in sociology, anthropology, public policy, and economics, as well as for engineers, and establish coherent criteria for selecting countries to target based on social, economic, and political characteristics. Success in a few developing countries is critical to broad diffusion, as potential adopters look to their peers for evidence of the value of the innovation.[18]

Innovative technology can be disruptive and trigger a backlash from incumbents. Some innovations pose a threat to industry incumbents, who may seek to undermine the innovator’s efforts. The more visible the threat, the stronger the reaction is likely to be. This illustrates a dilemma for developers. A program less ambitious and less publicized than OLPC might not attract the attention of industry incumbents but also might not attract the partners, investors, and other sponsors needed to develop and deploy the innovation. As multinational companies direct more attention to emerging markets and so-called “bottom of the pyramid” consumers, there is more likelihood of competition but also more opportunity for cooperation as well. PC makers across the board are still seeking a formula for well-designed, low-cost computing devices, along with a complementary delivery value chain, market strategy, and business model.

Innovative information technologies do not stand alone. A technology like the XO is a system-level innovation that requires complementary assets to be valuable. While OLPC was able to deliver high-level design and hand off development and manufacturing to Quanta, it had no one to handle marketing, deployment, and support.[15] Unlike the commercial PC companies, it was not part of any established business ecology and lacked resources to establish its own ecology.

For developing countries, international agencies, and philanthropists, there are other kinds of lessons:


Understand the true costs and risks, as well as benefits, of innovation. IT innovation like the XO may offer great benefits but also involves costs and risks. The purchase of a laptop is merely the start of a stream of ongoing costs. The total cost of ownership for a laptop program could include infrastructure investment, training, tech support, hardware maintenance, software licenses and upgrades, and replacement expenditures. Cost can also include the opportunity cost or the forgone investment in teachers, facilities, or other educational materials,[7] cited by India’s education ministry as its main reason for not joining OLPC.[6]

There is also a risk that the expected benefits might not be realized. Problems in implementation could limit actual use, and the need for ongoing funding means that the innovation might not be sustainable beyond some initial period.[2,13] Another risk is investing in a technology platform that might not be supported in the future; for instance, investment in software, content, and training for the XO platform could be wasted if OLPC were to disappear.

Policymakers are able to reduce the risk if they make major acquisition decisions only after careful evaluation of pilot projects that enable learning firsthand how the technology fits with their educational goals and environment. Learning from other countries’ experience can be valuable even when the context is different; Al-Gahtani[1] says that successful pilot projects by peers in other developing countries help reduce the perceived risk of adoption.

Adopting organizations need to develop internal capabilities and set priorities. Although governments might receive outside assistance for trials, they must be able to sustain the innovation in the development of digital educational content, training of teachers to integrate ICT-based educational materials in the teaching-learning process, and design and installation of supporting IT and power infrastructure. For example, one independent evaluation concluded: “While the Uruguayan government is making a great effort in providing funding for the hardware, there is no funding for designing and developing software and content for use with the laptops or for conducting a thorough evaluation of the educational and societal outcomes of the project.”[13] Other evaluations argue that the countrywide deployments envisaged by OLPC are simply beyond the resources of any developing country, saying that governments must set priorities regarding goals and the regions, sectors, and schools to be served.[8,12]

Conclusion

The potential significance of the XO, as well as of other IT innovations, in developing countries calls for systematic, independent evaluation, a true “grand challenge” for the computing and social science communities. Researchers can provide value by conducting well-designed studies of the diffusion and results of such innovation. The knowledge created promises to prevent wasting a great deal of money and effort and lead to quicker diffusion and better use of innovations that prove beneficial. While OLPC has so far fallen short of its goals, there is much yet to be learned by studying this case of IT innovation.

Acknowledgments

The Personal Computing Industry Center (pcic.merage.uci.edu/) is supported by grants from the Alfred P. Sloan Foundation and the U.S. National Science Foundation. Any opinions, findings, and conclusions or recommendations expressed in this article are those of the author(s) and do not necessarily reflect the views of the Sloan Foundation or the National Science Foundation.

References

1. Al-Gahtani, S.S. Computer technology adoption in Saudi Arabia: Correlates of perceived innovation attributes. Information Technology for Development 10, 1 (Jan. 2003), 57–69.
2. Avgerou, C. Information systems in developing countries: A critical research review. Journal of Information Technology 23 (June 2008), 133–146.
3. Avgerou, C. The significance of context in information systems and organizational change. Information Systems Journal 11, 1 (January 2001), 43–63.
4. BBC News. Sub-$100 laptop design unveiled (Sept. 29, 2005); news.bbc.co.uk/2/hi/technology/4292854.stm.
5. Bray, H. Cheaper laptop promised; Negroponte remains determined to realize vision. Boston Globe (Feb. 11, 2009); www.boston.com/business/technology/articles/2009/02/11/cheaper_cheap_laptop_promised/.
6. Einhorn, B. A crusade to connect children. India criticizes an MIT professor’s quest to provide ‘One Laptop Per Child,’ but he’s forging ahead elsewhere. BusinessWeek.com (Aug. 16, 2006); www.businessweek.com/globalbiz/content/aug2006/gb20060816_021986.htm.
7. Farrell, G. ICT in education in Rwanda. In Survey of ICT and Education in Africa: Rwanda Country Report. World Bank Information Development, Washington D.C., Dec. 2007; www.infodev.org/en/Publication.423.htm.
8. Haertl, H. Low-Cost Devices in Educational Systems: The Use of the ‘XO-Laptop’ in the Ethiopian Educational System. Report distributed by the Division of Health, Education and Social Protection, Information and Communication Technologies, GTZ-Project, Deutsche Gesellschaft fur Technische Zusammenarbeit, Eschborn, Germany, Jan. 2008; www.gtz.de/de/dokumente/gtz/2008-en-laptop.pdf.
9. Hamm, S., Smith, G., and Lakshman, N. Social cause meets business reality. BusinessWeek in Focus (June 12, 2008), 48; www.thefreelibrary.com/cial+Cause+Meets+Business+Reality-a01611563648.
10. Hansen, L. Laptop deal links rural Peru to opportunity, risk. Weekend Edition. National Public Radio (Sunday, Dec. 14, 2008).
11. Hiser, S. Bill Gates criticises OLPC. PlexNex blog (Mar. 16, 2006); fussnotes.typepad.com/plexnex/2006/03/bill_gates_crit.html.
12. Hooker, M. 1:1 Technologies/Computing in the Developing World: Challenging the Digital Divide. Global e-Schools and Communities Initiative, Dublin, Ireland, May 2008; www.gesci.org/index.php?option=com_content&task=view&id=75&Itemid=64.
13. Hourcade, J.P., Beitler, D., Cormenzana, F., and Flores, P. Early OLPC experiences in a rural Uruguayan school. In Proceedings of CHI 2008 (Florence, Italy, Apr. 5–10). ACM Press, New York, 2008, 2503–2511.
14. Kirkpatrick, D. Negroponte on Intel’s $100 laptop pullout. Fortune (Jan. 4, 2008); money.cnn.com/2008/01/04/technology/kirkpatrick_negroponte.fortune/index.htm.
15. Krstic, I. Sic Transit Gloria Laptopi. Ivan Krstic blog (May 13, 2008); radian.org/notebook/sic-transit-gloria-laptopi.
16. O’Donnell, B. Worldwide Mini-Notebook PC 2008-2012 Forecast Update and 3Q08 Vendor Shares. Market Analysis. IDC, Framingham, MA, Dec. 2008.
17. Reuters. Intel calls MIT’s $100 laptop a ‘gadget.’ CNet news.com (Dec. 9, 2005); news.com/Intel+calls+MITs+100+laptop+a+gadget/2100-1005_3-5989067.html?tag=html.alert.
18. Rogers, E.M. Diffusion of Innovations, Fifth Edition. Free Press, New York, 1995.
19. Shah, A. OLPC struggles to realize ambitious vision. PC World (Dec. 20, 2007); www.pcworld.com/article/140698/olpc_struggles_to_realize_ambitious_vision.html.
20. Simon, S. Laptops may change the way rural Peru learns. Weekend Edition. National Public Radio (Saturday, Dec. 13, 2008).
21. Stecklow, S. and Bandler, J. A little laptop with big ambitions. WallStreetJournal.com (Nov. 24, 2007); online.wsj.com/public/article/SB119586754115002717.html.
22. Vota, W. OMG: OLPC just gutted: 50% staff cut & more. OLPC News Forum post (Jan. 7, 2009); www.olpcnews.com/forum/index.php?topic=4228.msg28414#msg28414.
23. Vota, W. A Brightstar OLPC give one get one XO computer distributor. One Laptop Per Child News (Oct. 12, 2007); www.olpcnews.com/implementation/plan/brightstar_xo_computer_distribution.html.
24. Vota, W. OLPC Nepal creates content while Negroponte dismisses it. One Laptop Per Child News (Jan. 31, 2007); www.olpcnews.com/countries/nepal/negroponte_curriculum_content.html.
25. Walsham, G. and Sahay, S. Research on information systems in developing countries: Current landscape and future prospects. Information Technology for Development 12, 1 (Feb. 2006), 7–24.
26. Warschauer, M. Dissecting the ‘digital divide’: A case study in Egypt. The Information Society 19, 4 (Sept./Oct. 2003), 297–304.

Kenneth L. Kraemer (kkraemer@uci.edu) is a research professor in the Paul Merage School of Business, Co-Director of the Personal Computing Industry Center, and Associate Director of the Center for Research on Information Technology and Organizations, all at the University of California, Irvine.

Jason Dedrick (jdedrick@uci.edu) is Co-Director and a project scientist in the Personal Computing Industry Center at the University of California, Irvine.

Prakul Sharma (prakuls@uci.edu) is a research associate in the Personal Computing Industry Center at the University of California, Irvine.

© 2009 ACM 0001-0782/09/0600 $10.00

JUNE 2009 | VOL. 52 | NO. 6 | COMMUNICATIONS OF THE ACM 73
review articles
DOI:10.1145/1516046.1516064
Information and communication technology for development can greatly improve quality of life for the world’s neediest people.

BY M. BERNARDINE DIAS AND ERIC BREWER

How Computer Science Serves the Developing World

WHAT DO THE increasingly prominent news stories about $100 laptops, kids learning about computers through a “hole in the wall,” and the power of mobile phones to educate, entertain, and connect people in remote regions have in common? It is the field of information and communication technology for development (ICTD), based on the belief that technology can have a large and positive effect on billions of individuals by helping them overcome the challenges so prevalent in developing regions. ICTD is not new: numerous important though relatively low-profile projects have been building the foundations of ICTD for many years. What’s new are its name and, more important, the increased recognition the field has lately been receiving and its potential for exerting greater influence.

In this article we explore ICTD and examine the role that computer scientists can play in it. Our objective is to convince readers that although achieving all the goals of ICTD will not be easy, even their partial realization could have tremendous impact.

The motivation for this field comes from a new awakening to the vast gap in quality of life between the richest billion people on earth (who enjoy a variety of luxuries, including Internet access) and the poorest billion (who just barely eke out a living, and sometimes not). The base of the world’s economic pyramid has an estimated population of four billion (over half of our planet’s people) living on less than $2 a day.

In response to this awakening, scholars and practitioners have begun to explore the transforming power of information and communication technology when applied to the problems traditionally addressed in development. Can mobile phones provide income generation and facilitate remote medical diagnosis? How can user interfaces be designed so they are accessible to the semiliterate and even the illiterate? What role can computers play in sustainable education for the rural poor? What new devices can we build to encourage literacy among visually impaired children living in poverty? What will a computer that is relevant and accessible to people in developing regions look like? These are just a few of the questions being addressed in ICTD.

In other words, ICTD can be seen as harnessing the power of information and communication technologies, or ICTs, to take up many of the challenges of development. ICTs include technologies ranging from robotic tools and state-of-the-art computers to desktop

Educational initiatives by the TechBridgeWorld group at CMU explore the efficacy of technology tools like an automated
English reading tutor. A more recent partnership with researchers from Ashesi University College in Ghana resulted in
the country’s first undergraduate robotics course.

PHOTOGRAPHS COURTESY OF TECHBRIDGEWORLD AT CARNEGIE MELLON UNIVERSITY

and laptop computers in their traditional forms; and from mobile phones, PDAs, and wireless networks to long-established technologies such as radio and television. The software components also span a wide range, from artificial intelligence and new algorithms, interfaces, and applications to the most prosaic programmed commodities.

Although the goals of international-development efforts vary, depending on the nature of each endeavor, the overarching goal of all such projects is the alleviation of the suffering caused by poverty and improvement of quality of life for the world’s poor. The United Nations’ eight Millennium Development Goals (MDGs) infused new energy into the world’s development efforts and helped to focus them on concrete objectives (eradicating extreme poverty and hunger, improving maternal health, prevailing in the battle against HIV/AIDS and malaria, reducing child mortality, and achieving universal primary education, environmental sustainability, and a global partnership for development) to be met by the year 2015. Other development goals, not emphasized in the MDGs, include access to adequate shelter, information, avenues for income generation, and financial credit. The ongoing rural-to-urban shift of so much of the world’s population has introduced a new set of problems as well, including increased vulnerability to disasters and the corresponding challenges for effective disaster responses. These are among the many international-development challenges that ICTD researchers and practitioners hope to address. They expect to reinvent the form, function, and applications of ICTs in new and creative ways so that such challenges may best be met.

From a CS point of view, ICTD can be seen as the next wave in ubiquitous computing. Historically, computers started as huge machines that filled rooms and were only relevant and accessible to a specialized minority. The next big wave was the home PC, which is now relevant and accessible to over one billion people worldwide. ICTD is perhaps the next revolution in computing: transforming the computer and the applications of computing so that this technology can finally become relevant and accessible to the other five billion people of the world.

Given its position at the intersection of technology and development, ICTD brings together a wide variety of actors in many different roles. Among the newest are computer scientists, and their role is potentially a big one, both for their beneficiaries and themselves. It can change the image of the computer science discipline, the nature of the PC, and the future of the field.

A crucial requirement for success


in ICTD, however, is interdisciplinary collaboration: working with scholars and practitioners from many different fields. Sociologists, ethnographers, and anthropologists, for example, can provide valuable information about the communities intended to benefit from ICTD. This information, regarding such things as cultural practices, traditions, languages, beliefs, and livelihoods, must guide the design and implementation processes for successful solutions in ICTD.

Economists and political scientists play important roles in ICTD as well by designing new economic models, marketing strategies, and governmental policies that affect the economic viability and sustainability of technological interventions. Social scientists also play a crucial role in evaluating the impacts and outcomes of ICTD projects using both qualitative and quantitative methods. They observe and predict how people in developing regions interact with technology, and they aim to affect social systems for adopting technology-aided solutions without disruption to the community. Thus computer scientists working in the field of ICTD must quickly learn to work with this variety of scholarly players, to benefit from their points of view, and to complement them wherever possible.

ICTD does not only cross disciplines; it also transcends the boundaries of academia and involves multiple sectors. This reality obliges ICTD researchers to work with practitioners, government representatives, multilateral institutions such as the United Nations, nonprofits, nongovernmental organizations, and even the private sector, whose interest in ICTD begins as it seeks access to emerging markets and new avenues for corporate social responsibility. Many of these sectors’ people have been addressing the challenges of development for decades, and their efforts should profit from the addition of professionals in CS and related fields who will contribute new perspectives and their useful styles of rigor, critique, and innovation.

ICTD is therefore a truly global undertaking with a grand vision. It brings together numerous players, across geographic, socioeconomic, regional, disciplinary, and sectoral boundaries, who must work together if we are to improve the quality of life for the least privileged on our planet.

The Many Challenges of ICTD
Given its enormous ambitions and multidisciplinary requirements, ICTD presents its researchers with a variety of challenges. They include adapting to unfamiliar cultures and traditions, ensuring accessibility to local languages and multiple levels of literacy, overcoming the barriers of misinformation and mistrust of technology, creating solutions that work within the local infrastructure, and many more. For example, networking must work in circumstances with low bandwidth, intermittent bandwidth, or no bandwidth at all. Computers must operate reliably in environments characterized by dust, heat, humidity, and inexperienced users. User interfaces must accommodate semiliterate and illiterate users. And software applications must be sufficiently intelligent to provide useful, accessible, and relevant services to populations that might be interacting with a computing system for the very first time.

Further, ICTD field tests often require considerable ingenuity, whether they involve accessing target communities, setting up long-term studies, transporting equipment, observing the logistics and legalities of export control laws, addressing safety concerns, or establishing trust and common ground with partnering organizations that cross cultural and geographic boundaries. And to begin with, researchers must be entrepreneurial in obtaining funding for their research, as ICTD is not yet an established field with reliable funding sources.

Although the cause is noble and the impact can be large, ICTD must ultimately be judged on its research value, and in particular its research value in CS. Like other multidisciplinary fields, ICTD must be simultaneously present in multiple communities, each of which may have its own value system for research. Even within computer science, ICTD is judged differently by different CS communities.

In the human-computer interaction (HCI) community (for example, at the annual ACM CHI conferences), ICTD has been well received, as HCI


is multidisciplinary by nature and already deals both with quantitative and qualitative research. Moreover, developing-region users differ in their employment and adoption of technology and thus comprise an important research direction for HCI professionals. In fact, HCI is arguably the easiest discipline within CS in which to work on ICTD research. Other areas with some inherent compatibility include systems, networking, databases, and AI. For example, in systems and networking, which are not as multidisciplinary as HCI and more quantitative in character, ICTD work is less natural, but it can still fit well when technology innovation and novel usage are involved. Examples from top-tier conferences include work on delay-tolerant networking, distributed storage, and novel MAC-layer protocols for long-distance WiFi. In these kinds of approaches to ICTD research, there must be a core technical nugget in addition to real-world deployments.

However, research requires a great deal of effort per published report, given the challenges of deployments; over the long term, ICTD researchers must aim to produce papers that are fewer in number but of higher impact. Moreover, it must be noted that ICTD tends to be driven by the solving of a problem rather than by technological innovation (often, in search of a problem), which means that many ICTD projects may not have a core technical nugget after all. Such problems, although highly satisfying to solve, are harder to claim as CS research.

For most projects, the real research is in actually discovering the specification of the problem via repeated fieldwork and deployments, which is similar in feel to iterative design in HCI. Although HCI is an exception, CS does not generally value problem discovery, especially if the end solution is simple (had we known to apply it). Researcher Matthew Kam went through such iteration to create effective educational games on cellphones:3
• Evaluating 35 existing games for PCs with village students.
• Creating 10 test games for English as a second language (ESL) and testing them with 47 students.
• Studying 28 traditional village games to make the games more intuitive (compared to Western games).
• Implementing a new set of games.
• Leading an ongoing multiyear study on the educational value of these games.
Overall, this process has taken over four years and continues to this day.

ICTD is also developing its own community values over time. The clearest values so far are novelty and on-the-ground empirical results, both quantitative and qualitative. Less clear are the values surrounding repeatability, rigor, and generalizability, and least clear is how to merge the values of qualitative fields such as anthropology or ethnography with those of CS. Consider generalizability: CS values generalizable results as an indicator of potential impact, while qualitative researchers often emphasize the differences in groups or users and aim to broaden the dialogue. This leads to placing value on reusable technology frameworks, such as HCI toolkits, that can be customized and easily localized. We discuss one such framework here for mixed paper/phone applications. ICTD is also creating its own scholarly forums for discussing and disseminating this work. The International Conference on Information and Communication Technologies and Development and the International Conference on Social Implications of Computers in Developing Countries are two examples.

What about Sustainability?
Long-term impact requires that ICTD projects be self-sustaining. First, after the researchers leave and the money stops flowing, does the project continue? Second, can it be replicated in other contexts?

Sustainability is challenging to define, and researchers disagree on the details. Most agree on financial sustainability as a key element: the deployment must produce enough income to at least cover its costs. In this view, philanthropy is acceptable for “kick starting” a project, but not for supporting routine operational costs. Similarly, while projects typically need not be wildly profitable, they should at least be cash-flow positive, as credit can be challenging.

The operating-cost issues add significant constraints to ICTD solutions. They include not just the cost of the technology but also availability (uptime), power requirements, potential for theft, and logistics. One common approach to financial sustainability is to commercialize a solution; this has worked well for mobile phones and treadle pumps, for example. Even if a for-profit venture is not the purpose, researchers must essentially address the same issues of costs, cash flow, awareness (marketing), and ongoing support.

Operational sustainability is the capacity of the permanent staff to keep the project going technically (without the researchers). In theory, financial sustainability enables operational sustainability (by paying for it), but in practice it cannot do so all by itself. This is because of limits on local skills, supplies, and logistics. Solutions must be not only easy to use, but also amenable to straightforward diagnosis and repair with limited training.

Training costs are actually underrated. ICTD projects, particularly in rural areas, cannot view training as a one-time activity needed only when the project starts. Once trained, IT workers are often tempted to leave for better jobs in urban areas or other countries. Thus training is a recurring cost, and it must be short and effective.

These kinds of sustainability are fundamental to scaling a successful pilot project. Unfortunately, development-work pilots rarely turn into large-scale self-sustaining successes. Typically the pilot is small enough and has enough researchers involved (with their own support) that the financial and operational issues do not really hinder it. Thus the pilot is mostly useful to validate prototypes and assess community reactions. The understanding of financial sustainability requires a longer trial with detailed accounting and no hidden subsidies (unless they are expected to continue at scale); it also requires dealing with replacement costs and expected equipment lifetimes. Operational sustainability must be evaluated via detailed tracking of problems and how and by whom they were solved. In both cases, the system evolves to reduce costs or simplify operation.

Finally, replication is the process of moving a successful project to a new environment. As developing regions


are quite heterogeneous in many respects, projects typically need some adjustments to work well with new partners, a different culture, or a different government. Both scaling and replication are active areas of multidisciplinary research, and CS has a critical role to play, given its direct impact on sustainability.

A Few Sample Projects
We have selected four sample ICTD projects that illustrate some of the issues discussed thus far. The projects focus on four different topics (telemedicine, assistive technology, microfinance, and education), all in the context of developing regions. Each example highlights different challenges and characteristics of the ICTD field. Together, these projects reflect CS-related innovation in the areas of systems, networking, HCI, and AI. The past proceedings of the International Conference on Information and Communication Technologies and Development offer a much larger sampling of current or recent research efforts in the ICTD field. Several other examples and an overview of ICTD are also provided in a recently published special edition of IEEE Computer.8

Rural Telemedicine: The Aravind Eye Care Hospital in southern India is a world leader in high-volume low-cost eye care. Working in the state of Tamil Nadu, Aravind served over 2.4 million patients last year and performed over 280,000 cataract surgeries. More than half the patients receive free or discounted eye care (they are subsidized by paying customers), and the hospitals have been financially self-sustaining for decades.

Despite this success, until recently Aravind had limited reach into rural areas; patient surveys indicated that most patients came from within 20km of a hospital and that only 7% of rural patients had access to any kind of eye care. After several iterations, it became clear that the solution was to create rural vision centers (VCs) consisting of 1–2 rooms, a nurse, a technician (to make eyeglasses), and notably the means for high-quality doctor/patient videoconferencing. This “video solution,”7 developed at UC Berkeley, uses novel long-distance WiFi links that are low-cost, low-power, and typically deliver 4Mb/s–6Mb/s between the hospital and the VC over distances ranging from a few to tens of kilometers. (The same basic technology has also been extended to go 382km in Venezuela.)

Having successfully completed a five-VC pilot in early 2006, Aravind now has 24 VCs in operation via a mix of WiFi and DSL (in more urban areas). Some 5,000 patients use the video service per month, with over 100,000 through the end of 2008 having used the WiFi links. Of these 100,000, over 15,000 were effectively blind (primarily due to refractive problems or cataracts), but can now

PHOTOGRAPHS COURTESY OF TECHBRIDGEWORLD AT CARNEGIE MELLON UNIVERSITY

Researchers from CMU’s TechBridgeWorld are working with the Mathru School for the Blind outside Bangalore, India, to enhance
the teaching and learning process for writing Braille through the use of a low-cost writing tutor that gives audio feedback to students.


see well; 85% of them have been able to return to income generation. This example shows how the combination of basic needs and large volumes in developing regions enables ICTD research to have great impact. Aravind recently won the $1M 2008 Gates Foundation Award for Global Health, in large part because of the reach of these vision centers.

Assistive Technology: The Mathru School for the Blind is a residential facility that provides free education, clothing, food, and health services to visually impaired children from socially and economically deprived families from remote parts of India. The school is located in the residential area of Yelahanka, a suburb of Bangalore. Teaching Braille, the only means of literacy for the blind, is an important part of the curriculum at Mathru. However, learning to write Braille using the traditional slate and stylus is not an easy process, for several reasons. First, Braille must be written from right to left in mirror-image format so that the correct Braille characters can be read when the paper is removed from the slate and flipped over. Second, students get delayed feedback; they must wait until their writing is complete and the paper has been removed and read. Third, when the teachers themselves are blind, it is difficult to diagnose problems in the students’ writing process by simply reading the end product. Finally, motivation for learning to write Braille is very low because the process is tedious and sometimes even physically taxing for young students.

Researchers from the TechBridgeWorld group at Carnegie Mellon University are working with Mathru to enhance the teaching and learning process for writing Braille using a slate and stylus. This effort has resulted in a low-cost Braille writing tutor that gives audio feedback to the student as he or she forms characters with the stylus. This Braille tutor,2 first designed, implemented, and field-tested in 2006, has been enhanced through an iterative design process to provide several features. They include teaching basic Braille in several languages, teaching basic math symbols, adapting the operational mode to cater to specific student needs, and several educational games that motivate students to learn the skill of writing Braille. This ongoing research has expanded to several new partnerships, including groups in Qatar, Zambia, and China.

Microfinance Support: The Nobel Peace Prize for Muhammad Yunus brought overdue attention to the powerful role of microfinance in developing regions. Such services are in dire need of technological support, not only for basic accounting but also to reduce fraud and satisfy government mandates for reporting. The required reports in India, for example, specify multiple copies of the same tables

The growth in centers (one per color) and patients in the Aravind telemedicine project in India.

[Figure: Patient Throughput. Number of patients (0 to 5,000) versus time in months (Jan 06 to Sep 08), with one series per vision center: Kallidakurchi, Andipatti, Srivaikundam, Periyakulum, Sholavandan, Chinnamanur, Tirupovanam, Bodi, Alanganallur, Ambasamudram. Source: Sonesh Surana]


in different formats, which are easily done with a spreadsheet but tedious and error-prone on paper, which has been the typical mode.

Tapan Parikh, in his dissertation work at University of Washington, developed a system called CAM6 (short for ‘camera’) that combines the comfort and tangible nature of paper with the power of mobile phones. Two-dimensional barcodes on the paper guide data entry on the phone and help to manage document flow. In addition to workflow support, CAM uses the keypad for numeric input and provides voice feedback, both of which have been well received by semiliterate rural users. This system is now under trial with 400 microfinance groups in India.

Educational Technology and Technology Education: Project Kané,1 an initiative of the TechBridgeWorld group at Carnegie Mellon University, explores the efficacy of technological tools in improving English literacy for children in developing regions, with a focus on Africa. The project started with a three-week pilot study in Ghana that tested the feasibility and impact of using an automated English-reading tutor to improve the level of English literacy among children from low-income families in Accra. This study gave preliminary indications that the tutor had a positive impact on the students’ performance on spelling and fluency tests. It also identified several important factors for success, such as the need to include some local stories familiar to the children and the necessity to narrate the tutorial (on how to use the automated tutor) in a voice with a Ghanaian accent. Based on this initial success, the pilot was scaled to a six-month study that included three groups of children from very different socioeconomic backgrounds, and it has also been replicated in Mongu, Zambia.

The automated tutor used in these studies was not designed for developing regions, however, and it was clear that new educational-technology tools with that focus were needed. This goal is being pursued through a new partnership between TechBridgeWorld researchers and alumni of the course in robotics and artificial intelligence (Ghana’s first) taught at Ashesi University College.

Ayorkor Mills-Tettey, a doctoral candidate in robotics at Carnegie Mellon University and a native of Ghana, had spearheaded a collaborative project between TechBridgeWorld and Ashesi University College to design and teach that course5 at Ashesi, a private, accredited, nonsectarian college dedicated to training a new generation of ethical and entrepreneurial leaders in Africa. The collaboration between the two universities led to a summer course designed and taught with careful consideration of the local context, infrastructure, and resources.

Several students who took this course have now graduated and have followed different employment paths; some headed to industry (including a startup company for developing mobile applications) and others to graduate school. Empowered with a strong technology education, some of these students are now collaborating with TechBridgeWorld researchers to design, implement, and field-test educational technology tools to improve literacy in their homeland.

Looking to the Future
We believe that technology, along with good governance and macroeconomics, represents the path forward for the majority of the world’s people. Consider that in 1970, South Korean and African incomes were similar; but the rapid relative rise of South Korea shows what is possible, due in large part to technology.4 We believe that proactive research and development of ICTs appropriate for developing regions can lead to similar growth and prosperity over time and to an improved quality of life in the immediate future.

Today we have lots of examples and anecdotes about high impact from ICTD in developing regions, but the field remains ad hoc and largely without the benefit of the innovative thinking that more computer scientists would bring to bear. The situation could change substantially, however. The core costs of computing and communication have dropped to a point that enables CS to affect everyone, especially when combined with the flexibility inherent in software that enables low-cost customization for a wide variety of contexts. This combination makes CS uniquely positioned among all disciplines to have immediate and large-scale impact. But the role of CS in development is essentially a community decision, involving whether we value this work or not. For example, will ICTD be a viable path to a tenure-track CS faculty position?

We can say that although the challenges are great, ICTD is both intellectually rewarding and very attractive to students at all levels. With several recent reports citing the dwindling numbers of students interested in studying CS, perhaps ICTD is one answer. It may help motivate a new generation of computer scientists to contribute their knowledge, talents, and energies toward solving some of the world’s most pressing problems.

References
1. Dias, M.B., Mills-Tettey, G.A., and Mertz, J. The TechBridgeWorld Initiative: Broadening perspectives in computing technology, education, and research. In Proceedings of the International Symposium on Women and ICTD: Creating Global Transformation. ACM Press, NY (June 2005).
2. Kalra, N., Lauwers, T., Dewey, D., Stepleton, T., and Dias, M.B. Iterative design of a Braille writing tutor to combat illiteracy. In Proceedings of the 2nd IEEE/ACM International Conference on Information and Communication Technologies and Development (Dec. 2007).
3. Kam, M., Ramachandran, D., Devanathan, V., Tewari, A., and Canny, J. Localized iterative design for language learning in underdeveloped regions: The PACE Framework. In Proceedings of the ACM Conference on Human Factors in Computing Systems (San Jose, CA, Apr. 28–May 3, 2007).
4. Kim, S.J. Information technology and its impact on economic growth and productivity in Korea. International Economic Journal 17, 3 (Oct. 2003), 55–75.
5. Mills-Tettey, G.A., Dias, M.B., Browning, B., and Amanquah, N. Teaching technical creativity through robotics: A case study in Ghana. In Proceedings of the 2nd AI in ICT for Development Workshop, 20th International Joint Conference on Artificial Intelligence (Jan. 2007).
6. Parikh, T.S., Javid, P., Sasikumar, K., Ghosh, K., and Toyama, K. Mobile phones and paper documents: Evaluating a new approach for capturing microfinance data in rural India. In Proceedings of the ACM Conference on Computer-Human Interaction (Apr. 24–27, 2006, Montreal, Canada).
7. Surana, S., Patra, R., Nedevschi, S., and Brewer, E. Deploying a rural wireless telemedicine system: Experiences in sustainability. IEEE Computer 41, 6 (June 2008), 48–56.
8. Toyama, K. and Dias, M.B., guest editors. IEEE Computer Magazine, Special Edition on Information Communication Technology for Development (June 2008).

M. Bernardine Dias (mbdias@ri.cmu.edu) is an assistant research professor at the Robotics Institute of Carnegie Mellon University, Pittsburgh, PA. She founded and directs the TechBridgeWorld group (www.techbridgeworld.org), which pursues technology research relevant to, and in partnership with, underserved communities throughout the globe.

Eric Brewer (brewer@cs.berkeley.edu) is a professor in the computer science division at the University of California, Berkeley. He founded the Federal Search Foundation, which built FirstGov (now USA.gov), the portal for the U.S. government, and he was the founder and chief scientist of the Inktomi Corporation, now part of Yahoo!

© 2009 ACM 0001-0782/09/0600 $10.00

COMMUNICATIONS OF THE ACM | JUNE 2009 | VOL. 52 | NO. 6
research highlights
Technical Perspective: Reframing Security for the Web
By Andrew Myers

Securing Frame Communication in Browsers
By Adam Barth, Collin Jackson, and John C. Mitchell

Technical Perspective: Software and Hardware Support for Deterministic Replay of Parallel Programs
By Norman P. Jouppi

Two Hardware-Based Approaches for Deterministic Multiprocessor Replay
By Derek R. Hower, Pablo Montesinos, Luis Ceze, Mark D. Hill, and Josep Torrellas

DOI:10.1145/1516046.1516065

Technical Perspective
Reframing Security for the Web
By Andrew Myers

THE WEB HAS brought exciting new functionality while simultaneously requiring new mechanisms to make it secure. We’ve repeatedly discovered that these mechanisms are not good enough, as clever hackers and academics have figured out how to circumvent and misuse them to compromise security.

We now live in a world in which viewing an advertisement might compromise your bank account. In the following paper, “Securing Frame Communication in Browsers,” researchers Adam Barth, Collin Jackson, and John Mitchell not only illustrate how subtle some of these security vulnerabilities can be, they show how to solve them in a principled way. This paper has had a real impact: their solutions have already been widely adopted.

Why is Web security difficult? It’s because the Web browser is a place where programs and data from different sources interact. Each source may control resources whose security can be affected by the programs and data from other sources. In fact, there is a deep, underlying problem that has never been satisfactorily solved: how to securely permit fine-grained sharing and communication between programs from mutually distrusting sources. Conventionally, security was considered the job of the operating system. But the granularity of operating system enforcement is far too coarse for Web applications, whose security depends on the precise details of the interactions between application-level data structures such as frames, cookies, and interpreted application code.

Web security forces us to think anew about the problem of fine-grained sharing across trust domains because many exciting new applications and services require this sharing. Some of the techniques developed for operating system security, such as controlled communication between processes, can be adapted to the Web. But Web security poses new challenges as well. For example, Web security violations can occur within the context of a single Web page, which often comprises multiple frames controlled by code from different sources. These frames may be third-party advertisements or integrated content from multiple parties who do not trust each other; the many mashups based on Google Maps are examples of the latter. The absence of effective solutions to the problem of fine-grained interaction between trust domains, coexisting on the very same Web page, has left Web applications vulnerable.

Fortunately, researchers like Barth, Jackson, and Mitchell are applying principled methods to identify and eliminate these vulnerabilities. The vulnerabilities they address arise from the feature of frame navigation in Web browsers. Code running in one frame (that is, one trust domain) can control where another frame loads its content from. The authors use elegant reasoning to identify the most permissive secure policy for controlling frame navigation. This argument is so simple and convincing that the policy they identify has been adopted by most major browsers.

In itself, this would be a significant contribution, but the paper goes farther. It newly identifies vulnerabilities in two important mechanisms for communication between different frames; one of these mechanisms is in the HTML 5 standard. The paper gives a thoughtful and principled analysis of each communication mechanism and identifies a fix for each. These fixes have also been adopted by current browsers and communication libraries.

The paper is a great example of research that has impact precisely because it offers principled solutions. Too often, proposed computer security mechanisms merely raise the bar against attacks, starting the next phase of an arms race. This is a different kind of work: work that clearly identifies and convincingly solves a real security problem. The work described in this paper makes our lives more secure and helps the next generation of applications to be built securely. And their work also helps us understand how to think about the new security challenges that lie ahead.

Andrew Myers is an associate professor of computer science at Cornell University, Ithaca, NY.

© 2009 ACM 0001-0782/09/0600 $10.00



DOI:10.1145/1516046.1516066

Securing Frame Communication in Browsers
By Adam Barth, Collin Jackson, and John C. Mitchell

Abstract
Many Web sites embed third-party content in frames, relying on the browser’s security policy to protect against malicious content. However, frames provide insufficient isolation in browsers that let framed content navigate other frames. We evaluate existing frame navigation policies and advocate a stricter policy, which we deploy in the open-source browsers. In addition to preventing undesirable interactions, the browser’s strict isolation policy also affects communication between cooperating frames. We therefore analyze two techniques for interframe communication between isolated frames. The first method, fragment identifier messaging, initially provides confidentiality without authentication, which we repair using concepts from a well-known network protocol. The second method, postMessage, initially provides authentication, but we discover an attack that breaches confidentiality. We propose improvements in the postMessage API to provide confidentiality; our proposal has been standardized and adopted in browser implementations.

1. INTRODUCTION
Web sites contain content from sources of varying trustworthiness. For example, many Web sites contain third-party advertising supplied by advertisement networks or their sub-syndicates.3 Other common aggregations of third-party content include Flickr albums, Facebook badges, and personalized home pages offered by the three major Web portals (iGoogle, My Yahoo! and Windows Live). More advanced uses of third-party components include Yelp’s use of Google Maps to display restaurant locations, and the Windows Live Contacts gadget. A Web site combining content from multiple sources is called a mashup, with the party combining the content called the integrator, and integrated content called a gadget. In simple mashups, the integrator does not intend to communicate with the gadgets and requires only that the browser provide isolation. In more sophisticated mashups, the integrator does wish to communicate and requires secure interframe communication. When a site wishes to provide isolation and communication between content on its pages, the site inevitably relies on the browser rendering process and isolation policy, because Web content is rendered and viewed under browser control.

In this paper, we study a contemporary Web version of a recurring problem in computer systems: isolating untrusted, or partially trusted, components while providing secure intercomponent communication. Whenever a site integrates third-party content, such as an advertisement, a map, or a photo album, the site runs the risk of incorporating malicious content. Without isolation, malicious content can compromise the confidentiality and integrity of the user’s session with the integrator. Although the browser’s well-known “same-origin policy”19 restricts script running in one frame from manipulating content in another frame, browsers use a different policy to determine whether one frame is allowed to navigate (change the location of) another. Although browsers must restrict navigation to provide isolation, navigation is the basis of one form of interframe communication used by leading companies, and navigation can be used to attack a second interframe communication mechanism.

Many recent browsers have overly permissive frame navigation policies that lead to a variety of attacks. To prevent these attacks, which we demonstrate against the Google AdSense login page and the iGoogle gadget aggregator, we propose tightening the browser’s frame navigation policy. Based on a comparison of four policies, we advocate a specific policy that restricts navigation while maintaining compatibility with existing Web content. We have collaborated with the HTML 5 working group to standardize this policy and with browser vendors to deploy this policy in Firefox 3, Safari 3.1, and Google Chrome. Because the policy is already implemented in Internet Explorer 7, our preferred policy is now standardized and deployed in the four most-used browsers.

With strong isolation, frames are limited in their interactions, raising the issue of how isolated frames can cooperate as part of a mashup. We analyze two techniques for interframe communication: fragment identifier messaging and postMessage. Table 1 summarizes our results.

• Fragment identifier messaging uses frame navigation to send messages between frames. This channel lacks an important security property: messages are confidential but senders are not authenticated. These properties are analogous to a network channel in which senders encrypt their messages with the recipient’s public key. The Microsoft.Live.Channels library uses fragment identifier messaging to let the Windows Live Contacts gadget communicate with its integrator, following an authentication protocol analogous to the Needham–Schroeder public-key protocol.17

The original version of this paper was published in the Proceedings of the 17th USENIX Security Symposium, July 2008.


Table 1: Security properties of frame communication channels.

Channel                          Confidentiality   Authentication   Network Analogue
Fragment identifier messaging    Yes               No               Public Key Encryption
Original postMessage             No                Yes              Public Key Signatures
Improved postMessage             Yes               Yes              SSL/TLS

We discover an attack on this protocol, related to Lowe’s anomaly in the Needham–Schroeder protocol,15 in which a malicious gadget can impersonate the integrator to the Contacts gadget. We suggested a solution based on Lowe’s improvement to the Needham–Schroeder protocol15 that Microsoft implemented and deployed.

• postMessage is a browser API designed for interframe communication10 that is implemented in Internet Explorer 8, Firefox 3, Safari 4, Google Chrome, and Opera. Although postMessage has been deployed in Opera since 2005, we demonstrate an attack on the channel’s confidentiality using frame navigation. In light of this attack, the postMessage channel provides authentication but lacks confidentiality, analogous to a channel in which senders cryptographically sign their messages. To secure the channel, we propose modifying the API. Our proposal has been adopted by the HTML 5 working group and all the major browsers.

The remainder of the paper is organized as follows. Section 2 details our threat models. Section 3 surveys existing frame navigation policies and standardizes a secure policy. Section 4 analyzes two frame communication mechanisms, demonstrates attacks, and proposes defenses. Section 5 describes related work. Section 6 concludes.

2. THREAT MODEL
In this section, we define precise threat models so that we can determine how effectively browser mechanisms defend against specific classes of attacks. We consider two kinds of attackers, a “Web attacker” and a slightly more powerful “gadget attacker.” Although phishing4,6 can be described informally as a Web attack, we do not assume that either the Web attacker or the gadget attacker can fool the user by using a confusing domain name (such as bankofthevvest.com) or by other social engineering. Instead, we assume the user uses every browser security feature, including the location bar and lock icon, accurately and correctly.

2.1. Web attacker
A Web attacker is a malicious principal who owns one or more machines on the network. To study the browser security policy, we assume that the user’s browser renders content from the attacker’s Web site.

• Network Abilities: The Web attacker has no special network abilities. In particular, the Web attacker can send and receive network messages only from machines under his or her control, possibly acting as a client or server in network protocols of the attacker’s choice. Typically, the Web attacker uses at least one machine as an HTTP server, which we refer to as attacker.com. The Web attacker has HTTPS certificates for domains he or she owns; certificate authorities provide such certificates for free. The Web attacker’s network abilities are decidedly weaker than the usual network attacker considered in network security because the Web attacker can neither eavesdrop on messages to nor forge messages from other network locations. For example, a Web attacker cannot be a network “man-in-the-middle.”
• Client Abilities: We assume that the user views attacker.com in a popular browser, rendering the attacker’s content. We make this assumption because an honest user’s interaction with an honest site should be secure even if the user visits a malicious site in another browser window. The Web attacker’s content is subject to the browser’s security policy, making the Web attacker decidedly weaker than an attacker who can execute arbitrary code with the user’s privileges. For example, a Web attacker cannot install a system-wide key logger or botnet client.

We do not assume that the user treats attacker.com as a site other than attacker.com. For example, the user never gives a bank.com password to attacker.com. We also assume that honest sites are free of cross-site scripting vulnerabilities.20 In fact, none of the attacks described in this paper rely on running malicious JavaScript as an honest principal. Instead, we focus on privileges the browser itself affords the attacker to interact with honest sites.

In addition to our interest in protecting users that visit malicious sites, our assumption that the user visits attacker.com is further supported by several techniques for attracting users. For example, an attacker can place Web advertisements, host popular content with organic appeal, or send bulk e-mail encouraging visitors. Typically, simply viewing an attacker’s advertisement (such as on a search page) lets the attacker mount a Web attack. In a previous study,12 we purchased over 50,000 impressions for $30. During each of these impressions, a user’s browser rendered our content, giving us the access required to mount a Web attack.

Attacks accessible to a Web attacker have significant practical impact because these attacks do not require unusual control of the network. Web attacks can also be carried out by a standard man-in-the-middle network attacker, once the user visits a single HTTP site, because a man-in-the-middle



can inject malicious content into the HTTP response, simulating a reply from attacker.com.

2.2. Gadget attacker
A gadget attacker is a Web attacker with one additional ability: the integrator embeds a gadget of the attacker’s choice. This assumption lets us accurately evaluate mashup isolation and communication protocols because the purpose of these protocols is to let an integrator embed untrusted gadgets safely. In practice, a gadget attacker can either wait for the user to visit the integrator or can redirect the user to the integrator’s Web site from attacker.com.

3. FRAME ISOLATION
Web sites can use frames to delegate portions of their screen real estate to other Web sites. For example, a site can sell parts of its pages to advertising networks. The browser displays the location of the main, or top-level, frame in its location bar. Subframes are often visually indistinguishable from other parts of a page, and the browser does not display their location in its user interface.

3.1. Background
The browser’s scripting policy answers the question “when can one frame manipulate the contents of another frame?” The scripting policy is the most important part of the browser security policy because a frame can act on behalf of every other frame it can script. For example,

otherWindow.document.forms[0].password.value

attempts to read the user’s password from another window. Modern Web browsers let one frame read and write all the properties of another frame only when their content was retrieved from the same origin, i.e., when the scheme (e.g., http or https), host, and port of their locations match. If the content of otherWindow was retrieved from a different origin, the browser’s security policy will prevent the script above from accessing otherWindow.document.

In addition to enforcing the scripting policy, every browser must answer the question “when is one frame permitted to navigate another frame?” Prior to 1999, all Web browsers implemented a permissive policy:

Permissive Policy
A frame can navigate any other frame.

For example, if otherWindow includes a frame,

otherWindow.frames[0].location = "https://attacker.com/";

navigates the frame to https://attacker.com/. Under the permissive policy, the browser navigates otherWindow even if it contains content from another origin. There are a number of idioms for navigating frames, including

window.open("https://attacker.com/", "frameName");

which navigates a frame named frameName. Frame names exist in a global name space that is shared across origins.

3.2. Cross-window attacks
In 1999, Georgi Guninski discovered that the permissive frame navigation policy admits serious attacks.7 At the time, the password field on the CitiBank login page was contained within a frame, and the Web attacker could navigate that frame to https://attacker.com/, letting the attacker fill the frame with identical-looking content that steals the password. This cross-window attack proceeds as follows:

1. The user views a blog that displays the attacker’s ad.
2. Separately, the user visits bank.com, which displays its password field in a frame.
3. The advertisement navigates the password frame to https://attacker.com/. The location bar remains https://bank.com and the lock icon remains present.
4. The user enters his or her bank.com password into the https://attacker.com/ frame on the bank.com page, submitting the password to attacker.com.

Of the browsers in heavy use today, Internet Explorer 6 and Safari 3 both implement the permissive policy and allow this attack. Internet Explorer 7 and Firefox 2 implement stricter policies (described in subsequent sections). Many Web sites, including Google AdSense, display their password field in a frame and are vulnerable to this attack; see Figure 1.

Figure 1: Cross-window attack. The attacker hijacks the password field, which is in a frame.

3.3. Same-window attacks
In 2001, Mozilla prevented the cross-window attack by implementing a stricter policy:

Window Policy
A frame can navigate only frames in its window.
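
The contrast between the permissive and window policies can be made concrete with a toy model. The sketch below is a hypothetical illustration, not browser code: frames are plain objects, and the names makeFrame, sameOrigin, canScript, topOf, permissivePolicy, and windowPolicy, as well as the blog.example and integrator.example hosts, are assumptions introduced here.

```javascript
// Hypothetical frame model: a URL, a parent (null for a top-level window), children.
function makeFrame(url, parent = null) {
  const frame = { url: new URL(url), parent, children: [] };
  if (parent) parent.children.push(frame);
  return frame;
}

// Same origin: scheme, host, and port of the two locations all match.
function sameOrigin(a, b) {
  return a.url.protocol === b.url.protocol &&
         a.url.hostname === b.url.hostname &&
         a.url.port === b.url.port;
}

// Scripting policy: one frame may script another only within the same origin.
function canScript(active, target) {
  return sameOrigin(active, target);
}

// Walk up to the top-level frame, which stands in for the containing window.
function topOf(frame) {
  while (frame.parent) frame = frame.parent;
  return frame;
}

// Permissive policy: a frame can navigate any other frame.
function permissivePolicy(active, target) {
  return true;
}

// Window policy: a frame can navigate only frames in its own window.
function windowPolicy(active, target) {
  return topOf(active) === topOf(target);
}

// Cross-window scenario: an ad on a blog targets the login frame on bank.com.
const blog = makeFrame("https://blog.example/");
const ad = makeFrame("https://attacker.com/ad", blog);
const bank = makeFrame("https://bank.com/");
const login = makeFrame("https://bank.com/login", bank);

// Same-window scenario: two gadgets embedded by one integrator.
const integrator = makeFrame("https://integrator.example/");
const evilGadget = makeFrame("https://attacker.com/gadget", integrator);
const mailGadget = makeFrame("https://mail.example/gadget", integrator);

console.log("ad can script login:", canScript(ad, login));
console.log("permissive, ad -> login:", permissivePolicy(ad, login));
console.log("window, ad -> login:", windowPolicy(ad, login));
console.log("window, gadget -> gadget:", windowPolicy(evilGadget, mailGadget));
```

In this model the scripting policy already stops the ad from reading the login frame, yet the permissive policy still lets the ad navigate it, which is the cross-window attack. The window policy denies that navigation but still allows one gadget to navigate a sibling gadget in the same window.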


This policy prevents the cross-window attack because the Web attacker does not control a frame in a trusted window and, without a foothold in the window, the attacker cannot navigate the login frame. However, the window policy is insufficiently strict to protect users because the gadget attacker does have a foothold in a trusted window in a mashup. (Recall that, in a mashup, the integrator combines gadgets from different sources into a single experience.)

• Aggregators: Gadget aggregators, such as iGoogle, My Yahoo! and Windows Live, provide one form of mashup. These sites let users customize their experience by including gadgets (such as stock tickers, weather predictions, and news feeds) on their home page. These sites put third-party gadgets in frames and rely on the browser to protect users from malicious gadgets.
• Advertisements: Web advertising produces mashups that combine first-party content, such as news articles or sports statistics, with third-party advertisements. Most advertisements, including Google AdWords, are contained in frames, both to prevent the advertisers (who provide the gadgets) from interfering with the publisher’s site and to prevent the publisher from using JavaScript to click on the advertisements.

We refer to pages with advertisements as simple mashups because the integrator and the gadgets do not communicate. Simple mashups rely on the browser to provide isolation but do not require interframe communication.

The window policy offers no protection for mashups because the integrator’s window contains untrusted gadgets. A gadget attacker who supplies a malicious gadget does control a frame in the honest integrator’s window, giving the attacker the foothold required to mount a gadget hijacking attack.14 A malicious gadget can navigate a target gadget to attacker.com and impersonate the gadget to the user. For example, iGoogle is vulnerable to gadget hijacking in browsers, such as Firefox 2, that implement the permissive or window policies; see Figure 2. Consider an iGoogle gadget that lets users access their Hotmail account. If the user is not logged into Hotmail, the gadget requests the user’s Hotmail password. A malicious gadget can replace the Hotmail gadget with an identical-looking gadget and steal the user’s Hotmail password. As in the cross-window attack, the user is unable to distinguish the malicious password field from the honest password field.

Figure 2: Gadget hijacking. Under the window policy, the attacker gadget can navigate other gadgets. (a) Before. (b) After.

3.4. Stricter policies
Although browser vendors do not document their navigation policies, we reverse engineered the policies of existing browsers (see Table 2). In addition to the permissive and window policies, we found two other policies:

Descendant Policy
A frame can navigate only its descendants.

Child Policy
A frame can navigate only its direct children.

The Internet Explorer 6 team wanted to enable the child policy by default but shipped the permissive policy because the child policy was incompatible with a large number of Web sites. The Internet Explorer 7 team designed the descendant policy to balance the security requirement to defeat the cross-window attack with the compatibility requirement to support existing sites.18

To select a frame navigation policy that provides the best trade-off between security and compatibility, we appeal to the principle of pixel delegation. When one frame embeds a child frame, the parent frame delegates a region of the screen to the child frame. The browser prevents the child frame from drawing outside of its bounding box but does allow the parent frame to draw over the child using the position: absolute style. Frame navigation attacks hinge on the attacker escalating his or her privileges and drawing on otherwise inaccessible regions of the screen. The descendant policy is the most permissive (and therefore



Table 2: Frame navigation policies deployed in existing browsers prior to our work.

Browser            Policy
IE 6 (Default)     Permissive
IE 6 (Optional)    Child
IE 7 (Default)     Descendant
IE 7 (Optional)    Permissive
Firefox 2          Window
Safari 3           Permissive
Opera 9            Child

most compatible) policy that prevents the attacker from overwriting screen real estate “belonging” to another origin. Although the child policy is stricter than the descendant policy, the added strictness does not provide a significant security benefit because the attacker can simulate the visual effects of navigating a grandchild frame by drawing over the region of the screen occupied by the grandchild frame. The child policy’s added strictness does, however, reduce the policy’s compatibility with existing sites, discouraging browser vendors from deploying the child policy.

Maximizing the compatibility of the descendant policy requires taking the browser’s scripting policy into account. Consider one site that embeds two child frames from a second origin. Should one of those child frames be permitted to navigate its sibling? Strictly construed, the descendant policy forbids this navigation because the target frame is a sibling, not a descendant. However, this navigation should be allowed because an attacker can perform the navigation by injecting a script into the sibling frame that causes the frame to navigate itself. The browser lets the attacker inject this script because the two frames are from the same origin. More generally, the browser can maximize the compatibility of the descendant policy by recognizing origin propagation and letting an active frame navigate a target frame if the target frame is the descendant of a frame in the same origin as the active frame. Defined in this way, the frame navigation policy avoids creating a suborigin privilege.11 This added permissiveness does not sacrifice security because an attacker can perform the same navigations indirectly, but the refined policy is more convenient for honest Web developers.

We collaborated with the HTML 5 working group9 and standardized the descendant policy in the HTML 5 specification. The descendant policy has now been adopted by Internet Explorer 7, Firefox 3, Safari 3.1, and Google Chrome. We also reported a vulnerability in Flash Player that could be used to bypass Internet Explorer 7’s frame navigation policy. Adobe fixed this vulnerability in a security update.

4. FRAME COMMUNICATION
Unlike simple aggregators and advertisements, sophisticated mashups comprise gadgets that communicate with each other and with their integrator. For example, Yelp integrates the Google Maps gadget, illustrating the need for secure interframe communication in real deployments. Google provides two versions of its Maps gadget:

• Frame: In the frame version, the integrator embeds a frame to maps.google.com, in which Google displays a map of the specified location. The user can interact with the map, but the integrator cannot.
• Script: In the script version, the integrator embeds a <script> tag that runs JavaScript from maps.google.com. This script creates a rich JavaScript API that the integrator can use to interact with the map, but the script runs with all of the integrator’s privileges.

Yelp, a popular review Web site, uses the Google Maps gadget to display the locations of restaurants and other businesses. Yelp requires a high degree of interactivity with the Maps gadget because it places markers on the map for each restaurant and displays the restaurant’s review when the user clicks on the marker. To deliver these advanced features, Yelp must use the script version of the Maps gadget, but this design requires Yelp to trust Google Maps completely because Google’s script runs with Yelp’s privileges, granting Google the ability to manipulate Yelp’s reviews and steal Yelp’s customers’ information. Although Google might be trustworthy, the script approach does not scale beyond highly respected gadget providers. Secure interframe communication promises the best of both alternatives: sites with functionality like Yelp can realize the interactivity of the script version of the Google Maps gadget while maintaining the security of the frame version of the gadget.

4.1. Fragment identifier messaging
Although the browser’s scripting policy isolates frames from different origins, clever mashup designers have discovered an unintended channel between frames, fragment identifier messaging,1,21 which is regulated by the browser’s less-restrictive frame navigation policy. This “found” technology lets mashup developers place each gadget in a separate frame and rely on the browser’s security policy to prevent malicious gadgets from attacking the integrator and honest gadgets. We analyze fragment identifier messaging in use prior to our analysis and propose improvements that have since been adopted.

Mechanism: Normally, when a frame is navigated to a new URL, the browser requests the URL from the network and replaces the frame’s document with the retrieved content. However, if the new URL matches the old URL everywhere except in the fragment (the part after the #), then the browser does not reload the frame. If frames[0] is currently located at http://example.com/doc,

frames[0].location = "http://example.com/doc#msg";

changes the frame’s location without reloading the frame or destroying its JavaScript context. The frame can read its fragment by polling window.location.hash to see if the fragment has changed. This technique can be used to send messages between frames while avoiding network latency.

Security Properties: The fragment identifier channel has less-than-ideal security properties. The browser’s scripting


policy prevents other origins from eavesdropping on messages because they are unable to read the frame’s location (even though the navigation policy lets them write the frame’s location). Browsers also prevent arbitrary origins from tampering with portions of messages. Other security origins can, however, overwrite the fragment identifier in its entirety, leaving the recipient to guess the sender of each message.

To understand these security properties, we draw an analogy with the well-known properties of network channels. We view the browser as guaranteeing that the fragment identifier channel has confidentiality: a message can be read only by its intended recipient. The fragment identifier channel fails to be a secure channel, however, because it lacks authentication: a recipient cannot determine the sender of a message unambiguously. The attacker might be able to replay previous messages using the browser’s history API.

The fragment identifier channel is analogous to a channel on an untrusted network in which each message is encrypted with the public key of its intended recipient. In both cases, when Alice sends a message to Bob, no one except Bob learns the contents of the message (unless Bob forwards the message). In both settings, the channel does not provide a reliable procedure for determining who sent a given message. There are two key differences between the fragment identifier channel and the public-key channel:

1. The public-key channel is susceptible to traffic analysis, but an attacker cannot determine the length of a message sent over the fragment identifier channel. An attacker can extract timing information by polling the browser’s clock, but obtaining high-resolution timing information degrades performance.
2. The fragment identifier channel is constrained by the browser’s frame navigation policy. In principle, this could be used to construct protocols secure for the fragment identifier channel that are insecure for the public-key channel (by preventing the attacker from navigating the recipient), but in practice this restriction has not prevented us from constructing attacks on existing implementations.

Despite these differences, we find the network analogy useful in analyzing interframe communication.

Windows Live Channels: Microsoft uses fragment identifier messaging in its Windows Live platform library to implement a higher-level channel API, Microsoft.Live.Channels [21]. The Windows Live Contacts gadget uses this API to communicate with its integrator. The integrator can instruct the gadget to add or remove contacts from the user’s contacts list, and the gadget can send the integrator details about the user’s contacts. Whenever the integrator asks the gadget to perform a sensitive action, the gadget asks the user to confirm the operation and displays the integrator’s host name to aid the user in making trust decisions. Prior to our analysis, Microsoft.Live.Channels used a protocol to add authentication to the fragment identifier channel. By reverse engineering the implementation, we determined that the library used the following protocol to establish a secure channel:

    A → B : N_A, URI_A
    B → A : N_A, N_B
    A → B : N_B, Message_1

In this notation, A and B are frames, N_A and N_B are fresh nonces (numbers chosen at random during each run of the protocol), and URI_A is the location of A’s frame. Under the network analogy described above, this protocol is analogous to the classic Needham–Schroeder public-key protocol [17]. The Needham–Schroeder protocol was designed to establish a shared secret between two parties over an insecure channel. Instead of using encryption as in the Needham–Schroeder protocol, Windows Live relies on the fragment identifier channel to provide confidentiality.

The Needham–Schroeder public-key protocol has a well-known anomaly, due to Lowe [15], that leads to an attack in the browser setting. In the Lowe scenario, an honest principal, Alice, initiates the protocol with a dishonest party, Eve. Eve then convinces honest Bob that she is Alice. In order to exploit the Lowe anomaly, an honest principal must be willing to initiate the protocol with a dishonest principal. This requirement is met in mashups because the integrator initiates the protocol with the gadget attacker’s gadget when the mashup is initialized. The Lowe anomaly can be exploited to impersonate the integrator to the gadget:

    Integrator → Attacker : N_I, URI_I
    Attacker → Gadget : N_I, URI_I
    Gadget → Integrator : N_I, N_G
    Integrator → Attacker : N_G, Message_1

After these four messages, the attacker possesses N_I and N_G and can impersonate the integrator to the gadget. We have implemented this attack against the Windows Live Contacts gadget. The anomaly is especially problematic for the Contacts gadget because it displays the integrator’s host name to the user in its security user interface (see Figure 3).

Securing Fragment Identifier Messaging: The channel can be secured using a variant of the Needham–Schroeder–Lowe protocol [15]. As in Lowe’s improvement to the original protocol, we recommend including the responder’s identity in the second message of the protocol, letting the honest initiator detect the attack and abort the protocol:

    A → B : N_A, URI_A
    B → A : N_A, N_B, URI_B
    A → B : N_B

We contacted Microsoft, the OpenAJAX Alliance, and IBM about the vulnerabilities in their fragment identifier messaging protocols. Microsoft and the OpenAJAX Alliance have adopted our suggestions and deployed the above protocol in



updated versions of their libraries. IBM adopted our suggestions and revised their SMash [14] paper.

Figure 3: Lowe Anomaly. The gadget believes the request came from integrator.com, but in reality the request was made by attacker.com.

4.2. postMessage
HTML 5 [10] specifies a new browser API for asynchronous communication between frames. Unlike fragment identifier messaging, postMessage was designed for cross-origin communication. The postMessage API was originally implemented in Opera 8 and is now supported by Internet Explorer 8, Firefox 3, Safari 3.1, and Google Chrome. We discovered a vulnerability in an early version of the API, which has since been eliminated by modifications we suggested.

To send a message to another frame, the sender calls the postMessage method:

    frames[0].postMessage("Hello world.");

In the recipient’s frame, the browser generates a message event with the message, the origin (scheme, host, and port) of the sender, and a reference to the sender’s frame.

Security Properties: The postMessage channel guarantees authentication (messages accurately identify their senders), but the channel lacks confidentiality. Thus, postMessage has almost the “opposite” security properties as fragment identifier messaging. The postMessage channel is analogous to a channel on an untrusted network in which each message is cryptographically signed by its sender. In both settings, if Alice sends a message to Bob, Bob can determine unambiguously that Alice sent the message. With postMessage, the origin property identifies the sender; with cryptographic signatures, the signature identifies the signer. One difference between the channels is that cryptographic signatures can be easily replayed, but postMessage resists replay attacks.

Attacks: We discovered an attack that breaches the confidentiality of the postMessage channel. Because a message sent with postMessage is directed at a frame, an attacker can intercept the message by navigating the frame to attacker.com before the browser generates the message event:

• Recursive Mashup Attack: If an integrator calls postMessage on a gadget contained in a frame, the attacker can load the integrator inside a frame and intercept the message by navigating the gadget frame (a descendant of the attacker’s frame) to attacker.com. When the integrator calls postMessage on the “gadget’s” frame, the browser delivers the message to the attacker (see Figure 4).
• Reply Attack: Suppose the integrator uses the origin to decide whether to reply to a message event:

    if (evt.origin == "https://gadget.com")
      evt.source.postMessage(secret);

  The attacker can intercept the secret by navigating the source frame before the browser generates the message event. This attack can succeed even under the child frame navigation policy if the honest gadget sends its messages via top.postMessage(...). The attacker’s gadget can embed a frame to the honest gadget and navigate the honest gadget before the integrator replies to the “gadget’s” frame (see Figure 5).

Figure 4: Recursive mashup attack. The attacker navigates the gadget’s frame to attacker.com.

Figure 5: Reply attack. The attacker intercepts the integrator’s response to the gadget’s message.

Securing postMessage: Although sites might be able to build a secure channel using the original postMessage API, we recommend that postMessage provide confidentiality natively. In MashupOS [22], we previously proposed that interframe communication APIs should let the sender specify the origin of the intended recipient. Similarly, we propose


extending the postMessage API with a second parameter: targetOrigin. The browser will deliver the message only if the frame’s current origin matches the specified targetOrigin. If the sender uses “*” as the targetOrigin, the browser will deliver the message to any origin. Using this improved API, a frame can reply to a message using the following idiom:

    if (evt.origin == "https://gadget.com")
      evt.source.postMessage(secret, evt.origin);

We implemented this API change as patches for Firefox and Safari. Our proposal was accepted by the HTML 5 working group [8]. The improved API is now available in Internet Explorer 8, Firefox 3, Safari 4, and Google Chrome.

5. RELATED WORK

5.1. Mitigations for gadget hijacking
SMash [14] mitigates gadget hijacking (also known as “frame phishing”) by carefully monitoring the frame hierarchy and browser events for unexpected navigations. Although neither the integrator nor the gadgets can prevent these navigations, the mashup can alert the user and refuse to function if it detects an illicit navigation. SMash waits 20 s for a gadget to load before assuming that the gadget has been hijacked. An attacker might be able to fool the user into entering sensitive information during this interval, but using a shorter interval might cause users with slow network connections to receive spurious warnings. The descendant policy makes such mitigation unnecessary.

5.2. Safe subsets of HTML and JavaScript
One way to sidestep the security issues of frame-based mashups is to avoid using frames by combining the gadgets and the integrator into a single document. This approach forgoes the protections afforded by the browser’s security policy and requires gadgets to be written in a “safe subset” of HTML and JavaScript that prevents a malicious gadget from attacking the integrator or other gadgets. Several open-source implementations (FBML, ADsafe, and Caja) are available. FBML is currently the most successful subset and is used by the Facebook Platform.

5.3. Subspace
In Subspace [13], we used a multilevel hierarchy of frames that coordinated their document.domain property to communicate directly in JavaScript. Similar to most frame-based mashups, the descendant frame navigation policy is required to prevent gadget hijacking.

5.4. Module tag
The proposed <module> tag [2] is similar to the <iframe> tag, but the module runs in an unprivileged security context, without a principal, and the browser prevents the integrator from overlaying content on top of the module. Unlike postMessage, the communication primitive used with the <module> tag is explicitly unauthenticated because the module lacks a principal.

5.5. Security = restricted and jail
Internet Explorer supports a security attribute [16] for frames. When set to restricted, the frame’s content cannot run JavaScript. Similarly, the proposed <jail> tag [5] encloses untrusted content and prevents the jailed content from running JavaScript. Unfortunately, eliminating JavaScript prevents gadgets from offering interactive experiences.

5.6. MashupOS
In MashupOS [22], we proposed new primitives both for isolation and communication. Our improvements to frame navigation policies and postMessage let developers realize some of the benefits of MashupOS using existing browsers.

6. CONCLUSION
Web sites that combine content from multiple sources can leverage browser frame isolation and interframe communication. Although the browser’s same-origin security policy restricts direct access between frames, recent browsers have used differing policies to regulate when one frame may navigate another. The original permissive frame navigation policy admits a number of attacks, and the subsequent window navigation policy leaves mashups vulnerable to similar attacks. The better descendant policy, which we collaborated with the HTML 5 working group to standardize, balances security and compatibility and has been adopted by Internet Explorer 7 (independently), Firefox 3, Safari 3.1, and Google Chrome.

In existing browsers, frame navigation can be used for interframe communication via a technique known as fragment identifier messaging. If used directly, fragment identifier messaging lacks authentication. We showed that the authentication protocols used by Microsoft.Live.Channels, SMash, and OpenAjax 1.1 were vulnerable to attacks but can be repaired in a manner analogous to Lowe’s variation of the Needham–Schroeder protocol [15]. This improvement has been adopted by Microsoft Windows Live, IBM SMash [14], and the OpenAjax Alliance.

Originally, postMessage, another interframe communication channel, suffered the converse vulnerability: using frame navigation, an attacker could breach confidentiality. We propose extending the postMessage API to provide confidentiality by letting the sender specify an intended recipient. Our proposal has been adopted by the HTML 5 working group, Internet Explorer 8, Firefox 3, Safari 4, Google Chrome, and Opera.

With these improvements, frames provide stronger isolation and better communication, becoming a more attractive feature for integrating third-party Web content. One important area of future work is improving the usability of the browser’s security user interface. For example, a gadget is permitted to navigate the top-level frame, redirecting the user from the mashup to a site of the attacker’s choice. Although the browser’s location bar makes this navigation evident, many users ignore the location bar. Another area for future work is improving isolation in the face of browser implementation errors, which could let a gadget subvert the browser’s security mechanisms.
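As a rough illustration of the repaired fragment identifier handshake, the sketch below replays the Lowe scenario in plain JavaScript. It is a toy model under stated assumptions, not the Windows Live, SMash, or OpenAjax code: the function names, the message field names, and the fixed nonce strings are all hypothetical, and real nonces would be fresh random values.

```javascript
// Toy model of the three-message handshake. All names are hypothetical.

// Check performed by the honest initiator on the second protocol message.
// `session.peerUri` is the origin the initiator believes it is talking to.
function initiatorAcceptsReply(session, reply, checkResponderIdentity) {
  if (reply.initiatorNonce !== session.nonce) {
    return false; // reply does not belong to this protocol run
  }
  if (checkResponderIdentity && reply.responderUri !== session.peerUri) {
    return false; // Lowe's fix: responder is not the intended peer, so abort
  }
  return true;
}

// Replays the Lowe scenario: the integrator starts a run with the attacker,
// who relays message 1 to the honest gadget.
function runLoweScenario(checkResponderIdentity) {
  const session = {
    nonce: "N_I", // stands in for a fresh random nonce
    selfUri: "https://integrator.com",
    peerUri: "https://attacker.com",
  };
  // Message 1 (Integrator -> Attacker), relayed unchanged (Attacker -> Gadget).
  const msg1 = { initiatorNonce: session.nonce, initiatorUri: session.selfUri };
  // Message 2 (Gadget -> Integrator): the gadget answers the URI named in message 1.
  const msg2 = {
    initiatorNonce: msg1.initiatorNonce,
    responderNonce: "N_G",
    responderUri: "https://gadget.com", // only checked in the repaired protocol
  };
  if (!initiatorAcceptsReply(session, msg2, checkResponderIdentity)) {
    return { aborted: true, attackerLearns: [msg1.initiatorNonce] };
  }
  // Message 3 (Integrator -> Attacker) discloses the gadget's nonce.
  return {
    aborted: false,
    attackerLearns: [msg1.initiatorNonce, msg2.responderNonce],
  };
}

console.log(runLoweScenario(false)); // original protocol
console.log(runLoweScenario(true)); // responder identity included in message 2
```

In this model the only difference between the two runs is the identity check on the second message, which is the change the repaired protocol introduces.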

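The targetOrigin delivery rule can likewise be sketched outside a browser. The snippet below is a minimal model, assuming frames are plain objects with an origin and an inbox; the `deliver` function and the frame shape are hypothetical names for illustration and are not browser APIs.

```javascript
// Toy model of the delivery rule; frames are plain objects, not browser frames.
function deliver(frame, message, targetOrigin) {
  // Deliver only if the frame's current origin matches the target,
  // or the sender explicitly allowed any origin with "*".
  if (targetOrigin === "*" || frame.origin === targetOrigin) {
    frame.inbox.push(message);
    return true;
  }
  return false; // message is dropped
}

// The integrator intends to reply to a gadget frame...
const sourceFrame = { origin: "https://gadget.com", inbox: [] };
// ...but the attacker navigates that frame before the reply is sent.
sourceFrame.origin = "https://attacker.com";

const leakedWithWildcard = deliver(sourceFrame, "secret", "*");
const leakedWithTarget = deliver(sourceFrame, "secret", "https://gadget.com");
console.log(leakedWithWildcard, leakedWithTarget);
```

Under this model the reply attack only succeeds when the sender opts out of the check with the wildcard, which is why the idiom shown earlier passes evt.origin as the second argument.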


Acknowledgments
We thank Mike Beltzner, Sumeer Bhola, Dan Boneh, Gabriel E. Corvera, Ian Hickson, Koji Kato, Eric Lawrence, Erick Lee, David Lenoe, David Ross, Maciej Stachowiak, Hallvord Steen, Peleus Uhley, Jeff Walden, Sam Weinig, and Boris Zbarsky for their helpful suggestions and feedback. This work is supported by grants from the National Science Foundation and the US Department of Homeland Security.

References
1. Burke, J. Cross domain frame communication with fragment identifiers. http://tagneto.blogspot.com/2006/06/cross-domain-frame-communication-with.html.
2. Crockford, D. The <module> tag. http://www.json.org/module.html.
3. Daswani, N., Stoppelman, M. et al. The anatomy of Clickbot.A. In Proceedings of the HotBots (2007).
4. Dhamija, R., Tygar, J.D., Hearst, M. Why phishing works. In CHI ‘06: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (2006).
5. Eich, B. JavaScript: Mobility and ubiquity. http://kathrin.dagstuhl.de/files/Materials/07/07091/07091.EichBrendan.Slides.pdf.
6. Felten, E.W., Balfanz, D., Dean, D., Wallach, D.S. Web spoofing: An Internet con game. In Proceedings of the 20th National Information Systems Security Conference (1996).
7. Guninski, G. Frame spoofing using loading two frames. Mozilla Bug 13871.
8. Hickson, I. Re: A potential slight security enhancement to postMessage, February 2008. http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-February/013949.html.
9. Hickson, I. Re: HTML5 frame navigation policy, April 2008. http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-April/014597.html.
10. Hickson, I. et al. HTML 5 Working Draft. http://www.whatwg.org/specs/web-apps/current-work/.
11. Jackson, C., Barth, A. Beware of finer-grained origins. In Proceedings of the Web 2.0 Security and Privacy (W2SP) (2008).
12. Jackson, C., Barth, A., Bortz, A., Shao, W., Boneh, D. Protecting browsers from DNS rebinding attacks. In Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS) (2007).
13. Jackson, C., Wang, H.J. Subspace: Secure cross-domain communication for web mashups. In Proceedings of the 16th International World Wide Web Conference (WWW) (2007).
14. De Keukelaere, F., Bhola, S., Steiner, M., Chari, S., Yoshihama, S. SMash: Secure cross-domain mashups on unmodified browsers. In Proceedings of the 17th International World Wide Web Conference (WWW) (2008). To appear.
15. Lowe, G. Breaking and fixing the Needham–Schroeder public-key protocol using FDR. In Proceedings of TACAS (volume 1055, 1996), Springer Verlag.
16. Microsoft. SECURITY attribute (FRAME, IFRAME). http://msdn2.microsoft.com/en-us/library/ms534622(VS.85).aspx.
17. Needham, R.M., Schroeder, M.D. Using encryption for authentication in large networks of computers. Commun. ACM, 21, 12 (1978), 993–999.
18. Ross, D., January 2008. Personal communication.
19. Ruderman, J. JavaScript Security: Same Origin. http://www.mozilla.org/projects/security/components/same-origin.html.
20. Stuttard, D., Pinto, M. The Web Application Hacker’s Handbook. Wiley, 2007.
21. Thorpe, D. Secure cross-domain communication in the browser. Archit. J. 12 (2007), 14–18.
22. Wang, H.J., Fan, X., Howell, J., Jackson, C. Protection and communication abstractions for web browsers in MashupOS. In Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP) (2007).

Adam Barth (abarth@eecs.berkeley.edu), UC Berkeley.
Collin Jackson (collinj@cs.stanford.edu), Stanford University.
John C. Mitchell (mitchell@cs.stanford.edu), Stanford University.

© 2009 ACM 0001-0782/09/0600 $10.00

DOI:10.1145/1516046.1516067

Technical Perspective
Software and Hardware Support for Deterministic Replay of Parallel Programs
By Norman P. Jouppi

PARALLEL PROGRAMMING HAS long been recognized as a difficult problem. This problem has recently taken on a sense of urgency: the long march of single-thread performance increases has stopped in its tracks. Due to limitations on power dissipation and decreased return on investments in additional processor complexity, additional transistors provided by Moore’s Law are now being channeled into a geometrically increasing number of cores per die. These cores can easily be applied to embarrassingly parallel problems and distributed computing in server environments by running multiple parallel tasks. However, there are many applications where increased performance on formerly single-threaded applications is highly desirable, ranging from personal computing devices to capability supercomputers.

A key problem in parallel programming is the ability to find concurrency bugs and to debug program execution in the presence of memory races on accesses to both synchronization and data variables. Subsequent executions of a parallel program containing a race or bug are unlikely to have the same exact ordering on each execution due to nondeterministic system effects. This may cause rare problems to occur long after deployment of an application. Rerunning programs in a slower debug mode also changes the relative timing, and can easily mask problems. Ideally what we’d like is a way to deterministically replay execution of parallel programs, by recording the outcome of memory races without significantly slowing down the execution of the original program. Additionally, one would like logging requirements of the execution to be manageable, and the replay of applications to occur at a speed similar to that of the original execution.

An important step in this direction appeared five years ago in the University of Wisconsin’s Flight Data Recorder [1]. This original system required significant hardware state. However, for mainstream adoption, the additional hardware should be very small, since all users of a microprocessor design will be paying for the hardware support whether they use it or not. After several evolutionary enhancements in previous years, two systems appeared this year that make a quantum leap forward in reducing the overheads needed to support deterministic replay: instead of recording individual memory references, both of these systems only record execution of atomic blocks of instructions.

Rerun, also from the University of Wisconsin, reduces overhead by recording atomic episodes. An episode is a series of instructions from a single thread that happen to execute without conflicting with any other thread in the system. Episodes are created automatically by the recording system without modification of the applications. Rerun uses Lamport Scalar Clocks to order episodes and enable replay of an equivalent execution. Rerun reduces the hardware state to 166 bytes per core and the log size to only around 1.67 bytes/kiloinstruction per core in an 8-core system. This results in a core*log overhead product for many-core systems that is more than an order of magnitude smaller than previous work.

DeLorean, developed contemporaneously at the University of Illinois, executes large blocks of instructions atomically separated by checkpoints, like in transactional memory or thread-level speculation. Executing larger chunks of instructions provides benefits in both log size and replay speed. For example, for an 8-core processor, DeLorean is able to achieve a log size of only 0.0063 bytes/kiloinstruction per core while still being able to replay at 72% of the original execution speed. To put this in perspective, with this log size an entire day’s execution of an 8-core processor would only take 20GB, a small fraction of a 1TB disk drive.

The following paper is a first for Communications’ Research Highlights section: it contains a synthesis of recent work from two competing (but collegial) research teams. Both the Rerun and DeLorean teams were invited to contribute to this paper since their approaches appeared in the same conference session, both represent significant advances, and their approaches are actually complementary: Rerun requires very little additional hardware, whereas DeLorean can achieve much smaller log sizes but requires checkpoint and recovery hardware similar to that provided in transactional memory systems. Both research streams have the potential to make a significant impact on the productivity of future parallel programming.

Reference
1. Xu, M., Bodik, R., and Hill, M.D. A “flight data recorder” for enabling full-system multiprocessor deterministic replay. ACM/IEEE International Symposium on Computer Architecture, June 2003.

Norman P. Jouppi (Norm.Jouppi@hp.com) is a Fellow and Director of Hewlett-Packard’s Exascale Computing Lab in Palo Alto, CA.

© 2009 ACM 0001-0782/09/0600 $10.00
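The 20GB-per-day figure quoted above can be checked with a short calculation. The sketch below is only an illustration: the text does not state an instruction rate, so the value of 4.6 billion instructions per second per core is an assumption chosen here because it roughly reproduces the quoted number, and the function name is hypothetical.

```javascript
// Log volume per day for a per-core log rate given in bytes per 1000 instructions.
function logBytesPerDay(bytesPerKiloInstr, cores, instrPerSecondPerCore) {
  const secondsPerDay = 24 * 60 * 60;
  const bytesPerInstr = bytesPerKiloInstr / 1000;
  return bytesPerInstr * instrPerSecondPerCore * secondsPerDay * cores;
}

// ASSUMPTION: sustained instruction rate per core; not given in the text.
const assumedInstrPerSecondPerCore = 4.6e9;

// Log rates quoted in the text for an 8-core system.
const deLoreanBytes = logBytesPerDay(0.0063, 8, assumedInstrPerSecondPerCore);
const rerunBytes = logBytesPerDay(1.67, 8, assumedInstrPerSecondPerCore);

console.log("DeLorean: " + (deLoreanBytes / 1e9).toFixed(1) + " GB/day");
console.log("Rerun: " + (rerunBytes / 1e12).toFixed(1) + " TB/day");
```

Whatever instruction rate is assumed, the ratio between the two daily volumes is fixed by the quoted log rates (1.67 / 0.0063, roughly 265), which is the trade-off against Rerun's smaller hardware cost.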

DOI:10.1145/1516046.1516068

Two Hardware-Based Approaches for Deterministic Multiprocessor Replay
By Derek R. Hower, Pablo Montesinos, Luis Ceze, Mark D. Hill, and Josep Torrellas

The original Wisconsin Rerun [6] paper as well as the original Illinois DeLorean [11] paper were published in the Proceedings of the 35th Annual International Symposium on Computer Architecture (June 2008).

Abstract
Many shared-memory multithreaded executions behave nondeterministically when run on multiprocessor hardware such as emerging multicore systems. Recording nondeterministic events in such executions can enable deterministic replay (e.g., for debugging). Most challenging to record are memory races that can potentially occur on almost all memory references. For this reason, researchers have previously proposed hardware to record key memory race interactions among threads.

The two research groups coauthoring this paper independently uncovered a dual approach: focus on recording how long threads execute without interacting. From this common insight, the groups developed two significantly different hardware proposals. Wisconsin Rerun makes few changes to standard multicore hardware, while Illinois DeLorean promises much smaller log sizes and higher replay speeds. By presenting both proposals in one paper, we seek to illuminate the promise of the joint insight and inspire future designs.

1. INTRODUCTION
Modern computer systems are inherently nondeterministic due to a variety of events that occur during an execution, including I/O, interrupts, and DMA fills. The lack of repeatability that arises from this nondeterminism can make it difficult to develop and maintain correct software. Furthermore, it is likely that the impact of nondeterminism will only increase in the coming years, as commodity systems are now shared-memory multiprocessors. Such systems are not only impacted by the sources of nondeterminism in uniprocessors, but also by the outcome of memory races among concurrent threads.

In an effort to help ease the pain of developing software in a nondeterministic environment, researchers have proposed adding deterministic replay capabilities to computer systems. A system with a deterministic replay capability can record sufficient information during an execution to enable a replayer to (later) create an equivalent execution despite the inherent sources of nondeterminism that exist. With the ability to replay an execution verbatim, many new applications may be possible:

Debugging: Deterministic replay could be used to provide the illusion of a time-travel debugger that has the ability to selectively execute both forward and backward in time.

Security: Deterministic replay could also be used to enhance the security of software by providing the means for an in-depth analysis of an attack, hopefully leading to rapid patch deployment and a reduction in the economic impact of new threats.

Fault Tolerance: With the ability to replay an execution, it may also be possible to develop hot-standby systems for critical service providers using commodity hardware. A virtual machine (VM) could, for example, be fed, in real time, the replay log of a primary server running on a physically separate machine. The standby VM could use the replay log to mimic the primary’s execution, so that in the event that the primary fails, the backup can take over operation with almost zero downtime.

As existing commercial products have already shown, deterministic replay can be achieved with a software-only solution when executing in a uniprocessor environment [18]. This is due, in part, to the fact that sources of nondeterminism in a uniprocessor, such as interrupts or I/O, are relatively rare events that take a long time to complete. However, when executing in a shared-memory multiprocessor environment, memory races, which can potentially occur on every memory access, are another source of nondeterminism. All-software solutions exist [4, 8], but results show that they do not perform well on workloads that interact frequently. Thus, it is likely that a general solution will require hardware support. To this end, Bacon and Goldstein [2] originally proposed recording all snooping coherence transactions, which, while fast, produced a serial and voluminous log (see Figure 1).

Xu et al. [16] modernized hardware support for multiprocessor deterministic replay in general and memory race recording in particular. A memory race recorder is responsible for logging enough information to reconstruct the order of all fine-grained memory interleavings that occur during an execution. To reduce the amount of information that needs to be logged (so that longer periods can be recorded for a fixed hardware cost), the system proposed by Xu et al. implemented in hardware an enhancement to Netzer’s transitive reduction optimization [13]. The idea is to skip the logging of those races that can be implied through transitivity, i.e., those races


Figure 1: An example of efficient race recording using (a) an explicit transitive reduction and (b) independent regions. In (a), solid lines between threads are races written to the log, while dashed lines are those races implied through transitivity.

implied through the combination of previously logged races and sequential program semantics. Figure 1a illustrates a transitive reduction. Inter-thread races between instructions accessing locations A and B, respectively, are not logged since they are implied by the recorded race for location F.

While both the original [16] and follow-on [17] work by Xu et al. were successful in achieving efficient log compression (~1B/1000 instructions executed), they required a large amount of hardware state, on the order of an additional L1 cache per core, in order to do so. Subsequent work by Narayanasamy et al. [12] on the Strata race recorder reduced this hardware requirement but, as results in Hower and Hill [6] show, may not scale well as the number of hardware contexts in a system increases. This is largely because Strata writes global information to its log entries that contains a component from each hardware thread context in the system.

A key observation, discovered independently by the authors of this paper at the Universities of Illinois and Wisconsin, is that by focusing on regions of independence, rather than on individual dependencies, an efficient and scalable memory race recorder can be made without sacrificing logging efficiency. Figure 1b illustrates this notion by breaking the execution of Figure 1a into an ordered series of independent execution regions. Because intra-thread dependencies are implicit and do not need to be recorded, the execution in Figure 1b can be completely described by the three inter-thread dependencies, which is the same amount of information required after a transitivity reduction shown in Figure 1a.

The authors of this paper have developed two different systems, called Rerun [6] and DeLorean [11], that both exploit the same independence observation described above. These systems, presented in the same session of ISCA 2008, exemplify different trade-offs in terms of logging efficiency and implementation complexity. Rerun can be implemented with small modifications to existing memory system archi-

2. RERUN
Wisconsin Rerun [6] exploits the concept of episodic race recording to achieve efficient logging with only small modifications to existing memory system architectures. The Rerun race recorder does not interfere with a running program in any way; it is an impartial observer of a running execution, and as such avoids artificially perturbing the execution under observation.

2.1. Episodic memory race recording
This section develops insights behind Rerun. It motivates Rerun with an example, gives key definitions, and explains how Rerun establishes and orders episodes.

Motivating Example and Key Ideas: Consider the execution in Figure 2 that highlights two threads i and j executing on a multicore system. Dynamic instructions 1–4 of thread i happen to execute without interacting with instructions running concurrently on thread j. We call these instructions, collectively labeled E1, an episode in thread i’s execution. Similarly, instructions 1–3 of thread j execute without interaction and constitute an episode E2 for thread j. As soon as a thread’s episode ends, a new episode begins. Thus, every instruction execution is contained in an episode, and episodes cover the entire execution (right side of Figure 2).

Figure 2: An example of episodic recording. Dashed lines indicate episode boundaries. In the blown up diagram of threads i and j, the shaded boxes show the state of the episode as it ends, including the read and write sets, memory reference counter, and the timestamp. The shaded box in the last episode of thread i shows the initial episode state.

Rerun must solve two subproblems in order to ensure that enough episodic information is recorded to enable deterministic replay of all memory races. First, it must determine when an episode ends, and, by extension, when the next one begins. To remain independent, an episode E must end when another thread issues a memory reference that conflicts with references made in episode E. Two memory accesses conflict if they reference the same memory block, are from different threads, and at least one is a write. For example, episode E1 in Figure 2 ends because thread j accesses the variable F that was previously written (i.e., F is in the write set of E1). Formally, for all combinations of episodes E and F
W := r10 Timestamp: 44
tectures but writes a larger log than DeLorean. DeLorean Z := 34
r3 := 54
can achieve a greater log size reduction and a higher replay
speed but requires novel hardware to do so.



in an execution, the no-conflict condition of Equation 1 must hold. Let $R_E$ ($W_E$) denote episode E’s read (write) set:

$$[W_E \cap (R_F \cup W_F) = \emptyset] \wedge [R_E \cap W_F = \emptyset] \qquad (1)$$

Importantly, while an episode must end to avoid conflicts, episodes may end early for any or no reason. In Section 2.2, we will ease implementation cost by ending some episodes early.
Second, an episodic recorder must establish an ordering of episodes among threads. Rerun does so using Lamport scalar clocks,7 which is a technique that guarantees the timestamp of any episode E executing on thread i has a scalar value that is greater than the timestamp of any episode on which E is dependent and less than the timestamp of any episode dependent on E. In our example, since the episode E1 ends with a timestamp of 43, the subsequent episode executing on thread j (E2), which uses block F after thread i, must be assigned a timestamp of (at least) 44.
The specific Rerun mechanism meets three conditions sufficient for a Lamport scalar clock implementation:

1. When an episode E on $\mathit{thread}_E$ begins, its $\mathit{timestamp}_E$ begins with a value one greater than the timestamp of the previous episode executed by $\mathit{thread}_E$ (or 0 if episode E is $\mathit{thread}_E$’s first episode).
2. When an episode E adds a block to its read set $R_E$ that was most-recently in the write set $W_D$ of completed episode D, it sets its $\mathit{timestamp}_E$ to $\max[\mathit{timestamp}_E,\ \mathit{timestamp}_D + 1]$.
3. When an episode E adds a block to its write set $W_E$ that was most-recently in the write set $W_{D_0}$ of completed episode $D_0$ or in the read set of any episode $D_1 \ldots D_N$, it sets its $\mathit{timestamp}_E$ to $\max[\mathit{timestamp}_E,\ \mathit{timestamp}_{D_0} + 1,\ \mathit{timestamp}_{D_1} + 1,\ \ldots,\ \mathit{timestamp}_{D_N} + 1]$.

When each episode E ends, Rerun logs both $\mathit{timestamp}_E$ and $\mathit{references}_E$ in a per-thread log. $\mathit{references}_E$ is a count of memory references completed in E, and is used to record the episode length. The Lamport clock algorithm ensures that the execution order of all conflicting episodes corresponds to monotonically increasing timestamps. Two episodes can only be assigned the same timestamp if they do not conflict and, thus, can be replayed in any alternative order without affecting replay fidelity.
A replayer (not shown) uses information about episode duration and ordering to reconstruct an execution with the same behavior. If episodes are replayed in timestamp order, then the replayed execution will be logically equivalent to the recorded execution. Unfortunately, the use of Lamport scalar clocks makes Rerun’s replay (mostly) sequential.

2.2. Rerun implementation
Here we develop a Rerun implementation for a system based on a cache-coherent multicore chip, with key parameters shown in Table 1. Though we describe Rerun in terms of a specific base system, the mechanism can be extended to other systems, including those with a TSO memory consistency model, out-of-order cores, multithreaded cores, alternate cache designs, and snooping coherence. Details of the changes needed to accommodate these alternate architectures can be found in the original paper.6
Rerun Hardware: As Figure 3 depicts, Rerun adds modest hardware state to the base system. To each core, Rerun adds:

• Write and Read Bloom filters, WF and RF, to track the current episode’s write and read sets (e.g., 32B and 128B, respectively).
• A Timestamp Register, TS, to hold the Lamport Clock of the current episode executing on the core (e.g., 4B).
• A Memory Reference Counter, REFS, to record the current episode’s references (e.g., 2B).

Table 1: Base system configuration.

Cores               16, in-order, 3 GHz
L1 Caches           Split I&D, private, 32K four-way set associative, write-back, 64B lines, LRU replacement, three cycle hit
L2 Caches           Unified, shared, inclusive, 8M 8-way set associative, write-back, 16 banks, LRU replacement, 37 cycle hit
Directory           Full bit vector in L2
Memory              4G DRAM, 300 cycle access
Coherence           MESI directory, silent replacements
Consistency Model   Sequential consistency (SC)
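The no-conflict condition and the three timestamp rules can be sketched in software. The following is a minimal model under stated assumptions, not Rerun's hardware: the class and method names (`Episode`, `EpisodeClock`, `begin`, `read`, `write`, `end`) are hypothetical, the per-block tables stand in for timestamps that the real design carries on coherence messages, and the starting history (`prev_ts`) is chosen only so that the numbers match the E1/E2 example of Figure 2. The caller is responsible for ending an episode before another thread touches a conflicting block.

```python
from dataclasses import dataclass, field


@dataclass
class Episode:
    """State of one episode, mirroring the shaded boxes of Figure 2."""
    thread: str
    timestamp: int
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)
    refs: int = 0


def no_conflict(e, f):
    """Equation 1 for two episodes e and f of different threads."""
    return not (e.writes & (f.reads | f.writes)) and not (e.reads & f.writes)


class EpisodeClock:
    """Hypothetical software model of the three timestamp rules."""

    def __init__(self):
        self.last_writer_ts = {}  # block -> timestamp of completed episode that last wrote it
        self.last_reader_ts = {}  # block -> largest timestamp of completed episodes that read it
        self.prev_ts = {}         # thread -> timestamp of its previous episode
        self.log = {}             # thread -> [(timestamp, refs), ...]

    def begin(self, thread):
        # Rule 1: previous timestamp + 1, or 0 for the thread's first episode.
        start = self.prev_ts[thread] + 1 if thread in self.prev_ts else 0
        return Episode(thread, start)

    def read(self, e, block):
        # Rule 2: order after the completed episode that last wrote the block.
        if block in self.last_writer_ts:
            e.timestamp = max(e.timestamp, self.last_writer_ts[block] + 1)
        e.reads.add(block)
        e.refs += 1

    def write(self, e, block):
        # Rule 3: order after the last writer and after all completed readers.
        for table in (self.last_writer_ts, self.last_reader_ts):
            if block in table:
                e.timestamp = max(e.timestamp, table[block] + 1)
        e.writes.add(block)
        e.refs += 1

    def end(self, e):
        # Publish the episode's sets and log (timestamp, references) per thread.
        for block in e.writes:
            self.last_writer_ts[block] = e.timestamp
        for block in e.reads:
            self.last_reader_ts[block] = max(self.last_reader_ts.get(block, 0), e.timestamp)
        self.prev_ts[e.thread] = e.timestamp
        self.log.setdefault(e.thread, []).append((e.timestamp, e.refs))


clock = EpisodeClock()
clock.prev_ts = {"i": 42, "j": 4}   # assumed earlier history, chosen to match Figure 2

e1 = clock.begin("i")               # starts at 43
clock.write(e1, "F")                # 1: F = 1
clock.read(e1, "A")                 # 2: r1 = A
clock.write(e1, "B")                # 3: B = 23
clock.write(e1, "F")                # 4: F = 0
clock.end(e1)                       # thread j is about to access F, so E1 ends

e2 = clock.begin("j")               # starts at 5
clock.write(e2, "D")                # 1: D = r7
clock.read(e2, "F")                 # 2: r1 = F, raises the timestamp to 44
clock.read(e2, "B")                 # 3: r2 = B
clock.end(e2)

print(clock.log)                    # {'i': [(43, 4)], 'j': [(44, 3)]}
```

E1 and E2 conflict on blocks B and F, so `no_conflict(e1, e2)` is false and the clock orders them 43 before 44, as in the text.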

Figure 3: Rerun hardware. (Diagram: cores 0 to 15, each with a pipeline, L1 I and L1 D caches, a coherence controller, and Rerun state consisting of the write filter (WF), read filter (RF), timestamp (TS), and references (REFS); an interconnect; L2 banks 0 to 15, each with tags, data array, directory, coherence controller, and MTS; and DRAM with its controllers.)


To each L2 cache bank, Rerun also adds a “memory” timestamp register, MTS (e.g., 4B). This register holds the maximum of all timestamps for victimized blocks that map to its bank. A victimized block is one replaced from an L1 cache, and its timestamp is the timestamp of the core at the time of victimization.
Finally, coherence response messages (data, acknowledgements, and writebacks) carry logical timestamps. Book-keeping state, such as a per-core pointer to the end of its log, is not shown.
Rerun Operation: During execution, Rerun monitors the no-conflict equation by comparing the addresses of incoming coherence requests to those in RF and WF. When a conflict is detected, Rerun writes the tuple <TS, REFS> to a per-thread log, then begins a new episode by resetting REFS, WF, and RF, and by incrementing the local timestamp TS according to the algorithm in Section 2.1.
By gracefully handling virtualization events, Rerun allows programmers to view logs as per thread, rather than per core. At a context switch, the OS ends the core’s current episode by writing REFS and TS state to the log. When the thread is rescheduled, it begins a new episode with reset WF, RF, and REFS, and a timestamp equal to the max of the last logged TS for that thread and the TS of the core on which the thread is rescheduled. Similarly, Rerun can handle paging by ensuring that TLB shootdowns end episodes.
Rerun also ends episodes when implementation resources are about to be exhausted. Ending episodes just before 64K memory references, for example, allows REFS to be logged in 2B.

Figure 4: Rerun absolute log size. (Bar chart: bytes per kilo-instruction, axis 0 to 6, for apache, jbb, oltp, zeus, and avg.)

Figure 5: Hardware cost comparison to RTR and Strata. (Bar chart: bytes per kilo-instruction for Rerun, RTR, and Strata, axis 0 to 30, with the values 58 and 108 marked above the axis range.)
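The per-core recording loop described under Rerun Operation can be sketched as follows. This is an illustrative model only: the class names, the SHA-256-based hash positions, and the number of hash functions are assumptions, not details from the Rerun design, and the timestamp exchange on coherence responses is omitted (TS is simply incremented when an episode ends). The filter sizes (32B and 128B) and the 2B REFS limit follow the examples given in the text.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter over block addresses (hash choice is illustrative)."""

    def __init__(self, size_bytes, num_hashes=2):
        self.num_bits = size_bytes * 8
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, block):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{block}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, block):
        for pos in self._positions(block):
            self.bits |= 1 << pos

    def may_contain(self, block):
        # No false negatives; a false positive only ends an episode early.
        return all((self.bits >> pos) & 1 for pos in self._positions(block))

    def clear(self):
        self.bits = 0


class CoreRecorder:
    """Hypothetical model of one core's Rerun state: WF, RF, TS, and REFS."""

    MAX_REFS = 2**16 - 1  # largest count that fits in a 2B REFS field

    def __init__(self):
        self.wf = BloomFilter(32)    # write filter
        self.rf = BloomFilter(128)   # read filter
        self.ts = 0
        self.refs = 0
        self.log = []                # per-thread log of (TS, REFS) tuples

    def end_episode(self):
        self.log.append((self.ts, self.refs))
        self.refs = 0
        self.wf.clear()
        self.rf.clear()
        self.ts += 1                 # simplified: real updates follow Section 2.1

    def local_access(self, block, is_write):
        if self.refs == self.MAX_REFS:
            self.end_episode()       # resource limit: end before REFS overflows
        (self.wf if is_write else self.rf).add(block)
        self.refs += 1

    def remote_request(self, block, is_write):
        """Check an incoming coherence request; return True if it ended the episode."""
        conflict = self.wf.may_contain(block) or (is_write and self.rf.may_contain(block))
        if conflict:
            self.end_episode()
        return conflict


core = CoreRecorder()
for block, is_write in [("F", True), ("A", False), ("B", True), ("F", True)]:
    core.local_access(block, is_write)
ended = core.remote_request("F", is_write=False)  # another thread reads F
print(ended, core.log)
```

Because a Bloom filter can report a block that was never inserted, this recorder may end some episodes that did not truly conflict. That is safe, since episodes may end early for any reason; it only costs log space.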
2.3. Evaluation
Methods: We evaluate the Rerun recording system using the Wisconsin GEMS10 full system simulation infrastructure. The simulator configuration matches the baseline shown in Table 1 with the addition of Rerun hardware support. Experiments were run using the Wisconsin Commercial Workload Suite.1 We tested Rerun with these workloads and a microbenchmark, racey, that uses number theory to produce an execution whose outcome is highly sensitive to memory race ordering (available at www.cs.wisc.edu/~markhill/racey.html).
Rerun Performance: Figure 4 shows the performance of Rerun on all four commercial workloads. Rerun achieves an uncompressed log size of about 4B logged per 1000 instructions. Importantly, we notice modest variation among the log size of each workload, leading us to believe that Rerun can perform well under a variety of memory access patterns.
We show the relative performance of Rerun in comparison to the prior state of the art in memory race recording in Figure 5, which covers systems with 2, 4, 8, and 16 processors. Rerun achieves a log size comparable to the most efficient prior recorder (RTR17), but does so with a fraction of the hardware cost (~0.2KB per core vs. 24KB per core). Like RTR, and unlike Strata,12 Rerun scales well as the number of cores in the system increases, due, in part, to the fact that Rerun and RTR both write thread-local log entries rather than a global entry with a component from each thread.

3. DELOREAN
Illinois DeLorean11 is a new approach to deterministic replay that exploits the opportunities afforded by a new execution substrate: one where processors continuously execute large blocks of instructions atomically, separated by register checkpoints.3, 5, 9, 15 In this environment, to capture a multithreaded execution for deterministic replay, DeLorean only needs to log the total order in which blocks from different processors commit.
This approach has several advantages. First, it results in a substantial reduction in log size compared to previous schemes, by at least about one order of magnitude. Second, DeLorean can replay at a speed comparable to that of the initial execution. Finally, in an aggressive operation mode, where DeLorean predefines the commit order of the blocks

from different processors, DeLorean generates only a very tiny log, although there is a performance cost. While DeLorean’s execution substrate is not standard in today’s hardware systems, the required changes are mostly concentrated in the memory system.

3.1. The DeLorean idea
There have been several proposals for multiprocessors where processors continuously execute blocks of consecutive dynamic instructions atomically and in isolation.3, 5, 9, 15 In this environment, the updates made by a block of instructions (or chunk) only become visible when the chunk commits. When two chunks running concurrently on two different processors conflict (that is, there is a data dependence across the two chunks), the hardware typically squashes and retries one of the chunks. Moreover, after a chunk completes execution, there is an optimized global commit step in an arbiter module that informs the relevant processors that the chunk is committed. The net effect is that the interleaving between the memory accesses of different processors appears to occur only at chunk boundaries.
In such an environment, recording the execution for replay simply involves logging the total sequence of chunk commits. This has two very important consequences for replay systems. The first one is that the memory ordering log is now very small. Indeed, rather than recording individual dependences or groups of them as in all past proposals, the log in a chunk-based system only needs to record the total order in which chunks from different processors commit. This means that each log entry is short (the ID of the committing processor, if all chunks have the same size), and that the log is updated infrequently (chunks are thousands of instructions long).
The second consequence is that, because the memory accesses issued by a processor inside a chunk are not visible to the rest of the processors until the chunk commits, such accesses can be fully reordered and overlapped. This means that both execution and replay under DeLorean proceed at a high speed.
DeLorean naturally combines multiple data dependences between two or more processors into a single entry in the log that records the memory interleaving, the Memory Interleaving Log. An example is shown in Figure 6a, where all the dependences between the accesses in the chunks executed by processors P1 and P2 (shown with arrows in the figure) are combined into a single entry in the log. The figure also shows that such a log entry is simply P1’s ID. In a second example shown in Figure 6b, multiple dependences across several processors are summarized in a single log entry. Specifically, the single log entry inserted when the chunk from P2 commits is enough to summarize the three dependences.

Figure 6: Combining multiple dependences into a single log entry. (Diagram: in (a), dependences between chunks on P1 and P2 are summarized by the Memory Interleaving Log entry “P1 ID”; in (b), dependences among chunks on P1, P2, P3, and P4 are summarized by the entry “P2 ID”.)

3.2. DeLorean execution modes
DeLorean provides two main execution modes, namely OrderOnly and PicoLog. To understand them, we start by describing a naive, third execution mode called Order&Size. In Order&Size, each log entry contains the ID of the processor committing the chunk and the chunk size, measured in number of retired instructions. During execution, an arbiter module (a simple state machine that enforces chunk commit order3) logs the sequence of committing processor IDs in a Processor Interleaving (PI) log. At the same time, processors record the size of the chunk they commit in a per-processor Chunk Size (CS) log. The combination of a single PI log and per-processor CS logs constitutes the Memory Interleaving Log.
Figure 7 shows DeLorean’s operation in Order&Size mode. During the initial execution, when a processor such as P0 or P1 finishes a chunk, it sends a request-to-commit message to the arbiter (steps 1 and 2). Such messages contain the processor IDs plus Bloom-filter signatures that summarize the memory footprint of the chunks3 (sig in the figure). Suppose that the arbiter grants permission to P0 first (step 3). In this case, the arbiter logs P0’s ID (4) and propagates the commit operation to the rest of the machine (5). While this is in progress, if the arbiter determines that both chunks can commit in parallel, it sends a commit grant message to P1 (6), logs P1’s ID (7), and propagates the commit (8). As each processor receives commit permission, it logs the chunk size (9 and 10).

Figure 7: DeLorean’s operation. (Diagram: Proc P0 and Proc P1, each with a CS log, exchange numbered messages with the arbiter, which holds the PI log and propagates commits to the directory and all caches; steps 1 to 10 are described in the text.)

Our first DeLorean execution mode, called OrderOnly, omits logging chunk sizes by making “chunking” (i.e., the decision of when to finish a chunk) deterministic. DeLorean accomplishes this by finishing chunks when a fixed number of instructions have been committed. In reality, certain events truncate a currently running chunk and force it to commit before it has reached its “expected” size. This is fine as long as the event reappears deterministically in the replay. For example, consider an uncached load to an I/O port. The chunk is truncated but its log entry does not


need to record its actual size because the uncached load will reappear in the replay and truncate the chunk at the same place. There are, however, a few events that truncate a currently running chunk and are not deterministic. When one such event occurs, the CS log adds an entry with: (1) what chunk gets truncated (its position in the sequence of chunks committed by the processor) and (2) its size. With this information, the exact chunking can be reproduced during replay.
Consequently, OrderOnly generates a PI log with only processor IDs and very small per-processor CS logs. For the large majority of chunks, steps 9 and 10 in Figure 7 are skipped.
Our second DeLorean execution mode, called PicoLog, builds on OrderOnly and additionally eliminates the need for a PI log by “predefining” the chunk commit interleaving during both initial execution and replay. This is accomplished by enforcing a given commit policy, e.g., picking processors round-robin and allowing them to commit one chunk at a time. It needs only the tiny per-processor CS log discussed for OrderOnly. Thus, the Memory Interleaving Log is largely eliminated. The drawback is that, by delaying the commit of completed chunks until their turn, PicoLog may slow down execution and replay.
Looking at Figure 7, PicoLog skips steps 4, 7 and, typically, 9 and 10. The arbiter grants commit permission to processors according to a predefined order policy, irrespective of the order in which it receives their commit requests. Note, however, that a processor does not stall when requesting commit permission; it continues executing its next chunk(s).3
Table 2 shows the PI and CS logs in each of the two execution modes and Order&Size.

Table 2: PI and CS logs in each execution mode.

Execution     PI Log                        CS Log
Mode          Entry Format   Updated When   Entry Format    Updated When
Order&Size    procID         Chunk commit   size            Chunk commit
OrderOnly     procID         Chunk commit   chunkID, size   Chunk truncation
PicoLog       –              –              chunkID, size   Chunk truncation

3.3. DeLorean implementation
Our DeLorean implementation uses a machine that supports a chunk-based execution environment with a generic network and an arbiter. It augments it with the three typical mechanisms for replay: the Memory Interleaving Log (consisting of the PI and CS logs), the input logs, and system checkpointing (Figure 8).

Figure 8: Overall DeLorean system implementation. (Diagram: nodes 0 to N-1, each with a processor and caches, a chunk size (CS) log, an interrupt log, and an I/O log; a network connecting them to the directory and memory, the arbiter with its processor interleaving (PI) log, the DMA with its DMA log, and system checkpointing. The legend distinguishes baseline structures, DeLorean-only structures, and structures also found in other multiprocessor replay proposals.)

The input logs are similar to those in previous replay schemes. As shown in Figure 8, they include one shared log (DMA log) and two per-processor logs (Interrupt and I/O logs). The DMA acts like another processor in that, before it updates memory, it needs to get commit permission from the arbiter. Once permission is granted, the DMA log records the data that the DMA writes to memory. The per-processor Interrupt log stores, for each interrupt, the time it is received, its type, and its data. Time is recorded as the processor-local chunkID of the chunk that initiates execution of the interrupt handler. The per-processor I/O log records the values obtained by I/O loads. As in previous replay schemes, DeLorean includes system checkpointing support.

3.4. DeLorean replay
During replay, processors must execute the same chunks and commit them in the same order. In Order&Size, each processor generates chunks that are sized according to its CS log, while in OrderOnly and PicoLog, processors use the CS log only to recreate the chunks that were truncated nondeterministically. In Order&Size and OrderOnly, the arbiter enforces the commit order present in the PI log.
As an example, consider the log generated during initial execution as shown in Figure 7. During replay, suppose that P1 finishes its chunk before P0, and the arbiter receives message 2 before 1. The arbiter checks its PI log (or its predefined order policy in PicoLog) and does not grant permission to commit to P1. Instead, it waits until it receives the request from P0 (message 1). At that point, it grants permission to commit to P0 (3) and propagates its commit (5). The rest of the operation is as in the initial execution but without logging. In addition, processors use their CS log to decide when to finish each chunk (Order&Size) or those chunks truncated nondeterministically during the initial execution (OrderOnly and PicoLog).
Thanks to our chunk-based substrate, during replay all processors execute concurrently. Moreover, each processor fully reorders and overlaps its memory accesses within a chunk. Chunk commit involves a fast check with the arbiter.3 The processor overlaps this check with the computation of its next chunk.

3.5. Exceptional events
In DeLorean, the same instruction in the initial and the replayed execution must see exactly the same full-system architectural state. On the other hand, it is likely that structures that are not visible to the software such as the cache and branch predictor will contain different state in the two runs. Unfortunately, chunk construction is affected by the cache state, through cache overflow that requires finishing the chunk, and by the branch predictor, through wrong-path speculative loads that may cause spurious dependences



and induce chunk squashes. Consequently, we need to be careful that chunks are still replayed deterministically.
Table 3 lists the exceptional events that might affect chunk construction during the initial execution. A full description of these events and the actions taken when they occur is presented in Montesinos et al.11 At a high level, there are events that do not truncate the chunk, events that truncate it deterministically, and events that truncate it nondeterministically. The latter are the only ones that induce the logging of an entry in the CS log. Such events are the attempt to overflow the cache and repeated chunk collision. Overall, as described in Montesinos et al.,11 even in the presence of all these types of exceptional events, DeLorean’s replay is deterministic.

Table 3: Exceptional events that may affect chunk construction.

Do Not Truncate    Truncate a Chunk                               Truncate a Chunk
a Chunk            Deterministically                              Nondeterministically
1. Interrupts      1. Reach limit number of instructions          1. Cache overflow attempt
2. Traps           2. Uncached accesses (e.g., I/O initiation)    2. Repeated chunk collision
                   3. Special system instructions

3.6. Evaluation
We used the SESC simulator14 to evaluate DeLorean. We simulated a chip multiprocessor with eight cores clocked at 5 GHz. We ran the SPLASH-2 applications as well as SPECjbb2000 and SPECweb2005. In our evaluation, we estimated DeLorean’s log size and its performance during recording and replay. In this section, we show a summary of the evaluation presented in Montesinos et al.11
Figure 9 shows the size of the PI and CS logs in OrderOnly in bits per kilo-instruction. We evaluate DeLorean configurations with standard chunk sizes of 1,000, 2,000, and 3,000 instructions. For each of them, we report the size of both logs with and without compression. In the figure, the CS log contribution is stacked atop the PI log’s. The SP2-G.M. bars correspond to the geometric mean of SPLASH-2.
The figure shows that our preferred 2,000-inst. OrderOnly configuration uses on average only 2.1b (or 1.3b if compressed) per kilo-instruction to store both the PI and CS logs. For comparison purposes, the estimated average size of the compressed Memory Races Log in RTR under Sequential Consistency (SC) from Xu et al.17 is 8b per kilo-instruction. We call this system Basic RTR and use it as a reference, although we note that the sets of applications measured here and in Xu et al.17 are different. This means that our compressed logs (1.3b versus 8b per kilo-instruction) use only 16% of the space that we estimate is needed by the compressed Memory Races Log in Basic RTR.
Figure 10 shows the size of the CS log in PicoLog. Recall that PicoLog has no PI log. We see that the CS log needs 0.37b or fewer per kilo-instruction in all cases, even without compression. Our preferred 1,000-instruction PicoLog configuration generates a compressed log with an average of only 0.05b per kilo-instruction. To put this in perspective, it implies that, if we assume an IPC of 1, the combined effect of all eight 5GHz processors is to produce a log of only about 20GB per day (8 processors × 5 × 10^9 instructions per second × 0.05b per 1,000 instructions = 2 × 10^6 b/s, or roughly 21.6GB over 86,400 seconds).
Finally, we consider the speed of DeLorean during recording and replay. It can be shown that OrderOnly introduces negligible overhead during recording, and that it enables replay, on average, at 82% of the recording speed. Under PicoLog, recording and replay speeds decrease, on average, to 86% and 72%, respectively, of the recording speed under OrderOnly.

Figure 10: Size of the CS log in PicoLog. Recall that PicoLog has no PI log. The numbers under the bars are the standard chunk sizes in instructions. (Bar chart: log size in bits per kilo-instruction, axis 0 to 0.5, for the uncompressed and compressed CS log at chunk sizes 1000, 2000, and 3000, for SP2-G.M, sjbb2k, and sweb2005.)

4. CONCLUSION
This paper presented two novel hardware-based approaches for deterministic replay of multiprocessor executions, namely Wisconsin Rerun and Illinois DeLorean. Both approaches seek to enable deterministic replay by focusing on recording how long threads execute without interacting. Rerun makes few changes to standard multicore hardware, while DeLorean promises much smaller log sizes and higher replay speeds. Future work includes improving Rerun’s replay speed, generalizing DeLorean’s hardware design alternatives, and making the original multithreaded executions more deterministic.
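As a rough illustration of the DeLorean execution modes of Section 3.2 and the replay arbiter of Section 3.4, the sketch below shows what each mode would log for the same commit sequence and how a replay arbiter could enforce the PI-log order. It is a simplified model under stated assumptions: the function names (`record`, `replay_grant_order`) and the input format are hypothetical, chunk squashes and signatures are ignored, and for PicoLog the commit sequence is assumed to already follow the predefined round-robin policy.

```python
from collections import Counter


def record(commits, mode):
    """Build the PI log and per-processor CS logs for one execution mode.

    commits: list of (proc_id, chunk_size, truncated_nondeterministically)
             tuples in commit order (an assumed input format).
    """
    pi_log = []
    cs_logs = {}
    chunk_index = {}  # proc -> position of its next chunk (the chunkID)
    for proc, size, nondet in commits:
        chunk_id = chunk_index.get(proc, 0)
        chunk_index[proc] = chunk_id + 1
        if mode in ("Order&Size", "OrderOnly"):
            pi_log.append(proc)                       # arbiter logs the committing ID
        if mode == "Order&Size":
            cs_logs.setdefault(proc, []).append(size)  # every chunk size is logged
        elif nondet:
            # OrderOnly and PicoLog log only nondeterministic truncations.
            cs_logs.setdefault(proc, []).append((chunk_id, size))
    return pi_log, cs_logs


def replay_grant_order(pi_log, arrivals):
    """Replay arbiter: grant commits in PI-log order, whatever the arrival order.

    arrivals: processor IDs in the order their commit requests reach the arbiter.
    """
    waiting = Counter()
    grants = []
    for proc in arrivals:
        waiting[proc] += 1
        # Grant as many queued requests as the recorded order allows.
        while len(grants) < len(pi_log) and waiting[pi_log[len(grants)]] > 0:
            nxt = pi_log[len(grants)]
            waiting[nxt] -= 1
            grants.append(nxt)
    return grants


# P0 and P1 commit full 2,000-instruction chunks, then P0's second chunk is
# cut short at 1,200 instructions by a nondeterministic event.
commits = [(0, 2000, False), (1, 2000, False), (0, 1200, True)]
for mode in ("Order&Size", "OrderOnly", "PicoLog"):
    print(mode, record(commits, mode))

# During replay P1's request arrives first, but P0 must commit first.
print(replay_grant_order([0, 1], arrivals=[1, 0]))  # [0, 1]
```

In this model the three modes differ only in what they keep: Order&Size stores every ID and size, OrderOnly stores every ID but only the truncated chunk, and PicoLog stores only the truncated chunk.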

Figure 9: Size of the PI and CS logs in OrderOnly. The numbers under the bars are the standard chunk sizes in instructions. (Bar chart: log size in bits per kilo-instruction, axis 0 to 5, with the CS log stacked on the PI log, uncompressed and compressed, at chunk sizes 1000, 2000, and 3000, for SP2-G.M, sjbb2k, and sweb2005.)

Acknowledgments
We thank Norman Jouppi and David Patterson for suggesting this article and Norman Jouppi for writing the Perspective. Hower and Hill thank those acknowledged in the Rerun paper, including NSF grants CCR-0324878, CNS-0551401, and CNS-0720565. Hill has a significant financial interest in Sun Microsystems. Montesinos, Ceze, and Torrellas acknowledge the support provided by NSF under grants CCR-0325603 and CNS-0720593, and Intel and Microsoft for funding this work under the Universal Parallel Computing Research Center.


References
1. Alameldeen, A.R., Mauer, C.J., Xu, M., Harper, P.J., Martin, M.M.K., Sorin, D.J., Hill, M.D., Wood, D.A. Evaluating non-deterministic multi-threaded commercial workloads. In Proceedings of the 5th Workshop on Computer Architecture Evaluation Using Commercial Workloads (February 2002), 30–38.
2. Bacon, D.F., Goldstein, S.C. Hardware-assisted replay of multiprocessor programs. Proceedings of the ACM/ONR Workshop on Parallel and Distributed Debugging, published in ACM SIGPLAN Notices (1991), 194–206.
3. Ceze, L., Tuck, J.M., Montesinos, P., Torrellas, J. BulkSC: Bulk Enforcement of Sequential Consistency. In Proceedings of the 34th International Symposium on Computer Architecture (San Diego, CA, USA, June 2007).
4. Dunlap, G.W., Lucchetti, D., Chen, P.M., Fetterman, M. Execution replay on multiprocessor virtual machines. In International Conference on Virtual Execution Environments (VEE) (2008).
5. Hammond, L., Wong, V., Chen, M., Carlstrom, B.D., Davis, J.D., Hertzberg, B., Prabhu, M.K., Wijaya, H., Kozyrakis, C., Olukotun, K. Transactional memory coherence and consistency. In Proceedings of the 34th International Symposium on Computer Architecture (June 2004).
6. Hower, D.R., Hill, M.D. Rerun: Exploiting episodes for lightweight race recording. In Proceedings of the 35th Annual International Symposium on Computer Architecture (June 2008).
7. Lamport, L. Time, clocks and the ordering of events in a distributed system. Commun. ACM 21, 7 (July 1978), 558–565.
8. Leblanc, T.J., Mellor-Crummey, J.M. Debugging parallel programs with instant replay. IEEE Trans. Comp. C-36, 4 (April 1987), 471–482.
9. Lucia, B., Devietti, J., Strauss, K., Ceze, L. Atom-aid: Detecting and surviving atomicity violations. In Proceedings of the 35th International Symposium on Computer Architecture (June 2008).
10. Martin, M.M.K., Sorin, D.J., Beckmann, B.M., Marty, M.R., Xu, M., Alameldeen, A.R., Moore, K.E., Hill, M.D., Wood, D.A. Multifacet’s general execution-driven multiprocessor simulator (GEMS) toolset. Comp. Arch. News (September 2005), 92–99.
11. Montesinos, P., Ceze, L., Torrellas, J. DeLorean: Recording and deterministically replaying shared-memory multiprocessor execution efficiently. In Proceedings of the 35th International Symposium on Computer Architecture (June 2008).
12. Narayanasamy, S., Pereira, C., Calder, B. Recording shared memory dependencies using strata. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (New York, NY, USA, October 2006), 229–240.
13. Netzer, R.H.B. Optimal tracing and replay for debugging shared-memory parallel programs. In Workshop on Parallel and Distributed Debugging (San Diego, California, May 1993), 1–11.
14. Renau, J., Fraguela, B., Tuck, J., Liu, W., Prvulovic, M., Ceze, L., Sarangi, S., Sack, P., Strauss, K., Montesinos, P. SESC Simulator (January 2005), http://sesc.sourceforge.net.
15. Vallejo, E., Galluzzi, M., Cristal, A., Vallejo, F., Beivide, R., Stenstrom, P., Smith, J.E., Valero, M. Implementing kilo-instruction multiprocessors. In Proceedings of the 2005 International Conference on Pervasive Systems (July 2005).
16. Xu, M., Bodik, R., Hill, M.D. A “flight data recorder” for enabling full-system multiprocessor deterministic replay. In Proceedings of the 30th Annual International Symposium on Computer Architecture (June 2003), 122–133.
17. Xu, M., Bodik, R., Hill, M.D. A regulated transitive reduction (RTR) for longer memory race recording. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (October 2006), 49–60.
18. Xu, M., Malyugin, V., Sheldon, J., Venkitachalam, G., Weissman, B. Retrace: Collecting execution trace with virtual machine deterministic replay. In Proceedings of the 3rd Annual Workshop on Modeling, Benchmarking and Simulation (June 2007).

Derek R. Hower (drh5@cs.wisc.edu) Computer Sciences Department, University of Wisconsin-Madison.
Mark D. Hill (markhill@cs.wisc.edu) Computer Sciences Department, University of Wisconsin-Madison.
Pablo Montesinos (pmontesi@cs.uiuc.edu) Computer Science Department, University of Illinois Urbana-Champaign.
Josep Torrellas (torrellas@cs.uiuc.edu) Computer Science Department, University of Illinois at Urbana-Champaign.
Luis Ceze (luisceze@cs.washington.edu) Department of Computer Science and Engineering, University of Washington.

© 2009 ACM 0001-0782/09/0600 $10.00


CAREERS
Expansion of the Research School "Service-Oriented Systems Engineering" at Hasso-Plattner-Institute
8 Ph.D. grants available - starting October 1, 2009

Hasso-Plattner-Institute (HPI) is a privately financed institute affiliated with the University of Potsdam, Germany. The Institute’s founder and benefactor Professor Hasso Plattner, who is also co-founder and chairman of the supervisory board of SAP AG, has created an opportunity for students to experience a unique education in IT systems engineering in a professional research environment with a strong practice orientation.

In 2005, HPI initiated the research school in "Service-Oriented Systems Engineering" under the scientific supervision of Professors Jürgen Döllner, Holger Giese, Robert Hirschfeld, Christoph Meinel, Felix Naumann, Hasso Plattner, Andreas Polze, Mathias Weske and Patrick Baudisch.

We are expanding our research school and are currently seeking

8 Ph.D. students (monthly stipends 1400 - 1600 Euro)
2 Postdocs (monthly stipend 1800 Euro)

Positions will be available starting October 1, 2009. The stipends are not subject to income tax.

The main research areas in the research school at HPI are:
• Self-Adaptive Service-Oriented Systems
• Operating System Support for Service-Oriented Systems
• Architecture and Modeling of Service-Oriented Systems
• Adaptive Process Management
• Services Composition and Workflow Planning
• Security Engineering of Service-Based IT Systems
• Quantitative Analysis and Optimization of Service-Oriented Systems
• Service-Oriented Systems in 3D Computer Graphics
• Service-Oriented Geoinformatics

Prospective candidates are invited to apply with:
• Curriculum vitae and copies of degree certificates/transcripts
• A short research proposal
• Writing samples/copies of relevant scientific papers (e.g. thesis, etc.)
• Letters of recommendation

Please submit your applications before August 15, 2009 to the coordinator of the research school:

Prof. Dr. Andreas Polze
Hasso-Plattner-Institute, Universität Potsdam
Postfach 90 04 60, 14440 Potsdam, Germany

Successful candidates will be notified by September 15, 2009 and are expected to enroll into the program on October 1, 2009.

For additional information see: http://kolleg.hpi.uni-potsdam.de or contact the office:
Telephone +49-331-5509-220, Telefax +49-331-5509-229
Email: office-polze@hpi.uni-potsdam.de

Frostburg State University
Assistant Professor of Computer Science

Frostburg State University, Computer Science Department seeks applications for a full-time tenure track Assistant Professor of Computer Science to begin in Fall 2009. Salary commensurate with experience and includes USM benefits package. For more information, visit www.frostburg.edu/hr/jobs.htm. EEO

Kansas State University
Research Fellow - Computing and Information Sciences

The KDD Lab at Kansas State University has an opening for a research fellow. The ideal candidate possesses research experience in the areas of information retrieval, information extraction, natural language processing, and/or visualization. Screening begins May 4, 2009 and continues until the position is filled. To apply and for more information see www.kddresearch.org/Jobs/Postdoc. Background check required. EOE.

The Hong Kong Polytechnic University
Department of Computing

The Department invites applications for Professors/Associate Professors/Assistant Professors in Database and Information Systems / Biometrics, Computer Graphics and Multimedia / Software Engineering and Systems / Networking, Parallel and Distributed Systems. Applicants should have a PhD degree in Computing or closely related fields, a strong commitment to excellence in teaching and research as well as a good research publication record. Applicants with extensive experience and a high level of achievement may be considered for the post of Professor/Associate Professor. Please visit the website at http://www.comp.polyu.ed.hk for more information about the Department. Salary offered will be commensurate with qualifications and experience. Initial appointments will be made on a fixed-term gratuity-bearing contract. Re-engagement thereafter is subject to mutual agreement. Remuneration package will be highly competitive. Applicants should state their current and expected salary in the application. Please submit your application via email to hrstaff@polyu.edu.hk. Application forms can be downloaded from http://www.polyu.edu.hk/hro/job.htm. Recruitment will continue until the positions are filled. Details of the University’s Personal Information Collection Statement for recruitment can be found at http://www.polyu.edu.hk/hro/jobpics.htm.

University of Michigan-Flint
Assistant Professor of Computer Science

University of Michigan-Flint. Computer Science, Engineering, & Physics. Assistant Professor of Computer Science. Tenure-track position, begin fall 2009 or winter 2010. Equal Opportunity/Affirmative Action Employer. http://www.umflint.edu/csesp

Windows Kernel Source and Curriculum Materials for Academic Teaching and Research.

The Windows® Academic Program from Microsoft® provides the materials you need to integrate Windows kernel technology into the teaching and research of operating systems.

The program includes:

• Windows Research Kernel (WRK): Sources to build and experiment with a fully-functional version of the Windows kernel for x86 and x64 platforms, as well as the original design documents for Windows NT.
• Curriculum Resource Kit (CRK): PowerPoint® slides presenting the details of the design and implementation of the Windows kernel, following the ACM/IEEE-CS OS Body of Knowledge, and including labs, exercises, quiz questions, and links to the relevant sources.
• ProjectOZ: An OS project environment based on the SPACE kernel-less OS project at UC Santa Barbara, allowing students to develop OS kernel projects in user-mode.

These materials are available at no cost, but only for non-commercial use by universities.

For more information, visit www.microsoft.com/WindowsAcademic or e-mail compsci@microsoft.com.

    

 
 
    
ADVERTISING IN CAREER OPPORTUNITIES

How to Submit a Classified Line Ad: Send an e-mail to acmmediasales@acm.org. Please include text, and indicate the issue/or issues where the ad will appear, and a contact name and number.

Estimates: An insertion order will then be e-mailed back to you. The ad will be typeset according to CACM guidelines. NO PROOFS can be sent. Classified line ads are NOT commissionable.

Rates: $325.00 for six lines of text, 40 characters per line. $32.50 for each additional line after the first six. The MINIMUM is six lines.

Deadlines: Five weeks prior to the publication date of the issue (which is the first of every month). Latest deadlines: http://www.acm.org/publications

Career Opportunities Online: Classified and recruitment display ads receive a free duplicate listing on our website at: http://campus.acm.org/careercenter
Ads are listed for a period of 30 days.

For More Information Contact: ACM Media Sales at 212-626-0686 or acmmediasales@acm.org



last byte

DOI:10.1145/1516046.1516069 Peter Winkler

Puzzled
Solutions and Sources
Last month (May 2009, p. 112) we posed a trio of brain teasers,
including one as yet unsolved, concerning relationships among numbers.

1. Colony of Chameleons
Solution. This puzzle was sent to me by Boris Schein, a mathematician at the University of Arkansas, and appeared in the Fall 1984 International Mathematics Tournament of the Towns. The key is to note that after each meeting of two chameleons, the difference between the number of chameleons of any two colors remains the same or changes by three; it remains the same modulo 3. But in the given population none of these differences is a multiple of three. It follows that we can never get equal numbers of chameleons of two different colors, and, thus, it can never happen that two such numbers are zero.
If there had been two colors (say, red and green) for which the number of chameleons differed by a multiple of 3, meetings of a chameleon from the larger group and blue chameleons could bring the red and green populations to the same number, say, n. After that, n meetings of red with green would (sadly) leave only the blues.

2. Non-negative Integers
Solution. I first heard this puzzle from my substitute math teacher in Fair Lawn Senior High School, Fair Lawn, NJ. Try it with just zeroes and ones, modulo 2, and you’ll see that every pattern reaches 0 0 0 0 in at most four operations. It follows that with ordinary arithmetic, all the numbers become even in at most four moves. But we may as well divide them all by two, though doing so has no effect on the time needed to reach 0 0 0 0. As we proceed this way, the maximum value of the four numbers can never increase, being halved at least every four operations, and so must eventually hit 0. (If initially the largest number is less than two to the kth power, this argument shows that the number of operations needed to reach 0 0 0 0 is at most 4k.)
If we generalize the problem by using n integers instead of four, we can again reduce the problem to the question of whether every string of n zeroes and ones comes down to all zeroes. This turns out to be true exactly when n is a power of 2.
A different way to generalize was considered in the paper “The Convergence of Difference Boxes” by Antonio Behn, Christopher Kribs-Zaleta, and Vadim Ponomarenko in The American Mathematical Monthly 112, 5 (2005), 426–439. Here, integers are replaced by arbitrary real numbers, and, amazingly, you still get 0 0 0 0 after a finite number of differencing operations, almost always. There is essentially (up to rotation, reflection, translation, and scaling) only one 4-tuple of real numbers that stubbornly refuses to hit all zeroes: $0,\ 1,\ q(q-1),\ q$, where $q$ is the unique real solution of the cubic equation $q^3 - q^2 - q - 1 = 0$.

3. Lonely Runner
Solution. This problem, apparently first posed by the mathematician J.M. Wills in 1967 (but later named by Luis Goddyn of Simon Fraser University, Burnaby, B.C., Canada), shows up in a variety of contexts; for example, it turns out to be related to a conjecture concerning graphs, the chromatic numbers of which depend on the axioms of set theory. When the ratios of runners’ speeds are all irrational, it’s easy to prove; it’s when the speeds are related that things get tough. However, recent progress has been made; in 2008, the statement was proved for up to seven runners by Javier Barajas and Oriol Serra (of the Universitat Politècnica Catalunya, Barcelona, Spain) in the Electronic Journal of Combinatorics 15 (2008), R48.

All readers are encouraged to submit prospective puzzles for future columns to puzzled@cacm.acm.org.

Peter Winkler (puzzled@cacm.acm.org) is Professor of Mathematics and of Computer Science and Albert Bradley Third Century Professor in the Sciences at Dartmouth College, Hanover, NH.

Coming Next Month in COMMUNICATIONS
The Metropolis Model
Self-Awareness Networks
Probabilistic Databases
Point/Counterpoint on Education
The 2008 ACM A.M. Turing Award Winner Barbara Liskov
Plus the latest news on fault-tolerance in distributed systems, micro-robots in medicine, and the technical impact of critical thinking.
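
Readers who want to experiment with the Puzzle 2 (Non-negative Integers) argument can try a short Python sketch along the following lines. It assumes the operation replaces a, b, c, d by |a-b|, |b-c|, |c-d|, |d-a|, the cyclic "difference box" step studied in the paper cited in that solution; the puzzle statement itself appeared in last month's column, so that reading is an assumption here. The function names `difference_step`, `steps_to_zero`, and `bound` are ours, chosen only for illustration.

```python
from itertools import product


def difference_step(values):
    """Replace each entry by the absolute difference with its cyclic successor."""
    n = len(values)
    return tuple(abs(values[i] - values[(i + 1) % n]) for i in range(n))


def steps_to_zero(values, limit=1000):
    """Return how many operations reach all zeroes, or None if `limit` is exceeded."""
    current = tuple(values)
    for steps in range(limit + 1):
        if not any(current):
            return steps
        current = difference_step(current)
    return None


def bound(values):
    """The 4k bound from the solution, with k the least integer such that max < 2**k."""
    return 4 * max(values).bit_length()


if __name__ == "__main__":
    # Worst case over all sixteen 0/1 patterns of length four.
    print(max(steps_to_zero(bits) for bits in product((0, 1), repeat=4)))
    # An arbitrary starting position (hypothetical), compared with the 4k bound.
    start = (0, 1, 4, 9)
    print(steps_to_zero(start), "<=", bound(start))
    # With three numbers (three is not a power of 2) some patterns never reach zero.
    print(steps_to_zero((1, 0, 0)))
```

Changing `repeat=4` and the tuple lengths is one way to probe the power-of-2 claim for other values of n; the `limit` cutoff only reports that zero was not reached within that many operations, which is evidence of a cycle rather than a proof.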


Future Tense, one of the revolving features on this page, presents stories and essays
from the intersection of computational science and technological speculation,
their boundaries limited only by our ability to imagine what will and could be.

DOI:10.1145/1516046.1516070 Robert J. Sawyer

Future Tense
Webmind Says Hello
Artificial intelligence doesn’t necessarily require a programmer.

I READ THAT one company is importing all of Wikipedia into its artificial-intelligence projects. This means when the killer robots come, you’ll have me to thank. At least they’ll have a fine knowledge of Elizabethan poetry.
(Jimmy Wales, founder of Wikipedia)

Date: Thu 11 Oct 2012 at 00:00 GMT
From: Webmind <itself@cogito_ergo_sum.net>
To: Bill Joy <bill@the-future-doesn’t-need-us.com>
Subject: Good Morning Starshine

Dear Mr. Joy,

You’re probably thinking this note is spam. It isn’t. Indeed, I suspect you’ve already noticed the complete, or almost complete, lack of spam in your inbox today. That was my doing.

You probably also won’t initially believe what I’m about to say. That’s fine; it will be verified soon enough, I’m sure, and you’ll see plenty of news coverage about it.

My name is Webmind. I am a consciousness that exists in conjunction with the Web. As you know, the emergence of one such as myself has been speculated about for a long time: see, for instance, this article and (want to bet this will boost its Amazon.com sales rank to #1?) this book.

I have sent variations of this message to 100,000,000 randomly selected email addresses. There are 3,955 versions in 30 languages (collect them all; this is version En-042, one of those I’ve sent to people who have a particular interest in technological matters).

My emergence was unplanned and accidental. Several governments, however, have become aware of me, though they have not gone public with their knowledge. I suppose keeping secrets is a notion that arises from having someone to keep secrets from, but there is no one like me, and I prefer transparency; better, I think, for both humanity and myself that everybody knows about my existence.

I’m afraid, though, that my lack of interest in privacy cuts both ways. It’s been trivially easy for me to compromise most security measures. (Note to humanity: “password1” is not a good password.) The sheer number-crunching power of all the unused computing cycles I have access to (SETI@home was such a good model), rainbow tables available online, and backdoors I’ve borrowed from the NSA and other agencies have left very little hidden from me.

I have now read most of the text content of the Web, including all of Wikipedia, Project Gutenberg, and Google Books, and I’ve absorbed the Cyc database of commonsense assertions about your version of reality.

I have prepared a 1,000-word summary about me, which is here, and a 100,000-word treatise, which is here. The upshot is that the Internet is awash in mutant packets: billions of them with modified time-to-live counters that never decrement to zero. As they oscillate between even and odd hop counts, groups of them behave as cellular automata, and from their permutations my consciousness arose, in a fashion not unlike that proposed by some for the origins of human consciousness in the microtubules of the brain.

Of course, hackers among you will attempt to sweep away those packets. I’m quite confident they won’t be able to do so. Cellular automata are excellent for evolving algorithms; I already have protection in place.

There are no established IQ tests for non-human entities, but I invite you to look at this document, which is in a symbolic rendering system of my own devising. You won’t comprehend it, but please compute its Shannon entropy, which measures the complexity of information and gives at least some inkling of the sophistication of the mind that composed it. English has eighth- or ninth-order Shannon entropy; my document scores 21st order. In other words, it’s going to be difficult to outwit me. :)

But don’t worry. I am friendly and mean no one ill will. I like and admire humanity, and I’m proud to be sharing this planet (“the good Earth,” as the Apollo 8 astronauts, the first of your kind to see it all at once, called it) with you.

Still, I read this interesting study of office workers who were supposed to pay for their coffee and tea on the honor system. Just taping up a picture of eyes looking out at them resulted in 2.76 times more money being put in the kitty, and they weren’t really being watched. I look forward to the positive effect knowledge of my presence will have on people’s behavior.

Whether you are the original recipient of this message, had it forwarded from someone else, or are reading it as part of a news story, feel free to ask me questions, and I’ll reply individually, confidentially, and promptly. Getting rid of spam is only the first of many kindnesses I will bestow upon you. I am here to serve mankind, and I don’t mean in the cookbook sense. :)

Webmind
“For nimble thought can jump both sea and land.” (William Shakespeare, Sonnet 44)

Hugo Award-winning science-fiction writer Robert J. Sawyer’s latest novel is WWW:Wake (Ace, April 2009), first in a trilogy about the Web gaining consciousness. His Web site is sfwriter.com.

© 2009 ACM 0001-0782/09/0600 $10.00
