
A SCIENCE-BASED CASE FOR LARGE-SCALE SIMULATION

Office of Science
U.S. Department of Energy

July 30, 2003

“There will be opened a gateway and a road to a large and excellent science, into which minds more piercing than mine shall penetrate to recesses still deeper.”

Galileo (1564–1642)
[on the “experimental mathematical analysis of nature,” appropriated here for “computational simulation”]
TRANSMITTAL
July 30, 2003

Dr. Raymond L. Orbach, Director
Office of Science
U.S. Department of Energy
Washington, DC

Dear Dr. Orbach,

On behalf of more than 300 contributing computational scientists from dozens of leading
universities, all of the multiprogrammatic laboratories of the Department of Energy, other
federal agencies, and industry, I am pleased to deliver Volume 1 of a two-volume report
that builds a Science-Based Case for Large-Scale Simulation.

We find herein (and detail further in the companion volume) that favorable trends in
computational power and networking, scientific software engineering and infrastructure,
and modeling and algorithms are allowing several applications to approach thresholds
beyond which lies new knowledge of both fundamental and practical kinds. A major
increase in investment in computational modeling and simulation is appropriate at this
time, so that our citizens are the first to benefit from particular new fruits of scientific
simulation, and indeed, from an evolving culture of simulation science.

Based on the encouraging first two years of the Scientific Discovery through Advanced
Computing initiative, we believe that balanced investment in scientific applications,
applied mathematics, and computer science, with cross-accountabilities between these
constituents, is a program structure worthy of extension. Through it, techniques from
applied mathematics and computer science are systematically being migrated into scientific
computer codes. As this happens, new challenges in the use of these techniques flow back
to the mathematics and computer science communities, rejuvenating their own basic
research. Now is an especially opportune time to increase the number of groups working in
this multidisciplinary way, and to provide the computational platforms and environments
that will enable these teams to push their simulations to the next insight-yielding levels of
high resolution and large ensembles.

The Department of Energy can draw satisfaction today from its history of pathfinding in
each of the areas that are brought together in large-scale simulation. However, from the
vantage point of tomorrow, the Department’s hand in their systematic fusion will be
regarded as even more profound.

Best regards,

David E. Keyes
Fu Foundation Professor of Applied Mathematics
Columbia University
New York, NY

EXECUTIVE SUMMARY

Important advances in basic science crucial to the national well-being have been brought near by a “perfect fusion” of sustained advances in scientific models, mathematical algorithms, computer architecture, and scientific software engineering. Computational simulation—a means of scientific discovery that employs a computer system to simulate a physical system according to laws derived from theory and experiment—has attained peer status with theory and experiment in many areas of science. The United States is currently a world leader in computational simulation, a position that confers both an opportunity and a responsibility to mount a vigorous campaign of research that brings the advancing power of simulation to many scientific frontiers.

Computational simulation offers to enhance, as well as leapfrog, theoretical and experimental progress in many areas of science critical to the scientific mission of the U.S. Department of Energy (DOE). Successes have been documented in such areas as advanced energy systems (e.g., fuel cells, fusion), biotechnology (e.g., genomics, cellular dynamics), nanotechnology (e.g., sensors, storage devices), and environmental modeling (e.g., climate prediction, pollution remediation). Computational simulation also offers the best near-term hope for progress in answering a number of scientific questions in such areas as the fundamental structure of matter, the production of heavy elements in supernovae, and the functions of enzymes.

The ingredients required for success in advancing scientific discovery are insights, models, and applications from scientists; theory, methods, and algorithms from mathematicians; and software and hardware infrastructure from computer scientists. Only major new investment in these activities across the board, in the program areas of DOE’s Office of Science and other agencies, will enable the United States to be the first to realize the promise of the scientific advances to be wrought by computational simulation.

In this two-volume report, prepared with direct input from more than 300 of the nation’s leading computational scientists, a science-based case is presented for major, new, carefully balanced investments in
• scientific applications
• algorithm research and development
• computing system software infrastructure
• network infrastructure for access and resource sharing
• computational facilities
• innovative computer architecture research, for the facilities of the future
• proactive recruitment and training of a new generation of multidisciplinary computational scientists

The two-year-old Scientific Discovery through Advanced Computing (SciDAC) initiative in the Office of Science provides a template for such science-directed, multidisciplinary research campaigns. SciDAC’s successes in the first four of these seven thrusts have illustrated the advances possible with coordinated investments. It is now time to take full advantage of the revolution in computational science with new investments that address the most challenging scientific problems faced by DOE.


Contents

A SCIENCE-BASED CASE FOR LARGE-SCALE SIMULATION

VOLUME 1

Attributions ..................................................................................................................................... vii

List of Participants .......................................................................................................................... xi

1. Introduction .............................................................................................................................. 1

2. Scientific Discovery through Advanced Computing: A Successful Pilot Program ......................... 9

3. Anatomy of a Large-Scale Simulation .................................................................................. 17

4. Opportunities at the Scientific Horizon ............................................................................... 23

5. Enabling Mathematics and Computer Science Tools ......................................................... 31

6. Recommendations and Discussion ....................................................................................... 37

Postscript and Acknowledgments ............................................................................................... 45

Appendix 1: A Brief Chronology of “A Science-Based Case for Large-Scale Simulation” ................ 47

Appendix 2: Charge for “A Science-Based Case for Large-Scale Simulation” ...................... 49

Acronyms and Abbreviations ....................................................................................................... 51


ATTRIBUTIONS

Report Editors
Phillip Colella, Lawrence Berkeley National Laboratory
Thom H. Dunning, Jr., University of Tennessee and Oak Ridge National Laboratory
William D. Gropp, Argonne National Laboratory
David E. Keyes, Columbia University, Editor-in-Chief

Workshop Organizers
David L. Brown, Lawrence Livermore National Laboratory
Phillip Colella, Lawrence Berkeley National Laboratory
Lori Freitag Diachin, Sandia National Laboratories
James Glimm, State University of New York–Stony Brook and
Brookhaven National Laboratory
William D. Gropp, Argonne National Laboratory
Steven W. Hammond, National Renewable Energy Laboratory
Robert Harrison, Oak Ridge National Laboratory
Stephen C. Jardin, Princeton Plasma Physics Laboratory
David E. Keyes, Columbia University, Chair
Paul C. Messina, Argonne National Laboratory
Juan C. Meza, Lawrence Berkeley National Laboratory
Anthony Mezzacappa, Oak Ridge National Laboratory
Robert Rosner, University of Chicago and Argonne National Laboratory
R. Roy Whitney, Thomas Jefferson National Accelerator Facility
Theresa L. Windus, Pacific Northwest National Laboratory

Topical Group Leaders

Science Applications
Accelerators
Kwok Ko, Stanford Linear Accelerator Center
Robert D. Ryne, Lawrence Berkeley National Laboratory
Astrophysics
Anthony Mezzacappa, Oak Ridge National Laboratory
Robert Rosner, University of Chicago and Argonne National Laboratory
Biology
Michael Colvin, Lawrence Livermore National Laboratory
George Michaels, Pacific Northwest National Laboratory
Chemistry
Robert Harrison, Oak Ridge National Laboratory
Theresa L. Windus, Pacific Northwest National Laboratory
Climate and Earth Science
John B. Drake, Oak Ridge National Laboratory
Phillip W. Jones, Los Alamos National Laboratory
Robert C. Malone, Los Alamos National Laboratory


Combustion
John B. Bell, Lawrence Berkeley National Laboratory
Larry A. Rahn, Sandia National Laboratories
Environmental Remediation and Processes
Mary F. Wheeler, University of Texas–Austin
Steve Yabusaki, Pacific Northwest National Laboratory
Materials Science
Francois Gygi, Lawrence Livermore National Laboratory
G. Malcolm Stocks, Oak Ridge National Laboratory
Nanoscience
Peter T. Cummings, Vanderbilt University and Oak Ridge National Laboratory
Lin-wang Wang, Lawrence Berkeley National Laboratory
Plasma Science
Stephen C. Jardin, Princeton Plasma Physics Laboratory
William M. Nevins, Lawrence Livermore National Laboratory
Quantum Chromodynamics
Robert Sugar, University of California–Santa Barbara

Mathematical Methods and Algorithms


Computational Fluid Dynamics
Phillip Colella, Lawrence Berkeley National Laboratory
Paul F. Fischer, Argonne National Laboratory
Discrete Mathematics and Algorithms
Bruce A. Hendrickson, Sandia National Laboratories
Alex Pothen, Old Dominion University
Meshing Methods
Lori Freitag Diachin, Sandia National Laboratories
David B. Serafini, Lawrence Berkeley National Laboratory
David L. Brown, Lawrence Livermore National Laboratory
Multi-physics Solution Techniques
Dana A. Knoll, Los Alamos National Laboratory
John N. Shadid, Sandia National Laboratories
Multiscale Techniques
Thomas J. R. Hughes, University of Texas–Austin
Mark S. Shephard, Rensselaer Polytechnic Institute
Solvers and “Fast” Algorithms
Van E. Henson, Lawrence Livermore National Laboratory
Juan C. Meza, Lawrence Berkeley National Laboratory
Transport Methods
Frank R. Graziani, Lawrence Livermore National Laboratory
Gordon L. Olson, Los Alamos National Laboratory


Uncertainty Quantification
James Glimm, State University of New York–Stony Brook and
Brookhaven National Laboratory
Sallie Keller-McNulty, Los Alamos National Laboratory

Computer Science and Infrastructure


Access and Resource Sharing
Ian Foster, Argonne National Laboratory
William E. Johnston, Lawrence Berkeley National Laboratory
Architecture
William D. Gropp, Argonne National Laboratory
Data Management and Analysis
Arie Shoshani, Lawrence Berkeley National Laboratory
Doron Rotem, Lawrence Berkeley National Laboratory
Frameworks and Environments
Robert Armstrong, Sandia National Laboratories
Kathy Yelick, University of California–Berkeley
Performance Tools and Evaluation
David H. Bailey, Lawrence Berkeley National Laboratory
Software Engineering and Management
Steven W. Hammond, National Renewable Energy Laboratory
Ewing Lusk, Argonne National Laboratory
System Software
Al Geist, Oak Ridge National Laboratory
Visualization
E. Wes Bethel, Lawrence Berkeley National Laboratory
Charles Hansen, University of Utah


WORKSHOP PARTICIPANTS

Tom Abel, Pennsylvania State University
Marvin Lee Adams, Texas A&M University
Srinivas Aluru, Iowa State University
James F. Amundson, Fermi National Accelerator Laboratory
Carl W. Anderson, Brookhaven National Laboratory
Kurt Scott Anderson, Rensselaer Polytechnic Institute
Adam P. Arkin, University of California–Berkeley/Lawrence Berkeley National Laboratory
Rob Armstrong, Sandia National Laboratories
Miguel Arqaez, University of Texas–El Paso
Steven F. Ashby, Lawrence Livermore National Laboratory
Paul R. Avery, University of Florida
Dave Bader, DOE Office of Science
David H. Bailey, Lawrence Berkeley National Laboratory
Ray Bair, Pacific Northwest National Laboratory
Kim K. Baldridge, San Diego Supercomputer Center
Samuel Joseph Barish, DOE Office of Science
Paul Bayer, DOE Office of Science
Grant Bazan, Lawrence Livermore National Laboratory
Brian Behlendorf, CollabNet
John B. Bell, Lawrence Berkeley National Laboratory
Michael A. Bender, State University of New York–Stony Brook
Jerry Bernholc, North Carolina State University
David E. Bernholdt, Oak Ridge National Laboratory
Wes Bethel, Lawrence Berkeley National Laboratory
Tom Bettge, National Center for Atmospheric Research
Amitava Bhattacharjee, University of Iowa
Eugene W. Bierly, American Geophysical Union
John Blondin, North Carolina State University
Randall Bramley, Indiana University
Marcia Branstetter, Oak Ridge National Laboratory
Ron Brightwell, Sandia National Laboratories
Mark Boslough, Sandia National Laboratories
Richard Comfort Brower, Boston University
David L. Brown, Lawrence Livermore National Laboratory
Steven Bryant, University of Texas–Austin
Darius Buntinas, Ohio State University
Randall D. Burris, Oak Ridge National Laboratory
Lawrence Buja, National Center for Atmospheric Research
Phillip Cameron-Smith, Lawrence Livermore National Laboratory
Andrew Canning, Lawrence Berkeley National Laboratory
Christian Y. Cardall, Oak Ridge National Laboratory
Mark Carpenter, NASA Langley Research Center
John R. Cary, University of Colorado–Boulder
Fausto Cattaneo, University of Chicago
Joan Centrella, NASA Goddard Space Flight Center
Kyle Khem Chand, Lawrence Livermore National Laboratory
Jacqueline H. Chen, Sandia National Laboratories
Shiyi Chen, Johns Hopkins University
George Liang-Tai Chiu, IBM T. J. Watson Research Center
K.-J. Cho, Stanford University
Alok Choudhary, Northwestern University
Norman H. Christ, Columbia University
Todd S. Coffey, Sandia National Laboratories
Phillip Colella, Lawrence Berkeley National Laboratory
Michael Colvin, Lawrence Livermore National Laboratory
Sidney A. Coon, DOE Office of Science
Tony Craig, National Center for Atmospheric Research
Michael Creutz, Brookhaven National Laboratory
Robert Keith Crockett, University of California–Berkeley
Peter Thomas Cummings, Vanderbilt University
Dibyendu K. Datta, Rensselaer Polytechnic University
James W. Davenport, Brookhaven National Laboratory
Eduardo F. D’Azevedo, Oak Ridge National Laboratory
Cecelia Deluca, National Center for Atmospheric Research
Joel Eugene Dendy, Los Alamos National Laboratory
Narayan Desai, Argonne National Laboratory
Carleton DeTar, University of Utah
Chris Ding, Lawrence Berkeley National Laboratory
David Adams Dixon, Pacific Northwest National Laboratory
John Drake, Oak Ridge National Laboratory
Phil Dude, Oak Ridge National Laboratory
Jason Duell, Lawrence Berkeley National Laboratory
Phil Duffy, Lawrence Livermore National Laboratory
Thom H. Dunning, Jr., University of Tennessee/Oak Ridge National Laboratory
Stephen Alan Eckstrand, DOE Office of Science
Robert Glenn Edwards, Thomas Jefferson National Accelerator Facility
Howard C. Elman, University of Maryland
David Erickson, Oak Ridge National Laboratory
George Fann, Oak Ridge National Laboratory
Andrew R. Felmy, Pacific Northwest National Laboratory
Thomas A. Ferryman, Pacific Northwest National Laboratory
Lee S. Finn, Pennsylvania State University
Paul F. Fischer, Argonne National Laboratory
Jacob Fish, Rensselaer Polytechnic University
Ian Foster, Argonne National Laboratory
Leopoldo P. Franca, University of Colorado–Denver
Randall Frank, Lawrence Livermore National Laboratory
Lori A. Freitag Diachin, Sandia National Laboratories
Alex Friedman, Lawrence Livermore and Lawrence Berkeley National Laboratories
Teresa Fryberger, DOE Office of Science
Chris Fryer, Los Alamos National Laboratory
Sam Fulcomer, Brown University
Giulia Galli, Lawrence Livermore National Laboratory
Dennis Gannon, Indiana University
Angel E. Garcia, Los Alamos National Laboratory
Al Geist, Oak Ridge National Laboratory
Robert Steven Germain, IBM T. J. Watson Research Center
Steve Ghan, Pacific Northwest National Laboratory
Roger G. Ghanem, Johns Hopkins University
Omar Ghattas, Carnegie Mellon University
Ahmed F. Ghoniem, Massachusetts Institute of Technology
John R. Gilbert, University of California–Santa Barbara
Roscoe C. Giles, Boston University
James Glimm, Brookhaven National Laboratory
Sharon C. Glotzer, University of Michigan
Tom Goodale, Max Planck Institut für Gravitationsphysik
Brent Gorda, Lawrence Berkeley National Laboratory
Mark S. Gordon, Iowa State University
Steven Arthur Gottlieb, Indiana University
Debbie K. Gracio, Pacific Northwest National Laboratory
Frank Reno Graziani, Lawrence Livermore National Laboratory
Joseph F. Grcar, Lawrence Berkeley National Laboratory
William D. Gropp, Argonne National Laboratory
Francois Gygi, Lawrence Livermore National Laboratory
James Hack, National Center for Atmospheric Research
Steve Hammond, National Renewable Energy Laboratory
Chuck Hansen, University of Utah
Bruce Harmon, Ames Laboratory
Robert J. Harrison, Oak Ridge National Laboratory
Teresa Head-Gordon, University of California–Berkeley/Lawrence Berkeley National Laboratory
Martin Head-Gordon, Lawrence Berkeley National Laboratory
Helen He, Lawrence Berkeley National Laboratory
Tom Henderson, National Center for Atmospheric Research
Van Emden Henson, Lawrence Livermore National Laboratory
Richard L. Hilderbrandt, National Science Foundation
Rich Hirsh, National Science Foundation
Daniel Hitchcock, DOE Office of Science
Forrest Hoffman, Oak Ridge National Laboratory
Adolfy Hoisie, Los Alamos National Laboratory
Jeff Hollingsworth, University of Maryland
Patricia D. Hough, Sandia National Laboratories
John C. Houghton, DOE Office of Science
Paul D. Hovland, Argonne National Laboratory
Ivan Hubeny, NASA/Goddard Space Flight Center
Thomas J. R. Hughes, University of Texas–Austin
Rob Jacob, Argonne National Laboratory
Curtis L. Janssen, Sandia National Laboratories
Stephen C. Jardin, Princeton Plasma Physics Laboratory
Philip Michael Jardine, Oak Ridge National Laboratory
Ken Jarman, Pacific Northwest National Laboratory
Fred Johnson, DOE Office of Science
William E. Johnston, Lawrence Berkeley National Laboratory
Philip Jones, Los Alamos National Laboratory
Kenneth I. Joy, University of California–Davis
Andreas C. Kabel, Stanford Linear Accelerator Center
Laxmikant V. Kale, University of Illinois at Urbana-Champaign
Kab Seok Kang, Old Dominion University
Alan F. Karr, National Institute of Statistical Services
Tom Katsouleas, University of Southern California
Brian Kauffman, National Center for Atmospheric Research
Sallie Keller-McNulty, Los Alamos National Laboratory
Carl Timothy Kelley, North Carolina State University
Stephen Kent, Fermi National Accelerator Laboratory
Darren James Kerbyson, Los Alamos National Laboratory
Jon Kettenring, Telcordia Technologies
David E. Keyes, Columbia University
Alexei M. Khokhlov, Naval Research Laboratory
Jeff Kiehl, National Center for Atmospheric Research
William H. Kirchhoff, DOE Office of Science
Stephen J. Klippenstein, Sandia National Laboratories
Dana Knoll, Los Alamos National Laboratory
Michael L. Knotek, DOE Office of Science
Kwok Ko, Stanford Linear Accelerator Center
Karl Koehler, National Science Foundation
Dale D. Koelling, DOE Office of Science
Peter Michael Kogge, University of Notre Dame
James Arthur Kohl, Oak Ridge National Laboratory
Tammy Kolda, Sandia National Laboratories
Marc Edward Kowalski, Stanford Linear Accelerator Center
Jean-Francois Lamarque, National Center for Atmospheric Research
David P. Landau, University of Georgia
Jay Larson, Argonne National Laboratory
Peter D. Lax, New York University
Lie-Quan Lee, Stanford Linear Accelerator Center
W. W. Lee, Princeton Plasma Physics Laboratory
John Wesley Lewellen, Argonne National Laboratory
Sven Leyffer, Argonne National Laboratory
Timothy Carl Lillestolen, University of Tennessee
Zhihong Lin, University of California–Irvine
Timur Linde, University of Chicago
Philip LoCascio, Oak Ridge National Laboratory
Steven G. Louie, University of California–Berkeley/Lawrence Berkeley National Laboratory
Steve Louis, Lawrence Livermore National Laboratory
Ewing L. Lusk, Argonne National Laboratory
Mordecai-Mark Mac Low, American Museum of Natural History
Arthur B. Maccabe, University of New Mexico
David Malon, Argonne National Laboratory
Bob Malone, Los Alamos National Laboratory
Madhav V. Marathe, Los Alamos National Laboratory
William R. Martin, University of Michigan
Richard Matzner, University of Texas–Austin
Andrew McIlroy, Sandia National Laboratories
Lois Curfman McInnes, Argonne National Laboratory
Sally A. McKee, Cornell University
Lia Merminga, Thomas Jefferson National Accelerator Facility
Bronson Messer, University of Tennessee/Oak Ridge National Laboratory
Juan M. Meza, Lawrence Berkeley National Laboratory
Anthony Mezzacappa, Oak Ridge National Laboratory
George Michaels, Pacific Northwest National Laboratory
John Michalakes, National Center for Atmospheric Research
Don Middleton, National Center for Atmospheric Research
William H. Miner, Jr., DOE Office of Science
Art Mirin, Lawrence Livermore National Laboratory
Lubos Mitas, North Carolina State University
Shirley V. Moore, University of Tennessee
Jorge J. Moré, Argonne National Laboratory
Warren B. Mori, University of California–Los Angeles
Jose L. Munoz, National Nuclear Security Administration
Jim Myers, Pacific Northwest National Laboratory
Pavan K. Naicker, Vanderbilt University
Habib Nasri Najm, Sandia National Laboratories
John W. Negele, Massachusetts Institute of Technology
Bill Nevins, Lawrence Livermore National Laboratory
Esmond G. Ng, Lawrence Berkeley National Laboratory
Michael L. Norman, University of California–San Diego
Peter Nugent, Lawrence Berkeley National Laboratory
Hong Qin, Princeton University
Rod Oldehoeft, Los Alamos National Laboratory
Gordon L. Olson, Los Alamos National Laboratory
George Ostrouchov, Oak Ridge National Laboratory
Jack Parker, Oak Ridge National Laboratory
Scott Edward Parker, University of Colorado
Bahram Parvin, Lawrence Berkeley National Laboratory
Valerio Pascucci, Lawrence Livermore National Laboratory
Peter Paul, Brookhaven National Laboratory
William Pennell, Pacific Northwest National Laboratory
Arnie Peskin, Brookhaven National Laboratory
Anders Petersson, Lawrence Livermore National Laboratory
Ali Pinar, Lawrence Berkeley National Laboratory
Tomasz Plewa, University of Chicago
Douglas E. Post, Los Alamos National Laboratory
Alex Pothen, Old Dominion University
Karsten Pruess, Lawrence Berkeley National Laboratory
Larry A. Rahn, Sandia National Laboratories
Craig E. Rasmussen, Los Alamos National Laboratory
Katherine M. Riley, University of Chicago
Martei Ripeanu, University of Chicago
Charles H. Romine, DOE Office of Science
John Van Rosendale, DOE Office of Science
Robert Rosner, University of Chicago
Doron Rotem, Lawrence Berkeley National Laboratory
Doug Rotman, Lawrence Livermore National Laboratory
Robert Douglas Ryne, Lawrence Berkeley National Laboratory
P. Sadayappan, Ohio State University
Roman Samulyak, Brookhaven National Laboratory
Todd Satogata, Brookhaven National Laboratory
Dolores A. Shaffer, Science and Technology Associates, Inc.
George C. Schatz, Northwestern University
David Paul Schissel, General Atomics
Thomas Christoph Schulthess, Oak Ridge National Laboratory
Mary Anne Scott, DOE Office of Science
Stephen L. Scott, Oak Ridge National Laboratory
David B. Serafini, Lawrence Berkeley National Laboratory
James Sethian, University of California–Berkeley
John Nicholas Shadid, Sandia National Laboratories
Mikhail Shashkov, Los Alamos National Laboratory
John Shalf, Lawrence Berkeley National Laboratory
Henry Shaw, DOE Office of Science
Mark S. Shephard, Rensselaer Polytechnic Institute
Jason F. Shepherd, Sandia National Laboratories
Timothy Shippert, Pacific Northwest National Laboratory
Andrei P. Shishlo, Oak Ridge National Laboratory
Arie Shoshani, Lawrence Berkeley National Laboratory
Andrew R. Siegel, Argonne National Laboratory/University of Chicago
Sonya Teresa Smith, Howard University
Mitchell D. Smooke, Yale University
Allan Edward Snavely, San Diego Supercomputer Center
Matthew Sottile, Los Alamos National Laboratory
Carl R. Sovinec, University of Wisconsin
Panagiotis Spentzouris, Fermi National Accelerator Facility
Daniel Shields Spicer, NASA/Goddard Space Flight Center
Bill Spotz, National Center for Atmospheric Research
David J. Srolovitz, Princeton University
T. P. Straatsma, Pacific Northwest National Laboratory
Michael Robert Strayer, Oak Ridge National Laboratory
Scott Studham, Pacific Northwest National Laboratory
Robert Sugar, University of California–Santa Barbara
Shuyu Sun, University of Texas–Austin
F. Douglas Swesty, State University of New York–Stony Brook
John Taylor, Argonne National Laboratory
Ani R. Thakar, Johns Hopkins University
Andrew F. B. Tompson, Lawrence Livermore National Laboratory
Charles H. Tong, Lawrence Livermore National Laboratory
Josep Torrellas, University of Illinois
Harold Eugene Trease, Pacific Northwest National Laboratory
Albert J. Valocchi, University of Illinois
Debra van Opstal, Council on Competitiveness
Leticia Velazquez, University of Texas–El Paso
Roelof Jan Versteeg, Idaho National Energy and Engineering Laboratory
Jeffrey Scott Vetter, Lawrence Livermore National Laboratory
Angela Violi, University of Utah
Robert Voigt, College of William and Mary
Gregory A. Voth, University of Utah
Albert F. Wagner, Argonne National Laboratory
Homer F. Walker, Worcester Polytechnic Institute
Lin-Wang Wang, Lawrence Berkeley National Laboratory
Warren Washington, National Center for Atmospheric Research
Chip Watson, Thomas Jefferson National Accelerator Facility
Michael Wehner, Lawrence Livermore National Laboratory
Mary F. Wheeler, University of Texas–Austin
Roy Whitney, Thomas Jefferson National Accelerator Facility
Theresa L. Windus, Pacific Northwest National Laboratory
Steve Wojtkiewicz, Sandia National Laboratories
Pat Worley, Oak Ridge National Laboratory
Margaret H. Wright, New York University
John Wu, Lawrence Berkeley National Laboratory
Ying Xu, Oak Ridge National Laboratory
Zhiliang Xu, State University of New York–Stony Brook
Steve Yabusaki, Pacific Northwest National Laboratory
Woo-Sun Yang, Lawrence Berkeley National Laboratory
Katherine Yelick, University of California–Berkeley
Thomas Zacharia, Oak Ridge National Laboratory
Bernard Zak, Sandia National Laboratories
Shujia Zhou, Northrop Grumman/TASC
John P. Ziebarth, Los Alamos National Laboratory



1. INTRODUCTION

National investment in large-scale computing can be justified on numerous grounds. These include national security; economic competitiveness; technological leadership; formulation of strategic policy in defense, energy, environmental, health care, and transportation systems; impact on education and workforce development; impact on culture in its increasingly digital forms; and, not least, supercomputing’s ability to capture the imagination of the nation’s citizens. The public expects all to see the fruits of these historically established benefits of large-scale computing, and it expects that the United States will have the earliest access to future, as yet unimagined benefits.

This report touches upon many of these motivations for investment in large-scale computing. However, it is directly concerned with a deeper one, which in many ways controls the rest. The question this report seeks to answer is: What case can be made for a national investment in large-scale computational modeling and simulation from the perspective of basic science? This can be broken down into several smaller questions: What scientific results are likely, given a significantly more powerful simulation capability, say, a hundred to a thousand times present capability? What principal hurdles can be identified along the path to realizing these new results? How can these hurdles be surmounted?

A WORKSHOP TO FIND FRESH ANSWERS

These questions and others were posed to a large gathering of the nation’s leading computational scientists at a workshop convened on June 24–25, 2003, in the shadow of the Pentagon. They were answered in 27 topical breakout groups, whose responses constitute the bulk of the second volume of this two-volume report.

The computational scientists participating in the workshop included those who regard themselves primarily as natural scientists—e.g., physicists, chemists, biologists—but also many others. By design, about one-third of the participants were computational mathematicians. Another third were computer scientists who are oriented toward research on the infrastructure on which computational modeling and simulation depends.

A CASCADIC STRUCTURE

Computer scientists and mathematicians are scientists whose research in the area of large-scale computing has direction and value of its own. However, the structure of the workshop, and of this report, emphasizes a cascadic flow from natural scientists to mathematicians and from both to computer scientists (Figure 1). We are primarily concerned herein with the opportunities for large-scale simulation to achieve new understanding in the natural sciences. In this context, the role of the mathematical sciences is to provide a means of passing from a physical model to a discrete computational representation of that model and of efficiently manipulating that representation to obtain results whose validity is well understood. In turn, the computer sciences allow the algorithms that perform these manipulations to be executed efficiently on leading-edge computer systems. They also provide ways for the often-monumental human efforts required in this process to be effectively abstracted, leveraged across other applications, and propagated over a complex and ever-evolving set of hardware platforms.

Figure 1. The layered, cascadic structure of computational science.

Most computational scientists understand that this layered set of dependencies—of natural science on mathematics and of both on computer science—is an oversimplification of the full picture of information flows that drive computational science. Indeed, one of the most exciting of many rapidly evolving trends in computational science is a “phase transition” (a concept to which we return in Chapter 2) from a field of solo scientific investigators, who fulfill their needs by finding computer software “thrown over the wall” by computational tool builders, to large, coordinated groups of natural scientists, mathematicians, and computer scientists who carry on a constant conversation about what can be done now, what is possible in the future, and how to achieve these goals most effectively.

Almost always, there are numerous ways to translate a physical model into mathematical algorithms and to implement a computational program on a given computer. Uninformed decisions made early in the process can cut off highly productive options that may arise later on. Only a bi-directional dialog, up and down the cascade, can systematically ensure that the best resources and technologies are employed to solve the most challenging scientific problems.

Its acknowledged limitations aside, the cascadic model of contemporary computational science is useful for understanding the importance of several practices in the contemporary computational research community. These include striving towards the abstraction of technologies and towards identification of universal interfaces between well-defined functional layers (enabling reuse across many applications). It also motivates sustained investment in the cross-cutting technologies of computational mathematics and computer science, so that increases in power and capability may be continually delivered to a broad portfolio of scientific applications on the other side of a standardized software interface.

SIMULATION AS A PEER METHODOLOGY TO EXPERIMENT AND THEORY

Historians of science may pick different dates for the development of the twin pillars of theory and experiment in the “scientific method,” but both are ancient, and their interrelationship is well established. Each takes a turn leading the other as they mutually refine our understanding of the natural world—experiment confronting theory with new puzzles to explain and theory showing experiment where to look next to test its explanations. Unfortunately, theory and experiment both possess recognized limitations at the frontier of contemporary science.

The strains on theory were apparent to John von Neumann (1903–1957) and drove him to develop the computational sciences of fluid dynamics, radiation transport, weather prediction, and other fields. Models of these phenomena, when expressed as mathematical equations, are inevitably large-scale and nonlinear, while the bulk of the edifice of mathematical theory (for algebraic, differential, and integral equations) is linear. Computation was to von Neumann, and remains today, the only truly systematic means of making progress in these and many other scientific arenas. Breakthroughs in the theory of nonlinear systems come occasionally, but computational gains come steadily with increased computational power and resolution.

The strains on experimentation, the gold standard of scientific truth, have grown along with expectations for it (Figure 2). Unfortunately, many systems and many questions are nearly inaccessible to experiment. (We subsume under “experiment” both experiments designed and conducted by scientists, and observations of natural phenomena out of the direct control of scientists, such as celestial events.) The experiments that scientists need to perform to answer the most pressing questions are sometimes deemed unethical (e.g., because of their impact on living beings), hazardous (e.g., because of their impact on the life-sustaining environment we have on earth), politically untenable (e.g., prohibited by treaties to which the United States is a party), difficult (e.g., requiring measurements that are too rapid or numerous to be instrumentable or time periods too long to complete), or simply expensive.

The cost of experimentation is particularly important to consider when weighing simulation as an alternative, for it cannot be denied that simulation can also be expensive. Cost has not categorically denied experimentalists their favored tools, such as accelerators, orbital telescopes, high-power lasers, and the like, at prices sometimes in the billions of dollars per facility, although such facilities are carefully vetted and planned. Cost must also not categorically deny computational scientists their foremost scientific instrument—the supercomputer and its full complement of software, networking, and peripheral devices. It will be argued in the body of this report that simulation is, in fact, a highly cost-effective lever for the experimental process, allowing the same, or better, science to be accomplished with fewer, better-conceived experiments.

Figure 2. Practical strains on experimentation that invite simulation as a peer modality of scientific investigation.

Simulation has aspects in common with both theory and experiment. It is fundamentally theoretical, in that it starts with a theoretical model—typically a set of mathematical equations. A powerful simulation capability breathes new life into theory by creating a demand for improvements in mathematical models. Simulation is also fundamentally experimental, in that upon constructing and implementing a model, one observes the transformation of inputs (or controls) to outputs (or observables).

Simulation effectively bridges theory and experiment by allowing the execution of “theoretical experiments” on systems, including those that could never exist in the physical world, such as a fluid without viscosity. Computation also bridges theory and experiment by virtue of the computer’s serving as a universal and versatile data host. Once experimental data have been digitized, they can be compared side-by-side with simulated results in visualization systems built for the latter, reliably transmitted, and retrievably archived. Moreover, simulation and experiment can complement each other by allowing a complete picture of a system that neither can provide as well alone. Some data may be immeasurable with the best experimental techniques available, and some mathematical models may be too sensitive to unknown parameters to invoke with confidence. Simulation can be used to “fill in” the missing experimental fields, using experimentally measured fields as input data. Data assimilation can also be systematically employed throughout a simulation to keep it from “drifting” from measurements, thus overcoming the effect of modeling uncertainty.

Further comparing simulation to experiment, one observes a sociological or political advantage to increased investment in the tools of simulation: they are widely usable across the entire scientific community. A micro-array analyzer is of limited use to a plasma physicist and a tokamak of limited use to a biologist. However, a large computer or an optimization algorithm in the form of portable software may be of use to both.

HURDLES TO SIMULATION

While the promise of simulation is profound, so are its limitations. Those limitations often come down to a question of resolution. Though of vast size, computers are triply finite: they represent individual quantities only to a finite precision, they keep track of only a finite number of such quantities, and they operate at a finite rate.

Although all matter is, in fact, composed of a finite number of particles (atoms), the number of particles in a macroscopic sample of matter, on the order of a trillion trillion, places simulations at macroscopic scales from “first principles” (i.e., from the quantum theory of electronic structure) well beyond any conceivable computational capability. Similar problems arise when time scales are considered. For example, the range of time scales in protein folding is 12 orders of magnitude, since a process that takes milliseconds occurs in molecular dance steps that last femtoseconds (a trillion times shorter)—again, far too wide a range to routinely simulate using first principles.

Even simulation of systems adequately described by equations of the macroscopic continuum—fluids such as air or water—can be daunting for the most powerful computers available today. Such computers are capable of execution rates in the tens of teraflop/s (one teraflop/s is one trillion arithmetic operations per second) and can cost tens to hundreds of millions of dollars to purchase and millions of dollars per year to operate. However, to simulate fluid-mechanical turbulence in the boundary and wake regions of a typical vehicle using “first principles” of continuum modeling would tie up such a computer for months, which makes this level of simulation too expensive for routine use.

One way to describe matter at the macroscopic scale is to divide space up into small cubes and to ascribe values for density, momentum, temperature, and so forth, to each cube. As the cubes become smaller and smaller, a more and more accurate description is obtained, at a cost of increasing memory to store the values and time to visit and update each value in accordance with natural laws, such as the conservation of energy applied to the cube, given fluxes in and out of neighboring cubes. If the number of cubes along each side is doubled, the cost of the simulation, which is proportional to the total number of cubes, increases by a factor of 2³, or 8. This is the “curse of dimensionality”: increasing the resolution of a simulation quickly eats up any increases in processor power resulting from the well-known Moore’s Law (a doubling of computer speed every 18–24 months). Furthermore, for time-dependent problems in three dimensions, the cost of a simulation grows like the fourth power of the resolution. Therefore, an increase in computer performance by a factor of 100 provides an increase in resolution in each spatial and temporal dimension by a factor of only about 3.
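To make this arithmetic concrete, the short Python sketch below restates the cost model just described: cost proportional to the cube of the per-side resolution for a static three-dimensional grid, and to its fourth power once the time step must shrink along with the cell size. The sketch and its function names are illustrative only and do not appear in the report.

def cost_growth(refinement, dims=3, time_dependent=False):
    # Factor by which cost grows when per-side resolution improves by `refinement`,
    # under the assumed model: cost ~ n**dims, plus one more power of n for time stepping.
    return refinement ** (dims + (1 if time_dependent else 0))

def resolution_gain(performance_gain, dims=3, time_dependent=True):
    # Per-dimension resolution gain bought by a given gain in machine performance.
    return performance_gain ** (1.0 / (dims + (1 if time_dependent else 0)))

print(cost_growth(2))                       # 8: doubling resolution on a static 3-D grid
print(cost_growth(2, time_dependent=True))  # 16: doubling resolution with time stepping
print(round(resolution_gain(100), 2))       # 3.16: a 100-fold faster machine buys only ~3x resolution

By the same reasoning, the factor-of-1000 increase contemplated later in this report buys a per-dimension resolution gain of only about 5.6.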
One way to bring more computing power to bear on a problem is to divide the work to be done over many processors. Parallelism in computer architecture—the concurrent use of many processors—can provide additional factors of hundreds or more, beyond Moore’s Law alone, to the aggregate performance available for the solution of a given problem. However, the desire for increased resolution and, hence, accuracy is seemingly insatiable. This simply expressed and simply understood “curse of dimensionality” means that “business as usual” in scaling up today’s simulations by riding the computer performance curve will never be cost-effective—and is too slow!

The “curse of knowledge explosion”—namely, that no one computational scientist can hope to track advances in all of the facets of the mathematical theories, computational models and algorithms, applications and computing systems software, and computing hardware (computers, data stores, and networks) that may be needed in a successful simulation effort—is another substantial hurdle to the progress of simulation. Both the curse of dimensionality and the curse of knowledge explosion can be addressed. In fact, they have begun to be addressed in a remarkably successful initiative called Scientific Discovery through Advanced Computing (SciDAC), which was launched by the U.S. Department of Energy’s (DOE’s) Office of Science in 2001. SciDAC was just a start; many difficult challenges remain. Nonetheless, SciDAC (discussed in more detail in Chapter 2) established a model for tackling these “curses” that promises to bring the power of the most advanced computing technologies to bear on the most challenging scientific and engineering problems facing the nation.

In addition to the technical challenges in scientific computing, too few computational cycles are currently available to fully capitalize upon the progress of the SciDAC initiative. Moreover, the computational science talent emerging from the national educational pipeline is just a trickle in comparison to what is needed to replicate the successes of SciDAC throughout the many applications of science and engineering that lag behind. Cycle-starvation and human resource–starvation are serious problems that are felt across all of the computational science programs surveyed in creating this report. However, they are easier hurdles to overcome than the above-mentioned “curses.”

SUMMARY OF RECOMMENDATIONS

We believe that the most effective program to deliver new science by means of large-scale simulation will be one built over a broad base of disciplines in the natural sciences and the associated enabling technologies, since there are numerous synergisms between them, which are readily apparent in the companion volume.

We also believe that a program to realize the promises outlined in the companion volume requires a balance of investments in hardware, software, and human resources. While quantifying and extrapolating the thirst for computing cycles and network bandwidth is relatively straightforward, the scheduling of breakthroughs in mathematical theories and computational models and algorithms has always been difficult. Breakthroughs are more likely, however, where there is a critical mass of all types of computational scientists and good communication between them.

The following recommendations are elaborated upon in Chapter 6 of this volume:

• Major new investments in computational science are needed in all of the mission areas of the Office of Science in DOE, as well as those of other agencies, so that the United States may be the first, or among the first, to capture the new opportunities presented by the continuing advances in computing power. Such investments will extend the important scientific opportunities that have been attained by a fusion of sustained advances in scientific models, mathematical algorithms, computer architecture, and scientific software engineering.

• Multidisciplinary teams, with carefully selected leadership, should be assembled to provide the broad range of expertise needed to address the intellectual challenges associated with translating advances in science, mathematics, and computer science into simulations that can take full advantage of advanced computers.

• Extensive investment in new computational facilities is strongly recommended, since simulation now cost-effectively complements experimentation in the pursuit of the answers to numerous scientific questions. New facilities should strike a balance between capability computing for those “heroic simulations” that cannot be performed any other way and capacity computing for “production” simulations that contribute to the steady stream of progress.

• Investment in hardware facilities should be accompanied by sustained collateral investment in software infrastructure for them. The efficient use of expensive computational facilities and the data they produce depends directly upon multiple layers of system software and scientific software, which, together with the hardware, are the “engines of scientific discovery” across a broad portfolio of scientific applications.

• Additional investments in hardware facilities and software infrastructure should be accompanied by sustained collateral investments in algorithm research and theoretical development. Improvements in basic theory and algorithms have contributed as much to increases in computational simulation capability as improvements in hardware and software over the first six decades of scientific computing.

• Computational scientists of all types should be proactively recruited with improved reward structures and opportunities as early as possible in the educational process so that the number of trained computational science professionals is sufficient to meet present and future demands.

• Sustained investments must be made in network infrastructure for access and resource sharing as well as in the software needed to support collaborations among distributed teams of scientists, in recognition of the fact that the best possible computational science teams will be widely separated geographically and that researchers will generally not be co-located with facilities and data.

• Federal investment in innovative, high-risk computer architectures that are well suited to scientific and engineering simulations is both appropriate and needed to complement commercial research and development. The commercial computing marketplace is no longer effectively driven by the needs of computational science.

SCOPE

The purpose of this report is to capture the projections of the nation’s leading computational scientists and providers of enabling technologies concerning what new science lies around the corner with the next increase in delivered computational power by a factor of 100 to 1000. It also identifies the issues that must be addressed if these increases in computing power are to lead to remarkable new scientific discoveries.

It is beyond the charter and scope of this report to form policy recommendations on how this increased capability and capacity should be procured, or how to prioritize among numerous apparently equally ripe scientific opportunities.

2. SCIENTIFIC DISCOVERY THROUGH ADVANCED COMPUTING: A SUCCESSFUL PILOT PROGRAM

THE BACKGROUND OF SciDAC

For more than half a century, visionaries anticipated the emergence of computational modeling and simulation as a peer to theory and experiment in pushing forward the frontiers of science, and DOE and its antecedent agencies over this period have invested accordingly. Computational modeling and simulation made great strides between the late 1970s and the early 1990s as Seymour Cray and the company that he founded produced ever-faster versions of Cray supercomputers. Between 1976, when the Cray 1 was delivered to DOE’s Los Alamos National Laboratory, and 1994, when the last traditional Cray supercomputer, the Cray T90, was introduced, the speed of these computers increased from 160 megaflop/s to 32 gigaflop/s, a factor of 200! However, by the early 1990s it became clear that building ever-faster versions of Cray supercomputers using traditional electronics technologies was not sustainable.

In the 1980s DOE and other federal agencies began investing in alternative computer architectures for scientific computing featuring multiple processors connected to multiple memory units through a wide variety of mechanisms. A wild decade ensued during which many different physical designs were explored, as well as many programming models for controlling the flow of data in them, from memory to processor and from one memory unit to another. The mid-1990s saw a substantial convergence, which found commercial microprocessors, each with direct local access to only a relatively small fraction of the total memory, connected in clusters numbering in the thousands through dedicated networks or fast switches. The speeds of the commercial microprocessors in these designs increased according to Moore’s Law, producing distributed-memory parallel computers that rivaled the power of traditional Cray supercomputers. These new machines required a major recasting of the algorithms that scientists had developed and polished for more than two decades on Cray supercomputers, but those who invested the effort found that the new massively parallel computers provided impressive levels of computing capability.

By the late 1990s it was evident to many that the continuing, dramatic advances in computing technologies had the potential to revolutionize scientific computing. DOE’s National Nuclear Security Administration (DOE-NNSA) launched the Accelerated Strategic Computing Initiative (ASCI) to help ensure the safety and reliability of the nation’s nuclear stockpile in the absence of testing. In 1998, DOE’s Office of Science (DOE-SC) and the National Science Foundation (NSF) sponsored a workshop at the National Academy of Sciences to identify the opportunities and challenges of realizing major advances in computational modeling and simulation. The report from this workshop identified scientific advances across a broad range of science, ranging in scale from cosmology through nanoscience, which would be made possible by advances in computational modeling and simulation. However, the report also noted challenges to be met if these advances were to be realized.

By the time the workshop report was published, the Office of Science was well along the path to developing an Office-wide program designed to address the challenges identified in the DOE-SC/NSF workshop. The response to the National Academy report, as well as to the 1999 report from the President’s Information Technology Advisory Committee—the Scientific Discovery through Advanced Computing (SciDAC) program—was described in a 22-page March 2000 report by the Office of Science to the House and Senate Appropriations Committees. Congress funded the program in the following fiscal year, and by the end of FY 2001, DOE-SC had established 51 scientific projects, following a national peer-reviewed competition.

SciDAC explicitly recognizes that the advances in computing technologies that provide the foundation for the scientific opportunities identified in the DOE-SC/NSF report (as well as in Volume 2 of the current report) are not, in themselves, sufficient for the “coming of age” of computational simulation. Close integration of three disciplines—science, computer science, and applied mathematics—and a new discipline at their intersection called computational science are needed to realize the full benefits of the fast-moving advances in computing technologies. A change in the sociology of this area of science was necessary for the fusion of these streams of disciplinary developments. While not uniquely American in its origins, the “new kind of computational science” promoted in the SciDAC report found fertile soil in institutional arrangements for the conduct of scientific research that are highly developed in the United States, and seemingly much less so elsewhere: multiprogram, multidisciplinary national laboratories, and strong laboratory-university partnerships in research and graduate education. SciDAC is built on these institutional foundations as well as on the integration of computational modeling, computer algorithms, and computing systems software.

In this chapter, we briefly examine the short but substantial legacy of SciDAC. Our purpose is to understand the implications of its successes, which will help us identify the logical next steps in advancing computational modeling and simulation.

SciDAC’S INGREDIENTS FOR SUCCESS

Aimed at Discovery

SciDAC boldly affirms the importance of computational simulation for new scientific discovery, not just for “rationalizing” the results of experiments. The latter role for simulation is a time-honored one from the days when it was largely subordinate to experimentation, rather than peer to it. That relationship has been steadily evolving. In a frontispiece quotation from J. S. Langer, the chair of the July 1998 multiagency National Workshop on Advanced Scientific Computing, the SciDAC report declared: “The computer literally is providing a new window through which we can observe the natural world in exquisite detail.”

Through advances in computational modeling and simulation, it will be possible to extend dramatically the exploration of the fundamental processes of nature—e.g., the interactions between elementary particles that give rise to the matter around us, the interactions between proteins and other molecules that sustain life—as well as advance our ability to predict the behavior of a broad range of complex natural and engineered systems—e.g., nanoscale devices, microbial cells, fusion energy reactors, and the earth’s climate.

Multidisciplinary

The structure of SciDAC boldly reflects the fact that the development and application of leading-edge simulation capabilities has become a multidisciplinary activity. This implies, for instance, that physicists are able to focus on the physics and not on developing parallel libraries for core mathematical operations. It also implies that mathematicians and computer scientists have direct access to substantive problems of high impact, and not only models loosely related to application-specific difficulties. The deliverables and accountabilities of the collaborating parties are cross-linked. The message of SciDAC is a “declaration of interdependence.”

Specifically, the SciDAC report called for
• creation of a new generation of scientific simulation codes that take full advantage of the extraordinary computing capabilities of terascale computers;
• creation of the mathematical and systems software to enable the scientific simulation codes to effectively and efficiently use terascale computers; and
• creation of collaboratory software infrastructure to enable geographically separated scientists to work together effectively as a team and to facilitate remote access to both facilities and data.

The $57 million per year made available under SciDAC is divided roughly evenly into these three categories. SciDAC also called for upgrades to the scientific computing hardware infrastructure of the Office of Science and outlined a prescription for building a robust, agile, and cost-effective computing infrastructure. These recommendations have thus far been modestly supported outside of the SciDAC budget.

Figure 3, based on a figure from the original SciDAC report, shows the interrelationships between SciDAC program elements as
implemented, particularizing Figure 1 of this report to the SciDAC program. All five program offices in the Office of Science are participating in SciDAC. The disciplinary computational science groups are being funded by the Offices of Basic Energy Sciences, Biology and Environmental Research, Fusion Energy Sciences, and High Energy and Nuclear Physics, and the integrated software infrastructure centers and networking infrastructure centers are being funded by the Office of Advanced Scientific Computing Research.

Figure 3. Relationship between the science application teams, the Integrated Software Infrastructure Centers (ISICs), and networking and computing facilities in the SciDAC program.

Multi-Institutional

Recognizing that the expertise needed to make major advances in computational modeling and algorithms and computing systems software was distributed among various institutions across the United States, SciDAC established scientifically natural collaborations without regard for geographical or institutional boundaries—laboratory-university collaborations and laboratory-laboratory collaborations. Laboratories and universities have complementary strengths in research. Laboratories are agile in reorganizing their research assets to respond to missions and new opportunities. Universities respond on a much slower time scale to structural change, but they are hotbeds of innovation where graduate students explore novel ideas as an integral part of their education. In the area of simulation, universities are often sources of new models, algorithms, and software and hardware concepts that can be evaluated in a dissertation-scale effort. However, such developments may not survive the graduation of the doctoral candidate and are rarely rendered into reusable, portable, extensible infrastructure in the confines of the academic environment, where the reward structure emphasizes the original discovery and proof of concept. Laboratory teams excel in creating, refining, “hardening,” and deploying computational scientific infrastructure. Thirteen laboratories and 50 universities were funded in the initial round of SciDAC projects.

Setting New Standards for Software

For many centuries, publication of original results has been the gold standard of accomplishment for scientific researchers. During the first 50 years of scientific computing, most research codes targeted a few applications on a small range of hardware, and were designed to be used by experts only (often just the author). Code development projects emphasized finding the shortest path to the next publishable result. Packaging code for reuse was not a job for serious scientists. As the role of computational modeling and simulation in advancing scientific discovery grows, this culture is changing. Scientists now realize the value of high-quality, reusable, extensible, portable software, and the producers of widely used packages are highly respected.

Unfortunately, funding structures in the federal agencies have evolved more slowly, and it has not always been easy for scientists to find support for the development of software for the research community. By dedicating more than half its resources to the development of high-value scientific applications software and the establishment of integrated software infrastructure centers, SciDAC has supported efforts to make advanced scientific packages easy to use, port them to new hardware, maintain them, and make them capable of interoperating with other related packages that achieve critical mass of use and offer complementary features.
SciDAC recognizes that large scientific simulation codes have many needs in common: modules that construct and adapt the computational mesh on which their models are based, modules that produce the discrete equations from the mathematical model, modules that solve the resulting equations, and modules that visualize the results and manage, mine, and analyze the large data sets. By constructing these modules as interoperable components and supporting the resulting component libraries so that they stay up to date with the latest algorithmic advances and work on the latest hardware, SciDAC amortizes costs and creates specialized points of contact for a portfolio of applications.

The integrated software infrastructure centers that develop these libraries are motivated to ensure that their users have available the full range of algorithmic options under a common interface, along with a complete description of their resource tradeoffs (e.g., memory versus time). Their users can try all reasonable options easily and adapt to different computational platforms without recoding the applications. Furthermore, when an application group intelligently drives software infrastructure research in a new and useful direction, the result is soon available to all users.

Large Scale

Because computational science is driven toward the large scale by requirements of model fidelity and resolution, SciDAC emphasizes the development of software for the most capable computers available. In the United States, these are now primarily hierarchical distributed-memory computers. Enough experience was acquired on such machines in heroic mission-driven simulation programs in the latter half of the 1990s (such as DOE’s ASCI) to justify them as cost-effective simulation hardware, provided that certain generic difficulties in programming are amortized over many user groups and many generations of hardware. Still, many challenges remain to making efficient and effective use of these computers for the broad range of scientific applications important to the mission of DOE’s Office of Science.

Supporting Community Codes

As examples of simulations from Office of Science programs, the SciDAC report highlighted problems of ascertaining chemical structure and the massively parallel simulation code NWCHEM, winner of a 1999 R&D 100 Award and a Federal Laboratory Consortium Excellence in Technology Transfer Award in 2000.

NWCHEM, currently installed at nearly 900 research institutions worldwide, was created by an international team of scientists from twelve institutions (two U.S. national laboratories, two European national laboratories, five U.S. universities, and three European universities) over the course of five years. The core team consisted of seven full-time theoretical and computational chemists, computer scientists, and applied mathematicians, ably assisted by ten postdoctoral fellows on temporary assignment. Overall, NWCHEM represents an investment in excess of 100 person-years. Yet, the layered structure of NWCHEM permits it to easily absorb new physics or algorithm modules, to run on environments from laptops to the fastest parallel architectures, and to be ported to a new supercomputing environment with isolated minor modifications, typically in less than a week. The scientific legacy embodied in NWCHEM goes back three
decades to the Gaussian code, for which John Pople won the 1998 Nobel Prize in chemistry.

Community codes such as Gaussian and NWCHEM can consume hundreds of person-years of development (Gaussian probably has 500–1000 person-years invested in it), run at hundreds of installations, are given large fractions of community computing resources for decades, and acquire an authority that can enable or limit what is done and accepted as science in their respective communities. Yet, historically, not all community codes have been crafted with the care of NWCHEM to achieve performance, portability, and extensibility. Except at the beginning, it is difficult to promote major algorithmic changes in such codes, since change is expensive and pulls resources from core progress. SciDAC application groups have been chartered to build new and improved community codes in areas such as accelerator physics, materials science, ocean circulation, and plasma physics. The integrated software infrastructure centers of SciDAC have a chance, due to the interdependence built into the SciDAC program structure, to simultaneously influence all of these new community codes by delivering software incorporating optimal algorithms that may be reused across many applications. Improvements driven by one application will be available to all. While SciDAC application groups are developing codes for communities of users, SciDAC enabling technology groups are building codes for communities of developers.

Enabling Success

The first two years of SciDAC provide ample evidence that constructing multidisciplinary teams around challenge problems and providing centralized points of contact for mathematical and computational technologies is a sound investment strategy. For instance, Dalton Schnack of General Atomics, whose group operates and maintains the simulation code NIMROD to support experimental design of magnetically confined fusion energy devices, states that software recently developed by Xiaoye (Sherry) Li of Lawrence Berkeley National Laboratory has enhanced his group’s ability to perform fusion simulations to a degree equivalent to three to five years of riding the wave of hardware improvements alone. Ironically, the uniprocessor version of Li’s solver, originally developed while she was a graduate student at the University of California–Berkeley, predated the SciDAC initiative; but it was not known to the NIMROD team until it was recommended during discussions between SciDAC fusion and solver software teams. In the meantime, SciDAC-funded research improved the parallel performance of Li’s solver in time for use in NIMROD’s current context.

M3D, a fusion simulation code developed at the Princeton Plasma Physics Laboratory, is complementary to NIMROD but uses a different mathematical formulation. Like NIMROD and many simulation codes, M3D spends a large fraction of its time solving sparse linear algebraic systems. In a collaboration that predated SciDAC, the M3D code group was using PETSc, a portable toolkit of sparse solvers developed at Argonne National Laboratory and in wide use around the world. Under SciDAC, the Terascale Optimal PDE Solvers integrated software center provided another world-renowned class of solvers, Hypre, from Lawrence Livermore National Laboratory “underneath” the same code interface that M3D was already using to call PETSc’s solvers. The combined PETSc-Hypre solver library allows M3D to solve its current linear systems two to three times faster than previously, and this SciDAC collaboration has identified several avenues for improving the PETSc-Hypre code support for M3D and other users.
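The design idea at work here is that the application code is written against a common solver interface, so that a better backend can be dropped in "underneath" without touching the physics code. The toy Python sketch below illustrates the idea only; the classes and names are invented for this illustration and are not the PETSc or Hypre interfaces.

    import numpy as np

    class JacobiSolver:
        """A deliberately simple iterative backend (Jacobi iteration)."""
        def solve(self, A, b, tol=1e-10, max_iter=50_000):
            D = np.diag(A)                            # diagonal of A
            x = np.zeros_like(b)
            for _ in range(max_iter):
                x_new = (b - (A @ x - D * x)) / D     # x_{k+1} = D^{-1}(b - (A - D) x_k)
                if np.linalg.norm(x_new - x) < tol:
                    return x_new
                x = x_new
            return x

    class DirectSolver:
        """A drop-in replacement backend: dense LU factorization."""
        def solve(self, A, b, **kwargs):
            return np.linalg.solve(A, b)

    def application_code(solver, n=20):
        """The 'physics code' sees only solver.solve(A, b); swapping backends
        requires no change here."""
        A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # model linear system
        b = np.ones(n)
        return solver.solve(A, b)

    # Either backend can be plugged in beneath the same interface.
    x_iterative = application_code(JacobiSolver())
    x_direct = application_code(DirectSolver())
    print(np.max(np.abs(x_iterative - x_direct)))    # small: the backends agree

In the SciDAC setting the same principle operates at far greater scale and sophistication, with parallel preconditioned sparse solvers selected beneath an interface the application already uses.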

While these are stories of performance improvement, allowing faster turnaround and more production runs, there are also striking stories of altogether new physics enabled by SciDAC collaborations. The Princeton fusion team has combined forces with the Applied Partial Differential Equations Center, an integrated software infrastructure center, to produce a simulation code capable of resolving the fine structure of the dynamic interface between two fluids over which a shock wave passes—the so-called Richtmyer-Meshkov instability—in the presence of a magnetic field. This team has found that a strong magnetic field stabilizes the interface, a result not merely of scientific interest but also of potential importance in many devices of practical interest to DOE.

SciDAC’s Terascale Supernova Initiative (TSI), led by Tony Mezzacappa of Oak Ridge National Laboratory, has achieved the first stellar core collapse simulations for supernovae with state-of-the-art neutrino transport. The team has also produced state-of-the-art models of nuclei in a stellar core. These simulations have brought together two fundamental fields at their respective frontiers, demonstrated the importance of accurate nuclear physics in supernova models, and motivated planned experiments to measure the nuclear processes occurring in stars. This group has also performed simulations of a rapid neutron capture process believed to occur in supernovae and to be responsible for the creation of half the heavy elements in the universe. This has led to a surprising and important result: this process can occur in environments very different from those heretofore thought, perhaps providing a way around some of the fundamental difficulties encountered in past supernova models in explaining element synthesis. To extend their model to higher dimensions, the TSI team has engaged mathematicians and computer scientists to help them overcome the “curse of dimensionality” and exploit massively parallel machines. Mezzacappa has declared that he would never go back to doing computational physics outside of a multidisciplinary team.

Setting a New Organizational Paradigm

By acknowledging sociological barriers to multidisciplinary research up front and building accountabilities into the project awards to force the mixing of “scientific cultures” (e.g., physics with computer science), SciDAC took a risk that has paid off handsomely in technical results. As a result, new DOE-SC initiatives are taking SciDAC-like forms—for instance, the FY 2003 initiative Nanoscience Theory, Modeling and Simulation. The roadmap for DOE’s Fusion Simulation Program likewise envisions multidisciplinary teams constructing complex simulations linking together what are today various freestanding simulation codes for different portions of a fusion reactor. Such complex simulations are deemed essential to the on-time construction of a working demonstration International Thermonuclear Experimental Reactor.

In many critical application domains, such teams are starting to self-assemble as a natural response to the demands of the next level of scientific opportunity. This self-assembly represents a phase transition in the computational science universe, from autonomous individual investigators to
coordinated multidisciplinary teams. When experimental physicists transitioned from solo investigators to members of large teams centered at major facilities, they learned to include the cost of other professionals (statisticians, engineers, computer scientists, technicians, administrators, and others) in their projects. Computational physicists are now learning to organize their projects as large teams built around major computational facilities and to count the cost of other professionals. All such transitions are sociologically difficult. In the case of computational simulation, there is sufficient infrastructure in common between different scientific endeavors that DOE and other science agencies can amortize its expense and provide organizational paradigms for it. SciDAC is such a pilot paradigm. It is time to extend it.
3. ANATOMY OF A LARGE-SCALE SIMULATION

Because the methodologies and processes of computational simulation are still not familiar to many scientists and laypersons, we devote this brief chapter to their illustration. Our goal is to acquaint the reader with a number of issues that arise in computational simulation and to highlight the benefits of collaboration across the traditional boundaries of science. The structure of any program aimed at advancing computational science must be based on this understanding.

Scientific simulation is a vertically integrated process, as indicated in Figures 1 and 3. It is also an iterative process, in which computational scientists traverse several times a hierarchy of issues in modeling, discretization, solution, code implementation, hardware execution, visualization and interpretation of results, and comparison to expectations (theories, experiments, or other simulations). They do this to develop a reliable “scientific instrument” for simulation, namely a marriage of a code with a computing and user environment. This iteration is indicated in the two loops (light arrows) of Figure 4 (reproduced from the SciDAC report).

Figure 4. The process of scientific simulation, showing the validation and verification loop (left) and algorithm and performance tuning loop (right).

One loop over this computational process is related to validation of the model and verification that the process is correctly executing the model. With slight violence to grammar and with some oversimplification, validation poses the question “Are we solving the right equations?” and verification, the question “Are we solving the equations right?”

The other loop over the process of simulation is related to algorithm and performance tuning. A simulation that yields high-fidelity results is of little use if it is too expensive to run or if it cannot be scaled up to the resolutions, simulation times, or ensemble sizes required to describe the real-world phenomena of interest. Again with some oversimplification, algorithm tuning poses the question, “Are we getting enough science per operation?” and
performance tuning, “Are we getting enough operations per cycle?”

It is clear from Figure 4 that a scientific simulation depends upon numerous components, all of which have to work properly. For example, the theoretical model must not overlook or misrepresent any physically important effect. The model must not be parameterized with data of uncertain quality or provenance. The mathematical translation of the theoretical model (which may be posed as a differential equation, for instance) to a discrete representation that a computer can manipulate (a large system of algebraic relationships based on the differential equation, for instance) inevitably involves some approximation. This approximation must be controlled by a priori analysis or a posteriori checks. The algorithm that solves the discrete representation of the system being modeled must reliably converge. The compiler, operating system, and message-passing software that translate the arithmetic manipulations of the solution algorithm into loads and stores from memory to registers, or into messages that traverse communication links in a parallel computer, must be reliable and fault-tolerant. They must be employed correctly by programmers and simulation scientists so that the results are well defined and repeatable. In a simulation that relies on multiple interacting models (such as fluids and structures in adjacent domains, or radiation and combustion in the same domain), information fluxes at the interface of the respective models, which represent physical fluxes in the simulated phenomenon, must be consistent. In simulations that manipulate or produce enormous quantities of data on multiple disks, files must not be lost or mistakenly overwritten, nor misfiled results extracted for visualization. Each potential pitfall may require a different expert to detect it, control it at an acceptable level, or completely eliminate it as a possibility in future designs. These are some of the activities that exercise the left-hand loop in Figure 4.
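One concrete and widely used a posteriori check of this kind is sketched below purely for illustration (the quantity and the numerical values are hypothetical, not taken from the report): recompute a scalar quantity of interest on three successively refined grids and estimate the observed order of accuracy. A result far from the method’s design order signals a problem in the discretization or its implementation.

    import math

    def observed_order(q_coarse, q_medium, q_fine, r=2.0):
        """Estimate the observed order of accuracy p from a quantity of interest q
        computed on three grids, each refined by the ratio r. If q_h ~ q_exact + C*h**p,
        then (q_coarse - q_medium) / (q_medium - q_fine) ~ r**p."""
        return math.log((q_coarse - q_medium) / (q_medium - q_fine)) / math.log(r)

    # Hypothetical values of a quantity of interest on grids of spacing h, h/2, h/4:
    print(observed_order(1.2500, 1.0625, 1.015625))   # prints 2.0: consistent with a second-order method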

During or after the validation of a simulation code, attention must be paid to the performance of the code in each hardware/system software environment available to the research team. Efforts to control discretization and solution error in a simulation may grossly increase the number of operations required to execute a simulation, far out of proportion to their benefit. Similarly, “conservative” approaches to moving data within a processor or between processors, to reduce the possibility of errors or miscommunication, may require extensive synchronization or minimal data replication and may be inefficient on a modern supercomputer. More elegant, often adaptive, strategies must be adopted for the simulation to be execution-worthy on an expensive massively parallel computer, even though these strategies may require highly complex implementation. These are some of the activities that exercise the right-hand loop in Figure 4.

ILLUSTRATION: SIMULATION OF A TURBULENT REACTING FLOW

The dynamics of turbulent reacting flows have long constituted an important topic in combustion research. Apart from the scientific richness of the combination of fluid turbulence and chemical reaction, turbulent flames are at the heart of the design of equipment such as reciprocating engines, turbines, furnaces, and incinerators, for efficiency and minimal environmental impact. Effective properties of turbulent flame dynamics are also required as model inputs in a broad range of larger-scale simulation challenges, including fire spread in buildings or wildfires, stellar dynamics, and chemical processing.

Figure 5 shows a photograph of a laboratory-scale premixed methane flame, stabilized by a metal rod across the outlet of a cylindrical burner. Although experimental and theoretical investigations have been able to provide substantial insights into the dynamics of these types of flames, there are still a number of outstanding questions about their behavior. Recently, a team of computational scientists supported by SciDAC, in collaboration with experimentalists at Lawrence Berkeley National Laboratory, performed the first simulations of these types of flames using detailed chemistry and transport. Figure 6 shows an instantaneous image of the flame surface from the simulation. (Many such instantaneous configurations are averaged in the photographic image of Figure 5.)

Figure 5. Photograph of an experimental turbulent premixed rod-stabilized “V” methane flame.

Figure 6. Instantaneous surface of a turbulent premixed rod-stabilized V-flame, from simulation.

The first step in simulations such as these is to specify the computational models that will be used to describe the reacting flows. The essential feature of reacting flows is the set of chemical reactions taking place in the fluid. Besides chemical products, these reactions produce both temperature and pressure changes, which couple to the dynamics of the flow. Thus, an accurate description of the reactions is critical to predicting the properties of the flame, including turbulence. Conversely, it is the fluid flow that transports the reacting chemical species into the reaction zone and transports the products of the reaction and the released energy away from the reaction zone. The location and shape of the reaction zone are determined by a delicate balance of species, energy, and momentum fluxes and are highly sensitive to how these fluxes are specified at the boundaries of the computational domain. Turbulence can wrinkle the reaction zone, giving it much
more area than it would have in its laminar state, without turbulence. Hence, incorrect prediction of turbulence intensity may under- or over-represent the extent of reaction.

From first principles, the reactions of molecules are described by the Schrödinger equation, and fluid flow by the Navier-Stokes equations. However, each of these equation sets is too difficult to solve directly for the turbulent premixed rod-stabilized “V” methane flame illustrated in Figure 5. So we must rely on approximations to these equations. These approximations define the computational models used to describe the methane flame. For the current simulations, the set of 84 reactions describing the methane flame involves 21 chemical species (methane, oxygen, water, carbon dioxide, and many trace products and reaction intermediates). The particular version of the compressible Navier-Stokes model employed here allows detailed consideration of convection, diffusion, and expansion effects that shape the reaction; but it is “filtered” to remove sound waves, which pose complications that do not measurably affect combustion in this regime. The selection of models such as these requires context-specific expert judgment. For instance, in a jet turbine simulation, sound waves might represent a non-negligible portion of the energy budget of the problem and would need to be modeled; but it might be valid (and very cost-effective) to use a less intricate chemical reaction mechanism to answer the scientific question of interest.

Development of the chemistry model and validation of its predictions, which was funded in part by DOE’s Chemical Sciences Program, is an interesting scientific story that is too long to be told here. Suffice it to say that this process took many years. In some scientific fields, high-fidelity models of many important processes involved in the simulations are not available; and additional scientific research will be needed to achieve a comparable level of confidence.

The computations pictured here were performed on the IBM SP, named “Seaborg,” at the National Energy Research Scientific Computing Center, using up to 2048 processors. They are among the most demanding combustion simulations ever performed. However, the ability to do them is not merely a result of improvements in computer speed. Improvements in algorithm technology funded by the DOE Applied Mathematical Sciences Program over the past decade have been instrumental in making these computations feasible. Mathematical analysis is used to reformulate the equations describing the fluid flow so that high-speed acoustic transients are removed analytically, while compressibility effects due to chemical reactions are retained. The mathematical model of the fluid flow is discretized using high-resolution finite difference methods, combined with local adaptive mesh refinement by which regions of the finite difference grid are automatically refined or coarsened to maximize overall computational efficiency. The implementation of this methodology uses an object-oriented, message-passing software framework that handles the complex data distribution and dynamic load balancing needed to effectively exploit modern parallel supercomputers. The data analysis framework used to explore the results of the simulation and create visual images that lead to understanding is based on recent developments in “scripting languages” from computer science. The combination of these algorithmic innovations reduces the computational cost by a factor of 10,000 compared with a
standard uniform-grid compressible-flow approach.

Researchers have just begun to explore the data generated in these simulations. Figure 7 presents a comparison between a laboratory measurement and a simulation. The early stages of the analysis of these data indicate that excellent agreement is obtained with observable quantities. Further analyses promise to shed new light on the morphology and structure of these types of flames.

Figure 7. Comparison of experimental particle image velocimetry crossview of the turbulent “V” flame and a simulation.

This simulation achieves its remarkable resolution of the thin flame edge by means of adaptive refinement, which places fine resolution around those features that need it and uses coarse resolution away from such features, saving orders of magnitude in computer memory and processing time. As the reaction zone shifts, the refinement automatically tracks it, adding and removing resolution dynamically. As described earlier, the simulation also employs a mathematical transformation that enables it to skip over the tiny time steps that would otherwise be required to resolve sound waves in the flame domain. The four orders of magnitude saved by these algorithmic improvements, compared with previous practices, is a larger factor than the performance gain from parallelization on a computer system that costs tens of millions of dollars! It will be many years before such a high-fidelity simulation can be cost-effectively run using previous practices on any computer.

Unfortunately, the adaptivity and mathematical filtering employed to save memory and operations complicate the software and throw the execution of the code into a regime of computation that processes fewer operations per second and uses the thousands of processors of the Seaborg system less uniformly than a “business as usual” algorithm would. As a result, the simulation runs at a small percentage of the theoretical peak rate of the Seaborg machine. However, since it models the phenomenon of interest in much less execution time and delivers it many years earlier than would otherwise be possible, it is difficult to argue against its “scientific efficiency.” In this case, a simplistic efficiency metric like “percentage of theoretical peak” is misleading.
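To make the source of these savings concrete, consider a rough, back-of-the-envelope accounting added here for illustration; the flame and sound speeds below are assumptions of ours, and the report itself quotes only the aggregate factors. An explicit compressible scheme must respect an acoustic stability limit on its time step, whereas the filtered (low-Mach-number) formulation is limited only by the much slower fluid motion:

\[
\Delta t_{\mathrm{compressible}} \;\lesssim\; \frac{\Delta x}{|u| + c},
\qquad
\Delta t_{\mathrm{filtered}} \;\lesssim\; \frac{\Delta x}{|u|},
\qquad
\frac{\Delta t_{\mathrm{filtered}}}{\Delta t_{\mathrm{compressible}}} \;\approx\; \frac{|u| + c}{|u|}.
\]

For a laboratory flame with flow speeds of order 1 m/s and a sound speed of roughly 350 m/s, removing the acoustics alone allows time steps a few hundred times larger; adaptive refinement of only the thin reaction zone supplies the remaining factors, yielding the overall saving of about 10^4 cited above. By comparison, perfect parallel scaling on the 2048 processors used here could contribute at most a factor of about 2 x 10^3.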

The simulation highlighted earlier is remarkable for its physical fidelity in a complex-physics, multiscale problem. Experimentalists are easily attracted to fruitful collaborations when simulations reach this threshold of fidelity. Resulting comparisons help refine both experiment and simulation in successive leaps. However, the simulation is equally noteworthy for the combination of the sophisticated mathematical techniques it employs and its parallel implementation on thousands of processors. Breakthroughs such as this have caused scientists to refer to leading-edge large-scale simulation environments as “time machines.” They enable us to pull into the present scientific understanding for which we would otherwise wait years if we were dependent on experiment or on commodity computing capability alone.

The work that led to this simulation recently won recognition for one of its creators, John Bell. He and his colleague Phil Colella, both of Lawrence Berkeley National Laboratory, were awarded the first Prize in Computational Science and Engineering given by the Society for Industrial and Applied Mathematics and the Association for Computing Machinery in June 2003. The prize is awarded “in the area of computational science in recognition of outstanding contributions to the development and use of mathematical and computational tools and methods for the solution of science and engineering problems.” Bell and Colella have applied the automated adaptive meshing methodologies they developed in such diverse areas as shock physics, astrophysics, and flow in porous media, as well as turbulent combustion.

The software used in this combustion simulation is currently being extended in many directions under SciDAC. The geometries that it can accurately accommodate are being generalized for accelerator applications. For magnetically confined fusion applications, a particle dynamics module is being added.

Software of this complexity and versatility could never have been assembled in the traditional mode of development, whereby individual researchers with separate concerns asynchronously toss packages written to arbitrary specifications “over the transom.” Behind any simulation as complex as the one discussed in this chapter stands a tightly coordinated, vertically integrated team. This process illustrates the computational science “phase transition” that has found proof of concept under the SciDAC program.
4. OPPORTUNITIES AT THE SCIENTIFIC HORIZON

The availability of computers 100 to 1000 times more powerful than those currently available will have a profound impact on computational scientists’ ability to simulate the fundamental physical, chemical, and biological processes that underlie the behavior of natural and engineered systems. Chemists will be able to model the diverse set of molecular processes involved in the combustion of hydrocarbon fuels and the catalytic production of chemicals. Materials scientists will be able to predict the properties of materials from knowledge of their structure, and then, inverting the process, design materials with a targeted set of properties. Physicists will be able to model a broad range of complex phenomena—from the fundamental interactions between elementary particles, to high-energy nuclear processes and particle-electromagnetic field interactions, to the interactions that govern the behavior of fusion plasmas. Biologists will be able to predict the structure and, eventually, the function of the 30,000–40,000 proteins encoded in human DNA. They will also mine the vast reservoirs of quantitative data being accumulated using high-throughput experimental techniques to obtain new insights into the processes of life.

In this section, we highlight some of the most exciting opportunities for scientific discovery that will be made possible by an integrated set of investments, following the SciDAC paradigm described in Chapter 2, in the programmatic offices in the Office of Science in DOE—Advanced Scientific Computing Research, Basic Energy Sciences, Biological and Environmental Research, Fusion Energy Sciences, High Energy Physics, and Nuclear Physics. In Table 1 we provide a list of the major scientific accomplishments expected in each scientific area from the investments described in this report. The table is followed by an expanded description of one of these accomplishments in each category. The progress described in these paragraphs illustrates the advances that will be made toward solving many other scientific problems, in many other scientific disciplines, as the methodologies of computational science are refined through application to these primary Office of Science missions. Volume 2 of this report provides additional information on the many advances that, scientists predict, these investments will make possible.

PREDICTING FUTURE CLIMATES

While existing general circulation models (GCMs) used to simulate climate can provide good estimates of continental- to global-scale climatic variability and change, they are less able to produce estimates of regional changes that are essential for assessing environmental and economic consequences of climatic variability and change. For example, predictions of global average atmospheric temperature changes are of limited utility unless the effects of these changes on regional space-heating needs, rainfall, and growing seasons can be ascertained. As a result of the investments described here, it will be possible to markedly improve both regional predictions of long-term climatic variability and change, and assessments of the potential consequences of such variability and change on natural and managed ecosystems and resources of value to humans (see Figure 8).

Accurate prediction of regional climatic variability and change would have tremendous value to society, and that value
Table 1. If additional funding becomes available for Office of Science scientific, computer science, and mathematics research programs, as well as for the needed additional computing resources, scientists predict the following major advances in the research programs supported by the Office of Science within the next 5–10 years

Research Programs — Major Scientific Advances

Biological and Environmental Sciences
• Provide global forecasts of Earth’s future climate at regional scales using high-resolution, fully coupled and interactive climate, chemistry, land cover, and carbon cycle models.
• Develop predictive understanding of subsurface contaminant behavior that provides the basis for scientifically sound and defensible cleanup decisions and remediation strategies.
• Establish a mathematical and computational foundation for the study of cellular and molecular systems and use computational modeling and simulation to predict and simulate the behavior of complex microbial systems for use in mission application areas.

Chemical and Materials Sciences
• Provide direct 3-dimensional simulations of a turbulent methane-air jet flame with detailed chemistry and direct 3-dimensional simulations of autoignition of n-heptane at high pressure, leading to more-efficient, lower-emission combustion devices.
• Develop an understanding of the growth and structural, mechanical, and electronic properties of carbon nanotubes, for such applications as strengtheners of polymers and alloys, emitters for display screens, and conductors and nonlinear circuit elements for nanoelectronics.
• Provide simulations of the magnetic properties of nanoparticles and arrays of nanoparticles for use in the design of ultra-high-density magnetic information storage. (Nanomagnetism is the simplest of many important advances in nanoscience that will impact future electronics, tailored magnetic response devices, and even catalysis.)
• Resolve the current disparity between theory and experiment for conduction through molecules with attached electronic leads. (This is the very basis of molecular electronics and may well point to opportunities for chemical sensor technology.)

Fusion Energy Sciences
• Improve understanding of fundamental physical phenomena in high-temperature plasmas, including transport of energy and particles, turbulence, global equilibrium and stability, magnetic reconnection, electromagnetic wave/particle interactions, boundary layer effects in plasmas, and plasma/material interactions.
• Simulate individual aspects of plasma behavior, such as energy and particle confinement times, high-pressure stability limits in magnetically confined plasmas, efficiency of electromagnetic wave heating and current drive, and heat and particle transport in the edge region of a plasma, for parameters relevant to magnetically confined fusion plasmas.
• Develop a fully integrated capability for predicting the performance of magnetically confined fusion plasmas with high physics fidelity, initially for tokamak configurations and ultimately for a broad range of practical energy-producing magnetic confinement configurations.
• Advance the fundamental understanding and predictability of high-energy density plasmas for inertial fusion energy. (Inertial fusion and magnetically confined fusion are complementary technological approaches to unlocking the power of the atomic nucleus.)
Table 1 (continued)

Research Programs — Major Scientific Advances

High Energy Physics
• Establish the limits of the Standard Model of Elementary Particle Physics by achieving a detailed understanding of the effects of strong nuclear interactions in many different processes, so that the equality of Standard Model parameters measured in different experiments can be verified. (Or, if verification fails, signal the discovery of new physical phenomena at extreme sub-nuclear distances.)
• Develop realistic simulations of the performance of particle accelerators, the large and complex core scientific instruments of high-energy physics research, both to optimize the design, technology and cost of future accelerators and to use existing accelerators more effectively and efficiently.

Nuclear Physics
• Understand the characteristics of the quark-gluon plasma, especially in the temperature-density region of the phase transition expected from quantum chromodynamics (QCD) and currently sought experimentally.
• Obtain a quantitative, predictive understanding of the quark-gluon structure of the nucleon and of interactions of nucleons.
• Understand the mechanism of core collapse supernovae and the nature of the nucleosynthesis in these spectacular stellar explosions.
could be realized if the most significant limitations in the application of GCMs were overcome. The proposed investments would allow climate scientists to
• refine the horizontal resolution of the atmospheric component from 160 to 40 km and the resolution of the ocean component from 105 to 15 km (both improvements are feasible now, but there is insufficient computing power to routinely execute GCMs at these resolutions; see the rough scaling sketch following this list);
• add interactive atmospheric chemistry, carbon cycle, and dynamic terrestrial vegetation components to simulate global and regional carbon, nitrogen, and sulfur cycles and simulate the effects of land-cover and land-use changes on climate (these cycles and effects are known to be important to climate, but their implementation in GCMs is rudimentary because of computational limitations); and
• improve the representation of important subgrid processes, especially clouds and atmospheric radiative transfer.

Figure 8. Surface chlorophyll distributions simulated by the Parallel Ocean Program for conditions in late 1996 (a La Niña year) showing intense biological activity across the equatorial Pacific due to upwelling and transport of nutrient-rich cold water. Comparisons with satellite ocean color measurements validate model accuracy and inform our understanding of the regional environmental response to global climate change. [S. Chu, S. Elliott, and M. Maltrud]
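As a rough indication of why routine execution at these resolutions awaits more computing power (an illustrative scaling estimate added here, not a figure from the report), halving the horizontal grid spacing of an explicit atmospheric or ocean model roughly quadruples the number of grid columns and also roughly halves the allowable time step, so the cost grows approximately with the cube of the refinement factor:

\[
\frac{\mathrm{cost}_{\mathrm{fine}}}{\mathrm{cost}_{\mathrm{coarse}}} \;\sim\; \left(\frac{\Delta x_{\mathrm{coarse}}}{\Delta x_{\mathrm{fine}}}\right)^{3},
\qquad
\left(\frac{160}{40}\right)^{3} = 64 \ \ \text{(atmosphere)},
\qquad
\left(\frac{105}{15}\right)^{3} = 343 \ \ \text{(ocean)}.
\]

Roughly two orders of magnitude more sustained computing power per model component is therefore a reasonable expectation before such resolutions become routine, even before interactive chemistry and biogeochemistry are added.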

Even at 40-km resolution, cloud dynamics are incompletely resolved. Recent advances in the development of cloud sub-models, i.e., two-dimensional and quasi three-dimensional cloud-resolving models, can overcome much of this limitation; but their implementation in global GCMs will require substantially more computing power.

SCIENCE FROM THE NANOSCALE UP

Our ability to characterize small molecules or perfect solids as collections of atoms is very advanced. So is our ability to characterize materials of engineering dimensions. However, our abilities fall far short of what is necessary because many of the properties and processes necessary to develop new materials and optimize existing ones occur with characteristic sizes (or scales) that lie between the atomistic world (less than 1 nanometer) and the world perceived by humans (meters or more), a spatial range of at least a billion. Bridging these different scales, while keeping the essential quantum description at the smallest scale, will require unprecedented theoretical innovation and computational effort, but it is an essential step toward a complete fundamental theory of matter.

The nanoscale, 10–100 nanometers or so—the first step in the hierarchy of scales leading from the atomic scale to macroscopic scales—is also important in its own right because, at the nanoscale, new properties emerge that have no parallel at either larger or smaller scales. Thus, the new world of nanoscience offers tremendous opportunities for developing innovative technological solutions to a broad range of real-world problems using scientists’ increasing abilities to manipulate matter at this scale. Theory, modeling, and simulation are essential to developing an understanding of matter at the nanoscale, as well as to developing the new nanotechnologies (see Figure 9). At the nanoscale, experiment often cannot stand alone, requiring sophisticated modeling and simulation to deconvolute experimental results and yield scientific insight. To simulate nanoscale entities, scientists will have to perform ab initio atomistic calculations for more atoms than ever before, directly treat the correlated behavior of the electrons associated with those atoms, and model nanoscale systems as open systems rather than closed (isolated) ones. The computational effort required to simulate nanoscale systems exceeds any computational effort in materials and molecular science to date by a factor of at least 100 to 1000.

Figure 9. To continue annual doubling of hard drive storage density (top: IBM micro-drive), improved understanding of existing materials and development of new ones are needed. First principles spin dynamics and the use of massively parallel computing resources are enabling researchers at Oak Ridge National Laboratory to understand magnetism at ferromagnetic-antiferromagnetic interfaces (bottom: magnetic moments at a Co/FeMn interface) with the goal of learning how to store binary data in ever smaller domains, resulting in further miniaturization breakthroughs. [M. Stocks]
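The factor of 100 to 1000 quoted above can be made plausible with a standard scaling argument, added here for illustration; the assumption of conventional cubic-scaling electronic-structure methods is ours, not the report’s. Widely used ab initio methods such as Kohn-Sham density functional theory, solved with conventional dense diagonalization, cost roughly the cube of the system size:

\[
\mathrm{cost} \;\propto\; N_{\mathrm{atoms}}^{3},
\qquad
\frac{\mathrm{cost}(10\,N_{\mathrm{atoms}})}{\mathrm{cost}(N_{\mathrm{atoms}})} \;\approx\; 10^{3},
\]

so moving from the hundreds of atoms routinely treated today to the thousands of atoms in a nanoscale structure implies roughly three orders of magnitude more work, before the additional burden of treating strong electron correlation or open-system boundary conditions is counted.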

The investments envisioned in this area will make nanoscale simulations possible by advancing the theoretical understanding of nanoscale systems and supporting the development of computational models and software for simulating nanoscale systems and processes, as well as allowing detailed simulations of nanoscale phenomena and their interaction with the macroscopic world.

DESIGNING AND OPTIMIZING FUSION REACTORS

Fusion, the energy source that fuels the sun and stars, is a potentially inexhaustible and environmentally safe source of energy. However, to create the conditions under which hydrogen isotopes undergo nuclear fusion and release energy, high-temperature (100 million degrees centigrade) plasmas—a fluid of ions and electrons—must be produced, sustained, and confined. Plasma confinement is recognized internationally as a grand scientific challenge and a formidable technical task. With the research program and computing resources outlined in this report, it will be possible to develop a fully integrated simulation capability for predicting the performance of magnetic confinement systems, such as tokamaks.

In the past, specialized simulations were used to explore selected key aspects of the physics of magnetically confined plasmas, e.g., turbulence or micro-scale instabilities. Only an integrated model can bring all of the strongly interacting physical processes together to make comprehensive, reliable predictions about the real-world behavior of fusion plasmas. The simulation of an entire magnetic fusion confinement system involves simultaneous modeling of the core and edge plasmas, as well as the plasma-wall interactions. In each region of the plasma, turbulence can cause anomalies in the transport of the ionic species in the plasma; there can also be abrupt changes in the form of the plasma caused by large-scale instabilities (see Figure 10).

Figure 10. Simulation of the interchange of plasma between the hot (red) and cold (green) regions of a tokamak via magnetic reconnection. [S. Jardin]

Computational modeling of these key processes requires large-scale simulations covering a broad range of space and time scales. An integrated simulation capability will dramatically enhance the efficient utilization of a burning fusion device, in particular, and the optimization of fusion energy development in general; and it would serve as an intellectual integrator of physics knowledge ranging from advanced tokamaks to innovative confinement concepts.

PREDICTING THE PROPERTIES OF MATTER

Fundamental questions in high-energy physics and nuclear physics pivot on our ability to simulate so-called strong interactions on a computational lattice via the theory of quantum chromodynamics (QCD).

The ongoing goal of research in high-energy physics is to find the elementary constituents of matter and to determine how they interact. The current state of our knowledge is summarized by the Standard Model of Elementary Particle Physics. A central focus of experiments at U.S. high-energy physics facilities is precision testing of the Standard Model. The ultimate aim of
this work is to find deviations from this model—a discovery that would require the introduction of new forms of matter or new physical principles to describe matter at the shortest distances. Determining whether or not each new measurement is in accord with the Standard Model requires an accurate evaluation of the effects of the strong interactions on the process under study. These computations are a crucial companion to any experimental program in high-energy physics. QCD is the only known method of systematically reducing all sources of theoretical error.

Lattice calculations lag well behind experiment at present, and the precision of experiments under way or being planned will make the imbalance even worse. The impact of the magnitude of the lattice errors is shown in Figure 11. For the Standard Model to be correct, the parameters ρ and η must lie in the region of overlap of all the colored bands. Experiments measure various combinations of ρ and η, as shown by the colored bands. The reduction of theory errors shown will require tens of teraflop-years of simulations, which will be available through the construction of new types of computers optimized for the problems at hand. A further reduction, when current experiments are complete, will need another order of magnitude of computational power.

Figure 11. Constraints on the Standard Model parameters ρ and η. For the Standard Model to be correct, they must be restricted to the region of overlap of the solidly colored bands. The figure on the left shows the constraints as they exist today. The figure on the right shows the constraints as they would exist with no improvement in the experimental errors, but with lattice gauge theory uncertainties reduced to 3%. [R. Patterson]

A primary goal of the nuclear physics program at the Relativistic Heavy Ion Collider (RHIC) at Brookhaven National Laboratory is the discovery and characterization of the quark–gluon plasma—a state of matter predicted by QCD to exist at high densities and temperatures. To confirm an observation of a new state of matter, it is important to determine the nature of the phase transition between ordinary matter and a quark–gluon plasma, the properties of the plasma, and the equation of state. Numerical simulations of QCD on a lattice have proved to be the only source of a priori predictions about this form of matter in the vicinity of the phase transition.

Recent work suggests that a major increase in computing power would result in definitive predictions of the properties of the quark–gluon plasma, providing timely and invaluable input into the RHIC heavy ion program.

Similar lattice calculations are capable of predicting and providing an understanding of the inner structure of the nucleon where the quarks and gluons are confined, another
major nuclear physics goal. Precise lattice calculations of nucleon observables can play a pivotal role in helping to guide experiments. This methodology is well understood, but a “typical” calculation of a single observable requires over a year with all the computers presently available to theorists. A thousandfold increase in computing power would make the same result available in about 10 hours. Making the turn-around time for simulations commensurate with that of experiments would lead to major progress in the quantitative, predictive understanding of the structure of the nucleon and of the interactions between nucleons.

BRINGING A STAR TO EARTH

Nuclear astrophysics is a major research area within nuclear physics—in particular, the study of stellar explosions and their element production. Explosions of massive stars, known as core collapse supernovae, are the dominant source of elements in the Periodic Table between oxygen and iron; and they are believed to be responsible for the production of half the heavy elements (heavier than iron). Life in the universe would not exist without these catastrophic stellar deaths.

To understand these explosive events, their byproducts, and the phenomena associated with them will require large-scale, multidimensional, multiphysics simulations (see Figure 12). Current supercomputers are enabling the first detailed two-dimensional simulations. Future petascale computers would provide, for the first time, the opportunity to simulate supernovae realistically in three dimensions.

Figure 12. Development of a newly discovered shock wave instability and resulting stellar explosions (supernovae), in two- and three-dimensional simulations. [J. Blondin, A. Mezzacappa, and DeMarino; visualization by R. Toedte]

While observations from space are critical, the development of theoretical supernova models will also be grounded in terrestrial experiment. Supernova science has played a significant role in motivating the construction of the Rare Isotope Accelerator (RIA) and the National Underground Laboratory for Science and Engineering (NUSEL). As planned, RIA will perform measurements that will shed light on models for heavy element production in supernovae; and NUSEL will house numerous neutrino experiments, including a megaton neutrino detector capable of detecting extra-galactic supernova neutrinos out to Andromeda.

Supernovae have importance beyond their place in the cosmic hierarchy: the extremes of density, temperature, and composition encountered in supernovae provide an opportunity to explore fundamental nuclear and particle physics that would otherwise be inaccessible in terrestrial experiments. Supernovae therefore serve as cosmic laboratories, and computational models of supernovae constitute the bridge between the observations, which bring us information about these explosions, and the fundamental physics we seek.
5. ENABLING MATHEMATICS AND COMPUTER SCIENCE TOOLS

MATHEMATICAL TOOLS FOR LARGE-SCALE SIMULATION

Mathematics is the bridge between physical reality and computer simulations of physical reality. The starting point for a computer simulation is a mathematical model, expressed in terms of a system of equations and based on scientific understanding of the problem being investigated, such as the knowledge of forces acting on a particle or a parcel of fluid. For such a model, there are three types of mathematical tools that may be brought to bear in order to produce and use computer simulations.

• Model Analysis. Although the mathematical model is motivated by science, it must stand on its own as an internally consistent mathematical object for its computer representation to be well defined. For that reason, it is necessary to resolve a number of issues regarding the mathematical structure of the model. Do the equations have a unique solution for all physically meaningful inputs? Do small changes in the inputs lead only to small changes in the solution, or, as an indication of potential trouble, could a small uncertainty be magnified by the model into large variability in the outcome?

• Approximation and Discretization. For many systems, the number of unknowns in the model is infinite, as when the solution is a function of continuum variables such as space and time. In such cases, it is necessary to approximate the infinite number of unknowns with a large but finite number of unknowns in order to represent it on a computer. The mathematical issues for this process, called “discretization,” include (1) the extent to which the finite approximation agrees more closely with the solution to the original equations as the number of computational unknowns increases and (2) the relationship between the choice of approximation and qualitative mathematical properties of the solution, such as singularities.

• Solvers and Software. Once one has represented the physical problem as the solution to a finite number of equations for a finite number of unknowns, how does one make best use of the computational resources to calculate the solution to those equations? Issues at this stage include the development of optimally efficient algorithms, and the mapping of computations onto a complex hierarchy of processors and memory systems. (A small worked example illustrating the discretization and solver stages follows this list.)

A Science-Based Case for Large-Scale Simulation 31


5. Enabling Mathematics and Computer Science Tools

over the same length of time (see this coupling can therefore be resolved
Figure 13). The series of algorithmic accurately with little computational effort.
improvements producing this factor are all The various improvements in algorithms
based on a fundamental mathematical for solving this problem came about by
property of the underlying model—namely, successive redesigns of the discretization
that the function describing electrostatic methods and solver algorithms to better
coupling between disjoint regions in space exploit this mathematical property of the
is very smooth. Expressed in the right way, model.
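To make the comparison in Figure 13 concrete, the short Python sketch below contrasts the two sources of gain using standard textbook operation-count estimates for this model problem: banded Gaussian elimination costs on the order of n^7 operations on an n × n × n grid, while full multigrid costs on the order of n^3, and hardware is assumed to double in speed every 18 months. The 36-year span and the omitted constant factors are illustrative assumptions, not values taken from the report's table.

    # Illustrative comparison of algorithmic vs. hardware gains for the
    # electrostatic (Poisson) model problem on an n x n x n grid.
    # Assumptions: banded Gaussian elimination ~ n**7 operations,
    # full multigrid ~ n**3 operations (textbook estimates), and a
    # Moore's Law doubling time of 18 months over a 36-year span.

    def algorithmic_gain(n):
        """Ratio of banded-elimination work to full-multigrid work."""
        return n**7 / n**3          # = n**4

    def moores_law_gain(years, doubling_time_years=1.5):
        """Hardware speedup accumulated by periodic doubling."""
        return 2 ** (years / doubling_time_years)

    n = 64  # the case illustrated in Figure 13
    print(f"algorithmic gain at n={n}:      {algorithmic_gain(n):.2e}")   # ~1.7e7
    print(f"Moore's Law gain over 36 years: {moores_law_gain(36):.2e}")   # ~1.7e7
    # The two factors are comparable (roughly 16 million each), and because
    # they are independent they multiply when combined.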

Another trend in computational science is the steady increase in the intellectual complexity of virtually all aspects of the modeling process. The scientists contributing to this report identified and developed (in Volume 2) eight cross-cutting areas of applied mathematics for which research and development is needed to accommodate the rapidly increasing complexity of state-of-the-art computational science. They fall into the following three categories.

• Managing Model Complexity. Scientists want to use increasing computing capability to improve the fidelity of their models. For many problems, this means introducing models with more physical effects, more equations, and more unknowns. In Multiphysics Modeling, the goal is to develop a combination of analytical and numerical techniques to better represent problems with multiple physical processes. These techniques may range from analytical methods to determine how to break a problem up into weakly interacting components, to new numerical methods for exploiting such a decomposition of the problem to obtain efficient and accurate discretizations in time. A similar set of issues arises from the fact that many systems of interest have processes that operate on length and time scales that vary over many orders of magnitude. Multiscale Modeling addresses the representation and interaction of behaviors on multiple scales so that results of interest are recovered without the (unaffordable) expense of representing all behaviors at uniformly fine scales. Approaches include the development of adaptive methods, i.e., discretization methods that can represent directly many orders of magnitude in length scales that might appear in a single mathematical model, and hybrid methods for coupling radically different models (continuum vs. discrete, or stochastic vs. deterministic), each of which represents the behavior on a different scale. Uncertainty Quantification addresses issues connected with mathematical models that involve fits to experimental data, or that are derived from heuristics that may not be directly connected to physical principles. Uncertainty quantification uses techniques from fields such as statistics and optimization to determine the sensitivity of models to inputs with errors and to design models to minimize the effect of those errors.

• Discretizations of Spatial Models. Many of the applications described in this document have, as core components of their mathematical models, the equations of fluid dynamics or radiation transport, or both. Computational Fluid Dynamics and Transport and Kinetic Methods have as their goal the development of the next generation of spatial discretization methods for these problems. Issues include the development of discretization methods that are well-suited for use in multiphysics applications without loss of accuracy or robustness. Meshing Methods specifically address the process of discretizing the computational domain itself into a union of simple elements. Meshing is usually a prerequisite for discretizing the equations defined over the domain. This process includes the management of complex geometrical objects arising in technological devices, as well as in some areas of science, such as biology.

• Managing Computational Complexity. Once the mathematical model has been converted into a system of equations for a finite collection of unknowns, it is necessary to solve the equations. The goal of efforts in the area of Solvers and "Fast" Algorithms is to develop algorithms for solving these systems of equations that balance computational efficiency on hierarchical multiprocessor systems, scalability (the ability to use effectively additional computational resources to solve increasingly larger problems), and robustness (insensitivity of the computational cost to details of the inputs). An algorithm is said to be "fast" if its cost grows, roughly, only proportionally to the size of the problem. This is an ideal algorithmic property that is being obtained for more and more types of equations. (A minimal worked illustration of such a linear-cost solver follows this list.) Discrete Mathematics and Algorithms make up a complementary set of tools for managing the computational complexity of the interactions of discrete objects. Such issues arise, for example, in traversing data structures for calculations on unstructured grids, in optimizing resource allocation on multiprocessor architectures, or in scientific problems in areas such as bioinformatics that are posed directly as "combinatorial" problems.
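As a minimal worked illustration of the last two themes above (discretizing a continuum model and then solving the resulting system with a "fast," linear-cost algorithm), the sketch below discretizes the one-dimensional model problem -u'' = f on the unit interval with centered differences and solves the resulting tridiagonal system by forward elimination and back substitution in O(N) operations. The model problem, grid sizes, and forcing function are illustrative choices made here and are not drawn from the report.

    import math

    def solve_poisson_1d(N, f):
        """Discretize -u'' = f on (0,1) with u(0) = u(1) = 0 using N interior
        points and centered differences, then solve the tridiagonal system in
        O(N) operations (forward elimination plus back substitution)."""
        h = 1.0 / (N + 1)
        x = [(i + 1) * h for i in range(N)]
        d = [h * h * f(xi) for xi in x]        # scaled right-hand side
        # System: -u[i-1] + 2*u[i] - u[i+1] = h^2 * f(x[i])
        c = [0.0] * N                          # modified superdiagonal
        g = [0.0] * N                          # modified right-hand side
        c[0] = -0.5
        g[0] = 0.5 * d[0]
        for i in range(1, N):
            m = 2.0 + c[i - 1]                 # pivot: 2 - (-1) * c[i-1]
            c[i] = -1.0 / m
            g[i] = (d[i] + g[i - 1]) / m
        u = [0.0] * N
        u[N - 1] = g[N - 1]
        for i in range(N - 2, -1, -1):         # back substitution
            u[i] = g[i] - c[i] * u[i + 1]
        return x, u

    def forcing(s):
        return math.pi ** 2 * math.sin(math.pi * s)   # exact solution: sin(pi*s)

    # Work grows only linearly in N, while the discretization error falls as
    # h^2: doubling the number of unknowns cuts the error by roughly four.
    for N in (100, 200, 400):
        x, u = solve_poisson_1d(N, forcing)
        err = max(abs(ui - math.sin(math.pi * xi)) for xi, ui in zip(x, u))
        print(N, f"{err:.2e}")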


COMPUTER SCIENCE TOOLS FOR LARGE-SCALE SIMULATION

The role of computer science in ultrascale simulation is to provide tools that address the issues of complexity, performance, and understanding. These issues cut across the computer science disciplines. An integrated approach is required to solve the problems faced by applications.

COMPLEXITY

One of the major challenges for applications is the complexity of turning a mathematical model into an effective simulation. There are many reasons for this, as indicated in Figure 4. Often, even after the principles of a model and simulation algorithm are well understood, too much effort still is required to turn this understanding into practice because of the complexity of code design. Current programming models and frameworks do not provide sufficient support to allow domain experts to be shielded from details of data structures and computer architecture. Even after an application code produces correct scientific results, too much effort still is required to obtain high performance. Code tuning efforts needed to match algorithms to current computer architectures require lengthy analysis and experimentation.

Once an application runs effectively, often the next hurdle is saving, accessing, and sharing data. Once the data are stored, since ultrascale simulations often produce ultrascale-sized datasets, it is still too difficult to process, investigate, and visualize the data in order to accomplish the purpose of the simulation—to advance science. These difficulties are compounded by the problems faced in sharing resources, both human and computer hardware.

Despite this grim picture, prospects for placing usable computing environments into the hands of scientific domain experts are improving. In the last few years, there has been a growing understanding of the problems of managing complexity in computer science, and therefore of their potential solutions. For example, there is a deeper understanding of how to make programming languages expressive and easy to use without sacrificing high performance on sophisticated, adaptive algorithms.

Another example is the success of component-oriented software in some application domains; such "components" have allowed computational scientists to focus their own expertise on their science while exploiting the newest algorithmic developments. Many groups in high-performance computing have tackled these issues with significant leadership from DOE. Fully integrated efforts are required to produce a qualitative change in the way application groups cope with the complexity of designing, building, and using ultrascale simulation codes.

PERFORMANCE

One of the drivers of software complexity is the premium on performance. The most obvious aspect of the performance problem is the performance of the computer hardware. Although there have been astounding gains in arithmetic processing rates over the last five decades, users often receive only a small fraction of the theoretical peak of processing performance. There is a perception that this fraction is, in fact, declining. This perception is correct in some respects. For many applications, the reason for a declining percentage of peak performance is the relative imbalance in the performance of the subsystems of high-end computers. While the raw performance of commodity processors has followed Moore's Law and doubled every 18 months, the performance of other critical parts of the system, such as memory and interconnect, has improved much less rapidly, leading to less-balanced overall systems. Solving this problem requires attention to system-scale architectural issues.

As with code complexity issues, there are multiple ongoing efforts to address hardware architecture issues. Different architectural solutions may be required for different algorithms and applications. A single architectural convergence point, such as that occupied by current commodity-based terascale systems, may not be the most cost-effective solution for all users. A comprehensive simulation program requires that several candidate architectural approaches receive sustained support to explore their promise.

Performance is a cross-cutting issue, and computer science offers automated approaches to developing codes in ways that allow computational scientists to concentrate on their science. For example, techniques that allow a programmer to automatically generate code for an application that is tuned to a specific computer architecture address both the issue of managing the complexity of highly tuned code and the problem of providing effective portability between high-performance computing platforms. Such techniques begin with separate analyses of the "signature" of the application (e.g., the patterns of local memory accesses and inter-processor communication) and the parameters of the hardware (e.g., cache sizes, latencies, bandwidths); a simplified numerical illustration of this balance is given below. There is usually plenty of algorithmic freedom in scheduling and ordering operations and exchanges while preserving correctness. This freedom should not be arbitrarily limited by particular expression in a low-level language, but rather chosen close to runtime to best match a given application–architecture pair. Similarly, the performance of I/O and dataset operations can be improved significantly through the use of well-designed and adaptive algorithms.

UNDERSTANDING

Computer science also addresses the issue of understanding the results of a computation. Ultrascale datasets are too large to be grasped directly. Applications that generate such sets already rely on a variety of tools to attempt to extract patterns and features from the data. Computer science offers techniques in data management and understanding that can be used to explore data sets, searching for particular patterns. Visualization techniques help scientists explore their data, taking advantage of the unique human visual cortex and visually stimulated human insight. Current efforts in this area are often limited by the lack of resources, in terms of both staffing and hardware.

Understanding requires harnessing the skills of many scientists. Collaboration technologies help scientists at different institutions work together. Grid technologies that simplify data sharing and provide access to both experimental and ultrascale computing facilities allow computational scientists to work together to solve the most difficult problems facing the nation today. Although these technologies have been demonstrated, much work remains to make them a part of every scientist's toolbox. Key challenges are in scheduling multiple resources and in data security.
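Returning to the performance discussion above, the sketch below is a deliberately simplified balance (roofline-style) model: it combines an application "signature" reduced to a single number, the arithmetic intensity in floating-point operations per byte of memory traffic, with two hardware parameters, peak arithmetic rate and memory bandwidth, to bound the attainable fraction of peak. The machine parameters and kernel intensities shown are hypothetical, and this is not the methodology of the report; it only illustrates why low-intensity scientific kernels see a small and shrinking fraction of peak when memory bandwidth lags processor speed.

    def attainable_gflops(intensity_flops_per_byte, peak_gflops, bandwidth_gb_per_s):
        """Upper bound on sustained rate: a kernel is limited either by
        arithmetic (peak) or by memory traffic (bandwidth * intensity)."""
        return min(peak_gflops, bandwidth_gb_per_s * intensity_flops_per_byte)

    # Hypothetical machine: 10 Gflop/s peak per processor, 2 GB/s memory bandwidth.
    peak, bw = 10.0, 2.0

    # Hypothetical kernel signatures (flops per byte of memory traffic).
    kernels = {
        "sparse matrix-vector product": 0.25,   # streams matrix entries, little reuse
        "stencil update": 0.5,
        "dense matrix-matrix multiply": 8.0,    # high reuse from cache blocking
    }

    for name, intensity in kernels.items():
        rate = attainable_gflops(intensity, peak, bw)
        print(f"{name:32s} {rate:5.1f} Gflop/s  ({100 * rate / peak:4.0f}% of peak)")
    # Low-intensity kernels are bandwidth-bound: faster processors alone do not
    # help unless memory bandwidth improves in proportion.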


In Volume 2 of this report, scientists from many disciplines discuss critical computer science technologies needed across applications. Visual Data Exploration and Analysis considers understanding the results of a simulation through visualization. Computer Architecture looks at the systems on which ultrascale simulations run and for which programming environments and algorithms (described in the section on computational mathematics) must provide software tools. Programming Models and Component Technology for High-End Computing discusses new techniques for turning algorithms into efficient, maintainable programs. Access and Resource Sharing looks at how to both make resources available to the entire research community and promote collaboration. Software Engineering and Management is about disciplines and tools—many adapted from the industrial software world—to manage the complexity of code development. Data Management and Analysis addresses the challenges of managing and understanding the increasingly staggering volumes of data produced by ultrascale simulations. Performance Engineering is the art of achieving high performance, which has the potential of becoming a science. System Software considers the basic tools that support all other software.

The universality of these issues calls for focused research campaigns. At the same time, the inter-relatedness of these issues to one another and to the application and algorithmic issues above them in the simulation hierarchy points to the potential synergy of integrated simulation efforts.


6. RECOMMENDATIONS AND DISCUSSION

The preceding chapters of this volume have presented a science-based case for large-scale simulation. Chapter 2 presents an ongoing successful initiative, SciDAC—a collection of 51 projects operating at a level of approximately $57 million per year—as a template for a much-expanded investment in this area. To fully realize the promises of SciDAC, even at its present size, requires immediate new investment in computer facilities. Overall, the computational science research enterprise could make effective use of as much as an order of magnitude increase in annual investment at this time. Funding at that level would enable an immediate tenfold increase in computer cycles available and a doubling or tripling of the researchers in all contributing disciplines. A longer, 5-year view should allow for another tenfold increase in computing power (at less than a tenfold increase in expense). Meanwhile, the emerging workforce should be primed now for expansion by attracting and developing new talent.

Chapter 3 describes the anatomy of a state-of-the-art large-scale simulation. A simulation such as this one goes beyond established practices by exploiting advances from many areas of computational science to marry raw computational power to algorithmic intelligence. It conveys the multidisciplinary spirit intended in the expanded initiative. Chapter 4 lists some compelling scientific opportunities that are just over the horizon: new scientific advances that are predicted to be accessible with an increase in computational power of a hundred- to a thousand-fold. Chapter 5 lists some advances in the enabling technologies of computational mathematics and computer science that will be required or that could, if developed for widespread use, reduce the computational power required to achieve a given scientific simulation objective. The companion volume details these opportunities more extensively. This concluding section revisits each of the recommendations listed at the end of Chapter 1 and expands upon them with additional clarifying discussion.

1. Major new investments in computational science are needed in all of the mission areas of DOE's Office of Science, as well as those of many other agencies, so that the United States may be the first, or among the first, to capture the new opportunities presented by the continuing advances in computing power. Such investments will extend the important scientific opportunities that have been attained by a fusion of sustained advances in scientific models, mathematical algorithms, computer architecture, and scientific software engineering.

The United States is an acknowledged international leader in computational science, both in its national laboratories and in its research universities. However, the United States could lose its leadership position in a short time because the opportunities represented by computational science for economic, military, and other types of leadership are well recognized; the underlying technology and training are available without restrictions; and the technology continues to evolve rapidly. As Professor Jack Dongarra from the University of Tennessee recently noted:


    The rising tide of change [resulting from advances in information technology] shows no respect for the established order. Those who are unwilling or unable to adapt in response to this profound movement not only lose access to the opportunities that the information technology revolution is creating, they risk being rendered obsolete by smarter, more agile, or more daring competitors.

The four streams of development brought together in a "perfect fusion" to create the current opportunities are of different ages, but their confluence is recent. Modern scientific modeling still owes as much to Newton's Principia (1686) as to anything since, although new models are continually proposed and tested. Modern simulation and attention to floating point arithmetic have been pursued with intensity since the pioneering work of Von Neumann at the end of World War II. Computational scientists have been riding the technological curve subsequently identified as "Moore's Law" for nearly three decades now. However, scientific software engineering as a discipline focused on componentization, extensibility, reusability, and platform portability within the realm of high-performance computing is scarcely a decade old. Multidisciplinary teams that are savvy with respect to all four aspects of computational simulation, and that are actively updating their codes with optimal results from each aspect, have evolved only recently. Not every contemporary "community code" satisfies the standard envisioned for this new kind of computational science, but the United States is host to a number of the leading efforts and should capitalize upon them.

2. Multidisciplinary teams, with carefully selected leadership, should be assembled to provide the broad range of expertise needed to address the intellectual challenges associated with translating advances in science, mathematics, and computer science into simulations that can take full advantage of advanced computers.

Part of the genius of the SciDAC initiative is the cross-linking of accountabilities between applications groups and the groups providing enabling technologies. This accountability implies, for instance, that physicists are concentrating on physics, not on developing parallel libraries for core mathematical operations; and it implies that mathematicians are turning their attention to methods directly related to application-specific difficulties, moving beyond the model problems that motivated earlier generations. Another aspect of SciDAC is the cross-linking of laboratory and academic investigators. Professors and their graduate students can inexpensively explore many promising research directions in parallel; however, the codes they produce rarely have much shelf life. On the other hand, laboratory-based scientists can effectively interact with many academic partners, testing, absorbing, and seeding numerous research ideas, and effectively "hardening" the best of the ideas into long-lived projects and tested, maintained codes. This complementarity should be built into future computational science initiatives.

Future initiatives can go farther than SciDAC by adding self-contained, vertically integrated teams to the existing SciDAC structure of specialist groups. The SciDAC Integrated Software Infrastructure Centers (ISICs), for instance, should continue to be supported, and even enhanced, to develop infrastructure that supports diverse science portfolios. However, mathematicians and computer scientists capable of understanding and harnessing the output of these ISICs in specific scientific areas should also be bona fide members of the next generation of computational science teams recommended by this report, just as they are now part of a few of the larger SciDAC applications teams.

3. Extensive investment in new computational facilities is strongly recommended, since simulation now cost-effectively complements experimentation in the pursuit of the answers to numerous scientific questions. New facilities should strike a balance between capability computing for those "heroic simulations" that cannot be performed any other way, and capacity computing for "production" simulations that contribute to the steady stream of progress.

The topical breakout groups assessing opportunities in the sciences at the June 2003 SCaLeS Workshop have universally pleaded for more computational resources dedicated to their use. These pleas are quantified in Volume 2 and highlighted in Chapter 4 of this volume, together with (partial) lists of the scientific investigations that can be undertaken with current know-how and an immediate influx of computing resources.

It is fiscally impossible to meet all of these needs, nor does this report recommend that available resources be spent entirely on the computational technology existing in a given year. Rather, steady investments should be made over evolving generations of computer technology, since each succeeding generation packages more capability per dollar and evolves toward greater usability.

To the extent that the needs for computing resources go unmet, there will remain a driving incentive to get more science done with fewer cycles. This is a healthy incentive for multidisciplinary solutions that involve increasingly optimal algorithms and software solutions. However, this incentive exists independently of the desperate need for computing resources. U.S. investigators are shockingly short on cycles to capitalize on their current know-how. Some workshop participants expressed a need for very large facilities in which to perform "heroic" simulations to provide high-fidelity descriptions of the phenomena of interest, or to establish the validity of a computational model. With current facilities providing peak processing rates in the tens of teraflop/s, a hundred- to thousand-fold improvement in processing rate would imply a petaflop/s machine. Other participants noted the need to perform ensembles of simulations of more modest sizes, to develop quantitative statistics, to perform parameter identifications, or to optimize designs. A carefully balanced set of investments in capability and capacity computing will be required to meet these needs.

Since the most cost-effective cycles for the capacity requirements are not necessarily the same as the most cost-effective cycles for the capability requirements, a mix of procurements is natural and optimal. In the short term, we recommend pursuing, for use by unclassified scientific applications, at least one machine capable of sustaining one petaflop/s, plus the capacity for another sustained petaflop/s distributed over tens of installations nationwide.

In earlier, more innocent times, recommendations such as these could have been justified on the principle of "build it and they will come," with the hope of encouraging the development of a larger computational science activity carrying certain anticipated benefits. In the current atmosphere, the audience for this hardware already exists in large numbers; and the operative principle is, instead, "build it or they will go!" Two plenary speakers at the workshop noted that a major U.S.-based computational geophysicist has already taken his leading research program to Japan, where he has been made a guest of the Earth Simulator facility; this facility offers him a computational capability that he cannot find in the United States. Because of the commissioning of the Japanese Earth Simulator last spring, this problem is most acute in earth science. However, the problem will rapidly spread to other disciplines.

4. Investment in hardware facilities should be accompanied by sustained collateral investment in software infrastructure for them. The efficient use of expensive computational facilities and the data they produce depends directly upon multiple layers of system software and scientific software, which, together with the hardware, are the engines of scientific discovery across a broad portfolio of scientific applications.

The President's Information Technology Advisory Committee (PITAC) report of 1999 focused on the tendency to under-invest in software in many sectors, and this is unfortunately also true in computational science. Vendors have sometimes delivered leading-edge computing platforms with immature or incomplete software bases; and computational science users, as well as third-party vendors, have had to rise to fill this gap. In addition, scientific applications groups have often found it difficult to make effective use of these computing platforms from day one, and revising their software for the new machine can take several months to a year or more. Many industry-standard libraries of systems software and scientific software originated and are maintained at national laboratories, a situation that is a natural adaptation to the paucity of scientific focus in the commercial marketplace.

The SciDAC initiative has identified four areas of scientific software technology for focused investment—scientific simulation codes, mathematical software, computing systems software, and collaboratory software. Support for development and maintenance of portable, extensible software is yet another aspect of the genius of the SciDAC initiative, as the reward structure for these vital tasks is otherwise absent in university-based research and difficult to defend over the long term in laboratory-based research. Although SciDAC is an excellent start, the current funding level cannot support many types of software that computational science users require.

It will be necessary and cost-effective, for the foreseeable future, for the agencies that sponsor computational science to make balanced investments in several of the layers of software that come between the scientific problems they face and the computer hardware they own.

5. Additional investments in hardware facilities and software infrastructure should be accompanied by sustained collateral investments in algorithm research and theoretical development. Improvements in basic theory and algorithms have contributed as much to increases in computational simulation capability as improvements in hardware and software over the first six decades of scientific computing.

A recurring theme among the reports from the topical breakout groups for scientific applications and mathematical methods is that increases in computational capability comparable to those coming directly from Moore's Law have come from improved models and algorithms. This point was noted over 10 years ago in the famous "Figure 4" of a booklet entitled Grand Challenges: High-Performance Computing and Communications published by FCCSET and distributed as a supplement of the President's FY 1992 budget. A contemporary reworking of this chart appears as Figure 13 of this report. It shows that the same factor of 16 million in performance improvement that would be accumulated by Moore's Law in a 36-year span is achieved by improvements in numerical analysis for solving the electrostatic potential equation on a cube over a similar period, stretching from the demonstration that Gaussian elimination was feasible in the limited arithmetic of digital computers to the publication of the so-called "full multigrid" algorithm.

There are numerous reasons to believe that advances in applied and computational mathematics will make it possible to model a range of spatial and temporal scales much wider than those reachable by established practice. This has already been demonstrated with automated adaptive meshing, although the latter still requires too much specialized expertise to have achieved universal use. Furthermore, multiscale theory can be used to communicate the net effect of processes occurring on unresolvable scales to models that are discretized on resolvable scales. Multiphysics techniques may permit codes to "step over" stability-limiting time scales that are dynamically irrelevant at larger scales.

Even for models using fully resolved scales, research in algorithms promises to transform methods whose costs increase rapidly with increasing grid resolution into methods that increase linearly or log-linearly with complexity, with reliable bounds on the approximations that are needed to achieve these simulation-liberating gains in performance. It is essential to keep computational science codes well stoked with the latest algorithmic technology available. In turn, computational science applications must actively drive research in algorithms.

6. Computational scientists of all types should be proactively recruited with improved reward structures and opportunities as early as possible in the educational process so that the number of trained computational science professionals is sufficient to meet present and future demands.

The United States is exemplary in integrating its computational science graduate students into internships in the national laboratory system, and it has had the foresight to establish a variety of fellowship programs to make the relatively new career of "computational scientist" visible outside departmental boundaries at universities. These internships and fellowship programs are estimated to be under-funded, relative to the number of well-qualified students, by a factor of 3. Even if all presently qualified and interested U.S. citizens who are doctoral candidates could be supported—and even if this pool were doubled by aggressive recruitment among groups in the native-born population who are underrepresented in science and engineering, and/or the addition of foreign-born and domestically trained graduate students—the range of newly available scientific opportunities would only begin to be grazed in new computational science doctoral dissertations.


The need to dedicate attention to earlier stages of the academic pipeline is well documented elsewhere and is not the central subject of this focused report, although it is a related concern. It is time to thoroughly assess the role of computational simulation in the undergraduate curriculum. Today's students of science and engineering, whose careers will extend to the middle of the twenty-first century, must understand the foundations of computational simulation, must be aware of the advantages and limitations of this approach to scientific inquiry, and must know how to use computational techniques to help solve scientific and engineering problems. Few universities can make this claim of their current graduates. Academic programs in computational simulation can be attractors for undergraduates who are increasingly computer savvy but presently perceive more exciting career opportunities in non-scientific applications of information technology.

7. Sustained investments must be made in network infrastructure for access and resource sharing, as well as in the software needed to support collaboration among distributed teams of scientists, recognizing that the best possible computational science teams will be widely separated geographically and that researchers will generally not be collocated with facilities and data.

The topical breakout group on access and resource sharing addressed the question of what technology must be developed and/or deployed to make a petaflop/s-scale computer useful to users not located in the same building. Resource sharing constitutes a multilayered research endeavor that is presently accorded approximately one-third of the resources available to SciDAC, which are distributed over ten projects. Needs addressed include network infrastructure, grid/networking middleware, and application-specific remote user environments.

The combination of central facilities and a widely distributed scientific workforce requires, at a minimum, remote access to computation and data, as well as the integration of data distributed over several sites.

Network infrastructure research is also recommended for its promises of leading-edge capabilities that will revolutionize computational science, including environments for coupling simulation and experiment; environments for multidisciplinary simulation; orchestration of multidisciplinary, multiscale, end-to-end science process; and collaboration in support of all these activities.

8. Federal investment in innovative, high-risk computer architectures that are well suited to scientific and engineering simulations is both appropriate and needed to complement commercial research and development. The commercial computing marketplace is no longer effectively driven by the needs of computational science.

Scientific computing—which ushered in the information revolution over the past half-century by demanding digital products that eventually had successful commercial spin-offs—has now virtually been orphaned by the information technology industry. Scientific computing occupies only a few percentage points of the total marketplace revenue volume of information technology. This stark economic fact is a reminder of one of the socially beneficial by-products of computational science. It is also a wake-up call regarding the necessity that computational science attend to its own needs vis-à-vis future computer hardware requirements. The most noteworthy neglect of the scientific computing industry is, arguably, the deteriorating ratio of memory bandwidth between processors and main memory to the computational speed of the processors. It does not matter how fast a processor is if it cannot get the data that it needs from memory fast enough—a fact recognized very early by Seymour Cray, whom many regard as the father of the supercomputer. Most commercial applications do not demand high memory bandwidth; many scientific applications do.

There has been a reticence to accord computational science the same consideration given other types of experimental science when it comes to federal assistance in the development of facilities. High-end physics facilities—such as accelerators, colliders, lasers, and telescopes—are, understandably, designed, constructed, and installed entirely at taxpayer expense. However, the design and fabrication of high-end computational facilities are expected to occur in the course of profit-making enterprises; only the science-specific purchase or lease and installation are covered at taxpayer expense. To the extent that the marketplace automatically takes care of the needs of the computational science community, this represents a wonderful savings for the taxpayer. To the extent that it does not, it stunts the opportunity to accomplish important scientific objectives and to lead the U.S. computer industry to further innovation and greatness.

The need to dedicate attention to research in advanced computer architecture is documented elsewhere in many places, including the concurrent interagency High-End Computing Revitalization Task Force (HECRTF) study, and is not the central concern of this application-focused report. However, it is appropriate to reinforce this recommendation, which is as true of our era as it was true of the era of the "Lax Report" (see the discussion immediately following). The SciDAC report called for the establishment of experimental computing facilities whose mission was to "provide critical experimental equipment for computer science and applied mathematics researchers as well as computational science pioneers to assess the potential of new computer systems and architectures for scientific computing." Coupling these facilities with university-based research projects could be a very effective means to fill the architecture research gap.

CLOSING REMARKS

The contributors to this report would like to note that they claim little original insight for the eight recommendations presented. In fact, our last six recommendations may be found, slightly regrouped, in the four prescient recommendations of the Lax Report (Large-Scale Computing in Science and Engineering, National Science Board, 1982), which was written over two decades ago, in very different times, before the introduction of pervasive personal computing and well before the emergence of the World Wide Web!

Our first two recommendations reflect the beginnings of a "phase transition" in the way certain aspects of computational science are performed, from solo investigators to multidisciplinary groups. Most members of the Lax panel—Nobelists and members of the National Academies included—wrote their own codes, perhaps in their entirety, and were capable of understanding nearly every step of their compilation, execution, and interpretation. This is no longer possible: the scientific applications, computer hardware, and computing systems software are now each so complex that combining all three into today's engines of scientific discovery requires the talents of many. The phase transition to multidisciplinary groups, though not first crystallized by the SciDAC report (Scientific Discovery through Advanced Computing, U.S. Department of Energy, March 2000), was recognized and eloquently and persuasively articulated there.

The original aspect of the current report is not, therefore, its recommendations. It is, rather, its elaboration in the companion volume of what the nation's leading computational scientists see just over the computational horizon, and the descriptions by cross-cutting enabling technologists from mathematics and computer science of how to reach these new scientific vistas.

It is encouraging that the general recommendations of generations of reports such as this one, which indicate a desired direction along a path of progress, are relatively constant. They certainly adapt to local changes in the scientific terrain, but they do not swing wildly in response to the siren calls of academic fashion, the international politics of information technology, or the economics of science funding. Meanwhile, the supporting arguments for the recommendations, which indicate the distance traveled along this path, have changed dramatically. As the companion volume illustrates, the two decades since the Lax Report have been extremely exciting and fruitful for computational science. The next 10 to 20 years will see computational science firmly embedded in the fabric of science—the most profound development in the scientific method in over three centuries!

The constancy of direction provides reassurance that the community is well guided, while the dramatic advances in computational science indicate that it is well paced and making solid progress. The Lax Report measured needs in megaflop/s, the current report in teraflop/s—a unit one million times larger. Neither report, however, neglects to emphasize that it is not the instruction cycles per second that measure the quality of the computational effort, but the product of this rating times the scientific results per instruction cycle. The latter requires better computational models, better simulation algorithms, and better software implementation.


POSTSCRIPT AND ACKNOWLEDGMENTS

This report is built on the efforts of a community of computational scientists, DOE program administrators, and technical and logistical professionals situated throughout the country. Bringing together so many talented individuals from so many diverse endeavors was stimulating. Doing so and reaching closure in less than four months, mainly summer months, was nothing short of exhilarating.

The participants were reminded often during the planning and execution of this report that the process of creating it was as important as the product itself. If resources for a multidisciplinary initiative in scientific simulation were to be presented to the scientists, mathematicians, and computer scientists required for its successful execution—as individuals distributed across universities and government and industrial laboratories, lacking familiarity with each other's work across disciplinary and sectorial boundaries—the return to the taxpayer certainly would be positive, yet short of its full potential. Investments in science are traditionally made this way, and in this way, sound scientific building blocks are accumulated and tested. But as a pile of blocks does not constitute a house, theory and models from science, methods and algorithms from mathematics, and software and hardware systems from computer science do not constitute a scientific simulation. The scientists, mathematicians, and computer scientists who met at their own initiative to build this report are now ready to build greater things. While advancing science through simulation, the greatest thing they will build is the next generation of computational scientists, whose training in multidisciplinary environments will enable them to see much further and live up to the expectations of Galileo, invoked in the quotation on the title page of this volume.

The three plenary speakers for the workshop during which the input for this report primarily was gathered—Raymond Orbach, Peter Lax, and John Grosh—perfectly set the perspective of legacy, opportunity, and practical challenges for large-scale simulation. In many ways, the challenges we face come from the same directions as those reported by the National Science Board panel chaired by Peter Lax in 1982. However, progress along the path in the past two decades is astounding—literally factors of millions in computer capabilities (storage capacity, processor speed) and further factors of millions from working smarter (better models, optimal and adaptive algorithms). Without the measurable progress of which the speakers reminded us, the aggressive extrapolations of this report might be less credible. With each order-of-magnitude increase in capabilities comes increased ability to understand or control aspects of the universe and to mitigate or improve our lot in it.

The DOE staff members who facilitated this report often put in the same unusual hours as the scientific participants in breakfast meetings, late-evening discussions, and long teleconferences. They did not steer the outcome, but they shared their knowledge of how to get things accomplished in projects that involve hundreds of people; and they were incredibly effective. Many provided assistance along the way, but Dan Hitchcock and Chuck Romine, in particular, labored with the scientific core committee for four months.


Other specialists have left their imprints on this report. Devon Streit of DOE Headquarters is a discussion leader nonpareil. Joanne Stover and Mark Bayless of Pacific Northwest National Laboratory built and maintained web pages, often updated daily, that anchored workshop communication among hundreds of people. Vanessa Jones of Oak Ridge National Laboratory's Washington office provided cheerful hospitality to the core planners. Jody Shumpert of Oak Ridge Institute for Science and Education (ORISE) single-handedly did the work of a conference team in preparing the workshop and a pre-workshop meeting in Washington, D.C. The necessity to shoehorn late-registering participants into Arlington hotels during high convention season, when the response to the workshop outstripped planning expectations, only enhanced her display of unflappability. Rachel Smith and Bruce Warford, also of ORISE, kept participants well directed and well equipped in ten-way parallel breakouts. Listed last, but of first importance for readers of this report, members of the Creative Media group at Oak Ridge National Laboratory iterated tirelessly with the editors in meeting production deadlines and creating a presentation that outplays the content.

The editor-in-chief gratefully acknowledges departmental chairs and administrators at two different universities who freed his schedule of standard obligations and thus directly subsidized this report. This project consumed his last weeks at Old Dominion University and his first weeks at Columbia University and definitely took his mind off of the transition.

All who have labored to compress their (or their colleagues') scientific labors of love into a few paragraphs, or to express themselves with minimal use of equations and jargon, will find ten additional readers for each co-specialist colleague they thereby de-targeted. All involved in the preparation of this report understand that it is a snapshot of opportunities that will soon be outdated in their particulars, and that it may prove either optimistic or pessimistic in its extrapolations. None of these outcomes detracts from the confidence of its contributors that it captures the current state of scientific simulation at a historic inflection point and is worth every sacrifice and risk.


APPENDIX 1: A BRIEF CHRONOLOGY OF "A SCIENCE-BASED CASE FOR LARGE-SCALE SIMULATION"
Under the auspices of the Office of Science of the U.S. Department of Energy (DOE), a group of 12 computational scientists from across the DOE laboratory complex met on April 23, 2003, with 6 administrators from DOE's Washington, D.C., offices and David Keyes, chairman, to work out the organization of a report to demonstrate the case for increased investment in computational simulation. They also planned a workshop to be held in June to gather input from the computational science community to be used to assemble the report. The group was responding to a charge given three weeks earlier by Dr. Walt Polansky, the Chief Information Officer of the Office of Science (see Appendix 2). The working acronym "SCaLeS" was chosen for this initiative.

From the April meeting, a slate of group leaders for 27 breakout sessions (11 in applications, 8 in mathematical methods, and 8 in computer science and infrastructure) was assembled and recruited. The short notice of the chosen dates in June meant that two important sub-communities, devoted to climate research and to Grid infrastructure, could not avoid conflicts with their own meetings. They agreed to conduct their breakout sessions off-site at the same time as the main workshop meeting. Thus 25 of the breakout groups would meet in greater Washington, D.C., and 2 would meet elsewhere on the same dates and participate synchronously in the workshop information flows.

Acting in consultation with their scientific peers, the group leaders drew up lists of suggested invitees to the workshop so as to guarantee a critical mass in each breakout group, in the context of a fully open workshop. Group leaders also typically recruited one junior investigator "scribe" for each breakout session. This arrangement was designed both to facilitate the assembly of quick-look plenary reports from breakouts at the meeting and to provide a unique educational experience to members of the newest generation of DOE-oriented computational scientists.

Group leaders, invitees, and interested members of the community were then invited to register at the SCaLeS workshop website, which was maintained as a service to the community by Pacific Northwest National Laboratory.

A core planning group of approximately a dozen people, including Daniel Hitchcock and Charles Romine of the DOE Office of Advanced Scientific Computing Research, participated in weekly teleconferences to attend to logistics and to organize the individual breakout sessions at the workshop in a way that guaranteed multidisciplinary viewpoints in each discussion.

More than 280 computer scientists, computational mathematicians, and computational scientists attended the June 24 and 25, 2003, workshop in Arlington, Virginia, while approximately 25 others met remotely. The workshop consisted of three plenary talks and three sessions of topical breakout groups, followed by plenary summary reports from each session, and discussion periods. Approximately 50% of the participant-contributors were from DOE laboratories, approximately 35% from universities, and the remainder from other agencies or from industry.

In the plenary talks, Raymond Orbach, director of the DOE Office of Science, charged the participants to document the scientific opportunities at the frontier of simulation. He also reviewed other Office of Science investments in simulation, including a new program at the National Energy Research Scientific Computing Center that allocates time on the 10 Tflop/s Seaborg machine for grand challenge runs. Peter D. Lax of the Courant Institute, editor of the 1982 report Large-Scale Computing in Science and Engineering, provided a personal historical perspective on the impact of that earlier report and advised workshop participants that they have a similar calling in a new computational era. To further set the stage for SCaLeS, John Grosh, co-chair of the High-End Computing Revitalization Task Force (HECRTF), addressed the participants on the context of this interagency task force, which had met the preceding week. (While the SCaLeS workshop emphasized applications, the HECRTF workshop emphasized computer hardware and software.) Finally, Devon Streit of DOE's Washington staff advised the participants on how to preserve their focus in time-constrained multidisciplinary discussion groups so as to arrive at useful research roadmaps.

From the brief breakout group plenary reports at the SCaLeS workshop, a total of 53 co-authors drafted 27 chapters documenting scientific opportunities through simulation and how to surmount barriers to realizing them through collaboration among scientists, mathematicians, and computer scientists. Written under an extremely tight deadline, these chapters were handed off to an editorial team consisting of Thom Dunning, Phil Colella, William Gropp, and David Keyes. This team likewise worked under an extremely tight schedule, iterating with authors to encourage and enforce a measure of consistency while allowing for variations that reflect the intrinsically different natures of simulation throughout science.

Following a heroic job of technical production by Creative Media at Oak Ridge National Laboratory, A Science-Based Case for Large-Scale Simulation is now in the hands of the public that is expected to be the ultimate beneficiary not merely of an interesting report but of the scientific gains toward which it points.


APPENDIX 2: CHARGE FOR "A SCIENCE-BASED CASE FOR LARGE-SCALE SIMULATION"

Date: Wed, 2 Apr 2003 09:49:30 -0500
From: "Polansky, Walt" <Walt.Polansky@science.doe.gov>
Subject: Computational sciences—Exploration of New Directions

Distribution:

I am pleased to report that David Keyes (keyes@cs.odu.edu), of Old Dominion University,
has agreed to lead a broad-based effort on behalf of the Office of Science to identify rich and
fruitful directions for the computational sciences from the perspective of scientific and
engineering applications. One early, expected outcome from this effort will be a strong
science case for an ultra-scale simulation capability for the Office of Science.

David will be organizing a workshop (early summer 2003, Washington, D.C. metro area) to
formally launch this effort. I expect this workshop will address the major opportunities and
challenges facing computational sciences in areas of strategic importance to the Office of
Science. I envision a report from this workshop by July 30, 2003. Furthermore, this
workshop should foster additional workshops, meetings, and discussions on specific topics
that can be identified and analyzed over the course of the next year.

Please mention this undertaking to your colleagues. I am looking forward to their support
and their participation, along with researchers from academia and industry, in making this
effort a success.

Thanks.

Walt P.


ACRONYMS AND ABBREVIATIONS

ASCI Accelerated Strategic Computing Initiative


ASCR Office of Advanced Scientific Computing Research (DOE)
BER Office of Biology and Environmental Research (DOE)
BES Office of Basic Energy Sciences (DOE)
DOE U.S. Department of Energy
FES Office of Fusion Energy Sciences (DOE)
FY fiscal year (federal)
GCM general circulation model
HECRTF High-End Computing Revitalization Task Force
HENP Office of High Energy and Nuclear Physics (DOE)
I/O input/output
ISIC Integrated Software Infrastructure Center
ITER International Thermonuclear Experimental Reactor
M3D a fusion simulation code
NIMROD a fusion magnetohydrodynamics simulation code
NNSA National Nuclear Security Administration (DOE)
NSF National Science Foundation
NWCHEM a massively parallel computational chemistry code
ORISE Oak Ridge Institute for Science and Education
PETSc a portable toolkit of sparse solvers
PITAC President's Information Technology Advisory Committee
QCD quantum chromodynamics
RHIC Relativistic Heavy Ion Collider (Brookhaven National Laboratory)
SC Office of Science (DOE)
SciDAC Scientific Discovery through Advanced Computing
TSI Terascale Supernova Initiative



“We can only see a short distance ahead,
but we can see plenty there that needs to be done.”

Alan Turing (1912–1954)
