Professional Documents
Culture Documents
Dear Colleague:
Sincerely,
-i-
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
PREFACE
The National Science Foundation’s Cyberinfrastructure Council (CIC)1, based on extensive input from
the research community, has developed a comprehensive vision to guide the Foundation’s future invest-
ments in cyberinfrastructure (CI). In 2005, four multi-disciplinary, cross-foundational teams were created
and charged with drafting a vision for cyberinfrastructure in four overlapping and complementary areas:
1) High Performance Computing, 2) Data, Data Analysis, and Visualization, 3) Cyber Services and Vir-
tual Organizations, and 4) Learning and Workforce Development. Draft versions of the document were
posted on the NSF website and public comments were solicited from the community. These drafts were
also reviewed for comment by the National Science Board. The National Science Foundation thanks all
of those who provided feedback on the Cyberinfrastructure Vision for 21st Century Discovery document.
Your comments were carefully reviewed and considered during preparation of this version of the docu-
ment, which is intended to be a living document, and will be updated periodically.
ACKNOWLEDGEMENTS
We acknowledge the following NSF personnel who served on the strategic planning teams and whose
efforts made this document possible. We especially acknowledge Deborah Crawford, who served as acting
director for OCI from July 2005 to June 2006, and whose leadership was instrumental in the formulation
of this document.
High Performance Computing (HPC) CI Team: Deborah Crawford (Chair), Leland Jameson, Margaret
Leinen (CIC Representative), José Muñoz, Stephen Meacham, Michael Plesniak
Data CI Team: Cheryl Eavey, James French, Christopher Greer, David Lightfoot (CIC Representative),
Elizabeth Lyons, Fillia Makedon, Daniel Newlon, Nigel Sharp, Sylvia Spengler (Chair)
Virtual Organizations (VO) CI Team: Thomas Baerwald, Elizabeth Blood, Charles Boudin, Arthur
Goldstein, Joy Pauschke (Co-Chair), Randal Ruchti, Bonnie Thompson, Kevin Thompson (Co-Chair),
Michael Turner (CIC Representative)
Learning and Workforce Development (LWD) CI Team: James Collins (CIC Representative), Janice
Cuny, Semahat Demir, Lloyd Douglas, Debasish Dutta (Chair), Miriam Heller, Sally O’Connor, Michael
Smith, Harold Stolberg, Lee Zia
1
Complete list of acronyms can be found in Appendix A.
- ii -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
TABLE OF CONTENTS
LETTER FROM THE DIRECTOR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . i
PREFACE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
ACKNOWLEDGEMENTS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
EXECUTIVE SUMMARY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1 CALL TO ACTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
I. Cyberinfrastructure Drivers and Opportunities . . . . . . . . . . . . . . . . . . . . . 5
II. Vision, Mission and Principles for Cyberinfrastructure . . . . . . . . . . . . . . . . . 6
III. Goals and Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
IV. Planning for Cyberinfrastructure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
APPENDICES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
A. Acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .43
B. Representative Reports and Workshops . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
C. Chronology of NSF Information Technology Investments . . . . . . . . . . . . . . . .48
D. Management of Cyberinfrastructure . . . . . . . . . . . . . . . . . . . . . . . . . . . .49
E. Representative Distributed Research Communities (Virtual Organizations) . . . . . .50
IMAGE CREDITS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
- iii -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
EXECUTIVE SUMMARY
NSF’s Cyberinfrastructure Vision for 21st High Performance Computing (HPC) in sup-
Century Discovery is presented in a set of interre- port of modeling, simulation, and extraction of
lated chapters that describe the various challenges knowledge from huge data collections is increas-
and opportunities in the complementary areas that ingly essential to a broad range of scientific and
make up cyberinfrastructure: computing systems, engineering disciplines, often multi-disciplinary
data, information resources, networking, digitally (e.g. physics, biology, medicine, chemistry, cos-
enabled-sensors, instruments, virtual organiza- mology, computer science, mathematics), as well
tions, and observatories, along with an interop- as multi-scalar in dimensions of space (e.g., nano-
erable suite of software services and tools. This meters to light-years) time (e.g., picoseconds1 to
technology is complemented by the interdisciplin- billions of years), and complexity. A vision for
ary teams of professionals that are responsible for petascale2 science and engineering for the aca-
its development, deployment and its use in trans- demic community, enabled by high performance
formative approaches to scientific and engineering computing, is presented along with a series of
discovery and learning. The vision also includes principles that would be used to guide NSF sci-
attention to the educational and workforce initia- ence-driven HPC investments. This would result
tives necessary for both the creation and effective in a sustained petascale capable system deployed
use of cyberinfrastructure. in the FY 2010 timeframe. The plan presented
addresses HPC acquisition and deployment and
The five chapters of this document set out various aspects of HPC software and tools, in
NSF’s cyberinfrastructure vision. The first, A Call addition to the necessary scalable applications that
for Action, presents NSF’s vision and commit- would execute on these HPC assets.
ment to a cyberinfrastructure initiative. NSF will
play a leadership role in the development and An effective computing environment designed
support of a comprehensive cyberinfrastructure to meet the computational needs of a range of
essential to 21st century advances in science and science and engineering applications will include a
engineering research and education. The vision fo- variety of computing systems with complementary
cuses on a time frame of 2006-2010. The mission performance capabilities. NSF will invest in lead-
is for cyberinfrastructure to be human-centered, ership class environments in the 0.5-10 petascale
world-class, supportive of broadened participa- performance range. Strong partnerships involv-
tion in science and engineering, sustainable, and ing other federal agencies, universities, industry
stable but extensible. The guiding principles are and state government are also critical to success.
that investments will be science-driven, recognize NSF will also promote resource sharing between
the uniqueness of NSF’s role, provide for inclusive and among academic institutions to optimize the
strategic planning, enable U.S. leadership in sci- accessibility and use of HPC assets deployed and
ence and engineering, promote partnerships and supported at the campus level. Supporting soft-
integration with investments made by others in all ware services include the provision of intelligent
sectors, both national and international, and rely development and problem-solving environments
on strong merit review and on-going assessment, and tools. These tools are designed to provide im-
and a collaborative governance culture. This chap- provements in ease of use, reusability of modules,
ter goes on to review a set of more specific goals and portable performance.
and strategies for NSF’s cyberinfrastructure initia-
tive along with brief descriptions of the strategy to
1
A picosecond is 10-12 second
achieve those goals.
2
A petascale is 1015 operations per second with
comparable storage and networking capacity
The image shows computed charge density for iron oxide (FeO) within the local density approximation, with
spherical ions subtracted. The colors represent the spin density, showing the antiferromagnetic ordering.
-1-
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
Data, Data Analysis, and Visualization are ment of a system of science and engineering data
vital for progress in the increasingly data-intensive collections that is open, extensible, and evolvable;
realm of science and engineering research and and to support development of a new generation
education. Any cogent plan addressing cyberinfra- of tools and services for data discovery, integra-
structure must address the phenomenal growth of tion, visualization, analysis and preservation. The
data in all its various dimensions. Scientists and resulting national digital data framework will be
engineers are producing, accessing, analyzing, in- an integral component in the national cyberin-
tegrating, storing and retrieving massive amounts frastructure framework. It will consist of a range
of data daily. Further, this is a trend that is of data collections and managing organizations,
expected to see significant growth in the very near networked together in a flexible technical architec-
future as advances in sensors and sensor networks, ture using standard, open protocols and interfaces,
high-throughput technologies and instrumenta- and designed to contribute to the emerging global
tion, automated data acquisition, computational information commons. It will be simultaneously
modeling and simulation, and other methods and local, regional, national and global in nature, and
technologies materialize. The anticipated growth will evolve as science and engineering research and
in both the production and repurposing of digital education needs change and as new science and
data raises complex issues not only of scale and engineering opportunities arise.
heterogeneity, but also of stewardship, curation
and long-term access. Virtual Organizations for Distributed Com-
munities, built upon cyberinfrastructure, enable
Responding to the challenges and opportunities science and engineering communities to pursue
of a data-intensive world, NSF will pursue a vi- their research and learning goals with dramatically
sion in which science and engineering digital data relaxed constraints of time and distance. A virtual
are routinely deposited in well-documented form, organization is created by a group of individuals
are regularly and easily consulted and analyzed whose members and resources may be dispersed
by specialist and non-specialist alike, are openly geographically and/or temporally, yet who func-
accessible while suitably protected, and are reliably tion as a coherent unit through the use of end-to-
preserved. To realize this vision, NSF’s goals for end cyberinfrastructure systems. These CI systems
2006-2010 are twofold: to catalyze the develop- provide shared access to centralized or distributed
Researchers create cyberenvironments–secure, easy-to-use interfaces to instruments, data, computing systems, networks,
applications, analysis and visualization tools, and services.
-2-
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
resources and services, often in real-time. Such laboratories, and collaboration tools. Collabo-
virtual organizations supporting distributed com- ratories or science gateways (instances of virtual
munities go by numerous names: collaboratory, organizations) created by research communities
co-laboratory, grid community, science gateway, will also offer participation in authentic inquiry-
science portal, and others. As such environments based learning. These new modes and opportuni-
become more and more functionally complete ties to learn and to teach, covering K-12, post-
they offer new organizations for discovery and secondary, the workforce and the general public,
learning and bold new opportunities for broad- come with their own set of opportunities and
ened participation in science and engineering. challenges. New assessment techniques will have
Creating and sustaining effective virtual to be developed and understood; undergraduate
organizations, especially those spanning many curricula must be reinvented to fully exploit the
traditional organizations, is a complex technical capabilities made possible by cyberinfrastructure;
and social challenge. It requires an open tech- and the education of the professionals that are
nological framework consisting of, for example, being relied upon to support, develop and deploy
applications, tools, middleware, remote access to future generations of cyberinfrastructure must be
experimental facilities, instruments and sensors, addressed. In addition, cyberinfrastructure will
as well as monitoring and post-analysis capabili- have an impact on how business will be conducted
ties. An operational framework from campus level and members of the workforce must have the
to international scale is required, as well as a need capability to fully exploit the benefits afforded by
for partnerships between the various cyberinfra- these new technologies.
structure stakeholders. Overall effectiveness also
depends upon the appropriate social, governance, Cyberinfrastructure-enhanced discovery and
legal, economic and incentive structures. Forma- learning is especially exciting because of the op-
tive and longitudinal evaluation is also neces- portunities it affords for broadened participation
sary both to inform iterative design as well as to and wider diversity along individual, geographical
develop understanding of the impact of virtual and institutional dimensions. To fully realize these
organizations on enhancing the effectiveness of opportunities NSF will identify and address the
discovery and learning. barriers to utilization of cyberinfrastructure tools,
services, and resources; promote the training of
Learning and Workforce Development op- faculty, educators, students, researchers and the
portunities and requirements recognize that the public; and encourage programs that will explore
ubiquitous and interconnected nature of cyberin- and exploit cyberinfrastructure, including taking
frastructure will change not only how we teach but advantage of the international connectivity it
also how we learn. The future will see increas- provides - particularly important as we prepare a
ingly open access to online educational resources globally engaged workforce.
including courseware, knowledge repositories,
-3-
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
CHAPTER 1
CALL TO ACTION
I. C D at the frontiers of knowledge can carry out their
O work without cyberinfrastructure of one form or
another.
How does a protein fold? What happens to
While hardware performance has been growing
space-time when two black holes collide? What
exponentially – with gate density doubling every
impact does species gene flow have on an ecologi-
18 months, storage capacity every 12 months,
cal community? What are the key factors that
and network capability every 9 months – it has
drive climate change? Did one of the trillions of
become clear that increasingly capable hardware is
collisions at the Large Hadron Collider produce a
not the only requirement for computation-enabled
Higgs boson, the dark matter particle, or a black
discovery. Sophisticated software, visualization
hole? Can we create an individualized model of
tools, middleware and scientific applications cre-
each human being for personalized health care
ated and used by interdisciplinary teams are criti-
delivery? How does major technological change
cal to turning flops, bytes and bits into scientific
affect human behavior and structure complex so-
breakthroughs. In addition to these technical
cial relationships? What answers will we find – to
needs, the exploration of new organizational mod-
questions we have yet to ask – in the very large
els and the creation of enabling policies, processes,
datasets that are being produced by telescopes,
and economic frameworks are also essential. The
sensor networks, and other experimental facilities?
combined power of these capabilities and ap-
proaches is necessary to advance the frontiers of
These questions – and many others – are only
science and engineering, make seemingly intrac-
now coming within our ability to answer because
table problems solvable, and pose profound new
of advances in computing and related informa-
scientific questions.
tion technology. Once used by a handful of elite
researchers in a few research communities on
The comprehensive infrastructure needed to
select problems, advanced computing has become
capitalize on dramatic advances in information
essential to future progress across the frontier of
technology has been termed cyberinfrastructure
science and engineering. Coupled with continu-
(CI). Cyberinfrastructure integrates hardware for
ing improvements in microprocessor speeds,
computing, data and networks, digitally-enabled
converging advances in networking, software, visu-
sensors, observatories and experimental facilities,
alization, data systems and collaboration platforms
and an interoperable suite of software and middle-
are changing the way research and education are
ware services and tools. Investments in interdiscip-
accomplished.
linary teams and cyberinfrastructure professionals
Today’s scientists and engineers need access to with expertise in algorithm development, system
new information technology capabilities, such as operations, and applications development are
distributed wired and wireless observing network also essential to exploit the full power of cyberin-
complexes, and sophisticated simulation tools frastructure to create, disseminate, and preserve
that permit exploration of phenomena that can scientific data, information and knowledge.
never be observed or replicated by experiment.
Computation offers new models of behavior and For four decades, NSF has provided leader-
modes of scientific discovery that greatly extend ship in the scientific revolution made possible
the limited range of models that can be produced by information technology (Appendices B and
with mathematics alone – for example, chaotic C). Through investments ranging from super-
behavior. Fewer and fewer researchers working computing centers and the Internet to software
and algorithm development, information tech-
The Terashake 2.1 simulation on the opposite page depicts a velocity wavefield as it propagates through the 3D
velocity structure beneath Southern California. Red and yellow colors indicate regions of compression, while
blue and green colors show regions of dilation. Faint yellow (faults), red (roads), and blue (coast-line) lines add
geographical context.
-5-
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
nology has stimulated scientific breakthroughs the U.S. must engage in and actively benefit from
across all science and engineering fields. Most cyberinfrastructure developments around the
recently, NSF’s Information Technology Research world.
(ITR) priority area sowed the seeds of broad and
intensive collaboration among the computational, Not only is the time ripe for a coordinated
computer, and domain research communities that investment in cyberinfrastructure, but progress at
sets the stage for this “Call to Action.” the science and engineering frontiers depends on
it. Our communities are in place and are poised
NSF is the only agency within the U.S. govern- to respond to such an investment.
ment that funds research and education across all
disciplines of science and engineering. Over the Working with the science and engineering re-
past five years, NSF has held community work- search and education communities and partnering
shops, commissioned blue-ribbon panels, and car- with other key stakeholders, NSF is ready to lead.
ried out extensive internal planning (Appendix B).
Thus, it is strategically placed to leverage, coordi-
nate and transition cyberinfrastructure advances in
II. V, M P
one field to all fields of research. C
A. Vision
Other federal agencies, the administration,
Congress, the private sector, and other nations are
aware of the growing importance of cyberinfra- NSF will play a leadership role in the develop-
structure to progress in science and engineering. ment and support of a comprehensive cyberin-
Other federal agencies have planned improved frastructure essential to 21st century advances
capabilities for specific disciplines, and in some in science and engineering research and educa-
cases to address interdisciplinary challenges. tion.
Other countries have also been making significant
progress in scientific cyberinfrastructure. Thus, B. Mission
-6-
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
Robert Patterson demonstrates NCSA’s 3D Visualization to Dr. Across the CI landscape, NSF will:
Arden Bement, the Director of NSF, and others during the FY08
NSF budget roll-out. • Provide communities addressing the most
computationally challenging problems with
tively address CI needs across a broad spectrum access to a world-class, high performance
of organizations, institutions, communities and computing (HPC) environment through NSF
individuals, with input to the process provided acquisition and through exchange-of-service
through public comments, workshops, funded agreements with other entities, where pos-
studies, advisory committees, merit review and sible.
open competitions.
• Strategic investments in CI resources and NSF’s investment strategy for the provision of
services coupled with enabling policy and orga- CI resources and services will be linked to careful
nizational framework are essential to continued requirements analyses of the computational needs
U.S. leadership in science and engineering. of research and education communities. NSF
investments will be coordinated with those of
• The integration and sharing of cyberinfrastruc-
other agencies in order to maximize access to these
ture assets deployed and supported at national,
capabilities and to provide a range of representa-
regional, local, community and campus levels
tive high performance architectures.
represent the most effective way of construct-
ing a comprehensive CI ecosystem suited to
• Broaden access to state-of-the-art computing
meeting future needs.
resources, focusing especially on institutions
• Public and private national and international with less capability and communities where
partnerships that integrate CI users and provid- computational science is an emerging activ-
ers and benefit NSF’s research and education ity.
communities are also essential for enabling
-7-
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
NCSA’s Cobalt computing system uses a 3D cylindrical configuration to model the sediment discharge of a river into the
ocean and the initial stages of alluvial fan formation at the river’s mouth.
-8-
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
These investments will build on the software will take advantage of the emerging communities
products of current and former programs, and will associated with CI that provide unique and special
leverage work in core computer science research opportunities for broadening participation in the
and development efforts supported by NSF and science and engineering enterprise.
other federal agencies.
• Support state-of-the-art innovation in data
• Support the development of the comput- management and distribution systems,
ing professionals, interdisciplinary teams, including digital libraries and educational
enabling policies and procedures, and new environments that are expected to contribute
organizational structures such as virtual to many of the scientific breakthroughs of the
organizations, that are needed to achieve the 21st century.
scientific breakthroughs made possible by
advanced CI, paying particular attention to NSF will foster communication among fore-
opportunities to broaden the participation of front data management and distribution systems,
underrepresented groups. digital libraries, and other education environments
sponsored in its various directorates. NSF will
NSF will continue to improve its under- ensure that its efforts take advantage of innova-
standing of how participants in its research and tion in large data management and distribution
education communities, as well as the scientific activities sponsored by other agencies and through
workforce, can use CI. For example, virtual international efforts. These developments will play
organizations empower communities of users to a critical role in decisions that NSF makes about
interact, exchange information, and access and stewardship of long-lived data.
share resources through tailored interfaces. Some
of NSF’s investments will focus on appropriate
mechanisms or structures for use, while others will • Support the design and development of the
focus on how best to train future users of CI. NSF CI needed to realize the full scientific poten-
The DANSE project at CalTech integrates new materials theory with high-performance computing, using data from facili-
ties such as DOE’s new Spallation Neutron Source in Oak Ridge, TN.
-9-
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
tial of NSF’s investments in tools and large among computer scientists; social, behavioral and
facilities, from observatories and accelera- economic scientists; and other domain scientists
tors to sensor networks and remote observing and engineers to understand how humans can best
systems. use CI, in both research and education environ-
ments.
NSF’s investments in large facilities and other
tools require new types of CI − such as wireless • Provide a framework that will sustain
control of networks of sensors in hostile environ- reliable, stable resources and services while
ments, rapid distribution and analysis of petascale enabling the integration of new technologies
data sets around the world, adaptive knowledge- and research developments with a minimum
based control and sampling systems, and innova- of disruption to users.
tive visualization systems for collaboration. NSF
will ensure that these projects invest appropriately NSF will minimize disruption to users by real-
in CI capabilities, promoting the integrated and izing a comprehensive CI with an architecture and
widespread use of the unique services provided by framework that emphasizes interoperability and
these and other facilities. In addition, NSF’s CI open standards, thus providing flexibility for up-
programs will be designed to serve the needs of grades, enhancements and evolutionary changes.
these projects. Pre-planned arrangements for alternative CI avail-
abilities during competitions, changeovers and
• Support the development and maintenance upgrades to production operations and services
of the increasingly sophisticated applica- will be made, including cooperative arrangements
tions needed to achieve the scientific goals of with other agencies.
research and education communities.
A strategy common to achieving all of these
The applications needed to produce cutting- goals is partnering nationally and internation-
edge science and engineering have become ally, with other agencies, the private sector, and
increasingly complex. They require teams, even with universities to achieve a worldwide CI that
communities, to develop and sustain wide and is interoperable, flexible, efficient, evolving and
long-term applicability, and they leverage under- broadly accessible. In particular, NSF will take
lying software tools and increasingly common, a lead role in formulating and implementing a
persistent CI resources such as data repositories national CI strategy.
and authentication and authorization services.
NSF’s investments in applications will involve its IV. P C
directorates and offices that support domain-spe-
cific science and engineering. Special attention will To implement its cyberinfrastructure vision,
be paid to the cross-disciplinary nature of much of NSF will develop interdependent plans for each
the work. of the following aspects of CI, with emphasis on
their integration to create a balanced science- and
• Invest in the high-risk/high-gain basic re- engineering-driven national CI:
search in computer science, computing and
storage devices, mathematical algorithms, • High Performance Computing
and the human/CI interfaces that are critical
to powering the future exponential growth in • Data, Data Analysis, and Visualization
all aspects of computing, including hardware • Virtual Organizations for Distributed
speed, storage, connectivity and scientific Communities, and
productivity.
• Learning and Workforce Development.
- 10 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
blending of these components. This will require These plans will be reviewed annually and will
integrative management structures (such as the evolve over time, paced by the considerable rate
newly formed Office of Cyberinfrastructure, the of innovation in computing and communica-
NSF-wide Cyberinfrastructure Council, and the tion, and by the growing needs of the science and
Cyberinfrastructure Coordinators’ Committee), as engineering community for state-of-the-art CI
well as science-driven, community-based capabilities. Through cycles of use-driven innova-
planning and implementation processes that tion, NSF’s vision will become reality.
span all the elements of a truly comprehensive CI
framework.
Researchers upgrade the software of an automated weather station that transmits data to help track the
iceberg’s position in the Antarctic and reports on the microclimate of the ice surface.
- 11 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
CHAPTER 2
HIGH PERFORMANCE COMPUTING
(2006-2010)
I. W D H P infrastructure systems? What kind of language
C O S processing can occur in large assemblages of
E neurons? Can we enable integrated planning and
response to natural and man-made disasters that
prevent or minimize the loss of life and property?
What are the three-dimensional structures of all
These are just some of the important questions
of the proteins encoded by the human genome,
that researchers wish to answer using contempo-
and how does structure influence their function in
rary tools in a state-of-the-art High Performance
a human cell? What patterns of emergent behav-
Computing (HPC) environment.
ior occur in models of very large societies? How
do massive stars explode and produce the heavi-
Using HPC-based applications, researchers
est elements in the periodic table? What sort of
study the properties of minerals at the extreme
abrupt transitions can occur in Earth’s climate and
temperatures and pressures that occur deep within
ecosystem structure? How do these transitions
the Earth. They simulate the development of
occur, and under what circumstances? If we could
structure in the early Universe. They probe the
design catalysts atom-by-atom, could we trans-
structure of novel phases of matter such as the
form industrial synthesis? What strategies might
quark-gluon plasma. HPC capabilities enable the
be developed to optimize management of complex
modeling of life cycles that capture interdependen-
The visualization above, created from data generated by a tornado simulation calculated on the NCSA computing cluster,
shows the tornado by spheres colored according to pressure. Orange and blue tubes represent the rising and falling
airflow around the tornado.
NCAR’s blueice supercomputer, shown on the opposite page, enables scientists to enhance the resolution and complexity
of Earth system models, improve climate and weather research, and provide more accurate data to decision makers.
- 13 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
cies across diverse disciplines and multiple scales of buildings and bridges. HPC is essential to
to create globally competitive manufacturing extracting the signature of the Higgs boson and
enterprise systems. And they examine the way supersymmetric particles – two of the scientific
proteins fold and vibrate after they are synthesized drivers of the Large Hadron Collider – from the
inside an organism. In fact, sophisticated numeri- petabytes of data produced in the trillions of
cal simulations permit scientists and engineers to particle collisions.
perform a wide range of in silico experiments that
would otherwise be too difficult, too expensive, or Science and engineering research and educa-
impossible to perform in the laboratory. tion enabled by state-of-the-art HPC tools have
a direct bearing on the nation’s competitiveness.
HPC systems and services are also essential to If investments in HPC are to have a long-term
the success of research conducted with sophisti- impact on problems of national need, such as
cated experimental tools. Without the waveforms bioengineering, critical infrastructure protection
produced by the numerical simulation of black (for example, the electric power grid), health care,
hole collisions and other astrophysical events, manufacturing, nanotechnology, energy, and
gravitational wave signals cannot be extracted transportation, then HPC tools must deliver high
from the data produced by the Laser Interferom- performance capability for a wide range of science
eter Gravitational Wave Observatory. High-resolu- and engineering applications.
tion seismic inversions from the higher density
of broad-band seismic observations furnished by
the Earthscope project are necessary to determine
shallow and deep Earth structure. Simultaneous
integrated computational and experimental test-
ing is conducted on the Network for Earthquake
Engineering Simulation to improve seismic design
A functioning ribosome, a complex of three large RNA molecules and fifty proteins with three million atoms, is
simulated on the Texas Advanced Computing Center computer.
- 14 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
- 15 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
peak performance in the 50-500 teraflops range, Over the FY 2006-2010 period, NSF will focus
deployed and supported at the local level by indi- on HPC system acquisitions in the 100 teraflops
vidual campuses and other research organizations; to 10 petaflops range, where strategic investments
(ii) multiple systems with peak performance of on a national scale are necessary to ensure inter-
500+ teraflops that support the work of thousands national leadership in science and engineering.
of researchers nationally; and, (iii) at least one Since different science and engineering codes may
system capable of delivering sustained perfor- achieve optimal performance on different HPC
mance approaching 1015 floating point operations architectures, it is likely that by 2010 the NSF-
per second on real applications that consume large supported HPC environment will include both
amounts of memory, and/or that work with very loosely coupled and tightly coupled systems, with
large data sets projects that demand the highest several different memory models.
levels of computing performance. All NSF-de-
ployed systems will be appropriately balanced and To address the challenge of providing the
will include core computational hardware, local research community with access to a range of
storage of sufficient capacity, and appropriate data HPC architectures within a constrained budget,
analysis and visualization capabilities. a key element of NSF’s strategy is to participate
in resource-sharing with other federal agencies. A
This numerical simulation, created on the NCSA Itanium Linux Cluster by international researchers, shows
the merger of two black holes and the ripples in space time that are born of the merger.
- 16 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
strengthened interagency partnership will focus, to 2). Development and Maintenance of Sup-
the extent practicable, on ensuring shared access porting Software: New Design Tools, Perfor-
to federal leadership-class resources with different mance Modeling Tools, Systems Software, and
architectures, and on the coordination of invest- Fundamental Algorithms.
ments in HPC system acquisition and operation.
The Department of Energy’s Office of Science Many of the HPC software and service building
and National Nuclear Security Administration blocks in scientific computing are common to a
have very active programs in leadership comput- number of science and engineering applications.
ing. The Department of Defense’s (DOD) High A supporting software and service infrastructure
Performance Computing Modernization Office will accelerate the development of the scientific
(HPCMOD) provides HPC resources and services application codes needed to solve challenging
for the DOD science and engineering community, scientific problems, and will help insulate these
while NASA is deploying significant comput- codes from the evolution of future generations of
ing systems that are also of interest to NSF PIs. HPC hardware.
NSF will explore enhanced coordination mecha-
nisms with other appropriate federal agencies to Supporting software services include the
capitalize on their common interests. It will seek provision of intelligent development and prob-
opportunities to make coordinated and collab- lem-solving environments and tools. These tools
orative investments in science-driven hardware are designed to provide improvements in ease of
architectures in order to increase the diversity of use, reusability of modules, and portable perfor-
architectures of leadership class systems available mance. Tools and services that take advantage of
to researchers and educators around the country, commonly-supported software tools can deliver
to promote sharing of lessons learned, and to similar work environments across different HPC
provide a richer HPC environment for the user platforms, greatly reducing the time-to-solution
communities supported by each agency. of computationally-intensive research problems
by permitting local development of research
Strong partnerships involving universities, in- codes that can then be rapidly transferred to, or
dustry and government are also critical to success. incorporate services provided by, larger production
NSF will also promote resource sharing between
and among academic institutions to optimize the
accessibility and use of HPC assets deployed and
supported at the campus level.
- 17 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
The development and deployment of operating To minimize duplication of effort and optimize
systems and compilers that scale to hundreds of the value of HPC services provided to the science
thousands of processors are also necessary. They and engineering community, NSF’s investments
must provide effective fault-tolerance and effec- will be coordinated with those of other agencies.
tively insulate users from parallelization, as well as DOE currently invests in software infrastructure
provide protection from latency management and centers through the Scientific Discovery through
thread management issues. To test new develop- Advanced Computing (SciDAC) program, while
ments at large scales, operating systems and kernel DARPA’s investments in the HPCS program con-
researchers and developers must have access to the tribute significant systems software and hardware
infrastructure necessary to test their developments innovations. NSF will seek to leverage and add
at scale. value to ongoing DOE and DARPA efforts in this
area.
The software provider community will be a
source for: applied research and development
of supporting technologies; harvesting promis-
ing supporting software technologies from the
research communities; performing scalabil-
ity/reliability tests to explore software viability;
developing, hardening and maintaining software
where necessary; and facilitating the transition of
commercially viable software into the private sec-
tor. It is anticipated that this community will also
support general software engineering consulting
services for science and engineering applications,
and will provide software engineering consulting
support to individual researchers and research and
- 18 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
Two skulls of separate species of pterosaurs were scanned at the High-Resolution X-ray Computed Tomo-
graphy Facility at The University of Texas at Austin and the data was then fed to DigiMorph digital library
to produce 2-D and 3-D structural visualizations.
3). Development and Maintenance of Por- tion of sound software engineering approaches
table, Scalable Applications Software in existing widely-used research codes and in the
development of new research codes. Multidiscip-
Today’s microprocessor-based terascale comput- linary teams of researchers will work together
ers place considerable demands on our ability to to create, modify and optimize applications for
manage parallelism, and to deliver large fractions current and future systems using performance
of peak performance. As the agency seeks to modeling tools and simulators.
create a petascale computing environment, it will
embrace the challenge of developing or converting Since the nature and genesis of science and
key application codes to run effectively on new engineering codes varies across the research land-
and evolving system architectures. scape, a successful programmatic effort in this area
Over the FY 2006 through 2010 period, NSF will weave together several strands. A new activity
will make significant new investments in the de- will be designed to take applications that have the
velopment, hardening, enhancement and mainte- potential to be widely used within a community or
nance of scalable applications software, including communities, to harden these applications based
community models, to exploit the full potential of on modern software engineering practices, to
current terascale and future petascale systems ar- develop versions for the range of architectures that
chitectures. The creation of well-engineered, easy- scientists wish to use them on, to optimize them
to-use software will reduce the complexity and for modern HPC architectures, and to provide
time-to-solution of today’s challenging scientific user support.
applications. NSF will promote the incorpora-
- 19 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
CHAPTER 3
DATA, DATA ANALYSIS, AND VISUALIZATION
(2006-2010)
I. A W S transformation of research outcomes into products
O A and services, and enhancing the effectiveness of
D D learning across the spectrum of human endeavor.
An artist’s conception (above) depicts fundamental NEON observatory instrumentation and systems as well as
potential spatial organization of the environmental measurements made by these instruments and systems.
The image on the opposite page shows the action of the enzyme cellulase on cellulose using the CHARMM
community code in a simulation carried out at SDSC. NREL will use the simulation to help develop strategies for
efficient large-scale conversion of biomass into ethanol.
- 21 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
II. D
- 22 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
- 23 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
mediate and long-term storage of digital data rel- other individuals within their organizations. The
evant to space missions. This effort produced the technological implementations of these systems
Open Archival Information System (OAIS), now are often open-source and support interoperability
adopted as the “de facto” standard for building among their adopters. University-based research
digital archives, and provided evidence that a com- libraries and research librarians are positioned to
munity-focused activity can have much broader make significant contributions in this area, where
impact than originally intended. In another standard mechanisms for access and maintenance
example, the Inter-University Consortium for of scientific digital data may be derived from exist-
Political and Social Research (ICPSR) - a member- ing library standards developed for print material.
ship-based organization with over 500 member These efforts are particularly important to NSF
colleges and universities around the world - main- as the agency considers the implications of not
tains and provides access to a vast archive of social only making all data generated with NSF fund-
science data. ICPSR serves as a content manage- ing broadly accessible, but of also promoting the
ment organization, preserving relevant social responsible organization and management of these
science data and migrating them to new storage data so that they are widely usable.
media as technology changes, and also provides
user support services. ICPSR recently announced
plans to establish an international standard for
IV. T N F Y: T
social science documentation. Similar activities N D D F
in other communities are also underway. Clearly,
NSF must maintain a presence in, support, and Motivated by a vision in which science and
add value to these ongoing international discus- engineering digital data are routinely deposited
sions and activities. in well-documented form, are regularly and easily
consulted and analyzed by specialists and non-spe-
Activities on an international scale are comple- cialists alike, are openly accessible while suitably
mented by activities within nation states. In the protected, and are reliably preserved, NSF’s five-
United States, a number of organizations and year goal is twofold:
communities of practice are exploring mechanisms
to establish common approaches to digital data ac- • To catalyze the development of a system of
cess, management and curation. For example, the science and engineering data collections that is
Research Library Group (RLG – a not-for-profit open, extensible and evolvable; and
membership organization representing libraries, • To support development of a new generation
archives and museums) and the U.S. National of tools and services facilitating data mining,
Archives and Records Administration (NARA – a
sister agency whose mission is to provide direction
and assistance to federal agencies on records man-
agement) are producing certification requirements
for establishing and selecting reliable digital infor-
mation repositories. RLG and NARA intend their
results to be standardized via the International
Organization of Standardization (ISO) Archiving
Series, and may impact all data collections types.
The National Institutes of Health (NIH) National
Center for Biotechnology Information plays an
important role in the management of genome data
at the national level, supporting public databases,
developing software tools for analyzing data, and
disseminating biomedical information.
- 24 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
integration, analysis, and visualization essen- • Science and engineering research and educa-
tial to turning data into new knowledge and tion opportunities and priorities will motivate
understanding. NSF investments in data cyberinfrastructure.
The resulting national digital data framework • Science and engineering data generated with
will be an integral component in the national NSF funding will be readily accessible and eas-
cyberinfrastructure framework described in this ily usable, and will be appropriately, responsi-
document. It will consist of a range of data col- bly and reliably preserved.
lections and managing organizations, networked • Broad community engagement is essential to
together in a flexible technical architecture using the prioritization and evaluation of the utility
standard, open protocols and interfaces, and of scientific data collections, including the pos-
designed to contribute to the emerging global sible evolution from research to resource and
information commons. It will be simultaneously reference collection types.
local, regional, national and global in nature, and
will evolve as science and engineering research and • Continual exploitation of data in the creation
education needs change and as new science and of new knowledge requires that investigators
engineering opportunities arise. Widely acces- have access to the tools and services necessary
sible tools and services will permit scientists and to locate and access relevant data, and under-
engineers to access and manipulate these data to stand its structure sufficiently to be able to
advance the science and engineering frontier. interpret and (re)analyze what they find.
• The establishment of strong, reciprocal,
In print form, the preservation process is international, interagency and public-private
handled through a system of libraries and other partnerships is essential to ensure all stakehold-
repositories throughout the country and around ers are engaged in the stewardship of valuable
the globe. Two features of this print-based system data assets. Transition plans, addressing issues
make it robust. First, the diversity of business such as media, stewardship and standards, will
models deriving support from a variety of sources be developed for valuable data assets, to protect
means that no single entity bears sole responsibil- data and assure minimal disruption to the
ity for preservation, and the system is resilient to community during transition periods.
changes in any particular sector. Second, there
is overlap in the collections, and redundancy of • Mechanisms will be created to share data
content reduces the potential for catastrophic loss stewardship best practices between nations,
of information. communities, organizations and individuals.
• In light of legal, ethical and national security
The national data framework is envisioned concerns associated with certain types of data,
to provide an equally robust and diverse system mechanisms essential to the development of
for digital data management and access. It will both statistical and technical ways to protect
promote interoperability between data collections privacy and confidentiality will be supported.
supported and managed by a range of organiza-
tions and organization types; provide for appropri-
ate protection and reliable long-term preservation
of digital data; deliver computational performance,
data reliability and movement through shared
tools, technologies and services; and accommo-
date individual community preferences. NSF will
also develop a suite of coherent data policies that
emphasize open access and effective organization
and management of digital data, while respecting
the data needs and requirements within science
and engineering domains.
- 25 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
A. A Coherent Organizational Framework - ate standards for data and metadata content and
Data Collections and Managing Organizations format, guided by an appropriate community-
defined governance approach. This is not a static
To date, challenges associated with effective process, as new disciplinary fields and approaches,
stewardship and preservation of scientific data data types, organizational models and information
have been more tractable when addressed through strategies inexorably emerge. This is discussed in
communities of practice that may derive support detail in the Long-Lived Digital Data Collections
from a range of sources. For example, NSF sup- report of the National Science Board.
ports the Incorporated Research Institutions for
Seismology (IRIS) consortium to manage seismol- Since data are held by many federal agen-
ogy data. Jointly with NIH and DOE, the agency cies, commercial and non-profit organizations,
supports the Protein Data Bank to manage data and international entities, NSF will foster the
on the three-dimensional structures of proteins establishment of interagency, public-private and
and nucleic acids. Multiple agencies support the international consortia charged with providing
University Consortium for Atmospheric Research, stewardship for digital data collections to pro-
an organization that has provided access to at- mote interoperability across data collections. The
mospheric and oceanographic data sets, simula- agency will work with the broad community of
tions, and outcomes extending back to the 1930s science and engineering producers, managers,
through the National Center for Atmospheric scientists and users to develop a common con-
Research. ceptual framework. A full range of mechanisms
will be used to identify and build upon common
Existing collections and managing organization ground across domain communities and managing
models reflect differences in culture and practice organizations, engaging all stakeholders. Activities
within the science and engineering community. will include: the support of new projects; devel-
As community proxies, data collections and their opment and implementation of evaluation and
managing organizations can provide a focus for assessment criteria that, among other things, reveal
the development and dissemination of appropri- lessons learned across communities; support of
Researchers check functionality and performance of the Compact Muon Solenoid detector at CERN before its closure.
Built on the Large Hadron Collider, it provides a magnetic field of 4T.
- 26 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
community and intercommunity workshops; and and all other causes since, unlike printed records,
the development of strong partnerships with other the media on which digital data are stored and the
stakeholder organizations. Stakeholders in these structures of the data are relatively fragile.
activities include data authors, data managers, data
scientists and engineers, and data users represent- High quality metadata, which summarize data
ing a diverse range of communities and organiza- content, context, structure, interrelationships, and
tions, including universities and research librar- provenance (information on history and origins),
ies, government agencies, content management are critical to successful information management,
organizations and data centers, and industry. annotation, integration and analysis. Metadata
take on an increasingly important role when ad-
To identify and promote lessons learned across dressing issues associated with the combination of
managing organizations, NSF will continue to data from experiments, observations and simula-
promote the coalescence of appropriate collec- tions. In these cases, product data sets require
tions with overlapping interests, approaches and metadata that describe, for example, relevant
services. This reduces data-driven fragmentation collection techniques, simulation codes or point-
of science and engineering domains. Progress is ers to archived copies of simulation codes, and
already being made in some areas. For example, codes used to process, aggregate or transform data.
NSF has been working with the environmental These metadata are essential to create new knowl-
science and engineering community to promote edge and to meet the reproducibility imperative
collaboration across disciplines ranging from ecol- of modern science. Metadata are often associated
ogy and hydrology to environmental engineering. with data via markup languages, representing
This has resulted in the emergence of common a consensus around a controlled vocabulary to
cyberinfrastructure elements and new interdiscip- describe phenomena of interest to the commu-
linary science and engineering opportunities. nity, and allowing detailed annotations of data to
be embedded within a data set. Because there is
B. Developing A Flexible Technological Archi- often little awareness of markup language develop-
tecture ment activities within science and engineering
communities, effort is expended reinventing what
From a technological perspective, the national could be adopted or adapted from elsewhere. Sci-
data framework must provide for reliable preserva- entists and engineers therefore need access to tools
tion, access, analysis, interoperability, and data and services that help ensure that metadata are
movement, possibly using a web or grid services automatically captured or created in real-time.
distributed environment. The architecture must
use standard open protocols and interfaces to
enable the broadest use by multiple communities.
It must facilitate user access, analysis and visualiza-
tion of data, addressing issues such as authentica-
tion, authorization and other security concerns,
and data acquisition, mining, integration, analysis
and visualization. It must also support complex
workflows enabling data discovery. Such an archi-
tecture can be visualized as a number of layers pro-
viding different capabilities to the user, including
data management, analysis, collaboration tools,
and community portals. The connections among
these layers must be transparent to the end user,
and services must be available as modular units
responsive to individual or community needs. The
system is likely to be implemented as a series of
distributed applications and operations supported
by a number of organizations and institutions dis-
tributed throughout the country. It must provide
for the replication of data resources to reduce the
A simulated event of the collision of two protons in the ATLAS experi-
potential for catastrophic loss of digital informa- ment. The colors of the tracks emanating from the center show the
tion through repeated cycles of systems migration different types of particles emerging from the collision.
- 27 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
Effective data analysis tools apply computation- tions. In common with other cyberinfrastructure
al techniques to extract new knowledge through components, visualization requires easy-to-use,
a better understanding of the data and its redun- modular, extensible applications that capitalize on
dancies and relationships by filtering extraneous the reuse of existing technology. Today’s successful
information and by revealing previously unseen analysis and visualization applications use a pipe-
patterns. For example, the Large Hadron Col- line, component-based system on a single machine
lider at CERN generates such massive data sets or across a small number of machines. Extending
that the detection of both expected events, such to the broader distributed, heterogeneous cyber-
as the Higgs boson, and unexpected phenomena infrastructure system will require new interfaces
require the development of new algorithms, both and work in fundamental graphics and visualiza-
to manage data and to analyze it. Algorithms tion algorithms that can be run across remote and
and their implementations must be developed for distributed settings.
statistical sampling, for visualization, to enable the
storage, movement and preservation of enormous To address this range of needs for data tools and
quantities of data, and to address other unforeseen services, NSF will work with the broad commu-
problems certain to arise. nity to identify and prioritize needs. In making
investments, NSF will complement private sector
Scientific visualization, including not just static efforts, for example, those producing sophisticated
images but also animation and interaction, leads indexing and search tools and packaging them as
to better analysis and enhanced understanding. data services. NSF will support projects to con-
Currently, many visualization systems are do- duct applied research and development of promis-
main or application-specific and require a certain ing, interoperable data tools and services; perform
commitment to understanding or learning to scalability/reliability tests to explore tool viability;
use them. Making visualization services more develop, harden and maintain software tools and
transparent to the user lowers the threshold of us- services where necessary; and harvest promising
ability and accessibility, and makes it possible for research outcomes to facilitate the transition of
a wider range of users to explore a data collection. commercially-viable software into the private sec-
Analysis of data streams also introduces problems tor. Data tools created and distributed through
in data visualization and may require new ap- these projects will include not only access and
proaches for representing massive, heterogeneous ease-of-use tools, but also tools to assist with data
data streams. input, tools that maintain or enforce formatting
standards, and tools that make it easy to include or
Deriving knowledge from large data sets create metadata in real time. Clearinghouses and
presents specific scaling problems due to the sheer registries from which all metadata, ontology, and
number of items, dimensions, sources, users, and
disparate user communities. The human ability to
process visual information can augment analysis,
especially when analytic results are presented in
iterative and interactive ways. Visual analytics, the
science of analytical reasoning enabled by interac-
tive visual interfaces, can be used to synthesize the
information content and derive insight from mas-
sive, dynamic, ambiguous, and even conflicting
data. Suitable fully interactive visualizations help
absorb vast amounts of data directly, to enhance
one’s ability to interpret and analyze otherwise
overwhelming data. Researchers can thus detect
the expected and discover the unexpected, uncov-
ering hidden associations and deriving knowledge
from information. As an added benefit, their
insights are more easily and effectively communi-
cated to others.
CAVE software, released by RCSB PDB and CalIT2, provides a new way
of visualizing 3D macromolecular structures in an immersive, virtual real-
Creating and deploying visualization services ity environment. The CAVE allows users to move through and around a
requires new frameworks for distributed applica- structure projected in the CAVE.
- 28 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
markup language standards are provided, publi- recognizing that this is essential to continued U.S.
cized, and disseminated must be developed and leadership in science and engineering.
supported, together with the tools for their imple-
mentation. Data accessibility and usability will In addition to addressing the technological chal-
also be improved with the support of means for lenges inherent in the creation of a national data
automating cross-ontology translation. Collective- framework, NSF’s data policies will be redesigned
ly, these projects will be responsible for ensuring as necessary to mitigate existing sociological and
software interoperability with other components cultural barriers to data sharing and access, and
of the cyberinfrastructure, such as those generated to bring them into accord across programs and
to provide High Performance Computing capa- ensure coherence. This will lead to the develop-
bilities and to enable the creation of Networked ment of a suite of harmonized policy statements
Resources and Virtual Organizations. supporting data open access and usability. NSF’s
actions will promote a change in culture such that
The user community will work with tool the collection and deposition of all appropriate
providers as active collaborators to determine digital data and associated metadata become a
requirements and to serve as early users. Scien- matter of routine for investigators in all fields. This
tists, educators, students and other end users think change will be encouraged through an NSF-wide
of ways to use data and tools that the developers requirement for data management plans in all
did not consider, finding problems and testing proposals. These plans will be considered in the
recovery paths by triggering unanticipated behav- merit review process and will be actively moni-
ior. Most important, an engaged set of users and tored post-award.
testers will also demonstrate the scientific value
of data collections. The value of repositories and Policy and management issues in data handling
their standards-based input and output tools arises occur at every level, and there is an urgent need
from the way in which they enable discoveries. for rational agency, national and international
Testing and feedback are necessary to meet the strategies for sustainable access, organization and
challenges presented by current datasets that will use. Discussions at the interagency level on issues
only increase in size, often by orders of magnitude, associated with data policies and practices will be
in the future. supported by a new interagency working group
on digital data formed under the auspices of the
Finally, in addition to promoting the use of Committee on Science of the National Science
standards, tool and service developers will also and Technology Council. This group will con-
promote the stability of standards. Continual sider not only data challenges and opportunities
changes to structure, access methods, and user discussed throughout this chapter, but especially
interfaces mitigate against ease of use and against the issues of cross-agency funding and access, the
interoperability. Instead of altering a standard for provision and preservation of data to and for other
a current need, developers will adjust their imple- agencies, and monitoring agreements as agency
mentation of that need to fit within the standard. imperatives change with time. Formal policies
This is especially important for resource-limited must be developed to address data quality and
research and education communities. security, ethical and legal requirements, and tech-
nical and semantic interoperability issues that will
C. Developing and Implementing Coherent arise throughout the complete process from collec-
Data Policies tion and generation to analysis and dissemination.
In setting priorities and making funding deci- As already noted, many large science and en-
sions, NSF is in a powerful position to influ- gineering projects are international in scope, and
ence data policy and management at research thus national laws and international agreements
institutions. NSF’s policy position on data is directly affect data access and sharing practices.
straightforward: all science and engineering data Differences arise over privacy and confidentiality,
generated with NSF funding must be made from cultural attitudes to ownership and use, in
broadly accessible and usable, while being suitably attitudes to intellectual property protection and its
protected and preserved. Through a suite of co- limits and exceptions, and because of national se-
herent policies designed to recognize different data curity concerns. Means by which to find common
needs and requirements within communities, NSF ground within the international community must
will promote open access to well-managed data, continue to be explored.
- 29 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
CHAPTER 4
VIRTUAL ORGANIZATIONS FOR DISTRIBUTED
COMMUNITIES (2006-2010)
The GPS station shown on the opposite page monitors transform plate movement in California as part of
EarthScope. The data collected will be compared with data from other GPS stations to better understand the
interplay of faults in the region.
- 31 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
researchers to test larger scale and more compre- reliable, accessible, usable, pervasive, persistent
hensive structural and geomaterial systems to and interoperable, and that are able to exploit
create new design methodologies and technolo- the full range of research and education tools
gies for reducing losses during earthquakes and available at any given time.
tsunamis.
The NEES@buffalo shake table recreates movements of the forces generated by an actual earthquake. NEES allows earthquake engineers and
students located at different institutions to share resources, collaborate on testing, and exploit new computational technologies.
- 32 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
propriately reliable, robust, and persistent such operators, and end users who span multiple com-
that end users can depend on them to achieve munities; and the establishment of an effective
their research and education goals. assessment and evaluation plan that will inform
• NSF partners with relevant stakeholders, the agency’s ongoing investments in cyberinfra-
including academe, industry, other federal structure for the foreseeable future.
agencies, and other public and private sector
organizations, both foreign and domestic. A. Open Technological Framework
• Tools and services are networked together in To facilitate the development of an open tech-
a flexible architecture using standard, open nological framework, NSF will support cyberin-
protocols and interfaces, designed to support frastructure software service providers to develop,
the creation and operation of robust networked integrate, deploy, and support reliable, robust, and
resources and VOs across the scientific and interoperable software. Software essential to the
engineering disciplines supported by NSF. creation of networked resources and VOs encom-
passes a broad range of functionalities and services,
including enabling middleware; domain-specific
software and application codes; teleobservation
and teleoperation tools to enable remote access to
experimental facilities, instruments, and sensors;
collaborative tools for experimental planning,
execution, and post-analysis; workflow tools and
processes; system monitoring and management;
user support; web portals to simulation software
and domain-specific community code repositories;
and flexible user interfaces to enable discovery and
learning.
- 33 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
community governance. NSF will require multiple research and education communities.
awardees developing, deploying and supporting Where appropriate, conventions that are em-
networked resources and virtual organizations to ployed internationally should be favored. The use
develop and apply robust cybersecurity policies of standards creates economies of scale and scope
and procedures, thereby promoting a conscien- for developing and deploying common resources,
tious and consistent approach to cybersecurity. tools, software, and services that enhance the use
The agency will support development of strong of cyberinfrastructure in multiple science and
authentication and authorization technologies and engineering communities. This approach allows
procedures for individuals, groups, VOs, and other maximum interoperability and sharing of best
role-based identities. In so doing, the agency will practices. A standards-based approach will ensure
leverage research prototypes and new technologies that access to cyberinfrastructure will be indepen-
produced in the Cyber Trust program, the U.S. dent of operating systems; ubiquitous, and open
government’s flagship program in cybersecurity to large and small institutions. Together, web ser-
research and development. vices and service-oriented architectures are emerg-
ing as a standard framework for interoperability
Interoperable, open technology standards will among various software applications on different
be used as the primary mechanism to support platforms. They provide important characteristics
the further development of interoperable, open, such as standardized and well-defined interfaces
extensible networked resources and VOs. Ideally, and protocols, ease of development, and reuse of
standards for data representation and communica- services and components, making them potential
tion, connection of computing resources, au- central facets of the cyberinfrastructure software
thentication, authorization, and access should be ecosystem.
open, internationally employed, and accepted by
The Ocean Observatories Initiative promises to provide the ocean sciences research community sustained,
long-term and adaptive measurements in the oceans via a fully operational research observatory system.
- 34 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
B. Operational Framework will also support projects that study how ongoing
and future cyberinfrastructure efforts might be
NSF will promote the development of partner- informed by lessons learned and by the identifica-
ships to facilitate the sharing and integration of tion of promising practices. Among other things,
distributed technological components deployed NSF seeks to build a stronger foundation in our
and supported at national, international, regional, understanding of how individuals, teams and com-
local, community, and campus levels. Significant munities most effectively interact with cyberinfra-
resources already exist at the academic institution structure; how to design the critical governance
level. It is important to integrate such resources and management structures for the new types of
into the national cyberinfrastructure fabric. organizations arising; and, how to improve the
allocation of Cyberinfrastructure resources and
In addition, NSF will encourage partnerships design incentives for its optimal use. These types
with industry to develop, maintain and share of activities will be essential to the agency’s overall
robust, production-quality software tools and success. NSF will support studies of the evolution
services and to leverage commercially available and impact of cyberinfrastructure on the culture
software. NSF will engage commercial software and conduct of research and education within and
providers to identify value propositions, to iden- across different research and education commu-
tify software needs specific to the research and nities. It will also encourage systemic design of
education community, and to facilitate technology virtual organizations with embedded evaluation to
transfer to industry, where appropriate. Efforts to better address the complex interaction of techno-
ensure that NSF cyberinfrastructure investments logical and social issues with the user community.
complement those of other federal agencies will be
intensified.
- 35 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
CHAPTER 5
LEARNING AND WORKFORCE DEVELOPMENT
(2006-2010)
I. C L models from faculty research efforts are used to
generate numerical data sets for comparison with
Cyberinfrastructure moves us beyond the old- data from physical observations of real transporta-
school model of teachers/students and classrooms/ tion systems obtained from various (international)
labs. Ubiquitous learning environments now locations via access to remote instrumentation.
will encompass classrooms, laboratories, libraries, Furthermore, learners explore influences on air
galleries, museums, zoos, workplaces, homes and quality and tap into the expertise of practicing
many other locations. Under this transformation, environmental scientists through either real-time
well-established components of education—pre- or asynchronous communication. This networked
school, K-12 and post-secondary—become learning environment increases the impact and
highly leveraged elements of a more open learning accessibility of all resources by allowing students
world where people learn as a routine part of life, to search for and discover content, to assemble
throughout their lives. curricular and learning modules from component
pieces in a flexible manner, and to communicate
Cyberinfrastructure is enabling powerful op- and collaborate with others, leading to a deep
portunities: i) to collaborate, ii) to model and change in the relationship between students
visualize complex scientific and engineering and knowledge. Indeed, students experience the
concepts, iii) to create and discover scientific and profound changes in the practice of science and
educational resources for use in a variety of set- engineering and the nature of inquiry that cyber-
tings, both formal and informal, iv) to assess learn- infrastructure provokes.
ing gains, and v) to personalize learning environ- II. B C C
ments. These changes both demand and support U C
a new level of technical competence in the science
and engineering workforce and in our citizenry
at large. Imagine an interdisciplinary course in To realize these radical changes in the processes
the design and construction of large public works of learning and discovery, networked resources also
projects, attracting student-faculty teams from demand a new level of technical competence from
different engineering disciplines, urban planning, the nation’s workforce and citizenry. Indeed, NSF
environmental science, and economics; and from envisions a spectrum of new learning needs and
around the globe. To develop their understand- activities demanded by individuals, from future
ing, the students combine relatively small self- researchers, to members of the technical cyberin-
contained digital simulations that capture both frastructure workforce, to the citizen at large.
simple behavior and geometry to model more
complex scientific and engineering phenomena. As cyberinfrastructure tools grow more ac-
Modules share inputs and outputs and otherwise cessible, students at the secondary school and
interoperate. These “building blocks” maintain undergraduate levels increasingly use them in their
sensitivity across multiple scales of phenomena. learning endeavors, in many cases serving as early
For example, component models of transportation adopters of emergent cyberinfrastructure. Al-
subsystems from one site combine with structural ready, these tools facilitate communication across
and geotechnical models from other collections disciplinary, organizational, and international and
to simulate dynamic loading within a complex cultural barriers, and their use is characteristic of
bridge and tunnel environment. Computational the new globally-engaged researcher. Moreover,
the new tools and functionality of cyberinfrastruc-
Weather recording equipment used in Jornada Range studies is part of the LTER Network, involving more than
1800 scientists and students at 26 different sites investigating ecological processes over long temporal and
broad spatial scales. These researchers in southern New Mexico are studying the causes and consequences of
desertification.
- 37 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
ture are transforming the very nature of scientific ing of feedback to the individual, and the creation
inquiry and scholarship. New methods to observe of personalized portfolios of student learning
and to acquire data, manipulate it, and represent that capture a record of conceptual learning gains
it, challenge the traditional discipline-based gradu- over time. Undergraduate curricula must also be
ate curricula. Increasingly the tools of cyberinfra- reinvented to exploit emerging cyberinfrastructure
structure must be incorporated within the context capabilities. The full engagement of students is
of interdisciplinary research. Indeed, these tools vitally important since they are in a special posi-
and approaches are helping to make possible new tion to inspire future students with the excitement
methods of inquiry that allow understanding in and understanding of cyberinfrastructure-enabled
one area of science to promote insight in an- scientific inquiry and learning.
other, thus defining new interdisciplinary areas of
research reflecting the complex nature of modern Ongoing attention must be paid to the educa-
science and engineering problems. Furthermore, as tion of the professionals who will support, deploy,
data are increasingly “born digital,” the ephem- develop, and design current and emerging cyberin-
eral nature of data sources themselves raises new frastructure. For example, the increased emphasis
dimensions on the issues of preservation and on “data rich” scientific inquiry has revealed seri-
stewardship. ous needs for “digital data management” or data
curation professionals. Such careers may involve
the development of new, hybrid degree programs
that marry the study of library science with a
scientific discipline. Similarly, the power that visu-
alization and other presentation technologies bring
to the interpretation of data may call for special-
ized career opportunities that pair the graphic arts
with a science or engineering discipline.
- 38 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
- 39 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
learning in the distributed and networked nationally; and foster incorporation of solutions
learning environment. for addressing privacy, ethical and social issues in
• Leveraging cyberinfrastructure LWD activi- cyberinfrastructure-enabled learning and research
ties and investments within NSF and by other environments.
agencies – national and international – are
essential for enabling 21st century science and It is very likely that new disciplines will develop
engineering. as a natural outgrowth of the advances in cyberin-
frastructure. NSF will be an enabler in developing
• Scientists and engineers must be prepared to the workforce in these newly-formed disciplines.
collaborate across disciplinary, institutional, These disciplines could be as important as the
geopolitical and cultural boundaries using relatively new disciplines of computer science,
cyberinfrastructure-mediated tools. mathematical biology, genomics, environmental
sciences and astrophysics are today. NSF will
Current and future generations of scientists, support mechanisms for development of new
engineers and educators will utilize cyberinfra- cyberinfrastructure-related curricula at all levels;
structure-enabled learning and research environ- stimulate partnerships – domestic and internation-
ments for their formal and informal educational al – between and among academic institutions and
training, research projects, career development and industrial cyberinfrastructure-professionals, and
lifelong learning. Therefore, NSF will develop and support the wide dissemination of “best practices”
implement strategies for their deployment and in cyberinfrastructure workforce development.
utilization. To do so, NSF will stimulate aware-
ness of cyberinfrastructure-enabled learning and Cyberinfrastructure has the potential to enable
research environments for scientists, engineers and a larger and more diverse set of individuals and
educators; enhance usability and adaptability of institutions to participate in science and engineer-
cyberinfrastructure-enabled learning and research ing education, research and innovation. To realize
environments for current and future generations this potential, NSF will strategically design and
of scientists, engineers and educators both nation- implement programs that recognize the needs of
ally and internationally; promote new approaches those who might not have the means to utilize CI
to and integration of cyberinfrastructure-enabled in science and engineering research and education.
learning and research environments for educa- To do so, NSF will identify and address barriers to
tional and research usage nationally and inter- utilization of cyberinfrastructure tools and
Georgia Institute of Technology and NRL researchers have developed improved methods for directly writing nanometer-
scale patterns onto a variety of surfaces. Atomic force microscopy generated images are used to analyze nanometer-scale
structures.
- 40 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
resources; promote the training of faculty, par- promote the development of technological solu-
ticularly those in minority-serving institutions, tions for addressing privacy, ethical and social
predominantly undergraduate institutions and issues in cyberinfrastructure-enabled learning and
community colleges; and encourage programs research environments; and stimulate both domes-
to integrate innovative methods of teaching and tic and international partnerships to identify best
learning using cybertools (particularly in inner- practices in cyberinfrastructure enabled learning
city, rural and remote classrooms), including tak- and research environments.
ing advantage of international networked resources
to prepare a globally engaged workforce. Lifelong learning, through both formal and
informal mechanisms, will be an essential part of
In the dynamic environment of cyberinfra- the workforce of a cyberinfrastructure-enabled
structure-enabled learning and research, NSF will society. NSF can play a crucial role by promot-
facilitate new developments as well as continu- ing and sustaining programs that exploit existing
ous improvements of the currently available tools resources and encourage creation of new resources
and services, including those for education and in order to continually improve the science lit-
training. NSF will support research that increases eracy of society in general and of the science and
understanding of how students, teachers, scientists engineering workforce in particular. NSF will
and engineers work and learn in a cyberinfrastruc- support mechanisms for professionals to continu-
ture rich environment, for example, interactive ously update their cyberinfrastructure skills and
gaming, simulation and modeling (as opposed to competencies; catalyze the development of new
conventional instruction methods). The agency lifelong learning networked resources; promote
will support the development of methods to em- new knowledge communities and networks that
bed relevant data collection and analysis tools in take advantage of cyberinfrastructure to provide
cyberinfrastructure-based environments in order new learning experiences; support programs that
to assess, for example, satisfaction, usability, utility, bridge pre- and post-professional (lifelong) learn-
productivity, etc., as well as the development of ing; and encourage programs that promote public
specific means for tracking student progress in awareness of, and literacy in, cyberinfrastructure.
content-based cyberinfrastructure learning;
A visualization of the changing structure of the satellite tobacco mosaic virus is produced
on the NCSA Cobalt System by calculating how each of the one million atoms interacted
with each other every femtosecond.
- 41 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
APPENDIX A: ACRONYMS
2MASS Two Micron All Sky Survey
CAIDA Cooperative Association for Internet Data Analysis
CALIT2 California Institute for Telecommunications and Information Technology
CAVE Cave Automatic Virtual Environment
CCSDS Consultative Committee for Space Data Standards
CHARMM Chemistry at Harvard Macromolecular Mechanics
CI Cyberinfrastructure
CIC Cyberinfrastructure Council
CIO Chief Information Officer
CMS Compact Muon Solenoid
CODATA Committee on Data for Science and Technology
CPU Central Processing Unit
CSNET Computer Science Network
CT Computed Tomography
DANSE Data Analysis for Neutron Scattering Experiments
DARPA Defense Advanced Research Projects Agency
DNA Deoxyribonucleic Acid
DOD Department of Defense
DOE Department of Energy
ETF Extensible Terascale Facility
FLOPS Floating point operations/sec
GPS Global Positioning System
HBCU Historically Black Colleges and Universities
HPC High Performance Computing
HPCMOD DOD’s High-Performance Computing Modernization program
HPCS DARPA’s High Productivity Computing Systems program
HPCC High-Performance Computing and Communications
HPWREN High Performance Wireless Research and Education Network
GEON Geosciences Network
GLORIAD Global Ring Network for Advanced Applications Development
GriPhyN Grid Physics Network
ICPSR Inter-university Consortium for Political and Social Research
ICSU International Council for Science
ICSTI International Council for Scientific and Technical Information
IRIS Incorporated Research Institutions for Seismology
ISO International Organization of Standardization
IT Information Technology
ITR Information Technology Research
IVDGL International Virtual Data Grid Laboratory
LIGO Laser Interferometer Gravitational Wave Observatory
LTER Long Term Ecological Research
LWD Learning and Workforce Development
MPI Message Passing Interface
MREFC Major Research Equipment and Facilities Construction
NARA National Archives and Records Administration
NASA National Aeronautics and Space Administration
NCAR National Center for Atmospheric Research
The simulation on the opposite page, run by Louisiana State University astrophysicists on the NCSA Tungsten system,
shows two stars in a binary system during a late phase of their evolution. The more massive star (top) has transferred
a significant fraction of its mass to its companion and the two stars are undergoing a catastrophic merger.
- 43 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
- 44 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
APPENDIX B:
REPRESENTATIVE REPORTS AND WORKSHOPS
Building a Cyberinfrastructure for the Biologi- shop held on March 28-29, 2003;, June 2003;
cal Sciences; workshop held July 14-15, 2003; information available at http://www.paleostrat.
information available at http://research.calit2. org/Documents/ises-ci%202003.pdf
net/cibio/archived/CIBIO_FINAL.pdf and http://
research.calit2.net/cibio/report.htm Cyberinfrastructure and the Next Wave of
Collaboration; D.E. Atkins, Keynote for EDU-
CHE Cyber Chemistry Workshop; workshop CAUSE, held April 2005; information available at
held October 3-5, 2004; information available at http://www.educause2005.auckland.ac.nz/interac-
http://bioeng.berkeley.edu/faculty/cyber_work- tive/presentations/Atkins.pdf
shop/
Cyberinfrastructure needs for environmental
Commission on Cyberinfrastructure for the observatories; information available at http://www.
Humanities and Social Sciences; sponsored by orionprogram.org/office/NSFCyberWkshp.html
the American Council of Learned Societies; seven
public information-gathering events held in 2004; Cyberinfrastructure Research for Homeland
report in preparation; information available at Security; February 26-27, 2003; information avail-
http://www.acls.org/cyberinfrastructure/cyber.htm able at web.calit2.net/RiskReduction/crhsdraft.pdf
- 45 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
Final Report: NSF SBE-CISE Workshop on GEON: Developing the Cyberinfrastructure for
Cyberinfrastructure and the Social Sciences, F. the Earth Sciences A Workshop Report on Intru-
Berman and H. Brady, held May 12, 2005; infor- sive Igneous Rocks, Wilson Cycle and Concept
mation available at http://vis.sdsc.edu/sbe/reports/ Spaces; information available at http://geon.geol.
SBE-CISE-FINAL.pdf vt.edu/geon/pubreps/workshop.pdf
High-Performance Computing in the Geosci- NSF Conceptual Design Review Panel for the
ences Workshop Report. September 25-27, 2006, Ocean Observatories Initiative, August 14-17,
information available at http://www.ncar.ucar. 2006; information available at http://www.orion-
edu/Director/dcworkshop/HPCGEO_workshop_ program.org/capabilities/cdr/default.html
report_FINAL.pdf
NSF EPSCoR Cyberinfrastructure Workshop,
Identifying Major Scientific Challenges in the May 10-12, 2006; information available at http://
Mathematical and Physical Sciences and their ciworkshop.utsi.edu/NSF_CI_WkshpReport_Fi-
CyberInfrastructure Needs, workshop held April nal.pdf
21, 2004; information available at http://www.nsf.
gov/attachments/100811/public/CyberscienceFi- NSF Workshop for a Plant Cyberinfrastructure
nal4.pdf Center, October 17-18, 2005; information avail-
able at http://www.arabidopsis.org/portals/masc/
Improving the effectiveness of U.S. Climate masc_docs/masc_wk_rep.jsp
modeling, Commission on Geosciences, En-
vironment and Resources (2001). National
Academy Press, Washington, D.C., 144 pp;
information available at http://www.nap.edu/
books/0309072573/html/
- 46 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
- 47 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
APPENDIX C:
CHRONOLOGY OF NSF INFORMATION
TECHNOLOGY INVESTMENTS
NSF’s early investments in what has now become program. Two partnerships were established in
known as cyberinfrastructure date back almost to 1997, together involving nearly 100 partner insti-
the agency’s inception. In the 1960s and 1970s, tutions across the country in efforts to make more
the agency supported a number of campus-based efficient use of high-end computing in all areas of
computing facilities. As computational meth- science and engineering. The partnerships have
odologies became increasingly essential to the been instrumental in fostering the maturation of
research endeavor, the science and engineering cyberinfrastructure and its widespread adoption by
community began to call for NSF investments in the academic research and education community,
specialized, higher capability computing facilities and by industry.
that would meet the computational needs of the
broad national community. As a consequence, Also in the early 1990s, NSF, as part of the
NSF’s Supercomputer Centers program was initi- U.S. High Performance Computing and Com-
ated in 1985 through the agency’s support of five munications (HPCC) program, began to sup-
academic-based supercomputer centers. port larger-scale research and education-focused
projects pursuing what became known as “grand
During the 1980s, academic-based networking challenges.” These HPCC projects joined scientists
activities also flourished. Networking technologies and engineers, computer scientists and state-of-
were expected to improve the effectiveness and the-art cyberinfrastructure technologies to tackle
efficiency of researchers and educators, providing important problems in science and engineering
enhanced, easy access to computer resources and whose solution could be advanced by applying
more effective transfer and sharing of information cyberinfrastructure techniques and resources. First
and knowledge. After demonstrating the potential coined by the HPCC program, the term “grand
of Computer Science Network (CSNET) in link- challenge” has been widely adopted in many sci-
ing computer science departments, NSF moved on ence and engineering fields to signify an over-
to develop the high-speed backbone, called NSF arching goal that requires a large-scale, concerted
Network (NSFNET), with the five supercom- effort.
puter centers supported under the Supercomputer
Centers program and the National Center for During the 1990s, the penetration of increas-
Atmospheric Research becoming the first nodes ingly affordable computing and networking
on the backbone. NSF support also encouraged technologies on campuses was also leading to the
the development of regional networks to connect creation of what would become mission-critical,
with the backbone NSFNET, thereby speeding the domain-specific cyberinfrastructure. For example,
adoption of networking technologies on campuses in the mid 1990s the earthquake engineering
around the country. In 1995, in partnership community began to define what would become
with MCI, NSF catalyzed support of the very the Network for Earthquake Engineering Simula-
high speed Backbone Network Service (vBNS), tion, one of many significant cyberinfrastructure
permitting advanced networking research and the projects in NSF’s portfolio today.
development of novel scientific applications. A
few years later, NSF established the NSF Middle- In 1999, the President’s Information Technol-
ware Initiative, focused on the development of ogy Advisory Committee (PITAC) released the
advanced networking services to serve the evolving seminal report Information Technology Research
needs of the science and engineering community. (ITR)-Investing in our Future, prompting new
and complementary NSF investments in CI proj-
In the early to mid-1990s, informed by both the ects, such the Grid Physics Network (GriPhyN)
Branscomb and the Hayes Reports, NSF consoli- and international Virtual Data Grid Laboratory
dated its support of national computing facili- (iVDGL) and the Geosciences Network, known as
ties in the establishment of the Partnerships for GEON. Informed by the PITAC report, NSF also
Advanced Computational Infrastructure (PACI) created a Major Research Equipment and Facilities
- 48 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
Construction (MREFC) project entitled Terascale recommended support for the two Partnership
Computing Systems that began its construction lead sites through the end of their original PACI
phase in FY 2000 and ultimately created the Ex- cooperative agreements. In October 2004, fol-
tensible Terascale Facility – now popularly known lowing merit review, the National Science Board
as the Teragrid. Teragrid entered its production (NSB) endorsed funding of those sites through the
phase in October 2004 and represents one of the end of FY 2007.
largest, fastest, most comprehensive distributed
cyberinfrastructures for science and engineering Through 2005, in addition to the groups
research and education. already cited, a number of prestigious groups have
made recommendations that continue to inform
In 2001, NSF charged an Advisory Committee the agency’s cyberinfrastructure planning includ-
for Cyberinfrastructure, under the leadership of ing the High-End Computing Revitalization Task
Dr. Dan Atkins, to evaluate the effectiveness of Force, the PITAC Subcommittee on Computa-
PACI and to make recommendations for future tional Science, and the NRC Committee on the
NSF investments in cyberinfrastructure. The Future of Supercomputing.
Atkins Committee, as it became popularly known,
APPENDIX D:
MANAGEMENT OF CYBERINFRASTRUCTURE
NSF has nurtured the growth of what is now provided by the CIC chaired by the NSF Direc-
called cyberinfrastructure for a number of decades. tor and comprised of the NSF Deputy Director,
In recent years, the Directorate for Computer the Assistant Directors of NSF’s Research Direc-
and Information Science and Engineering (CISE) torates and the Heads of the Office of Interna-
has been responsible for the provision of national tional Science and Engineering, Office of Polar
supercomputing infrastructure for the academic Programs, and the recently established Office of
community. In addition, the Directorate was Cyberinfrastructure (OCI). The CIC has been
instrumental in the creation of what ultimately meeting regularly since May 2005, and OCI was
became known as the Internet. During this established in the Office of the Director on July
incubation period, the management of CI was best 22, 2005. In January 2007, the CIC authorized a
provided by those also responsible for the research standing CI Coordinators Committee (CICC) to
and development of related CI technologies. augment its activities with representatives from all
NSF Directorates and Offices and chaired by the
Over the years, the penetration and impact Director of OCI. A foundation-wide external Ad-
of computing and networking on campuses has visory Committee for Cyberinfrastructure (ACCI)
been extensive, and has led to the creation of has also been established and is now active.
many disciplinary-specific or community-specific
CI projects and activities. Today, CI projects are CISE will continue to be responsible for a broad
supported by all NSF Directorates and Offices. range of programs that address the administration’s
Because of the growing scope of investment and priorities for fundamental research and education
variability in needs among users in the broad sci- in computing, representing more than 85% of the
ence and engineering community, it has become overall federal investment in university-based basic
clear that effective CI development and deploy- research.
ment now requires the collective leadership of
NSF senior management. This leadership will be
- 49 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
APPENDIX E:
REPRESENTATIVE DISTRIBUTED RESEARCH
COMMUNITIES (VIRTUAL ORGANIZATIONS)
Collaboratories: of the high energy physics community. GriPhyN
is a foundational element of the Open Science
• CoSMIC: Combinatorial Sciences and Materials Grid (OSG).
Informatics Collaboratory http://mse.iastate.edu/
cosmic/ An international research and education
• ICPSRC: Inter - University Consortium for
center promoting the use of informatics and com- Political and Social Research http://www.icpsr.
binatorial experimentation for materials discovery umich.edu/ Maintains and provides access to a
and design. Based at Iowa State University with vast archive of social science data for research and
domestic partners at Florida International Uni- instruction, and offers training in quantitative
versity and the University of Maryland, CoSMIC methods to facilitate effective data use. ICPSR
is composed of an international consortium of preserves data, migrating them to new storage
universities and laboratories. media as changes in technology warrant. ICPSR
also provides user support to assist researchers in
DANSE: The Data Analysis for Neutron Scat- conducting their projects and identifying relevant
tering Experiments http://wiki.cacr.caltech.edu/ data for analysis.
danse/index.php/Main_Page Prompted by the
development of the Spallation Neutron Source
(http://www.sns.gov) (SNS), under construc- • iVDGL: International Virtual Data Grid Labora-
tion in Oak Ridge, Tennessee, DANSE goals are tory http://www.ivdgl.org/ A global data and
to build a software system that 1) enables new compute grid serving physics and astronomy,
and more sophisticated science to be performed funded as a large ITR project by MPS. While its
with neutron scattering experiments, 2) makes sister project, GriPhyN, funded the grid research,
the analysis of data easier for all scientists, and 3) iVDGL supports the operational grid and as-
provides a robust software infrastructure that can sociated services supporting projects like CMS,
be maintained in the future. ATLAS, and LIGO.
• DISUN: Data Intensive Scientific University • LHC: Large Hadron Collider and LCG: Large
Network http://www.disun.org/ A grid-based Hadron Collider Computing Grid The Large
facility of computational clusters and disk storage Hadron Collider, http://lhc.web.cern.ch/lhc/
arrays distributed across four institutions each of being built at CERN near Geneva, is the largest
which serves as a High Energy Physics Tier-2 site. scientific instrument on the planet. The mission
The Compact Muon Solenoid (CMS) experiment of the LHC Computing Project (LCG) is to
is primarily supported by DISUN. build and maintain a data storage and analysis
infrastructure for the entire high-energy physics
community that will use the LHC. See also the
• Ecoinformatics.org http://www.ecoinformatics. World LHC Computing Grid (WLCG) http://
org/ A community of ecologists and informatics lcg.web.cern.ch/LCG/ , a distributed production
specialists operating on the principles of open- environment for physics data processing.
source software communities. Collaborative proj-
ects such as software and standards development
that cross institutional and funding boundaries • LTER: Long-Term Ecological Research Network
are managed through contributed personnel in an http://lternet.edu/ A networked collaboratory
open and democratic process. Ecological Meta- of 26 field sites and network office focused on
data Language is a product of this community. answering questions about the long-term dynam-
ics of ecosystems, ranging from near pristine to
highly engineered sites. In addition to site-based
• GriPhyN: Grid Physics Network http://www. science, education and outreach, five common
griphyn.org/ A large ITR project focused on research themes ensure that multi-site projects
computer science and grid research applied to the and synthesis are integral to the program.
distributed computing and storage requirements
- 50 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
• NCEAS: National Center for Ecological • QuarkNet Cosmic Ray eLAB Project http://
Analysis and Synthesis http://www.nceas.ucsb. www11.i2u2.org:8080/elab/cosmic/project.jsp
edu/fmt/doc?/frames.html NCEAS and the Na- A distributed learning lab for collaborating high
tional Evolutionary Synthesis Center (NESCent) school physics students and teachers who collect
http://www.nescent.org/ serve as data centers and and analyze cosmic ray data. The e-LAB partici-
collaboratories for the ecology and evolutionary pants work with computer scientists to provide
biology communities. Both are involved in the cutting edge tools that use grid techniques to
development of software, house databases, and share data, graphs, and posters and to encourage
sponsor collaborative activities aimed as synthe- collaboration among students nationwide.
sizing research.
• SCEC CME : Southern California Earthquake
• NEES: George E. Brown, Jr. Network for Earth- Center Community Modeling Environment
quake Engineering Simulation http://www.nees. http://epicenter.usc.edu/cmeportal/ A recent
org/ A shared national network of 15 experimen- geophysics and IT collaboratory targeted at
tal facilities, collaborative tools, a centralized data seismic hazard analysis and geophysical modeling.
repository, and earthquake simulation software, The computational test-bed under development
all linked by the ultra-high-speed Internet2 con- will allow users to assemble and run highly com-
nections of NEESgrid. These resources provide plex seismological and geophysical simulations
the means for advanced collaborative research utilizing Teragrid with the goal of forecasting
based on experimentation and computational earthquakes in Southern California.
simulations of the ways buildings, bridges, utility
systems, coastal regions, and geomaterials per-
• UltraLight http://ultralight.caltech.edu/web-site/
form during seismic events. ultralight/html/index.html A collaboration of
experimental physicists and network engineers to
• NNIN: National Nanotechnology Infrastructure enable petabyte-scale analysis of globally distrib-
Network http://www.nnin.org/ Integrated net- uted data. Goals include: 1) developing network
worked partnership of 13 user facilities that serves services that broaden existing Grid computing
the resource needs of nanoscale science, engineer- systems by promoting the network as actively
ing and technology. NNIN provides users with managed component; 2) testing UltraLight in
shared open access, on-site and remotely, to lead- Grid-based physics production and analysis
ing-edge tools, instrumentation, and capabilities systems; and 3) engineering a trans- and intercon-
for fabrication, synthesis, characterization, design, tinental optical network testbed, including high-
simulation and integration for the purpose of speed data caches and computing clusters.
building structures, devices, and systems from
atomic to complex large-scales.
• Veconlab: The Virginia Economics Laboratory
http://veconlab.econ.virginia.edu/admin.htm/
• OSG: Open Science Grid http://www.open- Focuses on game theory and social interactions in
sciencegrid.org/ The national physics grid arising economics and related fields. The Veconlab server
from the GriPhyN, iVDGL, and DOE’s PPDG provides a set of about 40 web-based programs
projects. As an operational grid spread across that can be used to run interactive, social sci-
dozens of institutions and DOE sites, OSG ence experiments for either teaching or research
primarily supports high energy physics grid ap- purposes.
plications and some non-physics applications in
an opportunistic grid dominated by commodity • Vlab: The Virtual Laboratory for Earth and Plan-
clusters whose CPU counts total into the thou- etary Materials http://www.vlab.msi.umn.edu/,
sands. The OSG Consortium builds and operates An interdisciplinary consortium for development
the OSG, bringing resources and researchers from and promotion of the theory of planetary materi-
universities and national laboratories together and als. Computational determination of geophysi-
cooperating with other national and international cally important materials properties at extreme
infrastructures to give scientists from many fields conditions provides accurate information to a)
access to shared resources worldwide. interpret seismic data in the context of likely
geophysical processes and b) be used as input
for more sophisticated and reliable modeling of
planets.
- 51 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
- 52 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
• ORION and OOI: Ocean Research Interactive • CIG: Computational Infrastructure for Geody-
Observatory Networks http://orionprogram. namics http://www.geodynamics.org/ Member-
org/ and http://www.orionocean.org/OOI/de- ship-governed organization that supports Earth
fault.html An integrated observatory network science by developing and maintaining software
for oceanographic research and education with for computational geophysics and related fields.
interactive access to ocean observing systems that CIG consists of: (a) a coordinated effort to
track a wide range of episodicity and temporal develop reusable and open-source geodynamics
change phenomena. OOI has three elements: 1) software; (b) an infrastructure layer of software
a global-scale array of relocatable deep-sea buoys, for assembling state-of-the-art modeling; (c)
2) a regional-scaled cabled network consisting extension of existing software frameworks to
of interconnected sites on the seafloor, and 3) interlink multiple codes and data through a su-
network of coastal observatories. perstructure layer; (d) strategic partnerships with
computational science and geoinformatics; and
Other Virtual Organizations: (e) specialized training and workshops.
• AToL: The Tree of Life initiative http://atol.sdsc. • CLIVAR: Climate Variability and Predictability
edu/ Supports the reconstruction of the evolu- http://www.clivar.org/ An international research
tionary history of all organisms, seen as a grand program addressing many issues of natural
challenge. The AToL research community works climate variability and anthropogenic climate
toward describing the evolutionary relationships change, as part of the wider World Climate Re-
of all 1.7 million described species. search Programme (WCRP). It makes available
a wide variety of climate-related data and models
• CASA: Center for Adaptive Sampling of the At- via a web interface and data management system.
mosphere http://www.casa.umass.edu/ Integrates
advancements in radar technology, networking • CUAHSI HIS: The Consortium of Universities
and atmospheric sciences to develop a new, low for the Advancement of Hydrologic Science, Inc.
cost network of weather radars that can adapt http://www.cuahsi.org/ Represents about 100
sampling procedures in real time in order to U.S. universities and develops infrastructure and
optimize information and provide unprecedented services that support advancement of hydrologic
details of severe storms. science and education. The CUAHSI Hydrologic
Information System (HIS) project is conducted
• CCMC: The Community Coordinated Modeling by a group of academic hydrologists collaborat-
Center http://ccmc.gsfc.nasa.gov/ Provides access ing with the San Diego Supercomputer Center
to modern space science simulations and sup- as a technology partner to produce a prototype
ports the transition to space weather operations Hydrologic Information System.
of modern space research models. To support
community research CCMC adopts state of the • C-ZEN: Critical Zone Exploration Network
art space weather models that are developed by http://www.czen.org/ Promotes interdisciplinary
outside researchers, executes simulation runs research on the zone defined by the outer limits
with these models, offers a variety of visualization of vegetation and the lower boundary of ground
and output analysis tools, and provides access to water – identified by the National Research
coupled models and existing model frameworks. Council as the Critical Zone (NRC, 2001).
CZEN cyberinfrastructure consists of a critical
• CEDAR: Coupling, Energetics and Dynam- zone ontology, development of tools for knowl-
ics of Atmospheric Regions http://cedarweb. edge discovery and management, and data and
hao.ucar.edu/cgi-bin/ion-p?page=cedarweb.ion metadata standards development.
For characterization and understanding of the
atmosphere above ~60 km, with emphasis on • DEISA: Distributed European Infrastructure
the processes that determine the basic composi- for Supercomputing Applications http://www.
tion and structure of the atmosphere. Activities deisa.org/ A consortium of leading national
include collaborative research projects, multi-site supercomputing centers that currently deploys
field campaigns, workshops, and participation of and operates a persistent, production quality,
graduate and undergraduate students. distributed supercomputing environment with
- 53 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
European continental scope. DEISA’s purpose is facilitate research in primate genetic diversity and
to enable discovery across a spectrum of science evolution, comparative genomics, and population
and technology through deep integration of exist- genetics.
ing national high-end computational platforms,
coupled by a dedicated network and supported by
• LEAD: Linked Environments for Atmospheric
innovative system and grid software. Discovery: http://lead.ou.edu/ Seeks to miti-
gate the impacts of extreme weather events by
• EGEE: The Enabling Grids for E-sciencE http:// accommodating the real time, on-demand, and
public.eu-egee.org/ Funded by the European dynamically-adaptive needs of mesoscale weather
Commission to build on recent advances in grid research. A major underpinning of LEAD
technology and develop a service grid infrastruc- is dynamic workflow orchestration and data
ture which is available to scientists 24 hours-a- management in a web services framework with
day. Spanning over 30 countries and 150 sites grid-enabled systems.
across Europe, EGEE supports applications in
Earth Sciences, High Energy Physics, Bioinfor-
• MARGINS: http://www.nsf-margins.org/Home.
matics, Astrophysics, and other domains. html Focuses on understanding the complex in-
terplay of processes that govern creation and de-
• GCAT: Genomic Consortium for Active Teach- struction of continental margins and result in the
ing http://www.bio.davidson.edu/projects/gcat/ concentration of both resources and geohazards
gcat.html A consortium of institutions whose where land and oceans meet. Data from collab-
purpose is to integrate genomic techniques in orative research projects are available via a Data
the undergraduate life sciences curricula. GCAT Management System and web-based services.
provides faculty with resources and training to
utilize the technology and means to share data
• NCAR: National Center for Atmospheric
across institutions. Research http://www.ncar.ucar.edu/ A federally
funded research and development center that,
• GEON: The Geosciences Network http://www. together with partners at universities and research
geongrid.org/ Advances geoinformatics by train- centers, studies Earth’s atmosphere and its inter-
ing geoscience researchers, educators, students actions with the Sun, the oceans, the biosphere,
and practitioners in the use of cyberinfrastruc- and human society. NCAR hosts numerous
ture. GEON provides a service-oriented archi- projects that span the full range of VO activities
tecture (SOA) for support of “intelligent” search, and it recently launched a set of integrative initia-
semantic data integration, visualization of 4D tives designed to explore the Earth system from a
scientific datasets, and access to high performance 21st-century vantage point: http://www.ncar.ucar.
computing platforms for data analysis and model edu/stratplan/initiatives.html
execution, via the GEON Portal.
• NCED: National Center for Earth-surface
• GLOBEC: U.S. GLOBEC (GLOBal ocean Dynamics http://www.nced.umn.edu/ Fosters an
ECosystems dynamics) http://www.pml.ac.uk/ integrated, predictive science of the skin of the
globec/ Looks at how climate change and vari- Earth – “critical zone” - where interwoven physi-
ability translate into changes in the structure and cal, biological, geochemical, and anthropogenic
dynamics of marine ecosystems, marine animal processes shape Earth’s surface. NCED’s main
populations and in fishery production. Research cyberinfrastructure activities relate to a California
approaches use inter-related modeling, process- field site at Angelo Coast Range Reserve with a
oriented studies, broad scale observations, and steep, relatively rapidly eroding landscape where
retrospective studies. an advanced environmental observatory with a
wireless network and automated environmental
• IPBIR: Integrated Primate Biomaterials and sensors are under construction.
Information Resource http://www.ipbir.org/ To • NCN: Network for Computational Nanotech-
assemble, characterize, and distribute high-qual- nology http://www.ncn.purdue.edu/ Advances
ity DNA samples of known provenance with approaches and simulation tools that allow engi-
accompanying demographic, geographic, and neers to design new nanoelectronic and NEMS
behavioral information in order to stimulate and technologies. The fields include: i) nanoelectron-
- 54 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
ics, ii) NEMS, and iii) nano-bioelectronics, with and sharing of seagoing logistical operations. Key
the overall goal of connecting dry electronic and SBN components include regular workshops and
mechanical nanosystems to wet biological nano- the development of a community website and
systems. NanoHUB http://www.nanohub.org/ database.
is a web-based initiative spearheaded by NCN
that provides online simulation tools for nano-
• Suominet: Real Time Integrated Atmospheric
electronics and molecular dynamics, along with Water Vapor and TEC from GPS http://www.
semiconductor devices, processes and circuits. suominet.ucar.edu/ Measures phase delays
induced in GPS signals by the ionosphere and
• PEATNET: Peatland Ecosystem Analysis and neutral atmosphere with high precision and con-
Training Network http://www.peatnet.siu.edu/ verts these delays into integrated water vapor and
An international interdisciplinary collaborative total electron content (TEC). These measure-
effort focused on northern peatland ecosystems. ments contribute to our understanding of the
Web site serves as a mechanism to coordinate Earth’s weather and climate system and TEC data
PEATNET activities and communicate findings addresses topics in upper atmospheric research.
and outcomes, and as the location of a peatland
Web-discussion group and the PEATNET Virtual
• TeraGrid www.teragrid.org An open scientific
Resource Center. discovery infrastructure combining leadership
class resources at eight partner sites to create an
• PRAGMA: The Pacific Rim Application and Grid integrated, persistent computational resource.
Middleware Assembly www.pragma-grid.net/mis- Deployed in September 2004, TeraGrid brings
sion.htm Formed to establish sustained collabo- over 40 teraflops of computing power and nearly
rations and advance the use of grid technologies 2 petabytes of rotating storage, and specialized
in applications among a community of investiga- data analysis and visualization resources into pro-
tors working with leading institutions around the duction, interconnected at 10-30 gigabits/second
Pacific Rim. In PRAGMA, applications are the via a dedicated national network.
key, integrating focus that brings together neces-
sary infrastructure and middleware to advance the • TESS: Time-sharing Experiments for the Social
application’s goals. Sciences http://www.experimentcentral.org/
Permits original data collection through national
• RIDGE http://ocean-ridge.ldeo.columbia.edu/ telephone surveys to which researchers can add
general/html/home.html A long-term research their original questions and through arrange-
program to study Earth’s oceanic spreading ridge ments that allow researchers to run their own
system as an integrated whole, from its inception studies on random samples of the population that
in the mantel to its manifestations in the bio- are interviewed via the Internet.
sphere and water column. Ridge 2000 supports
interdisciplinary research and availability of data • UNAVCO http://www.unavco.org/ A non-profit,
through web-based services. membership-governed consortium that supports
and promotes high-precision techniques for
• SAHRA: The Center for Sustainability of semi- the measurement and understanding of crustal
Arid Hydrology and Riparian Areas http://www. deformation. The data processing center analyzes
sahra.arizona.edu/ Promotes sustainable manage- and distributes GPS data from continuous GPS
ment of semiarid and arid water resources, with installations globally, including the dense array of
three components: the SAHRA Geo-database stations dedicated to the EarthScope project.
(SGD) for data storage and sharing, hydrologic
observations, and integrated modeling at three
resolutions.
- 55 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
IMAGE CREDITS
PAGE CREDIT
Cover and © 2002 The Regents of the University of California
inside cover
i Sam Kittner/kittner.com
iv Courtesy of Ronald Cohen, Carnegie Institution of Washington
2 Blake Harvey, NCSA
4 Marcus Thiebaux, Information Sciences Institute, University of Southern California
6 Aleksei Aksimentiev and Klaus Schulten, Theoretical and Computational Biophysics
Group, Beckman Institute for Advanced Science and Technology, University of
Illinois at Urbana-Champaign
7 Christy Bowe, ImageCatcher News Service
8 (top) © 2007 JupiterImages Corporation
8 (bottom) Marcelo Garcia and Mariano Cantero, University of Illinois at Urbana-
Champaign, and S. Balachandar, University of Florida
9 Leroy N. Sanchez, Los Alamos National Laboratory
11 Peter West, National Science Foundation
12 © University Corporation for Atmospheric Research
13 Bob Wilhelmson, NCSA and the University of Illinois at Urbana-Champaign; Lou
Wicker, National Oceanic and Atmospheric Administration’s National Severe
Storms Laboratory; Matt Gilmore and Lee Cronce, University of Illinois
Atmospheric Science Department. Visualization by Donna Cox, Robert Patterson,
Stuart Levy, Matt Hall and Alex Betts, NCSA
14 Courtesy of Theoretical and Computational Biophysics Group, Beckman Institute,
University of Illinois at Urbana-Champaign; Mercury Cluster background image
courtesy of NCSA and the Board of Trustees of the University of Illinois
15 Gary Strand, NCAR
16 Scientific contact Ed Seidel (eseidel@aci.mpg.de); simulations by Max Planck
Institute for Gravitational Physics (Albert-Einstein-AEI); visualization by Werner
Benger, Zuse Institute, Berlin (ZIB) and AEI. The computations were performed on
NCSA’s It
17 Peter Virnau, Massachusetts Institute of Technology
19 Kyle McQuilkin and Ryan Ridgely, Ohio University
- 56 -
CYBERINFRASTRUCTURE VISION
National Science Foundation March 2007
FOR 21ST CENTURY DISCOVERY
PAGE CREDIT
20 James Matthews, Linghao Zhong and John Brady, Cornell University; Mike
Himmel and Mark Nimlos, NREL; Tauna Rignall, Colorado School of Mines; Mike
Crowley, The Scripps Research Institute. Work was funded by an NREL subcontract
funded by DOE Office of the Biomass Program
21 Jason C. Fisher, © 2005 American Institute of Biological Sciences
22 National Virtual Observatory and California Institute of Technology
(http://us-vo.org/)
23 National Science Foundation
24 Courtesy of Montage, IPAC, Caltech, SDSC. This research made use of Montage,
funded by NASA’s Earth Science Technology Office, Computation Technologies
Project, under Cooperative Agreement Number NCC5-626 between NASA and
the California Institute of Technology. Montage is maintained by the NASA/IPAC
Infrared Science Archive
25 IRIS
26, 27 © CERN
28 Wolfgang Bluhm
30 EarthScope
32 Curtis Suplee, National Science Foundation
33 Ethan Dicks, National Science Foundation
34 John Orcutt, Scripps Institute of Oceanography
35 HPWREN
36 Peggy Greb, Agricultural Research Service Image Gallery, USDA
38 Prairie View Solar Observatory
39 The JASON Project
40 Gary Meek, Georgia Institute of Technology
41 Theoretical and Computational Biophysics Group, University of Illinois at
Urbana-Champaign
42 Patrick Motl, Mario D’Souza, Joel E. Tohline, and Juhan Frank, Louisiana
State University
- 57 -
NATIONAL SCIENCE FOUNDATION
4201 WILSON BOULEVARD
ARLINGTON, VA 22230
NSF 07-28