DDN | Open Letter to the HPC Community At the Dawn of Exascale

November 14, 2011

The Exascale Project is presenting the HPC community with unprecedented challenges. Among them is the need to increase performance to 500 to 1,000 times that of today's largest compute infrastructures. Apart from the obvious physical challenges related to the sheer scale of achieving compute density and energy efficiency, DDN believes there are additional, important information architecture issues that must also be tackled.

DDN recognizes that, just as scientific discovery happens through partnership and collaboration, so should the process of building systems that are 1,000x more capable than today's Petascale storage systems. This advancement must be the product of a cross-community partnership that includes not only infrastructure vendors, but also the ISV community whose applications must scale, and the users of these Exascale systems, on whose success we are all focused. Because of the wide-ranging impact of the Exascale information challenge, we believe it is important to engage the wider HPC community in this collaborative discussion. Some areas of the HPC infrastructure may see substantial change, and we will need to coordinate these solutions if we are to build usable Exascale systems in this decade.

Data is growing faster than our ability to manage it and faster than our scientists' ability to extract useful knowledge from it. The challenge we face today will overwhelm us if we do not step outside the traditional approaches to storage and data management. Examples of the data explosion across HPC disciplines include:

- Genomics: driven by comparative analysis and personalized medicine
- Uncertainty Quantification: statistical probabilities applied to large simulations
- Oil & Gas: high-density, wide-azimuth surveys iteratively measuring reservoirs
- Financial Services: risk quantification for a growing number of markets and securities
- CFD: extending simulations from steady sub-components to full unsteady vehicle analysis

Today's HPC information persistence infrastructure has reached the limits of its ability to scale. Attempts to build on the legacy I/O infrastructure will become increasingly expensive and fragile, as current file system and storage technologies, originally designed for use inside a single computer, reach the limits of their scalability. Below are several areas where DDN believes Petascale architectures are already running into issues of scalability and performance and must change direction.

In-Store Compute - In Exascale, the overall cost of transferring massive amounts of data between storage and compute will outweigh the cost of the compute itself; this is ushering in the era of In-Storage Processing. HPC storage architectures must evolve to leverage what has been learned from the specialized systems purpose-built to handle today's Web-scale problems. One example from the web is in-store compute, where Map/Reduce systems ship functions to where the data is stored, eliminating network bottlenecks and forwarding only the preprocessed results for further analysis. For HPC, it may be even more important to use this capability to manage metadata, run pre- and post-processing functions, and deliver tightly integrated ILM services such as data retention, archiving and retirement. A brief sketch of the idea follows.
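To make the in-store compute idea concrete, here is a minimal sketch in which map and reduce functions are shipped to hypothetical storage nodes so that only reduced results cross the network. The class and function names are illustrative assumptions, not an existing DDN or Map/Reduce API.

```python
# Minimal sketch of "shipping the function to the data" (hypothetical API):
# each storage node applies a user-supplied function to its local blocks and
# returns only the reduced result.

class StorageNode:
    def __init__(self, local_blocks):
        self.local_blocks = local_blocks          # data resident on this node

    def run(self, map_fn, reduce_fn):
        # Execute next to the data; only the small reduced value leaves the node.
        return reduce_fn(map_fn(block) for block in self.local_blocks)

def in_store_query(nodes, map_fn, reduce_fn):
    # Ship the functions to every node, then combine the per-node results.
    partials = [node.run(map_fn, reduce_fn) for node in nodes]
    return reduce_fn(partials)

# Example: count records above a threshold without moving the raw blocks.
nodes = [StorageNode([[3, 9, 1], [7, 2]]), StorageNode([[8, 8, 0]])]
total = in_store_query(nodes,
                       map_fn=lambda block: sum(1 for x in block if x > 5),
                       reduce_fn=sum)
print(total)  # -> 4
```

The same pattern could apply equally to metadata scans or post-processing passes, where the reduced result is far smaller than the data it summarizes.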
Object Stores - Traditional, tree-based POSIX file systems are another area creating a significant bottleneck to scalability, namespace spanning in particular. We are now evolving toward true object stores, which also align more closely with a new programming paradigm. Big Data challenges, for example, have driven Web systems architects to build far more scalable key-value data stores that resolve the traditional limitations of doing business in these highly scalable, fully distributed environments. The sketch below illustrates the interface difference.
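As a rough illustration of that interface difference, the following sketch shows a flat key-value object store in which an application stores and retrieves objects by handle instead of resolving a path through a directory tree. This is an assumption-laden toy, not any particular product's API.

```python
# Illustrative only: a flat key-value object store avoids walking a directory
# tree (and its per-directory metadata contention) to resolve a name.

import hashlib

class ObjectStore:
    def __init__(self):
        self._objects = {}                            # object_id -> (metadata, payload)

    def put(self, payload, metadata=None):
        oid = hashlib.sha256(payload).hexdigest()     # content-derived key
        self._objects[oid] = (metadata or {}, payload)
        return oid                                    # handle returned to the application

    def get(self, oid):
        return self._objects[oid]                     # one flat lookup

store = ObjectStore()
oid = store.put(b"checkpoint shard 17",
                metadata={"run": "climate-2011", "step": 4200})
meta, data = store.get(oid)
```

Because the namespace is flat and the handle is self-describing, such a store can be partitioned across many servers without a single metadata tree becoming the bottleneck.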

Knowledge Management - The explosion of data in HPC must not only be stored and accessed; the content must also be managed and curated to unleash its value. As data is shared and updated over time, its provenance becomes ever more critical to the reliability of the experiments that depend on it. Although there have been successful efforts toward data model definition and taxonomies in some fields, most existing content management systems - and the knowledge derived from that content - have been ad hoc and added late in the cycle. Building knowledge management into the foundation of the information architecture enables team collaboration and the sharing of valuable insight between scientists who would otherwise be overwhelmed by the sheer volume of raw data. (A minimal provenance-record sketch appears after the questions below.)

Next Generation Solid State Storage - There are numerous opportunities to improve the storage hierarchy using the next generation of solid-state devices. Fully utilized, these will extend the capabilities glimpsed with FLASH-based SSDs to lower cost, improve reliability and dramatically improve performance. The next generation of solid state memory technologies promises performance similar to DRAM, superior non-volatile storage characteristics and architectures that will allow them to be very cost-effective. This has the potential to shift the storage paradigm from a block-centric I/O model to a byte-addressable, memory-mapped model, as sketched after the questions below.

Behavioral Systems Analytics - Looking at the entire Exascale complex, it is clear that system management will need to take a broader and more holistic view of the entire application process. To enhance systems management, software and hardware sensors could be placed at every critical juncture or level of the HPC architecture to provide state or metering data. By using this sensor- or agent-based telemetry, we should be able to capture evidence of system behavior, analyze and visualize it, and derive proactive operational optimizations for the overall HPC process; a small telemetry sketch also follows the questions below.

DDN has worked with some of the leading HPC users in the world, listening to their needs and building HPC systems that help them deliver better results, faster. It is clear that the move to Exascale will require disruptive innovation in the HPC architecture, and in the I/O subsystem in particular, to reach this next level of capability. As such, DDN offers a series of questions to the HPC community:

- What types of processing phases (pre, core, post) would benefit from In-Store Compute (as an alternative to moving data from storage to cluster for every type of function)?
- Can applications evolve away from POSIX to a native object interface (if the benefit is to break the metadata management scalability bottlenecks)?
- Should knowledge management and provenance tracking become a component of the system's foundation (or is this better managed at the application layer)?
- What new capabilities do future solid-state memory technologies (and the change in the storage I/O paradigm from block access to byte-addressable memory mapping) bring to HPC?
- Does the community believe that HPC systems efficiency can be improved by leveraging telemetry and behavioral analytics?
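On knowledge management, a minimal sketch of a provenance record follows, assuming a hypothetical schema; the field names are illustrative and not drawn from any DDN product or standard.

```python
# Hypothetical provenance record attached to a dataset at write time.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ProvenanceRecord:
    dataset_id: str
    produced_by: str                                   # application or workflow step
    inputs: list = field(default_factory=list)         # upstream dataset ids
    parameters: dict = field(default_factory=dict)     # run configuration
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

rec = ProvenanceRecord(
    dataset_id="sim-output-0042",
    produced_by="cfd-solver v3.1",
    inputs=["mesh-0007", "boundary-conditions-0003"],
    parameters={"timestep": 1e-4, "turbulence_model": "LES"})
```

Captured at the storage layer rather than bolted on afterwards, records like this would let a later reader reconstruct how a result was produced.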
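On the byte-addressable model, the sketch below uses today's file-backed memory mapping as a stand-in for future persistent memory; the file name and sizes are arbitrary assumptions. The point is that the application updates a few bytes in place rather than issuing a block-sized read/modify/write.

```python
# Sketch: byte-addressable access via mmap as a stand-in for persistent memory.
import mmap

path = "state.bin"                          # illustrative persistent region
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)                 # pre-size the region

with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as view:
    view[128:136] = (12345).to_bytes(8, "little")   # update 8 bytes in place
    view.flush()                                    # make the update durable
```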
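And on behavioral analytics, a small sketch using a synthetic sensor and a simple statistical baseline; the metric, distribution and threshold are all illustrative assumptions rather than recommendations.

```python
# Hypothetical agent-style telemetry: learn a baseline for one node-level
# metric and flag readings that deviate sharply from it.
import random
import statistics

def sample_io_wait():
    # Stand-in for a hardware/software sensor (percent of time in I/O wait).
    return random.gauss(12.0, 2.0)

window = [sample_io_wait() for _ in range(100)]      # rolling telemetry window
baseline_mean = statistics.mean(window)
baseline_stdev = statistics.pstdev(window)

latest = sample_io_wait()
if abs(latest - baseline_mean) > 3 * baseline_stdev:
    print(f"behavioral outlier: io_wait {latest:.1f}% vs baseline {baseline_mean:.1f}%")
else:
    print(f"io_wait {latest:.1f}% is within the learned baseline")
```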

DDN looks forward to an open discussion about the key issues and opportunities confronting the HPC community as it moves toward Exascale.

Regards,

Jean-Luc Chatelain and the Office of the CTO, DataDirect Networks

© 2011 DataDirect Networks. All Rights Reserved.
