released 11.12.09

William Kramer, Deputy Project Director, Blue Waters
When I was asked to write the editorial for this issue of Access, I realized that I had joined the Blue Waters project a year ago. It has been an amazing, intense, and rewarding year for me and for the project as we transitioned from mostly planning to executing all of the project components. I came to Blue Waters with nothing more than the information that was public at the time (which means precious little) and a great deal of anticipation. Once I was able to see the details of the base Blue Waters system and the entire project, I was excited to see the Blue Waters system align so well with the philosophy and goals I used in deploying and running 19 previous high-performance computing (HPC) systems to serve the general science community.
The computational science and engineering community requires five attributes from the systems they use and the facilities that provide those systems. These attributes deliver systems that efficiently and productively enhance the scientists' ability to achieve novel results. They are performance, effectiveness, reliability, consistency, and usability (which I refer to as the PERCU method). This is a holistic, user-based approach to developing and assessing computing systems, in particular HPC systems. The method enables organizations to use flexible metrics to assess the features and functions of HPC systems and, if they choose to purchase systems, assess them against the requirements negotiated with the vendor.
Blue Waters epitomizes a project and a system design dedicated to providing those five attributes to the widest range of science and engineering areas. A key, and in many ways unique, aspect of Blue Waters is that the entire project is focused only on sustained performance for a wide range of science problems. This translates to Blue Waters being dedicated to time to solution as a way to assess the productive work potential for an arbitrarily large set of applications. While the sustained performance will be measured on several petascale benchmarks that represent as-yet-unsolved problems, we are confident that sustained-petascale performance will be achieved for a broad range of applications that scientists and engineers use every day.
Recently, many details of the POWER7 chip, the computational heart of the Blue Waters system, were presented by IBM at the August 2009 Hot Chips conference. Looking at the chip, one can see a number of new features that will make the processor itself the highest performance, most general processor of its time. Beyond the common measures of clock speed and multiple simultaneous operations, the processor will have significant new advantages in the memory hierarchy that enable the POWER7 processor to match memory performance to processor performance. The presentations also hint at another critical area, admittedly not well defined at the moment: the high-performance and balanced way the processors will be integrated together for petascale systems.
The Blue Waters staff is now working with about 20 large science teams to start revising their application codes to take full advantage of the Blue Waters features. Much of the work will enable codes to run well and at large scale on Blue Waters, but the work can also be applied to other systems in the future. We are doing this with simulation of the machine itself, application and system performance modeling with premier modeling groups, and early access to prototype systems and software. Over time, we will engage with other science areas as they are allocated time on Blue Waters.
In summary, I am particularly pleased to report to you that the Blue Waters project is well on track and moving to deliver a world-class resource for science and engineering. Over the next nine months, much more information about the technology in Blue Waters, particularly the innovations in the interconnect, will become available. Further, starting at SC09, we will be sharing more of the information about the base software features and, importantly, the "value added" software features the Blue Waters project is developing or enhancing. All this will culminate in the arrival of the system components and the system's use by the science teams in 2011.
BLUE WATERS: More than a fast, general purpose processor
Blue Waters will be based on IBM's multicore POWER7 processor. The machine will employ 200,000+ processor cores and will provide more than 800 terabytes of memory. The POWER7 processors will be packaged in clusters of four, forming standalone SMP nodes called multichip modules (MCMs) that have 32 cores in a module.
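The figures above imply a few derived quantities that are easy to check. The following is a minimal sketch, assuming decimal units (1 TB = 1000 GB) and treating the quoted "200,000+" and "more than 800 terabytes" as round numbers; because both are lower bounds, the ratios are indicative only.

```python
# Figures quoted in the sidebar; the "+" and "more than" make them lower bounds.
TOTAL_CORES = 200_000      # "200,000+ processor cores"
TOTAL_MEMORY_TB = 800      # "more than 800 terabytes of memory"
CHIPS_PER_MCM = 4          # "packaged in clusters of four"
CORES_PER_MCM = 32         # "32 cores in a module"

# Assumption: decimal units, 1 TB = 1000 GB.
GB_PER_TB = 1000

# Cores on each POWER7 chip, implied by four chips sharing 32 cores.
cores_per_chip = CORES_PER_MCM // CHIPS_PER_MCM

# Number of multichip modules needed to reach the quoted core count.
mcm_count = TOTAL_CORES // CORES_PER_MCM

# Indicative memory per core and per module (ratio of two rounded figures).
memory_per_core_gb = TOTAL_MEMORY_TB * GB_PER_TB / TOTAL_CORES
memory_per_mcm_gb = memory_per_core_gb * CORES_PER_MCM

print(f"cores per chip:      {cores_per_chip}")
print(f"MCMs (at minimum):   {mcm_count}")
print(f"memory per core:     {memory_per_core_gb:.1f} GB")
print(f"memory per MCM:      {memory_per_mcm_gb:.1f} GB")
```

With these round numbers the sketch gives eight cores per chip, 6,250 modules, and roughly 4 GB of memory per core.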
Blue Waters is projected to have as much as 500 petabytes of archival disk storage and much more than 10 petabytes of usable disk space. IBM's General Parallel File System (GPFS) and HPSS will be combined into a managed file system using the GPFS-HPSS Interface (GHI) software. As a result, the apparent disk space a user will have direct access to will be higher by at least an order of magnitude than anywhere today, with corresponding bandwidth increases. The file system and archive will be substantially larger, faster, more reliable, and easier to use than similar systems on today's platforms.
The file system will automate many storage and data transfer tasks done manually by users today. Researchers will have a simplified and easily searched view of their data. They will be able to set information lifecycle management (ILM) policies that establish where their data are stored, how long data are kept in the faster-access file system instead of the tape-based archive, and how the data are backed up and retrieved. This is markedly more efficient than current systems, where researchers must log into multiple systems, manually transfer their data, keep track of where those data are stored, and confirm that transfers have completed successfully. Blue Waters is also enhancing HPSS by adding Redundant Array of Inexpensive Tape functions to increase resiliency while still fully automating all data storage.
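To make the idea of a data-lifetime policy concrete, here is a minimal sketch of a rule that decides whether a file belongs in the fast file system or in the archive. It is not the GPFS or HPSS policy language; the class names, the `days_on_disk` field, the file paths, and the dates are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LifecyclePolicy:
    """Hypothetical per-project policy; the field name is illustrative only."""
    days_on_disk: int  # how long an idle file stays in the fast file system


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata record for one stored file."""
    path: str
    last_access: datetime


def placement(record: FileRecord, policy: LifecyclePolicy, now: datetime) -> str:
    """Return 'disk' while the file is within its disk lifetime, else 'archive'."""
    idle = now - record.last_access
    if idle <= timedelta(days=policy.days_on_disk):
        return "disk"
    return "archive"


# Example: keep files on disk for 30 days after their last access.
policy = LifecyclePolicy(days_on_disk=30)
now = datetime(2011, 6, 1)
recent = FileRecord("run42/checkpoint.h5", datetime(2011, 5, 20))
stale = FileRecord("run07/output.dat", datetime(2011, 1, 15))

print(recent.path, "->", placement(recent, policy, now))
print(stale.path, "->", placement(stale, policy, now))
```

The point of the sketch is that the researcher states the rule once and the system applies it to every file, rather than the researcher moving each file by hand.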
NCSA is testing the GPFS-GHI-HPSS managed file system on interim computing systems and other testbeds.
Blue Waters will be housed at the new Illinois Petascale Computing Facility; the facility's network will be capable of transferring 100 to 400 gigabits of data per second.
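For a sense of scale, the quoted network rates can be turned into transfer times. This is a rough sketch assuming decimal units (1 PB = 10^15 bytes, 1 gigabit = 10^9 bits) and an ideal link with no protocol overhead or contention; the 1-petabyte dataset size is an illustrative choice, not a figure from the project.

```python
def transfer_hours(data_petabytes: float, link_gbps: float) -> float:
    """Ideal time in hours to move a dataset over a link of the given rate.

    Assumes decimal units and ignores protocol overhead and contention.
    """
    bits = data_petabytes * 1e15 * 8        # petabytes -> bytes -> bits
    seconds = bits / (link_gbps * 1e9)      # gigabits per second -> bits per second
    return seconds / 3600


# Illustrative 1 PB dataset at the low and high ends of the quoted range.
for rate in (100, 400):
    print(f"1 PB at {rate} Gb/s: {transfer_hours(1, rate):.1f} hours")
```

Under these assumptions, moving one petabyte takes roughly 22 hours at 100 gigabits per second and under 6 hours at 400.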
POWER7 at a glance
Blue Waters will be based on IBM's multicore POWER7 processor. Each POWER7 processor will:
- Include eight high-performance cores, with each core providing 12 execution units.
- Feature simultaneous multithreading that delivers four virtual threads per core.
- Have three levels of cache: private L1 (32 KB) instruction and data caches, private L2 (256 KB) cache, and L3 (32 MB) cache that can be used either as shared cache or separated into dedicated caches for each core, reducing latency.
- Combine the dense, low-power attributes of eDRAM with the speed and bandwidth advantages of SRAM for optimized performance and power usage.
- Have two dual-channel DDR3 memory controllers, delivering 100 GB/sec of sustained bandwidth.
- Employ new IBM interconnect technology, providing a high-bandwidth, low-latency interconnect that scales to hundreds of thousands of cores.
- Have six fabric bus interfaces to connect to other cores and groups of cores, providing improved reliability.
- Provide an eight-channel memory subsystem, to enable the solution of memory-intensive problems.
- Provide 32 or more gigabytes of memory per SMP and 2 or more gigabytes of memory per core.
- Support 10 or more data streams.
- Offer vector multimedia extensions on each core with four or more floating-point operations per cycle.
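The per-chip figures in the list above combine into a few derived numbers. The sketch below uses only values quoted in the list; where the list says "or more", the value is treated as a lower bound. The clock frequency is not given here, so it is left as a parameter, and the 1 GHz used in the example is purely a scaling unit, not a statement about the chip.

```python
# Values quoted in the list above ("or more" figures are lower bounds).
CORES_PER_CHIP = 8
THREADS_PER_CORE = 4
FLOPS_PER_CYCLE_PER_CORE = 4      # "four or more floating-point operations per cycle"
MEMORY_BANDWIDTH_GB_S = 100       # sustained bandwidth of the memory controllers
L3_CACHE_MB = 32

# Hardware threads visible on one chip with simultaneous multithreading.
hardware_threads = CORES_PER_CHIP * THREADS_PER_CORE

# Floating-point operations one chip can issue per clock cycle (lower bound).
flops_per_cycle_per_chip = CORES_PER_CHIP * FLOPS_PER_CYCLE_PER_CORE

# Even shares of memory bandwidth and L3 cache if every core is busy.
bandwidth_per_core_gb_s = MEMORY_BANDWIDTH_GB_S / CORES_PER_CHIP
l3_per_core_mb = L3_CACHE_MB / CORES_PER_CHIP


def peak_gflops(clock_ghz: float) -> float:
    """Lower-bound peak GFLOP/s per chip for a given clock (clock is an input)."""
    return flops_per_cycle_per_chip * clock_ghz


print(f"hardware threads per chip: {hardware_threads}")
print(f"flops per cycle per chip:  {flops_per_cycle_per_chip}")
print(f"bandwidth per core:        {bandwidth_per_core_gb_s} GB/s")
print(f"L3 cache per core:         {l3_per_core_mb} MB")
print(f"peak per GHz of clock:     {peak_gflops(1.0)} GFLOP/s")
```

Dividing the bandwidth and cache evenly across cores is a simplification; it is meant only to show the balance between compute and memory that the list describes.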