Getting viral

09.01.09

By J. William Bell

Some early prospective users of Blue Waters will build computing code for global epidemic models.

Infectious diseases can have very different characteristics. Measles strikes in related waves over the course of decades. Any given flu strain tends to peter out in a year. A flu victim is thought to infect about two additional people, a measles victim as many as 14.

But any disease that passes person-to-person shares at least one common characteristic: Its spread can be modeled using a supercomputer.

Those models serve two key purposes—policy planning and emergency response.

"In the planning world, we work with policymakers to design studies of particular outcomes," says Virginia Tech's Keith Bisset. Months of planning, collaboration, and modeling might go into strategies for what a city, county, or entire country might do when facing a disease outbreak.

"But now we also have tools that allow for a quick turnaround. We can do a situational assessment that shows them what a particular [outbreak] might look like tomorrow or next week as it unfolds. They describe the situation, and we can tell them the outcomes of various interventions," he says.

Going global

This spring, Bisset and a group from Virginia Tech joined forces with Shawn Brown of the Pittsburgh Supercomputing Center and with Douglas Roberts and Diglio Simoni of North Carolina's Research Triangle Institute to win one of the first Petascale Computing Resource Allocations awards.

With that support and with computing time on Blue Waters, they expect to model global epidemics, as well as smaller-scale outbreaks. Instead of looking at a few hundred million people, as the team members do with their current codes, they'll look at more than 6 billion people.

"There's a natural limit to how big we make the models in terms of the number of people," Bisset says. "Once we're doing every person in the U.S. or every person in the world at one-minute intervals, there's no value to making it bigger."

Making it that big, however, will require a lot of work. The team estimates that a global model on 2,000 processors of a contemporary supercomputer would take about two years to complete. "This is clearly unacceptable," they said in their Petascale Computing Resource Allocations proposal.

Part of the challenge is that a single predictive model is often based on thousands of computing runs, each with slightly different parameters representing things like people's social contact and behavior and the different disease characteristics.

In contrast to other supercomputing problems, for example large physics simulations, "the outcome of a single run is not interesting by itself," according to the proposal. "Simulations of infectious disease outbreaks are not run for their own sake, but to investigate specific questions about prevention and mitigation. Answering these questions requires analyzing the interdependent effects of many different parameters."

"The goal is not to answer a single question of what happens 'if'," explains Virginia Tech's Stephen Eubank. "It's to compare. Is intervention A better than intervention B?" In other words, does closing a city's schools have more impact than giving citizens a prophylactic medication against infection? Or, for that matter, does doing either or both have enough of an impact to justify the social and economic costs involved?
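A toy illustration of that kind of comparison, and not the team's code: the Python sketch below runs a small, well-mixed stochastic outbreak many times per scenario and compares the average outbreak sizes. Every number in it (500 people, 10 contacts a day, the transmission probabilities, 30 runs) is an assumption chosen for illustration, as are the function names. The baseline is tuned so that one case infects about two others, in line with the flu figure above.

```python
import random
from collections import deque
from statistics import mean


def simulate_outbreak(contacts_per_day, p_transmit, seed,
                      population=500, infectious_days=4, max_days=200):
    """One stochastic run of a toy outbreak; returns how many people were ever infected."""
    rng = random.Random(seed)  # a fixed seed makes each run reproducible
    susceptible = population - 1
    # New cases from each of the last `infectious_days` days; these people are infectious.
    recent_cases = deque([1], maxlen=infectious_days)
    total_infected = 1
    for _ in range(max_days):
        infectious = sum(recent_cases)
        if infectious == 0 or susceptible == 0:
            break
        # Chance that one susceptible person avoids infection by everyone infectious today.
        p_escape = (1 - p_transmit * contacts_per_day / population) ** infectious
        new_cases = sum(1 for _ in range(susceptible) if rng.random() > p_escape)
        susceptible -= new_cases
        total_infected += new_cases
        recent_cases.append(new_cases)  # the oldest cohort drops out (recovers)
    return total_infected


def mean_outbreak_size(contacts_per_day, p_transmit, runs=30):
    """Average outbreak size over many runs that differ only in their random seed."""
    return mean(simulate_outbreak(contacts_per_day, p_transmit, seed)
                for seed in range(runs))


# Hypothetical scenarios: (contacts per day, transmission probability per contact).
scenarios = {
    "no intervention": (10, 0.05),
    "close schools": (6, 0.05),             # assumed to cut daily contacts
    "prophylactic medication": (10, 0.02),  # assumed to cut transmission per contact
}
results = {name: mean_outbreak_size(c, p) for name, (c, p) in scenarios.items()}
for name, size in results.items():
    print(f"{name}: mean of {size:.1f} people infected over 30 runs")
```

Averaging over many seeded runs is what makes the comparison meaningful. Any single run can fizzle out or take off by chance, which is why one run on its own says little.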

The team will focus on optimizing their code and scaling it to run on Blue Waters' hundreds of thousands of processors. That global model that would take years on today's supercomputers? They hope to complete it in a couple of weeks on Blue Waters.
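As a back-of-the-envelope check on those figures, the Python sketch below assumes 200,000 processors (the article says only "hundreds of thousands"), processors of equal speed on both machines, and work that divides perfectly. None of these assumptions comes from the team's proposal.

```python
DAYS_PER_YEAR = 365


def ideal_days(baseline_years, baseline_processors, target_processors):
    """Run time in days if the same total work were spread evenly over more processors."""
    processor_days = baseline_years * DAYS_PER_YEAR * baseline_processors
    return processor_days / target_processors


# From the article: about two years on 2,000 processors.
# Assumption: 200,000 processors on the larger machine.
days = ideal_days(baseline_years=2, baseline_processors=2000, target_processors=200_000)

# Fraction of ideal scaling needed to finish in two weeks (14 days) instead.
efficiency_for_two_weeks = days / 14

print(f"Ideal run time: {days:.1f} days")
print(f"Parallel efficiency needed for a two-week run: {efficiency_for_two_weeks:.0%}")
```

Under these assumptions the ideal run time is about a week, so a two-week result would need the code to keep roughly half of its ideal parallel efficiency at that scale.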

"Overall time-to-solution is the measure of effectiveness. Today, many policymakers are forced to use inaccurate tools in place of accurate, but slower, tools," according to the team's proposal. With Blue Waters, they hope to change that.

SPACES

SPACES, the code a Virginia Tech team will develop for Blue Waters, is based on their current code called EpiSims. Codes like these are particularly challenging because they simulate disease moving through a set of overlapping social networks—schools, families, and international travelers.

"It's not just local. It's not just global. It's just irregular. People are close in one network, scattered in another. We can't decouple them," says Stephen Eubank, a research professor at Virginia Tech.

But working with the Blue Waters team through the Petascale Computing Resource Allocations program will allow them to:

  • break the code into pieces to run over hundreds of thousands of processors.
  • develop schemes for automatically balancing the computing load so that parts of the simulation that share data are near one another and can thus pass the data more quickly.
  • take advantage of virtualization to move parts of the computation to other processors and memory in Blue Waters, should an individual component fail.

These improvements will be implemented using Charm++, a parallel programming system developed by Sanjay Kale's team at the University of Illinois at Urbana-Champaign. Kale is a computer science professor, an integral part of the Blue Waters team, and a member of the Institute for Advanced Computing Applications and Technologies.
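The load-balancing idea in the list above can be illustrated with a toy placement problem. The Python sketch below is not Charm++ and not SPACES; the household and school sizes, the three "processors," and all function names are assumptions made up for illustration. It counts how many person-to-person contacts cross a processor boundary under two different placements.

```python
from collections import Counter
from itertools import combinations

NUM_PROCESSORS = 3        # assumed, far smaller than any real machine
HOUSEHOLD_SIZE = 4
NUM_HOUSEHOLDS = 6
HOUSEHOLDS_PER_SCHOOL = 3

# People are numbered 0..23; each household is a block of four consecutive people.
households = [list(range(h * HOUSEHOLD_SIZE, (h + 1) * HOUSEHOLD_SIZE))
              for h in range(NUM_HOUSEHOLDS)]
# The first member of each household also attends a school (a second, overlapping network).
schools = [[households[h][0] for h in range(s, s + HOUSEHOLDS_PER_SCHOOL)]
           for s in range(0, NUM_HOUSEHOLDS, HOUSEHOLDS_PER_SCHOOL)]

# Everyone in the same household or school is in contact with everyone else there.
contacts = set()
for group in households + schools:
    contacts.update(combinations(group, 2))


def round_robin(person):
    """Placement that ignores the networks: deal people out like cards."""
    return person % NUM_PROCESSORS


def by_household(person):
    """Placement that keeps each household together on one processor."""
    return (person // HOUSEHOLD_SIZE) % NUM_PROCESSORS


def remote_contacts(assign):
    """Number of contacts whose two people sit on different processors."""
    return sum(1 for a, b in contacts if assign(a) != assign(b))


def load(assign):
    """Number of people placed on each processor."""
    return Counter(assign(p) for members in households for p in members)


for name, assign in [("round robin", round_robin), ("by household", by_household)]:
    print(f"{name}: {remote_contacts(assign)} of {len(contacts)} contacts cross processors, "
          f"load per processor {sorted(load(assign).values())}")
```

Both placements give every processor the same number of people, but keeping households together leaves only the school contacts crossing processor boundaries. Those remaining links reflect the overlap Eubank describes: people who are close in one network are scattered in another.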

First Blue Waters projects selected

This spring and summer, the National Science Foundation announced the first winners of its Petascale Computing Resource Allocations awards. These research teams will work closely with the Blue Waters project team in preparing their codes to run on the sustained-petascale supercomputer. They'll also be given the opportunity to apply for time on Blue Waters once it comes online in 2011.

NSF plans to select about a dozen teams per year between now and 2011.

"Preparing these codes to run on Blue Waters is absolutely essential to our success," explains Thom Dunning, who leads NCSA and the Blue Waters project. "We're committed to providing sustained-petascale performance on the system, and that means working with scientists now, not in three years.

"It's a very select group, and we're very excited to get started."

Think of them as "The Elite Apps" of the supercomputing set. These are the scientific applications that have made their way through the selection process. The computing codes that have passed muster with their peers. The codes that will get special attention. The teams that are ready to make the most of the most powerful supercomputer in the world for open scientific research.

As of press time, these projects had been chosen to work with the Blue Waters team through the National Science Foundation's Petascale Computing Resource Allocations program. More will be added throughout the next three years.

  • Formation of the First Galaxies: Predictions for the Next Generation of Observatories
    Brian O'Shea, Michigan State University
  • Simulation of Contagion on Very Large Social Networks with Blue Waters
    Keith Bisset, Virginia Tech; Shawn Brown, Carnegie Mellon University; Douglas Roberts, Research Triangle Institute
  • Lattice Quantum Chromodynamics on Blue Waters
    Robert Sugar, University of California, Santa Barbara
  • Super Instruction Architecture for Petascale Computing
    Rodney Bartlett, University of Florida
  • Peta-Cosmology: Galaxy Formation and Virtual Astronomy
    Kentaro Nagamine, University of Nevada, Las Vegas
  • The Computational Microscope
    Klaus Schulten, University of Illinois at Urbana-Champaign

Blue Waters is supported by the National Science Foundation through awards ACI-0725070 and ACI-1238993.