Past Awardee

Efficient Parallelization of a Risk Management Model on the NT Supercluster

Barbara Minsker

College: Engineering
Award year: 1999-2000

The objective of this proposal is to investigate computationally efficient methods for parallelizing a risk-based corrective action (RBCA) design model on distributed commodity computers. The model uses a noisy genetic algorithm (GA), a groundwater fate and transport simulation model, and an exposure and risk assessment model to identify cost-effective, reliable strategies for cleaning up contaminated groundwater under conditions of uncertainty. The model represents a considerable advance in the groundwater management field, allowing cost tradeoffs to be made among short- and long-term solutions to contamination problems at thousands of sites. Before the model can be applied to large-scale field sites, however, an efficient parallel version of the code must be developed. Because the ultimate users of the model will be practitioners and government regulators, who will most likely have access to networks of commodity computers, the parallel version of the code will be developed on NCSA's NT supercluster.
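A noisy GA evaluates each candidate design by sampling the uncertain inputs rather than running a single deterministic simulation. The following is a minimal sketch of that idea only, not the project's model: `simulate_cost`, the uncertain `conductivity` parameter, and the cost formula are hypothetical stand-ins for the transport and risk models described above.

```python
import random

def simulate_cost(design, rng):
    """Hypothetical stand-in for one simulation run (lower cost is better)."""
    # Assumption: a single uncertain aquifer parameter, drawn anew for each run.
    conductivity = rng.lognormvariate(0.0, 0.5)
    pumping_cost = sum(design)
    residual_risk = conductivity / (1.0 + sum(design))
    return pumping_cost + 100.0 * residual_risk

def noisy_fitness(design, n_samples, rng):
    """Estimate the fitness of a design as the mean cost over sampled realizations."""
    total = 0.0
    for _ in range(n_samples):
        total += simulate_cost(design, rng)
    return total / n_samples

if __name__ == "__main__":
    rng = random.Random(0)
    design = [1.0, 2.0]  # hypothetical pumping rates at two wells
    print("estimate from  1 sample :", noisy_fitness(design, 1, rng))
    print("estimate from 50 samples:", noisy_fitness(design, 50, rng))
```

Each sample costs one full simulation, so the number of samples per design trades accuracy of the fitness estimate against run time. That cost is one reason the fitness evaluations are a natural target for parallelization.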

The first task of the proposal will be to port the code to the cluster and optimize its performance. Three approaches to parallelization will then be investigated. The first approach is to use a single GA population distributed to multiple processors. The second approach will use multiple GA populations evolving in isolation but periodically exchanging ("migrating") individuals. Each population will run on a single computer, which will distribute the population among its two or four processors using the first approach. The third approach combines the first two approaches in an innovative, hierarchical island-injection scheme. Under this approach, lower-level populations search different areas of the decision space using a faster, analytical approximation to the full risk model. The lower-level populations pass good solutions to the higher-level populations, which then polish the solutions using the full model. For each parallelization approach, a number of design issues will be investigated in collaboration with the performance engineering and consulting groups at NCSA. To the extent possible, current theory on parallel GA design will be used to guide the investigation. Results of the research will be disseminated to other researchers through a report summarizing the methods and results of parallelizing the code on the clusters, through interactions with the environmental hydrology application technology team at NCSA, and through on-campus seminars and presentations at national conferences.
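The hierarchical island-injection idea can be illustrated with a small serial sketch. It is not the proposal's implementation: the objective functions `full_model` and `approx_model`, the GA operators, and all parameter values are assumptions chosen for illustration. In a parallel version each island would run on its own node instead of inside one loop.

```python
import random

def full_model(x):
    """Hypothetical stand-in for the expensive full model (lower is better)."""
    return sum((xi - 1.0) ** 2 for xi in x)

def approx_model(x):
    """Hypothetical cheap approximation: the full objective with a small bias."""
    return sum((xi - 0.9) ** 2 for xi in x)

def evolve(pop, fitness, rng, sigma=0.1):
    """One generation: elitism, binary tournament selection, Gaussian mutation."""
    nxt = [min(pop, key=fitness)]  # keep the current best unchanged
    while len(nxt) < len(pop):
        a, b = rng.sample(pop, 2)
        parent = a if fitness(a) < fitness(b) else b
        nxt.append([g + rng.gauss(0.0, sigma) for g in parent])
    return nxt

def island_injection(n_islands=3, pop_size=20, n_genes=4,
                     generations=40, interval=5, seed=0):
    rng = random.Random(seed)

    def new_pop():
        return [[rng.uniform(-5.0, 5.0) for _ in range(n_genes)]
                for _ in range(pop_size)]

    lower = [new_pop() for _ in range(n_islands)]  # searched with approx_model
    upper = new_pop()                              # polished with full_model
    history = [min(map(full_model, upper))]
    for gen in range(1, generations + 1):
        lower = [evolve(p, approx_model, rng) for p in lower]
        upper = evolve(upper, full_model, rng)
        if gen % interval == 0:
            # Injection: the best individual of each lower-level island
            # replaces one of the worst individuals in the upper population.
            upper.sort(key=full_model)
            for k, island in enumerate(lower):
                upper[-1 - k] = list(min(island, key=approx_model))
        history.append(min(map(full_model, upper)))
    return upper, history

if __name__ == "__main__":
    _, history = island_injection()
    print("best full-model cost, first and last generation:",
          history[0], history[-1])
```

Solutions flow one way, from the approximate level to the full-model level, so the lower-level islands never wait on the expensive evaluations. The injection interval and the number of islands are among the design choices such a scheme leaves open.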

This research is expected to produce a number of benefits. The risk management model described in this work is an excellent case study for examining issues associated with parallelization of computationally intensive applications on the new supercluster. The flexible, modular structure of genetic algorithms allows easy investigation of multiple parallelization strategies using both shared and distributed memory resources. The results of the research should be useful for other researchers who are considering implementation of computationally intensive codes on this type of architecture. The proposed work should yield particularly valuable insights to the many researchers using coupled genetic algorithms and simulation models, which are being applied to increasingly complex engineering and science problems.

Moreover, the results of this research are expected to provide considerable benefits to the groundwater remediation field. Given the scope of groundwater contamination in the United States and elsewhere and the vast amount of money involved in remediation, improving the risk management and design of these cleanups is a critical need. Savings of even a few percent of costs at individual sites could result in millions of dollars saved nationwide. The proposed research will help to enable application of a valuable risk management screening tool to large-scale, complex field sites, where such a model is most needed.