2008 NCSA Private Sector Program Annual Meeting

Speaker Abstracts

A printable version of the abstracts is available for download in PDF format.


Keep Email Lists Secure and Private with SELS
Meenal Pant and Joe Muggli, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Atrium

Today, email lists are a popular group communication tool for cross-site projects. The information shared is often sensitive, so encryption and authentication of messages become essential. The Security Research and Development Group at the National Center for Supercomputing Applications (NCSA) has developed a software solution called SELS (Secure Email List Services) to protect such email lists. The software provides email lists with the same level of end-to-end security that PGP or S/MIME software provides for encrypted communication between individuals. SELS provides digital signature and encryption capabilities while ensuring that neither the list server nor outsiders have access to plaintext emails. SELS is open-source software that can be downloaded and installed using simple, well-documented instructions. Additionally, NCSA has set up a production list-hosting environment to support user lists and pilot studies.
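
For readers unfamiliar with the end-to-end model SELS builds on, the sketch below shows ordinary OpenPGP signing and encryption between two individuals using the python-gnupg package. It is illustrative only and is not the SELS list protocol, which additionally ensures the list server itself never holds plaintext or the keys needed to recover it; the key ID, recipient address, and passphrase are placeholders.

```python
# Illustrative only: ordinary end-to-end OpenPGP between two individuals,
# the baseline that SELS extends to whole mailing lists. The key ID,
# recipient address, and passphrase are placeholders, not SELS APIs.
import gnupg

gpg = gnupg.GPG()  # uses the default GnuPG home directory

message = "Quarterly incident report for the cross-site project."

# Sign with the sender's key and encrypt to the recipient's public key,
# so only the intended recipient can read the plaintext.
encrypted = gpg.encrypt(
    message,
    ["recipient@example.org"],
    sign="SENDER_KEY_ID",
    passphrase="sender-passphrase",
)

if encrypted.ok:
    print(encrypted.data.decode())   # ASCII-armored ciphertext
else:
    print("Encryption failed:", encrypted.status)
```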




From Large Volumes of Scanned Lincoln Papers to Virtual Observatories
Michal Ondrejcek and Peter Bajcsy, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Room 1104

This poster and live demonstration will present an end-to-end system for delivering diverse information to the public and to the broad community of humanists studying Abraham Lincoln's life. The diverse information is characterized by distributed resources and large volumes of data: scanned documents from Abraham Lincoln's correspondence, historical maps from Lincoln's period, and the sixteenth President's daily activities summarized in the Lincoln Log. We will demonstrate a prototype system in which (a) scanned documents have been automatically pre-processed (cropped to eliminate color scale bars, then scaled and compressed for quick preview and fast retrieval), (b) scans have been georeferenced and temporally correlated with events in the Lincoln Log, (c) historical maps have been georeferenced and re-projected to match the Google Maps interface requirements, and (d) a layered Web interface has been developed to provide access to the diverse information along its multiple dimensions (spatial, temporal, document and relational). The objective of this prototype system is not only to support building a virtual observatory of materials related to the sixteenth President, but also to illustrate the computer science challenges addressed in our prototype when dealing with (a) a large volume of image scans (automated processing of 24,000 pages now, growing to 200,000 to 300,000 pages in the future, with each page equal to about 150MB and the total ~37TB), (b) "dirty" metadata, (c) incomplete georeferencing information, (d) uncertain temporal information and (e) the complexity of high-dimensional information when delivered to end users.
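
As a rough illustration of the automated pre-processing in step (a), the following sketch trims a fixed margin (such as a strip containing a color scale bar) from a scanned page and writes a scaled, compressed preview using the Pillow imaging library. The margin width, preview size, and file names are hypothetical and do not reflect the project's actual processing parameters.

```python
# A minimal sketch of automated pre-processing for scanned pages: trim a fixed
# margin assumed to hold the color scale bar, then write a scaled, compressed
# JPEG preview for fast retrieval. Margin width, preview size, and file names
# are hypothetical, not the project's actual values.
from PIL import Image

SCALE_BAR_MARGIN = 200          # pixels assumed to contain the color scale bar
PREVIEW_MAX_SIZE = (1024, 1024)

def preprocess_scan(src_path: str, preview_path: str) -> None:
    with Image.open(src_path) as scan:
        width, height = scan.size
        # Crop away the assumed scale-bar strip on the right edge.
        cropped = scan.crop((0, 0, width - SCALE_BAR_MARGIN, height))
        # Downsample and compress for quick preview.
        cropped.thumbnail(PREVIEW_MAX_SIZE, Image.LANCZOS)
        cropped.save(preview_path, format="JPEG", quality=80)

preprocess_scan("lincoln_page_0001.tif", "lincoln_page_0001_preview.jpg")
```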




Cyber-Integrator: A Highly Interactive Scientific Process Management Environment
Rob Kooper, Luigi Marini, Jim Myers and Peter Bajcsy, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Room 1104

This poster and live demonstration will present a novel process management environment called CyberIntegrator that supports diverse exploratory analyses and facilitates transitions from desktop executions to high-performance computing executions. Exploratory analyses consume a great deal of human time and are hard to reproduce because of the lack of in-silico scientific process management and because of the diversity of data, software and computational requirements. Desktop executions are often very difficult for domain scientists to scale to large volumes of data or to computations requiring high-performance computing resources. The motivation for our work comes from the need to build the next generation of in-silico scientific discovery processes that require (a) access to heterogeneous and distributed data and computational resources, (b) integration of heterogeneous software packages, tools and services, (c) formation and execution of complex analytical processing sequences, (d) preservation of traces of scientific analyses and (e) design of secure collaborative Web-based frameworks for sharing information and resources. The goal of the presented work is to describe a modular architecture and the key features of a workflow environment that provides process management for automating science processes, reducing the human time involved and enabling scientific discoveries that would not be possible without supporting software and hardware infrastructure. The demonstration will allow users to include various data sets and tools in the environment, link tools into workflows, annotate data, tools and workflows, execute the tools, and visualize the workflow process, the annotations and the self-describing metadata about exploratory activities.
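
The toy sketch below illustrates the general idea of such an environment: heterogeneous tools linked into an executable sequence, annotated, and run while a trace of each step is recorded. It is not the CyberIntegrator API; all names and structures here are hypothetical.

```python
# A toy illustration of a process management environment: tools linked into a
# sequence, annotated, and executed while a trace of each step is recorded.
# This is not the CyberIntegrator API; names and structures are hypothetical.
import time

class Step:
    def __init__(self, name, tool, annotation=""):
        self.name = name
        self.tool = tool              # any callable wrapping a tool or service
        self.annotation = annotation  # free-text annotation shown to users

def run_workflow(steps, data):
    trace = []                        # provenance-style record of the run
    for step in steps:
        started = time.time()
        data = step.tool(data)
        trace.append({
            "step": step.name,
            "annotation": step.annotation,
            "seconds": round(time.time() - started, 3),
        })
    return data, trace

workflow = [
    Step("load", lambda _: list(range(10)), "load a small example dataset"),
    Step("filter", lambda xs: [x for x in xs if x % 2 == 0], "keep even values"),
    Step("summarize", lambda xs: sum(xs) / len(xs), "compute the mean"),
]

result, trace = run_workflow(workflow, None)
print(result)
for record in trace:
    print(record)
```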




Tele-Immersive Environments for Everybody
Rahul Malik, Suk-Kyu Lee, Miles Johnson, Rob Kooper and Peter Bajcsy, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Rooms 1000 and 2002A

Tele-immersive environments for everybody (TEEVE) are designed to facilitate communication and collaboration between people. Therefore, the central objects of interest in a tele-immersive system are these people, the things they jointly manipulate, and the tools they need to perform this manipulation. This poster and demonstration will present a prototype system for real-time 3D scene reconstruction using visible and thermal infrared spectrum cameras across multiple geographically distributed sites, as well as our simulation capabilities for adaptively configuring TEEVE systems according to the space, activity and cost constraints defined by users. The use of multiple spectral cameras aims to improve the robustness of the 3D reconstruction, while the 3D scene simulation is critical for better understanding camera placement. The demonstration will allow attendees to step into two different TEEVE environments at NCSA, where they will be digitally cloned in real time and immersed into a common 3D virtual space. The physical distance between the two TEEVE environments will be removed by fusing the two 3D digital scenes, and the participants will be able to experience remote collaboration in 3D.




Web-Based Access to Large Volumes of Airborne Multi-Year Multi-Spectral Imagery
Qi Li, Michal Ondrejcek, Rob Kooper, Kevin Franklin and Peter Bajcsy, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Room 1104

We present a poster and a demonstration of a prototype system for Web-based access to large volumes of airborne multi-year multi-spectral imagery of Costa Rica. This collaborative project with the National Center for Advanced Technology Studies (CeNAT) in Costa Rica poses challenges in terms of (a) de-warping airborne images without known warping characteristics, (b) geo-referencing images with limited and uncertain information, (c) spatial mosaicking of a large number of image tiles, (d) re-projecting images into a target projection and datum, (e) building an efficient pyramid-based representation for fast retrieval and (f) providing a multi-layered interface to the multi-year multi-spectral imagery. The high-spatial-resolution data sets consist of hyperspectral and near-infrared imagery collected over two years, totaling about 1TB. We will demonstrate the prototype system using the OpenLayers and Google Maps interfaces.
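
As a rough sketch of the pyramid-based representation in item (e), the following code builds progressively downsampled tiles so that a Web client can fetch only the resolution and region it needs. The tile size, number of levels, and output layout are hypothetical and are not the project's actual tiling scheme.

```python
# A minimal sketch of a pyramid-based representation: progressively
# downsampled levels, each cut into fixed-size tiles, so a Web client fetches
# only what it needs. Tile size, level count, and layout are hypothetical.
import os
from PIL import Image

TILE = 256  # pixels per tile edge

def build_pyramid(src_path: str, out_dir: str, levels: int = 4) -> None:
    image = Image.open(src_path)
    for level in range(levels):
        scale = 2 ** level
        reduced = image.reduce(scale) if scale > 1 else image
        level_dir = os.path.join(out_dir, f"level_{level}")
        os.makedirs(level_dir, exist_ok=True)
        for top in range(0, reduced.height, TILE):
            for left in range(0, reduced.width, TILE):
                tile = reduced.crop((left, top,
                                     min(left + TILE, reduced.width),
                                     min(top + TILE, reduced.height)))
                tile.save(os.path.join(level_dir, f"{left}_{top}.png"))

build_pyramid("costa_rica_mosaic.tif", "tiles")
```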




Remote Visualization Pipeline
Dave Semeraro, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Atrium

The visualization pipeline is the sequence of operations that converts raw data to imagery. Remote visualization of scientific or engineering data is the process of executing some portion of the visualization pipeline on a remote computer and viewing the results locally. Data that are too large to move from their storage location to a local compute facility, and a lack of adequate local visualization hardware or expertise, are common reasons to consider remote visualization. The figure below schematically represents the visualization pipeline. The type of remote visualization depends on the location of the internet divider: everything to the left of the divider is remote operation and everything to the right is local, or desktop, operation.

[Figure: Remote visualization pipeline]

In the figure above, the raw data is represented by the cylinder on the left. All of the operations that map that data to geometry are contained in the visualize box. This box can be quite complex, containing various filtering and interpolation operations. Finally, the geometry is rendered and displayed. The geometry created by the visualize step is often, but not always, smaller than the raw data.

The effectiveness of a visualization depends on how well the user creates and applies the filters contained in the visualize block. Some degree of effort is often expended in creating an effective visualization of a given dataset. This project is concerned with capturing that effort for use by other scientists; accomplishing this required addressing several issues.

The result of our project is a software system for remotely manipulating complex visualization pipelines in a Web framework. We demonstrate this pipeline by using it to render simulation data for water quality in Corpus Christi Bay. We will describe a pipeline manager that executes visualization workflows stored as RDF graphs, as well as methods for displaying the results of the visualization and manipulating the visualization products.
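
The short sketch below is a conceptual illustration of the split described above: stages to the left of the internet divider run remotely, and stages to the right run on the desktop. The stage names and split points are illustrative and are not taken from the actual system.

```python
# A conceptual sketch of the remote/local split: stages to the left of the
# "internet divider" run on the remote machine, stages to the right run on the
# desktop. Stage names and split points are illustrative only.
PIPELINE = ["read_raw_data", "filter", "extract_geometry", "render", "display"]

def split_pipeline(stages, divider):
    """Return (remote, local) stage lists for a divider placed after `divider`."""
    index = stages.index(divider) + 1
    return stages[:index], stages[index:]

# Example: send geometry across the network, render and display locally.
remote, local = split_pipeline(PIPELINE, "extract_geometry")
print("remote:", remote)   # ['read_raw_data', 'filter', 'extract_geometry']
print("local: ", local)    # ['render', 'display']

# Moving the divider right (after 'render') ships images instead of geometry,
# preferable when the extracted geometry is larger than the rendered frames.
remote, local = split_pipeline(PIPELINE, "render")
print("remote:", remote)
print("local: ", local)
```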




Data Sanitization & Privacy Preservation at NCSA
Adam Slagell and Byunggil Yoo, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Atrium

The LAIM Working Group at NCSA has become a leader in the area of data sanitization and privacy preservation. Work began with the creation of FLAIM, a generic computer and network log anonymization framework, and has continued with work to better understand the trade-offs made when anonymizing data. The group has examined how anonymization at different levels affects the utility of the data; developed a taxonomy of attacks against anonymization and an adversarial model of an attacker seeking to exploit information leakage; and mapped that adversarial model onto the taxonomy they created. This has enabled them to translate the problem of negotiating a policy that meets the needs of the researcher and the privacy requirements of the data owner into a predicate logic problem that can be solved automatically. Now that FLAIM is being used in other domains and for purposes beyond log anonymization, the LAIM Working Group has also begun applying these techniques to other problems, such as sanitizing student records for educational research.
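
The following toy example illustrates one common field-level anonymization policy of the kind such frameworks support: truncating IPv4 addresses to their /24 network so individual hosts are hidden while subnet-level analysis remains possible. It is not FLAIM or its policy language; the record fields are hypothetical.

```python
# A toy illustration of field-level log anonymization: truncate IPv4 addresses
# to their /24 network so hosts are hidden while subnet-level analysis remains
# possible. This is not FLAIM or its policy language; fields are hypothetical.
import ipaddress

def anonymize_ip(addr: str) -> str:
    network = ipaddress.ip_network(f"{addr}/24", strict=False)
    return str(network.network_address)

def anonymize_record(record: dict) -> dict:
    sanitized = dict(record)
    sanitized["src_ip"] = anonymize_ip(record["src_ip"])
    sanitized["dst_ip"] = anonymize_ip(record["dst_ip"])
    return sanitized

log = {"timestamp": "2008-05-12T17:04:31", "src_ip": "192.168.37.14",
       "dst_ip": "10.20.30.40", "bytes": 1532}
print(anonymize_record(log))
# src_ip becomes 192.168.37.0, dst_ip becomes 10.20.30.0
```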




MAEviz
Terry McLaren, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Atrium

MAEviz represents a new generation of tools that give researchers and practicing engineers the ability to leverage investments in new methodologies and software infrastructure and to provide useful results to stakeholders and decision makers. MAEviz performs seismic risk assessment based on the Mid-America Earthquake (MAE) Center's research in Consequence-based Risk Management (CRM) and is designed to be extended, customized, and evolved to meet the needs of specific organizations and regions. MAEviz helps bridge the gap between researchers, practitioners and policy-makers by integrating the latest research findings, the most accurate data, and state-of-the-art methodologies in an extensible software platform.




Embedding Data within Knowledge Spaces
Joe Futrelle, Jeff Gaynor, Joel Plutchak, Peter Bajcsy, Jason Kastner, Kailash Kotwani, Jong Sung Lee, Luigi Marini, Robert E. McGrath, Terry McLaren, Yong Liu, and James Myers, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Atrium

Data management is becoming increasingly complex as new sensors and models drive growth in data volumes, as interdisciplinary studies and systems-level modeling drive the need for synthesizing heterogeneous data, and as reliance on digital information as primary records drives a need for cost-effective curation and preservation. As part of NCSA's broad efforts in these areas, the Cyberenvironments Directorate has been developing cyberinfrastructure necessary to support powerful "knowledge spaces" built upon concepts of content management, the semantic Web, active curation, and computational inference capabilities. This poster outlines the core open source infrastructure that has been developed (Tupelo) as well as work to standardize descriptions of data provenance (the Open Provenance Model), file formats (the Data Format Description Language), and basic geospatial and temporal data relationships. Companion posters on Cyberenvironments and Digital Observatories highlight how this type of infrastructure is enabling data synthesis and modeling across a wide range of projects.
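
The small sketch below shows the flavor of recording provenance as RDF triples, in the spirit of the process/artifact relationships the Open Provenance Model describes, using the rdflib library. The namespace URI and property names are placeholders, not Tupelo's or OPM's actual vocabulary.

```python
# A small sketch of recording provenance as RDF triples, in the spirit of the
# process/artifact relationships the Open Provenance Model describes. The
# namespace URI and property names are placeholders, not Tupelo's or OPM's
# actual vocabulary.
from rdflib import Graph, Namespace, Literal, URIRef

EX = Namespace("http://example.org/provenance#")

g = Graph()
g.bind("ex", EX)

dataset = URIRef("http://example.org/data/streamflow-2008-05")
process = URIRef("http://example.org/runs/flood-model-42")

g.add((process, EX.used, dataset))           # the run consumed the dataset
g.add((process, EX.generated, EX.floodMap))  # ...and produced a derived artifact
g.add((EX.floodMap, EX.label, Literal("Flood extent map, May 2008")))

print(g.serialize(format="turtle"))
```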




Application Performance Analysis on Multi-Core Clusters
Greg Bauer, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Atrium

The multi-core processor architecture is the answer to the question of how to increase peak computing capability without increasing peak power. This increase does not come without cost, though: contention for resources is compounded by the additional computational cores. The majority of this contention is competition for memory, either within a node or between nodes. Performance analysis concepts that focus on memory access (intra-node and inter-node) will be discussed, along with example application and compute kernel performance.
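
As a simple illustration of the memory-bound behavior at the heart of such analysis, the sketch below times a STREAM-style triad in NumPy and reports an effective bandwidth. The array size and the three-array accounting are illustrative assumptions; this is not the benchmark suite discussed in the poster.

```python
# A simple timing sketch of a memory-bandwidth measurement: a STREAM-style
# triad (a = b + s*c) whose performance is limited by memory traffic rather
# than arithmetic. Array size and repeat count are arbitrary; this is not the
# benchmark suite used in the session.
import time
import numpy as np

N = 20_000_000                      # ~160 MB per double-precision array
b = np.random.rand(N)
c = np.random.rand(N)
s = 3.0

start = time.perf_counter()
a = b + s * c                       # triad: arrays streamed through memory
elapsed = time.perf_counter() - start

# STREAM convention counts read b, read c, write a; NumPy's temporary for s*c
# adds some extra traffic, so this estimate is approximate.
bytes_moved = 3 * N * 8
print(f"effective bandwidth: {bytes_moved / elapsed / 1e9:.2f} GB/s")
```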




Augmented Reality for Fun and Profit
Alan Craig and Robert McGrath, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Room 1104

Augmented Reality (AR) is a technology that allows a viewer to see computer graphics superimposed on, and in registration with, the real world. In this demonstration, we show how AR allows content creators to merge two media for an enhanced reading experience: traditional printed material (such as books) and 3D computer graphics. With Augmented Reality, a reader can see text and images on a page as usual, but they can also see dynamic, 3D computer graphics "hovering" over the page (for example, a molecule or a CAD model). The reader can see the 3D graphics from whatever point of view they desire simply by turning the physical book around (to see the model from behind), tilting the book toward them (to see the model from above), and so on. The graphics can also be made interactive and viewable on a number of display devices such as PDAs, cell phones, gaming systems, or a standard computer screen.

This technology can also be used for other applications, including heads-up displays for training or maintenance workers, museum exhibits, education, mass-market publications, and point-of-sale advertising.




Cyberenvironments and Digital Observatories
Jim Myers, Terry McLaren, Luigi Marini, Joe Futrelle, Yong Liu, and Robert E. McGrath, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Atrium

Effectively harnessing scientific research to inform societal and business decision making requires an infrastructure capable of retrieving and integrating data, running coupled models, analyzing and visualizing results, capturing data provenance (the scientific basis for the results), comparing the predictions of new models with existing data and the output of competing models, and quickly disseminating useful derived products to decision makers. NCSA has developed a suite of technologies around the concept of Cyberenvironments that address these issues and enable more rapid and more effective use of research results in operational settings.

A number of NCSA projects described as digital observatories are now demonstrating this Cyberenvironments technology suite in the context of geospatial analyses related to disaster risk mitigation and environmental sustainability. Digital observatories enable visual analysis of synthesized observational and modeled data related to complex, multi-scale geospatial phenomena using distributed resources, and they provide an unprecedented end-to-end digital framework for coupling research with operational procedures. These projects demonstrate the capability to retrieve data from Web services and from local and remote repositories, to integrate the data as map overlays and map them to the desired data schema, to use them as inputs to computational workflows that can be run locally or as remote services triggered periodically or in response to events, to record data provenance, and to publish full details or specific derived data and views for further analysis by colleagues or the public. These capabilities are currently being developed in the context of academic research efforts but would be directly applicable in a broad range of geospatial applications related to disaster management and insurance, precision agriculture, food and biofuel production planning, and more. The core concept of Cyberenvironments, along with middleware for semantic content management, workflow and provenance, and Web/service-oriented visual analysis, would be applicable more generally across science and engineering.




Advanced Visualization Laboratory
Stuart Levy, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Room 1005

NCSA's Advanced Visualization Lab presents a variety of stereoscopic animations based on scientific data, from tornadoes and ocean currents to black holes and galaxy formation.




INDICATOR
Ian Brooks and Wendy Edwards, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Atrium

INDICATOR applies NCSA's Cyberenvironment technologies to infectious disease informatics and biosurveillance. This demonstration shows the current state of the cyberenvironment and the monitoring of data from a local hospital for signs of disease outbreaks. INDICATOR is based on Liferay and uses the WSARE (What's Strange About Recent Events) algorithm developed at Carnegie Mellon University. This project has been supported by Carle Hospital.
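
WSARE itself searches over many candidate rules; the toy sketch below shows only the underlying idea of flagging a signal when recent case counts differ significantly from a historical baseline, using a simple 2x2 chi-square test. The counts are made up for illustration.

```python
# WSARE searches over many candidate rules; this sketch shows only the
# underlying idea of flagging a signal when recent counts differ significantly
# from a historical baseline, using a 2x2 chi-square test. Counts are made up.
from scipy.stats import chi2_contingency

recent_flu_like, recent_other = 48, 952          # yesterday's visits
baseline_flu_like, baseline_other = 210, 9790    # comparable past days

table = [[recent_flu_like, recent_other],
         [baseline_flu_like, baseline_other]]

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"p-value = {p_value:.4g}")
if p_value < 0.01:
    print("Flag: flu-like visits are unusually frequent compared with baseline.")
```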




Center for Clinical Translational Science (CCTS) Cyberenvironment Development
Ian Brooks, Federico Bassetti, and Hoon Kim, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Atrium

CCTS is an all-campus initiative that focuses on developing new approaches to clinical and translational research, enhancing informatics and technology resources, and improving training and mentoring to ease navigation of the increasingly complex research system. The NIH has launched this initiative to spur the transformation of clinical and translational research in the United States, so that new treatments can be developed more efficiently and delivered more quickly to patients.

The Cyberenvironment Development project aims to fulfill the CCTS mission by providing an environment that integrates different technologies to facilitate the translational aspects of the research. The CCTS-CE will include collaboration tools, content publishing tools, document management, social networking tools, recommendation tools, and search and information retrieval tools.




Digital Watersheds for Environmental Observatories
Jim Myers, Terry McLaren, Luigi Marini, Joe Futrelle, Yong Liu, and Robert E. McGrath, NCSA
Monday, May 12
5:00 p.m. - 8:00 p.m.
Atrium

National observatory initiatives such as the WATERS Network will deploy nested multiscale sensor networks to study water cycle dynamics, coupled human-natural environmental systems, climate change, and other urgent science questions and societal issues, helping to maintain a sustainable water environment for humankind. We will present our design of semantic sensor Web tools for the management, query, analysis and visualization of near-real-time multidisciplinary virtual sensors in space-time-thematic contexts for digital watersheds, with a prototype demo showing near-real-time stormwater management information (stream gage data, NEXRAD data, and model output) in a geospatial browser with participatory data curation capability based on Open Geospatial Consortium (OGC) standards.
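
As a minimal illustration of the OGC standards involved, the sketch below requests a single map overlay through the Web Map Service (WMS) GetMap interface, the kind of call a geospatial browser issues for each layer. The server URL and layer name are placeholders.

```python
# A minimal sketch of pulling one overlay through the OGC Web Map Service
# (WMS) GetMap interface, the kind of standards-based request a geospatial
# browser issues for each layer. The server URL and layer name are placeholders.
import requests

WMS_URL = "http://example.org/geoserver/wms"   # hypothetical endpoint

params = {
    "service": "WMS",
    "version": "1.1.1",
    "request": "GetMap",
    "layers": "watershed:nexrad_precip",       # hypothetical layer name
    "bbox": "-91.5,39.5,-87.5,42.5",           # lon/lat bounding box
    "srs": "EPSG:4326",
    "width": 512,
    "height": 384,
    "format": "image/png",
}

response = requests.get(WMS_URL, params=params, timeout=30)
response.raise_for_status()
with open("nexrad_overlay.png", "wb") as out:
    out.write(response.content)
```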




Session I-A: Panel Discussion on Modeling/Simulation Challenges in Manufacturing
Walter Lohmann, Chris Ha, Keven Hofstetter, Rick Huff, Khaldoon Tahhan, and Madhavan Narayanan, Caterpillar
Tuesday, May 13
1:30 p.m. - 2:45 p.m.

In today's competitive global marketplace, manufacturing companies like Caterpillar make extensive use of simulation and analysis tools to develop their products in the virtual world before physical prototypes are built and tested. For example, simulation tools are used for combustion, structural durability, metal forming, and system and machine performance. Optimization tools are used to explore the design space and determine topology and shape. The continuing advancement of the virtual product development environment that encompasses both software tools (multiphysics) and hardware capabilities (high-performance computing) has permitted companies to explore more alternatives with increasingly realistic (and often more complex) simulation models. These models are typically solved on large high-performance Linux clusters in order to take advantage of parallel processing capabilities. However, even with the advancements in software and hardware capabilities, this current environment does not permit companies to explore the desired set of alternatives in their quest for optimal designs within the constraints of ever-decreasing product development schedules. Lohmann will illustrate a few of the areas where the current environment fails to provide a framework for maximizing virtual exploration of design alternatives.




Session I-B: Next-Generation Data Centers
John Melchi, NCSA, and William J. Kosik, EYP Mission Critical Facilities
Tuesday, May 13
1:30 p.m. - 2:45 p.m.

Reducing power consumption in computing devices while simultaneously increasing performance is a fundamental goal in the development of computing systems, and the same is true for data center facilities. Next-generation data centers will enable interoperability between the facility's power and cooling infrastructure and the computer systems themselves. This transfer of operating parameters and real-time performance data is critical to reaching the full potential of optimizing energy use and lowering the environmental impact of the IT enterprise. The high-performance computing model lends itself particularly well to this optimization strategy. This core idea, where the data center becomes the computer, will begin to manifest itself within the next decade through industry-generated metrics as high-performance computing models become more widely used. NCSA's petascale computing effort is a prime example of this next generation.
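
One widely cited industry metric for this kind of facility/IT optimization is Power Usage Effectiveness (PUE): total facility power divided by the power delivered to the IT equipment itself. The brief sketch below shows the arithmetic with made-up figures, offered only as an example of the metrics the abstract alludes to.

```python
# Power Usage Effectiveness (PUE): total facility power divided by the power
# delivered to the IT equipment. The figures below are made up purely to show
# the arithmetic; they describe no particular facility.
it_load_kw = 1000.0            # hypothetical power drawn by the computer systems
cooling_kw = 450.0             # hypothetical chillers, CRAC units, pumps
power_distribution_kw = 150.0  # hypothetical UPS and distribution losses

total_facility_kw = it_load_kw + cooling_kw + power_distribution_kw
pue = total_facility_kw / it_load_kw
print(f"PUE = {pue:.2f}")      # closer to 1.0 means less overhead per unit of computing
```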




Session II-A: Value Engineering and Tomorrow's Product Design Cycle: The New HPC Challenge
Jon Riley, Steve Reagan, L&L Products, and Seid Koric, NCSA
Tuesday, May 13
3:00 p.m. - 4:15 p.m.

This presentation will draw from many years of experience with large-degree-of-freedom systems that spend much of their design process in a virtual product development environment. The example customer base consists of current automotive manufacturers under heavy pressure from the competing demands of time-to-market and performance expectations. Our use of computationally intensive concept testing, design evaluation, and performance validation of nonlinear dynamic systems allows a shortened design cycle while increasing the value (performance/cost) passed to our customers. The coupling of optimization and high-performance computing is found to be an essential pathway that must be balanced with both design iteration throughput and system capacitization if true value is to be realized. Value innovation has been said to be the metric for long-term enterprise success; high-performance computing will be seen to be one of its key languages, connecting a broad base of previously excluded inventors with an expectant market.




Session II-B: Multicore Productivity in HPC
John Towns, NCSA
Tuesday, May 13
3:00 p.m. - 4:15 p.m.

This session is a follow-on to the Monday afternoon technical workshop, which reviewed the effects of parallelization using MPI and OpenMP in a multi-level parallel programming environment and the effect of multi-core processors on scalability and performance. After a short review of the technical findings, it is anticipated that session participants will begin a shared dialogue about the possibilities and challenges of multi-core programming and multi-level parallelism and their impact on industrial applications.




Session III-A: The Computational Microscope
Klaus Schulten, University of Illinois at Urbana-Champaign
Wednesday, May 14
8:30 a.m. - 9:45 a.m.

This lecture will illustrate how advanced computational methods offer new microscopic views of living cells and bionanotechnological devices that are not available through experimental microscopy. The computational microscope guides the development of new drugs, assists bioengineers in constructing and using nanosensors, and advances biomedicine through unprecedented views of machines in living cells. The software for the computational microscope, developed over two decades at a cost of $20 million, is used today by over 100,000 registered users. The NCSA/NSF petascale computer will greatly advance the use of the computational microscope.




Staying Competitive Through HPC: The Blue Waters Perspective
Rob Pennington, NCSA
Wednesday, May 14
11:30 a.m.

NCSA has been selected to build the first National Science Foundation-funded petascale computing system, called Blue Waters. The Blue Waters system will sustain performance of at least one petaflop (one quadrillion calculations per second). Pennington will provide an update on the progress of the project and will share how industry can leverage the capabilities of this petascale computing environment.
