2006 Private Sector Program Annual Meeting Abstracts
Active Interactive Genetic Algorithms
Active interactive genetic algorithms (aiGAs) mine models of user preferences during interactive problem-solving sessions. They build on the partial ordering of solutions provided by the user's evaluations. Traditional interactive genetic algorithms (iGAs) require the user to provide a large number of evaluations to achieve good-quality solutions. One of the main contributions of aiGA is the use of mined models of user preferences to generate educated guesses, that is, promising solutions. Presenting educated guesses to the user helps reduce both fatigue, because the user can obtain high-quality solutions faster, and frustration, because generating high-quality solutions allows aiGA to avoid the repetitive display of poor solutions that may discourage the user.
The fusion of human and computer capabilities in aiGA provides large amounts of information not used during the interactive sessions. Visual analytics techniques can aggregate, summarize, and visualize the information generated during the interactive process. Special visualizations of the user-provided partial ordering of solutions, the synthetic fitness surrogates induced, and the model of user preferences were prepared. The visualizations allow an observer to evaluate at a glance the progress of the interactive process so far through a non-interfering looking glass. The visual-analytic tools have been successfully used to analyze the behavior of users during an interactive weight-tuning of a cost function used in corpus-based text-to-speech synthesis.
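As a rough illustration of how a partial ordering of solutions can be turned into a synthetic fitness, the sketch below scores each solution by how many others it dominates minus how many dominate it, following the user's pairwise judgments transitively. The function name, the `(better, worse)` pair format, and the scoring rule itself are assumptions made for this example; the abstract does not specify how aiGA computes its surrogate.

```python
from collections import defaultdict


def synthetic_fitness(solutions, preferences):
    """Score solutions from pairwise user judgments (illustrative only).

    preferences is an iterable of (better, worse) pairs. The score of a
    solution is the number of solutions it dominates, directly or
    transitively, minus the number of solutions that dominate it.
    """
    better_than = defaultdict(set)
    for winner, loser in preferences:
        better_than[winner].add(loser)

    def dominated_by(start):
        # Depth-first walk over the "is better than" graph.
        seen = set()
        stack = list(better_than[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(better_than[node])
        return seen

    dominated = {s: dominated_by(s) for s in solutions}
    scores = {}
    for s in solutions:
        wins = len(dominated[s])
        losses = sum(1 for other in solutions if s in dominated[other])
        scores[s] = wins - losses
    return scores


# Hypothetical session: the user preferred a over b, and b over c.
scores = synthetic_fitness(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
```

Solutions the user never compared (here `d`) receive a neutral score of zero, which reflects that the ordering is only partial.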
Collaborative Medical Image Volume Analyses
In many medical research areas, 3D volume data are often acquired from multiple histological sections of a tissue specimen or multi-modal imaging methods in order to inspect and analyze high-resolution 3D volumes with a large amount of information. Essential to high-confidence medical analyses are robust 3D reconstruction from individual image stacks and highly informative data fusion of multi-modal 3D medical volumes. 3D volume reconstruction and fusion is a challenging problem due to significant variations in the intensity and shape of cross-sectioned structures, unpredictable and inhomogeneous geometrical warping during medical specimen preparation, an absence of external fiducial markers, and differing image modalities. Furthermore, there is a need to support geographically distributed, data-driven, 3D medical volume reconstructions requiring sophisticated reconstruction tools and computational resources, as well as the involvement of inter-disciplinary experts. We present stitching and alignment registration solutions, image intensity enhancement, decision support for optimal selection of reconstruction parameters, and remote collaboration solutions using novel algorithms and collaborative technologies.
Cyberenvironments: Linking Research and Practice
Textbooks and software applications are well-known methods for bringing research innovations to practitioners. Yet they are also out-of-date almost from the day they are created. In the 21st Century, Cyberinfrastructure (CI) provides a more powerful way to deliver expertise that can reduce delays in bringing new techniques and data to bear on industrial competitiveness and societal issues. Further, CI enables the creation of Cyberenvironments (CEs) that go beyond a simple model of delivering "static" knowledge to a model of collaboration and continual knowledge building. Such a model is well-aligned with the changes in science and engineering towards systems-level analysis and, indeed, makes the coordination of cross-domain expertise on specific problems feasible at a scale never before possible. While the concept of CEs is still evolving, current model systems incorporate a number of capabilities -- tracking of detailed provenance and data relationships, the translation between the formats and vocabularies used in different subcommunities, and the presentation of an engineering, best-practice-oriented view of complex underlying science within an overall distributed computing and collaboration framework -- that demonstrate the power and practical utility of a CE-based approach. This presentation reviews the CE vision and demonstrates its realization in the context of two emerging CEs architected by NCSA staff and collaborators -- MAEviz, the flagship hazard risk management CE being developed by the Mid America Earthquake (MAE) Center, and the Collaboratory for Multiscale Chemical Science (CMCS), developed by a consortium of nine institutions led by Sandia National Laboratories.
CyberIntegrator: Cyber-environment for Process Management
The CyberIntegrator addresses the problem of designing a highly interactive scientific meta-workflow system that aims at building complex problem-solving environments. The need for meta-workflows arises as a number of on-going earth observatories and disaster planning efforts have been launched in recent years. The meta-workflow is viewed as a framework that integrates heterogeneous workflow engines, software tools, data sites, hardware resources, organizational boundaries, and/or research domains. The key aspects of CyberIntegrator are its usability and its self-learning from provenance information, for example through provenance-to-recommendation pipelines. Our long-term goal is to develop a plug-and-play architecture that allows easy creation, sharing, and re-purposing of workflows in a seamless way.
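To make the idea of learning recommendations from provenance concrete, here is a minimal sketch that counts which workflow step most often followed another in previously recorded runs and suggests the most frequent successors. The data layout (a list of step-name sequences) and both function names are hypothetical; the abstract does not describe how CyberIntegrator stores provenance or derives its recommendations.

```python
from collections import Counter, defaultdict


def build_transition_counts(provenance_runs):
    """Count how often each step was followed by each other step.

    provenance_runs is a list of recorded runs, each an ordered list of
    step names (an assumed, simplified provenance format).
    """
    counts = defaultdict(Counter)
    for steps in provenance_runs:
        for current_step, next_step in zip(steps, steps[1:]):
            counts[current_step][next_step] += 1
    return counts


def recommend_next(counts, current_step, top_n=3):
    """Return up to top_n steps that most often followed current_step."""
    if current_step not in counts:
        return []
    return [step for step, _ in counts[current_step].most_common(top_n)]


# Hypothetical provenance log of four earlier workflow runs.
runs = [
    ["load", "clean", "analyze"],
    ["load", "clean", "visualize"],
    ["load", "clean", "analyze"],
    ["load", "analyze"],
]
counts = build_transition_counts(runs)
suggestions = recommend_next(counts, "clean")
```

A frequency count is the simplest possible model; it stands in here only to show the shape of a provenance-to-recommendation step.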
Editable Web Browser
EWB offers the notion of a Web browser with an "edit" button. Browse to a page you have permission to modify and the edit button lights up. Clicking the edit button allows you to edit the page right inside the browser window. Rather than relying on a webmaster to make changes or post content, authorized users can now create and maintain a website or particular Web pages themselves. Tasks such as adding new pages, correcting typos, and updating time-sensitive information can be performed in the Web-browser environment with which users are already familiar, using a what-you-see-is-what-you-get interface. EWB can be used to create Web pages from scratch, providing a friendly interface for users who are not tech savvy. EWB also seamlessly integrates with Microsoft Office, allowing Word, Excel, and PowerPoint documents to be posted to the Web and edited just like Web pages.
NCSA Environmental Cyber Infrastructure Demonstration (ECID)
The overall goal of the NCSA ECID (Environmental Cyber Infrastructure Demonstration) project is to create an end-to-end cyber infrastructure that supports the analysis and modeling of data from environmental observations made by a variety of sensors or other data collection devices. One key element of the project is the development of capabilities to provide metadata-based discovery, support for a number of scientific analyses and model building activities, and access to independently managed, distributed, heterogeneous data sources (e.g., to allow researchers to search for precipitation totals for a given region regardless of their origin). Another key element is to provide the capability to specify and execute meta-workflows that combine data loading, cleaning, analysis, visualization, and other routines created by different researchers using different workflow systems into a seamless process. ECID is also exploring the use of data provenance and social network analysis to provide alerts and recommendations about new data and best-practice techniques that are relevant to a researcher's current efforts. ECID researchers are also working to develop a cyber "dashboard" that provides a dynamic, always-on overview of community activities in the overall cyber environment.
Evolution Highway is a collaborative project to provide a visual means for simultaneously comparing mammalian genomes of humans, horses, cats, dogs, pigs, cattle, rats, and mice. The tool removes the burden of manually aligning these maps and allows cognitive skills to be used on something more valuable than preparation and transformation of data. Primary Researcher Dr. Harris A. Lewin explains that with Evolution Highway one is able to look "...at the whole genome at once--multiple chromosomes across multiple species. The insights wouldn't have come so quickly if we couldn't throw the data at this tool from NCSA."
Evolution Highway was developed to visualize the results of the mammalian genome comparative analysis. It is a set of D2K components created to load, correlate, and map chromosome and species data to a visual chromosome metaphor for comparative analysis. It employs a zoomable user interface that allows the user to zoom in for detailed information and zoom out for an overview. The D2K framework enables Evolution Highway to be both a web service application and a desktop application. The desktop application can be downloaded under an academic license.
Evolution Highway offers several simple, user-oriented features that make examining the comparative maps easier. Users can look at multiple species at once, hide a given species with a click, and zoom in and out of the comparative maps, which can cover millions of base pairs.
As computer systems have become more interconnected and attacks have grown broader in scope, forensic investigations of computer security incidents have crossed more and more organizational boundaries. Consequently, there is an increased desire to share logs within the security operations community. While it is generally recognized that sharing logs is important and useful, it is very difficult to accomplish because the logs to be shared often contain sensitive information. We have developed an anonymization framework, FLAIM, that allows users to anonymize the sensitive information in their logs, thus enabling the sharing of logs. FLAIM is a modular, open-source anonymization framework created for system/network administrators. FLAIM allows administrators to specify flexible, XML-based anonymization policies for their unique sharing needs. FLAIM is a versatile, multi-log/multi-level anonymization framework.
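The general idea of anonymizing a sensitive field while keeping a log useful for correlation can be sketched as follows. This is not FLAIM's implementation or its XML policy format, neither of which is described here; it is a generic keyed-hash pseudonymization of IPv4 addresses, and the function name, the `ip-` token prefix, and the sample log line are all assumptions made for the example.

```python
import hashlib
import hmac
import re

# Loose IPv4 matcher; it does not validate that each octet is <= 255.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def pseudonymize_ips(line, key):
    """Replace each IPv4 address in a log line with a keyed pseudonym.

    The same address always maps to the same token under the same key,
    so events can still be correlated without exposing the address.
    """
    def replace(match):
        digest = hmac.new(
            key, match.group(0).encode("ascii"), hashlib.sha256
        ).hexdigest()
        return "ip-" + digest[:12]

    return IPV4_PATTERN.sub(replace, line)


# Hypothetical log line and secret key.
sample = "Accepted password for alice from 192.168.1.10 port 22; retry from 192.168.1.10"
anonymized = pseudonymize_ips(sample, b"site-secret-key")
```

Keeping the key private to the sharing site prevents a recipient from rebuilding the mapping by hashing candidate addresses. A stronger or weaker transformation could be chosen per field, which is the kind of decision a policy would capture.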
HPC with Microsoft Compute Cluster Edition
NCSA has deployed a 900-processor system using Microsoft's upcoming Windows Compute Cluster Server 2003 with an InfiniBand interconnect. This system will be expanded to 4800 processor cores this fall and will be available for potential research and evaluation projects.
Image to Learn (Im2Learn)
The motivation for developing Im2Learn (Image to Learn) comes from academic, government, and industrial collaborations that involve development of new computer methods and solutions for understanding complex data sets. Images and other types of data generated by various instruments and sensors form complex and highly heterogeneous data sets, and pose challenges for knowledge extraction. In general, the driver for the Im2Learn suite of tools is to address the gap between complex multi-instrument raw data and knowledge relevant to any specific application. Im2Learn contains solutions for (1) spectral image analyses, (2) integration and data-driven modeling of remote sensing images (GeoLearn), (3) extracting information from scanned PDF forms and original PDF documents (PDF2Learn), (4) auditing of image-based decisions using provenance gathering (IP2Learn), (5) analyzing 3D medical volumes, and (6) integrating heterogeneous sensor data streams for end-to-end applications. We present examples of the solutions we have researched and developed for real-life problems in the application areas of bio-medicine, earth modeling, and intelligent spaces.
Invisibase does for databases what the Editable Web Browser did for Web pages. Users with little or no database knowledge can create Web-accessible databases and easily populate and search them. Users can select the field types -- text, numbers, dates, images, etc. -- name them, and easily upload the relevant content. Most currently available Web-based database creation tools require the user to design and manage both the database and a set of HTML pages for accessing that database. This software provides a database with built-in Web support, allowing one-click creation of search and browse pages. Alternatively, as an advanced option for presenting the database, the user can choose to create their own HTML code by hand or by using the Editable Web Browser.
Collaborative scientific computing sites, such as the NRL Center for Computational Science, NSF computing sites (NCSA, SDSC, PSC, NCAR), and similar labs in DOE (e.g., NERSC, LBNL), have large distributed user communities, spread both geographically (over the globe) and administratively. A constant threat to these computing sites is the compromise of the end systems of their users. When such a compromise occurs, a typical repercussion is that user credentials (e.g., SSH keys or passwords) stored or used on that system will be captured by the attacker and used to gain illicit access to the computing site. The Mithril project focuses on the application of survivability research to standard open source software to allow such sites to continue to operate and serve customers in the face of an extraordinary attack by temporarily and gracefully reducing their level of service but raising their level of security. We will develop a set of integrated security enhancements that not only increases day-to-day security, but also allows dynamic, temporary adaptations in security in response to a heightened level of threat. These enhancements will allow a site to maintain a high level of openness and usability during normal periods of operation, but respond quickly to increased threat levels with increased security, while still continuing to serve key customers.
NCASSR PKI Testbed
The NCASSR PKI Testbed is a computer security laboratory that supports ongoing research at NCSA. The testbed includes laptops (some with Trusted Computing Platform modules), servers (some with Hardware Security Modules), and smartcard and fingerprint readers. The NCASSR SSH Key Management project and the Illinois Terrorism Task Force (ITTF) Credentialing project are two examples of work enabled by the testbed. The SSH Key Management project is developing a secure private key repository for the popular, standard Secure Shell protocol, using testbed servers to protect private keys from compromise and to enable fast revocation if compromise occurs. For the ITTF project, the testbed provides a laboratory for evaluating the smartcard-based secure credentialing system under development for first responders in the State of Illinois for enforcing perimeter security at incident sites and tracking movement of emergency response personnel. Other NCASSR projects using testbed resources include the Mithril project, the Secure Group Communications (SeCol) project, and the Secure Email List (SELS) project. The NCASSR (National Center for Advanced Secure Systems Research) program, based at the University of Illinois, is supported through funding from the Office of Naval Research and focuses on researching and developing next-generation information security technologies that address the nation's need for a reliable and secure cyberinfrastructure.
Phantasm is a cutting-edge multimedia cataloging and management system designed from the ground up to address the needs of large-scale professional media repositories. It combines advanced cataloging technologies with autonomous ranking and management systems and a novel rights-management suite. These technologies operate in concert in the Phantasm system, showcasing a tour-de-force of media management technologies. Phantasm is currently used to manage the UIPhotos collection, a set of more than 50,000 images covering the whole of the UIUC campus. Phantasm technologies make a collection of this size available while still allowing users to find specific images and handle rights management.
SELS (Secure Email List System)
Electronic mail is one of the most widely used means of communication. As more user communities engage in collaborative tasks, the use of E-mail List Services (ELSs), i.e., e-mails exchanged with the help of a list server, is also becoming common. Considerable work has been done in providing solutions that enable secure exchange of e-mail between two parties, i.e., solutions for e-mail confidentiality, integrity, and authentication such as PGP and S/MIME. However, little or no work has been done towards providing similar solutions for ELSs. This is a crucial technology gap, as corporate and research environments could significantly benefit from secure ELSs. At the National Center for Supercomputing Applications (NCSA) we have developed a novel technology -- SELS, Secure E-mail List Services -- that provides confidentiality, integrity, and authentication for ELSs and is compatible with existing e-mail standards/systems. Furthermore, SELS minimizes trust in the list server by keeping e-mail content encrypted while it is in transit at the server. We have implemented a prototype that works with a wide range of e-mail client software and provides easy-to-install plug-ins for the Mailman list server software.
The Use of Data Mining Methods to Evaluate Public Interest in Carbon Sequestration
A method for using data mining techniques to examine public interest in a topic is presented, applied to the area of carbon sequestration. A three-year longitudinal study of media coverage is paired with intensive analysis of a four-month window of coverage, examining issues such as what image is being presented to the public, how closely media coverage tracks actual events, and how to tailor public outreach messages to address negative trends in coverage. Finally, web coverage of the topic is explored, with source ownership and location used to profile what material is being provided to the public in web searches.
VIAS is a domain-specific information retrieval, archival, and processing system. Specifically designed to provide continually updated information on highly dynamic topics, VIAS is ideally suited for use with volatile information sources like the World Wide Web. It continually crawls the Web, monitors mailing lists and USENET newsgroups, and archives all of the information it finds on its topics of interest. However, VIAS goes beyond simple archival of the knowledge it finds, to the autonomous generation of new knowledge and automatic categorization of existing knowledge. A library of state-of-the-art proprietary metadata algorithms is continuously executed against the archives, providing a rich source of new knowledge to be queried or viewed in real time. VIAS provides the ideal platform for market research and competitive intelligence tasks.