GridChem Electronic-Structure Information Workflow and Database
Award year: 2008-2009
We propose to develop and test a prototype, in direct collaboration with NCSA/GridChem, that solicits, extracts, and stores results from density-functional-theory calculations run with physics and quantum-chemistry application codes at NCSA (e.g., GAMESS and VASP) in an existing Oracle database (the "Structural Database"). The database uses SQL queries for searches and permits visualization of chemical and/or solid-state structures. It will be a community open-data project. Using the existing GridChem workflow engine to manage data and jobs, we will establish a prototype Project Group (initially all UIUC users) and extend the GridChem workflow in three steps. First, the workflow requests permission to store data for open use, with a user-determined time delay before data release. Second, it extracts user information (name and institution) and runtime validation information (e.g., basis set, k-point mesh, and exchange-correlation type). Third, it stores extracted output data from completed electronic-structure calculations (e.g., atomic coordinates, translation vectors (if any), system energy, and forces) in the database. Storage will initially cover VASP pseudopotential projector-augmented-wave calculations and will later be extended to GAMESS and other open-source codes. The workflow will then be altered to query the database for the calculation being proposed and, if it exists, offer a view of the available information. If that information is not complete enough, it can be extracted directly to start the new calculation, so that a calculation with a better basis set starts from known results obtained with, e.g., a lower-level basis. The database is currently designed for alloy and hydrogen-storage materials design and is used only by the Johnson research group; see the examples in the proposal. However, its potential is far from being tapped, and this project will permit a larger user base to contribute to and use ever-increasing information on chemical and solid-state systems.
Moreover, it is surprising that, for all the talk of information sharing, no system yet exists for open or even limited use by researchers in chemistry or condensed-matter physics that provides accessible data. This proposal will help make that happen within the GridChem community. Once the prototype is working and proven, the GridChem community will be asked for permission to mine the large body of data its members have already stored individually for inclusion in the database.
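As an illustration only, the store-then-query workflow proposed above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the Structural Database schema or the GridChem workflow API: it uses an in-memory SQLite database as a stand-in for Oracle, and the table name, column names, and the functions open_db, store_result, and find_existing are all hypothetical.

```python
import json
import sqlite3
from datetime import date, timedelta

# Hypothetical table layout (an assumption, not the real Structural Database).
# structure_json holds coordinates, translation vectors, and forces;
# release_date is an ISO date string, so string comparison orders correctly.
SCHEMA = """
CREATE TABLE calculations (
    id INTEGER PRIMARY KEY,
    user_name TEXT NOT NULL,
    institution TEXT NOT NULL,
    code TEXT NOT NULL,
    formula TEXT NOT NULL,
    xc_functional TEXT NOT NULL,
    kpoint_mesh TEXT NOT NULL,
    energy_ev REAL NOT NULL,
    structure_json TEXT NOT NULL,
    release_date TEXT NOT NULL
)
"""


def open_db():
    """Create an in-memory SQLite database standing in for Oracle."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def store_result(conn, record, permission_granted, delay_days=0, today=None):
    """Store one completed calculation, but only if the user agreed.

    Returns the new row id, or None when permission was refused.
    """
    if not permission_granted:
        return None
    today = today or date.today()
    # The user-determined delay sets when the entry becomes openly visible.
    release = today + timedelta(days=delay_days)
    cur = conn.execute(
        "INSERT INTO calculations (user_name, institution, code, formula, "
        "xc_functional, kpoint_mesh, energy_ev, structure_json, release_date) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (
            record["user_name"],
            record["institution"],
            record["code"],
            record["formula"],
            record["xc_functional"],
            record["kpoint_mesh"],
            record["energy_ev"],
            json.dumps(record["structure"]),
            release.isoformat(),
        ),
    )
    conn.commit()
    return cur.lastrowid


def find_existing(conn, formula, code, today=None):
    """Return already-released entries matching a proposed calculation."""
    today = today or date.today()
    rows = conn.execute(
        "SELECT id, xc_functional, kpoint_mesh, energy_ev, structure_json "
        "FROM calculations "
        "WHERE formula = ? AND code = ? AND release_date <= ? "
        "ORDER BY id",
        (formula, code, today.isoformat()),
    ).fetchall()
    return [
        {
            "id": row[0],
            "xc_functional": row[1],
            "kpoint_mesh": row[2],
            "energy_ev": row[3],
            "structure": json.loads(row[4]),
        }
        for row in rows
    ]
```

In this sketch, a workflow step would call find_existing before submitting a job and, if a released entry is returned, could offer its stored structure as the starting point for the new calculation. Entries whose release date has not yet passed stay hidden, which is one possible way to honor the user-determined delay.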