NCSA provides user support for the many complex areas of data management: moving
data between machines, finding the correct place to read and write data, using
databases, and choosing a file system whose performance suits the size of the
files involved.
NCSA has more than a petabyte of raw disk capacity distributed among its
High Performance Computing (HPC) resources and other production systems. Several
additional petabytes of archival storage capacity are available onsite. Users
at NCSA have a long-term data storage system at their disposal and access to
high-speed parallel file systems on each of our HPC platforms to support the
creation and analysis of large data sets.
Production File Systems
NCSA's production cluster environments are equipped with a variety of parallel
file systems that support applications with demanding I/O requirements. NCSA
also maintains a 284 TB Storage Area Network (SAN). Systems that mount this
storage include backup servers, database servers, testbed systems, and various
internal systems.
Mass Storage System (MSS)
NCSA's hierarchical archival storage system can be accessed via FTP- and
SSH-based transfer clients. The mass storage archive now holds more than six
petabytes of data.
Moving Data
Data can be moved in various ways and to various locations: to and from UniTree,
between clusters, offsite, and across the TeraGrid.
Each HPC system at NCSA has dedicated GridFTP servers and additional tools
to transfer data within NCSA and across the TeraGrid. We provide multiple file
systems: home and application directories for small files, scratch file systems
for large files, and project directories for large data sets that need to be
kept longer. NCSA also supports significant projects that require scientific
databases.
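As a rough illustration of the GridFTP transfers mentioned above, the Python
sketch below assembles a command line for globus-url-copy, the standard GridFTP
command-line client. The helper name gridftp_command, the host
gridftp.example.org, and both paths are hypothetical placeholders, not actual
NCSA endpoints; the default of four parallel streams is likewise only an
assumption. The sketch prints the command rather than running it.

```python
import shlex


def gridftp_command(src_url, dst_url, streams=4):
    """Build a globus-url-copy argument list for a single transfer.

    Hypothetical helper for illustration; it only assembles the command.
    """
    if streams < 1:
        raise ValueError("streams must be at least 1")
    return [
        "globus-url-copy",
        "-vb",                # report bytes transferred and transfer rate
        "-p", str(streams),   # number of parallel data streams
        src_url,
        dst_url,
    ]


# Placeholder endpoints (assumptions): a local scratch file copied to a
# project directory on a remote GridFTP server.
cmd = gridftp_command(
    "file:///scratch/users/example/run01.tar",
    "gsiftp://gridftp.example.org/projects/example/run01.tar",
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell on a system where globus-url-copy
is installed and a valid grid credential is available. Suitable stream counts
depend on the network path, so consult the documentation for the specific
system before tuning them.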