Table of Contents
- System Overview
- Connecting to UniTree
- Using UniTree
System Overview
UniTree, NCSA's Mass Storage System, is available to all NCSA users for
permanent storage of data. Your UniTree login is the same as your login on the
NCSA production machines; however, logging in to UniTree via ssh is not
currently supported. Users must use one of the access methods described below.
UniTree is a high-speed, large-capacity data storage system for NCSA and
TeraGrid users. It consists of a high-performance system running DiskXtender
software, which implements the IEEE Mass Storage Reference Model. This resource
provides users with a network-centered, parallel storage system with
effectively unlimited storage capacity. Users may store files here permanently
and confidently because data is backed up automatically within the storage
system: two copies of each file are made without requiring user interaction.
System Profile
- SGI Origin 3900 running IRIX64 version 6.5.x
- Sixteen 700 MHz IP35 processors
- 12 GB of memory
- Ten 1 Gb Ethernet connections
- Two 10 Gb Ethernet connections
- 35 TB first level disk cache
- DiskXtender 2.9 from EMC/Legato
- Locally developed GridFTP and Kerberos FTP services
Connecting to UniTree
Each of NCSA's production resources has a dedicated, high-speed connection
to UniTree. To use these internal connections, run the mssftp or msscmd
command, or specify the appropriate host name as described below.
NCSA's mass storage system can be accessed both from outside the NCSA domain
and from NCSA's production machines at mss.ncsa.uiuc.edu (on Radium,
Copper and Tungsten) and at mss.ncsa.teragrid.org (on Mercury
and Cobalt). Users can initiate transfers using mssftp (msscmd), uberftp,
globus-url-copy, the tgcp command, or a Kerberos-enabled FTP client.
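As a rough illustration of the host selection described above, the following Python sketch maps a production machine name to the corresponding UniTree host name. The function name mss_host_for and the lowercase machine keys are illustrative choices, not part of any NCSA tool; only the host names and machine groupings come from the text above.

```python
# Hypothetical helper: the mapping mirrors the machine groupings listed above.
MSS_HOSTS = {
    "radium": "mss.ncsa.uiuc.edu",
    "copper": "mss.ncsa.uiuc.edu",
    "tungsten": "mss.ncsa.uiuc.edu",
    "mercury": "mss.ncsa.teragrid.org",
    "cobalt": "mss.ncsa.teragrid.org",
}


def mss_host_for(machine: str) -> str:
    """Return the UniTree host name to use from the given production machine."""
    key = machine.strip().lower()
    if key not in MSS_HOSTS:
        # Machines not named in the text are rejected rather than guessed at.
        raise ValueError(f"no UniTree host is listed for machine {machine!r}")
    return MSS_HOSTS[key]


if __name__ == "__main__":
    print(mss_host_for("Tungsten"))
    print(mss_host_for("Cobalt"))
```

Raising an error for an unlisted machine, instead of falling back to a default host, keeps the sketch from implying a mapping that the text does not state.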
UniTree can be accessed remotely via one of its FTP interfaces using Kerberos,
GSI or NCSA HPC passwordless authentication. The options available for
authentication depend upon where you are connecting from. GSI clients include
globus-url-copy, gsincftp and UberFTP. Users must obtain X.509 credentials
and maintain a valid proxy certificate to use GridFTP (GSI authenticated)
clients. Refer to the TeraGrid Proxy Information Page for details. The
passwordless client, available only from NCSA HPC systems, is mssftp. The
Kerberos client is named ftp on most systems. Check with your system
administrator to determine whether the default ftp on your system is indeed
"kerberized".
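As a small, non-authoritative first check, the Python sketch below looks for an ftp executable, preferring /usr/local/krb5/bin/ftp, the path this page gives for the kerberized client on NCSA production machines. The function name find_ftp_client and the fallback to the first ftp on PATH are assumptions made for illustration. Locating a binary does not prove that it is kerberized, so the check with your system administrator still applies.

```python
import os
import shutil
from typing import Callable, Optional

# Path of the kerberized ftp client on NCSA production machines (from this page).
KERBERIZED_FTP = "/usr/local/krb5/bin/ftp"


def find_ftp_client(
    is_executable: Callable[[str], bool] = lambda p: os.access(p, os.X_OK),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Prefer the known kerberized path; otherwise fall back to ftp on PATH.

    This only locates a binary. It cannot tell whether the fallback client
    is kerberized. Returns None when no ftp executable is found.
    """
    if is_executable(KERBERIZED_FTP):
        return KERBERIZED_FTP
    return which("ftp")


if __name__ == "__main__":
    print(find_ftp_client())
```

The two lookup functions are parameters only so that the logic can be exercised without depending on what is installed on a particular machine.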
Below are several common programs used to access UniTree:
- ftp mss.ncsa.uiuc.edu
  This needs to be the kerberized ftp client (/usr/local/krb5/bin/ftp on NCSA
  production machines). See NCSA's Security page for information on installing
  Kerberos on your local machine. mssftp and msscmd (described below) are
  recommended for accessing MSS from NCSA production machines since they
  provide better performance and automatically authenticate.
- mssftp
  The mssftp command is available on all NCSA production machines. Invoking
  the mssftp command will automatically connect you to UniTree without the
  need to enter a login name or password. Once connected, the interface is
  similar to most text-based ftp clients with an enhanced command set that
  provides additional functionality on MSS. Refer to man mssftp for more
  information.
- msscmd
  The msscmd command provides a command line interface to mssftp. It is
  available on all NCSA production machines and can be used both interactively
  and in batch scripts (see man msscmd).
- uberftp
  The uberftp client is available on all NCSA production machines and is free
  to download. uberftp supports GSI and MSS (NCSA passwordless) authentication
  and GridFTP enhancements such as parallel streams. uberftp supports command
  line arguments and interactive use.
- globus-url-copy
  The globus-url-copy file transfer client is available on Mercury and Cobalt.
  globus-url-copy is a command line tool used to initiate file transfers by
  specifying source and destination URLs. globus-url-copy uses GSI
  authentication and will initiate a passwordless file transfer between sites
  over which a valid grid proxy has been issued.
- tgcp
  The tgcp (TeraGrid copy) command is a wrapper that invokes globus-url-copy
  after querying a set of configuration files that supply grid-specific
  optimizations (such as TCP buffer size) for all possible transfers on
  TeraGrid. In addition, tgcp provides a more forgiving syntax than
  globus-url-copy and can be invoked similarly to the scp command.
Note: The UniTree User Guide contains a sample
UniTree FTP session.
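To make the source and destination URL form used by globus-url-copy concrete, here is a hedged Python sketch that assembles an argument list without running it. The gsiftp:// and file:// URL schemes and the -p (parallel streams) and -tcp-bs (TCP buffer size) options belong to globus-url-copy; the function name build_gridftp_copy, the file paths, the stream count, and the buffer size are hypothetical values chosen for illustration. A valid grid proxy is assumed to exist before the resulting command is run.

```python
import shlex
from typing import List


def build_gridftp_copy(
    local_path: str,
    remote_path: str,
    host: str = "mss.ncsa.teragrid.org",
    streams: int = 4,            # hypothetical number of parallel streams
    tcp_buffer_bytes: int = 0,   # 0 means: do not pass -tcp-bs
) -> List[str]:
    """Build (but do not run) a globus-url-copy command that uploads a file."""
    if not local_path.startswith("/") or not remote_path.startswith("/"):
        raise ValueError("both paths must be absolute")
    source = "file://" + local_path
    destination = "gsiftp://" + host + remote_path
    command = ["globus-url-copy", "-p", str(streams)]
    if tcp_buffer_bytes > 0:
        command += ["-tcp-bs", str(tcp_buffer_bytes)]
    return command + [source, destination]


if __name__ == "__main__":
    # Hypothetical paths; substitute your own scratch and MSS locations.
    cmd = build_gridftp_copy(
        "/scratch/jdoe/results.tar",
        "/u/jdoe/results.tar",
        tcp_buffer_bytes=1048576,
    )
    print(shlex.join(cmd))
```

Tuning values such as the TCP buffer size are the kind of grid-specific optimization that tgcp reads from its configuration files, so with tgcp they would not need to be given by hand.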
Using UniTree
Results from production, important input data and other project-related data
can be placed in MSS permanently. Data purge policies that apply to scratch
space are in place on most production machines at NCSA. Moving data to MSS
frees up scratch file systems, which is in the user community's best interest,
and is the only way to permanently store large amounts of data that exceed the
home directory quota. Data can be staged in and out of MSS manually or using
batch script commands.
On the NCSA Intel 64 Cluster (abe) and the SGI Altix (cobalt), the saveafterjob utility is
available for automated, guaranteed saving of files from batch jobs
to mass storage. Users writing their own job scripts should specify the $SCR
environment variable as the destination file system for the job.
Important Note: Be sure to list every file that needs to be saved when invoking
saveafterjob. By default, saveafterjob purges the temporary job directory once
the specified files have been transferred successfully, so any unlisted file
left in that directory is deleted. See the document Automated Saving of Files
from Batch Jobs for more information.
Getting Further Assistance
If you require further assistance, send e-mail to one of the following addresses:
