
NCSA resources: Data Transfer

  1. Overview
  2. Transfer Protocols
  3. Transfer Clients
  4. Examples

Data Transfer Overview

NCSA resources support several methods for moving data. The method chosen generally depends on four factors:

  • Location of the data to be transferred.
  • Destination to which the data is to be transferred.
  • Total size of the data to be transferred.
  • Number of individual files to be transferred.
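
The last two factors can be measured before choosing a client. The following is a minimal sketch, assuming a POSIX shell with find, du and awk available; transfer_summary is a hypothetical helper name, not an NCSA tool.

```shell
# transfer_summary DIR: print the number of regular files and the total
# size (in KiB) under DIR. Hypothetical helper, for illustration only.
transfer_summary() {
    local dir="$1"
    local count size_kb
    # Count regular files only; directories and symlinks are skipped.
    # (File names containing newlines would be miscounted.)
    count=$(find "$dir" -type f | wc -l | tr -d ' ')
    # du -sk reports the disk usage of the whole tree in 1024-byte units.
    size_kb=$(du -sk "$dir" | awk '{print $1}')
    echo "files=$count size_kb=$size_kb"
}
```

For example, transfer_summary directory_name prints one line such as files=3 size_kb=12; a large file count with a small total size suggests bundling the files before sending them.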

Transfer methods fall into two broad types of clients: SSH-based and FTP-based. Grid-enabled clients are also available to XSEDE users.

NCSA production XSEDE resources use nodes dedicated for file transfer. These nodes run GridFTP servers and mount all of the shared file systems including home directories, scratch and the parallel file systems. The dedicated GridFTP servers can be used to move data between XSEDE Resources with valid grid credentials.

Performance Note

XSEDE resources are connected by a high-speed network, so transfers between two XSEDE resources should be expected to perform differently from transfers between an XSEDE resource and a non-XSEDE resource. The throughput of any file transfer is limited by the slowest component in the network chain, and identifying that bottleneck can be the most difficult part of the exercise. If one endpoint is a non-XSEDE resource, working with your site administrator and network technicians may be the only way to overcome performance limitations. Check with the network administrator of your local site for connectivity details and for firewall or network bottlenecks that can cause unexpected or inconsistent bandwidth or functionality.
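
The "slowest component" rule can be turned into a rough lower bound on transfer time. This is a sketch under stated assumptions: the link rates are hypothetical values you would obtain from your network administrator, and the estimate ignores protocol overhead, disk speed and competing traffic.

```shell
# bottleneck_mbps RATE...: print the smallest of the given link rates (Mbit/s).
bottleneck_mbps() {
    printf '%s\n' "$@" | sort -n | head -n 1
}

# transfer_seconds SIZE_MB RATE...: lower bound, in whole seconds, on the
# time to move SIZE_MB megabytes (10^6 bytes) over the slowest listed link.
transfer_seconds() {
    local size_mb="$1"
    shift
    local slowest
    slowest=$(bottleneck_mbps "$@")
    # 8 bits per byte; shell integer division rounds down.
    echo $(( size_mb * 8 / slowest ))
}
```

With hypothetical links of 10000, 1000 and 100 Mbit/s, a 1000 MB transfer cannot finish in less than 80 seconds, however fast the other two links are.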

Data Transfer Protocols

Transfer protocols: description, advantages, disadvantages and recommendations

SSH (SCP/SFTP)
A network protocol for secure data communication, remote shell services, command execution and other secure network services between two networked computers, which it connects via a secure channel over an insecure network.
Advantages
  • Recursive feature allows simple reproduction of entire directory hierarchies of files.
  • Data is transmitted over a secure channel.
  • Host-key-based authentication is possible.
  • Convenient way to transfer source code or other relatively small files to/from your /home directory.
Disadvantages
  • Individual files are transmitted separately, which becomes an issue when network latency is high.
  • Performance is poor over wide area links due to small TCP window sizes.
  • File transfers larger than 2 GB are not supported on some systems.
  • Data encryption can become a bottleneck for large transfers.
Recommendations
  • Use to transfer small files or directories containing source code or other relatively small file sets.
  • Tar directories containing large numbers of files when sending over high-latency networks.
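
The tar recommendation can be sketched as follows. This assumes a tar that supports -z (gzip) and -C; pack_dir is a hypothetical helper name, and user and remote_hostname are the same placeholders used in the examples on this page.

```shell
# pack_dir DIR ARCHIVE: bundle DIR into one gzip-compressed tar archive so
# that a single file crosses the network instead of many small ones.
pack_dir() {
    local dir="$1" archive="$2"
    # -C switches to the parent directory so the archive stores paths
    # relative to it (directory_name/..., not the full local path).
    tar -czf "$archive" -C "$(dirname "$dir")" "$(basename "$dir")"
}

# Typical use (not run here):
#   pack_dir ./directory_name directory_name.tar.gz
#   scp directory_name.tar.gz user@remote_hostname:
# Or stream the archive without creating a temporary file:
#   tar -cf - directory_name | ssh user@remote_hostname 'tar -xf -'
```

Either way, the per-file round trips that hurt scp on high-latency links are replaced by one continuous stream.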
High Performance Networking SSH (HPN-SSH, SCP/SFTP)
A patched version of the base OpenSSH code designed to improve performance.
Advantages
  • Works under the normal interface of scp.
  • Allows TCP receive buffers to be adjusted automatically or set manually.
  • Allows data encryption to be turned off, reducing the CPU load and allowing high-bandwidth transfers to reach the full network potential.
Disadvantages
  • Both client and server must be patched for full functionality.
  • Transfer performance may degrade on local low-latency networks.
Recommendations
  • Run ssh -V to determine whether the ssh client at the site from which you will issue scp/sftp commands is built with the HPN-SSH patch.
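
The ssh -V check can be scripted. This is a minimal sketch that assumes HPN-SSH builds carry an "hpn" tag in their version string (for example "OpenSSH_5.8p1-hpn13v11"); confirm that convention for your own installation. is_hpn_version is a hypothetical helper name.

```shell
# is_hpn_version STRING: succeed if an "ssh -V" version string contains
# "hpn" (case-insensitive). Assumption: patched builds advertise the patch
# in the version string; an unpatched build does not.
is_hpn_version() {
    printf '%s\n' "$1" | grep -qi 'hpn'
}

# ssh -V writes to stderr, so redirect it when capturing (not run here):
#   if is_hpn_version "$(ssh -V 2>&1)"; then
#       echo "ssh client appears to be HPN-SSH"
#   fi
```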
GridFTP
A high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks. The GridFTP protocol is based on FTP, the highly popular Internet file transfer protocol.
Advantages
  • GSI authentication: allows secure password-less authentication with a valid X509 certificate and proxy.
  • Extended capabilities to accommodate performance increases through parallelism, optimized buffering and other techniques.
Disadvantages
  • Limited to the FTP interface for recursive operations, listing and renaming.
  • Detailed knowledge of server deployments and system characteristics may be needed to obtain optimal performance.
Recommendations
  • Use when moving large data sets to or from a parallel file system.
  • Striping or concurrent transfers (striped or non-striped) can help take advantage of multiple server hosts.
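
For the buffering mentioned above, a common rule of thumb (not something this page prescribes) is to size the TCP buffer near the bandwidth-delay product of the path. The sketch below computes it; the bandwidth and round-trip time are hypothetical inputs you would measure yourself, and bdp_bytes is a hypothetical helper name.

```shell
# bdp_bytes MBPS RTT_MS: bandwidth-delay product in bytes.
#   bytes = (MBPS * 10^6 bit/s) * (RTT_MS / 1000 s) / 8
#         = MBPS * RTT_MS * 125
bdp_bytes() {
    local mbps="$1" rtt_ms="$2"
    echo $(( mbps * rtt_ms * 125 ))
}
```

For instance, bdp_bytes 1000 32 gives 4000000 bytes for a 1 Gbit/s path with a 32 ms round trip. That matches the 4000000-byte buffer values used in this page's globus-url-copy examples, although the page does not say how those values were chosen. When several parallel streams are used, each stream needs only a share of this total.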

Data Transfer Clients

SSH-based clients such as SCP, SFTP, MobaXterm, FileZilla and FireFTP are just a few of the transfer clients known to work for moving data on NCSA resources. XSEDE users can use an additional set of grid-enabled clients, including GSI-SSHTerm, Globus Online, XSEDE File Manager and UberFTP. Grid-enabled clients use the XSEDE infrastructure to move data between XSEDE resources. Transfers between XSEDE sites have the full complement of XSEDE tools available, including Globus GSI authentication and dedicated GridFTP servers at each site. The sites are connected over a high-bandwidth wide area network (WAN). Within this framework, transfers between computing centers are best carried out by using the combined network bandwidth of several machines at the endpoints of a transfer. For more information about data transfer on XSEDE, see the XSEDE Data Transfer page.

Command Line Clients

Clients: description, where available and usage

scp/sftp
Description
  • Transfer clients based on SSH.
  • Usually installed as part of an ssh client, or packaged as a standalone transfer client (with no ssh/login functionality).
Where available/usage
  • Available on all NCSA resources.
  • Can be installed on all operating systems for external access.
uberftp*
Description
  • Fully capable, GridFTP-enabled FTP client with command-line and interactive interfaces.
  • Can pass FTP commands directly to the server via the "quote" command.
  • ls, rm, chmod and other utility functions are available, some with recursive options.
  • Parallel streams and other GridFTP extensions can be enabled.
  • GSI (grid-proxy) authentication available for XSEDE users.
  • Supports third-party transfers.
Where available/usage
  • Available on all XSEDE NCSA resources.
  • Can be installed in conjunction with the Globus Toolkit on Unix/Linux systems.
  • Move files between client location and any FTP or GridFTP installation.
  • Move files between any set of GridFTP servers (third party).
globus-url-copy*
Description
  • A command-line GridFTP client that is a component of the Globus Toolkit.
  • Allows striped transfers across multiple servers.
Where available/usage
  • Available on all XSEDE NCSA resources.
  • Can be installed in conjunction with the Globus Toolkit on Unix/Linux systems.
  • Move files between client location and any FTP or GridFTP installation.
  • Move files between any set of GridFTP servers (third party).
mssftp/msscmd
Description
  • mssftp allows a password-less interactive FTP session to be initiated from any production NCSA compute resource.
  • msscmd is a command-line interface that sends FTP commands to MSS and retries transfers automatically.
Where available/usage
  • Available on all NCSA compute resources.
  • Can be installed in conjunction with the Globus Toolkit on Unix/Linux systems.
  • Move files between NCSA compute resources and NCSA's Mass Storage System (MSS).
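
msscmd is described above as retrying transfers automatically. For clients without that feature, a generic retry loop can be wrapped around any transfer command. This is a sketch of the general technique only, not msscmd's actual implementation; retry and RETRY_DELAY are hypothetical names.

```shell
# retry MAX CMD [ARGS...]: run CMD until it succeeds or MAX attempts have
# been made. Waits RETRY_DELAY seconds (default 1) between attempts.
retry() {
    local max="$1"
    shift
    local attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$(( attempt + 1 ))
        sleep "${RETRY_DELAY:-1}"
    done
}

# Typical use (not run here; placeholders as in the examples on this page):
#   retry 3 scp file_name user@remote_hostname:
```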


Web/GUI based Clients

Clients: description, where available and usage

Globus Online*
(Recommended for XSEDE users)
Description
  • Easy: Globus Online has simplified signup, login and use, and minimizes manual intervention.
  • Fast: Globus Online can move large filesets in hours.
  • Secure: No need to worry about security configurations or certificates, and one-time passwords just work.
  • Reliable: Users can fire and forget their transfers.
  • Research-focused: Globus Online is the only file transfer service built with the scientific researcher in mind.
Where available/usage
  • Web-based application.
  • Accessible from anywhere there is an internet connection.
XSEDE File Manager*
Description
  • File Manager comes in two forms: a Java applet and a standalone application.
  • Enables access to desktop files, remote XSEDE systems, cloud storage systems (specifically Eucalyptus and Amazon's S3), and external systems defined by users.
  • Viewable custom list of XSEDE resources based on your accounts.
Where available/usage
  • Java applet available through the XUP.
  • Can be installed on all operating systems that have Oracle (Sun) Java installed.
  • Transfer files between XSEDE and external (non-XSEDE) systems, including local desktops and laptops.
  • Delete, rename, and copy files and directories.
  • View the status of current file transfers and the history of previous transfers.

* Available to XSEDE users for use with XSEDE resources.

Data transfer examples

SSH Based clients
Client Description/Example
scp/gsiscp*

Copy local file to remote host

[ncsa: ~]$ scp file_name user@remote_hostname:
[ncsa: ~]$ gsiscp file_name user@remote_hostname:

Recursively copy local files to remote host

[ncsa: ~]$ scp -r * user@remote_hostname:
[ncsa: ~]$ gsiscp -r * user@remote_hostname:

Recursively copy local directory to remote host

[ncsa: ~]$ scp -r directory_name user@remote_hostname:
[ncsa: ~]$ gsiscp -r directory_name user@remote_hostname:

Recursively copy remote files to remote host (third-party transfer)

[ncsa: ~]$ scp -r user@remote_hostname1:~/* user@remote_hostname2:~/
[ncsa: ~]$ gsiscp -r user@remote_hostname1:~/* user@remote_hostname2:~/

Recursively copy remote directory to remote host

[ncsa: ~]$ scp -r user@remote_hostname1:~/sub_dir user@remote_hostname2:~/sub_dir
[ncsa: ~]$ gsiscp -r user@remote_hostname1:~/sub_dir user@remote_hostname2:~/sub_dir

sftp/gsisftp*

Copy local file to remote host

[ncsa: ~]$ sftp user@remote_hostname
sftp> put file_name


[ncsa: ~]$ gsisftp user@remote_hostname
sftp> put file_name


Recursively copy local files to remote host

[ncsa: ~]$ sftp user@remote_hostname
sftp> put -r *


[ncsa: ~]$ gsisftp user@remote_hostname
sftp> put -r *



GUI/Web based SSH clients

SSH-based clients such as MobaXterm, FileZilla, WinSCP and GSI-SSHTerm* generally provide full or partial drag-and-drop functionality.

GridFTP Based clients
Client Description/Example
globus-url-copy*

Copy local file to remote host

[ncsa: ~]$ globus-url-copy -vb -p 32 -stripe -sbs 4000000 -tcp-bs 4000000 \
    file:///path/to/source/file_name \
    gsiftp://remote_gridftp_hostname/path/to/destination/file_name

Recursively copy local files to remote host

[ncsa: ~]$ globus-url-copy -vb -p 32 -stripe -sbs 4000000 -tcp-bs 4000000 -r \
    file:///path/to/source/directory_name/ \
    gsiftp://remote_gridftp_hostname/path/to/destination/directory_name/

Copy remote file to remote host (third-party transfer)

[ncsa: ~]$ globus-url-copy -vb -p 32 -stripe -sbs 4000000 -tcp-bs 4000000 \
    gsiftp://remote_gridftp_hostname1/path/to/source/file_name \
    gsiftp://remote_gridftp_hostname2/path/to/destination/file_name
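
The options in the globus-url-copy examples above can be read as tunable parameters. The sketch below only prints a command line rather than running one, so it can be inspected first; guc_command is a hypothetical helper name and the default of 4 streams is an arbitrary assumption, not an NCSA recommendation.

```shell
# guc_command SRC_URL DST_URL [STREAMS] [BUF_BYTES]: print (do not run) a
# globus-url-copy command line. Options used:
#   -vb      show bytes transferred and the transfer rate while copying
#   -p N     use N parallel data connections
#   -tcp-bs  TCP buffer size in bytes
# Not included here: -stripe and -sbs (striped transfers and their block
# size) and -r (recursive copy), which appear in the examples above.
guc_command() {
    local src="$1" dst="$2" streams="${3:-4}" buf="${4:-4000000}"
    echo "globus-url-copy -vb -p $streams -tcp-bs $buf $src $dst"
}
```

For example, guc_command file:///path/to/source/file_name gsiftp://remote_gridftp_hostname/path/to/destination/file_name 8 prints the command with -p 8, which can then be pasted at the prompt or adjusted further.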

uberftp*

Copy local file to remote host

[ncsa: ~]$ uberftp remote_gridftp_hostname "put file_name"

Recursively copy local files to remote host

[ncsa: ~]$ uberftp remote_gridftp_hostname "put -r *"

Recursively copy local directory to remote host

[ncsa: ~]$ uberftp remote_gridftp_hostname "put -r directory_name"


GUI/Web based GridFTP clients*

GridFTP-based clients such as Globus Online and XSEDE File Manager generally provide full or partial drag-and-drop functionality.

NCSA's MSS FTP tools
Client Description/Example
mssftp/msscmd

Copy local file to NCSA's MSS

[ncsa: ~]$ mssftp
ftp> put file_name


[ncsa: ~]$ msscmd "put file_name"

Recursively copy local files to NCSA's MSS

[ncsa: ~]$ mssftp
ftp> put -r *


[ncsa: ~]$ msscmd "put -r *"

Recursively copy local directory to NCSA's MSS

[ncsa: ~]$ mssftp
ftp> put -r directory_name


[ncsa: ~]$ msscmd "put -r directory_name"

* Available to XSEDE users for use with XSEDE resources.