- Overview
- Interactive Use
- Running Programs
- Queues
- Batch Commands
- qsub
- qsub -I
- qstat
- qhist
- qdel
- Sample Batch Scripts
- Disk Space for Batch Jobs
- Automated Saving of Files from Batch Jobs
1. Overview
The NCSA Dell NVIDIA Linux Cluster Forge uses the Torque Resource Manager with the Moab Workload Manager for running jobs. Torque is based on OpenPBS, so its commands are the same as PBS commands.
2. Interactive Use
The login node forge.ncsa.illinois.edu is available for
interactive use. It has 16 cores and 6 GPU devices. In general, interactive use should be limited to compiling and other development tasks, such as editing source and debugging. The batch system is available for all other jobs.
See the section on qsub -I
for instructions on how to run an interactive job on the compute nodes.
3. Running Programs
GPU
The batch system requires no additional information for running on the GPUs.
Execution of GPU kernels is controlled from within the host code, with 6 or 8 GPUs available
from each node.
Simply set your batch script for deployment on the host node(s), as in the sample batch scripts.
Also see Affinity information on Forge for information on using the
GPUs on the nodes optimally.
MPI
The MPI implementations on Forge have the
mpirun script for running an MPI program. See the sample batch scripts for syntax details for the MPI implementations.
OpenMP
Before you run an OpenMP program, set the environment variable
OMP_NUM_THREADS to the number of threads you want. For example,
to run program a.out interactively with two threads:
setenv OMP_NUM_THREADS 2
./a.out
The following environment variables may also be useful in running your OpenMP
programs:
| OMP_SCHEDULE | Sets the schedule type and (optionally) the chunk size for DO and PARALLEL DO loops declared with a schedule of RUNTIME. The default is STATIC. |
| KMP_LIBRARY | Sets the run-time execution mode. The default is throughput, but it can be set to turnaround so worker threads do not yield while waiting for work. |
| KMP_STACKSIZE | Sets the number of bytes to allocate for the stack of each parallel thread. You can use a suffix k, m, or g to specify kilobytes, megabytes or gigabytes. The default is 4m. |
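As a minimal sketch, the settings below combine these variables for a single run of a.out. The values are illustrative assumptions rather than site recommendations, and the syntax is bash (export); under csh or tcsh, use setenv as in the example above.

```shell
# Illustrative values only; tune them for your own program.
# bash syntax; the csh/tcsh equivalent is: setenv NAME value
export OMP_NUM_THREADS=16          # one thread per core on a 16-core node
export OMP_SCHEDULE="dynamic,4"    # used by loops declared with a schedule of RUNTIME
export KMP_LIBRARY=turnaround      # worker threads do not yield while waiting for work
export KMP_STACKSIZE=16m           # 16 megabytes of stack per parallel thread

# ./a.out                          # uncomment to launch your OpenMP program
```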
Hybrid MPI/OpenMP
To run an MPI/OpenMP hybrid program, set the environment variable OMP_NUM_THREADS to the number of threads you want, and change the number of cores per node used for MPI accordingly. For example, to run a program with 10 MPI ranks and 16 threads for each rank, do the following in your batch script:
#PBS -l nodes=10:ppn=1
setenv OMP_NUM_THREADS 16
(See the qsub section for information on PBS directives.)
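The arithmetic behind the example above can be sketched as a small bash function. It assumes 16 cores per node and that ppn is the number of MPI ranks placed on each node; pbs_nodes_line is a hypothetical helper written for illustration, not a Forge command.

```shell
# Sketch: derive a "#PBS -l nodes=...:ppn=..." line for a hybrid job.
# Assumes 16 cores per node and ppn = MPI ranks per node.
pbs_nodes_line() {
    local ranks=$1 threads=$2 cores_per_node=16
    if [ "$threads" -lt 1 ] || [ "$threads" -gt "$cores_per_node" ]; then
        echo "error: threads per rank must be between 1 and $cores_per_node" >&2
        return 1
    fi
    # Ranks that fit on one node when each rank runs $threads threads.
    local ppn=$(( cores_per_node / threads ))
    # Round up so that every rank gets a slot.
    local nodes=$(( (ranks + ppn - 1) / ppn ))
    echo "#PBS -l nodes=${nodes}:ppn=${ppn}"
}

pbs_nodes_line 10 16   # prints: #PBS -l nodes=10:ppn=1
pbs_nodes_line 8 4     # prints: #PBS -l nodes=2:ppn=4
```

The first call reproduces the directive shown above; the second shows how fewer threads per rank allow several ranks to share a node.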
4. Queues
The following queues are currently available for users:
| Queue | GPU configuration | Walltime | Max # Nodes |
| debug | 6 or 8 GPU nodes | 30 mins | 4 |
| normal | 6 GPU nodes | 48 hours | 18 |
| eight | 8 GPU nodes | 48 hours | 8 |
NOTE: Although the Forge cluster has 44 nodes in total, not all of them may be available in practice because of offline nodes, etc.
5. Batch Commands
Below are brief descriptions of the useful batch commands.
For more detailed information, refer to the individual man pages.
5.1 qsub
The qsub command is used to submit a batch job to a queue.
All options to qsub can be specified either on the command line
or as a line in a script (known as an embedded option). Command line
options have precedence over embedded options.
Scripts can be submitted using
qsub [list of qsub options] script_name
The main qsub options are listed below.
The sample batch scripts illustrate qsub usage and options.
Also see the qsub man page for other options.
- -l resource_list: specifies resource limits.
  The resource_list argument is of the form:
  resource_name[=[value]][,resource_name[=[value]],...]:resource
  The resource_names are:
  walltime: maximum wall clock time (hh:mm:ss) [default: 10 mins]
  nodes: number of 16-core nodes [default: 1 node]
  ppn: how many cores per node to use (1 through 16) [default: ppn=1]
  Example:
  #PBS -l walltime=00:30:00,nodes=2:ppn=16
- -q queue_name: specifies the queue name. [default: normal]
- -N jobname: specifies the job name.
- -W depend=dependency_list: defines the dependency between the current job and other jobs.
- -o out_file: stores the standard output of the job in the file out_file. After the job is done, this file will be found in the directory from which the qsub command was issued. [default: <jobname>.o<PBS_JOBID>]
- -e err_file: stores the standard error of the job in the file err_file. After the job is done, this file will be found in the directory from which the qsub command was issued. [default: <jobname>.e<PBS_JOBID>]
- -j oe: merges standard output and standard error into the standard output file.
- -V: exports all your environment variables to the batch job.
- -m be: sends mail at the beginning and end of a job.
- -M myemail@myuniv.edu: sends any email to the given email address.
- -A project: charges your job to a specific project (XSEDE project or NCSA PSN). This is for users in more than one project.
- -X: enables X11 forwarding.
Notes:
- Using the -N option without the -o and -e options will generate stdout and stderr
files of the form
<jobname>.o<jobid> and <jobname>.e<jobid>, respectively,
in the directory from which the batch job was submitted.
- Temporary stdout/stderr files while the job is running are located in the
home directory, and named <jobid>.fsched.OU and
<jobid>.fsched.ER.
- When using the -W option, the generally recommended dependency types are before,
beforeany, after and afterany. While there are additional dependency
types, those that depend on batch job error codes may not behave as expected because of the
difference between a batch job error and an application error. See the dependency section of the qsub manual
page (man qsub) for additional information.
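As a rough sketch of how the options above fit together as embedded options, the bash snippet below writes a small job script to a file. The file name example_job.pbs, the job name, and the program a.out are placeholders chosen for illustration; the sample batch scripts remain the reference for site-specific details such as the mpirun syntax.

```shell
# Write a small job script that uses embedded options (the "#PBS" lines).
# The file name, job name, and program name are placeholders.
cat > example_job.pbs <<'EOF'
#PBS -l walltime=00:30:00,nodes=2:ppn=16
#PBS -q debug
#PBS -N example_job
#PBS -j oe
#PBS -m be
#PBS -M myemail@myuniv.edu

# Start your program here; see the sample batch scripts for the
# mpirun syntax of each MPI implementation.
./a.out
EOF

# Submit it from the login node (commented out so the sketch runs anywhere):
# qsub example_job.pbs
# A command-line option takes precedence over the embedded one, for example:
# qsub -q normal example_job.pbs
```

With -N and -j oe set and no -o option, the merged output would be expected in example_job.o<jobid> in the directory from which the job was submitted.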
5.2 qsub -I
The -I option tells qsub that you want to run an interactive job. You can also
use other qsub options, such as those shown in the sample batch scripts.
For example, the following command:
qsub -I -V -q debug -l walltime=00:30:00,nodes=1:ppn=16
will run an interactive job with a wall clock limit of 30 minutes, using
one node and sixteen cores per node.
After you enter the command, you will have to wait for Torque to start the
job. As with any job, your interactive job will wait in the queue until
the specified number of nodes is available. If you specify a small
number of nodes for smaller amounts of time, the wait should be shorter
because your job will backfill among larger jobs.
Once the job starts, you
will see something like this:
qsub: waiting for job 914.fsched to start
qsub: job 914.fsched ready
Now you are logged into the launch node. At this point, you can use the
appropriate command to start your program.
When you are done with your runs, you can use the exit command to end
the job.
5.3 qstat
The qstat command displays the status of batch jobs.
- qstat -a gives the status of all jobs on the system.
- qstat -n lists nodes allocated to a running job in
addition to basic information.
The first host on the list is the launch node.
- qstat -f PBS_JOBID gives detailed information
on a particular job.
Note: Currently PBS_JOBID needs to be the full extension:
<jobid>.abem5.ncsa.uiuc.edu.
- qstat -q provides summary information on all the queues.
See the man page for other options available.
5.4 qhist
qhist, a locally written tool, summarizes the raw accounting
record(s) for one or more jobs.
See the output of "qhist --help" for details.
NOTE: SU charges for a job are available the day after the job completes.
To display information about a specific job, the syntax is qhist PBS_JOBID.
5.5 qdel
The qdel command deletes a queued job or kills a running job.
The syntax is qdel PBS_JOBID.
Note: You only need to use the numeric part of the Job ID.
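For scripts that hold the full job ID (for example 914.fsched, as printed by qsub -I), the numeric part can be extracted with shell parameter expansion. This is a minimal bash sketch; numeric_jobid is a hypothetical helper, not a Forge command.

```shell
# Sketch: reduce a full job ID such as "914.fsched" to its numeric part.
numeric_jobid() {
    # Drop everything from the first "." onward.
    echo "${1%%.*}"
}

jobid=$(numeric_jobid "914.fsched")
echo "$jobid"        # prints: 914
# qdel "$jobid"      # uncomment on the cluster to delete the job
```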
6. Sample Batch Scripts
Sample batch scripts are available in the directory
/usr/local/doc/batch_scripts for use as a template.
7. Disk Space for Batch Jobs
Scratch space for batch jobs is provided via a per-job scratch directory that
is created at the beginning of the job. This directory is created under
/scratch/batch, and is based on the JobID. If the batch script uses one of the sample scripts as a template, the name of this scratch directory is
available to job scripts with the $SCR environment variable.
Your job scratch directory may be deleted soon (possibly immediately) after your job completes, so
you should take care to transfer results to the mass storage system
(see the section Automated Saving of Files from Batch Jobs).
The cdjob command
can be used to change the working directory to the scratch directory of a
running batch job.
The syntax is
cdjob PBS_JOBID
8. Automated Saving of Files from Batch Jobs
The saveafterjob utility is available for
automated, guaranteed saving of output files from batch jobs to the mass
storage system.
For details on its use, see the saveafterjob
page and the sample PBS batch scripts.