lsf

UPDATE 12/19/02

Dedicated queues have had their wall clock time limits increased to be in line 
with the standard queues: 
short: 50 hours 
medium: 200 hours 
long: 400 hours  

-------------------------------------------------------------------------

UPDATE 7/26/02

balder (256-processor Origin2000) has been split into two 128-processor systems.
As a result, the 256-processor queues are no longer available.

-------------------------------------------------------------------------

UPDATE 8/30/01

All dedicated queues (short, medium, and long) will now be active at all times.
(Previously, the long queues were active only during the weekend.)

-------------------------------------------------------------------------

                  LSF lsbatch on the SGI Cray Origin2000
                           (Load Share Facility)
 
lsbatch (LSF version 3.0) is available on the SGI Cray Origin2000. The best
starting place for LSF information is the lsbatch man page. It contains a
description of all the batch related commands. (Please add
/usr/local/lsf/man to your MANPATH variable if it's not there already to
get the lsf man pages.)
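
A minimal sketch of the MANPATH change for a Bourne-type shell (csh users
would use setenv instead). The helper name add_manpath is hypothetical; it
appends the directory only if it is not already listed.

```shell
# add_manpath DIR: append DIR to MANPATH unless it is already listed.
# (add_manpath is an illustrative name, not a system command.)
add_manpath() {
    case ":${MANPATH:-}:" in
        *":$1:"*) ;;                              # already present, do nothing
        *) MANPATH="${MANPATH:+$MANPATH:}$1" ;;   # append, adding ":" if needed
    esac
    export MANPATH
}

add_manpath /usr/local/lsf/man
```

After this, "man lsbatch" should find the LSF pages in the current shell.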
 
Batch System Procedures
-----------------------
 
(a) Timeshared Queues
 
The following parameters 
 
* the number of threads 
* the peak memory 
* the total job run time 
 
needed by the job are required at submission time via bsub options. Jobs are 
routed to the appropriate queue based on these parameters. The default 
queues are: 

Debug Queue:

                                  
        Normalized      Normalized     
Queue      Job            Job           Job
Name     Run Time        CPU Time       Size
---------------------------------------------------

debug     10 mins         5 mins       1-4 threads
                        per thread     < 2 Gb memory
---------------------------------------------------

Regular Queues:

                                    Queue Names

Normalized                           Job Size
  Job   
Run time               Small            Medium               Large
                 1-8 threads      9-16 threads       17-64 threads
               < 2 Gb memory     < 4 Gb memory      < 25 Gb memory
------------------------------------------------------------------

5  hours              vst_sj            vst_mj              vst_lj
                  ind_vst_sj        ind_vst_mj          ind_vst_lj

50 hours               st_sj             st_mj               st_lj
                   ind_st_sj         ind_st_mj           ind_st_lj

200 hours              mt_sj             mt_mj               mt_lj
                   ind_mt_sj         ind_mt_mj           ind_mt_lj

400 hours              lt_sj             lt_mj               lt_lj
                   ind_lt_sj         ind_lt_mj           ind_lt_lj
------------------------------------------------------------------
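
The table above can be read as a lookup on the three job parameters. The
sketch below is only an illustration of that reading, not the routing code
lsbatch uses. It assumes that "< N Gb" means strictly less than N Gb, that
1 Gb = 1024 Mb, that run time limits are inclusive, and that a job goes to
the smallest size class that fits both its thread count and its memory. The
debug and ind_ queues are left out. The name pick_queue is hypothetical.

```shell
# pick_queue THREADS MEM_MB MINUTES
# Print the regular queue name whose limits fit the job (assumed rules,
# see the text above), or return 1 if no regular queue fits.
pick_queue() {
    pq_threads=$1; pq_mem=$2; pq_min=$3

    # Job size column: smallest class that fits threads and memory.
    if   [ "$pq_threads" -le 8 ]  && [ "$pq_mem" -lt 2048 ];  then pq_size=sj
    elif [ "$pq_threads" -le 16 ] && [ "$pq_mem" -lt 4096 ];  then pq_size=mj
    elif [ "$pq_threads" -le 64 ] && [ "$pq_mem" -lt 25600 ]; then pq_size=lj
    else
        echo "pick_queue: no regular queue fits this job size" >&2
        return 1
    fi

    # Run time row: 5, 50, 200, and 400 hours expressed in minutes.
    if   [ "$pq_min" -le 300 ];   then pq_class=vst
    elif [ "$pq_min" -le 3000 ];  then pq_class=st
    elif [ "$pq_min" -le 12000 ]; then pq_class=mt
    elif [ "$pq_min" -le 24000 ]; then pq_class=lt
    else
        echo "pick_queue: run time exceeds the 400 hour limit" >&2
        return 1
    fi

    echo "${pq_class}_${pq_size}"
}
```

For example, "pick_queue 12 3000 600" (12 threads, 3000 Mb, 10 hours) prints
st_mj under these assumptions.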
 
The bsub options are: 
 
-n   specifies the number of threads (default = 1). This is the maximum 
     number of active processes/threads at any given time during the 
     lifetime of the job. 
 
     If different numbers of processors are used over the lifetime of 
     the job, you must specify the maximum number used. 
 
-M   specify job peak memory limit (default = 512 Mb). This is the 
     sum of the memory usage for all processes/threads in the job. The 
     memory usage is reported by the ps(1) command as RSS: Total resident
     size of the process.  This includes only those pages of the process
     that are physically resident in memory.
 
     If no unit is specified, Kilobytes is assumed. Specify K, M, or G 
     for Kilobytes, Megabytes, and Gigabytes respectively. 
 
     NOTE: The unit specification for -M option only works at NCSA. It is 
     not standard LSF. 

-W   specify total job run time (default = 60 mins). The syntax is 
     [hour:]minute. Run time is defined as the wall clock time for the job, 
     excluding the time used for mass storage transfers and time that the job
     may be suspended by the system.  

     NOTE: Because the NCSA Origin array is composed of both 195MHz and 
     250MHz processors, run time for a job can vary depending on the 
     host on which it runs. The run time on the 250MHz hosts is normalized 
     by a factor of 250/195. Run time limits for a job are based on the 
     normalized run time. 
     
     Charging for jobs is based on normalized cputime. The busage command 
     gives both the actual and normalized cputime and run time.
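
As a worked example of the normalization, the sketch below multiplies the
actual run time on a 250MHz host by 250/195. This is one reading of the note
above; the busage command remains the authoritative source for normalized
values. The function name normalize_250 is hypothetical, and the rounding to
one decimal place is only for display.

```shell
# normalize_250 MINUTES: normalized run time, in minutes, for a job that
# ran MINUTES of actual time on a 250MHz host (assumed: actual * 250/195).
normalize_250() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m * 250 / 195 }'
}

normalize_250 39    # prints 50.0
```

So a job that needs 39 minutes of wall clock time on a 250MHz host should
request at least 50 minutes with -W under this reading.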
 
You can use the busage command to help determine accurate limits for a job. 
We recommend that you set the limits to about 110% of the usage reported by 
busage.  After a job has finished, enter:  busage [jobId] 
 
The number of processes/threads used by the job (bsub with the -n 
option) is reported as:
 
  number of processes/threads: XXX 
 
The peak memory usage (bsub with the -M option) is reported as:
 
  peak memory: XXXX 
 
The run time (bsub with the -W option) is reported as:
 
  runtime: XXXX 
  runtime (normalized): XXXX
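
The 110% recommendation can be applied with a little arithmetic. The sketch
below assumes you have already read the usage from busage and converted the
run time to whole minutes (the units printed by busage are not shown here,
so check them first). The names pad_limit and to_W are hypothetical.

```shell
# pad_limit VALUE: VALUE plus 10%, rounded up to a whole number.
pad_limit() {
    echo $(( ($1 * 11 + 9) / 10 ))
}

# to_W MINUTES: format MINUTES as hour:minute, the form accepted by -W.
to_W() {
    printf '%d:%02d\n' $(( $1 / 60 )) $(( $1 % 60 ))
}

to_W "$(pad_limit 200)"    # prints 3:40
```

A job that used 200 normalized minutes would then be resubmitted with
-W 3:40. The same pad_limit helper can be applied to the peak memory value
before passing it to -M.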
 
Notes 
-----
  i. For jobs that do not specify any or all of these parameters, bsub will 
     supply the defaults (-n1, -M512M, -W60). 
 
 ii. We strongly recommend not specifying a queue name when submitting jobs 
     to the above listed default queues. For jobs that do specify a queue 
     name, the values that you include for -n, -M, and -W (or get by 
     default) are used to accept or reject a job. That is, if the values 
     of the parameters fit the limits of the queue, the job is accepted; 
     if not, the job is rejected. 
 
iii. The queues prefaced with ind_ are industrial queues restricted to NCSA 
     industrial partners and run at a higher priority. Jobs belonging to 
     industrial users are automatically routed to these queues. Industrial 
     users who wish to submit jobs to the non-industrial default queues 
     need to specify the queue name in addition to the parameters. 
 
 iv. The limits on the queues are for lsbatch queue selection purposes only; 
     individual job limits are based on the parameters (-n, -M, and -W) 
     specified. 
 
  v. To get the peak memory used by a job, use the busage command and take 
     the value from the peak memory entry under "Information collected 
     by sampling the running job" listing. 

 vi. The environment variable $BSUB_NUMTHREADS is set to the
     number specified in the bsub -n option, so it can be used to set the
     number of threads for your program.  For example:

    setenv MP_SET_NUMTHREADS $BSUB_NUMTHREADS

    mpirun -np $BSUB_NUMTHREADS a.out


 
(b) Dedicated Queues
 
The dedicated queues are meant for benchmarking. There is a premium on
charges for jobs run in the dedicated queues. See /usr/news/Charging_algorithm
for information. Currently, the following queues are available: 
 
Queue Name      No. of    Memory       Job                    Service
               Processors  (Gb)      Time Limit                Level
----------------------------------------------------------------------------
128_ded_short    128       76   50 hours wall clock time  Normal dedicated
ind_128_ded_st   128       76   50 hours wall clock time  Priority dedicated
128_ded_med      128       76  200 hours wall clock time  Normal dedicated
ind_128_ded_mt   128       76  200 hours wall clock time  Priority dedicated
128_ded_long     128       76  400 hours wall clock time  Normal dedicated
ind_128_ded_lt   128       76  400 hours wall clock time  Priority dedicated

----------------------------------------------------------------------------

The ind (industrial) queues have priority over the regular queues within 
each time class. 

Also, if there are no jobs queued in the dedicated queues, the machines will 
run jobs in the vst queues. Therefore, a subsequent dedicated job that is 
queued will need to wait until currently running vst jobs are done. This can 
take up to 5 hours (the run time limit in the vst queues). 

To submit jobs to the dedicated queues, specify the -q option to bsub. You 
do not need to specify the parameters -n, -M, and -W for jobs in these queues;
however, you are encouraged to use -W. 
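
Because dedicated jobs are not routed automatically, the queue has to be
chosen by hand. The sketch below picks the dedicated queue with the shortest
time limit that still covers the requested wall clock time. Choosing the
shortest queue is an assumption made for illustration, not a site rule, and
the name ded_queue is hypothetical.

```shell
# ded_queue MINUTES [ind]
# Print the dedicated queue whose wall clock limit (50, 200, or 400 hours)
# covers MINUTES. Pass "ind" as the second argument for the industrial queue.
ded_queue() {
    if   [ "$1" -le 3000 ];  then dq_norm=128_ded_short; dq_ind=ind_128_ded_st
    elif [ "$1" -le 12000 ]; then dq_norm=128_ded_med;   dq_ind=ind_128_ded_mt
    elif [ "$1" -le 24000 ]; then dq_norm=128_ded_long;  dq_ind=ind_128_ded_lt
    else
        echo "ded_queue: run time exceeds the 400 hour limit" >&2
        return 1
    fi
    if [ "${2:-}" = ind ]; then echo "$dq_ind"; else echo "$dq_norm"; fi
}
```

A 40 hour benchmark could then be submitted with:

     bsub -q "$(ded_queue 2400)" -W 40:00 < script_name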
 
Tips on running in the dedicated queues are available at
http://www.ncsa.uiuc.edu/UserInfo/Consulting/Tips/Dedicated.html.

Users whose codes do not scale to the number of processors available in the 
dedicated queues can still use the dedicated queues to run multiple jobs for
faster turnaround. Information on doing this is available at:

   http://www.ncsa.uiuc.edu/UserInfo/Consulting/Tips/dplace.html

Job Submission and Control
--------------------------
The easiest way to run a job in the batch system is via a shell script.
A sample script is available in the directory /usr/local/doc/lsf/samples
that you can modify for your own use.
 
You can pass options to lsbatch through the bsub command.
 
Once you have created a job script, submit the job to lsbatch using the
bsub command as follows (the script can have embedded bsub options):
 
     bsub  < script_name
 
In this case, the script file is spooled by lsbatch.
 
NOTE:
 
     bsub {list of bsub options} script_name/executable
 
also works; in this case all bsub options need to be on the command line
(embedded bsub options are ignored) and the script file is NOT spooled by
lsbatch.
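
A minimal sketch of a job script with embedded options follows. It is not a
copy of the sample in /usr/local/doc/lsf/samples. It uses the "#BSUB" line
prefix that LSF documents for embedded options (check the bsub man page for
this release), the script is written for /bin/sh, and the file name, job
name, output file, and a.out are placeholders. The helper write_job_script
is hypothetical and exists only to create the file.

```shell
# write_job_script FILE: write a small example job script to FILE.
write_job_script() {
    cat > "$1" <<'EOF'
#!/bin/sh
#BSUB -n 4
#BSUB -M 1G
#BSUB -W 2:00
#BSUB -J myjob
#BSUB -o myjob.out
#BSUB -N

# BSUB_NUMTHREADS is set by the batch system from the -n value above.
mpirun -np "$BSUB_NUMTHREADS" ./a.out
EOF
}
```

After running "write_job_script myjob.sh", the job would be submitted with
"bsub < myjob.sh" so that the embedded options are read.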
 
Other useful bsub options:
 
-J      job name
-B      send mail when job starts
-N      send mail when job ends
-o      specify standard output file
-P      specify project to be charged [**]
 
[**] For users with multiple projects. Use the 'usage' command to see your
projects.
 
Useful commands (check the man page for usage and syntax):

To check on the status of a job:      bjobs
 
To kill a job in the batch system:    bkill [request_id]

To get resource statistics on 
  both running and completed jobs:    busage

To get information on all 
        active processes in a job:    bps 
 
Access to Batch
---------------
A batch job may be submitted at any time. Batch jobs run at all times.
There is currently a limit of 5 total (queued and running) jobs. 
 
 
Disk Space for Batch Jobs
-------------------------
Each machine in the Origin array has a local XFS scratch filesystem.
We *strongly* recommend running jobs in the machine-local scratch directory
rather than in NFS-mounted non-local scratch (e.g., scratch-modi4),
for performance and reliability reasons. Use of non-local scratch is at 
your own risk. No refunds will be issued for batch jobs that fail due to 
use of non-local scratch. 

Each batch job has a per-job scratch directory, which is created on the
local scratch directory on the executing host when the job starts. The
directory is named based on the batch jobID and the start time of the job.
The name of this directory is available to batch job scripts in the $SCR
environment variable. 

For example, job 229106 that started on Feb 5, 1999 at 11:43:22 on machine
jord1 would have $SCR set to: 

/scratch-jord1/LSBATCH/229106.5Feb1999114322

To go to the scratch directory associated with this job, at the shell prompt,
enter: 

% cd /scratch-jord1/LSBATCH/229106*

See the sample scripts in /usr/local/doc/lsf/samples on how to use 
machine-local scratch.
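
As a rough illustration of the pattern (not a copy of the sample scripts),
the sketch below copies an input file into $SCR, runs a command there, and
copies one output file back to the directory the job started in. It is
written for /bin/sh, the function name run_in_scratch is hypothetical, and
it assumes the input file name has no directory part.

```shell
# run_in_scratch INPUT OUTPUT COMMAND [ARGS...]
# Stage INPUT into $SCR, run COMMAND there, and copy OUTPUT back.
run_in_scratch() {
    rs_input=$1; rs_output=$2; shift 2
    rs_start=$(pwd)

    if [ -z "${SCR:-}" ]; then
        echo "run_in_scratch: SCR is not set (not inside a batch job?)" >&2
        return 1
    fi

    cp "$rs_input" "$SCR/" || return 1        # stage in
    ( cd "$SCR" && "$@" ) || return 1         # run in local scratch
    cp "$SCR/$rs_output" "$rs_start/"         # stage out
}
```

Inside a job script this might be called as
"run_in_scratch input.dat output.dat ./a.out", where the file names and
a.out are placeholders.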

Documentation
-------------
 
http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/Origin2000/Doc/Jobs.html
has information on running jobs.
 
The directory /usr/local/doc/lsf contains postscript versions of the
LSF User's Guide, Administrator's Guide and Release Notes.

Silicon Graphics Origin2000:usr/news/lsf
Last Modified: February 10, 2003