IBM LoadLeveler for AIX 5L: Using and Administering
The startd daemon monitors jobs and machine resources on the
local machine and forwards this information to the negotiator daemon.
The startd also receives and executes job requests originating from remote
machines. The master daemon starts, restarts, signals, and stops the
startd daemon.
The startd daemon can be in any one of the following states:
- Busy
  The maximum number of jobs is running on this machine.
- Down
  The daemon is not running on this machine. The startd daemon enters
  this state when it has not reported its status to the negotiator. This
  can occur when the machine is actually down or when there is a network
  failure.
- Drained
  The startd machine will not accept any new jobs. However, any jobs
  that are already running on the startd machine will be allowed to
  complete.
- Draining
  The startd daemon has been drained by the administrator, but some jobs are
  still running. The machine remains in the draining state until all of
  the running jobs have completed, at which time the machine status changes to
  drained. The startd daemon will not accept any new jobs while in the
  draining state.
- Flush
  Any running jobs have been vacated (terminated and returned to the queue
  to be redispatched). The startd daemon will not accept any new
  jobs.
- Idle
  The machine is not running any jobs.
- None
  LoadLeveler is running on this machine, but no jobs can run here.
- Running
  The machine is running one or more jobs and is capable of running
  more.
- Suspend
  All LoadLeveler jobs running on this machine are stopped (cease
  processing), but remain in virtual memory. The startd daemon will not
  accept any new jobs.
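
The state descriptions above can be summarized in a small model. The following Python sketch is illustrative only: the names StartdState, accepts_new_jobs, and after_job_completion are hypothetical and are not part of any LoadLeveler interface. It also assumes that an Idle machine can accept jobs, which the text implies but does not state.

```python
from enum import Enum


class StartdState(Enum):
    """Startd states as listed above (illustrative names, not a LoadLeveler API)."""
    BUSY = "Busy"
    DOWN = "Down"
    DRAINED = "Drained"
    DRAINING = "Draining"
    FLUSH = "Flush"
    IDLE = "Idle"
    NONE = "None"
    RUNNING = "Running"
    SUSPEND = "Suspend"


# Assumption: only Idle and Running machines can take more work. Every other
# state is described as full, unavailable, or refusing new jobs.
ACCEPTS_NEW_JOBS = frozenset({StartdState.IDLE, StartdState.RUNNING})


def accepts_new_jobs(state: StartdState) -> bool:
    """Return True if a machine in this state could be given another job."""
    return state in ACCEPTS_NEW_JOBS


def after_job_completion(state: StartdState, running_jobs: int) -> StartdState:
    """Model the Draining -> Drained change once the last running job completes."""
    if state is StartdState.DRAINING and running_jobs == 0:
        return StartdState.DRAINED
    return state
```

For example, accepts_new_jobs(StartdState.DRAINING) is False, and after_job_completion(StartdState.DRAINING, 0) yields StartdState.DRAINED.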
The startd daemon performs these functions:
The startd daemon spawns a starter process after the schedd
daemon tells the startd to start a job. The starter process manages all
the processes associated with a job step. The starter process is
responsible for running the job and reporting status back to startd.
The starter process performs these functions:
- Processes the prolog and epilog programs as defined by the
JOB_PROLOG and JOB_EPILOG keywords in the configuration
file. The job will not run if the prolog program exits with a return
code other than zero.
- Handles authentication. This includes:
  - Authenticating AFS, if necessary
  - Verifying that the submitting user is not root
  - Verifying that the submitting user has access to the appropriate
    directories in the local file system
- Runs the job by forking a child process that runs with the user ID and all
groups of the submitting user. The starter child creates a new process
group of which it is the process group leader, and executes the user's
program or a shell. The starter parent is responsible for detecting the
termination of the starter child. LoadLeveler does not monitor the
children of the parent.
- Responds to vacate and suspend orders from the startd.
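
The prolog check and the fork-and-execute step described above can be sketched on a POSIX system as follows. This is a minimal illustration, not the starter's actual implementation: the functions verify_submitter and run_job_step are hypothetical, the prolog and epilog are passed as argument lists rather than read from JOB_PROLOG and JOB_EPILOG, and the switch to the submitting user's ID and groups is omitted because it requires root privileges.

```python
import os
import subprocess


def verify_submitter(uid: int) -> None:
    """Reject jobs submitted by root, as the starter's checks require."""
    if uid == 0:
        raise PermissionError("the submitting user must not be root")


def run_job_step(job_argv, prolog_argv=None, epilog_argv=None):
    """Run one job step roughly as described above.

    Returns the job's exit code (negative signal number if it was killed),
    or None if the prolog returned nonzero and the job was not run.
    """
    # Prolog gate: a return code other than zero means the job does not run.
    if prolog_argv is not None and subprocess.run(prolog_argv).returncode != 0:
        return None

    pid = os.fork()
    if pid == 0:
        # Starter child: become leader of a new process group, then
        # replace this process with the user's program or shell.
        try:
            os.setpgid(0, 0)
            os.execvp(job_argv[0], job_argv)
        finally:
            os._exit(127)  # reached only if setpgid or exec failed

    # Starter parent: detect the termination of the starter child.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        exit_code = os.WEXITSTATUS(status)
    else:
        exit_code = -os.WTERMSIG(status)

    if epilog_argv is not None:
        subprocess.run(epilog_argv)
    return exit_code
```

Calling run_job_step(["sh", "-c", "exit 3"]) returns 3, while passing prolog_argv=["false"] returns None without starting the job. Because the child leads its own process group, a caller could signal the whole job step at once with os.killpg, which is one plausible way to act on a vacate or suspend order.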