IBM LoadLeveler for AIX 5L: Using and Administering
Purpose
Returns status information about machines in the LoadLeveler
cluster. It does not provide status on any NQS machine.
Syntax
llstatus [-?] [-H] [-R] [-F] [-v] [-l] [-f category_list]
         [-r category_list] [hostlist]
Flags
- -?
- Provides a short usage message.
- -H
- Provides extended help information.
- -R
- Lists all of the machine consumable resources associated with all of the
machines in the LoadLeveler cluster (when specified alone). When a host
list is specified, the option only displays machine consumable resources
associated with the specified hosts. This option should not be used
with any other option.
- -F
- Lists all of the floating consumable resources associated with the
LoadLeveler cluster. This option should not be used with any other
option.
- -v
- Outputs the name of the command, release number, service level, service
level date, and operating system used to build the command.
- -l
- Specifies that a long listing be generated for each machine for
which status is requested. If -l is not
specified, the standard list, described below, is generated.
- -f category_list
- Is a blank-delimited list of categories you want to query. Each
category you specify must be preceded by a percent sign. The
category_list cannot contain duplicate entries. This flag
allows you to create a customized version of the standard llstatus
listing. The output fields produced by this flag all have a fixed
length. The output is displayed in the order in which you specify the
categories. category_list can be one or more of the
following:
- %a
- Hardware architecture
- %act
- Number of job steps dispatched by the schedd daemon on this
machine
- %cm
- Custom Metric value
- %cpu
- Number of CPUs on this machine
- %d
- Available disk space in the LoadLeveler execute directory
- %i
- Number of seconds since last keyboard or mouse activity
- %inq
- Number of job steps in the job queue of this schedd machine
- %l
- Berkeley one-minute load average
- %m
- Physical memory on this machine
- %mt
- Maximum number of initiators that can be used simultaneously on this
machine
- %n
- Machine name
- %o
- Operating system on this machine
- %r
- Number of initiators used by the startd daemon on this machine
- %sca
- Availability of the schedd daemon
- %scs
- State of the schedd daemon
- %sta
- Availability of the startd daemon
- %sts
- State of the startd daemon
- %v
- Available swap space (free paging space) of this machine
- -r category_list
- Is a blank-delimited list of categories you want to query. Each
category you specify must be preceded by a percent sign. The
category_list cannot contain duplicate entries. This flag
allows you to create a customized version of the standard llstatus
listing. The output produced by this flag is considered raw, in that
the fields can be variable in length. The output is displayed in the
order in which you specify the categories. Output fields are separated by
an exclamation point (!). category_list can be one or more of
the categories listed under the -f flag.
- hostlist
- Is a blank-delimited list of machines for which status is
requested.
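The rules for category_list (each category preceded by a percent sign, no duplicate entries) can be checked before the command is run. The following is a minimal Python sketch, not part of LoadLeveler: the function name build_llstatus_args and the KNOWN_CATEGORIES set are hypothetical, and placing hostlist after the category list is an assumption based on the order shown under Syntax.

```python
# Categories documented under the -f flag.
KNOWN_CATEGORIES = {
    "%a", "%act", "%cm", "%cpu", "%d", "%i", "%inq", "%l", "%m",
    "%mt", "%n", "%o", "%r", "%sca", "%scs", "%sta", "%sts", "%v",
}


def build_llstatus_args(flag, categories, hosts=()):
    """Return an argument list for a customized llstatus listing.

    flag is "-f" (fixed-length fields) or "-r" (raw, "!"-separated fields).
    """
    if flag not in ("-f", "-r"):
        raise ValueError("flag must be -f or -r")
    if not categories:
        raise ValueError("category_list needs at least one category")
    unknown = [c for c in categories if c not in KNOWN_CATEGORIES]
    if unknown:
        raise ValueError("unknown categories: " + " ".join(unknown))
    if len(set(categories)) != len(categories):
        raise ValueError("category_list cannot contain duplicate entries")
    # Assumption: hosts follow the category list, as in the Syntax section.
    return ["llstatus", flag, *categories, *hosts]


if __name__ == "__main__":
    print(" ".join(build_llstatus_args("-f", ["%n", "%scs", "%inq"], ["silver"])))
```

The returned list can be passed to subprocess.run on a machine where llstatus is installed.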
Description
If no hostlist is specified, all machines are queried.
If you have more than a few machines configured for LoadLeveler, consider
redirecting the output to a file when using the -l flag.
Each machine periodically updates the central manager with a snapshot of
its situation. Since the information returned by using
llstatus is a collection of such snapshots, all taken at varying
times, the total picture may not be completely consistent.
If you define consumable resources in the administration file, then
llstatus displays this information when either the -R or
-l option is specified. For the predefined ConsumableCpus
resource, the "total" values reported by llstatus can be the values
defined in the administration file or the values evaluated by the startd
daemons. The startd values are used if the administration file values
are set to "all". In this case, llstatus appends a plus sign (+)
to the resource name in the output reports.
Examples
This example requests a long status listing for machines named silver and
gold:
llstatus -l silver gold
Results
The Standard Listing: The standard listing is
generated when you do not specify the -l option with the
llstatus command. The following is sample output from the
llstatus command, where there are two nodes in the cluster.
+--------------------------------------------------------------------------------+
|Name Schedd InQ Act Startd Run LdAvg Idle Arch OpSys |
|k10n09.ppd.pok.ibm.com Avail 3 1 Run 1 2.72 0 R6000 AIX51 |
|k10n12.ppd.pok.ibm.com Avail 0 0 Idle 0 0.00 365 R6000 AIX51 |
| |
|R6000/AIX51 2 machines 3 jobs 1 running |
|Total Machines 2 machines 3 jobs 1 running |
| |
|The Central Manager is defined on k10n09.ppd.pok.ibm.com |
| |
|The GANG scheduler is in use |
| |
|All machines on the machine_list are present. |
+--------------------------------------------------------------------------------+
The standard listing includes the following fields:
- Name
- Hostname of the machine.
- Schedd
- State of the schedd daemon, which can be one of the following:
- Down
- Drned (Drained)
- Drning (Draining)
- Avail (Available)
For a detailed explanation of these states, see The schedd daemon.
- InQ
- Number of job steps in the job queue of this schedd machine.
- Act
- Number of job steps dispatched by the schedd daemon on this
machine.
- Startd
- State of the startd daemon, which can be:
- Busy
- Down
- Drned (Drained)
- Drning (Draining)
- Flush
- Idle
- None
- Run (Running)
- Suspnd (Suspend)
For a detailed explanation of these states, see The startd daemon.
- Run
- The number of initiators used by the startd daemon to run
LoadLeveler jobs on this machine. One initiator is used for each serial
job step and one initiator is used for each task of a parallel job
step.
- LdAvg
- Berkeley one-minute load average on this machine.
- Idle
- The number of seconds since keyboard or mouse activity in a login session
was detected. Highest number displayed is 9999.
- Arch
- The hardware architecture of the machine as listed in the configuration
file.
- OpSys
- The operating system on this machine.
- Total Machines
- The standard listing includes the following summary fields:
- machines
- The number of machines in the cluster that have made a status report to
the Central Manager.
- jobs
- The number of job steps in LoadLeveler job queues.
- running
- The number of initiators used by all the startd daemons in the LoadLeveler
cluster. One initiator is used for each serial job step. One
initiator is used for each task of a parallel job step.
Consumable Resources Listing: The llstatus
command, issued with the -R option, generates a listing of all of
the consumable resources associated with all of the machines in the
LoadLeveler cluster. When a host list is specified, this option will
only display resources associated with the specified hosts. The
following is sample output from this command:
llstatus -R
Figure 23. Sample llstatus -R command output
+--------------------------------------------------------------------------------+
| |
|Machine Consumable Resource(Available, Total) |
|------------------------------ -------------------------------------------------|
|c209f1n01.ppd.pok.ibm.com ConsumableCpus(4,4)+ ConsumableMemory(1.000 gb,1.|
|c209f1n02.ppd.pok.ibm.com ConsumableCpus(4,4)+ n02_res(123,500) Frame5(10,1|
|c209f1n05.ppd.pok.ibm.com ConsumableCpus(4,4)+ ConsumableMemory(1.000 gb,1.|
| |
|Resources with "+" appended to their names have the Total value reported from Startd|
+--------------------------------------------------------------------------------+
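Each consumable resource entry has the form resource_name(available, total), with a plus sign (+) appended when the total was reported by the startd daemon. The following Python sketch shows one way to split such entries apart. The function name parse_consumable_resources and the regular expression are assumptions made for illustration, not a LoadLeveler interface. Entries cut off by the display width, such as the last one in the sample line used here, are skipped.

```python
import re

# name(available,total) optionally followed by "+".
_RESOURCE = re.compile(r"(\w+)\(\s*([^,()]+?)\s*,\s*([^()]+?)\s*\)(\+?)")


def parse_consumable_resources(text):
    """Return (name, available, total, total_from_startd) tuples.

    available and total stay as strings because they may carry
    units, for example "724.000 mb".
    """
    return [
        (m.group(1), m.group(2), m.group(3), m.group(4) == "+")
        for m in _RESOURCE.finditer(text)
    ]


if __name__ == "__main__":
    # Resource column of one sample line; the final entry is truncated.
    line = "ConsumableCpus(4,4)+ n02_res(123,500) Frame5(10,1"
    for entry in parse_consumable_resources(line):
        print(entry)
```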
Floating Consumable Resources Listing: The
llstatus command, issued with the -F option, generates a
listing of all of the floating consumable resources associated with the
LoadLeveler cluster. This option should not be
specified with any other option. The following is sample output from
this command:
llstatus -F
+--------------------------------------------------------------------------------+
| |
|Floating Resource Available Total |
|------------------------------ ------------- --------------- |
|EDA_licenses 20 29 |
|Frame5 15 20 |
|WorkBench6 5 7 |
|XYZ_software 6 6 |
| |
| |
+--------------------------------------------------------------------------------+
Customized, Formatted Standard Listing: A
customized and formatted standard listing is generated when you specify
llstatus with the -f option. The following is
sample output from this command:
llstatus -f %n %scs %inq %m %v %sts %l %o
+--------------------------------------------------------------------------------+
|Name Schedd InQ Memory FreeVMemory Startd LdAvg OpSys |
|ll5.pok.ibm.com Avail 0 128 22708 Run 0.23 AIX51 |
|ll6.pok.ibm.com Avail 3 224 16732 Run 0.51 AIX51 |
| |
|R6000/AIX51 2 machines 3 jobs 3 running |
|Total Machines 2 machines 3 jobs 3 running |
| |
|The Central Manager is defined on ll5.pok.ibm.com |
| |
|The GANG scheduler is in use |
| |
|All machines on the machine_list are present. |
+--------------------------------------------------------------------------------+
Customized, Unformatted Standard Listing: A customized and
unformatted (raw) standard listing is generated when you specify
llstatus with the -r flag. Output fields are
separated by an exclamation point (!). The following is sample output
from this command:
llstatus -r %n %scs %inq %m %v %sts %l %o
+--------------------------------------------------------------------------------+
|ll5.pok.ibm.com!Avail!0!128!22688!Running!0.14!AIX51 |
|ll6.pok.ibm.com!Avail!3!224!16668!Running!0.37!AIX51 |
+--------------------------------------------------------------------------------+
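Because the -r flag separates fields with an exclamation point, its output is easy to process in a script. The following Python sketch pairs each field with the category that produced it. The function name parse_raw_listing is hypothetical, and the sketch assumes that no field value itself contains an exclamation point.

```python
def parse_raw_listing(lines, categories):
    """Map each raw llstatus -r line to a {category: value} dict.

    categories must be given in the same order as on the command line,
    because the output fields appear in that order.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        fields = line.split("!")
        if len(fields) != len(categories):
            raise ValueError("expected %d fields, got %d: %r"
                             % (len(categories), len(fields), line))
        records.append(dict(zip(categories, fields)))
    return records


if __name__ == "__main__":
    categories = ["%n", "%scs", "%inq", "%m", "%v", "%sts", "%l", "%o"]
    sample = [
        "ll5.pok.ibm.com!Avail!0!128!22688!Running!0.14!AIX51",
        "ll6.pok.ibm.com!Avail!3!224!16668!Running!0.37!AIX51",
    ]
    for record in parse_raw_listing(sample, categories):
        print(record["%n"], record["%sts"], record["%l"])
```

All values are returned as strings; convert numeric fields such as %l or %inq as needed.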
The Long Listing: The
long listing is generated when you specify the -l option with the
llstatus command. Following the sample output is an
explanation of all possible fields displayed by the llstatus
command.
The following is sample output from the llstatus -l
c209f1n05 command:
Figure 24. Sample output from llstatus -l c209f1n05
+--------------------------------------------------------------------------------+
|=============================================================================== |
|Name = c209f1n05.ppd.pok.ibm.com |
|Machine = c209f1n05.ppd.pok.ibm.com |
|Arch = R6000 |
|OpSys = AIX51 |
|SYSPRIO = (0 - QDate) |
|MACHPRIO = ((Memory + FreeRealMemory) - ((LoadAvg * 1000) + Custo|
|VirtualMemory = 491560 kb |
|Disk = 519484 kb |
|KeyboardIdle = 0 |
|Tmp = 519484 kb |
|LoadAvg = 1.802475 |
|ConfiguredClasses = Parallel(12) 85ba(2) misc(2) tiny(1) No_Class(7) small(14)|
|AvailableClasses = Parallel(8) 85ba(2) misc(2) tiny(1) No_Class(7) small(14) |
|DrainingClasses = |
|DrainedClasses = |
|Pool = 1 7 |
|FabricConnectivity = 1 |
|Adapter = en0(ethernet,c209f1n05.ppd.pok.ibm.com,9.114.99.66,) |
| css0(switch,c209f1sn05.ppd.pok.ibm.com,9.114.99.130,,1,8/1|
| csss(striped,,,,1,8/16,120M/128M,1,READY) |
|Feature = OSL ESSL |
|Max_Starters = 50 |
|Memory = 1024 mb |
|FreeRealMemory = 484 mb |
|PagesFreed = 0 |
|PagesScanned = 0 |
|PagesPagedIn = 0 |
|PagesPagedOut = 0 |
|ConsumableResources = ConsumableCpus(0,4)+ ConsumableMemory(724.000 mb,1.000 gb)|
|ConfigTimeStamp = Fri Jul 27 11:44:29 EDT 2001 |
|Cpus = 4 |
|Speed = 1.000000 |
|Subnet = 9.114.99 |
|MasterMachPriority = 0.000000 |
|CustomMetric = 1 |
|StartdAvail = 1 |
|State = Running |
|EnteredCurrentState = Fri Jul 27 11:46:22 EDT 2001 |
|START = ((LoadAvg < 5.000000) && ((tm_hour > 8) && (tm_hour < |
|SUSPEND = F |
|CONTINUE = T |
|VACATE = F |
|KILL = F |
|Machine Mode = general |
|Running = 5 |
|ScheddAvail = 1 |
|ScheddState = Avail |
|ScheddRunning = 2 |
|Pending = 0 |
|Starting = 0 |
|Idle = 5 |
|Unexpanded = 0 |
|Held = 0 |
|Removed = 0 |
|RemovedPending = 0 |
|Completed = 0 |
|TotalJobs = 7 |
|Running steps = c209f1n05.ppd.pok.ibm.com.5.0 c209f1n05.ppd.pok.ibm.com.6.|
| c209f1n05.ppd.pok.ibm.com.7.0 |
|TimeStamp = Fri Jul 27 11:46:22 EDT 2001 |
+--------------------------------------------------------------------------------+
The long listing includes these fields:
- Adapter
- Network adapter information associated with this machine.
- For a switch adapter, the information format is:
adapter_name(network_type, interface_name, interface_address,
multilink_address, switch_node_number,
available_adapter_windows/total_adapter_windows,
available_device_memory/total_device_memory, adapter_fabric_connectivity,
adapter_state)
- For non-switch adapters, the format is:
adapter_name(network_type, interface_name, interface_address,
multilink_address)
- Arch
- Hardware architecture of this machine.
- AvailableClasses
- List of available classes and the associated number of available
initiators on this machine.
- Completed
- The number of job steps in this state on this schedd machine.
- ConfigTimeStamp
- Date and time of last configuration or reconfiguration.
- ConfiguredClasses
- List of configured classes and the associated number of configured
initiators on this machine.
- ConsumableResources
- List of consumable resources associated with this machine. Each
element of this list has the format: resource_name(available,
total).
- CONTINUE
- The expression, defined following C conventions in the configuration file,
that evaluates to true or false (T/F). This determines whether
suspended jobs are continued on this machine.
- Cpus
- Number of CPUs on this machine.
- CustomMetric
- This value can be the number assigned to the CUSTOM_METRIC keyword
or the exit code of the executable associated with the CUSTOM_METRIC_COMMAND
keyword or the default value of 1.
- Disk
- Available space, in kilobytes (less 512KB) in LoadLeveler's execute
directory on this machine.
- DrainedClasses
- List of classes which have been drained. If a job step is in
a class named on this list, that job step will not start on this
machine.
- DrainingClasses
- List of classes which are currently being drained on this
machine. If a job step is in a class named on this list, that job step
will not start on this machine.
- EnteredCurrentState
- Date and time when machine state was set.
- FabricConnectivity
- A Boolean vector representing the current state of connectivity between
the machine's switch adapters and the SP switch.
- Feature
- Set of all features on this machine.
- FreeRealMemory
- Free real memory, in megabytes, on this machine. This value
should track closely with the "fre" value of the vmstat command and
the "free" value of the svmon -G command whose units are 4KB
blocks.
- Held
- The number of job steps in this state on this schedd machine.
- Idle
- The number of job steps in this state on this schedd machine.
- KeyboardIdle
- Number of seconds since last keyboard or mouse activity.
- KILL
- The expression, defined following C conventions in the configuration file,
that evaluates to true or false (T/F). This determines whether jobs
running on this machine should be sent the SIGKILL signal.
- LoadAvg
- Berkeley one-minute load average on this machine.
- Machine
- Fully qualified name of the machine.
- Machine Mode
- The type of job this machine can run. This can be: batch,
interactive, or general.
- MACHPRIO
- Actual expression that determines machine priority, defined in the
configuration file.
- MasterMachPriority
- The machine priority for the parallel master node.
- Max_Starters
- Maximum number of initiators that can be used simultaneously on this
machine.
- Memory
- Physical memory, in megabytes, on this machine.
- Name
- Hostname of the machine.
- OpSys
- Operating system on this machine.
- PagesFreed
- Pages freed per second. This value corresponds to the "fr" value of
the vmstat command output.
- PagesPagedIn
- Pages paged in from paging space per second. This value corresponds
to the "pi" value of the vmstat command output.
- PagesPagedOut
- Pages paged out to paging space per second. This value corresponds
to the "po" value of the vmstat command output.
- PagesScanned
- Pages scanned by the page-replacement algorithm per second. This
value corresponds to the "sr" value of the vmstat command output.
- Pending
- The number of job steps in this state on this schedd machine.
- Pool
- The identifier of the pool where this startd machine is located.
- Removed
- The number of job steps in this state on this schedd machine.
- RemovedPending
- The number of job steps in this state on this schedd machine.
- Running
- The number of initiators used by the startd daemon to run
LoadLeveler jobs. One initiator is used for each serial job
step. One initiator is used for each task of a parallel job
step.
- Running steps
- The list of job steps currently running on this machine.
- ScheddAvail
- Flag indicating if machine is running a schedd daemon (0=no,
1=yes).
- ScheddRunning
- The number of job steps submitted to this machine that are running
somewhere in the LoadLeveler cluster.
- ScheddState
- The state of the schedd daemon on this machine.
- Speed
- Speed associated with the machine.
- START
- The expression, defined following C conventions in the configuration file,
that evaluates to true or false (T/F). This determines whether jobs can
be started on this machine.
- StartdAvail
- Flag indicating if machine is running a startd daemon (0=no,
1=yes).
- Starting
- The number of job steps in this state on this schedd machine.
- State
- State of the startd daemon, which can be:
- Busy
- Down
- Drained
- Draining
- Flush
- Idle
- None
- Running
- Suspend
For a detailed explanation of these states, see The startd daemon.
- Subnet
- The TCP/IP subnet that this machine resides on.
- SUSPEND
- The expression, defined following C conventions in the configuration file,
that evaluates to true or false (T/F). This determines whether running
jobs should be suspended on this machine.
- SYSPRIO
- Actual expression that determines overall system priority of a job
step. Defined in the configuration file.
- TimeStamp
- The date and time the central manager last received a status update from
this schedd machine.
- Tmp
- Available space, in kilobytes (less 512KB) in the /tmp directory on
this machine.
- TotalJobs
- The number of total job steps submitted to this schedd machine.
- Unexpanded
- The number of job steps in this state on this schedd machine.
- VACATE
- The expression, defined following C conventions in the configuration file,
that evaluates to true or false (T/F). This determines whether
suspended jobs are vacated on this machine.
- VirtualMemory
- Available swap space (free paging space) in kilobytes, on this
machine.
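The Adapter field formats described in the field list above (nine comma-separated values for a switch adapter, four for a non-switch adapter) can be unpacked as in the following Python sketch. The function name parse_adapter and the dictionary keys adapter_windows and device_memory are hypothetical shorthand for the available/total pairs. The sketch assumes that an entry is complete, that is, not truncated by the display width.

```python
# Field names in the order given for a switch adapter; a non-switch
# adapter reports only the first four.
SWITCH_FIELDS = (
    "network_type", "interface_name", "interface_address",
    "multilink_address", "switch_node_number", "adapter_windows",
    "device_memory", "adapter_fabric_connectivity", "adapter_state",
)


def parse_adapter(entry):
    """Split one Adapter entry into a dict of named fields."""
    name, sep, rest = entry.strip().partition("(")
    if not sep or not rest.endswith(")"):
        raise ValueError("not a complete adapter entry: %r" % entry)
    values = [v.strip() for v in rest[:-1].split(",")]
    if len(values) not in (4, len(SWITCH_FIELDS)):
        raise ValueError("unexpected number of fields in %r" % entry)
    # zip stops at the shorter sequence, so a non-switch adapter
    # yields only the first four field names.
    return {"adapter_name": name, **dict(zip(SWITCH_FIELDS, values))}


if __name__ == "__main__":
    print(parse_adapter("en0(ethernet,c209f1n05.ppd.pok.ibm.com,9.114.99.66,)"))
    print(parse_adapter("csss(striped,,,,1,8/16,120M/128M,1,READY)"))
```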